Adding total column and row to sql server pivot - sql

I have a pivot table that I believe to be working. I want to add a total column and a total row to this pivot. Here is the code for the pivot table...
SELECT Month,
[N] AS Expected,
[R] AS Requested,
[T] AS Tires,
[U] AS Unexpected,
[D] AS Damage
FROM (
SELECT CustomerNo,
DATEPART(mm,InvoiceDate) AS Month,
[Type],
SUM(Total) AS Cost
FROM tbl_PM_History
WHERE (InvoiceDate >= @Start)
AND (InvoiceDate <= @End)
AND (CustomerNo = @Cust)
GROUP BY CustomerNo,
DATEPART(mm,InvoiceDate),
TYPE
) p
PIVOT (SUM(Cost) FOR [Type] IN ([N],[R],[T],[U],[D]))AS pvt
ORDER BY Month

Here's a fairly quick way (in programming time) to achieve what you want:
;
WITH details -- Wrap your original query in a CTE so as to encapsulate all your calculations
AS (
SELECT Month,
[N] AS Expected,
[R] AS Requested,
[T] AS Tires,
[U] AS Unexpected,
[D] AS Damage,
ISNULL([N],0) + ISNULL([R],0) + ISNULL([T],0) + ISNULL([U],0) + ISNULL([D], 0) as Total
FROM (
SELECT CustomerNo,
DATEPART(mm,InvoiceDate) AS Month,
[Type],
SUM(Total) AS Cost
FROM tbl_PM_History
WHERE (InvoiceDate >= @Start)
AND (InvoiceDate <= @End)
AND (CustomerNo = @Cust)
GROUP BY CustomerNo,
DATEPART(mm,InvoiceDate),
TYPE
) p
PIVOT (SUM(Cost) FOR [Type] IN ([N],[R],[T],[U],[D])) AS pvt
)
, summary -- Use another CTE to add a total line, using UNION ALL
AS (
SELECT Month
, Expected
, Requested
, Tires
, Unexpected
, Damage
, Total
, 0 as RecordCode
FROM details
UNION ALL
SELECT Null
, SUM(Expected)
, SUM(Requested)
, SUM(Tires)
, SUM(Unexpected)
, SUM(Damage)
, SUM(Total)
, 1
FROM details
)
SELECT Month -- Do your actual sorting.
, Expected
, Requested
, Tires
, Unexpected
, Damage
, Total
FROM summary
ORDER by RecordCode, Month
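To see the shape of the result without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module. Several things are assumptions and not part of the original answer: SQLite has no PIVOT operator, so conditional aggregation (SUM over CASE) stands in for it; the table `history` and its rows are made up; and the date and customer filters are left out. The UNION ALL total row and the RecordCode sort are the same idea as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for tbl_PM_History.
conn.executescript("""
CREATE TABLE history (Month INTEGER, Type TEXT, Total REAL);
INSERT INTO history VALUES
  (1, 'N', 100), (1, 'R', 50),
  (2, 'N', 200), (2, 'T', 25), (2, 'D', 75);
""")

query = """
WITH details AS (
    -- Conditional aggregation plays the role of PIVOT here.
    SELECT Month,
           SUM(CASE WHEN Type = 'N' THEN Total END) AS Expected,
           SUM(CASE WHEN Type = 'R' THEN Total END) AS Requested,
           SUM(CASE WHEN Type = 'T' THEN Total END) AS Tires,
           SUM(CASE WHEN Type = 'U' THEN Total END) AS Unexpected,
           SUM(CASE WHEN Type = 'D' THEN Total END) AS Damage,
           SUM(Total) AS Total              -- the total column
    FROM history
    WHERE Type IN ('N', 'R', 'T', 'U', 'D')
    GROUP BY Month
),
summary AS (
    SELECT Month, Expected, Requested, Tires, Unexpected, Damage, Total,
           0 AS RecordCode
    FROM details
    UNION ALL
    -- The total row: one extra record, flagged so it sorts last.
    SELECT NULL, SUM(Expected), SUM(Requested), SUM(Tires),
           SUM(Unexpected), SUM(Damage), SUM(Total), 1
    FROM details
)
SELECT Month, Expected, Requested, Tires, Unexpected, Damage, Total
FROM summary
ORDER BY RecordCode, Month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row printed is the total line; its Month is NULL (None in Python), which is how you can tell it apart from the monthly rows.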

Related

SQL - Return count of consecutive days where value was unchanged

I have a table like
date          ticker  Action
'2022-03-01'  AAPL    BUY
'2022-03-02'  AAPL    SELL
'2022-03-03'  AAPL    BUY
'2022-03-01'  CMG     SELL
'2022-03-02'  CMG     HOLD
'2022-03-03'  CMG     HOLD
'2022-03-01'  GPS     SELL
'2022-03-02'  GPS     SELL
'2022-03-03'  GPS     SELL
I want to group by ticker and count how many consecutive days, leading up to the last date (here 2022-03-03), the Action has held the value it has on that date. For this example table the result would be:
ticker  NumSequentialDaysAction
AAPL    0
CMG     1
GPS     2
It's fine to pass in 2022-03-03 as a value; I don't need to work it out on the fly.
Tried something like this
---Table Creation---
CREATE TABLE UserTable
([Date] DATETIME2, [Ticker] varchar(5), [Action] varchar(5))
;
INSERT INTO UserTable
([Date], [Ticker], [Action])
VALUES
('2022-03-01' , 'AAPL' , 'BUY'),
('2022-03-02' , 'AAPL' , 'SELL'),
('2022-03-03' , 'AAPL' , 'BUY'),
('2022-03-01' , 'CMG' , 'SELL'),
('2022-03-02' , 'CMG' , 'HOLD'),
('2022-03-03' , 'CMG' , 'HOLD'),
('2022-03-01' , 'GPS' , 'SELL'),
('2022-03-02' , 'GPS' , 'SELL'),
('2022-03-03' , 'GPS' , 'SELL')
;
---Attempted Solution---
I'm thinking that I need to do a subquery to get the last value and join it back to the table to get the matching values, then apply a window function, ordered by date, to check that the preceding value is sequential.
WITH CTE AS (SELECT Date, Ticker, Action,
ROW_NUMBER() OVER (PARTITION BY Ticker, Action ORDER BY Date) as row_num
FROM UserTable)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE row_num = 1
GROUP BY Ticker;
WITH CTE AS (SELECT Date, Ticker, Action,
DENSE_RANK() OVER (PARTITION BY Ticker ORDER BY Action,Date) as rank
FROM UserTable)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE rank = 1
GROUP BY Ticker;
You can do this with the help of the LEAD function like so. You didn't specify which RDBMS you're using. This solution works in PostgreSQL:
WITH "withSequential" AS (
SELECT
ticker,
(LEAD("Action") OVER (PARTITION BY ticker ORDER BY date ASC) = "Action") AS "nextDayIsSameAction"
FROM UserTable
)
SELECT
ticker,
SUM(
CASE
WHEN "nextDayIsSameAction" IS TRUE THEN 1
ELSE 0
END
) AS "NumSequentialDaysAction"
FROM "withSequential"
GROUP BY ticker
Here is a way to do this using a gaps-and-islands approach.
Thanks for sharing the create and insert scripts, which helps to build the solution quickly.
dbfiddle link.
https://dbfiddle.uk/rZLDTrNR
with data
as (
select date
,ticker
,action
,case when lag(action) over(partition by ticker order by date) <> action then
1
else 0
end as marker
from usertable
)
,interim_data
as (
select *
,sum(marker) over(partition by ticker order by date) as grp_val
from data
)
,interim_data2
as (
select *
,count(*) over(partition by ticker,grp_val) - 1 as NumSequentialDaysAction
from interim_data
)
select ticker,NumSequentialDaysAction
from interim_data2
where date='2022-03-03'
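A quick way to check the gaps-and-islands logic against the sample data is to run it in SQLite from Python. This is only a sketch under assumptions: SQLite (3.25 or later, for window functions) stands in for whatever RDBMS you use, the as-of date is passed as a parameter named `as_of` (my name, not from the question), rows after that date are filtered out up front, and 1 is subtracted from the run length so the numbers match the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserTable (Date TEXT, Ticker TEXT, Action TEXT);
INSERT INTO UserTable VALUES
  ('2022-03-01', 'AAPL', 'BUY'),
  ('2022-03-02', 'AAPL', 'SELL'),
  ('2022-03-03', 'AAPL', 'BUY'),
  ('2022-03-01', 'CMG',  'SELL'),
  ('2022-03-02', 'CMG',  'HOLD'),
  ('2022-03-03', 'CMG',  'HOLD'),
  ('2022-03-01', 'GPS',  'SELL'),
  ('2022-03-02', 'GPS',  'SELL'),
  ('2022-03-03', 'GPS',  'SELL');
""")

query = """
WITH data AS (
    -- marker = 1 whenever the action differs from the previous day's action
    SELECT Date, Ticker, Action,
           CASE WHEN LAG(Action) OVER (PARTITION BY Ticker ORDER BY Date) <> Action
                THEN 1 ELSE 0 END AS marker
    FROM UserTable
    WHERE Date <= :as_of
),
islands AS (
    -- a running sum of the markers gives every unbroken run its own number
    SELECT *, SUM(marker) OVER (PARTITION BY Ticker ORDER BY Date) AS grp_val
    FROM data
),
sized AS (
    SELECT *, COUNT(*) OVER (PARTITION BY Ticker, grp_val) AS run_length
    FROM islands
)
SELECT Ticker, run_length - 1 AS NumSequentialDaysAction
FROM sized
WHERE Date = :as_of
ORDER BY Ticker
"""

result = conn.execute(query, {"as_of": "2022-03-03"}).fetchall()
print(result)  # [('AAPL', 0), ('CMG', 1), ('GPS', 2)]
```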
Another option is the difference-between-two-row_numbers approach, as follows:
select [Ticker], count(*)-1 NumSequentialDaysAction -- you could use (distinct) to remove duplicate rows
from
(
select *,
row_number() over (partition by [Ticker] order by [Date]) -
row_number() over (partition by [Ticker], [Action] order by [Date]) grp
from UserTable
where [date] <= '2022-03-03'
) RN_Groups
/* get only rows where [Action] = last date [Action] */
where [Action] = (select top 1 [Action] from UserTable T
where T.[Ticker] = RN_Groups.[Ticker] and [date] <= '2022-03-03'
order by [Date] desc)
group by [Ticker], [Action], grp

Performance issue using IsNull function in the Select statement

I have a financial application. I have ViewHistoricInstrumentValue which has rows like this
instrument1, date1, price, grossValue, netValue
instrument2, date1, price, grossValue, netValue
...
instrument1, date2, price, grossValue, netValue
...
My views are complicated but the db itself is small (4000 transactions). ViewHistoricInstrumentValue was executed in less than 1 second before I added the next CTE to the view. After that it takes 26s. ActualEvaluationPrice is the price for instrumentX at dateY. If this value is missing from HistoricPrice table then I find the previous price for instrumentX.
, UsedEvaluationPriceCte AS (
SELECT *
, isnull(ActualEvaluationPrice,
(select top 1 HistoricPrice.Price -- PreviousPrice
from HistoricPrice JOIN ValidDate
on HistoricPrice.DateId = ValidDate.Id
and HistoricPrice.InstrumentId = StartingCte.InstrumentId
and ValidDate.[Date] < StartingCte.DateValue
order by ValidDate.[Date]))
as UsedEvaluationPrice
FROM StartingCte
)
My problem is that the execution time increased needlessly. Right now the HistoricPrice table has no missing values, so ActualEvaluationPrice is never null and the previous price should never need to be looked up.
ViewHistoricInstrumentValue returns 1815 rows. One other mystery is that the first query takes 26s, but the second only 2s.
SELECT * FROM [ViewHistoricInstrumentValue]
SELECT top(2000) * FROM [ViewHistoricInstrumentValue]
Appendix
The execution plan: https://www.dropbox.com/s/5st69uhjkpd3b5y/IsNull.sqlplan?dl=0
The same plan: https://www.brentozar.com/pastetheplan/?id=rk9bK1Wiv
The view:
ALTER VIEW [dbo].[ViewHistoricInstrumentValue] AS
WITH StartingCte AS (
SELECT
HistoricInstrumentValue.DateId
, ValidDate.Date as DateValue
, TransactionId
, TransactionId AS [Row]
, AccountId
, AccountName
, ViewTransaction.InstrumentId
, ViewTransaction.InstrumentName
, OpeningDate
, OpeningPrice
, Price AS ActualEvaluationPrice
, ClosingDate
, Amount
, isnull(ViewTransaction.FeeValue, 0) as FeeValue
, HistoricInstrumentValue.Id AS Id
FROM ViewBriefHistoricInstrumentValue as HistoricInstrumentValue
JOIN ValidDate on HistoricInstrumentValue.DateId = ValidDate.Id
JOIN ViewTransaction ON ViewTransaction.Id = HistoricInstrumentValue.TransactionId
left JOIN ViewHistoricPrice ON ViewHistoricPrice.DateId = HistoricInstrumentValue.DateId AND
ViewHistoricPrice.InstrumentId = ViewTransaction.InstrumentId
)
, UsedEvaluationPriceCte AS (
SELECT *
, isnull(ActualEvaluationPrice,
(select top 1 HistoricPrice.Price -- PreviousPrice
from HistoricPrice JOIN ValidDate
on HistoricPrice.DateId = ValidDate.Id
and HistoricPrice.InstrumentId = StartingCte.InstrumentId
and ValidDate.[Date] < StartingCte.DateValue
order by ValidDate.[Date]))
as UsedEvaluationPrice
FROM StartingCte
)
, GrossEvaluationValueCte AS (
SELECT *
, Amount * UsedEvaluationPrice AS GrossEvaluationValue
, (UsedEvaluationPrice - OpeningPrice) * Amount AS GrossCapitalGains
FROM UsedEvaluationPriceCte
)
, CapitalGainsTaxCte AS (
SELECT *
, dbo.MyMax(GrossCapitalGains * 0.15, 0) AS CapitalGainsTax
FROM GrossEvaluationValueCte
)
, IsOpenCte AS (
SELECT
DateId
, DateValue
, TransactionId
, [Row]
, AccountId
, AccountName
, InstrumentId
, InstrumentName
, OpeningDate
, OpeningPrice
, ActualEvaluationPrice
, UsedEvaluationPrice
, ClosingDate
, Amount
, GrossEvaluationValue
, GrossCapitalGains
, CapitalGainsTax
, FeeValue
, GrossEvaluationValue - CapitalGainsTax - FeeValue AS NetEvaluationValue
, GrossCapitalGains - CapitalGainsTax - FeeValue AS NetUnrealizedGains
, CASE WHEN ClosingDate IS NULL OR DateValue < ClosingDate
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END
AS IsOpen
, convert(NVARCHAR, DateValue, 20) + cast([Id] AS NVARCHAR(MAX)) AS Temp
, Id
FROM CapitalGainsTaxCte
)
Select * from IsOpenCte
I have no idea what your query is supposed to be doing. But this process:
ActualEvaluationPrice is the price for instrumentX at dateY. If this value is missing from HistoricPrice table then I find the previous price for instrumentX.
is handled easily with lag():
select vhiv.*,
coalesce(vhiv.ActualEvaluationPrice,
lag(vhiv.ActualEvaluationPrice) over (partition by vhiv.InstrumentId order by DateValue)
) as UsedEvaluationPrice
from ViewHistoricInstrumentValue vhiv;
Note: If you need to filter out certain dates by joining to ValidDate, you can include the JOIN in the query. However, that is not part of the problem statement.
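To illustrate the COALESCE plus LAG pattern on something runnable, here is a small sketch in Python with sqlite3. The table `prices` and its rows are made up for the example, and SQLite (3.25+) stands in for SQL Server. One thing worth knowing: LAG looks exactly one row back, so this fills a single missing price; if two consecutive dates were missing, the second would still come out NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample: instrument 1 has no price on 2020-01-02.
conn.executescript("""
CREATE TABLE prices (InstrumentId INTEGER, DateValue TEXT, ActualEvaluationPrice REAL);
INSERT INTO prices VALUES
  (1, '2020-01-01', 10.0),
  (1, '2020-01-02', NULL),
  (1, '2020-01-03', 12.0),
  (2, '2020-01-01', 5.0),
  (2, '2020-01-02', 6.0);
""")

query = """
SELECT InstrumentId, DateValue,
       COALESCE(ActualEvaluationPrice,
                LAG(ActualEvaluationPrice) OVER (
                    PARTITION BY InstrumentId ORDER BY DateValue)
       ) AS UsedEvaluationPrice
FROM prices
ORDER BY InstrumentId, DateValue
"""

used = conn.execute(query).fetchall()
for row in used:
    print(row)  # the 2020-01-02 row for instrument 1 falls back to 10.0
```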

Splitting out a cost dynamically across weeks

I’m creating an interim table in SQL Server for use with PowerBI to query financial data.
I have a finance transactions table tblfinance with the following definition and sample data:
CREATE TABLE TBLFinance
(ID int,
Value float,
EntryDate date,
ClientName varchar (250)
)
INSERT INTO TBLFinance(ID ,Value ,EntryDate ,ClientName)
VALUES(1,'1783.26','2018-10-31 00:00:00.000','Alpha')
, (2,'675.3','2018-11-30 00:00:00.000','Alpha')
, (3,'243.6','2018-12-31 00:00:00.000','Alpha')
, (4,'8.17','2019-01-31 00:00:00.000','Alpha')
, (5,'257.23','2019-01-31 00:00:00.000','Alpha')
, (6,'28','2019-02-28 00:00:00.000','Alpha')
, (7,'1470.61','2019-03-31 00:00:00.000','Bravo')
, (8,'1062.86','2019-04-30 00:00:00.000','Bravo')
, (9,'886.65','2019-05-31 00:00:00.000','Bravo')
, (10,'153.31','2019-05-31 00:00:00.000','Bravo')
, (11,'150.24','2019-06-30 00:00:00.000','Bravo')
, (12,'690.14','2019-07-31 00:00:00.000','Charlie')
, (13,'21.67','2019-08-31 00:00:00.000','Charlie')
, (14,'339.29','2018-10-31 00:00:00.000','Charlie')
, (15,'807.96','2018-11-30 00:00:00.000','Delta')
, (16,'48.94','2018-12-31 00:00:00.000','Delta')
I’m calculating transaction values that fall within a week. My week ends on a Sunday, so I have the following query:
INSERT INTO tblAnalysis
(WeekTotal
, WeekEnd
, Client
)
SELECT SUM (VALUE) AS WeekTotal
, dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate) AS WeekEnd
, ClientName as Client
FROM dbo.tblFinance
GROUP BY dateadd (day, case when datepart (WEEKDAY, EntryDate) = 1 then 0 else 8 - datepart (WEEKDAY, EntryDate) end, EntryDate), CLIENTNAME
I’ve now been informed that some of the costs incurred within a given week may be monthly, and therefore need to be split into 4 weeks, or annual, and so split into 52 weeks. I will write a case statement to update the costs based on ClientName, so assume there is an additional field called ‘Payfrequency’.
I want to avoid having to pull the values affected into a temp table, and effectively write this – because there’ll be different sums applied depending on frequency.
SELECT *
INTO #MonthlyCosts
FROM
(
SELECT
client
, VALUE / 4 AS VALUE
, WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,1,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,2,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
UNION
SELECT
client
, VALUE / 4 AS VALUE
, DATEADD(WEEK,3,WEEKENDING) AS WEEKENDING
FROM tblAnalysis
) AS A
I know I need a stored procedure to hold variables so the calculations can be carried out dynamically, but have no idea where to start.
You can use recursive CTEs to split the data:
with cte as (
select ID, Value, EntryDate, ClientName, payfrequency, 1 as n
from TBLFinance f
union all
select ID, Value, EntryDate, ClientName, payfrequency, n + 1
from cte
where n < payfrequency
)
select *
from cte;
Note that by default recursion is limited to 100 levels. You can add option (maxrecursion 0) to remove the limit.
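The recursive CTE above only repeats each row; the division and the week shift still have to be added. Here is a rough sketch of how that might look, run in SQLite from Python. The assumptions are mine: PayFrequency holds the number of weeks to spread over (1, 4 or 52), each generated row gets Value / PayFrequency, and each step moves the date forward 7 days. SQLite's `date(..., '+7 days')` stands in for T-SQL's DATEADD, and the two sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: one monthly cost (4 weeks) and one weekly cost (1 week).
conn.executescript("""
CREATE TABLE TBLFinance (ID INTEGER, Value REAL, EntryDate TEXT,
                         ClientName TEXT, PayFrequency INTEGER);
INSERT INTO TBLFinance VALUES
  (1, 400.0, '2019-01-06', 'Alpha', 4),
  (2, 28.0,  '2019-02-03', 'Bravo', 1);
""")

query = """
WITH RECURSIVE cte(ID, ClientName, WeeklyValue, WeekDate, PayFrequency, n) AS (
    -- anchor: first week carries its share of the cost
    SELECT ID, ClientName, Value / PayFrequency, EntryDate, PayFrequency, 1
    FROM TBLFinance
    UNION ALL
    -- recursive step: same share, one week later, until n reaches PayFrequency
    SELECT ID, ClientName, WeeklyValue, date(WeekDate, '+7 days'), PayFrequency, n + 1
    FROM cte
    WHERE n < PayFrequency
)
SELECT ID, ClientName, WeeklyValue, WeekDate
FROM cte
ORDER BY ID, n
"""

split_rows = conn.execute(query).fetchall()
for row in split_rows:
    print(row)
```

The 400.0 cost comes back as four rows of 100.0 on consecutive weeks, and the weekly cost is left as a single row.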
The best solution would be to use a numbers table, if you can create a table on your server with one column holding a sequence of integers.
You can then use it like this for your weekly values:
SELECT
client
, VALUE / 52 AS VALUE
, DATEADD(WEEK,N.Number,WEEKENDING) AS WEEKENDING
FROM tblAnalysis AS A
CROSS JOIN tblNumbers AS N
WHERE N.Number <= 52

SQL Addition Formula

Noob alert...
I have an example table as follows.
I am trying to create a column in SQL that shows what percentage of size S each customer had per year.
So output should be something like:
(Correction: the customer C for 2019 Percentage should be 1)
Window functions will get you there.
DECLARE @TestData TABLE
(
[Customer] NVARCHAR(2)
, [CustomerYear] INT
, [CustomerCount] INT
, [CustomerSize] NVARCHAR(2)
);
INSERT INTO @TestData (
[Customer]
, [CustomerYear]
, [CustomerCount]
, [CustomerSize]
)
VALUES ( 'A', 2017, 1, 'S' )
, ( 'A', 2017, 1, 'S' )
, ( 'B', 2017, 1, 'S' )
, ( 'B', 2017, 1, 'S' )
, ( 'B', 2018, 1, 'S' )
, ( 'A', 2018, 1, 'S' )
, ( 'C', 2017, 1, 'S' )
, ( 'C', 2019, 1, 'S' );
SELECT DISTINCT [Customer]
, [CustomerYear]
, SUM([CustomerCount]) OVER ( PARTITION BY [Customer]
, [CustomerYear]
) AS [CustomerCount]
, SUM([CustomerCount]) OVER ( PARTITION BY [CustomerYear] ) AS [TotalCount]
, SUM([CustomerCount]) OVER ( PARTITION BY [Customer]
, [CustomerYear]
) * 1.0 / SUM([CustomerCount]) OVER ( PARTITION BY [CustomerYear] ) AS [CustomerPercentage]
FROM @TestData
ORDER BY [CustomerYear]
, [Customer];
Will give you
Customer CustomerYear CustomerCount TotalCount CustomerPercentage
-------- ------------ ------------- ----------- ---------------------------------------
A 2017 2 5 0.400000000000
B 2017 2 5 0.400000000000
C 2017 1 5 0.200000000000
A 2018 1 2 0.500000000000
B 2018 1 2 0.500000000000
C 2019 1 1 1.000000000000
Assuming there are no duplicate rows for a customer in a year, you can use window functions:
select t.*,
sum(count) over (partition by year) as year_cnt,
count * 1.0 / sum(count) over (partition by year) as ratio
from t;
Break it apart into tasks - that's probably the best rule to follow when it comes to SQL. So, I created a temp table #tmp which I populated with your sample data, and started out with this query:
select
customer,
year
from #tmp
where size = 'S'
group by customer, year
... this gets a row for each customer/year combo for 'S' entries.
Next, I want the total count for that customer/year combo:
select
customer,
year,
SUM(itemCount) as customerItemCount
from #tmp
where size = 'S'
group by customer, year
... now, how do we get the count for all customers for a specific year? We need a subquery - and we need that subquery to reference the year from the main query.
select
customer,
year,
SUM(itemCount) as customerItemCount,
(select SUM(itemCount) from #tmp t2 where year=t.year) as FullTotalForYear
from #tmp t
where size = 'S'
GROUP BY customer, year
... that make sense? That new line in the parentheses is a subquery - and it's hitting the table again - but this time, it's just getting a SUM() over the particular year that matches the main table.
Finally, we just need to divide one of those columns by the other to get the actual percent (making sure not to make it int/int - which will always be an int), and we'll have our final answer:
select
customer,
year,
cast(SUM(itemCount) as float) /
(select SUM(itemCount) from #tmp t2 where year=t.year)
as PercentageOfYear
from #tmp t
where size = 'S'
GROUP BY customer, year
Make sense?
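If you want to try the final query without SQL Server, here is a sketch that runs it in SQLite from Python. The table name `tmp` and the column names (customer, year, itemCount, size) follow the queries above; the rows are the sample values used elsewhere in this thread; and CAST(... AS REAL) is SQLite's spelling of the float cast. Treat it as an illustration of the step-by-step build-up, not as the only way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tmp (customer TEXT, year INTEGER, itemCount INTEGER, size TEXT);
INSERT INTO tmp VALUES
  ('A', 2017, 1, 'S'), ('A', 2017, 1, 'S'),
  ('B', 2017, 1, 'S'), ('B', 2017, 1, 'S'),
  ('B', 2018, 1, 'S'), ('A', 2018, 1, 'S'),
  ('C', 2017, 1, 'S'), ('C', 2019, 1, 'S');
""")

query = """
SELECT customer,
       year,
       -- cast first so the division is not integer / integer
       CAST(SUM(itemCount) AS REAL) /
           (SELECT SUM(itemCount) FROM tmp t2 WHERE t2.year = t.year)
           AS PercentageOfYear
FROM tmp t
WHERE size = 'S'
GROUP BY customer, year
ORDER BY year, customer
"""

percentages = conn.execute(query).fetchall()
for customer, year, pct in percentages:
    print(customer, year, round(pct, 2))
```

For 2017 this gives 0.4 for A and B and 0.2 for C, and customer C gets 1.0 for 2019, matching the correction in the question.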
With a join of 2 groupings:
the 1st by size, year, customer and
the 2nd by size, year.
select
t.customer, t.year, t.count, t.size,
ty.total_count, 1.0 * t.count / ty.total_count percentage
from (
select t.customer, t.year, sum(t.count) count, t.size
from tablename t
group by t.size, t.year, t.customer
) t inner join (
select t.year, sum(t.count) total_count, t.size
from tablename t
group by t.size, t.year
) ty
on ty.size = t.size and ty.year = t.year
order by t.size, t.year, t.customer;

Calculating average with pivot

I am trying to find the average with the help of pivot but not able to find the right solution.
The below is my query:
select branch, ISNULL([11:00], 0) as [11:00],ISNULL([11:15], 0) as
[11:15],ISNULL([11:30], 0) as [11:30], ISNULL([11:45], 0) as [11:45],
ISNULL([12:00], 0) as [12:00]
from
(
select b.branchname
,convert(varchar(5), intervals.interval_start_time, 108)
,sum(b.ordercount) ordercounts
from branch b cross apply dbo.getDate15MinInterval(CAST(b.TransactionDate
as date)) as intervals
where b.TransactionDate >= interval_start_time and b.TransactionDate <=
interval_end_time
and CAST(TransactionDate AS date) IN ('2017-07-01','2017-07-08')
group by DATEPART(WEEKDAY,TransactionDate),b.branchname,intervals.interval_start_time,intervals.interval_end_time
) t
pivot ( avg(ordercounts) for interval_start_time in ( [11:00], [11:15] ,
[11:30], [11:45], [12:00])) as p
My original table is:
Result from the above query is:
Expected result:
For the 15-minute interval query, please refer to my original post:
Group data by interval of 15 minutes and use cross tab
SQL Server does integer arithmetic operations on integers. The problem is that this is an integer:
sum(b.ordercount) as ordercounts
(presumably).
So, just turn it into a floating/fixed point number. I usually just multiply by 1.0:
sum(b.ordercount)*1.0 as ordercounts
But you can be more specific about your types if you like.
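The integer-arithmetic effect is easy to reproduce on a tiny example. The sketch below uses SQLite from Python, with a made-up `orders` table holding two order counts. Note the assumption: SQLite's own AVG() always returns a floating-point value, unlike SQL Server's AVG over an int column, so the average is spelled out as SUM / COUNT here to show the same truncation and the same `* 1.0` fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two intervals with 1 and 2 orders.
conn.executescript("""
CREATE TABLE orders (ordercount INTEGER);
INSERT INTO orders VALUES (1), (2);
""")

int_avg, real_avg = conn.execute("""
    SELECT SUM(ordercount) / COUNT(*),        -- integer / integer truncates
           SUM(ordercount) * 1.0 / COUNT(*)   -- * 1.0 forces a non-integer type
    FROM orders
""").fetchone()

print(int_avg, real_avg)  # 1 1.5
```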
Try casting to float -
AVG(CAST(ordercounts AS FLOAT))
SUM(CAST(b.ordercount AS FLOAT)) AS ordercounts
Can be done like this:
select branchname, [dayname], ISNULL([11:00], 0) as [11:00], AVG(CAST([11:00] as float)) over() [Avg_11:00]
from
(
select branchname, [dayname], ISNULL([11:00], 0) as [11:00], ISNULL([11:15], 0) as [11:15], ISNULL([11:30], 0) as [11:30], ISNULL([11:45], 0) as [11:45]
from
(
select intervals.[dayname]
, b.branchname
, convert(varchar(5), intervals.interval_start_time, 108) interval_start_time -- for hh:mm format
, sum(b.ordercount) ordercount
from branch b cross apply dbo.getDate15MinIntervals(CAST(b.TransactionDate as date)) as intervals
where b.transactiondate between interval_start_time and interval_end_time
group by intervals.[dayname], b.branchname, intervals.interval_start_time, intervals.interval_end_time
) t
pivot ( sum(ordercount) for interval_start_time in ( [11:00], [11:15] , [11:30], [11:45] )) as p
) t
group by branchname, [dayname], [11:00]
AVG OVER() is valid as of SQL Server 2008.
In this example I only used one interval, but you can extend it to all of the ones you need.
I tried with some sample data from yesterday's answer and it returns values as below:
Happy coding! :)