How to convert dynamic 7-day rows into columns with T-SQL

Background Info
I have a large table (400M+ rows) that changes daily (one day's data drops out and a new day's data drops in). The table is partitioned on a 'day' field, so there are 31 partitions.
Each row in the table has data similar to this:
ID  Postcode  DeliveryPoint  Quantity  Day  Month
1   SN1 1BG   A1             6         29   1
2   SN1 1BG   A1             1         28   1
3   SN1 1BG   A2             2         27   1
4   SN1 1BG   A1             3         28   1
5   SN2 1AQ   B1             1         29   12
6   SN1 1BG   A1             2         26   12
I need to pull out 7 days of data in the format:
Postcode  DeliveryPoint  7dayAverage  Day1  Day2  Day3  Day4  Day5  Day6  Day7
SN1 1BG   A1             2            0     1     2     1     3     4     0
I can easily extract the data for the 7 day period but need to create a columnar version as shown above.
I have something like this:
select postcode, deliverypoint,
    sum(case day when 23 then quantity else 0 end) as day1,
    sum(case day when 24 then quantity else 0 end) as day2,
    sum(case day when 25 then quantity else 0 end) as day3,
    sum(case day when 26 then quantity else 0 end) as day4,
    sum(case day when 27 then quantity else 0 end) as day5,
    sum(case day when 28 then quantity else 0 end) as day6,
    sum(case day when 29 then quantity else 0 end) as day7,
    sum(quantity) * 1.0 / @daysinweek as wkavg
into #allweekdp
from maintable dp with (nolock)
where day in (select day from #days)
group by postcode, deliverypoint
where #days has the day numbers in the 7 day period.
But as you can see, I've hard-coded the day numbers into the query. I want to get them out of my temporary table #days but can't see a way of doing it (an array would be perfect here).
Or am I going about this in completely the wrong way?
Kind Regards
Steve

If I understand correctly, what I would do is:
Convert the day and month columns into datetime values,
Get the first day of the week and day of the weekday (1-7) for each date, and
Pivot the data and group by the first day of the week
see here: sqlfiddle
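The pivot step on its own could look roughly like the sketch below. It skips the date conversion (the table has no year column, so that part would need an assumption about the year) and instead assumes each day in the window already carries a label from 1 to 7. The table variables @demo_rows and @demo_days and the day_label column are hypothetical names used only for illustration.

```sql
-- Hypothetical stand-in for maintable, filled with the sample rows from the question.
DECLARE @demo_rows TABLE (
    postcode      varchar(8),
    deliverypoint varchar(4),
    quantity      int,
    [day]         tinyint
);
INSERT INTO @demo_rows (postcode, deliverypoint, quantity, [day]) VALUES
    ('SN1 1BG', 'A1', 6, 29),
    ('SN1 1BG', 'A1', 1, 28),
    ('SN1 1BG', 'A2', 2, 27),
    ('SN1 1BG', 'A1', 3, 28),
    ('SN2 1AQ', 'B1', 1, 29),
    ('SN1 1BG', 'A1', 2, 26);

-- Hypothetical stand-in for #days, with an assumed day_label column (1..7).
DECLARE @demo_days TABLE ([day] tinyint PRIMARY KEY, day_label tinyint NOT NULL);
INSERT INTO @demo_days ([day], day_label) VALUES
    (23, 1), (24, 2), (25, 3), (26, 4), (27, 5), (28, 6), (29, 7);

-- PIVOT turns each day_label value into its own column; days with no rows come back NULL.
SELECT p.postcode,
       p.deliverypoint,
       ISNULL(p.[1], 0) AS day1,
       ISNULL(p.[2], 0) AS day2,
       ISNULL(p.[3], 0) AS day3,
       ISNULL(p.[4], 0) AS day4,
       ISNULL(p.[5], 0) AS day5,
       ISNULL(p.[6], 0) AS day6,
       ISNULL(p.[7], 0) AS day7
FROM (
    SELECT r.postcode, r.deliverypoint, d.day_label, r.quantity
    FROM @demo_rows AS r
    INNER JOIN @demo_days AS d
        ON d.[day] = r.[day]
) AS src
PIVOT (SUM(quantity) FOR day_label IN ([1], [2], [3], [4], [5], [6], [7])) AS p
ORDER BY p.postcode, p.deliverypoint;
```

The column list inside IN (...) is still fixed, but it refers to labels 1 to 7 rather than to actual day numbers, so the query itself would not need to change when the window moves.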

As utexaspunk suggested, Pivot might be the way to go. I've never been comfortable with pivot and have preferred to pivot it manually so I control how everything looks, so I'm using a similar solution to how you did your script to solve the issue. No idea how the performance between my way and utexaspunk's will compare.
Declare @Min_Day Integer = (Select MIN(day) From #days);
With Day_Coding_CTE as (
    Select Distinct day
        , day - @Min_Day + 1 as Day_Label --Assumes the 7 days do not wrap past a month end
    From #days
)
, Non_Columnar_CTE as (
    Select dp.postcode
        , dp.deliverypoint
        , d.day
        , c.Day_Label
        , SUM(dp.quantity) as Quantity
    From maintable dp with (nolock)
    Inner Join #days d --Inner join so that only rows inside the 7 day window are kept
        on dp.day = d.day --It also seems like you'll need more criteria here, but you'll have to figure out what those should be
    Inner Join Day_Coding_CTE c
        on d.day = c.day
    Group by dp.postcode
        , dp.deliverypoint
        , d.day
        , c.Day_Label
)
Select postcode
    , deliverypoint
    , SUM(Case When Day_Label = 1 Then Quantity Else 0 End) as Day1
    , SUM(Case When Day_Label = 2 Then Quantity Else 0 End) as Day2
    , SUM(Case When Day_Label = 3 Then Quantity Else 0 End) as Day3
    , SUM(Case When Day_Label = 4 Then Quantity Else 0 End) as Day4
    , SUM(Case When Day_Label = 5 Then Quantity Else 0 End) as Day5
    , SUM(Case When Day_Label = 6 Then Quantity Else 0 End) as Day6
    , SUM(Case When Day_Label = 7 Then Quantity Else 0 End) as Day7
    , SUM(Quantity) * 1.0 / @daysinweek as wkavg --@daysinweek is declared elsewhere, as in the question
From Non_Columnar_CTE
Group by postcode
    , deliverypoint
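If the window can cross a month end (for example days 29, 30, 31, 1, 2, 3, 4), computing the label as day minus the minimum day no longer gives 1 to 7. One way around that, sketched below, is to store the label next to each day when #days is built and join on it. The day_label column and the table variables @sample_rows and @window_days are assumptions of mine standing in for maintable and #days; they are not from the question.

```sql
-- Hypothetical stand-in for maintable, using the sample rows from the question.
DECLARE @sample_rows TABLE (
    postcode      varchar(8),
    deliverypoint varchar(4),
    quantity      int,
    [day]         tinyint
);
INSERT INTO @sample_rows (postcode, deliverypoint, quantity, [day]) VALUES
    ('SN1 1BG', 'A1', 6, 29),
    ('SN1 1BG', 'A1', 1, 28),
    ('SN1 1BG', 'A2', 2, 27),
    ('SN1 1BG', 'A1', 3, 28),
    ('SN2 1AQ', 'B1', 1, 29),
    ('SN1 1BG', 'A1', 2, 26);

-- Hypothetical stand-in for #days: each day number carries its position in the window.
DECLARE @window_days TABLE ([day] tinyint PRIMARY KEY, day_label tinyint NOT NULL);
INSERT INTO @window_days ([day], day_label) VALUES
    (23, 1), (24, 2), (25, 3), (26, 4), (27, 5), (28, 6), (29, 7);

-- Number of days in the window, used for the average.
DECLARE @days_in_window int = (SELECT COUNT(*) FROM @window_days);

-- The CASE expressions test the label, not the day number, so nothing is hard-coded to a date.
SELECT r.postcode,
       r.deliverypoint,
       SUM(CASE WHEN d.day_label = 1 THEN r.quantity ELSE 0 END) AS day1,
       SUM(CASE WHEN d.day_label = 2 THEN r.quantity ELSE 0 END) AS day2,
       SUM(CASE WHEN d.day_label = 3 THEN r.quantity ELSE 0 END) AS day3,
       SUM(CASE WHEN d.day_label = 4 THEN r.quantity ELSE 0 END) AS day4,
       SUM(CASE WHEN d.day_label = 5 THEN r.quantity ELSE 0 END) AS day5,
       SUM(CASE WHEN d.day_label = 6 THEN r.quantity ELSE 0 END) AS day6,
       SUM(CASE WHEN d.day_label = 7 THEN r.quantity ELSE 0 END) AS day7,
       SUM(r.quantity) * 1.0 / @days_in_window AS wkavg
FROM @sample_rows AS r
INNER JOIN @window_days AS d
    ON d.[day] = r.[day]
GROUP BY r.postcode, r.deliverypoint
ORDER BY r.postcode, r.deliverypoint;
```

With the sample rows, the SN1 1BG / A1 line should show 2 in day4, 4 in day6 and 6 in day7, with zeros elsewhere. The month column is ignored here for brevity; on the real table the join would probably need it as well.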

Related

Sum all the repeat events based on dates, aggregated by 7 days, 30 days, >30 days

I am trying to count repeat events that occur within 3, 7, 30 and more than 30 days.
In the image below, the yellow table is the SQL table and the green table is the transformation needed: I find the first event for Event A and Event B, and then the gap between the first event of A and the subsequent events of A.
Finally, I need to aggregate to get the blue table, where the data is aggregated per unique event.
I have been trying to achieve this in SQL, but I am stuck because I am not sure how to filter and loop.
Original data and Expected outcome image
DECLARE @reference_date DATE = '2022-08-02';
SELECT
    Event,
    MIN(Date) as First_date,
    SUM(CASE WHEN DATEDIFF(day, @reference_date, Date) BETWEEN 1 AND 2
        THEN 1 ELSE 0 END) as "Within_3_Days",
    SUM(CASE WHEN DATEDIFF(day, @reference_date, Date) BETWEEN 1 AND 6
        THEN 1 ELSE 0 END) as "Within_7_Days",
    SUM(CASE WHEN DATEDIFF(day, @reference_date, Date) BETWEEN 1 AND 29
        THEN 1 ELSE 0 END) as "Within_30_Days",
    SUM(CASE WHEN DATEDIFF(day, @reference_date, Date) >= 30
        THEN 1 ELSE 0 END) as ">_30_Days"
FROM event e0
GROUP BY Event
output:
Event  First_date  Within_3_Days  Within_7_Days  Within_30_Days  >_30_Days
A      2022-08-01  0              1              2               1
B      2022-09-15  0              0              0               1
The @reference_date variable is the date against which each event date is compared to decide whether it falls within x days.
DBFIDDLE
P.S. I use dates in the format YYYY-MM-DD, because that's the only way I am SURE about the ordering of the Day and the Month part.
EDIT:
When using the first date of an event to determine the 'within' columns, you can do:
SELECT
e0.Event,
MIN(e0.Date) as First_date,
SUM(CASE WHEN DATEDIFF(day, e1.Date, e0.Date) BETWEEN 1 AND 2
THEN 1 ELSE 0 END) as "Within_3_Days",
SUM(CASE WHEN DATEDIFF(day, e1.Date, e0.Date) BETWEEN 1 AND 6
THEN 1 ELSE 0 END) as "Within_7_Days",
SUM(CASE WHEN DATEDIFF(day, e1.Date, e0.Date) BETWEEN 1 AND 29
THEN 1 ELSE 0 END) as "Within_30_Days",
SUM(CASE WHEN DATEDIFF(day, e1.Date, e0.Date)>=30
THEN 1 ELSE 0 END) as ">_30_Days"
FROM event e0
INNER JOIN (SELECT Event,MIN(Date) as Date from event GROUP BY Event) e1 on e1.Event=e0.Event
GROUP BY e0.Event
see: DBFIDDLE2

Group by datepart and find total count of individual values of each record

This is the table structure:
ID Score Valid CreatedDate
1 A 1 2018-02-19 23:33:10.297
2 C 0 2018-02-19 23:32:40.700
3 B 1 2018-02-19 23:32:30.247
4 A 1 2018-02-19 23:31:37.153
5 B 0 2018-02-19 23:25:08.667
...
I need to find the total count of each score and of each valid flag per month.
The final result should look like this:
Month A B C D E Valid(1) NotValid(0)
January 123 343 1021 98 12 1287 480
February 516 421 321 441 421 987 672
...
This is what I tried;
SELECT DATEPART(year, CreatedDate) as Ay,
(select count(*) from TableResults where Score='A') as 'A',
(select count(*) from TableResults where Score='B') as 'B',
...
FROM TableResults
group by DATEPART(MONTH, CreatedDate)
but I couldn't figure out how to count the occurrences of each score per month.
Use conditional aggregation.
SELECT DATEPART(year, CreatedDate) as YR
, DATEPART(month, CreatedDate) MO
, sum(Case when score = 'A' then 1 else 0 end) as A
, sum(Case when score = 'B' then 1 else 0 end) as B
, sum(Case when score = 'C' then 1 else 0 end) as C
, sum(Case when score = 'D' then 1 else 0 end) as D
, sum(Case when score = 'E' then 1 else 0 end) as E
, sum(case when valid = 1 then 1 else 0 end) as Valid
, sum(case when valid = 0 then 1 else 0 end) as NotValid
FROM TableResults
GROUP BY DATEPART(MONTH, CreatedDate), DATEPART(year, CreatedDate)
I'm not a big fan of queries in the select; I find they tend to cause performance problems in the long run. Since we're aggregating here I just applied the conditional logic to all the columns.

Select multiple COUNTs for every day

I have a table of Visitors.
A Visitor has the following columns:
Id
StartTime (Date)
Purchased (bool)
Shipped (bool)
For each day within the last 7 days, I want to select 3 counts of the Visitors who have that day as StartTime:
The count of total visitors
The count of total visitors where Purchased = true
The count of total visitors where Shipped = true
Ideally the returned result would be:
Day Total TotalPurchased TotalShipped
1 100 67 42
2 82 61 27
etc...
I am used to .NET Linq so this has proved to be quite a challenge for me.
All I have come up with so far is the following:
SELECT COUNT(*) AS Total
FROM [dbo].[Visitors]
WHERE DAY([StartTime]) = DAY(GETDATE())
It selects the total of the current day just fine, however I feel pretty stuck right now so it'd be nice if someone could point me in the right direction.
For the last 7 days use the query proposed by Stanislav but with this WHERE clause
SELECT DAY([StartTime]) theDay,
COUNT(*) AS Tot,
SUM(CASE WHEN Purchased = 1 THEN 1 ELSE 0 END) as TotPurch, -- bit columns are compared to 1; T-SQL has no true literal
SUM(CASE WHEN Shipped = 1 THEN 1 ELSE 0 END) as TotShip
FROM [dbo].[Visitors]
WHERE [StartTime] BETWEEN GETDATE()-7 AND GETDATE()
GROUP BY DAY([StartTime])
SELECT COUNT(*) AS Total,
SUM(CASE WHEN Purchased = 1 THEN 1 ELSE 0 END) as TotalPurchased, -- bit columns are compared to 1; T-SQL has no true literal
SUM(CASE WHEN Shipped = 1 THEN 1 ELSE 0 END) as TotalShipped
FROM [dbo].[Visitors]
WHERE DAY([StartTime]) = DAY(GETDATE())
and add GROUP BY DAY([StartTime]) as jarlh mentioned
Here's a simple select that will give you the dataset you want
SELECT DATEDIFF(day,StartTime, getdate())+1 as [Day], -- Add 1 to display 1 to 7 instead of 0 to 6
COUNT(*) as Total,
SUM(CASE WHEN Purchased = 1 THEN 1 ELSE 0 END) as TotalPurchased,
SUM(CASE WHEN Shipped = 1 THEN 1 ELSE 0 END) AS TotalShipped
FROM Visitors
WHERE DATEDIFF(day,startTime,GETDATE()) < 7 -- differences 0 to 6 cover the last 7 days
GROUP BY DATEDIFF(day,startTime,GETDATE())
ORDER BY 1
This query will not take into consideration the time component of the date.
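Wrapping StartTime in DATEDIFF inside the WHERE clause generally prevents an index on that column from being used for a seek. A possible alternative, sketched here under the assumption that StartTime is a datetime and that "last 7 days" means today plus the six previous calendar days, is to compare the bare column against a half-open date range. The table variable @visitors_demo and its rows are made up for illustration.

```sql
-- Hypothetical stand-in for the Visitors table, with dates relative to "now".
DECLARE @visitors_demo TABLE (
    Id        int IDENTITY(1, 1),
    StartTime datetime,
    Purchased bit,
    Shipped   bit
);
INSERT INTO @visitors_demo (StartTime, Purchased, Shipped) VALUES
    (GETDATE(),                   1, 0),
    (DATEADD(day, -1, GETDATE()), 1, 1),
    (DATEADD(day, -6, GETDATE()), 0, 0),
    (DATEADD(day, -7, GETDATE()), 1, 1);  -- falls outside the 7-day window

-- Half-open range: from midnight six days ago up to (but not including) midnight tomorrow.
DECLARE @today        date = CAST(GETDATE() AS date);
DECLARE @window_start date = DATEADD(day, -6, @today);
DECLARE @window_end   date = DATEADD(day, 1, @today);

SELECT CAST(StartTime AS date) AS VisitDate,
       COUNT(*) AS Total,
       SUM(CASE WHEN Purchased = 1 THEN 1 ELSE 0 END) AS TotalPurchased,
       SUM(CASE WHEN Shipped = 1 THEN 1 ELSE 0 END) AS TotalShipped
FROM @visitors_demo
WHERE StartTime >= @window_start
  AND StartTime <  @window_end
GROUP BY CAST(StartTime AS date)
ORDER BY VisitDate;
```

Grouping by the full date rather than DAY(StartTime) also keeps days from different months apart if the window is ever widened.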

How to subtract result of 2 queries grouped by a field

I have a table in this form:
id year type amount
1 2015 in 10
2 2015 out 5
3 2016 in 20
4 2016 out 1
...
The following query will give me the sum of the amount of type = 'in' grouped by year:
SELECT year, sum(amount)
FROM table
WHERE type = 'in'
GROUP BY year
How am I going to get the following result?
year sum(in) sum(out) "in-out"
2015 10 5 5
2016 20 1 19
sum(in) is the sum of the 'amount' where type='in'.
Use a CASE expression to handle the values of type.
SELECT year,
SUM(CASE WHEN type = 'in' THEN amount ELSE 0 END) AS sum_in,
SUM(CASE WHEN type = 'out' THEN amount ELSE 0 END) AS sum_out,
SUM(CASE WHEN type = 'in' THEN amount ELSE -amount END) AS in_out
FROM table
GROUP BY year;
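Note that ELSE -amount subtracts every row whose type is not 'in', which is fine as long as 'in' and 'out' are the only values. A slightly more defensive variant, shown as a sketch with a hypothetical table variable @ledger holding the sample rows from the question, names both types explicitly and ignores anything else:

```sql
-- Hypothetical stand-in for the table in the question.
DECLARE @ledger TABLE (id int, [year] int, [type] varchar(10), amount int);
INSERT INTO @ledger (id, [year], [type], amount) VALUES
    (1, 2015, 'in',  10),
    (2, 2015, 'out',  5),
    (3, 2016, 'in',  20),
    (4, 2016, 'out',  1);

SELECT [year],
       SUM(CASE WHEN [type] = 'in'  THEN amount ELSE 0 END) AS sum_in,
       SUM(CASE WHEN [type] = 'out' THEN amount ELSE 0 END) AS sum_out,
       -- Add 'in', subtract 'out', and contribute 0 for any other type.
       SUM(CASE [type] WHEN 'in'  THEN amount
                       WHEN 'out' THEN -amount
                       ELSE 0 END) AS in_out
FROM @ledger
GROUP BY [year]
ORDER BY [year];
```

On the sample rows this returns 10, 5, 5 for 2015 and 20, 1, 19 for 2016, matching the result asked for in the question.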

Select Count usage divided by month

I have a table license_Usage which works like a log of the usage of licenses in a day:
ID User license date
1 1 A 22/1/2015
2 1 A 23/1/2015
3 1 B 23/1/2015
4 1 A 24/1/2015
5 2 A 22/2/2015
6 2 A 23/2/2015
7 1 B 23/2/2015
I want to count how many licenses a user used in each month. The result should look like:
User Jan Feb
1 2 1 ...
2 0 2
How can I do that?
You need a PIVOT or cross tab query. e.g.
SELECT [User],
COUNT(CASE WHEN Month = 1 THEN 1 END) AS Jan,
COUNT(CASE WHEN Month = 2 THEN 1 END) AS Feb,
COUNT(CASE WHEN Month = 3 THEN 1 END) AS Mar
/*TODO - Fill in other 9 months using above pattern*/
FROM [license_Usage]
CROSS APPLY (SELECT MONTH([date])) AS CA(Month)
WHERE [date] >= '20150101'
AND [date] < '20160101'
AND [license] = 'A'
GROUP BY [User]
SQL Fiddle