I am working with a dataset that contains information about train delays. The dataset contains an arrival delay column and departing delay column. Each delay column is measured in minutes. I need to calculate the number of total delays for each day of the week to determine which day has the most train delays. If the delay is equal to or more than 1 minute, it needs to be counted as a delay. How can I complete this in SQL? I have tried the following code.
select dayofweek
count(case when arrivaldelay>=1 then 1 end)+
count(case when departuredelay>=1 then 1 end)
group by dayofweek;
dayofweek arrivaldelay departuredelay
2 12 5
4 7 10
4 6 -3
6 5 4
dayofweek delays
2 1
4 1
6 1
Assuming dayofweek is a stored column and not a function, you can use either count or sum:
select
dayofweek
, count(case when arrivaldelay >= 1 then 1 end)
+ count(case when departuredelay >= 1 then 1 end)
as delays
from mytable as t
group by dayofweek;
select
dayofweek
, sum(case when arrivaldelay >= 1 then 1 else 0 end)
+ sum(case when departuredelay >= 1 then 1 else 0 end)
as delays
from mytable as t
group by dayofweek;
Both give the following result from the sample data in the question:
+-----------+--------+
| dayofweek | delays |
+-----------+--------+
| 2 | 2 |
| 4 | 3 |
| 6 | 2 |
+-----------+--------+
If dayofweek is NOT a stored column, you can extract the day of week from a date or timestamp, but how this is done differs between databases.
This is demonstrated in a db<>fiddle.
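As a rough illustration of deriving the weekday from a date, here is a minimal sketch in Python using the standard-library sqlite3 module. The table name train_delays and the traveldate column are assumptions made for this example, not names from the question. It relies on SQLite's strftime('%w', ...), which returns 0 for Sunday through 6 for Saturday; other databases use different functions and numbering, so treat this as one example only.

```python
import sqlite3

# Hypothetical table: the question does not say how the date is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE train_delays (traveldate TEXT, arrivaldelay INT, departuredelay INT)"
)
conn.executemany(
    "INSERT INTO train_delays VALUES (?, ?, ?)",
    [
        ("2024-01-02", 12, 5),   # a Tuesday
        ("2024-01-04", 7, 10),   # a Thursday
        ("2024-01-04", 6, -3),   # a Thursday, only the arrival counts as a delay
        ("2024-01-06", 5, 4),    # a Saturday
    ],
)

# strftime('%w', ...) is SQLite-specific: 0 = Sunday ... 6 = Saturday.
rows = conn.execute(
    """
    SELECT CAST(strftime('%w', traveldate) AS INTEGER) AS dayofweek,
           COUNT(CASE WHEN arrivaldelay >= 1 THEN 1 END)
         + COUNT(CASE WHEN departuredelay >= 1 THEN 1 END) AS delays
    FROM train_delays
    GROUP BY dayofweek
    ORDER BY dayofweek
    """
).fetchall()
print(rows)
```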
You can use sum() like this:
select dayofweek,
( sum(case when arrivaldelay >= 1 then 1 else 0 end)+
sum(case when departuredelay >= 1 then 1 else 0 end)
) as delays
from t
group by dayofweek;
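To check that the count and sum forms agree, here is a small sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. The table name t follows the answer above; using SQLite is an assumption made only so the example can run anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dayofweek INT, arrivaldelay INT, departuredelay INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(2, 12, 5), (4, 7, 10), (4, 6, -3), (6, 5, 4)],  # sample data from the question
)

# COUNT skips the NULL that a CASE without ELSE yields for non-delays.
count_sql = """
    SELECT dayofweek,
           COUNT(CASE WHEN arrivaldelay >= 1 THEN 1 END)
         + COUNT(CASE WHEN departuredelay >= 1 THEN 1 END) AS delays
    FROM t GROUP BY dayofweek ORDER BY dayofweek
"""
# SUM adds 1 for each delay and 0 otherwise.
sum_sql = """
    SELECT dayofweek,
           SUM(CASE WHEN arrivaldelay >= 1 THEN 1 ELSE 0 END)
         + SUM(CASE WHEN departuredelay >= 1 THEN 1 ELSE 0 END) AS delays
    FROM t GROUP BY dayofweek ORDER BY dayofweek
"""
count_rows = conn.execute(count_sql).fetchall()
sum_rows = conn.execute(sum_sql).fetchall()

# The day with the most delays, picked from the aggregated rows.
busiest = max(sum_rows, key=lambda row: row[1])
print(count_rows, sum_rows, busiest)
```

Sorting by delays in descending order and keeping the first row would do the same thing inside the query itself.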
Related
I have a query that currently gets daily records against a weekly number from a prepopulated table:
SELECT Employee,
sum(case when category = 'Shirts' then daily_total else 0 end) as Shirts_DAILY,
sum(case when category = 'Shirts' then weekly_quota else 0 end) as Shirts_QUOTA, -- this is a static column, this number is the same for every record
sum(case when category = 'Shoes' then daily_total else 0 end) as Shoes_DAILY,
sum(case when category = 'Shoes' then weekly_quota else 0 end) as Shoes_QUOTA, -- this is a static column, this number is the same for every record
CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
where date_of_report >= current_date
group by Employee;
This runs in a script nightly and returns records like this:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 14 85 2019-08-30
That's the record from last Friday night's report. I'm trying to figure out a way to add a column for each category that takes the sum of that category's daily totals (shirts_DAILY, shoes_DAILY) for the week so far (a week runs Sunday through Saturday) and divides it by that category's quota (shirts_QUOTA, shoes_QUOTA).
For example, here are the records from Sunday through Thursday:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 16 85 2019-08-25
123 4 75 2 85 2019-08-26
123 8 75 6 85 2019-08-27
123 2 75 8 85 2019-08-28
123 15 75 14 85 2019-08-29
With my new change, I would want Friday night's record to take the sum of Sunday through Thursday's daily records, plus Friday's own daily total, and divide it by the quota.
Friday night's record with new column:
Employee | shirts_DAILY | shirts_QUOTA | shirtsPercent | Shoes_DAILY | Shoes_QUOTA | shoesPercent | DATE_OF_REPORT
-----------------------------------------------------------------------------------------------------------------------------------------------
123 2 75 61.3 7 85 62.4 2019-08-30
So Friday's run added 15, 4, 8, 2, 15, 2 for shirts, giving 46/75, and 7, 14, 8, 6, 2, 16 for shoes, giving 53/85. In other words, each percentage is the sum of the daily totals for the week so far, including the present day's total, divided by the quota.
What is the best way for me to achieve this?
SELECT Employee,
sum(case when category = 'Shirts' and date_of_report >= current date then
daily_total else 0 end) as Shirts_DAILY,
sum(case when category = 'Shirts' and date_of_report >= current date then
weekly_quota else 0 end) as Shirts_QUOTA,
( sum(case when category = 'Shirts' then
daily_total else 0 end) * 100.0 ) / -- 100.0 avoids integer division
( sum(case when category = 'Shirts' and date_of_report >= current date then
weekly_quota else 0 end) ) as Shirts_PERCENT,
CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
where date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days )
group by Employee
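The week-to-date idea above can be tried out with a small sketch in Python's sqlite3 module. Several things here are assumptions for illustration: SQLite instead of the original database, a fixed report date passed as the parameter :d instead of the current date, the long table layout (one row per employee, category and day), and MAX to pick up the static quota. The week start is computed with SQLite's strftime('%w', ...), where Sunday is 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesNumbers (Employee INT, category TEXT, daily_total INT,"
    " weekly_quota INT, date_of_report TEXT)"
)
shirts = [15, 4, 8, 2, 15, 2]   # Sunday 2019-08-25 .. Friday 2019-08-30
shoes = [16, 2, 6, 8, 14, 7]
for offset, (shirt_total, shoe_total) in enumerate(zip(shirts, shoes)):
    day = f"2019-08-{25 + offset}"
    conn.execute("INSERT INTO SalesNumbers VALUES (123, 'Shirts', ?, 75, ?)", (shirt_total, day))
    conn.execute("INSERT INTO SalesNumbers VALUES (123, 'Shoes', ?, 85, ?)", (shoe_total, day))
# A row from the previous week, which the WHERE clause must exclude.
conn.execute("INSERT INTO SalesNumbers VALUES (123, 'Shirts', 50, 75, '2019-08-24')")

sql = """
    SELECT Employee,
           SUM(CASE WHEN category = 'Shirts' AND date_of_report = :d
                    THEN daily_total ELSE 0 END) AS shirts_daily,
           MAX(CASE WHEN category = 'Shirts' THEN weekly_quota END) AS shirts_quota,
           ROUND(100.0 * SUM(CASE WHEN category = 'Shirts' THEN daily_total ELSE 0 END)
                 / MAX(CASE WHEN category = 'Shirts' THEN weekly_quota END), 1) AS shirts_percent,
           SUM(CASE WHEN category = 'Shoes' AND date_of_report = :d
                    THEN daily_total ELSE 0 END) AS shoes_daily,
           MAX(CASE WHEN category = 'Shoes' THEN weekly_quota END) AS shoes_quota,
           ROUND(100.0 * SUM(CASE WHEN category = 'Shoes' THEN daily_total ELSE 0 END)
                 / MAX(CASE WHEN category = 'Shoes' THEN weekly_quota END), 1) AS shoes_percent
    FROM SalesNumbers
    -- keep rows from the Sunday that starts the week through the report date
    WHERE date_of_report BETWEEN date(:d, '-' || strftime('%w', :d) || ' days') AND :d
    GROUP BY Employee
"""
rows = conn.execute(sql, {"d": "2019-08-30"}).fetchall()
print(rows)
```

The daily columns only count the report date, while the percentage columns sum every row that survives the week filter.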
I'd like to build a query that returns whether tasks are very late, late, near on time, or on time.
Task status:
early if -2 days
near on time if -1 day
late if 1 day
very late if 2 days
What I've tried:
SELECT field_1, diff,
COUNT(CASE WHEN diff <= -2 THEN 1 END) onTime,
COUNT(CASE WHEN diff <= -1 THEN 1 END) nearOnTime,
Count(CASE WHEN diff >= 2 THEN 1 END) veryLate,
Count(CASE WHEN diff >= 0 THEN 1 END) Late
FROM(
SELECT field_1, DATEDIFF(day,Max(predicted_date), realization_date) as diff
FROM table
Group by field_1, realization_date
HAVING end_date is not null) as req1
GROUP BY field_1, diff)
diff: the difference between the predicted date and the realization date
=> returns the number of days between these two dates
It returns :
field_1 | diff | onTime | nearOnTime | veryLate | Late
---------+--------+----------+--------------+------------+-------
task1 | -3 | 1 | 1 | 0 | 0
task2 | 2 | 0 | 0 | 1 | 1
I think my approach is bad, so what are my options for returning the task status?
Maybe something along these lines (a fiddle would help; this has not been tested):
SELECT field_1, diff,
CASE WHEN diff <= -2 THEN 'On Time'
WHEN diff <= -1 THEN 'nearOnTime'
WHEN diff >= 2 THEN 'veryLate'
WHEN diff >= 0 THEN 'Late'
else 'OK' END as status
FROM(
SELECT field_1, DATEDIFF(day,Max(predicted_date), realization_date) as diff
FROM table -- placeholder table name from the question
WHERE end_date is not null -- filter rows before grouping
Group by field_1, realization_date) as req1
GROUP BY field_1, diff
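Because a CASE expression returns the first branch that matches, the order of the WHEN clauses matters here. The sketch below tries the same classification in SQLite through Python's sqlite3 module. The table name tasks, the sample dates, and the use of julianday() in place of DATEDIFF are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (field_1 TEXT, predicted_date TEXT, realization_date TEXT)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        ("task1", "2020-01-10", "2020-01-07"),  # realized 3 days early
        ("task2", "2020-01-10", "2020-01-12"),  # realized 2 days late
        ("task3", "2020-01-10", "2020-01-09"),  # realized 1 day early
        ("task4", "2020-01-10", "2020-01-10"),  # realized on the predicted day
    ],
)

rows = conn.execute(
    """
    SELECT field_1, diff,
           CASE WHEN diff <= -2 THEN 'On Time'     -- checked first
                WHEN diff <= -1 THEN 'nearOnTime'  -- only reached when diff = -1
                WHEN diff >= 2  THEN 'veryLate'
                WHEN diff >= 0  THEN 'Late'        -- only reached when diff is 0 or 1
                ELSE 'OK' END AS status
    FROM (
        SELECT field_1,
               CAST(julianday(realization_date)
                    - julianday(MAX(predicted_date)) AS INTEGER) AS diff
        FROM tasks
        GROUP BY field_1, realization_date
    ) AS req1
    ORDER BY field_1
    """
).fetchall()
print(rows)
```

Each task gets exactly one label, unlike the COUNT version in the question, where a diff of -3 is counted under both onTime and nearOnTime.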
I have a little problem.
I have a table, let's call it "events", with columns like: type (1 or 0), timestamp start, timestamp end.
I want to group them by hour (60-minute periods)
into 4 columns, each calculating:
How many minutes per hour there was neither a type 1 nor a type 0 event.
How many minutes per hour there was a type 1 event and, at the same time, no type 2 event.
How many minutes per hour there was a type 2 event and, at the same time, no type 1 event.
How many minutes per hour there was a type 2 event and a type 1 event at the same time.
Result should look like this:
hour 00 10 01 11
12 10 20 20 10
13 5 15 25 15
Each row should always sum to 60 minutes.
Is it possible to do this in SQL? I need it in Vertica, so I can use Vertica's functions too.
Interesting question! Here is a query that gets you what you need. I mocked up the following table with some dummy data, and show the results of the query at the end. As you required, the totals always add up to 60 minutes within each hour.
SETUP:
create table public.time_event_test(event_timestamp timestamptz, event_type int);
insert into public.time_event_test(event_timestamp,event_type) select getutcdate() as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',5,getutcdate()) as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',1,getutcdate()) as event_timestamp, 1 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',1,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',3,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',6,getutcdate()) as event_timestamp, 2 as event_type;
insert into public.time_event_test(event_timestamp,event_type) select TIMESTAMPADD('minute',90,getutcdate()) as event_timestamp, 2 as event_type;
QUERY:
select date_trunc('hour',dat) as hr
, 60 - sum(case when event_type1 = 1 or event_type2 = 1 then 1 else 0 end) as type_00
, sum(case when event_type1 = 0 and event_type2 = 1 then 1 else 0 end) as type_01
, sum(case when event_type1 = 1 and event_type2 = 0 then 1 else 0 end) as type_10
, sum(case when event_type1 = 1 and event_type2 = 1 then 1 else 0 end) as type_11
from (
select date_trunc('minute',event_timestamp) as dat
, max(case when event_type = 1 then 1 else 0 end) as event_type1
, max(case when event_type = 2 then 1 else 0 end) as event_type2
from public.time_event_test
group by 1
) x
group by 1 order by 1;
RESULTS:
hr | type_00 | type_01 | type_10 | type_11
------------------------+---------+---------+---------+---------
2016-12-21 01:00:00+00 | 52 | 3 | 2 | 3
2016-12-21 02:00:00+00 | 59 | 1 | 0 | 0
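For readers without a Vertica instance, the same two-step idea (flag each minute, then count flags per hour) can be sketched in SQLite through Python's sqlite3 module. The fixed timestamps and the use of strftime() and substr() in place of date_trunc() are assumptions made for this example. Like the query above, it treats each row as a point event in a single minute, so it does not handle events with separate start and end timestamps, and hours without any event produce no row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE time_event_test (event_timestamp TEXT, event_type INT)")
conn.executemany(
    "INSERT INTO time_event_test VALUES (?, ?)",
    [
        ("2016-12-21 01:10:00", 1),
        ("2016-12-21 01:15:00", 1),
        ("2016-12-21 01:11:00", 1),
        ("2016-12-21 01:11:30", 2),  # same minute as a type 1 event
        ("2016-12-21 01:13:00", 2),
        ("2016-12-21 01:16:00", 2),
        ("2016-12-21 02:40:00", 2),
    ],
)

rows = conn.execute(
    """
    SELECT substr(minute, 1, 13) || ':00' AS hr,
           60 - SUM(CASE WHEN event_type1 = 1 OR  event_type2 = 1 THEN 1 ELSE 0 END) AS type_00,
           SUM(CASE WHEN event_type1 = 0 AND event_type2 = 1 THEN 1 ELSE 0 END) AS type_01,
           SUM(CASE WHEN event_type1 = 1 AND event_type2 = 0 THEN 1 ELSE 0 END) AS type_10,
           SUM(CASE WHEN event_type1 = 1 AND event_type2 = 1 THEN 1 ELSE 0 END) AS type_11
    FROM (
        -- one row per minute that has at least one event, with a 0/1 flag per type
        SELECT strftime('%Y-%m-%d %H:%M', event_timestamp) AS minute,
               MAX(CASE WHEN event_type = 1 THEN 1 ELSE 0 END) AS event_type1,
               MAX(CASE WHEN event_type = 2 THEN 1 ELSE 0 END) AS event_type2
        FROM time_event_test
        GROUP BY minute
    ) AS x
    GROUP BY hr
    ORDER BY hr
    """
).fetchall()
print(rows)
```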
I have a table license_Usage which works like a log of the usage of licenses per day:
ID User license date
1 1 A 22/1/2015
2 1 A 23/1/2015
3 1 B 23/1/2015
4 1 A 24/1/2015
5 2 A 22/2/2015
6 2 A 23/2/2015
7 1 B 23/2/2015
I want to count how many licenses a user used in a month. The result should look like:
User Jan Feb
1 2 1 ...
2 0 2
How can I manage to do that?
You need a PIVOT or cross tab query. e.g.
SELECT [User],
COUNT(CASE WHEN Month = 1 THEN 1 END) AS Jan,
COUNT(CASE WHEN Month = 2 THEN 1 END) AS Feb,
COUNT(CASE WHEN Month = 3 THEN 1 END) AS Mar
/*TODO - Fill in other 9 months using above pattern*/
FROM [license]
CROSS APPLY (SELECT MONTH([date])) AS CA(Month)
WHERE [date] >= '20150101'
AND [date] < '20160101'
AND [license] = 'A'
GROUP BY [User]
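The conditional-count pivot above can be tried out with a small sketch in SQLite through Python's sqlite3 module. The table name license_usage, the column names user_id and usage_date, and the ISO date format are assumptions made for this example; strftime('%m', ...) stands in for MONTH(). As in the answer, only license 'A' is counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE license_usage (id INT, user_id INT, license TEXT, usage_date TEXT)"
)
conn.executemany(
    "INSERT INTO license_usage VALUES (?, ?, ?, ?)",
    [
        (1, 1, "A", "2015-01-22"),
        (2, 1, "A", "2015-01-23"),
        (3, 1, "B", "2015-01-23"),
        (4, 1, "A", "2015-01-24"),
        (5, 2, "A", "2015-02-22"),
        (6, 2, "A", "2015-02-23"),
        (7, 1, "B", "2015-02-23"),
    ],
)

rows = conn.execute(
    """
    SELECT user_id,
           COUNT(CASE WHEN month = 1 THEN 1 END) AS jan,
           COUNT(CASE WHEN month = 2 THEN 1 END) AS feb
    FROM (
        -- extract the month number once, then pivot on it
        SELECT user_id, CAST(strftime('%m', usage_date) AS INTEGER) AS month
        FROM license_usage
        WHERE usage_date >= '2015-01-01'
          AND usage_date <  '2016-01-01'
          AND license = 'A'
    ) AS m
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()
print(rows)
```

The counts are usage rows per month for license 'A', which is what the answer's query computes; counting distinct licenses per month would need COUNT(DISTINCT ...) without the license filter.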
Background Info
I have a large table (400M+ rows) that changes daily (one day's data drops out and a new day's data drops in). The table is partitioned on a 'day' field, so there are 31 partitions.
Each row in the table has data similar to this:
ID  Postcode  DeliveryPoint  Quantity  Day  Month
1   SN1 1BG   A1             6         29   1
2   SN1 1BG   A1             1         28   1
3   SN1 1BG   A2             2         27   1
4   SN1 1BG   A1             3         28   1
5   SN2 1AQ   B1             1         29   12
6   SN1 1BG   A1             2         26   12
I need to pull out 7 days of data in the format:
Postcode  Deliverypoint  7dayAverage  Day1  Day2  Day3  Day4  Day5  Day6  Day7
SN1 1BG   A1             2            0     1     2     1     3     4     0
I can easily extract the data for the 7 day period but need to create a columnar version as shown above.
I have something like this:
select postcode,deliverypoint,
sum (case day when 23 then quantity else 0 end) as day1,
sum (case day when 24 then quantity else 0 end) as day2,
sum(case day when 25 then quantity else 0 end) as day3,
sum(case day when 26 then quantity else 0 end) as day4,
sum(case day when 27 then quantity else 0 end) as day5,
sum(case day when 28 then quantity else 0 end) as day6,
sum(case day when 29 then quantity else 0 end) as day7,
sum(quantity)*1.0/@daysinweek as wkavg
into #allweekdp
from maintable dp with (nolock)
where day in (select day from #days)
group by postcode,deliverypoint
where #days has the day numbers in the 7 day period.
But as you can see, I've hard-coded the day numbers into the query. I want to get them out of my temporary table #days but can't see a way of doing it (an array would be perfect here).
Or am I going about this in completely the wrong way?
Kind Regards
Steve
If I understand correctly, what I would do is:
Convert the day and month columns into datetime values,
Get the first day of the week and the weekday number (1-7) for each date, and
Pivot the data and group by the first day of the week
see here: sqlfiddle
As utexaspunk suggested, Pivot might be the way to go. I've never been comfortable with pivot and have preferred to pivot it manually so I control how everything looks, so I'm using a similar solution to how you did your script to solve the issue. No idea how the performance between my way and utexaspunk's will compare.
Declare @Min_Day Integer = (Select MIN(day) From #days);
With Day_Coding_CTE as (
Select Distinct day
, day - @Min_Day + 1 as Day_Label
From #days
)
, Non_Columnar_CTE as (
Select dp.postcode
, dp.deliverypoint
, d.day
, c.Day_Label
, SUM(dp.quantity) as Quantity
From maintable dp with (nolock)
-- inner joins keep only the rows whose day is listed in #days
Inner Join #days d
on dp.day = d.day --It also seems like you'll need more criteria here, but you'll have to figure out what those should be
Inner Join Day_Coding_CTE c
on d.day = c.day
Group by dp.postcode
, dp.deliverypoint
, d.day
, c.Day_Label
)
Select postcode
, deliverypoint
, SUM(Case
When Day_Label = 1
Then Quantity
Else 0
End) as Day1
, SUM(Case
When Day_Label = 2
Then Quantity
Else 0
End) as Day2
, SUM(Case
When Day_Label = 3
Then Quantity
Else 0
End) as Day3
, SUM(Case
When Day_Label = 4
Then Quantity
Else 0
End) as Day4
, SUM(Case
When Day_Label = 5
Then Quantity
Else 0
End) as Day5
, SUM(Case
When Day_Label = 6
Then Quantity
Else 0
End) as Day6
, SUM(Case
When Day_Label = 7
Then Quantity
Else 0
End) as Day7
, SUM(Quantity) * 1.0 / @daysinweek as wkavg
From Non_Columnar_CTE
Group by postcode
, deliverypoint;
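The relative day label idea (day minus the smallest day in the window, plus one) can be sketched in SQLite through Python's sqlite3 module. The table name days stands in for the temp table #days, the sample rows are made up for illustration, and the seven SUM columns are generated in Python only to keep the example short. As with the script above, the label arithmetic assumes the window does not cross a month boundary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maintable (postcode TEXT, deliverypoint TEXT, quantity INT, day INT)"
)
conn.execute("CREATE TABLE days (day INT)")  # stand-in for the #days temp table
conn.executemany(
    "INSERT INTO maintable VALUES (?, ?, ?, ?)",
    [
        ("SN1 1BG", "A1", 6, 29),
        ("SN1 1BG", "A1", 1, 28),
        ("SN1 1BG", "A2", 2, 27),
        ("SN1 1BG", "A1", 3, 28),
        ("SN2 1AQ", "B1", 1, 29),
        ("SN1 1BG", "A1", 2, 26),
        ("SN1 1BG", "A1", 9, 15),  # outside the window, dropped by the join
    ],
)
conn.executemany("INSERT INTO days VALUES (?)", [(d,) for d in range(23, 30)])

# Build day1..day7 from the relative label instead of hard-coded day numbers.
day_columns = ",\n       ".join(
    f"SUM(CASE WHEN c.day_label = {i} THEN m.quantity ELSE 0 END) AS day{i}"
    for i in range(1, 8)
)
sql = f"""
    WITH day_coding AS (
        SELECT day, day - (SELECT MIN(day) FROM days) + 1 AS day_label
        FROM days
    )
    SELECT m.postcode, m.deliverypoint,
           {day_columns},
           SUM(m.quantity) * 1.0 / (SELECT COUNT(*) FROM days) AS wkavg
    FROM maintable AS m
    JOIN day_coding AS c ON m.day = c.day
    GROUP BY m.postcode, m.deliverypoint
    ORDER BY m.postcode, m.deliverypoint
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Changing the contents of days shifts the whole window without touching the query text.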