SQL query to get running total for records matching condition over last 12 months

I am trying to query a table and get a running total for each of the last 12 months. A record can fall in more than one month if the range between its two date fields spans multiple months. The fields are DueDate and DeferralDate (shown as Date1 and Date2 below).
So for example, let's say I have the following 4 records:
Id | Date1      | Date2
1  | 01/20/2020 | 05/29/2020
2  | 02/01/2020 | 08/14/2020
3  | 04/01/2020 | 04/30/2020
4  | 07/08/2020 | 12/31/2020
My result would look like this:
Nov 19 | Dec 19 | Jan 20 | Feb 20 | Mar 20 | Apr 20 | May 20 | Jun 20 | Jul 20 | Aug 20 | Sept 20 | Oct 20
0      | 0      | 1      | 2      | 2      | 3      | 2      | 1      | 2      | 2      | 1       | 1
I have no idea how to go about this other than 12 separate queries but there's probably a better way to do it I'm unaware of. Hopefully someone can point me in the right direction.
Thanks in advance.

If you want this in columns, then it is conditional aggregation. Assuming you want any overlap in the month:
select sum(case when date1 < '2019-12-01' and date2 >= '2019-11-01' then 1 else 0 end) as cnt_201911,
sum(case when date1 < '2020-01-01' and date2 >= '2019-12-01' then 1 else 0 end) as cnt_201912,
sum(case when date1 < '2020-02-01' and date2 >= '2020-01-01' then 1 else 0 end) as cnt_202001,
sum(case when date1 < '2020-03-01' and date2 >= '2020-02-01' then 1 else 0 end) as cnt_202002,
. . .
from t
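To avoid typing twelve nearly identical expressions by hand, the column list can be generated. Below is a minimal sketch in Python that builds the same conditional-aggregation query and runs it against an in-memory SQLite database loaded with the four sample records. The table name t and the columns date1/date2 follow the answer above; the helper month_starts, the ISO-formatted date strings, and the use of SQLite are assumptions made purely for illustration.

```python
import sqlite3
from datetime import date


def month_starts(first, count):
    """Return count + 1 consecutive first-of-month dates, starting at `first`."""
    out = []
    y, m = first.year, first.month
    for _ in range(count + 1):
        out.append(date(y, m, 1))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return out


conn = sqlite3.connect(":memory:")
# Dates are stored as ISO strings so plain string comparison orders them correctly.
conn.execute("create table t (id integer, date1 text, date2 text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        (1, "2020-01-20", "2020-05-29"),
        (2, "2020-02-01", "2020-08-14"),
        (3, "2020-04-01", "2020-04-30"),
        (4, "2020-07-08", "2020-12-31"),
    ],
)

# Nov 2019 .. Oct 2020: 12 months, so 13 boundaries.
starts = month_starts(date(2019, 11, 1), 12)

cols, params = [], []
for lo, hi in zip(starts, starts[1:]):
    # A record overlaps the month if it starts before the next month
    # begins and ends on or after the first day of this month.
    cols.append(
        f"sum(case when date1 < ? and date2 >= ? then 1 else 0 end) as cnt_{lo:%Y%m}"
    )
    params += [hi.isoformat(), lo.isoformat()]

sql = "select " + ",\n       ".join(cols) + "\nfrom t"
counts = list(conn.execute(sql, params).fetchone())
print(counts)  # → [0, 0, 1, 2, 2, 3, 2, 1, 2, 2, 1, 1]
```

The printed counts match the expected result row in the question. In a real report the first month would be derived from the current date rather than hard-coded.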

Select count(date1) + count(date2) as cnt,
Format(date1,'MMMyy') as mon
from tablename
Where month(date1) = month(date2)
group by Format(date1,'MMMyy')
Then use PIVOT to turn the rows of that result into columns.

Related

How to group records by hours considering start date and end date

I'm trying to group records by hour, taking duration into account. Assume there are long-running processes, and log data records when each process started and finished. I'm trying to get a report of how many processes were running in each hour.
The data looks like this
Process_name Start End
'A' '2019/01/01 14:10' '2019/01/01 14:55'
'B' '2019/01/01 14:20' '2019/01/01 16:30'
'C' '2019/01/01 15:05' '2019/01/01 15:10'
The result should be like this
Hour ProcessQount
14 2
15 2
16 1
You can do it if you join a recursive cte which returns all the hours of the day to the table:
with cte as (
select 0 as hour
union all
select hour + 1
from cte
where hour < 23
)
select c.hour Hour, count(*) ProcessQount
from cte c inner join tablename t
on c.hour between datepart(hh, t.[Start]) and datepart(hh, t.[End])
group by c.hour
Results:
> Hour | ProcessQount
> ---: | -----------:
> 14 | 2
> 15 | 2
> 16 | 1
If you change to a LEFT JOIN and count([Process_name]) then you get results for all the hours of the day:
> Hour | ProcessQount
.........................
> 12 | 0
> 13 | 0
> 14 | 2
> 15 | 2
> 16 | 1
> 17 | 0
> 18 | 0
.........................
Generate the hours and then use inequalities and aggregation:
select v.h, count(t.process_name)
from (values (14), (15), (16)) v(h) left join
t
on datepart(hour, t.[start]) <= v.h and
datepart(hour, t.[end]) >= v.h
group by v.h
order by v.h;
For reasonable results, this assumes that all the data you are looking at is for one day, as in your sample data.
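The two answers can be combined: generate every hour of the day, then left join with inequalities. Here is a minimal sketch run through Python's sqlite3 module. The column names start_ts and end_ts are hypothetical stand-ins for Start and End, and SQLite's strftime('%H', ...) plays the role of datepart(hour, ...); like the answers above, it assumes all rows fall on a single day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (process_name text, start_ts text, end_ts text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("A", "2019-01-01 14:10", "2019-01-01 14:55"),
        ("B", "2019-01-01 14:20", "2019-01-01 16:30"),
        ("C", "2019-01-01 15:05", "2019-01-01 15:10"),
    ],
)

sql = """
with recursive hours(h) as (
    select 0
    union all
    select h + 1 from hours where h < 23
)
select h, count(t.process_name)   -- counts only matched rows, so idle hours give 0
from hours
left join t
  on cast(strftime('%H', t.start_ts) as integer) <= h
 and cast(strftime('%H', t.end_ts)   as integer) >= h
group by h
order by h
"""

per_hour = dict(conn.execute(sql).fetchall())
print({h: n for h, n in per_hour.items() if n})  # → {14: 2, 15: 2, 16: 1}
```

Counting t.process_name instead of * is what keeps the empty hours at zero, since the left join produces a NULL there.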

DB2/SQL aggregates with preceding weekdays

I have a query that currently gets daily records against a weekly number from a prepopulated table:
SELECT Employee,
sum(case when category = 'Shirts' then daily_total else 0 end) as Shirts_DAILY,
sum(case when category = 'Shirts' then weekly_quota else 0 end) as Shirts_QUOTA, -- this is a static column, this number is the same for every record
sum(case when category = 'Shoes' then daily_total else 0 end) as Shoes_DAILY,
sum(case when category = 'Shoes' then weekly_quota else 0 end) as Shoes_QUOTA, -- this is a static column, this number is the same for every record
CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
where date_of_report >= current_date
group by Employee;
This runs in a script nightly and returns records like this:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 14 85 2019-08-30
That's the record from last Friday night's report. I'm trying to figure out a way to add a column for each category that takes the sum of daily totals (shirts_DAILY, shoes_DAILY) for that category over the preceding days of the week (a week runs Sunday through Saturday) and divides it by that category's quota (shirts_QUOTA, shoes_QUOTA).
For example, here are the records from Sunday through Thursday:
Employee | shirts_DAILY | shirts_QUOTA | Shoes_DAILY | Shoes_QUOTA | DATE_OF_REPORT
--------------------------------------------------------------------------------------------------------
123 15 75 16 85 2019-08-25
123 4 75 2 85 2019-08-26
123 8 75 6 85 2019-08-27
123 2 75 8 85 2019-08-28
123 15 75 14 85 2019-08-29
With my new change, I would want Friday night's record to take the sum of Sunday through Thursday's daily records, plus Friday's own daily total, and divide it by the quota.
Friday night's record with new column:
Employee | shirts_DAILY | shirts_QUOTA | shirtsPercent | Shoes_DAILY | Shoes_QUOTA | shoesPercent | DATE_OF_REPORT
-----------------------------------------------------------------------------------------------------------------------------------------------
123 2 75 61.3 7 85 62.4 2019-08-30
So Friday's run added 15, 4, 8, 2, 15, 2 for shirts, giving 46/75, and 7, 14, 8, 6, 2, 16 for shoes, giving 53/85. In other words, each percentage is the sum of that category's daily totals for the week so far, including the current day's total, divided by the quota.
What is the best way for me to achieve this?
SELECT Employee,
sum(case when category = 'Shirts' and date_of_report >= current date then
daily_total else 0 end) as Shirts_DAILY,
sum(case when category = 'Shirts' and date_of_report >= current date then
weekly_quota else 0 end) as Shirts_QUOTA,
( sum(case when category = 'Shirts' then
daily_total else 0 end) * 100 ) /
( sum(case when category = 'Shirts' and date_of_report >= current date then
weekly_quota else 0 end) ) as Shirts_PERCENT,
CURRENT_DATE as DATE_OF_REPORT
from SalesNumbers
where date_of_report >= ( current date - ( dayofweek(current date) - 1 ) days )
group by Employee
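The date arithmetic in the where clause (step back to the most recent Sunday, then sum everything from there through today) can be checked against the numbers in the question. This is only a sketch in plain Python, not DB2: the week_start helper and the extra Saturday row are hypothetical, added to show that the previous week is excluded.

```python
from datetime import date, timedelta


def week_start(d):
    """Return the Sunday on or before d (weeks run Sunday through Saturday)."""
    # date.weekday() is Monday=0 .. Sunday=6, so shift by one to count from Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


# (report date, shirts daily total, shoes daily total) from the question,
# preceded by a hypothetical row from the previous Saturday.
rows = [
    (date(2019, 8, 24), 9, 9),
    (date(2019, 8, 25), 15, 16),
    (date(2019, 8, 26), 4, 2),
    (date(2019, 8, 27), 8, 6),
    (date(2019, 8, 28), 2, 8),
    (date(2019, 8, 29), 15, 14),
    (date(2019, 8, 30), 2, 7),
]
SHIRTS_QUOTA, SHOES_QUOTA = 75, 85

today = date(2019, 8, 30)          # the Friday from the example
start = week_start(today)          # Sunday 2019-08-25
in_week = [r for r in rows if start <= r[0] <= today]

shirts_pct = round(sum(r[1] for r in in_week) * 100 / SHIRTS_QUOTA, 1)
shoes_pct = round(sum(r[2] for r in in_week) * 100 / SHOES_QUOTA, 1)
print(start, shirts_pct, shoes_pct)  # → 2019-08-25 61.3 62.4
```

One thing to watch in the SQL version: if daily_total and weekly_quota are integer columns, the division is likely to truncate (61 rather than 61.3), so multiplying by 100.0 or casting to a decimal type first may be needed.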

SQL Select Where date in (Jan and March) but not in Feb

I have a table like this in SQL called Balance
+----+-----------+-------+------+
| id | accountId | Date | Type |
+----+-----------+-------+------+
| PK | FK | Date | Int |
+----+-----------+-------+------+
I need to find the accountIds that have balance entries in January and March, but not in February.
Only in 2018 and Type should be 2.
How would I go about writing my sql select statement?
Thanks
Edit:
What I've done so far:
Selecting rows that are in either Jan or March is not a problem for me.
SELECT AccountId, Date FROM Balance
WHERE Month(Date) in (1,3) AND YEAR(Date) = 2018 AND Type =2
ORDER BY AccountId, Date
But if an AccountId has a single entry, say in January, then this will be included. And that's not what I want.
Only if an Account has entries in both Jan and March, and not in Feb is it interesting.
I suspect Group BY and HAVING are keys here, but I'm unsure how to proceed
I would do this using aggregation:
select b.accountid
from balance b
where date >= '2018-01-01' and date < '2019-01-01' and
b.type = 2 -- only type 2, as the question requires
group by b.accountid
having sum(case when month(date) = 1 then 1 else 0 end) > 0 and -- has january
sum(case when month(date) = 3 then 1 else 0 end) > 0 and -- has march
sum(case when month(date) = 2 then 1 else 0 end) = 0 -- does not have february
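A quick way to convince yourself that the HAVING approach behaves as intended is to run it on a handful of rows. The sketch below uses Python's sqlite3 module with made-up sample data; SQLite has no month() function, so strftime('%m', date) is used instead, and the filter on type = 2 from the question is included in the where clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table balance ("
    "id integer primary key, accountId integer, date text, type integer)"
)
# Hypothetical sample rows: (accountId, date, type)
rows = [
    (1, "2018-01-15", 2), (1, "2018-03-10", 2),                        # Jan + Mar: wanted
    (2, "2018-01-05", 2), (2, "2018-02-20", 2), (2, "2018-03-02", 2),  # has Feb
    (3, "2018-01-09", 2),                                              # Jan only
    (4, "2018-01-09", 1), (4, "2018-03-09", 1),                        # wrong type
    (5, "2017-03-01", 2), (5, "2018-01-20", 2),                        # Mar is in 2017
]
conn.executemany(
    "insert into balance (accountId, date, type) values (?, ?, ?)", rows
)

sql = """
select accountId
from balance
where date >= '2018-01-01' and date < '2019-01-01' and type = 2
group by accountId
having sum(case when strftime('%m', date) = '01' then 1 else 0 end) > 0
   and sum(case when strftime('%m', date) = '03' then 1 else 0 end) > 0
   and sum(case when strftime('%m', date) = '02' then 1 else 0 end) = 0
order by accountId
"""

matching = [r[0] for r in conn.execute(sql)]
print(matching)  # → [1]
```

Only account 1 survives: the year and type filters run before grouping, and the three sums then test for the presence or absence of each month.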

SQL Calculations for budgeting

I have a database that contains the following columns:
Vendor, Amount, StartDate, Months
I would like to be able to calculate the average monthly amount based on the Months that are entered. I would also like to see it calculate out from the start date to the end date based on the StartDate + Months calculation. The resulting table would look something like this:
Vendor1 has 2 months of 1112 starting Jan 1, while Vendor2 has 3 months of 2040 starting Feb 1.
| | ANNUAL | JAN | FEB | MAR | APR |
Vendor1 | 2,224 | 1,112 | 1,112 | | |
Vendor2 | 6,120 | | 2,040 | 2,040 | 2,040 |
Any assistance or direction would be greatly appreciated.
That's a strange DB design. However, here's what you've got to try:
SELECT (Amount * Months) AS Annual, (CASE WHEN StartDate < DATE("01.02.year") THEN Amount ELSE NULL END) AS Jan FROM Table --etc for all months
Will think of modifications though, because this way is a little too straightforward.
You would use conditional aggregation. Assuming the start dates are all in the same year, the code might look like this:
select vendor, (amount * months) as total,
-- a month is covered when the start month is on or before it
-- and start month + months is strictly after it
(case when month(startdate) <= 1 and month(startdate) + months > 1
then amount
end) as jan,
(case when month(startdate) <= 2 and month(startdate) + months > 2
then amount
end) as feb,
(case when month(startdate) <= 3 and month(startdate) + months > 3
then amount
end) as mar,
(case when month(startdate) <= 4 and month(startdate) + months > 4
then amount
end) as apr
from t;
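As a sanity check, the month-spreading logic can be run on the two vendors from the question. The sketch below uses Python's sqlite3 module. The year 2019 is an arbitrary assumption (the question does not give one), strftime('%m', ...) stands in for month(), and the end of the range is tested with a strict comparison so that a 2-month span starting in January covers January and February only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (vendor text, amount integer, startdate text, months integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [
        ("Vendor1", 1112, "2019-01-01", 2),  # year is a made-up placeholder
        ("Vendor2", 2040, "2019-02-01", 3),
    ],
)

names = ["jan", "feb", "mar", "apr"]
# One CASE per calendar month: covered if start month <= i < start month + months.
cols = ",\n       ".join(
    f"case when m <= {i} and m + months > {i} then amount end as {name}"
    for i, name in enumerate(names, start=1)
)

sql = f"""
with s as (
    select vendor, amount, months,
           cast(strftime('%m', startdate) as integer) as m
    from t
)
select vendor, amount * months as total,
       {cols}
from s
order by vendor
"""

result = conn.execute(sql).fetchall()
for row in result:
    print(row)
# → ('Vendor1', 2224, 1112, 1112, None, None)
# → ('Vendor2', 6120, None, 2040, 2040, 2040)
```

This reproduces the table in the question. Like the answer it illustrates, it only works while every range stays inside one calendar year.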

Count parts of total value as columns per row (pivot table)

I'm stuck on a seemingly easy query, but haven't managed to get it working over the last few hours.
I have a table files that holds file names and some values like the number of records in the file, DATE of creation (create_date), DATE of processing (processing_date) and so on. There can be multiple files for a create date at different hours, and it is likely that they will not get processed on the same day as their creation; in fact it can take three days or longer for them to get processed.
So let's assume I have these rows, as an example:
create_date | processing_date
------------------------------
2012-09-10 11:10:55.0 | 2012-09-11 18:00:18.0
2012-09-10 15:20:18.0 | 2012-09-11 13:38:19.0
2012-09-10 19:30:48.0 | 2012-09-12 10:59:00.0
2012-09-11 08:19:11.0 | 2012-09-11 18:14:44.0
2012-09-11 22:31:42.0 | 2012-09-21 03:51:09.0
What I want in a single query is to group by create_date truncated to the day, with 11 additional columns counting the files by the difference in days between processing_date and create_date, so that the result should roughly look like this:
create_date | diff0days | diff1days | diff2days | ... | diff10days
------------------------------------------------------------------------
2012-09-10 | 0 2 1 ... 0
2012-09-11 | 1 0 0 ... 1
and so on, I hope you get the point :)
I have tried this and so far it works getting a single aggregated column for a create_date with a difference of - for example - 3:
SELECT TRUNC(f.create_date, 'DD') as created, count(1) FROM files f WHERE TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD') = 3 GROUP BY TRUNC(f.create_date, 'DD')
I tried combining the single queries and I tried sub-queries, but that didn't help or at least my knowledge about SQL is not sufficient.
What I need is a hint so that I can include the various differences as columns, like shown above. How could I possibly achieve this?
That's basically the pivoting problem:
SELECT TRUNC(f.create_date, 'DD') as created
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 0 then 1 end) as diff0days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 1 then 1 end) as diff1days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 2 then 1 end) as diff2days
, ...
FROM files f
GROUP BY
TRUNC(f.create_date, 'DD')
SELECT CreateDate,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 1 THEN 1 ELSE 0 END) AS Diff1,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 2 THEN 1 ELSE 0 END) AS Diff2,
...
FROM table
GROUP BY CreateDate
ORDER BY CreateDate
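Since the eleven columns differ only in the number being compared, they can be generated instead of typed. Below is a minimal sketch using Python's sqlite3 module and the five rows from the question. It is not Oracle: date() and julianday() are SQLite functions used here in place of TRUNC and date subtraction, and the diffs CTE name is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table files (create_date text, processing_date text)")
conn.executemany(
    "insert into files values (?, ?)",
    [
        ("2012-09-10 11:10:55", "2012-09-11 18:00:18"),
        ("2012-09-10 15:20:18", "2012-09-11 13:38:19"),
        ("2012-09-10 19:30:48", "2012-09-12 10:59:00"),
        ("2012-09-11 08:19:11", "2012-09-11 18:14:44"),
        ("2012-09-11 22:31:42", "2012-09-21 03:51:09"),
    ],
)

# diff0days .. diff10days, one conditional sum each.
cols = ",\n       ".join(
    f"sum(case when d = {n} then 1 else 0 end) as diff{n}days" for n in range(11)
)

sql = f"""
with diffs as (
    select date(create_date) as created,
           -- whole days between the two calendar dates, ignoring time of day
           cast(julianday(date(processing_date))
                - julianday(date(create_date)) as integer) as d
    from files
)
select created,
       {cols}
from diffs
group by created
order by created
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# → ('2012-09-10', 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0)
# → ('2012-09-11', 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
```

Computing the day difference once in a subquery keeps each generated column short and avoids repeating the truncation expression eleven times.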
As you are using Oracle 11g, you can also get the desired result by using a pivot query.
Here is an example:
-- sample of data from your question
SQL> create table Your_table(create_date, processing_date) as
(
  select '2012-09-10', '2012-09-11' from dual union all
  select '2012-09-10', '2012-09-11' from dual union all
  select '2012-09-10', '2012-09-12' from dual union all
  select '2012-09-11', '2012-09-11' from dual union all
  select '2012-09-11', '2012-09-21' from dual
)
;
Table created
SQL> with t2 as(
  select create_date
       , processing_date
       , to_date(processing_date, 'YYYY-MM-DD')
         - To_Date(create_date, 'YYYY-MM-DD') dif
    from your_table
)
select create_date
     , max(diff0) diff0
     , max(diff1) diff1
     , max(diff2) diff2
     , max(diff3) diff3
     , max(diff4) diff4
     , max(diff5) diff5
     , max(diff6) diff6
     , max(diff7) diff7
     , max(diff8) diff8
     , max(diff9) diff9
     , max(diff10) diff10
  from (select *
          from t2
         pivot(
           count(dif)
           for dif in ( 0 diff0
                      , 1 diff1
                      , 2 diff2
                      , 3 diff3
                      , 4 diff4
                      , 5 diff5
                      , 6 diff6
                      , 7 diff7
                      , 8 diff8
                      , 9 diff9
                      , 10 diff10
                      )
         ) pd
       ) res
 group by create_date
;
Result:
Create_Date Diff0 Diff1 Diff2 Diff3 Diff4 Diff5 Diff6 Diff7 Diff8 Diff9 Diff10
--------------------------------------------------------------------------------
2012-09-10 0 2 1 0 0 0 0 0 0 0 0
2012-09-11 1 0 0 0 0 0 0 0 0 0 1