(SQL) How to get absence "day" from date under 1-31 columns using PIVOT

I have a table abs_details that gives data as follows:
PERSON_NUMBER ABS_DATE ABS_TYPE_NAME ABS_DAYS
1010 01-01-2022 PTO 1
1010 06-01-2022 PTO 0.52
1010 02-02-2022 VACATION 1
1010 03-02-2022 VACATION 0.2
1010 01-12-2021 PTO 1
1010 02-12-2021 sick 1
1010 30-12-2021 sick 1
1010 30-01-2022 SICK 1
I want this data to be displayed in the following way:
PERSON_NUMBER ABS_TYPE_NAME 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
1010 PTO 2 0.52
1010 VACATION 1 0.2
1010 SICK 1 2
The header should list the days 1-31. If an absence was taken on, say, the 1st of any month in the month or quarter passed, the value should go under 1. If there is no value for a day of the month (for example, nothing falls on the 7th through the 11th in the data above), the column should still be displayed but left empty. In the sample, PTO totals 2 under day 1 and 0.52 under day 6.
Is this feasible in SQL? I think PIVOT could be used, but how do I fix the 1-31 header and place the values underneath each day?
Any suggestions?
If I pass multiple quarters, for example Q1 (JAN-MAR) and Q2 (APR-JUN), it should sum the values for the dates that fall in those quarters. If I pass just Q1, only the Q1 result should be shown.
If I pass multiple months, it should display the sum of the values for each absence type across those months.
I will pass the year as a parameter, and both of the above should apply to the year I pass.

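Derive a day-of-month column from the date, then pivot on it using the PIVOT clause in Oracle.
SELECT *
FROM
(
SELECT PERSON_NUMBER,
-- Assumes ABS_DATE is stored as a 'DD-MM-YYYY' string; for a DATE column use EXTRACT(DAY FROM ABS_DATE)
EXTRACT(DAY FROM TO_DATE(ABS_DATE, 'DD-MM-YYYY')) AS DAY_X,
-- UPPER merges 'sick' and 'SICK' into one row, as in the expected output
UPPER(ABS_TYPE_NAME) AS ABS_TYPE_NAME,
ABS_DAYS
FROM abs_details
-- Add any additional filters (year, quarter, month) here
)
PIVOT(SUM(ABS_DAYS)
FOR DAY_X IN (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31))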
Db fiddle - https://dbfiddle.uk/?rdbms=oracle_21&fiddle=ad3af639235f7a6db415ec714a3ee0d9
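For engines without a PIVOT clause, the same 1-31 layout can be had with conditional aggregation: one SUM(CASE ...) column per day. Below is a minimal sketch that runs the idea against the question's sample rows in an in-memory SQLite database from Python. The function name pivot_by_day, the d1..d31 column names and the choice of SQLite are my assumptions, not part of the original answer, and the absence type is upper-cased so that sick and SICK land on one row, as the expected output in the question implies.

```python
import sqlite3

# Sample rows from the question: (person, date as DD-MM-YYYY, type, days).
ROWS = [
    (1010, "01-01-2022", "PTO", 1),
    (1010, "06-01-2022", "PTO", 0.52),
    (1010, "02-02-2022", "VACATION", 1),
    (1010, "03-02-2022", "VACATION", 0.2),
    (1010, "01-12-2021", "PTO", 1),
    (1010, "02-12-2021", "sick", 1),
    (1010, "30-12-2021", "sick", 1),
    (1010, "30-01-2022", "SICK", 1),
]


def pivot_by_day(rows):
    """Return {(person, type): {day_of_month: total_days}} from one GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE abs_details "
        "(person_number INT, abs_date TEXT, abs_type_name TEXT, abs_days REAL)"
    )
    conn.executemany("INSERT INTO abs_details VALUES (?, ?, ?, ?)", rows)
    # One SUM(CASE ...) column per day 1..31 stands in for PIVOT.
    # The day is the first two characters of the DD-MM-YYYY string.
    day_cols = ",\n".join(
        f"SUM(CASE WHEN CAST(substr(abs_date, 1, 2) AS INTEGER) = {d} "
        f"THEN abs_days END) AS d{d}"
        for d in range(1, 32)
    )
    sql = f"""
        SELECT person_number, UPPER(abs_type_name) AS abs_type, {day_cols}
        FROM abs_details
        -- a year / quarter / month filter would go in a WHERE clause here
        GROUP BY person_number, UPPER(abs_type_name)
    """
    result = {}
    for row in conn.execute(sql):
        # Keep only the days that actually have a value; empty days stay NULL.
        days = {d: v for d, v in enumerate(row[2:], start=1) if v is not None}
        result[(row[0], row[1])] = days
    conn.close()
    return result


pivoted = pivot_by_day(ROWS)
```

Days with no absence come back as NULL, which matches the requirement that the column exists but stays empty.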

Related

count number of records by month over the last five years where record date > select month

I need to show the number of valid inspectors we have by month over the last five years. Inspectors are considered valid when the expiration date on their certification, recorded as the month-end date, has not yet passed. The SQL below is the query that counts valid inspectors for January 2017:
SELECT Count(*) AS RecordCount
FROM dbo_Insp_Type
WHERE dbo_Insp_Type.CERT_EXP_DTE >= #2/1/2017#;
Rather than designing 60 queries, one for each month, and compiling the results in a final table (or, err, query) are there other methods I can use that call for less manual input?
From this sample:
Id CERT_EXP_DTE
1 2022-01-15
2 2022-01-23
3 2022-02-01
4 2022-02-03
5 2022-05-01
6 2022-06-06
7 2022-06-07
8 2022-07-21
9 2022-02-20
10 2021-11-05
11 2021-12-01
12 2021-12-24
this single query:
SELECT
Format([CERT_EXP_DTE],"yyyy-mm") AS YearMonth,
Count(*) AS AllInspectors,
Sum(Abs([CERT_EXP_DTE] >= DateSerial(Year([CERT_EXP_DTE]), Month([CERT_EXP_DTE]), 2))) AS ValidInspectors
FROM
dbo_Insp_Type
GROUP BY
Format([CERT_EXP_DTE],"yyyy-mm");
will return:
YearMonth AllInspectors ValidInspectors
2021-11 1 1
2021-12 2 1
2022-01 2 2
2022-02 3 2
2022-05 1 0
2022-06 2 2
2022-07 1 1
ID Cert_Iss_Dte Cert_Exp_Dte
1 1/15/2020 1/15/2022
2 1/23/2020 1/23/2022
3 2/1/2020 2/1/2022
4 2/3/2020 2/3/2022
5 5/1/2020 5/1/2022
6 6/6/2020 6/6/2022
7 6/7/2020 6/7/2022
8 7/21/2020 7/21/2022
9 2/20/2020 2/20/2022
10 11/5/2021 11/5/2023
11 12/1/2021 12/1/2023
12 12/24/2021 12/24/2023
A UNION query could calculate a record for each of 50 months but since you want 60, UNION is out.
Or a query with 60 calculated fields using IIf() and Count() referencing a textbox on form for start date:
SELECT Count(IIf(CERT_EXP_DTE>=Forms!formname!tbxDate,1,Null)) AS Dt1,
Count(IIf(CERT_EXP_DTE>=DateAdd("m",1,Forms!formname!tbxDate),1,Null)) AS Dt2,
...
FROM dbo_Insp_Type
Using the above data, following is output for Feb and Mar 2022. I did a test with Cert_Iss_Dte included in criteria and it did not make a difference for this sample data.
Dt1 Dt2
10 8
Or a report with 60 textboxes and each calls a DCount() expression with criteria same as used in query.
Or a VBA procedure that writes data to a 'temp' table.
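Another way to avoid 60 hand-written queries or fields is to generate the list of months and count against it in one statement. Access SQL has no recursive CTEs, so there the month list would have to come from a small helper table; the sketch below only illustrates the idea in SQLite, run from Python, using the expiration dates from the first sample table above. The table name insp_type, the function valid_by_month and the rule "valid if the expiration date is on or after the first of the month" are my assumptions, so adjust the comparison to your own definition of valid.

```python
import sqlite3

# Expiration dates from the first sample table above, in ISO format.
EXPIRY_DATES = [
    "2022-01-15", "2022-01-23", "2022-02-01", "2022-02-03",
    "2022-05-01", "2022-06-06", "2022-06-07", "2022-07-21",
    "2022-02-20", "2021-11-05", "2021-12-01", "2021-12-24",
]


def valid_by_month(expiry_dates, first_month, last_month):
    """Return [(month_start, valid_count)] for every month in the range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE insp_type (cert_exp_dte TEXT)")
    conn.executemany(
        "INSERT INTO insp_type VALUES (?)", [(d,) for d in expiry_dates]
    )
    sql = """
        WITH RECURSIVE months(month_start) AS (
            SELECT :first                          -- first month to report
            UNION ALL
            SELECT date(month_start, '+1 month')   -- step one month at a time
            FROM months
            WHERE month_start < :last
        )
        SELECT month_start,
               (SELECT COUNT(*)
                FROM insp_type
                WHERE cert_exp_dte >= month_start) AS valid_inspectors
        FROM months
        ORDER BY month_start
    """
    rows = conn.execute(sql, {"first": first_month, "last": last_month}).fetchall()
    conn.close()
    return rows


counts = valid_by_month(EXPIRY_DATES, "2021-11-01", "2022-07-01")
```

Widening the range to five years only means changing the two date arguments; the query itself stays the same.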

How do you get the last entry for each month in SQL?

I am looking to filter very large tables down to the latest entry per user per month, and I'm not sure I have found the best way to do this. I know I "should" trust the SQL engine (Snowflake), but part of me does not like the join on three columns.
Note that this is a very common operation on many big tables, and I want to use it in dbt views, which means it will be run all the time.
To illustrate, my data is of this form:
mytable
userId loginDate year month value
1 2021-01-04 2021 1 41.1
1 2021-01-06 2021 1 411.1
1 2021-01-25 2021 1 251.1
2 2021-01-05 2021 1 4369
2 2021-02-06 2021 2 32
2 2021-02-14 2021 2 731
3 2021-01-20 2021 1 258
3 2021-02-19 2021 2 4251
3 2021-03-15 2021 3 171
And I'm trying to use SQL to get the last value (by loginDate) for each month.
I'm currently doing a groupby & a join as follows:
WITH latest_entry_by_month AS (
SELECT "userId", "year", "month", max("loginDate") AS "loginDate"
FROM mytable
GROUP BY "userId", "year", "month"
)
SELECT * FROM mytable NATURAL JOIN latest_entry_by_month
The above results in my desired output:
userId loginDate year month value
1 2021-01-25 2021 1 251.1
2 2021-01-05 2021 1 4369
2 2021-02-14 2021 2 731
3 2021-01-20 2021 1 258
3 2021-02-19 2021 2 4251
3 2021-03-15 2021 3 171
But I'm not sure if it's optimal.
Any guidance on how to do this faster? Note that I am not materializing the underlying data, so it is effectively un-clustered (I'm getting it from a vendor via the Snowflake marketplace).
Using QUALIFY and windowed function(ROW_NUMBER):
SELECT *
FROM mytable
QUALIFY ROW_NUMBER() OVER(PARTITION BY userId, year, month
ORDER BY loginDate DESC) = 1
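To check the ROW_NUMBER approach against the sample data without a Snowflake account, here is a small sketch in SQLite, run from Python. SQLite has no QUALIFY, so the window function sits in a subquery and the filter moves to the outer WHERE, which is the usual rewrite for engines without QUALIFY. Window functions need SQLite 3.25 or newer; the function name latest_per_user_month is made up for this example.

```python
import sqlite3

# Sample rows from the question: (userId, loginDate, year, month, value).
ROWS = [
    (1, "2021-01-04", 2021, 1, 41.1),
    (1, "2021-01-06", 2021, 1, 411.1),
    (1, "2021-01-25", 2021, 1, 251.1),
    (2, "2021-01-05", 2021, 1, 4369),
    (2, "2021-02-06", 2021, 2, 32),
    (2, "2021-02-14", 2021, 2, 731),
    (3, "2021-01-20", 2021, 1, 258),
    (3, "2021-02-19", 2021, 2, 4251),
    (3, "2021-03-15", 2021, 3, 171),
]


def latest_per_user_month(rows):
    """Return the row with the greatest loginDate per (userId, year, month)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable "
        "(userId INT, loginDate TEXT, year INT, month INT, value REAL)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    sql = """
        SELECT userId, loginDate, year, month, value
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY userId, year, month
                       ORDER BY loginDate DESC
                   ) AS rn
            FROM mytable
        )
        WHERE rn = 1            -- equivalent of QUALIFY ... = 1
        ORDER BY userId, loginDate
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


latest = latest_per_user_month(ROWS)
```

The table is scanned once and no self-join is needed, which is the main difference from the GROUP BY plus join version in the question.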

Calculate Churn by aggregating by date range in SQL

I am trying to calculate the churn rate from data that has customer_id, group, and date columns. The aggregation is by id, group and date. The churn formula is (customers in previous cohort - customers in last cohort) / customers in previous cohort, where:
customers in previous cohort refers to customers seen in the 28 days before the last 28 days
customers in last cohort refers to customers seen in the last 28 days
I am not sure how to aggregate them by date range to calculate the churn.
Here is sample data that I copied from SQL Group by Date Range:
Date Group Customer_id
2014-03-01 A 1
2014-04-02 A 2
2014-04-03 A 3
2014-05-04 A 3
2014-05-05 A 6
2015-08-06 A 1
2015-08-07 A 2
2014-08-29 XXXX 2
2014-08-09 XXXX 3
2014-08-10 BB 4
2014-08-11 CCC 3
2015-08-12 CCC 2
2015-03-13 CCC 3
2014-04-14 CCC 5
2014-04-19 CCC 4
2014-08-16 CCC 5
2014-08-17 CCC 3
2014-08-18 XXXX 2
2015-01-10 XXXX 3
2015-01-20 XXXX 4
2014-08-21 XXXX 5
2014-08-22 XXXX 2
2014-01-23 XXXX 3
2014-08-24 XXXX 2
2014-02-25 XXXX 3
2014-08-26 XXXX 2
2014-06-27 XXXX 4
2014-08-28 XXXX 1
2014-08-29 XXXX 1
2015-08-30 XXXX 2
2015-09-31 XXXX 3
The goal is to calculate the churn rate every 28 days in between 2014 and 2015 by the formula given above. So, it is going to be aggregating the data by rolling it by 28 days and calculating the churn by the formula.
Here is what I tried to aggregate the data by date range:
SELECT COUNT(distinct customer_id) AS count_ids, Group,
DATE_SUB(CAST(Date AS DATE), INTERVAL 56 DAY) AS Date_min,
DATE_SUB(CURRENT_DATE, INTERVAL 28 DAY) AS Date_max
FROM churn_agg
GROUP BY count_ids, Group, Date_min, Date_max
I hope someone can help me with the aggregation and the churn calculation. I simply want to subtract each aggregated count_ids from the next aggregated count_ids, which is 28 days later, so this is a successive difference over the same column (count_ids). I am not sure whether I need a rolling window or a simple aggregation to find the churn.
As corrected by @jarlh, it's not 2015-09-31 but 2015-09-30.
You can use this to create a 28-day calendar:
create table daysby28 (i int, _Date date);
insert into daysby28 (i, _Date)
SELECT i, cast('01-01-2014' as date) + i*INTERVAL '28 day'
from generate_series(0,50) i
order by 1;
After creating the churn_agg table with the script @jarlh posted in the fiddle, this query gives what you want:
with cte as
(
select count(Customer) as TotalCustomer, Cohort, CohortDateStart From
(
select distinct a.Customer_id as Customer, b.i as Cohort, b._Date as CohortDateStart
from churn_agg a left join daysby28 b on a._Date >= b._Date and a._Date < b._Date + INTERVAL '28 day'
) a
group by Cohort, CohortDateStart
)
select a.CohortDateStart,
1.0*(b.TotalCustomer - a.TotalCustomer)/(1.0*b.TotalCustomer) as Churn from cte a
left join cte b on a.cohort > b.cohort
and not exists(select 1 from cte c where c.cohort > b.cohort and c.cohort < a.cohort)
order by 1
The fiddle of all together is here
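If you would rather not maintain a calendar table, the 28-day bucket can be computed directly from the date, and the previous populated cohort can be fetched with LAG instead of a self-join. Here is a minimal sketch of that variant in SQLite, run from Python. The rows are a small made-up sample (not the data from the question), the column name visit_date and the function churn_by_cohort are hypothetical, and LAG needs SQLite 3.25 or newer.

```python
import sqlite3

# Made-up sample: (visit_date, customer_id). Cohort 0 starts on 2014-01-01.
VISITS = [
    ("2014-01-05", 1), ("2014-01-10", 2), ("2014-01-20", 3),   # cohort 0
    ("2014-02-01", 1), ("2014-02-10", 2),                      # cohort 1
    ("2014-03-01", 1), ("2014-03-02", 4),
    ("2014-03-03", 5), ("2014-03-04", 6),                      # cohort 2
]


def churn_by_cohort(visits, start_date):
    """Return [(cohort_index, churn)]; churn is None for the first cohort."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE churn_agg (visit_date TEXT, customer_id INT)")
    conn.executemany("INSERT INTO churn_agg VALUES (?, ?)", visits)
    sql = """
        WITH cohorts AS (
            -- bucket index = whole 28-day periods since the start date
            SELECT CAST((julianday(visit_date) - julianday(:start)) / 28
                        AS INTEGER) AS cohort,
                   COUNT(DISTINCT customer_id) AS customers
            FROM churn_agg
            GROUP BY 1
        ),
        with_prev AS (
            SELECT cohort,
                   customers,
                   LAG(customers) OVER (ORDER BY cohort) AS prev_customers
            FROM cohorts
        )
        SELECT cohort,
               1.0 * (prev_customers - customers) / prev_customers AS churn
        FROM with_prev
        ORDER BY cohort
    """
    result = conn.execute(sql, {"start": start_date}).fetchall()
    conn.close()
    return result


churn = churn_by_cohort(VISITS, "2014-01-01")
```

With this sample the cohorts hold 3, 2 and 4 distinct customers, so the second cohort shows a churn of one third and the third a negative churn (growth). Like the query above, LAG compares against the previous populated cohort, so empty 28-day periods are skipped rather than counted as total churn.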

Max date among records and across tables - SQL Server

I tried my best to present this in table format, but it did not render well on Stack Overflow, so I am attaching a snapshot of the 2 tables. Apologies for the formatting.
SQL Server 2012
MS Table
mId tdId name dueDate
1 1 forecastedDate 1/1/2015
2 1 hypercareDate 11/30/2016
3 1 LOE 1 7/4/2016
4 1 LOE 2 7/4/2016
5 1 demo for yy test 10/15/2016
6 1 Implementation – testing 7/4/2016
7 1 Phased Rollout – final 7/4/2016
8 2 forecastedDate 1/7/2016
9 2 hypercareDate 11/12/2016
10 2 domain - Forte NULL
11 2 Fortis completion 1/1/2016
12 2 Certification NULL
13 2 Implementation 7/4/2016
-----------------------------------------------
MSRevised
mId revisedDate
1 1/5/2015
1 1/8/2015
3 3/25/2017
2 2/1/2016
2 12/30/2016
3 4/28/2016
4 4/28/2016
5 10/1/2016
6 7/28/2016
7 7/28/2016
8 4/28/2016
9 8/4/2016
9 5/28/2016
11 10/4/2016
11 10/5/2016
13 11/1/2016
----------------------------------------
The required output is
1. Will be passing the 'tId' number, for instance 1, lets call it tid (1)
2. Want to compare tId (1)'s all milestones (except hypercareDate) with tid(1)'s forecastedDate milestone
3. return if any of the milestone date (other than hypercareDate) is greater than the forecastedDate
The above 3 steps are simple, but I have to first compare the milestones date with its corresponding revised dates, if any, from the revised table, and pick the max date among all that needs to be compared with the forecastedDate
I managed to solve this. Posting the answer in the hope that it helps somebody.
-- Insert the result into a temp table
INSERT INTO #mstab
SELECT [mId]
, [tId]
, [msDate]
FROM [dbo].[MS]
WHERE ([msName] NOT LIKE 'forecastedDate' AND [msName] NOT LIKE 'hypercareDate')
-- This scalar function gets the max date between the forecasted due date and the forecasted revised dates
SELECT @maxForecastedDate = [dbo].[fnGetMaxDate] ( 'forecastedDate');
-- This gets the max date from the temp table, to be compared with the forecasted date
SET @maxmilestoneDate = (SELECT MAX(maxDate)
FROM ( SELECT ms.msDueDate AS dueDate
, mr.msRevisedDate AS revDate
FROM #mstab as ms
LEFT JOIN [MSRev] as mr on ms.msId = mr.msId
) maxDate
UNPIVOT (maxDate FOR DateCols IN (dueDate, revDate))up );
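For anyone who wants to try the logic outside SQL Server, here is a rough sketch of the same comparison in SQLite, run from Python. It is not the code above: instead of a temp table, a scalar function and UNPIVOT, it stacks due dates and revised dates with UNION ALL and takes MAX over the stack. The table and column names (ms, msrev, due_date, revised_date), the function latest_dates and the ISO date strings are my assumptions; the rows are the tId 1 milestones from the question.

```python
import sqlite3

# tId 1 milestones from the question, dates converted to ISO format.
MS = [
    (1, 1, "forecastedDate", "2015-01-01"),
    (2, 1, "hypercareDate", "2016-11-30"),
    (3, 1, "LOE 1", "2016-07-04"),
    (4, 1, "LOE 2", "2016-07-04"),
    (5, 1, "demo for yy test", "2016-10-15"),
    (6, 1, "Implementation - testing", "2016-07-04"),
    (7, 1, "Phased Rollout - final", "2016-07-04"),
]
MS_REVISED = [
    (1, "2015-01-05"), (1, "2015-01-08"), (3, "2017-03-25"),
    (2, "2016-02-01"), (2, "2016-12-30"), (3, "2016-04-28"),
    (4, "2016-04-28"), (5, "2016-10-01"), (6, "2016-07-28"),
    (7, "2016-07-28"),
]


def latest_dates(tid):
    """Return (latest forecast date, latest other-milestone date) for one tId."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ms (m_id INT, t_id INT, name TEXT, due_date TEXT)")
    conn.execute("CREATE TABLE msrev (m_id INT, revised_date TEXT)")
    conn.executemany("INSERT INTO ms VALUES (?, ?, ?, ?)", MS)
    conn.executemany("INSERT INTO msrev VALUES (?, ?)", MS_REVISED)
    sql = """
        WITH all_dates AS (
            -- due dates and revised dates stacked into one column
            SELECT m.m_id, m.name, m.due_date AS d
            FROM ms m
            WHERE m.t_id = :tid AND m.due_date IS NOT NULL
            UNION ALL
            SELECT m.m_id, m.name, r.revised_date
            FROM ms m
            JOIN msrev r ON r.m_id = m.m_id
            WHERE m.t_id = :tid
        )
        SELECT
            (SELECT MAX(d) FROM all_dates
             WHERE name = 'forecastedDate') AS forecast,
            (SELECT MAX(d) FROM all_dates
             WHERE name NOT IN ('forecastedDate', 'hypercareDate')) AS latest
    """
    row = conn.execute(sql, {"tid": tid}).fetchone()
    conn.close()
    return row


forecast, latest_milestone = latest_dates(1)
is_late = latest_milestone > forecast  # ISO strings compare chronologically
```

Comparing the two values answers the original question: if the latest milestone date is later than the latest forecast date, at least one milestone slips past the forecast.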

Subtract nonconsecutive values in same row in t-SQL

I have a data table that has annual data points and quarterly data points. I want to subtract the quarterly data points from the corresponding prior annual entry, e.g. Annual 2014 - Q3 2014, using t-SQL. I have an id variable for each entry, plus a reconcile id variable that shows which quarterly entry corresponds to which annual entry. See below:
CurrentDate PreviousDate Value Entry Id Reconcile Id Annual/Quarterly
9/30/2012 9/30/2011 112 2 3 Annual
9/30/2013 9/30/2012 123 1 2 Annual
9/30/2014 9/30/2013 123.5 9 1 Annual
12/31/2013 9/30/2014 124 4 1 Quarterly
3/31/2014 12/31/2013 124.5 5 1 Quarterly
6/30/2014 3/31/2014 125 6 1 Quarterly
9/30/2014 6/30/2014 125.5 7 1 Quarterly
12/31/2014 9/30/2014 126 10 9 Quarterly
3/31/2015 12/31/2014 126.5 11 9 Quarterly
6/30/2015 3/31/2015 127 12 9 Quarterly
For example, Reconcile ID 9 for the quarterly entries corresponds to Entry ID 9, which is an annual entry.
I have code to just subtract the prior entry from the current entry, but I cannot figure out how to subtract quarterly entries from annual entries where the Entry ID and Reconcile ID are the same.
Here is the code I am using, which is resulting in the right calculation, but increasing the number of results by many rows. I have also tried this as an inner join. I only want the original 10 rows, plus a new difference column:
SELECT DISTINCT T1.[EntryID]
, [T1].[RECONCILEID]
, [T1].[CurrentDate]
, [T1].[Annual_Quarterly]
, [T1].[Value]
, [T1].[Value]-T2.[Value] AS Difference
FROM Table T1
LEFT JOIN Table T2 ON T2.EntryID = T1.RECONCILEID;
Your code should be fine; here are the results I'm getting:
EntryId Annual_Quarterly CurrentDate ReconcileId Value recVal diff
2 Annual 9/30/2012 3 112
1 Annual 9/30/2013 2 123 112 11
9 Annual 9/30/2014 1 123.5 123 0.5
4 Quarterly 12/31/2013 1 124 123 1
5 Quarterly 3/31/2014 1 124.5 123 1.5
6 Quarterly 6/30/2014 1 125 123 2
7 Quarterly 9/30/2014 1 125.5 123 2.5
10 Quarterly 12/31/2014 9 126 123.5 2.5
11 Quarterly 3/31/2015 9 126.5 123.5 3
12 Quarterly 6/30/2015 9 127 123.5 3.5
with your data and this SQL:
SELECT
tr.EntryId,
tr.Annual_Quarterly,
tr.CurrentDate,
tr.ReconcileId,
tr.Value,
te.Value AS recVal,
tr.[VALUE]-te.[VALUE] AS diff
FROM
t AS tr LEFT JOIN
t AS te ON
tr.ReconcileId = te.EntryId
ORDER BY
tr.Annual_Quarterly,
tr.CurrentDate;
Your question is a bit vague as far as how you're wanting to subtract these values, but this should give you some idea.
Select T1.*, T1.Value - Coalesce(T2.Value, 0) As Difference
From Table T1
Left Join Table T2 On T2.[Entry Id] = T1.[Reconcile Id]
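A quick way to convince yourself that the LEFT JOIN itself does not multiply rows is to run it on the sample data. The sketch below does that in SQLite from Python; the table name entries, the snake_case column names and the function differences are mine. With Entry Id unique, the result has exactly the original 10 rows, so extra rows in the real table would point to duplicate Entry Id values there (an assumption on my part, since the real data is not shown).

```python
import sqlite3

# Sample rows from the question: (entry_id, reconcile_id, value, kind).
ROWS = [
    (2, 3, 112, "Annual"),
    (1, 2, 123, "Annual"),
    (9, 1, 123.5, "Annual"),
    (4, 1, 124, "Quarterly"),
    (5, 1, 124.5, "Quarterly"),
    (6, 1, 125, "Quarterly"),
    (7, 1, 125.5, "Quarterly"),
    (10, 9, 126, "Quarterly"),
    (11, 9, 126.5, "Quarterly"),
    (12, 9, 127, "Quarterly"),
]


def differences(rows):
    """Return [(entry_id, value, difference)] using a self LEFT JOIN."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries "
        "(entry_id INT, reconcile_id INT, value REAL, kind TEXT)"
    )
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT t1.entry_id,
               t1.value,
               t1.value - t2.value AS difference   -- NULL when nothing matches
        FROM entries t1
        LEFT JOIN entries t2 ON t2.entry_id = t1.reconcile_id
        ORDER BY t1.entry_id
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


result = differences(ROWS)
diff_by_entry = {entry_id: diff for entry_id, _, diff in result}
```

Entry 2 points at Reconcile Id 3, which does not exist in the sample, so its difference stays NULL unless you wrap the joined value in Coalesce as in the last query.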