Converting Row Into a Column based on Month - sql

I have following input:
JAN_OLD FEB_OLD MAR_OLD APR_OLD MAY_OLD JUNE_OLD JAN_NEW FEB_NEW MAR_NEW APR_NEW MAY_NEW JUNE_NEW
10 11 12 13 14 15 20 21 22 23 24 25
Disired result set is as below:
JAN New OLD
FEB 20 10
MAR 21 11
APR 22 12
MAY 23 13
JUN 24 14
Can someone suggest how to achieve this?

Multiple Union All or single Cross Apply
SELECT months,old,new
FROM Your_table
CROSS apply (VALUES(jan_old,jan_new,'Jan'),
(FEB_OLD,FEB_new,'Feb'),
(MAR_OLD,MAR_new,'Mar'),
(APR_OLD,APR_new,'Apr'),
(MAY_OLD,MAY_new,'may'),
(JUNE_OLD,JUNE_new,'Jun'))
cs (old, new, months)
If you are not sure about the no. of columns then you may have to use Dynamic sql

Related

Select row with most recent date per location and increment recent date by 1 for each row by location using MariaDB

I have a table of location which has 'Date column'. I have to find recent date by each group of locationID for e.g. locationID 1 has most recent date '31 May 2022'. After finding recent date from the group of locationID I have to add 14 days in that recent date and store it in NewDate column. and add + 1 in that new date for other row for that group of locationID.
My table is:
id locationID Date NewDate
1 1 31 May 2022
2 1 16 May 2022
3 1 28 Apr 2021
4 2 29 Mar 2022
5 2 22 Feb 2022
6 3 14 Jun 2022
7 3 27 Oct 2021
8 4 01 Feb 2022
9 4 04 May 2022
10 4 14 Jun 2021
11 5 01 Jun 2022
12 5 29 May 2022
13 5 20 Sep 2022
14 5 11 Aug 2022
15 5 03 Aug 2022
Answer should be as below:
For e.g. for locationID = 1
id locationID Date NewDate
1 1 31 May 2022 14 Jun 2022 // Recent Date + 14 Days - 31 May + 14 Days
2 1 16 May 2022 15 Jun 2022 // Recent Date + 15 Days - 31 May + 15 Days
3 1 28 Apr 2021 16 Jun 2022 // Recent Date + 16 Days - 31 May + 16 Days
I have come across few similar post and found recent date like this:
SELECT L.*
FROM Locations L
INNER JOIN
(SELECT locationID, MAX(Date) AS MAXdate
FROM Locations
GROUP BY locationID) groupedL
ON L.locationID = groupedL.locationID
AND L.Date = groupedL.MAXdate
using above code I am able to find recent date per location but how do I add and increment required days and store it to NewDate column ? I am new to MariaDB, please suggest similar post link, any reference documents or blogs. Should I make some function to perform this logic and call the function to store required dates in NewDate column? I am not sure please suggest. Thank you.
RESULT SHOULD LOOK LIKE BELOW:
id locationID Date NewDate
1 1 31 May 2022 14 Jun 2022 // Recent Date for locationid 1 + 14 Days - 31 May + 14 Days
2 1 16 May 2022 15 Jun 2022 // Recent Date for locationid 1 + 15 Days - 31 May + 15 Days
3 1 28 Apr 2021 16 Jun 2022 // Recent Date for locationid 1 + 16 Days - 31 May + 16 Days
4 2 29 Mar 2022 12 APR 2022 // Recent Date for locationid 2 + 14 Days
5 2 22 Feb 2022 13 APR 2022 // Recent Date for locationid 2 + 15 Days
6 3 14 Jun 2022 28 JUN 2022 // Recent Date for locationid 3 + 14 Days
7 3 27 Oct 2021 29 JUN 2022 // Recent Date for locationid 3 + 15 Days
8 4 01 Feb 2022 18 MAY 2022 // Recent Date for locationid 4 + 14 Days
9 4 04 May 2022 19 MAY 2022 // Recent Date for locationid 4 + 15 Days
10 4 14 Jun 2021 20 MAY 2022 // Recent Date for locationid 4 + 16 Days
11 5 01 Jun 2022 04 OCT 2022 // Recent Date for locationid 5 + 14 Days
12 5 29 May 2022 05 OCT 2022 // Recent Date for locationid 5 + 15 Days
13 5 20 Sep 2022 06 OCT 2022 // Recent Date for locationid 5 + 16 Days
14 5 11 Aug 2022 07 OCT 2022 // Recent Date for locationid 5 + 17 Days
15 5 03 Aug 2022 08 OCT 2022 // Recent Date for locationid 5 + 18 Days
You can use a cte:
with cte as (
select l1.*, l2.m, (select sum(l4.id < l1.id and l4.locationid = l1.locationid) from locations l4) inc from locations l1
join (select l3.locationid, max(l3.dt) m from locations l3 group by l3.locationid) l2 on l1.locationid = l2.locationid
)
select c.id, c.locationid, c.dt, c.m + interval 14 + c.inc day from cte c
You could use analytic window functions and update the original table by joining to a sub-query (works for MariaDB):
update t
join (
select Id,
Date_Add(First_Value(date) over(partition by locationId order by date desc),
interval (13 + row_number() over(partition by locationId order by date desc)) day
) NewDate
from t
)nd on t.id = nd.id
set t.Newdate = nd.NewDate;
See DB<>Fiddle example

Calculation of values that rely on a date variable

I am trying to calculate the value of the last measurement taken (according to the date column) divided by the lowest value recorded (according to the measurement column) if two values in the “SUBJECT” column match and two values in the “PROCEDURE” column match. The the calculation would be produced in a new column. I am having trouble with this and I would appreciate a solution to this matter.
data Have;
input Subject Type :$12. Date &:anydtdte. Procedure :$12. Measurement;
format date yymmdd10.;
datalines;
500 Initial 15 AUG 2017 Invasive 20
500 Initial 15 AUG 2017 Surface 35
500 Followup 15 AUG 2018 Invasive 54
428 Followup 15 AUG 2018 Outer 29
765 Seventh 3 AUG 2018 Other 13
500 Followup 3 JUL 2018 Surface 98
428 Initial 3 JUL 2017 Outer 10
765 Initial 20 JUL 2019 Other 19
610 Third 20 AUG 2019 Invasive 66
610 Initial 17 Mar 2018 Invasive 17
;
*Intended output table
Subject Type Date Procedure Measurement Output
500 Initial 15 AUG 2017 Invasive 20 20/20
500 Initial 15 AUG 2017 Surface 35 35/35
500 Followup 15 AUG 2018 Invasive 54 54/20
428 Followup 15 AUG 2018 Outer 29 29/10
765 Seventh 3 AUG 2018 Other 13 13/19
500 Followup 3 JUL 2018 surface 98 98/35
428 Initial 3 JUL 2017 Outer 10 10/10
765 Initial 20 JUL 2019 Other 19 19/19
610 Third 20 AUG 2019 Invasive 66 66/17
610 Initial 17 Mar 2018 Invasive 17 17/17 ;
*Attempt;
PROC SQL;
create table want as
select a.*,
(select measurement as measurement_last_date
from have
where subject = a.subject and type = a.type
having date = max(date)) / min(a.measurement) as ratio
from have as a
group by subject, type
order by subject, type, date;
QUIT;
I think that you need use statement retain with data step.
the statement will retain your last row and you can 'll compare the last row with actual row processed.
link of some tutorial of how use statement retain.
enter link description here
SAS documentation
enter link description here

yearly average from monthly daterange data

I have the following table in postgresql;
Value period
1 [2017-01-01,2017-02-01)
2 [2017-02-01,2017-03-01)
3 [2017-03-01,2017-04-01)
4 [2017-04-01,2017-05-01)
5 [2017-05-01,2017-06-01)
6 [2017-06-01,2017-07-01)
7 [2017-07-01,2017-08-01)
8 [2017-08-01,2017-09-01)
9 [2017-09-01,2017-10-01)
10 [2017-10-01,2017-11-01)
11 [2017-11-01,2017-12-01)
12 [2017-12-01,2018-01-01)
13 [2018-01-01,2018-02-01)
14 [2018-02-01,2018-03-01)
15 [2018-03-01,2018-04-01)
16 [2018-04-01,2018-05-01)
17 [2018-05-01,2018-06-01)
18 [2018-06-01,2018-07-01)
19 [2018-07-01,2018-08-01)
20 [2018-08-01,2018-09-01)
21 [2018-09-01,2018-10-01)
22 [2018-10-01,2018-11-01)
23 [2018-11-01,2018-12-01)
24 [2018-12-01,2019-01-01)
25 [2019-01-01,2019-02-01)
26 [2019-02-01,2019-03-01)
27 [2019-03-01,2019-04-01)
28 [2019-04-01,2019-05-01)
29 [2019-05-01,2019-06-01)
30 [2019-06-01,2019-07-01)
31 [2019-07-01,2019-08-01)
32 [2019-08-01,2019-09-01)
33 [2019-09-01,2019-10-01)
34 [2019-10-01,2019-11-01)
35 [2019-11-01,2019-12-01)
36 [2019-12-01,2020-01-01)
37 [2020-01-01,2020-02-01)
38 [2020-02-01,2020-03-01)
39 [2020-03-01,2020-04-01)
40 [2020-04-01,2020-05-01)
41 [2020-05-01,2020-06-01)
42 [2020-06-01,2020-07-01)
How can I get yearly average from monthly data in postgresql?
Note: Column Value is type integer and column period is type daterange.
The expected result should be
6.5 2017
18.5 2018
30.5 2019
39.5 2020
If your periods are always taking one month, including the lower bound and excluding the upper, you could try this
select
avg(value * 1.0) as average,
extract(year from lower(period)) as year
from table
group by year

Adding set lists of future dates to rows in a SQL query

So I am doing a cohort analysis for customers, where a cohort is a group of people who started using the product in the same month. I then keep track of each cohort's total use for every subsequent month up till present time.
For example, the first "cohort month" is January 2012, then I have "use months" January 12, Feb 12, March 12, ..., March 17(current month). One column is "cohort month", and another is "use month". This process repeats for every subsequent cohort month. The table looks like:
Jan 12 | Jan 12
Jan 12 | Feb 12
...
Jan 12 | Mar 17
Feb 12 | Feb 12
Feb 12 | Mar 12
...
Feb 12 | Mar 17
...
Feb 17 | Feb 17
Feb 17 | Mar 17
Mar 17 | Mar 17
The problem arises because I want to do forecasting for one year out for both existing and future cohorts.
That means for the Jan 12 cohort, I want to do prediction for April 17 to Mar 18.
I also want to do predictions for the April 17 cohort (which doesn't exist yet) from April 17 to Mar 18. And so on till predictions for the Mar 18 cohort in Mar 18.
I can handle the predictions, don't worry about that.
My issue is that I cannot figure out how to add in this list of (April 17 .. Mar 17) in the "use month" column before every cohort switches.
I also need to add in cohorts April 17 to Mar 18, and have the applicable parts of this list of (April 17 ... Mar 17) for each of these future cohorts.
So I want the table to look like:
Jan 12 | Jan 12
Jan 12 | Feb 12
...
Jan 12 | Mar 17
Jan 12 | Apr 17
..
Jan 12 | Mar 18
Feb 12 | Feb 12
Feb 12 | Mar 12
...
Feb 12 | Mar 17
Feb 12 | Apr 17
...
Feb 12 | Mar 18
...
...
Feb 17 | Feb 17
Feb 17 | Mar 17
...
Feb 17 | Mar 18
Mar 17 | Mar 17
...
Mar 17 | Mar 18
I know the first solution to come to mind is to do a create a list of all dates Jan 12 to Mar 18, cross join it to itself, and then left outer join to the current table I have (where cohort / use months range from Jan 12 to Mar 17). However, this is not scalable.
Is there a way I can just iteratively add in this list of the months of the next year?
I am using HP Vertica, could use Presto or Hive if absolutely necessary
I think you should use the query here below to create a temporary table out of nothing, and join it with the rest of your query. You can't do anything in a procedural manner in SQL, I'm afraid. You won't be able to get away without a CROSS JOIN. But here, you limit the CROSS JOIN to the generation of the first-of-month pairs that you need.
Here goes:
WITH
-- create a list of integers from 0 to 100 using the TIMESERIES clause
i(i) AS (
SELECT dt::DATE - '2000-01-01'::DATE
FROM (
SELECT '2000-01-01'::DATE + 0
UNION ALL SELECT '2000-01-01'::DATE + 100
) d(d)
TIMESERIES dt AS '1 day' OVER(ORDER BY d::TIMESTAMP)
)
,
-- limits are Jan-2012 to the first of the current month plus one year
month_limits(month_limit) AS (
SELECT '2012-01-01'::DATE
UNION ALL SELECT ADD_MONTHS(TRUNC(CURRENT_DATE,'MONTH'),12)
)
-- create the list of possible months as a CROSS JOIN of the i table
-- containing the integers and the month_limits table, using ADD_MONTHS()
-- and the smallest and greatest month of the month limits
,month_list AS (
SELECT
ADD_MONTHS(MIN(month_limit),i) AS month_first
FROM month_limits CROSS JOIN i
GROUP BY i
HAVING ADD_MONTHS(MIN(month_limit),i) <= (
SELECT MAX(month_limit) FROM month_limits
)
)
-- finally, CROSS JOIN the obtained month list with itself with the
-- filters needed.
SELECT
cohort.month_first AS cohort_month
, use.month_first AS use_month
FROM month_list AS cohort
CROSS JOIN month_list AS use
WHERE use.month_first >= cohort.month_first
ORDER BY 1,2
;

join two PIVOT table

I have this query
With
NoOfOrder as
(
SELECT Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,Jan,Feb,Mar
FROM
(
select LEFT(datename(month,InvoiceDate),3) mon,InvoiceNo as InvoiceNo
from tbl_InvoiceMain ,tbl_OrderMain,tbl_CompanyMaster
where tbl_InvoiceMain.OrderID = tbl_OrderMain.OrderID
and (CAST(tbl_InvoiceMain.InvoiceDate AS date) BETWEEN tbl_CompanyMaster.YearStart AND tbl_CompanyMaster.YearEnd)
) P
PIVOT (count(InvoiceNo)for mon in (Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec)) PV
),
OnTime as
(
SELECT Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,Jan,Feb,Mar
FROM
(
select LEFT(datename(month,InvoiceDate),3) mon,InvoiceNo as InvoiceNo
from tbl_InvoiceMain ,tbl_OrderMain,tbl_CompanyMaster
where tbl_InvoiceMain.OrderID = tbl_OrderMain.OrderID
and (CAST(tbl_InvoiceMain.InvoiceDate AS date) BETWEEN tbl_CompanyMaster.YearStart AND tbl_CompanyMaster.YearEnd)
and CAST(tbl_InvoiceMain.InvoiceDate AS date) <= CAST(tbl_OrderMain.ScheduledDispatchDate AS date)
) P
PIVOT (count(InvoiceNo)for mon in (Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec)) PV
)
Select * From NoOfOrder
union all
Select * From OnTime
It gives this result:
Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar
18 35 39 52 32 47 47 22 14 0 0 0
9 10 16 22 6 11 19 10 5 0 0 0
Here is my expected result
Apr May Jun Jul Aug Sep Oct Nov Dec Jan Feb Mar
NoOfOrder 18 35 39 52 32 47 47 22 14 0 0 0
OnTimeDelivered 9 10 16 22 6 11 19 10 5 0 0 0
DeliverPerformance% 50.00 28.57 41.03 42.31 18.75 23.40 40.43 45.45 35.71 0.00 0.00 0.00
The formula for DeliverPerformance is:
DeliverPerformance% = (OnTimeDelivered/NoOfOrder) X 100
How do I achieve this result on the next row?
for reference you check my question in good format
enter link description here
My immediate suggestion is to combine everything first prior to pivoting the results.
Your first query might look like this:
SELECT
LEFT(datename(month, InvoiceDate), 3) InvMon,
SUM(1) AS NoOfOrder,
SUM(CASE WHEN CAST(tbl_InvoiceMain.InvoiceDate AS date) <= CAST(tbl_OrderMain.ScheduledDispatchDate AS date) THEN 1 ELSE 0 END) OnTimeDelivered
FROM tbl_InvoiceMain, tbl_OrderMain, tbl_CompanyMaster
WHERE tbl_InvoiceMain.OrderID = tbl_OrderMain.OrderID
AND (CAST(tbl_InvoiceMain.InvoiceDate AS date) BETWEEN tbl_CompanyMaster.YearStart AND tbl_CompanyMaster.YearEnd)
GROUP BY LEFT(datename(month, InvoiceDate), 3)
Note that I'm always counting every invoice record and optionally counting the "on time" invoices with that CASE statement and the respective SUM functions.
My next thought is to put that query in a CTE and then the statement that uses that CTE will do the additional calculation like so:
SELECT InvMon, NoOfOrder, OnTimeDelivered, ((OnTimeDelivered / NoOfOrder) * 100) DeliverPerformance ...
And finally, that's what I pivot.