Based on CASE Pivot multiple rows into single row and multiple columns - sql

Suppose there is a table
NAME Unit Date Time Type LOAD
A1Cu 2 2020-01-02 10:30 CU 0.1
A1Ta 5 2020-01-02 10:30 TA 0.3
A1Ch 6 2020-01-02 10:30 CH 0.2
B1Ch 4 2020-02-15 11:40 CH 0.52
B1Ta 8 2020-02-15 11:40 TA 0.83
C1Ta 5 2020-06-18 21:00 TA 0.11
Z1Ch 8 2020-08-08 15:30 CH 0.24
D1Ta 8 2020-06-18 01:30 TA 0.3
C1Cu 6 2020-06-18 21:00 CU 0.2
Then, for rows with the same date and time, merge the multiple rows into a single row by applying the following logic:
For NAME and Unit (stop as soon as one of the conditions is met):
IF Type is CU THEN take NAME & Unit
ELSE IF Type is CH THEN take NAME & Unit
ELSE IF Type is TA THEN take NAME & Unit
For LOAD, put the value in the respective column.
The final result should look like this:
NAME Unit Date Time TypeCULoad TypeCHLoad TypeTALoad
A1Cu 2 2020-01-02 10:30 0.1 0.2 0.3
B1Ch 4 2020-02-15 11:40 NULL 0.52 0.83
C1Cu 6 2020-06-18 21:00 0.2 NULL 0.11
Z1Ch 8 2020-08-08 15:30 NULL 0.24 NULL
D1Ta 8 2020-06-18 01:30 NULL NULL 0.3
I have a partial solution, but I am finding it hard to get the Name and Load logic right:
SELECT Date, Time, [TypeCULoad], [TypeCHLoad], [TypeTALoad] FROM
(
SELECT
Date, Time, col, val FROM (
SELECT *, 'Type'+Type+'Load' as Col, Load as Val FROM TESTAIR
) t
) tt
PIVOT ( max(val) for Col in ([TypeCULoad], [TypeCHLoad], [TypeTALoad]) ) AS pvt
Result
Date Time TypeCULoad TypeCHLoad TypeTALoad
2020-01-02 10:30 0.1 0.2 0.3
2020-02-15 11:40 NULL 0.52 0.83
2020-06-18 21:00 0.2 NULL 0.11
2020-08-08 15:30 NULL 0.24 NULL
2020-06-18 01:30 NULL NULL 0.3
I need help with the Name and Load logic.

First try to get the desired results before applying PIVOT
with cte as
(
SELECT name, unit, date, time, 'Type' + Type + 'Load' as Col, Load as Val, Type, LEFT(name, 2) xxx,
RANK() OVER(PARTITION BY LEFT(name, 2) ORDER BY CASE type WHEN 'CU' THEN 1 WHEN 'CH' THEN 2 ELSE 3 END) dd
FROM TESTAIR
)
select name, unit, Date, Time, [TypeCULoad], [TypeCHLoad], [TypeTALoad] from
(
select b.name, b.unit, a.date, a.time, a.col, a.val
from cte a
join cte b on a.xxx = b.xxx and b.dd = 1
) f
PIVOT (max(val) for Col in ([TypeCULoad], [TypeCHLoad],[TypeTALoad]) ) AS pvt
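As a cross-check of the logic in the question, here is a minimal sketch in Python using the standard sqlite3 module (an assumption on my side, since the question targets SQL Server). It swaps PIVOT for conditional aggregation with CASE, and it ranks rows per Date and Time, as the question asks, rather than per name prefix. Table and column names follow the question; the function name pivot_loads is made up for illustration.

```python
import sqlite3

# Sample rows copied from the question: NAME, Unit, Date, Time, Type, LOAD.
ROWS = [
    ("A1Cu", 2, "2020-01-02", "10:30", "CU", 0.1),
    ("A1Ta", 5, "2020-01-02", "10:30", "TA", 0.3),
    ("A1Ch", 6, "2020-01-02", "10:30", "CH", 0.2),
    ("B1Ch", 4, "2020-02-15", "11:40", "CH", 0.52),
    ("B1Ta", 8, "2020-02-15", "11:40", "TA", 0.83),
    ("C1Ta", 5, "2020-06-18", "21:00", "TA", 0.11),
    ("Z1Ch", 8, "2020-08-08", "15:30", "CH", 0.24),
    ("D1Ta", 8, "2020-06-18", "01:30", "TA", 0.3),
    ("C1Cu", 6, "2020-06-18", "21:00", "CU", 0.2),
]

QUERY = """
WITH ranked AS (
    -- rn = 1 marks the row whose NAME and Unit win: CU first, then CH, then TA
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY date, time
               ORDER BY CASE type WHEN 'CU' THEN 1 WHEN 'CH' THEN 2 ELSE 3 END
           ) AS rn
    FROM testair
)
SELECT MAX(CASE WHEN rn = 1 THEN name END) AS name,
       MAX(CASE WHEN rn = 1 THEN unit END) AS unit,
       date, time,
       MAX(CASE WHEN type = 'CU' THEN load END) AS TypeCULoad,
       MAX(CASE WHEN type = 'CH' THEN load END) AS TypeCHLoad,
       MAX(CASE WHEN type = 'TA' THEN load END) AS TypeTALoad
FROM ranked
GROUP BY date, time
ORDER BY date, time
"""


def pivot_loads(rows):
    """Load the rows into an in-memory table and return one row per date/time."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE testair "
        "(name TEXT, unit INTEGER, date TEXT, time TEXT, type TEXT, load REAL)"
    )
    con.executemany("INSERT INTO testair VALUES (?, ?, ?, ?, ?, ?)", rows)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in pivot_loads(ROWS):
        print(row)
```

The same CASE expressions should carry over to SQL Server largely unchanged, but I have only run this sketch against SQLite.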

Related

Cumulative sum by month with missing months

I need a cumulative sum of a quantity by month, but some months have no quantity, and SQL does not return rows for those months.
I have tried multiple other solutions I found here, but none of them worked, or at least I couldn't get them working. Currently, my code is as follows:
SELECT DISTINCT
A.FromDate
,A.ToDate
,A.OperationType
,A.[ItemCode]
,SUM(A.[Quantity]) OVER (PARTITION BY [ItemCode],OperationType,YEAR ORDER BY MONTH) [Quantity]
FROM (
SELECT
CONVERT(DATE,DATEADD(yy, DATEDIFF(yy, 0, T.OrderDate), 0)) AS FromDate
,EOMONTH(T.OrderDate) ToDate
,DATEPART(MONTH, t.OrderDate) AS [Month]
,DATEPART(YEAR, t.OrderDate) AS [Year]
,SUM(T.[Quantity]) [Quantity]
,OperationType
,[ItemCode]
FROM TEST T
WHERE [ItemCode] != ''
GROUP BY T.OrderDate,[ItemCode],OperationType
) A
With these results:
FromDate ToDate OType ItemCode Quantity
2021-01-01 2021-01-31 Type1 1 19
2021-01-01 2021-02-28 Type1 1 96
2021-01-01 2021-03-31 Type1 1 116
2021-01-01 2021-04-30 Type1 1 138
2021-01-01 2021-06-30 Type1 1 178
2021-01-01 2021-07-31 Type1 1 203
2021-01-01 2021-08-31 Type1 1 228
2021-01-01 2021-09-30 Type1 1 253
2021-01-01 2021-11-30 Type1 1 330
2021-01-01 2021-12-31 Type1 1 364
2022-01-01 2022-02-28 Type1 1 18
2022-01-01 2022-03-31 Type1 1 42
2022-01-01 2022-04-30 Type1 1 53
And I was expecting these results:
FromDate ToDate OType ItemCode Quantity
2021-01-01 2021-01-31 Type1 1 19
2021-01-01 2021-02-28 Type1 1 96
2021-01-01 2021-03-31 Type1 1 116
2021-01-01 2021-04-30 Type1 1 138
2021-01-01 2021-05-31 Type1 1 138
2021-01-01 2021-06-30 Type1 1 178
2021-01-01 2021-07-31 Type1 1 203
2021-01-01 2021-08-31 Type1 1 228
2021-01-01 2021-09-30 Type1 1 253
2021-01-01 2021-10-31 Type1 1 253
2021-01-01 2021-11-30 Type1 1 330
2021-01-01 2021-12-31 Type1 1 364
2022-01-01 2022-02-28 Type1 1 18
2022-01-01 2022-03-31 Type1 1 42
2022-01-01 2022-04-30 Type1 1 53
SQL Fiddle link: http://www.sqlfiddle.com/#!18/04a997/1
I would really appreciate some help. Thank you
Here is one way:
WITH m(Earliest,Latest) AS
(
SELECT DATEADD(DAY,1,MIN(EOMONTH(OrderDate,-1))),
MAX(EOMONTH(OrderDate)) FROM dbo.TEST
), TypeCodes AS
(
SELECT DISTINCT ItemCode, OperationType
FROM dbo.TEST
), Months AS
(
SELECT Month = DATEADD(MONTH, ROW_NUMBER()
OVER (ORDER BY @@SPID)-1, Earliest)
FROM m CROSS APPLY STRING_SPLIT(REPLICATE(',',
DATEDIFF(MONTH,Earliest,Latest)),',')
), raw AS
(
SELECT m.Month, i.OperationType, i.ItemCode,
Q = COALESCE(SUM(Quantity),0)
FROM Months AS m
CROSS JOIN TypeCodes AS i
LEFT OUTER JOIN dbo.TEST AS t
ON t.OrderDate >= m.Month
AND t.OrderDate < DATEADD(MONTH, 1, m.Month)
AND i.ItemCode = t.ItemCode
AND i.OperationType = t.OperationType
GROUP BY m.Month, i.OperationType, i.ItemCode
)
SELECT FromDate = Month,
ToDate = EOMONTH(Month),
OperationType,
ItemCode,
Quantity = SUM(Q) OVER (PARTITION BY ItemCode, OperationType, YEAR(Month) ORDER BY Month)
FROM raw;
Working example in this fiddle.
If you can't use STRING_SPLIT() because your database is stuck on an older compatibility level, you could put this function in a database that isn't:
USE ModernDatabase;
GO
CREATE FUNCTION dbo.StringSplit(@list nvarchar(max), @delim nchar(1))
RETURNS TABLE
AS
RETURN (SELECT value FROM STRING_SPLIT(@list, @delim));
Then you change:
FROM m CROSS APPLY STRING_SPLIT(...
To:
FROM m CROSS APPLY ModernDatabase.dbo.StringSplit(...
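To make the gap-filling idea easier to try out, here is a rough sketch in Python with the standard sqlite3 module. It is not the T-SQL above: the month list comes from a recursive CTE instead of STRING_SPLIT, and the running total restarts each year, as in the question's expected output. The sample rows are made up (the real data lives in the fiddle), and the names ORDERS, monthly and cumulative_by_month are hypothetical.

```python
import sqlite3

# Made-up sample rows: (OrderDate, ItemCode, OperationType, Quantity).
# December 2021 has no orders on purpose.
ORDERS = [
    ("2021-11-05", "1", "Type1", 10),
    ("2021-11-20", "1", "Type1", 5),
    ("2022-01-10", "1", "Type1", 7),
    ("2022-02-01", "1", "Type1", 3),
]

QUERY = """
WITH RECURSIVE bounds(earliest, latest) AS (
    SELECT date(MIN(OrderDate), 'start of month'),
           date(MAX(OrderDate), 'start of month')
    FROM test
),
months(month) AS (
    -- one row per calendar month between the first and the last order
    SELECT earliest FROM bounds
    UNION ALL
    SELECT date(month, '+1 month') FROM months, bounds WHERE month < latest
),
codes AS (
    SELECT DISTINCT ItemCode, OperationType FROM test
),
monthly AS (
    -- months without orders survive the LEFT JOIN with a total of 0
    SELECT m.month, c.ItemCode, c.OperationType,
           COALESCE(SUM(t.Quantity), 0) AS q
    FROM months AS m
    CROSS JOIN codes AS c
    LEFT JOIN test AS t
           ON t.OrderDate >= m.month
          AND t.OrderDate < date(m.month, '+1 month')
          AND t.ItemCode = c.ItemCode
          AND t.OperationType = c.OperationType
    GROUP BY m.month, c.ItemCode, c.OperationType
)
SELECT month, ItemCode, OperationType,
       SUM(q) OVER (
           PARTITION BY ItemCode, OperationType, strftime('%Y', month)
           ORDER BY month
       ) AS running_quantity
FROM monthly
ORDER BY ItemCode, OperationType, month
"""


def cumulative_by_month(orders):
    """Return (month, ItemCode, OperationType, running total) with no missing months."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE test "
        "(OrderDate TEXT, ItemCode TEXT, OperationType TEXT, Quantity INTEGER)"
    )
    con.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", orders)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in cumulative_by_month(ORDERS):
        print(row)
```

With these rows, December 2021 shows up carrying the November total, and the sum starts over in January 2022.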

How can I join two tables on an ID and a DATE RANGE in SQL

I have 2 query result tables containing records for different assessments. There are RAssessments and NAssessments which make up a complete review.
The aim is to eventually determine which reviews were completed. I would like to join the two tables on the ID and on the date. However, the dates on which the two assessments were completed may not be identical and may be several days apart, and some IDs may have more RAssessments than NAssessments.
Therefore, I would like to join T1 to T2 on ID and on T1Date (plus or minus 7 days). There is no way to match the two tables and align the records other than using the date range, as this is a poorly designed database. I am stumped and would appreciate some help with this.
Here is some sample data:
Table #1:
ID RAssessmentDate
1 2020-01-03
1 2020-03-03
1 2020-05-03
2 2020-01-09
2 2020-04-09
3 2022-07-21
4 2020-06-30
4 2020-12-30
4 2021-06-30
4 2021-12-30
Table #2:
ID NAssessmentDate
1 2020-01-07
1 2020-03-02
1 2020-05-03
2 2020-01-09
2 2020-07-06
2 2020-04-10
3 2022-07-21
4 2021-01-03
4 2021-06-28
4 2022-01-02
4 2022-06-26
I would like my end result table to look like this:
ID RAssessmentDate NAssessmentDate
1 2020-01-03 2020-01-07
1 2020-03-03 2020-03-02
1 2020-05-03 2020-05-03
2 2020-01-09 2020-01-09
2 2020-04-09 2020-04-10
2 NULL 2020-07-06
3 2022-07-21 2022-07-21
4 2020-06-30 NULL
4 2020-12-30 2021-01-03
4 2021-06-30 2021-06-28
4 2021-12-30 2022-01-02
4 NULL 2022-01-02
Try this:
SELECT
COALESCE(a.ID, b.ID) ID,
a.RAssessmentDate,
b.NAssessmentDate
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RAssessmentDate) RowId, *
FROM table1
) a
FULL OUTER JOIN (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NAssessmentDate) RowId, *
FROM table2
FROM table2
) b ON a.ID = b.ID AND a.RowId = b.RowId
WHERE (a.RAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
OR (b.NAssessmentDate BETWEEN '2020-01-01' AND '2022-01-02')
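For comparison, here is a hedged sketch of a join on ID plus a 7-day window, which is what the question describes, rather than a join on row numbers. It runs in Python with the standard sqlite3 module, so it uses julianday() for date arithmetic and emulates FULL OUTER JOIN with LEFT JOIN plus UNION ALL; both are SQLite-specific choices of mine, not something from the original post. Only IDs 2 and 4 from the sample data are loaded to keep it short, and the function name match_assessments is made up.

```python
import sqlite3

# Subset of the question's sample data (IDs 2 and 4 only).
R_ROWS = [
    (2, "2020-01-09"), (2, "2020-04-09"),
    (4, "2020-06-30"), (4, "2020-12-30"), (4, "2021-06-30"), (4, "2021-12-30"),
]
N_ROWS = [
    (2, "2020-01-09"), (2, "2020-07-06"), (2, "2020-04-10"),
    (4, "2021-01-03"), (4, "2021-06-28"), (4, "2022-01-02"), (4, "2022-06-26"),
]

QUERY = """
SELECT ID, RAssessmentDate, NAssessmentDate
FROM (
    -- every R row, with an N row attached when one lies within 7 days
    SELECT r.ID AS ID,
           r.RAssessmentDate AS RAssessmentDate,
           n.NAssessmentDate AS NAssessmentDate
    FROM table1 AS r
    LEFT JOIN table2 AS n
           ON n.ID = r.ID
          AND ABS(julianday(n.NAssessmentDate) - julianday(r.RAssessmentDate)) <= 7
    UNION ALL
    -- N rows that found no R partner within 7 days
    SELECT n.ID, NULL, n.NAssessmentDate
    FROM table2 AS n
    WHERE NOT EXISTS (
        SELECT 1
        FROM table1 AS r
        WHERE r.ID = n.ID
          AND ABS(julianday(n.NAssessmentDate) - julianday(r.RAssessmentDate)) <= 7
    )
) AS matched
ORDER BY ID, COALESCE(RAssessmentDate, NAssessmentDate)
"""


def match_assessments(r_rows, n_rows):
    """Pair R and N assessments per ID when their dates are at most 7 days apart."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (ID INTEGER, RAssessmentDate TEXT)")
    con.execute("CREATE TABLE table2 (ID INTEGER, NAssessmentDate TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", r_rows)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", n_rows)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in match_assessments(R_ROWS, N_ROWS):
        print(row)
```

One caveat: if two N assessments fall within 7 days of the same R assessment, this sketch returns both pairs, so real data may need an extra tie-break rule.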

How to clean a SQL table based on startdate, enddate and effective date

I have a really dirty table in which the start date is mixed up with the effective date of a value change.
The table looks like this:
id value startdate enddate effective date
1 0.3 2020-10-07 2021-02-28 2020-07-01
1 1 2020-10-07 2021-02-28 2020-10-07
2 0.46 2021-01-01 (empty) 2021-01-01
2 1 2021-01-01 2020-10-07 2021-05-01
3 1 2021-08-01 (empty) 2021-08-01
4 1 2019-03-01 (empty) 2019-03-01
4 0.5 2019-03-01 (empty) 2020-08-01
4 0.7 2019-03-01 (empty) 2021-05-01
When the enddate is empty, it means that no change is planned. When the startdate is later than the effective date, it means that an older record was deleted and a new one was created with other values.
My goal is to clean the table and get it sorted into something like this:
id value startdate_valid enddate_valid
1 0.3 2020-07-01 2020-10-07
1 1 2020-10-07 2021-02-28
2 0.46 2021-01-01 2021-05-01
2 1 2021-05-01 (empty)
3 1 2021-08-01 (empty)
4 1 2019-03-01 2020-08-01
4 0.5 2020-08-01 2021-05-01
4 0.7 2021-05-01 (empty)
Any idea how I can achieve this?
EDIT:
I think I was able to get the startdate_valid value by using
MAX([effective date]) OVER(PARTITION BY id, YEAR([effective date]), MONTH([effective date]) ORDER BY [effective date])
This makes sense, as the startdate is included in the effective date, but I am still stuck on getting the enddate_valid.
I have found a solution to my problem. I needed to do it in two steps, so if someone has a better solution, please share it and I will mark it as correct.
SELECT
*,
COALESCE(
LEAD(sub.StartDate_value) OVER(PARTITION BY sub.id ORDER BY sub.StartDate_value),
sub.[startdate]) AS [EndDate_value]
FROM (
SELECT
id, value, startdate,
COALESCE(
MAX([effective date]) OVER(PARTITION BY id, YEAR([effective date]), MONTH([effective date]) ORDER BY [effective date]),
startdate
) AS StartDate_value
from [table] ) sub
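A small sketch of the LEAD step may help, assuming the effective date can be used directly as startdate_valid (which holds for the sample rows). It runs in Python with the standard sqlite3 module and loads only ids 1 and 4 from the sample table. The table name dirty, the column name effective_date and the function name clean_periods are made up for illustration, and falling back to enddate for the last row of each id is my assumption based on the expected output for id 1.

```python
import sqlite3

# Rows for ids 1 and 4 from the question:
# (id, value, startdate, enddate, effective date); None stands for an empty enddate.
DIRTY_ROWS = [
    (1, 0.3, "2020-10-07", "2021-02-28", "2020-07-01"),
    (1, 1, "2020-10-07", "2021-02-28", "2020-10-07"),
    (4, 1, "2019-03-01", None, "2019-03-01"),
    (4, 0.5, "2019-03-01", None, "2020-08-01"),
    (4, 0.7, "2019-03-01", None, "2021-05-01"),
]

QUERY = """
SELECT id,
       value,
       effective_date AS startdate_valid,
       -- a period ends when the next value for the same id becomes effective;
       -- the last period keeps the recorded enddate (possibly NULL)
       COALESCE(
           LEAD(effective_date) OVER (PARTITION BY id ORDER BY effective_date),
           enddate
       ) AS enddate_valid
FROM dirty
ORDER BY id, effective_date
"""


def clean_periods(rows):
    """Return (id, value, startdate_valid, enddate_valid) for each input row."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE dirty "
        "(id INTEGER, value REAL, startdate TEXT, enddate TEXT, effective_date TEXT)"
    )
    con.executemany("INSERT INTO dirty VALUES (?, ?, ?, ?, ?)", rows)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in clean_periods(DIRTY_ROWS):
        print(row)
```

This does it in one pass, but I have not checked it against the rows for ids 2 and 3, so treat the fallback rule as a starting point only.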

Get all rows from one table stream and the row before in time from an other table

Suppose I have one table (table_1) and one table stream (stream_1) that captures the changes made to table_1, in my case only inserts of new rows. Once I have acted on these changes, the rows will be removed from stream_1 but will remain in table_1.
From that I would like to calculate delta values for var1 (var1 - lag(var1) as delta_var1) partitioned on a customer and just leave var2 as it is. So the data in table_1 could look something like this:
timemessage customerid var1 var2
2021-04-01 06:00:00 1 10 5
2021-04-01 07:00:00 2 100 7
2021-04-01 08:00:00 1 20 10
2021-04-01 09:00:00 1 40 3
2021-04-01 15:00:00 2 150 5
2021-04-01 23:00:00 1 50 6
2021-04-02 06:00:00 2 180 2
2021-04-02 07:00:00 1 55 9
2021-04-02 08:00:00 2 200 4
And the data in stream_1 that I want to act on could look like this:
timemessage customerid var1 var2
2021-04-01 23:00:00 1 50 6
2021-04-02 06:00:00 2 180 2
2021-04-02 07:00:00 1 55 9
2021-04-02 08:00:00 2 200 4
But to be able to calculate delta_var1 for all customers I would need the previous row in time for each customer before the ones in stream_1.
For example: To be able to calculate how much var1 has increased for customerid = 1 between 2021-04-01 09:00:00 and 2021-04-01 23:00:00 I want to include the 2021-04-01 09:00:00 row for customerid = 1 in my output.
So I would like to create a select containing all rows in stream_1 + the previous row in time for each customerid from table_1: The wanted output is the following in regard to the mentioned table_1 and stream_1.
timemessage customerid var1 var2
2021-04-01 09:00:00 1 40 3
2021-04-01 15:00:00 2 150 5
2021-04-01 23:00:00 1 50 6
2021-04-02 06:00:00 2 180 2
2021-04-02 07:00:00 1 55 9
2021-04-02 08:00:00 2 200 4
So given you have the "last value per day" in your wanted output, you want a QUALIFY clause that keeps only the wanted rows, using ROW_NUMBER partitioned by customerid and timemessage. Assuming the accumulator is only ever positive, you can order by accumulatedvalue, thus:
WITH data(timemessage, customerid, accumulatedvalue) AS (
SELECT * FROM VALUES
('2021-04-01', 1, 10)
,('2021-04-01', 2, 100)
,('2021-04-02', 1, 20)
,('2021-04-03', 1, 40)
,('2021-04-03', 2, 150)
,('2021-04-04', 1, 50)
,('2021-04-04', 2, 180)
,('2021-04-05', 1, 55)
,('2021-04-05', 2, 200)
)
SELECT * FROM data
QUALIFY ROW_NUMBER() OVER (PARTITION BY customerid,timemessage ORDER BY accumulatedvalue DESC) = 1
ORDER BY 1,2;
gives:
TIMEMESSAGE CUSTOMERID ACCUMULATEDVALUE
2021-04-01 1 10
2021-04-01 2 100
2021-04-02 1 20
2021-04-03 1 40
2021-04-03 2 150
2021-04-04 1 50
2021-04-04 2 180
2021-04-05 1 55
2021-04-05 2 200
If you can trust your data, and the data in table2 starts right after the data in table1, then you can just take the last record for each customer from table1 and union it with table2:
select * from table1
qualify row_number() over (partition by customerid order by timemessage desc) = 1
union all
select * from table2
If not:
select a.* from table1 a
join table2 b
on a.customerid = b.customerid
and a.timemessage < b.timemessage
qualify row_number() over (partition by a.customerid order by a.timemessage desc) = 1
union all
select * from table2
You can also add a condition so that the query does not look back more than 1 day (or 1 hour, or whatever interval is safe) in the data, for better performance.
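Here is a rough, runnable sketch of the "previous row plus stream rows" idea, written in Python with the standard sqlite3 module. SQLite has neither QUALIFY nor table streams, so stream_1 is just an ordinary table here and the filter on the row number sits in an outer WHERE; both are stand-ins of mine. Unlike the join above, it compares against the earliest stream row per customer, since the stream rows are also present in table_1. The function name rows_to_process is made up.

```python
import sqlite3

# Sample data from the question: (timemessage, customerid, var1, var2).
TABLE_1 = [
    ("2021-04-01 06:00:00", 1, 10, 5),
    ("2021-04-01 07:00:00", 2, 100, 7),
    ("2021-04-01 08:00:00", 1, 20, 10),
    ("2021-04-01 09:00:00", 1, 40, 3),
    ("2021-04-01 15:00:00", 2, 150, 5),
    ("2021-04-01 23:00:00", 1, 50, 6),
    ("2021-04-02 06:00:00", 2, 180, 2),
    ("2021-04-02 07:00:00", 1, 55, 9),
    ("2021-04-02 08:00:00", 2, 200, 4),
]
STREAM_1 = TABLE_1[5:]  # the four newest rows, as in the question

QUERY = """
WITH first_new AS (
    -- earliest unprocessed row per customer
    SELECT customerid, MIN(timemessage) AS first_ts
    FROM stream_1
    GROUP BY customerid
),
prev AS (
    -- rows older than that, numbered newest first
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY t.customerid ORDER BY t.timemessage DESC
           ) AS rn
    FROM table_1 AS t
    JOIN first_new AS f
      ON f.customerid = t.customerid
     AND t.timemessage < f.first_ts
)
SELECT timemessage, customerid, var1, var2 FROM prev WHERE rn = 1
UNION ALL
SELECT timemessage, customerid, var1, var2 FROM stream_1
ORDER BY timemessage
"""


def rows_to_process(table_rows, stream_rows):
    """Return the stream rows plus the last earlier row per customer."""
    con = sqlite3.connect(":memory:")
    for name in ("table_1", "stream_1"):
        con.execute(
            f"CREATE TABLE {name} "
            "(timemessage TEXT, customerid INTEGER, var1 INTEGER, var2 INTEGER)"
        )
    con.executemany("INSERT INTO table_1 VALUES (?, ?, ?, ?)", table_rows)
    con.executemany("INSERT INTO stream_1 VALUES (?, ?, ?, ?)", stream_rows)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in rows_to_process(TABLE_1, STREAM_1):
        print(row)
```

From this result, var1 - lag(var1) per customerid can be computed, and the extra leading row per customer dropped afterwards.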

PostgreSQL group by with interval but without window functions

This is follow-up of my previous question:
PostgreSQL group by with interval
There was a very good answer but unfortunately it is not working with PostgreSQL 8.0 - some clients still use this old version.
So I need to find another solution without using window functions
Here is what I have as a table:
id quantity price1 price2 date
1 100 1 0 2018-01-01 10:00:00
2 200 1 0 2018-01-02 10:00:00
3 50 5 0 2018-01-02 11:00:00
4 100 1 1 2018-01-03 10:00:00
5 100 1 1 2018-01-03 11:00:00
6 300 1 0 2018-01-03 12:00:00
I need to sum "quantity" grouped by "price1" and "price2" but only when they change
So the end result should look like this:
quantity price1 price2 dateStart dateEnd
300 1 0 2018-01-01 10:00:00 2018-01-02 10:00:00
50 5 0 2018-01-02 11:00:00 2018-01-02 11:00:00
200 1 1 2018-01-03 10:00:00 2018-01-03 11:00:00
300 1 0 2018-01-03 12:00:00 2018-01-03 12:00:00
It is not efficient, but you can implement the same logic with subqueries:
select sum(quantity), price1, price2,
min(date) as dateStart, max(date) as dateend
from (select d.*,
(select count(*)
from data d2
where d2.date <= d.date
) as seqnum,
(select count(*)
from data d2
where d2.price1 = d.price1 and d2.price2 = d.price2 and d2.date <= d.date
) as seqnum_pp
from data d
) t
group by price1, price2, (seqnum - seqnum_pp)
order by dateStart
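To see why the difference of the two counts identifies each run, here is a small sketch that runs the query above against the sample table. It uses Python with the standard sqlite3 module rather than PostgreSQL 8.0 (which I have not tested), and the function name sum_by_price_run is made up. For the sample rows, seqnum - seqnum_pp works out to 0, 0, 2, 3, 3, 3, so together with price1 and price2 it separates the four groups.

```python
import sqlite3

# Sample rows from the question: (id, quantity, price1, price2, date).
DATA = [
    (1, 100, 1, 0, "2018-01-01 10:00:00"),
    (2, 200, 1, 0, "2018-01-02 10:00:00"),
    (3, 50, 5, 0, "2018-01-02 11:00:00"),
    (4, 100, 1, 1, "2018-01-03 10:00:00"),
    (5, 100, 1, 1, "2018-01-03 11:00:00"),
    (6, 300, 1, 0, "2018-01-03 12:00:00"),
]

# Same query as in the answer; no window functions involved.
QUERY = """
select sum(quantity), price1, price2,
       min(date) as dateStart, max(date) as dateEnd
from (select d.*,
             (select count(*)
              from data d2
              where d2.date <= d.date
             ) as seqnum,
             (select count(*)
              from data d2
              where d2.price1 = d.price1 and d2.price2 = d.price2
                and d2.date <= d.date
             ) as seqnum_pp
      from data d
     ) t
group by price1, price2, (seqnum - seqnum_pp)
order by dateStart
"""


def sum_by_price_run(rows):
    """Sum quantity over consecutive rows that share price1 and price2."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE data "
        "(id INTEGER, quantity INTEGER, price1 INTEGER, price2 INTEGER, date TEXT)"
    )
    con.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", rows)
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in sum_by_price_run(DATA):
        print(row)
```

Both correlated subqueries scan the table once per row, so the cost grows quadratically with the table size, which matches the "not efficient" warning.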