PIVOT datetime and ORDER column values of multiple rows - sql

I have a table with values which are not in order
Id | DateTime                | Status
1  | 2022-03-01 18:00:00.000 | Stop1
2  | 2022-03-01 08:00:00.000 | Start
3  | 2022-03-01 20:00:00.000 | Stop2
4  | 2022-03-02 09:00:00.000 | Start
5  | 2022-03-01 10:00:00.000 | Stop2
6  | 2022-03-02 11:00:00.000 | Finish
7  | 2022-03-01 14:00:00.000 | Start
8  | 2022-03-02 10:00:00.000 | Stop1
where Status can be 'Start', 'Stop1', 'Stop2', or 'Finish'.
I need the timeline like this, where the values are pivoted in the order (from the earliest to the latest; id is not relevant at this point)
Id | Start               | Stop1               | Stop2               | Finish
2  | 2022-03-01 08:00:00 | NULL                | 2022-03-01 10:00:00 | NULL
7  | 2022-03-01 14:00:00 | 2022-03-01 18:00:00 | 2022-03-01 20:00:00 | NULL
4  | 2022-03-02 09:00:00 | 2022-03-02 10:00:00 | NULL                | 2022-03-02 11:00:00
After I PIVOTed it in SQL Server
SELECT *
FROM (
    SELECT Id, DateTime, Status FROM table
) t
PIVOT (
    MAX(DateTime)
    FOR Status IN (Start, Stop1, Stop2, Finish)
) p
I got
Id | Start               | Stop1               | Stop2               | Finish
2  | 2022-03-01 08:00:00 | NULL                | NULL                | NULL
5  | NULL                | NULL                | 2022-03-01 10:00:00 | NULL
7  | 2022-03-01 14:00:00 | NULL                | NULL                | NULL
1  | NULL                | 2022-03-01 18:00:00 | NULL                | NULL
3  | NULL                | NULL                | 2022-03-01 20:00:00 | NULL
6  | NULL                | NULL                | NULL                | 2022-03-02 11:00:00
8  | NULL                | 2022-03-02 10:00:00 | NULL                | NULL
4  | 2022-03-02 09:00:00 | NULL                | NULL                | NULL
How can I get that timeline?

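Perhaps this will help. Window functions can be invaluable here: a running count of 'Start' rows, ordered by DateTime, assigns every row to a group (Grp), and the Id of the group's 'Start' row then becomes the key for the pivot.
Also, remember to "FEED" your pivot with only the required columns.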
Example
Select *
From  (
        Select id = min(case when Status='Start' then ID end) over (partition by Grp)
              ,DateTime
              ,Status
        From  (
                Select *
                      ,Grp = sum( case when [Status]='Start' then 1 else 0 end) over (order by datetime)
                from YourTable
              ) A
      ) src
Pivot ( max(DateTime) FOR Status IN (Start, Stop1, Stop2, Finish) ) p
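To see why the running sum works, it can help to look at the intermediate Grp column on its own. The following is a minimal sketch, assuming SQL Server 2012 or later (needed for SUM ... OVER with ORDER BY). The table variable @t is a hypothetical stand-in for YourTable, loaded with the rows from the question.

```sql
-- @t is a hypothetical stand-in for YourTable, filled with the question's rows.
DECLARE @t TABLE (Id int, [DateTime] datetime, [Status] varchar(10));

INSERT INTO @t (Id, [DateTime], [Status]) VALUES
    (1, '2022-03-01T18:00:00', 'Stop1'),
    (2, '2022-03-01T08:00:00', 'Start'),
    (3, '2022-03-01T20:00:00', 'Stop2'),
    (4, '2022-03-02T09:00:00', 'Start'),
    (5, '2022-03-01T10:00:00', 'Stop2'),
    (6, '2022-03-02T11:00:00', 'Finish'),
    (7, '2022-03-01T14:00:00', 'Start'),
    (8, '2022-03-02T10:00:00', 'Stop1');

-- Each 'Start' row bumps the running total, so every later row up to the
-- next 'Start' shares the same group number.
SELECT Id, [DateTime], [Status],
       Grp = SUM(CASE WHEN [Status] = 'Start' THEN 1 ELSE 0 END)
                 OVER (ORDER BY [DateTime])
FROM @t
ORDER BY [DateTime];
```

With these rows, ids 2 and 5 fall into group 1, ids 7, 1 and 3 into group 2, and ids 4, 8 and 6 into group 3, which lines up with the three rows of the desired timeline.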

Related

Fill null values in a table with last non-null value SQL-Server

I'm currently trying to create a table for our data analysis team that contains the currency exchange rate ("tasa" in Spanish), and I came across a problem. There is a table that contains every rate change made by the store's manager. If he doesn't change the rate, no new row is recorded, so the latest date in the table is the date of the last change. When I extract those values and pivot the table, I end up with a lot of NULLs (mostly on weekends). I would like to return the last non-NULL value instead. Here is the table structure.
tasas:
fecha               | tasa_v  | co_mone
2021-01-05 00:00:00 | 1.30000 | EUR
2021-01-05 00:00:00 | 1.30000 | US$
2021-01-05 00:00:00 | 1.30000 | ZEL
2021-01-06 00:00:00 | 1.40000 | ZEL
2021-01-06 00:00:00 | 1.40000 | US$
2021-01-06 00:00:00 | 1.40000 | EUR
2021-01-07 00:00:00 | 1.45000 | EUR
2021-01-07 00:00:00 | 1.51500 | EUR
2021-01-07 00:00:00 | 1.45000 | US$
2021-01-07 00:00:00 | 1.51500 | US$
2021-01-07 00:00:00 | 1.45000 | ZEL
2021-01-07 00:00:00 | 1.51500 | ZEL
2021-01-08 00:00:00 | 1.65000 | ZEL
2021-01-08 00:00:00 | 1.65000 | US$
2021-01-08 00:00:00 | 1.65000 | EUR
Then I join that with a calendar table, and pivot using date and co_mone, as well as AVG() for tasa_v.
DECLARE @startdate as date
DECLARE @enddate as date
set @startdate = '20210101'
set @enddate = '20221231'
SELECT fecha as 'Fecha',[US$],[EUR],[ZEL]
FROM
(
    -- Source is the join between calendar and exchange rates dates
    SELECT F.fecha, T.tasa_v, T.co_mone
    FROM DWSTAGING_GML.dbo.Dim_Fecha F
    left join dbo.tasas as T on cast(T.fecha as date) = F.fecha
    WHERE F.fecha between @startdate and @enddate
) as SRC
-- pivot the table
PIVOT
(
    AVG(tasa_v)
    FOR co_mone IN ([US$],[EUR],[ZEL])
) as Pivoted
order by fecha
GO
The output:
(modified to show that the columns may have different values)
Fecha               | US$      | EUR      | ZEL
2021-01-01 00:00:00 | NULL     | NULL     | NULL
2021-01-02 00:00:00 | NULL     | NULL     | NULL
2021-01-03 00:00:00 | NULL     | NULL     | NULL
2021-01-04 00:00:00 | NULL     | NULL     | NULL
2021-01-05 00:00:00 | 1.300000 | 1.300000 | 1.300000
2021-01-06 00:00:00 | 1.400000 | 1.400000 | 1.400000
2021-01-07 00:00:00 | NULL     | NULL     | NULL
2021-01-08 00:00:00 | 1.650000 | 1.850000 | 1.650000
2021-01-09 00:00:00 | NULL     | NULL     | NULL
What I'm looking for:
Fecha               | US$      | EUR      | ZEL
2021-01-01 00:00:00 | NULL     | NULL     | NULL
2021-01-02 00:00:00 | NULL     | NULL     | NULL
2021-01-03 00:00:00 | NULL     | NULL     | NULL
2021-01-04 00:00:00 | NULL     | NULL     | NULL
2021-01-05 00:00:00 | 1.300000 | 1.300000 | 1.300000
2021-01-06 00:00:00 | 1.400000 | 1.400000 | 1.400000
2021-01-07 00:00:00 | 1.400000 | 1.400000 | 1.400000
2021-01-08 00:00:00 | 1.650000 | 1.850000 | 1.650000
2021-01-09 00:00:00 | 1.650000 | 1.850000 | 1.650000
I tried the solutions from two similar questions:
select Fecha,
coalesce([US$], first_value([US$]) OVER (partition by grupo_US ORDER BY fecha)) as tasa_llenada
FROM (
select Fecha, [US$]
count([US$]) OVER (order by fecha) as grupo_US
FROM (Pivoted_table))
but I get the error
Msg 195, Level 15, State 10, Line 7
'first_value' is not a recognized built-in function name.
Any ideas?
I hope this works to filter out the NULLs for you:
declare @startdate as date;
declare @enddate as date;
set @startdate = '20210101';
set @enddate = '20210109';
select fecha as 'Fecha',[US$],[EUR],[ZEL]
from
(
    -- Source is the join between calendar and exchange rates dates
    select F.fecha, T.tasa_v, T.co_mone
    from #Dim_Fecha F
    left join
        (select
            tsub.fecha, tsub.tasa_v, tsub.co_mone,
            row_number() over (
                partition by tsub.fecha, tsub.co_mone
                order by tsub.fecha ) as id
         from #tasas tsub
         where tsub.tasa_v is not null
        ) T
        on cast(T.fecha as date) = F.fecha
        and T.id = 1
    where
        F.fecha between @startdate and @enddate
) as SRC
-- pivot the table
pivot
(
    avg(tasa_v)
    for co_mone in ([US$],[EUR],[ZEL])
) as Pivoted
order by Fecha;
Actually, I think OVER and ROW_NUMBER are superfluous anyway. The main thing is just to filter out the NULLs, right?
So:
declare @startdate as date;
declare @enddate as date;
set @startdate = '20210101';
set @enddate = '20210109';
select fecha as 'Fecha',[US$],[EUR],[ZEL]
from
(
    -- Source is the join between calendar and exchange rates dates
    select F.fecha, T.tasa_v, T.co_mone
    from #Dim_Fecha F
    left join
        (select
            tsub.fecha, tsub.tasa_v, tsub.co_mone
         from #tasas tsub
         where tsub.tasa_v is not null
        ) T
        on cast(T.fecha as date) = F.fecha
    where
        F.fecha between @startdate and @enddate
) as SRC
-- pivot the table
pivot
(
    avg(tasa_v)
    for co_mone in ([US$],[EUR],[ZEL])
) as Pivoted
order by Fecha;
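Note that filtering NULL rates out of the source rows does not by itself fill calendar days that have no rate row at all; those days still pivot to NULL. One way to carry the last known rate forward without FIRST_VALUE is to look up, for every calendar day and currency, the most recent rate date on or before that day. The following is a minimal sketch of that idea, assuming SQL Server 2008 or later. The table variables @Dim_Fecha and @tasas are hypothetical stand-ins for the calendar and rate tables, loaded with a reduced sample.

```sql
-- Hypothetical stand-ins for the calendar and rate tables in the question.
DECLARE @Dim_Fecha TABLE (fecha date PRIMARY KEY);
DECLARE @tasas TABLE (fecha datetime, tasa_v decimal(18,5), co_mone varchar(3));

INSERT INTO @Dim_Fecha (fecha) VALUES
    ('20210104'), ('20210105'), ('20210106'),
    ('20210107'), ('20210108'), ('20210109');

-- No rates on 2021-01-04, 2021-01-07 or 2021-01-09.
INSERT INTO @tasas (fecha, tasa_v, co_mone) VALUES
    ('20210105', 1.30, 'EUR'), ('20210105', 1.30, 'US$'), ('20210105', 1.30, 'ZEL'),
    ('20210106', 1.40, 'EUR'), ('20210106', 1.40, 'US$'), ('20210106', 1.40, 'ZEL'),
    ('20210108', 1.65, 'EUR'), ('20210108', 1.65, 'US$'), ('20210108', 1.65, 'ZEL');

SELECT Fecha, [US$], [EUR], [ZEL]
FROM (
    SELECT F.fecha AS Fecha, M.co_mone, R.tasa_v
    FROM @Dim_Fecha AS F
    -- One row per calendar day and currency, even when no rate exists.
    CROSS JOIN (SELECT DISTINCT co_mone FROM @tasas) AS M
    -- Average the rates recorded on the latest rate date <= the calendar day.
    OUTER APPLY (
        SELECT AVG(T.tasa_v) AS tasa_v
        FROM @tasas AS T
        WHERE T.co_mone = M.co_mone
          AND CAST(T.fecha AS date) = (
                SELECT MAX(CAST(T2.fecha AS date))
                FROM @tasas AS T2
                WHERE T2.co_mone = M.co_mone
                  AND CAST(T2.fecha AS date) <= F.fecha)
    ) AS R
) AS SRC
PIVOT (MAX(tasa_v) FOR co_mone IN ([US$], [EUR], [ZEL])) AS Pivoted
ORDER BY Fecha;
```

In this sample, 2021-01-04 stays NULL because no earlier rate exists, while 2021-01-07 and 2021-01-09 pick up the rates from 2021-01-06 and 2021-01-08 respectively. The correlated lookup may be slow on large tables, so an index on (co_mone, fecha) would likely help.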

Transpose and join one column SQL

I have a table which will be filled in the following manner:
ID | MODELID    | PROPERTYID | V       | Q    | T                       | TYPE
1  | LotNumber  | NULL       | 1243582 | NULL | 2021-10-08 00:00:00.000 | NULL
2  | GoodStrips | NULL       | 39288   | NULL | 2021-10-08 00:00:00.000 | NULL
3  | StripProc  | NULL       | 492     | NULL | 2021-10-08 00:00:00.000 | NULL
4  | StripRaw   | NULL       | 883     | NULL | 2021-10-08 00:00:00.000 | NULL
5  | LabelProc  | NULL       | 414     | NULL | 2021-10-08 00:00:00.000 | NULL
6  | LabelRaw   | NULL       | 54      | NULL | 2021-10-08 00:00:00.000 | NULL
7  | SmallTips  | NULL       | 101     | NULL | 2021-10-08 00:00:00.000 | NULL
8  | LongTips   | NULL       | 65      | NULL | 2021-10-08 00:00:00.000 | NULL
For each block of 8 rows, the timestamp will be identical.
Ideally, I'd like to make another table or view from this initial table where my lot number or timestamp would act as an ID column, and all the other values would be placed in the same row, like so:
LotNumber | GoodStrips | StripProc | StripRaw | LabelProc | LabelRaw | SmallTips | LongTips | T
1243582   | 39288      | 492       | 883      | 414       | 54       | 101       | 65       | 2021-10-08 00:00:00.000
I've been trying to get an inner join working to no avail.
My attempt at doing the first few as a test:
Select m1.T, m1.MODELID, m2.V, m3.V
from Rejects945 m1
inner join Rejects945 m2 on m2.T = m1.T
inner join Rejects945 m3 on m3.T = m1.T
where m2.V = 'GoodStrips'
where m3.V = 'StripProc'
where MODELID = 'LotNumber'
I get the following error:
Msg 156, Level 15, State 1, Line 6
Incorrect syntax near the keyword 'where'
Any help is greatly appreciated.
You can do this with the PIVOT operator (see the SQL Server documentation on PIVOT and UNPIVOT).
See the sample code below:
CREATE TABLE #Test
(
     ID INT
    ,MODELID VARCHAR(100)
    ,V INT
    ,T DATETIME
)
INSERT #Test (ID, MODELID, V, T)
VALUES (1,'LotNumber',1243582,'8/10/2021 12:00:00 AM')
      ,(2,'GoodStrips',39288,'8/10/2021 12:00:00 AM')
      ,(3,'StripProc',492,'8/10/2021 12:00:00 AM')
      ,(4,'StripRaw',883,'8/10/2021 12:00:00 AM')
      ,(5,'LabelProc',414,'8/10/2021 12:00:00 AM')
      ,(6,'LabelRaw',54,'8/10/2021 12:00:00 AM')
      ,(7,'SmallTips',101,'8/10/2021 12:00:00 AM')
      ,(8,'LongTips',65,'8/10/2021 12:00:00 AM')
      ,(9,'LotNumber',2345234,'9/10/2021 12:00:00 AM')
      ,(10,'GoodStrips',4543,'9/10/2021 12:00:00 AM')
      ,(11,'StripProc',455,'9/10/2021 12:00:00 AM')
      ,(12,'StripRaw',43,'9/10/2021 12:00:00 AM')
      ,(13,'LabelProc',24,'9/10/2021 12:00:00 AM')
      ,(14,'LabelRaw',5,'9/10/2021 12:00:00 AM')
      ,(15,'SmallTips',2,'9/10/2021 12:00:00 AM')
      ,(16,'LongTips',666,'9/10/2021 12:00:00 AM')
select LotNumber
      ,GoodStrips
      ,StripProc
      ,StripRaw
      ,LabelProc
      ,LabelRaw
      ,SmallTips
      ,LongTips
      ,t
from
(
    select v, MODELID, t
    from #Test
) d
pivot
(
    max(v)
    for MODELID in (LotNumber
                   ,GoodStrips
                   ,StripProc
                   ,StripRaw
                   ,LabelProc
                   ,LabelRaw
                   ,SmallTips
                   ,LongTips
                   )
) piv;
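The same reshaping can also be written as conditional aggregation, which some people find easier to read than PIVOT and which avoids the multiple self-joins from the question. The following is a minimal sketch, assuming one value per MODELID per timestamp. The table variable @Rejects is a hypothetical stand-in for Rejects945, reduced to the columns that matter.

```sql
-- @Rejects is a hypothetical stand-in for the Rejects945 table in the question.
DECLARE @Rejects TABLE (ID int, MODELID varchar(100), V int, T datetime);

INSERT INTO @Rejects (ID, MODELID, V, T) VALUES
    (1, 'LotNumber',  1243582, '2021-10-08T00:00:00'),
    (2, 'GoodStrips', 39288,   '2021-10-08T00:00:00'),
    (3, 'StripProc',  492,     '2021-10-08T00:00:00'),
    (4, 'StripRaw',   883,     '2021-10-08T00:00:00'),
    (5, 'LabelProc',  414,     '2021-10-08T00:00:00'),
    (6, 'LabelRaw',   54,      '2021-10-08T00:00:00'),
    (7, 'SmallTips',  101,     '2021-10-08T00:00:00'),
    (8, 'LongTips',   65,      '2021-10-08T00:00:00');

-- One output row per timestamp; each CASE picks out the value of one MODELID.
SELECT MAX(CASE WHEN MODELID = 'LotNumber'  THEN V END) AS LotNumber,
       MAX(CASE WHEN MODELID = 'GoodStrips' THEN V END) AS GoodStrips,
       MAX(CASE WHEN MODELID = 'StripProc'  THEN V END) AS StripProc,
       MAX(CASE WHEN MODELID = 'StripRaw'   THEN V END) AS StripRaw,
       MAX(CASE WHEN MODELID = 'LabelProc'  THEN V END) AS LabelProc,
       MAX(CASE WHEN MODELID = 'LabelRaw'   THEN V END) AS LabelRaw,
       MAX(CASE WHEN MODELID = 'SmallTips'  THEN V END) AS SmallTips,
       MAX(CASE WHEN MODELID = 'LongTips'   THEN V END) AS LongTips,
       T
FROM @Rejects
GROUP BY T;
```

Because the rows are grouped by T, this relies on every block of eight rows sharing exactly the same timestamp, as stated in the question.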

Select the last entry recorded in a table for each day, within a duration of days

How can I select the last entry recorded for each day? In this example, I need the last item number ordered and the last DateOrdered entry for each day over the last 5 days. Here's my table:
ItemNumber | DateOrdered
1 2020-04-01 08:00:00.000
3 2020-04-01 09:00:00.000
5 2020-04-01 10:00:00.000
4 2020-04-02 09:00:00.000
6 2020-04-02 10:00:00.000
7 2020-04-03 08:00:00.000
3 2020-04-03 09:00:00.000
2 2020-04-03 10:00:00.000
5 2020-04-04 10:00:00.000
8 2020-04-05 08:00:00.000
2 2020-04-05 09:00:00.000
8 2020-04-05 10:00:00.000
Here are the results I need:
ItemNumber | DateOrdered
5 2020-04-01 10:00:00.000
6 2020-04-02 10:00:00.000
2 2020-04-03 10:00:00.000
5 2020-04-04 10:00:00.000
8 2020-04-05 10:00:00.000
This is as close as I can get with it:
with tempTable as
(
    select
        *,
        row_number() over(partition by datediff(d, 0, DateOrdered) order by DateOrdered desc) as rn
    from myTable
)
select *
from tempTable
where rn = 1
You are almost there. You just need to fix the definition of your partition so it puts together all rows that belong to the same day.
This should do it:
with tempTable as
(
    select
        *,
        row_number() over(partition by cast(DateOrdered as date) order by DateOrdered desc) as rn
    from myTable
)
select *
from tempTable
where rn = 1
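The question also asks to limit the result to the last 5 days, which the query above does not do. The following is a minimal sketch of one way to add that filter. The table variable @myTable and the reference date @asOf are hypothetical; in practice @asOf would probably be CAST(GETDATE() AS date).

```sql
-- @myTable is a hypothetical stand-in for myTable, filled with the question's rows.
DECLARE @myTable TABLE (ItemNumber int, DateOrdered datetime);

INSERT INTO @myTable (ItemNumber, DateOrdered) VALUES
    (1, '2020-04-01T08:00:00'), (3, '2020-04-01T09:00:00'), (5, '2020-04-01T10:00:00'),
    (4, '2020-04-02T09:00:00'), (6, '2020-04-02T10:00:00'),
    (7, '2020-04-03T08:00:00'), (3, '2020-04-03T09:00:00'), (2, '2020-04-03T10:00:00'),
    (5, '2020-04-04T10:00:00'),
    (8, '2020-04-05T08:00:00'), (2, '2020-04-05T09:00:00'), (8, '2020-04-05T10:00:00');

-- Hypothetical reference date; the window covers this day and the 4 days before it.
DECLARE @asOf date = '2020-04-05';

WITH lastPerDay AS
(
    SELECT ItemNumber,
           DateOrdered,
           ROW_NUMBER() OVER (PARTITION BY CAST(DateOrdered AS date)
                              ORDER BY DateOrdered DESC) AS rn
    FROM @myTable
    WHERE DateOrdered >= DATEADD(day, -4, @asOf)
      AND DateOrdered <  DATEADD(day, 1, @asOf)
)
SELECT ItemNumber, DateOrdered
FROM lastPerDay
WHERE rn = 1
ORDER BY DateOrdered;
```

With the sample rows this returns the five rows listed as the desired result in the question. The half-open range on DateOrdered keeps the filter usable by an index on that column.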

Applying LAG() to multiple rows with a null value

Given:
with
m as (
    select 1 ID, cast('03/01/2015' as datetime) PERIOD_START, cast('3/31/2015' as datetime) PERIOD_END
    union all
    select 1 ID, '04/01/2015', '4/28/2015'
    union all
    select 1 ID, '05/01/2015', '5/31/2015'
    union all
    select 1 ID, '06/01/2015', '06/30/2015'
    union all
    select 1 ID, '07/01/2015', '07/31/2015'
)
,
a as (
    SELECT 1 ID, cast('2015-03-13 14:17:00.000' as datetime) AUDIT_TIME, 'READ [2]' STATUS
    UNION ALL
    SELECT 1 ID, '2015-04-27 15:51:00.000' AUDIT_TIME, 'HELD [2]' STATUS
    UNION ALL
    SELECT 1 ID, '2015-07-08 17:54:00.000' AUDIT_TIME, 'COMPLETED [5]' STATUS
)
This query:
select m.ID,PERIOD_START,PERIOD_END
,a.AUDIT_TIME,STATUS
from m
LEFT OUTER JOIN a on m.id=a.id
and a.audit_time between m.period_start and m.period_end
generates this record set:
ID PERIOD_START PERIOD_END AUDIT_TIME STATUS
1 2015-03-01 00:00:00.000 2015-03-31 00:00:00.000 2015-03-13 14:17:00.000 READ [2]
1 2015-04-01 00:00:00.000 2015-04-28 00:00:00.000 2015-04-27 15:51:00.000 HELD [2]
1 2015-05-01 00:00:00.000 2015-05-31 00:00:00.000 NULL NULL
1 2015-06-01 00:00:00.000 2015-06-30 00:00:00.000 NULL NULL
1 2015-07-01 00:00:00.000 2015-07-31 00:00:00.000 2015-07-08 17:54:00.000 COMPLETED [5]
I need the 4/27/15 entry repeated for May and June:
ID PERIOD_START PERIOD_END AUDIT_TIME STATUS
1 2015-03-01 00:00:00.000 2015-03-31 00:00:00.000 2015-03-13 14:17:00.000 READ [2]
1 2015-04-01 00:00:00.000 2015-04-28 00:00:00.000 2015-04-27 15:51:00.000 HELD [2]
1 2015-05-01 00:00:00.000 2015-05-31 00:00:00.000 2015-04-27 15:51:00.000 HELD [2]
1 2015-06-01 00:00:00.000 2015-06-30 00:00:00.000 2015-04-27 15:51:00.000 HELD [2]
1 2015-07-01 00:00:00.000 2015-07-31 00:00:00.000 2015-07-08 17:54:00.000 COMPLETED [5]
Using the LAG() function:
select m.ID,PERIOD_START,PERIOD_END
,a.AUDIT_TIME
,LAG(audit_time) OVER (partition by m.ID order by period_start) PRIOR_AUDIT_TIME
,STATUS
,LAG(STATUS) OVER (partition by m.ID order by period_start) PRIOR_STATUS
from m
LEFT OUTER JOIN a on m.id=a.id
and a.audit_time between m.period_start and m.period_end
only works for a single row:
ID PERIOD_START PERIOD_END AUDIT_TIME PRIOR_AUDIT_TIME STATUS PRIOR_STATUS
1 2015-03-01 00:00:00.000 2015-03-31 00:00:00.000 2015-03-13 14:17:00.000 NULL READ [2] NULL
1 2015-04-01 00:00:00.000 2015-04-28 00:00:00.000 2015-04-27 15:51:00.000 2015-03-13 14:17:00.000 HELD [2] READ [2]
1 2015-05-01 00:00:00.000 2015-05-31 00:00:00.000 NULL 2015-04-27 15:51:00.000 NULL HELD [2]
1 2015-06-01 00:00:00.000 2015-06-30 00:00:00.000 NULL NULL NULL NULL
1 2015-07-01 00:00:00.000 2015-07-31 00:00:00.000 2015-07-08 17:54:00.000 NULL COMPLETED [5] NULL
Is there a way to do this without having to resort to a cursor?
You can do this with window functions:
with q as (
      select m.ID, PERIOD_START, PERIOD_END, a.AUDIT_TIME, STATUS
      from m LEFT OUTER JOIN
           a
           on m.id = a.id and
              a.audit_time between m.period_start and m.period_end
     )
select q.*,
       max(status) over (partition by id, audit_grp) as imputed_status
from (select q.*,
             max(audit_time) over (partition by id order by period_start) as audit_grp
      from q
     ) q
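The idea is to copy the audit_time value down, using max() as a cumulative window function: because audit times increase from one period to the next, the running maximum is always the most recent non-NULL audit_time. That value then defines groups, so you can get the status as well.
ANSI SQL defines the IGNORE NULLS directive for LAG(), but SQL Server did not support it at the time of writing (support was added in SQL Server 2022).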
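On SQL Server 2022 or later, which added IGNORE NULLS to LAST_VALUE() and LAG(), the fill-down can be sketched more directly. This is a minimal example under that version assumption, not a replacement for the query above on older servers. It reuses the m and a sample data from the question, with the date literals rewritten in unambiguous formats.

```sql
-- Requires SQL Server 2022 or later (IGNORE NULLS).
WITH m AS (
    SELECT 1 AS ID,
           CAST('20150301' AS datetime) AS PERIOD_START,
           CAST('20150331' AS datetime) AS PERIOD_END
    UNION ALL SELECT 1, '20150401', '20150428'
    UNION ALL SELECT 1, '20150501', '20150531'
    UNION ALL SELECT 1, '20150601', '20150630'
    UNION ALL SELECT 1, '20150701', '20150731'
),
a AS (
    SELECT 1 AS ID,
           CAST('2015-03-13T14:17:00' AS datetime) AS AUDIT_TIME,
           'READ [2]' AS STATUS
    UNION ALL SELECT 1, '2015-04-27T15:51:00', 'HELD [2]'
    UNION ALL SELECT 1, '2015-07-08T17:54:00', 'COMPLETED [5]'
)
SELECT m.ID,
       m.PERIOD_START,
       m.PERIOD_END,
       -- Most recent non-NULL value up to and including the current period.
       LAST_VALUE(a.AUDIT_TIME) IGNORE NULLS
           OVER (PARTITION BY m.ID ORDER BY m.PERIOD_START
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS AUDIT_TIME,
       LAST_VALUE(a.STATUS) IGNORE NULLS
           OVER (PARTITION BY m.ID ORDER BY m.PERIOD_START
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS STATUS
FROM m
LEFT OUTER JOIN a
    ON m.ID = a.ID
   AND a.AUDIT_TIME BETWEEN m.PERIOD_START AND m.PERIOD_END
ORDER BY m.PERIOD_START;
```

Unlike the running max() approach, this version does not depend on audit times increasing over the periods, since it picks the latest non-NULL value by period order rather than by magnitude.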

Display lines with 0 for measures if date not in date dimension table

I am having difficulty joining a fact table to a date dimension table, because I would like to display records with a date not in the dimension table.
Example: this query returns no rows for 2014-09-08 at 01:00, because there are no records in the fact table matching these filters.
select *
from FCT_SCAN scan
left join dim_date dt
    on cast ( scan.DATE_HEURE as date ) = dt.DATE
    and cast(cast ( scan.DATE_HEURE as time(0)) as varchar(5)) = CAST(dt.heure as varchar(5))
where CAST(scan.DATE_HEURE as DATE) = '2014-08-09'
    and tranche_1h = '01:00:00'
order by heure
And I would like to display records with NULL or 0 values for the fields if the DATE_HEURE field is not in the dimension table.
Edit 1:
First, I have rewritten my initial query with the correct table prefixes, to make it easier to understand.
select *
from FCT_SCAN scan
left join dim_date dt
    on cast ( scan.DATE_HEURE as date ) = dt.DATE
    and cast(cast ( scan.DATE_HEURE as time(0)) as varchar(5)) = CAST(dt.heure as varchar(5))
where CAST(scan.DATE_HEURE as date) = '2014-08-09'
    and dt.tranche_1h = '01:00:00'
order by dt.heure
My problem is the following: I am looking for a conditional join that lets me link my fact table with my date dimension table in Cognos. This join must display "empty" records when a datetime in the dimension table has no match in the fact table, and the fact table's records when the datetime is present.
Update : Here are CREATE TABLE and SELECT scripts of DIM_DATE.
CREATE TABLE [dbo].[DIM_DATE](
[DATE_HEURE] [datetime] NOT NULL,
[ANNEE] [int] NULL,
[MOIS] [int] NULL,
[JOUR] [int] NULL,
[DATE] [date] NULL,
[JOUR_SEM_DATE] [varchar](10) NULL,
[NUM_JOUR_SEM_DATE] [int] NULL,
[HEURE] [time](0) NULL,
[TRANCHE_1H] [time](0) NULL,
[TRANCHE_DEMIH] [time](0) NULL,
[TRANCHE_QUARTH] [time](0) NULL,
[TRANCHE_10M] [time](0) NULL,
CONSTRAINT [PK_DIM_DATE] PRIMARY KEY CLUSTERED
(
[DATE_HEURE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
DATE_HEURE ANNEE MOIS JOUR DATE JOUR_SEM_DATE NUM_JOUR_SEM_DATE HEURE TRANCHE_1H TRANCHE_DEMIH TRANCHE_QUARTH TRANCHE_10M
2013-01-01 00:00:00.000 2013 1 1 2013-01-01 Tuesday 3 00:00:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:01:00.000 2013 1 1 2013-01-01 Tuesday 3 00:01:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:02:00.000 2013 1 1 2013-01-01 Tuesday 3 00:02:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:03:00.000 2013 1 1 2013-01-01 Tuesday 3 00:03:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:04:00.000 2013 1 1 2013-01-01 Tuesday 3 00:04:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:05:00.000 2013 1 1 2013-01-01 Tuesday 3 00:05:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:06:00.000 2013 1 1 2013-01-01 Tuesday 3 00:06:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:07:00.000 2013 1 1 2013-01-01 Tuesday 3 00:07:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:08:00.000 2013 1 1 2013-01-01 Tuesday 3 00:08:00 00:00:00 00:00:00 00:00:00 00:00:00
2013-01-01 00:09:00.000 2013 1 1 2013-01-01 Tuesday 3 00:09:00 00:00:00 00:00:00 00:00:00 00:00:00
There is one record per minute, from 2013-01-01 00:00:00 to 2017-12-31 23:59:00, stored in this table.
We join the DATE_HEURE field in FCT_SCAN to the fields of DIM_DATE.
Here is the DATE_HEURE field in FCT_SCAN:
DATE_HEURE
2014-10-17 21:39:27.000
2014-10-17 21:44:37.000
2014-10-17 23:14:05.000
2014-10-17 23:14:01.000
2014-10-17 21:40:09.000
2014-10-17 21:44:25.000
2014-10-17 21:41:41.000
2014-10-17 21:41:51.000
2014-10-17 21:48:12.000
2014-10-17 23:09:32.000
I am not showing all the fields of FCT_SCAN because there are about 180 of them.
Edit 2:
For information, my desired output looks like this if there is no data between 01:00 and 01:30:
DATE_HEURE FIELD0 FIELD1 FIELD2 MEASURE0 MEASURE1 MEASURE2
2015-02-03 00:00:00 XXX XXX XXX 5 42 23
2015-02-03 00:30:00 XXX XXX XXX 5 42 23
2015-02-03 01:00:00 NULL NULL NULL 0 0 0
2015-02-03 01:30:00 NULL NULL NULL 0 0 0
2015-02-03 02:00:00 XXX XXX XXX 5 42 23
2015-02-03 02:30:00 XXX XXX XXX 5 42 23
Try Outer Apply:
select *
from FCT_SCAN scan
OUTER APPLY (
    select *
    from dim_date dt
    where cast ( scan.DATE_HEURE as date ) = dt.DATE
      and cast(cast ( scan.DATE_HEURE as time(0)) as varchar(5)) = CAST(dt.heure as varchar(5))
) o
where CAST(scan.DATE_HEURE as DATE) = '2014-08-09'
  and tranche_1h = '01:00:00'
order by heure
If tranche_1h column is from dim_date then use:
select *
from FCT_SCAN scan
OUTER APPLY (
    select *
    from dim_date dt
    where cast ( scan.DATE_HEURE as date ) = dt.DATE
      and cast(cast ( scan.DATE_HEURE as time(0)) as varchar(5)) = CAST(dt.heure as varchar(5))
      and tranche_1h = '01:00:00'
) o
where CAST(scan.DATE_HEURE as DATE) = '2014-08-09'
order by heure
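You may just be having issues because you need the time truncated to the minute rather than to the second. Putting the dimension table on the preserved side of the join also keeps its rows when the fact table has no match: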
select *
from fct_scan scan
RIGHT JOIN dim_date dt
    on dt.Date = cast ( scan.DATE_HEURE as date )
    and cast(cast(cast(scan.DATE_HEURE as time(0)) as varchar(5)) as time) = dt.heure
order by heure
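To get rows with 0 for time slots that have no scans, the dimension usually has to drive the query, with any filter on the fact table placed in the ON clause rather than in WHERE (a WHERE condition on the fact table would discard the unmatched rows again). The following is a minimal sketch under simplified assumptions: @dim and @fct are hypothetical, cut-down stand-ins for DIM_DATE and FCT_SCAN, and MEASURE0 is a placeholder measure column.

```sql
-- Hypothetical, cut-down stand-ins for DIM_DATE and FCT_SCAN.
DECLARE @dim TABLE (DATE_HEURE datetime PRIMARY KEY);
DECLARE @fct TABLE (DATE_HEURE datetime, MEASURE0 int);

-- One dimension row per minute; no scans at 21:40 or 21:42.
INSERT INTO @dim (DATE_HEURE) VALUES
    ('2014-10-17T21:39:00'), ('2014-10-17T21:40:00'),
    ('2014-10-17T21:41:00'), ('2014-10-17T21:42:00');

INSERT INTO @fct (DATE_HEURE, MEASURE0) VALUES
    ('2014-10-17T21:39:27', 5),
    ('2014-10-17T21:41:41', 2),
    ('2014-10-17T21:41:51', 3);

SELECT d.DATE_HEURE,
       COUNT(f.DATE_HEURE)          AS NB_SCANS,  -- counts only matched fact rows
       COALESCE(SUM(f.MEASURE0), 0) AS MEASURE0   -- 0 when the minute has no scans
FROM @dim AS d
LEFT JOIN @fct AS f
    -- Each fact row belongs to the minute that contains its timestamp.
    ON  f.DATE_HEURE >= d.DATE_HEURE
    AND f.DATE_HEURE <  DATEADD(minute, 1, d.DATE_HEURE)
-- Filters on the dimension are safe in WHERE; they do not remove empty slots.
WHERE d.DATE_HEURE >= '2014-10-17T21:39:00'
  AND d.DATE_HEURE <  '2014-10-17T21:43:00'
GROUP BY d.DATE_HEURE
ORDER BY d.DATE_HEURE;
```

In this sample the minutes 21:40 and 21:42 come back with NB_SCANS = 0 and MEASURE0 = 0, while 21:39 and 21:41 carry the aggregated values. The range comparison replaces the varchar(5) casts from the question and avoids wrapping the fact column in functions.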