Nested CASE WHEN in Teradata SQL - sql

Please, I have the data below. I need to calculate the date difference between CURRENT_DATE and the max date for each ID, for Active status only. The result of the date diff should appear once, beside the max date; the other records should return NULL.
|ID |Date |Status |
|----+------|---------|
|A |1-Apr |Active |
|A |15-Apr|Active |
|B |1-Mar |Suspended|
|B |15-Mar|Deactive |
|C |1-Jan |Active |
|C |15-Jan|Active |
I tried to use the query below, but it repeats the result on every date:
SELECT ID,
       Date,
       CASE WHEN STATUS = 'Active'
            THEN CASE WHEN Date = MAX(Date) OVER (PARTITION BY ID)
                      THEN CURRENT_DATE - MAX(Date) OVER (PARTITION BY ID)
                      ELSE NULL
                 END
            ELSE NULL
       END AS Duration
FROM cte
ORDER BY ID, Date;
But I need the result to be like the below
|ID |Date |Status |Duration|
|----+------|---------|--------|
|A |1-Apr |Active |NULL |
|A |15-Apr|Active |19 |
|B |1-Mar |Suspended|NULL |
|B |15-Mar|Deactive |NULL |
|C |1-Jan |Active |NULL |
|C |15-Jan|Active |109 |

If you only want this on the most recent row, and only when that row is "Active", then add in ROW_NUMBER():
SELECT ID, Date,
       (CASE WHEN STATUS = 'Active' AND
                  ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date DESC) = 1
             THEN CURRENT_DATE - Date
        END) AS Duration
FROM cte
ORDER BY ID, Date;
Note that your code looks like it should work, unless there are duplicate most recent dates for an id. However, this version is a bit simpler, eschewing the nested CASE expression and the unnecessary second call to MAX().

SELECT ID,
       DATE,
       Status,
       CASE
            WHEN STATUS = 'Active' AND RNUM = 1 THEN CURRENT_DATE - DATE
            ELSE NULL
       END AS Duration
FROM (SELECT ID,
             DATE,
             Status,
             ROW_NUMBER() OVER (PARTITION BY ID, Status ORDER BY DATE DESC) AS RNUM
      FROM CTE) CTE2
ORDER BY ID,
         DATE;
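As a sanity check, the ROW_NUMBER() approach can be exercised on an SQLite stand-in (Teradata isn't needed for the window logic; SQLite 3.25+ is required). Two assumptions are made here: the dates are rewritten as ISO strings, and the "current date" is pinned to 2021-05-04, the date implied by the expected durations of 19 and 109:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cte (id TEXT, dt TEXT, status TEXT);
INSERT INTO cte VALUES
  ('A', '2021-04-01', 'Active'),     -- 1-Apr
  ('A', '2021-04-15', 'Active'),     -- 15-Apr
  ('B', '2021-03-01', 'Suspended'),
  ('B', '2021-03-15', 'Deactive'),
  ('C', '2021-01-01', 'Active'),
  ('C', '2021-01-15', 'Active');
""")

# Duration only on the most recent row per id, and only when that row is Active.
# julianday() stands in for Teradata's native date subtraction.
rows = con.execute("""
SELECT id, dt, status,
       CASE WHEN status = 'Active'
             AND ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt DESC) = 1
            THEN CAST(julianday('2021-05-04') - julianday(dt) AS INTEGER)
       END AS duration
FROM cte
ORDER BY id, dt
""").fetchall()
```

Only A's 15-Apr row and C's 15-Jan row get a duration; B's most recent row is not Active, so both of its rows stay NULL.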

Related

How to use DENSE_RANK() ordered by time but ignore duplicates from another column?

I have a dataframe like this:
network|Value1|Value2|datetime
---------------------------------------
1 |A |null |2021-07-16 15:59:56.133
1 |B |null |2021-07-15 11:00:05.633
1 |B |null |2021-07-15 10:59:59.100
1 |C |null |2021-07-15 06:03:49.000
1 |null |A |2021-07-16 15:59:56.133
1 |null |B |2021-07-16 14:45:00.309
1 |null |C |2021-07-16 09:19:26.580
I want to create two ranks:
for each network, I want to rank [Value1] by datetime desc
for each network, I want to rank [Value2] by datetime desc
But for each rank, I don't want to count duplicate [Value1] or [Value2] values.
The expected outcome should be:
network|Value1|Value2|datetime |rank_Value1 |rank_Value2
-------------------------------------------------------------
1 |A |null |2021-07-16 15:59:56.133 |1 |null
1 |B |null |2021-07-15 11:00:05.633 |2 |null
1 |B |null |2021-07-15 10:59:59.100 |2 |null
1 |C |null |2021-07-15 06:03:49.000 |3 |null
1 |null |A |2021-07-16 15:59:56.133 |null |1
1 |null |B |2021-07-16 14:45:00.309 |null |2
1 |null |C |2021-07-16 09:19:26.580 |null |3
Since I want the rank to stay the same when [Value] is duplicated, and I want the rank to be incremented one by one, I used DENSE_RANK() and tried this:
SELECT *,
CASE WHEN Value1 is null THEN NULL ELSE DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value1 is not null THEN 1 ELSE 0 END order by datetime desc) END as rank_Value1,
CASE WHEN Value2 is null THEN NULL ELSE DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value2 is not null THEN 1 ELSE 0 END order by datetime desc) END as rank_Value2
FROM df
But the outcome is as followed :
network|Value1|Value2|datetime |rank_Value1 |rank_Value2
-------------------------------------------------------------
1 |A |null |2021-07-16 15:59:56.133 |1 |null
1 |B |null |2021-07-15 11:00:05.633 |2 |null
1 |B |null |2021-07-15 10:59:59.100 |3 |null
1 |C |null |2021-07-15 06:03:49.000 |4 |null
1 |null |A |2021-07-16 15:59:56.133 |null |1
1 |null |B |2021-07-16 14:45:00.309 |null |2
1 |null |C |2021-07-16 09:19:26.580 |null |3
I feel like I am almost there, but I don't know how to do this...
I am not comfortable with T-SQL, so if someone can help me, I would really appreciate it!
Reading between the lines here, I suspect this is a gaps-and-islands problem. If so, then I think this might be what you are after:
WITH YourTable AS(
SELECT *
FROM (VALUES(1,'A ',null,CONVERT(datetime,'2021-07-16T15:59:56.133')),
(1,'B ',null,CONVERT(datetime,'2021-07-15T11:00:05.633')),
(1,'B ',null,CONVERT(datetime,'2021-07-15T10:59:59.100')),
(1,'C ',null,CONVERT(datetime,'2021-07-15T06:03:49.000')),
(1,null,'A ',CONVERT(datetime,'2021-07-16T15:59:56.133')),
(1,null,'B ',CONVERT(datetime,'2021-07-16T14:45:00.309')),
(1,null,'C ',CONVERT(datetime,'2021-07-16T09:19:26.580')))V(network,Value1,Value2,datetime)),
Grps AS(
SELECT network,
Value1,
Value2,
datetime,
ROW_NUMBER() OVER (PARTITION BY network ORDER BY datetime) -
ROW_NUMBER() OVER (PARTITION BY network, Value1 ORDER BY datetime) AS Group1,
ROW_NUMBER() OVER (PARTITION BY network ORDER BY datetime) -
ROW_NUMBER() OVER (PARTITION BY network, Value2 ORDER BY datetime) AS Group2
FROM YourTable)
SELECT network,
Value1,
Value2,
datetime,
CASE WHEN Value1 IS NOT NULL THEN DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value1 IS NOT NULL THEN 1 ELSE 0 END ORDER BY group1 DESC) END AS rank_Value1,
CASE WHEN Value2 IS NOT NULL THEN DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value2 IS NOT NULL THEN 1 ELSE 0 END ORDER BY group2 DESC) END AS rank_Value2
FROM Grps
ORDER BY CASE WHEN Value1 IS NULL THEN 1 ELSE 0 END,
datetime DESC;
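The gaps-and-islands query above can be checked end to end on an SQLite stand-in (the datetime column is renamed dt here, and the CASE-based partition key is shortened to the equivalent boolean value1 IS NULL; SQLite 3.25+ for window functions). The difference of the two ROW_NUMBER()s gives consecutive rows with the same value one shared group number, and DENSE_RANK() over that group number yields the expected 1, 2, 2, 3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE df (network INTEGER, value1 TEXT, value2 TEXT, dt TEXT);
INSERT INTO df VALUES
  (1, 'A', NULL, '2021-07-16 15:59:56.133'),
  (1, 'B', NULL, '2021-07-15 11:00:05.633'),
  (1, 'B', NULL, '2021-07-15 10:59:59.100'),
  (1, 'C', NULL, '2021-07-15 06:03:49.000'),
  (1, NULL, 'A', '2021-07-16 15:59:56.133'),
  (1, NULL, 'B', '2021-07-16 14:45:00.309'),
  (1, NULL, 'C', '2021-07-16 09:19:26.580');
""")

rows = con.execute("""
WITH grps AS (
    SELECT network, value1, value2, dt,
           -- consecutive rows carrying the same value get equal group numbers
           ROW_NUMBER() OVER (PARTITION BY network ORDER BY dt) -
           ROW_NUMBER() OVER (PARTITION BY network, value1 ORDER BY dt) AS grp1,
           ROW_NUMBER() OVER (PARTITION BY network ORDER BY dt) -
           ROW_NUMBER() OVER (PARTITION BY network, value2 ORDER BY dt) AS grp2
    FROM df)
SELECT network, value1, value2, dt,
       CASE WHEN value1 IS NOT NULL THEN
            DENSE_RANK() OVER (PARTITION BY network, value1 IS NULL
                               ORDER BY grp1 DESC) END AS rank_value1,
       CASE WHEN value2 IS NOT NULL THEN
            DENSE_RANK() OVER (PARTITION BY network, value2 IS NULL
                               ORDER BY grp2 DESC) END AS rank_value2
FROM grps
ORDER BY value1 IS NULL, dt DESC
""").fetchall()
```

The two B rows land in the same island, so they share one rank and the next value still gets rank 3, exactly as in the expected outcome.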
How about measuring the change in values?
SELECT df.*,
       (CASE WHEN value1 IS NOT NULL
             THEN SUM(CASE WHEN next_value1 = value1 THEN 0 ELSE 1 END) OVER (ORDER BY datetime)
        END),
       (CASE WHEN value2 IS NOT NULL
             THEN SUM(CASE WHEN next_value2 = value2 THEN 0 ELSE 1 END) OVER (ORDER BY datetime)
        END)
FROM (SELECT df.*,
             LEAD(value1) OVER (ORDER BY datetime) AS next_value1,
             LEAD(value2) OVER (ORDER BY datetime) AS next_value2
      FROM df
     ) df;
Note that in your sample data, the value1 and value2 rows are not interleaved. The above assumes that is the case. Otherwise, you need a CASE expression in the partitioning clauses to separate out the rows for each value.

How many rows have different values

Given a table events
sensor_id | event_type | value | time
----------+------------+--------+------------
2 |2 | 3.45 | 2014-02 (...)
2 |4 | (...) | (...)
2 |2 | (...) | (...)
3 |2 | (...) | (...)
2 |3 | (...) | (...)
Write an SQL query that returns the set of all sensor_id values with the number of different event_types registered by each of them, ordered by sensor_id ASC.
The query should return the following rowset:
sensor_id | type
----------+------------
2 |3
3 |1
The names of the columns in the rowset don't matter, but their order does.
My query:
SELECT
sensor_id, COUNT(*) AS `types`
FROM
`events`
GROUP BY
sensor_id
ORDER BY
sensor_id ASC
And result:
sensor_id | types
----------+------------
2 |4 <= error
3 |1
Use DISTINCT event_type inside COUNT:
SELECT
sensor_id, COUNT(distinct event_type) AS `types`
FROM
`events`
GROUP BY
sensor_id
ORDER BY
sensor_id ASC
You can use window function:
select distinct sensor_id, types from (
SELECT
sensor_id, COUNT(distinct event_type) over(partition by sensor_id) AS `types`
FROM
`events` ) X
ORDER BY
sensor_id ASC;
Try this:
SELECT sensor_id, COUNT(DISTINCT event_type) as type
FROM #tbltemp
GROUP BY sensor_id
ORDER BY sensor_id
If you do not use COUNT(DISTINCT ...), the value 2 for sensor 2 is counted twice: (2, 2, 3, 4) gives 4.
With DISTINCT, it is counted as (2, 3, 4) only, giving 3.
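A minimal runnable check of the COUNT(DISTINCT ...) fix, using SQLite through Python (the value and time columns are omitted since they don't affect the count):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE events (sensor_id INTEGER, event_type INTEGER);
INSERT INTO events VALUES (2, 2), (2, 4), (2, 2), (3, 2), (2, 3);
""")

# COUNT(DISTINCT event_type) collapses sensor 2's repeated type 2 into one.
rows = con.execute("""
SELECT sensor_id, COUNT(DISTINCT event_type) AS types
FROM events
GROUP BY sensor_id
ORDER BY sensor_id ASC
""").fetchall()
```

Plain COUNT(*) would return 4 for sensor 2; the DISTINCT version returns 3 and 1, matching the expected rowset.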

Select two entries from a joined table (1:M)

I have a device table (dev) and a device_data table (dev_data), with a 1:M relationship.
dev table:
| id  | name | status |
|-----|------|--------|
| 1   | a    | 111    |
| 2   | b    | 123    |
| ... | .... | ....   |
dev_data table:
| id  | dev_id | status | date                     |
|-----|--------|--------|--------------------------|
| 1   | 1      | 123    | 2019-04-16T18:53:07.908Z |
| 2   | 1      | 120    | 2019-04-16T18:54:07.908Z |
| 3   | 1      | 1207   | 2019-04-16T18:55:07.908Z |
| 4   | 2      | 123    | 2019-04-16T18:53:08.908Z |
| 5   | 2      | 121    | 2019-04-16T18:54:08.908Z |
| 6   | 2      | 127    | 2019-04-16T18:55:08.908Z |
| ... | ....   | ....   | ....                     |
I need to select all dev rows and join dev_data, but include only the last 2 records (by date) per device.
The final response should look like the one below. status_calc_1 and status_calc_2 are the differences between the status in dev_data and dev:
status_calc_1 => status difference between the last dev_data row and dev
status_calc_2 => status difference between the second-to-last dev_data row and dev
| id | name | status_calc_1 | status_calc_2 |
|----|------|---------------|---------------|
| 1  | a    | 1207-111      | 120-111       |
| 2  | b    | 127-123       | 121-123       |
I tried this one:
select id, "name", status, max(dd.date) as last,
(select date from device_data p where p.dev_id = device.id and date < dd.date limit 1) as prelast
from device
inner join device_data dd on device.id = dd.dev_id
group by id, "name", status;
but get an error:
ERROR: subquery uses ungrouped column "device.id" from outer query
and this one:
select id, "name", status, max(dd.date) as last, max(dd2.date) as prelast,
from device
inner join device_data dd on device.id = dd.dev_id
inner join device_data dd2 on device.id = dd2.dev_id and dd2.date < dd.date
group by id, "name", status;
I get the correct last 2 dev_data rows, but I still have no idea how to make the 2 columns status_calc_1 and status_calc_2:
status_calc_1 = last row dev_data.status - dev.status
status_calc_2 = prelast row dev_data.status - dev.status
You can use conditional aggregation:
select d.id, d.name, d.status,
max(dd.date) as last,
max(case when dd.seqnum = 2 then dd.date end) as prelast,
(max(case when dd.seqnum = 1 then dd.status end) - d.status) as status_calc_1,
(max(case when dd.seqnum = 2 then dd.status end) - d.status) as status_calc_2
from device d join
(select dd.*,
row_number() over (partition by dd.dev_id order by dd.date desc) as seqnum
from device_data dd
) dd
on d.id = dd.dev_id
where seqnum <= 2
group by d.id, d.name, d.status;
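Here is the conditional-aggregation answer run end to end on an SQLite stand-in (Postgres quoting dropped; SQLite 3.25+ for ROW_NUMBER()). seqnum 1 is the last row and seqnum 2 the second-to-last, and the MAX(CASE ...) trick pivots them onto one row per device:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE device (id INTEGER, name TEXT, status INTEGER);
INSERT INTO device VALUES (1, 'a', 111), (2, 'b', 123);
CREATE TABLE device_data (id INTEGER, dev_id INTEGER, status INTEGER, date TEXT);
INSERT INTO device_data VALUES
  (1, 1, 123,  '2019-04-16T18:53:07.908Z'),
  (2, 1, 120,  '2019-04-16T18:54:07.908Z'),
  (3, 1, 1207, '2019-04-16T18:55:07.908Z'),
  (4, 2, 123,  '2019-04-16T18:53:08.908Z'),
  (5, 2, 121,  '2019-04-16T18:54:08.908Z'),
  (6, 2, 127,  '2019-04-16T18:55:08.908Z');
""")

# seqnum numbers each device's rows from newest (1) to oldest.
rows = con.execute("""
SELECT d.id, d.name,
       MAX(CASE WHEN dd.seqnum = 1 THEN dd.status END) - d.status AS status_calc_1,
       MAX(CASE WHEN dd.seqnum = 2 THEN dd.status END) - d.status AS status_calc_2
FROM device d
JOIN (SELECT dd.*,
             ROW_NUMBER() OVER (PARTITION BY dd.dev_id ORDER BY dd.date DESC) AS seqnum
      FROM device_data dd) dd
  ON d.id = dd.dev_id
WHERE dd.seqnum <= 2
GROUP BY d.id, d.name, d.status
ORDER BY d.id
""").fetchall()
```

Device 1 yields 1207-111 = 1096 and 120-111 = 9; device 2 yields 127-123 = 4 and 121-123 = -2, matching the expected layout.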

T-SQL get the last date time record

My table looks like this:
+---------+------------------------+-------+---------+---------+
|channel |date |code |comment |order_id |
+---------+------------------------+-------+---------+---------+
|1 |2017-10-27 12:04:45.397 |2 |comm1 |1 |
|1 |2017-10-27 12:14:20.997 |1 |comm2 |1 |
|2 |2017-10-27 12:20:59.407 |3 |comm3 |1 |
|2 |2017-10-27 13:14:20.997 |1 |comm4 |1 |
|3 |2017-10-27 12:20:59.407 |2 |comm5 |1 |
|3 |2017-10-27 14:20:59.407 |1 |comm6 |1 |
+---------+------------------------+-------+---------+---------+
And I expect result like this:
+---------+------------------------+-------+---------+
|channel |date |code |comment |
+---------+------------------------+-------+---------+
|1 |2017-10-27 12:14:20.997 |1 |comm2 |
|2 |2017-10-27 13:14:20.997 |1 |comm4 |
|3 |2017-10-27 14:20:59.407 |1 |comm6 |
+---------+------------------------+-------+---------+
Always 1 record with order_id = x and the max date for each channel. The total number of channels is constant.
My query works, but I'm worried about performance as the table grows. Running three almost identical queries doesn't seem smart.
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 1 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel1
union
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 2 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel2
union
select
*
from
(select top(1)
channel,
date,
code,
comment
from
status
where
channel = 3 and
order_id = 1 and
cast(date as date) = '2017-10-27'
order by
date desc) channel3
How can I improve this?
Another option is using the WITH TIES clause. No subquery or extra field needed.
Select top 1 with ties *
From YourTable
Order By Row_Number() over (Partition By channel order by date desc)
Try using the ROW_NUMBER() function and a derived table. It will save you a lot of headaches. Try:
select channel
      ,date
      ,code
      ,comment
from
    (select *
           ,row_number() over(partition by channel order by date desc) rn --latest date gets rn = 1
     from mytable) t
where t.rn = 1
Assuming you want the latest row for each channel, this would work.
SELECT *
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY s.channel ORDER BY [date] DESC) AS rn,
*
FROM [status] AS s
) AS t
WHERE t.rn = 1
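The derived-table ROW_NUMBER() pattern can be verified against the sample rows on an SQLite stand-in (the order_id filter is kept in the inner query, mirroring the WHERE clauses of the original UNION version):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE status (channel INTEGER, date TEXT, code INTEGER,
                     comment TEXT, order_id INTEGER);
INSERT INTO status VALUES
  (1, '2017-10-27 12:04:45.397', 2, 'comm1', 1),
  (1, '2017-10-27 12:14:20.997', 1, 'comm2', 1),
  (2, '2017-10-27 12:20:59.407', 3, 'comm3', 1),
  (2, '2017-10-27 13:14:20.997', 1, 'comm4', 1),
  (3, '2017-10-27 12:20:59.407', 2, 'comm5', 1),
  (3, '2017-10-27 14:20:59.407', 1, 'comm6', 1);
""")

# One pass over the table: number each channel's rows newest-first, keep rn = 1.
rows = con.execute("""
SELECT channel, date, code, comment
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY channel ORDER BY date DESC) AS rn
      FROM status
      WHERE order_id = 1) t
WHERE t.rn = 1
ORDER BY channel
""").fetchall()
```

This returns exactly the comm2/comm4/comm6 rows from the expected result, with one scan instead of three.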

Oracle Script for target table

Hi, I have the following requirement.
In Table A
CRD | RNo | J_NAME
-------------------------
DOS1 |1 | NULL
DOS2 |2 | Name 1
DOS3 |3 | Name 2
DOS4 |4 | Name 3
DOS5 |5 | Name 1
DOS6 |6 | Name 1
DOS7 |7 | Name 4
DOS8 |8 | Name 2
Out put should be
CRD | RNo | J_NAME
-------------------------
DOS1 |1 | NULL
DOS2 |2 | A
DOS3 |3 | B
DOS4 |4 | C
DOS5 |5 | A
DOS6 |6 | A
DOS7 |7 | D
DOS8 |8 | B
NULL should always stay NULL. If the name already exists in the target table, it should map to the same letter (e.g. J_NAME = A or B); if the source value is not yet in the target table, it gets a new entry from the list.
How can I achieve this?
You can try something like this:
SELECT CRD, RNo,
       CASE WHEN J_NAME = 'Name 1' THEN 'A'
            WHEN J_NAME = 'Name 2' THEN 'B'
            WHEN J_NAME = 'Name 3' THEN 'C'
            WHEN J_NAME = 'Name 4' THEN 'D'
            ELSE NULL
       END AS J_NAME
FROM TAB;
If you're sure you won't run out of letters (you're not saying how to generate values beyond Z), you could just DENSE_RANK your names and give each one a letter corresponding to its rank:
WITH cte AS (
SELECT * FROM mytable UNION ALL SELECT '', 0, NULL FROM DUAL
)
SELECT crd, rno, CASE WHEN j_name IS NULL THEN NULL ELSE CHR(calc+63) END j_name
FROM (
SELECT crd, rno, j_name, DENSE_RANK() OVER (ORDER BY j_name NULLS FIRST) calc
FROM cte
)
WHERE rno > 0 ORDER BY rno;
It basically takes the table contents, adds a row with a null J_NAME value to make sure there's a rank for NULL, and uses DENSE_RANK() on the resulting j_names to get a value to generate the letter from.
An SQLfiddle to test with.
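A runnable sketch of the same idea, using SQLite through Python instead of Oracle: rather than adding a dummy NULL row, the DENSE_RANK() is partitioned on j_name IS NULL so NULLs never consume a letter, and char(64 + rank) plays the role of Oracle's CHR(). As with the answer above, letters follow the alphabetical order of the names, which happens to coincide with first-appearance order in this sample:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE tab_a (crd TEXT, rno INTEGER, j_name TEXT);
INSERT INTO tab_a VALUES
  ('DOS1', 1, NULL),     ('DOS2', 2, 'Name 1'),
  ('DOS3', 3, 'Name 2'), ('DOS4', 4, 'Name 3'),
  ('DOS5', 5, 'Name 1'), ('DOS6', 6, 'Name 1'),
  ('DOS7', 7, 'Name 4'), ('DOS8', 8, 'Name 2');
""")

rows = con.execute("""
SELECT crd, rno,
       CASE WHEN j_name IS NULL THEN NULL
            ELSE char(64 + DENSE_RANK() OVER (
                     PARTITION BY j_name IS NULL  -- keeps NULLs out of the lettered ranks
                     ORDER BY j_name))
       END AS j_name
FROM tab_a
ORDER BY rno
""").fetchall()
```

Each distinct name gets one rank, so repeated names ('Name 1', 'Name 2') map back to the same letter, and the NULL row stays NULL.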