How can I generate row_number by condition - sql

I am trying to add conditional numbering that matches the row_num column in the expected result.
When flg_start_trx is 1, I would like a new column that starts a fresh count, incrementing by 1, and the count should stop once flg_session_match is no longer 1.
How can I fix this query so that the result matches the row_num column?
case when (title is not null and title_program is null)
then (row_number() over (partition by flg_session_match,(case when (title is not null and title_program is null) then 1 else 0 end)
)
)
end as start_session

You may try this:
(As @Ahmed mentioned above, you need a column that defines the order of the rows, so I've added ts for that purpose.)
SELECT * EXCEPT(par0, par1),
IF (FIRST_VALUE(flg_start_trx) OVER w1 IS NOT NULL, ROW_NUMBER() OVER w1, NULL) AS row_num
FROM (
SELECT *,
COUNT(*) OVER w0 - COUNTIF(flg_start_trx IS NULL) OVER w0 AS par0,
COUNT(*) OVER w0 - COUNTIF(flg_session_match = 1) OVER w0 AS par1
FROM sample_data
WINDOW w0 AS (ORDER BY ts)
) WINDOW w1 AS (PARTITION BY par0, par1 ORDER BY ts);
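To see how the two running counters split the rows into partitions, here is a minimal sketch of the same idea. It assumes SQLite (3.25 or later, for window functions) driven from Python's sqlite3 instead of BigQuery, so COUNTIF and IF are rewritten with COUNT and CASE. The seven sample rows are hypothetical, since the original sample data is not shown.

```python
import sqlite3


def number_sessions():
    """Return (ts, row_num) pairs for a small, hypothetical sample_data table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE sample_data ("
        "ts INTEGER, flg_start_trx INTEGER, flg_session_match INTEGER)"
    )
    # Hypothetical rows: sessions start at ts=2 and ts=6.
    con.executemany(
        "INSERT INTO sample_data VALUES (?, ?, ?)",
        [
            (1, None, 0),
            (2, 1, 1),
            (3, None, 1),
            (4, None, 1),
            (5, None, 0),
            (6, 1, 1),
            (7, None, 1),
        ],
    )
    rows = con.execute(
        """
        SELECT ts,
               CASE WHEN FIRST_VALUE(flg_start_trx)
                         OVER (PARTITION BY par0, par1 ORDER BY ts) IS NOT NULL
                    THEN ROW_NUMBER() OVER (PARTITION BY par0, par1 ORDER BY ts)
               END AS row_num
        FROM (
            SELECT *,
                   -- par0: running count of rows that start a session
                   COUNT(flg_start_trx) OVER (ORDER BY ts) AS par0,
                   -- par1: running count of rows that break the session match
                   SUM(CASE WHEN flg_session_match = 1 THEN 0 ELSE 1 END)
                       OVER (ORDER BY ts) AS par1
            FROM sample_data
        )
        ORDER BY ts
        """
    ).fetchall()
    con.close()
    return rows
```

With these rows, number_sessions() numbers ts 2 to 4 as 1, 2, 3 and ts 6 to 7 as 1, 2, while the rows at ts 1 and ts 5 get NULL because their partition does not begin with a start flag.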

Related

Query doesn't work when adding another condition with PARTITION BY

Originally I had the following query:
SELECT T.* FROM
(SELECT *, row_number() OVER
(PARTITION BY manager_id ORDER BY id) AS sn FROM my_table) T
WHERE (sn = 1)
AND ((checkbox IS NULL) OR (checkbox = 0)
)
But then I added another boolean column to the table called "not_relevant", and I want to show only results where its value is not 1, so I changed the query to:
SELECT T.* FROM
(SELECT *, row_number() OVER
(PARTITION BY manager_id ORDER BY id) AS sn FROM my_table) T
WHERE (sn = 1)
AND ((checkbox IS NULL) OR (checkbox = 0)
AND (not_relevant != 1)
)
But I get the same results even if not_relevant is 1
Why?
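One likely cause is operator precedence: in SQL, AND binds more tightly than OR, so the condition reads as checkbox IS NULL OR (checkbox = 0 AND not_relevant != 1), and rows with a NULL checkbox pass regardless of not_relevant. The sketch below assumes SQLite through Python's sqlite3 and four hypothetical rows. It compares the condition as written with a regrouped version.

```python
import sqlite3

# Hypothetical rows: (id, manager_id, checkbox, not_relevant)
ROWS = [
    (1, 10, None, 1),  # checkbox NULL, not_relevant = 1
    (2, 20, 0, 1),
    (3, 30, 0, 0),
    (4, 40, 1, 0),
]

AS_WRITTEN = "(checkbox IS NULL) OR (checkbox = 0) AND (not_relevant != 1)"
REGROUPED = "((checkbox IS NULL) OR (checkbox = 0)) AND (not_relevant != 1)"


def first_rows(condition):
    """Return ids of each manager's first row that satisfy the condition."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE my_table ("
        "id INTEGER, manager_id INTEGER, checkbox INTEGER, not_relevant INTEGER)"
    )
    con.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", ROWS)
    # The condition is a trusted constant here, so string formatting is acceptable.
    query = f"""
        SELECT T.id FROM
          (SELECT *, row_number() OVER
             (PARTITION BY manager_id ORDER BY id) AS sn FROM my_table) T
        WHERE (sn = 1) AND ({condition})
        ORDER BY T.id
    """
    ids = [row[0] for row in con.execute(query)]
    con.close()
    return ids
```

Here first_rows(AS_WRITTEN) still returns id 1 even though its not_relevant is 1, while first_rows(REGROUPED) does not. Note that not_relevant != 1 also filters out rows where not_relevant is NULL, so a COALESCE may be needed if that column is nullable.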

Get Earliest Date corresponding to the latest occurrence of a recurring name

I have a table with Name and Date columns. I want to get the earliest date when the current name appeared. For example:
Name  Date
X     30-Jan-2021
X     29-Jan-2021
X     28-Jan-2021
Y     27-Jan-2021
Y     26-Jan-2021
Y     25-Jan-2021
Y     24-Jan-2021
X     23-Jan-2021
X     22-Jan-2021
Now when I try to get the earliest date on which the current name (X) started to appear, I want 28-Jan-2021, but my SQL query returns 22-Jan-2021, because that is when X first appeared.
Update: This was the query I was using:
Select min(Date) from myTable where Name='X'
I am using older SQL Server 2008 (in the process of upgrading), so do not have access to LEAD/LAG functions.
The solutions suggested below do work as intended. Thanks.
This is a type of gaps-and-islands problem.
There are many solutions. Here is one that is optimized for your case:
Use LEAD/LAG to identify the first row in each grouping
Filter to only those rows
Number those rows and take the first one
WITH StartPoints AS (
SELECT *,
IsStart = CASE WHEN Name <> LEAD(Name, 1, '') OVER (ORDER BY Date DESC) THEN 1 END
FROM YourTable
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1 AND Name = 'X'
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
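The LEAD-based steps above can be checked against the question's sample data. This is a minimal sketch, assuming SQLite (3.25 or later) through Python's sqlite3 instead of SQL Server, with dates stored as ISO strings so that they sort correctly. The function name latest_run_start is made up for this example, and ORDER BY with LIMIT 1 stands in for the ROW_NUMBER step.

```python
import sqlite3

# Sample data from the question, with dates rewritten in ISO format.
SAMPLE = [
    ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
    ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
    ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
]


def latest_run_start(name):
    """Return the first date of the most recent consecutive run of `name`."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
    con.executemany("INSERT INTO YourTable VALUES (?, ?)", SAMPLE)
    row = con.execute(
        """
        WITH StartPoints AS (
            SELECT Name, Date,
                   -- In descending order, LEAD looks at the next-older row;
                   -- a different name there means this row starts a run.
                   CASE WHEN Name <> LEAD(Name, 1, '') OVER (ORDER BY Date DESC)
                        THEN 1 END AS IsStart
            FROM YourTable
        )
        SELECT Date
        FROM StartPoints
        WHERE IsStart = 1 AND Name = ?
        ORDER BY Date DESC
        LIMIT 1
        """,
        (name,),
    ).fetchone()
    con.close()
    return row[0] if row else None
```

For the sample rows, latest_run_start("X") gives 2021-01-28 and not 2021-01-22, which is the result the question asks for.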
For SQL Server 2008 or earlier (which I strongly suggest you upgrade from), you can use a self-join with row-numbering to simulate LEAD/LAG
WITH RowNumbered AS (
SELECT *,
AllRn = ROW_NUMBER() OVER (ORDER BY Date ASC)
FROM YourTable
),
StartPoints AS (
SELECT r1.*,
IsStart = CASE WHEN r1.Name <> ISNULL(r2.Name, '') THEN 1 END
FROM RowNumbered r1
LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
This is a gaps-and-islands problem. Based on the sample data, this will work:
WITH Groups AS(
SELECT YT.[Name],
YT.[Date],
ROW_NUMBER() OVER (ORDER BY YT.Date DESC) -
ROW_NUMBER() OVER (PARTITION BY YT.[Name] ORDER BY Date DESC) AS Grp
FROM dbo.YourTable YT),
FirstGroup AS(
SELECT TOP (1) WITH TIES
G.[Name],
G.[Date]
FROM Groups G
WHERE [Name] = 'X'
ORDER BY Grp ASC)
SELECT MIN(FG.[Date]) AS Mi
FROM FirstGroup FG;
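The grouping trick in this answer is the difference of two row numbers: it stays constant within each consecutive run of a name. A minimal sketch of just that step, assuming SQLite through Python's sqlite3 and the question's sample data with ISO dates (the helper names are made up for this example):

```python
import sqlite3

# Sample data from the question, with dates rewritten in ISO format.
SAMPLE = [
    ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
    ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
    ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
]


def island_groups():
    """Return (Name, Date, Grp) rows, newest first."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
    con.executemany("INSERT INTO YourTable VALUES (?, ?)", SAMPLE)
    rows = con.execute(
        """
        SELECT Name, Date,
               ROW_NUMBER() OVER (ORDER BY Date DESC) -
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS Grp
        FROM YourTable
        ORDER BY Date DESC
        """
    ).fetchall()
    con.close()
    return rows


def first_island_start(name):
    """Earliest date within the newest island (smallest Grp) of `name`."""
    mine = [(date, grp) for n, date, grp in island_groups() if n == name]
    smallest = min(grp for _, grp in mine)
    return min(date for date, grp in mine if grp == smallest)
```

On the sample data the Grp values are 0 for the newest run of X, 3 for the run of Y, and 4 for the older run of X, so picking the rows with the smallest Grp for X isolates the newest run.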
If I understand correctly, you want to know when X disappeared and then reappeared. In that case, you can search for gaps in the dates within each name.
Here is an example of how to detect that:
SELECT name
,DATE
FROM (
SELECT *
,DATEDIFF(day, lead(DATE) OVER (
PARTITION BY name ORDER BY DATE DESC
), DATE) DIF
FROM YourTable
) a
WHERE DIF > 1

postgresql cumsum by condition

I have the table
I need to calculate cumsum group by id for every row with type="end".
Can anyone see the problem?
Output result
This is a little tricky. One method is to assign a grouping by reverse counting the ends. Then use dense_rank():
select t.*,
dense_rank() over (order by grp desc) as result
from (select t.*,
count(*) filter (where type = 'end') over (order by created desc) as grp
from t
) t;
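Because the question's table is not shown, here is a minimal sketch of the reverse-counting idea on five hypothetical rows. It assumes SQLite (3.25 or later, which accepts FILTER on windowed aggregates) through Python's sqlite3 instead of PostgreSQL. The column names follow the query above, and the function name is made up.

```python
import sqlite3


def cumulative_end_groups():
    """Return (id, type, result) rows in creation order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, created INTEGER, type TEXT)")
    # Hypothetical rows: each 'end' closes the current group.
    con.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            (1, 1, "start"),
            (2, 2, "end"),
            (3, 3, "start"),
            (4, 4, "end"),
            (5, 5, "start"),
        ],
    )
    rows = con.execute(
        """
        SELECT s.id, s.type,
               DENSE_RANK() OVER (ORDER BY s.grp DESC) AS result
        FROM (SELECT t.*,
                     -- number of 'end' rows at or after this row
                     COUNT(*) FILTER (WHERE type = 'end')
                         OVER (ORDER BY created DESC) AS grp
              FROM t
             ) AS s
        ORDER BY s.created
        """
    ).fetchall()
    con.close()
    return rows
```

Each row up to and including an 'end' shares a number, and the number increases after every 'end'; the trailing row after the last 'end' gets its own group.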
You can also do this without a subquery:
select t.*,
(count(*) filter (where type = 'end') over () -
count(*) filter (where type = 'end') over (order by created desc) +
1
) as result
from t;

Vertica/SQL: Getting rows immediately proceeding events

Consider a simple query
select * from tbl where status = 'MELTDOWN'
I would like to now create a table that in addition to including these rows, also includes the previous p rows and the subsequent n rows, so that I can get a sense as to what happens in the surrounding time of these MELTDOWNs. Any hints?
You can do this with window functions by getting the seqnum of the meltdown rows. I prefer to do this with lag()/lead() ignore nulls, but Vertica doesn't support that. I think this is the equivalent with first_value()/last_value():
with t as (
select t.*, row_number() over (order by id) as seqnum
from tbl t
),
tt as (
select t.*,
last_value(case when status = 'meltdown' then seqnum end ignore nulls) over (order by seqnum rows between unbounded preceding and current row) as prev_meltdown_seqnum,
first_value(case when status = 'meltdown' then seqnum end ignore nulls) over (order by seqnum rows between current row and unbounded following) as next_meltdown_seqnum
from t
)
select tt.*
from tt
where seqnum between prev_meltdown_seqnum and prev_meltdown_seqnum + 7 or
seqnum between next_meltdown_seqnum - 5 and next_meltdown_seqnum;
WITH
grouped AS
(
SELECT
SUM(
CASE WHEN status = 'Meltdown' THEN 1 ELSE 0 END
)
OVER (
ORDER BY timeStamp
)
AS GroupID,
tbl.*
FROM
tbl
),
sorted AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY timeStamp ASC ) AS incPos,
ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY timeStamp DESC) AS decPos,
MAX(GroupID) OVER () AS LastGroup,
grouped.*
FROM
grouped
)
SELECT
sorted.*
FROM
sorted
WHERE
(incPos <= 8 AND GroupID > 0 ) -- Meltdown and the 7 events following it
OR (decPos <= 6 AND GroupID <> LastGroup) -- and the 6 events preceding a Meltdown
ORDER BY
timeStamp
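The running-sum grouping in this second query uses only standard window functions, so it can be tried outside Vertica. A minimal sketch, assuming SQLite (3.25 or later) through Python's sqlite3, twelve hypothetical rows with meltdowns at timestamps 4 and 10, and smaller windows than above (1 row before, 2 rows after) to keep the output short:

```python
import sqlite3


def rows_around_meltdowns(before=1, after=2):
    """Return timestamps of each Meltdown plus `before`/`after` neighbours."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (timeStamp INTEGER, status TEXT)")
    # Hypothetical data: 12 events, meltdowns at timestamps 4 and 10.
    con.executemany(
        "INSERT INTO tbl VALUES (?, ?)",
        [(ts, "Meltdown" if ts in (4, 10) else "OK") for ts in range(1, 13)],
    )
    rows = con.execute(
        """
        WITH grouped AS (
            SELECT SUM(CASE WHEN status = 'Meltdown' THEN 1 ELSE 0 END)
                       OVER (ORDER BY timeStamp) AS GroupID,
                   tbl.*
            FROM tbl
        ),
        sorted AS (
            SELECT ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY timeStamp ASC)  AS incPos,
                   ROW_NUMBER() OVER (PARTITION BY GroupID ORDER BY timeStamp DESC) AS decPos,
                   MAX(GroupID) OVER () AS LastGroup,
                   grouped.*
            FROM grouped
        )
        SELECT timeStamp
        FROM sorted
        WHERE (incPos <= ? AND GroupID > 0)          -- the Meltdown and the rows after it
           OR (decPos <= ? AND GroupID <> LastGroup) -- the rows before the next Meltdown
        ORDER BY timeStamp
        """,
        (after + 1, before),
    ).fetchall()
    con.close()
    return [ts for (ts,) in rows]
```

With the defaults, the result keeps timestamps 3 to 6 and 9 to 12: one row before each meltdown, the meltdown itself, and two rows after it.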

Sql Range Groups Start and End Id

I have a query that I want to break into 'chunks' of size 200 and return the start id and end id of each 'chunk'.
Example:
select t.id
from t
where t.x = y --this predicate will cause the ids to not be sequential
If the example was the query I'm trying to break into 'chunks' I'd want to return:
(1st ID, 200th ID), (201st ID, 400th ID)...(start of final range ID, end of range ID)
Edit: For the final range, if it is not a full 200 rows it should still supply the final id in the query.
Is there a way to do this with just SQL or will I have to resort to application processing and/or multiple queries similar to a pagination implementation?
If there is a way to do this in SQL please supply an example.
Hmmm, I think the easiest way is to use row_number():
select id
from (select t.*, row_number() over (order by id) as seqnum
from t
where t.x = y
) t
where (seqnum % 200) in (0, 1);
EDIT:
Based on your comments:
select min(id) as startid, max(id) as endid
from (select t.*,
floor((row_number() over (order by id) - 1) / 200) as grp
from t
where t.x = y
) t
group by grp;
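The grouping query can be tried on a small scale. This is a minimal sketch, assuming SQLite (3.25 or later) through Python's sqlite3, a chunk size of 3 in place of 200, and a hypothetical list of ids that has already been filtered by the t.x = y predicate. SQLite divides integers with truncation, so floor() is not needed here.

```python
import sqlite3


def chunk_ranges(ids, size):
    """Return (start_id, end_id) for each chunk of `size` ids, in id order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER)")
    con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in ids])
    rows = con.execute(
        """
        SELECT MIN(id) AS startid, MAX(id) AS endid
        FROM (SELECT id,
                     -- 0-based row position, integer-divided by the chunk size
                     (ROW_NUMBER() OVER (ORDER BY id) - 1) / ? AS grp
              FROM t
             )
        GROUP BY grp
        ORDER BY grp
        """,
        (size,),
    ).fetchall()
    con.close()
    return rows
```

For example, chunk_ranges([2, 3, 5, 8, 13, 21, 34], 3) yields the ranges (2, 5), (8, 21) and (34, 34), so a short final chunk still reports its last id.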
L for Left and R for Right
WITH cte AS (
SELECT
t.id,
row_number() over (order by id) as seqnum
FROM Table t
WHERE t.x = y
)
SELECT L.id as start_id, COALESCE(R.id, (SELECT MAX(ID) FROM cte) ) as end_id
FROM cte L
LEFT JOIN cte R
ON L.seqnum = R.seqnum - 199
WHERE L.seqnum % 200 = 1
The SqlFiddle demo filters to even numbers only and uses blocks of 4.
Note that the join offset is the block size minus one: R.seqnum - 199 for a block of size 200.