Using ROW_NUMBER() to compare the 1st array to the 2nd, 3rd, 4th, etc.

I'm using ROW_NUMBER() and trying to compare arr in rn 1 to arr in rn 2, 3, 4, etc.
to see if they overlap. I can do this with a subquery and a simple join. Is there a way that AVOIDS a join?
rn | id | job | arr     | desired_result
---+----+-----+---------+---------------
1  | 1  | 100 | {1,2}   | {1,2}
2  | 1  | 101 | {2,3}   | {1,2}
3  | 1  | 102 | {5,6,8} | {1,2}
4  | 1  | 103 | {2,7}   | {1,2}
I made a dbfiddle
--USING JOIN
WITH a AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY id ORDER by job) as rn
,*
FROM a_table
)
SELECT *
FROM (
SELECT id,arr
FROM a
WHERE rn = 1
) x
JOIN a
ON a.id=x.id

You can use first_value():
SELECT a.*, first_value(arr) over (partition by id order by job)
FROM a_table a;
row_number() does not seem necessary.
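To make the first_value() suggestion concrete, here is a minimal sketch that runs the same window expression against the sample rows. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), and it stores arr as plain text because SQLite has no array type; the function name first_arr_per_id is made up for this illustration.

```python
import sqlite3


def first_arr_per_id():
    """Return (job, arr, first_arr) rows for the question's sample data."""
    conn = sqlite3.connect(":memory:")
    # Assumption: arr is kept as text, since SQLite has no array type.
    conn.execute("CREATE TABLE a_table (id INTEGER, job INTEGER, arr TEXT)")
    conn.executemany(
        "INSERT INTO a_table VALUES (?, ?, ?)",
        [(1, 100, "{1,2}"), (1, 101, "{2,3}"), (1, 102, "{5,6,8}"), (1, 103, "{2,7}")],
    )
    # first_value() carries the arr of the lowest job in each id onto every row,
    # so no self-join and no row_number() is needed.
    rows = conn.execute(
        """
        SELECT job, arr,
               first_value(arr) OVER (PARTITION BY id ORDER BY job) AS first_arr
        FROM a_table
        ORDER BY id, job
        """
    ).fetchall()
    conn.close()
    return rows


for row in first_arr_per_id():
    print(row)
```

In PostgreSQL, where arr would be a real array, the overlap test itself could then presumably be written on the same row with the && operator, for example arr && first_value(arr) over (partition by id order by job).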

Related

SQL Server Add row number each group

I'm working on a query for SQL Server 2016. The rows are ordered by serial_no, and I would like to add a row number that restarts each time pay_type changes, as in the example below:
row_no | pay_type | serial_no
1 | A | 4000118445
2 | A | 4000118458
3 | A | 4000118461
4 | A | 4000118473
5 | A | 4000118486
1 | B | 4000118499
2 | B | 4000118506
3 | B | 4000118519
4 | B | 4000118521
1 | A | 4000118534
2 | A | 4000118547
3 | A | 4000118550
1 | B | 4000118562
2 | B | 4000118565
3 | B | 4000118570
4 | B | 4000118572
Help me please..
SELECT
ROW_NUMBER() OVER(PARTITION BY paytype ORDER BY serial_no) as row_no,
paytype, serial_no
FROM table
ORDER BY serial_no
You can assign groups to adjacent pay types that are the same and then use row_number(). For this purpose, the difference of row numbers is a good way to determine the groups:
select row_number() over (partition by pay_type, seqnum - seqnum_2 order by serial_no) as row_no,
t.*
from (select t.*,
row_number() over (order by serial_no) as seqnum,
row_number() over (partition by pay_type order by serial_no) as seqnum_2
from t
) t;
This type of problem is one example of a gaps-and-islands problem. Why does the difference of row numbers work? I find that the simplest way to understand is to look at the results of the subquery.
Here is a db<>fiddle.
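To see why the difference of row numbers works, it can help to run the query on a tiny data set. The following is a sketch only, assuming Python's built-in sqlite3 module with SQLite 3.25 or later; the six sample rows and the function name number_islands are invented for the illustration and are not the asker's data.

```python
import sqlite3


def number_islands():
    """Restart row_no whenever pay_type changes along serial_no order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (pay_type TEXT, serial_no INTEGER)")
    # Hypothetical sample: A, A, B, B, A, B in serial_no order.
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [("A", 10), ("A", 20), ("B", 30), ("B", 40), ("A", 50), ("B", 60)],
    )
    # seqnum counts all rows, seqnum_2 counts rows within one pay_type.
    # Their difference stays constant inside a run of adjacent equal
    # pay_types, so (pay_type, seqnum - seqnum_2) identifies each island.
    rows = conn.execute(
        """
        SELECT row_number() OVER (
                   PARTITION BY pay_type, seqnum - seqnum_2
                   ORDER BY serial_no
               ) AS row_no,
               pay_type,
               serial_no
        FROM (SELECT t.*,
                     row_number() OVER (ORDER BY serial_no) AS seqnum,
                     row_number() OVER (PARTITION BY pay_type ORDER BY serial_no) AS seqnum_2
              FROM t) AS s
        ORDER BY serial_no
        """
    ).fetchall()
    conn.close()
    return rows


for row in number_islands():
    print(row)
```

For this sample the differences are 0, 0, 2, 2, 2, 3, which together with pay_type split the rows into four islands.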
Add this to your select list:
ROW_NUMBER() OVER ( ORDER BY (SELECT 1) )
Since the outer query already sorts the rows, the window function does not need its own sort, which uses less CPU. Note that this numbers the whole result set and does not restart for each group.

How to list the latest series with no gaps of a given clause?

Given the following example table:
+----+------+
| Id | Name |
+----+------+
| 1 | A |
| 2 | B |
| 3 | B |
| 4 | C |
| 5 | A |
| 6 | B |
| 7 | B |
| 8 | B |
| 9 | B |
| 10 | X |
+----+------+
I would like a query to get the following result:
+----+------+
| 6 | B |
| 7 | B |
| 8 | B |
| 9 | B |
+----+------+
The best query I could do was:
SELECT * FROM
(SELECT id, name, LEAD(id) OVER (ORDER BY id) t
FROM test WHERE name = 'B' ORDER BY id)
WHERE ID <> t-1;
sqlfiddle here
If you want where the latest series starts and ends:
select min(id), max(id)
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by name order by id) as seqnum_1
from test t
) t
where name = 'B'
group by (seqnum - seqnum_1)
order by min(id) desc
fetch first 1 row only;
You can join back to the table to get the original rows.
Another method uses a window function to count the number of non-Bs after each row, and then chooses the B rows with the lowest count:
select t.*
from (select t.*,
dense_rank() over (order by nonbs_after asc) as grp
from (select t.*,
sum(case when name <> 'B' then 1 else 0 end) over (order by id desc) as nonbs_after
from test t
) t
where name = 'B'
) t
where grp = 1;
Here is a db<>fiddle.
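The second method can be checked against the question's table with a short script. This is a sketch under the assumption that Python's built-in sqlite3 module with SQLite 3.25 or later is available; the function name latest_b_series and the subquery aliases counted and ranked are made up here.

```python
import sqlite3


def latest_b_series():
    """Return the last unbroken run of 'B' rows from the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
    names = ["A", "B", "B", "C", "A", "B", "B", "B", "B", "X"]
    conn.executemany(
        "INSERT INTO test VALUES (?, ?)",
        [(i + 1, n) for i, n in enumerate(names)],
    )
    # nonbs_after is a running count, taken from the highest id downwards,
    # of rows that are not 'B'. All rows of the latest B run share the
    # smallest count, so dense_rank() = 1 selects exactly that run.
    rows = conn.execute(
        """
        SELECT id, name
        FROM (SELECT id, name,
                     dense_rank() OVER (ORDER BY nonbs_after) AS grp
              FROM (SELECT id, name,
                           sum(CASE WHEN name <> 'B' THEN 1 ELSE 0 END)
                               OVER (ORDER BY id DESC) AS nonbs_after
                    FROM test) AS counted
              WHERE name = 'B') AS ranked
        WHERE grp = 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return rows


print(latest_b_series())
```

With the sample data, the B rows get counts of 3 (ids 2 and 3) and 1 (ids 6 to 9), so only ids 6 to 9 survive the filter.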

SQL query for selecting multiple records for one product for a single id

My table looks like this. What I'm trying to achieve is to pull out, for each user, all the records for the product that has the earliest date:
product | type_id | user | Date | Desired ROW_NUMBER as output
--------+---------+------+------+-----------------------------
1       | 1       | A    | 0101 | 1
1       | 1       | A    | 0102 | 1
2       | 3       | A    | 0105 | 2
2       | 5       | A    | 0105 | 2
3       | 7       | B    | 0101 | 1
3       | 8       | B    | 0104 | 1
So I want to pull all the records with "1" in the desired row_num column, but I haven't figured out how to get this without doing another group by. Any help would be appreciated.
You can use window functions:
select t.*
from (select t.*,
rank() over (partition by user order by min_date) as seqnum
from (select t.*,
min(date) over (partition by user, product) as min_date
from t
) t
) t
where seqnum = 1;
Or, with only one subquery:
select t.*
from (select t.*,
min(date) over (partition by user, product) as min_date_up,
min(date) over (partition by user) as min_date_u
from t
) t
where min_date_u = min_date_up;
You can interpret this as "return all rows where the product has the minimum date for the user".
Here is a db<>fiddle.
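As a quick check of the one-subquery version, the sketch below runs it on the question's rows. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later; the columns are renamed to usr and dt to stay clear of reserved words, dates are stored as text, and the function name earliest_product_rows is invented for this example.

```python
import sqlite3


def earliest_product_rows():
    """Return all rows of the product holding each user's earliest date."""
    conn = sqlite3.connect(":memory:")
    # Assumption: user/Date renamed to usr/dt, dates stored as text.
    conn.execute("CREATE TABLE t (product INTEGER, type_id INTEGER, usr TEXT, dt TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            (1, 1, "A", "0101"),
            (1, 1, "A", "0102"),
            (2, 3, "A", "0105"),
            (2, 5, "A", "0105"),
            (3, 7, "B", "0101"),
            (3, 8, "B", "0104"),
        ],
    )
    # Keep a row when its product's earliest date equals the user's
    # earliest date overall.
    rows = conn.execute(
        """
        SELECT product, type_id, usr, dt
        FROM (SELECT t.*,
                     min(dt) OVER (PARTITION BY usr, product) AS min_date_up,
                     min(dt) OVER (PARTITION BY usr) AS min_date_u
              FROM t) AS s
        WHERE min_date_u = min_date_up
        ORDER BY usr, product, type_id, dt
        """
    ).fetchall()
    conn.close()
    return rows


for row in earliest_product_rows():
    print(row)
```

Both rows of product 1 are returned for user A, and both rows of product 3 for user B, which matches the rows marked 1 in the question.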
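SELECT * FROM [tableName] WHERE [Desired ROW_NUMBER] = 1 ORDER BY [Date] ASC
This assumes the table already stores the desired row number as a column. Use DESC instead of ASC to reverse the order, and pass the Desired ROW_NUMBER value dynamically as a parameter.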

Select entire partition where max row in partition is greater than 1

I'm partitioning by some non-unique identifier, but I'm only interested in the partitions with at least two results. What would be the way to filter out all the instances where there's exactly one row for the specified identifier?
Query I'm using:
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable
What I'm getting:
row | nonUniqueId | aTimeStamp
---------------------------------
1 | 1234 | 2014-10-08...
2 | 1234 | 2014-10-09...
1 | 1235 | 2014-10-08...
1 | 1236 | 2014-10-08...
2 | 1236 | 2014-10-09...
What I want:
row | nonUniqueId | aTimeStamp
---------------------------------
1 | 1234 | 2014-10-08...
2 | 1234 | 2014-10-09...
1 | 1236 | 2014-10-08...
2 | 1236 | 2014-10-09...
Thanks for any direction :)
Based on syntax, I'm assuming this is SQL Server 2005 or higher. My answer will be meant for that.
You have a couple options.
One, use a CTE:
;WITH CTE AS (
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable
)
SELECT *
FROM CTE t
WHERE EXISTS (SELECT 1 FROM CTE WHERE row = 2 and nonUniqueId = t.nonUniqueId);
Or, you can use subqueries:
SELECT ROW_NUMBER() OVER
(PARTITION BY nonUniqueId ORDER BY nonUniqueId, aTimeStamp) as row
,nonUniqueId
,aTimeStamp
FROM myTable t
WHERE EXISTS (SELECT 1 FROM myTable
WHERE nonUniqueId = t.nonUniqueId GROUP BY nonUniqueId HAVING COUNT(*) >= 2);
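The CTE option can be tried out on the sample rows with the sketch below. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later instead of SQL Server, shortens the timestamps to dates, and renames the row alias to rn to avoid a keyword clash; the function name partitions_with_two_or_more is made up for the illustration.

```python
import sqlite3


def partitions_with_two_or_more():
    """Return numbered rows only for identifiers that occur at least twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (nonUniqueId INTEGER, aTimeStamp TEXT)")
    conn.executemany(
        "INSERT INTO myTable VALUES (?, ?)",
        [
            (1234, "2014-10-08"),
            (1234, "2014-10-09"),
            (1235, "2014-10-08"),
            (1236, "2014-10-08"),
            (1236, "2014-10-09"),
        ],
    )
    # A partition has at least two rows exactly when it contains a row
    # numbered 2, so EXISTS keeps every row of such a partition.
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT row_number() OVER (
                       PARTITION BY nonUniqueId ORDER BY aTimeStamp
                   ) AS rn,
                   nonUniqueId,
                   aTimeStamp
            FROM myTable
        )
        SELECT rn, nonUniqueId, aTimeStamp
        FROM cte AS t
        WHERE EXISTS (SELECT 1 FROM cte AS c
                      WHERE c.rn = 2 AND c.nonUniqueId = t.nonUniqueId)
        ORDER BY nonUniqueId, rn
        """
    ).fetchall()
    conn.close()
    return rows


for row in partitions_with_two_or_more():
    print(row)
```

Identifier 1235 has only one row, so it never gets a row numbered 2 and is dropped from the result.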

SQL remove duplicates from GROUP BY results

I have a table with the following structure
sys_id(identity) | id | group_id | fld_id | val
-----------------------------------------------
I have a query
SELECT id,group_id,fld_id,val,COUNT(*)
FROM [DB_ALERT].[dbo].[DATATABLE]
GROUP BY id,group_id,fld_id,val
HAVING COUNT(*)>1
The result set is like this:
ID | group_id | fld_id | val| count(*)
__________________________________________
1000001| 1 | 1 | 23 | 2
1000003| 1 | 1 | 24 | 5
1000008| 1 | 1 | 14 | 4
Now, for each record in the result set, I want to keep only the top 1 sys_id and delete the others with the same id, group_id, fld_id and val (that is, remove its duplicates). I know how to do this with cursors, but is there any way to do such an operation in a single query?
Please try:
;with c as
(
select *, row_number() over(partition by id, group_id, fld_id, val order by sys_id) as n
from [DB_ALERT].[dbo].[DATATABLE]
)
delete from c
where n > 1
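The same idea can be tried outside SQL Server with the sketch below. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later. SQLite cannot delete through a CTE, so this swaps in a different but equivalent form: the row numbers are computed in a subquery and the delete targets the matching sys_id values. The sample rows and the function name delete_duplicates are invented for the example.

```python
import sqlite3


def delete_duplicates():
    """Keep the lowest sys_id per (id, group_id, fld_id, val), delete the rest."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE datatable ("
        "sys_id INTEGER PRIMARY KEY, id INTEGER, group_id INTEGER, "
        "fld_id INTEGER, val INTEGER)"
    )
    # Hypothetical sample: sys_id is assigned 1..6 in insertion order.
    conn.executemany(
        "INSERT INTO datatable (id, group_id, fld_id, val) VALUES (?, ?, ?, ?)",
        [
            (1000001, 1, 1, 23),
            (1000001, 1, 1, 23),
            (1000003, 1, 1, 24),
            (1000003, 1, 1, 24),
            (1000003, 1, 1, 24),
            (1000005, 1, 1, 7),
        ],
    )
    # Number the rows inside each duplicate group by sys_id, then delete
    # every row that is not the first of its group.
    conn.execute(
        """
        DELETE FROM datatable
        WHERE sys_id IN (
            SELECT sys_id
            FROM (SELECT sys_id,
                         row_number() OVER (
                             PARTITION BY id, group_id, fld_id, val
                             ORDER BY sys_id
                         ) AS n
                  FROM datatable) AS numbered
            WHERE n > 1
        )
        """
    )
    rows = conn.execute("SELECT * FROM datatable ORDER BY sys_id").fetchall()
    conn.close()
    return rows


for row in delete_duplicates():
    print(row)
```

Ordering by sys_id inside the window is what makes the surviving row predictable; ordering by the partition columns themselves would leave the choice of survivor arbitrary.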