How do I remove duplicates in paging - sql

table1 & table2:
table1 & table2 http://aftabfarda.parsfile.com/1.png
SELECT *
FROM (SELECT DISTINCT dbo.tb1.ID, dbo.tb1.name, ROW_NUMBER() OVER (ORDER BY tb1.id DESC) AS row
FROM dbo.tb1 INNER JOIN
dbo.tb2 ON dbo.tb1.ID = dbo.tb2.id_tb1) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY id DESC
Result:
Result... http://aftabfarda.parsfile.com/3.png
(id 11 Repeated 3 times)
How can I have this output:
ID name row
-- ------ ---
11 user11 1
10 user10 2
9 user9 3
8 user8 4
7 user7 5
6 user6 6
5 user5 7

You could apply distinct before row_number using a subquery:
select *
from (
    select row_number() over (order by sub_dist.ID desc) as row
    , *
    from (
        select distinct t1.ID
        , t1.name
        from dbo.tb1 as t1
        join dbo.tb2 as t2
        on t1.ID = t2.id_tb1
    ) as sub_dist
) as sub_with_rn
where row between 1 and 7
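To see why the order of DISTINCT and ROW_NUMBER matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows and the alias rn (used instead of row) are assumptions for illustration, not the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tb2 (id INTEGER PRIMARY KEY, id_tb1 INTEGER);
    INSERT INTO tb1 VALUES (9, 'user9'), (10, 'user10'), (11, 'user11');
    -- user11 has three child rows, so the join repeats it three times
    INSERT INTO tb2 (id_tb1) VALUES (9), (10), (11), (11), (11);
""")

# DISTINCT and ROW_NUMBER() in the same SELECT: every joined row gets its own
# number, so each (ID, name, rn) tuple is already unique and nothing is removed.
same_level = conn.execute("""
    SELECT DISTINCT t1.ID, t1.name,
           ROW_NUMBER() OVER (ORDER BY t1.ID DESC) AS rn
    FROM tb1 AS t1
    JOIN tb2 AS t2 ON t1.ID = t2.id_tb1
    ORDER BY rn
""").fetchall()

# DISTINCT in the inner query, numbering in the outer one.
dedup_first = conn.execute("""
    SELECT sub_dist.ID, sub_dist.name,
           ROW_NUMBER() OVER (ORDER BY sub_dist.ID DESC) AS rn
    FROM (
        SELECT DISTINCT t1.ID, t1.name
        FROM tb1 AS t1
        JOIN tb2 AS t2 ON t1.ID = t2.id_tb1
    ) AS sub_dist
    ORDER BY rn
""").fetchall()

print(len(same_level), dedup_first)
```

The first query still returns five rows because the row number makes each duplicate distinct; the second returns one numbered row per ID.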

As an alternative to @Andomar's suggestion, you could use DENSE_RANK instead of ROW_NUMBER: rank the rows first (in the subquery), then apply DISTINCT (in the outer query). Duplicates of the same ID share one rank, so DISTINCT can collapse them:
SELECT DISTINCT
ID,
name,
row
FROM (
SELECT
t1.ID,
t1.name,
DENSE_RANK() OVER (ORDER BY t1.ID DESC) AS row
FROM dbo.tb1 t1
INNER JOIN dbo.tb2 t2 ON t1.ID = t2.id_tb1
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC
Similar, but not quite the same, although both might boil down to the same query plan, I'm just not sure. Worth testing, I think.
And, of course, you could also try a semi-join instead of a proper join, in the form of either IN or EXISTS, to prevent duplicates in the first place:
SELECT
ID,
name,
row
FROM (
SELECT
ID,
name,
ROW_NUMBER() OVER (ORDER BY ID DESC) AS row
FROM dbo.tb1
WHERE ID IN (SELECT id_tb1 FROM dbo.tb2)
/* Or:
WHERE EXISTS (
SELECT *
FROM dbo.tb2
WHERE id_tb1 = dbo.tb1.ID
)
*/
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC
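As a rough check that the DENSE_RANK and semi-join variants page the same way, the sketch below runs both against a small in-memory SQLite database from Python. The sample rows (including user12, which has no row in tb2) and the alias rn are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tb2 (id INTEGER PRIMARY KEY, id_tb1 INTEGER);
    INSERT INTO tb1 VALUES (8, 'user8'), (9, 'user9'), (10, 'user10'),
                           (11, 'user11'), (12, 'user12');
    -- user11 appears three times; user12 has no matching row at all
    INSERT INTO tb2 (id_tb1) VALUES (8), (9), (10), (11), (11), (11);
""")

# Rank first, then DISTINCT: duplicates of one ID share the same dense rank.
dense = conn.execute("""
    SELECT DISTINCT ID, name, rn
    FROM (
        SELECT t1.ID, t1.name,
               DENSE_RANK() OVER (ORDER BY t1.ID DESC) AS rn
        FROM tb1 AS t1
        JOIN tb2 AS t2 ON t1.ID = t2.id_tb1
    ) AS a
    WHERE rn BETWEEN 1 AND 2
    ORDER BY ID DESC
""").fetchall()

# Semi-join: EXISTS only filters tb1, so no duplicates are produced.
semi = conn.execute("""
    SELECT ID, name, rn
    FROM (
        SELECT ID, name,
               ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
        FROM tb1
        WHERE EXISTS (SELECT 1 FROM tb2 WHERE tb2.id_tb1 = tb1.ID)
    ) AS a
    WHERE rn BETWEEN 1 AND 2
    ORDER BY ID DESC
""").fetchall()

print(dense == semi, semi)
```

Both queries return the same two-row page here; whether they share a query plan on SQL Server would still need to be checked there.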

Related

Match by id and date between 2 tables, OR last known match id

I'm trying to make the following work:
T1: Take id per dt where name = A which is most recent by load_id
Notice 2 records on 5-Jan-23, with load_id 2 and 3 => take load_id = 3
T2: And display corresponding id per dt for each param rows, with most recent load_id
Notice only load_id = 13 is kept on 05-Jan-23
T2: If a date is not available in T1, keep the T2 rows matching the last known id
Fiddle: https://dbfiddle.uk/-JO16GSj
My SQL seems a bit wild. Can it be simplified?
SELECT t2.dt, t2.param, t2.load_id, t2.id FROM
(SELECT
dt,
param,
load_id,
MAX(load_id) OVER (PARTITION BY dt, param) AS max_load_id,
id
FROM table2) t2
LEFT JOIN
(SELECT * FROM
(SELECT
dt,
id,
load_id,
MAX(load_id) OVER (PARTITION BY dt) AS max_load_id
FROM table1
WHERE name = 'A') t1_prep
WHERE t1_prep.load_id = t1_prep.max_load_id) t1
ON t1.dt = t2.dt and t1.id = t2.id
WHERE t2.load_id = t2.max_load_id
ORDER BY 1, 2
Your query can be rewritten as:
SELECT t2.*
FROM ( SELECT *
FROM table2
ORDER BY RANK() OVER (PARTITION BY dt, param ORDER BY load_id DESC)
FETCH FIRST ROW WITH TIES
) t2
LEFT OUTER JOIN
( SELECT *
FROM table1
WHERE name = 'A'
ORDER BY RANK() OVER (PARTITION BY dt ORDER BY load_id DESC)
FETCH FIRST ROW WITH TIES
) t1
ON t1.dt = t2.dt and t1.id = t2.id
ORDER BY t2.dt, t2.param
However, no columns from t1 are output, t1 is joined with a LEFT OUTER JOIN, and it yields only a single row per dt. Whether or not a match is found in t1 therefore makes no difference to the result, so that table can be eliminated from the query, simplifying it to:
SELECT *
FROM (
SELECT *
FROM table2
ORDER BY RANK() OVER (PARTITION BY dt, param ORDER BY load_id DESC)
FETCH FIRST ROW WITH TIES
)
ORDER BY dt, param;
or using your query:
SELECT dt, param, load_id, id
FROM (
SELECT dt, param, load_id, id,
MAX(load_id) OVER (PARTITION BY dt, param) AS max_load_id
FROM table2
)
WHERE load_id = max_load_id
ORDER BY dt, param
Which, for the sample data, all output:
DT        PARAM LOAD_ID ID
--------- ----- ------- --
04-JAN-23 0     11      1
04-JAN-23 1     11      1
05-JAN-23 0     13      3
05-JAN-23 1     13      3
06-JAN-23 0     14      3
06-JAN-23 1     14      3
07-JAN-23 1     14      3
08-JAN-23 1     15      3
09-JAN-23 0     16      3
09-JAN-23 1     16      3
10-JAN-23 0     17      3
10-JAN-23 1     17      3
fiddle
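The claim that the LEFT JOIN does not change the output can be checked on a small scale. The sketch below uses Python's sqlite3 module with made-up rows (not the fiddle's data) and ISO date strings; it runs the original joined query and the simplified one and compares the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows, not the data from the fiddle
    CREATE TABLE table1 (dt TEXT, name TEXT, load_id INTEGER, id INTEGER);
    CREATE TABLE table2 (dt TEXT, param INTEGER, load_id INTEGER, id INTEGER);
    INSERT INTO table1 VALUES
        ('2023-01-04', 'A', 1, 1),
        ('2023-01-05', 'A', 2, 2),
        ('2023-01-05', 'A', 3, 3);
    INSERT INTO table2 VALUES
        ('2023-01-04', 0, 11, 1),
        ('2023-01-05', 0, 12, 2),
        ('2023-01-05', 0, 13, 3),
        ('2023-01-05', 1, 13, 3),
        ('2023-01-06', 0, 14, 3);
""")

# Original shape: latest table2 rows LEFT JOINed to latest table1 rows.
with_join = conn.execute("""
    SELECT t2.dt, t2.param, t2.load_id, t2.id
    FROM (
        SELECT dt, param, load_id, id,
               MAX(load_id) OVER (PARTITION BY dt, param) AS max_load_id
        FROM table2
    ) AS t2
    LEFT JOIN (
        SELECT dt, id
        FROM (
            SELECT dt, id, load_id,
                   MAX(load_id) OVER (PARTITION BY dt) AS max_load_id
            FROM table1
            WHERE name = 'A'
        )
        WHERE load_id = max_load_id
    ) AS t1
      ON t1.dt = t2.dt AND t1.id = t2.id
    WHERE t2.load_id = t2.max_load_id
    ORDER BY 1, 2
""").fetchall()

# Simplified shape: table1 dropped entirely.
without_join = conn.execute("""
    SELECT dt, param, load_id, id
    FROM (
        SELECT dt, param, load_id, id,
               MAX(load_id) OVER (PARTITION BY dt, param) AS max_load_id
        FROM table2
    )
    WHERE load_id = max_load_id
    ORDER BY dt, param
""").fetchall()

print(with_join == without_join)
```

With these rows both queries keep only the highest load_id per (dt, param), and the join neither adds nor removes rows.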

SQL: Merge two result set without any conditions

I have two tables like this:
Table1 with column N
N
---
1
2
3
4
5
And Table2 with column M:
M
---
5
9
1
8
1
I want to combine these two data sets, which have the same number of rows, side by side while preserving the source order, like this:
N M
------
1 5
2 9
3 1
4 8
5 1
Can anyone help me?
Assuming you want to view this output we can use a ROW_NUMBER() trick here:
WITH cte1 AS (
SELECT N, ROW_NUMBER() OVER (ORDER BY N) rn
FROM Table1
),
cte2 AS (
SELECT M, ROW_NUMBER() OVER (ORDER BY M DESC) rn
FROM Table2
)
SELECT t1.N, t2.M
FROM cte1 t1
INNER JOIN cte2 t2
ON t2.rn = t1.rn
ORDER BY t1.rn;
Based on Tim's answer and another SO answer, I was able to achieve my requirement and preserve the order of the source tables.
Like this:
WITH cte1 AS (
SELECT n, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn
FROM #TempTable
),
cte2 AS (
SELECT m, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn
FROM #TempTable2
)
SELECT t1.n, t2.m
FROM cte1 t1
INNER JOIN cte2 t2
ON t2.rn = t1.rn
ORDER BY t1.rn
Point:
There is no need to worry about specifying a constant in the ORDER BY expression; (SELECT NULL) satisfies the required ORDER BY clause without naming a sort column.
Sources:
ROW_NUMBER Without ORDER BY
Tim's answer
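With ORDER BY (SELECT NULL) the numbering order is left to the engine. If the source order has to be reproducible, one option is to number the rows by a column that records that order. The sketch below assumes a hypothetical seq column (an auto-assigned integer key) on both tables and uses Python's sqlite3 module in place of SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- seq is a hypothetical column that records insertion order
    CREATE TABLE Table1 (seq INTEGER PRIMARY KEY, N INTEGER);
    CREATE TABLE Table2 (seq INTEGER PRIMARY KEY, M INTEGER);
    INSERT INTO Table1 (N) VALUES (1), (2), (3), (4), (5);
    INSERT INTO Table2 (M) VALUES (5), (9), (1), (8), (1);
""")

# Number each table by its own seq, then pair rows with equal numbers.
paired = conn.execute("""
    WITH cte1 AS (
        SELECT N, ROW_NUMBER() OVER (ORDER BY seq) AS rn
        FROM Table1
    ),
    cte2 AS (
        SELECT M, ROW_NUMBER() OVER (ORDER BY seq) AS rn
        FROM Table2
    )
    SELECT t1.N, t2.M
    FROM cte1 AS t1
    INNER JOIN cte2 AS t2 ON t2.rn = t1.rn
    ORDER BY t1.rn
""").fetchall()

print(paired)
```

For the question's sample values this pairs 1 with 5, 2 with 9, and so on, in insertion order.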

MS-SQL max ID with inner join

Can't see the wood for the trees on this and I'm sure it's simple.
I'm trying to return the max ID for a related record in a joined table
Table1
NiD Name
--- ------
1   Peter
2   John
3   Arthur
Table2
ID NiD Value
-- --- -----
1  1   5
2  2   10
3  3   10
4  1   20
5  2   15
Max Results
NiD ID Value
--- -- -----
1   4  20
2   5  15
3   3  10
You can use row_number() for this:
select NiD, ID, Value
from (select t2.*,
row_number() over (partition by NiD order by ID desc) as seqnum
from table2 t2
) t2
where seqnum = 1;
As the question is stated, you do not need table1, because table2 has all the ids.
This is how I'd do it. I think ID and Value will be NULL when Table2 has no corresponding entry for a Table1 record:
SELECT NiD, ID, [Value]
FROM Table1
OUTER APPLY (
SELECT TOP 1 ID, [Value]
FROM Table2
WHERE Table1.NiD = Table2.NiD
ORDER BY [Value] DESC
) AS Top_Table2
CREATE TABLE Names
(
NID INT,
[Name] VARCHAR(MAX)
)
CREATE TABLE Results
(
ID INT,
NID INT,
VALUE INT
)
INSERT INTO Names VALUES (1,'Peter'),(2,'John'),(3,'Arthur')
INSERT INTO Results VALUES (1,1,5),(2,2,10),(3,3,10),(4,1,20),(5,2,15)
SELECT a.NID,
r.ID,
a.MaxVal
FROM (
SELECT NID,
MAX(VALUE) as MaxVal
FROM Results r
GROUP BY NID
) a
JOIN Results r
ON a.NID = r.NID AND a.MaxVal = r.VALUE
ORDER BY NID
Here's what I have used in similar situations; performance was fine provided the data set wasn't too large (under 1M rows).
SELECT
table1.nid
,table2.id
,table2.value
FROM table1
INNER JOIN table2 ON table1.nid = table2.nid
WHERE table2.value = (
SELECT MAX(value)
FROM table2
WHERE nid = table1.nid)
ORDER BY 1
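For comparison, here is a small sketch of the row_number() approach run against the question's sample data, using Python's sqlite3 module instead of SQL Server. The fourth name, Paul, is a hypothetical addition with no Table2 rows, included to show the NULLs a LEFT JOIN produces (similar in effect to OUTER APPLY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (NiD INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, NiD INTEGER, Value INTEGER);
    -- Paul is a made-up row with no related records
    INSERT INTO Table1 VALUES (1, 'Peter'), (2, 'John'), (3, 'Arthur'), (4, 'Paul');
    INSERT INTO Table2 VALUES (1, 1, 5), (2, 2, 10), (3, 3, 10), (4, 1, 20), (5, 2, 15);
""")

# seqnum = 1 marks the row with the highest ID within each NiD.
max_rows = conn.execute("""
    SELECT t1.NiD, t1.Name, latest.ID, latest.Value
    FROM Table1 AS t1
    LEFT JOIN (
        SELECT NiD, ID, Value,
               ROW_NUMBER() OVER (PARTITION BY NiD ORDER BY ID DESC) AS seqnum
        FROM Table2
    ) AS latest
      ON latest.NiD = t1.NiD AND latest.seqnum = 1
    ORDER BY t1.NiD
""").fetchall()

print(max_rows)
```

Note that this orders by ID, as the question asks; the answers that use MAX(Value) agree with it only because the highest ID also carries the highest Value in the sample data.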

SQL select top if columns are same

If I have a table like this:
Id StateId Name
1 1 a
2 2 b
3 1 c
4 1 d
5 3 e
6 2 f
I want to select like below:
Id StateId Name
4 1 d
5 3 e
6 2 f
For example, Ids 1,3,4 have stateid 1. So select row with max Id, i.e, 4.
; WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY STATEID ORDER BY ID DESC) AS RN
FROM YourTable
)SELECT ID, STATEID, NAME FROM CTE WHERE RN = 1
You can use ROW_NUMBER() + TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
Id,
StateId,
[Name]
FROM YourTable
ORDER BY ROW_NUMBER() OVER (PARTITION BY StateId ORDER BY Id DESC)
Output:
Id StateId Name
4 1 d
6 2 f
5 3 e
Disclaimer: I gave this answer before the OP had specified an actual database, and hence avoided using window functions. For a possibly more appropriate answer, see the reply by @Tanjim above.
Here is an option using joins which should work across most RDBMS.
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT StateId, MAX(Id) AS Id
FROM yourTable
GROUP BY StateId
) t2
ON t1.StateId = t2.StateId AND
t1.Id = t2.Id
The following uses a correlated subquery to find the maximum Id for each state. The WHERE clause then keeps only the rows whose Id is returned by that subquery.
SELECT
[Id], [StateID], [Name]
FROM
TABLENAME S1
WHERE
Id IN (SELECT MAX(Id) FROM TABLENAME S2 WHERE S2.StateID = S1.StateID)
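As a quick sanity check, the join-based and correlated-subquery versions can be run side by side on the question's sample rows. This is only a sketch using Python's sqlite3 module; the table name YourTable is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, StateId INTEGER, Name TEXT);
    INSERT INTO YourTable VALUES
        (1, 1, 'a'), (2, 2, 'b'), (3, 1, 'c'),
        (4, 1, 'd'), (5, 3, 'e'), (6, 2, 'f');
""")

# Join against the per-state maximum Id.
via_join = conn.execute("""
    SELECT t1.Id, t1.StateId, t1.Name
    FROM YourTable AS t1
    INNER JOIN (
        SELECT StateId, MAX(Id) AS Id
        FROM YourTable
        GROUP BY StateId
    ) AS t2
      ON t1.StateId = t2.StateId AND t1.Id = t2.Id
    ORDER BY t1.Id
""").fetchall()

# Correlated subquery: keep a row only if its Id is the maximum for its state.
via_subquery = conn.execute("""
    SELECT s1.Id, s1.StateId, s1.Name
    FROM YourTable AS s1
    WHERE s1.Id = (SELECT MAX(s2.Id)
                   FROM YourTable AS s2
                   WHERE s2.StateId = s1.StateId)
    ORDER BY s1.Id
""").fetchall()

print(via_join == via_subquery, via_join)
```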

Microsoft SQL server to select Top N group

There are a lot of answers about how to select n rows from each group.
But what I am looking for is to select every row from the top N groups. For example, I have the data below:
id group
1 a
2 a
3 b
4 c
5 c
6 d
7 d
.......
If I want to select the top 3 groups, my intended result is:
1 a
2 a
3 b
4 c
5 c
How can I achieve this with Microsoft SQL server 2008?
One option is to use a subquery which selects the top N groups:
SELECT t1.id, t1.[group]
FROM yourTable t1
INNER JOIN
(
SELECT DISTINCT TOP(N) [group] -- replace N with the number of groups, e.g. 3
FROM yourTable
ORDER BY [group]
) t2
ON t1.[group] = t2.[group]
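You could rank your rows by the group and then take only the top three. Use DENSE_RANK() rather than RANK() here: RANK() leaves gaps after ties, so the third group would not get rank 3.
SELECT [id], [group]
FROM (SELECT [id], [group], DENSE_RANK() OVER (ORDER BY [group] ASC) rk
FROM mytable) t
WHERE rk <= 3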
@Tim: I just modified your query.
SELECT t1.id, t1.[group]
FROM yourTable t1
INNER JOIN
(
SELECT TOP N [group]
FROM yourTable
GROUP BY [group]
--ORDER BY [group] USE IT IF YOU WANT
) t2
ON t1.[group] = t2.[group]
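When ranking groups, the choice between RANK() and DENSE_RANK() changes which rows pass a filter such as rk <= 3. The sketch below runs both on the question's sample rows through Python's sqlite3 module (SQLite also accepts the [group] bracket quoting); the table name mytable and the helper ids_with are placeholders for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, [group] TEXT);
    INSERT INTO mytable VALUES
        (1, 'a'), (2, 'a'), (3, 'b'), (4, 'c'), (5, 'c'), (6, 'd'), (7, 'd');
""")

def ids_with(rank_fn):
    """Return the ids kept by 'rk <= 3' for the given ranking function name."""
    sql = f"""
        SELECT id
        FROM (
            SELECT id, {rank_fn}() OVER (ORDER BY [group]) AS rk
            FROM mytable
        ) AS t
        WHERE rk <= 3
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql)]

# RANK numbers the groups 1, 1, 3, 4, 4, 6, 6 (gaps after ties).
rank_ids = ids_with("RANK")
# DENSE_RANK numbers them 1, 1, 2, 3, 3, 4, 4 (no gaps).
dense_ids = ids_with("DENSE_RANK")

print(rank_ids, dense_ids)
```

Only the DENSE_RANK() version returns ids 1 through 5, which is the result the question asks for.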