Microsoft SQL Server: select all rows from the top N groups

There are a lot of answers about how to select N rows from each group.
What I am looking for instead is how to select every row from the top N groups. For example, given the data below:
id group
1 a
2 a
3 b
4 c
5 c
6 d
7 d
.......
If I want to select the top 3 groups, my intended result is:
1 a
2 a
3 b
4 c
5 c
How can I achieve this with Microsoft SQL Server 2008?

One option is to use a subquery which selects the top N groups:
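SELECT t1.id, t1.[group]
FROM yourTable t1
INNER JOIN
(
SELECT DISTINCT TOP (N) [group] -- replace N with the number of groups, e.g. 3; [group] is bracketed because GROUP is a reserved word
FROM yourTable
ORDER BY [group]
) t2
ON t1.[group] = t2.[group]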

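You could rank your rows by the group and then take only the top three. Use DENSE_RANK() rather than RANK(): RANK() leaves gaps after ties, so filtering on rk <= 3 can return fewer than three groups:
SELECT [id], [group]
FROM (SELECT [id], [group], DENSE_RANK() OVER (ORDER BY [group] ASC) rk
FROM mytable) t
WHERE rk <= 3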
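The difference between RANK() and DENSE_RANK() is easy to check on the sample data from the question. The sketch below is only an illustration: it uses Python's built-in sqlite3 module with SQLite as a stand-in for SQL Server (window functions need SQLite 3.25 or later), and the column is named grp instead of group to avoid quoting a reserved word. Both choices are assumptions, not part of the original answers.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, grp TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "c"), (5, "c"), (6, "d"), (7, "d")],
)

def top_groups(rank_fn, n):
    """Return all rows whose group rank (computed by rank_fn) is <= n."""
    sql = f"""
        SELECT id, grp
        FROM (
            SELECT id, grp, {rank_fn}() OVER (ORDER BY grp) AS rk
            FROM mytable
        ) AS t
        WHERE rk <= ?
        ORDER BY id
    """
    return conn.execute(sql, (n,)).fetchall()

rank_rows = top_groups("RANK", 3)         # ranks are 1, 1, 3, 4, 4, 6, 6
dense_rows = top_groups("DENSE_RANK", 3)  # ranks are 1, 1, 2, 3, 3, 4, 4
print(rank_rows)
print(dense_rows)
```

With RANK() only groups a and b survive the filter, while DENSE_RANK() keeps a, b and c, which is the result the question asks for.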

@Tim: I just modified your query.
SELECT t1.id, t1.[group]
FROM yourTable t1
INNER JOIN
(
SELECT TOP (N) [group]
FROM yourTable
GROUP BY [group]
-- ORDER BY [group] -- uncomment for a deterministic result; without it, which N groups are picked is arbitrary
) t2
ON t1.[group] = t2.[group]

Related

SQL: Merge two result set without any conditions

I have two tables like this:
Table1 with column N
N
---
1
2
3
4
5
And Table2 with column M:
M
---
5
9
1
8
1
Finally, I want to combine these two data sets, which have the same number of rows, while preserving the source order, like this:
N M
------
1 5
2 9
3 1
4 8
5 1
Can anyone help me?
Assuming you want to view this output, we can use a ROW_NUMBER() trick here:
WITH cte1 AS (
SELECT N, ROW_NUMBER() OVER (ORDER BY N) rn
FROM Table1
),
cte2 AS (
SELECT M, ROW_NUMBER() OVER (ORDER BY M DESC) rn
FROM Table2
)
SELECT t1.N, t2.M
FROM cte1 t1
INNER JOIN cte2 t2
ON t2.rn = t1.rn
ORDER BY t1.rn;
Building on Tim's answer and another SO answer, I was able to meet my requirement and keep the source tables' order.
Like this:
WITH cte1 AS (
SELECT n, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn
FROM #TempTable
),
cte2 AS (
SELECT m, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn
FROM #TempTable2
)
SELECT t1.n, t2.m
FROM cte1 t1
INNER JOIN cte2 t2
ON t2.rn = t1.rn
ORDER BY t1.rn
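Point:
ROW_NUMBER() requires an ORDER BY, but a constant expression such as (SELECT NULL) satisfies it when there is no column to sort by. Note that the numbering this produces is not guaranteed to follow insertion order; for a dependable order, number the rows by an explicit ordering column.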
Sources:
ROW_NUMBER Without ORDER BY
Tim's answer
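For a pairing that does not depend on physical row order, one option is to store the order explicitly. The sketch below is an illustration under assumptions: it adds a hypothetical seq column to both tables (not present in the question) and runs in SQLite through Python's sqlite3 module rather than in SQL Server.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (seq INTEGER PRIMARY KEY, N INTEGER);
    CREATE TABLE Table2 (seq INTEGER PRIMARY KEY, M INTEGER);
""")
# seq is a hypothetical column that records the intended source order.
conn.executemany(
    "INSERT INTO Table1 (seq, N) VALUES (?, ?)",
    list(enumerate([1, 2, 3, 4, 5], start=1)),
)
conn.executemany(
    "INSERT INTO Table2 (seq, M) VALUES (?, ?)",
    list(enumerate([5, 9, 1, 8, 1], start=1)),
)

# Number each table by seq, then join the two numberings position by position.
paired = conn.execute("""
    WITH cte1 AS (
        SELECT N, ROW_NUMBER() OVER (ORDER BY seq) AS rn FROM Table1
    ),
    cte2 AS (
        SELECT M, ROW_NUMBER() OVER (ORDER BY seq) AS rn FROM Table2
    )
    SELECT t1.N, t2.M
    FROM cte1 AS t1
    INNER JOIN cte2 AS t2 ON t2.rn = t1.rn
    ORDER BY t1.rn
""").fetchall()
print(paired)
```

In SQL Server the same role could be played by an IDENTITY column; the query shape stays the same.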

Get all the users that have only 1 category in SQL

I have this table, where U represents "User" and C represents "Category". I need a way to count how many categories each user has and keep only the users with exactly 1. So basically, I need to get all the users that have only 1 category. How can this be achieved with SQL (PostgreSQL)?
I've searched Stack Overflow for a solution for a while now, but without success.
id U C
1 3 5
2 1 3
3 3 5
4 5 2
5 11 5
6 11 5
Expected result:
id U C
1 1 3
2 5 2
That is easy with the HAVING clause:
SELECT max(id), u, max(c)
FROM atable
GROUP BY u HAVING count(c) = 1
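You can use window functions for your purpose. Partition by u so that the count is taken per user (partitioning by id would give every row a count of 1), and give the subquery an alias:
SELECT DISTINCT * FROM (
SELECT a.*, count(c) OVER (PARTITION BY u) AS cnt
FROM testtable a
) AS sub WHERE cnt = 1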
If you have no duplicates, you can use not exists:
select t.*
from t
where not exists (select 1 from t t2 where t2.u = t.u and t2.c <> t.c);
If you have duplicates of u/c, just use id:
select t.*
from t
where not exists (select 1 from t t2 where t2.u = t.u and t2.id <> t.id);
If you want to reassign a sequential number, then use row_number() as well:
select row_number() over (order by id) as new_id, t.*
from t
where not exists (select 1 from t t2 where t2.u = t.u and t2.c <> t.c);
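The HAVING approach and a per-user window count can be compared on the question's sample data. This is a minimal sketch that runs in SQLite through Python's sqlite3 module, used here as a stand-in for PostgreSQL (an assumption); the table name atable follows the first answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (id INTEGER, u INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?, ?)",
    [(1, 3, 5), (2, 1, 3), (3, 3, 5), (4, 5, 2), (5, 11, 5), (6, 11, 5)],
)

# Aggregate per user and keep users that appear in exactly one row.
having_rows = conn.execute("""
    SELECT u, max(c)
    FROM atable
    GROUP BY u
    HAVING count(*) = 1
    ORDER BY u
""").fetchall()

# Same filter with a window count taken per user (PARTITION BY u).
window_rows = conn.execute("""
    SELECT u, c
    FROM (
        SELECT u, c, count(*) OVER (PARTITION BY u) AS cnt
        FROM atable
    ) AS sub
    WHERE cnt = 1
    ORDER BY u
""").fetchall()

print(having_rows)
print(window_rows)
```

Both queries return users 1 and 5, matching the expected result in the question apart from the renumbered id column.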

SQL get the closest two rows within duplicate rows

I have following table
ID Name Stage
1 A 1
1 B 2
1 C 3
1 A 4
1 N 5
1 B 6
1 J 7
1 C 8
1 D 9
1 E 10
I need the output below: given the parameters A and N, select the closest pair of rows.
ID Name Stage
1 A 4
1 N 5
That is, the rows where the difference between the stages is smallest.
This query can make use of an index on (name, stage) efficiently:
WITH cte AS (
SELECT TOP 1
a.id AS a_id, a.name AS a_name, a.stage AS a_stage
, n.id AS n_id, n.name AS n_name, n.stage AS n_stage
FROM tbl a
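CROSS APPLY (
-- T-SQL does not allow ORDER BY directly in a UNION branch, so each TOP 1 is wrapped in a derived table
SELECT * FROM (
SELECT TOP 1 *, stage - a.stage AS diff
FROM tbl
WHERE name = 'N'
AND stage >= a.stage
ORDER BY stage
) later_n
UNION ALL
SELECT * FROM (
SELECT TOP 1 *, a.stage - stage AS diff
FROM tbl
WHERE name = 'N'
AND stage < a.stage
ORDER BY stage DESC
) earlier_n
) n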
WHERE a.name = 'A'
ORDER BY diff
)
SELECT a_id AS id, a_name AS name, a_stage AS stage FROM cte
UNION ALL
SELECT n_id, n_name, n_stage FROM cte;
SQL Server uses CROSS APPLY in place of standard-SQL LATERAL.
In case of ties (equal difference) the winner is arbitrary, unless you add more ORDER BY expressions as tiebreaker.
dbfiddle here
This solution works if you know the minimum difference is always 1:
SELECT *
FROM myTable AS a
CROSS JOIN myTable AS b
WHERE a.name = 'A'
AND b.name = 'N'
AND ABS(a.stage - b.stage) = 1;
a.ID a.Name a.Stage b.ID b.Name b.Stage
1 A 4 1 N 5
Or, more generally, if you don't know the minimum:
SELECT *
FROM myTable AS a
CROSS JOIN myTable AS b
WHERE a.name = 'A'
AND b.name = 'N'
AND ABS(a.stage - b.stage) = (SELECT MIN(ABS(a2.stage - b2.stage))
FROM myTable AS a2
CROSS JOIN myTable AS b2
WHERE a2.name = 'A'
AND b2.name = 'N')
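The closest-pair idea can be sketched compactly by pairing every A row with every N row and keeping the pair with the smallest absolute stage difference. The example below is an illustration only: it runs in SQLite through Python's sqlite3 module, so it uses LIMIT 1 where SQL Server would use TOP 1, and it does not try to reproduce the index-friendly plan of the CROSS APPLY answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, name TEXT, stage INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, "A", 1), (1, "B", 2), (1, "C", 3), (1, "A", 4), (1, "N", 5),
     (1, "B", 6), (1, "J", 7), (1, "C", 8), (1, "D", 9), (1, "E", 10)],
)

# Pair each row named first_name with each row named second_name,
# then keep the single pair with the smallest absolute stage gap.
closest = conn.execute("""
    SELECT a.name, a.stage, n.name, n.stage
    FROM tbl AS a
    CROSS JOIN tbl AS n
    WHERE a.name = ? AND n.name = ?
    ORDER BY ABS(a.stage - n.stage)
    LIMIT 1
""", ("A", "N")).fetchone()
print(closest)
```

On the sample data the A rows are at stages 1 and 4 and the only N row is at stage 5, so the pair at stages 4 and 5 is selected. With ties, the winner is arbitrary unless more ORDER BY expressions are added.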

SQL select top if columns are same

If I have a table like this:
Id StateId Name
1 1 a
2 2 b
3 1 c
4 1 d
5 3 e
6 2 f
I want to select like below:
Id StateId Name
4 1 d
5 3 e
6 2 f
For example, Ids 1, 3 and 4 have StateId 1, so select the row with the max Id, i.e., 4.
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY StateId ORDER BY Id DESC) AS RN
FROM YourTable
)
SELECT Id, StateId, Name FROM CTE WHERE RN = 1
You can use ROW_NUMBER() + TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
Id,
StateId,
[Name]
FROM YourTable
ORDER BY ROW_NUMBER() OVER (PARTITION BY StateId ORDER BY Id DESC)
Output:
Id StateId Name
4 1 d
6 2 f
5 3 e
Disclaimer: I gave this answer before the OP had specified an actual database, and hence avoided using window functions. For a possibly more appropriate answer, see the reply by @Tanjim above.
Here is an option using joins which should work across most RDBMS.
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT StateId, MAX(Id) AS Id
FROM yourTable
GROUP BY StateId
) t2
ON t1.StateId = t2.StateId AND
t1.Id = t2.Id
The following uses a correlated subquery to find the maximum Id for each state. The WHERE clause then keeps only the rows whose Id is returned by that subquery.
SELECT
[Id], [StateID], [Name]
FROM
TABLENAME S1
WHERE
Id IN (SELECT MAX(Id) FROM TABLENAME S2 WHERE S2.StateID = S1.StateID)
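The ROW_NUMBER() and MAX(Id) approaches above can be checked side by side on the sample data. This is a small sketch under assumptions: it runs in SQLite through Python's sqlite3 module instead of SQL Server, and YourTable is the placeholder table name used in the answers.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Id INTEGER, StateId INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 2, "b"), (3, 1, "c"), (4, 1, "d"), (5, 3, "e"), (6, 2, "f")],
)

# Number rows within each StateId from highest Id down and keep the first.
by_row_number = conn.execute("""
    SELECT Id, StateId, Name
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY StateId ORDER BY Id DESC) AS rn
        FROM YourTable
    ) AS ranked
    WHERE rn = 1
    ORDER BY Id
""").fetchall()

# Join back to the per-state maximum Id.
by_max_join = conn.execute("""
    SELECT t1.Id, t1.StateId, t1.Name
    FROM YourTable AS t1
    INNER JOIN (
        SELECT StateId, MAX(Id) AS Id
        FROM YourTable
        GROUP BY StateId
    ) AS t2
    ON t1.StateId = t2.StateId AND t1.Id = t2.Id
    ORDER BY t1.Id
""").fetchall()

print(by_row_number)
print(by_max_join)
```

Both queries return the rows with Ids 4, 5 and 6, as requested in the question.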

How do I remove duplicates in paging

table1 & table2: http://aftabfarda.parsfile.com/1.png
SELECT *
FROM (SELECT DISTINCT dbo.tb1.ID, dbo.tb1.name, ROW_NUMBER() OVER (ORDER BY tb1.id DESC) AS row
FROM dbo.tb1 INNER JOIN
dbo.tb2 ON dbo.tb1.ID = dbo.tb2.id_tb1) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY id DESC
Result: http://aftabfarda.parsfile.com/3.png
(id 11 is repeated 3 times)
How can I have this output:
ID name row
-- ------ ---
11 user11 1
10 user10 2
9 user9 3
8 user8 4
7 user7 5
6 user6 6
5 user5 7
You could apply distinct before row_number using a subquery:
select *
from (
select row_number() over (order by sub_dist.ID desc) as row
, *
from (
select distinct t1.ID
, t1.name
from dbo.tb1 as t1
join dbo.tb2 as t2
on t1.ID = t2.id_tb1
) as sub_dist
) as sub_with_rn
where row between 1 and 7
As an alternative to @Andomar's suggestion, you could use DENSE_RANK instead of ROW_NUMBER and rank the rows first (in the subquery), then apply DISTINCT (in the outer query):
SELECT DISTINCT
ID,
name,
row
FROM (
SELECT
t1.ID,
t1.name,
DENSE_RANK() OVER (ORDER BY t1.ID DESC) AS row
FROM dbo.tb1 t1
INNER JOIN dbo.tb2 t2 ON t1.ID = t2.id_tb1
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC
Similar, but not quite the same, although both might boil down to the same query plan, I'm just not sure. Worth testing, I think.
And, of course, you could also try a semi-join instead of a proper join, in the form of either IN or EXISTS, to prevent duplicates in the first place:
SELECT
ID,
name,
row
FROM (
SELECT
ID,
name,
ROW_NUMBER() OVER (ORDER BY ID DESC) AS row
FROM dbo.tb1
WHERE ID IN (SELECT id_tb1 FROM dbo.tb2)
/* Or:
WHERE EXISTS (
SELECT *
FROM dbo.tb2
WHERE id_tb1 = dbo.tb1.ID
)
*/
) AS a
WHERE row BETWEEN 1 AND 7
ORDER BY ID DESC
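To see why the join produces repeated rows in a page and how the semi-join avoids it, here is a small sketch. The table contents are hypothetical, since the original data was only posted as images: tb1 holds users 1 to 11, and tb2 references id 11 three times. It runs in SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the row-number alias is rn rather than row.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb1 (ID INTEGER, name TEXT)")
conn.execute("CREATE TABLE tb2 (id_tb1 INTEGER)")
# Hypothetical sample data: users 1..11, with id 11 referenced three times.
conn.executemany("INSERT INTO tb1 VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 12)])
conn.executemany("INSERT INTO tb2 VALUES (?)",
                 [(i,) for i in (11, 11, 11, 10, 9, 8, 7, 6, 5, 4)])

# Numbering after a plain join: every joined copy of id 11 gets its own number.
joined_ids = [r[0] for r in conn.execute("""
    SELECT ID FROM (
        SELECT t1.ID, ROW_NUMBER() OVER (ORDER BY t1.ID DESC) AS rn
        FROM tb1 AS t1
        INNER JOIN tb2 AS t2 ON t1.ID = t2.id_tb1
    ) AS a
    WHERE rn BETWEEN 1 AND 7
    ORDER BY rn
""")]

# Semi-join: each tb1 row is numbered once, however many tb2 rows match it.
semi_rows = conn.execute("""
    SELECT ID, name, rn FROM (
        SELECT ID, name, ROW_NUMBER() OVER (ORDER BY ID DESC) AS rn
        FROM tb1
        WHERE EXISTS (SELECT 1 FROM tb2 WHERE tb2.id_tb1 = tb1.ID)
    ) AS a
    WHERE rn BETWEEN 1 AND 7
    ORDER BY ID DESC
""").fetchall()

print(joined_ids)
print(semi_rows)
```

With the plain join, the first page contains id 11 three times and stops at id 7; with EXISTS, the page runs from user11 down to user5 with row numbers 1 to 7.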