Getting distinct rows from select statement

I have a select statement that generates the following output.
If you look at the output, a few rows have duplicate AdjusterIds.
What I need is to get only the FIRST ROW for each repeated id, along with the other rows.
I tried several variations of GROUP BY, DISTINCT, etc., but had no luck.
Please note that there are several columns that I omitted for simplicity.

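You can use a CTE + ROW_NUMBER. Note that ordering by the partition column itself leaves the choice of "first" row arbitrary; if some column defines which row should come first, order by that column instead:
WITH CTE AS
(
    SELECT t.*, RN = ROW_NUMBER() OVER (PARTITION BY AdjusterIds
                                        ORDER BY AdjusterIds)
    FROM dbo.TableName t
)
SELECT * FROM CTE WHERE RN = 1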
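
A minimal runnable sketch of the same pattern, using Python's built-in sqlite3 module as a stand-in for SQL Server (this assumes an SQLite build with window functions, 3.25 or later). The `claims` table, its `ClaimDate` and `Amount` columns, and the sample rows are hypothetical; SQLite also needs `AS rn` instead of the T-SQL `RN = ...` form.

```python
import sqlite3

# In-memory database; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (AdjusterId INTEGER, ClaimDate TEXT, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?)",
    [
        (1, "2020-01-05", 100),
        (1, "2020-01-02", 250),
        (2, "2020-01-03", 75),
        (3, "2020-01-04", 40),
        (3, "2020-01-01", 60),
    ],
)

# Number the rows inside each AdjusterId partition, earliest ClaimDate first,
# then keep only row 1 of every partition.
query = """
WITH cte AS (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY AdjusterId ORDER BY ClaimDate) AS rn
    FROM claims c
)
SELECT AdjusterId, ClaimDate, Amount
FROM cte
WHERE rn = 1
ORDER BY AdjusterId
"""
first_rows = conn.execute(query).fetchall()
print(first_rows)
```

Ordering by `ClaimDate` here is only an example of a column that makes "first" well defined; any column that expresses the intended priority would do.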

Related

MS SQL add max()-1 to query

How can I add the max(o.Acct)-1 rows to the query? I need to show the last two o.Acct rows. My query currently shows only max(o.Acct):
SELECT Max(o.Acct) AS [MaxAcct],o.ObjectID,o.Opertype
FROM Operations o
GROUP By o.ObjectID,o.Opertype

If you want to see the last two rows (per group), you're better off using ROW_NUMBER() rather than GROUP BY.
SELECT
*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ObjectID,
Opertype
ORDER BY Acct DESC
)
AS sequence_id
FROM
Operations
)
sortedOperations
WHERE
sequence_id <= 2
ORDER BY
ObjectID,
Opertype,
Acct
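
As a hedged illustration of the ROW_NUMBER() approach, the sketch below runs an equivalent query through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions). The sample rows are invented; only the `Operations` table and its `ObjectID`, `Opertype`, and `Acct` columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Operations (ObjectID INTEGER, Opertype TEXT, Acct INTEGER)")
conn.executemany(
    "INSERT INTO Operations VALUES (?, ?, ?)",
    [
        (1, "A", 10), (1, "A", 20), (1, "A", 30),  # three rows: only the top two survive
        (1, "B", 5),                               # single-row group is kept as is
        (2, "A", 7), (2, "A", 9),
    ],
)

# Rank rows inside each (ObjectID, Opertype) group by Acct descending,
# then keep the two highest Acct values per group.
query = """
SELECT ObjectID, Opertype, Acct
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ObjectID, Opertype
                              ORDER BY Acct DESC) AS sequence_id
    FROM Operations
) sortedOperations
WHERE sequence_id <= 2
ORDER BY ObjectID, Opertype, Acct
"""
last_two = conn.execute(query).fetchall()
print(last_two)
```

Groups with fewer than two rows simply return what they have, which GROUP BY with MAX() cannot express directly.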

If you want the last two of something, I'm thinking order by and top. Something like this:
select top (2) o.*
from Operations o
order by o.acct desc;

Is there an optimised way in SQL Server to write this code? I am trying to find the 2nd duplicate

Is there an optimised way in SQL Server to write this code? I am trying to find the 2nd duplicate:
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY id,AN_KEY ORDER BY [ENTITYID]) AS [rn]
FROM [data].[dbo].[TRANSFER]
)
select *
INTO dbo.#UpSingle
from CTE
where RN=2

UPDATE:
As GurV pointed out, this query doesn't solve the problem. It only gives you the (id, AN_KEY) groups that contain exactly two rows, not the row where the second duplicate lies.
I am just going to leave this here for reference purposes.
Original Answer
Why not try something like this, from another SO post: Finding duplicate values in a SQL table
SELECT
id, AN_KEY, COUNT(*)
FROM
[data].[dbo].[TRANSFER]
GROUP BY
id, AN_KEY
HAVING
COUNT(*) = 2
I gather from your original SQL that the columns you would want to group by are:
Id
AN_KEY

Here is another way to get the second duplicate row (in order of increasing ENTITYID, of course):
select *
from [data].[dbo].[TRANSFER] a
where [ENTITYID] = (
select min([ENTITYID])
from [data].[dbo].[TRANSFER] b
where [ENTITYID] > (
select min([ENTITYID])
from [data].[dbo].[TRANSFER] c
where b.id = c.id
and b.an_key = c.an_key
)
and a.id = b.id
and a.an_key = b.an_key
)
Provided there is an index on id, an_key and ENTITYID columns, performance of both your query and this should be acceptable.

Let me assume that this query does what you want:
WITH CTE AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id, AN_KEY
ORDER BY [ENTITYID]) AS [rn]
FROM [data].[dbo].[TRANSFER] t
)
SELECT *
INTO dbo.#UpSingle
FROM CTE
WHERE RN = 2;
For performance, you want a composite index on [data].[dbo].[TRANSFER](id, AN_KEY, ENTITYID).
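
To make the difference between the two approaches in this thread concrete, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The sample rows and the index name `ix_transfer_dup` are hypothetical; the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfer (id INTEGER, AN_KEY TEXT, ENTITYID INTEGER)")
# Composite index covering the partition columns followed by the ordering column.
conn.execute("CREATE INDEX ix_transfer_dup ON transfer (id, AN_KEY, ENTITYID)")
conn.executemany(
    "INSERT INTO transfer VALUES (?, ?, ?)",
    [
        (1, "x", 100), (1, "x", 101), (1, "x", 102),  # three rows in the group
        (2, "y", 200),                                # no duplicate
        (3, "z", 300), (3, "z", 305),                 # exactly two rows
    ],
)

# ROW_NUMBER approach: the second row of every group that has one.
second_dups = conn.execute("""
    WITH cte AS (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id, AN_KEY ORDER BY ENTITYID) AS rn
        FROM transfer t
    )
    SELECT id, AN_KEY, ENTITYID FROM cte WHERE rn = 2 ORDER BY id
""").fetchall()

# GROUP BY / HAVING approach: only groups with exactly two rows, and no ENTITYID.
groups_of_two = conn.execute("""
    SELECT id, AN_KEY, COUNT(*) FROM transfer
    GROUP BY id, AN_KEY HAVING COUNT(*) = 2
""").fetchall()

print(second_dups)
print(groups_of_two)
```

With this data the ROW_NUMBER query finds the second row of both the three-row and the two-row group, while the HAVING COUNT(*) = 2 query reports only the two-row group.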

How do I partition by multiple columns when rows are being duplicated?

I have a set of SQL stored procedures that use partitioning for ranking in order to get percentiles. With the partitioning below I am able to get my percentile data right. However, my problem is that there are duplicate rows. E.g. for each DESC there are multiple duplicates when there is supposed to be only 1 row. Why is this so?
row_nums AS
(
SELECT DATE, DESC, NUM, ROW_NUMBER() OVER (PARTITION BY DATE, DESC ORDER BY NUM ASC) AS Row_Num
FROM ******
)
SELECT .................
This is the output I currently get (duplicate rows are being returned; refer to rows 6 to 8):
http://i.stack.imgur.com/foe7g.png
This is the output I want to achieve: http://i.stack.imgur.com/GkrHP.png

You can remove the duplicates by adding one more inner query in the FROM clause, like below:
;WITH row_nums AS
(
SELECT DATE, DESC, NUM, ROW_NUMBER() OVER (PARTITION BY DATE, DESC ORDER BY NUM ASC) AS Row_Num
FROM (
SELECT your required columns, COUNT(duplicated_rows_columnsname) AS cnt
FROM ***
GROUP BY columnnames
HAVING COUNT(duplicated_rows_columnsname) = 1
) AS src
)
SELECT .................
Alternatively, you can remove duplicate rows by using DISTINCT in the inner query.

Select Top 100 Groups

I have thousands of groups in a table, something like:
1..
1..
2..
2..
2..
2..
3..
3..
.
.
.
10000..
10000..
How can I write a select that gives me the top 3 groups each time?
I want something like select top 3 from rows, but it has to return the first three groups, not the first three rows.

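You can try this (the ORDER BY has to go with the TOP, since SQL Server does not allow ORDER BY inside a CTE without TOP):
;with cte as (
select distinct groupId from mytable
)
select * from mytable where groupId in (select top 3 groupId from cte order by groupId)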

You can use DENSE_RANK to assign a number to each group. All members of the same group will have the same number. Then in an outer query, select top 3 groups:
SELECT *
FROM (SELECT *, DENSE_RANK() OVER (ORDER BY id) AS rnk
FROM mytable ) t
WHERE t.rnk <= 3
The above query assumes that id is the column used to group records together.
SQL Fiddle Demo
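
The DENSE_RANK idea can be tried out with the short sketch below, which uses Python's sqlite3 module in place of SQL Server (SQLite 3.25 or later is assumed for window functions). The `val` column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, "a"), (1, "b"),
        (2, "c"), (2, "d"), (2, "e"),
        (3, "f"),
        (4, "g"), (4, "h"),  # fourth group, should be filtered out
    ],
)

# DENSE_RANK gives every row of a group the same rank with no gaps between
# groups, so rnk <= 3 keeps all rows of the first three groups.
query = """
SELECT id, val
FROM (SELECT *, DENSE_RANK() OVER (ORDER BY id) AS rnk
      FROM mytable) t
WHERE t.rnk <= 3
ORDER BY id, val
"""
top_groups = conn.execute(query).fetchall()
print(top_groups)
```

ROW_NUMBER() or RANK() would not work the same way here: ROW_NUMBER() numbers individual rows, and RANK() leaves gaps after groups with several rows.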

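Use the ranking function ROW_NUMBER(). Note that partitioning by GroupId numbers the rows within each group, so this returns up to three rows from every group rather than the first three groups: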
SELECT *
FROM (SELECT *,
Row_number()
OVER(
partition BY GroupId
ORDER BY GroupId) AS [rn]
FROM YourTable) t
WHERE rn <= 3
Check this MSDN doc for details of all ranking functions.

There is a SQL TOP clause that does this:
SELECT TOP number|percent column_name(s) FROM table_name;
A description of what it does, and of the equivalent syntax in other databases such as MySQL and MS Access, can be found here: http://www.w3schools.com/sql/sql_top.asp
My bad, I misread your question; this will return the top rows, not groups. Could you explain what you are trying to do in more detail?

SELECT *
FROM
(SELECT *
,ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Group] ASC)rn
FROM TableName
)A
WHERE rn <= 3

row_number() over() combined with order by

How can I add a sequential row number to a query that is using order by?
Let's say I have a query of this form:
SELECT row_number() over(), data
FROM myTable
ORDER BY data
This will produce the desired result as rows are ordered by "data", but the row numbers are also ordered by data. I understand this is normal as my row number is generated before the order by, but how can I generate this row number after the order by?
I tried using a subquery like this:
SELECT row_number() over(ORDER BY data), *
FROM
(
SELECT data
FROM myTable
ORDER BY data
) As t1
as shown here, but DB2 doesn't seem to support the syntax SELECT ..., * FROM.
Thanks!

You also need to use the alias name before '*':
SELECT row_number() over(ORDER BY data), t1.*
FROM
(
SELECT data
FROM myTable
ORDER BY data
) As t1
You don't need a subquery to do this:
SELECT data , row_number() over(ORDER BY data) as rn
FROM myTable
ORDER BY data
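
A quick way to check that the subquery-free form numbers the rows in sorted order is the sketch below. It uses Python's sqlite3 module rather than DB2 (so it only illustrates the standard window-function behaviour, and assumes SQLite 3.25 or later); the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (data TEXT)")
# Insert in an order that differs from the sorted order on purpose.
conn.executemany("INSERT INTO myTable VALUES (?)", [("pear",), ("apple",), ("fig",)])

# The ORDER BY inside OVER() decides how the numbers are assigned;
# the final ORDER BY decides how the rows are returned.
numbered = conn.execute("""
    SELECT data, ROW_NUMBER() OVER (ORDER BY data) AS rn
    FROM myTable
    ORDER BY data
""").fetchall()
print(numbered)
```

Because both ORDER BY clauses use the same column, the row numbers come back sequential alongside the sorted data.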

We Keep Coding
