How to select all duplicate rows except original one?

Let's say I have a table:
CREATE TABLE names (
    id SERIAL PRIMARY KEY,
    name CHARACTER VARYING
);
with this data:
id name
-------------
1 John
2 John
3 John
4 Jane
5 Jane
6 Jane
I need to select all duplicate rows by name except the original one. So in this case I need the result to be this:
id name
-------------
2 John
3 John
5 Jane
6 Jane
How do I do that in PostgreSQL?

You can use ROW_NUMBER() to identify the 'original' records and filter them out. Here is a method using a CTE:
WITH Nums AS (
    SELECT id,
           name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id ASC) AS RN
    FROM names
)
SELECT *
FROM Nums
WHERE RN <> 1; -- Filter out rows numbered 1, the 'originals'

Alternatively, exclude the row with the lowest id for each name:
SELECT *
FROM names
WHERE id NOT IN (SELECT MIN(id) FROM names GROUP BY name);
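To check that both answers return the same rows, here is a minimal sketch that runs them against the sample data. It assumes Python's bundled sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later), so SERIAL and CHARACTER VARYING are replaced with INTEGER and TEXT.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names (id, name) VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "John"), (4, "Jane"), (5, "Jane"), (6, "Jane")],
)

# Approach 1: number the rows per name and drop the first one.
window_query = """
    WITH nums AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
        FROM names
    )
    SELECT id, name FROM nums WHERE rn <> 1 ORDER BY id
"""

# Approach 2: exclude the lowest id of each name.
subquery_query = """
    SELECT id, name FROM names
    WHERE id NOT IN (SELECT MIN(id) FROM names GROUP BY name)
    ORDER BY id
"""

window_rows = conn.execute(window_query).fetchall()
subquery_rows = conn.execute(subquery_query).fetchall()
print(window_rows)
print(window_rows == subquery_rows)
```

Both queries treat the row with the lowest id as the 'original'; if a different rule decides which row is the original, only the ORDER BY in the window version or the aggregate in the subquery version needs to change.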

Related

Keep IDs in sequential order after deleting a row [duplicate]

I have a Microsoft Access database table where I use the "Id" value to look up the information in that row.
Example:
Id Name Surname
1 Jim Smith
2 Luis Evans
3 Charles Holland
4 John Price
I have a query that deletes one of the rows of this table. However, when I delete a row, the Id values don't stay in sequential order. For example, if I delete the row with Id 2, the table will look like this:
Id Name Surname
1 Jim Smith
3 Charles Holland
4 John Price
How do I make it so that when I delete a row in the table the Ids stay in sequential order? Like this:
Id Name Surname
1 Jim Smith
2 Charles Holland
3 John Price

You can use a sub-query to get the "question number". Something like this:
SELECT Q.ID, Q.ForeName, Q.Surname,
       (SELECT COUNT(*) FROM tblQuestion AS Q1 WHERE Q1.ID <= Q.ID) AS QuestionNo
FROM tblQuestion AS Q
ORDER BY Q.ID ASC;
This counts the number of records that have an ID less than or equal to the ID of the current record. So, in the table with 1 record deleted, ID 1 has 1 record (ID 1), ID 3 has two records (ID 1 and 3), and ID 4 has three records (ID 1, 3 and 4).
Note that Name is a reserved word, so you should use a different name for the field.
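The counting sub-query can be tried out with a small sketch like the one below. It uses Python's sqlite3 module instead of Access, and the table name people and the column names forename and seq_no are hypothetical choices made for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Access; "people" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, forename TEXT, surname TEXT)"
)
conn.executemany(
    "INSERT INTO people (id, forename, surname) VALUES (?, ?, ?)",
    [(1, "Jim", "Smith"), (2, "Luis", "Evans"),
     (3, "Charles", "Holland"), (4, "John", "Price")],
)

# Delete one row, leaving a gap in the stored ids (1, 3, 4).
conn.execute("DELETE FROM people WHERE id = 2")

# For each row, count the rows whose id is less than or equal to its own id.
rows = conn.execute("""
    SELECT p.id, p.forename, p.surname,
           (SELECT COUNT(*) FROM people AS p1 WHERE p1.id <= p.id) AS seq_no
    FROM people AS p
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

The stored ids keep their gap; only the computed seq_no column is sequential, which avoids rewriting key values that other tables may reference.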

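If your database supports window functions (Access SQL does not), you can generate the sequential number at query time without changing the stored Ids:
SELECT ROW_NUMBER() OVER (ORDER BY Id) AS Id, Name, Surname FROM TableData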

Add information to one table from table contains duplicates

I have the following table:
In Table_1, (ID, Name) pairs can repeat and have any combination
Table_1:
ID  Name    Value1  Value2
--------------------------
1   John    34      45
1   John    15      78
2   Randy   67      12
2   Randy   40      46
1   Randy   23      85
2   Holmes  10      100
I want to find all information for all unique pairs. So the output should be:
ID  Name    Value1  Value2
--------------------------
1   John    34      45
2   Randy   67      12
1   Randy   23      85
2   Holmes  10      100
When I do SELECT DISTINCT(ID, Name), I get the unique pairs correctly. But how do I add the Value1 and Value2 columns to this? Adding them to the SELECT list causes the pairs to repeat.

You may use DISTINCT ON here:
SELECT DISTINCT ON (ID, Name) *
FROM yourTable
ORDER BY ID, Name;
This will arbitrarily return one record from each (ID, Name) combination. Note that if you wanted to choose which of the duplicate pair (or more) records gets retained, you could add another level to the ORDER BY clause. For example, to choose the duplicate record with the highest Value2 value, you could use:
SELECT DISTINCT ON (ID, Name) *
FROM yourTable
ORDER BY ID, Name, Value2 DESC;

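Try ROW_NUMBER() with PARTITION BY. Partition by both ID and Name so that each unique pair gets its own numbering, and order by whichever column should decide which row is kept (Value1 here):
SELECT *
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ID, Name ORDER BY Value1 DESC) AS rn
    FROM Table_1
) AS a
WHERE rn = 1;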
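DISTINCT ON is specific to PostgreSQL, so the sketch below shows the portable ROW_NUMBER() form on the sample data, keeping the row with the highest Value2 per (ID, Name) pair. It assumes Python's sqlite3 module (SQLite 3.25 or later) as the engine, which is a choice made for this illustration only.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_1 (ID INTEGER, Name TEXT, Value1 INTEGER, Value2 INTEGER)"
)
conn.executemany(
    "INSERT INTO Table_1 (ID, Name, Value1, Value2) VALUES (?, ?, ?, ?)",
    [(1, "John", 34, 45), (1, "John", 15, 78),
     (2, "Randy", 67, 12), (2, "Randy", 40, 46),
     (1, "Randy", 23, 85), (2, "Holmes", 10, 100)],
)

# Number the rows inside each (ID, Name) pair, highest Value2 first,
# then keep only the first row of every pair.
rows = conn.execute("""
    SELECT ID, Name, Value1, Value2
    FROM (
        SELECT ID, Name, Value1, Value2,
               ROW_NUMBER() OVER (
                   PARTITION BY ID, Name ORDER BY Value2 DESC
               ) AS rn
        FROM Table_1
    ) AS ranked
    WHERE rn = 1
    ORDER BY ID, Name
""").fetchall()

for row in rows:
    print(row)
```

Partitioning by Name alone would merge (1, Randy) and (2, Randy) into one group, so both columns of the pair belong in the PARTITION BY clause.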

Select multiple records as multiple columns in SQL

I have a table that looks like this:
ID Name
-------
1 John
1 Mary
1 Jane
2 John
2 Mary
3 Jane
Knowing that every ID can only contain up to three names, I want to use a SELECT statement to turn this into the following:
ID Name1 Name2 Name3
--------------------
1 John Mary Jane
2 John Mary
3 Jane
Is there a way to do this in SQL?

If you know that there are at most three names, you can do this using conditional aggregation:
select id,
       max(case when seqnum = 1 then name end) as name1,
       max(case when seqnum = 2 then name end) as name2,
       max(case when seqnum = 3 then name end) as name3
from (select t.*,
             row_number() over (partition by id order by name) as seqnum
      from your_table t -- replace your_table with the real table name
     ) t
group by id;
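A runnable sketch of the conditional aggregation above, using Python's sqlite3 module (SQLite 3.25 or later) as an assumed engine and a hypothetical table name id_names:

```python
import sqlite3

# Assumption: SQLite as the engine; "id_names" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_names (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO id_names (id, name) VALUES (?, ?)",
    [(1, "John"), (1, "Mary"), (1, "Jane"),
     (2, "John"), (2, "Mary"), (3, "Jane")],
)

# Number the names within each id, then spread positions 1..3 into columns.
# MAX ignores NULLs, so each column picks up the single matching name (or NULL).
rows = conn.execute("""
    SELECT id,
           MAX(CASE WHEN seqnum = 1 THEN name END) AS name1,
           MAX(CASE WHEN seqnum = 2 THEN name END) AS name2,
           MAX(CASE WHEN seqnum = 3 THEN name END) AS name3
    FROM (
        SELECT t.id, t.name,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seqnum
        FROM id_names AS t
    ) AS s
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Because the window orders by name, the names come out alphabetically within each id (Jane, John, Mary for id 1). The sample table has no column that records the original row order, so reproducing the order shown in the question would need such a column in the ORDER BY.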

With Oracle, you can use the PIVOT feature.
However, you'll need to RANK your rows first, over id-name pairs, and then you can do the pivot directive for (literally) the rank you just generated.
Pages to read:
http://www.oracle.com/technetwork/articles/sql/11g-pivot-097235.html
http://www.oracle-developer.net/display.php?id=506

postgresql - filter out double rows (but not the first and last one)

I have a PostgreSQL problem.
I have a table which looks like this:
id name level timestamp
1 pete 1 100
2 pete 1 200
3 pete 1 500
4 pete 5 900
7 pete 5 1000
9 pete 5 1200
15 pete 2 700
Now I want to delete the rows I don't need. I only want to know the first row where he reaches a new level and the last row where he still has that level.
id name level timestamp
1 pete 1 100
3 pete 1 500
15 pete 2 700
4 pete 5 900
9 pete 5 1200
(There are many more columns, such as realmpoints and so on.)
I have a solution that works if the timestamp is only increasing:
SELECT id, name, level, timestamp
FROM player_testing
WHERE id IN (SELECT MAX(dup.id)
             FROM player_testing AS dup
             GROUP BY dup.name, dup.level
             UNION
             SELECT MIN(dup.id)
             FROM player_testing AS dup
             GROUP BY dup.name, dup.level)
ORDER BY timestamp
But I can't find a way to make it work for my problem.

select id, name, level, timestamp
from (
    select id, name, level, timestamp,
           row_number() over (partition by name, level order by timestamp) as rn,
           count(*) over (partition by name, level) as max_rn
    from player_testing
) t
where rn = 1 or rn = max_rn;
Btw: timestamp is a horrible name for a column. For one thing, it's a reserved word, but more importantly it doesn't document what the column contains. Is it a start_timestamp, an end_timestamp, a valid_until_timestamp, ...?
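The first-and-last-row query can be checked against the sample data with a sketch like this one. It assumes Python's sqlite3 module (SQLite 3.25 or later) in place of PostgreSQL, and it renames the timestamp column to ts, a hypothetical name chosen for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; "ts" replaces the "timestamp" column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_testing "
    "(id INTEGER PRIMARY KEY, name TEXT, level INTEGER, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO player_testing (id, name, level, ts) VALUES (?, ?, ?, ?)",
    [(1, "pete", 1, 100), (2, "pete", 1, 200), (3, "pete", 1, 500),
     (4, "pete", 5, 900), (7, "pete", 5, 1000), (9, "pete", 5, 1200),
     (15, "pete", 2, 700)],
)

# rn numbers the rows of each (name, level) group by ts;
# max_rn is the group size, so rn = max_rn marks the last row of the group.
rows = conn.execute("""
    SELECT id, name, level, ts
    FROM (
        SELECT id, name, level, ts,
               ROW_NUMBER() OVER (PARTITION BY name, level ORDER BY ts) AS rn,
               COUNT(*) OVER (PARTITION BY name, level) AS max_rn
        FROM player_testing
    ) AS t
    WHERE rn = 1 OR rn = max_rn
    ORDER BY ts
""").fetchall()

for row in rows:
    print(row)
```

A group with a single row (level 2 here) has rn = 1 and max_rn = 1, so it is returned once rather than twice.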

Here is an alternative to @a_horse_with_no_name's solution that avoids window functions (OVER ... PARTITION BY) and is therefore more generic SQL. Note that it picks the first and last row of each group by id rather than by timestamp:
select *
from player_testing as A
where id = (
    select min(id)
    from player_testing as B
    where A.name = B.name
      and A.level = B.level
)
or id = (
    select max(id)
    from player_testing as B
    where A.name = B.name
      and A.level = B.level
)
Here is the fiddle to show it working: http://sqlfiddle.com/#!2/47bd44/1

Applying a sort order to existing data using SQL 2008R2

I have some existing data that I need to apply a "SortOrder" to based upon a few factors:
The ordering starts at "1" for any given Owner
The ordering is applied alphabetically (basically following an ORDER BY Name) to increase the sort order.
Should two items have the same name (as I've illustrated in my data set), we can apply the lower sort order value to the item with the lower id.
Here is some sample data to help illustrate what I'm talking about:
What I have:
Id OwnerId Name SortOrder
------ ------- ---------------------- ---------
1 1 A Name NULL
2 1 C Name NULL
3 1 B Name NULL
4 2 Z Name NULL
5 2 Z Name NULL
6 2 A Name NULL
What I need:
Id OwnerId Name SortOrder
------ ------- ---------------------- ---------
1 1 A Name 1
3 1 B Name 2
2 1 C Name 3
6 2 A Name 1
4 2 Z Name 2
5 2 Z Name 3
This could either be done in the form of an UPDATE statement or doing an INSERT INTO (...) SELECT FROM (...) if it's easier to move the data from one table to the next.

Easy - use a CTE (Common Table Expression) and the ROW_NUMBER() ranking function:
;WITH OrderedData AS
(
    SELECT Id, OwnerId, Name,
           ROW_NUMBER() OVER(PARTITION BY OwnerId ORDER BY Name, Id) AS SortOrder
    FROM
        dbo.YourTable
)
SELECT *
FROM OrderedData
ORDER BY OwnerId, SortOrder
The PARTITION BY clause splits your data into one group per OwnerId value, and ROW_NUMBER() then starts counting at 1 for each new group.
Update: If you want to update your table to set the SortOrder column - try this:
;WITH OrderedData AS
(
    SELECT
        Id, OwnerId, Name, SortOrder,
        ROW_NUMBER() OVER(PARTITION BY OwnerId ORDER BY Name, Id) AS RowNum
    FROM
        dbo.YourTable
)
UPDATE OrderedData
SET SortOrder = RowNum
That should set the SortOrder column to the values that the ROW_NUMBER() function returns. Note that the CTE has to select SortOrder so that the UPDATE can assign to it.
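Updating through a CTE as shown is a SQL Server feature. As a portable illustration of the same numbering, the sketch below fills SortOrder with a correlated count and compares it with ROW_NUMBER(). It assumes Python's sqlite3 module (SQLite 3.25 or later) and a hypothetical table name items; it is a swapped-in technique, not the answer's own statement.

```python
import sqlite3

# Assumption: SQLite as the engine; "items" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items "
    "(Id INTEGER PRIMARY KEY, OwnerId INTEGER, Name TEXT, SortOrder INTEGER)"
)
conn.executemany(
    "INSERT INTO items (Id, OwnerId, Name) VALUES (?, ?, ?)",
    [(1, 1, "A Name"), (2, 1, "C Name"), (3, 1, "B Name"),
     (4, 2, "Z Name"), (5, 2, "Z Name"), (6, 2, "A Name")],
)

# For each row, count the rows of the same owner that sort at or before it
# (by Name, with Id as the tie-breaker). That count is its position.
conn.execute("""
    UPDATE items
    SET SortOrder = (
        SELECT COUNT(*)
        FROM items AS o
        WHERE o.OwnerId = items.OwnerId
          AND (o.Name < items.Name
               OR (o.Name = items.Name AND o.Id <= items.Id))
    )
""")

updated_rows = conn.execute(
    "SELECT Id, OwnerId, Name, SortOrder FROM items ORDER BY OwnerId, SortOrder"
).fetchall()

# The same numbering computed with ROW_NUMBER(), for comparison.
window_rows = conn.execute("""
    SELECT Id, OwnerId, Name,
           ROW_NUMBER() OVER (PARTITION BY OwnerId ORDER BY Name, Id) AS rn
    FROM items
    ORDER BY OwnerId, rn
""").fetchall()

for row in updated_rows:
    print(row)
print(updated_rows == window_rows)
```

The correlated count reads each owner's rows once per updated row, so on large tables the ROW_NUMBER() form is likely the better fit where the database allows it.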
