Add information to unique pairs from a table that contains duplicates - sql

I have the following table. In Table_1, (ID, Name) pairs can repeat, and any combination of ID and Name is possible.
Table_1:
ID | Name   | Value1 | Value2
---+--------+--------+-------
1  | John   | 34     | 45
1  | John   | 15     | 78
2  | Randy  | 67     | 12
2  | Randy  | 40     | 46
1  | Randy  | 23     | 85
2  | Holmes | 10     | 100
I want to find all information for all unique pairs. So the output should be:
ID | Name   | Value1 | Value2
---+--------+--------+-------
1  | John   | 34     | 45
2  | Randy  | 67     | 12
1  | Randy  | 23     | 85
2  | Holmes | 10     | 100
When I do SELECT DISTINCT(ID, Name) I get the unique pairs correctly. But how do I add the Value1 and Value2 columns to this? Adding them to the select list causes the pairs to repeat.

You may use DISTINCT ON here (a PostgreSQL extension):
SELECT DISTINCT ON (ID, Name) *
FROM yourTable
ORDER BY ID, Name;
This will return one arbitrary record from each (ID, Name) combination. If you want to choose which of the duplicate records is retained, add another level to the ORDER BY clause. For example, to keep the record with the highest Value2 value in each pair, you could use:
SELECT DISTINCT ON (ID, Name) *
FROM yourTable
ORDER BY ID, Name, Value2 DESC;

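Try row_number() with partition by. Partition by both ID and Name so that each pair is numbered separately; the order by decides which duplicate is kept (here, the one with the highest Value2):
SELECT *
FROM (
    select *,
           row_number() over (partition by ID, Name order by Value2 desc) rn
    from Table_1) as a
where rn = 1;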
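To make the effect of the partition columns concrete, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module as a stand-in database, which is an assumption (the question does not say which engine is in use), and the helper name first_per_partition is made up for illustration. It keeps the row with the highest Value2 in each partition and compares partitioning by (ID, Name) with partitioning by Name alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table_1 (ID INTEGER, Name TEXT, Value1 INTEGER, Value2 INTEGER)"
)
conn.executemany(
    "INSERT INTO Table_1 VALUES (?, ?, ?, ?)",
    [(1, "John", 34, 45), (1, "John", 15, 78), (2, "Randy", 67, 12),
     (2, "Randy", 40, 46), (1, "Randy", 23, 85), (2, "Holmes", 10, 100)],
)

def first_per_partition(partition_cols):
    """Hypothetical helper: keep the highest-Value2 row of each partition."""
    sql = f"""
        SELECT ID, Name, Value1, Value2
        FROM (
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY {partition_cols} ORDER BY Value2 DESC
            ) AS rn
            FROM Table_1
        ) AS a
        WHERE rn = 1
        ORDER BY ID, Name
    """
    return conn.execute(sql).fetchall()

by_pair = first_per_partition("ID, Name")  # one row per (ID, Name) pair
by_name = first_per_partition("Name")      # one row per Name only

print(by_pair)
print(len(by_name))
```

Partitioning by (ID, Name) yields four rows, one per pair, while partitioning by Name alone yields only three, because (1, Randy) and (2, Randy) fall into the same partition.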

Related

SQL compares the value of 2 columns and select the column with max value row-by-row

I have a table something like this:
GROUP | NAME | Value_1 | Value_2
------+------+---------+--------
1     | ABC  | 0       | 0
1     | DEF  | 4       | 4
50    | XYZ  | 6       | 6
50    | QWE  | 6       | 7
100   | XYZ  | 26      | 2
100   | QWE  | 26      | 2
What I would like to do is group by GROUP and select the name with the highest Value_1. If the Value_1 values are the same, compare Value_2 and select the row with the higher one. If they're still the same, select the first one.
The output will be something like:
GROUP | NAME | Value_1 | Value_2
------+------+---------+--------
1     | DEF  | 4       | 4
50    | QWE  | 6       | 7
100   | XYZ  | 26      | 2
The challenge for me here is that I don't know how many categories there are in NAME, so a simple CASE WHEN does not work. Thanks for your help.
You can use window functions to solve the bulk of your problem:
select t.*
from (select t.*,
             -- "GROUP" is quoted because group is a reserved word
             row_number() over (partition by "GROUP" order by value_1 desc, value_2 desc) as seqnum
      from t
     ) t
where seqnum = 1;
The one caveat is the condition:
If they're still the same, select the first one.
SQL tables represent unordered (multi-) sets. There is no "first" one unless a column specifies the ordering. The best you can do is choose an arbitrary value when all the other values are the same.
That said, you might have another column that has an ordering. If so, add that as a third key to the order by.
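As a rough illustration of that third ordering key, the sketch below runs the window query through Python's built-in sqlite3 module (an assumed stand-in engine). The seq column is hypothetical: it is added here only to record arrival order, and the original table has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "seq" is a hypothetical column recording arrival order; it is not in the question's table.
conn.execute(
    'CREATE TABLE t (seq INTEGER, "GROUP" INTEGER, NAME TEXT, '
    "Value_1 INTEGER, Value_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "ABC", 0, 0), (2, 1, "DEF", 4, 4), (3, 50, "XYZ", 6, 6),
     (4, 50, "QWE", 6, 7), (5, 100, "XYZ", 26, 2), (6, 100, "QWE", 26, 2)],
)

query = '''
SELECT "GROUP", NAME, Value_1, Value_2
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY "GROUP"
               ORDER BY Value_1 DESC, Value_2 DESC, seq  -- seq breaks full ties
           ) AS seqnum
    FROM t
) AS ranked
WHERE seqnum = 1
ORDER BY "GROUP"
'''
best_per_group = conn.execute(query).fetchall()
print(best_per_group)
```

In group 100 both rows tie on Value_1 and Value_2, so only the extra key makes the choice of XYZ deterministic. Without it, either row could legitimately be returned.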

How to select all duplicate rows except original one?

Let's say I have a table
CREATE TABLE names (
id SERIAL PRIMARY KEY,
name CHARACTER VARYING
);
with data
id name
-------------
1 John
2 John
3 John
4 Jane
5 Jane
6 Jane
I need to select all duplicate rows by name except the original one. So in this case I need the result to be this:
id name
-------------
2 John
3 John
5 Jane
6 Jane
How do I do that in Postgresql?
You can use ROW_NUMBER() to identify the 'original' records and filter them out. Here is a method using a cte:
with Nums AS (SELECT id,
name,
ROW_NUMBER() over (PARTITION BY name ORDER BY ID ASC) RN
FROM names)
SELECT *
FROM Nums
WHERE RN <> 1; --Filter out rows numbered 1, 'originals'
Alternatively, exclude the smallest id of each name with a subquery:
select * from names
where id not in (select min(id) from names group by name);
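A quick way to check that the two approaches agree is to run both against the sample data. This sketch uses Python's built-in sqlite3 module rather than PostgreSQL (an assumption made so the example is self-contained), with INTEGER PRIMARY KEY standing in for SERIAL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY auto-assigns 1, 2, 3, ... much like SERIAL does in PostgreSQL.
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names (name) VALUES (?)",
    [("John",)] * 3 + [("Jane",)] * 3,
)

# Approach 1: number the rows per name and drop the first of each.
via_row_number = conn.execute("""
    WITH Nums AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id ASC) AS rn
        FROM names
    )
    SELECT id, name FROM Nums WHERE rn <> 1 ORDER BY id
""").fetchall()

# Approach 2: drop every row whose id is the minimum for its name.
via_not_in = conn.execute("""
    SELECT id, name FROM names
    WHERE id NOT IN (SELECT MIN(id) FROM names GROUP BY name)
    ORDER BY id
""").fetchall()

print(via_row_number)
print(via_not_in)
```

Both queries treat the row with the smallest id per name as the "original", so they return the same four rows here.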

Oracle: Get the smaller values and the first greater value

I have a table like this;
ID Name Value
1 Sample1 10
2 Sample2 20
3 Sample3 30
4 Sample4 40
And I would like to get all of the rows that contain smaller values and the first row that contains greater value.
For example when I send '25' as a parameter to Value column, I want to have following table;
ID Name Value
1 Sample1 10
2 Sample2 20
3 Sample3 30
I'm stuck at this point, thanks in advance.
Analytic functions to the rescue!
create table your_table (
  id number,
  value number);
insert into your_table
select level, level * 10
from dual
connect by level <= 5;
select * from your_table;
id | value
----+------
1 | 10
2 | 20
3 | 30
4 | 40
5 | 50
Ok, now we use lag(). Specify field, offset and the default value (for the first row that has no previous one).
select id, value, lag(value, 1, value) over (order by value) previous_value
from your_table
id | value | previous_value
---+-------+---------------
1 | 10 | 10
2 | 20 | 10
3 | 30 | 20
4 | 40 | 30
5 | 50 | 40
Now apply where.
select id, value
from (
select id, value, lag(value, 1, value) over (order by value) previous_value
from your_table)
where previous_value < 25
Works for me.
id | value
----+------
1 | 10
2 | 20
3 | 30
Of course you have to have some policy on ties. For example, what happens if two rows have the same value and both are the first greater one: do you want to keep both or only one of them? Or maybe you have some other criterion for breaking the tie (say, sort by id). But the idea is fairly simple.
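The lag() idea can be tried outside Oracle as well. The sketch below is an approximation using Python's built-in sqlite3 module (an assumption; SQLite's lag() takes the same three arguments), and the function name smaller_plus_next is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 6)],
)

def smaller_plus_next(threshold):
    """Hypothetical helper: rows whose previous value is below the threshold."""
    sql = """
        SELECT id, value
        FROM (
            SELECT id, value,
                   LAG(value, 1, value) OVER (ORDER BY value) AS previous_value
            FROM your_table
        )
        WHERE previous_value < ?
        ORDER BY value
    """
    return conn.execute(sql, (threshold,)).fetchall()

print(smaller_plus_next(25))
print(smaller_plus_next(5))
```

One edge case is worth knowing: when the threshold is below every value (5 in the second call), the default makes the first row compare against itself, so the query returns nothing rather than the first greater row.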
You can try a query like this (TOP is SQL Server syntax):
SELECT * FROM YourTableName WHERE Value < 25 OR ID IN (SELECT TOP 1 ID FROM YourTableName WHERE Value >= 25 ORDER BY Value)
In Oracle, you can try this (but see That Young Man's answer, which I think is better than mine):
SELECT * FROM (
    SELECT ID, NAME, VALUE, 1 AS RN
    FROM YT
    WHERE VALUE < 25
    UNION ALL
    -- >= (not >) so that a row with VALUE exactly 25 is not dropped
    SELECT ID, NAME, VALUE, ROW_NUMBER() OVER (ORDER BY VALUE) AS RN
    FROM YT
    WHERE VALUE >= 25
) A
WHERE RN = 1;

Grouping results in sql query by a field in the result

I have a table with the following format:
User | Entity | ID
123 AB 1
123 AB 2
543 BC 3
098 CB 4
543 BC 5
543 ZG 6
etc...
I want to get a result set that only returns the User/Entity pairs and their ID for the greatest ID, so this result for example:
User | Entity | ID
123 AB 2
098 CB 4
543 BC 5
543 ZG 6
Is there any way to do this in SQL?
Try group by with the max function:
select user, Entity, max(id) as id
from table
group by user, Entity
You can also use a CTE with PARTITION BY, like this:
;WITH CTE as
(
    SELECT
        Users, Entity,
        -- partition by both columns so each User/Entity pair is ranked separately
        ROW_NUMBER() OVER(PARTITION BY Users, Entity ORDER BY ID DESC) AS Row,
        Id
    FROM Item
)
SELECT Users, Entity, Id From CTE Where Row = 1
Note that we used Order By ID DESC as we need highest ID. You can delete DESC if you want the smallest ID.
SQLFiddle: http://sqlfiddle.com/#!3/1dcb9/4
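For comparison, both approaches can be run side by side on the sample data. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine (an assumption); the table name Item and column name Users follow the CTE answer, and the window query partitions by both Users and Entity so each pair is ranked on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Users is stored as text so that '098' keeps its leading zero.
conn.execute("CREATE TABLE Item (Users TEXT, Entity TEXT, Id INTEGER)")
conn.executemany(
    "INSERT INTO Item VALUES (?, ?, ?)",
    [("123", "AB", 1), ("123", "AB", 2), ("543", "BC", 3),
     ("098", "CB", 4), ("543", "BC", 5), ("543", "ZG", 6)],
)

# Approach 1: plain aggregation.
via_group_by = conn.execute("""
    SELECT Users, Entity, MAX(Id) AS Id
    FROM Item
    GROUP BY Users, Entity
    ORDER BY Id
""").fetchall()

# Approach 2: rank rows per (Users, Entity) pair, highest Id first.
via_row_number = conn.execute("""
    WITH CTE AS (
        SELECT Users, Entity, Id,
               ROW_NUMBER() OVER (
                   PARTITION BY Users, Entity ORDER BY Id DESC
               ) AS rn
        FROM Item
    )
    SELECT Users, Entity, Id FROM CTE WHERE rn = 1 ORDER BY Id
""").fetchall()

print(via_group_by)
print(via_row_number)
```

When only the key columns and the maximum are needed, GROUP BY is the simpler of the two. The ROW_NUMBER() form becomes useful once other columns from the winning row have to come along.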

Oracle SQL: retrieve sum and value from row in group based on value of another column

I am trying to create a summary query that returns the sum of the quantity for each group along with the description from the row with the largest quantity in that group.
For example, if the table looks like this:
GROUP QTY DESC
----- --- ----
1 23 CCC
1 42 AAA
1 61 BBB
2 11 ZZZ
2 53 XXX
2 32 YYY
The query would return:
1 126 BBB (desc from row with largest qty for group 1)
2 96 XXX (desc from row with largest qty for group 2)
Thanks!
The window function row_number() is your friend for this type of query. It assigns a sequential number to the rows within each group, here ordered by quantity descending. You can then use this information in an aggregation:
select "GROUP", sum(qty), max(case when seqnum = 1 then "DESC" end)
from (select t.*,
             row_number() over (partition by "GROUP" order by qty desc) as seqnum
      from t
     ) t
group by "GROUP"
By the way, group and desc are lousy names for columns because they conflict with reserved words. You should rename them or enclose them in double quotes in the query.
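Here is a small runnable sketch of the quoted-identifier version, using Python's built-in sqlite3 module in place of Oracle (an assumption made to keep the example self-contained). The table name t follows the answer; the column names are double-quoted exactly because they collide with reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "GROUP" and "DESC" must be double-quoted because they are reserved words.
conn.execute('CREATE TABLE t ("GROUP" INTEGER, QTY INTEGER, "DESC" TEXT)')
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 23, "CCC"), (1, 42, "AAA"), (1, 61, "BBB"),
     (2, 11, "ZZZ"), (2, 53, "XXX"), (2, 32, "YYY")],
)

query = '''
SELECT "GROUP",
       SUM(QTY) AS total_qty,
       -- only the top-ranked row contributes a non-NULL description
       MAX(CASE WHEN seqnum = 1 THEN "DESC" END) AS top_desc
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY "GROUP" ORDER BY QTY DESC) AS seqnum
    FROM t
) AS ranked
GROUP BY "GROUP"
ORDER BY "GROUP"
'''
summary = conn.execute(query).fetchall()
print(summary)
```

With the sample rows, the quantities sum to 126 for group 1 and 96 for group 2, and the descriptions picked are BBB and XXX.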