SQL to merge max values from multiple rows - sql

Suppose I have a table:
-----------------------------------------------
| id | value1 | value2 | value3 |
-----------------------------------------------
| 102 | 10 | 1 | 3 |
-----------------------------------------------
| 102 | 2 | 11 | 0 |
-----------------------------------------------
| 102 | 0 | 9 | 13 |
-----------------------------------------------
| 102 | 3 | 5 | 7 |
-----------------------------------------------
For each distinct id I want to return one row with the max value of each of the columns value1, value2 and value3, i.e.
-----------------------------------------------
| id | value1 | value2 | value3 |
-----------------------------------------------
| 102 | 10 | 11 | 13 |
-----------------------------------------------
(of course there are other ids than 102 in the table)
I managed to do it with "partition by", but I have to use the query in PowerBuilder's DataWindow, and as soon as I paste it there the whole IDE crashes and the project gets corrupted.
I also managed to write an SQL query that, for each row, does 3 inner joins against subqueries that each return the max of one column.
Is there an easier way?
Thanks in advance for answering!

Use GROUP BY and MAX():
SELECT id,
MAX(value1) val1,
MAX(value2) val2,
MAX(value3) val3
FROM tableName
GROUP BY ID

SELECT id, MAX(value1) value1, MAX(value2) value2, MAX(value3) value3
FROM yourtable
GROUP BY id
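As a quick sanity check of the GROUP BY / MAX() approach, here is a minimal sketch that runs it against the sample rows with Python's built-in sqlite3 module. The SQLite engine, the table name readings, and the extra id 103 are assumptions made for the demo; they are not part of the question.

```python
import sqlite3

# In-memory SQLite database, used purely for illustration.
# The table name "readings" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER, value1 INTEGER, value2 INTEGER, value3 INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (102, 10, 1, 3),
        (102, 2, 11, 0),
        (102, 0, 9, 13),
        (102, 3, 5, 7),
        (103, 4, 4, 4),  # extra id (assumption) to show one output row per id
    ],
)

# One row per id, each column holding its own maximum.
max_rows = conn.execute(
    """
    SELECT id, MAX(value1), MAX(value2), MAX(value3)
    FROM readings
    GROUP BY id
    ORDER BY id
    """
).fetchall()
print(max_rows)
```

Each MAX() is computed independently per column, so the values in an output row may come from different input rows, which is what the question asks for.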

Related

I need to group by one column and show more columns from one dataset

I have the following table:
AMNT1 | COLUMN1 | COLUMN2 | COLUMN3 | GROUP1
--------|-----------|-----------|-------------|--------
1.00 | COL1_ROW1 | COL2_ROW1 | COL3_ROW1 | AAA
9.00 | COL1_ROW2 | COL2_ROW2 | COL2_ROW2 | AAA
2.00 | COL1_ROW3 | COL2_ROW3 | COL3_ROW3 | BBB
3.00 | COL1_ROW4 | COL2_ROW4 | COL3_ROW4 | CCC
I want to sum AMNT1 grouped by GROUP1:
SELECT GROUP1, SUM(AMNT1) FROM ND_TEST GROUP BY GROUP1;
GROUP1 | SUM(AMNT1)
-------|-----------
AAA | 10.00
BBB | 2.00
CCC | 3.00
Additionally, I want to select COLUMN1, COLUMN2 and COLUMN3 from ONE row of each group. So my output should look like this:
GROUP1 | SUM(AMNT1)| COLUMN1 | COLUMN2 | COLUMN3 |
-------|-----------|-----------|-----------|------------|
AAA | 10.00 | COL1_ROW1 | COL2_ROW1 | COL3_ROW1 |
BBB | 2.00 | COL1_ROW3 | COL2_ROW3 | COL3_ROW3 |
CCC | 3.00 | COL1_ROW4 | COL2_ROW4 | COL3_ROW4 |
If I use sum over partition, I get duplicates per group. If I use aggregate functions, I don't get the values from the same row.
Do you have an idea?
Thank you!
select group1, sum_amnt1, column1, column2, column3
from (
select group1, sum(amnt1) over (partition by group1) as sum_amnt1,
column1, column2, column3,
row_number() over (partition by group1 order by null) as rn
from your_table
)
where rn = 1
order by null in the row_number() function corresponds to your clarification (in a Comment) that any row from each group will be fine (you don't care which one).
You can use window functions:
select nt.*
from (select nt.*, sum(AMNT1) over (partition by GROUP1) as sum_amnt1,
row_number() over (partition by GROUP1 order by AMNT1) as seq
from ND_TEST nt
) nt
where seq = 1;
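To see the "window sum plus row_number" idea on the sample data, here is a small sketch using Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the original question appears to target a different database, so treat this only as an illustration of the technique.

```python
import sqlite3

# In-memory SQLite database for illustration; requires SQLite 3.25+
# for window functions (an assumption about the local build).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nd_test (amnt1 REAL, column1 TEXT, column2 TEXT, column3 TEXT, group1 TEXT)"
)
conn.executemany(
    "INSERT INTO nd_test VALUES (?, ?, ?, ?, ?)",
    [
        (1.0, "COL1_ROW1", "COL2_ROW1", "COL3_ROW1", "AAA"),
        (9.0, "COL1_ROW2", "COL2_ROW2", "COL2_ROW2", "AAA"),
        (2.0, "COL1_ROW3", "COL2_ROW3", "COL3_ROW3", "BBB"),
        (3.0, "COL1_ROW4", "COL2_ROW4", "COL3_ROW4", "CCC"),
    ],
)

# SUM() OVER gives every row its group total; ROW_NUMBER() numbers the rows
# inside each group so that exactly one row per group can be kept.
group_rows = conn.execute(
    """
    SELECT group1, sum_amnt1, column1, column2, column3
    FROM (
        SELECT t.*,
               SUM(amnt1) OVER (PARTITION BY group1) AS sum_amnt1,
               ROW_NUMBER() OVER (PARTITION BY group1 ORDER BY amnt1) AS seq
        FROM nd_test t
    )
    WHERE seq = 1
    ORDER BY group1
    """
).fetchall()
for row in group_rows:
    print(row)
```

Ordering the row_number() by amnt1 picks the row with the smallest amount in each group; any other ordering works if a different representative row is wanted.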

Over Partition to find duplicates and remove them based on criteria SQL

I hope everyone is doing well. I have a dilemma that I cannot quite figure out.
I am trying to find the values of a field that belong to only one owner.
For example:
Table 1
|Col1 | Col2| Col3 |
| 123 | A | 1 |
| 123 | A | 2 |
| 12 | B | 1 |
| 12 | B | 2 |
| 12 | C | 3 |
| 12 | D | 4 |
| 1 | A | 1 |
| 2 | D | 1 |
| 3 | D | 1 |
Col1 is the field that can have duplicate values. Col2 is the owner of the value in Col1. Col3 uses ROW_NUMBER() OVER (PARTITION BY ...) to number the rows within each Col1 value in ascending order.
The goal I am trying to accomplish is to remove the value in Col1 if it is not truly unique when looking at Col2.
Example:
Col1 has the value 123, Col2 has the value A. Although there are two instances of 123 being owned by A, I can determine that it is indeed unique.
Now look at Col1 that has the value 12 with values in Col2 of B, C, D.
Value 12 is associated with three different owners, which eliminates 12 from our result list.
So in the end I would like to see a result table such as this:
|Col1 | Col2|
| 123 | A |
| 1 | A |
| 2 | D |
| 3 | D |
To summarize, I would like to first use the partition numbers to identify whether the value in Col1 is repeated. From there I want to verify that the values in Col2 are the same. If so, the value in Col1 and Col2 remains as one single entry. However, if the values in Col2 do not match, all records for that Col1 value are removed.
I will provide the code for my query if needed.
Update**
I failed to mention that table 1 is the result of inner joining two tables.
So Col1 comes from table a and Col2 comes from table b.
The values in table a for Colx are hard to interpret, so I had to make sense of them and assign them proper names (Col2).
The join query I used to combine the two is:
Select a.Col1, B.Col2 FROM Table a INNER JOIN Table b on a.Colx = b.Colx
Update**
Table a:
|Col1 | Colx| Col3 |
| 123 | SMS | 1 |
| 123 | S9W | 2 |
| 12 | NAV | 1 |
| 12 | NFR | 2 |
| 12 | ABC | 3 |
| 12 | DEF | 4 |
| 1 | SMS | 1 |
| 2 | DEF | 1 |
| 3 | DES | 1 |
Table b:
|Colx | Col2|
| SMS | A |
| S9W | A |
| DEF | D |
| DES | D |
| NAV | B |
| NFR | B |
| ABC | C |
Above are sample data for both tables that get joined in order to create the first table displayed in this body.
Thank you all so much!
NOT EXISTS operator can be used to do this task:
SELECT distinct Col1 , Col2
FROM table t
WHERE NOT EXISTS(
SELECT 1 FROM table t1
WHERE t.col1=t1.col1 AND t.col2 <> t1.col2
)
If I understand correctly, you want:
select col1, min(col2) as col2
from t
group by col1
having min(col2) = max(col2);
I think the third column is confusing you. It doesn't seem to play any role in the logic you want.
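For comparison, here is a minimal sketch that checks the "single owner per Col1" rule on the sample data in two ways: a NOT EXISTS filter, and a GROUP BY that keeps a Col1 value only when its smallest and largest Col2 are equal. It uses Python's sqlite3 module; the table name joined is hypothetical and stands in for the result of the inner join described in the question.

```python
import sqlite3

# In-memory SQLite database for illustration; "joined" is a hypothetical
# table holding the already-joined (Col1, Col2) pairs from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joined (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?)",
    [(123, "A"), (123, "A"), (12, "B"), (12, "B"), (12, "C"),
     (12, "D"), (1, "A"), (2, "D"), (3, "D")],
)

# Variant 1: keep a row only if no other row has the same col1
# with a different col2.
not_exists_rows = conn.execute(
    """
    SELECT DISTINCT col1, col2
    FROM joined t
    WHERE NOT EXISTS (
        SELECT 1 FROM joined t1
        WHERE t.col1 = t1.col1 AND t.col2 <> t1.col2
    )
    ORDER BY col1
    """
).fetchall()

# Variant 2: a group has a single owner exactly when MIN(col2) = MAX(col2).
having_rows = conn.execute(
    """
    SELECT col1, MIN(col2) AS col2
    FROM joined
    GROUP BY col1
    HAVING MIN(col2) = MAX(col2)
    ORDER BY col1
    """
).fetchall()

print(not_exists_rows)
print(having_rows)
```

Both variants drop every row for Col1 = 12, because that value is tied to owners B, C and D.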

SQL - How to do something like value.Contains?

Can someone help me? I need to exclude some repeated values. The result is:
There are some rows with null values, and in that case I show 'No Informado'.
In lines 26 to 32, value1 and value2 are the same, but value3 is different.
I need this result:
id | name | user
0x00E281759429DD4B807F467F8B2319E3 | PC_XBPOX0112 | llopez
0x00F37F5DA2C8854699EFBA30F7102DDD | PC_BSCTY1312 | No Informado
0x00F53DBE60CFF343942E3893ABA809EB | PC_SVCTY6834 | ntapia
0x00FDB75C00B8D84E8A1862A56C71A766 | NB_TSCTY06606 | jogonzalez
0x010029519191B34BB498E7F9FEAE3E21 | PC_BSCTY3229 | kfuentes
0x011506756396BC4588E705BFCFA84847 | PC_BSCTY3134 | csepulveda
0x0120BE537B242C4EB01C4F94E82E64BF | PC_BSCTY1296 | eaviles
0x01322ABEC4F19E41B2139291952838EE | PC_VSCTY6535 | vbravo
0x0133C6B80B50E44A928AF770510856E3 | PC_FSCTY0084 | mcarreno
0x01463ECF32DEBD41943330EC7C1822D4 | PC_BSCTY3220 | fegonzalez
0x01610C718C04264A8349FAEA6676363F | PC-FSCTY0543 | fcastro
Can someone help me?
Thanks in advance!
Another option is the WITH TIES clause in concert with Row_Number()
Example
Select Top 1 With Ties *
From YourTable
Order by Row_Number() over (Partition By ID Order by Date Desc)
Returns
id name date
1 name1 2018-01-01
2 name2 2018-01-01
3 name5 2018-02-01
SELECT Id
, MAX(name) AS Name
, MAX([date]) AS [date]
FROM TableName
GROUP BY Id

Selecting duplicates with a unique identifying column

I have a table that looks like this (simplified)
| uniqueID | value1 | value2 | value3 |
|:--------:|:------:|:------:|:------:|
| 1 | a | b | c |
| 2 | e | f | g |
| 3 | a | b | c |
| 4 | a | b | c |
| 5 | e | f | g |
The end goal is to get a list of uniqueIDs that have the same value1, value2, and value3, but without the first occurrence. For the table above I would ideally like the result of the query to be:
| uniqueID |
|:--------:|
| 3 |
| 4 |
| 5 |
This way I can then remove those uniqueIDs from the table later. My current code looks like this:
select value1, value2, value3, count(*)
from myTable
group by value1, value2, value3 having count(*) > 1;
This gets me:
| value1 | value2 | value3 | count(*) |
|:------:|:------:|:------:|:--------:|
| a | b | c | 3 |
| e | f | g | 2 |
Which works great to see which set of values are duplicated but does not help me identify the uniqueID for them.
Thanks
You might try something like this:
SELECT uniqueID, value1, value2, value3 FROM (
SELECT uniqueID, value1, value2, value3
, ROW_NUMBER() OVER ( PARTITION BY value1, value2, value3 ORDER BY uniqueID ) AS rn
FROM mytable
) WHERE rn > 1;
This will get all the unique combinations of values for which more than one exists and will eliminate the first (by filtering on the result of ROW_NUMBER()) where "first" is the minimum value of uniqueID for that combination.
If you wanted to get the ones that you don't want removed, you could do the following instead:
SELECT uniqueID, value1, value2, value3 FROM (
SELECT uniqueID, value1, value2, value3
, ROW_NUMBER() OVER ( PARTITION BY value1, value2, value3 ORDER BY uniqueID ) AS rn
FROM mytable
) WHERE rn = 1;
EDIT: Fixed some identifier names. Really, not a good idea to use CamelCase and headlessCamelCase in Oracle, where your table names and column names are just going to be converted to uppercase (unless you quote your identifiers).
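Here is a small sketch that runs the rn > 1 filter against the sample table using Python's sqlite3 module (assuming SQLite 3.25 or later for ROW_NUMBER()). It also shows a window-free variant based on MIN(uniqueID); that second query is my own addition for engines or tools that cannot handle window functions, not something taken from the answer above.

```python
import sqlite3

# In-memory SQLite database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (uniqueID INTEGER, value1 TEXT, value2 TEXT, value3 TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [(1, "a", "b", "c"), (2, "e", "f", "g"), (3, "a", "b", "c"),
     (4, "a", "b", "c"), (5, "e", "f", "g")],
)

# Number the rows inside each (value1, value2, value3) combination by uniqueID
# and keep everything except the first occurrence.
dup_ids_window = [r[0] for r in conn.execute(
    """
    SELECT uniqueID FROM (
        SELECT uniqueID,
               ROW_NUMBER() OVER (
                   PARTITION BY value1, value2, value3 ORDER BY uniqueID
               ) AS rn
        FROM mytable
    )
    WHERE rn > 1
    ORDER BY uniqueID
    """
)]

# Window-free variant (an assumption, not from the answer): every id that is
# not the smallest id of its combination is a later duplicate.
dup_ids_min = [r[0] for r in conn.execute(
    """
    SELECT uniqueID FROM mytable
    WHERE uniqueID NOT IN (
        SELECT MIN(uniqueID) FROM mytable GROUP BY value1, value2, value3
    )
    ORDER BY uniqueID
    """
)]

print(dup_ids_window, dup_ids_min)
```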

SQL DB2 Select multiple columns values for multiple instances of IDs

Here is my data:
| ID | FIELD1 | FIELD2 | FIELD3 |
|-------------------------------|
| 1 | NULL | value1 | value2 |
|-------------------------------|
| 2 | NULL | value3 | NULL |
|-------------------------------|
| 3 | value4 | NULL | NULL |
|-------------------------------|
| 4 | value5 | value6 | value7 |
|-------------------------------|
| .. | ... | .... | .... |
Here is what I need to select:
| ID | ID2 | FIELDX |
|-------------------|
| 1 | 10 | value1 |
| 1 | 10 | value2 |
| 2 | 20 | value3 |
| 3 | 30 | value4 |
| 4 | 40 | value5 |
| 4 | 40 | value6 |
| 4 | 40 | value7 |
| .. | .. | .... |
The order of the data doesn't really matter. What matters is that each ID appears once for every associated FIELD1,2,3... value. Please note that there are many fields. I just chose to use these three as an example.
My attempt at the solution was this query:
SELECT x.ID, a.ID2, x.FIELDX
FROM (
SELECT t.ID, t.FIELD1
FROM SCHEMA1.TABLE1 t
UNION ALL
SELECT t.ID, t.FIELD2
FROM SCHEMA1.TABLE1 t
UNION ALL
SELECT t.ID, t.FIELD3
FROM SCHEMA1.TABLE1 t
) x
JOIN SCHEMA2.TABLE2 a ON x.ID = a.ID
WHERE x.FIELDX != NULL
WITH UR;
While this does do the job, I would rather not have to add a new inner select statement for each additional field. Moreover, I feel as though there is a more efficient way to do it.
Please advise.
DB2 doesn't have an explicit unpivot and your method is fine. A more efficient method is probably to do:
SELECT id, id2, fieldx
FROM (SELECT x.ID, a.ID2,
(case when col = 'field1' then field1
when col = 'field2' then field2
when col = 'field3' then field3
end) as FIELDX
FROM SCHEMA1.TABLE1 x join
SCHEMA2.TABLE2 a
on x.ID = a.ID cross join
(select 'field1' as col from sysibm.sysdummy1 union all
select 'field2' from sysibm.sysdummy1 union all
select 'field3' from sysibm.sysdummy1
) c
) x
WHERE x.FIELDX is not NULL;
This doesn't necessarily simplify the code. It does make it easier for DB2 to optimize the joins. And it only requires reading table1 once instead of one time for each column.
As a note: you should use fieldx is not null rather than fieldx != null.
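The NULL comparison point is easy to demonstrate. The sketch below unpivots the sample rows with UNION ALL and compares the two filters, using Python's sqlite3 module. SQLite stands in for DB2 here, which is an assumption of the demo, though the three-valued NULL logic it shows is standard SQL. The table name table1 follows the question, and the ID2 join is left out to keep the example short.

```python
import sqlite3

# In-memory SQLite database for illustration; SQLite is a stand-in for DB2.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, field1 TEXT, field2 TEXT, field3 TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, None, "value1", "value2"), (2, None, "value3", None),
     (3, "value4", None, None), (4, "value5", "value6", "value7")],
)

# Unpivot: one SELECT per source column, stacked with UNION ALL.
unpivot = """
    SELECT id, fieldx FROM (
        SELECT id, field1 AS fieldx FROM table1
        UNION ALL
        SELECT id, field2 FROM table1
        UNION ALL
        SELECT id, field3 FROM table1
    )
    WHERE {predicate}
    ORDER BY id, fieldx
"""

# "fieldx != NULL" evaluates to NULL (unknown) for every row, so nothing passes.
rows_not_equal = conn.execute(unpivot.format(predicate="fieldx != NULL")).fetchall()

# "fieldx IS NOT NULL" is the correct test and keeps the populated values.
rows_is_not_null = conn.execute(unpivot.format(predicate="fieldx IS NOT NULL")).fetchall()

print(rows_not_equal)
print(rows_is_not_null)
```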