sql sum duplicates in multiple columns - sql

I'm trying to sum values from duplicate rows (with the same ID, Month and Person) in multiple columns into the first or the last duplicate row, and then delete the duplicate rows except the one holding the total.
The biggest problem is that sometimes I need to sum values in two different columns.
PrimaryTable:
ID Month Person Value1 Value2
**123 1 Smith** 10 20
**123 1 Smith** 5 NULL
**123 1 Smith** NULL 5
123 2 Smith 10 5
**189 3 Murphy** NULL 15
**189 3 Murphy** NULL 10
190 2 Brown 25 25
**345 2 Lee** 25 20
**345 2 Lee** 25 20
Result1 (expected result after sum duplicates values to the first one):
ID Month Person Value1 Value2
123 1 Smith **15** **25**
123 1 Smith 5 NULL
123 1 Smith NULL 5
123 2 Smith 10 5
189 3 Murphy NULL **25**
189 3 Murphy NULL 10
190 2 Brown 25 25
345 2 Lee **50** **40**
345 2 Lee 25 20
FinalTable (expected result after deleting duplicates, except the first one):
ID Month Person Value1 Value2
123 1 Smith **15** **25**
123 2 Smith 10 5
189 3 Murphy NULL **25**
190 2 Brown 25 25
345 2 Lee **50** **40**
I'm trying with this code:
SELECT ID, Month, Person, SUM(Value1), SumValue2
FROM
(
SELECT ID, Month, Person, Value1, SUM(Value2) AS SumValue2
FROM db.Hours
GROUP BY ID, Month, Person, Value1
)
GROUP BY ID, Month, Person, SumValue2
But sometimes it double-counts the total of Value2.

SELECT ID, Month, Person, SUM(Value1) as SumValue1, SUM(Value2) AS SumValue2
FROM db.Hours
GROUP BY ID, Month, Person
I am not sure why you are treating this as two steps. There is no separate duplicate-removal step; this is a plain GROUP BY aggregation, where you group like rows and summarize the value columns. The only reason to make this a multi-step operation would be if one of your value columns had to be part of the grouping, e.g. ID, Month, Person, and Value1. In your case you simply need to group by ID, Month, Person and aggregate Value1 and Value2.
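To check the single GROUP BY against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the unqualified table name Hours (instead of db.Hours) are assumptions made only so the example runs on its own. SUM ignores NULLs and returns NULL when every value in the group is NULL, which is why Murphy keeps NULL in Value1.

```python
import sqlite3

# In-memory database standing in for db.Hours (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Hours (ID INTEGER, Month INTEGER, Person TEXT, "
    "Value1 INTEGER, Value2 INTEGER)"
)
conn.executemany(
    "INSERT INTO Hours VALUES (?, ?, ?, ?, ?)",
    [
        (123, 1, "Smith", 10, 20),
        (123, 1, "Smith", 5, None),
        (123, 1, "Smith", None, 5),
        (123, 2, "Smith", 10, 5),
        (189, 3, "Murphy", None, 15),
        (189, 3, "Murphy", None, 10),
        (190, 2, "Brown", 25, 25),
        (345, 2, "Lee", 25, 20),
        (345, 2, "Lee", 25, 20),
    ],
)

# One grouping pass: duplicates collapse into a single row per key,
# and both value columns are summed independently.
final_table = conn.execute(
    """
    SELECT ID, Month, Person,
           SUM(Value1) AS SumValue1,
           SUM(Value2) AS SumValue2
    FROM Hours
    GROUP BY ID, Month, Person
    ORDER BY ID, Month
    """
).fetchall()

for row in final_table:
    print(row)
```

The rows printed match the FinalTable from the question, so no intermediate "sum into the first row, then delete" step is required.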

Related

How to query: "for which do these values apply"?

I'm trying to match and align data or, put another way, count occurrences and then list the values for which those occurrences occur.
Or, in a question: "How many times does each ID value occur, and for what names?"
For example, with this input
Name ID
-------------
jim 123
jim 234
jim 345
john 123
john 345
jane 234
jane 345
jan 45678
I want the output to be:
count ID name name name
------------------------------------
3 345 jim john jane
2 123 jim john
2 234 jim jane
1 45678 jan
Or similarly, the input could be (noticing that the ID values are not aligned),
jim john jane jan
----------------------------
123 345 234 45678
234 123 345
345
but that seems to complicate things.
The closest I have come to the desired result is this SQL:
select ID, count(ID) as count
from table
group by (ID)
order by count desc
which outputs
ID count
------------
345 3
123 2
234 2
45678 1
I'd appreciate any help.
You seem to want a pivot. In SQL, you have to specify the number of columns in advance (unless you construct the query as a string).
But the idea is:
select ID, count(*) as cnt,
max(case when seqnum = 1 then name end) as name_1,
max(case when seqnum = 2 then name end) as name_2,
max(case when seqnum = 3 then name end) as name_3
from (select t.*,
row_number() over (partition by id order by id) as seqnum -- arbitrary ordering
from table t
) t
group by ID
order by cnt desc;
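As a rough check of the conditional-aggregation pivot, the sketch below runs it on the sample data with Python's sqlite3 module. The table name name_ids is hypothetical, and the window function needs SQLite 3.25 or newer. It orders by name inside each partition (rather than the arbitrary ordering above) so the column assignment is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# name_ids is a hypothetical table name for the question's input.
conn.execute("CREATE TABLE name_ids (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO name_ids VALUES (?, ?)",
    [
        ("jim", 123), ("jim", 234), ("jim", 345),
        ("john", 123), ("john", 345),
        ("jane", 234), ("jane", 345),
        ("jan", 45678),
    ],
)

# Number the names within each id, then spread them over fixed columns.
pivot = conn.execute(
    """
    SELECT id, COUNT(*) AS cnt,
           MAX(CASE WHEN seqnum = 1 THEN name END) AS name_1,
           MAX(CASE WHEN seqnum = 2 THEN name END) AS name_2,
           MAX(CASE WHEN seqnum = 3 THEN name END) AS name_3
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS seqnum
          FROM name_ids t) t
    GROUP BY id
    ORDER BY cnt DESC, id
    """
).fetchall()

for row in pivot:
    print(row)
```

Ids with fewer than three names get NULL (None in Python) in the unused columns; an id with more than three names would silently lose the extras, which is the cost of fixing the column count in advance.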
If you have an unknown number of columns, you can aggregate the values into an array:
select ID, count(*) as cnt,
array_agg(name order by name) as names
from table t
group by ID
order by cnt desc;
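array_agg is not available in every database. As a stand-in, the sketch below uses SQLite's GROUP_CONCAT through Python's sqlite3 module; the table name name_ids is hypothetical. Because GROUP_CONCAT does not guarantee element order here, the names are split and sorted on the Python side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# name_ids is a hypothetical table name for the question's input.
conn.execute("CREATE TABLE name_ids (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO name_ids VALUES (?, ?)",
    [
        ("jim", 123), ("jim", 234), ("jim", 345),
        ("john", 123), ("john", 345),
        ("jane", 234), ("jane", 345),
        ("jan", 45678),
    ],
)

# One row per id: the count plus all names joined into one string.
cursor = conn.execute(
    """
    SELECT id, COUNT(*) AS cnt, GROUP_CONCAT(name) AS names
    FROM name_ids
    GROUP BY id
    ORDER BY cnt DESC, id
    """
)

# Split the joined string and sort it so the result is deterministic.
names_per_id = [(id_, cnt, sorted(names.split(","))) for id_, cnt, names in cursor]

for row in names_per_id:
    print(row)
```

This variant handles any number of names per id, at the price of returning a list in one column instead of separate columns.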
If that's what you're looking for, the query would look similar to this:
SELECT
name,
id,
COUNT(id) as count
FROM
dataSet
WHERE
dataSet.name = 'input'
AND dataSet.id = 'input'
GROUP BY
name,
id

SQL query to get only rows match the condition based on two separated columns under one 'group by'

The simple SELECT query would return the data as below:
Select ID, User, Country, TimeLogged from Data
ID User Country TimeLogged
1 Samantha SCO 10
1 John UK 5
1 Andrew NZL 15
2 John UK 20
3 Mark UK 10
3 Mark UK 20
3 Steven UK 10
3 Andrew NZL 15
3 Sharon IRL 5
4 Andrew NZL 25
4 Michael AUS 5
5 Jessica USA 30
I would like to return the sum of time logged for each user, grouped by ID,
but only for IDs whose rows include both of these values: Country = UK and User = Andrew.
So the output in the above example would be
ID User Country TimeLogged
1 John UK 5
1 Andrew NZL 15
3 Mark UK 30
3 Steven UK 10
3 Andrew NZL 15
First you need to identify which IDs you're going to be returning:
SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew';
and based on that, you can then filter to aggregate the expected rows.
SELECT ID,
[User],
Country,
SUM(Timelogged) as Timelogged
FROM mytable
WHERE (Country='UK' OR [User]='Andrew')
AND ID IN( SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew')
GROUP BY ID, [User], country;
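To confirm that the INTERSECT filter reproduces the expected output, here is a minimal sketch with Python's sqlite3 module. The table name Data is taken from the question; quoting the column as "User" with double quotes (instead of square brackets) and the added ORDER BY are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Data (ID INTEGER, "User" TEXT, Country TEXT, TimeLogged INTEGER)'
)
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?, ?)",
    [
        (1, "Samantha", "SCO", 10), (1, "John", "UK", 5), (1, "Andrew", "NZL", 15),
        (2, "John", "UK", 20),
        (3, "Mark", "UK", 10), (3, "Mark", "UK", 20), (3, "Steven", "UK", 10),
        (3, "Andrew", "NZL", 15), (3, "Sharon", "IRL", 5),
        (4, "Andrew", "NZL", 25), (4, "Michael", "AUS", 5),
        (5, "Jessica", "USA", 30),
    ],
)

# Step 1 (subquery): IDs that contain BOTH a UK row and an Andrew row.
# Step 2 (outer query): within those IDs, keep only UK or Andrew rows and sum.
result = conn.execute(
    """
    SELECT ID, "User", Country, SUM(TimeLogged) AS TimeLogged
    FROM Data
    WHERE (Country = 'UK' OR "User" = 'Andrew')
      AND ID IN (SELECT ID FROM Data WHERE Country = 'UK'
                 INTERSECT
                 SELECT ID FROM Data WHERE "User" = 'Andrew')
    GROUP BY ID, "User", Country
    ORDER BY ID, "User"
    """
).fetchall()

for row in result:
    print(row)
```

Only IDs 1 and 3 survive the INTERSECT; IDs 2 and 4 each contain just one of the two values and are dropped.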
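So, you have described what you need to write almost perfectly, but not quite. Your result table indicates that you want Country = UK OR User = Andrew, rather than AND.
You need to select and group by, then include a WHERE. Note that on the sample data this filter alone also returns IDs 2 and 4, which contain only one of the two values: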
Select ID, User, Country, SUM(Timelogged) as Timelogged from mytable
WHERE Country='UK' OR User='Andrew'
Group by ID, user, country

SQL Count/sum multiple columns

I want to count and sum multiple fields in a single query. Sample data and the desired result are listed below:
MemID claimNum ItemID PaidAmt
123 1234 4 5
123 2309 4 5
123 1209 4 5
123 1209 8 2.2
123 1210 8 2.2
Desired result
MemID count(claimNum) count(ItemID) sum(PaidAmt)
123 3 3 15
123 2 2 4.4
It looks like you want to group by both MemID and ItemID:
select MemID, count(claimNum), count(ItemID), sum(PaidAmt)
from the_table
group by MemID, ItemID
Group by ItemID as well:
select MemID, count(claimNum), count(ItemID), sum(PaidAmt)
from my_table
group by MemID, ItemID
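A quick way to check the two-column grouping against the sample data is the sketch below, using Python's sqlite3 module. The table name my_table comes from the answer above; the REAL column type and the ORDER BY are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (MemID INTEGER, claimNum INTEGER, "
    "ItemID INTEGER, PaidAmt REAL)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (123, 1234, 4, 5),
        (123, 2309, 4, 5),
        (123, 1209, 4, 5),
        (123, 1209, 8, 2.2),
        (123, 1210, 8, 2.2),
    ],
)

# Grouping by MemID and ItemID yields one row per item within a member.
totals = conn.execute(
    """
    SELECT MemID, COUNT(claimNum), COUNT(ItemID), SUM(PaidAmt)
    FROM my_table
    GROUP BY MemID, ItemID
    ORDER BY ItemID
    """
).fetchall()

for row in totals:
    print(row)
```

Grouping by MemID alone would merge all five rows into one; adding ItemID to the GROUP BY is what splits the result into the two desired rows.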

DB2 SQL SUM and GROUPING

I am having problems with querying and grouping.
I need the following output:
officr, cbal, sname
ABC, 500.00, TOM JONES
ABC, 200.00, SUE JONES
ABC TOTAL 700.00
RAR, 100.10, JOE SMITH
RAR, 200.05, MILES SMITH
RAR TOTAL 300.15
The SQL below produces the error:
[DB2 for i5/OS]SQL0122 - Column SNAME or expression in SELECT list not valid.
SELECT
lnmast.officr, SUM(LNMAST.CBAL), lnmast.sname
FROM
LNMAST
WHERE LNMAST.RATCOD IN (6,7,8) AND STATUS NOT IN ('2','8')
group by lnmast.officr
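The error occurs because SNAME appears in the SELECT list but is neither aggregated nor part of the GROUP BY. GROUP BY GROUPING SETS is a powerful tool for grouping/cubing data. It lets you combine non-aggregated data with aggregated data in one query result.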
SELECT lnmast.officr, SUM(LNMAST.CBAL), lnmast.sname
FROM LNMAST
WHERE LNMAST.RATCOD IN (6,7,8)
AND STATUS NOT IN ('2','8')
GROUP BY GROUPING SETS ((lnmast.officr, lnmast.sname),(lnmast.officr))
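To illustrate what the two grouping sets produce, the sketch below emulates them with UNION ALL in SQLite via Python's sqlite3 module. This is a swapped-in technique, not DB2 syntax: SQLite has no GROUPING SETS, so each grouping set becomes its own SELECT. The ratcod and status values, and the extra filtered-out row, are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LNMAST (officr TEXT, cbal REAL, sname TEXT, "
    "ratcod INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO LNMAST VALUES (?, ?, ?, ?, ?)",
    [
        # ratcod/status values are hypothetical; balances come from the question.
        ("ABC", 500.00, "TOM JONES", 6, "1"),
        ("ABC", 200.00, "SUE JONES", 7, "1"),
        ("RAR", 100.10, "JOE SMITH", 8, "1"),
        ("RAR", 200.05, "MILES SMITH", 6, "1"),
        ("RAR", 999.00, "CLOSED LOAN", 6, "2"),  # excluded by the status filter
    ],
)

# First SELECT  = grouping set (officr, sname): one detail row per customer.
# Second SELECT = grouping set (officr): one subtotal row with sname NULL.
report = conn.execute(
    """
    SELECT officr, cbal, sname FROM (
        SELECT officr, SUM(cbal) AS cbal, sname
        FROM LNMAST
        WHERE ratcod IN (6, 7, 8) AND status NOT IN ('2', '8')
        GROUP BY officr, sname
        UNION ALL
        SELECT officr, SUM(cbal) AS cbal, NULL AS sname
        FROM LNMAST
        WHERE ratcod IN (6, 7, 8) AND status NOT IN ('2', '8')
        GROUP BY officr
    )
    ORDER BY officr, sname IS NULL, sname
    """
).fetchall()

for officr, cbal, sname in report:
    print(officr, round(cbal, 2), sname if sname is not None else "TOTAL")
```

The ORDER BY puts each officer's detail rows first and the NULL-sname subtotal last, which mirrors the "ABC TOTAL" layout asked for in the question.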
An example from the IBM docs (www.ibm.com/support/knowledgecenter/en/...):
SELECT WEEK(SALES_DATE) AS WEEK,
DAYOFWEEK(SALES_DATE) AS DAY_WEEK,
SALES_PERSON, SUM(SALES) AS UNITS_SOLD
FROM SALES
WHERE WEEK(SALES_DATE) = 13
GROUP BY GROUPING SETS ( (WEEK(SALES_DATE), SALES_PERSON),
(DAYOFWEEK(SALES_DATE), SALES_PERSON))
ORDER BY WEEK, DAY_WEEK, SALES_PERSON
This results in:
WEEK DAY_WEEK SALES_PERSON UNITS_SOLD
----------- ----------- --------------- -----------
13 - GOUNOT 32
13 - LEE 33
13 - LUCCHESSI 8
- 6 GOUNOT 11
- 6 LEE 12
- 6 LUCCHESSI 4
- 7 GOUNOT 21
- 7 LEE 21
- 7 LUCCHESSI 4