DB2, SQL query to SUM 2 columns - sql

I need to add two columns in a row.
Table data:
id  Col1  Col2
1   10    20
2   11    20
3   12    20
Expected result:
id  Sum
1   30
2   31
3   32
I tried sum(col1 + col2), but that gives a single total across all the rows.

sum() is an aggregate function (one that gives a single result for a group of rows), not an arithmetic one. You want the addition (the mathematical sum) of the two columns:
select id, col1 + col2 as sum
from mytable
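To make the difference concrete, here is a minimal sketch that runs both forms on the question's data. It uses Python's built-in sqlite3 module as a stand-in for DB2 (an assumption made only so the example is runnable), and the alias total replaces sum for readability.

```python
import sqlite3

# SQLite stands in for DB2 here (an assumption); the SQL itself is standard.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 10, 20), (2, 11, 20), (3, 12, 20)],
)

# Row-wise addition: one result per row.
per_row = conn.execute(
    "SELECT id, col1 + col2 AS total FROM mytable ORDER BY id"
).fetchall()

# Aggregate: one result for the whole table.
overall = conn.execute("SELECT SUM(col1 + col2) FROM mytable").fetchone()[0]

print(per_row)   # [(1, 30), (2, 31), (3, 32)]
print(overall)   # 93
```

The first query matches the expected result from the question; the second collapses all three rows into one number, which is what the asker saw.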

A query with a GROUP BY clause has two kinds of columns: aggregated columns and grouping columns. Take this query:
select id, col1 + col2 as sum
from mytable
group by id
We have to list id, col1 and col2 in the GROUP BY clause, otherwise we get this error:
Column 'TEST.COL1' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
If we use the MAX() aggregate function like this:
SELECT
ID,
MAX(COL1+COL2) AS SUM
FROM TEST
GROUP BY ID
we get the result, but this isn't a good idea because the cost of this query is 4 times higher than that of the query below:
SELECT
ID,COL1+COL2 AS SUM
FROM TEST
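As a quick check that the two queries above return the same rows, here is a small sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the original database (an assumption), the alias total replaces SUM, and the sketch says nothing about the relative cost of the two queries. It also assumes each ID appears exactly once, as in the question's data.

```python
import sqlite3

# SQLite stands in for the original database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 10, 20), (2, 11, 20), (3, 12, 20)],
)

# Aggregated form: MAX over a group that holds a single row per id.
grouped = conn.execute(
    "SELECT id, MAX(col1 + col2) AS total FROM test GROUP BY id ORDER BY id"
).fetchall()

# Plain form: no grouping needed at all.
plain = conn.execute(
    "SELECT id, col1 + col2 AS total FROM test ORDER BY id"
).fetchall()

print(grouped == plain)  # True when every id is unique
```

If an ID could repeat, the grouped form would keep only the largest total per ID, so the two queries would no longer be interchangeable.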

Try this
select id, col1 + col2 as sum
from mytable
group by id, col1, col2

Related

Aggregate function COUNT not scalar

The COUNT function doesn't result in a scalar as expected:
CREATE TABLE MyTable (Col1 INT, Col2 INT, Col3 INT)
INSERT INTO MyTable VALUES(2,3,9) -- Row 1
INSERT INTO MyTable VALUES(1,5,7) -- Row 2
INSERT INTO MyTable VALUES(2,3,9) -- Row 3
INSERT INTO MyTable VALUES(3,4,9) -- Row 4
SELECT COUNT(*) AS Result
FROM MyTable
WHERE Col3=9
GROUP BY Col1, Col2
I keep only the 3 rows where Col3=9.
In the remaining 3 rows there are two groups:
Group 1 where Col1=2 AND Col2=3 (Row 1 and 3)
Group 2 where Col1=3 AND Col2=4 (Row 4)
Finally I count those two groups.
Therefore, I expect the answer to be a scalar Result = 2 (the two groups where Col3=9).
But I got a non-scalar result.
There are other ways to solve this, so that's not the problem, but where is my thinking wrong?
It seems you are looking for the total number of groups that match the condition. For that, try the following query:
SELECT COUNT(*) [Count] FROM
(
SELECT COUNT(*) AS Result
FROM MyTable
WHERE Col3=9
GROUP BY Col1, Col2
)T
You can use a subquery with a single aggregation:
select count(*)
from (select distinct col1, col2
from mytable
where col3 = 9
) t;
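To see why the grouped query is not a scalar, here is a small sketch that runs the question's data through Python's built-in sqlite3 module. SQLite stands in for the asker's database (an assumption), and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# SQLite stands in for the asker's database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Col1 INT, Col2 INT, Col3 INT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(2, 3, 9), (1, 5, 7), (2, 3, 9), (3, 4, 9)],
)

# COUNT(*) with GROUP BY is evaluated once per group: one row per group.
per_group = conn.execute(
    "SELECT COUNT(*) AS Result FROM MyTable WHERE Col3 = 9 "
    "GROUP BY Col1, Col2 ORDER BY Col1, Col2"
).fetchall()

# Counting the groups themselves needs a second level of aggregation.
group_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT Col1, Col2 "
    "FROM MyTable WHERE Col3 = 9) t"
).fetchone()[0]

print(per_group)    # [(2,), (1,)]
print(group_count)  # 2
```

The grouped query returns the size of each group (2 rows and 1 row), not the number of groups, which is why the result has two rows instead of the single value 2.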

SQL Sum amount for column with unique values

Update
I realised I was doing it correctly. I had the issue because my data for Col1 wasn't as expected: some Col1 values are associated with multiple Col0 values (it was supposed to be a 1:1 relationship between Col1 and Col0). That's why it seemed not to work as intended.
Original Question
I'm using a SQL query to sum the revenue for each distinct value in one of the columns, and to return a table that combines the result with other attributes.
Here's my table:
Col0 Col1 Col2(unique) Revenue
X 1 A 10
X 1 B 20
X 1 C 0
X 2 D 5
X 2 E 8
Y 3 F 3
Y 3 G 0
Y 3 H 50
Desired output:
Col0 Col1 Revenue
X 1 30
X 2 13
Y 3 53
I tried:
WITH
rev_calc AS (
SELECT
Col0,
Col1,
Col2, ##this is for further steps to combine other tables for mapping after this
SUM(Revenue) AS total_revenue, ##total rev by Col1
FROM table_input
GROUP BY Col1, Col0, Col2 ##Have to group by Col0 and Col2 too as it raised error because of 'list expression'
)
SELECT DISTINCT
table2.mappedOfCol0,
rev_calc.Col1,
rev_calc.Col2,
rev_calc.total_revenue,
FROM another_table AS table2
LEFT JOIN rev_calc
ON rev_calc.Col0 = table2.mappedOfCol0
But the actual output has multiple revenue rows under a specific Col1.
For example, when I filter by Col1 = 1 in the output table, I still get a list of different revenue amounts:
Col1 total_revenue
1 10
1 20
1 0
I thought the GROUP BY would sum the revenue for each distinct Col1. What did I miss here? I also tried querying FROM (SELECT DISTINCT Col1....) first, but sum(revenue) still produces a list of different revenues.
I'm a newbie to SQL, so I'd appreciate any insights. Thanks.
Don't you just want aggregation?
select col0, col1, sum(revenue) as revenue
from mytable
group by col0, col1
I don't understand what you are trying to do with col2 in the query. This produces the result you want for the data you showed, which comes from a single table.
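The effect of including Col2 in the GROUP BY can be sketched with the question's sample data. The example below uses Python's built-in sqlite3 module as a stand-in for the asker's database (an assumption) and the table name mytable from the answer above.

```python
import sqlite3

# SQLite stands in for the asker's database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (col0 TEXT, col1 INTEGER, col2 TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("X", 1, "A", 10), ("X", 1, "B", 20), ("X", 1, "C", 0),
        ("X", 2, "D", 5), ("X", 2, "E", 8),
        ("Y", 3, "F", 3), ("Y", 3, "G", 0), ("Y", 3, "H", 50),
    ],
)

# Grouping by col0 and col1 only: one summed row per (col0, col1) pair.
by_col1 = conn.execute(
    "SELECT col0, col1, SUM(revenue) FROM mytable "
    "GROUP BY col0, col1 ORDER BY col0, col1"
).fetchall()

# Adding the unique col2 to the GROUP BY puts every row in its own group.
by_col2 = conn.execute(
    "SELECT col0, col1, col2, SUM(revenue) FROM mytable "
    "GROUP BY col0, col1, col2"
).fetchall()

print(by_col1)       # [('X', 1, 30), ('X', 2, 13), ('Y', 3, 53)]
print(len(by_col2))  # 8
```

Because Col2 is unique, grouping by it leaves nothing to sum, which matches the unsummed rows the asker reported for Col1 = 1.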
Based on the explanation you provided, I think your requirement is to aggregate the revenue of selected records that map to another table based on Col2 values. If that is the case, you may try the following query:
WITH
rev_calc AS (
SELECT
distinct(Col2) as Col2
From table_input
LEFT JOIN another_table
ON another_table.Col2 = table_input.Col2
)
SELECT
Col0,
Col1,
SUM(Revenue) AS total_revenue
FROM table_input
WHERE Col2 in (select Col2 from rev_calc)
GROUP BY Col0, Col1;

sqlite: select all columns where one field has max value over all columns

I have a table like this:
id int, col1 int, ...
Different rows can have col1 of same value.
Now I want to gather all rows where col1 has the maximum value.
E.g., for these table values:
1 4
2 3
3 4
the query should give me rows 1 and 3.
You can use a subquery:
SELECT id, col1
FROM tab
WHERE col1 = (SELECT MAX(col1) FROM tab);
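Here is a minimal sketch of the subquery approach on the question's three rows, run through Python's built-in sqlite3 module. The ORDER BY is an addition to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, col1 INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", [(1, 4), (2, 3), (3, 4)])

# The scalar subquery finds the maximum once; the outer query keeps
# every row that ties for it.
rows = conn.execute(
    "SELECT id, col1 FROM tab "
    "WHERE col1 = (SELECT MAX(col1) FROM tab) ORDER BY id"
).fetchall()

print(rows)  # [(1, 4), (3, 4)]
```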

GROUP BY without aggregate function

I am trying to understand GROUP BY without an aggregate function (I am new to the Oracle DBMS).
How does it operate?
Here is what I have tried.
I will run my SQL on the EMP table.
SELECT ename , sal
FROM emp
GROUP BY ename , sal;
SELECT ename , sal
FROM emp
GROUP BY ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 397 Column: 16
SELECT ename , sal
FROM emp
GROUP BY sal;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 411 Column: 8
SELECT empno , ename , sal
FROM emp
GROUP BY sal , ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 425 Column: 8
SELECT empno , ename , sal
FROM emp
GROUP BY empno , ename , sal;
So, basically the number of columns have to be equal to the number of columns in the GROUP BY clause, but I still do not understand why or what is going on.
That's how GROUP BY works. It takes several rows and turns them into one row. Because of this, it has to know what to do with the combined rows when they have different values for some columns (fields). This is why you have two options for every field you want to SELECT: either include it in the GROUP BY clause, or use it in an aggregate function so the system knows how you want to combine the field.
For example, let's say you have this table:
Name | OrderNumber
------------------
John | 1
John | 2
If you say GROUP BY Name, how will it know which OrderNumber to show in the result? So you either include OrderNumber in group by, which will result in these two rows. Or, you use an aggregate function to show how to handle the OrderNumbers. For example, MAX(OrderNumber), which means the result is John | 2 or SUM(OrderNumber) which means the result is John | 3.
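The three options for the John example can be sketched as follows, using Python's built-in sqlite3 module. The table name orders is hypothetical, since the answer does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "orders" is a hypothetical table name; the answer only shows the columns.
conn.execute("CREATE TABLE orders (Name TEXT, OrderNumber INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("John", 1), ("John", 2)])

# Option 1: group by both columns, so each distinct pair stays a row.
both = conn.execute(
    "SELECT Name, OrderNumber FROM orders "
    "GROUP BY Name, OrderNumber ORDER BY OrderNumber"
).fetchall()

# Option 2: group by Name and say how to combine OrderNumber.
highest = conn.execute(
    "SELECT Name, MAX(OrderNumber) FROM orders GROUP BY Name"
).fetchall()
total = conn.execute(
    "SELECT Name, SUM(OrderNumber) FROM orders GROUP BY Name"
).fetchall()

print(both)     # [('John', 1), ('John', 2)]
print(highest)  # [('John', 2)]
print(total)    # [('John', 3)]
```

Note that SQLite itself tolerates a bare, ungrouped column in a grouped query and picks a value from some row, so the ORA-00979 error from the question cannot be reproduced with it; the sketch only shows the valid forms.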
Given this data:
Col1 Col2 Col3
A X 1
A Y 2
A Y 3
B X 0
B Y 3
B Z 1
This query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
Would result in exactly the same table.
However, this query:
SELECT Col1, Col2 FROM data GROUP BY Col1, Col2
Would result in:
Col1 Col2
A X
A Y
B X
B Y
B Z
Now, a query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2
Would create a problem: the line with A, Y is the result of grouping the two lines
A Y 2
A Y 3
So, which value should be in Col3, '2' or '3'?
Normally you would use a GROUP BY to calculate e.g. a sum:
SELECT Col1, Col2, SUM(Col3) FROM data GROUP BY Col1, Col2
So for the line we had a problem with, we now get (2+3) = 5.
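The SUM query above can be checked against the sample data with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption), and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# SQLite stands in for Oracle here (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Col1 TEXT, Col2 TEXT, Col3 INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [("A", "X", 1), ("A", "Y", 2), ("A", "Y", 3),
     ("B", "X", 0), ("B", "Y", 3), ("B", "Z", 1)],
)

# Each (Col1, Col2) group collapses to one row; SUM decides what Col3 becomes.
summed = conn.execute(
    "SELECT Col1, Col2, SUM(Col3) FROM data "
    "GROUP BY Col1, Col2 ORDER BY Col1, Col2"
).fetchall()

print(summed)
# [('A', 'X', 1), ('A', 'Y', 5), ('B', 'X', 0), ('B', 'Y', 3), ('B', 'Z', 1)]
```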
Grouping by all the columns in your select is effectively the same as using DISTINCT, and it is preferable to use the DISTINCT keyword for readability in this case.
So instead of
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
use
SELECT DISTINCT Col1, Col2, Col3 FROM data
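The equivalence can be checked with a short sketch using Python's built-in sqlite3 module (a stand-in for Oracle, which is an assumption). One duplicate row is added to the sample data purely for illustration, so that there is something to eliminate.

```python
import sqlite3

# SQLite stands in for Oracle here (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Col1 TEXT, Col2 TEXT, Col3 INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [("A", "X", 1), ("A", "Y", 2), ("A", "Y", 3),
     ("B", "X", 0), ("B", "Y", 3), ("B", "Z", 1),
     ("A", "X", 1)],  # duplicate row added for illustration only
)

grouped = conn.execute(
    "SELECT Col1, Col2, Col3 FROM data "
    "GROUP BY Col1, Col2, Col3 ORDER BY Col1, Col2, Col3"
).fetchall()

distinct = conn.execute(
    "SELECT DISTINCT Col1, Col2, Col3 FROM data ORDER BY Col1, Col2, Col3"
).fetchall()

print(grouped == distinct)  # True
print(len(distinct))        # 6: the duplicate ('A', 'X', 1) appears once
```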
You're experiencing a strict requirement of the GROUP BY clause. Every column not in the group-by clause must have a function applied to reduce all records for the matching "group" to a single record (sum, max, min, etc).
If you list all queried (selected) columns in the GROUP BY clause, you are essentially requesting that duplicate records be excluded from the result set. That gives the same effect as SELECT DISTINCT which also eliminates duplicate rows from the result set.
The only real use case for GROUP BY without aggregation is when you GROUP BY more columns than are selected, in which case the selected columns might be repeated. Otherwise you might as well use a DISTINCT.
It's worth noting that other RDBMSs do not require that all non-aggregated columns be included in the GROUP BY. For example, in PostgreSQL, if the primary key columns of a table are included in the GROUP BY, then the other columns of that table need not be, as they are guaranteed to have a single value for every distinct primary key. I've wished in the past that Oracle did the same, as it would have made for more compact SQL in many cases.
Let me give some examples.
Consider this data.
CREATE TABLE DATASET ( VAL1 CHAR ( 1 CHAR ),
VAL2 VARCHAR2 ( 10 CHAR ),
VAL3 NUMBER );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'b', 'b-details', 2 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'a-details', 1 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 3 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'dup', 4 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 5 );
COMMIT;
What's in the table now:
SELECT * FROM DATASET;
VAL1 VAL2 VAL3
---- ---------- ----------
b b-details 2
a a-details 1
c c-details 3
a dup 4
c c-details 5
5 rows selected.
--aggregate with group by
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1;
VAL1 COUNT(*)
---- ----------
b 1
a 2
c 2
3 rows selected.
--aggregate with group by multiple columns but select partial column
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1 COUNT(*)
---- ----------
b 1
c 2
a 1
a 1
4 rows selected.
--No aggregate with group by multiple columns
SELECT
VAL1,
VAL2
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1 VAL2
---- ----------
b b-details
c c-details
a dup
a a-details
4 rows selected.
--No aggregate with group by multiple columns, selecting only one of them
SELECT
VAL1
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1
----
b
c
a
a
4 rows selected.
If you have N columns in the select list (excluding aggregations), then you should have N or N+x columns in the GROUP BY clause.
Use a subquery, e.g.:
SELECT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1 GROUP BY field1,field2
OR
SELECT DISTINCT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1
If you have a column in the SELECT clause, how will the database pick its value when there are several rows? So yes, every column in the SELECT clause should also be in the GROUP BY clause, unless you use it in an aggregate function in the SELECT.
You can have a column in the GROUP BY clause that is not in the SELECT clause, but not the other way around.
As an addition
basically the number of columns have to be equal to the number of columns in the GROUP BY clause
is not a correct statement.
Any attribute that is not part of the GROUP BY clause cannot be selected (outside an aggregate function).
Any attribute that is part of the GROUP BY clause can be selected, but this is not mandatory.
For anyone trying to group data (from foreign tables, for example) into something like a JSON object with nested arrays of data, you can achieve this in SQL with array_agg (you can also use it in conjunction with json_build_object to create a JSON object with key-value pairs).
As a reference, I found this YouTube video helpful: https://www.youtube.com/watch?v=A6N1h9mcJf4
-- Edit
If you want to have a nested array inside a nested array, you could do it by using array.
In the following example, 'variation_images' (subquery 2 - in relation to the variation table) are nested under the 'variation' query (subquery 1 - in relation to product table) which is nested under the product query (main query):
SELECT product.title, product.slug, product.description,
ARRAY(SELECT jsonb_build_object(
'var_id', variation.id, 'var_name', variation.name, 'images',
ARRAY(SELECT json_build_object('img_url', variation_images.images)
FROM variation_images WHERE variation_images.variation_id = variation.id)
)
FROM variation WHERE variation.product_id = product.id)
FROM product
I know you said you want to understand GROUP BY, but suppose you have data like this:
COL-A COL-B COL-C COL-D
1 Ac C1 D1
2 Bd C2 D2
3 Ba C1 D3
4 Ab C1 D4
5 C C2 D5
And you want to make the data appear like:
COL-A COL-B COL-C COL-D
4 Ab C1 D4
1 Ac C1 D1
3 Ba C1 D3
2 Bd C2 D2
5 C C2 D5
You use:
select * from table_name
order by "COL-C", "COL-B"
Because I think this is what you intend to do.

MYSQL - Using AVG() and DISTINCT together

How can you write the following in MySQL?
SELECT AVG(col1) FROM table WHERE DISTINCT col2
more info:
table
col1 | col2
-----------
2 | 555.555.555.555
5 | 555.555.555.555
4 | 444.444.444.444
returns '3'
Basically I'm trying to select average value of col1 where ip addresses in col2 are distinct.
SELECT col2,
AVG(col1)
FROM table
GROUP BY col2
Right, because the distinct clause would find the first and third rows, the average of 2 and 4 is 3.
What I think you're looking for is "group by col2" instead of distinct.
I think you want the group by operator. It will group the rows before running calculations on them.
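To show what GROUP BY col2 returns on the question's data, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The table name visits is hypothetical, because table is a reserved word and cannot be used unquoted.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption).
conn = sqlite3.connect(":memory:")
# "visits" is a hypothetical table name; "table" is a reserved word.
conn.execute("CREATE TABLE visits (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(2, "555.555.555.555"), (5, "555.555.555.555"), (4, "444.444.444.444")],
)

# One average per distinct col2 value, computed after the rows are grouped.
averages = conn.execute(
    "SELECT col2, AVG(col1) FROM visits GROUP BY col2 ORDER BY col2"
).fetchall()

print(averages)
# [('444.444.444.444', 4.0), ('555.555.555.555', 3.5)]
```

The grouped query produces one average per IP address rather than a single number, so it is worth checking whether that is the result you actually need.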