Oracle SQL grouping elements - sql

I know this question is fairly trivial, but I am having difficulty writing the SQL for the example shown below.
The first table is the result I currently generate, and it is fine for analytics:
REGION SUBREGION SUM
------ --------- ------
CORP CORP1 5
CORP CORP2 10
CORP CORP3 5
SB SB1 10
SB SB2 10
MID null 10
LARGE null 20
But for the summary report I need to display the result as shown in the second table. Any clues?
REGION SUM
------ ----
CORP 20
CORP1 5
CORP2 10
CORP3 5
SB 20
SB1 10
SB2 10
MID 10
LARGE 20

Simply change your existing GROUP BY to GROUPING SETS:
SELECT
Coalesce(subregion, region) AS region,
Sum(column)
FROM mytable
GROUP BY GROUPING SETS(region, subregion)
HAVING Coalesce(subregion, region) IS NOT NULL

It sounds like you need to aggregate the same table by two different fields but get both back as if they were one field. The solution that comes to mind is a UNION:
select sum(some_column) as sum, REGION as sf from some_table group by REGION
union ALL
select sum(some_column) as sum, SUBREGION as sf from some_table group by SUBREGION;
Hope that helps.
Based on Dan's question below, I'll add: if you don't want to aggregate the values, just take out the sum and the group by and do the straight union:
select REGION as sf from some_table
union ALL
select SUBREGION as sf from some_table;
EDIT:
One more thought: when you run the query in the first place, you may want to look into ROLLUP as an additional clause on your GROUP BY and aggregate function. It might solve this in one shot.
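The UNION ALL idea above can be tried end to end. The following is a minimal sketch, not the answerer's exact query: it assumes SQLite through Python's sqlite3 module instead of Oracle, a hypothetical table t with a value column s, and two helper columns (grp, lvl) that exist only to sort each region total ahead of its subregions.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (region TEXT, subregion TEXT, s INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("CORP", "CORP1", 5),
        ("CORP", "CORP2", 10),
        ("CORP", "CORP3", 5),
        ("SB", "SB1", 10),
        ("SB", "SB2", 10),
        ("MID", None, 10),
        ("LARGE", None, 20),
    ],
)

query = """
SELECT label, total
FROM (
    -- one row per region: the parent lines of the report
    SELECT region AS grp, 0 AS lvl, region AS label, SUM(s) AS total
    FROM t
    GROUP BY region
    UNION ALL
    -- one row per non-null subregion: the detail lines
    SELECT region, 1, subregion, SUM(s)
    FROM t
    WHERE subregion IS NOT NULL
    GROUP BY region, subregion
)
ORDER BY grp, lvl, label
"""
rows = conn.execute(query).fetchall()
for label, total in rows:
    print(label, total)
```

Regions come out in alphabetical order here; a report that needs the original region order would need an explicit sort key.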

Do the same group by query you're doing, but try using group by ROLLUP:
Something like (untested):
select region, subregion, sum(some_column) as sum
from some_table
group by rollup(region, subregion)
order by region, subregion;

Try using OLAP functions rollup and grouping like this:
select
nvl(subregion, region) region, sum("sum")
from t
group by region, rollup(subregion)
having case when count(*) = 1 then 0 else 1 end = grouping(subregion);
In the above, the clause
having case when count(*) = 1 then 0 else 1 end = grouping(subregion);
excludes the rolled-up row when there is only one row for that region, so that there are no duplicates.
Also, avoid using reserved keywords such as sum or count in your identifiers.
Demo:
SQL> with t(REGION, SUBREGION, s) as (
  select 'CORP', 'CORP1', 5 from dual union all
  select 'CORP', 'CORP2', 10 from dual union all
  select 'CORP', 'CORP3', 5 from dual union all
  select 'SB', 'SB1', 10 from dual union all
  select 'SB', 'SB2', 10 from dual union all
  select 'MID', null, 10 from dual union all
  select 'LARGE', null, 20 from dual
)
select
  nvl(subregion, region) region, sum(s)
from t
group by region, rollup(subregion)
having case when count(*) = 1 then 0 else 1 end = grouping(subregion);
REGIO SUM(S)
----- ----------
SB1 10
SB2 10
SB 20
MID 10
CORP1 5
CORP2 10
CORP3 5
CORP 20
LARGE 20
9 rows selected.
SQL>

Related

SQL Query To transform rows into columns with additional calculated column [closed]

With sample data like this:
DATE Count Code
02-JAN-2023 25 A
03-JAN-2023 10 A
05-JAN-2023 15 A
01-JAN-2023 5 B
02-JAN-2023 20 B
04-JAN-2023 6 B
05-JAN-2023 9 B
I need to transform the rows with code = 'A' or 'B' into A and B columns and create another column holding the difference between A and B (DIFF = A - B). Empty values are treated as 0. The result should be grouped and ordered by DATE.
The expected result should be:
Date         A    B    DIFFERENCE(A-B)
01-JAN-2023       5    -5
02-JAN-2023  25   20   5
03-JAN-2023  10        10
04-JAN-2023       6    -6
05-JAN-2023  15   9    6
You need conditional aggregation to get the desired result:
SELECT "date", MAX(CASE WHEN CODE = 'A' THEN "count" ELSE NULL END) A,
MAX(CASE WHEN CODE = 'B' THEN "count" ELSE NULL END) B,
NVL(MAX(CASE WHEN CODE = 'A' THEN "count" ELSE NULL END), 0) -
NVL(MAX(CASE WHEN CODE = 'B' THEN "count" ELSE NULL END), 0) difference
FROM your_table
GROUP BY "date";
Conditional aggregation, yes, but most probably with a different aggregate than Ankit suggested: sum instead of max. With the sample data you posted it doesn't matter, but there are probably more rows, maybe even several for the same date, in which case max won't return the correct result.
Instead of the NVL function (which Ankit used), consider using else 0 in the case expression.
Also, you shouldn't (and you probably didn't) name columns using reserved words; date is reserved for the datatype, so either the column name isn't that, or you enclosed it in double quotes (which is almost always a bad idea).
Although you can do it all in the same select statement, a somewhat cleaner option is to use a subquery or a CTE (as my example shows); the result will be just the same, but it is easier to read.
Sample data:
SQL> select * from test;
DATUM COUNT CODE
--------- ---------- -----
02-JAN-23 25 A
03-JAN-23 10 A
05-JAN-23 15 A
01-JAN-23 5 B
02-JAN-23 20 B
04-JAN-23 6 B
05-JAN-23 9 B
7 rows selected.
Query:
SQL> with temp as
  (select datum,
          sum(case when code = 'A' then count else 0 end) as count_a,
          sum(case when code = 'B' then count else 0 end) as count_b
   from test
   group by datum
  )
select datum, count_a, count_b,
       count_a - count_b as diff
from temp
order by datum;
DATUM COUNT_A COUNT_B DIFF
--------- ---------- ---------- ----------
01-JAN-23 0 5 -5
02-JAN-23 25 20 5
03-JAN-23 10 0 10
04-JAN-23 0 6 -6
05-JAN-23 15 9 6
SQL>
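To see why sum is the safer aggregate than max, here is a small sketch under stated assumptions: SQLite via Python's sqlite3 instead of Oracle, ISO date strings instead of DATE values, a column named cnt instead of count, and one extra row for 02-JAN with code A (hypothetical, not in the original sample) so that a date has two rows for the same code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (datum TEXT, cnt INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        ("2023-01-02", 25, "A"),
        ("2023-01-02", 5, "A"),   # hypothetical duplicate row, added for illustration
        ("2023-01-03", 10, "A"),
        ("2023-01-05", 15, "A"),
        ("2023-01-01", 5, "B"),
        ("2023-01-02", 20, "B"),
        ("2023-01-04", 6, "B"),
        ("2023-01-05", 9, "B"),
    ],
)

query = """
WITH pivoted AS (
    SELECT datum,
           SUM(CASE WHEN code = 'A' THEN cnt ELSE 0 END) AS sum_a,
           SUM(CASE WHEN code = 'B' THEN cnt ELSE 0 END) AS sum_b,
           MAX(CASE WHEN code = 'A' THEN cnt END)        AS max_a
    FROM test
    GROUP BY datum
)
SELECT datum, sum_a, sum_b, sum_a - sum_b AS diff, max_a
FROM pivoted
ORDER BY datum
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For 2023-01-02 the sum-based column gives 30 while the max-based column gives 25, which is the difference the answer warns about.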
You could use PIVOT to get your expected result:
Select A_DATE, COL_A, COL_B, Nvl(COL_A, 0) - Nvl(COL_B, 0) "DIFF_A_B"
From ( Select A_DATE, CNT, CODE From tbl )
PIVOT ( MAX(CNT) FOR CODE IN('A' "COL_A", 'B' "COL_B") )
Order By A_DATE
...
WITH -- Sample data
tbl AS
(
Select To_Date('02-JAN-2023', 'dd-MON-yyyy') "A_DATE", 25 "CNT", 'A' "CODE" From Dual Union All
Select To_Date('03-JAN-2023', 'dd-MON-yyyy') "A_DATE", 10 "CNT", 'A' "CODE" From Dual Union All
Select To_Date('05-JAN-2023', 'dd-MON-yyyy') "A_DATE", 15 "CNT", 'A' "CODE" From Dual Union All
Select To_Date('01-JAN-2023', 'dd-MON-yyyy') "A_DATE", 5 "CNT", 'B' "CODE" From Dual Union All
Select To_Date('02-JAN-2023', 'dd-MON-yyyy') "A_DATE", 20 "CNT", 'B' "CODE" From Dual Union All
Select To_Date('04-JAN-2023', 'dd-MON-yyyy') "A_DATE", 6 "CNT", 'B' "CODE" From Dual Union All
Select To_Date('05-JAN-2023', 'dd-MON-yyyy') "A_DATE", 9 "CNT", 'B' "CODE" From Dual
)
Result:
A_DATE         COL_A      COL_B   DIFF_A_B
--------- ---------- ---------- ----------
01-JAN-23                     5         -5
02-JAN-23         25         20          5
03-JAN-23         10                    10
04-JAN-23                     6         -6
05-JAN-23         15          9          6

how to count two different values from same column in oracle

I have a table connection_master with a column status_conn that holds 1 for an ON connection and 2 for an OFF connection. I want to get the counts of ON and OFF connections in one query.
I want output like this:
on_counts off_counts
110 55
Use conditional aggregation:
SELECT
COUNT(CASE WHEN status_conn = 1 THEN 1 END) AS on_counts,
COUNT(CASE WHEN status_conn = 2 THEN 1 END) AS off_counts
FROM connection_master;
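A quick way to check the conditional aggregation above is to run it against generated data. This is a sketch that assumes SQLite via Python's sqlite3 rather than Oracle; the 110 and 55 rows are made up to match the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE connection_master (status_conn INTEGER)")
# 110 ON rows (status 1) and 55 OFF rows (status 2), invented for the demo.
conn.executemany(
    "INSERT INTO connection_master VALUES (?)",
    [(1,)] * 110 + [(2,)] * 55,
)

# COUNT ignores NULLs, and a CASE without ELSE yields NULL for non-matching rows.
on_counts, off_counts = conn.execute(
    """
    SELECT COUNT(CASE WHEN status_conn = 1 THEN 1 END) AS on_counts,
           COUNT(CASE WHEN status_conn = 2 THEN 1 END) AS off_counts
    FROM connection_master
    """
).fetchone()
print(on_counts, off_counts)
```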
SELECT COUNT(CASE WHEN substr(status_conn,1,1) = '1' THEN 1 END) AS CONN_ON,
COUNT(CASE WHEN substr(status_conn,1,1) = '2' THEN 1 END) AS CONN_OFF
FROM connection_master where substr(status_conn,1,1) IN ('1','2')
You can use a PIVOT:
SELECT *
FROM connection_master
PIVOT (COUNT(*) FOR status_conn IN (1 AS on_count, 2 AS off_count));
Which, for the sample data:
CREATE TABLE connection_master (status_conn) AS
SELECT 1 FROM DUAL CONNECT BY LEVEL <= 110 UNION ALL
SELECT 2 FROM DUAL CONNECT BY LEVEL <= 55;
Outputs:
ON_COUNT OFF_COUNT
110 55

JOIN by closer value to key

With the following sample data:
WITH values AS (
SELECT
1 AS shard,
2008 AS year,
1 AS value
UNION ALL
SELECT
1 AS shard,
20012 AS year,
2 AS value
UNION ALL
SELECT
2 AS shard,
2011 AS year,
3 AS value
UNION ALL
SELECT
2 AS shard,
1998 AS year,
4 AS value
UNION ALL
SELECT
2 AS shard,
2001 AS year,
5 AS value
UNION ALL
SELECT
4 AS shard,
1990 AS year,
6 AS value
ORDER BY year
),
data AS (
SELECT
1 AS id,
1 AS shard,
2010 AS year
UNION ALL
SELECT
1 AS id,
2 AS shard,
2000 AS year
UNION ALL
SELECT
1 AS id,
3 AS shard,
1990 AS year
UNION ALL
SELECT
2 AS id,
1 AS shard,
2010 AS year
UNION ALL
SELECT
2 AS id,
2 AS shard,
2000 AS year
UNION ALL
SELECT
2 AS id,
3 AS shard,
1990 AS year
)
I want to join my data collection with the values stored in the values collection. Data has an id which differentiates each process, so I want to perform the JOIN for each id. The JOIN also has a two-part mapping key: the shard and year fields. For each entry in data, I want to retrieve the value from the values collection whose shard matches and whose year is closest.
I have come up with the following piece of code, but it is not working as expected: it doesn't consider the values.shard field, so it matches every year regardless of shard.
SELECT *
FROM (
SELECT
data.id,
data.year,
values.year AS closer_year,
ABS(data.year - values.year) AS diff,
values.value,
ROW_NUMBER() OVER (PARTITION BY data.id, data.shard ORDER BY ABS(data.year - values.year)) AS rn
FROM data, values
)
WHERE rn = 1
For the sample data, the expected output should be:
id year closer_year diff value rn
1 2010 2008 2 1 1
1 2000 2001 1 5 1
1 1990 null null null 1
2 2010 2008 2 1 1
2 2000 2001 1 5 1
2 1990 null null null 1
What am I missing?
I found what I was missing just after posting the question. I will answer it in case anyone has a similar use case.
When rereading the text, I noticed that the "match the shard" condition I was missing is simply a left join on shard, so rewriting the query like this solved the problem:
SELECT *
FROM (
SELECT
data.id,
data.year,
values.year AS closer_year,
ABS(data.year - values.year) AS diff,
values.value,
ROW_NUMBER() OVER (PARTITION BY data.id, data.shard ORDER BY ABS(data.year - values.year)) AS rn
FROM data
LEFT JOIN values
ON data.shard = values.shard
)
WHERE rn = 1
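The left-join version can be reproduced with the sample data from the question. This is a sketch under a few assumptions: SQLite (3.25 or newer, for window functions) through Python's sqlite3 rather than the original engine, the values collection renamed to vals because VALUES is a keyword in SQLite, and shard added to the inner select only so the output can be ordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (shard INTEGER, year INTEGER, value INTEGER)")
conn.execute("CREATE TABLE data (id INTEGER, shard INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO vals VALUES (?, ?, ?)",
    # sample rows copied from the question, including the year 20012 as given
    [(1, 2008, 1), (1, 20012, 2), (2, 2011, 3), (2, 1998, 4), (2, 2001, 5), (4, 1990, 6)],
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(1, 1, 2010), (1, 2, 2000), (1, 3, 1990), (2, 1, 2010), (2, 2, 2000), (2, 3, 1990)],
)

query = """
SELECT id, year, closer_year, diff, value
FROM (
    SELECT d.id, d.shard, d.year,
           v.year AS closer_year,
           ABS(d.year - v.year) AS diff,
           v.value,
           -- rank candidate years per data row, closest first
           ROW_NUMBER() OVER (
               PARTITION BY d.id, d.shard
               ORDER BY ABS(d.year - v.year)
           ) AS rn
    FROM data AS d
    LEFT JOIN vals AS v ON d.shard = v.shard  -- keeps data rows with no matching shard
)
WHERE rn = 1
ORDER BY id, shard
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows for shard 3 come back with NULL (None in Python) in the joined columns, because no entry in vals has that shard.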

SQL union same number of columns, same data types, different data

I have two result sets that look approximately like this:
Id Name Count
1 Asd 1
2 Sdf 4
3 Dfg 567
4 Fgh 23
But the Count column data is different for the second one and I would like both to be displayed, about like this:
Id  Name  Count from set 1  Count from set two
1   Asd   1                 15
2   Sdf   4                 840
3   Dfg   567               81
4   Fgh   23                9
How can I do this in SQL (with union if possible)?
My current SQL, hope this will better explain what I want to do:
(SELECT Id, Name, COUNT(*) FROM Customers where X)
union
(SELECT Id, Name, COUNT(*) FROM Customers where Y)
select *
from
(
SELECT 'S1' as dataset, Id, Name, COUNT(*) as resultcount FROM Customers where X group by Id, Name
union all
SELECT 'S2', Id, Name, COUNT(*) FROM Customers where Y group by Id, Name
) s
pivot
(sum(resultcount) for dataset in ([S1],[S2])) p
You can do something like:
;WITH Unioned
AS
(
SELECT 'Set1' FromWhat, Id, Name FROM Table1
UNION ALL
SELECT 'Set2', Id, Name FROM Table2
)
SELECT
Id,
Name,
SUM(CASE FromWhat WHEN 'Set1' THEN 1 ELSE 0 END) 'Count from set 1',
SUM(CASE FromWhat WHEN 'Set2' THEN 1 ELSE 0 END) 'Count from set 2'
FROM Unioned
GROUP BY Id, Name;
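The union-then-aggregate approach can be checked on a small example. The sketch below assumes SQLite via Python's sqlite3 instead of SQL Server, lowercase table names table1 and table2, and invented rows; the counts differ from those in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, name TEXT)")
# Hypothetical rows: each row counts once towards its (id, name) pair.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "Asd")] + [(2, "Sdf")] * 4)
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(1, "Asd")] * 3 + [(2, "Sdf")] * 2)

query = """
WITH unioned AS (
    -- tag every row with the set it came from
    SELECT 'Set1' AS from_what, id, name FROM table1
    UNION ALL
    SELECT 'Set2', id, name FROM table2
)
SELECT id,
       name,
       SUM(CASE from_what WHEN 'Set1' THEN 1 ELSE 0 END) AS count_set1,
       SUM(CASE from_what WHEN 'Set2' THEN 1 ELSE 0 END) AS count_set2
FROM unioned
GROUP BY id, name
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```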

How to transpose recordset columns into rows

I have a query whose code looks like this:
SELECT DocumentID, ComplexSubquery1 ... ComplexSubquery5
FROM Document
WHERE ...
The ComplexSubquery columns are all numeric fields that are calculated using, duh, complex subqueries.
I would like to use this query as a subquery to a query that generates a summary like the following one:
Field DocumentCount Total
1 dc1 s1
2 dc2 s2
3 dc3 s3
4 dc4 s4
5 dc5 s5
Where:
dc<n> = SUM(CASE WHEN ComplexSubquery<n> > 0 THEN 1 END)
s<n> = SUM(CASE WHEN Field = n THEN ComplexSubquery<n> END)
How could I do that in SQL Server?
NOTE: I know I could avoid the problem by discarding the original query and using unions:
SELECT '1' AS TypeID,
SUM(CASE WHEN ComplexSubquery1 > 0 THEN 1 END) AS DocumentCount,
SUM(ComplexSubquery1) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery1) T
UNION ALL
SELECT '2' AS TypeID,
SUM(CASE WHEN ComplexSubquery2 > 0 THEN 1 END) AS DocumentCount,
SUM(ComplexSubquery2) AS Total
FROM (SELECT DocumentID, BLARGH ... AS ComplexSubquery2) T
UNION ALL
...
But I want to avoid this route, because redundant code makes my eyes bleed. (Besides, there is a real possibility that the number of complex subqueries will grow in the future.)
WITH Document(DocumentID, Field) As
(
SELECT 1, 1 union all
SELECT 2, 1 union all
SELECT 3, 2 union all
SELECT 4, 3 union all
SELECT 5, 4 union all
SELECT 6, 5 union all
SELECT 7, 5
), CTE AS
(
SELECT DocumentID,
Field,
(select 10) As ComplexSubquery1,
(select 20) as ComplexSubquery2,
(select 30) As ComplexSubquery3,
(select 40) as ComplexSubquery4,
(select 50) as ComplexSubquery5
FROM Document
)
SELECT Field,
SUM(CASE WHEN RIGHT(Query,1) = Field AND QueryValue > 0 THEN 1 END ) AS DocumentCount,
SUM(CASE WHEN RIGHT(Query,1) = Field THEN QueryValue END ) AS Total
FROM CTE
UNPIVOT (QueryValue FOR Query IN
(ComplexSubquery1, ComplexSubquery2, ComplexSubquery3,
ComplexSubquery4, ComplexSubquery5)
)AS unpvt
GROUP BY Field
Returns
Field DocumentCount Total
----------- ------------- -----------
1 2 20
2 1 20
3 1 30
4 1 40
5 2 100
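Where UNPIVOT is not available, the same columns-to-rows step can be emulated by cross joining against a list of field numbers. The sketch below is a swapped-in technique, not the query above: it assumes SQLite via Python's sqlite3, a hypothetical table doc_values standing in for the original query's output, three value columns instead of five, and the simpler per-column definitions from the question's UNION version (count of positive values, sum of values).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# doc_values stands in for the original query's result set (hypothetical names and data).
conn.execute("CREATE TABLE doc_values (document_id INTEGER, c1 INTEGER, c2 INTEGER, c3 INTEGER)")
conn.executemany(
    "INSERT INTO doc_values VALUES (?, ?, ?, ?)",
    [(1, 10, 0, 5), (2, 0, 0, 7), (3, 4, 2, 0)],
)

query = """
WITH fields(n) AS (VALUES (1), (2), (3)),
unpivoted AS (
    -- one output row per (document, field); pick the matching column by number
    SELECT f.n AS field,
           CASE f.n WHEN 1 THEN d.c1 WHEN 2 THEN d.c2 ELSE d.c3 END AS v
    FROM doc_values AS d
    CROSS JOIN fields AS f
)
SELECT field,
       SUM(CASE WHEN v > 0 THEN 1 END) AS document_count,
       SUM(v) AS total
FROM unpivoted
GROUP BY field
ORDER BY field
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding another subquery column means one more entry in fields and one more WHEN branch, rather than another copy of the whole SELECT.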
I'm not 100% positive from your example, but perhaps the PIVOT operator will help you out here? I think if you selected your original query into a temporary table, you could pivot on the document ID and get the sums for the other queries.
I don't have much experience with it though, so I'm not sure how complex you can get with your subqueries - you might have to break it down.