Tuning Oracle subquery in select statement - sql

I have a master table and a reference table as below.
WITH MAS as (
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 200 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 250 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 300 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 350 as AMOUNT FROM DUAL
), REFTAB as (
SELECT 44 PROCESS_TYPE, 'A' GROUP_ID FROM DUAL UNION ALL
SELECT 44 PROCESS_TYPE, 'B' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'C' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'D' GROUP_ID FROM DUAL
) SELECT ...
My first select statement which works correctly is this one:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'A')
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'D')
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID
However, to address a performance issue, I changed it to this select statement:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A
LEFT JOIN REFTAB B ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID
For the AMOUNT2 and COUNT1 columns, the values stay the same. But for AMOUNT1, the value is multiplied because of the join with the reference table.
I know I can add 1 more left join with an additional join condition on GROUP_ID. But that won't be any different from using a subquery.
Any idea how to make the query work with just 1 left join while not multiplying the AMOUNT1 value?

I know I can add 1 more left join with an additional GROUP_ID clause, but it won't be different from a subquery.
You'd be surprised. Having 2 left joins instead of subqueries in the SELECT gives the optimizer more ways of optimizing the query. I would still try it:
select m.customer_id,
sum(m.amount) as amount1,
sum(case when grpA.group_id is not null then m.amount end) as amount2,
count(grpD.group_id) as count1
from mas m
left join reftab grpA
on grpA.process_type = m.process_type
and grpA.group_id = 'A'
left join reftab grpD
on grpD.process_type = m.process_type
and grpD.group_id = 'D'
group by m.customer_id
You can also try this query, which uses the SUM() analytic function to calculate the amount1 value before the join to avoid the duplicate value problem:
select m.customer_id,
m.customer_sum as amount1,
sum(case when r.group_id = 'A' then m.amount end) as amount2,
count(case when r.group_id = 'D' then 'X' end) as count1
from (select customer_id,
process_type,
amount,
sum(amount) over (partition by customer_id) as customer_sum
from mas) m
left join reftab r
on r.process_type = m.process_type
group by m.customer_id,
m.customer_sum
You can test both options, and see which one performs better.
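To see the difference on the sample data, here is a minimal sketch that runs the single unfiltered join next to the two filtered joins. It uses SQLite through Python's sqlite3 module only so that it is runnable anywhere (that choice is mine, not part of the question), and it assumes each (PROCESS_TYPE, GROUP_ID) pair appears at most once in REFTAB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mas (customer_id INT, process_id INT, process_type INT, amount INT);
    INSERT INTO mas VALUES (10, 1, 44, 200), (10, 1, 44, 250),
                           (10, 2, 45, 300), (10, 2, 45, 350);
    CREATE TABLE reftab (process_type INT, group_id TEXT);
    INSERT INTO reftab VALUES (44, 'A'), (44, 'B'), (45, 'C'), (45, 'D');
""")

# Single unfiltered join: every MAS row matches two REFTAB rows, so SUM(amount) doubles.
single_join = conn.execute("""
    SELECT m.customer_id,
           SUM(m.amount),
           SUM(CASE WHEN r.group_id = 'A' THEN m.amount END),
           COUNT(CASE WHEN r.group_id = 'D' THEN 1 END)
    FROM mas m
    LEFT JOIN reftab r ON r.process_type = m.process_type
    GROUP BY m.customer_id
""").fetchall()

# One filtered join per group: each MAS row matches at most one row per join.
two_joins = conn.execute("""
    SELECT m.customer_id,
           SUM(m.amount),
           SUM(CASE WHEN a.group_id IS NOT NULL THEN m.amount END),
           COUNT(d.group_id)
    FROM mas m
    LEFT JOIN reftab a ON a.process_type = m.process_type AND a.group_id = 'A'
    LEFT JOIN reftab d ON d.process_type = m.process_type AND d.group_id = 'D'
    GROUP BY m.customer_id
""").fetchall()

print(single_join)  # [(10, 2200, 450, 2)]
print(two_joins)    # [(10, 1100, 450, 2)]
```

Only AMOUNT1 changes between the two, which matches what the question describes. Whether the two-join form is actually faster on Oracle is something to check in the execution plan.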

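Starting off with your original query, replacing your IN subqueries with correlated EXISTS subqueries may give a boost, although the optimizer often treats the two forms the same way, so compare the execution plans. Also, be wary of summing NULLs: SUM skips them, but it returns NULL when no row matches at all, so perhaps the ELSE in your SUM should be 0? The ELSE in the COUNT has to stay NULL, because COUNT counts every non-NULL value.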
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'A' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'D' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID
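A small sketch of how ELSE NULL and ELSE 0 behave inside SUM and COUNT. The table t and its rows are made up for illustration, and SQLite (via Python's sqlite3) stands in for Oracle here; both follow the standard rule that aggregates skip NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (grp TEXT, amount INT);
    INSERT INTO t VALUES ('A', 200), ('A', 250), ('B', 300);
""")

row = conn.execute("""
    SELECT SUM(CASE WHEN grp = 'A' THEN amount ELSE NULL END),  -- NULLs are skipped
           SUM(CASE WHEN grp = 'Z' THEN amount ELSE NULL END),  -- no match at all
           SUM(CASE WHEN grp = 'Z' THEN amount ELSE 0 END),     -- no match, ELSE 0
           COUNT(CASE WHEN grp = 'A' THEN 1 ELSE NULL END),     -- counts matches only
           COUNT(CASE WHEN grp = 'A' THEN 1 ELSE 0 END)         -- counts every row
    FROM t
""").fetchone()

print(row)  # (450, None, 0, 2, 3)
```

So ELSE 0 only matters for SUM when nothing matches, and it would break the COUNT.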

The normal way is to aggregate the values before the join. You can also use conditional aggregation, if the rest of the query is correct:
SELECT CUSTOMER_ID,
SUM(CASE WHEN COALESCE(seqnum, 1) = 1 THEN AMOUNT END) as AMOUNT1, -- COALESCE keeps MAS rows with no REFTAB match
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A LEFT JOIN
(SELECT B.*, ROW_NUMBER() OVER (PARTITION BY PROCESS_TYPE ORDER BY PROCESS_TYPE) as seqnum
FROM REFTAB B
) B
ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID;
This ignores the duplicates created by the joins.
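One thing worth checking with the row-number approach is what happens to MAS rows that have no match in REFTAB: after the LEFT JOIN their seqnum is NULL. The sketch below compares a plain seqnum = 1 test with a COALESCE(seqnum, 1) = 1 variant. The row with process type 46 is hypothetical, added only to exercise that case, and the code assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mas (customer_id INT, process_type INT, amount INT);
    INSERT INTO mas VALUES (10, 44, 200), (10, 44, 250), (10, 45, 300),
                           (10, 46, 100);  -- hypothetical type with no REFTAB row
    CREATE TABLE reftab (process_type INT, group_id TEXT);
    INSERT INTO reftab VALUES (44, 'A'), (44, 'B'), (45, 'C'), (45, 'D');
""")

row = conn.execute("""
    SELECT SUM(a.amount),                                              -- inflated by the join
           SUM(CASE WHEN b.seqnum = 1 THEN a.amount END),              -- matched rows, once each
           SUM(CASE WHEN COALESCE(b.seqnum, 1) = 1 THEN a.amount END)  -- plus unmatched rows
    FROM mas a
    LEFT JOIN (SELECT r.*, ROW_NUMBER() OVER (PARTITION BY process_type
                                              ORDER BY group_id) AS seqnum
               FROM reftab r) b
           ON a.process_type = b.process_type
""").fetchone()

print(row)  # (1600, 750, 850)
```

If every PROCESS_TYPE in MAS is guaranteed to exist in REFTAB, the two tests give the same result.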

Related

SQL aggregate multiple values in a column then Pivot

I've got a table with a particular column of products. Let's say product A, B, and C.
This table also has a date column.
I'd like to build a pivot table by month, with columns being combinations of users of these products.
So
Just A
Just B
Just C
A and B
A and C
B and C
A, B and C.
I can do it without the combination values as follows:
Select * from
(Select product_type, date_month, person_id
From products_table) As src
PIVOT (
Count(person_id)
For product_type in
(
[A]
,[B]
,[C]
)
) As pivot_table;
So my question is, how do I build combinations of these values and then add them to the pivot? Do I need to build the combination columns first and then add them to the pivot somehow?
DATE_MONTH | A  | B  | C  | A_B | A_C | B_C | A_B_C
01-01-2020 | 30 | 75 | 10 | 105 | 40  | 85  | 115
I think you're looking for something like this. First it determines the distinct product types for each person_id. Then it aggregates the product types using STRING_AGG to create the groupings. Then it uses conditional aggregation to count the person_id's in each category by month.
Using STRING_AGG (SQL Server 2017+)
with
dist_prod_typ_cte(person_id, product_type) as (
select distinct person_id, product_type
from products_table),
prod_grp_cte(person_id, prod_grp) as (
select person_id, string_agg(product_type, '_') within group (order by product_type) p_type_grp
from dist_prod_typ_cte
group by person_id)
select pt.date_month,
count(case when pgc.prod_grp='A' then pt.person_id else null end) A,
count(case when pgc.prod_grp='B' then pt.person_id else null end) B,
count(case when pgc.prod_grp='C' then pt.person_id else null end) C,
count(case when pgc.prod_grp='A_B' then pt.person_id else null end) A_B,
count(case when pgc.prod_grp='A_C' then pt.person_id else null end) A_C,
count(case when pgc.prod_grp='B_C' then pt.person_id else null end) B_C,
count(case when pgc.prod_grp='A_B_C' then pt.person_id else null end) A_B_C
from products_table pt
join prod_grp_cte pgc on pt.person_id=pgc.person_id
group by pt.date_month;
Using STUFF and FOR XML (SQL Server 2016 and prior)
with
dist_prod_typ_cte(person_id, product_type) as (
select distinct person_id, product_type
from products_table),
prod_grp_cte(person_id, prod_grp) as (
select dptc.person_id, stuff((select '_' + cast(dptc_in.product_type as varchar(18))
from dist_prod_typ_cte dptc_in
where dptc.person_id = dptc_in.person_id
order by dptc_in.product_type
for xml path('')), 1, 1, '')
from dist_prod_typ_cte dptc
group by dptc.person_id)
select pt.date_month,
count(case when pgc.prod_grp='A' then pt.person_id else null end) A,
count(case when pgc.prod_grp='B' then pt.person_id else null end) B,
count(case when pgc.prod_grp='C' then pt.person_id else null end) C,
count(case when pgc.prod_grp='A_B' then pt.person_id else null end) A_B,
count(case when pgc.prod_grp='A_C' then pt.person_id else null end) A_C,
count(case when pgc.prod_grp='B_C' then pt.person_id else null end) B_C,
count(case when pgc.prod_grp='A_B_C' then pt.person_id else null end) A_B_C
from products_table pt
join prod_grp_cte pgc on pt.person_id=pgc.person_id
group by pt.date_month;
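Here is a runnable sketch of the same two-step idea (one combination key per person, then conditional aggregation by month) on a handful of made-up rows. It runs on SQLite through Python's sqlite3, and I swapped two things relative to the queries above: the key is built with MAX(CASE) expressions instead of STRING_AGG, which only works because the product list A, B, C is fixed, and the counts use COUNT(DISTINCT ...) so that a person with several rows in one month is counted once. Both substitutions are assumptions about what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products_table (person_id INT, product_type TEXT, date_month TEXT);
    INSERT INTO products_table VALUES
        (1, 'A', '2020-01'), (1, 'B', '2020-01'),
        (2, 'A', '2020-01'),
        (3, 'C', '2020-01'), (3, 'A', '2020-01'), (3, 'B', '2020-01'),
        (4, 'B', '2020-02');
""")

rows = conn.execute("""
    WITH prod_grp_cte AS (
        -- One key per person, e.g. 'A_B'; built with MAX(CASE) instead of STRING_AGG.
        SELECT person_id,
               RTRIM(COALESCE(MAX(CASE WHEN product_type = 'A' THEN 'A_' END), '') ||
                     COALESCE(MAX(CASE WHEN product_type = 'B' THEN 'B_' END), '') ||
                     COALESCE(MAX(CASE WHEN product_type = 'C' THEN 'C_' END), ''), '_') AS prod_grp
        FROM products_table
        GROUP BY person_id)
    SELECT pt.date_month,
           COUNT(DISTINCT CASE WHEN pgc.prod_grp = 'A'     THEN pt.person_id END) AS a,
           COUNT(DISTINCT CASE WHEN pgc.prod_grp = 'B'     THEN pt.person_id END) AS b,
           COUNT(DISTINCT CASE WHEN pgc.prod_grp = 'A_B'   THEN pt.person_id END) AS a_b,
           COUNT(DISTINCT CASE WHEN pgc.prod_grp = 'A_B_C' THEN pt.person_id END) AS a_b_c
    FROM products_table pt
    JOIN prod_grp_cte pgc ON pgc.person_id = pt.person_id
    GROUP BY pt.date_month
    ORDER BY pt.date_month
""").fetchall()

print(rows)  # [('2020-01', 1, 0, 1, 1), ('2020-02', 0, 1, 0, 0)]
```

Note that the key is computed per person across all months, as in the queries above; if the combination should be evaluated per month, person_id and date_month would both go into the key's GROUP BY.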

SQL query for combining data from the same table

I have this table:
create table mytable
(ID varchar(10), VNDCOD INT, MANUF varchar(10), PRICE INT, COST INT);
insert into mytable values
('4', 1000, 'AG', 5455, 9384),
('4', 1000, 'A1', 16, 31),
('4', 2000, 'AG', 5253, 8339)
I want to be able to select something like this:
ID  MANUF  PRICE  COST  PRICE  COST
4   AG     5455   9384  5253   8339
4   A1     16     31
If there are two rows with the same MANUF for an ID, they should be combined on one line, with the PRICE and COST of the VNDCOD 1000 row on the left and those of the VNDCOD 2000 row on the right, as in my expected result. I don't know if it is possible to do this in one query. Any help would be greatly appreciated.
Either use a self-join or a MAX(CASE):
select t1.ID, t1.MANUF, t1.PRICE, t1.COST, t2.PRICE, t2.COST
from mytable as t1 left join mytable as t2
on t1.ID = t2.ID
and t1.MANUF = t2.MANUF
and t2.VNDCOD = 2000
where t1.VNDCOD = 1000
;
select ID, MANUF,
max(case when VNDCOD = 1000 then PRICE end),
max(case when VNDCOD = 1000 then COST end),
max(case when VNDCOD = 2000 then PRICE end),
max(case when VNDCOD = 2000 then COST end)
from mytable
group by ID, MANUF
You can readily do this with conditional aggregation:
select id, manuf,
max(case when vndcod = 1000 then price end) as price_1000,
max(case when vndcod = 1000 then cost end) as cost_1000,
max(case when vndcod = 2000 then price end) as price_2000,
max(case when vndcod = 2000 then cost end) as cost_2000
from mytable t
group by id, manuf;
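The choice of GROUP BY columns decides how many output lines you get. A quick sketch on the question's three rows, run on SQLite through Python's sqlite3 (my choice, only to make it runnable), comparing grouping by ID and MANUF with grouping by ID alone:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (ID TEXT, VNDCOD INT, MANUF TEXT, PRICE INT, COST INT);
    INSERT INTO mytable VALUES ('4', 1000, 'AG', 5455, 9384),
                               ('4', 1000, 'A1', 16, 31),
                               ('4', 2000, 'AG', 5253, 8339);
""")

# The same conditional aggregation, parameterised only by the grouping columns.
PIVOT = """
    SELECT {keys},
           MAX(CASE WHEN VNDCOD = 1000 THEN PRICE END) AS price_1000,
           MAX(CASE WHEN VNDCOD = 1000 THEN COST  END) AS cost_1000,
           MAX(CASE WHEN VNDCOD = 2000 THEN PRICE END) AS price_2000,
           MAX(CASE WHEN VNDCOD = 2000 THEN COST  END) AS cost_2000
    FROM mytable
    GROUP BY {keys}
    ORDER BY {keys}
"""
by_id_manuf = conn.execute(PIVOT.format(keys="ID, MANUF")).fetchall()
by_id_only = conn.execute(PIVOT.format(keys="ID")).fetchall()

print(by_id_manuf)  # [('4', 'A1', 16, 31, None, None), ('4', 'AG', 5455, 9384, 5253, 8339)]
print(by_id_only)   # [('4', 5455, 9384, 5253, 8339)]
```

Grouping by ID alone folds the A1 row into the AG row, so the expected two-line result needs MANUF in the GROUP BY.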

(Oracle) How to get SUM values based on values in other columns?

Suppose there is data like below:
ID Name Cost
ID1 A 10
ID1 A 60
ID1 B 20
ID1 C 20
ID2 B 10
ID2 B 50
ID2 C 50
ID3 B 5
Here in the table above, ID and NAME are not unique.
And I want to get SUM values based on NAME, so the expected result is like below:
ID   A_Costs  B_Costs  C_Costs  AB_Costs
ID1  70       20       20       90
ID2           60       50       60
ID3           5                 5
A_Costs, B_Costs, and C_Costs are the costs when the name is A, B, or C.
But what do I do if I want to get the costs when the name is A or B?
So what I was trying to do was this:
Select t2.ID,
SUM(DECODE (t2.name, 'A', t2.Cost, null)),
SUM(DECODE (t2.name, 'B', t2.Cost, null))
--(select sum(t1.cost) from table t1 where t1.name in ('A','B') and t1.id = t2.id)
from table t2
group by t2.id
But this does not work.
How do I get the costs when the name is A or B, like the line I commented out? Is there an efficient way to get that value in one query?
Thank you in advance.
If you want to use decode(), you can do:
sum(decode(t2.name, 'A', t2.cost, 'B', t2.cost))
Or you can use a case expression:
sum(case when t2.name in ('A', 'B') then t2.cost end)
Full query:
select id,
sum(case when name = 'A' then cost end) as a_costs,
sum(case when name = 'B' then cost end) as b_costs,
sum(case when name = 'C' then cost end) as c_costs,
sum(case when name IN ('A', 'B') then cost end) as ab_costs
from SomeTable
group by id
order by id
You will also have to aggregate after using sum in the inner query.
select
id, max(a_cost) as A_Costs, max(b_cost) as B_Costs,
max(c_cost) as C_Costs, nvl(max(a_cost),0) + nvl(max(b_cost),0) as AB_Costs
from (
select ID,
sum(case when name = 'A' then cost end) as a_cost,
sum(case when name = 'B' then cost end) as b_cost,
sum(case when name = 'C' then cost end) as c_cost
from table
group by id
) t
group by id
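For reference, a runnable sketch of the conditional aggregation with IN on the question's data. The table name costs is made up (the question only says "table"), and SQLite through Python's sqlite3 stands in for Oracle; empty cells in the expected result come back as NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE costs (id TEXT, name TEXT, cost INT);
    INSERT INTO costs VALUES
        ('ID1', 'A', 10), ('ID1', 'A', 60), ('ID1', 'B', 20), ('ID1', 'C', 20),
        ('ID2', 'B', 10), ('ID2', 'B', 50), ('ID2', 'C', 50),
        ('ID3', 'B', 5);
""")

rows = conn.execute("""
    SELECT id,
           SUM(CASE WHEN name = 'A' THEN cost END)         AS a_costs,
           SUM(CASE WHEN name = 'B' THEN cost END)         AS b_costs,
           SUM(CASE WHEN name = 'C' THEN cost END)         AS c_costs,
           SUM(CASE WHEN name IN ('A', 'B') THEN cost END) AS ab_costs
    FROM costs
    GROUP BY id
    ORDER BY id
""").fetchall()

for r in rows:
    print(r)
# ('ID1', 70, 20, 20, 90)
# ('ID2', None, 60, 50, 60)
# ('ID3', None, 5, None, 5)
```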

SQL query - sum of values by status for date interval

One query is driving me crazy. I have a table like the following, and I want to get the sum of Value by Status for every date in an interval.
Table
Id Name Value Date Status
1 pro1 2 01.04.14 0
2 pro1 8 02.04.14 1
3 pro2 6 02.04.14 1
4 pro3 0 03.04.14 0
5 pro4 7 03.04.14 0
6 pro4 2 03.04.14 0
7 pro4 4 03.04.14 1
8 pro4 6 04.04.14 1
9 pro4 1 04.04.14 1
For example,
Input: Name = pro4, minDate = 01.02.14, maxDate = 04.09.14
Output:
Date      Values sum for 0 Status   Values sum for 1 Status
01.04.14  0                         0
02.04.14  0                         0
03.04.14  9 (=7+2)                  4 (only 4 exists)
04.04.14  0                         7 (=6+1)
On 01.04.14 and 02.04.14, pro4 has no values for either status, but I want to show those rows because I need all the dates in the interval. Can anyone help me create this query?
Edit:
I cannot change the structure; I already have that table with data. Every day appears in the table at least once, usually many times.
Thanks in advance.
Assuming you have a row for each date in the table, use conditional aggregation:
select date,
sum(Case when name = 'pro4' and status = 0 then Value else 0 end) as values_0,
sum(case when name = 'pro4' and status = 1 then Value else 0 end) as values_1
from Table t
where date >= '2014-04-01' and date <= '2014-04-09'
group by date
order by date;
If you don't have this list of dates, you can take this approach instead:
with dates as (
select cast('2014-04-01' as date) as thedate
union all
select dateadd(day, 1, thedate)
from dates
where thedate < '2014-04-09'
)
select dates.thedate,
sum(Case when status = 0 then Value else 0 end) as values_0,
sum(case when status = 1 then Value else 0 end) as values_1
from dates left outer join
table t
on t.date = dates.thedate and t.name = 'pro4'
group by dates.thedate;
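A sketch of the generated-dates approach on the question's rows, limited to the four days shown in the expected output. It runs on SQLite through Python's sqlite3, so the recursive CTE uses SQLite's date(x, '+1 day') where the SQL Server query uses dateadd; the table name t and the ISO date strings are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INT, name TEXT, value INT, date TEXT, status INT);
    INSERT INTO t VALUES
        (1, 'pro1', 2, '2014-04-01', 0), (2, 'pro1', 8, '2014-04-02', 1),
        (3, 'pro2', 6, '2014-04-02', 1), (4, 'pro3', 0, '2014-04-03', 0),
        (5, 'pro4', 7, '2014-04-03', 0), (6, 'pro4', 2, '2014-04-03', 0),
        (7, 'pro4', 4, '2014-04-03', 1), (8, 'pro4', 6, '2014-04-04', 1),
        (9, 'pro4', 1, '2014-04-04', 1);
""")

rows = conn.execute("""
    WITH RECURSIVE dates(thedate) AS (
        SELECT '2014-04-01'
        UNION ALL
        SELECT date(thedate, '+1 day') FROM dates WHERE thedate < '2014-04-04'
    )
    SELECT dates.thedate,
           SUM(CASE WHEN t.status = 0 THEN t.value ELSE 0 END) AS values_0,
           SUM(CASE WHEN t.status = 1 THEN t.value ELSE 0 END) AS values_1
    FROM dates
    LEFT JOIN t ON t.date = dates.thedate AND t.name = 'pro4'
    GROUP BY dates.thedate
    ORDER BY dates.thedate
""").fetchall()

for r in rows:
    print(r)
# ('2014-04-01', 0, 0)
# ('2014-04-02', 0, 0)
# ('2014-04-03', 9, 4)
# ('2014-04-04', 0, 7)
```

The ELSE 0 is what turns the unmatched dates into zeros instead of NULLs.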
Just a guess at the query:
select date, sum(case when status = 0 then value else 0 end) as Status0,
sum(case when status = 1 then value else 0 end) as Status1 from table group by date
To expand on my comment, the complete query is:
WITH [counter](N) AS
(SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1)
, days(N) AS (
SELECT row_number() over (ORDER BY (SELECT NULL)) FROM [counter])
, months (N) AS (
SELECT N - 1 FROM days WHERE N < 13)
, calendar ([date]) AS (
SELECT DISTINCT cast(dateadd(DAY, days.n
, dateadd(MONTH, months.n, '20131231')) AS date)
FROM months
CROSS JOIN days
)
SELECT a.Name
, c.Date
, [Sum of 0] = SUM(CASE Status WHEN 0 THEN Value ELSE 0 END)
, [Sum of 1] = SUM(CASE Status WHEN 1 THEN Value ELSE 0 END)
FROM Calendar c
LEFT JOIN myTable a ON c.Date = a.Date AND a.name = 'pro4'
WHERE c.date BETWEEN '20140201' AND '20140904'
GROUP BY c.Date, a.Name
ORDER BY c.Date
Note that the condition on the name needs to be in the JOIN; otherwise you'll get only the dates present in your table.
If you need multiple years, just add another CTE for the count and a dateadd(YEAR,...) in the calendar CTE.
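The point about where the name condition goes is easy to check on a tiny example. In this sketch the two-day calendar table and the two data rows are made up, and SQLite through Python's sqlite3 is used only to make it runnable; the only thing that differs between the two runs is whether the name filter sits in the ON clause or in a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (d TEXT);
    INSERT INTO calendar VALUES ('2014-04-02'), ('2014-04-03');
    CREATE TABLE mytable (name TEXT, value INT, d TEXT, status INT);
    INSERT INTO mytable VALUES ('pro1', 8, '2014-04-02', 1), ('pro4', 7, '2014-04-03', 0);
""")

QUERY = """
    SELECT c.d, SUM(CASE WHEN a.status = 0 THEN a.value ELSE 0 END)
    FROM calendar c
    LEFT JOIN mytable a ON c.d = a.d {name_filter}
    GROUP BY c.d
    ORDER BY c.d
"""
# Filter as part of the join condition: unmatched calendar days survive.
in_join = conn.execute(QUERY.format(name_filter="AND a.name = 'pro4'")).fetchall()
# Filter after the join: rows where a.name is NULL or another name are discarded.
in_where = conn.execute(QUERY.format(name_filter="WHERE a.name = 'pro4'")).fetchall()

print(in_join)   # [('2014-04-02', 0), ('2014-04-03', 7)]
print(in_where)  # [('2014-04-03', 7)]
```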
This is not really the exact query, but I think you can get that by having a query that looks like:
select date, status, sum(value) from table
where (date between mindate and maxdate) and name = product_name
group by date, status;
EDIT
So the above query only gives part of the answer required by the OP. A LEFT OUTER JOIN of the original table and the result of the above query on the date and status fields will give the missing info.
e.g.
select distinct y.date, y.status, x.sum_of_values from table as y
left outer join
(select date, status, sum(value) as sum_of_values
from table
where (date between mindate and maxdate) and name = product_name
group by date, status) as x
on y.date = x.date and y.status = x.status
order by y.date;

Use the result from one column to make column names

I have a select statement, something like:
select trim(time), type, count(1) from table
group by trim(time),type
The results are:
02.10.13 REZ1 1
02.10.13 REZ2 5
02.10.13 REZ3 3
Is it possible to make some select statement with some Oracle function to get the following result:
REZ1 REZ2 REZ3
1 5 3
So, results from one column are column names in the other statement, something like:
select ?SOMETHING?
from (
select trim(time), type, count(1)
from table
group by trim(time),type) s
WITH t(l_date, val, l_count)
AS
(SELECT to_date('02.10.13', 'dd.mm.yy'), 'REZ1' , 1 FROM dual UNION
SELECT to_date('02.10.13', 'dd.mm.yy'), 'REZ2' , 5 FROM dual UNION
SELECT to_date('02.10.13', 'dd.mm.yy'), 'REZ3' , 3 FROM dual
)
SELECT *
FROM
( SELECT val, l_count FROM t
) PIVOT (MAX(l_count) FOR (val) IN ('REZ1' REZ1,'REZ2' REZ2 ,'REZ3' REZ3));
Perhaps you can try the SQL alternative for this:
SELECT MAX(CASE type
WHEN 'REZ1' THEN cnt
ELSE NULL
END) AS REZ1,
MAX(CASE type
WHEN 'REZ2' THEN cnt
ELSE NULL
END) AS REZ2,
MAX(CASE type
WHEN 'REZ3' THEN cnt
ELSE NULL
END) AS REZ3
FROM (SELECT trim(time), type, count(1) as cnt FROM table
GROUP BY trim(time), type);
Or even better:
SELECT COUNT(CASE type
WHEN 'REZ1' THEN 1
ELSE NULL
END) AS REZ1,
COUNT(CASE type
WHEN 'REZ2' THEN 1
ELSE NULL
END) AS REZ2,
COUNT(CASE type
WHEN 'REZ3' THEN 1
ELSE NULL
END) AS REZ3
FROM table;
OUTPUT:
REZ1 | REZ2 | REZ3
1 | 5 | 3
This is applicable when you know the values in the type column. Even pivot, which is available in 11g, requires you to know the values in the column in order to pivot this as described in ajmalmhd04's answer.
Assuming the 2nd column will not have any value other than REZ1..3, the following solution will transpose the table data and also take care of repeated rows by adding up the values.
WITH t
AS (SELECT TO_DATE ('02.10.13', 'dd.mm.yy') AS dt,
'REZ1' AS typ,
1 AS cnt
FROM DUAL
UNION
SELECT TO_DATE ('02.10.13', 'dd.mm.yy') AS dt,
'REZ2' AS typ,
5 AS cnt
FROM DUAL
UNION
SELECT TO_DATE ('02.10.13', 'dd.mm.yy') AS dt,
'REZ3' AS typ,
3 AS cnt
FROM DUAL
UNION
SELECT TO_DATE ('03.10.13', 'dd.mm.yy') AS dt,
'REZ1' AS typ,
7 AS cnt
FROM DUAL
UNION
SELECT TO_DATE ('03.10.13', 'dd.mm.yy') AS dt,
'REZ2' AS typ,
2 AS cnt
FROM DUAL)
SELECT dt,
SUM (CASE WHEN typ = 'REZ1' THEN cnt ELSE 0 END) AS "REZ1",
SUM (CASE WHEN typ = 'REZ2' THEN cnt ELSE 0 END) AS "REZ2",
SUM (CASE WHEN typ = 'REZ3' THEN cnt ELSE 0 END) AS "REZ3"
FROM t
GROUP BY dt;
DT REZ1 REZ2 REZ3
--------- ---------- ---------- ----------
02-OCT-13 1 5 3
03-OCT-13 7 2 0
2 rows selected.
In case column 2 can hold arbitrary values, some kind of cursor or dynamic SQL would probably be needed to build the columns dynamically.
I will try to post that if I manage to design it.
select trim(time),
count(decode(type, 'REZ1', 1)) AS REZ1,
count(decode(type, 'REZ2', 1)) AS REZ2,
count(decode(type, 'REZ3', 1)) AS REZ3
from table
group by trim(time)
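When the values of type are not known in advance, one option is to read the distinct values first and generate the conditional-aggregation columns from them. This is only a sketch of that idea, done client-side in Python against SQLite rather than with an Oracle cursor or PIVOT; the table name events and its rows are made up to reproduce the 1 / 5 / 3 counts from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (time TEXT, type TEXT);
    INSERT INTO events VALUES ('02.10.13', 'REZ1'),
        ('02.10.13', 'REZ2'), ('02.10.13', 'REZ2'), ('02.10.13', 'REZ2'),
        ('02.10.13', 'REZ2'), ('02.10.13', 'REZ2'),
        ('02.10.13', 'REZ3'), ('02.10.13', 'REZ3'), ('02.10.13', 'REZ3');
""")

# Step 1: read the distinct values that will become column names.
types = [r[0] for r in conn.execute("SELECT DISTINCT type FROM events ORDER BY type")]

# Step 2: build one COUNT(CASE ...) per value. Values are bound as parameters;
# the alias is quoted as an identifier, with embedded double quotes doubled.
columns = ", ".join(
    'COUNT(CASE WHEN type = ? THEN 1 END) AS "{}"'.format(t.replace('"', '""'))
    for t in types
)
cursor = conn.execute(
    "SELECT trim(time) AS day, {} FROM events GROUP BY trim(time)".format(columns),
    types,
)
header = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(header)  # ['day', 'REZ1', 'REZ2', 'REZ3']
print(rows)    # [('02.10.13', 1, 5, 3)]
```

The trade-off is two round trips and a result shape that changes with the data, so the calling code has to read the column names from the cursor rather than hard-code them.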