SQL: Optimize query with multiple CASE statements

I have a query with many CASE expressions that runs for a very long time because of the number of rows. My research has not turned up a solution yet. Is there a way to write the CASE expressions more efficiently?
Database: Oracle
table_a, table, and table_y are all the same table that I SELECT from.
Example data:
contract_number  product_description  product        damagenumber  date      internalname  payment
1                Product T-Shirt      product_name   111           20210101  Web           30.20
2                Product T-Shirt      product_name   222           20210202  Web           19.38
3                Product Hoodie       product_name2  333           20210215  Store         20.49
3                Product Hoodie       product_name2  334           20210302  Store         15.99
5                Product Hoodie       product_name2  123           20210120  Telephone     99.99
SELECT
    contract_number,
    product_description,
    product,
    CASE
        WHEN ( x.product = 'product_name'
               AND (
                   SELECT
                       COUNT(DISTINCT damagenumber)
                   FROM
                       table z
                   WHERE
                       date BETWEEN add_months(trunc(sysdate), - 6) AND sysdate
                       AND internalname <> 'CONDITION'
                       AND x.contract_number = z.contract_number
                   GROUP BY
                       z.contract_number
               ) = 1
               AND (
                   SELECT
                       SUM(y.payment)
                   FROM
                       table_y y
                   WHERE
                       date BETWEEN add_months(trunc(sysdate), - 6) AND sysdate
                       AND internalname <> 'CONDITION'
                       AND x.contract_number = y.contract_number
               ) > 1500 ) THEN
            (
                SELECT
                    COUNT(DISTINCT damagenumber)
                FROM
                    table z
                WHERE
                    date BETWEEN add_months(trunc(sysdate), - 6) AND sysdate
                    AND internalname <> 'CONDITION'
                    AND x.contract_number = z.contract_number
                GROUP BY
                    z.contract_number
            )
        ELSE
            0
    END AS count_numbers
FROM
    table_a x
GROUP BY
    x.contract_number,
    x.product_description,
    x.product;
The above is a simplified example. I have a lot of WHEN conditions in my query.
Thanks in advance

Maybe this helps: compute each aggregate once in a CTE and join to the result, instead of repeating the correlated subqueries in every CASE branch.
WITH tabletemp AS (
    SELECT z.contract_number
         , COUNT(DISTINCT z.damagenumber) damagenumber_count
    FROM table z
    WHERE z.date BETWEEN add_months(trunc(sysdate), - 6) AND sysdate AND z.internalname <> 'CONDITION'
    GROUP BY z.contract_number
), tabletemp2 AS (
    SELECT y.contract_number
         , SUM(y.payment) payment_sum
    FROM table_y y
    WHERE y.date BETWEEN add_months(trunc(sysdate), - 6) AND sysdate AND y.internalname <> 'CONDITION'
    GROUP BY y.contract_number
)
SELECT
    x.contract_number,
    x.product_description,
    x.product,
    -- contracts with no rows in the CTEs produce NULLs here and fall through to ELSE 0
    CASE WHEN ( x.product = 'product_name' AND tt.damagenumber_count = 1 AND tt2.payment_sum > 1500 ) THEN
        tt.damagenumber_count
    ELSE
        0
    END AS count_numbers
FROM
    table_a x
    -- LEFT JOIN keeps contracts that have no rows in the last six months
    LEFT JOIN tabletemp tt ON (tt.contract_number = x.contract_number)
    LEFT JOIN tabletemp2 tt2 ON (tt2.contract_number = x.contract_number)
This query may still contain errors (I can't test it), but the approach is worth trying.
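To try the pre-aggregation idea end to end, here is a minimal sketch that runs in SQLite through Python's standard sqlite3 module, as a stand-in for Oracle. The table name claims, the column claim_date, the fixed date window, and the sample rows are all assumptions made for illustration. Because the question says both aggregates read the same table with the same filter, the sketch computes them in a single CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (              -- hypothetical stand-in for the asker's table
    contract_number INTEGER,
    product         TEXT,
    damagenumber    INTEGER,
    claim_date      TEXT,          -- ISO date string, so string comparison orders correctly
    internalname    TEXT,
    payment         REAL
);
INSERT INTO claims VALUES
    (1, 'product_name',  111, '2021-01-10', 'Web',   1600.0),
    (2, 'product_name',  222, '2021-02-02', 'Web',    900.0),
    (2, 'product_name',  223, '2021-02-20', 'Web',    900.0),
    (3, 'product_name2', 333, '2021-02-15', 'Store', 2000.0),
    (4, 'product_name',  444, '2020-01-05', 'Web',   5000.0);
""")

query = """
WITH recent AS (
    -- both aggregates are computed once per contract
    SELECT contract_number,
           COUNT(DISTINCT damagenumber) AS damage_count,
           SUM(payment)                 AS payment_sum
    FROM claims
    WHERE claim_date BETWEEN '2020-10-01' AND '2021-04-01'  -- fixed window instead of sysdate
      AND internalname <> 'CONDITION'
    GROUP BY contract_number
)
SELECT x.contract_number,
       x.product,
       CASE WHEN x.product = 'product_name'
             AND r.damage_count = 1
             AND r.payment_sum > 1500
            THEN r.damage_count
            ELSE 0
       END AS count_numbers
FROM (SELECT DISTINCT contract_number, product FROM claims) x
LEFT JOIN recent r ON r.contract_number = x.contract_number
ORDER BY x.contract_number
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only contract 1 satisfies all three conditions. Contract 4 has no rows inside the window, so the LEFT JOIN supplies NULLs and the CASE falls through to 0 rather than dropping the contract.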

Related

substituting "filter" in a sql query on oracle

We have a table with data that has one date column indicating what day the data is for ("planning_day") and another column for logging when the data was sent ("first_sent_time").
I'm trying to build a report showing how far into the past or future we sent data on each day. So if today we sent 2 records for yesterday, 5 for today and 1 for the day after tomorrow, the result should look something like this:
sent_day    minus2  minus1  sameDay  plus1  plus2
2021-11-24  0       2       5        0      1
...
I know I could do this in postgres with a query using "filter":
select
trunc(t.first_sent_time),
count(t.id) filter (where t.planning_day - trunc(t.first_sent_time) = -2) as "minus2",
count(t.id) filter (where t.planning_day - trunc(t.first_sent_time) = -1) as "minus1",
count(t.id) filter (where t.planning_day - trunc(t.first_sent_time) = 0) as "sameDay",
count(t.id) filter (where t.planning_day - trunc(t.first_sent_time) = 1) as "plus1",
count(t.id) filter (where t.planning_day - trunc(t.first_sent_time) = 2) as "plus2"
from
my_table t
group by
trunc(t.first_sent_time)
;
Unfortunately, "filter" doesn't exist in Oracle, so I need help here. I tried something like the following:
select
sent_day,
sum(minus2),
sum(minus1),
sum(sameDay),
sum(plus1),
sum(plus2)
from (
select
*
from (
select
b.id,
trunc(b.first_sent_time) as sent_day,
b.planning_day,
b.planning_day - trunc(b.first_sent_time) as day_diff
from
my_table b
where
b.first_sent_time >= DATE '2021-11-01'
)
pivot (
count(id) for day_diff in (-2 as "minus2",-1 as "minus1",0 as "sameDay", 1 as "plus1",2 as "plus2")
)
)
group by
sent_day
order by
sent_day
;
but it doesn't work, and it feels like I'm making this too complicated and there must be an easier solution.

Use a CASE expression inside the aggregate function to simulate the filter. COUNT ignores NULLs, so rows that do not match the WHEN condition are not counted.
Here is a simplified example:
with dt as (
select 1 id , 1 diff_days from dual union all
select 2 id , 1 diff_days from dual union all
select 3 id , -1 diff_days from dual union all
select 4 id , -1 diff_days from dual union all
select 4 id , -1 diff_days from dual)
/* query */
select
count(case when diff_days = 1 then id end) as cnt_1,
count(case when diff_days = -1 then id end) as cnt_minus_1
from dt;
which results in
     CNT_1 CNT_MINUS_1
---------- -----------
         2           3
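Applied to the report from the question, the same pattern can be sketched as follows. This runs in SQLite through Python's sqlite3 module rather than Oracle, so the date arithmetic uses julianday() where Oracle would subtract dates directly; the sample rows are assumptions chosen to match the example in the question (2 records for yesterday, 5 for today, 1 for the day after tomorrow).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    " id INTEGER PRIMARY KEY, first_sent_time TEXT, planning_day TEXT)"
)

# Sample data (assumed): everything was sent on 2021-11-24.
sample = (
    [("2021-11-24 08:00:00", "2021-11-23")] * 2    # for yesterday
    + [("2021-11-24 09:30:00", "2021-11-24")] * 5  # for the same day
    + [("2021-11-24 17:45:00", "2021-11-26")] * 1  # for the day after tomorrow
)
conn.executemany(
    "INSERT INTO my_table (first_sent_time, planning_day) VALUES (?, ?)", sample
)

query = """
SELECT sent_day,
       COUNT(CASE WHEN day_diff = -2 THEN id END) AS minus2,
       COUNT(CASE WHEN day_diff = -1 THEN id END) AS minus1,
       COUNT(CASE WHEN day_diff =  0 THEN id END) AS sameDay,
       COUNT(CASE WHEN day_diff =  1 THEN id END) AS plus1,
       COUNT(CASE WHEN day_diff =  2 THEN id END) AS plus2
FROM (
    SELECT id,
           date(first_sent_time) AS sent_day,
           -- whole days between the planning day and the day the data was sent
           CAST(julianday(planning_day)
                - julianday(date(first_sent_time)) AS INTEGER) AS day_diff
    FROM my_table
)
GROUP BY sent_day
ORDER BY sent_day
"""

report = conn.execute(query).fetchall()
print(report)
```

Each COUNT sees a non-NULL id only for rows in its own bucket, so a single GROUP BY over sent_day produces all five columns without a PIVOT.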

Getting rid of grouping field

Is there a safe way to not have to group by a field when using an aggregate in another field? Here is my example
SELECT
C.CustomerName
,D.INDUSTRY_CODE
,CASE WHEN D.INDUSTRY_CODE IN ('003','004','005','006','007','008','009','010','017','029')
THEN 'PM'
WHEN UPPER(CustomerName) = 'ULINE INC'
THEN 'ULINE'
ELSE 'DR'
END AS BU
,ISNULL((SELECT SUM(GrossAmount)
where CONVERT(date,convert(char(8),InvoiceDateID )) between DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 1, 0) and DATEADD(year, -1, GETDATE())),0) [PREVIOUS YEAR GROSS]
FROM factMargins A
LEFT OUTER JOIN dimDate B ON A.InvoiceDateID = B.DateId
LEFT OUTER JOIN dimCustomer C ON A.CustomerID = C.CustomerId
LEFT OUTER JOIN CRCDATA.DBO.CU10 D ON D.CUST_NUMB = C.CustomerNumber
GROUP BY
C.CustomerName,D.INDUSTRY_CODE
,A.InvoiceDateID
order by CustomerName
Before grouping I was getting only 984 rows, but after adding A.InvoiceDateID to the GROUP BY I get over 11k rows. The row count blows up because there are multiple invoices per customer. MIN and MAX won't work, since they would pull the wrong data. Would it be best to let my application (Crystal) get rid of the extra lines? I usually like my base data to be as close as possible to the report layout.

Try moving the reference to InvoiceDateID to within an aggregate function, rather than within a selected subquery's WHERE clause.
In Oracle, here's an example:
with TheData as (
select 'A' customerID, 25 AMOUNT , trunc(sysdate) THEDATE from dual union
select 'B' customerID, 35 AMOUNT , trunc(sysdate-1) THEDATE from dual union
select 'A' customerID, 45 AMOUNT , trunc(sysdate-2) THEDATE from dual union
select 'A' customerID, 11000 AMOUNT , trunc(sysdate-3) THEDATE from dual union
select 'B' customerID, 12000 AMOUNT , trunc(sysdate-4) THEDATE from dual union
select 'A' customerID, 15000 AMOUNT , trunc(sysdate-5) THEDATE from dual)
select
CustomerID,
sum(amount) as "AllRevenue",
sum(case when thedate<sysdate-3 then amount else 0 end) as "OlderRevenue"
from thedata
group by customerID;
Output:
CustomerID | AllRevenue | OlderRevenue
A | 26070 | 26000
B | 12035 | 12000
This says:
For each customerID
I want the sum of all amounts
and I want the sum of amounts earlier than 3 days ago

SQL Deduct value from multiple rows

I would like to apply a total discount of $10.00 to each customer. The discount should be applied across multiple transactions until the full $10.00 is used.
Example:
CustomerID  Transaction Amount  Discount  TransactionID
1           $8.00               $8.00     1
1           $6.00               $2.00     2
1           $5.00               $0.00     3
1           $1.00               $0.00     4
2           $5.00               $5.00     5
2           $2.00               $2.00     6
2           $2.00               $2.00     7
3           $45.00              $10.00    8
3           $6.00               $0.00     9

The query below keeps track of the running sum and calculates the discount depending on whether the running sum is greater than or less than the discount amount.
select
customerid, transaction_amount, transactionid,
(case when 10 > (sum_amount - transaction_amount)
then (case when transaction_amount >= 10 - (sum_amount - transaction_amount)
then 10 - (sum_amount - transaction_amount)
else transaction_amount end)
else 0 end) discount
from (
select customerid, transaction_amount, transactionid,
sum(transaction_amount) over (partition by customerid order by transactionid) sum_amount
from Table1
) t1 order by customerid, transactionid
http://sqlfiddle.com/#!6/552c2/7
The same query with a self join, which should work on most databases, including SQL Server 2008:
select
customerid, transaction_amount, transactionid,
(case when 10 > (sum_amount - transaction_amount)
then (case when transaction_amount >= 10 - (sum_amount - transaction_amount)
then 10 - (sum_amount - transaction_amount)
else transaction_amount end)
else 0 end) discount
from (
select t1.customerid, t1.transaction_amount, t1.transactionid,
sum(t2.transaction_amount) sum_amount
from Table1 t1
join Table1 t2 on t1.customerid = t2.customerid
and t1.transactionid >= t2.transactionid
group by t1.customerid, t1.transaction_amount, t1.transactionid
) t1 order by customerid, transactionid
http://sqlfiddle.com/#!3/552c2/2
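As a quick check of the running-sum logic, the window-function version can be run against the sample data from the question. This is a sketch using SQLite through Python's sqlite3 module (it needs an SQLite build with window functions, 3.25 or later); the column types are assumptions, and the CASE expression is copied from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 ("
    " customerid INTEGER, transaction_amount REAL, transactionid INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, 8.0, 1), (1, 6.0, 2), (1, 5.0, 3), (1, 1.0, 4),
        (2, 5.0, 5), (2, 2.0, 6), (2, 2.0, 7),
        (3, 45.0, 8), (3, 6.0, 9),
    ],
)

query = """
SELECT customerid, transactionid,
       -- (sum_amount - transaction_amount) is what earlier rows already consumed
       (CASE WHEN 10 > (sum_amount - transaction_amount)
             THEN (CASE WHEN transaction_amount >= 10 - (sum_amount - transaction_amount)
                        THEN 10 - (sum_amount - transaction_amount)
                        ELSE transaction_amount END)
             ELSE 0 END) AS discount
FROM (
    SELECT customerid, transaction_amount, transactionid,
           SUM(transaction_amount) OVER (
               PARTITION BY customerid ORDER BY transactionid
           ) AS sum_amount
    FROM Table1
) t1
ORDER BY customerid, transactionid
"""

discounts = conn.execute(query).fetchall()
for row in discounts:
    print(row)
```

The discount column comes out as 8, 2, 0, 0 for customer 1, then 5, 2, 2 for customer 2, and 10, 0 for customer 3, which matches the expected table in the question.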

You can do this with recursive common table expressions, although it isn't particularly pretty. SQL Server struggles to optimize these types of queries. See Sum of minutes between multiple date ranges for some discussion.
If you wanted to go further with this approach, you'd probably need to materialize x as a temporary table so that you can index it on (customerid, rn).
;with x as (
select
tx.*,
row_number() over (
partition by customerid
order by transaction_amount desc, transactionid
) rn
from
tx
), y as (
select
x.transactionid,
x.customerid,
x.transaction_amount,
case
when 10 >= x.transaction_amount then x.transaction_amount
else 10
end as discount,
case
when 10 >= x.transaction_amount then 10 - x.transaction_amount
else 0
end as remainder,
x.rn as rn
from
x
where
rn = 1
union all
select
x.transactionid,
x.customerid,
x.transaction_amount,
case
when y.remainder >= x.transaction_amount then x.transaction_amount
else y.remainder
end,
case
when y.remainder >= x.transaction_amount then y.remainder - x.transaction_amount
else 0
end,
x.rn
from
y
inner join
x
on y.rn = x.rn - 1 and y.customerid = x.customerid
where
y.remainder > 0
)
update
tx
set
discount = y.discount
from
tx
inner join
y
on tx.transactionid = y.transactionid;
Example SQLFiddle

I usually like to set up a test environment for questions like this. I will use a local temporary table. Note that I deliberately inserted the data out of order, since ordering is not guaranteed in real life.
-- drop the play table if it already exists
if exists (select 1 from tempdb.sys.tables where name like '%transactions%')
drop table #transactions
go
-- play table
create table #transactions
(
trans_id int identity(1,1) primary key,
customer_id int,
trans_amt smallmoney
)
go
-- add data
insert into #transactions
values
(1,$8.00),
(2,$5.00),
(3,$45.00),
(1,$6.00),
(2,$2.00),
(1,$5.00),
(2,$2.00),
(1,$1.00),
(3,$6.00);
go
I am going to give you two answers.
First, in SQL Server 2014 (window frames have been available since SQL Server 2012) you can use ROWS ... PRECEDING in window functions. This lets us compute a running total (rt) and a running total lagged by one entry. Given these two values, we can determine whether the maximum discount has been exceeded.
-- Two running totals (window frames, SQL Server 2012+)
;
with cte_running_total
as
(
select
*,
SUM(trans_amt)
OVER (PARTITION BY customer_id
ORDER BY trans_id
ROWS BETWEEN UNBOUNDED PRECEDING AND
0 PRECEDING) as running_tot_p0,
SUM(trans_amt)
OVER (PARTITION BY customer_id
ORDER BY trans_id
ROWS BETWEEN UNBOUNDED PRECEDING AND
1 PRECEDING) as running_tot_p1
from
#transactions
)
select
*
,
case
when coalesce(running_tot_p1, 0) <= 10 and running_tot_p0 <= 10 then
trans_amt
when coalesce(running_tot_p1, 0) <= 10 and running_tot_p0 > 10 then
10 - coalesce(running_tot_p1, 0)
else 0
end as discount_amt
from cte_running_total;
Again, the above version is using a common table expression and advanced windowing to get the totals.
Do not fret! The same can be done all the way down to SQL 2000.
Second solution, I am just going to use the order by, sub-queries, and a temporary table to store the information that is normally in the CTE. You can switch the temporary table for a CTE in SQL 2008 if you want.
-- w/o any fancy functions - save to temp table
select *,
(
select count(*) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id <= o.trans_id
) as sys_rn,
(
select sum(trans_amt) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id <= o.trans_id
) as sys_tot_p0,
(
select sum(trans_amt) from #transactions i
where i.customer_id = o.customer_id
and i.trans_id < o.trans_id
) as sys_tot_p1
into #results
from #transactions o
order by customer_id, trans_id
go
-- report off temp table
select
trans_id,
customer_id,
trans_amt,
case
when coalesce(sys_tot_p1, 0) <= 10 and sys_tot_p0 <= 10 then
trans_amt
when coalesce(sys_tot_p1, 0) <= 10 and sys_tot_p0 > 10 then
10 - coalesce(sys_tot_p1, 0)
else 0
end as discount_amt
from #results
order by customer_id, trans_id
go
In short, your answer is shown in the following screenshot. Cut and paste the code into SSMS and have some fun.

Insert data into temp table, multiple columns from one Table

I have a table with columns: ID (Int), Date (Date) and Price (Decimal). Date column is in format 2013-04-14:
Table Example
ID  Date        Price
1   2012/05/02  23.5
1   2012/05/03  25.2
1   2012/05/04  22.5
1   2012/05/05  22.2
1   2012/05/06  26.5
2   2012/05/02  143.5
2   2012/05/03  145.2
2   2012/05/04  142.2
2   2012/05/05  146.5
3   2012/05/02  83.5
3   2012/05/03  85.2
3   2012/05/04  80.5
Query Example:
I want to be able to select all ID1 and ID3's data between a date range from the table and have this in a table with three columns, ordered by Date column. Also I would want to insert this into a temporary table to perform mathematical calculations on the data. Please comment if there is a better way.
Correct Result Example
Date        ID1   ID3
2012-05-02  23.5  83.5
2012-05-03  25.2  85.2
2012-05-04  22.5  80.2
Any help and advice will be appreciated,
Thanks

Try the following.
CREATE TABLE #temp (
Date date,
x money,
y money
)
;
INSERT INTO #temp (Date, x, y)
SELECT
Date,
MAX(CASE WHEN id=1 THEN price END) AS x,
MAX(CASE WHEN id=3 THEN price END) AS y
FROM Top40
WHERE Date BETWEEN '2012-05-02' AND '2012-05-04'
GROUP BY
Date
;
See SQL Fiddle for working example
EDIT:
To use the LAG window function on the x and y columns, you'll have to use a common table expression or CTE first.
WITH prices AS(
SELECT
Date as myDate,
MAX(CASE WHEN id=1 THEN price END) AS x,
MAX(CASE WHEN id=3 THEN price END) AS y
FROM Top40
WHERE Date BETWEEN '2012-05-02' AND '2012-05-04'
GROUP BY
Date
)
SELECT
myDate,
p.x,
(p.x/(LAG(p.x) OVER (ORDER BY MyDate))-1) as x_return,
p.y,
(p.y/(LAG(p.y) OVER (ORDER BY MyDate))-1) as y_return
FROM prices p
ORDER BY
myDate
;
See new SQL Fiddle for example.
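The pivot-then-LAG idea can be tried locally with the sample rows from the question. The following sketch uses SQLite through Python's sqlite3 module instead of SQL Server, so the temp table is created with CREATE TEMP TABLE ... AS rather than #temp, and the column name price_date is an assumption that avoids the Date keyword. It needs SQLite 3.25 or later for LAG.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Top40 (id INTEGER, price_date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Top40 VALUES (?, ?, ?)",
    [
        (1, "2012-05-02", 23.5), (1, "2012-05-03", 25.2), (1, "2012-05-04", 22.5),
        (2, "2012-05-02", 143.5), (2, "2012-05-03", 145.2), (2, "2012-05-04", 142.2),
        (3, "2012-05-02", 83.5), (3, "2012-05-03", 85.2), (3, "2012-05-04", 80.5),
    ],
)

# Pivot: one row per date, one column per ID of interest.
conn.execute("""
CREATE TEMP TABLE prices AS
SELECT price_date,
       MAX(CASE WHEN id = 1 THEN price END) AS x,
       MAX(CASE WHEN id = 3 THEN price END) AS y
FROM Top40
WHERE price_date BETWEEN '2012-05-02' AND '2012-05-04'
GROUP BY price_date
""")

pivoted = conn.execute(
    "SELECT price_date, x, y FROM prices ORDER BY price_date"
).fetchall()

# Day-over-day return of x; the first row has no predecessor, so it is NULL.
returns = conn.execute("""
SELECT price_date,
       x / LAG(x) OVER (ORDER BY price_date) - 1 AS x_return
FROM prices
ORDER BY price_date
""").fetchall()

print(pivoted)
print(returns)
```

MAX is used only to collapse each date's group to the single matching price; any aggregate that ignores NULLs would do, provided there is at most one row per ID and date.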

The simplest way to do it in code (although it may not perform well with large data sets) is to do something like:
SELECT [Date], x = MAX(CASE WHEN ID = 1 THEN PRICE END)
, y = MAX(CASE WHEN ID = 3 THEN PRICE END)
INTO #tmp
FROM Top40
GROUP BY [Date]

Or...
select coalesce(t1."Date", t2."Date") as "Date" , t1.Price as Stock_1_Price , t2.Price as Stock_3_price
from ( select "Date" , max(Price) as Price from myData where ID = 1 group by "Date" ) t1
full join ( select "Date" , max(Price) as Price from myData where ID = 3 group by "Date" ) t2 on t2."Date" = t1."Date"
As far as populating a temp table goes, any of the usual ways works.
Table variable:
declare @work table
(
yyyymmdd varchar(32) not null ,
stock_1_price money null ,
stock_3_price money null
)
insert @work ( yyyymmdd , stock_1_price , stock_3_price )
select coalesce(t1."Date", t2."Date") , t1.Price as Stock_1_Price , t2.Price as Stock_3_price
from ( select "Date" , max(Price) as Price from myData where ID = 1 group by "Date" ) t1
full join ( select "Date" , max(Price) as Price from myData where ID = 3 group by "Date" ) t2 on t2."Date" = t1."Date"
Declared temp table in tempdb:
create table #work
(
yyyymmdd varchar(32) not null primary key clustered ,
stock_1_price money null ,
stock_3_price money null
)
insert #work ( yyyymmdd , stock_1_price , stock_3_price )
select coalesce(t1."Date", t2."Date") , t1.Price as Stock_1_Price , t2.Price as Stock_3_price
from ( select "Date" , max(Price) as Price from myData where ID = 1 group by "Date" ) t1
full join ( select "Date" , max(Price) as Price from myData where ID = 3 group by "Date" ) t2 on t2."Date" = t1."Date"
Undeclared temp table in tempdb via SELECT INTO:
select coalesce(t1."Date", t2."Date") as "Date" , t1.Price as Stock_1_Price , t2.Price as Stock_3_price
into #work
from ( select "Date" , max(Price) as Price from myData where ID = 1 group by "Date" ) t1
full join ( select "Date" , max(Price) as Price from myData where ID = 3 group by "Date" ) t2 on t2."Date" = t1."Date"

SQL Query Help (Advanced - for me!)

I have a question about a SQL query I am trying to write.
I need to query data from a database.
The database has, amongst others, these 3 fields:
Account_ID #, Date_Created, Time_Created
I need to write a query that tells me how many accounts were opened per hour.
I have written said query, but there are times that there were 0 accounts created, so these "hours" are not populated in the results.
For example:
Volume  Date_Hour
435     12-Aug-12 03
213     12-Aug-12 04
125     12-Aug-12 06
As seen in the example above, hour 5 did not have any accounts opened.
Is there a way for the result to include that hour and display 0 accounts opened for it?
Example of how I want my results to look:
Volume  Date_Hour
435     12-Aug-12 03
213     12-Aug-12 04
0       12-Aug-12 05
125     12-Aug-12 06
Thanks!
Update: This is what I have so far
SELECT count(*) as num_apps, to_date(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM accounts
WHERE To_Date(created_ts,'DD-Mon-RR') >= To_Date('16-Aug-12','DD-Mon-RR')
GROUP BY To_Date(created_ts,'DD-Mon-RR'), To_Char(created_ts,'HH24')
ORDER BY app_date, app_hour

To get the results you want, you will need to create a table (or use a query to generate a "temp" table) and then use a left join to your calculation query to get rows for every hour - even those with 0 volume.
For example, assume I have a table with app_date and app_hour fields. Also assume that this table has a row for every day/hour you wish to report on.
The query would be:
SELECT NVL(c.num_apps,0) as num_apps, t.app_date, t.app_hour
FROM time_table t
LEFT OUTER JOIN
(
SELECT count(*) as num_apps, to_date(created_ts,'DD-Mon-RR') as app_date, to_char(created_ts,'HH24') as app_hour
FROM accounts
WHERE To_Date(created_ts,'DD-Mon-RR') >= To_Date('16-Aug-12','DD-Mon-RR')
GROUP BY To_Date(created_ts,'DD-Mon-RR'), To_Char(created_ts,'HH24')
ORDER BY app_date, app_hour
) c ON (t.app_date = c.app_date AND t.app_hour = c.app_hour)
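The left-join approach above can be sketched end to end as follows. This is an illustration in SQLite through Python's sqlite3 module, not Oracle; the hour table is generated with a recursive CTE instead of being a real time_table, and the handful of accounts rows is made-up data that leaves hour 05 empty, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, created_ts TEXT)")
conn.executemany(
    "INSERT INTO accounts (created_ts) VALUES (?)",
    [
        ("2012-08-12 03:05:00",), ("2012-08-12 03:40:00",),
        ("2012-08-12 04:15:00",),
        # nothing was created during hour 05
        ("2012-08-12 06:01:00",), ("2012-08-12 06:30:00",), ("2012-08-12 06:59:00",),
    ],
)

query = """
WITH RECURSIVE hours(hour_start) AS (
    -- one row per hour in the reporting range
    SELECT '2012-08-12 03:00:00'
    UNION ALL
    SELECT datetime(hour_start, '+1 hour')
    FROM hours
    WHERE hour_start < '2012-08-12 06:00:00'
),
counts AS (
    -- accounts created per hour; hours without accounts are simply absent
    SELECT strftime('%Y-%m-%d %H:00:00', created_ts) AS hour_start,
           COUNT(*) AS num_apps
    FROM accounts
    GROUP BY strftime('%Y-%m-%d %H:00:00', created_ts)
)
SELECT h.hour_start, COALESCE(c.num_apps, 0) AS num_apps
FROM hours h
LEFT JOIN counts c ON c.hour_start = h.hour_start
ORDER BY h.hour_start
"""

hourly = conn.execute(query).fetchall()
for row in hourly:
    print(row)
```

The outer query always starts from the generated hours, so hour 05 appears with a count of 0. COALESCE plays the role of NVL in the Oracle version.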

I believe the best solution is not to create some fancy temporary table but just use this construct:
select level
FROM Dual
CONNECT BY level <= 10
ORDER BY level;
This will give you (in ten rows):
1
2
3
4
5
6
7
8
9
10
For hourly intervals, only a small modification is needed (level - 1 makes the series start at hour 0 of the first day):
select 0 as num_apps, (To_Date('16-09-12','DD-MM-RR') + (level - 1) / 24) as created_ts
FROM dual
CONNECT BY level <= (sysdate - To_Date('16-09-12','DD-MM-RR')) * 24 ;
And just for the fun of it, here is a complete solution (I didn't test the syntax, so apologies for any mistakes, but the idea should be clear). It assumes created_ts is a DATE:
SELECT SUM(num_apps) as num_apps, trunc(created_ts) as app_date, to_char(created_ts,'HH24') as app_hour
FROM(
-- one row per account, each counting as 1
SELECT 1 as num_apps, created_ts
FROM accounts
WHERE created_ts >= To_Date('16-09-12','DD-MM-RR')
UNION ALL
-- one zero row per hour, so that empty hours still appear
select 0 as num_apps, (To_Date('16-09-12','DD-MM-RR') + (level - 1) / 24) as created_ts
FROM dual
CONNECT BY level <= (sysdate - To_Date('16-09-12','DD-MM-RR')) * 24
)
GROUP BY trunc(created_ts), to_char(created_ts,'HH24')
ORDER BY app_date, app_hour
;

You can also use a CASE statement in the SELECT to force the value you want.

It can be useful to have a "sequence table" kicking around, for all sorts of reasons, something that looks like this:
create table dbo.sequence
(
id int not null primary key clustered ,
)
Load it up with a million or so rows, covering positive and negative values.
Then, given a table that looks like this
create table dbo.SomeTable
(
account_id int not null primary key clustered ,
date_created date not null ,
time_created time not null ,
)
Your query is then as simple as (in SQL Server):
select year_created = years.id ,
month_created = months.id ,
day_created = days.id ,
hour_created = hours.id ,
volume = t.volume
from ( select * ,
is_leap_year = case
when id % 400 = 0 then 1
when id % 100 = 0 then 0
when id % 4 = 0 then 1
else 0
end
from dbo.sequence
where id between 1980 and year(current_timestamp)
) years
cross join ( select *
from dbo.sequence
where id between 1 and 12
) months
left join ( select *
from dbo.sequence
where id between 1 and 31
) days on days.id <= case months.id
when 2 then 28 + years.is_leap_year
when 4 then 30
when 6 then 30
when 9 then 30
when 11 then 30
else 31
end
cross join ( select *
from dbo.sequence
where id between 0 and 23
) hours
left join ( select date_created ,
hour_created = datepart(hour,time_created ) ,
volume = count(*)
from dbo.SomeTable
group by date_created ,
datepart(hour,time_created)
) t on datepart( year , t.date_created ) = years.id
and datepart( month , t.date_created ) = months.id
and datepart( day , t.date_created ) = days.id
and t.hour_created = hours.id
order by 1,2,3,4

It's not clear to me if created_ts is a datetime or a varchar. If it's a datetime, you shouldn't use to_date; if it's a varchar, you shouldn't use to_char.
Assuming it's a datetime, and borrowing @jakub.petr's FROM Dual CONNECT BY level trick, I suggest generating one row per hour and left-joining accounts to it. Counting a column from accounts, rather than count(*), makes empty hours come out as 0:
SELECT count(a.created_ts) as num_apps, to_char(h.hour_start,'DD-Mon-RR') as app_date, to_char(h.hour_start,'HH24') as app_hour
FROM (select To_Date('16-Aug-12','DD-Mon-RR') + numtodsinterval(level - 1, 'HOUR') as hour_start
FROM Dual
CONNECT BY level <= (trunc(sysdate) + 1 - To_Date('16-Aug-12','DD-Mon-RR')) * 24) h
LEFT JOIN accounts a on trunc(a.created_ts,'HH24') = h.hour_start
GROUP BY h.hour_start
ORDER BY h.hour_start
