CASE WHEN condition with MAX() function - sql

There are a lot of questions on the CASE WHEN topic, but the closest one to mine is How to use CASE WHEN condition with MAX() function query, which has not been resolved.
Here is some of my sample data:
date        debet
2022-07-15  57190.33
2022-07-14  815616516.00
2022-07-15  40866.67
2022-07-14  1221510.00
So, I want all records for the last two dates and three additional columns: sum(debet) for the previous day, sum(debet) for the current day, and the difference between them:
SELECT
[debet],
[date] ,
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END ) AS sum_act,
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END ) AS sum_prev ,
(
SUM( CASE WHEN [date] = MAX(date) THEN [debet] ELSE 0 END )
-
SUM( CASE WHEN [date] = MAX(date) - 1 THEN [debet] ELSE 0 END )
) AS diff
FROM
Table
WHERE
[date] = ( SELECT MAX(date) FROM Table WHERE date < ( SELECT MAX(date) FROM Table) )
OR
[date] = ( SELECT MAX(date) FROM Table WHERE date = ( SELECT MAX(date) FROM Table ) )
GROUP BY
[date],
[debet]
As expected, the server reports that I can't use an aggregate function inside CASE WHEN like this. For now I use this combination: sum(CASE WHEN [date] = dateadd(dd,-3,cast(getdate() as date)) THEN [debet] ELSE 0 END). But then I have to adjust the offset every time for weekends and holidays. The question is: is there any way other than using getdate() in the CASE WHEN expression to get the max date?
Expected result:
date        sum_act   sum_prev   diff
2022-07-15  97190.33  0.00       97190.33
2022-07-14  0.00      508769.96  -508769.96

You can use dense_rank() to rank the dates and keep only the last 2 dates in your table. After that, you can use a conditional case expression inside sum() to calculate the required values:
select [date],
sum_act = sum(case when rn = 1 then [debet] else 0 end),
sum_prev = sum(case when rn = 2 then [debet] else 0 end),
diff = sum(case when rn = 1 then [debet] else 0 end)
- sum(case when rn = 2 then [debet] else 0 end)
from
(
select *, rn = dense_rank() over (order by [date] desc)
from tbl
) t
where rn <= 2
group by [date]
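To try the dense_rank() idea without a SQL Server instance, here is a minimal sketch that runs the same query shape against SQLite through Python's sqlite3 module. The table name tbl comes from the answer; the rows are made up for illustration, and the alias syntax is changed to AS because SQLite does not accept the `name = expr` form. It assumes SQLite 3.25 or newer, since older versions have no window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, debet REAL)")
# Made-up rows: two for the latest date, one for the previous date,
# and one older row that the rn <= 2 filter should drop.
conn.executemany("INSERT INTO tbl VALUES (?, ?)", [
    ("2022-07-15", 100.0),
    ("2022-07-15", 50.0),
    ("2022-07-14", 30.0),
    ("2022-07-13", 999.0),
])

query = """
SELECT date,
       SUM(CASE WHEN rn = 1 THEN debet ELSE 0 END) AS sum_act,
       SUM(CASE WHEN rn = 2 THEN debet ELSE 0 END) AS sum_prev,
       SUM(CASE WHEN rn = 1 THEN debet ELSE 0 END)
     - SUM(CASE WHEN rn = 2 THEN debet ELSE 0 END) AS diff
FROM (
    -- rn = 1 for the latest date, 2 for the one before, and so on
    SELECT *, DENSE_RANK() OVER (ORDER BY date DESC) AS rn
    FROM tbl
) AS t
WHERE rn <= 2
GROUP BY date
ORDER BY date DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the ranking is based on the dates that actually exist in the table, no adjustment for weekends or holidays is needed.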

Two steps:
1. Get the sums for the last three dates.
2. Show the results for the last two dates.
Well, we could also get all daily sums in step 1, but we just need the last three in order to calculate the sums for the last two days, so why aggregate more data than necessary?
Here is the query. You may have to put the date column name in brackets in SQL Server, as date is a keyword in SQL.
select top(2)
date,
sum_debit_current,
sum_debit_previous,
sum_debit_current - sum_debit_previous as diff
from
(
select
date,
sum(debet) as sum_debit_current,
lag(sum(debet)) over (order by date) as sum_debit_previous
from table
where date in (select distinct top(3) date from table order by date desc)
group by date
) daily
order by date desc;
(SQL Server uses TOP(n) instead of standard SQL FETCH FIRST 3 ROWS and while SELECT DISTINCT TOP(3) date looks like "get the top 3 rows, then apply distinct on their date", it is really "apply distinct on the dates, then get the top 3" like in standard SQL.)
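The two steps can be tried out with a small sketch in SQLite via Python's sqlite3 module. This is an adaptation, not the exact SQL Server query: LIMIT stands in for TOP(n), the aggregation and the LAG() call are split into two CTEs to keep each step visible, and the table name sales and its rows are made up. It assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, debet REAL)")
# Made-up rows covering four dates; only the last three are aggregated.
conn.executemany("INSERT INTO sales VALUES (?, ?)", [
    ("2022-07-15", 100.0),
    ("2022-07-15", 50.0),
    ("2022-07-14", 30.0),
    ("2022-07-13", 20.0),
    ("2022-07-12", 999.0),
])

query = """
WITH daily AS (
    -- Step 1a: daily sums for the last three dates only
    SELECT date, SUM(debet) AS sum_debit_current
    FROM sales
    WHERE date IN (SELECT DISTINCT date FROM sales ORDER BY date DESC LIMIT 3)
    GROUP BY date
),
with_prev AS (
    -- Step 1b: attach the previous date's sum to each row
    SELECT date,
           sum_debit_current,
           LAG(sum_debit_current) OVER (ORDER BY date) AS sum_debit_previous
    FROM daily
)
-- Step 2: keep the last two dates
SELECT date,
       sum_debit_current,
       sum_debit_previous,
       sum_debit_current - sum_debit_previous AS diff
FROM with_prev
ORDER BY date DESC
LIMIT 2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The third date is only there so that the second-to-last date has a previous sum to compare against; it is cut off by the final LIMIT.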

Related

Current and previous days date diff in days with some condition

I have the first three fields of the following table. I want to compute the number of consecutive days an amount was higher than 0 ("days" field).
key  date        amount  days
1    2023-01-23  0       0
1    2023-01-22  10      2
1    2023-01-21  20      1
1    2023-01-20  0       0
1    2023-01-19  0       0
1    2023-01-18  0       0
1    2023-01-17  3       1
1    2023-01-16  0       0
I have tried some window functions based on this link, but the result does not accumulate and does not reset to 1 when the previous amount is 0.
My code:
case when f.amount > 0
then SUM ( DATE_PART('day',
date::text::timestamp - previou_bus_date::text::timestamp )
) OVER (partition by f.key
ORDER BY f.date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW )
else 0
end as days
Another option is the difference-between-two-row_numbers approach, as follows:
select key, date, amount,
sum(case when amount > 0 then 1 else 0 end) over
(partition by key, grp, case when amount > 0 then 1 else 0 end order by date) days
from
(
select *,
row_number() over (partition by key order by date) -
row_number() over (partition by key, case when amount > 0 then 1 else 0 end order by date) grp
from table_name
) T
order by date desc
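As a quick check of the row_number difference approach, the sketch below loads the sample rows from the question into SQLite (through Python's sqlite3 module) and runs the same query shape. The columns are renamed to key_ and date_ and the table is called t; those names are choices made for this sketch. It assumes SQLite 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (key_ INTEGER, date_ TEXT, amount INTEGER)")
# Sample rows from the question (without the expected "days" column).
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "2023-01-16", 0), (1, "2023-01-17", 3), (1, "2023-01-18", 0),
    (1, "2023-01-19", 0), (1, "2023-01-20", 0), (1, "2023-01-21", 20),
    (1, "2023-01-22", 10), (1, "2023-01-23", 0),
])

query = """
SELECT key_, date_, amount,
       -- running count of positive rows inside each island
       SUM(CASE WHEN amount > 0 THEN 1 ELSE 0 END) OVER (
           PARTITION BY key_, grp, CASE WHEN amount > 0 THEN 1 ELSE 0 END
           ORDER BY date_) AS days
FROM (
    SELECT *,
           -- the difference stays constant within a run of equal flags
           ROW_NUMBER() OVER (PARTITION BY key_ ORDER BY date_)
         - ROW_NUMBER() OVER (PARTITION BY key_,
                                           CASE WHEN amount > 0 THEN 1 ELSE 0 END
                              ORDER BY date_) AS grp
    FROM t
) AS x
ORDER BY date_ DESC
"""
rows = conn.execute(query).fetchall()
days = [r[3] for r in rows]
for row in rows:
    print(row)
```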
This is a gaps-and-islands kind of problem, since you need to count consecutive rows with a non-zero amount.
You can reliably solve this problem in 3 steps:
1. flag each change of partition, setting the flag to 1 when the current amount > 0 and the previous amount = 0
2. compute a running sum (with SUM) over the flags generated at step 1, which gives the partitions within which consecutive values are counted
3. compute a ranking (with ROW_NUMBER) to number the consecutive non-zero amounts in each partition generated at step 2
WITH cte AS (
SELECT *,
CASE WHEN amount > 0
AND LAG(amount) OVER(PARTITION BY key_ ORDER BY date_) = 0
THEN 1
END AS change_part
FROM tab
), cte2 AS (
SELECT *,
SUM(change_part) OVER(PARTITION BY key_ ORDER BY date_) AS parts
FROM cte
)
SELECT key_, date_, amount,
CASE WHEN amount > 0
THEN ROW_NUMBER() OVER(PARTITION BY key_, parts ORDER BY date_)
ELSE 0
END AS days
FROM cte2
ORDER BY date_ DESC
Note: This is not the most performant solution, but I'm leaving it as a reference for the next part (missing consecutive dates). @Ahmed's answer is likely to perform better in this case.
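To make the three steps easier to follow, here is a small sketch that runs the flag / running sum / ranking query in SQLite (via Python's sqlite3 module) on the sample rows from the question and also returns the intermediate change_part and parts columns. It assumes SQLite 3.25 or newer; the PostgreSQL query carries over unchanged apart from the extra output columns and the ascending sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (key_ INTEGER, date_ TEXT, amount INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", [
    (1, "2023-01-16", 0), (1, "2023-01-17", 3), (1, "2023-01-18", 0),
    (1, "2023-01-19", 0), (1, "2023-01-20", 0), (1, "2023-01-21", 20),
    (1, "2023-01-22", 10), (1, "2023-01-23", 0),
])

query = """
WITH cte AS (
    -- Step 1: flag the first positive row after a zero row
    SELECT *,
           CASE WHEN amount > 0
                 AND LAG(amount) OVER (PARTITION BY key_ ORDER BY date_) = 0
                THEN 1
           END AS change_part
    FROM tab
), cte2 AS (
    -- Step 2: running sum of the flags numbers the partitions
    SELECT *,
           SUM(change_part) OVER (PARTITION BY key_ ORDER BY date_) AS parts
    FROM cte
)
-- Step 3: rank rows inside each partition, zero rows report 0
SELECT date_, amount, change_part, parts,
       CASE WHEN amount > 0
            THEN ROW_NUMBER() OVER (PARTITION BY key_, parts ORDER BY date_)
            ELSE 0
       END AS days
FROM cte2
ORDER BY date_
"""
rows = conn.execute(query).fetchall()
parts = [r[3] for r in rows]
days = [r[4] for r in rows]
for row in rows:
    print(row)
```

Rows before the first flag get a NULL partition number (None in Python), which still works as a partition of its own.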
If your data can ever have holes in the dates (missing records, so that adjacent rows are no longer consecutive days), you should add a further condition in Step 1, where you create the flag for changing partition.
The partition should change:
either when the current amount > 0 and the previous amount = 0
or when the current date is greater than the previous date + 1 day (adjacent rows are not consecutive in time)
WITH cte AS (
SELECT *,
CASE WHEN (amount > 0
AND LAG(amount) OVER(PARTITION BY key_ ORDER BY date_) = 0)
OR date_ > LAG(date_) OVER(PARTITION BY key_ ORDER BY date_)
+ INTERVAL '1 day'
THEN 1
END AS change_part
FROM tab
), cte2 AS (
...
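The extra date condition can be sketched in SQLite as well. This is an adaptation under assumptions: SQLite has no INTERVAL syntax, so date(x, '+1 day') is used instead, dates are stored as ISO text, and the four rows (with 2023-01-19 missing) are made up. The elided part of the query above is not reproduced; the remaining steps here simply reuse the running sum and ranking from the first version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (key_ INTEGER, date_ TEXT, amount INTEGER)")
# Made-up rows: all amounts are positive, but 2023-01-19 is missing.
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", [
    (1, "2023-01-17", 3), (1, "2023-01-18", 5),
    (1, "2023-01-20", 7), (1, "2023-01-21", 2),
])

query = """
WITH cte AS (
    SELECT *,
           CASE WHEN (amount > 0
                      AND LAG(amount) OVER (PARTITION BY key_ ORDER BY date_) = 0)
                  -- new partition when the previous row is more than one day back
                  OR date_ > date(LAG(date_) OVER (PARTITION BY key_ ORDER BY date_),
                                  '+1 day')
                THEN 1
           END AS change_part
    FROM tab
), cte2 AS (
    SELECT *,
           SUM(change_part) OVER (PARTITION BY key_ ORDER BY date_) AS parts
    FROM cte
)
SELECT date_, amount,
       CASE WHEN amount > 0
            THEN ROW_NUMBER() OVER (PARTITION BY key_, parts ORDER BY date_)
            ELSE 0
       END AS days
FROM cte2
ORDER BY date_
"""
rows = conn.execute(query).fetchall()
days = [r[2] for r in rows]
for row in rows:
    print(row)
```

Without the date condition the four rows would count as one streak of 4 days; with it, the count restarts after the missing day.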

SQL for begin and end of data rows

I've got the following table:
and I was wondering if there is an SQL query which would give me the begin and end calendar week (CW) where the value is greater than 0.
So in the case of the table above, a result like below:
Thanks in advance!
You can assign a group by counting the number of zeros and then aggregating:
select article_nr, min(year), max(year)
from (select t.*,
sum(case when amount = 0 then 1 else 0 end) over (partition by article_nr order by year) as grp
from t
) t
where amount > 0
group by article_nr, grp;
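Since the original table is not shown, here is a sketch of the zero-counting idea on made-up data, run in SQLite through Python's sqlite3 module. The column year is assumed to hold year and calendar week as one integer (for example 202002), which is only a guess at the layout; SQLite 3.25 or newer is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (article_nr INTEGER, year INTEGER, amount INTEGER)")
# Hypothetical rows: two runs of positive amounts separated by a zero week.
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, 202001, 0), (1, 202002, 5), (1, 202003, 7),
    (1, 202004, 0), (1, 202005, 3),
])

query = """
SELECT article_nr, MIN(year) AS begin_cw, MAX(year) AS end_cw
FROM (
    -- every zero row bumps the counter, so each run of positive
    -- rows shares one grp value
    SELECT t.*,
           SUM(CASE WHEN amount = 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY article_nr ORDER BY year) AS grp
    FROM t
) AS g
WHERE amount > 0
GROUP BY article_nr, grp
ORDER BY article_nr, begin_cw
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output row is one uninterrupted period with a positive amount, which a plain MIN/MAX over all positive rows would merge into a single range.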
select Article_Nr, min(Year&CW) as 'Begin(Year&CW)',max(Year&CW) as 'End(Year&CW)'
from table where Amount>0 group by Article_Nr;

select first non-null row with minimum date (Big Query)

I want to select the first non-null row with the minimum date. I'd like to use a CASE WHEN that returns 1 when that condition is met, ELSE 0.
So something like CASE WHEN row IS NOT NULL and DATE is the minimum DATE THEN 1 ELSE 0. I just need to select ONLY one row.
Another option (for BigQuery Standard SQL)
#standardSQL
SELECT *, 0 AS marker FROM `project.dataset.table` WHERE item_count IS NULL
UNION ALL
SELECT *, IF(1 = ROW_NUMBER() OVER(PARTITION BY user ORDER BY date), 1, 0)
FROM `project.dataset.table` WHERE NOT item_count IS NULL
ORDER BY user, date
Consider:
select
t.*,
case when date = min(case when itemcount is not null then date end) over(partition by user order by date)
then 1
else 0
end as marker
from mytable t
I am unsure whether BigQuery supports minif() as a window function:
select
t.*,
case when date = minif(date, itemcount is not null) over(partition by user order by date)
then 1
else 0
end as marker
from mytable
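The windowed MIN(CASE ...) variant does not depend on anything BigQuery-specific, so it can be checked with a small sketch in SQLite via Python's sqlite3 module. The column names user_id and item_count and all rows are made up for this sketch; SQLite 3.25 or newer is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user_id TEXT, date TEXT, item_count INTEGER)")
# Hypothetical rows: user a starts with a NULL row, user b does not.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    ("a", "2020-01-01", None),
    ("a", "2020-01-02", 5),
    ("a", "2020-01-03", 7),
    ("b", "2020-01-01", 2),
    ("b", "2020-01-02", None),
])

query = """
SELECT t.*,
       -- the inner CASE hides dates of NULL rows from MIN(), so the
       -- window minimum is the earliest date with a non-null item_count
       CASE WHEN date = MIN(CASE WHEN item_count IS NOT NULL THEN date END)
                        OVER (PARTITION BY user_id ORDER BY date)
            THEN 1
            ELSE 0
       END AS marker
FROM mytable AS t
ORDER BY user_id, date
"""
rows = conn.execute(query).fetchall()
markers = [r[3] for r in rows]
for row in rows:
    print(row)
```

Exactly one row per user gets marker 1, and rows before the first non-null value fall through to 0 because the comparison with NULL is not true.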

MSSQL Group by and Select rows from grouping

I'm trying to figure out if what I'm trying to do is possible. Instead of resorting to multiple queries on a table, I wanted to group the records by business date and id then group by the id and select one date for a field and another date for the other field.
SELECT
*
{AMOUNT FROM DATE}
{AMOUNT FROM OTHER DATE}
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
) AS subquery
GROUP BY id
It seems that you're looking to do a pivot query. I usually use cross tabs for this. Based on the query you posted, it could look like:
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM (
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
) AS subquery
GROUP BY id;
You could also use a CTE.
WITH CTE AS(
SELECT
date,
id,
SUM(amount) AS amount
FROM
table
GROUP BY id, date
)
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM CTE
GROUP BY id;
Or even be a rebel and do the operation directly.
SELECT
id,
SUM(CASE WHEN date = '20190901' THEN amount ELSE 0 END) AmountFromSept01,
SUM(CASE WHEN date = '20191001' THEN amount ELSE 0 END) AmountFromOct01
FROM table
GROUP BY id;
However, some people have tested for performance and found that pre-aggregating can improve performance.
If I understand you correctly, then you're just trying to pivot, but only with two particular dates:
select id,
date1 = sum(iif(date = '2000-01-01', amount, null)),
date2 = sum(iif(date = '2000-01-02', amount, null))
from [table]
group by id
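One detail that differs between the answers is ELSE 0 versus null inside the conditional sum. The sketch below, run in SQLite through Python's sqlite3 module on made-up rows (the table name txn is hypothetical), shows both side by side: an id with no rows for the date gets 0 with ELSE 0 and NULL without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (date TEXT, id INTEGER, amount INTEGER)")
# Hypothetical rows: id 2 has nothing on 2019-09-01.
conn.executemany("INSERT INTO txn VALUES (?, ?, ?)", [
    ("2019-09-01", 1, 10),
    ("2019-09-01", 1, 5),
    ("2019-10-01", 1, 7),
    ("2019-10-01", 2, 3),
    ("2019-11-01", 2, 100),
])

query = """
SELECT id,
       SUM(CASE WHEN date = '2019-09-01' THEN amount ELSE 0 END) AS sept_zero,
       SUM(CASE WHEN date = '2019-09-01' THEN amount END)        AS sept_null,
       SUM(CASE WHEN date = '2019-10-01' THEN amount ELSE 0 END) AS oct_zero
FROM txn
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Which of the two is preferable depends on whether "no rows for that date" should be reported as zero or as missing.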

SQL find consecutive days of specific threshold reached

I have two columns; the_day and amount_raised. I want to find the count of consecutive days that at least 1 million dollars was raised. Am I able to do this in SQL? Ideally, I'd like to create a column that counts the consecutive days and then starts over if the 1 million dollar threshold is not reached.
What I've done thus far is create a third column that puts a 1 in the row if 1 million was reached. Could I create a subquery and count the consecutive 1's listed, then reset when it hits 0?
and here is the desired output
select dt,amt,
case when amt>=1000000 then -1+row_number() over(partition by col order by dt)
else 0 end col1
from (select *, sum(case when amt >= 1000000 then 0 else 1 end) over(order by dt) col
from t) x
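Here is a sketch of the first query above on made-up data, run in SQLite via Python's sqlite3 module (SQLite 3.25 or newer assumed; the table t and columns dt and amt are taken from that query). One thing to be aware of: the `-1 +` relies on every group starting with a below-threshold row, so if the very first days in the table already reach the threshold, that leading streak is counted one too low.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dt TEXT, amt INTEGER)")
# Hypothetical rows; the series starts with a day below the threshold.
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("2020-01-01", 500000),
    ("2020-01-02", 1200000),
    ("2020-01-03", 2000000),
    ("2020-01-04", 900000),
    ("2020-01-05", 1500000),
])

query = """
SELECT dt, amt,
       CASE WHEN amt >= 1000000
            -- the group's first row is the below-threshold day, hence the -1
            THEN -1 + ROW_NUMBER() OVER (PARTITION BY col ORDER BY dt)
            ELSE 0
       END AS col1
FROM (
    -- col increases on every below-threshold day, which starts a new group
    SELECT *,
           SUM(CASE WHEN amt >= 1000000 THEN 0 ELSE 1 END) OVER (ORDER BY dt) AS col
    FROM t
) AS x
ORDER BY dt
"""
rows = conn.execute(query).fetchall()
streaks = [r[2] for r in rows]
for row in rows:
    print(row)
```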
SELECT the_day,
amount_raised,
million_threshold,
CASE WHEN million_threshold <> lag_million_threshold AND million_threshold = lead_million_threshold
THEN 1
WHEN million_threshold = lag_million_threshold
THEN SUM(million_threshold) OVER ( ORDER BY the_day ROWS UNBOUNDED PRECEDING )
ELSE 0
END AS consecutive_day_cnt
FROM
(
SELECT the_day,
amount_raised,
million_threshold,
LAG(million_threshold,1) OVER ( ORDER BY the_day ) AS lag_million_threshold,
LEAD(million_threshold,1) OVER ( ORDER BY the_day ) AS lead_million_threshold
FROM
(
SELECT the_day,
amount_raised,
CASE WHEN amount_raised >= 1000000
THEN 1
ELSE 0
END AS million_threshold
FROM Yourtable
)
);