How do you get the sum of values by day, with the days as columns, in SQL

I have a table that looks like the one below, where day, client_name and order_value are stored.
select day, client_name, order_value
from sample_table
day         client_name  order_value
----------  -----------  -----------
2021-01-01  A            100
2021-01-01  A            100
2021-01-02  A            200
2021-01-03  A            100
2021-01-01  B            300
2021-01-01  B            400
2021-01-01  C            500
2021-01-02  C            500
2021-01-02  C            500
I want to get the sum of order_value per client by day, but with the days as columns.
Basically, I want my result to come out something like this:
client_name  2021-01-01  2021-01-02  2021-01-03
-----------  ----------  ----------  ----------
A            200         200         100
B            700         Null        Null
C            500         1000        Null

If you know what the days are, you can use conditional aggregation:
select client_name,
       sum(case when day = '2021-01-01' then order_value end) as date_20210101,
       sum(case when day = '2021-01-02' then order_value end) as date_20210102,
       sum(case when day = '2021-01-03' then order_value end) as date_20210103
from sample_table
group by client_name;
If you don't know the specific dates (i.e., you want them based on the data or a variable number), then you need to use dynamic SQL. That means that you construct the SQL statement as a string and then execute it.
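As a rough illustration of the dynamic SQL idea, here is a minimal sketch that uses Python's built-in sqlite3 module. SQLite and the Python driver are assumptions made only so the example runs on its own; the question does not say which database is used. The sketch reads the distinct days, builds one conditional-aggregation column per day, and executes the generated statement.

```python
import sqlite3

# In-memory SQLite database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (day TEXT, client_name TEXT, order_value INTEGER)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?)",
    [
        ("2021-01-01", "A", 100), ("2021-01-01", "A", 100),
        ("2021-01-02", "A", 200), ("2021-01-03", "A", 100),
        ("2021-01-01", "B", 300), ("2021-01-01", "B", 400),
        ("2021-01-01", "C", 500), ("2021-01-02", "C", 500),
        ("2021-01-02", "C", 500),
    ],
)

# Step 1: read the distinct days that will become columns.
days = [r[0] for r in conn.execute(
    "SELECT DISTINCT day FROM sample_table ORDER BY day"
)]

# Step 2: build one conditional-aggregation column per day.
# The day value itself is bound as a parameter; only the alias is spliced
# into the statement text, and it is built from digits only.
columns = []
for d in days:
    alias = "date_" + "".join(ch for ch in d if ch.isdigit())
    columns.append(f"SUM(CASE WHEN day = ? THEN order_value END) AS {alias}")

sql = (
    "SELECT client_name, " + ", ".join(columns)
    + " FROM sample_table GROUP BY client_name ORDER BY client_name"
)

# Step 3: execute the generated statement.
cursor = conn.execute(sql, days)
header = [c[0] for c in cursor.description]
rows = cursor.fetchall()

print(header)
for row in rows:
    print(row)
```

Days with no orders for a client come back as NULL (None in Python), which matches the desired output in the question. Other databases have their own ways to run a generated statement, so treat the execution step as specific to this sketch.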

Related

Is there any way to check some interval in SQL

For example I have a table like this:
CREATE TABLE sales (
id int NOT NULL PRIMARY KEY,
sku text NOT NULL,
date date NOT NULL,
amount real NOT NULL,
CONSTRAINT date_sku UNIQUE (sku,date)
)
Is there any way to check, for each sku, whether the average sales over every 2 consecutive days is bigger than a threshold, for example 14 units sold? I want to find the date ranges, the average amount sold in those days, and its ratio to the threshold.
dbfiddle
For example, sku B in my example sold 15 on 2022-01-01 and 20 on 2022-01-02. The average for these 2 days is 17.5, which is bigger than 14, so it will appear in my result, and the change is 17.5 / 14 = 1.25.
For the next 2 days we have 20 on 2022-01-02 and 13 on 2022-01-03. The average is 16.5, which is bigger than 14, so it will also appear in the result.
But for 13 on 2022-01-03 and 12 on 2022-01-04 the average is 12.5. Because 12.5 is not bigger than 14, it will not appear in the result.
my desired output with 14 amount example is:
sku  start_date  end_date    amount_sold  change_rate
---  ----------  ----------  -----------  -----------
B    2022-01-01  2022-01-02  17.5         1.25
B    2022-01-02  2022-01-03  16.5         1.17
D    2022-01-01  2022-01-02  28           2
I tried using CASE WHEN, but I know that it won't work for large data such as a whole year:
SELECT *
FROM (
SELECT sku,
AVG(CASE WHEN date BETWEEN '2022-01-01' AND '2022-01-02' THEN amount END) AS first_in,
AVG(CASE WHEN date BETWEEN '2022-01-02' AND '2022-01-03' THEN amount END) AS second_in,
AVG(CASE WHEN date BETWEEN '2022-01-03' AND '2022-01-04' THEN amount END) AS third_in
FROM sales
GROUP BY sku
) AS t
WHERE first_in > 14
OR second_in > 14
OR third_in > 14
As a general rule, use LEAD (or LAG) to retrieve data from the next or previous record. That is what I did before you asked about windows of possibly several days. Other window functions are better suited if you want to cover more than 1 day:
SELECT *, averageamount/14
FROM (
SELECT sku, date,
MAX(date) OVER w AS nextdate,
AVG(amount) OVER w AS averageAmount
FROM sales
WINDOW w AS (PARTITION BY sku ORDER BY date RANGE BETWEEN '0 day' PRECEDING AND '2 days' FOLLOWING )
) s
WHERE averageAmount > 14
The query above selects all the ranges that are up to 3 days long (days D, D+1 and D+2). You may want to remove the ranges that are less than 3 days long by appending an additional condition:
AND nextdate >= date + interval '2 days'
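For the 2-day case described in the question, the LEAD approach mentioned above can be sketched as follows. This is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than on the asker's database, it contains only the sku B figures quoted in the question, and the threshold constant is a hypothetical name.

```python
import sqlite3

THRESHOLD = 14  # hypothetical name for the "14 amount" used in the question

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sku TEXT NOT NULL, date TEXT NOT NULL, "
    "amount REAL NOT NULL, UNIQUE (sku, date))"
)
# Only the sku B figures quoted in the question.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("B", "2022-01-01", 15), ("B", "2022-01-02", 20),
        ("B", "2022-01-03", 13), ("B", "2022-01-04", 12),
    ],
)

sql = """
SELECT sku,
       date AS start_date,
       next_date AS end_date,
       (amount + next_amount) / 2.0 AS amount_sold,
       (amount + next_amount) / 2.0 / :threshold AS change_rate
FROM (
    -- LEAD pulls the following row of the same sku onto the current row.
    SELECT sku, date, amount,
           LEAD(date)   OVER (PARTITION BY sku ORDER BY date) AS next_date,
           LEAD(amount) OVER (PARTITION BY sku ORDER BY date) AS next_amount
    FROM sales
) AS s
WHERE next_date = date(date, '+1 day')              -- keep only consecutive days
  AND (amount + next_amount) / 2.0 > :threshold
ORDER BY sku, start_date
"""
rows = conn.execute(sql, {"threshold": THRESHOLD}).fetchall()
for row in rows:
    print(row)
```

The check on next_date drops pairs that are not on consecutive days, which matters when a sku has gaps in its sales dates. The change rate is left unrounded here.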

Calculating the cumulative sum with some conditions (gaps-and-islands problem)

Sorry if the title is a bit vague; please suggest a better one if you think it can articulate the problem. I'll start with the data I have and the end result I'm trying to get, and then the TL;DR:
This is the table I have:
Each row is a transaction. Outgoing amounts are negative, incomings are positive. The transactions can either be someone spending money ('spend' event) or it can be a loan disbursement into their account (amount > 0 and event = 'loan') or it can be them paying back their loan (amount < 0 and event = 'loan').
row number  id  created     amount  event
----------  --  ----------  ------  -----
1           1   2022-01-01  -200    spend
2           1   2022-01-02  1000    loan
3           1   2022-01-03  -200    spend
4           1   2022-01-04  -500    spend
5           1   2022-01-05  -500    loan
6           1   2022-01-06  100     spend
7           1   2022-01-07  -500    spend
8           1   2022-01-08  1000    loan
9           1   2022-01-09  -100    spend
I'm trying to make:
row number  id  created     amount  event  cumulative_sum
----------  --  ----------  ------  -----  --------------
1           1   2022-01-01  -200    spend  -200
2           1   2022-01-02  1000    loan   1000
3           1   2022-01-03  -200    spend  800
4           1   2022-01-04  -500    spend  300
5           1   2022-01-05  -500    loan   300
6           1   2022-01-06  100     spend  300
7           1   2022-01-07  -500    spend  -200
8           1   2022-01-08  1000    loan   1000
9           1   2022-01-09  -100    spend  900
Required logic:
I want a special cumulative sum that adds the amount only when (the amount is < 0 AND the event is spend) OR (the amount is > 0 AND the event is loan).
The thing is, I want the cumulative sum to start at the first positive loan amount. I don't care about anything before the positive loan amount, and if those rows are counted they will obscure the results. The requirement is to select the rows which the loan enabled (if the loan is 1000, then we want to select the rows that add up to -1000, but only when the event is spend and the amount is < 0).
My attempt:
WITH tmp AS (
SELECT
1 AS id,
'2022-01-01' AS created,
-200 AS amount,
'spend' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-02' AS created,
1000 AS amount,
'loan' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-03' AS created,
-200 AS amount,
'spend' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-04' AS created,
-500 AS amount,
'spend' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-05' AS created,
-500 AS amount,
'loan' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-06' AS created,
100 AS amount,
'spend' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-07' AS created,
-500 AS amount,
'spend' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-08' AS created,
1000 AS amount,
'loan' AS scheme
UNION ALL
SELECT
1 AS id,
'2022-01-09' AS created,
-100 AS amount,
'spend' AS scheme
)
SELECT
*,
SUM(CASE WHEN (scheme != 'loan' AND amount<0) OR (scheme = 'loan' AND amount > 0) THEN amount ELSE 0 END)
OVER (PARTITION BY id ORDER BY created ASC) AS cumulative_sum_spend
FROM tmp
Question
How do I make the cumulative sum reset at row 2 (not conditional to the row number - the requirement is the positive loan amount)?
That's a gaps-and-islands problem, if I am understanding this correctly.
Islands start with a positive loan; within each island, you want to compute a running sum over a subset of rows.
We can identify the islands in a subquery with a window count of positive loans, then do the maths in each group with a conditional expression:
select id, created, amount, event,
sum(case when (event = 'loan' and amount > 0) or (event = 'spend' and amount < 0) then amount end)
over(partition by id, grp order by created) as cumulative_sum
from (
select t.*,
sum(case when event = 'loan' and amount > 0 then 1 else 0 end)
over(partition by id order by created) grp
from tmp t
) t
order by id, created
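To see how the island number drives the reset, here is a small runnable sketch of the same query against the sample rows from the question. It assumes SQLite (through Python's sqlite3 module) purely for convenience, and it names the last column event, as in the question's table, rather than scheme as in the asker's CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp (id INTEGER, created TEXT, amount INTEGER, event TEXT)"
)
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?, ?)",
    [
        (1, "2022-01-01", -200, "spend"),
        (1, "2022-01-02", 1000, "loan"),
        (1, "2022-01-03", -200, "spend"),
        (1, "2022-01-04", -500, "spend"),
        (1, "2022-01-05", -500, "loan"),
        (1, "2022-01-06", 100, "spend"),
        (1, "2022-01-07", -500, "spend"),
        (1, "2022-01-08", 1000, "loan"),
        (1, "2022-01-09", -100, "spend"),
    ],
)

sql = """
SELECT created, amount, event, grp,
       -- Running sum restarts in every island because grp is part of the partition.
       SUM(CASE WHEN (event = 'loan' AND amount > 0)
                  OR (event = 'spend' AND amount < 0)
                THEN amount END)
           OVER (PARTITION BY id, grp ORDER BY created) AS cumulative_sum
FROM (
    -- grp increases by one at every positive loan, which marks a new island.
    SELECT t.*,
           SUM(CASE WHEN event = 'loan' AND amount > 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY id ORDER BY created) AS grp
    FROM tmp AS t
) AS t
ORDER BY id, created
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Rows before the first positive loan fall into island 0; they can be filtered out with a condition on grp if they are not wanted.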
One option would be something like this:
SELECT
*,
SUM(CASE WHEN cnt >= 1 AND ((scheme != 'loan' AND amount<0) OR (scheme = 'loan' AND amount > 0)) THEN amount ELSE 0 END)
OVER (PARTITION BY id ORDER BY created ASC) AS cumulative_sum_spend
FROM (
SELECT *, SUM(CASE WHEN amount > 0 THEN 1 ELSE 0 END) OVER (PARTITION BY id ORDER BY created) cnt
FROM tmp
) a
The idea here is that the inner query's window function counts the number of positive values seen so far. Then the outer query can do an extra check cnt >= 1 as part of its window function, so it will only consider values from the first positive one onwards. Note that this only skips the rows before the first positive amount; it does not restart the sum at later loans.

How to divide two rows

I have a table
date        measure  value
----------  -------  -----
2022-12-09  A        10
2022-12-09  B        2
2022-12-03  A        300
2022-12-03  B        30
I need to add new rows with C = A / B:
date        measure  value
----------  -------  -----
2022-12-09  A        10
2022-12-09  B        2
2022-12-09  C        5
2022-12-03  A        300
2022-12-03  B        30
2022-12-03  C        10
How can it be done?
Using conditional aggregation along with a union we can try:
SELECT date, measure, value FROM yourTable
UNION ALL
SELECT
date,
'C',
MAX(CASE WHEN measure = 'A' THEN value END) /
MAX(CASE WHEN measure = 'B' THEN value END)
FROM yourTable
GROUP BY date
ORDER BY date, measure;
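The query above can be tried out with the following minimal sketch. It assumes SQLite via Python's sqlite3 module, stores value as a REAL so the division is not an integer division, and adds a NULLIF guard against a zero divisor; that guard is an addition for illustration and is not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (date TEXT, measure TEXT, value REAL)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("2022-12-09", "A", 10), ("2022-12-09", "B", 2),
        ("2022-12-03", "A", 300), ("2022-12-03", "B", 30),
    ],
)

sql = """
SELECT date, measure, value FROM yourTable
UNION ALL
SELECT date,
       'C',
       -- Pivot A and B onto one row per date, then divide.
       MAX(CASE WHEN measure = 'A' THEN value END) /
       NULLIF(MAX(CASE WHEN measure = 'B' THEN value END), 0)  -- added guard
FROM yourTable
GROUP BY date
ORDER BY date, measure
"""
rows = conn.execute(sql).fetchall()
c_rows = [r for r in rows if r[1] == "C"]
for row in rows:
    print(row)
```

If value is an integer column, some databases truncate the quotient, so casting one operand to a decimal or float type may be needed.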

PARTITION BY with date between 2 date

I work on Azure SQL Database (SQL Server).
In SQL, I am trying to build a table by day, but the day itself is not a column in the table.
The example below explains it:
TABLE STARTER (date format: YYYY-MM-DD):
Date begin  Date End    Category  Value
----------  ----------  --------  -----
2021-01-01  2021-01-03  1         0.2
2021-01-02  2021-01-03  1         0.1
2021-01-01  2021-01-02  2         0.3
For the result, I try to have this TABLE RESULT:
Date        Category  Value
----------  --------  -------------
2021-01-01  1         0.2
2021-01-01  2         0.3
2021-01-02  1         0.3 (0.2+0.1)
2021-01-02  2         0.3
2021-01-03  1         0.3 (0.2+0.1)
For each day, I want to sum the value if the day is between the beginning and the end of the date. I need to do that for each category.
In terms of SQL code, I tried something like this:
SELECT SUM(CAST([Value] AS float)) OVER (PARTITION BY [Date begin], Category) AS value,
       [Date begin],
       Category,
       [Value]
FROM [TABLE STARTER]
This code only sums the values that share the same Date begin; it doesn't consider all the dates between Date begin and Date End.
So my code doesn't calculate the sum for 2021-01-02 in Category 1, because that date is never written explicitly (it lies between 2021-01-01 and 2021-01-03).
Is it possible to do that in SQL?
Thanks so much for your help!
You can use a recursive CTE to expand the date ranges into a list of separate days. Then it's a matter of joining and aggregating.
For example:
with
r as (
select category,
min(date_begin) as date_begin, max(date_end) as date_end
from starter
group by category
),
d as (
select category, date_begin as d from r
union all
select d.category, dateadd(day, 1, d.d)
from d
join r on r.category = d.category
where d.d < r.date_end
)
select d.d, d.category, sum(s.value) as value
from d
join starter s on s.category = d.category
and d.d between s.date_begin and s.date_end
group by d.category, d.d;
Result:
d category value
----------- --------- -----
2021-01-01 1 0.20
2021-01-01 2 0.30
2021-01-02 1 0.30
2021-01-02 2 0.30
2021-01-03 1 0.30
See running example at db<>fiddle.
Note: SQL Server 2022 introduces a GENERATE_SERIES() function that can make this query much shorter.
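For readers without a SQL Server instance at hand, the same recursive-CTE idea can be sketched on SQLite through Python's sqlite3 module. This is an adaptation, not the answer's exact query: SQLite needs WITH RECURSIVE and uses date(x, '+1 day') in place of dateadd, the second CTE is renamed to days for readability, and the sum is rounded to hide floating-point noise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE starter (date_begin TEXT, date_end TEXT, "
    "category INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO starter VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", "2021-01-03", 1, 0.2),
        ("2021-01-02", "2021-01-03", 1, 0.1),
        ("2021-01-01", "2021-01-02", 2, 0.3),
    ],
)

sql = """
WITH RECURSIVE
r AS (
    -- Overall date span per category.
    SELECT category,
           MIN(date_begin) AS date_begin,
           MAX(date_end)   AS date_end
    FROM starter
    GROUP BY category
),
days AS (
    -- Anchor: first day of each category; recursion: add one day at a time.
    SELECT category, date_begin AS day FROM r
    UNION ALL
    SELECT days.category, date(days.day, '+1 day')
    FROM days
    JOIN r ON r.category = days.category
    WHERE days.day < r.date_end
)
SELECT days.day, days.category, ROUND(SUM(s.value), 2) AS value
FROM days
JOIN starter AS s
  ON s.category = days.category
 AND days.day BETWEEN s.date_begin AND s.date_end
GROUP BY days.category, days.day
ORDER BY days.day, days.category
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

On SQL Server, long date spans may run into the default recursion limit of 100 levels, which can be raised with the MAXRECURSION query hint.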

SQL How to Query Total & Subtotal

I have a table that looks like the one below, where day, order_id, and order_type are stored.
select day, order_id, order_type
from sample_table
day         order_id  order_type
----------  --------  ----------
2021-03-01  1         offline
2021-03-01  2         offline
2021-03-01  3         online
2021-03-01  4         online
2021-03-01  5         offline
2021-03-01  6         offline
2021-03-02  7         online
2021-03-02  8         online
2021-03-02  9         offline
2021-03-02  10        offline
2021-03-03  11        offline
2021-03-03  12        offline
Below is desired output:
day         total_order  num_offline_order  num_online_order
----------  -----------  -----------------  ----------------
2021-03-01  6            4                  2
2021-03-02  4            2                  2
2021-03-03  2            2                  0
Does anybody know how to query to get the desired output?
You need to pivot the data. A simple way to implement conditional aggregation in Vertica uses the :: cast operator:
select day, count(*) as total_order,
sum( (order_type = 'online')::int ) as num_online,
sum( (order_type = 'offline')::int ) as num_offline
from t
group by day;
Use case and sum:
select day,
       count(1) as total_order,
       sum(case when order_type='offline' then 1 else 0 end) as num_offline_order,
       sum(case when order_type='online' then 1 else 0 end) as num_online_order
from sample_table
group by day
order by day
You can also use count, which only counts values that are not null:
select
day,
count(*) as total_order,
count(case when order_type='offline' then 1 else null end) as offline_orders,
count(case when order_type='online' then 1 else null end) as online_orders
from sample_table
group by day
order by day;
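The difference between the count and sum variants shows up on days where one order type is missing. The sketch below runs the count version on the sample data and adds one extra column, sum_online_no_else (a hypothetical name), to show that a sum over a case without an else branch yields NULL rather than 0. SQLite through Python's sqlite3 module is assumed here only to make the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_table (day TEXT, order_id INTEGER, order_type TEXT)"
)
conn.executemany(
    "INSERT INTO sample_table VALUES (?, ?, ?)",
    [
        ("2021-03-01", 1, "offline"), ("2021-03-01", 2, "offline"),
        ("2021-03-01", 3, "online"), ("2021-03-01", 4, "online"),
        ("2021-03-01", 5, "offline"), ("2021-03-01", 6, "offline"),
        ("2021-03-02", 7, "online"), ("2021-03-02", 8, "online"),
        ("2021-03-02", 9, "offline"), ("2021-03-02", 10, "offline"),
        ("2021-03-03", 11, "offline"), ("2021-03-03", 12, "offline"),
    ],
)

sql = """
SELECT day,
       COUNT(*) AS total_order,
       -- COUNT skips NULLs, so an empty group gives 0.
       COUNT(CASE WHEN order_type = 'offline' THEN 1 END) AS num_offline_order,
       COUNT(CASE WHEN order_type = 'online'  THEN 1 END) AS num_online_order,
       -- SUM over only NULLs gives NULL, not 0, unless ELSE 0 is added.
       SUM(CASE WHEN order_type = 'online' THEN 1 END) AS sum_online_no_else
FROM sample_table
GROUP BY day
ORDER BY day
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

For 2021-03-03 the count column reports 0 online orders while the sum column without an else branch comes back as NULL, which is why the sum form usually carries an explicit else 0.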