If I have daily sales, how do I show weekly and monthly sales along with daily sales in a single record in Oracle? I can calculate the weekly sum and monthly sum in separate tables, but I want the results in a single data set.
Output should look like shown below.
Date     Week  Month  Daily_Sale  Weekly_Sale  Monthly_Sale
1/1/20   1     1      $5          $5           $5
1/2/20   1     1      $5          $10          $10
1/3/20   1     1      $1          $11          $11
1/4/20   1     1      $2          $13          $13
1/5/20   1     1      $5          $18          $18
1/6/20   1     1      $1          $19          $19
1/7/20   1     1      $1          $20          $20
1/8/20   2     1      $5          $5           $25
1/9/20   2     1      $5          $10          $30
1/10/20  2     1      $1          $11          $31
1/11/20  2     1      $2          $13          $33
1/12/20  2     1      $5          $18          $38
1/13/20  2     1      $1          $19          $39
1/14/20  2     1      $1          $20          $40
Thank you!
Edit: Highlighting the table
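You seem to want running totals. Assuming that your table contains sales information for more than a single year, you need to partition on year as well. Note that Oracle's EXTRACT does not accept a WEEK field; in Oracle, to_char(Date, 'WW') gives the week of the year instead.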
select Date, extract(week from Date) Wk, extract(year from Date) Yr,
Daily_Sale,
sum(Daily_Sale) over (
partition by extract(year from Date), extract(week from Date)
order by Date
) as Weekly_Sale,
sum(Daily_Sale) over (
partition by extract(year from Date), extract(month from Date)
order by Date
) as Monthly_Sale
from T
order by Date;
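The running-total idea above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module rather than Oracle, so the table and column names (daily_sales, sale_date, daily_sale) and the week formula are assumptions made for illustration; the week number is computed so that week 1 covers days 1-7 of the year, which is how Oracle's 'WW' format counts weeks.

```python
import sqlite3

# Hypothetical table and column names; the data mirrors the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, daily_sale INTEGER)")
amounts = [5, 5, 1, 2, 5, 1, 1] * 2  # 14 consecutive days
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(f"2020-01-{day:02d}", amt) for day, amt in enumerate(amounts, start=1)],
)

query = """
WITH keyed AS (
    SELECT sale_date,
           daily_sale,
           CAST(strftime('%Y', sale_date) AS INTEGER) AS yr,
           CAST(strftime('%m', sale_date) AS INTEGER) AS mon,
           -- week 1 = days 1-7 of the year (same convention as Oracle 'WW')
           (CAST(strftime('%j', sale_date) AS INTEGER) - 1) / 7 + 1 AS wk
    FROM daily_sales
)
SELECT sale_date, wk, mon, daily_sale,
       -- running total that restarts with every (year, week)
       SUM(daily_sale) OVER (PARTITION BY yr, wk ORDER BY sale_date) AS weekly_sale,
       -- running total that restarts with every (year, month)
       SUM(daily_sale) OVER (PARTITION BY yr, mon ORDER BY sale_date) AS monthly_sale
FROM keyed
ORDER BY sale_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Partitioning by year as well as week (or month) keeps week 1 of one year from being added to week 1 of the next.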
I don't have an Oracle instance, so I replicated the scenario in SSMS (SQL Server). Here is the query:
SELECT T1.D1, T1.AnnualSales, T1.MonthlySales, T1.WeeklySales, T1.DailySales
From
(SELECT [Date] As D1,
Sum([Sales]) Over (Partition by Year(Date)) as AnnualSales,
Sum([Sales]) Over (Partition by Month(Date)) as MonthlySales,
Sum([Sales]) Over (Partition by Datepart(wk,Date)) as WeeklySales,
Sum([Sales]) Over (Partition by Day(Date)) as DailySales
FROM [dbo].[DailySales_Test]) AS T1
Group by T1.D1, T1.AnnualSales, T1.MonthlySales, T1.WeeklySales, T1.DailySales
Related
I need to make a tracker containing item name and price date change.
The date shows when the price changed. I used lead, but it returns a row for each day rather than only the days on which the price changed.
for instance,
item name = A
price date = 2022-11-21, price = $4
price date = 2022-11-25, price = $3
price date = 2022-11-30, price = $4
The expectation for the result is:
start date, next date, price
2022-11-21 2022-11-24 $4
2022-11-25 2022-11-29 $3
2022-11-30 2023-02-14 (current date) $4
Any help would be appreciated.
#Updated:
The dataset is containing daily price
for instance :
item name = A
price date = 2022-11-21, price = $4
price date = 2022-11-22, price = $4
price date = 2022-11-23, price = $4
price date = 2022-11-24, price = $4
price date = 2022-11-25, price = $3
price date = 2022-11-26, price = $3
price date = 2022-11-27, price = $3
price date = 2022-11-29, price = $3
price date = 2022-12-01, price = $4
Query :
select
item_name,
supplier_name,
price_date,
price,
lead(price_date) over (partition by item_name order by price) as next_price from
price
Result:
price = $4, 2022-11-21, 2022-11-22
price = $4, 2022-11-22, 2022-11-23
price = $4, 2022-11-23, 2022-11-24
price = $3, 2022-11-25, 2022-11-26
price = $3, 2022-11-26, 2022-11-27
price = $3, 2022-11-27, 2022-11-29
price = $3, 2022-11-29, 2022-11-30
price = $4, 2022-12-01, 2023-02-14
While my expectation is :
price = $4, 2022-11-21, 2022-11-25
price = $3, 2022-11-26, 2022-11-30
price = $4, 2022-12-01, 2023-02-14
You can try the gaps-and-islands approach: introduce a column that flags a change in price, use a cumulative sum over that flag to number the groups, and then group by that number to calculate the results:
-- sample data
with dataset(date, price) as (
values (date '2022-11-21', 4),
(date '2022-11-22', 4),
(date '2022-11-23', 4),
(date '2022-11-24', 4),
(date '2022-11-25', 3),
(date '2022-11-26', 3),
(date '2022-11-27', 3),
(date '2022-11-29', 3),
(date '2022-12-01', 4)
)
-- query
select arbitrary(price) price,
min(date) as start_dt,
max(date) as end_dt
from (
select date, price, sum (change) over (order by date) as grp
from (
select *, if(lag(price) over (order by date) != price, 1, 0) as change
from dataset)
)
group by grp;
Output:
price  start_dt    end_dt
4      2022-11-21  2022-11-24
3      2022-11-25  2022-11-29
4      2022-12-01  2022-12-01
A few notes:
since your actual data is partitioned, do not forget to add item_name to the window partitioning and to the final group by
if you really need the current date as the end_dt for the final row in each partition, there are several ways to achieve that: the not very fun one of inserting the missing dates (see this for inspiration), or you can just wrap everything in another subquery which checks whether lead(end_dt) over (partition ... order by end_dt) is null and, if so, uses the current date for end_dt
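Both notes can be combined into one query. The following is a minimal sketch, run through Python's sqlite3 module instead of the engine used in the answer; the table name prices, the second item B, and the :today parameter are assumptions added for illustration. It partitions every window by item_name and replaces the end date of each item's last island with the supplied current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; item A reuses the sample data, item B is made up.
conn.execute("CREATE TABLE prices (item_name TEXT, price_date TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("A", "2022-11-21", 4), ("A", "2022-11-22", 4), ("A", "2022-11-23", 4),
        ("A", "2022-11-24", 4), ("A", "2022-11-25", 3), ("A", "2022-11-26", 3),
        ("A", "2022-11-27", 3), ("A", "2022-11-29", 3), ("A", "2022-12-01", 4),
        ("B", "2022-11-21", 7), ("B", "2022-11-22", 8),
    ],
)

query = """
WITH flagged AS (   -- 1 where the price differs from the previous row of the item
    SELECT item_name, price_date, price,
           CASE WHEN price <> LAG(price) OVER (PARTITION BY item_name ORDER BY price_date)
                THEN 1 ELSE 0 END AS changed
    FROM prices
),
grouped AS (        -- the cumulative sum of the flags numbers the islands
    SELECT item_name, price_date, price,
           SUM(changed) OVER (PARTITION BY item_name ORDER BY price_date) AS grp
    FROM flagged
),
islands AS (
    SELECT item_name, grp, MIN(price) AS price,
           MIN(price_date) AS start_dt, MAX(price_date) AS end_dt
    FROM grouped
    GROUP BY item_name, grp
)
SELECT item_name, price, start_dt,
       -- the last island of each item has no successor: end it at :today
       CASE WHEN LEAD(start_dt) OVER (PARTITION BY item_name ORDER BY start_dt) IS NULL
            THEN :today ELSE end_dt END AS end_dt
FROM islands
ORDER BY item_name, start_dt
"""
periods = conn.execute(query, {"today": "2023-02-14"}).fetchall()
for period in periods:
    print(period)
```

Passing the date in as a parameter keeps the sketch reproducible; in a real query the engine's current-date function would take its place.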
I want to sum the previous 7 days revenue from each date for each customer. There are some missing dates for some customers and various different customers so I cannot use a Lag function. I was previously using windows but I could only partition by customer_ID and could not partition by the date range as well.
Some sample data as follows:
Customer_ID  Date      Revenue
1            01/02/21  $20
2            01/02/21  $30
1            02/02/21  $40
2            02/02/21  $50
1            03/02/21  $20
2            03/02/21  $60
1            04/02/21  $10
2            04/02/21  $80
1            05/02/21  $100
2            05/02/21  $40
1            06/02/21  $20
2            06/02/21  $30
1            07/02/21  $50
2            07/02/21  $70
1            08/02/21  $10
2            08/02/21  $20
1            09/02/21  $3
2            09/02/21  $40
This result would give the sum of the previous seven days revenue broken down by customer ID for each date. It is ordered by Customer_ID and Date
Customer_ID  Date      Revenue
1            01/02/21  $20
1            02/02/21  $60
1            03/02/21  $80
1            04/02/21  $90
1            05/02/21  $190
1            06/02/21  $210
1            07/02/21  $260
1            08/02/21  $250
1            09/02/21  $240
2            01/02/21  $30
2            02/02/21  $80
2            03/02/21  $140
2            04/02/21  $220
2            05/02/21  $260
2            06/02/21  $290
2            07/02/21  $360
2            08/02/21  $350
2            09/02/21  $340
select customer_id,date,sum(revenue) from customer_table where date >= sysdate-7 and date <= sysdate group by customer_id,date;
Hope this helps you
You can try a self join, where you match on:
t1.Customer_ID = t2.Customer_ID
t2.Date falling within the seven-day window that ends on t1.Date (t1.Date itself and the six days before it).
Then apply SUM to t2.Revenue and group by the selected fields.
SELECT t1.Customer_ID,
t1.Date,
SUM(t2.Revenue) AS total
FROM tab t1
LEFT JOIN tab t2
ON t1.Customer_ID = t2.Customer_ID
AND t2.Date BETWEEN DATEADD(day, -6, t1.Date) AND t1.Date
GROUP BY t1.Customer_ID,
t1.Date
This approach would avoid the issue of missing dates for customers, as long as you are comparing dates instead of taking the "last 7 records" with LAG.
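As a rough illustration of the date-based self join, here is a minimal sketch using Python's sqlite3 module. The table name revenue_by_day, the ISO date strings, and SQLite's date(..., '-6 days') in place of DATEADD are all assumptions; customer 2 reuses the question's sample, while customer 1 is given made-up rows with gaps to show that missing dates do not matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revenue_by_day (customer_id INTEGER, rev_date TEXT, revenue INTEGER)"
)
customer_2 = [30, 50, 60, 80, 40, 30, 70, 20, 40]  # 2021-02-01 .. 2021-02-09
data = [(2, f"2021-02-{d:02d}", amt) for d, amt in enumerate(customer_2, start=1)]
# Customer 1 has gaps between dates (made-up rows for illustration).
data += [(1, "2021-02-01", 20), (1, "2021-02-03", 20), (1, "2021-02-10", 5)]
conn.executemany("INSERT INTO revenue_by_day VALUES (?, ?, ?)", data)

query = """
SELECT t1.customer_id,
       t1.rev_date,
       SUM(t2.revenue) AS last_7_days
FROM revenue_by_day AS t1
JOIN revenue_by_day AS t2
  ON t1.customer_id = t2.customer_id
 -- t2 must fall in the 7-day window ending on t1's date (inclusive)
 AND t2.rev_date BETWEEN date(t1.rev_date, '-6 days') AND t1.rev_date
GROUP BY t1.customer_id, t1.rev_date
ORDER BY t1.customer_id, t1.rev_date
"""
rolling = {(cust, day): total for cust, day, total in conn.execute(query)}
for key, total in rolling.items():
    print(key, total)
```

Because every row always matches itself, an inner join gives the same result as a left join here.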
with cte as (-- Customer_ID Date Revenue
select 1 customer_id, DATE( '01/02/2021','DD/MM/YYYY') Some_date, 20 Revenue
union all select 2 customer_id, DATE( '01/02/2021','DD/MM/YYYY') Some_date, 30 Revenue
union all select 1 customer_id, DATE( '03/02/2021','DD/MM/YYYY') Some_date, 20 Revenue
union all select 2 customer_id, DATE( '03/02/2021','DD/MM/YYYY') Some_date, 60 Revenue
union all select 1 customer_id, DATE( '04/02/2021','DD/MM/YYYY') Some_date, 10 Revenue
union all select 2 customer_id, DATE( '04/02/2021','DD/MM/YYYY') Some_date, 80 Revenue
union all select 1 customer_id, DATE( '05/02/2021','DD/MM/YYYY') Some_date, 100 Revenue
union all select 2 customer_id, DATE( '05/02/2021','DD/MM/YYYY') Some_date, 40 Revenue
union all select 1 customer_id, DATE( '06/02/2021','DD/MM/YYYY') Some_date, 20 Revenue
union all select 2 customer_id, DATE( '06/02/2021','DD/MM/YYYY') Some_date, 30 Revenue
union all select 1 customer_id, DATE( '07/02/2021','DD/MM/YYYY') Some_date, 50 Revenue
union all select 2 customer_id, DATE( '07/02/2021','DD/MM/YYYY') Some_date, 70 Revenue
union all select 1 customer_id, DATE( '08/02/2021','DD/MM/YYYY') Some_date, 10 Revenue
union all select 2 customer_id, DATE( '08/02/2021','DD/MM/YYYY') Some_date, 20 Revenue
union all select 1 customer_id, DATE( '09/02/2021','DD/MM/YYYY') Some_date, 3 Revenue
union all select 1 customer_id, DATE( '02/02/2021','DD/MM/YYYY') Some_date, 40 Revenue
union all select 2 customer_id, DATE( '02/02/2021','DD/MM/YYYY') Some_date, 50 Revenue
union all select 2 customer_id, DATE( '09/02/2021','DD/MM/YYYY') Some_date, 40 Revenue)
select customer_id, revenue
, DATE_TRUNC('week', Some_date ) week_number
, sum(revenue)
over(partition by customer_id,week_number
order by Some_date asc
rows between unbounded preceding and current row) volia
from cte
I have a table with the columns Sales_Date and Sales. I am looking for a solution to get Sales for the last year from the Sales_Date Column. Sales_Date column has values from the year 2015 onwards.
For example:
Sales_Date  Sales
1/1/2016    $25
1/8/2016    $57
1/1/2015    $125
1/8/2015    $21
I am looking for the below result set:
Sales_Date  Sales  LYear_Sales_Date  LYear_Sales
1/1/2016    $25    1/1/2015          $125
1/8/2016    $57    1/8/2015          $21
Filter all data to this year (WHERE YEAR(Sales.Sales_Date) = 2016).
LEFT JOIN to the same table, combining each date with the same date one year prior (Sales LEFT JOIN Sales AS Sales_LastYear ON Sales_LastYear.Sales_Date = DATEADD(year, -1, Sales.Sales_Date)).
SELECT the fields that you want (SELECT Sales.Sales_Date, Sales_LastYear.Sales_Date AS LYear_Sales_Date, ...).
Replace the LEFT JOIN with an INNER JOIN, if you want only those records that have a matching last-year record.
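Put together, the steps above might look like the following minimal sketch. It runs on Python's sqlite3 module, so the table name sales_data, the ISO date strings, and SQLite's date(..., '-1 years') in place of DATEADD are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the rows are the question's sample data.
conn.execute("CREATE TABLE sales_data (sales_date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("2016-01-01", 25), ("2016-01-08", 57), ("2015-01-01", 125), ("2015-01-08", 21)],
)

query = """
SELECT cur.sales_date,
       cur.sales,
       prev.sales_date AS lyear_sales_date,
       prev.sales      AS lyear_sales
FROM sales_data AS cur
LEFT JOIN sales_data AS prev
       -- pair each row with the row dated exactly one year earlier
       ON prev.sales_date = date(cur.sales_date, '-1 years')
WHERE cur.sales_date >= '2016-01-01'
  AND cur.sales_date <  '2017-01-01'
ORDER BY cur.sales_date
"""
paired = conn.execute(query).fetchall()
for row in paired:
    print(row)
```

With the LEFT JOIN, a 2016 date that has no 2015 counterpart would still appear, with NULLs in the last two columns.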
It seems like LAG would work here, assuming you always want the value for the same day and month:
WITH CTE AS(
SELECT Sales_Date,
Sales,
LAG(Sales_Date) OVER (PARTITION BY DAY(Sales_Date), MONTH(Sales_Date) ORDER BY YEAR(Sales_Date)) AS LYear_Sales_Date,
LAG(Sales) OVER (PARTITION BY DAY(Sales_Date), MONTH(Sales_Date) ORDER BY YEAR(Sales_Date)) AS LYear_Sales
FROM dbo.YourTable)
SELECT Sales_Date,
Sales,
LYear_Sales_Date,
LYear_Sales
FROM CTE
WHERE Sales_Date >= '20160101'
AND Sales_Date < '20170101';
I have to create a view in BigQuery with some details of product sales. The measurements to be included in the view are explained below. These measurements have to be calculated for each product for every day that product is sold. A product is identified by a unique combination of 5-6 attributes (in our demo, the code1 and code2 columns). The date represents the transaction date.
sales_today -> the sum of sales for each product (combination of code1 and code2) per day.
TotSales_previous_3_months -> the sum of sales for each product in the previous 3 months(without including any sales from current month). for e.g., if we are calculating TotSales_previous_3_months for a product sale on 5th March 2022, we have to sum up the sales of that product from 1st December 2021 to 28th February 2022.
TotSales_previous_6_months -> the sum of sales for each product in the previous 6 months(without including any sales from current month). Follow the same logic as for TotSales_previous_3_months.
sale_one_month_ago -> The sum of sales of the product on this day exactly one month ago. For e.g., if we are calculating sale_one_month_ago for a product sale on 5th March 2022, it would be the sum of sales of that product on 5th February 2022.
sale_one_year_ago -> The sum of sales of the product on this day exactly one year ago. For e.g., if we are calculating sale_one_year_ago for a product sale on 5th March 2022, it would be the sum of sales of that product on 5th March 2021.
Unique_count_flag -> flag = 1 if the number of sales of the product on a day = 1. If the number of sales of the product is more than 1 on a day, flag = 0.
I have created this table (test_sales) with some demo data for understanding.
code1  code2  date        gen          sales
1      A      2021-02-04  jerez        7
1      A      2021-02-04  abc          5
1      A      2022-02-04  wres         10
1      A      2022-03-04  tomz         10
1      A      2022-03-05  everyz       10
1      A      2022-05-01  ben10        30
1      A      2022-06-01  xyx          10
1      A      2022-06-01  xya          5
2      A      2022-05-10  iqoom        20
3      C      2022-01-10  imola        60
3      C      2022-04-01  nurburgring  50
3      C      2022-06-01  jerez        30
The result set after calculations should be like -
code1
code2
date
gen
sales
sales_today
TotSales_previous_3_months
TotSales_previous_6_months
sale_one_month_ago
sale_one_year_ago
Unique_count_flag
1
A
2021-02-04
jerez
7
12
0
0
0
0
1
A
2021-02-04
abc
5
12
0
0
0
0
1
A
2022-02-04
wres
10
10
0
0
0
12
1
1
A
2022-03-04
tomz
10
10
10
10
10
1
1
A
2022-03-05
everyz
10
10
10
10
0
1
1
A
2022-05-01
ben10
30
30
30
30
0
1
1
A
2022-06-01
xyx
10
15
50
60
30
0
1
A
2022-06-01
xya
5
15
50
60
30
0
2
A
2022-05-10
iqoom
20
20
0
0
0
1
3
C
2022-01-10
imola
60
60
0
0
0
1
3
C
2022-04-01
nurburgring
50
50
60
60
0
1
3
C
2022-06-01
jerez
30
30
50
110
0
1
I was able to create the code below to achieve the result. It works fine for small datasets, but here I am dealing with around 60 GB of data (~50 columns and ~80 million rows). If I adapt the code below to the original sales data (which is itself a combination of a few joined tables), it just runs for a very long time. Is there an alternative or more efficient way to achieve the results?
with temp as
(SELECT
code1,code2,date,gen,sales,
COUNT(*) OVER(PARTITION BY code1, code2, date) AS cnt,
SUM(sales) OVER(PARTITION BY code1, code2,date) AS sales_today,
array_agg(struct(sales as sales,date as date)) over(partition by code1,code2 order by date) as past_records
FROM
`test_sales`
)
select * except(past_records,cnt),
(select ifnull(sum(x.sales),0)
from unnest(temp.past_records) as x
where x.date between (date_trunc(temp.date,MONTH) - INTERVAL 3 MONTH) and (date_trunc(temp.date, MONTH) - interval 1 day)) as TotSales_previous_3_months,
(select ifnull(sum(x.sales),0)
from unnest(temp.past_records) as x
where x.date between (date_trunc(temp.date,MONTH) - INTERVAL 6 MONTH) and (date_trunc(temp.date, MONTH) - interval 1 day)) as TotSales_previous_6_months,
(select ifnull(sum(x.sales),0)
from unnest(temp.past_records) as x
where x.date = temp.date - INTERVAL 1 MONTH) as sale_one_month_ago,
(select ifnull(sum(x.sales),0)
from unnest(temp.past_records) as x
where x.date = temp.date - INTERVAL 1 YEAR) as sale_one_year_ago,
if(cnt = 1,1,0) as Unique_count_flag
from temp
Modified Code inspired from Mikhail's approach:-
select *,
-- extract(year from date) * 12 + extract(month from date) as months,
-- UNIX_DATE(date) AS days,
sum(sales) over(product_date) as sales_today,
sum(sales) over(product range between 3 preceding and 1 preceding) as TotSales_previous_3_months,
sum(sales) over(product range between 6 preceding and 1 preceding) as TotSales_previous_6_months,
case when extract(day from date) = 31 and extract(month from date) in (3,12,10,7,5)
then sum(sales) over(product_by_unix_date range between 31 preceding and 31 preceding)
when extract(day from date) = 30 and extract(month from date) = 3
then sum(sales) over(product_by_unix_date range between 30 preceding and 30 preceding)
when extract(day from date) = 29 and extract(month from date) = 3
then sum(sales) over(product_by_unix_date range between 29 preceding and 29 preceding)
else
sum(sales) over(product_day range between 1 preceding and 1 preceding)
end as sale_one_month_ago,
case when extract(day from date) = 29 and extract(month from date) = 2
then sum(sales) over(product_by_unix_date range between 366 preceding and 366 preceding)
else
sum(sales) over(product_day range between 12 preceding and 12 preceding)
end as sale_one_year_ago
from `river-blade-343102.test.test_sales`
window
product as (partition by code1, code2 order by extract(year from date) * 12 + extract(month from date)),
product_date as (partition by code1, code2, date ),
product_day as (partition by code1, code2, extract(day from date) order by extract(year from date) * 12 + extract(month from date)),
product_by_unix_date as (partition by code1,code2 order by UNIX_DATE(date))
Consider the version of your query below. It is still not perfect, but at least it is easier to read, handle, and maintain:
select *,
sum(sales) over(product_date) as sales_today,
sum(sales) over(product range between 3 preceding and 1 preceding) as TotSales_previous_3_months,
sum(sales) over(product range between 6 preceding and 1 preceding) as TotSales_previous_6_months,
sum(sales) over(product_day range between 1 preceding and 1 preceding) as sale_one_month_ago,
sum(sales) over(product_day range between 12 preceding and 12 preceding) as sale_one_year_ago,
from test_sales
window
product as (partition by code1, code2 order by extract(year from date) * 12 + extract(month from date)),
product_date as (partition by code1, code2, date),
product_day as (partition by code1, code2, extract(day from date) order by extract(year from date) * 12 + extract(month from date))
if applied to sample data in your question - output is
Is there an alternative or efficient way to achieve the results?
So, definitely above is an alternative way with its own pros and cons
Whether it is more efficient - I do think so, but not 100% sure to be honest - it depends on your data - you need to test it against your data and see ...
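To see how the month-index window behaves on the sample data, here is a minimal sketch that runs on Python's sqlite3 module rather than BigQuery. The column name sale_date, the output names prev_3_months and prev_6_months, and the COALESCE to 0 are assumptions added for illustration, and it needs an SQLite build that supports RANGE frames with numeric offsets (3.28 or later). Only sales_today and the two "previous months" totals are reproduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_sales "
    "(code1 INTEGER, code2 TEXT, sale_date TEXT, gen TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO test_sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A", "2021-02-04", "jerez", 7), (1, "A", "2021-02-04", "abc", 5),
        (1, "A", "2022-02-04", "wres", 10), (1, "A", "2022-03-04", "tomz", 10),
        (1, "A", "2022-03-05", "everyz", 10), (1, "A", "2022-05-01", "ben10", 30),
        (1, "A", "2022-06-01", "xyx", 10), (1, "A", "2022-06-01", "xya", 5),
        (2, "A", "2022-05-10", "iqoom", 20), (3, "C", "2022-01-10", "imola", 60),
        (3, "C", "2022-04-01", "nurburgring", 50), (3, "C", "2022-06-01", "jerez", 30),
    ],
)

query = """
WITH keyed AS (
    SELECT *,
           -- consecutive calendar months get consecutive integers
           CAST(strftime('%Y', sale_date) AS INTEGER) * 12
             + CAST(strftime('%m', sale_date) AS INTEGER) AS month_idx
    FROM test_sales
)
SELECT code1, code2, sale_date, gen, sales,
       SUM(sales) OVER (PARTITION BY code1, code2, sale_date) AS sales_today,
       COALESCE(SUM(sales) OVER (PARTITION BY code1, code2 ORDER BY month_idx
                RANGE BETWEEN 3 PRECEDING AND 1 PRECEDING), 0) AS prev_3_months,
       COALESCE(SUM(sales) OVER (PARTITION BY code1, code2 ORDER BY month_idx
                RANGE BETWEEN 6 PRECEDING AND 1 PRECEDING), 0) AS prev_6_months
FROM keyed
ORDER BY code1, code2, sale_date, gen
"""
rows = conn.execute(query).fetchall()
# Key by (code1, gen), since gen alone is not unique in the sample.
result = {(r[0], r[3]): r[5:] for r in rows}
for key, values in result.items():
    print(key, values)
```

Because the frame is a RANGE over the month index and not over rows, all rows of the current month are excluded and months without sales simply contribute nothing.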
Here's an example "transactions" table where each row is a record of an amount and the date of the transaction.
+--------+------------+
| amount | date |
+--------+------------+
| 1000 | 2020-01-06 |
| -10 | 2020-01-14 |
| -75 | 2020-01-20 |
| -5 | 2020-01-25 |
| -4 | 2020-01-29 |
| 2000 | 2020-03-10 |
| -75 | 2020-03-12 |
| -20 | 2020-03-15 |
| 40 | 2020-03-15 |
| -50 | 2020-03-17 |
| 200 | 2020-10-10 |
| -200 | 2020-10-10 |
+--------+------------+
The goal is to return one column, "balance", with the balance of all transactions. The only catch is that there is a monthly fee of $5 for each month that does not have at least THREE payment transactions (represented by a negative value in the amount column) totaling at least $100. So in the example, the only month without a $5 fee is March, because there were 3 payments (negative amount transactions) that totaled $145. The final balance would therefore be $2,746: the sum of the amounts, $2,801, minus $55 in monthly fees (11 months x 5). I'm not a Postgres expert by any means, so any pointers on how to get started, or on which parts of the Postgres documentation would help most, would be much appreciated.
The expected output would be:
+---------+
| balance |
+---------+
| 2746 |
+---------+
This is rather complicated. You can calculate the total span of months and then subtract out the one where the fee is cancelled:
select amount, (extract(year from age) * 12 + extract(month from age)), cnt,
amount - 5 *( extract(year from age) * 12 + extract(month from age) + 1 - cnt) as balance
from (select sum(amount) as amount,
age(max(date), min(date)) as age
from transactions t
) t cross join
(select count(*) as cnt
from (select date_trunc('month', date) as yyyymm, count(*) as cnt, sum(amount) as amount
from transactions t
where amount < 0
group by yyyymm
having count(*) >= 3 and sum(amount) <= -100
) tt
) tt;
Here is a db<>fiddle.
This calculates 2756, which appears to follow your rules. If you want the full year, you can just use 12 instead of calculating the number of months using age().
I would first left join with a generate_series that represents the months you are interested in (in this case, all in the year 2020). That adds the missing months with a balance of 0.
Then I aggregate these values per month and add the negative balance per month and the number of negative balances.
Finally, I calculate the grand total and subtract the fee for each month that does not meet the criteria.
SELECT sum(amount_per_month) -
sum(5) FILTER (WHERE negative_per_month > -100 OR negative_count < 3)
FROM (SELECT sum(amount) AS amount_per_month,
sum(amount) FILTER (WHERE amount < 0) AS negative_per_month,
month_start,
count(*) FILTER (WHERE amount < 0) AS negative_count
FROM (SELECT coalesce(t.amount, 0) AS amount,
coalesce(date_trunc('month', CAST (t.day AS timestamp)), dates.d) AS month_start
FROM generate_series(
TIMESTAMP '2020-01-01',
TIMESTAMP '2020-12-01',
INTERVAL '1 month'
) AS dates (d)
LEFT JOIN transactions AS t
ON dates.d = date_trunc('month', CAST (t.day AS timestamp))
) AS gaps_filled
GROUP BY month_start
) AS sums_per_month;
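The same three steps can be checked against the question's data with a minimal sketch on Python's sqlite3 module. SQLite has no generate_series by default, so a recursive CTE produces the twelve months instead; the column name tx_date, the CTE names, and the assumption that all rows belong to a single year are introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (amount INTEGER, tx_date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1000, "2020-01-06"), (-10, "2020-01-14"), (-75, "2020-01-20"),
        (-5, "2020-01-25"), (-4, "2020-01-29"), (2000, "2020-03-10"),
        (-75, "2020-03-12"), (-20, "2020-03-15"), (40, "2020-03-15"),
        (-50, "2020-03-17"), (200, "2020-10-10"), (-200, "2020-10-10"),
    ],
)

query = """
WITH RECURSIVE months(m) AS (      -- months 1..12, so empty months exist too
    SELECT 1
    UNION ALL
    SELECT m + 1 FROM months WHERE m < 12
),
per_month AS (
    SELECT months.m,
           COALESCE(SUM(t.amount), 0) AS total,
           COALESCE(SUM(CASE WHEN t.amount < 0 THEN -t.amount END), 0) AS paid,
           COUNT(CASE WHEN t.amount < 0 THEN 1 END) AS payments
    FROM months
    LEFT JOIN transactions AS t
           ON CAST(strftime('%m', t.tx_date) AS INTEGER) = months.m
    GROUP BY months.m
)
-- charge 5 for every month lacking 3+ payments totaling at least 100
SELECT SUM(total)
       - 5 * SUM(CASE WHEN payments >= 3 AND paid >= 100 THEN 0 ELSE 1 END) AS balance
FROM per_month
"""
balance = conn.execute(query).fetchone()[0]
print(balance)
```

Only March meets both conditions in the sample, so eleven months are charged the fee.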
This would be my solution, simply using CTEs.
DB fiddle here.
balance
2746
Code:
WITH monthly_credited_transactions
AS (SELECT Date_part('month', date) AS cred_month,
Sum(CASE
WHEN amount < 0 THEN Abs(amount)
ELSE 0
END) AS credited_amount,
Sum(CASE
WHEN amount < 0 THEN 1
ELSE 0
END) AS credited_cnt
FROM transactions
GROUP BY 1),
credit_fee
AS (SELECT ( 12 - Count(1) ) * 5 AS fee,
1 AS id
FROM monthly_credited_transactions
WHERE credited_amount >= 100
AND credited_cnt >= 3),
trans
AS (SELECT Sum(amount) AS amount,
1 AS id
FROM transactions)
SELECT amount - fee AS balance
FROM trans a
LEFT JOIN credit_fee b
ON a.id = b.id
For me the query below worked (I adapted my answer from @GordonLinoff's):
select CAST(totalamount - 5 *(12 - extract(month from firstt) + 1 - nofeemonths) AS int) as balance
from (select sum(amount) as totalamount, min(date) as firstt
from transactions t
) t cross join
(select count(*) as nofeemonths
from (select date_trunc('month', date) as months, count(*) as nofeemonths, sum(amount) as totalamount
from transactions t
where amount < 0
group by months
having count(*) >= 3 and sum(amount) <= -100
) tt
) tt;
Here firstt is the date of the first transaction in that year, and 12 - extract(month from firstt) + 1 - nofeemonths is the number of months for which the credit card fee of 5 will be charged.