How to query cumulative daily sales for a month in SQL (Redshift)?

Can someone help me with a Redshift SQL query that returns the result described below?
My table:
SalesDate | Amount($)
2022-03-01 | 4
2022-03-01 | 5
2022-03-02 | 3
2022-03-02 | 10
2022-03-02 | 12
2022-03-03 | 1
etc..
I want a cumulative (running total) sales table grouped by SalesDate:
SalesDate | Amount($)
2022-03-01 | 9
2022-03-02 | 34
etc...
Currently, I am using this query, but it doesn't give the result I want:
select distinct salesdate::date as date_number
, sum(amount) over (order by salesdate::date asc ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as amount
from mytable
where salesdate >= '2022-03-01'
The total increases, but row by row instead of once per date:
SalesDate | Amount($)
2022-03-01 | 4
2022-03-01 | 9
2022-03-02 | 12
2022-03-02 | 22
2022-03-02 | 34

You can aggregate with GROUP BY in a subquery first, so that there is one row per date, and then apply the window function to that result.
SELECT date_number,
sum(amount) over (order by date_number) totalAmount
FROM (
select salesdate::date as date_number,
sum(Amount) as amount
from mytable
where salesdate >= '2022-03-01'
GROUP BY salesdate::date
) t1
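The aggregate-then-window pattern above can be tried locally. The following is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module instead of Redshift, so the `::date` cast is dropped and dates are stored as ISO text. The function name `running_daily_totals` is made up for illustration, and the sketch assumes the bundled SQLite is version 3.25 or later, which is needed for window functions.

```python
import sqlite3

def running_daily_totals(rows):
    """Return (date, cumulative amount) pairs, one per distinct date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (salesdate TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    query = """
        SELECT date_number,
               SUM(amount) OVER (ORDER BY date_number) AS total_amount
        FROM (
            -- Step 1: collapse the table to one row per day
            SELECT salesdate AS date_number, SUM(amount) AS amount
            FROM mytable
            WHERE salesdate >= '2022-03-01'
            GROUP BY salesdate
        ) AS t1
        ORDER BY date_number
    """
    # Step 2 (the outer SELECT) accumulates the daily sums in date order
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows taken from the question
sample = [
    ("2022-03-01", 4), ("2022-03-01", 5),
    ("2022-03-02", 3), ("2022-03-02", 10), ("2022-03-02", 12),
    ("2022-03-03", 1),
]
totals = running_daily_totals(sample)
print(totals)
```

With the question's sample rows this gives 9 for 2022-03-01 and 34 for 2022-03-02, matching the desired output.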

Related

How do I summarize sales data in SQL by month for the last 24 months?

I have a large number of rows with sales for different products on various days.
I want to retrieve the sum for each product per month, for the last 24 months.
How do I write a WHERE clause that selects the last 24 months (counted back from the latest date in the table, not from today's date)?
How can the result be summarized and shown by month instead of by individual day, such as 2018-01-24?
**Sample Data Table**
| SalesDate | Product | SLSqty |
| 2018-01-24 | Product A | 25 |
| 2019-06-10 | Product B | 10 |
| 2019-10-07 | Product C | 4 |
| 2020-03-05 | Product A | 20 |
| 2021-09-01 | Product A | 50 |
| 2021-09-01 | Product B | 10 |
| 2021-09-02 | Product C | 3 |
| 2021-09-04 | Product A | 50 |
| 2021-09-07 | Product B | 10 |
**Expected Result**
| SalesMONTH | Product | SLSqty |
| 2019-10-31 | Product C | 4 |
| 2020-03-31 | Product A | 20 |
| 2021-09-30 | Product A | 100 |
| 2021-09-30 | Product B | 20 |
| 2021-09-30 | Product C | 3 |
I would create a variable that stores the latest date in your table. Then you can use that variable in your WHERE clause.
IF OBJECT_ID('TEMPDB..#TEMP') IS NOT NULL
DROP TABLE #TEMP
CREATE TABLE #TEMP(
[SalesDate] DATE
,[product] NVARCHAR(20)
,[SLSqty] INT
)
INSERT INTO #TEMP([SalesDate],[product],[SLSqty])
VALUES('2018-01-24','Product A',25)
,('2019-06-10','Product B',10)
,('2019-10-07','Product C',4 )
,('2020-03-05','Product A',20)
,('2021-09-01','Product A',50)
,('2021-09-01','Product B',10)
,('2021-09-02','Product C',3 )
,('2021-09-04','Product A',50)
,('2021-09-07','Product B',10)
DECLARE @DATEVAR AS DATE = (SELECT MAX(#TEMP.SalesDate) FROM #TEMP)
The last line declares the variable. If you select @DATEVAR, you get a single date, the latest SalesDate in the table (2021-09-07 for the sample data).
Then you use it in a WHERE clause. Since you want the 24 months before the latest date, I would use the DATEDIFF(MONTH, start, end) function in the WHERE clause. It returns the number of month boundaries crossed between the two dates as an integer, and you simply constrain it to be 24 or less. Because DATEDIFF counts month boundaries rather than full months, this keeps the whole calendar month that lies 24 months back.
SELECT #TEMP.SalesDate
,#TEMP.product
,#TEMP.SLSqty
,DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) [# of months Diff]
FROM #TEMP
WHERE DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) <= 24
Now you have to aggregate the sales grouped by year-month and product.
I compute the year-month as an integer such as 202109 (Sept. 2021).
SELECT --#TEMP.SalesDate --(YOU HAVE TO TAKE THIS OUT FOR THE GROUP BY)
YEAR(#TEMP.SalesDate)*100+MONTH(#TEMP.SalesDate) [year-month for GROUP BY]
,#TEMP.product
,SUM(#TEMP.SLSqty) SLSqty
-- ,DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) [# of months Diff] --(YOU HAVE TO TAKE THIS OUT FOR THE GROUP BY)
FROM #TEMP
WHERE DATEDIFF(MONTH,#TEMP.SalesDate,@DATEVAR) <= 24
GROUP BY YEAR(#TEMP.SalesDate)*100+MONTH(#TEMP.SalesDate)
,#TEMP.product
Here is an Oracle SQL version:
-- DATE literals avoid depending on the session's NLS_DATE_FORMAT
WITH data (SalesDate, Product, SLSqty) AS (
SELECT DATE '2018-01-24', 'Product A', 25 FROM dual UNION ALL
SELECT DATE '2019-06-10', 'Product B', 10 FROM dual UNION ALL
SELECT DATE '2019-10-07', 'Product C', 4 FROM dual UNION ALL
SELECT DATE '2020-03-05', 'Product A', 20 FROM dual UNION ALL
SELECT DATE '2021-09-01', 'Product A', 50 FROM dual UNION ALL
SELECT DATE '2021-09-01', 'Product B', 10 FROM dual UNION ALL
SELECT DATE '2021-09-02', 'Product C', 3 FROM dual UNION ALL
SELECT DATE '2021-09-04', 'Product A', 50 FROM dual UNION ALL
SELECT DATE '2021-09-07', 'Product B', 10 FROM dual
),
theLatest (SalesDate) AS (
SELECT MAX(SalesDate) FROM data
)
SELECT TO_CHAR(d.SalesDate, 'YYYY-MM') AS SalesMonth, d.Product, SUM(d.SLSqty) AS SLSqty
FROM data d
JOIN theLatest ON d.SalesDate >= ADD_MONTHS(theLatest.SalesDate, -24)
GROUP BY TO_CHAR(d.SalesDate, 'YYYY-MM'), d.Product
ORDER BY TO_CHAR(d.SalesDate, 'YYYY-MM')
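To check the "last 24 months from the latest date in the table" logic without a SQL Server or Oracle instance, here is a small sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not either answer's exact query: the table name `sales` and the function name are invented, dates are stored as ISO text, and SQLite's `date(..., '-24 months')` and `strftime('%Y-%m', ...)` stand in for `ADD_MONTHS` and `TO_CHAR`.

```python
import sqlite3

def monthly_sales_last_24_months(rows):
    """Sum SLSqty per month and product for the 24 months before the latest SalesDate."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (SalesDate TEXT, Product TEXT, SLSqty INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT strftime('%Y-%m', SalesDate) AS sales_month,
               Product,
               SUM(SLSqty) AS SLSqty
        FROM sales
        -- cutoff is relative to the newest row, not to today's date
        WHERE SalesDate >= date((SELECT MAX(SalesDate) FROM sales), '-24 months')
        GROUP BY strftime('%Y-%m', SalesDate), Product
        ORDER BY sales_month, Product
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows taken from the question
sample_sales = [
    ("2018-01-24", "Product A", 25),
    ("2019-06-10", "Product B", 10),
    ("2019-10-07", "Product C", 4),
    ("2020-03-05", "Product A", 20),
    ("2021-09-01", "Product A", 50),
    ("2021-09-01", "Product B", 10),
    ("2021-09-02", "Product C", 3),
    ("2021-09-04", "Product A", 50),
    ("2021-09-07", "Product B", 10),
]
monthly = monthly_sales_last_24_months(sample_sales)
for row in monthly:
    print(row)
```

Like the Oracle query, this uses an exact cutoff date (2019-09-07 for the sample data), so it can differ from the DATEDIFF version for rows that fall earlier in the cutoff month.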

Postgresql how to select columns where it matches conditions?

I have a table like this:
inventory_id | customer_id | max
--------------+-------------+---------------------
4497 | 1 | 2005-07-28 00:00:00
1449 | 1 | 2005-08-22 00:00:00
1440 | 1 | 2005-08-02 00:00:00
3232 | 1 | 2005-08-02 00:00:00
3418 | 2 | 2005-08-02 00:00:00
654 | 2 | 2005-08-02 00:00:00
3164 | 2 | 2005-08-21 00:00:00
2053 | 2 | 2005-07-27 00:00:00
For each customer, I want to select the row with the most recent date, along with its other columns. This is what I want to achieve:
inventory_id | customer_id | max
--------------+-------------+---------------------
1449 | 1 | 2005-08-22 00:00:00
3164 | 2 | 2005-08-21 00:00:00
I tried using an aggregate, but I need inventory_id and customer_id to appear in the result together.
Is there any method that could do this?
Use distinct on:
select distinct on (customer_id) t.*
from t
order by customer_id, max desc;
distinct on is a Postgres extension that returns one row for each distinct value of whatever is in the parentheses. Which row is returned depends on the order by: it is the first one that appears in the sorted set for that value.
SELECT inventory_id, customer_id, max FROM
(SELECT inventory_id, customer_id, max,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY max DESC) AS ROWNO
FROM inventory_table) AS A
WHERE ROWNO=1
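The ROW_NUMBER() variant is portable beyond Postgres, so it can be tried with Python's built-in sqlite3 module. This is only a sketch under assumptions: the function name is invented, the date column is renamed to `max_date` (hypothetical) to avoid confusion with the MAX() function, and SQLite 3.25 or later is assumed for window function support.

```python
import sqlite3

def latest_row_per_customer(rows):
    """Return the row with the most recent date for each customer_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory_table (inventory_id INTEGER, customer_id INTEGER, max_date TEXT)"
    )
    conn.executemany("INSERT INTO inventory_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT inventory_id, customer_id, max_date
        FROM (
            SELECT inventory_id, customer_id, max_date,
                   -- number each customer's rows, newest first
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id ORDER BY max_date DESC
                   ) AS rowno
            FROM inventory_table
        ) AS a
        WHERE rowno = 1
        ORDER BY customer_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Sample rows taken from the question
sample_inventory = [
    (4497, 1, "2005-07-28"), (1449, 1, "2005-08-22"),
    (1440, 1, "2005-08-02"), (3232, 1, "2005-08-02"),
    (3418, 2, "2005-08-02"), (654, 2, "2005-08-02"),
    (3164, 2, "2005-08-21"), (2053, 2, "2005-07-27"),
]
latest = latest_row_per_customer(sample_inventory)
print(latest)
```

If two rows of one customer share the latest date, ROW_NUMBER() picks one of them arbitrarily; adding a tie-breaker column to the ORDER BY makes the choice deterministic.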

Oracle SQL: How can I sum every x number of subsequent rows for each row

I have a data table that looks like this:
|Contract Date | Settlement_Price |
|--------------|------------------|
| 01/10/2020 | 50 |
|--------------|------------------|
| 01/11/2020 | 10 |
|--------------|------------------|
| 01/01/2021 | 20 |
|--------------|------------------|
| 01/02/2021 | 30 |
|--------------|------------------|
| 01/03/2021 | 50 |
|--------------|------------------|
I would like to write a query that, for each row, sums the two rows beneath it. For example, on the first row with contract date 01/10/2020, the sum column would add 10 and 20 to give a result of 30. On the next row, the sum column would add 20 and 30 to give 50, and so on. The resulting table would look like this:
|Contract Date | Settlement_Price | Sum Column |
|--------------|------------------|------------|
| 01/10/2020 | 50 | 30 |
|--------------|------------------|------------|
| 01/11/2020 | 10 | 50 |
|--------------|------------------|------------|
| 01/01/2021 | 20 | 80 |
|--------------|------------------|------------|
| 01/02/2021 | 30 | |
|--------------|------------------|------------|
| 01/03/2021 | 50 | |
|--------------|------------------|------------|
Could anyone please help me with the query to do this not just for 2 subsequent rows but for x subsequent rows.
So far I have tried SUM(Settlement_Price) OVER (ORDER BY Contract_date ROWS BETWEEN 3 PRECEDING AND CURRENT ROW). Using the current row was of course wrong, but that is as far as I got.
You can use the SUM analytic function:
SELECT contract_date,
settlement_price,
CASE COUNT(*) OVER (
ORDER BY contract_date ROWS BETWEEN 1 FOLLOWING AND 2 FOLLOWING
)
WHEN 2
THEN SUM( settlement_price ) OVER (
ORDER BY contract_date ROWS BETWEEN 1 FOLLOWING AND 2 FOLLOWING
)
END AS sum_column
FROM table_name;
Or you can use LEAD:
SELECT contract_date,
settlement_price,
LEAD( settlement_price, 1 , NULL ) OVER ( ORDER BY contract_date )
+ LEAD( settlement_price, 2 , NULL ) OVER ( ORDER BY contract_date )
AS sum_column
FROM table_name;
So, for the test data:
CREATE TABLE table_name ( contract_date, settlement_price ) AS
SELECT DATE '2020-10-01', 50 FROM DUAL UNION ALL
SELECT DATE '2020-11-01', 10 FROM DUAL UNION ALL
SELECT DATE '2020-12-01', 20 FROM DUAL UNION ALL
SELECT DATE '2021-01-01', 30 FROM DUAL UNION ALL
SELECT DATE '2021-02-01', 50 FROM DUAL;
Both queries output:
CONTRACT_DATE | SETTLEMENT_PRICE | SUM_COLUMN
:------------ | ---------------: | ---------:
01-OCT-20 | 50 | 30
01-NOV-20 | 10 | 50
01-DEC-20 | 20 | 80
01-JAN-21 | 30 | null
01-FEB-21 | 50 | null
SUM (Settlement_Price) Over (order by Contract_date Rows between 1 following and 2 following)
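The question also asks about x subsequent rows rather than just two. One way to generalize the frame-based answer is to substitute x into both the frame bound and the COUNT check. The sketch below does this with Python's built-in sqlite3 module instead of Oracle, assuming SQLite 3.25 or later; the function name `sum_next_rows` is hypothetical, and x is converted to an integer before being placed in the SQL text because a frame bound has to be a constant.

```python
import sqlite3

def sum_next_rows(rows, x):
    """For each row, sum settlement_price over the next x rows (NULL if fewer than x follow)."""
    x = int(x)
    if x < 1:
        raise ValueError("x must be a positive integer")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (contract_date TEXT, settlement_price INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?, ?)", rows)
    # The frame starts one row below the current row and ends x rows below it
    frame = f"ORDER BY contract_date ROWS BETWEEN 1 FOLLOWING AND {x} FOLLOWING"
    query = f"""
        SELECT contract_date,
               settlement_price,
               CASE COUNT(*) OVER ({frame})
                   WHEN {x} THEN SUM(settlement_price) OVER ({frame})
               END AS sum_column
        FROM table_name
        ORDER BY contract_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Test data taken from the answer above
prices = [
    ("2020-10-01", 50), ("2020-11-01", 10), ("2020-12-01", 20),
    ("2021-01-01", 30), ("2021-02-01", 50),
]
next_two = sum_next_rows(prices, 2)
next_three = sum_next_rows(prices, 3)
print(next_two)
print(next_three)
```

Dropping the CASE and keeping only the SUM would return partial sums for the last rows instead of NULL, which is what the bare SUM expression above does.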

Aggregate quantity columns per distinct date in table sql

I want to sum the quantity column from the first date in the table (2016-02-17 here) up to each distinct date. The result should contain one cumulative quantity per distinct date in the table.
How can I write a query for this in SQL Server?
ID| quantity | date
---+----------+-----
18 | 6 | 2016-02-17 00:00:00.000
19 | 6 | 2016-02-17 00:00:00.000
18 | 4 | 2016-02-17 00:00:00.000
19 | 3 | 2016-02-18 00:00:00.000
18 | 1 | 2016-02-18 00:00:00.000
19 | 5 | 2016-02-18 00:00:00.000
18 | 6 | 2016-02-19 00:00:00.000
19 | 7 | 2016-02-19 00:00:00.000
18 | 8 | 2016-02-19 00:00:00.000
19 | 9 | 2016-02-19 00:00:00.000
Expected output:
| Date | quantity |
|------------|----------|
| 2016-02-17 | 16 |
| 2016-02-18 | 25 |
| 2016-02-19 | 55 |
Aggregate with SUM and GROUP BY to get one total per distinct date, then apply a windowed SUM over those totals to get the running sum.
SELECT Date,
SUM(quantity) OVER(ORDER BY Date) quantity
FROM(
SELECT DATE, SUM(quantity) quantity
FROM Your_Table
GROUP BY DATE
)A
If you want a separate running total for each ID, use this. The PARTITION BY clause restarts the total for each ID.
SELECT Id, Date,
SUM(quantity) OVER(PARTITION BY ID ORDER BY Date) quantity
FROM(
SELECT Id, DATE, SUM(quantity) quantity
FROM Your_Table
GROUP BY Id, DATE
)A
You do not need a subquery or CTE to use window functions with aggregation:
SELECT DATE, SUM(quantity) as day_quantity,
SUM(SUM(quantity)) OVER (ORDER BY DATE) as running_quantity
FROM Your_Table
GROUP BY DATE
ORDER BY DATE;
If you want the results ordered by date (as implied by your result set), you should include an explicit ORDER BY.
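Another approach is to compute the running total row by row and then keep the largest value per date (this relies on the quantities being non-negative):
WITH
b as (
Select my_date,
SUM(quantity) Over(order by my_date rows between unbounded preceding and current row) running_total
from main_table
)
SELECT my_date, max(running_total) running_total
from b group by my_date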

postgresql - cumul. sum active customers by month (removing churn)

I want to create a query to get the cumulative sum by month of our active customers. The tricky thing here is that (unfortunately) some customers churn and so I need to remove them from the cumulative sum on the month they leave us.
Here is a sample of my customers table :
customer_id | begin_date | end_date
-----------------------------------------
1 | 15/09/2017 |
2 | 15/09/2017 |
3 | 19/09/2017 |
4 | 23/09/2017 |
5 | 27/09/2017 |
6 | 28/09/2017 | 15/10/2017
7 | 29/09/2017 | 16/10/2017
8 | 04/10/2017 |
9 | 04/10/2017 |
10 | 05/10/2017 |
11 | 07/10/2017 |
12 | 09/10/2017 |
13 | 11/10/2017 |
14 | 12/10/2017 |
15 | 14/10/2017 |
Here is what I am looking to achieve :
month | active customers
-----------------------------------------
2017-09 | 7
2017-10 | 6
I've managed to achieve it with the following query ... However, I'd like to know if there is a better way.
select
"begin_date" as "date",
sum((new_customers.new_customers-COALESCE(churn_customers.churn_customers,0))) OVER (ORDER BY new_customers."begin_date") as active_customers
FROM (
select
date_trunc('month',begin_date)::date as "begin_date",
count(id) as new_customers
from customers
group by 1
) as new_customers
LEFT JOIN(
select
date_trunc('month',end_date)::date as "end_date",
count(id) as churn_customers
from customers
where
end_date is not null
group by 1
) as churn_customers on new_customers."begin_date" = churn_customers."end_date"
order by 1
;
You can use a CTE to count the end dates per month and then subtract that count from the count of begin dates for the same month by using a left join.
Query 1:
WITH edt
AS (
SELECT to_char(end_date, 'yyyy-mm') AS mon
,count(*) AS ct
FROM customers
WHERE end_date IS NOT NULL
GROUP BY to_char(end_date, 'yyyy-mm')
)
SELECT to_char(c.begin_date, 'yyyy-mm') as month
,COUNT(*) - MAX(COALESCE(ct, 0)) AS active_customers
FROM customers c
LEFT JOIN edt ON to_char(c.begin_date, 'yyyy-mm') = edt.mon
GROUP BY to_char(begin_date, 'yyyy-mm')
ORDER BY month;
Results:
| month | active_customers |
|---------|------------------|
| 2017-09 | 7 |
| 2017-10 | 6 |