Aggregate multiple columns based on specific date range within a month - sql

I need to aggregate Amounts to be displayed by date range per month. To illustrate please take a look at the following table:
Invoice_Payment
Customer_id Invoice_no Invoice_date Amount
---------------------------------------------------
10 10023 2016-07-08 60
10 10018 2016-08-04 90
11 10016 2016-07-01 110
11 10021 2016-07-05 120
12 10028 2016-07-11 10
12 10038 2016-07-31 5
As you'll notice, I want to group them based on Customer_id and display the dates from start to end. Furthermore, this has to be done for each month only.
This is the query I have tried so far:
select Customer_id, (mindate + ' to ' + maxdate) Date_Range, Amount
from (
select Customer_id, sum(Amount) Amount, min(Invoice_date) mindate, max(Invoice_date) maxdate
from Invoice_Payment
group by Customer_id
) I ;
From the above query I'm getting output like this:
Customer_id Date_Range Amount
10 2016-07-08 to 2016-08-04 150
11 2016-07-01 to 2016-07-05 230
12 2016-07-11 to 2016-07-31 15
Please check this SQL Fiddle working demo.
Take Customer_id = 10, who has an Invoice_date in July 2016 and another in August 2016. I need to sum that customer's payments for July and for August separately, each with its own date range. My query above sums the Amount over all of the customer's Invoice_date values instead.
Desired output :
Customer_id Date_Range Amount
10 2016-07-08 to 2016-07-08 60
10 2016-08-04 to 2016-08-04 90
11 2016-07-01 to 2016-07-05 230
12 2016-07-11 to 2016-07-31 15
How can I achieve this? Any help would be greatly appreciated.

You are almost done. Just add YEAR and MONTH to GROUP BY.
select Customer_id, (mindate + ' to ' + maxdate) Date_Range, Amount
from (
select Customer_id,
sum(Amount) Amount, min(Invoice_date) mindate, max(Invoice_date) maxdate
from #Invoice_Payment
group by
Customer_id,
YEAR(Invoice_date),
MONTH(Invoice_date)
) I ;

How about grouping by customer_id, month and year?
select Customer_id, (mindate + ' to ' + maxdate) Date_Range, Amount
from (
select Customer_id,
sum(Amount) Amount, min(Invoice_date) mindate, max(Invoice_date) maxdate
from #Invoice_Payment
group by Customer_id,month(Invoice_date), year(Invoice_date)
) I
order by customer_id;
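
To sanity-check the per-month grouping without a SQL Server instance, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in engine. SQLite has no YEAR() or MONTH(), so the sketch groups on strftime('%Y-%m', Invoice_date) and concatenates with || instead of +. Those substitutions, the text-typed date column and the function name monthly_ranges are assumptions of this sketch, not part of the answers above.

```python
import sqlite3

# Sample rows from the question: (Customer_id, Invoice_no, Invoice_date, Amount)
ROWS = [
    (10, 10023, "2016-07-08", 60),
    (10, 10018, "2016-08-04", 90),
    (11, 10016, "2016-07-01", 110),
    (11, 10021, "2016-07-05", 120),
    (12, 10028, "2016-07-11", 10),
    (12, 10038, "2016-07-31", 5),
]


def monthly_ranges(rows):
    """Return (Customer_id, Date_Range, Amount), one row per customer per month."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Invoice_Payment ("
        "Customer_id INTEGER, Invoice_no INTEGER, Invoice_date TEXT, Amount INTEGER)"
    )
    con.executemany("INSERT INTO Invoice_Payment VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT Customer_id,
               MIN(Invoice_date) || ' to ' || MAX(Invoice_date) AS Date_Range,
               SUM(Amount) AS Amount
        FROM Invoice_Payment
        -- 'YYYY-MM' plays the role of YEAR(...), MONTH(...) in SQL Server
        GROUP BY Customer_id, strftime('%Y-%m', Invoice_date)
        ORDER BY Customer_id, MIN(Invoice_date)
        """
    ).fetchall()
    con.close()
    return result


for row in monthly_ranges(ROWS):
    print(row)
```

With the six sample rows this prints one row per customer per month, which lines up with the desired output in the question.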

Related

Sales amounts of the top n selling vendors by month in bigquery

I have a table in BigQuery like this (260000 rows):
vendor date item_price
x 2021-07-08 23:41:10 451,5
y 2021-06-14 10:22:10 41,7
z 2020-01-03 13:41:12 74
s 2020-04-12 01:14:58 88
....
What I want is to group this data by month and, for each month, sum the sales of only the top 20 vendors in that month. Expected output:
month sum_of_only_top20_vendor's_sales
2020-01 7857
2020-02 9685
2020-03 3574
2020-04 7421
.....
Consider the approach below:
select month, sum(sale) as sum_of_only_top20_vendor_sales
from (
select vendor,
format_datetime('%Y-%m', date) month,
sum(item_price) as sale
from your_table
group by vendor, month
qualify row_number() over(partition by month order by sale desc) <= 20
)
group by month
Another solution that can potentially perform much better on really big data (approx_top_sum is an approximate aggregate, so the totals may differ slightly from the exact query above):
select month,
(select sum(sum) from t.top_20_vendors) as sum_of_only_top20_vendor_sales
from (
select
format_datetime('%Y-%m', date) month,
approx_top_sum(vendor, item_price, 20) top_20_vendors
from your_table
group by month
) t
Or, with a little refactoring:
select month, sum(sum) as sum_of_only_top20_vendor_sales
from (
select
format_datetime('%Y-%m', date) month,
approx_top_sum(vendor, item_price, 20) top_20_vendors
from your_table
group by month
) t, t.top_20_vendors
group by month
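
The exact (non-approximate) idea can be tried locally with a small sketch in Python and the standard sqlite3 module. SQLite has no QUALIFY, so the ranking is done in a CTE and filtered with WHERE; window functions need SQLite 3.25 or newer. The sample rows are made up for illustration (they are not from the question), and the function name top_n_sales_by_month and the use of n = 2 instead of 20 are assumptions of this sketch.

```python
import sqlite3

# Made-up rows (vendor, date, item_price); NOT taken from the question.
SALES = [
    ("x", "2020-01-03 13:41:12", 50.0),
    ("x", "2020-01-10 09:00:00", 25.0),
    ("y", "2020-01-05 10:00:00", 40.0),
    ("z", "2020-01-20 08:30:00", 10.0),
    ("x", "2020-02-01 12:00:00", 5.0),
    ("y", "2020-02-02 12:00:00", 7.0),
    ("z", "2020-02-03 12:00:00", 9.0),
]


def top_n_sales_by_month(rows, n):
    """Sum, per month, the sales of only the n best-selling vendors of that month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE your_table (vendor TEXT, date TEXT, item_price REAL)")
    con.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        WITH vendor_month AS (
            -- total sales per vendor per month
            SELECT vendor,
                   strftime('%Y-%m', date) AS month,
                   SUM(item_price) AS sale
            FROM your_table
            GROUP BY vendor, strftime('%Y-%m', date)
        ),
        ranked AS (
            -- rank vendors inside each month by their monthly total
            SELECT month, sale,
                   ROW_NUMBER() OVER (PARTITION BY month ORDER BY sale DESC) AS rn
            FROM vendor_month
        )
        SELECT month, SUM(sale)
        FROM ranked
        WHERE rn <= ?
        GROUP BY month
        ORDER BY month
        """,
        (n,),
    ).fetchall()
    con.close()
    return result


print(top_n_sales_by_month(SALES, 2))
```

In January the made-up vendors total 75, 40 and 10, so the top two sum to 115; in February the totals are 5, 7 and 9, so the top two sum to 16.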

Joining on the same key on the next row

Suppose we have a table which contains customer_id, order_date, and ship_date. A reorder of the product occurs when the same customer's next order_date is within 30 days of the last ship_date.
select * from mytable
customer_id order_date ship_date
1 2017-08-04 2017-08-09
1 2017-09-01 2017-09-05
2 2017-02-02 2017-03-01
2 2017-04-05 2017-04-09
2 2017-04-15 2017-04-19
3 2018-02-02 2018-03-01
Requested: Reorders
customer_id order_date ship_date
1 2017-09-01 2017-09-05
2 2017-04-15 2017-04-19
How can I retrieve only the reorder records, i.e. rows where the same customer's next order_date is within 30 days of the last ship_date?
You can use exists as follows:
Select * from your_table t
Where exists (select 1 from your_table tt
Where tt.customer_id = t.customer_id
And t.order_date > tt.ship_date
and t.order_date <= dateadd(day, 30, tt.ship_date))
One method is lead():
select t.customer_id, t.order_date, t.next_ship_date
from (select t.*,
lead(order_date) over (partition by customer_id order by order_date) as next_order_date,
lead(ship_date) over (partition by customer_id order by order_date) as next_ship_date
from t
) t
where next_order_date < dateadd(day, 30, ship_date);
EDIT:
If you want the "reorder" row, just use lag():
select t.*
from (select t.*,
lag(ship_date) over (partition by customer_id order by order_date) as prev_ship_date
from t
) t
where prev_ship_date > dateadd(day, -30, order_date);
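
The reorder rule from the question (an order placed less than 30 days after the same customer's previous ship_date) can be checked against the sample data with a small sketch in Python and the standard sqlite3 module. SQLite is only a stand-in here: it has no dateadd(), so the sketch uses date(order_date, '-30 days'), and LAG needs SQLite 3.25 or newer. The function name find_reorders is an assumption of this sketch.

```python
import sqlite3

# Sample rows from the question: (customer_id, order_date, ship_date)
ORDERS = [
    (1, "2017-08-04", "2017-08-09"),
    (1, "2017-09-01", "2017-09-05"),
    (2, "2017-02-02", "2017-03-01"),
    (2, "2017-04-05", "2017-04-09"),
    (2, "2017-04-15", "2017-04-19"),
    (3, "2018-02-02", "2018-03-01"),
]


def find_reorders(rows):
    """Return orders placed less than 30 days after the customer's previous ship_date."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (customer_id INTEGER, order_date TEXT, ship_date TEXT)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT customer_id, order_date, ship_date
        FROM (
            SELECT t.*,
                   LAG(ship_date) OVER (
                       PARTITION BY customer_id ORDER BY order_date
                   ) AS prev_ship_date
            FROM mytable AS t
        )
        -- ISO-8601 text dates compare correctly as strings;
        -- a customer's first order has prev_ship_date NULL and is filtered out
        WHERE prev_ship_date > date(order_date, '-30 days')
        ORDER BY customer_id, order_date
        """
    ).fetchall()
    con.close()
    return result


for row in find_reorders(ORDERS):
    print(row)
```

On the sample rows this keeps the 2017-09-01 order of customer 1 and the 2017-04-15 order of customer 2, the two rows listed under "Requested: Reorders".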

how do i divide and add column

I have a list of person ids and dates; each row says when a person entered the website (their id and the date).
How can I show, for each date, how many people entered the site two days in a row?
The data (about 30,000 rows like this, on different dates):
01/03/2019 4616
01/03/2019 17584
01/03/2019 7812
01/03/2019 34
01/03/2019 12177
01/03/2019 7129
01/03/2019 11660
01/03/2019 2428
01/03/2019 17514
01/03/2019 10781
01/03/2019 7629
01/03/2019 11119
I managed to show the number of people who entered the site on each day, but I didn't manage to add a column showing the people who entered two days in a row.
date number_of_entrance
2019-03-01 7099
2019-03-02 7021
2019-03-03 7195
2019-03-04 7151
2019-03-05 7260
2019-03-06 7169
2019-03-07 7076
2019-03-08 7081
2019-03-09 6987
2019-03-10 7172
select date,count(*) as number_of_entrance
fROM [finalaa].[dbo].[Daily_Activity]
group by Date
order by date;
How can I show, for each date, how many people entered the site two days in a row?
I would just use lag():
select count(distinct person)
from (select t.*,
lag(date) over (partition by person order by date) as prev_date
from t
) t
where prev_date = dateadd(day, -1, date);
Your code suggests SQL Server, so I used the date functions in that database.
If you want this per date:
select date, count(distinct person)
from (select t.*,
lag(date) over (partition by person order by date) as prev_date
from t
) t
where prev_date = dateadd(day, -1, date)
group by date;
You can use a subquery which returns the number of common entrances in 2 days:
select
t.date,
count(*) as number_of_entrance,
(
SELECT COUNT(g.id) FROM (
SELECT id
FROM [Daily_Activity]
WHERE date IN (t.date, DATEADD(day, -1, t.date))
GROUP BY id
HAVING COUNT(DISTINCT date) = 2
) g
) number_of_entrance_2_days_in_a_row
FROM [Daily_Activity] t
group by t.date
order by t.date;
Replace id with the 2nd column's name in the table.
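
The per-date lag() idea can be tried on a tiny data set with a sketch in Python and the standard sqlite3 module (a stand-in for SQL Server; LAG needs SQLite 3.25 or newer). The rows, the column names visit_date and person_id, and the function name consecutive_day_counts are all made up for this sketch; the real table and column names are not known from the question.

```python
import sqlite3

# Made-up visits (visit_date, person_id); NOT the data from the question.
VISITS = [
    ("2019-03-01", 1), ("2019-03-01", 2), ("2019-03-01", 3),
    ("2019-03-02", 2), ("2019-03-02", 3), ("2019-03-02", 4),
    ("2019-03-03", 3), ("2019-03-03", 5),
]


def consecutive_day_counts(rows):
    """Per date, count people who also entered the site on the previous day."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Daily_Activity (visit_date TEXT, person_id INTEGER)")
    con.executemany("INSERT INTO Daily_Activity VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT visit_date, COUNT(DISTINCT person_id) AS two_days_in_a_row
        FROM (
            SELECT visit_date, person_id,
                   LAG(visit_date) OVER (
                       PARTITION BY person_id ORDER BY visit_date
                   ) AS prev_date
            FROM Daily_Activity
        )
        -- keep a visit only if the same person's previous visit was the day before
        WHERE prev_date = date(visit_date, '-1 day')
        GROUP BY visit_date
        ORDER BY visit_date
        """
    ).fetchall()
    con.close()
    return result


print(consecutive_day_counts(VISITS))
```

Persons 2 and 3 visit on both 03-01 and 03-02, and person 3 again on 03-03, so the counts are 2 for 2019-03-02 and 1 for 2019-03-03. A date on which nobody returned from the previous day (here 2019-03-01) produces no row at all with this form of the query.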

Find the start and end date of stock difference

Please suggest a good SQL query to find the start and end date of each stock difference.
Imagine I have data in a table like the one below.
Sample_table
transaction_date stock
2018-12-01 10
2018-12-02 10
2018-12-03 20
2018-12-04 20
2018-12-05 20
2018-12-06 20
2018-12-07 20
2018-12-08 10
2018-12-09 10
2018-12-10 30
Expected result should be
Start_date end_date stock
2018-12-01 2018-12-02 10
2018-12-03 2018-12-07 20
2018-12-08 2018-12-09 10
2018-12-10 null 30
This is the gaps-and-islands problem. You may use row_number and group by for this.
select t.stock, min(transaction_date), max(transaction_date)
from (
select row_number() over (order by transaction_date) -
row_number() over (partition by stock order by transaction_date) grp,
transaction_date,
stock
from data
) t
group by t.grp, t.stock
In the following DBFIDDLE DEMO I also handle the null value of the last group, but the main idea of finding consecutive rows is built on the above query.
You may check this for an explanation of this solution.
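
As a rough illustration of the row-number difference, here is a sketch in Python with the standard sqlite3 module, run on the sample table from the question (window functions need SQLite 3.25 or newer). Treating the run that contains the latest transaction_date as still open, and therefore returning NULL for its end date, is an assumption of this sketch; the linked demo may handle the last group differently. The function name stock_islands is also made up.

```python
import sqlite3

# Sample rows from the question: (transaction_date, stock)
STOCK = [
    ("2018-12-01", 10), ("2018-12-02", 10),
    ("2018-12-03", 20), ("2018-12-04", 20), ("2018-12-05", 20),
    ("2018-12-06", 20), ("2018-12-07", 20),
    ("2018-12-08", 10), ("2018-12-09", 10),
    ("2018-12-10", 30),
]


def stock_islands(rows):
    """Return (start_date, end_date, stock) for each run of equal stock values."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Sample_table (transaction_date TEXT, stock INTEGER)")
    con.executemany("INSERT INTO Sample_table VALUES (?, ?)", rows)
    result = con.execute(
        """
        WITH numbered AS (
            SELECT transaction_date, stock,
                   -- the difference stays constant inside one consecutive run
                   ROW_NUMBER() OVER (ORDER BY transaction_date)
                 - ROW_NUMBER() OVER (PARTITION BY stock ORDER BY transaction_date)
                   AS grp
            FROM Sample_table
        )
        SELECT MIN(transaction_date) AS start_date,
               -- assumption: the run holding the latest date is still open
               CASE WHEN MAX(transaction_date) =
                         (SELECT MAX(transaction_date) FROM Sample_table)
                    THEN NULL
                    ELSE MAX(transaction_date)
               END AS end_date,
               stock
        FROM numbered
        GROUP BY stock, grp
        ORDER BY start_date
        """
    ).fetchall()
    con.close()
    return result


for row in stock_islands(STOCK):
    print(row)
```

On the sample data the difference is 0 for the first run of stock 10, 2 for the run of 20, 5 for the second run of 10 and 9 for the single row of 30, which is why grouping by (stock, grp) separates the two runs of stock 10.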
You can try the query below, which uses row_number():
select stock,min(transaction_date) as start_date,
case when min(transaction_date)=max(transaction_date) then null else max(transaction_date) end as end_date
from
(
select *,row_number() over(order by transaction_date)-
row_number() over(partition by stock order by transaction_date) as rn
from t1
)A group by stock,rn
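A plain GROUP BY with MIN and MAX works only when each stock value forms a single consecutive run; with the sample data above it would merge the two runs of stock 10 into one row: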
SELECT
stock,
MIN(transaction_date) Start_date,
CASE WHEN COUNT(*)>1 THEN MAX(transaction_date) END end_date
FROM Sample_table
GROUP BY stock
ORDER BY stock
You can try with LEAD, LAG functions as below:
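select currentStockDate as startDate,
-- end of a run = day before the next run starts (SQL Server dateadd; assumes one row per day)
dateadd(day, -1, LEAD(currentStockDate,1) over(order by currentStockDate)) as EndDate,
currentStock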
from
(select *
from
(select
LAG(transaction_date,1) over(order by transaction_date) as prevStockDate,
transaction_date as CurrentstockDate,
LAG(stock,1) over(order by transaction_date) as prevStock,
stock as currentStock
from sample_table) as t
where (prevStock <> currentStock) or (prevStock is null)
) as t2

SQL Server : number of orders per date with day column

I have a query that pulls number of orders per date.
SELECT
name, CONVERT(VARCHAR(10), order_date, 120) AS order_date,
COUNT(1) AS orders
FROM
orders AS od
WHERE
id = 73
GROUP BY
CONVERT(VARCHAR(10), order_date, 120), name
ORDER BY
order_date, name
Below are the results I get when I run the query:
name order_date orders
--------------------------
20pmam 2016-07-27 39
20pmam 2016-07-28 30
20pmam 2016-07-29 32
20pmam 2016-07-31 468
20pmam 2016-08-02 75
20pmam 2016-08-05 30
I need my results to be like this, with a new column day
name order_date orders day
-------------------------------
20pmam 2016-07-27 39 1
20pmam 2016-07-28 30 2 // days between 2016-07-27 to 2016-07-28
20pmam 2016-07-29 32 3 // days between 2016-07-27 to 2016-07-29
20pmam 2016-07-31 468 5 // days between 2016-07-27 to 2016-07-31
20pmam 2016-08-02 75 7 // days between 2016-07-27 to 2016-08-02
20pmam 2016-08-05 30 10 // days between 2016-07-27 to 2016-08-05
The first/minimum order_date should be taken as day 1 ( in the above results 2016-07-27 is day 1) and should calculate others based on the first/minimum order_date.
Is this easily possible?
I don't have any idea how to get the desired result. I would appreciate any suggestions.
You can do this with cross apply to get the minimum date before each order_date and use it in datediff.
SELECT name,CONVERT(VARCHAR(10), order_date, 120) AS order_date, Count(1) [orders],
1+coalesce(datediff(day,t.min_date,od.order_date),0) as [Day]
FROM orders AS od
cross apply (select min(od1.order_date) as min_date
from orders od1
where od.id=od1.id and od.name=od1.name and od1.order_date<od.order_date) t
WHERE id = 73
GROUP BY CONVERT(VARCHAR(10), order_date, 120),name,datediff(day,t.min_date,od.order_date)
ORDER BY order_date,name
Try something like:
SELECT name,
CONVERT(VARCHAR(10), order_date, 120) AS order_date,
Count(1) AS orders,
DATEDIFF(DAY, first_order_date, order_date) + 1
FROM orders AS od
JOIN (SELECT min(order_date) AS first_order_date
FROM orders
WHERE id = 73) as fod ON 1 = 1
WHERE id = 73
GROUP BY CONVERT(VARCHAR(10), order_date, 120),
name,
DATEDIFF(DAY, first_order_date, order_date) + 1
ORDER BY order_date,
name
Hope this will solve your problem
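
The day-number calculation can be checked with a small sketch in Python and the standard sqlite3 module, again only as a stand-in for SQL Server. SQLite has no DATEDIFF, so the sketch subtracts julianday() values instead. The rows are made up (the dates follow the question, the order counts do not), and the extra row with id 99 is there only to show why the first-date subquery should be filtered by the same id. The function name orders_with_day is an assumption.

```python
import sqlite3

# Made-up rows (id, name, order_date); dates follow the question, counts do not.
ORDER_ROWS = [
    (73, "20pmam", "2016-07-27 10:15:00"),
    (73, "20pmam", "2016-07-27 18:40:00"),
    (73, "20pmam", "2016-07-28 09:00:00"),
    (73, "20pmam", "2016-07-31 12:30:00"),
    (73, "20pmam", "2016-08-05 08:05:00"),
    (99, "other", "2016-07-01 00:00:00"),  # different id with an earlier date
]


def orders_with_day(rows, wanted_id):
    """Return (name, order_date, orders, day), where day 1 is the first order date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER, name TEXT, order_date TEXT)")
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT od.name,
               date(od.order_date) AS order_day,
               COUNT(*) AS orders,
               -- whole days since the first order date, counting that date as day 1
               CAST(julianday(date(od.order_date)) - julianday(fod.first_day)
                    AS INTEGER) + 1 AS day
        FROM orders AS od
        CROSS JOIN (
            SELECT MIN(date(order_date)) AS first_day
            FROM orders
            WHERE id = ?   -- same filter as the outer query
        ) AS fod
        WHERE od.id = ?
        GROUP BY od.name, date(od.order_date)
        ORDER BY order_day, od.name
        """,
        (wanted_id, wanted_id),
    ).fetchall()
    con.close()
    return result


for row in orders_with_day(ORDER_ROWS, 73):
    print(row)
```

The day values come out as 1, 2, 5 and 10 for 2016-07-27, 2016-07-28, 2016-07-31 and 2016-08-05, the same numbers the question asks for on those dates. Without the id filter in the subquery, the row for id 99 would move day 1 back to 2016-07-01.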