How to summarize information over the dynamic period in sql? - sql

I have a table with orders and the following fields:
create table orders2 (
orderID int,
customerID int,
date DateTime,
amount int)
engine=Memory;
Each customer can make 0 or many orders each day. I need to create an SQL query that will show for each customer how many orders he/she made during the period of 3 days starting from the day when the customer has made his/her first order.
So, for each customer, the query should detect the date of the first order, then compute the date that is 3 days in the future from the first date, then filter rows to take only orders with dates in the given range, and then perform counting of orders (orderID) in that time period. At the moment, I was able to just detect the date of the first order for each customer.
SELECT
O.customerID,
O.date AS first_day,
COUNT(O.orderID) AS first_day_orders_num,
SUM(O.amount) AS first_day_amount
FROM orders2 AS O
INNER JOIN
(
SELECT
customerID,
MIN(date) AS first_date
FROM orders2
GROUP BY customerID
) AS I ON (O.customerID = I.customerID) AND (O.date = I.first_date)
GROUP BY
O.customerID,
O.date

I don't really understand what result do you need. Probably it can be solved using arrays.
Here is solution using vanilla sql
select customerID, min(first_date), sum(num_orders_per_day)
from (
select customerID, date, min(date) first_date, count() num_orders_per_day
from orders2
group by customerID, date
having date <= first_date + interval 3 days
)
group by customerID

You can use window functions to get the first order date:
select o.CustomerID, count(*) as num_orders_3_days
from (select o.*, min(date) over (partition by CustomerID) as min_date
from orders o
) o
where date < min_date + interval '3 day'
group by CustomerID;

Try this query:
SELECT customerID, orders_count
FROM (
SELECT customerID,
arraySort(x -> x.1, groupArray((date, orderID))) sorted_date_per_order_pairs,
sorted_date_per_order_pairs[1].1 + INTERVAL 3 day AS end_date,
arrayFilter(x -> x.1 < end_date, sorted_date_per_order_pairs) orders_in_period,
length(orders_in_period) orders_count
FROM orders2
GROUP BY customerID);

Related

Calculate average days between orders The last three records tsql

I trying to take an average per customer, but you're not grouping by customer.
I would like to calculate the average days between several order dates from a table called invoice. For each BusinessPartnerID, what is the average days between orders i want average days last three records orders .
I got the average of all order for each user but need days last three records orders
The sample table is as below
;WITH temp (avg,invoiceid,carname,carid,fullname,mobail)
AS
(
SELECT AvgLag = AVG(Lag) , Lagged.idinvoice,
Lagged.carname ,
Lagged.carid ,Lagged.fullname,Lagged.mobail
FROM
(
SELECT
(car2.Name) as carname ,
(car2.id) as carid ,( busin.Name) as fullname, ( busin.Mobile) as mobail , INV.Id as idinvoice , Lag = CONVERT(int, DATEDIFF(DAY, LAG(Date,1)
OVER (PARTITION BY car2.Id ORDER BY Date ), Date))
FROM [dbo].[Invoice] AS INV
JOIN [dbo].[InvoiceItem] AS INITEM on INV.Id=INITEM.Invoiceid
JOIN [dbo].[BusinessPartner] as busin on busin.Id=INV.BuyerId and Type=5
JOIN [dbo].[Product] as pt on pt.Id=INITEM.ProductId and INITEM.ProductId is not null and pt.ProductTypeId=3
JOIN [dbo].[Car] as car2 on car2.id=INv.BusinessPartnerCarId
) AS Lagged
GROUP BY
Lagged.carname,
Lagged.carid,Lagged.fullname,Lagged.mobail, Lagged.idinvoice
-- order by Lagged.fullname
)
SELECT * FROM temp where avg is not null order by avg
I don't really see how your query relate to your question. Starting from a table called invoice that has columns businesspartnerid, and date, here is how you would take the average of the day difference between the last 3 invoices of each business partner:
select businesspartnerid,
avg(1.0 * datediff(
day,
lag(date) over(partition by businesspartnerid order by date),
date
) avg_diff_day
from (
select i.*,
row_number() over(partiton by businesspartnerid order by date desc) rn
from invoice i
) i
where rn <= 3
group by businesspartnerid
Note that 3 rows gives you 2 intervals only, that will be averaged.

SQL: Difference between consecutive rows

Table with 3 columns: order id, member id, order date
Need to pull the distribution of orders broken down by No. of days b/w 2 consecutive orders by member id
What I have is this:
SELECT
a1.member_id,
count(distinct a1.order_id) as num_orders,
a1.order_date,
DATEDIFF(DAY, a1.order_date, a2.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id+1;
It's not helping me completely as the output I need is:
You can use lag() to get the date of the previous order by the same customer:
select o.*,
datediff(
order_date,
lag(order_date) over(partition by member_id order by order_date, order_id)
) days_diff
from orders o
When there are two rows for the same date, the smallest order_id is considered first. Also note that I fixed your datediff() syntax: in Hive, the function just takes two dates, and no unit.
I just don't get the logic you want to compute num_orders.
May be something like this:
SELECT
a1.member_id,
count(distinct a1.order_id) as num_orders,
a1.order_date,
DATEDIFF(DAY, a1.order_date, a2.order_date) as days_since_last_order
from orders as a1
inner join orders as a2
on a2.member_id = a1.member_id
where not exists (
select intermediate_order
from orders as intermedite_order
where intermediate_order.order_date < a1.order_date and intermediate_order.order_date > a2.order_date) ;

Return the number of new users by month

Table ORDERS contains the following columns
order_id, order_date, ship_date, ship_mode, customer_id, country, quantity
I just need the code. Here is my work.
SELECT order_date, COUNT (Customer_Id) As new_usercount
FROM (
SELECT Customer_Id, DATE(MIN(Order_Date)) AS order_date
FROM Orders
GROUP BY Customer_Id
) AS ch
GROUP BY order_date
Thanks
Your query is almost correct.
After you aggregate to get each customer's min order date you must group by the year and month of the dates and not the dates:
SELECT YEAR(order_date) year,
MONTH(order_date) month,
COUNT(*) new_usercount
FROM (
SELECT Customer_Id, MIN(Order_Date) order_date
FROM Orders
GROUP BY Customer_Id
) t
GROUP BY YEAR(order_date), MONTH(order_date)

PostgreSQL Query To Obtain Value that Occurs more than once in 12 months

I have the following query to return the number of users that booked a flight at least twice, but I need to identify those which have booked a flight more than once in the range of 12 months
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
GROUP BY customer
HAVING COUNT(*) > 1
)
You would use window functions. The simplest method is lag():
select count(distinct customer)
from (select s.*,
lag(date) over (partition by customer order by date) as prev_date
from sales s
) s
where prev_date > s.date - interval '12 month';
At the cost of a self-join, #AdrianKlaver's answer can adapt to any 12-month period.
SELECT COUNT(DISTINCT customer) FROM
(SELECT customer
FROM sales s1
JOIN sales s2
ON s1.customer = s2.customer
AND s1.ticket_id <> s2.ticket_id
AND s2.date_field BETWEEN s1.date_field AND (s1.date_field + interval'1 year')
GROUP BY customer
HAVING COUNT(*) > 1) AS subquery;
A stab at it with a made up date field:
SELECT COUNT(*)
FROM sales
WHERE customer in
(
SELECT customer
FROM sales
WHERE date_field BETWEEN '01/01/2019' AND '12/31/2019'
GROUP BY customer
HAVING COUNT(*) > 1
)

SQL: aggregation (group by like) in a column

I have a select that group by customers spending of the past two months by customer id and date. What I need to do is to associate for each row the total amount spent by that customer in the whole first week of the two month time period (of course it would be a repetition for each row of one customer, but for some reason that's ok ). do you know how to do that without using a sub query as a column?
I was thinking using some combination of OVER PARTITION, but could not figure out how...
Thanks a lot in advance.
Raffaele
Query:
select customer_id, date, sum(sales)
from transaction_table
group by customer_id, date
If it's a specific first week (e.g. you always want the first week of the year, and your data set normally includes January and February spending), you could use sum(case...):
select distinct customer_id, date, sum(sales) over (partition by customer_ID, date)
, sum(case when date between '1/1/15' and '1/7/15' then Sales end)
over (partition by customer_id) as FirstWeekSales
from transaction_table
In response to the comments below; I'm not sure if this is what you're looking for, since it involves a subquery, but here's my best shot:
select distinct a.customer_id, date
, sum(sales) over (partition by a.customer_ID, date)
, sum(case when date between mindate and dateadd(DD, 7, mindate)
then Sales end)
over (partition by a.customer_id) as FirstWeekSales
from transaction_table a
left join
(select customer_ID, min(date) as mindate
from transaction_table group by customer_ID) b
on a.customer_ID = b.customer_ID