Average for partition bounded by last 7 days - sql

I have a query that spans the last 30 days and sums total revenue. Alongside the 30-day sum, I also want the average over the last 7 days. I want something like this:
select
country
, avg(revenue) over (partition by country range between current_date - 7 and current_date) avg_revenue_last_7_days
, sum(revenue) total_revenue_30_days
from table
group by 1,2
Is it possible to get average for a smaller number of days than what aggregation is based on?
I want to avoid subqueries because the query is already quite complex.

You don't need window functions for this, just conditional aggregation:
select country,
avg(case when datecol between current_date - 7 and current_date
then revenue
end) as avg_revenue_last_7_days,
sum(case when datecol between current_date - 30 and current_date
then revenue
end) as total_revenue_30_days
from table
group by country;
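To see the conditional aggregation in action, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. The table name `sales`, the sample rows, and the helper `revenue_summary` are assumptions made up for illustration. Dates are stored as ISO strings so that BETWEEN compares them correctly.

```python
import sqlite3
from datetime import date, timedelta

def revenue_summary(rows, today):
    """rows: iterable of (country, 'YYYY-MM-DD', revenue).
    Returns {country: (avg_last_7_days, total_last_30_days)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (country TEXT, datecol TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    params = {
        "d7": (today - timedelta(days=7)).isoformat(),
        "d30": (today - timedelta(days=30)).isoformat(),
        "t": today.isoformat(),
    }
    # CASE yields NULL outside the window; AVG and SUM ignore NULLs.
    query = """
        SELECT country,
               AVG(CASE WHEN datecol BETWEEN :d7 AND :t THEN revenue END),
               SUM(CASE WHEN datecol BETWEEN :d30 AND :t THEN revenue END)
        FROM sales
        GROUP BY country
    """
    result = {c: (a, s) for c, a, s in conn.execute(query, params)}
    conn.close()
    return result

rows = [
    ("US", "2024-01-30", 10.0),
    ("US", "2024-01-28", 20.0),
    ("US", "2024-01-05", 70.0),
    ("DE", "2024-01-10", 5.0),
]
summary = revenue_summary(rows, date(2024, 1, 31))
print(summary)
```

A country with no rows in the last 7 days gets NULL (None in Python) for the average, because every CASE result for it is NULL.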

Related

Retrieve Customers with a Monthly Order Frequency greater than 4

I am trying to optimize the query below so that it fetches all customers who have placed 4 or more orders in each of the past three months.
Customer ID | Feb | Mar | Apr
0001 | 4 | 5 | 6
0002 | 3 | 2 | 4
0003 | 4 | 2 | 3
In the above table, only the customer with Customer ID 0001 should be picked, as they consistently have 4 or more orders in each month.
Below is the query I have written. It pulls all customers with an average purchase frequency of 4 over the last 90 days, but it does not check that there were 4 or more purchases in each of the last three months.
Query:
SELECT distinct lines.customer_id Customer_ID, (COUNT(lines.order_id)/90) PurchaseFrequency
from fct_customer_order_lines lines
LEFT JOIN product_table product
ON lines.entity_id= product.entity_id
AND lines.vendor_id= product.vendor_id
WHERE UPPER(product.country_code)= "IN"
AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
AND lines.date < CURRENT_DATE()
GROUP BY Customer_ID
HAVING PurchaseFrequency >=4;
I tried to use window functions, but I am not sure whether they are needed in this case.
I would count the orders per month instead of computing the average, and then retrieve the customers whose count is 4 or more in each of the last three months.
I also think you should select your interval using "month(CURRENT_DATE()) - 3" instead of a 90-day window. If needed, you should handle the case where the current date is in January, February or March, and go back to October, November or December of the previous year.
I'm not familiar with Google BigQuery, so I can't write your query, but I hope this helps.
So I've found a solution to this using a WITH clause, as below:
WITH filtered_orders AS (
select
customer_id ID,
extract(MONTH from date) Order_Month,
count(order_id) CountofOrders
from customer_order_lines lines
where EXTRACT(YEAR FROM date) = 2022 AND EXTRACT(MONTH FROM date) IN (2,3,4)
group by ID, Order_Month
having CountofOrders>=4)
select ID
from filtered_orders
group by ID
having count(Order_Month) =3;
Hope this helps!
One option is to first count the orders per month, and then keep only the customers whose count meets your threshold in every month:
WITH ORDERS_BY_MONTH AS (
SELECT
DATE_TRUNC(lines.date, MONTH) PurchaseMonth,
lines.customer_id Customer_ID,
COUNT(lines.order_id) PurchaseFrequency
FROM fct_customer_order_lines lines
LEFT JOIN product_table product
ON lines.entity_id= product.entity_id
AND lines.vendor_id= product.vendor_id
WHERE UPPER(product.country_code)= "IN"
AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
AND lines.date < CURRENT_DATE()
GROUP BY PurchaseMonth, Customer_ID
)
SELECT
Customer_ID,
AVG(PurchaseFrequency) AvgPurchaseFrequency
FROM ORDERS_BY_MONTH
GROUP BY Customer_ID
HAVING COUNT(1) = COUNTIF(PurchaseFrequency >= 4)
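The count-per-month-then-filter idea can be tried on the sample table from the question. This is only a sketch against an in-memory SQLite database, not BigQuery: the table `orders`, its columns, and the helper `consistent_customers` are hypothetical, and months are derived with `substr` on ISO date strings instead of `DATE_TRUNC`.

```python
import sqlite3

def consistent_customers(orders, months, min_orders=4):
    """orders: iterable of (customer_id, 'YYYY-MM-DD'); months: list of 'YYYY-MM'.
    Returns customers with at least min_orders orders in every listed month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    placeholders = ", ".join("?" for _ in months)
    query = f"""
        WITH monthly AS (
            SELECT customer_id, substr(order_date, 1, 7) AS ym, COUNT(*) AS n
            FROM orders
            WHERE substr(order_date, 1, 7) IN ({placeholders})
            GROUP BY customer_id, substr(order_date, 1, 7)
        )
        SELECT customer_id
        FROM monthly
        GROUP BY customer_id
        -- every month must be present, and the weakest month must reach the threshold
        HAVING COUNT(*) = ? AND MIN(n) >= ?
        ORDER BY customer_id
    """
    params = [*months, len(months), min_orders]
    found = [row[0] for row in conn.execute(query, params)]
    conn.close()
    return found

# Monthly order counts taken from the table in the question (Feb, Mar, Apr).
counts = {"0001": (4, 5, 6), "0002": (3, 2, 4), "0003": (4, 2, 3)}
months = ["2022-02", "2022-03", "2022-04"]
orders = [
    (cust, f"{month}-{day:02d}")
    for cust, per_month in counts.items()
    for month, n in zip(months, per_month)
    for day in range(1, n + 1)
]
print(consistent_customers(orders, months))
```

Checking `COUNT(*)` against the number of months matters: without it, a customer with no orders at all in one month would still pass the `MIN(n)` test.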

Finding the avg. revenue for each sales rep

I'd like to find the average monthly revenue for each sales owner. However, my current query takes the monthly total and just divides it by the number of entries. Ultimately, I'd like to find the total revenue for each month, divide it by the number of months, and eventually get the average of the past 6 months. Code as well as sample output below:
select activitydate, console_org_name, partneragency, partneradvertiser, org_sales_owner
,round(sum(gross_revenue_allocation)::numeric,2) as gross_revenue
,round(avg(sum(gross_revenue_allocation)) over (partition by org_sales_owner order by activitydate RANGE INTERVAL '5' MONTH PRECEDING)::numeric,2) as salesowner6monthavg
from data_provider_payout dpp
where activitydate >= '01/01/2019'
group by activitydate, console_org_name, partneragency, partneradvertiser, org_sales_owner
If I understand correctly, you need to aggregate by the month and the owner. That would be something like this:
select date_trunc('month', activitydate) as activitymonth, org_sales_owner,
round(sum(gross_revenue_allocation)::numeric, 2) as gross_revenue,
round((avg(sum(gross_revenue_allocation)) over (
partition by org_sales_owner
order by date_trunc('month', activitydate)
range between interval '5 month' preceding and current row
)
)::numeric, 2) as salesowner6monthavg
from data_provider_payout dpp
where activitydate >= '2019-01-01'
group by date_trunc('month', activitydate), org_sales_owner
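The two-step shape of this (monthly totals first, then a rolling average over those totals) can be sketched with an in-memory SQLite database. The table `payouts`, its columns, and the helper `rolling_six_month_avg` are assumptions for illustration. SQLite has no interval-based RANGE frame, so this sketch uses `ROWS BETWEEN 5 PRECEDING AND CURRENT ROW`, which matches a 6-month window only if every owner has data in every month. It also assumes SQLite 3.25 or later for window functions.

```python
import sqlite3

def rolling_six_month_avg(rows):
    """rows: iterable of (owner, 'YYYY-MM-DD', revenue).
    Returns [(owner, 'YYYY-MM', monthly_total, rolling_avg)]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payouts (owner TEXT, activitydate TEXT, revenue REAL)")
    conn.executemany("INSERT INTO payouts VALUES (?, ?, ?)", rows)
    query = """
        WITH monthly AS (
            -- step 1: one row per owner and month
            SELECT owner, substr(activitydate, 1, 7) AS month, SUM(revenue) AS total
            FROM payouts
            GROUP BY owner, substr(activitydate, 1, 7)
        )
        -- step 2: average the current month and the 5 preceding monthly rows
        SELECT owner, month, total,
               AVG(total) OVER (PARTITION BY owner ORDER BY month
                                ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) AS avg6
        FROM monthly
        ORDER BY owner, month
    """
    out = conn.execute(query).fetchall()
    conn.close()
    return out

# Two entries per month, so the monthly total for month i is 10 * i.
rows = []
for i in range(1, 8):
    rows.append(("ann", f"2019-{i:02d}-03", 5.0 * i))
    rows.append(("ann", f"2019-{i:02d}-20", 5.0 * i))
result = rolling_six_month_avg(rows)
for line in result:
    print(line)
```

Because the average is taken over monthly totals rather than raw rows, the number of entries within a month no longer affects the result.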

Using Date to find the inequality for sales than 500

I want to check whether the daily average sales for December 1998 is no greater than 100, using a WHERE clause. The table contains the sales date (something like 1 December 1998, across different days, months and years), the amount due, and so on. First I define a particular month:
DEFINE a = TO_DATE('1-Dec-1998', 'DD-Month-YYYY')
SELECT SUBSTR(Sales_Date, 4,6), (SUM(Amount_Due)/EXTRACT(DAY FROM LAST_DAY(Sales_Date))
FROM ......
WHERE SUM(AMOUNT_DUE)/EXTRACT(DAY FROM LAST_DAY(&a)) < 100
I'm stuck as to extract the sum of amount due in the month of december 1998 for the where clause....
How can I achieve the objective?
To me, it looks like this:
select to_char(sales_date, 'mm.yyyy') month,
avg(amount_due) avg_value
from your_table
where sales_date >= trunc(date '1998-12-01', 'mm')
and sales_date < add_months(trunc(date '1998-12-01', 'mm'), 1)
group by to_char(sales_date, 'mm.yyyy')
having avg(amount_due) < 100;
The WHERE clause can be simplified; it is written this way to show how to fetch a given period:
trunc to 'mm' returns the first day of that month
add_months applied to that value (the first day of the month) returns the first day of the next month
the bottom line: give me all rows whose sales_date is >= the first day of this month and < the first day of the next month; in other words, the whole month
Finally, the condition you put in the where clause should actually be in the having clause.
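The half-open range described above (first day of the month inclusive, first day of the next month exclusive) can be checked with a small sketch. It uses an in-memory SQLite database rather than Oracle, so `date(..., 'start of month')` stands in for `trunc(..., 'mm')` and the `'+1 month'` modifier stands in for `add_months`. The table `sales`, the sample rows, and the helper names are assumptions for illustration.

```python
import sqlite3

def month_bounds(conn, any_day):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    return conn.execute(
        "SELECT date(?, 'start of month'), date(?, 'start of month', '+1 month')",
        (any_day, any_day),
    ).fetchone()

def monthly_avg_below(rows, any_day, limit):
    """rows: iterable of ('YYYY-MM-DD', amount_due). Returns [(month, avg)] if avg < limit."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sales_date TEXT, amount_due REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    start, end = month_bounds(conn, any_day)
    query = """
        SELECT strftime('%m.%Y', sales_date) AS month, AVG(amount_due)
        FROM sales
        WHERE sales_date >= ? AND sales_date < ?   -- row filter: the whole month
        GROUP BY strftime('%m.%Y', sales_date)
        HAVING AVG(amount_due) < ?                 -- aggregate filter belongs in HAVING
    """
    out = conn.execute(query, (start, end, limit)).fetchall()
    conn.close()
    return out

rows = [
    ("1998-11-30", 1000.0),  # previous month, excluded
    ("1998-12-01", 50.0),
    ("1998-12-15", 120.0),
    ("1998-12-31", 70.0),    # last day of the month, included
    ("1999-01-01", 1000.0),  # next month, excluded
]
print(monthly_avg_below(rows, "1998-12-15", 100))
```

The upper bound is exclusive, so the query does not need to know how many days the month has.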
As long as the amount_due column only contains numbers, you can use the sum function.
Either of the SQL queries below should satisfy your requirement.
Select SUM(Amount_Due) from Sales where Sales_Date >= date '1998-12-01' and Sales_Date < date '1999-01-01'
OR
Select SUM(Amount_Due) from Sales where to_char(Sales_Date, 'MM-YYYY') = '12-1998'

I want find customers transacting for any consecutive 3 months from year 2017 to 2018

I want to know the trick for finding the customers who transacted in 3 consecutive months. That could be any 3 consecutive months, with any number of transactions in each.
Example: suppose a customer transacts in January, keeps transacting until March, and then stops. I want the list of such customers from my database.
I am working on AWS Athena.
One method uses aggregation and window functions:
select customer_id, yyyymm
from (select date_trunc('month', transactdate) as yyyymm, customer_id,
lag(date_trunc('month', transactdate), 2) over (partition by customer_id order by date_trunc('month', transactdate)) as prev_yyyymm_2
from t
where transactdate >= date '2017-01-01' and
transactdate < date '2019-01-01'
group by customer_id, date_trunc('month', transactdate)
) m
where prev_yyyymm_2 = yyyymm - interval '2' month;
This aggregates transactions by month and looks at the transaction date two rows earlier. The outer filter checks that that date is exactly 2 months earlier.
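The lag-by-two-rows check can be tried out with a small sketch on an in-memory SQLite database (not Athena). The sample data and the helper `three_consecutive_months` are assumptions for illustration. Since SQLite has no interval arithmetic on dates, each month is turned into a running month number (year * 12 + month), so "exactly 2 months earlier" becomes a difference of 2. Window functions require SQLite 3.25 or later.

```python
import sqlite3

def three_consecutive_months(transactions):
    """transactions: iterable of (customer_id, 'YYYY-MM-DD').
    Returns customers active in any 3 consecutive months of 2017-2018."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (customer_id TEXT, transactdate TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", transactions)
    query = """
        WITH months AS (
            -- one row per customer and month, numbered consecutively across years
            SELECT DISTINCT customer_id,
                   CAST(substr(transactdate, 1, 4) AS INTEGER) * 12
                   + CAST(substr(transactdate, 6, 2) AS INTEGER) AS month_no
            FROM t
            WHERE transactdate >= '2017-01-01' AND transactdate < '2019-01-01'
        ),
        lagged AS (
            SELECT customer_id, month_no,
                   LAG(month_no, 2) OVER (PARTITION BY customer_id
                                          ORDER BY month_no) AS prev2
            FROM months
        )
        -- two rows back is exactly two months back: three months in a row
        SELECT DISTINCT customer_id
        FROM lagged
        WHERE prev2 = month_no - 2
        ORDER BY customer_id
    """
    found = [row[0] for row in conn.execute(query)]
    conn.close()
    return found

transactions = [
    ("A", "2017-01-05"), ("A", "2017-01-20"), ("A", "2017-02-10"), ("A", "2017-03-01"),
    ("B", "2017-01-05"), ("B", "2017-03-05"), ("B", "2017-05-05"),
    ("C", "2017-11-15"), ("C", "2017-12-15"), ("C", "2018-01-15"),
    ("D", "2018-11-15"), ("D", "2018-12-15"), ("D", "2019-01-15"),
]
print(three_consecutive_months(transactions))
```

Reducing the data to one row per customer and month first is what makes "two rows earlier" meaningful; with several transactions in one month, the lag would otherwise land inside the same month.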

Selecting data with counts more than 4 in a month from a daily data

I am trying to count the monthly number of merchants (and the total transaction amount they've processed) who have made at least 4 transactions each month in the last 2 years, from a table containing daily transactions by merchant.
My query is as follow:
SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
FROM
(
SELECT
DATE_TRUNC('month', transactions.payment_date) AS month,
merchants,
COUNT(DISTINCT payment_id) AS volume,
SUM(transactions.payment_amount) AS amount
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY 1, 2
) AS trx
WHERE trx.volume >= 4
My question is: will this query pull the right data? If so, is this the most efficient way of writing it or can I improve the performance of this query?
First of all, we must think about the time range. You say that you want at least four transactions each month in the last 24 months. But you certainly don't require this for, say, October 2018, when running the query on October 10, 2018. Neither do you want to look at only the last twenty days of October 2016. We would want to look at the complete months from October 2016 through September 2018.
Next we want to make sure that a merchant had at least four transactions each month. In other words: they had transactions each month and the minimum number of transactions per month was four. We can use window functions to run over monthly transactions to check this.
select merchants, month, volume, amount
from
(
select
merchants,
date_trunc('month', payment_date) as month,
count(distinct payment_id) as volume,
sum(payment_amount) as amount,
count(*) over (partition by merchants) number_of_months,
min(count(distinct payment_id)) over (partition by merchants) min_volume
from transactions
where date between date_trunc('month', current_date) - interval '24 months'
and date_trunc('month', current_date) - interval '1 days'
group by merchants, date_trunc('month', payment_date)
) monthly
where number_of_months = 24
and min_volume >= 4
order by merchants, month;
This gives you the list of merchants fulfilling the requirements with their monthly data. If you want the number of merchants instead, then aggregate. E.g.
select count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4;
or
select month, count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4
group by month
order by month;
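The window-function check (number of months per merchant plus minimum monthly volume) can be tried on a small scale. This is a sketch on an in-memory SQLite database with a 3-month range instead of 24, to keep the sample data short. The column `merchant`, the sample payments, and the helper `qualifying_monthly_totals` are assumptions for illustration, and the windowing is done in a second CTE over the monthly rows. Window functions require SQLite 3.25 or later.

```python
import sqlite3

def qualifying_monthly_totals(payments, start, end, n_months, min_volume=4):
    """payments: iterable of (merchant, payment_id, 'YYYY-MM-DD', amount).
    Returns [(month, merchant_count, total_amount)] for merchants that reach
    min_volume in every one of the n_months months in [start, end)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions "
        "(merchant TEXT, payment_id INTEGER, payment_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", payments)
    query = """
        WITH monthly AS (
            SELECT merchant, substr(payment_date, 1, 7) AS month,
                   COUNT(DISTINCT payment_id) AS volume, SUM(amount) AS amount
            FROM transactions
            WHERE payment_date >= :start AND payment_date < :end
            GROUP BY merchant, substr(payment_date, 1, 7)
        ),
        flagged AS (
            SELECT *,
                   COUNT(*) OVER (PARTITION BY merchant) AS number_of_months,
                   MIN(volume) OVER (PARTITION BY merchant) AS min_volume
            FROM monthly
        )
        SELECT month, COUNT(*), SUM(amount)
        FROM flagged
        WHERE number_of_months = :n AND min_volume >= :min_volume
        GROUP BY month
        ORDER BY month
    """
    params = {"start": start, "end": end, "n": n_months, "min_volume": min_volume}
    out = conn.execute(query, params).fetchall()
    conn.close()
    return out

# m1: 4 payments in each month; m2: only 3 in February; m3: no payments in March.
plan = {
    "m1": {"2024-01": 4, "2024-02": 4, "2024-03": 4},
    "m2": {"2024-01": 4, "2024-02": 3, "2024-03": 4},
    "m3": {"2024-01": 5, "2024-02": 5},
}
payments, next_id = [], 1
for merchant, per_month in plan.items():
    for month, n in per_month.items():
        for day in range(1, n + 1):
            payments.append((merchant, next_id, f"{month}-{day:02d}", 10.0))
            next_id += 1
totals = qualifying_monthly_totals(payments, "2024-01-01", "2024-04-01", 3)
print(totals)
```

Only m1 qualifies here: m2 fails the minimum-volume test and m3 fails the number-of-months test.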
To get only the list of merchants, you could use HAVING to filter on the aggregated values: the number of distinct months and the number of distinct payment_id values.
SELECT merchants
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY merchants
having count(distinct DATE_TRUNC('month', transactions.payment_date)) =24
and COUNT(DISTINCT payment_id) >= 4
And for your updated question, just a suggestion:
You could join with the query that returns the merchants with a volume of 4 or more across the two years, and filter the aggregated result directly in the subquery using HAVING.
SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
FROM (
SELECT DATE_TRUNC('month', transactions.payment_date) AS month
, transactions.merchants
, COUNT(DISTINCT transactions.payment_id) AS volume
, SUM(transactions.payment_amount) AS amount
FROM transactions
INNER JOIN (
SELECT merchants
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY merchants
having count(distinct DATE_TRUNC('month', transactions.payment_date)) =24
and COUNT(DISTINCT payment_id) >= 4
) A on A.merchants = transactions.merchants
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY 1, 2
HAVING COUNT(DISTINCT transactions.payment_id) >= 4
) AS trx