SUM values BETWEEN specific dates in BigQuery - sql

I need to query a 12, 24, 36 and 48 month total for each customer.
I've got a dataset that includes customer information (customer_id, products, spend, qty, purchase_date, etc.). I need to display the totals for the different periods per customer.
SELECT customer_id, MIN(purchase_date) AS first_purchase,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 1 YEAR THEN spend END) AS 12_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 2 YEAR THEN spend END) AS 24_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 3 YEAR THEN spend END) AS 36_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 4 YEAR THEN spend END) AS 48_mnth_total
FROM SalesTable
GROUP BY customer_id, purchase_date
ORDER BY purchase_date
My query shows me the following error: Syntax error: Expected ")" but got keyword THEN

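You seem to want to count from the first purchase. The syntax error itself comes from the missing closing parenthesis after each INTERVAL n YEAR, but fixing that is not enough: you cannot nest aggregation functions the way that you are doing it. Instead, use a window function to get the minimum date for each customer and then aggregate: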
-- Aliases that start with a digit must be quoted with backticks in BigQuery.
SELECT customer_id, MIN(purchase_date) AS first_purchase,
       SUM(CASE WHEN purchase_date BETWEEN min_purchase_date AND DATETIME_ADD(min_purchase_date, INTERVAL 1 YEAR) THEN spend
           END) AS `12_mnth_total`,
       SUM(CASE WHEN purchase_date BETWEEN min_purchase_date AND DATETIME_ADD(min_purchase_date, INTERVAL 2 YEAR) THEN spend
           END) AS `24_mnth_total`,
       SUM(CASE WHEN purchase_date BETWEEN min_purchase_date AND DATETIME_ADD(min_purchase_date, INTERVAL 3 YEAR) THEN spend
           END) AS `36_mnth_total`,
       SUM(CASE WHEN purchase_date BETWEEN min_purchase_date AND DATETIME_ADD(min_purchase_date, INTERVAL 4 YEAR) THEN spend
           END) AS `48_mnth_total`
FROM (SELECT s.*,
             MIN(purchase_date) OVER (PARTITION BY customer_id) AS min_purchase_date
      FROM SalesTable s
     ) t
GROUP BY customer_id
ORDER BY first_purchase;
You can simplify the logic by removing the first comparison in the CASE expression:
-- Aliases that start with a digit must be quoted with backticks in BigQuery.
SELECT customer_id, MIN(purchase_date) AS first_purchase,
       SUM(CASE WHEN purchase_date <= DATETIME_ADD(min_purchase_date, INTERVAL 1 YEAR) THEN spend
           END) AS `12_mnth_total`,
       SUM(CASE WHEN purchase_date <= DATETIME_ADD(min_purchase_date, INTERVAL 2 YEAR) THEN spend
           END) AS `24_mnth_total`,
       SUM(CASE WHEN purchase_date <= DATETIME_ADD(min_purchase_date, INTERVAL 3 YEAR) THEN spend
           END) AS `36_mnth_total`,
       SUM(CASE WHEN purchase_date <= DATETIME_ADD(min_purchase_date, INTERVAL 4 YEAR) THEN spend
           END) AS `48_mnth_total`
FROM (SELECT s.*,
             MIN(purchase_date) OVER (PARTITION BY customer_id) AS min_purchase_date
      FROM SalesTable s
     ) t
GROUP BY customer_id
ORDER BY first_purchase;
Any purchase is logically on or after the first one.
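To try the window-function approach without a BigQuery project, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. It is only an illustration under assumptions: SQLite's date(x, '+1 year') stands in for DATETIME_ADD, dates are stored as ISO text, the column aliases total_12 and total_24 and the function name period_totals are made up for the example, and the sample rows are invented.

```python
import sqlite3

# Invented sample data: (customer_id, purchase_date as ISO text, spend).
SAMPLE_SALES = [
    (1, "2020-01-15", 100.0),
    (1, "2020-12-01", 50.0),
    (1, "2021-06-01", 25.0),
    (1, "2023-01-01", 10.0),
    (2, "2021-03-01", 40.0),
]


def period_totals(rows):
    """Return (customer_id, first_purchase, 12-month total, 24-month total)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE SalesTable (customer_id INTEGER, purchase_date TEXT, spend REAL)"
    )
    con.executemany("INSERT INTO SalesTable VALUES (?, ?, ?)", rows)
    sql = """
        SELECT customer_id,
               MIN(purchase_date) AS first_purchase,
               -- date(x, '+1 year') is SQLite's stand-in for DATETIME_ADD
               SUM(CASE WHEN purchase_date <= date(min_purchase_date, '+1 year')
                        THEN spend END) AS total_12,
               SUM(CASE WHEN purchase_date <= date(min_purchase_date, '+2 years')
                        THEN spend END) AS total_24
        FROM (SELECT s.*,
                     -- window function: first purchase date per customer
                     MIN(purchase_date) OVER (PARTITION BY customer_id) AS min_purchase_date
              FROM SalesTable s) t
        GROUP BY customer_id
        ORDER BY first_purchase
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in period_totals(SAMPLE_SALES):
    print(row)
```

Each total is cumulative from the first purchase, so the 24-month figure includes everything already counted in the 12-month figure.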

The DATETIME_ADD call is never closed. I added the missing ")" right after INTERVAL 1 YEAR.
I don't know the exact syntax, but it's a good guess.
SELECT customer_id, MIN(purchase_date) AS first_purchase,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 1 YEAR) THEN spend END) AS 12_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 2 YEAR) THEN spend END) AS 24_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 3 YEAR) THEN spend END) AS 36_mnth_total,
SUM(CASE WHEN purchase_date BETWEEN MIN(purchase_date) AND DATETIME_ADD(MIN(purchase_date), INTERVAL 4 YEAR) THEN spend END) AS 48_mnth_total
FROM SalesTable
GROUP BY customer_id, purchase_date
ORDER BY purchase_date

Related

Get last 7 days, 20 days and YTD count

I have a table with columns sales_date, sales_id, sales_region and I am looking to display the count of sales for the past 7 days, 20 days and YTD.
I have this query below that returns the correct count for 7 and 20 days but the YTD shows the count minus the 7 and 20 days. How can I tweak this query to show the YTD correctly? Thank you
select region,
case when current_date- sales_date <=7 then 'Past7'
when current_date- sales_date <=28 then 'Past20'
else 'YTD'
end as "trendsales",
count(*) as salescount
from sales_table
where sales_date >= '2022-01-01'
group by 1
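You could pivot it a bit: return one row per region, with Past7, Past20 and YTD as separate columns. A single CASE expression puts each row in exactly one bucket, which is why your YTD count excludes the recent rows. With one conditional sum per column, a row can count toward several columns.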
select region,
sum(case when current_date- sales_date <=7 then 1 else 0 end) as Past7,
sum(case when current_date- sales_date <=28 then 1 else 0 end) as Past20,
count(*) as YTD
from sales_table
where sales_date >= '2022-01-01'
group by 1
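The overlapping-bucket idea can be checked with a small Python sketch on an in-memory SQLite database. Everything here is an assumption for illustration: the function name trend_counts, the sample rows, a fixed "today" passed in so the result is reproducible, julianday() in place of date subtraction, and a 20-day window (the queries in this thread use 28).

```python
import sqlite3

# Invented sample data: (region, sales_date as ISO text).
SAMPLE_SALES_DATES = [
    ("East", "2022-03-30"),
    ("East", "2022-03-15"),
    ("East", "2022-01-10"),
    ("West", "2022-03-25"),
    ("West", "2021-12-31"),  # previous year, filtered out by the WHERE clause
]


def trend_counts(rows, today):
    """Return (region, past7, past20, ytd) with overlapping counts."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales_table (region TEXT, sales_date TEXT)")
    con.executemany("INSERT INTO sales_table VALUES (?, ?)", rows)
    sql = """
        SELECT region,
               SUM(CASE WHEN julianday(:today) - julianday(sales_date) <= 7
                        THEN 1 ELSE 0 END) AS past7,
               SUM(CASE WHEN julianday(:today) - julianday(sales_date) <= 20
                        THEN 1 ELSE 0 END) AS past20,
               COUNT(*) AS ytd  -- every row that survives the WHERE filter
        FROM sales_table
        WHERE sales_date >= strftime('%Y-01-01', :today)
        GROUP BY region
        ORDER BY region
    """
    result = con.execute(sql, {"today": today}).fetchall()
    con.close()
    return result


for row in trend_counts(SAMPLE_SALES_DATES, "2022-03-31"):
    print(row)
```

A sale from the last week is counted in all three columns, which is the behaviour the question asked for.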

SQL How to group data into separate month columns

So I'm running this query to get the name of the customer, total amount ordered, and number of orders they've submitted. With this query, I get their entire history from March to July, what I want is the name, march amount total/# of orders, april amount total/# of orders, may amount total/# of orders, ..... etc.
SELECT customer_name,MONTH(created_on), SUM(amount), COUNT(order_id)
FROM customer_orders
WHERE created_on BETWEEN '2020-03-01' AND '2020-08-01'
GROUP BY customer_name, MONTH(created_on)
If you want the values in separate columns, then use conditional aggregation:
SELECT customer_name,
SUM(CASE WHEN MONTH(created_on) = 3 THEN amount END) as march_amount,
SUM(CASE WHEN MONTH(created_on) = 3 THEN 1 ELSE 0 END) as march_count,
SUM(CASE WHEN MONTH(created_on) = 4 THEN amount END) as april_amount,
SUM(CASE WHEN MONTH(created_on) = 4 THEN 1 ELSE 0 END) as april_count,
. . .
FROM customer_orders
WHERE created_on >= '2020-03-01' AND
created_on < '2020-08-01'
GROUP BY customer_name;
Notice that I changed the date filter so it does not include 2020-08-01.
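As a rough, runnable illustration of the same pattern, the Python sketch below pivots two months into columns using SQLite. The names monthly_pivot and SAMPLE_ORDERS, the sample rows, and the use of strftime('%m', ...) in place of MONTH() are assumptions made for the example; the date filter is half-open, as in the answer.

```python
import sqlite3

# Invented sample data: (customer_name, order_id, amount, created_on).
SAMPLE_ORDERS = [
    ("Ann", 1, 10.0, "2020-03-05"),
    ("Ann", 2, 5.0, "2020-03-20"),
    ("Ann", 3, 7.0, "2020-04-02"),
    ("Bob", 4, 3.0, "2020-04-30"),
    ("Bob", 5, 99.0, "2020-05-01"),  # on the upper bound, so excluded
]


def monthly_pivot(rows):
    """Return one row per customer with March and April amount and count."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE customer_orders "
        "(customer_name TEXT, order_id INTEGER, amount REAL, created_on TEXT)"
    )
    con.executemany("INSERT INTO customer_orders VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT customer_name,
               SUM(CASE WHEN strftime('%m', created_on) = '03' THEN amount END) AS march_amount,
               SUM(CASE WHEN strftime('%m', created_on) = '03' THEN 1 ELSE 0 END) AS march_count,
               SUM(CASE WHEN strftime('%m', created_on) = '04' THEN amount END) AS april_amount,
               SUM(CASE WHEN strftime('%m', created_on) = '04' THEN 1 ELSE 0 END) AS april_count
        FROM customer_orders
        WHERE created_on >= '2020-03-01'   -- inclusive lower bound
          AND created_on <  '2020-05-01'   -- exclusive upper bound
        GROUP BY customer_name
        ORDER BY customer_name
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in monthly_pivot(SAMPLE_ORDERS):
    print(row)
```

Note the difference between the two kinds of column: an amount with no matching rows comes back as NULL (None in Python) because its CASE has no ELSE, while the count comes back as 0.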

I am looking to find customers repurchase frequency in SQL from their first purchase date

I am trying to find customers' repurchase rates from their first order date. For example, for 2016, how many customers purchased once in days 1-365 from their initial purchase, how many purchased twice, etc.
I have a transaction_detail table which looks like below:
txn_date    Customer_ID  Transaction_Number  Sales
1/2/2019    1            12345               $10
4/3/2018    1            65890               $20
3/22/2019   3            64453               $30
4/3/2019    4            88567               $20
5/21/2019   4            85446               $15
1/23/2018   5            89464               $40
4/3/2019    5            99674               $30
4/3/2019    6            32224               $20
1/23/2018   6            46466               $30
1/20/2018   7            56558               $30
I am able to find the customers who shopped in 2016 and how many times they repurchased in 2016, but I need to find the customers who shopped in 2016 and how many times they came back, counting from their first purchase date.
I need a starting point for the query; I am not sure how to build this logic in my SQL code.
Any help would be appreciated.
I am using the below query:
WITH by_year
AS (SELECT
Customer_ID,
to_char(txn_date, 'YYYY') AS visit_year
FROM table
GROUP BY Customer_ID, to_char(txn_date, 'YYYY')),
with_first_year
AS (SELECT
Customer_ID,
visit_year,
FIRST_VALUE(visit_year) OVER (PARTITION BY Customer_ID ORDER BY visit_year) AS first_year
FROM by_year),
with_year_number
AS (SELECT
Customer_ID,
visit_year,
first_year,
(visit_year - first_year) AS year_number
FROM with_first_year)
SELECT
first_year AS first_year,
SUM(CASE WHEN year_number = 0 THEN 1 ELSE 0 END) AS year_0,
SUM(CASE WHEN year_number = 1 THEN 1 ELSE 0 END) AS year_1,
SUM(CASE WHEN year_number = 2 THEN 1 ELSE 0 END) AS year_2,
SUM(CASE WHEN year_number = 3 THEN 1 ELSE 0 END) AS year_3,
SUM(CASE WHEN year_number = 4 THEN 1 ELSE 0 END) AS year_4,
SUM(CASE WHEN year_number = 5 THEN 1 ELSE 0 END) AS year_5,
SUM(CASE WHEN year_number = 6 THEN 1 ELSE 0 END) AS year_6,
SUM(CASE WHEN year_number = 7 THEN 1 ELSE 0 END) AS year_7,
SUM(CASE WHEN year_number = 8 THEN 1 ELSE 0 END) AS year_8,
SUM(CASE WHEN year_number = 9 THEN 1 ELSE 0 END) AS year_9
FROM with_year_number
GROUP BY first_year
ORDER BY first_year
Use window functions and aggregation:
select cnt, count(*), min(customer_id), max(customer_id)
from (select customer_id, count(*) as cnt
from (select td.*,
min(txn_date) over (partition by Customer_ID) as min_txn_date
from transaction_detail td
) td
where txn_date >= min_txn_date and txn_date < min_txn_date + interval '365' day
group by customer_id
) c
group by cnt
order by cnt;
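Here is a small Python sketch of that query on SQLite, loaded with the sample rows from the question (dates rewritten as ISO text). The function name repurchase_distribution and the use of date(x, '+365 days') in place of the interval arithmetic are assumptions for the example; the lower-bound comparison is dropped because every transaction is on or after the customer's first one.

```python
import sqlite3

# Sample rows from the question: (txn_date, customer_id, transaction_number).
SAMPLE_TXNS = [
    ("2019-01-02", 1, 12345),
    ("2018-04-03", 1, 65890),
    ("2019-03-22", 3, 64453),
    ("2019-04-03", 4, 88567),
    ("2019-05-21", 4, 85446),
    ("2018-01-23", 5, 89464),
    ("2019-04-03", 5, 99674),
    ("2019-04-03", 6, 32224),
    ("2018-01-23", 6, 46466),
    ("2018-01-20", 7, 56558),
]


def repurchase_distribution(rows):
    """Return (purchases within 365 days of first purchase, number of customers)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE transaction_detail "
        "(txn_date TEXT, customer_id INTEGER, transaction_number INTEGER)"
    )
    con.executemany("INSERT INTO transaction_detail VALUES (?, ?, ?)", rows)
    sql = """
        SELECT cnt, COUNT(*) AS customers
        FROM (SELECT customer_id, COUNT(*) AS cnt
              FROM (SELECT td.*,
                           MIN(txn_date) OVER (PARTITION BY customer_id) AS min_txn_date
                    FROM transaction_detail td) t
              -- keep only transactions in the 365 days after the first one
              WHERE txn_date < date(min_txn_date, '+365 days')
              GROUP BY customer_id) c
        GROUP BY cnt
        ORDER BY cnt
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in repurchase_distribution(SAMPLE_TXNS):
    print(row)
```

With this data, customers 1 and 4 purchased twice within the window and the other four purchased once; customers 5 and 6 each have a second transaction, but it falls more than 365 days after the first.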
As I understand it, you want the number of distinct customers who first purchased in 2016 and repurchased one year or more after that first purchase.
Select * from
(
Select customer_id,
Floor(months_between(lead_txn_date, txn_date)/12) as num_years
From
(
Select customer_id,
txn_date,
row_number() over (partition by Customer_ID order by txn_date) as rn,
lead(txn_date) over (partition by Customer_ID order by txn_date) as lead_txn_date
From your_table
)
Where txn_date >= date '2016-01-01'
and txn_date < date '2017-01-01'
and rn = 1
And months_between(lead_txn_date, txn_date) >= 12
)
Pivot
(
Count(1) for num_years in (1,2,3,4)
)
In short, this finds the number of years between each customer's first and second purchase, where the first purchase must be in 2016. Note that months_between takes the later date first; with the arguments the other way round the result is negative.
Cheers!!

Split number to ranges in SQL

I am trying to write a SQL query that shows the distribution of customer seniority as of 2015/01/31:
Up to one month
Between one and six months
Between six months and one year
Over a year
I managed to group the customers by their number of months of seniority:
SELECT Seniority, COUNT(Customer_ID) [Number of Customers]
FROM
(SELECT Customer_ID,
DATEDIFF(MONTH, MIN(CONVERT(datetime, Order_Date)), '2015/01/31') Seniority
FROM Orders
GROUP BY Customer_ID) t
GROUP BY Seniority
How can I split by given ranges?
Expected:
Seniority | Number of Customers
Up to one month | 0
Between one and six months | 14
Between six months and one year | 1
Over a year | 0
Use conditional aggregation:
WITH cte AS (
SELECT
'Up to one month' AS Seniority,
COUNT(CASE WHEN DATEDIFF(MONTH, Order_Date, '2015-01-31') < 1 THEN 1 END) AS [Number of Customers],
1 AS position
FROM Orders
UNION ALL
SELECT
'Between one and six months',
COUNT(CASE WHEN DATEDIFF(MONTH, Order_Date, '2015-01-31') >= 1 AND
DATEDIFF(MONTH, Order_Date, '2015-01-31') < 6 THEN 1 END),
2
FROM Orders
UNION ALL
SELECT
'Between six months and one year',
COUNT( CASE WHEN DATEDIFF(MONTH, Order_Date, '2015-01-31') >= 6 AND
DATEDIFF(MONTH, Order_Date, '2015-01-31') < 12 THEN 1 END),
3
FROM Orders
UNION ALL
SELECT
'Over a year',
COUNT(CASE WHEN DATEDIFF(MONTH, Order_Date, '2015-01-31') >= 12 THEN 1 END),
4
FROM Orders
)
SELECT
Seniority,
[Number of Customers]
FROM cte
ORDER BY
position;
This answer assumes that the Order_Date column is already date or datetime. If not, then the first thing you should do is to convert this column to an actual date type.
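For a version that can be run locally, the sketch below does the bucketing in a single pass over SQLite from Python. Several things are assumptions rather than part of the answer above: it takes each customer's first order (as the query in the question does), it emulates DATEDIFF(MONTH, ...) by counting month boundaries with strftime, it returns the four counts as a dictionary, and the function name and sample rows are invented.

```python
import sqlite3

# Month boundaries crossed between first_order and :ref,
# a stand-in for SQL Server's DATEDIFF(MONTH, first_order, :ref).
MONTHS_SQL = """
    (CAST(strftime('%Y', :ref) AS INTEGER) - CAST(strftime('%Y', first_order) AS INTEGER)) * 12
    + CAST(strftime('%m', :ref) AS INTEGER) - CAST(strftime('%m', first_order) AS INTEGER)
"""

LABELS = [
    "Up to one month",
    "Between one and six months",
    "Between six months and one year",
    "Over a year",
]

# Invented sample data: (Customer_ID, Order_Date as ISO text).
SAMPLE_ORDER_DATES = [
    (1, "2015-01-10"),
    (1, "2014-03-01"),
    (2, "2014-11-15"),
    (3, "2014-06-30"),
    (4, "2013-12-01"),
    (5, "2015-01-02"),
]


def seniority_distribution(rows, ref="2015-01-31"):
    """Count customers per seniority range, measured from their first order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Orders (Customer_ID INTEGER, Order_Date TEXT)")
    con.executemany("INSERT INTO Orders VALUES (?, ?)", rows)
    sql = f"""
        WITH first_orders AS (
            SELECT Customer_ID, MIN(Order_Date) AS first_order
            FROM Orders
            GROUP BY Customer_ID
        ),
        seniority AS (
            SELECT Customer_ID, {MONTHS_SQL} AS months
            FROM first_orders
        )
        SELECT COUNT(CASE WHEN months < 1 THEN 1 END),
               COUNT(CASE WHEN months >= 1 AND months < 6 THEN 1 END),
               COUNT(CASE WHEN months >= 6 AND months < 12 THEN 1 END),
               COUNT(CASE WHEN months >= 12 THEN 1 END)
        FROM seniority
    """
    counts = con.execute(sql, {"ref": ref}).fetchone()
    con.close()
    return dict(zip(LABELS, counts))


print(seniority_distribution(SAMPLE_ORDER_DATES))
```

Because the ranges are half-open (>= lower, < upper), every customer lands in exactly one bucket, and a range with no customers still appears with a count of 0.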
WITH CTE AS (
SELECT CUSTOMER_ID,
CASE WHEN DATEDIFF(MONTH,ORDER_DATE,'2015-01-31')<1 THEN 'Up to one month'
WHEN DATEDIFF(MONTH,ORDER_DATE,'2015-01-31')<6 THEN 'Between one and six months'
WHEN DATEDIFF(MONTH,ORDER_DATE,'2015-01-31')<12 THEN 'Between six months and one year'
WHEN DATEDIFF(MONTH,ORDER_DATE,'2015-01-31')>=12 THEN 'Over a year'
END AS SENIORITY
FROM ORDERS
)
SELECT SENIORITY AS 'Seniority', COUNT(CUSTOMER_ID) AS 'Number of Customers'
FROM CTE
WHERE SENIORITY IS NOT NULL
GROUP BY SENIORITY

how to do the program using subquery approach

SELECT SKU, SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=6 AND STYPE='P'
THEN AMT
END) AS VALUEJUNE,
SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=7 AND STYPE='P'
THEN AMT
END) AS VALUEJULY,
SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=8 AND STYPE='P'
THEN AMT
END) AS VALUEAUGUST
,(VALUEJUNE+VALUEJULY+VALUEAUGUST) AS totalsales
FROM TRNSACT
GROUP BY SKU
ORDER BY totalsales DESC ;
Put most of your query in a derived table, including the GROUP BY. Calculate totalsales on its result:
select sku, VALUEJUNE, VALUEJULY, VALUEAUGUST, (VALUEJUNE+VALUEJULY+VALUEAUGUST) AS totalsales
from
(
SELECT SKU,
SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=6 AND STYPE='P'
THEN AMT
END) AS VALUEJUNE,
SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=7 AND STYPE='P'
THEN AMT
END) AS VALUEJULY,
SUM(CASE WHEN EXTRACT(MONTH FROM SALEDATE)=8 AND STYPE='P'
THEN AMT
END) AS VALUEAUGUST
FROM TRNSACT
GROUP BY SKU
) dt
ORDER BY totalsales DESC ;
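The derived-table pattern can be tried out with the Python sketch below, which uses SQLite in memory. The function name sku_totals, the sample rows, and strftime('%m', ...) in place of EXTRACT(MONTH FROM ...) are assumptions for the example. The COALESCE calls are also an addition that is not in the answer above: without them, a SKU with no sales in one of the three months would get a NULL total, since NULL plus a number is NULL.

```python
import sqlite3

# Invented sample data: (SKU, STYPE, SALEDATE as ISO text, AMT).
SAMPLE_TRNSACT = [
    ("A", "P", "2004-06-01", 10.0),
    ("A", "P", "2004-07-01", 20.0),
    ("A", "P", "2004-08-01", 30.0),
    ("A", "R", "2004-08-02", 5.0),   # not a purchase, ignored by the CASE
    ("B", "P", "2004-06-15", 100.0),
]


def sku_totals(rows):
    """Return (SKU, June, July, August, total) ordered by total, largest first."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE TRNSACT (SKU TEXT, STYPE TEXT, SALEDATE TEXT, AMT REAL)")
    con.executemany("INSERT INTO TRNSACT VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT SKU, VALUEJUNE, VALUEJULY, VALUEAUGUST,
               -- the aliases are visible here because they come from the derived table
               COALESCE(VALUEJUNE, 0) + COALESCE(VALUEJULY, 0) + COALESCE(VALUEAUGUST, 0)
                   AS totalsales
        FROM (SELECT SKU,
                     SUM(CASE WHEN strftime('%m', SALEDATE) = '06' AND STYPE = 'P'
                              THEN AMT END) AS VALUEJUNE,
                     SUM(CASE WHEN strftime('%m', SALEDATE) = '07' AND STYPE = 'P'
                              THEN AMT END) AS VALUEJULY,
                     SUM(CASE WHEN strftime('%m', SALEDATE) = '08' AND STYPE = 'P'
                              THEN AMT END) AS VALUEAUGUST
              FROM TRNSACT
              GROUP BY SKU) dt
        ORDER BY totalsales DESC
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


for row in sku_totals(SAMPLE_TRNSACT):
    print(row)
```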