Obtain a customer's first purchase for one year - sql

I want to count how many new customers I have per month. I know I could take each customer's minimum purchase date, but my rule is that a customer who has already made a purchase and then stopped buying for more than a year counts as a new customer again.
Could you help me get the number of new customers per month? That is, customers whose minimum purchase date falls in that month and who have not bought anything in the year before that date.
I tried the code below, but if a customer had a first purchase in February 2019 and the next purchase in March 2020, it only considers the February purchase, when the customer should count as new in both February 2019 and March 2020.
select to_char(B.fp, 'YYYY-MM') month, count(B.email)
from(
select A.email, A.first_purchase fp
from(
select email, date(min(created)) first_purchase
from "Order_table" oo
group by email)A
where A.first_purchase >= (A.first_purchase + INTERVAL '-1 year'))B
group by 1,2

Use lag():
select to_char(ot.created, 'YYYY-MM') as yyyymm,
       count(*) filter (where ot.prev_created is null or ot.created > ot.prev_created + interval '1 year') as cnt_new
from (select ot.*, lag(ot.created) over (partition by ot.email order by ot.created) as prev_created
      from "Order_table" ot
     ) ot
group by yyyymm;
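To make the rule behind the lag() query concrete, here is a minimal Python sketch of the same idea: a purchase counts as "new" when the customer has no earlier purchase, or when the previous purchase is more than one year older. The names new_customers_per_month and add_one_year and the sample emails are assumptions made for illustration; they do not come from the original post.

```python
from collections import Counter
from datetime import date


def add_one_year(d):
    # Rough stand-in for "+ interval '1 year'"; Feb 29 falls back to Feb 28.
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def new_customers_per_month(orders):
    """orders: iterable of (email, purchase_date) pairs (hypothetical shape)."""
    by_email = {}
    for email, created in orders:
        by_email.setdefault(email, []).append(created)

    counts = Counter()
    for dates in by_email.values():
        prev = None  # plays the role of lag(created)
        for created in sorted(dates):
            if prev is None or created > add_one_year(prev):
                counts[created.strftime("%Y-%m")] += 1
            prev = created
    return dict(counts)


orders = [
    ("a@example.com", date(2019, 2, 10)),
    ("a@example.com", date(2020, 3, 15)),  # more than a year later: new again
    ("b@example.com", date(2019, 2, 20)),
    ("b@example.com", date(2019, 6, 1)),   # within a year: not new
]
result = new_customers_per_month(orders)
print(result)
```

With this sample data, the customer from the question is counted in both February 2019 and March 2020.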

Related

rented per month postgreSQL

My task is to count the number of cars rented per month during a specified year.
I have two tables one called cars and one called rental
car table has (car_id, type, monthly_cost)
rental table has (rental_id, car_id, person_id, rental_date, return_date)
The problem is that I can count the number of rentals in a month just on the rental_date,
but that is just giving me new rentals.
For example, with rental_date: 2020-02-04 and return_date: 2020-05-05, this rental needs to be counted in February, March, April and May.
select extract(month from rental_date) as month, count(*)
from rental
where extract(year from rental_date) = 2020
group by extract(month from rental_date);
This is my query for counting "new rentals".
One approach uses generate_series() to generate the start of every month in that year, and then brings in the rental table with a left join on rental periods that overlap each month:
select d.dt, count(r.rental_id) as cnt_rentals
from generate_series(date '2020-01-01', date '2020-12-01', '1 month') d(dt)
left join rental r
on r.rental_date < d.dt + interval '1 month'
and r.return_date >= d.dt
group by d.dt
Note that this properly handles rentals that cross the beginning of the year, while your original code did not. It also allows months without any rental.
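The overlap condition in that join can be checked by hand with a small Python sketch. It applies the same test (the rental starts before the next month begins and ends on or after the month's first day) to every month of one year. The function names and the second sample rental are assumptions added for illustration.

```python
from datetime import date


def next_month(d):
    # First day of the month following d (d is a month start).
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)


def rentals_per_month(rentals, year):
    """rentals: list of (rental_date, return_date) pairs."""
    counts = {}
    for month in range(1, 13):
        start = date(year, month, 1)
        end = next_month(start)
        # Same predicate as the join: rental_date < end and return_date >= start.
        counts[start] = sum(
            1 for rental_date, return_date in rentals
            if rental_date < end and return_date >= start
        )
    return counts


rentals = [
    (date(2020, 2, 4), date(2020, 5, 5)),    # the example from the question
    (date(2019, 12, 20), date(2020, 1, 3)),  # crosses the start of the year
]
counts = rentals_per_month(rentals, 2020)
for start, n in counts.items():
    print(start, n)
```

Months with no rentals still appear with a count of 0, which mirrors what the left join does.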

Need to select all subscription types for all customer IDs if the customer ID includes a 7 day trial

I have a basic select statement that shows the customers who have seven day free trials:
select customer_id from purchases where description = '7 Day Free Trial'
This same customer id may also have additional descriptions in purchases indicating that they bought a subscription for a month, 6 months, one year as well as other options.
If the customer_id has a description = '7 Day Free Trial' then I need to select all the rows for these customer IDs. I want to see these rows not count them.
I know how to find the rows where customer_id is in the purchases table more than once with this query
SELECT *
FROM purchases WHERE customer_id IN
(
SELECT customer_id
FROM purchases
GROUP BY customer_id
HAVING COUNT(*) > 1)
ORDER BY customer_id ASC
which produces:
customer_id  status  description
12321        credit  7 Day Free Trial
12321        paid    1 Month Paid Subscription
78651        credit  6 Month Paid Subscription
78651        refund  6 Month Paid Subscription
45234        paid    30 Day Free Trial
45234        credit  1 Year Paid Description
But I am struggling to figure out how to approach the problem: if a customer_id has description = '7 Day Free Trial', select all the rows for that customer_id.
An example of the output I desire is:
customer_id  description
12321        7 Day Free Trial
12321        1 Month Paid Subscription
78651        7 Day Free Trial
78651        1 Year Paid Description
45234        7 Day Free Trial
45234        6 Month Paid Subscription
Any suggestions appreciated. I am simply not sure how to approach this.
One method uses window functions:
select p.*
from (select p.*, count(*) over (partition by customer_id) as cnt,
             sum( (p.description = '7 Day Free Trial')::int ) over (partition by customer_id) as num_free_trials
      from purchases p
     ) p
where cnt >= 2 and num_free_trials > 0;
However, if you just want the descriptions, it might be good enough to put them in a single row -- using aggregation:
select customer_id, array_agg(description) as descriptions
from purchases p
group by customer_id
having count(*) filter (where p.description = '7 Day Free Trial') > 0;
This also lets you use array_agg(distinct description) to get only the unique descriptions, if you like.
In Redshift (the tag was changed after I answered), you can use:
having sum( (p.description = '7 Day Free Trial')::int ) > 0
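As a rough illustration of what the window-function query selects, the sketch below applies the same two conditions in Python: the customer has at least two rows, and at least one of them is the 7 day trial. The function name and the sample rows are hypothetical and only loosely follow the question's data.

```python
from collections import Counter


def rows_for_trial_customers(purchases, trial="7 Day Free Trial"):
    """purchases: list of (customer_id, description) pairs."""
    # Customers with at least one trial row (num_free_trials > 0).
    trial_ids = {cid for cid, desc in purchases if desc == trial}
    # Number of rows per customer (cnt in the query).
    row_counts = Counter(cid for cid, _ in purchases)
    return [
        (cid, desc)
        for cid, desc in purchases
        if cid in trial_ids and row_counts[cid] >= 2
    ]


purchases = [
    (12321, "7 Day Free Trial"),
    (12321, "1 Month Paid Subscription"),
    (78651, "6 Month Paid Subscription"),
    (78651, "6 Month Paid Subscription"),
    (45234, "7 Day Free Trial"),  # a single row, so cnt >= 2 fails
]
selected = rows_for_trial_customers(purchases)
for row in selected:
    print(row)
```

Dropping the row-count condition would return every row of every customer who ever had the trial, including customers with only the trial row.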
I figured it out. I get the results I was looking for at end of my post this way:
SELECT
customer_id,
id,
description
FROM
firecracker.productionrr_purchases
WHERE
customer_id IN (
SELECT
customer_id
FROM (
SELECT
customer_id,
count(*) AS trial_user_rec
FROM
firecracker.productionrr_purchases
WHERE
customer_id IN (
SELECT
customer_id
FROM
firecracker.productionrr_purchases
WHERE
lower(description) LIKE lower('7 Day Free Trial')
)
GROUP BY customer_id
)
WHERE trial_user_rec>1
)
ORDER BY customer_id DESC
which returns
customer_id  description
12321        7 Day Free Trial
12321        1 Month Paid Subscription
78651        7 Day Free Trial
78651        1 Year Paid Description
45234        7 Day Free Trial
45234        6 Month Paid Subscription

I want to find customers transacting for any 3 consecutive months from 2017 to 2018

I want to know the trick to find the list of customers who transacted in 3 consecutive months. That could be any 3 consecutive months, with any number of occurrences.
Example: suppose a customer transacts in January, keeps transacting until March, and then stops. I want the list of these customers from my database.
I am working on AWS Athena.
One method uses aggregation and window functions:
select customer_id, yyyymm
from (select customer_id, date_trunc('month', transactdate) as yyyymm,
             lag(date_trunc('month', transactdate), 2) over (partition by customer_id order by date_trunc('month', transactdate)) as prev_yyyymm_2
      from t
      where transactdate >= '2017-01-01' and
            transactdate < '2019-01-01'
      group by customer_id, date_trunc('month', transactdate)
     ) m
where prev_yyyymm_2 = yyyymm - interval '2' month;
This aggregates transactions by month and looks at the transaction date two rows earlier. The outer filter checks that that date is exactly 2 months earlier.
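The "two rows earlier is exactly two months earlier" check works because each customer has at most one row per month after aggregation, so the row in between must be the month in between. A small Python sketch of that reasoning follows; the function name and the sample customers are assumptions for illustration.

```python
from datetime import date


def customers_with_three_consecutive_months(transactions):
    """transactions: list of (customer_id, transaction_date) pairs."""
    months = {}
    for customer_id, d in transactions:
        # Number months consecutively so that Dec 2017 -> Jan 2018 is a step of 1.
        months.setdefault(customer_id, set()).add(d.year * 12 + d.month)

    found = set()
    for customer_id, distinct_months in months.items():
        ordered = sorted(distinct_months)
        for i in range(2, len(ordered)):
            # Equivalent of lag(month, 2) = month - 2 months.
            if ordered[i - 2] == ordered[i] - 2:
                found.add(customer_id)
                break
    return found


transactions = [
    (1, date(2017, 1, 5)), (1, date(2017, 2, 9)), (1, date(2017, 3, 1)),
    (2, date(2017, 1, 5)), (2, date(2017, 3, 9)), (2, date(2017, 4, 1)),
    (3, date(2017, 11, 2)), (3, date(2017, 12, 2)), (3, date(2018, 1, 2)),
]
found = customers_with_three_consecutive_months(transactions)
print(sorted(found))
```

Customer 2 has three active months but a gap in February, so only customers 1 and 3 qualify.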

Accumulating values until up to date

I'm working on an order system where orders come in. For the analytics department I want to build a view that accumulates all sales for a given day.
That is not an issue, I got the working query for that. More complicated is a second number where I want to show the accumulated sales to that day.
Meaning if I have $100 of sales on Feb 1 the column should show $100. If I have $200 of sales on Feb 2 that column should show $300 and so on.
This is what I came up with so far:
select
date_trunc('day', o.created_at) :: date,
sum(o.value) sales_for_day,
count(o.accepted_at) as num_of_orders_for_day,
-- sales_for_month_to_date
-- num_of_orders_for_month_to_date
from
orders o
where
status = 'accepted'
group by
date_trunc('day', o.accepted_at);
Just use window functions:
select date_trunc('day', o.accepted_at) :: date as day,
       sum(o.value) as sales_for_day,
       count(o.accepted_at) as num_of_orders_for_day,
       sum(sum(o.value)) over (partition by date_trunc('month', min(o.accepted_at)) order by min(o.accepted_at)) as sales_for_month_to_date,
       sum(count(*)) over (partition by date_trunc('month', min(o.accepted_at)) order by min(o.accepted_at)) as num_of_orders_for_month_to_date
from orders o
where status = 'accepted'
group by date_trunc('day', o.accepted_at);
Based on the comments in your code, I surmise that you want month-to-date numbers, so this also partitions by month.
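The nested sum(sum(...)) over (...) first totals each day and then accumulates those daily totals within the month. Here is a minimal Python sketch of those two steps, assuming orders are given as (day, value) pairs for accepted orders only; the function name and the sample values are made up for illustration.

```python
from datetime import date


def daily_with_month_to_date(orders):
    """orders: list of (day, value) pairs, one per accepted order."""
    # Step 1: aggregate per day (the GROUP BY).
    daily = {}
    for day, value in orders:
        total, n = daily.get(day, (0, 0))
        daily[day] = (total + value, n + 1)

    # Step 2: running totals per month, ordered by day (the window function).
    rows = []
    running = {}
    for day in sorted(daily):
        total, n = daily[day]
        key = (day.year, day.month)  # the partition
        mtd_total, mtd_n = running.get(key, (0, 0))
        running[key] = (mtd_total + total, mtd_n + n)
        rows.append((day, total, n) + running[key])
    return rows


orders = [
    (date(2021, 2, 1), 100),
    (date(2021, 2, 2), 150),
    (date(2021, 2, 2), 50),
    (date(2021, 3, 1), 70),
]
rows = daily_with_month_to_date(orders)
for row in rows:
    print(row)
```

The February rows reproduce the example from the question ($100, then $300), and the running total restarts in March.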

Selecting data with counts more than 4 in a month from a daily data

I am trying to count the monthly number of merchants (and the total transaction amount they've processed) who have made at least 4 transactions each month in the last 2 years from a table containing daily transaction by merchants.
My query is as follows:
SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
FROM
(
SELECT
DATE_TRUNC('month', transactions.payment_date) AS month,
merchants,
COUNT(DISTINCT payment_id) AS volume,
SUM(transactions.payment_amount) AS amount
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY 1, 2
) AS trx
WHERE trx.volume >= 4
My question is: will this query pull the right data? If so, is this the most efficient way of writing it or can I improve the performance of this query?
First of all, we must think about the time range. You say that you want at least four transactions each month in the last 24 months. But you certainly don't require this for, say, October 2018 when running the query on October 10, 2018. Nor do you want to look at only the last twenty days of October 2016. We want to look at all of October 2016 through all of September 2018.
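The date arithmetic for that range can be sketched in Python: start at the first day of the current month, go back 24 months for the start, and go back one day for the end. The function name complete_month_window is an assumption for illustration.

```python
from datetime import date, timedelta


def complete_month_window(today, months=24):
    """Return (first_day, last_day) of the last `months` complete months."""
    first_of_current = today.replace(day=1)
    # Count months linearly so that subtracting crosses year boundaries cleanly.
    index = first_of_current.year * 12 + (first_of_current.month - 1) - months
    start = date(index // 12, index % 12 + 1, 1)
    end = first_of_current - timedelta(days=1)
    return start, end


start, end = complete_month_window(date(2018, 10, 10))
print(start, end)
```

For a run date of October 10, 2018, this gives October 1, 2016 through September 30, 2018.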
Next we want to make sure that a merchant had at least four transactions each month. In other words: they had transactions each month and the minimum number of transactions per month was four. We can use window functions to run over monthly transactions to check this.
select merchants, month, volume, amount
from
(
select
merchants,
date_trunc('month', payment_date) as month,
count(distinct payment_id) as volume,
sum(payment_amount) as amount,
count(*) over (partition by merchants) number_of_months,
min(count(distinct payment_id)) over (partition by merchants) min_volume
from transactions
where date between date_trunc('month', current_date) - interval '24 months'
and date_trunc('month', current_date) - interval '1 days'
group by merchants, date_trunc('month', payment_date)
) monthly
where number_of_months = 24
and min_volume >= 4
order by merchants, month;
This gives you the list of merchants fulfilling the requirements with their monthly data. If you want the number of merchants instead, then aggregate. E.g.
select count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4;
or
select month, count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4
group by month
order by month;
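The two filter conditions (a row for every month, and a minimum monthly volume of four) can be illustrated with a short Python sketch. It assumes the transactions are already restricted to the reporting window, and it takes the number of required months as a parameter so that the example can use 2 months instead of 24. The names qualifying_merchants and make_payments and the sample merchants are hypothetical.

```python
from collections import defaultdict
from datetime import date


def qualifying_merchants(transactions, required_months=24, min_volume=4):
    """transactions: (merchant, payment_id, payment_date) tuples,
    assumed to be already limited to the reporting window."""
    # Distinct payment ids per merchant and month.
    payments = defaultdict(set)
    for merchant, payment_id, payment_date in transactions:
        month = (payment_date.year, payment_date.month)
        payments[(merchant, month)].add(payment_id)

    # Monthly volumes per merchant.
    volumes = defaultdict(list)
    for (merchant, _month), ids in payments.items():
        volumes[merchant].append(len(ids))

    # number_of_months = required_months and min_volume >= threshold.
    return sorted(
        merchant for merchant, monthly in volumes.items()
        if len(monthly) == required_months and min(monthly) >= min_volume
    )


def make_payments(merchant, year, month, n, first_id):
    # Helper that fabricates n sample payments in one month.
    return [(merchant, first_id + i, date(year, month, 1 + i)) for i in range(n)]


transactions = (
    make_payments("m1", 2018, 1, 4, 100) + make_payments("m1", 2018, 2, 4, 200)
    + make_payments("m2", 2018, 1, 4, 300) + make_payments("m2", 2018, 2, 3, 400)
    + make_payments("m3", 2018, 1, 5, 500)
)
winners = qualifying_merchants(transactions, required_months=2)
print(winners)
```

Here m2 fails because one month has only three payments, and m3 fails because it is active in only one of the two months.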
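To get only the list of merchants, you could use HAVING to filter on the aggregated values: the number of distinct months and the number of distinct payment_id values. Note that COUNT(DISTINCT payment_id) >= 4 here applies to the whole two-year window, not to each month.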
SELECT merchants
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY merchants
having count(distinct DATE_TRUNC('month', transactions.payment_date)) =24
and COUNT(DISTINCT payment_id) >= 4
And for your updated question, just a suggestion:
You could join with the query that returns the qualifying merchants over the two years, and filter the aggregated result directly in the subquery using HAVING, so that only months with a volume of at least 4 remain.
SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
FROM (
    SELECT DATE_TRUNC('month', transactions.payment_date) AS month
        , transactions.merchants
        , COUNT(DISTINCT payment_id) AS volume
        , SUM(transactions.payment_amount) AS amount
    FROM transactions
    INNER JOIN (
        SELECT merchants
        FROM transactions
        WHERE transactions.date >= NOW() - INTERVAL '2 years'
        GROUP BY merchants
        HAVING COUNT(DISTINCT DATE_TRUNC('month', transactions.payment_date)) = 24
           AND COUNT(DISTINCT payment_id) >= 4
    ) A ON A.merchants = transactions.merchants
    WHERE transactions.date >= NOW() - INTERVAL '2 years'
    GROUP BY 1, 2
    HAVING COUNT(DISTINCT payment_id) >= 4
) AS trx
GROUP BY trx.month