PostgreSQL: Count for value between previous month start date and end date

I have a table as follows
user_id date month year visiting_id
123 11-04-2017 APRIL 2017 4500
123 12-05-2017 MAY 2017 4567
123 13-05-2017 MAY 2017 4568
123 17-05-2017 MAY 2017 4569
123 22-05-2017 MAY 2017 4570
123 11-06-2017 JUNE 2017 4571
123 12-06-2017 JUNE 2017 4572
I want to calculate the visiting count for the current month and last month at the monthly level as follows:
user_id month year visit_count_this_month visit_count_last_month
123 APRIL 2017 1 0
123 MAY 2017 4 1
123 JUNE 2017 2 4
I was able to calculate visit_count_this_month using the following query
SELECT v.user_id, v.month, v.year,
SUM(is_visit_this_month) as visit_count_this_month
FROM
(SELECT user_id, date, month, year,
CASE WHEN TO_CHAR(date, 'MM/YYYY') = TO_CHAR(date, 'MM/YYYY')
THEN 1 ELSE 0
END as is_visit_this_month
FROM visits
GROUP BY user_id, date, month, year
HAVING user_id = 123) v
GROUP BY v.user_id, v.month, v.year
However, I'm stuck with calculating visit_count_last_month. Similar to this, I also want to calculate visit_count_last_2months.
Can somebody help?

You can use a LATERAL JOIN like this:
SELECT user_id, month, year, COUNT(*) as visit_count_this_month, visit_count_last_month
FROM visits v
CROSS JOIN LATERAL (
SELECT COUNT(*) as visit_count_last_month
FROM visits
WHERE user_id = v.user_id
AND date_trunc('month', CAST(date AS date)) = date_trunc('month', CAST(v.date AS date)) - interval '1 month'
) l
GROUP BY user_id, month, year, visit_count_last_month;
SQLFiddle - http://sqlfiddle.com/#!15/393c8/2

Assuming there are values for every month, you can get the counts per month first and use lag to get the previous month's values per user.
SELECT T.*
,COALESCE(LAG(visits,1) OVER(PARTITION BY USER_ID ORDER BY year,mth),0) as last_month_visits
,COALESCE(LAG(visits,2) OVER(PARTITION BY USER_ID ORDER BY year,mth),0) as last_2_month_visits
FROM (
SELECT user_id, extract(month from date) as mth, year, COUNT(*) as visits
FROM visits
GROUP BY user_id, extract(month from date), year
) T
If there can be missing months, it is best to generate all months within a specified timeframe and left join the table onto that. (This example covers all the months in 2017.)
select user_id,yr,mth,visits
,coalesce(lag(visits,1) over(PARTITION BY USER_ID ORDER BY yr,mth),0) as last_month_visits
,coalesce(lag(visits,2) OVER(PARTITION BY USER_ID ORDER BY yr,mth),0) as last_2_month_visits
from (select u.user_id,extract(year from d.dt) as yr, extract(month from d.dt) as mth,count(v.visiting_id) as visits
from generate_series(date '2017-01-01', date '2017-12-31',interval '1 month') d(dt)
cross join (select distinct user_id from visits) u
left join visits v on extract(month from v.date)=extract(month from d.dt) and extract(year from v.date)=extract(year from d.dt) and u.user_id=v.user_id
group by u.user_id,extract(year from d.dt), extract(month from d.dt)
) t
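As a rough check of the LAG approach above, here is a minimal sketch that runs the same idea against the question's sample rows. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so strftime replaces extract, and it assumes the bundled SQLite is 3.25 or newer (window function support). The ISO date strings, the ym column and the n_visits alias are assumptions made for the example.

```python
import sqlite3

# Sample rows from the question; dates are rewritten as ISO strings (an assumption).
visits = [
    (123, "2017-04-11", 4500),
    (123, "2017-05-12", 4567),
    (123, "2017-05-13", 4568),
    (123, "2017-05-17", 4569),
    (123, "2017-05-22", 4570),
    (123, "2017-06-11", 4571),
    (123, "2017-06-12", 4572),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_id INTEGER, date TEXT, visiting_id INTEGER)")
conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", visits)

# Step 1 (inner query): one row per user and month with its visit count.
# Step 2 (outer query): LAG pulls the count from 1 and 2 rows back per user.
query = """
SELECT user_id, ym, n_visits,
       COALESCE(LAG(n_visits, 1) OVER (PARTITION BY user_id ORDER BY ym), 0) AS last_month_visits,
       COALESCE(LAG(n_visits, 2) OVER (PARTITION BY user_id ORDER BY ym), 0) AS last_2_month_visits
FROM (
    SELECT user_id, strftime('%Y-%m', date) AS ym, COUNT(*) AS n_visits
    FROM visits
    GROUP BY user_id, strftime('%Y-%m', date)
) AS t
ORDER BY user_id, ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that LAG looks at the previous row, not the previous calendar month, so this only matches the expected output because the sample has no missing months.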

Related

Sales amounts of the top n selling vendors by month in bigquery

I have a table in BigQuery like this (260000 rows):
vendor date item_price
x 2021-07-08 23:41:10 451,5
y 2021-06-14 10:22:10 41,7
z 2020-01-03 13:41:12 74
s 2020-04-12 01:14:58 88
....
What I want is to group this data by month and, for each month, sum the sales of only the top 20 vendors in that month. Expected output:
month sum_of_only_top20_vendor's_sales
2020-01 7857
2020-02 9685
2020-03 3574
2020-04 7421
.....
Consider below approach
select month, sum(sale) as sum_of_only_top20_vendor_sales
from (
select vendor,
format_datetime('%Y%m', date) month,
sum(item_price) as sale
from your_table
group by vendor, month
qualify row_number() over(partition by month order by sale desc) <= 20
)
group by month
Another solution that potentially can show much much better performance on really big data:
select month,
(select sum(sum) from t.top_20_vendors) as sum_of_only_top20_vendor_sales
from (
select
format_datetime('%Y%m', date) month,
approx_top_sum(vendor, item_price, 20) top_20_vendors
from your_table
group by month
) t
or with a little refactoring
select month, sum(sum) as sum_of_only_top20_vendor_sales
from (
select
format_datetime('%Y%m', date) month,
approx_top_sum(vendor, item_price, 20) top_20_vendors
from your_table
group by month
) t, t.top_20_vendors
group by month
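The first (exact) approach above boils down to three steps: sum sales per vendor and month, rank vendors within each month, then sum only the top-ranked rows. A minimal sketch of those steps follows, using Python's sqlite3 module as a stand-in for BigQuery. SQLite has no QUALIFY, so the rank filter moves to an outer WHERE; the sample rows, integer prices and TOP_N = 2 are assumptions chosen to keep the example small, and SQLite 3.25 or newer is assumed.

```python
import sqlite3

TOP_N = 2  # the thread uses 20; 2 keeps the made-up sample small

# Hypothetical sample data: (vendor, date, item_price)
sales = [
    ("x", "2020-01-03 13:41:12", 60),
    ("x", "2020-01-20 09:00:00", 40),
    ("y", "2020-01-05 10:00:00", 50),
    ("z", "2020-01-09 11:00:00", 10),
    ("x", "2020-02-01 08:00:00", 5),
    ("y", "2020-02-02 08:00:00", 20),
    ("s", "2020-02-03 08:00:00", 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (vendor TEXT, date TEXT, item_price INTEGER)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", sales)

query = """
SELECT month, SUM(sale) AS top_n_sales
FROM (
    -- rank vendors inside each month by their total sales
    SELECT vendor, month, sale,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY sale DESC) AS rn
    FROM (
        -- total sales per vendor and month
        SELECT vendor, strftime('%Y-%m', date) AS month, SUM(item_price) AS sale
        FROM your_table
        GROUP BY vendor, strftime('%Y-%m', date)
    )
)
WHERE rn <= ?
GROUP BY month
ORDER BY month
"""
rows = conn.execute(query, (TOP_N,)).fetchall()
for row in rows:
    print(row)
```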

SQL Bigquery Counting repeated customers from transaction table

I have a transaction table that looks something like this.
userid orderDate amount
111 2021-11-01 20
112 2021-09-07 17
111 2021-11-21 17
I want to count how many distinct customers (userid) that bought from our store this month also bought from our store in the previous month. For example, in February 2020, we had 20 customers and out of these 20 customers 7 of them also bought from our store in the previous month, January 2020. I want to do this for all the previous months so ending up with something like.
year month repeated_customers
2020 01 11
2020 02 7
2020 03 9
I have written this, but it only works for the current month. How would I iterate or rewrite it to get the table shown above?
WITH CURRENT_PERIOD AS (
SELECT DISTINCT userid
FROM table1
WHERE DATE(orderDate) BETWEEN DATE_TRUNC(CURRENT_DATE(),MONTH) AND DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
),
PREVIOUS_PERIOD AS (
SELECT DISTINCT userid
FROM table1
WHERE DATE(orderDate) BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH),MONTH) AND LAST_DAY(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH))
)
SELECT count(1)
FROM CURRENT_PERIOD RC
WHERE RC.userid IN (SELECT DISTINCT userid FROM PREVIOUS_PERIOD)
You can summarize to get one record per month, use lag(), and then aggregate:
select yyyymm,
countif(prev_yyyymm = date_add(yyyymm, interval -1 month)) as repeated_customers
from (select userid, date_trunc(orderDate, month) as yyyymm,
lag(date_trunc(orderDate, month)) over (partition by userid order by date_trunc(orderDate, month)) as prev_yyyymm
from table1
group by 1, 2
) t
group by yyyymm
order by yyyymm;
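To illustrate the idea above (one row per user and month, lag() to find that user's previous active month, then count the rows where it is exactly one month earlier), here is a small sketch using Python's sqlite3 module as a stand-in for BigQuery. The table name tx, the sample rows and the CTE names are assumptions; SQLite 3.25 or newer is assumed for window functions.

```python
import sqlite3

# Hypothetical transactions: (userid, orderDate, amount)
tx = [
    (111, "2021-09-07", 20),
    (111, "2021-10-05", 15),
    (111, "2021-11-01", 20),
    (111, "2021-11-21", 17),
    (112, "2021-09-07", 17),
    (112, "2021-11-03", 12),
    (113, "2021-10-10", 30),
    (113, "2021-11-15", 25),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (userid INTEGER, orderDate TEXT, amount INTEGER)")
conn.executemany("INSERT INTO tx VALUES (?, ?, ?)", tx)

query = """
WITH months AS (
    -- one row per user and calendar month (first day of the month)
    SELECT DISTINCT userid, date(orderDate, 'start of month') AS ym
    FROM tx
),
lagged AS (
    -- the same user's previous active month, NULL if there is none
    SELECT userid, ym,
           LAG(ym) OVER (PARTITION BY userid ORDER BY ym) AS prev_ym
    FROM months
)
SELECT ym,
       SUM(CASE WHEN prev_ym = date(ym, '-1 month') THEN 1 ELSE 0 END) AS repeated_customers
FROM lagged
GROUP BY ym
ORDER BY ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sample, user 112 bought in September and November but not October, so they are not counted as a repeat customer for November.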

SQL Count Entries for each Month of the last 6 Months

I have a problem counting the entries that were created in each month for the last 6 months.
The table looks like this:
A B C D
Year Month Startingdate Identifier
-----------------------------------------
2019 3 2019-03-12 OAM_1903121
2019 2 2019-03-21 OAM_1902211
And the result should look like:
A B C
Year Month Amount of orders
---------------------------------
2019 3 26
2019 2 34
This is what I have so far, but it doesn't get me the proper results:
SELECT year, month, COUNT(Startingdate) as Amount
FROM table
WHERE Startingdate > ((TRUNC(add_months(sysdate,-3) , 'MM'))-1)
GROUP BY year, month
I have not tested it, but it should work:
select year, month, count(Startingdate) as Amount_of_order
from table
where Startingdate between add_months(sysdate, -6) and sysdate
group by year, month;
Let me know.
Try this:
SELECT YEAR(Startingdate) AS [Year], MONTH(Startingdate) AS [Month], COUNT(*) AS Amount
FROM table
WHERE Startingdate > DATEADD(MONTH, -6, GETDATE())
GROUP BY YEAR(Startingdate), MONTH(Startingdate)
ORDER BY YEAR(Startingdate), MONTH(Startingdate) DESC
I think your issue is the filtering. If so, this should handle the six most recent full months, plus the current month so far:
SELECT year, month, COUNT(*) as num_orders
FROM table
WHERE Startingdate >= TRUNC(add_months(sysdate, -6) , 'MM')
GROUP BY year, month;
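The cutoff in the last query, TRUNC(add_months(sysdate, -6), 'MM'), is the first day of the month six months before today. As a sketch of that date arithmetic only (not of the query), here it is in plain Python; the function name first_of_month_n_ago is made up for the example.

```python
from datetime import date


def first_of_month_n_ago(today: date, n: int) -> date:
    """Return the first day of the month that lies n months before `today`."""
    # Count months since year 0, step back n months, then split into year/month.
    month_index = today.year * 12 + (today.month - 1) - n
    year, month0 = divmod(month_index, 12)
    return date(year, month0 + 1, 1)


cutoff = first_of_month_n_ago(date(2019, 3, 21), 6)
print(cutoff)  # 2018-09-01
```

With a cutoff of 2018-09-01 and a run date of 2019-03-21, the filter keeps September through February in full, plus March up to the run date.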

PostgreSQL group by and order by

I have a table with a date column. I wanted to get the count of months and display them in the order of months. Months should be displayed as 'Jan', 'Feb' etc. If I use to_char function, the order by happens on text. I can use extract(month from dt), but that will also display month in number format. This is part of a report and month should be displayed in 'Mon' format only.
SELECT to_char(dt,'Mon'), COUNT(*) FROM tb GROUP BY to_char(dt,'Mon') ORDER BY to_char(dt,'Mon');
to_char | count
---------+-------
Dec | 1
Jan | 1
Jul | 2
select month, total
from (
select
extract(month from dt) as month_number,
to_char(dt,'Mon') as month,
count(*) as total
from tb
group by 1, 2
) s
order by month_number

Find number of repeating visitors in a month - PostgreSQL

I am using PostgreSQL and my data looks something like this:
UserID TimeStamp
1 2014-02-03
2 2014-02-03
3 2014-02-03
1 2014-03-03
2 2014-03-03
6 2014-03-03
7 2014-03-03
This is just dummy data for 2 days in which some UserID is getting repeated on both the days. I would like to find out the number of repeated UserId every month. For this example the final result set should look like:
Count Year Month
0 2014 2
2 2014 3
In the above table, March 2014 has 2 repeat UserIDs and Feb 2014 has none.
I can find out the distinct UserID for each month but not the repeated UserID. Any help in this regard would be much appreciated.
select
count(distinct userid) as "Count",
extract(year from t1.timestamp) as "Year",
extract(month from t1.timestamp) as "Month"
from
t t1
inner join
t t0 using (userid)
where t0.timestamp < date_trunc('month', t1.timestamp)
group by 2, 3
Or, possibly faster:
select
count(distinct userid) as "Count",
extract(year from t1.timestamp) as "Year",
extract(month from t1.timestamp) as "Month"
from t t1
where exists (
select 1
from t
where
userid = t1.userid
and
timestamp < date_trunc('month', t1.timestamp)
)
group by 2, 3
This might work, have not tested it out yet.
SELECT
COUNT(DISTINCT(UserId))
, EXTRACT(YEAR FROM TimeStamp) AS Year
, EXTRACT(MONTH FROM TimeStamp) AS Month
FROM t
GROUP BY 2, 3
To rephrase your question:
How many users are not new (i.e. already visited the shop/website/whatever in a previous month) for each month?
SELECT
yr, mon,
COUNT(*) AS all_users,
COUNT(*) - SUM(repeated) AS new_users,
SUM(repeated) AS existing_users
FROM
(
SELECT UserId,
EXTRACT(YEAR FROM TimeStamp) AS yr,
EXTRACT(MONTH FROM TimeStamp) AS mon,
CASE WHEN ROW_NUMBER() -- 1st time users get 0
OVER (PARTITION BY UserId
ORDER BY EXTRACT(YEAR FROM TimeStamp) ,
EXTRACT(MONTH FROM TimeStamp)) = 1
THEN 0
ELSE 1
END AS repeated
FROM vt
GROUP BY UserId,
EXTRACT(YEAR FROM TimeStamp),
EXTRACT(MONTH FROM TimeStamp)
) AS dt
GROUP BY yr,mon
ORDER BY 1,2
The inner GROUP BY is needed if there are multiple rows for a user within the same month.
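A quick way to sanity-check the ROW_NUMBER approach above is to run it on the question's sample data. The sketch below does that with Python's sqlite3 module as a stand-in database (SQLite 3.25 or newer assumed). The table name t, the column name ts (instead of TimeStamp) and the combined ym key are assumptions; SELECT DISTINCT plays the role of the inner GROUP BY.

```python
import sqlite3

# Sample rows from the question: (UserID, TimeStamp)
data = [
    (1, "2014-02-03"), (2, "2014-02-03"), (3, "2014-02-03"),
    (1, "2014-03-03"), (2, "2014-03-03"), (6, "2014-03-03"), (7, "2014-03-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (userid INTEGER, ts TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", data)

query = """
SELECT ym,
       COUNT(*)                 AS all_users,
       COUNT(*) - SUM(repeated) AS new_users,
       SUM(repeated)            AS existing_users
FROM (
    -- a user's first active month gets 0, every later month gets 1
    SELECT userid, ym,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY userid ORDER BY ym) = 1
                THEN 0 ELSE 1 END AS repeated
    FROM (
        -- collapse multiple visits by one user in the same month
        SELECT DISTINCT userid, strftime('%Y-%m', ts) AS ym
        FROM t
    )
)
GROUP BY ym
ORDER BY ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

February shows 0 existing users and March shows 2, which matches the result the question asks for.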
Is this what you want?
select yyyymm, sum(case when cnt > 1 then 1 else 0 end) as dupcnt
from (select to_char(timestamp, 'YYYY-MM') as yyyymm, userid, count(*) as cnt
from table t
group by to_char(timestamp, 'YYYY-MM'), userid
) t
group by yyyymm
order by yyyymm;