Unique new users per period - sql

In psql, I've written a query that returns unique users per week with
COUNT(DISTINCT user_id)
However, I am also interested in counting the number of unique new users per week, in other words, users that have never been active before in any of the previous weeks.
How would one write this query in postgresql?
Current query:
SELECT TO_CHAR(date_trunc('week', start_time::date), 'YYYY-MM-DD')
AS weekly, COUNT(*) AS total_transactions, COUNT(DISTINCT user_id) AS unique_users
FROM transactions
GROUP BY weekly ORDER BY weekly

Use MIN() as a window function to get the first appearance of each user_id, then group on the week of that first appearance to count the new users per week. You may also want to include grouping on year.
SELECT
TO_CHAR(date_trunc('week', first_appearance), 'YYYY-MM-DD') AS weekly,
COUNT(*) AS transactions_by_new_users, -- every transaction made by users first seen in this week
COUNT(DISTINCT user_id) AS new_users
FROM (SELECT t.*,
MIN(start_time::date) OVER (PARTITION BY user_id) AS first_appearance
FROM transactions t
) t
GROUP BY weekly
ORDER BY weekly
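To show unique users and new users side by side in one result, here is a minimal sketch. It runs on an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained. The sample rows and the names tx, first_seen and new_users are invented for illustration, and date(start_time, 'weekday 0', '-6 days') is SQLite's stand-in for PostgreSQL's date_trunc('week', start_time). The same join should carry over to PostgreSQL, but that is an assumption and was not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (user_id INTEGER, start_time TEXT);
INSERT INTO transactions VALUES
  (1, '2024-01-01 09:00:00'),  -- Monday, week starting 2024-01-01
  (1, '2024-01-03 10:00:00'),
  (2, '2024-01-07 23:00:00'),  -- Sunday, still the week starting 2024-01-01
  (1, '2024-01-08 08:00:00'),  -- week starting 2024-01-08
  (3, '2024-01-10 12:00:00'),
  (2, '2024-01-15 12:00:00');  -- week starting 2024-01-15
""")

query = """
WITH tx AS (
  -- Monday of the week each transaction falls in
  SELECT user_id,
         date(start_time, 'weekday 0', '-6 days') AS week_start
  FROM transactions
),
first_seen AS (
  -- the first week in which each user appears
  SELECT user_id, MIN(week_start) AS first_week
  FROM tx
  GROUP BY user_id
)
SELECT tx.week_start AS weekly,
       COUNT(*) AS total_transactions,
       COUNT(DISTINCT tx.user_id) AS unique_users,
       -- a user is "new" only in the week that equals their first week
       COUNT(DISTINCT CASE WHEN f.first_week = tx.week_start
                           THEN tx.user_id END) AS new_users
FROM tx
JOIN first_seen f ON f.user_id = tx.user_id
GROUP BY tx.week_start
ORDER BY tx.week_start
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike grouping on the first-appearance week alone, this keeps total_transactions and unique_users tied to the week in which the transactions actually happened.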

Related

Select users which have at least one successful and one failed payment per month in PostgreSQL

How can I select users from this kind of table that have at least one successful and one failed payment per month?
For example, if the user has payments in March, April and May, and in May there were only ten successful payments, we don't want to show that user. But if the user has had payments for as long as 10 months, and in each month there are both failed (false) and successful (true) payments, we want to show that user.
In this case we would only show the users with user id 1 and 3.
For now my query looks like this:
SELECT DISTINCT date_trunc('month', paydate) AS uniquemonth
, success
, user_id
FROM payments ORDER BY user_id, uniquemonth
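You can count the months with successful payments and the months with failed payments, and compare each with the total number of months (here t is the result of your query, with one row per user, month and success value):
select user_id
from t
group by user_id
having count(*) filter (where success) = count(distinct uniquemonth) and
count(*) filter (where not success) = count(distinct uniquemonth);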
You seem to have only one or two records per month. If you have more and the first column were really not the first of the month, you could use count(distinct):
select user_id
from t
group by user_id
having count(distinct date_trunc('month', uniquemonth)) filter (where success) = count(distinct date_trunc('month', uniquemonth)) and
count(distinct date_trunc('month', uniquemonth)) filter (where not success) = count(distinct date_trunc('month', uniquemonth))
Here is one way, which lists every user and month that has both a successful and a failed payment:
select
to_char(uniquemonth, 'YYYY-MM')
, users
from tablename
group by to_char(uniquemonth, 'YYYY-MM'), users
having count(*) filter (where success) >= 1
and count(*) filter (where not success) >= 1;
Use MIN() and MAX() window functions to get the min and max values of success for each user/month and then use aggregation:
SELECT user_id
FROM (
SELECT *,
MIN(success::int) OVER (PARTITION BY user_id, to_char(paydate, 'YYYY-MM')) min_success,
MAX(success::int) OVER (PARTITION BY user_id, to_char(paydate, 'YYYY-MM')) max_success
FROM payments
) t
GROUP BY user_id
HAVING MAX(min_success) = 0 AND MIN(max_success) = 1
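The min/max idea can also be written with two levels of GROUP BY instead of window functions. The sketch below is only an illustration: it uses Python's sqlite3 module with an in-memory database so it can be run as-is, the rows are made up, success is stored as 0/1, and strftime('%Y-%m', paydate) stands in for PostgreSQL's to_char(paydate, 'YYYY-MM').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (user_id INTEGER, paydate TEXT, success INTEGER);
INSERT INTO payments VALUES
  (1, '2021-03-05', 1), (1, '2021-03-20', 0),
  (1, '2021-04-02', 1), (1, '2021-04-15', 0),
  (2, '2021-03-07', 1), (2, '2021-03-09', 0),
  (2, '2021-04-11', 1),                        -- April has no failed payment
  (3, '2021-05-01', 0), (3, '2021-05-30', 1);
""")

query = """
SELECT user_id
FROM (
  -- one row per user and month with the lowest and highest success flag
  SELECT user_id,
         strftime('%Y-%m', paydate) AS pay_month,
         MIN(success) AS min_success,
         MAX(success) AS max_success
  FROM payments
  GROUP BY user_id, strftime('%Y-%m', paydate)
) per_month
GROUP BY user_id
-- every month must contain a failure (min = 0) and a success (max = 1)
HAVING MAX(min_success) = 0 AND MIN(max_success) = 1
ORDER BY user_id
"""

user_ids = [row[0] for row in conn.execute(query)]
print(user_ids)
```

User 2 is dropped because April contains only a successful payment, so the largest monthly minimum for that user is 1 rather than 0.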

Daily Rolling Count of Distinct Users on Different time periods

I am trying to find the most efficient way to run the following query, which I need to connect to Tableau and visualise. The idea is to count 7-day, 30-day and 90-day active users for each day. So for today, for yesterday, and for every earlier day, I want to know who was active within each of those time frames.
To clarify, users can be active multiple times within my time frames.
The count of 7-day active users is the distinct number of users who had a session in the period from today's date - 6 to today's date. I need to calculate this for every date within the last 6 months.
This is the query I have.
with dau as (
select date_trunc('day', created_date) created_at,
count(distinct customer_id) dau
from sessions
where created_date >= date_trunc('day', dateadd('month', -6, getdate()))
group by date_trunc('day', created_date)
)
select created_at,
dau,
(select count(distinct customer_id)
from sessions
where date_trunc('day', created_date) between created_at - 6 and created_at) wau,
(select count(distinct customer_id)
from sessions
where date_trunc('day', created_date) between created_at - 29 and created_at) as mau,
(select count(distinct customer_id)
from sessions
where date_trunc('day', created_date) between created_at - 89 and created_at) as three_mau
from dau
It takes 30 min to run which seems crazy. Is there a better way to do it? I am also looking into the use of materialised views as a faster way to use this in a dashboard. Would this work?
The result I am looking to get would be a table where the rows are dates within the last 6 months and each column is the count of distinct users on 7, 30 and 90 periods from that date.
Thanks in advance!
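One possible restructuring, offered as a sketch rather than a measured fix: reduce sessions to distinct (day, customer) pairs once, then join each day to the pairs in its trailing window, so the raw table is not scanned again for every output row. The example uses Python's sqlite3 module with an in-memory database and invented rows, shows only the 7-day window, and uses SQLite's date() in place of date_trunc('day', ...). Whether this actually shortens the 30 minute runtime on the real data is an assumption that would need testing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (customer_id INTEGER, created_date TEXT);
INSERT INTO sessions VALUES
  (1, '2024-01-01 08:00:00'),
  (1, '2024-01-01 19:00:00'),  -- same customer twice on one day
  (2, '2024-01-03 10:00:00'),
  (1, '2024-01-08 09:00:00'),
  (3, '2024-01-09 11:00:00');
""")

query = """
WITH daily AS (
  -- one row per day and customer, however many sessions they had
  SELECT DISTINCT date(created_date) AS day, customer_id
  FROM sessions
),
days AS (
  SELECT DISTINCT day FROM daily
)
SELECT d.day,
       COUNT(DISTINCT CASE WHEN a.day = d.day
                           THEN a.customer_id END) AS dau,
       COUNT(DISTINCT a.customer_id) AS wau
FROM days d
-- trailing 7-day window: the day itself plus the 6 days before it
JOIN daily a ON a.day BETWEEN date(d.day, '-6 days') AND d.day
GROUP BY d.day
ORDER BY d.day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The 30-day and 90-day counts would follow the same pattern with '-29 days' and '-89 days'. Days without any session do not appear in the output, which matches the dau CTE in the question.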

how to find consecutive user login across week

I'm fairly new to SQL, and maybe the complexity of this report is above my pay grade.
I need help finding the users who log in to the app every week, without a gap, during a chosen time period (this logic eventually needs to be extended to month, quarter and year, but a week is good for now).
Table structure for ref
events: User_id int, login_date timestamp
The table events can have one or more entries per user, since a user can log in to the app multiple times. To shed some light, if we focus on Jan 2020 - Mar 2020, then I need the following in the output:
user_id who logged into the app at least once every week from 2020Wk1 to 2020Wk14
the week they logged in
number of times they logged in that week
I'm also okay if the output of the query is just the user_id. The thing is I'm unable to make sense out of the output that I'm seeing on my end after trying the following SQL code, perhaps working on this problem for so long might be the reason for that!
SQL code tried so far:
SELECT DISTINCT user_id
,extract('year' FROM timestamp)||'Wk'|| extract('week' FROM timestamp)
,lead(extract('week' FROM timestamp)) over (partition by user_id, extract('week' FROM timestamp) order by extract('week' FROM timestamp))
FROM events
WHERE user_id = 'Anything that u wish to enter'
You can get the summary you want as:
select user_id, date_trunc('week', timestamp) as week, count(*)
from events
group by user_id, week;
But the filtering is trickier. It is better to go with dates rather than week numbers:
select user_id, date_trunc('week', timestamp) as week, count(*) as cnt,
count(*) over (partition by user_id) as num_weeks
from events
where timestamp >= ? and timestamp < ?
group by user_id, week;
Then you can use a subquery:
select uw.*
from (select user_id, date_trunc('week', timestamp) as week, count(*) as cnt,
count(*) over (partition by user_id) as num_weeks
from events
where timestamp >= ? and timestamp < ?
group by user_id, week
) uw
where num_weeks = ? -- 14 in your example
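If only the user_id is needed, the week count can go straight into a HAVING clause. The following is a minimal sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module so it is self-contained, the rows are invented, the column is called login_date as in the table description, the parameter names period_start, period_end and num_weeks are hypothetical, and date(login_date, 'weekday 0', '-6 days') replaces PostgreSQL's date_trunc('week', ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INTEGER, login_date TEXT);
INSERT INTO events VALUES
  (1, '2024-01-01 09:00:00'),  -- week starting 2024-01-01
  (1, '2024-01-02 10:00:00'),  -- same week, second login
  (1, '2024-01-09 08:00:00'),  -- week starting 2024-01-08
  (1, '2024-01-17 18:00:00'),  -- week starting 2024-01-15
  (2, '2024-01-03 12:00:00'),  -- week starting 2024-01-01
  (2, '2024-01-16 12:00:00');  -- week starting 2024-01-15, skipped a week
""")

query = """
SELECT user_id
FROM events
WHERE login_date >= :period_start AND login_date < :period_end
GROUP BY user_id
-- the user must have logged in during every week of the period
HAVING COUNT(DISTINCT date(login_date, 'weekday 0', '-6 days')) = :num_weeks
ORDER BY user_id
"""

params = {
    "period_start": "2024-01-01",  # inclusive
    "period_end": "2024-01-22",    # exclusive, three full weeks later
    "num_weeks": 3,
}
user_ids = [row[0] for row in conn.execute(query, params)]
print(user_ids)
```

Because the period is fixed and the distinct weeks are counted, a user with the full count cannot have skipped a week, so no explicit gap detection is needed.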

i am trying to use the avg() function in a subquery after using a count in the inner query but i cannot seem to get it work in SQL

My table name is CustomerDetails and it has the following columns:
customer_id, login_id, session_id, login_date
I am trying to write a query that calculates the average number of customers logging in per day.
I tried this:
select avg(session_id)
from CustomerDetails
where exists (select count(session_id) from CustomerDetails as 'no_of_entries')
But then I realized it was going straight to the column and just calculating the average of that column, which is not what I want to do. Can someone help me?
Thanks
The first thing you need to do is get logins per day:
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
Then you can use that to get average logins per day:
SELECT AVG(loginsPerDay)
FROM (
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
) t -- most databases require an alias for a derived table
If your login_date is a DATE type you're all set. If it has a time component then you'll need to truncate it to date only:
SELECT AVG(loginsPerDay)
FROM (
SELECT CAST(login_date AS DATE) AS login_day, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY CAST(login_date AS DATE)
) t
I am trying to write a query that calculates the average number of customers logging in per day.
Count the number of logins (one row per login) and divide by the number of days. I think that is:
select count(*) * 1.0 / count(distinct cast(login_date as date))
from customerdetails;
I understand that you want to count the number of visitors per day, not the number of visits. So if a customer logged in twice on the same day, you want to count them only once.
If so, you can use distinct and two levels of aggregation, like so:
select avg(cnt_visitors) avg_cnt_visitors_per_day
from (
select count(distinct customer_id) cnt_visitors
from CustomerDetails
group by cast(login_date as date)
) t
The inner query computes the count of distinct customers for each day, and the outer query gives you the overall average.
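The difference between average logins per day and average distinct visitors per day is easy to see on a tiny data set. This sketch uses Python's sqlite3 module with an in-memory database and made-up rows, and date(login_date) plays the role of CAST(login_date AS DATE). The aliases logins and visitors are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerDetails (
  customer_id INTEGER, login_id TEXT, session_id INTEGER, login_date TEXT
);
INSERT INTO CustomerDetails VALUES
  (1, 'a', 100, '2024-02-01 08:00:00'),
  (1, 'a', 101, '2024-02-01 20:00:00'),  -- same customer, same day
  (2, 'b', 102, '2024-02-01 09:00:00'),
  (1, 'a', 103, '2024-02-02 08:00:00');
""")

query = """
SELECT AVG(logins)   AS avg_logins_per_day,
       AVG(visitors) AS avg_visitors_per_day
FROM (
  -- first level: one row per calendar day
  SELECT date(login_date) AS login_day,
         COUNT(*) AS logins,
         COUNT(DISTINCT customer_id) AS visitors
  FROM CustomerDetails
  GROUP BY date(login_date)
) per_day
"""

avg_logins, avg_visitors = conn.execute(query).fetchone()
print(avg_logins, avg_visitors)
```

The first day has three logins from two customers and the second day has one login from one customer, so the two averages differ.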

Postgresql - Cumulative sum of created users

I have a users table with a timestamp when each user was created. I'd like to get the cumulative sum of users created per month.
I do have the following query which is working, but it's showing me the sum on a per day basis. I have a hard time going from this to a per month basis.
SELECT
created_at,
sum(count(*)) OVER (ORDER BY created_at) as total
FROM users
GROUP BY created_at
Expected output:
created_at count
-----------------
2016-07 100
2016-08 150
2016-09 200
2016-10 500
Former reading:
Calculating Cumulative Sum in PostgreSQL
Count cumulative total in Postgresql
Rolling sum / count / average over date interval
Cumulative sum of values by month, filling in for missing months
Postgres window function and group by exception
I'd take a two-step approach. First, use an inner query to count how many users were created each month. Then, wrap this query with another query that calculates the cumulative sum of these counts:
SELECT created_at, SUM(cnt) OVER (ORDER BY created_at ASC)
FROM (SELECT TO_CHAR(created_at, 'YYYY-MM') AS created_at, COUNT(*) AS cnt
FROM users
GROUP BY TO_CHAR(created_at, 'YYYY-MM')) t
ORDER BY 1 ASC;
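The two-step approach can be checked on a few rows. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 module (window functions need SQLite 3.25 or newer), the user rows are invented, and strftime('%Y-%m', created_at) stands in for TO_CHAR(created_at, 'YYYY-MM').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, created_at TEXT);
INSERT INTO users (created_at) VALUES
  ('2016-07-03 10:00:00'),
  ('2016-07-20 11:00:00'),
  ('2016-08-01 09:30:00'),
  ('2016-10-05 14:00:00'),
  ('2016-10-06 15:00:00'),
  ('2016-10-30 16:00:00');
""")

query = """
SELECT month,
       -- running total of the monthly counts, in month order
       SUM(cnt) OVER (ORDER BY month) AS total
FROM (
  -- step 1: users created per month
  SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS cnt
  FROM users
  GROUP BY strftime('%Y-%m', created_at)
) t
ORDER BY month
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that a month in which no user was created (2016-09 in this sample) produces no row at all. Filling such gaps would need a generated list of months to join against, which is outside this sketch.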