PostgreSQL - Cumulative sum of created users

I have a users table with a timestamp when each user was created. I'd like to get the cumulative sum of users created per month.
I do have the following query which is working, but it's showing me the sum on a per day basis. I have a hard time going from this to a per month basis.
SELECT
created_at,
sum(count(*)) OVER (ORDER BY created_at) as total
FROM users
GROUP BY created_at
Expected output:
created_at count
-----------------
2016-07 100
2016-08 150
2016-09 200
2016-10 500
Former reading:
Calculating Cumulative Sum in PostgreSQL
Count cumulative total in Postgresql
Rolling sum / count / average over date interval
Cumulative sum of values by month, filling in for missing months
Postgres window function and group by exception

I'd take a two-step approach. First, use an inner query to count how many users were created each month. Then, wrap this query with another query that calculates the cumulative sum of these counts:
SELECT created_at, SUM(cnt) OVER (ORDER BY created_at ASC) AS total
FROM (SELECT TO_CHAR(created_at, 'YYYY-MM') AS created_at, COUNT(*) AS cnt
FROM users
GROUP BY TO_CHAR(created_at, 'YYYY-MM')) t
ORDER BY 1 ASC;
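The two-step approach above can be tried out with a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so strftime('%Y-%m', ...) replaces TO_CHAR(..., 'YYYY-MM'); the function name and the sample timestamps are made up for illustration, and window functions assume SQLite 3.25 or newer.

```python
import sqlite3

def cumulative_users_by_month(timestamps):
    """Return (month, running_total) rows for a list of ISO timestamp strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (created_at TEXT)")
    conn.executemany("INSERT INTO users VALUES (?)", [(ts,) for ts in timestamps])
    query = """
        SELECT month, SUM(cnt) OVER (ORDER BY month) AS total
        FROM (SELECT strftime('%Y-%m', created_at) AS month,  -- SQLite stand-in for TO_CHAR
                     COUNT(*) AS cnt
              FROM users
              GROUP BY month) t
        ORDER BY month
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample data: 2 users in July, 1 in August, none in September, 3 in October.
sample = [
    "2016-07-03 10:00:00", "2016-07-20 08:30:00",
    "2016-08-01 12:00:00",
    "2016-10-15 09:00:00", "2016-10-16 09:00:00", "2016-10-17 09:00:00",
]
print(cumulative_users_by_month(sample))
# [('2016-07', 2), ('2016-08', 3), ('2016-10', 6)]
```

Note that a month with no new users (September in this sample) does not appear in the output at all, because the inner query only produces rows for months that exist in the data.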

Related

How to get average from counting values?

I am using SQLite.
I need to determine average impressions per day, where impression = count of person_id.
COLUMNS:
person_id - unique identifier of the person
date - date they were shown the ad
ad_id - content of the ad: ad_1_product1, ad_2_product2, or ad_3_product3
clicked (TRUE/FALSE) - clicked on the ad
signed_up - (TRUE/FALSE) created an account
subscribed (TRUE/FALSE) - started a paid subscription
I set clicked, signed_up and subscribed as BOOLEAN, the rest is text.
MY CODE:
SELECT AVG(impressions) AS avg_impressions
FROM (
SELECT COUNT(person_id) as impressions
FROM videoadcampaign
GROUP BY date
) date;
I get 1 row with 1 column: avg_impressions = 591
I cannot break down the average by date. Date is in 2021-04-27 format, total date count is 8.
[Screenshot: result]
[Screenshot: expected output; ignore the column name, it is only there for illustration]
Any help is highly appreciated.
If you want the percentage of rows for each day, then you can do it by dividing each day's count by the total number of rows.
With SUM() window function:
SELECT date, 1.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS avg_impressions
FROM videoadcampaign
GROUP BY date
Or, with a subquery:
SELECT date, 1.0 * COUNT(*) / (SELECT COUNT(*) FROM videoadcampaign) AS avg_impressions
FROM videoadcampaign
GROUP BY date
I assume that the column person_id is not nullable, so instead of COUNT(person_id) you may use COUNT(*).
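To see the difference between a per-day share and the single overall average, here is a minimal sketch using Python's sqlite3 module (the question is about SQLite). The four sample rows are invented, and only the person_id and date columns are created because the other columns are not needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videoadcampaign (person_id TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO videoadcampaign VALUES (?, ?)",
    [("p1", "2021-04-27"), ("p2", "2021-04-27"),
     ("p3", "2021-04-27"), ("p4", "2021-04-28")],  # hypothetical rows
)

# One row per day: the day's impressions and its share of all impressions.
per_day = conn.execute("""
    SELECT date,
           COUNT(*) AS impressions,
           1.0 * COUNT(*) / (SELECT COUNT(*) FROM videoadcampaign) AS share
    FROM videoadcampaign
    GROUP BY date
    ORDER BY date
""").fetchall()

# A single row: the average of the daily counts (what the question's query returns).
overall_avg = conn.execute("""
    SELECT AVG(impressions)
    FROM (SELECT COUNT(*) AS impressions FROM videoadcampaign GROUP BY date)
""").fetchone()[0]

print(per_day)      # [('2021-04-27', 3, 0.75), ('2021-04-28', 1, 0.25)]
print(overall_avg)  # 2.0
```

An average over daily counts is by definition one number, so a per-day breakdown has to be either the raw daily count or a ratio such as the share shown here.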
The average impressions per day is the total number of impressions divided by the number of distinct days. No subquery is needed:
SELECT COUNT(*) * 1.0 / COUNT(DISTINCT date) AS avg_impressions
FROM videoadcampaign;

PostgreSQL for the average number of attendances of an event per month

I'm trying to write some SQL to understand the average number of events attended, per month.
Attendees
| id | user_id | event_id | created_at |
I've tried:
SELECT AVG(b.rcount) FROM (select count(*) as rcount FROM attendees GROUP BY attendees.user_id) as b;
But this returns 5.77 (which is just the average of all time). I'm trying to get the average per month.
The results would ideally be:
2020-01-01, 2.1
2020-01-02, 2.4
2020-01-03, 3.3
...
I also tried this:
SELECT mnth, AVG(b.rcount) FROM (select date_trunc('month', created_at) as mnth, count(*) as rcount FROM attendees GROUP BY 1, 2) as b;
But got: ERROR: aggregate functions are not allowed in GROUP BY
If I follow you correctly, a simple approach is to divide the number of rows per month by the count of distinct users:
select
date_trunc('month', created_at) created_month,
1.0 * count(*) / count(distinct user_id) avg_events_per_user
from attendees
group by date_trunc('month', created_at)
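As a rough check of the idea above, the following sketch runs the same division in SQLite through Python's sqlite3 module. SQLite has no date_trunc, so strftime('%Y-%m-01', ...) is used as a stand-in for truncating to the month; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendees (id INTEGER, user_id INTEGER, event_id INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO attendees VALUES (?, ?, ?, ?)",
    [
        # January 2020: user 1 attends 3 events, user 2 attends 1 (4 rows, 2 users)
        (1, 1, 10, "2020-01-05 10:00:00"),
        (2, 1, 11, "2020-01-12 10:00:00"),
        (3, 1, 12, "2020-01-19 10:00:00"),
        (4, 2, 10, "2020-01-05 10:00:00"),
        # February 2020: user 1 attends 1 event, user 3 attends 2 (3 rows, 2 users)
        (5, 1, 13, "2020-02-02 10:00:00"),
        (6, 3, 13, "2020-02-02 10:00:00"),
        (7, 3, 14, "2020-02-09 10:00:00"),
    ],
)

avg_per_month = conn.execute("""
    SELECT strftime('%Y-%m-01', created_at) AS created_month,   -- stand-in for date_trunc('month', ...)
           1.0 * COUNT(*) / COUNT(DISTINCT user_id) AS avg_events_per_user
    FROM attendees
    GROUP BY created_month
    ORDER BY created_month
""").fetchall()

print(avg_per_month)  # [('2020-01-01', 2.0), ('2020-02-01', 1.5)]
```

Users who attended nothing in a given month are not part of that month's denominator, so this is the average among users who attended at least one event that month.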

I am trying to use the avg() function in an outer query after using count in the inner query, but I cannot seem to get it to work in SQL

My table name is CustomerDetails and it has the following columns:
customer_id, login_id, session_id, login_date
I am trying to write a query that calculates the average number of customers logging in per day.
I tried this:
select avg(session_id)
from CustomerDetails
where exists (select count(session_id) from CustomerDetails as 'no_of_entries')
But then I realized it was going straight to the column and just calculating the average of that column, which is not what I want. Can someone help me?
Thanks
The first thing you need to do is get logins per day:
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
Then you can use that to get average logins per day:
SELECT AVG(loginsPerDay)
FROM (
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
) AS daily
If your login_date is a DATE type you're all set. If it has a time component then you'll need to truncate it to date only:
SELECT AVG(loginsPerDay)
FROM (
SELECT CAST(login_date AS DATE) AS login_day, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY CAST(login_date AS DATE)
) AS daily
i am trying to write a query that calculates the average number of customers login in per day.
Count the number of logins (rows) and divide by the number of distinct days. I think that is:
select count(*) * 1.0 / count(distinct cast(login_date as date))
from customerdetails;
I understand that you want to count the number of visitors per day, not the number of visits. So if a customer logged in twice on the same day, you want to count them only once.
If so, you can use distinct and two levels of aggregation, like so:
select avg(cnt_visitors) avg_cnt_visitors_per_day
from (
select count(distinct customer_id) cnt_visitors
from CustomerDetails
group by cast(login_date as date)
) t
The inner query computes the count of distinct customers for each day; the outer query gives you the overall average.
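The two readings of the question (logins per day versus distinct visitors per day) give different numbers, which a small sketch makes visible. It uses Python's sqlite3 module as a stand-in database, so date(login_date) replaces CAST(login_date AS DATE); the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerDetails "
    "(customer_id INTEGER, login_id INTEGER, session_id INTEGER, login_date TEXT)"
)
conn.executemany(
    "INSERT INTO CustomerDetails VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, "2021-01-01 09:00:00"),  # customer 1 logs in twice on Jan 1
        (1, 11, 101, "2021-01-01 18:00:00"),
        (2, 12, 102, "2021-01-01 10:00:00"),
        (1, 13, 103, "2021-01-02 09:00:00"),
    ],
)

# Average number of logins per day: every row counts.
avg_logins = conn.execute("""
    SELECT AVG(logins_per_day)
    FROM (SELECT date(login_date) AS day, COUNT(*) AS logins_per_day
          FROM CustomerDetails
          GROUP BY day) t
""").fetchone()[0]

# Average number of distinct customers per day: repeat logins count once.
avg_visitors = conn.execute("""
    SELECT AVG(cnt_visitors)
    FROM (SELECT date(login_date) AS day, COUNT(DISTINCT customer_id) AS cnt_visitors
          FROM CustomerDetails
          GROUP BY day) t
""").fetchone()[0]

print(avg_logins)    # 2.0  (3 logins on Jan 1, 1 on Jan 2)
print(avg_visitors)  # 1.5  (2 customers on Jan 1, 1 on Jan 2)
```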

Unique new users per period

In psql, I've written a query that returns unique users per week with
COUNT(DISTINCT user_id)
However, I am also interested in counting the number of unique new users per week, in other words, users that have never been active before in any of the previous weeks.
How would one write this query in postgresql?
Current query:
SELECT TO_CHAR(date_trunc('week', start_time::date), 'YYYY-MM-DD')
AS weekly, COUNT(*) AS total_transactions, COUNT(DISTINCT user_id) AS unique_users
FROM transactions
GROUP BY weekly ORDER BY weekly
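Use MIN() as a window function to get the first appearance of each user_id, then group by the week of that first appearance so that each user is counted only in the week they were first seen. Note that COUNT(*) then counts every transaction made by those users, not just the transactions of that week. You may also want to include grouping on year.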
select
TO_CHAR(date_trunc('week', first_appearance), 'YYYY-MM-DD') AS weekly,
COUNT(*) AS total_transactions,
COUNT(DISTINCT user_id) AS unique_users
from (SELECT t.*,
MIN(start_time::date) OVER(PARTITION BY user_id) AS first_appearance
FROM transactions t
) t
GROUP BY weekly
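The first-appearance idea can be sketched end to end with Python's sqlite3 module. This is only an approximation of the PostgreSQL query: SQLite has no date_trunc, so date(x, 'weekday 0', '-6 days') is used to get the Monday of the week, and the first appearance is computed with a plain GROUP BY instead of a window function. The sample transactions are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, start_time TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1, "2021-01-04 10:00:00"),  # Monday, week of 2021-01-04
        (2, "2021-01-06 10:00:00"),  # Wednesday, same week
        (1, "2021-01-12 10:00:00"),  # user 1 returns in the week of 2021-01-11
        (3, "2021-01-12 11:00:00"),  # user 3 is first seen in that week
        (3, "2021-01-17 11:00:00"),  # Sunday, still the week of 2021-01-11
    ],
)

# Users active in each week (what the question already has).
active_per_week = conn.execute("""
    SELECT date(start_time, 'weekday 0', '-6 days') AS weekly,
           COUNT(DISTINCT user_id) AS unique_users
    FROM transactions
    GROUP BY weekly
    ORDER BY weekly
""").fetchall()

# Users whose very first transaction falls in each week.
new_per_week = conn.execute("""
    SELECT date(first_seen, 'weekday 0', '-6 days') AS weekly,
           COUNT(*) AS new_users
    FROM (SELECT user_id, MIN(start_time) AS first_seen
          FROM transactions
          GROUP BY user_id) f
    GROUP BY weekly
    ORDER BY weekly
""").fetchall()

print(active_per_week)  # [('2021-01-04', 2), ('2021-01-11', 2)]
print(new_per_week)     # [('2021-01-04', 2), ('2021-01-11', 1)]
```

User 1 is active in both weeks but counts as new only in the first one, which is the distinction the question asks for.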

Postgres SQL: Sum of ids greater than a day, computed day by day over a series

Looking to compute a moving sum day by day over a date range. i.e. Looking to sum all values greater than or equal to the date but do it row by row. I know that a window function is needed, but need some help with the actual function.
** I need to compute the sum greater than each date in a row. Notice on 2017-08-02 I do not count the value from the day before
Example data:
2017-08-1, 1
2017-08-2, 5
2017-08-3, 4
2017-08-4, 3
2017-08-5, 2
Desired Result:
2017-08-1, 15
2017-08-2, 14
2017-08-3, 9
2017-08-4, 5
2017-08-5, 2
Here is what I have to produce this data.
SELECT DATE_TRUNC('day', created_at),
COUNT(*)
FROM table
GROUP BY 1
ORDER BY 1 DESC
Just use cumulative sums:
SELECT DATE_TRUNC('day', created_at),
COUNT(*),
SUM(COUNT(*)) OVER (ORDER BY DATE_TRUNC('day', created_at) DESC) as sum_greater_than
FROM table
GROUP BY 1
ORDER BY 1 DESC;
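The example data from the question can be used to check that a cumulative sum in descending date order produces the desired result. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL (date() instead of DATE_TRUNC('day', ...), window functions assume SQLite 3.25 or newer), and the table name events is hypothetical because the question only shows a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (created_at TEXT)")

# Rebuild the question's example: 1, 5, 4, 3 and 2 rows on Aug 1 to Aug 5, 2017.
counts_by_day = {1: 1, 2: 5, 3: 4, 4: 3, 5: 2}
for day, n in counts_by_day.items():
    conn.executemany(
        "INSERT INTO events VALUES (?)",
        [(f"2017-08-0{day} 12:00:00",)] * n,
    )

rows = conn.execute("""
    WITH daily AS (
        SELECT date(created_at) AS day, COUNT(*) AS cnt
        FROM events
        GROUP BY day
    )
    SELECT day, cnt,
           SUM(cnt) OVER (ORDER BY day DESC) AS sum_greater_than  -- running total from the latest day backwards
    FROM daily
    ORDER BY day DESC
""").fetchall()

for row in rows:
    print(row)
# ('2017-08-05', 2, 2)
# ('2017-08-04', 3, 5)
# ('2017-08-03', 4, 9)
# ('2017-08-02', 5, 14)
# ('2017-08-01', 1, 15)
```

Ordering the window by date descending means each row's running total covers that day and every later day, which is why 2017-08-02 does not include the value from the day before.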