Query to get user count for every month and previous month - sql

We have the following activity table and would like to query it to get the number of unique users for each month and the previous month. The date field (createdat) is a timestamp. The query needs to work in PostgreSQL.
Activity table:
| id     | userid | createdat               | username       |
|--------|--------|-------------------------|----------------|
| 1d658a | 4957f3 | 2016-12-06 21:16:35.942 | Tom Jones      |
| 3a86e3 | 684edf | 2016-12-03 21:16:35.943 | Harry Smith    |
| 595756 | 582107 | 2016-12-26 21:16:35.944 | William Hanson |
| 2c87fe | 784723 | 2016-12-07 21:16:35.945 | April Cordon   |
| 32509a | 4957f3 | 2016-12-20 21:16:35.946 | Tom Jones      |
| 72e703 | 582107 | 2017-01-01 21:16:35.947 | William Hanson |
| 6d658a | 582107 | 2016-12-06 21:16:35.948 | William Hanson |
| 5c077c | 5934c4 | 2016-12-06 21:16:35.949 | Sandra Holmes  |
| 92142b | 57ea5c | 2016-12-15 21:16:35.950 | Lucy Lawless   |
| 3dd0a6 | 5934c4 | 2016-12-04 21:16:35.951 | Sandra Holmes  |
| 43509a | 4957f3 | 2016-11-20 21:16:35.946 | Tom Jones      |
| 85142b | 57ea5c | 2016-11-15 21:16:35.950 | Lucy Lawless   |
| 7c87fe | 784723 | 2017-01-07 21:16:35.945 | April Cordon   |
| 9c87fe | 784723 | 2017-02-07 21:16:35.946 | April Cordon   |
Results:
| Month    | UserThisMonth | UserPreviousMonth |
|----------|---------------|-------------------|
| Dec 2016 | 6             | 2                 |
| Jan 2017 | 2             | 6                 |
| Feb 2017 | 1             | 2                 |

You can try this query: it uses to_char to get the MON YYYY format, and a subquery with the lag window function to get the UserPreviousMonth count.
SELECT *
FROM (SELECT to_char(createdat, 'MON YYYY') AS months,
             count(DISTINCT username)       AS userthismonth,
             lag(count(DISTINCT username)) OVER (
               ORDER BY date_part('year', createdat),
                        date_part('month', createdat)
             ) AS userpreviousmonth
      FROM t
      GROUP BY date_part('year', createdat),
               to_char(createdat, 'MON YYYY'),
               date_part('month', createdat)) t
WHERE userpreviousmonth IS NOT NULL
SQLFiddle: http://sqlfiddle.com/#!15/45e52/2
| months   | userthismonth | userpreviousmonth |
|----------|---------------|-------------------|
| DEC 2016 | 6             | 2                 |
| JAN 2017 | 2             | 6                 |
| FEB 2017 | 1             | 2                 |
EDIT
Values like Dec 2016 and Jan 2017 must be strings, because a date/timestamp type needs a complete date such as 2017-01-01. If the result needs to be sorted and used on a graph, I suggest sorting this query on the year and month columns, then building the date string on the front end.
SELECT *
FROM (SELECT date_part('year', createdat)  AS years,
             date_part('month', createdat) AS months,
             count(DISTINCT username)      AS userthismonth,
             lag(count(DISTINCT username)) OVER (
               ORDER BY date_part('year', createdat),
                        date_part('month', createdat)
             ) AS userpreviousmonth
      FROM user_activity
      GROUP BY date_part('year', createdat),
               date_part('month', createdat)) t
WHERE userpreviousmonth IS NOT NULL
SQLFiddle: http://sqlfiddle.com/#!15/2da2b/4
| years | months | userthismonth | userpreviousmonth |
|-------|--------|---------------|-------------------|
| 2016  | 12     | 6             | 2                 |
| 2017  | 1      | 2             | 6                 |
| 2017  | 2      | 1             | 2                 |

Fastest and simplest with date_trunc(). Use to_char() once to display the month in your preferred format:
WITH cte AS (
   SELECT date_trunc('month', createdat) AS mon
        , count(DISTINCT username) AS ct
   FROM   activity
   GROUP  BY 1
   )
SELECT to_char(t1.mon, 'MON YYYY') AS month
     , t1.ct AS users_this_month
     , t2.ct AS users_previous_month
FROM   cte t1
LEFT   JOIN cte t2 ON t2.mon = t1.mon - interval '1 mon'
ORDER  BY t1.mon;
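With the sample data above, this returns:

| month    | users_this_month | users_previous_month |
|----------|------------------|----------------------|
| NOV 2016 | 2                | NULL                 |
| DEC 2016 | 6                | 2                    |
| JAN 2017 | 2                | 6                    |
| FEB 2017 | 1                | 2                    |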
You commented:
the "Month" field in the results table needs to be a "date" data type so it can be sorted and used on the graph.
For this, simply cast in the final SELECT:
SELECT t1.mon::date AS month ...
Grouping and ordering by a (truncated) timestamp value is more efficient (and reliable) than by multiple values or a text representation.
The result includes the first month ('NOV 2016' in your demo), showing NULL for users_previous_month, as it does for any month whose previous month has no entries. You might want to display 0 instead or drop the row ...
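For instance, to display 0 instead, a minimal variation of the final SELECT above (same CTE):

SELECT to_char(t1.mon, 'MON YYYY') AS month
     , t1.ct AS users_this_month
     , COALESCE(t2.ct, 0) AS users_previous_month  -- 0 instead of NULL
FROM   cte t1
LEFT   JOIN cte t2 ON t2.mon = t1.mon - interval '1 mon'
ORDER  BY t1.mon;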
Related:
How to get the date and time from timestamp in PostgreSQL select query?
PostgreSQL: running count of rows for a query 'by minute'
Aside: usernames in the form of "Tom Jones" are typically not unique. You'll want to operate with a unique ID instead.
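For example, the CTE above could count the userid column from the question instead of username:

SELECT date_trunc('month', createdat) AS mon
     , count(DISTINCT userid) AS ct   -- unique ID instead of username
FROM   activity
GROUP  BY 1;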

Edit:
Shamelessly using @D-Shih's superior method of generating year/month combinations.
A couple of solutions:
WITH ua AS (
    SELECT
        TO_CHAR(createdate, 'YYYYMM') AS year_month,
        COUNT(DISTINCT userid)        AS distinct_users
    FROM user_activity
    GROUP BY
        TO_CHAR(createdate, 'YYYYMM')
)
SELECT * FROM (
    SELECT
        TO_DATE(ua.year_month || '01', 'YYYYMMDD')
            + INTERVAL '1 month'
            - INTERVAL '1 day' AS month_end,
        ua.distinct_users,
        LAG(ua.distinct_users) OVER (ORDER BY ua.year_month) AS distinct_users_last_month
    FROM ua
) uas
WHERE uas.distinct_users_last_month IS NOT NULL
ORDER BY month_end DESC;
No windowing required:
WITH ua AS (
    SELECT
        TO_CHAR(createdate, 'YYYYMM')                      AS year_month,
        TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM') AS last_month,
        COUNT(DISTINCT userid)                             AS distinct_users
    FROM user_activity
    GROUP BY
        TO_CHAR(createdate, 'YYYYMM'),
        TO_CHAR(createdate - INTERVAL '1 MONTH', 'YYYYMM')
)
SELECT
    TO_DATE(ua1.year_month || '01', 'YYYYMMDD')
        + INTERVAL '1 month'
        - INTERVAL '1 day' AS month_end,
    ua1.distinct_users,
    ua2.distinct_users AS last_distinct_users
FROM ua ua1
LEFT OUTER JOIN ua ua2
    ON ua1.year_month = ua2.last_month
WHERE ua2.distinct_users IS NOT NULL
ORDER BY ua1.year_month DESC;
DDL:
CREATE TABLE user_activity (
    id         varchar(50),
    userid     varchar(50),
    createdate timestamp,
    username   varchar(50)
);
COMMIT;
Data:
INSERT INTO user_activity VALUES ('1d658a','4957f3','20161206 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('3a86e3','684edf','20161203 21:16:35'::timestamp,'Harry Smith');
INSERT INTO user_activity VALUES ('595756','582107','20161226 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('2c87fe','784723','20161207 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('32509a','4957f3','20161220 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('72e703','582107','20170101 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('6d658a','582107','20161206 21:16:35'::timestamp,'William Hanson');
INSERT INTO user_activity VALUES ('5c077c','5934c4','20161206 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('92142b','57ea5c','20161215 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('3dd0a6','5934c4','20161204 21:16:35'::timestamp,'Sandra Holmes');
INSERT INTO user_activity VALUES ('43509a','4957f3','20161120 21:16:35'::timestamp,'Tom Jones');
INSERT INTO user_activity VALUES ('85142b','57ea5c','20161115 21:16:35'::timestamp,'Lucy Lawless');
INSERT INTO user_activity VALUES ('7c87fe','784723','20170107 21:16:35'::timestamp,'April Cordon');
INSERT INTO user_activity VALUES ('9c87fe','784723','20170207 21:16:35'::timestamp,'April Cordon');
COMMIT;

Related

Get rows within current month

Get the rows from a table filtered by the current month in PostgreSQL.
Example table -
+----+------+------------------------+
| id | name | registration_date      |
+----+------+------------------------+
| 1  | abc  | 2018-12-31 18:30:00+00 |
| 2  | pqr  | 2018-11-31 18:30:00+00 |
| 3  | lmn  | 2020-07-10 18:30:00+00 |
+----+------+------------------------+
Expected result, assuming the current month is July 2020:
+----+------+------------------------+
| id | name | registration_date      |
+----+------+------------------------+
| 3  | lmn  | 2020-07-10 18:30:00+00 |
+----+------+------------------------+
The usual approach is a range query, which allows index usage on registration_date:
select *
from the_table
where registration_date >= date_trunc('month', current_date)
and registration_date < date_trunc('month', current_date) + interval '1 month';
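If this query runs often, a plain B-tree index on registration_date supports that range scan (a sketch; the index name is illustrative):

create index the_table_registration_date_idx
    on the_table (registration_date);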
In this approach, I use the extract function (see the PostgreSQL documentation for more details on extract). Note that, unlike the range query above, these expressions cannot use a plain index on registration_date:
SELECT *
FROM the_table
WHERE extract(month from registration_date) = extract(month from current_date)
  AND extract(year from registration_date) = extract(year from current_date)
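If you do want the extract version to be index-assisted, a matching expression index is possible (a sketch; this assumes registration_date is timestamp without time zone, since extract on timestamptz is not immutable and cannot be indexed as-is):

create index the_table_reg_year_month_idx
    on the_table ((extract(year from registration_date)),
                  (extract(month from registration_date)));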

How can I insert data from one table into another table provided some specific conditions are satisfied

Logic: if today is Monday (per the 'time' table), the data in S should be inserted into M, along with a sent_day column holding today's date.
If today is not Monday, the dates of the current week (same week_id) should be checked against M. If any of those dates are present in M, S should not be inserted into M; if none of them are, S should be inserted into M.
time
+------------+-----------+---------+
| cal_dt     | cal_day   | week_id |
+------------+-----------+---------+
| 2020-03-23 | Monday    | 123     |
| 2020-03-24 | Tuesday   | 123     |
| 2020-03-25 | Wednesday | 123     |
| 2020-03-26 | Thursday  | 123     |
| 2020-03-27 | Friday    | 123     |
| 2020-03-30 | Monday    | 124     |
| 2020-03-31 | Tuesday   | 124     |
+------------+-----------+---------+
M
+------------+----------+-------+
| sent_day   | item     | price |
+------------+----------+-------+
| 2020-03-11 | pen      | 10    |
| 2020-03-11 | book     | 50    |
| 2020-03-13 | Eraser   | 5     |
| 2020-03-13 | sharpner | 5     |
+------------+----------+-------+
S
+----------+-------+
| item     | price |
+----------+-------+
| pen      | 25    |
| book     | 20    |
| Eraser   | 10    |
| sharpner | 3     |
+----------+-------+
My attempt so far:
INSERT INTO M
SELECT
    CASE WHEN (SELECT cal_day FROM time WHERE cal_dt = current_date) = 'Monday' THEN s.*
    ELSE
        (CASE WHEN (SELECT cal_dt FROM time WHERE wk_id = (SELECT wk_id FROM time WHERE cal_dt = current_date)) NOT IN (SELECT DISTINCT sent_day FROM M) THEN 1 ELSE 0 END)
    THEN s.* ELSE END
FROM s
I would do this in two separate INSERT statements:
The first condition ("if today is monday") is quite easy:
insert into m (sent_day, item, price)
select current_date, item, price
from s
where exists (select *
              from "time"
              where cal_dt = current_date
                and cal_day = 'Monday');
I find storing both the date and the week day a bit confusing, as the week day can easily be extracted from the date. For the test "if today is Monday" it's actually not necessary to consult the "time" table at all:
insert into m (sent_day, item, price)
select current_date, item, price
from s
where extract(dow from current_date) = 1;
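Here extract(dow from ...) returns the day of the week as 0 (Sunday) through 6 (Saturday), so 1 is Monday. A quick check against the time table above, which lists 2020-03-23 as a Monday:

select extract(dow from date '2020-03-23');  -- returns 1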
The second part is a bit more complicated, but if I understand it correctly, it should be something like this:
insert into m (sent_day, item, price)
select current_date, item, price
from s
where not exists (select *
                  from m
                  where m.sent_day in (select cal_dt
                                       from "time" t
                                       where cal_dt = current_date
                                         and cal_day <> 'Monday'));
If you just want a single INSERT statement, you could simply do a UNION ALL between the two selects:
insert into m (sent_day, item, price)
select current_date, item, price
from s
where extract(dow from current_date) = 1
union all
select current_date, item, price
from s
where not exists (select *
                  from m
                  where m.sent_day in (select cal_dt
                                       from "time" t
                                       where cal_dt = current_date
                                         and cal_day <> 'Monday'));

postgresql - cumulative sum of active customers by month (removing churn)

I want to create a query to get the cumulative sum by month of our active customers. The tricky thing here is that (unfortunately) some customers churn and so I need to remove them from the cumulative sum on the month they leave us.
Here is a sample of my customers table:
customer_id | begin_date | end_date
------------+------------+------------
1           | 15/09/2017 |
2           | 15/09/2017 |
3           | 19/09/2017 |
4           | 23/09/2017 |
5           | 27/09/2017 |
6           | 28/09/2017 | 15/10/2017
7           | 29/09/2017 | 16/10/2017
8           | 04/10/2017 |
9           | 04/10/2017 |
10          | 05/10/2017 |
11          | 07/10/2017 |
12          | 09/10/2017 |
13          | 11/10/2017 |
14          | 12/10/2017 |
15          | 14/10/2017 |
Here is what I am looking to achieve:
month   | active customers
--------+------------------
2017-09 | 7
2017-10 | 6
I've managed to achieve it with the following query ... However, I'd like to know if there is a better way.
select
    "begin_date" as "date",
    sum(new_customers.new_customers - COALESCE(churn_customers.churn_customers, 0))
        OVER (ORDER BY new_customers."begin_date") as active_customers
FROM (
    select
        date_trunc('month', begin_date)::date as "begin_date",
        count(id) as new_customers
    from customers
    group by 1
) as new_customers
LEFT JOIN (
    select
        date_trunc('month', end_date)::date as "end_date",
        count(id) as churn_customers
    from customers
    where end_date is not null
    group by 1
) as churn_customers
    on new_customers."begin_date" = churn_customers."end_date"
order by 1;
You may use a CTE to compute the totals of end_date per month and then subtract them from the counts of begin dates by using a left join.
Query 1:
WITH edt AS (
     SELECT to_char(end_date, 'yyyy-mm') AS mon
          , count(*) AS ct
     FROM customers
     WHERE end_date IS NOT NULL
     GROUP BY to_char(end_date, 'yyyy-mm')
     )
SELECT to_char(c.begin_date, 'yyyy-mm') AS month
     , COUNT(*) - MAX(COALESCE(ct, 0)) AS active_customers
FROM customers c
LEFT JOIN edt ON to_char(c.begin_date, 'yyyy-mm') = edt.mon
GROUP BY to_char(begin_date, 'yyyy-mm')
ORDER BY month;
Results:
| month   | active_customers |
|---------|------------------|
| 2017-09 | 7                |
| 2017-10 | 6                |

Rolling 90-day active users in BigQuery, improving performance (DAU/MAU/WAU)

I'm trying to get the number of unique events on a specific date, rolling 90/30/7 days back. I've got this working on a limited number of rows with the query below, but for large data sets I get memory errors because the aggregated string becomes massive.
I'm looking for a more effective way of achieving the same result.
Table looks something like this:
+----+------------+--------+
|    | date       | userid |
+----+------------+--------+
| 1  | 2013-05-14 | xxxxx  |
| 2  | 2017-03-14 | xxxxx  |
| 3  | 2018-01-24 | xxxxx  |
| 4  | 2013-03-21 | xxxxx  |
| 5  | 2014-03-19 | xxxxx  |
| 6  | 2015-09-03 | xxxxx  |
| 7  | 2014-02-06 | xxxxx  |
| 8  | 2014-10-30 | xxxxx  |
| .. | ...        | ...    |
+----+------------+--------+
Format of the desired result:
+----+------------+---------------------+----------------------+
|    | date       | active_users_7_days | active_users_90_days |
+----+------------+---------------------+----------------------+
| 1  | 2013-05-14 | 1240                | 34339                |
| 2  | 2017-03-14 | 4334                | 54343                |
| 3  | 2018-01-24 | .....               | .....                |
| 4  | 2013-03-21 | .....               | .....                |
| 5  | 2014-03-19 | .....               | .....                |
| 6  | 2015-09-03 | .....               | .....                |
| 7  | 2014-02-06 | .....               | .....                |
| 8  | 2014-10-30 | .....               | .....                |
| .. | ...        | .....               | .....                |
+----+------------+---------------------+----------------------+
My query looks like this:
#standardSQL
WITH T1 AS (
  SELECT
    date,
    STRING_AGG(DISTINCT userid) AS IDs
  FROM `consumer.events`
  GROUP BY date
),
T2 AS (
  SELECT
    date,
    STRING_AGG(IDs) OVER (
      ORDER BY UNIX_DATE(date)
      RANGE BETWEEN 90 PRECEDING AND CURRENT ROW
    ) AS IDs
  FROM T1
)
SELECT
  date,
  (SELECT COUNT(DISTINCT userid)
   FROM UNNEST(SPLIT(IDs)) AS userid) AS NinetyDays
FROM T2
Counting unique users requires a lot of resources, even more if you want results over a rolling window. For a scalable solution, look into approximate algorithms like HLL++:
https://medium.freecodecamp.org/counting-uniques-faster-in-bigquery-with-hyperloglog-5d3764493a5a
For an exact count, this would work (but gets slower as the window gets larger):
#standardSQL
SELECT DATE_SUB(date, INTERVAL i DAY) date_grp
, COUNT(DISTINCT owner_user_id) unique_90_day_users
, COUNT(DISTINCT IF(i<31,owner_user_id,null)) unique_30_day_users
, COUNT(DISTINCT IF(i<8,owner_user_id,null)) unique_7_day_users
FROM (
SELECT DATE(creation_date) date, owner_user_id
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE EXTRACT(YEAR FROM creation_date)=2017
GROUP BY 1, 2
), UNNEST(GENERATE_ARRAY(1, 90)) i
GROUP BY 1
ORDER BY date_grp
The approximate solution produces results way faster (14s vs 366s, but then the results are approximate):
#standardSQL
SELECT DATE_SUB(date, INTERVAL i DAY) date_grp
, HLL_COUNT.MERGE(sketch) unique_90_day_users
, HLL_COUNT.MERGE(DISTINCT IF(i<31,sketch,null)) unique_30_day_users
, HLL_COUNT.MERGE(DISTINCT IF(i<8,sketch,null)) unique_7_day_users
FROM (
SELECT DATE(creation_date) date, HLL_COUNT.INIT(owner_user_id) sketch
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE EXTRACT(YEAR FROM creation_date)=2017
GROUP BY 1
), UNNEST(GENERATE_ARRAY(1, 90)) i
GROUP BY 1
ORDER BY date_grp
Updated query that gives correct results by removing rows covering fewer than 90 days (works when no dates are missing):
#standardSQL
SELECT DATE_SUB(date, INTERVAL i DAY) date_grp
, HLL_COUNT.MERGE(sketch) unique_90_day_users
, HLL_COUNT.MERGE(DISTINCT IF(i<31,sketch,null)) unique_30_day_users
, HLL_COUNT.MERGE(DISTINCT IF(i<8,sketch,null)) unique_7_day_users
, COUNT(*) window_days
FROM (
SELECT DATE(creation_date) date, HLL_COUNT.INIT(owner_user_id) sketch
FROM `bigquery-public-data.stackoverflow.posts_questions`
WHERE EXTRACT(YEAR FROM creation_date)=2017
GROUP BY 1
), UNNEST(GENERATE_ARRAY(1, 90)) i
GROUP BY 1
HAVING window_days=90
ORDER BY date_grp
You can aggregate the date and do the sum. What is the aggregation? Take the most recent date:
select count(*) as num_users,
       sum(case when max_date > date_sub(current_date, interval 30 day) then 1 else 0 end) as num_users_30days,
       sum(case when max_date > date_sub(current_date, interval 60 day) then 1 else 0 end) as num_users_60days,
       sum(case when max_date > date_sub(current_date, interval 90 day) then 1 else 0 end) as num_users_90days
from (select userid, max(date) as max_date
      from `consumer.events` e
      group by userid
     ) e;
If the most recent date for the user is in the period, then the user should be counted.
You can get this "as-of" a particular date by using a where clause in the subquery.
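A sketch of that as-of variant (the cutoff date 2018-06-30 is just an illustration):

select count(*) as num_users,
       sum(case when max_date > date_sub(date '2018-06-30', interval 30 day) then 1 else 0 end) as num_users_30days
from (select userid, max(date) as max_date
      from `consumer.events`
      where date <= date '2018-06-30'  -- only consider activity up to the as-of date
      group by userid
     ) e;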

Querying all past and future round birthdays

I have the birthdates of users in a table and want to display a list of round birthdays for the next n years (starting from an arbitrary date x), which looks like this:
+--------+----+------------+-------------+------------+------+-------+-----+-------------+
| Name   | id | birthdate  | current_age | birthday   | year | month | day | age_at_date |
+--------+----+------------+-------------+------------+------+-------+-----+-------------+
| User 1 | 1  | 1958-01-23 | 59          | 2013-01-23 | 2013 | 1     | 23  | 55          |
| User 2 | 2  | 1988-01-29 | 29          | 2013-01-29 | 2013 | 1     | 29  | 25          |
| User 3 | 3  | 1963-02-12 | 54          | 2013-02-12 | 2013 | 2     | 12  | 50          |
| User 1 | 1  | 1958-01-23 | 59          | 2018-01-23 | 2018 | 1     | 23  | 60          |
| User 2 | 2  | 1988-01-29 | 29          | 2018-01-29 | 2018 | 1     | 29  | 30          |
| User 3 | 3  | 1963-02-12 | 54          | 2018-02-12 | 2018 | 2     | 12  | 55          |
| User 1 | 1  | 1958-01-23 | 59          | 2023-01-23 | 2023 | 1     | 23  | 65          |
| User 2 | 2  | 1988-01-29 | 29          | 2023-01-29 | 2023 | 1     | 29  | 35          |
| User 3 | 3  | 1963-02-12 | 54          | 2023-02-12 | 2023 | 2     | 12  | 60          |
+--------+----+------------+-------------+------------+------+-------+-----+-------------+
As you can see, I want to "wrap around" and not only show the next upcoming round birthday, which is easy, but also historical and far-future data.
The core idea of my current approach is the following: via generate_series I generate all dates from 1900 till 2100 and join them to the users by matching the day and month of the birthdate. Based on that, I calculate the age at each date and finally select only those birthdays which are round (divisible by 5) and yield a nonnegative age.
WITH test_users(id, name, birthdate) AS (
    VALUES
        (1, 'User 1', '23-01-1958' :: DATE),
        (2, 'User 2', '29-01-1988'),
        (3, 'User 3', '12-02-1963')
),
dates AS (
    SELECT
        s AS date,
        date_part('year', s)  AS year,
        date_part('month', s) AS month,
        date_part('day', s)   AS day
    FROM generate_series('01-01-1900' :: TIMESTAMP, '01-01-2100' :: TIMESTAMP, '1 days' :: INTERVAL) AS s
),
birthday_data AS (
    SELECT
        id AS member_id,
        test_users.birthdate AS birthdate,
        (date_part('year', age(test_users.birthdate))) :: INT AS current_age,
        date :: DATE AS birthday,
        date_part('year', date)  AS year,
        date_part('month', date) AS month,
        date_part('day', date)   AS day,
        ROUND(extract(EPOCH FROM (dates.date - birthdate)) / (60 * 60 * 24 * 365)) :: INT AS age_at_date
    FROM test_users, dates
    WHERE
        dates.day = date_part('day', birthdate) AND
        dates.month = date_part('month', birthdate) AND
        dates.year >= date_part('year', birthdate)
)
SELECT
    test_users.name,
    bd.*
FROM test_users
LEFT JOIN birthday_data bd ON bd.member_id = test_users.id
WHERE
    bd.age_at_date % 5 = 0 AND
    bd.birthday BETWEEN NOW() - INTERVAL '5' YEAR AND NOW() + INTERVAL '10' YEAR
ORDER BY bd.birthday;
My current approach seems to be very inefficient and rather complicated: it takes >100 ms. Does anybody have an idea for a more compact and performant query? I am using PostgreSQL 9.5.3. Thank you!
Maybe try a lateral join to generate_series:
create table bday(id serial, name text, dob date);
insert into bday (name, dob) values ('a', '08-21-1972'::date);
insert into bday (name, dob) values ('b', '03-20-1974'::date);

select *
from bday,
     lateral( select generate_series( (1950-y)/5, (2010-y)/5 )*5 + y as year
              from (select date_part('year', dob)::integer as y) as t2
            ) as t1;
This will, for each entry, generate the round-birthday years between 1950 and 2010.
You can add a where clause to exclude people born after 2010 (they can't have a round birthday in that range), or people born before 1850 (they are unlikely...).
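A sketch of such a filter, reusing the query above (the bounds 1850 and 2010 come from the suggestion just mentioned):

select *
from bday,
     lateral( select generate_series( (1950-y)/5, (2010-y)/5 )*5 + y as year
              from (select date_part('year', dob)::integer as y) as t2
            ) as t1
where date_part('year', dob) between 1850 and 2010;  -- drop people outside the plausible range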
--
Edit (after your edit):
So your generate_series creates 365+ rows per year. In 100 years that is over 36,000, and they get joined to each user (3 users => over 100,000 rows).
My query generates rows only for the years needed: in 100 years that is 20 rows.
That means 20 rows per user.
Dividing by 5 (and later multiplying by 5) ensures that each generated year is a round birthday.
(1950-y)/5 calculates how many round birthdays there were before 1950.
A person born in 1941 needs to skip 1941 and 1946, but has a round birthday in 1951. So that is the difference (9 years) divided by 5 with integer division, plus 1 to account for the 0th round birthday (the birth year itself): greatest(-1, (1950-1941)/5) + 1 = 2, and 2*5 + 1941 = 1951.
If the person is born after 1950, the number is negative, and greatest(-1, ...) + 1 gives 0, so the series starts at the actual birth year.
But actually it should be
select *
from bday,
     lateral( select generate_series( greatest(-1, (1950-y)/5) + 1, (2010-y)/5 )*5 + y as year
              from (select date_part('year', dob)::integer as y) as t2
            ) as t1;
(you may use greatest(0, ...) + 1 instead if you want to start at age 5 rather than the birth year)