Fetch records from current week only - sql

I made a Discord bot that records members' contributions/posts in a database, in the following table.
CREATE TABLE IF NOT EXISTS posts (
id integer GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
...
post_date timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP
);
At the end of each week I'd manually run a command to fetch the member who had the most contributions, so I could give them an award. To do that I created the following view, which worked fine since it was for personal use only and I ran it on the same day every week.
CREATE VIEW vw_posts AS
SELECT guild_id, account_id, COUNT(*) AS posts
FROM public.posts
WHERE post_date > CURRENT_TIMESTAMP - INTERVAL '1 week'
GROUP BY(guild_id, account_id)
ORDER BY posts DESC;
Now I'm adding a new command to show a weekly leaderboard. After creating it, I quickly realized that my view fetches a rolling 7-day interval rather than the current week, so it also picks up data from the previous week.
I'm getting results like the red line, but I'd like the view to act as the green one.
I did a bit of research, but most posts suggest using date_trunc() or similar functions, which I thought wouldn't let me get the rest of the data. I'm still struggling to write the query even after reading the documentation.
Thanks for any advice!

For the current week use:
CREATE VIEW vw_posts AS
SELECT guild_id, account_id, COUNT(*) AS posts
FROM public.posts
WHERE post_date >= DATE_TRUNC('week', CURRENT_TIMESTAMP)
GROUP BY guild_id, account_id
ORDER BY posts DESC;
For the previous week:
CREATE VIEW vw_posts AS
SELECT guild_id, account_id, COUNT(*) AS posts
FROM public.posts
WHERE post_date >= DATE_TRUNC('week', CURRENT_TIMESTAMP) - INTERVAL '1 week' AND
post_date < DATE_TRUNC('week', CURRENT_TIMESTAMP)
GROUP BY guild_id, account_id
ORDER BY posts DESC;
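A small sketch of where the week boundary falls, in case it helps: in PostgreSQL, date_trunc('week', ...) truncates to Monday 00:00 (ISO week), and for timestamptz values it does so in the session time zone. The literal timestamps below are made up for illustration, and the Sunday-based variant is only one possible workaround, not part of the answer above.

```sql
-- Illustrative literals only; 2024-05-15 is a Wednesday.
SELECT date_trunc('week', timestamp '2024-05-15 13:45:00') AS monday_week_start;
-- 2024-05-13 00:00:00

-- If the week should start on Sunday instead, shift forward one day
-- before truncating and shift back afterwards.
SELECT date_trunc('week', timestamp '2024-05-15 13:45:00' + interval '1 day')
       - interval '1 day' AS sunday_week_start;
-- 2024-05-12 00:00:00
```

With post_date stored as timestamptz, the boundary used by the view follows the time zone of the session that queries it, so a bot running in UTC and a client running in local time may see slightly different weeks.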

Related

Get the user id list from the beginning of the month

I'm trying to get the user_id list of all users who purchased in the last month and never purchased before that. I tried using DATE_TRUNC, but it shows no data. Any idea?
SELECT DISTINCT user_id
FROM `gcommerce-analytics-prod.grouponi_groupon.tb_orders`
WHERE cast(on_date_ts AS DATE) BETWEEN CURRENT_DATE()
AND DATE_TRUNC(CURRENT_DATE(),month)
AND user_id NOT IN (
SELECT DISTINCT user_id
FROM `gcommerce-analytics-prod.grouponi_groupon.tb_orders`
WHERE cast(on_date_ts AS DATE) > '2010-01-01'
AND cast(on_date_ts AS DATE) < DATE_TRUNC(CURRENT_DATE(), month)
)
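One likely reason the query above returns nothing: BETWEEN expects the lower bound first, and here the later date (CURRENT_DATE()) comes before the earlier one (the start of the month), so the range is empty. A hedged sketch with the bounds swapped follows. It assumes "last month" means "since the beginning of the current month", as the title suggests, and it adds an IS NOT NULL guard (not in the original) because NOT IN matches no rows when the subquery yields a NULL.

```sql
SELECT DISTINCT user_id
FROM `gcommerce-analytics-prod.grouponi_groupon.tb_orders`
-- Lower bound first: start of the current month up to today.
WHERE CAST(on_date_ts AS DATE) BETWEEN DATE_TRUNC(CURRENT_DATE(), MONTH)
                                   AND CURRENT_DATE()
  AND user_id NOT IN (
    SELECT user_id
    FROM `gcommerce-analytics-prod.grouponi_groupon.tb_orders`
    WHERE CAST(on_date_ts AS DATE) > '2010-01-01'
      AND CAST(on_date_ts AS DATE) < DATE_TRUNC(CURRENT_DATE(), MONTH)
      AND user_id IS NOT NULL  -- assumption: guards against NULLs in NOT IN
  );
```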

How to find consecutive user logins across weeks

I'm fairly new to SQL, and maybe the complexity of this report is above my pay grade.
I need help finding the list of users who log in to the app every week, without a gap, in a chosen time period (this logic eventually needs to be extended to month, quarter and year, but a week is good for now).
Table structure for ref
events: User_id int, login_date timestamp
The table events can have one or more entries per user, which means a user can log in to the app multiple times. To shed some light: if we focus on Jan 2020 to Mar 2020, I need the following in the output:
user_id of users who logged into the app at least once every week from 2020Wk1 to 2020Wk14
the week they logged in
number of times they logged in that week
I'm also okay if the output of the query is just the user_id. The thing is I'm unable to make sense out of the output that I'm seeing on my end after trying the following SQL code, perhaps working on this problem for so long might be the reason for that!
SQL code tried so far:
SELECT DISTINCT user_id
,extract('year' FROM timestamp)||'Wk'|| extract('week' FROM timestamp)
,lead(extract('week' FROM timestamp)) over (partition by user_id, extract('week' FROM timestamp) order by extract('week' FROM timestamp))
FROM events
WHERE user_id = 'Anything that u wish to enter'
You can get the summary you want as:
select user_id, date_trunc('week', timestamp) as week, count(*)
from events
group by user_id, week;
But the filtering is trickier. It is better to go with dates rather than week numbers:
select user_id, date_trunc('week', timestamp) as week, count(*) as cnt,
count(*) over (partition by user_id) as num_weeks
from events
where timestamp >= ? and timestamp < ?
group by user_id, week;
Then you can use a subquery:
select uw.*
from (select user_id, date_trunc('week', timestamp) as week, count(*) as cnt,
count(*) over (partition by user_id) as num_weeks
from events
where timestamp >= ? and timestamp < ?
group by user_id, week
) uw
where num_weeks = ? -- 14 in your example
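If only the user_id is needed, a more compact variant is possible. This is a sketch under assumptions, not the query from the answer above: it uses the column name login_date from the table structure in the question, and it fills the placeholders with the Monday that starts ISO week 1 of 2020 (2019-12-30) and the Monday after week 14 (2020-04-06), which span exactly 14 weeks.

```sql
-- Users with at least one login in each of the 14 weeks of the range.
SELECT user_id
FROM events
WHERE login_date >= DATE '2019-12-30'   -- Monday of 2020 week 1
  AND login_date <  DATE '2020-04-06'   -- Monday after 2020 week 14
GROUP BY user_id
HAVING count(DISTINCT date_trunc('week', login_date)) = 14;
```

Counting distinct truncated weeks avoids relying on week numbers, which restart at year boundaries.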

Get 30 days prior data for each row of query

I have a query that returns a list of ~20k users who logged on to our site during a specific week of the month.
What I need, for each of these users, is whether in the previous 30 days they have:
1. logged on: defined by any rows recorded in the same table
2. max event in the 30 day window, prior to the date in the current where clause
This is the current code snippet that helps me narrow to the ~20k users for a given week to begin with:
select
user_id,
max(timestamp)
from table
where timestamp between '2019-02-01' and '2019-02-05'
group by 1;
Expected result set/columns:
user_id,
max(timestamp),
logged_on, [if they have any # of rows in the same table within 30 days prior to their max(timestamp) date]
previous_timestamp, [the 2nd most recent login date within 30 days prior to their max(timestamp) date]
I think this is what you're looking for. I'm not sure it's the most efficient method, though. Window functions may perform better, but as bob-mccormick mentioned, the tricky bit would be filling in dates where the user (the partition key) was not active so that the range query works correctly.
Example data setup (Snowflake syntax)
-- Create sample table
create temporary table user_logins (userid number, date_logged_on timestamp);
-- Insert some random sample data
insert overwrite into user_logins
select
uniform(1,10,random()) userid,
dateadd('minutes', uniform(1,86400,random()) * -1,current_timestamp::timestamp_ntz) date_logged_on
from table(generator(rowcount => 100))
;
Select statement
-- Run select
with user_last_logins as (
select
userid,
max(date_logged_on) last_login
from user_logins
where
date_logged_on between '2019-01-01' and '2019-05-08'
group by userid
)
select
user_last_logins.userid,
max(user_last_logins.last_login) last_logged_on,
count(prior_30_each_user.userid) num_logins_prior_30,
max(prior_30_each_user.date_logged_on) previous_timestamp
from user_last_logins
left join user_logins prior_30_each_user
on user_last_logins.userid = prior_30_each_user.userid
and prior_30_each_user.date_logged_on > dateadd('day', -30, user_last_logins.last_login)
and prior_30_each_user.date_logged_on < user_last_logins.last_login
group by user_last_logins.userid
;

How to perform a query in PostgreSQL that returns a count of rows created, grouped by month?

In PostgreSQL, how do I perform a query that returns the number of rows created in a particular table, by month? I would like the result to be something like:
month: January
count: 67
month: February
count: 85
....
....
Let's suppose I have a table, users. This table has a primary key, id, and a created_at column with time stored in ISO 8601 format. Last year some number of users were created, and now I want to know how many were created each month. I want the data returned in the above format: grouped by month, with an associated count of how many users were created that month.
Does anyone know how to perform the above SQL query in PostgreSQL?
The query would look something like this:
select date_trunc('month', created_at) as mm, count(*)
from users u
where subscribed = true and
created_at >= '2016-01-01' and
created_at < '2017-01-01'
group by date_trunc('month', created_at);
I don't know where the constant '2017-03-20 13:38:46.688-04' is coming from.
Of course you can make the year comparison dynamic:
select date_trunc('month', created_at) as mm, count(*)
from users u
where subscribed = true and
created_at >= date_trunc('year', now()) - interval '1 year' and
created_at < date_trunc('year', now())
group by date_trunc('month', created_at);
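To get month names such as "January" in the output, as in the format the question asks for, the truncated date can be passed through to_char(). This is a sketch rather than part of the answer above; it assumes created_at is a timestamp (or timestamptz) column, leaves out the subscribed filter, and the alias user_count is made up for illustration.

```sql
SELECT to_char(date_trunc('month', created_at), 'FMMonth') AS month,  -- FM strips padding
       count(*) AS user_count
FROM users
WHERE created_at >= date_trunc('year', now()) - interval '1 year'
  AND created_at <  date_trunc('year', now())
GROUP BY date_trunc('month', created_at)
ORDER BY date_trunc('month', created_at);  -- calendar order, not alphabetical
```

Grouping and ordering by the truncated date rather than by the name keeps the months in calendar order.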

Select one row per day for each value

I have a SQL query in PostgreSQL 9.4 that, while more complex due to the tables I am pulling data from, boils down to the following:
SELECT entry_date, user_id, <other_stuff>
FROM <tables, joins, etc>
WHERE <whatever limits I want, such as limiting the date range or users>
GROUP BY entry_date, user_id
With the result that I have one row per user, per day for which I have data. In general, this query would be run for an entry_date period of one month, with the desired result of having one row per day of the month for each user.
The problem is that there may not be data for every user every day of the month, and this query only returns rows for days that have data.
Is there some way to modify this query so it returns one row per day for each user, even if there is no data (other than the date and the user) in some of the rows?
I tried doing a join with a generate_series(), but that didn't work - it can make there be no missing days, but not per user. What I really need would be something like "for each user in list, generate series of (user,date) records"
EDIT: To clarify, the final result that I am looking for would be that for each user in the database - defined as a record in a user table - I want one row per date. So if I specify a date range of 5/1/15-5/31/15 in my where clause, I want 31 rows per user, even if that user had no data in that range, or only had data for a couple of days.
generate_series() was the right idea. You probably did not get the details right. Could work like this:
WITH cte AS (
SELECT entry_date, user_id, <other_stuff>
FROM <tables, joins, etc>
WHERE <whatever limits I want>
GROUP BY entry_date, user_id
)
SELECT *
FROM (SELECT DISTINCT user_id FROM cte) u
CROSS JOIN (
SELECT entry_date::date
FROM generate_series(current_date - interval '1 month'
, current_date - interval '1 day'
, interval '1 day') entry_date
) d
LEFT JOIN cte USING (user_id, entry_date);
I picked a running time window of one month ending "yesterday". You did not define your "month" exactly.
Assuming entry_date to be data type date.
Simpler for your updated requirements
To get results for every user in a users table (and not for a current selection) and for your given time range, it gets simpler. You don't need the CTE:
SELECT *
FROM (SELECT user_id FROM users) u
CROSS JOIN (
SELECT entry_date::date
FROM generate_series(timestamp '2015-05-01'
, timestamp '2015-05-31'
, interval '1 day') entry_date
) d
LEFT JOIN (
SELECT entry_date, user_id, <other_stuff>
FROM <tables, joins, etc>
WHERE <whatever>
GROUP BY entry_date, user_id
) t USING (user_id, entry_date);
Why this particular way to call generate_series()? See: Generating time series between two dates in PostgreSQL
It is also best to use the ISO 8601 date format (YYYY-MM-DD), which works regardless of locale settings.
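For a concrete picture of the cross join approach, here is a small self-contained sketch. The tables entries and users, the column amount, and the sample rows are all hypothetical stand-ins for the elided "<tables, joins, etc>" and "<other_stuff>".

```sql
-- Hypothetical tables and data, for illustration only.
CREATE TEMP TABLE users   (user_id int PRIMARY KEY);
CREATE TEMP TABLE entries (user_id int, entry_date date, amount int);
INSERT INTO users   VALUES (1), (2);
INSERT INTO entries VALUES (1, '2015-05-01', 10), (1, '2015-05-01', 5), (2, '2015-05-03', 7);

-- One row per user per day of May 2015 (2 users x 31 days = 62 rows);
-- days without data get a total of 0 instead of being dropped.
SELECT u.user_id, d.entry_date, COALESCE(t.total, 0) AS total
FROM   users u
CROSS  JOIN (
   SELECT g::date AS entry_date
   FROM   generate_series(timestamp '2015-05-01'
                        , timestamp '2015-05-31'
                        , interval '1 day') g
   ) d
LEFT   JOIN (
   SELECT user_id, entry_date, sum(amount) AS total
   FROM   entries
   WHERE  entry_date BETWEEN '2015-05-01' AND '2015-05-31'
   GROUP  BY user_id, entry_date
   ) t USING (user_id, entry_date)
ORDER  BY u.user_id, d.entry_date;
```

The cross join builds the full (user, date) grid first, and the left join then attaches aggregated data wherever it exists.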