Include count 0 in my group by request - sql

I have a COUNT + GROUP BY query for PostgreSQL.
SELECT date_trunc('day', created_at) AS "Day" ,
count(*) AS "No. of actions"
FROM events
WHERE created_at > now() - interval '2 weeks'
AND "events"."application_id" = 7
AND ("what" LIKE 'ACTION%')
GROUP BY 1
ORDER BY 1
My query counts the number of "ACTION*" events per day in my events table (a log table) over the last 2 weeks for my application with the id 7. The problem is that days without any recorded actions do not appear in the result at all.
I think it is because of my WHERE clause, so I tried a few things with JOINs, but none of them gave me the right result.
Thank you for your help

Make a date table:
CREATE TABLE "myDates" (
"DateValue" date NOT NULL
);
INSERT INTO "myDates" ("DateValue")
select to_date('20000101', 'YYYYMMDD') + s.a as dates from generate_series(0,36524,1) as s(a);
Then left join on it:
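SELECT d."DateValue" AS "Day",
count(e.created_at) AS "No. of actions" -- count(*) would report 1 for days without events
FROM "myDates" d
LEFT JOIN events e ON e.created_at::date = d."DateValue"
-- filters on events go in ON; in WHERE they would discard the empty days again
AND e.application_id = 7
AND e."what" LIKE 'ACTION%'
WHERE d."DateValue" > current_date - 14
AND d."DateValue" <= current_date
GROUP BY 1 ORDER BY 1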

Ok a friend helped me, here is the answer:
SELECT "myDates"."DateValue" AS "Day" ,
(select count(*) from events WHERE date_trunc('day', "events"."created_at") = "myDates"."DateValue" AND
("events"."application_id" = 7) AND
("events"."what" LIKE 'ACTION%')) AS "No. of actions"
FROM "myDates"
where ("myDates"."DateValue" > now() - interval '2 weeks') AND ("myDates"."DateValue" < now())
ORDER BY 1
So we select every date in the range from the "myDates" table and compute the count in a correlated subquery in the second column.
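Both answers above rely on a persistent calendar table. As a rough sketch of an alternative, and assuming the same events columns as in the question, generate_series() can produce the days on the fly. The alias d(day) is made up for this example, and the 14-day window is aligned to whole days rather than to now() - interval '2 weeks', so the first day may differ slightly from the original query.

```sql
-- Sketch only: generate_series() stands in for the "myDates" table.
SELECT d.day::date         AS "Day",
       count(e.created_at) AS "No. of actions"  -- 0 when no event matched that day
FROM generate_series(current_date - 13, current_date, interval '1 day') AS d(day)
LEFT JOIN events e
       ON e.created_at >= d.day
      AND e.created_at <  d.day + interval '1 day'
      -- filters on events stay in ON so that empty days survive the join
      AND e.application_id = 7
      AND e."what" LIKE 'ACTION%'
GROUP BY 1
ORDER BY 1;
```

Joining on a half-open range instead of date_trunc('day', created_at) leaves created_at unwrapped, which should allow a plain index on that column to be used.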

Related

SELF JOIN a query to obtain the number of reactivated users

Assume you have the table given below containing information on Facebook user logins. Write a query to obtain the number of reactivated users (which are dormant users who did not log in the previous month, who then logged in during the current month). Output the current month and number of reactivated users.
I tried to solve this by first self-joining each user's previous month to the current month with this code.
WITH CTE as
(SELECT user_id,
EXTRACT(month from login_date) as current_month,
EXTRACT(month from login_date)-1 as prev_month
FROM user_logins)
SELECT a.user_id as user_id, a.current_month, a.prev_month,
b.user_id as prev_month_user
FROM CTE a LEFT JOIN CTE b
ON a.prev_month = b.current_month
My idea is to use a CASE expression:
CASE WHEN a.user_id IN
(SELECT b.user_id
WHERE b.current_month = a.prev_month)
THEN 0 ELSE 1 END
But that gives me the wrong output for user_id 245 in current_month 4.
https://drive.google.com/file/d/1dOQQxaJWv7j7o7M1Q98nlj77KCzIHxKl/view?usp=sharing
How can I fix this?
This gets you the first day of the current month:
select date_trunc('month', current_date)
You can add or subtract an interval of one month to get the previous or next month's starting date.
The complete query:
select *
from users
where user_id in
(
select user_id
from user_logins
where login_date >= date_trunc('month', current_date)
and login_date < date_trunc('month', current_date) + interval '1 month'
)
and user_id not in
(
select user_id
from user_logins
where login_date >= date_trunc('month', current_date) - interval '1 month'
and login_date < date_trunc('month', current_date)
)
Well, admittedly
and login_date < date_trunc('month', current_date) + interval '1 month'
is probably unnecessary here, because the table won't contain future logins :-) So, keep it or remove it, as you like.
If you want a self join, you should get distinct user/month pairs first. Then, since you want the user/month pairs for which no user/month-1 pair exists (a case where NOT EXISTS would be appropriate), your join must be an anti join. This means you outer join the user/month-1 pair and keep only the outer-joined rows, i.e. the non-matches.
WITH cte AS
(
SELECT DISTINCT user_id, DATE_TRUNC('month', login_date) AS month
FROM user_logins
)
SELECT mon.month, mon.user_id
FROM cte mon
LEFT JOIN cte prev ON prev.user_id = mon.user_id
AND prev.month = mon.month - INTERVAL '1 month'
WHERE prev.month IS NULL -- anti join
ORDER BY mon.month, mon.user_id;
I don't find anti joins very readable and would use NOT EXISTS instead. But that's a matter of personal preference, I guess. The query gives you all users who logged in during a month but not during the previous month. You can of course limit this to the current month. Or you can aggregate per month and count. Or remove the WHERE clause and count repeating users vs. new ones (COUNT(*) = all that month, COUNT(prev.month) = all repeating users, COUNT(*) - COUNT(prev.month) = all new users).
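The counting variant described above could look roughly like this. It is only a sketch: the inline user_logins rows are made-up sample data so the statement runs on its own, and the output column names are invented for this example. Note that the last column counts every user without a login in the previous month, whether they are brand new or returning after a pause.

```sql
-- Hypothetical sample rows; drop this first CTE to run against the real table.
WITH user_logins (user_id, login_date) AS (
    VALUES (1, DATE '2022-01-10'),
           (1, DATE '2022-02-03'),
           (2, DATE '2022-02-15'),
           (1, DATE '2022-04-01')
),
cte AS (
    SELECT DISTINCT user_id, date_trunc('month', login_date) AS month
    FROM user_logins
)
SELECT mon.month,
       COUNT(*)                     AS active_users,         -- everyone active that month
       COUNT(prev.month)            AS repeating_users,      -- also active the month before
       COUNT(*) - COUNT(prev.month) AS no_login_prev_month   -- not active the month before
FROM cte mon
LEFT JOIN cte prev ON prev.user_id = mon.user_id
                  AND prev.month = mon.month - INTERVAL '1 month'
GROUP BY mon.month
ORDER BY mon.month;
```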
Well having said this, ... wasn't the task about reactivated users? Then you are looking for users who were active once, then paused a month, then became active again. Here is a simple query to get this for users who paused last month:
select user_id
from user_logins
group by user_id
having min(login_date) < date_trunc('month', current_date) - interval '1 month'
and max(login_date) >= date_trunc('month', current_date)
and count(*) filter (where login_date >= date_trunc('month', current_date) - interval '1 month'
and login_date < date_trunc('month', current_date)) = 0;
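The task statement asks for the month and the number of reactivated users rather than a list of user ids. One way to get there, sketched under the same reading of "reactivated" as above (active at some earlier point, absent in the previous month, active again), is to combine NOT EXISTS and EXISTS on the distinct user/month pairs. The sample rows and the aliases prev and older are made up for illustration.

```sql
-- Hypothetical sample rows; drop this first CTE to run against the real table.
WITH user_logins (user_id, login_date) AS (
    VALUES (1, DATE '2022-01-10'),
           (1, DATE '2022-02-03'),
           (2, DATE '2022-02-15'),
           (1, DATE '2022-04-01')
),
cte AS (
    SELECT DISTINCT user_id, date_trunc('month', login_date) AS month
    FROM user_logins
)
SELECT mon.month, COUNT(*) AS reactivated_users
FROM cte mon
WHERE NOT EXISTS (            -- no login in the month before
          SELECT 1 FROM cte prev
          WHERE prev.user_id = mon.user_id
            AND prev.month = mon.month - INTERVAL '1 month')
  AND EXISTS (                -- but some login earlier than that
          SELECT 1 FROM cte older
          WHERE older.user_id = mon.user_id
            AND older.month < mon.month - INTERVAL '1 month')
GROUP BY mon.month
ORDER BY mon.month;
```

With the sample rows this returns a single row for April 2022 with a count of 1: user 1 was active in January and February, skipped March, and came back in April. Months with no reactivated users do not appear in the result.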

Get daily count of rows for a Time

My database table looks like this:
CREATE TABLE record
(
id INT,
status INT,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id)
);
I want a generic query that counts the records created in each 3-hour interval over the last day.
For example, for the last day, I want to know how many records were created in every 3-hour window.
What I have so far: with a little help from Stack Overflow I was able to write a query that calculates the count for a single full day.
SELECT
DATE(created_at) AS day, COUNT(1)
FROM
record
WHERE
created_at >= current_date - 1
GROUP BY
DATE(created_at)
This tells me that, say, 24 records were created over the full day, but I want to know how many were created in each 3-hour interval.
If you want the count for the last three hours of data:
select count(*)
from record
where created_at >= now() - interval '3 hour';
If you want the last day minus 3 hours, that would be 21 hours:
select count(*)
from record
where created_at >= now() - interval '21 hour';
EDIT:
You want intervals of 3 hours for the last 24 hours. The simplest method is probably generate_series():
select gs.ts, count(r.created_at)
from generate_series(now() - interval '24 hour', now() - interval '3 hour', interval '3 hour') gs(ts) left join
record r
on r.created_at >= gs.ts and
r.created_at < gs.ts + interval '3 hour'
group by gs.ts
order by gs.ts;

Filtering query with join is not returning the correct results

I am trying to filter a query on payments data by joining it with another table (the accounts table), because I want only rows that satisfy accounts.provider = 'z'. However, the results I get back are exact multiples of the real figures (13 times, 20 times, etc.), with a different multiple for different dates. The query is also really slow, so I'm looking for advice on making it run faster too.
SELECT
distinct on (t.day) t.day as day,
coalesce(collected_payments,0)
from
( SELECT day::date
FROM generate_series(timestamp '2017-03-13', current_date + interval '1 week', interval '1 day') day
) d
left JOIN (
SELECT date_trunc('day', t.payment_date)::date AS day,
sum(case when t.payment_amount > 0
and t.description not ilike '%credit%'
and t.state = 'success'
then t.payment_amount end) as collected_payments
FROM payments t
inner join payments p on p.payment_date = date_trunc('day', t.payment_date)::date
inner join accounts on accounts.id = p.account_id and accounts.provider = 'z'
where date_trunc('day', t.payment_date)::date <= current_date + interval '1 week'
and date_trunc('day', t.payment_date)::date >= current_date - interval'1 months'
GROUP BY 1
) t USING (day)
ORDER BY day desc

SQL: Select average value of column for last hour and last day

I have a table like the one in the image below. I need the average of the Volume column, grouped by User, for both the last hour and the last 24 hours. How can I use AVG with two different date ranges in a single query?
You can do it like this:
SELECT "user", AVG(Volume) -- "user" is quoted because USER is a reserved word in PostgreSQL
FROM mytable
WHERE created >= NOW() - interval '1 hour'
AND created <= NOW()
GROUP BY "user"
A few things to keep in mind: this assumes the query runs on the same server, in the same time zone, as the one that wrote the created values. You need to group by the user so that an aggregate function such as AVG is applied to the Volume values of each user separately. Similarly, if you need both averages together, you could do the following:
SELECT u1."user", u1.average AS avg_1hour, u2.average AS avg_1day
FROM
(SELECT "user", AVG(Volume) as average
FROM mytable
WHERE created >= NOW() - interval '1 hour'
AND created <= NOW()
GROUP BY "user") AS u1
INNER JOIN
(SELECT "user", AVG(Volume) as average
FROM mytable
WHERE created >= NOW() - interval '1 day'
AND created <= NOW()
GROUP BY "user") AS u2
ON u1."user" = u2."user"
Use conditional aggregation. Postgres offers very convenient syntax using the FILTER clause:
SELECT "user",
AVG(Volume) FILTER (WHERE created >= NOW() - interval '1 hour' AND created <= NOW()) as avg_1hour,
AVG(Volume) FILTER (WHERE created >= NOW() - interval '1 day' AND created <= NOW()) as avg_1day
FROM mytable
WHERE created >= NOW() - interval '1 DAY' AND
created <= NOW()
GROUP BY "user";
This will filter out users who have had no activity in the past day. If you want all users -- even those with no recent activity -- remove the WHERE clause.
The more traditional method uses CASE:
SELECT "user",
AVG(CASE WHEN created >= NOW() - interval '1 hour' AND created <= NOW() THEN Volume END) as avg_1hour,
AVG(CASE WHEN created >= NOW() - interval '1 day' AND created <= NOW() THEN Volume END) as avg_1day
. . .
-- MySQL syntax; IntervalType 1 = last hour, 0 = the 23 hours before it
SELECT User, AVG(Volume), IF(created >= DATE_SUB(NOW(), INTERVAL 1 HOUR), 1, 0) AS IntervalType
FROM mytable
WHERE created >= DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY User, IntervalType
Please tell me about its result :)

PostgreSQL generate_series with WHERE clause

I'm having an issue generating a series of dates and then returning the COUNT of rows matching each date in the series.
SELECT generate_series(current_date - interval '30 days', current_date, '1 day':: interval) AS i, COUNT(*)
FROM download
WHERE product_uuid = 'someUUID'
AND created_at = i
GROUP BY created_at::date
ORDER BY created_at::date ASC
I want the output to be the number of rows that match the current date in the series.
05-05-2018, 35
05-06-2018, 23
05-07-2018, 0
05-08-2018, 10
...
The schema has the following columns: id, product_uuid, created_at. Any help would be greatly appreciated. I can add more detail if needed.
Put the table-generating function in the FROM clause and use a join:
SELECT gs.dte, COUNT(d.product_uuid)
FROM generate_series(current_date - interval '30 days', current_date, '1 day'::interval
) gs(dte) left join
download d
on d.product_uuid = 'someUUID' AND
d.created_at::date = gs.dte
GROUP BY gs.dte
ORDER BY gs.dte;