How can I make this query less time-consuming? - sql

select count(*)
from table
where EXTRACT(MONTH FROM addondatetime) = EXTRACT(MONTH FROM current_date)
and EXTRACT(year FROM addondatetime) = EXTRACT(year FROM current_date)
This is my query. I want to count the rows whose addondatetime falls in the current month, but the query takes almost 2 minutes.

Try:
select count(*)
from table
where date_trunc('month', addondatetime) = date_trunc('month', current_date);
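Also create a function-based (expression) index on the same expression, so the condition can use it. In PostgreSQL this works when addondatetime is a timestamp without time zone, because date_trunc on a timestamptz is not immutable and cannot be indexed directly:
create index test_just_month on table (date_trunc('month', addondatetime));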
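Another option is to leave the column unwrapped and compare it against a half-open range, which a plain index on the column can serve. This is a minimal sketch, assuming PostgreSQL; my_table and the index name are placeholders that do not appear in the question:

```sql
-- Placeholder names: my_table stands in for the real table.
CREATE INDEX my_table_addondatetime_idx ON my_table (addondatetime);

-- Rows from the first instant of the current month up to,
-- but not including, the first instant of the next month.
SELECT count(*)
FROM my_table
WHERE addondatetime >= date_trunc('month', current_date)
  AND addondatetime <  date_trunc('month', current_date) + interval '1 month';
```

Because the column is not wrapped in a function here, the same index can also serve other range filters on addondatetime.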

Related

Using Where and group by clause

Can anyone describe how to retrieve data in SQL using both a WHERE clause and a GROUP BY clause on different fields?
For instance, I need to find the number of days in a month on which the temperature exceeded 35 degrees Celsius:
SELECT temp, count(*)
FROM weather_data
WHERE day between '01-jun-2022' to '30-jun-2022'
GROUP BY temp > '35';
My requirement is to get aggregate details such as the total count, so I tried a GROUP BY clause. In addition, I must filter the rows further, so I put those conditions in a WHERE clause before the GROUP BY clause.
This is the corrected query:
SELECT temp, count(*) FROM weather_data
WHERE temp > '35' AND day between '01-jun-2022' and '30-jun-2022' GROUP BY temp
You want to aggregate your data, so as to get one result row per month. In SQL this is GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day). Your DBMS may have additional functions to extract a month (year + month to be precise) from a date, such as TO_CHAR(day, 'YYYY-MM'), but this is vendor specific.
Now you only want to count days with a temperature above 35 degrees. The first idea to solve this is a WHERE clause that limits the rows you aggregate to the ones in question:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(*)
FROM mytable
WHERE temp > 35
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
The problem with this: If a month has no day above that temperature, you won't select that month, because your WHERE clause removed those rows. That may be okay with you, but if you want to show the months with a zero count, then move the condition into the aggregation function. Thus you select all months but only count days with high temperatures:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(CASE WHEN temp > 35 THEN 1 END)
FROM mytable
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
How does this work? COUNT(<expression>) counts non-null occurrences. CASE WHEN temp > 35 THEN 1 END is short for CASE WHEN temp > 35 THEN 1 ELSE NULL END. And instead of 1 you could use any value that is not null, e.g. 'count me'. Or you could use SUM instead, if you like that better: SUM(CASE WHEN temp > 35 THEN 1 ELSE 0 END).
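To see the counting behaviour in isolation, here is a small sketch that runs against an inline list of values instead of a real table. It assumes PostgreSQL's VALUES syntax, and the four temperatures are made up for illustration:

```sql
-- Four made-up rows: two above 35, one below, one unknown (NULL).
SELECT
  COUNT(*)                                   AS all_days,       -- 4
  COUNT(CASE WHEN temp > 35 THEN 1 END)      AS hot_days_count, -- 2
  SUM(CASE WHEN temp > 35 THEN 1 ELSE 0 END) AS hot_days_sum    -- 2
FROM (VALUES (30), (36), (38), (NULL)) AS t(temp);
```

The NULL row is counted by COUNT(*) but not by the conditional count, because NULL > 35 is not true and the CASE expression therefore falls through to NULL (or to 0 in the SUM variant).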
Lastly, you want to limit the date range. Date literals in SQL look like this: DATE 'YYYY-MM-DD'. And as we sometimes deal with dates and other times with datetimes or timestamps, it has become common not to use BETWEEN, but >= and <, so that the range works for all those data types:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(CASE WHEN temp > 35 THEN 1 END)
FROM mytable
WHERE day >= DATE '2022-06-01'
AND day < DATE '2022-07-01'
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
Try this:
SELECT temp, count(*)
FROM weather_data
WHERE day >= '01-jun-2022' AND day <= '30-jun-2022' AND temp > '35'
GROUP BY temp;

Group by days of a month in CockroachDB

In CockroachDB, I want to run a query like this for every day of a specific month:
select count(*), sum(amount)
from request
where code = 'code_string'
and created_at >= '2022-07-31T20:30:00Z' and created_at < '2022-08-31T20:30:00Z'
The problem is that I want the results grouped by my local date. What should I do?
My goal is:
"month, day, count, sum" as result columns for a month.
UPDATE:
I have found a suitable query for this purpose:
select count(amount), sum(amount), extract(month from created_at) as monthTime, extract(day from created_at) as dayTime
from request
where code = 'code_string' and created_at >= '2022-07-31T20:30:00Z' and created_at < '2022-08-31T20:30:00Z'
group by dayTime, monthTime
Thanks to #histocrat for an easier answer :) It replaces
extract(month from created_at) as monthTime, extract(day from created_at) as dayTime
with this:
date_part('month', created_at) as monthTime, date_part('day', created_at) as dayTime
To group results by both month and day, you can use the date_part function. Define the aliases in the select list and refer to them in the group by clause:
select date_part('month', created_at) as month, date_part('day', created_at) as day, count(*), sum(amount)
from request
where code = 'code_string'
group by month, day;
Depending on what type created_at is, you may need to cast or convert it first (for example, group by date_part('month', created_at::timestamptz)).
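For the local-date part of the question, one possible approach is to shift the timestamp into the local zone before extracting the date parts. This is only a sketch: it assumes created_at is a TIMESTAMPTZ column, and 'Asia/Tehran' is an example zone name chosen for illustration, since the question does not say which zone is meant:

```sql
-- Assumption: created_at is TIMESTAMPTZ; the zone name is a placeholder.
select date_part('month', created_at at time zone 'Asia/Tehran') as month,
       date_part('day',   created_at at time zone 'Asia/Tehran') as day,
       count(*),
       sum(amount)
from request
where code = 'code_string'
  and created_at >= '2022-07-31T20:30:00Z'
  and created_at <  '2022-08-31T20:30:00Z'
group by 1, 2
order by 1, 2;
```

The range filter stays on the raw column, so only the grouping expressions depend on the zone.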

SELF JOIN a query to obtain the number of reactivated users

Assume you have the table given below containing information on Facebook user logins. Write a query to obtain the number of reactivated users (which are dormant users who did not log in the previous month, who then logged in during the current month). Output the current month and number of reactivated users.
I have tried this question by first making an inner join combining a user's previous month to current month with this code.
WITH CTE as
(SELECT user_id,
EXTRACT(month from login_date) as current_month,
EXTRACT(month from login_date)-1 as prev_month
FROM user_logins)
SELECT a.user_id as user_id, a.current_month, a.prev_month,
b.user_id as prev_month_user
FROM CTE a LEFT JOIN CTE b
ON a.prev_month = b.current_month
My idea is to use a case statement
CASE WHEN a.user_id IN
(SELECT b.user_id
WHERE b.current_month = a.prev_month)
THEN 0 ELSE 1 END
But that gives the wrong output for user_id 245 in current_month 4.
https://drive.google.com/file/d/1dOQQxaJWv7j7o7M1Q98nlj77KCzIHxKl/view?usp=sharing
How to fix this?
This gets you the first day of the current month:
select date_trunc('month', current_date)
You can add or subtract an interval of one month to get the previous or next month's starting date.
The complete query:
select *
from users
where user_id in
(
select user_id
from user_logins
where login_date >= date_trunc('month', current_date)
and login_date < date_trunc('month', current_date) + interval '1 month'
)
and user_id not in
(
select user_id
from user_logins
where login_date >= date_trunc('month', current_date) - interval '1 month'
and login_date < date_trunc('month', current_date)
)
Well, admittedly
and login_date < date_trunc('month', current_date) + interval '1 month'
is probably unnecessary here, because the table won't contain future logins :-) So, keep it or remove it, as you like.
If you want a self join, you should get distinct user/month pairs first. Then, as you want the user/month pairs for which no user/month-1 pair exists (a case where NOT EXISTS would be appropriate), your join must be an anti join. This means you outer join the user/month-1 pair and only keep the outer joined rows, i.e. the non-matches.
WITH cte AS
(
SELECT DISTINCT user_id, DATE_TRUNC('month', login_date) AS month
FROM user_logins
)
SELECT mon.month, mon.user_id
FROM cte mon
LEFT JOIN cte prev ON prev.user_id = mon.user_id
AND prev.month = mon.month - INTERVAL '1 month'
WHERE prev.month IS NULL -- anti join
ORDER BY mon.month, mon.user_id;
I don't find anti joins very readable and would use NOT EXISTS instead. But that's a matter of personal preference, I guess. The query gives you all users who logged in during a month, but not during the previous month. You can of course limit this to the current month. Or you can aggregate per month and count. Or remove the WHERE clause and count repeating users vs. new ones (COUNT(*) = all that month, COUNT(prev.month) = all repeating users, COUNT(*) - COUNT(prev.month) = all new users).
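The NOT EXISTS variant, aggregated per month, could look roughly like this. It is a sketch under the same assumptions as the anti join (PostgreSQL, a user_logins table with user_id and login_date); the output column name is made up:

```sql
WITH cte AS
(
  -- One row per user and month in which that user logged in.
  SELECT DISTINCT user_id, DATE_TRUNC('month', login_date) AS month
  FROM user_logins
)
SELECT mon.month, COUNT(*) AS users_absent_in_previous_month
FROM cte mon
WHERE NOT EXISTS
(
  -- Same user, one month earlier.
  SELECT 1
  FROM cte prev
  WHERE prev.user_id = mon.user_id
    AND prev.month = mon.month - INTERVAL '1 month'
)
GROUP BY mon.month
ORDER BY mon.month;
```

Like the anti join, this counts every user with no login in the previous month, which includes users logging in for the first time.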
Well having said this, ... wasn't the task about reactivated users? Then you are looking for users who were active once, then paused a month, then became active again. Here is a simple query to get this for users who paused last month:
select user_id
from user_logins
group by user_id
having min(login_date) < date_trunc('month', current_date) - interval '1 month'
and max(login_date) >= date_trunc('month', current_date)
and count(*) filter (where login_date >= date_trunc('month', current_date) - interval '1 month'
and login_date < date_trunc('month', current_date)) = 0;

Can we use dynamic SQL or loops to automate this process?

I have a base dataset that is updated monthly. It contains information about employees, such as the employee ID. I would like to create a table where we can see the leavers and joiners for each month.
The logic for this is as follows: if employee ID appears in latest month but not prior, then it is a joiner. If ID appears in prior but not latest, then it is a leaver.
The base data is appended and we also have a date variable, so I am able to produce a table of joiners/leavers with either CTEs or CREATE TABLE by specifying date(s) in where clause and merging.
I was wondering whether there is a way to do this without manually creating multiple tables or CTEs, i.e. something that repeats the logic for a date range.
Aware it’s fairly simple to do in other coding languages but not sure how to go about it in SQL. Any help is greatly appreciated.
Self-join the table: same employee, adjacent months. I multiply the year by twelve and add the month, so as to get a continuous month numbering (e.g. 12/2020 = 24252, 01/2021 = 24253). I use a full outer join and only keep the outer joined rows, thus getting the leavers and the joiners.
select
extract(year from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month')) as year,
extract(month from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month')) as month,
count(m_next.date) as joiners,
count(m_prev.date) as leavers
from mytable m_next
full outer join mytable m_prev
on m_prev.employee_id = m_next.employee_id
and extract(year from m_prev.date) * 12 + extract(month from m_prev.date) =
extract(year from m_next.date) * 12 + extract(month from m_next.date) - 1
where m_next.date is null or m_prev.date is null
group by
extract(year from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month')),
extract(month from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month'))
order by
extract(year from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month')),
extract(month from coalesce(m_next.date, date_trunc('month', m_prev.date) + interval '1 month'));
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=1c66b00a71d484cd3951baa0956ace63

Select data with a rolling date criteria

The below query returns a distinct count of 'members' for a given month and brand.
select to_char(transaction_date, 'YYYY-MM') as month, brand,
count(distinct UNIQUE_MEM_ID) as distinct_count
from source.table
group by to_char(transaction_date, 'YYYY-MM'), brand;
The data is collected with a 15 day lag after the month closes (meaning September 2016 MONTHLY data won't be 100% until October 15). I am only concerned with monthly data.
The query I would like to build: Until the 15th of this month (October), last month's data (September) should reflect August's data. The current partial month (October) should default to the prior month and thus also to the above logic.
After the 15th of this month, last month's data (September) is now 100% and thus September should reflect September (and October will reflect September until November 15th, and so on).
The current partial month will always = the prior month. The complexity of the query is how to calc prior month.
This query will be run on a rolling basis, so it needs to be dynamic.
To be clear, I am trying to build a query where distinct_count for the prior month (until end of current month + 15 days) should reflect (current month - 2) value (for each respective brand). After 15 days of the close of the month, prior month = (current month - 1).
Partial current month defaults to prior month's data. The 15 day value should be variable/modifiable.
First, simplify the query to:
select to_char(transaction_date, 'YYYY-MM') as month, brand,
count(distinct members) as distinct_count
from source.table
group by to_char(transaction_date, 'YYYY-MM'), brand;
Then, you are going to have a problem. The problem is that one row (say from Aug 20th) needs to go into two groups. A simple group by won't handle this. So, let's use union all. I think the result is something like this:
-- complete months, each reported under its own label
select date_trunc('month', transaction_date) as month, brand,
count(distinct members) as distinct_count
from source.table
where (date_trunc('month', transaction_date) < date_trunc('month', current_date) - interval '1 month') or
(extract(day from current_date) > 15 and date_trunc('month', transaction_date) = date_trunc('month', current_date) - interval '1 month')
group by date_trunc('month', transaction_date), brand
union all
-- up to the 15th, last month is reported with the data of the month before it
select date_trunc('month', current_date) - interval '1 month' as month, brand,
count(distinct members) as distinct_count
from source.table
where (extract(day from current_date) <= 15 and date_trunc('month', transaction_date) = date_trunc('month', current_date) - interval '2 months')
group by brand;
Since you already have a working query, I concentrate on the subselect. The condition you can use here is CASE, especially "Searched CASE"
case
when extract(day from current_date) < 15 then
extract(month from current_date - interval '2 months')
else
extract(month from current_date - interval '1 month')
end
This may be used as part of a where clause, for example.
Here is an expression to get the begin date of your interval; the end date follows from it.
Begin date:
DATE_TRUNC('month', CURRENT_DATE - 15) - INTERVAL '1 month'
This will return the current month only after the 15th day, from there you can subtract a full month to get your starting point.
End Date:
To calculate this, grab the begin date, plus a month, minus a day.
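Put together, the two bounds could be computed as below. This is a sketch assuming PostgreSQL date arithmetic; the params CTE and the lag_days name are made up here so that the 15-day lag can be changed in one place:

```sql
WITH params AS
(
  SELECT 15 AS lag_days  -- hypothetical name for the reporting lag
),
bounds AS
(
  -- First day of the latest month that counts as complete.
  SELECT (DATE_TRUNC('month', CURRENT_DATE - lag_days) - INTERVAL '1 month')::date AS begin_date
  FROM params
)
SELECT begin_date,
       -- Begin date plus a month, minus a day: last day of that month.
       (begin_date + INTERVAL '1 month' - INTERVAL '1 day')::date AS end_date
FROM bounds;
```

For example, on October 10 this yields August 1 to August 31, and on October 20 it yields September 1 to September 30.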
If the source table is partitioned by transaction_date, this syntax (which does not wrap transaction_date in an expression) enables partition elimination.
select to_char(transaction_date, 'YYYY-MM') as month
,count (distinct members) as distinct_count
,brand as brand
FROM source.table
where transaction_date between date_trunc('month', current_date) - case when extract (day from current_date) >= 15 then 1 else 2 end * interval '1' month
and date_trunc('month', current_date) - case when extract (day from current_date) >= 15 then 0 else 1 end * interval '1' month - interval '1' day
group by to_char(transaction_date, 'YYYY-MM')
,brand
;