For LOOP in PostgreSQL - sql

I have a table with the following columns:
(client_id, start_contract_date, end_contract_date)
Every client has a start_contract_date but some clients have a NULL for end_contract_date since they may still be active today.
If we check for a certain date D, a client is active if D is between start_contract_date and end_contract_date (or start_contract_date <= D, if end_contract_date is NULL.)
I want to count, for each month of each year from 2016 until today, how many clients are active. My problem is that I do not know how to LOOP over the months and years.
I have a partial solution. I can count how many active clients for a specific month of a specific year.
SELECT 2016 as year , 7 as month, count(id_client)
FROM table
WHERE
EXTRACT(year from start_contract_date)<=2016
AND EXTRACT(month from start_contract_date)<=7
AND (EXTRACT(year from end_contract_date)>=2016 OR end_contract_date IS NULL)
AND (EXTRACT(month from end_contract_date)>=7 OR end_contract_date IS NULL)
;
So, how can I run a nested for loop that would be something like
FOR y IN 2016..2017
FOR m IN 1..12
I want the output to be
Year , Month , Count
2016 , 1 , 234
2016 , 2 , 54
…
2017 , 12 , 543

Use the function generate_series() to generate arbitrary series of months, e.g.:
select extract(year from d) as year, extract(month from d) as month
from generate_series('2017-11-01'::date, '2018-02-01', '1 month') d
year | month
------+-------
2017 | 11
2017 | 12
2018 | 1
2018 | 2
(4 rows)
Use the above and the function date_trunc() to extract year-month value from dates:
select extract(year from d) as year, extract(month from d) as month, count(id_client)
from generate_series('2016-01-01'::date, '2019-03-01', '1 month') d
left join my_table
on date_trunc('month', start_contract_date) <= date_trunc('month', d)
and (end_contract_date is null or date_trunc('month', end_contract_date) >= date_trunc('month', d))
group by d
order by d
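Note also that the conditions in your query contain a logical error: year and month are compared independently, so a contract that started in December 2015 fails the test month <= 7 and is not counted for July 2016, even though it is active then.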
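To make the month-level comparison concrete, here is a minimal, self-contained sketch. The three sample rows are invented for illustration; the table and column names follow the query above. It assumes that a client counts as active in a month if the contract overlaps any part of that month.

```sql
-- Hypothetical sample data, inlined as a CTE so the query runs on its own.
WITH my_table(id_client, start_contract_date, end_contract_date) AS (
    VALUES (1, date '2015-12-01', NULL::date),        -- still active
           (2, date '2016-02-15', date '2016-03-10'),
           (3, date '2016-03-01', date '2016-03-31')
)
SELECT extract(year  from d)::int AS year,
       extract(month from d)::int AS month,
       count(id_client)           AS active_clients   -- counts only matched rows, so empty months give 0
FROM generate_series(date '2016-01-01', date '2016-04-01', interval '1 month') AS d
LEFT JOIN my_table
       ON date_trunc('month', start_contract_date) <= d
      AND (end_contract_date IS NULL
           OR date_trunc('month', end_contract_date) >= d)
GROUP BY d
ORDER BY d;
-- With the sample rows above: 2016-01 -> 1, 2016-02 -> 2, 2016-03 -> 3, 2016-04 -> 1
```

Client 1 started in December 2015 and is counted in every month, which is exactly the case the separate year/month comparison gets wrong.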

Related

Get count of subscribers for each month in current year even if count is 0

I need to get the count of new subscribers each month of the current year.
DB Structure: Subscriber(subscriber_id, create_timestamp, ...)
Expected result:
date | count
-----------+------
2021-01-01 | 3
2021-02-01 | 12
2021-03-01 | 0
2021-04-01 | 8
2021-05-01 | 0
I wrote the following query:
SELECT
DATE_TRUNC('month',create_timestamp)
AS create_timestamp,
COUNT(subscriber_id) AS count
FROM subscriber
GROUP BY DATE_TRUNC('month',create_timestamp);
Which works but does not include months where the count is 0. It's only returning the ones that are existing in the table. Like:
"2021-09-01 00:00:00" 3
"2021-08-01 00:00:00" 9
The first subquery generates one row for each month of the year. It is then LEFT JOINed to a second subquery that computes the count per month. COALESCE() replaces the NULL of months without subscribers with 0.
-- PostgreSQL (v11)
SELECT t.cdate
, COALESCE(p.total_count, 0) total_count
FROM (select generate_series('2021-01-01'::timestamp, '2021-12-15', '1 month') as cdate) t
LEFT JOIN (SELECT DATE_TRUNC('month',create_timestamp) create_timestamp
, COUNT(subscriber_id) total_count
FROM subscriber
GROUP BY DATE_TRUNC('month',create_timestamp)) p
ON t.cdate = p.create_timestamp
See the fiddle: https://dbfiddle.uk/?rdbms=postgres_11&fiddle=20dcf6c1784ed0d9c5772f2487bcc221
To get the count of new subscribers for each month of the current year:
SELECT month::date, COALESCE(s.count, 0) AS count
FROM generate_series(date_trunc('year', LOCALTIMESTAMP)
, date_trunc('year', LOCALTIMESTAMP) + interval '11 month'
, interval '1 month') m(month)
LEFT JOIN (
SELECT date_trunc('month', create_timestamp) AS month
, count(*) AS count
FROM subscriber
GROUP BY 1
) s USING (month);
That's assuming every row is a "new subscriber". So count(*) is simplest and fastest.
See:
Join a count query on generate_series() and retrieve Null values as '0'
Generating time series between two dates in PostgreSQL
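As a variation on the queries above, the rows can also be joined to the month series directly with a half-open range condition, counting a column from the joined table so that empty months yield 0 without COALESCE(). This is only a sketch: the sample rows are invented, and the fixed 2021 range stands in for "the current year".

```sql
-- Hypothetical sample data, inlined as a CTE so the query runs on its own.
WITH subscriber(subscriber_id, create_timestamp) AS (
    VALUES (1, timestamp '2021-01-05 10:00'),
           (2, timestamp '2021-01-20 09:30'),
           (3, timestamp '2021-03-02 14:00')
)
SELECT m.month::date           AS month,
       count(s.subscriber_id)  AS count   -- NULLs from unmatched months are not counted
FROM generate_series(timestamp '2021-01-01',
                     timestamp '2021-04-01',
                     interval  '1 month') AS m(month)
LEFT JOIN subscriber s
       ON s.create_timestamp >= m.month
      AND s.create_timestamp <  m.month + interval '1 month'
GROUP BY m.month
ORDER BY m.month;
-- With the sample rows above: 2021-01-01 -> 2, 2021-02-01 -> 0, 2021-03-01 -> 1, 2021-04-01 -> 0
```

Because the join condition compares the bare column against a range, a plain index on create_timestamp can be used. Aggregating first and joining afterwards, as in the answers above, may still be faster on large tables.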

SQL Count Entries for each Month of the last 6 Months

I got a problem while trying to count the entries that were created in a month for the last 6 months.
The table looks like this:
A B C D
Year Month Startingdate Identifier
-----------------------------------------
2019 3 2019-03-12 OAM_1903121
2019 2 2019-03-21 OAM_1902211
And the result should look like:
A B C
Year Month Amount of orders
---------------------------------
2019 3 26
2019 2 34
This is what I have so far, but it doesn't get me the proper results:
SELECT year, month, COUNT(Startingdate) as Amount
FROM table
WHERE Startingdate > ((TRUNC(add_months(sysdate,-3) , 'MM'))-1)
GROUP BY year, month
I have not tested it, but it should work:
select year, month, count(Startingdate) as Amount_of_order
from table
where Startingdate between add_months(sysdate, -6) and sysdate
group by year, month;
Let me know.
Try this (SQL Server syntax):
SELECT YEAR(Startingdate) AS [Year], MONTH(Startingdate) AS [Month], COUNT(*) AS Amount
FROM table
WHERE Startingdate > DATEADD(MONTH, -6, GETDATE())
GROUP BY YEAR(Startingdate), MONTH(Startingdate)
ORDER BY YEAR(Startingdate), MONTH(Startingdate) DESC
I think your issue is the filtering. If so, this should handle the most recent six full months:
SELECT year, month, COUNT(*) as num_orders
FROM table
WHERE Startingdate >= TRUNC(add_months(sysdate, -6) , 'MM')
GROUP BY year, month;
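Note that a lower bound alone also lets in the rows of the current, still incomplete month. If only complete months are wanted, an exclusive upper bound can be added. The following is a sketch in Oracle syntax; "orders" is a hypothetical table name standing in for the question's table, and year and month are derived from Startingdate instead of being read from the stored columns.

```sql
-- Oracle syntax. "orders" is a hypothetical table name.
SELECT EXTRACT(YEAR  FROM Startingdate) AS year,
       EXTRACT(MONTH FROM Startingdate) AS month,
       COUNT(*)                         AS num_orders
FROM orders
WHERE Startingdate >= TRUNC(ADD_MONTHS(SYSDATE, -6), 'MM')  -- first day of the month six months back
  AND Startingdate <  TRUNC(SYSDATE, 'MM')                  -- first day of the current month, exclusive
GROUP BY EXTRACT(YEAR FROM Startingdate), EXTRACT(MONTH FROM Startingdate)
ORDER BY 1, 2;
```

Deriving year and month from the date also sidesteps rows where the stored Month column disagrees with Startingdate, as in the second sample row of the question.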

postgreSQL- Count for value between previous month start date and end date

I have a table as follows
user_id date month year visiting_id
123 11-04-2017 APRIL 2017 4500
123 12-05-2017 MAY 2017 4567
123 13-05-2017 MAY 2017 4568
123 17-05-2017 MAY 2017 4569
123 22-05-2017 MAY 2017 4570
123 11-06-2017 JUNE 2017 4571
123 12-06-2017 JUNE 2017 4572
I want to calculate the visiting count for the current month and last month at the monthly level as follows:
user_id month year visit_count_this_month visit_count_last_month
123 APRIL 2017 1 0
123 MAY 2017 4 1
123 JUNE 2017 2 4
I was able to calculate visit_count_this_month using the following query
SELECT v.user_id, v.month, v.year,
SUM(is_visit_this_month) as visit_count_this_month
FROM
(SELECT user_id, date, month, year,
CASE WHEN TO_CHAR(date, 'MM/YYYY') = TO_CHAR(date, 'MM/YYYY')
THEN 1 ELSE 0
END as is_visit_this_month
FROM visits
GROUP BY user_id, date, month, year
HAVING user_id = 123) v
GROUP BY v.user_id, v.month, v.year
However, I'm stuck with calculating visit_count_last_month. Similar to this, I also want to calculate visit_count_last_2months.
Can somebody help?
You can use a LATERAL JOIN like this:
SELECT user_id, month, year, COUNT(*) as visit_count_this_month, visit_count_last_month
FROM visits v
CROSS JOIN LATERAL (
SELECT COUNT(*) as visit_count_last_month
FROM visits
WHERE user_id = v.user_id
AND date_trunc('month', CAST(date AS date)) = date_trunc('month', CAST(v.date AS date)) - interval '1 month'
) l
GROUP BY user_id, month, year, visit_count_last_month;
SQLFiddle - http://sqlfiddle.com/#!15/393c8/2
Assuming there are values for every month, you can get the counts per month first and use lag to get the previous month's values per user.
SELECT T.*
,COALESCE(LAG(visits,1) OVER(PARTITION BY USER_ID ORDER BY year,mth),0) as last_month_visits
,COALESCE(LAG(visits,2) OVER(PARTITION BY USER_ID ORDER BY year,mth),0) as last_2_month_visits
FROM (
SELECT user_id, extract(month from date) as mth, year, COUNT(*) as visits
FROM visits
GROUP BY user_id, extract(month from date), year
) T
If there can be missing months, it is best to generate all months within a specified timeframe and left join the table to that. (This example shows it for all the months in 2017.)
select user_id,yr,mth,visits
,coalesce(lag(visits,1) over(PARTITION BY USER_ID ORDER BY yr,mth),0) as last_month_visits
,coalesce(lag(visits,2) OVER(PARTITION BY USER_ID ORDER BY yr,mth),0) as last_2_month_visits
from (select u.user_id,extract(year from d.dt) as yr, extract(month from d.dt) as mth,count(v.visiting_id) as visits
from generate_series(date '2017-01-01', date '2017-12-31',interval '1 month') d(dt)
cross join (select distinct user_id from visits) u
left join visits v on extract(month from v.date)=extract(month from d.dt) and extract(year from v.date)=extract(year from d.dt) and u.user_id=v.user_id
group by u.user_id,extract(year from d.dt), extract(month from d.dt)
) t
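The lag() approach can be tried on the question's sample data with the sketch below. It assumes the date column holds real dates; the column is called visit_date here (a hypothetical name), and the month is represented by its first day so that one sort key covers both year and month.

```sql
-- Sample rows from the question, inlined as a CTE; visit_date is a hypothetical column name.
WITH visits(user_id, visit_date, visiting_id) AS (
    VALUES (123, date '2017-04-11', 4500),
           (123, date '2017-05-12', 4567),
           (123, date '2017-05-13', 4568),
           (123, date '2017-05-17', 4569),
           (123, date '2017-05-22', 4570),
           (123, date '2017-06-11', 4571),
           (123, date '2017-06-12', 4572)
), monthly AS (
    SELECT user_id,
           date_trunc('month', visit_date)::date AS month_start,
           count(*)                              AS visits
    FROM visits
    GROUP BY 1, 2
)
SELECT user_id,
       month_start,
       visits AS visit_count_this_month,
       COALESCE(lag(visits) OVER (PARTITION BY user_id ORDER BY month_start), 0)
              AS visit_count_last_month
FROM monthly
ORDER BY user_id, month_start;
-- With the sample rows: 2017-04-01 -> 1, 0; 2017-05-01 -> 4, 1; 2017-06-01 -> 2, 4
```

As with the first lag() query above, this only returns the true previous month when no month is missing for a user; otherwise generate the months first, as in the last query.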

Total Number of Records per Week

I have a Postgres 9.1 database. I am trying to generate the number of records per week (for a given date range) and compare it to the previous year.
I have the following code used to generate the series:
select generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series
However, I am not sure how to join the counted records to the dates generated.
So, using the following records as an example:
Pt_ID exam_date
====== =========
1 2012-01-02
2 2012-01-02
3 2012-01-08
4 2012-01-08
1 2013-01-02
2 2013-01-02
3 2013-01-03
4 2013-01-04
1 2013-01-08
2 2013-01-10
3 2013-01-15
4 2013-01-24
I wanted to have the records return as:
series thisyr lastyr
=========== ===== =====
2013-01-01 4 2
2013-01-08 3 2
2013-01-15 1 0
2013-01-22 1 0
2013-01-29 0 0
Not sure how to reference the date range in the subsearch. Thanks for any assistance.
The simple approach would be to solve this with a CROSS JOIN as demonstrated by @jpw. However, there are some hidden problems:
The performance of an unconditional CROSS JOIN deteriorates quickly with growing number of rows. The total number of rows is multiplied by the number of weeks you are testing for, before this huge derived table can be processed in the aggregation. Indexes can't help.
Starting weeks with January 1st leads to inconsistencies. ISO weeks might be an alternative. See below.
All of the following queries make heavy use of an index on exam_date. Be sure to have one.
Only join to relevant rows
Should be much faster:
SELECT d.day, d.thisyr
, count(t.exam_date) AS lastyr
FROM (
SELECT d.day::date, (d.day - '1 year'::interval)::date AS day0 -- for 2nd join
, count(t.exam_date) AS thisyr
FROM generate_series('2013-01-01'::date
, '2013-01-31'::date -- last week overlaps with Feb.
, '7 days'::interval) d(day) -- returns timestamp
LEFT JOIN tbl t ON t.exam_date >= d.day::date
AND t.exam_date < d.day::date + 7
GROUP BY d.day
) d
LEFT JOIN tbl t ON t.exam_date >= d.day0 -- repeat with last year
AND t.exam_date < d.day0 + 7
GROUP BY d.day, d.thisyr
ORDER BY d.day;
This is with weeks starting from Jan. 1st like in your original. As commented, this produces a couple of inconsistencies: Weeks start on a different day each year and since we cut off at the end of the year, the last week of the year consists of just 1 or 2 days (leap year).
The same with ISO weeks
Depending on requirements, consider ISO weeks instead, which start on Mondays and always span 7 days. But they cross the border between years. Per documentation on EXTRACT():
week
The number of the week of the year that the day is in. By definition (ISO 8601), weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.
In the ISO definition, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.
Above query rewritten with ISO weeks:
SELECT w AS isoweek
, day::text AS thisyr_monday, thisyr_ct
, day0::text AS lastyr_monday, count(t.exam_date) AS lastyr_ct
FROM (
SELECT w, day
, date_trunc('week', '2012-01-04'::date)::date + 7 * w AS day0
, count(t.exam_date) AS thisyr_ct
FROM (
SELECT w
, date_trunc('week', '2013-01-04'::date)::date + 7 * w AS day
FROM generate_series(0, 4) w
) d
LEFT JOIN tbl t ON t.exam_date >= d.day
AND t.exam_date < d.day + 7
GROUP BY d.w, d.day
) d
LEFT JOIN tbl t ON t.exam_date >= d.day0 -- repeat with last year
AND t.exam_date < d.day0 + 7
GROUP BY d.w, d.day, d.day0, d.thisyr_ct
ORDER BY d.w, d.day;
January 4th is always in the first ISO week of the year. So this expression gets the date of Monday of the first ISO week of the given year:
date_trunc('week', '2012-01-04'::date)::date
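To see what this expression returns for several years at once, a small sketch. It uses make_date(), which needs Postgres 9.4 or later; on older versions write the date literal out as above.

```sql
-- Monday of ISO week 1 for each year: truncate January 4th to the start of its week.
SELECT y AS isoyear,
       date_trunc('week', make_date(y, 1, 4))::date AS first_monday
FROM generate_series(2012, 2014) AS y
ORDER BY y;
-- 2012 -> 2012-01-02, 2013 -> 2012-12-31, 2014 -> 2013-12-30
```

The 2013 row shows the year boundary being crossed: ISO week 1 of 2013 starts on 2012-12-31, in line with the documentation quoted above.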
Simplify with EXTRACT()
Since ISO weeks coincide with the week numbers returned by EXTRACT(), we can simplify the query. First, a short and simple form:
SELECT w AS isoweek
, COALESCE(thisyr_ct, 0) AS thisyr_ct
, COALESCE(lastyr_ct, 0) AS lastyr_ct
FROM generate_series(1, 5) w
LEFT JOIN (
SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS thisyr_ct
FROM tbl
WHERE EXTRACT(isoyear FROM exam_date)::int = 2013
GROUP BY 1
) t13 USING (w)
LEFT JOIN (
SELECT EXTRACT(week FROM exam_date)::int AS w, count(*) AS lastyr_ct
FROM tbl
WHERE EXTRACT(isoyear FROM exam_date)::int = 2012
GROUP BY 1
) t12 USING (w);
Optimized query
The same with more details, optimized for performance:
WITH params AS ( -- enter parameters here, once
SELECT date_trunc('week', '2012-01-04'::date)::date AS last_start
, date_trunc('week', '2013-01-04'::date)::date AS this_start
, date_trunc('week', '2014-01-04'::date)::date AS next_start
, 1 AS week_1
, 5 AS week_n -- show weeks 1 - 5
)
SELECT w.w AS isoweek
, p.this_start + 7 * (w - 1) AS thisyr_monday
, COALESCE(t13.ct, 0) AS thisyr_ct
, p.last_start + 7 * (w - 1) AS lastyr_monday
, COALESCE(t12.ct, 0) AS lastyr_ct
FROM params p
, generate_series(p.week_1, p.week_n) w(w)
LEFT JOIN (
SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
FROM tbl t, params p
WHERE t.exam_date >= p.this_start -- only relevant dates
AND t.exam_date < p.this_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND t.exam_date < p.next_start -- don't cross over into next year
GROUP BY 1
) t13 USING (w)
LEFT JOIN ( -- same for last year
SELECT EXTRACT(week FROM t.exam_date)::int AS w, count(*) AS ct
FROM tbl t, params p
WHERE t.exam_date >= p.last_start
AND t.exam_date < p.last_start + 7 * (p.week_n - p.week_1 + 1)::int
-- AND t.exam_date < p.this_start
GROUP BY 1
) t12 USING (w);
This should be very fast with index support and can easily be adapted to intervals of choice.
The implicit JOIN LATERAL for generate_series() in the last query requires Postgres 9.3.
Using a cross join should work; I'm just going to paste the markdown output from SQL Fiddle below. It would seem that your sample output is incorrect for series 2013-01-08: thisyr should be 2, not 3. This might not be the best way to do this, though; my PostgreSQL knowledge leaves a lot to be desired.
SQL Fiddle
PostgreSQL 9.2.4 Schema Setup:
CREATE TABLE Table1
("Pt_ID" varchar(6), "exam_date" date);
INSERT INTO Table1
("Pt_ID", "exam_date")
VALUES
('1', '2012-01-02'),('2', '2012-01-02'),
('3', '2012-01-08'),('4', '2012-01-08'),
('1', '2013-01-02'),('2', '2013-01-02'),
('3', '2013-01-03'),('4', '2013-01-04'),
('1', '2013-01-08'),('2', '2013-01-10'),
('3', '2013-01-15'),('4', '2013-01-24');
Query 1:
select
series,
sum (
case
when exam_date
between series and series + '6 day'::interval
then 1
else 0
end
) as thisyr,
sum (
case
when exam_date + '1 year'::interval
between series and series + '6 day'::interval
then 1 else 0
end
) as lastyr
from table1
cross join generate_series('2013-01-01', '2013-01-31', '7 day'::interval) as series
group by series
order by series
Results:
| SERIES | THISYR | LASTYR |
|--------------------------------|--------|--------|
| January, 01 2013 00:00:00+0000 | 4 | 2 |
| January, 08 2013 00:00:00+0000 | 2 | 2 |
| January, 15 2013 00:00:00+0000 | 1 | 0 |
| January, 22 2013 00:00:00+0000 | 1 | 0 |
| January, 29 2013 00:00:00+0000 | 0 | 0 |

Dynamic column names in view (Postgres)

I am currently programming an SQL view which should provide a count of a populated field for a particular month.
This is how I would like the view to be constructed:
Country | (Current Month - 12) Eg Feb 2011 | (Current Month - 11) | (Current Month - 10)
----------|----------------------------------|----------------------|---------------------
UK | 10 | 11 | 23
The number under the month should be a count of all populated fields for a particular country. The field is named eldate and is a date (cast as a char) of format 10-12-2011. I want the count to only count dates which match the month.
So column "Current Month - 12" should only include a count of dates which fall within the month which is 12 months before now. Eg Current Month - 12 for UK should include a count of dates which fall within February-2011.
I would like the column headings to actually reflect the month it is looking at so:
Country | Feb 2011 | March 2011 | April 2011
--------|----------|------------|------------
UK | 4 | 12 | 0
So something like:
SELECT c.country_name,
(SELECT COUNT("C1".eldate) FROM "C1" WHERE "C1".eldate LIKE %NOW()-12 Months% AS NOW() - 12 Months
(SELECT COUNT("C1".eldate) FROM "C1" WHERE "C1".eldate LIKE %NOW()-11 Months% AS NOW() - 11 Months
FROM country AS c
INNER JOIN "site" AS s using (country_id)
INNER JOIN "subject_C1" AS "C1" ON "s"."site_id" = "C1"."site_id"
Obviously this doesn't work but just to give you an idea of what I am getting at.
Any ideas?
Thank you for your help, any more queries please ask.
My first inclination is to produce this table:
+---------+-------+--------+
| Country | Month | Amount |
+---------+-------+--------+
| UK | Jan | 4 |
+---------+-------+--------+
| UK | Feb | 12 |
+---------+-------+--------+
etc. and pivot it. So you'd start with (for example):
SELECT
c.country,
EXTRACT(MONTH FROM s.eldate) AS month,
COUNT(*) AS amount
FROM country AS c
JOIN site AS s ON s.country_id = c.id
WHERE
s.eldate > NOW() - INTERVAL '1 year'
GROUP BY c.country, EXTRACT(MONTH FROM s.eldate);
You could then plug that into one of the crosstab functions from the tablefunc module to achieve the pivot, doing something like this:
SELECT *
FROM crosstab('<query from above goes here>')
AS ct(country varchar, january integer, february integer, ... december integer);
You could truncate the dates to make them comparable:
WHERE date_trunc('month', eldate) = date_trunc('month', now()) - interval '12 months'
UPDATE
This kind of replacement for your query:
(SELECT COUNT("C1".eldate) FROM "C1" WHERE date_trunc('month', "C1".eldate) =
date_trunc('month', now()) - interval '12 months') AS TWELVE_MONTHS_AGO
But that would involve a scan of the table for each month, so you could do a single scan with something more along these lines:
SELECT SUM( CASE WHEN date_trunc('month', "C1".eldate) = date_trunc('month', now()) - interval '12 months' THEN 1 ELSE 0 END ) AS TWELVE_MONTHS_AGO
,SUM( CASE WHEN date_trunc('month', "C1".eldate) = date_trunc('month', now()) - interval '11 months' THEN 1 ELSE 0 END ) AS ELEVEN_MONTHS_AGO
...
or do a join with a table of months as others are showing.
UPDATE2
Further to the comment on fixing the columns from Jan to Dec, I was thinking something like this: filter on the last year's worth of records, then sum on the appropriate month. Perhaps like this:
SELECT SUM( CASE WHEN EXTRACT(MONTH FROM "C1".eldate) = 1 THEN 1 ELSE 0 END ) AS JAN
,SUM( CASE WHEN EXTRACT(MONTH FROM "C1".eldate) = 2 THEN 1 ELSE 0 END ) AS FEB
...
WHERE date_trunc('month', "C1".eldate) < date_trunc('month', now())
AND date_trunc('month', "C1".eldate) >= date_trunc('month', now()) - interval '12 months'
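For reference, the single-scan idea can also be written with the aggregate FILTER clause (Postgres 9.4 or later) instead of SUM(CASE ...). The sketch below is self-contained but rests on assumptions: the sample rows and the params CTE with a fixed "today" are invented so the result is reproducible, and eldate is taken to be text in DD-MM-YYYY format as described in the question, hence the to_date() call. In a real view, now() would replace p.today.

```sql
-- Hypothetical sample data and a fixed reference date, inlined so the query runs on its own.
WITH c1(country, eldate) AS (
    VALUES ('UK', '10-02-2011'),
           ('UK', '25-02-2011'),
           ('UK', '03-03-2011'),
           ('FR', '14-03-2011')
), params AS (
    SELECT date '2012-02-15' AS today      -- stand-in for now()
)
SELECT c1.country,
       count(*) FILTER (WHERE date_trunc('month', to_date(c1.eldate, 'DD-MM-YYYY'))
                            = date_trunc('month', p.today) - interval '12 months') AS months_ago_12,
       count(*) FILTER (WHERE date_trunc('month', to_date(c1.eldate, 'DD-MM-YYYY'))
                            = date_trunc('month', p.today) - interval '11 months') AS months_ago_11
FROM c1
CROSS JOIN params p
GROUP BY c1.country
ORDER BY c1.country;
-- With the sample rows: FR -> 0, 1; UK -> 2, 1
```

The column names stay fixed (months_ago_12, months_ago_11) because the column list of a view is determined when the view is created; headings such as "Feb 2011" would have to be produced by the client or by dynamically generated SQL.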