How to find count of occurrence of each month between 2 dates - sql

Say, I have two dates:
start_date = '2018-06-01'
end_date = '2022-10-01'
How can I get a table of months and the number of times they occur between the two dates?
I want an output like the following:
month | count
-------------
1 | 4
2 | 4
3 | 4
4 | 4
5 | 4
6 | 5
7 | 5
8 | 5
9 | 5
10 | 5
11 | 4
12 | 4
Edit (to answer questions in comments):
No, I don't have table with dates, If I had, I would have added that in question
Year is not important, If it was required, I would have mentioned
Inclusive, and that is why I had given example input and output to tell exactly what I want.

demo:db<>fiddle
SELECT
date_part('month', gs), -- 2
COUNT(*) -- 3
FROM generate_series( -- 1
'2018-06-01',
'2022-10-01',
interval '1 month'
) gs
GROUP BY 1 -- 3
Generate all months between the given dates. Here the given dates are INCLUDED. If you want to exclude them, you need to add/subtract one month like:
generate_series(
'2018-06-01' + interval '1 month',
'2022-10-01' - interval '1 month',
interval '1 month'
)
Extract the month component of the generated dates
Group and count them.

Related

How to calculate total worktime per week [SQL]

I have a table of EMPLOYEES that contains information about the DATE and WORKTIME per that day. Fx:
ID | DATE | WORKTIME |
----------------------------------------
1 | 1-Sep-2014 | 4 |
2 | 2-Sep-2014 | 6 |
1 | 3-Sep-2014 | 5.5 |
1 | 4-Sep-2014 | 7 |
2 | 4-Sep-2014 | 4 |
1 | 9-Sep-2014 | 8 |
and so on.
Question: How can I create a query that would allow me to calculate amount of time worked per week (HOURS_PERWEEK). I understand that I need a summation of WORKTIME together with grouping considering both, ID and week, but so far my trials as well as googling didnt yield any results. Any ideas on this? Thank you in advance!
edit:
Got a solution of
select id, sum (worktime), trunc(date, 'IW') week
from employees
group by id, TRUNC(date, 'IW');
But will need somehow to connect that particular output with DATE table by updating a newly created column such as WEEKLY_TIME. Any hints on that?
You can find the start of the ISO week, which will always be a Monday, using TRUNC("DATE", 'IW').
So if, in the query, you GROUP BY the id and the start of the week TRUNC("DATE", 'IW') then you can SELECT the id and aggregate to find the SUM the WORKTIME column for each id.
Since this appears to be a homework question and you haven't attempted a query, I'll leave it at this to point you in the correct direction and you can complete the query.
Update
Now I need to create another column (lets call it WEEKLY_TIME) and populate it with values from the current output, so that Sep 1,3,4 (for ID=1) would all contain value 16.5, specifying that on that day (that is within the certain week) that person worked 16.5 in total. And for ID=2 it would then be a value of 10 for both Sep 2 and 4.
For this, if I understand correctly, you appear to not want to use aggregation functions and want to use the analytic version of the function:
select id,
"DATE",
trunc("DATE", 'IW') week,
worktime,
sum (worktime) OVER (PARTITION BY id, trunc("DATE", 'IW'))
AS weekly_time
from employees;
Which, for the sample data:
CREATE TABLE employees (ID, "DATE", WORKTIME) AS
SELECT 1, DATE '2014-09-01', 4 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-02', 6 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-03', 5.5 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-04', 7 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-04', 4 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-09', 8 FROM DUAL;
Outputs:
ID
DATE
WEEK
WORKTIME
WEEKLY_TIME
1
2014-09-01 00:00:00
2014-09-01 00:00:00
4
16.5
1
2014-09-03 00:00:00
2014-09-01 00:00:00
5.5
16.5
1
2014-09-04 00:00:00
2014-09-01 00:00:00
7
16.5
1
2014-09-09 00:00:00
2014-09-08 00:00:00
8
8
2
2014-09-04 00:00:00
2014-09-01 00:00:00
4
10
2
2014-09-02 00:00:00
2014-09-01 00:00:00
6
10
db<>fiddle here
edit: answer submitted without noticing "Oracle" tag. Otherwise, question answered here: Oracle SQL - Sum and group data by week
Select employee_Id,
DATEPART(week, workday) as [Week],
sum (worktime) as [Weekly Hours]
from WORK
group by employee_id, DATEPART(week, workday)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=238b229156a383fa3c466b6c3c2dee1e

Calculate the number of people that don't have checks for a period of 3 continuous months between their start_date and end_date using SQL

Checks Table
check_id
date_of_service
person_id
start_date
end_date
1
01/20/18
1
01/20/18
01/20/19
1
02/20/18
1
01/20/18
01/20/19
3
03/20/18
2
01/20/17
01/20/19
4
10/20/18
2
01/20/17
01/20/19
7
04/20/18
3
04/20/18
01/20/19
2
04/20/19
4
04/20/19
04/20/20
5
01/20/19
5
01/20/17
01/20/18
6
10/20/18
3
04/20/18
01/20/19
Calculate the number of people that don't have checks for a period of 3 continuous months between their start_date and end_date using SQL.
In this case,For person_id = 1, there are checks on 01/20/18 and 02/20/18 (Jan - Mar 18), but no checks for 3 months in the next window, so this should be counted. (+1)
So this will be +1 for the no. of people with no checks in a 3 month period.
I want to find the number of people with no checks for any 3 month periods between their start date and end date. The 3 month period should begin from the start_date of the person_id.
Expected Table
Number of people
Value

How to generate series using start and end date and quarters on postgres

I have a table like shown below where I want to use the start and end date to evenly distribute the value for each row to the 3 months in each quarter to all of the quarters in between start and end date (last two columns).
I am familiar with generate series and intervals in Postgres but I am having hard time to get what I want.
My table has and ID column that groups rows together, a quarter column that indicates which quarter the row references for the ID, a value column that is the value for the whole quarter (and every quarter in the date range), and start_date and end_date columns indicating the date range. Here is a sample:
ID quarter value start_date end_date
1 2 152 2019-11-07 2050-12-30
1 1 785 2019-11-07 2050-12-30
2 2 152 2019-03-05 2050-12-30
2 1 785 2019-03-05 2050-12-30
3 4 41 2018-06-12 2050-12-30
3 3 50 2018-06-12 2050-12-30
3 2 88 2018-06-12 2050-12-30
3 1 29 2018-06-12 2050-12-30
4 2 1607 2018-12-17 2050-12-30
4 1 4803 2018-12-17 2050-12-30
Here is my desired output (for ID 1):
ID quarter value start_date end_date
1 2 152/3 2020-04-01 2020-07-01
1 1 785/3 2020-01-01 2020-04-01
1 2 152/3 2021-04-01 2021-07-01
1 1 785/3 2021-01-01 2021-04-01
start_date in the output will be the next quarter on first table. I need the series to be generated from the start_date to the end_date of the first table.
You can do this by using the GENERATE_SERIES function and passing in the start and end date for each unique (by ID) row and setting the interval to 3 months. Then join the result back with your original table on both ID and quarter.
Here's an example (note original_data is what I've called your first table):
WITH
quarters_table AS (
SELECT
t.ID,
(EXTRACT('month' FROM t.quarter_date) - 1)::INT / 3 + 1 AS quarter,
t.quarter_date::DATE AS start_date,
COALESCE(
LEAD(t.quarter_date) OVER (),
DATE_TRUNC('quarter', t.original_end_date) + INTERVAL '3 months'
)::DATE AS end_date
FROM (
SELECT
original_record.ID,
original_record.end_date AS original_end_date,
GENERATE_SERIES(
DATE_TRUNC('quarter', original_record.start_date),
DATE_TRUNC('quarter', original_record.end_date),
INTERVAL '3 months'
) AS quarter_date
FROM (
SELECT DISTINCT ON (original_data.ID)
original_data.ID,
original_data.start_date,
original_data.end_date
FROM
original_data
ORDER BY
original_data.ID
) AS original_record
) AS t
)
SELECT
quarters_table.ID,
quarters_table.quarter,
original_data.value::DOUBLE PRECISION / 3 AS value,
quarters_table.start_date,
quarters_table.end_date
FROM
quarters_table
INNER JOIN
original_data
ON
quarters_table.ID = original_data.ID
AND quarters_table.quarter = original_data.quarter;
Sample output:
id | quarter | value | start_date | end_date
----+---------+------------------+------------+------------
1 | 1 | 261.666666666667 | 2020-01-01 | 2020-04-01
1 | 2 | 50.6666666666667 | 2020-04-01 | 2020-07-01
1 | 1 | 261.666666666667 | 2021-01-01 | 2021-04-01
1 | 2 | 50.6666666666667 | 2021-04-01 | 2021-07-01
For completeness, here's the original_data table I've used in testing:
WITH
original_data AS (
SELECT
1 AS ID,
2 AS quarter,
152 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
1 AS ID,
1 AS quarter,
785 AS value,
'2019-11-07'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
UNION ALL
SELECT
2 AS ID,
2 AS quarter,
152 AS value,
'2019-03-05'::DATE AS start_date,
'2050-12-30'::DATE AS end_date
-- ...
)
This is one way to go about it. Showing an example based on the output you've outlined. You can then add more conditions to the CASE/WHEN for additional quarters.
SELECT
ID,
Quarter,
Value/3 AS "Value",
CASE
WHEN Quarter = 1 THEN '2020-01-01'
WHEN Quarter = 2 THEN '2020-04-01'
END AS "Start_Date",
CASE
WHEN Quarter = 1 THEN '2020-04-01'
WHEN Quarter = 2 THEN '2020-07-01'
END AS "End_Date"
FROM
Table

Those who listened to more than 10 mins each month in the last 6 months

I'm trying to figure out the count of users who listened to more than 10 mins each month in the last 6 months
We have this event: Song_stopped_listen and one attribute is session_progress_ms
Now I'm trying to see the monthly evolution of the count of this cohort over the last 6 months.
I'm using bigquery and this is the query I tried, but I feel that something is off semantically, but I couldn't put my finger on:
SELECT
CONCAT(CAST(EXTRACT(YEAR FROM DATE (timestamp)) AS STRING),"-",CAST(EXTRACT(MONTH FROM DATE (timestamp)) AS STRING)) AS date
,SUM(absl.session_progress_ms/(1000*60*10)) as total_10_ms, COUNT(DISTINCT u.id) as total_10_listeners
FROM ios.song_stopped_listen as absl
LEFT JOIN ios.users u on absl.user_id = u.id
WHERE absl.timestamp > '2018-05-01'
Group by 1
HAVING(total_10_ms > 1)
Please help figure out what I'm doing wrong here.
Thank you.
data Sample:
user_id | session_progress_ms | timestamp
1 | 10000 | 2017-10-10 14:34:25.656 UTC
What I want to have:
||Month-year | Count of users who listened to more than 10 mins
|2018-5 | 500
|2018-6 | 600
|2018-7 | 300
|2018-8 | 5100
|2018-9 | 4500
|2018-10 | 1500
|2018-11 | 1500
|2018-12 | 2500
Use multiple levels of aggregation:
select user_id
from (select ssl.user_id, timestamp_trunc(timestamp, month) as mon,
sum(ssl.session_progress_ms/(1000*60)) as total_minutes
from ios.song_stopped_listen as ssl
where date(ssl.timetamp) < date_trunc(current_date, month) and
date(ssl.timestamp) >= date_add(date_trunc(current_date, month) interval 6 month),
group by 1, 2
) u
where total_minutes >= 10
group by user_id
having count(*) = 6;
To get the count, just use this as a subquery with count(*).

Determine a specific fortnight based on anchor dates

I have 2 x bi-weekly periods that were defined by 2 starting dates 1 week apart. For example, Group 1 started on 2016-01-15 and Group 2 started on 2016-01-22.
By bi-weekly, I mean a rolling period lasting 2 weeks.
How can I determine if the current date is in week 1 of Group 1 or is in week 1 of Group 2?
By way of example, today's date is 2016-04-04 so this would be day 1 of Group 2 and day 8 of Group 1, therefore I would like to a query to return 'Group 2'.
DATEDIFF calculates the difference between two dates. Divide it by 14 days and take the remainder (%).
If remainder is less than 7, then it is closer to that starting date.
Since you know that your starting dates are 1 week apart you really need to check only one starting date.
DECLARE #VarStartGroup1 date = '2016-01-15';
DECLARE #VarStartGroup2 date = '2016-01-22';
DECLARE #VarCurrentDate date = '2016-04-04';
SELECT
DATEDIFF(day, #VarStartGroup1, #VarCurrentDate) AS TotalDays1,
DATEDIFF(day, #VarStartGroup2, #VarCurrentDate) AS TotalDays2,
DATEDIFF(day, #VarStartGroup1, #VarCurrentDate) % 14 AS DayNumberInGroup1,
DATEDIFF(day, #VarStartGroup2, #VarCurrentDate) % 14 AS DayNumberInGroup2,
CASE WHEN DATEDIFF(day, #VarStartGroup1, #VarCurrentDate) % 14 < 7
THEN 'Group1' ELSE 'Group2' END AS Result
;
Result
+------------+------------+-------------------+-------------------+--------+
| TotalDays1 | TotalDays2 | DayNumberInGroup1 | DayNumberInGroup2 | Result |
+------------+------------+-------------------+-------------------+--------+
| 80 | 73 | 10 | 3 | Group2 |
+------------+------------+-------------------+-------------------+--------+
I included intermediate calculations in the result to help understand what is going on.