SQL Count Number of Projects Started Each Day Between Two Dates

I have this table, and I need to count how many projects (jobs) I have started each day.
job start end
1 01-01-2013 04-01-2013
2 01-01-2013 02-01-2013
3 01-01-2013 03-01-2013
4 03-01-2013 04-01-2013
5 03-01-2013 04-01-2013
6 03-01-2013 04-01-2013
...
I want to count how many jobs I have started each day, or rather, how many jobs are open each day:
date count
01-01-2013 3
02-01-2013 3
03-01-2013 5
04-01-2013 4
05-01-2013 0
...

select start, count(*) as jobs_per_day
from your_table
group by start
But this counts only the jobs started on each day, and it will not return a record for dates on which you did not create any job.

The following works for me in PostgreSQL:
with dates as (
select aday::date
from generate_series((select min(start) from your_table),
(select max("end") from your_table), -- end is a reserved word in PostgreSQL, so it is quoted here
'1 day'::interval) aday
), flat as (
select *
from dates, your_table
where dates.aday between your_table.start and your_table.end
)
select
aday,
count(*) as count
from flat
group by aday
order by aday
;
The first CTE generates a series of dates, which might have to be done differently in another RDBMS.
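If the dates with no open jobs should also appear with a count of 0 (like 05-01-2013 in the question), one option is to left join the jobs onto the generated calendar instead of cross joining and filtering. Below is a minimal sketch of that idea, using SQLite through Python's sqlite3 module as a stand-in for PostgreSQL. The table name jobs and the column names start_date and end_date are assumptions made for this example, and the calendar comes from a recursive CTE because SQLite has no generate_series() by default.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# so that plain text comparison orders them correctly.
JOBS = [
    (1, "2013-01-01", "2013-01-04"),
    (2, "2013-01-01", "2013-01-02"),
    (3, "2013-01-01", "2013-01-03"),
    (4, "2013-01-03", "2013-01-04"),
    (5, "2013-01-03", "2013-01-04"),
    (6, "2013-01-03", "2013-01-04"),
]


def open_jobs_per_day(rows, first_day, last_day):
    """Return (day, open_job_count) for every day in [first_day, last_day]."""
    con = sqlite3.connect(":memory:")
    # Hypothetical schema: names are chosen for this sketch only.
    con.execute("create table jobs (job integer, start_date text, end_date text)")
    con.executemany("insert into jobs values (?, ?, ?)", rows)
    query = """
        with recursive calendar(aday) as (
            select ?
            union all
            select date(aday, '+1 day') from calendar where aday < ?
        )
        select c.aday, count(j.job)      -- count(j.job) is 0 when nothing matches
        from calendar c
        left join jobs j
               on c.aday between j.start_date and j.end_date
        group by c.aday
        order by c.aday
    """
    result = con.execute(query, (first_day, last_day)).fetchall()
    con.close()
    return result


for day, count in open_jobs_per_day(JOBS, "2013-01-01", "2013-01-05"):
    print(day, count)
```

Counting j.job rather than * is what makes the empty days come out as 0, because the left join produces a row with a NULL job for them.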

select start as date, count(*) as count from table_name
where start >= 'your start date' and "end" <= 'your end date'
group by start;

Related

sum values based on 7-day cycle in SQL Oracle

I have dates and some values, and I would like to sum the values within a 7-day cycle starting from the first date.
date value
01-01-2021 1
02-01-2021 1
05-01-2021 1
07-01-2021 1
10-01-2021 1
12-01-2021 1
13-01-2021 1
16-01-2021 1
18-01-2021 1
22-01-2021 1
23-01-2021 1
30-01-2021 1
This is my input data; it contains 4 groups, to show which groups the 7-day cycle creates.
A cycle should start with the first date and sum all values within the 7 days after it, first date included.
Then a new group starts with the next date and covers another 7 days: 10-01 till 17-01, then again a new group from 18-01 till 25-01, and so on.
So the output will be
group1 4
group2 4
group3 3
group4 1
With match_recognize this would be easy, using current_day < first_day + 7 as the condition for the pattern, but please don't use the match_recognize clause in the solution!
One approach is a recursive CTE:
with tt as (
select dte, value, row_number() over (order by dte) as seqnum
from t
),
cte (dte, value, seqnum, firstdte) as (
select tt.dte, tt.value, tt.seqnum, tt.dte
from tt
where seqnum = 1
union all
select tt.dte, tt.value, tt.seqnum,
(case when tt.dte < cte.firstdte + interval '7' day then cte.firstdte else tt.dte end)
from cte join
tt
on tt.seqnum = cte.seqnum + 1
)
select firstdte, sum(value)
from cte
group by firstdte
order by firstdte;
This identifies the groups by the first date. You can use row_number() over (order by firstdte) if you want a number.
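To make the grouping rule easier to check by hand, here is a small sketch of the same logic outside the database, in plain Python. It walks the rows in date order and carries the first date of the current cycle forward, which is what the recursive member of the CTE does with firstdte. The function name and the in-memory list are assumptions for this example only.

```python
from datetime import date, timedelta

# The twelve input rows from the question (January 2021, value 1 each).
ROWS = [(date(2021, 1, d), 1) for d in (1, 2, 5, 7, 10, 12, 13, 16, 18, 22, 23, 30)]


def sum_by_cycle(rows, days=7):
    """Sum values per cycle; a cycle starts at the first date not covered
    by the previous cycle and covers the `days` days from that date."""
    groups = []  # each entry is [first_date_of_cycle, running_total]
    for dte, value in sorted(rows):
        if groups and dte < groups[-1][0] + timedelta(days=days):
            groups[-1][1] += value        # still inside the current cycle
        else:
            groups.append([dte, value])   # this date opens a new cycle
    return [(first, total) for first, total in groups]


for first, total in sum_by_cycle(ROWS):
    print(first, total)
```

The cycles here start on 01-01, 10-01, 18-01 and 30-01, with totals 4, 4, 3 and 1, the same as the expected output in the question.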

Finding multiple consecutive dates (datetime) in Ruby on Rails / Postgresql

How can we find X consecutive dates (using by hour) that meet a condition?
EDIT: here is the SQL fiddle http://sqlfiddle.com/#!17/44928/1
Example:
Find 3 consecutive dates where aa < 2 and bb < 6 and cc < 7
Given this table called weather:
timestamp aa bb cc
01/01/2000 00:00 1 5 5
01/01/2000 01:00 5 5 5
01/01/2000 02:00 1 5 5
01/01/2000 03:00 1 5 5
01/01/2000 04:00 1 5 5
01/01/2000 05:00 1 5 5
Answer should return the 3 records from 02:00, 03:00, 04:00.
How can we do this in Ruby on Rails - or directly in SQL if that is better?
I started working on a method based on this answer:
Detect consecutive dates ranges using SQL
def consecutive_dates
the_query = "WITH t AS (
SELECT timestamp d,ROW_NUMBER() OVER(ORDER BY timestamp) i
FROM #d
GROUP BY timestamp
)
SELECT MIN(d),MAX(d)
FROM t
GROUP BY DATEDIFF(hour,i,d)"
ActiveRecord::Base.connection.execute(the_query)
end
But I was unable to get it working.
Assuming that you have one row every hour, then an easy way to get the first hour where this occurs uses lead():
select t.*
from (select t.*,
lead(timestamp, 2) over (order by timestamp) as timestamp_2
from t
where aa < 2 and bb < 6 and cc < 7
) t
where timestamp_2 = timestamp + interval '2 hour';
This filters on the conditions and looks at the row two rows ahead. If that row is exactly two hours later, then three rows in a row match the conditions. Note: The above will return both 2000-01-01 02:00 and 2000-01-01 03:00.
From your question you only seem to want the earliest. To handle that, use lag() as well:
select t.*
from (select t.*,
lag(timestamp) over (order by timestamp) as prev_timestamp,
lead(timestamp, 2) over (order by timestamp) as timestamp_2
from t
where aa < 2 and bb < 6 and cc < 7
) t
where timestamp_2 = timestamp + interval '2 hour' and
(prev_timestamp is null or prev_timestamp < timestamp - interval '1' hour);
You can generate the additional hours using generate_series() if you really need the original rows:
select t.timestamp + n.n * interval '1 hour', aa, bb, cc
from (select t.*,
lead(timestamp, 2) over (order by timestamp) as timestamp_2
from t
where aa < 2 and bb < 6 and cc < 7
) t cross join lateral
generate_series(0, 2) n
where timestamp_2 = timestamp + interval '2 hour';
Your data seems to have precise timestamps based on the question, so the timestamp equalities will work. If the real data has more fuzziness, then the queries can be tweaked to take this into account.
This is a gaps-and-islands problem. Islands are adjacent records that match the condition, and you want islands that are at least 3 records long.
Here is one approach that uses a window count, which increments every time a row that does not match the condition is met, to define the groups. We can then count how many rows there are in each group, and use that information to filter.
select *
from (
select t.*, count(*) filter(where aa < 2 and bb < 6 and cc < 7) over(partition by grp) cnt
from (
select t.*,
count(*) filter(where not (aa < 2 and bb < 6 and cc < 7)) over(order by timestamp) grp
from weather t
) t
) t
where cnt >= 3 and aa < 2 and bb < 6 and cc < 7
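For comparison, the island idea can be sketched in plain Python, which may be handy on the Rails side for checking what the SQL should return. This is only an illustration under stated assumptions: the rows are tuples of (timestamp, aa, bb, cc), the condition is the one from the question, and two rows count as adjacent only when they are exactly one hour apart. The function name is made up for this example.

```python
from datetime import datetime, timedelta

# The six sample rows from the question: (timestamp, aa, bb, cc).
WEATHER = [
    (datetime(2000, 1, 1, 0), 1, 5, 5),
    (datetime(2000, 1, 1, 1), 5, 5, 5),
    (datetime(2000, 1, 1, 2), 1, 5, 5),
    (datetime(2000, 1, 1, 3), 1, 5, 5),
    (datetime(2000, 1, 1, 4), 1, 5, 5),
    (datetime(2000, 1, 1, 5), 1, 5, 5),
]


def matching_runs(rows, min_len=3, step=timedelta(hours=1)):
    """Return every island of at least `min_len` consecutive matching rows."""
    runs, current = [], []
    for row in sorted(rows, key=lambda r: r[0]):
        ts, aa, bb, cc = row
        ok = aa < 2 and bb < 6 and cc < 7
        contiguous = bool(current) and ts - current[-1][0] == step
        if ok and (contiguous or not current):
            current.append(row)           # extend (or start) the island
        else:
            if len(current) >= min_len:   # the island ended: keep it if long enough
                runs.append(current)
            current = [row] if ok else []
    if len(current) >= min_len:           # do not forget the last island
        runs.append(current)
    return runs


for run in matching_runs(WEATHER):
    print([row[0].strftime("%H:%M") for row in run])
```

On the sample data the only island long enough is 02:00 through 05:00, so four rows come back. If exactly the first three rows of each island are wanted, the run can be sliced with run[:3].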

SQL not returning a value if no row exist for time queried

I'm writing an SQL query that returns the number of records created per hour over the last 24 hours. I only get results for the hours that have a non-zero count. If no records were created in an hour, nothing is returned for it at all.
Here's my query:
SELECT HOUR(timeStamp) as hour, COUNT(*) as count
FROM `events`
WHERE timeStamp > DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY HOUR(timeStamp)
ORDER BY HOUR(timeStamp)
The output of current Query:
+-----------------+----------+
| hour | count |
+-----------------+----------+
| 14 | 6 |
| 15 | 5 |
+-----------------+----------+
But I'm expecting 0 for hours in which no records were created. Where am I going wrong?
One solution is to generate a table of numbers from 0 to 23 and left join it with your original table.
Here is a query that uses a recursive query to generate the list of hours (if you are running MySQL, this requires version 8.0):
with recursive hours as (
select 0 hr
union all select hr + 1 from hours where hr < 23
)
select h.hr, count(e.eventID) as cnt
from hours h
left join events e
on e.timestamp > now() - interval 1 day
and hour(e.timestamp) = h.hr
group by h.hr
If your RDBMS does not support recursive CTEs, then one option is to use an explicit derived table:
select h.hr, count(e.eventID) as cnt
from (
select 0 hr union all select 1 union all select 2 ... union all select 23
) h
left join events e
on e.timestamp > now() - interval 1 day
and hour(e.timestamp) = h.hr
group by h.hr
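Here is a small runnable sketch of the same left-join idea, using SQLite through Python's sqlite3 module rather than MySQL, so the date functions differ: strftime('%H', ...) stands in for HOUR(), and the cutoff is passed in as a parameter instead of being computed from NOW(). The column names event_id and ts and the sample timestamps are assumptions for this example.

```python
import sqlite3


def counts_per_hour(timestamps, since):
    """Return 24 rows of (hour, count), including hours with no events."""
    con = sqlite3.connect(":memory:")
    # Hypothetical schema for the sketch: one row per event.
    con.execute("create table events (event_id integer primary key, ts text)")
    con.executemany("insert into events (ts) values (?)", [(t,) for t in timestamps])
    query = """
        with recursive hours(hr) as (
            select 0
            union all
            select hr + 1 from hours where hr < 23
        )
        select h.hr, count(e.event_id)
        from hours h
        left join events e
               on e.ts > ?
              and cast(strftime('%H', e.ts) as integer) = h.hr
        group by h.hr
        order by h.hr
    """
    rows = con.execute(query, (since,)).fetchall()
    con.close()
    return rows


SAMPLE = ["2024-01-01 14:05:00", "2024-01-01 14:40:00", "2024-01-01 15:10:00"]
for hr, cnt in counts_per_hour(SAMPLE, "2024-01-01 00:00:00"):
    print(hr, cnt)
```

Note that the time filter sits in the ON clause, not in WHERE. Putting it in WHERE would discard the hours that have no events and bring back the original problem.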

Totals over rolling timeframe

I have my data arranged like this:
obj_id quantity date
1 3 2014-05-06
2 2 2014-03-12
3 5 2014-10-07
4 7 2014-05-09
2 8 2014-12-31
1 5 2014-01-16
4 1 2014-07-26
3 2 2014-09-15
...
What I need is to find the OBJ_IDs that have SUM(quantity) > MAX over a period of RANGE days.
In my case MAX is 18 and RANGE is 31 days.
In other words, every OBJ_ID receives a QUANTITY (of whatever) from time to time. I need to find the OBJ_IDs that received more than 18 in total, where the dates on which they received those quantities span less than 31 days.
I think I need to use LAG here, but not sure how the whole thing should be.
Thanks in advance.
This might need some tweaking as I didn't have the time to decently test it, but maybe it'll get you on the right track:
(I've assumed you want the records where the date is within the last 31 days)
SELECT obj_id, SUM(quantity)
FROM tblTable
WHERE date BETWEEN DATEADD(day, -31, GETDATE()) AND GETDATE() -- RANGE = 31 days
GROUP BY obj_id
HAVING SUM(quantity) > 18 -- MAX = 18
I'm currently testing a solution a colleague of mine has quickly put together:
SELECT A.*
FROM (
SELECT A.obj_id
, A.date
, A.in_month_date
, A.date - A.in_month_date AS in_month
, A.quantity
, A.in_month_quantity
FROM (
SELECT A.obj_id
, A.date
, FIRST_VALUE(A.date)
OVER (
PARTITION BY A.obj_id
ORDER BY A.date
RANGE BETWEEN 31 PRECEDING
AND CURRENT ROW
) AS in_month_date
, A.quantity
, SUM(A.quantity)
OVER (
PARTITION BY A.obj_id
ORDER BY A.date
RANGE BETWEEN 31 PRECEDING
AND CURRENT ROW
) AS in_month_quantity
FROM mytable A
) A
) A
WHERE A.in_month <= 31
AND A.in_month_quantity > 18
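The rolling-window idea behind that query can be sketched in plain Python to check results on a small sample. This is an illustration only: the function name and the sample rows are made up, and the window is taken as inclusive (a row counts if it is at most 31 days older than the current one), which mirrors RANGE BETWEEN 31 PRECEDING AND CURRENT ROW rather than a strict "less than 31 days".

```python
from collections import defaultdict, deque
from datetime import date


def objects_over_limit(rows, max_total=18, window_days=31):
    """Return the obj_ids whose quantity sum exceeds max_total
    within some window of window_days days."""
    by_obj = defaultdict(list)
    for obj_id, quantity, day in rows:
        by_obj[obj_id].append((day, quantity))

    hits = set()
    for obj_id, items in by_obj.items():
        window, total = deque(), 0
        for day, quantity in sorted(items):
            window.append((day, quantity))
            total += quantity
            # Drop rows that are more than window_days older than the current row.
            while (day - window[0][0]).days > window_days:
                _, old_quantity = window.popleft()
                total -= old_quantity
            if total > max_total:
                hits.add(obj_id)
                break
    return sorted(hits)


# Hypothetical sample: (obj_id, quantity, date).
SAMPLE_ROWS = [
    (1, 10, date(2014, 1, 1)),
    (1, 9, date(2014, 1, 20)),   # 19 within 19 days: over the limit
    (2, 10, date(2014, 1, 1)),
    (2, 9, date(2014, 3, 1)),    # same total, but 59 days apart
    (3, 18, date(2014, 1, 1)),   # exactly 18 is not more than 18
]
print(objects_over_limit(SAMPLE_ROWS))
```

Only obj_id 1 qualifies in this sample, since the other two either spread the quantity over too many days or do not exceed the limit.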

Oracle select sum by time window

Let's assume that we have an Oracle table with the following format and data:
TIMESTAMP MESSAGENO ORGMESSAGE
------------------------- ---------------------- -------------------------------------
27.04.13 1 START PERIOD
27.04.13 3 10
27.04.13 4 5
28.04.13 5 6
28.04.13 3 20
29.04.13 4 25
29.04.13 5 26
30.04.13 2 END PERIOD
30.04.13 1 START PERIOD
01.05.13 3 10
02.05.13 4 15
02.05.13 5 16
03.05.13 3 30
03.05.13 4 35
04.05.13 5 36
05.05.13 2 END PERIOD
I want to select the sum of ORGMESSAGE for each period (the window between START PERIOD and END PERIOD), grouped by MESSAGENO.
Example output would be:
PERIOD START PERIOD END MESSAGENO SUM
------------ ------------- -------- ----
27.04.13 30.04.13 3 30
27.04.13 30.04.13 4 30
27.04.13 30.04.13 5 32
30.04.13 05.05.13 3 40
30.04.13 05.05.13 4 50
30.04.13 05.05.13 5 52
I am guessing that an Oracle analytic function would be suitable, but I really don't know how or where to start.
Thanks in advance for any help.
If we assume that the period starts and ends match, then a simple way to find the matching messages is to count the preceding number of starts. This is a cumulative sum and it is easy in Oracle. The rest is just aggregation:
select min(timestamp) as periodstart, max(timestamp) as periodend, messageno, sum(to_number(orgmessage)) as period_sum
from (select om.*,
sum(case when messageno = 1 then 1 else 0 end) over (order by timestamp) as grp
from orgmessages om
) om
where messageno not in (1, 2)
group by grp, messageno;
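Note that this method (as with the others) really wants the timestamp to be unique on each record. On the data presented, these solutions will work. But if you have multiple starts and ends on the same day, none of them will work, assuming that timestamp holds only the date. Also note that the START and END rows are filtered out before aggregating, so min(timestamp) and max(timestamp) here are the first and last message dates for each MESSAGENO, not the dates of the period markers themselves.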
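To see the grouping at work on the sample rows, here is a plain Python sketch of the same idea: every START PERIOD row opens a new group, and the values are summed per MESSAGENO until the matching END PERIOD row. Unlike the SQL, it relies on the order of the input list rather than on the timestamp, which sidesteps the same-day ambiguity. The function name and the tuple layout are assumptions for this example.

```python
# Sample rows from the question, in their original order:
# (timestamp, messageno, orgmessage).
MESSAGES = [
    ("27.04.13", 1, "START PERIOD"),
    ("27.04.13", 3, "10"),
    ("27.04.13", 4, "5"),
    ("28.04.13", 5, "6"),
    ("28.04.13", 3, "20"),
    ("29.04.13", 4, "25"),
    ("29.04.13", 5, "26"),
    ("30.04.13", 2, "END PERIOD"),
    ("30.04.13", 1, "START PERIOD"),
    ("01.05.13", 3, "10"),
    ("02.05.13", 4, "15"),
    ("02.05.13", 5, "16"),
    ("03.05.13", 3, "30"),
    ("03.05.13", 4, "35"),
    ("04.05.13", 5, "36"),
    ("05.05.13", 2, "END PERIOD"),
]


def sums_per_period(rows):
    """Return (period_start, period_end, messageno, total) per closed period."""
    results = []
    start, totals = None, {}
    for day, messageno, orgmessage in rows:
        if orgmessage == "START PERIOD":
            start, totals = day, {}            # open a new period
        elif orgmessage == "END PERIOD":
            for msg in sorted(totals):         # close it and emit the sums
                results.append((start, day, msg, totals[msg]))
            start = None
        elif start is not None:                # ordinary row inside a period
            totals[messageno] = totals.get(messageno, 0) + int(orgmessage)
    return results


for row in sums_per_period(MESSAGES):
    print(*row)
```

Rows that fall outside any period are ignored, and a period that never gets an END PERIOD row produces no output in this sketch.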
First find all period ends per period start. Then join with your table to group and sum.
select
dates.start_date,
dates.end_date,
messageno,
sum(to_number(orgmessage)) as period_sum
from mytable
join
(
select start_dates.timestmp as start_date, min(end_dates.timestmp) as end_date
from (select * from mytable where orgmessage = 'START PERIOD') start_dates
join (select * from mytable where orgmessage = 'END PERIOD') end_dates
on start_dates.timestmp < end_dates.timestmp
group by start_dates.timestmp
) dates on mytable.timestmp between dates.start_date and dates.end_date
where mytable.orgmessage not like '%PERIOD%'
group by dates.start_date, dates.end_date, messageno
order by dates.start_date, dates.end_date, messageno;
SQL fiddle: http://www.sqlfiddle.com/#!4/365de/15.
Please try this one; replace rrr with your table name:
select periodstart, periodend, messageno, sum(to_number(orgmessage)) s
from (select TIMESTAMP periodstart,
(select min (TIMESTAMP) from rrr r2 where orgmessage = 'END PERIOD' and r2.TIMESTAMP > r.TIMESTAMP) periodend
from rrr r
where orgmessage = 'START PERIOD'
) borders, rrr r
where r.TIMESTAMP between borders.periodstart and borders.periodend
and r.orgmessage not in ('END PERIOD', 'START PERIOD')
group by periodstart, periodend, messageno
order by periodstart, periodend, messageno