How to fill the gaps? - sql

Assuming I have two records, both with a date and a count:
--Date-- --Count--
2011-09-20 00:00:00 5
2011-09-16 00:00:00 8
How would you select this for filling the time gaps, always taking the last previous record?
So the output would be:
--Date-- --Count--
2011-09-20 00:00:00 5
2011-09-19 00:00:00 8
2011-09-18 00:00:00 8
2011-09-17 00:00:00 8
2011-09-16 00:00:00 8
I haven't been able to figure out a neat solution for this yet.
I guess it could be done with DATEDIFF and a loop, but I hope there is an easier way.

You have 2 issues you're trying to resolve. The first issue is how to fill the gaps. The second issue is populating the Count field for those missing records.
Issue 1: This can be resolved by either using a Dates Lookup table or by creating a recursive common table expression. I would recommend creating a Dates Lookup table for this if that is an option. If you cannot create such a table, then you're going to need something like this.
WITH CTE AS (
    SELECT MAX(dt) maxdate, MIN(dt) mindate
    FROM yourtable
),
RecursiveCTE AS (
    SELECT mindate dtfield
    FROM CTE
    UNION ALL
    SELECT DATEADD(day, 1, dtfield)
    FROM RecursiveCTE R
    JOIN CTE T
        ON R.dtfield < T.maxdate
)
That should give you a list of dates starting with the MIN date in your table and ending with the MAX. Keep in mind that SQL Server stops a recursive CTE after 100 levels by default, so a range of more than about 100 days needs OPTION (MAXRECURSION n) on the final query.
Issue 2: Here is where a correlated subquery would come in handy (as much as I generally stay away from them) to get the last cnt from your original table:
SELECT r.dtfield,
       (SELECT TOP 1 cnt
        FROM yourtable
        WHERE dt <= r.dtfield
        ORDER BY dt DESC) cnt
FROM RecursiveCTE r
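The two steps above can be tried end to end without a SQL Server instance. The following is only a sketch: it runs the same idea on SQLite through Python's standard sqlite3 module, so the syntax differs from the T-SQL above (date(x, '+1 day') stands in for DATEADD, LIMIT 1 for TOP 1), and the names bounds and days are made up for the illustration.

```python
import sqlite3

# Table and column names (yourtable, dt, cnt) follow the answer above;
# the CTE names "bounds" and "days" are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (dt TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("2011-09-16", 8), ("2011-09-20", 5)],
)

FILL_GAPS_SQL = """
WITH RECURSIVE bounds AS (
    SELECT MIN(dt) AS mindate, MAX(dt) AS maxdate FROM yourtable
),
days(dtfield) AS (
    SELECT mindate FROM bounds
    UNION ALL
    -- SQLite's date(x, '+1 day') plays the role of DATEADD(day, 1, x)
    SELECT date(dtfield, '+1 day') FROM days, bounds
    WHERE dtfield < maxdate
)
SELECT d.dtfield,
       -- latest count on or before this day (LIMIT 1 replaces TOP 1)
       (SELECT cnt FROM yourtable
         WHERE dt <= d.dtfield
         ORDER BY dt DESC LIMIT 1) AS cnt
FROM days d
ORDER BY d.dtfield DESC
"""

filled = conn.execute(FILL_GAPS_SQL).fetchall()
for row in filled:
    print(row)
```

With the two sample rows from the question, this prints one row per day from 2011-09-20 down to 2011-09-16, with the three missing days carrying the count 8.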

My solution goes like this.
Step 1: Have a date table that contains all the dates. There are many ways to build one; for example, see "Get a list of dates between two dates".
Step 2: Do a left outer join from the date table to your result set, which gives you the result set below. Call this table TEST_DATE_COUNT:
--Date-- --Count--
2011-09-20 00:00:00 5
2011-09-19 00:00:00 0
2011-09-18 00:00:00 0
2011-09-17 00:00:00 0
2011-09-16 00:00:00 8
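Step 3: Use a correlated subquery like the one below:
-- For a filled-in day (count 0), take the count from the latest earlier day that has one.
-- MAX(count_x) over all earlier days would return the largest earlier count instead.
SELECT t1.date_x, t1.count_x,
       (CASE WHEN t1.count_x = 0
             THEN (SELECT r.count_x
                   FROM TEST_DATE_COUNT r
                   WHERE r.date_x = (SELECT MAX(r2.date_x)
                                     FROM TEST_DATE_COUNT r2
                                     WHERE r2.date_x < t1.date_x
                                       AND r2.count_x <> 0))
             ELSE t1.count_x
        END) cnt
FROM TEST_DATE_COUNT t1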
Please let me know if this works. I tested and it worked.

Related

Get the current effective date from the list of records has past and future dates in SQL Server

I have a list of records with an EffectiveOn column in a SQL Server database table. I want to fetch the record whose EffectiveOn is currently applicable relative to the current date. Consider the following table:
Id Data EffectiveOn
_____________________________________
1 abc 2020-04-28
2 xyz 2020-08-05
3 dhd 2020-10-30
4 ert 2020-12-28
5 lkj 2021-03-19
In the above table I have to fetch the record (Id: 3) because the current date (i.e., today) is 2020-11-19
Expected Resultset
Id Data EffectiveOn
_____________________________________
3 dhd 2020-10-30
I tried the solution from "How do I get the current records based on it's Effective Date?" but couldn't get it to work.
Kindly help me get the expected result set.
You can do:
select top (1) *
from mytable t
where effectiveon <= convert(date, getdate())
order by effectiveon desc
This selects the greatest date before today (or today, if available).
You can also try the ROW_NUMBER function:
select id, data, effectiveon
from (
    -- number the rows effective on or before today, newest first
    select row_number() over (order by effectiveon desc) as sno, *
    from #table
    where effectiveon <= cast(getdate() as date)
) a
where sno = 1
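Both answers boil down to "filter to dates not after today, sort newest first, keep one row". As a rough sketch of that idea, assuming SQLite through Python's sqlite3 module as a stand-in for SQL Server (LIMIT 1 replaces TOP (1)), and with a hypothetical helper current_record that takes today's date as a parameter instead of calling getdate():

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, data TEXT, effectiveon TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "abc", "2020-04-28"),
        (2, "xyz", "2020-08-05"),
        (3, "dhd", "2020-10-30"),
        (4, "ert", "2020-12-28"),
        (5, "lkj", "2021-03-19"),
    ],
)


def current_record(conn, today):
    """Return the row with the greatest effectiveon that is not after `today`.

    Hypothetical helper: `today` is an ISO date string, passed in so that the
    boundary cases can be checked without depending on the system clock.
    """
    return conn.execute(
        "SELECT id, data, effectiveon FROM mytable "
        "WHERE effectiveon <= ? "
        "ORDER BY effectiveon DESC LIMIT 1",
        (today,),
    ).fetchone()


current = current_record(conn, "2020-11-19")
print(current)
```

Calling current_record with a date earlier than every EffectiveOn returns None, which is the case the caller has to decide how to handle.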

Getting maximum sequential streak with events - updated question

I've previously posted a similar question, but a change to the parameters means the solution posted there no longer works, and I've had trouble working out how to integrate the revised requirement. I'm not sure of the protocol here; it appears that I can't post an updated question to the original post at "Getting maximum sequential streak with events".
I’m looking for a single query, if possible, running PostgreSQL 9.6.6 under pgAdmin3 v1.22.1
I have a table with a date and a row for each event on the date:
Date Events
2018-12-10 1
2018-12-10 1
2018-12-10 0
2018-12-09 1
2018-12-08 0
2018-12-08 0
2018-12-07 1
2018-12-06 1
2018-12-06 1
2018-12-06 0
2018-12-06 1
2018-12-04 1
2018-12-03 0
I’m looking for the longest sequence of dates without a break. In this case, 2018-12-08 and 2018-12-03 are the only dates with no events. There are two dates with events between 2018-12-08 and today, and three between 2018-12-03 and 2018-12-08, so I would like the answer of 3.
I know I can group them together with something like:
Select Date, count(Date) from Table group by Date order by Date Desc
To get just the most recent sequence, I’ve got something like this- the subquery returns the most recent date with no events, and the outer query counts the dates after that date:
select date, count(distinct date) from Table
where date>
( select date from Table
group by date
having count (case when Events is not null then 1 else null end) = 0
order by date desc
fetch first row only)
group by date
But now I need the longest streak, not just the most recent streak.
I had assumed when I posted previously that there were rows for every date in the range. But this assumption wasn't correct, so the answer given doesn't work. I also need the query to return the start and end date for the range.
Thank you!
You can assign groups by doing a cumulative count of the 0s, then count the distinct dates in each group:
select count(*), min(date), max(date), count(distinct date)
from (select t.*,
count(*) filter (where events = 0) over (order by date) as grp
from t
) t
group by grp
order by count(distinct date) desc
limit 1;
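One thing to watch with the sample data is that some dates with events also carry a row with 0 (2018-12-10 and 2018-12-06), and counting those rows would start a new group in the middle of a streak. The sketch below is a variation rather than the query above: it first collapses the table to one row per date, assuming a date "has events" when any of its rows is 1, and only then does the cumulative count. It runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module; the CTE names daily and grouped are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, events INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2018-12-10", 1), ("2018-12-10", 1), ("2018-12-10", 0),
        ("2018-12-09", 1),
        ("2018-12-08", 0), ("2018-12-08", 0),
        ("2018-12-07", 1),
        ("2018-12-06", 1), ("2018-12-06", 1), ("2018-12-06", 0), ("2018-12-06", 1),
        ("2018-12-04", 1),
        ("2018-12-03", 0),
    ],
)

STREAK_SQL = """
WITH daily AS (
    -- one row per date; has_events is 1 if any row on that date is 1
    SELECT date, MAX(events) AS has_events
    FROM t
    GROUP BY date
),
grouped AS (
    -- running count of no-event dates: it increases at every break
    SELECT date, has_events,
           SUM(CASE WHEN has_events = 0 THEN 1 ELSE 0 END)
               OVER (ORDER BY date) AS grp
    FROM daily
)
SELECT COUNT(*) AS streak, MIN(date) AS start_date, MAX(date) AS end_date
FROM grouped
WHERE has_events = 1          -- the break dates themselves are not part of a streak
GROUP BY grp
ORDER BY streak DESC
LIMIT 1
"""

longest_streak = conn.execute(STREAK_SQL).fetchone()
print(longest_streak)
```

On the sample data this reports a streak of 3 running from 2018-12-04 to 2018-12-07. Dates missing from the table, such as 2018-12-05, do not break a streak, which matches the expected answer in the question.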

Smoothing out a result set by date

Using SQL I need to return a smooth set of results (i.e. one per day) from a dataset that contains 0-N records per day.
The result per day should be the most recent previous value even if that is not from the same day. For example:
Starting data:
Date: Time: Value
19/3/2014 10:01 5
19/3/2014 11:08 3
19/3/2014 17:19 6
20/3/2014 09:11 4
22/3/2014 14:01 5
Required output:
Date: Value
19/3/2014 6
20/3/2014 4
21/3/2014 4
22/3/2014 5
First you need to complete the date range and fill in the missing dates (21/3/2014 in your example). This can be done either by joining a calendar table, if you have one, or by using a recursive common table expression to generate the complete sequence on the fly.
Once you have the complete sequence of dates, finding the max value for each date, or the value from the latest previous non-null row, becomes easy. In this query I use a correlated subquery to do it.
with cte as (
select min(date) date, max(date) max_date from your_table
union all
select dateadd(day, 1, date) date, max_date
from cte
where date < max_date
)
select
c.date,
(
select top 1 max(value) from your_table
where date <= c.date group by date order by date desc
) value
from cte c
order by c.date;
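Since the question asks for the most recent previous value rather than the largest one, a variation worth considering is to order by date and time inside the subquery. The following is a sketch of that variation only, run on SQLite through Python's sqlite3 module instead of SQL Server; it assumes the dates are stored as ISO strings and the times as zero-padded HH:MM, and the CTE name days is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (date TEXT, time TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("2014-03-19", "10:01", 5),
        ("2014-03-19", "11:08", 3),
        ("2014-03-19", "17:19", 6),
        ("2014-03-20", "09:11", 4),
        ("2014-03-22", "14:01", 5),
    ],
)

SMOOTH_SQL = """
WITH RECURSIVE days(day, max_day) AS (
    SELECT MIN(date), MAX(date) FROM your_table
    UNION ALL
    SELECT date(day, '+1 day'), max_day FROM days WHERE day < max_day
)
SELECT d.day,
       -- newest record on or before this day, by date and then by time
       (SELECT value FROM your_table
         WHERE date <= d.day
         ORDER BY date DESC, time DESC
         LIMIT 1) AS value
FROM days d
ORDER BY d.day
"""

smoothed = conn.execute(SMOOTH_SQL).fetchall()
for row in smoothed:
    print(row)
```

With the sample data the latest value of each day happens to be the largest one too, so both approaches agree here; they would differ on a day whose last reading is lower than an earlier one.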
Maybe this works; try it and let me know:
select date, value from test where (time,date) in (select max(time),date from test group by date);

SQL Server Date Selection in Recordset based on initial date and number of days in another recordset

I had difficulty finding what I was looking for in the forum postings on here, so any assistance at all will be greatly appreciated! :)
To help explain what I want, here's a little snippet of a view result-set that I'm working with:
CalendarDate: WorkDay:
2014-10-03 00:00:00.000 1
2014-10-02 00:00:00.000 1
2014-10-01 00:00:00.000 1
2014-09-30 00:00:00.000 1
2014-09-29 00:00:00.000 1
2014-09-26 00:00:00.000 1
2014-09-25 00:00:00.000 1
This view represents a table in our database that keeps track of actual working days for our company; this view excludes any non-working days (hence all the "1"s).
What I'm trying to do is take a datetime value from another result-set, find it in this result-set and count down the number of days (based on a value being brought in from another result-set as well). So, if I was starting with October 3, 2014 and the number of days I was going back was 5, I want to end up on September 26, 2014.
Personally, I see this being accomplished in a unique record count on a pre-sorted view, but SQL is a diverse universe of ways to do the same thing and I would like to achieve this in the most efficient way possible :).
Like I said at the beginning, I didn't know how this question should be properly worded, so if this is a duplicate post then I apologize.
You can use the row_number analytic function and then derive the difference in days.
Assuming your second result set looks like this:
create table Table2
( StartDate datetime,
days int
);
insert into Table2 values ('2014-10-03', 5);
insert into Table2 values ('2014-10-02', 5);
You can join the current table with this result set and get the required output dates using a CTE, row_number and a self join.
with cte
as
(
select CalendarDate, row_number() over ( order by CalendarDate desc) as rn, WorkDay
from Table1
)
select T1.StartDate, T1.days, T2.CalendarDate as OutDate from
cte
join Table2 T1
on cte.calendarDate = T1.StartDate
join cte T2
on T2.rn - cte.rn = T1.days
The result will come out like this:
STARTDATE DAYS OUTDATE
October, 03 2014 5 September, 26 2014
October, 02 2014 5 September, 25 2014
Alternatively, you can use the TOP clause:
SELECT TOP 1 CalendarDate
FROM (SELECT TOP 5 CalendarDate
FROM DateTable
WHERE CalendarDate <'2014-10-03'
ORDER BY CalendarDate DESC
) AS T5
ORDER BY CalendarDate ASC
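The nested TOP query hard-codes both the start date and the number of days. A parameterized version of the same idea might look like the sketch below. It is an illustration on SQLite through Python's sqlite3 module, not SQL Server code: LIMIT 1 OFFSET n-1 takes the place of the nested TOP queries, the table name Table1 is borrowed from the other answer, and working_days_back is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (CalendarDate TEXT, WorkDay INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, 1)",
    [
        ("2014-10-03",), ("2014-10-02",), ("2014-10-01",), ("2014-09-30",),
        ("2014-09-29",), ("2014-09-26",), ("2014-09-25",),
    ],
)


def working_days_back(conn, start_date, days):
    """Return the working day that lies `days` working days before start_date.

    Returns None when the calendar does not reach back that far.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    row = conn.execute(
        "SELECT CalendarDate FROM Table1 "
        "WHERE CalendarDate < ? "
        "ORDER BY CalendarDate DESC "
        "LIMIT 1 OFFSET ?",  # skip the days-1 nearest working days, keep the next
        (start_date, days - 1),
    ).fetchone()
    return row[0] if row else None


out_date = working_days_back(conn, "2014-10-03", 5)
print(out_date)
```

Because only working days are stored in the view, skipping rows is the same as skipping working days, so weekends need no special handling.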

PostgreSQL query to count/group by day and display days with no data

I need to create a PostgreSQL query that returns
a day
the number of objects found for that day
It's important that every single day appear in the results, even if no objects were found on that day. (This has been discussed before but I haven't been able to get things working in my specific case.)
First, I found a sql query to generate a range of days, with which I can join:
SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
Results in:
date
------------
2013-03-28
2013-03-27
2013-03-26
2013-03-25
...
2012-03-28
(366 rows)
Now I'm trying to join that to a table named 'sharer_emailshare' which has a 'created' column:
Table 'public.sharer_emailshare'
column | type
-------------------
id | integer
created | timestamp with time zone
message | text
to | character varying(75)
Here's the best GROUP BY query I have so far:
SELECT d.date, count(se.id) FROM (
select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
) d
JOIN sharer_emailshare se
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
GROUP BY d.date;
The results:
date | count
------------+-------
2013-03-27 | 11
2013-03-24 | 2
2013-02-14 | 2
(3 rows)
Desired results:
date | count
------------+-------
2013-03-28 | 0
2013-03-27 | 11
2013-03-26 | 0
2013-03-25 | 0
2013-03-24 | 2
2013-03-23 | 0
...
2012-03-28 | 0
(366 rows)
If I understand correctly this is because I'm using a plain (implied INNER) JOIN, and this is the expected behavior, as discussed in the postgres docs.
I've looked through dozens of StackOverflow solutions, and all the ones with working queries seem specific to MySQL/Oracle/MSSQL and I'm having a hard time translating them to PostgreSQL.
The guy asking this question found his answer, with Postgres, but put it on a pastebin link that expired some time ago.
I've tried to switch to LEFT OUTER JOIN, RIGHT JOIN, RIGHT OUTER JOIN, CROSS JOIN, use a CASE statement to sub in another value if null, COALESCE to provide a default value, etc, but I haven't been able to use them in a way that gets me what I need.
Any assistance is appreciated! And I promise I'll get around to reading that giant PostgreSQL book soon ;)
You just need a left outer join instead of an inner join:
SELECT d.date, count(se.id)
FROM
(
SELECT to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD') AS date
FROM generate_series(0, 365, 1) AS offs
) d
LEFT OUTER JOIN sharer_emailshare se
ON d.date = to_char(date_trunc('day', se.created), 'YYYY-MM-DD')
GROUP BY d.date;
Extending Gordon Linoff's helpful answer, I would suggest a couple of improvements such as:
Use ::date instead of date_trunc('day', ...)
Join on a date type rather than a character type (it's cleaner).
Use specific date ranges so they're easier to change later. In this case I select a year before the most recent entry in the table - something that couldn't have been done easily with the other query.
Compute the totals for an arbitrary subquery (using a CTE). You just have to cast the column of interest to the date type and call it date_column.
Include a column for cumulative total. (Why not?)
Here's my query:
WITH dates_table AS (
    SELECT created::date AS date_column
    FROM sharer_emailshare
    WHERE showroom_id=5
)
SELECT series_table.date,
       COUNT(dates_table.date_column),
       SUM(COUNT(dates_table.date_column)) OVER (ORDER BY series_table.date)
FROM (
    SELECT (last_date - b.offs) AS date
    FROM (
        SELECT GENERATE_SERIES(0, last_date - first_date, 1) AS offs, last_date
        FROM (
            SELECT MAX(date_column) AS last_date,
                   (MAX(date_column) - '1 year'::interval)::date AS first_date
            FROM dates_table
        ) AS a
    ) AS b
) AS series_table
LEFT OUTER JOIN dates_table
    ON (series_table.date = dates_table.date_column)
GROUP BY series_table.date
ORDER BY series_table.date
I tested the query, and it produces the same results, plus the column for cumulative total.
I'll try to provide an answer that includes some explanation. I'll start with the smallest building block and work up.
If you run a query like this:
SELECT series.number FROM generate_series(0, 9) AS series(number)
You get output like this:
number
--------
0
1
2
3
4
5
6
7
8
9
(10 rows)
This can be turned into dates like this:
SELECT CURRENT_DATE + sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)
Which will give output like this:
date
------------
2019-09-29
2019-09-30
2019-10-01
2019-10-02
2019-10-03
2019-10-04
2019-10-05
2019-10-06
2019-10-07
2019-10-08
(10 rows)
Then you can do a query like this (for example), joining the original query as a subquery against whatever table you're ultimately interested in:
SELECT sequential_dates.date,
COUNT(calendar_items.*) AS calendar_item_count
FROM (SELECT CURRENT_DATE + sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)) sequential_dates
LEFT JOIN calendar_items ON calendar_items.starts_at::date = sequential_dates.date
GROUP BY sequential_dates.date
Which will give output like this:
date | calendar_item_count
------------+---------------------
2019-09-29 | 1
2019-09-30 | 8
2019-10-01 | 15
2019-10-02 | 11
2019-10-03 | 1
2019-10-04 | 12
2019-10-05 | 0
2019-10-06 | 0
2019-10-07 | 27
2019-10-08 | 24
Based on Gordon Linoff's answer I realized another problem was that I had a WHERE clause that I didn't mention in the original question.
Instead of a naked WHERE, I made a subquery:
SELECT d.date, count(se.id) FROM (
select to_char(date_trunc('day', (current_date - offs)), 'YYYY-MM-DD')
AS date
FROM generate_series(0, 365, 1)
AS offs
) d
LEFT OUTER JOIN (
SELECT * FROM sharer_emailshare
WHERE showroom_id=5
) se
ON (d.date=to_char(date_trunc('day', se.created), 'YYYY-MM-DD'))
GROUP BY d.date;
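The reason the subquery helps is that a WHERE condition on the joined table is applied after the outer join, so it throws away the NULL-filled rows that represent empty days. The sketch below contrasts the two forms on a tiny data set. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, a plain three-row days table instead of generate_series, and invented sample rows; the "naked WHERE" query is a guess at what the original looked like, since the post does not show it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE days (date TEXT)")
conn.executemany(
    "INSERT INTO days VALUES (?)",
    [("2013-03-26",), ("2013-03-27",), ("2013-03-28",)],
)
conn.execute(
    "CREATE TABLE sharer_emailshare (id INTEGER, created TEXT, showroom_id INTEGER)"
)
conn.executemany(
    "INSERT INTO sharer_emailshare VALUES (?, ?, ?)",
    [
        (1, "2013-03-27 10:00:00", 5),
        (2, "2013-03-27 11:00:00", 5),
        (3, "2013-03-26 09:00:00", 7),  # a different showroom
    ],
)

# Assumed shape of the original query: the WHERE runs after the join and
# removes every day whose joined row is NULL or belongs to another showroom.
NAKED_WHERE_SQL = """
SELECT d.date, COUNT(se.id)
FROM days d
LEFT OUTER JOIN sharer_emailshare se ON d.date = date(se.created)
WHERE se.showroom_id = 5
GROUP BY d.date
ORDER BY d.date
"""

# Filtering inside a subquery happens before the join, so empty days survive.
SUBQUERY_SQL = """
SELECT d.date, COUNT(se.id)
FROM days d
LEFT OUTER JOIN (
    SELECT * FROM sharer_emailshare WHERE showroom_id = 5
) se ON d.date = date(se.created)
GROUP BY d.date
ORDER BY d.date
"""

with_where = conn.execute(NAKED_WHERE_SQL).fetchall()
with_subquery = conn.execute(SUBQUERY_SQL).fetchall()
print(with_where)
print(with_subquery)
```

The first query returns only the one day that has showroom 5 rows, while the second returns all three days with zero counts for the empty ones.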
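I like Jason Swett's SQL, but I ran into an issue where the count on some dates should be zero rather than one.
Running the statement select count(*) from public.post_call_info where timestamp::date = '2020-11-23' returns zero, but the query below returns one for that date. This happens because COUNT(*) also counts the NULL-filled row that the LEFT JOIN produces for a date with no matches; counting a column from the joined table, e.g. COUNT(public.post_call_info.timestamp), returns zero instead.
Also, the + gives a forward schedule, so I changed it to a minus to get the current date and the 9 days before it.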
SELECT sequential_dates.date,
COUNT(*) AS call_count
FROM (SELECT CURRENT_DATE - sequential_dates.date AS date
FROM generate_series(0, 9) AS sequential_dates(date)) sequential_dates
LEFT JOIN public.post_call_info ON public.post_call_info.timestamp::date =
sequential_dates.date
GROUP BY sequential_dates.date
order by date desc
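The difference between counting rows and counting a column of the joined table is easy to see side by side. The sketch below is an illustration on SQLite through Python's sqlite3 module, not PostgreSQL; the days table stands in for the generate_series subquery and the three call rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE days (date TEXT)")
conn.executemany(
    "INSERT INTO days VALUES (?)",
    [("2020-11-22",), ("2020-11-23",), ("2020-11-24",)],
)
conn.execute("CREATE TABLE post_call_info (timestamp TEXT)")
conn.executemany(
    "INSERT INTO post_call_info VALUES (?)",
    [
        ("2020-11-22 09:00:00",),
        ("2020-11-24 10:00:00",),
        ("2020-11-24 11:00:00",),
    ],
)

COUNT_SQL = """
SELECT d.date,
       COUNT(*)           AS star_count,  -- counts the NULL-filled row too
       COUNT(p.timestamp) AS col_count    -- ignores rows where p.timestamp IS NULL
FROM days d
LEFT JOIN post_call_info p ON date(p.timestamp) = d.date
GROUP BY d.date
ORDER BY d.date DESC
"""

counts = conn.execute(COUNT_SQL).fetchall()
for row in counts:
    print(row)
```

For 2020-11-23, which has no calls, star_count is 1 and col_count is 0; on days that do have calls the two columns agree.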