SQL: return 0s if no group exists

I have a rollup table that sums up raw data for a given hour. It looks something like this:
stats_hours:
- obj_id : integer
- start_at : datetime
- count : integer
The obj_id points to a separate table, the start_at field contains a timestamp for the beginning of the hour of the data, and the count contains the sum of the data for that hour.
I would like to build a query that returns a set of data per day, so something like this:
Date | sum_count
2014-06-01 | 2000
2014-06-02 | 3000
2014-06-03 | 0
2014-06-04 | 5000
The query that I built does a grouping on the date column and sums up the count:
SELECT date(start_at) as date, sum(count) as sum_count
FROM stats_hours GROUP BY date;
This works fine unless I have no data for a given date, in which case it obviously leaves out the row:
Date | sum_count
2014-06-01 | 2000
2014-06-02 | 3000
2014-06-04 | 5000
Does anyone know of a good way in SQL to return a zeroed-out row in the case that there is no data for a given date group? Maybe some kind of case statement?

You need a full list of dates first, then connect that list to your available dates and group by that. Try the following:
-- define start and end limits
DECLARE @todate datetime, @fromdate datetime
SELECT @fromdate = '2009-03-01', @todate = '2014-06-04'
;WITH DateSequence(Date) AS
(
SELECT @fromdate AS Date
UNION ALL
SELECT DATEADD(day, 1, Date)
FROM DateSequence
WHERE Date < @todate
)
-- select result; ISNULL turns the NULL sum of an empty day into 0
SELECT DateSequence.Date, ISNULL(SUM(Stats_Hours.Count), 0) AS Sum_Count
FROM
DateSequence
LEFT JOIN
-- match every hourly row of the day, not only the midnight one
Stats_Hours ON Stats_Hours.Start_At >= DateSequence.Date
AND Stats_Hours.Start_At < DATEADD(day, 1, DateSequence.Date)
GROUP BY DateSequence.Date
OPTION (MAXRECURSION 0)
EDIT: CTE code from this post
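For readers who want to try the calendar-then-LEFT-JOIN idea without a SQL Server instance, here is a minimal sketch that runs the same pattern on an in-memory SQLite database from Python. The sample rows, the `date_sequence` name and the `:from_date` / `:to_date` parameters are made up for illustration, and SQLite's `date()` and `COALESCE` stand in for the T-SQL functions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats_hours (obj_id INTEGER, start_at TEXT, count INTEGER);
-- hypothetical sample data: nothing recorded on 2014-06-03
INSERT INTO stats_hours VALUES
  (1, '2014-06-01 00:00:00', 1500),
  (1, '2014-06-01 13:00:00', 500),
  (1, '2014-06-02 09:00:00', 3000),
  (1, '2014-06-04 22:00:00', 5000);
""")

query = """
WITH RECURSIVE date_sequence(day) AS (
    SELECT :from_date
    UNION ALL
    SELECT date(day, '+1 day') FROM date_sequence WHERE day < :to_date
)
SELECT d.day, COALESCE(SUM(s.count), 0) AS sum_count
FROM date_sequence AS d
LEFT JOIN stats_hours AS s ON date(s.start_at) = d.day
GROUP BY d.day
ORDER BY d.day
"""

# One row per calendar day, whether or not stats_hours has data for it
rows = conn.execute(
    query, {"from_date": "2014-06-01", "to_date": "2014-06-04"}
).fetchall()
for day, sum_count in rows:
    print(day, sum_count)
```

With this sample data the day without rows (2014-06-03) still appears, with a sum of 0, because the calendar side of the join drives the grouping.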

Related

Comparing dates using SQL

I have a date that looks like this: 2014-10-01 12:35:29.440
The table looks like this:
ORDER 1 | 2014-07-31 00:00:00.000
ORDER 2 | 2015-07-31 00:00:00.000
Sorry, I wanted ORDER 2 to show up, since GETDATE() returns today's date, which is greater than 2014-07-31 00:00:00.000.
Here is what I have tried:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE GETDATE() > ORDER_DATE
ORDER BY NAME DESC
Your question still isn't quite worded in a way that is conducive to what you need... but I think I understand what you want now based on the comments.
Based on the comment:
IF it doesnt match the date then it needs to return the next row.
Which is ORDER 2
Something like this should work:
SELECT TOP 1 name
FROM ORDER_DATES o
INNER JOIN (
-- This subquery finds the first date that occurs *after* the current date
SELECT MIN(ORDER_DATE) AS ORDER_DATE
FROM ORDER_DATES
WHERE ORDER_DATE > GETDATE()
) minDateAfterToday ON o.ORDER_DATE = minDateAfterToday.ORDER_DATE
ORDER BY name
This would work a lot better if you had an ID field in the table, but it should work with the given data. You may run into issues if two orders fall on the exact same date.
EDIT:
here's a fiddle showing the query in action:
http://sqlfiddle.com/#!6/f3057/1
DATEDIFF will come in handy; you also have to order by ORDER_DATE:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE DATEDIFF(DAY,ORDER_DATE,GETDATE())>0
ORDER BY ORDER_DATE DESC
You can write as:
SELECT NAME
FROM ORDER_DATES
WHERE CAST(GETDATE() AS date) > CAST(ORDER_DATE AS date)
ORDER BY NAME DESC
Check that you are querying the right table:
declare @dt datetime = cast('2014-10-01 12:35:29.440' as datetime), @dt2 datetime = cast('2014-07-31 00:00:00.000' as datetime);
print(case when @dt > @dt2 then 1 else 0 end);
This script prints 1, i.e. the condition should match for ORDER 1.
Verify whether you are missing something.
Edit, following the change to the original question:
The condition needs to be reversed, because the date you want lies in the future and is therefore greater than the current date.
The new query would be:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE ORDER_DATE > GETDATE()
ORDER BY NAME DESC

How to return a default value when no rows are returned from the select statement

I have a select statement that returns two columns, a date column and a count(value) column. When the count(value) column doesn't have any records, I need it to return a 0. Currently, it just skips that date record altogether.
Here are the basics of the query.
select convert(varchar(25), DateTime, 101) as recordDate,
count(Value) as recordCount
from History
where Value < 700
group by convert(varchar(25), DateTime, 101)
Here are some results that I'm getting.
+------------+-------------+
| recordDate | recordCount |
+------------+-------------+
| 02/26/2014 | 143 |
| 02/27/2014 | 541 |
| 03/01/2014 | 21 |
| 03/02/2014 | 60 |
| 03/03/2014 | 113 |
+------------+-------------+
Notice it skips 2/28/2014. This is because the count(value) column doesn't have anything to count. How can I add the record in there that has the date of 2/28/2014, with a recordCount of 0?
To generate rows for missing dates you can join your data to a date dimension table
It would look something like this:
select convert(varchar(25), ddt.DateField, 101) as recordDate,
count(h.Value) as recordCount
from History h
right join dbo.DateDimensionTable ddt
on ddt.DateField = convert(varchar(25), h.DateTime, 101)
and h.Value < 700 -- filter in the join; in a WHERE clause it would drop the empty dates again
group by convert(varchar(25), ddt.DateField, 101)
If your table uses the DateTime column to store dates only (meaning the time is always midnight), then you can replace this
right join dbo.DateDimensionTable ddt
on ddt.DateField = convert(varchar(25), h.DateTime, 101)
with this
right join dbo.DateDimensionTable ddt
on ddt.DateField = h.DateTime
You may use COUNT(*). It will return zero if nothing was found for the column. Also you may group result set by value column if it is needed.
select convert(varchar(25), DateTime, 101) as recordDate,
CASE WHEN count(value) =0 THEN 0 ELSE COUNT(value) END recordCount
from History
where Value < 700
group by convert(varchar(25), DateTime, 101)
When you use a group by, it only creates a distinct list of values that exist in your records. Since 20140228 has no records, it will not show up in the group by.
Your best bet is to generate a list of values, dates in your case, and left join or apply that table against your history table.
I can't seem to copy my T-SQL in here so here's a hastebin.
http://hastebin.com/winaqutego.vbs
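As a minimal sketch of the generate-and-left-join approach (not the code behind the link above), the following runs on an in-memory SQLite database from Python. The `Dates` table and the sample rows are hypothetical. It also shows why the `Value < 700` filter has to sit in the join condition: put in a WHERE clause, it throws the empty dates away again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE History (DateTime TEXT, Value INTEGER);
INSERT INTO History VALUES
  ('2014-02-27 08:00:00', 650),
  ('2014-02-27 09:00:00', 120),
  ('2014-02-28 10:00:00', 900),  -- only row that day, and it fails Value < 700
  ('2014-03-01 11:00:00', 300);
-- hypothetical list of dates to report on
CREATE TABLE Dates (DateField TEXT);
INSERT INTO Dates VALUES ('2014-02-27'), ('2014-02-28'), ('2014-03-01');
""")

# Filter inside ON: unmatched dates survive with a count of 0
filter_in_on = conn.execute("""
SELECT d.DateField, COUNT(h.Value)
FROM Dates d
LEFT JOIN History h
  ON date(h.DateTime) = d.DateField AND h.Value < 700
GROUP BY d.DateField
ORDER BY d.DateField
""").fetchall()

# Filter inside WHERE: rows are removed after the join, so 2014-02-28 disappears
filter_in_where = conn.execute("""
SELECT d.DateField, COUNT(h.Value)
FROM Dates d
LEFT JOIN History h
  ON date(h.DateTime) = d.DateField
WHERE h.Value < 700
GROUP BY d.DateField
ORDER BY d.DateField
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```

COUNT(h.Value) is used rather than COUNT(*) so that the NULL row produced by the outer join is not counted.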
The best practice would be to have a datamart where a separate dimensional table for dates holds every date you might be interested in, even those that lack amounts. DMason's answer shows the query with such a dimensional table.
In keeping with best practices, you would also have a fact table holding this historical data already pre-grouped at the granularity you need (daily, in this case), so you wouldn't need a GROUP BY unless you wanted a coarser granularity (weekly, monthly, yearly).
And in both your operational and datamart databases the dates would be stored as dates, not...
But then, since this is the real world and you might not be able to change what somebody else made: if you a) only care about the dates that appear in [History], and b) such dates are never stored with hours/minutes, then the following query might be what you need:
SELECT MyDates.DateTime, COUNT(H.DateTime) AS RecordCount -- counts matched rows only, so unmatched dates give 0
FROM (
SELECT DISTINCT DateTime FROM History
) MyDates
LEFT JOIN History H
ON MyDates.DateTime = H.Datetime
AND H.Value < 700
GROUP BY MyDates.DateTime
Do try to add an index over DateTime and to further constrain the query with an earliest/latest date for better performance results.
I agree that a Dates table (AKA time dimension) is the right solution, but there is a simpler option:
SELECT
CONVERT(VARCHAR(25), DateTime, 101) AS RecordDate,
SUM(CASE WHEN Value < 700 THEN 1 ELSE 0 END) AS RecordCount
FROM
History
GROUP BY
CONVERT(VARCHAR(25), DateTime, 101)
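To see what the conditional-aggregation option does and does not cover, here is a small sketch on an in-memory SQLite database from Python, with made-up sample rows. A date that has rows, none of them under 700, comes back with 0; a date with no rows at all (2014-03-01 here) is still missing, which is the case the Dates table handles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE History (DateTime TEXT, Value INTEGER);
-- hypothetical sample data: no rows at all for 2014-03-01
INSERT INTO History VALUES
  ('2014-02-27 08:00:00', 650),
  ('2014-02-28 10:00:00', 900),
  ('2014-03-02 11:00:00', 300);
""")

rows = conn.execute("""
SELECT date(DateTime) AS RecordDate,
       SUM(CASE WHEN Value < 700 THEN 1 ELSE 0 END) AS RecordCount
FROM History
GROUP BY date(DateTime)
ORDER BY RecordDate
""").fetchall()

for record_date, record_count in rows:
    print(record_date, record_count)
```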
Try this:
DECLARE @Records TABLE (
[RecordDate] DATETIME,
[RecordCount] INT
)
DECLARE @Date DATETIME = '02/26/2014' -- Enter whatever date you want to start with
DECLARE @EndDate DATETIME = '03/31/2014' -- Enter whatever date you want to stop
WHILE (1=1)
BEGIN
-- Insert the date into the table variable along with the count
INSERT INTO @Records (RecordDate, RecordCount)
VALUES (CONVERT(VARCHAR(25), @Date, 101),
(SELECT COUNT(*) FROM dbo.YourTable WHERE RecordDate = @Date))
-- Go to the next day
SET @Date = DATEADD(d, 1, @Date)
-- If we have surpassed the end date, break out of the loop
IF (@Date > @EndDate) BREAK;
END
SELECT * FROM @Records
If your dates have time components, you would need to modify this to check for start and end of day in the SELECT COUNT(*)... query.

Modulo Time in SQL Server 2005 - Return data every n hours

I have something like this:
SELECT *
FROM (
SELECT prodid, date, time, tmp, rowid
FROM live_pilot_plant
WHERE date BETWEEN CONVERT(DATETIME, '3/19/2012', 101)
AND CONVERT(DATETIME, '3/31/2012', 101)
) b
WHERE b.rowid % 400 = 0
FYI: The reason for the convert in the where clause is that my date is stored as a varchar(10), so I had to convert it to datetime to get the correct range of data. (I tried a bunch of different things and this worked.)
I'm wondering how I can return the data I want every 4 hours during those selected dates. I have data collected approximately every 5 seconds, with some breaks in the data, i.e. data wasn't collected during a 2 hour period, but then continues at 5 second increments.
In my example I just used a modulo on my rowid. The syntax works, but as mentioned above there are periods where data isn't collected, so estimating the number of rows in between (one row every 5 seconds, multiplied out over 4 hours) won't work.
My time column is a varchar column and is in the form hh:mm:ss
My ideal output is:
| prodid | date | time | tmp |
| 4 | 3/19/2012 | 10:00:00 | 2.3 |
| 7 | 3/19/2012 | 14:00:24 | 3.2 |
As you can see, I can be a bit off in terms of seconds; I mostly need the approximate value in terms of time.
Thank you in advance.
This should work
select prodid, date, time, tmp, rowid
from live_pilot_plant as lpp
inner join (
select min(prodid) as prodid -- is prodid your PK? if not, change it to rowid or whatever your PK is
from live_pilot_plant
WHERE date BETWEEN CONVERT(DATETIME, '3/19/2012', 101) -- or whatever you want
AND CONVERT(DATETIME, '3/31/2012', 101) -- filtering in the inner select performs better
group by date,
floor( -- floor does the trick
convert(float,convert(datetime, time)) -- assumes "time" column is a varchar containing data like '19:23:05'
* 6 -- 6 comes from 24 hours / 4 hours
)
) as filter on lpp.prodid = filter.prodid -- if prodid is not the PK also correct here.
A side note for everyone else who has date and time in a single datetime field, say one named "when_it_was": the group by can be as simple as:
group by floor(convert(float, when_it_was) * 6) -- again, 6 comes from 24/4
Something along the lines of the following should work. Basically, create date + time partitions, each partition representing a block of 4 hours, and pick the first record (ranker = 1) from each partition:
select * from (
select *,
row_number() over (
partition by date, cast(left(time, charindex(':', time) - 1) as int) / 4
order by date, time
) as ranker
from live_pilot_plant
) Z where ranker = 1
Assuming rowid is a PK and increases with date/time: convert the time field to a 4-hour interval number with convert(int, substring(time,1,2))/4 and select MIN(rowid) from each 4-hour group in a day:
select prodid, date, time, tmp, rowid from live_pilot_plant where rowid in
(
select min(rowid)
from live_pilot_plant
WHERE CONVERT(DATETIME, date, 101) BETWEEN CONVERT(DATETIME, '3/19/2012', 101)
AND CONVERT(DATETIME, '3/31/2012', 101)
group by date,convert(int,substring(time,1,2))/4
)
order by CONVERT(DATETIME, date, 101),time
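The 4-hour bucketing used in these answers can be tried out with the sketch below, which runs on an in-memory SQLite database from Python. The sample rows are invented, rowid is assumed to be the primary key and to increase with time, and SQLite's `substr` and `CAST` replace the T-SQL `substring` and `convert`. Integer division of the hour by 4 gives bucket numbers 0 to 5 per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE live_pilot_plant (
    prodid INTEGER, date TEXT, time TEXT, tmp REAL,
    rowid INTEGER PRIMARY KEY
);
-- hypothetical sample: 5-second readings with a gap between 12:00 and 14:00
INSERT INTO live_pilot_plant VALUES
  (4, '3/19/2012', '10:00:00', 2.3, 1),
  (5, '3/19/2012', '10:00:05', 2.4, 2),
  (6, '3/19/2012', '11:59:55', 2.6, 3),
  (7, '3/19/2012', '14:00:24', 3.2, 4),
  (8, '3/19/2012', '14:00:29', 3.3, 5),
  (9, '3/20/2012', '00:00:03', 1.9, 6);
""")

# Keep the earliest row (smallest rowid) of every (date, 4-hour bucket) pair
picked = conn.execute("""
SELECT prodid, date, time, tmp, rowid
FROM live_pilot_plant
WHERE rowid IN (
    SELECT MIN(rowid)
    FROM live_pilot_plant
    GROUP BY date, CAST(substr(time, 1, 2) AS INTEGER) / 4
)
ORDER BY rowid
""").fetchall()

for row in picked:
    print(row)
```

Because the grouping is on clock time rather than row count, a gap in collection simply means the first reading after the gap represents its bucket.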

Find latest value earlier than specified date

I have an exchange rate table that holds multiple records, each with a date and an exchange rate.
Date Rate
17/05/2012 5
23/05/2012 6
27/05/2012 7
Now I want to get the rate for any date I pass in. For example, if I pass 20/05/2012, rate 5 should be returned, because 20/05/2012 falls between 17 and 23 May 2012.
Assuming you have correct datatypes (that is, not varchar to store date values...)
SELECT TOP 1
Rate
FROM
MyTable
WHERE
DateColumn <= '20120520'
ORDER BY
DateColumn DESC
Something like this will work:
select Rate from tablename where Date in (
select max(Date) as Date
from tablename
where Date <= convert(datetime, '20/05/2012', 103)
)
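Both answers boil down to "take the newest row dated on or before the lookup date". Here is a minimal sketch of that lookup on an in-memory SQLite database from Python. The `exchange_rates` table, its column names and the `rate_on` helper are hypothetical, dates are stored as ISO text so they sort correctly, and `LIMIT 1` plays the role of T-SQL's `TOP 1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exchange_rates (rate_date TEXT, rate INTEGER);
INSERT INTO exchange_rates VALUES
  ('2012-05-17', 5),
  ('2012-05-23', 6),
  ('2012-05-27', 7);
""")

def rate_on(day):
    """Return the rate in force on `day` (ISO date string), or None if none exists yet."""
    row = conn.execute(
        """
        SELECT rate
        FROM exchange_rates
        WHERE rate_date <= ?
        ORDER BY rate_date DESC
        LIMIT 1
        """,
        (day,),
    ).fetchone()
    return row[0] if row else None

print(rate_on("2012-05-20"))
print(rate_on("2012-05-01"))
```

A date before the first record has no applicable rate, so the helper returns None rather than guessing.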

Generate missing dates + Sql Server (SET BASED)

I have the following
id eventid startdate enddate
1 1 2009-01-03 2009-01-05
1 2 2009-01-05 2009-01-09
1 3 2009-01-12 2009-01-15
How can I generate the missing dates for every eventid?
Edit:
The gaps are to be found based on the eventids, e.g. for eventid 1 the output should be 1/3/2009, 1/4/2009, 1/5/2009; for eventid 2 it will be 1/5/2009, 1/6/2009 ... to 1/9/2009, etc.
My task is to find the missing dates between two given dates.
Here is everything I have done so far:
declare @tblRegistration table(id int primary key,startdate date,enddate date)
insert into @tblRegistration
select 1,'1/1/2009','1/15/2009'
declare @tblEvent table(id int,eventid int primary key,startdate date,enddate date)
insert into @tblEvent
select 1,1,'1/3/2009','1/5/2009' union all
select 1,2,'1/5/2009','1/9/2009' union all
select 1,3,'1/12/2009','1/15/2009'
;with generateCalender_cte as
(
select cast((select startdate from @tblRegistration where id = 1) as datetime) DateValue
union all
select DateValue + 1
from generateCalender_cte
where DateValue + 1 <= (select enddate from @tblRegistration where id = 1)
)
select DateValue as missingdates from generateCalender_cte
where DateValue not between '1/3/2009' and '1/5/2009'
and DateValue not between '1/5/2009' and '1/9/2009'
and DateValue not between '1/12/2009' and '1/15/2009'
What I am actually trying to do is this: I have generated a calendar table, and from it I am trying to find the missing dates based on the ids.
The ideal output will be
eventid missingdates
1 2009-01-01 00:00:00.000
1 2009-01-02 00:00:00.000
3 2009-01-10 00:00:00.000
3 2009-01-11 00:00:00.000
It also has to be SET BASED, and the start and end dates should not be hardcoded.
Thanks in advance
The following uses a recursive CTE (SQL Server 2005+):
WITH dates AS (
SELECT CAST('2009-01-01' AS DATETIME) 'date'
UNION ALL
SELECT DATEADD(dd, 1, t.date)
FROM dates t
WHERE DATEADD(dd, 1, t.date) <= '2009-02-01')
SELECT t.eventid, d.date
FROM dates d
JOIN TABLE t ON d.date BETWEEN t.startdate AND t.enddate
It generates dates using the DATEADD function. It can be altered to take a start & end date as parameters. According to KM's comments, it's faster than using the numbers table trick.
Like rexem - I made a function that contains a similar CTE to generate any series of datetime intervals you need. Very handy for summarizing data by datetime intervals like you are doing.
A more detailed post and the function source code are here:
Insert Dates in the return from a query where there is none
Once you have the "counts of events by date" ... your missing dates would be the ones with a count of 0.
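Putting the calendar idea and the "missing means no coverage" idea together, here is a rough set-based sketch on an in-memory SQLite database from Python. The table names `registration` and `event` are stand-ins for the table variables in the question. The calendar bounds come from the registration row instead of being hardcoded, and NOT EXISTS keeps only the days no event covers. It returns the dates only and does not attach an eventid as in the asker's ideal output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registration (id INTEGER PRIMARY KEY, startdate TEXT, enddate TEXT);
INSERT INTO registration VALUES (1, '2009-01-01', '2009-01-15');
CREATE TABLE event (id INTEGER, eventid INTEGER PRIMARY KEY, startdate TEXT, enddate TEXT);
INSERT INTO event VALUES
  (1, 1, '2009-01-03', '2009-01-05'),
  (1, 2, '2009-01-05', '2009-01-09'),
  (1, 3, '2009-01-12', '2009-01-15');
""")

missing = [row[0] for row in conn.execute("""
WITH RECURSIVE calendar(day, last_day) AS (
    -- bounds are read from the registration row, not hardcoded
    SELECT startdate, enddate FROM registration WHERE id = 1
    UNION ALL
    SELECT date(day, '+1 day'), last_day FROM calendar WHERE day < last_day
)
SELECT c.day
FROM calendar AS c
WHERE NOT EXISTS (
    SELECT 1 FROM event AS e
    WHERE e.id = 1 AND c.day BETWEEN e.startdate AND e.enddate
)
ORDER BY c.day
""")]

print(missing)
```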