How to return a default value when no rows are returned from the select statement - sql

I have a select statement that returns two columns, a date column, and a count(value) column. When the count(value) column doesn't have any records, I need it to return a 0. Currently, it just skips that date record all together.
Here are the basics of the query.
select convert(varchar(25), DateTime, 101) as recordDate,
count(Value) as recordCount
from History
where Value < 700
group by convert(varchar(25), DateTime, 101)
Here are some results that I'm getting.
+------------+-------------+
| recordDate | recordCount |
+------------+-------------+
| 02/26/2014 |         143 |
| 02/27/2014 |         541 |
| 03/01/2014 |          21 |
| 03/02/2014 |          60 |
| 03/03/2014 |         113 |
+------------+-------------+
Notice it skips 2/28/2014. This is because the count(value) column doesn't have anything to count. How can I add the record in there that has the date of 2/28/2014, with a recordCount of 0?

To generate rows for missing dates, you can join your data to a date dimension table.
It would look something like this:
select convert(varchar(25), ddt.DateField, 101) as recordDate,
count(h.Value) as recordCount
from History h
right join dbo.DateDimensionTable ddt
on ddt.DateField = convert(varchar(25), h.DateTime, 101)
and h.Value < 700
group by convert(varchar(25), ddt.DateField, 101)
If your table uses the DateTime column to store dates only (meaning the time is always midnight), then you can replace this
right join dbo.DateDimensionTable ddt
on ddt.DateField = convert(varchar(25), h.DateTime, 101)
with this
right join dbo.DateDimensionTable ddt
on ddt.DateField = h.DateTime

You may use COUNT(*); it will return zero if nothing was found for the column. You may also group the result set by the Value column if needed.

select convert(varchar(25), DateTime, 101) as recordDate,
CASE WHEN COUNT(Value) = 0 THEN 0 ELSE COUNT(Value) END AS recordCount
from History
where Value < 700
group by convert(varchar(25), DateTime, 101)

When you use a group by, it only creates a distinct list of values that exist in your records. Since 20140228 has no records, it will not show up in the group by.
Your best bet is to generate a list of values, dates in your case, and left join or apply that table against your history table.
I can't seem to copy my T-SQL in here so here's a hastebin.
http://hastebin.com/winaqutego.vbs
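In case that link goes stale, here is a rough sketch of what the "generate a date list and left join it" approach could look like. It only uses the table and column names from the question; the hard-coded date range is an assumption.
-- Sketch only: build a date list with a recursive CTE, then LEFT JOIN History against it
-- so that days with no qualifying rows still appear with a count of 0.
DECLARE @StartDate date = '2014-02-26';   -- assumed range start
DECLARE @EndDate   date = '2014-03-03';   -- assumed range end
;WITH DateList AS (
    SELECT @StartDate AS d
    UNION ALL
    SELECT DATEADD(DAY, 1, d) FROM DateList WHERE d < @EndDate
)
SELECT CONVERT(varchar(25), dl.d, 101) AS recordDate,
       COUNT(h.Value) AS recordCount    -- counts only matched rows, so it returns 0 for empty days
FROM DateList dl
LEFT JOIN History h
    ON CONVERT(date, h.DateTime) = dl.d
    AND h.Value < 700                   -- filter in the ON clause so unmatched dates are kept
GROUP BY dl.d
OPTION (MAXRECURSION 366);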

The best practice would be for you to have a datamart where a separate dimensional table for dates is kept with all dates you might be interested in - even if they lack amounts. DMason's answer shows the query with such a dimensional table.
To keep with the best practices you would have a fact table where you'd keep these historical data already pre-grouped at the granularity level you need (daily, in this case), so you wouldn't need a GROUP BY unless you needed a coarser granularity (weekly, monthly, yearly).
And in both your operational and datamart databases the dates would be stored as dates, not...
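For illustration only, such datamart tables might look roughly like this; every name here is invented for the example.
-- Illustrative sketch; table and column names are made up.
CREATE TABLE dbo.DimDate (
    DateKey date NOT NULL PRIMARY KEY   -- one row per calendar day, with no gaps
);

CREATE TABLE dbo.FactDailyHistory (
    DateKey     date NOT NULL REFERENCES dbo.DimDate (DateKey),
    RecordCount int  NOT NULL DEFAULT 0 -- already aggregated at the daily grain
);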
But then, since this is the real world and you might not be able to change what somebody else made... If you: a) only care about the dates that appear in [History], and b) such dates are never stored with hours/minutes; then the following query might be what you'd need:
SELECT MyDates.DateTime, COUNT(H.Value) AS RecordCount
FROM (
SELECT DISTINCT DateTime FROM History
) MyDates
LEFT JOIN History H
ON MyDates.DateTime = H.Datetime
AND H.Value < 700
GROUP BY MyDates.DateTime
Do try to add an index over DateTime, and to further constrain the query with an earliest/latest date, for better performance.
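As a rough illustration of that advice (the index name and date boundaries below are just examples, not anything from the question):
-- Example only: a supporting index on DateTime that also covers Value
CREATE INDEX IX_History_DateTime ON History ([DateTime]) INCLUDE (Value);

-- ...and an extra range predicate on the derived date list, e.g.:
-- WHERE MyDates.DateTime BETWEEN '2014-02-01' AND '2014-03-31'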

I agree that a Dates table (AKA time dimension) is the right solution, but there is a simpler option:
SELECT
CONVERT(VARCHAR(25), DateTime, 101) AS RecordDate,
SUM(CASE WHEN Value < 700 THEN 1 ELSE 0 END) AS RecordCount
FROM
History
GROUP BY
CONVERT(VARCHAR(25), DateTime, 101)

Try this:
DECLARE @Records TABLE (
[RecordDate] DATETIME,
[RecordCount] INT
)
DECLARE @Date DATETIME = '02/26/2014' -- Enter whatever date you want to start with
DECLARE @EndDate DATETIME = '03/31/2014' -- Enter whatever date you want to stop
WHILE (1=1)
BEGIN
-- Insert the date into the temp table along with the count
INSERT INTO @Records (RecordDate, RecordCount)
VALUES (CONVERT(VARCHAR(25), @Date, 101),
(SELECT COUNT(*) FROM dbo.YourTable WHERE RecordDate = @Date))
-- Go to the next day
SET @Date = DATEADD(d, 1, @Date)
-- If we have surpassed the end date, break out of the loop
IF (@Date > @EndDate) BREAK;
END
SELECT * FROM @Records
If your dates have time components, you would need to modify this to check for start and end of day in the SELECT COUNT(*)... query.
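For example, the inner count could be rewritten with a half-open day range, something along these lines (sketch only, same hypothetical dbo.YourTable as above):
-- Sketch: count rows whose RecordDate falls anywhere within the day held in @Date
(SELECT COUNT(*)
 FROM dbo.YourTable
 WHERE RecordDate >= @Date
   AND RecordDate < DATEADD(d, 1, @Date))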

Related

Operand clash with date table and varchar date strings in cross join

This follows on from my previous question (Daily snapshot table using CTE loop), but since I tried to simplify, I appear to have missed something.
I am trying to set up the below cross join between dates and an employee table. I need a daily count according to division and department, but the dates won't link easily since the dates are stored as varchar (not my choice, I can't change it).
I now have a date table that includes a style112 (yyyymmdd) key that I can link to the table, but there seems to be a failure somewhere along the joins.
I'm so close, but really am lost! I have never had to work with string dates and wouldn't wish it upon anyone.
DECLARE @DATESTART AS Date = '20180928';
DECLARE @DATEEND AS Date = '20181031';
WITH Dates AS (
SELECT @DATESTART AS Dte
UNION ALL
SELECT DTE + 1
FROM Dates
WHERE Dte <= @DATEEND )
SELECT
Dt.Dte
,CAST(DTC.Style112 AS VARCHAR)
,Emp.Division_Description
,Emp.Department_Description
,(SELECT
COUNT(*)
FROM ASS_D_EmpMaster_Live E
WHERE
E.[Start_Date] <= CAST(DTC.Style112 AS VARCHAR)
AND (E.Leaving_Date > CAST(DTC.Style112 AS VARCHAR)
OR E.Leaving_Date = '00000000')
) Counts
FROM Dates Dt
LEFT JOIN ASS_C_DateConversions DTC
ON DTC.[Date] = Dt.DtE
CROSS JOIN
(
SELECT DISTINCT
Division_Description
,Department_Description
FROM
ASS_D_EmpMaster_Live e
) Emp
OPTION (MAXRECURSION 1000)
Desired output:
Date     | Dept1 | Dept2 | Dept3
20180901 | 25    | 231   | 154
20180902 | 23    | 232   | 154
I don't think you need the conversion table at all and I would remove it. And I believe the subquery should look like this:
SELECT COUNT(*)
FROM ASS_D_EmpMaster_Live E
WHERE
CAST(E.Start_Date AS DATE) <= Dt.Dte
AND (CAST(E.Leaving_Date AS DATE) > Dt.Dte OR E.Leaving_Date = '00000000')

Comparing dates using SQL

I have a date that looks like this: 2014-10-01 12:35:29.440
The table looks like this:
ORDER 1 | 2014-07-31 00:00:00.000
ORDER 2 | 2015-07-31 00:00:00.000
Sorry, I wanted ORDER 2 to show up, as GETDATE() returns today's date and that is GREATER than 2014-07-31 00:00:00.000.
Here is what I have tried:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE GETDATE() > ORDER_DATE
ORDER BY NAME DESC
Your question still isn't quite worded in a way that is conducive to what you need... but I think I understand what you want now based on the comments.
Based on the comment:
If it doesn't match the date then it needs to return the next row.
Which is ORDER 2
Something like this should work:
SELECT TOP 1 name
FROM ORDER_DATES o
INNER JOIN (
-- This subquery finds the first date that occurs *after* the current date
SELECT MIN(ORDER_DATE) AS ORDER_DATE
FROM ORDER_DATES
WHERE ORDER_DATE > GETDATE()
) minDateAfterToday ON o.ORDER_DATE = minDateAfterToday.ORDER_DATE
ORDER BY name
This would work a lot better if you had an ID field in the table, but this should work with the given data, you'll potentially run into issues if you have two orders on the exact same date.
EDIT:
here's a fiddle showing the query in action:
http://sqlfiddle.com/#!6/f3057/1
DATEDIFF will come in handy; also, you have to order by ORDER_DATE:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE DATEDIFF(DAY,ORDER_DATE,GETDATE())>0
ORDER BY ORDER_DATE DESC
You can write as:
SELECT NAME
FROM ORDER_DATES
WHERE cast(GETDATE()as date) > cast (ORDER_DATE as date)
ORDER BY NAME DESC
Demo
Check if you are querying against the right table.
declare @dt datetime = cast('2014-10-01 12:35:29.440' as datetime), @dt2 datetime = cast('2014-07-31 00:00:00.000' as datetime);
print(case when @dt > @dt2 then 1 else 0 end);
This script outputs 1, i.e. the condition should match for ORDER 1.
Verify whether you are missing something.
Edit, as per the change to the original question: here the condition needs to be reversed, since the date value is in the future, i.e. greater than the current date.
The new query will be:
SELECT TOP 1 NAME
FROM ORDER_DATES
WHERE ORDER_DATE > GETDATE()
ORDER BY NAME DESC

Filter based on multiple date ranges?

The user clicks on a month and then this stored procedure is executed. It checks for the total booked time and what groups have been filtered.
| Job Group | Month Booked | Time (hrs) |
| Cleaning  | Jan          | 7          |
I have the following SQL:
SELECT
tsks.grouping_ref, ttg.description AS grouping_desc,
SUM(ts.booked_time) AS booked_time_total,
DATENAME(MONTH, ts.start_dtm) + ' ' + DATENAME(YEAR, ts.start_dtm) AS month_name,
@month_ref AS month_ref
FROM
timesheets ts
JOIN
timesheet_categories cat ON ts.timesheet_cat_ref = cat.timesheet_cat_ref
JOIN
timesheet_tasks tsks ON ts.task_ref = tsks.task_ref
JOIN
timesheet_task_groupings ttg ON tsks.grouping_ref = ttg.grouping_ref
WHERE
ts.status IN(1, 2) --Booked and approved
AND cat.is_leave_category = 0 --Ignore leave
AND DATEPART(YEAR, ts.start_dtm) = @Year
AND DATEPART(MONTH, ts.start_dtm) = @Month
GROUP BY
tsks.grouping_ref, ttg.description,
DATENAME(MONTH, ts.start_dtm),
DATENAME(YEAR, ts.start_dtm)
ORDER BY
grouping_desc
I want to filter based on multiple date ranges.
I thought about adding this:
AND ((ts.start_dtm BETWEEN '2011-12-28' AND '2012-01-01')
OR (ts.start_dtm BETWEEN '2012-01-02' AND '2012-01-29'))
But then I realized it wouldn't matter what month the user clicked; it would still show all the records, because the OR condition would be applied regardless.
What I need is something that's based on the month_ref, eg:
CASE WHEN @month_ref = 81201 THEN
AND (ts.start_dtm BETWEEN '2011-12-28' AND '2012-01-01')
END
But the case statement needs to go just after the WHERE clause.
I have about 12 accounting months for 2012 which I need to add as case statements so that when the user clicks on March, it will fire the correct filter.
In the database ts.start_dtm looks like this:
2011-04-01 00:00:00.000
Hope that was enough information for my first post?
I'm stuck writing the case statement and where to put it, been trying for hours now.
Hope you can help :)
Given the irregular nature of your dates, which would preclude using dateparts, I would build a temporary table of the permissible dates based on the user query and join on it. The static integers table in my app has 1 through 64000; your tables may vary.
DECLARE
@startdate DateTime = '2012-05-01',
@EndDate DateTime = '2012-06-03'
DECLARE
@AllDates TABLE (MyDate DateTime)
INSERT INTO @AllDates
SELECT
DATEADD(dd, StaticInteger, @startdate)
FROM dbo.tblStaticIntegers
WHERE StaticInteger <= DATEDIFF(dd, @startdate, @EndDate)
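A possible way to join that table back into the timesheet query might look like this; it is only a sketch, reusing the column names from the question.
-- Sketch: keep every date from @AllDates, even when no timesheet rows match it
SELECT d.MyDate, SUM(ts.booked_time) AS booked_time_total
FROM @AllDates d
LEFT JOIN timesheets ts
    ON ts.start_dtm >= d.MyDate
    AND ts.start_dtm < DATEADD(dd, 1, d.MyDate)
    AND ts.status IN (1, 2)
GROUP BY d.MyDate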
One option would be to have a table mapping the month reference number to a start and end date, then retrieving those values and using them in your ts.start_dtm check, i.e. it would have:
Month-ref | Start | End
81201 | 2011-12-28 | 2012-01-01
81202 | 2012-01-02 | 2012-01-29
etc
You can just join to this reference table, or alternatively retrieve the two dates before your main query.
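A sketch of what that mapping table and join could look like; the names below are only illustrative.
-- Illustrative only: a lookup of accounting month boundaries
CREATE TABLE dbo.accounting_months (
    month_ref  int  NOT NULL PRIMARY KEY,
    start_date date NOT NULL,
    end_date   date NOT NULL
);

-- Then, instead of hard-coded BETWEEN ranges, the main query could filter with:
-- JOIN dbo.accounting_months am ON am.month_ref = @month_ref
-- WHERE ... AND ts.start_dtm BETWEEN am.start_date AND am.end_date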

Select data from SQL DB per day

I have a table with order information in an E-commerce store. Schema looks like this:
[Orders]
Id|SubTotal|TaxAmount|ShippingAmount|DateCreated
This table only contains a row for each order, so if a day goes by without any orders, there is no sales data for that day.
The resultset would look like this:
Date | SalesSum
2009-08-01 | 15235
2009-08-02 | 0
2009-08-03 | 340
2009-08-04 | 0
...
Doing this only gives me data for those days with orders:
select DateCreated as Date, sum(ordersubtotal) as SalesSum
from Orders
group by DateCreated
You could create a table called Dates, and select from that table and join the Orders table. But I really want to avoid that, because it doesn't work well enough when dealing with different time zones and things...
Please don't laugh. SQL is not my kind of thing... :)
Create a function that can generate a date table as follows:
(stolen from http://www.codeproject.com/KB/database/GenerateDateTable.aspx)
Create Function dbo.fnDateTable
(
@StartDate datetime,
@EndDate datetime,
@DayPart char(5) -- support 'day','month','year','hour', default 'day'
)
Returns @Result Table
(
[Date] datetime
)
As
Begin
Declare @CurrentDate datetime
Set @CurrentDate=@StartDate
While @CurrentDate<=@EndDate
Begin
Insert Into @Result Values (@CurrentDate)
Select @CurrentDate=
Case
When @DayPart='year' Then DateAdd(yy,1,@CurrentDate)
When @DayPart='month' Then DateAdd(mm,1,@CurrentDate)
When @DayPart='hour' Then DateAdd(hh,1,@CurrentDate)
Else
DateAdd(dd,1,@CurrentDate)
End
End
Return
End
Then, join against that table
SELECT dates.Date as Date, sum(SubTotal+TaxAmount+ShippingAmount)
FROM dbo.fnDateTable(dateadd(m, -1, CONVERT(VARCHAR(10), GETDATE(), 111)), CONVERT(VARCHAR(10), GETDATE(), 111), 'day') dates
LEFT JOIN Orders
ON dates.Date = DateCreated
GROUP BY dates.Date
declare @oldest_date datetime
declare @daily_sum numeric(18,2)
declare @temp table(
sales_date datetime,
sales_sum numeric(18,2)
)
select @oldest_date = dateadd(day,-30,getdate())
while @oldest_date <= getdate()
begin
set @daily_sum = (select sum(SubTotal) from SalesTable where DateCreated = @oldest_date)
insert into @temp(sales_date, sales_sum) values(@oldest_date, @daily_sum)
set @oldest_date = dateadd(day,1,@oldest_date)
end
select * from @temp
OK - I missed that 'last 30 days' part. The bit above, while not as clean, IMHO, as the date table, should work. Another variant would be to use the while loop to fill a temp table just with the last 30 days and do a left outer join with the result of my original query.
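That variant might look something like the following sketch; only the shape is shown, and the loop that fills the date list is the same idea as above.
-- Sketch: @dates holds one row per day for the last 30 days (filled by a loop as above)
DECLARE @dates TABLE (sales_date datetime);

SELECT d.sales_date, ISNULL(o.SalesSum, 0) AS SalesSum
FROM @dates d
LEFT OUTER JOIN (
    SELECT DateCreated, SUM(SubTotal) AS SalesSum
    FROM Orders
    GROUP BY DateCreated
) o ON o.DateCreated = d.sales_date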
including those days with no sales.
That's the difficult part. I don't think the first answer will help you with that. I did something similar to this with a separate date table.
You can find the directions on how to do so here:
Date Table
I have a Log table with LogID as an index, from which I never delete any records; it has IDs from 1 to ~10,000,000. Using this table I can write:
select
s.ddate, SUM(isnull(o.SubTotal,0))
from
(
select
cast(datediff(d,LogID,getdate()) as datetime) AS ddate
from
Log
where
LogID <31
) s left join orders o on o.orderdate = s.ddate
group by s.ddate
I actually did this today. We also have an e-commerce application. I don't want to fill our database with "useless" dates. I just do the group by and create all the days for the last N days in Java, and pair them with the date/sales results from the database.
Where is this ultimately going to end up? I ask only because it may be easier to fill in the empty days with whatever program is going to deal with the data instead of trying to get it done in SQL.
SQL is a wonderful language, and it is capable of a great many things, but sometimes you're just better off working the finer points of the data in the program instead.
(Revised a bit--I hit enter too soon)
I started poking at this, and as it hits some pretty tricky SQL concepts it quickly grew into the following monster. If feasible, you might be better off adapting THEn's solution; or, like many others advise, using application code to fill in the gaps could be preferable.
-- A temp table holding the 30 dates that you want to check
DECLARE @Foo TABLE (Date smalldatetime not null)
-- Populate the table using a common "tally table" methodology (I got this from SQL Server magazine long ago)
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS Number FROM L3)
INSERT @Foo (Date)
select dateadd(dd, datediff(dd, 0, dateadd(dd, -number + 1, getdate())), 0)
from Tally
where Number < 31
Step 1 is to build a temp table containing the 30 dates that you are concerned with. That abstract weirdness is about the fastest way known to build a table of consecutive integers; add a few more subqueries, and you can populate millions or more in mere seconds. I take the first 30, and use dateadd and the current date/time to convert them into dates. If you already have a "fixed" table that has 1-30, you can use that and skip the CTE entirely (by replacing table "Tally" with your table).
The outer two date function calls remove the time portion of the generated date.
(Note that I assume that your order date also has no time portion -- otherwise you've got another common problem to resolve.)
For testing purposes I built table #Orders, and this gets you the rest:
SELECT f.Date, sum(ordersubtotal) as SalesSum
from @Foo f
left outer join #Orders o
on o.DateCreated = f.Date
group by f.Date
I created the Function DateTable as JamesMLV pointed out to me.
And then the SQL looks like this:
SELECT dates.date, ISNULL(SUM(ordersubtotal), 0) as Sales FROM [dbo].[DateTable] ('2009-08-01','2009-08-31','day') dates
LEFT JOIN Orders ON CONVERT(VARCHAR(10),Orders.datecreated, 111) = dates.date
group by dates.date
SELECT DateCreated,
SUM(SubTotal) AS SalesSum
FROM Orders
GROUP BY DateCreated

SQL for counting events by date

I feel like I've seen this question asked before, but neither the SO search nor google is helping me... maybe I just don't know how to phrase the question. I need to count the number of events (in this case, logins) per day over a given time span so that I can make a graph of website usage. The query I have so far is this:
select
count(userid) as numlogins,
count(distinct userid) as numusers,
convert(varchar, entryts, 101) as date
from
usagelog
group by
convert(varchar, entryts, 101)
This does most of what I need (I get a row per date as the output containing the total number of logins and the number of unique users on that date). The problem is that if no one logs in on a given date, there will not be a row in the dataset for that date. I want it to add in rows indicating zero logins for those dates. There are two approaches I can think of for solving this, and neither strikes me as very elegant.
Add a column to the result set that lists the number of days between the start of the period and the date of the current row. When I'm building my chart output, I'll keep track of this value and if the next row is not equal to the current row plus one, insert zeros into the chart for each of the missing days.
Create a "date" table that has all the dates in the period of interest and outer join against it. Sadly, the system I'm working on already has a table for this purpose that contains a row for every date far into the future... I don't like that, and I'd prefer to avoid using it, especially since that table is intended for another module of the system and would thus introduce a dependency on what I'm developing currently.
Any better solutions or hints at better search terms for google? Thanks.
Frankly, I'd do this programmatically when building the final output. You're essentially trying to read something from the database which is not there (data for days that have no data). SQL isn't really meant for that sort of thing.
If you really want to do that, though, a "date" table seems your best option. To make it a bit nicer, you could generate it on the fly, using, e.g., your DB's date functions and a derived table.
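For instance, a small on-the-fly date list as a derived table could look something like this (SQL Server syntax; the start date and length are assumptions):
-- Sketch: a derived table of consecutive dates built from a VALUES list
SELECT DATEADD(DAY, v.n, '2014-02-01') AS d          -- assumed period start
FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) AS v(n);   -- extend as needed to cover the period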
I had to do exactly the same thing recently. This is how I did it in T-SQL (YMMV on speed, but I've found it performant enough over a couple million rows of event data):
DECLARE @DaysTable TABLE ( [Year] INT, [Day] INT )
DECLARE @StartDate DATETIME
SET @StartDate = whatever
WHILE (@StartDate <= GETDATE())
BEGIN
INSERT INTO @DaysTable ( [Year], [Day] )
SELECT DATEPART(YEAR, @StartDate), DATEPART(DAYOFYEAR, @StartDate)
SELECT @StartDate = DATEADD(DAY, 1, @StartDate)
END
-- This gives me a table of all days since whenever
-- (you could select @StartDate as the minimum date of your usage log)
SELECT days.Year, days.Day, events.NumEvents
FROM #DaysTable AS days
LEFT JOIN (
SELECT
COUNT(*) AS NumEvents,
DATEPART(YEAR, LogDate) AS [Year],
DATEPART(DAYOFYEAR, LogDate) AS [Day]
FROM LogData
GROUP BY
DATEPART(YEAR, LogDate),
DATEPART(DAYOFYEAR, LogDate)
) AS events ON days.Year = events.Year AND days.Day = events.Day
Create a memory table (a table variable) where you insert your date ranges, then outer join the logins table against it. Group by your start date, then you can perform your aggregations and calculations.
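A minimal sketch of that, using the column names from the question (the step that actually populates the date list is left out here):
-- Sketch: one row per day in @dates; logins are outer joined so empty days still show up
DECLARE @dates TABLE (d datetime PRIMARY KEY);
-- ...populate @dates with one row per day of the period of interest...

SELECT d.d AS [date],
       COUNT(u.userid) AS numlogins,
       COUNT(DISTINCT u.userid) AS numusers
FROM @dates d
LEFT JOIN usagelog u
    ON u.entryts >= d.d
    AND u.entryts < DATEADD(DAY, 1, d.d)
GROUP BY d.d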
The strategy I normally use is to UNION with the opposite of the query, generally a query that retrieves data for rows that don't exist.
If I wanted to get the average mark for a course, but some courses weren't taken by any students, I'd need to UNION with those not taken by anyone to display a row for every class:
SELECT AVG(mark), course FROM `marks`
UNION
SELECT NULL, course FROM courses WHERE course NOT IN
(SELECT course FROM marks)
Your query will be more complex, but the same principle should apply. You may indeed need a table of dates for your second query.
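Applied to the login counts, that principle might look roughly like this; it assumes a Dates table with a thedate column exists, which is not in the question.
-- Sketch: union the real per-day counts with zero rows for dates that have no logins
SELECT COUNT(userid) AS numlogins,
       COUNT(DISTINCT userid) AS numusers,
       CONVERT(varchar, entryts, 101) AS [date]
FROM usagelog
GROUP BY CONVERT(varchar, entryts, 101)
UNION
SELECT 0, 0, CONVERT(varchar, d.thedate, 101)
FROM Dates d
WHERE CONVERT(varchar, d.thedate, 101) NOT IN
      (SELECT CONVERT(varchar, entryts, 101) FROM usagelog)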
Option 1
You can create a temp table, insert dates within the range, and do a left outer join with the usagelog table.
Option 2
You can programmatically insert the missing dates while evaluating the result set to produce the final output.
WITH q(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
qq(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM qq
WHERE n < 99
),
dates AS
(
SELECT q.n * 100 + qq.n AS ndate
FROM q, qq
)
SELECT COUNT(userid) as numlogins,
COUNT(DISTINCT userid) as numusers,
CAST('2000-01-01' AS DATETIME) + ndate as date
FROM dates
LEFT JOIN
usagelog
ON entryts >= CAST('2000-01-01' AS DATETIME) + ndate
AND entryts < CAST('2000-01-01' AS DATETIME) + ndate + 1
GROUP BY
ndate
This will select up to 10,000 dates constructed on the fly, which should be enough for 30 years.
SQL Server has a limitation of 100 recursions per CTE, that's why the inner queries can return up to 100 rows each.
If you need more than 10,000, just add a third CTE qqq(n) and cross-join with it in dates.
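For what it's worth, that extra CTE and the widened dates CTE could look like the following fragment, to be spliced into the WITH list above (sketch only):
-- Sketch: a third 100-row CTE extends the cross join to 1,000,000 generated values
qqq(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM qqq
WHERE n < 99
),
dates AS
(
SELECT q.n * 10000 + qq.n * 100 + qqq.n AS ndate
FROM q, qq, qqq
)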