How can I get the count to display zero for months that have no records - sql

I am pulling transactions that happen on an attribute (attribute ID 4205 in table 1235) by the date that a change happened to the attribute (found in the History table) and counting up the number of changes that occurred by month. So far I have
SELECT TOP(100) PERCENT MONTH(H.transactiondate) AS Month, COUNT(*) AS Count
FROM hsi.rmObjectInstance1235 AS O LEFT OUTER JOIN
hsi.rmObjectHistory AS H ON H.objectID = O.objectID
WHERE H.attributeid = 4205 AND YEAR(H.transactiondate) = 2020
GROUP BY MONTH(H.transactiondate)
And I get
Month Count
---------------
1 9
2 4
3 11
4 14
5 1
I need to display a zero for months June - December instead of excluding those months.

One option uses a recursive query to generate the dates, and then brings in the history table with a left join:
with all_dates as (
select cast('2020-01-01' as date) dt
union all
select dateadd(month, 1, dt) from all_dates where dt < '2020-12-01'
)
select
month(d.dt) as month,
count(h.objectid) as cnt
from all_dates d
left join hsi.rmobjecthistory as h
on h.attributeid = 4205
and h.transactiondate >= d.dt
and h.transactiondate < dateadd(month, 1, d.dt)
and exists (select 1 from hsi.rmObjectInstance1235 o where o.objectID = h.objectID)
group by month(d.dt)
I am unclear about the purpose of the table hsi.rmObjectInstance1235 in the query, as none of its columns are used in the select and group by clauses. If it is meant to filter hsi.rmobjecthistory by objectID, then you can rewrite this as an exists condition, as shown in the above solution. You might also be able to just remove that part of the query.
Also, note that:
- top without order by does not really make sense
- top (100) percent is a no-op
As a consequence, I removed that row-limiting clause.
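To try the pattern without a SQL Server instance, here is a minimal sketch that runs the same idea (recursive month list, then left join) in SQLite through Python's sqlite3 module. The table name history, the function name monthly_counts and the sample rows are assumptions made up for illustration, and the SQL is SQLite dialect, not T-SQL.

```python
import sqlite3

def monthly_counts(rows, year):
    """Return (month, count) for all 12 months of `year`, zeros included.

    rows: iterable of (objectid, transactiondate) with ISO 'YYYY-MM-DD' dates.
    The table layout is hypothetical; it only mimics the history table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE history (objectid INTEGER, transactiondate TEXT)")
    conn.executemany("INSERT INTO history VALUES (?, ?)", rows)
    sql = """
    WITH RECURSIVE all_months(dt) AS (
        SELECT date(:start)                      -- first day of January
        UNION ALL
        SELECT date(dt, '+1 month')
        FROM all_months
        WHERE dt < date(:start, '+11 months')    -- stop at 1 December
    )
    SELECT CAST(strftime('%m', m.dt) AS INTEGER) AS month,
           COUNT(h.objectid) AS cnt              -- counts only matched rows
    FROM all_months AS m
    LEFT JOIN history AS h
           ON h.transactiondate >= m.dt
          AND h.transactiondate < date(m.dt, '+1 month')
    GROUP BY m.dt
    ORDER BY m.dt
    """
    result = conn.execute(sql, {"start": f"{year:04d}-01-01"}).fetchall()
    conn.close()
    return result
```

Counting h.objectid rather than * is what makes the empty months come out as 0: the left join still produces one row for such a month, but its objectid is NULL and is not counted.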

Related

Show all results in date range replacing null records with zero

I am querying the counts of logs that appear on particular days. However on some days, no log records I'm searching for are recorded. How can I set count to 0 for these days and return a result with the full set of dates in a date range?
SELECT r.LogCreateDate, r.Docs
FROM(
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs, to_char(l.log_create_date,'YYYY-MM-DD') AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
HAVING to_char(l.log_create_date , 'YYYY-MM-DD') BETWEEN '2020-01-01' AND '2020-01-07'
GROUP BY to_char(l.log_create_date,'YYYY-MM-DD')
ORDER BY to_char(l.log_create_date,'YYYY-MM-DD') DESC
) r
ORDER BY r.logcreatedate
Current result: I'd like to include the 01, 04 and 05 dates with 0 docs.
LOGCREATEDATE Docs
2020-01-02 7
2020-01-03 3
2020-01-06 6
2020-01-07 1
You need a full list of dates first, then outer join the log data to that. There are several ways to generate the list of dates, but recursive common table expressions (CTEs) are now an ANSI-standard way to do this, like so:
with cte (dt) as (
select to_date('2020-01-01','yyyy-mm-dd') as dt from dual -- start date here
union all
select dt + 1 from cte
where dt + 1 < to_date('2020-02-01','yyyy-mm-dd') -- finish (the day before) date here
)
select to_char(cte.dt,'yyyy-mm-dd') as LogCreateDate
, coalesce(r.Docs, 0) as Docs -- zero for days with no log rows
from cte
left join (
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs
, trunc(l.log_create_date) AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
WHERE trunc(l.log_create_date) BETWEEN to_date('2020-01-01','yyyy-mm-dd') AND to_date('2020-01-07','yyyy-mm-dd')
GROUP BY trunc(l.log_create_date)
) r on cte.dt = r.LogCreateDate
order by cte.dt
Also, when dealing with dates, I prefer not to convert them to strings until the final output, which keeps proper date ordering and lets the query run as efficiently as possible.
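The same shape (date list first, aggregated data left-joined onto it, NULL sums turned into 0) can be checked quickly in SQLite from Python. This is only a sketch under assumptions: the single log table with a numeric docs column, the function name daily_totals and the sample values are hypothetical simplifications of the two tables in the question.

```python
import sqlite3

def daily_totals(log_rows, first_day, last_day):
    """Return (date, total docs) for every day in [first_day, last_day].

    log_rows: iterable of (log_create_date, docs), timestamps in ISO format.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (log_create_date TEXT, docs INTEGER)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", log_rows)
    sql = """
    WITH RECURSIVE days(dt) AS (
        SELECT date(:first_day)
        UNION ALL
        SELECT date(dt, '+1 day') FROM days WHERE dt < date(:last_day)
    )
    SELECT d.dt,
           COALESCE(SUM(l.docs), 0) AS docs   -- SUM over no rows is NULL
    FROM days AS d
    LEFT JOIN log AS l
           ON date(l.log_create_date) = d.dt  -- strip the time part
    GROUP BY d.dt
    ORDER BY d.dt
    """
    params = {"first_day": first_day, "last_day": last_day}
    result = conn.execute(sql, params).fetchall()
    conn.close()
    return result
```

Unlike COUNT, SUM returns NULL when a day has no matching rows, which is why the COALESCE is needed here.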

Detect if a month is missing and insert them automatically with a select statement (MSSQL)

I am trying to write a select statement which detects if a month is not existent and automatically inserts that month with a value 0. It should insert all missing months from the first entry to the last entry.
Example:
My table looks like this:
After the statement it should look like this:
You need a recursive CTE to get all the years in the table (and the missing ones if any) and another one to get all the month numbers 1-12.
A CROSS join of these CTEs will be joined with a LEFT join to the table and finally filtered so that rows earlier than the first year/month and later than the last year/month are left out:
WITH
limits AS (
SELECT MIN(year) min_year, -- min year in the table
MAX(year) max_year, -- max year in the table
MIN(DATEFROMPARTS(year, monthnum, 1)) min_date, -- min date in the table
MAX(DATEFROMPARTS(year, monthnum, 1)) max_date -- max date in the table
FROM tablename
),
years(year) AS ( -- recursive CTE to get all the years of the table (and the missing ones if any)
SELECT min_year FROM limits
UNION ALL
SELECT year + 1
FROM years
WHERE year < (SELECT max_year FROM limits)
),
months(monthnum) AS ( -- recursive CTE to get all the month numbers 1-12
SELECT 1
UNION ALL
SELECT monthnum + 1
FROM months
WHERE monthnum < 12
)
SELECT y.year, m.monthnum,
DATENAME(MONTH, DATEFROMPARTS(y.year, m.monthnum, 1)) month,
COALESCE(value, 0) value
FROM months m CROSS JOIN years y
LEFT JOIN tablename t
ON t.year = y.year AND t.monthnum = m.monthnum
WHERE DATEFROMPARTS(y.year, m.monthnum, 1)
BETWEEN (SELECT min_date FROM limits) AND (SELECT max_date FROM limits)
ORDER BY y.year, m.monthnum
You should not be storing date components in two separate columns; instead, you should have just one column, with a proper date-like datatype.
One approach is to use a recursive query to generate all starts of month between the earliest and latest date in the table, then bring in the table with a left join.
In SQL Server:
with cte as (
select min(datefromparts(year, monthnum, 1)) as dt,
max(datefromparts(year, monthnum, 1)) as dt_max
from mytable
union all
select dateadd(month, 1, dt), dt_max -- both members must return the same columns
from cte
where dt < dt_max
)
select c.dt, coalesce(t.value, 0) as value
from cte c
left join mytable t on datefromparts(t.year, t.monthnum, 1) = c.dt
If your data spreads over more than 100 months, you need to add option(maxrecursion 0) at the end of the query.
You can extract the date components in the final select if you like:
select
year(c.dt) as yr,
month(c.dt) as monthnum,
datename(month, c.dt) as monthname,
coalesce(t.value, 0) as value
from ...
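For anyone who wants to experiment with the month-spine idea outside SQL Server, here is a rough SQLite equivalent wrapped in Python. It is a sketch, not a port: printf stands in for datefromparts, and the function name fill_missing_months and the sample data are assumptions. It also assumes the table is not empty.

```python
import sqlite3

def fill_missing_months(rows):
    """Return (year, monthnum, value) for every month between the first
    and last month present in rows, using 0 where a month is missing.

    rows: iterable of (year, monthnum, value).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (year INTEGER, monthnum INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    sql = """
    WITH RECURSIVE
    bounds(dt_min, dt_max) AS (
        -- build ISO first-of-month strings so they sort like dates
        SELECT MIN(printf('%04d-%02d-01', year, monthnum)),
               MAX(printf('%04d-%02d-01', year, monthnum))
        FROM mytable
    ),
    months(dt) AS (
        SELECT dt_min FROM bounds
        UNION ALL
        SELECT date(dt, '+1 month')
        FROM months
        WHERE dt < (SELECT dt_max FROM bounds)
    )
    SELECT CAST(strftime('%Y', m.dt) AS INTEGER) AS yr,
           CAST(strftime('%m', m.dt) AS INTEGER) AS monthnum,
           COALESCE(t.value, 0) AS value
    FROM months AS m
    LEFT JOIN mytable AS t
           ON printf('%04d-%02d-01', t.year, t.monthnum) = m.dt
    ORDER BY m.dt
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

Because the spine is built from real dates, the year rollover from December to January needs no special handling.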

How to select all dates in SQL query

SELECT oi.created_at, count(oi.id_order_item)
FROM order_item oi
The result is the following:
2016-05-05 1562
2016-05-06 3865
2016-05-09 1
...etc
The problem is that I need information for all days even if there were no id_order_item for this date.
Expected result:
Date Quantity
2016-05-05 1562
2016-05-06 3865
2016-05-07 0
2016-05-08 0
2016-05-09 1
You can't count something that is not in the database. So you need to generate the missing dates in order to be able to "count" them.
SELECT d.dt, count(oi.id_order_item)
FROM (
select dt::date
from generate_series(
(select min(created_at) from order_item),
(select max(created_at) from order_item), interval '1' day) as x (dt)
) d
left join order_item oi on oi.created_at = d.dt
group by d.dt
order by d.dt;
The query gets the minimum and maximum date from the existing order items.
If you want the count for a specific date range you can remove the sub-selects:
SELECT d.dt, count(oi.id_order_item)
FROM (
select dt::date
from generate_series(date '2016-05-01', date '2016-05-31', interval '1' day) as x (dt)
) d
left join order_item oi on oi.created_at = d.dt
group by d.dt
order by d.dt;
SQLFiddle: http://sqlfiddle.com/#!15/49024/5
Friend, PostgreSQL's COUNT function ignores NULL values: COUNT(oi.id_order_item) does not count rows where that column is NULL. Because you select oi.created_at outside the aggregate, you also need to include it in a GROUP BY clause.
Note that grouping only returns dates that have at least one row in order_item. Dates with no rows at all cannot appear in the result unless they are generated separately, so GROUP BY alone will not return the 0 values.
SELECT oi.created_at, count(oi.id_order_item)
FROM order_item oi
GROUP BY oi.created_at
From TechontheNet (my most trusted source of information):
Because you have listed one column in your SELECT statement that is not encapsulated in the count function, you must use a GROUP BY clause. The department field must, therefore, be listed in the GROUP BY section.
Some info on Count in PostgreSql
http://www.postgresqltutorial.com/postgresql-count-function/
http://www.techonthenet.com/postgresql/functions/count.php
Solution #1: You need a date table that stores every date. Then do a left join against it for the period you want.
Solution #2
WITH DateTable AS
(
SELECT DATEADD(dd, 1, CONVERT(DATETIME, GETDATE())) AS CreateDateTime, 1 AS Cnter
UNION ALL
SELECT DATEADD(dd, -1, CreateDateTime), DateTable.Cnter + 1
FROM DateTable
WHERE DateTable.Cnter + 1 <= 5
)
SELECT CreateDateTime FROM DateTable
Generate a temporary table of dates like this based on your input, and then do a left join against it.

Pad out an SQL table with data for Graphing Purposes

SQL Server 2005
I have an SQL Function (ftn_GetExampleTable) which returns a table with multiple result rows
EXAMPLE
ID MemberID MemberGroupID Result1 Result2 Result3 Year Week
1 1 1 High Risk 2 xx 2011 22
2 11 4 Low Risk 1 yy 2011 21
3 12 5 Med Risk 3 zz 2011 25
etc.
Now I do a count and group by on a table above this for Result 2 for instance so I get
SELECT MemberGroupID, Result2, Count(*) AS ExampleCount, Year, Week
FROM ftn_GetExampleTable
GROUP BY MemberGroupID, Result2, Year, Week
MemberGroupID Result2 ExampleCount Year Week
1 2 4 2011 22
4 1 2 2011 21
5 3 1 2011 25
Now imagine I graph this new table between weeks 20 and 23 of 2011. It won't graph weeks 20 or 23, or certain groups, or even certain results, because they are not in the data. So I need "false data" inserted into this table covering all the possibilities, so that they at least show on a graph even if the count is 0. Does this make sense?
I am wondering what the easiest and most dynamic way is, as it could be Result1 or Result3 that I want to graph on (different column types).
Thanks in advance
It looks like your dimensions are: MemberGroupID, Result2, and week (Year, Week).
One approach to solving this is to generate a list of all values you want for all the dimensions, and produce a cartesian product of them. As an example,
SELECT m.MemberGroupID, n.Result2, w.Year, w.Week
FROM (SELECT MemberGroupID FROM ftn_GetExampleTable GROUP BY MemberGroupID) m
CROSS
JOIN (SELECT Result2 FROM ftn_GetExampleTable GROUP BY Result2 ) n
CROSS
JOIN (SELECT Year, Week FROM myCalendar WHERE ... ) w
You don't necessarily need a table named myCalendar. (That approach does seem to be the popular one.) You just need a row source from which you can derive a list of (Year, Week) tuples. (There are answers elsewhere on Stack Overflow to the question of how to generate a list of dates.)
And the list of MemberGroupID and Result2 values doesn't have to come from the ftn_GetExampleTable rowsource, you could substitute another query.
With a cartesian product of those dimensions, you've got a complete "grid". Now you can LEFT JOIN your original result set to that.
Any place you don't have a matching row from the "gappy" result query, you'll get a NULL returned. You can leave the NULL, or replace it with a 0, which is probably what you want if it's a "count" you are returning.
SELECT d.MemberGroupID
, d.Result2
, d.Year
, d.Week
, COALESCE(r.ExampleCount,0) as ExampleCount
FROM ( <dimension query from above> ) d
LEFT
JOIN ( <original ExampleCount query> ) r
ON r.MemberGroupID = d.MemberGroupID
AND r.Result2 = d.Result2
AND r.Year = d.Year
AND r.Week = d.Week
That query can be refactored to make use of Common Table Expressions, which makes the query a little easier to read, especially if you are including multiple measures.
; WITH d AS ( /* <dimension query with no gaps (example above)> */
)
, r AS ( /* <original query with gaps> */
SELECT MemberGroupID, Result2, Count(*) AS ExampleCount, Year, Week
FROM ftn_GetExampleTable
GROUP BY MemberGroupID, Result2, Year, Week
)
SELECT d.*
, COALESCE(r.ExampleCount,0) AS ExampleCount
FROM d
LEFT
JOIN r
ON r.Year=d.Year AND r.Week=d.Week AND r.MemberGroupID = d.MemberGroupID
AND r.Result2 = d.Result2
This isn't a complete working solution to your problem, but it outlines an approach you can use.
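To make the outline concrete, here is a small runnable sketch of the grid-then-left-join approach, using SQLite from Python rather than SQL Server. The table names results and calendar, the function name grid_counts and the sample rows are all assumptions for illustration; results stands in for the output of ftn_GetExampleTable.

```python
import sqlite3

def grid_counts(rows, weeks):
    """Return a count for every (MemberGroupID, Result2, Year, Week)
    combination, with 0 where the data has no matching rows.

    rows:  iterable of (MemberGroupID, Result2, Year, Week)
    weeks: iterable of (Year, Week) tuples that should appear on the graph
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (MemberGroupID INTEGER, Result2 INTEGER, Year INTEGER, Week INTEGER)")
    conn.execute("CREATE TABLE calendar (Year INTEGER, Week INTEGER)")
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
    conn.executemany("INSERT INTO calendar VALUES (?, ?)", weeks)
    sql = """
    WITH d AS (   -- complete grid: every group x every result x every week
        SELECT m.MemberGroupID, n.Result2, w.Year, w.Week
        FROM (SELECT DISTINCT MemberGroupID FROM results) AS m
        CROSS JOIN (SELECT DISTINCT Result2 FROM results) AS n
        CROSS JOIN calendar AS w
    ),
    r AS (        -- the original "gappy" aggregate
        SELECT MemberGroupID, Result2, Year, Week, COUNT(*) AS ExampleCount
        FROM results
        GROUP BY MemberGroupID, Result2, Year, Week
    )
    SELECT d.MemberGroupID, d.Result2, d.Year, d.Week,
           COALESCE(r.ExampleCount, 0) AS ExampleCount
    FROM d
    LEFT JOIN r
           ON r.MemberGroupID = d.MemberGroupID
          AND r.Result2 = d.Result2
          AND r.Year = d.Year
          AND r.Week = d.Week
    ORDER BY d.MemberGroupID, d.Result2, d.Year, d.Week
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result
```

The grid size is the product of the dimension sizes, so it is worth restricting the calendar rows to the weeks actually being graphed.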
Whenever I need to generate a sequence within SQL Server I use the sys.all_objects view along with the ROW_NUMBER function, then manipulate it as required:
SELECT ROW_NUMBER() OVER(ORDER BY Object_ID) AS Sequence
FROM Sys.All_Objects
So for the list of year and week numbers I would use:
DECLARE @StartDate DATETIME,
@EndDate DATETIME
SET @StartDate = '20110101'
SET @EndDate = '20120601'
SELECT DATEPART(YEAR, Date) AS YearNum,
DATEPART(WEEK, Date) AS WeekNum
FROM ( SELECT DATEADD(WEEK, ROW_NUMBER() OVER(ORDER BY Object_ID) - 1, @StartDate) AS Date
FROM Sys.All_Objects
) Dates
WHERE Date < @EndDate
Where the dates subquery provides a list of dates at one week intervals between your start and end dates.
So in your example the end result would be something like:
DECLARE @StartDate DATETIME,
@EndDate DATETIME
SET @StartDate = '20110101'
SET @EndDate = '20120601'
;WITH Data AS
( SELECT MemberGroupID,
Result2,
Count(*) AS ExampleCount,
Year,
Week
FROM ftn_GetExampleTable
GROUP BY MemberGroupID, Result2, Year, Week
), Dates AS
( SELECT DATEPART(YEAR, Date) AS YearNum,
DATEPART(WEEK, Date) AS WeekNum
FROM ( SELECT DATEADD(WEEK, ROW_NUMBER() OVER(ORDER BY Object_ID) - 1, @StartDate) AS Date
FROM Sys.All_Objects
) Dates
WHERE Date < @EndDate
)
SELECT YearNum,
WeekNum,
MemberGroupID,
Result2,
COALESCE(ExampleCount, 0) AS ExampleCount
FROM Dates
LEFT JOIN Data
ON YearNum = Data.Year
AND WeekNum = Data.Week

SQL to identify missing week

I have a database table with the following structure -
Week_End Sales
2009-11-01 43223.43
2009-11-08 4324.23
2009-11-15 64343.23
...
Week_End is a datetime column, and the date increments by 7 days with each new entry.
What I want is a SQL statement that will identify if there is a week missing in the sequence. So, if the table contained the following data -
Week_End Sales
2009-11-01 43223.43
2009-11-08 4324.23
2009-11-22 64343.73
...
The query would return 2009-11-15.
Is this possible? I am using SQL Server 2008, btw.
You've already accepted an answer so I guess you don't need this, but I was almost finished with it anyway and it has one advantage that the selected solution doesn't have: it doesn't require updating every year. Here it is:
SELECT T1.*
FROM Table1 T1
LEFT JOIN Table1 T2
ON T2.Week_End = DATEADD(week, 1, T1.Week_End)
WHERE T2.Week_End IS NULL
AND T1.Week_End <> (SELECT MAX(Week_End) FROM Table1)
It is based on Andemar's solution, but handles the changing year too, and doesn't require the existence of the Sales column.
Join the table on itself to search for consecutive rows:
select a.*
from YourTable a
left join YourTable b
on datepart(wk,b.Week_End) = datepart(wk,a.Week_End) + 1
-- No next week
where b.sales is null
-- Not the last week
and datepart(wk,a.Week_End) <> (
select datepart(wk,max(Week_End)) from YourTable
)
This should return any weeks without a next week.
Assuming your "week_end" dates are always going to be the Sundays of the week, you could try a CTE - a common table expression that lists out all the Sundays for 2009, and then do an outer join against your table.
All those rows missing from your table will have a NULL value for their "week_end" in the select:
;WITH Sundays2009 AS
(
SELECT CAST('20090104' AS DATETIME) AS Sunday
UNION ALL
SELECT
DATEADD(DAY, 7, cte.Sunday)
FROM
Sundays2009 cte
WHERE
DATEADD(DAY, 7, cte.Sunday) < '20100101'
)
SELECT
sun.Sunday 'Missing week end date'
FROM
Sundays2009 sun
LEFT OUTER JOIN
dbo.YourTable tbl ON sun.Sunday = tbl.week_end
WHERE
tbl.week_end IS NULL
I know this has already been answered, but can I suggest something really simple?
/* First make a list of weeks using a table of numbers (mine is dbo.nums(num), starting with 1) */
WITH AllWeeks AS (
SELECT DATEADD(week,num-1,w.FirstWeek) AS eachWeek
FROM
dbo.nums
JOIN
(SELECT MIN(week_end) AS FirstWeek, MAX(week_end) as LastWeek FROM yourTable) w
ON num <= DATEDIFF(week,FirstWeek,LastWeek)
)
/* Now just look for ones that don't exist in your table */
SELECT w.eachWeek AS MissingWeek
FROM AllWeeks w
WHERE NOT EXISTS (SELECT * FROM yourTable t WHERE t.week_end = w.eachWeek)
;
If you know the range you want to look over, you don't need to use the MIN/MAX subquery in the CTE.
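If you have no numbers table to hand, the same "list every expected week, then keep the ones that don't exist" idea can be sketched with a recursive CTE. The version below runs in SQLite through Python purely so it can be tried quickly. The table name sales, the function name missing_weeks and the sample dates are assumptions, and week_end is assumed to hold ISO 'YYYY-MM-DD' strings.

```python
import sqlite3

def missing_weeks(week_ends):
    """Return the week-ending dates missing between the first and last
    week_end present, stepping 7 days at a time.

    week_ends: iterable of ISO 'YYYY-MM-DD' date strings.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (week_end TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?)", [(w,) for w in week_ends])
    sql = """
    WITH RECURSIVE all_weeks(week_end) AS (
        SELECT MIN(week_end) FROM sales          -- first week in the table
        UNION ALL
        SELECT date(week_end, '+7 days')
        FROM all_weeks
        WHERE week_end < (SELECT MAX(week_end) FROM sales)
    )
    SELECT w.week_end
    FROM all_weeks AS w
    WHERE w.week_end IS NOT NULL                 -- guards the empty-table case
      AND NOT EXISTS (SELECT 1 FROM sales AS s WHERE s.week_end = w.week_end)
    ORDER BY w.week_end
    """
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result
```

This returns the missing dates themselves, whereas the self-join answers above return the row that precedes each gap.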