I have a table with date ranges,
and I need to add rows between these ranges. I must break this table down to minute granularity. How can I add these extra rows?
The recursive CTE option from @MatthewBaker would only need minor changes to meet your needs.
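WITH by_minute AS
(
    -- Anchor: one row per range, starting at its first minute.
    -- List any other columns you need from your_table explicitly.
    SELECT datetime_from, datetime_to, datetime_from AS minute_marker
    FROM your_table
    UNION ALL
    -- Recursive step: advance one minute until the range end is reached.
    SELECT datetime_from, datetime_to, DATEADD(minute, 1, minute_marker)
    FROM by_minute
    WHERE DATEADD(minute, 1, minute_marker) < datetime_to
)
SELECT *
FROM by_minute
OPTION (MAXRECURSION 0)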
The OPTION (MAXRECURSION 0) allows SQL Server to keep recursively generating the minutes beyond the default of 100. Still, I would not recommend this if the intervals being generated are more than a few hundred minutes long (maybe up to one day [1440 minutes]).
In such a case the simpler approach would be to utilise a table of numbers, and simply join on to that.
An example for creating such a table could be : https://www.mssqltips.com/sqlservertip/4176/the-sql-server-numbers-table-explained--part-1/
From there, you just join on the number of rows that you need...
SELECT
yourTable.*,
DATEADD(minute, Numbers.[Number], yourTable.datetime_from) AS minute_marker
FROM
yourTable
INNER JOIN
dbo.Numbers
ON Numbers.[Number] >= 0
AND Numbers.[Number] < DATEDIFF(minute, yourTable.datetime_from, yourTable.datetime_to)
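For completeness, here is a minimal sketch of one way to populate the dbo.Numbers table that the join above assumes. The linked article covers this in more depth; the CTE name digits and the aliases a, b, c, e are only illustrative, and the size (10,000 rows, so ranges of up to roughly 6.9 days at minute granularity) is an assumption you would adjust to your longest range.

```sql
-- Assumption: dbo.Numbers does not exist yet and should start at 0.
CREATE TABLE dbo.Numbers ([Number] INT NOT NULL PRIMARY KEY);

-- Cross join four copies of the digits 0-9 to produce 0..9999.
WITH digits AS (
    SELECT d
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
)
INSERT INTO dbo.Numbers ([Number])
SELECT a.d + 10 * b.d + 100 * c.d + 1000 * e.d
FROM digits AS a
CROSS JOIN digits AS b
CROSS JOIN digits AS c
CROSS JOIN digits AS e;
```

Adding a fifth cross join (multiplied by 10000) would extend the table to 100,000 rows if a single range can span more than a week of minutes.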
Another recommendation I have is to NOT use the 59th second to represent the end of a minute. What if you get data at 59.600 seconds? That's after the end of the minute but before the start of the new one. Instead, use markers that are Inclusive Start and Exclusive End...
The first minute of 2012 = '2012-01-01 00:00:00.000' -> '2012-01-01 00:01:00.000'
The final minute of 2012 = '2012-12-31 23:59:00.000' -> '2013-01-01 00:00:00.000'
With such a structure you only ever need my_point_in_time >= start AND my_point_in_time < end, and you never need worry about the precision of the datatypes being used.
(It also matches human natural language. When we say things like between 1 and 2 we most often mean >= 1 AND < 2.)
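A small sketch of the Inclusive Start / Exclusive End test, using the 59.600 second case mentioned above. The table variable @minutes and its column names are hypothetical stand-ins; only the predicate shape comes from the answer.

```sql
-- Hypothetical minute buckets stored as [minute_start, minute_end).
DECLARE @minutes TABLE (minute_start DATETIME2(3), minute_end DATETIME2(3));
INSERT INTO @minutes (minute_start, minute_end) VALUES
    ('2012-01-01 00:00:00.000', '2012-01-01 00:01:00.000'),
    ('2012-01-01 00:01:00.000', '2012-01-01 00:02:00.000');

-- A reading that arrives late in the first minute.
DECLARE @my_point_in_time DATETIME2(3) = '2012-01-01 00:00:59.600';

-- Exactly one bucket matches, whatever the precision of the value.
SELECT minute_start, minute_end
FROM @minutes
WHERE @my_point_in_time >= minute_start
  AND @my_point_in_time <  minute_end;
```

Because each bucket's end equals the next bucket's start, no point in time can fall between two buckets or into both.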
If you use the following:
WITH cte
AS (SELECT CAST('2017-01-01 00:00:00' AS DATETIME) AS startTime
UNION ALL
SELECT DATEADD(MINUTE, 1, startTime)
FROM cte
WHERE startTime < '2017-01-02 00:00:00'
)
SELECT *
FROM cte
OPTION (MAXRECURSION 0)
It will give you a minute-by-minute result. Substitute in the range you want. You can then use that as a basis to write an insert. Recursive CTEs aren't the most efficient, but they are probably the easiest.
Related
I'm looking to see if there is a way to get the total daily inventory for open items in the past few months. Basically, each record has a start date and an end date. The start date is always the same. The end date will be null until it has been processed. Once processed, it is updated with a process date. Getting one day is fine, but I need to get the total volume, every day, for the last few months.
My current method of doing this is putting the results in an aggregate table. I can run the results one time through a while loop, then each day run whatever open volume there is from a stored procedure. This method works, but seems messy.
DECLARE @D AS DATE = '04/01/2019'
WHILE @D <= CAST(GETDATE() AS DATE)
BEGIN
    INSERT INTO DBO.OPEN_INVENTORY
    SELECT
        @D OPEN_DATE
        ,COUNT(1) OPEN_VOLUME
    FROM
        DBO.INVENTORY_RECORDS
    WHERE
        @D BETWEEN START_DATE AND ISNULL(END_DATE,'12/31/2199')
    SET @D = DATEADD(D,+1,@D)
END
I would like to reproduce these results without having to store the volumes into an aggregate table. Is there a way to accomplish this in a single select?
Yes, the best way would be to use what's known as a "Tally Table". They are extremely quick at building large sets of sequential data and, unlike a WHILE loop, CURSOR or rCTE, aren't iterative or recursive.
This is a bit of a stab in the dark, as I have no sample data, but I think this is what you're after.
DECLARE @D AS DATE = '20190104';
WITH N AS(
    SELECT N
    FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
    FROM N N1, N N2, N N3), --1000 rows should be enough?
Dates AS(
    SELECT DATEADD(DAY, T.I, @D) AS CalendarDate
    FROM Tally T
    WHERE DATEADD(DAY, T.I, @D) <= GETDATE())
SELECT D.CalendarDate,
       COUNT(IR.YourIDColumn) AS OPEN_VOLUME
FROM Dates D
     LEFT JOIN DBO.INVENTORY_RECORDS IR ON D.CalendarDate >= IR.START_DATE
                                       AND (D.CalendarDate <= IR.END_DATE OR IR.END_DATE IS NULL)
GROUP BY D.CalendarDate;
If not, try troubleshooting it yourself first, and then supply sample data and expected results.
I'm not sure if there should be a loop for this or what the easiest approach would be.
My data consists of a list of people participating in our program. They have various start and end dates, but the following equation is able to capture the number of people who participated on a specific date:
DECLARE @PopulationDate DATETIME = '2018-06-01 05:00:00';
select count(People)
FROM Program_Log
WHERE
START_TIME <= @PopulationDate
AND (END_TIME >= @PopulationDate OR END_TIME IS NULL)
Is there a way I can loop in different date values to get the number of program participants each day for an entire year?
Multiple years?
One simple way is to use a CTE to generate the dates and then a left join to bring in the data. For instance, the following gets the counts as of the first of the month for this year:
with dates as (
select cast('2018-01-01' as date) as dte
union all
select dateadd(month, 1, dte)
from dates
where dte < getdate()
)
select d.dte, count(pl.people)
from dates d left join
program_log pl
on pl.start_time <= d.dte and (pl.end_time >= d.dte or pl.end_time is null)
group by d.dte
order by d.dte;
Note that this will work best for a handful of dates. If you want more than 100, you need to add option (maxrecursion 0) to the end of the query.
Also, count(people) is highly suspicious. Perhaps you mean sum(people) or something similar.
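As a rough sketch of the same idea at daily granularity for a full year, the recursion step can advance by one day and the query can end with option (maxrecursion 0). The table variable @Program_Log and its sample rows are hypothetical stand-ins for the real Program_Log table, included only so the example runs on its own.

```sql
-- Hypothetical stand-in for Program_Log so the sketch is self-contained.
DECLARE @Program_Log TABLE (
    People     INT,
    START_TIME DATETIME,
    END_TIME   DATETIME NULL
);
INSERT INTO @Program_Log (People, START_TIME, END_TIME) VALUES
    (1, '2018-01-01 08:00:00', '2018-01-03 17:00:00'),
    (2, '2018-01-02 09:00:00', NULL);

-- One row per day of 2018 (365 rows, so the default limit of 100 is not enough).
WITH dates AS (
    SELECT CAST('2018-01-01' AS DATE) AS dte
    UNION ALL
    SELECT DATEADD(day, 1, dte)
    FROM dates
    WHERE dte < '2018-12-31'
)
SELECT d.dte, COUNT(pl.People) AS participants
FROM dates AS d
LEFT JOIN @Program_Log AS pl
    ON pl.START_TIME <= d.dte
   AND (pl.END_TIME >= d.dte OR pl.END_TIME IS NULL)
GROUP BY d.dte
ORDER BY d.dte
OPTION (MAXRECURSION 0);
```

Each date is compared as of midnight, so someone who starts at 08:00 is first counted on the following day. For multiple years, only the anchor date and the end date in the where clause need to change.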
I'd like to check if there is anything to return given a number to check against, and if that query returns no entries, increase the number until an entry is reached and display that entry. Currently, the code looks like this :
SELECT *
FROM news
WHERE DATEDIFF(day, date, getdate() ) <= #url.d#
ORDER BY date desc
where #url.d# is an integer being passed through (say 31). If that returns no results, I'd like to increase the number stored in #url.d# by 1 until an entry is found.
This kind of incremental querying is just not efficient. You'll get better results by saying - "I'll never need more than 100 results so give me these" :
SELECT top 100 *
FROM news
ORDER BY date desc
Then filter further on the client side if you want only a particular day's items (such as the items that share the date of the first item in the result).
Or, you could transform your multiple query request into a two query request:
DECLARE
@theDate datetime,
@theDate2 datetime
SET @theDate = (SELECT Max(date) FROM news)
--trim the time off of @theDate
SET @theDate = DateAdd(dd, DateDiff(dd, 0, @theDate), 0)
SET @theDate2 = DateAdd(dd, 1, @theDate)
SELECT *
FROM news
WHERE @theDate <= date AND date < @theDate2
ORDER BY date desc
In MySQL:
SELECT news.*,
(
SELECT COUNT(*)
FROM news
WHERE date < DATE_SUB(NOW(), INTERVAL #url.d# DAY)
)
FROM news
WHERE date >= DATE_SUB(NOW(), INTERVAL #url.d# DAY)
ORDER BY
date DESC
LIMIT 1
In SQL Server:
SELECT TOP 1
news.*,
(
SELECT COUNT(*)
FROM news
WHERE date < DATEADD(day, -#url.d#, GETDATE())
)
FROM news
WHERE date >= DATEADD(day, -#url.d#, GETDATE())
ORDER BY
date DESC
Note that using this syntax makes your query sargable, that is an index can be used to filter on date efficiently.
First, I think you will probably want to avoid using the DateDiff function in your where clause. Instead, compute the desired cutoff date and do not apply any computations to the date column within the where clause. This will be more efficient, so rather than
WHERE DATEDIFF(day, date, getdate() ) <= #url.d#
you would have something like
WHERE date >= @cutoffDate
where @cutoffDate is a computed date based on #url.d#
Now, as for grabbing the correct cutoff date. My assumption is that under normal circumstances, there will be articles returned from the request; otherwise you would just grab articles from the most recent date. So, the approach that I would take would be to grab the OLDEST of the computed cutoff date (based on #url.d#) and the MOST RECENT article date. Something like
-- @urld holds the value of #url.d#
-- compute the cutoff date as the OLDEST of the most recent article and
-- the date based on #url.d#
declare @cutoff datetime
select @cutoff = DateAdd(dd,-1*@urld,GetDate())
select @cutoff
select @cutoff = min(cutoffDate)
from
(SELECT Max(date) as cutoffDate from News
UNION
select @cutoff) Cutoff
-- grab the articles with dates that are more recent than the cutoff date
select *
from News
WHERE date >= @cutoff
I'm also guessing that you would probably want to round to midnight for the dates (which I didn't do here). This is a multi-query approach and should probably be implemented in a single stored procedure ... if this is what you are looking for.
Good luck with the project!
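If a single statement is preferred over the multi-query version, the same "oldest of the computed cutoff and the most recent article date" rule could be folded into a subquery. This is only a sketch: the table variable @news, its columns and its sample rows are hypothetical, and @urld again stands for the value passed in as #url.d#.

```sql
-- Hypothetical stand-in for the news table; both rows are older than 31 days.
DECLARE @news TABLE (id INT, [date] DATETIME);
INSERT INTO @news (id, [date]) VALUES
    (1, DATEADD(day, -90, GETDATE())),
    (2, DATEADD(day, -60, GETDATE()));

DECLARE @urld INT = 31;

-- The cutoff is the older of (now - @urld days) and the newest article date,
-- so the query falls back to the most recent article when the window is empty.
SELECT n.*
FROM @news AS n
WHERE n.[date] >= (
    SELECT MIN(c.cutoffDate)
    FROM (
        SELECT DATEADD(day, -@urld, GETDATE()) AS cutoffDate
        UNION ALL
        SELECT MAX([date]) FROM @news
    ) AS c
)
ORDER BY n.[date] DESC;
```

With the sample rows nothing falls inside the 31-day window, so the cutoff drops back to the newest article's date and only that article (id 2) is returned. The date column is still compared directly against a computed value, so an index on it can be used.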
If you wanted the one row:
SELECT t.*
FROM NEWS t
WHERE t.id = (SELECT MAX(n.id)
FROM NEWS n
WHERE n.date BETWEEN DATEADD(day, -:url.d, getDate()) AND getDate())
It might not be obvious that the DATEADD is using a negative in order to go back however many number of days desired.
If you wanted all the rows in that date:
SELECT t.*
FROM NEWS t
WHERE t.date BETWEEN DATEADD(day, -:url.d, getDate()) AND getDate()
I tried to use OPTION (MAXRECURSION 0) in a view to generate a list of dates.
This seems to be unsupported. Is there a workaround for this issue?
EDIT to Explain what I actually want to do:
I have 2 tables.
table1: int weekday, bool available
table2: datetime date, bool available
I want the result:
view1: date (here all days in this year), available(from table2 or from table1 when not in table2).
That means I have to apply a join on a date with a weekday.
I hope this explanation is understandable, because I actually use more tables with more fields in the query.
I found this code to generate the recursion:
WITH Dates AS
(
SELECT cast('2008-01-01' as datetime) Date
UNION ALL
SELECT Date + 1
FROM Dates
WHERE Date + 1 < DATEADD(yy, 1, GETDATE())
)
No. If you can find a way to do it within 100 levels of recursion (for example, by using a table of numbers), you'll be able to do it. But if you have a numbers or pivot table, you won't need the recursion anyway...
See this question (but I would create a table and not a table-valued function), this question and this link and this link
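To illustrate the non-recursive route, here is a sketch of a view that lists every day of the current year by cross joining a small set of digits instead of recursing, so no OPTION clause is needed. The view name dbo.DatesThisYear and the CTE names digits, numbers and bounds are hypothetical; a permanent numbers table, as suggested above, could replace the first two CTEs.

```sql
CREATE VIEW dbo.DatesThisYear
AS
WITH digits AS (
    SELECT d
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
numbers AS (
    -- 0..999, more than enough for the days of one year
    SELECT a.d + 10 * b.d + 100 * c.d AS n
    FROM digits AS a
    CROSS JOIN digits AS b
    CROSS JOIN digits AS c
),
bounds AS (
    -- January 1st of the current year, as a DATETIME at midnight
    SELECT DATEADD(year, DATEDIFF(year, 0, GETDATE()), 0) AS year_start
)
SELECT DATEADD(day, numbers.n, bounds.year_start) AS [Date]
FROM bounds
CROSS JOIN numbers
WHERE DATEADD(day, numbers.n, bounds.year_start)
      < DATEADD(year, 1, bounds.year_start);
```

CREATE VIEW has to be the only statement in its batch, so the view would be queried separately, for example with SELECT [Date] FROM dbo.DatesThisYear ORDER BY [Date], and then joined to the weekday and date tables from the question.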
You can use a CTE for hierarchical queries.
Here you go:
;WITH CTE_Stack(IsPartOfRecursion, Depth, MyDate) AS
(
SELECT
0 AS IsPartOfRecursion
,0 AS Depth
,DATEADD(DAY, -1, CAST('01.01.2012' as datetime)) AS MyDate
UNION ALL
SELECT
1 AS IsPartOfRecursion
,Parent.Depth + 1 AS Depth
--,DATEADD(DAY, 1, Parent.MyDate) AS MyDate
,DATEADD(DAY, 1, Parent.MyDate) AS MyDate
FROM
(
SELECT 0 AS Nothing
) AS TranquillizeSyntaxCheckBecauseWeDontHaveAtable
INNER JOIN CTE_Stack AS Parent
--ON Parent.Depth < 2005
ON DATEADD(DAY, 1, Parent.MyDate) < DATEADD(YEAR, 1, CAST('01.01.2012' as datetime))
)
SELECT * FROM CTE_Stack
WHERE IsPartOfRecursion = 1
OPTION (MAXRECURSION 367) -- Accounting for leap-years
;
I feel like I've seen this question asked before, but neither the SO search nor google is helping me... maybe I just don't know how to phrase the question. I need to count the number of events (in this case, logins) per day over a given time span so that I can make a graph of website usage. The query I have so far is this:
select
count(userid) as numlogins,
count(distinct userid) as numusers,
convert(varchar, entryts, 101) as date
from
usagelog
group by
convert(varchar, entryts, 101)
This does most of what I need (I get a row per date as the output containing the total number of logins and the number of unique users on that date). The problem is that if no one logs in on a given date, there will not be a row in the dataset for that date. I want it to add in rows indicating zero logins for those dates. There are two approaches I can think of for solving this, and neither strikes me as very elegant.
Add a column to the result set that lists the number of days between the start of the period and the date of the current row. When I'm building my chart output, I'll keep track of this value and if the next row is not equal to the current row plus one, insert zeros into the chart for each of the missing days.
Create a "date" table that has all the dates in the period of interest and outer join against it. Sadly, the system I'm working on already has a table for this purpose that contains a row for every date far into the future... I don't like that, and I'd prefer to avoid using it, especially since that table is intended for another module of the system and would thus introduce a dependency on what I'm developing currently.
Any better solutions or hints at better search terms for google? Thanks.
Frankly, I'd do this programmatically when building the final output. You're essentially trying to read something from the database which is not there (data for days that have no data). SQL isn't really meant for that sort of thing.
If you really want to do that, though, a "date" table seems your best option. To make it a bit nicer, you could generate it on the fly, using e.g. your DB's date functions and a derived table.
I had to do exactly the same thing recently. This is how I did it in T-SQL (YMMV on speed, but I've found it performant enough over a coupla million rows of event data):
DECLARE @DaysTable TABLE ( [Year] INT, [Day] INT )
DECLARE @StartDate DATETIME
SET @StartDate = whatever
WHILE (@StartDate <= GETDATE())
BEGIN
    INSERT INTO @DaysTable ( [Year], [Day] )
    SELECT DATEPART(YEAR, @StartDate), DATEPART(DAYOFYEAR, @StartDate)
    SELECT @StartDate = DATEADD(DAY, 1, @StartDate)
END
-- This gives me a table of all days since whenever
-- (you could select @StartDate as the minimum date of your usage log)
SELECT days.Year, days.Day, events.NumEvents
FROM @DaysTable AS days
LEFT JOIN (
    SELECT
        COUNT(*) AS NumEvents,
        DATEPART(YEAR, LogDate) AS [Year],
        DATEPART(DAYOFYEAR, LogDate) AS [Day]
    FROM LogData
    GROUP BY
        DATEPART(YEAR, LogDate),
        DATEPART(DAYOFYEAR, LogDate)
) AS events ON days.Year = events.Year AND days.Day = events.Day
Create a memory table (a table variable) where you insert your date ranges, then outer join the logins table against it. Group by your start date, then you can perform your aggregations and calculations.
The strategy I normally use is to UNION with the opposite of the query, generally a query that retrieves data for rows that don't exist.
If I wanted to get the average mark for a course, but some courses weren't taken by any students, I'd need to UNION with those not taken by anyone to display a row for every class:
SELECT AVG(mark), course FROM `marks` GROUP BY course
UNION
SELECT NULL, course FROM courses WHERE course NOT IN
(SELECT course FROM marks)
Your query will be more complex, but the same principle should apply. You may indeed need a table of dates for your second query.
Option 1
You can create a temp table, insert the dates in the range, and do a left outer join with usagelog.
Option 2
You can programmatically insert the missing dates while evaluating the result set to produce the final output.
WITH q(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
qq(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
dates AS
(
SELECT q.n * 100 + qq.n AS ndate
FROM q, qq
)
SELECT COUNT(userid) as numlogins,
COUNT(DISTINCT userid) as numusers,
CAST('2000-01-01' AS DATETIME) + ndate as date
FROM dates
LEFT JOIN
usagelog
ON entryts >= CAST('2000-01-01' AS DATETIME) + ndate
AND entryts < CAST('2000-01-01' AS DATETIME) + ndate + 1
GROUP BY
ndate
This will select up to 10,000 dates constructed on the fly, which should be enough for about 27 years.
SQL Server limits a CTE to 100 recursion levels by default, which is why the inner queries return up to 100 rows each.
If you need more than 10,000, just add a third CTE qqq(n) and cross-join with it in dates.
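A sketch of what that three-generator variant might look like, shown here with a simple count instead of the usagelog join so that the size of the generated range is easy to check. The multipliers 10000 and 100 are the natural extension of the q.n * 100 + qq.n pattern above; the output column names are illustrative.

```sql
WITH q(n) AS (
    SELECT 0 UNION ALL SELECT n + 1 FROM q WHERE n < 99
),
qq(n) AS (
    SELECT 0 UNION ALL SELECT n + 1 FROM qq WHERE n < 99
),
qqq(n) AS (
    SELECT 0 UNION ALL SELECT n + 1 FROM qqq WHERE n < 99
),
dates AS (
    -- Each generator contributes two decimal digits of the day offset.
    SELECT q.n * 10000 + qq.n * 100 + qqq.n AS ndate
    FROM q
    CROSS JOIN qq
    CROSS JOIN qqq
)
SELECT COUNT(*)   AS total_rows,
       MIN(ndate) AS first_offset,
       MAX(ndate) AS last_offset
FROM dates;
```

Each generator stays at 99 recursion levels, so the default limit is never exceeded, while the cross join yields 100 x 100 x 100 = 1,000,000 offsets (0 through 999,999). In practice you would add a filter on ndate in the join so that only the days you need are materialised.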