SQL Server Between Start and End dates - sql

I'm creating an internal holiday booking system and I need to put business logic rules into place. To do that, I need to check how many people are booked off on the dates between the Start and End date because, for example, 2 apprentices may only be booked off on 1 day, but I have no way of grabbing the dates in between.
Any help would be appreciated
Below is the job role table

You haven't posted the RDBMS, or the names of the tables, or what exactly the Job Role table is supposed to be doing ... but I'll take a shot at this anyway. I'm using a recursive CTE to generate a list of dates, but it would be far better for you to use a Date table, and I don't even know whether your RDBMS will support this. I've also posted the syntax for a table variable below that populates data mimicking your sample.
The final output, naturally, will need to be customized to do whatever you need to do. This does show you, however, a list of every date when more than one employee is on vacation. Add extra conditions to the second JOIN or to the WHERE clause to filter on other things (like JobRole) if necessary.
-- Code originally from http://smehrozalam.wordpress.com/2009/06/09/t-sql-using-common-table-expressions-cte-to-generate-sequences/
-- Define start and end limits
DECLARE @todate DATETIME, @fromdate DATETIME
SELECT @fromdate='2014-01-01', @todate=GETDATE()-1
DECLARE @TimeOff TABLE (StartDate DATETIME, EndDate DATETIME, EmployeeID INT)
INSERT INTO @TimeOff (StartDate, EndDate, EmployeeID)
SELECT '1/1/2014', '1/7/2014', 7 UNION
SELECT '2/1/2014', '2/7/2014', 7 UNION
SELECT '3/3/2014', '3/9/2014', 7 UNION
SELECT '2/5/2014', '2/6/2014', 8
;WITH DateSequence( Date ) AS -- this will list all dates. Use this if you don't have a date table
(
SELECT @fromdate as Date
UNION ALL
SELECT DATEADD(DAY, 1, Date)
FROM DateSequence
WHERE Date < @todate
)
--select result
SELECT DateSequence.Date, TimeOffA.StartDate, TimeOffB.EndDate, TimeOffA.EmployeeID
FROM
DateSequence -- a full list of all possible dates
INNER JOIN
@TimeOff TimeOffA ON -- all dates when an employee is on vacation -- replace this with your actual table's name
DateSequence.Date BETWEEN TimeOffA.StartDate AND TimeOffA.EndDate
INNER JOIN
@TimeOff TimeOffB ON -- all dates when an employee who is NOT employee A is on vacation -- replace this with your actual table's name
DateSequence.Date BETWEEN TimeOffB.StartDate AND TimeOffB.EndDate AND
TimeOffA.EmployeeID <> TimeOffB.EmployeeID
option (MaxRecursion 0) -- 0 removes the recursion limit; the range from @fromdate to yesterday can exceed 2000 days
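To get the head-count per day that the question asks about, one option is to group by date rather than self-join. The following is only a sketch: the JobRole column, the sample rows, and the limit of one person off per role per day are assumptions for illustration, not details from the question.

```sql
-- Hypothetical sample data; JobRole and @MaxOffPerRole are assumptions.
DECLARE @TimeOff TABLE (StartDate DATE, EndDate DATE, EmployeeID INT, JobRole VARCHAR(20));
INSERT INTO @TimeOff (StartDate, EndDate, EmployeeID, JobRole) VALUES
    ('20140201', '20140207', 7, 'Apprentice'),
    ('20140205', '20140206', 8, 'Apprentice'),
    ('20140303', '20140309', 9, 'Manager');

DECLARE @fromdate DATE = '20140201', @todate DATE = '20140331';
DECLARE @MaxOffPerRole INT = 1;

-- One row per calendar day in the range (swap in a real date table if you have one)
WITH DateSequence ([Date]) AS
(
    SELECT @fromdate
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date]) FROM DateSequence WHERE [Date] < @todate
)
-- Count the distinct employees booked off on each day, per job role,
-- and keep only the days where the count exceeds the allowed maximum
SELECT d.[Date], t.JobRole, COUNT(DISTINCT t.EmployeeID) AS PeopleOff
FROM DateSequence AS d
INNER JOIN @TimeOff AS t
    ON d.[Date] BETWEEN t.StartDate AND t.EndDate
GROUP BY d.[Date], t.JobRole
HAVING COUNT(DISTINCT t.EmployeeID) > @MaxOffPerRole
ORDER BY d.[Date]
OPTION (MAXRECURSION 366);
```

With these sample rows the query lists 5 and 6 February 2014, when both apprentices are off. Running the same count before accepting a new booking would let the application reject requests that push a day over the limit.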

Related

How do I phrase this condition in T-SQL syntax?

I have a table in SQL Server 2012 with employee data. An extract is shown below:
empID DateOfEntry DateLeft
102 2015-05-21 2016-04-20
104 2015-05-14 2015-12-28
...
I need to extract all employees who were present at any point during the period from 2016-07-01 to 2017-06-30.
To do this, I need to add a WHERE filter on the DateLeft column in my query but I am a bit confused as to how to phrase this logic.
I think this is what you are after:
WHERE DateOfEntry <= '2017-06-30'
and DateLeft >= '2016-07-01'
You can think of your question as "people who started before (or on) your latest date and left after (or on) your earliest date".
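A minimal sketch of that overlap test follows. The sample rows, the variable names, and the treatment of a NULL DateLeft as "still employed" are assumptions for illustration.

```sql
-- Hypothetical sample rows; NULL DateLeft is assumed to mean "still employed".
DECLARE @Employee TABLE (empID INT, DateOfEntry DATE, DateLeft DATE);
INSERT INTO @Employee (empID, DateOfEntry, DateLeft) VALUES
    (101, '20150521', '20160820'),  -- left inside the period: overlaps
    (102, '20150521', '20160420'),  -- left before the period started: no overlap
    (105, '20150514', NULL),        -- still employed: overlaps
    (106, '20170801', NULL);        -- joined after the period ended: no overlap

DECLARE @PeriodStart DATE = '20160701', @PeriodEnd DATE = '20170630';

-- Two ranges overlap when each one starts on or before the other one ends
SELECT empID, DateOfEntry, DateLeft
FROM @Employee
WHERE DateOfEntry <= @PeriodEnd
  AND (DateLeft >= @PeriodStart OR DateLeft IS NULL);
```

With these rows the query returns employees 101 and 105.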
"all employees who were present during the period 2016-07-01 and 2017-06-30"
Your phrasing implies several scenarios, so I will address them all.
If you want employees whose DateLeft is exactly one of these two dates, use the IN operator in your WHERE filter:
SELECT *
FROM <Table>
WHERE DateLeft in ('2016-07-01','2017-06-30');
If you want employees whose DateLeft falls within the range between these two dates, you can use the BETWEEN syntax (the lower bound must come first):
SELECT *
FROM <Table>
WHERE DateLeft between '2016-07-01' and '2017-06-30';
Note that BETWEEN includes both the begin and end values, so if you want to exclude, for example, the last date in the range, you will need to use the greater-than-or-equal and less-than operators:
SELECT *
FROM <Table>
WHERE DateLeft >= '2016-07-01' and DateLeft < '2017-06-30';
I think you need something like this.
DECLARE @Employee TABLE(empID INT,DateOfEntry DATE,DateLeft DATE)
INSERT @Employee
SELECT 101,'2015-05-21','2016-08-20' UNION ALL -- O
SELECT 102,'2015-05-21','2016-04-20' UNION ALL -- X
SELECT 104,'2015-05-14','2015-12-28' UNION ALL -- X
SELECT 105,'2015-05-14','2018-12-28' -- O
DECLARE @StartDate DATE = '2016-07-01'
DECLARE @EndDate DATE = '2017-06-30'
SELECT * FROM @Employee
WHERE
(DATEDIFF(DAY,DateLeft,@StartDate) <=0 AND DATEDIFF(DAY,DateLeft,@EndDate) >= 0)
OR
(DATEDIFF(DAY,DateLeft,@StartDate) <=0 AND DATEDIFF(DAY,DateLeft,@EndDate) <= 0)
Try this:
SELECT *
FROM [mytable]
WHERE [DateOfEntry] >= '2016-07-01'
AND [DateLeft] <= '2017-06-30';
Also, when working with dates and ranges, do not use BETWEEN, even though some people are going to show you the syntax.
To be precise, you should avoid passing dates as strings in this format, too:
YYYY-MM-DD
It can break under certain scenarios, such as when the user's language setting is French:
SET LANGUAGE FRENCH;
GO
SELECT CONVERT(DATETIME, '2009-10-13');
You can use YYYYMMDD instead.
Aaron Bertrand has a nice article about dates and date ranges, part of his "Bad habits to kick" series; you can find more details and examples there, if you want :-)
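As a small illustration of the point about literal formats, the unseparated form converts the same way under the French setting. This is a sketch only; note that SET LANGUAGE changes the setting for the rest of the session.

```sql
SET LANGUAGE FRENCH;

-- Unseparated YYYYMMDD is not affected by the language or DATEFORMAT setting
SELECT CONVERT(DATETIME, '20091013') AS FromYYYYMMDD;

-- The full ISO 8601 form with the T separator is also setting-independent
SELECT CONVERT(DATETIME, '2009-10-13T00:00:00') AS FromIso8601;
```

Both statements return 13 October 2009, whereas the hyphenated date-only string is read as year-day-month for DATETIME under this setting and fails.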

How do I compare dates in one SQL table to a range defined in another table?

I have one table holding events and dates:
NAME | DOB
-------------------
Adam | 6/26/1999
Barry | 7/18/2005
Daniel| 1/18/1984
I have another table defining date ranges as either start or end times, each with a descriptive code:
CODE | DATE
---------------------
YearStart| 6/28/2013
YearEnd | 8/14/2013
I am trying to write SQL that will find all Birthdates that fall between the start and end of the times described in the second table. The YearStart will always be in June, and the YearEnd will always be in August. My thought was to try:
SELECT
u.Name
CAST(MONTH(u.DOB) AS varchar) + '/' + CAST(DAY(u.DOB) AS varchar) as 'Birthdate',
u.DOB as 'Birthday'
FROM
Users u
WHERE
MONTH(DOB) = '7' OR
(MONTH(DOB) = '6' AND DAY(DOB) >= DAY(SELECT d.Date FROM Dates d WHERE d.Code='YearStart')) OR
(MONTH(DOB) = '8' AND DAY(DOB) <= DAY(SELECT d.Date FROM Dates d WHERE d.Code='YearEnd')))
ORDER BY
MONTH(DOB) ASC, DAY(DOB) ASC
But this doesn't pass, I'm guessing because there is no guarantee that the internal SELECT statement will return only one row, so cannot be parsed as a datetime. How do I actually accomplish this query?
This seems strange and I still feel like we're missing a relevant piece of the requirements, but look at the following. It seems from your description that the years are irrelevant and you want birthdays that fall between the given months/days.
SELECT
t1.Name, t1.DOB
FROM
t1
JOIN t2 AS startDate ON (startDate.Code = 'YearStart')
JOIN t2 AS endDate ON (endDate.Code = 'YearEnd')
WHERE
STUFF(CONVERT(varchar, t1.DOB, 112), 1, 4, '') BETWEEN
STUFF(CONVERT(varchar, startDate.[Date], 112), 1, 4, '')
AND
STUFF(CONVERT(varchar, endDate.[Date], 112), 1, 4, '')
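To see what the STUFF/CONVERT expression does, here is a sketch using the sample birthdays from the question. The table variable and the two range variables are assumptions standing in for the real tables.

```sql
-- Sample data from the question, loaded into a hypothetical table variable
DECLARE @Users TABLE (Name VARCHAR(20), DOB DATE);
INSERT INTO @Users (Name, DOB) VALUES
    ('Adam',   '19990626'),
    ('Barry',  '20050718'),
    ('Daniel', '19840118');

DECLARE @RangeStart DATE = '20130628', @RangeEnd DATE = '20130814';

-- Style 112 formats a date as yyyymmdd; STUFF(..., 1, 4, '') strips the year,
-- leaving a four-character mmdd string that sorts in calendar order
SELECT Name,
       DOB,
       STUFF(CONVERT(varchar(8), DOB, 112), 1, 4, '') AS MonthDay
FROM @Users
WHERE STUFF(CONVERT(varchar(8), DOB, 112), 1, 4, '')
      BETWEEN STUFF(CONVERT(varchar(8), @RangeStart, 112), 1, 4, '')
          AND STUFF(CONVERT(varchar(8), @RangeEnd, 112), 1, 4, '');
```

Only Barry (0718) falls between 0628 and 0814. The comparison works because the range stays within one calendar year; a range that wraps from December into January would need different handling.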
Try using a PIVOT to get the years on the same row, like this. This will return only 'Bob'
-- DATE columns (rather than VARCHAR) so the comparison below is chronological, not textual
DECLARE @Names TABLE(
NAME VARCHAR(20),
DOB DATE);
DECLARE @Dates TABLE(
CODE VARCHAR(20),
THEDATE DATE);
INSERT @Names (NAME,DOB) VALUES ('Adam', '19990626');
INSERT @Names (NAME,DOB) VALUES ('Daniel', '19840118');
INSERT @Names (NAME,DOB) VALUES ('Bob', '20130701');
INSERT @Dates (CODE,THEDATE) VALUES ('YearStart', '20130628');
INSERT @Dates (CODE,THEDATE) VALUES ('YearEnd', '20130814');
SELECT * FROM @Names;
SELECT * FROM @Dates;
SELECT n.*
FROM @Names AS n
INNER JOIN (
SELECT
1 AS YearTypeId
, [YearStart]
, [YearEnd]
FROM ( SELECT [CODE]
, THEDATE
FROM @Dates
) p PIVOT ( MIN(THEDATE)
FOR [CODE]
IN ([YearStart],[YearEnd])
) AS pvt) AS y
ON
n.DOB >= y.YearStart
AND n.DOB <= y.YearEnd
From the last paragraph in your question, I am assuming that the Dates table has one YearStart and one YearEnd row for each year, correct? If so, your SQL query should include the year you are interested in.
Also, even if "date" is not strictly speaking a reserved word for SQL Server (see Reserved Keywords for Transact SQL), you should avoid using such column names since, for example, ODBC does not allow them.
But to do something with only the information that you have already provided, you could do something like this to get the birthday celebrants for the last year defined in Dates (providing there really is both a YearStart and YearEnd entry for that year):
SELECT DISTINCT <the rest as in your example>
FROM Users u
WHERE u.DOB >= (SELECT max(d1.Date) FROM Dates d1 WHERE d1.Code = 'YearStart')
AND u.DOB <= (SELECT max(d2.Date) FROM Dates d2 WHERE d2.Code = 'YearEnd')
ORDER BY u.DOB;
The main difference between the query above (which I have not tested - this is just to show the principle) and your post is that I trust the datetime type (or whichever variant of it that you have used in the database) to work as intended. What I mean by that is that the database engine is well aware (well, it should be) of which of two full dates is the earliest and the latest - you do not have to extract and compare their components separately.
/Bosse

List Transactions that Meet Criteria

Realizing that another question I asked before may be too difficult, I'm changing my requirements.
I work for a credit card company. Our database has a customer table and a transaction table. Fields in the customer table are SSN and CustomerKey. Fields in the transaction table are CustomerKey, transaction date (Transdate), and transaction amount (TransAmt).
I need a query that can identify each ssn where the sum of any of their transaction amounts > 1000 within a two day period in 2012. If a ssn has transaction amounts > 1000 within a two day period, I need the query to return all the transactions for that ssn.
Here is an example of the raw data in the Transaction Table:
Trans#-----CustKey-----Date--------Amount
1-----------12345----01/01/12--------$600
2-----------12345----01/02/12--------$500
3-----------67890----01/03/12--------$10
4-----------98765----04/01/12--------$600
5-----------43210----04/02/12--------$600
6-----------43210----04/03/12--------$100
7-----------13579----04/02/12--------$600
8-----------24568----04/03/12--------$100
Here is an example of the raw data in the Customer Table:
CustKey-----SSN
12345------123456789
67890------123456789
98765------987654321
43210------987654321
13579------246801357
24568------246801357
Here are the results I need:
Trans#------SSN---------Date---------Amount
1--------123456789----01/01/12---------$600
2--------123456789----01/02/12---------$500
3--------123456789----01/03/12----------$10
4--------987654321----04/01/12---------$600
5--------987654321----04/02/12---------$600
6--------987654321----04/03/12---------$100
As you can see, my results include all transactions for SSNs 123456789 and 987654321, and exclude SSN 246801357.
One way of doing this is to roll through each two day period within a year. Here is an SQL Fiddle example.
The idea is pretty simple:
1) Create a temp table to store all matching customers
create table CustomersToShow
(
SSN int
)
2) Loop through the year and populate the temp table with customers that match the amount criteria
declare @firstDayOfTheYear datetime = '1/1/2012';
declare @lastDayOfTheYear datetime = '12/31/2012';
declare @currentDate datetime = @firstDayOfTheYear;
declare @amountThreshold money = 1000;
while @currentDate <= @lastDayOfTheYear
begin
insert into CustomersToShow(SSN)
select b.SSN
from transactions a
join customers b
on a.CustKey = b.CustKey
where TransactionDate >= @currentDate
and TransactionDate <= DATEADD(day, 2, @currentDate)
group by b.SSN
having SUM(a.TransactionAmount) > @amountThreshold -- the question asks for sums strictly greater than 1000
set @currentDate = DATEADD(day,2,@currentDate)
end
3) And then just select
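select a.TransNumber, b.SSN, a.TransactionDate, a.TransactionAmount
from transactions a
join customers b
on a.CustKey = b.CustKey
join (select distinct SSN from CustomersToShow) c -- an SSN can be inserted once per matching window, so de-duplicate first
on b.SSN = c.SSN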
Note: This will be slow...
While you could probably come up with a hacky way to do this via standard SQL, this is a problem that IMO is more suited to being solved by code (i.e. not by set-based logic / SQL).
It would be easy to solve if you sort the transaction list by customerKey and date, then loop through the data. Ideally I would do this in code, but alternatively you could write a stored procedure and use a loop and a cursor.
This is easy and well-suited to set-based logic if you look at it right. You simply need to join to a table that has every date range you're interested in. Every T-SQL database (Oracle has it built-in) should have a utility table named integers - it's very useful surprisingly often:
CREATE TABLE integers ( n smallint, constraint PK_integers primary key clustered (n))
INSERT integers select top 1000 row_number() over (order by o.id) from sysobjects o cross join sysobjects
Your date table then looks like:
SELECT dateadd(day, n-1, '2012') AS dtFrom, dateadd(day, n+1, '2012') AS dtTo
from integers where n <= 366
You can then (abbreviating):
SELECT ssn, dtFrom
FROM yourTables t
JOIN ( SELECT dateadd(day, n-1, '2012') as dtFrom, dateadd(day, n+1, '2012') AS dtTo
from integers where n <= 366 ) d on t.date between d.dtFrom and d.dtTo
GROUP BY ssn, dtFrom
HAVING sum(amount) > 1000
You can select all your transactions:
WHERE ssn in ( SELECT distinct ssn from ( <above query> ) t )
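Putting the abbreviated pieces together, a self-contained sketch might look like the following. It uses the sample rows from the question; the table variable names, the column names, the use of sys.all_objects as a row source in place of an integers table, and the reading of "two day period" as two consecutive calendar days are all assumptions.

```sql
-- Sample rows copied from the question; table and column names are assumptions.
DECLARE @Customers TABLE (CustKey INT, SSN CHAR(9));
DECLARE @Transactions TABLE (TransNumber INT, CustKey INT, TransDate DATE, TransAmt MONEY);

INSERT INTO @Customers (CustKey, SSN) VALUES
    (12345, '123456789'), (67890, '123456789'),
    (98765, '987654321'), (43210, '987654321'),
    (13579, '246801357'), (24568, '246801357');

INSERT INTO @Transactions (TransNumber, CustKey, TransDate, TransAmt) VALUES
    (1, 12345, '20120101', 600), (2, 12345, '20120102', 500),
    (3, 67890, '20120103', 10),  (4, 98765, '20120401', 600),
    (5, 43210, '20120402', 600), (6, 43210, '20120403', 100),
    (7, 13579, '20120402', 600), (8, 24568, '20120403', 100);

-- Numbers 1..366 (2012 is a leap year), generated on the fly
WITH Numbers AS
(
    SELECT TOP (366) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
),
-- One window per day of 2012: that day plus the following day
Windows AS
(
    SELECT DATEADD(DAY, n - 1, CAST('20120101' AS DATE)) AS dtFrom,
           DATEADD(DAY, n,     CAST('20120101' AS DATE)) AS dtTo
    FROM Numbers
),
-- SSNs whose transactions inside any single window sum to more than 1000
Flagged AS
(
    SELECT DISTINCT c.SSN
    FROM @Transactions AS t
    JOIN @Customers AS c ON c.CustKey = t.CustKey
    JOIN Windows AS w ON t.TransDate BETWEEN w.dtFrom AND w.dtTo
    GROUP BY c.SSN, w.dtFrom
    HAVING SUM(t.TransAmt) > 1000
)
-- Return every transaction belonging to a flagged SSN
SELECT t.TransNumber, c.SSN, t.TransDate, t.TransAmt
FROM @Transactions AS t
JOIN @Customers AS c ON c.CustKey = t.CustKey
WHERE c.SSN IN (SELECT SSN FROM Flagged)
ORDER BY t.TransNumber;
```

With the sample rows this returns transactions 1 through 6, matching the results requested in the question.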

Select data from SQL DB per day

I have a table with order information in an E-commerce store. Schema looks like this:
[Orders]
Id|SubTotal|TaxAmount|ShippingAmount|DateCreated
This table only contains a row for each order. So if a day goes by without any orders, no sales data is there for that day.
I would like to select subtotal-per-day for the last 30 days, including those days with no sales.
The resultset would look like this:
Date | SalesSum
2009-08-01 | 15235
2009-08-02 | 0
2009-08-03 | 340
2009-08-04 | 0
...
Doing this, only gives me data for those days with orders:
select DateCreated as Date, sum(ordersubtotal) as SalesSum
from Orders
group by DateCreated
You could create a table called Dates, and select from that table and join the Orders table. But I really want to avoid that, because it doesn't work well enough when dealing with different time zones and things...
Please don't laugh. SQL is not my kind of thing... :)
Create a function that can generate a date table as follows:
(stolen from http://www.codeproject.com/KB/database/GenerateDateTable.aspx)
Create Function dbo.fnDateTable
(
@StartDate datetime,
@EndDate datetime,
@DayPart char(5) -- support 'day','month','year','hour', default 'day'
)
Returns @Result Table
(
[Date] datetime
)
As
Begin
Declare @CurrentDate datetime
Set @CurrentDate=@StartDate
While @CurrentDate<=@EndDate
Begin
Insert Into @Result Values (@CurrentDate)
Select @CurrentDate=
Case
When @DayPart='year' Then DateAdd(yy,1,@CurrentDate)
When @DayPart='month' Then DateAdd(mm,1,@CurrentDate)
When @DayPart='hour' Then DateAdd(hh,1,@CurrentDate)
Else
DateAdd(dd,1,@CurrentDate)
End
End
Return
End
Then, join against that table
SELECT dates.Date as Date, sum(SubTotal+TaxAmount+ShippingAmount)
FROM dbo.fnDateTable (dateadd(month,-1,CONVERT(VARCHAR(10),GETDATE(),111)),CONVERT(VARCHAR(10),GETDATE(),111),'day') dates
LEFT JOIN Orders
ON dates.Date = DateCreated
GROUP BY dates.Date
declare @oldest_date datetime
declare @daily_sum numeric(18,2)
declare @temp table(
sales_date datetime,
sales_sum numeric(18,2)
)
select @oldest_date = dateadd(day,-30,getdate())
while @oldest_date <= getdate()
begin
set @daily_sum = (select sum(SubTotal) from SalesTable where DateCreated = @oldest_date)
insert into @temp(sales_date, sales_sum) values(@oldest_date, @daily_sum)
set @oldest_date = dateadd(day,1,@oldest_date)
end
select * from @temp
OK - I missed that 'last 30 days' part. The bit above, while not as clean, IMHO, as the date table, should work. Another variant would be to use the while loop to fill a temp table just with the last 30 days and do a left outer join with the result of my original query.
including those days with no sales.
That's the difficult part. I don't think the first answer will help you with that. I did something similar to this with a separate date table.
You can find the directions on how to do so here:
Date Table
I have a Log table with an indexed LogID column from which I never delete any records, so it holds IDs from 1 to ~10000000. Using this table I can write
select
s.ddate, SUM(isnull(o.SubTotal,0))
from
(
select
cast(datediff(d,LogID,getdate()) as datetime) AS ddate
from
Log
where
LogID <31
) s left join orders o on o.orderdate = s.ddate -- left join keeps the days that have no orders
group by s.ddate
I actually did this today. We also have an e-commerce application. I don't want to fill our database with "useless" dates. I just do the group by and create all the days for the last N days in Java, and pair them with the date/sales results from the database.
Where is this ultimately going to end up? I ask only because it may be easier to fill in the empty days with whatever program is going to deal with the data instead of trying to get it done in SQL.
SQL is a wonderful language, and it is capable of a great many things, but sometimes you're just better off working the finer points of the data in the program instead.
(Revised a bit--I hit enter too soon)
I started poking at this, and as it hits some pretty tricky SQL concepts it quickly grew into the following monster. If feasible, you might be better off adapting THEn's solution; or, like many others advise, using application code to fill in the gaps could be preferable.
-- A temp table holding the 30 dates that you want to check
DECLARE @Foo Table (Date smalldatetime not null)
-- Populate the table using a common "tally table" methodology (I got this from SQL Server magazine long ago)
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS Number FROM L3)
INSERT @Foo (Date)
select dateadd(dd, datediff(dd, 0, dateadd(dd, -number + 1, getdate())), 0)
from Tally
where Number < 31
Step 1 is to build a temp table containing the 30 dates that you are concerned with. That abstract weirdness is about the fastest way known to build a table of consecutive integers; add a few more subqueries, and you can populate millions or more in mere seconds. I take the first 30, and use dateadd and the current date/time to convert them into dates. If you already have a "fixed" table that has 1-30, you can use that and skip the CTE entirely (by replacing table "Tally" with your table).
The outer two date function calls remove the time portion of the generated value.
(Note that I assume that your order date also has no time portion -- otherwise you've got another common problem to resolve.)
For testing purposes I built table @Orders, and this gets you the rest:
SELECT f.Date, sum(ordersubtotal) as SalesSum
from @Foo f
left outer join @Orders o
on o.DateCreated = f.Date
group by f.Date
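If DateCreated does carry a time portion, one way around the problem mentioned above is to join on a half-open range per day instead of on equality. The following is a sketch only, assuming SQL Server 2008 or later for the DATE type; the @Orders rows are hypothetical test data.

```sql
-- Hypothetical test data with time portions in DateCreated
DECLARE @Orders TABLE (Id INT, SubTotal MONEY, DateCreated DATETIME);
INSERT INTO @Orders (Id, SubTotal, DateCreated) VALUES
    (1, 100, GETDATE()),
    (2,  50, DATEADD(DAY, -3, GETDATE())),
    (3,  25, DATEADD(DAY, -3, GETDATE()));

DECLARE @Today DATE = CAST(GETDATE() AS DATE);

-- The last 30 calendar days, today included
WITH Days ([Date]) AS
(
    SELECT DATEADD(DAY, -29, @Today)
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date]) FROM Days WHERE [Date] < @Today
)
-- Each order joins to the day whose [midnight, next midnight) range contains it;
-- ISNULL turns the NULL sum of an empty day into 0
SELECT d.[Date], ISNULL(SUM(o.SubTotal), 0) AS SalesSum
FROM Days AS d
LEFT JOIN @Orders AS o
    ON  o.DateCreated >= d.[Date]
    AND o.DateCreated <  DATEADD(DAY, 1, d.[Date])
GROUP BY d.[Date]
ORDER BY d.[Date];
```

The range predicate leaves DateCreated untouched, so an index on that column can still be used, which is not the case when the column is wrapped in CONVERT.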
I created the Function DateTable as JamesMLV pointed out to me.
And then the SQL looks like this:
SELECT dates.date, ISNULL(SUM(ordersubtotal), 0) as Sales FROM [dbo].[DateTable] ('2009-08-01','2009-08-31','day') dates
LEFT JOIN Orders ON CONVERT(VARCHAR(10),Orders.datecreated, 111) = dates.date
group by dates.date
SELECT DateCreated,
SUM(SubTotal) AS SalesSum
FROM Orders
GROUP BY DateCreated

SQL for counting events by date

I feel like I've seen this question asked before, but neither the SO search nor google is helping me... maybe I just don't know how to phrase the question. I need to count the number of events (in this case, logins) per day over a given time span so that I can make a graph of website usage. The query I have so far is this:
select
count(userid) as numlogins,
count(distinct userid) as numusers,
convert(varchar, entryts, 101) as date
from
usagelog
group by
convert(varchar, entryts, 101)
This does most of what I need (I get a row per date as the output containing the total number of logins and the number of unique users on that date). The problem is that if no one logs in on a given date, there will not be a row in the dataset for that date. I want it to add in rows indicating zero logins for those dates. There are two approaches I can think of for solving this, and neither strikes me as very elegant.
1. Add a column to the result set that lists the number of days between the start of the period and the date of the current row. When I'm building my chart output, I'll keep track of this value and if the next row is not equal to the current row plus one, insert zeros into the chart for each of the missing days.
2. Create a "date" table that has all the dates in the period of interest and outer join against it. Sadly, the system I'm working on already has a table for this purpose that contains a row for every date far into the future... I don't like that, and I'd prefer to avoid using it, especially since that table is intended for another module of the system and would thus introduce a dependency on what I'm developing currently.
Any better solutions or hints at better search terms for google? Thanks.
Frankly, I'd do this programmatically when building the final output. You're essentially trying to read something from the database which is not there (data for days that have no data). SQL isn't really meant for that sort of thing.
If you really want to do that, though, a "date" table seems your best option. To make it a bit nicer, you could generate it on the fly, using e.g. your DB's date functions and a derived table.
I had to do exactly the same thing recently. This is how I did it in T-SQL (YMMV on speed, but I've found it performant enough over a couple of million rows of event data):
DECLARE @DaysTable TABLE ( [Year] INT, [Day] INT )
DECLARE @StartDate DATETIME
SET @StartDate = whatever
WHILE (@StartDate <= GETDATE())
BEGIN
INSERT INTO @DaysTable ( [Year], [Day] )
SELECT DATEPART(YEAR, @StartDate), DATEPART(DAYOFYEAR, @StartDate)
SELECT @StartDate = DATEADD(DAY, 1, @StartDate)
END
-- This gives me a table of all days since whenever
-- (you could set @StartDate to the minimum date of your usage log)
SELECT days.Year, days.Day, events.NumEvents
FROM @DaysTable AS days
LEFT JOIN (
SELECT
COUNT(*) AS NumEvents,
DATEPART(YEAR, LogDate) AS [Year],
DATEPART(DAYOFYEAR, LogDate) AS [Day]
FROM LogData
GROUP BY
DATEPART(YEAR, LogDate),
DATEPART(DAYOFYEAR, LogDate)
) AS events ON days.Year = events.Year AND days.Day = events.Day
Create a memory table (a table variable) where you insert your date ranges, then outer join the logins table against it. Group by your start date, then you can perform your aggregations and calculations.
The strategy I normally use is to UNION with the opposite of the query, generally a query that retrieves data for rows that don't exist.
If I wanted to get the average mark for a course, but some courses weren't taken by any students, I'd need to UNION with those not taken by anyone to display a row for every class:
SELECT AVG(mark), course FROM `marks` GROUP BY course
UNION
SELECT NULL, course FROM courses WHERE course NOT IN
(SELECT course FROM marks)
Your query will be more complex but the same principle should apply. You may indeed need a table of dates for your second query.
Option 1
You can create a temp table and insert dates with the range and do a left outer join with the usagelog
Option 2
You can programmatically insert the missing dates while evaluating the result set to produce the final output
WITH q(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
qq(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM qq
WHERE n < 99
),
dates AS
(
SELECT q.n * 100 + qq.n AS ndate
FROM q, qq
)
SELECT COUNT(userid) as numlogins,
COUNT(DISTINCT userid) as numusers,
CAST('2000-01-01' AS DATETIME) + ndate as date
FROM dates
LEFT JOIN
usagelog
ON entryts >= CAST('2000-01-01' AS DATETIME) + ndate
AND entryts < CAST('2000-01-01' AS DATETIME) + ndate + 1
GROUP BY
ndate
This will select up to 10,000 dates constructed on the fly, which covers about 27 years.
SQL Server has a default limit of 100 recursion levels per CTE (it can be raised with OPTION (MAXRECURSION n)); that's why the inner queries return at most 100 rows each.
If you need more than 10,000, just add a third CTE qqq(n) and cross-join with it in dates.
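As an alternative to cross-joining several small CTEs, the recursion limit itself can be raised for a single statement with a query hint. A minimal sketch, with the year 2000 chosen purely for illustration:

```sql
-- One row per day of the year 2000, produced by a single recursive CTE
WITH dates (d) AS
(
    SELECT CAST('20000101' AS DATETIME)
    UNION ALL
    SELECT DATEADD(DAY, 1, d)
    FROM dates
    WHERE d < '20001231'
)
SELECT d
FROM dates
OPTION (MAXRECURSION 400);  -- raises the default limit of 100 for this statement only
```

This returns 366 rows, since 2000 is a leap year. Without the hint the statement stops with an error once the default limit is reached. OPTION (MAXRECURSION 0) removes the limit altogether, at the risk of a runaway query if the termination condition is wrong.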