Below are the tables I have created and the values I inserted into them:
CREATE TABLE employees_list
(
employeeID int identity(1,1),
employeeName varchar(25)
)
GO
INSERT INTO employees_list VALUES ('Kevin'),('Charles')
GO
CREATE TABLE hourlyRates
(
employeeID int,
rate int,
rateDate date
)
INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'),
(1, 39, '2016-02-01'),
(2, 43, '2016-01-01'),
(2, 57, '2016-02-01')
CREATE TABLE workingHours
(
employeeID int,
startdate datetime,
enddate datetime
)
GO
INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'),
(1, '2016-01-02 09:00', '2016-01-02 17:00'),
(1, '2016-02-01 10:00', '2016-02-01 16:00'),
(1, '2016-02-02 11:00', '2016-02-02 13:00'),
(2, '2016-01-01 10:00', '2016-01-01 16:00'),
(2, '2016-01-02 08:00', '2016-01-02 14:00'),
(2, '2016-02-01 14:00', '2016-02-01 19:00'),
(2, '2016-02-02 13:00', '2016-02-02 16:00')
GO
SELECT * FROM employees_list
SELECT * FROM hourlyRates
SELECT * FROM workingHours
Now the question is:
Display employee ID, name, start date, end date, hours worked and hourly rate for the Employee whose ID number is 1.
This is what I have done:
SELECT
workingHours.employeeID, employeeName,
startdate, enddate,
DATEDIFF(HOUR, startdate, enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM
hourlyRates, workingHours, employees_list
WHERE
hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND workingHours.employeeID = 1
And I got the following result:
The problem with the result above is that the result is being repeated or duplicated from row number 5 to row number 8.
It is supposed to generate the first 4 rows if I'm not mistaken.
I even tried adding DISTINCT in the SELECT statement and still it is showing duplicated result.
What change is needed on my query to eliminate the duplication?
The problem is you're joining the tables only on the employeeID. With that, you will get a row for every combination of hourlyRates and workingHours - which is 8 in this case. You'd have to join the tables somehow on the dates as well. You want to take a rate from hourlyRates only when it's in the correct month. The simplest way to do that with your query would be adding another join condition:
SELECT workingHours.employeeID,employeeName,startdate,enddate,
DATEDIFF(HOUR,startdate,enddate) AS 'Hours Worked',
rate AS 'Hourly Rate'
FROM hourlyRates,workingHours,employees_list
WHERE hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND (hourlyRates.rateDate
BETWEEN
DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH,workingHours.startDate), 1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH,workingHours.endDate), 1))
AND workingHours.employeeID = 1
The new join condition takes the year and month parts of the workingHours dates with the DATEPART function and makes sure that hourlyRates.rateDate falls between those two.
The DATEFROMPARTS function takes year, month and day integers (we pass 1 as the day to get the start of the month) and converts them to a DATE.
There are a couple of problems with the above query, however. The obvious one is the syntax: you should use explicit JOIN ... ON syntax instead of listing the tables separated by commas. The other problem is that the query is not sargable, because the function calls on column values in the join condition can keep SQL Server from seeking an index on those columns. Such queries should be avoided when possible.
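As a rough illustration of what sargable means here (a sketch against the workingHours table from the question; the January 2016 filter is only an example and not part of the original query), the first statement wraps the column in functions, while the second expresses the same filter as a range on the bare column:

```sql
-- Non-sargable: the column is wrapped in functions, so an index on
-- startdate generally cannot be used for a seek.
SELECT employeeID, startdate, enddate
FROM workingHours
WHERE YEAR(startdate) = 2016
  AND MONTH(startdate) = 1;

-- Sargable: the same rows, expressed as a half-open range on the bare column.
SELECT employeeID, startdate, enddate
FROM workingHours
WHERE startdate >= '20160101'
  AND startdate <  '20160201';
```

Both statements return the same rows; only the second leaves the optimizer free to seek on an index over startdate, if one exists.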
There are a couple of ways of making a query sargable. For example we could insert the processed dates into a temp table, and use it for the query:
SELECT wh.employeeID
,DATEFROMPARTS(DATEPART(YEAR, wh.startDate), DATEPART(MONTH,wh.startDate), 1) AS startDateYearMonth
,DATEFROMPARTS(DATEPART(YEAR, wh.endDate), DATEPART(MONTH,wh.endDate), 1) AS endDateMonthYearMonth
,wh.startdate
,wh.enddate
INTO #TempWorkingHours
FROM workingHours wh
SELECT el.employeeID
,el.employeeName
,twh.startdate
,twh.enddate
,DATEDIFF(HOUR,twh.startdate,twh.enddate) AS [Hours Worked]
,hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN #TempWorkingHours twh ON el.employeeID = twh.employeeID
AND (hr.rateDate
BETWEEN
twh.startDateYearMonth
AND twh.endDateMonthYearMonth
)
WHERE el.employeeID = 1
Note that this will not improve performance (will even make it worse because of the temp table overhead) if you don't use indexes on the pre-processed columns. Without indexes, a non-sargable query will be fine, and with the new join syntax would look like this:
SELECT el.employeeID
,el.employeeName
,wh.startdate
,wh.enddate
,DATEDIFF(HOUR,wh.startdate,wh.enddate) AS [Hours Worked]
,hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN workingHours wh ON hr.employeeID = wh.EmployeeID
AND (hr.rateDate
BETWEEN
DATEFROMPARTS(DATEPART(YEAR, wh.startDate), DATEPART(MONTH,wh.startDate), 1)
AND DATEFROMPARTS(DATEPART(YEAR, wh.endDate), DATEPART(MONTH,wh.endDate), 1)
)
WHERE wh.employeeID = 1
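If rateDate is meant as the date on which a rate takes effect (an assumption; the question does not say so), another option is to pick, for each shift, the most recent rate dated on or before the shift's start. The sketch below uses CROSS APPLY with TOP (1) for that. It is a different technique from the month-matching join above, and it applies no functions to the joined columns:

```sql
SELECT el.employeeID,
       el.employeeName,
       wh.startdate,
       wh.enddate,
       DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
       r.rate AS [Hourly Rate]
FROM employees_list AS el
JOIN workingHours AS wh
  ON wh.employeeID = el.employeeID
CROSS APPLY (
    -- latest rate that was already in effect when the shift started
    SELECT TOP (1) hr.rate
    FROM hourlyRates AS hr
    WHERE hr.employeeID = wh.employeeID
      AND hr.rateDate <= wh.startdate
    ORDER BY hr.rateDate DESC
) AS r
WHERE el.employeeID = 1;
```

With the sample data this should return one row per shift for employee 1, using rate 28 for the January shifts and 39 for the February ones. A shift that starts before the employee's first rateDate would be dropped by CROSS APPLY; OUTER APPLY would keep it with a NULL rate.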
I have data like shown below:
ID Duration Start Date End Date
------------------------------------------------------
10 2 2013-09-03 05:00:00 2013-09-03 05:02:00
I need output like below:
10 2 2013-09-03 05:00:00 2013-09-03 05:01:00 1
10 2 2013-09-03 05:01:00 2013-09-03 05:02:00 2
Based on the Duration column: if the value is 2, the row should appear twice.
As the output shows, the Start Date and End Date of each generated row should be adjusted accordingly.
An additional column with the row number within each duplicated group (1 and 2 in the example above) would help a lot.
If Duration is 0 or 1, do nothing; duplicate rows only when Duration > 1.
Finally, that additional column should number the rows 1, 2, 3, ... to show how many rows were generated.
Try the SQL below; I added some comments where I thought they were necessary.
declare @table table(Id integer not null, Duration int not null, StartDate datetime, EndDate datetime)
insert into @table values (10,2, '2013-09-03 05:00:00', '2013-09-03 05:02:00')
insert into @table values (11,3, '2013-09-04 05:00:00', '2013-09-04 05:03:00')
;WITH
numbers AS (
--this is the number series generator
--(limited to 100, you can change that to whatever you need
-- max possible duration in your case; going much past 100
-- also needs OPTION (MAXRECURSION n) on the final query).
SELECT 1 AS num
UNION ALL
SELECT num+1 FROM numbers WHERE num+1<=100
)
SELECT t.Id
, t.Duration
, StartDate = DATEADD(MINUTE, IsNull(Num,1) - 1, t.StartDate)
, EndDate = DATEADD(MINUTE, IsNull(Num,1), t.StartDate)
, N.num
FROM @table t
LEFT JOIN numbers N
ON t.Duration >= N.Num
-- join it with numbers generator for Duration times
ORDER BY t.Id
, N.Num
This works better when Duration = 0:
declare @table table(Id integer not null, Duration int not null, StartDate datetime, EndDate datetime)
insert into @table values (10,2, '2013-09-03 05:00:00', '2013-09-03 05:02:00')
insert into @table values (11,3, '2013-09-04 05:00:00', '2013-09-04 05:03:00')
insert into @table values (12,0, '2013-09-04 05:00:00', '2013-09-04 05:03:00')
insert into @table values (13,1, '2013-09-04 05:00:00', '2013-09-04 05:03:00')
;WITH
numbers AS (
--this is the number series generator
--(limited to 100, you can change that to whatever you need
-- max possible duration in your case).
SELECT 1 AS num
UNION ALL
SELECT num+1 FROM numbers WHERE num+1<=100
)
SELECT
Id
, Duration
, StartDate
, EndDate
, num
FROM
(SELECT
t.Id
, t.Duration
, StartDate = DATEADD(MINUTE, Num - 1, t.StartDate)
, EndDate = DATEADD(MINUTE, Num, t.StartDate)
, N.num
FROM @table t
-- join it with numbers generator for Duration times
INNER JOIN numbers N
ON t.Duration >= N.Num ) A
UNION
-- rows with Duration = 0 are returned once, with their dates unchanged
(SELECT
t.Id
, t.Duration
, StartDate
, EndDate
, 1 AS num
FROM @table t
WHERE Duration = 0)
ORDER BY Id,Num
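The UNION is not strictly required. As a sketch of a single-join variant (my own rearrangement, not taken from either answer; the table variable @t and its rows are illustrative), the join can clamp the repeat count to at least 1, and CASE can leave the dates untouched when Duration is 0 or 1, as the question asks:

```sql
DECLARE @t TABLE (Id int NOT NULL, Duration int NOT NULL, StartDate datetime, EndDate datetime);
INSERT INTO @t VALUES
    (10, 2, '2013-09-03 05:00:00', '2013-09-03 05:02:00'),
    (12, 0, '2013-09-04 05:00:00', '2013-09-04 05:03:00'),
    (13, 1, '2013-09-04 05:00:00', '2013-09-04 05:03:00');

WITH numbers AS (
    -- number series 1..100, same generator as above
    SELECT 1 AS num
    UNION ALL
    SELECT num + 1 FROM numbers WHERE num + 1 <= 100
)
SELECT t.Id,
       t.Duration,
       -- split into one-minute slices only when Duration > 1
       CASE WHEN t.Duration > 1 THEN DATEADD(MINUTE, n.num - 1, t.StartDate)
            ELSE t.StartDate END AS StartDate,
       CASE WHEN t.Duration > 1 THEN DATEADD(MINUTE, n.num, t.StartDate)
            ELSE t.EndDate END AS EndDate,
       n.num
FROM @t AS t
JOIN numbers AS n
  -- every row is emitted at least once; rows with Duration > 1 are emitted Duration times
  ON n.num <= CASE WHEN t.Duration > 1 THEN t.Duration ELSE 1 END
ORDER BY t.Id, n.num;
```

Id 10 comes back twice with one-minute slices, while Ids 12 and 13 come back once each with their original dates.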
I have a table that stores some events in the database, which log operational time of machines and I'd like to calculate a total running time within a specific date for a specific shift (or all shifts).
CREATE TABLE events (
ID int,
StartTime datetime,
EndTime datetime,
DurationSeconds bigint,
...
)
I'd like to select the total duration of events in the specified date range (@dateStart datetime, @dateEnd datetime) while considering daily shifts (@shiftStart time, @shiftEnd time). The events will overlap the shifts and the date ranges.
For example, if I have shift starting at 6:00 and ending at 12:00, and the event lasted for whole 2 days (2014/01/01 00:00 - 2014/01/03 00:00), the total time for this row is (48 hours - 2*6 hours = 36 hours).
If an event starts in the middle of the shift, then only the 'in-shift' portion should be considered.
So far I had an implementation without considering the shifts like:
select sum(
--duration minus start overlap and end overlap
duration - dbo.udf_max(datediff(s,@pTo,endtime),0) - dbo.udf_max(datediff(s,starttime,@pFrom),0)
)
from events
where starttime < @pTo and endtime > @pFrom
I'd really like to have a set-based solution as the data sets are rather large and consider a looping cursor-based solution as the last resort.
OK, let's make some test data (I changed the times a little to show variance):
DECLARE @Events TABLE
(
ID int IDENTITY (1, 1),
StartTime datetime,
EndTime datetime,
DurationSeconds bigint
)
INSERT INTO @Events
( StartTime, EndTime )
VALUES
( '2014/01/01 01:00', '2014/01/02 22:00');
DECLARE @Shift TABLE
(
ShiftName VARCHAR(20),
StartTime DATETIME,
EndTime DATETIME
)
INSERT INTO @Shift
( ShiftName, StartTime, EndTime )
VALUES
( 'Night', '2014/01/01 00:00', '2014/01/01 06:00' ),
( 'Morning', '2014/01/01 06:00', '2014/01/01 12:00' ),
( 'Afternoon', '2014/01/01 12:00', '2014/01/01 18:00' ),
( 'Evening', '2014/01/01 18:00', '2014/01/02 00:00' ),
( 'Night', '2014/01/02 00:00', '2014/01/02 06:00' ),
( 'Morning', '2014/01/02 06:00', '2014/01/02 12:00' ),
( 'Afternoon', '2014/01/02 12:00', '2014/01/02 18:00' ),
( 'Evening', '2014/01/02 18:00', '2014/01/03 00:00' );
Here I make a numbers table to find all the minutes for the events duration
DECLARE @StartDate DATETIME = '1/1/2014';
DECLARE @number_of_numbers INT = 100000;
;WITH
a AS (SELECT 1 AS i UNION ALL SELECT 1),
b AS (SELECT 1 AS i FROM a AS x, a AS y),
c AS (SELECT 1 AS i FROM b AS x, b AS y),
d AS (SELECT 1 AS i FROM c AS x, c AS y),
e AS (SELECT 1 AS i FROM d AS x, d AS y),
f AS (SELECT 1 AS i FROM e AS x, e AS y),
numbers AS
(
SELECT TOP(@number_of_numbers)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS number
FROM f
), mins_in_day AS
Now I find all the minutes worked for each shift and total them
(
SELECT DATEADD(MINUTE, n.number, @StartDate) AS DayMinute
FROM numbers n
), output AS
(
SELECT s.ShiftName, CONVERT(DATE, s.StartTime) ShiftDay, COUNT(1) AS TotalMinutes FROM @Shift s
INNER JOIN mins_in_day sc
ON sc.DayMinute >= s.StartTime AND sc.DayMinute < s.EndTime
INNER JOIN @Events e
ON sc.DayMinute >= e.StartTime AND sc.DayMinute <e.EndTime
GROUP BY s.ShiftName, s.StartTime
)
SELECT * FROM output
Here is the output:
ShiftName ShiftDay TotalMinutes
Night 2014-01-01 300
Morning 2014-01-01 360
Afternoon 2014-01-01 360
Evening 2014-01-01 360
Night 2014-01-02 360
Morning 2014-01-02 360
Afternoon 2014-01-02 360
Evening 2014-01-02 240
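The same totals can also be computed without expanding every minute, by clipping each event to each shift it overlaps. The following is a sketch of that alternative, not the method used above; it assumes the same @Events and @Shift layout and loads only three of the shifts to stay short:

```sql
DECLARE @Events TABLE (ID int IDENTITY(1,1), StartTime datetime, EndTime datetime);
INSERT INTO @Events (StartTime, EndTime)
VALUES ('2014-01-01T01:00:00', '2014-01-02T22:00:00');

DECLARE @Shift TABLE (ShiftName varchar(20), StartTime datetime, EndTime datetime);
INSERT INTO @Shift (ShiftName, StartTime, EndTime)
VALUES ('Night',   '2014-01-01T00:00:00', '2014-01-01T06:00:00'),
       ('Morning', '2014-01-01T06:00:00', '2014-01-01T12:00:00'),
       ('Evening', '2014-01-02T18:00:00', '2014-01-03T00:00:00');

SELECT s.ShiftName,
       CONVERT(date, s.StartTime) AS ShiftDay,
       -- overlap = later of the two starts up to earlier of the two ends
       SUM(DATEDIFF(MINUTE,
               CASE WHEN e.StartTime > s.StartTime THEN e.StartTime ELSE s.StartTime END,
               CASE WHEN e.EndTime   < s.EndTime   THEN e.EndTime   ELSE s.EndTime   END)) AS TotalMinutes
FROM @Shift AS s
JOIN @Events AS e
  ON e.StartTime < s.EndTime      -- keep only event/shift pairs that overlap
 AND e.EndTime   > s.StartTime
GROUP BY s.ShiftName, s.StartTime
ORDER BY s.StartTime;
```

For these three shifts the totals match the output above (300, 360 and 240 minutes). The work done is proportional to the number of overlapping event/shift pairs rather than to the number of minutes covered.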
EDITED:
I'm working in SQL Server 2005 and I'm trying to get a year over year (YOY) count of distinct users for the current fiscal year (say Jun 1-May 30) and the past 3 years. I'm able to do what I need by running a select statement four times, but I can't seem to find a better way at this point. I'm able to get a distinct count for each year in one query, but I need it to be a cumulative distinct count. Below is a mockup of what I have so far:
SELECT [Year], COUNT(DISTINCT UserID)
FROM
(
SELECT u.uID AS UserID,
CASE
WHEN dd.ddEnd BETWEEN @yearOneStart AND @yearOneEnd THEN 'Year1'
WHEN dd.ddEnd BETWEEN @yearTwoStart AND @yearTwoEnd THEN 'Year2'
WHEN dd.ddEnd BETWEEN @yearThreeStart AND @yearThreeEnd THEN 'Year3'
WHEN dd.ddEnd BETWEEN @yearFourStart AND @yearFourEnd THEN 'Year4'
ELSE 'Other'
END AS [Year]
FROM Users AS u
INNER JOIN UserDataIDMatch AS udim
ON u.uID = udim.udim_FK_uID
INNER JOIN DataDump AS dd
ON udim.udimUserSystemID = dd.ddSystemID
) AS Data
WHERE LOWER([Year]) <> 'other'
GROUP BY
[Year]
I get something like:
Year1 1
Year2 1
Year3 1
Year4 1
But I really need:
Year1 1
Year2 2
Year3 3
Year4 4
Below is a rough schema and set of values (updated for simplicity). I tried to create a SQL Fiddle, but I'm getting a disk space error when I attempt to build the schema.
CREATE TABLE Users
(
uID int identity primary key,
uFirstName varchar(75),
uLastName varchar(75)
);
INSERT INTO Users (uFirstName, uLastName)
VALUES
('User1', 'User1'),
('User2', 'User2'),
('User3', 'User3'),
('User4', 'User4');
CREATE TABLE UserDataIDMatch
(
udimID int identity primary key,
udim_FK_uID int foreign key references Users(uID),
udimUserSystemID varchar(75)
);
INSERT INTO UserDataIDMatch (udim_FK_uID, udimUserSystemID)
VALUES
(1, 'SystemID1'),
(2, 'SystemID2'),
(3, 'SystemID3'),
(4, 'SystemID4');
CREATE TABLE DataDump
(
ddID int identity primary key,
ddSystemID varchar(75),
ddEnd datetime
);
INSERT INTO DataDump (ddSystemID, ddEnd)
VALUES
('SystemID1', '10-01-2013'),
('SystemID2', '10-01-2014'),
('SystemID3', '10-01-2015'),
('SystemID4', '10-01-2016');
Unless I'm missing something, you just want to know how many records there are where the date is less than or equal to the current fiscal year.
DECLARE @YearOneStart DATETIME, @YearOneEnd DATETIME,
@YearTwoStart DATETIME, @YearTwoEnd DATETIME,
@YearThreeStart DATETIME, @YearThreeEnd DATETIME,
@YearFourStart DATETIME, @YearFourEnd DATETIME
SELECT @YearOneStart = '06/01/2013', @YearOneEnd = '05/31/2014',
@YearTwoStart = '06/01/2014', @YearTwoEnd = '05/31/2015',
@YearThreeStart = '06/01/2015', @YearThreeEnd = '05/31/2016',
@YearFourStart = '06/01/2016', @YearFourEnd = '05/31/2017'
;WITH cte AS
(
SELECT u.uID AS UserID,
CASE
WHEN dd.ddEnd BETWEEN @yearOneStart AND @yearOneEnd THEN 'Year1'
WHEN dd.ddEnd BETWEEN @yearTwoStart AND @yearTwoEnd THEN 'Year2'
WHEN dd.ddEnd BETWEEN @yearThreeStart AND @yearThreeEnd THEN 'Year3'
WHEN dd.ddEnd BETWEEN @yearFourStart AND @yearFourEnd THEN 'Year4'
ELSE 'Other'
END AS [Year]
FROM Users AS u
INNER JOIN UserDataIDMatch AS udim
ON u.uID = udim.udim_FK_uID
INNER JOIN DataDump AS dd
ON udim.udimUserSystemID = dd.ddSystemID
)
SELECT
DISTINCT [Year],
(SELECT COUNT(*) FROM cte cteInner WHERE cteInner.[Year] <= cteMain.[Year] )
FROM cte cteMain
Concept using an existing query
I have done something similar to find the number of distinct customers who bought something between years, and I modified it to use your concept of a year. The variables you supply are the start day and start month of the year, plus the start year and end year.
Technically there is a way to avoid using a loop, but the loop is very clear, and since you can't go past year 9999 it never runs for long, so I don't think clever code to avoid it makes sense.
Tips for speeding up the query
Also, when matching dates, make sure you compare the date column itself rather than a function of the column. A function has to be evaluated for every row, and it makes any indexes on the date columns useless (and those columns should be indexed). To build your target dates, use DATEADD on zero, adding the year minus 1900, the month minus one and the day minus one.
Then self-join the table where the years form a valid range (i.e. the lower year to the higher year) and use a subquery to sum over that range. Since you want a running total from the first year to the last, limit the results to ranges starting at the first year.
At the end you will be missing the first year, because by our definition it does not qualify as a range. To fix this, just union in the row for the first year from the temp table you created, with its number of distinct values.
DECLARE @yearStartMonth INT = 6, @yearStartDay INT = 1
DECLARE @yearStart INT = 2008, @yearEnd INT = 2012
DECLARE @firstYearStart DATE =
DATEADD(day,@yearStartDay-1,
DATEADD(month, @yearStartMonth-1,
DATEADD(year, @yearStart- 1900,0)))
DECLARE @lastYearEnd DATE =
DATEADD(day, @yearStartDay-2,
DATEADD(month, @yearStartMonth-1,
DATEADD(year, @yearEnd -1900,0)))
DECLARE @firstdayofcurrentyear DATE = @firstYearStart
DECLARE @lastdayofcurrentyear DATE = DATEADD(day,-1,DATEADD(year,1,@firstdayofcurrentyear))
DECLARE @yearnumber INT = YEAR(@firstdayofcurrentyear)
DECLARE @tempTableYearBounds TABLE
(
startDate DATE NOT NULL,
endDate DATE NOT NULL,
YearNumber INT NOT NULL
)
WHILE @firstdayofcurrentyear < @lastYearEnd
BEGIN
INSERT INTO @tempTableYearBounds
VALUES(@firstdayofcurrentyear,@lastdayofcurrentyear,@yearNumber)
SET @firstdayofcurrentyear = DATEADD(year,1,@firstdayofcurrentyear)
SET @lastdayofcurrentyear = DATEADD(year,1,@lastdayofcurrentyear)
SET @yearNumber = @yearNumber + 1
END
DECLARE @tempTableCustomerCount TABLE
(
[Year] INT NOT NULL,
[CustomerCount] INT NOT NULL
)
INSERT INTO @tempTableCustomerCount
SELECT
YearNumber as [Year],
COUNT(DISTINCT CustomerNumber) as CustomerCount
FROM Ticket
JOIN @tempTableYearBounds ON
TicketDate >= startDate AND TicketDate <=endDate
GROUP BY YearNumber
SELECT * FROM(
SELECT t2.Year as [Year],
(SELECT
SUM(CustomerCount)
FROM @tempTableCustomerCount
WHERE Year>=t1.Year
AND Year <=t2.Year) AS CustomerCount
FROM @tempTableCustomerCount t1 JOIN @tempTableCustomerCount t2
ON t1.Year < t2.Year
WHERE t1.Year = @yearStart
UNION
SELECT [Year], [CustomerCount]
FROM @tempTableCustomerCount
WHERE [YEAR] = @yearStart
) tt
ORDER BY tt.Year
It isn't efficient but at the end the temp table you are dealing with is so small I don't think it really matters, and adds a lot more versatility versus the method you are using.
Update: I updated the query to reflect the result you wanted with my data set, I was basically testing to see if this was faster, it was faster by 10 seconds but the dataset I am dealing with is relatively small. (from 12 seconds to 2 seconds).
Using your data
I changed the tables you gave to temp tables so they didn't affect my environment, and I removed the foreign key because foreign keys are not supported on temp tables. The logic is the same as in the example above, just adapted to your dataset.
DECLARE @startYear INT = 2013, @endYear INT = 2016
DECLARE @yearStartMonth INT = 10 , @yearStartDay INT = 1
DECLARE @startDate DATETIME = DATEADD(day,@yearStartDay-1,
DATEADD(month, @yearStartMonth-1,
DATEADD(year,@startYear-1900,0)))
DECLARE @endDate DATETIME = DATEADD(day,@yearStartDay-1,
DATEADD(month,@yearStartMonth-1,
DATEADD(year,@endYear-1899,0)))
DECLARE @tempDateRangeTable TABLE
(
[Year] INT NOT NULL,
StartDate DATETIME NOT NULL,
EndDate DATETIME NOT NULL
)
DECLARE @currentDate DATETIME = @startDate
WHILE @currentDate < @endDate
BEGIN
DECLARE @nextDate DATETIME = DATEADD(YEAR, 1, @currentDate)
INSERT INTO @tempDateRangeTable(Year,StartDate,EndDate)
VALUES(YEAR(@currentDate),@currentDate,@nextDate)
SET @currentDate = @nextDate
END
CREATE TABLE Users
(
uID int identity primary key,
uFirstName varchar(75),
uLastName varchar(75)
);
INSERT INTO Users (uFirstName, uLastName)
VALUES
('User1', 'User1'),
('User2', 'User2'),
('User3', 'User3'),
('User4', 'User4');
CREATE TABLE UserDataIDMatch
(
udimID int identity primary key,
udim_FK_uID int foreign key references Users(uID),
udimUserSystemID varchar(75)
);
INSERT INTO UserDataIDMatch (udim_FK_uID, udimUserSystemID)
VALUES
(1, 'SystemID1'),
(2, 'SystemID2'),
(3, 'SystemID3'),
(4, 'SystemID4');
CREATE TABLE DataDump
(
ddID int identity primary key,
ddSystemID varchar(75),
ddEnd datetime
);
INSERT INTO DataDump (ddSystemID, ddEnd)
VALUES
('SystemID1', '10-01-2013'),
('SystemID2', '10-01-2014'),
('SystemID3', '10-01-2015'),
('SystemID4', '10-01-2016');
DECLARE @tempIndividCount TABLE
(
[Year] INT NOT NULL,
UserCount INT NOT NULL
)
-- no longer need to filter out other because you are using an
--inclusion statement rather than an exclusion one, this will
--also make your query faster (when using real tables not temp ones)
INSERT INTO @tempIndividCount(Year,UserCount)
SELECT tdr.Year, COUNT(DISTINCT UId) FROM
Users u JOIN UserDataIDMatch um
ON um.udim_FK_uID = u.uID
JOIN DataDump dd ON
um.udimUserSystemID = dd.ddSystemID
JOIN @tempDateRangeTable tdr ON
dd.ddEnd >= tdr.StartDate AND dd.ddEnd < tdr.EndDate
GROUP BY tdr.Year
-- will show you your result
SELECT * FROM @tempIndividCount
--add any ranges that did not have an entry but were in your range
--can easily remove this by taking this part out.
INSERT INTO @tempIndividCount
SELECT t1.Year,0 FROM
@tempDateRangeTable t1 LEFT OUTER JOIN @tempIndividCount t2
ON t1.Year = t2.Year
WHERE t2.Year IS NULL
SELECT YearNumber,UserCount FROM (
SELECT 'Year'+CAST(((t2.Year-t1.Year)+1) AS CHAR) [YearNumber] ,t2.Year,(
SELECT SUM(UserCount)
FROM @tempIndividCount
WHERE Year >= t1.Year AND Year <=t2.Year
) AS UserCount
FROM @tempIndividCount t1
JOIN @tempIndividCount t2
ON t1.Year < t2.Year
WHERE t1.Year = @startYear
UNION ALL
--add the missing first year, union it to include the value
SELECT 'Year1',Year, UserCount FROM @tempIndividCount
WHERE Year = @startYear) tt
ORDER BY tt.Year
Benefits over using a WHEN CASE based approach
More Robust
You do not need to spell out the start and end dates of each year; just as with a calendar year, you only need to know where the overall range starts and ends. You can easily change what you are looking for with some simple modifications (e.g. say you want all 2-year or 3-year ranges).
Will be faster if the database is indexed properly
Since you are searching based on the same data type you can utilize the indices that should be created on the date columns in the database.
Cons
More Complicated
The query is a lot more complicated to follow, even though it is more robust there is a lot of extra logic in the actual query.
In some circumstances will not provide a good boost to execution time
If the dataset is very small, or the number of dates being compared isn't significant, then this may not save enough time to be worth it.
In SQL Server, once a WHEN inside a CASE matches, evaluation stops and the remaining WHEN clauses are not evaluated. Hence you can't accumulate that way.
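A minimal illustration of that behaviour (the variable @d and the branch labels are made up for the example): both conditions below are true, but only the first one that matches decides the result.

```sql
DECLARE @d datetime;
SET @d = '20131001';

SELECT CASE
         WHEN @d >= '20130601' THEN 'first branch'
         WHEN @d >= '20130101' THEN 'second branch'   -- also true, but never reached
         ELSE 'no match'
       END AS Matched;   -- returns 'first branch'
```

This is why each row lands in exactly one of the Year1..Year4 buckets in the question's query, so a row can never be counted toward several cumulative years at once.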
If I understand you correctly, this would show your results.
;WITH cte AS
(
SELECT dd.ddEnd [dateEnd], u.uID AS UserID
FROM Users AS u
INNER JOIN UserDataIDMatch AS udim
ON u.uID = udim.udim_FK_uID
INNER JOIN DataDump AS dd
ON udim.udimUserSystemID = dd.ddSystemID
WHERE ddEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd3
)
-- no GROUP BY: each SELECT returns a single aggregate row,
-- and SQL Server does not allow grouping by a variable
SELECT datepart(year, @FiscalYearStart) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd1
UNION
SELECT datepart(year, @FiscalYearEnd1) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd2
UNION
SELECT datepart(year, @FiscalYearEnd2) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd3
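For comparison, here is a sketch of a single-statement version of the cumulative distinct count. It is not taken from any of the answers above. It assumes the Users, UserDataIDMatch and DataDump tables from the question already exist, and it introduces a hypothetical @Years table variable holding the fiscal-year bounds. The idea is to join each row to every year whose end it falls on or before, and only then count distinct users:

```sql
-- Hypothetical lookup of fiscal-year bounds (names and dates are assumptions).
DECLARE @Years TABLE (YearLabel varchar(10), YearStart datetime, YearEnd datetime);
INSERT INTO @Years (YearLabel, YearStart, YearEnd)
SELECT 'Year1', '20130601', '20140531' UNION ALL
SELECT 'Year2', '20140601', '20150531' UNION ALL
SELECT 'Year3', '20150601', '20160531' UNION ALL
SELECT 'Year4', '20160601', '20170531';

DECLARE @FirstStart datetime;
SELECT @FirstStart = MIN(YearStart) FROM @Years;

SELECT y.YearLabel AS [Year],
       COUNT(DISTINCT udim.udim_FK_uID) AS CumulativeUsers
FROM @Years AS y
JOIN DataDump AS dd
  ON dd.ddEnd >= @FirstStart                  -- from the start of the first year...
 AND dd.ddEnd <  DATEADD(DAY, 1, y.YearEnd)   -- ...through the end of this year
JOIN UserDataIDMatch AS udim
  ON udim.udimUserSystemID = dd.ddSystemID
GROUP BY y.YearLabel
ORDER BY y.YearLabel;
```

With the four sample rows from the question this should give 1, 2, 3 and 4 for Year1 to Year4. Because the distinct count is taken after the cumulative join, a user who appears in several years is counted once per cumulative bucket, which summing per-year distinct counts would not guarantee.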
I'm working on a system (ASP.NET/MSSQL/C#) for scheduling restaurant employees.
The problem I'm having is I need to "auto-rotate" the shift "InTimes" every week.
The user needs to be able to copy one day's schedule to the same day next week with all the employee shift times rotated one slot.
For example, in the table below, Monica has the 10:30am shift this Monday, so she would have the 11:00am next week, and Adam would go from 12:00pm to 10:30am.
The time between shifts is not constant, nor is the number of employees on each shift.
Any ideas on how to do this (ideally with SQL statements) would be greatly appreciated.
Please keep in mind I'm a relative novice.
RecordID EmpType Date Day Meal ShiftOrder InTime EmployeeID
1 Server 29-Aug-11 Monday Lunch 1 10:30:00 AM Monica
2 Server 29-Aug-11 Monday Lunch 2 11:00:00 AM Sofia
3 Server 29-Aug-11 Monday Lunch 3 11:30:00 AM Jenny
4 Server 29-Aug-11 Monday Lunch 4 12:00:00 PM Adam
5 Server 29-Aug-11 Monday Dinner 1 4:30:00 PM Adam
6 Server 29-Aug-11 Monday Dinner 2 4:45:00 PM Jenny
7 Server 29-Aug-11 Monday Dinner 3 5:00:00 PM Shauna
8 Server 29-Aug-11 Monday Dinner 4 5:15:00 PM Sofia
10 Server 29-Aug-11 Monday Dinner 5 5:30:00 PM Monica
Somehow an employee would need to get his last (few) shifts
SELECT TOP 3 * FROM shift WHERE EmployeeID LIKE 'monica' ORDER BY [date] DESC
Next, he or she would need to enter the date and time offset for next week's shift, relative to a previous schedule entry:
INSERT INTO shift SELECT
recordID
,[date]
,CASE
WHEN [Intime] BETWEEN '00:00' AND '10:00' THEN 'Breakfast'
WHEN [Intime] BETWEEN '10:01' AND '16:29' THEN 'Lunch'
WHEN [Intime] BETWEEN '16:30' AND '23:59' THEN 'Dinner'
END as Meal
,No_idea_how_to_generate_this AS ShiftOrder
,[Intime]
,EmployeeID
FROM (SELECT
NULL as recordID
,DATEADD(DAY, 7+@dateoffset, ls.[date]) as [date]
,CAST(DATEADD(MINUTE, @timeoffset, ls.[Intime]) AS TIME) as [Intime]
,EmployeeId
FROM Shift AS ls WHERE recordID = @recordID ) AS subselect
Here:
- @recordID is the record the employee chose as the starting point for the new appointment.
- @dateoffset is the number of days to add to the starting record
- @timeoffset is the number of minutes to add to the starting record
All the rest is determined by the row the user used as the starting point.
Here's what I came up with:
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] DATE ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] TIME ,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,'4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,'4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,'5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,'5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,'5:30:00 PM','Monica');
WITH CountByShift AS (SELECT *, COUNT(1) OVER (PARTITION BY EmpType, [Day], [Meal]) AS [CountByShiftByDayByEmpType]
FROM [#tmp]
),
NewShiftOrder AS (
SELECT *, ([ShiftOrder] + 1) % [CountByShiftByDayByEmpType] AS [NewShiftOrder]
FROM [CountByShift]
)
SELECT [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
CASE WHEN [NewShiftOrder] = 0 THEN [CountByShiftByDayByEmpType] ELSE [NewShiftOrder] END AS [NewShiftOrder],
[InTime] ,
[EmployeeID]
FROM NewShiftOrder
ORDER BY [RecordID]
You need a table with all of the shifts in it:
create table dbo.Shifts (
[Day] varchar(9) not null,
Meal varchar(6) not null,
ShiftOrder integer not null,
InTime time not null,
constraint PK__dbo_Shifts primary key ([Day], Meal, ShiftOrder)
);
If that table is properly populated you can then run this to get a map of the current Day, Meal, ShiftOrder n-tuple to the next in that Day, Meal pair:
with numbers_per_shift as (
select [Day], Meal, max(ShiftOrder) as ShiftOrderCount
from dbo.Shifts s
group by [Day], Meal
)
select s.[Day], s.Meal, s.ShiftOrder,
s.ShiftOrder % n.ShiftOrderCount + 1 as NextShiftOrder
from dbo.Shifts as s
inner join numbers_per_shift as n
on s.[Day] = n.[Day]
and s.Meal = n.Meal;
For the table to be properly populated each of the shift orders would have to begin with one and increase by one with no skipping or repeating within a Day, Meal pair.
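To see the modulo mapping at work, here is a small self-contained sketch. It uses a table variable in place of dbo.Shifts and loads only the Monday lunch slots from the question:

```sql
DECLARE @Shifts TABLE ([Day] varchar(9), Meal varchar(6), ShiftOrder int, InTime time);
INSERT INTO @Shifts ([Day], Meal, ShiftOrder, InTime)
VALUES ('Monday', 'Lunch', 1, '10:30'),
       ('Monday', 'Lunch', 2, '11:00'),
       ('Monday', 'Lunch', 3, '11:30'),
       ('Monday', 'Lunch', 4, '12:00');

SELECT s.ShiftOrder,
       s.ShiftOrder % m.MaxOrder + 1 AS NextShiftOrder
FROM @Shifts AS s
CROSS JOIN (SELECT MAX(ShiftOrder) AS MaxOrder FROM @Shifts) AS m
ORDER BY s.ShiftOrder;
-- 1 -> 2, 2 -> 3, 3 -> 4, 4 -> 1
```

The last slot wraps to the first because 4 % 4 is 0, and adding 1 gives 1. This only works when the orders run 1..N without gaps, which is the "properly populated" condition stated above.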
Borrowing most of the #tmp table definition from @Ben Thul, assuming you have an identity field, and not assuming you are storing dates and times as dates and times... this should run well over and over, copying the latest date into the following week:
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] VARCHAR(9) ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] VARCHAR(11) ,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,' 4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,' 4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,' 5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,' 5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,' 5:30:00 PM','Monica');
with
Shifts as (
select EmpType, [Day], Meal, ShiftOrder, InTime
from #tmp
where [Date] = (select max(cast([Date] as datetime)) from #tmp)
),
MaxShifts as (
select EmpType, [Day], Meal, max(ShiftOrder) as MaxShiftOrder
from #tmp
where [Date] = (select max(cast([Date] as datetime)) from #tmp)
group by EmpType, [Day], Meal
)
insert into #tmp (EmpType, [Date], [Day], Meal, ShiftOrder, InTime, EmployeeID)
select s.EmpType
, replace(convert(varchar(11), dateadd(dd, 7, cast(a.[Date] as datetime)), 6), ' ', '-') as [Date]
, s.Day
, s.Meal
, s.ShiftOrder
, s.InTime
, a.EmployeeID
from #tmp as a
join MaxShifts as m on a.EmpType = m.EmpType
and a.[Day] = m.[Day]
and a.Meal = m.Meal
join Shifts as s on a.EmpType = s.EmpType
and a.[Day] = s.[Day]
and a.Meal = s.Meal
and 1 + a.ShiftOrder % m.MaxShiftOrder = s.ShiftOrder
where a.[Date] = (select max(cast([Date] as datetime)) from #tmp)
I'm assuming in the answer below that the schedule is really tied to a meal and a weekday.
Also I would like to note that ShiftOrder and Day columns should not be columns. Day is obviously determined by Date so it is a total waste of space (computed column OR determine it on the UI side) and ShiftOrder is determined by Date and InTime columns (probably easy to calculate in a query with RANK() function or on the UI side). That said it will make this query a bit easier :)
declare @dt date = cast('29-Aug-11' as date)
/* note: the date above may be passed from UI or it maybe calculated based on getdate() and dateadd function or s.t. like that */
INSERT INTO [table] (EmpType,Date,Day,Meal,ShiftOrder,InTime,EmployeeID)
SELECT t1.EmpType, dateadd(day, 7, t1.date), t1.day, t1.meal, t2.ShiftOrder, t2.InTime, t1.EmployeeID
FROM [table] t1
INNER JOIN [table] t2
ON (t1.Date = t2.Date
and t1.Meal = t2.Meal
and (
/* each employee moves to the next slot */
t2.ShiftOrder = t1.ShiftOrder + 1
or
(
/* the employee in the last slot wraps around to the first */
t1.ShiftOrder = (select max(shiftOrder) from [table] where meal = t1.meal and date =t1.date)
and
t2.ShiftOrder = (select min(shiftOrder) from [table] where meal = t1.meal and date =t1.date)
)
)
)
WHERE t1.Date = @dt
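As a sketch of the earlier remark that Day and ShiftOrder can be derived rather than stored (the table variable @s and its four rows are illustrative only), DATENAME gives the weekday name and RANK() numbers the slots within each date and meal by InTime:

```sql
DECLARE @s TABLE ([Date] date, Meal varchar(10), InTime time, EmployeeID varchar(50));
INSERT INTO @s ([Date], Meal, InTime, EmployeeID)
VALUES ('20110829', 'Lunch',  '10:30', 'Monica'),
       ('20110829', 'Lunch',  '11:00', 'Sofia'),
       ('20110829', 'Dinner', '16:30', 'Adam'),
       ('20110829', 'Dinner', '16:45', 'Jenny');

SELECT [Date],
       DATENAME(WEEKDAY, [Date]) AS [Day],   -- language-dependent, e.g. 'Monday'
       Meal,
       RANK() OVER (PARTITION BY [Date], Meal ORDER BY InTime) AS ShiftOrder,
       InTime,
       EmployeeID
FROM @s
ORDER BY Meal, ShiftOrder;
```

RANK() gives two employees with the same InTime the same ShiftOrder; ROW_NUMBER() would be the choice if every row needs its own slot number.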
This is a pretty straight-forward set-oriented problem. Aggregations (count(*) and max()) and lookup tables are unnecessary. You can do it with one SQL statement.
The first step (set) is to identify those employees who simply slide down in the schedule.
The next step (set) is to identify those employees who need to "wrap around" to the head of the schedule.
Here's what I came up with:
/* Set up the temp table for demo purposes */
IF OBJECT_ID('tempdb..#tmp') IS NOT NULL DROP TABLE #tmp
CREATE TABLE #tmp
(
[RecordID] INT ,
[EmpType] VARCHAR(20) ,
[Date] DATE ,
[Day] VARCHAR(10) ,
[Meal] VARCHAR(10) ,
[ShiftOrder] INT ,
[InTime] TIME,
[EmployeeID] VARCHAR(50)
)
INSERT INTO [#tmp]
( [RecordID] ,
[EmpType] ,
[Date] ,
[Day] ,
[Meal] ,
[ShiftOrder] ,
[InTime] ,
[EmployeeID]
)
VALUES (1,'Server','29-Aug-11','Monday','Lunch',1,'10:30:00 AM','Monica'),
(2,'Server','29-Aug-11','Monday','Lunch',2,'11:00:00 AM','Sofia'),
(3,'Server','29-Aug-11','Monday','Lunch',3,'11:30:00 AM','Jenny'),
(4,'Server','29-Aug-11','Monday','Lunch',4,'12:00:00 PM','Adam'),
(5,'Server','29-Aug-11','Monday','Dinner',1,' 4:30:00 PM','Adam'),
(6,'Server','29-Aug-11','Monday','Dinner',2,' 4:45:00 PM','Jenny'),
(7,'Server','29-Aug-11','Monday','Dinner',3,' 5:00:00 PM','Shauna'),
(8,'Server','29-Aug-11','Monday','Dinner',4,' 5:15:00 PM','Sofia'),
(10,'Server','29-Aug-11','Monday','Dinner',5,' 5:30:00 PM','Monica');
/* the "fills" CTE will find those employees who "wrap around" */
;WITH fills AS (
SELECT
[d2].[EmpType],
[d2].[Date],
[d2].[Day],
[d2].[Meal],
1 AS [ShiftOrder],
[d2].[InTime],
[d2].[EmployeeID]
FROM
[#tmp] d1
RIGHT OUTER JOIN
[#tmp] d2 ON
([d1].[Meal] = [d2].[Meal])
AND ([d1].[ShiftOrder] = [d2].[ShiftOrder] + 1)
WHERE
[d1].[EmployeeID] IS NULL
)
INSERT INTO [table] (EmpType,Date,Day,Meal,ShiftOrder,InTime,EmployeeID)
SELECT
[d1].[EmpType],
DATEADD(DAY, 7, [d1].[Date]) AS [Date],
DATENAME(dw,(DATEADD(DAY, 7, [d1].[Date]))) AS [Day],
[d1].[Meal],
[d1].[ShiftOrder],
[d1].[InTime],
ISNULL([d2].[EmployeeID], [f].[EmployeeID]) AS [EmployeeID]
FROM
[#tmp] d1
LEFT OUTER JOIN
[#tmp] d2 ON
([d1].[Meal] = [d2].[Meal]) AND ([d1].[ShiftOrder] = [d2].[ShiftOrder] + 1)
LEFT OUTER JOIN
[fills] f ON
([d1].[Meal] = [f].[Meal]) AND ([d1].[ShiftOrder] = [f].[ShiftOrder])
You can use a subquery (for a tutorial on subqueries, see http://www.databasejournal.com/features/mssql/article.php/3464481/Using-a-Subquery-in-a-T-SQL-Statement.htm) to get the last shift time.
After this, it's trivial addition and modular division (the remainder after division, written % in T-SQL).
Hope this helped. I'm a bit tired right now, so I can't provide you with an example.
I've been a SQL programmer and DBA for 20 years now. With that said, business logic this complex should be in the C# part of the system. Then the TDD-built application can handle the inevitable changes and still be refactorable and correct.
My recommendation is 'push-back'. Your response should be something along the lines of "This isn't just some look-up/fill-in-the-blank logic. This kind of complex business logic belongs in the App". It belongs in something that can be unit tested, and will be unit tested every time it's changed.
Sometimes the right answer is 'No', and this is one of those times.
How about using a Pivot Table for all employees and then adding shift timings as rows?? Order the names based on Shift for the initial Day.
Something like this..
Date_time Shift_Order Monica Sofia Jenny Adam Shauna
08/29/11 1 10:30AM 11:00AM 11:30AM 12:00PM NULL
08/29/11 2 5:30PM 5:15PM 4:45PM 4:30PM 5:00PM