SQL Server 2014 - Use previous value when date not present - sql

I asked a similar question yesterday, but I did not describe what I wanted very well. This should be far clearer.
Lead/Lag is not getting me what I need. It's close, but not enough.
I am using SQL Server 2014 as the client; the actual server runs SQL Server 2012.
Here is my code:
Creating Team Table
CREATE TABLE ##TeamTable
([UserID] varchar(50), [CurrentTeam] varchar(5), [ChangeDate] datetime)
;
INSERT INTO ##TeamTable
([UserID], [CurrentTeam], [ChangeDate])
VALUES
('User1', 'Team1', '6/1/2016'),
('User1', 'Team2', '9/1/2016'),
('User1', 'Team3', '12/1/2016'),
('User2', 'Team1', '4/1/2016'),
('User2', 'Team2', '10/1/2016'),
('User2', 'Team3', '11/1/2016');
Now the data table I need to join to:
CREATE TABLE ##DataTable
([UserID] varchar(50), Month_sk datetime, Media varchar(50), NCO int)
INSERT INTO ##DataTable
([UserID] , Month_sk , Media , NCO )
VALUES
('User1', '2016-06-01 00:00:00', 'Fax', 100),
('User1', '2016-06-01 00:00:00', 'Voice', 120),
('User1', '2016-07-01 00:00:00', 'Voice', 90),
('User1', '2016-07-01 00:00:00', 'Email', 100),
('User1', '2016-08-01 00:00:00', 'Voice', 150),
('User1', '2016-08-01 00:00:00', 'Email', 100),
('User1', '2016-09-01 00:00:00', 'Voice', 100),
('User1', '2016-09-01 00:00:00', 'Email', 120),
('User1', '2016-10-01 00:00:00', 'Voice', 90),
('User1', '2016-10-01 00:00:00', 'Email', 100),
('User1', '2016-11-01 00:00:00', 'Voice', 150),
('User1', '2016-11-01 00:00:00', 'Email', 100),
('User1', '2016-12-01 00:00:00', 'Voice', 150),
('User1', '2016-12-01 00:00:00', 'Email', 100),
('User2', '2016-04-01 00:00:00', 'Fax', 100),
('User2', '2016-04-01 00:00:00', 'Voice', 120),
('User2', '2016-05-01 00:00:00', 'Fax', 100),
('User2', '2016-05-01 00:00:00', 'Voice', 120),
('User2', '2016-06-01 00:00:00', 'Fax', 100),
('User2', '2016-06-01 00:00:00', 'Voice', 120),
('User2', '2016-07-01 00:00:00', 'Voice', 90),
('User2', '2016-07-01 00:00:00', 'Email', 100),
('User2', '2016-08-01 00:00:00', 'Voice', 150),
('User2', '2016-08-01 00:00:00', 'Email', 100),
('User2', '2016-09-01 00:00:00', 'Voice', 100),
('User2', '2016-09-01 00:00:00', 'Email', 120),
('User2', '2016-10-01 00:00:00', 'Voice', 90),
('User2', '2016-10-01 00:00:00', 'Email', 100),
('User2', '2016-11-01 00:00:00', 'Voice', 150),
('User2', '2016-11-01 00:00:00', 'Email', 100),
('User2', '2016-12-01 00:00:00', 'Voice', 150),
('User2', '2016-12-01 00:00:00', 'Email', 100);
Here is a basic Select to show what's going on:
SELECT b.UserID
,b.Media
,b.NCO
,Month_sk
,CurrentTeam
FROM ##DataTable b
LEFT OUTER JOIN ##TeamTable a on b.UserID = a.UserID and b.Month_sk = a.ChangeDate
order by UserID, Month_sk, media
This gives me a result set that looks like this:
What I need is for the NULLs to be filled with the previous non-null team name. So in User1's case, the 4 NULLs for July and August would say Team1, since that was the team he was last on. Likewise, the NULLs after Team2 should say Team2.
Lead/Lag is close, or I'm not using it right. Hopefully with all this code, this makes someone's job way easier.
UPDATE:
Lag/Lead gives same results. Still need the nulls to fill in
SELECT b.UserID
,b.Media
,b.NCO
,Month_sk
,CurrentTeam
,LAG(CurrentTeam,1, currentteam) OVER(PARTITION BY a.userid, changedate ORDER BY ChangeDate) as Lag
FROM ##DataTable b
LEFT OUTER JOIN ##TeamTable a on b.UserID = a.UserID and b.Month_sk = a.ChangeDate
order by UserID, Month_sk, media

(Moving update notes to end)
I think the easiest solution (conceptually) is to join against all months up to month_sk and then filter to get only the last match. This "feels" potentially inefficient, so you'd want to test it with realistic data volume and if there's a problem then look for something better. (But "something better" may involve changes to the physical data model...)
So:
select userid, media, nco, month_sk, currentteam
from (SELECT b.UserID
, b.Media
, b.NCO
, Month_sk
, CurrentTeam
, rank() over(partition by b.userID, b.Month_sk
order by a.changeDate desc) n
FROM ##DataTable b
INNER JOIN ##TeamTable a
on b.UserID = a.UserID
and b.Month_sk >= a.ChangeDate
) x
where n = 1
order by UserID, Month_sk, media
Note that in previous versions I used row_number() over() instead of rank() over()... and you can do that, but if you do then you have to include in the partitioning key any data from the b table that could cause a duplication of a row from the a table during the join. Using rank ensures that all such duplicates share their rank as they ought to.
UPDATE - After I initially wrote this, I deleted it because I thought I'd misread your question; but as I was writing a replacement realized I may have had it right in the first place. So here it is, with a caveat:
This assumes that the only reason you get the NULL value is the outer join. If the "right hand" table ever has a row in which just one column value is NULL, then getting the previous value for that column would require further work with subqueries or analytic functions. But even then lead/lag may not work, since they are position based. (I think something with LAST_VALUE might be more suitable, but will leave the details of that unless it's needed.)
UPDATE 2 - based on your description of the data model in below comments, I'm changing the query to show an inner join as it sounds like that will work (once you broaden the join criteria) and should be more efficient.
UPDATE 3 - I did misread your sample data and got the partitioning expression for calculating n wrong. Should be fixed assuming the values from the b table are unique. If not it's still fixable but requires more trickery...
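To make the "join against everything up to month_sk, then keep only the last match" idea concrete, here is a minimal sketch of the same lookup outside SQL. It is written in Python purely as an illustration; the `team_changes` dictionary and the `team_as_of` helper are hypothetical names, and the data simply mirrors the sample rows from the question.

```python
from bisect import bisect_right
from datetime import date

# Team changes per user, sorted by change date (mirrors ##TeamTable).
team_changes = {
    "User1": [(date(2016, 6, 1), "Team1"),
              (date(2016, 9, 1), "Team2"),
              (date(2016, 12, 1), "Team3")],
    "User2": [(date(2016, 4, 1), "Team1"),
              (date(2016, 10, 1), "Team2"),
              (date(2016, 11, 1), "Team3")],
}

def team_as_of(user_id, month):
    """Return the team from the latest change on or before `month`, or None."""
    changes = team_changes.get(user_id, [])
    change_dates = [d for d, _ in changes]
    # bisect_right gives the number of change dates that are <= month.
    i = bisect_right(change_dates, month)
    return changes[i - 1][1] if i else None

for month in (date(2016, 6, 1), date(2016, 7, 1), date(2016, 9, 1)):
    print("User1", month, team_as_of("User1", month))
```

A month with no earlier change for that user comes back as None, which corresponds to the rows an inner join would drop.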

You can do this with an APPLY and a subquery like this.
SELECT
userid,
media,
nco,
month_sk,
currentteam
FROM
##DataTable td
OUTER APPLY (
SELECT TOP (1)
CurrentTeam,
ChangeDate
FROM
##TeamTable tt
WHERE
tt.UserID = td.UserID
and tt.ChangeDate <= td.Month_sk
ORDER BY
tt.ChangeDate desc
) dataTableWithTeam
ORDER BY
td.UserID,
td.Month_sk,
td.media

In this version, I first identify the appropriate "linking" month in the CTE, and then use that as a lookup in the final join. (It got much easier once I realized Media and NCO played no real part in the join.)
WITH cteDateLookup
as (
-- Get the ChangeDate for this User/Month
SELECT
b.UserID
,b.Month_sk
,max(a.ChangeDate) ChangeDate
from ##DataTable b
left outer join ##TeamTable a
on b.UserID = a.UserID
and b.Month_sk >= a.ChangeDate
group by
b.UserID
,b.Month_sk
)
-- Use the cte as a "lookup" for the appropriate date
SELECT
b.UserID
,b.Media
,b.NCO
,b.Month_sk
,a.CurrentTeam
from ##DataTable b
left outer join cteDateLookup cte
on cte.UserId = b.UserId
and b.Month_sk = cte.Month_sk
left outer join ##TeamTable a
on a.UserId = cte.UserId
and a.ChangeDate = cte.ChangeDate
order by
b.UserID
,b.Month_sk
,b.media

Related

How to display months sorted in order in SQL Server?

Below is the table I have created and inserted values in it:
CREATE TABLE employees_list
(
employeeID int identity(1,1),
employeeName varchar(25)
)
GO
INSERT INTO employees_list VALUES ('Kevin'),('Charles')
GO
CREATE TABLE hourlyRates
(
employeeID int,
rate int,
rateDate date
)
INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'),
(1, 39, '2016-02-01'),
(2, 43, '2016-01-01'),
(2, 57, '2016-02-01')
CREATE TABLE workingHours
(
employeeID int,
startdate datetime,
enddate datetime
)
GO
INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'),
(1, '2016-01-02 09:00', '2016-01-02 17:00'),
(1, '2016-02-01 10:00', '2016-02-01 16:00'),
(1, '2016-02-02 11:00', '2016-02-02 13:00'),
(2, '2016-01-01 10:00', '2016-01-01 16:00'),
(2, '2016-01-02 08:00', '2016-01-02 14:00'),
(2, '2016-02-01 14:00', '2016-02-01 19:00'),
(2, '2016-02-02 13:00', '2016-02-02 16:00')
GO
SELECT * FROM employees_list
SELECT * FROM hourlyRates
SELECT * FROM workingHours
Then I ran a query to calculate salaries paid to Employees each month:
SELECT
employeeName,
DATENAME(MONTH, startdate) AS 'Month',
SUM(DATEDIFF(HOUR, startdate, enddate) * rate) AS 'Total Salary'
FROM
hourlyRates, workingHours, employees_list
WHERE
hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND (hourlyRates.rateDate BETWEEN DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH, workingHours.startDate),1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH, workingHours.endDate),1))
GROUP BY
employeeName, DATENAME(MONTH, startdate)
And I got the following output:
As you can see from the screenshot above, I got the result I wanted.
The only issue is that the months are not displayed in order.
I tried adding ORDER BY DATENAME(MONTH, startdate), but the months are still not sorted correctly.
I even tried ORDER BY DATEPART(MM, startdate), but it raises an error saying that the column is not contained in an aggregate function or the GROUP BY clause.
What minor change do I need to make in my query?
Why does adding ORDER BY DATENAME(MONTH, startdate) not work?
Because DATENAME returns the month name as text, so the ORDER BY sorts alphabetically rather than by month number.
You can add MONTH(startdate) to both the ORDER BY and the GROUP BY; it has to appear in the GROUP BY because it is a non-aggregated expression.
SELECT employeeName,DATENAME(MONTH,startdate) AS 'Month',
SUM(DATEDIFF(HOUR,startdate,enddate) * rate) AS 'Total Salary'
FROM hourlyRates
INNER JOIN workingHours
ON hourlyRates.employeeID = workingHours.employeeID
INNER JOIN employees_list
ON employees_list.employeeID = workingHours.employeeID
WHERE
(hourlyRates.rateDate
BETWEEN DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH,workingHours.startDate),1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH,workingHours.endDate),1))
GROUP BY employeeName,DATENAME(MONTH,startdate),MONTH(startdate)
ORDER BY MONTH(startdate)
sqlfiddle
NOTE
I would use ANSI INNER JOIN syntax instead of the comma (,), which means CROSS JOIN, because explicit JOIN syntax is generally considered more readable.
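The difference between sorting by month name and sorting by month number is easy to see outside SQL as well. The following is only an illustrative sketch in Python; the `number_of` lookup is a hypothetical stand-in for what MONTH(startdate) returns.

```python
months = ["January", "February", "March", "April"]

# Sorting by the name compares text, like ORDER BY DATENAME(MONTH, ...).
by_name = sorted(months)

# Sorting by the month number, like ORDER BY MONTH(...).
number_of = {name: i + 1 for i, name in enumerate(months)}
by_number = sorted(months, key=number_of.get)

print(by_name)    # ['April', 'February', 'January', 'March']
print(by_number)  # ['January', 'February', 'March', 'April']
```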
As mentioned, ORDER BY DATENAME will sort by the textual name of the month not by the actual ordering of months.
It's best to just group and sort by EOMONTH, then you can pull out the month name from that in the SELECT
Further improvements:
Always use explicit join syntax, not old-style , comma joins.
Give tables short aliases, to make your query more readable.
Your date interval check might not be quite right, and you may need to also adjust the rate calculation, but I don't know without further info.
A more accurate calculation would probably mean calculating part-dates.
SELECT
e.employeeName,
DATENAME(month, EOMONTH(wh.startdate)) AS Month,
SUM(DATEDIFF(HOUR, wh.startdate, wh.enddate) * hr.rate) AS [Total Salary]
FROM hourlyRates hr
JOIN workingHours wh ON hr.employeeID = wh.employeeID
AND hr.rateDate
BETWEEN DATEFROMPARTS(YEAR(wh.startDate), MONTH(wh.startDate), 1)
AND DATEFROMPARTS(YEAR(wh.endDate), MONTH(wh.endDate), 1)
JOIN employees_list e ON e.employeeID = wh.employeeID
GROUP BY
e.employeeId,
e.employeeName,
EOMONTH(wh.startdate)
ORDER BY
EOMONTH(wh.startdate),
e.employeeName;
db<>fiddle

Get all dates between provided dates

I have this table and sample data. I want to get attendance for an entire month or for specific dates, along with information such as the hours the employee worked or the days he was absent.
CREATE TABLE Attendance
(
[EmpCode] int,
[TimeIn] datetime,
[TimeOut] datetime
)
INSERT INTO Attendance VALUES (12, '2018-08-01 09:00:00', '2018-08-01 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-02 09:00:00', '2018-08-02 18:10:00');
INSERT INTO Attendance VALUES (12, '2018-08-03 09:25:00', '2018-08-03 16:56:00');
INSERT INTO Attendance VALUES (12, '2018-08-04 09:13:00', '2018-08-05 18:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-06 09:00:00', '2018-08-07 18:15:00');
INSERT INTO Attendance VALUES (12, '2018-08-07 09:27:00', '2018-08-08 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-08 09:35:00', '2018-08-09 17:21:00');
INSERT INTO Attendance VALUES (12, '2018-08-10 09:00:00', '2018-08-10 17:45:00');
INSERT INTO Attendance VALUES (12, '2018-08-11 09:50:00', '2018-08-11 17:31:00');
INSERT INTO Attendance VALUES (12, '2018-08-13 09:23:00', '2018-08-13 17:19:00');
INSERT INTO Attendance VALUES (12, '2018-08-15 09:21:00', '2018-08-15 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-16 09:00:00', '2018-08-16 17:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-17 09:34:00', '2018-08-17 17:29:00');
INSERT INTO Attendance VALUES (12, '2018-08-18 09:00:00', '2018-08-18 17:10:00');
INSERT INTO Attendance VALUES (12, '2018-08-20 09:34:00', '2018-08-20 17:12:00');
INSERT INTO Attendance VALUES (12, '2018-08-21 09:20:00', '2018-08-21 17:15:00');
INSERT INTO Attendance VALUES (12, '2018-08-22 09:12:00', '2018-08-22 17:19:00');
INSERT INTO Attendance VALUES (12, '2018-08-23 09:05:00', '2018-08-23 17:21:00');
INSERT INTO Attendance VALUES (12, '2018-08-24 09:07:00', '2018-08-24 17:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-25 09:12:00', '2018-08-25 17:05:00');
INSERT INTO Attendance VALUES (12, '2018-08-27 09:21:00', '2018-08-27 17:46:00');
INSERT INTO Attendance VALUES (12, '2018-08-28 09:17:00', '2018-08-28 17:12:00');
INSERT INTO Attendance VALUES (12, '2018-08-29 09:00:00', '2018-08-29 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-30 09:12:00', '2018-08-30 17:24:00');
I have a query that tells how many hours an employee has worked, but it only shows days for which data is present in the table. I want to show all dates between the provided dates, and where there is no data, the columns should show NULL.
Here is the query:
SELECT
[EmpCode],
FirstIN = CAST(MIN([TimeIn]) AS TIME),
LastOUT = CAST(MAX([TimeOut]) AS TIME),
CONVERT(VARCHAR(6), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME))/3600)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), (Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 3600) / 60), 2)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 60) , 2 ) AS HoursSpent,
CAST(COALESCE(TimeIn, TimeOut) AS DATE) [Date]
FROM Attendance
WHERE CAST(COALESCE(TimeIn, TimeOut) AS DATE) BETWEEN '2018-08-01' AND '2018-08-25'
GROUP BY EmpCode, TimeIn, TimeOut
For that you need a recursive CTE to generate the possible dates:
with t as (
select cast('2018-08-01' as date) as startdt
union all
select dateadd(day, 1, startdt)
from t
where startdt < '2018-08-25'
)
select . . .
from t left join
Attendance at
on cast(coalesce(at.TimeIn, at.TimeOut) as date) = t.startdt;
Just make sure to use the date from t instead of from the Attendance table in the SELECT statement.
Note: if the date range is large, don't forget the query hint OPTION (MAXRECURSION 0). By default, recursion is limited to 100 levels.
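As a rough, runnable illustration of the same pattern (generate every date recursively, then LEFT JOIN the attendance rows onto it), here is a sketch using Python's built-in sqlite3 module. This is SQLite rather than SQL Server, so the syntax is an assumption-laden stand-in: it uses WITH RECURSIVE and date(x, '+1 day') instead of DATEADD, there is no MAXRECURSION hint, and only three of the sample rows and a shortened date range are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Attendance (EmpCode INTEGER, TimeIn TEXT, TimeOut TEXT)")
conn.executemany(
    "INSERT INTO Attendance VALUES (?, ?, ?)",
    [
        (12, "2018-08-01 09:00:00", "2018-08-01 17:36:00"),
        (12, "2018-08-02 09:00:00", "2018-08-02 18:10:00"),
        (12, "2018-08-04 09:13:00", "2018-08-05 18:09:00"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE t(startdt) AS (
        SELECT '2018-08-01'                     -- anchor: first date
        UNION ALL
        SELECT date(startdt, '+1 day')          -- recursive step: next day
        FROM t
        WHERE startdt < '2018-08-05'
    )
    SELECT t.startdt, a.TimeIn, a.TimeOut
    FROM t
    LEFT JOIN Attendance a ON date(a.TimeIn) = t.startdt
    ORDER BY t.startdt
    """
).fetchall()

# Dates without attendance show None (NULL) in the attendance columns.
for row in rows:
    print(row)
```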
You may try a recursive CTE to populate the dates and then join to it to get the interval:
DECLARE @From DATETIME = '2018-08-01' ,@To DATETIME= '2018-08-25'
;WITH CTE
AS
(
SELECT
[EmpCode] EmpId,
MyDate = @From
FROM Attendance A
UNION ALL
SELECT
EmpId,
MyDate = DATEADD(DAY,1,MyDate)
FROM CTE
WHERE MyDate < @To
)
SELECT
[EmpCode] = CTE.EmpId,
CTE.MyDate,
FirstIN = CAST(MIN([TimeIn]) AS TIME),
LastOUT = CAST(MAX([TimeOut]) AS TIME),
CONVERT(VARCHAR(6), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME))/3600)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), (Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 3600) / 60), 2)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 60) , 2 )
AS HoursSpent,
CAST(CTE.MyDate AS DATE) [Date]
FROM CTE
LEFT JOIN Attendance A
ON A.EmpCode = CTE.EmpId
AND CAST(CTE.MyDate AS DATE) = CAST(COALESCE(TimeIn, TimeOut) AS DATE)
GROUP BY CTE.EmpId, TimeIn, TimeOut,CTE.MyDate
ORDER BY 6
A different method uses a Tally table. A recursive CTE (rCTE) is a form of RBAR (row-by-agonizing-row) processing; a Tally table is less obvious as an idea, but it is quicker and does not need OPTION (MAXRECURSION 0) when you have more than 100 days. In fact, this example handles up to 10,000 days, which should be more than enough:
DECLARE @EmpCode int = 12;
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1 --10
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4 --10000
),
Dates AS(
SELECT DATEADD(DAY, T.I, TT.MinTimeIn) AS CalendarDate,
@EmpCode AS EmpCode
FROM Tally T
CROSS APPLY (SELECT MIN(CONVERT(date,TimeIn)) AS MinTimeIn,
MAX(CONVERT(date,TimeOut)) AS MaxTimeOut
FROM Attendance
WHERE EmpCode = @EmpCode) TT
WHERE DATEADD(DAY, T.I, TT.MinTimeIn) <= CONVERT(date, TT.MaxTimeOut))
SELECT D.CalendarDate,
D.EmpCode,
A.TimeIn,
A.TimeOut
FROM Dates D
LEFT JOIN Attendance A ON D.CalendarDate = CONVERT(date,A.TimeIn)
AND D.EmpCode = A.EmpCode;
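If the four-way CROSS JOIN looks opaque, the counting behind it can be sketched in a few lines of Python. This is only an analogy, not the T-SQL mechanism: the query above numbers the rows with ROW_NUMBER(), whereas this sketch builds the numbers from four digits, and the `start` and `end` dates are assumed values taken from the sample data.

```python
from datetime import date, timedelta
from itertools import product

digits = range(10)

# Four-way cross join of ten rows each: 10 * 10 * 10 * 10 = 10,000 numbers (0..9999).
tally = [a * 1000 + b * 100 + c * 10 + d
         for a, b, c, d in product(digits, repeat=4)]

start, end = date(2018, 8, 1), date(2018, 8, 30)

# Turn each number into a date offset and keep only dates inside the range.
calendar_dates = [start + timedelta(days=i) for i in tally
                  if start + timedelta(days=i) <= end]

print(len(tally), calendar_dates[0], calendar_dates[-1])
```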

Select with formula that includes previous value

DECLARE @sales TABLE
(
code VARCHAR(10) NOT NULL,
date1 DATE NOT NULL,
sales NUMERIC(10, 2) NOT NULL,
profits NUMERIC(10, 2) NOT NULL
);
INSERT INTO @sales(Code, Date1, sales, profits)
VALUES ('q', '20140708', 0.51,21),
('q', '20140712', 0.3,33),
('q', '20140710', 0.5,12),
('q', '20140711', 0.6,43),
('q', '20140712', 0.2,66),
('q', '20140713', 0.7,21),
('q', '20140714', 0.24,76),
('q', '20140714', 0.24,12),
('x', '20140709', 0.25,0),
('x', '20140710', 0.16,0),
('x', '20140711', 0.66,31),
('x', '20140712', 0.23,12),
('x', '20140712', 0.35,11),
('x', '20140714', 0.57,1),
('c', '20140712', 0.97,2),
('c', '20140714', 0.71,3);
SELECT code,
CONVERT(VARCHAR, date1, 104) AS SPH_DATE_FORMATO,
Cast(Sum(sales)
OVER (
ORDER BY date1) AS NUMERIC (18, 2)) AS SPH_CLOSE
FROM @sales
WHERE date1 > Dateadd(month, -21, Getdate())
AND code = 'q'
This select gives me the accumulated sales ordered by date for the 'q' code, and this is fine.
But now I need an additional column that calculates:
(1+ previous day sales)*(1+ today sales) -1
also ordered by date for the 'q' code
Can anyone help with this, please?
You can do it like this using a CTE; just change your select query like this:
;with Sales as
(
SELECT code, convert(varchar, date1, 104) AS SPH_DATE_FORMATO, cast(SUM(sales) OVER (ORDER BY date1) as numeric (18,2)) AS SPH_CLOSE,ROW_NUMBER() OVER(ORDER BY Date1 ASC) as rowid
FROM @sales
where date1 >DATEADD(month, -21, GETDATE()) and code='q')
select S1.code,S1.SPH_DATE_FORMATO,S1.SPH_CLOSE
,S2.SPH_close as Sales_Last_Day
from Sales S1 left outer join Sales S2 on S1.rowid -1 = S2.rowid
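Once the previous row's value is available, the formula from the question is a single expression. Here is a small sketch of the arithmetic in Python, under two assumptions that are mine and not from the question: "previous day sales" means the sales value of the previous row in date order, and there is one row per date (the sample data has duplicate dates, which would need to be aggregated first). The `compound_with_previous` helper is a hypothetical name.

```python
# (date, sales) rows, already in date order; a simplified subset of the 'q' rows.
daily_sales = [("2014-07-08", 0.51), ("2014-07-10", 0.50), ("2014-07-11", 0.60)]

def compound_with_previous(rows):
    """For each row, compute (1 + previous sales) * (1 + today's sales) - 1."""
    result = []
    previous = None
    for day, sales in rows:
        # The first row has no previous value, so the result is None (NULL).
        value = None if previous is None else (1 + previous) * (1 + sales) - 1
        result.append((day, sales, value))
        previous = sales
    return result

result = compound_with_previous(daily_sales)
for row in result:
    print(row)
```

For the second row this gives (1 + 0.51) * (1 + 0.50) - 1 = 1.265, up to floating-point rounding.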

Shift manipulation in SQL to get counts

I have attendance data in the following table, called Attendance.
EID is employee ID and in shift column, D denotes a Day shift and N denotes a Night shift.
Now I'm trying to get following data pertaining to each employee.
No of Day shifts - count of D,
No of Night shifts - count of N,
No of Days worked - no of days on which an employee has worked either shift or both shifts (even if an employee worked both Day and Night on the same day, it is counted as one day.)
I can get all three information in three different results as follows...
WITH CTE (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID, COUNT(*) AS DayTotal
FROM CTE
WHERE (shift = 'D')
GROUP BY EID
SELECT EID, COUNT(*) AS NightTotal
FROM Attendance
WHERE (shift = 'N')
GROUP BY EID
;
WITH CTE2 (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID, COUNT ( DISTINCT CONVERT (DATE, in_time)) AS [Days]
FROM CTE2
WHERE (shift = 'D' OR shift = 'N')
GROUP BY EID
But I want to have this in single result (table). So I tried following query but it's not giving the intended output.
WITH CTE (EID, in_time, shift) AS
(
SELECT EID, in_time, shift FROM Attendance
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND PID = 'A002'
)
SELECT EID,
CASE WHEN Shift = 'D' THEN COUNT(Shift) END AS [Day],
CASE WHEN Shift = 'N' THEN COUNT(Shift) END AS [Night],
COUNT ( DISTINCT CONVERT (DATE, in_time)) AS [Days]
FROM CTE
GROUP BY EID, shift
Could you please let me know a way to do this?
The intended result
I think you can get what you want using conditional aggregation:
SELECT EID,
sum(case when shift = 'd' then 1 else 0 end) as dayshifts,
sum(case when shift = 'n' then 1 else 0 end) as nightshifts,
count(*) as total
FROM Attendance a
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND
CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND
PID = 'A002'
GROUP BY EID;
EDIT:
If you want counts of distinct dates for the total, then use count(distinct):
SELECT EID,
sum(case when shift = 'd' then 1 else 0 end) as dayshifts,
sum(case when shift = 'n' then 1 else 0 end) as nightshifts,
count(distinct case when shift in ('d', 'n') then cast(in_time as date) end) as total
FROM Attendance a
WHERE (in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102) AND
CONVERT(DATETIME, '2014-07-31 00:00:00', 102)) AND
PID = 'A002'
GROUP BY EID;
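For a quick check that conditional aggregation counts a day with both shifts only once, here is a small runnable sketch using Python's built-in sqlite3 module. It is SQLite, not SQL Server, so date(In_Time) stands in for CAST(in_time AS DATE), and the four rows are a made-up subset chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Attendance (EID INTEGER, In_Time TEXT, Shift TEXT)")
conn.executemany(
    "INSERT INTO Attendance VALUES (?, ?, ?)",
    [
        (100, "2014-07-01 07:00:00", "D"),  # day and night shift on the same date
        (100, "2014-07-01 19:30:00", "N"),
        (102, "2014-07-03 19:30:00", "N"),
        (102, "2014-07-04 07:00:00", "D"),
    ],
)

rows = conn.execute(
    """
    SELECT EID,
           SUM(CASE WHEN Shift = 'D' THEN 1 ELSE 0 END) AS DayShifts,
           SUM(CASE WHEN Shift = 'N' THEN 1 ELSE 0 END) AS NightShifts,
           COUNT(DISTINCT date(In_Time))                AS DaysWorked
    FROM Attendance
    GROUP BY EID
    ORDER BY EID
    """
).fetchall()

for row in rows:
    print(row)
```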
WITH cte (eid, in_time, shift)
AS (SELECT eid,
in_time,
shift
FROM attendance
WHERE ( in_time BETWEEN CONVERT(DATETIME, '2014-01-07 00:00:00', 102)
AND
CONVERT(DATETIME,
'2014-07-31 00:00:00',
102
) )
AND pid = 'A002')
SELECT eid,
Sum(CASE
WHEN shift = 'D' THEN 1
ELSE 0
END) AS DayTotal,
Sum(CASE
WHEN shift = 'N' THEN 1
ELSE 0
END) AS NightTotal,
Count (DISTINCT CONVERT (DATE, in_time)) AS Days
FROM cte
GROUP BY eid
@Chathuranga, since Day and Night shifts on the same day should be counted as one, please let me know if the solution below works for you.
DECLARE @Attendance TABLE (EID INT,
PID CHAR(4),
In_Time DATETIME,
Out_Time DATETIME,
Shift CHAR(1))
INSERT INTO @Attendance
VALUES
('100', 'A001', '2014-07-01 07:00:00.000', '2014-07-01 19:30:00.000', 'D'),
('102', 'A001', '2014-07-01 19:30:00.000', '2014-07-02 07:00:00.000', 'N'),
('100', 'A001', '2014-07-01 19:30:00.000', '2014-07-02 07:00:00.000', 'N'),
('104', 'A001', '2014-07-02 07:00:00.000', '2014-07-02 19:30:00.000', 'D'),
('100', 'A001', '2014-07-03 19:30:00.000', '2014-07-04 07:00:00.000', 'N'),
('102', 'A001', '2014-07-03 19:30:00.000', '2014-07-04 07:00:00.000', 'N'),
('104', 'A001', '2014-07-03 07:00:00.000', '2014-07-03 19:30:15.000', 'D'),
('102', 'A001', '2014-07-04 07:00:00.000', '2014-07-04 19:30:00.000', 'D'),
('100', 'A001', '2014-07-04 07:00:00.000', '2014-07-04 19:30:10.000', 'D')
SELECT EID,
SUM(CASE
WHEN Shift = 'D' THEN 1
ELSE 0
END) AS DayShift,
SUM(CASE
WHEN Shift = 'N' THEN 1
ELSE 0
END) AS NightShift,
COUNT(DISTINCT CAST(In_Time AS DATE)) AS DayTotal
FROM @Attendance
GROUP BY EID

calculate between 2 row 2 column and different userID - SQL 2012

I have a pretty long list (10000+ records) where I need to calculate the time between the end of one row and the start of the next row for the same UserID.
So I have a UserID, StartDate (datetime) and EndDate (datetime).
I tried to get along with this code, but I do not get what I need: the time between Start and End for each individual user. How can I add the "UserID" part, to get the difference between the end of one row and the start of the next row for the same UserID?
declare @MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into @MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
with CTE_RN as
(
select
StartDate,
FinishDate,
ROW_NUMBER() OVER(ORDER BY StartDate) as RN
from @MyTable
)
select
f.FinishDate,
s.StartDate,
DATEDIFF(minute, f.FinishDate, s.StartDate) as DifHours
from CTE_RN as f
inner join CTE_RN as s
on s.RN = f.RN + 1
I edited this, as there was some confusion in my first description.
You could use LEAD() and do without the self-join; that might help your query a bit.
Can you post a sample dataset, to give a better understanding?
From your query it seems there are multiple rows per UserID; more detail would help.
Try this:
declare @MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into @MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
select
FinishDate New_Start_dt,
LEAD(startdate,1,NULL) over(partition by userid order by startdate) New_finish_dt,
DATEDIFF(minute,FinishDate,LEAD(StartDate,1,NULL) over(partition by userid order by StartDate)) as DifMinutes
from @MyTable
I think... this is the solution:
declare @MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into @MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
with CTE_RN as
(
select
UserID,
StartDate,
FinishDate,
ROW_NUMBER() OVER(ORDER BY UserID,StartDate) as RN
from @MyTable
)
select
s.UserID,
f.FinishDate,
s.StartDate,
DATEDIFF(minute, f.FinishDate, s.StartDate) as DifMin
from CTE_RN as f
inner join CTE_RN as s
on s.RN = f.RN + 1
where s.UserID=f.UserID
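To sanity-check the numbers either query should return, the per-user gaps can be computed by hand. The following is a minimal sketch in Python over the same five sample rows; the `gaps` list and the grouping approach are my own illustration, not part of either answer.

```python
from datetime import datetime
from itertools import groupby

rows = [
    (1, "2013-11-25 14:25", "2013-11-25 16:35"),
    (2, "2013-12-01 10:20", "2013-12-02 12:20"),
    (2, "2013-12-06 09:15", "2013-12-06 16:15"),
    (1, "2013-12-08 08:00", "2013-12-08 16:30"),
    (1, "2013-12-09 07:45", "2013-12-15 09:45"),
]

fmt = "%Y-%m-%d %H:%M"
# Sort by user, then by start time (like PARTITION BY UserID ORDER BY StartDate).
parsed = sorted(
    (uid, datetime.strptime(start, fmt), datetime.strptime(finish, fmt))
    for uid, start, finish in rows
)

gaps = []
for uid, group in groupby(parsed, key=lambda r: r[0]):
    sessions = list(group)
    # Pair each row with the next row of the same user.
    for (_, _, finish), (_, next_start, _) in zip(sessions, sessions[1:]):
        minutes = int((next_start - finish).total_seconds() // 60)
        gaps.append((uid, minutes))

print(gaps)  # [(1, 18205), (1, 915), (2, 5575)]
```

The last row of each user has no following row, so it produces no gap here; with LEAD() it would show as NULL instead.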