calculate between 2 row 2 column and different userID - SQL 2012 - sql

I have a pretty long list (10000+ records) where I need to calculate the time between end and start off the related UserID.
So - I have an UserID, StartDate (datetime) and EndDate (datetime).
I tried to get along with this code - but I do not get what I need - the time between Start and End of each individual User. How can I add the "UserID" part - to get the difference between end and start of next row of a UserID?
declare #MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into #MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
with CTE_RN as
(
select
StartDate,
FinishDate,
ROW_NUMBER() OVER(ORDER BY StartDate) as RN
from #MyTable
)
select
f.FinishDate,
s.StartDate,
DATEDIFF(minute, f.FinishDate, s.StartDate) as DifHours
from CTE_RN as f
inner join CTE_RN as s
on s.RN = f.RN + 1
I edit this - as there was some confusion in my first description.

You could use the LEAD() and do without the self-join, that might help your query a bit.
Can you put up in a sample dataset to get a better understanding.
From your query it seems there are multiple rows for one set of useridet, userid. Need more info clarity...
Try this...
declare #MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into #MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
select
FinishDate New_Start_dt,
LEAD(startdate,1,NULL) over(partition by userid order by startdate) New_finish_dt,
DATEDIFF(minute,FinishDate,LEAD(StartDate,1,NULL) over(partition by userid order by StartDate)) as DifHours
from #MyTable

I think... this is the solution:
declare #MyTable table
(UserID int, StartDate datetime, FinishDate datetime);
insert into #MyTable values
('1', '2013-11-25 14:25', '2013-11-25 16:35'),
('1', '2013-12-08 08:00', '2013-12-08 16:30'),
('2', '2013-12-06 09:15', '2013-12-06 16:15'),
('2', '2013-12-01 10:20', '2013-12-02 12:20'),
('1', '2013-12-09 07:45', '2013-12-15 09:45');
with CTE_RN as
(
select
UserID,
StartDate,
FinishDate,
ROW_NUMBER() OVER(ORDER BY UserID,StartDate) as RN
from #MyTable
)
select
s.UserID,
f.FinishDate,
s.StartDate,
DATEDIFF(minute, f.FinishDate, s.StartDate) as DifMin
from CTE_RN as f
inner join CTE_RN as s
on s.RN = f.RN + 1
where s.UserID=f.UserID

Related

How to display months sorted in order in SQL Server?

Below is the table I have created and inserted values in it:
CREATE TABLE employees_list
(
employeeID int identity(1,1),
employeeName varchar(25)
)
GO
INSERT INTO employees_list VALUES ('Kevin'),('Charles')
GO
CREATE TABLE hourlyRates
(
employeeID int,
rate int,
rateDate date
)
INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'),
(1, 39, '2016-02-01'),
(2, 43, '2016-01-01'),
(2, 57, '2016-02-01')
CREATE TABLE workingHours
(
employeeID int,
startdate datetime,
enddate datetime
)
GO
INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'),
(1, '2016-01-02 09:00', '2016-01-02 17:00'),
(1, '2016-02-01 10:00', '2016-02-01 16:00'),
(1, '2016-02-02 11:00', '2016-02-02 13:00'),
(2, '2016-01-01 10:00', '2016-01-01 16:00'),
(2, '2016-01-02 08:00', '2016-01-02 14:00'),
(2, '2016-02-01 14:00', '2016-02-01 19:00'),
(2, '2016-02-02 13:00', '2016-02-02 16:00')
GO
SELECT * FROM employees_list
SELECT * FROM hourlyRates
SELECT * FROM workingHours
Then I ran a query to calculate salaries paid to Employees each month:
SELECT
employeeName,
DATENAME(MONTH, startdate) AS 'Month',
SUM(DATEDIFF(HOUR, startdate, enddate) * rate) AS 'Total Salary'
FROM
hourlyRates, workingHours, employees_list
WHERE
hourlyRates.employeeID = workingHours.employeeID
AND employees_list.employeeID = workingHours.employeeID
AND (hourlyRates.rateDate BETWEEN DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH, workingHours.startDate),1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH, workingHours.endDate),1))
GROUP BY
employeeName, DATENAME(MONTH, startdate)
And I got the following output:
As you can see from the screenshot above that I got the result I wanted.
But the only issue is the month is not being displayed in order.
I tried adding ORDER BY DATENAME(MONTH, startdate) and still the order of month is not being sorted.
I even tried ORDER BY DATEPART(MM, startdate) but it is showing error mentioning that it is not contained in an aggregate function or GROUP BY clause.
What minor change do I need to make in my query ?
Why add ORDER BY DATENAME(MONTH,startdate) not work
Because the ORDER depends on character instead of the month of number.
You can try to add MONTH(startdate) in ORDER BY & GROUP BY, because you might need to add non-aggregate function in GROUP BY
SELECT employeeName,DATENAME(MONTH,startdate) AS 'Month',
SUM(DATEDIFF(HOUR,startdate,enddate) * rate) AS 'Total Salary'
FROM hourlyRates
INNER JOIN workingHours
ON hourlyRates.employeeID = workingHours.employeeID
INNER JOIN employees_list
ON employees_list.employeeID = workingHours.employeeID
WHERE
(hourlyRates.rateDate
BETWEEN DATEFROMPARTS(DATEPART(YEAR, workingHours.startDate), DATEPART(MONTH,workingHours.startDate),1)
AND DATEFROMPARTS(DATEPART(YEAR, workingHours.endDate), DATEPART(MONTH,workingHours.endDate),1))
GROUP BY employeeName,DATENAME(MONTH,startdate),MONTH(startdate)
ORDER BY MONTH(startdate)
sqlfiddle
NOTE
I would use INNER JOIN ANSI syntax instead of , which mean CROSS JOIN because JOIN syntax is generally considered more readable.
As mentioned, ORDER BY DATENAME will sort by the textual name of the month not by the actual ordering of months.
It's best to just group and sort by EOMONTH, then you can pull out the month name from that in the SELECT
Further improvements:
Always use explicit join syntax, not old-style , comma joins.
Give tables short aliases, to make your query more readable.
Your date interval check might not be quite right, and you may need to also adjust the rate caluclation, but I don't know without further info.
A more accurate calculation would probably mean calculating part-dates.
SELECT
e.employeeName,
DATENAME(month, EOMONTH(wh.startdate)) AS Month,
SUM(DATEDIFF(HOUR, wh.startdate, wh.enddate) * hr.rate) AS [Total Salary]
FROM hourlyRates hr
JOIN workingHours wh ON hr.employeeID = wh.employeeID
AND hr.rateDate
BETWEEN DATEFROMPARTS(YEAR(wh.startDate), MONTH(wh.startDate), 1)
AND DATEFROMPARTS(YEAR(wh.endDate), MONTH(wh.endDate), 1)
JOIN employees_list e ON e.employeeID = wh.employeeID
GROUP BY
e.employeeId,
e.employeeName,
EOMONTH(wh.startdate)
ORDER BY
EOMONTH(wh.startdate),
e.employeeName;
db<>fiddle

SQL : 12 hr shift function from 7am - 7am

I need to query all production records by 12 hr. shift. 7am-7am. if the date is after midnight and before 7am it's actually the previous day shift. In the below example I need to make them all 2022-01-01 like the last column. If I query by 2022-01-01 I don't get all the rows. Can I use a function for this to compare the time and make it the previous day?
declare #temp table
(
Emp_id int,
Time datetime,
shiftDate date
);
insert into #temp values (1, '2022-01-01 08:10:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-01 10:21:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-01 13:10:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-01 22:22:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-02 02:15:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-02 04:22:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-02 06:18:00:000', '2022-01-01')
insert into #temp values (1, '2022-01-02 06:55:00:000', '2022-01-01')
select * from #temp
select * from #temp
where convert(date, [time]) = '2022-01-01'

Get all dates between provided dates

I have this table and sample data. I want to get the entire month's or specific dates attendance and information like hours he worked or days he was absent.
CREATE TABLE Attendance
(
[EmpCode] int,
[TimeIn] datetime,
[TimeOut] datetime
)
INSERT INTO Attendance VALUES (12, '2018-08-01 09:00:00', '2018-08-01 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-02 09:00:00', '2018-08-02 18:10:00');
INSERT INTO Attendance VALUES (12, '2018-08-03 09:25:00', '2018-08-03 16:56:00');
INSERT INTO Attendance VALUES (12, '2018-08-04 09:13:00', '2018-08-05 18:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-06 09:00:00', '2018-08-07 18:15:00');
INSERT INTO Attendance VALUES (12, '2018-08-07 09:27:00', '2018-08-08 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-08 09:35:00', '2018-08-09 17:21:00');
INSERT INTO Attendance VALUES (12, '2018-08-10 09:00:00', '2018-08-10 17:45:00');
INSERT INTO Attendance VALUES (12, '2018-08-11 09:50:00', '2018-08-11 17:31:00');
INSERT INTO Attendance VALUES (12, '2018-08-13 09:23:00', '2018-08-13 17:19:00');
INSERT INTO Attendance VALUES (12, '2018-08-15 09:21:00', '2018-08-15 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-16 09:00:00', '2018-08-16 17:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-17 09:34:00', '2018-08-17 17:29:00');
INSERT INTO Attendance VALUES (12, '2018-08-18 09:00:00', '2018-08-18 17:10:00');
INSERT INTO Attendance VALUES (12, '2018-08-20 09:34:00', '2018-08-20 17:12:00');
INSERT INTO Attendance VALUES (12, '2018-08-21 09:20:00', '2018-08-21 17:15:00');
INSERT INTO Attendance VALUES (12, '2018-08-22 09:12:00', '2018-08-22 17:19:00');
INSERT INTO Attendance VALUES (12, '2018-08-23 09:05:00', '2018-08-23 17:21:00');
INSERT INTO Attendance VALUES (12, '2018-08-24 09:07:00', '2018-08-24 17:09:00');
INSERT INTO Attendance VALUES (12, '2018-08-25 09:12:00', '2018-08-25 17:05:00');
INSERT INTO Attendance VALUES (12, '2018-08-27 09:21:00', '2018-08-27 17:46:00');
INSERT INTO Attendance VALUES (12, '2018-08-28 09:17:00', '2018-08-28 17:12:00');
INSERT INTO Attendance VALUES (12, '2018-08-29 09:00:00', '2018-08-29 17:36:00');
INSERT INTO Attendance VALUES (12, '2018-08-30 09:12:00', '2018-08-30 17:24:00');
I have a query that tells how many hours employee have worked, but it is only showing days on which data was present in table. I want to show all dates between provided dates and in case there is no data it should NULL in columns.
Here is the query:
SELECT
[EmpCode],
FirstIN = CAST(MIN([TimeIn]) AS TIME),
LastOUT = CAST(MAX([TimeOut]) AS TIME),
CONVERT(VARCHAR(6), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME))/3600)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), (Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 3600) / 60), 2)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 60) , 2 ) AS HoursSpent,
CAST(COALESCE(TimeIn, TimeOut) AS DATE) [Date]
FROM Attendance
WHERE CAST(COALESCE(TimeIn, TimeOut) AS DATE) BETWEEN '2018-08-01' AND '2018-08-25'
GROUP BY EmpCode, TimeIn, TimeOut
For that you need to use recursive way to generate possible dates :
with t as (
select '2018-08-01' as startdt
union all
select dateadd(day, 1, startdt)
from t
where startdt < '2018-08-25'
)
select . . .
from t left join
Attendance at
on cast(coalesce(at.TimeIn, at.TimeOut) as date) = t.startdt;
Just make sure to use date from t instead of Attendance table in SELECT statement.
Note : If you have a large no of date period, then don't forgot to use Query hint OPTION (MAXRECURSION 0), By defalut it has 100 recursion levels.
You May Try Recursive CTE to populate the Dates and Then Join With that to Get the Interval
DECLARE #From DATETIME = '2018-08-01' ,#To DATETIME= '2018-08-25'
;WITH CTE
AS
(
SELECT
[EmpCode] EmpId,
MyDate = #From
FROM Attendance A
UNION ALL
SELECT
EmpId,
MyDate = DATEADD(DAY,1,MyDate)
FROM CTE
WHERE MyDate < #To
)
SELECT
[EmpCode] = CTE.EmpId,
CTE.MyDate,
FirstIN = CAST(MIN([TimeIn]) AS TIME),
LastOUT = CAST(MAX([TimeOut]) AS TIME),
CONVERT(VARCHAR(6), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME))/3600)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), (Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 3600) / 60), 2)
+ ':'
+ RIGHT('0' + CONVERT(VARCHAR(2), Datediff(second, CAST(MIN([TimeIn]) AS TIME), CAST(MAX([TimeOut]) AS TIME)) % 60) , 2 )
AS HoursSpent,
CAST(CTE.MyDate AS DATE) [Date]
FROM CTE
LEFT JOIN Attendance A
ON A.EmpCode = CTE.EmpId
AND CAST(CTE.MyDate AS DATE) = CAST(COALESCE(TimeIn, TimeOut) AS DATE)
GROUP BY CTE.EmpId, TimeIn, TimeOut,CTE.MyDate
ORDER BY 6
A different method, using a Tally Table. The advantage here is that an rCTE is a form of RBAR. The idea of a Tally table isn't as obvious, but is quicker, and also, won't need the OPTION (MAXRECURSION 0) added if you have more than 100 days. in fact, this example handles up to 10,000 days, which shuold be more than enough:
DECLARE #EmpCode int = 12;
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1 --10
CROSS JOIN N N2 --100
CROSS JOIN N N3 --1000
CROSS JOIN N N4 --10000
),
Dates AS(
SELECT DATEADD(DAY, T.I, TT.MinTimeIn) AS CalendarDate,
#EmpCode AS EmpCode
FROM Tally T
CROSS APPLY (SELECT MIN(CONVERT(date,TimeIn)) AS MinTimeIn,
MAX(CONVERT(date,TimeOut)) AS MaxTimeOut
FROM Attendance
WHERE EmpCode = #EmpCode) TT
WHERE DATEADD(DAY, T.I, TT.MinTimeIn) <= CONVERT(date, TT.MaxTimeOut))
SELECT CalendarDate
EmpCode,
TimeIn,
TimeOut
FROM Dates D
LEFT JOIN Attendance A ON D.CalendarDate = CONVERT(date,A.TimeIn)
AND D.EmpCode = A.EmpCode;

SQL Server 2014 - Use previous value when date not present

I asked a similar question yesterday but I was not very good in my description of what I wanted. This will be far clearer.
Lead/Lag is not getting me what I need. Its close, but not enough.
Using SQL Server 2014 for client, actual server built on SQL 2012.
Here is my code:
Creating Team Table
CREATE TABLE ##TeamTable
([UserID] varchar(50), [CurrentTeam] varchar(5), [ChangeDate] datetime)
;
INSERT INTO ##TeamTable
([UserID], [CurrentTeam], [ChangeDate])
VALUES
('User1', 'Team1', '6/1/2016'),
('User1', 'Team2', '9/1/2016'),
('User1', 'Team3', '12/1/2016'),
('User2', 'Team1', '4/1/2016'),
('User2', 'Team2', '10/1/2016'),
('User2', 'Team3', '11/1/2016');
Now to create data table I need to join to
CREATE TABLE ##DataTable
([UserID] varchar(50), Month_sk datetime, Media varchar(50), NCO int)
INSERT INTO ##DataTable
([UserID] , Month_sk , Media , NCO )
VALUES
('User1', '2016-06-01 00:00:00', 'Fax', 100),
('User1', '2016-06-01 00:00:00', 'Voice', 120),
('User1', '2016-07-01 00:00:00', 'Voice', 90),
('User1', '2016-07-01 00:00:00', 'Email', 100),
('User1', '2016-08-01 00:00:00', 'Voice', 150),
('User1', '2016-08-01 00:00:00', 'Email', 100),
('User1', '2016-09-01 00:00:00', 'Voice', 100),
('User1', '2016-09-01 00:00:00', 'Email', 120),
('User1', '2016-10-01 00:00:00', 'Voice', 90),
('User1', '2016-10-01 00:00:00', 'Email', 100),
('User1', '2016-11-01 00:00:00', 'Voice', 150),
('User1', '2016-11-01 00:00:00', 'Email', 100),
('User1', '2016-12-01 00:00:00', 'Voice', 150),
('User1', '2016-12-01 00:00:00', 'Email', 100),
('User2', '2016-04-01 00:00:00', 'Fax', 100),
('User2', '2016-04-01 00:00:00', 'Voice', 120),
('User2', '2016-05-01 00:00:00', 'Fax', 100),
('User2', '2016-05-01 00:00:00', 'Voice', 120),
('User2', '2016-06-01 00:00:00', 'Fax', 100),
('User2', '2016-06-01 00:00:00', 'Voice', 120),
('User2', '2016-07-01 00:00:00', 'Voice', 90),
('User2', '2016-07-01 00:00:00', 'Email', 100),
('User2', '2016-08-01 00:00:00', 'Voice', 150),
('User2', '2016-08-01 00:00:00', 'Email', 100),
('User2', '2016-09-01 00:00:00', 'Voice', 100),
('User2', '2016-09-01 00:00:00', 'Email', 120),
('User2', '2016-10-01 00:00:00', 'Voice', 90),
('User2', '2016-10-01 00:00:00', 'Email', 100),
('User2', '2016-11-01 00:00:00', 'Voice', 150),
('User2', '2016-11-01 00:00:00', 'Email', 100),
('User2', '2016-12-01 00:00:00', 'Voice', 150),
('User2', '2016-12-01 00:00:00', 'Email', 100);
Here is a basic Select to show whats going on:
SELECT b.UserID
,b.Media
,b.NCO
,Month_sk
,CurrentTeam
FROM ##DataTable b
LEFT OUTER JOIN ##TeamTable a on b.UserID = a.UserID and b.Month_sk = a.ChangeDate
order by UserID, Month_sk, media
This gives me a result set that looks like this:
What I need is for where I have nulls, that it would be pulling in the previous team name that's not null. So in User1 case, those 4 nulls for months of July and August would say Team1 since that was the team he was last on. Same for the nulls after Team2, those should say Team2.
Lead/Lag is close or I'm not using it right. Hopefully with all this code, this makes someone's jobs way easier.
UPDATE:
Lag/Lead gives same results. Still need the nulls to fill in
SELECT b.UserID
,b.Media
,b.NCO
,Month_sk
,CurrentTeam
,LAG(CurrentTeam,1, currentteam) OVER(PARTITION BY a.userid, changedate ORDER BY ChangeDate) as Lag
FROM ##DataTable b
LEFT OUTER JOIN ##TeamTable a on b.UserID = a.UserID and b.Month_sk = a.ChangeDate
order by UserID, Month_sk, media
(Moving update notes to end)
I think the easiest solution (conceptually) is to join against all months up to month_sk and then filter to get only the last match. This "feels" potentially inefficient, so you'd want to test it with realistic data volume and if there's a problem then look for something better. (But "something better" may involve changes to the physical data model...)
So:
select userid, media, nco, month_sk, currentteam
from (SELECT b.UserID
, b.Media
, b.NCO
, Month_sk
, CurrentTeam
, rank() over(partition by b.userID
order by a.changeDate desc) n
FROM ##DataTable b
INNER JOIN ##TeamTable a
on b.UserID = a.UserID
and b.Month_sk >= a.ChangeDate
) x
where n = 1
order by UserID, Month_sk, media
Note that in previous versions I used row_number() over() instead of rank() over()... and you can do that, but if you do then you have to include in the partitioning key any data from the b table that could cause a duplication of a row from the a table during the join. Using rank ensures that all such duplicates share their rank as they ought to.
UPDATE - After I initially wrote this, I deleted it because I thought I'd misread your question; but as I was writing a replacement realized I may have had it right in the first place. So here it is, with a caveat:
This assumes that the only reason you get the NULL value is the outer join. If ever the "right hand" table has a row and just a value for a column therein is NULL, then getting the previous value for that column would require further work with subqueries or analytic funcitons. But even then lead/lag may not work, since they are position based. (I think something with LAST_VALUE might be more suitable, but will leave the details of that unless it's needed.)
UPDATE 2 - based on your description of the data model in below comments, I'm changing the query to show an inner join as it sounds like that will work (once you broaden the join criteria) and should be more efficient.
UPDATE 3 - I did misread your sample data and got the partitioning expression for calculating n wrong. Should be fixed assuming the values from the b table are unique. If not it's still fixable but requires more trickery...
You can do this with an APPLY and a sub query like this.
SELECT
userid,
media,
nco,
month_sk,
currentteam
FROM
##DataTable td
OUTER APPLY (
SELECT TOP (1)
CurrentTeam,
ChangeDate
FROM
##TeamTable tt
WHERE
tt.UserID = td.UserID
and tt.ChangeDate <= td.Month_sk
ORDER BY
tt.ChangeDate desc
) dataTableWithTeam
ORDER BY
td.UserID,
td.Month_sk,
td.media
In this version, I first identify the appropriate "linking" month in the CTE, and then use that as a lookup in the final join. (It got much easier once I realized Media and NCO played no real part in the join.)
WITH cteDateLookup
as (
-- Get the ChangeDate for this User/Month
SELECT
b.UserID
,b.Month_sk
,max(a.ChangeDate) ChangeDate
from ##DataTable b
left outer join ##TeamTable a
on b.UserID = a.UserID
and b.Month_sk >= a.ChangeDate
group by
b.UserID
,b.Month_sk
)
-- Use the cte as a "lookup" for the appropriate date
SELECT
b.UserID
,b.Media
,b.NCO
,b.Month_sk
,a.CurrentTeam
from ##DataTable b
left outer join cteDateLookup cte
on cte.UserId = b.UserId
and b.Month_sk = cte.Month_sk
left outer join ##TeamTable a
on a.UserId = cte.UserId
and a.ChangeDate = cte.ChangeDate
order by
b.UserID
,b.Month_sk
,b.media

Summing up the records as per given conditions

I have a table like below, What I need that for any particular fund and up to any particular date logic will sum the amount value. Let say I need the sum for 3 dates as 01/28/2015,03/30/2015 and 04/01/2015. Then logic will check for up to first date how many records are there in table . If it found more than one record then it'll sum the amount value. Then for next date it'll sum up to the next date but from the previous date it had summed up.
Id Fund Date Amount
1 A 01/20/2015 250
2 A 02/28/2015 300
3 A 03/20/2015 400
4 A 03/30/2015 200
5 B 04/01/2015 500
6 B 04/01/2015 600
I want result to be like below
Id Fund Date SumOfAmount
1 A 02/28/2015 550
2 A 03/30/2015 600
3 B 04/01/2015 1100
Based on your question, it seems that you want to select a set of dates, and then for each fund and selected date, get the sum of the fund amounts from the selected date to the previous selected date. Here is the result set I think you should be expecting:
Fund Date SumOfAmount
A 2015-02-28 550.00
A 2015-03-30 600.00
B 2015-04-01 1100.00
Here is the code to produce this output:
DECLARE #Dates TABLE
(
SelectedDate DATE PRIMARY KEY
)
INSERT INTO #Dates
VALUES
('02/28/2015')
,('03/30/2015')
,('04/01/2015')
DECLARE #FundAmounts TABLE
(
Id INT PRIMARY KEY
,Fund VARCHAR(5)
,Date DATE
,Amount MONEY
);
INSERT INTO #FundAmounts
VALUES
(1, 'A', '01/20/2015', 250)
,(2, 'A', '02/28/2015', 300)
,(3, 'A', '03/20/2015', 400)
,(4, 'A', '03/30/2015', 200)
,(5, 'B', '04/01/2015', 500)
,(6, 'B', '04/01/2015', 600);
SELECT
F.Fund
,D.SelectedDate AS Date
,SUM(F.Amount) AS SumOfAmount
FROM
(
SELECT
SelectedDate
,LAG(SelectedDate,1,'1/1/1900') OVER (ORDER BY SelectedDate ASC) AS PreviousDate
FROM #Dates
) D
JOIN
#FundAmounts F
ON
F.Date BETWEEN DATEADD(DAY,1,D.PreviousDate) AND D.SelectedDate
GROUP BY
D.SelectedDate
,F.Fund
EDIT: Here is alternative to the LAG function for this example:
FROM
(
SELECT
SelectedDate
,ISNULL((SELECT TOP 1 SelectedDate FROM #Dates WHERE SelectedDate < Dates.SelectedDate ORDER BY SelectedDate DESC),'1/1/1900') AS PreviousDate
FROM #Dates Dates
) D
If i change your incorrect sample data to ...
CREATE TABLE TableName
([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
;
INSERT INTO TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '2015-01-28 00:00:00', 250),
(2, 'A', '2015-01-28 00:00:00', 300),
(3, 'A', '2015-03-30 00:00:00', 400),
(4, 'A', '2015-03-30 00:00:00', 200),
(5, 'B', '2015-04-01 00:00:00', 500),
(6, 'B', '2015-04-01 00:00:00', 600)
;
this query using GROUP BY works:
SELECT MIN(Id) AS Id,
MIN(Fund) AS Fund,
[Date],
SUM(Amount) AS SumOfAmount
FROM dbo.TableName t
WHERE [Date] IN ('01/28/2015','03/30/2015','04/01/2015')
GROUP BY [Date]
Demo
Initially i have used Row_number and month function to pick max date of every month and in 2nd cte i did sum of amounts and joined them..may be this result set matches your out put
declare #t table (Id int,Fund Varchar(1),Dated date,amount int)
insert into #t (id,Fund,dated,amount) values (1,'A','01/20/2015',250),
(2,'A','01/28/2015',300),
(3,'A','03/20/2015',400),
(4,'A','03/30/2015',200),
(5,'B','04/01/2015',600),
(6,'B','04/01/2015',500)
;with cte as (
select ID,Fund,Amount,Dated,ROW_NUMBER() OVER
(PARTITION BY DATEDIFF(MONTH, '20000101', dated)ORDER BY dated desc)AS RN from #t
group by ID,Fund,DATED,Amount
),
CTE2 AS
(select SUM(amount)Amt from #t
GROUP BY MONTH(dated))
,CTE3 AS
(Select Amt,ROW_NUMBER()OVER (ORDER BY amt)R from cte2)
,CTE4 AS
(
Select DISTINCT C.ID As ID,
C.Fund As Fund,
C.Dated As Dated
,ROW_NUMBER()OVER (PARTITION BY RN ORDER BY (SELECT NULL))R
from cte C INNER JOIN CTE3 CC ON c.RN = CC.R
Where C.RN = 1
GROUP BY C.ID,C.Fund,C.RN,C.Dated )
select C.R,C.Fund,C.Dated,cc.Amt from CTE4 C INNER JOIN CTE3 CC
ON c.R = cc.R
declare #TableName table([Id] int, [Fund] varchar(1), [Date] datetime, [Amount] int)
declare #Sample table([SampleDate] datetime)
INSERT INTO #TableName
([Id], [Fund], [Date], [Amount])
VALUES
(1, 'A', '20150120 00:00:00', 250),
(2, 'A', '20150128 00:00:00', 300),
(3, 'A', '20150320 00:00:00', 400),
(4, 'A', '20150330 00:00:00', 200),
(5, 'B', '20150401 00:00:00', 500),
(6, 'B', '20150401 00:00:00', 600)
INSERT INTO #Sample ([SampleDate])
values ('20150128 00:00:00'), ('20150330 00:00:00'), ('20150401 00:00:00')
-- select * from #TableName
-- select * from #Sample
;WITH groups AS (
SELECT [Fund], [Date], [AMOUNT], MIN([SampleDate]) [SampleDate] FROM #TableName
JOIN #Sample ON [Date] <= [SampleDate]
GROUP BY [Fund], [Date], [AMOUNT])
SELECT [Fund], [SampleDate], SUM([AMOUNT]) FROM groups
GROUP BY [Fund], [SampleDate]
Explanation:
The CTE groups finds the earliest SampleDate which is later than (or equals to) your
data's date and enriches your data accordingly, thus giving them the group to be summed up in.
After that, you can group on the derived date.