I have a relatively simple query and now have to make what appears to be a simple change to that query. It's not going very well, though.
The query:
select filekey, eventdate, hourstype, hours from hourshist
yields the following results:
filekey eventdate hourstype hours
1 6/1/2018 1 9
1 6/1/2018 2 3
1 6/2/2018 1 8
This was fine until a change was requested: if hourstype 1 and 2 occur on the same day, hourstype remains 1 and the hours (of type 1 or 2) are summed. Any other hourstype (3, 4, 5, etc.) would not be summed and would show as its own row. Days that have either hourstype 1 or 2 but not both should not change at all. It should yield the following results:
filekey eventdate hourstype hours
1 6/1/2018 1 12
1 6/2/2018 1 8
Thanks for your help.
To avoid over-complicating your aggregation by using case statements I would use two queries and a union to keep it fairly simple. The following produces your desired results in my environment.
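-- MIN(hourstype) yields 1 when both types occur on a day and leaves a lone type 2 as 2.
-- UNION ALL keeps duplicate rows of other types; the two halves never overlap.
SELECT filekey, eventdate, MIN(hourstype) AS 'hourstype', SUM(hours) AS 'hours' FROM hourshist
WHERE hourstype IN (1,2)
GROUP BY filekey, eventdate
UNION ALL
SELECT filekey, eventdate, hourstype, hours FROM hourshist
WHERE hourstype NOT IN (1,2);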
I used the following to create the test environment you described:
CREATE TABLE hourshist
(
filekey INT,
eventdate DATE,
hourstype INT,
hours INT
);
INSERT INTO hourshist
VALUES (1,'6/1/2018',1,9)
,(1,'6/1/2018',2,3)
,(1,'6/2/2018',1,8);
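The same idea can be checked end to end with a small sketch. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, uses MIN(hourstype) so that a day with only type 2 keeps its type, and assumes at most one row per type per day. The two rows for 2018-06-03 are hypothetical additions that exercise the "type 2 alone" and "other type" cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hourshist (filekey INT, eventdate TEXT, hourstype INT, hours INT);
INSERT INTO hourshist VALUES
  (1, '2018-06-01', 1, 9),
  (1, '2018-06-01', 2, 3),
  (1, '2018-06-02', 1, 8),
  (1, '2018-06-03', 2, 4),   -- hypothetical: type 2 alone should stay type 2
  (1, '2018-06-03', 3, 5);   -- hypothetical: other types pass through unchanged
""")

query = """
SELECT filekey, eventdate, MIN(hourstype) AS hourstype, SUM(hours) AS hours
FROM hourshist
WHERE hourstype IN (1, 2)
GROUP BY filekey, eventdate
UNION ALL
SELECT filekey, eventdate, hourstype, hours
FROM hourshist
WHERE hourstype NOT IN (1, 2)
ORDER BY filekey, eventdate, hourstype
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On 2018-06-01 the two rows collapse into a single type 1 row with 12 hours, while the other days come back unchanged.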
I have a table for which I have to perform a rather complex filter: first a filter by date is applied, but then records from the previous and next days should be included if their time difference does not exceed 8 hours compared to its prev or next record (depending if the date is less or greater than filter date).
For those adjacent days the selection should stop at the first record that does not satisfy this condition.
This is what my raw data looks like:
Id   Desc           EntryDate
1    Event type 1   2021-03-12 21:55:00.000
2    Event type 1   2021-03-12 01:10:00.000
3    Event type 1   2021-03-11 20:17:00.000
4    Event type 1   2021-03-11 05:04:00.000
5    Event type 1   2021-03-10 23:58:00.000
6    Event type 1   2021-03-10 11:01:00.000
7    Event type 1   2021-03-10 10:00:00.000
In this example set, if my filter date is '2021-03-11', my expected result set should be all records from that day plus adjacent records from 03-12 and 03-10 that satisfy the 8-hour condition. Note how the record with Id 7 is not included because the record with Id 6 does not comply:
Id   EntryDate
2    2021-03-12 01:10:00.000
3    2021-03-11 20:17:00.000
4    2021-03-11 05:04:00.000
5    2021-03-10 23:58:00.000
I need advice on how to write this complex query.
This is a variant of gaps-and-islands. Define the difference . . . and then groups based on the differences:
with e as (
      -- grp increases by 1 whenever the gap to the previous row is 8 hours or more
      select t.*,
             sum(case when prev_entrydate > dateadd(hour, -8, entrydate) then 0 else 1 end) over (order by entrydate) as grp
      from (select t.*,
                   lag(entrydate) over (order by entrydate) as prev_entrydate
            from t
           ) t
     )
select e.*
from e
where e.grp in (select e2.grp
                from e e2
                where convert(date, e2.entrydate) = @filterdate
               );
Note: I'm not sure exactly how filter date is applied. This assumes that it is any events on the entire day, which means that there might be multiple groups. If there is only one group (say the first group on the day), the query can be simplified a bit from a performance perspective.
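To see the grouping at work on the sample rows, here is a minimal sketch that runs on SQLite through Python's sqlite3 module, not on SQL Server. The julianday() arithmetic and date() stand in for DATEADD and the date conversion, and the CTE name lagged is only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (Id INT, EntryDate TEXT);
INSERT INTO t VALUES
  (1, '2021-03-12 21:55:00'), (2, '2021-03-12 01:10:00'),
  (3, '2021-03-11 20:17:00'), (4, '2021-03-11 05:04:00'),
  (5, '2021-03-10 23:58:00'), (6, '2021-03-10 11:01:00'),
  (7, '2021-03-10 10:00:00');
""")

query = """
WITH lagged AS (
    SELECT Id, EntryDate,
           LAG(EntryDate) OVER (ORDER BY EntryDate) AS prev_entrydate
    FROM t
),
e AS (
    -- A row starts a new group when it has no predecessor
    -- or when the gap to the previous row is more than 8 hours.
    SELECT Id, EntryDate,
           SUM(CASE WHEN (julianday(EntryDate) - julianday(prev_entrydate)) * 24 <= 8
                    THEN 0 ELSE 1 END) OVER (ORDER BY EntryDate) AS grp
    FROM lagged
)
SELECT Id, EntryDate
FROM e
WHERE grp IN (SELECT grp FROM e WHERE date(EntryDate) = ?)
ORDER BY EntryDate DESC
"""

rows = conn.execute(query, ("2021-03-11",)).fetchall()
for row in rows:
    print(row)
```

The sample data falls into four groups, {7, 6}, {5, 4}, {3, 2} and {1}; the two groups that touch 2021-03-11 are kept, which gives Ids 2, 3, 4 and 5.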
declare @DateTime datetime = '2021-03-11'
select *
from t
where t.EntryDate between DATEADD(hour , -8 , @DateTime) and DATEADD(hour , 32 , @DateTime)
I have a table with information on races that have taken place. It holds the participants who took part, where they finished in the race, and what time they finished. I would like to add a time difference column which shows how far each participant was behind the winner.
Race ID Finish place Time Name
1 1 00:00:10 Matt
1 2 00:00:11 Mick
1 3 00:00:17 Shaun
2 1 00:00:13 Claire
2 2 00:00:15 Helen
What I would like to see:
Race ID Finish place Time Time Dif Name
1 1 00:00:10 Matt
1 2 00:00:11 00:00:01 Mick
1 3 00:00:17 00:00:07 Shaun
2 1 00:00:13 Claire
2 2 00:00:15 00:00:02 Helen
I have seen similar questions asked but I was unable to relate it to my problem.
My initial idea was to have a number of derived tables which filtered out by finish place but there could be more than 10 racers so things would start to get messy. I'm using Management Studio 2012
You can use min() as a window function:
select t.*,
(case when time <> min_time then time - min_time
end) as diff
from (select t.*, min(t.time) over (partition by t.race_id) as min_time
from t
) t
I would be more inclined to express this as seconds:
(case when time <> min_time then datediff(second, min_time, time)
end) as diff
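A minimal sketch of the seconds variant, run on SQLite through Python's sqlite3 module rather than on SQL Server. The table name races and the column name finish_time are assumptions made for the example, and strftime('%s', ...) stands in for DATEDIFF(second, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE races (race_id INT, finish_place INT, finish_time TEXT, name TEXT);
INSERT INTO races VALUES
  (1, 1, '00:00:10', 'Matt'), (1, 2, '00:00:11', 'Mick'), (1, 3, '00:00:17', 'Shaun'),
  (2, 1, '00:00:13', 'Claire'), (2, 2, '00:00:15', 'Helen');
""")

query = """
SELECT race_id, finish_place, name,
       -- NULL for the winner, otherwise seconds behind the winner
       CASE WHEN finish_time <> min_time
            THEN CAST(strftime('%s', finish_time) AS INTEGER)
               - CAST(strftime('%s', min_time) AS INTEGER)
       END AS diff_seconds
FROM (SELECT r.*, MIN(r.finish_time) OVER (PARTITION BY r.race_id) AS min_time
      FROM races r) AS x
ORDER BY race_id, finish_place
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

MIN over the zero-padded HH:MM:SS strings works here because they sort the same way as the times they represent.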
Using http://www.convertcsv.com/csv-to-sql.htm to build example data:
DROP TABLE IF EXISTS mytable
CREATE TABLE mytable(
Race_ID INTEGER
,Finish_place INTEGER
,Time VARCHAR(30)
,Name VARCHAR(30)
);
INSERT INTO mytable(Race_ID,Finish_place,Time,Name) VALUES (1, 1,'00:00:10','Matt');
INSERT INTO mytable(Race_ID,Finish_place,Time,Name) VALUES (1, 2,'00:00:11','Mick');
INSERT INTO mytable(Race_ID,Finish_place,Time,Name) VALUES (1, 3,'00:00:17','Shaun');
INSERT INTO mytable(Race_ID,Finish_place,Time,Name) VALUES (2, 1,'00:00:13','Claire');
INSERT INTO mytable(Race_ID,Finish_place,Time,Name) VALUES (2, 2,'00:00:15','Helen');
A CTE holding only the first-place finishers would be easier to understand.
WITH CTE_FIRST
AS (
SELECT
M.Race_ID
,M.Finish_place
,M.Time
,M.Name
FROM mytable M
WHERE M.Finish_place = 1
)
SELECT
M.Race_ID
,M.Finish_place
,M.Time
,CASE
WHEN m.Finish_place = 1
THEN NULL
ELSE CONVERT(VARCHAR, DATEADD(ss, DATEDIFF(SECOND, c.Time, M.Time), 0), 108)
END AS [Time Dif]
,M.Name
FROM mytable M
INNER JOIN CTE_FIRST c
ON M.Race_ID = c.Race_ID
You can use window functions. MIN([time]) OVER (PARTITION BY race_id ORDER BY finish_place) gives the first row's time value in the same race, that is, the winner's time. DATEDIFF(SECOND, MIN([time]) OVER (PARTITION BY race_id ORDER BY finish_place), [time]) gives the difference in seconds.
I have data in a table in the following format:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
1 127 1.0 0 01-Mar-19 10-Mar-19
2 127 1.0 NULL 11-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
And I want result like this:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
2 127 1.0 NULL 01-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
The idea is:
If the same job appears in consecutive rows, return a single row with the max Id, the min FromDate, and the max ToDate. For example, for employee 127 the first and second rows have the same job number and the third row is different, so the first and second rows are merged into one row with the minimum FromDate and maximum ToDate, and the third row is returned as is.
If a job number differs from the next job number, all rows are returned.
For example, for employee 136 the first job number differs from the second, and the second differs from the third, so all rows are returned.
You can group by jobNumber and EmployeeCode and use the Max/Min-Aggregate-Functions to get the dates you want
I doubt you will get a result from simple set-based queries.
So my advice: Declare a cursor on SELECT DISTINCT EmployeeCode .... Within that cursor select all rows with that EmployeeCode. Work in this set to figure out your values and construct a resultset from that.
This is an example of a gaps and islands problem. The solution here is to define the "islands" by their starts, so the process is:
determine when a new grouping begins (i.e. no overlap with previous row)
do a cumulative sum of the starts to get the grouping value
aggregate
This looks like
select max(id), EmployeeCode, JobNumber,
min(fromdate), max(todate)
from (select t.*,
sum(case when fromdate = dateadd(day, 1, prev_todate) then 0 else 1 end) over
(partition by EmployeeCode, JobNumber order by id
) as grouping
from (select t.*,
lag(todate) over (partition by EmployeeCode, JobNumber order by id) as prev_todate
from t
) t
) t
group by grouping, EmployeeCode, JobNumber;
It is unclear what the logic is for TransferNo. The simplest solution is just min() or max(), but that will not return NULL.
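The three steps can be tried on the sample rows with a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, stores the dates as ISO strings, uses date(x, '+1 day') in place of DATEADD, and leaves TransferNo out of the result because its logic is unclear. The CTE names lagged and flagged are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (Id INT, EmployeeCode INT, JobNumber TEXT, TransferNo INT,
                FromDate TEXT, ToDate TEXT);
INSERT INTO t VALUES
  (1, 127, '1.0', 0,    '2019-03-01', '2019-03-10'),
  (2, 127, '1.0', NULL, '2019-03-11', '2019-03-15'),
  (3, 127, 'J-1', 1,    '2019-03-16', NULL),
  (4, 136, '1.0', 0,    '2019-03-01', '2019-03-15'),
  (5, 136, 'J-1', 1,    '2019-03-16', '2019-03-20'),
  (6, 136, '1.0', 2,    '2019-03-21', NULL);
""")

query = """
WITH lagged AS (
    -- Step 1: fetch the previous ToDate for the same employee and job
    SELECT t.*,
           LAG(ToDate) OVER (PARTITION BY EmployeeCode, JobNumber ORDER BY Id) AS prev_todate
    FROM t
),
flagged AS (
    -- Step 2: cumulative sum of "island starts" gives a group number
    SELECT lagged.*,
           SUM(CASE WHEN FromDate = date(prev_todate, '+1 day') THEN 0 ELSE 1 END)
               OVER (PARTITION BY EmployeeCode, JobNumber ORDER BY Id) AS grp
    FROM lagged
)
-- Step 3: aggregate each island
SELECT MAX(Id) AS Id, EmployeeCode, JobNumber,
       MIN(FromDate) AS FromDate, MAX(ToDate) AS ToDate
FROM flagged
GROUP BY EmployeeCode, JobNumber, grp
ORDER BY MAX(Id)
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows 1 and 2 merge into one island for employee 127, while rows 4 and 6 for employee 136 stay separate because 21-Mar-19 does not directly follow 15-Mar-19.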
I have a data set where I need to count patient visits according to the following rules:
Two or more visits to the same doctor in the same day count as 1 visit, regardless of the reason
Two or more visits to different doctors for the same reason count as 1 visit
Two or more visits to different doctors on the same day for different reasons count as two or more visits.
Example data:
DoctorId PatientId VisitDate ReasonCode RowId
-------- --------- --------- ---------- -----
1 100 2014-01-01 200 1
1 100 2014-01-01 210 2
2 100 2014-01-01 200 3
2 100 2014-01-11 300 4
1 100 2014-01-15 200 5
2 400 2014-01-15 200 6
In this example, my final count would be based on grouping rowId 1, 2, 3 for 1 visit; grouping row 4 as 1 visit, grouping row 5 as 1 visit for a total of 3 visits for patient 100. Patient 400 has 1 visit as well.
patientid visitdate numberofvisits
--------- --------- --------------
100 2014-01-01 3
100 2014-01-11 1
100 2014-01-15 1
400 2014-01-15 1
Where I'm stuck is how to handle the group by so that I get the different scenarios covered. If the grouping were doctor, date, I'd be fine. If it were doctor, date, ReasonCode, I'd be fine. It's the logic of the doctorId and the ReasonCode in the scenario where 2 doctors are involved, and doctorid and date in the other when it's the same doctor. I've not been deeply into Sql Server in a long time, so it's possible that a common table expression is the solution and I'm not seeing it. I'm using Sql Server 2014 and there's decent latitude in performance. I am looking for a Sql Server query that produces the results above. As best I can tell, there's no way to group this the way I need it counted.
The answer was an except clause and grouping each of the sets before a final count. Sometimes, we over-complicate things.
DECLARE @tblAllData TABLE
(
DoctorId INT NOT NULL
, PatientId INT NOT NULL
, VisitDate DATE NOT NULL
, ReasonCode INT NOT NULL
, RowId INT NOT NULL
)
INSERT @tblAllData
SELECT
1,100,'2014-01-01',200,1
UNION ALL
SELECT
1,100,'2014-01-01',210,2
UNION ALL
SELECT
2,100,'2014-01-01',200,3
UNION ALL
SELECT
2,100,'2014-01-11',300,4
UNION ALL
SELECT
1,100,'2014-01-15',200,5
UNION ALL
SELECT
2,400,'2014-01-15',200,6
DECLARE @tblTempCountedRows AS TABLE
(
PatientId INT NOT NULL
, VisitDate DATE
, ReasonCode INT
)
INSERT @tblTempCountedRows
SELECT PatientId, VisitDate,0
FROM @tblAllData
GROUP BY PatientId, DoctorId, VisitDate
EXCEPT
SELECT PatientId, VisitDate, ReasonCode
FROM @tblAllData
GROUP BY PatientId, VisitDate, ReasonCode
select * from @tblTempCountedRows
DECLARE @tblFinalCountedRows AS TABLE
(
PatientId INT NOT NULL
, VisitCount INT
)
INSERT @tblFinalCountedRows
SELECT
PatientId
, count(1) as Member_visit_Count
FROM
@tblTempCountedRows
GROUP BY PatientId
SELECT * from @tblFinalCountedRows
Here's a Sql Fiddle with the results:
Sql Fiddle
I was trying to aggregate data in 7-day groups for FY13 (which starts on 10/1/2012 and ends on 9/30/2013) in SQL Server, but so far no luck. Could someone please take a look? Below is my example data.
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
So, my desired output would be like:
DATE BREAD MILK
10/1/12 1 3
10/2/12 2 4
10/3/12 2 3
10/4/12 0 4
10/5/12 4 0
10/6/12 2 1
Total 11 15
10/7/12 1 3
10/8/12 2 4
10/9/12 2 3
10/10/12 0 4
10/11/12 4 0
10/12/12 2 1
10/13/12 2 1
Total 13 16
--------through 9/30/2013
Please note, since FY13 starts on 10/1/2012 and ends on 9/30/2013, the first week of FY13 is 6 days instead of 7 days.
I am using SQL server 2008.
You could add a new computed column for the date values to group them by week and sum the other columns, something like this:
SELECT DATEPART(ww, DATEADD(d,-2,[DATE])) AS WEEK_NO,
SUM(Bread) AS Bread_Total, SUM(Milk) as Milk_Total
FROM YOUR_TABLE
GROUP BY DATEPART(ww, DATEADD(d,-2,[DATE]))
Note: I used DATEADD and subtracted 2 days to set the first day of the week to Monday based on your dates. You can modify this if required.
Use the option with the GROUP BY ROLLUP operator:
SELECT CASE WHEN DATE IS NULL THEN 'Total' ELSE CONVERT(nvarchar(10), DATE, 101) END AS DATE,
SUM(BREAD) AS BREAD, SUM(MILK) AS MILK
FROM dbo.test54
GROUP BY ROLLUP(DATE),(DATENAME(week, DATE))
Demo on SQLFiddle
Result:
DATE BREAD MILK
10/01/2012 1 3
10/02/2012 2 4
10/03/2012 2 3
10/04/2012 0 4
10/05/2012 4 0
10/06/2012 2 1
Total 11 15
10/07/2012 1 3
10/08/2012 4 7
10/10/2012 0 4
10/11/2012 4 0
10/12/2012 2 1
10/13/2012 2 1
Total 13 16
You are looking for a rollup. In this case, you will need at least one more column to group by to do your rollup on, the easiest way to do that is to add a computed column that groups them into weeks by date.
Take a look at: Summarizing Data Using ROLLUP
Here is the general idea of how it could be done:
You need a derived column for each row to determine which fiscal week that record belongs to. In general you could take the number of days that have elapsed between 10/1 and that record's date, divide by 7, and floor the result.
Then you can GROUP BY that derived column and use the SUM aggregate function.
The biggest wrinkle is that 6 day week you start with. You may have to add some logic to make sure that the weeks start on Sunday or whatever day you use but this should get you started.
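As a rough illustration of that arithmetic, here is a small Python sketch (plain Python, not SQL Server). It assumes weeks run Sunday through Saturday, which is what the desired output in the question implies, and handles the short first week by shifting the day count by the fiscal year start's distance from the preceding Sunday. The function name fiscal_week and the 0-based numbering are choices made for the example.

```python
from collections import defaultdict
from datetime import date

FY_START = date(2012, 10, 1)  # FY13 start date, taken from the question


def fiscal_week(d, fy_start=FY_START):
    """Return the 0-based fiscal week of d, with weeks running Sunday to Saturday."""
    # Number of days from the Sunday on or before fy_start to fy_start itself.
    # date.weekday() is 0 for Monday and 6 for Sunday.
    offset = (fy_start.weekday() + 1) % 7
    return ((d - fy_start).days + offset) // 7


# (date, bread, milk) rows from the question
sales = [
    (date(2012, 10, 1), 1, 3), (date(2012, 10, 2), 2, 4), (date(2012, 10, 3), 2, 3),
    (date(2012, 10, 4), 0, 4), (date(2012, 10, 5), 4, 0), (date(2012, 10, 6), 2, 1),
    (date(2012, 10, 7), 1, 3), (date(2012, 10, 8), 2, 4), (date(2012, 10, 9), 2, 3),
    (date(2012, 10, 10), 0, 4), (date(2012, 10, 11), 4, 0), (date(2012, 10, 12), 2, 1),
    (date(2012, 10, 13), 2, 1),
]

# Sum bread and milk per fiscal week
totals = defaultdict(lambda: [0, 0])
for d, bread, milk in sales:
    week = fiscal_week(d)
    totals[week][0] += bread
    totals[week][1] += milk

for week in sorted(totals):
    print(week, totals[week])
```

Because 10/1/2012 is a Monday, the offset is 1, so the first fiscal week covers only 10/1 through 10/6 and the second starts on Sunday 10/7.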
The WITH ROLLUP suggestions above can help; you'll need to save the data and transform it as you need.
The biggest thing you'll need to be able to do is identify your weeks properly. If you don't have those loaded into tables already so you can identify them, you can build them on the fly. Here's one way to do that:
CREATE TABLE #fy (fyear int, fstart datetime, fend datetime);
CREATE TABLE #fylist(fyyear int, fydate DATETIME, fyweek int);
INSERT INTO #fy
SELECT 2012, '2011-10-01', '2012-09-30'
UNION ALL
SELECT 2013, '2012-10-01', '2013-09-30';
INSERT INTO #fylist
( fyyear, fydate )
SELECT fyear, DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart)) AS fydate
FROM Common.NUMBERS
CROSS APPLY (SELECT * FROM #fy WHERE fyear = 2013) fy
WHERE fy.fend >= DATEADD(DAY, Number, DATEADD(DAY, -1, fy.fstart));
WITH weekcalc AS
(
SELECT DISTINCT DATEPART(YEAR, fydate) yr, DATEPART(week, fydate) dt
FROM #fylist
),
ridcalc AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY yr, dt) AS rid, yr, dt
FROM weekcalc
)
UPDATE #fylist
SET fyweek = rid
FROM #fylist
JOIN ridcalc
ON DATEPART(YEAR, fydate) = yr
AND DATEPART(week, fydate) = dt;
SELECT list.fyyear, list.fyweek, p.[date], SUM(bread) AS Bread, SUM(Milk) AS Milk
FROM products p
JOIN #fylist list
ON p.[date] = list.fydate
GROUP BY list.fyyear, list.fyweek, p.[date] WITH ROLLUP;
The Common.Numbers reference above is a simple numbers table that I use for this sort of thing (goes from 1 to 1M). You could also build that on the fly as needed.