SQL Server group by sets of columns

I have a data set where I need to count patient visits according to these rules:
Two or more visits to the same doctor on the same day count as 1 visit, regardless of the reason.
Two or more visits to different doctors for the same reason count as 1 visit.
Two or more visits to different doctors on the same day for different reasons count as two or more visits.
Example data:
DoctorId  PatientId  VisitDate   ReasonCode  RowId
--------  ---------  ----------  ----------  -----
1         100        2014-01-01  200         1
1         100        2014-01-01  210         2
2         100        2014-01-01  200         3
2         100        2014-01-11  300         4
1         100        2014-01-15  200         5
2         400        2014-01-15  200         6
In this example, my final count would be based on grouping rowId 1, 2, 3 for 1 visit; grouping row 4 as 1 visit, grouping row 5 as 1 visit for a total of 3 visits for patient 100. Patient 400 has 1 visit as well.
patientid visitdate numberofvisits
--------- --------- --------------
100 2014-01-01 3
100 2014-01-11 1
100 2014-01-15 1
400 2014-01-15 1
Where I'm stuck is how to handle the GROUP BY so that all of these scenarios are covered. If the grouping were doctor, date, I'd be fine. If it were doctor, date, ReasonCode, I'd be fine. The problem is combining the two: DoctorId and ReasonCode matter when two doctors are involved, while DoctorId and date matter when it's the same doctor. I haven't worked deeply with SQL Server in a long time, so it's possible that a common table expression is the solution and I'm not seeing it. I'm using SQL Server 2014 and there's decent latitude in performance. I'm looking for a SQL Server query that produces the results above. As best I can tell, there's no way to group this the way I need it counted.

The answer was an EXCEPT clause and grouping each of the sets before a final count. Sometimes we over-complicate things.
-- Sample data from the question
DECLARE @tblAllData TABLE
(
    DoctorId INT NOT NULL
    , PatientId INT NOT NULL
    , VisitDate DATE NOT NULL
    , ReasonCode INT NOT NULL
    , RowId INT NOT NULL
)
INSERT @tblAllData
SELECT 1, 100, '2014-01-01', 200, 1
UNION ALL
SELECT 1, 100, '2014-01-01', 210, 2
UNION ALL
SELECT 2, 100, '2014-01-01', 200, 3
UNION ALL
SELECT 2, 100, '2014-01-11', 300, 4
UNION ALL
SELECT 1, 100, '2014-01-15', 200, 5
UNION ALL
SELECT 2, 400, '2014-01-15', 200, 6

-- Group each set, then combine them; EXCEPT also removes duplicate rows
DECLARE @tblTempCountedRows AS TABLE
(
    PatientId INT NOT NULL
    , VisitDate DATE
    , ReasonCode INT
)
INSERT @tblTempCountedRows
SELECT PatientId, VisitDate, 0
FROM @tblAllData
GROUP BY PatientId, DoctorId, VisitDate
EXCEPT
SELECT PatientId, VisitDate, ReasonCode
FROM @tblAllData
GROUP BY PatientId, VisitDate, ReasonCode

SELECT * FROM @tblTempCountedRows

-- Final count of visits per patient
DECLARE @tblFinalCountedRows AS TABLE
(
    PatientId INT NOT NULL
    , VisitCount INT
)
INSERT @tblFinalCountedRows
SELECT
    PatientId
    , COUNT(1) AS Member_visit_Count
FROM @tblTempCountedRows
GROUP BY PatientId

SELECT * FROM @tblFinalCountedRows
Here's a Sql Fiddle with the results:
Sql Fiddle
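For anyone who wants to check the EXCEPT query without a SQL Server instance, here is a rough sketch that runs the same two grouped sets against the sample rows. It uses SQLite through Python's sqlite3 module instead of SQL Server, and the table name visits is only a stand-in for the table variable in the answer.

```python
import sqlite3

# Sample rows from the question: DoctorId, PatientId, VisitDate, ReasonCode, RowId
rows = [
    (1, 100, "2014-01-01", 200, 1),
    (1, 100, "2014-01-01", 210, 2),
    (2, 100, "2014-01-01", 200, 3),
    (2, 100, "2014-01-11", 300, 4),
    (1, 100, "2014-01-15", 200, 5),
    (2, 400, "2014-01-15", 200, 6),
]

conn = sqlite3.connect(":memory:")
# "visits" is a hypothetical name standing in for the answer's table variable
conn.execute(
    "CREATE TABLE visits (DoctorId INT, PatientId INT, VisitDate TEXT, "
    "ReasonCode INT, RowId INT)"
)
conn.executemany("INSERT INTO visits VALUES (?, ?, ?, ?, ?)", rows)

# Same shape as the answer: two grouped sets combined with EXCEPT,
# then a final count per patient.
query = """
SELECT PatientId, COUNT(*) AS VisitCount
FROM (
    SELECT PatientId, VisitDate, 0 AS ReasonCode
    FROM visits
    GROUP BY PatientId, DoctorId, VisitDate
    EXCEPT
    SELECT PatientId, VisitDate, ReasonCode
    FROM visits
    GROUP BY PatientId, VisitDate, ReasonCode
)
GROUP BY PatientId
ORDER BY PatientId
"""
visit_counts = conn.execute(query).fetchall()
print(visit_counts)  # [(100, 3), (400, 1)]
```

Note that no ReasonCode in the sample is 0, so on this data the EXCEPT only removes duplicate rows from the first set, and the result matches a count of distinct (PatientId, VisitDate) pairs.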

Related

sql query to fill sparse data in timeline

I have a table holding various information changes related to employees. Some information changes over time, but not all together, and changes occur periodically but not regularly. Changes are recorded by date, and if an item is not changed for the given employee at the given time, then the item's value is Null for that record. Say it looks like this:
employeeId  Date        Salary  CommuteDistance
----------  ----------  ------  ---------------
1           2000-01-01  1000    Null
2           2000-01-15  2000    20
3           2000-01-30  3000    Null
2           2010-02-15  2100    Null
3           2010-03-30  Null    30
1           2020-02-01  1100    10
1           2030-03-01  Null    100
Now, how can I write a query to fill the null values with the most recent non-null values for all employees at all dates, while keeping the value Null if there is no such previous non-null value? It should look like:
employeeId  Date        Salary  CommuteDistance
----------  ----------  ------  ---------------
1           2000-01-01  1000    Null
2           2000-01-15  2000    20
3           2000-01-30  3000    Null
2           2010-02-15  2100    20
3           2010-03-30  3000    30
1           2020-02-01  1100    10
1           2030-03-01  1100    100
(Note how the values that were Null in the input, namely CommuteDistance 20 for employee 2 on 2010-02-15, Salary 3000 for employee 3 on 2010-03-30, and Salary 1100 for employee 1 on 2030-03-01, are carried over from previous records of the same employee.)
I'd like to use the query inside a view, then in turn query that view to get the picture at an arbitrary date (e.g., what were the salary and commute distance for the employees on 2021-08-17? I should be able to do that part, but I'm unable to build the view). Or is there a better way to accomplish this?
There's no point in showing my attempts, since I'm quite inexperienced with advanced SQL (I assume the solution employs advanced knowledge, since I found my basic knowledge insufficient for this) and I got nowhere near the desired result.
You may get the last non-null value of Salary or CommuteDistance for each employee using the following:
SELECT T.employeeId, T.Date,
    COALESCE(Salary, MAX(Salary) OVER (PARTITION BY employeeId, g1)) AS Salary,
    COALESCE(CommuteDistance, MAX(CommuteDistance) OVER (PARTITION BY employeeId, g2)) AS CommuteDistance
FROM
(
    SELECT *,
        -- g1/g2: date of the most recent non-null value up to the current row
        MAX(CASE WHEN Salary IS NOT NULL THEN Date END) OVER (PARTITION BY employeeId ORDER BY Date) AS g1,
        MAX(CASE WHEN CommuteDistance IS NOT NULL THEN Date END) OVER (PARTITION BY employeeId ORDER BY Date) AS g2
    FROM TableName
) T
ORDER BY Date
See a demo.
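For each employee, a running count of the non-null Salary (or CommuteDistance) values, ordered by Date, puts every non-null value and all the nulls that follow it into the same group. Then we fill in the blanks by taking the max within each group.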
select employeeId
      ,Date
      ,max(Salary) over(partition by employeeId, s_grp) as Salary
      ,max(CommuteDistance) over(partition by employeeId, d_grp) as CommuteDistance
from (
    select *
          ,count(case when Salary is not null then 1 end) over(partition by employeeId order by Date) as s_grp
          ,count(case when CommuteDistance is not null then 1 end) over(partition by employeeId order by Date) as d_grp
    from t
) t
order by Date
employeeId  Date        Salary  CommuteDistance
----------  ----------  ------  ---------------
1           2000-01-01  1000    null
2           2000-01-15  2000    20
3           2000-01-30  3000    null
2           2010-02-15  2100    20
3           2010-03-30  3000    30
1           2020-02-01  1100    10
1           2030-03-01  1100    100
Fiddle
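The question also asks about wrapping the fill logic in a view and then reading the state at an arbitrary date. Here is a minimal sketch of how that might look, run on SQLite through Python's sqlite3 module rather than the asker's database. The table name t, the view name filled and the helper as_of are assumptions made for illustration.

```python
import sqlite3

# Sample rows from the question: employeeId, Date, Salary, CommuteDistance
rows = [
    (1, "2000-01-01", 1000, None),
    (2, "2000-01-15", 2000, 20),
    (3, "2000-01-30", 3000, None),
    (2, "2010-02-15", 2100, None),
    (3, "2010-03-30", None, 30),
    (1, "2020-02-01", 1100, 10),
    (1, "2030-03-01", None, 100),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (employeeId INT, Date TEXT, Salary INT, CommuteDistance INT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# View with the gaps filled: COUNT(col) skips nulls, so the running count
# only increases on rows that carry a value, which forms the fill groups.
conn.execute("""
CREATE VIEW filled AS
SELECT employeeId, Date,
       MAX(Salary) OVER (PARTITION BY employeeId, s_grp) AS Salary,
       MAX(CommuteDistance) OVER (PARTITION BY employeeId, d_grp) AS CommuteDistance
FROM (
    SELECT *,
           COUNT(Salary) OVER (PARTITION BY employeeId ORDER BY Date) AS s_grp,
           COUNT(CommuteDistance) OVER (PARTITION BY employeeId ORDER BY Date) AS d_grp
    FROM t
)
""")

def as_of(day):
    """Return the latest filled row per employee on or before the given day."""
    return conn.execute("""
        SELECT employeeId, Salary, CommuteDistance
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY employeeId ORDER BY Date DESC) AS rn
            FROM filled
            WHERE Date <= ?
        )
        WHERE rn = 1
        ORDER BY employeeId
    """, (day,)).fetchall()

snapshot = as_of("2021-08-17")
print(snapshot)  # [(1, 1100, 10), (2, 2100, 20), (3, 3000, 30)]
```

Filtering the view by date is safe here because each filled value depends only on earlier rows of the same employee, so cutting off later rows does not change what remains.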

Count values separately until certain amount of duplicates SQL

I need a statement that selects all patients and the number of their appointments. When 3 or more appointments take place on the same date, they should be counted as one appointment.
This is what my statement looks like so far:
SELECT PATSuchname, Count(DISTINCT AKTDATUM) AS AKTAnz
FROM tblAktivitaeten
LEFT OUTER JOIN tblPatienten ON (tblPatienten.PATID=tblAktivitaeten.PATID)
WHERE (AKTDeleted<>'J' OR AKTDeleted IS Null)
GROUP BY PATSuchname
ORDER BY AKTAnz DESC
The result should look like this
PATSuchname Appointments
----------------------------------------
Joey Patner 13
Billy Jean 15
Example Name 13
As you can see, Joey Patner has 13 appointments. In the real table he has 15 appointments, but three of them have the same date, so they are only counted as 1.
So how can I write a statement that does exactly that?
(I am new to Stack Overflow; sorry if the format I use is wrong, and please tell me if it is.)
In the tables it looks like this:
tblPatienten
----------
PATSuchname PATID
------------------------
Joey Patner 1
Billy Jean 2
Example Name 3
tblAktivitaeten
----------
AKTDatum PATID AKTID
-----------------------------------------
08.02.2021 1 1000 ----
08.02.2021 1 1001 ---- So these 3 should counted as 1
08.02.2021 1 1002 ----
09.05.2021 1 1003
09.07.2021 2 1004 -- these 2 shouldn't be counted as 1
09.07.2021 2 1005 --
Two GROUP BY should do it:
SELECT
    x.PATID, PATSuchname, SUM(ApptCount)
FROM (
    SELECT
        PATID, AKTDatum, CASE WHEN COUNT(*) < 3 THEN COUNT(*) ELSE 1 END AS ApptCount
    FROM tblAktivitaeten
    GROUP BY
        PATID, AKTDatum
) AS x
LEFT JOIN tblPatienten ON tblPatienten.PATID = x.PATID
GROUP BY
    x.PATID, PATSuchname
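To see the two-level grouping at work on the sample rows from the question, here is a small sketch using SQLite through Python's sqlite3 module (the question does not name its database, so this is only an approximation). The dates are kept as the plain strings shown in the question, and the Appointments alias is added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPatienten (PATSuchname TEXT, PATID INT);
CREATE TABLE tblAktivitaeten (AKTDatum TEXT, PATID INT, AKTID INT);
INSERT INTO tblPatienten VALUES
    ('Joey Patner', 1), ('Billy Jean', 2), ('Example Name', 3);
INSERT INTO tblAktivitaeten VALUES
    ('08.02.2021', 1, 1000),
    ('08.02.2021', 1, 1001),
    ('08.02.2021', 1, 1002),
    ('09.05.2021', 1, 1003),
    ('09.07.2021', 2, 1004),
    ('09.07.2021', 2, 1005);
""")

# Inner query: one row per patient and date; 3 or more appointments on a
# date collapse to 1. Outer query: add up the per-date counts per patient.
query = """
SELECT p.PATSuchname, SUM(x.ApptCount) AS Appointments
FROM (
    SELECT PATID, AKTDatum,
           CASE WHEN COUNT(*) < 3 THEN COUNT(*) ELSE 1 END AS ApptCount
    FROM tblAktivitaeten
    GROUP BY PATID, AKTDatum
) AS x
LEFT JOIN tblPatienten AS p ON p.PATID = x.PATID
GROUP BY x.PATID, p.PATSuchname
ORDER BY x.PATID
"""
appointments = conn.execute(query).fetchall()
print(appointments)  # [('Joey Patner', 2), ('Billy Jean', 2)]
```

Patients without any rows in tblAktivitaeten (Example Name in this sample) do not appear, because the query starts from the activities table.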

Calculate MinDate and MaxDate based on table relationship

The query has to meet the following requirement.
The "Visit 1 Date" window for the patient (Abc) should be calculated from the patient's "Screening" visit.
For example, if the patient (Abc) had the Screening visit on 23/Mar/2019, then the MinDate is 22/Mar/2019 and the MaxDate is 25/Mar/2019.
In VisitWindow, VisitWindowId is linked to VisitId in the VisitEntry table.
So for that VisitWindowId I have set MinDays (1) and MaxDays (2), which are to be applied to the VisitDate of the visit whose VisitName is "Screening".
I am expecting the query to give the result below.
I am stuck writing the query that gets this result.
Table - VisitEntry
--------------------
RecordId  VisitId  VisitName     VisitDate    PatientId  PatientName
1         1        Screening     23/Mar/2019  100        Abc
2         2        Visit 1 Date  23/Mar/2019  100        Abc
Table - VisitWindow
-------------------
RecordId  VisitId  VisitWindowId  MinDays  MaxDays
1         2        1              1        2
Expected QueryResult
--------------------
RecordId  VisitId  VisitName     VisitDate    PatientId  MinDate      MaxDate
1         1        Screening     23/Mar/2019  100        Null         Null
2         2        Visit 1 Date  23/Mar/2019  100        22/Mar/2019  25/Mar/2019
You didn't mention the database, so I'll give you two versions.
SQL Server:
SELECT ve.RecordID, ve.VisitID, ve.VisitName, ve.VisitDate, ve.PatientID, ve.PatientName,
    -- DATEADD takes (datepart, number, date), in that order
    DATEADD(day, -1 * vw.MinDays, sd.VisitDate) MinDate,
    DATEADD(day, vw.MaxDays, sd.VisitDate) MaxDate
FROM VisitEntry ve
LEFT JOIN VisitEntry sd ON (ve.PatientId=sd.PatientId AND sd.VisitName='Screening')
LEFT JOIN VisitWindow vw ON ve.VisitID=vw.VisitID
Oracle:
SELECT ve.RecordID, ve.VisitID, ve.VisitName, ve.VisitDate, ve.PatientID, ve.PatientName,
sd.VisitDate - vw.MinDays MinDate,
sd.VisitDate + vw.MaxDays MaxDate
FROM VisitEntry ve
LEFT JOIN VisitEntry sd ON (ve.PatientId=sd.PatientId AND sd.VisitName='Screening')
LEFT JOIN VisitWindow vw ON ve.VisitID=vw.VisitID
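As a quick check of the join and the date arithmetic against the expected result, here is a sketch on SQLite through Python's sqlite3 module. SQLite's date() modifiers stand in for DATEADD and for Oracle's date arithmetic, and the dates are stored as ISO strings, which is an assumption about the storage format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE VisitEntry (RecordId INT, VisitId INT, VisitName TEXT,
                         VisitDate TEXT, PatientId INT, PatientName TEXT);
CREATE TABLE VisitWindow (RecordId INT, VisitId INT, VisitWindowId INT,
                          MinDays INT, MaxDays INT);
INSERT INTO VisitEntry VALUES
    (1, 1, 'Screening',    '2019-03-23', 100, 'Abc'),
    (2, 2, 'Visit 1 Date', '2019-03-23', 100, 'Abc');
INSERT INTO VisitWindow VALUES (1, 2, 1, 1, 2);
""")

# sd is the patient's Screening row; vw holds the window for the current visit.
# A visit without a window gets NULL for both dates because vw.* is NULL.
query = """
SELECT ve.RecordId, ve.VisitName,
       date(sd.VisitDate, (-vw.MinDays) || ' days')     AS MinDate,
       date(sd.VisitDate, '+' || vw.MaxDays || ' days') AS MaxDate
FROM VisitEntry ve
LEFT JOIN VisitEntry sd
       ON ve.PatientId = sd.PatientId AND sd.VisitName = 'Screening'
LEFT JOIN VisitWindow vw ON ve.VisitId = vw.VisitId
ORDER BY ve.RecordId
"""
windows = conn.execute(query).fetchall()
print(windows)
# [(1, 'Screening', None, None), (2, 'Visit 1 Date', '2019-03-22', '2019-03-25')]
```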

Get max of column using sum

I have one table with the following data:
saleId amount date
-------------------------
1 2000 10/10/2012
2 3000 12/10/2012
3 2000 11/12/2012
2 3000 12/10/2012
1 4000 11/10/2012
4 6000 10/10/2012
From my table I want result with max of sum amount between dates 10/10/2012 and 12/10/2012 which for the data above will be:
saleId amount
---------------
1 6000
2 6000
4 6000
Here 6000 is the max of the sums (by saleId) so I want ids 1, 2 and 4.
You have to use subqueries like this:
SELECT saleId, SUM(amount) AS Amount
FROM Table1
-- same date filter as the inner query, so both sums cover the same rows
WHERE date BETWEEN '10/10/2012' AND '12/10/2012'
GROUP BY saleId
HAVING SUM(amount) =
(
    SELECT MAX(AMOUNT) FROM
    (
        SELECT SUM(amount) AS AMOUNT FROM Table1
        WHERE date BETWEEN '10/10/2012' AND '12/10/2012'
        GROUP BY saleId
    ) AS A
)
See this SQLFiddle
This query goes through the table only once and is fairly optimised.
select top(1) with ties saleid, amount
from (
    select saleid, sum(amount) amount
    from tbl
    where date between '20121010' and '20121210'
    group by saleid
) x
order by amount desc;
You can produce the SUM with the WHERE clause as a derived table, then SELECT TOP(1) in the query using WITH TIES to show all the ones with the same (MAX) amount.
When presenting dates to SQL Server, try to always use the format YYYYMMDD for robustness.
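Here is a small sketch that reproduces the expected result on the sample data. It runs on SQLite through Python's sqlite3 module, which has no TOP ... WITH TIES, so it compares each sum against the maximum instead. The table name sales is made up for the example, and the dates are read as month/day/year, in line with the '20121010' to '20121210' range above.

```python
import sqlite3

# Sample rows from the question: saleId, amount, date (stored as ISO strings)
sales = [
    (1, 2000, "2012-10-10"),
    (2, 3000, "2012-12-10"),
    (3, 2000, "2012-11-12"),
    (2, 3000, "2012-12-10"),
    (1, 4000, "2012-11-10"),
    (4, 6000, "2012-10-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (saleId INT, amount INT, date TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)

# Sum per saleId inside the date range once, then keep every saleId whose
# sum equals the largest sum (so ties are all returned).
query = """
WITH totals AS (
    SELECT saleId, SUM(amount) AS amount
    FROM sales
    WHERE date BETWEEN '2012-10-10' AND '2012-12-10'
    GROUP BY saleId
)
SELECT saleId, amount
FROM totals
WHERE amount = (SELECT MAX(amount) FROM totals)
ORDER BY saleId
"""
top_sales = conn.execute(query).fetchall()
print(top_sales)  # [(1, 6000), (2, 6000), (4, 6000)]
```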

Count two Columns with two Where Clauses

I know it's just late in the day and my brain is just fried....
Using Teradata, I need to COUNT DISTINCT MEMBERS that haven't had a TRANS in the past six months and also COUNT the number of TRANS they had historically (prior to the six months). We can just assume the cutoff date to be 01/01/2012. All data is contained in a single table.
For example:
Member | Tran Date
123    | 01/01/2011
789    | 06/01/2011
123    | 10/31/2011
678    | 04/03/2011
789    | 06/01/2012
So 2 members had a total of 3 transactions dated prior to 1/1/2012 with no transactions later than 1/1/2012.
In this example, my result would be:
MEMBERS | TRANS
2 | 3
Try this solution:
SELECT
COUNT(DISTINCT member_id) AS MEMBERS,
COUNT(*) AS TRANS
FROM
tbl
WHERE
member_id NOT IN
(
SELECT DISTINCT member_id
FROM tbl
WHERE trans_date > '2012-01-01'
)
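To check the NOT IN query above against the sample rows, here is a rough sketch on SQLite through Python's sqlite3 module rather than Teradata. The names tbl, member_id and trans_date follow the query, and the dates are stored as ISO strings so that the comparison with '2012-01-01' works as a plain string comparison.

```python
import sqlite3

# Sample rows from the question: member, transaction date (month/day/year)
rows = [
    (123, "2011-01-01"),
    (789, "2011-06-01"),
    (123, "2011-10-31"),
    (678, "2011-04-03"),
    (789, "2012-06-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (member_id INT, trans_date TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

# Members with any transaction after the cutoff are excluded entirely;
# the remaining rows are counted both per member and in total.
query = """
SELECT COUNT(DISTINCT member_id) AS MEMBERS,
       COUNT(*) AS TRANS
FROM tbl
WHERE member_id NOT IN (
    SELECT DISTINCT member_id
    FROM tbl
    WHERE trans_date > '2012-01-01'
)
"""
members, trans = conn.execute(query).fetchone()
print(members, trans)  # 2 3
```

Member 789 drops out because of the 06/01/2012 transaction, which leaves members 123 and 678 with three transactions between them.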
You can't do it in a single flat query; use subqueries. This is T-SQL because I am unfamiliar with Teradata.
DECLARE @CUTOFF DATETIME = DATEADD(MONTH, -6, GETDATE()) -- six months ago
SELECT COUNT(MEMBERID) AS MEMBERS, SUM(TRANSCOUNT) AS TRANS FROM (
    SELECT DISTINCT
        M.MEMBERID,
        -- all transactions for this member
        (SELECT COUNT(*) FROM TRANSDATA T WHERE T.MEMBERID = M.MEMBERID) AS TRANSCOUNT
    FROM MEMBER M WHERE NOT EXISTS
        -- skip members with a transaction after the cutoff
        (SELECT * FROM TRANSDATA T WHERE
            T.MEMBERID = M.MEMBERID
            AND T.TRANDATE > @CUTOFF)
) AS X