Get percentages of larger group - sql

The query below is an ugly one, so I hope I've spaced it well enough to be readable. It finds the percentage of people from a certain area who visit each hospital. For instance, if 100 people live in county X, 20 go to hospital A, and 80 go to hospital B, the query outputs:
hospital A 20
hospital B 80
The query below works exactly as I want it to, but it got me thinking: how could this be done for every county in my table? How is this sort of thing done? Let me know if I should document the query or do anything else to make it clearer.
select hospitalname, round(cast(counts as float)/cast(fayettestrokepop as float)*100,2)as percentSeen
from
(
SELECT tblHospitals.hospitalname, COUNT(tblHospitals.hospitalname) AS counts, tblStateCounties_1.countyName,
(SELECT COUNT(*) AS Expr1
FROM Patient INNER JOIN
tblStateCounties ON Patient.stateCode = tblStateCounties.stateCode AND Patient.countyCode = tblStateCounties.countyCode
WHERE (tblStateCounties.stateCode = '21') AND (tblStateCounties.countyName = 'fayette')) AS fayetteStrokePop
FROM Patient AS Patient_1 INNER JOIN
tblHospitals ON Patient_1.hospitalnpi = tblHospitals.hospitalnpi INNER JOIN
tblStateCounties AS tblStateCounties_1 ON Patient_1.stateCode = tblStateCounties_1.stateCode AND Patient_1.countyCode = tblStateCounties_1.countyCode
WHERE (tblStateCounties_1.stateCode = '21') AND (tblStateCounties_1.countyName = 'fayette')
GROUP BY tblHospitals.hospitalname, tblStateCounties_1.countyName
) as t
order by percentSeen desc
EDIT: sample data
The sample data below is without the outermost query (the as t order by part).
The countsInTheCounty column is the (select count(*)..) part after 'tblStateCounties_1.countyName'
hospitalName hospitalCounts countyName countsInTheCounty
st. james 23 X 300
st. jude 40 X 300
Now with the outer query we would get
st. james 7.67 (23/300)
st. jude 13.33 (40/300)

Here is my guess. You'll have to test against your data or provide proper DDL + sample data.
;WITH totalCounts AS
(
SELECT StateCode, countyCode, COUNT(*) AS totalcount
FROM dbo.Patient GROUP BY StateCode, countyCode
)
SELECT
h.hospitalName,
hospitalCounts = COUNT(p.hospitalnpi),
c.countyName,
countsInTheCounty = tc.totalCount,
percentseen = CONVERT(DECIMAL(5,2), COUNT(p.hospitalnpi)*100.0/tc.totalCount)
FROM
dbo.Patient AS p
INNER JOIN
dbo.tblHospitals AS h
ON p.hospitalnpi = h.hospitalnpi
INNER JOIN
totalCounts AS tc
ON p.StateCode = tc.StateCode
AND p.countyCode = tc.countyCode
INNER JOIN
dbo.tblStateCounties AS c
ON tc.StateCode = c.stateCode
AND tc.countyCode = c.countyCode
GROUP BY
h.hospitalname,
c.countyName,
tc.totalcount
ORDER BY
c.countyName,
percentseen DESC;
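As an alternative to the CTE, the per-county total can probably also be computed with a windowed aggregate over the grouped rows. The following is only a minimal sketch: the table variable @Visits and its sample rows are hypothetical stand-ins for the joined Patient/tblHospitals/tblStateCounties data, and it assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical sample data: one row per patient visit (names are assumptions).
DECLARE @Visits TABLE (countyName varchar(20), hospitalName varchar(20));
INSERT INTO @Visits (countyName, hospitalName) VALUES
    ('X', 'hospital A'),
    ('X', 'hospital B'), ('X', 'hospital B'), ('X', 'hospital B'),
    ('Y', 'hospital A'), ('Y', 'hospital C');

-- COUNT(*) is the per-hospital count within a county;
-- SUM(COUNT(*)) OVER (...) adds those counts up per county.
SELECT
    countyName,
    hospitalName,
    hospitalCounts    = COUNT(*),
    countsInTheCounty = SUM(COUNT(*)) OVER (PARTITION BY countyName),
    percentSeen       = CONVERT(DECIMAL(5,2),
                            COUNT(*) * 100.0
                            / SUM(COUNT(*)) OVER (PARTITION BY countyName))
FROM @Visits
GROUP BY countyName, hospitalName
ORDER BY countyName, percentSeen DESC;
-- For county X this gives hospital B 75.00 and hospital A 25.00.
```

Whether this beats the CTE version on real data is something to check against the actual execution plans.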

Related

How to check more than one unrelated conditions in SQL?

I have following query:
SELECT DISTINCT INSTRUCTORADDRESSMODPER.instructor_id, Instructor.instructor_name, InstructorRank.salary, Student.specification_id
FROM INSTRUCTORADDRESSMODPER
JOIN Student ON INSTRUCTORADDRESSMODPER.student_id = Student.student_id
JOIN Instructor ON INSTRUCTORADDRESSMODPER.instructor_id = Instructor.instructor_id
JOIN InstructorRank ON Instructor.instructor_rank = InstructorRank.instructor_rank
ORDER BY specification_id;
which has yielded the following result:
I am trying to get a result that shows only the instructors who share both the same salary and the same specification, as highlighted in the figure. These two conditions require completely different checks, and I don't even know how to get started.
Do you need something like this?
SELECT instructor_id, instructor_name, salary, specification_id
FROM (
SELECT DISTINCT INSTRUCTORADDRESSMODPER.instructor_id, Instructor.instructor_name, InstructorRank.salary, Student.specification_id
, COUNT(distinct INSTRUCTORADDRESSMODPER.instructor_id)over(partition by InstructorRank.salary, Student.specification_id) cnt
FROM INSTRUCTORADDRESSMODPER
JOIN Student ON INSTRUCTORADDRESSMODPER.student_id = Student.student_id
JOIN Instructor ON INSTRUCTORADDRESSMODPER.instructor_id = Instructor.instructor_id
JOIN InstructorRank ON Instructor.instructor_rank = InstructorRank.instructor_rank
)
WHERE cnt > 1
ORDER BY specification_id
;
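You can use the window function COUNT as follows. Note that the window count is evaluated before DISTINCT removes duplicate rows, so Count(1) counts joined rows rather than instructors; use COUNT(DISTINCT INSTRUCTORADDRESSMODPER.instructor_id) instead if one instructor can have several students with the same specification.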
Select * from
(SELECT DISTINCT INSTRUCTORADDRESSMODPER.instructor_id, Instructor.instructor_name, InstructorRank.salary, Student.specification_id,
Count(1) over (partition by InstructorRank.salary, Student.specification_id) as cnt
FROM INSTRUCTORADDRESSMODPER
JOIN Student ON INSTRUCTORADDRESSMODPER.student_id = Student.student_id
JOIN Instructor ON INSTRUCTORADDRESSMODPER.instructor_id = Instructor.instructor_id
JOIN InstructorRank ON Instructor.instructor_rank = InstructorRank.instructor_rank)
Where cnt > 1
ORDER BY specification_id;
SELECT DISTINCT INSTRUCTORADDRESSMODPER.instructor_id,
Instructor.instructor_name, InstructorRank.salary, Student.specification_id
FROM INSTRUCTORADDRESSMODPER, Student, Instructor, InstructorRank
WHERE 1=1
AND INSTRUCTORADDRESSMODPER.student_id = Student.student_id
AND INSTRUCTORADDRESSMODPER.instructor_id = Instructor.instructor_id
AND Instructor.instructor_rank = InstructorRank.instructor_rank
-- AND ... (append further conditions here)
ORDER BY specification_id;
With the WHERE 1=1 pattern above, any further conditions can be appended as additional AND clauses.

SQL Server: Pivot Table Odd Result

Full disclosure: this is my first time using a pivot table, and the result set has multiple columns. In this example, we start with the result set without the pivot applied.
This student has taken multiple courses. The ideal end result would be a count of how many times the student has taken each course. Looking at the data, I see that the student has taken Algebra 1 five times, English 1 two times, and Geometry two times.
The next image shows the query with the pivot applied. To complicate things, it was a monster query even before I attempted the pivot.
Here we see that Algebra 1 has a count of 0, Geometry has a count of 2, and English has a count of 0. This is the same query as before, restricted to this one student.
Query:
SELECT *
FROM
(SELECT
[StudentNumber], psat.MathScaledScore, psat.EBRWScore,
SCReady.MathScaleScore, SCReady.ReadingScaleScore,
SpringMap.MathPercentile, SpringMap.ReadingPercentile,
Coursework.Course_name
FROM
[OnlineApplications].[dbo].[Users] U
LEFT JOIN
[OnlineApplications].[dbo].ContactInfoes C ON C.ContactInfoId = U.UserId
LEFT JOIN
(SELECT *
FROM [DOENRICH-SQL].[enrich_prod].dbo.HCS_view_Most_Recent_PSAT_Scores Scores) AS psat ON U.StudentNumber = psat.number
LEFT JOIN
(SELECT
stud.Number, sc.DateTaken,
CASE sc.ELALev
WHEN 'E8AE0E4D-AD36-41D8-AC89-627D19661803' THEN 'Exceeds Expectations'
WHEN 'C9F2CDA2-D904-438B-9DD3-94EFC9111A0E' THEN 'Approaches Expectations'
WHEN '9B39E28F-89C8-44AD-A8F2-1463192F88F1' THEN 'Does Not Meet Expectation'
WHEN '87247DB1-4A57-419E-9619-7B43B02B1135' THEN 'Meets Expectations'
END ELALev,
sc.ELASS AS ELAScaleScore, sc.elavss AS VerticalScaleScore,
sc.elassread AS ReadingScaleScore,
sc.ELARS10 AS RawScoreStandard10, sc.elASPR AS ELAStatePercentileRank,
MathLev = CASE MathLev
WHEN 'E8AE0E4D-AD36-41D8-AC89-627D19661803' THEN 'Exceeds Expectations'
WHEN 'C9F2CDA2-D904-438B-9DD3-94EFC9111A0E' THEN 'Approaches Expectations'
WHEN '9B39E28F-89C8-44AD-A8F2-1463192F88F1' THEN 'Does Not Meet Expectation'
WHEN '87247DB1-4A57-419E-9619-7B43B02B1135' THEN 'Meets Expectations'
END,
sc.MATHSS AS MathScaleScore, sc.MathVSS AS MathVerticalScaleScore,
sc.mathspr AS MathStatePercentileRank
FROM
[DOENRICH-SQL].[enrich_prod].dbo.t_sc_ready sc
JOIN
[DOENRICH-SQL].[enrich_prod].dbo.Student stud ON sc.StudentID = stud.ID
WHERE
DateTaken = '2019-05-17') AS SCReady ON SCReady.number = U.studentnumber
LEFT JOIN
(
select stud.Number, max(map.readingPercentile) as ReadingPercentile, max(map.mathPercentile) as MathPercentile
from [DOENRICH-SQL].[ENRICH_PROD].[dbo].t_map map
join [DOENRICH-SQL].[ENRICH_PROD].[INFORM].[Map_GradeLevelID] mapping on map.GradeLevelID = mapping.DestID
JOIN [DOENRICH-SQL].[ENRICH_PROD].[dbo].[Student] stud on map.StudentID = stud.ID
where DateTaken >= '2018-08-01'
group by stud.Number
) as SpringMap on SpringMap.number = U.Studentnumber
left join (
SELECT * FROM OPENQUERY(PSPROD,'
Select B.STUDENT_NUMBER , A.COURSE_NAME, A.GRADE_LEVEL, A.SCHOOLNAME, A.GRADE
from PS.STOREDGRADES A
join PS.STUDENTS B ON A.STUDENTID = B.ID
AND STORECODE in (''Q1'',''Q2'',''Q3'',''Q4'',''F1'')
AND (COURSE_NAME LIKE ''%Algebra 1%'' OR COURSE_NAME LIKE ''%Geometry Honors%'' OR COURSE_NAME LIKE ''%English 1%'')
group by B.STUDENT_NUMBER , A.COURSE_NAME, a.STORECODE, A.GRADE, A.PERCENT, A.GRADE_LEVEL, A.SCHOOLNAME
ORDER BY STUDENT_NUMBER, STORECODE DESC
'
)
) as Coursework on Coursework.STUDENT_NUMBER = U.StudentNumber
join [OnlineApplications].[dbo].ScholarsApps Sapps on Sapps.ScholarsAppId = u.UserId
where AppYear = 2019 and StudentNumber <> '' and StudentNumber = '17476'
) T
PIVOT (
COUNT (COURSE_NAME)
FOR course_name IN (
[Algebra 1],
[Geometry Honors],
[English 1])
)
as Pivot_table
Again, the query was already very complicated before the pivot, and I'm not sure I'm using the PIVOT function correctly.
I'd love to have this pivot return the counts of the courses.

Max count SQL Server

I will explain my question with a practical example so that the issue is easier to visualize. I have built this query:
Select
E.Tipo_Esp, F.Nome, F.Apelido,
count (Ac.id_acto) as total_consultas
from
Especialidade as E
right join
Funcionario as F on F.id_Esp = E.id_Esp
inner join
Acto as Ac on Ac.id_func = F.id_func
inner join
TipoActo as TA on TA.id_Tipo_acto = Ac.id_Tipo_acto
where
TA.Descricao_Acto = 'Consulta'
group by
E.Tipo_Esp, F.Nome, F.Apelido
order by
count(Ac.id_acto) DESC
to arrive at the following result:
Tipo_Esp Nome Apelido total_consultas
Ortopedia Maria Antonia 3
Ortopedia Luis Cruz 1
Cirurgia André Martins 2
Cirurgia Diogo Martins 1
However, what I need to arrive at is this:
Tipo_Esp Nome Apelido total_consultas
Ortopedia Maria Antonia 3
Cirurgia André Martins 2
meaning I only need the highest count for each "Tipo_Esp". I tried to apply MAX over the count by using the above query as a subquery, but it did not go as expected. Can someone help me with this issue, please? Thanks in advance.
You could do this:
with orig as (
Select E.Tipo_Esp, F.Nome, F.Apelido, count (Ac.id_acto) as total_consultas from
Especialidade as E
right join Funcionario as F on F.id_Esp = E.id_Esp
inner join Acto as Ac on Ac.id_func = F.id_func
inner join TipoActo as TA on TA.id_Tipo_acto = Ac.id_Tipo_acto
WHERE TA.Descricao_Acto = 'Consulta'
GROUP BY E.Tipo_Esp, F.Nome, F.Apelido
-- no ORDER BY here: SQL Server does not allow it inside a CTE without TOP
)
select o.*
from orig o
inner join (
-- highest count per Tipo_Esp
select tipo_esp, max(total_consultas) as maxtotal
from orig
group by tipo_esp
) t on o.tipo_esp = t.tipo_esp and o.total_consultas = t.maxtotal
order by o.total_consultas desc
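The same "top row per group" result can likely be obtained with a ranking function instead of a self-join. This is a minimal sketch, not the answerer's method: the table variable @Consultas is a hypothetical stand-in that holds the aggregated rows from the question, and RANK() is chosen so that ties for the highest count are all returned, as they are with the join above.

```sql
-- Hypothetical stand-in for the aggregated result shown in the question.
DECLARE @Consultas TABLE (
    Tipo_Esp        nvarchar(30),
    Nome            nvarchar(30),
    Apelido         nvarchar(30),
    total_consultas int
);
INSERT INTO @Consultas (Tipo_Esp, Nome, Apelido, total_consultas) VALUES
    (N'Ortopedia', N'Maria', N'Antonia', 3),
    (N'Ortopedia', N'Luis',  N'Cruz',    1),
    (N'Cirurgia',  N'André', N'Martins', 2),
    (N'Cirurgia',  N'Diogo', N'Martins', 1);

-- Rank the rows within each Tipo_Esp by count, highest first, and keep rank 1.
SELECT Tipo_Esp, Nome, Apelido, total_consultas
FROM (
    SELECT Tipo_Esp, Nome, Apelido, total_consultas,
           RANK() OVER (PARTITION BY Tipo_Esp
                        ORDER BY total_consultas DESC) AS rk
    FROM @Consultas
) AS ranked
WHERE rk = 1
ORDER BY total_consultas DESC;
-- Returns Ortopedia / Maria Antonia / 3 and Cirurgia / André Martins / 2.
```

Swapping RANK() for ROW_NUMBER() would return exactly one row per Tipo_Esp even when two people tie.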

Difference Max and Min from Different Dates

I'm going to try to explain this the best I can.
The code below does the following:
Finds a service address from the ServiceLocation table.
Finds a service type (electric or water).
Finds how many days in the past to pull data.
Once it has these, it calculates the "daily usage" by subtracting the minimum meter read for a day from the maximum meter read for that day.
(MAX(mr.Reading) - MIN(mr.Reading)) AS 'DaytimeUsage'
However, what I'm missing is the max reading from the day prior and the minimum reading from the current day. Mathematically, this should look something like this:
MAX(PriorDayReading) - MIN(ReadDateReading)
Essentially, if it goes back 5 days it should kick out a table that reads as follows:
Service Location | Read Date | Usage |
123 Main St | 4/20/15 | 12 |
123 Main St | 4/19/15 | 8 |
123 Main St | 4/18/15 | 6 |
123 Main St | 4/17/15 | 10 |
123 Main St | 4/16/15 | 11 |
Where "Usage" is the 'DaytimeUsage' plus the usage that I'm missing (the question above). For example, 4/18/15 would be the 'DaytimeUsage' in the query below PLUS the difference between the MAX read from 4/17/15 and the MIN read from 4/18/15.
I'm not sure how to accomplish this or if it is possible.
SELECT
A.ServiceAddress AS 'Service Address',
convert(VARCHAR(10),A.ReadDate,101) AS 'Date',
SUM(A.[DaytimeUsage]) AS 'Usage'
FROM
(
SELECT
sl.location_addr AS 'ServiceAddress',
convert(VARCHAR(10),mr.read_date,101) AS 'ReadDate',
(MAX(mr.Reading) - MIN(mr.Reading)) AS 'DaytimeUsage'
FROM
DimServiceLocation AS sl
INNER JOIN FactBill AS fb ON fb.ServiceLocationKey = sl.ServiceLocationKey
INNER JOIN FactMeterRead as mr ON mr.ServiceLocationKey = sl.ServiceLocationKey
INNER JOIN DimCustomer AS c ON c.CustomerKey = fb.CustomerKey
WHERE
c.class_name = 'Tenant'
AND sl.ServiceLocationKey = #ServiceLocation
AND mr.meter_type = #ServiceType
GROUP BY
sl.location_addr,
convert(VARCHAR(10),
mr.read_date,101)
) A
WHERE A.ReadDate >= GETDATE()-#Days
GROUP BY A.ServiceAddress, convert(VARCHAR(10),A.ReadDate,101)
ORDER BY convert(VARCHAR(10),A.ReadDate,101) DESC
It seems like you could solve this by just calculating the difference between yesterday's MAX and today's MAX; however, this is how I would approach it. Join to the same table again for the day before any given day, and select that day's MAX/MIN within your inner query as well. Also, if you place the date filter in the inner query's WHERE clause, the data set returned will be smaller and the query quicker.
SELECT
A.ServiceAddress AS 'Service Address',
convert(VARCHAR(10),A.ReadDate,101) AS 'Date',
SUM(A.[TodayMax]) - SUM(A.[TodayMin]) AS 'Usage',
SUM(A.[TodayMax]) - SUM(A.[YesterdayMax]) AS 'Usage with extra bit you want'
FROM
(
SELECT
sl.location_addr AS 'ServiceAddress',
convert(VARCHAR(10),mrT.read_date,101) AS 'ReadDate',
MAX(mrT.Reading) AS 'TodayMax',
MIN(mrT.Reading) AS 'TodayMin',
MAX(mrY.Reading) AS 'YesterdayMax',
MIN(mrY.Reading) AS 'YesterdayMin'
FROM
DimServiceLocation AS sl
INNER JOIN FactBill AS fb ON fb.ServiceLocationKey = sl.ServiceLocationKey
INNER JOIN FactMeterRead as mrT ON mrT.ServiceLocationKey = sl.ServiceLocationKey
INNER JOIN FactMeterRead as mrY ON mrY.ServiceLocationKey = sl.ServiceLocationKey
AND mrY.read_date = mrT.read_date - 1
INNER JOIN DimCustomer AS c ON c.CustomerKey = fb.CustomerKey
WHERE
c.class_name = 'Tenant'
AND sl.ServiceLocationKey = #ServiceLocation
AND mrT.meter_type = #ServiceType
AND mrT.read_date >= GETDATE()-#Days
GROUP BY
sl.location_addr,
convert(VARCHAR(10),
mrT.read_date,101)
) A
GROUP BY A.ServiceAddress, convert(VARCHAR(10),A.ReadDate,101)
ORDER BY convert(VARCHAR(10),A.ReadDate,101) DESC
You can use the APPLY operator if you are on SQL Server 2005 or later. Here is a link to the documentation: https://technet.microsoft.com/en-us/library/ms175156(v=sql.105).aspx The APPLY operator comes in two forms, OUTER APPLY and CROSS APPLY: OUTER works like a left join and CROSS works like an inner join. They let you run a query once for each row returned by the outer query. I set up my own sample of what you were trying to do; here it is, and I hope it helps.
http://sqlfiddle.com/#!6/fdb3f/1
CREATE TABLE SequencedValues (
Location varchar(50) NOT NULL,
CalendarDate datetime NOT NULL,
Reading int
)
INSERT INTO SequencedValues (
Location,
CalendarDate,
Reading
)
SELECT
'Address1',
'4/20/2015',
10
UNION SELECT
'Address1',
'4/19/2015',
9
UNION SELECT
'Address1',
'4/19/2015',
20
UNION SELECT
'Address1',
'4/19/2015',
25
UNION SELECT
'Address1',
'4/18/2015',
8
UNION SELECT
'Address1',
'4/17/2015',
7
UNION SELECT
'Address2',
'4/20/2015',
100
UNION SELECT
'Address2',
'4/20/2015',
111
UNION SELECT
'Address2',
'4/19/2015',
50
UNION SELECT
'Address2',
'4/19/2015',
65
SELECT DISTINCT
sv.Location,
sv.CalendarDate,
sv_dayof.MINDayOfReading,
sv_daybefore.MAXDayBeforeReading
FROM SequencedValues sv
OUTER APPLY (
SELECT MIN(sv_dayof_inside.Reading) AS MINDayOfReading
FROM SequencedValues sv_dayof_inside
WHERE sv.Location = sv_dayof_inside.Location
AND sv.CalendarDate = sv_dayof_inside.CalendarDate
) sv_dayof
OUTER APPLY (
SELECT MAX(sv_daybefore_max.Reading) AS MAXDayBeforeReading
FROM SequencedValues sv_daybefore_max
WHERE sv.Location = sv_daybefore_max.Location
AND sv_daybefore_max.CalendarDate IN (
SELECT TOP 1 sv_daybefore_inside.CalendarDate
FROM SequencedValues sv_daybefore_inside
WHERE sv.Location = sv_daybefore_inside.Location
AND sv.CalendarDate > sv_daybefore_inside.CalendarDate
ORDER BY sv_daybefore_inside.CalendarDate DESC
)
) sv_daybefore
ORDER BY
sv.Location,
sv.CalendarDate DESC
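On SQL Server 2012 or later, the "maximum reading from the previous day" lookup can probably be written more compactly with LAG over per-day aggregates. The sketch below is only an illustration: the table variable @Reads and its values are hypothetical, and, like the TOP 1 subquery above, it treats "previous day" as the most recent earlier date that has readings.

```sql
-- Hypothetical sample readings for one location (names and values are assumptions).
DECLARE @Reads TABLE (Location varchar(50), CalendarDate date, Reading int);
INSERT INTO @Reads (Location, CalendarDate, Reading) VALUES
    ('Address1', '2015-04-17', 100), ('Address1', '2015-04-17', 104),
    ('Address1', '2015-04-18', 106), ('Address1', '2015-04-18', 111),
    ('Address1', '2015-04-19', 115), ('Address1', '2015-04-19', 120);

WITH Daily AS (
    -- One row per location and day with that day's lowest and highest reading.
    SELECT Location, CalendarDate,
           MIN(Reading) AS MinReading,
           MAX(Reading) AS MaxReading
    FROM @Reads
    GROUP BY Location, CalendarDate
)
SELECT Location, CalendarDate, MinReading, MaxReading,
       -- Highest reading of the previous day that has data (NULL for the first day).
       LAG(MaxReading) OVER (PARTITION BY Location
                             ORDER BY CalendarDate) AS PrevDayMaxReading
FROM Daily
ORDER BY Location, CalendarDate DESC;
```

With the three columns side by side, the daytime and overnight differences can be computed in the same SELECT.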
I'm not sure I fully understood your db structure, but I may have a solution, so feel free to edit my answer to adapt it or correct any mistake.
The idea is to use two aliases for the table FactMeterRead: mrY (Y for yesterday) and mrT (T for today), and to differentiate them with a read_date restriction.
However, I didn't understand your tables well enough to write a fully functional query. I hope you will get the idea anyway from this example.
SELECT
sl.location_addr AS 'ServiceAddress',
convert(VARCHAR(10),mrT.read_date,101) AS 'ReadDate',
(MAX(mrY.Reading) - MIN(mrT.Reading)) AS 'DaytimeUsage'
FROM
DimServiceLocation AS sl
INNER JOIN FactMeterRead as mrY ON mrY.ServiceLocationKey = sl.ServiceLocationKey
INNER JOIN FactMeterRead as mrT ON mrT.ServiceLocationKey = sl.ServiceLocationKey
-- T-SQL has no DATE_SUB; DATEADD with a negative offset gives the previous day
WHERE CAST(mrY.read_date AS date) = DATEADD(DAY, -1, CAST(mrT.read_date AS date))
GROUP BY sl.location_addr, convert(VARCHAR(10),mrT.read_date,101)

Rollup / recursive addition SQL Server 2008

I have a query with rollup that outputs data like (the query is a little busy, but I can post if necessary)
range subCounts Counts percent
1-9 3 100 3.0
10-19 13 100 13.0
20-29 30 100 33.0
30-39 74 100 74.0
NULL 100 100 100.0
How is it possible to keep a running total of percent? Say I need to find the bottom 15th percentile: in this case 3+13=16, so I would like the row where the running total first reaches 15 to be returned:
range subCounts counts percent
10-19 13 100 13.0
EDIT1: here is the query
select '$'+cast(+bin*10000 + ' ' as varchar(10)) + '-' + cast(bin*10000+9999 as varchar(10)) as bins,
count(*) as numbers,
(select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')
) as Totals
, round(100*count(*)/cast((select count(distinct patient.patientid) from patient
inner join tblclaims on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on tblhospitals.hospitalnpi = patient.hospitalnpi
where (tblhospitals.hospitalname = 'X')) as float),2) as binsPercent
from
(
select tblclaims.patientid, sum(claimsmedicarepaid) as TotalCosts,
cast(sum(claimsmedicarePaid)/10000 as int) as bin
from tblclaims inner join patient on patient.patientid = tblclaims.patientid
and patient.admissiondate = tblclaims.admissiondate
and patient.dischargedate = tblclaims.dischargedate
inner join tblhospitals on patient.hospitalnpi = tblhospitals.hospitalnpi
where tblhospitals.hospitalname = 'X'
group by tblclaims.patientid
) as t
group by bin with rollup
OK, so for whoever might use this for reference, I figured out what I needed to do.
I added row_number() over (order by bin) as rownum to the query and saved all of this as a view.
Then I used
SELECT t1.rownum, t1.bins, t1.numbers, t1.uktotal, t1.binspercent,
SUM(t2.binspercent) AS runningPercent
FROM t t1
INNER JOIN t t2 ON t1.rownum >= t2.rownum
GROUP BY t1.rownum,
t1.bins, t1.numbers, t1.uktotal, t1.binspercent
ORDER BY t1.rownum
Joining on t1.rownum >= t2.rownum pairs each row with itself and every earlier row, so the SUM gives a running total.
This isn't exactly what I was looking for, but it's on the same track:
http://blog.tallan.com/2011/12/08/sql-server-2012-windowing-functions-part-1-of-2-running-and-sliding-aggregates/ and http://blog.tallan.com/2011/12/19/sql-server-2012-windowing-functions-part-2-of-2-new-analytic-functions/ - check out PERCENT_RANK, CUME_DIST, PERCENTILE_CONT, and PERCENTILE_DISC.
Sorry for the lame answer
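For readers on SQL Server 2012 or later (not 2008, which the question targets), the running total can probably be computed without a self-join by adding ORDER BY to a windowed SUM. This is a minimal sketch under that assumption; the table variable @Bins and its values are hypothetical and stand in for the view with the rownum column described above.

```sql
-- Hypothetical stand-in for the saved view (percentages chosen to add up to 100).
DECLARE @Bins TABLE (rownum int, bins varchar(10), binsPercent decimal(5,1));
INSERT INTO @Bins (rownum, bins, binsPercent) VALUES
    (1, '1-9',    3.0),
    (2, '10-19', 13.0),
    (3, '20-29', 30.0),
    (4, '30-39', 54.0);

WITH Running AS (
    -- Running total of binsPercent in rownum order (SQL Server 2012+ syntax).
    SELECT rownum, bins, binsPercent,
           SUM(binsPercent) OVER (ORDER BY rownum
                                  ROWS UNBOUNDED PRECEDING) AS runningPercent
    FROM @Bins
)
-- First bin at which the running total reaches the 15 percent mark.
SELECT TOP (1) rownum, bins, binsPercent, runningPercent
FROM Running
WHERE runningPercent >= 15
ORDER BY rownum;
-- Returns the 10-19 row with runningPercent = 16.0.
```

On SQL Server 2008 the self-join on rownum remains the way to get the same result.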