Trouble running a complex query in SQL?

I am pretty new to SQL Server and just started playing with it. I am trying to create a table that shows attendance percentage by department.
So first I run this query:
SELECT CrewDesc, COUNT(*)
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
This gives a table like this:
Accounting 10
Marketing 5
Economics 20
Engineering 5
Machinery 5
Tech Support 10
Then I run another query:
SELECT DeptDescription, COUNT(*)
FROM database.Attendee
GROUP BY DeptDescription
This gives me the number of people per department who have attended a meeting, something like:
Accounting 8
Marketing 5
Economics 15
Engineering 10
Tech Support 8
Then I get the current week of the year with SELECT Datepart(ww, GetDate()) as CurrentWeek. To keep this example simple, let's assume this is week 2.
My plan was to create a table for each step, but that seems wasteful. Is there a way to combine the two results in a single query? In the end I would like a table like this:
Dept Total# Attd Week (Total*Week) Attd/(Total*week)%
Accounting 10 8 2 20 8/20
Marketing 5 5 2 10 5/10
Economics 20 15 2 40 15/40
Engineering 5 10 2 10 10/10
Machinery 5 NULL 2 10 0/10
Tech Support 10 8 2 20 8/20

Ok, note that my recommendation below is based on your exact existing queries - there are certainly other ways to construct this that may be more performant, but functionally this should work for your requirement. Also, it illustrates the key features of different join types that happen to be relevant for your request, as well as inline views (aka nested queries), which are a super-powerful technique in the SQL language as a whole.
select t1.CrewDesc, t1.Total, t2.Attd, t3.Week,
(t1.Total*t3.Week) as Total_x_Week,
case when isnull(t1.Total*t3.Week, 0) = 0 then 0 else 1.0 * isnull(t2.Attd, 0) / (t1.Total*t3.Week) end as PercentageAttd /* 1.0 * forces decimal division; int / int would drop the fraction */
from (
SELECT CrewDesc, COUNT(*) AS Total
FROM database.emp INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
) t1
left outer join /* left outer to keep all rows from t1 */ (
SELECT DeptDescription, COUNT(*) AS Attd
FROM database.Attendee GROUP BY DeptDescription
) t2
on t1.CrewDesc = t2.DeptDescription
cross join /* useful when adding a scalar value to all rows */ (
SELECT Datepart(ww, GetDate()) as Week
) t3
order by t1.CrewDesc
Good luck!
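To see the join types from this answer at work without a SQL Server instance, here is a minimal sketch that runs the same shape of query in SQLite through Python's sqlite3 module. The tables crew_total and attendee_total are hypothetical stand-ins that already hold the per-department counts from the question, the week is passed in as a constant (2, as the question assumes) because SQLite has no Datepart, and IFNULL replaces ISNULL. It also shows why the ratio needs a decimal operand: both SQLite and SQL Server do integer division on two integers.

```python
import sqlite3

# Hypothetical sample data mirroring the counts shown in the question.
HEADCOUNT = {"Accounting": 10, "Marketing": 5, "Economics": 20,
             "Engineering": 5, "Machinery": 5, "Tech Support": 10}
ATTENDED = {"Accounting": 8, "Marketing": 5, "Economics": 15,
            "Engineering": 10, "Tech Support": 8}


def attendance_report(week=2):
    """Return (dept, total, attd, week, total*week, int ratio, decimal ratio)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE crew_total (CrewDesc TEXT, Total INTEGER)")
    con.execute("CREATE TABLE attendee_total (DeptDescription TEXT, Attd INTEGER)")
    con.executemany("INSERT INTO crew_total VALUES (?, ?)", HEADCOUNT.items())
    con.executemany("INSERT INTO attendee_total VALUES (?, ?)", ATTENDED.items())
    rows = con.execute(
        """
        SELECT t1.CrewDesc, t1.Total, t2.Attd, t3.Week,
               t1.Total * t3.Week AS Total_x_Week,
               -- integer / integer drops the fraction
               IFNULL(t2.Attd, 0) / (t1.Total * t3.Week) AS IntRatio,
               -- multiplying by 1.0 first gives a decimal result
               1.0 * IFNULL(t2.Attd, 0) / (t1.Total * t3.Week) AS Ratio
        FROM crew_total t1
        LEFT JOIN attendee_total t2          -- keeps departments with no attendees
               ON t1.CrewDesc = t2.DeptDescription
        CROSS JOIN (SELECT ? AS Week) t3     -- adds the scalar week to every row
        ORDER BY t1.CrewDesc
        """,
        (week,),
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in attendance_report():
        print(row)
```

Machinery survives the LEFT JOIN with a NULL attendance count, and Accounting comes out as 0 in the integer column but 0.4 in the decimal one.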

Try something like this
SELECT COALESCE(a.crewdesc,b.deptdescription),
a.total,
b.attd,
Datepart(ww, Getdate()) AS week,
a.total * Datepart(ww, Getdate()) AS total_x_week,
1.0 * b.attd / (a.total * Datepart(ww, Getdate())) AS pct_attd /* 1.0 * avoids integer division */
FROM (query 1) a
FULL OUTER JOIN (query 2) b
ON a.crewdesc = b.deptdescription

WITH Total AS ( SELECT CrewDesc, COUNT(*) AS [Count]
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
),
Attd AS ( SELECT DeptDescription, COUNT(*) AS [Count]
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT COALESCE(CrewDesc,DeptDescription) AS [Dept],
Total.[Count] AS [Total#],Attd.[Count] AS [Attd],
Total.[Count] * Datepart(ww, GetDate()) AS [(Total*Week)],
CAST(ISNULL(Attd.[Count], 0) AS VARCHAR(10))+'/'+ CAST((Total.[Count] * Datepart(ww, GetDate()))AS VARCHAR(10)) AS [Attd/(Total*week)%]
FROM Total LEFT JOIN Attd ON Total.CrewDesc = Attd.DeptDescription /* LEFT JOIN keeps departments with no attendees, e.g. Machinery */

I'm assuming your queries are correct. You give no real information about your model, so I have no way to know. They look wrong, since the same data is called CrewDesc in one table and DeptDescription in another. The join on sim1 = sim2 also seems very strange to me. In any case, given the queries you posted, this will work.
With TAttend as
(
SELECT CrewDesc, COUNT(*) as TotalNum
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
), Attend as
(
SELECT DeptDescription, COUNT(*) as Attd
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT CrewDesc as Dept, TotalNum, ISNULL(Attd, 0) as Attd ,Datepart(ww, GetDate()) as Week,
CASE WHEN ISNULL(Attd, 0) = 0 THEN 0
ELSE 1.0 * Attd / (TotalNum * Datepart(ww, GetDate()) ) END AS [Percent]
FROM TAttend
LEFT JOIN Attend on CrewDesc = DeptDescription

Related

See the distribution of secondary requests grouped by time interval in sql

I have the following table:
RequestId,Type, Date, ParentRequestId
1 1 2020-10-15 null
2 2 2020-10-19 1
3 1 2020-10-20 null
4 2 2020-11-15 3
For this example I am interested in the request type 1 and 2, to make the example simpler. My task is to query a big database and to see the distribution of the secondary transaction based on the difference of dates with the parent one. So the result would look like:
Interval,Percentage
0-7 days,50 %
8-15 days,0 %
16-50 days, 50 %
So the first line of the expected result comes from the request with id 2, and the third line comes from the request with id 4, because their date differences fall in those intervals.
How to achieve this?
I'm using sql server 2014.
We like to see your attempts, but by the looks of it, it seems like you're going to need to treat this table as 2 tables and do a basic GROUP BY, but make it fancy by grouping on a CASE statement.
WITH dateDiffs as (
/* perform our date calculations first, to get that out of the way */
SELECT
DATEDIFF(Day, parent.[Date], child.[Date]) as daysDiff,
1 as rowsFound
FROM (SELECT RequestID, [Date] FROM myTable WHERE Type = 1) parent
INNER JOIN (SELECT ParentRequestID, [Date] FROM myTable WHERE Type = 2) child
ON parent.requestID = child.parentRequestID
)
/* Now group and aggregate and enjoy your maths! */
SELECT
case when daysDiff between 0 and 7 then '0-7'
when daysDiff between 8 and 15 then '8-15'
when daysDiff between 16 and 50 THEN '16-50'
else '50+'
end as myInterval,
sum(rowsFound) as totalFound,
(select sum(rowsFound) from dateDiffs) as totalRows,
1.0 * sum(rowsFound) / (select sum(rowsFound) from dateDiffs) * 100.00 as percentFound
FROM dateDiffs
GROUP BY
case when daysDiff between 0 and 7 then '0-7'
when daysDiff between 8 and 15 then '8-15'
when daysDiff between 16 and 50 THEN '16-50'
else '50+'
end;
This seems like basically a join and group by query:
with dates as (
select 0 as lo, 7 as hi, '0-7 days' as grp union all
select 8 as lo, 15 as hi, '8-15 days' union all
select 16 as lo, 50 as hi, '16-50 days'
)
select d.grp,
count(t.RequestId) as cnt,
count(t.RequestId) * 1.0 / sum(count(t.RequestId)) over () as ratio
from dates d left join
(t join
t tp
on tp.RequestId = t.ParentRequestId
)
on datediff(day, tp.date, t.date) between d.lo and d.hi
group by d.grp, d.lo
order by d.lo;
The only trick is generating all the date groups, so you have rows with zero values.
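As a rough check of the "generate every bucket, then left join" idea, here is a small sketch that runs it in SQLite via Python's sqlite3 against the four sample rows from the question. The table name t is a placeholder, julianday() arithmetic stands in for DATEDIFF (SQLite has no such function), and the percentage uses a scalar subquery rather than a window function, so treat it as an illustration of the technique rather than a drop-in SQL Server query.

```python
import sqlite3


def interval_distribution():
    """Return (bucket label, count, percentage) for every bucket, including empty ones."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (RequestId INTEGER, Type INTEGER, Date TEXT, ParentRequestId INTEGER)"
    )
    con.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            (1, 1, "2020-10-15", None),
            (2, 2, "2020-10-19", 1),
            (3, 1, "2020-10-20", None),
            (4, 2, "2020-11-15", 3),
        ],
    )
    rows = con.execute(
        """
        WITH buckets(lo, hi, grp) AS (
            SELECT 0, 7, '0-7 days' UNION ALL
            SELECT 8, 15, '8-15 days' UNION ALL
            SELECT 16, 50, '16-50 days'
        ),
        diffs AS (
            -- days between each child request and its parent
            SELECT CAST(julianday(c.Date) - julianday(p.Date) AS INTEGER) AS days
            FROM t c JOIN t p ON p.RequestId = c.ParentRequestId
        )
        SELECT b.grp,
               COUNT(d.days) AS cnt,   -- counts only matched rows, so empty buckets give 0
               100.0 * COUNT(d.days) / (SELECT COUNT(*) FROM diffs) AS pct
        FROM buckets b
        LEFT JOIN diffs d ON d.days BETWEEN b.lo AND b.hi
        GROUP BY b.grp, b.lo
        ORDER BY b.lo
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in interval_distribution():
        print(row)
```

Counting a column from the joined side, rather than COUNT(*), is what lets the empty 8-15 bucket report 0 instead of 1.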

Having clause being ignored

In this query, I am attempting to get a count that gives me a count of patients for each practice under given conditions.
The issue is that I have to show patients who have had >=3 office visits in the past year.
Count(D.PID)
in the select list is ignoring
HAVING count(admitdatetime)>=3
Here is my query
select distinct D.PracticeAbbrevName, D.ProviderLastName, count(D.pid) AS Count
from PersonDetail AS D
left join Visit AS V on D.PID = V.PID
where D.A1C >=7.5 and V.admitdatetime >= (getdate()-365) and D.A1CDays <180 and D.Diabetes = 1
group by D.PracticeAbbrevName, D.ProviderLastName
having count(admitdatetime)>=3
order by PracticeAbbrevName
If I get rid of the count function for D.pid, and just display each PID individually, my having phrase works properly.
There is something about COUNT and HAVING that does not work properly together.
Revised answer:
SELECT DISTINCT
D.PracticeAbbrevName,
D.ProviderLastName,
COUNT(D.pid) AS PIDCount,
COUNT(admitdatetime) AS AdmitCount
FROM
PersonDetail AS D
LEFT JOIN Visit AS V
ON D.PID = V.PID
WHERE
D.A1C >= 7.5
AND V.admitdatetime >= ( GETDATE() - 365 )
AND D.A1CDays < 180
AND D.Diabetes = 1
GROUP BY
D.PracticeAbbrevName,
D.ProviderLastName
HAVING
COUNT(admitdatetime) >= 3
ORDER BY
PracticeAbbrevName
You're trying to do too much at once. Split the logic in 2 steps:
Query grouping by PID to filter out patients that don't meet your criteria.
Query grouping by practice to get a patient count.
Your query would look like this:
;with EligiblePatients as (
select d.pid,
d.PracticeAbbrevName,
d.ProviderLastName
from PersonDetail d
left join Visit v
on v.pid = d.pid
and v.admitdatetime >= (getdate()-365)
where d.A1C >= 7.5
and d.A1CDays < 180
and d.Diabetes = 1
group by d.pid,
d.PracticeAbbrevName,
d.ProviderLastName
having count(v.pid) >= 3
)
select PracticeAbbrevName,
ProviderLastName,
COUNT(*) as PatientCount
from EligiblePatients
group by PracticeAbbrevName,
ProviderLastName
order by PracticeAbbrevName
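To make the difference between the one-step and two-step versions concrete, here is a minimal sketch in Python's sqlite3 with made-up data. The practice names and visit dates are hypothetical, and the A1C, Diabetes, provider and date filters are left out to keep it short, so it only illustrates the grouping logic.

```python
import sqlite3


def patients_per_practice(min_visits=3):
    """Return (one_step, two_step) result lists for the same data."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE PersonDetail (PID INTEGER, PracticeAbbrevName TEXT)")
    con.execute("CREATE TABLE Visit (PID INTEGER, admitdatetime TEXT)")
    con.executemany(
        "INSERT INTO PersonDetail VALUES (?, ?)",
        [(1, "North"), (2, "North"), (3, "South")],
    )
    # Patient 1 has 3 visits, patient 2 has 1, patient 3 has 4.
    visits = [(1, "2024-01-0%d" % d) for d in range(1, 4)]
    visits += [(2, "2024-02-01")]
    visits += [(3, "2024-03-0%d" % d) for d in range(1, 5)]
    con.executemany("INSERT INTO Visit VALUES (?, ?)", visits)

    # One step: HAVING tests the visit count of the whole practice,
    # and COUNT(d.PID) counts joined rows, not patients.
    one_step = con.execute(
        """
        SELECT d.PracticeAbbrevName, COUNT(d.PID)
        FROM PersonDetail d LEFT JOIN Visit v ON v.PID = d.PID
        GROUP BY d.PracticeAbbrevName
        HAVING COUNT(v.admitdatetime) >= ?
        ORDER BY d.PracticeAbbrevName
        """,
        (min_visits,),
    ).fetchall()

    # Two steps: filter per patient first, then count patients per practice.
    two_step = con.execute(
        """
        WITH EligiblePatients AS (
            SELECT d.PID, d.PracticeAbbrevName
            FROM PersonDetail d LEFT JOIN Visit v ON v.PID = d.PID
            GROUP BY d.PID, d.PracticeAbbrevName
            HAVING COUNT(v.PID) >= ?
        )
        SELECT PracticeAbbrevName, COUNT(*)
        FROM EligiblePatients
        GROUP BY PracticeAbbrevName
        ORDER BY PracticeAbbrevName
        """,
        (min_visits,),
    ).fetchall()
    con.close()
    return one_step, two_step


if __name__ == "__main__":
    print(patients_per_practice())
```

So HAVING is not being ignored in the original query. It is applied to the practice-level group, where the visits of all patients are pooled together.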

Aggregate totals and create data for weeks that no data exists

I have a table like such:
Region Date Cases
France 1-1-2014 5
Spain 2-5-2014 6
France 3-5-2014 7
...
What I would like to do is run an aggregated function like so, to group the total number of cases in weeks for each region.
select region, datepart(week, date) weeknbr, sum(cases) cases
from <table>
group by region, datepart(week, date)
order by region, datepart(week, date)
Using this aggregated function, is there a way to insert a zero value for each region when data does not exist for that week?
so the final result would look like:
region weeknbr cases
France 1 5
France 2 0
France 3 0
.....
Spain 1 0
Spain 2 0
Spain 3 0
....
Spain 8 6
I have tried to create a table with week numbers and then join the week numbers with my data, but have been unsuccessful. This ends up creating a null or zero value for the region and cases. I can always use the isnull function to make the cases 0, but I need to account for each region for each week. That's what's killing me right now. Is this possible? If not, where should I start looking and how should I modify the underlying tables?
Any help would be greatly appreciated. Thank you
If I understand your meaning correctly, you could always generate artificial rows, cross join on grouped regions for completeness of your 0's, then left join your aggregate table on region and week. So:
select r.region, w.RowID as weeknbr, isnull(c.cases, 0) as cases
from (
select row_number() over (order by name) as RowID
from master..spt_values
) w
cross join (
select region
from <table>
group by region
) r
left join (
select region, datepart(week, date) weeknbr, sum(cases) cases
from <table>
group by region, datepart(week, date)
) c on (w.RowID = c.weeknbr and r.region = c.region)
where w.RowID <= 53 /* limit the generated rows to week numbers */
order by r.region, w.RowID
You need a date_list table and a region_list table. Cross join the dimension tables to get all date-region combinations and then left join against your fact table.
SELECT
d.date,
r.region,
COALESCE(t.cases, 0) AS cases
FROM date_list d
CROSS JOIN region_list r
LEFT JOIN date_region t ON d.date = t.date AND r.region = t.region
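Here is a small sketch of that cross-join-then-left-join pattern, run in SQLite through Python's sqlite3. The weekly table with a precomputed weeknbr column is a hypothetical simplification (SQLite has no datepart), the week list comes from a recursive CTE instead of a real date_list table, and the data is cut down to three weeks so the output stays short.

```python
import sqlite3


def weekly_cases(max_week=3):
    """Return one (region, week, cases) row per region and week, with 0 for gaps."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE weekly (region TEXT, weeknbr INTEGER, cases INTEGER)")
    con.executemany(
        "INSERT INTO weekly VALUES (?, ?, ?)",
        [("France", 1, 5), ("Spain", 3, 6)],
    )
    rows = con.execute(
        """
        WITH RECURSIVE weeks(weeknbr) AS (
            -- generated week dimension: 1 .. max_week
            SELECT 1 UNION ALL SELECT weeknbr + 1 FROM weeks WHERE weeknbr < ?
        ),
        regions AS (SELECT DISTINCT region FROM weekly)
        SELECT r.region, w.weeknbr, COALESCE(SUM(f.cases), 0) AS cases
        FROM regions r
        CROSS JOIN weeks w                      -- every region paired with every week
        LEFT JOIN weekly f                      -- attach facts where they exist
               ON f.region = r.region AND f.weeknbr = w.weeknbr
        GROUP BY r.region, w.weeknbr
        ORDER BY r.region, w.weeknbr
        """,
        (max_week,),
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in weekly_cases():
        print(row)
```

The cross join is what guarantees a row for each region in each week; the left join and COALESCE only decide what number goes in it.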

Missing a single day

My database has two tables, a car table and a wheel table.
I'm trying to find the number of wheels that meet a certain condition over a range of days, but some days are not included in the output.
Here is the query:
USE CarDB
SELECT MONTH(c.DateTime1) 'Month',
DAY(c.DateTime1) 'Day',
COUNT(w.ID) 'Wheels'
FROM tblCar c
INNER JOIN tblWheel w
ON c.ID = w.CarID
WHERE c.DateTime1 BETWEEN '05/01/2013' AND '06/04/2013'
AND w.Measurement < 18
GROUP BY MONTH(c.DateTime1), DAY(c.DateTime1)
ORDER BY [Month], [Day]
GO
The output results seem to be correct, but days with 0 wheels do not show up. For example:
Sample Current Output:
Month Day Wheels
2 1 7
2 2 4
2 3 2 -- 2/4 is missing
2 5 9
Sample Desired Output:
Month Day Wheels
2 1 7
2 2 4
2 3 2
2 4 0
2 5 9
I also tried a left join but it didn't seem to work.
You were on the right track with a LEFT JOIN.
Try running your query with a LEFT JOIN but without the WHERE clause. Notice anything?
What's happening is that the join is applied first, and then the WHERE clause removes every row that fails the criteria, including the rows whose wheel columns are NULL because no wheel matched. All this happens before the GROUP BY, so those cars (and their days) never reach the count.
Here's one method for you:
SELECT Year(cars.datetime1) As the_year
, Month(cars.datetime1) As the_month
, Day(cars.datetime1) As the_day
, Count(wheels.id) As wheels
FROM (
SELECT id
, datetime1
FROM tblcar
WHERE datetime1 BETWEEN '2013-01-05' AND '2013-04-06'
) As cars
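LEFT
JOIN tblwheel As wheels
ON wheels.carid = cars.id
AND wheels.measurement < 18 /* filter wheels in the ON clause so cars without a match are kept */
GROUP
BY Year(cars.datetime1)
, Month(cars.datetime1)
, Day(cars.datetime1)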
What's different this time round is that we're limiting the results of the car table before we join to the wheels table.
You probably want to use a LEFT OUTER JOIN:
USE CarDB
SELECT MONTH (c.DateTime1) 'Month', DAY (c.DateTime1) 'Day', COUNT (w.ID) 'Wheels'
FROM tblCar c LEFT OUTER JOIN tblWheel w ON c.ID = w.CarID
WHERE c.DateTime1 BETWEEN '05/01/2013' AND '06/04/2013'
AND (w.Measurement IS NULL OR w.Measurement < 18)
GROUP BY MONTH (c.DateTime1), DAY (c.DateTime1)
ORDER BY [Month], [Day]
GO
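And then you need to adapt the WHERE condition, since you want to keep the rows where w.Measurement is NULL because of the OUTER join. Be aware that a car whose wheels all measure 18 or more is still filtered out this way; putting w.Measurement < 18 in the ON clause instead keeps it with a count of 0.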
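The effect of where the wheel filter sits is easy to check on a toy data set. The sketch below uses SQLite through Python's sqlite3, with a hypothetical integer Day column standing in for DateTime1 and three made-up cars, and compares the filter in WHERE, the "IS NULL OR" variant, and the filter in the ON clause.

```python
import sqlite3

QUERIES = {
    # Filter in WHERE: NULL-extended rows fail the test and disappear.
    "filter_in_where": """
        SELECT c.Day, COUNT(w.ID)
        FROM tblCar c LEFT JOIN tblWheel w ON w.CarID = c.ID
        WHERE w.Measurement < 18
        GROUP BY c.Day ORDER BY c.Day""",
    # IS NULL OR: keeps cars with no wheels, but not cars with only large wheels.
    "null_or_filter": """
        SELECT c.Day, COUNT(w.ID)
        FROM tblCar c LEFT JOIN tblWheel w ON w.CarID = c.ID
        WHERE w.Measurement IS NULL OR w.Measurement < 18
        GROUP BY c.Day ORDER BY c.Day""",
    # Filter in ON: every car is kept, unmatched ones count as 0.
    "filter_in_on": """
        SELECT c.Day, COUNT(w.ID)
        FROM tblCar c LEFT JOIN tblWheel w
          ON w.CarID = c.ID AND w.Measurement < 18
        GROUP BY c.Day ORDER BY c.Day""",
}


def wheel_counts():
    """Run all three variants against the same sample data."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tblCar (ID INTEGER, Day INTEGER)")
    con.execute("CREATE TABLE tblWheel (ID INTEGER, CarID INTEGER, Measurement INTEGER)")
    con.executemany("INSERT INTO tblCar VALUES (?, ?)", [(1, 3), (2, 4), (3, 5)])
    # Car 1 has one small wheel, car 2 has only a large wheel, car 3 has none.
    con.executemany("INSERT INTO tblWheel VALUES (?, ?, ?)", [(1, 1, 17), (2, 2, 20)])
    results = {name: con.execute(sql).fetchall() for name, sql in QUERIES.items()}
    con.close()
    return results


if __name__ == "__main__":
    for name, rows in wheel_counts().items():
        print(name, rows)
```

Only the ON-clause version returns a row for day 4. None of the variants can produce a day on which no car exists at all; that would need a calendar table to join from.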
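Remove the join and count each car's matching wheels in a correlated subquery instead. SQL Server cannot aggregate directly over a subquery, so put it in a CROSS APPLY and sum the per-car counts, keeping your date filter, GROUP BY and ORDER BY:
SELECT MONTH (c.DateTime1) 'Month', DAY (c.DateTime1) 'Day', SUM(x.WheelCount) 'Wheels'
FROM tblCar c
CROSS APPLY (SELECT COUNT(*) AS WheelCount FROM tblWheel w WHERE w.CarID = c.ID AND w.Measurement < 18) x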

Multiple joins to single table in SQL Server query throwing off counts

I am writing a stored procedure in SQL Server Management Studio 2005 to return a list of states and policy counts for two different time periods, month to date and year to date. I have created a couple of views to gather the required data and a stored procedure for use in a Reporting Services report.
Below is my stored procedure:
SELECT DISTINCT
S.[State],
COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD
FROM tblStates S
LEFT OUTER JOIN vwHospitalPolicies HP_MTD ON S.[State] = HP_MTD.[State]
AND HP_MTD.CreatedDate BETWEEN DATEADD(MONTH, -1, GETDATE()) AND GETDATE()
LEFT OUTER JOIN vwHospitalPolicies HP_YTD ON S.[State] = HP_YTD.[State]
AND HP_YTD.CreatedDate BETWEEN DATEADD(YEAR, -1, GETDATE()) AND GETDATE()
GROUP BY S.[State]
ORDER BY S.[State] ASC
The problem I am running into is my counts are bloating when a second LEFT OUTER JOIN is added, even the COUNT() that isn't referencing the second join. I need a left join since not all states will have policies for the given period, but they should still appear on the report.
Any suggestions would be greatly appreciated!
It sounds like you need:
COUNT(DISTINCT HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(DISTINCT HP_YTD.PolicyID) AS PolicyCount_YTD
instead of:
COUNT(HP_MTD.PolicyID) AS PolicyCount_MTD,
COUNT(HP_YTD.PolicyID) AS PolicyCount_YTD
Your original query is including the number of matching rows in the second join. Adding a DISTINCT clause inside the COUNT limits it to unique occurrences of the PolicyID.
I prefer CTEs for this sort of work - they're basically a sort of inline view.
Your actual problem is that some policies are being counted twice - once for the month-to-date, and once for the year-to-date - in both count columns. So, if you have source tables like this:
year-to-date
state policy
==================
1 1
1 2
1 3
2 4
month-to-date
state policy
==================
1 1
1 2
The result table for the JOINs (before COUNT is assessed) looks like this:
temp
state monthPolicy yearPolicy
================================
1 1 1
1 1 2
1 1 3
1 2 1
1 2 2
1 2 3
2 - 4
Clearly not what you want. Often, the solution is to use a CTE or other table reference to present summed-up records for the final join. Something like this:
WITH PoliciesMonthToDate (state, count) as (
SELECT state, COUNT(*)
FROM vwHospitalPolicies
WHERE createdDate >= DATEADD(MONTH, -1, GETDATE())
AND createdDate < DATEADD(DAY, 1, GETDATE())
GROUP BY state),
PoliciesYearToDate (state, count) as (
SELECT state, COUNT(*)
FROM vwHospitalPolicies
WHERE createdDate >= DATEADD(YEAR, -1, GETDATE())
AND createdDate < DATEADD(DAY, 1, GETDATE())
GROUP BY state)
SELECT a.state, COALESCE(b.count, 0) as policy_count_mtd,
COALESCE(c.count, 0) as policy_count_ytd
FROM tblStates a
LEFT JOIN PoliciesMonthToDate b
ON b.state = a.state
LEFT JOIN PoliciesYearToDate c
ON c.state = a.state
ORDER BY a.state
(minor nitpick - don't use prefixes for tables and views. It's noise, and if for some reason you switch between them, it would require a code change. That, or the name would then be misleading)
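The row multiplication described above can be reproduced with the same small tables. This sketch uses SQLite through Python's sqlite3; the table names states, mtd and ytd are stand-ins for tblStates and the two date-filtered slices of vwHospitalPolicies, loaded with the example rows from this answer. It compares the plain double join, the COUNT(DISTINCT ...) fix, and joining pre-aggregated counts.

```python
import sqlite3

QUERIES = {
    # Two joins to detail rows: state 1 yields 2 x 3 = 6 combined rows.
    "naive": """
        SELECT s.state, COUNT(m.policy), COUNT(y.policy)
        FROM states s
        LEFT JOIN mtd m ON m.state = s.state
        LEFT JOIN ytd y ON y.state = s.state
        GROUP BY s.state ORDER BY s.state""",
    # DISTINCT collapses the repeated policy ids again.
    "distinct": """
        SELECT s.state, COUNT(DISTINCT m.policy), COUNT(DISTINCT y.policy)
        FROM states s
        LEFT JOIN mtd m ON m.state = s.state
        LEFT JOIN ytd y ON y.state = s.state
        GROUP BY s.state ORDER BY s.state""",
    # Aggregate each side first, so each join adds at most one row per state.
    "pre_aggregated": """
        SELECT s.state, COALESCE(m.n, 0), COALESCE(y.n, 0)
        FROM states s
        LEFT JOIN (SELECT state, COUNT(*) AS n FROM mtd GROUP BY state) m
               ON m.state = s.state
        LEFT JOIN (SELECT state, COUNT(*) AS n FROM ytd GROUP BY state) y
               ON y.state = s.state
        ORDER BY s.state""",
}


def policy_counts():
    """Run the three variants against the example rows from the answer."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE states (state INTEGER)")
    con.execute("CREATE TABLE ytd (state INTEGER, policy INTEGER)")
    con.execute("CREATE TABLE mtd (state INTEGER, policy INTEGER)")
    con.executemany("INSERT INTO states VALUES (?)", [(1,), (2,)])
    con.executemany("INSERT INTO ytd VALUES (?, ?)", [(1, 1), (1, 2), (1, 3), (2, 4)])
    con.executemany("INSERT INTO mtd VALUES (?, ?)", [(1, 1), (1, 2)])
    results = {name: con.execute(sql).fetchall() for name, sql in QUERIES.items()}
    con.close()
    return results


if __name__ == "__main__":
    for name, rows in policy_counts().items():
        print(name, rows)
```

Both fixes give the same counts here. The pre-aggregated form avoids building the multiplied intermediate rows in the first place, which matters more as the tables grow.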