COUNT Multiple tables with relationships and users


I've got a group of tables in SQL Server, each of which I need to count with its own counting criteria. The problem is that I need to group the counts by user; the users are held in one table and are mapped to the tables I need to count via a relationship table.
This is the relationship table:
Relationship | User | Work Item
Analyst      | 1    | IR1
Analyst      | 2    | IR2
Analyst      | 2    | IR3
Analyst      | 1    | IR4
User         | 3    | IR1
Analyst      | 1    | SR2
Analyst      | 1    | SR3
Analyst      | 2    | SR4
This is the IR table (the SR table is identical):
ID  | Status
IR1 | Active
IR2 | Active
IR3 | Closed
IR4 | Active
This is the user table:
User | Name
1    | Dave
2    | Jim
3    | Karl
What I need is a table like below, counting only the active items:
Name | IR Count | SR Count
Dave | 2        | 2
Jim  | 1        | 1
All I seem to be able to do currently is count everything for each user regardless of status; I think this may be due to the left joins. I basically had:
Select u.name,
count (ir),
count (sr) from user u
Inner join relationship r on r.user=u.user and r.relationship = 'Analyst'
Left Join IR on r.workitem=ir.id and ir.Status = 'Active'
Left Join SR on r.workitem=sr.id and sr.Status = 'Active'
Group by u.name
I have simplified the above as much as possible. This is the actual query:
SELECT
u.DisplayName as Analyst,
u.BaseManagedEntityId as AUsername,
COUNT(distinct i.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) AS 'Active Incidents',
COUNT(distinct sr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Service Requests',
COUNT(distinct cr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Change Requests',
COUNT(distinct ma.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Manual Activities'
FROM MTV_System$Domain$User u
INNER JOIN RelationshipView r ON r.TargetEntityId = u.BaseManagedEntityId AND r.RelationshipTypeId = '15E577A3-6BF9-6713-4EAC-BA5A5B7C4722' AND r.IsDeleted ='0'
LEFT JOIN MTV_System$WorkItem$Incident i ON r.SourceEntityId = i.BaseManagedEntityId AND (i.Status_785407A9_729D_3A74_A383_575DB0CD50ED != '2B8830B6-59F0-F574-9C2A-F4B4682F1681' AND i.Status_785407A9_729D_3A74_A383_575DB0CD50ED != 'BD0AE7C4-3315-2EB3-7933-82DFC482DBAF')
LEFT JOIN MTV_System$WorkItem$ServiceRequest sr ON r.SourceEntityId = SR.BaseManagedEntityId AND (sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '72B55E17-1C7D-B34C-53AE-F61F8732E425' OR sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '59393F48-D85F-FA6D-2EBE-DCFF395D7ED1' OR sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '05306BF5-A6B9-B5AD-326B-BA4E9724BF37')
LEFT JOIN MTV_System$WorkItem$ChangeRequest cr on r.SourceEntityId = cr.BaseManagedEntityId AND (cr.Status_72C1BC70_443C_C96F_A624_A94F1C857138 = '6D6C64DD-07AC-AAF5-F812-6A7CCEB5154D' or cr.Status_72C1BC70_443C_C96F_A624_A94F1C857138 = 'DD6B0870-BCEA-1520-993D-9F1337E39D4D')
LEFT JOIN MTV_System$WorkItem$Activity$ManualActivity MA on r.SourceEntityId = ma.BaseManagedEntityId AND (ma.Status_8895EC8D_2CBF_0D9D_E8EC_524DEFA00014 = '11FC3CEF-15E5-BCA4-DEE0-9C1155EC8D83' OR ma.Status_8895EC8D_2CBF_0D9D_E8EC_524DEFA00014 = 'D544258F-24DA-1CF3-C230-B057AAA66BED')
GROUP BY u.DisplayName,u.BaseManagedEntityId
Order by u.DisplayName

Your oversimplification seems to have lost what, judging by some of your comments, appears to be your actual issue. Because of the left joins, a user is included even when there is nothing to count in any of your other tables. If you want the result set to include only users that have at least one Incident and/or one Request and/or one Change Request, etc., you have two options. You can filter after aggregating, removing the rows where Incidents + Requests + ... = 0. Or you can filter beforehand by adding a WHERE clause requiring that those other tables are not all NULL, which is the same as chaining IS NOT NULL tests with OR:
SELECT
u.DisplayName as Analyst,
u.BaseManagedEntityId as AUsername,
COUNT(distinct i.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) AS 'Active Incidents',
COUNT(distinct sr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Service Requests',
COUNT(distinct cr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Change Requests',
COUNT(distinct ma.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Manual Activities'
FROM
MTV_System$Domain$User u
INNER JOIN RelationshipView r
ON r.TargetEntityId = u.BaseManagedEntityId
AND r.RelationshipTypeId = '15E577A3-6BF9-6713-4EAC-BA5A5B7C4722'
AND r.IsDeleted ='0'
LEFT JOIN MTV_System$WorkItem$Incident i
ON r.SourceEntityId = i.BaseManagedEntityId
AND i.Status_785407A9_729D_3A74_A383_575DB0CD50ED NOT IN ('2B8830B6-59F0-F574-9C2A-F4B4682F1681','BD0AE7C4-3315-2EB3-7933-82DFC482DBAF')
LEFT JOIN MTV_System$WorkItem$ServiceRequest sr
ON r.SourceEntityId = SR.BaseManagedEntityId
AND sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F IN ('72B55E17-1C7D-B34C-53AE-F61F8732E425','59393F48-D85F-FA6D-2EBE-DCFF395D7ED1','05306BF5-A6B9-B5AD-326B-BA4E9724BF37')
LEFT JOIN MTV_System$WorkItem$ChangeRequest cr
ON r.SourceEntityId = cr.BaseManagedEntityId
AND cr.Status_72C1BC70_443C_C96F_A624_A94F1C857138 IN ('6D6C64DD-07AC-AAF5-F812-6A7CCEB5154D','DD6B0870-BCEA-1520-993D-9F1337E39D4D')
LEFT JOIN MTV_System$WorkItem$Activity$ManualActivity MA
ON r.SourceEntityId = ma.BaseManagedEntityId
AND ma.Status_8895EC8D_2CBF_0D9D_E8EC_524DEFA00014 IN ('11FC3CEF-15E5-BCA4-DEE0-9C1155EC8D83','D544258F-24DA-1CF3-C230-B057AAA66BED')
WHERE
i.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C IS NOT NULL
OR sr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C IS NOT NULL
OR cr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C IS NOT NULL
OR ma.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C IS NOT NULL
GROUP BY u.DisplayName,u.BaseManagedEntityId
Order by u.DisplayName
Also note the use of IN and NOT IN in the join conditions instead of repeated OR comparisons.
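The other option mentioned in this answer, filtering after the aggregation, could be sketched with a HAVING clause. This is only an illustration on the simplified tables from the question: the CTE names (usr, rel, ir, sr), the SR statuses and the extra analyst 'Ann' (who has only a closed item) are assumptions made up for the example.

```sql
-- Sample data from the question. Assumed for illustration: the SR statuses
-- and user 4 ('Ann'), an analyst whose only work item is closed.
WITH usr AS (
    SELECT * FROM (VALUES (1, 'Dave'), (2, 'Jim'), (3, 'Karl'), (4, 'Ann')) AS v(user_id, name)
), rel AS (
    SELECT * FROM (VALUES
        ('Analyst', 1, 'IR1'), ('Analyst', 2, 'IR2'), ('Analyst', 2, 'IR3'),
        ('Analyst', 1, 'IR4'), ('User',    3, 'IR1'), ('Analyst', 1, 'SR2'),
        ('Analyst', 1, 'SR3'), ('Analyst', 2, 'SR4'), ('Analyst', 4, 'IR3')
    ) AS v(relationship, user_id, workitem)
), ir AS (
    SELECT * FROM (VALUES ('IR1', 'Active'), ('IR2', 'Active'),
                          ('IR3', 'Closed'), ('IR4', 'Active')) AS v(id, status)
), sr AS (
    SELECT * FROM (VALUES ('SR2', 'Active'), ('SR3', 'Active'),
                          ('SR4', 'Active')) AS v(id, status)
)
SELECT u.name,
       COUNT(DISTINCT ir.id) AS ir_count,
       COUNT(DISTINCT sr.id) AS sr_count
FROM usr u
INNER JOIN rel r ON r.user_id = u.user_id AND r.relationship = 'Analyst'
LEFT JOIN ir ON ir.id = r.workitem AND ir.status = 'Active'
LEFT JOIN sr ON sr.id = r.workitem AND sr.status = 'Active'
GROUP BY u.name
-- Drop analysts whose counts are all zero once the groups are built.
HAVING COUNT(DISTINCT ir.id) + COUNT(DISTINCT sr.id) > 0
ORDER BY u.name;
```

With this data the query returns Dave (2, 2) and Jim (1, 1). Ann is grouped but then removed by HAVING, and Karl never gets past the inner join because his only relationship is 'User'.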

The issue is count(). Instead, you want count(distinct):
Select u.name,
count(distinct ir),
count(distinct sr)
from user u
. . .
This generates a Cartesian product between the ir and sr values. If you have largish numbers in each group, then aggregating before the join is a better approach.
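Aggregating before the join might look roughly like the sketch below. It reuses the simplified tables from the question; the CTE names and the SR statuses are assumptions, not part of the original schema.

```sql
-- Sample data from the question; the SR statuses are an assumption.
WITH usr AS (
    SELECT * FROM (VALUES (1, 'Dave'), (2, 'Jim'), (3, 'Karl')) AS v(user_id, name)
), rel AS (
    SELECT * FROM (VALUES
        ('Analyst', 1, 'IR1'), ('Analyst', 2, 'IR2'), ('Analyst', 2, 'IR3'),
        ('Analyst', 1, 'IR4'), ('User',    3, 'IR1'), ('Analyst', 1, 'SR2'),
        ('Analyst', 1, 'SR3'), ('Analyst', 2, 'SR4')
    ) AS v(relationship, user_id, workitem)
), ir AS (
    SELECT * FROM (VALUES ('IR1', 'Active'), ('IR2', 'Active'),
                          ('IR3', 'Closed'), ('IR4', 'Active')) AS v(id, status)
), sr AS (
    SELECT * FROM (VALUES ('SR2', 'Active'), ('SR3', 'Active'),
                          ('SR4', 'Active')) AS v(id, status)
), ir_per_user AS (
    -- One row per analyst with the number of active IR items.
    SELECT r.user_id, COUNT(DISTINCT r.workitem) AS ir_count
    FROM rel r
    INNER JOIN ir ON ir.id = r.workitem AND ir.status = 'Active'
    WHERE r.relationship = 'Analyst'
    GROUP BY r.user_id
), sr_per_user AS (
    -- One row per analyst with the number of active SR items.
    SELECT r.user_id, COUNT(DISTINCT r.workitem) AS sr_count
    FROM rel r
    INNER JOIN sr ON sr.id = r.workitem AND sr.status = 'Active'
    WHERE r.relationship = 'Analyst'
    GROUP BY r.user_id
)
SELECT u.name,
       COALESCE(i.ir_count, 0) AS ir_count,
       COALESCE(s.sr_count, 0) AS sr_count
FROM usr u
LEFT JOIN ir_per_user i ON i.user_id = u.user_id
LEFT JOIN sr_per_user s ON s.user_id = u.user_id
WHERE i.user_id IS NOT NULL OR s.user_id IS NOT NULL
ORDER BY u.name;
```

Each pre-aggregated CTE yields at most one row per user, so the final joins cannot multiply rows and no DISTINCT is needed in the outer query. On the sample data the result is Dave (2, 2) and Jim (1, 1).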

;WITH T AS
(
SELECT
U.UserID,
U.Name,
CASE WHEN R.WorkItem LIKE 'IR%' THEN R.WorkItem ELSE '' END AS IRCode,
CASE WHEN R.WorkItem LIKE 'SR%' THEN R.WorkItem ELSE '' END AS SRCode
FROM #tblUser U
INNER JOIN #tblRelationship R ON U.UserId=R.UserId
)
SELECT
Name,
SUM(CASE IRCode WHEN '' THEN 0 ELSE 1 END) AS 'IR Count',
SUM(CASE SRCode WHEN '' THEN 0 ELSE 1 END) AS 'SR Count'
FROM T
WHERE
(
(T.IRCode=''
OR
T.SRCode='')
AND
(
T.IRCode IN (SELECT ID FROM #tblIR WHERE Status='Active')
OR
T.SRCode IN (SELECT ID FROM #tblSR WHERE Status='Active')
)
)
GROUP BY T.Name

Try the script below.
SELECT u.name
,SUM(t1.[IR Count]) [IR Count]
,SUM(t2.[SR Count]) [SR Count]
FROM #user u
INNER JOIN #relationship r on r.[user]=u.[user]and r.relationship = 'Analyst'
CROSS APPLY (SELECT COUNT(DISTINCT ID) [IR Count]
FROM #IR ir WHERE r.workitem=ir.id and ir.Status = 'Active') t1
CROSS APPLY (SELECT COUNT(DISTINCT ID) [SR Count]
FROM #SR sr WHERE r.workitem=sr.id and sr.Status = 'Active')t2
GROUP BY u.name
Note that the SR Count in your sample output is wrong if the SR table has the same contents as the IR table: the status of SR3 would be Closed for the user 'Dave', so that item would be ignored.

This is what I've come up with; it seems to work against the actual tables:
SELECT
u.DisplayName as Analyst,
u.BaseManagedEntityId as AUsername,
COUNT(distinct i.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) AS 'Active Incidents',
COUNT(distinct sr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Service Requests',
COUNT(distinct cr.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Change Requests',
COUNT(distinct ma.Id_9A505725_E2F2_447F_271B_9B9F4F0D190C) as 'Active Manual Activities'
FROM MTV_System$Domain$User u
LEFT JOIN RelationshipView r ON r.TargetEntityId = u.BaseManagedEntityId AND r.RelationshipTypeId = '15E577A3-6BF9-6713-4EAC-BA5A5B7C4722' AND r.IsDeleted ='0'
LEFT JOIN MTV_System$WorkItem$Incident i ON r.SourceEntityId = i.BaseManagedEntityId
LEFT JOIN MTV_System$WorkItem$ServiceRequest sr ON r.SourceEntityId = SR.BaseManagedEntityId
LEFT JOIN MTV_System$WorkItem$ChangeRequest cr on r.SourceEntityId = cr.BaseManagedEntityId
LEFT JOIN MTV_System$WorkItem$Activity$ManualActivity MA on r.SourceEntityId = ma.BaseManagedEntityId
Where (i.Status_785407A9_729D_3A74_A383_575DB0CD50ED != '2B8830B6-59F0-F574-9C2A-F4B4682F1681' AND i.Status_785407A9_729D_3A74_A383_575DB0CD50ED != 'BD0AE7C4-3315-2EB3-7933-82DFC482DBAF')
OR (sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '72B55E17-1C7D-B34C-53AE-F61F8732E425' OR sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '59393F48-D85F-FA6D-2EBE-DCFF395D7ED1' OR sr.Status_6DBB4A46_48F2_4D89_CBF6_215182E99E0F = '05306BF5-A6B9-B5AD-326B-BA4E9724BF37')
OR (cr.Status_72C1BC70_443C_C96F_A624_A94F1C857138 = '6D6C64DD-07AC-AAF5-F812-6A7CCEB5154D' or cr.Status_72C1BC70_443C_C96F_A624_A94F1C857138 = 'DD6B0870-BCEA-1520-993D-9F1337E39D4D')
OR (ma.Status_8895EC8D_2CBF_0D9D_E8EC_524DEFA00014 = '11FC3CEF-15E5-BCA4-DEE0-9C1155EC8D83' OR ma.Status_8895EC8D_2CBF_0D9D_E8EC_524DEFA00014 = 'D544258F-24DA-1CF3-C230-B057AAA66BED')
GROUP BY u.DisplayName,u.BaseManagedEntityId
Order by u.DisplayName

Related

How to optimize SQL Server query

I am copying data from one table to another table. While copying I am doing some calculation to modify one column.
SQL Server query:
INSERT INTO rat_proj_duration_map_2
SELECT
r.*,
r.hour_val / (CASE
WHEN week_val = 1 AND
(SELECT TOP 1
hrswk
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE calwk = 2
AND r.uid = u.uid
AND yr = 2016)
> 0 THEN (SELECT TOP 1
hrswk
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE calwk = 2
AND r.uid = u.uid
AND yr = 2016)
WHEN (SELECT
hrswk
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE r.week_val = us.calwk
AND r.uid = u.uid
AND yr = 2016)
< 1 AND
(SELECT
MAX(hrswk)
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE r.uid = u.uid
AND yr = 2016)
> 0 THEN (SELECT
MAX(hrswk)
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE r.uid = u.uid
AND yr = 2016)
WHEN (SELECT
COUNT(*)
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE r.uid = u.uid
AND yr = 2016)
<= 0 THEN 1
ELSE (SELECT
hrswk
FROM UserProfileRATinterface_view us
INNER JOIN users u
ON u.username = us.username
WHERE r.week_val = us.calwk
AND r.uid = u.uid
AND yr = 2016)
END) * 100 AS percentage_val
FROM rat_proj_duration_map r
When I run this query I get a timeout:
TCP Provider: Timeout error [258]
I don't have the access needed to increase the timeout value on the SQL Server.
Is it possible to optimize my SQL query?

Are you sure this query is logically correct? You have several TOP 1s without an ORDER BY, so which row they return is not deterministic. You also compare scalars against subselects without TOP, which I assume may return more than one row, given that you use TOP in other subselects over the same source.
And yes, this query can be optimized. You can obtain all the values you need with a single subselect and avoid executing the same subselects repeatedly for each row of rat_proj_duration_map, which is what happens now:
INSERT INTO rat_proj_duration_map_2
SELECT
r.*,
r.hour_val / (CASE
WHEN week_val = 1 AND us.min_hrswk_2 > 0
THEN us.min_hrswk_2
WHEN us.min_hrswk_week_val <1
AND max_hrswk > 0
THEN max_hrswk
WHEN us.cnt <= 0
THEN 1
ELSE min_hrswk_week_val
END) * 100 as percentage_val
FROM
rat_proj_duration_map r
OUTER APPLY
(
SELECT
count(*) as cnt,
MIN(CASE WHEN calwk = 2 THEN hrswk END) as min_hrswk_2,
MIN(CASE WHEN calwk = r.week_val THEN hrswk END) as min_hrswk_week_val,
MAX(hrswk) as max_hrswk
FROM UserProfileRATinterface_view us
inner join users u on u.username=us.username
WHERE r.uid=u.uid and yr=2016
) us
But I can't be sure the original logic is correct. To me, the idea behind that CASE looks like this:
...
r.hour_val / COALESCE(NULLIF(us.min_hrswk_2, 0),
NULLIF(us.min_hrswk_week_val, 0), NULLIF(max_hrswk, 0), 1)
...
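To see how the COALESCE/NULLIF chain picks a denominator, here is a small standalone sketch. The column names (week2_hrs, week_hrs, max_hrs) and all values are hypothetical, chosen only to hit each branch once.

```sql
-- Hypothetical values: NULLIF(x, 0) turns a zero into NULL, and COALESCE
-- then returns the first argument that is not NULL.
SELECT v.label,
       COALESCE(NULLIF(v.week2_hrs, 0),
                NULLIF(v.week_hrs, 0),
                NULLIF(v.max_hrs, 0),
                1) AS denominator
FROM (VALUES
    ('week 2 value present', 40,   35,   40),
    ('week 2 is zero',        0,   35,   40),
    ('only max is non-zero',  0,    0,   40),
    ('no rows for the user', NULL, NULL, NULL)
) AS v(label, week2_hrs, week_hrs, max_hrs);
```

The four rows yield denominators 40, 35, 40 and 1. The final literal 1 guarantees the division never sees zero or NULL.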

The subqueries in your CASE expression seem to be essentially the same. You could simplify the whole command by defining a grouped version (... where yr=2016 group by u.uid) of this subquery, preferably as a common table expression, and then working with that. This could save a lot of redundant operations.
The following might work (have not tested it):
;WITH usrall as (
SELECT u.uid ui, hrswk hw, r.week wk, us.calwk cw
FROM UserProfileRATinterface_view us
INNER JOIN users u on u.username=us.username
WHERE r.uid=u.uid and yr=2016
), usrgrp as (
SELECT ui gui, MAX(hw) ghw, count(*) gcnt FROM usrall group by ui
), denom as (
SELECT gui dui, COALESCE( MAX(w2.hw), MAX(wkcw.hw), MAX(ghw) ) dnm
FROM usrgrp
LEFT JOIN usrall w2 ON w2.ui=gui AND w2.cw=2 AND w2.hw>0
LEFT JOIN usrall wkcw ON wkcw.ui=gui AND wkcw.wk=wkcw.cw AND wkcw.hw<1
GROUP BY gui
)
SELECT r.*, r.hour_val / d.dnm
FROM rat_proj_duration_map r
INNER JOIN denom d ON d.dui=u.uid
Essentially I have tried (I hope it works :-/) to replace the CASE construct with a COALESCE() function that checks the three possible calculated values one after the other. The first non-null value is accepted.
As I said: I have not tested it. Good luck

Better ways to write this SQL Query

I have the below table structure
Users (PK - UserId)
System (PK - SystemId)
SystemRoles (PK-SystemRoleId, FK - SystemId)
UserRoles (PK-UserId & SystemRoleId, FK-SystemRoleId, FK-UserId)
Users can have access to different Systems and one System can have different SystemRoles defined.
Now, I need to delete Users who have SystemRoles assigned to them ONLY for a specific System(s). If they have SystemRoles defined for other Systems, they should not be deleted.
I have come up with the query below to identify the records that are eligible for deletion, but I think it can surely be optimized. Any suggestions?
SELECT U.*
FROM
(
SELECT
distinct UR.UserID
FROM
dbo.UserRole UR
INNER JOIN dbo.SystemRole SR ON (SR.SystemRoleID = UR.SystemRoleID)
INNER JOIN dbo.[System] S ON (S.SystemID = SR.SystemID)
WHERE
S.SystemName = 'ABC' OR S.SystemName = 'XYZ'
) T
INNER JOIN dbo.[User] U ON (U.UserID = T.UserID)
WHERE T.UserID NOT IN
(
select
distinct UR.UserID
from
dbo.[UserRole] UR
INNER JOIN dbo.SystemRole SR ON (SR.SystemRoleID = UR.SystemRoleID)
INNER JOIN dbo.[System] S ON (S.SystemID = SR.SystemID)
WHERE
S.SystemName <> 'ABC'
AND S.SystemName <> 'XYZ'
)

something like this?
select userid from (
SELECT
UR.UserID,
-- KILL is a reserved keyword in T-SQL, so that alias is bracketed
max(case when (S.SystemName = 'ABC' OR S.SystemName = 'XYZ')
then 1 else 0 end) as [kill],
max(case when (S.SystemName <> 'ABC' AND S.SystemName <> 'XYZ')
then 1 else 0 end) as keep
FROM
dbo.UserRole UR
INNER JOIN dbo.SystemRole SR ON (SR.SystemRoleID = UR.SystemRoleID)
INNER JOIN dbo.[System] S ON (S.SystemID = SR.SystemID)
group by UR.UserID
) u where [kill] = 1 and keep = 0

This sort of structure will get you the records you need.
select yourfields -- or delete
from userroles
where userid in
(select userid
from userroles join etc
where system.name = the one you want
except
select userid
from userroles join etc
where system.name <> the one you want
)
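A runnable version of that EXCEPT structure could look like the sketch below. The data is hypothetical, and the CTE names (systems, system_role, user_role) merely stand in for the tables named in the question.

```sql
-- Hypothetical sample data; column names follow the question.
WITH systems AS (
    SELECT * FROM (VALUES (1, 'ABC'), (2, 'XYZ'), (3, 'OTHER')) AS v(SystemID, SystemName)
), system_role AS (
    SELECT * FROM (VALUES (10, 1), (20, 2), (30, 3)) AS v(SystemRoleID, SystemID)
), user_role AS (
    SELECT * FROM (VALUES (1, 10), (1, 20), (2, 10), (2, 30), (3, 30)) AS v(UserID, SystemRoleID)
)
-- Users holding at least one role in the target systems ...
SELECT ur.UserID
FROM user_role ur
INNER JOIN system_role sr ON sr.SystemRoleID = ur.SystemRoleID
INNER JOIN systems s ON s.SystemID = sr.SystemID
WHERE s.SystemName IN ('ABC', 'XYZ')
EXCEPT
-- ... minus users holding any role in some other system.
SELECT ur.UserID
FROM user_role ur
INNER JOIN system_role sr ON sr.SystemRoleID = ur.SystemRoleID
INNER JOIN systems s ON s.SystemID = sr.SystemID
WHERE s.SystemName NOT IN ('ABC', 'XYZ');
```

Only user 1 is returned: user 2 also has a role in OTHER, and user 3 has no role in ABC or XYZ at all. EXCEPT also removes duplicates, so user 1 appears once despite having two qualifying roles.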

Same values being returned for multiple rows in sql but not all columns

I'm currently running the following query (see below).
However, when I do, the same values are returned on multiple rows for TotalUsers, ActiveUsers and SuspendedUsers.
When it comes to TotalLogin, though, the values are unique.
Is there a reason why this could be happening, and is there a way to solve the problem? If it helps, I'm using SQL Workbench with the PostgreSQL driver.
Cheers
SELECT
company.companyStatus,
company.CompanyId,
company.companyName,
(select
count(distinct UserID)
From Users
inner join company
on Users.CompanyID = Company.CompanyId
where Users.Companyid = company.Companyid)
as TotalUsers,
(select sum(case when userstatusid =2 then 1 else 0 end)
from users
inner join company
on users.companyid = company.companyid
where users.companyid = company.companyid)
as ActiveUsers,
(select sum(case when userstatusid = 3 then 1 else 0 end)
from users
inner join company
on users.companyid = company.companyid
where users.companyid = company.companyid)
as SuspendedUsers,
(Select COUNT (distinct usersessionid)
From UserSession
inner join users
on usersession.UserID=users.UserID
where usersession.UserID=users.UserID
and users.companyid= company.CompanyID)
as TotalLogin
from Company

It's because your TotalUsers, ActiveUsers and SuspendedUsers subqueries each use their own (unrestricted) copy of the Company table, whereas your TotalLogin subquery uses the main instance from which you're selecting. This means that the TotalLogin numbers you're seeing are for that particular company, but the other fields are across ALL companies.
Presumably you wanted something more like:
SELECT
company.companyStatus,
company.CompanyId,
company.companyName,
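count(distinct u.UserID) TotalUsers,
-- count distinct users rather than summing rows: the UserSession join repeats each user once per session
count(distinct case when u.userstatusid = 2 then u.UserID end) ActiveUsers,
count(distinct case when u.userstatusid = 3 then u.UserID end) SuspendedUsers,
count(distinct us.usersessionid) TotalLogin
from Company
inner join Users u on u.CompanyID = Company.CompanyId
left join UserSession us on us.UserID = u.UserID
group by company.companyStatus, company.CompanyId, company.companyName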
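The scoping effect described in this answer can be reproduced on a tiny data set. The sketch below uses hypothetical tables and values, and aliases the inner copy of company as c2 to make visible which instance each condition refers to; in the question the inner copy simply shadows the outer one.

```sql
-- Hypothetical data: company 1 has two users, company 2 has one.
WITH company AS (
    SELECT * FROM (VALUES (1, 'Acme'), (2, 'Globex')) AS v(company_id, company_name)
), users AS (
    SELECT * FROM (VALUES (1, 1), (2, 1), (3, 2)) AS v(user_id, company_id)
)
SELECT c.company_id,
       -- Joins its own copy of company, so it never sees the outer row.
       (SELECT COUNT(DISTINCT u.user_id)
        FROM users u
        INNER JOIN company c2 ON u.company_id = c2.company_id) AS all_companies,
       -- Correlated with the outer row through c.company_id.
       (SELECT COUNT(DISTINCT u.user_id)
        FROM users u
        WHERE u.company_id = c.company_id) AS this_company
FROM company c
ORDER BY c.company_id;
```

Both rows show 3 in all_companies, while this_company is 2 for company 1 and 1 for company 2, which matches the symptom in the question.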

The reason is that the subqueries for those calculations each bring in their own copy of company.
I much prefer having table references in the from clause where possible, and you can write this query moving everything to the from clause:
SELECT c.companyStatus, c.CompanyId, c.companyName,
uc.Totalusers, uc.Activeusers, uc.Suspendedusers, ucs.TotalLogin
from Company c left outer join
(select u.companyid,
COUNT(distinct userid) as Totalusers,
SUM(case when userstatusid = 2 then 1 else 0 end) as ActiveUsers,
sum(case when userstatusid = 3 then 1 else 0 end) as Suspendedusers
from users u
group by u.companyid
) uc
on uc.companyid = c.companyId left outer join
(select u.companyid, COUNT(distinct usersessionid) as TotalLogin
from UserSession us inner join
users u
on us.UserID = u.UserID
group by u.companyid
) ucs
on ucs.companyid = c.companyid;
This should also speed up the query because it doesn't have to do the same work multiple times.

Query for logistic regression, multiple where exists

A logistic regression data set is composed of a uniquely identifying number followed by multiple binary variables (always 1 or 0), each based on whether or not a person meets certain criteria. Below I have a query that lists several of these binary conditions. With only four such criteria, the query takes longer to run than I would expect. Is there a more efficient approach than the one below? Note: tblicd is a large lookup table of text descriptions with 15k+ rows. The query makes no real sense; it is just a proof of concept. I have the proper indexes on my composite keys.
select patient.patientid
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1000
)
then '1' else '0'
end as moreThan1000
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1500
)
then '1' else '0'
end as moreThan1500
,case when exists
(
select distinct picd.patientid from patienticd as picd
inner join patient as p on p.patientid= picd.patientid
and picd.admissiondate = p.admissiondate
and picd.dischargedate = p.dischargedate
inner join tblicd as t on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
)
then '1' else '0'
end as diabetes
,case when exists
(
select r.patientid, count(*) from patient as r
where r.patientid = patient.patientid
group by r.patientid
having count(*) >1
)
then '1' else '0'
end
from patient
order by moreThan1000 desc

I would start by using subqueries in the from clause:
select p.patientid,
coalesce(q.moreThan1000, 0) as moreThan1000,
coalesce(q.moreThan1500, 0) as moreThan1500,
(case when d.patientid is not null then 1 else 0 end) as diabetes,
(case when pc.patientid is not null then 1 else 0 end)
from patient p left outer join
(select c.patientid,
(case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
(case when count(*) > 1500 then 1 else 0 end) as moreThan1500
from tblclaims as c inner join
patient as p
on p.patientid=c.patientid and
c.admissiondate = p.admissiondate and
c.dischargedate = p.dischargedate
group by c.patientid
) q
on p.patientid = q.patientid left outer join
(select distinct picd.patientid
from patienticd as picd inner join
patient as p
on p.patientid= picd.patientid and
picd.admissiondate = p.admissiondate and
picd.dischargedate = p.dischargedate inner join
tblicd as t
on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%'
) d
on p.patientid = d.patientid left outer join
(select r.patientid, count(*) as cnt
from patient as r
group by r.patientid
having count(*) >1
) pc
on p.patientid = pc.patientid
order by 2 desc
You can then probably simplify these subqueries more by combining them (for instance "p" and "pc" on the outer query can be combined into one). However, without the correlated subqueries, SQL Server should find it easier to optimize the queries.

Example of left joins as requested...
SELECT
patientid,
ISNULL(CondA.ConditionA,0) as IsConditionA,
ISNULL(CondB.ConditionB,0) as IsConditionB,
....
FROM
patient
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
ON patient.patientid = CondA.patientID
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
ON patient.patientid = CondB.patientID
If each of your condition queries returns at most one row per patient, you can drop the DISTINCT and simplify them down to
(SELECT patientid, 1 as ConditionA from ... where ... ) CondA
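Filled in with concrete conditions, the left-join pattern might look like this sketch. The data is hypothetical and the hypertension flag is invented for the example; the table names loosely follow the question. COALESCE is used in place of ISNULL only to keep the sketch portable.

```sql
-- Hypothetical data; descriptions are lower case so that LIKE matches
-- regardless of the collation's case sensitivity.
WITH patient AS (
    SELECT * FROM (VALUES (1), (2), (3)) AS v(patientid)
), tblicd AS (
    SELECT * FROM (VALUES (10, 'type 2 diabetes'), (20, 'hypertension')) AS v(icd_id, descrip)
), patienticd AS (
    SELECT * FROM (VALUES (1, 10), (1, 20), (2, 20)) AS v(patientid, icd_id)
)
SELECT p.patientid,
       COALESCE(d.flag, 0) AS diabetes,
       COALESCE(h.flag, 0) AS hypertension
FROM patient p
LEFT JOIN (SELECT DISTINCT picd.patientid, 1 AS flag
           FROM patienticd picd
           INNER JOIN tblicd t ON t.icd_id = picd.icd_id
           WHERE t.descrip LIKE '%diabetes%') d
       ON d.patientid = p.patientid
LEFT JOIN (SELECT DISTINCT picd.patientid, 1 AS flag
           FROM patienticd picd
           INNER JOIN tblicd t ON t.icd_id = picd.icd_id
           WHERE t.descrip LIKE '%hypertension%') h
       ON h.patientid = p.patientid
ORDER BY p.patientid;
```

Patient 1 gets (1, 1), patient 2 gets (0, 1) and patient 3 gets (0, 0). Because each derived table is DISTINCT on patientid, the joins cannot duplicate patient rows.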

SQL LEFT JOIN combined with regular joins

I have the following query that joins a bunch of tables.
I'd like to get every record from the INDUSTRY table that has consolidated_industry_id = 1 regardless of whether or not it matches the other tables. I believe this needs to be done with a LEFT JOIN?
SELECT attr.industry_id AS option_id,
attr.industry AS option_name,
uj.ft_job_industry_id,
Avg(CASE
WHEN s.salary > 0 THEN s.salary
END) AS average,
Count(CASE
WHEN s.salary > 0 THEN attr.industry
END) AS count_non_zero,
Count(attr.industry_id) AS count_total
FROM industry attr,
user_job_ft_job uj,
salary_ft_job s,
user_job_ft_job ut,
[user] u,
user_education_mba_school mba
WHERE u.user_id = uj.user_id
AND u.user_id = ut.user_id
AND u.user_id = mba.user_id
AND uj.ft_job_industry_id = attr.industry_id
AND uj.user_job_ft_job_id = s.user_job_id
AND u.include_in_student_site_results = 1
AND u.site_instance_id IN ( 1 )
AND uj.job_type_id = 1
AND attr.consolidated_industry_id = 1
AND mba.mba_graduation_year_id NOT IN ( 8, 9 )
AND uj.admin_approved = 1
GROUP BY attr.industry_id,
attr.industry,
uj.ft_job_industry_id
This returns only one row, but there are 8 matches in the industry table where consolidated_industry_id = 1.
--- EDIT: The real question here is, how do I combine the LEFT JOIN with the regular joins?

Use left join for tables that may miss a corresponding record. Put the conditions for each table in the on clause of the join, not in the where, as that would in effect make them inner joins anyway. Something like:
select
attr.industry_id AS option_id, attr.industry AS option_name,
uj.ft_job_industry_id, AVG(CASE WHEN s.salary > 0 THEN s.salary END) AS average,
COUNT(CASE WHEN s.salary > 0 THEN attr.industry END) as count_non_zero,
COUNT(attr.industry_id) as count_total
from
industry attr
left join user_job_ft_job uj on uj.ft_job_industry_id = attr.industry_id and uj.job_type_id = 1 and uj.admin_approved = 1
left join salary_ft_job s on uj.user_job_ft_job_id = s.user_job_id
left join [user] u on u.user_id = uj.user_id and u.include_in_student_site_results = 1 and u.site_instance_id IN (1)
left join user_job_ft_job ut on u.user_id = ut.user_id
left join user_education_mba_school mba on u.user_id = mba.user_id and mba.mba_graduation_year_id not in (8, 9)
where
attr.consolidated_industry_id = 1
group by
attr.industry_id, attr.industry, uj.ft_job_industry_id
If you know that some tables always have a corresponding record, just use an inner join for those.
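The difference between filtering in ON and filtering in WHERE can be seen side by side in a small sketch. The two tables and their contents are hypothetical and heavily reduced from the question's schema.

```sql
-- Hypothetical data: Banking has an approved job, Consulting only an
-- unapproved one.
WITH industry AS (
    SELECT * FROM (VALUES (1, 'Banking'), (2, 'Consulting')) AS v(industry_id, industry)
), job AS (
    SELECT * FROM (VALUES (100, 1, 1), (101, 2, 0)) AS v(job_id, industry_id, admin_approved)
)
SELECT 'filter in ON' AS variant, i.industry, COUNT(j.job_id) AS approved_jobs
FROM industry i
LEFT JOIN job j ON j.industry_id = i.industry_id AND j.admin_approved = 1
GROUP BY i.industry
UNION ALL
-- The WHERE test rejects the NULL-extended rows, so this behaves like an
-- inner join and Consulting disappears.
SELECT 'filter in WHERE', i.industry, COUNT(j.job_id)
FROM industry i
LEFT JOIN job j ON j.industry_id = i.industry_id
WHERE j.admin_approved = 1
GROUP BY i.industry
ORDER BY variant, industry;
```

The 'filter in ON' variant returns Banking with 1 and Consulting with 0, while the 'filter in WHERE' variant returns only Banking with 1.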
