SQL SELECT string Greater than using count() - sql

I am trying to list groups that have more graduate than undergraduate student members. I feel I have the concept behind my idea, but making the query is a little more difficult then a simple translation. Below is my code, I currently am getting a missing right parenthesis error where COUNT(student.career = 'GRD'). Thanks.
SELECT studentgroup.name
COUNT(student.career = 'GRD') - COUNT(student.career = 'UGRD')
AS Gradnum FROM studentgroup
INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
WHERE Gradnum > 1;

SELECT studentgroup.GID, max(studentgroup.name)
FROM studentgroup
INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
GROUP BY studentgroup.GID
HAVING SUM(CASE WHEN student.career = 'GRD' THEN 1
WHEN student.career = 'UGRD'THEN -1
ELSE 0
END) >0

SELECT studentgroup.name
SUM(CASE WHEN student.career = 'GRD' THEN 1 ELSE 0 END) - SUM(CASE WHEN student.career = 'UGRD' THEN 1 ELSE 0 END)
AS Gradnum FROM studentgroup
INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
WHERE Gradnum > 1
GROUP BY studentgroup.name;

I have used WITH As clause which is supported by most of DBMS like SQL Server, PostGresSQL except MySQL
With grpTbl As
(
SELECT studentgroup.name As StudentGroupName,
SUM( CASE WHEN student.career = 'GRD' THEN 1 ELSE 0 END ) AS 'TotalGraduate',
SUM( CASE WHEN student.career = 'UGRD' THEN 1 ELSE 0 END ) AS 'TotalUnderGraduate'
FROM studentgroup
INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
)
SELECT StudentGroupName
FROM grpTbl
WHERE TotalGraduate > TotalUnderGraduate
For MySQL you can use Temporary table to store resultset from First query and filter out the GroupNames which have more Graduate than UnderGraduate in WHERE clause .
This method will work for other DBMS also difference being syntax of creating temporary table.
CREATE TEMPORARY TABLE grpTbl (
StudentGroupName varchar(255),
TotalGraduate INT,
TotalUnderGraduate INT
);
INSERT INTO grpTbl
SELECT studentgroup.name As StudentGroupName,
SUM( CASE WHEN student.career = 'GRD' THEN 1 ELSE 0 END ) ,
SUM( CASE WHEN student.career = 'UGRD' THEN 1 ELSE 0 END )
FROM studentgroup
INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
SELECT StudentGroupName
FROM grpTbl
WHERE TotalGraduate > TotalUnderGraduate
DROP TABLE grpTbl

One more option
SELECT studentgroup.name
FROM studentgroup INNER JOIN memberof ON studentgroup.GID = memberof.GroupID
INNER JOIN student ON memberof.StudentID = student.SID
GROUP BY studentgroup.name
HAVING COUNT(CASE WHEN student.career = 'GRD' THEN student.career END)
> COUNT(CASE WHEN student.career = 'UGRD' THEN student.career END)

Related

SELECT statement with operators not working

I have these two tables and I need a query, that outputs every member that has the lvnr 050056 AND NOT 050054.
I have these two tables
I have tried it with the following query but it does not work right:
SELECT s.matrnr, s.vorname, s.nachname
FROM student s
INNER JOIN teilgenommen t ON s.matrnr = t.matrnr
WHERE (t.lvnr = 050056) AND (t.lvnr != 050054)
Only Martin Huber with the ID 0111111 should be shown, but I get both..
I would be very thankful for any advie
Use exists and not exists:
select s.*
from student s
where exists (select 1
from teilgenommen t
where t.matrnr = s.matrnr and t.lvnr = '050056'
) and
not exists (select 1
from teilgenommen t
where t.matrnr = s.matrnr and t.lvnr = '050054'
);
The leading zeros suggest that lvnr is really stored as a string. If so, then single quotes should be used for the comparison value.
You can do:
SELECT
s.matrnr,
s.vorname,
s.nachname
FROM
student s
INNER JOIN
(
select
matrnr,
max(case when lvnr='050056' then 1 else 0 end) as a,
max(case when lvnr='050054' then 1 else 0 end) as b
from
teilgenommen
group by
matrnr
having a=1 and b=0
) t
ON s.matrnr = t.matrnr
This can also be solved with aggregation and a having clause for filtering:
select s.matrnr, s.vorname, s.nachname
from student s
inner join teilgenommen t on s.matrnr = t.matrnr
group by s.matrnr, s.vorname, s.nachname
having
max(case when t.lvnr = 050056 then 1 else 0 end) = 1
and max(case when t.lvnr = 050054 then 1 else 0 end) = 0

How can I use multiple HAVING or similar thing?

I have some troubles where I want to show only teachers who is teaching 3 or more internal or external courses (not together). I guess my code now would be happy with 2 internal and 1 external etc. So how can I make sure it will actually count them individually and not together?
SELECT t.pnr, t.tname
FROM teacher t
JOIN teaches s ON t.pnr = s.pnr
JOIN course c ON s.coursecode = c.coursecode
WHERE c.coursetype = 'intern' OR c.coursetype = 'extern'
GROUP BY t.pnr, t.tname
HAVING COUNT(c.coursetype) > 2
You could use conditional aggregation to obtain counts of internal and external courses that the teacher teaches:
SELECT t.pnr, t.tname,
SUM(CASE WHEN c.coursetype = 'intern' THEN 1 ELSE 0 END) AS intcourses,
SUM(CASE WHEN c.coursetype = 'extern' THEN 1 ELSE 0 END) AS extcourses
FROM teacher t
JOIN teaches s ON t.pnr = s.pnr
JOIN course c ON s.coursecode = c.coursecode
GROUP BY t.pnr, t.tname
HAVING intcourses > 2 or extcourses > 2
You could just count in where clause, instead of grouping:
SELECT t.pnr, t.tname
FROM teacher t
WHERE (select count(*) from teaches s
JOIN course c ON c.coursecode = s.coursecode
and c.coursetype='intern'
where t.pnr = s.pnr) > 2
OR
(select count(*) from teaches s
JOIN course c ON c.coursecode = s.coursecode
and c.coursetype='extern'
where t.pnr = s.pnr) > 2
Your use of the having makes sense. You can do this as:
SELECT t.pnr, t.tname
FROM teacher t JOIN
teaches s
ON t.pnr = s.pnr JOIN
course c
ON s.coursecode = c.coursecode
WHERE c.coursetype IN ('intern', 'extern')
GROUP BY t.pnr, t.tname
HAVING SUM(CASE WHEN c.coursetype = 'intern' THEN 1 ELSE 0 END) >= 3 OR
SUM(CASE WHEN c.coursetype = 'extern' THEN 1 ELSE 0 END) >= 3 ;

SQL Server 2012 - is there a better way to do this as when there are duplicates it counts them more than once?

This is not accurate as the count can be wrong so is there a better way using exists? I want to identify if one case of each course exists.
SELECT
IdentityCourses.IdentityID AS ID,Identities.LastName AS LastName,
Identities.FirstNames AS FirstName,Units.UnitID, Units.Description AS Unit
FROM
dbo.UnitIdentities
INNER JOIN
dbo.IdentityCourses ON dbo.UnitIdentities.IdentityID = dbo.IdentityCourses.IdentityID
INNER JOIN
dbo.COCSourceCourses ON dbo.IdentityCourses.CourseID = dbo.COCSourceCourses.CBESCourseID
INNER JOIN
dbo.Identities ON dbo.UnitIdentities.IdentityID = dbo.Identities.IdentityID
INNER JOIN
dbo.Units ON dbo.UnitIdentities.UnitID = dbo.Units.UnitID
WHERE
(dbo.UnitIdentities.IsActiveMember = 1)
GROUP BY
IdentityCourses.IdentityID, Identities.LastName, Identities.FirstNames,
Units.Description, Units.UnitID
HAVING
(SUM((CASE WHEN COCSourceCourses.COCID = 10048 then 1 else 0 end)+
(CASE WHEN COCSourceCourses.COCID = 10049 then 1 else 0 end)+
(CASE WHEN COCSourceCourses.COCID = 10050 then 1 else 0 end)+
(CASE WHEN COCSourceCourses.COCID = 10051 then 1 else 0 end)+
(CASE WHEN COCSourceCourses.COCID = 10063 then 1 else 0 end)+
(CASE WHEN COCSourceCourses.COCID = 10073 then 1 else 0 end))) = 6
AND IdentityCourses.IdentityID NOT IN (SELECT IdentityID
FROM IdentityQualifications
WHERE QualificationID IN (1012, 1014, 1025))
ORDER BY
Units.UnitID
Try using count(distinct ..):
SELECT (..columns..)
FROM dbo.UnitIdentities UI
LEFT JOIN IdentityQualifications IQ
ON IQ.IdentityID = UI.IdentityID
AND IQ.QualificationID IN (1012, 1014, 1025)
INNER JOIN dbo.IdentityCourses IC
ON IC.IdentityID = dbo.UnitIdentities.IdentityID
INNER JOIN dbo.COCSourceCourses COC
ON COC.CBESCourseID = IC.CourseID
AND COC.COCID IN (10048, 10049, 10050, 10051, 10063, 10073)
(..two more table joins on identities and units..)
WHERE IQ.IdentityID IS NULL
GROUP BY (..columns..)
HAVING COUNT(DISTINCT COC.COCID) = 6
ORDER BY Units.UnitID
When you are only interested in certain records, then why don't you use the WHERE clause? Only select the COCIDs you are interested in and then count distinct results.
You don't need any GROUP BY and HAVING by the way, as you only display identities/units, so you can count associated courses in a subquery in your WHERE clause.
select
i.identityid as id,
i.lastname as lastname,
i.firstnames as firstname,
u.unitid,
u.description as unit
from dbo.identities i
join dbo.unitidentities ui on ui.identityid = i.identityid and ui.isactivemember = 1
join dbo.units u on u.unitid = ui.unitid
where i.identityid not in
(
select iq.identityid
from identityqualifications iq
where iq.qualificationid in (1012, 1014, 1025)
)
and
(
select count(distinct sc.cocid)
from dbo.cocsourcecourses sc
join dbo.identitycourses ic on ic.courseid = sc.cbescourseid
where sc.cocid in (10048, 10049, 10050, 10051, 10063, 10073)
and ic.identityid = i.identityid
) = 6
order by u.unitid;

Same values being returned for multiple rows in sql but not all columns

i'm currently running the following query (see below.)
However when i do the same values is returned for multiple rows in totalusers, activeusers and suspendedusers.
However when it comes to total login the values are unique.
Is their a reason why this could be happening and is their a way to solve the problem. If it helps im using the tool sql workben with postgre sql driver.
Cheers
SELECT
company.companyStatus,
company.CompanyId,
company.companyName,
select
count(distinct UserID)
From Users
inner join company
on Users.CompanyID = Company.CompanyId
where Users.Companyid = company.Companyid
as TotalUsers,
select sum(case when userstatusid =2 then 1 else 0 end)
from users
inner join company
on users.companyid = company.companyid
where users.companyid = company.companyid)
as ActiveUsers,
select sum(case when userstatusid = 3 then 1 else 0 end)
from users
inner join company
on users.companyid = company.companyid
where users.companyid = company.companyid)
as SuspendedUsers,
(Select COUNT (distinct usersessionid)
From UserSession
inner join users
on usersession.UserID=users.UserID
where usersession.UserID=users.UserID
and users.companyid= company.CompanyID)
as TotalLogin,
from Company
Its because your TotalUsers, ActiveUsers and SuspendedUsers queries are all using their own (unrestricted) copy of the Company table, whereas your TotalLogin is using the main instance from which you're selecting. This means that the TotalLogin numbers you're seeing are for that particular company, but the other fields are across ALL companies.
Presumably you wanted something more like:
SELECT
company.companyStatus,
company.CompanyId,
company.companyName,
count(distinct u.UserID) TotalUsers,
sum(case when u.userstatusid =2 then 1 else 0 end) ActiveUsers,
sum(case when u.userstatusid = 3 then 1 else 0 end) SuspendedUsers,
count(distinct u.usersessionid) TotalLogin
from Company
inner join Users on Users.CompanyID = Company.CompanyId
The reason is because you have company in the subqueries for those calculations.
I much prefer having table references in the from clause where possible, and you can write this query moving everything to the from clause:
SELECT c.companyStatus, c.CompanyId, c.companyName,
uc.Totalusers, uc.Activeusers, uc.Suspendedusers, ucs.TotalLogin
from Company c left outer join
(select u.companyid,
COUNT(distinct userid) as Totalusers,
SUM(case when userstatusid = 2 then 1 else 0 end) as ActiveUsers,
sum(case when userstatusid = 3 then 1 else 0 end) as Suspendedusers
from users u
group by u.companyid
) uc
uc.companyid = c.companyId left outer join
(select u.companyid, COUNT(distinct usersessionid) as TotalLogin
from UserSession us inner join
users u
on us.UserID = u.UserID
) ucs
on ucs.companyid = c.companyid;
This should also speed up the query because it doesn't have to do the same work multiple times.

Query for logistic regression, multiple where exists

A logistic regression is a composed of a uniquely identifying number, followed by multiple binary variables (always 1 or 0) based on whether or not a person meets certain criteria. Below I have a query that lists several of these binary conditions. With only four such criteria the query takes a little longer to run than what I would think. Is there a more efficient approach than below? Note. tblicd is a large table lookup table with text representations of 15k+ rows. The query makes no real sense, just a proof of concept. I have the proper indexes on my composite keys.
select patient.patientid
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1000
)
then '1' else '0'
end as moreThan1000
,case when exists
(
select c.patientid from tblclaims as c
inner join patient as p on p.patientid=c.patientid
and c.admissiondate = p.admissiondate
and c.dischargedate = p.dischargedate
where patient.patientid = p.patientid
group by c.patientid
having count(*) > 1500
)
then '1' else '0'
end as moreThan1500
,case when exists
(
select distinct picd.patientid from patienticd as picd
inner join patient as p on p.patientid= picd.patientid
and picd.admissiondate = p.admissiondate
and picd.dischargedate = p.dischargedate
inner join tblicd as t on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%' and patient.patientid = picd.patientid
)
then '1' else '0'
end as diabetes
,case when exists
(
select r.patientid, count(*) from patient as r
where r.patientid = patient.patientid
group by r.patientid
having count(*) >1
)
then '1' else '0'
end
from patient
order by moreThan1000 desc
I would start by using subqueries in the from clause:
select q.patientid, moreThan1000, moreThan1500,
(case when d.patientid is not null then 1 else 0 end),
(case when pc.patientid is not null then 1 else 0 end)
from patient p left outer join
(select c.patientid,
(case when count(*) > 1000 then 1 else 0 end) as moreThan1000,
(case when count(*) > 1500 then 1 else 0 end) as moreThan1500
from tblclaims as c inner join
patient as p
on p.patientid=c.patientid and
c.admissiondate = p.admissiondate and
c.dischargedate = p.dischargedate
group by c.patientid
) q
on p.patientid = q.patientid left outer join
(select distinct picd.patientid
from patienticd as picd inner join
patient as p
on p.patientid= picd.patientid and
picd.admissiondate = p.admissiondate and
picd.dischargedate = p.dischargedate inner join
tblicd as t
on t.icd_id = picd.icd_id
where t.descrip like '%diabetes%'
) d
on p.patientid = d.patientid left outer join
(select r.patientid, count(*) as cnt
from patient as r
group by r.patientid
having count(*) >1
) pc
on p.patientid = pc.patientid
order by 2 desc
You can then probably simplify these subqueries more by combining them (for instance "p" and "pc" on the outer query can be combined into one). However, without the correlated subqueries, SQL Server should find it easier to optimize the queries.
Example of left joins as requested...
SELECT
patientid,
ISNULL(CondA.ConditionA,0) as IsConditionA,
ISNULL(CondB.ConditionB,0) as IsConditionB,
....
FROM
patient
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionA from ... where ... ) CondA
ON patient.patientid = CondA.patientID
LEFT JOIN
(SELECT DISTINCT patientid, 1 as ConditionB from ... where ... ) CondB
ON patient.patientid = CondB.patientID
If your Condition queries only return a maximum one row, you can simplify them down to
(SELECT patientid, 1 as ConditionA from ... where ... ) CondA