How to do the max count part in SQL?

I was told to find out which occupation has the greatest number of patients with conditionID = 'MC8'.
I don't know how to do the "greatest" part.
Here is my code right now:
SELECT occupation
FROM Patient
WHERE EXISTS
(SELECT PatientID FROM PatientMedcon
WHERE conditionID = 'MC8')
GROUP BY occupation
HAVING count(occupation) = (SELECT max(occupation)
FROM Patient)

You should approach these types of queries using regular joins and then add additional factors. The following gets the count of patients for each occupation with that condition:
SELECT occupation, COUNT(*)
FROM Patient p JOIN
PatientMedcon pm
ON p.PatientId = pm.PatientId and
pm.conditionId = 'MC8'
GROUP BY occupation
ORDER BY COUNT(*) DESC;
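If you want only the top row, the syntax depends on the database. It might be SELECT TOP 1 (SQL Server), LIMIT 1 at the end (MySQL, PostgreSQL, SQLite), FETCH FIRST 1 ROWS ONLY at the end (standard SQL, supported by DB2, Oracle 12c and later, and PostgreSQL), or something else.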
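As a quick check of the join-and-count approach, here is a minimal sketch that runs the query against an in-memory SQLite database from Python. The sample rows, the column types, and the use of LIMIT 1 (SQLite's way of taking the top row) are assumptions for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (PatientId INTEGER PRIMARY KEY, occupation TEXT);
    CREATE TABLE PatientMedcon (PatientId INTEGER, conditionId TEXT);
    -- Made-up sample data: two nurses and one teacher have condition MC8
    INSERT INTO Patient VALUES (1, 'Nurse'), (2, 'Nurse'), (3, 'Teacher'), (4, 'Driver');
    INSERT INTO PatientMedcon VALUES (1, 'MC8'), (2, 'MC8'), (3, 'MC8'), (4, 'MC2');
""")

# Count matching patients per occupation, largest count first, keep one row
top_occupation = conn.execute("""
    SELECT occupation, COUNT(*) AS patients
    FROM Patient p
    JOIN PatientMedcon pm
      ON p.PatientId = pm.PatientId
     AND pm.conditionId = 'MC8'
    GROUP BY occupation
    ORDER BY COUNT(*) DESC
    LIMIT 1
""").fetchone()

print(top_occupation)
```

Note that taking a single row hides ties: if two occupations share the highest count, only one of them is returned.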

Related

Filter on specific columns and return all columns

I am trying to left join two tables and retrieve all columns from table one but remove duplicates based on a set of columns.
SELECT A.*, B.impact
FROM #Site_one AS A WITH (NOLOCK)
LEFT JOIN #Progress AS B With (NOLOCK)
ON lower(A.site_code) = lower(B.site_code)
GROUP BY A.date, A.operationid, A.worklocation, A.siteid, A.alias
This does not work because there are columns in A that need to be either aggregated or added to the GROUP BY clause. The problem is that I do not want to group on those columns, and I do not want them aggregated either.
Is there a way to select all columns in A and the impact column in B and still be able to filter out duplicates on the columns specified in the group by clause?
Any pointers/help would be greatly appreciated.
"and still be able to filter out duplicates on the columns specified in the group by clause"
But, how does the database really know which rows to throw away? Suppose you have:
Person
John, 42, Stockbroker
John, 36, Train driver
John, 58, Retired
John, 58, Metalworker
And you think "I wanna dedupe those based on the name":
SELECT * FROM person GROUP BY name
So which three Johns should the DB throw away?
It cannot decide this for you; you have to write the query so that it is clear what you want to keep and what you want to throw away.
You could MAX everything:
SELECT name, MAX(age), MAX(job) FROM person GROUP BY name
That'll work.. but it gives you a John that never existed in the original data:
John, 58, Train driver
You could say "I'll only keep the person with the max age":
SELECT p.*
FROM
person p
INNER JOIN (SELECT name, max(age) as maxage FROM person GROUP BY name) maxp
ON p.name = maxp.name AND p.age = maxp.maxage
.. but there are two people with the same max age.
Your DB might have a row number analytic, which is nice:
SELECT *, row_number() over(PARTITION BY name ORDER BY age DESC) rn
FROM person
One of your 58 year old Johns will get row number 1 - can't be sure which one, but you could then discard all the rows with an rn > 1:
WITH x as (
SELECT *, row_number() over(PARTITION BY name ORDER BY age DESC) rn
FROM person
)
SELECT name, age, job
INTO newtable
FROM x
WHERE rn = 1
..but what if you discarded the wrong John...
You're going to have to go and think about this some more, and exactly specify what to throw away...
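One way to specify it exactly is to add a tie-breaker to the ORDER BY, so that the row receiving rn = 1 no longer depends on chance. The sketch below runs in SQLite (3.25 or later, for window functions) from Python. Breaking ties alphabetically on job is purely an assumed rule for illustration; substitute whatever rule matches your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (name TEXT, age INTEGER, job TEXT);
    INSERT INTO person VALUES
        ('John', 42, 'Stockbroker'),
        ('John', 36, 'Train driver'),
        ('John', 58, 'Retired'),
        ('John', 58, 'Metalworker');
""")

# Keep one row per name: oldest first, ties broken by job name
kept = conn.execute("""
    WITH x AS (
        SELECT name, age, job,
               row_number() OVER (
                   PARTITION BY name
                   ORDER BY age DESC, job ASC  -- assumed tie-breaker: job, alphabetically
               ) AS rn
        FROM person
    )
    SELECT name, age, job
    FROM x
    WHERE rn = 1
""").fetchall()

print(kept)
```

With the tie-breaker in place, the two 58-year-old Johns are always ranked the same way, so the query keeps the same row on every run.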

SQL Server Get one row for each student with highest date

I have two tables as follows:
I want to find the StudentId, FirstName, StudentLoginInfoId, and LoginDate. I am expecting only one entry per student, the one with the highest LoginDate.
Expected result:
You could use ROW_NUMBER in a subquery to number the rows within each partition (here, each student), then keep only the rows numbered 1, which gives exactly one row per student:
select studentid, firstname, studentlogininfoid, logindate
from (
select
s.studentid, s.firstname, sl.studentlogininfoid, sl.logindate,
row_number() over (partition by sl.studentid order by sl.logindate desc) as rn
from student s
inner join studentlogininfo sl on s.studentid = sl.studentid
) t
where rn = 1
Explaining the arguments for row_number:
PARTITION BY specifies the groups that are numbered separately (numbering restarts at 1 for each group).
ORDER BY specifies the order in which the rows within each group are numbered.
If we number the rows for each student from the latest date to the earliest, then the first row for each student (the row with rn = 1) contains the highest login date for that student.
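To see the numbering at work, here is a small sketch that runs the same pattern in SQLite (3.25 or later) from Python. The question's table definitions did not survive, so the table name StudentLoginInfo, the column types, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StudentId INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE StudentLoginInfo (
        StudentLoginInfoId INTEGER PRIMARY KEY,
        StudentId INTEGER,
        LoginDate TEXT  -- ISO-8601 text, so string order matches date order
    );
    -- Made-up sample data: Ann logged in twice, Bob once
    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO StudentLoginInfo VALUES
        (10, 1, '2020-01-05'),
        (11, 1, '2020-03-01'),
        (12, 2, '2020-02-10');
""")

# Number each student's logins from newest to oldest, keep number 1
latest_logins = conn.execute("""
    SELECT StudentId, FirstName, StudentLoginInfoId, LoginDate
    FROM (
        SELECT s.StudentId, s.FirstName, sl.StudentLoginInfoId, sl.LoginDate,
               row_number() OVER (
                   PARTITION BY sl.StudentId
                   ORDER BY sl.LoginDate DESC
               ) AS rn
        FROM Student s
        JOIN StudentLoginInfo sl ON s.StudentId = sl.StudentId
    ) t
    WHERE rn = 1
    ORDER BY StudentId
""").fetchall()

print(latest_logins)
```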
You can use "CROSS APPLY" to find what you want:
SELECT S.StudentId
, S.FirstName
, SLI.StudentLoginInfoId
, SLI.LoginDate
FROM Student S
CROSS APPLY (SELECT TOP 1 * FROM StudentLoginInfo SLI WHERE S.StudentId = SLI.StudentId ORDER BY LoginDate DESC) SLI

Hive Script, DISTINCT with SUM

I am trying to find the number of distinct teams a player played for in any single season. This is tripping me up. The first query below is my failed attempt; the second one produces the sample output further down.
SELECT o.id,o.year,COUNT(DISTINCT(o.team)) b JOIN
(SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25) o
0.id =b.id;
SELECT id, year, team FROM batting
GROUP BY id,year,team
ORDER BY id DESC
LIMIT 25;
produces
Ignore the ^A characters; I think they represent a space or a comma, i.e. just the column separator.
Get the count of distinct teams for each player for each year, order by that count descending, and take the first row:
SELECT id, year, COUNT(DISTINCT team) AS team_count FROM batting
GROUP BY id, year
ORDER BY team_count DESC
LIMIT 1;
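The same count-distinct-per-group idea can be tried outside Hive. The sketch below runs it in SQLite from Python. The batting rows are invented for illustration, and the alias team_count is an assumed name, not something from the original data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE batting (id TEXT, year INTEGER, team TEXT);
    -- Made-up rows: playerA appears for three different teams in 2001
    INSERT INTO batting VALUES
        ('playerA', 2001, 'NYA'),
        ('playerA', 2001, 'NYA'),
        ('playerA', 2001, 'BOS'),
        ('playerA', 2001, 'CHN'),
        ('playerB', 2001, 'BOS'),
        ('playerB', 2002, 'SEA'),
        ('playerB', 2002, 'TEX');
""")

# Distinct teams per player per season, highest count first
most_teams = conn.execute("""
    SELECT id, year, COUNT(DISTINCT team) AS team_count
    FROM batting
    GROUP BY id, year
    ORDER BY team_count DESC
    LIMIT 1
""").fetchone()

print(most_teams)
```

The repeated ('playerA', 2001, 'NYA') row is counted once because of DISTINCT.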

SQL Finding maximum value without top command

Let's say I have a database with a table:
-courses (key: name [ofthecourse], other attributes: year in which the course takes place)
I want to complete a query looking for an answer to the question:
Which year of study has the maximum number of courses?
Normally, the query would be:
SELECT TOP 1 STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
ORDER BY COUNT(CNO) DESC;
But my question is, which query could complete this without using the TOP 1 phrase?
You can use an inner query to get the maximum count. The only difference is that it can return more than one row if several years tie for the maximum count.
SELECT STUDYEAR
FROM COURSES
GROUP BY STUDYEAR
HAVING COUNT(CNO) = (SELECT MAX(CNOCount) FROM
(SELECT COUNT(CNO) CNOCount
FROM COURSES
GROUP BY STUDYEAR) X)
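To illustrate the tie behaviour, here is a minimal sketch that runs the HAVING version in SQLite from Python. The five courses are made-up sample data, chosen so that years 1 and 2 both have two courses; the final ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COURSES (CNO TEXT PRIMARY KEY, STUDYEAR INTEGER);
    -- Made-up sample data: years 1 and 2 tie with two courses each
    INSERT INTO COURSES VALUES
        ('C1', 1), ('C2', 1),
        ('C3', 2), ('C4', 2),
        ('C5', 3);
""")

# Keep every year whose course count equals the largest per-year count
busiest_years = [row[0] for row in conn.execute("""
    SELECT STUDYEAR
    FROM COURSES
    GROUP BY STUDYEAR
    HAVING COUNT(CNO) = (SELECT MAX(CNOCount)
                         FROM (SELECT COUNT(CNO) AS CNOCount
                               FROM COURSES
                               GROUP BY STUDYEAR) X)
    ORDER BY STUDYEAR
""")]

print(busiest_years)
```

Both tied years come back, whereas a TOP 1 query would return only one of them.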
Another version with only one inner query:
SELECT STUDYEAR
FROM
(SELECT STUDYEAR, ROW_NUMBER() OVER (ORDER BY COUNT(CNO) DESC) RowNumber
FROM COURSES
GROUP BY STUDYEAR) X
WHERE RowNumber = 1

How do I see if there are multiple rows with an identical value in particular column?

I'm looking for an efficient way to exclude rows from my SELECT statement WHERE more than one row is returned with an identical value for a certain column.
Specifically, I am selecting a bunch of accounts, but need to exclude accounts where more than one is found with the same SSN associated.
This will return all SSNs with exactly 1 row:
select ssn,count(*)
from SomeTable
group by ssn
having count(*) = 1
This will return all SSNs with more than 1 row:
select ssn,count(*)
from SomeTable
group by ssn
having count(*) > 1
Your full query would be like this (it will work on SQL Server 7 and up):
select a.* from account a
join(
select ssn
from SomeTable
group by ssn
having count(*) = 1) s on a.ssn = s.ssn
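Here is a small sketch of that join, run in SQLite from Python, to show which accounts survive. It assumes the SSNs live in the same account table that is being filtered, and the account numbers and SSNs are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (acn INTEGER PRIMARY KEY, ssn TEXT);
    -- Made-up sample data: accounts 2 and 3 share an SSN
    INSERT INTO account VALUES
        (1, '111-11-1111'),
        (2, '222-22-2222'),
        (3, '222-22-2222'),
        (4, '333-33-3333');
""")

# Join each account to the set of SSNs that occur exactly once
unique_accounts = conn.execute("""
    SELECT a.acn, a.ssn
    FROM account a
    JOIN (SELECT ssn
          FROM account
          GROUP BY ssn
          HAVING COUNT(*) = 1) s
      ON a.ssn = s.ssn
    ORDER BY a.acn
""").fetchall()

print(unique_accounts)
```

Accounts 2 and 3 drop out because their shared SSN never reaches the derived table.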
For SQL Server 2005 or above you can try this:
WITH qry AS
(
SELECT a.*,
COUNT(*) OVER(PARTITION BY ssn) dup_count
FROM accounts a
)
SELECT *
FROM qry
WHERE dup_count = 1
For SQL Server 2000 and 7:
SELECT a.*
FROM accounts a INNER JOIN
(
SELECT ssn
FROM accounts b
GROUP BY ssn
HAVING COUNT(1) = 1
) b ON a.ssn = b.ssn
SELECT *
FROM #Temp
WHERE SSN NOT IN (SELECT ssn FROM #Temp GROUP BY ssn HAVING COUNT(ssn) > 1)
Thank you all for your detailed suggestions. When it was all said and done, I needed to use a correlated subquery. Essentially, this is what I had to do:
SELECT acn, ssn, [date] FROM Account a
WHERE NOT EXISTS (SELECT 1 FROM Account WHERE ssn = a.ssn AND [date] < a.[date])
Hope this helps someone.
I never updated this... In my final submission, I achieved this through a left join to increase efficiency (the correlated subquery was not acceptable as it took a significant amount of time to run, checking each record against over 150K others).
Here is what had to be done to solve my problem:
SELECT acn, ssn
FROM Account a
LEFT JOIN (SELECT ssn, COUNT(1) AS counter FROM Account
GROUP BY ssn) AS counters
ON a.ssn = counters.ssn
WHERE counter IS NULL OR counter = 1