Finding the most frequent value in SQL column

I have written the following query:
select ap.doctorsnum,doc.specialty
from appointments as ap
join doctor as doc
on ap.doctorsnum = doc.doctorsnum
This is a screenshot of the first 10 rows of the result. (The actual result contains more than 5k rows.)
How can I find the most frequent value in the "specialty" column?
(All security numbers are fake, they have been randomly generated)
Any help is much appreciated!
Thank you!

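I would write a query like the one below. It keeps the join from the question, because specialty is a column of doctor rather than appointments:
SELECT
doc.specialty,
COUNT(*) AS value_occurrence
FROM appointments AS ap
JOIN doctor AS doc
ON ap.doctorsnum = doc.doctorsnum
GROUP BY
doc.specialty
ORDER BY value_occurrence DESC
LIMIT 1;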
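As a rough, runnable sketch of the group-count-sort approach, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The table and column names come from the question; the sample rows and the appointments.id column are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE doctor (doctorsnum INTEGER PRIMARY KEY, specialty TEXT);
    CREATE TABLE appointments (id INTEGER PRIMARY KEY, doctorsnum INTEGER);
    INSERT INTO doctor VALUES (1, 'Cardiology'), (2, 'Dermatology'), (3, 'Cardiology');
    INSERT INTO appointments (doctorsnum) VALUES (1), (2), (3), (3), (2), (1), (1);
""")

# Count appointments per specialty, sort by that count, keep the top row.
most_frequent = conn.execute("""
    SELECT doc.specialty, COUNT(*) AS value_occurrence
    FROM appointments AS ap
    JOIN doctor AS doc ON ap.doctorsnum = doc.doctorsnum
    GROUP BY doc.specialty
    ORDER BY value_occurrence DESC
    LIMIT 1
""").fetchone()

print(most_frequent)  # ('Cardiology', 5)
```

Note that LIMIT 1 returns a single row even when two specialties tie for first place, and that SQL Server uses TOP 1 instead of LIMIT.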

Related

SQL QUERY : Find for each year copies sold > 10000

I am practicing a bit with SQL and I came across this exercise:
Consider the following database relating to albums, singers and sales:
Album (Code, Singer, Title)
Sales (Album, Year, CopiesSold)
with a referential integrity constraint between the Album attribute of Sales and the key of the
Album relation.
Formulate the following query in SQL:
Find the code and title of the albums that have sold at least 10,000 copies
every year since they came out.
I had thought of solving it like this:
SELECT CODE, TITLE, COUNT (*)
FROM ALBUM JOIN SALES ON ALBUM.Code = SALES.Album
WHERE CopiesSold > 10000
HAVING COUNT(*) = /* Select difference from current year and came out year.*/
Can you help me with this? Thanks.

You can do this with an INNER JOIN, GROUP BY, and HAVING.
SELECT A.Code, A.Title
FROM ALBUM A
INNER JOIN SALES S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000
The HAVING clause will filter out albums whose minimum Copies Sold are < 10000.
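To illustrate the MIN-based filter, here is a small sketch run against SQLite through Python's sqlite3 module. The schema follows the exercise; the album titles and sales figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Album (Code INTEGER PRIMARY KEY, Singer TEXT, Title TEXT);
    CREATE TABLE Sales (Album INTEGER REFERENCES Album(Code), Year INTEGER, CopiesSold INTEGER);
    -- Invented rows: album 1 always sells at least 10000, album 2 has one weak year.
    INSERT INTO Album VALUES (1, 'Singer A', 'Steady Seller'), (2, 'Singer B', 'One Bad Year');
    INSERT INTO Sales VALUES (1, 2018, 12000), (1, 2019, 15000), (1, 2020, 10000),
                             (2, 2019, 50000), (2, 2020, 9000);
""")

# An album qualifies only if its worst recorded year still reached 10000 copies.
rows = conn.execute("""
    SELECT A.Code, A.Title
    FROM Album A
    INNER JOIN Sales S ON S.Album = A.Code
    GROUP BY A.Code, A.Title
    HAVING MIN(S.CopiesSold) >= 10000
""").fetchall()

print(rows)  # [(1, 'Steady Seller')]
```

This checks only the years that have a row in Sales; a year with no row at all is not detected by this version.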
EDIT
There was also a question about gaps in the Sales data. There are a number of ways to modify the above query to handle this as well. One solution would be to use an embedded query to identify the correct number of years.
SELECT A.Code, A.Title
FROM ALBUM A
INNER JOIN SALES S ON S.Album = A.Code
GROUP BY A.Code, A.Title
HAVING MIN(S.CopiesSold) >= 10000 AND
COUNT(*) = (SELECT COUNT(DISTINCT Year) FROM SALES WHERE Year >= MIN(s.Year))
This solution assumes that at least one album by some artist was sold each year (a fairly safe bet). If you had a Years table there are simpler solutions. If the data is current there are also solutions that utilize DATEDIFF.

You can use correlated subqueries with EXISTS or NOT EXISTS respectively.
In the first, check that the maximum year minus the minimum year plus one equals the count of records with a defined year for the album. That ensures you don't get albums with a year's figures missing, for which you cannot tell whether they sold 10000 or more. Also check that the maximum year is the current year, so that a gap between the maximum year and the current year is not missed. (In the example code I use the literal 2020, but there are ways to get the current year dynamically. They depend on the DBMS, however, and you didn't state which one you're using.)
In the second, check that there is no record with undefined sales figures or sales figures lower than 10000 for the album. If no such record exists, all of the existing ones must have figures of 10000 or greater.
SELECT a1.code,
a1.title
FROM album a1
WHERE EXISTS (SELECT ''
FROM sales s1
WHERE s1.album = a1.code
HAVING max(s1.year) - min(s1.year) + 1 = count(s1.year)
AND max(s1.year) = 2020)
AND NOT EXISTS (SELECT *
FROM sales s2
WHERE s2.album = a1.code
AND (s2.copiessold IS NULL
OR s2.copiessold < 10000));
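One detail worth spelling out for the NOT EXISTS check: in SQL, AND binds more tightly than OR, so the "NULL or below 10000" test has to be wrapped in parentheses to stay tied to the album being examined. The sketch below (SQLite via Python's sqlite3, invented sample rows) compares the two forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE album (code INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE sales (album INTEGER, year INTEGER, copiessold INTEGER);
    -- Invented rows: album 1 is always above 10000, album 2 has one weak year.
    INSERT INTO album VALUES (1, 'Steady Seller'), (2, 'One Bad Year');
    INSERT INTO sales VALUES (1, 2019, 12000), (1, 2020, 15000),
                             (2, 2019, 50000), (2, 2020, 9000);
""")

query = """
    SELECT a1.code FROM album a1
    WHERE NOT EXISTS (SELECT * FROM sales s2
                      WHERE s2.album = a1.code AND {condition})
    ORDER BY a1.code
"""
unparenthesized = "s2.copiessold IS NULL OR s2.copiessold < 10000"
parenthesized = "(" + unparenthesized + ")"

# Without parentheses the "< 10000" branch matches ANY album's weak year.
loose = [r[0] for r in conn.execute(query.format(condition=unparenthesized))]
# With parentheses the whole test is restricted to the current album.
strict = [r[0] for r in conn.execute(query.format(condition=parenthesized))]

print(loose, strict)  # [] [1]
```

With the unparenthesized condition, album 2's weak year disqualifies every album, including album 1.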

I think the ALL keyword should work nicely here. Something like this:
SELECT * FROM Album
WHERE 10000 <= ALL (
SELECT CopiesSold FROM Sales
WHERE Sales.Album = Album.Code)

Counting distinct values output from a grouped SQL Count function

I've got a database that holds information about volunteers and their participation in a range of events.
The following query gives me a list of their names and total attendances
SELECT
volunteers.last_name,
volunteers.first_name,
count (bookings.id)
FROM
volunteers,
bookings
WHERE
volunteers.id = bookings.volunteer_id
GROUP BY
volunteers.last_name,
volunteers.first_name
I want the result table to show each distinct number of attendances and how many volunteers have that number. So if five people each did one event, it would display 1 in the first column and 5 in the second, and so on.
Thanks

If I understand correctly, you want what I call a "histogram of histograms" query:
select numvolunteers, count(*) as numevents, min(eventid), max(eventid)
from (select b.eventid, count(*) as numvolunteers
from bookings b
group by b.eventid
) b
group by numvolunteers
order by numvolunteers;
The first column is the number of volunteers booked for an "event". The second is the number of events where this occurs. The last two columns are just examples of events that have the given number of volunteers.
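If the goal is instead the distribution the question describes (how many volunteers have each attendance count), the same two-level pattern can be grouped by volunteer rather than by event. The following is only a sketch, run on SQLite through Python's sqlite3, using the bookings.volunteer_id column from the question and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (id INTEGER PRIMARY KEY, volunteer_id INTEGER);
    -- Invented rows: volunteers 1 and 2 attended once, volunteer 3 attended three times.
    INSERT INTO bookings (volunteer_id) VALUES (1), (2), (3), (3), (3);
""")

# Inner query: attendances per volunteer. Outer query: volunteers per attendance count.
rows = conn.execute("""
    SELECT attendances, COUNT(*) AS num_volunteers
    FROM (SELECT b.volunteer_id, COUNT(*) AS attendances
          FROM bookings b
          GROUP BY b.volunteer_id) per_volunteer
    GROUP BY attendances
    ORDER BY attendances
""").fetchall()

print(rows)  # [(1, 2), (3, 1)]
```

Volunteers with no bookings at all do not appear in this result, because the inner query reads only the bookings table.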

PostgreSQL: reuse Column Data In Different Column Of The Same Query

I'm trying to create a SELECT query that does several calculated fields on one of two tables. I'm new to SQL (I've looked at several free online tutorials, so I have a general idea), but I think my goal is a little out of my skill range.
I have two tables:
TreeRecord with columns ID (serial), Site (chr)
Each ID represents an individual tree.
TreeHistory with columns ID (serial), TreeID (int), DBH (int)
DBH is tree diameter.
Currently I can create this:
| Site | Total tree count of site | Avg DBH of site |
I would like to have another column that can give the total count of trees over a particular size for each site. I can recreate this in a simple query, and my research on stack (SQL Select - Calculated Column if Value Exists in another Table) makes me feel that a nested SELECT is what I'm after but I can't get that to work. My current code is this:
SELECT
"TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"), 0) AS Average_DBH
FROM
"TreeRecord"
LEFT OUTER JOIN
"TreeHistory" ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY
"Site"
ORDER BY
"Site" ASC;
Any help on this would be most appreciated.
Thank you

Use count with the specific size condition.
SELECT "TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"),0) AS Average_DBH,
count(case when "TreeHistory"."DBH" > 10 then 1 end) as count_over_specific_size -- change the threshold (10) as needed
FROM "TreeRecord"
LEFT OUTER JOIN "TreeHistory"
ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY "Site"
ORDER BY "Site" ASC;
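The conditional count works because a CASE expression with no ELSE branch yields NULL when the condition fails, and COUNT skips NULLs. Below is a minimal sketch on SQLite through Python's sqlite3; the site names and DBH values are invented, and COUNT(*) stands in for the PostgreSQL-specific count("TreeRecord".*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TreeRecord (ID INTEGER PRIMARY KEY, Site TEXT);
    CREATE TABLE TreeHistory (ID INTEGER PRIMARY KEY, TreeID INTEGER, DBH INTEGER);
    -- Invented rows: two trees at North (DBH 8 and 14), one at South (DBH 22).
    INSERT INTO TreeRecord VALUES (1, 'North'), (2, 'North'), (3, 'South');
    INSERT INTO TreeHistory (TreeID, DBH) VALUES (1, 8), (2, 14), (3, 22);
""")

rows = conn.execute("""
    SELECT r.Site,
           COUNT(*) AS total_count,
           ROUND(AVG(h.DBH), 0) AS average_dbh,
           -- CASE returns NULL when DBH <= 10, and COUNT ignores NULLs
           COUNT(CASE WHEN h.DBH > 10 THEN 1 END) AS count_over_10
    FROM TreeRecord r
    LEFT OUTER JOIN TreeHistory h ON r.ID = h.TreeID
    GROUP BY r.Site
    ORDER BY r.Site
""").fetchall()

print(rows)  # [('North', 2, 11.0, 1), ('South', 1, 22.0, 1)]
```

If a tree has several TreeHistory rows, each row is counted separately in all three aggregates, so the totals reflect measurements rather than distinct trees.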

sql-server-2008 : get the last status of subjects of a student

Salam (greetings) to all.
Intro:
I am working on a Student Examination System, in which students sit exams and are recorded as passed, failed, or absent.
Problem:
I am tasked with fetching a summary of their status: in effect a result card that prints the very last status for each subject.
Below is a sample of the data where a student has appeared many times, in different sessions. I have highlighted one subject in which the student has appeared three times.
I wrote the following query, which returns the same result as the picture above:
SELECT DISTINCT
gr.STUDKEY,gr.SUBJECT_ID, gr.SUBJECT_DESC,gr.MARKS,
gr.PASSFAIL, gr.GRADE,max(gr.SESSION_ID), gr.LEVEL_ID
FROM RESULT gr
WHERE gr.STUDKEY = '0100106524'
GROUP BY gr.STUDKEY,gr.SUBJECT_ID, gr.SUBJECT_DESC,gr.MARKS,
gr.PASSFAIL, gr.GRADE, gr.LEVEL_ID
Desired:
I want to get only the last status for each subject in which a student has appeared.
Any help is appreciated. Thanks in advance.
Regards
I am using sql-server-2008.

This won't work because fields like gr.MARKS and gr.GRADE appear in both the SELECT list and the GROUP BY, so the query returns a separate row for every distinct combination of marks and grade instead of one row per subject. Instead, find the latest session per student and subject in a derived table and join back to it (this assumes a higher SESSION_ID means a later session):
SELECT
gr.STUDKEY, gr.SUBJECT_ID, gr.SUBJECT_DESC,
gr.PASSFAIL, gr.GRADE, gr.SESSION_ID, gr.LEVEL_ID
FROM RESULT gr
JOIN (SELECT STUDKEY, SUBJECT_ID, MAX(SESSION_ID) AS SESSION_ID
FROM RESULT
GROUP BY STUDKEY, SUBJECT_ID) gr1
ON gr1.STUDKEY = gr.STUDKEY
AND gr1.SUBJECT_ID = gr.SUBJECT_ID
AND gr1.SESSION_ID = gr.SESSION_ID
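As a hedged illustration of the "join back to the latest session" idea, the sketch below runs on SQLite through Python's sqlite3 rather than SQL Server. It keeps only a few of the RESULT columns, uses invented rows, and assumes that a larger SESSION_ID means a later session.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RESULT (STUDKEY TEXT, SUBJECT_ID INTEGER, SESSION_ID INTEGER, PASSFAIL TEXT);
    -- Invented rows: subject 101 was attempted in three sessions, subject 102 once.
    INSERT INTO RESULT VALUES
        ('0100106524', 101, 1, 'Fail'),
        ('0100106524', 101, 2, 'Absent'),
        ('0100106524', 101, 3, 'Pass'),
        ('0100106524', 102, 1, 'Pass');
""")

# The derived table finds the last session per student and subject;
# joining back on all three keys keeps only that session's row.
rows = conn.execute("""
    SELECT gr.STUDKEY, gr.SUBJECT_ID, gr.SESSION_ID, gr.PASSFAIL
    FROM RESULT gr
    JOIN (SELECT STUDKEY, SUBJECT_ID, MAX(SESSION_ID) AS last_session
          FROM RESULT
          GROUP BY STUDKEY, SUBJECT_ID) latest
      ON latest.STUDKEY = gr.STUDKEY
     AND latest.SUBJECT_ID = gr.SUBJECT_ID
     AND latest.last_session = gr.SESSION_ID
    WHERE gr.STUDKEY = '0100106524'
    ORDER BY gr.SUBJECT_ID
""").fetchall()

print(rows)  # [('0100106524', 101, 3, 'Pass'), ('0100106524', 102, 1, 'Pass')]
```

On SQL Server 2008, ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) is another common way to pick the latest row per group.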

Hopefully there is a date field, or something else that indicates the order of the student's appearances in a subject. Order the query by that field in descending order so that the most recent occurrence comes first, then specify TOP 1 to return only that most recent record, which carries the latest status.
SELECT TOP 1
gr.STUDKEY,gr.SUBJECT_ID, gr.SUBJECT_DESC,gr.MARKS,
gr.PASSFAIL, gr.GRADE,gr.SESSION_ID, gr.LEVEL_ID
FROM RESULT gr
WHERE gr.STUDKEY = '0100106524'
ORDER BY gr.Date DESC -- swap "Date" for your field indicating the sequence
Alternatively, use GROUP BY with MAX(Date) if you need the latest record for several subjects of the same student at once.

Query to find how many students signed out

I need to write a query to find out how many students signed out after 1st period. We don't store a record when a student is present, so I can't directly check that a student was present for 1st period and has 6 absence records (we have 7-period days). All I have is the info in the schema below. I have written a query, but it's not working. I need some help on where to go from here.
Thanks
Select student_id, Count(*) AS #ofPerAbsent
From Attend_Student_Detail
where School_Year='1112' and School_Number='0031'
and Absent_Date='2012-04-13' and Absent_Code IN ('ABU','ABX')
Group by Student_ID
Having count(*)<=6
ORDER BY #ofPerAbsent desc

So your criterion for determining that a student signed out after 1st period is having an Absent_Code of 'ABU' or 'ABX'?
If that assumption is correct, then you can query as follows to get the count of students per day that fit that criterion:
SELECT COUNT(DISTINCT(Student_ID))
FROM Attend_Student_Detail
WHERE Absent_Code IN ('ABU','ABX')
GROUP BY Absent_Date
You can further filter to specific dates in the WHERE clause if you'd like.
Your schema doesn't make much sense to me by the way; so if the above is not what you're looking for, can you please explain your schema a bit more and I'm sure I can help.

From what I can gather, you want to count all the absences minus the count of absences after the first period, so I think something like this should work (absent_period and its value 'period one' are placeholders for however your schema records the period):
SELECT
A.student_id,
(Count(A.student_id) - COALESCE(B.absences_after, 0)) as absences
FROM
attend_student_detail as A
LEFT JOIN (
SELECT
Z.student_id,
Count(Z.student_id) as absences_after
FROM
attend_student_detail as Z
WHERE school_year='1112' AND school_number='0031'
AND absent_date='2012-04-13' AND absent_code IN ('ABU','ABX')
AND absent_period <> 'period one'
GROUP BY Z.student_id
) as B
ON B.student_id = A.student_id
GROUP BY A.student_id, B.absences_after;
