SQL - count occurrences of gender - sql

I'm new to SQL and I am creating a movie rental site database. I have to show some summary statistics of the database, so I'm planning to show the number of males and females that have selected films of a certain certification to rent.
The output I want is:
Certificate No. Males No. Females
----------- --------- -----------
U # #
PG # #
12 # #
15 # #
18 # #
The relevant tables for this query are:
FILM {FID (PK), Film_Title, Certificate, Date_of_Release}
MEMBER {MID (PK), ..., Gender}
LIST {MID (FK), FID (FK)}
FILM and MEMBER table should be quite self-explanatory, while a LIST is a selection of films a MEMBER wishes to rent. It's like a shopping basket, if you like. Each member only has one list and each list can contain many films. I tried to come up with a query but Access wouldn't let me run it:
SELECT Gender, COUNT(*) AS Gender_per_Certificate
FROM (SELECT DISTINCT Film_Title, Certificate, Gender
FROM (SELECT *
FROM Film, Member, List
WHERE Film.FID=List.FID
AND Member.MID=List.MID)
WHERE Film_Title IN (SELECT Film_Title
FROM (SELECT *
FROM Film, Member, List
WHERE Film.FID=List.FID
AND Member.MID=List.MID)
WHERE Certificate=[Certificate?]
GROUP BY Certificate))
WHERE (((Member.[Gender])="M")) OR ((("F")<>False))
GROUP BY Gender;
The error is
You tried to execute a query that does not include the specified expression 'FID' as part of an aggregate function
I'm not sure what this error means - could someone please explain it? Also, how can I execute the query correctly?
I understand this might be a really simple query that I've overcomplicated, would appreciate any help at all, thanks a lot for your time in advance!

If you use a case statement (IIF in access, thank you #HansUp) to determine which are male and female, grouping by certificate, you should get your answer.
select
f.Certificate,
sum(IIF(m.[gender] = 'M',1, 0)) as [# of males],
sum(IIF(m.[gender] = 'F',1, 0)) as [# of females]
from
(member m
inner join list l on m.mid = l.mid)
inner join film f on l.fid = f.fid
group by f.Certificate

Related

Oracle sql group function issues

Find Melbourne VIP level 4 customers’ first name, last name who have hired the vehicle model as “Ranger ” at least 2 times in database. You write three different queries: one is using operator EXISTS and the other one is using operator IN. The third query with the main filter criteria in FROM clause of the main query or filter criteria in the sub-query. Find one with the better performance.
I Have tried this query;
SELECT c_fname, c_fname FROM rental WHERE
EXISTS(SELECT c_id FROM customer WHERE c_city = 'Melbourne' AND customer.vip_level = '4')
AND EXISTS (SELECT vehicle_reg FROM vehicle WHERE v_model = 'Ranger')
HAVING COUNT(c_id)>=2 GROUP BY c_lname, c_fname;
I am getting error: SQL Error: ORA-00934: group function is not allowed here
00934. 00000 - "group function is not allowed here"
can anyone help me with this question. really struggled to get this done?
You are selecting from the wrong subject table as Rental does not have c_fname or c_lname columns.
You want to "Find Melbourne VIP level 4 customers’ first name, last name" which would be in the customer table:
SELECT c_fname,
c_lname
FROM customer
WHERE c_city = 'Melbourne'
AND vip_level = 4;
Then you want to add an additional filter "who have hired the vehicle model as “Ranger ” at least 2 times in database". That requires you to use EXISTS (for the answer for the first query) and you need to correlate between the outer-query and the sub-query; once you have done that then you do not need a GROUP BY clause and you are aggregating over the entire result set of the sub-query and can just use a HAVING clause.
SELECT c_fname,
c_lname
FROM customer c
WHERE c_city = 'Melbourne'
AND vip_level = 4
AND EXISTS(
SELECT 1
FROM rental r
INNER JOIN vehicle v
ON (r.vehicle_reg = v.vehicle_reg)
WHERE c.c_id = r.c_id
AND v.v_model = 'Ranger'
HAVING COUNT(*) >= 2
);
Then you need to write the same query using IN instead of EXISTS and the same query a third time using JOIN conditions instead of IN or EXISTS and, finally, you need to compare the performance of all three queries.

How can I select the highest counts attributes from different groups?

So I have a table with players data(name, team, etc..) and a table with goals (player who scored it, local team, etc...). What I need to do is, get from each team the highest scorer. So the result I'm getting is something like:
germany - whatever name - 1
germany - another dude - 5
spain - another name - 8
italy - one more name - 6
As you can see teams repeat, and I want them not to, just get the highest scorer of each team.
Right now I have this:
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL GOALS" FROM PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
ORDER BY COUNT(*) DESC;
I am 100% sure I'm close, and I'm pretty sure I have to do this with the HAVING feature, but I can't get it right.
Without the HAVING it returns a list of all the players, their teams and how many goals have they scored, now I want to cut it down to only one player for each team.
PD: the teams in the table GOAL are local and visiting team, so I have to use the Player table to get the team. Also the Goal table is not a list of the players and how many goals they have scored, but a list of every individual goal and the player who scored it.
If I understand correctly you can try this query.
just get MAX of PLAYER_GOAL column,SUM(G.PLAYER_GOAL) instead of COUNT(*)
SELECT P.TEAM_PLAYER,
MAX(G.PLAYER_GOAL) "PLAYER_GOAL",
SUM(G.PLAYER_GOAL) AS "TOTAL GOALS"
FROM PLAYER P
INNER JOIN GOAL G
ON P.NAME = G.PLAYER_NAME
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
GROUP BY P.TEAM_PLAYER
ORDER BY SUM(G.PLAYER_GOAL) DESC;
NOTE :
Avoid using commas to join tables it's a old join style, You can use inner-join instead.
Edit
I don't know your table schema, but this query might be work.
use a subquery to contain your current result set. then get MAX function and GROUP BY
SELECT T.TEAM_PLAYER,
T.PLAYER_GOAL,
MAX(TOTAL_GOALS) AS "TOTAL GOALS"
FROM
(
SELECT P.TEAM_PLAYER, G.PLAYER_GOAL, COUNT(*) AS "TOTAL_GOALS" FROM
PLAYER P, GOAL G
WHERE TO_CHAR(G.DATE_GOAL, 'YYYY')=2002
AND P.NAME = G.PLAYER_GOAL
GROUP BY G.PLAYER_GOAL, P.TEAM_PLAYER
HAVING COUNT(*)>=ALL (SELECT COUNT(*) FROM PLAYER P2 where P.TEAM_PLAYER = P2.TEAM_PLAYER GROUP BY P2.TEAM_PLAYER)
) T
GROUP BY T.TEAM_PLAYER,
T.PLAYER_GOAL
ORDER BY MAX(TOTAL_GOALS) DESC

finding people with possible incorrectly spelled cities where zip codes match

I am trying to create a report that will return a list of people whose cities most likely need to be corrected.
I was thinking of comparing the data against other data within the table to leverage the assumption that most of the cities are spelled correctly. Take Albuquerque, for example. We have records for many of the zip codes, but the city isn't always spelled correctly.
I can't figure out my next step.
Here's what I have started with:
SELECT city, zip_5_digits, COUNT(*) AS "COUNT"
FROM people
INNER JOIN addresses
ON addresses.people_id = people.id
AND city LIKE 'Albu%que'
GROUP BY city, zip_5_digits
Doing this results in
Albuqureque 87108 1
Albuquerque 87108 238
Albuqerque 87109 1
Albuquerque 87109 34
What I'd like to do is, for each row, find the maximum records where the zip code matches but the city does not match. If there is no match, I want to return that record, and I'll use this to return people's id and names, since I most likely need to correct the name of the city for those people who have it mis-spelled.
This is hard, because some "cities" have very few residents. And, some zip codes might just have a small part of a city.
I would recommend two rules:
Look at zip codes that have at least a certain number of people -- say 100.
Look at cities in the zip code that have less than some number -- say 5.
There are candidates for misspellings:
SELECT pa.*
FROM (SELECT city, zip_5_digits, COUNT(*) AS cnt,
MAX(COUNT(*)) OVER (PARTITION BY zip_5_digits) as max_cnt,
SUM(COUNT(*)) OVER (PARTITION BY zip_5_digits) as sum_cnt
FROM people p, INNER JOIN
addresses a
ON a.people_id = p.id
GROUP BY city, zip_5_digits
) pa
WHERE sum_cnt >= 100 AND cnt <= 5;

Match All Records

I am having difficulty with writing an SQL query for my application. Below shows a screenshot of some of the tables in my database.
Please let me explain how a section of my application works, and then I can show you the problem I am having with my SQL query.
A User can create an application form on my system (dbo.Form). An application form consists of several sections, one of those sections is Employment History (dbo.FormEmployment). An employment record includes details like employer name, start date etc, but also a gradeID. A User can add one or more employment records to the table dbo.FormEmployment.
A system administrator can add a Shift (dbo.Shift) to the system, and also then assign Grades to a Shift. These Grades are recorded in the dbo.ShiftGrade table. A Shift can be assigned 1 or more Grades, i.e. it could be assigned 1,2,3,4 or 5 Grades.
When the system administrator has added the Shift and Shift Grades, I then want to perform an SQL query to return a list of Users whose Grades match that of the Grades assigned to a Shift (remember the User adds their Grade when they add an employment record).
I have written the following SQL query which works, however, the issue with this query is that it returns a list of Users, that have any Grade that matches that of the Grades assigned to a Shift.
SELECT u.userID, u.userTypeID, u.firstName, u.lastName, u.email, u.password,
u.contactNumber, u.organisationID, u.emailVerificationCode, u.mobileVerificationCode,
u.userStatusID, u.AddedBy, u.AddedDate, u.ModifiedBy, u.ModifiedDate
FROM [User] u
--Check Grades Match
WHERE u.userID IN
(
SELECT f.locumID
FROM dbo.Form f, dbo.FormEmployment emp
WHERE f.formID = emp.formID
AND emp.statusID = 101
AND emp.workYN = 1
AND emp.gradeID IN (
select gradeID from dbo.ShiftGrade where shiftID = #p0
)
)
You can see I am using the IN statement to do this. However, I only want to return a list of Users who can match exactly the same Grades that have been assigned to a Shift. Therefore, if a Shift has been assigned 3 Grades, then I want a List of Users who also have the exact same three Grades to be returned.
I apologise for the lengthy post, I just thought it would be better explained this way.
Any feedback or help with this would be greatly appreciated.
select u.*
from dbo.Users u
join dbo.Form f on u.? = f.formId
join dbo.FormEmployment fe on fe.formId = f.formId
join dbo.Grade g on g.gradeId = fe.gradeId
join dbo.ShiftGrade shg on shg.gradeId =g.gradeId
join dbo.Shift sh on sh.shiftId = shg.shiftId
where
sh.shiftId = -- recently added shift id
and g.gradeId == -- recently added grade id

Help with SQL query (Calculate a ratio between two entitiess)

I’m going to calculate a ratio between two entities but are having some trouble with the query.
The principal is the same to, say a forum, where you say:
A user gets points for every new thread. Then, calculate the ratio of points for the number of threads.
Example:
User A has 300 points. User A has started 6 thread. The point ratio is: 50:6
My schemas look as following:
student(studentid, name, class, major)
course(courseid, coursename, department)
courseoffering(courseid, semester, year, instructor)
faculty(name, office, salary)
gradereport(studentid, courseid, semester, year, grade)
The relations is a following:
Faculity(name) = courseoffering(instructor)
Student(studentid) = gradereport (studentid)
Courseoffering(courseid) = course(courseid)
Gradereport(courseid) = courseoffering(courseid)
I have this query to select the faculty names there is teaching one or more students:
SELECT COUNT(faculty.name) FROM faculty, courseoffering, gradereport, student WHERE faculty.name = courseoffering.instructor AND courseoffering.courseid = gradereport.courseid AND gradereport.studentid = student.studentid
My problem is to find the ratio between the faculty members salary in regarding to the number of students they are teaching.
Say, a teacher get 10.000 in salary and teaches 5 students, then his ratio should be 1:5.
I hope that someone has an answer to my problem and understand what I'm having trouble with.
Thanks
Mestika
Some further explanation and examples on my problem and request:
Employee 1: Salary = 10.000 | # of courses he teaches: 3 | # of students (totaly) following thoes 3 courses: 15.
Then, Employee 1 earns 666,7 pr. each student. (i believe this is the ratio)
Employee 2: Salary = 30.000 | # of courses he teaches: 1 | # of students (totaly) following thoes 3 courses: 6.
Then, Employee 2 earns 5000 pr. each student.
You are completely right that my own ratio examples don’t make sense so I will try to explain further.
What I am seeking to do is, to find out how much salary each faculty member has depending on the number of students they are teaching. I imagine that it is a simple question about dividing a faculty members salary by the number of students following a course that the member is teaching.
I get an error when I am running your query, my MySQL has a problem with the convert part (it seems) but otherwise you query is correct it seems.
I haven’t tried the convert statement before, but is it (and why) necessary to convert them? If I for each faculty member that has the correct conditions, find the number of students that are attending the course. Then take that faculty members salary and divide it by the found numbers of student?
when I look at your first example it says that 300 points for 6 threads works out to 50:6 rato. Don't you mean in your later example that 10000 salary for 5 students works out to 2000:5 ratio? not 1:5 ratio?
anyway if my understanding of your example is correct then this should be a good solution
select f.name, f.salary, count(s.studentid) as noofstudents, convert(f.salary / count(s.studentid),varchar(50)) + ':' + convert(count(s.studentid),varchar(10)) as ratio
from faculty f
inner join courseoffering co on f.name = co.instructor
inner join gradereport gr on co.courseid = gr.courseid
inner join student s on gr.studentid = s.studentid
where co.semester = #semester
and co.year = #year
group by f.name, f.salary
perhaps you could expand on your question a bit if this isn't what you're looking for.