SQL Join the count to a query - sql-server-2005

I have created a database of my movies and another with my actors in each movie
Columns are:
ID
Actor
ImdbActorID
ImdbMovieID
Character
Example:
47105 | Howard McGillin | nm0569294 | tt0111333 | Adult Prince Derek
47106 | Michelle Nicastro | nm0629264 | tt0111333 | Adult Princess Odette
47108 | John Cleese | nm0000092 | tt0111333 | Jean-Bob
when my webapp queries a specific movie:
Select * from actors where ImdbMovieID='tt0111333'
I get that list. My problem is that I would like to add a column with the total number of movies I have for each actor, so I don't have to programmatically run a query for each actor.
I've thought of joining the table to itself with the count, but I don't know if that will even work. What stumps me is having that WHERE clause.

Thanks everyone, Stuart and Aaron. This is the query I ended up with:
Select ImdbActorID, Character,Actor,ImdbMovieID,cnt from Actors
Join (Select ImdbActorID as Act2, count(*) as cnt
from Actors group by ImdbActorID) as x on Actors.ImdbActorID=x.Act2
where ImdbMovieID='tt0111333'
order by cnt desc
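To see the derived-table join at work, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server 2005 here, and the two extra John Cleese rows (IDs 1 and 2, movies tt0000001 and tt0000002) are hypothetical, added so that one actor has a count above 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Actors (ID INTEGER, Actor TEXT, ImdbActorID TEXT, "
    "ImdbMovieID TEXT, Character TEXT)"
)
conn.executemany(
    "INSERT INTO Actors VALUES (?, ?, ?, ?, ?)",
    [
        (47105, "Howard McGillin", "nm0569294", "tt0111333", "Adult Prince Derek"),
        (47106, "Michelle Nicastro", "nm0629264", "tt0111333", "Adult Princess Odette"),
        (47108, "John Cleese", "nm0000092", "tt0111333", "Jean-Bob"),
        # Hypothetical rows, not from the question: a second and third movie
        (1, "John Cleese", "nm0000092", "tt0000001", "Hypothetical role A"),
        (2, "John Cleese", "nm0000092", "tt0000002", "Hypothetical role B"),
    ],
)

# The subquery counts rows per actor over the WHOLE table; the outer WHERE
# then restricts the output to a single movie without affecting the counts.
rows = conn.execute(
    """
    SELECT a.Actor, a.Character, x.cnt
    FROM Actors AS a
    JOIN (SELECT ImdbActorID AS Act2, COUNT(*) AS cnt
          FROM Actors
          GROUP BY ImdbActorID) AS x
      ON a.ImdbActorID = x.Act2
    WHERE a.ImdbMovieID = 'tt0111333'
    ORDER BY x.cnt DESC, a.Actor
    """
).fetchall()

for row in rows:
    print(row)
```

The WHERE clause sits outside the subquery, which is why the per-actor totals still cover every movie in the table.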


SQL query doesn't retrieve correct result with count

I have these tables
Actor: id | name
Acting: actor_id | movie_id
Movie: id | title
I have this code, which returns the number of movies that each actor has acted in:
SELECT a.name AS name, COUNT(ag.actor_id)
FROM actor a
LEFT JOIN acting ag ON a.id = ag.actor_id
GROUP BY a.id, a.name
ORDER BY COUNT(*) ASC
LIMIT 10;
name | count
--------------------------+-------
Bianca Brigitte VanDamme | 1
Karin Konoval | 1
Keri Maletto | 1
Terence Bernie Hines | 1
Jean Stapleton | 1
Kyle Hebert | 1
Brandon Middleton | 1
Timothy Webber | 1
Dana Hanna | 1
Travis Betz | 1
After inserting a random actor to the actors table, I run the same code but the output does not have the new actor.
INSERT INTO Actor
VALUES (5000, 'Jeremy Bearimy')
-- Run same code
SELECT a.name ........
FROM ...
I get this result:
name | count
--------------------------+-------
Bianca Brigitte VanDamme | 1
Karin Konoval | 1
Keri Maletto | 1
Terence Bernie Hines | 1
Jean Stapleton | 1
Kyle Hebert | 1
Brandon Middleton | 1
Timothy Webber | 1
Dana Hanna | 1
Travis Betz | 1
When I run the query with the code to see which actor has acted in 0 movies, I get a result, so I don't know why they don't appear in the query result.
SELECT a.name AS name, COUNT(ag.actor_id)
FROM actor a
LEFT JOIN acting ag ON a.id = ag.actor_id
GROUP BY a.id, a.name
HAVING COUNT(ag.actor_id) = 0
ORDER BY COUNT(*) ASC
LIMIT 10;
Output:
name | count
----------------+-------
Jeremy Bearimy | 0
Note your order by clause in the first query:
ORDER BY COUNT(*) ASC
Actors that are in 1 movie will have a COUNT(*) of 1. I think that is obvious.
Actors that are in 0 movies will also have a COUNT(*) of 1. Why? Because there is one row in the group even if the columns from the second table are NULL.
You are then limiting to 10 results. There is no second ORDER BY key, so the 10 returned rows are an arbitrary mix, starting with the actors that are in 0 or 1 movies.
If you instead used:
ORDER BY COUNT(ag.actor_id) ASC
then the actors with zero movies would appear before those with 1 movie.
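The difference between COUNT(*) and COUNT(ag.actor_id) after a LEFT JOIN can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for the asker's database, and the two-actor data set and movie id 10 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE acting (actor_id INTEGER, movie_id INTEGER);
    INSERT INTO actor VALUES (1, 'Jean Stapleton'), (5000, 'Jeremy Bearimy');
    -- Hypothetical movie id; only Jean Stapleton has an acting row
    INSERT INTO acting VALUES (1, 10);
    """
)

# COUNT(*) counts rows in the group, including the NULL-padded row that the
# LEFT JOIN produces for an actor with no movies. COUNT(ag.actor_id) skips NULLs.
rows = conn.execute(
    """
    SELECT a.name,
           COUNT(*)            AS star_count,
           COUNT(ag.actor_id)  AS movie_count
    FROM actor a
    LEFT JOIN acting ag ON a.id = ag.actor_id
    GROUP BY a.id, a.name
    ORDER BY COUNT(ag.actor_id) ASC
    """
).fetchall()

for row in rows:
    print(row)
```

Both actors get a star_count of 1, so ordering by COUNT(*) cannot tell them apart, while ordering by COUNT(ag.actor_id) puts the actor with no movies first.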
From the last output, it looks like the actor 'Jeremy Bearimy' has no rows in the acting table.
Your query returns the first 10 actors, and they all have 1 movie. You have at least 10 actors with 1 movie, so the ones that don't have movies are cut off by the limit.
Remove the LIMIT 10 and you will see all the actors, with or without a movie.

Novice seeking help, Max Aggregate not returning expected results

I'm still very new to MS-SQL. I have a simple table and query that is getting the best of me. I know it will be something fundamental I'm overlooking.
I've changed the field names but the idea is the same.
So the idea is that every time someone signs up they get a RegID, Name, and Team. The names are unique, so in the data below, yes, John changed teams. And that's my trouble.
Football Table
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 100 | John | Red |
| 101 | Bill | Blue |
| 102 | Tom | Green |
| 103 | John | Green |
+------------+----------+---------+
With the query at the bottom using the Max_RegID, I was expecting to get back only one record.
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 103 | John | Green |
+------------+----------+---------+
Instead I get back the rows below, which seem to include the Max_RegID, but one for each team. What am I doing wrong?
+------------+----------+---------+
| Max_RegID | Name | Team |
+------------+----------+---------+
| 100 | John | Red |
| 103 | John | Green |
+------------+----------+---------+
My Query
SELECT
Max(Football.RegID) AS Max_RegID,
Football.Name,
Football.Team
FROM
Football
GROUP BY
Football.RegID,
Football.Name,
Football.Team
EDIT* Removed the WHERE statement
The reason you're getting the results that you are is because of the way you have your GROUP BY clause structured.
When you're using any aggregate function, MAX(X), SUM(X), COUNT(X), or what have you, you're telling the SQL engine that you want the aggregate value of column X for each unique combination of the columns listed in the GROUP BY clause.
In your query as written, you're grouping by all three of the columns in the table, telling the SQL engine that each tuple is unique. Therefore the query is returning ALL of the values, and you aren't actually getting the MAX of anything at all.
What you actually want in your results is the maximum RegID for each distinct value in the Name column and also the Team that goes along with that (RegID,Name) combination.
To accomplish that you need to find the MAX(RegID) for each Name in an initial data set, and then use that list of RegIDs to add the values for Name and Team in a secondary data set.
Caveat (per comments from @HABO): This is premised on the assumption that RegID is a unique number (an IDENTITY column, value from a SEQUENCE, or something of that sort). If there are duplicate values, this will fail.
The most straightforward way to accomplish that is with a sub-query. The sub-query below gets your unique RegIDs, then joins to the original table to add the other values.
SELECT
f.RegID
,f.Name
,f.Team
FROM
Football AS f
JOIN
(--The sub-query, sq, gets the list of IDs
SELECT
MAX(f2.RegID) AS Max_RegID
FROM
Football AS f2
GROUP BY
f2.Name
) AS sq
ON
sq.Max_RegID = f.RegID;
EDIT: Sorry. I just re-read the question. To get just the single record for the MAX(RegID), just take the GROUP BY out of the sub-query, and you'll just get the current maximum value, which you can use to find the values in the rest of the columns.
SELECT
f.RegID
,f.Name
,f.Team
FROM
Football AS f
JOIN
(--The sub-query, sq, now gets the MAX ID
SELECT
MAX(f2.RegID) AS Max_RegID
FROM
Football AS f2
) AS sq
ON
sq.Max_RegID = f.RegID;
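As a quick check of the three query shapes discussed in this answer, the sketch below loads the sample rows into SQLite through Python's sqlite3 module (a stand-in for MS-SQL) and runs each one. It assumes the real column is named RegID, as in the asker's query, and that RegID is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Football (RegID INTEGER PRIMARY KEY, Name TEXT, Team TEXT);
    INSERT INTO Football VALUES
        (100, 'John', 'Red'), (101, 'Bill', 'Blue'),
        (102, 'Tom', 'Green'), (103, 'John', 'Green');
    """
)

# Grouping by every column makes each row its own group, so MAX does nothing.
grouped_by_all = conn.execute(
    """
    SELECT MAX(RegID) AS Max_RegID, Name, Team
    FROM Football
    GROUP BY RegID, Name, Team
    ORDER BY Max_RegID
    """
).fetchall()

# Latest registration per Name: aggregate first, then join back for Team.
latest_per_name = conn.execute(
    """
    SELECT f.RegID, f.Name, f.Team
    FROM Football AS f
    JOIN (SELECT MAX(f2.RegID) AS Max_RegID
          FROM Football AS f2
          GROUP BY f2.Name) AS sq
      ON sq.Max_RegID = f.RegID
    ORDER BY f.RegID
    """
).fetchall()

# Without GROUP BY the sub-query yields the single overall maximum.
overall_latest = conn.execute(
    """
    SELECT f.RegID, f.Name, f.Team
    FROM Football AS f
    JOIN (SELECT MAX(f2.RegID) AS Max_RegID FROM Football AS f2) AS sq
      ON sq.Max_RegID = f.RegID
    """
).fetchall()

print(grouped_by_all, latest_per_name, overall_latest, sep="\n")
```

The first query returns all four rows, the second returns one row per name, and the third returns only the most recent registration.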
Use row_number()
select * from
(SELECT
Football.RegID AS Max_RegID,
Football.Name,
Football.Team, row_number() over(partition by name order by Football.RegID desc) as rn
FROM
Football
WHERE
Football.Name = 'John')a
where rn=1
You can simply edit your query this way (the sub-query needs its own FROM clause):
SELECT *
FROM
Football f
WHERE
f.Name = 'John' and
Max_RegID = (SELECT Max(f2.Max_RegID) FROM Football f2 WHERE f2.Name = 'John'
)
Or, if you are on SQL Server, simply use this:
select top 1 * from Football f
where f.Name = 'John'
order by Max_RegID desc
Or, if you are on MySQL:
select * from Football f
where f.Name = 'John'
order by Max_RegID desc
Limit 1
You need a self join:
select f1.*
from Football f inner join
Football f1
on f1.name = f.name
where f.Max_RegID = 103;
After revisiting the question, the sample data suggests a subquery:
select f.*
from Football f
where name = (select top (1) f1.name
from Football f1
order by f1.Max_RegID desc
);

How can I find all columns A whose subcategories B are all related to the same column C?

I'm trying to better understand relational algebra and am having trouble solving the following type of question:
Suppose there is a column A (Department), a column B (Employees) and a column C (Managers). How can I find all of the departments who only have one manager for all of their employees? An example is provided below:
Department | Employees | Managers
-------------+-------------+----------
A | John | Bob
A | Sue | Sam
B | Jim | Don
B | Alex | Don
C | Jason | Xie
C | Greg | Xie
In this table, the result I should get is all tuples containing departments B and C, because all of their employees are managed by the same person (Don and Xie respectively). Department A, however, would not be returned because its employees have multiple managers.
Any help or pointers would be appreciated.
Such problems usually call for a self-join.
Joining the relation onto itself on Department, then filtering out the tuples where the Managers are equal would yield us all the unwanted tuples, which we can just subtract from the original relations.
Here's how I'd do it:
First we take two copies of table T, call them T1 and T2, and take their cross product. From the result we select all the rows where T1.Managers <> T2.Managers but T1.Department = T2.Department, yielding these tuples:
T1.Department | T1.Employees| T1.Managers | T2.Managers | T2.Employees | T2.Department
--------------+-------------+-------------+-------------+--------------+--------------
A | John | Bob | Sam | Sue | A
A | Sue | Sam | Bob | John | A
Departments B and C aren't present because their T1.Managers always equals T2.Managers.
Then we just subtract the departments in this result from the original set to get the answer.
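The self-join-and-subtract idea can be sketched in SQL and run through Python's sqlite3 module. The table name T and the singular column names Employee and Manager are assumptions made for this sketch, and NOT IN stands in for the relational-algebra set difference on departments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T (Department TEXT, Employee TEXT, Manager TEXT);
    INSERT INTO T VALUES
        ('A', 'John', 'Bob'), ('A', 'Sue', 'Sam'),
        ('B', 'Jim', 'Don'),  ('B', 'Alex', 'Don'),
        ('C', 'Jason', 'Xie'), ('C', 'Greg', 'Xie');
    """
)

# Inner query: self-join on Department, keep pairs with DIFFERENT managers.
# Those are the departments to exclude. Outer query: everything else.
rows = conn.execute(
    """
    SELECT Department, Employee, Manager
    FROM T
    WHERE Department NOT IN (
        SELECT T1.Department
        FROM T AS T1
        JOIN T AS T2
          ON T1.Department = T2.Department
         AND T1.Manager <> T2.Manager
    )
    ORDER BY Department, Employee
    """
).fetchall()

for row in rows:
    print(row)
```

Only department A produces a mismatched pair in the inner query, so only its tuples are removed.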
If your RDBMS supports common table expressions:
with C as (
select department, manager, count(*) as cnt
from A
group by department, manager
),
B as (
select department, count(*) as cnt
from A group by department
)
select A.*
from A
join C on A.department = C.department
join B on A.department = B.department
where B.cnt = C.cnt;

selecting rows the id's of which appear in a column of another table

I can't quite get my head around a SQL query because it is not my forté. I'm trying to select the names of rows in an employees table the id's of which appear in a column salesPersonId of another table, accounts. That is, any employee name which is represented in the accounts table.
ACCOUNT
+----+---------------+
| id | salesPersonID |
+----+---------------+
| 0 | 1020 |
+----+---------------+
| 1 | 1020 |
+----+---------------+
| 2 | 1009 |
+----+---------------+
EMPLOYEE
+------+---------------+
| id | firstName |
+------+---------------+
| 1009 | BILL | <-select his name
+------+---------------+
| 1020 | KATE | <-select her name
+------+---------------+
| 1025 | NEIL | <-not this guy
+------+---------------+
Since Neil hasn't got any presence in account.salesPersonID, I'd like to select the other two besides him. I'm not getting very far with it though, and looking for some input.
SELECT * FROM employee e
LEFT JOIN account a
ON a.salesPersonID = e.id
WHERE (SELECT COUNT(salesPersonID) FROM account) > 0
does not work. I wonder how I could select these employee names that are present in salesPersonID. Thank you.
Try this:
SELECT Distinct e.firstName
FROM employee e
JOIN account a ON a.salesPersonID = e.id
The JOIN will take care of the filtering to make sure that you are only returning the records that exist in both tables.
DISTINCT will make sure that you are only getting each firstName value one time. You can also accomplish this by grouping by employee.id or employee.firstName. Grouping by id is the better strategy if you want to return one row for each unique employee, even if several share the same first name. Grouping on firstName or using DISTINCT is for when you just want one of each unique name, even if the name is used by more than one employee.
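The effect of DISTINCT on the join can be seen on the sample tables from the question. This is a minimal sketch using Python's sqlite3 module as a stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER PRIMARY KEY, firstName TEXT);
    CREATE TABLE account (id INTEGER PRIMARY KEY, salesPersonID INTEGER);
    INSERT INTO employee VALUES (1009, 'BILL'), (1020, 'KATE'), (1025, 'NEIL');
    INSERT INTO account VALUES (0, 1020), (1, 1020), (2, 1009);
    """
)

# Plain inner join: one output row per matching account, so KATE repeats.
plain_join = conn.execute(
    """
    SELECT e.firstName
    FROM employee e
    JOIN account a ON a.salesPersonID = e.id
    ORDER BY e.firstName
    """
).fetchall()

# DISTINCT collapses the repeats; NEIL never appears because he has no account.
distinct_join = conn.execute(
    """
    SELECT DISTINCT e.firstName
    FROM employee e
    JOIN account a ON a.salesPersonID = e.id
    ORDER BY e.firstName
    """
).fetchall()

print(plain_join)
print(distinct_join)
```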
You can write the query like this:
select e.firstname from employees1 e left join account a on(e.id=a.salespersonid)
where e.id= a.salespersonid
group by e.firstname
result:
firstname
bill
kate

SELECT a subset of records from Table A that match two columns in table B

I have a list of users in a database table called Users (SQL Server 2008 R2). In addition to the user's UserName, there are two fields that classify the user - for simplicity we'll say Department and JobTitle.
| UserName | Department | JobTitle |
------------------------------------------
| Joe | IT | SysAdmin |
| Jim | IT | DBA |
| Jeff | Sales | SalesMgr |
| Mack | Sales | Rep |
I also have a table, ActiveJobs, that lists certain combinations of Department and JobTitle that I actually care about.
| Department | JobTitle |
-----------------------------
| IT | SysAdmin |
| Sales | SalesMgr |
| Sales | Rep |
I want to select each of the records from Users that matches the combination of Department / JobTitle in ActiveJobs. I thought this query would do it:
SELECT Users.*
FROM Users
INNER JOIN ActiveJobs DEP
ON Users.Department = DEP.Department
INNER JOIN ActiveJobs JOB
ON Users.JobTitle = JOB.JobTitle
But that returns the same User record more than once in many cases (which I think is caused by the duplicates in the Department column, but I don't really understand why). For the example above, I'm getting (Joe, Jeff, Jeff, Mack, Mack) even though I was hoping to just get (Joe, Jeff, Mack).
What query would get the subset of User records that has a matching combination of Department and JobTitle in Active Jobs?
Put an "AND" in your join clause instead of joining twice.
SELECT Users.*
FROM Users
INNER JOIN ActiveJobs DEP
ON Users.Department = DEP.Department AND Users.JobTitle = DEP.JobTitle
Seems like one join on two attributes would work, rather than two joins on one attribute each. Can you JOIN ON ... AND ... ? (Away from computer)
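To compare the two-join query with the single join on both columns, here is a small sketch over the sample tables, run through Python's sqlite3 module as a stand-in for SQL Server 2008 R2. It selects only UserName to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (UserName TEXT, Department TEXT, JobTitle TEXT);
    CREATE TABLE ActiveJobs (Department TEXT, JobTitle TEXT);
    INSERT INTO Users VALUES
        ('Joe', 'IT', 'SysAdmin'), ('Jim', 'IT', 'DBA'),
        ('Jeff', 'Sales', 'SalesMgr'), ('Mack', 'Sales', 'Rep');
    INSERT INTO ActiveJobs VALUES
        ('IT', 'SysAdmin'), ('Sales', 'SalesMgr'), ('Sales', 'Rep');
    """
)

# Two independent joins: each user row is multiplied by the number of
# ActiveJobs rows sharing its Department, times those sharing its JobTitle.
two_joins = conn.execute(
    """
    SELECT Users.UserName
    FROM Users
    INNER JOIN ActiveJobs DEP ON Users.Department = DEP.Department
    INNER JOIN ActiveJobs JOB ON Users.JobTitle = JOB.JobTitle
    ORDER BY Users.UserName
    """
).fetchall()

# One join on both columns: a user matches only the exact combination.
one_join = conn.execute(
    """
    SELECT Users.UserName
    FROM Users
    INNER JOIN ActiveJobs DEP
        ON Users.Department = DEP.Department
       AND Users.JobTitle = DEP.JobTitle
    ORDER BY Users.UserName
    """
).fetchall()

print(two_joins)
print(one_join)
```

With two joins, the Sales users appear twice because two ActiveJobs rows share the Sales department. The single join on the pair of columns returns each matching user once.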