Average of specific group of values - sql

I am trying to select the average age of renters of a specific movie for demographic purposes.
My data is similar to
Movies
movie_id movie_title
1 Spider Man
2 Avengers
3 Thor
Customers
customer_id customer_dob
1 1989-03-05
2 1994-02-12
3 2001-05-01
Customer_rentals
rental_id customer_id movie_id
1 1 1
2 1 3
3 2 2
4 2 1
5 3 1
What I would like to see is
Title Avg_Age
Spider Man 25
Avengers 26
Thor 31
I have tried the following
select m.movie_title as Title, avg(all_ages.age) as avg_age
from
movies m,
(select ((0 + convert(char(8), getdate(),112) - convert(char(8),c.customer_dob,112)) / 10000) as age
from customers c, movies m, customer_rentals cr
where m.movie_id=cr.movie_id
and cr.customer_id=c.customer_id) all_ages
group by m.movie_title
Which gives me
Title Avg_Age
Spider Man 25
Avengers 25
Thor 25
It seems to be taking the average of all ages and returning it as the average for each movie, and I'm not sure why this is happening.

The problem with your query is that the subquery is not correlated to the outer query. The subquery selects from movies a second time (reusing the outer query's alias m, which is confusing) instead of relating to the row from the outer query, so every movie is paired with every age.
This can be simplified with straight joins and aggregation:
select
m.movie_title as Title,
avg((0 + convert(char(8), getdate(),112) - convert(char(8),c.customer_dob,112)) / 10000) as avg_age
from movies m
inner join customer_rentals cr on cr.movie_id = m.movie_id
inner join customers c on c.customer_id = cr.customer_id
group by m.movie_id, m.movie_title
Note that this uses standard, explicit joins (with the on keyword) rather than implicit joins (with commas in the from clause): this old syntax from decades ago should not be used in new code.
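To see the difference between the two shapes of query, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database (an assumption: the original runs on SQL Server, so the age expression is ported to strftime, and "today" is pinned to a hypothetical 2020-04-01 to make the result reproducible). The inner alias is renamed to m2 only to make the missing link to the outer m visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_title TEXT);
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_dob TEXT);
    CREATE TABLE customer_rentals (rental_id INTEGER PRIMARY KEY,
                                   customer_id INTEGER, movie_id INTEGER);
    INSERT INTO movies VALUES (1, 'Spider Man'), (2, 'Avengers'), (3, 'Thor');
    INSERT INTO customers VALUES (1, '1989-03-05'), (2, '1994-02-12'), (3, '2001-05-01');
    INSERT INTO customer_rentals VALUES (1, 1, 1), (2, 1, 3), (3, 2, 2), (4, 2, 1), (5, 3, 1);
""")

# Same yyyymmdd subtraction trick as the T-SQL version, ported to SQLite.
# The fixed "today" is an assumption, not part of the original question.
AGE = ("(CAST(strftime('%Y%m%d', :today) AS INTEGER)"
       " - CAST(strftime('%Y%m%d', c.customer_dob) AS INTEGER)) / 10000")
params = {"today": "2020-04-01"}

# Shape of the original query: nothing ties all_ages to the outer movie row,
# so each movie is averaged over the ages of all rentals.
uncorrelated = conn.execute(f"""
    SELECT m.movie_title, AVG(all_ages.age)
    FROM movies m,
         (SELECT {AGE} AS age
          FROM customers c, movies m2, customer_rentals cr
          WHERE m2.movie_id = cr.movie_id
            AND cr.customer_id = c.customer_id) all_ages
    GROUP BY m.movie_id, m.movie_title
    ORDER BY m.movie_id
""", params).fetchall()

# Shape of the fixed query: explicit joins, one group per movie.
joined = conn.execute(f"""
    SELECT m.movie_title, AVG({AGE}) AS avg_age
    FROM movies m
    INNER JOIN customer_rentals cr ON cr.movie_id = m.movie_id
    INNER JOIN customers c ON c.customer_id = cr.customer_id
    GROUP BY m.movie_id, m.movie_title
    ORDER BY m.movie_id
""", params).fetchall()

print(uncorrelated)  # the same average repeated for every title
print(joined)        # one average per title
```

With that reference date the joined query reproduces the averages the question asks for, while the uncorrelated one returns a single overall average three times.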

Related

Is there an easier way to figure the query out

I have a movie table which has the year and movie details such as title and movie id (mid), and a table m_cast which lists all the actors in each movie.
I would like to get all the actors who have never been unemployed for more than 3 years (assuming an actor is unemployed between two consecutive movies).
The code I came up with is
select a.yr1 y1 , b.yr2 y2 , a.yr1 - b.yr2 diff from
(select substr(substr(trim(year),-5),0,5) yr1 , * from movie m inner join m_cast p on m.mid = p.mid order by pid , yr1) a ,
(select substr(substr(trim(year),-5),0,5) yr2 , * from movie m inner join m_cast p on m.mid = p.mid order by pid, yr2) b on a.yr1 > b.yr2
where not exists
(select count(*) from movie m inner join m_cast p on m.mid = p.mid
and cast(substr(substr(trim(year),-5),0,5) as integer) < a.yr1 and cast(substr(substr(trim(year),-5),0,5) as integer) > b.yr2)
The self join alone takes a lot of time, and the lag and lead functions do not work in the SQLite version I am using.
I'm assuming the movie table has a column called year, and a column to identify the actor's name. Something like : year int, actorId int
The fastest way to run your query is to filter the movie table down to the last 3 years, then group by actor and count the distinct years.
Example after filtering
ActorId Year
1 2018
1 2018
1 2017
2 2016
2 2017
2 2018
Then group by and select distinct :
Select actorId from movieTable group by actorId having count (distinct (Year)) =3
That will only return the actors who have worked in each of the last 3 years. Once you have the actor ids filtered this way, join to the table that holds their names.
Sorry about the format of my writing - did it from my cellphone.
Regards,
Jorge D. Lopez
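A minimal sketch of the filter-then-count approach described in this answer, run against SQLite from Python. The table name movieTable and its columns come from the answer's own assumption; the fixed current_year of 2018 and the extra 2010 row are hypothetical, added so the filter has something to exclude. It checks for work in each year of the window, which is what the answer describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movieTable (actorId INTEGER, year INTEGER);
    INSERT INTO movieTable VALUES
        (1, 2018), (1, 2018), (1, 2017),
        (2, 2016), (2, 2017), (2, 2018),
        (2, 2010);  -- hypothetical older row, outside the 3-year window
""")

current_year = 2018  # assumption: pinned so the example is reproducible

# Keep only the last three years, then require a film in each of them.
rows = conn.execute("""
    SELECT actorId
    FROM movieTable
    WHERE year BETWEEN :y - 2 AND :y
    GROUP BY actorId
    HAVING COUNT(DISTINCT year) = 3
""", {"y": current_year}).fetchall()

print(rows)
```

Actor 1 appears in only two distinct years of the window and is dropped; actor 2 appears in all three.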

SQL Query with different aggregation for count

Publisher
pub_id title_id city
1 1 NY
1 2 NY
2 3 CA
3 4 VA
Titles
title_id price genre
1 10 Horror
2 5 Bio
3 50 Science
Question:
Create a SQL query that gives
-pub_id
-# titles
-# titles with Horror genre
I have been having a hard time writing this SQL query and can't figure out how to include both # titles and # titles with Horror genre in the same query. I would appreciate any help on this. Thank you.
Query I have tried so far (I don't know how to include titles with Horror genre):
select a.pub_id, count(a.titles)
from publisher a
left join titles b on a.title_id = b.title_id group by a.pub_id
If I use having then I won't be able to calculate the total number of titles.
Use the following query to get that result:
select
pub_id,
count(*) as [titles],
SUM(CASE WHEN genre='Horror' then 1 else 0 END) as [horror titles]
from Publisher a
inner join titles b on a.title_id=b.title_id
group by
pub_id
You can use a CASE expression inside the aggregate to do this.
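Here is a small sketch of conditional aggregation on the sample data, run against SQLite from Python (an assumption; the bracketed aliases above suggest SQL Server, so the aliases are renamed here). It uses the LEFT JOIN from the question's attempt rather than an inner join, so that publisher 3, whose title has no row in Titles, still shows up with a Horror count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (pub_id INTEGER, title_id INTEGER, city TEXT);
    CREATE TABLE titles (title_id INTEGER PRIMARY KEY, price INTEGER, genre TEXT);
    INSERT INTO publisher VALUES (1, 1, 'NY'), (1, 2, 'NY'), (2, 3, 'CA'), (3, 4, 'VA');
    INSERT INTO titles VALUES (1, 10, 'Horror'), (2, 5, 'Bio'), (3, 50, 'Science');
""")

# COUNT gives the total per publisher; SUM over a CASE counts only the
# rows that match the condition, without filtering the other rows out.
rows = conn.execute("""
    SELECT a.pub_id,
           COUNT(a.title_id) AS titles,
           SUM(CASE WHEN b.genre = 'Horror' THEN 1 ELSE 0 END) AS horror_titles
    FROM publisher a
    LEFT JOIN titles b ON a.title_id = b.title_id
    GROUP BY a.pub_id
    ORDER BY a.pub_id
""").fetchall()

for pub_id, n_titles, n_horror in rows:
    print(pub_id, n_titles, n_horror)
```

Because the condition lives in the SELECT list rather than in WHERE or HAVING, both counts come out of the same grouped rows.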
You can use this query to get the expected output:
select t.pub_id, COUNT(t.title_id) as title_id, t2.genre
from table1 t
inner join
table2 t2
on t.title_id = t2.title_id and t2.genre like 'Horror%'
group by t.pub_id, t2.genre
Note: change table1 and table2 names into your table name

How to assign zero(0) when the avg of a particular field is null in PostgreSQL

I have two tables:
Table user_ratings
id home_info_id ratings
1 1 3.5
2 2 3.5
3 1 4
4 1 5
5 1 2
6 2 1
7 2 4
Table home_info:
id home_name
1 my_home
2 ur_home
3 his_home
As you can see, 'my_home' and 'ur_home' have ratings, but 'his_home' is not rated yet. When I calculate the average rating per home, I only get 'my_home' and 'ur_home'; 'his_home' is missing from the result of the query below. I want every home listed, including those not rated yet. Here is my query:
select u.home_info_id
, avg(u.ratings)
, h.home_name
from user_ratings u
, home_info h
where h.id = u.home_info_id
group by u.home_info_id
, h.home_name;
The output is something like this:
home_info_id ratings home_name
1 4.83 my_home
2 2.83 ur_home
But I want something like this:
home_info_id ratings home_name
1 4.83 my_home
2 2.83 ur_home
3 0 his_home
You can use COALESCE with LEFT JOIN (instead of implicit INNER JOIN):
select h.id
, coalesce(avg(u.ratings), 0)
, h.home_name
from home_info h
left join user_ratings u on h.id = u.home_info_id
group by h.id
, h.home_name
When scanning the whole table or most of it, it is cheaper to aggregate before you join:
SELECT h.id, h.home_name
, COALESCE(u.avg_rating, 0) AS avg_rating
FROM home_info h
LEFT JOIN (
SELECT home_info_id AS id, avg(ratings) AS avg_rating
FROM user_ratings
GROUP BY 1
) u USING (id);
Test with EXPLAIN ANALYZE.
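Both forms can be tried on the sample data from the question. The sketch below runs them against an in-memory SQLite database from Python (an assumption; the question targets PostgreSQL, and the statements are written with plain ON joins so they work in both). It uses the user_ratings table name from the question and checks only that the two forms agree, not which one is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE home_info (id INTEGER PRIMARY KEY, home_name TEXT);
    CREATE TABLE user_ratings (id INTEGER PRIMARY KEY,
                               home_info_id INTEGER, ratings REAL);
    INSERT INTO home_info VALUES (1, 'my_home'), (2, 'ur_home'), (3, 'his_home');
    INSERT INTO user_ratings VALUES
        (1, 1, 3.5), (2, 2, 3.5), (3, 1, 4), (4, 1, 5),
        (5, 1, 2), (6, 2, 1), (7, 2, 4);
""")

# Join first, then aggregate. The LEFT JOIN keeps unrated homes, and
# COALESCE turns their NULL average into 0.
join_then_aggregate = conn.execute("""
    SELECT h.id, h.home_name, COALESCE(AVG(u.ratings), 0) AS avg_rating
    FROM home_info h
    LEFT JOIN user_ratings u ON h.id = u.home_info_id
    GROUP BY h.id, h.home_name
    ORDER BY h.id
""").fetchall()

# Aggregate first, then join the (smaller) result to home_info.
aggregate_then_join = conn.execute("""
    SELECT h.id, h.home_name, COALESCE(u.avg_rating, 0) AS avg_rating
    FROM home_info h
    LEFT JOIN (SELECT home_info_id, AVG(ratings) AS avg_rating
               FROM user_ratings
               GROUP BY home_info_id) u ON u.home_info_id = h.id
    ORDER BY h.id
""").fetchall()

for row in aggregate_then_join:
    print(row)
```

In both results 'his_home' appears with an average of 0 instead of being dropped.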

Finding group maxes in SQL join result [duplicate]

Possible Duplicate:
SQL: Select first row in each GROUP BY group?
Two SQL tables. One contestant has many entries:
Contestants          Entries
Id  Name             Id  Contestant_Id  Score
--  ----             --  -------------  -----
1   Fred             1   3              100
2   Mary             2   3              22
3   Irving           3   1              888
4   Grizelda         4   4              123
                     5   1              19
                     6   3              50
Low score wins. Need to retrieve current best scores of all contestants ordered by score:
Best Entries Report
Name Entry_Id Score
---- -------- -----
Fred 5 19
Irving 2 22
Grizelda 4 123
I can certainly get this done with many queries. My question is whether there's a way to get the result with one, efficient SQL query. I can almost see how to do it with GROUP BY, but not quite.
In case it's relevant, the environment is Rails ActiveRecord and PostgreSQL.
Here is a PostgreSQL-specific way of doing this:
SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score
Details about DISTINCT ON are here.
My SQLFiddle with example.
Update: to order the results by score:
SELECT *
FROM (SELECT DISTINCT ON (c.id) c.name, e.id, e.score
FROM Contestants c
JOIN Entries e ON c.id = e.Contestant_id
ORDER BY c.id, e.score) t
ORDER BY score
The easiest way to do this is with the ranking functions:
select name, id as entry_id, score
from (select e.*, c.name,
row_number() over (partition by e.contestant_id order by e.score) as seqnum
from entries e join
contestants c
on e.contestant_id = c.id
) ec
where seqnum = 1
I'm not familiar with PostgreSQL, but something along these lines should work:
SELECT c.*, s.Score
FROM Contestants c
JOIN (SELECT MIN(Score) Score, Contestant_Id FROM Entries GROUP BY Contestant_Id) s
ON c.Id=s.Contestant_Id
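The derived-table query above returns each contestant's best score but not the id of the winning entry. One way to recover it, sketched here as an assumption rather than as part of the original answer, is to join the per-contestant minimum back to Entries on both contestant and score. The example runs on SQLite from Python with the sample data; if a contestant had two entries tied for the lowest score, both rows would be returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contestants (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entries (id INTEGER PRIMARY KEY,
                          contestant_id INTEGER, score INTEGER);
    INSERT INTO contestants VALUES (1, 'Fred'), (2, 'Mary'), (3, 'Irving'), (4, 'Grizelda');
    INSERT INTO entries VALUES
        (1, 3, 100), (2, 3, 22), (3, 1, 888),
        (4, 4, 123), (5, 1, 19), (6, 3, 50);
""")

# s holds the lowest score per contestant; joining entries back on
# (contestant_id, score) finds the row that achieved it.
rows = conn.execute("""
    SELECT c.name, e.id AS entry_id, e.score
    FROM contestants c
    JOIN (SELECT contestant_id, MIN(score) AS score
          FROM entries
          GROUP BY contestant_id) s ON s.contestant_id = c.id
    JOIN entries e ON e.contestant_id = s.contestant_id
                  AND e.score = s.score
    ORDER BY e.score
""").fetchall()

for row in rows:
    print(row)
```

Mary has no entries, so the inner joins leave her out, which matches the report in the question.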
One solution is:
select min(e.score),c.name,c.id from entries e
inner join contestants c on e.contestant_id = c.id
group by e.contestant_id,c.name,c.id
here is example
http://sqlfiddle.com/#!3/9e307/27
This simple query should do the trick..
Select contestants.name as name, entries.id as entry_id, MIN(entries.score) as score
FROM entries
JOIN contestants ON contestants.id = entries.contestant_id
GROUP BY name
ORDER BY score
this grabs the min score for each contestant and orders them ASC

Too many sub-queries, that I am already confused and I am still missing one column - Oracle

I have the following query that has no errors:
SELECT u.user_name, u.user_lastn, outer_s.movie_id, outer_s.times_rented
FROM users u,
(
SELECT * FROM
(
SELECT user_id, movie_id, count (movie_id) as times_rented
FROM movie_queue
GROUP BY (user_id, movie_id)
ORDER BY user_id, movie_id
) inner_s
WHERE times_rented>1
) outer_s
WHERE u.user_id= outer_s.user_id;
This is what it returns:
USER_NAME USER_LASTN MOVIE_ID TIMES_RENTED
------------------------ ------------------------ ---------- ------------
John Smith 1 3
John Smith 6 2
Mary Berman 4 2
Mary Berman 6 4
Elizabeth Johnson 1 2
Peter Quigley 2 2
What I still need to do is to show the name of the movie, instead of the movie_id, but
the name of the movies are located in another table named movies that is similar to the
following sample:
MOVIE_ID MOVIE_NAME
---------- ---------------------------------------------
1 E.T. the Extra-Terrestrial
2 Jurassic Park
3 Indiana Jones and the Kingdom of the Crystal
4 War of the Worlds
5 Signs
Desired result:
What I want to see in the final table are the following columns:
USER_NAME | USER_LASTN | MOVIE_NAME | TIMES_RENTED |
Question:
But after all the many subqueries I am very confused, how can I get the movie_name there instead of the movie_id?
Attempted:
I tried getting the desired result by changing the query to
SELECT u.user_name, u.user_lastn, m.movie_name, outer_s.times_rented
FROM users u, movie m (etc.....)
But It returned 120 rows instead of the 6 I should get.
Help please!!
SELECT u.user_name, u.user_lastn, m.movie_name, COUNT(q.movie_id) AS times_rented
FROM users u
JOIN movie_queue q ON q.user_id = u.user_id
JOIN movies m ON m.movie_id = q.movie_id
GROUP BY u.user_name, u.user_lastn, m.movie_name
HAVING COUNT(q.movie_id) > 1
You just need to join the results of your query to the movies table. First, though, I'm going to rewrite the query to simplify it and use proper join syntax:
SELECT u.user_name, u.user_lastn, m.movie_name, outer_s.movie_id, outer_s.times_rented
FROM users u join
(SELECT user_id, movie_id, count (movie_id) as times_rented
FROM movie_queue
GROUP BY (user_id, movie_id)
having count (movie_id) > 1
) outer_s
on u.user_id= outer_s.user_id join
movies m
on outer_s.movie_id = m.movie_id
Or you could use a CTE to make the query more readable:
WITH outer_s as (SELECT user_id, movie_id, count (movie_id) as times_rented
FROM movie_queue
GROUP BY (user_id, movie_id)
having count (movie_id) > 1
)
SELECT u.user_name, u.user_lastn, m.movie_name, outer_s.movie_id, outer_s.times_rented
FROM users u join
outer_s
on u.user_id= outer_s.user_id join
movies m
on outer_s.movie_id = m.movie_id
Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.
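As a rough illustration of the building-block idea, the sketch below runs a CTE version of the query on SQLite from Python (an assumption; the question targets Oracle, but WITH, JOIN and HAVING behave the same way here). The rows in users and movie_queue are hypothetical sample data, since the question does not show them, and the CTE name repeat_rentals is made up for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT, user_lastn TEXT);
    CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, movie_name TEXT);
    CREATE TABLE movie_queue (user_id INTEGER, movie_id INTEGER);
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO users VALUES (1, 'John', 'Smith'), (2, 'Mary', 'Berman');
    INSERT INTO movies VALUES (1, 'E.T. the Extra-Terrestrial'), (2, 'Jurassic Park');
    INSERT INTO movie_queue VALUES (1, 1), (1, 1), (1, 1), (1, 2), (2, 2), (2, 2);
""")

# Block 1 (the CTE): rentals per user and movie, keeping only repeats.
# Block 2 (the main query): decorate those rows with user and movie names.
rows = conn.execute("""
    WITH repeat_rentals AS (
        SELECT user_id, movie_id, COUNT(movie_id) AS times_rented
        FROM movie_queue
        GROUP BY user_id, movie_id
        HAVING COUNT(movie_id) > 1
    )
    SELECT u.user_name, u.user_lastn, m.movie_name, r.times_rented
    FROM users u
    JOIN repeat_rentals r ON r.user_id = u.user_id
    JOIN movies m ON m.movie_id = r.movie_id
    ORDER BY u.user_id, m.movie_id
""").fetchall()

for row in rows:
    print(row)
```

Joining movies on movie_id is what avoids the row explosion from the attempted query: without a join condition, every rental row is paired with every movie.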