Doing a FULL OUTER JOIN in SQLite3 to get the combination of two columns?

I'm currently working on a database project and one of the problems calls for the following:
The Genre table contains twenty-five entries. The MediaType table contains 5
entries. Write a single SQL query to generate a table with three columns and 125
rows. One column should contain the list of MediaType names; one column
should contain the list of Genre names; the third column should contain a count of
the number of tracks that have each combination of media type and genre. For
example, one row will be: “Rock MPEG Audio File xxx” where xxx is the
number of MPEG Rock tracks, even if the value is 0.
Recognizing this, I believe I'll need to use a FULL OUTER JOIN, which Sqlite3 doesn't support. The part that is confusing me is generating the column with the combination. Below, I've attached the two methods I've tried.
create view T as
select MediaTypeId, M.Name as MName, GenreId, G.Name as GName
from MediaType M, Genre G
SELECT DISTINCT GName, MName, COUNT(*) FROM (
SELECT *
FROM T
OUTER LEFT JOIN MediaType
ON MName = GName
UNION ALL
SELECT *
FROM Genre
OUTER LEFT JOIN T
) GROUP BY GName, MName;
However, that returned nearly 250 rows and the GROUP BY or JOIN(s) is totally wrong.
I've also tried:
SELECT Genre.Name as GenreName, MediaTypeName, COUNT(*)
FROM Genre LEFT OUTER JOIN (
SELECT MediaType.Name as MediaTypeName, Track.Name as TrackName
FROM MediaType LEFT OUTER JOIN Track) GROUP BY GenreName, MediaTypeName;
Which returned 125 rows but they all had the same count of 3503 which leads me to believe the GROUP BY is wrong.
Also, here is a schema of the database:
https://www.dropbox.com/s/onnbwqfrfc82r1t/IMG_2429.png?dl=0

You don't use full outer join to solve this problem.
Because it looks like a homework problem, I'll describe the solution.
First, you want to generate all combinations of genres and media types. Hint: This uses a cross join.
Second, you want to count all the combinations that you have. Hint: this uses an aggregation.
Third, you want to combine these together. Hint: left join.
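To make the three hints concrete, here is a minimal sketch run through Python's sqlite3 module. The tiny schema and data are made up for illustration (the column names GenreId, MediaTypeId and TrackId are assumptions based on the question, not the real database), so treat it as one possible shape of the query rather than the expected answer.

```python
import sqlite3

# Hypothetical miniature schema; names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genre     (GenreId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE MediaType (MediaTypeId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Track     (TrackId INTEGER PRIMARY KEY,
                            GenreId INTEGER, MediaTypeId INTEGER);
    INSERT INTO Genre     VALUES (1, 'Rock'), (2, 'Jazz');
    INSERT INTO MediaType VALUES (1, 'MPEG'), (2, 'AAC');
    INSERT INTO Track     VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2);
""")

rows = conn.execute("""
    SELECT g.Name, m.Name, COALESCE(c.n, 0) AS TrackCount
    FROM Genre g
    CROSS JOIN MediaType m                    -- step 1: every combination
    LEFT JOIN (                               -- step 3: attach the counts
        SELECT GenreId, MediaTypeId, COUNT(*) AS n
        FROM Track
        GROUP BY GenreId, MediaTypeId         -- step 2: count existing pairs
    ) c ON c.GenreId = g.GenreId AND c.MediaTypeId = m.MediaTypeId
    ORDER BY g.Name, m.Name
""").fetchall()

for row in rows:
    print(row)
```

With 2 genres and 2 media types this yields 4 rows, including the combinations that have no tracks (count 0). The same shape applied to 25 genres and 5 media types would give the 125 rows the assignment asks for.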

Related

SQLite Subqueries and Inner Joins

I was doing a practice question for SQL which asks to create a list of album titles and unit prices for the artist "Audioslave" and find out how many records are returned.
Here is the relational database picture given in the question:
Initially, I used an inner join to retrieve the list and actually got the correct answer (40 records returned). The code is shown below:
select a.Title, t.UnitPrice
from albums a
inner join tracks t on t.AlbumId = a.AlbumId
inner join artists ar on ar.ArtistId = a.ArtistId
where ar.Name = 'Audioslave';
Although I finished the question, I was curious to try to solve this problem using nested subqueries instead and tried to first retrieve the AlbumId and UnitPrice from tracks. I got the correct answer but not the correct list (the question asked for album title and not AlbumId). Here is the code:
select AlbumId, UnitPrice
from tracks
where AlbumId in (
select AlbumId
from albums
where ArtistId in (
select ArtistId
from artists
where Name = 'Audioslave'));
In order to solve the problem with the list, I tried combining the previous codes. However, I get a completely different amount of records being returned (10509).
select a.Title, t.UnitPrice
from albums a
inner join tracks t
where a.AlbumId in (
select AlbumId
from albums
where ArtistId in (
select ArtistId
from artists
where Name = 'Audioslave'));
I don't understand what I'm doing wrong with the last code...Any help would be appreciated! Also, sorry if I wrote too much, I just wanted to convey my thinking process clearly.
Some databases (SQLite, MySQL, Maria, maybe others) allow you to write an INNER JOIN without specifying ON, and they just cross every record on the left with every record on the right in that case. If there were 2 albums and 3 tracks, 6 rows would result. If the albums were A and B, and the tracks were 1, 2 and 3, the rows would be the combination of all: A1, A2, A3, B1, B2, B3
Other databases (Postgres, SQLServer, Oracle, maybe others) refuse to do it unless you specify ON. To get an "every row on the left combined with every row on the right" you have to write CROSS JOIN (or write an inner join with an ON that is always true)
It might help your mental model of what happens during a join to consider that the db takes all the rows on the left and connects them to all the rows on the right, then for each combination of rows, assesses the truth of the ON clause, and the WHERE clause, before deciding to return the row
For example, this will return 10509 rows:
SELECT * FROM albums INNER JOIN tracks ON 1=1
The on clause is always true
This will return 10509 rows, but only if the query is run on a Monday (strftime returns text, so it has to be compared with the string '1', not the number 1)
SELECT * FROM albums INNER JOIN tracks ON strftime('%w', 'now') = '1'
What goes in the ON or WHERE doesn't have to have anything to do with the data in the table; it just has to be something that resolves to a Boolean.
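The 2 albums and 3 tracks example can be checked directly in SQLite. This is a small sketch using Python's sqlite3 module with made-up rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY,
                         AlbumId INTEGER, UnitPrice REAL);
    INSERT INTO albums VALUES (1, 'A'), (2, 'B');
    INSERT INTO tracks VALUES (1, 1, 0.99), (2, 1, 0.99), (3, 2, 0.99);
""")

# No ON clause: SQLite pairs every album with every track (2 * 3 rows).
no_on = conn.execute(
    "SELECT a.Title, t.TrackId FROM albums a INNER JOIN tracks t"
).fetchall()

# With ON: each track is paired only with its own album (3 rows).
with_on = conn.execute(
    "SELECT a.Title, t.TrackId FROM albums a "
    "INNER JOIN tracks t ON t.AlbumId = a.AlbumId"
).fetchall()

print(len(no_on), len(with_on))  # 6 3
```

So in the query from the question, adding ON t.AlbumId = a.AlbumId to the inner join is what stops every album from being matched with every track.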

SQL Join query brings multiple results

I have 2 tables. One lists all the goals scored in the English Premier League and who scored it and the other, the squad numbers of each player in the league.
I want to do a join so that the table sums the total number of goals by player name, and then looks up the squad number of that player.
Table A [goal_scorer]
Table B [squads]
I have the SQL query below:
SELECT goal_scorer.*,sum(goal_scorer.number),squads.squad_number
FROM goal_scorer
Inner join squads on goal_scorer.name=squads.player
group by goal_scorer.name
The issue I have is that in the result, the sum of 'number' is too high and seems to include duplicate rows. For example, Aaron Lennon has scored 33 times, not 264 as shown below.
Maybe you want something like this?
SELECT goal_scorer.*, s.total, squads.squad_number
FROM goal_scorer
LEFT JOIN (
SELECT name, sum(number) as total
FROM goal_scorer
GROUP BY name
) s on s.name = goal_scorer.name
JOIN squads on goal_scorer.name=squads.player
There are other ways to do it, but here I'm using a sub-query to get the total by player. NB: Most modern SQL platforms support windowing functions to do this too.
Also, probably don't need the left on the sub-query (since we know there will always be at least one name), but I put it in case your actual use case is more complicated.
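For what it's worth, here is a small sketch of why summing after the join can inflate the total, and how summing in a sub-query first avoids it. It assumes the squads table has more than one row for the same player; the question doesn't show the data, so that is a guess at the cause, and the rows below are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goal_scorer (name TEXT, number INTEGER);
    CREATE TABLE squads (player TEXT, squad_number INTEGER);
    INSERT INTO goal_scorer VALUES ('Aaron Lennon', 1), ('Aaron Lennon', 2);
    -- Assumption: the same player appears twice in squads.
    INSERT INTO squads VALUES ('Aaron Lennon', 7), ('Aaron Lennon', 12);
""")

# Joining first multiplies each goal row by the number of matching squad rows.
inflated = conn.execute("""
    SELECT g.name, SUM(g.number)
    FROM goal_scorer g
    JOIN squads s ON g.name = s.player
    GROUP BY g.name
""").fetchall()

# Summing in a sub-query first keeps the total independent of the join.
correct = conn.execute("""
    SELECT t.name, t.total, s.squad_number
    FROM (SELECT name, SUM(number) AS total
          FROM goal_scorer GROUP BY name) t
    JOIN squads s ON t.name = s.player
    ORDER BY s.squad_number
""").fetchall()

print(inflated)  # total is doubled: 6 instead of 3
print(correct)   # total stays 3 on every row
```

If the duplicates in squads are unwanted, they would still need to be dealt with separately (for example by picking one squad row per player).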
Can you try this if you are using sql-server?
select *
from squads
outer apply(
select sum(goal_scorer.number) as score
from goal_scorer where goal_scorer.name=squads.player
)x

Selecting a grouping that matches a certain criteria, SQL

I have two relations, one is a list of the areas an instructor is able to teach (AreasOfInstructor(InstructorNo,AreaName)) and the other is the result of a subquery that returns a list of AreaNames. I want to group the AreaOfInstructor relation by InstructorNo, and then return each instructor (as represented by InstructorNo) that is able to teach all the areas returned by the subquery.
My attempt:
SELECT InstructorNo
FROM AreasofInstructor
GROUP BY InstructorNo
/**WHERE THE GROUP CONTAINS* (the list of AreaNames returned by the subquery)*/
I'm not sure what the actual SQL commands are that will implement the stuff between the stars on the last line. Thanks for the help!
Edit: Just to be clear, what I'm looking for is the set of instructors that are able to teach in the areas that are returned by the subquery.
To do this, you can join both relations, group by InstructorNo, and then validate that the distinct count of AreaNames per InstructorNo matches the distinct count of AreaNames in the AreaNames relation.
with AreaNames as (subquery)
select i.InstructorNo, count(distinct i.AreaName)
from AreasofInstructor i
join AreaNames n
on n.AreaName = i.AreaName
group by i.InstructorNo
having count(distinct i.AreaName) = (select count(distinct AreaName) from AreaNames)
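A quick way to try the count-matching idea is SQLite via Python's sqlite3 module. In this sketch the RequiredAreas table is a hypothetical stand-in for the sub-query's result, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AreasOfInstructor (InstructorNo INTEGER, AreaName TEXT);
    INSERT INTO AreasOfInstructor VALUES
        (1, 'Math'), (1, 'Physics'), (1, 'Art'),
        (2, 'Math'),
        (3, 'Physics'), (3, 'Math');
    -- Hypothetical stand-in for the sub-query's result.
    CREATE TABLE RequiredAreas (AreaName TEXT);
    INSERT INTO RequiredAreas VALUES ('Math'), ('Physics');
""")

rows = conn.execute("""
    WITH AreaNames AS (SELECT AreaName FROM RequiredAreas)
    SELECT i.InstructorNo
    FROM AreasOfInstructor i
    JOIN AreaNames n ON n.AreaName = i.AreaName
    GROUP BY i.InstructorNo
    HAVING COUNT(DISTINCT i.AreaName) =
           (SELECT COUNT(DISTINCT AreaName) FROM AreaNames)
    ORDER BY i.InstructorNo
""").fetchall()

print(rows)  # [(1,), (3,)]
```

Instructor 2 is dropped because only one of the two required areas matches. Instructor 1 still qualifies even though they also teach an extra area, since the join discards areas that are not in the required list before counting.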
It's better to use a Common Table Expression, which is more readable than a sub-query.
Check if this is what you are looking for?
WITH Areas (AreaName)
AS
(
*sub-query goes here*
)
SELECT DISTINCT
InstructorNo
FROM
AreasOfInstructor AOI
INNER JOIN
Areas A ON AOI.AreaName = A.AreaName

how to perform these queries?

I have these three tables:
create table albums(sernum number primary key,
Albname varchar2(30) not null,
Artist varchar2(20) not null,
Pdate number(4),
Recompany varchar2(10),
Media char(2) not null);
create table tracks(sernum number not null,
song varchar2(50) not null,
primary key(sernum, song),
foreign key(sernum) references albums(sernum));
create table performers(sernum number not null,
Artist varchar2(30) not null,
Instrument varchar2(50) not null,
primary key(sernum, Artist, Instrument),
foreign key (sernum) references albums(sernum));
I want to perform two queries in Oracle SQL:
list the names of the artists that used all instruments.
list the names of the albums containing the maximum number of songs.
Here are my attempts:
select distinct(a.Artist) from albums a where a.Artist like (select p.Artist, distinct(p.Instrument) from performers p) group by a.Artist, p.Instrument;
select a.Albname from albums a, inner join tracks t on where a.sernum in(select max(t.sernum) group by t.sernum);
Query 1 - get artists who have played all instruments:
SELECT
p.Artist
FROM
(
SELECT Artist, count(distinct Instrument) as InstrumentCount
FROM performers
GROUP BY artist
) p
JOIN
(
SELECT COUNT(DISTINCT Instrument) as InstrumentCount
FROM performers
) i
ON p.InstrumentCount = i.InstrumentCount
Explanation: 1st subquery gets the count of instruments played by each artist. 2nd subquery gets the count of unique instruments. The two are joined together based on this instrument count to give us only those artists whose instrument counts match the maximum.
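Here is the same query run on a few invented rows. The sketch uses SQLite through Python's sqlite3 module rather than Oracle, on the assumption that this particular query is portable between the two; the artists and instruments are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE performers (sernum INTEGER, Artist TEXT, Instrument TEXT);
    INSERT INTO performers VALUES
        (1, 'Ann', 'guitar'), (1, 'Ann', 'drums'), (2, 'Ann', 'piano'),
        (1, 'Bob', 'guitar'), (2, 'Bob', 'guitar');
""")

rows = conn.execute("""
    SELECT p.Artist
    FROM (SELECT Artist, COUNT(DISTINCT Instrument) AS InstrumentCount
          FROM performers
          GROUP BY Artist) p              -- instruments per artist
    JOIN (SELECT COUNT(DISTINCT Instrument) AS InstrumentCount
          FROM performers) i              -- instruments overall
      ON p.InstrumentCount = i.InstrumentCount
""").fetchall()

print(rows)  # [('Ann',)]
```

Ann played all three distinct instruments, while Bob only played one (on two albums), so only Ann is returned.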
--
Query 2 - Get albums containing the maximum number of songs:
WITH
AlbumTrackCount AS
(
SELECT
sernum,
COUNT(1) as TrackCount
FROM tracks
GROUP BY sernum
)
SELECT
a.Albname
FROM albums a
JOIN AlbumTrackCount atc
ON a.sernum = atc.sernum
AND atc.TrackCount =
(
SELECT MAX(TrackCount)
FROM AlbumTrackCount
)
Explanation: the WITH up top establishes a subquery we'll reuse; it gets us the track count within each album. Down below, we join the albums with this album track count, and add a filter that keeps only those albums whose track count equals the maximum track count of any album. Note that this is different from the top query, which just counted every instrument ever used; here, it is important to first count up the tracks within each album, and then get the maximum of those counts.
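A runnable sketch of this approach on toy data, again using SQLite through Python's sqlite3 module instead of Oracle (an assumption made for convenience; the album names and songs are invented). It puts the maximum filter in a WHERE clause, which for an inner join behaves the same as putting it in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (sernum INTEGER PRIMARY KEY, Albname TEXT);
    CREATE TABLE tracks (sernum INTEGER, song TEXT);
    INSERT INTO albums VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
    INSERT INTO tracks VALUES
        (1, 'a'), (1, 'b'),
        (2, 'c'),
        (3, 'd'), (3, 'e');
""")

rows = conn.execute("""
    WITH AlbumTrackCount AS (
        SELECT sernum, COUNT(*) AS TrackCount   -- tracks per album
        FROM tracks
        GROUP BY sernum
    )
    SELECT a.Albname
    FROM albums a
    JOIN AlbumTrackCount atc ON a.sernum = atc.sernum
    WHERE atc.TrackCount = (SELECT MAX(TrackCount) FROM AlbumTrackCount)
    ORDER BY a.Albname
""").fetchall()

print(rows)  # [('First',), ('Third',)]
```

Both albums with two tracks come back, which shows that ties for the maximum are all kept.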
Below are some of the issues with your queries:
SELECT DISTINCT (a.artist)
FROM albums a
WHERE a.artist LIKE (SELECT p.artist,
distinct(p.Instrument)
from performers p)
group by a.Artist, p.Instrument;
LIKE indicates that you're going to use a wildcard. When comparing against a sub-query in the WHERE clause, you typically use IN as the operator.
DISTINCT is not a function. It always applies to all of the columns in a SELECT statement.
DISTINCT and GROUP BY serve very similar purposes. You would rarely use both in the same statement.
You can't reference a column from a sub-query in the WHERE clause (here, p.Instrument) in the outer query.
SELECT a.albname
FROM albums a,
inner join tracks t
on
where a.sernum in(select max(t.sernum) group by t.sernum);
You're using both a comma and INNER JOIN to connect two tables. The comma indicates pre-SQL:1999 syntax, whereas INNER JOIN is SQL:1999. While you can technically use both in a single FROM clause, you can't use both between a single pair of tables. Also, you shouldn't mix them. Stick to SQL:1999.
Your ON clause is empty. You should probably be joining your two tables here. If you really want to not have a join condition, change the join to CROSS JOIN (to re-iterate: you almost certainly don't actually want this).
You have a SELECT statement without a FROM clause. That is not allowed.

Select average rating from another datatable

I have 3 data tables.
In the entries data table I have entries with ID (entryId as primary key).
I have another table called EntryUsersRatings in there are multiple entries that have entryId field and a rating value (from 1 to 5).
(ratings are stored multiple times for one entryId).
Columns: ratingId (primary key), entryId, rating (integer value).
In the third data table I have translations of entries in the first table (with entryId, languageId and title - translation).
What I would like to do is select all entries from first data table with their titles (by language ID).
On a top of that I want average rating of each entry (which can be stored multiple times) that is stored in EntryUsersRatings.
I have tried this:
SELECT entries.entryId, EntryTranslations.title, AVG(EntryUsersRatings.rating) AS AverageRating
FROM entries
LEFT OUTER JOIN
EntryTranslations ON entries.entryId = EntryTranslations.entryId AND EntryTranslations.languageId = 1
LEFT OUTER JOIN
EntryUsersRatings ON entries.entryId = EntryUsersRatings.entryId
WHERE entries.isDraft=0
GROUP BY title, entries.entryId
isDraft is just something that means that entries are not stored with all information needed (just incomplete data - irrelevant for our case here).
Any help would be greatly appreciated.
EDIT: my solution gives me null values for rating.
Edit1: this query is working perfectly OK, I was looking into wrong database.
We also came to another solution, which gives us the same result (I hope someone will find this useful):
SELECT entries.entryId, COALESCE(x.entryRating, 0) as averageRating
FROM entries
LEFT JOIN(
SELECT rr.entryId, AVG(rating) AS entryRating
FROM EntryUsersRatings rr
GROUP BY rr.entryId) x ON x.entryId = entries.entryId
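As a sanity check, here is a minimal sketch of this pre-aggregated LEFT JOIN using SQLite via Python's sqlite3 module. The original database engine isn't stated, so running it in SQLite is an assumption, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (entryId INTEGER PRIMARY KEY, isDraft INTEGER);
    CREATE TABLE EntryUsersRatings (
        ratingId INTEGER PRIMARY KEY, entryId INTEGER, rating INTEGER);
    INSERT INTO entries VALUES (1, 0), (2, 0);
    -- Entry 1 has two ratings; entry 2 has none.
    INSERT INTO EntryUsersRatings VALUES (1, 1, 4), (2, 1, 5);
""")

rows = conn.execute("""
    SELECT e.entryId, COALESCE(x.entryRating, 0) AS averageRating
    FROM entries e
    LEFT JOIN (
        SELECT entryId, AVG(rating) AS entryRating
        FROM EntryUsersRatings
        GROUP BY entryId
    ) x ON x.entryId = e.entryId
    ORDER BY e.entryId
""").fetchall()

print(rows)  # [(1, 4.5), (2, 0)]
```

The entry without ratings is kept by the LEFT JOIN, and COALESCE turns its NULL average into 0. Whether 0 is the right stand-in for "no ratings yet" depends on how the value is displayed.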
@CyberHawk: since you are using a left outer join from entries, the result contains every record from the left table plus the records from the right table that match your join condition. For records with no match, the right-hand columns are NULL.
Check out the following link for the details:
http://msdn.microsoft.com/en-us/library/ms187518(v=sql.105).aspx