Compare Two Relations in SQL


I just started studying SQL and this is a demo given by the teacher in an online course and it works fine. The statement is looking for "students such that number of other students with same GPA is equal to number of other students with same sizeHS":
select *
from Student S1
where (
select count(*)
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) = (
select count(*)
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
);
It seems that in this where clause we are comparing two relations (because the result of a subquery is a relation), whereas most of the time we compare attributes (as far as I've seen).
So I'm wondering whether there are requirements on how many attributes and how many tuples a relation may contain when two relations are compared. If not, how are two relations compared when there are multiple attributes or multiple tuples, and what is the result?
Note:
Student relation has 4 attributes: sID, sName, GPA, sizeHS. And here's the data:
+-----+--------+-----+--------+
| sID | sName | GPA | sizeHS |
+-----+--------+-----+--------+
| 123 | Amy | 3.9 | 1000 |
| 234 | Bob | 3.6 | 1500 |
| 345 | Craig | 3.5 | 500 |
| 456 | Doris | 3.9 | 1000 |
| 567 | Edward | 2.9 | 2000 |
| 678 | Fay | 3.8 | 200 |
| 789 | Gary | 3.4 | 800 |
| 987 | Helen | 3.7 | 800 |
| 876 | Irene | 3.9 | 400 |
| 765 | Jay | 2.9 | 1500 |
| 654 | Amy | 3.9 | 1000 |
| 543 | Craig | 3.4 | 2000 |
+-----+--------+-----+--------+
and the result of this query is:
+-----+--------+-----+---------+
| sID | sName | GPA | sizeHS |
+-----+--------+-----+---------+
| 345 | Craig | 3.5 | 500 |
| 567 | Edward | 2.9 | 2000 |
| 678 | Fay | 3.8 | 200 |
| 789 | Gary | 3.4 | 800 |
| 765 | Jay | 2.9 | 1500 |
| 543 | Craig | 3.4 | 2000 |
+-----+--------+-----+---------+

"because the result of a subquery is a relation"
Relation is the scientific name for what we call a table in a database and I like the name "table" much better than "relation". A table is easy to imagine. We know them from our school time schedule for instance. Yes, we relate things here inside a table (day and time and the subject taught in school), but we can also relate tables to tables (pupils' timetables with the table of class rooms, the overall subject schedule, and the teacher's timetables). As such, tables in an RDBMS are also related to each other (hence the name relational database management system). I find the name relation for a table quite confusing (and many people use the word "relation" to describe the relations between tables instead).
So, yes, a query result itself is again a table ("relation"). And from tables we can of course select:
select * from (select * from b) as subq;
And then there are scalar queries that return exactly one row and one column. select count(*) from b is such a query. While this is still a table we can select from
select * from (select count(*) as cnt from b) as subq;
we can even use them where we usually have single values, e.g. in the select clause:
select a.*, (select count(*) from b) as cnt from a;
In your query you have two scalar subqueries in your where clause.
With subqueries there is another distinction to make: we have correlated and non-correlated subqueries. The last query I have just shown contains a non-correlated subquery. It selects the same count of b rows for every single result row, no matter what else that row contains. A correlated subquery on the other hand may look like this:
select a.*, (select count(*) from b where b.x = a.y) as cnt from a;
Here, the subquery is related to the main table. For every result row we look up the count of b rows matching the a row we are displaying via where b.x = a.y, so the count is different from row to row (but we'd get the same count for a rows sharing the same y value).
Your subqueries are also correlated. As with the select clause, the where clause deals with one row at a time (in order to keep or dismiss it). So we look at one student S1 at a time. For this student we count other students (S2, where S2.sID <> S1.sID) who have the same GPA (and S2.GPA = S1.GPA) and count other students who have the same sizeHS. We only keep students (S1) where there are exactly as many other students with the same GPA as there are with the same sizeHS.
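To make the row-by-row evaluation visible, here is a minimal sketch that loads the question's Student data and moves the two correlated scalar subqueries into the select list. It assumes SQLite driven from Python's sqlite3 module (not the asker's DBMS); the column aliases same_gpa and same_size are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sID INTEGER PRIMARY KEY, sName TEXT, GPA REAL, sizeHS INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (123, "Amy", 3.9, 1000), (234, "Bob", 3.6, 1500), (345, "Craig", 3.5, 500),
        (456, "Doris", 3.9, 1000), (567, "Edward", 2.9, 2000), (678, "Fay", 3.8, 200),
        (789, "Gary", 3.4, 800), (987, "Helen", 3.7, 800), (876, "Irene", 3.9, 400),
        (765, "Jay", 2.9, 1500), (654, "Amy", 3.9, 1000), (543, "Craig", 3.4, 2000),
    ],
)

# Same two correlated scalar subqueries as in the question, moved into the
# select list so the single value each one yields per S1 row becomes visible.
counts = conn.execute(
    """
    SELECT S1.sID,
           (SELECT count(*) FROM Student S2
             WHERE S2.sID <> S1.sID AND S2.GPA = S1.GPA) AS same_gpa,
           (SELECT count(*) FROM Student S2
             WHERE S2.sID <> S1.sID AND S2.sizeHS = S1.sizeHS) AS same_size
    FROM Student S1
    ORDER BY S1.sID
    """
).fetchall()

for row in counts:
    print(row)

# The original WHERE clause keeps exactly the rows where both values match.
kept = [sid for sid, same_gpa, same_size in counts if same_gpa == same_size]
print(kept)
```

Each subquery contributes one number per student, so the comparison in the where clause is between two plain values, not between two multi-row tables.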
UPDATE
As to comparing several values at once (tuples), as in
select *
from Student S1
where (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) = (
select count(*), avg(grade)
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
);
this is possible in some DBMS, but not in SQL Server. SQL Server doesn't know tuples.
But there are other means to achieve the same. You could just add two subqueries:
select * from student s1
where (...) = (...) -- compare counts here
and (...) = (...) -- compare averages here
Or get the data in the FROM clause and then deal with it. E.g.:
select *
from Student S1
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.GPA = S1.GPA
) sx
cross apply
(
select count(*) as cnt, avg(grade) as avg_grade
from Student S2
where S2.sID <> S1.sID and S2.sizeHS = S1.sizeHS
) sy
where sx.cnt = sy.cnt and sx.avg_grade = sy.avg_grade;
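For a DBMS that does accept tuples (row values), a comparison over several attributes can be written directly. The following is only a sketch under assumptions: it uses SQLite (row values need version 3.15 or later) through Python's sqlite3 module, a subset of the question's Student rows, and a different question than the one above, namely "which students share both GPA and sizeHS with some other student".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sID INTEGER PRIMARY KEY, GPA REAL, sizeHS INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        # subset of the question's data, sName omitted for brevity
        (123, 3.9, 1000), (456, 3.9, 1000), (654, 3.9, 1000),
        (876, 3.9, 400), (234, 3.6, 1500), (765, 2.9, 1500),
    ],
)

# (GPA, sizeHS) is a row value: both attributes must match the same S2 row.
# The subquery may return many rows, so IN is used instead of =.
shared = [
    sid
    for (sid,) in conn.execute(
        """
        SELECT S1.sID
        FROM Student S1
        WHERE (S1.GPA, S1.sizeHS) IN (
            SELECT S2.GPA, S2.sizeHS
            FROM Student S2
            WHERE S2.sID <> S1.sID
        )
        ORDER BY S1.sID
        """
    )
]
print(shared)
```

With = instead of IN, the right-hand side would have to yield at most one row; IN (or EXISTS) is the usual choice when several rows may come back.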

There are relational operations:
The intersection operator produces the set of tuples that two
relations share in common. Intersection is implemented in SQL in the
form of the INTERSECT operator.
The difference operator acts on two relations and produces the set of tuples from the first relation that do not exist in the second relation. Difference is implemented in SQL in the form of the EXCEPT or MINUS operator.
So, in the context of SQL Server, for example, you can do:
SELECT *
FROM R1
EXCEPT
SELECT *
FROM R2
to get rows in R1 not included in R2 and the reverse - to get all differences.
Of course, both sides must have the same attributes (same number of columns with compatible types). If they don't, you need to list the attributes explicitly in each SELECT.
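As a minimal sketch of that idea, the snippet below runs EXCEPT in both directions plus INTERSECT. It assumes SQLite via Python's sqlite3 module rather than SQL Server, and the tables R1 and R2 with columns a and b hold made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R1 (a INTEGER, b TEXT)")
conn.execute("CREATE TABLE R2 (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO R1 VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
conn.executemany("INSERT INTO R2 VALUES (?, ?)", [(2, "y"), (3, "q"), (4, "w")])

# Rows of R1 that have no identical row in R2 (all columns are compared).
only_in_r1 = conn.execute(
    "SELECT a, b FROM R1 EXCEPT SELECT a, b FROM R2 ORDER BY a, b"
).fetchall()

# The reverse direction.
only_in_r2 = conn.execute(
    "SELECT a, b FROM R2 EXCEPT SELECT a, b FROM R1 ORDER BY a, b"
).fetchall()

# Rows present in both relations.
in_both = conn.execute(
    "SELECT a, b FROM R1 INTERSECT SELECT a, b FROM R2 ORDER BY a, b"
).fetchall()

print(only_in_r1, only_in_r2, in_both)

# As sets of rows, the two relations are equal exactly when both differences are empty.
relations_equal = not only_in_r1 and not only_in_r2
print(relations_equal)
```

Note that EXCEPT and INTERSECT work on distinct rows, so this check ignores how often a duplicate row occurs.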

Related

For Table A, Return All Values in Column X in Table B

I have two tables, one is a table of employee names, 176 records. The other is a table (with duplicates) of employee names (same format) and their locations (7943 rows).
From this answer I deduced I needed a left join to give me the rows from Table A only.
I was hoping the below would give me the original 176 rows back from Table A, each with a value for location from Table B, or blank if not available. However, it gives me 7601 rows, which I cannot for the life of me understand:
SELECT e.[UniqueName], l.[location]
FROM [Employees] as e
left join Locations as l
on e.[UniqueName] = l.[UniqueName]
Even using a group by (which I'm not sure why this would be necessary given that I am asking only for whats in Table A) gives 172 rows even though each name in the Employees table is unique!

The table Locations contains more than one location for some employees, and this is why you get so many rows in the results.
If you want just one location per employee and it does not matter which one, then add aggregation to your query:
SELECT e.[UniqueName], MAX(l.[location]) AS location
FROM [Employees] as e
LEFT JOIN Locations as l
ON e.[UniqueName] = l.[UniqueName]
GROUP BY e.[UniqueName]
You can use MIN() instead of MAX().

"I was hoping the below would give me the original 176 rows back from
Table A, each column with a value for location from Table B, else
blank if not available, however it gives me 7601 rows which I cannot
for the life of me understand..."
Whilst a left join will always return all records from the dataset on the lefthand side of the join, the number of records returned by the query will depend upon the number of possible pairings between the two datasets, which (for a left join) will always be greater than or equal to the number of records in the dataset to the left of the join.
For your example, consider the following two datasets:
Employees
+------------+
| UniqueName |
+------------+
| Alice |
| Bob |
| Charlie |
+------------+
Locations
+------------+----------+
| UniqueName | Location |
+------------+----------+
| Alice | London |
| Bob | Berlin |
| Bob | New York |
| Bob | Paris |
+------------+----------+
Evaluating the query:
select
e.[uniquename], l.[location]
from
[employees] as e left join locations as l
on e.[uniquename] = l.[uniquename]
will pair each employee with every matching row in Locations (Alice with one row, Bob with three, Charlie with none), and will therefore return the result:
+------------+----------+
| uniquename | location |
+------------+----------+
| Alice | London |
| Bob | Berlin |
| Bob | New York |
| Bob | Paris |
| Charlie | |
+------------+----------+
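The pairing can be reproduced with the three-employee example above. This is a small sketch assuming SQLite through Python's sqlite3 module (the question itself uses SQL Server bracket syntax, which is left out here); it also shows the aggregated variant that returns one row per employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (UniqueName TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE Locations (UniqueName TEXT, Location TEXT)")
conn.executemany("INSERT INTO Employees VALUES (?)", [("Alice",), ("Bob",), ("Charlie",)])
conn.executemany(
    "INSERT INTO Locations VALUES (?, ?)",
    [("Alice", "London"), ("Bob", "Berlin"), ("Bob", "New York"), ("Bob", "Paris")],
)

# Plain left join: one output row per (employee, matching location) pair,
# plus one row with NULL for employees without any location.
joined = conn.execute(
    """
    SELECT e.UniqueName, l.Location
    FROM Employees AS e
    LEFT JOIN Locations AS l ON e.UniqueName = l.UniqueName
    ORDER BY e.UniqueName, l.Location
    """
).fetchall()
print(len(joined), joined)

# Aggregating collapses the pairs back to one row per employee.
one_each = conn.execute(
    """
    SELECT e.UniqueName, MAX(l.Location) AS Location
    FROM Employees AS e
    LEFT JOIN Locations AS l ON e.UniqueName = l.UniqueName
    GROUP BY e.UniqueName
    ORDER BY e.UniqueName
    """
).fetchall()
print(len(one_each), one_each)
```

Three employees produce five joined rows because Bob matches three locations; the grouped query returns three rows again.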

You can use a correlated subquery:
SELECT e.[UniqueName],
(SELECT TOP 1 l.[location]
FROM locations as l
WHERE e.[UniqueName] = l.[UniqueName]
) as location
FROM [Employees] as e;
Note: There is no ORDER BY so this returns an arbitrary location.
Without TOP 1, the scalar subquery would raise an error as soon as a UniqueName has more than one location. To get a deterministic value instead of an arbitrary one, you can use an aggregation function:
SELECT e.[UniqueName],
(SELECT MAX(l.[location])
FROM locations as l
WHERE e.[UniqueName] = l.[UniqueName]
) as location
FROM [Employees] as e;

How to combine in one sql query in extra column the result of 2 group by queries?

Considering the following mdl_course_completions table that describes a course completion for a user:
id,bigint
userid,bigint
course,bigint
timeenrolled,bigint
timestarted,bigint
timecompleted,bigint
reaggregate,bigint
To determine whether a student has finished a course, I use a predicate on the timecompleted field.
When this field is null, the student has not finished the course; when it is not null, the student has finished the course.
Thus, the number of students that finished each course is given by:
SELECT mdl_course.fullname,count(*) as "number of students that finish courses"
FROM mdl_course_completions
INNER JOIN mdl_course on mdl_course.id = mdl_course_completions.course
WHERE timecompleted IS NOT NULL
GROUP BY mdl_course.fullname
;
the result is:
| course name | number of students that finish courses |
|-------------|----------------------------------------|
| course 1 | 50 |
| course 2 | 200 |
| course 3 | 120 |
And the number of students that DIDN'T finish each course is given by:
SELECT mdl_course.fullname,count(*) as "number of students that didn't finish courses"
FROM mdl_course_completions
INNER JOIN mdl_course on mdl_course.id = mdl_course_completions.course
WHERE timecompleted IS NULL
GROUP BY mdl_course.fullname
;
the result is:
| course name | number of students that didn't finish courses |
|-------------|-----------------------------------------------|
| course 1 | 12 |
| course 2 | 12 |
| course 3 | 120 |
I wonder how I can combine these 2 queries into one query that returns the second count in an extra column, such as:
| course name | number of students that finish courses | number of students that didn't finish courses |
|-------------|------------------------------------|-------------------------------------------|
| course 1 | 50 | 12 |
| course 2 | 200 | 12 |
| course 3 | 120 | 120 |
I am using PostgreSQL, although in my opinion this question is not specific to a particular database system. I just don't know how to combine these 2 GROUP BY queries into one with an extra column.

Use conditional aggregation.
SELECT mdl_course.fullname
,SUM((timecompleted IS NOT NULL)::int) as "number of students that finish courses"
,SUM((timecompleted IS NULL)::int) as "number of students that didn't finish courses"
FROM mdl_course_completions
INNER JOIN mdl_course on mdl_course.id = mdl_course_completions.course
GROUP BY mdl_course.fullname

From PostgreSQL 9.4 on, you can use the FILTER clause with aggregate functions:
count(*) FILTER (WHERE timecompleted IS NOT NULL)
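Both answers above use PostgreSQL syntax. A portable form of the same conditional aggregation uses CASE inside SUM. The sketch below assumes SQLite via Python's sqlite3 module, a reduced version of the two tables, and made-up rows; the aliases finished and not_finished are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mdl_course (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute(
    "CREATE TABLE mdl_course_completions "
    "(id INTEGER PRIMARY KEY, userid INTEGER, course INTEGER, timecompleted INTEGER)"
)
conn.executemany("INSERT INTO mdl_course VALUES (?, ?)", [(1, "course 1"), (2, "course 2")])
conn.executemany(
    "INSERT INTO mdl_course_completions (userid, course, timecompleted) VALUES (?, ?, ?)",
    [
        (1, 1, 1700000000),  # finished
        (2, 1, 1700000500),  # finished
        (3, 1, None),        # not finished
        (1, 2, None),        # not finished
        (2, 2, None),        # not finished
    ],
)

# No WHERE filter: every completion row is read once and counted in
# exactly one of the two columns, depending on timecompleted.
rows = conn.execute(
    """
    SELECT c.fullname,
           SUM(CASE WHEN cc.timecompleted IS NOT NULL THEN 1 ELSE 0 END) AS finished,
           SUM(CASE WHEN cc.timecompleted IS NULL THEN 1 ELSE 0 END) AS not_finished
    FROM mdl_course_completions AS cc
    INNER JOIN mdl_course AS c ON c.id = cc.course
    GROUP BY c.fullname
    ORDER BY c.fullname
    """
).fetchall()
print(rows)
```

Because the filter moved from WHERE into the aggregates, a course where nobody finished still appears, with 0 in the finished column.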

Find spectators that have seen the same shows (match multiple rows for each)

For an assignment I have to write several SQL queries for a database stored in a PostgreSQL server running PostgreSQL 9.3.0. However, I find myself blocked on the last query. The database models a reservation system for an opera house. The query is about associating each spectator with the other spectators that attend the same events every time.
The model looks like this:
Reservations table
id_res | create_date | tickets_presented | id_show | id_spectator | price | category
-------+---------------------+---------------------+---------+--------------+-------+----------
1 | 2015-08-05 17:45:03 | | 1 | 1 | 195 | 1
2 | 2014-03-15 14:51:08 | 2014-11-30 14:17:00 | 11 | 1 | 150 | 2
Spectators table
id_spectator | last_name | first_name | email | create_time | age
---------------+------------+------------+----------------------------------------+---------------------+-----
1 | gonzalez | colin | colin.gonzalez@gmail.com | 2014-03-15 14:21:30 | 22
2 | bequet | camille | bequet.camille@gmail.com | 2014-12-10 15:22:31 | 22
Shows table
id_show | name | kind | presentation_date | start_time | end_time | id_season | capacity_cat1 | capacity_cat2 | capacity_cat3 | price_cat1 | price_cat2 | price_cat3
---------+------------------------+--------+-------------------+------------+----------+-----------+---------------+---------------+---------------+------------+------------+------------
1 | madama butterfly | opera | 2015-09-05 | 19:30:00 | 21:30:00 | 2 | 315 | 630 | 945 | 195 | 150 | 100
2 | don giovanni | opera | 2015-09-12 | 19:30:00 | 21:45:00 | 2 | 315 | 630 | 945 | 195 | 150 | 100
So far I've started by writing a query to get the id of each spectator and the date of the show they are attending. The query looks like this:
SELECT Reservations.id_spectator, Shows.presentation_date
FROM Reservations
LEFT JOIN Shows ON Reservations.id_show = Shows.id_show;
Could someone help me understand the problem better and point me towards a solution? Thanks in advance.
The result I'm expecting should be something like this:
id_spectator | other_id_spectators
-------------+--------------------
1| 2,3
Meaning that every time spectator with id 1 went to a show, spectators 2 and 3 did too.

Note based on comments: Wanted to make clear that this answer may be of limited use as it was answered in the context of SQL-Server (tag was present at the time)
There is probably a better way to do it, but you could do it with the STUFF function. The only drawback is that, since your ids are ints, placing a comma between values requires a workaround (each id has to be cast to a string). Below is the method I can think of using that workaround.
SELECT [id_spectator], [id_show]
, STUFF((SELECT ',' + CAST(A.[id_spectator] as NVARCHAR(10))
FROM reservations A
Where A.[id_show]=B.[id_show] AND a.[id_spectator] != b.[id_spectator] FOR XML PATH('')),1,1,'') As [other_id_spectators]
From reservations B
Group By [id_spectator], [id_show]
This will show you all other spectators that attended the same shows.

"Meaning that every time spectator with id 1 went to a show, spectators 2 and 3 did too."
In other words, you want a list of ...
all spectators that have seen all the shows that a given spectator has seen (and possibly more than the given one)
This is a special case of relational division. We have assembled an arsenal of basic techniques here:
How to filter SQL results in a has-many-through relation
It is special because the list of shows each spectator has to have attended is dynamically determined by the given prime spectator.
This assumes that (id_spectator, id_show) is unique in reservations, which has not been clarified.
A UNIQUE constraint on those two columns (in that order) also provides the most important index.
For best performance in queries 2 and 3 below, also create an index with leading id_show.
1. Brute force
The primitive approach would be to form a sorted array of the shows the given user has seen and compare it with the corresponding array of every other spectator:
SELECT 1 AS id_spectator, array_agg(sub.id_spectator) AS id_other_spectators
FROM (
SELECT id_spectator
FROM reservations r
WHERE id_spectator <> 1
GROUP BY 1
HAVING array_agg(id_show ORDER BY id_show)
@> (SELECT array_agg(id_show ORDER BY id_show)
FROM reservations
WHERE id_spectator = 1)
) sub;
But this is potentially very expensive for big tables. The whole table has to be processed, and in a rather expensive way, too.
2. Smarter
Use a CTE to determine relevant shows, then only consider those
WITH shows AS ( -- all shows of id 1; 1 row per show
SELECT id_spectator, id_show
FROM reservations
WHERE id_spectator = 1 -- your prime spectator here
)
SELECT sub.id_spectator, array_agg(sub.other) AS id_other_spectators
FROM (
SELECT s.id_spectator, r.id_spectator AS other
FROM shows s
JOIN reservations r USING (id_show)
WHERE r.id_spectator <> s.id_spectator
GROUP BY 1,2
HAVING count(*) = (SELECT count(*) FROM shows)
) sub
GROUP BY 1;
In query 1, @> is the "contains" operator for arrays, so we get all spectators that have seen at least the same shows.
Query 2 is faster than query 1 because only relevant shows are considered.
3. Real smart
To also eliminate spectators that are not going to qualify early in the query, use a recursive CTE:
WITH RECURSIVE shows AS ( -- produces exactly 1 row
SELECT id_spectator, array_agg(id_show) AS shows, count(*) AS ct
FROM reservations
WHERE id_spectator = 1 -- your prime spectator here
GROUP BY 1
)
, cte AS (
SELECT r.id_spectator, 1 AS idx
FROM shows s
JOIN reservations r ON r.id_show = s.shows[1]
WHERE r.id_spectator <> s.id_spectator
UNION ALL
SELECT r.id_spectator, idx + 1
FROM cte c
JOIN reservations r USING (id_spectator)
JOIN shows s ON s.shows[c.idx + 1] = r.id_show
)
SELECT s.id_spectator, array_agg(c.id_spectator) AS id_other_spectators
FROM shows s
JOIN cte c ON c.idx = s.ct -- has an entry for every show
GROUP BY 1;
Note that the first CTE is non-recursive. Only the second part is recursive (iterative really).
This should be fastest for small selections from big tables. Rows that don't qualify are excluded early. The two indices I mentioned are essential.
SQL Fiddle demonstrating all three.
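The counting idea behind query 2 does not depend on PostgreSQL arrays. As a rough sketch, assuming SQLite via Python's sqlite3 module, a reservations table reduced to two columns, and made-up rows, it can be checked like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations (id_spectator INTEGER, id_show INTEGER, "
    "UNIQUE (id_spectator, id_show))"  # the uniqueness assumption stated above
)
conn.executemany(
    "INSERT INTO reservations VALUES (?, ?)",
    [
        (1, 1), (1, 11),          # prime spectator: shows 1 and 11
        (2, 1), (2, 11), (2, 5),  # saw both, plus one more: qualifies
        (3, 1), (3, 11),          # saw exactly the same shows: qualifies
        (4, 1),                   # missed show 11: does not qualify
    ],
)

prime = 1  # hypothetical prime spectator id
others = [
    sid
    for (sid,) in conn.execute(
        """
        WITH shows AS (          -- all shows of the prime spectator
            SELECT id_show FROM reservations WHERE id_spectator = :prime
        )
        SELECT r.id_spectator
        FROM shows AS s
        JOIN reservations AS r USING (id_show)
        WHERE r.id_spectator <> :prime
        GROUP BY r.id_spectator
        HAVING count(*) = (SELECT count(*) FROM shows)  -- matched every show
        """,
        {"prime": prime},
    )
]
others.sort()
print(prime, others)
```

A spectator qualifies only when the number of matched shows equals the prime spectator's total, which is why the uniqueness of (id_spectator, id_show) matters: duplicate reservations would inflate the count.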

It sounds like you have one half of the total question--determining which id_shows a particular id_spectator attended.
What you want to ask yourself is how you can determine which id_spectators attended an id_show, given an id_show. Once you have that, combine the two answers to get the full result.

So the final answer I got, looks like this :
SELECT id_spectator, id_show,(
SELECT string_agg(to_char(A.id_spectator, '999'), ',')
FROM Reservations A
WHERE A.id_show=B.id_show
) AS other_id_spectators
FROM Reservations B
GROUP By id_spectator, id_show
ORDER BY id_spectator ASC;
Which prints something like this:
id_spectator | id_show | other_id_spectators
-------------+---------+---------------------
1 | 1 | 1, 2, 9
1 | 14 | 1, 2
Which suits my needs, however if you have any improvements to offer, please share :) Thanks again everybody!

CTE to represent a logical table for the rows in a table which have the max value in one column

I have an "insert only" database, wherein records aren't physically updated, but rather logically updated by adding a new record, with a CRUD value, carrying a larger sequence. In this case, the "seq" (sequence) column is more in line with what you may consider a primary key, but the "id" is the logical identifier for the record.
This is the physical representation of the table:
seq | id | name | CRUD |
----|-----|--------|------|
1 | 10 | john | C |
2 | 10 | joe | U |
3 | 11 | kent | C |
4 | 12 | katie | C |
5 | 12 | sue | U |
6 | 13 | jill | C |
7 | 14 | bill | C |
This is the logical representation of the table, considering the "most recent" records:
seq | id | name | CRUD |
----|-----|--------|------|
2 | 10 | joe | U |
3 | 11 | kent | C |
5 | 12 | sue | U |
6 | 13 | jill | C |
7 | 14 | bill | C |
In order to, for instance, retrieve the most recent record for the person with id=12, I would currently do something like this:
SELECT
*
FROM
PEOPLE P
WHERE
P.ID = 12
AND
P.SEQ = (
SELECT
MAX(P1.SEQ)
FROM
PEOPLE P1
WHERE P1.ID = 12
)
...and I would receive this row:
seq | id | name | CRUD |
----|-----|--------|------|
5 | 12 | sue | U |
What I'd rather do is something like this:
WITH
NEW_P
AS
(
--CTE representing all of the most recent records
--i.e. for any given id, the most recent sequence
)
SELECT
*
FROM
NEW_P P2
WHERE
P2.ID = 12
The first SQL example using the subquery already works for us.
Question: How can I leverage a CTE to simplify our predicates when needing to leverage the "most recent" logical view of the table. In essence, I don't want to inline a subquery every single time I want to get at the most recent record. I'd rather define a CTE and leverage that in any subsequent predicate.
P.S. While I'm currently using DB2, I'm looking for a solution that is database agnostic.

This is a clear case for window (or OLAP) functions, which are supported by all modern SQL databases. For example:
WITH
ORD_P
AS
(
SELECT p.*, ROW_NUMBER() OVER ( PARTITION BY id ORDER BY seq DESC) rn
FROM people p
)
,
NEW_P
AS
(
SELECT * from ORD_P
WHERE rn = 1
)
SELECT
*
FROM
NEW_P P2
WHERE
P2.ID = 12
PS. Not tested. You may need to explicitly list all columns in the CTE clauses.

I guess you already put it together. First find the max seq associated with each id, then use that to join back to the main table:
WITH newp AS (
SELECT id, MAX(seq) AS latestseq
FROM people
GROUP BY id
)
SELECT p.*
FROM people p
JOIN newp n ON (n.latestseq = p.seq)
ORDER BY p.id
What you originally had would work, as would moving the CTE into the FROM clause as a derived table. Maybe you want to use a timestamp field rather than a sequence number for the ordering?
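Both suggestions so far (ROW_NUMBER, and MAX plus a join back) can be checked against the question's sample rows. This is only a sketch: it assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module instead of DB2, and it does not say anything about relative performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (seq INTEGER PRIMARY KEY, id INTEGER, name TEXT, crud TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (1, 10, "john", "C"), (2, 10, "joe", "U"), (3, 11, "kent", "C"),
        (4, 12, "katie", "C"), (5, 12, "sue", "U"), (6, 13, "jill", "C"),
        (7, 14, "bill", "C"),
    ],
)

# Approach 1: number the rows of each id from newest to oldest, keep the newest.
via_window = conn.execute(
    """
    WITH ord_p AS (
        SELECT p.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq DESC) AS rn
        FROM people AS p
    )
    SELECT seq, id, name, crud FROM ord_p WHERE rn = 1 ORDER BY id
    """
).fetchall()

# Approach 2: find the largest seq per id, then join back to fetch the full row.
via_join = conn.execute(
    """
    WITH newp AS (
        SELECT id, MAX(seq) AS latestseq FROM people GROUP BY id
    )
    SELECT p.seq, p.id, p.name, p.crud
    FROM people AS p
    JOIN newp AS n ON n.latestseq = p.seq
    ORDER BY p.id
    """
).fetchall()

print(via_window)
print(via_window == via_join)
```

On this data both queries return the logical table from the question, one row per id with the highest seq.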

Following up from @Glenn's answer, here is an updated query which meets my original goal and is on par with @mustaccio's answer, but I'm still not sure what the performance (and other) implications of this approach vs the other are.
WITH
LATEST_PERSON_SEQS AS
(
SELECT
ID,
MAX(SEQ) AS LATEST_SEQ
FROM
PERSON
GROUP BY
ID
)
,
LATEST_PERSON AS
(
SELECT
P.*
FROM
PERSON P
JOIN
LATEST_PERSON_SEQS L
ON
(
L.LATEST_SEQ = P.SEQ)
)
SELECT
*
FROM
LATEST_PERSON L2
WHERE
L2.ID = 12

Using multiple left joins to calculate averages and counts

I am trying to figure out how to use multiple left outer joins to calculate average scores and number of cards. I have the following schema and test data. Each deck has 0 or more scores and 0 or more cards. I need to calculate an average score and card count for each deck. I'm using mysql for convenience, I eventually want this to run on sqlite on an Android phone.
mysql> select * from deck;
+----+-------+
| id | name |
+----+-------+
| 1 | one |
| 2 | two |
| 3 | three |
+----+-------+
mysql> select * from score;
+---------+-------+---------------------+--------+
| scoreId | value | date | deckId |
+---------+-------+---------------------+--------+
| 1 | 6.58 | 2009-10-05 20:54:52 | 1 |
| 2 | 7 | 2009-10-05 20:54:58 | 1 |
| 3 | 4.67 | 2009-10-05 20:55:04 | 1 |
| 4 | 7 | 2009-10-05 20:57:38 | 2 |
| 5 | 7 | 2009-10-05 20:57:41 | 2 |
+---------+-------+---------------------+--------+
mysql> select * from card;
+--------+-------+------+--------+
| cardId | front | back | deckId |
+--------+-------+------+--------+
| 1 | fron | back | 2 |
| 2 | fron | back | 1 |
| 3 | f1 | b2 | 1 |
+--------+-------+------+--------+
I run the following query...
mysql> select deck.name, sum(score.value)/count(score.value) "Ave",
-> count(card.front) "Count"
-> from deck
-> left outer join score on deck.id=score.deckId
-> left outer join card on deck.id=card.deckId
-> group by deck.id;
+-------+-----------------+-------+
| name | Ave | Count |
+-------+-----------------+-------+
| one | 6.0833333333333 | 6 |
| two | 7 | 2 |
| three | NULL | 0 |
+-------+-----------------+-------+
... and I get the right answer for the average, but the wrong answer for the number of cards. Can someone tell me what I am doing wrong before I pull my hair out?
Thanks!
John

It's running what you're asking--it's joining card 2 and 3 to scores 1, 2, and 3--creating a count of 6 (2 * 3). In card 1's case, it joins to scores 4 and 5, creating a count of 2 (1 * 2).
If you just want a count of cards, like you're currently doing, use COUNT(DISTINCT card.cardId).

select deck.name, coalesce(x.ave,0) as ave, count(card.cardId) as count -- count a card column, not count(*): count(*) would also count the NULL row of a deck without cards
from deck
left join -- flatten the average result rows first
(
select deckId,sum(value)/count(*) as ave -- count the number of rows, not count the column name value. intent is more clear
from score
group by deckId
) as x on x.deckId = deck.id
left outer join card on card.deckId = deck.id -- then join the flattened results to cards
group by deck.id, x.ave, deck.name
order by deck.id
[EDIT]
SQL has a built-in average function, so just use this for the inner query:
select deckId, avg(value) as ave
from score
group by deckId

What's going wrong is that you're creating a Cartesian product between score and card.
Here's how it works: when you join deck to score, you may have multiple rows match. Then each of these multiple rows is joined to all of the matching rows in card. There's no condition preventing that from happening, and the default join behavior when no condition restricts it is to join all rows in one table to all rows in another table.
To see it in action, try this query, without the group by:
select *
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId;
You'll see a lot of repeated data in the columns that come from score and card. When you calculate the AVG() over data that has repeats in it, the redundant values magically disappear (as long as the values are repeated uniformly). But when you COUNT() or SUM() them, the totals are way off.
There may be remedies for inadvertent Cartesian products. In your case, you can use COUNT(DISTINCT) to compensate:
select deck.name, avg(score.value) "Ave", count(DISTINCT card.cardId) "Count" -- distinct on the key, so two cards with the same front still count twice
from deck
left outer join score on deck.id=score.deckId
left outer join card on deck.id=card.deckId
group by deck.id;
This solution doesn't solve all cases of inadvertent Cartesian products. The more general-purpose solution is to break it up into two separate queries:
select deck.name, avg(score.value) "Ave"
from deck
left outer join score on deck.id=score.deckId
group by deck.id;
select deck.name, count(card.front) "Count"
from deck
left outer join card on deck.id=card.deckId
group by deck.id;
Not every task in database programming must be done in a single query. It can even be more efficient (as well as simpler, easier to modify, and less error-prone) to use individual queries when you need multiple statistics.

Using left joins isn't a good approach, in my opinion. Here's a standard SQL query for the result you want.
select
name,
(select avg(value) from score where score.deckId = deck.id) as Ave,
(select count(*) from card where card.deckId = deck.id) as "Count"
from deck;
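The effect described in the answers above can be reproduced with the question's data. The sketch below assumes SQLite (the asker's eventual target) via Python's sqlite3 module and leaves out the date, front and back columns' real purpose; it contrasts the double left join with the correlated-subquery version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deck (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER)")
conn.execute("CREATE TABLE card (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER)")
conn.executemany("INSERT INTO deck VALUES (?, ?)", [(1, "one"), (2, "two"), (3, "three")])
conn.executemany(
    "INSERT INTO score VALUES (?, ?, ?)",
    [(1, 6.58, 1), (2, 7, 1), (3, 4.67, 1), (4, 7, 2), (5, 7, 2)],
)
conn.executemany(
    "INSERT INTO card VALUES (?, ?, ?, ?)",
    [(1, "fron", "back", 2), (2, "fron", "back", 1), (3, "f1", "b2", 1)],
)

# Two left joins from deck: every score row of a deck is paired with every
# card row of that deck, so cards are counted once per score.
naive = conn.execute(
    """
    SELECT deck.name, count(card.front)
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
    """
).fetchall()
print(naive)

# Correlated scalar subqueries: each statistic is computed on its own table.
fixed = conn.execute(
    """
    SELECT name,
           (SELECT avg(value) FROM score WHERE score.deckId = deck.id) AS ave,
           (SELECT count(*) FROM card WHERE card.deckId = deck.id) AS cnt
    FROM deck
    ORDER BY deck.id
    """
).fetchall()
print(fixed)
```

Deck "one" has 3 scores and 2 cards, so the joined version reports 3 * 2 = 6 cards, while the subquery version reports 2.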
