SQL query without join - sql

I have the following tables:
Table food
+------------+--------------+
| Quantity   | animal_id    |
+------------+--------------+
Table Race
+------------+--------------+
| race_code  | race_name    |
+------------+--------------+
Table animal
+------------+--------------+
| animal_id  | race_code    |
+------------+--------------+
I was asked to calculate the average food quantity for every race (race_name). The challenge here is that I should not use JOIN because we have not studied it yet.
I have written the following query:
select AVG(f.quantity),r.race_name from food f, race r
group by r.race_name;
but it doesn't do what I want: it returns the same average food quantity for every race. I know I have to use the animal table to link the other two, but I don't know how. I should be using subqueries.

That question is exactly the same as your previous, where you had to use SUM (instead of AVG). No difference at all.
Ah, sorry: it wasn't you but your schoolmate who asked that one.
You say you "didn't learn joins", but what do you call what you posted here, then? That's a cross join, and it produces a Cartesian product: every food row is paired with every race, which is why each race gets the same average. You need to add the join conditions that link the three tables to return the desired result.
The "old" syntax is
select r.race_name,
avg(f.quantity) avg_quantity
from race r, animal a, food f
where a.race_code = r.race_code
and f.animal_id = a.animal_id
group by r.race_name;
What you "didn't learn yet" does the same, but looks differently:
from race r join animal a on a.race_code = r.race_code
join food f on f.animal_id = a.animal_id
The rest of the query remains the same.
Nowadays, you should use JOINs to join tables and keep filter conditions in the WHERE clause. For example, a filter condition would be that you want to calculate averages for donkeys only. As you don't have such a condition, you don't need a WHERE clause.

You still have to match up the related rows. If you don't do it explicitly with JOIN, you can do it in the WHERE clause, i.e., something like
select AVG(f.quantity),r.race_name
from food f, race r, animal a
where f.animal_id = a.animal_id and a.race_code = r.race_code
group by r.race_name;

select race_name ,(select avg(quantity) from food where animal_id in (select animal_id from animal a where r.race_code = a.race_code))
from race r
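A minimal sketch of how the three variants above behave, run through Python's built-in sqlite3 module. The table and column names follow the question; the sample rows are invented for illustration, and SQLite stands in for whatever database the course uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE race   (race_code TEXT PRIMARY KEY, race_name TEXT);
    CREATE TABLE animal (animal_id INTEGER PRIMARY KEY, race_code TEXT);
    CREATE TABLE food   (quantity REAL, animal_id INTEGER);

    -- Hypothetical sample rows, invented for this sketch.
    INSERT INTO race   VALUES ('D', 'donkey'), ('H', 'horse');
    INSERT INTO animal VALUES (1, 'D'), (2, 'D'), (3, 'H');
    INSERT INTO food   VALUES (2.0, 1), (4.0, 2), (9.0, 3);
""")

# The question's query: nothing links food to race, so every race is
# paired with every food row and gets the overall average.
unlinked = conn.execute("""
    SELECT r.race_name, AVG(f.quantity)
    FROM food f, race r
    GROUP BY r.race_name
    ORDER BY r.race_name
""").fetchall()

# Comma-separated tables, with the linking conditions in WHERE.
linked = conn.execute("""
    SELECT r.race_name, AVG(f.quantity)
    FROM food f, race r, animal a
    WHERE f.animal_id = a.animal_id AND a.race_code = r.race_code
    GROUP BY r.race_name
    ORDER BY r.race_name
""").fetchall()

# Correlated subquery: one average is computed per row of race.
subquery = conn.execute("""
    SELECT r.race_name,
           (SELECT AVG(quantity) FROM food
            WHERE animal_id IN (SELECT a.animal_id FROM animal a
                                WHERE a.race_code = r.race_code))
    FROM race r
    ORDER BY r.race_name
""").fetchall()

print(unlinked)  # [('donkey', 5.0), ('horse', 5.0)]
print(linked)    # [('donkey', 3.0), ('horse', 9.0)]
print(subquery)  # [('donkey', 3.0), ('horse', 9.0)]
```

One difference worth knowing: the subquery form returns a row (with a NULL average) for a race that has no animals or no food rows, while the WHERE-linked form drops such a race.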


Aggregating or Bundle a Many to Many Relationship in SQL Developer

So I have 1 single table with 2 columns : Sales_Order called ccso, Arrangement called arrmap
The table has distinct values for this combination and both these fields have a Many to Many relationship
1 ccso can have Multiple arrmap
1 arrmap can have Multiple ccso
All such combinations should be considered as one single bundle
Objective :
Assign a final map to each of the Sales Order as the Largest Arrangement in that Bundle
Example:
ccso : 100-10015 has 3 arrangements --> Now each of those arrangements have a set of Sales Orders --> Now those sales orders will also have a list of other arrangements and so on
(Image : 1)
Therefore the answer definitely points to some kind of recursive check. I've managed to write the code below, and it works as long as I hard-code a ccso in the WHERE clause, but I don't know how to proceed from here. (I'm an accountant by profession but have been finding more passion in coding recently.) I've searched the forums and web for things like
Recursive CTEs,
many to many aggregation
cartesian product etc
and I'm sure there must be a term for this which I don't know yet. I've also tried
I have to use sqldeveloper or googlesheet query and filter formulas
sqldeveloper has restrictions on some CTEs. If recursion is the way, I'd like to know how to do it and whether I can limit the depth to, say, 4 or 5 iterations
Ideally I'd want to update a third column with the final map if possible but if not, then a select query result is just fine
Codes I've tried
Code 1: As per Screenshot
WITH a1(ccso, amap) AS
(SELECT distinct a.ccso, a.arrmap
FROM rg_consol_map2 A
WHERE a.ccso = '100-10115' -- this condition defines the ultimate ancestors in your chain, change it as appropriate
UNION ALL
SELECT m.ccso, m.arrmap
FROM rg_consol_map2 m
JOIN a1
ON M.arrmap = a1.amap -- or m.ccso=a1.ccso
) /*if*/ CYCLE amap SET nemap TO 1 /*else*/ DEFAULT 0
SELECT DISTINCT amap FROM (SELECT ccso, amap FROM a1 ORDER BY 1 DESC) WHERE ROWNUM = 1
In this the main challenge is how to remove the hardcoded ccso and do a join for each of the ccso
Code 2 : Manual CTEs for depth
Here again the join outside the CTE gives me an error and sqldeveloper does not allow WITH clause with UPDATE statement - only works for select and cannot be enclosed within brackets as subtable
SELECT distinct ccso FROM
(
WITH ar1 AS
(SELECT distinct arrmap
FROM rg_consol_map
WHERE ccso = a.ccso
)
,so1 AS
(SELECT DISTINCT ccso
FROM rg_consol_map
WHERE arrmap IN (SELECT arrmap FROM ar1)
)
,ar2 AS
(SELECT DISTINCT ccso FROM rg_consol_map
where arrmap IN (select distinct arrmap FROM rg_consol_map
WHERE ccso IN (SELECT ccso FROM so1)
))
SELECT ar1.arrmap, NULL ccso FROM ar1
union all
SELECT null, ar2.ccso FROM ar2
UNION ALL
SELECT NULL arrmap, so1.ccso FROM so1
)
Am I missing something here, or is there an easier way to do this? I read something about MERGE and PROC SQL JOIN but was unable to get them to work; if that's the way to go, I will try further if someone can point me in the right direction.
(Image : 2)
(CSV File : [3])
Edit : Fixing CSV file link
https://github.com/karan360note/karanstackoverflow.git
I suppose can be downloaded from here IC mapping many to many.csv
Oracle 11g version is being used
Apologies in advance for the wall of text.
Your problem is a complex, multi-layered many-to-many query; there is no "easy" solution, because this is not an ideal design choice. The safest bet really is to use multiple layers of CTEs or subqueries to reach all the depths you want, as the only ways I know to do this recursively rely on an anchor column (like "parentID") to direct the recursion in a linear fashion. We don't have that option here; we'd go in circles without a way to track our path.
Therefore, I went basic, and with several subqueries. Every level checks for a) All orders containing a particular ARRMAP item, and then b) All additional items on those orders. It's clear enough for you to see the logic and modify to your needs. It will generate a new table that contains the original CCSO, the linking ARRMAP, and the related CCSO. Link: https://pastebin.com/un70JnpA
This should enable you to go back and perform the desired updates you want, based on order # or order date, etc... in a much more straightforward fashion. Once you have an anchor column, a CTE in the future is much more trivial (just search for "CTE recursion tree hierarchy").
SELECT DISTINCT
CCSO, RELATEDORDER
FROM myTempTable
WHERE CCSO = '100-10115'; /* to find all orders by CCSO, query SELECT DISTINCT RELATEDORDER */
--WHERE ARRMAP = 'ARR10524'; /* to find all orders by ARRMAP, query SELECT DISTINCT CCSO */
EDIT:
To better explain what this table generates, let me simplify the problem.
If you have order
A with arrangements 1 and 2;
B with arrangement 2, 3; and
C with arrangement 3;
then, by your initial inquiry and image, order A should be related to orders B and C, right? The query generates the following table when you SELECT DISTINCT ccso, relatedOrder:
+-------+--------------+
| CCSO | RelatedOrder |
+----------------------+
| A | B |
| A | C |
+----------------------+
| B | C |
| B | A |
+----------------------+
| C | A |
| C | B |
+-------+--------------+
You can see here if you query WHERE CCSO = 'A' OR RelatedOrder = 'A', you'll get the same relationships, just flipped between the two columns.
+-------+--------------+
| CCSO | RelatedOrder |
+----------------------+
| A | B |
| A | C |
+----------------------+
| B | A |
+----------------------+
| C | A |
+-------+--------------+
So query only CCSO or RelatedOrder.
As for the results of WHERE CCSO = '100-10115', see image here, which includes all the links you showed in your Image #1, as well as additional depths of relations.
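For readers who can export the table and work outside SQL Developer: the "bundle" in the question is what graph terminology calls a connected component, and a union-find structure groups the rows without any fixed depth. The sketch below is a different technique from the layered subqueries above, not a translation of them. The function names are hypothetical, the rows reuse the simplified A/B/C example plus one invented unrelated order D, and "largest arrangement" is assumed to mean the greatest arrmap value under ordinary comparison.

```python
def find(parent, x):
    """Return the representative of x's bundle, compressing the path."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def final_maps(rows):
    """Map each sales order to the largest arrangement in its bundle.

    rows is an iterable of (ccso, arrmap) pairs, one per table row.
    """
    rows = list(rows)
    parent = {}
    for ccso, arrmap in rows:
        # Tag the two kinds of key so an order and an arrangement
        # that happen to share a value never collide.
        a, b = ("so", ccso), ("arr", arrmap)
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb  # merge the two bundles

    # Largest arrangement seen in each bundle.
    largest = {}
    for _, arrmap in rows:
        root = find(parent, ("arr", arrmap))
        if root not in largest or arrmap > largest[root]:
            largest[root] = arrmap

    return {ccso: largest[find(parent, ("so", ccso))] for ccso, _ in rows}


# Orders A, B, C are chained through arrangements 2 and 3; D is unrelated.
rows = [("A", 1), ("A", 2), ("B", 2), ("B", 3), ("C", 3), ("D", 9)]
result = final_maps(rows)
print(result)  # {'A': 3, 'B': 3, 'C': 3, 'D': 9}
```

The resulting mapping could then be loaded back as the third column; how that load is done depends on the tooling available.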

Stuck on beginner SQL practice. Multiple table where columns use same id

I'm very sorry to bother with minor problem, but I tried to search old answers for this one and since my skills in SQL are complete 0, I didn't even understand the answers :/! Neither is my English terminology great enough for properly searching.
I have these 2 tables: Cities and Flights.
Cities
+----+-------------+
|id | name |
+----+-------------+
|1 | Oslo |
|2 | New York |
|3 | Hong Kong |
+----+-------------+
Flights
+----+--------------------+-------------------+
|id | wherefrom_id | whereto_id |
+----+--------------------+-------------------+
|1 | 3 | 2 |
|2 | 3 | 1 |
|3 | 1 | 3 |
+----+--------------------+-------------------+
Now I have to write a query that matches the city ids to wherefrom_id and whereto_id, so that the result is a list of flights showing the city names (FROM/TO).
Example:
ANSWER:
+-----------+----------------+
|HONG KONG | NEW YORK |
+-----------+----------------+
|HONG KONG | OSLO |
+-----------+----------------+
|OSLO | HONG KONG |
+-----------+----------------+
This is what I wrote:
SELECT C.name, C.name
FROM Cities C, Flights F
WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id;
For some reason this doesn't seem to work, and nothing shows up in my practice program. There is no error; the test answer is just empty. I really hope you get what I mean. English is not my first language and I truly tried my best to make it as clear as possible :S
First things first - it's a lot easier to code in standard SQL join syntax. Converting your above to that is
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
The question you've been asked requires logic people don't usually use at first so it can be confusing the first time you encounter it.
I will run through the logic jump in a moment.
Imagine your Flights table has the City names in it (not IDs).
It would have columns, say, FlightID, From_City_Name, To_City_Name.
An example row would be 1, 'Oslo', 'Prague'.
Getting the data for this would be easy e.g., SELECT Flight_ID, From_City_Name, To_City_name FROM Flights.
However, this has many problems. As your question has done, you decide to pull out the cities into their own reference tables.
For this first example, however, you decided to have two extra tables as reference tables: From_City and To_City. These would both have an ID and city name. You then change your Flights to refer to these.
Your code would look like
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN From_City AS FC ON F.From_City_ID = FC.ID
INNER JOIN To_City AS TC ON F.To_City_ID = TC.ID
Notice how there are two joins there - one to From_City and one to To_City? That is because the From and To cities are referring to different things in the data.
So, then the final part of the issue: why have two city tables (from and to). Why not have one? Well, you can. If you create just one table, and modify the above, you get something like this:
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN City AS FC ON F.From_City_ID = FC.ID
INNER JOIN City AS TC ON F.To_City_ID = TC.ID
Note that all that has changed is that the From_City and To_City references have been pointed to a different table City. However, the rest is the same.
And that, actually, would be your answer. The complex part that most people don't get to straight away, is having two joins to the same table.
As an aside, your original code is technically valid.
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
However, what it's effectively saying is to get the city names where the From_City is the same as the To_City - which is obviously not what you want (unless you're looking for turnbacks).
What you're doing is an old SQL way of expressing joins. The standard now has better ways to declare the relationships within the from clause and I take it that your material has postponed that slightly:
There are people who will yell at you for using this ancient syntax but the answer is easy enough:
SELECT C1.name, C2.name
FROM Cities C1, Cities C2, Flights F
WHERE C1.id = F.wherefrom_id AND C2.id = F.whereto_id
You can think of this as creating a "cross product" of all city-pair combinations and keeping the pairs that match actual flights. The key is to reference Cities twice by using different aliases (or correlation names).
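To see the difference concretely, here is a small sketch using Python's built-in sqlite3 module with the Cities and Flights rows from the question. SQLite is only a stand-in for the practice program's database; the ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cities  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Flights (id INTEGER PRIMARY KEY,
                          wherefrom_id INTEGER, whereto_id INTEGER);
    INSERT INTO Cities  VALUES (1, 'Oslo'), (2, 'New York'), (3, 'Hong Kong');
    INSERT INTO Flights VALUES (1, 3, 2), (2, 3, 1), (3, 1, 3);
""")

# One alias: the same city row must be both origin and destination,
# which no flight in the sample data satisfies.
single_alias = conn.execute("""
    SELECT C.name, C.name
    FROM Cities C, Flights F
    WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id
""").fetchall()

# Two aliases: C1 is matched to the origin, C2 to the destination.
two_aliases = conn.execute("""
    SELECT C1.name, C2.name
    FROM Cities C1, Cities C2, Flights F
    WHERE C1.id = F.wherefrom_id AND C2.id = F.whereto_id
    ORDER BY F.id
""").fetchall()

print(single_alias)  # []
print(two_aliases)
# [('Hong Kong', 'New York'), ('Hong Kong', 'Oslo'), ('Oslo', 'Hong Kong')]
```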
I think this is what you are looking for..
SELECT wf.name "wherefrom", wt.name "whereto"
FROM Flights f
JOIN Cities wf
ON f.wherefrom_id = wf.id
JOIN Cities wt
ON f.whereto_id = wt.id
order by f.id

pass results between queries and display joint results ( google bigquery )

I want to make a query q1, and use the result of q1 on a second query q2.
I want to display all columns of q1 and q2, so that results are based on a common column.
(Please let me know if title is not so clear)
The example below should display columns [id, publisher, author] in the q1.
I want to pass them to q2, retrieve properties [id, cited_id, category] for all items within the id column of q1.
As results, for each id I want to display all cited_ids and their properties (of both ids and cited_ids).
Alternatively, for better clarity, it is also ok to retrieve an array of cited_ids for each ids, and in a separate query I will decorate my ids and cited_ids with their properties.
Please advise also on the "performance" (I'm using BigQuery, so if you could explain why a solution is more efficient, that would help in saving computational resources!).
I came up with this, but cannot display all columns of q1.
WITH q1 AS (
SELECT id, publisher, a.name
FROM `db.publications`,
UNNEST (publisher) as h,
UNNEST (author) as a
WHERE h Like '%penguin%'
)
SELECT p.id, c.id AS Cited, c.Category AS Cat
FROM `db.publications` AS p, UNNEST(citation) AS c
WHERE p.id IN (SELECT id from q1)
Sample Data:
# result of q1
Row | Id | Publisher | Author
1 | item0 | penguin | Bob
2 | item0 | penguin | Alice
3 | item1 | penguin | Charlie
I want to find other items that are cited by each unique item in q1 (item0, item1).
I wish to have the results in a handy format that could be used in this way:
# Citations: books mentioned by item0, item1 ...
item0 : [item10, item15, item100]
item1 : [item23, item0, item101, item15]
..
# Decorators : information about each book:
Row | Id | Publisher | Author(s) |
My question is: can I achieve both in a single query?
If so, is that convenient, or is it better to split it into two separate queries to use fewer computational resources?
My approach is first query a set of books and their decorators, and then use a list of ids to look for their citations. I could not carry decorators along with above example.
Regarding the first part of your question, instead of using where p.id in(select id from q1), use a join to bring in q1 fields.
WITH q1 AS (
SELECT id, publisher, a.name
FROM `db.publications`,
UNNEST (publisher) as h,
UNNEST (author) as a
WHERE h Like '%penguin%'
),
joined as (
select id, p.citation, q1.publisher, q1.name
from `db.publications` p
inner join q1 using(id)
)
select id, publisher, name, c.id as Cited, c.Category as Cat
from joined
left join unnest(citation) c

Using SELECT...GROUP BY...HAVING in SQLite

I'm working on exercise 17 in the Teach Yourself SQL program GalaXQL (based on SQLite). I've got three tables:
Stars that contains starid;
Planets that contains planetid and starid;
Moons that contains moonid and planetid.
I want to return the starid associated with the greatest number of planets and moons combined.
I've got a query that will return the starid, planetid and total planets + moons.
How do I change this query so it only returns the single starid corresponding to the max(total) and not a table? This is my query so far:
select
stars.starid as sid,
planets.planetid as pid,
(count(moons.moonid)+count(planets.planetid)) as total
from stars, planets, moons
where planets.planetid=moons.planetid and stars.starid=planets.starid
group by stars.starid
Let's visualize a system that might be represented by this database structure, and see if we can't translate your question into working SQL.
I drew you a galaxy:
To distinguish stars and planets from moons, I've used capital Roman numerals for starid values and lower-case Roman numerals for moonid values. And since everyone knows that astronomers have nothing to do on those long nights in the observatory but drink, I put an unexplained gap in the middle of your planetid values. Gaps like these will occur when using so-called "surrogate" IDs, because their values hold no meaning; they are simply unique identifiers for rows.
If you'd like to follow along, here's the galaxy naively loaded into SQL Fiddle (if you get a popup about switching to WebSQL, you may need to hit "cancel" and stick with SQL.js for this example to work).
Let's see, what was it you wanted again?
I want to return the starid associated with the greatest number of planets and moons combined
Awesome. Rephrased, the question is: Which star is associated with the greatest number of orbiting bodies?
Star (I) has 1 planet with 3 moons;
Star (II) has 1 planet with 1 moon and 1 planet with 2 moons;
Star (III) has 1 planet with 1 moon and 2 planets with no moons.
All we're doing here is counting the different entities associated with each star. With a total of 5 orbiting bodies, star (II) is the winner! So the final result we expect from a working query is:
| starid |
|--------|
| 2 |
I intentionally drew this awesome galaxy such that the "winning" star doesn't have the most planets and isn't associated with the planet that has the most moons. If those astronomers weren't all three sheets to the wind, I might have gotten an extra moon out of planet (1) as well, so that our winning star isn't tied for most moons total. It'll be convenient for us in this demonstration if star (II) only answers the question we're asking and not any other questions with potentially similar queries, to reduce our chances of arriving at the right answer via the wrong query.
Lost in translation
The first thing I want to do is introduce you to the explicit JOIN syntax. This is going to be your very close friend. You will always JOIN your tables, no matter what some silly tutorial says. Trust in my far sillier advice instead (and optionally, read Explicit vs implicit SQL joins).
The explicit JOIN syntax shows how we're requiring our tables to relate to each other and reserves the WHERE clause for the sole purpose of filtering rows from the result set. There are a few different types, but what we're going to start with is a plain old INNER JOIN. This is essentially what your original query performed and it implies that all you want to see in your result set is the data that overlaps in all three tables. Check out a skeleton of your original query:
SELECT ... FROM stars, planets, moons
WHERE planets.planetid = moons.planetid
AND planets.starid = stars.starid;
Given those conditions, what happens to an orphaned planet somewhere off in space that isn't associated with a star (i.e., its starid is NULL)? Since an orphaned planet has no overlap with the stars table, an INNER JOIN wouldn't include it in the result set.
In SQL any equality or inequality comparison with NULL gives a result of NULL—even NULL = NULL isn't true! Now your query has a problem, because the other condition is that planets.planetid = moons.planetid. If there's a planet for which no corresponding moon exists, that turns into planets.planetid = NULL and the planet will not appear in your query result. That's no good! Lonely planets must be counted!
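The NULL behaviour described above is easy to check directly. This is a minimal sketch using Python's built-in sqlite3 module (SQLite being the engine GalaXQL is based on); Python shows SQL NULL as None and true as 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons with NULL yield NULL; only IS NULL gives a definite answer.
row = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, NULL IS NULL"
).fetchone()
print(row)  # (None, None, 1)

# WHERE keeps a row only when its condition is true, so a NULL
# condition filters the row out just as a false one would.
kept = conn.execute("SELECT 'kept' WHERE NULL = NULL").fetchall()
print(kept)  # []
```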
The OUTER limits
Fortunately there's a JOIN for you: An OUTER JOIN, which will ensure that at least one of the tables always shows up in our result set. They come in LEFT and RIGHT flavors to indicate which table gets special treatment, relative to the position of the JOIN keyword. What joins does SQLite support? confirms that the INNER and OUTER keywords are optional, so we can use LEFT JOIN, noting that:
stars and planets are linked by a common starid;
planets and moons are linked by a common planetid;
stars and moons are linked indirectly by the above two links;
we always want to count all the planets and all the moons.
SELECT
*
FROM
stars
LEFT JOIN
planets ON stars.starid = planets.starid
LEFT JOIN
moons ON planets.planetid = moons.planetid;
Notice that instead of having a big bag o' tables and a WHERE clause, you now have one ON clause for each JOIN. As you find yourself working with more tables, this is going to be far easier to read; and because this is standard syntax, it's relatively portable between SQL databases.
Lost in space
Our new query basically grabs everything in our database. But does this correspond to everything in our galaxy? Actually, there's some redundancy here, because two of our ID fields (starid and planetid) exist in more than one table. This is only one of many reasons to avoid the SELECT * catch-all syntax in actual use cases. We only really need the three ID fields, and I'm going to throw in two more tricks while we're at it:
Aliases! You can give your tables more convenient names by using the table_name AS alias syntax. This can be really convenient when you have to refer to many different columns in a multi-table query and you don't want to type out the full table names each time.
Grab starid from the planets table and leave stars out of the JOIN entirely! Having stars LEFT JOIN planets ON stars.starid = planets.starid means that the starid field is going to be the same, no matter which table we get it from—as long as the star has any planets. If we were counting stars, we'd need this table, but we're counting planets and moons; moons by definition orbit planets, so a star with no planets also has no moons and can be ignored. (This is an assumption; check your data to make sure it's justified! Maybe your astronomers are more drunk than usual!)
SELECT
p.starid, -- This could be S.starid, if we kept using `stars`
p.planetid,
m.moonid
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid;
Result:
| starid | planetid | moonid |
|--------|----------|--------|
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 1 | 3 |
| 2 | 2 | 6 |
| 2 | 3 | 4 |
| 2 | 3 | 5 |
| 3 | 7 | |
| 3 | 8 | 7 |
| 3 | 9 | |
Mathematical!
Now our task is to decide which star is the winner, and for that we have to do some simple calculation. Let's count moons first; since they have no "children" and only one "parent" each, they're easy to aggregate:
SELECT
p.starid,
p.planetid,
COUNT(m.moonid) AS moon_count
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid, p.planetid;
Result:
| starid | planetid | moon_count |
|--------|----------|------------|
| 1 | 1 | 3 |
| 2 | 2 | 1 |
| 2 | 3 | 2 |
| 3 | 7 | 0 |
| 3 | 8 | 1 |
| 3 | 9 | 0 |
(Note: Usually we like to use COUNT(*) because it's simple to type and to read, but it would get us into trouble here! Since two of our rows have a NULL value for the moonid, we have to use COUNT(moonid) to avoid counting moons that don't exist.)
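A quick way to see the COUNT(*) pitfall from the note is to put both counts side by side. The sketch below uses Python's built-in sqlite3 module and rebuilds the planets and moons rows from the result table shown earlier; the row_count column name is introduced here for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE planets (planetid INTEGER PRIMARY KEY, starid INTEGER);
    CREATE TABLE moons   (moonid INTEGER PRIMARY KEY, planetid INTEGER);
    INSERT INTO planets VALUES (1, 1), (2, 2), (3, 2), (7, 3), (8, 3), (9, 3);
    INSERT INTO moons   VALUES (1, 1), (2, 1), (3, 1), (4, 3), (5, 3),
                               (6, 2), (7, 8);
""")

# COUNT(*) counts joined rows, including the NULL-padded row that the
# LEFT JOIN produces for a moonless planet; COUNT(m.moonid) skips NULLs.
rows = conn.execute("""
    SELECT p.planetid,
           COUNT(*)        AS row_count,
           COUNT(m.moonid) AS moon_count
    FROM planets AS p
    LEFT JOIN moons AS m ON p.planetid = m.planetid
    GROUP BY p.starid, p.planetid
    ORDER BY p.planetid
""").fetchall()

for planetid, row_count, moon_count in rows:
    print(planetid, row_count, moon_count)
# Planets 7 and 9 print a row_count of 1 but a moon_count of 0.
```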
So far, so good—I see six planets, we know which star each belongs to, and the right number of moons are shown for each planet. Next step, counting the planets. You might think this requires a subquery in order to also add up the moon_count column for each planet but it's actually simpler than that; if we GROUP BY the star, our moon_count will switch from counting "moons per planet, per star" to "moons per star" which is just fine:
SELECT
p.starid,
COUNT(p.planetid) AS planet_count,
COUNT(m.moonid) AS moon_count
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid;
Result:
| starid | planet_count | moon_count |
|--------|--------------|------------|
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| 3 | 3 | 1 |
Now we've run into trouble. The moon_count is correct, but you should see right away that the planet_count is wrong. Why is this? Look back at the ungrouped query result and notice that there are nine rows, with three rows for each starid, and each row has a non-null value for planetid. That's what we asked the database to count with this query, when what we really meant to ask was how many different planets are there? Planet (1) appears three times with star (I) but it's the same planet each time. The fix is to stick the DISTINCT keyword inside the COUNT() function call. At the same time, we can add the two columns together:
SELECT
p.starid,
COUNT(DISTINCT p.planetid)+ COUNT(m.moonid) AS total_bodies
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid;
Result:
| starid | total_bodies |
|--------|--------------|
| 1 | 4 |
| 2 | 5 |
| 3 | 4 |
And the winner is...
Counting the orbiting bodies around each star in the drawing, we can see that the total_bodies column is correct. But you didn't ask for all this information; you just want to know who won. Well, there are a bunch of ways to get there, and depending on the size and makeup of your galaxy (database), some may be more efficient than others. One approach is to ORDER BY the total_bodies expression so that the "winner" appears at the top, LIMIT 1 so that we don't see the losers, and select only the starid column (see it on SQL Fiddle).
The problem with that approach is that it hides ties. What if we gave the losing stars in our galaxy each an extra planet or moon? Now we've got a three way tie—everyone's a winner! But who shows up first when we ORDER BY a value that's always the same? In the SQL standard, this is undefined; there's no telling who will come out on top. You might run the same query twice on the same data and get two different results!
For this reason, you might prefer to ask which stars have the greatest number of orbital bodies, instead of specifying in your question that you know there is only one value. This is a more typically set-based approach and it's not a bad idea to get used to set-based thinking when working with relational databases. Until you execute the query, you don't know the size of the result set; if you're going to assume there's not a tie for first place, you have to justify that assumption somehow. (Since astronomers regularly find new moons and planets, I'd have a hard time justifying this one!)
The way I'd prefer to write this query is with something called a Common Table Expression (CTE). These are supported in recent versions of SQLite and in many other databases but last I checked GalaXQL was using an older version of the SQLite engine that doesn't include this feature. CTEs let you refer to a subquery multiple times using an alias, rather than having to write it out in full each time. A solution using CTEs could look like this:
WITH body_counts AS
(SELECT
p.starid,
COUNT(DISTINCT p.planetid) + COUNT(m.moonid) AS total_bodies
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid)
SELECT
starid
FROM
body_counts
WHERE
total_bodies = (SELECT MAX(total_bodies) FROM body_counts);
Result:
| STARID |
|--------|
| 2 |
Check out this query in action on SQLFiddle. To confirm that this query can show more than one row in the case of a tie, try changing MAX() on the last line to MIN().
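If the SQL Fiddle is unavailable, the same check can be run locally. This sketch uses Python's built-in sqlite3 module (the SQLite bundled with current Python releases supports CTEs) and the galaxy rows from the result tables above. The stars_with helper and the added ORDER BY are conveniences for this example, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE planets (planetid INTEGER PRIMARY KEY, starid INTEGER);
    CREATE TABLE moons   (moonid INTEGER PRIMARY KEY, planetid INTEGER);
    INSERT INTO planets VALUES (1, 1), (2, 2), (3, 2), (7, 3), (8, 3), (9, 3);
    INSERT INTO moons   VALUES (1, 1), (2, 1), (3, 1), (4, 3), (5, 3),
                               (6, 2), (7, 8);
""")

QUERY = """
    WITH body_counts AS (
        SELECT p.starid,
               COUNT(DISTINCT p.planetid) + COUNT(m.moonid) AS total_bodies
        FROM planets AS p
        LEFT JOIN moons AS m ON p.planetid = m.planetid
        GROUP BY p.starid
    )
    SELECT starid
    FROM body_counts
    WHERE total_bodies = (SELECT {agg}(total_bodies) FROM body_counts)
    ORDER BY starid
"""


def stars_with(agg):
    """Return the starids whose total equals MAX or MIN of all totals."""
    # agg is a fixed keyword chosen in this script, never user input.
    return [row[0] for row in conn.execute(QUERY.format(agg=agg))]


winners = stars_with("MAX")
tied_losers = stars_with("MIN")
print(winners)      # [2]
print(tied_losers)  # [1, 3]
```

The MIN run returns two stars, which is the tie case that an ORDER BY ... LIMIT 1 query would silently cut down to one.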
Just for you
Doing this without CTEs is ugly but it can be done if the table size is manageable. Looking at the query above, our CTE is aliased as body_counts and we refer to it twice: in the FROM clause and in the WHERE clause. We can replace both of those references with the statement that we used to define body_counts (removing the starid column from the second subquery, where it's not used):
SELECT
starid
FROM
(SELECT
p.starid,
COUNT(DISTINCT p.planetid) + COUNT(m.moonid) AS total_bodies
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid)
WHERE
total_bodies = (SELECT MAX(total_bodies) FROM
(SELECT
COUNT(DISTINCT p.planetid)+ COUNT(m.moonid) AS total_bodies
FROM
planets AS p
LEFT JOIN
moons AS m ON p.planetid = m.planetid
GROUP BY p.starid)
);
This is the tie-friendly approach that should work for you in GalaXQL. See it working here in SQLFiddle.
Now that you've seen both, isn't the CTE version easier to understand? MySQL, which didn't support CTEs until the 2018 release of version 8.0, would additionally demand aliases for our subqueries. Fortunately, SQLite does not, because in this case it's just extra verbiage to add to an already over-complicated query.
Well, that was fun—are you sorry you asked? ;)
(P.S., if you were wondering what's up with planet number nine: giant space potato chips tend to have very eccentric orbits.)
Maybe something like this is what you want?
select
stars.starid as sid,
(count(distinct moons.moonid)+count(distinct planets.planetid)) as total
from stars
left join planets on stars.starid=planets.starid
left join moons on planets.planetid=moons.planetid
group by stars.starid
order by 2 desc
limit 1
Sample SQL Fiddle

SQL: SUM of MAX values WHERE date1 <= date2 returns "wrong" results

Hi stackoverflow users
I'm having a bit of a problem trying to combine SUM, MAX and WHERE in one query and after an intense Google search (my search engine skills usually don't fail me) you are my last hope to understand and fix the following issue.
My goal is to count people in a certain period of time and because a person can visit more than once in said period, I'm using MAX. Due to the fact that I'm defining people as male (m) or female (f) using a string (for statistic purposes), CHAR_LENGTH returns the numbers I'm in need of.
SELECT SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id")
So far, so good. But now, as stated before, I'd like to only count the guests which visited in a certain time interval (for statistic purposes as well).
SELECT "statistic"."id", SUM(max_pers) AS "People"
FROM (
SELECT "guests"."id", MAX(CHAR_LENGTH("guests"."gender")) AS "max_pers"
FROM "guests"
GROUP BY "guests"."id"),
"statistic", "guests"
WHERE ( "guests"."arrival" <= "statistic"."from" AND "guests"."departure" >= "statistic"."to")
GROUP BY "statistic"."id"
This query returns the following, x = desired result:
x * (x+1)
So if the result should be 3, it's 12. If it should be 5, it's 30 etc.
I could probably solve this algebraically, but I'd rather understand what I'm doing wrong and learn from it.
Thanks in advance and I'm certainly going to answer all further questions.
PS: I'm using LibreOffice Base.
EDIT: An example
guests table:
ID | arrival | departure | gender |
10 | 1.1.14 | 10.1.14 | mf |
10 | 15.1.14 | 17.1.14 | m |
11 | 5.1.14 | 6.1.14 | m |
12 | 10.2.14 | 24.2.14 | f |
13 | 27.2.14 | 28.2.14 | mmmmmf |
statistic table:
ID | from | to | name |
1 | 1.1.14 | 31.1.14 |January | expected result: 3
2 | 1.2.14 | 28.2.14 |February| expected result: 7
MAX(...) is the wrong function: You want COUNT(DISTINCT ...).
Add proper join syntax, simplify (and remove unnecessary quotes) and this should work:
SELECT s.id, COUNT(DISTINCT g.id) AS People
FROM statistic s
LEFT JOIN guests g ON g.arrival <= s."from" AND g.departure >= s."to"
GROUP BY s.id
Note: Using LEFT JOIN means you'll get a result of zero for statistic ids that have no guests. If you would rather get no row at all, remove the LEFT keyword.
You have a very strange data structure. In any case, I think you want:
SELECT g.stat_id, sum(g.numpersons) AS People
FROM (select s.id as stat_id, g.id, max(char_length(g.gender)) as numpersons
from guests g join
statistic s
on g.arrival <= s."from" AND g.departure >= s."to"
group by s.id, g.id
) g
GROUP BY g.stat_id;
Thanks for all your inputs. I wasn't familiar with JOIN but it was necessary to solve my problem.
Since my database is designed in German, I made quite a big mistake while translating it, and I'm sorry if this caused confusion.
Selecting guests.id and later grouping by guests.id wouldn't make any sense, since the id is unique. What I actually wanted to do is select and group by guests.adr_id, which links a visiting guest to an address database.
The correct solution to my problem is the following code:
SELECT statname, SUM(numpers) FROM (
SELECT statistic.name AS statname, guests.adr_id, MAX( CHAR_LENGTH( guests.gender ) ) AS numpers
FROM guests
JOIN statistic ON (guests.arrival <= statistic."to" AND guests.departure >= statistic."from" )
GROUP BY guests.adr_id, statistic.name )
GROUP BY statname
I also noted that my database structure is a mess, but I created it while learning by doing and haven't found time to rewrite it yet. Next time I post, I'll try to do better.
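As a check, the final query can be run against the example data from the question. This is a sketch under several assumptions: SQLite (via Python's sqlite3 module) stands in for LibreOffice Base, so LENGTH replaces CHAR_LENGTH; dates are stored as ISO strings so that string comparison orders them correctly; the ID column of the example is treated as adr_id; and the statistic id is carried along (as stat_id, a name introduced here) only to give a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE guests (adr_id INTEGER, arrival TEXT, departure TEXT,
                         gender TEXT);
    CREATE TABLE statistic (id INTEGER, "from" TEXT, "to" TEXT, name TEXT);

    INSERT INTO guests VALUES
        (10, '2014-01-01', '2014-01-10', 'mf'),
        (10, '2014-01-15', '2014-01-17', 'm'),
        (11, '2014-01-05', '2014-01-06', 'm'),
        (12, '2014-02-10', '2014-02-24', 'f'),
        (13, '2014-02-27', '2014-02-28', 'mmmmmf');
    INSERT INTO statistic VALUES
        (1, '2014-01-01', '2014-01-31', 'January'),
        (2, '2014-02-01', '2014-02-28', 'February');
""")

# Inner query: one row per (period, address) with the largest party size.
# A stay belongs to a period when the two date ranges overlap.
rows = conn.execute("""
    SELECT statname, SUM(numpers)
    FROM (SELECT s.id AS stat_id, s.name AS statname, g.adr_id,
                 MAX(LENGTH(g.gender)) AS numpers
          FROM guests g
          JOIN statistic s
            ON g.arrival <= s."to" AND g.departure >= s."from"
          GROUP BY g.adr_id, s.id, s.name)
    GROUP BY stat_id, statname
    ORDER BY stat_id
""").fetchall()

print(rows)  # [('January', 3), ('February', 7)]
```

The totals match the expected results listed in the question (3 for January, 7 for February).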