Joining/Querying Self-Referential "Grandchildren" Tables - sql

Say I have a table as following:
countries
| ID | Name | Population | Continent |
Countries is self-referential, and has an association table:
alliances
| country_id | ally_id |
I understand that I need to use the 'AS' keyword to join the table, say, as c1, c2, etc. But I can't quite wrap my head around how to go about this for grandchildren, great grandchildren, etc.
How would I write SQL, for example, to get the countries that have an ally whose ally's population is greater than 50 000 000?
I'm generating this SQL based off of models defined in code, so need to be able to support this kind of behaviour up to a user-defined depth.
Thanks!

The query for your example is
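SELECT DISTINCT c1.name
FROM countries AS c1
JOIN alliances AS a1 ON c1.id = a1.country_id
JOIN countries AS c2 ON a1.ally_id = c2.id
JOIN alliances AS a2 ON c2.id = a2.country_id
JOIN countries AS c3 ON a2.ally_id = c3.id
WHERE c3.population > 50000000;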
If this was a homework question (and it looks like one) you have found the sucker who does it for you.
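Since the question asks for a user-defined depth, here is a minimal sketch of generating the chained self-joins in code. It uses Python with an in-memory SQLite database purely for illustration. The helper names build_ally_query and countries_at_depth and the sample rows are made up, alliances is treated as directed (country_id to ally_id), and depth=1 means "an ally", depth=2 "an ally's ally", and so on.

```python
import sqlite3


def build_ally_query(depth: int) -> str:
    """Build SQL for countries whose ally chain of length `depth`
    ends in a country with population above a bound parameter.
    Hypothetical helper, not part of any library."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    joins = []
    prev = "c0.id"
    # One alliances alias per hop: a1, a2, ... each following the previous ally_id.
    for i in range(1, depth + 1):
        joins.append(f"JOIN alliances AS a{i} ON a{i}.country_id = {prev}")
        prev = f"a{i}.ally_id"
    # Only the last hop needs the countries table, to read the population.
    joins.append(f"JOIN countries AS c{depth} ON c{depth}.id = {prev}")
    return (
        "SELECT DISTINCT c0.name\n"
        "FROM countries AS c0\n"
        + "\n".join(joins)
        + f"\nWHERE c{depth}.population > ?\n"
        "ORDER BY c0.name"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT, population INTEGER, continent TEXT);
CREATE TABLE alliances (country_id INTEGER, ally_id INTEGER);
-- Made-up sample data: Aland -> Borduria -> Cascadia
INSERT INTO countries VALUES (1, 'Aland', 1000000, 'X'), (2, 'Borduria', 2000000, 'X'),
                             (3, 'Cascadia', 60000000, 'Y');
INSERT INTO alliances VALUES (1, 2), (2, 3);
""")


def countries_at_depth(depth, min_population=50_000_000):
    sql = build_ally_query(depth)
    return [row[0] for row in conn.execute(sql, (min_population,))]


depth_1 = countries_at_depth(1)  # countries with a large direct ally
depth_2 = countries_at_depth(2)  # countries with a large ally's ally
print(build_ally_query(2))
print(depth_1, depth_2)
```

The depth is only used to build alias names, so it should come from trusted code; the population threshold is passed as a bound parameter rather than formatted into the string.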

Related

SQL query without join

I have the following tables
Table food
+------------+--------------+
| Quantity   | animal_id    |
+------------+--------------+
Table Race
+------------+--------------+
| race_code  | race_name    |
+------------+--------------+
Table animal
+------------+--------------+
| animal_id  | race_code    |
+------------+--------------+
I was asked to calculate the average food quantity for every race (race_name). The challenge here is that I should not use JOIN because we have not studied it yet.
I have written the following query:
select AVG(f.quantity),r.race_name from food f, race r
group by r.race_name;
but it doesn't work as I want it to be since it returns the same average food quantity for all races. I know I have to use the animal table to link the other 2 but I didn't know how. I should be using subqueries
That question is exactly the same as your previous, where you had to use SUM (instead of AVG). No difference at all.
Ah, sorry - it wasn't you, but your school colleague, here
Saying that you "didn't learn joins", well - what do you call what you posted here, then? That's a cross join and will produce Cartesian product, once you fix the error you got by not including non-aggregated column into the group by clause and include additional joins required to return desired result.
The "old" syntax is
select r.race_name,
avg(f.quantity) avg_quantity
from race r, animal a, food f
where a.race_code = r.race_code
and f.animal_id = a.animal_id
group by r.race_name;
What you "didn't learn yet" does the same, but looks differently:
from race r join animal a on a.race_code = r.race_code
join food f on f.animal_id = a.animal_id
The rest of the query remains the same.
Nowadays, you should use JOINs to join tables, and put conditions into the WHERE clause. For example, condition would be that you want to calculate averages for donkeys only. As you don't have it, you don't need it.
You still have to do some matching of related rows. If not explicitly with JOIN, you can do it in the WHERE clause, i.e. something like
select AVG(f.quantity),r.race_name
from food f, race r, animal a
where f.animal_id = a.animal_id and a.race_code = r.race_code
group by r.race_name;
select r.race_name,
(select avg(f.quantity)
from food f
where f.animal_id in (select a.animal_id from animal a where a.race_code = r.race_code))
from race r;
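To make the difference between these variants concrete, here is a small sketch that runs them against made-up rows. It uses Python's sqlite3 module only so the example is self-contained; the sample animals and quantities are assumptions, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE race (race_code INTEGER, race_name TEXT);
CREATE TABLE animal (animal_id INTEGER, race_code INTEGER);
CREATE TABLE food (quantity REAL, animal_id INTEGER);
-- Made-up sample data: two dogs and one cat
INSERT INTO race VALUES (1, 'Dog'), (2, 'Cat');
INSERT INTO animal VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO food VALUES (4, 10), (6, 11), (3, 20);
""")

# The query from the question: no matching condition, so every food row
# is paired with every race and each race gets the overall average.
cross = conn.execute("""
SELECT r.race_name, AVG(f.quantity)
FROM food f, race r
GROUP BY r.race_name
ORDER BY r.race_name
""").fetchall()

# Matching rows in the WHERE clause (the "old" comma syntax).
matched = conn.execute("""
SELECT r.race_name, AVG(f.quantity)
FROM race r, animal a, food f
WHERE a.race_code = r.race_code
  AND f.animal_id = a.animal_id
GROUP BY r.race_name
ORDER BY r.race_name
""").fetchall()

# The same result with correlated subqueries and no join at all.
subquery = conn.execute("""
SELECT r.race_name,
       (SELECT AVG(f.quantity)
        FROM food f
        WHERE f.animal_id IN (SELECT a.animal_id
                              FROM animal a
                              WHERE a.race_code = r.race_code))
FROM race r
ORDER BY r.race_name
""").fetchall()

print(cross)
print(matched)
print(subquery)
```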

Aggregating or Bundle a Many to Many Relationship in SQL Developer

So I have 1 single table with 2 columns : Sales_Order called ccso, Arrangement called arrmap
The table has distinct values for this combination and both these fields have a Many to Many relationship
1 ccso can have Multiple arrmap
1 arrmap can have Multiple ccso
All such combinations should be considered as one single bundle
Objective :
Assign a final map to each of the Sales Order as the Largest Arrangement in that Bundle
Example:
ccso : 100-10015 has 3 arrangements --> Now each of those arrangements have a set of Sales Orders --> Now those sales orders will also have a list of other arrangements and so on
(Image : 1)
Therefore the answer definitely points to something that checks recursively. I've managed to write the code below, and it works as long as I hard-code a ccso in the WHERE clause, but I don't know how to proceed from there. (I'm an accountant by profession but have recently found more passion in coding.) I've searched the forums and web for things like
Recursive CTEs,
many to many aggregation
cartesian product etc
and I'm sure there must be a term for this which I don't know yet. I've also tried
I have to use sqldeveloper or googlesheet query and filter formulas
sqldeveloper has restrictions on some CTEs. If recursive is the way, I'd like to know how, and whether I can control the depth to, say, 4 or 5 iterations.
Ideally I'd want to update a third column with the final map if possible but if not, then a select query result is just fine
Codes I've tried
Code 1: As per Screenshot
WITH a1(ccso, amap) AS
(SELECT distinct a.ccso, a.arrmap
FROM rg_consol_map2 A
WHERE a.ccso = '100-10115' -- this condition defines the ultimate ancestors in your chain, change it as appropriate
UNION ALL
SELECT m.ccso, m.arrmap
FROM rg_consol_map2 m
JOIN a1
ON M.arrmap = a1.amap -- or m.ccso=a1.ccso
) /*if*/ CYCLE amap SET nemap TO 1 /*else*/ DEFAULT 0
SELECT DISTINCT amap FROM (SELECT ccso, amap FROM a1 ORDER BY 1 DESC) WHERE ROWNUM = 1
In this the main challenge is how to remove the hardcoded ccso and do a join for each of the ccso
Code 2 : Manual CTEs for depth
Here again the join outside the CTE gives me an error and sqldeveloper does not allow WITH clause with UPDATE statement - only works for select and cannot be enclosed within brackets as subtable
SELECT distinct ccso FROM
(
WITH ar1 AS
(SELECT distinct arrmap
FROM rg_consol_map
WHERE ccso = a.ccso
)
,so1 AS
(SELECT DISTINCT ccso
FROM rg_consol_map
WHERE arrmap IN (SELECT arrmap FROM ar1)
)
,ar2 AS
(SELECT DISTINCT ccso FROM rg_consol_map
where arrmap IN (select distinct arrmap FROM rg_consol_map
WHERE ccso IN (SELECT ccso FROM so1)
))
SELECT ar1.arrmap, NULL ccso FROM ar1
union all
SELECT null, ar2.ccso FROM ar2
UNION ALL
SELECT NULL arrmap, so1.ccso FROM so1
)
Am I Missing something here or is there an easier way to do this? I read something about MERGE and PROC SQL JOIN but was unable to get them to work but if that's the way to go ahead I will try further if someone can point me in the direction
(Image : 2)
(CSV File : [3])
Edit : Fixing CSV file link
https://github.com/karan360note/karanstackoverflow.git
I suppose can be downloaded from here IC mapping many to many.csv
Oracle 11g version is being used
Apologies in advance for the wall of text.
Your problem is a complex, multi-layered many-to-many query; there is no "easy" solution, because this is not an ideal design choice. The safest bet literally involves multiple layers of CTEs or subqueries to reach all the depths you want, as the only ways I know to do so recursively rely on an anchor column (like "parentID") to direct the recursion in a linear fashion. We don't have that option here; we'd go in circles without a way to track our path.
Therefore, I went basic, and with several subqueries. Every level checks for a) All orders containing a particular ARRMAP item, and then b) All additional items on those orders. It's clear enough for you to see the logic and modify to your needs. It will generate a new table that contains the original CCSO, the linking ARRMAP, and the related CCSO. Link: https://pastebin.com/un70JnpA
This should enable you to go back and perform the desired updates you want, based on order # or order date, etc... in a much more straightforward fashion. Once you have an anchor column, a CTE in the future is much more trivial (just search for "CTE recursion tree hierarchy").
SELECT DISTINCT
CCSO, RELATEDORDER
FROM myTempTable
WHERE CCSO = '100-10115'; /* to find all orders by CCSO, query SELECT DISTINCT RELATEDORDER */
--WHERE ARRMAP = 'ARR10524'; /* to find all orders by ARRMAP, query SELECT DISTINCT CCSO */
EDIT:
To better explain what this table generates, let me simplify the problem.
If you have order
A with arrangements 1 and 2;
B with arrangement 2, 3; and
C with arrangement 3;
then, by your initial inquiry and image, order A should be related to orders B and C, right? The query generates the following table when you SELECT DISTINCT ccso, relatedOrder:
+------+--------------+
| CCSO | RelatedOrder |
+------+--------------+
| A    | B            |
| A    | C            |
+------+--------------+
| B    | C            |
| B    | A            |
+------+--------------+
| C    | A            |
| C    | B            |
+------+--------------+
You can see here if you query WHERE CCSO = 'A' OR RelatedOrder = 'A', you'll get the same relationships, just flipped between the two columns.
+------+--------------+
| CCSO | RelatedOrder |
+------+--------------+
| A    | B            |
| A    | C            |
+------+--------------+
| B    | A            |
+------+--------------+
| C    | A            |
+------+--------------+
So query only CCSO or RelatedOrder.
As for the results of WHERE CCSO = '100-10115', see image here, which includes all the links you showed in your Image #1, as well as additional depths of relations.
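For the simplified A/B/C example, one possible way around the "going in circles" problem is a recursive CTE whose branches are combined with UNION, so pairs that were already found are not added again. What follows is only a sketch in SQLite syntax, run from Python; it is not Oracle 11g code and would need adapting there. The extra order D, the arrangement names, and reading "largest" as MAX(arrmap) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rg_consol_map2 (ccso TEXT, arrmap TEXT);
-- Simplified example from the answer, plus an unrelated order D
INSERT INTO rg_consol_map2 VALUES
  ('A', 'ARR1'), ('A', 'ARR2'),
  ('B', 'ARR2'), ('B', 'ARR3'),
  ('C', 'ARR3'),
  ('D', 'ARR9');
""")

final_maps = conn.execute("""
WITH RECURSIVE reach(root, ccso) AS (
    -- every order reaches itself
    SELECT DISTINCT ccso, ccso FROM rg_consol_map2
    UNION  -- UNION (not UNION ALL) drops repeated pairs, so cycles terminate
    -- follow order -> arrangement -> other orders sharing that arrangement
    SELECT r.root, m2.ccso
    FROM reach r
    JOIN rg_consol_map2 m1 ON m1.ccso = r.ccso
    JOIN rg_consol_map2 m2 ON m2.arrmap = m1.arrmap
)
SELECT r.root AS ccso, MAX(m.arrmap) AS final_map
FROM reach r
JOIN rg_consol_map2 m ON m.ccso = r.ccso
GROUP BY r.root
ORDER BY r.root
""").fetchall()

print(final_maps)
```

Every order in the same bundle ends up with the same final_map, which is what the "largest arrangement in that bundle" objective asks for under the assumptions above.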

Stuck on beginner SQL practice. Multiple table where columns use same id

I'm very sorry to bother with minor problem, but I tried to search old answers for this one and since my skills in SQL are complete 0, I didn't even understand the answers :/! Neither is my English terminology great enough for properly searching.
I have these 2 tables: Cities and Flights.
Cities
+----+-------------+
|id | name |
+----+-------------+
|1 | Oslo |
|2 | New York |
|3 | Hong Kong |
+----+-------------+
Flights
+----+--------------------+-------------------+
|id | wherefrom_id | whereto_id |
+----+--------------------+-------------------+
|1 | 3 | 2 |
|2 | 3 | 1 |
|3 | 1 | 3 |
+----+--------------------+-------------------+
Now I have to write code where I need to make city ID's merge to wherefrom_id and whereto_id, in that manner that the answer shows table where you can see list of Flights (FROM/TO).
Example:
ANSWER:
+-----------+----------------+
|HONG KONG | NEW YORK |
+-----------+----------------+
|HONG KONG | OSLO |
+-----------+----------------+
|OSLO | HONG KONG |
+-----------+----------------+
This is what I wrote:
SELECT C.name, C.name
FROM Cities C, Flights F
WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id;
For some reason this doesn't seem to work and nothing shows up in my practice program. There is no error or anything; it just doesn't show anything on the test answer. I really hope you get what I mean, English is not my first language and I truly tried my best to make it as clear as possible :S
First things first - it's a lot easier to code in standard SQL join syntax. Converting your above to that is
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
The question you've been asked requires logic people don't usually use at first so it can be confusing the first time you encounter it.
I will run through the logic jump in a moment.
Imagine your Flights table has the City names in it (not IDs).
It would have columns, say, FlightID, From_City_Name, To_City_Name.
An example row would be 1, 'Oslo', 'Prague'.
Getting the data for this would be easy e.g., SELECT Flight_ID, From_City_Name, To_City_name FROM Flights.
However, this has many problems. As your question has done, you decide to pull out the cities into their own reference tables.
For this first example, however, you decided to have two extra tables as reference tables: From_City and To_City. These would both have an ID and city name. You then change your Flights to refer to these.
Your code would look like
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN From_City AS FC ON F.From_City_ID = FC.ID
INNER JOIN To_City AS TC ON F.To_City_ID = TC.ID
Notice how there are two joins there - one to From_City and one to To_City? That is because the From and To cities are referring to different things in the data.
So, then the final part of the issue: why have two city tables (from and to). Why not have one? Well, you can. If you create just one table, and modify the above, you get something like this:
SELECT F.ID, FC.Name AS From_City, TC.Name AS To_City
FROM Flights AS F
INNER JOIN City AS FC ON F.From_City_ID = FC.ID
INNER JOIN City AS TC ON F.To_City_ID = TC.ID
Note that all that has changed is that the From_City and To_City references have been pointed to a different table City. However, the rest is the same.
And that, actually, would be your answer. The complex part that most people don't get to straight away, is having two joins to the same table.
As an aside, your original code is technically valid.
SELECT C.name, C.name
FROM Cities C
INNER JOIN Flights F ON C.id = F.wherefrom_id AND C.id = F.whereto_id;
However, what it's effectively saying is to get the city names where the From_City is the same as the To_City - which is obviously not what you want (unless you're looking for turnbacks).
What you're doing is an old SQL way of expressing joins. The standard now has better ways to declare the relationships within the from clause and I take it that your material has postponed that slightly:
There are people who will yell at you for using this ancient syntax but the answer is easy enough:
SELECT C1.name, C2.name
FROM Cities C1, Cities C2, Flights F
WHERE C1.id = F.wherefrom_id AND C2.id = F.whereto_id
You can think of this as creating a "cross product" of all city-pair combinations and keeping the ones that match actual flights. The key is to reference Cities twice by using different aliases (or correlation names).
I think this is what you are looking for..
SELECT wf.name "wherefrom", wt.name "whereto"
FROM Flights f
JOIN Cities wf
ON f.wherefrom_id = wf.id
JOIN Cities wt
ON f.whereto_id = wt.id
order by f.id
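To see the two-alias idea in action, here is a short sketch that loads the Cities and Flights rows from the question into an in-memory SQLite database (Python's sqlite3 is used only to make it runnable) and compares the original single-alias query with the two-alias version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cities (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Flights (id INTEGER PRIMARY KEY, wherefrom_id INTEGER, whereto_id INTEGER);
INSERT INTO Cities VALUES (1, 'Oslo'), (2, 'New York'), (3, 'Hong Kong');
INSERT INTO Flights VALUES (1, 3, 2), (2, 3, 1), (3, 1, 3);
""")

# Original attempt: one alias, so a single city row must be both
# the origin and the destination of the same flight.
single_alias = conn.execute("""
SELECT C.name, C.name
FROM Cities C, Flights F
WHERE C.id = F.wherefrom_id AND C.id = F.whereto_id
""").fetchall()

# Two aliases: one copy of Cities for the origin, one for the destination.
two_aliases = conn.execute("""
SELECT wf.name, wt.name
FROM Flights f
JOIN Cities wf ON f.wherefrom_id = wf.id
JOIN Cities wt ON f.whereto_id = wt.id
ORDER BY f.id
""").fetchall()

print(single_alias)
print(two_aliases)
```

With this data no flight starts and ends in the same city, which is why the single-alias query comes back empty.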

Selecting more after group-by while using join

At the moment I am busy with two tables, Students and Classes. These two both contain a column project_group, a way to categorize multiple students from one class into smaller groups.
In the Students table there is a column City that states in which town/city students live, from the rows that have been filled there are already several cities occurring multiple times. The code I used to check how many times a city is being showed is this:
SELECT City, count(*)
FROM Students
GROUP BY City
Now the next thing I want to do is show per class in which cities the students live and how many live there, so for example a result like:
A | - | 2
A | New York | 3
A | Los Angeles | 1
B | - | 1
B | Miami | 2
B | Seattle | 1
Students and Classes can join each other on the column project_group, but what I'm mostly interested in is using both the GROUP BY mentioned earlier and the JOIN, and also showing the results per class.
Thanks in advance,
KRAD
I'm not sure what the column name is for A and B in your example. I'm assuming Classes.Class in the following:
SELECT
C.Class
, S.City
, COUNT(*) AS Count
FROM
Classes AS C INNER JOIN
Students AS S ON C.Project_Group = S.Project_Group
GROUP BY
C.Class
, S.City
I managed to get it working. While doing some tests to see which exact error message it was that I got, I used this and managed to get it working. I now get an overview per class that shows how many people live in which city. This is the code used.
SELECT class_id, city, count(*) AS amount
FROM students, classes
WHERE students.project_group = classes.project_group
GROUP BY class_id, city
ORDER BY class_id
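For anyone wanting to try the grouping, here is a small sketch using Python's sqlite3 module. The table layout (class_id and project_group in classes; city and project_group in students) is inferred from the queries above, and the rows are invented. A missing city is stored as NULL, which GROUP BY keeps as its own group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes (class_id TEXT, project_group TEXT);
CREATE TABLE students (name TEXT, city TEXT, project_group TEXT);
-- Made-up sample data: class A has two project groups, class B has one
INSERT INTO classes VALUES ('A', 'g1'), ('A', 'g2'), ('B', 'g3');
INSERT INTO students VALUES
  ('s1', 'New York', 'g1'), ('s2', 'New York', 'g1'),
  ('s3', 'New York', 'g2'), ('s4', 'Los Angeles', 'g2'), ('s5', NULL, 'g2'),
  ('s6', 'Miami', 'g3'), ('s7', 'Miami', 'g3'), ('s8', 'Seattle', 'g3');
""")

# Join on project_group, then count students per (class, city) pair.
per_class_city = conn.execute("""
SELECT c.class_id, s.city, COUNT(*) AS amount
FROM classes AS c
JOIN students AS s ON s.project_group = c.project_group
GROUP BY c.class_id, s.city
ORDER BY c.class_id, s.city
""").fetchall()

for row in per_class_city:
    print(row)
```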

PostgreSQL: Self-referencing, flattening join to table which contains tree of objects

I have a relatively large (as in >10^6 entries) table called "things" which represent locateable objects, e.g. countries, areas, cities, streets, etc. They are used as a tree of objects with a fixed depth, so the table structure looks like this:
id
name
type
continent_id
country_id
city_id
area_id
street_id
etc.
The association inside "things" is 1:n, i.e. a street or area always belongs to a defined city and country (not two or none); the column city_id for example contains the id of the "city" thing for all the objects which are inside that city. The "type" column contains the type of thing (Street, City, etc) as a string.
This table is referenced in another table "actions" as "thing_id". I am trying to generate a table of action location statistics showing the number of active and inactive actions a given location has. A simple JOIN like
SELECT count(nullif(actions.active, 1)) AS icount,
count(nullif(actions.active, 0)) AS acount,
things.name AS name, things.id AS thing_id, things.city_id AS city_id
FROM "actions"
LEFT JOIN things ON actions.thing_id = things.id
WHERE UPPER(substring(things.name, 1, 1)) = UPPER('A')
AND actions.datetime_at BETWEEN '2012-09-26 19:52:14' AND '2012-10-26 22:00:00'
GROUP BY things.name, things.id ORDER BY things.name
will give me a list of "things" (starting with 'A') which have actions associated with them and their active and inactive count like this:
icount | acount | name                     | thing_id | city_id
-------+--------+--------------------------+----------+--------
0      | 5      | Brooklyn, New York City  | 25       | 23
1      | 0      | Manhattan, New York City | 24       | 23
3      | 2      | New York City            | 23       | 23
Now I would like to
only consider "city" things (that's easy: filter by type in "things"), and
in the active/inactive counts, use the sum of all actions happening in this city - regardless of whether the action is associated with the city itself or something inside the city (= having the same city_id). With the same dataset as above, the new query should result in
icount | acount | name                     | thing_id | city_id
-------+--------+--------------------------+----------+--------
4      | 7      | New York City            | 23       | 23
I do not need the thing_id in this table (since it would not be unique anyway), but since I do need the city's name (for display), it is probably just as easy to also output the ID, then I don't have to change as much in my code.
How would I have to modify the above query to achieve this? I'd like to avoid additional trips to the database, and advanced SQL features such as procedures, triggers, views and temporary tables, if possible.
I'm using Postgres 8.3 with Ruby 1.9.3 on Rails 3.0.14 (on Mac OS X 10.7.4).
Thank you! :)
You need to count actions for all things in the city in an independent subquery and then join to a limited set of things:
SELECT c.icount
      ,c.acount
      ,t.name
      ,t.id AS thing_id
      ,t.city_id
FROM  (
   SELECT t.city_id
         ,count(nullif(a.active, 1)) AS icount
         ,sum(a.active) AS acount
   FROM   things t
   LEFT   JOIN actions a ON a.thing_id = t.id
                        -- the date range belongs to actions, not things
                        AND a.datetime_at BETWEEN '2012-09-26 19:52:14'
                                              AND '2012-10-26 22:00:00'
   WHERE  t.city_id = 23 -- to restrict results to one city
   GROUP  BY t.city_id
   ) c -- counts per city
JOIN   things t USING (city_id)
WHERE  t.name ILIKE 'A%'
ORDER  BY t.name, t.id;
I also simplified a number of other things in your query and used table aliases to make it easier to read.
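As a rough check of the "aggregate per city in a subquery, then join back" idea, here is a sketch against the three sample rows from the question, run in SQLite through Python rather than in Postgres. Several things are assumptions: the action rows are invented to reproduce the counts shown, the subquery is joined back on t.id = c.city_id (relying on the city's own row having id equal to city_id, as in the sample output), the type filter replaces the name filter, and ILIKE is left out because SQLite does not have it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT, type TEXT, city_id INTEGER);
CREATE TABLE actions (thing_id INTEGER, active INTEGER, datetime_at TEXT);
INSERT INTO things VALUES
  (23, 'New York City', 'City', 23),
  (24, 'Manhattan, New York City', 'Area', 23),
  (25, 'Brooklyn, New York City', 'Area', 23);
""")

# Invented actions matching the question's counts:
# Brooklyn 5 active, Manhattan 1 inactive, the city itself 3 inactive + 2 active.
in_range = "2012-10-01 12:00:00"
rows = [(25, 1, in_range)] * 5 + [(24, 0, in_range)]
rows += [(23, 0, in_range)] * 3 + [(23, 1, in_range)] * 2
rows.append((23, 1, "2012-01-01 00:00:00"))  # outside the date range, must be ignored
conn.executemany("INSERT INTO actions VALUES (?, ?, ?)", rows)

city_stats = conn.execute("""
SELECT c.icount, c.acount, t.name, t.id AS thing_id
FROM (
    -- counts per city, over every thing that carries that city_id
    SELECT t.city_id,
           count(nullif(a.active, 1)) AS icount,
           count(nullif(a.active, 0)) AS acount
    FROM things t
    JOIN actions a ON a.thing_id = t.id
    WHERE a.datetime_at BETWEEN ? AND ?
    GROUP BY t.city_id
) c
JOIN things t ON t.id = c.city_id  -- the city's own row
WHERE t.type = 'City'
ORDER BY t.name
""", ("2012-09-26 19:52:14", "2012-10-26 22:00:00")).fetchall()

print(city_stats)
```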