SQL combined SELECT statement (for SQL zoo) - sql

I was using SQL zoo for brushing up my SQL knowledge and found the following problem:
"Some countries have populations more than three times that of any of
their neighbours (in the same continent). Give the countries and
continents."
The solution I have put down for this is:
SELECT name, continent
FROM world w
WHERE NOT EXISTS (
SELECT *
FROM world nx
WHERE nx.continent = w.continent
AND nx.population <= 3*w.population)
The interpreter says I have "too few columns" (this is problem 8 on SQL zoo). I am not sure what is incorrect here. Any suggestions or help would be appreciated.

That page has a "Show correct result" button, which should tell you exactly what's incorrect about the results, if not the SQL statement you used.
When I use your SQL statement for that problem, I get the right columns, but the wrong rows, so I must assume you've typo'd somewhere.
One correct answer for that question is:
SELECT name, continent
FROM world w
WHERE population IS NOT NULL
AND NOT EXISTS (
SELECT * FROM world x
WHERE x.continent = w.continent
AND x.population IS NOT NULL
AND x.name != w.name
AND x.population > w.population/3
)
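To see why the query from the question returns the wrong rows, here is a minimal sketch in Python using the standard sqlite3 module. The table contents are invented for illustration (they are not SQL zoo data), and the comparison is written as 3 * x.population >= w.population, which is an adaptation to sidestep SQLite's integer division.

```python
import sqlite3

# Hypothetical miniature "world" table; names and numbers are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [("Big", "A", 1000), ("Mid", "A", 300), ("Small", "A", 100),
     ("X", "B", 500), ("Y", "B", 400)],
)

# Query from the question: each country matches itself in the subquery
# (population <= 3 * population), so NOT EXISTS is never true.
original_rows = conn.execute("""
    SELECT name, continent FROM world w
    WHERE NOT EXISTS (
        SELECT * FROM world nx
        WHERE nx.continent = w.continent
          AND nx.population <= 3 * w.population)
""").fetchall()

# Corrected logic: exclude the country itself and look for a neighbour
# that is NOT more than three times smaller.
fixed_rows = conn.execute("""
    SELECT name, continent FROM world w
    WHERE w.population IS NOT NULL
      AND NOT EXISTS (
        SELECT * FROM world x
        WHERE x.continent = w.continent
          AND x.population IS NOT NULL
          AND x.name != w.name
          AND 3 * x.population >= w.population)
""").fetchall()

print(original_rows)
print(fixed_rows)
```

With this data the original query returns nothing, while the corrected one returns only the country whose population exceeds three times that of every neighbour.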

Related

using result of SELECT in another SELECT

I'm trying to do question 9 here:
https://sqlzoo.net/wiki/SELECT_within_SELECT_Tutorial
I currently have the code:
SELECT continent, SUM(y.population) as Population
FROM world AS y
GROUP BY(y.continent)
HAVING SUM(y.population) < 250000000;
This returns the continents whose total population is less than 250000000. I know I need to wrap this in another SELECT to make use of the continents returned, but I don't know how to do this.
I tried something like this:
SELECT A.continent from world A
INNER JOIN(
SELECT B.continent, SUM(B.population) as Population
FROM world B
GROUP BY(B.continent)
HAVING SUM(B.population) < 250000000
) ON A.continent = B.continent;
This was an attempt to get a single list of the continents, which I could then wrap in another SELECT to iterate through and print the country names, although I feel there must be a way to iterate directly through the continent column from the first example.
This is likely something pretty trivial, but regardless, any help would be great.
There are multiple ways to solve this. I compared the count of all countries on the continent with the count of countries with population <= 25000000.
Your main mistake is in the logic: instead of SUM, the condition should apply to EACH country, not to the continent's total.
select name,w1.continent,population
from world w1
join
(
SELECT distinct continent, count(name) cnt
FROM world x
WHERE population<=25000000
group by continent
)w2
on w1.continent=w2.continent
where cnt=(select count(name) from world where w1.continent=continent)
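The difference between the SUM approach and the per-country approach can be sketched with Python's sqlite3 module. The rows below are invented: continent C is chosen so that its total passes the SUM test even though none of its countries is small enough.

```python
import sqlite3

# Hypothetical data: A has only small countries, B has one large country,
# C has two mid-sized countries whose total is still under 250000000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [("a1", "A", 10000000), ("a2", "A", 20000000),
     ("b1", "B", 5000000), ("b2", "B", 300000000),
     ("c1", "C", 100000000), ("c2", "C", 100000000)],
)

# SUM-based filter from the question: tests the continent total.
sum_continents = conn.execute("""
    SELECT continent FROM world
    GROUP BY continent
    HAVING SUM(population) < 250000000
    ORDER BY continent
""").fetchall()

# Count-based filter from the answer: every country must be <= 25000000.
per_country = conn.execute("""
    SELECT w1.name, w1.continent, w1.population
    FROM world w1
    JOIN (SELECT continent, COUNT(name) AS cnt
          FROM world
          WHERE population <= 25000000
          GROUP BY continent) w2
      ON w1.continent = w2.continent
    WHERE w2.cnt = (SELECT COUNT(name) FROM world w3
                    WHERE w3.continent = w1.continent)
    ORDER BY w1.name
""").fetchall()

print(sum_continents)
print(per_country)
```

Here the SUM filter keeps continents A and C, while the per-country filter keeps only the countries of A.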

Nested Select in SQL

I'm trying the 5th question in the Nested Select of SQL zoo (using Oracle engine) http://sqlzoo.net/wiki/SELECT_within_SELECT_Tutorial
Show the name and the population of each country in Europe. Show the
population as a percentage of the population of Germany.
I know the correct answer (given below), but something puzzled me.
SELECT name, CONCAT(ROUND(population/(SELECT population
FROM world WHERE name = 'Germany'),2)*100,'%')
FROM world WHERE continent = 'Europe'
When I run the following modified query, only one row (Albania) is returned.
SELECT name, population/(SELECT population
FROM world WHERE name = 'Germany')
FROM world WHERE continent = 'Europe'
Wondering if anyone can shed light on the inner workings of Oracle as to why only Albania is returned? It's puzzling to me why it doesn't work without ROUND().
The correct answer is actually:
SELECT name,
CONCAT(ROUND(population/(SELECT population FROM world WHERE name = 'Germany')*100,0),'%')
FROM world
WHERE continent = 'Europe'
Note that the result set is rounded to zero decimal places.
That said, I tried your exact code above and still get results for all countries:
SELECT name,
population/(SELECT population FROM world WHERE name = 'Germany')
FROM world
WHERE continent = 'Europe'
You're right to be puzzled as it should certainly return regardless of using ROUND() or not, but since I cannot recreate it, I can't explain it.
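For anyone who wants to experiment with the scalar subquery away from the website, here is a small sketch using Python's sqlite3 module. The populations are invented round numbers, and the 100.0 multiplier and the CAST are SQLite-specific adjustments (SQLite divides integers as integers and its ROUND returns a float), so this is not a drop-in answer for SQL zoo.

```python
import sqlite3

# Hypothetical populations (in millions); not real figures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [("Albania", "Europe", 3), ("France", "Europe", 60),
     ("Germany", "Europe", 80), ("Japan", "Asia", 120)],
)

# The inner SELECT yields a single value (Germany's population) that is
# reused for every European row, with or without ROUND().
rows = conn.execute("""
    SELECT name,
           CAST(ROUND(100.0 * population /
                (SELECT population FROM world WHERE name = 'Germany'))
                AS INTEGER) || '%'
    FROM world
    WHERE continent = 'Europe'
    ORDER BY name
""").fetchall()

print(rows)
```

Every European country comes back with its own percentage, which matches the observation that the subquery itself does not limit the output to one row.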
A more efficient Oracle query (that gets rid of the sub-query) is to use an analytic function:
SELECT name,
ROUND(
population
/ MAX( CASE name WHEN 'Germany' THEN population END ) OVER ()
* 100
) || '%'
FROM world
WHERE continent = 'Europe';
However, sqlzoo appears to use MariaDB so you can't put that query into the website (but if you recreate the table in Oracle then you can test it).

Why does changing the operator from '>=' to '>' cause a blank result?

I'm learning SQL and from a programming point of view I'm struggling to understand why this query is behaving the way it is (from SQLZOO Q6)
The question:
"Find the largest country (by area) in each continent, show the continent, the name and the area:"
SELECT continent, name, area from world a
WHERE area >= ALL
(SELECT area from world b WHERE a.continent = b.continent AND area>0)
I get the above, fairly simple nested select statement.
However, what I don't get is why changing this line causes a blank result:
WHERE area >= ALL - Change it to - WHERE area > ALL
Why does this give me a blank result?
Update: I'm using MySQL
By selecting with >, you're asking for all countries that are greater than all countries on the same continent. No country can have an area greater than that of all countries on the same continent: even if it is the biggest country on a continent, it is still not bigger than itself.
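The same reasoning can be checked in a few lines of plain Python, where all() plays the role of ALL. The country names and areas are invented; the point is only that each country appears in its own comparison set.

```python
# Hypothetical (continent, name, area) rows.
world = [
    ("P", "P1", 900), ("P", "P2", 300),
    ("Q", "Q1", 500), ("Q", "Q2", 100),
]

def largest(strict):
    """Names whose area is > (strict) or >= (non-strict) ALL areas on the same continent."""
    result = []
    for continent, name, area in world:
        # Mirrors the correlated subquery: same continent, area > 0,
        # and the country itself is included.
        peers = [a for c, _, a in world if c == continent and a > 0]
        if strict:
            ok = all(area > p for p in peers)
        else:
            ok = all(area >= p for p in peers)
        if ok:
            result.append(name)
    return result

print(largest(strict=False))
print(largest(strict=True))
```

With >= the largest country of each continent is returned; with > nothing is, because no area is strictly greater than itself.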

What do OrientDB's functions do when applied to the results of another function?

I am getting very strange behavior on 2.0-M2. Consider the following against the GratefulDeadConcerts database:
Query 1
SELECT name, in('written_by') AS wrote FROM V WHERE type='artist'
This query returns a list of artists and the songs each has written; a majority of the rows have at least one song.
Query 2
Now try:
SELECT name, count(in('written_by')) AS num_wrote FROM V WHERE type='artist'
On my system (OSX Yosemite; Orient 2.0-M2), I see just one row:
name num_wrote
---------------------------
Willie_Cobb 224
This seems wrong. But I tried to better understand. Perhaps the count() causes the in() to look at all written_by edges...
Query 3
SELECT name, in('written_by') FROM V WHERE type='artist' GROUP BY name
Produces results similar to the first query.
Query 4
Now try count()
SELECT name, count(in('written_by')) FROM V WHERE type='artist' GROUP BY name
Wrong path -- So try LET variables...
Query 5
SELECT name, $wblist, $wbcount FROM V
LET $wblist = in('written_by'),
$wbcount = count($wblist)
WHERE type='artist'
Produces seemingly meaningless results:
You can see that the $wblist and $wbcount columns are inconsistent with one another, and the $wbcount values don't show any obvious progression like a cumulative result.
Note that the strange behavior is not limited to count(). For example, first() does similarly odd things.
count(), as in an RDBMS, is an aggregate: it collapses all the records into a single value. For your purpose, .size() seems the right method to call:
in('written_by').size()
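As a rough analogy in plain Python (this is not OrientDB code, and the data is invented), the difference between an aggregate and a per-record size looks like this. What OrientDB's count() counts exactly when handed a collection is not claimed here; the sketch only contrasts "one value for the whole result" with "one value per row".

```python
# Hypothetical artist -> songs mapping standing in for in('written_by').
written_by = {
    "artist_a": ["song1", "song2"],
    "artist_b": [],
    "artist_c": ["song3"],
}

# Aggregate style: the whole result set collapses into a single number
# (here simply the number of records).
aggregate_count = len(written_by)

# size() style: one number per record.
per_artist = {name: len(songs) for name, songs in written_by.items()}

print(aggregate_count)
print(per_artist)
```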

Oracle: '= ANY()' vs. 'IN ()'

I just stumbled upon something in ORACLE SQL (not sure if it's in others), that I am curious about. I am asking here as a wiki, since it's hard to try to search symbols in google...
I just found that when checking a value against a set of values you can do
WHERE x = ANY (a, b, c)
As opposed to the usual
WHERE x IN (a, b, c)
So I'm curious, what is the reasoning for these two syntaxes? Is one standard and one some oddball Oracle syntax? Or are they both standard? And is there a preference of one over the other for performance reasons, or ?
Just curious what anyone can tell me about that '= ANY' syntax.
ANY (or its synonym SOME) is syntactic sugar for EXISTS with a simple correlation:
SELECT *
FROM mytable
WHERE x <= ANY
(
SELECT y
FROM othertable
)
is the same as:
SELECT *
FROM mytable m
WHERE EXISTS
(
SELECT NULL
FROM othertable o
WHERE m.x <= o.y
)
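The EXISTS form can be tried out with Python's sqlite3 module (SQLite has no ANY operator, so only the EXISTS version is run as SQL). The table contents are invented, and Python's any() stands in for the ANY form as a cross-check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (x INTEGER)")
conn.execute("CREATE TABLE othertable (y INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?)", [(1,), (5,), (10,)])
conn.executemany("INSERT INTO othertable VALUES (?)", [(3,), (4,)])

# EXISTS with a simple correlation, as in the answer above.
exists_rows = [r[0] for r in conn.execute("""
    SELECT x FROM mytable m
    WHERE EXISTS (SELECT NULL FROM othertable o WHERE m.x <= o.y)
    ORDER BY x
""")]

# Plain-Python reading of "x <= ANY (SELECT y FROM othertable)".
ys = [3, 4]
any_rows = [x for x in (1, 5, 10) if any(x <= y for y in ys)]

print(exists_rows, any_rows)
```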
With the equality condition on a not-nullable field, it becomes similar to IN.
All major databases, including SQL Server, MySQL and PostgreSQL, support this keyword.
IN - equal to any member of the list
ANY - true if the comparison holds for at least one value returned by the subquery
ALL - true if the comparison holds for every value returned by the subquery
<ANY() - less than the maximum
>ANY() - more than the minimum
=ANY() - equivalent to IN
>ALL() - more than the maximum
<ALL() - less than the minimum
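These shortcuts can be sanity-checked in plain Python, with any() and all() standing in for ANY and ALL. The sketch assumes a non-empty set with no NULLs; with an empty set or NULLs the min/max shortcuts no longer match.

```python
# Arbitrary non-empty, NULL-free set of values.
values = [3, 7, 5]

checks = []
for x in range(0, 11):
    checks.append((
        any(x < v for v in values) == (x < max(values)),   # <ANY  ~ less than the maximum
        any(x > v for v in values) == (x > min(values)),   # >ANY  ~ more than the minimum
        any(x == v for v in values) == (x in values),      # =ANY  ~ IN
        all(x > v for v in values) == (x > max(values)),   # >ALL  ~ more than the maximum
        all(x < v for v in values) == (x < min(values)),   # <ALL  ~ less than the minimum
    ))

all_hold = all(all(row) for row in checks)
print(all_hold)
```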
eg:
Find the employees who earn the same salary as the minimum salary for each department:
SELECT last_name, salary,department_id
FROM employees
WHERE salary IN (SELECT MIN(salary)
FROM employees
GROUP BY department_id);
Employees who are not IT Programmers and whose salary is less than that of any IT programmer:
SELECT employee_id, last_name, salary, job_id
FROM employees
WHERE salary <ANY
(SELECT salary
FROM employees
WHERE job_id = 'IT_PROG')
AND job_id <> 'IT_PROG';
Employees whose salary is less than the salary of all employees with a job ID of IT_PROG and whose job is not IT_PROG:
SELECT employee_id,last_name, salary,job_id
FROM employees
WHERE salary <ALL
(SELECT salary
FROM employees
WHERE job_id = 'IT_PROG')
AND job_id <> 'IT_PROG';
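On an engine without ANY/ALL, such as SQLite, the last two queries can be approximated with MAX and MIN. The sketch below uses Python's sqlite3 module and an invented employees table. One caveat: if there were no IT_PROG rows, MAX/MIN would return NULL and no rows would match, whereas <ALL over an empty set is true for every row.

```python
import sqlite3

# Hypothetical employees; names and salaries are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, last_name TEXT, "
    "salary INTEGER, job_id TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Ann", 9000, "IT_PROG"), (2, "Bob", 6000, "IT_PROG"),
     (3, "Cy", 4200, "IT_PROG"), (4, "Dee", 5000, "ST_CLERK"),
     (5, "Eve", 3000, "ST_CLERK"), (6, "Fay", 12000, "AD_VP")],
)

# salary <ANY (IT_PROG salaries)  ~  salary < the highest IT_PROG salary
any_rows = [r[0] for r in conn.execute("""
    SELECT last_name FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees WHERE job_id = 'IT_PROG')
      AND job_id <> 'IT_PROG'
    ORDER BY last_name
""")]

# salary <ALL (IT_PROG salaries)  ~  salary < the lowest IT_PROG salary
all_rows = [r[0] for r in conn.execute("""
    SELECT last_name FROM employees
    WHERE salary < (SELECT MIN(salary) FROM employees WHERE job_id = 'IT_PROG')
      AND job_id <> 'IT_PROG'
    ORDER BY last_name
""")]

print(any_rows, all_rows)
```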
Hope it helps.
-Noorin Fatima
To put it simply and quoting from O'Reilly's "Mastering Oracle SQL":
"Using IN with a subquery is functionally equivalent to using ANY, and returns TRUE if a match is found in the set returned by the subquery."
"We think you will agree that IN is more intuitive than ANY, which is why IN is almost always used in such situations."
Hope that clears up your question about ANY vs IN.
I believe that what you are looking for is this:
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/opt_ops.htm#1005298
(Link found on Eddie Awad's Blog)
To sum it up here:
last_name IN ('SMITH', 'KING', 'JONES')
is transformed into
last_name = 'SMITH' OR last_name = 'KING' OR last_name = 'JONES'
while
salary > ANY (:first_sal, :second_sal)
is transformed into
salary > :first_sal OR salary > :second_sal
The optimizer transforms a condition that uses the ANY or SOME operator followed by a subquery into a condition containing the EXISTS operator and a correlated subquery.
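The list expansions above can be checked by hand with Python's sqlite3 module. The table and the two bind values are invented, and the OR chain for > ANY is written out manually because SQLite does not accept ANY; this only shows that the expanded forms select the same rows, not what any optimizer does internally.

```python
import sqlite3

# Hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("SMITH", 800), ("KING", 5000), ("ADAMS", 1100),
     ("JONES", 2975), ("FORD", 3000)],
)

# IN list versus the equivalent chain of OR-ed equality tests.
in_rows = [r[0] for r in conn.execute(
    "SELECT last_name FROM employees "
    "WHERE last_name IN ('SMITH', 'KING', 'JONES') ORDER BY last_name")]
or_rows = [r[0] for r in conn.execute(
    "SELECT last_name FROM employees "
    "WHERE last_name = 'SMITH' OR last_name = 'KING' OR last_name = 'JONES' "
    "ORDER BY last_name")]

# salary > ANY (:first_sal, :second_sal), expanded into OR by hand.
first_sal, second_sal = 1000, 3000
any_rows = [r[0] for r in conn.execute(
    "SELECT last_name FROM employees "
    "WHERE salary > ? OR salary > ? ORDER BY last_name",
    (first_sal, second_sal))]

print(in_rows, or_rows, any_rows)
```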
The ANY syntax allows you to write things like
WHERE x > ANY(a, b, c)
or even
WHERE x > ANY(SELECT ... FROM ...)
Not sure whether there actually is anyone on the planet who uses ANY (and its brother ALL).
A quick google found this http://theopensourcery.com/sqlanysomeall.htm
ANY allows you to use an operator other than =; in most other respects (apart from special cases for NULLs) it acts like IN. You can think of IN as ANY with the = operator.
This is a standard. The SQL 1992 standard states
8.4 <in predicate>
[...]
<in predicate> ::=
<row value constructor>
[ NOT ] IN <in predicate value>
[...]
2) Let RVC be the <row value constructor> and let IPV be the <in predicate value>.
[...]
4) The expression
RVC IN IPV
is equivalent to
RVC = ANY IPV
So in fact, the <in predicate> behaviour definition is based on the 8.7 <quantified comparison predicate>. In other words, Oracle correctly implements the SQL standard here.
Perhaps one of the linked articles points this out, but isn't it true that when looking for a match (=) the two return the same thing. However, if looking for a range of answers (>, <, etc) you cannot use "IN" and would have to use "ANY"...
I'm a newb, forgive me if I've missed something obvious...
MySQL explains ANY pretty well in its documentation:
The ANY keyword, which must follow a comparison operator, means
“return TRUE if the comparison is TRUE for ANY of the values in the
column that the subquery returns.” For example:
SELECT s1 FROM t1 WHERE s1 > ANY (SELECT s1 FROM t2);
Suppose that there is a row in table t1 containing (10). The
expression is TRUE if table t2 contains (21,14,7) because there is a
value 7 in t2 that is less than 10. The expression is FALSE if table
t2 contains (20,10), or if table t2 is empty. The expression is
unknown (that is, NULL) if table t2 contains (NULL,NULL,NULL).
https://dev.mysql.com/doc/refman/5.5/en/any-in-some-subqueries.html
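The four cases in that quote can be reproduced with a small Python model of SQL's three-valued logic, using None for NULL/unknown. The function name gt_any is made up for this sketch.

```python
def gt_any(x, values):
    """Model of `x > ANY (values)`: returns True, False, or None (unknown)."""
    saw_unknown = False
    for v in values:
        if x is None or v is None:
            # Comparing with NULL yields unknown, not false.
            saw_unknown = True
        elif x > v:
            # One true comparison is enough: TRUE OR anything is TRUE.
            return True
    # No comparison was true: unknown if any was unknown, else false.
    return None if saw_unknown else False

print(gt_any(10, [21, 14, 7]))         # a smaller value (7) exists
print(gt_any(10, [20, 10]))            # nothing smaller than 10
print(gt_any(10, []))                  # empty set
print(gt_any(10, [None, None, None]))  # only NULLs
```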
Also Learning SQL by Alan Beaulieu states the following:
Although most people prefer to use IN, using = ANY is equivalent to
using the IN operator.
I always use = ANY because in some Oracle or MSSQL versions an IN list is limited to 1000/999 elements, while = ANY () is not limited to 1000.
Nobody likes their SQL query crashing a web request.
So there is a practical difference.
The second reason is that it is the more modern form, as it pairs with expressions like > ALL (...).
The third reason is that, for me as a non-native English speaker, "any" and "all" read more naturally than IN.