Unexpected result of SQL query with ALL in PostgreSQL - sql

I have a very simple table countries(name, gdp, continent) with data I collected from Wikipedia (actually, doesn't really matter). If I want to know how many countries each continent has with
SELECT continent, COUNT(*) AS cnt
FROM countries
GROUP BY continent;
I get the following result:
continent | cnt
---------------+-----
Africa | 56
Asia | 46
South America | 12
North America | 22
Europe | 46
Oceania | 14
So I have, for each continent, the corresponding countries. In any case, there are only 196 rows, so the data is very small.
Now I want to use a query to get, for each continent, the country with the largest GDP. The query is also very simple and looks like this:
SELECT name, continent, gdp
FROM countries c1
WHERE gdp >= ALL (SELECT gdp
FROM countries c2
WHERE c2.continent = c1.continent);
However, the result I get is:
name | continent | gdp
----------------------------+---------------+----------------
Australia | Oceania | 1748000000000
Brazil | South America | 1810000000000
People's Republic of China | Asia | 19910000000000
United States | North America | 25350000000000
In short, the corresponding countries for Europe and Africa are missing. Looking at the data, the European country with the highest GDP is Germany, but I don't understand why it's not part of the result set.
As a test, if I run the query
SELECT continent, MAX(GDP) AS max_gdp
FROM countries
GROUP BY continent;
I correctly get 6 GDP values, one for each continent (incl. the correct value for Germany):
continent | max_gdp
---------------+----------------
Africa | 498060000000
Asia | 19910000000000
South America | 1810000000000
North America | 25350000000000
Europe | 4319000000000
Oceania | 1748000000000
Why is the ALL query missing the 2 rows for the European and African countries?

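The problem turned out to be NULL values, as a_horse_with_no_name suspected. If the subquery returns a NULL gdp for a continent, then gdp >= ALL (...) can never be true for any row of that continent: the comparison with NULL yields NULL rather than true, so the row is filtered out.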
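Here is a minimal sketch of the NULL effect, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. SQLite has no ALL operator, so the sketch shows the underlying comparison with NULL directly and then a NULL-safe rewrite with MAX, which skips NULLs. The "Country X" row with an unknown GDP is hypothetical; the other two rows use values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT, continent TEXT, gdp INTEGER)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [
        ("Germany", "Europe", 4319000000000),
        ("Country X", "Europe", None),  # hypothetical row with unknown GDP
        ("Brazil", "South America", 1810000000000),
    ],
)

# Comparing any value with NULL yields NULL (None in Python), never true.
null_comparison = conn.execute("SELECT 4319000000000 >= NULL").fetchone()[0]

# NULL-safe rewrite: MAX() ignores NULLs, so Germany is found again.
top_per_continent = conn.execute(
    """
    SELECT name, continent, gdp
    FROM countries c1
    WHERE gdp = (SELECT MAX(gdp)
                 FROM countries c2
                 WHERE c2.continent = c1.continent)
    ORDER BY continent
    """
).fetchall()

print(null_comparison)
print(top_per_continent)
```

In PostgreSQL itself, the original ALL query can be kept by adding AND c2.gdp IS NOT NULL to the subquery.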
Note that there is a more performant way to write this query in PostgreSQL:
SELECT DISTINCT ON (continent)
name, continent, gdp
FROM countries
ORDER BY continent, gdp DESC NULLS LAST;
For each continent, that will output the first row in ORDER BY order.

Related

Sum of column depending on values

Can you let me know how to write a query that outputs the sum of amount based on column values (order, Continent and Country)? Also, I want to show all Continent values as a single value (North America).
Example table,
ID Code Continent Country amount
----------------------------------------------------
1 1 North America NULL NULL
2 1 America USA 10
3 1 NA USA 10
4 1 Unknown USA 10
5 2 North America NULL NULL
6 2 America Canada 15
7 2 NA Canada 15
8 2 Unknown Canada 15
9 3 North America NULL NULL
10 3 America Mexico 20
11 3 NA Mexico 20
12 3 Unknown Mexico 20
Output
ID Code Continent Country SumAmount
----------------------------------------------
1 1 North America USA 30
2 2 North America Canada 45
3 3 North America Mexico 60
I have tried to approach it like
select ID, Code, case when Continent != 'North America' then Continent = 'North America' end as Continent, Country, sum(Amount) as SumAmount
from Table group by ID, Continent, Country
or maybe I need to make a query like this and work with this query below?
select ID, Code, Continent, Country, sum(Amount) as SumAmount
from Table where Continent !='North America'
But it is not working. How should I do this?
I would appreciate any other approaches; they would probably be better than mine.
The awkward design here (relations with no real indication of such other than the shared Code column) is going to lead to suboptimal queries like this
DECLARE @ContinentToReport varchar(32) = 'North America';
;WITH x AS
(
SELECT Code FROM dbo.TableName
WHERE Continent = @ContinentToReport
AND Country IS NULL
)
SELECT ID = ROW_NUMBER() OVER (ORDER BY x.Code),
x.Code,
Continent = @ContinentToReport,
t.Country,
SumAmount = SUM(t.amount)
FROM dbo.TableName AS t
INNER JOIN x ON t.Code = x.Code
WHERE t.Country IS NOT NULL
GROUP BY x.Code, t.Country
ORDER BY x.Code;
Output (though I made a guess at what ID means and why it's different from the ID in the source, and I find the Continent column kind of redundant since it will always be the same):
ID Code Continent Country SumAmount
----------------------------------------------
1 1 North America USA 30
2 2 North America Canada 45
3 3 North America Mexico 60
Example db<>fiddle
The simplest query which returns the correct result seems to be something like this
select row_number() over (order by Code) ID,
Code,
'North America' Continent,
Country,
sum(amount) SumAmount
from dbo.TableName
where Country is not null
group by Code, Country
order by Code;
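As a rough check of the simpler query, here is a sketch that loads the example rows from the question into an in-memory SQLite database via Python's sqlite3 module (a stand-in for SQL Server, so the dbo. schema prefix is dropped). The row_number() column is left out to keep the sketch independent of window-function support; everything else follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName "
    "(ID INTEGER, Code INTEGER, Continent TEXT, Country TEXT, amount INTEGER)"
)
# Example rows from the question: one header row per Code with NULL Country.
rows = []
next_id = 1
for code, country, amount in [(1, "USA", 10), (2, "Canada", 15), (3, "Mexico", 20)]:
    rows.append((next_id, code, "North America", None, None))
    for label in ("America", "NA", "Unknown"):
        next_id += 1
        rows.append((next_id, code, label, country, amount))
    next_id += 1
conn.executemany("INSERT INTO TableName VALUES (?, ?, ?, ?, ?)", rows)

# Skip the header rows and sum the remaining amounts per Code and Country.
sums = conn.execute(
    """
    SELECT Code, 'North America' AS Continent, Country, SUM(amount) AS SumAmount
    FROM TableName
    WHERE Country IS NOT NULL
    GROUP BY Code, Country
    ORDER BY Code
    """
).fetchall()

for row in sums:
    print(row)
```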
dbFiddle

How to sum values if I don't have column to group it by

I need to sum sales grouped by country, but I have to group them manually because I don't have any other way.
Unfortunately, I don't have the column 'continent', but there are not too many countries on the list so I can do it manually. I can't create any new columns in the table, so I need to do it in a query.
For example:
country | sum of sales
Germany 1000
Italy 500
Canada 700
UK 1300
USA 3000
I would like to see the total sales for Europe and North America
continent | sum of sales
Europe 2800
North America 3700
You should be able to combine a CASE expression with an IN predicate, something along these lines:
SELECT CASE
WHEN country in ('Germany', 'Italy', 'UK') THEN 'Europe'
WHEN country in ('Canada', 'USA') THEN 'North America'
END as continent,
sum("sum of sales")
FROM table
GROUP BY 1
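To try the idea on the sample data from the question, here is a small sketch using Python's sqlite3 module. The table name sales and the column name sales are assumptions made for the sketch, and Italy is listed under Europe so that the totals match the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "sales" (table and column) are hypothetical names chosen for this sketch.
conn.execute("CREATE TABLE sales (country TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Germany", 1000), ("Italy", 500), ("Canada", 700), ("UK", 1300), ("USA", 3000)],
)

# Map each country to a continent inline, then group by that computed label.
totals = conn.execute(
    """
    SELECT CASE
             WHEN country IN ('Germany', 'Italy', 'UK') THEN 'Europe'
             WHEN country IN ('Canada', 'USA') THEN 'North America'
           END AS continent,
           SUM(sales) AS total_sales
    FROM sales
    GROUP BY continent
    ORDER BY continent
    """
).fetchall()

print(totals)
```

Any country missing from both lists would end up in a group with a NULL continent, so an ELSE branch may be worth adding for real data.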

Move the extra entry to a new row in SQL Server

I have a table.
Country| Continent| City
----------------------------
USA Americas [{1,New York}]
Chile Americas [{2,Santiago}]
England Europe [{3,London},{4,Bristol}]
I want to move the extra entry to a new row in SQL Server, and the output needs to look like this:
----------------------------
Country| Continent| City
----------------------------
USA Americas [{1,New York}]
Chile Americas [{2,Santiago}]
England Europe [{3,London}]
England Europe [{4,Bristol}]
Try this
;WITH CTE(Country, Continent, City)
AS
(
SELECT 'USA' ,'Americas' ,'[{1,New York}]' UNION ALL
SELECT 'Chile' ,'Americas' ,'[{2,Santiago}]' UNION ALL
SELECT 'England','Europe' ,'[{3,London},{4,Bristol}]'
)
SELECT Country,
Continent,
QUOTENAME(IIF(RIGHT(Split.a.value('.','nvarchar(1000)'),1)<>'}',Split.a.value('.','nvarchar(1000)')+'}' ,Split.a.value('.','nvarchar(1000)')))
AS City
FROM
(
SELECT Country,Continent,CAST('<S>'+ REPLACE(REPLACE(REPLACE(City,'[',''),']',''),'},','</S><S>' )+'</S>' AS XML) AS City
FROM CTE
)AS A
CROSS APPLY City.nodes('S') AS Split(a)
Demo Result : http://rextester.com/AWWN39720

SQL statement to list and count countries in SQL

Let's say I have some rows like this in a table:
SWEDEN
MEXICO
USA
SWEDEN
GERMANY
RUSSIA
MEXICO
SWEDEN
Now I need to create a script to count the countries and list them like this:
Country Amount of countries
SWEDEN 3
USA 1
MEXICO 2
RUSSIA 1
GERMANY 1
I'm stuck at:
SELECT Country
FROM dbo.Customers
How do I only show them once and create a row and count them?
Thanks a lot..
try
SELECT Country, count(*)
FROM dbo.Customers
group by Country
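A quick way to see this in action is the following sketch, which loads the rows from the question into an in-memory SQLite database through Python's sqlite3 module. The table is called Customers without the dbo. prefix, and the ORDER BY is an addition to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?)",
    [(c,) for c in ["SWEDEN", "MEXICO", "USA", "SWEDEN",
                    "GERMANY", "RUSSIA", "MEXICO", "SWEDEN"]],
)

# GROUP BY collapses equal Country values; COUNT(*) counts the rows in each group.
counts = conn.execute(
    """
    SELECT Country, COUNT(*) AS amount
    FROM Customers
    GROUP BY Country
    ORDER BY amount DESC, Country
    """
).fetchall()

for country, amount in counts:
    print(country, amount)
```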

SQL help with MAX query

I have a table of countries named bbc(name, region, area, population, gdp)
I want a table with the region, name and population of the largest (most populated) countries by region. So far I've tried this:
SELECT region, name, MAX(population)
FROM bbc
GROUP BY region
It gave me an error message : ORA-00979: Not a GROUP BY Expression
I tried to change to GROUP BY region, name, but it doesn't give me the right table
You can use analytics for queries like that:
SELECT name, region, population
FROM (SELECT region, name, population
, MAX(population) OVER (PARTITION BY region) maxpop
FROM bbc)
WHERE population = maxpop;
The inline view gives you a table that looks like your base table, plus an extra column with the max population for the region. Your top-level select gives you the country, region and population of the largest country in each region.
To illustrate with a limited example:
SELECT * FROM bbc;
REGION NAME POPULATION
--------------- ------- ----------
North America USA 300000000
North America Canada 100000000
North America Mexico 50000000
South America Brazil 50000000
South America Argentina 40000000
South America Venezuela 20000000
Add the analytic function:
SELECT region, NAME, population
, MAX(population) OVER (PARTITION BY region) maxpop
FROM bbc;
REGION NAME POPULATION MAXPOP
--------------- ------- ---------- ----------
North America USA 300000000 300000000
North America Canada 100000000 300000000
North America Mexico 50000000 300000000
South America Brazil 50000000 50000000
South America Argentina 40000000 50000000
South America Venezuela 20000000 50000000
Then the finished product:
NAME REGION POPULATION
------- --------------- -----------
USA North America 300000000
Brazil South America 50000000
One more edit. You can avoid a nested select, but not a subquery:
SELECT NAME, region, population
FROM bbc
WHERE (region, population) IN
(SELECT region, MAX(population)
FROM bbc
group by region);
Here's the easiest and shortest way to do it. Since Oracle supports tuple testing, the code can be made shorter.
First, get the max population on each region:
SELECT region, MAX(population)
FROM bbc
GROUP BY region
Then test the countries against it:
select region, name, population
from bbc
where (region, population) in
(SELECT region, MAX(population)
FROM bbc
GROUP BY region)
order by region
If you want to support many RDBMS, use EXISTS:
select region, name, population
from bbc o
where exists
(SELECT null -- neutral. doesn't invoke Cargo Cult Programming ;-)
FROM bbc
WHERE region = o.region
GROUP BY region
HAVING o.population = MAX(population) )
order by region
Query tested here, both have similar output: http://sqlzoo.net/0.htm
http://www.ienablemuch.com/2010/05/why-is-exists-select-1-cargo-cult.html
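In the vast majority of cases, the ORA-00979 error is caused because a non-aggregated column is not included in the GROUP BY clause. In this case you need to include name as well in your GROUP BY clause. Also, you should not be calling the MAX function in your FROM statement. Note that this only removes the error: as the question already observed, grouping by region and name returns one row per country, not the largest country per region.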
SELECT region, name, MAX(population)
FROM bbc
GROUP BY region, name
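To compare the two behaviours, here is a small sketch on the six example rows from the earlier answer, run against SQLite through Python's sqlite3 module rather than Oracle. It assumes a SQLite version with row-value support (3.15 or later) for the (region, population) IN (...) form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bbc (region TEXT, name TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO bbc VALUES (?, ?, ?)",
    [
        ("North America", "USA", 300000000),
        ("North America", "Canada", 100000000),
        ("North America", "Mexico", 50000000),
        ("South America", "Brazil", 50000000),
        ("South America", "Argentina", 40000000),
        ("South America", "Venezuela", 20000000),
    ],
)

# Grouping by region AND name: every country is its own group.
grouped = conn.execute(
    "SELECT region, name, MAX(population) FROM bbc GROUP BY region, name"
).fetchall()

# Tuple test against the per-region maximum: one row per region.
largest = conn.execute(
    """
    SELECT region, name, population
    FROM bbc
    WHERE (region, population) IN
          (SELECT region, MAX(population) FROM bbc GROUP BY region)
    ORDER BY region
    """
).fetchall()

print(len(grouped))
print(largest)
```

If two countries in a region tie for the maximum population, the tuple test returns both of them.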