SQL unknown columns?

I'm new to SQL and am having difficulty with this query:
The average elevation of each Texas county which contains one or more zip code with a population over 20,000.
This is what I have so far:
SELECT county, zip_code, state, AVG(elevation)
FROM zip_codes
WHERE state=’TX’
GROUP BY county
HAVING population > 20000
ORDER BY county;
ERROR 1054 (42S22): Unknown column '’TX’' in 'where clause'
But when I select state from the zip_codes table, the column is there.

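You're using the wrong single quotes. It should be
WHERE state = 'TX'
instead of
WHERE state = ’TX’
The database does not recognize ’ as a string delimiter, so it reads ’TX’ as a column name, hence the "Unknown column" error. Curly quotes like these usually come from an editor such as MS Word, which turns straight quotes into "fancy" ones. Don't use them in SQL.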
Apart from that, columns that aren't aggregated should all be part of the GROUP BY clause:
group by county, zip_code, state

Fixing the quote issues is a must-have, as answered by Littlefoot.
However, your query is still invalid Oracle SQL, since:
the select clause has two non-aggregated columns that are not part of the group by clause (zip_code and state)
same goes for column population in the having clause, that is neither aggregated nor part of the group by clause
For this assignment:
The average elevation of each Texas county which contains one or more zip code with a population over 20,000.
Assuming that each record in the table corresponds to a different zip_code (which seems plausible, given that the table itself is called zip_codes), you could phrase it as:
select county, state, avg(elevation) avg_elevation
from zip_codes
where state= 'TX'
group by county, state
having max(population) > 20000
order by county;
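To sanity-check the HAVING approach, here is a minimal sketch in Python using the standard sqlite3 module. The sample rows are made up for illustration (they are not from the question), and SQLite only stands in for the asker's database, so take it as a rough check rather than a definitive test.

```python
import sqlite3

# In-memory database with a hypothetical zip_codes table (columns assumed from the question).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zip_codes ("
    "zip_code TEXT, county TEXT, state TEXT, population INTEGER, elevation REAL)"
)

# Invented sample data: Dallas has one zip code above 20,000, El Paso has none.
rows = [
    ("75001", "Dallas", "TX", 25000, 600.0),
    ("75002", "Dallas", "TX", 5000, 500.0),
    ("79901", "El Paso", "TX", 15000, 3700.0),
    ("79902", "El Paso", "TX", 18000, 3800.0),
    ("90001", "Los Angeles", "CA", 50000, 100.0),
]
conn.executemany("INSERT INTO zip_codes VALUES (?, ?, ?, ?, ?)", rows)

# HAVING MAX(population) > 20000 keeps a county if at least one of its
# zip codes exceeds the threshold; the average still covers all its zip codes.
query = """
SELECT county, state, AVG(elevation) AS avg_elevation
FROM zip_codes
WHERE state = 'TX'
GROUP BY county, state
HAVING MAX(population) > 20000
ORDER BY county
"""
result = conn.execute(query).fetchall()
print(result)  # [('Dallas', 'TX', 550.0)]
```

Only Dallas is returned, and its average (550.0) includes the small zip code as well, which is what the assignment asks for.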

SELECT county, zip_code, state, AVG(elevation)
FROM zip_codes
WHERE state='TX'
AND population > 20000
GROUP BY county, zip_code, state
ORDER BY county;

Without a data sample it's difficult to understand precisely what's going on, but I'll take a stab at it.
Your assignment says you are to compute "The average elevation of each Texas county which contains one or more zip code with a population over 20,000". Let's break down the elements of the assignment you need to find:
Average elevation
Of each county in Texas
Which has one or more zip code
With population > 20,000
And we've got a table of zip codes, with county, state, and elevation.
So, OK - first problem: find all the zip codes in Texas:
SELECT *
FROM ZIP_CODES
WHERE STATE = 'TX'
OK, easy so far. Now, limit it to zip codes with a population over 20,000:
SELECT *
FROM ZIP_CODES
WHERE STATE = 'TX' AND
POPULATION > 20000
Cool - but this gives us an entry for each zip code in the county with a population > 20000. All we care about are the individual counties which have AT LEAST one zip code with a population over 20K. So how about
SELECT DISTINCT COUNTY
FROM ZIP_CODES
WHERE STATE = 'TX' AND
POPULATION > 20000
Great. Now we need to compute the average elevation in each county returned by the above query:
SELECT COUNTY, AVG(ELEVATION)
FROM ZIP_CODES
WHERE STATE = 'TX' AND
      COUNTY IN (SELECT DISTINCT COUNTY
                 FROM ZIP_CODES
                 WHERE STATE = 'TX' AND
                       POPULATION > 20000)
GROUP BY COUNTY
ORDER BY COUNTY
And there's your answer.
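If it helps to see why the subquery matters, here is a small sketch in Python with the standard sqlite3 module. The rows are invented for illustration and SQLite is only a stand-in for the real database. It compares averaging only the filtered zip codes with averaging every zip code in the qualifying counties.

```python
import sqlite3

# Hypothetical zip_codes table; column names are assumed from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zip_codes ("
    "zip_code TEXT, county TEXT, state TEXT, population INTEGER, elevation REAL)"
)
conn.executemany(
    "INSERT INTO zip_codes VALUES (?, ?, ?, ?, ?)",
    [
        ("75001", "Dallas", "TX", 25000, 600.0),   # invented rows
        ("75002", "Dallas", "TX", 5000, 500.0),
        ("79901", "El Paso", "TX", 15000, 3700.0),
    ],
)

# Filtering on population in the outer WHERE averages only the large zip codes.
filtered_only = conn.execute("""
    SELECT county, AVG(elevation)
    FROM zip_codes
    WHERE state = 'TX' AND population > 20000
    GROUP BY county
    ORDER BY county
""").fetchall()

# The subquery picks the qualifying counties; the outer query then
# averages over every zip code in those counties.
with_subquery = conn.execute("""
    SELECT county, AVG(elevation)
    FROM zip_codes
    WHERE state = 'TX' AND
          county IN (SELECT DISTINCT county
                     FROM zip_codes
                     WHERE state = 'TX' AND population > 20000)
    GROUP BY county
    ORDER BY county
""").fetchall()

print(filtered_only)   # [('Dallas', 600.0)]
print(with_subquery)   # [('Dallas', 550.0)]
```

The two results differ because the first one silently drops the 5,000-person zip code from the average.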

Related

SQL Query that reports number of universities by town in descending order

I am trying to write a SQL query that reports the universities in Cambridge, New York City, and Chicago. It is only supposed to output the Town and number of universities in each town in descending order. Not sure what the condition would look like nor what I am selecting besides town.
Here is my query so far:
SELECT Town
FROM Universities
WHERE Town = "New York City" OR Town = "Cambridge" OR Town = "Chicago"
ORDER BY Town DESC
Please help!
Based on your description, I would suggest a simple aggregate query like the following. Note that you can use IN here, and that string literals are delimited by single quotes.
select Town, Count(*) as NumberOfUniversities
from Universities
where Town in ('New York City', 'Cambridge', 'Chicago')
group by Town
order by NumberOfUniversities desc;
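A quick way to try this out is the sketch below, which runs the same query through Python's standard sqlite3 module. The university names and the row counts are placeholders invented for the example.

```python
import sqlite3

# Hypothetical Universities table with placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Universities (Name TEXT, Town TEXT)")
conn.executemany(
    "INSERT INTO Universities VALUES (?, ?)",
    [
        ("University A", "New York City"),
        ("University B", "New York City"),
        ("University C", "New York City"),
        ("University D", "Cambridge"),
        ("University E", "Cambridge"),
        ("University F", "Chicago"),
        ("University G", "Boston"),  # excluded by the IN list
    ],
)

# One row per town, with the number of universities, largest count first.
counts = conn.execute("""
    SELECT Town, COUNT(*) AS NumberOfUniversities
    FROM Universities
    WHERE Town IN ('New York City', 'Cambridge', 'Chicago')
    GROUP BY Town
    ORDER BY NumberOfUniversities DESC
""").fetchall()

print(counts)  # [('New York City', 3), ('Cambridge', 2), ('Chicago', 1)]
```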

Search for a city based off city id

I'm trying to get cities that match a particular city, based on a user search I will be implementing.
I've got a sql query below which gives the exact output I want:
Select r.City, AVG(s.Longitude) AS Longitude, AVG(s.Latitude) AS Latitude
From CafeAddress r inner join Cafe s on s.CafeId = r.CafeId
Where City = 'Mumbai'
Group By City
Current output:
City     Longitude     Latitude
Mumbai   -73.9904097   40.7036292
What I'm now trying to add is a URL-safe "id", which is essentially the city name in all lower case, with no white space or special characters.
Like below:
id       City     Longitude     Latitude
mumbai   Mumbai   -73.9904097   40.7036292
Is there a way to implement something like this?
Use LOWER to make it lowercase
Use TRIM to trim whitespace from beginning/end
Use REPLACE to replace interior spaces w/ underscore
Select REPLACE(TRIM(LOWER(r.City)),' ','_') AS id, r.City, AVG(s.Longitude) AS Longitude, AVG(s.Latitude) AS Latitude
From CafeAddress r inner join Cafe s on s.CafeId = r.CafeId
Where City = 'Mumbai'
Group By City
For example, if r.City were ' SAN JOSE ', it would return 'san_jose'.
You can daisy chain REPLACE() to get rid of special characters or use TRANSLATE()
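Here is a rough sketch of the chained-function idea, run through Python's standard sqlite3 module (SQLite happens to provide LOWER, TRIM and REPLACE too). The city_slug helper and the extra REPLACE that drops periods are my own illustration, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# LOWER -> TRIM -> REPLACE, nested as in the query above; the outer
# REPLACE additionally removes periods (an illustrative extra step).
SLUG_SQL = "SELECT REPLACE(REPLACE(TRIM(LOWER(?)), ' ', '_'), '.', '')"


def city_slug(city):
    """Return a URL-safe id for a city name (hypothetical helper)."""
    return conn.execute(SLUG_SQL, (city,)).fetchone()[0]


print(city_slug(" SAN JOSE "))  # san_jose
print(city_slug("St. Louis"))   # st_louis
print(city_slug("Mumbai"))      # mumbai
```

Each additional character you want to strip needs one more REPLACE wrapped around the expression, which is why TRANSLATE can be more convenient on databases that have it.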

Select top three records grouping by two factors

I am trying to identify the three records with the highest values grouped by two factors. I realize this question is similar to this one PostgreSQL: select top three in each group, but I can't figure out how to generalize from this example which includes a single factor, to two factors. I have tried searching stack overflow for an answer to this question beyond the one listed above and I can't find one, but perhaps I'm not searching for the correct terms.
Briefly, I'm connecting to a table with the following schema
city, country, value
I only have a single row per city, country combination, but the number of city entries I have per country is variable. For example, I have a few dozen cities for Canada, a hundred for the United States, but only two for Uzbekistan.
What I want as output is a table with the same schema, but containing only the rows with the three highest values per country. For example, if Canada has the cities and values of
{Canada, toronto, 100}, {Canada, vancouver, 80},
{Canada, montreal,112}, {Canada, calgary, 109},
{Canada, edmonton, 76}, {Canada, winnipeg, 73},
and the United States has the entries of
{us, nyc, 104}, {us, chicago, 87},
{us, boston, 98}, {us, seattle, 105},
{us, sanfran, 88}, {us, minneapolis, 84},
{us, miami, 103}, {us, houston, 112},
{us, dallas, 78}, {us, tucson, 83}
and Uzbekistan has the entries of
{uzbekistan, qarshi, 95}, {uzbekistan, gluiston, 101}
What I would like as output would be
Canada, Montreal, 112
Canada, Toronto, 100
Canada, Calgary, 109
us, houston, 112
us, seattle, 105
us, nyc, 104
uzbekistan, qarshi, 95
uzbekistan, gluiston, 101
I've tried the following query
SELECT logincity, logincountry, VAL
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY logincountry, logincity ORDER BY
val DESC) AS Row_ID
FROM a_table)
WHERE Row_ID < 4
ORDER BY logincity
But I end up with more than three cities per country.
Can someone help me out?
Thanks Stack Overflow!
I think you only need to partition by logincountry:
SELECT logincity, logincountry, VAL
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY logincountry
ORDER BY val DESC) AS Row_ID
FROM a_table ) T
WHERE Row_ID < 4
ORDER BY logincity
TIP: You will probably spot the problem if you include Row_ID in the SELECT:
SELECT logincity, logincountry, VAL, Row_ID
In your query every row gets Row_ID = 1, because each (logincountry, logincity) partition contains a single row.
TIP 2: You want the top 3 cities for each country, so country is the only partition column. The linked question is therefore the right answer: top 3 of each group, where the group is the country.
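To make the difference between the two partitions concrete, here is a small sketch in Python with the standard sqlite3 module. It uses a subset of the rows from the question and assumes the bundled SQLite is version 3.25 or newer, since older versions lack window functions. The top_n helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_table (logincountry TEXT, logincity TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO a_table VALUES (?, ?, ?)",
    [
        ("canada", "toronto", 100), ("canada", "vancouver", 80),
        ("canada", "montreal", 112), ("canada", "calgary", 109),
        ("canada", "edmonton", 76), ("canada", "winnipeg", 73),
        ("us", "nyc", 104), ("us", "chicago", 87),
        ("us", "seattle", 105), ("us", "houston", 112),
        ("uzbekistan", "qarshi", 95), ("uzbekistan", "gluiston", 101),
    ],
)


def top_n(partition_columns):
    """Run the ranking query with the given PARTITION BY list (hypothetical helper)."""
    sql = f"""
        SELECT logincountry, logincity, val
        FROM (SELECT *,
                     ROW_NUMBER() OVER (PARTITION BY {partition_columns}
                                        ORDER BY val DESC) AS row_id
              FROM a_table) t
        WHERE row_id < 4
        ORDER BY logincountry, val DESC
    """
    return conn.execute(sql).fetchall()


# Partitioning by country and city: every partition has one row, so nothing is filtered.
per_city = top_n("logincountry, logincity")

# Partitioning by country only: at most three rows survive per country.
per_country = top_n("logincountry")

print(len(per_city))  # 12
for row in per_country:
    print(row)
```

With the country-only partition, Canada and the US each keep their three highest values, and Uzbekistan keeps both of its rows.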

Select distinct values with count in PostgreSQL

This is a heavily simplified version of an SQL problem I'm dealing with. Let's say I've got a table of all the cities in the world, like this:
country   city
------------------
Canada    Montreal
Cuba      Havanna
China     Beijing
Canada    Victoria
China     Macau
I want to count how many cities each country has, so that I would end up with a table as such:
country   city_count
--------------------
Canada    50
Cuba      10
China     200
I know that I can get the distinct country values with SELECT distinct country FROM T1 and I suspect I need to construct a subquery for the city_count column. But my non-SQL brain is just telling me I need to loop through the results...
Thanks!
Assuming the only reason for a new row is a unique city:
select country, count(country) AS City_Count
from T1
group by country
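No loop or subquery is needed, since GROUP BY does the per-country bucketing. A minimal sketch with Python's standard sqlite3 module, using the five sample rows from the question and assuming the table is called T1 with columns country and city:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (country TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?)",
    [
        ("Canada", "Montreal"),
        ("Cuba", "Havanna"),
        ("China", "Beijing"),
        ("Canada", "Victoria"),
        ("China", "Macau"),
    ],
)

# One output row per distinct country, with the number of rows (cities) in it.
city_counts = conn.execute("""
    SELECT country, COUNT(country) AS city_count
    FROM T1
    GROUP BY country
    ORDER BY country
""").fetchall()

print(city_counts)  # [('Canada', 2), ('China', 2), ('Cuba', 1)]
```

The ORDER BY is only there to make the output order predictable. If the same city could appear twice for a country, COUNT(DISTINCT city) would be the safer choice.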

How do I do a DISTINCT and ORDER BY in PostgreSQL?

PostgreSQL is about to make me punch small animals. I'm doing the following SQL statement for MySQL to get a listing of city/state/countries that are unique.
SELECT DISTINCT city
, state
, country
FROM events
WHERE (city > '')
AND (number_id = 123)
ORDER BY occured_at ASC
But doing that makes PostgreSQL throw this error:
PGError: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
But if I add occured_at to the SELECT, then it kills off getting back the unique list.
Results using MySQL and first query:
BEDFORD PARK IL US
ADDISON IL US
HOUSTON TX US
Results if I add occured_at to the SELECT:
BEDFORD PARK IL US 2009-11-02 19:10:00
BEDFORD PARK IL US 2009-11-02 21:40:00
ADDISON IL US 2009-11-02 22:37:00
ADDISON IL US 2009-11-03 00:22:00
ADDISON IL US 2009-11-03 01:35:00
HOUSTON TX US 2009-11-03 01:36:00
The first set of results is what I'm ultimately trying to get with PostgreSQL.
Well, how would you expect Postgres to determine which occured_at value to use in creating the sort order?
I don't know Postgres syntax particularly, but you could try:
SELECT DISTINCT city, state, country, MAX(occured_at)
FROM events
WHERE (city > '') AND (number_id = 123) ORDER BY MAX(occured_at) ASC
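or, since PostgreSQL does not accept an aggregate next to non-aggregated columns without a GROUP BY, more likely: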
SELECT city, state, country, MAX(occured_at)
FROM events
WHERE (city > '') AND (number_id = 123)
GROUP BY city, state, country ORDER BY MAX(occured_at) ASC
That's assuming you want the results ordered by the MOST RECENT occurrence. If you want the first occurrence, change MAX to MIN.
Incidentally, your title asks about GROUP BY, but your syntax specifies DISTINCT.
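As a rough illustration of the GROUP BY variant, the sketch below uses Python's standard sqlite3 module with the rows shown in the question. SQLite is more permissive than PostgreSQL about DISTINCT with ORDER BY, so this only demonstrates the result of the GROUP BY form, not the original error. Ordering by MIN(occured_at) without selecting it is my own variation, chosen so the output has only the three columns the asker wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "city TEXT, state TEXT, country TEXT, number_id INTEGER, occured_at TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        ("BEDFORD PARK", "IL", "US", 123, "2009-11-02 19:10:00"),
        ("BEDFORD PARK", "IL", "US", 123, "2009-11-02 21:40:00"),
        ("ADDISON", "IL", "US", 123, "2009-11-02 22:37:00"),
        ("ADDISON", "IL", "US", 123, "2009-11-03 00:22:00"),
        ("ADDISON", "IL", "US", 123, "2009-11-03 01:35:00"),
        ("HOUSTON", "TX", "US", 123, "2009-11-03 01:36:00"),
    ],
)

# GROUP BY collapses the duplicates; the aggregate in ORDER BY tells the
# database which timestamp of each group decides the sort position.
unique_places = conn.execute("""
    SELECT city, state, country
    FROM events
    WHERE city > '' AND number_id = 123
    GROUP BY city, state, country
    ORDER BY MIN(occured_at) ASC
""").fetchall()

for row in unique_places:
    print(row)
# ('BEDFORD PARK', 'IL', 'US')
# ('ADDISON', 'IL', 'US')
# ('HOUSTON', 'TX', 'US')
```

This matches the first result set in the question: one row per place, ordered by first occurrence.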