subquery with NOT IN - sql

I have a table of cities like:
state city
----- ----
texas houston
texas austin
texas dallas
texas san antonio
texas beaumont
texas brownsville
texas el paso
california anaheim
california san diego
california los angeles
california oakland
california simi valley
california san francisco
I need a query to find the states that don't have a city named 'houston' or 'dallas'. My first thought was this:
select distinct state from cities where city not in ('houston', 'dallas');
but that won't work, because it still returns any state that has at least one other city. I think I need a subquery and a NOT IN of some sort.

A way you can do this is with a NOT EXISTS clause:
Select Distinct State
From Cities C1
Where Not Exists
(
Select *
From Cities C2
Where C2.City In ('houston', 'dallas')
And C1.State = C2.State
)

Another method, which may be slightly faster:
select distinct state from cities where state not in (select state from cities where city in ('houston', 'dallas'));

Select State
from Cities
group by State
having count(case when City in ('houston', 'dallas') then City end) = 0
This will return all states where the number of cities associated with that state and matching your criteria is 0 (i.e. there are no such cities associated with the state).
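To see why the row-level filter from the question fails while the grouped version works, here is a minimal sketch using Python's built-in sqlite3 module and a subset of the sample rows. SQLite and the in-memory table are assumptions for illustration; the question does not name a database.

```python
import sqlite3

# In-memory database with a subset of the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (state TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [
        ("texas", "houston"),
        ("texas", "austin"),
        ("texas", "dallas"),
        ("california", "anaheim"),
        ("california", "oakland"),
    ],
)

# Row-level filter: texas still appears because its 'austin' row passes.
naive = conn.execute(
    "SELECT DISTINCT state FROM cities "
    "WHERE city NOT IN ('houston', 'dallas') ORDER BY state"
).fetchall()

# State-level filter: count matching cities per state and keep the zeros.
grouped = conn.execute(
    "SELECT state FROM cities GROUP BY state "
    "HAVING COUNT(CASE WHEN city IN ('houston', 'dallas') THEN city END) = 0"
).fetchall()

print(naive)    # [('california',), ('texas',)]
print(grouped)  # [('california',)]
```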

Extract suburb name from address string in BigQuery

I have a table of addresses (property) from which I need to extract just the suburb name. I have another table (suburbs) that contains all of the suburb names.
I'm having a problem with multi-word suburb names, where an address matches both the full name and one of its words. I need it to match only the longest suburb name, e.g. an address with "North Bondi" should match suburb "North Bondi" and not suburb "Bondi".
I've found some examples online that use the MAX function in the join, but BigQuery won't let me use that function in the join.
I would appreciate it if someone could suggest corrections or provide guidance on other solutions (e.g. sorting the suburbs table and retrieving only one result?). Thank you!
Table: property
address
12 Smith Street Surry Hills NSW
34 Jones Street Bondi NSW
15 Sunny Road North Bondi NSW
Table: suburbs
suburb       state
Surry Hills  NSW
Bondi        NSW
North Bondi  NSW
Current code used:
Select * from ( SELECT p.address, s.suburb
FROM `property` p
JOIN `suburbs` s
ON INITCAP(p.address) LIKE CONCAT('%', INITCAP(s.suburb),' ', INITCAP(s.state), '%')
GROUP BY p.address, s.suburb
) x
join `property` p
ON p.address = x.address
where p.address is not null;
Actual result:
address                          suburb
12 Smith Street Surry Hills NSW  Surry Hills
34 Jones Street Bondi NSW        Bondi
15 Sunny Road North Bondi NSW    Bondi
15 Sunny Road North Bondi NSW    North Bondi
Desired result:
address                          suburb
12 Smith Street Surry Hills NSW  Surry Hills
34 Jones Street Bondi NSW        Bondi
15 Sunny Road North Bondi NSW    North Bondi
Try this:
select v1.address,
string_agg(v1.addr_part, " ") as suburb,
from (
select t.address,
addr_part,
from property t
cross join unnest(split(t.address, " ")) addr_part WITH OFFSET AS ofst
where ofst > 2
qualify row_number() over(partition by t.address order by ofst desc) > 1
) v1
group by v1.address
;
But note that this approach assumes:
The first 3 words of every address do not belong to the suburb name;
Every state is a single word.
Regarding "I've found some examples online that use the MAX function in the join but Bigquery won't let me use that function in the join":
You can use a window function instead of MAX and keep only the longest matching suburb per address:
Select * from ( SELECT p.address, s.suburb
FROM `property` p
JOIN `suburbs` s
ON INITCAP(p.address) LIKE CONCAT('%', INITCAP(s.suburb),' ', INITCAP(s.state), '%')
) x
QUALIFY RANK() OVER (PARTITION BY address ORDER BY LENGTH(suburb) DESC) = 1
Query results:
address                          suburb
12 Smith Street Surry Hills NSW  Surry Hills
15 Sunny Road North Bondi NSW    North Bondi
34 Jones Street Bondi NSW        Bondi
Consider the approach below:
SELECT p.address,
STRING_AGG(s.suburb ORDER BY LENGTH(s.suburb) DESC LIMIT 1) suburb
FROM `property` p
JOIN `suburbs` s
ON INITCAP(p.address) LIKE CONCAT('%', INITCAP(s.suburb),' ', INITCAP(s.state), '%')
GROUP BY p.address
Applied to the sample data in your question, this returns one row per address with the longest matching suburb.
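The longest-match idea can be tried outside BigQuery as well. The sketch below is an assumption-laden port to SQLite through Python's sqlite3 module: SQLite has neither QUALIFY nor INITCAP, so the row number is filtered in an outer query and the match relies on LIKE, which is case-insensitive for ASCII in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property (address TEXT);
CREATE TABLE suburbs (suburb TEXT, state TEXT);
INSERT INTO property VALUES
  ('12 Smith Street Surry Hills NSW'),
  ('34 Jones Street Bondi NSW'),
  ('15 Sunny Road North Bondi NSW');
INSERT INTO suburbs VALUES
  ('Surry Hills', 'NSW'), ('Bondi', 'NSW'), ('North Bondi', 'NSW');
""")

# Number the candidate suburbs per address, longest name first, then keep
# only the first one. Window functions need SQLite 3.25 or newer.
rows = conn.execute("""
SELECT address, suburb
FROM (
  SELECT p.address AS address,
         s.suburb AS suburb,
         ROW_NUMBER() OVER (
           PARTITION BY p.address ORDER BY LENGTH(s.suburb) DESC
         ) AS rn
  FROM property p
  JOIN suburbs s
    ON p.address LIKE '%' || s.suburb || ' ' || s.state || '%'
)
WHERE rn = 1
ORDER BY address
""").fetchall()

for row in rows:
    print(row)
```

Unlike RANK, ROW_NUMBER returns a single row even if two matching suburbs have the same length, at the cost of picking one of them arbitrarily.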

SELECT with multiple PRIMARY KEY

I have 3 tables:
nation (name PRIMARY KEY);
city (name PRIMARY KEY, nation REFERENCES nation(name))
overflight (number, city, PRIMARY KEY (number, city))
The overflight table content is something like below:
AA11 city1
AA11 city2
BB22 city1
BB22 city3
etc.
I need to select only the overflights that don't have a city from a certain nation in the city field.
I've tried with:
SELECT number
FROM overflight
JOIN city ON overflight.city = city.name
WHERE overflight.city NOT IN (
SELECT name FROM city WHERE nation = some_nation
)
GROUP BY number;
but it doesn't work: it only drops the rows whose city is in some_nation, so an overflight that also has a row for a city outside some_nation is still listed. How can I display only the overflights that have no city in some_nation at all?
I hope I've explained my problem clearly.
EDIT
This is exact content of overflight table:
AZ 7255 Rome
AZ 7255 Milan
AZ 608 Rome
AZ 608 New York
AA 1 New York
AA 1 Los Angeles
BA 2430 New York
BA 2430 Los Angeles
Suppose that I want to show the overflights that don't fly over any city in Italy. I need the result to look like this:
AA 1 New York
AA 1 Los Angeles
BA 2430 New York
BA 2430 Los Angeles
Join the tables to get the overflight numbers that do have a city from the nation you want to exclude, then use the NOT IN operator to select all the other overflights:
SELECT * FROM overflight
WHERE number NOT IN (
SELECT o.number
FROM overflight o INNER JOIN city c
ON o.city = c.name
WHERE c.nation = 'Italy'
)
Results:
number   city
AA 1     New York
AA 1     Los Angeles
BA 2430  New York
BA 2430  Los Angeles
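The exclusion query can be checked with a small sketch built on Python's sqlite3 module and the overflight rows from the question. The contents of the city table are not given in the question, so the nation values below ('Italy', 'USA') are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (name TEXT PRIMARY KEY, nation TEXT);
CREATE TABLE overflight (number TEXT, city TEXT, PRIMARY KEY (number, city));
-- Assumed city-to-nation mapping; only the overflight rows come from the question.
INSERT INTO city VALUES
  ('Rome', 'Italy'), ('Milan', 'Italy'),
  ('New York', 'USA'), ('Los Angeles', 'USA');
INSERT INTO overflight VALUES
  ('AZ 7255', 'Rome'), ('AZ 7255', 'Milan'),
  ('AZ 608', 'Rome'), ('AZ 608', 'New York'),
  ('AA 1', 'New York'), ('AA 1', 'Los Angeles'),
  ('BA 2430', 'New York'), ('BA 2430', 'Los Angeles');
""")

# Inner query: overflights touching at least one Italian city.
# Outer query: every overflight number that is not in that set.
numbers = [r[0] for r in conn.execute("""
SELECT DISTINCT number FROM overflight
WHERE number NOT IN (
  SELECT o.number
  FROM overflight o
  JOIN city c ON o.city = c.name
  WHERE c.nation = 'Italy'
)
ORDER BY number
""")]

print(numbers)  # ['AA 1', 'BA 2430']
```

AZ 608 is excluded even though one of its rows is New York, which is the behavior the question asks for.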

Count occurrences with exclude criteria

I have a Table
City ID
Austin 123
Austin 123
Austin 123
Austin 145
Austin 145
Chicago 12
Chicago 12
Houston 24
Houston 45
Houston 45
Now I want to count the occurrences of each city/ID pair, but only for cities that have more than one distinct ID. Since Chicago has only one ID (= 12), I am not interested in Chicago and it should not appear in the result set, which should look like this:
city Id Occurrences
Austin 123 3
Austin 145 2
Houston 24 1
Houston 45 2
I am able to get myself an overview with
select city, Id from Table
group by city, Id
But I am not sure how to select only the ones having different IDs and to count them.
Could anyone help me out here?
You can use window functions and aggregation:
select city, id, occurrences
from (
select city, id, count(*) occurrences, count(*) over(partition by city) cnt_city
from mytable
group by city, id
) t
where cnt_city > 1
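For engines without window functions, the same result can be reached with a plain subquery. The sketch below is an alternative technique, not the query above: it keeps only cities with more than one distinct id via HAVING COUNT(DISTINCT id) and then counts per city/id pair. It runs on SQLite through Python's sqlite3 module, and the table name mytable is taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (city TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Austin", 123)] * 3
    + [("Austin", 145)] * 2
    + [("Chicago", 12)] * 2
    + [("Houston", 24)]
    + [("Houston", 45)] * 2,
)

# Subquery: cities that have more than one distinct id.
# Outer query: occurrences per (city, id) for those cities only.
rows = conn.execute("""
SELECT city, id, COUNT(*) AS occurrences
FROM mytable
WHERE city IN (
  SELECT city FROM mytable
  GROUP BY city
  HAVING COUNT(DISTINCT id) > 1
)
GROUP BY city, id
ORDER BY city, id
""").fetchall()

for row in rows:
    print(row)
```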

SQL ordering cities ascending and persons descending

I am stuck on a complicated problem. I do not know which version of SQL this is (it is a school edition), but that should not matter here.
I want to order cities ascending and counts descending. By descending counts I mean that when the same city appears more than once, the row with the biggest number comes first.
I also need row numbers. I have tried SELECT ROW_NUMBER() OVER(ORDER BY COUNT(FIRST_NAME)) row with no success.
I have two tables called CUSTOMERS and EMPLOYEES. Both of them have FIRST_NAME, LAST_NAME and CITY columns.
Now I have this kind of code:
SELECT
CITY, COUNT(FIRST_NAME),
CASE WHEN COUNT(FIRST_NAME) >= 0 THEN 'CUSTOMERS'
END
FROM CUSTOMERS
GROUP BY CITY
UNION
SELECT
CITY, COUNT(FIRST_NAME),
CASE WHEN COUNT(FIRST_NAME) >= 0 THEN 'EMPLOYEES'
END
FROM EMPLOYEES
GROUP BY CITY
This SQL code gives me list like this:
CITY
NEW YORK 2 CUSTOMERS
MIAMI 1 CUSTOMERS
MIAMI 4 EMPLOYEES
LOS ANGELES 1 CUSTOMERS
CHIGACO 1 CUSTOMERS
HOUSTON 1 CUSTOMERS
DALLAS 2 CUSTOMERS
SAN JOSE 2 CUSTOMERS
SEATTLE 2 CUSTOMERS
SEATTLE 5 EMPLOYEES
BOSTON 1 CUSTOMERS
BOSTON 3 EMPLOYEES
I want it look like this:
ROW CITY
1 NEW YORK 2 CUSTOMERS
2 MIAMI 4 EMPLOYEES
3 MIAMI 1 CUSTOMERS
4 LOS ANGELES 1 CUSTOMERS
5 CHIGACO 1 CUSTOMERS
6 HOUSTON 1 CUSTOMERS
7 DALLAS 2 CUSTOMERS
8 SAN JOSE 2 CUSTOMERS
9 SEATTLE 5 EMPLOYEES
10 SEATTLE 2 CUSTOMERS
11 BOSTON 3 EMPLOYEES
12 BOSTON 1 CUSTOMERS
You can use window functions in the ORDER BY:
SELECT c.*
FROM ((SELECT CITY, COUNT(*) as cnt, 'CUSTOMERS' as WHICH
FROM CUSTOMERS
GROUP BY CITY
) UNION ALL
(SELECT CITY, COUNT(*), 'EMPLOYEES'
FROM EMPLOYEES
GROUP BY CITY
)
) c
ORDER BY MAX(cnt) OVER (PARTITION BY city) DESC,
city,
cnt DESC;
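The question also asks for row numbers. One way to add them, sketched here with Python's sqlite3 module, is ROW_NUMBER() over the combined result, ordered by city ascending and count descending as the question describes. The sample rows and first names are invented for illustration, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (first_name TEXT, city TEXT);
CREATE TABLE employees (first_name TEXT, city TEXT);
-- Hypothetical rows, not the asker's data.
INSERT INTO customers VALUES ('Ann', 'MIAMI'), ('Bob', 'BOSTON');
INSERT INTO employees VALUES
  ('Cy', 'MIAMI'), ('Di', 'MIAMI'),
  ('Ed', 'BOSTON'), ('Flo', 'BOSTON'), ('Gus', 'BOSTON');
""")

# Count per city in each table, stack the two results, then number the
# rows by city (ascending) and count (descending).
rows = conn.execute("""
SELECT ROW_NUMBER() OVER (ORDER BY city, cnt DESC) AS row_num,
       city, cnt, which
FROM (
  SELECT city, COUNT(*) AS cnt, 'CUSTOMERS' AS which
  FROM customers GROUP BY city
  UNION ALL
  SELECT city, COUNT(*), 'EMPLOYEES'
  FROM employees GROUP BY city
) c
ORDER BY row_num
""").fetchall()

for row in rows:
    print(row)
```

Replacing the CASE expression with a plain string literal, as the answer does, gives the same label without the always-true condition.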

How to show blank when value is repeated

SELECT * FROM Cities ORDER BY Country;
This is the result.
COUNTRY CITY PLACE
Italy Milan Zone_A
Italy Rome Zone_A
Italy Rome Zone_B
USA New York Zone_Q
USA Atlanta Zone_A
I would like to create a Stored Procedure that shows "blank" when the item is repeated. The final result should be the following. (Note that this rule is applied only in the first 2 columns, not in the third).
COUNTRY CITY PLACE
Italy Milan Zone_A
Rome Zone_A
Zone_B
USA New York Zone_Q
Atlanta Zone_A
If your version of MariaDB supports window functions, you can use lag():
select
case when lag(country) over(order by country, city, place) = country
then null
else country
end country,
case when lag(city) over(order by country, city, place) = city
then null
else city
end city,
place
from cities
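-- qualify the columns so the sort uses the table values, not the blanked aliases
order by
cities.country,
cities.city,
cities.place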
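A runnable sketch of the lag() approach, using Python's sqlite3 module in place of MariaDB (an assumption; window functions need SQLite 3.25 or newer). It emits an empty string instead of NULL to match the "blank" in the question, and gives the output columns distinct aliases so that the final ORDER BY sorts on the stored values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (country TEXT, city TEXT, place TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [
        ("Italy", "Milan", "Zone_A"),
        ("Italy", "Rome", "Zone_A"),
        ("Italy", "Rome", "Zone_B"),
        ("USA", "New York", "Zone_Q"),
        ("USA", "Atlanta", "Zone_A"),
    ],
)

# Compare each row with the previous one in (country, city, place) order
# and blank out the value when it repeats.
rows = conn.execute("""
SELECT
  CASE WHEN LAG(c.country) OVER (ORDER BY c.country, c.city, c.place) = c.country
       THEN '' ELSE c.country END AS country_shown,
  CASE WHEN LAG(c.city) OVER (ORDER BY c.country, c.city, c.place) = c.city
       THEN '' ELSE c.city END AS city_shown,
  c.place
FROM cities c
ORDER BY c.country, c.city, c.place
""").fetchall()

for row in rows:
    print(row)
```

Because the rows are sorted by city within each country, Atlanta comes before New York here, unlike in the unsorted listing from the question.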