Querying with an outer join in SQL - sql

I have a SQL database that contains geographic information. This database has three tables:
PostalCode
----------
Code (char(10))
StateID (uniqueidentifier)
State
-----
ID (uniqueidentifier)
Name (nvarchar(max))
CountryID (uniqueidentifier)
Country
-------
ID (uniqueidentifier)
Name
The relationship is: A country has states. States have postal codes. I'm trying to create a query where I can find all of the states in the country where a specific postal code belongs. Currently, I'm trying the following:
SELECT
s.*
FROM
[PostalCode] p,
[State] s,
[Country] c
WHERE
p.[Zip]='90028' AND
p.[StateID]=s.[ID] AND
s.[CountryID]=c.[ID]
Unfortunately, this query returns 1 record (the State record associated with California). However, I need it to return 50 records (one for each state in the United States). How do I modify this query to do this?
Thank you

You are using an INNER JOIN; you need to change your syntax to a LEFT JOIN:
SELECT s.*
FROM [State] s
LEFT JOIN [PostalCode] p
ON p.[StateID]=s.[ID]
AND p.[Zip]='90028'
LEFT JOIN [Country] c
ON s.[CountryID]=c.[ID]
You will notice that I changed to ANSI JOIN syntax instead of joining the tables with commas and a WHERE clause.
A LEFT JOIN will return all rows from the state table even if there are no matching rows in the other tables.
If you want to return all states in a country where the postal code is equal to a specific code, then you can use:
select s.*
from state s
inner join
(
SELECT s.countryid
FROM [State] s
INNER JOIN [PostalCode] p
ON p.[StateID]=s.[ID]
INNER JOIN [Country] c
ON s.[CountryID]=c.[ID]
WHERE p.[Zip]='90028'
) c
on s.countryid = c.countryid;
Or you can use:
select s1.*
from state s1
where exists (select s2.countryid
from state s2
inner join country c
on s2.countryid = c.id
inner join postalcode p
on s2.id = p.stateid
where p.zip = '90028'
and s1.countryid = s2.countryid)
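A minimal sketch for trying out the EXISTS form of the query, assuming Python's built-in sqlite3 module rather than SQL Server. The sample rows and the integer IDs are made up for illustration, and the postal code column is called Code as in the table listing (the queries above call it Zip).

```python
import sqlite3

# Hypothetical sample data; integer IDs stand in for uniqueidentifier values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Country (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE State (ID INTEGER PRIMARY KEY, Name TEXT, CountryID INTEGER);
CREATE TABLE PostalCode (Code TEXT, StateID INTEGER);
INSERT INTO Country VALUES (1, 'United States'), (2, 'Canada');
INSERT INTO State VALUES (10, 'California', 1), (11, 'Nevada', 1), (20, 'Ontario', 2);
INSERT INTO PostalCode VALUES ('90028', 10), ('89109', 11), ('M5V', 20);
""")


def states_in_same_country(conn, code):
    """Return the names of all states in the country that contains `code`."""
    rows = conn.execute(
        """
        SELECT s1.Name
        FROM State s1
        WHERE EXISTS (
            SELECT 1
            FROM State s2
            JOIN PostalCode p ON p.StateID = s2.ID
            WHERE p.Code = ?                    -- compared as a string
              AND s2.CountryID = s1.CountryID   -- same country as the outer row
        )
        ORDER BY s1.Name
        """,
        (code,),
    ).fetchall()
    return [name for (name,) in rows]


print(states_in_same_country(conn, "90028"))
```

The outer query walks every state, and the subquery only asks whether some state in the same country owns the postal code, so an unknown code simply yields an empty list.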

Related

Which Join for SQL plus query

I have 4 tables and would like to select one column from each table, but only if the department has both 'Mick' and 'Dave' working in it (it must have both names, not one or the other). But it does not seem to be working properly:
SELECT SCHOOL_NAME, TOWN, COUNTY
FROM STUDENTS
NATURAL JOIN SCHOOLS NATURAL JOIN TOWNS NATURAL JOIN
COUNTIES
WHERE FIRST_NAME IN ('Mick','Dave)
/
I'm going wrong somewhere (probably lots of places :( ). Any help would be great
Don't use NATURAL JOIN. It is an abomination, because it does not take properly declared foreign key relationships into account. It only looks at the names of columns. This can introduce really hard-to-find errors.
Second, what you want is aggregation:
select sc.SCHOOL_NAME, t.TOWN, c.COUNTY
from STUDENTS st join
SCHOOLS sc
on st.? = sc.? join
TOWNS t
on t.? = ? join
COUNTIES c
on c.? = t.?
where FIRST_NAME in ('Mick', 'Dave')
group by sc.SCHOOL_NAME, t.TOWN, c.COUNTY
having count(distinct st.first_name) = 2;
The ? are placeholders for table and column names. If you are learning SQL, it is all the more important that you understand how columns line up for joins in different tables.
A where clause can only check the values in a single row. There is a separate row for each student, so there is no way -- with just a where -- to find both students. That is where the aggregation comes in.
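The aggregation idea can be sketched end to end, assuming Python's built-in sqlite3 module and a made-up pair of tables. The column names (school_id, id) and the sample rows are assumptions for illustration, not the asker's real schema.

```python
import sqlite3

# Hypothetical schema and data, reduced to the two tables that matter here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schools (id INTEGER PRIMARY KEY, school_name TEXT);
CREATE TABLE students (first_name TEXT, school_id INTEGER);
INSERT INTO schools VALUES (1, 'North High'), (2, 'South High'), (3, 'East High');
INSERT INTO students VALUES
    ('Mick', 1), ('Dave', 1), ('Anna', 1),
    ('Mick', 2), ('Mick', 2),
    ('Dave', 3);
""")


def schools_with_all_names(conn, names):
    """Return schools that have at least one student for every name in `names`."""
    placeholders = ", ".join("?" for _ in names)
    sql = f"""
        SELECT sc.school_name
        FROM students st
        JOIN schools sc ON sc.id = st.school_id
        WHERE st.first_name IN ({placeholders})      -- row-level filter
        GROUP BY sc.id, sc.school_name
        HAVING COUNT(DISTINCT st.first_name) = ?     -- group-level check
        ORDER BY sc.school_name
    """
    rows = conn.execute(sql, (*names, len(set(names)))).fetchall()
    return [name for (name,) in rows]


print(schools_with_all_names(conn, ["Mick", "Dave"]))
```

South High has two students called Mick, which is why the check counts distinct names rather than rows.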
You need at least three join conditions, and you must properly close the string 'Dave' with a quote:
SELECT SCHOOL_NAME, TOWN, COUNTY
FROM SCHOOLS h
JOIN TOWNS t ON (t.id=h.town_id)
JOIN COUNTIES c ON (t.county_id=c.id)
WHERE EXISTS ( SELECT school_id
FROM STUDENTS s
WHERE s.first_name in ('Mick','Dave')
AND school_id = h.id
GROUP BY school_id
HAVING count(DISTINCT s.first_name) = 2
);
You can use an analytic function in a sub-query to count the students who have the name Mick or Dave for each school_id (assuming that is your identifier for a school):
SELECT SCHOOL_NAME, TOWN, COUNTY
FROM ( SELECT *
FROM (
SELECT d.*,
COUNT(
DISTINCT
CASE WHEN FIRST_NAME IN ( 'Mick', 'Dave' ) THEN FIRST_NAME END
) OVER( PARTITION BY school_id )
AS num_matched
FROM STUDENTS d
)
WHERE num_matched = 2
)
NATURAL JOIN SCHOOLS
NATURAL JOIN TOWNS
NATURAL JOIN COUNTIES;
It would also be better to use an INNER JOIN and explicitly specify the join condition rather than relying on NATURAL JOIN.

How to work in case in join condition

How can I find the city for a given ContactID? For example, if the ContactID is 123, the query should check whether the contact is P or C. If it is P, it should go to the Person table and return the City (USA); if it is C, it should go to the Company table and return the City (AUS).
NB: all tables contain thousands of records, and the City value is only known at run time.
Unless you're dynamically generating the query (i.e. using some language other than SQL to execute it) then you need to join on both tables anyway. If you're joining on both tables then there's no need for a CASE statement:
select *
from contacts co
left outer join person p
on co.contactid = p.contactid
and co.person_company = 'P'
left outer join company c
on co.contactid = c.contactid
and co.person_company = 'C'
You'll start to notice an issue here: for every column from PERSON and COMPANY, you're going to have to add some business logic to work out which table you want the information from. This can get very tiresome:
select co.contactid
, case when p.id is not null then p.name else c.name end as name
from contacts co
left outer join person p
on co.contactid = p.contactid
and co.person_company = 'P'
left outer join company c
on co.contactid = c.contactid
and co.person_company = 'C'
Your PERSON and COMPANY tables seem to have exactly the same information in them. If this is true in your actual data model then there's no need to split them up. You make the determination as to whether each entity is a person or a company in your CONTACTS table.
Creating additional tables to store data in this manner is only really helpful if you need to store additional data. Even then, I'd still put the data that means the same thing for a person or a company (i.e. name or address) in a single table.
If there's a one-to-one relationship between CONTACTID and PID and between CONTACTID and CID, which is what your sample data implies, then you have a number of additional IDs which have no value.
Lastly, you're not enforcing that only companies can go in the COMPANY table and only individuals in the PERSON table. To enforce that, the PERSON_COMPANY column needs to exist in both PERSON and COMPANY, though as a fixed string. It would be more normal to set up this data model as something like the following:
create table contacts (
id integer not null
, contact_type char(1) not null
, name varchar2(4000) not null
, city varchar2(3)
, constraint pk_contacts primary key (id)
, constraint uk_contacts unique (id, contact_type)
);
create table people (
id integer not null
, contact_type char(1) not null
, some_extra_info varchar2(4000)
, constraint pk_people primary key (id)
, constraint fk_people_contacts
foreign key (id, contact_type)
references contacts (id, contact_type)
, constraint chk_people_type check (contact_type = 'P')
);
etc.
You can LEFT JOIN all 3 tables and then use a CASE expression to select the one that you need based on the P or C value:
SELECT
CASE c.[Person/Company]
WHEN 'P' THEN p.NAME
WHEN 'C' THEN a.Name
END AS Name
FROM Contact c
LEFT JOIN Person p on p.ContactId = c.ContactId
LEFT JOIN Company a on a.ContactId = c.ContactId
Ben's answer is almost right. You might want to check that the first join has no match before doing the second one:
select c.*, coalesce(p.name, co.name) as name
from contacts c left outer join
person p
on c.contactid = p.contactid and
c.person_company = 'P' left join
company co
on c.contactid = co.contactid and
c.person_company = 'C' and
p.contactid is null;
This may not be important in your case. But in the event that the second join matches multiple rows and the first matches a single row, you might not want the additional rows in the output.
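A small runnable sketch of the conditional LEFT JOIN approach, assuming Python's built-in sqlite3 module. The table layout (a contactid and a city column in both PERSON and COMPANY) and the rows are assumptions for illustration; the 'wrong-table' rows are there to show that the type flag in the join condition keeps them out.

```python
import sqlite3

# Hypothetical tables: the flag in CONTACTS decides which table holds the city.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (contactid INTEGER PRIMARY KEY, person_company TEXT);
CREATE TABLE person (contactid INTEGER, city TEXT);
CREATE TABLE company (contactid INTEGER, city TEXT);
INSERT INTO contacts VALUES (123, 'P'), (456, 'C'), (789, 'P');
INSERT INTO person VALUES (123, 'USA'), (456, 'wrong-table');
INSERT INTO company VALUES (456, 'AUS'), (123, 'wrong-table');
""")


def city_for_contact(conn, contact_id):
    """Return the city from PERSON or COMPANY, depending on the contact type."""
    row = conn.execute(
        """
        SELECT COALESCE(p.city, co.city)
        FROM contacts c
        LEFT JOIN person p
               ON p.contactid = c.contactid AND c.person_company = 'P'
        LEFT JOIN company co
               ON co.contactid = c.contactid AND c.person_company = 'C'
        WHERE c.contactid = ?
        """,
        (contact_id,),
    ).fetchone()
    # No row means the contact does not exist; a NULL city means no detail row.
    return row[0] if row else None


print(city_for_contact(conn, 123), city_for_contact(conn, 456))
```

Because at most one of the two joins can match for a given contact, COALESCE picks whichever side is populated without a CASE per column.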

Using advance SELECT statement for SQL QUERY

I'm trying to write a SQL query that shows the name_id and name attributes for all the people who have only grown tomato (veg_grown), with the results shown in ascending order of the name attribute.
CREATE TABLE people
(
name_id# CHAR(4) PRIMARY KEY,
name VARCHAR2(20) NOT NULL,
address VARCHAR2(80) NOT NULL,
tel_no CHAR(11) NOT NULL
)
CREATE TABLE area
(
area_id# CHAR(5) PRIMARY KEY,
name_id# REFERENCES people,
area_location_adress VARCHAR2(80) NOT NULL
)
CREATE TABLE area_use
(
area_id# REFERENCES area,
veg_grown VARCHAR (20) NOT NULL
)
The veg_grown attribute has no direct relation to the people table, but the people and area_use tables are linked through the area table, so I tried using INNER JOIN like this, which confused me and didn't even work:
SELECT
name, name_id
FROM
people
INNER JOIN
area USING (name_id)
SELECT area_id
FROM area
INNER JOIN area_use USING (area_id)
WHERE veg_grown = 'tomato'
ORDER BY name ASC;
Surely there must be a way to select the name_id and name of people who have only grown tomato in a SQL query.
I will take any help or advice :) thanks
SELECT p.name, p.name_id
FROM people p
JOIN area a
ON p.name_id = a.name_id
JOIN area_use au
ON a.area_id = au.area_id
AND au.veg_grown = 'tomato'
LEFT JOIN area_use au2
ON a.area_id = au2.area_id
AND au2.veg_grown <> 'tomato'
WHERE au2.area_id IS NULL;
This will use a LEFT JOIN to find people that only grow tomatoes. To find people that grow tomatoes and possibly anything else too, remove the LEFT JOIN part and everything below it.
EDIT: If your field names contain # in the actual table, you'll need to quote the identifiers and add the #, I left them out in this sample.
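The LEFT JOIN query above checks each area on its own, so a person with one tomato-only area and a second area growing something else would still be returned. If "only grown tomato" is meant per person, one possible sketch is an EXISTS / NOT EXISTS pair, shown here with Python's built-in sqlite3 module. The sample rows are made up, and the # suffix is left out of the column names.

```python
import sqlite3

# Hypothetical data: Carol has a tomato-only area plus a separate carrot area.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (name_id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE area (area_id TEXT PRIMARY KEY, name_id TEXT REFERENCES people);
CREATE TABLE area_use (area_id TEXT REFERENCES area, veg_grown TEXT NOT NULL);
INSERT INTO people VALUES ('P001', 'Zoe'), ('P002', 'Bob'), ('P003', 'Carol'),
                          ('P004', 'Dan'), ('P005', 'Adam');
INSERT INTO area VALUES ('A1', 'P001'), ('A2', 'P002'), ('A3', 'P003'),
                        ('A4', 'P003'), ('A5', 'P005');
INSERT INTO area_use VALUES ('A1', 'tomato'), ('A2', 'tomato'), ('A2', 'carrot'),
                            ('A3', 'tomato'), ('A4', 'carrot'), ('A5', 'tomato');
""")

ONLY_TOMATO_SQL = """
SELECT p.name_id, p.name
FROM people p
WHERE EXISTS (            -- grows tomato somewhere
          SELECT 1
          FROM area a
          JOIN area_use au ON au.area_id = a.area_id
          WHERE a.name_id = p.name_id AND au.veg_grown = 'tomato')
  AND NOT EXISTS (        -- and grows nothing else anywhere
          SELECT 1
          FROM area a
          JOIN area_use au ON au.area_id = a.area_id
          WHERE a.name_id = p.name_id AND au.veg_grown <> 'tomato')
ORDER BY p.name ASC
"""

only_tomato = conn.execute(ONLY_TOMATO_SQL).fetchall()
print(only_tomato)
```

Dan has no areas at all, so the first EXISTS keeps him out; dropping it would list people who grow nothing as well.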
AFAICT you only want entries where all info is available, so there are no left/right joins.
SELECT p.name_id, p.name
FROM people p
JOIN area a
ON p.name_id = a.name_id
JOIN area_use au
ON a.area_id = au.area_id
WHERE au.veg_grown = 'tomato'
ORDER BY p.name ASC
I'm not 100% sure of your data model, but this seems to be what you're trying to do.
SELECT name, people.name_id
FROM people, area, area_use
WHERE area.area_id = area_use.area_id
AND veg_grown = 'tomato'
AND area.name_id = people.name_id
ORDER BY name ASC;

Suggest optimized query using 4 tables ( RIGHT JOIN v/s INNER JOIN & HAVING )

I have the following table structure:
table_country ==> country_id (PK) | country | status
table_department ==> department_id (PK) | department | country_id (FK)
table_province ==> province_id (PK) | province | department_id (FK)
table_district ==> district_id (PK) | district | province_id (FK)
NOTE: all tables use the InnoDB engine.
One country can have multiple departments, one department can have multiple provinces, and one province can have multiple districts. Now I need to find only those countries which have at least one district.
I have written the 2 SQL queries below. In my case, both queries return the same results. Please describe the difference between them.
Using a RIGHT JOIN:
SELECT c.country_id as id, c.country as name
FROM table_country c
RIGHT JOIN table_department d ON d.country_id=c.country_id
RIGHT JOIN table_province p ON p.department_id=d.department_id
RIGHT JOIN table_district ds ON ds.province_id=p.province_id
WHERE c.status='Active' GROUP BY (c.country_id)
Using INNER JOIN and HAVING clause:
SELECT COUNT(ds.district), c.country_id as id, c.country as name
FROM table_country c
INNER JOIN table_department d ON d.country_id = c.country_id
INNER JOIN table_province p ON p.department_id = d.department_id
INNER JOIN table_district ds ON ds.province_id = p.province_id
WHERE c.status='Active'
GROUP BY (c.country_id)
HAVING COUNT(ds.district)>0
Please tell me where these two queries differ in their results, and which one I should use, or whether I should use a different query.
Thanks in advance
I would suggest using the second of your 2 queries (with inner joins), but without the HAVING clause as it isn't required because the inner joins require that any row in the final result MUST have a row in the district table.
The first of your queries, using a more exotic series of RIGHT OUTER JOINS, ultimately produces the same outcome - but because they are outer joins it is potentially less efficient. Another way of representing your first query would be to reverse the table sequence like this:
SELECT
c.country_id AS id
, c.country AS name
, COUNT(ds.district)
FROM table_district ds
INNER JOIN table_province p ON ds.province_id = p.province_id
INNER JOIN table_department d ON p.department_id = d.department_id
INNER JOIN table_country c ON d.country_id = c.country_id
WHERE c.status='Active'
GROUP BY
c.country_id
, c.country
and hopefully when inverted like that it is clear that no result row can exist unless there is a row from the district table.
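To see why the HAVING clause adds nothing to the inner-join version, here is a small sketch using Python's built-in sqlite3 module instead of MySQL. The rows are invented: one active country with districts, one active country without any, and one inactive country. The LEFT JOIN variant is included only for contrast.

```python
import sqlite3

# Hypothetical rows: Peru has three districts, Chile has none, Bolivia is inactive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_country (country_id INTEGER PRIMARY KEY, country TEXT, status TEXT);
CREATE TABLE table_department (department_id INTEGER PRIMARY KEY, department TEXT, country_id INTEGER);
CREATE TABLE table_province (province_id INTEGER PRIMARY KEY, province TEXT, department_id INTEGER);
CREATE TABLE table_district (district_id INTEGER PRIMARY KEY, district TEXT, province_id INTEGER);
INSERT INTO table_country VALUES (1, 'Peru', 'Active'), (2, 'Chile', 'Active'), (3, 'Bolivia', 'Inactive');
INSERT INTO table_department VALUES (1, 'Lima', 1), (2, 'Cusco', 1), (3, 'Atacama', 2), (4, 'La Paz', 3);
INSERT INTO table_province VALUES (1, 'Lima', 1), (2, 'Urubamba', 2), (3, 'Copiapo', 3), (4, 'Murillo', 4);
INSERT INTO table_district VALUES (1, 'Miraflores', 1), (2, 'Barranco', 1), (3, 'Ollantaytambo', 2), (4, 'El Alto', 4);
""")

# The same join chain, parameterised only by the join keyword.
JOINS = """
FROM table_country c
{kind} JOIN table_department d ON d.country_id = c.country_id
{kind} JOIN table_province p ON p.department_id = d.department_id
{kind} JOIN table_district ds ON ds.province_id = p.province_id
WHERE c.status = 'Active'
GROUP BY c.country_id, c.country
ORDER BY c.country_id
"""


def district_counts(conn, kind):
    """Count districts per active country using INNER or LEFT joins."""
    sql = "SELECT c.country_id, c.country, COUNT(ds.district_id) " + JOINS.format(kind=kind)
    return conn.execute(sql).fetchall()


inner_rows = district_counts(conn, "INNER")
left_rows = district_counts(conn, "LEFT")
print(inner_rows)
print(left_rows)
```

With inner joins a country without districts never reaches the GROUP BY, so every surviving group already has a count above zero. Only the LEFT JOIN form produces a zero-count row that a HAVING clause would need to filter out.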

What's the best way to get related data from their ID's in a single query?

I have a table where each row has a few fields that have ID's that relate to some other data from some other tables.
Let's say it's called people, and each person has the ID of a city, state and country.
So there will be three more tables, cities, states and countries where each has an ID and a name.
When I'm selecting a person, what's the easiest way to get the names of the city, state and country in a single query?
Note: I know this is possible with joins, however as there are more related tables, the nested joins makes the query hard to read, and I'm wondering if there is a cleaner way. It should also be possible for the person to have those fields empty.
Assuming the following tables:
create table People
(
ID int not null primary key auto_increment
,FullName varchar(255) not null
,StateID int
,CountryID int
,CityID int
)
;
create table States
(
ID int not null primary key auto_increment
,Name varchar(255) not null
)
;
create table Countries
(
ID int not null primary key auto_increment
,Name varchar(255) not null
)
;
create table Cities
(
ID int not null primary key auto_increment
,Name varchar(255) not null
)
;
With the Following Data:
insert into Cities(Name) values ('City 1'),('City 2'),('City 3');
insert into States(Name) values ('State 1'),('State 2'),('State 3');
insert into Countries(Name) values ('Country 1'),('Country 2'),('Country 3');
insert into People(FullName,CityID,StateID,CountryID) values ('Has Nothing' ,null,null,null);
insert into People(FullName,CityID,StateID,CountryID) values ('Has City' , 1,null,null);
insert into People(FullName,CityID,StateID,CountryID) values ('Has State' ,null, 2,null);
insert into People(FullName,CityID,StateID,CountryID) values ('Has Country' ,null,null, 3);
insert into People(FullName,CityID,StateID,CountryID) values ('Has Everything', 3, 2, 1);
Then this query should give you what you are after.
select
P.ID
,P.FullName
,Ci.Name as CityName
,St.Name as StateName
,Co.Name as CountryName
from People P
left Join Cities Ci on Ci.ID = P.CityID
left Join States St on St.ID = P.StateID
left Join Countries Co on Co.ID = P.CountryID
JOINS are the only way to really do this.
You might be able to change your schema, but the problem will be the same regardless.
(A City is always in a State, which is always in a Country - so the Person could just have a reference to the city_id rather than all three. You still need to join the 3 tables though).
There is no cleaner way than joins. If the fields are allowed to be empty, use outer joins
SELECT c.*, s.name AS state_name
FROM customer c
LEFT OUTER JOIN state s ON s.id = c.state
WHERE c.id = 10
According to the description of the schema that you have given you will have to use JOINS in a single query.
SELECT
p.first_name
, p.last_name
, c.name as city
, s.name as state
, co.name as country
FROM people p
LEFT OUTER JOIN city c
ON p.city_id = c.id
LEFT OUTER JOIN state s
ON p.state_id = s.id
LEFT OUTER JOIN country co
ON p.country_id = co.id;
The LEFT OUTER JOIN will allow you to fetch details of person even if some IDs are blank or empty.
Another way is to redesign your lookup tables. A city is always in a state and a state in a country. Hence your city table will have columns : Id, Name and state_id. Your state table will be : Id, Name and country_id. And country table will remain the same : Id and Name.
The person table will now have only 1 id : city_id
Now your query will be :
SELECT
p.first_name
, p.last_name
, c.name as city
, s.name as state
, co.name as country
FROM people p
LEFT OUTER JOIN city c
ON p.city_id = c.id
LEFT OUTER JOIN state s
ON c.state_id = s.id
LEFT OUTER JOIN country co
ON s.country_id = co.id;
Notice the difference in the last two OUTER JOINS
If the tables involved are reference tables (i.e. they hold lookup data that isn't going to change during the lifetime of a session), then depending on the nature of your application, you could pre-load the reference data during your application startup. Then your query doesn't need to do the joins; instead it returns the id values, and your application decodes the ids when it needs to display the data.
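A rough sketch of that pre-loading idea, assuming a Python application and the built-in sqlite3 module. The table and column names, the LOOKUPS cache and both helper functions are hypothetical, and the approach only makes sense if the lookup tables really are static for the life of the process.

```python
import sqlite3

# Hypothetical schema and data for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE states (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE people (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL,
                     city_id INTEGER, state_id INTEGER, country_id INTEGER);
INSERT INTO cities VALUES (1, 'City 1'), (2, 'City 2');
INSERT INTO states VALUES (1, 'State 1'), (2, 'State 2');
INSERT INTO countries VALUES (1, 'Country 1'), (2, 'Country 2');
INSERT INTO people VALUES (1, 'Has Everything', 2, 1, 2),
                          (2, 'Has City', 1, NULL, NULL);
""")


def load_lookup(conn, table):
    """Read one id -> name mapping; `table` must come from trusted code."""
    return dict(conn.execute(f"SELECT id, name FROM {table}"))


# Loaded once, e.g. at application startup.
LOOKUPS = {t: load_lookup(conn, t) for t in ("cities", "states", "countries")}


def describe_person(conn, person_id):
    """Fetch a person without joins and decode the ids from the cache."""
    row = conn.execute(
        "SELECT full_name, city_id, state_id, country_id FROM people WHERE id = ?",
        (person_id,),
    ).fetchone()
    if row is None:
        return None
    full_name, city_id, state_id, country_id = row
    return {
        "name": full_name,
        "city": LOOKUPS["cities"].get(city_id),        # None when the id is NULL
        "state": LOOKUPS["states"].get(state_id),
        "country": LOOKUPS["countries"].get(country_id),
    }


print(describe_person(conn, 1))
```

The trade-off is that the cache has to be refreshed or the application restarted whenever a lookup table changes.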
The easiest solution is to use the names as the primary keys in city, state, and country. Then your person table can reference them by the name instead of the pseudokey "id". That way, you don't need to do joins, since your person table already has the needed values.
It does take more space to store a string instead of a 4-byte pseudokey. But you may find the tradeoff worthwhile, if you are threatened by joins as much as you seem to be (which, by the way, is like a PHP programmer being reluctant to use foreach -- joins are fundamental to SQL in the same way).
Also there are many city names that appear in more than one state. So your city table should reference the state table and use these two columns as the primary key.
CREATE TABLE cities (
city_name VARCHAR(30),
state CHAR(2),
PRIMARY KEY (city_name, state),
FOREIGN KEY (state) REFERENCES states(state)
);
CREATE TABLE persons (
person_id SERIAL PRIMARY KEY,
...other columns...
city_name VARCHAR(30),
state CHAR(2),
country_name VARCHAR(30),
FOREIGN KEY (city_name, state) REFERENCES cities(city_name, state),
FOREIGN KEY (country_name) REFERENCES countries(country_name)
);
This is just an example of the technique. Of course it's more complex than this, because you may have city names in more than one country, you may have countries with no states, and so on. The point is that SQL doesn't force you to use integer pseudokeys, so use CHAR and VARCHAR keys where appropriate.
A disadvantage of standard SQL is that the returned data needs to be in tabular format.
However, some database vendors have added features that make it possible to select data in a non-tabular format. I don't know whether MySQL has such features.
Create a view that does the Person, City, State, and Country joins for you. Then just reference the View in all other joins.
Something like:
CREATE VIEW FullPerson AS
SELECT Person.*, City.Name AS CityName, State.Name AS StateName, Country.Name AS CountryName
FROM
Person LEFT OUTER JOIN City ON Person.CityId = City.Id
LEFT OUTER JOIN State ON Person.StateId = State.Id
LEFT OUTER JOIN Country ON Person.CountryId = Country.Id
Then in other queries, you can
SELECT FullPerson.*, Other.Value
FROM FullPerson LEFT OUTER JOIN Other ON FullPerson.OtherId = Other.Id
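A minimal sketch of the view approach, assuming Python's built-in sqlite3 module and made-up tables and rows. The column aliases (CityName, StateName, CountryName) are an assumption added so that the three Name columns stay distinct inside the view.

```python
import sqlite3

# Hypothetical tables and rows for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE State (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Country (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Person (Id INTEGER PRIMARY KEY, FullName TEXT,
                     CityId INTEGER, StateId INTEGER, CountryId INTEGER);
INSERT INTO City VALUES (1, 'City 1');
INSERT INTO State VALUES (1, 'State 1');
INSERT INTO Country VALUES (1, 'Country 1');
INSERT INTO Person VALUES (1, 'Has Everything', 1, 1, 1),
                          (2, 'Has Nothing', NULL, NULL, NULL);

-- The joins live in the view; callers query it like a plain table.
CREATE VIEW FullPerson AS
SELECT Person.Id, Person.FullName,
       City.Name AS CityName,
       State.Name AS StateName,
       Country.Name AS CountryName
FROM Person
LEFT OUTER JOIN City ON Person.CityId = City.Id
LEFT OUTER JOIN State ON Person.StateId = State.Id
LEFT OUTER JOIN Country ON Person.CountryId = Country.Id;
""")

rows = conn.execute(
    "SELECT FullName, CityName, StateName, CountryName FROM FullPerson ORDER BY Id"
).fetchall()
print(rows)
```

The joins still run each time the view is queried; the view only hides them from the calling query.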
All great answers but the questioner specified they didn't want to use joins. As one respondent demonstrated, assuming your Cities, States, and Countries tables have an Id and a Description field you might be able to do something like this:
SELECT
p.Name, c.Description, s.Description, ct.Description
FROM
People p, Cities c, States s, Countries ct
WHERE
p.Id = value AND
c.Id = value AND
s.Id = value AND
ct.Id = value;
Joins are the answer. With practise they will become more readable to you.
There may be special cases where creating a function would help you, for example you could do the following (in Oracle, I don't know any mysql):
You could create a function to return a formatted address given the city state and country codes, then your query becomes
SELECT first_name, last_name, formated_address(city_id, state_id, country_id)
FROM people
WHERE some_where_clause;
where formated_address does individual lookups on the city state and country tables and puts separators between the decoded values, or returns "no address" if they are all empty, etc