Conditional join based on lookup - sql

Apologies if a similar problem has been posted before; I couldn't find one.
Problem: I need to join two tables based on a conditional lookup in the second table.
Tables: Below are the two tables which have a subset of the total fields.
+-------------------------------------------------------+
| Persons |
+----------+------------+---------------+---------------+
| PersonID | PersonName | HomeAddressID | WorkAddressID |
+----------+------------+---------------+---------------+
| P1 | Doe, John | HA1 | WA1 |
+----------+------------+---------------+---------------+
| P2 | Doe, Jane | HA2 | WA2 |
+----------+------------+---------------+---------------+
| P3 | Doe, Jane | | WA3 |
+----------+------------+---------------+---------------+
+-----------------------------------+
| Addresses |
+-----------+--------+------+-------+
| AddressID | Street | City | State |
+-----------+--------+------+-------+
| HA1 | 123 | A | B |
+-----------+--------+------+-------+
| WA1 | 456 | C | D |
+-----------+--------+------+-------+
| HA2 | 111 | | |
+-----------+--------+------+-------+
| WA2 | 101 | G | H |
+-----------+--------+------+-------+
| WA3 | 333 | I | J |
+-----------+--------+------+-------+
Current Scenario: The SELECT query in a view fetches PersonName from first table and work address fields from second table. (Join is on WorkAddressID)
Expected Result: The SELECT query should fetch the PersonName field from the first table and the address fields from the second table, under these conditions:
If state for home address is available then display Street, City and State for home address.
If state for home address is NULL/blank then display Street, City and State for work address.
Notes:
Many rows in Persons table do not have HomeAddressID but all do have WorkAddressID.
Many rows in Addresses table do not have City and State information for Home addresses.
While this may look like a design flaw, I'm not in a position to re-engineer the database as there are hundreds of objects and sub-objects depending on the original view.
There are 3 million+ rows in the Persons table so performance needs to be acceptable.
The current query has joins to at least 5 other views.
Please advise as to how I can address this problem.
Many thanks,
-V

Here's a MySQL solution:
SELECT PersonName,
IF(h.State = '' OR h.State IS NULL, w.Street, h.Street) AS Street,
IF(h.State = '' OR h.State IS NULL, w.City, h.City) AS City,
IF(h.State = '' OR h.State IS NULL, w.State, h.State) AS State
FROM Persons AS p
JOIN Addresses AS w ON w.AddressID = p.WorkAddressID
LEFT JOIN Addresses as h ON h.AddressID = p.HomeAddressID

A self join would handle this:
select
p.personname,
case when ha.state is null or ha.state = '' then wa.street else ha.street end as street,
case when ha.state is null or ha.state = '' then wa.city else ha.city end as city,
case when ha.state is null or ha.state = '' then wa.state else ha.state end as state
from
Persons p
inner join addresses wa on p.workaddressid = wa.addressid
left join addresses ha on p.homeaddressid = ha.addressid
This syntax is for MSSQL.
Edit: changed the home address join to a left join because of the criterion "Many rows in Persons table do not have HomeAddressID".
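As a quick way to check the fallback logic against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for illustration (the question does not name its database), and the sketch assumes the blank cells in the sample tables are stored as either NULL or an empty string; NULLIF folds both cases into NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (PersonID TEXT, PersonName TEXT,
                      HomeAddressID TEXT, WorkAddressID TEXT);
CREATE TABLE Addresses (AddressID TEXT, Street TEXT, City TEXT, State TEXT);
INSERT INTO Persons VALUES
  ('P1', 'Doe, John', 'HA1', 'WA1'),
  ('P2', 'Doe, Jane', 'HA2', 'WA2'),
  ('P3', 'Doe, Jane', NULL,  'WA3');
-- Assumption: HA2 has a NULL city and an empty-string state.
INSERT INTO Addresses VALUES
  ('HA1', '123', 'A', 'B'),
  ('WA1', '456', 'C', 'D'),
  ('HA2', '111', NULL, ''),
  ('WA2', '101', 'G', 'H'),
  ('WA3', '333', 'I', 'J');
""")

# Work address is always joined; home address is optional (LEFT JOIN).
# NULLIF(h.State, '') turns a blank state into NULL so one test covers both.
query = """
SELECT p.PersonID,
       CASE WHEN NULLIF(h.State, '') IS NULL THEN w.Street ELSE h.Street END AS Street,
       CASE WHEN NULLIF(h.State, '') IS NULL THEN w.City   ELSE h.City   END AS City,
       CASE WHEN NULLIF(h.State, '') IS NULL THEN w.State  ELSE h.State  END AS State
FROM Persons AS p
JOIN Addresses AS w ON w.AddressID = p.WorkAddressID
LEFT JOIN Addresses AS h ON h.AddressID = p.HomeAddressID
ORDER BY p.PersonID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, P1 gets the home address, while P2 (blank home state) and P3 (no home address) fall back to their work addresses.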

Related

How to properly write a query to use two copies of a single table? [duplicate]

I need to show one attribute referring to two different instances of the same attribute.
I have a set of tables such as Branch, Address, Property_for_rent, State. I need to show property data including the branch_state and the property_state. Both attributes come from the name column of the state table. My problem is that I haven't found a way to join the state table twice, referring to two different instances. I already tried FULL JOIN, but SQL doesn't recognize one copy of my state table, and UNION doesn't work because if I split the query into two, I won't have the same number of columns.
I need to join these two queries:
SELECT p.property_no, p.prop_type, p.rooms, p.rent, s.name AS property_state, staff_no
FROM state s
JOIN address a ON s.state_id = a.state_id
JOIN property_for_rent p ON a.address_id = p.address_id
ORDER BY rent ASC;
+-------------+-----------+-------+------+----------------+----------+
| property_no | prop_type | rooms | rent | property_state | staff_no |
+-------------+-----------+-------+------+----------------+----------+
| PR200 | Flat | 3 | 24 | Nevada | SQ523 |
| PR901 | Flat | 7 | 31 | Vermont | SL569 |
| PR806 | House | 3 | 54 | Minnesota | NULL |
SELECT branch_no, z.name AS branch_state
FROM branch b
JOIN address a ON b.address_id = a.address_id
JOIN state z ON a.state_id = z.state_id;
+-----------+----------------+
| branch_no | branch_state |
+-----------+----------------+
| B424 | Kentucky |
| B947 | Massachusetts |
| B942 | South Carolina |
| B714 | North Dakota |
branch_state and property_state are aliases for the name attribute in the state table.
You need to connect the branch to the property somehow. Let me assume that property_for_rent has a column for branch_id:
SELECT p.property_no, p.prop_type, p.rooms, p.rent,
sp.name AS property_state, staff_no,
sb.name as branch_state
FROM property_for_rent p JOIN
address ap
ON ap.address_id = p.address_id JOIN
state sp
ON sp.state_id = ap.state_id JOIN
branch b
on p.branch_id = b.branch_id JOIN
address ab
ON ab.address_id = b.address_id JOIN
state sb
ON sb.state_id = ab.state_id
ORDER BY rent ASC;
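To see the double join in action, here is a small sketch run through Python's sqlite3 module. The schema is a cut-down guess at the question's tables, and the branch_id column on property_for_rent is the same assumption the answer makes; the assignment of properties to branch B424 is invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state (state_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (address_id INTEGER PRIMARY KEY, state_id INTEGER);
CREATE TABLE branch (branch_id TEXT PRIMARY KEY, address_id INTEGER);
-- Assumption: property_for_rent links to its branch via branch_id.
CREATE TABLE property_for_rent (property_no TEXT, rent INTEGER,
                                address_id INTEGER, branch_id TEXT);
INSERT INTO state VALUES (1, 'Nevada'), (2, 'Vermont'), (3, 'Kentucky');
INSERT INTO address VALUES (10, 1), (11, 2), (20, 3);
INSERT INTO branch VALUES ('B424', 20);
INSERT INTO property_for_rent VALUES
  ('PR200', 24, 10, 'B424'),
  ('PR901', 31, 11, 'B424');
""")

# address and state each appear twice under different aliases:
# ap/sp resolve the property's state, ab/sb resolve the branch's state.
query = """
SELECT p.property_no, sp.name AS property_state, sb.name AS branch_state
FROM property_for_rent p
JOIN address ap ON ap.address_id = p.address_id
JOIN state sp   ON sp.state_id = ap.state_id
JOIN branch b   ON b.branch_id = p.branch_id
JOIN address ab ON ab.address_id = b.address_id
JOIN state sb   ON sb.state_id = ab.state_id
ORDER BY p.rent ASC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each alias is an independent reference to the same table, so the two state lookups do not interfere with each other.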

SQL - UNION vs NULL functions. Which is better?

I have three tables: ACCT, PERS, ORG. Each ACCT is owned by either a PERS or ORG. The PERS and ORG tables are very similar and so are all of their child tables, but all PERS and ORG data is separate.
I'm writing a query to get PERS and ORG information for each account in ACCT and I'm curious what the best method of combining the information is. Should I use a series of left joins and NULL functions to fill in the blanks, or should I write the queries separately and use UNION to combine?
I've already written separate queries for PERS ACCT's and another for ORG ACCT's and plan on using UNION. My question more pertains to best practice in the future.
I'm expecting both to give me my desired results, but I want to find the most efficient method in both development time and run time.
EDIT: Sample Table Data
ACCT Table:
+---------+---------+--------------+-------------+
| ACCTNBR | ACCTTYP | OWNERPERSNBR | OWNERORGNBR |
+---------+---------+--------------+-------------+
| 555001 | abc | 3010 | |
| 555002 | abc | | 2255 |
| 555003 | tre | 5125 | |
| 555004 | tre | 4485 | |
| 555005 | dsa | | 6785 |
+---------+---------+--------------+-------------+
PERS Table:
+---------+--------------+---------------+----------+-------+
| PERSNBR | PHONE | STREET | CITY | STATE |
+---------+--------------+---------------+----------+-------+
| 3010 | 555-555-5555 | 1234 Main St | New York | NY |
| 5125 | 555-555-5555 | 1234 State St | New York | NY |
| 4485 | 555-555-5555 | 6542 Vine St | New York | NY |
+---------+--------------+---------------+----------+-------+
ORG Table:
+--------+--------------+--------------+----------+-------+
| ORGNBR | PHONE | STREET | CITY | STATE |
+--------+--------------+--------------+----------+-------+
| 2255 | 222-222-2222 | 1000 Main St | New York | NY |
| 6785 | 333-333-3333 | 400 4th St | New York | NY |
+--------+--------------+--------------+----------+-------+
Desired Output:
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
| ACCTNBR | ACCTTYP | OWNERPERSNBR | OWNERORGNBR | PHONE | STREET | CITY | STATE |
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
| 555001 | abc | 3010 | | 555-555-5555 | 1234 Main St | New York | NY |
| 555002 | abc | | 2255 | 222-222-2222 | 1000 Main St | New York | NY |
| 555003 | tre | 5125 | | 555-555-5555 | 1234 State St | New York | NY |
| 555004 | tre | 4485 | | 555-555-5555 | 6542 Vine St | New York | NY |
| 555005 | dsa | | 6785 | 333-333-3333 | 400 4th St | New York | NY |
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
Query Option 1: Write 2 queries and use UNION to combine them:
select a.acctnbr, a.accttyp, a.ownerpersnbr, a.ownerorgnbr, p.phone, p.street, p.city, p.state
from acct a
inner join pers p on p.persnbr = a.ownerpersnbr
UNION
select a.acctnbr, a.accttyp, a.ownerpersnbr, a.ownerorgnbr, o.phone, o.street, o.city, o.state
from acct a
inner join org o on o.orgnbr = a.ownerorgnbr
Option 2: Use NVL() or Coalesce to return a single data set:
SELECT a.acctnbr,
a.accttyp,
NVL(a.ownerpersnbr, a.ownerorgnbr) Owner,
NVL(p.phone, o.phone) Phone,
NVL(p.street, o.street) Street,
NVL(p.city, o.city) City,
NVL(p.state, o.state) State
FROM
acct a
LEFT JOIN pers p on p.persnbr = a.ownerpersnbr
LEFT JOIN org o on o.orgnbr = a.ownerorgnbr
There are way more fields in each of the 3 tables as well as many more PERS and ORG tables in my actual query. Is one way better (faster, more efficient) than another?
That depends on what you consider "better".
Assuming that you will always want to pull all rows from the ACCT table, I'd go for the LEFT OUTER JOIN rather than UNION. (If you do use UNION, prefer the UNION ALL variant, which skips the duplicate-elimination step.)
EDIT: As you've already shown your queries, mine is no longer required, and did not match your structures. Removing this part.
Why LEFT JOIN? Because with UNION you'd have to go through ACCT twice, based on "parent" criteria (whether separate or done INNER JOIN criteria), while with plain LEFT OUTER JOIN you'll probably get just one pass through ACCT. In both cases, rows from "parents" will most probably be accessed based on primary keys.
As you are probably considering performance, when looking for "better", as always: Test your queries and look at the execution plans with adequate and fresh database statistics in place, as depending on the data "layout" (histograms, etc.) the "better" may be something completely different.
I think you misunderstand what a Union does versus a join statement. A union takes the records from multiple tables, generally similar or the same structure and combines them into a single resultset. It is not meant to combine multiple dissimilar tables.
What I am seeing is that you have two tables PERS and ORG with some of the same data in it. In this case I suggest you union those two tables and then join to ACCT to get the sample output.
In this case to get the output as you have shown you would want to use Outer joins so that you don't drop any records without a match. That will give you nulls in some places but most of the time that is what you want. It is much easier to filter those out later.
Very rough sample code.
SELECT a.*, b.*
from Acct as a
FULL OUTER JOIN (
Select * from PERS UNION Select * from ORG
) as b
ON a.ID = b.ID
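To compare the two options from the question on its own sample data, here is a minimal sketch using Python's sqlite3 module. SQLite has no NVL(), so COALESCE stands in for it, and only a few columns are selected to keep the example short. It says nothing about performance on a real database; it only checks that both shapes return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE acct (acctnbr INTEGER, accttyp TEXT,
                   ownerpersnbr INTEGER, ownerorgnbr INTEGER);
CREATE TABLE pers (persnbr INTEGER, phone TEXT, street TEXT);
CREATE TABLE org  (orgnbr  INTEGER, phone TEXT, street TEXT);
INSERT INTO acct VALUES
  (555001, 'abc', 3010, NULL), (555002, 'abc', NULL, 2255),
  (555003, 'tre', 5125, NULL), (555004, 'tre', 4485, NULL),
  (555005, 'dsa', NULL, 6785);
INSERT INTO pers VALUES
  (3010, '555-555-5555', '1234 Main St'),
  (5125, '555-555-5555', '1234 State St'),
  (4485, '555-555-5555', '6542 Vine St');
INSERT INTO org VALUES
  (2255, '222-222-2222', '1000 Main St'),
  (6785, '333-333-3333', '400 4th St');
""")

# Option 1: one query per owner type, combined with UNION ALL.
union_rows = conn.execute("""
SELECT a.acctnbr, p.phone, p.street
FROM acct a JOIN pers p ON p.persnbr = a.ownerpersnbr
UNION ALL
SELECT a.acctnbr, o.phone, o.street
FROM acct a JOIN org o ON o.orgnbr = a.ownerorgnbr
ORDER BY 1
""").fetchall()

# Option 2: a single pass over acct with two LEFT JOINs and COALESCE.
join_rows = conn.execute("""
SELECT a.acctnbr,
       COALESCE(p.phone, o.phone)   AS phone,
       COALESCE(p.street, o.street) AS street
FROM acct a
LEFT JOIN pers p ON p.persnbr = a.ownerpersnbr
LEFT JOIN org  o ON o.orgnbr  = a.ownerorgnbr
ORDER BY a.acctnbr
""").fetchall()

print(union_rows == join_rows)
```

One difference to keep in mind: an account with neither owner set would be dropped by the UNION ALL version but kept (with NULLs) by the LEFT JOIN version.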

Access SQL - Return unique combinations in field

I have a table with data stored vertically. A simplified example is shown below, with one record for each city a customer has lived in:
| CUSTOMER | CITY |
------------------------------
| John | London |
| John | Manchester |
| Sarah | Cardiff |
| Sarah | Edinburgh |
| Sarah | Liverpool |
| Craig | Manchester |
| Craig | London |
I am trying to come up with an SQL query that returns all unique combinations of cities. In the example above, John and Craig have both lived in London and Manchester, while Sarah has lived in different cities (Cardiff, Edinburgh and Liverpool), so I would like the output below (the query should handle any number of cities):
| CITY1 | CITY2 | CITY3 |
--------------------------------------------
| London | Manchester | |
| Cardiff | Edinburgh | Liverpool |
I have tried using a crosstab query to view the data horizontally like this:
TRANSFORM Max(City)
SELECT Customer
FROM tblCities
GROUP BY Customer
PIVOT City
but it is just returning a field for all cities for every customer. Does anyone know if this is possible using SQL?
P.S. Ideally it will ignore the order of cities.
This was a nice challenge! The query below gets the groupings per customer. It doesn't discard the duplicates where multiple customers have lived in the same combination of cities ... I'll let you or others find a way to handle that.
TRANSFORM Min(OrderedList.City) AS MinOfCity
SELECT OrderedList.Customer
FROM (SELECT CustomerCities.Customer, CustomerCities.City, Count(1) AS CityNo
FROM CustomerCities INNER JOIN CustomerCities AS CustomerCities_1 ON CustomerCities.Customer = CustomerCities_1.Customer
WHERE (((CustomerCities.City)>=[CustomerCities_1].[City]))
GROUP BY CustomerCities.Customer, CustomerCities.City) OrderedList
GROUP BY OrderedList.Customer
PIVOT "CITY" & [CityNo];
Is this what you want?
select distinct c1.city, c2.city
from tblCities as c1 inner join
tblCities as c2
on c1.customer = c2.customer and c1.city < c2.city;
This returns all pairs of cities that appear for any single customer.
Here is a query which might work assuming each customer is only associated with two cities:
SELECT DISTINCT t.city_1, t.city_2
FROM
(
SELECT MIN(CITY) AS city_1, MAX(CITY) AS city_2
FROM tblCities
GROUP BY CUSTOMER
) t
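If doing the final step outside Access is acceptable, the de-duplication for any number of cities is short in application code. The following is a sketch in Python, not an Access SQL solution: it assumes the (customer, city) rows have already been fetched from tblCities, and it ignores city order by sorting each customer's cities before comparing.

```python
from collections import defaultdict

# Rows as they might come back from "SELECT Customer, City FROM tblCities".
rows = [
    ("John", "London"), ("John", "Manchester"),
    ("Sarah", "Cardiff"), ("Sarah", "Edinburgh"), ("Sarah", "Liverpool"),
    ("Craig", "Manchester"), ("Craig", "London"),
]

# Collect the set of cities for each customer.
cities_by_customer = defaultdict(set)
for customer, city in rows:
    cities_by_customer[customer].add(city)

# A sorted tuple is an order-independent key, so John and Craig collapse
# into one combination.
combos = sorted({tuple(sorted(cities)) for cities in cities_by_customer.values()})

for combo in combos:
    print(combo)
```

Each tuple in combos is one unique combination; padding to CITY1..CITYn columns can be done at display time.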

SQL JOIN to omit other columns after first result

Here is the result I need, simplified:
select name, phonenumber
from contacttmp
left outer join phonetmp on (contacttmp.id = phonetmp.contact_id);
name | phonenumber
-------+--------------
bob | 111-222-3333
bob | 111-222-4444
bob | 111-222-5555
frank | 111-222-6666
joe | 111-222-7777
The query, however, displays the name on every row. I'm trying to omit the name after the first result:
name | phonenumber
-------+--------------
bob | 111-222-3333
| 111-222-4444
| 111-222-5555
frank | 111-222-6666
joe | 111-222-7777
Here's how I made the example tables and the data:
create table contacttmp (id serial, name text);
create table phonetmp (phoneNumber text, contact_id integer);
select * from contacttmp;
id | name
----+-------
1 | bob
2 | frank
3 | joe
select * from phonetmp ;
phonenumber | contact_id
--------------+------------
111-222-3333 | 1
111-222-4444 | 1
111-222-5555 | 1
111-222-6666 | 2
111-222-7777 | 3
Old part of question
I'm working on a contacts program in PHP and a requirement is to display the results but omit the other fields after the first record is displayed if there are multiple results of that same record.
From the postgres tutorial join examples I'm doing something like this with a left outer join:
SELECT *
FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
city | temp_lo | temp_hi | prcp | date | name | location
--------------+---------+---------+------+------------+---------------+-----------
Hayward | 37 | 54 | | 1994-11-29 | |
San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
San Francisco | 43 | 57 | 0 | 1994-11-29 | San Francisco | (-194,53)
I can't figure out how to, or if it is possible to, alter the above query to not display the other fields after the first result.
For example, if we add the clause "WHERE location = '(-194,53)'" we don't want the second (and third if there is one) results to display the columns other than location, so the query (plus something extra) and the result would look like this:
SELECT *
FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name)
WHERE location = '(-194,53)';
city | temp_lo | temp_hi | prcp | date | name | location
--------------+---------+---------+------+------------+---------------+-----------
San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
| | | | | | (-194,53)
Is this possible with some kind of JOIN or exclusion or other query? Or do I have to remove these fields in PHP after getting all the results (would rather not do).
To avoid confusion, I'm required to achieve a result set like:
city | temp_lo | temp_hi | prcp | date | name | location
--------------+---------+---------+------+------------+---------------+-----------
San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
| | | | | | (-19,5)
| | | | | | (-94,3)
Philadelphia | 55 | 60 | 0.1 | 1995-12-12 | Philadelphia | (-1,1)
| | | | | | (-77,55)
| | | | | | (-3,33)
Where any additional results for the same record (city) with different locations would only display the different location.
You can do this type of logic in SQL, but it is not recommended. The result set of a SQL query is in table format. Tables represent unordered sets, and each column generally means the same thing in every row.
So, having a result set that depends on the values from the "preceding" row is not a proper way to use SQL. Although you can get this result in Postgres, I do not recommend it. Usually, this type of formatting is done on the application side.
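For the application-side approach, blanking repeated values is a short loop over the ordered result. The sketch below uses Python rather than the PHP mentioned in the question, purely for illustration; blank_repeats is a hypothetical helper name, and it assumes the rows arrive already sorted by name.

```python
# Rows as returned by the contact/phone join, ordered by name.
rows = [
    ("bob", "111-222-3333"),
    ("bob", "111-222-4444"),
    ("bob", "111-222-5555"),
    ("frank", "111-222-6666"),
    ("joe", "111-222-7777"),
]

def blank_repeats(rows):
    """Replace the name with '' when it equals the name on the previous row."""
    previous = object()  # sentinel that never equals a real name
    result = []
    for name, phone in rows:
        result.append(("" if name == previous else name, phone))
        previous = name
    return result

display_rows = blank_repeats(rows)
for name, phone in display_rows:
    print(f"{name:<6}| {phone}")
```

The query stays a plain join, and the display rule lives where the display is produced.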
If you want to avoid repeating the same information, you can use a window function that tells you the position of that row in the group (a PARTITION for this purpose, not a group in the GROUP BY sense), then hide the text for the columns you don't want to repeat if that position in the group is greater than 1.
WITH joined_results AS (
SELECT
w.city, c.location, w.temp_lo, w.temp_hi, w.prcp, w.date,
ROW_NUMBER() OVER (PARTITION BY w.city, c.location ORDER BY date) AS pos
FROM weather w
LEFT OUTER JOIN cities c ON (w.city = c.name)
)
SELECT
CASE WHEN pos > 1 THEN '' ELSE city END AS city,
CASE WHEN pos > 1 THEN '' ELSE location END AS location,
temp_lo, temp_hi, prcp, date
FROM joined_results jr
-- Sort in the outer query: an ORDER BY inside the CTE does not guarantee the final row order.
ORDER BY jr.city, jr.location, jr.pos;
This should give you this:
city | location | temp_lo | temp_hi | prcp | date
---------------+-----------+---------+---------+------+------------
Hayward | | 37 | 54 | | 1994-11-29
San Francisco | (-194,53) | 46 | 50 | 0.25 | 1994-11-27
| | 43 | 57 | 0 | 1994-11-29
To understand what ROW_NUMBER() OVER (PARTITION BY w.city, c.location ORDER BY date) AS pos does, it's probably worth looking at what you get with SELECT * FROM joined_results:
city | location | temp_lo | temp_hi | prcp | date | pos
---------------+-----------+---------+---------+------+------------+-----
Hayward | | 37 | 54 | | 1994-11-29 | 1
San Francisco | (-194,53) | 46 | 50 | 0.25 | 1994-11-27 | 1
San Francisco | (-194,53) | 43 | 57 | 0 | 1994-11-29 | 2
After that, just replace what you don't want with white space using CASE WHEN pos > 1 THEN '' ELSE ... END.
(This being said, it's something I'd generally prefer to do in the presentation layer rather than in the query.)
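The same ROW_NUMBER() idea can be tried on the contact/phone tables from the question. This sketch runs it through Python's sqlite3 module as a stand-in for Postgres, which assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The contact_id column carried through the CTE is only there to give the outer query an unambiguous sort key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacttmp (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE phonetmp (phoneNumber TEXT, contact_id INTEGER);
INSERT INTO contacttmp VALUES (1, 'bob'), (2, 'frank'), (3, 'joe');
INSERT INTO phonetmp VALUES
  ('111-222-3333', 1), ('111-222-4444', 1), ('111-222-5555', 1),
  ('111-222-6666', 2), ('111-222-7777', 3);
""")

# pos numbers each contact's phone rows 1, 2, 3, ...;
# the name is shown only where pos = 1.
query = """
WITH numbered AS (
  SELECT c.id AS contact_id, c.name, p.phoneNumber,
         ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY p.phoneNumber) AS pos
  FROM contacttmp c
  LEFT JOIN phonetmp p ON c.id = p.contact_id
)
SELECT CASE WHEN pos > 1 THEN '' ELSE name END AS name, phoneNumber
FROM numbered
ORDER BY contact_id, pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```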
Consider the slightly modified test case in the fiddle below.
Simple case
For the simple case dealing with a single column from each table, comparing to the previous row with the window function lag() does the job:
SELECT CASE WHEN lag(c.contact) OVER (ORDER BY c.contact, p.phone_nr)
= c.contact THEN NULL ELSE c.contact END
, p.phone_nr
FROM contact c
LEFT JOIN phone p USING (contact_id);
You could repeat that for n columns, but that's tedious.
For many columns
SELECT c.*, p.phone_nr
FROM (
SELECT *
, row_number() OVER (PARTITION BY contact_id ORDER BY phone_nr) AS rn
FROM phone
) p
LEFT JOIN contact c ON c.contact_id = p.contact_id AND p.rn = 1;
Something like a "reverse LEFT JOIN". This assumes referential integrity (no missing rows in contact). Also, contacts without any entries in phone are not in the result. That's easy to add if need be.
SQL Fiddle.
Aside, your query in the first example exhibits a rookie mistake.
SELECT * FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name)
WHERE location = '(-194,53)';
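One does not combine a LEFT JOIN with a WHERE clause on the right table. It doesn't make sense: the WHERE condition discards the NULL-extended rows, so the join effectively behaves like an inner join. Details: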
Explain JOIN vs. LEFT JOIN and WHERE condition performance suggestion in more detail
Except to test for existence ...
Select rows which are not present in other table
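The LEFT JOIN plus WHERE point is easy to check on the weather/cities rows from the question. This sketch uses Python's sqlite3 module with trimmed-down tables (only the columns needed here, and location stored as plain text, which is an assumption for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE weather (city TEXT, date TEXT);
CREATE TABLE cities (name TEXT, location TEXT);
INSERT INTO weather VALUES
  ('Hayward', '1994-11-29'),
  ('San Francisco', '1994-11-27'),
  ('San Francisco', '1994-11-29');
INSERT INTO cities VALUES ('San Francisco', '(-194,53)');
""")

def run(sql):
    return conn.execute(sql + " ORDER BY w.city, w.date").fetchall()

# LEFT JOIN with a WHERE filter on the right table: Hayward's
# NULL-extended row fails the filter and disappears.
where_rows = run("""
SELECT w.city, w.date, c.location
FROM weather w LEFT JOIN cities c ON w.city = c.name
WHERE c.location = '(-194,53)'""")

# A plain INNER JOIN with the same filter.
inner_rows = run("""
SELECT w.city, w.date, c.location
FROM weather w JOIN cities c ON w.city = c.name
WHERE c.location = '(-194,53)'""")

# Filter moved into the ON clause: every weather row is preserved.
on_rows = run("""
SELECT w.city, w.date, c.location
FROM weather w LEFT JOIN cities c
  ON w.city = c.name AND c.location = '(-194,53)'""")

print(where_rows == inner_rows)  # the LEFT JOIN bought nothing
print(len(on_rows))              # all weather rows kept
```

If the unmatched rows are wanted, the condition belongs in the ON clause; if they are not, a plain JOIN states the intent more clearly.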

Zend Framework: How to combine three tables in one query using Joins?

I have three tables like this:
Person table:
person_id | name | dob
--------------------------------
1 | Naveed | 1988
2 | Ali | 1985
3 | Khan | 1987
4 | Rizwan | 1984
Address table:
address_id | street | city | state | country
----------------------------------------------------
1 | MAJ Road | Karachi | Sindh | Pakistan
2 | ABC Road | Multan | Punjab | Pakistan
3 | XYZ Road | Riyadh | SA | SA
Person_Address table:
person_id | address_id
----------------------
1 | 1
2 | 2
3 | 3
Now I want to get all records of Person_Address table but also with their person and address records like this by one query:
person_id| name | dob | address_id | street | city | state | country
----------------------------------------------------------------------------------
1 | Naveed | 1988 | 1 | MAJ Road | Karachi | Sindh | Pakistan
2 | Ali | 1985 | 2 | ABC Road | Multan | Punjab | Pakistan
3 | Khan | 1987 | 3 | XYZ Road | Riyadh | SA | SA
How is this possible using Zend? Thanks.
The reference guide is the best starting point to learn about Zend_Db_Select. Along with my example below, of course:
//$db is an instance of Zend_Db_Adapter_Abstract
$select = $db->select();
$select->from(array('p' => 'person'), array('person_id', 'name', 'dob'))
->join(array('pa' => 'Person_Address'), 'pa.person_id = p.person_id', array())
->join(array('a' => 'Address'), 'a.address_id = pa.address_id', array('address_id', 'street', 'city', 'state', 'country'));
It's then as simple as this to fetch a row:
$db->fetchRow($select);
When debugging Zend_Db_Select, there's a handy trick: simply print the select object, which invokes its __toString() method to produce the SQL:
echo $select; //prints SQL
I'm not sure if you're looking for SQL to do the above, or code using Zend's facilities. Given the presence of "sql" and "joins" in the tags, here's the SQL you'd need:
SELECT p.person_id, p.name, p.dob, a.address_id, street, city, state, country
FROM person p
INNER JOIN Person_Address pa ON pa.person_id = p.person_id
INNER JOIN Address a ON a.address_id = pa.address_id
Bear in mind that the Person_Address tells us that there's a many-to-many relationship between a Person and an Address. Many Persons may share an Address, and a Person may have more than one address.
The SQL above will show ALL such relationships. So if Naveed has two Address records, you will have two rows in the result set with person_id = 1.
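The many-to-many behaviour can be seen with a small sketch in Python's sqlite3 module. The tables follow the question, trimmed to a few columns; the extra Person_Address row (1, 2), which gives Naveed a second address, is invented for illustration and is not in the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (person_id INTEGER PRIMARY KEY, name TEXT, dob INTEGER);
CREATE TABLE Address (address_id INTEGER PRIMARY KEY, street TEXT, city TEXT);
CREATE TABLE Person_Address (person_id INTEGER, address_id INTEGER);
INSERT INTO Person VALUES
  (1, 'Naveed', 1988), (2, 'Ali', 1985), (3, 'Khan', 1987), (4, 'Rizwan', 1984);
INSERT INTO Address VALUES
  (1, 'MAJ Road', 'Karachi'), (2, 'ABC Road', 'Multan'), (3, 'XYZ Road', 'Riyadh');
-- (1, 2) is a hypothetical extra link: Naveed also lives at address 2.
INSERT INTO Person_Address VALUES (1, 1), (2, 2), (3, 3), (1, 2);
""")

query = """
SELECT p.person_id, p.name, a.address_id, a.city
FROM Person p
INNER JOIN Person_Address pa ON pa.person_id = p.person_id
INNER JOIN Address a ON a.address_id = pa.address_id
ORDER BY p.person_id, a.address_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Naveed appears once per linked address, and Rizwan, who has no Person_Address row, does not appear at all because both joins are inner joins.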