Joining three tables clarification - sql

This isn't a question I need a direct answer to, I Just need to know how, from a philosophical point of view, how many tables I would need to join.
Say I have 3 tables :
TableA :
Account_Number | Name | Address
TableB :
Account_Number | Occupation | Salary
TableC :
Account_Number | Model | Make
These tables all have a column in common. If I was to join these tables together to get a the columns in each table that has a matching account number, i would do an inner join to return only records that are matching.
in the query, I would have something like :
SELECT T1.Name, T1.Address, T2.Occupation, T2.Salary, T3.Model, T3.Make
FROM TableA T1
INNER JOIN TableB T2 ON T1.Account_Number=T2.Account_Number
INNER JOIN TableC T3 ON T2.Account_Number=T3.Account_Number /Do I need to join T3 to T1 or is this handled by having T1 and T2 joined together?
WHERE ??.AccountNumber=1234123412341234 /Would any Table work in the WHERE clause here since we are all looking for the same account_number in each table?

SELECT T1.Name, T1.Address, T2.Occupation, T2.Salary, T3.Model, T3.Make
FROM TableA T1
INNER JOIN TableB T2 ON T1.Account_Number=T2.Account_Number
INNER JOIN TableC T3 ON T2.Account_Number=T3.Account_Number /Do I need to join T3 to T1 or is this handled by having T1 and T2 joined together?
WHERE T1.AccountNumber=1234123412341234
Inner join will return only records which are present in the table you are joining ,. So in your select your use TableA so in where clause you should use T1.AccountNumber
For more information please look this link http://www.sql-join.com/

There are a few things to consider with respect to your data model. For example
o What is the relationship between these tables? Is there one? Is the fact that there is an Account_Number column just a coincidence
o Assuming there is a relationship, what is the cardinality? For example, for any given row in Table A, how many corresponding rows are there in Table B; 0, 1 or more than one. Similarly from Table B to table A; and similarly between Tables B and C.
o. Depending on the meaning of these relationships, you could find yourself in a situation called a "fan trap". This is a situation where you have a fanning out of relationships:
B
/ \
/ \
A C
It easier to describe with an Example.
Lets say:
A is a table of Sales Reps
B is a table of Branches
C is a table of Accounts
So each Sales Rep works in one Branch (cardinality=1), and each Branch has 1 or more Sales Rep ( cardinality=Many )
Each Branch handles many Accounts, and each Account is held at one Branch
This is a perfectly valid data model
However, from this data model, if you want to find out which sales rep handles which account, you cannot. This is the so called Fan trap. In this particular example, you would probably solve this be having a relationship directly between Sales Rep and Accounts; even though it may look redundant.
So I know this is a long answer to your question, but it elaborates on the short answer, which is "It depends" :)

BobC is right concerning semantics, it depends.
Syntactically, your example is fully correct and just how I would do it too. Joining applies to just 2 objects/result sets and both have to be in the condition when you don't want a cross join. Furthermore it doesn't matter which of the identically account# columns from the first join you use in the second join.
As for the WHERE condition, if only one of the tables got an Index or Partionkey, use that one. Else, use the table on which the WHERE statement gives you the fewest rows.
Joining often benefits from starting with the smallest tables/result sets. Although Oracle will optimize your statement respecting the gathered statistics anyway.

Related

How to join 4 tables in SQL?

I just started using SQL and I need some help. I have 4 tables in a database. All four are connected with each other. I need to find the amount of unique transactions but can't seem to find it.
Transactions
transaction_id pk
name
Partyinvolved
transaction.id pk
partyinvolved.id
type (buyer, seller)
PartyCompany
partyinvolved.id
Partycompany.id
Companies
PartyCompany.id pk
sector
pk = primary key
The transaction is unique if the conditions are met.
I only need a certain sector out of Companies, this is condition1. Condition2 is a condition inside table Partyinvolved but we first need to execute condition1. I know the conditions but do not know where to put them.
SELECT *
FROM group
INNER JOIN groupB ON groupB.group_id = group.id
INNER JOIN companies ON companies.id = groupB.company_id
WHERE condition1 AND condition2 ;
I want to output the amount of unique transactions with the name.
It is a bit unclear what you are asking as your table definitions look like your hinting at column meanings more than names such as partycompany.id you are probably meaning the column that stores the relationship to PartyCompany column Id......
Anyway, If I follow that logic and I look at your questions about wanting to know where to limit the recordsets during the join. You could do it in Where clause because you are using an Inner Join and it wont mess you your results, but the same would not be true if you were to use an outer join. Plus for optimization it is typically best to add the limiter to the ON condition of the join.
I am also a bit lost as to what exactly you want e.g. a count of transactions or the actual transactions associated with a particular sector for instance. Anyway, either should be able to be derived from a basic query structure like:
SELECT
t.*
FROM
Companies co
INNER JOIN PartyCompancy pco
ON co.PartyCompanyId = pco.PartyCompanyId
INNER JOIN PartyInvolved pinv
ON pco.PartyInvolvedId = pinv.PartyInvolvedId
AND pinv.[type] = 'buyer'
INNER JOIN Transactions t
ON ping.TransactionId = t.TransactionId
WHERE
co.sector = 'some sector'

Can we join two parts of two composite primary keys together?

I have to two tables, both have a composite primary key:
OrderNr + CustNr
OrderNr + ItemNr
Can I join both tables with the OrderNr and OrderNr which is each a part of a composite primary key?
Yes, but you may find you get rows from each table that repeat as they combine to make a unique combination. This is called a Cartesian product
Table A
OrderNr, CustNr
1,C1
1,C2
2,C1
2,C2
TableB
OrderNr,ItemNr
1,i1
1,i2
SELECT * FROM a JOIN b ON a.OrderNr = b.OrderNr
1,C1,1,i1
1,C1,1,i2
1,C2,1,i1
1,C2,1,i2
This happens because composite primary keys can contain repeated elements so long as the combination of elements is unique. Joining on only one part of the PK, and that part being an element that is repeated (my custnr 1 repeats twice in each table, even though the itemnr and CustNr mean the rows are unique) results in a multiplied resultset - 2 rows from A that are custnr 1, multiplied by 2 rows from B that are custnr 1, gives 4 rows in total
Does it work with the normal/naturla join too?
Normal joins (INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER) will join the rows from two tables or subqueries when the ON condition is valid. The clause in the ON is like a WHERE clause, yes - in that it represents a statement that is true or false (a predicate). If the statement is true, the rows are joined. You don't even have to make it about data from the tables - you can even say a JOIN b ON 1=1 and every rows from A will get joined to every row from B. As commented, primary keys aren't involved in JOINS at all, though primary keys often rely on indexes and those indexes may be used to speed up a join, but they aren't vital to it.
Other joins (CROSS, NATURAL..) exist; a CROSS join is like the 1=1 example above, you don't specify an ON, every row from A is joined to every row from B, by design. NATURAL JOIN is one to avoid using, IMHO - the database will look for column names that are the same in both tables and join on them. The problem is that things can stop working in future if someone adds a column with the same name but different content/meaning to the two tables. No serious production system I've ever come across has used NATURAL join. You can get away with some typing if your columns to join on are named the same, with USING - SELECT * FROM a JOIN b USING (col) - here both A and B have a column called col. USING has some advantages, especially over NATURAL join, in that it doesn't fall apart if another column of the same name as an existing one but it has some detractors too - you can't say USING(col) AND .... Most people just stick to writing ON, and forget USING
NATURAL join also does NOT use primary keys. There is no join style (that I know of) that will look at a foreign key relationship between two tables and use that as the join condition
And then is it true that if I try to join Primary key and foreign key of two tables, that it works like a "where" command?
Hard to understand what you mean by this, but if you mean that A JOIN B ON A.primarykey = B.primary_key_in_a then it'll work out, sure. If you mean A CROSS JOIN B WHERE A.primarykey = B.primary_key_in_a then that will also work, but it's something I'd definitely avoid - no one writes SQLs this way, and the general favoring is to drop use of WHERE to create joining conditions (you do still see people writing the old school way of FROM a,b WHERE a.col=b.col but it's also heavily discouraged), and put them in the ON instead
So, in summary:
SELECT * FROM a JOIN b ON a.col1 = b.col2
Joins all rows from a with all rows from b, where the values in col1 equal the values in col2. No primary keys are needed for any of this to work out
You can join any table if there is/are logical relationship between them
select *
from t1
JOIN t2
on t1.ORderNr = t2.OrderNr
Although if OrderNr cannot provide unicity between tables by itself, your data will be multiplied.
Lets say that you have 2 OrderNr with value 1 on t1 and 5 OrderNr with value 1 on t2, when you join them, you will get 2 x 5 = 10 records.
Your data model is similar to a problem commonly referred to as a "fan trap". (If you had an "order" table keyed solely by OrderNr if would exactly be a fan trap).
Either way, it's the same problem -- the relationship between Order/Customers and Order/Items is ambiguous. You cannot tell which customers ordered which items.
It is technically possible to join these tables -- you can join on any columns regardless of whether they are key columns or not. The problem is that your results will probably not make sense, unless you have more conditions and other tables that you are not telling us about.
For example, a simple join just on t1.OrderNr = t2.OrderNr will return rows indicating every customer related to the order has ordered every item related to the order. If that is what you want, you have no problem here.

SQL INNER JOIN without linked column

I have an UltraGrid displaying customer information in it. The way the database is set up, there are 2 tables. Customers and Customer_Addresses. I need to be able to display all of the columns from Customers as well as Town and Region from Customer_Addresses, but I'm under the impression that I'd need Town and Region columns in the Customer table to be able to do this? I've never used an INNER JOIN before so I'm not sure if this is true or not, so can anybody give me pointers on how to do this, or if I need the matching columns or not?
Does it even require an INNER JOIN, or is there an alternative way to do this?
Below are the design views of both of the tables - Is it possible to display Add4 and Add5 from Customer_Addresses with all of Customers?
As long as you have another key column you can use to link the tables (ex. ID_Column), it is better that you use LEFT JOIN.
Example:
SELECT c.col1, ... , c.colN, a.town, a.region FROM Customers c
LEFT JOIN Customer_Addresses a ON a.ID_Column = c.ID_Column
In order to clarify how JOIN types work, look at this picture:
In our case, using a LEFT JOIN will take all information from the Customers table, along with any found matching (on ID) information from Customer_Addresses table.
First of all you need some column in common in two tables, all what you have to do is:
CREATE TABLE all_things
AS
SELECT * (or columns that you want to have in the new table)
FROM Costumers AS a1
INNER JOIN Customer_Addresses AS a2 ON a1.column_in_common = a2.column_in_common
The point is what kind of join do you want.
If you can continue the process without having information in table Costumers or in table Customer_Addresses maybe you need OUTER JOIN or other kind of JOIN.

Select Query producing duplicate results of the same records (Access 2010)

I had a select query, which I designed in the SQL view of the query design tool. With some of my results I found duplicates of the same records. As in there were not multiples in the table, only the query (Same Primary Key). Here is the original query.
SELECT t1.*
FROM Inventory AS t1 INNER JOIN Inventory AS t2 ON
t1.Part_ID = t2.Part_ID WHERE (t1.Inventory_ID<>t2.Inventory_ID);
I aimed to query Inventory for records with the same Part_ID (FK) but different Inventory_ID(PK). There is a composite key between part_ID (FK) and location_ID (FK), if that makes any difference.
I have since changed this query to:
SELECT DISTINCT t1.*
FROM Inventory AS t1 INNER JOIN Inventory AS t2 ON t1.Part_ID = t2.Part_ID
WHERE (t1.Inventory_ID<>t2.Inventory_ID);
This removes the duplicate records, however, I don't believe that my original query should produce replicate data results. I am worried that this suggests that there is something wrong with my tables?
My table looks like the following:
Thanks
The thing is that you might have multiple occurences of part_ID on the INNER JOIN side of your select. So if a part with the same part_ID and a different inventory_ID exists in 2 other locations, you will get duplicates.
To check that, you could do a test on a few duplicates, or rewrite your original query with a GROUP BY instruction on the INNER JOIN side of the query.

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I know right now with the HAVING statement.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable and counting the rows in a sub-selection and filtering out my results by IDs that appear that many times works, but It's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood you problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> cid);
Probably you will need some indexes on these columns.