SQL Join ON field differences - sql

I have a question regarding the use of JOINS from SQL:
What's the difference between:
Select * From Employees E
JOIN Products P ON E.idEmployee = P.idEmployee
JOIN ProductsDetails PD ON P.idProduct = PD.idProduct
AND
Select * From Employees E
JOIN Products P ON P.idEmployee = E.idEmployee
JOIN ProductsDetails PD ON PD.idProduct = P.idProduct
Also, what's the difference between:
Select * From Employees E
JOIN Products P ON E.idemployee =P.idemployee
Where P.name like '%prod01%'
AND
Select * From Employees E
JOIN Products P ON E.idemployee =P.idemployee
Where E.ProdName like '%prod01%' //considering the fact that the field ProdName also exists in the table Products.
Actually, how does a query actually works, I mean the workflow:
Select * From Employees E
JOIN Products P ON E.idemployee = P.idEmployee
JOIN ProductsDetails PD ON P.idProduct = PD.idProduct
JOIN OtherTable OT ON E.idField = OT.idField
where E.ProductNumber = 1 and OT.idOfAnotherField = value
How does the where condition on the Join clauses affects the main query, how does the query actually works , what it brings first and how does it applies the conditions?

This is a bit long for a comment.
SQL is a descriptive language not a procedural language. That is, a select statement describes the results produced by processing the data, but not the methods used to achieve it. Two important parts of the database engine are the optimizer which determines how the query will be executed and the execution engine which actually executes it.
From the perspective of the optimizer, col1 = col2 and col2 = col1 are the same. So, there is no difference in the first two queries. From the perspective of what the query does, the two examples with name are the same. Or, at least, superficially the same. The two product columns could have different collations set, which would affect the meaning.
As for your final question, you need to look at your database documentation. Specifics on the execution are highly database dependent. You should also learn about explain and execution plans.

There is no Difference between the first two queries, you just providing the criteria of joining the two tables, so Table1.FieldName = Table2.FieldName is the same of Table2.FieldName = Table1.FieldName.
The Difference between the second two queries is in the first one you searching for product name in the product table, and the second in the Employee table.
If there a product With name like '%prod01%' in the product table then the first query will return it and the second will not, and if there a product with name like '%prod01%' in the employee table the second query will return value and the first will not.
The query start by building the join, starting by select the Employee table then join the values of Employee table and Product table where the idemployee equal idEmployee and do the same with Products and ProductsDetails table and ProductsDetails table and OtherTable at the end the query filter the result based on the ProductNumber of Employee Table and idOfAnotherFields of OtherTable.

Related

SQL query wrong index when where on join

I have a query with joins that is not using the index that would be the best match and I am looking for help to correct this.
I have the following query:
select
equipment.name,purchaselines.description,contacts.name,vendors.accountNumber
from purchaselines
left join vendors on vendors.id = purchaselines.vendorId
left join contacts on contacts.id = vendors.contactId
left join equipment on equipment.id = purchaselines.equipmentId
where contacts.id = 12345
The table purchaselines has an index on the column vendorId, which is the proper index to use. When the query is run, I know the value of contacts.id which is joined to vendors.contactId which is joined to purchaselines.vendorId.
What is the proper way to run this query? Currently, no index is used on the table purchaselines.
If you are intending to query a specific contact, I would put THAT first since that is the primary basis. Additionally, you had left-joins to the other tables (vendors, contacts, equipment). So by having a WHERE clause to the CONTACTS table forces the equation to become an INNER JOIN, thus REQUIRING.
That said, I would try to rewrite the query as (also using aliases for simplified readability of longer table names)
select
equipment.name,
purchaselines.description,
contacts.name,
vendors.accountNumber
from
contacts c
join vendors v
on c.id = v.contactid
join purchaselines pl
on v.id = pl.vendorid
join equipment e
on pl.equipmentid = e.id
where
c.id = 12345
Also notice the indentation of the JOINs helps readability (IMO) to see how/where each table gets to the next in a more hierarchical manner. They are all regular inner JOIN context.
So, the customer ID will be the first / fastest, then to vendors by that contact ID which should optimize the join to that. Then, I would expect the purchase lines to have an index on vendorid optimizing that. And finally, the equipment table on ITs PK.
FEEDBACK Basic JOIN clarification.
JOIN is just the explicit statement of how two tables are related. By listing them left-side and right-side and the join condition showing on what relationship is between them is all.
Now, in your data example, each table is subsequently nested under the one prior. It is quite common though that one table may link to multiple other tables. For example an employee. A customer could have an ethnicity ID linking to an ethnicity lookup table, but also, a job position id also linking to a job position lookup table. That might look something like
select
e.name,
eth.ethnicity,
jp.jobPosition
from
employee e
join ethnicitiy eth
on e.ethnicityid = eth.id
join jobPosition jp
on e.jobPositionID = jp.id
Notice here that both ethnicity and jobPosition are at the same hierarchical level to the employee table scenario. If, for example, you wanted to further apply conditions that you only wanted certain types of employees, you can just add your logical additional conditions directly at the location of the join such as
join jobPosition jp
on e.jobPositionID = jp.id
AND jp.jobPosition = 'Manager'
This would get you a list of only those employees who are managers. You do not need to explictily add a WHERE condition if you already include it directly at the JOIN/ON criteria. This helps keeping the table-specific criteria at the join if you ever find yourself needing LEFT JOINs.

On an SQL Select, how do i avoid getting 0 results if I want to query for optional data in another table?

I have a table with Customers which includes their contact person in the helpdesk. I have another table that lists all vacancies of the helpdesk employees - if they are currently sick or on vacation etc.
I need to get the helpdesk contact and the start/end time of their vacation IF there is an entry.
I currently have this (simplified):
SELECT *
FROM dbo.Customers, dbo.Projects, dbo.Vacations
WHERE ($Phone = dbo.Customers.Phone)
AND dbo.Customers.CustomerID = dbo.Projects.CustomerID
AND dbo.Projects.HDContactID = dbo.Vacations.HDContactID
So if there is a vacation listed in the Vacations table, it works fine, but if there is no vacation at all, this will not return anything - what i want is that if there is no vacation, it simply returns the other data, and ignores the missing data (returns NULL, doesn't return anything, not important)
In any case, I need to get the Customers and Project data, even if the query can't find an entry in the Vacations table. How would I do this? I pretty new to SQL and couldn't find a similar question on this site
EDIT: I'm using SQL Server, currently using HeidiSQL
Try below query:
SELECT * FROM dbo.Customers, dbo.Projects
left join dbo.Vacations on dbo.Projects.HDContactID = dbo.Vacations.HDContactID
WHERE ($Phone = dbo.Customers.Phone)
AND dbo.Customers.CustomerID = dbo.Projects.CustomerID
Use left join as mentioned by #Flying Thunder,
Example of the left join:
SELECT country.country_name_eng, city.city_name, customer.customer_name
FROM customer
LEFT JOIN city ON customer.city_id = city.id
LEFT JOIN country ON city.country_id = country.id;
You can find a nice guide for the joins and SQL here:
https://www.sqlshack.com/learn-sql-join-multiple-tables/
You should be using LEFT JOIN. In fact, you should never be using commas in the FROM clause. That is just archaic syntax and closes the powerful world of JOINs from your queries.
I also recommend using table aliases that are abbreviations of table names. The best are abbreviations for the table names:
SELECT *
FROM dbo.Customers c LEFT JOIN
dbo.Projects p
ON c.CustomerID = p.CustomerID LEFT JOIN
dbo.Vacations v
ON p.HDContactID = v.HDContactID
WHERE c.Phone = $Phone;
Have you try this to skip vacation record if not present like this:
SELECT * FROM dbo.Customers, dbo.Projects, dbo.Vacations
WHERE ($Phone = dbo.Customers.Phone)
AND dbo.Customers.CustomerID = dbo.Projects.CustomerID
AND (dbo.Vacations.HDContactID IS NULL OR dbo.Projects.HDContactID = dbo.Vacations.HDContactID)

How to join 4 tables in SQL?

I just started using SQL and I need some help. I have 4 tables in a database. All four are connected with each other. I need to find the amount of unique transactions but can't seem to find it.
Transactions
transaction_id pk
name
Partyinvolved
transaction.id pk
partyinvolved.id
type (buyer, seller)
PartyCompany
partyinvolved.id
Partycompany.id
Companies
PartyCompany.id pk
sector
pk = primary key
The transaction is unique if the conditions are met.
I only need a certain sector out of Companies, this is condition1. Condition2 is a condition inside table Partyinvolved but we first need to execute condition1. I know the conditions but do not know where to put them.
SELECT *
FROM group
INNER JOIN groupB ON groupB.group_id = group.id
INNER JOIN companies ON companies.id = groupB.company_id
WHERE condition1 AND condition2 ;
I want to output the amount of unique transactions with the name.
It is a bit unclear what you are asking as your table definitions look like your hinting at column meanings more than names such as partycompany.id you are probably meaning the column that stores the relationship to PartyCompany column Id......
Anyway, If I follow that logic and I look at your questions about wanting to know where to limit the recordsets during the join. You could do it in Where clause because you are using an Inner Join and it wont mess you your results, but the same would not be true if you were to use an outer join. Plus for optimization it is typically best to add the limiter to the ON condition of the join.
I am also a bit lost as to what exactly you want e.g. a count of transactions or the actual transactions associated with a particular sector for instance. Anyway, either should be able to be derived from a basic query structure like:
SELECT
t.*
FROM
Companies co
INNER JOIN PartyCompancy pco
ON co.PartyCompanyId = pco.PartyCompanyId
INNER JOIN PartyInvolved pinv
ON pco.PartyInvolvedId = pinv.PartyInvolvedId
AND pinv.[type] = 'buyer'
INNER JOIN Transactions t
ON ping.TransactionId = t.TransactionId
WHERE
co.sector = 'some sector'

SQL for many to one to many table

I have three tables in an Access database that I am using in java via ucanaccess.
Patients (PK Pt_ID)
Endoscopy (PK Endo_ID, FK Pt_ID)
Histology (PK Histol_ID, FK Pt_ID)
1 patient can have many endoscopies
1 patient can have many histologies
Endoscopy and histology are not related
I want to retrieve all the Endoscopies and histologies for a single patients in a single SQL query. Although I can write select statements for two tables I don't know how to do this across the three tables. Is it something like this
Select *.Endoscopy,*.Histology from Patients INNER JOIN Endoscopy, Histology ON Patient.Pt_Id=Endoscopy.Pt_ID, Patient.Pt_Id=Histology.Pt_ID
I'm sure that's a mess though...
What kind of SQL DB are you using?
I believe this works on most.
SELECT * FROM Patients, Endoscopy, Histology
WHERE Patient.Pt_Id=Endoscopy.Pt_ID
AND Patient.Pt_Id=Histology.Pt_ID
Also, I belive you have these switched around *.Endoscopy,*.Histology If you need to use that it should be Endoscopy.*, Histology.*
You can use the following query to select both endoscopies and histologies :
SELECT p.Pt_ID
, e.Endo_ID
, h.Histol_ID
FROM Patients p
INNER JOIN Endoscopy e ON p.Pt_Id = e.Pt_ID
INNER JOIN Histology h ON p.Pt_Id = h.Pt_ID
But I'm not sure this is really what you want. You might need to map the tables Patients, Endoscopy and Histology into Java classes ? In this case, you can consider the Java Persistence API (JPA). It helps you to handle these tables in your java code. Here is a JPA Quick Guide.
First idea is to use inner join (with correct syntax) but this is wrong. inner join returns patients who have both procedure. Pure left join returns additionally patients who have none. So this is the solution:
SELECT Patients.Pt_PK, Endoscopy.*, Histology.*
FROM Patients
LEFT JOIN Endoscopy ON Patients.Pt_Id = Endoscopy.Pt_ID
LEFT JOIN Histology ON Patients.Pt_Id = Histology.Pt_ID
--exclude patients who don't have any
where coalesce(Endoscopy.Endo_ID, Histology.Histol_ID) is not null
If you have multiple Endoscopy records or multiple Histology records for the same Patient then you will receive duplicate/repeated records in your SELECT. I do no think there is a way around that unless you use 2 SELECT statements instead of 1.
SELECT Endoscopy.*, Histology.*
FROM Patients
INNER JOIN Endoscopy ON Patients.Pt_Id = Endoscopy.Pt_ID
INNER JOIN Histology ON Patients.Pt_Id = Histology.Pt_ID
To select all records on a table in the select its table name/table alias .*
INNER JOIN will only select records where there is a relationship, once one of these tables does not contain a Pt_ID where it is contained in any one of the other tables then no record will be displayed with that Pt_ID
To add additional tables continue to add additional join statements
You used Patients (with S) in one location and Patient (no S) in another, make sure you use the correct naming. I am guessing its Patients but maybe its Patient.
This statement does almost the same as the above but uses LEFT JOIN syntax so that you will always get records for both tables even if one of the two tables does not have a record for a patient.
SELECT Endoscopy.*, Histology.*
FROM Patients
LEFT JOIN Endoscopy ON Patients.Pt_Id = Endoscopy.Pt_ID
LEFT JOIN Histology ON Patients.Pt_Id = Histology.Pt_ID
WHERE Histology.Histol_ID IS NOT NULL OR Endoscopy.Endo_ID IS NOT NULL
The added WHERE clause ensures that you do not get a record with all NULL values where there is a patient but no records in either of those tables.

SQL like clause is not returning any results

I have following query, but it doesn't return any results for where clauses, even when there is row with that kind of name what is queried. If I remove where clause, then all records in Company table which have OfficeLocation table are returned. What is wrong in my query?
SELECT c.*
FROM [MyDb].[dbo].[Company] AS c
INNER JOIN [MyDb].[dbo].[CompanyOfficeLocation] AS col ON c.Id = col.CompanyId
INNER JOIN [MyDb].[dbo].[OfficeLocation] AS ol ON ol.Id = col.OfficeLocationId
WHERE ol.Name like '%Actual Name In This Table%';
Table structure :
Company
Id
etc ...
CompanyOfficeLocation
CompanyId
OfficeLocationId
OfficeLocation
Id
etc ...
Two things for a record to show up given your query:
The OfficeLocation you specified (given the ol.Name value) must have an Id value that is used by a record in the CompanyOfficeLocation table in its OfficeLocationId.
The CompanyOfficeLocation record that you got in #1 must have a CompanyId that exists in the Company table.
If any of those two criteria are not met, then no records will show up in your query result. The INNER JOIN is essentially an 'AND' clause. If a record could not be related to at least one INNER JOINed table, then that record will not show up at all.
If you want a record to show up despite not having any related records in the joined tables, you may want to consider using OUTER JOINs. A RIGHT JOIN in your case to be exact.
I do not find any mistake however I'd suggest you switch the columns after ON when joining to maintain standards.
Instead of - INNER JOIN [MyDb].[dbo].[OfficeLocation] AS ol ON ol.Id = col.OfficeLocationId
Do - INNER JOIN [MyDb].[dbo].[OfficeLocation] AS ol ON col.OfficeLocationId = ol.Id