SQL query wrong index when where on join - sql

I have a query with joins that is not using the index that would be the best match and I am looking for help to correct this.
I have the following query:
select
equipment.name,purchaselines.description,contacts.name,vendors.accountNumber
from purchaselines
left join vendors on vendors.id = purchaselines.vendorId
left join contacts on contacts.id = vendors.contactId
left join equipment on equipment.id = purchaselines.equipmentId
where contacts.id = 12345
The table purchaselines has an index on the column vendorId, which is the proper index to use. When the query is run, I know the value of contacts.id which is joined to vendors.contactId which is joined to purchaselines.vendorId.
What is the proper way to run this query? Currently, no index is used on the table purchaselines.

If you intend to query a specific contact, I would put THAT table first, since it is the primary basis of the query. Additionally, you had left joins to the other tables (vendors, contacts, equipment). Putting a WHERE condition on the CONTACTS table forces that join to behave as an INNER JOIN, thus REQUIRING a matching contact (and therefore a matching vendor).
That said, I would try to rewrite the query as (also using aliases for simplified readability of longer table names)
select
      e.name,
      pl.description,
      c.name,
      v.accountNumber
   from
      contacts c
         join vendors v
            on c.id = v.contactid
            join purchaselines pl
               on v.id = pl.vendorid
               join equipment e
                  on pl.equipmentid = e.id
   where
      c.id = 12345
Also notice that indenting the JOINs helps readability (IMO): you can see how each table leads to the next in a more hierarchical manner. They are all regular inner JOINs.
So, the contact ID lookup will be the first / fastest, then to vendors by that contact ID, which should optimize that join (given an index on vendors.contactid). Then, I would expect the index on purchaselines.vendorid to optimize the join to the purchase lines. And finally, the equipment table is joined on its PK.
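To check the claims above on a toy data set, here is a minimal sketch using Python's built-in sqlite3 module. The column types, the index names (idx_vendors_contactId, idx_purchaselines_vendorId) and the sample rows are assumptions made up for illustration, and SQLite's planner is not necessarily the same as the one in your database, so treat the printed plan as a hint only.

```python
import sqlite3

# Hypothetical schema and data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE vendors (id INTEGER PRIMARY KEY, contactId INTEGER, accountNumber TEXT);
CREATE TABLE equipment (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE purchaselines (id INTEGER PRIMARY KEY, vendorId INTEGER,
                            equipmentId INTEGER, description TEXT);
CREATE INDEX idx_vendors_contactId ON vendors (contactId);
CREATE INDEX idx_purchaselines_vendorId ON purchaselines (vendorId);

INSERT INTO contacts VALUES (12345, 'Acme contact'), (2, 'Other contact');
INSERT INTO vendors VALUES (10, 12345, 'ACC-10'), (20, 2, 'ACC-20');
INSERT INTO equipment VALUES (100, 'Drill'), (200, 'Saw');
INSERT INTO purchaselines VALUES
    (1, 10, 100, 'drill bits'),
    (2, 20, 200, 'saw blades'),
    (3, NULL, 100, 'no vendor');
""")

# Original form: LEFT JOINs, then a WHERE on the right-hand contacts table.
left_join_sql = """
SELECT equipment.name, purchaselines.description, contacts.name, vendors.accountNumber
FROM purchaselines
LEFT JOIN vendors ON vendors.id = purchaselines.vendorId
LEFT JOIN contacts ON contacts.id = vendors.contactId
LEFT JOIN equipment ON equipment.id = purchaselines.equipmentId
WHERE contacts.id = 12345
"""

# Rewritten form: start from contacts and use inner joins.
inner_join_sql = """
SELECT e.name, pl.description, c.name, v.accountNumber
FROM contacts c
JOIN vendors v ON c.id = v.contactId
JOIN purchaselines pl ON v.id = pl.vendorId
JOIN equipment e ON pl.equipmentId = e.id
WHERE c.id = 12345
"""

left_rows = sorted(conn.execute(left_join_sql).fetchall())
inner_rows = sorted(conn.execute(inner_join_sql).fetchall())
print(left_rows == inner_rows)

# Column 3 of each EXPLAIN QUERY PLAN row is the human-readable step description.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + inner_join_sql)]
for step in plan:
    print(step)
```

One caveat: the two forms only agree when every matching purchase line has an equipment row, as in this sample data. A purchase line with a NULL equipmentId would survive the LEFT JOIN version but be dropped by the inner join to equipment, so keep that last join as a LEFT JOIN if you need those rows.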
FEEDBACK Basic JOIN clarification.
JOIN is just the explicit statement of how two tables are related: you list the left-side and right-side tables, and the join condition states the relationship between them. That is all.
Now, in your data example, each table is nested under the one before it. It is quite common, though, for one table to link to multiple other tables. Take an employee, for example: an employee could have an ethnicity ID linking to an ethnicity lookup table, and also a job position ID linking to a job position lookup table. That might look something like
select
e.name,
eth.ethnicity,
jp.jobPosition
from
employee e
join ethnicity eth
on e.ethnicityid = eth.id
join jobPosition jp
on e.jobPositionID = jp.id
Notice here that both ethnicity and jobPosition are at the same hierarchical level relative to the employee table. If, for example, you only wanted certain types of employees, you can add the extra conditions directly at the join, such as
join jobPosition jp
on e.jobPositionID = jp.id
AND jp.jobPosition = 'Manager'
This would get you a list of only those employees who are managers. You do not need to explicitly add a WHERE condition if you already include it directly in the JOIN/ON criteria. Keeping table-specific criteria at the join also helps if you ever find yourself needing LEFT JOINs.
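The difference between putting the condition in ON and putting it in WHERE only shows up with a LEFT JOIN. A small sketch with Python's sqlite3 module, using made-up employee rows (the table layouts are assumed, only the names follow the example above):

```python
import sqlite3

# Hypothetical data: one manager, one clerk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobPosition (id INTEGER PRIMARY KEY, jobPosition TEXT);
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, jobPositionID INTEGER);
INSERT INTO jobPosition VALUES (1, 'Manager'), (2, 'Clerk');
INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 2);
""")

# Inner join with the filter in ON: only managers come back.
inner_on = conn.execute("""
    SELECT e.name, jp.jobPosition
    FROM employee e
    JOIN jobPosition jp
        ON e.jobPositionID = jp.id AND jp.jobPosition = 'Manager'
    ORDER BY e.name
""").fetchall()

# Left join with the filter in ON: every employee is kept,
# non-managers simply get NULL for the job position columns.
left_on = conn.execute("""
    SELECT e.name, jp.jobPosition
    FROM employee e
    LEFT JOIN jobPosition jp
        ON e.jobPositionID = jp.id AND jp.jobPosition = 'Manager'
    ORDER BY e.name
""").fetchall()

# Left join with the filter in WHERE: the NULL rows are filtered out again,
# so the result matches the inner join.
left_where = conn.execute("""
    SELECT e.name, jp.jobPosition
    FROM employee e
    LEFT JOIN jobPosition jp ON e.jobPositionID = jp.id
    WHERE jp.jobPosition = 'Manager'
    ORDER BY e.name
""").fetchall()

print(inner_on)
print(left_on)
print(left_where)
```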

Related

How the SQL query with two left join works?

I need help in understanding how left join is working in below query.
There are total three tables and two left joins.
So my question is, the second left join is between customer and books table or between result of first join and books table?
SELECT c.id, c.first_name, c.last_name, s.date AS sale,
b.name AS book, b.genre
FROM customers c
LEFT JOIN sales s
ON c.id = s.customer_id
LEFT JOIN books b
ON s.book_id = b.id;
Good question.
When it comes to outer-joined tables, it depends on the predicates in the ON clause. The engine is free to reorder the fetch and scans on indexes or tables as long as the predicates are respected.
In this particular case there are three tables:
customers (c)
sales (s)
books (b)
customers is the preserved (left-hand) table of both outer joins, so it becomes the driving table; there are other considerations, but for simplicity you can consider that this is the table that is read first. Now, which one is second? sales or books?
The first join predicate c.id = s.customer_id doesn't establish any relationship between the secondary tables; therefore it doesn't affect which table is joined first.
The second join predicate s.book_id = b.id makes books dependent on sales. Therefore, it decides sales is the second table, and books is the last one.
A final note: if you understand the concept of dependency, there are several dirty tricks you can use to force the engine to walk the tables in the order you want. I would not recommend this to a novice, but if at some point you realise the engine is not doing what you want, you can tweak the queries.
The second join joins on s.book_id = b.id, where s is sales and b is books. A record in the books table will not be returned unless it has a corresponding record in sales, which in turn belongs to a row in customers. Customers with no sales are still returned, with NULL in the sale and book columns, which is what a left join does by definition (https://www.w3schools.com/sql/sql_join_left.asp). Put another way, the book columns of this query cover all books that have been purchased by at least one customer (and books that have been purchased multiple times will appear in the results multiple times).
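To see this behaviour concretely, here is a small sketch that runs the query from the question against made-up rows using Python's sqlite3 module. The column types and sample data are assumptions; the ORDER BY is added only to make the output stable.

```python
import sqlite3

# Hypothetical data: Ann bought one book, Bob bought nothing,
# and the book 'Emma' was never sold.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT, genre TEXT);
CREATE TABLE sales (id INTEGER PRIMARY KEY, customer_id INTEGER,
                    book_id INTEGER, date TEXT);
INSERT INTO customers VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO books VALUES (10, 'Dune', 'SciFi'), (11, 'Emma', 'Classic');
INSERT INTO sales VALUES (1, 1, 10, '2020-01-01');
""")

rows = conn.execute("""
    SELECT c.id, c.first_name, c.last_name, s.date AS sale,
           b.name AS book, b.genre
    FROM customers c
    LEFT JOIN sales s
        ON c.id = s.customer_id
    LEFT JOIN books b
        ON s.book_id = b.id
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

Bob appears with NULLs for the sale and book columns, while the unsold book 'Emma' does not appear at all, because books is only reached through sales.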

How to join 4 tables in SQL?

I just started using SQL and I need some help. I have 4 tables in a database. All four are connected with each other. I need to find the amount of unique transactions but can't seem to find it.
Transactions
    transaction_id pk
    name
Partyinvolved
    transaction.id pk
    partyinvolved.id
    type (buyer, seller)
PartyCompany
    partyinvolved.id
    Partycompany.id
Companies
    PartyCompany.id pk
    sector
pk = primary key
The transaction is unique if the conditions are met.
I only need a certain sector out of Companies, this is condition1. Condition2 is a condition inside table Partyinvolved but we first need to execute condition1. I know the conditions but do not know where to put them.
SELECT *
FROM group
INNER JOIN groupB ON groupB.group_id = group.id
INNER JOIN companies ON companies.id = groupB.company_id
WHERE condition1 AND condition2 ;
I want to output the amount of unique transactions with the name.
It is a bit unclear what you are asking, as your table definitions look like you are hinting at column meanings rather than giving actual names. For example, by partycompany.id you probably mean the column that stores the relationship to the Id column of PartyCompany.
Anyway, if I follow that logic and look at your question about where to limit the record sets during the join: you could do it in the WHERE clause, because you are using an INNER JOIN and it won't mess up your results, but the same would not be true if you were to use an outer join. Plus, for optimization it is typically best to add the limiter to the ON condition of the join.
I am also a bit lost as to what exactly you want, e.g. a count of transactions or the actual transactions associated with a particular sector. Anyway, either should be derivable from a basic query structure like:
SELECT
t.*
FROM
Companies co
INNER JOIN PartyCompany pco
ON co.PartyCompanyId = pco.PartyCompanyId
INNER JOIN PartyInvolved pinv
ON pco.PartyInvolvedId = pinv.PartyInvolvedId
AND pinv.[type] = 'buyer'
INNER JOIN Transactions t
ON pinv.TransactionId = t.TransactionId
WHERE
co.sector = 'some sector'
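If what you need is the number of unique transactions, the same join structure can feed a COUNT(DISTINCT ...). Below is a minimal sketch with Python's sqlite3 module. The column names follow the guessed names in the query above and the rows are invented, so adjust both to your real schema.

```python
import sqlite3

# Hypothetical schema: column names are guesses, as in the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (TransactionId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE PartyInvolved (PartyInvolvedId INTEGER PRIMARY KEY,
                            TransactionId INTEGER, type TEXT);
CREATE TABLE PartyCompany (PartyCompanyId INTEGER PRIMARY KEY,
                           PartyInvolvedId INTEGER);
CREATE TABLE Companies (PartyCompanyId INTEGER PRIMARY KEY, sector TEXT);

INSERT INTO Transactions VALUES (1, 'Deal A'), (2, 'Deal B');
-- Deal A has two buyers, Deal B only has a seller.
INSERT INTO PartyInvolved VALUES (100, 1, 'buyer'), (101, 1, 'buyer'), (102, 2, 'seller');
INSERT INTO PartyCompany VALUES (500, 100), (501, 101), (502, 102);
INSERT INTO Companies VALUES (500, 'tech'), (501, 'tech'), (502, 'tech');
""")

joins = """
    FROM Companies co
    INNER JOIN PartyCompany pco
        ON co.PartyCompanyId = pco.PartyCompanyId
    INNER JOIN PartyInvolved pinv
        ON pco.PartyInvolvedId = pinv.PartyInvolvedId
        AND pinv.type = 'buyer'
    INNER JOIN Transactions t
        ON pinv.TransactionId = t.TransactionId
    WHERE co.sector = 'tech'
"""

# Joined rows: one per matching buyer, so Deal A is counted twice.
joined_rows = conn.execute("SELECT COUNT(*)" + joins).fetchone()[0]

# Unique transactions: each transaction is counted once.
unique_count = conn.execute(
    "SELECT COUNT(DISTINCT t.TransactionId)" + joins
).fetchone()[0]

# The unique transactions themselves, with their names.
unique_transactions = conn.execute(
    "SELECT DISTINCT t.TransactionId, t.name" + joins
).fetchall()

print(joined_rows, unique_count, unique_transactions)
```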

SQL INNER JOIN without linked column

I have an UltraGrid displaying customer information in it. The way the database is set up, there are 2 tables. Customers and Customer_Addresses. I need to be able to display all of the columns from Customers as well as Town and Region from Customer_Addresses, but I'm under the impression that I'd need Town and Region columns in the Customer table to be able to do this? I've never used an INNER JOIN before so I'm not sure if this is true or not, so can anybody give me pointers on how to do this, or if I need the matching columns or not?
Does it even require an INNER JOIN, or is there an alternative way to do this?
Below are the design views of both of the tables - Is it possible to display Add4 and Add5 from Customer_Addresses with all of Customers?
As long as you have another key column you can use to link the tables (ex. ID_Column), it is better that you use LEFT JOIN.
Example:
SELECT c.col1, ... , c.colN, a.town, a.region FROM Customers c
LEFT JOIN Customer_Addresses a ON a.ID_Column = c.ID_Column
In order to clarify how JOIN types work, look at this picture:
In our case, using a LEFT JOIN will take all information from the Customers table, along with any found matching (on ID) information from Customer_Addresses table.
First of all, you need some column in common between the two tables. Then all you have to do is:
CREATE TABLE all_things
AS
SELECT * -- or list only the columns that you want to have in the new table
FROM Customers AS a1
INNER JOIN Customer_Addresses AS a2 ON a1.column_in_common = a2.column_in_common
The point is what kind of join you want.
If you can continue the process without having matching information in table Customers or in table Customer_Addresses, maybe you need an OUTER JOIN or another kind of JOIN.
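A quick way to decide between the two answers is to run both join types on a customer that has no address row. The sketch below uses Python's sqlite3 module; the linking column customer_id and the sample rows are assumptions, since the real table designs are not shown here.

```python
import sqlite3

# Hypothetical layout: customer_id is an assumed linking column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Customer_Addresses (customer_id INTEGER, town TEXT, region TEXT);
INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
-- Only Ann has an address row.
INSERT INTO Customer_Addresses VALUES (1, 'Leeds', 'Yorkshire');
""")

# INNER JOIN: customers without an address are dropped.
inner_rows = conn.execute("""
    SELECT c.name, a.town, a.region
    FROM Customers c
    INNER JOIN Customer_Addresses a ON a.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# LEFT JOIN: every customer is kept, missing address columns are NULL.
left_rows = conn.execute("""
    SELECT c.name, a.town, a.region
    FROM Customers c
    LEFT JOIN Customer_Addresses a ON a.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

Note that neither query needs Town or Region columns in Customers; the join only needs one shared key column.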

Viewing Data from two tables while linking multiple tables in SQL

I am trying to set up a view in a database. I want to see all the data in the PERSON table and three columns from the NON_PERSONNEL table for a program from the PROGRAM table. This is what I am trying now; the query runs without errors but doesn't give me any results. All 4 of the tables listed below are required to derive the answer.
SELECT
person.*,
non_personnel.description,
non_personnel.amount
FROM
person,
non_personnel,
personnel_role,
programs
WHERE
person.person_id = personnel_role.person_id
AND personnel_role.program_id = programs.program_id
AND programs.program_id = non_personnel.program_id
AND programs.program_name = 'Fake Program'
You need to use LEFT JOIN, so you get all persons; description and amount can be NULL if that person doesn't have records in the other tables.
Also, use explicit join syntax.
SELECT person.*, non_personnel.description, non_personnel.amount
FROM person
left join personnel_role
ON person.person_id = personnel_role.person_id
left join programs
ON personnel_role.program_id = programs.program_id
AND programs.program_name = 'Fake Program'
left join non_personnel
ON programs.program_id = non_personnel.program_id
This really depends on your schema and the data in the tables. The way you have it written now means that only records that match in every table (according to your WHERE conditions) are passed into the result set.
This means that you have to have all program_id's in your programs table that you want returned in the results ALSO in your non_personnel table. They must also ALL be in your personnel_role table. And all person_ids in your personnel_role table must be in the person table. You get no results back, so this is probably not what you meant to write.
My guess is that you want to use a LEFT OUTER JOIN here. LEFT OUTER JOIN says "Take all records from one table and ONLY records from the joined table that meet the criteria in your ON statement".
Because you are wanting information based on a particular Program, chances are you want to start with that table:
SELECT person.*,
non_personnel.description,
non_personnel.amount
FROM
programs
LEFT OUTER JOIN personnel_role ON
programs.program_id = personnel_role.program_id
LEFT OUTER JOIN person ON
personnel_role.person_id = person.person_id
LEFT OUTER JOIN non_personnel ON
programs.program_id = non_personnel.program_id
WHERE
programs.program_name = 'Fake Program'
This is a bit of an assumption since I have no idea what your schema is or how your data is built, but I'm betting it's what you are after.
What this FROM clause says is:
1. Take all records from programs (where program_name = 'Fake Program') and only records from personnel_role that share the same program_id
2. Take only the records from person where the person_id matches the records we just got from the personnel_role table
3. Take only the records from non_personnel where it shares a program_id with the results from the program table.
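The three steps above can be tried out on a tiny data set. This sketch uses Python's sqlite3 module and assumes one plausible cause of the empty result: the program has a person assigned but no non_personnel rows. The column types and sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical data: 'Fake Program' has a person but no non_personnel rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE programs (program_id INTEGER PRIMARY KEY, program_name TEXT);
CREATE TABLE personnel_role (person_id INTEGER, program_id INTEGER);
CREATE TABLE non_personnel (program_id INTEGER, description TEXT, amount REAL);
INSERT INTO person VALUES (1, 'Ann');
INSERT INTO programs VALUES (1, 'Fake Program');
INSERT INTO personnel_role VALUES (1, 1);
""")

# Original comma-join form: every table must have a matching row.
comma_rows = conn.execute("""
    SELECT person.*, non_personnel.description, non_personnel.amount
    FROM person, non_personnel, personnel_role, programs
    WHERE person.person_id = personnel_role.person_id
      AND personnel_role.program_id = programs.program_id
      AND programs.program_id = non_personnel.program_id
      AND programs.program_name = 'Fake Program'
""").fetchall()

# LEFT OUTER JOIN form driven from programs: missing rows become NULLs.
left_rows = conn.execute("""
    SELECT person.*, non_personnel.description, non_personnel.amount
    FROM programs
    LEFT OUTER JOIN personnel_role
        ON programs.program_id = personnel_role.program_id
    LEFT OUTER JOIN person
        ON personnel_role.person_id = person.person_id
    LEFT OUTER JOIN non_personnel
        ON programs.program_id = non_personnel.program_id
    WHERE programs.program_name = 'Fake Program'
""").fetchall()

print(comma_rows)
print(left_rows)
```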

SQL Join ON field differences

I have a question regarding the use of JOINS from SQL:
What's the difference between:
Select * From Employees E
JOIN Products P ON E.idEmployee = P.idEmployee
JOIN ProductsDetails PD ON P.idProduct = PD.idProduct
AND
Select * From Employees E
JOIN Products P ON P.idEmployee = E.idEmployee
JOIN ProductsDetails PD ON PD.idProduct = P.idProduct
Also, what's the difference between:
Select * From Employees E
JOIN Products P ON E.idemployee =P.idemployee
Where P.name like '%prod01%'
AND
Select * From Employees E
JOIN Products P ON E.idemployee =P.idemployee
Where E.ProdName like '%prod01%' -- considering the fact that the field ProdName also exists in the table Products.
Actually, how does a query actually work? I mean the workflow:
Select * From Employees E
JOIN Products P ON E.idemployee = P.idEmployee
JOIN ProductsDetails PD ON P.idProduct = PD.idProduct
JOIN OtherTable OT ON E.idField = OT.idField
where E.ProductNumber = 1 and OT.idOfAnotherField = value
How does the where condition on the join clauses affect the main query? How does the query actually work, what does it fetch first, and how does it apply the conditions?
This is a bit long for a comment.
SQL is a descriptive language not a procedural language. That is, a select statement describes the results produced by processing the data, but not the methods used to achieve it. Two important parts of the database engine are the optimizer which determines how the query will be executed and the execution engine which actually executes it.
From the perspective of the optimizer, col1 = col2 and col2 = col1 are the same. So, there is no difference in the first two queries. From the perspective of what the query does, the two examples with name are the same. Or, at least, superficially the same. The two product columns could have different collations set, which would affect the meaning.
As for your final question, you need to look at your database documentation. Specifics on the execution are highly database dependent. You should also learn about explain and execution plans.
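As a starting point for looking at execution plans, here is a rough sketch using SQLite through Python's sqlite3 module. EXPLAIN QUERY PLAN is SQLite's syntax; other databases use their own commands, so check your documentation. The table layouts, the index name idx_products_idEmployee and the rows are assumptions for illustration. One would expect both operand orders to produce the same rows and the same plan, but only the rows are compared here; the plans are printed so you can compare them by eye.

```python
import sqlite3

# Hypothetical tables modelled on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (idEmployee INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Products (idProduct INTEGER PRIMARY KEY, idEmployee INTEGER, name TEXT);
CREATE INDEX idx_products_idEmployee ON Products (idEmployee);
INSERT INTO Employees VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Products VALUES (10, 1, 'prod01'), (11, 2, 'prod02');
""")

# Same join, with the operands of the ON condition swapped.
query_a = """
    SELECT E.name, P.name
    FROM Employees E
    JOIN Products P ON E.idEmployee = P.idEmployee
    ORDER BY 1, 2
"""
query_b = """
    SELECT E.name, P.name
    FROM Employees E
    JOIN Products P ON P.idEmployee = E.idEmployee
    ORDER BY 1, 2
"""

def query_plan(sql):
    """Return the human-readable steps SQLite reports for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

rows_a = conn.execute(query_a).fetchall()
rows_b = conn.execute(query_b).fetchall()
print(rows_a == rows_b)

print(query_plan(query_a))
print(query_plan(query_b))
```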
There is no difference between the first two queries. You are just providing the criteria for joining the two tables, so Table1.FieldName = Table2.FieldName is the same as Table2.FieldName = Table1.FieldName.
The difference between the second two queries is that the first one searches for the product name in the Products table, and the second one searches in the Employees table.
If there is a product with a name like '%prod01%' in the Products table, then the first query will return it and the second will not; and if there is a product with a name like '%prod01%' in the Employees table, the second query will return a value and the first will not.
Logically, the query starts by building the joins: it selects the Employees table, then joins Employees to Products where idemployee equals idEmployee, then does the same for Products and ProductsDetails, and then for Employees and OtherTable. At the end, the query filters the result based on the ProductNumber of the Employees table and the idOfAnotherField of OtherTable.