Access 2007 query to not cut-off records if one particular join doesn't exist

I feel a bit stupid for asking this, because I'm sure there is a simple other type of join I should be using, however, I can't seem to find the answer, so I'm hoping one of you can point me in the right direction.
I have a big query in Access 2007 that pulls in records, but in some cases, I can't use an INNER JOIN on some tables, because linking records may not exist, so the main record rightly drops off. I can get round this problem by using IIF statements, checking if an entry exists first, but this makes the query terribly slow. I simplify the scenario below. Many thanks in advance:

Use a LEFT JOIN instead of an INNER JOIN.
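As a minimal sketch of the difference, the snippet below runs both join types against an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for Access here, and the Orders and Notes tables are hypothetical, since the question's own schema was not shown.

```python
import sqlite3

# Hypothetical tables: the schema from the question is not available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Customer TEXT);
    CREATE TABLE Notes  (OrderID INTEGER, Note TEXT);
    INSERT INTO Orders VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Notes  VALUES (1, 'rush'), (3, 'gift wrap');
""")

# INNER JOIN keeps only orders that have a matching note.
inner_rows = conn.execute("""
    SELECT o.OrderID, o.Customer, n.Note
    FROM Orders AS o
    INNER JOIN Notes AS n ON o.OrderID = n.OrderID
    ORDER BY o.OrderID
""").fetchall()

# LEFT JOIN keeps every order and fills the note with NULL when none exists.
left_rows = conn.execute("""
    SELECT o.OrderID, o.Customer, n.Note
    FROM Orders AS o
    LEFT JOIN Notes AS n ON o.OrderID = n.OrderID
    ORDER BY o.OrderID
""").fetchall()

print(inner_rows)  # order 2 is missing because it has no note
print(left_rows)   # order 2 is present, with None (NULL) for the note
```

No IIF checks are needed: the missing side simply comes back as NULL.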

Related

Update with inner join in compact SQL

In Access, I can use:
UPDATE Projects
INNER JOIN (Images INNER JOIN ImageCrossRef
ON Images.ImageId = ImageCrossRef.ImageId)
ON Projects.ProjectID = ImageCrossRef.ProjectId
SET Images.Folder = [Projects].[Folder];
to update a field in a table based upon two inner joins to another field, but this fails in Compact SQL. I've tried various suggestions with WHERE EXISTS, but none seem to work. Any suggestions?

You should "categorically" be aware that certain types of queries are not "directly updatable." This might well be one of them. Things which Microsoft Access is able to do, other engines might not. (And, SQL syntax which one server might accept, another one might not.)
One possibility which sometimes works is to use a "nested query" to provide the IDs of the records (Images) that are to be updated, and the values that should be inserted into them.
A second possibility is to use a "stored procedure," which is basically a small sub-program that is executed by the SQL server. The code in that procedure would do a SELECT as before, then iterate through that result-set issuing individual UPDATE statements in a loop.
I'm not familiar enough with Compact SQL to actually write a code-example for you, but I hope this at least gives you some options to think about. "HTH ..."
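For what it's worth, the "nested query" idea could look roughly like the sketch below. It runs on SQLite via Python's sqlite3 module, purely as a stand-in; whether SQL Server Compact accepts this exact form has not been verified here. The table and column names come from the question, the sample rows are invented, and the sketch assumes each image is linked to at most one project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projects (ProjectID INTEGER PRIMARY KEY, Folder TEXT);
    CREATE TABLE Images (ImageId INTEGER PRIMARY KEY, Folder TEXT);
    CREATE TABLE ImageCrossRef (ImageId INTEGER, ProjectId INTEGER);
    INSERT INTO Projects VALUES (10, 'alpha'), (20, 'beta');
    INSERT INTO Images VALUES (1, NULL), (2, NULL), (3, 'keep');
    INSERT INTO ImageCrossRef VALUES (1, 10), (2, 20);
""")

# The correlated subquery looks up the project folder for each image.
# The WHERE EXISTS guard leaves images without a cross-reference untouched.
conn.execute("""
    UPDATE Images
    SET Folder = (
        SELECT p.Folder
        FROM Projects AS p
        INNER JOIN ImageCrossRef AS x ON p.ProjectID = x.ProjectId
        WHERE x.ImageId = Images.ImageId
    )
    WHERE EXISTS (
        SELECT 1 FROM ImageCrossRef AS x WHERE x.ImageId = Images.ImageId
    )
""")

result = conn.execute(
    "SELECT ImageId, Folder FROM Images ORDER BY ImageId"
).fetchall()
print(result)
```

Image 3 has no cross-reference row, so its folder is left alone.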

How does this SQL syntax work? [duplicate]

I came across this horrifyingly freakish SQL query today that another developer generated with the SQL Server query designer tool. I hate the query designer, but I'm stuck trying to figure out what it did. I've never seen syntax like it before and don't understand it. How does it work?
In particular, it is the multiple ON clauses joined together, separate from the JOIN clauses, that is throwing me off.
SELECT *
FROM dbo.tblDealStatus
RIGHT OUTER JOIN dbo.tblUser
RIGHT OUTER JOIN dbo.tblOwnerLocation
INNER JOIN dbo.tblOwner
INNER JOIN dbo.tblDeal
ON dbo.tblOwner.OwnerID = dbo.tblDeal.OwnerID
ON dbo.tblOwnerLocation.DealID = dbo.tblDeal.DealID
ON dbo.tblUser.UserID = dbo.tblDeal.CHK_Contact
LEFT OUTER JOIN dbo.tblCompany AS tblCompany_1
INNER JOIN dbo.tblParticipation
ON tblCompany_1.CompanyID = dbo.tblParticipation.CompanyID
ON dbo.tblDeal.ParticipationID = dbo.tblParticipation.ParticipationID
ON /*...
....so on and so forth...*/

First, I make it a rule for clarity to never mix right and left joins in the same query. All right joins can be switched to left joins, and that alone will make it easier to figure out what is going on.
Next, abandon SELECT *. It is never appropriate in a query with joins, as you are returning the same data in two or more fields (the join fields), and that wastes valuable network and database processing time.
I believe the weird ONs are forcing the query to go in a particular order. They are bad and should not be used, in my opinion: they are hard to maintain and hard for developers to understand, because they are uncommon and totally unneeded. Just reversing the right joins and putting the tables in the order you need to join them may fix this. If not, you may need a few derived tables to get the right data. Note that in reversing it, you may need to change those inner joins to something else. Right now it is such a mess that it is highly likely it does not return the correct results. So in rewriting it, while you would want to see whether your changes alter the results, you would also want to use judgement to determine whether a difference is a fix for a bad query or a mistake made while translating the query into something maintainable.
If the developer who wrote this mess is still there, I would force him to rewrite it in more standard SQL and tell him he is forbidden to ever use the query designer again. This fails code review as far as I am concerned.
If I were rewriting this, I would look for the table that should be first in the query and work down from there. My personal guess right now is that it would be the tblDeal table but I don't know your data model so I could be wrong.
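To see why a rewrite has to be checked against the results, here is a small sketch of a nested join next to a naive flat rewrite. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, so the nested join is written with explicit parentheses. The three tables are cut-down, hypothetical versions of tblDeal, tblParticipation and tblCompany, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Deal (DealID INTEGER PRIMARY KEY, ParticipationID INTEGER);
    CREATE TABLE Participation (ParticipationID INTEGER PRIMARY KEY, CompanyID INTEGER);
    CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Deal VALUES (1, 100), (2, 200), (3, NULL);
    INSERT INTO Participation VALUES (100, 7), (200, 8);
    INSERT INTO Company VALUES (7, 'Acme');  -- company 8 is deliberately missing
""")

# Nested form: the inner join is evaluated as a unit, then left-joined to Deal,
# so every deal survives.
nested = conn.execute("""
    SELECT Deal.DealID, Company.Name
    FROM Deal
    LEFT JOIN (Company
               INNER JOIN Participation
               ON Company.CompanyID = Participation.CompanyID)
    ON Deal.ParticipationID = Participation.ParticipationID
    ORDER BY Deal.DealID
""").fetchall()

# Naive flat rewrite: the trailing INNER JOIN filters out deals whose
# participation row has no matching company.
flat = conn.execute("""
    SELECT Deal.DealID, Company.Name
    FROM Deal
    LEFT JOIN Participation
        ON Deal.ParticipationID = Participation.ParticipationID
    INNER JOIN Company
        ON Company.CompanyID = Participation.CompanyID
    ORDER BY Deal.DealID
""").fetchall()

print(nested)
print(flat)
```

On this data the flat version silently loses deals 2 and 3, which is the kind of change to watch for when the inner joins are moved around.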

Methods of visualizing joins

Just wondering if anyone has any tricks (or tools) they use to visualize joins. You know, you write the perfect query, hit run, and after it's been running for 20 minutes, you realize you've probably created a Cartesian join.
I sometimes have difficulty visualizing what's going to happen when I add another join statement and wondered if folks have different techniques they use when trying to put together lots of joins.

Always keep the end in mind.
Ascertain which columns you need.
Try to figure out the minimum number of tables needed to get them.
Write your FROM part with the table that supplies the most columns, e.g. FROM Teams T
Add each join one by one on a new line. At each step, decide whether you need an OUTER, INNER, LEFT, or RIGHT JOIN.
Usually works for me. Keep in mind that it is Structured query language. Always break your query into logical lines and it's much easier.
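One way to apply the join-by-join approach is to count rows after each added join, so an accidental blow-up shows up early rather than after 20 minutes. The sketch below does this with SQLite via Python's sqlite3 module; the Teams, Players and Coaches tables and the count_rows helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams   (TeamID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Players (PlayerID INTEGER PRIMARY KEY, TeamID INTEGER, Name TEXT);
    CREATE TABLE Coaches (CoachID INTEGER PRIMARY KEY, TeamID INTEGER, Name TEXT);
    INSERT INTO Teams VALUES (1, 'Reds'), (2, 'Blues');
    INSERT INTO Players VALUES (1, 1, 'Ann'), (2, 1, 'Bob'), (3, 2, 'Cy');
    INSERT INTO Coaches VALUES (1, 1, 'Dee'), (2, 2, 'Eli');
""")

def count_rows(from_clause):
    """Return the number of rows produced by the given FROM clause."""
    return conn.execute(f"SELECT COUNT(*) FROM {from_clause}").fetchone()[0]

# Build the FROM clause one join at a time and check the count at each step.
steps = [
    "Teams T",
    "Teams T INNER JOIN Players P ON P.TeamID = T.TeamID",
    "Teams T INNER JOIN Players P ON P.TeamID = T.TeamID "
    "INNER JOIN Coaches C ON C.TeamID = T.TeamID",
]
counts = [count_rows(step) for step in steps]

# The same last step with its join condition left out: every player/team row
# is paired with every coach.
runaway = count_rows(
    "Teams T INNER JOIN Players P ON P.TeamID = T.TeamID CROSS JOIN Coaches C"
)

print(counts, runaway)
```

A count that jumps more than expected between two steps points at the join that was just added.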

Every join combines two resultsets into one. Each may be from a single database table or a temporary resultset which is the result of previous join(s) or of a subquery.
Always know the order in which joins are processed, and, for each join, know the nature of the two temporary resultsets that you are joining together. Know what logical entity each row in that resultset represents, and what attributes in that resultset uniquely identify that entity. If your join is intended to always join one row to one row, these key attributes are the ones you need to use (in join conditions) to implement the join. If your join is intended to generate some kind of Cartesian product, then it is critical to understand the above to understand how the join conditions (whatever they are) will affect the cardinality of the new joined resultset.
Try to be consistent in the use of outer join directions. I try to always use Left Joins when I need an outer join, as I "think" of each join as "joining" the new table (to the right) to whatever I have already joined together (on the left) of the Left Join statement...
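A quick way to check whether the attributes in a join condition really identify one row is to look for duplicate key values before joining. The following is a sketch on SQLite via Python's sqlite3 module; the Customers and Addresses tables and the duplicate_keys helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Addresses (CustomerID INTEGER, City TEXT);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Addresses VALUES (1, 'Oslo'), (1, 'Rome'), (2, 'Lima');
""")

def duplicate_keys(table, key):
    """List key values that occur more than once, with their row counts."""
    sql = (
        f"SELECT {key}, COUNT(*) FROM {table} "
        f"GROUP BY {key} HAVING COUNT(*) > 1 ORDER BY {key}"
    )
    return conn.execute(sql).fetchall()

# CustomerID is not unique in Addresses, so the join is one-to-many.
dups = duplicate_keys("Addresses", "CustomerID")

joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM Customers AS c
    LEFT JOIN Addresses AS a ON a.CustomerID = c.CustomerID
""").fetchone()[0]

print(dups, joined_count)
```

Two customers become three joined rows here, because customer 1 has two addresses.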

Run an explain plan.
These are always hierarchical trees (to do this, first I must do that). Many tools exist to make these plans into graphical trees, some in SQL browsers (e.g., Oracle SQL Developer, or whatever SQL Server's GUI client is called). If you don't have a tool, most plan text output includes a "depth" column, which you can use to indent the lines.
What you want to look for is the cost of each row. (Note that for Oracle, though, higher costs can mean less time, if it allows Oracle to do a hash join rather than nested loops, and if the final result set has high cardinality (many, many rows).)
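As a rough illustration of turning a plan into an indented tree, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. SQLite reports no cost figures and has an id/parent pair instead of a depth column, so this only shows the indentation idea, not Oracle's or SQL Server's output. The Teams and Players tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams   (TeamID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Players (PlayerID INTEGER PRIMARY KEY, TeamID INTEGER, Name TEXT);
    CREATE INDEX idx_players_team ON Players (TeamID);
""")

query = """
    SELECT Teams.Name, Players.Name
    FROM Teams
    INNER JOIN Players ON Players.TeamID = Teams.TeamID
"""

# In current SQLite versions each plan row is (id, parent, unused, detail);
# the parent column links a step to the step it belongs to.
rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

depth = {0: -1}  # id 0 is the implicit root of the plan tree
plan_lines = []
for node_id, parent_id, _unused, detail in rows:
    depth[node_id] = depth.get(parent_id, -1) + 1
    plan_lines.append("  " * depth[node_id] + detail)

print("\n".join(plan_lines))
```

The exact wording of each plan line differs between SQLite versions, so no expected output is shown.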

I have never found a better tool than thinking it through and using my own mind.
If the query is so complicated that you cannot do that, you may want to use CTEs, views, or some other carefully organized subqueries to break it into logical pieces, so you can easily understand and visualize each piece even if you cannot manage the whole.
Also, if your concern is efficiency, then SQL Server Management Studio 2005 or later lets you get estimated query execution plans without actually executing the query. This can give you very good ideas of where problems lie, if you are using MS SQL Server.

INNER JOIN keywords | with and without using them

SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name;
SELECT * FROM TableA, TableB
WHERE TableA.name = TableB.name;
Which is the preferred way and why?
Will there be any performance difference when keywords like JOIN are used?
Thanks

The second way is the classical way of doing it, from before the join keyword existed.
Normally the query processor generates the same database operations from the two queries, so there would be no difference in performance.
Using JOIN better describes what you are doing in the query. If you have many joins, it's also better because each joined table and its condition are beside each other, instead of all tables being in one place and all conditions in another.
Another aspect is that it's easier to do an unbounded join by mistake using the second way, resulting in a cross join containing all combinations from the two tables.
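A small sketch of both points, run on SQLite through Python's sqlite3 module as a stand-in engine: the two forms return the same rows, and dropping the WHERE clause from the comma form quietly produces every combination. TableA and TableB are from the question; the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (name TEXT);
    CREATE TABLE TableB (name TEXT);
    INSERT INTO TableA VALUES ('x'), ('y'), ('z');
    INSERT INTO TableB VALUES ('y'), ('z'), ('w');
""")

# Explicit join syntax.
explicit = conn.execute("""
    SELECT TableA.name
    FROM TableA
    INNER JOIN TableB ON TableA.name = TableB.name
    ORDER BY TableA.name
""").fetchall()

# Classical comma syntax with the condition in WHERE.
implicit = conn.execute("""
    SELECT TableA.name
    FROM TableA, TableB
    WHERE TableA.name = TableB.name
    ORDER BY TableA.name
""").fetchall()

# Comma syntax with the WHERE clause forgotten: a cross join of 3 x 3 rows.
forgotten = conn.execute("SELECT COUNT(*) FROM TableA, TableB").fetchone()[0]

print(explicit, implicit, forgotten)
```

With the explicit form, a join without a condition is much harder to write by accident.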

Use the first one, as it is:
More explicit
Is the Standard way
As for performance - there should be no difference.

find out by using EXPLAIN SELECT …
it depends on the engine used, on the query optimizer, on the keys, on the table; on pretty much everything
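As a sketch of how such a comparison could be set up, the snippet below asks SQLite (through Python's sqlite3 module) for the plan of each form. SQLite's spelling is EXPLAIN QUERY PLAN rather than plain EXPLAIN, other engines use their own syntax, and the plan helper is hypothetical. The result only says something about the engine and version it is run on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (name TEXT);
    CREATE TABLE TableB (name TEXT);
    CREATE INDEX idx_b_name ON TableB (name);
""")

def plan(sql):
    """Return the human-readable detail text of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column

join_plan = plan(
    "SELECT * FROM TableA INNER JOIN TableB ON TableA.name = TableB.name"
)
comma_plan = plan(
    "SELECT * FROM TableA, TableB WHERE TableA.name = TableB.name"
)

print(join_plan)
print(comma_plan)
print("same plan:", join_plan == comma_plan)
```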

In some SQL engines the second form (associative joins) is deprecated. Use the first form.
The second is less explicit and causes beginners to SQL to pause when writing code. It is much more difficult to manage in complex SQL, because the sequence of the join matches has to match the sequence of the WHERE clause: they (the sequence in the code) must match or the results returned will change, which really goes against the idea that sequence should not change the results when elements at the same level are considered.
When joins containing multiple tables are created, it gets REALLY difficult to code, quite fast, using the second form.
EDIT: Performance: I consider ease of coding and debugging part of personal performance, so the first form performs better for editing, debugging, and maintenance: it just takes me less time to do and understand things during the development and maintenance cycles.

Most current databases will optimize both of those queries into the exact same execution plan. However, use the first syntax; it is the current standard. Learning and using this join syntax will help when you write queries with LEFT OUTER JOIN and RIGHT OUTER JOIN, which become tricky and problematic using the older syntax with the joins in the WHERE clause.

Filtering joins solely using WHERE can be extremely inefficient in some common scenarios. For example:
SELECT * FROM people p, companies c WHERE p.companyID = c.id AND p.firstName = 'Daniel'
A database that executes this query quite literally will first take the Cartesian product of the people and companies tables and then filter it down to the rows with matching companyID and id fields. While the fully-unconstrained product does not exist anywhere but in memory and then only for a moment, its calculation does take some time.
A better approach is to group the constraints with the JOINs where relevant. This is not only subjectively easier to read but also far more efficient. Thusly:
SELECT * FROM people p JOIN companies c ON p.companyID = c.id
WHERE p.firstName = 'Daniel'
It's a little longer, but the database is able to look at the ON clause and use it to compute the fully-constrained JOIN directly, rather than starting with everything and then limiting down. This is faster to compute (especially with large data sets and/or many-table joins) and requires less memory.
I change every query I see which uses the "comma JOIN" syntax. In my opinion, the only purpose for its existence is conciseness. Considering the performance impact, I don't think this is a compelling reason.

Can an INNER JOIN offer better performance than EXISTS

I've been investigating making performance improvements on a series of procedures, and recently a colleague mentioned that he had achieved significant performance improvements when utilising an INNER JOIN in place of EXISTS.
As part of the investigation as to why this might be I thought I would ask the question here.
So:
Can an INNER JOIN offer better performance than EXISTS?
What circumstances would this happen?
How might I set up a test case as proof?
Do you have any useful links to further documentation?
And really, any other experience people can bring to bear on this question.
I would appreciate if any answers could address this question specifically without any suggestion of other possible performance improvements. We've had quite a degree of success already, and I was just interested in this one item.
Any help would be much appreciated.

Generally speaking, INNER JOIN and EXISTS are different things.
The former returns duplicates and columns from both tables, the latter returns one record and, being a predicate, returns records from only one table.
If you do an inner join on a UNIQUE column, they exhibit the same performance.
If you do an inner join on a recordset with DISTINCT applied (to get rid of the duplicates), EXISTS is usually faster.
IN and EXISTS clauses (with an equijoin correlation) usually employ one of the several SEMI JOIN algorithms which are usually more efficient than a DISTINCT on one of the tables.
See this article in my blog:
IN vs. JOIN vs. EXISTS
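The difference in what the two constructs return can be sketched as follows, on SQLite through Python's sqlite3 module. The Customers and Orders tables are hypothetical, and this shows only the result shapes, not timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2);
""")

# INNER JOIN returns one row per matching order, so Ann appears twice.
join_rows = conn.execute("""
    SELECT c.Name
    FROM Customers AS c
    INNER JOIN Orders AS o ON o.CustomerID = c.CustomerID
    ORDER BY c.Name
""").fetchall()

# DISTINCT is needed to collapse the duplicates again.
distinct_rows = conn.execute("""
    SELECT DISTINCT c.Name
    FROM Customers AS c
    INNER JOIN Orders AS o ON o.CustomerID = c.CustomerID
    ORDER BY c.Name
""").fetchall()

# EXISTS is a predicate: each customer is returned at most once.
exists_rows = conn.execute("""
    SELECT c.Name
    FROM Customers AS c
    WHERE EXISTS (SELECT 1 FROM Orders AS o WHERE o.CustomerID = c.CustomerID)
    ORDER BY c.Name
""").fetchall()

print(join_rows, distinct_rows, exists_rows)
```

For a timing test, the same three queries could be run against realistically sized tables, with the plans compared on the engine in question.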

Maybe, maybe not.
The same plan will be generated most likely
An INNER JOIN may require a DISTINCT to get the same output
EXISTS deals with NULL

In SQL Server 2019, queries using IN, EXISTS, and JOIN get different plans (if the correct indexes are added), so performance also differs. The article at https://www.mssqltips.com/sqlservertip/6659/sql-exists-vs-in-vs-join-performance-comparison/ shows JOIN to be somewhat faster.
P.S. I understand that the question was about SQL Server 2005 (per the tags), but people mostly look for an answer by the question title.
