Right Join vs where a value exists in another table - sql

Without realizing it I've switched to the first block of code as a preference. I am curious if it is a best practice or more efficient to use the first block of code over the second or vice versa?
In my opinion the first is more readable and concise since all the columns are from one table.
SELECT Column2, Column3, Column4
FROM Table1
WHERE Column1 in (SELECT Column1 FROM Table2)
vs
SELECT A.Column2, A.Column3, A.Column4
FROM Table1 A
RIGHT JOIN Table2 B ON A.Column1 = B.Column1
Just hoping for clarification on best practices/efficiency of each statement and if there's an accepted form.

Your two queries don't do the same thing.
Your first one
SELECT Column2, Column3, Column4
FROM Table1
WHERE Column1 in (SELECT Column1 FROM Table2)
is called a semi-join. It works like an inner join whose result set has no columns from the second table, except that each Table1 row appears at most once no matter how many Table2 rows match it. You have pointed out that your way is easier for you to read and reason about. (I agree.) When Table2.Column1 is unique, the inner join below returns the same rows, and modern query planners typically satisfy both ways of writing it the same way. This is the join form.
SELECT Table1.Column2, Table1.Column3, Table1.Column4
FROM Table1
INNER JOIN Table2 ON Table1.Column1 = Table2.Column1
Your second query is this. (By the way, RIGHT JOINs are far less common than LEFT JOINs in production code; many people have to stop and think twice when reading a RIGHT JOIN.)
SELECT A.Column2, A.Column3, A.Column4
FROM Table1 A
RIGHT JOIN Table2 B ON A.Column1 = B.Column1
This will produce result set rows for every row in Table2 whether or not they match rows in Table1; for unmatched Table2 rows, the Table1 columns you selected come back as NULL. Inner joins only deliver the rows that match the ON condition for both joined tables, and that's what you want.
Left joins produce at least one row for every row in Table1, even if it doesn't match. It's the same mutatis mutandis for right joins.
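To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The sample rows are invented for illustration, and because older SQLite builds lack RIGHT JOIN, the right join is written as the equivalent LEFT JOIN with the tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Column1 INTEGER, Column2 TEXT);
    CREATE TABLE Table2 (Column1 INTEGER);
    INSERT INTO Table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Table2 holds 1 twice, has no 3, and has 4 with no Table1 match.
    INSERT INTO Table2 VALUES (1), (1), (2), (4);
""")

def column2_values(sql):
    """Run a query and return its single output column as a sorted list."""
    # Sort NULLs (None) first so the lists compare deterministically.
    return sorted((row[0] for row in conn.execute(sql)),
                  key=lambda v: (v is not None, v or ""))

semi_join = column2_values(
    "SELECT Column2 FROM Table1 WHERE Column1 IN (SELECT Column1 FROM Table2)")
inner_join = column2_values(
    "SELECT A.Column2 FROM Table1 A INNER JOIN Table2 B ON A.Column1 = B.Column1")
# Same result as: Table1 A RIGHT JOIN Table2 B ON A.Column1 = B.Column1
right_join = column2_values(
    "SELECT A.Column2 FROM Table2 B LEFT JOIN Table1 A ON A.Column1 = B.Column1")

print(semi_join)   # each matching Table1 row once
print(inner_join)  # 'one' repeated, because Table2 holds 1 twice
print(right_join)  # also a NULL row for the unmatched Table2 value 4
```

With this data the IN form returns two rows, the inner join three, and the right join four, so the three queries are only interchangeable under specific conditions on the data.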

Related

SQL inner join and where performance comparison

If we have 2 tables, tableA (with column1, column2) and tableB (with column1, column2), what's the difference between the following two queries? Which one has better performance? What if we have indexing for both tables?
Query #1:
select
b.column2
from
tableA a,
tableB b
where
a.column1 = b.column1
and a.column2 = ?;
Query #2:
select
b.column2
from
tableA a
inner join
tableB b on a.column1 = b.column1
where
a.column2 = ?;
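The 2nd query is the clearer way to write it.
Your first query is written as a cross join whose results are then filtered. Read literally, with 10000 records in each table that is 10000*10000 combinations, although an optimizer will normally use the equality in the WHERE clause as the join condition rather than build the full product.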
Both will perform equally. One uses the ANSI join syntax and the other the old-fashioned comma style of joining.
You may compare the explain plans and most likely you will find them to be the same.
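As a rough illustration of comparing plans, the sketch below runs both queries through SQLite's EXPLAIN QUERY PLAN from Python. SQLite is only a stand-in here; the index name idx_b_column1 and the sample rows are assumptions, and other engines expose plans through their own commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (column1 INTEGER, column2 INTEGER);
    CREATE TABLE tableB (column1 INTEGER, column2 TEXT);
    CREATE INDEX idx_b_column1 ON tableB (column1);
    INSERT INTO tableA VALUES (1, 10), (2, 20), (3, 10);
    INSERT INTO tableB VALUES (1, 'x'), (2, 'y'), (3, 'z');
""")

comma_style = """
    SELECT b.column2 FROM tableA a, tableB b
    WHERE a.column1 = b.column1 AND a.column2 = ?
"""
ansi_style = """
    SELECT b.column2 FROM tableA a
    INNER JOIN tableB b ON a.column1 = b.column1
    WHERE a.column2 = ?
"""

def plan(sql, params):
    # The last field of each EXPLAIN QUERY PLAN row is the readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

def rows(sql, params):
    return sorted(row[0] for row in conn.execute(sql, params))

comma_plan, ansi_plan = plan(comma_style, (10,)), plan(ansi_style, (10,))
comma_rows, ansi_rows = rows(comma_style, (10,)), rows(ansi_style, (10,))

print(comma_plan)
print(ansi_plan)
print(comma_rows, ansi_rows)
```

In SQLite the two plans come out identical because the ON condition of an inner join is treated as just another WHERE term. On your own DBMS, compare the real plans rather than relying on this sketch.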

Choosing Best SQL Query

Hi, I am pretty new to MS SQL, so forgive me if I am asking something that is obvious to more experienced people. I can write a query to fetch the same data in multiple ways. I now have two SQL queries, Query 1 and Query 2, which look like the following:
(Query 1)
select column1, column2, column3
from
Table1 a
inner join
Table2 b on a.column1=b.column1
where Condition1 and condition2
EXCEPT
(select column1, column2, column3
from
Table1 a
inner join
Table2 b on a.column1=b.column1
where Condition3
)
(Query 2)
select column1, column2, column3
from
Table1 a
inner join
Table2 b on a.column1=b.column1
where Condition1 and condition2
And column1 Not in
(select column1
from
Table1 a
inner join
Table2 b on a.column1=b.column1
where Condition3
)
Both take a similar time, and their Estimated Subtree Costs differ only minimally. I am not sure which one is the better query and why.
EXCEPT compares all (paired) columns of two full-selects and returns the distinct rows from the left result set that are not present in the right result set. NOT IN keeps the rows of the outer query whose value does not appear in the result of the sub-query; it compares only the column named before NOT IN, and it does not make the result set distinct.
So EXCEPT returns distinct rows whereas NOT IN does not. If you analyse the execution plans, you may find that the EXCEPT query is slower than NOT IN.
The Distinct Sort operator in the EXCEPT plan can cost around 65% of the total execution time.
According to this Link, EXCEPT can be rewritten by using NOT EXISTS. (EXCEPT ALL can be rewritten by using ROW_NUMBER and NOT EXISTS.)
Refer to LINK for more info.
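The semantic differences are easy to see on a toy data set. This is a minimal sketch using Python's sqlite3 module rather than SQL Server, with hypothetical tables candidates and blocked; it shows the distinct behaviour described above plus one more trap, namely how NOT IN reacts to a NULL in the sub-query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidates (column1 INTEGER);
    CREATE TABLE blocked (column1 INTEGER);
    INSERT INTO candidates VALUES (1), (1), (2), (3);
    INSERT INTO blocked VALUES (3);
""")

def values(sql):
    return sorted(row[0] for row in conn.execute(sql))

# EXCEPT removes duplicates; NOT IN keeps them.
except_rows = values(
    "SELECT column1 FROM candidates EXCEPT SELECT column1 FROM blocked")
not_in_rows = values(
    "SELECT column1 FROM candidates "
    "WHERE column1 NOT IN (SELECT column1 FROM blocked)")

# A single NULL in the sub-query makes every NOT IN comparison unknown,
# so no row passes the filter. NOT EXISTS is not affected.
conn.execute("INSERT INTO blocked VALUES (NULL)")
not_in_with_null = values(
    "SELECT column1 FROM candidates "
    "WHERE column1 NOT IN (SELECT column1 FROM blocked)")
not_exists_with_null = values(
    "SELECT column1 FROM candidates c WHERE NOT EXISTS "
    "(SELECT 1 FROM blocked b WHERE b.column1 = c.column1)")

print(except_rows, not_in_rows, not_in_with_null, not_exists_with_null)
```

So before comparing costs, it is worth checking that the two queries are meant to return the same rows at all.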
The second one seems to have a slight edge over the first.
Its sub-query fetches only one column, column1.
If that column is indexed, the SQL engine may be able to resolve the sub-query more cheaply.
What if you modify the where condition like below?
select column1, column2, column3
from
Table1 a
inner join
Table2 b on a.column1=b.column1
where Condition1 and condition2 and not (Condition3)

Access Removing CERTAIN PARTS of Duplicates in Union Query

I'm working in Access 2007 and know nothing about SQL and very, very little VBA. I am trying to do a union query to join two tables, and delete the duplicates.
BUT, a lot of my duplicates have info in one entry that's not in the other. It's not a 100% exact duplicate.
Example,
Row 1: A, B, BLANK
Row 2: A, BLANK, C
I want it to MERGE both of these to end up as one row of A, B, C.
I found a similar question on here but I don't understand the answer at all. Any help would be greatly appreciated.
I would suggest a query like this:
select
coalesce(t1.a, t2.a) as a,
coalesce(t1.b, t2.b) as b,
coalesce(t1.c, t2.c) as c
from
table1 t1
inner join table2 t2 on t1.key = t2.key
Here, I have used the keyword coalesce, which returns the first non-null value in a list of values. Access itself has no coalesce function; there, Nz(t1.b, t2.b) does the same job for two values. Also note that I have used key to indicate the column that is the same between the two rows. From your example it looks like A, but I cannot be sure.
If your first table has all the key values, then you can do:
select t1.a, nz(t1.b, t2.b) as b, nz(t1.c, t2.c) as c
from table1 as t1 left join
table2 as t2
on t1.a = t2.a;
If this isn't the case, you can use this rather arcane looking construct:
select t1.a, nz(t1.b, t2.b) as b, nz(t1.c, t2.c) as c
from table1 as t1 left join
table2 as t2
on t1.a = t2.a
union all
select t2.a, t2.b, t2.c
from table2 as t2
where not exists (select 1 from table1 as t1 where t1.a = t2.a)
The first part of the union gets the rows where there is a key value in the first table. The second gets the rows where the key value is in the second but not the first.
Note this is much harder in Access than in other (dare I say "real") databases. MS Access doesn't support common table expressions (CTEs), unions in subqueries, or full outer join -- all of which would help simplify the query.
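Here is a small runnable sketch of the left join plus union all construct, using Python's sqlite3 module instead of Access. It assumes the blanks are stored as NULL and that column a is the key, and it uses COALESCE where Access would use Nz. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b TEXT, c TEXT);
    CREATE TABLE table2 (a TEXT, b TEXT, c TEXT);
    INSERT INTO table1 VALUES ('A', 'B', NULL), ('X', 'Y', NULL);
    INSERT INTO table2 VALUES ('A', NULL, 'C'), ('Q', NULL, 'R');
""")

merged = conn.execute("""
    -- Rows whose key exists in table1, with gaps filled from table2.
    SELECT t1.a, COALESCE(t1.b, t2.b) AS b, COALESCE(t1.c, t2.c) AS c
    FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t1.a = t2.a
    UNION ALL
    -- Rows whose key exists only in table2.
    SELECT t2.a, t2.b, t2.c
    FROM table2 AS t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 AS t1 WHERE t1.a = t2.a)
    ORDER BY 1
""").fetchall()

for row in merged:
    print(row)
```

The two partial 'A' rows collapse into one complete row, while 'X' (only in table1) and 'Q' (only in table2) pass through unchanged.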

Are Columns Not Selected in SQL Views Executed?

I wasn't able to come up with the right keywords to search for the answer for this, so apologies if it was answered already.
Consider the following SQL view:
CREATE VIEW View1 AS
SELECT Column1
,Column2
,(SELECT SUM(Column3) FROM Table2 WHERE Table2.ID = Table1.ID) AS Column3 -- Subquery
FROM Table1
If I run the following query, will the subquery be executed or does SQL Server optimise the query?
SELECT Column1 FROM View1
I'm looking at this from a performance point of view, say, if the view has quite a few subqueries (aggregations can take a long time if the inner select refers to a large table).
I'm using SQL Server 2008 R2, but I'm interested to know if the answer differs for 2012 or maybe MySQL.
Thanks.
As has been said, this varies depending on your DBMS (version and provider); to know for sure, check the execution plan. On SQL Server 2008 the subquery is not executed:
When Column3 is not selected, the plan simply selects from Table1; when the query includes Column3, Table2 is queried as well.
In SQL Server 2008 R2 it is not executed.
In SQL Server 2012 it is not executed.
In MySQL it is executed, and both queries generate the same plan.
To elaborate further, it will also depend on your exact query, as well as your DBMS. For example:
CREATE VIEW View2
AS
SELECT t.ID, t.Column1, t.Column2, t2.Column3
FROM Table1 t
LEFT JOIN
( SELECT ID, Column3 = SUM(Column3)
FROM Table2
GROUP BY ID
) t2
ON t2.ID = t.ID
GO
SELECT Column1, Column2
FROM View2;
SELECT Column1, Column2, Column3
FROM View2;
In this case you get similar results to the correlated subquery. The plan shows only a select from Table1 if Column3 is not selected: because it is a LEFT JOIN, the optimiser knows that the subquery t2 has no bearing on the rows selected from Table1 (it groups by ID, so it can match each row at most once), and since none of its columns are used it does not bother with it. If you changed the LEFT JOIN to an INNER JOIN though, e.g.
CREATE VIEW View3
AS
SELECT t.ID, t.Column1, t.Column2, t2.Column3
FROM Table1 t
INNER JOIN
( SELECT ID, Column3 = SUM(Column3)
FROM Table2
GROUP BY ID
) t2
ON t2.ID = t.ID
GO
SELECT Column1, Column2
FROM View3;
SELECT Column1, Column2, Column3
FROM View3;
The query plans for these two queries show that, because the aggregate column is not used in the first query, the optimiser essentially changes the view to this:
SELECT t.ID, t.Column1, t.Column2
FROM Table1 t
INNER JOIN
( SELECT DISTINCT ID
FROM Table2
) t2
ON t2.ID = t.ID;
This shows up in the plan as a Distinct Sort on Table2 in place of the Stream Aggregate.
So to summarise, it depends.
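If you want to run this kind of check yourself, the sketch below does it against SQLite from Python. SQLite is only a convenient stand-in; its behaviour says nothing about SQL Server or MySQL. The column alias Column3Total and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT);
    CREATE TABLE Table2 (ID INTEGER, Column3 INTEGER);
    CREATE VIEW View1 AS
    SELECT Column1,
           Column2,
           (SELECT SUM(Column3) FROM Table2
            WHERE Table2.ID = Table1.ID) AS Column3Total
    FROM Table1;
    INSERT INTO Table1 VALUES (1, 'a', 'b');
    INSERT INTO Table2 VALUES (1, 5), (1, 7);
""")

def plan(sql):
    # The last field of each EXPLAIN QUERY PLAN row is the readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_without = plan("SELECT Column1 FROM View1")
plan_with = plan("SELECT Column1, Column3Total FROM View1")
total = conn.execute("SELECT Column3Total FROM View1").fetchone()[0]

print("without subquery column:", plan_without)
print("with subquery column:   ", plan_with)
print("total:", total)
```

Whether Table2 appears in the first plan tells you if that engine and version prunes the unused subquery. The second plan has to touch Table2 in any case.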
A view is just a stored query definition, like a derived table in a query.
Some engines execute the query behind the view first and then apply your selection to its result, in which case the subquery is executed; others merge the view into your query and can skip columns you do not select. If you want to be sure the subquery never runs, create a new view without it.

SQL query, where = value of another table

I want to make a query that simply does this. This may sound really dumb, but I did a lot of research and couldn't understand any of it.
Imagine that I have two tables (table1 and table2) and two columns (table1.column1 and table2.column2).
What I want to do is basically this:
SELECT column1 FROM table1 where table2.column2 = '0'
I don't know if this is possible.
Thanks in advance,
You need to join the two tables, and then your where clause will do the work for you (column here stands for whatever column the two tables share):
select column1 from table1
inner join table2 on table1.column = table2.column
where table2.column2 = '0'
for join info you can see this
Reading this original article on The Code Project will help you a lot: Visual Representation of SQL Joins.
Find original one at: Difference between JOIN and OUTER JOIN in MySQL.
SELECT column1 FROM table1 t1
where exists (select 1 from table2 t2
where t1.id = t2.table1_id and t2.column2 = '0')
This assumes table1_id in table2 is a foreign key referring to id of table1, which is the primary key.
You don't have any kind of natural join between two tables.
You're asking for
Select Houses.DoorColour from Houses, Cars where Cars.AreFourWheelDrive = '1'
You'd need to think about why you're selecting anything from the first table, there must be a shared piece of information between tables 1 and 2 otherwise a join is pointless and probably dangerous.
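To see why the unrelated version is dangerous, here is a minimal sketch in Python's sqlite3 module. The HouseID column linking cars to houses is a hypothetical shared piece of information added for the example, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Houses (HouseID INTEGER, DoorColour TEXT);
    CREATE TABLE Cars (HouseID INTEGER, AreFourWheelDrive TEXT);
    INSERT INTO Houses VALUES (1, 'red'), (2, 'blue');
    INSERT INTO Cars VALUES (1, '1'), (1, '1'), (2, '0');
""")

# No join condition: every house is paired with every four-wheel-drive car.
unrelated = conn.execute("""
    SELECT Houses.DoorColour FROM Houses, Cars
    WHERE Cars.AreFourWheelDrive = '1'
    ORDER BY Houses.DoorColour
""").fetchall()

# With the shared HouseID, only houses that own such a car are returned.
related = conn.execute("""
    SELECT h.DoorColour FROM Houses h
    WHERE EXISTS (SELECT 1 FROM Cars c
                  WHERE c.HouseID = h.HouseID
                    AND c.AreFourWheelDrive = '1')
    ORDER BY h.DoorColour
""").fetchall()

print(unrelated)
print(related)
```

Without the link, the blue house shows up twice even though its only car is not four-wheel drive; with it, only the red house is returned, once.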