How to get rid of NOT EXISTS - sql

I have a SQL query that is not very complex but confusing enough that I can't tell whether my rewrite is truly equivalent or whether the row counts match only by coincidence.
SQL1:
SELECT a, b
FROM table1
WHERE NOT EXISTS(
SELECT a, c
FROM TABLE2
WHERE table2.a != table1.a)
SQL2:
SELECT table1.a, table1.b
FROM table1
LEFT JOIN table2 ON table2.a = table1.a
WHERE table2.a IS NULL
The counts from the two queries are identical, but I'm not sure whether this is by chance, and I want to make sure the conversion does not change the original functionality.

That doesn't look the same - but it's close. Your LEFT JOIN syntax is the same as:
SELECT a, b
FROM table1
WHERE NOT EXISTS(
SELECT a, c
FROM TABLE2
WHERE table2.a = table1.a)
Note the "=" instead of "!=" though. Are you sure that's not what you have?
Your actual query translates to something like "where no non-matching rows exist", which would be odd, but could be expressed by changing the JOIN condition:
SELECT a, b
FROM table1
LEFT JOIN table2 ON table2.a != table1.a
WHERE table2.a IS NULL

The first query, as you have it, returns all rows of TABLE1 where a matches all values of a in TABLE2. Therefore, it will return zero rows unless TABLE2 holds a single distinct non-null value of a and that value exists in TABLE1 (or TABLE2 is empty, in which case every row of TABLE1 comes back). In the single-value case, it returns as many rows as there are in TABLE1 with that value of a.
The second query is completely different. It simply returns all rows of TABLE1 where a does not exist in TABLE2.
So it's "matches all" (query 1) vs. "does not match any" (query 2). The fact that you are getting the same number of rows is pure coincidence.
Your queries would be equivalent if you changed != for = in the first one, like this:
SELECT a, b
FROM table1
WHERE NOT EXISTS(
SELECT a, c
FROM TABLE2
WHERE table2.a = table1.a)
That gets you the values of a in table1 that don't exist in table2. This is EXACTLY the same as:
SELECT table1.a, b
FROM table1
LEFT JOIN table2 ON table2.a = table1.a
WHERE table2.a IS NULL
As you have it though, they are NOT equivalent. You must change != for = in the first one to make them so.
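To check the claim rather than trust matching row counts, here is a minimal sketch that runs the three queries against small made-up data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for whatever database you use; the helper name run and the sample rows are invented for illustration.

```python
import sqlite3

def run(sql, table1_rows, table2_rows):
    """Load the sample rows into an in-memory SQLite database and return the sorted result."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (a INTEGER, b TEXT)")
    con.execute("CREATE TABLE table2 (a INTEGER, c TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?)", table1_rows)
    con.executemany("INSERT INTO table2 VALUES (?, ?)", table2_rows)
    rows = sorted(con.execute(sql).fetchall())
    con.close()
    return rows

# The original query (!=), the corrected one (=), and the LEFT JOIN rewrite.
NOT_EXISTS_NE = """SELECT a, b FROM table1
WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE table2.a != table1.a)"""
NOT_EXISTS_EQ = """SELECT a, b FROM table1
WHERE NOT EXISTS (SELECT 1 FROM table2 WHERE table2.a = table1.a)"""
LEFT_JOIN_NULL = """SELECT table1.a, table1.b FROM table1
LEFT JOIN table2 ON table2.a = table1.a
WHERE table2.a IS NULL"""

# Hypothetical data: table2 holds two distinct values of a.
t1 = [(1, "x"), (2, "y"), (3, "z")]
t2 = [(1, "p"), (2, "q")]
ne = run(NOT_EXISTS_NE, t1, t2)   # "matches all": nothing can match both 1 and 2
eq = run(NOT_EXISTS_EQ, t1, t2)   # "does not match any"
lj = run(LEFT_JOIN_NULL, t1, t2)  # same rows as eq

# Hypothetical data where the counts agree by coincidence but the rows differ.
coincidence_ne = run(NOT_EXISTS_NE, [(1, "x"), (2, "y")], [(1, "p")])
coincidence_lj = run(LEFT_JOIN_NULL, [(1, "x"), (2, "y")], [(1, "p")])

print(ne, eq, lj)
print(coincidence_ne, coincidence_lj)
```

Comparing the returned rows, not just COUNT(*), is what distinguishes a real equivalence from a coincidence.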

For the first query i.e.
SELECT a, b
FROM table1
WHERE NOT EXISTS(
SELECT a, c
FROM TABLE2
WHERE table2.a != table1.a)
This returns every row of table1 when table2 is the empty set. When table2 is not empty, it returns only the rows of table1 whose a equals every value of a in table2, which is possible only if table2 holds a single distinct value of a. Otherwise, the result is the empty set.
The same cannot be said of your second query.

SELECT a, b, c , d
FROM table1 t1
WHERE NOT EXISTS( SELECT * FROM table2 nx
WHERE nx.y = t1.a
)
;
There is one big advantage of this ("correlated subquery") method: table2 is not visible from the outer query, so it cannot pollute it or confuse your thinking. The subquery produces just one bit of information: either a matching row exists or it does not. To be or not to be ...
In that respect, the LEFT JOIN idiom is nastier, since you have to check the xxx IS NULL condition in the outer query, while xxx refers to table2, which logically belongs to the inner query.
In terms of the rows returned, there is no difference.

Related

DELETE duplicates with subquery from 2 tables in MS Access

I have two tables with the same structure that can contain duplicated records. I want to identify which records from table 2 already exist in table 1 and delete them from table 2. The following SELECT returns the duplicated records I want to delete. Neither table has a primary key, so I need several conditions in the ON clause to identify unique records.
SELECT V.*
FROM table2 AS V
INNER JOIN table1 AS N
ON V.column1 = N.column1 AND V.column2 = N.column2 AND V.column3= N.column3;
Then I insert this as a subquery for the DELETE:
DELETE FROM table2
WHERE table2.column1 IN
(SELECT V.*
FROM table2 AS V
INNER JOIN table1 AS N
ON V.column1 = N.column1 AND V.column2 = N.column2 AND V.column3= N.column3);
When running this query I get the following error:
You have written a query that can return more than one field without using the reserved word EXISTS in the FROM clause of the main query. Correct the SELECT instruction of the subquery to request a single field.
I also tried this way, but it deletes all the records from table 2, not only the result of the subquery:
DELETE FROM table2
WHERE EXISTS
(SELECT V.*
FROM table2 AS V
INNER JOIN table1 AS N
ON V.column1 = N.column1 AND V.column2 = N.column2 AND V.column3= N.column3);
This is the first solution I came up with, but I'm wondering whether it would be easier in MS Access to insert into table1 all the records from table2 that don't match, and then delete table2.
All suggestions will be appreciated :)
Take the advice of the error message and try using exists logic:
DELETE
FROM table2 t2
WHERE EXISTS (SELECT 1 FROM table1 t1
WHERE t1.column1 = t2.column1 AND
t1.column2 = t2.column2 AND
t1.column3 = t2.column3);
The problem with your current EXISTS attempt is that the query inside the EXISTS clause always has a result set, and that result set is independent of the row being examined by the outer DELETE statement. So, all records get deleted.
I think you're just missing the specific column in your subquery.
This should work better:
DELETE FROM table2
WHERE table2.column1 IN
(SELECT V.column1
FROM table2 AS V
INNER JOIN table1 AS N
ON V.column1 = N.column1 AND V.column2 = N.column2 AND V.column3= N.column3);
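The DELETE variants discussed in this thread can be compared side by side. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module rather than MS Access, so Access-specific syntax and error messages are not reproduced, and the helper remaining_after and the sample rows are made up.

```python
import sqlite3

JOIN_COND = ("V.column1 = N.column1 AND V.column2 = N.column2 "
             "AND V.column3 = N.column3")

# EXISTS whose subquery never refers to the row being deleted.
UNCORRELATED_EXISTS = f"""DELETE FROM table2
WHERE EXISTS (SELECT V.* FROM table2 AS V
              INNER JOIN table1 AS N ON {JOIN_COND})"""

# EXISTS correlated with the outer table2 row on all three columns.
CORRELATED_EXISTS = """DELETE FROM table2
WHERE EXISTS (SELECT 1 FROM table1 AS t1
              WHERE t1.column1 = table2.column1
                AND t1.column2 = table2.column2
                AND t1.column3 = table2.column3)"""

# IN on a single column of the matched rows.
IN_COLUMN1 = f"""DELETE FROM table2
WHERE table2.column1 IN (SELECT V.column1 FROM table2 AS V
                         INNER JOIN table1 AS N ON {JOIN_COND})"""

def remaining_after(delete_sql):
    """Run one DELETE against fresh sample data and return what is left in table2."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (column1 INTEGER, column2 TEXT, column3 TEXT)")
    con.execute("CREATE TABLE table2 (column1 INTEGER, column2 TEXT, column3 TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                    [(1, "a", "x"), (2, "b", "y")])
    # (1, 'a', 'x') is a true duplicate; (1, 'a', 'z') only shares column1 and column2.
    con.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                    [(1, "a", "x"), (1, "a", "z"), (3, "c", "w")])
    con.execute(delete_sql)
    rows = sorted(con.execute("SELECT * FROM table2").fetchall())
    con.close()
    return rows

after_uncorrelated = remaining_after(UNCORRELATED_EXISTS)
after_correlated = remaining_after(CORRELATED_EXISTS)
after_in = remaining_after(IN_COLUMN1)
print(after_uncorrelated, after_correlated, after_in)
```

With this data the uncorrelated EXISTS empties table2, the correlated EXISTS removes only the true duplicate, and the single-column IN also removes the row that merely shares column1 with a duplicate. That last effect is worth keeping in mind when no single column identifies a record.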

Access Removing CERTAIN PARTS of Duplicates in Union Query

I'm working in Access 2007 and know nothing about SQL and very, very little VBA. I am trying to do a union query to join two tables, and delete the duplicates.
BUT, a lot of my duplicates have info in one entry that's not in the other. It's not a 100% exact duplicate.
Example,
Row 1: A, B, BLANK
Row 2: A, BLANK, C
I want it to MERGE both of these to end up as one row of A, B, C.
I found a similar question on here but I don't understand the answer at all. Any help would be greatly appreciated.
I would suggest a query like this:
select
coalesce(t1.a, t2.a) as a,
coalesce(t1.b, t2.b) as b,
coalesce(t1.c, t2.c) as c
from
table1 t1
inner join table2 t2 on t1.key = t2.key
Here I have used the coalesce function, which returns the first non-null value in a list of values. (Access SQL itself has no coalesce; Nz plays that role there.) Also note that I have used key to stand for the column that is the same in the two rows. From your example it looks like A, but I cannot be sure.
If your first table has all the key values, then you can do:
select t1.a, nz(t1.b, t2.b) as b, nz(t1.c, t2.c) as c
from table1 as t1 left join
table2 as t2
on t1.a = t2.a;
If this isn't the case, you can use this rather arcane looking construct:
select t1.a, nz(t1.b, t2.b) as b, nz(t1.c, t2.c) as c
from table1 as t1 left join
table2 as t2
on t1.a = t2.a
union all
select t2.a, t2.b, t2.c
from table2 as t2
where not exists (select 1 from table1 as t1 where t1.a = t2.a)
The first part of the union gets the rows where there is a key value in the first table. The second gets the rows where the key value is in the second but not the first.
Note this is much harder in Access than in other (dare I say "real") databases. MS Access doesn't support common table expressions (CTEs), unions in subqueries, or full outer join -- all of which would help simplify the query.
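As a rough check of the two-part union, the sketch below runs the same shape of query on the example rows from the question plus two made-up rows that exist in only one table. It assumes SQLite via Python's sqlite3 module, where COALESCE stands in for Access's Nz, and it assumes column a is the shared key.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (a TEXT, b TEXT, c TEXT)")
con.execute("CREATE TABLE table2 (a TEXT, b TEXT, c TEXT)")
# None plays the role of BLANK in the question's example.
con.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                [("A", "B", None), ("D", "E", None)])
con.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                [("A", None, "C"), ("F", None, "G")])

MERGE_SQL = """
SELECT t1.a, COALESCE(t1.b, t2.b) AS b, COALESCE(t1.c, t2.c) AS c
FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t1.a = t2.a
UNION ALL
SELECT t2.a, t2.b, t2.c
FROM table2 AS t2
WHERE NOT EXISTS (SELECT 1 FROM table1 AS t1 WHERE t1.a = t2.a)
"""

# Sort on the key column only, because b and c may be None.
merged = sorted(con.execute(MERGE_SQL).fetchall(), key=lambda row: row[0])
con.close()
print(merged)
```

The first branch merges the rows whose key is in table1, and the second branch appends the rows whose key appears only in table2.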

Fastest SQL & HQL Query for two tables

Table1: Columns A, B, C
Table2: Columns A, B, C
Table 2 is a copy of Table 1 with different data. Assume all columns to be varchar
Looking for a single efficient query which can fetch:
Columns A, B, C from Table1
Additional Rows from Table2 where values of Table2.A are not present in Table1.A
Any differences between the Oracle SQL & HQL for the same query will be appreciated.
I'm fiddling with Joins, Unions & Minus but not able to get the correct combination.
SQL:
SELECT *
FROM Table1
UNION ALL
SELECT *
FROM Table2 T2
WHERE NOT EXISTS(
SELECT 'X' FROM Table1 T1
WHERE T1.A = T2.A
)
HQL:
HQL has no UNION command, so you must execute two separate queries and discard the unwanted Table2 rows in a Java loop.
Alternatively, keep the first query for Table1 and give the second query a NOT IN clause that excludes the values of Table1's A field.
Solution 1:
Query 1:
SELECT * FROM Table1
Query 2:
SELECT * FROM Table2
and then you apply a discard loop in Java code
Solution 2:
Query 1:
SELECT * FROM Table1
Query 2:
SELECT * FROM Table2 WHERE Table2.A not in (SELECT Table1.A from Table1)
This query returns all rows in table1, plus all rows in table2 that do not exist in table1, given that column a is the common key.
select a,b,c
from table1
union all
select a,b,c
from table2
where a not in(select a from table1);
There may be different options available depending on the relative sizes of table1 and table2 and the expected overlap.
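One difference between the NOT EXISTS and NOT IN forms only shows up when Table1.A can contain NULL: a NULL in the NOT IN list makes the comparison unknown for every row, so no Table2 rows are added. The sketch below illustrates this with made-up data, using SQLite through Python's sqlite3 module as a stand-in for Oracle; the helper name fetch is hypothetical.

```python
import sqlite3

NOT_EXISTS_SQL = """
SELECT A, B, C FROM Table1
UNION ALL
SELECT A, B, C FROM Table2 T2
WHERE NOT EXISTS (SELECT 'X' FROM Table1 T1 WHERE T1.A = T2.A)
"""
NOT_IN_SQL = """
SELECT A, B, C FROM Table1
UNION ALL
SELECT A, B, C FROM Table2
WHERE A NOT IN (SELECT A FROM Table1)
"""

def fetch(sql, table1_rows, table2_rows):
    """Run one query against fresh in-memory tables and return the rows as a list."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (A TEXT, B TEXT, C TEXT)")
    con.execute("CREATE TABLE Table2 (A TEXT, B TEXT, C TEXT)")
    con.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", table1_rows)
    con.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", table2_rows)
    rows = con.execute(sql).fetchall()
    con.close()
    return rows

# Hypothetical data: one Table1 row has a NULL in column A.
table1 = [("1", "x", "y"), (None, "n", "n")]
table2 = [("1", "p", "q"), ("2", "r", "s")]

with_not_exists = fetch(NOT_EXISTS_SQL, table1, table2)
with_not_in = fetch(NOT_IN_SQL, table1, table2)
print(len(with_not_exists), len(with_not_in))
```

If column A is declared NOT NULL in Table1, the two forms return the same rows and the choice comes down to performance on your data.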

T-SQL "Where not in" using two columns

I want to select all records from a table T1 where the values in columns A and B have no matching tuple for the columns C and D in table T2.
In the question 'mysql “Where not in” using two columns' I can read how to accomplish that using the form select A,B from T1 where (A,B) not in (SELECT C,D from T2), but that fails for me in T-SQL with "Incorrect syntax near ','.".
So how do I do this?
Use a correlated sub-query:
...
WHERE
NOT EXISTS (
SELECT * FROM SecondaryTable WHERE c = FirstTable.a AND d = FirstTable.b
)
Make sure there's a composite index on SecondaryTable over (c, d), unless that table does not contain many rows.
You can't do this using a WHERE IN type statement.
Instead you could LEFT JOIN to the target table (T2) and select the rows where T2's primary key is NULL.
For example
SELECT
T1.*
FROM
T1 LEFT OUTER JOIN T2
ON T1.A = T2.C AND T1.B = T2.D
WHERE
T2.PrimaryKey IS NULL
will only return rows from T1 that don't have a corresponding row in T2.
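The two suggestions so far, the correlated NOT EXISTS and the LEFT JOIN with an IS NULL check, can be run against the same data to confirm they agree. This is a minimal sketch under assumptions: SQLite via Python's sqlite3 module stands in for SQL Server, and the table contents and the PrimaryKey column are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T1 (A INTEGER, B TEXT)")
con.execute("CREATE TABLE T2 (PrimaryKey INTEGER PRIMARY KEY, C INTEGER, D TEXT)")
con.executemany("INSERT INTO T1 VALUES (?, ?)", [(1, "x"), (1, "y"), (2, "x")])
con.executemany("INSERT INTO T2 VALUES (?, ?, ?)", [(10, 1, "x"), (11, 2, "y")])

NOT_EXISTS_SQL = """
SELECT T1.A, T1.B FROM T1
WHERE NOT EXISTS (SELECT * FROM T2 WHERE T2.C = T1.A AND T2.D = T1.B)
"""
LEFT_JOIN_SQL = """
SELECT T1.A, T1.B
FROM T1 LEFT OUTER JOIN T2 ON T1.A = T2.C AND T1.B = T2.D
WHERE T2.PrimaryKey IS NULL
"""

# Only the pair (1, 'x') exists in T2 as (C, D), so the other two T1 rows remain.
via_not_exists = sorted(con.execute(NOT_EXISTS_SQL).fetchall())
via_left_join = sorted(con.execute(LEFT_JOIN_SQL).fetchall())
con.close()
print(via_not_exists, via_left_join)
```

Checking a column that can never be NULL in a real T2 row, such as the primary key, is what makes the IS NULL test a reliable sign that no match was found.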
I used this in MySQL because MySQL has no "EXCLUDE" statement.
This code:
Concatenates fields C and D of table T2 into one new field to make the columns easier to compare.
Concatenates fields A and B of table T1 into one new field to make the columns easier to compare.
Selects all records of T1 whose new field value does not appear among the new field values of T2.
SQL-Statement:
SELECT T1.* FROM T1
WHERE CONCAT(T1.A,'Seperator', T1.B) NOT IN
(SELECT CONCAT(T2.C,'Seperator', T2.D) FROM T2)
Here is an example of the answer that worked for me:
SELECT Count(1)
FROM LCSource as s
JOIN FileTransaction as t
ON s.TrackingNumber = t.TrackingNumber
WHERE NOT EXISTS (
SELECT * FROM LCSourceFileTransaction
WHERE [LCSourceID] = s.[LCSourceID] AND [FileTransactionID] = t.[FileTransactionID]
)
You see both columns exist in LCSourceFileTransaction, but one occurs in LCSource and one occurs in FileTransaction and LCSourceFileTransaction is a mapping table. I want to find all records where the combination of the two columns is not in the mapping table. This works great. Hope this helps someone.

Are "from Table1 left join Table2" and "from Table2 right join Table1" interchangeable?

For example, there are two tables:
create table Table1 (id int, Name varchar (10))
create table Table2 (id int, Name varchar (10))
Table1 data as follows:
Id Name
-------------
1 A
2 B
Table2 data as follows:
Id Name
-------------
1 A
2 B
3 C
If I execute both of the SQL statements below, the outputs are the same:
select *
from Table1
left join Table2 on Table1.id = Table2.id
select *
from Table2
right join Table1 on Table1.id = Table2.id
Please explain the difference between left and right join in the above SQL statements.
Select * from Table1 left join Table2 ...
and
Select * from Table2 right join Table1 ...
are indeed completely interchangeable. Try however Table2 left join Table1 (or its identical pair, Table1 right join Table2) to see a difference. This query should give you more rows, since Table2 contains a row with an id which is not present in Table1.
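The interchangeability can be modelled in a few lines. The sketch below is a plain-Python model of outer-join semantics on the question's data, not how any database engine implements joins; the function names left_join and right_join are made up, and the join key is assumed to be the first column (id).

```python
def left_join(left, right, key=0):
    """Keep every row of `left`; pad with None where `right` has no match on column `key`."""
    width = len(right[0]) if right else 0
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            out.extend(l_row + r_row for r_row in matches)
        else:
            out.append(l_row + (None,) * width)
    return out

def right_join(left, right, key=0):
    """Keep every row of `right`: a left join with the operands swapped,
    with the columns put back in left-then-right order."""
    n = len(right[0]) if right else 0
    return [row[n:] + row[:n] for row in left_join(right, left, key)]

table1 = [(1, "A"), (2, "B")]
table2 = [(1, "A"), (2, "B"), (3, "C")]

t1_left_t2 = left_join(table1, table2)    # Table1 left join Table2
t2_right_t1 = right_join(table2, table1)  # Table2 right join Table1
t1_right_t2 = right_join(table1, table2)  # Table1 right join Table2

print(t1_left_t2)
print(t2_right_t1)
print(t1_right_t2)
```

The first two results contain the same rows (select * would list the columns in the order the tables are named), while the third keeps Table2's extra row with None on the Table1 side.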
Table from which you are taking data is 'LEFT'.
Table you are joining is 'RIGHT'.
LEFT JOIN: Take all items from left table AND (only) matching items from right table.
RIGHT JOIN: Take all items from right table AND (only) matching items from left table.
So:
Select * from Table1 left join Table2 on Table1.id = Table2.id
gives:
Id Name Id Name
--------------------------
1 A 1 A
2 B 2 B
but:
Select * from Table1 right join Table2 on Table1.id = Table2.id
gives:
Id Name Id Name
--------------------------
1 A 1 A
2 B 2 B
NULL NULL 3 C
In both of your statements, Table1 (the table with fewer rows) is the one whose rows are all kept: the first time through a left join, the second time through a right join.
Try:
If Table1.Rows.Count > Table2.Rows.Count Then
' Left Join
Else
' Right Join
End If
You seem to be asking, "If I can rewrite a RIGHT OUTER JOIN using LEFT OUTER JOIN syntax then why have a RIGHT OUTER JOIN syntax at all?" I think the answer to this question is, because the designers of the language didn't want to place such a restriction on users (and I think they would have been criticized if they did), which would force users to change the order of tables in the FROM clause in some circumstances when merely changing the join type.
select fields
from tableA --left
left join tableB --right
on tableA.key = tableB.key
The table in the FROM clause, tableA in this example, is on the left side of the relation.
tableA <- tableB
[left]------[right]
So if you want to take all rows from the left table (tableA), even if there are no matches in the right table (tableB), you'll use the "left join".
And if you want to take all rows from the right table (tableB), even if there are no matches in the left table (tableA), you will use the right join.
Thus, the following query is equivalent to the one used above.
select fields
from tableB
right join tableA on tableB.key = tableA.key
Your two statements are equivalent.
Most people only use LEFT JOIN since it seems more intuitive, and it's universal syntax - I don't think all RDBMS support RIGHT JOIN.
Regarding the last figure, the Outer Excluding JOIN: to get the desired result of A Union B minus A Intersection B, I feel the WHERE clause has to combine the two NULL checks with OR.
The query would be
SELECT <select_list>
FROM Table_A A
FULL OUTER JOIN Table_B B
ON A.Key = B.Key
WHERE A.Key IS NULL OR B.Key IS NULL
If we use AND, a row qualifies only when the keys on both sides are NULL, which an unmatched row with a non-null key never satisfies, so we get no rows instead of the rows that exist on one side only.
select *
from Table1
left join Table2 on Table1.id = Table2.id
In the first query, the left join compares the left-side table, Table1, with the right-side table, Table2.
All rows of Table1 are shown, whereas from Table2 only the rows for which the join condition is true are shown.
select *
from Table2
right join Table1 on Table1.id = Table2.id
In the second query, the right join compares the right-side table, Table1, with the left-side table, Table2.
Again all rows of Table1 are shown, whereas from Table2 only the rows for which the join condition is true are shown.
Both queries give the same result because the tables are declared in opposite order: Table1 is on the left and Table2 on the right in the left join query, while Table1 is on the right and Table2 on the left in the right join query.
That is why you get the same result from both. If you want different results, execute these two queries instead:
select *
from Table1
left join Table2 on Table1.id = Table2.id
select *
from Table1
right join Table2 on Table1.id = Table2.id
Select * from Table1 t1 Left Join Table2 t2 on t1.id=t2.id
By definition, a left join returns all rows from Table 1, together with the rows from Table 2 that match the criteria after the "on" keyword; the columns shown are those listed after the "select" keyword.
Similarly, by definition, a right join returns all rows from Table 2, together with the rows from Table 1 that match the criteria after the "on" keyword.
Referring to your question, the ids of the two tables are compared and all columns of both tables are output. Ids 1 and 2 are common to both tables, so the result has four columns: id and name from the first table, followed by id and name from the second.
select *
from Table1
left join Table2 on Table1.id = Table2.id
The above expression takes all the records (rows) from Table1 and, from Table2, the columns of the rows whose id matches.
select *
from Table2
right join Table1 on Table1.id = Table2.id
Similarly, the above expression takes all the records (rows) from Table1 and, from Table2, the columns of the rows whose id matches. (Remember, this is a right join, so it is the table on the right, Table1, whose rows are all kept.)