Write correlated subquery in a WHERE Clause as join - sql

I have a query like below:
select
a.id, a.title, a.description
from
my_table_name as a
where
a.id in (select id from another_table b where b.id = 1)
My question is: is there any way to avoid the subquery in the WHERE clause and move it into the FROM clause without compromising performance?

Both of the answers given so far are incorrect in the general case (though the database may have unique constraints that make them correct in a specific case).
If another_table might have multiple rows with the same id then the INNER JOIN will bring back duplicates that are not present in the IN version. Trying to remove them with DISTINCT can change the semantics if the columns from my_table_name themselves have duplicates.
A general rewrite would be
SELECT a.id,
a.title,
a.description
FROM my_table_name AS a
JOIN (SELECT DISTINCT id
FROM another_table
WHERE id = 1) AS b
ON b.id = a.id
The performance characteristics of this rewrite are implementation dependent.
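To make the duplicate problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration: my_table_name holds two identical rows with id 1, and another_table holds id 1 twice. Other engines should agree on the row counts, but only SQLite was used here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table_name (id INTEGER, title TEXT, description TEXT);
    CREATE TABLE another_table (id INTEGER);
    -- hypothetical data: duplicate rows in both tables
    INSERT INTO my_table_name VALUES (1, 't', 'd'), (1, 't', 'd'), (2, 'u', 'e');
    INSERT INTO another_table VALUES (1), (1), (3);
""")

def count(sql):
    """Return the number of rows a query produces."""
    return len(conn.execute(sql).fetchall())

# Original IN version: each matching row of my_table_name appears once.
in_rows = count("""
    SELECT a.id, a.title, a.description FROM my_table_name AS a
    WHERE a.id IN (SELECT id FROM another_table b WHERE b.id = 1)""")

# Plain join: every a row is repeated per matching another_table row.
join_rows = count("""
    SELECT a.id, a.title, a.description FROM my_table_name AS a
    JOIN another_table AS b ON b.id = a.id WHERE b.id = 1""")

# DISTINCT on the outer query: also collapses genuine duplicates in a.
distinct_rows = count("""
    SELECT DISTINCT a.id, a.title, a.description FROM my_table_name AS a
    JOIN another_table AS b ON b.id = a.id WHERE b.id = 1""")

# DISTINCT inside the derived table: matches the IN version.
derived_rows = count("""
    SELECT a.id, a.title, a.description FROM my_table_name AS a
    JOIN (SELECT DISTINCT id FROM another_table WHERE id = 1) AS b
      ON b.id = a.id""")

print(in_rows, join_rows, distinct_rows, derived_rows)
```

With this data only the derived-table rewrite returns the same number of rows as the IN query; the plain join returns too many and the outer DISTINCT too few.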

You may use INNER JOIN as:
select
a.id, a.title, a.description
from
my_table_name as a INNER JOIN another_table as b ON (a.id = b.id and b.id = 1)
Or
select
a.id, a.title, a.description
from
my_table_name as a INNER JOIN another_table as b ON a.id = b.id
where b.id = 1
The two queries may not return the same results as your original query. Choose whichever works for you, and treat this as a starting point rather than copy-paste code.

To express it as a join:
select distinct
a.id, a.title, a.description
from my_table_name as a
join another_table b on b.id = a.id
where b.id = 1
DISTINCT is used so that the same row is not returned multiple times when another_table contains the same id more than once.
Note: if combinations of id, title and description in my_table_name are not unique, this query won't return the duplicates that the original query would.
To guarantee the same results, you need to ensure that the ids taken from another_table are unique. To do this as a join:
select
a.id, a.title, a.description
from my_table_name as a
join (select distinct id from another_table) b on b.id = a.id
where b.id = 1

Related

Using Summary Data as a Parameter in SQL

I'm new to SQL and am using Access to run queries that Excel can't really handle. Here's the basic design of the query:
SELECT A.ID, A.Description, A.Location, B.ID, B.Quantity, B.Location
FROM A LEFT JOIN B ON A.ID = B.ID
In table B, location is all the same value. I want to retain the left join above, but limit the resulting values in table A to whatever the location value is in table B. In my mind this would be a WHERE clause in which A.Location = max(B.Location) or something like that.
Any ideas?
If you want to limit the resulting values in table A to whatever the location value is in table B, why can't you simply use the join based on location also?
SELECT A.ID, A.Description, A.Location, B.ID, B.Quantity, B.Location
FROM A LEFT JOIN B
ON A.ID = B.ID
AND A.location = B.location
You can use a DMax expression to fetch the duplicated non-Null value of B.Location. And that expression can be used in the WHERE clause to limit A rows to only those with matching [Location]:
SELECT A.ID, A.Description, A.Location, B.ID, B.Quantity, B.Location
FROM A LEFT JOIN B ON A.ID = B.ID
WHERE A.Location = DMax("[Location]", "B");
If you prefer not to use DMax since it is Access-specific, you can do it this way instead:
SELECT A.ID, A.Description, A.Location, B.ID, B.Quantity, B.Location
FROM A LEFT JOIN B ON A.ID = B.ID
WHERE A.Location = (SELECT Max([Location]) FROM B);
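For comparison, here is a rough sketch of the scalar-subquery version next to the variant that puts the location test in the ON clause. It runs on SQLite through Python's sqlite3 module rather than on Access, and the sample rows are invented, so treat it as an illustration of LEFT JOIN semantics, not of Access behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, Description TEXT, Location TEXT);
    CREATE TABLE B (ID INTEGER, Quantity INTEGER, Location TEXT);
    -- hypothetical data: B has a single location value, 'X'
    INSERT INTO A VALUES (1, 'a', 'X'), (2, 'b', 'Y'), (3, 'c', 'X');
    INSERT INTO B VALUES (1, 10, 'X');
""")

# Filter in WHERE: rows of A at other locations are removed,
# while unmatched rows at location 'X' survive with NULLs from B.
where_rows = conn.execute("""
    SELECT A.ID, A.Location, B.Quantity
    FROM A LEFT JOIN B ON A.ID = B.ID
    WHERE A.Location = (SELECT MAX(Location) FROM B)
    ORDER BY A.ID
""").fetchall()

# Location test in ON: a LEFT JOIN still keeps every row of A.
on_rows = conn.execute("""
    SELECT A.ID, A.Location, B.Quantity
    FROM A LEFT JOIN B ON A.ID = B.ID AND A.Location = B.Location
    ORDER BY A.ID
""").fetchall()

print(where_rows)
print(on_rows)
```

With this data the WHERE version drops the row of A at location 'Y', while the ON version keeps it with NULL in the B columns.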

How can you perform a join when also using a comma-separated list of tables in an SQL select statement?

This is evidently correct syntax in SQL Server:
SELECT a.id, b.name
FROM Table1 a, Table2 b
WHERE a.id = b.fk1
So is this:
SELECT a.id, c.status
FROM Table1 a
JOIN Table3 c ON a.id = c.fk2
But this apparently isn't:
SELECT a.id, b.name, c.status
FROM Table1 a, Table2 b
JOIN Table3 c ON a.id = c.fk2
WHERE a.id = b.fk1
I would NOT normally want to construct a query in the third case's style (and really not the first case's either), but it would probably be the path of least resistance in editing some code that's already been written at my company. Somebody used the first form with five different tables, and I really need to work in a sixth table through a JOIN statement, without taking chances of messing up what they already have. Even though I could rewrite their stuff outright if I need to, I would really like to know how to do something like the third case.
Running the code exactly as-is in the examples, the third case gives me this error message:
The multi-part identifier "a.id" could not be bound.
What is syntactically breaking the third case? What simple fix could be applied? Thanks!
I, likewise, would not recommend doing this. But you can just change the comma to a cross join:
SELECT a.id, b.name, c.status
FROM Table1 a cross join Table2 b
JOIN Table3 c ON a.id = c.fk2
WHERE a.id = b.fk1
This code:
SELECT a.id, b.name, c.status
FROM Table1 a, Table2 b
JOIN Table3 c ON a.id = c.fk2
WHERE a.id = b.fk1
is doing a cross join between a and the result of an inner join of b and c. The ON clause for c cannot reference any of the columns in a because that join is being performed on b only. What you should do is change your query to:
SELECT a.id, b.name, c.status
FROM Table1 a
inner join Table2 b on a.id = b.fk1
inner JOIN Table3 c ON a.id = c.fk2
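As a sanity check that the explicit-join rewrite returns the same rows as an all-comma version with every condition in the WHERE clause, here is a small sketch using Python's sqlite3 module. The column layouts and sample rows are assumptions made up for the example, and SQLite's parser is not SQL Server's, so this only compares results and does not reproduce the binding error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical schemas matching the column names in the question
    CREATE TABLE Table1 (id INTEGER);
    CREATE TABLE Table2 (fk1 INTEGER, name TEXT);
    CREATE TABLE Table3 (fk2 INTEGER, status TEXT);
    INSERT INTO Table1 VALUES (1), (2), (3);
    INSERT INTO Table2 VALUES (1, 'n1'), (2, 'n2');
    INSERT INTO Table3 VALUES (1, 'ok'), (3, 'bad');
""")

# Old style: all tables comma-separated, all join conditions in WHERE.
comma_rows = conn.execute("""
    SELECT a.id, b.name, c.status
    FROM Table1 a, Table2 b, Table3 c
    WHERE a.id = b.fk1 AND a.id = c.fk2
    ORDER BY a.id
""").fetchall()

# Explicit joins: each ON clause sits next to the table it introduces.
join_rows = conn.execute("""
    SELECT a.id, b.name, c.status
    FROM Table1 a
    INNER JOIN Table2 b ON a.id = b.fk1
    INNER JOIN Table3 c ON a.id = c.fk2
    ORDER BY a.id
""").fetchall()

print(comma_rows, join_rows)
```

So if mixing the styles is the only obstacle, adding the new table to the comma list with its condition in the WHERE clause is another way to leave the existing code untouched.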

Oracle outer join with filter condition on the second table

Is there any condition under which the result sets will be different from the following two statements?
select * from a,b where a.id = b.id and b.name = 'XYZ'
select * from a,b where a.id =b.id(+) and b.name = 'XYZ'
I think in both cases it will bring the common rows from a and b where b.name = 'XYZ'. So a.id = b.id(+) has no meaning.
No, there is no condition under which the result sets will be different.
But your assumption "a.id = b.id(+) has no meaning" is not 100% correct. It has a meaning, because it defines the join; otherwise this would be a Cartesian product of all rows from a with the rows of b where b.name = 'XYZ'.
What has no effect is the (+), because the statement is "semantically" wrong. It makes no sense to outer join on id but to join on name.
Usually something like that is wanted:
select * from a,b where a.id =b.id(+) and b.name(+) = 'XYZ';
Short example at http://www.sqlfiddle.com/#!4/d19b4/15
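The same effect can be sketched outside Oracle with ANSI join syntax. The example below uses Python's sqlite3 module and assumes the usual translation: "a.id = b.id(+) and b.name = 'XYZ'" corresponds to a LEFT JOIN with the name filter in WHERE, and "b.name(+) = 'XYZ'" corresponds to putting the name filter in the ON clause. The tables and rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, name TEXT);
    -- hypothetical data: only id 1 has a b row named 'XYZ'
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1, 'XYZ'), (2, 'ABC');
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Plain inner join with a filter on b.
inner = rows("""
    SELECT a.id, b.name FROM a JOIN b ON a.id = b.id
    WHERE b.name = 'XYZ' ORDER BY a.id""")

# Outer join, but the WHERE filter on b discards the NULL-extended rows.
outer_where = rows("""
    SELECT a.id, b.name FROM a LEFT JOIN b ON a.id = b.id
    WHERE b.name = 'XYZ' ORDER BY a.id""")

# Outer join with the filter as part of the join condition.
outer_on = rows("""
    SELECT a.id, b.name FROM a LEFT JOIN b
      ON a.id = b.id AND b.name = 'XYZ'
    ORDER BY a.id""")

print(inner, outer_where, outer_on)
```

The first two queries return identical rows, which is why the (+) on id alone has no effect; only the third keeps the rows of a that have no 'XYZ' match.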

Simulate a left join without using "left join"

I need to simulate the left join effect without using the "left join" key.
I have two tables, A and B, both with id and name columns. I would like to select all the dbids on both tables, where the name in A equals the name in B.
I use this to make a synchronization, so at the beginning B is empty (so I will have couples with id from A with a value and id from B is null). Later I will have a mix of couples with value - value and value - null.
Normally it would be:
SELECT A.id, B.id
FROM A left join B
ON A.name = B.name
The problem is that I can't use the left join and wanted to know if/how it is possible to do the same thing.
You can use this approach, but you must be sure that the inner select returns at most one row.
SELECT A.id,
(select B.id from B where A.name = B.name) as B_ID
FROM A
Just reverse the tables and use a right join instead.
SELECT A.id,
B.id
FROM B
RIGHT JOIN A
ON A.name = B.name
I'm not familiar with java/jpa. Using pure SQL, here's one approach:
SELECT A.id AS A_id, B.id AS B_id
FROM A INNER JOIN B
ON A.name = B.name
UNION
SELECT id AS A_id, NULL AS B_id
FROM A
WHERE name NOT IN ( SELECT name FROM B );
In SQL Server, for example, you can use the *= operator to make a left join:
select A.id, B.id
from A, B
where A.name *= B.name
Other databases might have a slightly different syntax, if such an operator exists at all.
This is the old syntax, used before the join keyword was introduced. You should of course use the join keyword instead if possible. The old syntax might not even work in newer versions of the database.
I can only think of two ways that haven't been given so far. My last three ideas have already been given (boohoo) but I put them here for posterity. I DID think of them without cheating. :-p
Calculate whether B has a match, then provide an extra UNIONed row for the B set to supply the NULL when there is no match.
SELECT A.Id, A.Something, B.Id, B.Whatever, B.SomethingElse
FROM
(
SELECT
A.*,
CASE
WHEN EXISTS (SELECT * FROM B WHERE A.Id = B.Id) THEN 1
ELSE 0
END Which
FROM A
) A
INNER JOIN (
SELECT 1 Which, B.* FROM B
UNION ALL SELECT 0, B.* FROM B WHERE 1 = 0
) B ON A.Which = B.Which
AND (
A.Which = 0
OR (
A.Which = 1
AND A.Id = b.Id
)
)
A slightly different take on that same query:
SELECT A.Id, B.Id
FROM
(
SELECT
A.*,
CASE
WHEN EXISTS (SELECT * FROM B WHERE A.Id = B.Id) THEN A.Id
ELSE -1 -- a value that does not exist in B
END PseudoId
FROM A
) A
INNER JOIN (
SELECT B.Id PseudoId, B.Id FROM B
UNION ALL SELECT -1, NULL
) B ON A.PseudoId = B.PseudoId
Only for SQL Server specifically. I know, it's really a left join, but it doesn't SAY LEFT in there!
SELECT A.Id, B.Id
FROM
A
OUTER APPLY (
SELECT *
FROM B
WHERE A.Id = B.Id
) B
Get the inner join then UNION the outer join:
SELECT A.Id, B.Id
FROM
A
INNER JOIN B ON A.name = B.name
UNION ALL
SELECT A.Id, NULL
FROM A
WHERE NOT EXISTS (
SELECT *
FROM B
WHERE A.name = B.name
)
Use RIGHT JOIN. That's not a LEFT JOIN!
SELECT A.Id, B.Id
FROM
B
RIGHT JOIN A ON B.name = A.name
Just select the B value in a subquery expression (let's hope there's only one B per A). Multiple columns from B can be their own expressions (YUCKO!):
SELECT A.Id, (SELECT TOP 1 B.Id FROM B WHERE A.Id = B.Id) Bid
FROM A
Anyone using Oracle may need some FROM DUAL clauses in any SELECTs that have no FROM.
You could use subqueries, something like:
select a.id
, nvl((select b.id from b where b.name = a.name), '') as bId
from a
You can use the Oracle (+) operator for a left join:
SELECT A.id, B.id
FROM A, B
WHERE A.name = B.name (+)
See: Oracle "(+)" Operator
SELECT A.id, B.id
FROM A full outer join B
ON A.name = B.name
where A.name is not null
I'm not sure if you just can't use a LEFT JOIN or if you're restricted from using any JOINS at all. But as far as I understand your requirements, an INNER JOIN should work:
SELECT A.id, B.id
FROM A
INNER JOIN B ON A.name = B.name
Simulating left join using pure simple sql:
SELECT A.name
FROM A
where (select count(B.name) from B where A.id = B.id)<1;
In a left join, the rows of A without a match are those with no rows in B referring to them, so a count of 0 selects the rows of A that don't have a match.
Add "or A.id = B.id" in the where clause to simulate the inner join part.
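Two of the approaches from this thread can be checked against a real LEFT JOIN with a quick sketch in Python's sqlite3 module. The sample rows are invented, and the names in B are unique here, which the scalar-subquery variant relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER, name TEXT);
    -- hypothetical data: only 'x' has been synchronized to B so far
    INSERT INTO A VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO B VALUES (10, 'x');
""")

# Reference result: the real left join.
left_join = conn.execute("""
    SELECT A.id, B.id FROM A LEFT JOIN B ON A.name = B.name
    ORDER BY A.id
""").fetchall()

# Inner join for the matches, plus NULL-padded rows for the non-matches.
union_version = conn.execute("""
    SELECT A.id, B.id FROM A INNER JOIN B ON A.name = B.name
    UNION ALL
    SELECT A.id, NULL FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.name = A.name)
    ORDER BY 1
""").fetchall()

# Correlated scalar subquery; only safe when a name matches at most one B row.
subquery_version = conn.execute("""
    SELECT A.id, (SELECT B.id FROM B WHERE B.name = A.name) AS B_id
    FROM A ORDER BY A.id
""").fetchall()

print(left_join)
```

With this data all three queries return the same pairs. If B could contain the same name more than once, the scalar-subquery version would no longer match the left join.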

How do I find records that are not joined?

I have two tables that are joined together.
A has many B
Normally you would do:
select * from a,b where b.a_id = a.id
To get all of the records from a that have a record in b.
How do I get just the records in a that do not have anything in b?
select * from a where id not in (select a_id from b)
Or, as some other people on this thread say:
select a.* from a
left outer join b on a.id = b.a_id
where b.a_id is null
select * from a
left outer join b on a.id = b.a_id
where b.a_id is null
The following image will help to understand SQL LEFT JOIN:
Another approach:
select * from a where not exists (select * from b where b.a_id = a.id)
The "exists" approach is useful if there is some other "where" clause you need to attach to the inner query.
SELECT id FROM a
EXCEPT
SELECT a_id FROM b;
You will probably get a lot better performance (than using 'not in') if you use an outer join:
select * from a left outer join b on a.id = b.a_id where b.a_id is null;
SELECT <columns>
FROM a WHERE id NOT IN (SELECT a_id FROM b)
In the case of one join it is pretty fast, but when we are removing records from a database which has about 50 million records and 4 or more joins due to foreign keys, it takes a few minutes to do it.
Much faster to use WHERE NOT IN condition like this:
select a.* from a
where a.id NOT IN(SELECT DISTINCT a_id FROM b where a_id IS NOT NULL)
-- And for more joins
AND a.id NOT IN(SELECT DISTINCT a_id FROM c where a_id IS NOT NULL)
I can also recommend this approach for deleting when cascade delete is not configured.
This query takes only a few seconds.
The first approach is
select a.* from a where a.id not in (select b.ida from b)
the second approach is
select a.*
from a left outer join b on a.id = b.ida
where b.ida is null
The first approach is very expensive. The second approach is better.
With PostgreSQL 9.4, I ran EXPLAIN on both queries, and the first query has a cost of cost=0.00..1982043603.32.
The join query, by contrast, has a cost of cost=45946.77..45946.78
For example, I search for all products that are not compatible with any vehicle. I have 100k products and more than 1m compatibilities.
select count(*) from product a left outer join compatible c on a.id=c.idprod where c.idprod is null
The join query took about 5 seconds, whereas the subquery version had still not finished after 3 minutes.
Another way of writing it
select a.*
from a
left outer join b
on a.id = b.id
where b.id is null
Ouch, beaten by Nathan :)
This will protect you from nulls in the IN clause, which can cause unexpected behavior.
select * from a where id not in (select [a id] from b where [a id] is not null)
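The NULL pitfall with NOT IN is easy to reproduce. The sketch below uses Python's sqlite3 module with invented rows, where b contains one row whose a_id is NULL; the same three-valued logic applies in other SQL databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (a_id INTEGER);
    -- hypothetical data: one b row has no parent (NULL a_id)
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (NULL);
""")

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

# A NULL in the subquery makes "id NOT IN (...)" unknown for every non-match.
not_in = ids("SELECT id FROM a WHERE id NOT IN (SELECT a_id FROM b) ORDER BY id")

# Filtering out the NULLs restores the expected result.
not_in_filtered = ids("""
    SELECT id FROM a
    WHERE id NOT IN (SELECT a_id FROM b WHERE a_id IS NOT NULL)
    ORDER BY id""")

# NOT EXISTS and the outer-join form are not affected by the NULL row.
not_exists = ids("""
    SELECT id FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)
    ORDER BY id""")
left_join = ids("""
    SELECT a.id FROM a LEFT OUTER JOIN b ON a.id = b.a_id
    WHERE b.a_id IS NULL ORDER BY a.id""")

print(not_in, not_in_filtered, not_exists, left_join)
```

With this data the unfiltered NOT IN returns no rows at all, while the other three forms return ids 2 and 3.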