SQL multiple shared WHERE criteria - sql

I'm trying to find a good, efficient way to run a query like this:
SELECT *
FROM tableA a
WHERE a.manager IN ( SELECT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob')
OR a.teamLead IN ( SELECT ID
FROM tableB b
CONNECT BY PRIOR b.ID = b.manager_id
START WITH b.ID = 'managerBob')
OR a.creator IN ( SELECT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob')
As you can see, I'm trying to use multiple conditions in the WHERE clause, but each condition uses the same dataset on the right-hand side of the IN. It seems to run very slowly if I use more than one condition, and I'm pretty sure that's because Oracle is running each subquery separately. Is there a way to make something like this work?
SELECT *
FROM tableA a
WHERE a.manager,
a.teamLead,
a.creator in ( SELECT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob')
By the way, I'm sorry if this is something I could have Googled, I'm not sure what to call this.

Subquery factoring may help:
WITH people AS
( SELECT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob'
)
SELECT *
FROM tableA a
WHERE a.manager IN (SELECT id FROM people)
OR a.teamLead IN (SELECT id FROM people)
OR a.creator IN (SELECT id FROM people)
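To try the factored form without an Oracle instance, here is a minimal sketch using Python's sqlite3 module. SQLite has no CONNECT BY, so a recursive CTE stands in for the hierarchical query (that substitution is my assumption, not what Oracle runs), and the sample rows are made up purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableB (id TEXT PRIMARY KEY, manager_id TEXT);
    CREATE TABLE tableA (item TEXT, manager TEXT, teamLead TEXT, creator TEXT);

    -- Hypothetical hierarchy: managerBob -> alice -> carol; dave is unrelated.
    INSERT INTO tableB VALUES
        ('managerBob', NULL), ('alice', 'managerBob'),
        ('carol', 'alice'), ('dave', NULL);

    -- Hypothetical rows: t1..t3 each reference Bob's tree in one column, t4 in none.
    INSERT INTO tableA VALUES
        ('t1', 'managerBob', 'dave', 'dave'),
        ('t2', 'dave', 'carol', 'dave'),
        ('t3', 'dave', 'dave', 'alice'),
        ('t4', 'dave', 'dave', 'dave');
""")

# The hierarchy is written once (as "people") and reused by all three conditions.
# WITH RECURSIVE replaces Oracle's CONNECT BY ... START WITH in this sketch.
query = """
    WITH RECURSIVE people(id) AS (
        SELECT id FROM tableB WHERE id = 'managerBob'
        UNION ALL
        SELECT b.id FROM tableB b JOIN people p ON b.manager_id = p.id
    )
    SELECT a.item
    FROM tableA a
    WHERE a.manager  IN (SELECT id FROM people)
       OR a.teamLead IN (SELECT id FROM people)
       OR a.creator  IN (SELECT id FROM people)
    ORDER BY a.item
"""
matching_items = [row[0] for row in conn.execute(query)]
print(matching_items)
```

This only shows that the rewritten query returns the same rows you would expect from the three separate subqueries. Whether Oracle actually materializes the factored subquery once is up to its optimizer, so check the execution plan.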

You can do:
WITH bob_subordinates AS
( SELECT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob')
SELECT * FROM tableA a
WHERE a.manager in (select id from bob_subordinates)
OR a.teamlead in (select id from bob_subordinates)
or a.creator in (select id from bob_subordinates)
Alternative (check the use of DISTINCT: if ids are not unique in table B then this is not equivalent):
WITH bob_subordinates AS
( SELECT DISTINCT id
FROM tableB b
CONNECT BY PRIOR b.id = b.manager_id
START WITH b.id = 'managerBob')
SELECT DISTINCT a.*
FROM tableA a JOIN bob_subordinates b ON b.id IN (a.manager, a.teamlead, a.creator);

UPDATE as per comments - try
SELECT A.* FROM
(SELECT bb.id FROM tableB bb CONNECT BY PRIOR bb.id = bb.manager_id START WITH bb.id = 'managerBob') B INNER JOIN TABLEA A ON B.ID IN (A.MANAGER, A.TEAMLEAD, A.CREATOR)

Related

How do I solve this unique constraint error?

I am running Python code in JupyterLab. I am trying to create a table from other tables in a database file. This is my code:
# Alternative INVENTORY_PARENTLOT table
c.execute("DROP TABLE IF EXISTS inventory_parentlot")
query = """
CREATE TEMPORARY TABLE tmp_inv_parents AS SELECT a.id, b.inventorytype AS type, a.parentid FROM inventory_parents AS a JOIN table_inventory AS b on a.id=b.id JOIN table_inventory AS c on a.parentid=c.id WHERE c.inventorytype > 12 ;
CREATE TEMPORARY TABLE tmp_inv_parents2 AS SELECT a.id, a.type, b.parentid FROM tmp_inv_parents AS a JOIN tmp_inv_parents AS b on a.parentid = b.id;
CREATE TEMPORARY TABLE tmp_inv_parents3 AS SELECT a.id, a.type, b.parentid FROM tmp_inv_parents2 AS a JOIN tmp_inv_parents2 AS b on a.parentid = b.id;
CREATE TEMPORARY TABLE tmp_inv_parents4 AS SELECT a.id, a.type, b.parentid FROM tmp_inv_parents3 AS a JOIN tmp_inv_parents3 AS b on a.parentid = b.id;
CREATE TEMPORARY TABLE tmp_inv_parents5 AS SELECT a.id, a.type, b.parentid FROM tmp_inv_parents4 AS a JOIN tmp_inv_parents4 AS b on a.parentid = b.id ;
CREATE TEMPORARY TABLE tmp_inv_parentall AS SELECT * from tmp_inv_parents UNION SELECT * from tmp_inv_parents2 UNION SELECT * from tmp_inv_parents3 UNION SELECT * from tmp_inv_parents4;
CREATE TABLE inventory_parentlot AS SELECT a.id, b.id AS parentlot FROM (select id, min(parentserial) as firstparentserial FROM (select a.id, a.parentid, b.idserial AS parentserial from tmp_inv_parentall AS a JOIN table_inventory AS b on a.parentid = b.id ORDER BY a.id) GROUP BY id) AS a JOIN table_inventory AS b on a.firstparentserial = b.idserial;
CREATE UNIQUE INDEX inventory_parentlot_id ON inventory_parentlot(id);
ANALYZE inventory_parentlot;
"""
c.executescript(query)
When this code runs, I get a unique constraint error. However, I have checked each table and the ids are unique.
How can I solve this? Is there a way I can just ignore the error?
Make sure that the id is unique in the outer query of the SQL that creates the table.
CREATE TABLE inventory_parentlot AS
SELECT a.id, MAX(b.id) AS parentlot
FROM (
SELECT id, MIN(parentserial) AS firstparentserial
FROM (
SELECT a.id, a.parentid, b.idserial AS parentserial
FROM tmp_inv_parentall AS a
JOIN table_inventory AS b ON a.parentid = b.id
)
GROUP BY id
) AS a
JOIN table_inventory AS b
ON a.firstparentserial = b.idserial
GROUP BY a.id;
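One way the original statement can end up with duplicate ids is the final join on idserial: if more than one table_inventory row carries the same idserial, each id fans out into several rows. Here is a small sketch of that effect with Python's sqlite3. The schema is trimmed down and the data is invented, so treat it as an illustration rather than a diagnosis of your actual tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_inventory (id INTEGER, idserial INTEGER);
    CREATE TABLE tmp_inv_parentall (id INTEGER, parentid INTEGER);

    -- Hypothetical data: parent lots 10 and 11 share idserial 500.
    INSERT INTO table_inventory VALUES (1, 100), (10, 500), (11, 500);
    INSERT INTO tmp_inv_parentall VALUES (1, 10);
""")

# Inner part shared by both variants: the smallest parent serial per id.
inner = """
    SELECT id, MIN(parentserial) AS firstparentserial
    FROM (
        SELECT a.id, b.idserial AS parentserial
        FROM tmp_inv_parentall AS a
        JOIN table_inventory AS b ON a.parentid = b.id
    )
    GROUP BY id
"""

# Without the outer GROUP BY, id 1 appears once per inventory row with that serial.
ungrouped = conn.execute(f"""
    SELECT a.id, b.id AS parentlot
    FROM ({inner}) AS a
    JOIN table_inventory AS b ON a.firstparentserial = b.idserial
    ORDER BY b.id
""").fetchall()

# With GROUP BY a.id and MAX(b.id), each id is returned exactly once.
grouped = conn.execute(f"""
    SELECT a.id, MAX(b.id) AS parentlot
    FROM ({inner}) AS a
    JOIN table_inventory AS b ON a.firstparentserial = b.idserial
    GROUP BY a.id
""").fetchall()

print(ungrouped)
print(grouped)
```

With the grouped result, a unique index on id can be created without a constraint error. Note that MAX(b.id) picks one parent lot arbitrarily when several share a serial, so check that this is acceptable for your data.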

Return matched-pair and the non-matched pair for each ID only once

Any help would be appreciated.
I have two sample tables here.
Table A:
ID |Name
123|REG
123|ERT
124|REG
124|ACR
Table B
ID |Name
123|REG
123|WWW
124|REG
124|ADR
Here is the simple join output and I will explain my question in the comments:
*Yes -- I want this row
*No -- I don't want this row
AID|Aname|BID|Bname
123|REG |123|REG --Yes-- Matched-pair for id '123'
123|ERT |123|REG --No--'REG' already had one match. 'ERT' should pair with 'WWW' for id '123'
123|REG |123|WWW --No--The same reason as above
123|ERT |123|WWW --Yes--non-matched pair for id '123'
124|REG |124|REG
124|ACR |124|REG
124|REG |124|ADR
124|ACR |124|ADR
My desired result:
AID|Aname|BID|Bname
123|ERT |123|WWW
123|REG |123|REG
124|REG |124|REG
124|ACR |124|ADR
SQL server 2017.
Thank you in advance.
My approach (inspired by the post from @The Impaler):
;with CTEall as(
select A.id as AID, A.NAME as Aname, b.id as BID,b.NAME as Bname from A
inner join B on A.id = B.id),
match as (
select A.id as AID, A.NAME as Aname, b.id as BID,b.NAME as Bname
from A inner join B on A.id = B.id and A.NAME = B.NAME)
select *
from CTEall
where Aname not in (select Aname from match where AID = BID)
and Bname not in (select Aname from match where BID = AID)
union all
select * from match
order by 1
Often when you think about the logic you want in a different way, the answer (or at least AN answer) becomes obvious.
I am thinking of your logic this way:
JOIN Table A to Table B such that A.ID=B.ID (always) AND EITHER
A.Name=B.Name OR A.Name doesn't have a Match in B, and B.Name doesn't
have a match in A.
This logic is pretty easy to express in SQL
WHERE a.ID=b.ID
AND (
a.Name=b.Name OR (
NOT EXISTS(SELECT * FROM TableB b2 WHERE b2.ID=a.ID AND b2.Name=a.Name)
AND
NOT EXISTS(SELECT * FROM TableA a2 WHERE a2.ID=b.ID AND a2.Name=b.Name)
)
)
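The condition above can be checked against the sample data from the question. This is a minimal sketch using Python's sqlite3 rather than SQL Server 2017, and the table names TableA and TableB are simply taken from the fragment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, Name TEXT);
    CREATE TABLE TableB (ID INTEGER, Name TEXT);
    INSERT INTO TableA VALUES (123, 'REG'), (123, 'ERT'), (124, 'REG'), (124, 'ACR');
    INSERT INTO TableB VALUES (123, 'REG'), (123, 'WWW'), (124, 'REG'), (124, 'ADR');
""")

# Keep a pair when the names match, or when neither name has a match
# on the other side for the same ID.
pairs = conn.execute("""
    SELECT a.ID, a.Name, b.ID, b.Name
    FROM TableA a
    JOIN TableB b ON a.ID = b.ID
    WHERE a.Name = b.Name
       OR (
            NOT EXISTS (SELECT 1 FROM TableB b2 WHERE b2.ID = a.ID AND b2.Name = a.Name)
        AND NOT EXISTS (SELECT 1 FROM TableA a2 WHERE a2.ID = b.ID AND a2.Name = b.Name)
       )
    ORDER BY a.ID, a.Name
""").fetchall()

for row in pairs:
    print(row)
```

On this sample it returns the four desired rows. If an ID can have more than one unmatched name on each side, the unmatched names are cross-joined, which is where a row_number() pairing becomes useful.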
I would do:
with
m as ( -- matched rows
select a.id as aid, a.name as aname, b.id as bid, b.name as bname
from table_a a
join table_b b on a.id = b.id and a.name = b.name
),
l as ( -- unmatched "left rows"
select a.id, a.name,
row_number() over(partition by a.id order by a.name) as rn
from table_a a
left join table_b b on a.id = b.id and a.name = b.name
where b.id is null
),
r as ( -- unmatched "right rows"
select b.id, b.name,
row_number() over(partition by b.id order by b.name) as rn
from table_b b
left join table_a a on a.id = b.id and a.name = b.name
where a.id is null
)
select aid, aname, bid, bname from m
union all
select l.id, l.name, r.id, r.name
from l
join r on r.id = l.id and r.rn = l.rn
Note: This solution may be a little bit overkill, since it pairs up all unmatched rows when there are several per ID, which is not necessary here. Per the OP's comments, there will always be a single unmatched row per ID.

How can I get Oracle to have a better execution plan for this query?

We have some tables which keep track of processed transactions. These tables have millions of rows. Often times I want to look at the most recent X transactions, so I have a query like this to pull the info I want from a few tables:
select a.id, b.field_one, c.field_two
from trans a, table_two b, table_three c
where a.id = b.id
and a.id = c.id
and a.id in
(select id from trans where id > (select max(id) from trans) - 100);
Right now the query is very slow. The explain plan shows a full table scan on B and C. Now, if I evaluate the nested query separately and replace it with a list of comma separated IDs, the query is very fast. This seems obvious to me - it will only have 100 rows to join together so of course it will be faster than if it answered the query by first joining A and B together.
Conceptually I understand the query optimizer is trying to find a good execution plan but in this case it seems like it is doing a terrible job. Is there any way to force the DBMS to execute the nested query first? Possibly by using a Hint?
Thanks
Your method is probably obscuring the fact that at most 100 rows are being selected from trans.
Try this:
with cte_last_trans as (
select id
from (select id
from trans
where id > (select max(id)-100 from trans)
order by id desc)
where rownum <= 100)
select a.id,
b.field_one,
c.field_two
from cte_last_trans a,
table_two b,
table_three c
where a.id = b.id
and a.id = c.id
By the way, have you thought of the possibility that not all values of id might be present? If you want 100 rows to be returned, use:
with cte_last_trans as (
select id
from (select id
from trans
order by id desc)
where rownum <= 100)
select a.id,
b.field_one,
c.field_two
from cte_last_trans a,
table_two b,
table_three c
where a.id = b.id
and a.id = c.id
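The difference between the two variants only shows up when ids have gaps. Here is a small sketch of that with Python's sqlite3. SQLite has no rownum, so LIMIT stands in for the "rownum <= 100" filter (my substitution), and the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (id INTEGER PRIMARY KEY)")
# Hypothetical data: only even ids exist, so every other id is missing.
conn.executemany("INSERT INTO trans VALUES (?)", [(i,) for i in range(2, 1001, 2)])

# Range filter: everything above max(id) - 100, however many rows that is.
by_range = conn.execute(
    "SELECT COUNT(*) FROM trans WHERE id > (SELECT MAX(id) - 100 FROM trans)"
).fetchone()[0]

# Rank filter: the 100 highest ids, regardless of gaps.
by_rank = conn.execute(
    "SELECT COUNT(*), MIN(id) FROM (SELECT id FROM trans ORDER BY id DESC LIMIT 100)"
).fetchone()

print(by_range)  # rows returned by the range filter
print(by_rank)   # (row count, lowest id) for the rank filter
```

With half the ids missing, the range filter returns only half as many rows as the rank filter, which reaches further back to fill its quota of 100.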
You can use a NO_MERGE hint, which stops Oracle from merging the inline view into the outer query so that the inner query is evaluated on its own. Here's a generic example, followed by the hint applied to your query:
SELECT /*+NO_MERGE(seattle_dept)*/ e1.last_name, seattle_dept.department_name
FROM employees e1,
(SELECT location_id, department_id, department_name
FROM departments
WHERE location_id = 1700) seattle_dept
WHERE e1.department_id = seattle_dept.department_id;
select /*+ no_merge(inner) */ a.id, b.field_one, c.field_two
from trans a,
table_two b,
table_three c,
(select id from trans where id > (select max(id) from trans) - 100) inner
where a.id = b.id
and a.id = c.id
and a.id = inner.id;
The IN does not seem to do anything. Have you tried removing it?
select a.id, b.field_one, c.field_two
from trans a, table_two b, table_three c
where a.id = b.id
and a.id = c.id
and a.id > (select max(id) from trans) - 100;
You could simply filter the 100 records in the primary trans table itself instead of joining it again and again.
select a.id, b.field_one, c.field_two
from
(select id from (select id, row_number() over(order by id desc) rn
from trans) where rn <=100) a,
table_two b, table_three c
where a.id = b.id
and a.id = c.id;
select a.id, b.field_one, c.field_two
from trans a, table_two b, table_three c
where a.id = b.id
and a.id = c.id
and a.id between (select max(id)-100 from trans) and (select max(id) from trans)

Simulate a left join without using "left join"

I need to simulate the left join effect without using the "left join" key.
I have two tables, A and B, both with id and name columns. I would like to select all the dbids on both tables, where the name in A equals the name in B.
I use this to make a synchronization, so at the beginning B is empty (so I will have couples with id from A with a value and id from B is null). Later I will have a mix of couples with value - value and value - null.
Normally it would be:
SELECT A.id, B.id
FROM A left join B
ON A.name = B.name
The problem is that I can't use the left join and wanted to know if/how it is possible to do the same thing.
You can use this approach, but you must be sure that the inner select returns at most one row for each row of A.
SELECT A.id,
(select B.id from B where A.name = B.name) as B_ID
FROM A
Just reverse the tables and use a right join instead.
SELECT A.id,
B.id
FROM B
RIGHT JOIN A
ON A.name = B.name
I'm not familiar with java/jpa. Using pure SQL, here's one approach:
SELECT A.id AS A_id, B.id AS B_id
FROM A INNER JOIN B
ON A.name = B.name
UNION
SELECT id AS A_id, NULL AS B_id
FROM A
WHERE name NOT IN ( SELECT name FROM B );
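As a quick sanity check, the UNION version can be compared against a real LEFT JOIN on a toy dataset. This is a minimal sketch with Python's sqlite3, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER, name TEXT);
    -- Hypothetical data: 'y' exists only in A.
    INSERT INTO A VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO B VALUES (10, 'x'), (30, 'z');
""")

# Reference result using the keyword the question wants to avoid.
left_join = conn.execute("""
    SELECT A.id, B.id
    FROM A LEFT JOIN B ON A.name = B.name
    ORDER BY A.id
""").fetchall()

# Matched rows from the inner join, plus unmatched A rows padded with NULL.
simulated = conn.execute("""
    SELECT A.id AS A_id, B.id AS B_id
    FROM A INNER JOIN B ON A.name = B.name
    UNION
    SELECT id AS A_id, NULL AS B_id
    FROM A
    WHERE name NOT IN (SELECT name FROM B)
    ORDER BY A_id
""").fetchall()

print(left_join)
print(simulated)
```

Two caveats: UNION removes duplicate rows, which a LEFT JOIN would keep, and NOT IN returns no rows at all if B.name contains a NULL. NOT EXISTS avoids the second problem.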
In SQL Server, for example, you can use the *= operator to make a left join:
select A.id, B.id
from A, B
where A.name *= B.name
Other databases might have a slightly different syntax, if such an operator exists at all.
This is the old syntax, used before the join keyword was introduced. You should of course use the join keyword instead if possible. The old syntax might not even work in newer versions of the database.
I can only think of two ways that haven't been given so far. My last three ideas have already been given (boohoo) but I put them here for posterity. I DID think of them without cheating. :-p
Calculate whether B has a match, then provide an extra UNIONed row for the B set to supply the NULL when there is no match.
SELECT A.Id, A.Something, B.Id, B.Whatever, B.SomethingElse
FROM
(
SELECT
A.*,
CASE
WHEN EXISTS (SELECT * FROM B WHERE A.Id = B.Id) THEN 1
ELSE 0
END Which
FROM A
) A
INNER JOIN (
SELECT 1 Which, B.* FROM B
UNION ALL SELECT 0, B.* FROM B WHERE 1 = 0
) B ON A.Which = B.Which
AND (
A.Which = 0
OR (
A.Which = 1
AND A.Id = b.Id
)
)
A slightly different take on that same query:
SELECT A.Id, B.Id
FROM
(
SELECT
A.*,
CASE
WHEN EXISTS (SELECT * FROM B WHERE A.Id = B.Id) THEN A.Id
ELSE -1 -- a value that does not exist in B
END PseudoId
FROM A
) A
INNER JOIN (
SELECT B.Id PseudoId, B.Id FROM B
UNION ALL SELECT -1, NULL
) B ON A.PseudoId = B.PseudoId
Only for SQL Server specifically. I know, it's really a left join, but it doesn't SAY LEFT in there!
SELECT A.Id, B.Id
FROM
A
OUTER APPLY (
SELECT *
FROM B
WHERE A.Id = B.Id
) B
Get the inner join then UNION the outer join:
SELECT A.Id, B.Id
FROM
A
INNER JOIN B ON A.name = B.name
UNION ALL
SELECT A.Id, NULL
FROM A
WHERE NOT EXISTS (
SELECT *
FROM B
WHERE A.name = B.name
)
Use RIGHT JOIN. That's not a LEFT JOIN!
SELECT A.Id, B.Id
FROM
B
RIGHT JOIN A ON B.name = A.name
Just select the B value in a subquery expression (let's hope there's only one B per A). Multiple columns from B can be their own expressions (YUCKO!):
SELECT A.Id, (SELECT TOP 1 B.Id FROM B WHERE A.Id = B.Id) Bid
FROM A
Anyone using Oracle may need some FROM DUAL clauses in any SELECTs that have no FROM.
You could use subqueries, something like:
select a.id
, nvl((select b.id from b where b.name = a.name), '') as bId
from a
You can use Oracle's (+) operator for a left join:
SELECT A.id, B.id
FROM A, B
WHERE A.name = B.name (+)
See: Oracle "(+)" Operator
SELECT A.id, B.id
FROM A full outer join B
ON A.name = B.name
where A.name is not null
I'm not sure if you just can't use a LEFT JOIN or if you're restricted from using any JOINS at all. But as far as I understand your requirements, an INNER JOIN should work:
SELECT A.id, B.id
FROM A
INNER JOIN B ON A.name = B.name
Simulating the unmatched part of a left join using plain SQL:
SELECT A.name
FROM A
WHERE (SELECT COUNT(B.name) FROM B WHERE A.id = B.id) < 1;
For the rows that a left join would pad with NULLs, there are no rows in B referring to A, so a count of 0 identifies the rows in A that don't have a match.
Plus "or A.id = B.id" in the WHERE clause to simulate the inner join.

How do I write SELECT FROM myTable WHERE id IN (SELECT...) in Linq?

How do you rewrite this in Linq?
SELECT Id, Name FROM TableA WHERE TableA.Id IN (SELECT xx from TableB INNER JOIN Table C....)
So in plain English, I want to select Id and Name from TableA where TableA's Id is in the result set of a second query.
from a in TableA
where (from b in TableB
join c in TableC on b.id equals c.id
where .. select b.id)
.Contains(a.Id)
select new { a.Id, a.Name }
LINQ supports IN in the form of Contains. Think "collection.Contains(id)" instead of "id IN (collection)".
from a in TableA
where (
from b in TableB
join c in TableC
on b.id equals c.id
select b.id
).Contains(a.Id)
select new { a.Id, a.Name }
There is no out-of-the-box IN operator in LINQ. You need to join the two queries.