reuse result of subquery in IN operator - sql

I found posts about reusing the results of a subquery, but none of them mention the IN operator.
My query looks like this:
select count(*)
from X
where ...
  and X.id NOT IN (select id from Y)
  and X.id IN (select id from Z
               where ...
                 and Z.id IN (select id from Y)
              )
As you can see, the subquery select id from Y is repeated.
How can I reuse the result of the subquery in the IN operator?

Related

Optimising where clause with x in y or z in y

I'm just wondering if there is any way to optimise this query:
select * from table_x where buyer_id in (select id from table_y) x or
seller_id in (select id from table_y) y
The two subqueries in my where-clause are identical, and I suspect that the program will run them separately.
Thanks!
Your query is essentially:
select x.*
from table_x x
where x.buyer_id in (select y.id from table_y y) or
x.seller_id in (select y.id from table_y y);
This construct should be fine. In some databases, you might use exists instead of in, but I think Hive will be fine with this.
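To avoid the repeated subquery in Hive, you can use LEFT SEMI JOIN. Note that chaining two semi joins as below keeps only the rows where both buyer_id and seller_id match (AND), which is not the same as the OR in the original query: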
SELECT x.*
FROM table_x x
LEFT SEMI JOIN table_y b ON (x.buyer_id = b.id )
LEFT SEMI JOIN table_y c ON (x.seller_id = c.id )
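To see how two chained semi joins compare with the OR form, here is a minimal sketch using Python's sqlite3 module. SQLite has no LEFT SEMI JOIN, so each semi join is emulated with EXISTS, which is an assumption of this sketch rather than Hive syntax. The sample rows are hypothetical and not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_x (id INTEGER, buyer_id INTEGER, seller_id INTEGER);
    CREATE TABLE table_y (id INTEGER);
    -- Hypothetical sample rows, not from the original question
    INSERT INTO table_x VALUES (1, 10, 20), (2, 10, 99), (3, 98, 99);
    INSERT INTO table_y VALUES (10), (20);
""")

# Original predicate: buyer OR seller appears in table_y.
or_rows = conn.execute("""
    SELECT x.id FROM table_x x
    WHERE x.buyer_id IN (SELECT y.id FROM table_y y)
       OR x.seller_id IN (SELECT y.id FROM table_y y)
    ORDER BY x.id
""").fetchall()

# Two chained semi joins, emulated as EXISTS ... AND EXISTS ...
chained_rows = conn.execute("""
    SELECT x.id FROM table_x x
    WHERE EXISTS (SELECT 1 FROM table_y b WHERE b.id = x.buyer_id)
      AND EXISTS (SELECT 1 FROM table_y c WHERE c.id = x.seller_id)
    ORDER BY x.id
""").fetchall()

print(or_rows)       # [(1,), (2,)]
print(chained_rows)  # [(1,)]
```

Row 2 has a matching buyer but no matching seller, so it survives the OR query and is dropped by the chained version.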

SELECT * FROM t WHERE t.x OR t.y IN (SELECT id FROM z)

Is something like this possible (I know this statement is not working, I tried it):
SELECT * FROM t WHERE t.x OR t.y IN (SELECT id FROM z)
Example
Table t:
| id | x   | y   |
| 1  | 101 | 201 |
| 2  | 102 | 202 |
Table z:
| id  |
| 101 |
| 201 |
And from this table t I want to select all entries where either attribute x or attribute y is contained in the list of ids of table z.
I know I can do
SELECT * FROM t WHERE t.x IN (SELECT id FROM z) OR t.y IN (SELECT id FROM z)
but this feels like it is very inefficient when the IN values are coming from a complex subquery (which then is the same in both IN clauses).
Or are current query planner implementations clever enough to see that both subqueries give the same results and only execute this one time? Or maybe there is another solution using EXISTS which I currently don't see?
PS: I'm using Postgres, but I'm looking for a generic solution.
Use EXISTS:
SELECT * FROM t WHERE exists (SELECT 1 FROM z where z.id in (t.x,t.y))
If z is a complex query, then you can use a CTE to simplify the code:
WITH z AS (
. . .
)
SELECT *
FROM t
WHERE t.x IN (SELECT id FROM z) OR t.y IN (SELECT id FROM z);
You can also use JOIN or EXISTS instead:
SELECT *
FROM t
WHERE EXISTS (SELECT 1
              FROM z
              WHERE z.id IN (t.x, t.y)
             );
The JOIN version has the downside that rows can multiply due to duplicates in z.
That said, the version with the two IN expressions is possibly the most efficient.
Try...
SELECT t.* FROM t join z on t.x = z.id or t.y = z.id
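For what it's worth, the three variants suggested in this thread can be compared on the sample tables from the question. This is only a sketch run against SQLite through Python's sqlite3 module, not Postgres, so it says nothing about how any planner executes the subqueries. It only shows which rows come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, x INTEGER, y INTEGER);
    CREATE TABLE z (id INTEGER);
    INSERT INTO t VALUES (1, 101, 201), (2, 102, 202);
    INSERT INTO z VALUES (101), (201);
""")

# Variant 1: the subquery written twice.
two_in = conn.execute("""
    SELECT t.id FROM t
    WHERE t.x IN (SELECT id FROM z) OR t.y IN (SELECT id FROM z)
""").fetchall()

# Variant 2: a single correlated EXISTS.
exists_rows = conn.execute("""
    SELECT t.id FROM t
    WHERE EXISTS (SELECT 1 FROM z WHERE z.id IN (t.x, t.y))
""").fetchall()

# Variant 3: a plain JOIN with OR in the join condition.
join_rows = conn.execute("""
    SELECT t.id FROM t JOIN z ON t.x = z.id OR t.y = z.id
""").fetchall()

print(two_in)       # [(1,)]
print(exists_rows)  # [(1,)]
print(join_rows)    # [(1,), (1,)]
```

The JOIN returns row 1 twice even though z has no duplicates, because x matches one row of z and y matches another. The IN and EXISTS forms return each row of t at most once.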

How to do a LEFT JOIN in MS Access without duplicates?

I have two tables with duplicated values in one of the columns. I'd like to do a left join that does not pull in the extra rows where that column's values are duplicated.
For example, I have table X:
id Value
A 2
B 4
C 5
and table Y:
id Value
A 2
A 8
B 2
I'm doing a LEFT JOIN:
SELECT *
FROM X LEFT JOIN Y ON X.id = Y.id;
I would like to get something like:
X.id X.Value Y.id Y.Value
A    2       A    2
B    4       B    2
C    5
so that duplicated id (A 8) from table Y is not considered.
You can do it with GROUP BY:
SELECT X.id, X.value, MIN(Y.value)
FROM X
LEFT JOIN Y ON X.id = Y.id
GROUP BY X.id, X.value
Note that it is not necessary to bring Y.id into the mix, because it is either null or equal to X.id.
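As a quick check of the GROUP BY approach on the sample data from the question, here is a small sketch. It runs against SQLite through Python's sqlite3 module rather than MS Access, so treat it as an illustration of the query logic only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (id TEXT, value INTEGER);
    CREATE TABLE Y (id TEXT, value INTEGER);
    INSERT INTO X VALUES ('A', 2), ('B', 4), ('C', 5);
    INSERT INTO Y VALUES ('A', 2), ('A', 8), ('B', 2);
""")

# One output row per X row; MIN picks a single Y value per id.
rows = conn.execute("""
    SELECT X.id, X.value, MIN(Y.value)
    FROM X
    LEFT JOIN Y ON X.id = Y.id
    GROUP BY X.id, X.value
    ORDER BY X.id
""").fetchall()

print(rows)  # [('A', 2, 2), ('B', 4, 2), ('C', 5, None)]
```

The row for C has no match in Y, so MIN has nothing to aggregate and returns NULL (None in Python).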
You are looking for GROUP BY to aggregate the Y table records, effectively collapsing them down to one row per id. I have chosen MIN but you could use SUM if they are integers like your example data.
SELECT
x.id , x.Value, y.id, min(y.value)
FROM
X LEFT JOIN Y ON X.id = Y.id
GROUP BY
x.id, x.value, y.id;
I have given exactly what you asked for. But in my opinion the y.Id is unnecessary in the select and group by list.
This isn't what you're asking for in particular, but if you don't need them joined you could use a union.
SELECT * FROM (
    SELECT ID, VALUE FROM X
    UNION ALL
    SELECT ID, VALUE FROM Y) AS u
GROUP BY ID, Value
You would end up getting
id Value
A 2
A 8
B 2
B 4
C 5
Hmmm, I think you can do this in Access with correlated subqueries:
select x.*,
       (select top 1 y.id
        from y
        where y.id = x.id
       ) as y_id,
       (select top 1 y.value
        from y
        where y.id = x.id
       ) as y_value
from x;
This doesn't guarantee that the values come from the same row, but that's not a big deal in this case, because the y.id is either present (and the same as x.id) or it is NULL. The y.value comes from an arbitrary matching row.

Using having count() in exists clause

I am trying to write a SQL query where the subquery in an 'exists' clause has a 'having' clause. The strange thing is that there is no error, and the subquery works as a stand-alone query. However, the whole query gives exactly the same results with the 'having' clause as without.
This is kind of what my query looks like:
SELECT X
FROM A
WHERE exists (
    SELECT X, count(distinct Y)
    FROM B
    GROUP BY X
    HAVING count(distinct Y) > 2)
So I'm trying to select the rows from A where X has more than two distinct values of Y in B.
However, the results also include records that do not exist in the subquery. What am I doing wrong here?
You don't correlate the two queries:
SELECT X
FROM A
WHERE (
    SELECT COUNT(DISTINCT y)
    FROM b
    WHERE b.x = a.x
) > 2
Your query says something like this:
select X from A if there is any group in B (grouped by X) with more than two distinct values of Y.
If your 'exists subquery' returns even one record from table B, the condition is true for every row and you will get all the rows from A.
Try:
select X
from A
where exists (select 1
              from B
              where B.x = A.x
              group by B.x
              having count(distinct B.y) > 2
             )
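To make the difference concrete, here is a minimal sketch that runs the uncorrelated and the correlated EXISTS side by side. It uses SQLite through Python's sqlite3 module, and the sample rows are hypothetical, chosen so that only 'p' has more than two distinct values of y.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (x TEXT);
    CREATE TABLE B (x TEXT, y INTEGER);
    -- Hypothetical sample rows: 'p' has three distinct y values, 'q' has one
    INSERT INTO A VALUES ('p'), ('q');
    INSERT INTO B VALUES ('p', 1), ('p', 2), ('p', 3), ('q', 1);
""")

# Uncorrelated: the subquery returns a row for 'p', so EXISTS is true
# for every row of A, including 'q'.
uncorrelated = conn.execute("""
    SELECT x FROM A
    WHERE EXISTS (SELECT x, COUNT(DISTINCT y)
                  FROM B
                  GROUP BY x
                  HAVING COUNT(DISTINCT y) > 2)
    ORDER BY x
""").fetchall()

# Correlated: the subquery is evaluated per row of A.
correlated = conn.execute("""
    SELECT A.x FROM A
    WHERE EXISTS (SELECT 1
                  FROM B
                  WHERE B.x = A.x
                  GROUP BY B.x
                  HAVING COUNT(DISTINCT B.y) > 2)
    ORDER BY A.x
""").fetchall()

print(uncorrelated)  # [('p',), ('q',)]
print(correlated)    # [('p',)]
```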
I had a similar situation and solved by a JOIN since the other answers didn't work for me. I tried to correlate to your generic example. Hope it is helpful to someone else!
SELECT A.X
FROM A
JOIN (SELECT X, COUNT(DISTINCT Y)
      FROM B
      GROUP BY X
      HAVING COUNT(DISTINCT Y) > 2) C
  ON A.X = C.X

Can we write subquery in between SELECT and FROM

I want to know how to write a subquery between SELECT and FROM, like this:
SELECT Col_Name,(Subquery)
From Table_Name
Where Some_condition
This:
SELECT y.col_name,
       (SELECT x.column
        FROM TABLE x) AS your_subquery
FROM TABLE y
WHERE y.col = ?
...is a typical subquery in the SELECT clause. Some call it a "subselect". This:
SELECT y.col_name,
       (SELECT x.column
        FROM TABLE x
        WHERE x.id = y.id) AS your_subquery
FROM TABLE y
WHERE y.col = ?
...is a correlated subquery. It's correlated because the subquery result references a table in the outer query (y in this case).
Effectively, just write whatever additional SELECT statement you want in the SELECT clause, but it has to be surrounded by brackets.
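A runnable illustration of a correlated subquery in the SELECT clause might look like the sketch below. It uses SQLite through Python's sqlite3 module. The customers and orders tables and their columns are hypothetical names made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, invented for this example
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 10), (1, 20);
""")

# The parenthesised subquery is evaluated once per outer row and
# refers to the outer alias c, which makes it correlated.
rows = conn.execute("""
    SELECT c.name,
           (SELECT COUNT(*)
            FROM orders o
            WHERE o.customer_id = c.id) AS order_count
    FROM customers c
    ORDER BY c.id
""").fetchall()

print(rows)  # [('Ann', 2), ('Bob', 0)]
```

A subquery used this way has to produce a single column and at most one row per outer row. Using an aggregate such as COUNT(*) is one simple way to guarantee that.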
You can do it, but you must use an alias for the subquery:
SELECT Col_Name,(Subquery) as S
From Table_Name
Where Some_condition