Reference subquery several times - sql

A simplified version of the statement I am trying to run (a, b, and m are tables with only one column containing integers):
SELECT * FROM m WHERE
c1 IN (SELECT * FROM a) OR
c1 IN (SELECT * FROM b)
AND (NOT c1 IN (SELECT * FROM a)
OR c1 IN (SELECT * FROM b));
I know this could be done more easily, but I have to keep this structure (i.e. cannot JOIN for instance). How could I avoid repeating the subqueries after the AND NOT, since they will return the same set as the previous ones?
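One possible way to name each subquery once is a common table expression (WITH clause). The sketch below is only an illustration run through Python's sqlite3 module. The column name v for tables a and b, the CTE names in_a and in_b, and the sample data are assumptions, not part of the question. Whether the engine actually evaluates each CTE only once depends on the database.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE m (c1 INTEGER);
    CREATE TABLE a (v INTEGER);
    CREATE TABLE b (v INTEGER);
    INSERT INTO m VALUES (1), (2), (3), (4);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# Each subquery is written once in the WITH clause and referenced by name,
# while the WHERE clause keeps the structure of the original statement.
query = """
    WITH in_a AS (SELECT v FROM a),
         in_b AS (SELECT v FROM b)
    SELECT c1 FROM m WHERE
        c1 IN (SELECT v FROM in_a) OR
        c1 IN (SELECT v FROM in_b)
        AND (NOT c1 IN (SELECT v FROM in_a)
             OR c1 IN (SELECT v FROM in_b))
    ORDER BY c1
"""
rows = [r[0] for r in conn.execute(query)]
print(rows)
```

This pays off mainly when the real subqueries are long; for one-line subqueries like these it only saves typing.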

Related

Three-Way Diff in SQL

I have three SQL tables (A, B, and C), representing three different versions of a dataset. I want to devise an SQL query that extracts the ids of rows/tuples whose values are different in all three tables. (Two records are different if there exists a field where the records do not share the same value.) For simplicity, let's just assume that A, B, and C each have N records with record ids ranging from 1 to N, so for every id from 1 to N, there is a record in each table with that id.
What might be the most efficient way to do this in SQL? One way would be to do something like
(SELECT id FROM
((SELECT * FROM A EXCEPT SELECT * FROM B) EXCEPT (SELECT * FROM C)) result)
EXCEPT
(SELECT id FROM
((SELECT * FROM B) INTERSECT (SELECT * FROM C)) result2)
Basically, what I've done above is first find the ids of records where the version in A differs from the version in B and from the version in C (the first two lines of the query). What's left is to filter out the ids of records where the version in B matches the version in C (the last two lines). But this seems horribly inefficient; is there a better, more concise way?
Note: I'm using PostgreSQL syntax here.
I would do it like this:
select id,
a.id is null as "missing in a",
b.id is null as "missing in b",
c.id is null as "missing in c",
a is distinct from b as "a and b different",
a is distinct from c as "a and c different",
b is distinct from c as "b and c different"
from a
full join b using (id)
full join c using (id)
where a is distinct from b
or b is distinct from c
or a is distinct from c
The id column is assumed to be a primary (or unique) key.
You can use GROUP BY and HAVING as follows:
select id from
(select * from A
union select * from B
union select * from C) t
group by id
-- use a separator that you know will not appear in these columns for the concat
having count(distinct column1 || '#~#' || column2 || '#~#' || column3) = 3
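As a rough check of the GROUP BY / HAVING idea, here is a small sketch run through Python's sqlite3 module rather than PostgreSQL. The two data columns (column1, column2) and the sample rows are assumptions made up for the illustration. Note that concatenating with || yields NULL if any column is NULL, so this sketch assumes no NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER, column1 TEXT, column2 TEXT);
    CREATE TABLE B (id INTEGER, column1 TEXT, column2 TEXT);
    CREATE TABLE C (id INTEGER, column1 TEXT, column2 TEXT);
    -- id 1: identical everywhere; id 2: only A differs; id 3: all differ
    INSERT INTO A VALUES (1, 'x', 'p'), (2, 'x', 'p'), (3, 'x', 'p');
    INSERT INTO B VALUES (1, 'x', 'p'), (2, 'y', 'p'), (3, 'y', 'p');
    INSERT INTO C VALUES (1, 'x', 'p'), (2, 'y', 'p'), (3, 'z', 'p');
""")

# An id is kept only when its three versions are pairwise different,
# i.e. the concatenated row value takes three distinct values.
query = """
    SELECT id FROM
        (SELECT * FROM A
         UNION SELECT * FROM B
         UNION SELECT * FROM C) t
    GROUP BY id
    HAVING COUNT(DISTINCT column1 || '#~#' || column2) = 3
    ORDER BY id
"""
ids = [r[0] for r in conn.execute(query)]
print(ids)
```

Only id 3 should come back here, since id 2 has just two distinct versions.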

Error in query, two select statements with left join?

I have two select statements:
A: select a.1,a.2,a.3 from table1 a
B: select b.1,b.2,b.3 from table1 b
Now, how do I join these two statements?
I tried it the way shown below and got an error:
select *
(select a.1,a.2,a.3 from table1 a) aa
left join
(select b.1,b.2,b.3 from table1 b) bb
aa.a.1 = bb.b.1;
Within your Left Join, you need to include the ON/WHERE clause:
select *
(select a.1,a.2,a.3 from table1 a) aa
left join
(select b.1,b.2,b.3 from table1 b) bb
aa.a.1 = bb.b.1,
should be in the format:
SELECT *
FROM (SELECT a.1, a.2, a.3 FROM table1 a) aa
LEFT JOIN
(SELECT b.1, b.2, b.3 FROM table2 b) bb
ON aa.1 = bb.1
WHERE ...
For more clarification, please see this image:
As it currently stands, it's quite hard to tell exactly what you want the query to return, but I hope this image visually conveys the syntax for each of the joins.
Numbers (a.1, a.2, i.e. columns 1 and 2 for table alias a) are usually not valid column names. Are the columns really named thus? Then you'd need to quote them to indicate that they are column names. Depending on the DBMS, that could be a.`1`, a."1", or a.[1]. Or use different names, such as num1, num2, num3, or one, two, three, etc.
EDIT: You are also missing the word ON before your criteria. And aa.a.1 is invalid: your table alias is now aa, the column name is still "1", and the table alias a is no longer known. So it must be aa."1" instead. Moreover, you are missing the keyword FROM before your first derived table.
select *
from (select a."1", a."2", a."3" from table1 a) aa
left join (select b."1", b."2", b."3" from table1 b) bb ON aa."1" = bb."1";
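To see the corrected statement run, here is a small sketch using Python's sqlite3 module. SQLite is an assumption here (the question does not name a DBMS), as are the sample rows. The columns are deliberately created with the quoted names "1", "2", and "3".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 ("1" INTEGER, "2" INTEGER, "3" INTEGER);
    INSERT INTO table1 VALUES (1, 10, 100), (2, 20, 200);
""")

# FROM precedes the first derived table, ON introduces the join condition,
# and the outer query refers to the derived-table aliases aa and bb.
query = """
    SELECT *
    FROM (SELECT a."1", a."2", a."3" FROM table1 a) aa
    LEFT JOIN (SELECT b."1", b."2", b."3" FROM table1 b) bb
        ON aa."1" = bb."1"
    ORDER BY aa."1"
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because both derived tables read the same table, every row simply matches itself, which is enough to confirm that the syntax is accepted.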

How to retain a row which is foreign key in another table and remove other duplicate rows?

I have two tables:
A:
id code
1 A1
2 A1
3 B1
4 B1
5 C1
6 C1
=====================
B:
id Aid
1 1
2 4
(B doesn't contain any Aid that links to code C1.)
Let me explain the overall flow:
I want each row in table A to have a distinct code (by deleting duplicates), and I want to keep the row whose id appears as Aid in table B. If none of a code's ids appear in table B, I keep the row with the larger id.
So I cannot just do something like the following:
DELETE FROM A
WHERE id NOT IN (SELECT MAX(id)
FROM A
GROUP BY code
)
I can find the duplicated codes (duplicate_code_groups) with the SQL statement below:
SELECT code
FROM A
GROUP BY code
HAVING COUNT(*) > 1
Is there some code in sql like
for (var ids in duplicate_code_groups){
for (var id in ids) {
if (id in B){
return id
}
}
return max(ids)
}
and then put the returned ids into an idtable? I just don't know how to write such code in SQL.
Then I could do
DELETE FROM A
WHERE id NOT IN idtable
Using ROW_NUMBER() inside a CTE (or subquery), you can number the rows within each Code according to your ordering, then join the result set with table A and delete every row numbered greater than 1.
WITH CTE AS
(
SELECT A.*, ROW_NUMBER() OVER (PARTITION BY A.Code ORDER BY COALESCE(B.ID,0) DESC, A.ID desc) RN
FROM A
LEFT JOIN B ON A.ID = B.Aid
)
DELETE A FROM A
INNER JOIN CTE C ON A.ID = C.ID
WHERE RN > 1;
SELECT * FROM A;
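The same ROW_NUMBER() idea can be tried on the question's sample data with Python's sqlite3 module. This is a sketch under assumptions: SQLite has no DELETE ... FROM ... JOIN syntax, so the join is swapped for an IN subquery, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, Aid INTEGER);
    INSERT INTO A VALUES (1,'A1'), (2,'A1'), (3,'B1'), (4,'B1'), (5,'C1'), (6,'C1');
    INSERT INTO B VALUES (1, 1), (2, 4);
""")

# Within each code, rows referenced from B sort first, then larger ids.
# Row number 1 is the row to keep; everything else is deleted.
conn.execute("""
    WITH cte AS (
        SELECT A.id AS id,
               ROW_NUMBER() OVER (
                   PARTITION BY A.code
                   ORDER BY COALESCE(B.id, 0) DESC, A.id DESC
               ) AS rn
        FROM A
        LEFT JOIN B ON A.id = B.Aid
    )
    DELETE FROM A WHERE id IN (SELECT id FROM cte WHERE rn > 1)
""")
remaining = conn.execute("SELECT id, code FROM A ORDER BY id").fetchall()
print(remaining)
```

With the sample data, ids 1 and 4 survive because B references them, and id 6 survives as the larger id for code C1.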
The first select gives you all A.id that are in B - you don't want to delete them. The second select takes A, selects all codes without an id that appears in B, and from this subset takes the maximum id. These two sets of ids are the ones you want to keep, so the delete deletes the ones not in the sets.
DELETE from A where A.id not in
(
select aid from B
union
select MAX(A.id) from A left outer join B on B.Aid=A.id group by code having COUNT(B.id)=0
)
The actual execution plan on MS SQL Server 2008 R2 shows that this solution performs quite well: it's 5-6 times faster than Nenad's solution :).
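The UNION-based delete can also be checked against the question's sample data. The sketch below runs it through Python's sqlite3 module, which is an assumption (the answer targets SQL Server); it says nothing about the timing claim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, Aid INTEGER);
    INSERT INTO A VALUES (1,'A1'), (2,'A1'), (3,'B1'), (4,'B1'), (5,'C1'), (6,'C1');
    INSERT INTO B VALUES (1, 1), (2, 4);
""")

# Keep-set = ids referenced from B, plus the max id of every code
# that has no row referenced from B at all.
conn.execute("""
    DELETE FROM A WHERE id NOT IN (
        SELECT Aid FROM B
        UNION
        SELECT MAX(A.id) FROM A
        LEFT OUTER JOIN B ON B.Aid = A.id
        GROUP BY code
        HAVING COUNT(B.id) = 0
    )
""")
kept = conn.execute("SELECT id, code FROM A ORDER BY id").fetchall()
print(kept)
```

One caveat worth keeping in mind: if B ever held two Aid values for the same code, both rows would be kept, so the codes would not end up unique.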
Try this Solution
DELETE FROM A
WHERE NOT id IN
(
SELECT MAX(B.AId)
FROM A INNER JOIN B ON A.id = B.aId
)

Select from union of same columns from three tables

I have three tables, A, B, and C. They all hold different data, but have some columns in common.
If A, B, and C all have columns C1 and C2 then how can I look up a specific C2 value using a C1 value that could be in any of the 3 tables?
Basically, I want to do a simple look-up but have it act on the union of the 3 tables - and I'd prefer to not use a view to achieve this.
Note that this is an Ingres Vectorwise database.
You do this by doing a union of the tables in the from clause:
select c2
from ((select c1, c2 from a) union all
(select c1, c2 from b) union all
(select c1, c2 from c)
) t
where c1 = <your value>
I've used union all for performance reasons. If you are concerned about duplicate values, either use union or add a distinct in the select.
This is standard SQL and should work in any database.
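A quick way to try the derived-table lookup is the sketch below, run through Python's sqlite3 module rather than Vectorwise. The sample rows are made up. One deviation is deliberate: SQLite does not accept parentheses around the individual SELECTs of a compound query, so they are dropped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (c1 INTEGER, c2 TEXT);
    CREATE TABLE b (c1 INTEGER, c2 TEXT);
    CREATE TABLE c (c1 INTEGER, c2 TEXT);
    INSERT INTO a VALUES (1, 'apple');
    INSERT INTO b VALUES (2, 'banana');
    INSERT INTO c VALUES (3, 'cherry');
""")

# The three tables are stacked in the FROM clause and filtered once.
query = """
    SELECT c2
    FROM (SELECT c1, c2 FROM a UNION ALL
          SELECT c1, c2 FROM b UNION ALL
          SELECT c1, c2 FROM c) t
    WHERE c1 = ?
"""
found = [r[0] for r in conn.execute(query, (2,))]
print(found)
```

The lookup value is passed as a bound parameter, so the same query text serves any c1 value.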
I don't know what you mean by "a specific C2 value using a C1 value",
but whatever your query against the view would be, repeat that query for each table and union the results:
SELECT *
FROM A
WHERE C2 = ?
UNION ALL
SELECT *
FROM B
WHERE C2 = ?
UNION ALL
SELECT *
FROM C
WHERE C2 = ?
(The view is a standard SQL feature, and will make any query you write easier.)

Is there alternative way to write this query?

I have tables A, B, C, where A represents items which can have zero or more sub-items stored in C. B table only has 2 foreign keys to connect A and C.
I have this sql query:
select * from A
where not exists (select * from B natural join C where B.id = A.id and C.value > 10);
Which says: "Give me every item from table A where all sub-items have a value of at most 10."
Is there a way to optimize this? And is there a way to write this not using exists operator?
There are three commonly used ways to test if a value is in one table but not another:
NOT EXISTS
NOT IN
LEFT JOIN ... WHERE ... IS NULL
You have already shown code for the first. Here is the second:
SELECT *
FROM A
WHERE id NOT IN (
SELECT b.id
FROM B
NATURAL JOIN C
WHERE C.value > 10
)
And with a left join:
SELECT *
FROM A
LEFT JOIN (
SELECT b.id
FROM B
NATURAL JOIN C
WHERE C.value > 10
) BC
ON A.id = BC.id
WHERE BC.id IS NULL
Depending on the database type and version, the three different methods can result in different query plans with different performance characteristics.
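To compare the three forms side by side, the sketch below runs them on a tiny dataset through Python's sqlite3 module. The schema is an assumption: B(id, cid) links items to sub-items and C(cid, value) holds the sub-items, so NATURAL JOIN matches on cid only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE C (cid INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE B (id INTEGER, cid INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO C VALUES (10, 5), (11, 20), (12, 7);
    -- item 1 has one small sub-item, item 2 has a large one, item 3 has none
    INSERT INTO B VALUES (1, 10), (2, 11), (2, 12);
""")

queries = {
    "not_exists": """
        SELECT id FROM A
        WHERE NOT EXISTS (SELECT * FROM B NATURAL JOIN C
                          WHERE B.id = A.id AND C.value > 10)
        ORDER BY id""",
    "not_in": """
        SELECT id FROM A
        WHERE id NOT IN (SELECT B.id FROM B NATURAL JOIN C
                         WHERE C.value > 10)
        ORDER BY id""",
    "left_join": """
        SELECT A.id FROM A
        LEFT JOIN (SELECT B.id FROM B NATURAL JOIN C
                   WHERE C.value > 10) BC
            ON A.id = BC.id
        WHERE BC.id IS NULL
        ORDER BY A.id""",
}
results = {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}
print(results)
```

All three agree on this data. They can diverge when the subquery returns a NULL id: NOT IN then matches no rows at all, while NOT EXISTS and the left join are unaffected.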