Questions similar to this one about using DISTINCT values in an INNER JOIN have been asked a few times, but I don't see my (simple) use case.
Problem Description:
I have two tables Table A and Table B. They can be joined via a variable ID. Each ID may appear on multiple rows in both Table A and Table B.
I would like to INNER JOIN Table A and Table B on the distinct values of ID that appear in Table B, selecting every row of Table A whose ID matches an ID in Table B that satisfies some condition.
What I want:
I want to make sure I get only one copy of each row of Table A with a Table A.ID matching a Table B.ID which satisfies [some condition].
What I would like to do:
SELECT A.* FROM TableA A
INNER JOIN (
SELECT DISTINCT ID FROM TableB WHERE [some condition]
) B ON A.ID = B.ID
Additionally:
As a further (really dumb) constraint, I can't say anything about the SQL standard in use: I'm executing the query through Stata's odbc load command on a database I know nothing about beyond the variable names and the fact that "it does accept SQL queries" (that is the full extent of the information I have).
If you want all rows in a that match an id in b satisfying your condition, then use exists:
select a.*
from a
where exists (select 1 from b where b.id = a.id and [some condition]);
Trying to use a join just complicates matters, because a join both filters rows and generates duplicates whenever an id matches more than one row of b.
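To see the difference concretely, here is a small sketch using Python's built-in sqlite3 module. The tables a and b, the columns val and flag, and the condition flag = 1 are all made up for illustration; they stand in for the question's tables and its [some condition]. SQLite is only a convenient test bed here, and nothing is known about the real database behind the ODBC connection.

```python
import sqlite3

# Hypothetical sample data: id 1 appears twice in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, val TEXT);
    CREATE TABLE b (id INTEGER, flag INTEGER);
    INSERT INTO a VALUES (1, 'a1'), (1, 'a2'), (2, 'a3'), (3, 'a4');
    INSERT INTO b VALUES (1, 1), (1, 1), (2, 0), (3, 1);
""")

# Plain join: every row of a is repeated once per matching row of b.
join_rows = conn.execute(
    "SELECT a.id, a.val FROM a INNER JOIN b ON a.id = b.id "
    "WHERE b.flag = 1 ORDER BY a.val"
).fetchall()

# EXISTS: every qualifying row of a comes back exactly once.
exists_rows = conn.execute(
    "SELECT a.id, a.val FROM a WHERE EXISTS "
    "(SELECT 1 FROM b WHERE b.id = a.id AND b.flag = 1) ORDER BY a.val"
).fetchall()

# Join against a DISTINCT derived table, as sketched in the question.
distinct_rows = conn.execute(
    "SELECT a.id, a.val FROM a INNER JOIN "
    "(SELECT DISTINCT id FROM b WHERE flag = 1) AS d ON a.id = d.id "
    "ORDER BY a.val"
).fetchall()

print(len(join_rows), exists_rows, distinct_rows)
```

On this data the plain join returns five rows (a1 and a2 twice each), while the EXISTS form and the DISTINCT derived table both return the three qualifying rows of a once each.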
I have one master table A and two sub-tables (B and C), each of which references A through a foreign key. I want to check whether a row with foreign key fk-1 exists in table B or C.
I tried selecting rows from A with EXISTS clauses on B and C, each filtered on fk-1 and OR'ed together, and got the result.
SELECT A.id FROM A where A.id = fk-1 AND
(
EXISTS (select B.id from B where B.fk_1 = fk-1)
OR EXISTS (select C.id from C where C.fk_1 = fk-1)
);
Can this be optimised, or is there a better way to do this?
Thanks in advance.
For a single check, that is about as fast as it gets, provided A.id, B.fk_1 and C.fk_1 are indexed.
A common pitfall is running this query once for every row you want to check. Checking all the rows in a single query is usually much faster per row.
So in case you want to check a bunch of them at the same time, you could do:
SELECT A.id FROM A WHERE A.id IN (
SELECT B.fk_1 FROM B [WHERE xxx]
UNION SELECT C.fk_1 FROM C [WHERE xxx])
Replace [WHERE xxx] with whatever WHERE clause filters the rows you care about.
One recommended filter is "WHERE B.fk_1 IS NOT NULL" (and likewise for C), to leave out records without an FK.
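As a rough illustration of the batch check, the sketch below runs the IN (... UNION ...) query against a throwaway SQLite database through Python's sqlite3 module. The sample rows are invented, and SQLite is an assumption made only so the example runs; the original question does not name a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE B (id INTEGER PRIMARY KEY, fk_1 INTEGER);
    CREATE TABLE C (id INTEGER PRIMARY KEY, fk_1 INTEGER);
    INSERT INTO A (id) VALUES (1), (2), (3), (4);
    INSERT INTO B (fk_1) VALUES (1), (NULL);   -- one row without an FK
    INSERT INTO C (fk_1) VALUES (3), (1);
""")

# One query checks every row of A at once; UNION removes the duplicate
# fk_1 value 1 that appears in both B and C.
matched = [row[0] for row in conn.execute("""
    SELECT A.id FROM A WHERE A.id IN (
        SELECT B.fk_1 FROM B WHERE B.fk_1 IS NOT NULL
        UNION
        SELECT C.fk_1 FROM C WHERE C.fk_1 IS NOT NULL)
    ORDER BY A.id
""")]

print(matched)
```

With this data, ids 1 and 3 are reported as referenced from B or C, and ids 2 and 4 are not.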
I have two tables where the fields are different except for a shared key. I need to only keep the records with keys that are in A and NOT in B. I don't want records that are only in B or records that are in both A and B (so to exclude anything in the inner join).
I see SAS SQL references to "EXCEPT" but it seems that can only be used if all fields are shared across the two tables since a key is not used. Is there another way?
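Do you have to use SQL? A data step merge does this directly, provided both datasets are sorted by id (the shared key) beforehand: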
data want ;
merge A (in=in1) B(keep=id in=in2);
by id;
if in1 and not in2 ;
run;
Just use NOT EXISTS:
proc sql;
select a.*
from a
where not exists (select 1 from b where a.key = b.key);
quit;
You could use the exists operator:
SELECT *
FROM a
WHERE NOT EXISTS (SELECT *
FROM b
WHERE a.id = b.id)
One more approach, using EXCEPT, is to get all the ids (or whatever the key column is) in A that are not in B, then use those ids to fetch the full records from A.
select a.*
from a
inner join (select id from a except select id from B) t
on a.id = t.id
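The NOT EXISTS and EXCEPT approaches can be compared side by side. The sketch below uses Python's sqlite3 module with invented tables whose non-key columns differ (x in a, y in b), mirroring the question; it is not SAS, so treat it only as a check of the SQL logic, which should carry over to PROC SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, x TEXT);
    CREATE TABLE b (id INTEGER, y TEXT);
    INSERT INTO a VALUES (1, 'x1'), (2, 'x2'), (2, 'x2b'), (3, 'x3');
    INSERT INTO b VALUES (2, 'y2'), (4, 'y4');
""")

# Anti-join with NOT EXISTS: keep rows of a whose id never occurs in b.
not_exists_rows = conn.execute("""
    SELECT a.* FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.id)
    ORDER BY a.id
""").fetchall()

# Same result via EXCEPT on the key column only, joined back to a.
except_rows = conn.execute("""
    SELECT a.* FROM a
    INNER JOIN (SELECT id FROM a EXCEPT SELECT id FROM b) t
        ON a.id = t.id
    ORDER BY a.id
""").fetchall()

print(not_exists_rows, except_rows)
```

Both queries keep ids 1 and 3 and drop both rows with id 2 as well as the b-only id 4.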
Morning all, I need some help please.
I have a database with several tables. I'm trying to write an INSERT INTO ... SELECT query that copies all values from one table (TableA) into another (TableD), but substitutes one value with a value looked up in a third table (TableB).
TableA has various fields, including TableBRef.
TableB has various fields, starting with TableBRef, and also includes a field TableCRef.
I want to copy all of TableA into TableD but replace the TableBRef with the TableCRef. That is, I know the TableBRef; I need to look it up in TableB and return the associated value from the TableCRef field.
INSERT INTO TableD
(DRef, CRef, DData1, DData2)
SELECT TableA.ARef, TableB.CRef, TableA.AData1, TableA.AData2
FROM TableD AS TableD_1 CROSS JOIN
TableC CROSS JOIN
TableA INNER JOIN
TableB ON TableA.BRef = TableB.BRef
Sorry, I thought that calling them generic table names may help but it's actually a little confusing :-)
I don't fully understand how the tables relate, but judging from the join in your query (TableA.BRef = TableB.BRef), TableB already holds the CRef you need, so see if this works for you:
INSERT INTO TableD
(DRef, CRef, DData1, DData2)
SELECT a.ARef, b.CRef, a.AData1, a.AData2
FROM TableA a
INNER JOIN TableB b ON a.BRef = b.BRef
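Assuming, as the question's own query suggests, that TableA and TableB are related through BRef, the lookup-and-copy can be tried out on a scratch database. This sketch uses Python's sqlite3 module; the column types and sample values are invented, and the real DBMS is not stated in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ARef INTEGER, BRef INTEGER, AData1 TEXT, AData2 TEXT);
    CREATE TABLE TableB (BRef INTEGER, CRef INTEGER);
    CREATE TABLE TableD (DRef INTEGER, CRef INTEGER, DData1 TEXT, DData2 TEXT);
    INSERT INTO TableA VALUES (1, 10, 'x', 'y'), (2, 20, 'p', 'q'),
                              (3, 30, 'm', 'n');   -- BRef 30 has no match
    INSERT INTO TableB VALUES (10, 100), (20, 200);
""")

# Copy TableA into TableD, swapping each BRef for the CRef found in TableB.
conn.execute("""
    INSERT INTO TableD (DRef, CRef, DData1, DData2)
    SELECT a.ARef, b.CRef, a.AData1, a.AData2
    FROM TableA a
    INNER JOIN TableB b ON a.BRef = b.BRef
""")

copied = conn.execute("SELECT * FROM TableD ORDER BY DRef").fetchall()
print(copied)
```

Note that the inner join silently skips TableA rows whose BRef has no match in TableB (ARef 3 here). A LEFT JOIN would copy them with a NULL CRef instead, if that is what you need.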
TableA              TableB
Column1  Column2    Column3  Column4
1        2          1        3
I have two tables, TableA(Column1, Column2) and TableB(Column3, Column4). I want to join the two tables using column1, column4 (like a NATURAL JOIN). Is there anything in SQL that joins two tables and returns a new table with the repeated columns removed?
I want to select this:
column1  column2  column4
1        2        3
DBMSes that support NATURAL JOIN require the column names of the join keys to match, and if you do SELECT * you will get only the unique column names. It doesn't make sense to try to specify column names, because the whole thing works by the names already being the same.
You MUST have same-named columns between the two tables, as it will use every same-named column between them to perform the join. Your tables TableA and TableB are unsuitable for a natural join as they don't share any column names.
So you are relegated to doing a regular join:
SELECT
A.*, -- you can at least get all the columns from one table
B.Column4 -- but you have to specify the rest one at a time
FROM
TableA A
INNER JOIN TableB B
ON A.Column1 = B.Column3
;
You just have to bite the bullet and write the query. You might prefer not to write out the column names, but that's just not possible.
Some notes: When you say "return a new table", I think I know what you mean, but technically it is a rowset since to be a table it would have to be stored in the database with a name.
It may be possible to alias the column in a view or inline derived table, but you haven't told us what specific DBMS you're using so we can answer for its exact capabilities. It might look something like this:
SELECT
*
FROM
TableA A
NATURAL JOIN (
SELECT Column3 AS Column1, Column4
FROM TableB B
) B
;
But notice that you still have to list all the other columns in TableB in order to do this. And I'm not even sure it works.
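For what it is worth, the idea can be checked on a DBMS that does support NATURAL JOIN. The sketch below uses SQLite through Python's sqlite3 module (an assumption; the question never says which DBMS is in use) and spells the alias in the standard way, Column3 AS Column1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Column1 INTEGER, Column2 INTEGER);
    CREATE TABLE TableB (Column3 INTEGER, Column4 INTEGER);
    INSERT INTO TableA VALUES (1, 2);
    INSERT INTO TableB VALUES (1, 3);
""")

# Rename Column3 to Column1 in a derived table so the natural join
# has a shared column name to match on.
cur = conn.execute("""
    SELECT *
    FROM TableA A
    NATURAL JOIN (
        SELECT Column3 AS Column1, Column4
        FROM TableB
    ) B
""")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns, rows)
```

In SQLite the shared column comes back only once, so SELECT * yields Column1, Column2 and Column4 with the single row 1, 2, 3. Other systems may behave differently or may not accept NATURAL JOIN at all.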
Joining two tables and selecting some or all of their columns returns a record set, not a new table. To get what you want, try the query below. It sticks to standard SQL and so should work on any SQL-compliant database.
SELECT ta.column1, ta.column2, tb.column4 FROM TableA ta INNER JOIN TableB tb ON (ta.column1 = tb.column3)
If you want to use a natural join, the tables need to share column names.
DISTINCT also removes duplicate rows from the result:
SELECT Distinct
TableA.Column1,
TableA.Column2,
TableB.Column4
FROM
TableA INNER JOIN TableB ON TableA.Column1 = TableB.Column3
I want to delete records from a table using inner joins on more than two tables. Say I have tables A, B, C and D, with A's primary key shared by all the other tables. How do I write a query that deletes records from table D using inner joins on tables B and A, since the conditions come from those two tables? I need this query for DB2. I am not using the IN clause or EXISTS because of their limitations.
From your description, I take the schema as:
A(pk_A, col1, col2, ...)
B(pk_B, fk_A, col1, col2, ..., foreign key fk_A references A(pk_A))
C(pk_c, fk_A, col1, col2, ..., foreign key fk_A references A(pk_A))
D(pk_d, fk_A, col1, col2, ..., foreign key fk_A references A(pk_A))
As you say, DB2 will allow only 1000 rows to be deleted if an IN clause is used. I don't know about DB2, but Oracle allows only 1000 literal values inside an IN list. There is no such limit on subquery results, in Oracle at least. EXISTS should not be a problem, since any database, including Oracle and DB2, only checks for the existence of rows, be it one or a million.
There are three scenarios for deleting data from table D:
You want to delete data from table D in which fk_A (naturally) refers to a record in table A using column A.pk_A:
DELETE FROM d
WHERE EXISTS (
SELECT 1
FROM a
WHERE a.pk_A = d.fk_A
);
You want to delete data from table D in which fk_A refers to a record in table A, and that record in table A is also referred to by column B.fk_A. We do not want to delete the data from D that is in A but not in B. We can write:
DELETE FROM d
WHERE EXISTS (
SELECT 1
FROM a
INNER JOIN b ON a.pk_A = b.fk_A
WHERE a.pk_A = d.fk_A
);
The third scenario is deleting data in table D that refers to a record in table A, where that record in A is also referred to by both B.fk_A and C.fk_A. We want to delete only the data in table D that is common to all four tables: A, B, C and D. We can write:
DELETE FROM d
WHERE EXISTS (
SELECT 1
FROM a
INNER JOIN b ON a.pk_A = b.fk_A
INNER JOIN c ON a.pk_A = c.fk_A
WHERE a.pk_A = d.fk_A
);
Depending upon your requirement you can incorporate one of these queries.
Note that the "=" operator would return an error if the subquery retrieved more than one row. Also, I don't know whether DB2 supports the ANY or ALL keywords, so I used the simple but powerful EXISTS keyword, which often performs as well as or better than IN, ANY and ALL.
Also, notice that the subqueries inside the EXISTS clauses use "SELECT 1", not "SELECT a.pk_A" or some other column. This is because EXISTS, in any database, only checks whether rows exist, not what values the columns hold.
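The second scenario can be exercised on a scratch database to confirm which rows go. This sketch uses Python's sqlite3 module rather than DB2 (an assumption made only so the example runs), with a cut-down version of the schema above and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (pk_A INTEGER PRIMARY KEY);
    CREATE TABLE b (pk_B INTEGER PRIMARY KEY, fk_A INTEGER REFERENCES a(pk_A));
    CREATE TABLE d (pk_d INTEGER PRIMARY KEY, fk_A INTEGER REFERENCES a(pk_A));
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b (fk_A) VALUES (1), (1), (3);   -- nothing in b points at 2
    INSERT INTO d VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

# Delete rows of d whose parent in a is also referenced from b.
# Duplicate matches in b do no harm: EXISTS only asks whether any row exists.
conn.execute("""
    DELETE FROM d
    WHERE EXISTS (
        SELECT 1
        FROM a
        INNER JOIN b ON a.pk_A = b.fk_A
        WHERE a.pk_A = d.fk_A
    )
""")

remaining = conn.execute("SELECT pk_d, fk_A FROM d ORDER BY pk_d").fetchall()
print(remaining)
```

Only the row of d that points at pk_A = 2 survives, because 2 is the one key in a that b never references.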
Based on 'Using SQL to delete rows from a table using INNER JOIN to another table'
The key is that you specify the name of the table to be deleted from
as the SELECT. So, the JOIN and WHERE do the selection and limiting,
while the DELETE does the deleting. You're not limited to just one
table, though. If you have a many-to-many relationship (for instance,
Magazines and Subscribers, joined by a Subscription) and you're
removing a Subscriber, you need to remove any potential records from
the join model as well.
DELETE subscribers
FROM subscribers INNER JOIN subscriptions
ON subscribers.id = subscriptions.subscriber_id
INNER JOIN magazines
ON subscriptions.magazine_id = magazines.id
WHERE subscribers.name='Wes';
delete from D
where fk in (select b.fk from A a inner join B b on a.pk = b.fk)
This should work. Note the IN rather than "=": the subquery can return more than one row, and "=" would fail in that case.