Inserting into one table from multiple tables except uniques - sql

Let's say I have 3 tables named A, B and C:
CREATE TABLE IF NOT EXISTS A (ID integer primary key);
CREATE TABLE IF NOT EXISTS B (ID integer primary key);
CREATE TABLE IF NOT EXISTS C (IDA integer, IDB integer);
I want to make sure that there is one C (and only one C) for every A,B pair. How do I do it?
I tried:
INSERT INTO C(IDA,IDB) SELECT A.ID, B.ID FROM A,B;
And it does create a C from each A,B pair. But if it's run again, every pair is created again. How do I modify the query so that it only creates a new C when there is not already a C with that A and B?
Let's say I have:
A:1,2
B:1,2
C:(1,1),(1,2),(2,1),(2,2)
and then B=3 is added. I want a query that will add C:(1,3),(2,3) and not any pair already in C.

Please try this query. Hopefully it works for you.
INSERT INTO c (ida, idb)
SELECT DISTINCT a.id, b.id
FROM a, b
WHERE NOT EXISTS (SELECT 1
FROM c
WHERE c.ida = a.id
AND c.idb = b.id)
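To check that the NOT EXISTS version really is safe to rerun, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The helper name fill_c and the sample rows are assumptions made for illustration; the SQL is the query above, without the DISTINCT, which is redundant when both IDs are primary keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID integer primary key);
    CREATE TABLE B (ID integer primary key);
    CREATE TABLE C (IDA integer, IDB integer);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1), (2);
""")

FILL_C = """
    INSERT INTO C (IDA, IDB)
    SELECT A.ID, B.ID
    FROM A CROSS JOIN B
    WHERE NOT EXISTS (SELECT 1 FROM C WHERE C.IDA = A.ID AND C.IDB = B.ID)
"""

def fill_c(conn):
    """Insert the missing (A, B) pairs and return how many rows were added."""
    before = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]
    conn.execute(FILL_C)
    after = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]
    return after - before

first = fill_c(conn)    # C is empty, so all four pairs are new
second = fill_c(conn)   # nothing is missing, so nothing is inserted
conn.execute("INSERT INTO B VALUES (3)")
third = fill_c(conn)    # only the pairs involving B=3 are missing
pairs = conn.execute("SELECT IDA, IDB FROM C ORDER BY IDA, IDB").fetchall()
```

Rerunning fill_c at any point only adds the pairs that are not in C yet.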

INSERT INTO C(IDA,IDB)
SELECT DISTINCT A.ID, B.ID
FROM A,B
MINUS
SELECT IDA, IDB
FROM C;
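This is another possible way. It is efficient only if A and B are small, because the full Cartesian product is built before the rows already in C are subtracted. Note that MINUS is Oracle syntax; SQLite and PostgreSQL spell the same operator EXCEPT.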
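For SQLite, the same set-difference idea can be sketched with EXCEPT. The sample rows below mirror the A:1,2 / B:1,2,3 example from the question and are otherwise an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID integer primary key);
    CREATE TABLE B (ID integer primary key);
    CREATE TABLE C (IDA integer, IDB integer);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1), (2), (3);
    INSERT INTO C VALUES (1, 1), (1, 2), (2, 1), (2, 2);
""")

# All (A, B) pairs minus the pairs already stored in C.
conn.execute("""
    INSERT INTO C (IDA, IDB)
    SELECT A.ID, B.ID FROM A CROSS JOIN B
    EXCEPT
    SELECT IDA, IDB FROM C
""")

added = conn.execute(
    "SELECT IDA, IDB FROM C WHERE IDB = 3 ORDER BY IDA"
).fetchall()
total = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]
```

EXCEPT also removes duplicates from its result, so no DISTINCT is needed on the first SELECT.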

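You can define table C with a unique constraint:
CREATE TABLE IF NOT EXISTS C (IDA integer,IDB integer, UNIQUE (IDA, IDB) ON CONFLICT IGNORE);
SQLite will then silently skip any inserted pair that already exists, so the original INSERT can be rerun as is. Note that IF NOT EXISTS leaves an existing C untouched, so the table has to be created (or recreated) with this constraint.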
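A small sketch of the constraint in action, again with Python's sqlite3 module and an in-memory database (the sample rows are assumptions). The INSERT is the asker's original, unmodified query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID integer primary key);
    CREATE TABLE B (ID integer primary key);
    CREATE TABLE IF NOT EXISTS C (
        IDA integer,
        IDB integer,
        UNIQUE (IDA, IDB) ON CONFLICT IGNORE
    );
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1), (2);
""")

PLAIN_INSERT = "INSERT INTO C (IDA, IDB) SELECT A.ID, B.ID FROM A, B"

conn.execute(PLAIN_INSERT)
conn.execute(PLAIN_INSERT)  # every pair conflicts, so every row is ignored
count_after_rerun = conn.execute("SELECT COUNT(*) FROM C").fetchone()[0]

conn.execute("INSERT INTO B VALUES (3)")
conn.execute(PLAIN_INSERT)  # only the pairs with B=3 get through
pairs = conn.execute("SELECT IDA, IDB FROM C ORDER BY IDA, IDB").fetchall()
```

The trade-off is that the deduplication lives in the schema, so every insert into C is affected, not just this query.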

Related

Optimizing sql query: check for all rows in table B if any rows in table C reference the same row in table A

I have 3 tables, A, B and C, structured like this:
CREATE TABLE a (
id SERIAL NOT NULL PRIMARY KEY
);
CREATE TABLE b (
id SERIAL NOT NULL PRIMARY KEY,
a_id INT REFERENCES a(id) ON DELETE CASCADE
);
CREATE TABLE c (
id SERIAL NOT NULL PRIMARY KEY,
a_id INT REFERENCES a(id) ON DELETE CASCADE
);
The relationships are many-to-one. For every row in table b, I want to check whether any row in table c references the same row in table a. I already have this query:
SELECT
b.id,
true
FROM
b
WHERE EXISTS (
SELECT 1
FROM c
WHERE b.a_id = c.a_id
)
UNION
SELECT
b.id,
false
FROM
b
WHERE NOT EXISTS (
SELECT 1
FROM c
WHERE b.a_id = c.a_id
)
ORDER BY id
Though I am not certain, I think this is doing double work, and going through the table twice, and I am wondering how I could optimize it to only traverse the table once.
Is it possible with a simple query, or do I have to do anything complex?
Simply move the EXISTS clause into your SELECT clause.
SELECT
b.id,
EXISTS (SELECT null FROM c WHERE c.a_id = b.a_id) AS c_exists
FROM b;
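The same with an IN clause, which I prefer for being even a tad simpler. (One difference: unlike EXISTS, IN yields NULL rather than false when b.a_id is NULL, or when there is no match and c.a_id contains a NULL.)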
SELECT
id,
a_id IN (SELECT c.a_id FROM c) AS c_exists
FROM b;
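Both forms can be tried side by side in SQLite through Python's sqlite3 module. This is only a sketch: INTEGER PRIMARY KEY stands in for PostgreSQL's SERIAL, SQLite reports booleans as 1 and 0, and the sample rows (including a b row with a NULL a_id) are assumptions chosen to show where EXISTS and IN differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INT REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY, a_id INT REFERENCES a(id));
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1, 1), (2, 2), (3, 3), (4, NULL);
    INSERT INTO c VALUES (1, 1), (2, 1), (3, 3);
""")

# One pass over b; the subquery is evaluated per row.
exists_rows = conn.execute("""
    SELECT b.id,
           EXISTS (SELECT 1 FROM c WHERE c.a_id = b.a_id) AS c_exists
    FROM b
    ORDER BY b.id
""").fetchall()

# Same result, except that a NULL a_id gives NULL instead of 0.
in_rows = conn.execute("""
    SELECT id,
           a_id IN (SELECT c.a_id FROM c) AS c_exists
    FROM b
    ORDER BY id
""").fetchall()
```

If a_id can be NULL and a strict true/false is wanted, the EXISTS form is the safer of the two.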
This can be done with a subquery, a left join, and a case.
The subquery gets you a list of distinct c.a_id values.
SELECT DISTINCT a_id FROM c;
Then do this
SELECT b.id,
CASE WHEN distinct_ids.a_id IS NULL THEN 'false'
ELSE 'true' END has_c_row
FROM b
LEFT JOIN (
SELECT DISTINCT a_id FROM c
) distinct_ids ON b.a_id = distinct_ids.a_id
This shape of query, LEFT JOIN ... IS NULL, is the basis of an antijoin: rows of the first table that have no match in the second table come back with NULL in the joined columns. Here the CASE turns that NULL test into a true/false flag instead of filtering on it.
The subquery gives us a view of the data in table c with exactly one row per distinct a_id value. Without the subquery, a row of b would be repeated once for every matching row in c.
This eliminates your WHERE EXISTS correlated subqueries; even though PostgreSQL's query planner is pretty smart, sometimes it does the slow thing with subqueries like that.
If it is still too slow for you, create these indexes on the a_id columns (PostgreSQL syntax):
CREATE INDEX b_a_id_idx ON b (a_id);
CREATE INDEX c_a_id_idx ON c (a_id);
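The LEFT JOIN approach can be sketched end to end in SQLite via Python's sqlite3 module. The index names, the sample rows, and the use of INTEGER PRIMARY KEY in place of SERIAL are assumptions for illustration. The last query shows why the DISTINCT subquery matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INT REFERENCES a(id));
    CREATE TABLE c (id INTEGER PRIMARY KEY, a_id INT REFERENCES a(id));
    CREATE INDEX b_a_id_idx ON b (a_id);
    CREATE INDEX c_a_id_idx ON c (a_id);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO c VALUES (1, 1), (2, 1), (3, 3);  -- a_id 1 appears twice
""")

flags = conn.execute("""
    SELECT b.id,
           CASE WHEN distinct_ids.a_id IS NULL THEN 'false'
                ELSE 'true' END AS has_c_row
    FROM b
    LEFT JOIN (SELECT DISTINCT a_id FROM c) AS distinct_ids
           ON b.a_id = distinct_ids.a_id
    ORDER BY b.id
""").fetchall()

# Joining c directly repeats b row 1 once per matching c row.
rows_without_distinct = conn.execute(
    "SELECT COUNT(*) FROM b LEFT JOIN c ON b.a_id = c.a_id"
).fetchone()[0]
```

With three rows in b, the direct join returns four rows, while the DISTINCT version returns exactly one row per b.id.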
I think I understand what you are after.
This is how I would do it:
SELECT b.id, COALESCE(res.result, 0) AS result
FROM b
LEFT JOIN (
SELECT DISTINCT c.a_id, 1 AS result
FROM c
) res ON b.a_id = res.a_id
The DISTINCT is only needed if several rows in c can share the same a_id; without it, those b rows would be repeated.

How to find what one table have elements of second table?

I have 2 tables that are related to a third one by a one-to-many relation.
For example, I have Table A, Table B and Table C.
Tables A and B are related to Table C as many-to-one, so Table C has 2 fields, tableAId and tableBId. As a result I need a list of all elements from Table C that are related to Table A, compared against all elements from Table C that are related to Table B.
I tried to do it with EXCEPT and MINUS statements, but the result is incorrect.
Here is what I tried:
SELECT tableAId FROM tableC
except
select tableBId FROM tableC
I'm not 100% sure what you require, but I think this is what you want:
SELECT a.ID, b.ID
FROM TableA a
JOIN TableC c ON a.ID = c.tableAId
JOIN TableB b ON c.tableBId = b.ID
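Since the requirement is not fully clear, the following is only a sketch of what the join returns, run in SQLite through Python's sqlite3 module. The column layout (TableC with tableAId and tableBId) follows the question's description; the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableC (
        tableAId INTEGER REFERENCES TableA(ID),
        tableBId INTEGER REFERENCES TableB(ID)
    );
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (10), (20);
    INSERT INTO TableC VALUES (1, 10), (2, 10), (2, 20);
""")

# Every (A, B) pair that is linked through a row of TableC.
linked = conn.execute("""
    SELECT a.ID, b.ID
    FROM TableA a
    JOIN TableC c ON a.ID = c.tableAId
    JOIN TableB b ON c.tableBId = b.ID
    ORDER BY a.ID, b.ID
""").fetchall()
```

Row 3 of TableA never shows up because no row of TableC references it.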

Postgis/SQL Select tuples such that the first tuple item is unique and the items geometries intersect

This question is specifically about Postgres 9.4.
Let's say I have two tables:
CREATE TABLE A(id INT);
CREATE TABLE B(id INT);
I'd like to select all tuples (A, B) that satisfy a certain condition, such that
the selected tuples all have different values in the A column:
SELECT DISTINCT ON (A.id) A.id, B.id FROM A, B WHERE condition(A, B);
However, DISTINCT ON will perform sorting in memory after all the tuples have been selected, and I would like to not select tuples with a duplicate A.id at all.
How can this be done in an efficient way?
EDIT:
both A and B have unique ids
EDIT2:
Here is the complete setup:
CREATE EXTENSION IF NOT EXISTS postgis;
DROP TABLE IF EXISTS A;
DROP TABLE IF EXISTS B;
CREATE TABLE A(shape Geometry, id INT);
CREATE TABLE B(shape Geometry, id INT, kind INT);
CREATE INDEX ON A USING GIST (shape);
I would like to do the following:
SELECT A.id, B.id FROM A, B
WHERE B.id = (SELECT B.id FROM B WHERE
ST_Intersects(A.shape, B.shape)
AND ST_Length(ST_Intersection(A.shape, B.shape)) / ST_Length(A.shape) >= 0.5 AND B.kind != 1 LIMIT 1);
which works (I believe), however is not necessarily the most efficient way. The table A has orders of magnitude more rows than table B. So
I am not even sure if the GiST index is right.
I am also aware that the order of arguments in ST_Intersects can have a significant effect on run time. What should the correct order be?
If you want just one row for each "A", you can use a correlated subquery (or lateral join):
select a.id,
(select b.id
from b
where condition(a, b)
limit 1
) as b_id
from a;
This should stop testing for rows from b when the first one is found -- which I imagine is the best approach performance-wise.
If none are found, you will get a NULL value. You can wrap this in a subquery and filter out NULLs.
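The correlated-subquery pattern, including the outer filter on NULLs, can be sketched in SQLite through Python's sqlite3 module. SQLite has no PostGIS, so the spatial test is replaced by a made-up stand-in: the columns x, lo and hi and the interval check are assumptions for illustration only, and the ORDER BY inside the subquery is there just to make the picked row deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INT, x INT);
    CREATE TABLE B (id INT, lo INT, hi INT, kind INT);
    INSERT INTO A VALUES (1, 5), (2, 15), (3, 100);
    INSERT INTO B VALUES (10, 0, 10, 0), (11, 0, 20, 0), (12, 10, 20, 1);
""")

matches = conn.execute("""
    SELECT a_id, b_id
    FROM (
        SELECT a.id AS a_id,
               (SELECT b.id
                FROM B b
                WHERE b.kind != 1
                  AND a.x BETWEEN b.lo AND b.hi   -- stand-in for condition(a, b)
                ORDER BY b.id                      -- only for a repeatable example
                LIMIT 1) AS b_id
        FROM A a
    ) AS picked
    WHERE b_id IS NOT NULL                         -- drop A rows with no match
    ORDER BY a_id
""").fetchall()
```

Each row of A appears at most once, and row 3, which matches nothing, is filtered out.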
Try something like:
WITH distinct_a AS (
SELECT DISTINCT a.id
FROM A a)
SELECT a.id, b.id
FROM distinct_a a, B b
WHERE condition(a, b)
The CTE (WITH ...) selects all distinct A.id values first; the outer query then joins those values against B. Note that this still returns every matching B row for each A.id.

Insert new/Changes from one table to another in Oracle SQL

I have two tables with the same columns: Table A and Table B.
Every day I insert data from Table B into Table A. Currently the insert query is:
insert into table_a (select * from table_b);
But this also re-inserts the data that was inserted earlier. I only want the rows that are new or have changed since the previous load. How can this be done?
You can use minus:
insert into table_a
select *
from table_b
minus
select *
from table_a;
This assumes that by "duplicate" you mean that all the columns are duplicated.
If you have a timestamp field, you could use it to limit the records to those created after the last copy.
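Here is a sketch of the set-difference load in SQLite via Python's sqlite3 module, where Oracle's MINUS is spelled EXCEPT. The two-column layout (id, val) and the sample rows are assumptions. The example also shows what "changed" means for this approach: a changed row is added next to the old version, not updated in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INT, val TEXT);
    CREATE TABLE table_b (id INT, val TEXT);
    INSERT INTO table_a VALUES (1, 'x'), (2, 'y');
    INSERT INTO table_b VALUES (1, 'x'), (2, 'y2'), (3, 'z');
""")

# Rows of table_b that do not appear, column for column, in table_a.
LOAD = """
    INSERT INTO table_a
    SELECT * FROM table_b
    EXCEPT
    SELECT * FROM table_a
"""

conn.execute(LOAD)
rows_after_first = conn.execute(
    "SELECT id, val FROM table_a ORDER BY id, val"
).fetchall()

conn.execute(LOAD)  # rerun: every row of table_b is already present
count_after_rerun = conn.execute("SELECT COUNT(*) FROM table_a").fetchone()[0]
```

After the first load table_a holds both (2, 'y') and (2, 'y2'), which may or may not be what is wanted.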
Another option, assuming that you have a primary key (the id column in my example) that tells you whether a record has already been copied, is to create a table c (with the same structure as a and b) and do the following:
insert into c
select a.* from a
left join b on (a.id = b.id)
where b.id is null;
insert into b select * from c;
truncate table c;
In this example a is the source table and b is the destination. You need to adjust the query to use your actual table names and primary key.
Hope this helps!
If the tables have a primary or unique key, then you could leverage that in an anti-join:
insert into table_a
select *
from table_b b
where not exists (
select null
from table_a a
where
a.pk_field_1 = b.pk_field_1 and
a.pk_field_2 = b.pk_field_2
)
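The key-based anti-join can be sketched in SQLite through Python's sqlite3 module. A single-column key named id and the sample rows are assumptions standing in for pk_field_1 and pk_field_2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO table_a VALUES (1, 'x'), (2, 'y');
    INSERT INTO table_b VALUES (1, 'x'), (2, 'y2'), (3, 'z');
""")

# Copy only the rows whose key is not in table_a yet.
conn.execute("""
    INSERT INTO table_a
    SELECT *
    FROM table_b b
    WHERE NOT EXISTS (
        SELECT NULL
        FROM table_a a
        WHERE a.id = b.id
    )
""")

rows = conn.execute("SELECT id, val FROM table_a ORDER BY id").fetchall()
```

Only the new key 3 is copied. The changed row with id 2 keeps its old value, so this covers the "new" half of the question but not the "changed" half.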
You don't say what your key is. Assuming you have a key ID and only want the IDs that are not already in Table A, you can also use a MERGE statement for this:
MERGE INTO A USING B ON (A.ID = B.ID)
WHEN NOT MATCHED THEN INSERT (... columns of A) VALUES (... columns of B)

How to get a row from a Table with no ids

I have two tables, table A and table B. Table A has the columns Id, a, b and c; table B has just a, b and c. What I want (but don't know how to do) is to get a single row from table B
(it would be something like comparing a, b and c at the same time).
Thanks!
SELECT A.id, B.a, B.b, B.c
FROM A
JOIN B
ON(A.a=B.a)
AND(A.b=B.b)
AND(A.c=B.c)
But I'd change your database structure and add a foreign key to table B referencing table A. It would really help you with this and later cases.
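The three-column join can be tried out with a minimal sketch in SQLite through Python's sqlite3 module. The column types and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, a TEXT, b TEXT, c TEXT);
    CREATE TABLE B (a TEXT, b TEXT, c TEXT);
    INSERT INTO A VALUES (1, 'x', 'y', 'z'), (2, 'p', 'q', 'r');
    INSERT INTO B VALUES ('p', 'q', 'r'), ('m', 'n', 'o');
""")

# A row of B is matched only when all three columns agree with a row of A.
matched = conn.execute("""
    SELECT A.id, B.a, B.b, B.c
    FROM A
    JOIN B
      ON A.a = B.a
     AND A.b = B.b
     AND A.c = B.c
    ORDER BY A.id
""").fetchall()
```

Keep in mind that an equality join never matches on NULL, so rows with a NULL in a, b or c will not be paired.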
May I suggest changing your table structure:
TableA (id primary key)
TableB (colA, colB, colC, fid foreign key referencing TableA.id)
So now you don't have to store the colA, colB, colC values in both TableA and TableB:
Select B.*
From TableA A
Join TableB B
On A.id=B.fid