Postgres Relationship query - sql

I have two tables A and B where the relationship is one to many (A -> many B). Table "A" contains columns id, name and table "B" has id, a_id(fk), is_off(boolean).
Now, I want to get the id of every "A" whose "B" rows all have is_off = true.
I tried this query: select a.id from A a inner join B b on a.id = b.a_id where b.is_off = true. But it also returns an "A" that has an item (B) with is_off = false.
Any help will be appreciated.

You were close. A subquery is probably what you're looking for:
Test data
CREATE TABLE a (id int PRIMARY KEY, name text);
CREATE TABLE b (id int, a_id int REFERENCES a(id), is_off boolean);
INSERT INTO a VALUES (1,'foo');
INSERT INTO b VALUES (42,1,true),(1,1,false);
Your query returns an a record as soon as at least one of its b records fulfils your join and where clauses:
SELECT * FROM a
JOIN b on a.id = b.a_id
WHERE b.is_off;
id | name | id | a_id | is_off
----+------+----+------+--------
1 | foo | 42 | 1 | t
(1 row)
If you intend to exclude all a records that have at least one b record with is_off = false, you can use NOT IN with a subquery, but as suggested by @a_horse_with_no_name in the comments you could use EXISTS or NOT EXISTS:
SELECT * FROM a
WHERE NOT EXISTS (SELECT a_id FROM b WHERE a.id = b.a_id AND NOT is_off);
id | name
----+------
(0 rows)
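The question asks for the A rows whose B rows are all is_off = true. A minimal sketch of that check, run against SQLite through Python's sqlite3 module rather than Postgres (the extra rows 2 and 3 are made-up test data, and the additional EXISTS is an assumption that an A without any B rows should not qualify):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER, a_id INTEGER REFERENCES a(id), is_off BOOLEAN);
    INSERT INTO a VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
    -- a=1 has mixed flags, a=2 has only is_off = true, a=3 has no b rows at all
    INSERT INTO b VALUES (42, 1, 1), (1, 1, 0), (7, 2, 1), (8, 2, 1);
""")

all_off = conn.execute("""
    SELECT a.id
    FROM a
    WHERE EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)       -- has at least one b row
      AND NOT EXISTS (SELECT 1 FROM b
                      WHERE b.a_id = a.id AND NOT b.is_off)  -- and none that is "on"
    ORDER BY a.id
""").fetchall()

print(all_off)
```

Dropping the first EXISTS would also return a=3, because NOT EXISTS is trivially true for a row without children.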

Related

What is the best way to anonymize ID values in sql server 2008

I got 2 tables in sql 2008
Table1
Id Name Surname City
1000 Alex White London
1001 John Brown Brussels
..
Table2
Id Surgeon Room aId
1 Mike J. A104 1000
2 Jack S. C144 1001
...
And I have a query like:
Select a.Id,b.Id,
a.Name,a.Surname,a.City,b.Surgeon,b.Room
into #results
from Table1 a
inner join Table2 b on a.Id = b.aId
What I want to do is anonymize the a.Id and b.Id values for privacy by using dummy values instead of the real ones. Previously I applied arbitrary arithmetic to them, like:
Select aId = a.Id * 22 / 5 + 14 * 2
,bId = b.Id * 12 / 4 + 7 * 3
...
but honestly I am not happy with this approach, and I am looking for a more professional way to do it. Any advice would be appreciated.
If you don't need to be sure the anonymized IDs are unique and you don't need to find a real ID based on an anonymized ID, you could use the CheckSum() or HashBytes() function with the strings from your Table1 and Table2:
Select aId = CheckSum(a.Name + a.Surname) % 10000
,bId = HashBytes('SHA1', b.Surgeon) % 10000
,a.Name,a.Surname,a.City,b.Surgeon,b.Room
into #results
from Table1 a
inner join Table2 b on a.Id = b.aId
If you need to be sure you have a unique value for each of the Id values in your table and you also need to find a real ID based on an anonymized ID, you can construct a lookup table as follows:
CREATE TABLE Anon
(
ID INTEGER NOT NULL PRIMARY KEY,
AnonID UNIQUEIDENTIFIER DEFAULT NewID()
);
This can then be used in queries where the actual ID should not be returned:
Select aID = Anona.AnonID,
bID = Anonb.AnonID,
a.Name,a.Surname,a.City,b.Surgeon,b.Room
into #results
from Table1 a inner join Table2 b on a.Id = b.aId
inner join Anon Anona on a.Id = Anona.Id
inner join Anon Anonb on b.Id = Anonb.Id
The Anon table would need to be maintained to ensure it contains all IDs from your Table1 and Table2.
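The maintenance step can be sketched as follows. This is an illustration in Python's sqlite3 rather than SQL Server, covering only Table1 for brevity; the helper name refresh_anon is hypothetical, and uuid.uuid4() stands in for NewID():

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Id INTEGER PRIMARY KEY, Name TEXT, Surname TEXT, City TEXT);
    CREATE TABLE Anon (ID INTEGER NOT NULL PRIMARY KEY, AnonID TEXT NOT NULL UNIQUE);
    INSERT INTO Table1 VALUES (1000, 'Alex', 'White', 'London'),
                              (1001, 'John', 'Brown', 'Brussels');
""")

def refresh_anon(conn):
    """Give every Table1.Id that lacks one a random surrogate (hypothetical helper)."""
    missing = conn.execute(
        "SELECT Id FROM Table1 WHERE Id NOT IN (SELECT ID FROM Anon)"
    ).fetchall()
    conn.executemany(
        "INSERT INTO Anon (ID, AnonID) VALUES (?, ?)",
        [(row[0], str(uuid.uuid4())) for row in missing],
    )

refresh_anon(conn)
refresh_anon(conn)  # a second call adds nothing, existing surrogates stay stable

rows = conn.execute("""
    SELECT Anon.AnonID, t.Name, t.City
    FROM Table1 t JOIN Anon ON t.Id = Anon.ID
    ORDER BY t.Id
""").fetchall()
anon_count = conn.execute("SELECT COUNT(*) FROM Anon").fetchone()[0]
```

Because the surrogate is random and stored, it reveals nothing about the real Id, yet the Anon table still allows a reverse lookup for whoever is permitted to read it.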

INSERT into table and UPDATE foreign key in another table (Sql Server 2008)

I have two tables, TableA and TableB like this:
TableA
-------------------------------------------
| id | some_data | new_FK_column_on_B |
| ---- | ----------- | ------------------ |
| 1 | ... | null |
| ... | ... | null |
| 999 | ... | null |
-------------------------------------------
TableB
----------------------------
| id | some_other_data |
| ---- | ----------------- |
| | |
----------------------------
At the moment, TableB is empty, and the FK column in TableA is null for all rows. I need to write a one-time initialization script that populates TableB and sets the FK column for some rows in TableA (those meeting a criterion, not all) to the identifiers of the rows inserted into TableB.
I know two ways to do this:
1) using while and scope_identity(), inserting new row into TableB and updating TableA on each iteration, while exists rows in TableA, which should be updated
while (exists (select 1 from TableA where [condition]))
begin
insert into TableB (some_other_data) values ('some_other_data')
update TableA set new_FK_column_on_B = scope_identity()
where id = (select top 1 id from TableA where [condition])
end
2) create temp column in TableB, storing id of row in TableA, for which it was inserted, and then update TableA using join
alter table TableB add temp int
go
insert into TableB (some_other_data, temp) select 'some_other_data', id from TableA where [condition]
update TableA
set new_FK_column_on_B = b.id
from TableB as b
join TableA as a on a.id = b.temp
alter table TableB drop column temp
I also tried to use the output of the insert like this, but its syntax is incorrect:
update TableA
set new_FK_column_on_B =
(
select insertedId from
(
insert into TableB (some_other_data)
output inserted.id as insertedId
values ('some_other_data')
)
)
where [condition]
Is there an easier way to do this without using while or modifying any table?
I found this question when searching for a solution to a similar case. The only difference was that I wanted to fill TableB with a row for each row in TableA (no where clause).
I found a solution to the third option you suggested (using output from insert). With a plain INSERT ... SELECT, the OUTPUT clause can only reference the inserted row, not columns of the source table, so it cannot capture the TableA id. MERGE has no such restriction, which pointed me in the right direction:
DECLARE @MyTableVar table(tableBId int, tableAId int)
MERGE INTO TableB
using TableA AS AP
on 1=0
WHEN NOT MATCHED THEN
Insert(some_other_data)
Values('some_other_data')
Output inserted.ID, AP.ID INTO @MyTableVar;
update TableA set new_FK_column_on_B = (select tableBId from @MyTableVar where tableAId = TableA.ID)
Be aware that executing this a second time will create new entries in TableB. If you only want to create new rows in TableB where no foreign key is set in TableA yet, you can use this script:
DECLARE @MyTableVar TABLE(tableBId int, tableAId int)
MERGE INTO TableB AS B
USING TableA AS AP
ON AP.new_FK_column_on_B = B.id
WHEN NOT MATCHED THEN
INSERT(some_data)
VALUES(AP.some_data)
OUTPUT inserted.ID, AP.ID INTO @MyTableVar;
UPDATE TableA SET new_FK_column_on_B = (
SELECT tableBId
FROM @MyTableVar
WHERE tableAId = TableA.ID )
WHERE TableA.new_FK_column_on_B IS NULL;
You can do all this as set operations:
insert into b(some_data)
select distinct some_data
from a;
update a
set new_FK_column_on_B = b.id
from a join
b
on a.some_data = b.some_data;
This assumes that the id column in b is declared as identity(), so it gets assigned automatically. This is a good idea, but if you want to do this manually, then the first query would be something like:
insert into b(id, some_data)
select row_number() over (order by (select null)), some_data
from (select distinct some_data
from a
) a;
There is no need for a while loop.
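A minimal sketch of the two set-based steps, run against SQLite through Python's sqlite3 instead of SQL Server. The sample rows are made up, AUTOINCREMENT stands in for identity(), and a correlated subquery replaces the UPDATE ... FROM join for portability:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY AUTOINCREMENT, some_data TEXT);
    CREATE TABLE a (id INTEGER PRIMARY KEY, some_data TEXT,
                    new_FK_column_on_B INTEGER REFERENCES b(id));
    INSERT INTO a (id, some_data) VALUES (1, 'x'), (2, 'y'), (3, 'x');

    -- Step 1: one b row per distinct value found in a
    INSERT INTO b (some_data) SELECT DISTINCT some_data FROM a;

    -- Step 2: point each a row at the b row carrying the same value
    UPDATE a
    SET new_FK_column_on_B = (SELECT b.id FROM b WHERE b.some_data = a.some_data);
""")

links = conn.execute("""
    SELECT a.id, a.some_data, b.some_data
    FROM a JOIN b ON a.new_FK_column_on_B = b.id
    ORDER BY a.id
""").fetchall()
b_count = conn.execute("SELECT COUNT(*) FROM b").fetchone()[0]
```

Note that rows 1 and 3 end up sharing one b row because they carry the same some_data. If every a row needs its own b row, matching on the data value is not enough and a key-carrying approach is needed instead.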
DISCLAIMER: This is not tested at all, I wrote this in notepad:
-- RowNum identity columns give both table variables a row number to join on
DECLARE @TableAValues TABLE (RowNum int IDENTITY(1,1), IdTableA int, SomeData varchar(100))
INSERT INTO @TableAValues(IdTableA, SomeData)
SELECT id, 'some_data' FROM TableA ORDER BY id
DECLARE @TableBIds TABLE (RowNum int IDENTITY(1,1), IdTableB int)
INSERT INTO TableB(SomeData)
OUTPUT INSERTED.ID INTO @TableBIds(IdTableB) -- assumes OUTPUT rows arrive in insert order, which is not guaranteed
SELECT SomeData FROM @TableAValues ORDER BY RowNum
UPDATE ta
SET ta.new_FK_column_on_B = tbi.IdTableB
FROM dbo.TableA AS ta
INNER JOIN @TableAValues AS tav ON ta.id = tav.IdTableA -- used in case more records were added to table in the interim.
LEFT OUTER JOIN @TableBIds tbi ON tav.RowNum = tbi.RowNum
Note: I am using table variables, but for a large number of rows you could probably just switch those out for temp tables.
The idea I was going for here:
Grab the rows from table A (ID + data) that we will use to populate B and store them in a table variable.
Insert those rows into B and store the corresponding IDs of B, again in a table variable.
I assume the order of rows in both table variables will now match, so we can join them on row number to pair each Table A ID with its Table B ID, which we then use to update table A's foreign key.
I'd be very surprised if my code sample above works as is, but hopefully the idea will be useful, if not my execution.

Is there any better way to write this query

I wrote the query below for my delete operation. I am new to SQL and wanted to check with experienced people here whether it is fine or whether there is a better way to do this. I am using a DB2 database.
DELETE FROM TableD
WHERE B_id IN
(
SELECT tB.B_id
FROM TableB tB
INNER JOIN TableA tA
ON tB.A_id = tA.A_id
WHERE tA.A_id = 123
) AND
C_id IN (1,2,3)
This has two IN clauses, which worries me a little, and I am not sure whether I could use an EXISTS clause anywhere.
Database Structure as below:
Table A has ONE TO MANY relation with Table B
Table B has ONE TO MANY relation with Table C
Table B has ONE TO MANY relation with Table D
Table D has composite primary key ( B_id, C_id )
Table D data somewhat similar to below
B_id|C_id
----------
1 | 1
1 | 2
1 | 3
2 | 4
2 | 5
3 | 5
Here I have to delete the rows whose C_id is in an array of values. But since the index is a composite of B_id and C_id, I retrieve the B_id values related to the particular entity of Table A with the equality condition A_id = 123.
There isn't necessarily anything wrong with your method. However, a useful alternative technique to know is merge:
merge into TableD
using (
select distinct
tB.B_id
from TableB tB
inner join TableA tA on
tB.A_id = tA.A_id and
tA.A_id = 123
) AB
on
TableD.B_id = AB.B_id and
C_id in (1,2,3)
when matched then delete;
Note that I had to use distinct on the inner query to prevent duplicate matches.
You can use merge like this too:
merge into TableD
using TableB tB
on tB.B_id = TableD.B_id
and tB.A_id in (select A_id from TableA tA where A_id = 123)
and C_id in (1,2,3)
when matched then delete;
DELETE FROM TableD tD
WHERE
EXISTS (
SELECT
tB.B_id
FROM
TableB tB
WHERE
A_id = 123
AND tB.B_id = tD.B_id
)
AND C_id IN (1, 2, 3)
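The EXISTS form of the delete can be checked on a small data set. The sketch below uses SQLite through Python's sqlite3 instead of DB2, and the TableB rows plus the extra (3, 1) row in TableD are made-up test data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableB (B_id INTEGER PRIMARY KEY, A_id INTEGER);
    CREATE TABLE TableD (B_id INTEGER, C_id INTEGER, PRIMARY KEY (B_id, C_id));
    -- B rows 1 and 2 belong to A 123; B row 3 belongs to another A
    INSERT INTO TableB VALUES (1, 123), (2, 123), (3, 456);
    INSERT INTO TableD VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 5), (3, 1);
""")

conn.execute("""
    DELETE FROM TableD
    WHERE C_id IN (1, 2, 3)
      AND EXISTS (SELECT 1
                  FROM TableB tB
                  WHERE tB.B_id = TableD.B_id
                    AND tB.A_id = 123)
""")

remaining = conn.execute(
    "SELECT B_id, C_id FROM TableD ORDER BY B_id, C_id"
).fetchall()
```

Row (3, 1) survives even though its C_id is in the list, because its B row does not belong to A 123. TableA is not joined at all here, since TableB already carries A_id.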

Find differences between two large tables in oracle

I have two different tables, say table A and B, in Oracle with around 15 million records in each. Table A has columns (a,b,c,d) and Table B has columns (e,f,g,h).
The objective is to write a stored procedure to check if every record present in table A is also present in table B and vice versa. Differences between these two should be inserted into a third table.
My problem is that
column a in Table A should be compared with concatenate of column e and f in table B if column e contains a certain string (0311),
if not I have to compare it with just column f.
Column b should be compared with column g in table B and
I also have to compare column c in the table A with column g in table B, if the two aren't a match column d should be compared with column g.
What's the fastest way to do so?
for example these two are a match:
Table A: 9353456789,03117884657,12082200003035,12082123595535
Table B: 9353456789,0311,7884657,12082200003035
or:
Table A: 9353456789,03117884657,12082200003035,12082123595535
Table B: 9353456789,0311,7884657,12082123595535
example of records that do not need concatenation and are a match:
Table A: 9353456789,03617884657,12082200003035,12082123595535
Table B: 9353456789,0361,03617884657,12082200003035
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE TableA ( a VARCHAR2(20), b VARCHAR2(20), c VARCHAR2(20), d VARCHAR2(20) );
CREATE TABLE TableB ( e VARCHAR2(20), f VARCHAR2(20), g VARCHAR2(20), h VARCHAR2(20) );
CREATE TABLE TableC ( i VARCHAR2(20), j VARCHAR2(20), k VARCHAR2(20), l VARCHAR2(20) );
INSERT INTO TableA
SELECT '9353456789','03117884657','12082200003035','12082123595535' FROM DUAL
UNION ALL SELECT '9353456789','03617884657','12082200003035','12082123595535' FROM DUAL
UNION ALL SELECT '9353456789','03617884657','12082200003034','12082123595534' FROM DUAL;
INSERT INTO TableB
SELECT '9353456789','0311','7884657','12082200003035' FROM DUAL
UNION ALL SELECT '9353456789','0311','7884657','12082123595535' FROM DUAL
UNION ALL SELECT '9353456789','0361','03617884657','12082200003035' FROM DUAL
UNION ALL SELECT '9353456789','0361','03617884657','12082200003036' FROM DUAL;
Query 1:
To insert the rows, perform an INSERT INTO ... SELECT using a FULL OUTER JOIN between both tables with your requirements as the join condition. For rows that do not match, either the TableA columns (a, b, c, d) or the TableB columns (e, f, g, h) will all be NULL, and the WHERE clause uses this to keep only the non-matched rows. Finally, COALESCE() is used so that the inserted values are not NULL.
INSERT INTO TableC
SELECT COALESCE( ta.a, tb.e ) AS i,
COALESCE( ta.b, tb.f ) AS j,
COALESCE( ta.c, tb.g ) AS k,
COALESCE( ta.d, tb.h ) AS l
FROM TableA ta
FULL OUTER JOIN
TableB tb
ON ( ta.a = tb.e
AND ta.b = CASE tb.f WHEN '0311' THEN tb.f || tb.g ELSE tb.g END
AND ( ta.c = tb.h OR ta.d = tb.h )
)
WHERE ta.a IS NULL
OR tb.e IS NULL;
Query 2:
SELECT * FROM TableC
Results:
| I | J | K | L |
|------------|-------------|----------------|----------------|
| 9353456789 | 03617884657 | 12082200003034 | 12082123595534 |
| 9353456789 | 0361 | 03617884657 | 12082200003036 |
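The same matching rule can be cross-checked outside Oracle. The sketch below is an assumption-laden stand-in: it runs on SQLite through Python's sqlite3 and swaps the FULL OUTER JOIN for two NOT EXISTS anti-joins (one per direction), since older SQLite versions lack FULL OUTER JOIN. The MATCH string is just a hypothetical helper holding the shared join condition:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (a TEXT, b TEXT, c TEXT, d TEXT);
    CREATE TABLE TableB (e TEXT, f TEXT, g TEXT, h TEXT);
    INSERT INTO TableA VALUES
        ('9353456789','03117884657','12082200003035','12082123595535'),
        ('9353456789','03617884657','12082200003035','12082123595535'),
        ('9353456789','03617884657','12082200003034','12082123595534');
    INSERT INTO TableB VALUES
        ('9353456789','0311','7884657','12082200003035'),
        ('9353456789','0311','7884657','12082123595535'),
        ('9353456789','0361','03617884657','12082200003035'),
        ('9353456789','0361','03617884657','12082200003036');
""")

# Same condition as the ON clause of the FULL OUTER JOIN
MATCH = """
    ta.a = tb.e
    AND ta.b = CASE tb.f WHEN '0311' THEN tb.f || tb.g ELSE tb.g END
    AND (ta.c = tb.h OR ta.d = tb.h)
"""

# Rows of TableA without any partner in TableB
only_in_a = conn.execute(f"""
    SELECT ta.* FROM TableA ta
    WHERE NOT EXISTS (SELECT 1 FROM TableB tb WHERE {MATCH})
""").fetchall()

# Rows of TableB without any partner in TableA
only_in_b = conn.execute(f"""
    SELECT tb.* FROM TableB tb
    WHERE NOT EXISTS (SELECT 1 FROM TableA ta WHERE {MATCH})
""").fetchall()
```

The two result lists together correspond to the two rows shown in the Results table. Whether this form or the FULL OUTER JOIN is faster on 15 million rows depends on the optimizer and indexes, so it would need to be measured.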
I'd do this as two statements, though it can be combined
Select a.*
from tablea a left join tableb b on a.a =
case when e = 'string' then b.e || b.f else b.f end
and ...
where b.e is null
The left join returns nulls where a row isn't found in table b, so this should bring up a list of rows in table a that are not in table b. Change the statement to a right join and select b.* and you'll see what's in b but not in a.
The statement can be turned into a 'create table as', which creates a new table with the results of the select statement.
I wrote and ... because your conditions are a bit confusing; you'll just need to use case expressions to pick which columns you want to compare/join on.

Combining sql select and Count

I have two tables
A and B
A                    B
-----------------    -----------------
a_pk (int)           b_pk (int)
a_name (varchar)     a_pk (int)
                     b_name (varchar)
I could write a query
SELECT a.a_name, b.b_name
FROM a LEFT OUTER JOIN b ON a.a_pk = b.a_pk
and this would return me a non distinct list of everything in table a and its table b joined data. Duplicates would display for column a where different b records shared a common a_pk column value.
But what I want to do is get a full list of values from table A column a_name and ADD a column that is a COUNT of the joined values of table B.
So if a_pk = 1 and a_name = test and in table b there are 5 records that have a a_pk value of 1 my result set would be
a_name b_count
------ -------
test 5
The query should look like this:
SELECT
a.a_name,
(
SELECT Count(b.b_pk)
FROM b
Where b.a_pk = a.a_pk
) as b_count
FROM a
SELECT a.a_name, COUNT(b.b_pk) AS b_count
FROM
A a
LEFT JOIN B b
ON a.a_pk = b.a_pk
GROUP BY a.a_pk, a.a_name
SELECT
a.a_name,
(
SELECT COUNT(1)
FROM B b
WHERE b.a_pk = a.a_pk
) AS b_count
FROM A a
FROM A a