How to join a base table with an update table containing change data?

I have two tables (the real code has more data columns; this is just an example with a single data column):
create table base_t (
id int primary key,
val int
);
create table update_t (
id int primary key,
c char(1),
val int
);
In the update table, the char column c can be 'A' (add), 'D' (delete) or 'U' (update). In the result view I need rows from update_t when c is 'A' or 'U'; when c is 'D', that row should not be present in the result. All other rows should be taken from the base_t table.
Example
contents of base_t table
1 10
2 20
3 30
contents of update_t table
2 'U' 21
3 'D' NULL
4 'A' 41
expected result
1 10
2 21
4 41
How can I create such a join in PostgreSQL?

You could use COALESCE() for the U rows, preferring update_t over base_t. This assumes that no A records in update_t already exist in base_t. Otherwise, you could replace COALESCE() with a CASE expression such as CASE WHEN update_t.c = 'U' THEN update_t.id ELSE base_t.id END AS id (and similarly for the val column). The filter has to keep base_t rows that have no match in update_t: for those rows update_t.c is NULL, so a plain update_t.c <> 'D' would discard them, whereas IS DISTINCT FROM treats NULL as an ordinary value.
SELECT COALESCE(update_t.id, base_t.id) AS id,
COALESCE(update_t.val, base_t.val) AS val
FROM base_t
FULL OUTER JOIN update_t
ON base_t.id = update_t.id
WHERE
update_t.c IS DISTINCT FROM 'D'
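To check the expected result against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It is not the FULL OUTER JOIN from the answer: it swaps in an equivalent UNION ALL plus NOT EXISTS formulation, because older SQLite versions lack FULL OUTER JOIN. The variable names (conn, merged) are only for illustration, and the SQL is SQLite dialect, so treat it as a sanity check rather than PostgreSQL-ready code.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE base_t (id INTEGER PRIMARY KEY, val INTEGER);
CREATE TABLE update_t (id INTEGER PRIMARY KEY, c CHAR(1), val INTEGER);
INSERT INTO base_t VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO update_t VALUES (2, 'U', 21), (3, 'D', NULL), (4, 'A', 41);
""")

# Rows added or updated come from update_t; base_t rows survive only
# when update_t has no row at all for that id (so 'D' rows vanish).
merged = conn.execute("""
SELECT id, val FROM update_t WHERE c IN ('A', 'U')
UNION ALL
SELECT b.id, b.val FROM base_t AS b
WHERE NOT EXISTS (SELECT 1 FROM update_t AS u WHERE u.id = b.id)
ORDER BY id
""").fetchall()

print(merged)  # [(1, 10), (2, 21), (4, 41)]
```

The same SELECT should also run unchanged in PostgreSQL, since it only uses UNION ALL and a correlated NOT EXISTS.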

Related

SQL Query Optimization to retrieve non-null entries

I need help optimizing a SQL query. I have found a way to solve the problem using UNION ALL, but I am worried that performance will suffer because the record set in the production environment is huge.
I have a table of records in the format below. For each UniqueID, I need to retrieve the non-null entries if any are available, and otherwise pick the null entries.
In the case below, the query should exclude RowIDs 1 and 7 and retrieve everything else, because non-null entries exist for those UniqueIDs.
RowID UniqueID TrackId
1 325 NULL
2 325 8zUAC
3 325 99XER
4 427 NULL
5 632 2kYCV
6 533 NULL
7 774 NULL
8 774 94UAC
-- UNION ALL query
SELECT A.* FROM
( SELECT * FROM [MY_PKG].[TEMP] WHERE TRACKID is not null) A
WHERE A.UNIQUEID in
( SELECT UNIQUEID FROM [MY_PKG].[TEMP] WHERE TRACKID is null
)
UNION ALL
SELECT B.* FROM
( SELECT * FROM [MY_PKG].[TEMP] WHERE TRACKID is null) B
WHERE B.UNIQUEID not in
( SELECT UNIQUEID FROM [MY_PKG].[TEMP] WHERE TRACKID is not null
)
Temp table creation script
CREATE TABLE MY_PKG.TEMP
( UNIQUEID varchar(3),
TRACKID varchar(5)
);
INSERT INTO MY_PKG.TEMP
( UNIQUEID, TRACKID)
VALUES
('325',null),
('325','8zUAC'),
('325','99XER'),
('427',null),
('632','2kYCV'),
('533',null),
('774',null),
('774','94UAC')
You can use the NOT EXISTS operator with a correlated subquery:
SELECT * FROM TEMP T
WHERE TRACKID IS NOT NULL
OR (TRACKID IS NULL
AND NOT EXISTS(
SELECT 1 FROM TEMP D
WHERE D.UNIQUEID = T.UNIQUEID AND
D.TRACKID IS NOT NULL)
)
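As a quick way to try the NOT EXISTS query on the sample rows, here is a small sketch using Python's sqlite3 module. The table name track_temp and the explicit row_id column are assumptions made for this illustration (the original table is MY_PKG.TEMP on a different database engine), and the data follows the table shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track_temp (row_id INTEGER PRIMARY KEY, uniqueid TEXT, trackid TEXT);
INSERT INTO track_temp (row_id, uniqueid, trackid) VALUES
  (1, '325', NULL), (2, '325', '8zUAC'), (3, '325', '99XER'),
  (4, '427', NULL), (5, '632', '2kYCV'), (6, '533', NULL),
  (7, '774', NULL), (8, '774', '94UAC');
""")

# Keep every non-null row; keep a null row only when its uniqueid
# has no non-null sibling.
kept_ids = [row[0] for row in conn.execute("""
SELECT t.row_id FROM track_temp AS t
WHERE t.trackid IS NOT NULL
   OR NOT EXISTS (
        SELECT 1 FROM track_temp AS d
        WHERE d.uniqueid = t.uniqueid AND d.trackid IS NOT NULL)
ORDER BY t.row_id
""")]

print(kept_ids)  # [2, 3, 4, 5, 6, 8]
```

The inner TRACKID IS NULL test from the answer is dropped here because it is implied once the first branch of the OR has failed. On a large table, an index on the UNIQUEID column would probably matter more than the exact shape of the query.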

BigQuery: assign a value to table 1 based on table 2

I would like to update Table 2 based on Table 1, which is given by:
Row sample_id PIK3CA_features
1 huDBF9DD chr3_3268035_CT
2 huDBF9DD chr3_3268043_AT
3 huDBF9DD chr3_3268049_T
Table 2:
Row sample_id chr3_3268035_CT chr3_3268043_AT chr3_3268049_C
1 huDBF9DD 1 1 null
2 huDBF9De null null null
3 huDBF9Dw null null null
For each row in Table 1, if its sample_id has a corresponding row in Table 2, I'd like to set the respective PIK3CA_features column in Table 2 to 1.
How can I pass the sample_id and PIK3CA_features values from Table 1 as parameters to update Table 2 in a SQL command?
You can use an UPDATE statement to accomplish this. Assuming I understand correctly, you want something like this query:
#standardSQL
UPDATE table2 AS t2
SET
chr3_3268035_CT =
IF(t1.PIK3CA_features = 'chr3_3268035_CT', 1, chr3_3268035_CT),
chr3_3268043_AT =
IF(t1.PIK3CA_features = 'chr3_3268043_AT', 1, chr3_3268043_AT),
chr3_3268049_C =
IF(t1.PIK3CA_features = 'chr3_3268049_C', 1, chr3_3268049_C)
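FROM table1 AS t1
WHERE t2.sample_id = t1.sample_id;
This sets the appropriate column in Table 2 to 1, based on the value of PIK3CA_features, for the row with the matching sample_id. Note that BigQuery expects each target row to match at most one source row in an UPDATE ... FROM, so if a sample_id has several rows in Table 1 (as in the example), you may need to aggregate Table 1 down to one row per sample_id first. If you have a lot of these columns, you can generate the query using Python or some other programming language, or you can generate all the column_name=expression pairs using a query: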
#standardSQL
SELECT
STRING_AGG(FORMAT('%s=IF(t1.PIK3CA_features="%s",1,%s)',
PIK3CA_features, PIK3CA_features, PIK3CA_features), ',\n')
FROM (
SELECT DISTINCT PIK3CA_features
FROM table1
);
This produces a list like:
chr3_3268035_CT=IF(t1.PIK3CA_features="chr3_3268035_CT",1,chr3_3268035_CT),
chr3_3268049_C=IF(t1.PIK3CA_features="chr3_3268049_C",1,chr3_3268049_C),
chr3_3268043_AT=IF(t1.PIK3CA_features="chr3_3268043_AT",1,chr3_3268043_AT)
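For the "generate the query using Python" route, a minimal sketch could look like the following. The function name set_clauses is hypothetical, and the feature list is assumed to have been fetched from table1 already. Because the names are spliced into SQL text as column identifiers, the sketch rejects anything that does not look like a plain identifier.

```python
import re

# Conservative pattern for a column name: letters, digits, underscores.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def set_clauses(features):
    """Build the column_name=expression pairs for the UPDATE's SET list."""
    unique = list(dict.fromkeys(features))  # de-duplicate, keep first-seen order
    for name in unique:
        if not IDENTIFIER.match(name):
            raise ValueError(f"not a plain column name: {name!r}")
    return ",\n".join(
        f'{name}=IF(t1.PIK3CA_features="{name}",1,{name})' for name in unique
    )


clauses = set_clauses(
    ["chr3_3268035_CT", "chr3_3268043_AT", "chr3_3268049_C", "chr3_3268035_CT"]
)
print(clauses)
```

The printed text has the same shape as the list above, one pair per distinct feature, and can be pasted between SET and FROM in the UPDATE statement.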

Derive groups of records that match over multiple columns, but where some column values might be NULL

I would like an efficient means of deriving groups of matching records across multiple fields. Let's say I have the following table:
CREATE TABLE cust
(
id INT NOT NULL,
class VARCHAR(1) NULL,
cust_type VARCHAR(1) NULL,
terms VARCHAR(1) NULL
);
INSERT INTO cust
VALUES
(1,'A',NULL,'C'),
(2,NULL,'B','C'),
(3,'A','B',NULL),
(4,NULL,NULL,'C'),
(5,'D','E',NULL),
(6,'D',NULL,NULL);
What I am looking to get is the set of IDs for which matching values unify a set of records over the three fields (class, cust_type and terms), so that I can apply a unique ID to the group.
In the example, records 1-4 constitute one match group over the three fields, while records 5-6 form a separate match.
The following does the job:
SELECT
DISTINCT
a.id,
DENSE_RANK() OVER (ORDER BY max(b.class),max(b.cust_type),max(b.terms)) AS match_group
FROM cust AS a
INNER JOIN
cust AS b
ON
a.class = b.class
OR a.cust_type = b.cust_type
OR a.terms = b.terms
GROUP BY a.id
ORDER BY a.id
id match_group
-- -----------
1 1
2 1
3 1
4 1
5 2
6 2
But is there a better way? Running this query on a table of over a million rows is painful...
As Graham pointed out in the comments, the above query doesn't satisfy the requirements if another record is added that would group all the records together.
The following values should be grouped together in one group:
INSERT INTO cust
VALUES
(1,'A',NULL,'C'),
(2,NULL,'B','C'),
(3,'A','B',NULL),
(4,NULL,NULL,'C'),
(5,'D','E',NULL),
(6,'D',NULL,NULL),
(7,'D','B','C');
Would yield:
id match_group
-- -----------
1 1
2 1
3 1
4 1
5 1
6 1
7 1
...because the class value of D groups records 5, 6 and 7. The terms value of C matches records 1, 2 and 4 to that group, and the cust_type value B (or class value A) pulls in record 3.
Hopefully that all makes sense.
I don't think you can do this with a (recursive) SELECT.
I did something similar (trying to identify unique households) using a temporary table and repeated updates, with the following logic:
For each class, cust_type and terms value, get the minimum id and update the temp table with it:
update temp
from
(
SELECT
class, -- similar for cust_type & terms
min(id) as min_id
from temp
group by class
) x
set id = min_id
where temp.class = x.class
and temp.id <> x.min_id
;
Repeat all three updates until none of them updates a row.
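The repeat-until-stable logic can be sketched outside the database to see how it converges. This is an illustration of the idea in plain Python, not the Teradata code itself: the function name match_groups is made up for the example, NULL is modelled as None, and each record keeps its original id while a separate group label is lowered to the minimum label seen for any shared non-NULL value.

```python
def match_groups(rows):
    """rows: (id, class, cust_type, terms) tuples; None stands for NULL."""
    # Every record starts in its own group, labelled with its own id.
    group = {row[0]: row[0] for row in rows}
    changed = True
    while changed:  # repeat until a full pass updates nothing
        changed = False
        for col in (1, 2, 3):  # class, cust_type, terms
            # Smallest current label per non-NULL value of this column.
            lowest = {}
            for row in rows:
                value = row[col]
                if value is not None:
                    label = group[row[0]]
                    lowest[value] = min(lowest.get(value, label), label)
            # Pull every record sharing that value down to the smallest label.
            for row in rows:
                value = row[col]
                if value is not None and group[row[0]] > lowest[value]:
                    group[row[0]] = lowest[value]
                    changed = True
    return group


cust = [
    (1, "A", None, "C"), (2, None, "B", "C"), (3, "A", "B", None),
    (4, None, None, "C"), (5, "D", "E", None), (6, "D", None, None),
]
print(match_groups(cust))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 5, 6: 5}
print(match_groups(cust + [(7, "D", "B", "C")]))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}
```

The labels are minimum ids rather than consecutive numbers; a DENSE_RANK over the label would turn them into 1, 2, ... if that is needed.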

Insert value from two tables based on the rows source table

I have two tables, A and B, each with a single column for simplicity's sake; that column is the primary key in both.
A contains the values (1,2,3) and B contains (1,2,3).
The third table needs to hold the rows of both A and B and has a composite primary key.
Table C (id, src)
If the id is coming from table A, I'd like src to be 'A', and if it's coming from B, then 'B'.
There can be duplicate IDs between the tables, but they are not the same item, which is why I need to create a composite key based on which table the row is coming from.
I've tried
Insert into C (anID, src)
Select
Case when (A.anID is not null)
then A.anID else B.anID end,
case when (A.anID is not null)
then 'A' else 'B' end
from
A,
B
But my results always end up as just 3 rows, (1, A), (2, A), (3, A), when there should be 6 rows (one of each of those with a B as well).
insert into TableC (id, src)
select ID, 'A' from tableA
union
select ID, 'B' from tableB
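Here is a small runnable check of that approach using Python's sqlite3 module. The lowercase table names a, b and c are stand-ins for the tables in the question. The sketch uses UNION ALL instead of UNION, on the assumption that id is a primary key in each source table, so the two branches can never produce the same (id, src) pair and the duplicate-removal step is unnecessary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY);
CREATE TABLE c (id INTEGER, src TEXT, PRIMARY KEY (id, src));
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (1), (2), (3);

INSERT INTO c (id, src)
SELECT id, 'A' FROM a
UNION ALL
SELECT id, 'B' FROM b;
""")

# Each id now appears twice, once per source table.
rows_c = conn.execute("SELECT id, src FROM c ORDER BY id, src").fetchall()
print(rows_c)
# [(1, 'A'), (1, 'B'), (2, 'A'), (2, 'B'), (3, 'A'), (3, 'B')]
```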

How do I update existing records using a conditional clause?

I'm new to Oracle SQL, so I have a question. I have two tables, Table A and Table B, with the same column names, but in Table A only one column (named 'tracker') actually has data in it. The rest of the columns in Table A are empty. What I need to do is update each record in Table A so that the values for the other columns are copied over from Table B, on the condition that the 'tracker' value in Table A matches the 'tracker' value in Table B.
Any ideas?
MERGE INTO tableA a
USING tableB b
ON (a.tracker=b.tracker)
WHEN MATCHED THEN UPDATE SET
a.column1=b.column1,
a.column2=b.column2;
And if there are rows in B that do not exist in A:
MERGE INTO tableA a
USING tableB b
ON (a.tracker=b.tracker)
WHEN MATCHED THEN UPDATE SET
a.column1=b.column1,
a.column2=b.column2
WHEN NOT MATCHED THEN INSERT (tracker, column1, column2) -- list all columns
VALUES (b.tracker, b.column1, b.column2);
create table a (somedata varchar2(50), tracker number, constraint pk_a primary key (tracker));
create table b (somedata varchar2(50), tracker number, constraint pk_b primary key (tracker));
--insert some data
insert into a (somedata, tracker)
select 'data-a-' || level, level
from dual
connect by level < 10;
insert into b (somedata, tracker)
select 'data-b-' || -level, level
from dual
connect by level < 10;
select * from a;
SOMEDATA TRACKER
-------------------------------------------------- -------
data-a-1 1
data-a-2 2
data-a-3 3
data-a-4 4
data-a-5 5
data-a-6 6
data-a-7 7
data-a-8 8
data-a-9 9
select * from b;
SOMEDATA TRACKER
-------------------------------------------------- -------
data-b--1 1
data-b--2 2
data-b--3 3
data-b--4 4
data-b--5 5
data-b--6 6
data-b--7 7
data-b--8 8
data-b--9 9
commit;
update (select a.somedata a_somedata, b.somedata b_somedata
from a
inner join
b
on a.tracker = b.tracker)
set
a_somedata = b_somedata;
select * from a; --see below for results--
-- Alternatively, roll back to restore the previous state and do it this way.
-- For a one-column update either way works; I prefer the former in case a multi-column update is necessary.
-- MERGE (as posted in another answer) will also work.
rollback;
update a
set somedata = (select somedata
from b
where a.tracker = b.tracker
);
select * from A; --see below for results--
-- clean up
-- drop table a;
-- drop table b;
This will give you the results:
SOMEDATA TRACKER
-------------------------------------------------- -------
data-b--1 1
data-b--2 2
data-b--3 3
data-b--4 4
data-b--5 5
data-b--6 6
data-b--7 7
data-b--8 8
data-b--9 9
See Oracle's documentation on UPDATE for more details.
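One thing to watch with the correlated-subquery form of the update: any row in a that has no matching tracker in b gets its column set to NULL unless the update is restricted to matching rows. The sketch below demonstrates this with Python's sqlite3 module rather than Oracle, with a deliberately shortened data set in which tracker 3 has no counterpart in b; the WHERE EXISTS guard is an addition for illustration, not part of the script above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (somedata TEXT, tracker INTEGER PRIMARY KEY);
CREATE TABLE b (somedata TEXT, tracker INTEGER PRIMARY KEY);
INSERT INTO a VALUES ('data-a-1', 1), ('data-a-2', 2), ('data-a-3', 3);
INSERT INTO b VALUES ('data-b--1', 1), ('data-b--2', 2);

UPDATE a
SET somedata = (SELECT b.somedata FROM b WHERE b.tracker = a.tracker)
WHERE EXISTS (SELECT 1 FROM b WHERE b.tracker = a.tracker);
""")

# Tracker 3 keeps its old value because the guard skipped it.
rows_a = conn.execute("SELECT tracker, somedata FROM a ORDER BY tracker").fetchall()
print(rows_a)
# [(1, 'data-b--1'), (2, 'data-b--2'), (3, 'data-a-3')]
```

In the full example above every tracker exists in both tables, so the guard makes no difference there; it only matters when b is missing some trackers.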