How do you merge two tables with multiple unique identifiers? - sql

Hey, I have two tables with the same columns. The first table is the main table, and I want to upsert it with new unique entries from the _tmp_ table.
For example, the columns are:
id, text_id, last_sent, recent_sent, updated_at, date_created
I want to merge a communicated _tmp_ table, which is created from another table, into the communicated table, but only insert rows for which the communicated table doesn't already have an identical id, text_id, last_sent and recent_sent.
The query I'm using now is posted below, but it doesn't work: it inserts all the data from the _tmp_ table.
I have checked, and the column types of both tables are the same. I just don't know what I'm doing wrong.
Help much appreciated.
MERGE
`project.map.communicated` CURRENT_TABLE
USING
`project.map.communicated_tmp_` NEW_OR_UPDATED
ON
(CURRENT_TABLE.id = NEW_OR_UPDATED.id
AND CURRENT_TABLE.text_id = NEW_OR_UPDATED.text_id
AND CURRENT_TABLE.last_sent = NEW_OR_UPDATED.last_sent
AND CURRENT_TABLE.recent_sent = NEW_OR_UPDATED.recent_sent)
WHEN NOT MATCHED
THEN
INSERT
(`id`,
`text_id`,
`last_sent`,
`recent_sent`,
`updated_at`,
`date_created`)
VALUES
(`id`,`text_id`,`last_sent`,`recent_sent`,`updated_at`,`date_created`)

The MERGE statement uses JOIN logic to find matches. The only reason this should not work is if there are rows that have NULLs in any of the fields you use for the join: a comparison such as NULL = NULL does not evaluate to TRUE, so those rows never match and are inserted every time. Make sure to exclude the NULLs, or build a composite key that works around the NULL values.
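The NULL behaviour described above can be reproduced in a small sketch. It uses Python's built-in sqlite3 module rather than BigQuery, so it is an illustration under assumptions: SQLite has no MERGE, so the "when not matched, insert" step is written as INSERT ... WHERE NOT EXISTS, and the null-safe comparison is SQLite's IS operator. The table contents and the function name insert_unmatched are made up for the example. In BigQuery the analogous null-safe comparison would likely be IS NOT DISTINCT FROM or a COALESCE to a sentinel value, but check that against your dialect.

```python
import sqlite3


def insert_unmatched(null_safe: bool) -> int:
    """Insert tmp rows with no match in the main table; return the main table's row count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE communicated (id INTEGER, text_id INTEGER, last_sent TEXT, recent_sent TEXT);
        CREATE TABLE communicated_tmp (id INTEGER, text_id INTEGER, last_sent TEXT, recent_sent TEXT);
        -- The same logical row exists in both tables, with a NULL in recent_sent.
        INSERT INTO communicated VALUES (1, 10, '2021-01-01', NULL);
        INSERT INTO communicated_tmp VALUES (1, 10, '2021-01-01', NULL);
    """)
    # "=" yields NULL when either side is NULL; "IS" treats two NULLs as equal.
    op = "IS" if null_safe else "="
    conn.execute(f"""
        INSERT INTO communicated
        SELECT * FROM communicated_tmp AS n
        WHERE NOT EXISTS (
            SELECT 1 FROM communicated AS c
            WHERE c.id {op} n.id
              AND c.text_id {op} n.text_id
              AND c.last_sent {op} n.last_sent
              AND c.recent_sent {op} n.recent_sent)
    """)
    count = conn.execute("SELECT COUNT(*) FROM communicated").fetchone()[0]
    conn.close()
    return count


plain_count = insert_unmatched(null_safe=False)    # the NULL row is inserted again
null_safe_count = insert_unmatched(null_safe=True)  # the NULL row is recognised as a match
```

With the plain equality the main table ends up with a duplicate, which matches the symptom in the question where every row from the _tmp_ table is inserted.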

Related

Selecting distinct rows from two tables and replacing the values

I have a base_table and a final_table with the same columns, with plan and date being the primary keys. The data flows from the base table to the final table.
Initially final table will look like below:
After that the base table will have
Now the data needs to flow from the base table to the final table, based on the primary key columns (plan, date) and distinct rows. The final table should have:
The first two rows get updated with the new percentage values from the base table.
How do we write a SQL query for this?
I am looking to write this query in Redshift SQL.
Pseudo code tried:
insert into final_table
(plan, date, percentage)
select
b.plan, b.date, b.percentage from base_table b
inner join final_table f on b.plan = f.plan and b.date = f.date;
First, you need to understand that clustered (distributed) columnar databases like Redshift and Snowflake don't enforce uniqueness constraints (that would be a performance killer). So your pseudo code is incorrect, as it will create duplicate rows in final_table.
You could use UPDATE to change the values in the rows with matching PKs. However, this won't work in the case where there are new values to be added to final_table. I expect you need a more general solution that works in the case of updated values AND new values.
The general way to address this is to create an "upsert" transaction that deletes the matching rows and then inserts rows into the target table. A transaction is needed so no other session can see the table where the rows are deleted but not yet inserted. It looks like:
begin;
delete from final_table
using base_table
where final_table.plan = base_table.plan
and final_table.date = base_table.date;
insert into final_table
select * from base_table;
commit;
Things to remember: 1) autocommit mode can break the transaction; 2) you should vacuum and analyze the table if the number of rows changed is large.
Based on your description, it is not clear that I have captured the full intent of your situation ("distinct rows from two tables"). If I have missed the intent, please update the question.
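The delete-then-insert transaction can be tried out locally with a minimal sketch like the one below. It runs against SQLite through Python's sqlite3 module, not Redshift, so some details are assumptions: SQLite has no DELETE ... USING, so the delete is written with a correlated EXISTS, and the sample rows and the function name upsert_by_delete_insert are invented for illustration. The connection is opened with isolation_level=None so that the transaction boundaries are the explicit BEGIN and COMMIT.

```python
import sqlite3


def upsert_by_delete_insert(conn: sqlite3.Connection) -> None:
    """Replace matching rows in final_table with base_table rows inside one transaction."""
    conn.execute("BEGIN")
    try:
        # Remove rows whose (plan, date) key also exists in base_table.
        conn.execute("""
            DELETE FROM final_table
            WHERE EXISTS (
                SELECT 1 FROM base_table b
                WHERE b.plan = final_table.plan AND b.date = final_table.date)
        """)
        # Re-insert updated rows together with brand-new rows.
        conn.execute("INSERT INTO final_table SELECT * FROM base_table")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE final_table (plan TEXT, date TEXT, percentage REAL);
    CREATE TABLE base_table (plan TEXT, date TEXT, percentage REAL);
    INSERT INTO final_table VALUES
        ('A', '2021-01-01', 10), ('B', '2021-01-01', 20), ('C', '2021-01-01', 30);
    INSERT INTO base_table VALUES
        ('A', '2021-01-01', 15), ('B', '2021-01-01', 25), ('D', '2021-01-01', 40);
""")
upsert_by_delete_insert(conn)
rows = conn.execute(
    "SELECT plan, percentage FROM final_table ORDER BY plan").fetchall()
```

Plans A and B take the new percentages, C is left untouched, and D is added, with no duplicate (plan, date) pairs.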
You don't need an INSERT statement but an UPDATE statement -
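-- Correlate the target table with base_table directly; joining final_table
-- again in FROM would leave the rows being updated uncorrelated.
UPDATE final_table
SET percentage = b.percentage
FROM base_table b
WHERE final_table.plan = b.plan
AND final_table.date = b.date;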

How to update numerical column of one table based on matching string column from another table in SQL

I want to update the numerical columns of one table based on matching string columns from another table, i.e.:
I have a table (let's say table1) with 100 records containing 5 string (or text) columns and 10 numerical columns. Now I have another table that has the same structure (columns) and 20 records. In this table, a few records contain updated data for table1, i.e., the numerical column values are updated for these records, and the rest are new (both text and numerical columns).
I want to update the numerical columns for records with the same text columns (in table1) and insert new data from table2 into table1 where the text columns are also new.
I thought of taking an intersect of these two tables and then updating, but couldn't figure out the logic for how to update the numerical columns.
Note: I don't have any primary or unique key columns.
Please help here.
Thanks in advance.
The simplest solution would be to use two separate queries, such as:
UPDATE b
SET b.[NumericColumn] = a.[NumericColumn],
etc...
FROM [dbo].[SourceTable] a
JOIN [dbo].[DestinationTable] b
ON a.[StringColumn1] = b.[StringColumn1]
AND a.[StringColumn2] = b.[StringColumn2] etc...
INSERT INTO [dbo].[DestinationTable] (
[NumericColumn],
[StringColumn1],
[StringColumn2],
etc...
)
SELECT a.[NumericColumn],
a.[StringColumn1],
a.[StringColumn2],
etc...
FROM [dbo].[SourceTable] a
LEFT JOIN [dbo].[DestinationTable] b
ON a.[StringColumn1] = b.[StringColumn1]
AND a.[StringColumn2] = b.[StringColumn2] etc...
WHERE b.[NumericColumn] IS NULL
--assumes that [NumericColumn] is non-nullable.
--If there are no non-nullable columns then you
--will have to structure your query differently
This will be effective if you are working with a small dataset that does not change very frequently and you are not worried about high contention.
There are still a number of issues with this approach - most notably what happens if either the source or destination table is accessed and/or modified while the update statement is running. Some of these issues can be worked around other ways but so much depends on the context of how the tables are used that it is difficult to provide a more effective generically-applicable solution.
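As a rough illustration of the two-query approach, here is a sketch that runs on SQLite via Python's sqlite3 module instead of SQL Server. The table and column names (destination, source, name, region, amount) are hypothetical stand-ins for the bracketed placeholders above. The insert step uses NOT EXISTS rather than the LEFT JOIN ... IS NULL test, which is one way to restructure the query when no column is guaranteed to be non-nullable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE destination (name TEXT, region TEXT, amount INTEGER);
    CREATE TABLE source (name TEXT, region TEXT, amount INTEGER);
    INSERT INTO destination VALUES ('alpha', 'east', 1), ('beta', 'west', 2);
    INSERT INTO source VALUES ('alpha', 'east', 100), ('gamma', 'north', 3);
""")

# Run both statements in one transaction so readers never see a half-applied merge.
with conn:
    # Step 1: update numeric values where all text columns match.
    conn.execute("""
        UPDATE destination
        SET amount = (SELECT s.amount FROM source s
                      WHERE s.name = destination.name
                        AND s.region = destination.region)
        WHERE EXISTS (SELECT 1 FROM source s
                      WHERE s.name = destination.name
                        AND s.region = destination.region)
    """)
    # Step 2: insert source rows whose text columns have no match yet.
    conn.execute("""
        INSERT INTO destination (name, region, amount)
        SELECT s.name, s.region, s.amount
        FROM source s
        WHERE NOT EXISTS (SELECT 1 FROM destination d
                          WHERE d.name = s.name AND d.region = s.region)
    """)

result = conn.execute(
    "SELECT name, region, amount FROM destination ORDER BY name").fetchall()
```

Since there is no unique key, this assumes the text columns identify at most one source row. If several source rows share the same text values, the scalar subquery in step 1 picks one of them arbitrarily.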

Joining Different Database Tables

I have two tables in Access pulling from databases. There is no primary key linking the two tables, because the databases pull from different programs.
Using SQL, I need all of the information from both tables to pull into a query, and this is where I have problems. The two tables are pulling the same data, but their column titles might not necessarily be the same. For now, I'm assuming they are. How can I get the data from both tables to pull into the correct column together?
Here's an example of code (I can't post the real code for certain reasons):
SELECT system1_vehiclecolor, system1_vehicleweight, system1_licenseplate, system2_vehiclecolor, system2_vehicleweight, system2_licenseplate
FROM system1, system2
To further explain this, I want the table to have a column for vehiclecolor, vehicleweight, and licenseplate that combines all of the information. Currently, the way I have it, it is making a column for each of the names in each table, which isn't what I want.
You can use two queries to get this done:
SELECT col1 AS c1, col2 AS c2 INTO resulttable FROM table1
INSERT INTO resulttable (c1, c2) SELECT colX AS c1, colY AS c2 FROM table2
Hope this helps.
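If a stored result table is not needed, the same stacking of rows can be sketched with a single UNION ALL query. The example below runs on SQLite through Python's sqlite3 module, not Access, and the sample rows are invented. The column names come from the question, and the output names are taken from the aliases in the first SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE system1 (system1_vehiclecolor TEXT,
                          system1_vehicleweight INTEGER,
                          system1_licenseplate TEXT);
    CREATE TABLE system2 (system2_vehiclecolor TEXT,
                          system2_vehicleweight INTEGER,
                          system2_licenseplate TEXT);
    INSERT INTO system1 VALUES ('red', 1200, 'ABC123');
    INSERT INTO system2 VALUES ('blue', 1500, 'XYZ789');
""")

# Each SELECT supplies rows for the same three output columns, matched by position.
combined = conn.execute("""
    SELECT system1_vehiclecolor  AS vehiclecolor,
           system1_vehicleweight AS vehicleweight,
           system1_licenseplate  AS licenseplate
    FROM system1
    UNION ALL
    SELECT system2_vehiclecolor, system2_vehicleweight, system2_licenseplate
    FROM system2
    ORDER BY licenseplate
""").fetchall()
```

UNION ALL keeps every row from both tables. Plain UNION would also remove rows that are identical in both systems, which may or may not be what you want.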

Insert with select, dependent on the values in the table inserting into EDITED

So I need to figure out how to insert into a table, from another table, with a where clause that requires me to access the table that I am inserting into. I tried an alias from the table I am inserting into, but I quickly found out that you cannot do that. Basically, what I want to check is that the values that I am inserting into the table match a particular field within the table that I am inserting into. Here is what I've tried:
INSERT INTO "USER"."TABLE1" AS A1
SELECT *
FROM "USER"."TABLE2" AS A2
WHERE A2."HIERARCHYLEVEL" = 2
AND A2."PARENT" = A1."INSTANCE"
Obviously, this was to no avail. I've tried a couple of other queries, but they didn't get me anywhere, either. Any help would be much appreciated.
EDIT:
I would like to add rows to this table, not add columns to the table. The two tables are of the exact same structure; in fact, I extracted the data already in table1 from table2. What I have in table1 currently is a bunch of records that have NO PARENT but do have an instance. What I want to add is all the records in table2 whose parent equals an instance in table1.
Currently there is no way to join on the target table when inserting. The solution with the subselect, where you select from the table, is the correct one.
Aliasing the table you want to change is only possible with UPDATE, UPSERT and MERGE. For these operations it makes sense, as you need to match a column and then decide whether to update it or insert something instead. In your example, the row from table1 that you match is not relevant, as you don't want to change it, so from the statement's point of view it does not matter that the table you use in your subselect is the same as the one you insert into.
As an alternative, I can suggest the following solution, which is equivalent to yours:
INSERT INTO "user"."table1"
SELECT
A1."ROOT",
A1."INSTANCE",
A1."PARENT",
A1."HIERARCHYLEVEL"
FROM "user"."table2" AS A1
-- A2 was not defined here; the HIERARCHYLEVEL filter applies to table1, as in the join-based query
WHERE A1."INSTANCE" IN (SELECT "PARENT" FROM "user"."table1" WHERE "HIERARCHYLEVEL" = 2)
This gave me the answer I was looking for, although I am sure there is an easier -- or more efficient -- way to do it.
INSERT INTO "user"."table1"
SELECT
A1."ROOT",
A1."INSTANCE",
A1."PARENT",
A1."HIERARCHYLEVEL"
FROM "user"."table2" AS A1,
"user"."table1" AS A2
WHERE A1."INSTANCE" = A2."PARENT"
AND A2."HIERARCHYLEVEL" = 2
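One difference between the join form and the IN-subquery form is worth checking before choosing between them: when several rows in table1 share the same PARENT, the join returns the matching table2 row once per match, while IN returns it once. The sketch below shows this by counting the rows each SELECT would feed into the INSERT. It uses SQLite via Python's sqlite3 module, with lowercase unquoted names and invented sample rows as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (root TEXT, instance TEXT, parent TEXT, hierarchylevel INTEGER);
    CREATE TABLE table2 (root TEXT, instance TEXT, parent TEXT, hierarchylevel INTEGER);
    -- Two rows in table1 share the same parent 'P1'.
    INSERT INTO table1 VALUES ('r', 'C1', 'P1', 2), ('r', 'C2', 'P1', 2);
    INSERT INTO table2 VALUES ('r', 'P1', NULL, 1), ('r', 'P9', NULL, 1);
""")

# Join form: one output row per (table2 row, matching table1 row) pair.
join_count = conn.execute("""
    SELECT COUNT(*)
    FROM table2 AS a1, table1 AS a2
    WHERE a1.instance = a2.parent
      AND a2.hierarchylevel = 2
""").fetchone()[0]

# IN form: each table2 row appears at most once, however many matches exist.
in_count = conn.execute("""
    SELECT COUNT(*)
    FROM table2 AS a1
    WHERE a1.instance IN (SELECT parent FROM table1 WHERE hierarchylevel = 2)
""").fetchone()[0]
```

Here the join would insert the 'P1' row twice and the IN form once, so the two statements only give the same result when each PARENT value occurs once in table1.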

SQL inserting rows from multiple tables

I have got an assignment. We have been given a table, MAIN_TABLE, which has a column patient_id as a foreign key.
I need to make a separate table named patient which has patient_id as a primary key along with some other attributes such as name and address.
I did successfully create the schema of this table. Now there is a serious problem I am facing. After creating this table, I used an insert statement to insert values for name and address from a dummy table.
Up to this point everything works fine. However, the column patient_id is still empty; or rather, I have set it to 0 by default.
Now the problem is that I need to get values into this column, patient_id, from the patient_id column of MAIN_TABLE.
I can't figure out how to do this. I did try to use:
UPDATE patient
SET patient_id=(select id from MAIN_TABLE)
But this gives me an error that multiple rows were returned, which does make sense, but what condition do I put in the where clause then?
That sounds strange. How can there be a table MAIN_TABLE with a foreign key patient_id when the master table patient does not exist? Where do the patient_ids in MAIN_TABLE come from?
I suggest not inserting your data from the dummy table alone and then trying to update it. Instead, insert it from MAIN_TABLE and the dummy table joined together. If you cannot join them for the insert, you will not be able to match them during the update either.
Since I think they have no connected primary/foreign keys, the only way to join them is by using a good business key. Do you have a good business key?
You are talking about persons, so first name, last name, birthday and address are often good enough. But you have to think about it.
With the data you have given, I can only give you a kind of meta insert statement, but you will get the point.
Example:
insert into patient (col1, col2, col3)
select
a.colA,
a.colF,
b.colX
from
dummy_table a
inner join MAIN_TABLE b on a.colN=b.colA and a.colM=b.colB
Also, if patient_id is your primary key in patient, you should ensure that duplicate values or NULLs are not even possible in this column. You should use constraints to ensure your data integrity:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/clauses002.htm
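To make the joined insert and the constraint advice concrete, here is a minimal sketch on SQLite via Python's sqlite3 module (the question appears to concern Oracle, so the syntax details may differ). Everything about the data is assumed: the source does not say which columns MAIN_TABLE and the dummy table share, so first_name and last_name are a hypothetical business key, and the helper insert_duplicate_fails is invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layouts: both tables are assumed to carry the person's name.
    CREATE TABLE main_table (patient_id INTEGER, first_name TEXT, last_name TEXT);
    CREATE TABLE dummy_table (first_name TEXT, last_name TEXT, address TEXT);
    -- The PRIMARY KEY constraint rejects duplicate patient_id values.
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY,
                          name TEXT NOT NULL,
                          address TEXT);
    INSERT INTO main_table VALUES (1, 'Ann', 'Lee'), (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO dummy_table VALUES ('Ann', 'Lee', '1 Main St'), ('Bob', 'Ray', '2 Oak Ave');

    -- Insert id, name and address in one step by joining on the business key.
    -- DISTINCT collapses patients that occur several times in main_table.
    INSERT INTO patient (patient_id, name, address)
    SELECT DISTINCT m.patient_id, d.first_name || ' ' || d.last_name, d.address
    FROM dummy_table d
    INNER JOIN main_table m
        ON d.first_name = m.first_name AND d.last_name = m.last_name;
""")

patients = conn.execute(
    "SELECT patient_id, name, address FROM patient ORDER BY patient_id").fetchall()


def insert_duplicate_fails(connection: sqlite3.Connection) -> bool:
    """Return True if the primary key constraint blocks a second patient with id 1."""
    try:
        connection.execute(
            "INSERT INTO patient (patient_id, name) VALUES (1, 'Someone Else')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Because the id is filled in during the insert, there is no later UPDATE that would need a subquery returning exactly one row.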