Copy related rows from two tables into those same tables with new values in SQL?

I have two tables, "table1" and "table2". They contain rows that are related to each other. The table "table2" has pairs of rows with a column named "table1_id" and these pairs refer to the sequential "id" columns in "table1".
I need to copy rows from both tables and insert the copies into the same tables with new data, while maintaining the relationships between them.
I have read several posts on StackOverflow and some articles on "mssqltips.com", but I am still not sure how I should do this. Should I use a cursor or a query with joins and a temporary table? What is the best-practice way to achieve the above task and if possible, could you demonstrate a short example?

You cannot "copy both rows from both tables and put these into the same tables" verbatim if the id column is being copied and is truly an ID (a unique key). If you want to duplicate some other (non-key) columns into new rows with new IDs, that's fine. If that's what you want, you can create the primary key (table1) row(s) first, and reference the newly created keys. How you do this depends on whether the id column is a generated or explicitly specified key, and whether you have any keys (primary or foreign keys) in table2 as well.
It might look something like this:
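-- copy the parent row ("etc" stands for the remaining non-key columns)
insert into table1
select col1, col2, etc
from table1
where id = 'somekey';
-- copy the child rows, pointing them at the new parent key
insert into table2
select 'newkey' as table1_id, col3, col4, etc
from table2
where table1_id = 'somekey';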
If you want to do this atomically, wrap both statements in a transaction (begin transaction ... commit), and use an OUTPUT clause or similar to collect the new IDs; see your database's documentation for the details.
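As a rough illustration of the "create the parent row first, then reference the new key" idea, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no OUTPUT clause, so the new key is read from the cursor's lastrowid instead; the table layout, the col3 column, and the copy_with_children helper are assumptions made up for this example, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         table1_id INTEGER REFERENCES table1(id), col3 TEXT);
    INSERT INTO table1 (col1, col2) VALUES ('a', 'x'), ('b', 'y');
    INSERT INTO table2 (table1_id, col3)
    VALUES (1, 'a-1'), (1, 'a-2'), (2, 'b-1'), (2, 'b-2');
""")

def copy_with_children(conn, old_ids, new_col2):
    """Copy the given table1 rows (with a new col2 value) plus their table2 rows.

    Hypothetical helper: assumes every id in old_ids exists in table1.
    Returns a dict mapping each old table1 id to its newly generated id.
    """
    id_map = {}
    with conn:  # one transaction: commits on success, rolls back on error
        for old_id in old_ids:
            cur = conn.execute(
                "INSERT INTO table1 (col1, col2) "
                "SELECT col1, ? FROM table1 WHERE id = ?",
                (new_col2, old_id),
            )
            id_map[old_id] = cur.lastrowid  # key generated for the copy
            conn.execute(
                "INSERT INTO table2 (table1_id, col3) "
                "SELECT ?, col3 FROM table2 WHERE table1_id = ?",
                (id_map[old_id], old_id),
            )
    return id_map

id_map = copy_with_children(conn, [1, 2], "copied")
```

The loop handles one parent row at a time, which is closer to the cursor approach from the question. On a database with an OUTPUT (or RETURNING) clause, the old-to-new key mapping can usually be collected in a single set-based statement instead.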

Related

How do you merge two tables with multiple unique identifiers?

I have two tables with the same columns. The first table is the main table, and I want to upsert new unique entries from the _tmp_ table into it.
For example:
id, text_id, last_sent, recent_sent, updated_at, date_created
I want to merge a communicated _tmp_ table, which is created from another table, into the communicated table, but only for rows where the communicated table doesn't already have an identical id, text_id, last_sent and recent_sent.
The query I'm using now is posted below, but it doesn't work: it inserts all the data from the _tmp_ table.
I have checked, and the column types of both tables are the same. I just don't know what I'm doing wrong.
Help much appreciated.
MERGE
  `project.map.communicated` CURRENT_TABLE
USING
  `project.map.communicated_tmp_` NEW_OR_UPDATED
ON
  (CURRENT_TABLE.id = NEW_OR_UPDATED.id
  AND CURRENT_TABLE.text_id = NEW_OR_UPDATED.text_id
  AND CURRENT_TABLE.last_sent = NEW_OR_UPDATED.last_sent
  AND CURRENT_TABLE.recent_sent = NEW_OR_UPDATED.recent_sent)
WHEN NOT MATCHED
THEN
  INSERT
    (`id`,
    `text_id`,
    `last_sent`,
    `recent_sent`,
    `updated_at`,
    `date_created`)
  VALUES
    (`id`,`text_id`,`last_sent`,`recent_sent`,`updated_at`,`date_created`)
The MERGE statement uses join logic to find matches. The only reason this should not work is if some rows have NULLs in any of the fields used in the ON condition: a comparison with NULL is never true, so those rows never match and are always inserted. Either exclude the NULLs or build a composite key that works around the NULL values.
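The NULL behaviour can be reproduced in a small sketch with Python's built-in sqlite3 module. SQLite has no MERGE, so an INSERT ... WHERE NOT EXISTS stands in for the "when not matched then insert" branch, and SQLite's NULL-safe IS operator stands in for whatever NULL-safe comparison (or COALESCE to a sentinel value) the target dialect offers. The sample rows and the rows_after_merge helper are invented for illustration.

```python
import sqlite3

def rows_after_merge(null_safe):
    """Insert unmatched rows from communicated_tmp and return the final row count."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE communicated     (id INTEGER, text_id INTEGER, last_sent TEXT, recent_sent TEXT);
        CREATE TABLE communicated_tmp (id INTEGER, text_id INTEGER, last_sent TEXT, recent_sent TEXT);
        -- Row 1 exists in both tables and has a NULL in recent_sent.
        INSERT INTO communicated     VALUES (1, 10, '2020-01-01', NULL);
        INSERT INTO communicated_tmp VALUES (1, 10, '2020-01-01', NULL),
                                            (2, 20, '2020-02-01', '2020-02-02');
    """)
    # "=" yields NULL when either side is NULL; SQLite's "IS" treats NULL IS NULL as true.
    op = "IS" if null_safe else "="
    with conn:
        conn.execute(f"""
            INSERT INTO communicated
            SELECT * FROM communicated_tmp AS n
            WHERE NOT EXISTS (
                SELECT 1 FROM communicated AS c
                WHERE c.id {op} n.id
                  AND c.text_id {op} n.text_id
                  AND c.last_sent {op} n.last_sent
                  AND c.recent_sent {op} n.recent_sent)
        """)
    return conn.execute("SELECT COUNT(*) FROM communicated").fetchone()[0]

plain_count = rows_after_merge(null_safe=False)  # row 1 is inserted a second time
safe_count = rows_after_merge(null_safe=True)    # only row 2 is inserted
```

With the plain equality the table ends up with 3 rows because the NULL row is duplicated; with the NULL-safe comparison it ends up with 2.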

One column key or multiple column key?

Suppose I have two tables in an RDBMS that try to model storage and retrieval of same data based on different key specification. Table1 stores the entire key in a single char column, Table2 stores the key in multiple columns, like so:
Table1: key=String, value=Data
Table2: col1=String, col2=String, value=Data
Table1 key holds the same information as the combination of col1 and col2, plus potentially delimiters, ex. key="NASDAQ/SUNW", col1="NASDAQ", col2="SUNW"
I am interested in efficient data retrieval. Would using Table1 be more efficient than Table2?
If your key contains multiple values, you should separate them into separate columns. That way you could potentially index them separately if the need arises, e.g. if you need to be able to filter a resultset by the second value (in your example, imagine if you needed to find all records with SUNW).
As a rule of thumb, if you find yourself putting delimited values into a single database column, you are probably doing something wrong.
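To make the indexing point concrete, here is a small sketch with Python's built-in sqlite3 module that compares query plans for the two layouts from the question. The index name idx_table2_col2, the sample data, and the plan helper are assumptions for this example; other databases report plans differently, but the general pattern (an index lookup on a separate column versus a scan for a suffix match on a combined key) is what the answer describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key TEXT PRIMARY KEY, value TEXT);
    CREATE TABLE table2 (col1 TEXT, col2 TEXT, value TEXT, PRIMARY KEY (col1, col2));
    -- Separate index on the second key part, only possible because it is its own column.
    CREATE INDEX idx_table2_col2 ON table2 (col2);
    INSERT INTO table1 VALUES ('NASDAQ/SUNW', 'd1'), ('NYSE/IBM', 'd2');
    INSERT INTO table2 VALUES ('NASDAQ', 'SUNW', 'd1'), ('NYSE', 'IBM', 'd2');
""")

def plan(sql):
    """Return SQLite's query plan for a statement as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # 4th column holds the plan detail

# Finding everything for SUNW with the single-column key needs a suffix match.
single_key_plan = plan("SELECT value FROM table1 WHERE key LIKE '%/SUNW'")
# With separate columns the same lookup is a plain equality on col2.
split_key_plan = plan("SELECT value FROM table2 WHERE col2 = 'SUNW'")

print(single_key_plan)
print(split_key_plan)
```

In SQLite the first plan is a full scan of table1, while the second is a search using idx_table2_col2. Lookups by the complete key are an index search in both layouts.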

WHEN NOT MATCHED BY TARGET THEN --Update table - id field and insert

I doubt this can be done, thought I would ask the experts.
I have two large tables and have a need to merge them (excluding a non identity integer key field). All PK's are maintained in a NextId table here.
So, my question is, if I have an insert during MERGE - is there a way to grab the NextId for this table and update this table for each insert?
Unfortunately, this is not possible, but you might be able to generate the IDs beforehand. For example:
1) Generate all IDs and store them in a table variable, then join that table variable in the MERGE statement.
2) If the IDs are contiguous, you can use ROW_NUMBER() OVER (something) + startingID to generate them as part of the MERGE.
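The ROW_NUMBER() variant could look roughly like the sketch below, written with Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). SQLite has no MERGE, so an INSERT ... WHERE NOT EXISTS plays the role of the insert branch. The next_id, target and source tables, their columns, and the choice to match rows by name are all assumptions for illustration; a real system would also need to make sure no other session takes IDs from the NextId table at the same time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE next_id (table_name TEXT PRIMARY KEY, next_id INTEGER);
    CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE source (name TEXT);
    INSERT INTO next_id VALUES ('target', 101);
    INSERT INTO target VALUES (100, 'existing');
    INSERT INTO source VALUES ('existing'), ('new-a'), ('new-b');
""")

with conn:  # single transaction for reading, inserting and bumping the counter
    (start,) = conn.execute(
        "SELECT next_id FROM next_id WHERE table_name = 'target'"
    ).fetchone()
    # Number only the rows that will actually be inserted: start, start + 1, ...
    inserted = conn.execute(
        """
        INSERT INTO target (id, name)
        SELECT ? + ROW_NUMBER() OVER (ORDER BY s.name) - 1, s.name
        FROM source AS s
        WHERE NOT EXISTS (SELECT 1 FROM target AS t WHERE t.name = s.name)
        """,
        (start,),
    ).rowcount
    # Advance the counter by the number of IDs that were consumed.
    conn.execute(
        "UPDATE next_id SET next_id = next_id + ? WHERE table_name = 'target'",
        (inserted,),
    )

rows = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
```

The counter is updated once per batch rather than once per inserted row, which is the main difference from the per-row update the question asks about.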

How to combine identical tables into one table?

I have 100s of millions of unique rows spread across 12 tables in the same database. They all have the same schema/columns. Is there a relatively easy way to combine all of the separate tables into 1 table?
I've tried importing the tables into a single table, but given this is a HUGE size of files/rows, SQL Server is making me wait a long time as if I was importing from a flat file. There has to be an easier/faster way, no?
You haven't given much info about your table structure, but you can probably just do a plain old insert from a select, like below. The example takes all records from Table2 and Table3 that don't already exist in Table1 and inserts them into Table1. You could do this to merge everything from all your 12 tables into a single table.
INSERT INTO Table1
SELECT * FROM Table2
WHERE SomeUniqueKey
NOT IN (SELECT SomeUniqueKey FROM Table1)
UNION
SELECT * FROM Table3
WHERE SomeUniqueKey
NOT IN (SELECT SomeUniqueKey FROM Table1)
--...
Do what Jim says, but first:
1) Drop (or disable) all indices in the destination table.
2) Insert rows from each table, one table at a time.
3) Commit the transaction after each table is appended, otherwise much disk space will be taken up in case of a possible rollback.
4) Re-enable or recreate the indices after you are done.
If there is a possibility of duplicate keys, you may need to retain an index on the key field and have a NOT EXISTS clause to hold back the duplicate records from being added.
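The steps above, in the variant that keeps a key index and filters duplicates with NOT EXISTS, can be sketched with Python's built-in sqlite3 module as follows. The combined, part1 and part2 tables and the append_tables helper are hypothetical stand-ins for the 12 real tables, and SQLite is used only so that the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE combined (some_unique_key INTEGER, payload TEXT);
    -- Keep only the key index during the load so duplicate checks stay fast.
    CREATE UNIQUE INDEX idx_combined_key ON combined (some_unique_key);
    CREATE TABLE part1 (some_unique_key INTEGER, payload TEXT);
    CREATE TABLE part2 (some_unique_key INTEGER, payload TEXT);
    INSERT INTO part1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO part2 VALUES (2, 'b'), (3, 'c');
""")

def append_tables(conn, sources):
    """Append each source table to combined, one table per transaction."""
    for source in sources:  # table names come from a fixed list, not user input
        with conn:  # commit after each table so a rollback never spans all of them
            conn.execute(f"""
                INSERT INTO combined (some_unique_key, payload)
                SELECT s.some_unique_key, s.payload
                FROM {source} AS s
                WHERE NOT EXISTS (
                    SELECT 1 FROM combined AS c
                    WHERE c.some_unique_key = s.some_unique_key)
            """)

append_tables(conn, ["part1", "part2"])
keys = [r[0] for r in conn.execute("SELECT some_unique_key FROM combined ORDER BY 1")]
```

Any other indices on the destination table would be dropped before the loop and recreated after it, as in steps 1 and 4.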

SQL question regarding the insertion of empty tuples to prepare for update statements

I am making a table that will be borrowing a couple of columns from another table, but not all of them. Right now my new table doesn't have anything in it. I want to add X number of empty tuples to my table so that I can then begin adding data from my old table to my new table via update statements.
Is that the best way to do it? What is the syntax for adding empty rows? Thanks
Instead of inserting nulls and then updating them, can't you just insert the data from the other table directly? Use something like this:
INSERT INTO Table1 (col1, col2, col3)
SELECT column1, column2, column3
FROM Table2
WHERE <some condition>
If you still want to insert empty records, then you will have to make sure that all your columns allow nulls. Then you can use something like this -
Table1
PrimaryKey_Col | col1 | col2 | col3
Insert INTO Table1 (PrimaryKey_col) values (<some value>)
This makes sure a new row is inserted with a primary key and the rest of the columns set to null. These records can then be updated later.
No, this is not even a good way, let alone the best way.
If you look at it conceptually, adding empty rows serves no purpose.
In databases, each row of a table corresponds to a true statement (fact). Adding a row with all NULLs (even if possible) records nothing and represents inconsistent data on its own, especially if there are multiple empty records.
Also, if you are even able to add a row with all NULLs to a table, that's an indication that you have no data integrity rules for the row, and that's mostly what databases are about: integrity rules and quality of data. A well-designed database should not accept contradictory or meaningless data (and an empty row is meaningless).
As Pavanred answered, inserting real data is a single command, so there is no benefit in making it two or more commands.