populate a table from another table including logging - sql

I'm trying to populate a table from another table including logging.
For example, there are two tables, A and B.
Data should be copied from B to A.
There is one primary key called id in both tables.
The script should update the matching rows if they exist.
The script should insert the rows from B that are not found in table A.
The data is expected to be around 800k rows, with 15 columns.

I have no idea what you mean by "including logging", but to insert/update from one table to another, use MERGE:
merge into a
using b on (b.id = a.id)
when matched then update
  set col1 = b.col1,
      col2 = b.col2
when not matched then insert (id, col1, col2)
  values (b.id, b.col1, b.col2);
This assumes the PK is named id in both tables.
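One reading of "including logging" is that rows which fail during the load should be captured somewhere. If that's the requirement (this is an assumption), Oracle's DML error logging can be attached to the merge; err$_a is just the default log-table name that DBMS_ERRLOG generates:

```sql
-- Create the error log table once (produces ERR$_A by default):
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'A');
END;
/
-- Same merge as above, but rejected rows go to the log
-- instead of aborting the whole statement:
merge into a
using b on (b.id = a.id)
when matched then update
  set col1 = b.col1,
      col2 = b.col2
when not matched then insert (id, col1, col2)
  values (b.id, b.col1, b.col2)
log errors into err$_a ('merge from b') reject limit unlimited;
```

Each rejected row lands in err$_a together with the Oracle error code, so the load completes and the failures can be inspected afterwards.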

merge into tableA a
using tableB b
on (a.id = b.id)
when matched then update set
a.col1 = b.col1 -- list the remaining columns here
when not matched then insert (id, col1) -- list the columns to insert here
values (b.id, b.col1)
;
800k rows shouldn't be too much to insert in one transaction. If it is too much, use a cursor with BULK COLLECT and split the merge into several steps, passing only part of the data to the USING clause at a time. You will need to test which BULK COLLECT limit gives the best times.
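A sketch of that batched approach, assuming the same a/b tables with columns id, col1, col2 (the limit of 10,000 is only a starting point to tune, not a recommendation):

```sql
DECLARE
  TYPE t_ids  IS TABLE OF b.id%TYPE;
  TYPE t_col1 IS TABLE OF b.col1%TYPE;
  TYPE t_col2 IS TABLE OF b.col2%TYPE;
  l_ids  t_ids;
  l_col1 t_col1;
  l_col2 t_col2;
  CURSOR c_src IS SELECT id, col1, col2 FROM b;
BEGIN
  OPEN c_src;
  LOOP
    FETCH c_src BULK COLLECT INTO l_ids, l_col1, l_col2 LIMIT 10000;
    EXIT WHEN l_ids.COUNT = 0;
    FORALL i IN 1 .. l_ids.COUNT
      MERGE INTO a
      USING (SELECT l_ids(i) AS id, l_col1(i) AS col1, l_col2(i) AS col2
             FROM dual) s
      ON (a.id = s.id)
      WHEN MATCHED THEN UPDATE
        SET a.col1 = s.col1, a.col2 = s.col2
      WHEN NOT MATCHED THEN INSERT (id, col1, col2)
        VALUES (s.id, s.col1, s.col2);
    COMMIT; -- commit per batch to keep undo small
  END LOOP;
  CLOSE c_src;
END;
/
```

Separate scalar collections are used instead of a collection of records because older Oracle versions do not allow referencing record fields inside FORALL DML.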

Related

How to copy data from TableA to TableB with new partitions?

I have TableA that has hundreds of thousands of rows and is still growing. With no partitions, query speed has decreased very noticeably.
So I made a new table called TableB with columns exactly matching TableA (both name and type) in Oracle SQL Developer. (TableA and TableB are in the same database but are not the same table.) I additionally created partitions for TableB.
Now, all I want to do is copy all the data from TableA to TableB in order to test query speeds on the partitioned table.
Since TableB has all the same columns as TableA, I ran:
insert into TableB ( select * from TableA);
What I expected from the statement above was the data to be copied over but instead, I got the error:
Error starting at line : 1 in command -
insert into TableB ( select * from TableA)
Error at Command Line : 1 Column : 1
Error report -
SQL Error: ORA-54013: INSERT operation disallowed on virtual columns
54013. 0000 - "INSERT operation disallowed on virtual columns"
*Cause: Attempted to insert values into a virtual column
*Action: Re-issue the statment without providing values for a virtual column
I looked up Virtual Columns and it seems to be
"When queried, virtual columns appear to be normal table columns, but their values are derived rather than being stored on disc. The syntax for defining a virtual column is listed below."
However, I do not have any data in TableB whatsoever. TableB only has the columns that match TableA so I am unsure as to how my columns can be derived, when there is nothing to derive?
You can use the query
SELECT column_name, virtual_column
FROM user_tab_cols
WHERE table_name = 'TABLEA';
COLUMN_NAME VIRTUAL_COLUMN
----------- --------------
ID NO
COL1 NO
COL2 NO
COL3 YES
Then use
INSERT INTO TABLEB(ID,COL1,COL2) SELECT ID,COL1,COL2 FROM TABLEA;
to skip the virtual columns; their values are calculated from the other columns, so they must not be supplied in the INSERT.
Did you create TableB with derived columns as well? From your question, I presume you created TableB with virtual columns too.
One thing to note: since you have a large volume of records to insert, use direct-path mode for a faster operation, with the APPEND hint as shown below.
Please note that you need not include the virtual columns in the statement below, as they are calculated on the fly.
insert /*+ APPEND */ into tableB (column1, column2, ...columnN) select column1, column2, ...columnN from TableA;

How to synchronize two tables in SSIS

I have a scenario where I need to synchronize two tables in SSIS.
Table A is in database A and Table B is in database B. Both tables have the same schema. I need an SSIS package that synchronizes Table A with Table B in such a way that:
1. It inserts all the records that exist in Table A into Table B,
AND
2. It updates Table B if the same "Key" exists in both, but the record has been updated in Table A.
For example, Table A and B both contain Key = 123, but a few columns in Table A have been updated.
I am thinking about using Merge Joins, but that only helps with the insertion of new records. How can I implement the UPDATE part as well?
1. It inserts all the records that exist in Table A into Table B.
Use a Lookup transformation. The source will be Table A and the lookup will be Table B. Map the common columns in both tables and select the columns you need for insertion. After the lookup, use an OLE DB Destination, map the columns coming from the lookup, and insert them into Table B.
2. Update Table B if the same "Key" exists in both, but the record has been updated in Table A.
Same logic as above: use a lookup, but instead of an OLE DB Destination use an OLE DB Command and write the update SQL:
Update TableB
Set col1 = ?, col2 = ?, ...
Where [Key] = ?
In the column mapping, map the columns coming out of the lookup to the parameters.
Check out this article
Checking to see if a record exists and if so update else insert
Using Merge :
MERGE TableB b
USING TableA a
ON b.Key = a.Key
WHEN MATCHED AND b.Col1<>a.Col1 THEN
UPDATE
SET b.Col1 = a.Col1
WHEN NOT MATCHED BY TARGET THEN
INSERT (Col1, Col2, col3)
VALUES (a.Col1, a.Col2,a.Col3);
You can execute the Merge SQL in Execute SQL Task in Control Flow
Update: The Lookup transformation performs an equi-join between values in the transformation input and values in the reference dataset.
You just need one Data Flow Task.
When a source row has no matching value in the target table, the lookup redirects it to the OLE DB Destination, which inserts it into the target table (Lookup No Match Output).
When the source rows match the target table on the business key, the matched rows are sent to the OLE DB Command, and the update SQL updates the corresponding rows in the target table.
This is just an overview. There is a problem with the above design: whenever the rows match, the target table is updated, regardless of whether any column actually changed. So kindly refer to the above article, or search for the SCD component in SSIS.
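One way to avoid those no-op updates in the Execute SQL Task variant (column names are assumed from the earlier example): guard the WHEN MATCHED branch with an EXISTS/EXCEPT test, which, unlike chained <> comparisons, also treats two NULLs as equal.

```sql
MERGE TableB b
USING TableA a
    ON b.[Key] = a.[Key]
WHEN MATCHED AND EXISTS (SELECT a.Col1, a.Col2, a.Col3
                         EXCEPT
                         SELECT b.Col1, b.Col2, b.Col3) THEN
    -- only rows where at least one column differs reach this branch
    UPDATE SET b.Col1 = a.Col1,
               b.Col2 = a.Col2,
               b.Col3 = a.Col3
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Col1, Col2, Col3)
    VALUES (a.Col1, a.Col2, a.Col3);
```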
Update 2:
MERGE TableB b
USING TableA a
ON b.Key = a.Key
WHEN MATCHED THEN
UPDATE
SET b.Col1 = a.Col1
WHEN NOT MATCHED BY TARGET AND a.IsReady=1 THEN --isReady bit data type
INSERT (Col1, Col2, col3)
VALUES (a.Col1, a.Col2,a.Col3);

Merge SQL With Condition

I have a scenario where I need to use a MERGE SQL statement to synchronize two tables. Let's suppose I have two tables, Table A and Table B. The schema is the same, with the exception of one extra column in Table A. That extra column is a flag that tells me which records are ready to be inserted/updated into Table B. Let's say that flag column is IsReady; it will be either true or false.
Can I use IsReady = True in the MERGE statement, or do I need a temp table, i.e. move all records where IsReady = True from Table A to a temp table and then run the MERGE on the temp table and Table B?
Yes, you can use that column in the merge condition, although it is generally safer to filter the source rows in the USING clause than in ON, so the flag cannot accidentally push rows into the NOT MATCHED branch:
merge tableB targetTable
using (select * from tableA where IsReady = 1) sourceTable
on targetTable.[Key] = sourceTable.[Key] and [any other condition]
when not matched then
insert ...
when matched and [...] then
update ...
This may help you. Note that IsReady is on tableA (the source), so it belongs in a WHEN clause rather than in ON, and ON still needs the join key:
merge into tableB
using tableA
on tableB.[Key] = tableA.[Key]
when not matched and tableA.IsReady = 1 then
insert (field1, field2, ...)
values (tableA.field1, tableA.field2, ...);
commit;

Is it wiser to use a function in between First and Next Insertions based on Select?

PROCEDURE add_values
AS
BEGIN
    INSERT INTO TableA
    SELECT id, name
    FROM TableC; -- this selection will return multiple records
END;
While it inserts into TableA, I would like to insert into another table (TableB) for each record that got inserted into TableA.
Note: the columns in TableA and TableB are different. Is it wise to call a function before inserting into TableB, as I would like to perform certain gets and sets based on the id inserted into TableA?
If you want to insert a set of rows into two tables, you'd have to store it in a temporary table first and then do the two INSERT statements from there:
INSERT INTO #TempTable
SELECT id, name
FROM TableC; -- this selection will return multiple records

INSERT INTO TableA
SELECT (fieldlist) FROM #TempTable;

INSERT INTO TableB
SELECT (fieldlist) FROM #TempTable;
Apart from Marc_S's answer, one more way is:
First insert the needed records into Table A from Table C, then pump the needed records from Table A into Table B.
Many approaches were also suggested in the question you asked just 3 hours ago: How to Insert Records based on the Previous Insert
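Since the answers above use #TempTable syntax (T-SQL), it is worth noting that the OUTPUT clause can feed the second table from the same INSERT; a sketch with assumed column names:

```sql
-- Assumes TableA(id, name) and TableB(a_id); adjust to the real schemas.
-- Caveat: the OUTPUT INTO target may not have enabled triggers or
-- foreign-key relationships on the affected columns.
INSERT INTO TableA (id, name)
OUTPUT inserted.id INTO TableB (a_id)  -- capture each inserted id
SELECT id, name
FROM TableC;
```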

"Merging" two tables in T-SQL - replacing or preserving duplicate IDs

I have a web application that uses a fairly large table (millions of rows, about 30 columns). Let's call that TableA. Among the 30 columns, this table has a primary key named "id", and another column named "campaignID".
As part of the application, users are able to upload new sets of data pertaining to new "campaigns".
These data sets have the same structure as TableA, but typically only about 10,000-20,000 rows.
Every row in a new data set will have a unique "id", but they'll all share the same campaignID. In other words, the user is loading the complete data for a new "campaign", so all 10,000 rows have the same "campaignID".
Usually, users are uploading data for a NEW campaign, so there are no rows in TableA with the same campaignID. Since the "id" is unique to each campaign, the id of every row of new data will be unique in TableA.
However, in the rare case where a user tries to load a new set of rows for a "campaign" that's already in the database, the requirement was to remove all the old rows for that campaign from TableA first, and then insert the new rows from the new data set.
So, my stored procedure was simple:
BULK INSERT the new data into a temporary table (#tableB)
Delete any existing rows in TableA with the same campaignID
INSERT INTO Table A ([columns]) SELECT [columns] from #TableB
Drop #TableB
This worked just fine.
But the new requirement is to give users 3 options for handling "duplicates" when they upload new data - instances where the user is uploading data for a campaign that's already in TableA:
1. Remove ALL data in TableA with the same campaignID, then insert all the new data from #TableB. (This is the old behavior. With this option, there will never be duplicates.)
2. If a row in #TableB has the same id as a row in TableA, then update that row in TableA with the row from #TableB. (Effectively, this "replaces" the old data with the new data.)
3. If a row in #TableB has the same id as a row in TableA, then ignore that row in #TableB. (Essentially, this preserves the original data and ignores the new data.)
A user doesn't get to choose this on a row-by-row basis. She chooses how the data will be merged, and this logic is applied to the entire data set.
In a similar application I worked on that used MySQL, I used the "LOAD DATA INFILE" function, with the "REPLACE" or "IGNORE" option. But I don't know how to do this with SQL Server/T-SQL.
Any solution needs to be efficient enough to handle the fact that TableA has millions of rows, and #TableB (the new data set) may have 10k-20k rows.
I googled for something like a "Merge" command (which seems to be supported in SQL Server 2008), but I only have access to SQL Server 2005.
In rough pseudocode, I need something like this:
If user selects option 1:
[I'm all set here - I have this working]
If user selects option 2 (replace):
merge into TableA as Target
using #TableB as Source
on TableA.id=#TableB.id
when matched then
update row in TableA with row from #TableB
when not matched then
insert row from #TableB into TableA
If user selects option 3 (preserve):
merge into TableA as Target
using #TableB as Source
on TableA.id=#TableB.id
when matched then
do nothing
when not matched then
insert row from #TableB into TableA
How about this?
option 2:
begin tran;
delete from tablea where exists (select 1 from tableb where tablea.id=tableb.id);
insert into tablea select * from tableb;
commit tran;
option 3:
begin tran;
delete from tableb where exists (select 1 from tablea where tablea.id=tableb.id);
insert into tablea select * from tableb;
commit tran;
As for performance, so long as the id field(s) in tablea (the big table) are indexed, you should be fine.
Why are you using upserts when he said he wanted a MERGE? MERGE in SQL 2008 is faster and more efficient.
I would let the merge handle the differences.