SSIS Package - Joining two tables - sql

I have two access db sources both with the same columns representing data from different time periods. The files have two identity columns UPC and StoreNbr. The resulting table within the DB being inserted to has the two identity columns and the data columns from each file "concatenated" into one table as seen below.:
File 1 Columns:
UPC StoreNbr data1 data2 data3
File 2 Columns:
UPC StoreNbr data1 data2 data3
DB Table Columns:
UPC StoreNbr data1(File 1) data2(File 1) data3(File 1) data1(File 2) data2(File 2) data3(File 2)
I am new to SSIS and have been confronted with the task of merging these two sources into one table to insert into the final DB table. Can I join the two tables on the identity colummns and then insert the data into one result set? FYI, this was originally imported in one filed reflecting the layout of the DB table but the client had the bright idea to split it into two files. Any direction is very appreciated thanks.

It should look something like this.
The sources must be sorted by the join keys. In your case UPC AND StoreNbr
In the merge join editor you can select which columns from the different files that will continue on the flow. You can also give them an alias in order to differentiate two similarly named columns.
After that you can just dump it all back into your DB. Cheers!

Depending on whether an item can exist in one Access source and not in another source, an alternative to TsSkTo's implementation would be to route it as
[Access Source 1]
|
[Lookup Transformation to Access Source 2]
|
[OLE DB Destination]
Lookup Transformation

Related

Migrate data between two SQL databases with identity column

Here is the scenario... I've two databases (A & B) with same schema but different records. I'd like to transfer B's data into corresponding tables in DB A.
Lets say we have tables named Question and Answer in both databases. DB A contains 10 records in Question table and 30 in Answer table. Both tables have identity column Id starting with 1(& auto increment), and there is 1 to many relation between Question and Answer.
In DB B, we have 5 entries in Question table and 20 in Answer.
My requirement is to copy data of both tables from source DB B into destination DB A without having any conflict in identity column while maintaining the relation between two tables during data transfer.
Any solution or potential workaround would be highly appreciated.
I will not write SQL here but here is what I think can be done. Make sure to use Identity insert ON and OFF.
Take maxids of both tables from DB A like A_maxidquestion and A_maxidanswer.
Select from B_question . In select column add derived col QuestionID+A_maxidquestion.This will be your new ID.
Select from B_Answer . In select column add derived col AnswerID+A_maxidanswer and fk id as QuestionID+A_maxidquestion.
Note- Make sure Destination table is not beeing used by any other process for inserting values while you are inserting
One of the best approaches to something like this is to use the OUTPUT clause. https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql?view=sql-server-2017 You can insert the new parent and capture the newly inserted identity value which you can use to insert the children.
You can do this set based if you also include a temp table which would hold the original identity value and the new identity value.
With no details of the tables that is the best I can do.

How to take look up values from a look up table using SSIS

I have a scenario as described below need to create a SSIS Package for that.
I have 3 COLUMNS in source table which needs to be entered in destination table.
But all these columns has to be looked up in the look up table of destination database and then enter their ID's in the destination column.
For example
Source table has 3 columns with values
idnum static type timedimension geography modified date
1 price daydate france 8/12/2015
2 RetailpRICE WEEK ITALY 9/12/2014
I want a package which looks up the column values with the matchin ID and populates in the destination table...
I know we can use the LOOKUP transform to update the data for one single column in destination table what about the other columns which I need to insert along with the lookup insertion.
How can I achieve this ? Also is there a way to pull only the recent data from the source table using modified date column values
Use a different lookup for each lookup table that you need to reference to get the Ids. So if each of your columns that you want IDs for gets its ID from a different table, then you need to use three lookups, one after the other, until you have all three IDs.

How to check if a set of rows already exist in the database and skip migrate them?

I need to create a package to migrate a large amount of data from a database table into a different database table. The source table will continuously have new data in like 4,5 days so I will run my package again and again.
I need to migrate all data from this table to another table but I don't want to migrate those data that I already migrated. What kind of transformation I need to use or what SQL command I need to write to do this?
The usual way this is done is by having "audit" timestamps on the source table and migating only records updated or inserted after the last migration.
for example:
Table Sales
sale_id
sale_date
sale_amount
...............
dw_create_date
dw_update_date
Your source extraction could be something along the lines of..
select sales.sale_id,
sales.sale_date,
....
from sales
where dw_updated_date > {last_migration_date}
last_migration_date is usually read from a config file or table.
Other approaches
There are a few other approaches that you could use, but all of these have bigger performance problems as your data size grows.
1) Do a (target-source) data, to get changed rows in the souurce.
select *
from source
minus
select * from target
You could do the same using a join between source and target.
select source.*
from src
left join tgt on (src.id=tgt.id)
where (src.column1 <> tgt.column1 or
src.column2 <> tgt.column2
............
)
Note that either one of these approaches does not take care of deletes in the source. If you want the tables to be in sync, the only way to do that would be do a (source-target) to get insert/update changes and (target-source) to get deleted rows and do the same in the target.
2. Insert and ignore the primary constraint error:
This has serious issues if the data can change in the source and you want the updates propagated to the target. You'd also be querying the entire source each time. It is usually better to use Merge/Upsert along with filtered source data, instead.
I would assume both tables have some unique identifier, no?
Table A has:
1
2
3
4
You're moving that to Table B, but keeping the data in Table A at the same time, yes?
So you've run your job once. Now Table B has:
1
2
3
4
Table A gets updated. It now has:
1
2
3
4
5
6
7
You run your job again, but you only want to send over 5,6,7.
SELECT *
FROM TableA
LEFT OUTER JOIN TableB ON TableA.ID = TableB.ID
WHERE TableB.ID = NULL.
If you have some sample data it would help. Does this give you a good idea?
See joins: http://i.stack.imgur.com/1UKp7.png

How to search different data from two same structure table from two same structure database?

I wanna search data that is different from each other.
I don't know how to link table in two database to search different data.
For example....
tblCustomer in Database1 has all data
tblCustomer in Database2 has some data that contain in Database1
I want to search which data does not contain in Database 1.
In single query one cannot fetch data from two different databases.You may take data in dataset and perform your operation.
You can use a three-part name to reference objects in another database (or four part if its also on another server/instance). Something like:
SELECT * --TODO, name columns
FROM
tblCustomer c
left join
Database1..tblCustomer c_not
on
c.CustomerID = c_not.CustomerID --TODO - Actual match conditions
WHERE
c_not.CustomerID is null --Only select rows where no match occurred.
(Here, I've assumed that the query is running in Database2 and that tblCustomer in Database1 is in the default schema)

Comparing the data of two tables in the same database in sqlserver

I need to compare the two table data with in one database.match the data using some columns form table.
Stored this extra rows data into another table called "relationaldata".
while I am searched ,found some solutin.
But it's not working to me
http://weblogs.sqlteam.com/jeffs/archive/2004/11/10/2737.aspx
can any one help how to do this.
How compare two table data with in one database using redgate(Tool)?
Red Gate SQL Data Compare lets you map together two tables in the same database, provided the columns are compatible datatypes. You just put the same database in the source and target, then go to the Object Mapping tab, unmap the two tables, and map them together.
Data Compare used to use UNION ALL, but it was filling up tempdb, which is what will happen if the table has a high row count. It does all the "joins" on local hard disk now using a data cache.
I think you can use Except clause in sql server
INSERT INTO tableC
(
Col1
, col2
, col3
)
select Col1,col2,col3from tableA
Except
select Col1,col2,col3 from tableB
Please refer for more information
http://blog.sqlauthority.com/2008/08/07/sql-server-except-clause-in-sql-server-is-similar-to-minus-clause-in-oracle/
Hope this helps