SQL data migration - sql

I'm just going to use an illustration to explain my problem. SQL Migration:
The attached image shows 2 SQL tables; Table 2 references the primary key of Table 1. Table 1 had many duplicates, so I deleted them all in Excel and imported the data into a new table with a new set of IDs. I now have to import Table 2 into a new table and have it reference Table 1 again as before. In the image that looks fairly easy, but I am dealing with a database of over 2000 rows after eliminating around 700 duplicates. Besides manually editing each row and matching them up, is there any way of doing this quickly? This is the first time I am doing a database migration, but I am guessing there are quick ways of doing this. Google searches did not turn up any results. I appreciate any help on this.

To preview the rows that need to be deleted from the second table, run this:
select * from names
where place_id NOT in (select id from places)
If you are happy with the results, you can run:
delete from names
where place_id NOT in (select id from places)

After copying table2 into table2New:
update table2New
set PlaceNameID = (select t1N.ID
from table2 as t2
join table1 as t1
on t1.ID = t2.PlaceNameID
join table1new as t1N
on t1.PlaceName = t1N.PlaceName
where table2New.ID = t2.ID
);
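As a rough check of the remapping idea, here is a small sketch that runs the same kind of UPDATE against an in-memory SQLite database from Python. The sample rows and the place names are made up for illustration, and it assumes PlaceName is unique in the deduplicated table; if it is not, the subquery can match more than one new ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old lookup table, still containing duplicate place names.
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, PlaceName TEXT);
    INSERT INTO table1 VALUES (1, 'Paris'), (2, 'Paris'), (3, 'Rome');

    -- Deduplicated copy with a fresh set of IDs.
    CREATE TABLE table1new (ID INTEGER PRIMARY KEY, PlaceName TEXT);
    INSERT INTO table1new VALUES (10, 'Paris'), (11, 'Rome');

    -- Child rows that still point at the old IDs.
    CREATE TABLE table2 (ID INTEGER PRIMARY KEY, PlaceNameID INTEGER);
    INSERT INTO table2 VALUES (100, 1), (101, 2), (102, 3);

    -- Work on a copy so the original child table stays untouched.
    CREATE TABLE table2New AS SELECT * FROM table2;

    -- Old ID -> place name -> new ID, one child row at a time.
    UPDATE table2New
    SET PlaceNameID = (
        SELECT t1N.ID
        FROM table2 AS t2
        JOIN table1 AS t1 ON t1.ID = t2.PlaceNameID
        JOIN table1new AS t1N ON t1.PlaceName = t1N.PlaceName
        WHERE table2New.ID = t2.ID
    );
""")

remapped = conn.execute(
    "SELECT ID, PlaceNameID FROM table2New ORDER BY ID"
).fetchall()
print(remapped)
```

Both rows that used to point at the two duplicate 'Paris' entries end up pointing at the single new one.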

Related

Is it possible to update a local table with values from a linked table?

I have two identical tables in my database, T1 and T2. T1 is a local table while T2 is a linked table from a live SQL database. At the moment the two tables are identical.
In short, I would like to be able to run a query that will update T1 with all new records that have been added to T2, so that once I have run the query the two tables are identical again. Is this at all possible? I need to have the data in T1 available locally, as I need to be able to query that table even when the data in T2 is not available. The SQL database in question is off site, so I will not always be able to run my queries, as the link is unreliable.
Any assistance will be greatly appreciated.
If you do have a unique ID for every record and are sure records already entered will never be changed, this can actually be quite simple.
All you need to do is join the two tables with the relationship set to:
Include ALL records from LINKED_TABLE and only records from LOCAL_TABLE where the joined fields are equal.
Setting the criteria of the local table's ID field to "Is Null" will show only the missing records. If you do this in an append query, you can update your table in a single query, as seen below:
INSERT INTO [TBL_INVOICES_LOCAL]
SELECT TBL_INVOICES.*
FROM [TBL_INVOICES_LOCAL] RIGHT JOIN TBL_INVOICES ON [TBL_INVOICES_LOCAL].InvoiceID = TBL_INVOICES.InvoiceID
WHERE ((([TBL_INVOICES_LOCAL].InvoiceID) Is Null));
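To see the append behaviour in isolation, here is a minimal sketch using Python's sqlite3 module rather than Access. The Amount column and the sample rows are assumptions for illustration, and the join is written as a LEFT JOIN from the linked table, which is equivalent to the RIGHT JOIN form and also runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-ins for the linked (remote) table and the local copy.
    CREATE TABLE TBL_INVOICES (InvoiceID INTEGER PRIMARY KEY, Amount REAL);
    CREATE TABLE TBL_INVOICES_LOCAL (InvoiceID INTEGER PRIMARY KEY, Amount REAL);
    INSERT INTO TBL_INVOICES VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO TBL_INVOICES_LOCAL VALUES (1, 10.0);
""")

# Append only the invoices that have no matching row in the local table.
APPEND_MISSING = """
    INSERT INTO TBL_INVOICES_LOCAL
    SELECT r.*
    FROM TBL_INVOICES AS r
    LEFT JOIN TBL_INVOICES_LOCAL AS l ON l.InvoiceID = r.InvoiceID
    WHERE l.InvoiceID IS NULL
"""
COUNT_LOCAL = "SELECT COUNT(*) FROM TBL_INVOICES_LOCAL"

conn.execute(APPEND_MISSING)
count_after_first = conn.execute(COUNT_LOCAL).fetchone()[0]

# Running it again is harmless: nothing is missing any more.
conn.execute(APPEND_MISSING)
count_after_second = conn.execute(COUNT_LOCAL).fetchone()[0]
print(count_after_first, count_after_second)
```

Because the query only ever adds rows whose ID is absent locally, it can be rerun whenever the link happens to be up.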

MS Access Populating Data in Column with Data from another table (SQL)

I need help with a (probably very basic) command. On a test spreadsheet I have been asked to populate a column with data from another table in MS Access. However, adding the data from table 1 to table 2 relies on another set of data being matched with it.
In table 1 there are 2 columns, "Customer Number" and "CustomerID". Both are full data sets of 40,000 entries, so I can't copy and paste.
In table 2 the "CustomerID" column is there but has no entries. All entries for "Customer Number" are present.
I need to add the CustomerID from table 1 to table 2, but only where the customer number matches in both tables.
I have tried using a join and an insert, but I have only been using Access for about 3 days and need some help. Sorry for the long-winded explanation; I hope the question is legible.
Thanks.
Does this work with JOIN?
UPDATE table2 t2 INNER JOIN
table1 t1
ON t2.CustomerNumber = t1.CustomerNumber
SET t2.CustomerID = t1.CustomerID;
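If the join form of UPDATE is not available in your engine, the same fill-in can be written with a correlated subquery. The sketch below tries that against an in-memory SQLite database from Python; it is not Access syntax, and the sample customer numbers are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (CustomerNumber TEXT PRIMARY KEY, CustomerID INTEGER);
    CREATE TABLE table2 (CustomerNumber TEXT, CustomerID INTEGER);
    INSERT INTO table1 VALUES ('C-001', 501), ('C-002', 502);
    -- table2 has the customer numbers but no IDs yet.
    INSERT INTO table2 (CustomerNumber) VALUES ('C-001'), ('C-002'), ('C-999');

    -- Copy the ID across only where the customer number matches.
    UPDATE table2
    SET CustomerID = (SELECT t1.CustomerID
                      FROM table1 AS t1
                      WHERE t1.CustomerNumber = table2.CustomerNumber)
    WHERE EXISTS (SELECT 1
                  FROM table1 AS t1
                  WHERE t1.CustomerNumber = table2.CustomerNumber);
""")

filled = conn.execute(
    "SELECT CustomerNumber, CustomerID FROM table2 ORDER BY CustomerNumber"
).fetchall()
print(filled)
```

The row with no match in table1 keeps an empty CustomerID, which makes unmatched customer numbers easy to find afterwards.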

Merge multiple tables in SSIS or SQL command

I'm facing a problem with fetching data from multiple sources. It would be great if you can provide your ideas to design the SQL query.
I have to take data from two tables and INSERT it into a third table.
INPUT
TABLE 1
- TaskOrderNumber
- MemberID
TABLE 2
- ReferenceID
- MemberID
OUTPUT
TABLE 3
- TaskRefID
- PatID
My input table has TaskOrderNumber and MemberID. Right now I'm joining TABLE1 and TABLE2 on MemberID, getting the corresponding ReferenceID from TABLE2, and mapping it to PatID in TABLE3. The TaskOrderNumber in TABLE1 becomes the TaskRefID in TABLE3.
I'm currently doing this using SSIS components. I want to make sure that the correct data is merged, but I'm not able to map TaskOrderNumber to TaskRefID. Can you please help me design the solution?
You can simply query the information you are trying to display. I am not sure I would even bother with SSIS here unless your sources aren't SQL.
select t1.TaskOrderNumber as TaskRefID
,t2.ReferenceID as PatID
into Table3 --Added this as edit.
from Table1 t1
join Table2 t2 on t1.MemberID=t2.MemberID
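For a quick sanity check of the mapping, the sketch below builds the three tables in an in-memory SQLite database from Python (CREATE TABLE ... AS stands in for SELECT ... INTO, which SQLite does not have). The sample values are made up, and the second query is an extra check I would suggest, not something from the question: it lists task orders whose MemberID has no match and would therefore be silently dropped by the inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (TaskOrderNumber TEXT, MemberID INTEGER);
    CREATE TABLE Table2 (ReferenceID TEXT, MemberID INTEGER);
    INSERT INTO Table1 VALUES ('TO-1', 1), ('TO-2', 2), ('TO-3', 3);
    INSERT INTO Table2 VALUES ('REF-A', 1), ('REF-B', 2);

    -- Build the output table straight from the join.
    CREATE TABLE Table3 AS
    SELECT t1.TaskOrderNumber AS TaskRefID,
           t2.ReferenceID     AS PatID
    FROM Table1 AS t1
    JOIN Table2 AS t2 ON t1.MemberID = t2.MemberID;
""")

merged = conn.execute(
    "SELECT TaskRefID, PatID FROM Table3 ORDER BY TaskRefID"
).fetchall()

# Task orders with no matching member in Table2 (missing from Table3).
unmatched = conn.execute("""
    SELECT t1.TaskOrderNumber
    FROM Table1 AS t1
    LEFT JOIN Table2 AS t2 ON t1.MemberID = t2.MemberID
    WHERE t2.MemberID IS NULL
""").fetchall()
print(merged, unmatched)
```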

Copy records missing from one table to a new table

I managed to delete 4,000 rows from a table in my 129,000-row production database (Postgres 9.4 on Heroku), but only identified the problem a few days later.
I have a backup from before the loss, but only want to selectively restore the missing rows back to the table, preserving their id's. (A complete restore is not an option as new data has since been added to the table.)
Into a local testing database I have imported the backed-up table as articles_backups, alongside the actual articles table. I want to find all the rows in articles_backups that are missing from articles and then copy these to a new table, articles_restores, that I will then restore to the production database, back into the articles table (preserving record id's).
This query successfully returns all the id's of the deleted records:
select articles_backups.id
from articles_backups
left outer join articles on (articles_backups.id = articles.id)
where articles.id is null
But I have not been able to copy the result to a new table. I have unsuccessfully tried:
select *
into articles_restores
from articles_backups
left outer join articles on (articles_backups.id = articles.id)
where articles.id is null;
Which gives:
ERROR: column "id" specified more than once
Basically your query with LEFT JOIN / IS NULL does what you are after:
Select rows which are not present in other table
You get the error because you select all columns from both tables, and there is an id column in both. It's not possible to create a new table with duplicate column names, and it's not what you want to begin with. Only select columns from articles_backups:
CREATE TABLE articles_restores AS
SELECT ab.*
FROM articles_backups ab
LEFT JOIN articles a USING (id)
WHERE a.id IS NULL;
While at it, I simplified your query syntax with table aliases. The USING clause is just for the convenience of shorter code. It folds the two id columns into one, but all other columns are still in there twice if you SELECT *.
Use CREATE TABLE AS. SELECT INTO is also defined by the SQL standard and implemented in Postgres, but its use is discouraged. It's used in PL/pgSQL functions for a different purpose. Details:
Creating temporary tables in SQL
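A small end-to-end sketch of the restore, run against in-memory SQLite from Python rather than Postgres. The title column and the sample rows are assumptions, and the join uses ON instead of USING just to keep the example portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_backups (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO articles_backups VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Rows 2 and 3 were deleted; row 4 was added after the backup.
    INSERT INTO articles VALUES (1, 'a'), (4, 'd');

    -- Only columns from the backup side, so no duplicate "id" column.
    CREATE TABLE articles_restores AS
    SELECT ab.*
    FROM articles_backups AS ab
    LEFT JOIN articles AS a ON a.id = ab.id
    WHERE a.id IS NULL;

    -- Put the missing rows back, keeping their original ids.
    INSERT INTO articles SELECT * FROM articles_restores;
""")

restored = conn.execute(
    "SELECT id, title FROM articles_restores ORDER BY id"
).fetchall()
final_ids = [r[0] for r in conn.execute("SELECT id FROM articles ORDER BY id")]
print(restored, final_ids)
```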
You could use EXCEPT to retrieve all the rows from articles_backups that are missing from, or different in, articles
(assuming both tables have the same columns in the same order).
You could also create a temp table with this info to make your repair statements easier:
create table temp_articles as
select * from articles_backups
except
select * from articles
Step 1 - update the rows from 'articles_backups' that are present in articles.
This step needs attention: you will have to establish a rule to choose between the data present in articles and the data present in temp_articles.
UPDATE articles a
SET col1 = b.col1, -- Postgres does not allow a table prefix on the SET target column
col2 = b.col2
-- (... other columns ...)
FROM temp_articles AS b
WHERE a.id = b.id
/* AND your rule for data to be (or not) updated goes here */
Step 2 - insert the rows from 'articles_backups' that are not present in articles (your deleted records):
insert into articles
select * from temp_articles where id not in (select id from articles)
Let us know if you need more help.
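One thing worth seeing concretely is that EXCEPT returns edited rows as well as deleted ones, which is why the update step needs a rule. The sketch below shows this with in-memory SQLite from Python; the title column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_backups (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO articles_backups VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Row 1 was edited since the backup, row 2 was deleted.
    INSERT INTO articles VALUES (1, 'a-edited'), (3, 'c');

    -- Whole-row comparison: anything missing OR different shows up.
    CREATE TABLE temp_articles AS
    SELECT * FROM articles_backups
    EXCEPT
    SELECT * FROM articles;
""")

differing = conn.execute(
    "SELECT id, title FROM temp_articles ORDER BY id"
).fetchall()

# Only the rows whose id is gone are the deleted records.
deleted_only = conn.execute("""
    SELECT id, title FROM temp_articles
    WHERE id NOT IN (SELECT id FROM articles)
    ORDER BY id
""").fetchall()
print(differing, deleted_only)
```

Row 1 appears in the difference only because it was edited after the backup, so blindly writing it back would undo that edit.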

How to check if a set of rows already exist in the database and skip migrate them?

I need to create a package to migrate a large amount of data from one database table into a different database table. The source table will keep receiving new data every 4 or 5 days or so, so I will run my package again and again.
I need to migrate all data from this table to another table, but I don't want to migrate data that I have already migrated. What kind of transformation do I need to use, or what SQL command do I need to write, to do this?
The usual way this is done is by having "audit" timestamps on the source table and migrating only records updated or inserted after the last migration.
For example:
Table Sales
sale_id
sale_date
sale_amount
...............
dw_create_date
dw_update_date
Your source extraction could be something along the lines of:
select sales.sale_id,
sales.sale_date,
....
from sales
where dw_update_date > {last_migration_date}
last_migration_date is usually read from a config file or table.
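Here is a minimal sketch of the timestamp approach using Python's sqlite3 module. The sales_target table, the migrate_since helper, and the ISO-8601 text timestamps are all assumptions made for the example; in a real package the watermark would be persisted in the config file or table mentioned above.

```python
import sqlite3

def migrate_since(conn, last_migration_date):
    """Copy rows changed after the watermark and return the new watermark."""
    changed = conn.execute(
        "SELECT sale_id, sale_amount, dw_update_date FROM sales "
        "WHERE dw_update_date > ? ORDER BY dw_update_date",
        (last_migration_date,),
    ).fetchall()
    # INSERT OR REPLACE so that an updated row overwrites its earlier copy.
    conn.executemany(
        "INSERT OR REPLACE INTO sales_target (sale_id, sale_amount) VALUES (?, ?)",
        [(sale_id, amount) for sale_id, amount, _ in changed],
    )
    # The latest timestamp seen becomes the next run's starting point.
    return changed[-1][2] if changed else last_migration_date

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sale_amount REAL,
                        dw_update_date TEXT);
    CREATE TABLE sales_target (sale_id INTEGER PRIMARY KEY, sale_amount REAL);
    INSERT INTO sales VALUES (1, 10.0, '2024-01-01T00:00:00'),
                             (2, 20.0, '2024-01-02T00:00:00');
""")

# First run: everything is newer than the initial watermark.
watermark = migrate_since(conn, "1970-01-01T00:00:00")

# Later the source gets one new row and one updated row.
conn.execute("INSERT INTO sales VALUES (3, 30.0, '2024-01-05T00:00:00')")
conn.execute("UPDATE sales SET sale_amount = 25.0, "
             "dw_update_date = '2024-01-06T00:00:00' WHERE sale_id = 2")

# Second run: only rows 2 and 3 are read from the source.
watermark = migrate_since(conn, watermark)
target_rows = conn.execute(
    "SELECT sale_id, sale_amount FROM sales_target ORDER BY sale_id"
).fetchall()
print(watermark, target_rows)
```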
Other approaches
There are a few other approaches that you could use, but all of these have bigger performance problems as your data size grows.
1) Do a (source-target) set difference to get the new or changed rows in the source.
select *
from source
minus
select * from target
You could do the same using a join between source and target.
select src.*
from src
left join tgt on (src.id=tgt.id)
where (tgt.id is null or -- new rows that are missing from the target
src.column1 <> tgt.column1 or
src.column2 <> tgt.column2
............
)
Note that neither of these approaches takes care of deletes in the source. If you want the tables to be in sync, the only way to do that would be to do a (source-target) to get insert/update changes and a (target-source) to get deleted rows, and then apply the same deletes in the target.
2) Insert and ignore the primary key constraint error:
This has serious issues if the data can change in the source and you want the updates propagated to the target. You'd also be querying the entire source each time. It is usually better to use Merge/Upsert along with filtered source data instead.
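To make the drawback of the insert-and-ignore approach concrete, here is a short sketch using SQLite's INSERT OR IGNORE from Python. The source and target tables and the val column are hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT);
    INSERT INTO source VALUES (1, 'old'), (2, 'x');

    -- First load: both rows are copied.
    INSERT OR IGNORE INTO target SELECT * FROM source;

    -- The source changes: row 1 is updated and row 3 is new.
    UPDATE source SET val = 'new' WHERE id = 1;
    INSERT INTO source VALUES (3, 'y');

    -- Second load: key collisions are skipped, so only row 3 arrives.
    INSERT OR IGNORE INTO target SELECT * FROM source;
""")

target_rows = conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()
print(target_rows)
```

Row 1 in the target still holds the stale value, and the second load had to scan the whole source to find a single new row.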
I would assume both tables have some unique identifier, no?
Table A has:
1
2
3
4
You're moving that to Table B, but keeping the data in Table A at the same time, yes?
So you've run your job once. Now Table B has:
1
2
3
4
Table A gets updated. It now has:
1
2
3
4
5
6
7
You run your job again, but you only want to send over 5,6,7.
SELECT TableA.*
FROM TableA
LEFT OUTER JOIN TableB ON TableA.ID = TableB.ID
WHERE TableB.ID IS NULL
If you have some sample data it would help. Does this give you a good idea?
See joins: http://i.stack.imgur.com/1UKp7.png
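One detail worth checking in a query of this shape is how the NULL test is written: in SQL a comparison with = NULL never evaluates to true, so the filter has to use IS NULL. The sketch below compares the two forms on the 1 to 7 example, using in-memory SQLite from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY);
    INSERT INTO TableA VALUES (1), (2), (3), (4), (5), (6), (7);
    INSERT INTO TableB VALUES (1), (2), (3), (4);
""")

ANTI_JOIN = """
    SELECT TableA.ID
    FROM TableA
    LEFT OUTER JOIN TableB ON TableA.ID = TableB.ID
    WHERE TableB.ID {null_test}
    ORDER BY TableA.ID
"""

# "= NULL" evaluates to unknown for every row, so nothing is returned.
with_equals = conn.execute(ANTI_JOIN.format(null_test="= NULL")).fetchall()

# "IS NULL" keeps exactly the rows that have no partner in TableB.
with_is_null = conn.execute(ANTI_JOIN.format(null_test="IS NULL")).fetchall()
print(with_equals, with_is_null)
```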