This is my problem:
I have an old database with no constraints whatsoever. There are a handful of tables I copy from the old database to the new database. The copy is simple, and I run it nightly as a job.
In my new database (which does have constraints) I tied all these loose tables to a main table with foreign-key constraints. All these tables have a key made of 3 IDs and a string.
My main table translates these 3 IDs and a string into 1 ID, so this table has 5 columns.
In the loose tables some records can be duplicated, so to fill the main table I take a DISTINCT of the 3 IDs and the string and insert those into my main table.
The tables in the old database are updated daily, and the copy runs daily; from the main table I'd like to make a one-to-many relation to the copied tables.
This gives me the problem:
how do I update the main table so that already-inserted keys are left alone, new keys are added, and keys removed from the old database are not removed from the main table?
I was thinking of creating a DISTINCT view of all keys in the old database, but how would I merge that into the main table? This would need to run before the daily copy of the other tables, or the copy would fail on the constraints.
Another idea is to run this update of the main table with LINQ to SQL in my website, but that doesn't seem very clean.
So in short:
Old DB is SQL Server 2000
New DB is SQL Server 2008
The old DB has no constraints; some of its tables are copied daily.
There should be a main table translating the 3-IDs-plus-string key to a single-ID key, with constraints to the other tables.
The main table must be updated before the copy job, or the constraints will fail. Its contents are a DISTINCT over a few columns of one table, exposed as a view on the old DB.
Only new rows may ever be added to the main table.
Does anyone have some ideas, some guidance?
Visualize the DB:
These loose tables hold details about a company: one table has the address(es), one has the contact person, another has the company's username and login for our system (and one company could have more than one login).
A company is identified by the 3 IDs and 1 string. The main table lists these unique IDs and the string so they can be translated to 1 ID, and this 1 ID is then used in the rest of my DB. The one-to-many relations are then made from the main table to all those loose tables. I hope this clears it up a bit :)
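For concreteness, here is a rough sketch of what the main table could look like (names and types are just placeholders):

CREATE TABLE MainTable (
    NewId   int IDENTITY(1,1) PRIMARY KEY,  -- the single ID used by the rest of the new DB
    Id1     int NOT NULL,
    Id2     int NOT NULL,
    Id3     int NOT NULL,
    String1 nvarchar(50) NOT NULL,
    CONSTRAINT UQ_MainTable_Key UNIQUE (Id1, Id2, Id3, String1)
);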
I think you could use EXCEPT to insert the keys that aren't in your main table yet: http://msdn.microsoft.com/en-us/library/ms188055.aspx
So for example:
insert into MainTable (Id1, Id2, Id3, String1)  -- NewId is the identity column, so it is not listed
select Id1, Id2, Id3, String1 from DistinctOldTable
except
select Id1, Id2, Id3, String1 from MainTable
Here is the scenario: I have two databases (A and B) with the same schema but different records. I'd like to transfer B's data into the corresponding tables in DB A.
Let's say we have tables named Question and Answer in both databases. DB A contains 10 records in the Question table and 30 in the Answer table. Both tables have an identity column Id starting at 1 (auto-increment), and there is a one-to-many relation between Question and Answer.
In DB B, we have 5 entries in the Question table and 20 in the Answer table.
My requirement is to copy the data of both tables from source DB B into destination DB A without any conflict in the identity columns, while maintaining the relation between the two tables during the transfer.
Any solution or potential workaround would be highly appreciated.
I will not write full SQL here, but here is what I think can be done. Make sure to use SET IDENTITY_INSERT ON and OFF.
Take the max IDs of both tables from DB A, e.g. A_maxidquestion and A_maxidanswer.
Select from B_question. In the select list, add a derived column QuestionID + A_maxidquestion; this will be the new ID.
Select from B_Answer. In the select list, add a derived column AnswerID + A_maxidanswer, and for the foreign key use QuestionID + A_maxidquestion.
Note: make sure the destination tables are not being used by any other process to insert values while you are inserting.
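To illustrate, a minimal T-SQL sketch of this offset approach, assuming a hypothetical schema Question(Id, Title) and Answer(Id, QuestionId, Body) with both databases on the same instance:

USE A;

DECLARE @maxQ int, @maxA int;
SELECT @maxQ = ISNULL(MAX(Id), 0) FROM dbo.Question;
SELECT @maxA = ISNULL(MAX(Id), 0) FROM dbo.Answer;

-- Parents first, shifting the identity values past DB A's current max
SET IDENTITY_INSERT dbo.Question ON;
INSERT INTO dbo.Question (Id, Title)
SELECT Id + @maxQ, Title FROM B.dbo.Question;
SET IDENTITY_INSERT dbo.Question OFF;

-- Children next, shifting both their own IDs and the foreign key by the same offsets
SET IDENTITY_INSERT dbo.Answer ON;
INSERT INTO dbo.Answer (Id, QuestionId, Body)
SELECT Id + @maxA, QuestionId + @maxQ, Body FROM B.dbo.Answer;
SET IDENTITY_INSERT dbo.Answer OFF;

After the load you may also want to reseed the identity columns (DBCC CHECKIDENT) so that future inserts continue past the copied values.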
One of the best approaches to something like this is to use the OUTPUT clause: https://learn.microsoft.com/en-us/sql/t-sql/queries/output-clause-transact-sql?view=sql-server-2017. You can insert the new parent rows and capture the newly inserted identity values, which you can then use to insert the children.
You can do this set-based if you also include a temp table (or table variable) that holds the original identity value and the new identity value.
With no details of the tables, that is the best I can do.
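As a sketch only, reusing the hypothetical Question/Answer schema from above: a plain INSERT ... SELECT cannot reference source columns in its OUTPUT clause, so MERGE is the usual workaround for capturing an old-to-new ID map in one set-based pass.

DECLARE @map TABLE (OldId int, NewId int);

MERGE A.dbo.Question AS tgt
USING B.dbo.Question AS src
    ON 1 = 0                -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (Title) VALUES (src.Title)
OUTPUT src.Id, inserted.Id INTO @map (OldId, NewId);

-- The map lets the children point at their new parents
INSERT INTO A.dbo.Answer (QuestionId, Body)
SELECT m.NewId, b.Body
FROM B.dbo.Answer AS b
JOIN @map AS m ON m.OldId = b.QuestionId;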
I am planning an incremental load into a warehouse (especially for updates of source tables in the RDBMS).
I am capturing the updated rows in staging tables from the RDBMS based on the update datetime. But how do I determine which columns of a particular row need to be updated in the target warehouse tables?
Or do I just delete the row in the warehouse table (based on the primary key of the row in the staging table) and insert the newly updated row?
What is the best way to implement this incremental load between the RDBMS and the warehouse using PL/SQL and SQL?
In my opinion, the easiest way to accomplish this is as follows:
Create a stage table identical to your host table. When you do your incremental/net-change load, load all changed records into this table (based on whatever your "last updated" field is).
Delete the records from your actual table based on the primary key. For example, if your primary key is (customer, part), the query might look like this:
delete from main_table m
where exists (
    select null
    from stage_table s
    where m.customer = s.customer
      and m.part = s.part
);
Insert the records from the stage table into the main table.
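Assuming the stage and main tables have identical column lists, this last step can be as simple as:

insert into main_table
select *
from stage_table;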
You could also update existing records and insert new ones, but either way that's two steps. The advantage of the method I listed is that it works even if your tables have partitions and the newly updated data violates one of the original partition rules, whereas an update would not handle that. Also, the syntax is much simpler: an update would have to list every single field, whereas the delete-from/insert-into lets you list only the primary key fields.
Oracle also has a MERGE statement that will update a row if it exists and insert it if it does not. I honestly don't know how that would be impacted by partitions.
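For reference, a rough sketch of such a MERGE on the same (customer, part) key; the non-key columns qty and last_updated are assumptions:

merge into main_table m
using stage_table s
    on (m.customer = s.customer and m.part = s.part)
when matched then
    update set m.qty = s.qty,
               m.last_updated = s.last_updated
when not matched then
    insert (customer, part, qty, last_updated)
    values (s.customer, s.part, s.qty, s.last_updated);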
One major caveat: if your changes include deletes (records that need to be removed from the main table), none of these approaches handles that, and you will need some other way to deal with it. It may not be necessary, depending on your circumstances, but it's something to consider.
I have a table people with fewer than 100,000 records, and I have taken a backup of this table using the following:
create table people_backup as select * from people
I add some new records to my people table over time, but eventually I want to merge the records from my backup table into people. Unfortunately I cannot simply DROP my table, as my new records would be lost!
So I want to update the records in my people table using the records from people_backup, based on their primary key id, and I have found two ways to do this:
MERGE the tables together
use some sort of fancy correlated update
Great! However, both of these methods use SET and make me specify which columns I want to update. Unfortunately I am lazy: the structure of people may change over time, and while my CTAS statement doesn't need updating, my update/merge script will, which feels like unnecessary work.
Is there a way to merge entire rows without having to specify columns? I have seen that not specifying columns during an INSERT directs SQL to insert values by position; can the same approach be applied here, and is it safe?
NB: The structure of the table will not change between backups
Given that your table is small, you could simply
DELETE FROM people p
WHERE EXISTS ( SELECT 1
               FROM people_backup b
               WHERE p.id = b.id );

INSERT INTO people
SELECT *
FROM people_backup;
That is slow and not particularly elegant (especially if most of the data in the backup hasn't changed), but assuming the columns in the two tables match, it does allow you to avoid listing out the columns. Personally, I'd much prefer writing out the column names (presumably those don't change all that often) so that I could do an update.
I have a live production table with more than 1 million records. I don't want to tamper with this table at all, and I would like to create another table that holds all of its records. I would schedule a job that takes entries from my main table and inserts them into my new table. But I don't want all the records copied daily; I just want the records added to the production table each day to be added to my new table.
Please suggest a fast and efficient approach.
You could do this with INSERT/UPDATE/DELETE triggers that send the inserted/updated/deleted rows to the new table; however, this feels like reinventing the wheel at the most basic level.
You could instead use asynchronous replication rather than hand-rolling it all yourself; this is probably safer, more sustainable, and more scalable. You can add as many tables as you like to the replicated source.
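If you do go the trigger route, here is a minimal sketch of the insert case, assuming SQL Server and hypothetical tables productiontable(id, name, created_at) and copytable with the same columns:

CREATE TRIGGER trg_productiontable_copy
ON productiontable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- mirror every newly inserted row into the copy table
    INSERT INTO copytable (id, name, created_at)
    SELECT id, name, created_at
    FROM inserted;
END;

Comparable UPDATE and DELETE triggers would be needed to keep the copy fully in sync, which is exactly the bookkeeping replication does for you.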
Copying one million records from an existing table to a new table should not take very long -- and might even be faster than figuring out what records to copy. You could do something like:
truncate table copytable;
insert into copytable
select *
from productiontable;
Note that you should explicitly list the columns when doing the insert.
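For example, with assumed columns id, name, and created_at:

insert into copytable (id, name, created_at)
select id, name, created_at
from productiontable;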
You can also readily add new records -- assuming you have some form of id on the production table, such as an id assigned by a sequence. Then you can do:
insert into copytable
select *
from productiontable p
where p.id > (select max(id) from copytable);
I have two identical tables in the database (same column names, same data types): Table A and Table B.
A is the master table. Now I want table B to be updated automatically, at any point in time, with the changes made in master table A. How would I do that without using DBMS_COMPARISON?