How can I merge two MySQL tables? - sql

How can I merge two MySQL tables that have the same structure?
The primary keys of the two tables will clash, so I have to take that into account.

You can also try:
INSERT IGNORE
INTO table_1
SELECT *
FROM table_2
;
which allows those rows in table_1 to supersede those in table_2 that have a matching primary key, while still inserting rows with new primary keys.
Alternatively,
REPLACE
INTO table_1
SELECT *
FROM table_2
;
will replace those rows already in table_1 with the corresponding row from table_2 (MySQL deletes the old row and inserts the new one), while inserting rows with new primary keys.
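To see the difference between the two statements side by side, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable anywhere); SQLite spells INSERT IGNORE as INSERT OR IGNORE, while REPLACE INTO reads the same. The sample rows and the `merge` helper are made up for illustration.

```python
import sqlite3

def merge(statement):
    """Build two tables that clash on id 2, run one merge statement, return table_1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_1 (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE table_2 (id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO table_1 VALUES (1, 'one-a'), (2, 'two-a');
        INSERT INTO table_2 VALUES (2, 'two-b'), (3, 'three-b');
    """)
    conn.execute(statement)
    rows = conn.execute("SELECT id, name FROM table_1 ORDER BY id").fetchall()
    conn.close()
    return rows

# Existing rows win: id 2 keeps the table_1 value.
kept = merge("INSERT OR IGNORE INTO table_1 SELECT * FROM table_2")
# Incoming rows win: id 2 takes the table_2 value.
replaced = merge("REPLACE INTO table_1 SELECT * FROM table_2")
print(kept)
print(replaced)
```

In both cases the row with the new key (id 3) is inserted; only the handling of the clashing key differs.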

It depends on the semantics of the primary key. If it's just autoincrement, then use something like:
insert into table1 (all columns except pk)
select all_columns_except_pk
from table2;
If PK means something, you need to find a way to determine which record should have priority. You could create a select query to find duplicates first (see answer by cpitis). Then eliminate the ones you don't want to keep and use the above insert to add records that remain.
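For the autoincrement case, a small runnable sketch may help. It assumes SQLite via Python's sqlite3 rather than MySQL, and the `name` column is a hypothetical stand-in for "all columns except pk".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    INSERT INTO table1 (name) VALUES ('a'), ('b');
    INSERT INTO table2 (name) VALUES ('c'), ('d');
    -- Leave the key column out; the target table assigns fresh ids.
    INSERT INTO table1 (name) SELECT name FROM table2 ORDER BY id;
""")
merged = conn.execute("SELECT id, name FROM table1 ORDER BY id").fetchall()
print(merged)
```

The rows from table2 lose their old ids, so this only fits when nothing else references those ids.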

INSERT
INTO first_table
SELECT *
FROM second_table s
ON DUPLICATE KEY
UPDATE
first_table.column1 = DO_WHAT_EVER_MUST_BE_DONE_ON_KEY_CLASH(first_table.column1, s.column1)
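A runnable sketch of the same idea, under the assumption that SQLite (3.24 or newer) stands in for MySQL: SQLite writes the clause as ON CONFLICT ... DO UPDATE and exposes the incoming row as `excluded`. Adding the two values is just a made-up example of "what must be done on key clash".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table (id INTEGER PRIMARY KEY, column1 INTEGER);
    CREATE TABLE second_table (id INTEGER PRIMARY KEY, column1 INTEGER);
    INSERT INTO first_table VALUES (1, 10), (2, 20);
    INSERT INTO second_table VALUES (2, 5), (3, 7);
""")
# WHERE true is needed in SQLite so the parser does not read ON as a join clause.
conn.execute("""
    INSERT INTO first_table (id, column1)
    SELECT id, column1 FROM second_table WHERE true
    ON CONFLICT (id) DO UPDATE
    SET column1 = first_table.column1 + excluded.column1
""")
upserted = conn.execute("SELECT id, column1 FROM first_table ORDER BY id").fetchall()
print(upserted)
```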

If you need to do it manually, one time:
First, merge in a temporary table, with something like:
create table MERGED as select * from table_1 UNION select * from table_2
Then, identify the primary key conflicts with something like
SELECT COUNT(*), PK from MERGED GROUP BY PK HAVING COUNT(*) > 1
where PK is the primary key field.
Solve the duplicates.
Rename the table.
[edited - removed brackets in the UNION query, which was causing the error in the comment below]
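The manual steps above can be sketched as follows, assuming SQLite via Python's sqlite3 in place of MySQL and a made-up two-column layout. Note that UNION already drops rows that are identical in every column, so only keys with differing data show up as conflicts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_2 (pk INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO table_1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table_2 VALUES (2, 'b'), (3, 'x'), (4, 'd');
    -- The merged copy carries no primary key, so clashing keys can coexist.
    CREATE TABLE MERGED AS
        SELECT * FROM table_1 UNION SELECT * FROM table_2;
""")
conflicts = conn.execute(
    "SELECT pk, COUNT(*) FROM MERGED GROUP BY pk HAVING COUNT(*) > 1"
).fetchall()
print(conflicts)  # keys that still need a manual decision
```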

Not as complicated as it sounds.
Just leave the duplicate primary key out of your query.
This works for me:
INSERT INTO
Content(
`status`,
content_category,
content_type,
content_id,
user_id,
title,
description,
content_file,
content_url,
tags,
create_date,
edit_date,
runs
)
SELECT `status`,
content_category,
content_type,
content_id,
user_id,
title,
description,
content_file,
content_url,
tags,
create_date,
edit_date,
runs
FROM
Content_Images

You could write a script to update the FK's for you.. check out this blog: http://multunus.com/2011/03/how-to-easily-merge-two-identical-mysql-databases/
They have a clever script to use the information_schema tables to get the "id" columns:
SET @db:='id_new';
select @max_id:=max(AUTO_INCREMENT) from information_schema.tables;
select concat('update ',table_name,' set ', column_name,' = ',column_name,'+',@max_id,' ; ') from information_schema.columns where table_schema=@db and column_name like '%id' into outfile 'update_ids.sql';
use id_new
source update_ids.sql;
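The underlying idea (shift every id, and every column that references it, by the same offset) can be tried on a toy schema. This is only a sketch, assuming SQLite via Python's sqlite3; the `author_a`, `author_b` and `book_b` tables are hypothetical, and it applies the offset by hand instead of generating statements from information_schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE author_b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book_b (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO author_a VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO author_b VALUES (1, 'Cy'), (2, 'Di');
    INSERT INTO book_b VALUES (1, 2, 'Notes');
""")
# Offset = highest id already used in the target table.
offset = conn.execute("SELECT MAX(id) FROM author_a").fetchone()[0]
# Shift the keys while copying, and shift the referencing column to match.
conn.execute("INSERT INTO author_a SELECT id + ?, name FROM author_b", (offset,))
conn.execute("UPDATE book_b SET author_id = author_id + ?", (offset,))
authors = conn.execute("SELECT id, name FROM author_a ORDER BY id").fetchall()
books = conn.execute("SELECT id, author_id, title FROM book_b").fetchall()
print(authors, books)
```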

Related

Delete subset of a table based on temp table

I have a table, say myTable. I also have a temp table, say myTableTemp, that contains the exact values I want to eliminate from myTable (myTable has more values than I need).
I was initially thinking I could drop myTable, and then rename myTableTemp to myTable. However, there are many FK constraints that I do not want to touch. In theory, my query would look like:
DELETE FROM myTable where in (myTableTemp);
At least logically, that is how I think about it.
EDIT: The temp table contains the data I want to DELETE from myTable
DELETE FROM myTable where in (myTableTemp);
Isn't the above backwards? Don't you want to keep all the values in myTableTemp?
I would do the following:
DELETE FROM myTable t1
WHERE NOT EXISTS ( SELECT 1 FROM myTableTemp t2
WHERE t2.primary_key = t1.primary_key );
Again, that's assuming that you want to keep everything in myTableTemp and delete everything in myTable that isn't in myTableTemp.
As an alternate solution to eliminate from myTable items present in myTableTemp:
DELETE FROM myTable
WHERE primary_key IN ( SELECT primary_key FROM myTableTemp )
;
It is usually believed that [NOT] EXISTS queries perform better than those using [NOT] IN. But it is not always that obvious.
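The two readings of the question give opposite results, which a quick run makes obvious. This is a sketch assuming SQLite via Python's sqlite3; the `val` column, the sample keys and the `remaining_after` helper are made up.

```python
import sqlite3

def remaining_after(delete_sql):
    """Load sample data, run one DELETE, return the keys left in myTable."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE myTable (primary_key INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE myTableTemp (primary_key INTEGER PRIMARY KEY);
        INSERT INTO myTable VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
        INSERT INTO myTableTemp VALUES (2), (4);
    """)
    conn.execute(delete_sql)
    keys = [r[0] for r in conn.execute("SELECT primary_key FROM myTable ORDER BY 1")]
    conn.close()
    return keys

# Keep only what is listed in myTableTemp.
keep_listed = remaining_after("""
    DELETE FROM myTable
    WHERE NOT EXISTS (SELECT 1 FROM myTableTemp t2
                      WHERE t2.primary_key = myTable.primary_key)
""")
# Delete what is listed in myTableTemp.
drop_listed = remaining_after("""
    DELETE FROM myTable
    WHERE primary_key IN (SELECT primary_key FROM myTableTemp)
""")
print(keep_listed, drop_listed)
```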

How to combine three tables into a new table

I have three tables, all with the same column headings, and I would like to create one single table from all three.
I'd also, if it is at all possible, like to create a trigger so that when one of these three source tables is edited, the change is copied into the new combined table.
I would normally do this as a view, however due to constraints on the STSrid, I need to create a table, not a view.
Edit* Right, this is a bit ridiculous but anyhow.
I HAVE THREE TABLES
THERE ARE NO DUPLICATES IN ANY OF THE THREE TABLES
I WANT TO COMBINE THE THREE TABLES INTO ONE TABLE
CAN SOMEONE HELP PROVIDE THE SAMPLE SQL CODE TO DO THIS
ALSO IS IT POSSIBLE TO CREATE TRIGGERS SO THAT WHEN ONE OF THE THREE TABLES IS EDITED THE CHANGE IS PASSED TO THE COMBINED TABLE
I CAN NOT CREATE A VIEW DUE TO THE FACT THAT THE COMBINED TABLE NEEDS TO HAVE A DIFFERENT STSrid TO THE SOURCE TABLES, CREATING A VIEW DOES NOT ALLOW ME TO DO THIS, NOR DOES AN INDEXED VIEW.
Edit* I Have Table A,Table B and Table C all with columns ORN, Geometry and APP_NUMBER. All the information is different so
Table A (I'm not going to give an example geometry column)
ORN ID
123 14/0045/F
124 12/0002/X
Table B (I'm not going to give an example geometry column)
ORN ID
256 05/0005/D
989 12/0012/X
Table C (I'm not going to give an example geometry column)
ORN ID
043 13/0045/D
222 11/0002/A
I want one complete table of all info
Table D
ORN ID
123 14/0045/F
124 12/0002/X
256 05/0005/D
989 12/0012/X
043 13/0045/D
222 11/0002/A
Any help would be greatly appreciated.
Thanks
If the creation of the table is a one time thing you can use a select into combined with a union like this:
select * into TableD from
(
select * from TableA
union all
select * from TableB
union all
select * from TableC
) UnionedTables
As for the trigger, it should be easy to set up an AFTER INSERT trigger like this:
CREATE TRIGGER insert_trigger
ON TableA
AFTER INSERT AS
insert TableD (columns...) select (columns...) from inserted
Obviously you will have to change the columns... to match your structure.
I haven't checked the syntax though, so it might not be perfect and could need some adjustment, but it should give you an idea, I hope.
If the IDs are not duplicated, it will be easy to achieve. Otherwise you must add an OriginatedFrom column. You can also create a lot of INSTEAD OF triggers (not only for insert but also for delete and update), but that is a lazy excuse for not refactoring the app.
Also, pay attention to any references to the data: since it's a relational model, other tables are likely related to the table you are about to drop.
This is the code to create table D:
drop table D;
Select * into D from (select * from A Union all select * from B Union all select * from C) as U;
It's rather simple. Just create Table_D first:
CREATE TABLE TABLE_D
(
ORN INT,
ID VARCHAR(20),
Column3 Datatype
)
GO
Then use an INSERT statement to fill this table, SELECTing from the other three tables with the UNION ALL operator.
INSERT INTO TABLE_D (ORN , ID, Column3)
SELECT ORN , ID, Column3
FROM Table_A
UNION ALL
SELECT ORN , ID, Column3
FROM Table_B
UNION ALL
SELECT ORN , ID, Column3
FROM Table_C
Trigger
You will need to create this trigger on all of the tables.
CREATE TRIGGER tr_Insert_Table_A
ON TABLE_A
FOR INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO TABLE_D (ORN , ID, Column3)
SELECT i.ORN , i.ID, i.Column3
FROM Inserted i LEFT JOIN TABLE_D D
ON i.ORN = D.ORN
WHERE D.ORN IS NULL
END
Read here to learn more about SQL Server Triggers
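For anyone who wants to try the combine-then-trigger approach end to end, here is a minimal sketch. It assumes SQLite via Python's sqlite3 instead of SQL Server, so the syntax differs: CREATE TABLE ... AS replaces SELECT ... INTO, and the trigger reads the new row through NEW rather than the `inserted` table. The geometry column is left out, and only the insert trigger for TableA is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ORN INTEGER, ID TEXT);
    CREATE TABLE TableB (ORN INTEGER, ID TEXT);
    CREATE TABLE TableC (ORN INTEGER, ID TEXT);
    INSERT INTO TableA VALUES (123, '14/0045/F'), (124, '12/0002/X');
    INSERT INTO TableB VALUES (256, '05/0005/D'), (989, '12/0012/X');
    INSERT INTO TableC VALUES (43, '13/0045/D'), (222, '11/0002/A');

    -- One-time creation of the combined table.
    CREATE TABLE TableD AS
        SELECT * FROM TableA
        UNION ALL SELECT * FROM TableB
        UNION ALL SELECT * FROM TableC;

    -- Copy every later insert on TableA into TableD.
    CREATE TRIGGER tr_insert_TableA AFTER INSERT ON TableA
    BEGIN
        INSERT INTO TableD (ORN, ID) VALUES (NEW.ORN, NEW.ID);
    END;

    INSERT INTO TableA VALUES (125, '15/0001/F');
""")
combined = conn.execute("SELECT ORN, ID FROM TableD ORDER BY ORN").fetchall()
print(len(combined), combined)
```

The same trigger would have to be repeated for TableB and TableC, and update and delete triggers would be needed to keep TableD fully in sync.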

SQL SELECT INSERT INTO Generate Unique Id

I'm attempting to select a table of data and insert this data into another file with similar column names (it's essentially duplicate data). Current syntax as follows:
INSERT INTO TABLE1 (id, id2, col1, col2)
SELECT similiarId, similiarId2, similiarCol1, similiarCol2
FROM TABLE2
The problem I have is generating unique key fields (declared as integers) for the newly inserted records. I can't use table2's keys, as table1 has existing data and the insert will error on duplicate key values.
I cannot change the table schema and these are custom id columns not generated automatically by the DB.
Does table1 have an auto-increment on its id field? If so, can you lose similiarId from the insert and let the auto-increment take care of unique keys?
INSERT INTO TABLE1 (id2, col1, col2) SELECT similiarId2, similiarCol1, similiarCol2
FROM TABLE2
As per your requirement, you need to write your query like this:
INSERT INTO TABLE1 (id, id2, col1, col2)
SELECT (ROW_NUMBER( ) OVER ( ORDER BY similiarId ASC ))
+ (SELECT MAX(id) FROM TABLE1) AS similiarId
, similiarId2, similiarCol1, similiarCol2
FROM TABLE2
What I have done here:
Added ROW_NUMBER(), which starts from 1, and added the MAX() of the destination table's id so the new values start after the existing ones.
For better explanation See this SQLFiddle.
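A self-contained version of the ROW_NUMBER plus MAX(id) approach, for experimenting. This is a sketch that assumes SQLite 3.25 or newer (for window functions) through Python's sqlite3, not SQL Server or Sybase. The COALESCE is an addition of mine so that an empty TABLE1 does not produce NULL ids; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE1 (id INTEGER PRIMARY KEY, id2 INTEGER, col1 TEXT, col2 TEXT);
    CREATE TABLE TABLE2 (similiarId INTEGER PRIMARY KEY, similiarId2 INTEGER,
                         similiarCol1 TEXT, similiarCol2 TEXT);
    INSERT INTO TABLE1 VALUES (5, 1, 'a', 'b'), (6, 2, 'c', 'd');
    -- Same key values as TABLE1, so a plain copy would fail.
    INSERT INTO TABLE2 VALUES (5, 10, 'e', 'f'), (6, 20, 'g', 'h');

    INSERT INTO TABLE1 (id, id2, col1, col2)
    SELECT ROW_NUMBER() OVER (ORDER BY similiarId)
           + (SELECT COALESCE(MAX(id), 0) FROM TABLE1),
           similiarId2, similiarCol1, similiarCol2
    FROM TABLE2;
""")
new_rows = conn.execute("SELECT id, id2 FROM TABLE1 ORDER BY id").fetchall()
print(new_rows)
```

If other sessions can insert into TABLE1 at the same time, the computed ids can still collide, so this fits a one-off load better than a concurrent workload.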
I'm not sure if I understand you correctly:
You want to copy all data from TABLE2 but be sure that TABLE2.similiarId is not already in TABLE1.id. Maybe this is a solution for your problem:
DECLARE @idmax INT
SELECT @idmax = MAX(id) FROM TABLE1
INSERT INTO TABLE1 (id, id2, col1, col2)
SELECT similiarId + @idmax, similiarId2, similiarCol1, similiarCol2
FROM TABLE2
Now the insert will not fail with a primary key violation, because every inserted id will be greater than any id that was already there (assuming the similiarId values are positive).
If the id field is defined as auto-id and you leave it out of the insert statement, then sql will generate unique id's from the available pool.
In SQL Server we have the function ROW_NUMBER, and if I have understood you correctly the following code will do what you need:
INSERT INTO TABLE1 (id, id2, col1, col2)
SELECT (ROW_NUMBER( ) OVER ( ORDER BY similiarId2 ASC )) + 6 AS similiarId,
similiarId2, similiarCol1, similiarCol2
FROM TABLE2
ROW_NUMBER will bring the number of each row, and you can add a "magic value" to it to make those values different from the current max ID of TABLE1. Let's say your current max ID is 6, then adding 6 to each result of ROW_NUMBER will give you 7, 8, 9, and so on. This way you won't have the same values for the TABLE1's primary key.
I have asked Google and it said to me that Sybase has the function ROW_NUMBER too (http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sqlanywhere.12.0.1/dbusage/ug-olap-s-51258147.html), so I think you can try it.
If you want to make an identical table, why not simply use the (quick and dirty) SELECT INTO method?
SELECT * INTO TABLE2
FROM TABLE1
Hope This helps.
Make the table1 ID IDENTITY if it is not a custom id.
or
Create new primary key in table1 and make it IDENTITY, and you can keep the previous IDs in the same format (but not primary key).
Your best bet may be to add an additional column on Table2 for Table1.Id. This way you keep both sets of Keys.
(If you are busy with a data merge, retaining Table1.Id may be important for any foreign keys which may still reference Table1.Id - you will then need to 'fix up' foreign keys in tables referencing Table1.Id, which now need to reference the applicable key in table 2).
If you need your 2nd table to keep the same values as the 1st table, then do not apply auto increment on the 2nd table.
If you have a large range, want something quick and easy, and don't care about the ID values:
Example with CONCAT
INSERT INTO session(SELECT CONCAT("3000", id) as id, cookieid FROM `session2`)
but you can also use REPLACE

Deleting at most one record for each unique tuple combination

I want to delete at most one record for each unique (columnA, columnB)-tuple in my following delete statement:
DELETE FROM tableA
WHERE columnA IN
(
--some subqueryA
)
AND columnB IN
(
--some subqueryB
)
How is this accomplished? Please only consider those statements that work when used against MSS 2000 (i.e., T-SQL 2000 syntax). I can do it with iterating through a temptable but I want to write it using only sets.
Example:
subqueryA returns 1
subqueryB returns 2,3
If the original table contained
(columnA, columnB, columnC)
5,2,5
1,2,34
1,2,45
1,3,86
Then
1,2,34
1,3,86
should be deleted. Each unique (columnA, columnB)-tuple will appear at most twice in tableA and each time I run my SQL statement I want to delete at most one of these unique combinations - never two.
If there is one record for a given unique (columnA, columnB)-tuple,
delete it.
If there are two records for a given unique (columnA,
columnB)-tuple, delete only one of them.
Delete tabA
from TableA tabA
Where tabA.columnC in (
select max(tabAA.columnC) from TableA tabAA
where tabAA.columnA in (1)
and tabAA.columnB in (2,3)
group by tabAA.columnA,tabAA.columnB
)
How often are you going to be running this that it matters whether you use temp tables or not? Maybe you should consider adding constraints to the table so you only have to do this once...
That said, in all honesty, the best way to do this for SQL Server 2000 is probably to use the #temp table as you're already doing. If you were trying to delete all but one of each dupe, then you could do something like:
insert the distinct rows into a separate table
delete all the rows from the old table
move the distinct rows back into the original table
I've also done things like copy the distinct rows into a new table, drop the old table, and rename the new table.
But this doesn't sound like the goal. Can you show the code you're currently using with the #temp table? I'm trying to envision how you're identifying the rows to keep, and maybe seeing your existing code will trigger something.
EDIT - now with better understood requirements, I can propose the following query. Please test it on a copy of the table first!
DELETE a
FROM dbo.TableA AS a
INNER JOIN
(
SELECT columnA, columnB, columnC = MIN(columnC)
FROM dbo.TableA
WHERE columnA IN
(
-- some subqueryA
SELECT 1
)
AND columnB IN
(
-- some subqueryB
SELECT 2 UNION SELECT 3
)
GROUP BY columnA, columnB
) AS x
ON a.columnA = x.columnA
AND a.columnB = x.columnB
AND a.columnC = x.columnC;
Note that this doesn't confirm that there are exactly one or two rows that match the grouping on columnA and columnB. Also note that if you run this twice it will delete the remaining row that still matches the subquery!
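The behaviour described in that note (a second run removes the remaining row) is easy to reproduce on the example data. This sketch assumes SQLite via Python's sqlite3, which has no DELETE ... JOIN, so the join is rewritten as a row-value IN over the same grouped subquery; it is not T-SQL 2000 syntax. The `delete_one_per_tuple` helper is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (columnA INTEGER, columnB INTEGER, columnC INTEGER);
    INSERT INTO tableA VALUES (5, 2, 5), (1, 2, 34), (1, 2, 45), (1, 3, 86);
""")

def delete_one_per_tuple():
    """Delete the row with the smallest columnC in each matching (columnA, columnB) group."""
    conn.execute("""
        DELETE FROM tableA
        WHERE (columnA, columnB, columnC) IN (
            SELECT columnA, columnB, MIN(columnC)
            FROM tableA
            WHERE columnA IN (1) AND columnB IN (2, 3)
            GROUP BY columnA, columnB
        )
    """)
    return conn.execute("SELECT * FROM tableA ORDER BY columnA, columnB, columnC").fetchall()

after_first = delete_one_per_tuple()
after_second = delete_one_per_tuple()
print(after_first)
print(after_second)
```

The first run removes 1,2,34 and 1,3,86 as in the question; the second run also removes 1,2,45, which is the caveat mentioned above.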

Can I select a set of rows from a table and directly insert that into a table or the same table in SQL?

Hi all I was just curious if I could do something like -
insert into Employee ( Select * from Employee where EmployeeId=1)
I just felt the need to do this a lot of times...so just was curious if there was any way to achieve it..
You can do something like that, but you cannot Select * if you want to change a column value:
Insert into employee ( employeeId, someColumn, someOtherColumn )
Select 2, someColumn, someOtherColumn
From employee
Where employeeId=1
This would insert 2 as your new employeeId.
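A quick way to convince yourself that a table can be the source and the target of the same statement, assuming SQLite via Python's sqlite3 and made-up LastName and FirstName columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
    INSERT INTO Employee VALUES (1, 'Doe', 'Jane');
    -- Copy employee 1 into a new row, substituting a new key value.
    INSERT INTO Employee (EmployeeId, LastName, FirstName)
    SELECT 2, LastName, FirstName
    FROM Employee
    WHERE EmployeeId = 1;
""")
employees = conn.execute("SELECT * FROM Employee ORDER BY EmployeeId").fetchall()
print(employees)
```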
Yes, list out the columns and do it like this:
insert into Employee (EmployeeId, LastName, FirstName......)
Select 600 as EmployeeId, LastName, FirstName......
from Employee where EmployeeId=1
However if EmployeeId is an identity column then you also need to do
set identity_insert employee on
first and then
set identity_insert employee off
after you are done
You could use the INSERT INTO ... SELECT ... syntax if the destination table already exists, and you just want to append rows to it. It's easy to check if you are selecting your data by executing just the SELECT part.
Insert into existingTable ( Column1, Column2, Column3, ... )
Select 1, Column2, Column3
From tableName
Where ....
You are not restricted to a simple select, the SELECT statement can be as complex as necessary. Also you do not need to provide names for the selected columns, as they will be provided by the destination table.
However, if you have autoincrement columns on the dest table, you can either omit them from the INSERT's column list or use the 'set identity_insert' configuration.
If you want to create a new table from existing data, then you should use the SELECT ... INTO ... syntax
Select 1 as Column1, Column2, Column3
Into newTable
From tableName
Where ....
Again, the select can be arbitrarily complex, but, because the column names are taken from the select statement, all columns must have explicit names. This syntax will give an error if the 'newTable' table already exists. This form is very convenient if you want to make a quick copy of a table to try something.