mysql: is there a way to do an "INSERT INTO" 2 tables?

I have one table with 2 data columns (plus the user key) that I essentially want to split into 2 tables:
table A columns: user_id, col1, col2
New tables:
B: user_id, col1
C: user_id, col2
I want to do:
INSERT INTO B (user_id, col1) SELECT user_id, col1 FROM A;
INSERT INTO C (user_id, col2) SELECT user_id, col2 FROM A;
But I want to do it in one statement. The table is big, so I just want to do it in one pass. Is there a way to do this?
Thanks.

No, you can't insert into more than one table at the same time. INSERT syntax allows only a single table name.
http://dev.mysql.com/doc/refman/5.5/en/insert.html
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
    [INTO] tbl_name ...

Write a stored procedure to encapsulate the two inserts and wrap them in a transaction.
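A minimal sketch of such a procedure, assuming MySQL 5.5+ and the table names from the question (the procedure name split_a is a placeholder):

DELIMITER //
CREATE PROCEDURE split_a()
BEGIN
    -- Undo both inserts if either one fails.
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        RESIGNAL;
    END;

    START TRANSACTION;
    INSERT INTO B (user_id, col1) SELECT user_id, col1 FROM A;
    INSERT INTO C (user_id, col2) SELECT user_id, col2 FROM A;
    COMMIT;
END //
DELIMITER ;

CALL split_a();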

If by "in one statement", you mean "atomically" - so that it can never happen that it's inserted into one table but not the other - then transactions are what you're looking for:
START TRANSACTION;
INSERT INTO B (user_id, col1) SELECT user_id, col1 FROM A;
INSERT INTO C (user_id, col2) SELECT user_id, col2 FROM A;
COMMIT;
If you need to actually do this in a single statement, you could wrap these in a stored procedure and call that, as @lexu suggests.
See the manual for reference: http://dev.mysql.com/doc/refman/5.0/en/commit.html
Caveat: this will not work with MyISAM tables (no transaction support), they need to be InnoDB.
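If you are unsure which engine a table uses, you can check and convert it first (standard MySQL statements; table B is from the question above):

SHOW TABLE STATUS LIKE 'B';    -- the Engine column shows MyISAM or InnoDB
ALTER TABLE B ENGINE = InnoDB; -- convert if necessary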

Unless your tables are spread over multiple physical disks, the speed of the select/insert is likely to be IO-bound.
Trying to insert into two tables at once (even if it were possible) would likely increase the total insert time, as the disk would have to thrash more while writing to both tables.

Related

What happens with duplicates when inserting multiple rows?

I am running a Python script that inserts a large amount of data into a Postgres database. I use a single query to perform multiple row inserts:
INSERT INTO table (col1,col2) VALUES ('v1','v2'),('v3','v4') ... etc
I was wondering what would happen if it hits a duplicate key for the insert. Will it stop the entire query and throw an exception? Or will it merely ignore the insert of that specific row and move on?
The INSERT will just insert all rows and nothing special will happen, unless you have some kind of constraint disallowing duplicate / overlapping values (PRIMARY KEY, UNIQUE, CHECK or EXCLUDE constraint) - which you did not mention in your question. But that's what you are probably worried about.
Assuming a UNIQUE or PK constraint on (col1, col2), you are dealing with a textbook UPSERT situation, and there are many related questions and answers to be found here.
Generally, if any constraint is violated, an exception is raised, which (unless trapped in a subtransaction, as is possible in a procedural server-side language like plpgsql) rolls back not only the statement, but the whole transaction.
Without concurrent writes
I.e.: No other transactions will try to write to the same table at the same time.
Exclude rows that are already in the table with WHERE NOT EXISTS ... or any other applicable technique:
Select rows which are not present in other table
And don't forget to remove duplicates within the inserted set as well, which would not be excluded by the semi-anti-join WHERE NOT EXISTS ...
One technique to deal with both at once would be EXCEPT:
INSERT INTO tbl (col1, col2)
VALUES
   (text 'v1', text 'v2')  -- explicit type cast may be needed in 1st row
 , ('v3', 'v4')
 , ('v3', 'v4')            -- beware of dupes in source
EXCEPT
SELECT col1, col2 FROM tbl;
EXCEPT without the key word ALL folds duplicate rows in the source. If you know there are no dupes, or you don't want to fold duplicates silently, use EXCEPT ALL (or one of the other techniques). See:
Using EXCEPT clause in PostgreSQL
Generally, if the target table is big, WHERE NOT EXISTS in combination with DISTINCT on the source will probably be faster:
INSERT INTO tbl (col1, col2)
SELECT *
FROM  (
   SELECT DISTINCT *
   FROM  (
      VALUES
         (text 'v1', text 'v2')
       , ('v3', 'v4')
       , ('v3', 'v4')  -- dupes in source
      ) t(c1, c2)
   ) t
WHERE NOT EXISTS (
   SELECT FROM tbl
   WHERE  col1 = t.c1 AND col2 = t.c2
   );
If there can be many dupes, it pays to fold them in the source first. Otherwise use one subquery less, as in the sketch below.
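A sketch of that simpler form, with the DISTINCT step dropped (only safe if the source rows are already duplicate-free; values are the same placeholders as above):

INSERT INTO tbl (col1, col2)
SELECT c1, c2
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
   ) t(c1, c2)
WHERE NOT EXISTS (
   SELECT FROM tbl
   WHERE  col1 = t.c1 AND col2 = t.c2
   );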
Related:
Select rows which are not present in other table
With concurrent writes
Use the Postgres UPSERT implementation INSERT ... ON CONFLICT ... in Postgres 9.5 or later:
INSERT INTO tbl (col1, col2)
SELECT DISTINCT *  -- still can't insert the same row more than once
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
    , ('v3', 'v4')  -- you still need to fold dupes in source!
   ) t(c1, c2)
ON    CONFLICT DO NOTHING;  -- ignores rows with *any* conflict!
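If you only want to skip rows that collide with one particular constraint, name a conflict target (a sketch, assuming a unique constraint on (col1, col2)):

INSERT INTO tbl (col1, col2)
SELECT DISTINCT *
FROM  (
   VALUES
      (text 'v1', text 'v2')
    , ('v3', 'v4')
   ) t(c1, c2)
ON    CONFLICT (col1, col2) DO NOTHING;  -- other errors are still raised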
Further reading:
How to use RETURNING with ON CONFLICT in PostgreSQL?
How do I insert a row which contains a foreign key?
Documentation:
The manual
The commit page
The Postgres Wiki page
Craig's reference answer for UPSERT problems:
How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?
Will it stop the entire query and throw an exception? Yes.
To avoid that, you can look at the following SO question, which describes how to keep Postgres from throwing an error for multiple inserts when some of the inserted keys already exist in the DB.
You should basically do this:
INSERT INTO DBtable (id, field1)
SELECT 1, 'value'
WHERE NOT EXISTS (
    SELECT id FROM DBtable WHERE id = 1
);

SQL insert into using Union should add only distinct values

So I have this temp table with a structure like:
col1    col2    col3    col4
intID1  intID2  intID3  bitAdd
I am doing a union of the values of this temp table with a select query and storing the result back into the same temp table. The thing is, the fourth column (bitAdd) is not part of the union query; I will need it later on to update the table.
So I am doing it like so:
Insert into #temptable
(
    intID1,
    intID2,
    intID3
)
select intID1, intID2, intID3
from #temptable
UNION
select intID1, intID2, intID3
from TableA
The issue is that I want only the rows that are not already in the temp table to be added. Doing it this way will add a duplicate of every existing row (since UNION returns one copy of each row). How do I insert only those rows that don't already exist in the current temp table in my union query?
Use MERGE:
MERGE INTO #temptable tmp
USING (select intID1, intID2, intID3 from TableA) t
ON (tmp.intID1 = t.intID1 and tmp.intID2 = t.intID2 and tmp.intID3 = t.intID3)
WHEN NOT MATCHED THEN
    INSERT (intID1, intID2, intID3)
    VALUES (t.intID1, t.intID2, t.intID3);
Nice and simple with EXCEPT:
INSERT INTO #temptable (intID1, intID2, intID3)
SELECT intID1, intID2, intID3 FROM TableA
EXCEPT
SELECT intID1, intID2, intID3 FROM #temptable
I see where you are coming from. In most programming languages #temptable would be a variable (a relation variable or relvar for short) to which you would assign a value (a relation value) thus:
#temptable := #temptable UNION A
In the relational model, this would achieve the desired result because a relation has no duplicate rows by definition.
However, SQL is not truly relational and does not support assignment. Instead, you are required to add rows to a table using SQL DML INSERT statements (which is not so bad: the users of a truly relational database language, if we had one, would no doubt demand a similar shorthand for relational assignment!) but you are also required to do the test for duplicates yourself.
The answers from Daniel Hilgarth and Joachim Isaksson both look good. It's good practice to have two sound, logically correct candidate solutions, then look for criteria (usually performance under typical load) to eliminate one (but retain it, commented out, for future re-testing!).

SQL Insert into 2 tables, passing the new PK from one table as the FK in the other

How can I achieve an insert query on 2 tables that will insert the primary key set from one table as a foreign key into the second table?
Here's a quick example of what I'm trying to do, but I'd like this to be one query, perhaps a join.
INSERT INTO Table1 (col1, col2) VALUES (val1, val2);
INSERT INTO Table2 (foreign_key_column) VALUES (primary_key_from_table1_insert);
I'd like this to be one join query.
I've made some attempts but I can't get this to work correctly.
This is not possible to do with a single query.
The record in the PK table needs to be inserted before the new PK is known and can be used in the FK table, so at least two queries are required (though normally 3, as you need to retrieve the new PK value for use).
The exact syntax depends on the database being used, which you have not specified.
If you need this set of inserts to be atomic, use transactions.
Despite what others have answered, this absolutely is possible, although it takes 2 queries made consecutively with the same connection (to maintain the session state).
Here's the mysql solution (with executable test code below):
INSERT INTO Table1 (col1, col2) VALUES ( val1, val2 );
INSERT INTO Table2 (foreign_key_column) VALUES (LAST_INSERT_ID());
Note: These should be executed using a single connection.
Here's the test code:
create table tab1 (id int auto_increment primary key, note text);
create table tab2 (id int auto_increment primary key, tab2_id int references tab1, note text);
insert into tab1 values (null, 'row 1');
insert into tab2 values (null, LAST_INSERT_ID(), 'row 1');
select * from tab1;
select * from tab2;
mysql> select * from tab1;
+----+-------+
| id | note  |
+----+-------+
|  1 | row 1 |
+----+-------+
1 row in set (0.00 sec)

mysql> select * from tab2;
+----+---------+-------+
| id | tab2_id | note  |
+----+---------+-------+
|  1 |       1 | row 1 |
+----+---------+-------+
1 row in set (0.00 sec)
From your example, if the tuple (col1, col2) can be considered unique, then you could do:
INSERT INTO table1 (col1, col2) VALUES (val1, val2);
INSERT INTO table2 (foreign_key_column)
SELECT id FROM table1 WHERE col1 = val1 AND col2 = val2;
There may be a few ways to accomplish this. Probably the most straightforward is to use a stored procedure that accepts as input all the values you need for both tables, then inserts to the first, retrieves the PK, and inserts to the second.
If your DB supports it, you can also tell the first INSERT to return a value:
INSERT INTO table1 ... RETURNING primary_key;
This at least saves the SELECT step that would otherwise be necessary. If you go with a stored procedure approach, you'll probably want to incorporate this into that stored procedure.
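In PostgreSQL, RETURNING can even be combined with a data-modifying CTE to collapse this into a true single statement (a sketch; the column names col1, col2 and the key column id are placeholders):

WITH new_row AS (
    INSERT INTO table1 (col1, col2)
    VALUES (val1, val2)
    RETURNING id
)
INSERT INTO table2 (foreign_key_column)
SELECT id FROM new_row;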
It could also possibly be done with a combination of views and rules, if supported by your DB. This is probably far messier than it's worth, though. I believe this could be done in PostgreSQL, but I'd still advise against it. You'd need a view that contains all of the columns represented by both table1 and table2, then an ON INSERT DO INSTEAD rule with three parts: the first part inserts to the new table, the second part retrieves the PK from the first table and updates the NEW result, and the third inserts to the second table. (Note: this view doesn't even have to reference the two literal tables, and would never be used for queries; it only has to contain columns whose names/data types match the real tables.)
Of course all of these methods are just complicated ways of getting around the fact that you can't really do what you want with a single command.

"select * into table" Will it work for inserting data into existing table

I am trying to insert data from one of my existing tables into another existing table.
Is it possible to insert data into an existing table using a select * into query?
I think it can be done using union, but in that case I need to copy all the data of my existing table into a temporary table, then drop the existing table, and finally apply the union to insert all records back into the same table, e.g.:
select * into #tblExisting from tblExisting
drop table tblExisting
select * into tblExisting from #tblExisting union select * from tblActualData
Here tblExisting is the table where I actually want to store all the data; tblActualData is the table from which data is to be appended to tblExisting.
Is this the right method? Is there some other alternative?
You should try
INSERT INTO ExistingTable (Columns, ...)
SELECT Columns, ...
FROM OtherTable
Have a look at INSERT
and SQL SERVER – Insert Data From One Table to Another Table – INSERT INTO SELECT – SELECT INTO TABLE
No, you cannot use SELECT INTO to insert data into an existing table.
The documentation makes this very clear:
SELECT…INTO creates a new table in the default filegroup and inserts the resulting rows from the query into it.
You generally want to avoid using SELECT INTO in production because it gives you very little control over how the table is created, and can lead to all sorts of nasty locking and other performance problems. You should create schemas explicitly and use INSERT - even for temporary tables.
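For example, instead of SELECT ... INTO, you would create the table explicitly and then populate it with a plain INSERT ... SELECT (a sketch; the staging table and its columns are illustrative, while tblActualData is from the question):

-- Create the target table explicitly ...
CREATE TABLE #tblStaging
(
    id   int         NOT NULL,
    name varchar(50) NULL
);

-- ... then populate it.
INSERT INTO #tblStaging (id, name)
SELECT id, name
FROM tblActualData;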
@Ryan Chase
Can you do this by selecting all columns using *?
Yes!
INSERT INTO yourtable2
SELECT * FROM yourtable1

Row number in Sybase tables

Sybase DB tables do not have a concept of self-updating row numbers. However, for one of the modules, I require a row number corresponding to each row in the database, such that max(column) would always tell me the number of rows in the table.
I thought I'd introduce an int column and keep updating it to keep track of the row number. However, I'm having problems updating this column in the case of deletes. What SQL should I use in the delete trigger to update this column?
You can easily assign a unique number to each row by using an identity column. The identity can be a numeric or an integer (in ASE12+).
This will almost do what you require. There are certain circumstances in which you will get a gap in the identity sequence. (These are called "identity gaps", the best discussion on them is here). Also deletes will cause gaps in the sequence as you've identified.
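A minimal sketch of such an identity column (ASE syntax; the table and column names are illustrative):

create table myRows
(
    rownum  numeric(10,0) identity,  -- assigned automatically on insert
    payload varchar(50) null
)

insert into myRows (payload) values ('first')
insert into myRows (payload) values ('second')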
Why do you need to use max(col) to get the number of rows in the table, when you could just use count(*)? If you're trying to get the last row from the table, then you can do
select * from table where column = (select max(column) from table).
Regarding the delete trigger to update a manually managed column, I think this would be a potential source of deadlocks and many performance issues. Imagine you have 1 million rows in your table and you delete row 1: that's 999,999 rows you now have to update to subtract 1 from the id.
Delete trigger
CREATE TRIGGER tigger ON myTable FOR DELETE
AS
UPDATE myTable
SET id = id - (SELECT count(*) FROM deleted d WHERE d.id < myTable.id)
To avoid locking problems
You could add an extra table (which joins to your primary table) like this:
CREATE TABLE rowCounter
(
    id     int,  -- foreign key to main table
    rownum int
)
... and use the rownum field from this table.
If you put the delete trigger on this table then you would hugely reduce the potential for locking problems.
Approximate solution?
Does the table need to keep its rownumbers up to date all the time?
If not, you could have a job which runs every minute or so, which checks for gaps in the rownum, and does an update.
Question: do the rownumbers have to reflect the order in which rows were inserted?
If not, you could do far fewer updates, but only updating the most recent rows, "moving" them into gaps.
Leave a comment if you would like me to post any SQL for these ideas.
I'm not sure why you would want to do this. You could experiment with using temporary tables and "select into" with an Identity column like below.
create table test
(
    col1 int,
    col2 varchar(3)
)

insert into test values (100, "abc")
insert into test values (111, "def")
insert into test values (222, "ghi")
insert into test values (300, "jkl")
insert into test values (400, "mno")

select rank = identity(10), col1 into #t1 from test
select * from #t1

delete from test where col2 = "ghi"

select rank = identity(10), col1 into #t2 from test
select * from #t2

drop table test
drop table #t1
drop table #t2
This would give you a dynamic id (of sorts).