Let's say I have two databases with identical tables, but one database's tables contains data while the other doesn't. Is there a way in SQL Server to generate a script to fill the empty tables with data from the full tables?
If the tables are identical and don't use an IDENTITY column, it is quite easy.
You would do something like this:
INSERT INTO TableB
SELECT * FROM TableA
Again, only for identical table structures, otherwise you have to change the SELECT * to the correct columns and perform any conversions that are necessary.
And, to add to the #WilliamD answer, if there is an IDENTITY column you can use a variation of the INSERT statement.
Assuming you have two columns (Col1 and Col2, with Col1 having IDENTITY property) in the tables, you can do the following:
SET IDENTITY_INSERT TableB ON
INSERT INTO TableB (col1, col2)
SELECT col1, col2 FROM TableA
SET IDENTITY_INSERT TableB OFF
It's necessary to list the columns in this situation.
Related
I have TableA that has hundreds of thousands of rows and is still increasing in size. With no partitions, the speed has decreased very noticeably.
So I made a new table called TableB made columns exactly like (both name and type) TableA in Oracle SQL Developer. (TableA and TableB are in the same database but not the same tables) I additionally created partitions for TableB.
Now, all I want to do is copy all the data from TableA from TableB in order to test the speeds of queries.
In order to test speeds of tables with partitions, I decided to copy all of the data now that TableB has all the same columns as A.
insert into TableB ( select * from TableA);
What I expected from the statement above was the data to be copied over but instead, I got the error:
Error starting at line : 1 in command -
insert into TableB ( select * from TableA)
Error at Command Line : 1 Column : 1
Error report -
SQL Error: ORA-54013: INSERT operation disallowed on virtual columns
54013. 0000 - "INSERT operation disallowed on virtual columns"
*Cause: Attempted to insert values into a virtual column
*Action: Re-issue the statment without providing values for a virtual column
I looked up Virtual Columns and it seems to be
"When queried, virtual columns appear to be normal table columns, but their values are derived rather than being stored on disc. The syntax for defining a virtual column is listed below."
However, I do not have any data in TableB whatsoever. TableB only has the columns that match TableA so I am unsure as to how my columns can be derived, when there is nothing to derive?
You can use the query
SELECT column_name, virtual_column
FROM user_tab_cols
WHERE table_name = 'TABLEA';
COLUMN_NAME VIRTUAL_COLUMN
----------- --------------
ID NO
COL1 NO
COL2 NO
COL3 YES
Then use
INSERT INTO TABLEB(ID,COL1,COL2) SELECT ID,COL1,COL2 FROM TABLEA;
to be exempt from the virtual columns, those are calculated ones from the other columns' values.
did you create table B also with derived columns ? from your question i presume you created tableB also with virtual columns..
One thing you need to notice is since you have a large volume of records to insert , use bulk mode for faster operation.. use append hint as shown below.
Please note - you need not include virtual columns in below statement as they would be calculated on the fly.
insert /*+ APPEND */ into tableB (column1, column2,...columnn) select column1, column2,...columnn from TableA
I have a .csv file with 600 million plus rows. I need to upload this into a database. It will have 3 columns assigned as primary keys.
I use pandas to read the file in chunks of 1000 lines.
At each chunk iteration I use the
INSERT INTO db_name.dbo.table_name("col1", "col2", "col3", "col4")
VALUES (?,?,?,?)
cursor.executemany(query, df.values.tolist())
Syntax with pyodbc in python to upload data in chunks of 1000 lines.
Unfortunately, there are apparently some duplicate rows present. When the duplicate row is encountered the uploading stops with an error from SQL Server.
Question: how can I upload data such that whenever a duplicate is encountered instead of stopping it will just skip that line and upload the rest? I found some questions and answers on insert into table from another table, or insert into table from variables declared, but nothing on reading from a file and using insert into table col_names values() command.
Based on those answers one idea might be:
At each iteration of chunks:
Upload to a temp table
Do the insertion from the temp table into the final table
Delete the rows in the temp table
However, with such a large file each second counts, and I was looking for an answer with better efficiency.
I also tried to deal with duplicates using python, however, since the file is too large to fit into the memory I could not find a way to do that.
Question 2: if I were to use bulk insert, how would I achieve to skip over the duplicates?
Thank you
You can try to use a CTE and an INSERT ... SELECT ... WHERE NOT EXISTS.
WITH cte
AS
(
SELECT ? col1,
? col2,
? col3,
? col4
)
INSERT INTO db_name.dbo.table_name
(col1,
col2,
col3,
col4)
SELECT col1,
col2,
col3,
col4
FROM cte
WHERE NOT EXISTS (SELECT *
FROM db_name.dbo.table_name
WHERE table_name.col1 = cte.col1
AND table_name.col2 = cte.col2
AND table_name.col3 = cte.col3
AND table_name.col4 = cte.col4);
Possibly delete some of the table_name.col<n> = cte.col<n>, if the column isn't part of the primary key.
I would always load into a temporary load table first, which doesn't have any unique or PK constraint on those columns. This way you can always see that the whole file has loaded, which is an invaluable check in any ETL work, and for any other easy analysis of the source data.
After that then use an insert such as suggested by an earlier answer, or if you know that the target table is empty then simply
INSERT INTO db_name.dbo.table_name(col1,col2,col3,col4)
SELECT distinct col1,col2,col3,col4 from load_table
The best approach is to use a temporary table and execute a MERGE-INSERT statement. You can do something like this (not tested):
CREATE TABLE #MyTempTable (col1 VARCHAR(50), col2, col3...);
INSERT INTO #MyTempTable(col1, col2, col3, col4)
VALUES (?,?,?,?)
CREATE CLUSTERED INDEX ix_tempCol1 ON #MyTempTable (col1);
MERGE INTO db_name.dbo.table_name AS TARGET
USING #MyTempTable AS SOURCE ON TARGET.COL1 = SOURCE.COL1 AND TARGET.COL2 = SOURCE.COL2 ...
WHEN NOT MATCHED THEN
INSERT(col1, col2, col3, col4)
VALUES(source.col1, source.col2, source.col3, source.col4);
You need to consider the best indexes for your temporary table to make the MERGE faster. With the statement WHEN NOT MATCHED you avoid duplicates depending on the ON clause.
SQL Server Integration Services offers one method that can read data from a source (via a Dataflow task), then remove duplicates using it's Sort control (a checkbox to remove duplicates).
https://www.mssqltips.com/sqlservertip/3036/removing-duplicates-rows-with-ssis-sort-transformation/
Of course the data has to be sorted and 60 million+ rows isn't going to be fast.
If you want to use pure SQL Server then you need a staging table (without a pk constraint). After importing your data into Staging, you would insert into your target table using filtering for the composite PK combination. For example,
Insert into dbo.RealTable (KeyCol1, KeyCol2, KeyCol3, Col4)
Select Col1, Col2, Col3, Col4
from dbo.Staging S
where not exists (Select *
from dbo.RealTable RT
where RT.KeyCol1 = S.Col1
AND RT.KeyCol2 = S.Col2
AND RT.KeyCol3 = S.Col3
)
In theory you could also use the set operator EXCEPT since it takes the distinct values from both tables. For example:
INSERT INTO RealTable
SELECT * FROM Staging
EXCEPT
SELECT * FROM RealTable
Would insert distinct rows from Staging into RealTable (that don't already exist in RealTable). This method doesn't take into account the composite PK using different values on multiple rows- so an insert error would indicate different values are being assigned to the same PK composite key in the csv.
I want to select some data using simple sql and insert those data into another table. Both table are same. Data types and column names all are same. Simply those are temporary table of masters table. Using single sql I want to insert those data into another table and in the where condition I check E_ID=? checking part. My another problem is sometime there may be any matching rows in the table. In that time is it may be out sql exception? Another problem is it may be multiple matching rows. That means one E_ID may have multiple rows. As a example in my attachment_master and attachments_temp table has multiple rows for one single ID. How do I solve those problems? I have another problem. My master table data can insert temp table using following code. But I want to change only one column and others are same data. Because I want to change temp table status column.
insert into dates_temp_table SELECT * FROM master_dates_table where e_id=?;
In here all data insert into my dates_temp_table. But I want to add all column data and change only dates_temp_table status column as "Modified". How should I change this code?
You could try this:
insert into table1 ( col1, col2, col3,.... )
SELECT col1, col2, col3, ....
FROM table2 where (you can check any condition here on table1 or table2 or mixed)
For more info have a look here and this similar question
Hope it may help you.
EDit : If I understand your requirement properly then this may be a helpful solution for you:
insert into table1 ( col-1, col-2, col-3,...., col-n, <Your modification col name here> )
SELECT col-1, col-2, col-3,...., col-n, 'modified'
FROM table2 where table1.e_id=<your id value here>
As per your comment in above other answer:
"I send my E_ID. I don't want to matching and get. I send my E_ID and
if that ID available I insert those data into my temp table and change
temp table status as 'Modified' and otherwise don't do anything."
As according to your above statements, If given e_id is there it will copy all the columns values to your table1 and will place a value 'modified' in the 'status' column of your table1
For more info look here
You can use merge statement if I understand your requirement correctly.
Documentation
As I do not have your table structure below is based on assumption, see whether this cater your requirement. I am assuming that e_id is primary key or change as per your table design.
MERGE INTO dates_temp_table trgt
USING (SELECT * FROM master_dates_table WHERE e_id=100) src
ON (trgt.prm_key = src.prm_key)
WHEN NOT MATCHED
THEN
INSERT (trgt.col, trgt.col2, trgt.status)
VALUES (src.col, src.col2, 'Modified');
More information and examples here
insert into tablename( column1, column2, column3,column4 ) SELECT column1,
column2, column3,column4 from anothertablename where tablename.ID=anothertablename.ID
IF multiple values are there then it will return the last result..If not you have narrow your search..
Table1 and Table2 have same schema, same columns and same types, and Table2 is empty while Table1 has some data
Insert into Table2 values(Select * from Table1)
how to transfer the data with sql statement? i think syntax is valid in oracle, but how to do with sql-server
You can leave out the values statement:
insert into table2
select * from table1
That said, you should really be in the habit of listing column names, both for the insert and select in this case. The columns could have the same name and type -- but be in different order.
You might possibly want to drop table 2 and then do a select * into table2 from table1. This way you are guaranteed to have the same structure. Because when somebody changes the structure of either table, but not the other, insert into will bomb.
So I have this temp table that has structure like:
col1 col2 col3 col3
intID1 intID2 intID3 bitAdd
I am doing a union of the values of this temp table with a select query and storing
it into the same temp table.The thing is col3 is not part of the union query I will
need it later on to update the table.
So I am doing like so:
Insert into #temptable
(
intID1,
intID2,
intID3
)
select intID1,intID2,intID3
From
#temptable
UNION
select intID1,intID2,intID3
From
Table A
Issue is that I want only the rows that are not already existing in the temp table to be added.Doing it this way will add a duplicate of the already existing row(since union will return one row)How do I insert only those rows not existing in the current temp table in my union query?
Use MERGE:
MERGE INTO #temptable tmp
USING (select intID1,intID2,intID3 From Table A) t
ON (tmp.intID1 = t.intID1 and tmp.intID2 = t.intID2 and tmp.intID3 = t.intID3)
WHEN NOT MATCHED THEN
INSERT (intID1,intID2,intID3)
VALUES (t.intID1,t.intID2,t.intID3)
Nice and simple with EXCEPT
INSERT INTO #temptable (intID1, intID2, intID3)
SELECT intID1,intID2,intID3 FROM TableA
EXCEPT
SELECT intID1,intID2,intID3 FROM #temptable
I see where you are coming from. In most programming languages #temptable would be a variable (a relation variable or relvar for short) to which you would assign a value (a relation value) thus:
#temptable := #temptable UNION A
In the relational model, this would achieve the desired result because a relation has no duplicate rows by definition.
However, SQL is not truly relational and does not support assignment. Instead, you are required to add rows to a table using SQL DML INSERT statements (which is not so bad: the users of a truly relational database language, if we had one, would no doubt demand a similar shorthand for relational assignment!) but you are also required to do the test for duplicates yourself.
The answers from Daniel Hilgarth and Joachim Isaksson both look good. It's good practice to have two good, logically sound candidate answers then look for criteria (usually performance under typical load) to eliminate one (but retaining it commented out for future re-testing!)