I have two tables, InventoryOriginal and InventoryBackup. Both had 20 columns, but InventoryOriginal has since changed and now has only 12 columns, all of which also exist in InventoryBackup. I need to copy each month's data from InventoryOriginal to InventoryBackup, but the copy fails because the number of columns no longer matches.
I copy the data with this simple statement:
INSERT INTO InventoryBackup
Select * from InventoryOriginal where Period = '2020-01-01'
But now that the number of columns has changed (InventoryOriginal has 12 and InventoryBackup has 20), can I copy the 12 columns from InventoryOriginal to InventoryBackup and leave the remaining columns blank?
I am getting the error: Column name or number of supplied values does not match table definition.
The general way you do this is to mention the target and source columns you want to use. For example, for 3 columns you would use:
INSERT INTO InventoryBackup (col1, col2, col3)
SELECT col1, col2, col3
FROM InventoryOriginal
WHERE Period = '2020-01-01';
Keep in mind that any columns in the target backup table that are not mentioned will be assigned either NULL or a default value, if the table definition declares one.
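To see the NULL-or-default behaviour in action, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server purely so it runs without a server, and the columns Item, Qty, OldNote and OldFlag are hypothetical names that are not part of the question.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InventoryOriginal (Period TEXT, Item TEXT, Qty INTEGER);
    CREATE TABLE InventoryBackup (
        Period TEXT, Item TEXT, Qty INTEGER,
        OldNote TEXT,
        OldFlag INTEGER DEFAULT 0
    );
    INSERT INTO InventoryOriginal VALUES
        ('2020-01-01', 'bolt', 10),
        ('2020-02-01', 'nut', 5);
""")

# Name only the columns both tables share; OldNote and OldFlag are left out.
conn.execute("""
    INSERT INTO InventoryBackup (Period, Item, Qty)
    SELECT Period, Item, Qty
    FROM InventoryOriginal
    WHERE Period = '2020-01-01'
""")

# OldNote has no default and comes back as NULL (None); OldFlag takes its default.
backup_rows = conn.execute(
    "SELECT Period, Item, Qty, OldNote, OldFlag FROM InventoryBackup"
).fetchall()
print(backup_rows)
```

Only the January row is copied, and the two columns that were not listed are filled in by the table definition rather than by the statement.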
The simple answer is yes, but you'll likely need to adjust your SQL statement to specify the columns. Eg.
INSERT INTO InventoryBackup (col1, col2, col3, ...)
SELECT col1, col2, col3, ... FROM InventoryOriginal
This will be fine, as long as the extra columns in the "backup" table (that are no longer in the "original" table) allow NULLs. If they don't, then you'll need to specify a blank value (or 0, if they're numeric) when you insert. Eg.
INSERT INTO InventoryBackup (col1, col2, col3, extra1, extra2, ...)
SELECT col1, col2, col3, '', 0, ... FROM InventoryOriginal
Hopefully that makes sense. The exact SQL statement that you're going to need will depend on the actual structure of the "backup" table.
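A small sketch of the NOT NULL case, again using Python's sqlite3 only as a convenient stand-in for SQL Server. The column types are assumptions; extra1 and extra2 follow the placeholder names used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE InventoryOriginal (col1 TEXT, col2 TEXT);
    CREATE TABLE InventoryBackup (
        col1 TEXT, col2 TEXT,
        extra1 TEXT NOT NULL,
        extra2 INTEGER NOT NULL
    );
    INSERT INTO InventoryOriginal VALUES ('a', 'b');
""")

# Leaving out the NOT NULL columns is rejected by the database.
try:
    conn.execute(
        "INSERT INTO InventoryBackup (col1, col2) "
        "SELECT col1, col2 FROM InventoryOriginal"
    )
    omitted_ok = True
except sqlite3.IntegrityError:
    omitted_ok = False

# Supplying a blank string and a zero for the extra columns succeeds.
conn.execute(
    "INSERT INTO InventoryBackup (col1, col2, extra1, extra2) "
    "SELECT col1, col2, '', 0 FROM InventoryOriginal"
)
filled_rows = conn.execute("SELECT * FROM InventoryBackup").fetchall()
print(omitted_ok, filled_rows)
```

Whether a blank string or 0 is an acceptable placeholder is a data-modelling decision; the sketch only shows that the statement goes through once every NOT NULL column receives a value.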
Related
I have two tables. They have an identical structure except for the fact that Table2 has one more column. I want to create a stored procedure that copies all the data from Table1 to Table2, and then insert data into the unique column in Table2. I am kinda stumped, all I have so far is this:
CREATE PROCEDURE insert_t_p @t_p INT AS
BEGIN
INSERT INTO table_2
SELECT * FROM table_1
END
where @t_p is the data that I want to insert. This is going to be constant for all the records being copied over. Does anyone have any suggestions?
I suspect that you want:
INSERT INTO table_2 SELECT *, @t_p FROM table_1
Note that you should really enumerate the columns in both the insert and select, like:
INSERT INTO table_2(col1, col2, col3)
SELECT col1, col2, @t_p FROM table_1
This makes it much easier to ensure that each column from the source table is going into the relevant target column, possibly makes the query resilient to changes in the data structures, and allows you to handle structures where columns have different orders.
I strongly recommend that you list the columns:
INSERT INTO table_2 (col1, col2, . . . , col_extra)
SELECT col1, col2, . . ., @t_p
FROM table_1 ;
Listing the columns is a good habit: your code is less error-prone and keeps working if the table structures change or the columns are declared in a different order.
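Here is a rough sketch of the same idea driven from Python, with sqlite3 standing in for SQL Server. The function name copy_with_constant and the column types are made up for illustration; the target table deliberately declares its columns in a different order to show that the explicit lists still line everything up.

```python
import sqlite3

def copy_with_constant(conn, t_p):
    """Copy every row of table_1 into table_2, adding the same constant to each."""
    conn.execute(
        "INSERT INTO table_2 (col1, col2, col_extra) "
        "SELECT col1, col2, ? FROM table_1",
        (t_p,),
    )

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table_2 (col_extra INTEGER, col2 TEXT, col1 INTEGER);
    INSERT INTO table_1 VALUES (1, 'x'), (2, 'y');
""")

copy_with_constant(conn, 7)
copied = conn.execute(
    "SELECT col1, col2, col_extra FROM table_2 ORDER BY col1"
).fetchall()
print(copied)
```

With SELECT * and no column list, the same copy would depend on both tables declaring their columns in exactly matching positions.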
I have a .csv file with 600 million plus rows. I need to upload this into a database. It will have 3 columns assigned as primary keys.
I use pandas to read the file in chunks of 1000 lines.
At each chunk iteration I upload the chunk with pyodbc, using this query and call:
query = """INSERT INTO db_name.dbo.table_name("col1", "col2", "col3", "col4")
VALUES (?,?,?,?)"""
cursor.executemany(query, df.values.tolist())
Unfortunately, there are apparently some duplicate rows present. When the duplicate row is encountered the uploading stops with an error from SQL Server.
Question: how can I upload data such that whenever a duplicate is encountered instead of stopping it will just skip that line and upload the rest? I found some questions and answers on insert into table from another table, or insert into table from variables declared, but nothing on reading from a file and using insert into table col_names values() command.
Based on those answers one idea might be:
At each iteration of chunks:
Upload to a temp table
Do the insertion from the temp table into the final table
Delete the rows in the temp table
However, with such a large file each second counts, and I was looking for an answer with better efficiency.
I also tried to deal with duplicates using python, however, since the file is too large to fit into the memory I could not find a way to do that.
Question 2: if I were to use bulk insert, how would I skip over the duplicates?
Thank you
You can try to use a CTE and an INSERT ... SELECT ... WHERE NOT EXISTS.
WITH cte
AS
(
SELECT ? col1,
? col2,
? col3,
? col4
)
INSERT INTO db_name.dbo.table_name
(col1,
col2,
col3,
col4)
SELECT col1,
col2,
col3,
col4
FROM cte
WHERE NOT EXISTS (SELECT *
FROM db_name.dbo.table_name
WHERE table_name.col1 = cte.col1
AND table_name.col2 = cte.col2
AND table_name.col3 = cte.col3
AND table_name.col4 = cte.col4);
If a column isn't part of the primary key, remove its table_name.col<n> = cte.col<n> comparison so that only the key columns are checked.
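As a rough illustration of the per-row NOT EXISTS check, the sketch below runs it through executemany. It uses Python's sqlite3 instead of pyodbc and SQL Server so that it is self-contained, drops the CTE for brevity, and assumes col1 to col3 form the primary key, as the question describes. Because the key values appear twice in the statement, each parameter tuple repeats them.

```python
import sqlite3

# Insert one row unless a row with the same (col1, col2, col3) key already exists.
SQL = """
INSERT INTO table_name (col1, col2, col3, col4)
SELECT ?, ?, ?, ?
WHERE NOT EXISTS (
    SELECT 1 FROM table_name
    WHERE col1 = ? AND col2 = ? AND col3 = ?
)
"""

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table_name (
        col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 TEXT,
        PRIMARY KEY (col1, col2, col3)
    )
""")

# The third row repeats the key of the first one.
chunk = [(1, 1, 1, 'a'), (1, 1, 2, 'b'), (1, 1, 1, 'dup')]

# Four values for the SELECT list, then the three key values for the subquery.
conn.executemany(SQL, [row + row[:3] for row in chunk])

loaded = conn.execute(
    "SELECT * FROM table_name ORDER BY col1, col2, col3"
).fetchall()
print(loaded)
```

The duplicate is skipped silently instead of raising a key violation. The price is one existence lookup per row, which may matter at this file size.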
I would always load into a temporary load table first, which doesn't have any unique or PK constraint on those columns. This way you can always see that the whole file has loaded, which is an invaluable check in any ETL work, and for any other easy analysis of the source data.
After that, use an insert such as the one suggested in an earlier answer or, if you know that the target table is empty, simply:
INSERT INTO db_name.dbo.table_name(col1,col2,col3,col4)
SELECT distinct col1,col2,col3,col4 from load_table
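A minimal sketch of the load-table idea, with sqlite3 standing in for SQL Server and three made-up rows standing in for the file. It checks that everything arrived in the load table before moving the distinct rows across.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE load_table (col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 TEXT);
    CREATE TABLE table_name (
        col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 TEXT,
        PRIMARY KEY (col1, col2, col3)
    );
""")

# The load table has no key, so the exact duplicate is accepted here.
file_rows = [(1, 1, 1, 'a'), (1, 1, 2, 'b'), (1, 1, 1, 'a')]
conn.executemany("INSERT INTO load_table VALUES (?, ?, ?, ?)", file_rows)

# Sanity check: every row of the file reached the load table.
loaded_count = conn.execute("SELECT COUNT(*) FROM load_table").fetchone()[0]

# Move only the distinct rows into the (empty) target table.
conn.execute("""
    INSERT INTO table_name (col1, col2, col3, col4)
    SELECT DISTINCT col1, col2, col3, col4 FROM load_table
""")
final_count = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]
print(loaded_count, final_count)
```

Note that DISTINCT compares all four columns, so it only removes rows that are identical in every column. Two rows that share a key but differ in col4 would still collide on the primary key.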
The best approach is to use a temporary table and execute a MERGE-INSERT statement. You can do something like this (not tested):
CREATE TABLE #MyTempTable (col1 VARCHAR(50), col2, col3...);
INSERT INTO #MyTempTable(col1, col2, col3, col4)
VALUES (?,?,?,?)
CREATE CLUSTERED INDEX ix_tempCol1 ON #MyTempTable (col1);
MERGE INTO db_name.dbo.table_name AS TARGET
USING #MyTempTable AS SOURCE ON TARGET.COL1 = SOURCE.COL1 AND TARGET.COL2 = SOURCE.COL2 ...
WHEN NOT MATCHED THEN
INSERT(col1, col2, col3, col4)
VALUES(source.col1, source.col2, source.col3, source.col4);
You need to consider the best indexes for your temporary table to make the MERGE faster. The WHEN NOT MATCHED clause inserts only those source rows that have no match in the target under the ON clause, which is how duplicates are skipped.
SQL Server Integration Services offers one method that can read data from a source (via a Dataflow task), then remove duplicates using its Sort transformation (which has a checkbox to remove duplicate rows).
https://www.mssqltips.com/sqlservertip/3036/removing-duplicates-rows-with-ssis-sort-transformation/
Of course the data has to be sorted, and sorting 600 million+ rows isn't going to be fast.
If you want to use pure SQL Server then you need a staging table (without a PK constraint). After importing your data into Staging, you would insert into your target table, filtering on the composite PK. For example,
Insert into dbo.RealTable (KeyCol1, KeyCol2, KeyCol3, Col4)
Select Col1, Col2, Col3, Col4
from dbo.Staging S
where not exists (Select *
from dbo.RealTable RT
where RT.KeyCol1 = S.Col1
AND RT.KeyCol2 = S.Col2
AND RT.KeyCol3 = S.Col3
)
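The staging insert above can be tried out with a small sketch. sqlite3 stands in for SQL Server, and the column types and sample rows are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RealTable (
        KeyCol1 INTEGER, KeyCol2 INTEGER, KeyCol3 INTEGER, Col4 TEXT,
        PRIMARY KEY (KeyCol1, KeyCol2, KeyCol3)
    );
    CREATE TABLE Staging (Col1 INTEGER, Col2 INTEGER, Col3 INTEGER, Col4 TEXT);
    INSERT INTO RealTable VALUES (1, 1, 1, 'already there');
    INSERT INTO Staging VALUES (1, 1, 1, 'from file'), (2, 2, 2, 'new');
""")

# Insert only staging rows whose composite key is not yet in the target.
conn.execute("""
    INSERT INTO RealTable (KeyCol1, KeyCol2, KeyCol3, Col4)
    SELECT Col1, Col2, Col3, Col4
    FROM Staging AS S
    WHERE NOT EXISTS (
        SELECT 1 FROM RealTable AS RT
        WHERE RT.KeyCol1 = S.Col1
          AND RT.KeyCol2 = S.Col2
          AND RT.KeyCol3 = S.Col3
    )
""")

real_rows = conn.execute(
    "SELECT * FROM RealTable ORDER BY KeyCol1, KeyCol2, KeyCol3"
).fetchall()
print(real_rows)
```

The existing row keeps its original Col4 and only the new key is added. The filter compares staging rows against the target only, so two staging rows sharing a key that is not yet in the target would both pass it and still collide on insert.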
In theory you could also use the set operator EXCEPT since it takes the distinct values from both tables. For example:
INSERT INTO RealTable
SELECT * FROM Staging
EXCEPT
SELECT * FROM RealTable
This would insert the distinct rows from Staging that don't already exist in RealTable. It compares whole rows rather than the composite PK, so if the CSV assigns different values to the same composite key on multiple rows, those rows survive the EXCEPT and the insert fails with a key violation, which at least tells you that the file contains conflicting data.
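Both behaviours of the EXCEPT approach can be seen in a short sketch. sqlite3 stands in for SQL Server, and the column names k1 to k3 and val are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RealTable (
        k1 INTEGER, k2 INTEGER, k3 INTEGER, val TEXT,
        PRIMARY KEY (k1, k2, k3)
    );
    CREATE TABLE Staging (k1 INTEGER, k2 INTEGER, k3 INTEGER, val TEXT);
    INSERT INTO RealTable VALUES (1, 1, 1, 'a');
    INSERT INTO Staging VALUES (1, 1, 1, 'a'), (2, 2, 2, 'b'), (2, 2, 2, 'b');
""")

EXCEPT_INSERT = """
    INSERT INTO RealTable
    SELECT * FROM Staging
    EXCEPT
    SELECT * FROM RealTable
"""

# Exact duplicates and rows already in the target are filtered out.
conn.execute(EXCEPT_INSERT)
after_first = conn.execute("SELECT * FROM RealTable ORDER BY k1").fetchall()

# Same key, different value: the whole-row comparison lets it through,
# and the primary key then rejects it.
conn.execute("INSERT INTO Staging VALUES (2, 2, 2, 'changed')")
try:
    conn.execute(EXCEPT_INSERT)
    conflict_detected = False
except sqlite3.IntegrityError:
    conflict_detected = True

print(after_first, conflict_detected)
```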
I am new to SQL Server and have a problem with an insert statement. I have to convert an old database to a SQL Server relational database, so I am transferring the old data into new tables. The old records are incomplete, which causes problems because the fields in the new tables do not allow null values. What I am trying to do is insert 'n/a' into the missing fields and, in the same statement, use a select to retrieve the available data from the old table, so that I don't get the "null value not allowed" error. Instead I get the error "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS", along with "The insert statement has more columns than the values statement".
I am sure there is a way to do this but I can't figure it out; I hope someone can help. Below is an abbreviated version of the statement.
insert into database1.dbo.table (col1, col2, .....col10)
values('n/a','n/a',(select col3, col4...col10 from database2.dbo.table)
You can try to use INSERT INTO ... SELECT
INSERT INTO database1.dbo.table (col1, col2, .....col10)
SELECT 'n/a',
'n/a',
col3,
col4,
...col10
FROM database2.dbo.table
I have a table, test, with 10 columns and 20 rows.
I need to move this data to the archive_test table, which has 11 columns (the same 10 as test plus one archive date column).
When I tried the insert below, I got an error because the number of columns does not match.
insert into archive_test
select * from test;
Please suggest a better way to do this. Thanks!
Well, obviously you need to supply values for all the columns, and although you can avoid doing so, you should also explicitly state which value is going to be inserted into which column. If you have an extra column in the target table, you can do one of the following:
Do not mention it
Specify a default value as part of its column definition in the table
Have a trigger to populate it
Specify a value for that column.
eg.
insert into archive_test (col1, col2, col3 ... col11)
select col1,
col2,
col3,
...
sysdate
from test;
assuming that archive_date is the last column:
INSERT INTO archive_test
SELECT test.*, sysdate
FROM test
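For a quick way to try the archive pattern, here is a sketch using Python's sqlite3. It shortens the tables to two data columns and uses SQLite's CURRENT_DATE where the answers above use sysdate, so treat the date function as an assumption to swap for whatever your database provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (col1 INTEGER, col2 TEXT);
    CREATE TABLE archive_test (col1 INTEGER, col2 TEXT, archive_date TEXT);
    INSERT INTO test VALUES (1, 'a'), (2, 'b');
""")

# Copy every row and stamp it with the current date in the extra column.
conn.execute("""
    INSERT INTO archive_test (col1, col2, archive_date)
    SELECT col1, col2, CURRENT_DATE
    FROM test
""")

archived = conn.execute(
    "SELECT col1, col2, archive_date FROM archive_test ORDER BY col1"
).fetchall()
print(archived)
```

Listing the columns on both sides means the statement keeps working even if archive_date is not the last column of archive_test.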
How can you export a result set given by a query to another table using SQL Server 2005?
I'd like to accomplish this without exporting to CSV.
INSERT INTO TargetTable(Col1, Col2, Col3)
SELECT Col1, Col2, Col3
FROM SourceTable
insert into table(column1, columns, etc) select columns from sourcetable
You can omit the column list in the INSERT if the columns returned by the SELECT match the table definition. Column names in the SELECT are ignored (values are matched by position), but listing them is recommended for readability.
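The positional matching is easy to trip over, so here is a small sketch of what it means in practice. It uses Python's sqlite3 as a stand-in for SQL Server, and the first_name and last_name columns are invented for the example; the two tables declare them in opposite orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SourceTable (first_name TEXT, last_name TEXT);
    CREATE TABLE TargetTable (last_name TEXT, first_name TEXT);
    INSERT INTO SourceTable VALUES ('Ada', 'Lovelace');
""")

# Without a column list the values are matched by position, not by name,
# so first_name silently lands in TargetTable.last_name.
conn.execute("INSERT INTO TargetTable SELECT first_name, last_name FROM SourceTable")
positional = conn.execute("SELECT last_name, first_name FROM TargetTable").fetchone()

# With an explicit column list each value goes where it belongs.
conn.execute("DELETE FROM TargetTable")
conn.execute(
    "INSERT INTO TargetTable (first_name, last_name) "
    "SELECT first_name, last_name FROM SourceTable"
)
explicit = conn.execute("SELECT last_name, first_name FROM TargetTable").fetchone()
print(positional, explicit)
```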
SELECT INTO is possible too, but it creates a new table. It is sometimes useful for selecting into a temporary table, but be aware of the tempdb locking that SELECT INTO can cause.
SELECT col1, col2, ...
INTO dbo.newtable
FROM (SELECT ...query...) AS x;