Bulk Insert (TSQL) from csv file with missing values - sql

Problem:
I have a table
CREATE TABLE BestTableEver
(
Id INT,
knownValue INT,
unknownValue INT DEFAULT 0,
totalValue INT DEFAULT 0);
And I have this CSV File (Loki.csv)
Id, knownValue, unknownValue, totalValue
1, 11114
2, 11135
3, 11235
I want to do a bulk insert into the table, and since I do not know the values of unknownValue and totalValue yet, I want them to take the default values (as defined in the table creation).
My approach so far
create procedure populateLikeABoss
    @i_filepath NVARCHAR(2048)
AS
BEGIN
    DECLARE @thor nvarchar(MAX)
    SET @thor =
    'BULK INSERT BestTableEver
    FROM ' + char(39) + @i_filepath + char(39) + '
    WITH
    (
        FIELDTERMINATOR = '','',
        ROWTERMINATOR = ''\n'',
        FIRSTROW = 2,
        KEEPNULLS
    )'
    exec(@thor)
END
and calling the procedure to do the magic
exec populateLikeABoss 'C:\Loki.csv'
Error
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 2 (sizeOnMedia).
References
Keeping NULL value with bulk insert (Microsoft)
Similar question without the answer I need (StackOverflow)

I think the CSV is not in the expected format. To keep the missing values, each record should carry the trailing field terminators, i.e. be in the format 1, 11114,, on every row. The other option is to remove the last two columns from the header.
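For example, a sketch of the padded file and the load. Note that, as I read the documented behaviour, KEEPNULLS keeps the empty fields as NULL, so to have the table's DEFAULT values applied you would leave it out:
Id, knownValue, unknownValue, totalValue
1, 11114,,
2, 11135,,
3, 11235,,
and then:
BULK INSERT BestTableEver
FROM 'C:\Loki.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
    -- no KEEPNULLS here, so the empty fields pick up the column DEFAULTs
)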

Related

Remove quotation chars from file while bulk inserting data to table

This is the Users table
CREATE TABLE Users
(
Id INT PRIMARY KEY IDENTITY(0,1),
Name NVARCHAR(20) NOT NULL,
Surname NVARCHAR(25) NOT NULL,
Email NVARCHAR(30),
Facebook NVARCHAR(30),
CHECK(Email IS NOT NULL OR Facebook IS NOT NULL)
);
This is the BULK INSERT
BULK INSERT Users
FROM 'C:\Users\SAMIR\Downloads\Telegram Desktop\users.txt'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
--FIRSTROW = 0,
--UTF-8
CODEPAGE = '65001'
);
So this is the Users.txt file data:
`1, N'Alex', N'Mituchin', N'qwe@gmail.com', NULL`
When I load data from the file, it sets Name to values like N'Alex'. But I want to have the data simply as Alex. How can I fix this problem?
I recommend loading the data into a staging table where all values are strings.
Then you can use a simple query to get the final results. In this case, you can do:
select (case when name like 'N''%'''
             then substring(name, 3, len(name) - 3)
             else name
        end) as name
from staging
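A minimal sketch of the staging load itself, with hypothetical column sizes (everything lands as plain text, and the query above cleans it up afterwards):
CREATE TABLE staging
(
    Id       NVARCHAR(50),
    Name     NVARCHAR(100),
    Surname  NVARCHAR(100),
    Email    NVARCHAR(100),
    Facebook NVARCHAR(100)
);

BULK INSERT staging
FROM 'C:\Users\SAMIR\Downloads\Telegram Desktop\users.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    CODEPAGE = '65001'
);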
There's a better option for this. If the string delimiters and Unicode indicators are consistent (they're present on all rows), you can use a format file, where you can specify the delimiter for each column. This lets you set , N' as the delimiter between the first and second columns, ', N' as the delimiter between the second and third columns, and so on.
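A sketch of what such a non-XML format file might look like for the Users.txt layout above; the version number (12.0), field lengths, and collations are assumptions, and since Id is an IDENTITY column you would add KEEPIDENTITY if you want to keep the values from the file. Note also that the literal NULL in the last field would still arrive as the text 'NULL':
12.0
5
1  SQLCHAR  0  12   ", N'"    1  Id        ""
2  SQLCHAR  0  50   "', N'"   2  Name      SQL_Latin1_General_CP1_CI_AS
3  SQLCHAR  0  50   "', N'"   3  Surname   SQL_Latin1_General_CP1_CI_AS
4  SQLCHAR  0  100  "', "     4  Email     SQL_Latin1_General_CP1_CI_AS
5  SQLCHAR  0  100  "\r\n"    5  Facebook  SQL_Latin1_General_CP1_CI_AS
and then point BULK INSERT at it:
BULK INSERT Users
FROM 'C:\Users\SAMIR\Downloads\Telegram Desktop\users.txt'
WITH (
    FORMATFILE = 'C:\users.fmt',  -- hypothetical path to the format file above
    CODEPAGE = '65001'
);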

BULK INSERT 0 Rows affected

I have been at this problem all morning and can't seem to figure it out. I have a simple txt file with the following entries:
1,van Rhijn
2,van Dam
3,van Rhijn van Dam
I am trying to import these fields using the following query:
CREATE TABLE #test
(
Id INT NOT NULL,
LastName VARCHAR(MAX) NOT NULL
)
BULK INSERT #test
FROM 'C:\test.txt'
WITH
(
MAXERRORS = 0,
FIRSTROW = 1,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n'
)
SELECT *
FROM #test
I have tried everything I found on the web: changing the delimiter, row terminator, encoding, extension. I keep getting the message "0 row(s) affected" and the last SELECT obviously returns no rows.
EDIT: I use Microsoft SQL Server.
Please help.
Might be a silly question, but are you actually using the syntax 'test.txt' or are you using a fully qualified or at least a full path like 'c:\test.txt'? Because I am pretty sure you need to use the full path here.
CREATE TABLE #test
(
Id INT NOT NULL,
LastName VARCHAR(MAX) NOT NULL
)
BULK INSERT #test
FROM 'C:\test.txt'
WITH
(
MAXERRORS = 0,
FIRSTROW = 1,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
SELECT *
FROM #test
Or use wherever your file resides on the network (note: if your SQL Server is on a different machine, you will probably need to use a network path and/or shared-folder combination), for example:
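Something along these lines, with a hypothetical UNC path; the SQL Server service account needs read access to the share:
BULK INSERT #test
FROM '\\FileServer\Imports\test.txt'  -- hypothetical share and file name
WITH
(
    MAXERRORS = 0,
    FIRSTROW = 1,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)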
Edit: Try updating your ROWTERMINATOR to '\n' only.

How to do an INSERT in SQL Server from a string containing a specific column

I'm trying to create a stored procedure that receives a string with the values for a specific column, and then inserts one row per value, with the values separated by ",".
For instance:
--The string I mentioned
@Objectid = '15, 21, 23, 53'.
--Then I wish to insert those values into a table like for instance
#Result( ID bigint, AppID bigint, ObjectID bigint)
So I want to perform an insert on that table, placing each value from @Objectid on a different row of #Result, in the ObjectID column, while at the same time filling the other columns with values I have stored in variables in the same stored procedure. Is there a way to do this? And if so, is there a way to do this without using a cursor?
DECLARE @Objectid varchar(max);
set @Objectid = '15, 21, 23, 53';
set @Objectid = Replace(@Objectid, ',', '.');
SELECT ParseName(@Objectid, 4) As ID,
       ParseName(@Objectid, 3) As AppID,
       ParseName(@Objectid, 2) As ObjectID,
       ParseName(@Objectid, 1) As ObjectID2
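If you are on SQL Server 2016 or later, here is a sketch that turns the list into rows without a cursor; the @AppID value and the #Result definition are just stand-ins for what the procedure already has:
DECLARE @Objectid varchar(max) = '15, 21, 23, 53';
DECLARE @AppID bigint = 42;  -- stand-in for a value already held by the procedure

CREATE TABLE #Result (ID bigint, AppID bigint, ObjectID bigint);

-- STRING_SPLIT returns one row per delimited value; LTRIM strips the space after each comma
INSERT INTO #Result (AppID, ObjectID)
SELECT @AppID, TRY_CAST(LTRIM(value) AS bigint)
FROM STRING_SPLIT(@Objectid, ',');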

Inserting Dates with BULK INSERT

I have a CSV file, which contains three dates:
'2010-07-01','2010-08-05','2010-09-04'
When I try to bulk insert them...
BULK INSERT [dbo].[STUDY]
FROM 'StudyTable.csv'
WITH
(
MAXERRORS = 0,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
I get an error:
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (CREATED_ON).
So I'm assuming this is because I have an invalid date format. What is the correct format to use?
EDIT
CREATE TABLE [dbo].[STUDY]
(
    [CREATED_ON] DATE,
    [COMPLETED_ON] DATE,
    [AUTHORIZED_ON] DATE
)
You've got quotes (') around your dates. Remove those and it should work.
Does your data file have a header record? If it does, the header values obviously will not be the correct data type and will fail when SQL Server tries to INSERT them into your table. Try this:
BULK INSERT [dbo].[STUDY]
FROM 'StudyTable.csv'
WITH
(
MAXERRORS = 0,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
)
According to MSDN the BULK INSERT operation technically doesn't support skipping header records in the CSV file. You can either remove the header record or try the above. I don't have SQL Server in front of me at the moment, so I have not confirmed this works. YMMV.
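If you are on SQL Server 2017 or later, here is a sketch that combines both points, letting the engine strip the single quotes around the dates instead of editing the file; FIRSTROW = 2 assumes there really is a header record:
BULK INSERT [dbo].[STUDY]
FROM 'StudyTable.csv'
WITH
(
    FORMAT = 'CSV',
    FIELDQUOTE = '''',        -- the fields are wrapped in single quotes
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2              -- skip the header record, if there is one
)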

SQL Bulk Insert with FIRSTROW parameter skips the following line

I can't seem to figure out how this is happening.
Here's an example of the file that I'm attempting to bulk insert into SQL server 2005:
***A NICE HEADER HERE***
0000001234|SSNV|00013893-03JUN09
0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09
Here's my bulk insert statement:
BULK INSERT sometable
FROM 'E:\filefromabove.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR= '|',
ROWTERMINATOR = '\n'
)
But, for some reason the only output I can get is:
0000005678|ABCD|00013893-03JUN09
0000009112|0000|00013893-03JUN09
0000009112|0000|00013893-03JUN09
The first data record always gets skipped, unless I remove the header altogether and don't use the FIRSTROW parameter. How is this possible?
Thanks in advance!
I don't think you can skip rows that are in a different format with BULK INSERT/BCP.
When I run this:
TRUNCATE TABLE so1029384
BULK INSERT so1029384
FROM 'C:\Data\test\so1029384.txt'
WITH
(
--FIRSTROW = 2,
FIELDTERMINATOR= '|',
ROWTERMINATOR = '\n'
)
SELECT * FROM so1029384
I get:
col1                      col2  col3
------------------------  ----  ----------------
***A NICE HEADER HERE***
0000001234                SSNV  00013893-03JUN09
0000005678                ABCD  00013893-03JUN09
0000009112                0000  00013893-03JUN09
0000009112                0000  00013893-03JUN09
It looks like it requires the '|' even in the header data: because the header has no field terminator, the engine reads through it into the first column, swallowing the newline into the first column of the row below. Obviously, if you include a field terminator parameter, it expects that every row MUST have one.
You could strip the row with a pre-processing step. Another possibility is to select only complete rows and then process them (excluding the header). Or use a tool which can handle this, like SSIS.
Maybe check that the header has the same line-ending as the actual data rows (as specified in ROWTERMINATOR)?
Update: from MSDN:
The FIRSTROW attribute is not intended to skip column headers. Skipping headers is not supported by the BULK INSERT statement. When skipping rows, the SQL Server Database Engine looks only at the field terminators, and does not validate the data in the fields of skipped rows.
I found it easiest to just read the entire line into one column then parse out the data using XML.
IF (OBJECT_ID('tempdb..#data') IS NOT NULL) DROP TABLE #data
CREATE TABLE #data (data VARCHAR(MAX))
BULK INSERT #data FROM 'E:\filefromabove.txt' WITH (FIRSTROW = 2, ROWTERMINATOR = '\n')
IF (OBJECT_ID('tempdb..#dataXml') IS NOT NULL) DROP TABLE #dataXml
CREATE TABLE #dataXml (ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED, data XML)
INSERT #dataXml (data)
SELECT CAST('<r><d>' + REPLACE(data, '|', '</d><d>') + '</d></r>' AS XML)
FROM #data
SELECT d.data.value('(/r//d)[1]', 'varchar(max)') AS col1,
d.data.value('(/r//d)[2]', 'varchar(max)') AS col2,
d.data.value('(/r//d)[3]', 'varchar(max)') AS col3
FROM #dataXml d
You can use the snippet below:
BULK INSERT TextData
FROM 'E:\filefromabove.txt'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '|', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
ERRORFILE = 'E:\ErrorRows.csv',
TABLOCK
)
To let SQL Server handle quote escaping and everything else, do this:
BULK INSERT Test_CSV
FROM 'C:\MyCSV.csv'
WITH (
FORMAT='CSV'
--FIRSTROW = 2, --uncomment this if your CSV contains header, so start parsing at line 2
);
Regarding the other answers, here is valuable info as well:
I keep seeing ROWTERMINATOR = '\n' in all the answers.
'\n' means LF, which is the Linux-style end of line.
On Windows, the end of line is made of two characters, CR LF, so you need ROWTERMINATOR = '\r\n'.
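For example, against the file from the earlier question, a sketch with the Windows-style terminator:
BULK INSERT sometable
FROM 'E:\filefromabove.txt'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\r\n'   -- CR LF, the Windows end-of-line sequence
)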
Given how mangled some data can look after BCP imports into SQL Server from non-SQL data sources, I'd suggest doing all the BCP imports into scratch tables first.
For example
truncate table Address_Import_tbl
BULK INSERT dbo.Address_Import_tbl
FROM 'E:\external\SomeDataSource\Address.csv'
WITH (
FIELDTERMINATOR = '|', ROWTERMINATOR = '\n', MAXERRORS = 10
)
Make sure all the columns in Address_Import_tbl are nvarchar, to make it as agnostic as possible and to avoid type conversion errors.
Then apply whatever fixes you need to Address_Import_tbl, like deleting the unwanted header.
Then run an INSERT ... SELECT query to copy from Address_Import_tbl to Address_tbl, along with any datatype conversions you need, for example to cast imported dates to SQL DATETIME.
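A sketch of that last step; the target table, its columns, and the date style are assumptions for illustration, and TRY_CONVERT needs SQL Server 2012 or later:
-- Copy from the all-nvarchar staging table into the real table, converting types as we go
INSERT INTO dbo.Address_tbl (Street, City, CreatedOn)
SELECT Street,
       City,
       TRY_CONVERT(DATETIME, CreatedOn, 120)   -- 120 = 'yyyy-mm-dd hh:mi:ss'; adjust to the source format
FROM dbo.Address_Import_tbl
WHERE Street <> 'Street';                      -- e.g. skip the imported header row (hypothetical check)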