Encoding issue of arguments for BULK INSERT via SSMS

So I stumbled on an interesting problem while trying to load some data. Essentially I have some files with data in them that I am trying to BULK INSERT into a table with varchar columns to make it easy to import. The file is tab delimited with CRLFs as the row terminator.
For some reason, when I write or copy-and-paste the BULK INSERT command from my own PC, the command fails with an error stating:
Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature.
Then it says:
Bulk load failed. The column is too long in the data file for row 1, column 7. Verify that the field terminator and row terminator are specified correctly.
The command is as follows:
BULK INSERT <table>
FROM '<filepath>'
WITH
(
    DATAFILETYPE = 'widechar',
    FIELDTERMINATOR = '	',  -- a literal tab character pasted between the quotes
    ROWTERMINATOR = '
'  -- a literal newline pasted between the quotes
);
Now, the part that doesn't make sense is that, without changing any piece of that code, my co-worker was able to run it and load the table with perfect success. No warning messages, no failures, nothing.
When I look at the data file in Notepad++ with all character symbols enabled, it appears correct, with CRLFs as the row endings and arrows denoting the tabs between columns.
The only thing I could come up with on my own is that the encoding of my SQL Server Management Studio text editor must somehow be mangling the field/row terminator arguments and causing the BULK INSERT command to fail.
Anyone have any bright ideas?

Turns out my coworker's computer does something unusual with encoding that lets him paste literal LFs correctly.
I was able to solve my problem by creating some dynamic SQL that executes the BULK INSERT, with the row terminator generated by concatenating CHAR(10) into the rest of the command.
CHAR(10) is the ASCII representation of a linefeed, which was the row terminator in the files I had.
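For reference, a minimal sketch of that workaround (the table name and file path are hypothetical, and CHAR(9) is used the same way for the tab field terminator):

DECLARE @sql nvarchar(max);

-- CHAR(9) is a tab and CHAR(10) is a linefeed; concatenating them into the
-- command text sidesteps whatever the editor does to pasted literal characters.
SET @sql = N'BULK INSERT dbo.MyTable '          -- hypothetical table
         + N'FROM ''C:\data\import.txt'' '      -- hypothetical path
         + N'WITH (FIELDTERMINATOR = ''' + CHAR(9) + N''', '
         + N'ROWTERMINATOR = ''' + CHAR(10) + N''');';

EXEC sp_executesql @sql;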

Related

How to load special characters (non-English letters) in SQL Loader

Some of my developer_id values are in a foreign language (special characters). I googled how to handle those characters, and the suggestions were to use
NVARCHAR2()
or to use:
INSERT INTO table_name VALUES (N'你好');
However, I used NVARCHAR2() on the stage table and on all the other tables, but it still doesn't work for me (the original datatype for developer_id was VARCHAR2()). Also, I don't think the INSERT statement with the N prefix applies to SQL Loader.
What should I do?
(Screenshots in the original question showed the error, the .ctl file, and the datatypes of the flat-file data.)
The character set for the flat file is UTF-8. I thought I had successfully solved this problem by changing the encoding when pre-loading the data into the stage table, but the same problem still shows up after I finish importing my data to stage.
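One commonly suggested fix (not from the original thread) is to declare the data file's character set in the control file, so SQL*Loader doesn't fall back to the client's NLS_LANG. A hypothetical control-file sketch, assuming a stage table named developers_stage and a data file named developers.dat:

LOAD DATA
CHARACTERSET UTF8                       -- tell SQL*Loader the file is UTF-8
INFILE 'developers.dat'                 -- hypothetical data file
APPEND INTO TABLE developers_stage      -- hypothetical stage table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
  developer_id CHAR(4000)               -- generous buffer for multi-byte characters
)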

Missing Final Row Qualifier in .csv and Bulk Insert

I'm having some trouble dealing with a missing row qualifier at the end of a .csv file. I automatically download a Google Sheets .csv, which is then bulk inserted into a SQL Server table. However, what I've found is that the final row of the file is not being inserted.
Looking at the file in Notepad++, all of the lines except the final one have a row qualifier of 'LF'.
The code I'm using to insert is below.
bulk insert CSVworkout
from 'C:\Users\Documents\Personal\531 Workouts.csv'
with (
    fieldterminator = ',',
    rowterminator = '0x0a',
    firstrow = 2)
Has anyone encountered anything similar? Looking around, it seems this is a drawback of the Google Sheets .csv export, but is there a way I can either force the insert to recognise the final row, or is there a tool I can use to automatically append an LF to the final row?
Any tips are very welcome!
Thanks
Can't you just add a CRLF to the file? There are plenty of different ways to do that; for example, you could use this in a batch file:
echo. >> "C:\Users\Documents\Personal\531 Workouts.csv"

SQL Server bulk insert of CSV with data containing commas

Below is a sample line of the CSV:
012,12/11/2013,"<555523051548>KRISHNA KUMAR ASHOKU,AR",<10-12-2013>,555523051548,12/11/2013,"13,012.55",
You can see that "KRISHNA KUMAR ASHOKU,AR" is a single field, but BULK INSERT treats KRISHNA KUMAR ASHOKU and AR as two different fields because of the comma, even though the value is enclosed in double quotes.
I tried:
BULK INSERT tbl
FROM 'd:\1.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
)
GO
Is there any solution for this?
The answer is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.
"Importing Data from a CSV file
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."
The general solution is that you must convert your CSV file into one that can be successfully imported. You can do that in many ways, such as by creating the file with a different delimiter (such as TAB), or by importing your table using a tool that understands CSV files (such as Excel or many scripting languages) and exporting it with a unique delimiter (such as TAB), from which you can then BULK INSERT.
They added support for this in SQL Server 2017 (14.x) CTP 1.1. You need to use the FORMAT = 'CSV' input file option with the BULK INSERT command.
To be clear, here is what the CSV that was giving me problems looks like. The first line is easy to parse; the second line contains the curve ball, since there is a comma inside the quoted field:
jenkins-2019-09-25_cve-2019-10401,CVE-2019-10401,4,Jenkins Advisory 2019-09-25: CVE-2019-10401:
jenkins-2019-09-25_cve-2019-10403_cve-2019-10404,"CVE-2019-10404,CVE-2019-10403",4,Jenkins Advisory 2019-09-25: CVE-2019-10403: CVE-2019-10404:
Broken Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
);
Working Code
BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FORMAT = 'CSV',
    FIRSTROW = 2
);
Unfortunately, the SQL Server import methods (BCP and BULK INSERT) do not understand quoting with " ".
Source : http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx
I encountered this problem recently and had to switch to the tab-delimited format. If you do that and use SQL Server Management Studio to do the import (right-click the database, then select Tasks, then Import), tab-delimited works just fine. The BULK INSERT option with tab-delimited data should also work.
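For example, a hypothetical tab-delimited variant of the statement from the question, assuming the file has been re-exported with tabs:

BULK INSERT tbl
FROM 'd:\1.tab'                   -- hypothetical re-exported file
WITH
(
    FIELDTERMINATOR = '\t',       -- tabs avoid the embedded-comma problem
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
);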
I must admit to being very surprised when finding out that Microsoft SQL Server had this comma-delimited issue. The CSV file format is a very old one, so finding out that this was an issue with a modern database was very disappointing.
MS have now addressed this issue: you can use FIELDQUOTE in your WITH clause to add quoted-string support:
FIELDQUOTE = '"',
anywhere in your WITH clause should do the trick, if you have SQL Server 2017 or above.
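For instance, a variant of the working statement above with FIELDQUOTE spelled out (same hypothetical file and table):

BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
    FORMAT = 'CSV',
    FIELDQUOTE = '"',          -- treat double quotes as the field qualifier
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
);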
Well, BULK INSERT is very fast but not very flexible. Can you load the data into a staging table and then push everything into a production table? Once the data is in SQL Server, you will have a lot more control over how you move it from one table to another. So, basically:
1) Load the data into staging.
2) Clean/convert it by copying it to a second staging table defined with the desired datatypes; good data is copied over, bad data is left behind.
3) Copy the data from the "clean" table to the "live" table, as sketched below.
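A rough sketch of that pattern (the table and column names are hypothetical, and TRY_CAST assumes SQL Server 2012 or later):

-- 1) Load the raw file into an all-varchar staging table
CREATE TABLE dbo.Staging_Raw (col1 varchar(255), col2 varchar(255));

BULK INSERT dbo.Staging_Raw
FROM 'd:\1.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- 2) Copy rows that convert cleanly into a typed staging table; bad rows stay behind
INSERT INTO dbo.Staging_Clean (id, amount)
SELECT TRY_CAST(col1 AS int), TRY_CAST(col2 AS decimal(10, 2))
FROM dbo.Staging_Raw
WHERE TRY_CAST(col1 AS int) IS NOT NULL;

-- 3) Move the clean rows into the live table
INSERT INTO dbo.LiveTable (id, amount)
SELECT id, amount FROM dbo.Staging_Clean;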

Bulk Import of CSV into SQL Server

I have a .CSV file that contains more than 100,000 rows.
I have tried the following method to import the CSV into the table "Root":
BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
But I get many errors, such as being told to check my terminators.
I opened the CSV with Notepad. There is no visible terminator ',' or '\n'; instead, I find a square box at the end of each row.
Please help me import this CSV into the table.
http://msdn.microsoft.com/en-us/library/ms188609.aspx
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. Note that the field terminator of a CSV file does not have to be a comma. To be usable as a data file for bulk import, a CSV file must comply with the following restrictions:
Data fields never contain the field terminator.
Either none or all of the values in a data field are enclosed in quotation marks ("").
Note: there may be other unseen characters that need to be stripped from the source file. Vim (command ":set list") and Notepad++ (View > Show Symbol > Show All Characters) are two ways to check.
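If the square box turns out to be a carriage return or some other stray character, one option (a hypothetical sketch, once the actual byte is known) is to spell the terminator out in hex:

BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0d0a'   -- hex for CR+LF; substitute whatever byte(s) the hidden character actually is
);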
If you are comfortable with Java, I have written a set of tools for CSV manipulation, including an importer and exporter. The project is up on Github.com:
https://github.com/carlspring/csv-db-tools
The importer is here:
https://github.com/carlspring/csv-db-tools/tree/master/csv-db-importer
For instructions on how to use the importer, check:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/USAGE
You will need to make a simple mapping file. An example can be seen here:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/src/test/resources/configuration-large.xml

SQL Server Bulk Insert with "^M" rowterminator

I have a flat file in which each row ends with a ^M character. I have found that this is generated by DOS/Windows and is a visual representation of 0x0D. I am trying to do a bulk insert of the file into SQL Server 2008, but I can't find a way to define ^M so that the process recognizes it as the row terminator. I have tried specifying it multiple ways with no success. Any ideas on how to import this file with the "^M" character as the row terminator?
dos2unix fileNameCreatedInWindows.sql

This small utility should help. It does what it says: it converts Windows-specific line endings (CRLF) into Unix ones (LF). Then you can use the file to perform the bulk insert.
0x0D is \r; have you tried passing a \r\n ROWTERMINATOR parameter to the BULK INSERT command?
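For what it's worth, a hypothetical statement covering both possibilities (the table name and path are made up):

BULK INSERT dbo.TargetTable
FROM 'C:\data\flatfile.txt'
WITH
(
    ROWTERMINATOR = '\r\n'     -- if each row ends with CR+LF
    -- ROWTERMINATOR = '0x0d'  -- hex for a bare CR, if ^M is the only row terminator
);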