SQL Server Bulk Insert with "^M" rowterminator

I have a flat file in which each row ends with a ^M character. I have found that this is generated by DOS/Windows and is a visual representation of 0x0D. I am trying to do a bulk insert of the file into SQL Server 2008, but I can't find a way to define ^M so that the process recognizes it as the row terminator. I have tried specifying it multiple ways without success. Any ideas on how to import this file with the "^M" character as the row terminator?

This small utility should help: dos2unix fileNameCreatedInWindows.sql
It does what it says: it converts Windows-specific line endings into Unix line endings. Then you can use the converted file to perform the bulk insert.

0x0D is \r; have you tried passing a \r\n row terminator parameter to the BULK INSERT command?
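As a minimal sketch (the table and file names here are hypothetical), the terminator can be given either as an escape sequence or, on versions that accept hexadecimal terminators, as the raw byte value, which is useful when the rows really do end in a bare carriage return:

BULK INSERT dbo.MyTable
FROM 'C:\data\flatfile.txt'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0d'  -- the bare carriage return (^M) as a hex byte; use '\r\n' for CRLF files
);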

Encoding issue of arguments for BULK INSERT via SSMS

So I stumbled on an interesting problem while trying to load some data. Essentially, I have some files with data in them that I am trying to BULK INSERT into a table with varchar columns to make the import easy. The file is tab-delimited with CRLFs as the row terminator.
For some reason, when I write or copy-and-paste the BULK INSERT command from my own PC, the command fails with an error stating:
Bulk load: DataFileType was incorrectly specified as widechar. DataFileType will be assumed to be char because the data file does not have a Unicode signature.
Then it says:
Bulk load failed. The column is too long in the data file for row 1, column 7. Verify that the field terminator and row terminator are specified correctly.
The command is as follows:
BULK INSERT <table>
FROM '<filepath>'
WITH
(
DATAFILETYPE = 'widechar',
FIELDTERMINATOR = '	', -- a literal tab character pasted between the quotes
ROWTERMINATOR = '
' -- a literal line break pasted between the quotes
);
Now, the part that doesn't make sense is that, without changing any piece of that code, my co-worker was able to run it and load the table with perfect success. No warning messages, no failures, nothing.
When I look at the command in Notepad++ with all character symbols enabled, it appears to be correct, with CRLFs as the row endings and arrows denoting the tabs between columns.
The only thing I could come up with on my own is that the encoding of my SQL Server Management Studio text editor must somehow be mangling the field/row terminator arguments and causing the BULK INSERT command to fail.
Anyone have any bright ideas?
It turns out my co-worker's machine does something unusual with encoding that lets him paste literal LFs correctly.
I was able to solve my problem by building dynamic SQL to execute the BULK INSERT, with the row terminator generated by CHAR(10) concatenated into the rest of the command.
CHAR(10) is the ASCII representation of a line feed, which was the row terminator in the files I had.
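A minimal sketch of that approach, with hypothetical table and file names; because the LF is produced by CHAR(10) at runtime, no literal line break has to survive the editor's encoding:

DECLARE @sql nvarchar(max);
SET @sql = N'BULK INSERT dbo.MyTable
FROM ''C:\data\myfile.txt''
WITH
(
    FIELDTERMINATOR = ''\t'',
    ROWTERMINATOR = ''' + CHAR(10) + N'''
);';
EXEC sp_executesql @sql;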

SQL Server bulk insert CSV with data having commas

Below is a sample line of the CSV:
012,12/11/2013,"<555523051548>KRISHNA KUMAR ASHOKU,AR",<10-12-2013>,555523051548,12/11/2013,"13,012.55",
You can see that "KRISHNA KUMAR ASHOKU,AR" should be a single field, but it is treated as two different fields, KRISHNA KUMAR ASHOKU and AR, because of the comma, even though they are enclosed in double quotes; still no luck.
I tried

BULK INSERT tbl
FROM 'd:\1.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
)
GO
Is there any solution for this?
The answer is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.
"Importing Data from a CSV file
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."
The general solution is that you must convert your CSV file into one that can be successfully imported. You can do that in many ways, such as by creating the file with a different delimiter (such as TAB), or by importing your table with a tool that understands CSV files (such as Excel or many scripting languages) and exporting it with a unique delimiter (such as TAB), from which you can then BULK INSERT.
They added support for this in SQL Server 2017 (14.x) CTP 1.1. You need to use the FORMAT = 'CSV' input file option with the BULK INSERT command.
To be clear, here is what the CSV that was giving me problems looks like. The first line is easy to parse; the second line contains the curveball, since there is a comma inside a quoted field:
jenkins-2019-09-25_cve-2019-10401,CVE-2019-10401,4,Jenkins Advisory 2019-09-25: CVE-2019-10401:
jenkins-2019-09-25_cve-2019-10403_cve-2019-10404,"CVE-2019-10404,CVE-2019-10403",4,Jenkins Advisory 2019-09-25: CVE-2019-10403: CVE-2019-10404:
Broken Code

BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
);

Working Code

BULK INSERT temp
FROM 'c:\test.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FORMAT = 'CSV',
    FIRSTROW = 2
);
Unfortunately, the SQL Server import methods (BCP and BULK INSERT) do not understand quoting with " ".
Source: http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx
I encountered this problem recently and had to switch to tab-delimited format. If you do that and use SQL Server Management Studio to do the import (right-click on the database, then select Tasks, then Import), tab-delimited works just fine. The BULK INSERT option with tab-delimited should also work.
I must admit I was very surprised to find that Microsoft SQL Server had this comma-delimiter issue. The CSV file format is a very old one, so finding this problem in a modern database was very disappointing.
MS has now addressed this issue: you can use FIELDQUOTE in your WITH clause to add quoted-string support:
FIELDQUOTE = '"',
anywhere in your WITH clause should do the trick, if you have SQL Server 2017 or above.
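For context, a minimal sketch of a complete command using it (table and file names are hypothetical):

BULK INSERT dbo.MyTable
FROM 'C:\data\myfile.csv'
WITH
(
    FORMAT = 'CSV',         -- CSV parsing mode, SQL Server 2017 and later
    FIELDQUOTE = '"',       -- quote character protecting embedded commas
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
);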
Well, BULK INSERT is very fast but not very flexible. Can you load the data into a staging table and then push everything into a production table? Once the data is in SQL Server, you will have a lot more control over how you move it from one table to another. So, basically:
1) Load the data into staging
2) Clean/convert by copying it to a second staging table defined with the desired datatypes; good data is copied over, bad data is left behind
3) Copy the data from the "clean" table to the "live" table (a sketch of this flow follows the list)
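A rough sketch of that flow, using hypothetical table and column names (TRY_CAST requires SQL Server 2012 or later):

-- 1) Load everything as text into a raw staging table
CREATE TABLE dbo.StagingRaw (col1 varchar(255), col2 varchar(255));

BULK INSERT dbo.StagingRaw
FROM 'C:\data\myfile.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- 2) Copy only the rows that convert cleanly into a typed staging table
INSERT INTO dbo.StagingClean (id, amount)
SELECT TRY_CAST(col1 AS int), TRY_CAST(col2 AS decimal(10, 2))
FROM dbo.StagingRaw
WHERE TRY_CAST(col1 AS int) IS NOT NULL;

-- 3) Move the clean rows into the live table
INSERT INTO dbo.LiveTable (id, amount)
SELECT id, amount
FROM dbo.StagingClean;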

Bulk Import of CSV into SQL Server

I have a .CSV file that contains more than 100,000 rows.
I have tried the following to import the CSV into the table "Root":
BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
But I get many errors along the lines of "check your terminators".
I opened the CSV with Notepad.
There is no visible terminator, no comma or \n; at the end of each row there is a square box.
Please help me import this CSV into the table.
http://msdn.microsoft.com/en-us/library/ms188609.aspx
Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. Note that the field terminator of a CSV file does not have to be a comma. To be usable as a data file for bulk import, a CSV file must comply with the following restrictions:
Data fields never contain the field terminator.
Either none or all of the values in a data field are enclosed in quotation marks ("").
Note: there may be other unseen characters that need to be stripped from the source file. Vim (with the command ":set list") and Notepad++ (View > Show Symbol > Show All Characters) are two ways to check.
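Once the actual row-ending byte has been identified, one option (a sketch; hexadecimal terminators may not be accepted on every version) is to give it to BULK INSERT in hex rather than as an escape sequence, for example if each row turns out to end in a bare line feed:

BULK INSERT [dbo].[Root]
FROM 'C:\Original.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a'  -- the identified row-ending byte, in hex
);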
If you are comfortable with Java, I have written a set of tools for CSV manipulation, including an importer and an exporter. The project is up on GitHub:
https://github.com/carlspring/csv-db-tools
The importer is here:
https://github.com/carlspring/csv-db-tools/tree/master/csv-db-importer
For instructions on how to use the importer, check:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/USAGE
You will need to make a simple mapping file. An example can be seen here:
https://github.com/carlspring/csv-db-tools/blob/master/csv-db-importer/src/test/resources/configuration-large.xml

TSQL Bulk Insert

I have a CSV file whose field delimiter is ,. My CSV files are very big, and I need to import them into a SQL Server table. The process must be automated, and it is not a one-time job.
So I use BULK INSERT to load such CSV files. But today I received a CSV file that contains this row:
1,12312312,HOME ,"House, Gregory",P,NULL,NULL,NULL,NULL
The problem is that BULK INSERT splits this row, specifically the field "House, Gregory", into two fields, '"House' and ' Gregory"'.
Is there some way to make BULK INSERT understand that the double quotes override the behaviour of the comma?
When I open this CSV with Excel, it sees the field normally as 'House, Gregory'.
You need to preprocess your file; see this answer:
SQL Server Bulk insert of CSV file with inconsistent quotes
If every row in the file has the double quotes, you can specify ," and ", as the column separators around that column using a format file (a sketch follows below).
If not, get the file changed, or you'll have to write some clever pre-processing routines somewhere.
The file format needs to be consistent for any of the SQL Server tools to work.
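A hedged sketch of what such a non-XML format file might look like for a three-column file in which only the middle column is quoted (a,"b,b",c); the column names and the version number (9.0, i.e. SQL Server 2005) are assumptions, and \" is an escaped double quote inside a format-file terminator:

9.0
3
1   SQLCHAR   0   100   ",\""    1   Col1   ""
2   SQLCHAR   0   100   "\","    2   Col2   ""
3   SQLCHAR   0   100   "\r\n"   3   Col3   ""

The file would then be referenced with BULK INSERT ... WITH (FORMATFILE = 'C:\data\quoted.fmt').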
Since you are referring to SQL Server, I assume you have Access available as well (a Microsoft-friendly environment). If you do, I recommend using its Import Wizard. It is much smarter than the import wizard of SQL Server (even in version 2014), and smarter than the BULK INSERT SQL command as well.
It has a widget where you can define the text separator to be ", and it also has no problems with string length, because it uses the Access data type Text.
If you are satisfied with the results in Access, you can then import them into SQL Server seamlessly.
The best way to move the data from Access to SQL Server is the SQL Server Migration Assistant, available here

bcp and backspace (^H) delimiter

I need to parse a flat file that uses the backspace (^H) character as the delimiter between fields, and insert the data into SQL Server 2005 tables. I tried to use the bcp utility along with a format file, but I wasn't able to specify the delimiter as a backspace.
The default is tab (\t). There are several other delimiters available as well, but none for backspace. Does anyone have any ideas? Please do help me.
Also, I need to export data from a SQL Server table to a fixed-length flat file. I tried to use a non-XML format file, but it always asks for a delimiter. How can I create a flat file using bcp without any delimiter between the fields?
All of the above are character files.
This is an ugly workaround, but you could always find a character that does not appear anywhere in the flat file, replace every backspace with that character, and then use it as the field terminator (via bcp -t).
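For example (a sketch with hypothetical database, table, file, and server names), after replacing each backspace with a pipe:

bcp MyDb.dbo.MyTable in C:\data\file.txt -c -t "|" -S myserver -T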
Sorry that I'm almost 11 years late on this, and hopefully you've already solved your problem, but you can use the hexadecimal representation of the backspace character, 0x08, to parse your input file and properly delimit fields that are separated by a backspace.
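A minimal sketch of applying that with BULK INSERT, assuming a hypothetical table and file (hexadecimal terminators are accepted on recent versions of SQL Server):

BULK INSERT dbo.MyTable
FROM 'C:\data\backspace_delimited.txt'
WITH
(
    FIELDTERMINATOR = '0x08',  -- backspace (^H) as a hex byte
    ROWTERMINATOR = '0x0a'     -- assuming LF row endings
);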