Cannot bulk load error message - syntax-error

I am attempting a bulk insert. The sample data and the format file are given below. It was brought to my attention that we need to use a Universal Naming Convention (UNC) path, hence the '\\FR-6RSGJH2.xyz.st\C$' item in the code. However, the same error occurs if you simplify it to '\C\Users\myname\Desktop\testimport.csv'. Any ideas as to what is missing in the syntax, or any settings that could be changed?
BULK INSERT testimport
FROM '\\FR-6RSGJH2.xyz.st\C$\Users\myname\Desktop\testimport.csv'
WITH (FORMATFILE = '\\FR-6RSGJH2.xyz.st\C$\Users\myname\Desktop\format.txt')
GO
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file
"\C\Users\myname\Desktop\testimport.csv" could not be opened. Operating
system error code 3(The system cannot find the path specified.).
Sample data
32003012017010316
32001022017040218
32003032017030213
32002042017020111
32002052017020110
format file
13.0
5
1 SQLCHAR 0 02 "" 1 st ""
2 SQLCHAR 0 03 "" 2 cnty ""
3 SQLCHAR 0 02 "" 3 v1 ""
4 SQLCHAR 0 08 "" 4 date ""
5 SQLCHAR 0 02 "\r\n" 5 v2 ""
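As a quick sanity check outside SQL Server, the widths in the format file can be applied to a sample row by hand. The Python sketch below is only an illustration (the helper name split_fixed_width is made up here). It assumes each row is exactly 17 characters followed by a line break, which is what the five widths add up to.

```python
from itertools import accumulate

# Field names and widths copied from the format file above.
FIELDS = [("st", 2), ("cnty", 3), ("v1", 2), ("date", 8), ("v2", 2)]


def split_fixed_width(line, fields=FIELDS):
    """Slice one row into named fields using the format-file widths."""
    line = line.rstrip("\r\n")
    expected = sum(width for _, width in fields)
    if len(line) != expected:
        raise ValueError(f"expected {expected} characters, got {len(line)}")
    ends = list(accumulate(width for _, width in fields))
    starts = [0] + ends[:-1]
    return {name: line[s:e] for (name, _), s, e in zip(fields, starts, ends)}


# First row of the sample data, with the assumed CR LF row terminator.
print(split_fixed_width("32003012017010316\r\n"))
```

If a row raises ValueError here, the data and the format file disagree about the record length, which is worth ruling out before looking at paths and permissions.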

I'm not sure why, but when I saved testimport as a .txt file instead of a .csv, it worked. Anyway, that is the answer.

Related

My last column isn't getting populated

I'm trying to use a non-XML format file to bulk import a null-delimited file into SQL Server. I've added a column to the staging table in question, and updated the format file to reflect this. Everything seems to be inserting fine, except this last column. The column I added is
Comments (nvarchar(256), null)
The format file looks like this:
11.0
8
1 SQLNCHAR 0 4 "\0\0" 1 ClaimCheckSetId ""
2 SQLNCHAR 0 4 "\0\0" 2 BatchValidationId ""
3 SQLNCHAR 0 4 "\0\0" 3 SourceCommunicationId ""
4 SQLNCHAR 0 4 "\0\0" 4 TargetCommunicationId ""
5 SQLNCHAR 0 1800 "\0\0" 5 TargetExternalCommunicationId ""
6 SQLNCHAR 0 8 "\0\0" 6 TargetSentDateTime ""
7 SQLNCHAR 0 2000 "\0\0" 7 TargetSubject ""
8 SQLNCHAR 0 256 "\r\0\n\0" 8 Comments ""
The SQL looks like this:
DECLARE @filepath NVARCHAR(MAX) = 'C:\{file to import}_512fc21d-dbc9-4975-8169-2ca383ac2bdf.txt';
DECLARE @formatpath NVARCHAR(MAX) = 'C:\{format file}.txt';
DECLARE @bulkinsert NVARCHAR(MAX);
-- Build the statement dynamically: BULK INSERT does not accept variables for file paths.
SET @bulkinsert =
N'BULK INSERT [The Table]
FROM ''' + @filepath + N'''
WITH
(
FORMATFILE = ''' + @formatpath + N''',
DATAFILETYPE = ''WIDECHAR'',
FIRSTROW = 1
)';
SET ANSI_WARNINGS OFF;
EXEC sp_executesql @bulkinsert;
SET ANSI_WARNINGS ON;
I'm getting no errors, and it is returning a number of rows affected. Unfortunately, I don't know enough about SQL to diagnose this problem. A few hours of googling have not helped either. I hope one of you kind guys or gals can set me back on the straight and narrow.
Update: I edited the \r\0\n\0 to \r\n and am now getting an error!
OLE DB provider 'BULK' for linked server '(null)' returned invalid data for column '[BULK].InsertedDateTime'.
You should check the input file in an editor that shows special symbols. Personally I use Notepad++ (free) for that (View > Show Symbol > Show All Characters), but any decent editor will do.
That way the row terminator (i.e. the last field terminator) should be clearly visible. In Notepad++ the \0 will be visible as NUL, \r as CR and \n as LF.
So with the settings you currently have, you should be seeing CR NUL LF NUL. If you don't, change the last field terminator to match what you see in your editor.
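If no such editor is at hand, the same check can be sketched in a few lines of Python by reading the raw bytes. The helper names below are hypothetical, and the sample bytes only stand in for the end of a real data file.

```python
NAMES = {0x00: "NUL", 0x0D: "CR", 0x0A: "LF"}


def name_bytes(data):
    """Render bytes the way an editor with 'show all characters' would."""
    return [NAMES.get(b, chr(b) if 32 <= b < 127 else f"0x{b:02X}") for b in data]


def count_terminators(data):
    """Count the two row terminators discussed above in raw file bytes."""
    return {
        "CR NUL LF NUL": data.count(b"\r\x00\n\x00"),
        "CR LF": data.count(b"\r\n"),
    }


# Stand-in for a real file; in practice: data = open(path, "rb").read()
sample = "abc\r\n".encode("utf-16-le")
print(name_bytes(sample[-6:]))
print(count_terminators(sample))
```

A file whose tail prints as CR NUL LF NUL matches the "\r\0\n\0" terminator in the format file. A tail of plain CR LF suggests the file is not wide-character at all.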
With the limited information I have, can you please change the following
8 SQLNCHAR 0 256 "\r\0\n\0" 8 Comments ""
to
8 SQLNCHAR 0 256 "\r\n" 8 Comments ""
or
8 SQLNCHAR 0 256 "\0\0" 8 Comments ""
It seems the last one should wrap to new line.

SQL Server Bulk Insert fixed width file failure

I am attempting to use BULK INSERT to upload a very large data file (5M rows). All columns are just varchars, with no conversion, so the format file is simple:
11.0
29
1 SQLCHAR 0 8 "" 1 AccountId ""
2 SQLCHAR 0 10 "" 2 TranDate ""
3 SQLCHAR 0 4 "" 3 TransCode ""
4 SQLCHAR 0 2 "" 4 AdditionalCode ""
5 SQLCHAR 0 11 "" 5 CurrentPrincipal ""
6 SQLCHAR 0 11 "" 6 CurrentInterest ""
7 SQLCHAR 0 11 "" 7 LateInterest ""
...
27 SQLCHAR 0 8 "" 27 Operator ""
28 SQLCHAR 0 10 "" 28 UpdateDate ""
29 SQLCHAR 0 12 "" 29 TimeUpdated ""
but each time, at some point, I get the same error:
Msg 4832, Level 16, State 1, Line 1 Bulk load: An unexpected end of
file was encountered in the data file. Msg 7399, Level 16, State 1,
Line 1 The OLE DB provider "BULK" for linked server "(null)" reported
an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB
provider "BULK" for linked server "(null)".
I have tried the following:
Bulk Insert
[TableName] From 'dataFilePPathSpecification'
With (FORMATFILE = 'formatFilePPathSpecification')
but I get the error after about 5-6 minutes, and no data has been inserted.
When I added the BATCHSIZE parameter, I got the error after a much longer time, near the end of the file, after all but a very few of the rows had been inserted successfully.
Bulk Insert
[TableName] From 'dataFilePPathSpecification'
With (BATCHSIZE = 200,
FORMATFILE = 'formatFilePPathSpecification')
When I set the BatchSize to 2000 it runs much faster (fewer, larger transactions, I assume), but it still fails.
Does this have something to do with how the Bulk Insert recognizes the end of the file? If so, what do I need to do to the format file to fix it ?
Explicitly state your row terminator:
BULK INSERT TableName FROM 'Path'
WITH (
DATAFILETYPE = 'char',
ROWTERMINATOR = '\r\n',
FORMATFILE = 'formatFilePPathSpecification'
);
If this still fails, check your file to see if you have unexpected terminators embedded in text fields.
Try using the ERRORFILE option in the WITH clause to find the offending data:
ERRORFILE = 'C:\offendingdata.log'
If you still have problems even after enabling the error file output, you can do a binary search for the bad row by setting the FIRSTROW and LASTROW options and running BULK INSERT repeatedly to isolate it.
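The binary search over FIRSTROW and LASTROW can be sketched as follows. This is an illustration in Python rather than T-SQL, and loads_ok is a hypothetical callback: in real use it would run BULK INSERT with the given FIRSTROW and LASTROW against a scratch table and report whether the load succeeded.

```python
def find_first_bad_row(first, last, loads_ok):
    """Return the lowest row number whose presence makes a load fail.

    loads_ok(a, b) is assumed to return True only when rows a..b all load.
    Returns None when the whole range loads cleanly.
    """
    if loads_ok(first, last):
        return None
    lo, hi = first, last
    while lo < hi:
        mid = (lo + hi) // 2
        if loads_ok(lo, mid):
            lo = mid + 1  # rows lo..mid are clean, so look in the upper half
        else:
            hi = mid      # a bad row is somewhere in lo..mid
    return lo


# Simulated loader with one bad row, standing in for real BULK INSERT runs.
bad_rows = {4999998}
probe = lambda a, b: not any(a <= r <= b for r in bad_rows)
print(find_first_bad_row(1, 5000000, probe))
```

For a 5M-row file this needs roughly 23 probes. Each probe is a partial load, so pointing it at an empty staging table keeps the runs independent.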
To be honest, your input format looks so simple that it might be a good idea to write a small C#, Python, or whatever-floats-your-boat app to quality-check your data before attempting the import. You could simply discard invalid rows (or possibly fix them), write them to an exceptions file for hand processing, or simply stop the job, i.e., the file must be perfect or it is considered corrupted. Validating 5M rows this way will be quite fast: essentially as fast as you can read (and possibly write) the file.
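A minimal sketch of such a checker in Python, assuming fixed-width rows ended by a line break. The function name and the expected_width parameter are made up for illustration. The real width would be the sum of all 29 field lengths in the format file, which is not fully shown here.

```python
def check_rows(lines, expected_width):
    """Yield (row_number, problem) for rows that would upset a fixed-width load.

    lines is any iterable of bytes objects, e.g. a file opened in "rb" mode.
    """
    for number, raw in enumerate(lines, start=1):
        row = raw.rstrip(b"\r\n")
        if b"\x00" in row:
            yield number, "contains NUL byte"
        elif len(row) != expected_width:
            yield number, f"length {len(row)} != {expected_width}"


# Tiny in-memory stand-in for a data file with 4-character records.
sample = [b"AAAA\r\n", b"BBBB\x00\r\n", b"CC\r\n"]
for number, problem in check_rows(sample, expected_width=4):
    print(number, problem)
```

Reading in binary mode matters here: stray bytes such as NUL stay visible instead of being decoded or silently dropped.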
Thanks to all for the suggestions; I applied both ideas. I wrote a small .NET (C#) file processor utility, and it told me there were additional nulls (binary zeroes, \0) at the end of every line, which I was able to strip off using a simple C# program.
The error file indicated the issue was at the very end (that's what the error message said!).
The actual issue was that BULK INSERT could not recognize the EOF. I had to modify the format file like this to fix it; then it worked.
11.0
29
1 SQLCHAR 0 8 "" 1 AccountId ""
2 SQLCHAR 0 10 "" 2 TranDate ""
3 SQLCHAR 0 4 "" 3 TransCode ""
4 SQLCHAR 0 2 "" 4 AdditionalCode ""
5 SQLCHAR 0 11 "" 5 CurrentPrincipal ""
6 SQLCHAR 0 11 "" 6 CurrentInterest ""
7 SQLCHAR 0 11 "" 7 LateInterest ""
...
27 SQLCHAR 0 8 "" 27 Operator ""
28 SQLCHAR 0 10 "" 28 UpdateDate ""
29 SQLCHAR 0 12 "\r\n" 29 TimeUpdated ""

BCP format file editing for Bulk Import into SQL

I'm attempting to import a large amount of data contained in a CSV file into a SQL database. The CSV is 4 GB in size, with 329 columns and 300,000+ rows of data. So far I've successfully created the database and the table that will hold the data once imported. The data contains strings (VARCHAR(x)), numerics (INT), and dates (DATE).
The data contained within the CSV file is separated by a "," delimiter, but all of the data fields are enclosed in double quotes, and some fields contain no value. Below is a mock example of the data.
"123244234","09/12/2012","First Name","Last Name","Address 1","","","555-555-5555","","CountryCode"
From my research, I've determined that the easiest way to import the data will be to use BCP to create a format file and then use that with BULK INSERT. The only problem is formatting the format file to remove the double quotes. When attempting to import without a format file, it fails on row one because the first column of the first row is numeric and has "" around it.
I've reviewed the following link, which talks about removing the double quotes with the use of a dummy entry: "http://support.microsoft.com/default.aspx?scid=kb;EN-US;132463". In this case that is a lot of manual editing. Does anyone know of a better way to edit the format file? Here is a sample of the format file:
10.0
329
1 SQLCHAR 0 12 "," 1 NPI ""
2 SQLCHAR 0 12 "," 2 Entity Type Code ""
3 SQLCHAR 0 12 "," 3 Replacement NPI ""
4 SQLCHAR 0 9 "," 4 Employer Identification Number (EIN) SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 0 70 "," 5 Provider Organization Name (Legal Business Name) SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 0 35 "," 6 Provider Last Name (Legal Name) SQL_Latin1_General_CP1_CI_AS
7 SQLCHAR 0 20 "," 7 Provider First Name SQL_Latin1_General_CP1_CI_AS
8 SQLCHAR 0 20 "," 8 Provider Middle Name SQL_Latin1_General_CP1_CI_AS
9 SQLCHAR 0 5 "," 9 Provider Name Prefix Text SQL_Latin1_General_CP1_CI_AS
10 SQLCHAR 0 5 "," 10 Provider Name Suffix Text
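One way to avoid hand-editing 329 entries is to generate the format file with a short script. The Python sketch below follows the dummy-entry idea the question mentions: an extra first field that swallows the opening quote, then "\",\"" between fields and "\"\r\n" after the last one. It is an assumption-laden illustration, not a tested recipe: the function name and the dummy field name are made up, the row terminator is assumed to be CR LF, collations are left as "", and the column names used in the demo have no spaces.

```python
def quoted_csv_format_file(columns, version="10.0"):
    """Build a non-XML format file for a CSV whose fields are all double-quoted.

    columns is a list of (column_name, max_width) pairs in table order.
    """
    # Entry layout: (width, terminator, server column number, name).
    # Server column 0 means "do not load": the dummy eats the opening quote.
    entries = [(0, r'"\""', 0, "dummy")]
    for position, (name, width) in enumerate(columns, start=1):
        is_last = position == len(columns)
        terminator = r'"\"\r\n"' if is_last else r'"\",\""'
        entries.append((width, terminator, position, name))

    lines = [version, str(len(entries))]
    for order, (width, terminator, server_col, name) in enumerate(entries, start=1):
        lines.append(f'{order} SQLCHAR 0 {width} {terminator} {server_col} {name} ""')
    return "\n".join(lines) + "\n"


print(quoted_csv_format_file([("NPI", 12), ("EntityTypeCode", 12), ("ReplacementNPI", 12)]))
```

The (name, width) pairs could be read from the format file that BCP already generated, so only the terminators and the extra first entry change.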

SQL Server 2005 bulk insert with format file error

error list
Msg 4866, Level 16, State 7, Line 2
The bulk load failed. The column is too long in the data file for row 1, column 1.
Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 2
The OLE DB provider "BULK" for linked server "(null)" reported an error.
The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 2
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
fmt file
9.0
10
1 SQLCHAR 2 50 "," 2 EmployeeSSN SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 2 50 "," 3 DOB SQL_Latin1_General_CP1_CI_AS
3 SQLCHAR 2 50 "," 4 Gender SQL_Latin1_General_CP1_CI_AS
4 SQLCHAR 2 50 "," 5 Relcode SQL_Latin1_General_CP1_CI_AS
5 SQLCHAR 2 50 "," 6 EmployeeID SQL_Latin1_General_CP1_CI_AS
6 SQLCHAR 2 50 "," 7 AssessmentType SQL_Latin1_General_CP1_CI_AS
7 SQLCHAR 2 50 "," 8 MeasurementDate SQL_Latin1_General_CP1_CI_AS
8 SQLCHAR 2 50 "," 9 RecordCreationDate SQL_Latin1_General_CP1_CI_AS
9 SQLCHAR 2 50 "," 10 AttributeID SQL_Latin1_General_CP1_CI_AS
10 SQLCHAR 2 50 "/r/n" 11 AttributeValue SQL_Latin1_General_CP1_CI_AS
Bulk insert code
BULK insert *******_raw_data
from 'E:\*****_csv\BWC_To_*****_2.csv'
with (formatfile = 'c:\*******_raw_data-n.fmt');
first line from csv
NULL,07/14/1983,F,S,105***,HRA,09/28/2011,09/28/2011,19,1
I am trying to figure out where I am going wrong here. I have gotten other files to work but have been unsuccessful with this one. The file names are correct in my code; they are starred out here because they are company names.
First error:
Msg 4866, Level 16, State 7, Line 2
The bulk load failed. The column is too long in the data file for row 1, column 1. Verify that the field terminator and row terminator are specified correctly.
This is either a problem with NULL or the row terminator.
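The last terminator for the row may not be "\r\n"; it could be just "\n". Also note that the format file writes it as "/r/n" with forward slashes, which is not an escape sequence. It is best to confirm the actual line ending with a hex editor.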
Second and Third Error:
These all look like a NULL problem.
The correct way to handle nulls in BULK INSERT is to specify the KEEPNULLS option.
with (formatfile = 'c:\*******_raw_data-n.fmt',KEEPNULLS);
Create the csv files with an empty field for NULL values.
,07/14/1983,F,S,105***,HRA,09/28/2011,09/28/2011,19,1
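If the export cannot be changed at the source, the literal NULL tokens can be blanked out in a preprocessing pass. A minimal Python sketch, assuming a plain comma-separated file with CR LF line endings and that the exact text NULL never appears as a real value (blank_out_nulls is a made-up name):

```python
import csv
import io


def blank_out_nulls(text):
    """Replace fields that are exactly NULL with empty fields."""
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\r\n")
    for row in reader:
        writer.writerow(["" if field == "NULL" else field for field in row])
    return out.getvalue()


line = "NULL,07/14/1983,F,S,105***,HRA,09/28/2011,09/28/2011,19,1\r\n"
print(blank_out_nulls(line), end="")
```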

Pulling from a very very specific location embedded in a text file

I finished every piece of code in my program save for one tidbit: how to pull two numbers from a text file. I know how to pull lines, I know how to pull search strings, but I can't figure out this one to save my life.
Anyways here is a sample of the automatically generated text that I need to pull from...
.......................................................................
Applications Memory Usage (kB):
Uptime: 6089044 Realtime: 6089040
** MEMINFO in pid 764 [com.lookout] **
native dalvik other total
size: 27908 8775 N/A 36683
allocated: 3240 4216 N/A 7456
free: 24115 4559 N/A 28674
(Pss): 1454 1142 6524 *9120*
(priv dirty): 1436 628 5588 *7652*
Objects
Views: 0 ViewRoots: 0
AppContexts: 0 Activities: 0
Assets: 3 AssetManagers: 3
Local Binders: 15 Proxy Binders: 41
Death Recipients: 3
OpenSSL Sockets: 0
SQL
heap: 98 MEMORY_USED: 98
PAGECACHE_OVERFLOW: 16 MALLOC_SIZE: 50
DATABASES
pgsz dbsz Lookaside(b) Dbname
1 14 120 google_analytics.db
Asset Allocations
zip:/system/app/com.lookout_6.0.1_r8234_Release.apk:/resources.arsc: 161K
.............................................................................
The two numbers that I need out of this are the two that I put between asterisks (the asterisks are not normally there). These numbers will be different every time this sheet is generated, and their placement might differ as well, since the numbers could have 4, 5, or 6 digits.
If anyone could shed any light on the subject it would be greatly appreciated
Thanks,
Zach
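You just need to read in the last word of the line and convert it to a number. Use String.LastIndexOf to find the last space " " in the line and read the data from that point forwards.
Dim line As String = " (Pss): 1454 1142 6524 9120"
Dim value As Integer
' Contains also matches when "(Pss)" starts the line, which IndexOf(...) > 0 would miss.
If line.Contains("(Pss)") Then
    ' Trim first so trailing whitespace does not become the "last word".
    Dim trimmed As String = line.TrimEnd()
    value = CInt(trimmed.Substring(trimmed.LastIndexOf(" ") + 1))
End If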
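The same "last word on the matching line" idea, sketched in Python for comparison. The function name is hypothetical, and stripping "*" is only there because the sample in the question marks the two values with asterisks.

```python
def last_number(text, label):
    """Return the final number on the first line containing label, or None."""
    for line in text.splitlines():
        if label in line:
            # split() ignores runs of spaces, so column widths do not matter.
            return int(line.split()[-1].strip("*"))
    return None


report = """\
 free: 24115 4559 N/A 28674
 (Pss): 1454 1142 6524 *9120*
 (priv dirty): 1436 628 5588 *7652*
"""
print(last_number(report, "(Pss)"), last_number(report, "(priv dirty)"))
```

Because the line is split on whitespace, the lookup keeps working whether the totals have 4, 5, or 6 digits.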