I have to import SAP unconverted lists. These reports look quite ugly and are not well suited for automated processing, but there is no other option. The data is bordered by minus and pipe symbols, similar to the following example:
02.07.2012
--------------------
Report name
--------------------
|Header1 |Header2 |
|Value 11|Value1 2 |
|Value 21|Value2 2 |
--------------------
I use a format file and a statement like the following:
SELECT Header1, Header2
FROM OPENROWSET(BULK 'report.txt',
                FORMATFILE = 'formatfile_report.xml',
                ERRORFILE = 'rejects.txt',
                FIRSTROW = 2,
                MAXERRORS = 100) AS report
Unfortunately, I receive the following error:
Msg 4832, Level 16, State 1, Line 1
Bulk load: An unexpected end of file was encountered in the data file.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
The rejects.txt file contains the last row from the file, which consists of nothing but minus signs. The rejects.txt.Error.Txt file documents:
Row 21550 File Offset 3383848 ErrorFile Offset 0 - HRESULT 0x80004005
The culprit that raises the error is obviously the very last row, which does not conform to the format declared in the format file. The ugly header, however, does not cause much trouble (at least the one at the very top).
Although I defined the MAXERRORS option, that single malformed line kills the whole operation. If I manually delete the last line containing all those minus signs (-), everything works fine. Since the import is supposed to run frequently and, above all, unattended, that extra post-processing is not a serious solution.
Can anyone help me get SQL Server to be less picky and less fragile? It is good that it documents the lines that couldn't be loaded, but why does it abort the whole operation? Furthermore, after one execution of a statement that caused the creation of rejects.txt, no other (or the same) statement can be executed until the file is deleted manually:
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file "rejects.txt" could not be opened. Operating system error code 80(The file exists.).
Msg 4861, Level 16, State 1, Line 1
Cannot bulk load because the file "rejects.txt.Error.Txt" could not be opened. Operating system error code 80(The file exists.).
I think that is weird behavior. Please help me to suppress it.
EDIT - FOLLOWUP:
Here is the format file I use:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <RECORD>
    <FIELD ID="EMPTY"   xsi:type="CharTerm" TERMINATOR="|"     MAX_LENGTH="100"/>
    <FIELD ID="HEADER1" xsi:type="CharTerm" TERMINATOR="|"     MAX_LENGTH="100"/>
    <FIELD ID="HEADER2" xsi:type="CharTerm" TERMINATOR="|\r\n" MAX_LENGTH="100"/>
  </RECORD>
  <ROW>
    <COLUMN SOURCE="HEADER1" NAME="HEADER1" xsi:type="SQLNVARCHAR"/>
    <COLUMN SOURCE="HEADER2" NAME="HEADER2" xsi:type="SQLNVARCHAR"/>
  </ROW>
</BCPFORMAT>
BULK INSERT is notoriously fiddly and unhelpful when it comes to handling data that doesn't meet the specifications provided.
I haven't done a lot of work with format files, but one thing you might want to consider as a replacement is using BULK INSERT to drop each line of the file into a temporary staging table with a single nvarchar(max) column.
This lets you get your data into SQL for further examination, and then you can use the various string manipulation functions to break it down into the data you want to finally insert.
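For illustration, a minimal sketch of that staging-table approach could look like the following; the temp table name, file path, and the assumption that lines end in \r\n are placeholders, not details from the question:
-- Stage every line of the raw report as a single wide value
CREATE TABLE #RawReport (Line nvarchar(max));

-- No field terminator, so no line can violate the format and get rejected
BULK INSERT #RawReport
FROM 'C:\import\report.txt'              -- placeholder path
WITH (ROWTERMINATOR = '\r\n');

-- Keep only the pipe-delimited data rows and carve out the two columns
SELECT
    LTRIM(RTRIM(SUBSTRING(Line, 2, CHARINDEX('|', Line, 2) - 2))) AS Header1,
    LTRIM(RTRIM(SUBSTRING(Line, CHARINDEX('|', Line, 2) + 1,
                          LEN(Line) - CHARINDEX('|', Line, 2) - 1))) AS Header2
FROM #RawReport
WHERE Line LIKE '|%|'                     -- data rows start and end with a pipe
  AND Line NOT LIKE '|Header1%';          -- skip the column-header row
From there, an INSERT into the real target table can pick up only the rows that parsed cleanly.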
I was in the same trouble, but the problem was solved by using the bcp command line utility; it simply doesn't take the last row.
I had the same problem. My file had 115 billion rows, so manually deleting the last row was not an option; I couldn't even open the file, as it was too big.
Instead of using BULK INSERT command, I used the bcp command, which looks like this:
(Open a command prompt as administrator, then run:)
bcp DatabaseName.dbo.TableNameToInsertIn in C:\Documents\FileNameToImport.dat -S ServerName -U UserName -P PassWord
It's about the same speed as BULK INSERT as far as I can tell (it took only 12 minutes to import my data). When looking at Activity Monitor, I can see a bulk insert, so I guess it is logged the same way when the database is in the bulk-logged recovery model.
I have an SSIS job that imports flat file data into my database and also performs a data conversion. Please find a view of the schema:
The issue is that I keep getting errors on the "Violations" field; see below:
[Flat File Source [37]] Error: Data conversion failed. The data
conversion for column "Violations" returned status value 4 and status
text "Text was truncated or one or more characters had no match in the
target code page.".
[Flat File Source [37]] Error: The "Flat File Source.Outputs[Flat File
Source Output].Columns[Violations]" failed because truncation
occurred, and the truncation row disposition on "Flat File
Source.Outputs[Flat File Source Output].Columns[Violations]" specifies
failure on truncation. A truncation error occurred on the specified
object of the specified component.
[Flat File Source [37]] Error: An error occurred while processing file
"C:\Users\XXXX\XXXX\XXXX\XXXX\XXXX\XXXX\XXXX\Food_Inspections.csv"
on data row 25.
In line 25 of the CSV file, this field is over 4000 characters long.
In the data conversion, I currently have the data type set to string [DT_STR] with a length of 8000, code page 65001.
Row delimiter {LF},
Column delimiter Semicolon {;}
I have already looked at other suggested solutions, e.g. increasing OutputColumnWidth to 5000, but it did not help. Please advise how to solve this.
I'm trying to import data from a Windows CSV (comma-delimited) file into the pgSQL table faxtest1, but I keep getting an error saying "The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application."
The following is my code:
COPY faxtest1
FROM 'C:\Users\David\Desktop\test3.csv'
WITH DELIMITER AS ',' CSV ;
The CSV file is like:
Status,Fax ID
Fax to Email,2104
Fax to Email,2108
It is a bug in pgAdmin 4; hopefully they will fix it in the future.
In version 14, in the Import/Export Data function, there are two columns, "Options" and "Columns." Try manually selecting the columns one at a time, separated by a comma, and see if this bypasses the error.
It worked for me.
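As a rough equivalent outside the pgAdmin dialog, the columns can also be listed explicitly in the COPY statement itself. The column names below are guesses based on the sample header, and HEADER is added because the file starts with a header line:
COPY faxtest1 (status, fax_id)              -- assumed column names
FROM 'C:\Users\David\Desktop\test3.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');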
I am unable to load a 500 MB CSV file from Google Cloud Storage into BigQuery; I get this error:
Errors:
Too many errors encountered. (error code: invalid)
Job ID xxxx-xxxx-xxxx:bquijob_59e9ec3a_155fe16096e
Start Time Jul 18, 2016, 6:28:27 PM
End Time Jul 18, 2016, 6:28:28 PM
Destination Table xxxx-xxxx-xxxx:DEV.VIS24_2014_TO_2017
Write Preference Write if empty
Source Format CSV
Delimiter ,
Skip Leading Rows 1
Source URI gs://xxxx-xxxx-xxxx-dev/VIS24 2014 to 2017.csv.gz
I have gzipped the 500 MB CSV file to .csv.gz to upload it to GCS. Please help me solve this issue.
The internal details for your job show that there was an error reading row #1 of your CSV file. You'll need to investigate further, but it could be that you have a header row that doesn't conform to the schema of the rest of the file, so we're trying to parse a string in the header as an integer or boolean or something like that. You can set the skipLeadingRows property to skip such a row.
Other than that, I'd check that the first row of your data matches the schema you're attempting to import with.
Also, the error message you received is unfortunately very unhelpful, so I've filed a bug internally to make the error you received in this case more helpful.
I have the following query to insert into a table
BULK INSERT tblMain
FROM 'c:\Type.txt'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO
I get the message:
Msg 4860, Level 16, State 1, Line 1
Cannot bulk load. The file "c:\Type.txt" does not exist.
The file is clearly there. Is there anything I may be overlooking?
Look at that:
Cannot bulk load. The file "c:\data.txt" does not exist
Is that file on the SQL Server's C:\ drive??
SQL Server's BULK INSERT and the like always work only with drives that are local to the SQL Server machine; your SQL Server cannot reach onto your own local drive.
You need to put the file onto the SQL Server's C:\ drive and try again.
Bulk import utility syntax is described here
http://msdn.microsoft.com/en-us/library/ms188365.aspx
> BULK INSERT [ database_name . [ schema_name ] . | schema_name . ]
> [ table_name | view_name ]
> FROM 'data_file'
> [ WITH
> (
The note on the data_file argument says:
' data_file '
Is the full path of the data file that contains data to import into
the specified table or view. BULK INSERT can import data from a disk
(including network, floppy disk, hard disk, and so on).
data_file must specify a valid path from the server on which SQL
Server is running. If data_file is a remote file, specify the
Universal Naming Convention (UNC) name. A UNC name has the form
\\Systemname\ShareName\Path\FileName. For example,
\\SystemX\DiskZ\Sales\update.txt.
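For illustration, a bulk load over a UNC path might look like this; the share and file names are made up, and the share must be readable by the SQL Server service account:
BULK INSERT tblMain
FROM '\\FileServer01\imports\Type.txt'    -- hypothetical UNC path
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)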
I've had this problem before. In addition to checking the file path, you'll want to make sure you're referencing the correct file name and file type. Make sure this is indeed a text file saved in the source location and not a Word file etc. I got tripped up with .doc and .docx. It's a newbie mistake to make, but hey, it can happen. Changing the file type fixed the problem.
I am trying to insert the data from this link into my SQL Server:
https://www.ian.com/affiliatecenter/include/V2/CityCoordinatesList.zip
I created the table
CREATE TABLE [dbo].[tblCityCoordinatesList](
[RegionID] [int] NOT NULL,
[RegionName] [nvarchar](255) NULL,
[Coordinates] [nvarchar](4000) NULL
) ON [PRIMARY]
And I am running the following script to do the bulk insert
BULK INSERT tblCityCoordinatesList
FROM 'C:\data\CityCoordinatesList.txt'
WITH
(
FIRSTROW = 2,
MAXERRORS = 0,
FIELDTERMINATOR = '|',
ROWTERMINATOR = '\n'
)
But the bulk insert fails with the following error:
Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".
When I googled, I found several articles saying the issue may be with the ROWTERMINATOR, but I tried everything like \n\r, \n, etc., and nothing is working.
Could anyone please help me to insert this data into my database?
Try ROWTERMINATOR = '0x0a'. It should work.
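Applied to the statement from the question, that would look roughly like this ('0x0a' is hexadecimal notation for a bare line feed):
BULK INSERT tblCityCoordinatesList
FROM 'C:\data\CityCoordinatesList.txt'
WITH
(
    FIRSTROW = 2,
    MAXERRORS = 0,
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '0x0a'   -- LF only, instead of \r\n
)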
This can also happen if the number of columns doesn't match between the table and the imported file.
I got the same error message, and as you had mentioned, it was related to an unexpected line ending.
In my case the line ending was specified in a fmt file as a Windows line ending (CRLF), written as \r\n, while the data file to process had a Mac classic one (CR).
I solved it with an editor that can show the current line ending and change it. I used EditPad Lite, which shows the opened file's line ending in the bottom bar; clicking it lets you replace it with the expected one.
I had this on SQL Server 2019 when the FORMAT='CSV' option was used and there was a comma at the end of each line in the source file. The table you're BULK inserting into then needs an extra dummy field to cater for the fact that each record essentially has a blank trailing field in the source file.
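A rough sketch of that workaround, with made-up table, column, and file names:
-- Each source line ends with a trailing comma, e.g.  value1,value2,
CREATE TABLE dbo.ImportTarget
(
    Col1  varchar(50),
    Col2  varchar(50),
    Dummy varchar(1)    -- absorbs the empty field caused by the trailing comma
);

BULK INSERT dbo.ImportTarget
FROM 'C:\data\source.csv'       -- hypothetical path
WITH (FORMAT = 'CSV');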
I get the same error, probably due to a file encoding problem. I fixed it by opening the problematic CSV file in Notepad++, selecting everything, and copying it to the clipboard. Next, create a new text file (making sure it has the .csv file extension), open it in Notepad++, then paste the text into the new file. Save and close all files. You should then be able to load the new CSV file into SQL Server successfully.
You need to run the BULK INSERT command from a Windows login (not from a SQL login). I don't have any examples at hand right now.