I am trying to do a bulk insert of a .CSV from a remote location.
My SQL statement is:
BULK INSERT dbo.tblMaster
FROM '\\ZAJOHVAPFL20\20ZA0004\E\EDData\testbcp.csv'
WITH (FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n')
My .CSV looks like this:
john,smith
jane,doe
The CSV is saved with UTF-8 encoding, and there is no blank line at the bottom of the file. The table that I am bulk inserting into is also empty.
The table has two columns; firstname (nvarchar(max)) and secondname (nvarchar(max)).
I have sysadmin rights on the server so have permission to perform bulk inserts.
When running the SQL, it runs without error, and simply shows -
0 row(s) affected
and doesn't insert any information.
Any help is greatly appreciated.
I know this may be too late to answer, but I thought this might help anyone looking for the fix. I had a similar issue with BULK INSERT but didn't find any fix online. Most probably the flat/CSV file was generated with non-Windows line endings. If you can open the file in Notepad++, go to the Edit menu and change the EOL Conversion to "Windows Format". This fixed the problem for me.
Notepad++>> Edit >> EOL Conversion >> Windows Format
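If you would rather script the same conversion than click through an editor, here is a minimal Python sketch. The names to_crlf and convert_file are made up for illustration, and it assumes the file uses an ASCII-compatible encoding such as UTF-8 (it would corrupt a UTF-16 file, where line endings are two bytes wide).

```python
def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending rewritten as CRLF."""
    # Collapse existing CRLF and lone CR to LF first so nothing gets doubled.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a CRLF copy to dst_path."""
    with open(src_path, "rb") as src:
        converted = to_crlf(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(converted)
```

Working on bytes rather than text keeps the rest of the file untouched, so only the line endings change.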
When you specify \n as a row terminator for bulk export, or implicitly use the default row terminator, it outputs a carriage return-line feed combination (CRLF) as the row terminator. If you want to output a line feed character only (LF) as the row terminator - as is typical on Unix and Linux computers - use hexadecimal notation to specify the LF row terminator. For example:
ROWTERMINATOR='0x0A'
With that, there is no need to do the following:
Notepad++>> Edit >> EOL Conversion >> Windows Format
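To decide which row terminator to specify, it can help to check what the file actually contains. A rough Python sketch, assuming an ASCII-compatible encoding; count_line_endings is a hypothetical helper name:

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF and bare CR in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one LF and one CR, so subtract the pairs.
    lf_only = data.count(b"\n") - crlf
    cr_only = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf_only, "CR": cr_only}
```

If only the LF count is non-zero, the file has Unix-style line endings and the 0x0A terminator above is the one to try.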
I opened the CSV with Excel and hit Control + S to resave it. That fixed the issue for me.
Try inserting the file using bcp.exe first, see if you get any row or any error. The problem with
BULK INSERT ...
FROM '\\REMOTE\SHARE\...'
is that you're now bringing impersonation and delegation security into the picture, which makes the issue more difficult to diagnose. When you access a remote share like this you are actually doing a 'double-hop' Kerberos impersonation (aka delegation) and you need special security set up. Read Bulk Insert and Kerberos for the details.
The problem is, at least in part, the UTF-8 encoding. That is not supported by default. If you are using SQL Server 2016 then you can specify Code Page 65001 (i.e. add , CODEPAGE = '65001' to the WITH clause). If using an earlier version of SQL Server, then you need to first convert the file encoding to UTF-16 Little Endian (known as "Unicode" in the Microsoft universe). That can be done either when saving the file or by some command line utility.
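As one example of a "command line utility" for the conversion, here is a small Python sketch. The function name is hypothetical, and it assumes the input really is valid UTF-8 (with or without a BOM); anything else raises UnicodeDecodeError.

```python
import codecs


def utf8_to_utf16le(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as UTF-16 Little Endian with a leading BOM."""
    # "utf-8-sig" drops a leading UTF-8 BOM if one is present.
    text = data.decode("utf-8-sig")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
```

Read the source file in binary mode, pass the bytes through this function, and write the result back in binary mode so that no newline translation happens along the way.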
I have a problem when inserting values into my Oracle database. I have to insert French characters like à or è and when I try to insert them through an INSERT statement it will convert the character to ¿ or ?.
Is there any possibility to set the encoding of that specific script, or what can I do in this situation ?
Thank you
Usually you would set the character set when you install your database. You can, however, change it post-setup if required (Look up CSALTER). If your database needs to support multiple languages, then you should take a look at this: Supporting Multilingual Databases with Unicode
I have fixed this problem by adding an environment variable called NLS_LANG with the value .AL32UTF8. This worked even though the database's language is American and its territory is America. The problem I then faced was that once I changed the NLS_LANG variable, it also started to affect how my characters were encoded in the application.
Also you can try to change the encoding of the script that you are running. For example I have used ANSI encoding (you can do it by opening a script in notepad++ and from the Encoding menu, select Convert to ANSI) and it worked properly.
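For anyone wondering why the characters turn into ? in the first place: when text is converted to a character set that cannot represent a character, a replacement character is substituted. The sketch below is only a Python analogy of that effect, not Oracle's actual conversion code, and encode_lossy is a made-up name.

```python
def encode_lossy(text: str, charset: str) -> bytes:
    """Encode text, substituting '?' for anything the charset cannot hold."""
    return text.encode(charset, errors="replace")


# ASCII has no accented letters, so they are lost in the conversion.
lossy = encode_lossy("\u00e0 \u00e8", "ascii")

# UTF-8 can represent them, so the text survives a round trip.
intact = encode_lossy("\u00e0 \u00e8", "utf-8").decode("utf-8")
```

This is why the client setting and the encoding of the script file both have to be able to represent à and è.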
Thank you guys for your help :)
I am a new developer who just started using DataStage (coming from a bit of experience with SSIS). One of the first things that I am doing is working with an XML data flow into a database from MQ. I connect to the MQ, use an XML job to map out the tags to each db column, and then insert it into the db. However, I am having an issue with the incoming XML. One of the fields on each XML file that I process contains the same character sequence, which is something along the lines of "&$!0".
When I run my job I get an error saying that that is an illegal xml character and the job fails.
Is there a way within DataStage to replace this value as it comes through the XML, or even just remove it? Is there a specific tool I should be using within my job for this?
Obviously the easiest solution would be to fix that data coming in, however in the mean-time while that is getting squared away, I want to be able to do some testing, so an alternate solution would be great for now.
Any advice would be greatly appreciated. I am a new developer so I apologize if this question is a bit ignorant/low level.
Use a text editor like Notepad++ to remove the characters yourself.
To automate it, sed on Linux will do the job, and sed for Windows will probably work on Windows too.
These characters are nothing but Unicode. You need to remove them before you insert into DB table.
Try below code:
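s = s.replace("&$!0", "");  // literal replacement; "\\p{&$!0}" is not a valid regex property and throws PatternSyntaxException
NOTE: You need to find all such character sequences and replace them with "" (blank).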
You will get more information here
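If you pre-process the message outside DataStage, a small script can do both jobs: drop the known bad sequence and strip any characters that XML 1.0 does not allow. This is a sketch only. The default "&$!0" is the approximate sequence quoted in the question, clean_xml_text is a hypothetical name, and the character ranges follow the Char production of the XML 1.0 specification.

```python
import re

# Everything outside the XML 1.0 "Char" ranges is treated as illegal.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def clean_xml_text(text: str, bad_sequence: str = "&$!0") -> str:
    """Remove a known bad sequence, then any characters illegal in XML 1.0."""
    without_sequence = text.replace(bad_sequence, "")
    return _ILLEGAL_XML_CHARS.sub("", without_sequence)
```

Note that a bare & is a legal character but still makes the XML ill-formed unless it is escaped as &amp;amp;, which is why the literal sequence is removed separately.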
I have a csv file that I am trying to import using BULK INSERT. The problem is that there is a field in the file that will be quoted (with double quotes) if a comma exists within the text (not quoted if no comma exists). The existence of the extra comma is causing SQL Server to throw errors because of an incorrect number of columns during the insert.
Here is a sample data set:
928 Riata Dr,Magnolia,TX,77354,4/15/2014
22 Roberts Ave.,McKinney,TX,75069,4/15/2014
"5531 Trinity Place, #22",San Antonio,TX,78212,4/15/2014
As you can see, the third row contains a comma within the address field, thus the address field is quoted. Since the BULK INSERT command is throwing errors because of this, I'm assuming I will need to scrub the file contents before attempting to load it.
Unless someone has a better solution
To scrub the file contents I will need to open the file (with SQL), read in the contents, and do a conditional replacement of the internal comma (found within the quotes). Since that comma doesn't really need to exist, I can just replace it with '' (blank).
Then, I can handle the quotes separately after the data gets imported with an update statement to replace any other characters I don't want.
I think the logic is sound, the problem is the syntax. I can't seem to find any syntax related to REGEX in SQL Server (Booo Microsoft). Which means I would need some other way to determine if the comma appears within quotes, and replace it if so.
Any thoughts, Suggestions, Code, etc.?
Thanks in advance.
This sounds too simple on the face of it, but if you can just replace the commas, can you open the csv in, say, Excel or OpenOffice Calc, and then do a find replace (commas with nothing)? I just tried with a csv of mine and it worked fine. The csv remains properly delimited.
Maybe I am missing something that prevents this, such as Excel opening this with extra cells due to the comma, in which case my answer is stupid. But it would make more sense to handle this in a spreadsheet app rather than after opening with SQL.
You may have to try delimiting with something other than commas, such as tabs or etc. I've had to do this with SQL imports before. In many cases you can save as a tab delimited txt file and upload to SQL.
Note that using Excel for this type of thing can be its own problem. For help with Excel and tab delimited SQL imports, see my answer here.
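If you'd rather not round-trip through a spreadsheet, a CSV-aware parser can do the scrubbing before the load. A minimal Python sketch using the standard csv module (to_tab_delimited is a made-up name), assuming no field contains a tab character:

```python
import csv
import io


def to_tab_delimited(csv_text: str) -> str:
    """Parse quoted, comma-separated text and re-emit it tab-delimited."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    # CRLF line endings match the default row terminator BULK INSERT expects.
    writer = csv.writer(out, delimiter="\t", lineterminator="\r\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

The reader strips the surrounding double quotes and keeps the embedded comma as data, so the output can then be loaded with FIELDTERMINATOR = '\t'. If a field did contain a tab, the writer would quote it again, which BULK INSERT would not understand.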
Using SQL Server Management Studio 2008 R2, I bulk inserted 10 words from a UTF-8 text file into a single column.
However, the words do not appear correctly; I get an extra space in front of some words.
Note: None of the answers have solved my problem, so far. :(
SCREENSHOT OF THE PROBLEM
This issue may occur if you are not using the correct collation (language settings). You need to use the appropriate collation in order to display your data in the correct format.
See the link http://technet.microsoft.com/en-us/library/ms187582(v=sql.105).aspx for more details.
You can also try using a different row terminator:
bulk insert table_name
from 'filename.txt' WITH (ROWTERMINATOR='\n')
Look at this post How to write UTF-8 characters using bulk insert in SQL Server?
Quote: You can't. You should first use a N type data field, convert your file to UTF-16 and then import it. The database does not support UTF-8.
Original answers
Look at the encoding of your text file. It should be UTF-8; if not, this could cause problems.
Open the file with Notepad, choose File -> Save As, and pick the encoding.
After this, try the bulk import again.
Secondly, make sure the column datatype is nvarchar and not varchar. Also see here
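One quick way to look at the encoding without opening an editor is to check the first bytes of the file for a byte order mark (BOM). A rough Python sketch with a hypothetical helper name; it only recognises UTF-8 and UTF-16 BOMs, and a missing BOM tells you nothing either way, since UTF-8 files are often saved without one.

```python
import codecs
from typing import Optional

_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return an encoding name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

Reading just the first few bytes in binary mode, for example open(path, "rb").read(4), is enough input for the check.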
We currently use the SQL Publishing Wizard to back up our database schemas and data, however we have some database tables with hashed passwords that contain the null character (chr(0)). When SQL Publishing Wizard generates the insert data scripts, the null character causes errors when we try and run the resulting SQL - it appears to ignore ALL TEXT after the first instance of this character in a script. We recently tried out RedGate SQL Compare, and found that it has the same issue with this character. I have confirmed it is ascii character code 0 by running the ascii() sql function against the offending record.
A sample of the error we are getting is:
Unclosed quotation mark after the character string '??`????{??0???
The fun part is, I can't really paste a sample Insert statement because of course everything that appears after the CHR(0) is being omitted when pasting!
Change the definition of the column to VARBINARY. The data you store in there doesn't seem to be an appropriate VARCHAR to start with.
This will ripple through the code that uses the column, as you'll get a byte[] CLR type back in the client, and you should change your insert/update code accordingly. But after all, a password hash is a byte[], not a string.
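To illustrate the point that a hash is bytes rather than text, here is a short Python sketch. SHA-256 and the sample password are assumptions chosen for the example (the question does not say which hash is used), and to_sql_binary_literal is a hypothetical helper.

```python
import hashlib


def to_sql_binary_literal(data: bytes) -> str:
    """Format raw bytes as a T-SQL binary literal such as 0x00FF."""
    return "0x" + data.hex().upper()


# A digest is arbitrary bytes and may well contain 0x00.
digest = hashlib.sha256(b"example password").digest()
literal = to_sql_binary_literal(digest)
```

A hex literal like this contains only printable characters, so a generated insert script survives null bytes in the hash, and the value maps directly onto a VARBINARY column.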