I have an issue importing a flat file into SSMS. When I download a CSV file from a particular system, the decimal separator is always ".". My Windows regional setting uses a comma, and I use commas everywhere else. But when I import these CSV files into SSMS, I get a type-mismatch error because SSMS cannot recognize the values as numbers (float, decimal, etc.).
I tried switching the Windows regional settings, replacing the comma with a dot, which solved the issue and let the file import. But the question is: can I change a setting in SSMS somehow, so that I keep the comma as my default but can still import CSV files that use a decimal point separator?
I need to work only with SSMS; I am unable to install SSIS packages.
Thank you very much for any feedback.
In SSMS, select the "Import Data" option, then for "Data Source" select "Flat File Source". Select the file to load, then change the "Locale" option (located just below the file name).
The locale controls how data formats are displayed and interpreted, such as the currency symbol and the radix (decimal separator) character.
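If the wizard's Locale option doesn't help, you can also stage the raw text and convert it in T-SQL, since CAST/CONVERT always treats "." as the decimal separator regardless of the Windows regional settings. A minimal sketch, assuming a hypothetical file C:\data\export.csv with a header row and Id/Amount columns:

CREATE TABLE #staging (Id varchar(50), Amount varchar(50));

-- Load the raw text; the terminators are assumptions, adjust them to your file
BULK INSERT #staging
FROM 'C:\data\export.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);

-- TRY_CAST parses "." as the decimal separator, independent of regional settings
SELECT Id, TRY_CAST(Amount AS decimal(18, 4)) AS Amount
FROM #staging;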
I have a text file that uses ^ (caret) and , (comma) as delimiters, and after cleaning it I need to load it into SQL. I have tried my best to clean the source file,
but the file is still not cleaned as expected.
Please find below the picture of how I tried to correct the source file.
The file is still not cleaned as expected. Please find the uncleaned file below.
You have a variety of issues here.
You have identified the header row delimiter as a comma. A row delimiter is the (usually invisible) delimiter that indicates a row's worth of data has ended. Traditionally this is an operating-system-specific value: a Carriage Return (CR), a Line Feed (LF), or a Carriage Return/Line Feed pair (CRLF).
Your source data is not a comma-delimited file with caret/circumflex/cap text delimiters. You have a comma-space delimited file, which SSIS doesn't support in the editor. However, you can hand-edit the dtsx file, as I outlined in "How to read a flatfile with lowercase thorn as the delimiter", to specify that it should use comma-space: ColumnDelimiter="_x002C__x0020_"
Given a truncated version of your source data
ListCode, CAS, Name
^216^, ^^, ^Coal Dust^
^216^, ^7782-24-5^, ^Graphite (Natural)^
^216^, ^^, ^Inert or Nuisance Dust^
and the comma (0x2C) space (0x20) pair edited into the raw dtsx connection manager, I was able to pull the data as I believe you are expecting.
You might also run into additional issues given your selection of code pages and not checking the Unicode button, but that's beyond my ability to reproduce, since I can't generate matching source data from an image.
Just replace the ^, ^ with ^,^.
It looks like your source is:
CAS, SubName, ListCode, Type, CountryCode, ListName
^1000413-72-8^,^fasiglifam^,^447^,^Chemical Inventory^,^EU^,^ECICS Custom Tariff Codes^
^1000413-72-8^,^fasiglifam^,^0^,^^,^NN^,^SPHERA Global Substance List^
Then edit your connection manager with the details below:
[![enter image description here][2]][2]
It will work.
[2]: https://i.stack.imgur.com/0x89k.png
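If you would rather fix the data after staging it in SQL Server instead of editing the file, the same substitution can be done in T-SQL. A minimal sketch, assuming a hypothetical single-column staging table #raw holding the unsplit lines:

CREATE TABLE #raw (line varchar(max));

-- ...load the file into #raw (e.g. with BULK INSERT), then
-- collapse the comma-space delimiter to a plain comma:
UPDATE #raw
SET line = REPLACE(line, '^, ^', '^,^');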
I am trying to extract some records to a file using the BCP command in SQL Server. However, when the file is generated, there are extra spaces in the result between each column.
To test, I wrote a basic SQL query as simple as this:
select 'ABC', 40, 'TEST','NOTWORKING'
When I copy the output of the above query and paste it into Notepad, the output comes out as
ABC 40 TEST NOTWORKING
Notice the space between each value? The file that the system generates using the BCP command has the same spaces in the output, which is incorrect. What I want to see in the output file is
ABC40TESTNOTWORKING
What could be causing this issue? I am simply amazed to see such a weird issue and hope that it can be fixed by some change or setting. Please help.
Sample BCP command
EXEC xp_cmdshell 'bcp "select ''ABC'', 40, ''TEST'',''NOTWORKING''" queryout "E:\Testfile.txt" -c -T -S""'
Output in the File - Testfile.txt
ABC 40 TEST NOTWORKING
There are probably tabs between the values. If you want a single value, use concat():
select CONCAT('ABC', 40, 'TEST', 'NOTWORKING')
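Applied to the bcp command from the question, that would look something like this (a sketch; CONCAT returns a single column, so no field terminator is ever written):

EXEC xp_cmdshell 'bcp "select CONCAT(''ABC'', 40, ''TEST'', ''NOTWORKING'')" queryout "E:\Testfile.txt" -c -T -S""'

-- Testfile.txt now contains: ABC40TESTNOTWORKING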
There's no issue. The command line has no field terminator argument, so the default is used: a tab. That's described in the docs:
-t field_term
Specifies the field terminator. The default is \t (tab character). Use this parameter to override the default field terminator. For more information, see Specify Field and Row Terminators (SQL Server).
If you specify the field terminator in hexadecimal notation in a bcp.exe command, the value will be truncated at 0x00. For example, if you specify 0x410041, 0x41 will be used.
If field_term begins with a hyphen (-) or a forward slash (/), do not include a space between -t and the field_term value.
The link points to an entire article that explains how to use terminators for each of the bulk operations.
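For example, to keep separate columns but emit a comma instead of the default tab, you could add -t to the question's command (a sketch):

EXEC xp_cmdshell 'bcp "select ''ABC'', 40, ''TEST'',''NOTWORKING''" queryout "E:\Testfile.txt" -c -t, -T -S""'

-- Testfile.txt now contains: ABC,40,TEST,NOTWORKING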
As for the copy/paste operation, it has nothing to do with SQL Server. SQL Server has no UI; it's a service. I suspect what was pasted into Notepad was copied from an SSMS grid.
SSMS is a client tool just like any other. When you copy data from it to the clipboard, it decides what to put there and what format to use. That format can be plain text (using spaces and tabs for layout), RTF, HTML, etc.
Plain text with tabs as field separators is probably the best choice for any tool: it preserves the visual layout up to a point and uses only a single character as a separator. A fixed-length layout using spaces could also be used, but that would add characters that may well be part of a field.
Encodings and codepages
-c exports the data using the user's default codepage. This means that text stored in varchar fields using a different codepage (collation) may get mangled. Unicode characters that aren't available in that codepage will also get mangled and appear as something else, or as ?.
-c
Performs the operation using a character data type. This option does not prompt for each field; it uses char as the storage type, without prefixes and with \t (tab character) as the field separator and \r\n (newline character) as the row terminator. -c is not compatible with -w.
It's better to export the file as UTF-16 using -w.
-w
Performs the bulk copy operation using Unicode characters. This option does not prompt for each field; it uses nchar as the storage type, no prefixes, \t (tab character) as the field separator, and \n (newline character) as the row terminator. -w is not compatible with -c.
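The same export with -w instead of -c would be (a sketch based on the question's command; the file comes out as UTF-16):

EXEC xp_cmdshell 'bcp "select ''ABC'', 40, ''TEST'',''NOTWORKING''" queryout "E:\Testfile.txt" -w -T -S""'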
The codepage can be specified using the -C parameter. -C 1252, for example, will export the data using Windows' Latin-1 codepage, and -C 1253 will export it using the Greek codepage.
-C { ACP | OEM | RAW | code_page }
Specifies the code page of the data in the data file. code_page is relevant only if the data contains char, varchar, or text columns with character values greater than 127 or less than 32.
SQL Server 2016 and later can also export text as UTF-8 with -C 65001. Earlier versions don't support UTF-8.
Versions prior to version 13 (SQL Server 2016 (13.x)) do not support code page 65001 (UTF-8 encoding). Versions beginning with 13 can import UTF-8 encoding to earlier versions of SQL Server.
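So on SQL Server 2016 or later, a UTF-8 export could look like this (a sketch based on the question's command):

EXEC xp_cmdshell 'bcp "select ''ABC'', 40, ''TEST'',''NOTWORKING''" queryout "E:\Testfile.txt" -c -C 65001 -T -S""'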
All this is described in bcp's online documentation.
This subject is so important for any database that it has an entire section in the docs, which describes data formats and considerations, using format files to specify different settings per column, and guidelines for ensuring compatibility with other applications.
I am using a Text file output step in Pentaho Kettle to extract data from SQL and write it to CSV files. I have specified a comma as the content separator, but sometimes I receive files with semicolon-separated values. Has anybody else faced this issue? I have read that the semicolon is the default content separator for CSV file formats, so I believe the content separator is falling back to that default. Is this because the separator is defaulted by the Spoon environment based on the input data?
Open the Text file output step and go to the Content tab. There you will find an option called Separator; whatever you specify there will appear in your final result. By default it is a semicolon, so just change it to a comma and your problem will be resolved.
I have a CSV file containing copy with apostrophes, and when I import it into the database using MAMP, all the apostrophes turn into question marks. Is there a fix for this?
Thank you in advance!
This is the result of saving your files in a non-UTF-8 format such as ANSI. When you open a file with a text editor that doesn't support UTF-8 and save it again, this happens.
Download a text editor such as Notepad++ and check and convert all your (plugin or theme) files to UTF-8 or UTF-8 without BOM.
http://wordpress.org/support/topic/how-fix-black-diamond-question-marks-in-wp-321
I had the same problem even when saving my CSV file as UTF-8. So in Microsoft Word, I edited the text and replaced the actual apostrophes and quotation marks with their HTML codes (&#39; and &#34;), then copied and pasted the result into my CSV file. This seemed to work for me. I used this website for the HTML codes: http://www.ascii.cl/htmlcodes.htm
For special characters that aren't importing properly, you could try a T-SQL BULK INSERT and include either:
CODEPAGE = 'ACP' --For ANSI files
CODEPAGE = '65001' --For UTF-8 files
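A minimal sketch of such a BULK INSERT, assuming a hypothetical target table dbo.Content and file path; adjust the terminators to match your CSV (and note that CODEPAGE 65001 requires SQL Server 2016 or later, per the bcp notes above):

BULK INSERT dbo.Content
FROM 'C:\data\copy.csv'
WITH (
    CODEPAGE = '65001',        -- UTF-8 files (use 'ACP' for ANSI)
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2               -- skip the header row, if any
);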
I have a SQL query that returns one column. I run it in SQL Server Management Studio 2008 R2.
I use File\Save Grid Results and create a .TXT file.
My problem is that the first record of the file has 3 bytes inserted in front of the data. The three bytes are x'EFBBBF'. This causes a problem when I use the file in another process.
I get the same thing whether I save as .TXT or as .CSV.
Any ideas?
Found it.
Save Results As...
Choose a folder
Enter a file name
Save button now has a dropdown arrow to the right
Click on dropdown arrow and select Save with Encoding...
Select ANSI
Click OK
The ANSI-encoded file will not contain a UTF-8 BOM.
kuru kuru na is on the right track: those bytes are the UTF-8 BOM. I haven't found any setting to change the file encoding that Management Studio uses for saving results. I just use Vim to remove the BOM after saving the file. Your favorite text editor may have a similar option, or you could use a tool like iconv if you need to remove the mark or re-encode the file in a script.
I think it's called a "BOM" (byte order mark) signature, which has to do with telling whatever reads your file that it contains UTF characters. I suspect it might be in your SSMS settings somewhere. But at least this is a place to start.