I have a CSV file with copy that contains apostrophes, and when I import it into the database using MAMP, it turns all the apostrophes into question marks. Is there a fix for this?
Thank you in advance!
This is the result of saving your files in a non-UTF-8 format such as ANSI. It happens when you open a file in a text editor that doesn't support UTF-8 and save it again.
Download a text editor such as Notepad++, then check and convert all your (plugin or theme) files to UTF-8 or UTF-8 without BOM.
http://wordpress.org/support/topic/how-fix-black-diamond-question-marks-in-wp-321
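If you have many files to fix, the conversion can also be scripted. A minimal Python sketch, assuming the file is currently ANSI (Windows-1252); the file names are placeholders:

# Re-encode a Windows-1252 (ANSI) file as UTF-8.
with open("import.csv", encoding="cp1252") as src:
    text = src.read()
with open("import-utf8.csv", "w", encoding="utf-8") as dst:
    dst.write(text)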
I had the same problem even when saving my CSV file as UTF-8. So in Microsoft Word, I edited the text and replaced the actual apostrophes and quotation marks with their HTML codes (&#39; and &#34;), then copied and pasted the result into my CSV file. This seemed to work for me. I used this website for the HTML codes: http://www.ascii.cl/htmlcodes.htm
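If you'd rather script that find-and-replace than do it in Word, a rough Python equivalent (the file names and the entity table are assumptions; extend the table with curly quotes if your copy uses them):

# Replace literal apostrophes and quotation marks with HTML character codes.
entities = {"'": "&#39;", '"': "&#34;"}
with open("import.csv", encoding="utf-8") as f:
    text = f.read()
for char, code in entities.items():
    text = text.replace(char, code)
with open("import-escaped.csv", "w", encoding="utf-8") as f:
    f.write(text)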
For special characters that aren't importing properly, you could try a T-SQL BULK INSERT and include either:
CODEPAGE = 'ACP' --For ANSI files
CODEPAGE = '65001' --For UTF-8 files
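If you're not sure which CODEPAGE applies, a quick check in Python (only a heuristic: a short ANSI file can also happen to be valid UTF-8):

# Guess whether a file is UTF-8 or a legacy ANSI code page.
with open("import.csv", "rb") as f:
    raw = f.read()
try:
    raw.decode("utf-8")
    print("Decodes as UTF-8: try CODEPAGE = '65001'")
except UnicodeDecodeError:
    print("Not valid UTF-8: try CODEPAGE = 'ACP'")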
I am getting the following error:
'ascii' codec can't decode byte 0xf4 in position 560: ordinal not in range(128)
I find this very weird given that my .csv file doesn't have special characters. Perhaps it has special characters that mark header rows or the like; I'm not sure.
But the main problem is that I don't actually have access to the source code that reads in the file, so I cannot simply add the keyword argument encoding='UTF-8'. I need to figure out which encoding is compatible with codecs.ascii_decode(...). I DO have access to the .csv file that I'm trying to read, and I can adjust its encoding, but not the source file that reads it.
I have already tried exporting my .csv file into Western (ASCII) and Unicode (UTF-8) formats, but neither of those worked.
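For reference, byte 0xf4 is "ô" in both Windows-1252 and ISO-8859-1, so the file does contain at least one non-ASCII byte. A quick sketch to locate such bytes (the file name is a placeholder):

# Print the position, value, and surrounding context of every non-ASCII byte.
with open("data.csv", "rb") as f:
    raw = f.read()
for pos, byte in enumerate(raw):
    if byte > 0x7F:  # anything outside 7-bit ASCII breaks ascii_decode
        print(pos, hex(byte), raw[max(0, pos - 20):pos + 20])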
Fixed. It had nothing to do with Unicode shenanigans: my script was writing a Parquet file when my CloudFormation template was expecting a CSV file. Thanks for the help.
I have an issue while importing a flat file into SSMS. The CSV files I download from a specific system always use "." as the decimal separator, while my regional setting uses a comma. When importing these CSV files into SSMS, I get a type-mismatch error because SSMS cannot recognize the values as numbers (float, decimal, etc.).
I have tried switching the Windows regional settings (swapping the decimal separator), which solved the issue and imported the file. But the question is: can I change a setting in SSMS somehow, so that I keep the comma as my default yet can still import CSV files that use a decimal point?
I need to work only with SSMS; I am unable to use SSIS packages.
Thank you very much for any feedback.
In SSMS, select the "Import Data" option, then in "Data Source" select "Flat File Source". Select the file to load and then change the "Locale" option (located just below the file name).
Locales control how data formats are displayed and interpreted, such as currency symbols and the radix (decimal separator) character.
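If changing the Locale isn't an option, the file can also be pre-processed before import. A sketch of that workaround in Python, assuming a comma-delimited source with dot decimals (file names are placeholders); it writes a semicolon-delimited copy with comma decimals so the comma-locale import accepts it:

import csv

def looks_numeric(cell):
    # True for values like "12.34" that should get a comma decimal.
    try:
        float(cell)
        return True
    except ValueError:
        return False

with open("source.csv", newline="") as src, \
        open("converted.csv", "w", newline="") as dst:
    reader = csv.reader(src, delimiter=",")
    writer = csv.writer(dst, delimiter=";")
    for row in reader:
        writer.writerow([cell.replace(".", ",") if looks_numeric(cell) else cell
                         for cell in row])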
I have an issue when trying to read a string from a .CSV file. When I execute the application and the text is shown in a textbox, certain characters such as "é" or "ó" are shown as question mark symbols.
The idea is that this code reads the whole CSV file and then splits each line into variables depending on the first word of the line.
The code I'm using to read is:
Dim test() As String
' Read all lines of the CSV file
test = IO.File.ReadAllLines("Libro1.csv")
' Find the line that starts with "sample", then split it on the ";" delimiter
Dim test_chart As String = Array.Find(test, Function(x) x.StartsWith("sample"))
Dim test_chart_div() As String = test_chart.Split(";"c)
variable1 = test_chart_div(1)
variable2 = test_chart_div(2)
...etc
I have also tried with:
Dim test() As String
test = IO.File.ReadAllLines("Libro1.csv", System.Text.Encoding.UTF8)
But neither of them works. The .csv file is supposed to be UTF-8. The "Web Options" that you can see when saving the file in Excel show UTF-8 encoding. I also tried the trick of changing the file extension to HTML and opening it with the browser to confirm that the encoding is correct.
Can someone advise on anything else I can try?
Thanks in advance.
When an Excel file is exported using the CSV (Comma Separated) output format, the Encoding selected in Tools -> Web Options -> Encoding of Excel's Save As... dialog doesn't actually produce the expected result:
the text file is saved using the Encoding of the Language currently selected in the Excel application, not the Unicode (UTF-16 LE) or UTF-8 Encoding selected (which is ignored), nor the default Encoding determined by the current System Language.
To import the CSV file, you can use the Encoding.GetEncoding() method to specify the Name or CodePage of the Encoding used on the machine that generated the file, e.g. Encoding.GetEncoding(1252): again, not the Encoding related to the System Language, but the Encoding of the Language that the Excel application is currently using.
CodePage 1252 (Windows-1252) and ISO-8859-1 are commonly used in the Latin-1 zone.
Based on the symbols you're referring to, this is most probably the original encoding used.
In Windows, use the former. ISO-8859-1 is still used, mostly in old Web pages (or Web pages created without care for the Encoding used).
As a note, CodePage 1252 and ISO-8859-1 are not exactly the same Encoding: there are subtle differences.
If you find documentation that states the opposite, the documentation is wrong.
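The difference is easy to demonstrate: bytes in the 0x80-0x9F range are printable characters in Windows-1252 but C1 control codes in ISO-8859-1. For example, in Python:

# 0x93 / 0x94 are curly double quotes in Windows-1252, control codes in ISO-8859-1.
data = b"\x93quoted\x94"
print(data.decode("cp1252"))      # prints "quoted" wrapped in curly quotes
print(data.decode("iso-8859-1"))  # same bytes decode to invisible control characters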
I am Brazilian and I am working with files that are encoded in Windows-1252. When I execute the queries, the names are fine, but when I try to export the data to Excel using the CSV download, I am facing an encoding problem and all the letters with accents come out wrong.
I want to know how to change the encoding or the collation in the download-as-CSV for queries so that it matches the encoding used on import.
The code I used to import the data is:
COPY base_ans_02 FROM 'C:\Users\ben201907_SP.csv' DELIMITER ','
CSV HEADER encoding 'windows-1252';
and one example of the error is:
AMIL ASSISTÊNCIA MÉDICA INTERNACIONAL S.A.
If you inserted the data into your table using the WIN1252 encoding and it is not the default of your client, you may also want to make sure the client knows which encoding it's going to deal with.
Just set the client encoding right before your COPY command and you should be fine:
SET CLIENT_ENCODING=WIN1252;
COPY base_ans_02 TO 'path_to_file' DELIMITER ',' CSV HEADER;
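If the export is done from a script rather than from the query tool, the same idea applies. A sketch with psycopg2 (the connection string and output path are placeholders):

import psycopg2

conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection details
conn.set_client_encoding("WIN1252")  # match the encoding the data was imported with
with conn.cursor() as cur, open("base_ans_02.csv", "w", encoding="cp1252") as out:
    cur.copy_expert("COPY base_ans_02 TO STDOUT WITH CSV HEADER DELIMITER ','", out)
conn.close()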
I am exporting text files from two queries in MS Access 2010. The queries are based on different linked ODBC tables (the tables differ only in data; structure and data types are the same). I set up an export specification to export both text files in UTF-8 encoding. Now here comes the troublesome part: when I export the queries and open the files in Notepad, one is in UTF-8 and the other is in ANSI. I don't know how this is possible when both queries have the same export specification, and it is driving me crazy.
This is my VBA code to export queries:
DoCmd.TransferText acExportDelim, "miniflow", "qry01_CZ_test", "C:\TEST_CZ.txt", False
DoCmd.TransferText acExportDelim, "miniflow", "qry01_SK_test", "C:\TEST_SK.txt", False
I also tried to modify it by adding 65001 as the CodePage argument, but the results were the same.
Do you have any idea what could be wrong?
Don't rely on the File Open dialog in Notepad to tell you whether a text file is encoded as "ANSI" or UTF-8. That is just Notepad's "guess" based on whether the file begins with the bytes EF BB BF, which is the UTF-8 Byte Order Mark (BOM).
Many (most?) Windows applications will include the UTF-8 BOM at the beginning of a text file that is UTF-8 encoded. Some Unicode purists insist, often quite vigorously, that the BOM is not required for UTF-8 files and should be excluded, but that is the way Windows applications tend to behave.
Unfortunately, Access does not always follow that pattern when it exports files to text. A UTF-8 text file exported from Access may omit the BOM and that can confuse applications like Notepad if they assume that a UTF-8 encoded file will always include the BOM as the first three bytes of the file.
For a more reliable way of determining the encoding of a text file, consider using an application like Notepad++ to open the file. It will differentiate between UTF-8 files with a BOM (which it designates as "UTF-8") and UTF-8 files without a BOM (which it designates as "ANSI as UTF-8").
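The same check can also be scripted. A short Python sketch (a heuristic, since a small legacy-ANSI file can coincidentally be valid UTF-8):

def sniff_encoding(path):
    # Classify a text file as UTF-8 with BOM, UTF-8 without BOM, or other.
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    try:
        raw.decode("utf-8")
        return "UTF-8 without BOM (Notepad++'s 'ANSI as UTF-8')"
    except UnicodeDecodeError:
        return "not valid UTF-8 (likely a legacy ANSI code page)"

print(sniff_encoding("TEST_CZ.txt"))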
To illustrate, consider an Access table containing the character é. When it is exported to text (CSV) with UTF-8 encoding, the File Open dialog in Notepad reports that the file is encoded as "ANSI", but a hex editor shows that it is in fact encoded as UTF-8 (the character é is encoded as C3 A9, not simply E9 as would be the case for true "ANSI" encoding), and Notepad++ recognizes it as "ANSI as UTF-8": in other words, a UTF-8 encoded file without a BOM.