I have this code:
wxString tmp(wxT("Información del usuario"));
wxStaticBoxSizer* sbSizer1 = new wxStaticBoxSizer (wxVERTICAL, panel, tmp);
On Windows this shows strange symbols instead of the ó, but on Linux the letter is displayed correctly. Any ideas?
The value of the string in your code depends on the encoding of your source file and on the charset your compiler assumes. If your source file itself is in Unicode (whether UTF-8 or UTF-16), you can use L"..." to create a wide string literal. If it is not, or you're not sure, you can always use wxString::FromUTF8() to explicitly decode the bytes as UTF-8; for example, wxString::FromUTF8("Informaci\xc3\xb3n...") will always work.
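To see why the escaped literal above is safe, it may help to look at the raw bytes. The following is only a sketch in Python (used here as a neutral stand-in, not the wxWidgets API); the variable names are made up for illustration, and it assumes the Windows reader was treating UTF-8 bytes as Windows-1252.

```python
# Illustration only: Python stands in for the byte-level behaviour.
text = "Información"

# UTF-8 stores "ó" as two bytes, 0xC3 0xB3, which is exactly
# what the \xc3\xb3 escape in the C++ literal spells out.
accent_bytes = "ó".encode("utf-8")

utf8_bytes = text.encode("utf-8")

# Assumption: the display layer decodes the bytes as Windows-1252.
misread = utf8_bytes.decode("cp1252")

# Decoding with the right encoding restores the original text.
roundtrip = utf8_bytes.decode("utf-8")

print(accent_bytes, misread, roundtrip)
```

The "strange symbols" in the question are typically this kind of two-character garbage, which appears when one encoding is written and another is assumed on reading.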
I have an issue when trying to read a string from a .CSV file. When I execute the application and the text is shown in a textbox, certain characters such as "é" or "ó" are shown as a question mark symbol.
The idea is that this code reads the whole CSV file and then splits each line into variables depending on the first word of the line.
The code I'm using to read is:
Dim test() As String
test = IO.File.ReadAllLines("Libro1.csv")
Dim test_chart As String = Array.Find(test, Function(x) x.StartsWith("sample"))
Dim test_chart_div() As String = test_chart.Split(";")
variable1 = test_chart_div(1)
variable2 = test_chart_div(2)
...etc
I have also tried with:
Dim test() As String
test = IO.File.ReadAllLines("Libro1.csv", System.Text.Encoding.UTF8)
But neither works. The .csv file is supposed to be UTF-8. The "Web Options" dialog that you can open when saving the file in Excel shows the encoding as UTF-8. I also tried the trick of changing the file extension to HTML and opening it in the browser, which also reports the encoding as correct.
Can someone advise on anything else I can try?
Thanks in advance.
When an Excel file is exported using the CSV (comma-separated values) output format, the encoding selected under Tools -> Web Options -> Encoding in Excel's Save As... dialog doesn't actually produce the expected result:
the text file is saved using the encoding associated with the language currently selected in the Excel application. The Unicode (UTF-16 LE) or UTF-8 encoding selected in the dialog is ignored, and so is the default encoding determined by the current system language.
To import the CSV file, you can use the Encoding.GetEncoding() method to specify the name or code page of the encoding used on the machine that generated the file: again, not the encoding tied to the system language, but the encoding of the language that the Excel application is currently using.
Code page 1252 (Windows-1252) and ISO-8859-1 are commonly used in the Latin-1 region.
Based on the symbols you're referring to, one of these is most probably the original encoding.
On Windows, use the former. ISO-8859-1 is still used, mostly in old web pages (or web pages created without care for the encoding used).
As a note, code page 1252 and ISO-8859-1 are not exactly the same encoding; there are subtle differences. For example, bytes 0x80 to 0x9F are printable characters such as € in Windows-1252 but control codes in ISO-8859-1.
If you find documentation that states the opposite, the documentation is wrong.
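The effect described above can be reproduced at the byte level. This is a minimal sketch in Python rather than VB.NET, assuming (hypothetically) that the exported file contains Windows-1252 bytes; the sample content and variable names are invented for illustration.

```python
# Assumed sample: bytes as a Windows-1252 CSV export might contain them.
raw = "é;ó".encode("cp1252")

# Reading with the encoding that was really used gives the right text.
as_cp1252 = raw.decode("cp1252")

# Reading the same bytes as UTF-8 fails; with errors="replace" each
# invalid byte becomes U+FFFD, often rendered as a question mark.
as_utf8 = raw.decode("utf-8", errors="replace")

# One of the subtle differences between the two Latin-1 style encodings:
euro_cp1252 = b"\x80".decode("cp1252")    # the euro sign
euro_latin1 = b"\x80".decode("latin-1")   # a C1 control character

print(as_cp1252, as_utf8, repr(euro_cp1252), repr(euro_latin1))
```

In the .NET code from the question, the equivalent step would be passing the matching encoding (for example the one returned by Encoding.GetEncoding(1252)) to ReadAllLines instead of UTF-8.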
I have a string variable txt that contains the "°" degree symbol. I would like to save the string to an ASCII-encoded CSV file. I use the procedure below, but the "°" symbol is converted to "?". Do you have any idea how to save the degree symbol properly?
Public Sub Write_File(ByVal txt As String, ByVal fName As String)
Try
Using OutFile As New StreamWriter(fName, False, Text.Encoding.ASCII)
OutFile.Write(txt)
End Using
Me.Write_Log("Succesfully Exported")
Catch ex As Exception
Me.Write_Log("Write Error during export")
End Try
End Sub
Encoding.ASCII is the standard 7-bit ASCII encoding, which does not contain a degree symbol at all. To get a degree symbol, you would have to use one of the many 8-bit extensions of ASCII (often loosely called "8-bit ASCII"). For English, you'd probably be most interested in ISO 8859-1, since that's the most standard-ish one of the bunch. For instance, instead of using Encoding.ASCII, you could do something like this:
Using OutFile As New StreamWriter(fName, False, Text.Encoding.GetEncoding("iso-8859-1"))
OutFile.Write(txt)
End Using
For a complete list of available encodings, use the Encoding.GetEncodings method, or look at the list of supported ones in the MSDN documentation.
Of course, none of the various 8-bit ASCII encodings are compatible with each other, so, if you do use that, the degree symbol will be a completely different symbol when viewed on a system that uses a different code page by default. That is precisely why UTF-8 has become the new standard. Usage of 8-bit ASCII is widely discouraged since it is practically unworkable in multi-cultural scenarios. If you can use UTF-8 instead, I would. If you must use ASCII, it's best to stick to the standard 7-bit encoding. If you must use an 8-bit ASCII encoding, please do so sparingly and with full awareness of its drawbacks.
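The trade-offs above are easy to see in the bytes each encoding produces for the degree symbol. The sketch below uses Python purely for illustration (it is not the .NET API), and the variable names are hypothetical.

```python
degree = "°"

# 7-bit ASCII has no degree symbol; with errors="replace" the encoder
# substitutes "?", which matches the symptom in the question.
ascii_bytes = degree.encode("ascii", errors="replace")

# ISO 8859-1 stores it as the single byte 0xB0.
latin1_bytes = degree.encode("iso-8859-1")

# UTF-8 stores it as two bytes, 0xC2 0xB0.
utf8_bytes = degree.encode("utf-8")

print(ascii_bytes, latin1_bytes, utf8_bytes)
```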
One more thing. You mention the degree symbol as being character 167 (0xA7) in your desired target encoding. If that is the case, you may actually want the IBM437 encoding rather than ISO 8859-1. IBM437 is the old code page that MS-DOS used by default. (Strictly speaking, byte 167 in IBM437 is the masculine ordinal indicator "º", which merely looks like a degree sign; the degree sign itself is byte 248, 0xF8.) If you really need to use that code page, you may have additional trouble for two reasons. First, as you'll see in the MSDN article, that code page is not well supported in the .NET Framework. In my testing, outputting the Unicode string containing the degree symbol using that encoding did not work properly. Therefore, you may find yourself needing to use a byte array to represent the data rather than a String variable (which is Unicode). For instance:
File.WriteAllBytes("Test.txt", {167})
The second problem is that IBM437 is likely not the default code page for your Windows OS, so even when it is written to the file as byte value 167, it won't actually look like a degree symbol when you view it in a Windows application such as Notepad.
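How the same byte value reads under different code pages can be checked directly. This is a small Python sketch for illustration only; it assumes Python's built-in cp437 and cp1252 codecs as stand-ins for the MS-DOS and Windows code pages, and the names are hypothetical.

```python
byte_167 = bytes([167])   # 0xA7

# Under IBM437 this byte is the masculine ordinal indicator,
# which looks very much like a degree sign.
in_cp437 = byte_167.decode("cp437")

# Under Windows-1252 the same byte is the section sign.
in_cp1252 = byte_167.decode("cp1252")

# The actual degree sign sits at a different position in IBM437.
degree_cp437 = "°".encode("cp437")

print(in_cp437, in_cp1252, degree_cp437)
```

So a file containing byte 167 will look right only in a viewer that assumes IBM437.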
I had a situation where we produce a file for our client, and the file would contain some special characters like accented i or a (í, á) etc.
Originally, we used this code to open file for output:
Using sw As StreamWriter = New StreamWriter(fullpath, True)
However, the í and á would show up in the file as two-byte combinations, with hex codes c3 ad for the í and c3 a1 for the á.
We fixed the issue by enforcing the Windows-1252 encoding when writing to the file (which is the same as Encoding.Default, but according to MSDN we should NOT be using Encoding.Default):
Using sw As StreamWriter = New StreamWriter(fullpath, True, Text.Encoding.GetEncoding(1252))
Question 1: if Encoding.Default is not really the default encoding when no Encoding parameter is supplied, which encoding is the default default (pardon the pun) encoding?
Question 2: probably the same answer as Question 1, but what is the default default encoding for StreamReader if you don't specify the Encoding parameter?
Well, you didn't really fix the issue. To get "c3 ad for the í" you must be using Encoding.UTF8,
which is what StreamWriter already uses by default. However, it uses the UTF8Encoding constructor that takes the encoderShouldEmitUTF8Identifier argument and passes false. That identifier is otherwise known as the BOM (Byte Order Mark). The BOM tells the program that reads the file unequivocally which Unicode encoding is used. Sadly, Microsoft cannot force a BOM because the Unicode consortium, in a highly uncharacteristic moment of temporary insanity, made the BOM optional.
It probably works now because the reading program falls back to the system's default encoding when it can't find the BOM. You guessed 1252, which is common, but it is certainly no guarantee. Fix:
Using sw As StreamWriter = New StreamWriter(fullpath, True, Encoding.UTF8)
Do beware of the True argument you use, which appends text to the file. If the file already contains text, the BOM can no longer be added. It is also a rather nasty problem if the file was started with a different encoding; you certainly don't want a mix. Do everything you can to avoid having to use True.
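The byte sequences mentioned in this thread can be reproduced outside .NET. The following Python sketch is only an illustration of the encoding behaviour, not of StreamWriter itself; "utf-8-sig" is Python's name for UTF-8 with a BOM, and the variable names are hypothetical.

```python
import codecs

text = "íá"

# UTF-8 without a BOM: c3 ad for í and c3 a1 for á, as seen in the file.
no_bom = text.encode("utf-8")

# UTF-8 with a BOM: the same bytes, prefixed with ef bb bf.
with_bom = text.encode("utf-8-sig")

# A reader that finds no BOM and assumes Windows-1252 sees two
# characters per accented letter.
misread = no_bom.decode("cp1252")

print(no_bom.hex(" "), with_bom.hex(" "), repr(misread))
```

The BOM prefix is what lets a reader identify the file as UTF-8 without guessing.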
I have the symbol π in a strings file, and it turns out that because of it the build fails with a fatal error:
Copy EN.strings
Command /Developer/Library/Xcode/Plug-ins/CoreBuildTasks.xcplugin/Contents/Resources/copystrings failed with exit code 1
If I remove π it's fine. The strange thing is that even if I put π in the comment it still won't compile.
What should I do?
Thanks
If you can find the Unicode value of the character, you could escape it in the following manner:
NSString *str = @"\u00F6";
And Java (just for comparison):
String str = "\u00F6";
Although I'd imagine that the compile issue relates to the character being from a different encoding to the specified encoding of your source file. I believe the compiler will interpret your source as UTF-8 by default.
Make sure your strings file is using a Unicode encoding, and make sure the string is quoted; this has solved the issue for me in the past.
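Finding the Unicode value for an escape, as suggested above, can be done with a few lines of code. This is a minimal Python sketch for illustration; the variable names are hypothetical, and which encoding the build tool expects for the strings file is not something the sketch can tell you.

```python
ch = "π"

# The code point of π is U+03C0.
code_point = ord(ch)

# Build the \uXXXX escape that could be typed into a string literal.
escape = "\\u{:04X}".format(code_point)

# The same character in two common Unicode encodings.
utf8_bytes = ch.encode("utf-8")
utf16_bytes = ch.encode("utf-16-le")

print(hex(code_point), escape, utf8_bytes, utf16_bytes)
```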
I'm programming in VB.NET using Visual Studio 2008.
I need to define a string literal containing the character "÷" equivalent to Chr(247).
I understand that internally VS uses UTF-16 encoding, but when the source file is written to disk it contains the single byte value F7 for this character.
This source file is processed by another program that uses UTF-8 encoding by default, so it fails to interpret this character correctly, attempting to combine it with the following single-byte character.
What encoding would correctly interpret the single byte F7 as the single character ÷?
Alternatively, is there a way of expressing a non-ASCII literal that uses only ASCII characters - like using some kind of escape sequence?
Well, I always thought that VS saves files as UTF-8 by default. But ÷ is F7 in ISO 8859-1 (and in Windows-1252). If this is not enough for you, see: "How to change source file encoding in csharp project (visual studio / msbuild machine)?"
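The behaviour described in the question can be checked at the byte level. This Python sketch is only an illustration (not VB.NET), with a made-up two-byte sample standing in for the source file contents.

```python
# Assumed sample: the single byte F7 followed by an ASCII character.
raw = b"\xf7x"

# ISO 8859-1 maps every byte to one character, so F7 becomes ÷.
as_latin1 = raw.decode("iso-8859-1")

# A strict UTF-8 decoder rejects a lone F7 byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# In UTF-8 the same character needs two bytes, c3 b7.
division_utf8 = "÷".encode("utf-8")

print(as_latin1, utf8_ok, division_utf8)
```

A tool that assumes UTF-8 would therefore need the file saved as UTF-8, or be told to read it as ISO 8859-1 or Windows-1252.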