Converting non-Unicode to Unicode - vb.net

I'm trying to convert a non-Unicode string like this, '¹ûº¤¡¾­¢º¤ìñ©2', to Unicode like this, 'ໃຊ້ໃນຄົວເຮືອນ', which is in Lao. I tried the code below, but it returns '??????'. Any idea how I can convert the string?
Public Shared Function ConvertAsciiToUnicode(asciiString As String) As String
' Create two different encodings.
Dim encAscii As Encoding = Encoding.ASCII
Dim encUnicode As Encoding = Encoding.Unicode
' Convert the string into a byte[].
Dim asciiBytes As Byte() = encAscii.GetBytes(asciiString)
' Perform the conversion from one encoding to the other.
Dim unicodeBytes As Byte() = Encoding.Convert(encAscii, encUnicode, asciiBytes)
' Convert the new byte[] into a char[] and then into a string.
' This is a slightly different approach to converting to illustrate
' the use of GetCharCount/GetChars.
Dim unicodeChars As Char() = New Char(encUnicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length) - 1) {}
encUnicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0)
Dim unicodeString As New String(unicodeChars)
' Return the new unicode string
Return unicodeString
End Function

Your 8-bit encoded Lao text is not in ASCII, but in some codepage like IBM CP1133 or Microsoft LC0454, or most likely, the Thai codepage 874. You have to find out which one it is.
It matters how you have obtained (read, received, computed) the input string. By the time you make it a string it is already in Unicode and is easy to output in UTF-8, for example, like this:
Dim writer As New StreamWriter("myfile.txt", True, System.Text.Encoding.UTF8)
writer.Write(mystring)
writer.Close()
Here is the whole in-memory conversion:
Dim legacyBytes As Byte() ' the raw 8-bit input, still in its original codepage
...
Dim converted As Byte() = Encoding.Convert(Encoding.GetEncoding(874), Encoding.UTF8, legacyBytes)
The number 874 identifies the codepage your input is in. Whether a particular operating system installation supports this codepage is another question, but your own system will almost certainly support it if you just used it to compose your Stack Overflow question.
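As a minimal sketch of the point about how the input is obtained: if the 8-bit text sits in a file, it can be decoded while reading and written back out as UTF-8. The file names are hypothetical, and code page 874 is only the guess from above, so swap in the real code page once it is known. The sketch assumes .NET 5 or later, where legacy code pages must be registered first; on .NET Framework that line can be removed.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ConvertLegacyFile
    Sub Main()
        ' Needed on .NET Core / .NET 5+ to make legacy code pages available.
        ' On .NET Framework this line is unnecessary and can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

        ' Assumption: the input is in code page 874. Replace the number
        ' once the real code page of the data is known.
        Dim sourceEncoding As Encoding = Encoding.GetEncoding(874)

        ' Hypothetical file names, for illustration only.
        Dim content As String = File.ReadAllText("input-legacy.txt", sourceEncoding)

        ' The String is now UTF-16 in memory; write it back out as UTF-8.
        File.WriteAllText("output-utf8.txt", content, Encoding.UTF8)
    End Sub
End Module
```

If the output still looks wrong, the code page guess is the first thing to revisit; the rest of the sketch does not depend on it.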

Related

ASCII to Base32

I am trying to convert the 10 characters entered into a text box in my VB project to Base32. Here is my code. I am getting this error:
Value of type 'String' cannot be converted to 'Byte()'. WindowsApplication2
Private Sub Ok_Click(sender As Object, e As EventArgs) Handles Ok.Click
Dim DataToEncode As Byte() = txtbox.Text
Dim Base32 As String
Base32 = DataToEncode.ToBase32String()
Auth.Text = Base32
End Sub
The value in txtbox.Text is a string which can't be automatically converted to a byte array. So the line Dim DataToEncode As Byte() = txtbox.Text can't be compiled. To get the ASCII representation of a string use the System.Text.Encoding.ASCII.GetBytes() method.
Dim DataToEncode As Byte() = System.Text.Encoding.ASCII.GetBytes(txtbox.Text)
Also, strings in VB.NET do not store ASCII values; they use UTF-16.
As the error indicates, you're trying to take a string (the contents of txtbox.Text) and put it in a variable of type Byte(), an array of bytes. A string isn't a byte array; it's a logical sequence of characters that can have different representations in bytes. Do you want to treat it as a UTF-8-encoded string? An ASCII string? A full-blown UTF-32 string? All of these are different byte representations of what might be the same textual data.
Once you know the representation you care about, use the System.Text.Encoding classes to convert the text to a Byte() and pass that to your method.
Try converting the string into a byte array using the GetBytes method:
Dim DataToEncode As Byte() = Encoding.UTF8.GetBytes(txtbox.Text)
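To illustrate why the choice of encoding matters, here is a small sketch that encodes the same two-character string four ways and prints the resulting byte counts. The sample string is made up for the example; it is "A" followed by "é" (built with ChrW so the result does not depend on how the source file is saved).

```vbnet
Imports System
Imports System.Text

Module EncodingByteCounts
    Sub Main()
        ' Hypothetical sample: "A" followed by "é" (U+00E9).
        Dim sample As String = "A" & ChrW(&HE9)

        ' ASCII cannot represent "é", so it is replaced by "?" (one byte).
        Console.WriteLine(Encoding.ASCII.GetBytes(sample).Length)   ' 2
        ' UTF-8 uses one byte for "A" and two for "é".
        Console.WriteLine(Encoding.UTF8.GetBytes(sample).Length)    ' 3
        ' Encoding.Unicode is UTF-16: two bytes per character here.
        Console.WriteLine(Encoding.Unicode.GetBytes(sample).Length) ' 4
        ' UTF-32 always uses four bytes per code point.
        Console.WriteLine(Encoding.UTF32.GetBytes(sample).Length)   ' 8
    End Sub
End Module
```

For plain ASCII input such as ten digits or Latin letters, the ASCII and UTF-8 results are byte-for-byte identical, so either works for the text box case.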

Read Data From The Byte Array Returned From Web Service

I have a web service which returns data as a byte array. Now I want to read that data in my console project. How can I do that? I have already added the references needed to access the web service. I am using VB.NET with VS2012. Thanks. My web service method is as follows.
Public Function GetFile() As Byte()
Dim response As Byte()
Dim filePath As String = "D:\file.txt"
response = File.ReadAllBytes(filePath)
Return response
End Function
Something like,
Dim result As String
Using data As New MemoryStream(response)
Using reader As New StreamReader(data)
result = reader.ReadToEnd()
End Using
End Using
If you knew the encoding, let's say it was UTF-8, you could do:
Dim result = System.Text.Encoding.UTF8.GetString(response)
Following on from your comments, I think you are asserting this.
Dim response As Byte() 'Is the bytes of a Base64 encoded string.
So, we know all the bytes will be valid ASCII (because it's Base64), so ASCII and UTF-8 decode them identically.
Dim base64Encoded As String = System.Text.Encoding.UTF8.GetString(response)
Now, base64Encoded is the string Base64 representation of some binary.
Dim decodedBinary As Byte() = Convert.FromBase64String(base64Encoded)
So, we've changed the encoded Base64 into the binary it represents. Now, because I can see that in your example you are reading a file called "D:\file.txt", I'm going to assume that the contents of the file are character-encoded text, but I don't know which encoding. The StreamReader class has some logic that can make an educated guess at the character encoding (it checks for a byte order mark and otherwise falls back to UTF-8).
Dim result As String
Using data As New MemoryStream(decodedBinary)
Using reader As New StreamReader(data)
result = reader.ReadToEnd()
End Using
End Using
Hopefully, result now contains the contents of the text file.
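Putting those steps together, here is a self-contained sketch of the Base64 path. The service response is simulated locally, and the file content is assumed to be UTF-8 text; both are assumptions for the sake of a runnable example, not something the question confirms.

```vbnet
Imports System
Imports System.Text

Module Base64RoundTrip
    Sub Main()
        ' Stand-in for the web service response: the ASCII bytes of a
        ' Base64 string that encodes the (assumed UTF-8) text "hello".
        Dim base64Text As String = Convert.ToBase64String(Encoding.UTF8.GetBytes("hello"))
        Dim response As Byte() = Encoding.ASCII.GetBytes(base64Text)

        ' Step 1: the response bytes are plain ASCII, so decode them to a String.
        Dim base64Encoded As String = Encoding.ASCII.GetString(response)

        ' Step 2: turn the Base64 text back into the binary it represents.
        Dim decodedBinary As Byte() = Convert.FromBase64String(base64Encoded)

        ' Step 3: decode the binary as text. UTF-8 is an assumption here.
        Dim result As String = Encoding.UTF8.GetString(decodedBinary)

        Console.WriteLine(base64Encoded) ' aGVsbG8=
        Console.WriteLine(result)        ' hello
    End Sub
End Module
```

If the service returns the raw file bytes rather than Base64 text, steps 1 and 2 drop out and only the final decoding step is needed.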

How can I read Greek characters from my database in my web app?

I've got Greek text stored in my Access database. For some reason it doesn't appear in Greek; it shows other symbols instead.
e.g. Ãëþóóá instead of Γλώσσα
I can convert it in my windows app like this:
Dim encoder As Encoding = Encoding.GetEncoding(1253)
Dim valueInBytes As Byte() = System.IO.File.ReadAllBytes(lanuageFilePath)
languageValue = encoder.GetString(valueInBytes)
However, I now need to use the values in my web app. But the ReadAllBytes method is not available to me. I've tried using GetBytes instead, but this doesn't seem to produce the same results.
Dim encoder As Encoding = Encoding.GetEncoding(1253)
Dim valueInBytes As Byte() = encoder.GetBytes(languageValue)
languageValue = encoder.GetString(valueInBytes)
What am I doing wrong?
The first one seems to have nothing to do with text in a variable; you're reading from a file.
Dim encoder As Encoding = Encoding.GetEncoding(1253)
Dim valueInBytes As Byte() = System.IO.File.ReadAllBytes(languageValue)
languageValue = encoder.GetString(valueInBytes)
ReadAllBytes should be supported in most frameworks so there should not be a problem with this on the server.
The other code seems to be doing something completely different. You are converting the string to bytes and back again using the same encoding, which just gives you the original string. To get this to work, you need to find out which encoding Access thought the text was in and encode with that. However, the text may still not have survived the round trip, as Access may be doing some normalisation of the Unicode.
Dim encoder As Encoding = Encoding.GetEncoding(1253)
Dim accessencoder As Encoding = Encoding.GetEncoding({{Access's encoding number here}})
Dim valueInBytes As Byte() = accessencoder.GetBytes(languageValue)
languageValue = encoder.GetString(valueInBytes)
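As a concrete sketch of that round trip, suppose the text was wrongly decoded as windows-1252. That is only an assumption, although it matches the sample in the question: "Ãëþóóá" is exactly what the windows-1253 bytes for "Γλώσσα" look like when read as windows-1252. The helper name Redecode is made up for this example, and the RegisterProvider line assumes .NET 5 or later (on .NET Framework it can be removed).

```vbnet
Imports System
Imports System.Text

Module GreekRedecode
    ' Hypothetical helper: recover the original bytes of a mis-decoded string,
    ' then decode them with the code page the text was really written in.
    Function Redecode(garbled As String, wrongCodePage As Integer, actualCodePage As Integer) As String
        Dim originalBytes As Byte() = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled)
        Return Encoding.GetEncoding(actualCodePage).GetString(originalBytes)
    End Function

    Sub Main()
        ' Needed on .NET Core / .NET 5+ to make legacy code pages available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

        ' Assumption: the Greek bytes were wrongly decoded as windows-1252.
        Dim garbled As String = "Ãëþóóá"
        Dim fixedText As String = Redecode(garbled, 1252, 1253)

        ' Compare instead of printing, so console encoding does not matter.
        Console.WriteLine(fixedText = "Γλώσσα") ' True
    End Sub
End Module
```

This only works when every original byte survived the wrong decoding. Bytes that have no character in the wrong code page are lost and cannot be recovered this way.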

Converting UTF-8 to windows-1255 encoding in VB.NET

I am trying to convert a string encoded in UTF-8 to windows-1255 in VB.NET with no luck. Admittedly, I don't know VB but have tried using an example at MSDN and modifying it to my needs:
Public Function Utf82Hebrew(ByVal Str As String) As String
Dim ascii As Encoding = Encoding.GetEncoding("windows-1255")
Dim unicode As Encoding = Encoding.Unicode
' Convert the string into a byte array.
Dim unicodeBytes As Byte() = unicode.GetBytes(Str)
' Perform the conversion from one encoding to the other.
Dim asciiBytes As Byte() = Encoding.Convert(unicode, ascii, unicodeBytes)
' Convert the new byte array into a char array and then into a string.
Dim asciiChars(ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)-1) As Char
ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0)
Dim asciiString As New String(asciiChars)
Utf82Hebrew = asciiString
End Function
This function doesn't actually do anything—the string remains in UTF-8. However, if I change this line:
Dim ascii As Encoding = Encoding.GetEncoding("windows-1255")
To this:
Dim ascii As Encoding = Encoding.ASCII
Then the function returns question marks in the place of the string.
Does anyone know how to properly convert a UTF-8 string to a specific encoding (in this case, windows-1255), and/or what I'm doing wrong in the above code?
Thanks in advance.
I modified your code.
It is very straightforward to convert text from one encoding into another.
This is how you should do it in VB.Net.
Note that Windows code page 1252 is the Western European (Latin) one; the Hebrew code page is 1255, so 1255 is the one to keep for Hebrew text.
Public Function Utf82Hebrew(ByVal Str As String) As String
Dim ascii As System.Text.Encoding = System.Text.Encoding.GetEncoding(1255)
Dim unicode As System.Text.Encoding = System.Text.Encoding.Unicode
' Convert the string into a byte array.
Dim unicodeBytes As Byte() = unicode.GetBytes(Str)
' Perform the conversion from one encoding to the other.
Dim asciiBytes As Byte() = System.Text.Encoding.Convert(unicode, ascii, unicodeBytes)
' Convert the new byte array into a char array and then into a string.
Dim asciiString As String = ascii.GetString(asciiBytes)
Utf82Hebrew = asciiString
End Function
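One point worth making explicit: a .NET String is always UTF-16 in memory, so a function that returns a String cannot return "windows-1255 text". The encoded form only exists as a Byte(). Here is a minimal sketch of producing those bytes; the helper name ToWindows1255 is made up for this example, the sample word is built from code points to avoid source-file encoding issues, and the RegisterProvider line assumes .NET 5 or later (on .NET Framework it can be removed).

```vbnet
Imports System
Imports System.Text

Module HebrewBytes
    ' Hypothetical helper: returns the windows-1255 bytes for a String.
    Function ToWindows1255(value As String) As Byte()
        Return Encoding.GetEncoding(1255).GetBytes(value)
    End Function

    Sub Main()
        ' Needed on .NET Core / .NET 5+ to make legacy code pages available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)

        ' The Hebrew word "shalom": shin, lamed, vav, final mem.
        Dim sample As String = ChrW(&H5E9) & ChrW(&H5DC) & ChrW(&H5D5) & ChrW(&H5DD)

        ' UTF-8 needs two bytes per Hebrew letter.
        Console.WriteLine(Encoding.UTF8.GetBytes(sample).Length)      ' 8
        ' windows-1255 needs one byte per Hebrew letter.
        Console.WriteLine(BitConverter.ToString(ToWindows1255(sample))) ' F9-EC-E5-ED
    End Sub
End Module
```

Those bytes can then be written to a file or a stream as they are. Decoding them back into a String, as the function above does, just reproduces the original text.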

Converting String to List of Bytes

This has to be incredibly simple, but I must not be looking in the right place.
I'm receiving this string via a FTDI usb connection:
'UUU'
I would like to receive this as a byte array of
[85,85,85]
In Python, I would convert a string to a byte array like this:
[ord(c) for c in 'UUU']
I've looked around, but haven't figured this out. How do I do this in Visual Basic?
Use the Encoding class with the correct encoding.
C#:
// Assuming string is UTF8
Encoding utf8 = Encoding.UTF8;
byte[] bytes = utf8.GetBytes("UUU");
VB.NET:
Dim utf8 As Encoding = Encoding.UTF8
Dim bytes As Byte() = utf8.GetBytes("UUU")
It depends on what kind of encoding you want to use, but for UTF-8 this works. You could change it to UTF-16 if needed.
Dim strText As String = "UUU"
Dim encText As New System.Text.UTF8Encoding()
Dim btText() As Byte
btText = encText.GetBytes(strText)
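For comparison with the Python one-liner in the question, here is a short sketch that prints the byte values. It assumes the device sends plain ASCII, which holds for 'UUU'; for such input ASCII and UTF-8 give the same bytes.

```vbnet
Imports System
Imports System.Text

Module StringToBytes
    Sub Main()
        ' Assumption: the received text is plain ASCII.
        Dim bytes As Byte() = Encoding.ASCII.GetBytes("UUU")

        ' Join the numeric values for display, similar to Python's list output.
        Console.WriteLine(String.Join(Of Byte)(",", bytes)) ' 85,85,85
    End Sub
End Module
```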