Converting UTF-8 to windows-1255 encoding in VB.NET

I am trying to convert a string encoded in UTF-8 to windows-1255 in VB.NET with no luck. Admittedly, I don't know VB but have tried using an example at MSDN and modifying it to my needs:
Public Function Utf82Hebrew(ByVal Str As String) As String
Dim ascii As Encoding = Encoding.GetEncoding("windows-1255")
Dim unicode As Encoding = Encoding.Unicode
' Convert the string into a byte array.
Dim unicodeBytes As Byte() = unicode.GetBytes(Str)
' Perform the conversion from one encoding to the other.
Dim asciiBytes As Byte() = Encoding.Convert(unicode, ascii, unicodeBytes)
' Convert the new byte array into a char array and then into a string.
Dim asciiChars(ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)-1) As Char
ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0)
Dim asciiString As New String(asciiChars)
Utf82Hebrew = asciiString
End Function
This function doesn't actually do anything—the string remains in UTF-8. However, if I change this line:
Dim ascii As Encoding = Encoding.GetEncoding("windows-1255")
To this:
Dim ascii As Encoding = Encoding.ASCII
Then the function returns question marks in the place of the string.
Does anyone know how to properly convert a UTF-8 string to a specific encoding (in this case, windows-1255), and/or what I'm doing wrong in the above code?
Thanks in advance.

I modified your code.
It is very straightforward to convert text from one encoding into another.
This is how you should do it in VB.Net.
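Note that 1252 is the Microsoft Windows code page for Western European text; Hebrew is code page 1255, so "windows-1255" is the right name to pass to GetEncoding.
Public Function Utf82Hebrew(ByVal Str As String) As String
Dim ascii As System.Text.Encoding = System.Text.Encoding.GetEncoding("windows-1255")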
Dim unicode As System.Text.Encoding = System.Text.Encoding.Unicode
' Convert the string into a byte array.
Dim unicodeBytes As Byte() = unicode.GetBytes(Str)
' Perform the conversion from one encoding to the other.
Dim asciiBytes As Byte() = System.Text.Encoding.Convert(unicode, ascii, unicodeBytes)
' Convert the new byte array into a char array and then into a string.
Dim asciiString As String = ascii.GetString(asciiBytes)
Utf82Hebrew = asciiString
End Function
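One thing worth keeping in mind: a .NET String is always UTF-16 in memory, so a function that returns a String cannot really hand back "windows-1255 text". The target encoding only exists as bytes. Below is a minimal sketch of that idea, assuming what you actually need is a Byte() to write to a file or send over the wire. The module and function names are made up for illustration. The RegisterProvider call assumes .NET Core or .NET 5+, where legacy code pages are opt-in; on .NET Framework the code page is built in and that line can be dropped.

```vb
Imports System.Text

Module Windows1255Sketch
    ' Hypothetical helper: returns the windows-1255 (Hebrew) encoding.
    Private Function GetHebrewEncoding() As Encoding
        ' Assumption: running on .NET Core / .NET 5+, where code pages
        ' such as 1255 must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        Return Encoding.GetEncoding("windows-1255")
    End Function

    ' Encodes a .NET string (UTF-16 in memory) as windows-1255 bytes.
    Public Function ToWindows1255Bytes(ByVal input As String) As Byte()
        Return GetHebrewEncoding().GetBytes(input)
    End Function

    ' Re-encodes raw UTF-8 bytes (for example, read from a file) as windows-1255 bytes.
    Public Function Utf8ToWindows1255(ByVal utf8Bytes As Byte()) As Byte()
        Return Encoding.Convert(Encoding.UTF8, GetHebrewEncoding(), utf8Bytes)
    End Function
End Module
```

For example, ToWindows1255Bytes("שלום") should give the four bytes F9 EC E5 ED, one byte per Hebrew letter. Characters that have no slot in windows-1255 come out as "?".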

Related

VB.NET, I can't convert Unicode escape sequences to text

I watched many videos on YouTube, read many solutions on Google and Stack Overflow! Can anyone tell me how I can convert Unicode escape sequences to text?
I tried this:
Dim f = System.Net.WebUtility.HtmlDecode("sa3444444d4ds\u0040outllok.com")
MsgBox(f)
and this:
Dim f = System.Uri.UnescapeDataString("sa3444444d4ds\u0040outllok.com")
MsgBox(f)
and this:
Dim myBytes As Byte() = System.Text.Encoding.Unicode.GetBytes("sa3444444d4ds\u0040outllok.com")
Dim myChars As Char() = System.Text.Encoding.Unicode.GetChars(myBytes)
Dim myString As String = New String(myChars)
MsgBox(myString)
and this:
Dim f = UnicodeToAscii("sa3444444d4ds\u0040outllok.com")
MsgBox(f)
Public Function UnicodeToAscii(ByVal unicodeString As String) As String
Dim ascii As Encoding = Encoding.ASCII
Dim unicode As Encoding = Encoding.Unicode
' Convert the string into a byte array.
Dim unicodeBytes As Byte() = unicode.GetBytes(unicodeString)
' Perform the conversion from one encoding to the other.
Dim asciiBytes As Byte() = Encoding.Convert(unicode, ascii, unicodeBytes)
' Convert the new byte array into a char array and then into a string.
Dim asciiChars(ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length) - 1) As Char
ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0)
Dim asciiString As New String(asciiChars)
Return asciiString
End Function
You can use Regex.Unescape (in the System.Text.RegularExpressions namespace).
For example,
Dim s = "sa3444444d4ds\u0040outllok.com"
Console.WriteLine(Regex.Unescape(s))
outputs:
sa3444444d4ds@outllok.com
Credit to Tim Patrick for showing this in the Visual Studio Magazine article Overcoming Escape Sequence Envy in Visual Basic and C#.
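A caveat: Regex.Unescape also converts other backslash sequences such as \n and \t, which may not be what you want if the input can contain literal backslashes. Here is a rough sketch that decodes only \uXXXX sequences and leaves everything else alone. The module and function names are hypothetical, and it assumes every escape is exactly four hex digits.

```vb
Imports System.Globalization
Imports System.Text.RegularExpressions

Module UnicodeEscapeSketch
    ' Replaces each \uXXXX sequence with the UTF-16 code unit it names.
    ' Other backslash sequences are left untouched.
    Public Function DecodeUnicodeEscapes(ByVal input As String) As String
        Return Regex.Replace(input, "\\u([0-9A-Fa-f]{4})",
            Function(m As Match) Convert.ToChar(
                Integer.Parse(m.Groups(1).Value, NumberStyles.HexNumber)).ToString())
    End Function
End Module
```

Calling DecodeUnicodeEscapes("sa3444444d4ds\u0040outllok.com") returns sa3444444d4ds@outllok.com, since U+0040 is the at sign.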

ASCII to Base32

I am trying to convert the 10 characters entered in a text box in my VB project into Base32. Here is my code. I am getting this error:
Value of type 'String' cannot be converted to 'Byte()'. WindowsApplication2
Private Sub Ok_Click(sender As Object, e As EventArgs) Handles Ok.Click
Dim DataToEncode As Byte() = txtbox.Text
Dim Base32 As String
Base32 = DataToEncode.ToBase32String()
Auth.Text = Base32
End Sub
The value in txtbox.Text is a string which can't be automatically converted to a byte array. So the line Dim DataToEncode As Byte() = txtbox.Text can't be compiled. To get the ASCII representation of a string use the System.Text.Encoding.ASCII.GetBytes() method.
Dim DataToEncode As Byte() = System.Text.Encoding.ASCII.GetBytes(txtbox.Text)
Also, strings in VB.NET do not store ASCII values; they use UTF-16.
As the error indicates, you're trying to take a string (the contents of txtbox.Text) and put it in a variable of type Byte(), an array of bytes. A string isn't a byte array; it's a logical sequence of characters that can have different representations in bytes - do you want to treat it as a UTF-8-encoded string? An ASCII string? A full-blown UTF-32 string? All these are different byte representations of what might be the same textual data.
Once you know the representation you care about, use the System.Text.Encoding classes to convert the text to a Byte() and pass that to your method.
Try converting the string into a byte array using the GetBytes method:
Dim DataToEncode As Byte() = Encoding.UTF8.GetBytes(txtbox.Text)
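As far as I know, ToBase32String() is not part of the .NET base class library, so the question's code presumably relies on an extension method defined elsewhere. In case it helps, here is a minimal sketch of a Base32 encoder using the RFC 4648 alphabet with "=" padding. The module and function names are made up, and this is a stand-in for illustration, not the extension the question uses.

```vb
Imports System.Text

Module Base32Sketch
    ' RFC 4648 Base32 alphabet.
    Private Const Alphabet As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

    ' Encodes bytes as Base32 text, padded with "=" to a multiple of 8 characters.
    Public Function ToBase32(ByVal data As Byte()) As String
        Dim result As New StringBuilder()
        Dim buffer As Integer = 0        ' bits read but not yet emitted
        Dim bitsInBuffer As Integer = 0  ' how many bits of buffer are valid

        For Each b As Byte In data
            buffer = (buffer << 8) Or b
            bitsInBuffer += 8
            ' Emit one character for every complete group of 5 bits.
            While bitsInBuffer >= 5
                bitsInBuffer -= 5
                result.Append(Alphabet.Chars((buffer >> bitsInBuffer) And 31))
            End While
            ' Keep only the leftover low bits.
            buffer = buffer And ((1 << bitsInBuffer) - 1)
        Next

        ' Flush a final partial group, padded with zero bits on the right.
        If bitsInBuffer > 0 Then
            result.Append(Alphabet.Chars((buffer << (5 - bitsInBuffer)) And 31))
        End If

        While result.Length Mod 8 <> 0
            result.Append("="c)
        End While
        Return result.ToString()
    End Function
End Module
```

With this sketch the click handler could do Auth.Text = Base32Sketch.ToBase32(Encoding.ASCII.GetBytes(txtbox.Text)). Ten ASCII characters are 80 bits, so the result is exactly 16 Base32 characters with no padding.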

Converting String to String of Hex and vice-versa in Vb.Net

I need to convert a String of totally random characters into something I can read back!
My idea is:
Example String: hi
h (Ascii) -> 68 (hex)
i (Ascii) -> 69 (hex)
So converting hi must give me 6869.
My value is now in Base64 (I got it with Convert.ToBase64String()). Is this "ASCII to hex" conversion correct? In Base64 I have values like "4kIw0ueWC/+c=", but I need letters and digits only; special characters can mess up my system.
The VB.NET Convert class can only translate to a Base64 string :(
Edit: this is my final solution:
I had the Base64 string in my enc variable and converted it first to ASCII bytes and then to the corresponding hex using:
Dim bytes As Byte() = System.Text.Encoding.ASCII.GetBytes(enc)
Dim hex As String = BitConverter.ToString(bytes).Replace("-", String.Empty)
After that I reversed it with:
Dim b((input.Length \ 2) - 1) As Byte
For i As Int32 = 0 To b.GetUpperBound(0)
b(i) = Byte.Parse(input.Substring(i * 2, 2), Globalization.NumberStyles.HexNumber)
Next i
Dim enc As New System.Text.ASCIIEncoding()
result = enc.GetString(b)
After all this I got back my Base64 string and converted it one last time with Convert.FromBase64String(result).
Done! Thanks for the hint :)
First get Byte() from your base64 string:
Dim data = Convert.FromBase64String(inputString)
Then use BitConverter:
Dim hex As String = BitConverter.ToString(data)
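Note that BitConverter.ToString separates the bytes with "-", so you will probably want to strip the dashes. The two directions could be wrapped up roughly like this. The module and function names are hypothetical, and the sketch assumes the hex string has an even length and contains only hex digits.

```vb
Imports System.Globalization

Module HexSketch
    ' Bytes -> hex digits with no separators, e.g. {&H68, &H69} -> "6869".
    Public Function BytesToHex(ByVal data As Byte()) As String
        Return BitConverter.ToString(data).Replace("-", String.Empty)
    End Function

    ' Hex digits -> bytes. Assumes an even number of valid hex digits.
    Public Function HexToBytes(ByVal hex As String) As Byte()
        Dim result((hex.Length \ 2) - 1) As Byte
        For i As Integer = 0 To result.Length - 1
            result(i) = Byte.Parse(hex.Substring(i * 2, 2), NumberStyles.HexNumber)
        Next
        Return result
    End Function
End Module
```

If you go straight from the original bytes to hex this way, the Base64 step in between is not needed at all: BytesToHex(data) to store, HexToBytes(stored) to get the bytes back.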

Converting non-Unicode to Unicode

I'm trying to convert a non-Unicode string like this, '¹ûº¤¡¾­¢º¤ìñ©2' to Unicode like this, 'ໃຊ້ໃນຄົວເຮືອນ' which is in Lao. I tried with the code below and its return value is like this, '??????'. Any idea how can I convert the string?
Public Shared Function ConvertAsciiToUnicode(asciiString As String) As String
' Create two different encodings.
Dim encAscii As Encoding = Encoding.ASCII
Dim encUnicode As Encoding = Encoding.Unicode
' Convert the string into a byte[].
Dim asciiBytes As Byte() = encAscii.GetBytes(asciiString)
' Perform the conversion from one encoding to the other.
Dim unicodeBytes As Byte() = Encoding.Convert(encAscii, encUnicode, asciiBytes)
' Convert the new byte[] into a char[] and then into a string.
' This is a slightly different approach to converting to illustrate
' the use of GetCharCount/GetChars.
Dim unicodeChars As Char() = New Char(encUnicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length) - 1) {}
encUnicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0)
Dim unicodeString As New String(unicodeChars)
' Return the new unicode string
Return unicodeString
End Function
Your 8-bit encoded Lao text is not in ASCII, but in some codepage like IBM CP1133 or Microsoft LC0454, or most likely, the Thai codepage 874. You have to find out which one it is.
It matters how you have obtained (read, received, computed) the input string. By the time you make it a string it is already in Unicode and is easy to output in UTF-8, for example, like this:
Dim writer As New StreamWriter("myfile.txt", True, System.Text.Encoding.UTF8)
writer.Write(mystring)
writer.Close()
Here is the whole in-memory conversion:
Dim input_bytes As Byte() ' the raw bytes as received, still in code page 874
...
Dim converted As Byte() = Encoding.Convert(Encoding.GetEncoding(874), Encoding.UTF8, input_bytes)
The number 874 identifies the code page your input is in. Whether a particular operating system installation supports this code page is another question, but your own system will almost certainly support it if you just used it to compose your Stack Overflow question.
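If all you have is the already-garbled string, one thing you can try is to turn it back into the original bytes and decode those again with the right code page. Here is a rough sketch of that, with the caveat that both code page numbers are guesses you have to supply: 1252 is only an assumption about how the text got garbled, and 874 is only a candidate for the real encoding. Some legacy Lao text is tied to a particular font rather than a standard code page, in which case no built-in encoding will match. The names are made up, and the RegisterProvider call assumes .NET Core or .NET 5+ (it can be dropped on .NET Framework).

```vb
Imports System.Text

Module RedecodeSketch
    ' Recovers the raw bytes by encoding the garbled text with the code page
    ' it was wrongly decoded with, then decodes them with the intended one.
    Public Function RedecodeText(ByVal garbled As String,
                                 ByVal wrongCodePage As Integer,
                                 ByVal actualCodePage As Integer) As String
        ' Assumption: .NET Core / .NET 5+, where legacy code pages are opt-in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        Dim raw As Byte() = Encoding.GetEncoding(wrongCodePage).GetBytes(garbled)
        Return Encoding.GetEncoding(actualCodePage).GetString(raw)
    End Function
End Module
```

For instance, RedecodeText("¡", 1252, 874) yields the Thai letter ก (U+0E01), because both code pages use byte A1 for those characters. This only works if the wrong decoding did not lose any bytes along the way.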

Converting String to List of Bytes

This has to be incredibly simple, but I must not be looking in the right place.
I'm receiving this string via a FTDI usb connection:
'UUU'
I would like to receive this as a byte array of
[85,85,85]
In Python, I would convert a string to a byte array like this:
[ord(c) for c in 'UUU']
I've looked around, but haven't figured this out. How do I do this in Visual Basic?
Use the Encoding class with the correct encoding.
C#:
// Assuming string is UTF8
Encoding utf8 = Encoding.UTF8;
byte[] bytes = utf8.GetBytes("UUU");
VB.NET:
Dim utf8 As Encoding = Encoding.UTF8
Dim bytes As Byte() = utf8.GetBytes("UUU")
It depends on which encoding you want to use, but for UTF-8 this works; you could change it to UTF-16 if needed.
Dim strText As String = "UUU"
Dim encText As New System.Text.UTF8Encoding()
Dim btText() As Byte
btText = encText.GetBytes(strText)