Mixed Encoding to String - vb.net

I have a string in VB.net that may contain something like the following:
This is a 0x000020AC symbol
This is the UTF-32 encoding for the Euro Symbol according to this article http://www.fileformat.info/info/unicode/char/20ac/index.htm
I'd like to convert this into
This is a € symbol
I've tried using the UnicodeEncoding class in VB.net (Framework 2.0, as I'm modifying a legacy application).
When I use this class to encode and then decode, I still get back the original string.
I expected that UnicodeEncoding would recognise the already encoded part and not encode it again, but that appears not to be the case.
I'm a little lost now as to how I can convert a mixed-encoding string into a normal string.
Background: When saving an Excel spreadsheet as CSV, anything outside the ASCII range gets converted to ?. So my idea is to get my client to search/replace a few characters, such as the Euro symbol, with an encoded string such as 0x000020AC. I was then hoping to convert those encoded parts back into the real symbols before I insert into a SQL database.
I've tried a function such as
Public Function Decode(ByVal s As String) As String
Dim uni As New UnicodeEncoding()
Dim encodedBytes As Byte() = uni.GetBytes(s)
Dim output As String = ""
output = uni.GetString(encodedBytes)
Return output
End Function
Which was based on the examples on MSDN at http://msdn.microsoft.com/en-us/library/system.text.unicodeencoding.aspx
It could be that I have a complete misunderstanding of how this works in VB.net. In C# I can simply use escaped characters such as "\u20AC", but no such thing exists in VB.net.

Based on advice from Heinzi, I implemented a Regex.Replace method using the following code. This appears to work for my examples.
Public Function Decode(ByVal s As String) As String
Dim output As String = ""
Dim sRegex As String = "0x[0-9a-fA-F]{8}" ' eight hex digits only; a-z would also match non-hex letters such as G
Dim r As Regex = New Regex(sRegex)
Dim myEvaluator As MatchEvaluator = New MatchEvaluator(AddressOf HexToString)
output = r.Replace(s, myEvaluator)
Return output
End Function
Public Function HexToString(ByVal hexString As Match) As String
Dim uni As New UnicodeEncoding(True, True)
Dim input As String = hexString.ToString
input = input.Substring(2)
input = input.TrimStart("0"c)
Dim output As String
Dim length As Integer = input.Length
Dim upperBound As Integer = length \ 2
If length Mod 2 = 0 Then
upperBound -= 1
Else
input = "0" & input
End If
Dim bytes(upperBound) As Byte
For i As Integer = 0 To upperBound
bytes(i) = Convert.ToByte(input.Substring(i * 2, 2), 16)
Next
output = uni.GetString(bytes)
Return output
End Function
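For the 0x-prefixed code points in the question, a shorter route than building a byte array may be Char.ConvertFromUtf32, which is available from Framework 2.0. The sketch below is an alternative under stated assumptions, not the code from the thread: the names CodePointSketch, DecodeCodePoints and ReplaceOne are made up for illustration, and anything that is not a valid Unicode scalar value is left untouched. It also covers values below 0x100 and above 0xFFFF, which the byte-trimming approach above does not appear to handle.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module CodePointSketch
    ' "0x" followed by exactly eight hex digits, e.g. 0x000020AC.
    Private ReadOnly EscapePattern As New Regex("0x([0-9A-Fa-f]{8})")

    ' Replaces every 0xXXXXXXXX escape with the character it names.
    Public Function DecodeCodePoints(ByVal s As String) As String
        Return EscapePattern.Replace(s, New MatchEvaluator(AddressOf ReplaceOne))
    End Function

    Private Function ReplaceOne(ByVal m As Match) As String
        Dim codePoint As Integer
        If Not Integer.TryParse(m.Groups(1).Value, NumberStyles.AllowHexSpecifier, _
                                CultureInfo.InvariantCulture, codePoint) Then
            Return m.Value
        End If
        ' Leave the text alone if it is outside the Unicode range or a surrogate.
        If codePoint < 0 OrElse codePoint > &H10FFFF Then Return m.Value
        If codePoint >= &HD800 AndAlso codePoint <= &HDFFF Then Return m.Value
        Return Char.ConvertFromUtf32(codePoint)
    End Function

    Sub Main()
        Console.WriteLine(DecodeCodePoints("This is a 0x000020AC symbol"))
    End Sub
End Module
```

On the point about C# escapes: for a literal in source code, ChrW(&H20AC) gives the same character as "\u20AC" does in C#.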

Have you tried:
Public Function Decode(ByVal Coded As String) As String
Return StrConv(Coded, vbUnicode)
End Function
Also, your function is invalid. It takes s as an argument, does a load of stuff and then outputs the s that was put into it instead of the stuff that was processed within it.

Related

Decode mail encoded-words =?utf-8?B?xxxx?=, =?utf-8?Q?xxxx?=

Is there a way to decode email subjects that are encoded? I know the dirty way of doing it is to extract the text between =?utf-8?B? and ?= and decode that. But I have a program where I can get encoded strings like
=?utf-8?Bxxxx?= =?UTF-8?B?xxxx?= ...
Right now I'm doing something like this
If codedString.ToUpper().StartsWith("=?UTF-8?B?") Then
Dim temp As String = codedString.SubString(10)
Dim data = Convert.FromBase64String(temp)
Dim decodedString = ASCIIEncoding.ASCII.GetString(data)
'do something with decodedString
End If
But this doesn't work when the same string has multiple =?utf-8?B? encoded words like the one above. I can also get strings with =?utf-8?Q and =?windows-1252 encodings. Is there a way to tackle all of these encodings? I'm using Visual Studio 2017.
I've never had trouble using this function to decode an email field value.
It finds matching utf-8 strings for types B or Q and, if the type is B, runs FromBase64String.
I'm sure you can adapt it for windows-1252.
Private Function DecodeEmailField(byVal strString as String) as String
DecodeEmailField = strString.toString()
Dim strMatch
Dim arrEncodeTypes = New String() {"B","Q"}
Dim strEncodeType as String
For Each strEncodeType in arrEncodeTypes
Dim objRegexB as RegEx = new RegEx("(?:\=\?utf\-8\?" & strEncodeType & "\?)(?:.+?)(?:\?=\s)", _
RegexOptions.Multiline or RegexOptions.IgnoreCase)
if (objRegexB.IsMatch(DecodeEmailField)) then
Dim thisMatch as Match = objRegexB.Match(DecodeEmailField)
For Each strMatch in thisMatch.Groups
Dim strMatchHold as String = strMatch.toString().Substring(("=?utf-8?" & strEncodeType & "?").length)
strMatchHold = strMatchHold.SubString(0,(strMatchHold.Length)-("?= ".Length))
If strEncodeType = "B" Then
Dim data() As Byte = System.Convert.FromBase64String(strMatchHold)
strMatchHold = System.Text.UTF8Encoding.UTF8.GetString(data)
End If
DecodeEmailField = Replace(DecodeEmailField,strMatch.toString(),strMatchHold)
Next
End If
Next
End Function
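A more general take on the same idea is to read the charset from each encoded word instead of hard-coding utf-8, and to decode Q payloads as well as B ones. The following is a minimal sketch under stated assumptions, not a full RFC 2047 implementation: EncodedWordSketch, DecodeEncodedWords, DecodeWord and DecodeQ are hypothetical names, charset labels with a language suffix are not handled, and any word that fails to decode is returned unchanged. On .NET Core and later, windows-1252 is only available after registering the code pages encoding provider; on .NET Framework Encoding.GetEncoding resolves it directly.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports System.Text.RegularExpressions

Module EncodedWordSketch
    ' Shape of one encoded word: =?charset?B-or-Q?payload?=
    Private ReadOnly WordPattern As New Regex("=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=")

    Public Function DecodeEncodedWords(ByVal input As String) As String
        ' Whitespace that only separates two encoded words is not part of the text.
        Dim joined As String = Regex.Replace(input, "\?=\s+=\?", "?==?")
        Return WordPattern.Replace(joined, New MatchEvaluator(AddressOf DecodeWord))
    End Function

    Private Function DecodeWord(ByVal m As Match) As String
        Try
            Dim charset As Encoding = Encoding.GetEncoding(m.Groups(1).Value)
            Dim payload As String = m.Groups(3).Value
            Dim bytes As Byte()
            If m.Groups(2).Value.ToUpperInvariant() = "B" Then
                bytes = Convert.FromBase64String(payload)
            Else
                bytes = DecodeQ(payload)
            End If
            Return charset.GetString(bytes)
        Catch ex As Exception When TypeOf ex Is ArgumentException _
                OrElse TypeOf ex Is FormatException OrElse TypeOf ex Is OverflowException
            Return m.Value ' unknown charset or malformed payload: keep the raw text
        End Try
    End Function

    ' Q encoding: "_" is a space, "=XX" is a hex byte, anything else is itself.
    Private Function DecodeQ(ByVal payload As String) As Byte()
        Dim bytes As New List(Of Byte)()
        Dim i As Integer = 0
        While i < payload.Length
            Dim c As Char = payload(i)
            If c = "="c AndAlso i + 2 < payload.Length Then
                bytes.Add(Convert.ToByte(payload.Substring(i + 1, 2), 16))
                i += 3
            ElseIf c = "_"c Then
                bytes.Add(CByte(32))
                i += 1
            Else
                bytes.Add(CByte(AscW(c)))
                i += 1
            End If
        End While
        Return bytes.ToArray()
    End Function

    Sub Main()
        Console.WriteLine(DecodeEncodedWords("=?utf-8?B?SGVsbG8=?= =?UTF-8?Q?_World?="))
    End Sub
End Module
```

Because each word carries its own charset and type, a subject that mixes B and Q words, or several charsets, goes through the same code path.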

Convert string of byte array back to original string in vb.net

I have a plain text string that I'm converting to a byte array and then to a string and storing in a database.
Here is how I'm doing it:
Dim b As Byte() = System.Text.Encoding.UTF8.GetBytes("Hello")
Dim s As String = BitConverter.ToString(b).Replace("-", "")
Afterwards I store the value of s (which is "48656C6C6F") into a database.
Later on, I want to retrieve this value from the database and convert it back to "Hello". How would I do that?
You can call the following function with your hex string and get "Hello" returned to you. Note that the function doesn't validate the input; you would need to add validation unless you can be sure the input is valid.
Private Function HexToString(ByVal hex As String) As String
Dim result As String = ""
For i As integer = 0 To hex.Length - 1 Step 2
Dim num As Integer = Convert.ToInt32(hex.Substring(i, 2), 16)
result &= Chr(num)
Next
Return result
End Function
James Thorpe points out in his comment that it would be more appropriate to use Encoding.UTF8.GetString to convert back to a string as that is the reverse of the method used to create the hex string in the first place. I agree, but as my original answer was already accepted, I hesitate to change it, so I am adding an alternative version. The note about validation of input being skipped still applies.
Private Function HexToString(ByVal hex As String) As String
Dim bytes(hex.Length \ 2 - 1) As Byte
For i As Integer = 0 To hex.Length - 1 Step 2
bytes(i \ 2) = Byte.Parse(hex.Substring(i, 2), System.Globalization.NumberStyles.HexNumber)
Next
Return System.Text.Encoding.UTF8.GetString(bytes)
End Function
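Since both versions skip validation, here is one way the check might be added. It is a sketch only: HexRoundTripSketch, ToHex and TryHexToString are hypothetical names, and the Try-pattern signature is an illustrative choice rather than anything from the answers. It rejects odd-length input and non-hex characters instead of throwing.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text

Module HexRoundTripSketch
    ' Same encoding step as in the question: UTF-8 bytes rendered as hex pairs.
    Public Function ToHex(ByVal text As String) As String
        Return BitConverter.ToString(Encoding.UTF8.GetBytes(text)).Replace("-", "")
    End Function

    ' Returns False (and leaves result as Nothing) when hex is not valid hex text.
    Public Function TryHexToString(ByVal hex As String, ByRef result As String) As Boolean
        result = Nothing
        If hex Is Nothing OrElse hex.Length Mod 2 <> 0 Then Return False
        Dim bytes(hex.Length \ 2 - 1) As Byte
        For i As Integer = 0 To hex.Length - 2 Step 2
            ' AllowHexSpecifier accepts hex digits only: no sign, no whitespace.
            If Not Byte.TryParse(hex.Substring(i, 2), NumberStyles.AllowHexSpecifier, _
                                 CultureInfo.InvariantCulture, bytes(i \ 2)) Then
                Return False
            End If
        Next
        result = Encoding.UTF8.GetString(bytes)
        Return True
    End Function

    Sub Main()
        Dim text As String = Nothing
        If TryHexToString(ToHex("Hello"), text) Then Console.WriteLine(text) ' Hello
    End Sub
End Module
```

Decoding through Encoding.UTF8 also keeps non-ASCII text intact, which the Chr-based version would not, because it treats every byte as a separate character.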

How to convert a string expression to vb code?

I have a string output from the user interface, as below:
strFormula ="gridControlName.Rows(i).cells("C1").value *
gridControlName.Rows(i).cells("C2").value"
If I write the code directly, like
dblRes=gridControlName.Rows(i).cells("C1").value *
gridControlName.Rows(i).cells("C2").value
it gives a result, but since it's a string I cannot get the result.
How can I remove the double quotes from the above string and get the values entered in the grid cells to be multiplied?
I don't think there's an 'easy' way to do this, since VB.Net doesn't have an "eval()" like some other languages. However, it does support run-time compilation. Here are a couple articles which may help you:
Using .NET Languages to make your Application Scriptable (VB.Net example)
Runtime Compilation (A .NET eval statement) (C# example)
Both are intended to be a bit more robust than just executing single lines of code, allowing users to input entire textboxes of their own code for example, but should give you some direction. Both include sample projects.
Hi guys, thanks for your updates. I wrote my own function using your concepts and some other code snippets. I am posting the result:
Function generate(ByVal alg As String, ByVal intRow As Integer) As String
Dim algSplit As String() = alg.Split(" "c)
For index As Int32 = 0 To algSplit.Length - 1
'algSplit(index) = algSplit(index).Replace("#"c, "Number")
If algSplit(index).Contains("[") Then
Dim i As Integer = algSplit(index).IndexOf("[")
Dim f As String = algSplit(index).Substring(i + 1, algSplit(index).IndexOf("]", i + 1) - i - 1)
Dim grdCell As Infragistics.Win.UltraWinGrid.UltraGridCell = dgExcelEstimate.Rows(intRow).Cells(f)
Dim dblVal As Double = grdCell.Value
algSplit(index) = dblVal
End If
Next
Dim result As String = String.Join("", algSplit)
'Dim dblRes As Double = Convert.ToDouble(result)
Return result
End Function
Thanks again, everyone. Hoping for the same in future.
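For plain arithmetic such as the string returned by generate, one commonly used shortcut is DataTable.Compute, which evaluates an expression string. A minimal sketch, assuming the formula has already been reduced to numbers and operators; FormulaSketch and Evaluate are hypothetical names, and this is not a general eval for arbitrary VB code.

```vbnet
Imports System
Imports System.Data
Imports System.Globalization

Module FormulaSketch
    ' Evaluates a simple arithmetic expression such as "2.5 * 4".
    Public Function Evaluate(ByVal expression As String) As Double
        Using table As New DataTable()
            table.Locale = CultureInfo.InvariantCulture
            Dim value As Object = table.Compute(expression, Nothing)
            Return Convert.ToDouble(value, CultureInfo.InvariantCulture)
        End Using
    End Function

    Sub Main()
        Console.WriteLine(Evaluate("2.5 * 4"))
    End Sub
End Module
```

When cell values are substituted into the formula, formatting them with CultureInfo.InvariantCulture keeps the decimal separator a period, which is what the expression parser expects.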

Convert string from Base2 to Base4

I wish to convert text to base 4 (AGCT) by first converting it to binary (I've done this bit) and then breaking it into 2-bit pairs.
Can someone help me turn this into code using VB.NET syntax?
If the length of the binary String is an odd number, add a zero to the front (leftmost position) of the String.
Create an empty String to add translated digits to.
While the original String of binary is not empty {
Translate the first two digits only of the binary String into a base-4 digit, and add this digit to the end (rightmost) index of the new String.
After this, remove the same two digits from the binary String and repeat if it is not empty.
}
In this context:
Dim Base2Convert As String = ""
For Each C As Char In Result.Text
Dim s As String = System.Convert.ToString(AscW(C), 2).PadLeft(8, "0")
Base2Convert &= s
Next
Result.Text = Base2Convert
Dim Base4Convert As String = ""
For Each C As Char In Result.Text
'//<ADD THE STATEMENT ABOVE AS CODE HERE>//
Base4Convert &= s
Next
Result.Text = Base4Convert
.NET does not support conversion to a non-standard base such as 4, so this will not work:
Dim base4number As String = Convert.ToString(base10number, 4)
From MSDN:
[...] base of the return value [...] must be 2, 8, 10, or 16.
But you can write your own conversion function, or take the existing one off the web:
Public Function IntToStringFast(value As Integer, baseChars As Char()) As String
Dim i As Integer = 32
Dim buffer(i - 1) As Char
Dim targetBase As Integer = baseChars.Length
Do
buffer(System.Threading.Interlocked.Decrement(i)) =
baseChars(value Mod targetBase)
value = value \ targetBase
Loop While value > 0
Dim result As Char() = New Char(32 - i - 1) {}
Array.Copy(buffer, i, result, 0, 32 - i)
Return New String(result)
End Function
Used this answer. Converted with developer fusion from C# + minor adjustments. Example:
Dim base2number As String = "11110" 'Decimal 30
Dim base10number As Integer = Convert.ToInt32(base2number, 2)
Dim base4number As String = IntToStringFast(base10number, "0123")
Console.WriteLine(base4number) 'outputs 132
Notice that you don't need base 2 there as an intermediate value; you can convert directly from base 10. If in doubt whether it worked correctly, here is a useful resource:
Number base converter
Converting the number to base 2 first and then to base 4 doesn’t make a lot of sense, since directly converting to base 4 is the same algorithm anyway. In fact, representation of a number in any base requires the same general algorithm:
Public Shared Function Representation(number As Integer, digits As String) As String
Dim result = ""
Dim b = digits.Length
Do
result = digits(number Mod b) & result
number \= b
Loop While number > 0
Return result
End Function
Now you can verify that Representation(i, decimalDigits) does the same as i.ToString():
Dim decimalDigits = "0123456789"
For i = 0 To 30 Step 3
Console.WriteLine("{0}, {1}", i.ToString(), Representation(i, decimalDigits))
Next
It’s worth noting that i.ToString() converts i to decimal base because this is the base we, humans, are mostly using. But there is nothing special about decimal, and in fact internally, i is not a decimal number: its representation in computer memory is binary.
For conversion to any other base, just pass a different set of digits to the method. In your case, that’d be "ACGT":
Console.WriteLine(Representation(i, "ACGT"))
Hexadecimal also works:
Console.WriteLine(Representation(i, "0123456789ABCDEF"))
And, just to repeat it because it’s such a nice mathematical property: so does any other base with at least two distinct digits.
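If the binary string already exists, as in the question, the pair-by-pair procedure described there can also be written directly, which avoids the Integer range limit of the numeric approaches for long inputs. This is a sketch: Base4Sketch and BinaryToBase4 are hypothetical names, and which letter stands for which digit ("ACGT" here) is an assumption passed in as a parameter.

```vbnet
Imports System
Imports System.Text

Module Base4Sketch
    ' Translates a string of 0s and 1s into base 4, two bits per output digit.
    ' digits supplies the four symbols for the values 0 to 3, e.g. "0123" or "ACGT".
    Public Function BinaryToBase4(ByVal bits As String, ByVal digits As String) As String
        ' Pad on the left so the bits split evenly into pairs.
        If bits.Length Mod 2 = 1 Then bits = "0" & bits
        Dim sb As New StringBuilder(bits.Length \ 2)
        For i As Integer = 0 To bits.Length - 2 Step 2
            Dim value As Integer = Convert.ToInt32(bits.Substring(i, 2), 2)
            sb.Append(digits(value))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Console.WriteLine(BinaryToBase4("11110", "0123")) ' 132
        Console.WriteLine(BinaryToBase4("01001000", "ACGT")) ' CAGA
    End Sub
End Module
```

Walking the string with an index, rather than repeatedly removing the first two characters, gives the same result without rebuilding the string on every step.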

build/check hash value for file

I'm having a hard time with this one. Can someone either point me in the right direction for checking/building hash codes for an uploaded file or else tell me what I'm doing wrong with the code below?
getFileSHA256(softwareUpload.PostedFile) 'Line that calls the function includes a reference to an uploaded file
Private Function getFileSHA256(ByVal theFile As Web.HttpPostedFile) As String
Dim SHA256CSP As New SHA256Managed()
Dim byteHash() As Byte = SHA256CSP.ComputeHash(theFile.InputStream)
Return ByteArrayToString(byteHash)
End Function
Private Function ByteArrayToString(ByVal arrInput() As Byte) As String
Dim sb As New System.Text.StringBuilder(arrInput.Length * 2)
For i As Integer = 0 To arrInput.Length - 1
sb.Append(arrInput(i).ToString("X2"))
Next
Return sb.ToString().ToLower
End Function
I should add that the function works, but the return does not match other programs' sha256 values.
EDIT ------
There are two other functions that I'm using in my code. SHA1 gets the same kind of results as the SHA256; the results do not match trusted sources.
However, the MD5 works as expected.
Private Function getFileSHA1(ByVal theFile As Web.HttpPostedFile) As String
Dim SHA1CSP As New SHA1CryptoServiceProvider()
Dim byteHash() As Byte = SHA1CSP.ComputeHash(theFile.InputStream)
Return ByteArrayToString(byteHash)
End Function
Private Function getFileMd5(ByVal theFile As Web.HttpPostedFile) As String
Dim Md5CSP As New System.Security.Cryptography.MD5CryptoServiceProvider
Dim byteHash() As Byte = Md5CSP.ComputeHash(theFile.InputStream)
Return ByteArrayToString(byteHash)
End Function
I plan to consolidate these functions once I know they are working as expected.
The only difference between these is that MD5 is using "MD5CryptoServiceProvider" and it works as expected. SHA1 is also using "SHA1CryptoServiceProvider" but it does not match trusted sources.
I did some testing here, and it appears that for text files SHA256Managed works perfectly.
My code is below, I used your implementation of ByteArrayToString:
Sub Main()
Dim s As New SHA256Managed
Dim fileBytes() As Byte = IO.File.ReadAllBytes("s:\sha256.txt")
Dim hash() As Byte = s.ComputeHash(fileBytes)
Dim referenceHash As String = "18ffd9682c5535a2b2798ca51b13e9490df326f185a83fe6e059f8ff47d92105"
Dim calculatedHash As String = ByteArrayToString(hash)
MsgBox(calculatedHash = referenceHash) 'outputs True
End Sub
Private Function ByteArrayToString(ByVal arrInput() As Byte) As String
Dim sb As New System.Text.StringBuilder(arrInput.Length * 2)
For i As Integer = 0 To arrInput.Length - 1
sb.Append(arrInput(i).ToString("X2"))
Next
Return sb.ToString().ToLower
End Function
For testing purposes, I created a file called sha256.txt under S: with the following contents:
my test file
(no trailing spaces or newline)
I got the reference hash value from here, by feeding it the same data.
Also check this and this: the fact that you get a mismatch could be related to the platform and/or implementation of your trusted source, or to needing an extra conversion step.
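One thing that may be worth ruling out, although the thread does not confirm it as the cause: ComputeHash reads from the stream's current position, so if the same InputStream has already been read (for example by an earlier hash call), a later call hashes only what is left. The sketch below rewinds a seekable stream before hashing; HashSketch and HashStream are hypothetical names, and a MemoryStream stands in for the uploaded file.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography
Imports System.Text

Module HashSketch
    ' Hashes the whole stream with the given algorithm and returns lowercase hex.
    Public Function HashStream(ByVal input As Stream, ByVal algorithm As HashAlgorithm) As String
        ' Start from the beginning, in case something has already read the stream.
        If input.CanSeek Then input.Position = 0
        Dim hash As Byte() = algorithm.ComputeHash(input)
        Dim sb As New StringBuilder(hash.Length * 2)
        For Each b As Byte In hash
            sb.Append(b.ToString("x2"))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Using data As New MemoryStream(Encoding.ASCII.GetBytes("abc"))
            Console.WriteLine(HashStream(data, MD5.Create()))
            Console.WriteLine(HashStream(data, SHA256.Create()))
        End Using
    End Sub
End Module
```

Passing the algorithm in as a HashAlgorithm would also let the three near-identical functions in the question share one implementation.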