How to read the HDF5 byte array resulting from a string dataset - vb.net

I use the HDF5DotNet library with VB.NET. I need to read a string dataset (3000 items, each 16 characters long).
I read all the values into a single byte array, but it is not easy to parse: I need one string per line, not fragments of strings. Do you know a better way to store and parse the result?
Here is my code:
'Load the file
Dim HDF5TestFileID As HDF5DotNet.H5FileId
HDF5TestFileID = H5F.open("C:\test.hdf5", H5F.OpenMode.ACC_RDONLY)
'Get dataset and group id
Dim GroupRootId As HDF5DotNet.H5GroupId = H5G.open(HDF5TestFileID, "/")
Dim dataSetRN As H5DataSetId = H5D.open(GroupRootId, "MyItemsNames")
'Build byte array from the dataset (VB.NET array bounds are inclusive, so the upper bound is length - 1)
Dim readDataBackRN(16 * 3000 - 1) As Byte
Dim h5DataBackRN As New H5Array(Of Byte)(readDataBackRN)
Dim typeIdRN As H5DataTypeId = H5D.GetType(dataSetRN)
H5D.read(dataSetRN, typeIdRN, h5DataBackRN)
'Try to parse the result, but the data is not easy to use
Dim content as string = System.Text.Encoding.UTF8.GetString(readDataBackRN).Replace(" ", "<br>")

You normally read the buffer, check whether the string is complete and, if it is not, keep reading and appending the new contents to the previous fragment until you find the end of the line.
Can you do this?

In this case the string is complete, but it is possible that one day I will reach the size limit. I cannot modify the H5D.read method. I wonder whether a simple byte array is the best way?
Would a better way be to store the result in an array of byte arrays (one byte array per string) instead of storing all the items in a single byte array?
But I don't know if that is possible.
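Since every item has the same length, one possible approach is to keep the single flat buffer and slice it into one string per item afterwards. The following is only a sketch: the helper name SplitFixedWidth is hypothetical, and it assumes each item occupies exactly 16 bytes and is padded with NUL or space characters, which depends on how the dataset was written.

```vbnet
Imports System.Text

Module FixedWidthStrings
    ' Splits a flat buffer of fixed-width records into one string per record.
    ' Assumption: each item occupies exactly itemLength bytes and is padded
    ' at the end with NUL or space characters.
    Function SplitFixedWidth(buffer As Byte(), itemLength As Integer) As String()
        Dim count As Integer = buffer.Length \ itemLength
        Dim items(count - 1) As String
        For i As Integer = 0 To count - 1
            ' Decode only the bytes that belong to item i
            Dim raw As String = Encoding.UTF8.GetString(buffer, i * itemLength, itemLength)
            ' Drop the trailing padding
            items(i) = raw.TrimEnd(ChrW(0), " "c)
        Next
        Return items
    End Function
End Module
```

With the buffer from the question, the call would be SplitFixedWidth(readDataBackRN, 16), and item n is then available directly as element n of the returned array.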

Related

How can I read a specific byte in a byte array in VB.NET?

I have a binary file being loaded into a Byte array using Dim settingsBinary As Byte() = My.Computer.FileSystem.ReadAllBytes(settingsFile) where settingsFile is the path to my binary file.
Let's say that my binary file has got three bytes and I want to read the first byte as a Boolean (00 = False, 01 = True). How can I get those bytes? I have tried to use this question's answer to read those three bytes, but I can't wrap my head around it.
Just to clarify, I need to get the three bytes separately: Get the first byte, then set CheckBox1.Checked to the first byte, and so on with the other bytes.
A byte array works just like any other array: You can use an indexer to access a specific element.
Dim firstByte As Byte = settingsBinary(0)
Dim secondByte As Byte = settingsBinary(1)
Dim thirdByte As Byte = settingsBinary(2)
And then you can convert the byte into a boolean:
Dim firstBoolean As Boolean
Select Case firstByte
    Case 0
        firstBoolean = False
    Case 1
        firstBoolean = True
    Case Else
        Throw New Exception("Invalid first byte in settings file: " & firstByte)
End Select
(Generalizing the conversion logic into a method that can be used for all three bytes is left as an exercise to the reader.)
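For reference, that generalization might look roughly like this. The function name ByteToBoolean and its position parameter are made up for illustration; the 0/1 mapping and the exception on other values follow the Select Case shown earlier.

```vbnet
Imports System

Module SettingsBytes
    ' Converts one settings byte to a Boolean: 0 is False, 1 is True.
    ' Any other value is treated as an invalid settings file.
    ' "position" is only used to make the error message more helpful.
    Function ByteToBoolean(value As Byte, position As Integer) As Boolean
        Select Case value
            Case 0
                Return False
            Case 1
                Return True
            Case Else
                Throw New Exception("Invalid byte " & value.ToString() &
                                    " at index " & position.ToString() & " in settings file")
        End Select
    End Function
End Module
```

It could then be used as CheckBox1.Checked = ByteToBoolean(settingsBinary(0), 0), and likewise for indexes 1 and 2.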

How to get specific items in a Byte() array in VB

I am working in VB.NET. I have two arrays that I need to compare, and I want to output the remainder of the longer array. I have simplified my question to the following code:
Dim array1 As Byte() = {&H01, &H22, &H10, &HBC, &HA2, &H01, &H00, &HA6, &H02, &HBB, &H33, &H11, &HB2, &H01}
Dim array As Byte() = {&H01, &H22, &H10, &HBC, &HA2, &H01, &H00, &HA6, &H02, &HBB, &H33, &H11, &HB2, &H02, &H77, &H44, &HBF}
Dim remainder As Byte()
If array1.Length < array.Length Then
    Dim remainderLength = array.Length - array1.Length
    For...
        ''' array1.Length = 14
        ''' array.Length = 17
        ''' write the code to assign the last 3 hex values to the new array remainder and output the remainder
    Next
The remainder length might change to other sizes: 100, 200 or more. I tried the Item() function and the IndexOf() function, and didn't get the result. Please help me write the remaining code. The final result I want is remainder = {&H77, &H44, &HBF}.
Thank you very much for any comments.
The simplest approach would be to use LINQ's Skip extension method.
For Each b As Byte In array.Skip(array1.Length)
    Console.Write(b)
Next
Or:
Dim remainder As Byte() = array.Skip(array1.Length).ToArray()
You may also want to consider the Array.Copy method, which lets you pass the starting index as an argument. The destination array must be allocated before the call:
ReDim remainder(array.Length - array1.Length - 1)
Array.Copy(array, array1.Length, remainder, 0, array.Length - array1.Length)
Or, if all else fails, it's always possible to access each item in the array by index:
For i As Integer = array1.Length To array.Length - 1
    Console.Write(array(i))
Next

VB Hex to DocX Object

I have a selection of docx files stored as blob data in hexadecimal, and I need to retrieve them so I can access the text within.
So far, I have converted the hex to string format with the following:
Dim blob = BLOB DATA
Dim con As String = String.Empty
For x = 2 To st.Length - 2 Step 2
con &= ChrW(CInt("&H" & st.Substring(x, 2)))
Next
However, if I then save the output from this as a .docx the file will not open because it is 'corrupt'. I presume that is why when I load this string into a memorystream and then try and use Novacode.DocX.Load(memoryStream) it gives me a similar corruption error.
I have tried converting to a byte array in two ways, and each gives me a different result. First:
System.Text.Encoding.Default.GetBytes(hex)
I have also tried:
Public Function HexToByteArray(hex As String) As Byte()
    Dim upperBound As Integer = hex.Length \ 2
    If hex.Length Mod 2 = 0 Then
        upperBound -= 1
    Else
        hex = "0" & hex
    End If
    Dim bytes(upperBound) As Byte
    For i As Integer = 2 To upperBound
        bytes(i) = Convert.ToByte(hex.Substring(i * 2, 2), 16)
    Next
    Return bytes
End Function
I then tried converting them both to a memory stream and using them to create a DocX object like so:
Dim doc As DocX = DocX.Load(New MemoryStream(bytes))
docx is not a text format, it's a binary format. Thus, converting it to a string is just plain wrong. Your end result needs to be a byte array.
Knowing that, your problem can be split into two simpler problems:
1. Split your hex string into strings of two characters each. See this SO question for details (or keep your existing loop, which is perfectly fine): How to split a string by x amount of characters
2. Convert those "small" strings, which contain the hexadecimal representation of a byte, into bytes. See this SO question for details: How do I convert a Hexidecimal string to a Byte Array?
Combining those two solutions is left as an exercise to the reader. We don't want to spoil all the fun or ruin the learning experience. ;-)
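For readers who want to check their own attempt, a combined version could look something like the sketch below. The name HexToBytes is hypothetical, and the handling of a leading "0x" is an assumption based on the question's first loop starting at index 2; the data in your database may not have that prefix.

```vbnet
Imports System

Module HexConversion
    ' Converts a hex string such as "0x504B0304" or "504B0304" into bytes.
    ' Assumption: an optional "0x" prefix may be present and must be skipped.
    Function HexToBytes(hexText As String) As Byte()
        If hexText.StartsWith("0x", StringComparison.OrdinalIgnoreCase) Then
            hexText = hexText.Substring(2)
        End If
        If hexText.Length Mod 2 <> 0 Then
            Throw New ArgumentException("Hex string must contain an even number of digits.")
        End If
        ' Two hex digits per byte; VB.NET upper bounds are inclusive
        Dim bytes(hexText.Length \ 2 - 1) As Byte
        For i As Integer = 0 To bytes.Length - 1
            bytes(i) = Convert.ToByte(hexText.Substring(i * 2, 2), 16)
        Next
        Return bytes
    End Function
End Module
```

The resulting array can be wrapped in a MemoryStream and passed to DocX.Load exactly as in the question, with no string conversion in between.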

How to replace bytes in VB.NET?

I have two strings:
Dim Original_Hex_Bytes as string = "616572646E61"
Dim Patched_Hex_Bytes as string = "616E64726561"
Then I have a binary file, and I need to search it for Original_Hex_Bytes and replace them with Patched_Hex_Bytes; I don't know the offset at which to start writing the new bytes :(
How can I do this?
If needed, I also know how to convert hex strings to bytes. I use this:
Private Function Hex_To_Bytes(ByVal strinput As String) As Byte()
    Dim i As Integer = 0
    Dim x As Integer = 0
    'Two hex digits per byte; the upper bound is inclusive, hence the - 1
    Dim bytes(strinput.Length \ 2 - 1) As Byte
    Do While (strinput.Length > i + 1)
        bytes(x) = Convert.ToByte(strinput.Substring(i, 2), 16)
        i += 2
        x += 1
    Loop
    Return bytes
End Function
You can use the BinaryReader and BinaryWriter classes to achieve this.
In this case, since you do not know the file structure, you need to read the entire file and scan it for the byte sequence. It may be easier to work with the ASCII strings "aerdna" and "andrea", which is what the two hex strings decode to.
When you do know the structure of a file, it is more appropriate to work with a data structure to manipulate its contents.
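As an illustration of the "read everything and scan" idea, here is a minimal sketch that works on the raw bytes instead of strings. IndexOfBytes and PatchBytes are hypothetical helper names, and the sketch assumes the original and patched sequences have the same length (as in the question), so the file size does not change.

```vbnet
Imports System

Module BytePatcher
    ' Returns the index of the first occurrence of pattern in data at or after start, or -1.
    Function IndexOfBytes(data As Byte(), pattern As Byte(), start As Integer) As Integer
        If pattern.Length = 0 Then Return -1
        For i As Integer = start To data.Length - pattern.Length
            Dim matched As Boolean = True
            For j As Integer = 0 To pattern.Length - 1
                If data(i + j) <> pattern(j) Then
                    matched = False
                    Exit For
                End If
            Next
            If matched Then Return i
        Next
        Return -1
    End Function

    ' Overwrites every occurrence of original with patched, in place.
    ' Returns the number of replacements made.
    Function PatchBytes(data As Byte(), original As Byte(), patched As Byte()) As Integer
        If original.Length <> patched.Length Then
            Throw New ArgumentException("Both byte sequences must have the same length.")
        End If
        Dim count As Integer = 0
        Dim pos As Integer = IndexOfBytes(data, original, 0)
        While pos >= 0
            Array.Copy(patched, 0, data, pos, patched.Length)
            count += 1
            pos = IndexOfBytes(data, original, pos + patched.Length)
        End While
        Return count
    End Function
End Module
```

A typical use would be to load the file with File.ReadAllBytes, call PatchBytes with the results of Hex_To_Bytes(Original_Hex_Bytes) and Hex_To_Bytes(Patched_Hex_Bytes), and write the array back with File.WriteAllBytes. This simple scan is quadratic in the worst case, which is usually acceptable for small files.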

Shortest way to get String from part of Bytes

I have read bytes from a file, and I need to get a String from a known location.
Dim b() As Byte = File.ReadAllBytes("MYFILE.BIN")
Dim myYear As String = Encoding.ASCII.GetString(b)
That puts the whole file into the myYear String, from which I can then extract the year.
Is there a handy, short way to take bytes 50 to 54 and convert only that part to a string?
Maybe something like the following. It's not shorter, but you don't need to read the complete file into memory.
Using stream = File.OpenRead("c:\MYFILE.BIN")
    stream.Seek(50, SeekOrigin.Begin)
    Dim b = New Byte(4) {}
    stream.Read(b, 0, 5)
    Dim str = Encoding.ASCII.GetString(b)
End Using
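If the whole file is already in memory, as in the question, another option is the GetString overload that takes a start index and a count, so only the wanted slice is decoded. The wrapper name ReadAsciiAt below is hypothetical and exists only to make the sketch self-contained.

```vbnet
Imports System.Text

Module SliceToString
    ' Decodes "count" bytes starting at "offset" as ASCII.
    ' No intermediate copy of the slice is needed.
    Function ReadAsciiAt(bytes As Byte(), offset As Integer, count As Integer) As String
        Return Encoding.ASCII.GetString(bytes, offset, count)
    End Function
End Module
```

For bytes 50 to 54 of the array b from the question, that is ReadAsciiAt(b, 50, 5), or simply Encoding.ASCII.GetString(b, 50, 5) inline.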