VB.NET Speed up cycle function

I have this function to write bytes to a bin file.
Public Shared Function writeFS(path As String, count As Integer) As Integer
    Using writer As New BinaryWriter(File.Open(path, FileMode.Open, FileAccess.Write, FileShare.Write), Encoding.ASCII)
        Dim x As Integer
        Do Until x = count
            writer.BaseStream.Seek(0, SeekOrigin.End)
            writer.Write(CByte(&HFF))
            x += 1
        Loop
    End Using
    Return -1
End Function
I have a TextBox that supplies the count value; count is the number of bytes to write into the file.
The problem is that when I want to write 1 MB or more, it takes 10+ seconds because of the loop.
I need a better/faster way to write the hex value FF at the end of the file count times.
I'm sorry if I've not explained it very well.

This should be better:
Public Shared Function writeFS(path As String, count As Integer) As Integer
    Using writer As New BinaryWriter(File.Open(path, FileMode.Open, FileAccess.Write, FileShare.Write), Encoding.ASCII)
        Dim x As Integer
        Dim b As Byte
        b = CByte(&HFF)
        writer.BaseStream.Seek(0, SeekOrigin.End)
        Do Until x = count
            writer.Write(b)
            x += 1
        Loop
    End Using
    Return -1
End Function
That way you're not calling CByte every time. And there is no need to move to the end of the stream after each write.

Some questions first:
Why is the function Shared? Why do you use FileShare.Write? FileShare.Write means that OTHER processes can write to the file while YOU write to it. And why do you write single bytes that are all the same? And why does the function return -1 every time? Might it be a better fit as a Sub? And why not use a simple For loop instead of a While?
Public Sub writeFS(path As String, count As Integer)
    Using Stream As New FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read)
        Stream.Write(Enumerable.Repeat(Of Byte)(255, count).ToArray, 0, count)
    End Using
End Sub
OK, if you need to write 100 MB this doesn't fit, but then you can partition your writes.
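A minimal sketch of what partitioned writes could look like, keeping the FileMode.Append approach above; the 64 KB chunk size and the writeFSChunked name are arbitrary choices, not from the original post:
Public Sub writeFSChunked(path As String, count As Integer)
    Const ChunkSize As Integer = 65536 'arbitrary block size
    Dim chunk(ChunkSize - 1) As Byte
    For i As Integer = 0 To ChunkSize - 1
        chunk(i) = &HFF 'fill the reusable buffer with FF once
    Next
    Using stream As New FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read)
        Dim remaining As Integer = count
        While remaining > 0
            Dim toWrite As Integer = Math.Min(remaining, ChunkSize)
            stream.Write(chunk, 0, toWrite)
            remaining -= toWrite
        End While
    End Using
End Sub
Memory use stays at one buffer regardless of how large count gets, and each Write call pushes thousands of bytes instead of one.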

Random() doesn't seem to be so random at all

Random() doesn't seem to be so random at all; it keeps repeating the same pattern all the time.
How can I make this "more" random?
Dim ioFile As New System.IO.StreamReader("C:\names.txt")
Dim lines As New List(Of String)
Dim rnd As New Random()
Dim line As Integer

While ioFile.Peek <> -1
    lines.Add(ioFile.ReadLine())
End While

line = rnd.Next(lines.Count + 0)
NAMES.AppendText(lines(line).Trim())

ioFile.Close()
ioFile.Dispose()

Clipboard.SetText(NAMES.Text)
This works fine for me. I changed a few things: implemented a Using block, removed the redundant addition of 0, and added a loop that writes 100 test draws to the debug output. A sample of 200 that you are just "eyeballing" is not enough to say that a random sequence is "not working".
Using ioFile As New System.IO.StreamReader("C:\names.txt")
    Dim lines As New List(Of String)
    Dim rnd As New Random()
    Dim line As Integer

    While ioFile.Peek <> -1
        lines.Add(ioFile.ReadLine())
    End While

    For i As Integer = 1 To 100
        line = rnd.Next(lines.Count)
        Debug.WriteLine(lines(line).Trim())
    Next
End Using
You don't need a StreamReader to read a text file. File.ReadAllLines returns an array of the lines in the file, and calling .ToList on the result gets you the desired List(Of String).
We loop through the length of the list in a For loop, subtracting one from the count because indexes start at zero.
To get the random index we call .Next on our instance of the Random class, which was declared outside the method (a form-level variable). The .Next method is inclusive of the first argument and exclusive of the second. I used a variable to store the original value of lines.Count because that value changes inside the loop, and using lines.Count - 1 directly in the To portion of the For would throw the loop off.
Once we get the random index we add that line to the TextBox and remove it from the list.
Private Sub ShuffleNames()
    Dim index As Integer
    Dim lines = File.ReadAllLines("C:\Users\xxx\Desktop\names.txt").ToList
    Dim loopLimit = lines.Count - 1

    For i = 0 To loopLimit
        index = rnd.Next(0, lines.Count)
        TextBox1.AppendText(lines(index).Trim & Environment.NewLine)
        lines.RemoveAt(index)
    Next
End Sub

How to enlarge a List(Of String) (bytes)

I have a variable Queue in which I write information from a stream. The variable is initialized as follows:
Public Shared Queue As List(Of String) = New List(Of String)(1024)
The code to read the stream is
Public Shared Sub ReadStreamForever(ByVal stream As Stream)
    Dim encoder = New UTF8Encoding()
    Dim buffer = New Byte(2047) {}
    Dim counter As Integer = 0

    While True
        If stream.CanRead Then
            Dim len As Integer = stream.Read(buffer, 0, 2048)
            counter = counter + 1
            If len > 0 Then
                Dim text = encoder.GetString(buffer, 0, len)
                SSEApplication.Push(text)
            Else
                Exit While
            End If
        Else
            Exit While
        End If
    End While
End Sub
Where the Push method just does a little string manipulation and adds the lines one after another to the Queue variable:
Public Shared Sub Push(ByVal text As String)
    If String.IsNullOrWhiteSpace(text) Then
        Return
    End If
    Dim lines = text.Trim().Split(vbLf)
    SSEApplication.Queue.AddRange(lines)
End Sub
I have several big datasets I want to stream, but the Queue length after filling it up is always 2691, so it looks like it is somehow limited in length. I just do not know where I am limiting the Queue variable or how to enlarge it. Could anyone help me here?
In general, a List doesn't have a fixed length; the 1024 you pass to the constructor is only the initial capacity, and the Add and AddRange methods resize the list to make space for more elements.
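A quick sketch you can run to confirm this (5000 is just an arbitrary test value):
Dim q As New List(Of String)(1024)
For i As Integer = 1 To 5000
    q.Add("line " & i)
Next
Console.WriteLine(q.Count)    'prints 5000 - the list grew past 1024
Console.WriteLine(q.Capacity) 'prints at least 5000 - capacity was raised automatically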
If you want a fixed length, you could use a simple array: Dim Queue(1023) As String (in VB the number is the upper index, so 1023 gives 1024 elements).
But then you will get an exception when trying to add more elements, so you can check the condition in the Push method:
If SSEApplication.Queue.Count + lines.Count <= 1024 Then
    SSEApplication.Queue.AddRange(lines)
End If
That check will also prevent having more than 1024 elements when using a List, but if you want a collection of fixed length, I would recommend a simple array.
A useful resource is Arrays in Visual Basic, where you can also read how to enlarge an array with the ReDim keyword when you need room for extra elements.
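For completeness, a minimal sketch of ReDim Preserve (the sizes are arbitrary examples):
Dim queue(1023) As String     '1024 slots, indexes 0 to 1023
'...fill the array...
ReDim Preserve queue(2047)    'now 2048 slots; the existing items are kept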

Read several lines from file at once

The following code works just fine:
Function ReadLines(FN As String, n As Integer) As String()
    Dim T(0 To n) As String
    With My.Computer.FileSystem.OpenTextFileReader(FN)
        For i As Integer = 0 To n
            T(i) = .ReadLine()
        Next
    End With
    Return T
End Function
However, if the file is located on a distant server, this can prove horrendously slow. Is there a way to do the same thing more efficiently? I could read the whole file at once, but that is also fairly inefficient...
The BufferedStream class is specifically designed to reduce the number of system I/O operations when a file is read (or written) sequentially, so this should make your reads more efficient:
Function ReadLines(FN As String, n As Integer) As String()
    Using fs As FileStream = File.Open(FN, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
        Using bs As New BufferedStream(fs)
            Using sr As New StreamReader(bs)
                Dim lines = New List(Of String)(n)
                For i As Integer = 0 To n
                    Dim line As String = sr.ReadLine()
                    If line Is Nothing Then
                        Exit For
                    End If
                    lines.Add(line)
                Next
                Return lines.ToArray()
            End Using
        End Using
    End Using
End Function
Dim lines = IO.File.ReadLines(fileName).
            Skip(firstLineIndex).
            Take(lineCount).
            ToArray()
Unlike the File.ReadAllLines method, File.ReadLines doesn't read the whole file in a single operation. It exposes the contents of the file as an IEnumerable(Of String) but doesn't read anything from the file until you enumerate it. That code will have to read every line up to the last one you want, but it won't read anything beyond that.
To be honest, I'm not sure how much quicker it will be, because it may well use StreamReader.ReadLine under the hood. It's worth testing, though. The alternative would be to read larger chunks of the file and break them into lines yourself, stopping when you've read at least as much as you need.
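A rough sketch of that chunked alternative, assuming FN and n mean the same as in the question; the 64 KB buffer and the ReadLinesChunked name are my own choices, and this is untested against a remote share:
Function ReadLinesChunked(FN As String, n As Integer) As String()
    Dim sb As New Text.StringBuilder()
    Dim newlines As Integer = 0
    Using sr As New IO.StreamReader(FN)
        Dim buffer(65535) As Char
        'Keep pulling big blocks until we have seen enough line breaks.
        While newlines <= n
            Dim read As Integer = sr.ReadBlock(buffer, 0, buffer.Length)
            If read = 0 Then Exit While
            sb.Append(buffer, 0, read)
            For i As Integer = 0 To read - 1
                If buffer(i) = ControlChars.Lf Then newlines += 1
            Next
        End While
    End Using
    'Split once at the end and keep at most n + 1 lines, mirroring the
    '0 To n loop in the question.
    Dim allLines = sb.ToString().Split({vbCrLf, vbLf}, StringSplitOptions.None)
    Return allLines.Take(n + 1).ToArray()
End Function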

zlib.net: 2 files with the same length resulting in 2 different final lengths after compressing

Is it normal for 2 files with the same length to have different lengths after compressing their bytes using zlib.net in VB.NET?
This is the compression module I use with the zlib.net reference. The 2 files are almost the same; there are just fewer than 100 bytes differing between them.
Imports System.IO
Imports zlib

Module zlib_compression

    Public Sub CopyStream(ByRef input As System.IO.Stream, ByRef output As System.IO.Stream)
        Dim num1 As Integer
        Dim buffer1 As Byte() = New Byte(2000 - 1) {}
        num1 = input.Read(buffer1, 0, 2000)
        Do While num1 > 0
            output.Write(buffer1, 0, num1)
            num1 = input.Read(buffer1, 0, 2000)
        Loop
        output.Flush()
    End Sub

    Public Function Compress(ByVal InputBytes As Byte()) As Byte()
        Using output As New MemoryStream
            Dim outZStream As Stream = New ZOutputStream(output, zlib.zlibConst.Z_BEST_SPEED)
            Using input As Stream = New MemoryStream(InputBytes)
                CopyStream(input, outZStream)
                outZStream.Close() 'do not remove
                Return output.ToArray()
            End Using
        End Using
    End Function

    Public Function Decompress(ByVal InputBytes As Byte()) As Byte()
        Using output As New MemoryStream
            Using outZStream As Stream = New ZOutputStream(output)
                Using input As Stream = New MemoryStream(InputBytes)
                    CopyStream(input, outZStream)
                    Return output.ToArray()
                End Using
            End Using
        End Using
    End Function

End Module
Of course, yes. In fact it is necessarily true. It is not possible to losslessly compress all files of a given length to a smaller size, since there are not enough bit patterns at the smaller sizes to identify all of the original files. If some files are compressed, then others must be expanded.
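To make the counting argument concrete with small numbers: there are 2^8 = 256 distinct 8-bit inputs, but only 2^0 + 2^1 + ... + 2^7 = 255 bit strings shorter than 8 bits (counting the empty string). 256 inputs cannot map one-to-one onto 255 shorter outputs, so any lossless compressor must leave at least one 8-bit input at the same size or larger. The same pigeonhole count holds at every length, which is why two same-length inputs routinely compress to different sizes depending on their content.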

LINQ SubmitChanges runs out of memory

I have a database with about 180,000 records. I'm trying to attach a PDF file to each of those records. Each PDF is about 250 KB in size. However, after about a minute my program starts taking up about a GB of memory and I have to stop it. I tried removing the reference to each LINQ object once it's updated, but that doesn't seem to help. How can I clear the reference?
Thanks for your help
Private Sub uploadPDFs(ByVal args() As String)
    Dim indexFiles = (From indexFile In dataContext.IndexFiles
                      Where indexFile.PDFContent = Nothing
                      Order By indexFile.PDFFolder).ToList
    Dim currentDirectory As IO.DirectoryInfo
    Dim currentFile As IO.FileInfo
    Dim tempIndexFile As IndexFile

    While indexFiles.Count > 0
        tempIndexFile = indexFiles(0)
        indexFiles = indexFiles.Skip(1).ToList
        currentDirectory = 'I set the directory that I need
        currentFile = 'I get the file that I need
        writePDF(currentDirectory, currentFile, tempIndexFile)
    End While
End Sub
Private Sub writePDF(ByVal directory As IO.DirectoryInfo, ByVal file As IO.FileInfo, ByVal indexFile As IndexFile)
    Dim bytes() As Byte
    bytes = getFileStream(file)
    indexFile.PDFContent = bytes
    dataContext.SubmitChanges()
    counter += 1
    If counter Mod 10 = 0 Then Console.WriteLine(" saved file " & file.Name & " at " & directory.Name)
End Sub
Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    Dim fileStream = fileInfo.OpenRead()
    Dim bytesLength As Long = fileStream.Length
    Dim bytes(bytesLength) As Byte
    fileStream.Read(bytes, 0, bytesLength)
    fileStream.Close()
    Return bytes
End Function
I suggest you perform this in batches, using Take (before the call to ToList) to process a particular number of items at a time. Read (say) 10, set the PDFContent on all of them, call SubmitChanges, and then start again. (I'm not sure offhand whether you should start with a new DataContext at that point, but it might be cleanest to do so.)
As an aside, your code to read the contents of a file is broken in at least a couple of ways (the byte array is declared one element too long, and the return value of Read is ignored) - but it would be simpler just to use File.ReadAllBytes in the first place.
Also, your way of handling the gradually shrinking list is really inefficient - after fetching 180,000 records, you're building a new list with 179,999 records, then another with 179,998 records, and so on.
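For reference, a one-line replacement along those lines, keeping the same signature as the getFileStream above:
Private Function getFileStream(ByVal fileInfo As IO.FileInfo) As Byte()
    'File.ReadAllBytes sizes the array correctly and loops internally
    'until the whole file is read, avoiding both bugs noted above.
    Return IO.File.ReadAllBytes(fileInfo.FullName)
End Function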
Does the DataContext have ObjectTrackingEnabled set to True (the default value)? If so, it will try to keep a record of essentially all the data it touches, preventing the garbage collector from collecting any of it.
If so, you should be able to fix the situation by periodically disposing the DataContext and creating a new one, or turning object tracking off.
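A minimal sketch of the tracking-off option, assuming YourDataContext is your generated LINQ to SQL context and reusing the IndexFiles names from the question; note that an untracked context is read-only, so SubmitChanges still needs a separate, tracked context, and the ID key column here is a hypothetical name:
Using db As New YourDataContext()
    db.ObjectTrackingEnabled = False 'must be set before the first query
    'Untracked results are not pinned by the context, so they can be
    'garbage collected once you drop your own references.
    Dim ids = db.IndexFiles.
                 Where(Function(f) f.PDFContent Is Nothing).
                 Select(Function(f) f.ID).ToList()
End Using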
OK. To use the smallest amount of memory we have to update the DataContext in blocks. I've put sample code below; it might have syntax errors since I'm typing it in Notepad.
Dim DB As YourDataContext = New YourDataContext
Dim BlockSize As Integer = 25
'ToList is needed here so the items can be indexed by position below.
Dim AllItems = DB.Items.Where(Function(i) i.PDFfile.HasValue = False).ToList()
Dim count = 0
Dim tmpDB As YourDataContext = New YourDataContext

While count < AllItems.Count
    Dim _item = tmpDB.Items.Single(Function(i) i.recordID = AllItems(count).recordID)
    _item.PDF = GetPDF()
    count += 1
    If count Mod BlockSize = 0 OrElse count = AllItems.Count Then
        tmpDB.SubmitChanges()
        tmpDB = New YourDataContext
        GC.Collect()
    End If
End While
To further optimise the speed you can select just the recordIDs from AllItems into an array (as an anonymous type), and set Delay Loaded on for that PDF field.