Advice on background processing in VB.NET

In a project I've recently taken over, a function that performs some calculations is called several times in a row (usually between 1 and 10 times).
While dr.Read() ' depending on a db call, loop 1 or more times
    Dim calc As New CalcClass
    Dim newDoStuff As New System.Threading.Thread(New System.Threading.ParameterizedThreadStart(AddressOf DoStuff))
    newDoStuff.Start(calc)
End While
Private Sub DoStuff(ByVal calc As Object)
    ' do something that takes between 5-10 seconds
End Sub
To speed this up, I am trying to add asynchronous processing (see the example above). This works in my code, and all tasks run at the same time. What I don't understand is how to then wait for all of these threads to finish before running a final task that must only run after they have all completed. There is no fixed number of threads; it can be between 1 and 10 depending on some other data.
Can anyone suggest a way to do this? I'm looking for an easy way to basically say "OK, all tasks are finished at this point, call another task".
Cliffs
Several tasks need to run at the same time (between 1 and 10)
Each task takes several seconds
Code currently works - it does them all at the same time
Once all tasks (between 1 and 10) are finished, fire off some other code (only when all tasks are finished); this is the part I'm stuck on

Put all your threads in a List:
Dim threads As New List(Of System.Threading.Thread)
While dr.Read() ' depending on a db call, loop 1 or more times
    Dim calc As New CalcClass
    Dim newDoStuff As New System.Threading.Thread(New System.Threading.ParameterizedThreadStart(AddressOf DoStuff))
    threads.Add(newDoStuff)
    newDoStuff.Start(calc)
End While
Finally, join all your threads:
For Each t As System.Threading.Thread In threads
    t.Join()
Next

The easiest way is to put all of your new threads into a list, then iterate over that list and call .Join() on each one. The Join method blocks the current thread until the thread you are joining completes.
Apologies if there are any syntax errors in the following code; I don't have VB handy and my memory of the syntax is pretty rusty:
Dim threadList As New List(Of System.Threading.Thread)
While dr.Read() ' depending on a db call, loop 1 or more times
    Dim calc As New CalcClass
    Dim newDoStuff As New System.Threading.Thread(New System.Threading.ParameterizedThreadStart(AddressOf DoStuff))
    newDoStuff.Start(calc)
    threadList.Add(newDoStuff)
End While
For Each t As System.Threading.Thread In threadList
    t.Join()
Next
With that said, I'd strongly encourage you to look into using the classes in the System.Threading.Tasks namespace, as that provides a much better paradigm than starting and managing your own threads.
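For what it's worth, here is a minimal sketch of what the Tasks version might look like, assuming .NET 4.5 or later for Task.Run. DoStuff here is a hypothetical stand-in that just doubles a number after a short sleep; the real one would take the CalcClass instance. Task.WaitAll blocks until every task in the list has finished, which is the "all done, now run the final step" point.

```vbnet
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module WaitAllSketch
    ' Hypothetical stand-in for the question's DoStuff: returns its input doubled.
    Public Function DoStuff(ByVal input As Integer) As Integer
        System.Threading.Thread.Sleep(100) ' simulate slow work
        Return input * 2
    End Function

    ' Starts one task per input, blocks until all are done, then sums the results.
    Public Function RunAll(ByVal inputs As IEnumerable(Of Integer)) As Integer
        Dim tasks As New List(Of Task(Of Integer))
        For Each value As Integer In inputs
            Dim captured As Integer = value ' copy so each lambda sees its own value
            tasks.Add(Task.Run(Function() DoStuff(captured)))
        Next

        ' Blocks the calling thread until every task has completed.
        Task.WaitAll(tasks.ToArray())

        ' The "final task": safe to run because all workers are finished.
        Dim total As Integer = 0
        For Each t As Task(Of Integer) In tasks
            total += t.Result
        Next
        Return total
    End Function
End Module
```

Calling WaitAllSketch.RunAll(New Integer() {1, 2, 3}) runs the three items in parallel and only returns once all of them are done.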

Related

I need help creating a TaskScheduler to prevent threading overload

I want to add workers into a queue, but only have the first N workers processing in parallel. All samples I find are in C#.
This is probably simple for a programmer, but I'm not one. I know enough about VB to write simple programs.
But my first application runs fine until it suddenly hits 100% CPU and then crashes. Help, please (Yes, I've wasted 5 hours of work time searching before posting this...)
More Context: Performing a recursive inventory of directory structures, files, and permissions across file servers with over 1 million directories/subdirectories.
The process runs serially, but will take months to complete. Management is already breathing down my neck. When I try using Tasks, it goes to about 1000 threads, then hits 100% CPU, stops responding, then crashes. This is on a 16-core server with 112 GB RAM.
--Added
So, with the sample provided on using Semaphores, this is what I've put in:
Public Class InvDir
    Private mSm As Semaphore
    Public Sub New(ByVal maxPrc As Integer)
        mSm = New Semaphore(maxPrc, maxPrc)
    End Sub
    ' Parameters are ByVal because a lambda cannot capture ByRef parameters;
    ' the objects themselves are still shared with the caller.
    Public Sub GetInventory(ByVal Path As String, ByVal Totals As Object, ByVal MyData As Object)
        mSm.WaitOne()
        Task.Factory.StartNew(Sub()
                                  Dim CurDir As New IO.DirectoryInfo(Path)
                                  Totals.SubDirectoryCount += CurDir.GetDirectories().Length
                                  Totals.FilesCount += CurDir.GetFiles().Length
                                  For Each CurFile As IO.FileInfo In CurDir.EnumerateFiles()
                                      ' FileInfo exposes Name, not FileName
                                      MyData.AddFile(CurFile.Name, CurFile.Extension, CurFile.FullName, CurFile.Length)
                                  Next
                              End Sub).ContinueWith(Function(x) mSm.Release())
    End Sub
End Class
You're attempting multithreading with disk I/O. It might be getting slower because you're throwing more threads at it. No matter how many threads there are, the disk can physically only seek one position at a time. (In fact, you mentioned that it works serially.)
If you did want to limit the number of concurrent threads you could use a Semaphore. A semaphore is like a SyncLock except you can specify how many threads are allowed to execute the code at a time. In the example below, the semaphore allows three threads to execute. Any more than that have to wait until one finishes. Some modified code from the MSDN page:
Public Class Example
    ' A semaphore that simulates a limited resource pool.
    Private Shared _pool As Semaphore
    <MTAThread> _
    Public Shared Sub Main()
        ' Create a semaphore that can satisfy up to three
        ' concurrent requests. Use an initial count of three
        ' so that all three slots are available immediately.
        ' (An initial count of zero would block every worker
        ' until the main thread called _pool.Release(3).)
        _pool = New Semaphore(3, 3)
    End Sub
    Private Sub SomeWorkerMethod()
        ' This is the method that would be called using a Task.
        _pool.WaitOne()
        Try
            ' Do whatever
        Finally
            ' Always give the slot back, even if the work throws.
            _pool.Release()
        End Try
    End Sub
End Class
Every new thread must call _pool.WaitOne(). That tells it to wait its turn until there are fewer than three threads executing. Every thread blocks until the semaphore allows it to pass.
Every thread must also call _pool.Release() to let the semaphore know that it can allow the next waiting thread to begin. That's important, even if there's an exception. If threads don't call Release() then the semaphore will just block them forever.
If it's really going to take five months, what about cloning the drive and running the check on multiple instances of the same drive, each looking at different sections?
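To tie the semaphore idea to Tasks, here is a rough sketch under a few assumptions: it uses SemaphoreSlim rather than Semaphore (a lighter in-process variant), Task.Run from .NET 4.5, and a Thread.Sleep as a placeholder for the real directory work. The loop that queues the work waits for a free slot before starting each task, so no more than maxParallel items are ever in flight. The function returns the peak concurrency it observed so you can see that the limit holds.

```vbnet
Imports System.Collections.Generic
Imports System.Threading
Imports System.Threading.Tasks

Module ThrottleSketch
    ' Runs itemCount dummy work items with at most maxParallel active at once.
    ' Returns the highest number of items observed running simultaneously.
    Public Function RunThrottled(ByVal itemCount As Integer, ByVal maxParallel As Integer) As Integer
        Dim gate As New SemaphoreSlim(maxParallel, maxParallel)
        Dim tasks As New List(Of Task)
        Dim running As Integer = 0
        Dim peak As Integer = 0
        Dim peakLock As New Object()

        For i As Integer = 1 To itemCount
            gate.Wait() ' blocks the queuing loop until a slot is free
            tasks.Add(Task.Run(Sub()
                                   Dim current As Integer = Interlocked.Increment(running)
                                   Try
                                       SyncLock peakLock
                                           If current > peak Then peak = current
                                       End SyncLock
                                       Thread.Sleep(20) ' placeholder for the real work
                                   Finally
                                       Interlocked.Decrement(running)
                                       gate.Release() ' always free the slot
                                   End Try
                               End Sub))
        Next

        Task.WaitAll(tasks.ToArray())
        Return peak
    End Function
End Module
```

With a 16-core box you might start with something like RunThrottled(1000, 16) and tune from there; for disk-bound work the best number may well be lower than the core count.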

Multithreading and error handling in Vb.net

This question has 2 parts. I am new to multithreading, so I first want to check whether my logic is correct, and then I want to find out how to handle errors in multithreading.
Q1: I have an application that calls a SQL database, obtaining information from 2 datatables; this info is then combined in a final display. Without multithreading, I call each SQL select to populate a dataset one after the other. With multithreading, I start the more complex SQL call first on a separate thread and then run the less complex SQL call on the main thread. I am trying to cut down the total load time by running them concurrently.
(I realise that strictly I should run both as background tasks to free up the UI; for me it's small steps first.)
Anyway, the code looks a little like this:
Dim ThreadLoad_Longer_Data As Thread
ThreadLoad_Longer_Data = New Thread(AddressOf Me.Fill_LongerSQL)
ThreadLoad_Longer_Data.IsBackground = True
ThreadLoad_Longer_Data.Start()
' Execute some code here for the second SQL call in the main thread
' Then stop the main process to wait for the background thread to finish
ThreadLoad_Longer_Data.Join()
I'm assuming that the .Join statement will in fact stop the main thread and wait for the other one to finish. Is this correct?
If so, it brings me to the second part.
Q2: What happens if the first thread doesn't finish, for example because of an error? How do I handle this situation?
Thank you
Yes, calling ThreadLoad_Longer_Data.Join will stop the execution of the calling thread (the one that executes the code calling the Join) till the ThreadLoad_Longer_Data ends its execution.
If, inside ThreadLoad_Longer_Data, an exception goes unhandled, the thread does not simply end: in .NET 2.0 and later, an unhandled exception on any thread terminates the whole process, and a Try/Catch around Join on the calling thread will not catch it. Catch the exception inside the thread's own procedure instead; the thread then ends normally and the calling thread resumes after Join.
Sub Main()
    Console.WriteLine("Start of the main thread")
    Dim ThreadLoad_Longer_Data As Thread
    ThreadLoad_Longer_Data = New Thread(AddressOf Fill_LongerSQL)
    ThreadLoad_Longer_Data.IsBackground = True
    ThreadLoad_Longer_Data.Start()
    ' Blocks until Fill_LongerSQL returns, whether it succeeded or failed
    ThreadLoad_Longer_Data.Join()
    Console.WriteLine("End of the main thread")
End Sub
Sub Fill_LongerSQL()
    Try
        Console.WriteLine("Before the exception")
        Dim y As Integer
        For x = 0 To 1000000000
            y = y + 1
        Next
        Throw New Exception("This exception is raised on the worker thread")
        ' This line is never executed
        Console.WriteLine("After the exception")
    Catch ex As Exception
        ' Handle (or record) the error here so the thread ends normally
        Console.WriteLine(ex.Message)
    End Try
End Sub
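If you are able to use Tasks instead of raw threads, error handling becomes a bit easier, because an exception thrown inside a task is stored and rethrown (wrapped in an AggregateException) when the caller waits on it. A minimal sketch, assuming .NET 4.5 for Task.Run; FillLongerSql here is a hypothetical stand-in that always fails, just to show the path the exception takes:

```vbnet
Imports System
Imports System.Threading.Tasks

Module TaskErrorSketch
    ' Hypothetical stand-in for the slow query; always fails.
    Private Sub FillLongerSql()
        Throw New InvalidOperationException("query failed")
    End Sub

    ' Runs the worker as a task and returns its error message,
    ' or Nothing if it completed without an exception.
    Public Function RunAndReport() As String
        Dim worker As Task = Task.Run(Sub() FillLongerSql())

        ' ... the second, shorter query would run here on the calling thread ...

        Try
            worker.Wait() ' like Join, but rethrows the task's exception
            Return Nothing
        Catch ex As AggregateException
            ' The original exception is available as InnerException.
            Return ex.InnerException.Message
        End Try
    End Function
End Module
```

Here the Try/Catch around Wait on the calling thread does see the failure, which is not the case with Thread.Join.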

Code takes much longer to execute on a separate thread in .NET

In my VB.NET program is a time consuming function that gets data and updates the UI at a periodic interval. I moved this function to another thread, but it now takes much longer to execute. Using the stopwatch class, I calculated that when it is part of the main thread, it takes 130 ms, but in the separate thread it takes 542 ms, so that's more than 4 times slower.
My CPU is a Core i5 M520 (2 cores), so I don't know why it is taking so much longer.
I am using the System.Threading.Thread class. I also tried to set the new thread's priority higher, but this had no effect.
Why is the separate thread taking so much longer and is there a way I can speed it up?
Thanks
The code:
Public Sub update(ByVal temp As Visual)
    SyncLock mUpdateQueue
        If Not mUpdateQueue.Contains(temp) Then
            mUpdateQueue.Enqueue(temp)
        End If
    End SyncLock
    If Not mainThread.IsAlive Then ' TODO: do this better
        mainThread = New Thread(AddressOf DataFetchThread)
        mainThread.Start()
    End If
End Sub
Private Sub DataFetchThread()
    Dim s As New Stopwatch()
    s.Start()
    Dim temp As Visual = Nothing
    While mUpdateQueue.Count > 0
        SyncLock mUpdateQueue
            temp = mUpdateQueue.Peek()
        End SyncLock
        mDataCollector.updateV(temp)
        SyncLock mUpdateQueue
            mUpdateQueue.Dequeue()
        End SyncLock
    End While
    s.Stop()
    Debug.WriteLine("thread run time: " & s.ElapsedMilliseconds)
End Sub
mDataCollector.updateV(temp): This function gets data from a database and plots the points on a PictureBox to create a graph. It wouldn't make a lot of sense to add all of the code here.
To ask this question in another way: Is it normal that the second thread takes much longer to execute or is there something wrong with my code?
You are accessing the mUpdateQueue variable from multiple threads and using locks to guard access to it. This is fine, but using locks has an overhead (the cost of acquiring the lock, plus the time other threads spend waiting to acquire it). This is probably why your new thread is taking longer: it is waiting on the locking.
You could try using the ReaderWriterLockSlim class which may provide faster access to your variables. Just remember that it implements IDisposable so you need to call Dispose on it when you're done with it.
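As a rough illustration of the ReaderWriterLockSlim pattern, here is a sketch of a small queue wrapper. GuardedQueue is a hypothetical class, not something from the question's code. Enqueue and TryDequeue take the write lock, while Count only needs the read lock, so several threads can check the count at once. Since most operations on a queue are writes, the gain over SyncLock may be small; treat this as something to measure rather than a guaranteed fix.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical wrapper showing ReaderWriterLockSlim around a Queue(Of T).
Public Class GuardedQueue(Of T)
    Implements IDisposable

    Private ReadOnly _items As New Queue(Of T)
    Private ReadOnly _lock As New ReaderWriterLockSlim()

    ' Adds the item unless it is already queued (write lock).
    Public Sub Enqueue(ByVal item As T)
        _lock.EnterWriteLock()
        Try
            If Not _items.Contains(item) Then _items.Enqueue(item)
        Finally
            _lock.ExitWriteLock()
        End Try
    End Sub

    ' Removes the next item if there is one (write lock).
    Public Function TryDequeue(ByRef item As T) As Boolean
        _lock.EnterWriteLock()
        Try
            If _items.Count = 0 Then Return False
            item = _items.Dequeue()
            Return True
        Finally
            _lock.ExitWriteLock()
        End Try
    End Function

    ' Reading the count only needs the shared read lock.
    Public ReadOnly Property Count As Integer
        Get
            _lock.EnterReadLock()
            Try
                Return _items.Count
            Finally
                _lock.ExitReadLock()
            End Try
        End Get
    End Property

    Public Sub Dispose() Implements IDisposable.Dispose
        _lock.Dispose()
    End Sub
End Class
```

Checking for emptiness and dequeuing in a single TryDequeue call also avoids reading Count outside the lock, as the While condition in the question's DataFetchThread does.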

Dividing work into multiple threads

I've read a lot of different questions on SO about multithreaded applications and how to split work up between threads, but none really seem to fit what I need here. Here's roughly how my program currently works:
Module Module1
    'string X declared out here
    Sub Main()
        'Start given number of threads of Main2()
    End Sub
    Sub Main2()
        'Loops forever
        'Call X = nextvalue(X), display info as needed
    End Sub
    Function nextvalue(ByVal Y As String) As String
        'Determines the next Y in the sequence
    End Function
End Module
This is only a rough outline of what actually happens in my code by the way.
My problem is that if multiple threads start running Main2(), they all work with the same X value. The loop inside Main2 executes multiple times per millisecond, so I can't just stagger the loops, and work is often duplicated.
How can I properly divide up the work so that the two threads running simultaneously never have the same work to run?
You should synchronize the generation and storage of X so that the composite operation appears atomic to all threads.
Module Module1
    Private X As String
    Private LockObj As Object = New Object()
    Private Sub Main2()
        Do While True
            ' This will be used to store a snapshot of X that can be used safely by the current thread.
            Dim copy As String
            ' Generate and store the next value atomically.
            SyncLock LockObj
                X = nextValue(X)
                copy = X
            End SyncLock
            ' Now you can perform operations against the local copy.
            ' Do not access X outside of the lock above.
            Console.WriteLine(copy)
        Loop
    End Sub
End Module
A thread manager is required to manage the threads and the work that they do. Say it is desirable to split up the work into 10 threads.
Start the manager
Manager creates 10 threads
Assign work to the manager (queue up the work, let's say it queues up 10000 work items)
Manager assigns a work item to complete for each of the 10 threads.
As threads finish their work, they report back to the manager that they are done and receive another work item. The queue of work should be thread safe so that items can be enqueued and dequeued. The manager handles the management of work items. The threads just execute the work.
Once this is in place, work items should never be duplicated amongst threads.
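One way the manager idea above could be sketched is with a BlockingCollection as the thread-safe work queue (available from .NET 4.0). This is only an illustration: the work items are plain integers and "processing" is just counting, both of which are placeholders. Each item is handed to exactly one worker, so the number of processed items should equal the number queued.

```vbnet
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading
Imports System.Threading.Tasks

Module WorkQueueSketch
    ' Queues itemCount work items, processes them on workerCount workers,
    ' and returns how many times an item was processed in total.
    Public Function ProcessAll(ByVal itemCount As Integer, ByVal workerCount As Integer) As Integer
        Dim work As New BlockingCollection(Of Integer)()
        Dim processed As Integer = 0

        ' "Manager creates the threads": each worker pulls items until the queue is closed.
        Dim workers As New List(Of Task)
        For w As Integer = 1 To workerCount
            workers.Add(Task.Factory.StartNew(Sub()
                                                  For Each item As Integer In work.GetConsumingEnumerable()
                                                      ' Placeholder for the real work on this item.
                                                      Interlocked.Increment(processed)
                                                  Next
                                              End Sub, TaskCreationOptions.LongRunning))
        Next

        ' "Assign work to the manager": queue up the items.
        For i As Integer = 1 To itemCount
            work.Add(i)
        Next
        work.CompleteAdding() ' tells the workers no more items are coming

        Task.WaitAll(workers.ToArray())
        Return processed
    End Function
End Module
```

For example, WorkQueueSketch.ProcessAll(10000, 10) mirrors the 10 threads and 10000 work items described above.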
Use a lock so that only one thread can access X at a time. Once one thread is done with it, another thread is able to use it. This will prevent two threads from calling nextvalue(x) with the same value.

Threading.Timer application is consuming more than 50% of CPU, why?

I have written the console application below in VB.NET.
My intention is to write an application that triggers every minute and performs some task. But when I run this application it consumes 50% of the CPU.
How can I make it to consume less CPU?
Am I calling the timer in the right place (In the main method)?
Later I would like to make a windows service with this same task and install on the server.
How can I make the application consume less CPU?
Module Module1
    Dim inputPath As String = "C:\Input"
    Dim outputPath As String = "C:\Output"
    Dim folder As Directory
    Sub Main()
        Dim tmr As Timer = New Timer(New TimerCallback(AddressOf Upload), Nothing, 1000, 60000)
        While Not tmr Is Nothing
        End While
    End Sub
    Public Sub Upload(ByVal o As Object)
        Dim sr As StreamReader
        Dim conStr1 As String = "Data Source=TNS Name;User ID=xx; Password=xx;"
        'Look up for pending requests in RQST_TBL
        Dim cnn1 As New OracleConnection(conStr1)
        Dim datReader As OracleDataReader
        Dim cmd1 As New OracleCommand
        cnn1.Open()
        .....
        .....
    End Sub
End Module
Thank you..
I'm guessing you have two CPUs? The infinite while loop is consuming 100% of one CPU; leaving you with 50% total CPU consumption.
From the look of your code, the loop is completely unneeded. The timer will call the Upload() method by itself each time the interval elapses.
Remove the while loop...
While Not tmr Is Nothing
End While
And use something like Console.ReadLine() to keep the application from closing.
Alternatively, stick a Thread.Sleep() call inside the while loop if you really like the loop.
While Not tmr Is Nothing
End While
You were already warned about this in a previous question. Delete that code.
While Not tmr Is Nothing
End While
This is just an infinite loop. You're not actually allowing anything to get done.
As this is a console application, you probably only need a loop that sleeps for a minute, then performs your task.
As Rob said, the 50% load probably means you're using 100% of one of your CPU's cores.
Instead of the infinite loop, you can use Console.ReadLine() to keep the console application running.
While the console is waiting for input, your timer will still work as you intend it.
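A small sketch of that shape, with a few assumptions: the callback just counts ticks instead of doing the real upload, and RunUntil is a hypothetical helper that takes the "wait for exit" step as a parameter so the same code can block on Console.ReadLine() in a console app or on something else elsewhere. The key point is that the main thread blocks rather than spins, while the timer fires on a thread-pool thread.

```vbnet
Imports System
Imports System.Threading

Module TimerSketch
    Private _ticks As Integer = 0

    ' Placeholder for the real Upload work; here it only counts invocations.
    Private Sub Upload(ByVal state As Object)
        Interlocked.Increment(_ticks)
    End Sub

    ' Starts a timer that fires every periodMs milliseconds, then blocks in
    ' waitForExit (no busy loop). Returns how many times Upload ran.
    Public Function RunUntil(ByVal periodMs As Integer, ByVal waitForExit As Action) As Integer
        Interlocked.Exchange(_ticks, 0)
        Using tmr As New System.Threading.Timer(AddressOf Upload, Nothing, 0, periodMs)
            waitForExit.Invoke() ' the calling thread sleeps here; CPU stays idle
        End Using
        Return Interlocked.CompareExchange(_ticks, 0, 0) ' thread-safe read
    End Function
End Module
```

In the console application, Sub Main could then be a single call such as TimerSketch.RunUntil(60000, Sub() Console.ReadLine()), and the process would sit idle between ticks until Enter is pressed.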