I have to develop a Windows service that copies files to different servers.
I have to do this using multithreading,
but I may only run 3 or 4 threads at a time.
Whenever one thread finishes, I have to start a new one so that the thread count stays at 3 or 4.
How can I enforce that limit?
Please provide some information on it.
Why not reuse the threads instead of spawning new ones?
Other than that, look at a pattern known as a producer/consumer queue. Your producer adds files (their path information); the consumers read that and take the appropriate action (perform the copy operation).
This might give you a starting point. The idea is to use a blocking queue which will block on the dequeue operation until an item is available. So your worker threads will spin around an infinite loop waiting for items to appear in the queue. Your main thread will enqueue the items into the queue. The following example uses the BlockingCollection class from the .NET 4.0 BCL. If that is not available to you then you can get an implementation of a blocking queue from Stephen Toub's blog.
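Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading

Module Example
    Private m_Queue As BlockingCollection(Of String) = New BlockingCollection(Of String)

    Sub Main()
        ' VB array bounds are inclusive: threads(3) holds four worker threads.
        Dim threads(3) As Thread
        For i As Integer = 0 To threads.Length - 1
            threads(i) = New Thread(AddressOf Consumer)
            threads(i).IsBackground = True
            threads(i).Start()
        Next
        ' The main thread is the producer: it enqueues the paths to copy.
        Dim files As IEnumerable(Of String) = GetFilesToCopy()
        For Each filePath As String In files
            m_Queue.Add(filePath)
        Next
        ' Background threads end with the process. A service keeps running;
        ' a console test needs to block here, for example with Console.ReadLine().
    End Sub

    Sub Consumer()
        Do While True
            ' Take() blocks until an item is available.
            Dim filePath As String = m_Queue.Take()
            ' Process the file here.
        Loop
    End Sub

    ' Placeholder so the example compiles; return the real paths to copy here.
    Function GetFilesToCopy() As IEnumerable(Of String)
        Return New String() {}
    End Function
End Module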
In .Net 4.0 this is very easy to do with tasks:
Dim a As Task = Task.Factory.StartNew(AddressOf doWork).ContinueWith(AddressOf doOtherWork) ' starts doWork, then runs doOtherWork when it finishes
See here for more examples (in C#).
I don't know VB, but every other language I know has an operation for this kind of thing: join().
int main(){
threadA.start();
threadA.join(); // main() waits here until threadA ends
threadB.start(); // what you want
}
Sorry that this is not VB. I wrote it because I expect a function with the same name to exist in VB.
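For reference, .NET does have this operation: Thread.Join(). Below is a minimal VB sketch of the same idea as the pseudocode above; the module name and CopyStep are made up for illustration.

```vb
Imports System
Imports System.Threading

Module JoinDemo
    ' Hypothetical stand-in for one unit of copy work.
    Sub CopyStep(ByVal name As Object)
        Console.WriteLine("Running {0}", name)
    End Sub

    Sub Main()
        Dim threadA As New Thread(AddressOf CopyStep)
        threadA.Start("A")
        threadA.Join() ' Main blocks here until threadA has finished.

        Dim threadB As New Thread(AddressOf CopyStep)
        threadB.Start("B")
        threadB.Join()
    End Sub
End Module
```

Note that joining immediately after each Start runs the threads one after another. To keep 3 or 4 running at once you would start that many first and only then wait for them.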
I want to add workers into a queue, but only have the first N workers processing in parallel. All samples I find are in C#.
This is probably simple for a programmer, but I'm not one. I know enough about VB to write simple programs.
But my first application runs fine until it suddenly hits 100% CPU and then crashes. Help, please (Yes, I've wasted 5 hours of work time searching before posting this...)
More Context: Performing a recursive inventory of directory structures, files, and permissions across file servers with over 1 million directories/subdirectories.
The process runs fine serially, but will take months to complete. Management is already breathing down my neck. When I try using Tasks, it goes to about 1000 threads, then hits 100% CPU, stops responding, then crashes. This is on a 16-core server with 112 GB RAM.
--Added
So, with the sample provided on using Semaphores, this is what I've put in:
Public Class InvDir
Private mSm as Semaphore
Public Sub New(ByVal maxPrc As Integer)
mSm = New Semaphore(maxPrc, maxPrc)
End Sub
Public Sub GetInventory(ByVal Path As String, ByRef Totals As Object, ByRef MyData As Object)
mSm.WaitOne()
Task.Factory.StartNew(Sub()
Dim CurDir As New IO.DirectoryInfo(Path)
Totals.SubDirectoryCount += CurDir.GetDirectories().Count
Totals.FilesCount += CurDir.GetFiles().Count
For Each CurFile As IO.FileInfo in CurDir.EnumerateFiles()
MyData.AddFile(CurFile.Name, CurFile.Extension, CurFile.FullName, CurFile.Length) ' FileInfo exposes Name, not FileName
Next
End Sub).ContinueWith(Function(x) mSm.Release())
End Sub
End Class
You're attempting multithreading with disk I/O. It might be getting slower because you're throwing more threads at it. No matter how many threads there are, the disk can physically only seek one position at a time. (In fact, you mentioned that it works serially.)
If you did want to limit the number of concurrent threads, you could use a Semaphore. A semaphore is like a SyncLock except that you can specify how many threads are allowed to execute the code at a time. In the example below, the semaphore allows three threads to execute. Any more than that have to wait until one finishes. Some modified code from the MSDN page:
Public Class Example
' A semaphore that simulates a limited resource pool.
'
Private Shared _pool As Semaphore
<MTAThread> _
Public Shared Sub Main()
' Create a semaphore that can satisfy up to three
' concurrent requests. Use an initial count of three
' so that all three slots are available to worker
' threads from the start.
'
_pool = New Semaphore(3, 3)
End Sub
Private Sub SomeWorkerMethod()
'This is the method that would be called using a Task.
_pool.WaitOne()
Try
'Do whatever
Finally
_pool.Release()
End Try
End Sub
End Class
Every new thread must call _pool.WaitOne(). That tells it to wait its turn until there are fewer than three threads executing. Every thread blocks until the semaphore allows it to pass.
Every thread must also call _pool.Release() to let the semaphore know that it can allow the next waiting thread to begin. That's important, even if there's an exception. If threads don't call Release() then the semaphore will just block them forever.
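To see the WaitOne()/Release() pairing in action, here is a small self-contained sketch. The module name, the counters and the 50 ms sleep are assumptions for the demo and are not from the MSDN sample. It starts ten threads and records how many were inside the guarded section at the same time.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading

Module SemaphoreDemo
    ' At most three workers may be inside the guarded section at once.
    Private ReadOnly Pool As New Semaphore(3, 3)
    Private ReadOnly StatsLock As New Object()
    Private running As Integer = 0
    Private maxSeen As Integer = 0

    Sub Worker()
        Pool.WaitOne()
        Try
            SyncLock StatsLock
                running += 1
                If running > maxSeen Then maxSeen = running
            End SyncLock
            Thread.Sleep(50) ' stand-in for the real work
            SyncLock StatsLock
                running -= 1
            End SyncLock
        Finally
            ' Release in Finally so a failing worker cannot leak a slot.
            Pool.Release()
        End Try
    End Sub

    Sub Main()
        Dim threads As New List(Of Thread)()
        For i As Integer = 1 To 10
            Dim t As New Thread(AddressOf Worker)
            t.Start()
            threads.Add(t)
        Next
        For Each t As Thread In threads
            t.Join()
        Next
        Console.WriteLine("Peak concurrent workers: {0}", maxSeen)
    End Sub
End Module
```

The reported peak should never exceed three, whatever the number of threads started.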
If it's really going to take five months, what about cloning the drive and running the check on multiple instances of the same drive, each looking at different sections?
Before I begin, I have to apologize for two things. One is that it is very difficult for me to explain things in a concise manner. Two is that I need to be somewhat vague due to the nature of the company I work for.
I am working on enhancing the functionality of an application that I've inherited. It is a very intensive application that runs a good portion of my company's day to day business. Because of this I am limited to the scope of what I can change--otherwise I'd probably rewrite it from scratch. Anyways, here is what I need to do:
I have several threads that all perform the same task but on different data input streams. Each thread interacts through an API from another software system we pay licensing on to write out to what are called channels. Unfortunately we have only licensed a certain number of concurrently running channels, so this application is supposed to turn them on and off as needed.
Each thread should wait until there is an available channel, lock the channel for itself and perform its processing and then release the channel. Unfortunately, I don't know how to do this, especially across multiple threads. I also don't really know what to search Google or this site for, or I'd probably have my answer. This was my thought:
A class that handles the distribution of channel numbers. Each thread makes a call to a member of this class. When it does this, it would enter a queue and block until the channel-handling class recognizes that we have a channel, signals the waiting thread that a channel is available, and passes it the channel ID. I have no idea where to begin even looking this up. Below is some horribly written pseudocode of how, in my mind, I think it would work.
Public Class ChannelHandler
Private Shared WaitQueue as New Queue(of Thread)
'// calling thread adds itself to the queue
Public Shared Sub WaitForChannel(byref t as thread)
WaitQueue.enqueue(t)
End Sub
Public Shared Sub ReleaseChannel(chanNum as integer)
'// my own processing to make the chan num available again
End Sub
'// this would be running on a separate thread, polling my database
'// for an available channel, when it finds one, somehow signal
'// the first thread in the queue that its got a channel and here's the id
Public Shared Sub ChannelLoop()
while true
if WaitQueue.length > 0 then
if thereIsAChannelAvailable then '//i can figure this out my own
dim t as thread = ctype(WaitQueue.dequeue(), Thread)
lockTheChannel(TheAvailableChannelNumber) 'performed by me
'// signal the thread, passing it the channel number
t => SignalReady(theAvailableChannelNumber) '// how to signal?
end if
end if
end while
End Sub
End Class
and then
'// this inside the function that is doing the processing:
ChannelHandler.requestChannel(CurrentThread)
while (waitingForSignal) '// how?
block '// how?
dim channelNumber as int => getChannelNumberThatWasSignaledBack
'// perform processing with channelNumber
ChannelHandler.ReleaseChannel(channelNumber)
I am working with the .NET Framework 3.5 in VB.NET. I am sure there has got to be some sort of mechanism already built for this, but as I said I have no idea exactly what keywords I should be searching for. Any input pointing me in the right direction (ie specific .NET framework classes to use or code samples) would be greatly appreciated. If I need to elaborate on anything, please let me know and I will to the best of my ability.
Edit: The other problem I have is that these channels can be turned on or off from outside this application, manually by the user (or as a result of a user-initiated event). I am not concerned with a channel being shut down while a thread is using it (it would throw an exception and then pick back up the next time it came through). The issue is that there is not a constant number of threads fighting over a constant number of channels (if a user turns one on manually, the available count is reduced, etc.). Both are variable, so I can't assume there are no external forces (i.e., something outside this set of threads), which is why I do some processing via my DB to determine an available channel number.
What I would do:
Replace System.Threading.Thread with the System.Threading.Tasks.Task class.
If a new Task needs to be created but the count of the List(Of Task) (or, in your example, Queue(Of Task)) has reached the maximum permitted, use the Task.WaitAny method.
EDIT:
As I answered the previous block on my phone (which is pretty challenging for writing code), let me now write an example of how I would do it:
Imports System.Threading.Tasks
Imports System.Collections.Generic
Public Class Sample
Private Const MAXIMUM_PERMITTED As Integer = 3
Private _waitQueue As New Queue(Of Task)
Public Sub AssignChannel()
Static Dim queueManagerCreated As Boolean
If Not queueManagerCreated Then
Task.Factory.StartNew(Sub() ManageQueue())
queueManagerCreated = True
End If
Dim newTask As New Task(Sub()
' Connect to 3rd Party software
End Sub)
SyncLock (_waitQueue)
_waitQueue.Enqueue(newTask)
End SyncLock
End Sub
Private Sub ManageQueue()
Dim tasksRunning As New List(Of Task)
While True
If _waitQueue.Count <= 0 Then
Threading.Thread.Sleep(10)
Continue While
End If
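' Use >= so that no more than MAXIMUM_PERMITTED tasks run at once.
If tasksRunning.Count >= MAXIMUM_PERMITTED Then
Dim endedTaskPos As Integer = Task.WaitAny(tasksRunning.ToArray)
' Valid indexes run from 0 to Count - 1.
If endedTaskPos > -1 AndAlso
endedTaskPos < tasksRunning.Count Then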
tasksRunning.RemoveAt(endedTaskPos)
Else
Continue While
End If
End If
Dim taskToStart As Task
SyncLock (_waitQueue)
taskToStart = _waitQueue.Dequeue()
End SyncLock
tasksRunning.Add(taskToStart)
taskToStart.Start()
End While
End Sub
End Class
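The Task class only arrived with .NET 4.0, so on .NET 3.5 a different mechanism is needed. One option is a sketch built on Monitor.Wait and Monitor.PulseAll, which gives the "block until a channel is free, then get its number" behaviour the question describes. ChannelPool, Acquire and Release are made-up names, and the fixed channel list {1, 2} is an assumption. With channels toggled externally, whatever code detects a newly available channel would call Release for it.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical helper: hands out channel numbers and blocks callers while none are free.
Public Class ChannelPool
    Private ReadOnly _free As New Queue(Of Integer)()
    Private ReadOnly _gate As New Object()

    Public Sub New(ByVal channelIds As IEnumerable(Of Integer))
        For Each id As Integer In channelIds
            _free.Enqueue(id)
        Next
    End Sub

    ' Blocks the calling thread until a channel is free, then returns its number.
    Public Function Acquire() As Integer
        SyncLock _gate
            While _free.Count = 0
                Monitor.Wait(_gate) ' gives up the lock while waiting
            End While
            Return _free.Dequeue()
        End SyncLock
    End Function

    ' Returns a channel to the pool and wakes the waiting threads.
    Public Sub Release(ByVal channelId As Integer)
        SyncLock _gate
            _free.Enqueue(channelId)
            Monitor.PulseAll(_gate)
        End SyncLock
    End Sub
End Class

Module ChannelDemo
    ' Assumed for the demo: two licensed channels, five competing workers.
    Private ReadOnly Pool As New ChannelPool(New Integer() {1, 2})

    Sub Worker(ByVal state As Object)
        Dim channel As Integer = Pool.Acquire()
        Try
            Console.WriteLine("Worker {0} using channel {1}", state, channel)
            Thread.Sleep(100) ' stand-in for the real processing
        Finally
            Pool.Release(channel)
        End Try
    End Sub

    Sub Main()
        Dim threads As New List(Of Thread)()
        For i As Integer = 1 To 5
            Dim t As New Thread(AddressOf Worker)
            t.Start(i)
            threads.Add(t)
        Next
        For Each t As Thread In threads
            t.Join()
        Next
    End Sub
End Module
```

The While loop around Monitor.Wait matters: a woken thread re-checks the queue, because another thread may have taken the channel first.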
I just have a simple vb.net website that need to call a Sub that performs a very long task that works with syncing up some directories in the filesystem (details not important).
When I call the method, the website eventually times out waiting for the subroutine to complete. However, even though the website times out, the routine eventually completes its task and all the directories end up as they should.
I just want to prevent the timeout, so I'd like to call the Sub asynchronously. I do not need (or even want) any callback/confirmation that it ran successfully.
So, how can I call my method asynchronously inside a website using VB.net?
If you need to see some code:
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
Call DoAsyncWork()
End Sub
Protected Sub DoAsyncWork()
Dim ID As String = ParentAccountID
Dim ParentDirectory As String = ConfigurationManager.AppSettings("AcctDataDirectory")
Dim account As New Account()
Dim accts As IEnumerable(Of Account) = account.GetAccounts(ID)
For Each f As String In My.Computer.FileSystem.GetFiles(ParentDirectory)
If f.EndsWith(".txt") Then
Dim LastSlashIndex As Integer = f.LastIndexOf("\")
Dim newFilePath As String = f.Insert(LastSlashIndex, "\Templates")
My.Computer.FileSystem.CopyFile(f, newFilePath)
End If
Next
For Each acct As Account In accts
If acct.ID <> ID Then
Dim ChildDirectory As String = ConfigurationManager.AppSettings("AcctDataDirectory") & acct.ID
If My.Computer.FileSystem.DirectoryExists(ChildDirectory) = False Then
IO.Directory.CreateDirectory(ChildDirectory)
End If
My.Computer.FileSystem.DeleteDirectory(ChildDirectory, FileIO.DeleteDirectoryOption.DeleteAllContents)
My.Computer.FileSystem.CopyDirectory(ParentDirectory, ChildDirectory, True)
Else
End If
Next
End Sub
I wouldn't recommend using the Thread class unless you need a lot more control over the thread, as creating and tearing down threads is expensive. Instead, I would recommend using a ThreadPool thread. See this for a good read.
You can execute your method on a ThreadPool thread like this:
System.Threading.ThreadPool.QueueUserWorkItem(AddressOf DoAsyncWork)
You'll also need to change your method signature to...
Protected Sub DoAsyncWork(state As Object) 'even if you don't use the state object
Finally, also be aware that unhandled exceptions in other threads will kill IIS. See this article (old but still relevant; not sure about the solutions though since I don't really use ASP.NET).
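One common way to guard against that is to catch everything inside the work item itself. The following is only a sketch under assumptions: DoAsyncWork here is a dummy that throws, SafeWork is a made-up wrapper name, and the ManualResetEvent exists solely so the console demo waits for the work item.

```vb
Imports System
Imports System.Threading

Module SafeWorkItemDemo
    ' Hypothetical stand-in for the long-running directory sync.
    Sub DoAsyncWork()
        Throw New InvalidOperationException("simulated failure")
    End Sub

    ' Matches the WaitCallback signature and never lets an exception escape.
    Sub SafeWork(ByVal state As Object)
        Try
            DoAsyncWork()
        Catch ex As Exception
            ' A real site would log this somewhere durable; Console is for the demo only.
            Console.WriteLine("Background work failed: " & ex.Message)
        Finally
            DirectCast(state, ManualResetEvent).Set()
        End Try
    End Sub

    Sub Main()
        Using done As New ManualResetEvent(False)
            ThreadPool.QueueUserWorkItem(AddressOf SafeWork, done)
            done.WaitOne() ' only so the demo does not exit before the work item runs
        End Using
        Console.WriteLine("Process still alive")
    End Sub
End Module
```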
You could do this with a simple thread:
Add :
Imports System.Threading
And wherever you want it to run :
Dim t As New Thread(New ThreadStart(AddressOf DoAsyncWork))
t.Priority = Threading.ThreadPriority.Normal
t.Start()
The call to t.Start() returns immediately and the new thread runs DoAsyncWork in the background until it completes. You would have to make sure that everything in that call was thread-safe but at first glance it generally seems to be so already.
I also was looking for information on Asynchronous programming in VB. In addition to this thread, I also found the following: beginning with Visual Studio 2012 and .Net Framework 4.5, VB was given two new keywords to make a method asynchronous right in the declaration, without using Thread or Threadpool. The new keywords are "Async" and "Await". You may refer to the following links if you wish:
http://msdn.microsoft.com/library/hh191443%28vs.110%29.aspx
https://msdn.microsoft.com/en-us/library/hh191564%28v=vs.110%29.aspx
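A minimal sketch of the two keywords, assuming .NET Framework 4.5 or later. SlowSum and SumAsync are made-up names, and Main is kept synchronous, so the demo simply blocks on the result.

```vb
Imports System
Imports System.Threading.Tasks

Module AsyncDemo
    ' Hypothetical stand-in for slow work.
    Function SlowSum(ByVal n As Integer) As Integer
        Dim total As Integer = 0
        For i As Integer = 1 To n
            total += i
        Next
        Return total
    End Function

    ' Async marks the method; Await suspends it until the task has finished.
    Async Function SumAsync(ByVal n As Integer) As Task(Of Integer)
        Dim result As Integer = Await Task.Run(Function() SlowSum(n))
        Return result
    End Function

    Sub Main()
        ' Blocking on .Result is acceptable in a console demo; avoid it in UI or ASP.NET code.
        Console.WriteLine(SumAsync(100).Result)
    End Sub
End Module
```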
This is an older thread, but I figured I'd add to it anyway as I recently needed to address this. If you want to use the ThreadPool to call a method with parameters, you can modify @Timiz0r's example as such:
System.Threading.ThreadPool.QueueUserWorkItem(Sub() MethodName( param1, param2, ...))
I've read a lot of different questions on SO about multithreaded applications and how to split work up between them, but none really seem to fit what I need for this. Here's how my program currently basically works:
Module Module1
'string X declared out here
Sub Main()
'Start given number of threads of Main2()
End Sub
Sub Main2()
'Loops forever
'Call X = nextvalue(X), display info as needed
End Sub
Function nextvalue(byval Y as string)
'Determines the next Y in the sequence
End Function
End Module
This is only a rough outline of what actually happens in my code by the way.
My problem being that if multiple threads start running Main2(), they're dealing with the same X value as in the other threads. The loop inside of main2 executes multiple times per millisecond, so I can't just stagger the loops. There is often duplication of work done.
How can I properly divide up the work so that the two threads running simultaneously never have the same work to run?
You should synchronize the generation and storage of X so that the composite operation appears atomic to all threads.
Module Module1
Private X As String
Private LockObj As Object = New Object()
Private Sub Main2()
Do While True
' This will be used to store a snapshot of X that can be used safely by the current thread.
Dim copy As String
' Generate and store the next value atomically.
SyncLock LockObj
X = nextValue(X)
copy = X
End SyncLock
' Now you can perform operations against the local copy.
' Do not access X outside of the lock above.
Console.WriteLine(copy)
Loop
End Sub
End Module
A thread manager is required to manage the threads and the work that they do. Say it is desirable to split up the work into 10 threads.
Start the manager
Manager creates 10 threads
Assign work to the manager (queue up the work, let's say it queues up 10000 work items)
Manager assigns a work item to complete for each of the 10 threads.
As threads finish their work, they report back to the manager that they are done and receive another work item. The queue of work should be thread safe so that items can be enqueued and dequeued. The manager handles the management of work items. The threads just execute the work.
Once this is in place, work items should never be duplicated amongst threads.
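The steps above can be sketched roughly as follows, assuming .NET 4.0 or later for ConcurrentQueue. The module name and the integer work items are placeholders. Here the thread-safe queue plays the manager's role of handing out work, and each thread pulls its next item from it.

```vb
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading

Module WorkManagerDemo
    Private ReadOnly Work As New ConcurrentQueue(Of Integer)()
    Private processed As Integer = 0

    Sub Worker()
        Dim item As Integer
        ' TryDequeue hands each item to exactly one thread, so no work is duplicated.
        While Work.TryDequeue(item)
            Interlocked.Increment(processed) ' stand-in for real work on the item
        End While
    End Sub

    Sub Main()
        ' Queue up all the work before any thread starts.
        For i As Integer = 1 To 10000
            Work.Enqueue(i)
        Next
        Dim threads As New List(Of Thread)()
        For i As Integer = 1 To 10
            Dim t As New Thread(AddressOf Worker)
            t.Start()
            threads.Add(t)
        Next
        For Each t As Thread In threads
            t.Join()
        Next
        Console.WriteLine("Processed {0} of 10000 items", processed)
    End Sub
End Module
```

Because every item is enqueued before the workers start, each worker can simply exit when the queue is empty. A queue that keeps receiving work would need a blocking collection or a stop signal instead.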
Use a lock so that only one thread can access X at a time. Once one thread is done with it, another thread is able to use it. This will prevent two threads from calling nextvalue(x) with the same value.
I have been creating multiple background threads to parse xml files and recreate new xml files. Now the problem I am having is that even though I use synclock on global variables, I will still at times get errors and I am sure that this is just the crude way of coding I am doing, but I was wondering if someone had a better option.
program flow =
access local folder and upload all files into list
strip each file into xml entries and put these entries into an arraylist
parse for specific values and enter these values into a database table
now create a thread and take the arraylist of entries and the thread will reparse
thread parses and creates a new xml file
main thread continues with another function and then goes and get a file from list
I will add some code to show the problem areas, but if I have a global variable in use, do the different threads overwrite the value in that variable, causing contamination?
For Each g In resultsList
gXmlList.Add(g)
Next
Dim bgw As New BackgroundWorker
bgw.WorkerSupportsCancellation = True
AddHandler bgw.DoWork, New DoWorkEventHandler(AddressOf createXML)
AddHandler bgw.RunWorkerCompleted, AddressOf WorkComplete
threadlist.Add(bgw)
bgw.RunWorkerAsync()
Private Sub createXML()
num += 1
Dim file As String = Module1.infile
xmlfile = directoryPath & "\New" & dateTime.Now.ToUniversalTime.ToString("yyyyMMddhhmmss") & endExtension
Thread.Sleep(2000)
Dim doc As XmlDocument = New XmlDocument
xwriter = New XmlTextWriter(xmlfile, Encoding.UTF8) ' <-- this is where the IOException is thrown
xwriter.Formatting = Formatting.Indented
xwriter.Indentation = 2
xwriter.WriteStartDocument(True)
xwriter.WriteStartElement("Posts")
I have global variables throughout the app. Should I be locking each one, and doesn't that make using threads useless?
Dim j As Integer = 0
I believe your biggest problem is not knowing which features in .NET are thread safe. A List, for example, is not, and neither is a plain Dictionary (a ConcurrentDictionary is). While you may get away with it, you will eventually run into problems with locking, etc.
You're using classes and variables that are not thread safe. Any time you are working with threads you have to be extremely careful with locking. To answer your question: yes, you have to lock and unlock everything you are working with unless the type or method specifically handles it for you.
There are a lot of multithreading features (PLINQ, for example) in .NET 4.0 that handle much of the "grunt work" for you. While you should learn and understand how to write thread-safe code yourself, they will give you a head start.
Try passing the data into the createXML() method. That may help isolate the code from other data being accessed. I would suggest reading up on threading and learning how to do it without a background worker.
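As a rough illustration of passing the data in, BackgroundWorker.RunWorkerAsync accepts an argument that arrives in the handler as e.Argument. The names below echo the question's code, but the list contents and the ManualResetEvent are assumptions added to make a self-contained console demo.

```vb
Imports System
Imports System.Collections.Generic
Imports System.ComponentModel
Imports System.Threading

Module PassDataDemo
    Private ReadOnly Done As New ManualResetEvent(False)

    ' The handler reads only e.Argument, so it touches no global state.
    Sub CreateXml(ByVal sender As Object, ByVal e As DoWorkEventArgs)
        Dim entries As List(Of String) = DirectCast(e.Argument, List(Of String))
        e.Result = entries.Count ' stand-in for writing the XML file
    End Sub

    Sub WorkComplete(ByVal sender As Object, ByVal e As RunWorkerCompletedEventArgs)
        Console.WriteLine("Entries handled: {0}", e.Result)
        Done.Set()
    End Sub

    Sub Main()
        Dim resultsList As New List(Of String)(New String() {"a", "b", "c"})
        Dim bgw As New BackgroundWorker()
        AddHandler bgw.DoWork, AddressOf CreateXml
        AddHandler bgw.RunWorkerCompleted, AddressOf WorkComplete
        ' Hand the worker its own copy so later changes to resultsList cannot affect it.
        bgw.RunWorkerAsync(New List(Of String)(resultsList))
        Done.WaitOne() ' only so the console demo waits for completion
    End Sub
End Module
```

Giving each worker its own copy of the list means no lock is needed for that data at all.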
Global variables are generally a bad idea. Given your VB code I'm guessing this is a carry over from the VB6 world for you. That's not in any way intended to be insulting, just trying to help advance your skills forward. Variable scope should be as confined as possible.
Another thought looking at your code is to learn how to use String.Format() when building strings / paths.
Simple manual thread in VB to get you started:
Dim bThread As New Threading.Thread(AddressOf createXML)
bThread.IsBackground = True
bThread.Start()
Well, if you are having issues with thread locking, then you can simply wrap your action in the following manner.
'This must be declared at a scope that all threads share
Dim readerWriterLock As New Threading.ReaderWriterLockSlim
readerWriterLock.EnterWriteLock()
Try
    xwriter = New XmlTextWriter(xmlfile, Encoding.UTF8)
    'other logic
Finally
    'Always release the lock, even if an exception is thrown
    readerWriterLock.ExitWriteLock()
End Try
'anything reading from this would need to have the following
readerWriterLock.EnterReadLock()
Try
    'logic
Finally
    readerWriterLock.ExitReadLock()
End Try
Try this and then if not successful post the exception message and any other information that you can.