VB.NET - Does this code execute the ThreadPool correctly?

Code:
'Get the thread count. Let's say this value was 150.
Dim threads As Integer = CheckerThreads.Value
'Set the thread count.
ThreadPool.SetMinThreads(threads, threads)
ThreadPool.SetMaxThreads(threads, threads)
ServicePointManager.DefaultConnectionLimit = threads
ServicePointManager.Expect100Continue = True
'For each proxy from the opened file:
For Each Proxy In proxies
    'Check the proxy.
    ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf CheckProxy), Proxy)
Next
The code above takes each proxy from a List() and uses a WebRequest to check it, but that's not the point. What I'm wondering is whether using a For Each the way I have executes the way I think it does, or whether I've done something wrong.
Which of the following is happening?
1 - Is 1 proxy being checked with 150 processes?
-or-
2 - Is it checking 1 proxy per process, with 150 processes checking at the same time?
If it's doing #1, how can I change it to do #2?

Each call to QueueUserWorkItem queues one work item for one proxy, so what you have is #2: every proxy is checked once, by whichever pool thread picks it up. The tasks assigned to your threads will be executed over the same time frame. I say it that way because the only way that two tasks can actually be processed at exactly the same time is on multiple processor cores. Obviously you don't have 150 processor cores, so 150 tasks can't be processed at the same time. They will be interleaved, though, and because processors work so fast, it appears to the naked eye that they are processed simultaneously.
The reason I suggested that you read the relevant documentation is that messing with the thread counts can actually hurt performance. With so many threads active you can cause things to slow down because of all the context switching. It's generally best to just queue everything up and let the system handle the rest, as the documentation says.
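To make the "one work item per proxy" behaviour concrete, here is a minimal sketch in Python rather than VB.NET, using concurrent.futures.ThreadPoolExecutor as a stand-in for the .NET ThreadPool. The check_proxy body is a hypothetical placeholder for the real WebRequest check, and the pool size of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def check_proxy(proxy):
    # Hypothetical placeholder for the real network check.
    # Returns the proxy plus the name of the pool thread that handled it.
    return proxy, threading.current_thread().name


def check_all(proxies, max_workers=4):
    # One task is queued per proxy; the pool decides which thread runs it.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(check_proxy, proxies))


if __name__ == "__main__":
    proxies = [f"10.0.0.{i}:8080" for i in range(20)]
    for proxy, thread_name in check_all(proxies):
        print(proxy, "checked on", thread_name)
```

Each proxy appears exactly once in the output, and the number of distinct thread names never exceeds the pool size, which is the same relationship the question's loop has with the .NET pool.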

Related

Chronicle Queue - reader/tailer latency when run at same time while writing

I'm setting up market data back-testing using Chronicle Queue (CQ): reading data from a binary file, writing it into a single CQ, and simultaneously reading the data from that CQ and dumping statistics. I am doing a POC to replace our existing real-time market data feed handler worker queue.
While doing basic read/write testing on a Linux/SSD setup, I see reads lagging behind writes; in fact, latency is accumulating. Both the Appender and the Tailer are running as separate processes on the same host.
I would like to know if there is any issue in the code I am using.
Below is the code snippet:
Writer -
In constructor -
myQueue = SingleChronicleQueueBuilder.binary(queueName).build();
myAppender = myQueue.acquireAppender();
In data callback -
myAppender.writeDocument(myDataPacket);
myQueue.close();
where myDataPacket is Java object wrapping the byte[] and other fields.
Tailer -
In Constructor -
myQueue = SingleChronicleQueueBuilder.binary(queueName).build();
myTailer = myQueue.createTailer();
In Read method -
while (notLastRecord)
{
    if(myTailer.readDocument(myDataPacket))
    {
        notLastRecord = ;
        //do stuff
    }
}
myQueue.close();
Any help is highly appreciated.
Thanks,
Pavan
First of all, I assume that by "reads are lagging behind writes - in fact latency is accumulating" you mean that for every subsequent message, the time the message is read from the queue is further from the time the event was written to the queue.
If you see latency accumulating like that, most likely the data is produced much quicker than you can consume it, which is very much possible in the use case you described. If all you need on the write side is to parse a simple text line and dump it into a queue file, that's quick; if you do some processing when you read the entry from the queue, that might be slower.
From the code it's not clear what or how much work your code is doing. The code looks OK to me, except that you probably shouldn't call queue.close() after each appender.writeDocument() call. Most likely you are not actually doing this, otherwise it would blow up.
Without seeing actual code or test case it's impossible to say more.
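The arithmetic behind "latency is accumulating" can be sketched with a tiny model, written here in Python and not using Chronicle Queue at all. It assumes, purely for illustration, that one message is written every write_interval_us microseconds and that the reader needs process_us microseconds per message; both numbers are hypothetical.

```python
def read_lags(n, write_interval_us, process_us):
    """Return, for each of n messages, the delay between write and read completion."""
    lags = []
    reader_free_at = 0  # time at which the reader finishes its current message
    for i in range(n):
        written_at = i * write_interval_us
        # The reader can start only once the message exists and it is idle.
        start = max(written_at, reader_free_at)
        reader_free_at = start + process_us
        lags.append(reader_free_at - written_at)
    return lags


if __name__ == "__main__":
    print(read_lags(5, 10, 15))  # reader slower than writer: [15, 20, 25, 30, 35]
    print(read_lags(5, 10, 5))   # reader keeps up: [5, 5, 5, 5, 5]
```

When the per-message read cost exceeds the write interval, the lag grows by the difference on every message; when it does not, the lag stays flat. Timing each side separately should show which case applies.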

How can I use multi-threading (Parallel ForEach), and batch? Or should I handle it differently?

I'm creating an app to send out bulk emails, but I don't fully understand batching or multi-threading, or the best way to handle this.
Say I do something like this:
Dim options as New ParallelOptions
options.MaxDegreeOfParallelism = Environment.ProcessorCount * 10
Parallel.ForEach(recipients.AsEnumerable(), options, _
Function(row)
Return SendEmail(args)
End Function)
What if there is a large number of emails? I was thinking of adding a batch option, but I'm not sure it's beneficial. If I have 500,000 subscribers, could this Parallel loop become unstable? Should there be a pause or sleep at some point? When thinking of emails I think of "Batch = #", but I'm unsure how to carry this concept over into multi-threading like this, or whether I should be doing something entirely different.
I've heard of senders getting blacklisted if this is not handled correctly.

VB.NET, best practice for sorting concurrent results of threadpooling?

In short, I'm trying to "sort" incoming results of using threadpooling as they finish. I have a functional solution, but there's no way in the world it's the best way to do it (it's prone to huge pauses). So here I am! I'll try to hit the bullet points of what's going on/what needs to happen and then post my current solution.
The intent of the code is to get information about files in a directory, and then write that to a text file.
I have a list (Counter.ListOfFiles) that is a list of the file paths sorted in a particular way. This is the guide that dictates the order I need to write to the text file.
I'm using a threadpool to collect the information about each file and create a stringbuilder with all of the text ready to write to the text file. I then call a procedure (SyncUpdate, included below), passing the stringbuilder (strBld) from that thread along with the path of the file that particular thread just wrote to the stringbuilder about (Xref).
The procedure includes a synclock to hold all the other threads until it finds a thread passing the correct information. That "correct" information being when the xref passed by the thread matches the first item in my list (FirstListItem). When that happens, I write to the text file, delete the first item in the list and do it again with the next thread.
The way I'm using the monitor is probably not great, in fact I have little doubt I'm using it in an offensively wanton manner. Basically while the xref (from the thread) <> the first item in my list, I'm doing a pulseall for the monitor. I originally was using monitor.wait, but it would eventually just give up trying to sort through the list, even when using a pulse elsewhere. I may have just been coding something awkwardly. Either way, I don't think it's going to change anything.
Basically the problem comes down to the fact that the monitor will pulse through all of the items it has in the queue, when there's a good chance the item I am looking for probably got passed to it somewhere earlier in the queue or whatever and it's now going to sort through all of the items again before looping back around to find a criteria that matches. The result of this is that my code will hit one of these and take a huge amount of time to complete.
I'm open to believing I'm just using the wrong tool for the job, or just not using the tool I have correctly. I would strongly prefer some sort of threaded solution (unsurprisingly, it's much faster!). I've been messing around a bit with the Parallel Task functionality today, and a lot of the stuff looks promising, but I have even less experience with that than with the threadpool, and you can see how I'm abusing that! Maybe something with a queue? You get the idea. I am directionless. Anything someone could suggest would be much appreciated. Thanks! Let me know if you need any additional information.
Private Sub SyncUpdateResource(strBld As Object, Xref As String)
    SyncLock (CType(strBld, StringBuilder))
        Dim FirstListitem As String = counter.ListOfFiles.First
        Do While Xref <> FirstListitem
            FirstListitem = Counter.ListOfFiles.First
            'This makes the code much faster for reasons I can only guess at.
            Thread.Sleep(5)
            Monitor.PulseAll(CType(strBld, StringBuilder))
        Loop
        Dim strVol As String = Form1.Volname
        Dim strLFPPath As String = Form1.txtPathDir
        My.Computer.FileSystem.WriteAllText(strLFPPath & "\" & strVol & ".txt", strBld.ToString, True)
        Counter.ListOfFiles.Remove(Xref)
    End SyncLock
End Sub
This is a pretty typical multiple producer, single consumer application. The only wrinkle is that you have to order the results before they're written to the output. That's not difficult to do. So let's let that requirement slide for a moment.
The easiest way in .NET to implement a producer/consumer relationship is with BlockingCollection, which is a thread-safe FIFO queue. Basically, you do this:
In your case, the producer threads get items, do whatever processing they need to, and then put the item onto the queue. There's no need for any explicit synchronization--the BlockingCollection class implementation does that for you.
Your consumer pulls things from the queue and outputs them. You can see a really simple example of this in my article Simple Multithreading, Part 2. (Scroll down to the third example if you're just interested in the code.) That example just uses one producer and one consumer, but you can have N producers if you want.
Your requirements have a little wrinkle in that the consumer can't just write items to the file as it gets them. It has to make sure that it's writing them in the proper order. As I said, that's not too difficult to do.
What you want is a priority queue of some sort onto which you can place an item if it comes in out of order. Your priority queue can be a sorted list or even just a sequential list if the number of items you expect to get out of order isn't very large. That is, if you typically have only a half dozen items at a time that could be out of order, then a sequential list could work just fine.
I'd use a heap because it performs well. The .NET Framework doesn't supply a heap, but I have a simple one that works well for jobs like this. See A Generic BinaryHeap Class.
So here's how I'd write the consumer (the code is in pseudo-C#, but you can probably convert it easily enough).
The assumption here is that you have a BlockingCollection called sharedQueue that contains the items. The producers put items on that queue. Consumers do this:
var heap = new BinaryHeap<ItemType>();
foreach (var item in sharedQueue.GetConsumingEnumerable())
{
    if (item.SequenceKey == expectedSequenceKey)
    {
        // output this item
        // then check the heap to see if other items need to be output
        expectedSequenceKey = expectedSequenceKey + 1;
        while (heap.Count > 0 && heap.Peek().SequenceKey == expectedSequenceKey)
        {
            var heapItem = heap.RemoveRoot();
            // output heapItem
            expectedSequenceKey = expectedSequenceKey + 1;
        }
    }
    else
    {
        // item is out of order
        // put it on the heap
        heap.Insert(item);
    }
}
// if the heap contains items after everything is processed,
// then some error occurred.
One glaring problem with this approach as written is that the heap could grow without bound if one of your producers crashes or goes into an infinite loop, because the item the consumer is waiting for never arrives. But then, your other approach probably would suffer from that as well. If you think that's an issue, you'll have to add some way to skip an item that you think won't ever be forthcoming. Or kill the program. Or something.
If you don't have a binary heap or don't want to use one, you can do the same thing with a SortedList<TKey, TValue> keyed on the sequence key. SortedList will be faster than List, but slower than BinaryHeap if the number of items in the list is even moderately large (a couple dozen). Fewer than that and it's probably a wash.
I know that's a lot of info. I'm happy to answer any questions you might have.

How can I (reasonably) precisely perform an action every N milliseconds?

I have a machine which uses an NTP client to sync up to internet time, so its system clock should be fairly accurate.
I've got an application which I'm developing which logs data in real time, processes it and then passes it on. What I'd like to do now is output that data every N milliseconds, aligned with the system clock. So for example, if I wanted to do 20ms intervals, my outputs ought to be something like this:
13:15:05:000
13:15:05:020
13:15:05:040
13:15:05:060
I've seen suggestions for using the stopwatch class, but that only measures time spans as opposed to looking for specific time stamps. The code to do this is running in its own thread, so it shouldn't be a problem if I need to make some relatively blocking calls.
Any suggestions on how to achieve this to a reasonable precision (close to or better than 1ms would be nice) would be very gratefully received.
I don't know how well it plays with C++/CLR, but you probably want to look at multimedia timers.
Windows isn't really real-time, but this is as close as it gets.
You can get a pretty accurate time stamp out of timeGetTime() when you reduce the time period. You'll just need some work to get its return value converted to a clock time. This sample C# code shows the approach:
using System;
using System.Runtime.InteropServices;

class Program {
    static void Main(string[] args) {
        timeBeginPeriod(1);
        uint tick0 = timeGetTime();
        var startDate = DateTime.Now;
        uint tick1 = tick0;
        for (int ix = 0; ix < 20; ++ix) {
            uint tick2 = 0;
            do {  // Burn 20 msec
                tick2 = timeGetTime();
            } while (tick2 - tick1 < 20);
            var currDate = startDate.Add(new TimeSpan((tick2 - tick0) * 10000));
            Console.WriteLine(currDate.ToString("HH:mm:ss:ffff"));
            tick1 = tick2;
        }
        timeEndPeriod(1);
        Console.ReadLine();
    }
    [DllImport("winmm.dll")]
    private static extern int timeBeginPeriod(int period);
    [DllImport("winmm.dll")]
    private static extern int timeEndPeriod(int period);
    [DllImport("winmm.dll")]
    private static extern uint timeGetTime();
}
On second thought, this is just measurement. To get an action performed periodically, you'll have to use timeSetEvent(). As long as you use timeBeginPeriod(), you can get the callback period pretty close to 1 msec. One nicety is that it will automatically compensate when the previous callback was late for any reason.
Your best bet is using inline assembly and writing this chunk of code as a device driver.
That way:
You have control over instruction count
Your application will have execution priority
Ultimately you can't guarantee what you want, because the operating system has to honour requests from other processes to run, meaning that something else can always be busy at exactly the moment that you want your process to be running. But you can improve matters by using timeBeginPeriod to make it more likely that your process can be switched to in a timely manner, and perhaps by being cunning with how you wait between iterations, e.g. sleeping for most but not all of the time and then using a busy-loop for the remainder.
Try doing this in two threads. In one thread, use something like this to query a high-precision timer in a loop. When you detect a timestamp that aligns to (or is reasonably close to) a 20ms boundary, send a signal to your log output thread along with the timestamp to use. Your log output thread would simply wait for a signal, then grab the passed-in timestamp and output whatever is needed. Keeping the two in separate threads will make sure that your log output thread doesn't interfere with the timer (this is essentially emulating a hardware timer interrupt, which would be the way I would do it on an embedded platform).
CreateWaitableTimer/SetWaitableTimer and a high-priority thread should be accurate to about 1ms. I don't know why the millisecond field in your example output has four digits, the max value is 999 (since 1000 ms = 1 second).
Since, as you said, this doesn't have to be perfect, there are some things that can be done.
As far as I know, there is no timer that syncs with a specific time. So you will have to compute your next time and schedule the timer for that specific time. If your timer only has delta support, then that is easily computed, but it adds more error, since you could easily be kicked off the CPU between the time you compute your delta and the time the timer is entered into the kernel.
As already pointed out, Windows is not a real-time OS. So you must assume that even if you schedule a timer to go off at ":0010", your code might not even execute until well after that time (for example, ":0540"). As long as you properly handle those issues, things will be "ok".
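As a rough illustration of "compute your next time", combined with the sleep-then-busy-loop idea from an earlier answer, here is a minimal sketch in Python. It is not Windows-specific and makes no precision guarantee. The 2 ms spin window is an arbitrary assumption, and all function names are made up for the example.

```python
import time


def next_boundary_ns(now_ns, interval_ns):
    """Smallest multiple of interval_ns that is strictly later than now_ns."""
    return (now_ns // interval_ns + 1) * interval_ns


def wait_until_ns(target_ns, spin_ns=2_000_000):
    """Sleep for most of the wait, then busy-loop for the last spin_ns."""
    remaining = target_ns - time.time_ns()
    if remaining > spin_ns:
        time.sleep((remaining - spin_ns) / 1e9)
    while time.time_ns() < target_ns:
        pass  # burn CPU for the final stretch


def run_aligned(ticks, interval_ms=20):
    """Wait for `ticks` consecutive clock-aligned boundaries and return them."""
    interval_ns = interval_ms * 1_000_000
    stamps = []
    for _ in range(ticks):
        # Recompute from the wall clock each time, so a late wake-up
        # skips ahead instead of accumulating drift.
        target = next_boundary_ns(time.time_ns(), interval_ns)
        wait_until_ns(target)
        stamps.append(target)
    return stamps


if __name__ == "__main__":
    for stamp in run_aligned(5):
        print(stamp // 1_000_000, "ms since epoch")
```

Recording the target boundary rather than the actual wake-up time keeps the logged timestamps aligned even when the scheduler delivers the thread late. Comparing the two would show the real jitter on a given machine.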
20ms is approximately the length of a time slice on Windows. There is no way to reliably hit 1ms timings on Windows without some sort of RT add-on like Intime. In Windows proper, I think your options are WaitForSingleObject, SleepEx, and a busy loop.

How do i start Process iteratively in VB.NET? or change argument dynamically

I have used the following code to create and close a process repeatedly:
Dim vProcessInfo As New ProcessStartInfo
For i = 1 To 100
    Dim p As New Process
    vProcessInfo.Arguments = "some" + i.ToString()
    p.StartInfo = vProcessInfo
    p.Start()
    p.WaitForExit()
    p.Close()
Next i
The above code works, but process creation and disposal take too much time. I have to change the process argument on each iteration. Is there any way to change the process argument dynamically, or is there a better method to reduce the time? Please help.
"Is there any way to change the process argument dynamically" - do you mean you want to start one process, and change its command line arguments after it's started? No, you can't do that - but you could communicate with it in other ways, for example:
Using standard input/output (e.g. write lines of text to its standard input)
Using files (e.g. you write to a file, it monitors the directory, picks up the file and processes it)
Using named pipes or sockets
Creating a process is a relatively slow operation. You can't easily speed that up - but if you can change your process in some way like the above, and just launch it once, that should make it a lot faster.
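The "launch it once and talk over standard input/output" option can be sketched as follows, in Python rather than VB.NET. The worker here is a throwaway inline script that echoes each line back with a "done " prefix; it is a hypothetical stand-in for whatever the real child process does, and it assumes the child can be changed to read its jobs from stdin.

```python
import subprocess
import sys

# Hypothetical worker: reads one job per line and answers with one line.
WORKER = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    print('done ' + line.strip(), flush=True)\n"
)


def run_jobs(arguments):
    """Start the worker once, then feed it one argument per line."""
    with subprocess.Popen(
        [sys.executable, "-c", WORKER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        results = []
        for arg in arguments:
            proc.stdin.write(arg + "\n")
            proc.stdin.flush()  # make sure the worker sees the job now
            results.append(proc.stdout.readline().strip())
        proc.stdin.close()  # end of input tells the worker to exit
        proc.wait()
    return results


if __name__ == "__main__":
    print(run_jobs([f"some{i}" for i in range(1, 4)]))
```

The process start-up cost is paid once instead of once per argument. In .NET the corresponding pieces would be the redirected standard input and output streams of the started Process.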