VB.Net: multiple background workers lead to high CPU usage

I've got a VB.Net application that has two background workers. The first one connects to a device and reads a continuous stream of data from it into a structure. This runs and utilises around 2% of CPU.
From time to time new information comes in that's incomplete so I have another background worker which sits in a loop waiting for a global variable to be anything other than null. This variable is set when I want it to look up the missing information.
When both are running CPU utilisation goes to 30-50% of CPU.
I thought that offloading the lookup to its own thread would be a good move, as the lookup process may block (it's querying a URL) and this would prevent the first background worker from getting stuck, since it needs to read the incoming data in real time. However, commenting out the code in worker 2 to leave just the loop shown below still results in the same high CPU usage.
Do While lookupRunning = True
    If lookup <> "" Then
        ' Query a URL and get data
    End If
Loop
The problem is clearly that I'm running a tight infinite loop on worker 2. Other than dumping this idea and looking up the information on the main thread with a very short timeout in case the web service fails to respond, I'm not sure what to try: putting Application.DoEvents in my loop doesn't seem to make much difference, and it seems to be frowned upon in any case.
Is there a better way to do this?
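One commonly suggested alternative is to make worker 2 block on a synchronization primitive instead of polling, so it uses essentially no CPU while idle. A minimal sketch, assuming an AutoResetEvent; the names RequestLookup and LookupWorker_DoWork are illustrative and not from the original code:

' Fields on the form/class shared by the reader and the lookup worker (illustrative names).
Private ReadOnly lookupSignal As New System.Threading.AutoResetEvent(False)
Private lookup As String = ""
Private lookupRunning As Boolean = True

' Called by worker 1 when it spots incomplete data: set the value, then signal worker 2.
Private Sub RequestLookup(ByVal key As String)
    lookup = key
    lookupSignal.Set()
End Sub

' Worker 2: sleeps inside WaitOne, so it burns no CPU while there is nothing to look up.
Private Sub LookupWorker_DoWork(ByVal sender As Object, ByVal e As System.ComponentModel.DoWorkEventArgs)
    Do While lookupRunning
        ' Wake when signalled, or every 500 ms to re-check lookupRunning.
        If lookupSignal.WaitOne(500) AndAlso lookup <> "" Then
            ' Query the URL and get the data here; blocking no longer matters,
            ' because worker 1 keeps reading the device stream independently.
        End If
    Loop
End Sub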

Related

TimeOut in Thread with a query from io.vertx.ext.sql.SQLClient;

Well, I'm a new developer with Vert.x, and I have a problem with a database connection implementation.
In one or more queries I retrieve a lot of information, around 160K records, which are returned as a JSON object through GraphQL. When a query takes longer than 30000 ms, the console says:
Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 5026 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
I've investigated this, but I cannot find a way to resolve it, or to raise the time limit so the query can run until it finishes and returns all the records.
This question is actually covered in detail in the official documentation.
you can’t call blocking operations directly from an event loop, as
that would prevent it from doing any other useful work
That's what you're doing at the moment - calling a blocking operation.
An alternative way to run blocking code is to use a worker verticle. A worker verticle is always executed with a thread from the worker pool.
Run your "slow" code in a worker verticle. Communicate between event-loop verticles and workers using the EventBus. As long as you're inside the same VM, passing even large collections over the EventBus has no overhead.

Why is console output paused until the process is done?

When I use Python or similar languages, I print the row count inside a 'for' loop, because it helps me estimate how long the run will take.
In VBA, however, printing the row numbers to the console pauses once the count reaches a certain level, even though the process itself is still running normally.
When the process is done, all the output messages are suddenly displayed in the console at once.
The performance of the PC is very good (i7 CPU, 16 GB RAM).
Why does this happen?
I used the code below; please refer to it.
For RowVariable = 2 To 100000
    Debug.Print RowVariable
Next RowVariable
I don't know why, but I have observed the same. Add a
DoEvents
inside the loop and you'll be fine. It'll slow down the process slightly, so you might want to do it every nth row instead (but if performance is an issue, you shouldn't use the console that much - it is a performance killer). Another bonus is that the DoEvents allows you to pause or halt the execution.
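For example, a minimal VBA sketch of the every-nth-row variant; the interval of 1000 rows is arbitrary:

Sub PrintRowsWithDoEvents()
    Dim RowVariable As Long
    For RowVariable = 2 To 100000
        Debug.Print RowVariable
        ' Yield to the IDE every 1000 rows so the Immediate window can repaint.
        If RowVariable Mod 1000 = 0 Then DoEvents
    Next RowVariable
End Sub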

Processing data while it is loading

We have a tool which loads data from some optical media, and once it's all copied to the hard drive runs it through a third-party tool for processing. I would like to optimise this process so each file is processed as it is read in. Trouble is, the third-party tool (which naturally I cannot change) has a 12 second startup overhead. What is the best way I can deal with this, in terms of finishing the entire process as soon as possible? I can pass any number of files to the processing tool in each run, so I need to be able to determine exactly when to run the tool to get the fastest result overall. The data being copied could be anything from one large file (which can't be processed until it's fully copied) to hundreds of small files.
The simplest approach would be to create and run two threads: one that runs the tool and one that loads data. Start a 12-second timer and trigger both threads. On each file-load completion, check the elapsed time; once 12 seconds have passed, hand the loaded data to the thread running the tool. Keep loading data in parallel with processing of the previous batch. Once processing of the previous batch completes, restart the 12-second timer and continue checking it on every file-load completion. Repeat until no more data remains.
For better results a more complex solution might be required. You can do some benchmarking to estimate the average data-loading time. Since it might differ between small and large files, several estimates may be needed for different categories of file (by size). Optimal resource utilisation is achieved when data is processed at the same rate that new data arrives, where processing time includes the 12-second startup. The benchmarking should give you a ratio of processing threads to reading threads (you can also increase or decrease the number of active reading threads according to the incoming file sizes). It's really a variation of the producer-consumer problem with multiple producers and consumers.
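A rough VB.Net sketch of the simpler approach, assuming .NET 4 for BlockingCollection; CopyFromMedia, RunTool, and the file list are placeholders, and the single loader/processor pair is a simplification of the scheme described above:

Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Threading

Module CopyAndProcess

    Sub Main()
        Dim copied As New BlockingCollection(Of String)()

        ' Loader thread: copy each file from the optical media and queue its path as it completes.
        Dim loader As New Thread(
            Sub()
                For Each fileName In {"disc\a.dat", "disc\b.dat"} ' placeholder file list
                    CopyFromMedia(fileName)
                    copied.Add(fileName)
                Next
                copied.CompleteAdding()
            End Sub)
        loader.Start()

        ' Processing side: on each completed file, check the elapsed time; once ~12 seconds
        ' have passed, hand the accumulated batch to the tool. Copying carries on in
        ' parallel while the tool works through the previous batch.
        Dim batch As New List(Of String)
        Dim sinceLastRun As Stopwatch = Stopwatch.StartNew()
        For Each copiedPath In copied.GetConsumingEnumerable()
            batch.Add(copiedPath)
            If sinceLastRun.Elapsed.TotalSeconds >= 12 Then
                RunTool(batch)          ' pays the 12 s startup once per batch
                batch.Clear()
                sinceLastRun.Restart()
            End If
        Next
        If batch.Count > 0 Then RunTool(batch)

        loader.Join()
    End Sub

    Sub CopyFromMedia(ByVal fileName As String)
        ' Placeholder: copy one file from the optical media to the hard drive.
    End Sub

    Sub RunTool(ByVal files As List(Of String))
        ' Placeholder: invoke the third-party tool (12 s startup) on this batch of files.
    End Sub

End Module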

SSIS 2005 Control Flow Priority

The short version is that I am looking for a way to prioritize certain tasks in SSIS 2005 control flows. That is, I want to be able to set it up so that Task B does not start until Task A has started, but Task B does not need to wait for Task A to complete. The goal is to reduce the amount of time I have idle threads hanging around waiting for Task A to complete so that they can move on to Tasks C, D & E.
The issue I am dealing with is converting a data warehouse load from a linear job that calls a bunch of SPs to an SSIS package calling the same SPs but running multiple threads in parallel. So basically I have a bunch of Execute SQL Task and Sequence Container objects with Precedence Constraints mapping out the dependencies. So far no problems; things are working great and it cut our load time a bunch.
However I noticed that tasks with no downstream dependencies are commonly being sequenced before those that do have dependencies. This is causing a lot of idle time in certain spots that I would like to minimize.
For example: I have about 60 procs involved in this load; ~10 of them have no dependencies at all and can run at any time. Then I have another one with no upstream dependencies, but almost every other task in the job is dependent on it. I would like to make sure that this task everything else depends on is running before I pick up any of the tasks with no dependencies. This is just one example; there are similar situations in other spots as well.
Any ideas?
I am late in updating over here, but I also raised this issue over on the MSDN forums and we were able to devise a partial workaround. See here for the full thread, or here for the feature request asking Microsoft to give us a way to do this cleanly...
The short version is that you use a series of Boolean variables to control loops that act like roadblocks and prevent the flow from reaching the lower priority tasks until the higher priority items have started.
The steps involved are:
1. Declare a Boolean variable for each of the high-priority tasks and default the values to False.
2. Create a pre-execute event handler for each of the high-priority tasks.
3. In the pre-execute event handler, create a script task which sets the appropriate Boolean to True (see the sketch after these steps).
4. At each choke point insert a For Loop container that will loop while the appropriate Boolean(s) are False. (I have a script with a 1-second sleep inside each loop, but it also works with empty loops.)
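SSIS 2005 script tasks are written in VB.Net, so the two script bodies (each living in its own script task) might look roughly like the sketch below. The variable name HighPriorityTaskStarted is illustrative, standing in for one of the Booleans from step 1, and it would have to be listed in each script task's ReadWriteVariables or ReadOnlyVariables:

' --- Script task placed in the OnPreExecute event handler of a high-priority task (step 3):
' --- flips that task's flag so the roadblock loops downstream can release.
Public Sub Main()
    Dts.Variables("HighPriorityTaskStarted").Value = True
    Dts.TaskResult = Dts.Results.Success
End Sub

' --- Script task placed inside each roadblock For Loop container (step 4), whose
' --- EvalExpression would be something like: @[User::HighPriorityTaskStarted] == FALSE.
' --- The sleep just stops the loop from spinning flat out while it waits.
Public Sub Main()
    System.Threading.Thread.Sleep(1000)
    Dts.TaskResult = Dts.Results.Success
End Sub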
If done properly this gives you a tool where at each choke point the package has some number of high priority tasks ready to run and a blocking loop that keeps it from proceeding down the lower priority branches until said high priority items are running. Once all of the high priority tasks have been started the loop clears and allows any remaining threads to move on to lower priority tasks. Worst case is one thread sits in the loop while waiting for other threads to come along and pick up the high priority tasks.
The major drawback to this approach is the risk of deadlocking the package if too many blocking loops get queued up at the same time, or if you misread your dependencies and have loops waiting for tasks that never start. Careful analysis is needed to decide which items deserve higher priority and where exactly to insert the blocks.
I don't know of any elegant way to do this, but my first shot would be something like this:
Put the proc that has to run first in a Sequence Container. In that same sequence container put a script task that just waits 5-10 seconds or so before each of the 10 independent steps can run. Then chain the rest of the procs below that sequence container.
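In an SSIS 2005 script task (VB.Net) that delay is just a sleep; the 10-second figure here is arbitrary:

Public Sub Main()
    ' Hold this branch back briefly so the critical proc gets scheduled first.
    System.Threading.Thread.Sleep(10000)
    Dts.TaskResult = Dts.Results.Success
End Sub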

Batch printing exception

I get this error while printing multiple .xps documents to a physical printer
Dim defaultPrintQueue As PrintQueue = GetForwardPrintQueue(My.Settings.SelectedPrinter)
Dim xpsPrintJob As PrintSystemJobInfo
xpsPrintJob = defaultPrintQueue.AddJob(JobName, Document, False)
Documents are spooled successfully until a print job exception occurs.
The InnerException is "Insufficient memory to continue the execution of the program."
The source is PresentationCore.dll.
Where should I start searching?
When attempting to perform tasks that may fail due to temporary or permanent restrictions on some resource, I tend to use a back-off strategy. This strategy has been followed on things as diverse as message queuing and socket opens.
The general process for such a strategy is as follows.
set maxdelay to 16   # maximum time period between attempts
set maxtries to 10   # maximum attempts
set delay to 0
set tries to 0

while more actions needed:
    if delay is not 0:
        sleep delay
    attempt action
    if action failed:
        add 1 to tries
        if tries is greater than maxtries:
            exit with permanent error
        if delay is 0:
            set delay to 1
        else:
            double delay
        if delay is greater than maxdelay:
            set delay to maxdelay
    else:
        set delay to 0
        set tries to 0
This allows the process to run at full speed in the vast majority of cases, but to back off when errors start occurring, hopefully giving the resource provider time to recover. The gradual increase in delays gives more serious resource shortages time to clear, and the maximum number of tries catches what you would class as permanent errors (or errors that are taking too long to recover from).
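Applied to the print-queue case above, a VB.Net sketch of that back-off loop might look like this. It is simplified to retrying a single job (the pseudocode resets the delay and retry counters and carries on with further work); the AddJobWithBackOff name and the constants are illustrative, and it assumes Imports System.Printing:

' Illustrative retry-with-back-off around the AddJob call from the question.
Const MaxDelaySeconds As Integer = 16
Const MaxTries As Integer = 10

Private Sub AddJobWithBackOff(ByVal queue As PrintQueue, ByVal jobName As String, ByVal documentPath As String)
    Dim delaySeconds As Integer = 0
    Dim tries As Integer = 0

    Do
        If delaySeconds > 0 Then System.Threading.Thread.Sleep(delaySeconds * 1000)
        Try
            queue.AddJob(jobName, documentPath, False)
            Exit Sub                        ' success: job spooled, nothing more to do
        Catch ex As PrintSystemException
            tries += 1
            If tries > MaxTries Then Throw  ' give up: treat it as a permanent error
            ' Back off: 1, 2, 4, ... seconds, capped at MaxDelaySeconds.
            delaySeconds = If(delaySeconds = 0, 1, Math.Min(delaySeconds * 2, MaxDelaySeconds))
        End Try
    Loop
End Sub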
I actually prefer this try-it-and-catch-failure approach to the check-if-okay-then-try one since the latter can still often fail if something changes between the check and the try. This is called the "better to seek forgiveness than ask permission" method, which also works quite well with bosses most of the time, and wives a little less often :-)
One particularly useful case was a program which opened a separate TCP session for each short-lived transaction. On older hardware, the closed sockets (those in the TCP TIME_WAIT state) eventually disappeared before they were needed again.
But, as the hardware got faster, we found that we could open sessions and do work much quicker and Windows was running out of TCP handles (even when increased to the max).
Rather than having to re-engineer the communications protocol to maintain sessions, this strategy was implemented to allow graceful recovery in the event handles were starved.
Granted it's a bit of a kludge but this was legacy software approaching end-of-life, where bug fixes are often just enough to get it working and it wasn't deemed strategic enough to warrant spending a lot of money in fixing it properly.
Update: It may be that there's a (more permanent) problem with PresentationCore. This KB article states that there's a memory leak in WPF within .NET 3.5SP1 (of which your print driver may be a client).
If the backoff strategy doesn't fix your problem (it may not if it's a leak in a long lived process), you might want to try applying the hotfix. Me, I'd replicate the problem in a virtual machine and then patch that to test it (but I'm an extreme paranoid).
It was found by googling "PresentationCore Insufficient memory to continue the execution of the program" and checking the first link here. Search for the string "hotfix that relates to this issue" on that page.
Before adding a new job to the queue you should check the queue state. See the PrintQueue.IsOutOfMemory property and the related properties that can be queried to verify that the queue is not in an error state.
Of course pax's hint to use a defensive strategy when accessing resources like printers is best practice. For starters, you may want to put the line adding the job into a Try block.
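A minimal sketch of those two suggestions combined, reusing the names from the question; the extra IsInError check and the handling comments are illustrative:

' Inside the routine that spools one document (JobName and Document as in the question).
Dim queue As PrintQueue = GetForwardPrintQueue(My.Settings.SelectedPrinter)

queue.Refresh()                      ' pick up the queue's current state
If queue.IsOutOfMemory OrElse queue.IsInError Then
    ' Illustrative handling: wait and retry later rather than adding another job now.
    Return
End If

Try
    Dim xpsPrintJob As PrintSystemJobInfo = queue.AddJob(JobName, Document, False)
Catch ex As PrintSystemException
    ' Illustrative handling: log it and fall back to the back-off loop sketched above.
End Try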
You might want to consider launching a new process to handle the printing of each document; the overhead should be low compared to the effort of printing the documents.