In my app I have a worker thread which sits around doing a lot of processing. While it's processing, it sends updates to the main thread which uses the information to update GUI elements. This is done with performSelectorOnMainThread. For simplicity in the code, there are no restrictions on these updates and they get sent at a high rate (hundreds or thousands per second), and waitUntilDone is false. The methods called simply take the variable and copy it to a private member of the view controller. Some of them update the GUI directly (because I'm lazy!). Once every few seconds, the worker thread calls performSelectorOnMainThread with waitUntilDone set to true (this is related to saving the output of the current calculation batch).
My question: is this a safe use of performSelectorOnMainThread? I ask because I recently encountered a problem where my displayed values stopped updating, even though the background thread continued to work without issues (and produced the correct output). Since the displayed values are fed this way, I wondered whether it might have hit some limit on the number of messages. I already checked the usual suspects (overflows, leaks, etc.) and everything's clean. I haven't been able to reproduce the problem, however.
For simplicity in the code, there are no restrictions on these updates and they get sent at a high rate (hundreds or thousands per second), and waitUntilDone is false.
Yeah. Don't do that. Not even for the sake of laziness in an internal-only application.
It can cause all kinds of potential problems beyond making the main run loop unresponsive.
Foremost, it will starve your worker thread of CPU cycles, as your main thread constantly spins trying to update the UI as rapidly as the messages arrive. Given that drawing is often done in a secondary thread, this will likely cause yet more thread contention, slowing things down even more.
Secondly, all those messages consume resources. Potentially lots of them and potentially ones that are relatively scarce, depending on implementation details.
While there shouldn't be a limit, there may well be a practical one beyond which things stop working. If that is the case, it would be a bug in the system, but one that is unlikely to be fixed beyond a console log that says "Too many messages, too fast; make fewer."
It may also be a bug in your code, though. Transfer of state between threads is an area rife with pitfalls. Are you sure your cross-thread-communication code is bulletproof? (And, if it is bulletproof, it quite likely carries a huge performance cost at thousands of update notifications per second.)
It isn't hard to throttle updates. While the suggestions in the comments are all reasonable, it can be done much more easily (NSNotificationQueue is fantastic, but likely overkill unless you are updating the main thread from many different places in your computation); see the sketch after this list:
Create an NSDate whenever you notify the main thread and store the date in an ivar.
The next time you go to notify the main thread, check whether more than N seconds have passed.
If they have, send the notification and update your ivar.
[bonus performance] If all that date comparison is too expensive, consider revisiting your algorithm to move the "update now" trigger somewhere less frequent. Barring that, create an int ivar counter and only check the date every N iterations.
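A minimal sketch of that throttle, assuming Swift and GCD in place of performSelectorOnMainThread; the class name, the report method, and the one-second threshold are all illustrative, not from the question:

```swift
import Foundation

// Sketch of the date-based throttle described above. The worker thread
// calls report(_:) freely; the GUI is only notified once per interval.
final class ProgressReporter {
    private var lastNotified = Date.distantPast    // the "date ivar"
    private let minimumInterval: TimeInterval = 1.0

    // Called from the worker thread for every computed value.
    func report(_ value: Double) {
        let now = Date()
        // Has more than N seconds passed since the last notification?
        guard now.timeIntervalSince(lastNotified) >= minimumInterval else {
            return                                 // too soon; drop this update
        }
        lastNotified = now                         // update the ivar
        DispatchQueue.main.async {                 // like waitUntilDone:NO
            // ...copy `value` into the view controller and refresh the GUI...
        }
    }
}
```

The worker thread stays busy, and at most one main-thread message per second survives, regardless of how fast values are produced.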
Related
I know the fundamentals of runtime code execution in Flash Player and the AIR Debugger, but I want to know how Timer code execution is done.
Would it be better to use Timer rather than the enterFrame event for similar actions? Which one is better for performance?
It depends on what you want to use it for. Most will vehemently say to use Event.ENTER_FRAME, and in most instances that is what you want: it is only called once, each time a frame begins construction. If your app is running at 24fps (the default), that code will run once every 41.7ms, assuming no dropped frames. For almost all GUI-related cases, you do not want to run code more often than this because it is entirely unnecessary (you can run it more often, sure, but it won't do any good since the screen is only updated that often).
There are times when you need code to run more often, however, mostly in non-GUI-related cases. This can range from a systems check that needs to happen in the background to something that needs to be absolutely precise, such as an object that needs to be displayed/updated on an exact interval (Timer is accurate to the millisecond, I believe, whereas ENTER_FRAME is only accurate to within 1000/framerate ms).
In the end, it doesn't make much sense to use a Timer that fires less often than ENTER_FRAME would be called; any more often than that and you risk dropping frames. ENTER_FRAME is ideal for nearly everything graphics-related, other than making something appear/update at a precise time. And even then, you should use ENTER_FRAME as well, since the change would only be rendered in the next frame anyway.
You need to evaluate each situation on a case-by-case basis and determine which is best for that particular situation because there is no best answer for all cases.
EDIT
I threw together a quick test app to see when Timer fires. Framerate is 24fps (42ms) and I set the timer to run every 10ms. Here is a selection of times it ran at.
2121
2137
2154
2171
2188
2203
2221
2237
As you can see, it is running every 15-18ms instead of the 10ms I wanted it to. I also tested 20ms, 100ms, 200ms, and 1000ms. In the end, each one fired within about 10ms of when it should have. So this is not nearly as precise as I had originally thought it was.
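Out of curiosity, the same kind of drift is easy to observe outside Flash. Here is a rough equivalent of the test in Swift (Foundation's Timer; the 10ms interval and the one-second sampling window are my own choices):

```swift
import Foundation

// Ask for a 10 ms repeating Timer and log the real gap between firings.
var last = Date()
let timer = Timer.scheduledTimer(withTimeInterval: 0.010, repeats: true) { _ in
    let now = Date()
    print(String(format: "%.1f ms since last fire", now.timeIntervalSince(last) * 1000))
    last = now
}
// Keep the run loop alive long enough to collect samples, then stop.
RunLoop.current.run(until: Date(timeIntervalSinceNow: 1.0))
timer.invalidate()
```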
I have instrumented my application with "stop watches". There is usually one such stop watch per (significant) function. These stop watches measure real time, thread time (and process time, but process time seems less useful), and call count. I can obviously sort the individual stop watches using any of the four values as a key. However, that is not always useful, and it requires me to, e.g., disregard top-level functions when looking for optimization opportunities, since top-level functions/stop watches measure pretty much all of the application's run time.
I wonder if there is any research regarding any kind of score or heuristic that would point out the functions/stop watches that are worth looking at and optimizing?
The goal is to find code worth optimizing, and that's good, but
the question presupposes what many people think, which is that they are looking for "slow methods".
However, there are other ways for programs to spend time unnecessarily than by having particular methods that are recognizably in need of optimizing.
What's more, you can't ignore them, because however much time they take will become a larger and larger fraction of the total if you find and fix other problems.
In my experience performance tuning, measuring time can tell if what you fixed helped, but it is not much use for telling you what to fix.
For example, there are many questions on SO from people trying to make sense of profiler output.
The method I rely on is outlined here.
I have a VB.net application with an Access database containing one table of about 2,800,000 records; each row is updated with new data daily. The machine has 64GB of RAM and an i7-3960X overclocked to 4.9GHz.
Note: data sources are local.
I wonder: if I use ~10 threads, will the data update faster?
If it is possible, what would be the mechanism for dividing this big loop among multiple threads?
Update: Sometimes the loop has to repeat the calculation for some rows depending on the results; also, the loop has exactly 63 conditions and is 242 lines of code.
Microsoft Access is not particularly good at handling many concurrent updates, compared to other database platforms.
The more your tasks need to do calculations, the more you will typically benefit from concurrency / threading. If you spin up 10 threads that do little more than send update commands to Access, it is unlikely to be much faster than it is with just one thread.
If you have to do any significant calculations between reading and writing data, threads may show a performance improvement.
I would suggest trying the following and measuring the result:
One thread to read data from Access
One thread to perform whatever calculations are needed on the data you read
One thread to update Access
You can implement this using a Producer / Consumer pattern, which is pretty easy to do with a BlockingCollection.
The nice thing about the Producer / Consumer pattern is that you can add more producer and/or consumer threads with minimal code changes to find the sweet spot.
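In VB.net the BlockingCollection does the heavy lifting for you. To show the shape of the pipeline itself, here is a sketch in Swift (the same language as the other sketches in this collection) with a hand-rolled bounded blocking queue; all names, capacities, and the stand-in workload are my own:

```swift
import Foundation

// A minimal bounded blocking queue (the role BlockingCollection plays in .NET).
final class BlockingQueue<T> {
    private var items: [T] = []
    private let lock = NSLock()
    private let freeSlots: DispatchSemaphore       // remaining capacity
    private let filledSlots = DispatchSemaphore(value: 0)

    init(capacity: Int) { freeSlots = DispatchSemaphore(value: capacity) }

    func put(_ item: T) {
        freeSlots.wait()                           // block while the queue is full
        lock.lock(); items.append(item); lock.unlock()
        filledSlots.signal()
    }

    func take() -> T {
        filledSlots.wait()                         // block while the queue is empty
        lock.lock(); let item = items.removeFirst(); lock.unlock()
        freeSlots.signal()
        return item
    }
}

// Pipeline: reader -> calculator -> writer, one thread each.
let rowCount = 1_000
let toCalculate = BlockingQueue<Int>(capacity: 100)
let toWrite = BlockingQueue<Int>(capacity: 100)
let group = DispatchGroup()

DispatchQueue.global().async(group: group) {       // producer: "read from Access"
    for row in 0..<rowCount { toCalculate.put(row) }
}
DispatchQueue.global().async(group: group) {       // worker: the calculations
    for _ in 0..<rowCount { toWrite.put(toCalculate.take() * 2) }
}
DispatchQueue.global().async(group: group) {       // consumer: "update Access"
    for _ in 0..<rowCount { _ = toWrite.take() }
}
group.wait()
```

Because each stage only talks to its neighbors through a queue, adding more calculator threads later to find the sweet spot mentioned above requires only minimal changes.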
Supplemental Thought
IO is probably the bottleneck of your application. Consider placing the Access file on faster storage if you can (SSD, RAID, or even a RAM disk).
Well if you're updating 2,800,000 records with 2,800,000 queries, it will definitely be slow.
Generally, it's good to avoid opening multiple connections to update your data.
You might want to show us some code of how you're currently doing it, so we could tell you what to change.
So I don't think (with the information you gave) that going multi-threaded would be faster for this. Now, if you're thinking about going multi-threaded because the update freezes your GUI, that's another story.
If the processing is slow, I personally don't think it's due to your server's specs. I'd guess it's more about the logic you used to update the data.
Don't wonder, test. Write it so you can dispatch as many threads as you like to do the work, and test it with various numbers of threads. What does the loop you are talking about look like?
With questions like "if I add more threads, will it work faster?" it is always best to test, though there are rules of thumb. If the DB is local, chances are that Oded is right.
I am reading the Concurrency Programming Guide on the iOS dev site.
When I got to the section "Moving away from threads", Apple said:
Although threads have been around for many years and continue to have their uses, they do not solve the general problem of executing multiple tasks in a scalable way. With threads, the burden of creating a scalable solution rests squarely on the shoulders of you, the developer. You have to decide how many threads to create and adjust that number dynamically as system conditions change. Another problem is that your application assumes most of the costs associated with creating and maintaining any threads it uses.
From my previous learning, the OS takes care of process/thread management, and the programmer just creates and destroys threads as desired.
Is that wrong?
No, it is not wrong. What it is saying is that when you program with threads, most of the time you dynamically create threads based on conditions that you, the programmer, place in your code. For example, finding prime numbers can be split up with threads, but the creation and destruction of those threads is done by the programmer. You are completely correct; it is just saying what you are saying in a more descriptive and elaborate way.
Oh, and as for thread management: sometimes, if the developer sees that a large number of threads will be needed most of the time, it is cheaper to spawn a pool of threads up front and reuse those, as in the sketch below.
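A small sketch of the pool idea, reusing the prime-number example above (OperationQueue is my choice of illustration here, not something the question specifies):

```swift
import Foundation

// A fixed-size "pool": at most 4 operations run at once, and the
// underlying threads are reused rather than created per task.
let pool = OperationQueue()
pool.maxConcurrentOperationCount = 4

for n in 2...1_000 {
    pool.addOperation {
        // Naive primality check, standing in for real work.
        let isPrime = !(2..<n).contains { n % $0 == 0 }
        if isPrime { print(n) }
    }
}
pool.waitUntilAllOperationsAreFinished()
```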
Say you have 100 tasks to perform, all using independent--for the duration of the task--data. Every thread you start costs quite a bit of overhead. So if you have two cores, you only want to start two threads, because that's all that's going to run anyway. Then you have to feed tasks to each of those threads to keep them both running. If you have 100 cores, you'll launch 100 threads. It's worth the overhead to get the job done 50 times faster.
So in old-fashioned programming, you have to do two jobs. You have to find out how many cores you have, and you have to feed tasks to each of your threads so they keep running and don't waste cores. (This becomes only one job if you have >= 100 cores.)
I believe Apple is offering to take over these two awkward jobs for you.
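For example, with GCD (a sketch of the direction Apple's guide points in; the task body and counts are invented), you hand both jobs to the system:

```swift
import Foundation

// Submit all 100 independent tasks; GCD decides how many threads to run
// based on core count and current load, and feeds tasks to them for you.
let group = DispatchGroup()
for task in 0..<100 {
    DispatchQueue.global(qos: .userInitiated).async(group: group) {
        _ = task * task        // stand-in for the real independent work
    }
}
group.wait()                   // block until all 100 tasks have finished
print("all tasks done")
```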
If your jobs share data, that changes things. With two threads running, one can block the other, and even on a 2-core machine it pays to have three or more threads running. You are apt to find letting 100 threads loose at once makes sense because it improves the chances that at least two of them are not blocked. It prevents one blocked task from holding up the rest of the tasks in its thread. You pay a price in thread overhead, but get it back in high CPU usage.
So this feature is sometimes very useful and sometimes not. It helps with parallel programming, but would hinder with non-parallel concurrency (multithreading).
In an Apple paper about object-oriented programming, it depicts objects sending messages to each other. So an Appliance can send a message to a Valve requesting water, and the Valve object can then send a message back to the Appliance, "giving the water".
(To send a message is actually to call a method of the other object.)
So I wonder: won't this cause a subtle infinite loop in some way that even the programmer did not anticipate? For example, suppose we program two objects, each pulling the other by gravity. One sends a message to the other object saying there is a "pull" force; the other object's method gets called and in turn sends a message back to the first object, and they go into an infinite loop. If the computer program has only 1 process or 1 thread, it will simply loop forever and never run anything else in that program (even if the two objects finally collide, they still continue to pull each other). How does this programming paradigm work in reality to prevent this?
Update: this is the Apple paper: http://developer.apple.com/library/mac/documentation/cocoa/conceptual/OOP_ObjC/OOP_ObjC.pdf
Update: for all the people who just look at this obvious example and say "You are wrong! Programs should be finite, blah blah blah": what I am aiming at is, what if there are hundreds or thousands of objects, and they send each other messages, and on getting a message they might in turn send other messages to other objects? Then how can you be sure there can't be an infinite loop, so that the program cannot go any further?
On the other hand, for the people who said "a program must be finite": what about a simple GUI program? It has an event loop, which is an infinite loop, running until the user explicitly asks the program to stop. And what about a program that keeps looking for prime numbers? It can keep looking (with BigNum, such as in Ruby, so an integer can have any number of digits), writing every millionth prime it finds to the hard disk (just that one number, not all million of them). For a computer with 12GB of RAM and 2TB of hard drive, maybe it takes 20 years for the program to exceed the machine's capacity, when the disk is full or the 12GB of RAM cannot hold all the variables (it might take billions of years before an integer cannot fit in 1GB of RAM); only then is an exception raised and the program forced to stop. But as far as the program is concerned, it just keeps running; it is written to run indefinitely. So not all programs HAVE TO BE written to be finite.
Why should Appliance request for water repeatedly?
Why should Valve bombard Appliance saying that water is being provided?
In theory it is likely to create an infinite loop, but in practice it comes down to proper modeling of your objects.
The Appliance should send the ICanHasWater message only once, wait for a response, and then either receive water or receive an answer that water cannot be provided now but may be in the future, at which point the Appliance might try requesting water again.
That's why I went into the two-objects-and-gravity example instead.
An infinite loop of gravity-effect calculations between objects would happen only if you triggered a calculation from within a calculation.
I think the common approach is to introduce a concept of time and calculate gravitation for a particular time frame, then move on to the next frame for the next round of calculation. That way your World keeps control of the thread between time frames, and your application can do something more useful than endlessly calculating gravity effects.
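A toy sketch of that time-stepped approach, assuming a one-dimensional world with two bodies (the constants, step size, and step count are invented for illustration):

```swift
import Foundation

struct Body { var position: Double; var velocity: Double; var mass: Double }

var bodies = [Body(position: 0, velocity: 0, mass: 5),
              Body(position: 10, velocity: 0, mass: 5)]
let g = 6.674e-11          // gravitational constant
let dt = 0.1               // seconds per time frame

for _ in 0..<100 {         // the World drives the frames, not the objects
    let r = bodies[1].position - bodies[0].position
    let force = g * bodies[0].mass * bodies[1].mass / (r * r)
    bodies[0].velocity += force / bodies[0].mass * dt   // pulled toward body 1
    bodies[1].velocity -= force / bodies[1].mass * dt   // pulled toward body 0
    for i in bodies.indices { bodies[i].position += bodies[i].velocity * dt }
    // Between frames the thread is free for other work (UI, input, ...).
}
```

No object ever triggers a calculation from inside another calculation, so there is nothing to recurse.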
Without OOP it is just as easy to create infinite loops unintentionally, whether in imperative or functional programming languages. Thus I cannot see what is special about OOP in this case.
If you think of your objects as actors sending each other messages, it's not necessarily wrong to go into an infinite loop. GUI toolkits work this way. Depending on the programming language used, this is made obvious by a call to toolKit.mainLoop() or the like.
I think that even your example of modelling gravity by objects pulling at each other is not wrong per se. You have to ensure that something happens as a result of the message (i.e., the object being accelerated and moving a little), and you will get a rough discretization of the underlying formulae. You will want to check for collisions nevertheless, to make your model more complete :-)
Using this model requires some level of concurrency in your program to ensure that messages are processed in proper order.
In real-life implementations there's no infinite loop; there's infinite indirect recursion instead - A() calls B(), B() calls C(), and on some branch C() calls A() again. In your example, if Appliance sends GetWater, Valve sends HeresYourWaterSir immediately, and Appliance's handler for HeresYourWaterSir for whatever reason sends GetWater again - infinite indirect recursion will begin.
So yes, you're right, in some cases problems can happen. The OOP paradigm itself doesn't protect against that - it's up to the developer.
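One common developer-side protection is an explicit guard against re-entry. A minimal sketch, reusing the GetWater/HeresYourWaterSir names from this answer (the requestPending flag and the synchronous call structure are my own illustration):

```swift
final class Valve {
    weak var appliance: Appliance?
    func getWater() {
        appliance?.heresYourWaterSir()         // replies synchronously
    }
}

final class Appliance {
    let valve: Valve
    private var requestPending = false         // the re-entry guard

    init(valve: Valve) {
        self.valve = valve
        valve.appliance = self
    }

    func requestWater() {
        guard !requestPending else { return }  // break the cycle here
        requestPending = true
        valve.getWater()
        requestPending = false
    }

    func heresYourWaterSir() {
        // Suppose a bug makes this handler request water again: the guard
        // above turns would-be unbounded recursion into a no-op.
        requestWater()
    }
}

let appliance = Appliance(valve: Valve())
appliance.requestWater()   // terminates instead of recursing forever
```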