If, within a for loop (running, say, n times), I make a call to a library function that I know runs another loop internally, does it affect my overall complexity? Or does it remain O(n)?
It does affect your overall complexity. Imagine you were to make a call to the function you're writing from within another loop - you can't ignore a function's inherent runtime because it looks like a single statement.
Now, exactly HOW it affects your complexity depends on what you're doing with it, and what it does, but you certainly can't ignore it.
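To make this concrete, here is a minimal sketch. The helper `scan_steps` is a hypothetical stand-in for a library call that hides a linear scan, and it returns the number of steps it took so the cost can be counted. Calling it n times on a list of n items gives roughly n * n steps in total, so the outer loop is O(n^2) rather than O(n).

```python
def scan_steps(items, target):
    """Hypothetical stand-in for a library call that hides a linear scan.

    Returns the number of elements inspected while looking for target.
    """
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def count_steps(n):
    """Call the 'library' function once per outer-loop iteration."""
    items = list(range(n))
    total = 0
    for _ in range(n):
        # -1 is never in the list, so every call scans all n items.
        total += scan_steps(items, -1)
    return total


print(count_steps(10))  # 100
print(count_steps(20))  # 400: doubling n quadruples the work
```

If the library function ran in constant time instead, the same outer loop would stay O(n), which is why the answer depends on what the called function does.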
Related
When you have a function that accepts an array as an argument and calls another function with that array, which in turn calls another function with it, and so forth, the stack will contain many copies of the pointer to that array. I just thought of an interesting way to alleviate this problem, but I'm wondering whether or not it is worth implementing.
Does anyone have any idea how often stacks contain duplicate pointers in practice?
EDIT
Just to clarify, I am not optimizing a given program but, rather, am considering writing a new kind of optimization pass for my VM. My benchmarks have indicated that my current solution causes up to 70% of the total running time to be spent in stack manipulations. The optimization pass I am thinking of would generate code at compile time that would perform the same actions but pointers would (potentially) be duplicated on the stack less often. I am interested in any prior studies that have measured the number of duplicates on the stack because this would help me to quantify my optimization's potential. For example, if it is known that real programs do not push pointers already on the stack in practice then my optimization is worthless.
Moreover, these stack manipulations are due to the code generated by my VM making sure locally-held pointers are visible to the garbage collector and not due only to function parameters as both answerers have currently assumed. And they are actually operations on a shadow stack rather than the main stack.
First of all, the answer will depend on your application.
Secondly, even with high duplication, I doubt there is much sense in implementing the mechanism you describe, or even that it is possible in a general case. If you call a method and you pass it parameters, you must do it either one way or another.
There may be advantages to doing it in some specific way - for example there are several function calling conventions and many C/C++ compilers (e.g. gcc) let you choose between passing parameters on the stack or via registers. In certain cases, the latter may be faster - you can try and benchmark if it helps your application.
But in the general case, the cost of detecting duplicated values on the stack and "reusing" them would probably far exceed any gains from having a smaller stack. The code for pushing and popping values is really simple (just a few CPU instructions in an optimized case); the code for finding and reusing duplicates is hardly so. You would also have to somehow store the information about which values are already on the stack and how to find them, which requires a nontrivial data structure. Except for some really weird cases, I don't think this would be smaller than the actual copied data itself.
What you could do is rewrite your algorithm in such a way that some function calls are eliminated. For example, if your function's result depends only on the input arguments, you could cache or memoize the results, thus avoiding repeated calls with the same values. This may indeed bring some gains, though it's usually a memory vs CPU time tradeoff. Getting an advantage both in memory and in CPU time is rarely possible. Also, rewriting your algorithm is not really "avoiding duplication of data on the stack".
Anyway, for the original question, I think the idea is not viable and you should look for optimizations elsewhere.
PS: Your use case may somewhat resemble tail-call optimization, so perhaps that's a direction worth looking at - but if you implement it yourself, I would also consider this to fall into the "change your algorithm" category. Changing from a recursive algorithm to an iterative one might also help.
Can I suggest getting some exposure to actual performance tuning?
(Here's my canonical example.)
Between the time a program starts and the time it ends, of the cycles it uses, it obviously uses 100% of those cycles.
If it goes in and out of functions, and passes pointers to an array, but does nothing else, then it's no surprise that a high percentage of time goes into function entry and exit, and passing arguments.
If a program P is written to do task T, there are a multitude of other programs P' which could also do task T. Some of them take fewer cycles than all the others, and those are the optimal ones.
The way the optimal ones differ from the non-optimal ones is that the non-optimal ones are doing things that can be done without.
So, to optimize any program, find out what cycles are being spent that don't have to be, and get rid of those activities. That link shows in great detail how I do it.
Trying to pass fewer arguments to functions might or might not be necessary, depending on what your diagnostics tell you.
I have an algorithm that works perfectly, but it uses recursion. I know there are patterns for just about everything, but I could not find one for this case.
I just need some simple examples that show how to modify an algorithm, specifically the part where a method or function calls itself. I've seen iterative algorithms that do it with a while loop. So there must be a simple checklist to follow in order to convert a recursive algorithm into an iterative one.
You can definitely model recursion with iteration and a custom call stack. Since recursion is nothing but execution of same instructions in a new environment, you can model your own environment using a simple stack structure, and just wrap your algorithm in a loop, pushing your current mini-environment at the start of an iteration and popping it whenever you finish a loop iteration, or exit it prematurely via break or continue.
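A minimal sketch of this idea, assuming a made-up task (summing a nested list): the recursive version relies on the call stack, while the iterative version keeps its own stack of sub-lists still to be processed. The function names are illustrative only.

```python
def nested_sum_recursive(data):
    """Sum all numbers in an arbitrarily nested list using recursion."""
    total = 0
    for item in data:
        if isinstance(item, list):
            total += nested_sum_recursive(item)
        else:
            total += item
    return total


def nested_sum_iterative(data):
    """Same result, but an explicit stack replaces the call stack."""
    total = 0
    stack = [data]  # each entry is a sub-list still waiting to be processed
    while stack:
        current = stack.pop()
        for item in current:
            if isinstance(item, list):
                stack.append(item)  # defer the "recursive call"
            else:
                total += item
    return total


sample = [1, [2, 3, [4]], [[5]], 6]
print(nested_sum_recursive(sample))  # 21
print(nested_sum_iterative(sample))  # 21
```

This works directly because addition does not care about visiting order. When the result of a sub-call is needed before the caller can continue, the stack entries also have to record where to resume, which is where the conversion gets more involved.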
Tail-call recursion, where the recursion happens at the end of a function, is pretty trivial to make non-recursive with a loop. Some compilers even do that for you automatically.
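For the tail-recursive case, the rewrite is mechanical: the parameters become loop variables and the recursive call becomes a reassignment. Here is a sketch using Euclid's algorithm as the example (the function names are my own):

```python
def gcd_recursive(a, b):
    """Tail-recursive: the recursive call is the last thing the function does."""
    if b == 0:
        return a
    return gcd_recursive(b, a % b)


def gcd_loop(a, b):
    """Same algorithm with the tail call turned into a loop."""
    while b != 0:
        # Rebind the parameters exactly as the recursive call would.
        a, b = b, a % b
    return a


print(gcd_recursive(48, 18))  # 6
print(gcd_loop(48, 18))       # 6
```

Note that CPython does not eliminate tail calls itself, so in Python the loop form has to be written by hand.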
Converting any recursive function into an iterative one in a systematic way isn't so simple and in all likelihood you'd end up creating your own call stack, which most likely defeats the purpose of having a non-recursive algorithm anyway.
Also see: Can every recursion be converted into iteration?
BTW, if you are using GCC or LLVM, then your non-debug code with -O2 or -O3 turned on will perform tail recursion elimination for you. (In case you don't know, tail recursion is when the recursive call is the last thing in the function and its result is simply returned, i.e. it is not part of a larger expression. See http://en.wikipedia.org/wiki/Tail_call.)
So, if the recursive write up is clearer to read, it's probably better to stick with that.
This question concerns optimization. Suppose I need the length of an array a at two places in my code. Should I call the function a.length() in both places, or is it faster to assign the value of a.length() to a local variable and use that in both places?
By "faster" I mean in terms of running time. Moreover, I am talking asymptotically.
The asymptotic complexity of calling the function twice is the same - any constant number of calls to the same (pure) function on the same arguments has the same asymptotic complexity as a single call to that function, since you can just roll the constant number of calls into the big-O's hidden constant.
As for what will be faster, there's no guarantee which one will be faster. It depends on the language and compiler. I'd suggest just writing it both ways and timing the result to see if there's an appreciable difference. That said, if you are writing something that is so performance-critical that you can't afford to call .length() twice, you may need to reconsider your approach in general to see if there's a better global solution to the problem. Microoptimizations are rarely worth the effort unless you have a compelling reason to believe that your program is markedly slower in the unoptimized version.
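A rough sketch of "write it both ways and time it", using Python's standard `timeit` module and `len()` in place of `.length()`. The two function names are made up, and the timings will vary by machine and interpreter, so no expected numbers are given.

```python
import timeit

data = list(range(1000))


def call_twice(a):
    """Ask for the length at both places it is needed."""
    return len(a) + len(a)


def cache_once(a):
    """Ask once, keep the result in a local variable."""
    n = len(a)
    return n + n


# Both variants do a constant amount of work per call, so they are
# asymptotically identical; only the constant factor can differ.
t_twice = timeit.timeit(lambda: call_twice(data), number=100_000)
t_cache = timeit.timeit(lambda: cache_once(data), number=100_000)
print(f"call twice: {t_twice:.4f}s, cache once: {t_cache:.4f}s")
```

Whatever the measurement shows, it only applies to the language, compiler or interpreter, and data it was measured on.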
If you have to ask the question, you're not at a point where it matters yet. If you were, you'd already have code that you've profiled, and you could just try it and see. This kind of thing depends heavily on your language and compiler, and the only results that matter are the ones you see.
Don't worry about micro-optimizations until you find you need to shave cycles, and even then the algorithm is the first thing to check.
What language? In many languages, such calls are optimized away (either at compile time or by a JIT compiler) into direct access to the length field of the array object.
I'd rather not use code since it's a common concept:
Say we have a function which is neither too big nor too small and which can't easily be optimized internally with OpenMP for-loop optimizations.
However, it is a function which is called millions of times throughout the project's run, in a few hundred unrelated places in the code.
[Marking it inline doesn't seem to do much on its own (inlining is already on by default in optimized gcc builds), and turning it into a macro, which wouldn't make it parallel either, would be quite an undertaking to keep compatible.]
OpenMP is for "making things run in parallel" - in general. Not only for loops... Well, you don't even need to have any loops at all to make some good use of OpenMP and speed up your code.
The only thing which matters is: "do I have several independent operations which run one after another, and which could run at the same time instead?". If so, then you've found an easy spot for optimization with OpenMP.
When the function is called, is it called multiple times, particularly in a loop? The question is a little vague -- maybe yes (it's called thousands of times in each of a few hundred unrelated places -> millions) or maybe no (it's called once in each of a hundred unrelated places, and you hit those sections of code thousands of times -> millions).
In the first case, then yes, parallelizing the `map' -- that is, applying the function independently to a bunch of cases -- is easy and works very well with OpenMP.
In the second case, if the function is called a million times but each time once, then no. There's repetition of execution there, but no exposed concurrency; there's no list of tasks that have to be done at the same time that can be done independently. All that you can do there, if the function is likely to be called with repeated parameters, is to use memoization, which is a memory/compute time tradeoff, not a parallelization technique.
In the second case, it may be that you can restructure the code so that a bunch of those function calls are made at once, thus exposing the concurrency and allowing parallelization -- but it's not something that OpenMP (or any parallel programming model) can automatically do for you.
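The shape of that restructuring can be sketched as follows. This is not OpenMP: it uses Python's standard `concurrent.futures` as a stand-in, and `work` is a placeholder for the real function. The point is only the difference between calls issued one at a time and calls collected into a batch that can be mapped over.

```python
from concurrent.futures import ThreadPoolExecutor


def work(x):
    """Placeholder for the real function; each call is independent of the others."""
    return x * x


inputs = list(range(8))

# Calls issued one after another: nothing for a parallel runtime to exploit.
sequential = [work(x) for x in inputs]

# The same calls gathered into one batch: the independence is now explicit,
# so the runtime is free to hand them to several workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(work, inputs))  # map preserves input order

print(parallel == sequential)  # True
```

In CPython, threads will not speed up CPU-bound pure-Python work because of the global interpreter lock, so treat this as an illustration of the structure rather than a performance recipe. In C or C++ the batched form is what an OpenMP parallel for would be applied to.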
I am using HP Exstream (formerly Dialogue from Exstream Software) version 5.0.x. It has a feature to define and save boolean expressions as "Rules".
It has been about 6 years since I used this, but does anybody know if you can define a rule in terms of another rule? There is a "VB-like" language in a popup window, so you are not forced to use the and/or, variable-relational expression form, but I don't have documentation handy. :-(
I would like to define a rule, "NotFoo", in terms of "Foo", instead of repeating the inverse of the whole thing. (Yes, that would be retarded, but that's probably what I will be forced to do, as in other examples of what I am maintaining.) Actually, nested rules would have many uses, if I can figure out how to do it.
I later found that what one needs to do in this case is create user defined "functions", which can reference each other (so long as you avoid indirect recursion). Then, use the functions to define the "rules" (and, don't even bother with "library" rules instead of "inline" rules, most of the time).
I'm late to the question but since you had to answer yourself there is a better way to handle it.
The issue with using functions and testing the result is that there's a good chance that you're going to be adding unnecessary processing because the engine will run through the function every time it's called. Not a big issue with a simple function but it can easily become a problem if the function is complex, especially if it's called in several places.
Depending on the timing of the function (you didn't say whether it was run level, customer level, or specific to particular documents), it's often better to have the function set a User Boolean variable to store the result. Then, in your library rules, you can just check the value of the variable without having to run through the function every time.