If statement processing speed - optimization

This is simply something to ease my curiosity, but if someone feels like answering it, that would be fantastic.
With if statements, is the time taken to calculate the result affected by the way it's written?
So what I mean is (if that wasn't entirely clear): would the following two statements take the same amount of time to process?
If 1 < 2 And 3 = 3 Then
    ' do something
End If
compared to
If 1 < 2 Then
    If 3 = 3 Then
        ' do something
    End If
End If

Assuming the compiler does not optimize either version, the second one will require two branch instructions instead of one, and branching means some extra work for the CPU because of pipelining. So, technically, the second version requires slightly more work, but it should not matter here.
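As an aside, in VB.NET specifically the two forms can be made strictly equivalent by using the short-circuiting AndAlso operator (a small illustrative sketch, not part of the original question: plain And always evaluates both operands, while AndAlso stops as soon as the first one is False, which is exactly what the nested If does):

' Behaves like the nested version: the second comparison only runs if the first passes.
If 1 < 2 AndAlso 3 = 3 Then
    ' do something
End If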

This is just another case of premature optimization. You are not going to gain anything by thinking a lot about this.
What you should be focusing on is how to make your code more readable.

Related

Optimization of Lisp function calls

If my code is calling a function, and one of the function's arguments will vary based on a certain condition, is it more efficient to have the conditional statement as an argument of the function, or to call the function multiple times inside the conditional statement?
Example:
(if condition (+ 4 3) (+ 5 3))
(+ (if condition 4 5) 3)
Obviously this is just an example: in the real scenario the numbers would be replaced by long, complex expressions full of variables. The if might instead be a long cond statement.
Which would be more efficient in terms of speed, space etc?
Don't
What you care about is not performance (in this case the difference will be trivial) but code readability.
Remember,
"... a computer language is not just a way of getting a computer to
perform operations, but rather ... it is a novel formal medium for
expressing ideas about methodology"
Abelson/Sussman "Structure and
Interpretation of Computer Programs".
You are writing code primarily for others (and you yourself next year!) to read. The fact that the computer can execute it is a welcome fringe benefit.
(I am exaggerating, of course, but much less than you think).
Okay...
Now that you skipped the harangue (if you claim you did not, close your eyes and tell me which specific language I mention above), let me try to answer your question.
If you profiled your program and found that this place is the bottleneck, you should first make sure that you are using the right algorithm.
E.g., using a linearithmic sort (merge/heap) instead of a quadratic one (bubble/insertion) will make a much bigger difference than micro-optimizations like the one you are contemplating.
Then you should disassemble both versions of your code; the shorter version is (ceteris paribus) likely to be marginally faster.
Finally, you can waste a couple of hours of machine time repeatedly running both versions on the same input on an otherwise idle box to discover that there is no statistically significant difference between the two approaches.
I agree with everything in sds's answer (except using a trick question -_-), but I think it might be nice to add an example. The code you've given doesn't have enough context to be transparent. Why 5? Why 4? Why 3? When should each be used? Should there always be only two options? The code you've got now is sort of like:
(defun compute-cost (fixed-cost transaction-type)
  (+ fixed-cost
     (if (eq transaction-type 'discount)   ; hardcoded magic numbers
         3                                 ; and conditions are brittle
         4)))
Remember, if you need these magic numbers (3 and 4) here, you might need them elsewhere. If you ever have to change them, you'll have to hope you don't miss any cases. It's not fun. Instead, you might do something like this:
(defun compute-cost (fixed-cost transaction-type)
  (+ fixed-cost
     (variable-cost transaction-type)))

(defun variable-cost (transaction-type)
  (case transaction-type
    ((employee) 2)   ; oh, an extra case we'd forgotten about!
    ((discount) 3)
    (t 4)))
Now there's an extra function call, it's true, but computation of the magic addend is pulled out into its own component, and can be reused by anything that needs it, and can be updated without changing any other code.

VB.Net - Recursion or Iteration

First post here, because I've got a problem that's got me stumped. I am creating a calculation tool for a project at uni; I'm not doing an IT degree, but until now this project hasn't thrown up a problem too difficult for me.
Basically I am doing design work for a building, where each floor can be allocated one of 9 possible designs. By allocating a design to each floor and then calculating the costs, I am trying to find the most efficient combination.
Normally I would just use some nested loops to find the best design, no problem, but the calculation tool has to cope with a varying number of floors, and therefore a varying number of nested loops, which I am unfamiliar with how to handle.
The general structure is this:
1 to X floors.
9 possible designs for each floor.
For each combination a cost must be calculated, and if it is within the best 5 results it will be stored.
There is a total of 9^X possible solutions.
So
Floor = 1
For Solution = 1 To 9
    Floor = 2
    For Solution = 1 To 9
        CalculateCost()
        If CalculateCost < Best Then
            Write Floor1 Solution Value, Floor2 Solution Value to Output
Etc...
Now, I am using VB.NET and do not really know how to do recursion. If someone could simply point me towards a resource that might help me with this issue I would be very grateful.
Edit - Whilst I have tried to simplify, the cost of implementing a design changes based on various other factors, so I can't simply take the cheapest design for every floor. I have tried to solve this through theory and so far found no solution, therefore the brute-force method is required.
You're already using For loops to try each solution for a given floor, but you're manually stepping through your Floor variable, and it sounds like you have independently declared variables along the lines of Floor1 Solution Value, Floor2 Solution Value, and so on. Given all of this, I think what you really need is a dynamic array. Here's some rough code using this approach:
Dim FloorSolutionValues() As Byte ' best design (1-9) found for each floor
Dim FloorBestCosts() As Double    ' cheapest cost seen so far for each floor
Dim NumberOfFloors As Integer     ' get this from the user
ReDim FloorSolutionValues(NumberOfFloors - 1)
ReDim FloorBestCosts(NumberOfFloors - 1)

For CurrentFloor As Integer = 0 To NumberOfFloors - 1
    FloorBestCosts(CurrentFloor) = Double.MaxValue
    For CurrentSolution As Byte = 1 To 9
        Dim Cost As Double = CalculateCost()
        If Cost < FloorBestCosts(CurrentFloor) Then
            FloorBestCosts(CurrentFloor) = Cost
            FloorSolutionValues(CurrentFloor) = CurrentSolution
        End If
    Next
Next
I'm making some assumptions that may or may not be true, but this should get you on the right path.
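Since the question specifically asked about recursion, here is a rough sketch of that alternative (my own illustration, not tested against your real data: CalculateCost below is a placeholder that just sums the choices, so substitute your real cost calculation). One recursive routine replaces the variable number of nested loops: each level of recursion assigns one of the 9 designs to one floor, and once every floor has a design the whole combination is evaluated.

Module FloorSearch
    Private BestCost As Double = Double.MaxValue
    Private BestChoices() As Byte

    ' Placeholder cost function: replace with the real calculation.
    Private Function CalculateCost(ByVal choices() As Byte) As Double
        Dim total As Double = 0
        For Each c As Byte In choices
            total += c
        Next
        Return total
    End Function

    ' Assign a design (1-9) to this floor, then recurse into the next floor.
    ' When every floor has a design, evaluate the whole combination.
    Private Sub TryFloor(ByVal floor As Integer, ByVal choices() As Byte)
        If floor = choices.Length Then
            Dim cost As Double = CalculateCost(choices)
            If cost < BestCost Then
                BestCost = cost
                BestChoices = CType(choices.Clone(), Byte())
            End If
            Return
        End If
        For solution As Byte = 1 To 9
            choices(floor) = solution
            TryFloor(floor + 1, choices) ' one recursion level per floor replaces one nested loop
        Next
    End Sub

    Sub Main()
        Dim numberOfFloors As Integer = 3 ' get this from the user
        Dim choices(numberOfFloors - 1) As Byte
        TryFloor(0, choices)
        Console.WriteLine("Best cost: {0}", BestCost)
        Console.WriteLine("Designs:   {0}", String.Join(", ", BestChoices))
    End Sub
End Module

This keeps only the single cheapest combination; keeping the best 5 is a matter of storing a small sorted list instead of one value. Also bear in mind that 9^X grows very quickly, so for more than a handful of floors you will want to prune combinations rather than brute-force them all.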

Way to optimise a mapping on informatica

I would like to optimise a mapping developed by one of my colleagues, where the "loading part" (into a flat file) is really, really slow: about 12 rows per second.
Currently it takes about 2 hours to get to the point where I start writing to the file, so I would like to know where I should look first; otherwise I will need at least 2 hours between each improvement, which is not really efficient.
OK, so to describe simply what is done:
Oracle table (with a big query inside - takes about 2 hours to return a result)
SQ
2 Lookups on reference tables (should not be heavy)
Update Strategy
1 transformation
2 Lookups (on big tables - that should be one obvious optimisation point I guess: change them to Joiners)
6 stored procedures (these also seem a bit heavy, what do you think?)
another transformation
load into the flat file
Can you confirm that either the Lookups or the stored procedures could be the reason why it is so slow?
Do you think I should look somewhere else to optimise? I was also thinking of maybe using only 1 transformation.
First check the logs carefully and look at the timestamps; they should give you an initial idea of which part causes the delay.
Lookups on big tables are not recommended. Joiners are a better way, but they still need to cache data. Can you limit the data that gets cached, perhaps? It's very hard to advise without seeing the mapping.
Which leads us to the Stored Procedures: it's simply impossible to tell anything about them just like that.
So: first collect the stats and do the log analysis. Next, read some tuning guides on the net - there are plenty. The one below is comprehensive but, well, large - so you might like to look for some shorter ones as well.
Powercenter Performance Tuning Guide

Memory efficiency in If statements

I'm thinking more about how much system memory my programs will use nowadays. I'm currently doing A level Computing at college and I know that in most programs the difference will be negligible but I'm wondering if the following actually makes any difference, in any language.
Say I wanted to output "True" or "False" depending on whether a condition is true. Personally, I prefer to do something like this:
Dim result As String
If condition Then
    result = "True"
Else
    result = "False"
End If
Console.WriteLine(result)
However, I'm wondering if the following would consume less memory, etc.:
If condition Then
    Console.WriteLine("True")
Else
    Console.WriteLine("False")
End If
Obviously this is a very much simplified example and in most of my cases there is much more to be outputted, and I realise that in most commercial programs these kind of statements are rare, but hopefully you get the principle.
I'm focusing on VB.NET here because that is the language used for the course, but really I would be interested to know how this differs in different programming languages.
The main issue making ifs fast or slow is predictability.
Modern CPUs (anything after 2000) use a mechanism called branch prediction.
It is worth reading up on branch prediction first, then read on below...
Which is faster?
The if statement constitutes a branch, because the CPU needs to decide whether to follow or skip the if part.
If it guesses the branch correctly the jump will execute in 0 or 1 cycles (about 1 nanosecond on a 1 GHz CPU).
If it does not guess the branch correctly the jump will take around 50 cycles (give or take), roughly 1/20th of a microsecond.
Therefore to even feel these differences as a human, you'd need to execute the if statement many millions of times.
The two statements above are likely to execute in exactly the same amount of time, because:
assigning a value to a variable takes negligible time; on average less than a single CPU cycle on a superscalar CPU*.
calling a function with a constant parameter requires the use of an invisible temporary variable; so in all likelihood code A compiles to almost the exact same object code as code B.
*) All current CPUs are superscalar.
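If you want to see the misprediction penalty for yourself, the rough benchmark below is one way to do it (the 128 threshold, array size and names are my own illustrative assumptions, not anything from the question): it times the same If over sorted data, where the branch is highly predictable, and over shuffled data, where it is not.

Imports System
Imports System.Diagnostics

Module BranchPredictionDemo
    Sub Main()
        Dim rnd As New Random(42)
        Dim data(9999999) As Integer ' ten million elements
        For i As Integer = 0 To data.Length - 1
            data(i) = rnd.Next(0, 256)
        Next

        Array.Sort(data)             ' the branch below becomes highly predictable
        Console.WriteLine("sorted:   {0} ms", TimeSum(data))

        Shuffle(data, rnd)           ' the branch below becomes unpredictable
        Console.WriteLine("shuffled: {0} ms", TimeSum(data))
    End Sub

    ' Sums only the "large" values; the If inside the loop is the branch being measured.
    Function TimeSum(ByVal data() As Integer) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim total As Long = 0
        For Each v As Integer In data
            If v >= 128 Then total += v
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    ' Fisher-Yates shuffle, so the branch outcome is effectively random.
    Sub Shuffle(ByVal data() As Integer, ByVal rnd As Random)
        For i As Integer = data.Length - 1 To 1 Step -1
            Dim j As Integer = rnd.Next(0, i + 1)
            Dim tmp As Integer = data(i)
            data(i) = data(j)
            data(j) = tmp
        Next
    End Sub
End Module

On typical hardware the shuffled run tends to be noticeably slower, purely because the CPU keeps guessing the branch wrong.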
Which consumes less memory
As stated above, both versions need to put the string into a variable.
Version A uses an explicit one, declared by you; version B uses an implicit one declared by the compiler.
However, version A is guaranteed to have only one call to the function WriteLine,
whilst version B may (or may not) end up with two calls to WriteLine.
If the optimizer in the compiler is good, code B will be transformed into code A; if it's not, the redundant call will remain.
How bad is the waste
The call takes about 10 bytes for the assignment of the string (Unicode 2 bytes per char).
But so does the other version, so that's the same.
That leaves roughly 5 bytes for a call, plus maybe a few extra bytes to set up a stack frame.
So let's say that due to your totally horrible coding you have now wasted 10 bytes.
Not much to worry about.
From a maintainability point of view
Computer code is written for humans, not machines.
So from that point of view code A is clearly superior.
Imagine not choosing between 2 options (true or false) but 20.
You only call the function once.
If you decide to change WriteLine to another function you only have to change it in one place, not 2 or 20.
How to speed this up?
With 2 values it's pretty much impossible, but if you had 20 values you could use a lookup table.
Obviously that optimization is not worth it unless code gets executed many times.
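For illustration, here is a minimal sketch of what such a lookup table could look like (the message strings and the ComputeIndex helper are placeholders I've assumed, not anything from the question):

' Sketch: a 20-way If/ElseIf chain replaced by an array lookup.
Dim messages() As String = {"result 0", "result 1", "result 2"} ' ...extend to 20 entries
Dim index As Integer = ComputeIndex() ' hypothetical helper: reduce your condition to an index 0..19
If index >= 0 AndAlso index < messages.Length Then
    Console.WriteLine(messages(index)) ' one call, no per-case branching
End If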
If you need to know the precise amount of memory the instructions are going to take, you can use ildasm on your code, and see for yourself. However, the amount of memory consumed by your code is much less relevant today, when the memory is so cheap and abundant, and compilers are smart enough to see common patterns and reduce the amount of code that they generate.
A much greater concern is readability of your code: if a complex chain of conditions always leads to printing a conditionally set result, your first code block expresses this idea in a cleaner way than the second one does. Everything else being equal, you should prefer whatever form of code that you find the most readable, and let the compiler worry about optimization.
P.S. It goes without saying that Console.WriteLine(condition) would produce the same result, but that is of course not the point of your question.

SQL Server 'In' versus 'less than'

I've got a table of addresses that I'm replicating. There are 14 different types of addresses available. In order to cut down on the replicated data, I'm filtering on the AddressType field. The field is an int, and has a value of 1 to 14. I originally had a filter of
AddressType = 2
, as I was only interested in addresses with that type. However, a recent change requires that I have both AddressType 1 and 2 replicated. At first I changed the filter to
AddressType in (1,2)
Would I be better off with a filter of
AddressType < 3
Thoughts?
There can be a significant difference as the list of values gets larger. You won't see a performance difference with a handful of values, but you will as the list grows, especially if there is an index on AddressType. Your IN () version essentially gets translated to:
WHERE AddressType = 1
OR AddressType = 2
OR ...
I do agree with the others for this specific case: (1) the performance difference when there are only 14 possible values is unlikely to be noticeable, and (2) Jonathan's point that IN () more accurately reflects what you want to do is also a good one.
But for future readers who may have many more possible values, I think it's important to note how things can change when the list is not so limited (or when < and IN () no longer offer the same functionality, e.g. when an address type changes). At larger sizes, even when everything in the IN () list conveniently matches a range criterion, there are still other things to consider: it is less convenient to type IN (1,2,3,4,5,6,7,8,9,10, ...), and the longer list can lead to a much larger batch size when we're talking about extremes.
When your requirements change and you need types 1, 2, 9 and 14, the IN formulation will be better. The list comparison more accurately reflects what you are doing (choosing two types from a small list of possible values).
The less than notation happens to work, but it is coincidental that the representation of the types is susceptible to range comparisons like that.
In terms of performance, there is essentially nothing to choose between the two. The less than operation will perhaps be marginally quicker, but the margin is unlikely to be measurable.
Execution plans look identical. I'm inclined to say you should go with IN, in case you need to add another address type, like 5, that would force you to rewrite the < query. IN is a lot more extensible because it doesn't matter what you add to it.
All the answers are fine... and in the same vein as the other posters, it's the "what else you might do" that might make a difference.
So always consider NULL in any comparison. Your query is fine with respect to NULLs as written, but if NULLs are possible and you later change or reuse the SQL ad hoc, say to negate it, the rewritten comparisons deserve a second look.
For instance, consider NOT IN (1,2) versus >= 3 (or whatever incarnation you might use): a row with a NULL AddressType is returned by neither, because both comparisons evaluate to UNKNOWN, which is easy to forget if you expected the negation to pick those rows up (NULLs in comparisons).
Considering NULLs should be like breathing when writing SQL.