How should I store and compare 4 values for several instances of the same object type? - objective-c

I'm working in Objective-C/Cocoa and I have an object type Tile. Each has a signature that can be represented as 4 different integer values. If I output a few of these values as strings, with dashes separating the values, it looks like this:
signature: 4-4-3-3
signature: 4-3-3-3
signature: 0-0-0-1
signature: 0-0-1-1
signature: 0-0-1-0
signature: 1-1-1-2
signature: 1-1-2-2
signature: 1-1-2-1
signature: 3-3-3-1
signature: 3-3-1-1
signature: 3-3-1-3
signature: 4-4-4-3
signature: 4-4-3-3
I'm currently storing each of the values as an unsigned short. There will never be negative values, and the maximum value is very unlikely to be above 15 or so. Zero is a valid value. There is no 'nil' value.
I would like to be able to call:
[myTile signature] to retrieve the value.
[myTile matches:otherTile] to return a BOOL indicating whether the signatures match.
What is the most efficient way to store this "signature" and compare it to the signatures of the other Tile instances? It seems like string comparisons would be slow...

First off, I'd use the method names Cocoa conventionally uses for these tasks: description and isEqual: (or a typed isEqualTo<Class>: variant).
Concerning your question, I think the best way is the simpler:
- (BOOL)isEqualToTile:(Tile *)tile
{
    return self.value1 == tile.value1 &&
           self.value2 == tile.value2 &&
           self.value3 == tile.value3 &&
           self.value4 == tile.value4;
}
Another possibility could be to implement hash.
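If you do go down the hash route, here is a minimal sketch, assuming the four values live in hypothetical unsigned short properties value1 through value4 (as in the isEqualToTile: example above). Since each value stays below 16, the four of them pack losslessly into one integer:
- (NSUInteger)hash
{
    // Pack the four small values (each < 16) into a single integer.
    return ((NSUInteger)self.value1 << 12) |
           ((NSUInteger)self.value2 << 8)  |
           ((NSUInteger)self.value3 << 4)  |
            (NSUInteger)self.value4;
}
The same packed value could double as the "signature" the question asks for: two tiles match exactly when their packed signatures are equal.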
EDIT: I wouldn't worry too much about performance if I were you.
Because a handful of integer comparisons is fast. Really fast. If you were to put together a little benchmark, you would find that each comparison takes roughly 1.5E-8 s to run. That number doesn't say much on its own, but it means you could make 10,000,000 of these comparisons in about 150 ms.
Because if one day you find your software slow, then it will be time to investigate the origin of that slowness (and I doubt it will come from this method). Remember that premature optimization is the root of all evil.
Because it took you 12 seconds to implement it, and it would probably take a bit longer to think up a working hash function. Don't over-think it. See my second point.
Because, if one day you do need to optimize this method (if you are doing it now, re-read my second point), Cocoa has a couple of handy tools for running such a dumb, repetitive task in parallel, without you scratching your head over how to make it faster on only one of your (ever-growing number of) cores.
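For instance, a hypothetical sketch using Grand Central Dispatch, assuming an NSArray *tiles of Tile objects, a referenceTile to compare against, and the isEqualToTile: method above (none of these names come from the question):
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Returns the indexes of the tiles in |tiles| whose signature matches
// |referenceTile|, checking the tiles concurrently.
NSIndexSet *MatchingTileIndexes(NSArray *tiles, Tile *referenceTile)
{
    NSMutableIndexSet *matches = [NSMutableIndexSet indexSet];
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_apply([tiles count], queue, ^(size_t i) {
        if ([[tiles objectAtIndex:i] isEqualToTile:referenceTile]) {
            @synchronized (matches) {   // the block runs concurrently, so guard shared state
                [matches addIndex:i];
            }
        }
    });
    return matches;
}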

Related

How to convert Greensock's CustomEase functions to be usable in CreateJS's Tween system?

I'm currently working on a project that does not include GSAP (GreenSock's JS tweening library), but since it's super easy to create your own custom easing functions with its visual editor, I was wondering if there is a way to break down the desired ease function so that it can be reused in a CreateJS Tween?
Example:
var myEase = CustomEase.create("myCustomEase", [
{s:0,cp:0.413,e:0.672},{s:0.672,cp:0.931,e:1.036},
{s:1.036,cp:1.141,e:1.036},{s:1.036,cp:0.931,e:0.984},
{s:0.984,cp:1.03699,e:1.004},{s:1.004,cp:0.971,e:0.988},
{s:0.988,cp:1.00499,e:1}
]);
So that it turns it into something like:
var myEase = function(t, b, c, d) {
//Some magic algorithm performed on the 7 bezier/control points above...
}
(Here is what the graph would look like for this particular easing method.)
I took the time to port and optimize the original GSAP-based CustomEase class... but due to license restrictions / legal matters (basically a grizzly bear that I do not want to poke with a stick...), posting the ported code would violate the license.
However, it's fair for my own use. Therefore, I believe it's only fair that I guide you and point you to the resources that made it possible.
The original code (not directly compatible with CreateJS) can be found here:
https://github.com/art0rz/gsap-customease/blob/master/CustomEase.js (looks like the author was also asked to take down the repo on github - sorry if the rest of this post makes no sense at all!)
Note that CreateJS's easing methods only take a "time ratio" value (not the time, start, end, duration that GSAP's easing methods take). That time ratio is really all you need, given it goes from 0.0 (your start value) to 1.0 (your end value).
With a little bit of effort, you can discard those parameters from the ease() method and trim down the final returned expression.
Optimizations:
I took a few extra steps to optimize the above code.
1) In the constructor, you can store the segments.length value directly as this.length on the CustomEase instance, to cut down a bit on the number of accessor / property lookups in the ease() method (where qty is set).
2) There are a few redundant calculations done per Segment that can be eliminated in the ease() method. For instance, the s.cp - s.s and s.e - s.s operations can be precalculated and stored in a couple of properties on each Segment (in its constructor).
3) Finally, I'm not sure why it was designed this way, but you can unwrap the function() {...}(); wrappers that return the constructors for each class. Perhaps they were used to trap the scope of some variables, but I don't see why the whole thing couldn't have been wrapped once instead of encapsulating each class separately.
Need more info? Leave a comment!

Optimized binary search

I have seen many examples of binary search and many ways to optimize it. Yesterday my lecturer wrote the code below (assume the first index is 1 and the last is N, where N is the length of the array, and consider it pseudocode):
L := 1;
R := N;
while (L < R)
{
    m := div(R + L, 2);
    if A[m] > x
    {
        L := m + 1;
    }
    else
    {
        R := m;
    }
}
Here we assume the array is A. The lecturer said that we don't waste time on every iteration comparing whether the element is at the middle of the array, and that another benefit is that if the element is not in the array, the final index tells you where it would be located, so the search is optimal. Is he right? I have seen many kinds of binary search, from Jon Bentley (Programming Pearls) for example, and so on. Is this code really optimal? It is written in Pascal in my case, but the language doesn't matter.
It really depends on whether you find the element. If you don't, this will have saved some comparisons. If you could find the element in the first couple of hops, then you've saved the work of all the later comparisons and arithmetic. If all the values in the array are distinct, it's obviously fairly unlikely that you hit the right index early on - but if you have broad swathes of the array containing the same values, that changes the maths.
This approach also means you can't narrow the range quite as much as you would otherwise - this:
R:=m;
would normally be
R:=m-1;
... although that would only rarely make a significant difference.
The important point is that this doesn't change the overall complexity of the algorithm - it's still going to be O(log N).
another benefit is that if the element is not in the array, the index says where it would be located
That's true whether you check for equality or not. Every binary search implementation I've seen would give that information.
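As a point of reference, here is a minimal plain-C sketch of the same no-equality-test search, assuming an array a[0..n-1] sorted in ascending order (which is why the comparison direction is flipped relative to the pseudocode above); the function name lower_bound is just illustrative:
int lower_bound(const int *a, int n, int x)
{
    int lo = 0, hi = n;                 // search the half-open range [lo, hi)
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;   // written this way to avoid overflow
        if (a[mid] < x)
            lo = mid + 1;               // x, if present, is to the right of mid
        else
            hi = mid;                   // x, if present, is at mid or to its left
    }
    return lo;  // a[lo] == x if x is in the array; otherwise its insertion point
}
A single equality check after the loop tells you whether x was actually found, which is exactly the trade-off described above.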

Is it possible to compare two Objective-C blocks by content?

float pi = 3.14;
float (^piSquare)(void) = ^(void){ return pi * pi; };
float (^piSquare2)(void) = ^(void){ return pi * pi; };
[piSquare isEqualTo: piSquare2]; // -> want it to behave like -isEqualToString...
To expand on Laurent's answer.
A Block is a combination of implementation and data. For two blocks to be equal, they would need to have both the exact same implementation and have captured the exact same data. Comparison, thus, requires comparing both the implementation and the data.
One might think comparing the implementation would be easy. It actually isn't because of the way the compiler's optimizer works.
While comparing simple data is fairly straightforward, blocks can capture objects-- including C++ objects (which might actually work someday)-- and comparison may or may not need to take that into account. A naive implementation would simply do a byte level comparison of the captured contents. However, one might also desire to test equality of objects using the object level comparators.
Then there is the issue of __block variables. A block, itself, doesn't actually have any metadata related to __block captured variables as it doesn't need it to fulfill the requirements of said variables. Thus, comparison couldn't compare __block values without significantly changing compiler codegen.
All of this is to say that, no, it isn't currently possible to compare blocks and to outline some of the reasons why. If you feel that this would be useful, file a bug via http://bugreport.apple.com/ and provide a use case.
Putting aside issues of compiler implementation and language design, what you're asking for is provably undecidable (unless you only care about detecting 100% identical programs). Deciding if two programs compute the same function is equivalent to solving the halting problem. This is a classic consequence of Rice's Theorem: Any "interesting" property of Turing machines is undecidable, where "interesting" just means that it's true for some machines and false for others.
Just for fun, here's the proof. Assume we can create a function to decide if two blocks are equivalent, called EQ(b1, b2). Now we'll use that function to solve the halting problem. We create a new function HALT(M, I) that tells us if Turing machine M will halt on input I like so:
BOOL HALT(M, I) {
    return EQ(
        ^int { return 0; },
        ^int { M(I); return 0; }
    );
}
If M(I) halts then the blocks are equivalent, so HALT(M,I) returns YES. If M(I) doesn't halt then the blocks are not equivalent, so HALT(M,I) returns NO. Note that we don't have to execute the blocks -- our hypothetical EQ function can compute their equivalence just by looking at them.
We have now solved the halting problem, which we know is not possible. Therefore, EQ cannot exist.
I don't think this is possible. Blocks can be roughly seen as advanced functions (with access to global or local variables). The same way you cannot compare functions' content, you cannot compare blocks' content.
All you can do is to compare their low-level implementation, but I doubt that the compiler will guarantee that two blocks with the same content share their implementation.
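In practice, the only comparison available out of the box is pointer identity, which tells you whether two variables reference the same block object, not whether they compute the same thing. A minimal sketch using the question's blocks:
float pi = 3.14f;
float (^piSquare)(void)  = ^{ return pi * pi; };
float (^piSquare2)(void) = ^{ return pi * pi; };

BOOL sameObject    = (piSquare == piSquare2);               // NO: two distinct block objects
BOOL sameByIsEqual = [(id)piSquare isEqual:(id)piSquare2];  // also NO: NSObject's default isEqual: is identity-based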

What is the difference between an IF, CASE, and WHILE statement

I just want to know what the difference is between all the conditional statements in Objective-C, and which one is faster and lighter.
One piece of advice: stop worrying about which language constructs are microscopically faster or slower than which others, and instead focus on which ones let you express yourself best.
If and case statements described
While statement described
Since these statements do different things, it is unproductive to debate which is faster.
It's like asking whether a hammer is faster than a screwdriver.
The language-agnostic version (mostly, obviously this doesn't count for declarative languages or other weird ones):
When I was taught programming (quite a while ago, I'll freely admit), a language consisted of three ways of executing instructions:
sequence (doing things in order).
selection (doing one of many things).
iteration (doing something zero or more times).
The if and case statements are both variants on selection. If is used to select one of two different options based on a condition (using pseudo-code):
if condition:
do option 1
else:
do option 2
keeping in mind that the else may not be needed, in which case it's effectively 'else: do nothing'. Also remember that option 1 or 2 may itself consist of any of the statement types, including more if statements (called nesting).
Case is slightly different - it's generally meant for more than two choices like when you want to do different things based on a character:
select ch:
case 'a','e','i','o','u':
print "is a vowel"
case 'y':
print "never quite sure"
default:
print "is a consonant"
Note that you can use case for two options (or even one) but it's a bit like killing a fly with a thermonuclear warhead.
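For reference, a sketch of the same selection in C/Objective-C syntax (given some char ch): a C switch takes one constant per case label, so the grouped choices from the pseudocode become fall-through labels:
switch (ch) {
    case 'a': case 'e': case 'i': case 'o': case 'u':
        printf("is a vowel\n");
        break;
    case 'y':
        printf("never quite sure\n");
        break;
    default:
        printf("is a consonant\n");
        break;
}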
While is not a selection variant but an iteration one. It belongs with the likes of for, repeat, until and a host of other possibilities.
As to which is fastest, it doesn't matter in the vast majority of cases. The compiler writers know far more than we mortal folk about how to get the last bit of performance out of their code. You either trust them to do their job right or you hand-code it in assembly yourself (I'd prefer the former).
You'll get far more performance by concentrating on the macro view rather than the minor things. That includes selection of appropriate algorithms, profiling, and targeting of hot spots. It does little good to find something that takes five minutes each month and get it running in two minutes. Better to get a smaller improvement in something that happens every minute.
The language constructs like if, while, case and so on will already be as fast as they can be, since they're used heavily and are relatively simple. You should be writing your code for readability first and only worrying about performance when it becomes an issue (see YAGNI).
Even if you found that using if/goto combinations instead of case allowed you to run a bit faster, the resulting morass of source code would be harder to maintain down the track.
while isn't a conditional; it is a loop. The difference is that the body of a while loop can be executed many times, while the body of a conditional will be executed only once or not at all.
The difference between if and switch is that if accepts an arbitrary expression as the condition and switch just takes values to compare against. Basically if you have a construct like if(x==0) {} else if(x==1) {} else if(x==2) ..., it can be written much more concisely (and effectively) by using switch.
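A quick sketch of that rewrite (x is just a placeholder int):
if (x == 0)      { /* handle 0 */ }
else if (x == 1) { /* handle 1 */ }
else if (x == 2) { /* handle 2 */ }

// ...can be written as:
switch (x) {
    case 0: /* handle 0 */ break;
    case 1: /* handle 1 */ break;
    case 2: /* handle 2 */ break;
}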
A case statement could be written as
if (a)
{
    // Do something
}
else if (b)
{
    // Do something else
}
But the case (switch) version can be more efficient, since it evaluates the controlling expression once and then branches.
while is only useful if you want a condition to be evaluated, and the associated code block executed, multiple times. If you expect a condition to only occur once, then it's equivalent to if. A more apt comparison is that while is a more generalized for.
Each condition statement serves a different purpose and you won't use the same one in every situation. Learn which ones are appropriate for which situation and then write your code. If you profile your code and find there's a bottleneck, then you go ahead and address it. Don't worry about optimizing before there's actually a problem.
Are you asking whether an if structure will execute faster than a switch statement inside of a large loop? If so, I put together a quick test, this code was put into the viewDidLoad method of a new view based project I just created in the latest Xcode and iPhone SDK:
NSLog(#"Begin loop");
NSDate *loopBegin = [NSDate date];
int ctr0, ctr1, ctr2, ctr3, moddedNumber;
ctr0 = 0;
ctr1 = 0;
ctr2 = 0;
ctr3 = 0;
for (int i = 0; i < 10000000; i++) {
moddedNumber = i % 4;
// 3.34, 1.23s in simulator
if (moddedNumber == 0)
{
ctr0++;
}
else if (moddedNumber == 1)
{
ctr1++;
}
else if (moddedNumber == 2)
{
ctr2++;
}
else if (moddedNumber == 3)
{
ctr3++;
}
// 4.11, 1.34s on iPod Touch
/*switch (moddedNumber)
{
case 0:
ctr0++;
break;
case 1:
ctr1++;
break;
case 2:
ctr2++;
break;
case 3:
ctr3++;
break;
}*/
}
NSTimeInterval elapsed = [[NSDate date] timeIntervalSinceDate:loopBegin];
NSLog(#"End loop: %f seconds", elapsed );
This code sample is by no means complete, because as pointed out earlier if you have a situation that comes up more times than the others, you would of course want to put that one up front to reduce the total number of comparisons. It does show that the if structure would execute a bit faster in a situation where the decisions are more or less equally divided among the branches.
Also, keep in mind that the results of this little test varied widely in performance between running it on a device vs. running it in the emulator. The times cited in the code comments are running on an actual device. (The first time shown is the time to run the loop the first time the code was run, and the second number was the time when running the same code again without rebuilding.)
There are conditional statements and conditional loops. (If Wikipedia is to be trusted, then simply referring to "a conditional" in programming doesn't cover conditional loops. But this is a minor terminology issue.)
Shmoopty said "Since these statements do different things, it is nonsensical to debate which is faster."
Well... it may be time poorly spent, but it's not nonsensical. For instance, let's say you have an if statement:
if (cond) {
    code
}
You can transform that into a loop that executes at most one time:
while (cond) {
    code
    break;
}
The latter will be slower in pretty much any language (or the same speed, because the optimizer turned it back into the original if behind the scenes!). Still, there are occasions in computer programming where (due to bizarre circumstances) the convoluted thing runs faster.
But those incidents are few and far between. The focus should be on your code--what makes it clearest, and what captures your intent.
Loops and branches are hard to explain briefly; getting the best code out of a construct in any C-style language depends on the processor used and the local context of the code. The main objective is to avoid stalling the execution pipeline, primarily by reducing branch mispredictions.
I suggest you go here for all your optimization needs. The manuals are written for the c-style programmer and relatively easy to understand if you know some assembly. These manuals should explain to you the subtleties in modern processors, the strategies used by top compilers, and the best way to structure code to get the most out of it.
I just remembered the most important thing about conditionals and branching code. Order your code as follows
if (x == 1)      { /* ... */ }   // 80% of the time
else if (x == 2) { /* ... */ }   // 10% of the time
else if (x == 3) { /* ... */ }   // 6% of the time
else             { break; }      // (assuming this sits inside a loop)
You must use an else sequence... and in this case the prediction logic in your CPU will predict correctly for x == 1 and avoid a pipeline break for 80% of all executions.
More information from Intel. In particular:
In order to effectively write your code to take advantage of these rules, when writing if-else or switch statements, check the most common cases first and work progressively down to the least common. Loops do not necessarily require any special ordering of code for static branch prediction, as only the condition of the loop iterator is normally used.
By following this rule you are flat-out giving the CPU hints about how to bias its prediction logic towards your chained conditionals.

Is there a practical limit to the size of bit masks?

There's a common way to store multiple values in one variable, by using a bitmask. For example, if a user has read, write and execute privileges on an item, that can be converted to a single number by saying read = 4 (2^2), write = 2 (2^1), execute = 1 (2^0) and then add them together to get 7.
I use this technique in several web applications, where I'd usually store the variable into a field and give it a type of MEDIUMINT or whatever, depending on the number of different values.
What I'm interested in, is whether or not there is a practical limit to the number of values you can store like this? For example, if the number was over 64, you couldn't use (64 bit) integers any more. If this was the case, what would you use? How would it affect your program logic (ie: could you still use bitwise comparisons)?
I know that once you start getting really large sets of values, a different method would be the optimal solution, but I'm interested in the boundaries of this method.
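As a concrete sketch of the scheme described in the question (the constant names below are just made up for illustration), each permission occupies one bit, and bitwise OR / AND compose and test them:
enum {
    PERM_EXECUTE = 1 << 0,   // 1
    PERM_WRITE   = 1 << 1,   // 2
    PERM_READ    = 1 << 2    // 4
};

unsigned int perms = PERM_READ | PERM_WRITE | PERM_EXECUTE;   // 4 + 2 + 1 = 7

if (perms & PERM_WRITE) {
    // the write bit is set
}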
Off the top of my head, I'd write a set_bit and get_bit function that could take an array of bytes and a bit offset in the array, and use some bit-twiddling to set/get the appropriate bit in the array. Something like this (in C, but hopefully you get the idea):
// sets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// result is 0 on success, non-zero on failure (offset out-of-bounds)
int set_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
    // make sure offset is valid (offset is unsigned, so only the upper bound matters)
    if (offset > (num_bytes << 3) - 1) { return -1; }

    // set the right bit
    bytes[offset >> 3] |= (1 << (offset & 0x7));
    return 0; // success
}

// gets the n-th bit in |bytes|. num_bytes is the number of bytes in the array
// returns (-1) on error, 0 if bit is "off", positive number if "on"
int get_bit(char* bytes, unsigned long num_bytes, unsigned long offset)
{
    // make sure offset is valid
    if (offset > (num_bytes << 3) - 1) { return -1; }

    // get the right bit
    return (bytes[offset >> 3] & (1 << (offset & 0x7)));
}
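A short usage sketch for those two helpers: sixteen bytes give a 128-bit mask, already past what a single 64-bit integer can hold.
char mask[16] = {0};                  // 128 bits, all "off"
set_bit(mask, sizeof(mask), 90);      // turn on bit 90
if (get_bit(mask, sizeof(mask), 90) > 0) {
    // bit 90 is set
}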
I've used bit masks in filesystem code where the bit mask is many times bigger than a machine word. Think of it as an "array of booleans".
(Journalling masks in flash memory, if you want to know.)
Many compilers know how to do this for you. Add a bit of OO code to have types that operate sensibly, and then your code starts looking like its intent, not some bit-banging.
My 2 cents.
With a 64-bit integer you can store values up to 2^64 - 1, but that still gives you only 64 flags. So yes, there is a limit, but if you need more than 64 bits' worth of flags, I'd be very interested to know what they were all doing :)
How many states do you potentially need to think about? If you have 64 potential flags, the number of combinations they can be in covers the full range of a 64-bit integer.
If you need to worry about 128 flags, then a pair of 64-bit vectors would suffice.
Addition: in Programming Pearls, there is an extended discussion of using a bit array of length 10^7, implemented in integers (for keeping track of used toll-free "800" numbers) - it's very fast, and very appropriate for the task described in that chapter.
Some languages (Perl does, I believe, though I'm not sure) permit bitwise arithmetic on strings, giving you a much greater effective range: (string length × 8 bits per char) flags.
However, I wouldn't use a single value for superimposition of more than one /type/ of data. The basic r/w/x triplet of 3-bit ints would probably be the upper "practical" limit, not for space efficiency reasons, but for practical development reasons.
(PHP uses this system to control its error reporting, and I have already found it a bit over-the-top when you have to define values somewhere PHP's constants are not available and you have to generate the integer by hand; to be honest, if chmod didn't support the 'ugo+rwx' style syntax I'd never want to use it, because I can never remember the magic numbers.)
The instant you have to crack open a constants table to debug code you know you've gone too far.
Old thread, but it's worth mentioning that there are cases requiring bloated bit masks, e.g., molecular fingerprints, which are often generated as 1024-bit arrays that we have packed into 32 bigint fields (SQL Server does not support UInt32). Bitwise operations work fine - until your table starts to grow and you realize the sluggishness of the separate function calls. The binary data type would work, were it not for T-SQL's ban on bitwise operators having two binary operands.
For example, .NET uses an array of integers as the internal storage for its BitArray class.
Practically, there's no other way around it.
That said, in SQL you will need more than one column (or a BLOB) to store all the states.
You tagged this question SQL, so I think you need to consult the documentation for your database to find the size of an integer. Then subtract one bit for the sign, just to be safe.
Edit: Your comment says you're using MySQL. The documentation for MySQL 5.0 Numeric Types states that the maximum size of a NUMERIC is 64 or 65 digits. That's 212 bits for 64 digits.
Remember that your language of choice has to be able to work with those digits, so you may be limited to a 64-bit integer anyway.