Sequence combinations of varying length using two 'alphabets' - objective-c

I am trying to generate all possible combinations of a sequence of length k given a certain alphabet (this is to generate query sequences for a bioinformatics project).
The sequences are of the form:
A first character and last character that can be any of A C G U (call these Y) and k - 2 characters in between that can be any of A C G U or ? (call these X).
e.g. if k = 3 then the pattern is of the form YXY and if k = 5 then YXXXY.
Generating all possible sequences if k is known is easy, as you can just use k nested for loops. But if k is not known in advance, then this implementation does not fit.
The total number of possible sequences can be expressed by 4^2 * 5^(k-2). With k = 3 this only gives 80 combinations, but scale that up to k = 9 and you have 1,250,000!
Any tips, ideas or suggestions would be much appreciated.
I need to use every sequence generated, so they need to be either stored in an array, or passed at creation/generation to another function, it doesn't really matter which, although I would prefer not to have to store all of them.
Many Thanks.
N.B. I am writing in Objective-C, but any C-style code, pseudocode, or just a plain English description of an algorithm would be helpful.
UPDATE:
Here is the Objective-C code I wrote based on the brilliant answer by Analog File. Currently it just outputs one sequence per line to stdout, but I will modify it to produce an array of strings.
Many thanks to everyone who contributed.
NSArray *yAlphabet = [NSArray arrayWithObjects:@"A", @"C", @"G", @"U", nil];
NSArray *xAlphabet = [NSArray arrayWithObjects:@"A", @"C", @"G", @"U", @"?", nil];
int i, v;
int count = 0;
int numberOfCases = 16 * pow(5, (k - 2));
for (int n = 0; n < numberOfCases; n++) {
    i = n;
    v = i % 4;
    i = i / 4;
    count++;
    printf("\n%s", [[yAlphabet objectAtIndex:v] cStringUsingEncoding:NSUTF8StringEncoding]);
    for (int m = 1; m < (k - 1); m++) {
        v = i % 5;
        i = i / 5;
        printf("%s", [[xAlphabet objectAtIndex:v] cStringUsingEncoding:NSUTF8StringEncoding]);
    }
    printf("%s", [[yAlphabet objectAtIndex:i] cStringUsingEncoding:NSUTF8StringEncoding]);
}
printf("\n");
NSLog(@"No. Sequences: %i", count);
UPDATE 2:
And here's the code, outputting the generated sequences to an array of strings. Note that k is the length of the desired sequences and is given as a parameter elsewhere. I have tested this up to k=9 (1,250,000 sequences). Also note that my code uses ARC, hence there is no memory deallocation shown.
NSArray *yAlphabet = [NSArray arrayWithObjects:@"A", @"C", @"G", @"U", nil];
NSArray *xAlphabet = [NSArray arrayWithObjects:@"A", @"C", @"G", @"U", @"?", nil];
NSMutableArray *sequences = [[NSMutableArray alloc] init];
int i, v;
int count = 0;
int numberOfCases = 16 * pow(5, (k - 2));
for (int n = 0; n < numberOfCases; n++) {
    i = n;
    v = i % 4;
    i = i / 4;
    count++;
    NSMutableString *seq = [[NSMutableString alloc] initWithString:[yAlphabet objectAtIndex:v]];
    for (int m = 1; m < (k - 1); m++) {
        v = i % 5;
        i = i / 5;
        [seq appendString:[xAlphabet objectAtIndex:v]];
    }
    [seq appendString:[yAlphabet objectAtIndex:i]];
    [sequences addObject:seq];
}
NSLog(@"No. Sequences looped: %i", count);
//print the array to confirm
int count1 = 0;
for (NSMutableString *str in sequences) {
    fprintf(stderr, "%s\n", [str cStringUsingEncoding:NSUTF8StringEncoding]);
    count1++;
}
NSLog(@"No. Sequences printed: %i", count1);
NSLog(@"Counts match? : %@", (count == count1 ? @"YES" : @"NO"));

You know how many cases you are going to get.
This is pseudocode (k is the sequence length; loop bounds are inclusive):
for n = 0 to num_of_cases - 1
    i = n
    v = i % length_of_alphabet_Y
    i = i / length_of_alphabet_Y
    output vth char in alphabet Y
    for m = 1 to k-2
        v = i % length_of_alphabet_X
        i = i / length_of_alphabet_X
        output vth char in alphabet X
    output ith char in alphabet Y
    output end of sequence
Each iteration of the outer loop generates a case. I wrote "output", but it's easy to instead store the data in a dynamically allocated structure: n indexes the rows and m indexes the columns, with the first character in column 0 and the last in column k-1. If you do that, "end of sequence" need not be output, as it's subsumed by the increment of n.
Note how we are effectively "counting" in base length_of_alphabet, except that we use different bases depending on the digit. Modulo gives you the least significant digit, and integer division gets rid of it and shifts the next digit to the least significant position.
If you can imagine n as being just a value, in no specific base, the logic is rather simple. You could probably write this yourself from scratch, once you understand it.
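The mixed-radix counting described above can be sketched in plain C, which also compiles as Objective-C. This is only an illustration under stated assumptions: the function names sequence_count and nth_sequence are made up for this sketch, k is assumed to be at least 2, and the caller supplies a buffer of k + 1 chars.

```c
#include <stddef.h>

/* Alphabets from the question: Y for the two ends, X for the middle. */
static const char Y_ALPHABET[] = "ACGU";
static const char X_ALPHABET[] = "ACGU?";

/* Number of sequences of length k (k >= 2): 4 * 4 * 5^(k-2). */
unsigned long sequence_count(int k)
{
    unsigned long total = 16;
    for (int m = 0; m < k - 2; m++) {
        total *= 5;
    }
    return total;
}

/* Decode case number n into buf, which must hold k + 1 chars. */
void nth_sequence(unsigned long n, int k, char *buf)
{
    buf[0] = Y_ALPHABET[n % 4];      /* least significant digit, base 4 */
    n /= 4;
    for (int m = 1; m < k - 1; m++) {
        buf[m] = X_ALPHABET[n % 5];  /* middle digits, base 5 */
        n /= 5;
    }
    buf[k - 1] = Y_ALPHABET[n % 4];  /* what remains is the last base-4 digit */
    buf[k] = '\0';
}
```

Calling nth_sequence for every n from 0 to sequence_count(k) - 1 and handing buf straight to the consumer avoids storing all the sequences at once.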

It sounds like k will be passed in as a parameter, so something like this (in Python-ish pseudo-code) should work:
Y_alphabet = ['A','C','G','U']
X_alphabet = ['A','C','G','U','?']
# Start from the single empty sequence so the first pass has something to extend
outputs = ['']
for i in range(k):
    if i == 0 or i == k-1:
        current_alphabet = Y_alphabet
    else:
        current_alphabet = X_alphabet
    last_outputs = outputs
    outputs = []
    for next_character in current_alphabet:
        # This extends outputs with all the possible sequences
        # of length i appended with the current character
        outputs += [seq + next_character for seq in last_outputs]

The basic form for doing this looks like this (Java-ish pseudocode):
char[] output = new char[k];
public void go(int cur_k) {
    if (cur_k == k) {
        // do something - copy the array and store it, etc.
        return;
    }
    // alphabet: Y for the first and last position, X otherwise
    for (char letter : alphabet) {
        output[cur_k] = letter;
        go(cur_k + 1);
    }
}
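The recursive form above might look as follows in plain C. This is a sketch, not a definitive implementation: the callback type sequence_handler and the functions fill_position, for_each_sequence and count_sequence are names invented here, and the two alphabets are taken from the question.

```c
#include <stddef.h>

/* Hypothetical callback type: receives each finished, NUL-terminated sequence. */
typedef void (*sequence_handler)(const char *seq, void *context);

static void fill_position(char *buf, int pos, int k,
                          sequence_handler handle, void *context)
{
    if (pos == k) {                 /* every position is filled */
        buf[k] = '\0';
        handle(buf, context);
        return;
    }
    /* Ends draw from Y (ACGU); interior positions from X (ACGU?). */
    const char *alphabet = (pos == 0 || pos == k - 1) ? "ACGU" : "ACGU?";
    for (const char *c = alphabet; *c != '\0'; c++) {
        buf[pos] = *c;
        fill_position(buf, pos + 1, k, handle, context);
    }
}

/* Visit every sequence of length k; buf must have room for k + 1 chars. */
void for_each_sequence(char *buf, int k, sequence_handler handle, void *context)
{
    fill_position(buf, 0, k, handle, context);
}

/* Example handler: counts sequences through an unsigned long in context. */
void count_sequence(const char *seq, void *context)
{
    (void)seq;
    ++*(unsigned long *)context;
}
```

The recursion depth is only k, so the stack cost stays small even when the number of sequences is large; the handler decides whether each sequence is stored or consumed on the spot.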

Related

Calculating value of K without messages

Question:
Find the value of K in myInterViewArray without any messages/calls
I was given this hint:
The numbers in the array will always be in the range 1-9.
NSArray *myInterViewArray = @[@2,@1,@3,@9,@9,@8,@7];
Example:
If you send 3, you get back the sum of the 3 biggest values in myInterViewArray. So in the example above, K = 9 + 9 + 8.
--
I was asked this question a while back in an interview and was completely stumped. The first solution that I could think of looked something like this:
Interview Test Array:
[self findingK:myInterViewArray abc:3];
-(int)findingK:(NSArray *)myArray abc:(int)k{ // With Reverse Object Enumerator
    myArray = [[[myArray sortedArrayUsingSelector:@selector(compare:)] reverseObjectEnumerator] allObjects];
    int tempA = 0;
    for (int i = 0; i < k; i++) {
        tempA += [[myArray objectAtIndex:i] intValue];
    }
    k = tempA;
    return k;
}
But apparently that was a big no-no. They wanted me to find the value of K without using any messages. That means that I was unable to use sortedArrayUsingSelector and even reverseObjectEnumerator.
Now to the point!
I've been thinking about this for quite a while and I still can't think of an approach without messages. Does anyone have any ideas?
There is only one way to do that and that is bridging the array to CF type and then use plain C, e.g.:
NSArray *array = @[@1, @2, @3];
CFArrayRef cfArray = (__bridge CFArrayRef)(array);
NSLog(@"%@", CFArrayGetValueAtIndex(cfArray, 0));
However, if the value is a NSNumber, you will still need messages to access its numeric value.
Most likely the authors of the question didn't have a very good knowledge of the concept of messages. Maybe they thought that subscripting and property access were not messages or something else.
Using objects in Obj-C without messages is impossible. Every property access, every method call, every method initialization is done using messages.
Rereading the question, they probably wanted you to implement the algorithm without using library functions, e.g. sort (e.g. you could implement a K-heap and use that heap to find the K highest numbers in a for iteration).
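As an illustration of solving this without library sorting, here is a sketch in plain C. Note that it is not the heap mentioned above: it swaps in a simpler counting tally that relies on the hint that every value lies between 1 and 9. The function name sum_of_k_largest is made up for this example, and it assumes the numbers have already been copied out of the array into a C int array.

```c
#include <stddef.h>

/* Sum of the k largest values; every value is assumed to lie in 1..9. */
int sum_of_k_largest(const int *values, size_t count, size_t k)
{
    size_t tally[10] = {0};          /* tally[d] = how often d occurs */
    for (size_t i = 0; i < count; i++) {
        tally[values[i]]++;
    }
    int sum = 0;
    /* Walk from the largest digit down, taking values until k are used. */
    for (int d = 9; d >= 1 && k > 0; d--) {
        size_t take = tally[d] < k ? tally[d] : k;
        sum += d * (int)take;
        k -= take;
    }
    return sum;
}
```

This makes one pass over the input plus a fixed nine-step pass over the tally, so no sorting is needed at all. If k exceeds the number of values, it simply sums everything.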
I assume what is meant is that you can't mutate the original array. Otherwise, that restriction doesn't make sense.
Here's something that might work:
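// Keep the k largest values seen so far in a, sorted in descending order
NSMutableArray *a = [NSMutableArray array];
for (NSNumber *num in array) {
    // Find the first position whose value is smaller than num
    NSUInteger pos = 0;
    while (pos < a.count && [a[pos] intValue] >= [num intValue]) {
        pos++;
    }
    if (pos < (NSUInteger)k) {
        [a insertObject:num atIndex:pos];
        if (a.count > (NSUInteger)k) {
            [a removeLastObject]; // drop the smallest once we hold more than k
        }
    }
}
int result = 0;
for (NSUInteger i = 0; i < a.count; i++) {
    result += [a[i] intValue];
}
return result;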

Objective C - comparing values in an array to an arbitrary value

How would you sort through an array, which also contains 0 values i.e.
-54
0
-12
0
-10
and compare it to a constant (say -5), returning the index of the closest value (smallest difference)? (i.e. closest value = -10, so returned value = 4)
The challenge here is that 0 values should always be skipped, and the array cannot be sorted beforehand.
Here's a similar problem, whose answers don't quite work in my case:
How do I find the closest array element to an arbitrary (non-member) number?
That is relatively straightforward:
NSArray *data = @[@-54, @0, @-12, @0, @-10];
NSUInteger best = NSNotFound; // no candidate yet, so a leading zero is never picked
int target = -5;
for (NSUInteger i = 0 ; i < data.count ; i++) {
    int b = [[data objectAtIndex:i] intValue];
    if (b == 0) continue; // Ignore zeros
    if (best == NSNotFound || abs([[data objectAtIndex:best] intValue] - target) > abs(b - target)) { // Check diff
        best = i;
    }
}
// At this point, "best" contains the index of the best match (NSNotFound if every value is zero)
NSLog(@"%lu", (unsigned long)best); // Prints 4

Dividing an array into separate arrays of four elements plus the remainder

I'm trying to divide an array into individual arrays of four elements, where the last array will contain the remainder. For example, if the main array's count is ten, three subarrays will be created: two consisting of four elements, and one of two elements.
The code I have right now looks like the following:
NSMutableArray *mainMutableArray = [NSMutableArray arrayWithObjects:@"First", @"Second", @"Third", @"Fourth", @"Fifth", @"Sixth", @"Seventh", @"Eighth", nil];
NSMutableArray *mutableArrayOfSubarrays = [NSMutableArray array];
int length = mainMutableArray.count / 4;
int location = 0;
for (int i = 0; i < length; i++)
{
    [mutableArrayOfSubarrays addObject:[mainMutableArray subarrayWithRange:NSMakeRange(location, 4)]];
    location += 4;
}
This of course works only when the remainder is equal to 0.
Any help would be greatly appreciated.
Ok, here we go:
int length = mainMutableArray.count;
for (int location = 0; location < length; location += 4)
{
    unsigned int Size = length - location;
    if (Size > 4) Size = 4;
    [mutableArrayOfSubarrays addObject:[mainMutableArray subarrayWithRange:NSMakeRange(location, Size)]];
}
If you use a while loop, you can make the condition describe what you are actually trying to do:
NSUInteger length = [mainMutableArray count];
NSUInteger location = 0;
// While at least four elements remain
// (location + 4 <= length avoids unsigned underflow when length < 4)
while( location + 4 <= length ){
    [mutableArrayOfSubarrays addObject:[mainMutableArray subarrayWithRange:NSMakeRange(location, 4)]];
    location += 4;
}
// Pick up the remainder, if any
if( location != length ){
    [mutableArrayOfSubarrays addObject:[mainMutableArray subarrayWithRange:NSMakeRange(location, length-location)]];
}
Loop from length*4 to mainMutableArray.count to get the remainder of the array.
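The arithmetic shared by the answers above can be isolated in a few lines of plain C. This is a sketch with hypothetical helper names (chunk_count, chunk_length); it assumes chunk is greater than zero and index is less than chunk_count(count, chunk).

```c
#include <stddef.h>

/* Number of chunks needed to cover `count` items, `chunk` items at a time. */
size_t chunk_count(size_t count, size_t chunk)
{
    return (count + chunk - 1) / chunk;   /* ceiling division */
}

/* Length of chunk number `index`; only the last chunk can be shorter. */
size_t chunk_length(size_t count, size_t chunk, size_t index)
{
    size_t start = index * chunk;         /* where this chunk begins */
    size_t left = count - start;          /* items from start to the end */
    return left < chunk ? left : chunk;
}
```

For each index, index * chunk and chunk_length(...) are the location and length that would be passed to NSMakeRange. With ten items and a chunk size of four this gives three chunks of lengths 4, 4 and 2.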

Using malloc to create a 2d C style array of my class

(Edit: put possible solution at end)
I'm a C/C++ programmer who is learning Objective C to develop iPhone apps. The programs that I will be writing will deal with large 2d arrays of objects. I've read about using NSArray's of NSArray's and have some working code, but I'm trying to understand how to use C style arrays to save overhead and to learn what you can and can't do.
In this fragment MapClass only contains two properties, int x and int y. I have the following code fragment working with a statically defined array of 10x10.
MapClass *arr[10][10];
arr[2][3] = [[MapClass alloc] init];
arr[2][3].x = 2;
arr[2][3].y = 3;
NSLog(@"The location is %i %i", arr[2][3].x, arr[2][3].y);
// Output: "The location is 2 3"
This is an example of doing it with a one dimensional array and calculating where the cell is based on the X and Y:
MapClass **arr = (MapClass**) malloc(10 * 10 * sizeof(MapClass *));
arr[3 * 10 + 2] = [[MapClass alloc] init];
arr[3*10 + 2].x = 2;
arr[3*10 + 2].y = 3;
NSLog(@"The location is %i %i", arr[3*10 + 2].x, arr[3*10 + 2].y);
// Output: "The location is 2 3"
My question is this: How can I malloc my array as a two dimensional array so that I can use arr[2][3] style notation to access it?
Everything I'm trying is generating various errors such as "Subscript requires the size of [your class], which is not constant in non-fragile ABI".
Can anyone give me a snippet showing how to do this? I've been reading and experimenting and can't figure it out. Does my one-dimensional array example do anything wrong?
Answer?
After fooling around with xzgyb's answer, I have the following block working. Anything wrong with it? Thanks!
int dimX = 20;
int dimY = 35;
MapClass ***arr = (MapClass***) malloc( dimX * sizeof(MapClass **));
for (int x = 0; x < dimX; ++x)
{
    arr[x] = (MapClass **) malloc( dimY * sizeof(MapClass*));
}
for (int x = 0; x < dimX; ++x)
{
    for (int y = 0; y < dimY; ++y)
    {
        arr[x][y] = [[MapClass alloc] init];
        arr[x][y].x = x;
        arr[x][y].y = y;
    }
}
for (int x = 0; x < dimX; ++x)
{
    for (int y = 0; y < dimY; ++y)
    {
        NSLog(@"%i %i is %i %i", x, y, arr[x][y].x, arr[x][y].y);
    }
}
// Cleanup
for (int x = 0; x < dimX; ++x) {
    for (int y = 0; y < dimY; ++y) {
        [arr[x][y] release];
    }
}
for (int x = 0; x < dimX; ++x)
{
    free(arr[x]);
}
free(arr);
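Try the following code:
// One block for the row pointers, one contiguous block for the 10 x 10 cells.
// Keeping them separate stops the row pointers from overlapping the cell storage.
MapClass ***arr = (MapClass ***) malloc(10 * sizeof(MapClass **));
MapClass **cells = (MapClass **) malloc(10 * 10 * sizeof(MapClass *));
for ( int row = 0; row < 10; ++row ) {
    arr[ row ] = &cells[ row * 10 ];
}
arr[0][1] = [[MapClass alloc] init];
arr[1][2] = [[MapClass alloc] init];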
Tested and it works fine using NSMutableString class and a variety of string methods.
I'd probably recommend using the standard message-sending brackets rather than the newer dot syntax, just to make it clearer to the compiler what you are actually trying to accomplish.
The sizeof(ClassName *) should be the same as sizeof([ClassName class]) (and id for that matter; also int on a 32-bit target) if I understand your meaning. The code you posted should not give an error like that, as all object pointers are the same size. Now if you tried something like sizeof(*someInstanceOfAClass), then you would run into issues, because you would be attempting to malloc enough memory to fit 10*10*(the actual size of your object), which is not what you intend to do. (That sounds like what the error is meant to catch.)
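To show the allocation pattern on its own, here is a sketch in plain C that supports arr[x][y] notation with just two allocations. The Cell struct is a hypothetical stand-in for MapClass (in the Objective-C code each cell would hold a MapClass * instead), grid_create and grid_destroy are names invented for this example, and rows and cols are both assumed to be at least 1.

```c
#include <stdlib.h>

/* Hypothetical stand-in for MapClass so the sketch is plain C. */
typedef struct { int x; int y; } Cell;

/* Allocate rows x cols cells: one block of row pointers, one block of cells. */
Cell **grid_create(size_t rows, size_t cols)
{
    Cell **grid = malloc(rows * sizeof *grid);
    if (grid == NULL) return NULL;
    Cell *cells = calloc(rows * cols, sizeof *cells);
    if (cells == NULL) { free(grid); return NULL; }
    for (size_t r = 0; r < rows; r++) {
        grid[r] = cells + r * cols;   /* row r starts r * cols cells in */
    }
    return grid;
}

/* Release both blocks allocated by grid_create. */
void grid_destroy(Cell **grid)
{
    if (grid != NULL) {
        free(grid[0]);   /* the cell block; grid[0] points at its start */
        free(grid);
    }
}
```

Compared with one malloc per row, this keeps all cells contiguous and needs only two calls to free, at the cost of rows that cannot be resized independently.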

How can I efficiently select several unique random numbers from 1 to 50, excluding x?

I have 2 numbers which are between 0 and 49. Let's call them x and y. Now I want to get a couple of other numbers which are not x or y, but are also between 0 and 49 (I am using Objective C but this is more of a general theory question I think?).
Method I thought of is:
int a;
int b;
int c;
// arc4random() % 50 yields 0..49 (% 49 would stop at 48)
do {
    a = arc4random() % 50;
} while ((a == x) || (a == y));
do {
    b = arc4random() % 50;
} while ((b == x) || (b == y) || (b == a));
do {
    c = arc4random() % 50;
} while ((c == x) || (c == y) || (c == a) || (c == b));
But it seems kind of bad to me. I am just trying to learn to be a better programmer, so what would be the most elegant, best-practice way to do this?
You can use something called the Fisher-Yates shuffle. It's an efficient algorithm for producing a randomly ordered list of values from some set. You would first exclude the unwanted values (x and y) from the list of values from which to draw, and then perform the shuffle.
You should shuffle an array of numbers (of values [0, ..., 49] in your case; you can also exclude your x and y from that array if you already know their values), then grab the first N values (however many you're seeking) from the shuffled array. That way, all the numbers are random, within the range, and guaranteed not to have been "seen before".
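The shuffle-and-take-the-first-N idea can be sketched in plain C as a partial Fisher-Yates shuffle, where only as many draws are performed as values are wanted. Everything here is an assumption made for illustration: the name pick_distinct, the fixed pool size of 256, and the use of the standard rand() (which has a slight modulo bias; on Apple platforms arc4random_uniform would be the natural replacement).

```c
#include <stdlib.h>

/*
 * Write `wanted` distinct random values from 0..limit-1 into out, skipping
 * x and y. Returns the number written, which is less than `wanted` only if
 * the pool runs out. limit is assumed to be at most 256.
 */
int pick_distinct(int limit, int x, int y, int *out, int wanted)
{
    int pool[256];
    int size = 0;
    for (int v = 0; v < limit && size < 256; v++) {
        if (v != x && v != y) {
            pool[size++] = v;         /* only valid candidates enter the pool */
        }
    }
    int picked = 0;
    while (picked < wanted && size > 0) {
        int j = rand() % size;        /* index into the unpicked part */
        out[picked++] = pool[j];
        pool[j] = pool[size - 1];     /* fill the hole with the last unpicked value */
        size--;
    }
    return picked;
}
```

Because each drawn value is removed from the pool, no retry loop is needed and the running time does not degrade as the number of requested values approaches the size of the range.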
I'd do something more along the lines of:
NSMutableSet * invalidNumbers = [NSMutableSet set];
[invalidNumbers addObject:[NSNumber numberWithInt:x]];
[invalidNumbers addObject:[NSNumber numberWithInt:y]];
int nextRandom = -1;
do {
    if (nextRandom >= 0) {
        [invalidNumbers addObject:[NSNumber numberWithInt:nextRandom]];
    }
    nextRandom = arc4random() % 50; // 0..49
} while ([invalidNumbers containsObject:[NSNumber numberWithInt:nextRandom]]);
You could add x, y and each new number to a data structure that you can use as a set, and do something like this (in pseudo-code; the set structure needs something like push to add values and in for checking membership):
number_of_randoms = 2;
set.push(x);
set.push(y);
for (i = 0; i<number_of_randoms; i++) {
    do {
        new_random = arc4random() % 50;
    } while set.in(new_random);  // retry while the value has already been used
    set.push(new_random);
}
So if objc has something appropriate, this is easy...[aha, it does, see Dave DeLong's post].
This algorithm makes sense if number_of_randoms is much less than 50; if they are comparable, then you should use one of the shuffle (aka permutation) ideas.
First, make a set of valid numbers:
// Create a set of all the possible numbers
NSRange range = { 0, 50 };// Assuming you meant [0, 49], not [0, 49)
NSMutableSet *numbers = [NSMutableSet set];
for (NSUInteger i = range.location; i < range.length; i++) {
    NSNumber *number = [NSNumber numberWithInt:i];
    [numbers addObject:number];
}
// Remove the numbers you already have
NSNumber *x = [NSNumber numberWithInt:(arc4random() % range.length)];
NSNumber *y = [NSNumber numberWithInt:(arc4random() % range.length)];
NSSet *invalidNumbers = [NSSet setWithObjects:x, y, nil];
[numbers minusSet:invalidNumbers];
Then, if you don't need the numbers to be guaranteed to be random, you could use -anyObject and -removeObject to pull out a couple of other numbers. If you do need them to be random, then follow LBushkin's answer, but be careful not to accidentally implement Sattolo's algorithm:
// Shuffle the valid numbers
NSMutableArray *shuffledNumbers = [[numbers allObjects] mutableCopy]; // must be mutable to exchange elements
NSUInteger n = [shuffledNumbers count];
while (n > 1) {
    NSUInteger j = arc4random() % n;
    n--;
    [shuffledNumbers exchangeObjectAtIndex:j withObjectAtIndex:n];
}