Select nth value of NSArray - objective-c

How would I go about selecting every nth value of an array and adding them to another array?
For example, if I have an NSArray with 100 objects, how do I add every 5th object to a new array? I understand how to select the 5th object and how to add to a new array; I'm just looking for the best way to do this. This is for image manipulation, so I will be dealing with arrays of up to 2 million pixel values.
Is the best way to just use for loops?

You can use striding (Swift 2 syntax shown here; in Swift 3 and later this is the free function stride(from:to:by:)):
.stride(to: 100, by: 5)
So to create a new array:
Array(0.stride(to: 10, by: 2).map( { myArray[$0] }))
UPDATE: As Leo Dabus points out, the above will start at element 0 (and take every 2nd). If you want to start at the 5th and take every 5th, you would use:
Array(4.stride(to: 100, by: 5).map( { myArray[$0] }))

Using loops is pretty good: they are easy to read, and they are about as efficient as anything else that you may want to use for this purpose. The only optimization to the for loop approach is to reserve a specific number of elements upfront, because you know how many elements you are going to write.
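As a rough, language-neutral sketch of the loop approach with the output reserved up front (written in Python rather than Objective-C; the function name every_nth and the sample data are made up for illustration):

```python
def every_nth(values, n):
    """Return the n-th, 2n-th, 3n-th, ... elements of values."""
    if n <= 0:
        raise ValueError("n must be positive")
    count = len(values) // n        # number of selected elements, known before the loop
    result = [None] * count         # reserve the output once instead of growing it
    for i in range(count):
        # 0-based position of the (i + 1) * n-th element
        result[i] = values[(i + 1) * n - 1]
    return result


# Hypothetical data: 100 "pixel" values numbered 1..100.
pixels = list(range(1, 101))
selected = every_nth(pixels, 5)
```

The same index arithmetic carries over to an Objective-C for loop that writes into an NSMutableArray created with arrayWithCapacity:.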
If you are going to make the same selection from multiple arrays (e.g. processing an array of arrays), consider creating NSIndexSet, and applying it with objectsAtIndexes to perform the selection. This may give your code slightly better readability, because the for loop for creating indexes would be separate from the process of selection.
Finally, if you need to optimize for speed, and your arrays store wrapped primitives, consider using plain arrays instead of NSArray to avoid wrapping and unwrapping. This has a potential of giving you the most improvement, because by eliminating additional memory accesses for unwrapping it would also significantly improve locality of reference, which has crucial importance for cache use optimization.

Related

Natural way of indexing elements in Flink

Is there a built-in way to index and access indices of individual elements of DataStream/DataSet collection?
Like in typical Java collections, where you know that e.g. the 3rd element of an ArrayList can be obtained by ArrayList.get(2) and, vice versa, ArrayList.indexOf(elem) gives us the index of (the first occurrence of) the specified element. (I'm not asking about extracting elements out of the stream.)
More specifically, when joining DataStreams/DataSets, is there a "natural"/easy way to join elements that came (were created) first, second, etc.?
I know there is a zipWithIndex transformation that assigns sequential indices to elements. I suspect the indices always start with 0? But I also suspect that they aren't necessarily assigned in the order the elements were created in (i.e. by their Event Time). (It also exists only for DataSets.)
This is what I currently tried:
DataSet<Tuple2<Long, Double>> tempsJoIndexed = DataSetUtils.zipWithIndex(tempsJo);
DataSet<Tuple2<Long, Double>> predsLinJoIndexed = DataSetUtils.zipWithIndex(predsLinJo);
DataSet<Tuple3<Double, Double, Double>> joinedTempsJo = tempsJoIndexed
.join(predsLinJoIndexed).where(0).equalTo(0)...
And it seems to create wrong pairs.
I see some possible approaches, but they're either non-Flink or not very nice:
1. I could of course assign an index to each element upon the stream's creation and have e.g. a stream of Tuples.
2. Work with event-time timestamps. (I suspect there isn't a way to key by timestamps, and even if there were, it wouldn't be useful for joining multiple streams like this unless the timestamps are actually assigned as indices.)
3. We could try "collecting" the stream first, but then we wouldn't be using Flink anymore.
The first approach seems like the most viable one, but it also seems redundant given that the stream should by definition be a sequential collection and, as such, the elements should have a sense of orderliness (e.g. `I'm the 36th element because 35 elements already came before me.`).
I think you're going to have to assign index values to elements, so that you can partition the data sets by this index, and thus ensure that two records which need to be joined are being processed by the same sub-task. Once you've done that, a simple groupBy(index) and reduce() would work.
But assigning increasing ids without gaps isn't trivial if you want to read your source data with parallelism > 1. In that case I'd create a RichMapFunction that uses the sub-task id and the number of sub-tasks from its runtime context to calculate non-overlapping and monotonic indexes.
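The arithmetic behind that suggestion can be sketched outside Flink. The following Python snippet is only a simulation under stated assumptions: assign_indexes, the partitions and the record values are hypothetical, and none of it is Flink API. Each sub-task keeps a local counter and interleaves its indexes with those of the other sub-tasks.

```python
def assign_indexes(records, subtask_id, num_subtasks):
    """Pair each record with an index that no other sub-task can produce.

    The k-th record (k = 0, 1, 2, ...) seen by a sub-task gets
    subtask_id + k * num_subtasks, so indexes from different sub-tasks
    never collide and grow monotonically within each sub-task.
    """
    return [(subtask_id + k * num_subtasks, record)
            for k, record in enumerate(records)]


# Hypothetical input split across 3 parallel sub-tasks.
partitions = [["a", "b", "c"], ["d", "e"], ["f", "g", "h"]]
indexed = [assign_indexes(part, task_id, len(partitions))
           for task_id, part in enumerate(partitions)]
all_indexes = [index for part in indexed for index, _ in part]
```

Note that this scheme leaves gaps whenever the sub-tasks see different numbers of records (here index 7 is never used), which is consistent with the remark that gap-free ids are not trivial under parallelism.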

Fast, efficient method of assigning large array of data to array of clusters?

I'm looking for a faster, more efficient method of assigning data gathered from a DAQ to its proper location in a large cluster containing arrays of subclusters.
My current method relies heavily on the OpenG cluster manipulation tools, but with a large data set the performance is far too slow.
The array and cluster location of each element of data from the DAQ is determined during an initialization phase and doesn't change during acquisition.
Because the data element origin and end points are the same throughout acquisition, I would think an array of memory locations could be created and the data directly assigned to its proper place. I'm just not sure how to implement such a thing.
The following code does what you want:
For each of your cluster elements (AMC, ANLG_PM and PA) you should add a case in the string case structure. For the elements AMC and PA you will need to place a second case structure.
This is really more of a comment, but I do not have the reputation to leave those yet, so here it is:
Regarding adding cases for every possible value of Array name, is there any reason why you cannot use an enum here? Since you are placing it into a cluster anyway, I would suggest making a type-defined enum of your possible array names. That way, when you want to add or remove one, you only have to do it in one place.
You will still need to right-click on your case structures that use this enum and select Add item for every value if you are adding a value, or manually delete the obsolete value if you are removing one. I suppose some maintenance is required either way...

Keeping an array sorted - at setting, getting or later?

As an aid to learning Objective-C/OOP, I'm designing an iOS app to store and display periodic bodyweight measurements. I've got a singleton which returns an NSMutableArray holding the shared store of measurement objects. Each measurement will have at least a date and a body weight, and I want to be able to add historic measurements.
I'd like to display the measurements in date order. What's the best way to do this? As far as I can see the options are as follows: 1) when adding a measurement, I override addObject: to sort the shared store every time after a measurement is added, 2) when retrieving the NSMutableArray, I sort it, or 3) I retrieve the NSMutableArray in whatever order it happens to be in the shared store, then sort it when displaying the table/chart.
It's likely that the data will be retrieved more frequently than a new datum is added, so option 1 will reduce redundant sorting of the shared store - so this is the best way, yes?
You can use a modified version of (1). Instead of sorting the complete array each time a new object is inserted, you use the method described here: https://stackoverflow.com/a/8180369/1187415 to insert the new object into the array at the correct place.
Then for each insert you have only a binary search to find the correct index for the new object, and the array is always in correct order.
Since you said that the data is more frequently retrieved than new data is added, this seems to be more efficient.
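A minimal sketch of that sorted-insert idea, written in Python with the standard bisect module rather than Objective-C. The measurements are hypothetical (date, weight) tuples; ISO-formatted date strings are assumed so that plain tuple comparison gives date order.

```python
import bisect


def insert_sorted(measurements, measurement):
    """Insert measurement into an already sorted list and return its index.

    bisect_right performs the binary search; equal entries keep their
    insertion order because the new one is placed after them.
    """
    index = bisect.bisect_right(measurements, measurement)
    measurements.insert(index, measurement)
    return index


# Hypothetical data, added out of date order (e.g. historic measurements).
store = []
insert_sorted(store, ("2023-03-01", 81.5))
insert_sorted(store, ("2023-01-15", 83.0))
insert_sorted(store, ("2023-02-10", 82.2))
```

After every call the list is already in date order, so reading it never needs a sort. In Foundation, the corresponding binary search is indexOfObject:inSortedRange:options:usingComparator: with the NSBinarySearchingInsertionIndex option.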
Setting your special case aside, this question is not so easy to answer. There are two basic solutions:
Keep the array unsorted, and sort it only when an element is accessed and the array is not currently sorted. Let's call it "lazy sorting".
Keep the array sorted while inserting elements. Note this is not about appending the new element at the end and then sorting the whole array. It is about finding where the element should go (binary search) and placing it there. Let's call it "sorted insert".
Both techniques are correct and useful and deciding which one is better depends on your use cases.
Example:
You want to insert hundreds of elements into the array, then access the elements, then again insert hundreds of elements, then access. In summary, you will be inserting values in big chunks. In this case, lazy sorting will be better.
You will often insert individual elements and you will access the elements often. Then sorted insert will have better performance.
Something in the middle (between inserting 1 and inserting tens of elements). You probably don't care which one of the methods will be used.
(Note that you can also use specialized structures not based on NSArray to keep the data sorted, e.g. a balanced tree in which each node stores the number of elements in its subtree, so that elements can still be looked up by index.)
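The "lazy sorting" option described above can be sketched as a small wrapper with a dirty flag. This is an illustrative Python sketch, not an NSArray-based implementation; the class name LazySortedList and the sort_count counter (kept only to show how often a sort actually runs) are made up for the example.

```python
class LazySortedList:
    """Keeps items unsorted on insert and sorts only when they are read."""

    def __init__(self):
        self._items = []
        self._dirty = False     # True when items were added since the last sort
        self.sort_count = 0     # demo-only counter of real sorts performed

    def add(self, item):
        self._items.append(item)    # cheap append, no sorting here
        self._dirty = True

    def items(self):
        if self._dirty:             # sort at most once per batch of inserts
            self._items.sort()
            self._dirty = False
            self.sort_count += 1
        return self._items


weights = LazySortedList()
for value in (83.0, 81.5, 82.2):    # a chunk of inserts
    weights.add(value)
first_read = list(weights.items())  # triggers the only sort
second_read = list(weights.items()) # already sorted, no extra work
```

With inserts arriving in big chunks, the whole chunk costs a single sort on the next read, which is the case where lazy sorting is expected to win.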

Better to use size or count on collection?

When counting a collection, is it better to do it via size or count?
Size = Ruby (@foobars.size)
Count = SQL (@foobars.count)
I also notice that count makes another trip to the db.
I tend to suggest using size for everything, just because it's safer. People make fewer silly mistakes using size.
Here's how they work:
length: length will return the number of elements of an array or an otherwise loaded collection. The key point is that the collection will be loaded here regardless. So if you're working with an ActiveRecord association, it will pull the elements from the DB into memory, and then return the number.
count: count issues a database query, so if you have an array already it's a pointless call to your database.
size: best of both worlds - size checks which type you're using and then uses whichever seems more appropriate (so if you have an array, it will use length; if you have an unretrieved ActiveRecord::Association it will use count, and so on).
Source:
http://blog.hasmanythrough.com/2008/2/27/count-length-size/
It depends on the situation. In the example you show I would go with size since you already have the collection loaded and a call to size will just check the length of the array. As you noticed, count will do an extra db query and you really want to avoid that.
However, in the scenario that you only want to display the number of Foobars and not show those objects, then I would go with count because it will not load the instances into memory, just return the number of records.

Is there any built in method for sorting in Objective-c?

I have two sorted NSMutableArrays (or I can use any other collection, it's not critical). I need to insert the objects from the first array into the second and preserve the sort order of the second array. What is the optimal (fastest) method to do that? I can implement all the known good algorithms, but my question is whether there is already some built-in method. If not, what is the best algorithm in my case?
The real answer would be: it depends, since you are asking: what is the fastest way of inserting objects from one array into another while preserving sort order.
There is no built-in way of inserting at the right place in a sorted array. You can achieve the same effect by just adding the two arrays together and re-sorting, but it won't be "the fastest way".
What is actually faster depends on many things, such as how much data the arrays contain, the ratio of data in array1 vs array2 (does one array contain much more data than the other?), etc.
NOTE: You should probably begin with the simple solution and only optimize once you experience performance problems. Do measurements with a large data set though, to see that your solution works with whatever data your users may have.
Inserting items from one sorted array into another sorted array
If you want to merge the two arrays by inserting objects in the right place then normal algorithms apply. You should insert the smaller array into the bigger array and try to insert entire sorted sequences where possible instead of every item one by one.
For best performance you should try to make a batch insert using insertObjects:atIndexes: instead of inserting the object one by one.
You can use indexOfObject:inSortedRange:options:usingComparator: to find the index that each item should be inserted in the other array if you specify NSBinarySearchingInsertionIndex for the options. Also, the comparator you are using must be the same as the comparator that sorted the array, otherwise the result is "undefined".
With this in mind, you would do something like this:
Create a mutable index set
For every ITEM in SMALLER ARRAY
    Find the index where ITEM should be inserted in LONGER ARRAY
    Add (the insertion location + ITEM's location in the short array) to the mutable index set
    Next item
Batch insert all items
The documentation for insertObjects:atIndexes: tells you that each object is inserted at "the corresponding location specified in indexes after earlier insertions have been made." In your case, with two sorted arrays, this means that all items with a lower index will already have been added, so you should add the index of the object in the short array to the value returned from indexOfObject:inSortedRange:options:usingComparator:.
Another (probably very premature) optimization you can make is to shrink the sortedRange for every item in the loop, so that you don't search the part of the array that is already known to sort before the item being inserted.
There are probably many other optimizations that can be made. You should still measure first! Hopefully this will get you started.
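The index bookkeeping in the steps above is easy to get wrong, so here is a small Python sketch of it. It uses the standard bisect module in place of indexOfObject:inSortedRange:options:usingComparator:, and the function names and sample arrays are hypothetical. It assumes both lists are sorted with the same natural ordering.

```python
import bisect


def batch_insertion_indexes(longer, smaller):
    """Final index of each item of smaller once all of them are inserted into longer."""
    indexes = []
    lo = 0                                   # lower bound: earlier items cannot sort after later ones
    for offset, item in enumerate(smaller):
        pos = bisect.bisect_right(longer, item, lo)
        indexes.append(pos + offset)         # shift by the items already inserted before this one
        lo = pos                             # narrow the search range for the next item
    return indexes


def insert_at_indexes(target, items, indexes):
    """Insert in ascending index order, so each index refers to the array after earlier insertions."""
    for item, index in zip(items, indexes):
        target.insert(index, item)


longer = [1, 4, 6, 9]
smaller = [2, 5, 10]
indexes = batch_insertion_indexes(longer, smaller)
insert_at_indexes(longer, smaller, indexes)
```

The lo variable is the "shrink the search range" optimization mentioned above; dropping it changes only the amount of searching, not the result.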
NSArray *newArray = [firstArray arrayByAddingObjectsFromArray:secondArray];
newArray = [newArray sortedArrayUsingSelector:@selector(localizedCaseInsensitiveCompare:)];
I would start by simply adding all of the objects of the first array to the second and then resorting the second. Time how long it takes. If it is acceptable, stop there.
If not, you could try a binary search to find the insertion point in the second array for each item in the first array. Since both arrays are sorted, you might be able to optimise the search by using the last insertion point as the lower bound each time round. Something like this:
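NSInteger insertionPoint = -1;
for (id object in array1)
{
    // binarySearch:for:lowerBound: is a helper you would write yourself; it returns
    // the index at which object belongs in array2, searching only from lowerBound on.
    insertionPoint = [self binarySearch: array2 for: object lowerBound: insertionPoint + 1];
    [array2 insertObject: object atIndex: insertionPoint];
}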
The Cocoa class NSSortDescriptor together with sortedArrayUsingDescriptors: from NSArray should do what you are after.
Since you are using mutable arrays, you might want to use sortUsingDescriptors: which sorts the mutable array without creating a new one.
Look at the documentation here to see if any of the NSArray sort methods work for you: http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSArray_Class/NSArray.html. You can scroll down to the methods; there are 7 built-in ones for sorting. You could probably just combine the two arrays and run sortedArrayUsingComparator: or one of the other methods.