Keeping an array sorted - at setting, getting or later? - objective-c

As an aid to learning Objective-C/OOP, I'm designing an iOS app to store and display periodic bodyweight measurements. I've got a singleton which returns an NSMutableArray of the shared store of measurement objects. Each measurement will have at least a date and a body weight, and I want to be able to add historic measurements.
I'd like to display the measurements in date order. What's the best way to do this? As far as I can see, the options are as follows: 1) when adding a measurement, I override addObject: to sort the shared store every time a measurement is added; 2) when retrieving the mutable array, I sort it; or 3) I retrieve the mutable array in whatever order it happens to be in the shared store, then sort it when displaying the table/chart.
It's likely that the data will be retrieved more frequently than a new datum is added, so option 1 will reduce redundant sorting of the shared store. So this is the best way, yes?

You can use a modified version of (1). Instead of sorting the complete array each time a new object is inserted, you use the method described here: https://stackoverflow.com/a/8180369/1187415 to insert the new object into the array at the correct place.
Then for each insert you have only a binary search to find the correct index for the new object, and the array is always in correct order.
Since you said that the data is more frequently retrieved than new data is added, this seems to be more efficient.
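A minimal sketch of that insert-at-the-right-place idea, written in Java rather than Objective-C so it can be run on its own. The Measurement class, the day field standing in for the date, and the insertSorted name are all assumptions made for illustration; in Objective-C the binary search would be done by the method from the linked answer rather than Collections.binarySearch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortedInsertSketch {
    // Hypothetical measurement type; a day number stands in for the date.
    static final class Measurement {
        final long day;
        final double weightKg;

        Measurement(long day, double weightKg) {
            this.day = day;
            this.weightKg = weightKg;
        }
    }

    // The same comparator must be used for every search and insert.
    static final Comparator<Measurement> BY_DAY = Comparator.comparingLong(m -> m.day);

    // Inserts m so that the list stays sorted by day; returns the index used.
    static int insertSorted(List<Measurement> sorted, Measurement m) {
        int pos = Collections.binarySearch(sorted, m, BY_DAY);
        // A negative result encodes the insertion point as -(point) - 1.
        int index = pos >= 0 ? pos : -pos - 1;
        sorted.add(index, m);
        return index;
    }

    public static void main(String[] args) {
        List<Measurement> store = new ArrayList<>();
        insertSorted(store, new Measurement(20, 81.0));
        insertSorted(store, new Measurement(5, 83.5));   // historic entry added later
        insertSorted(store, new Measurement(12, 82.2));
        for (Measurement m : store) {
            System.out.println(m.day + " " + m.weightKg);
        }
    }
}
```

The search is logarithmic, but the insert itself still shifts the elements behind the insertion point, which is usually negligible for a list of body weight readings.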

If I set your special case aside, this question is not so easy to answer. There are two basic solutions:
1. Keep the array unsorted, and sort it only when an element is accessed while the array is out of order. Let's call it "lazy sorting".
2. Keep the array sorted when inserting elements. Note that this does not mean appending the new element at the end and then sorting the whole array. It means finding where the element belongs (binary search) and placing it there. Let's call it "sorted insert".
Both techniques are correct and useful, and which one is better depends on your use case.
Examples:
1. You want to insert hundreds of elements into the array, then access the elements, then again insert hundreds of elements, then access. In summary, you will be inserting values in big chunks. In this case, lazy sorting will be better.
2. You insert individual elements often and you access the elements often. Then sorted insert will have better performance.
3. Something in the middle (between inserting 1 and inserting tens of elements). You probably don't need to care which of the methods is used.
(Note that you can also use specialized structures to keep the data sorted that are not based on NSArray, e.g. a balanced tree that keeps the number of elements in each subtree.)
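For comparison with sorted insert, here is a rough sketch of "lazy sorting" in Java (not Objective-C): appends only mark the store as dirty, and the sort runs on the next read. The class name, the dirty flag and the sort counter are assumptions added for illustration; the counter exists only to make the behaviour observable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LazySortedStore {
    private final List<Long> values = new ArrayList<>();
    private boolean dirty = false;  // true when values may be out of order
    private int sortCount = 0;      // number of sorts actually performed

    // Cheap append; no sorting happens here.
    void add(long value) {
        values.add(value);
        dirty = true;
    }

    // Sorts only if something was added since the last read.
    List<Long> sortedView() {
        if (dirty) {
            Collections.sort(values);
            dirty = false;
            sortCount++;
        }
        return Collections.unmodifiableList(values);
    }

    int getSortCount() {
        return sortCount;
    }

    public static void main(String[] args) {
        LazySortedStore store = new LazySortedStore();
        store.add(3);
        store.add(1);
        store.add(2);
        System.out.println(store.sortedView());
        System.out.println(store.sortedView());  // second read does not sort again
        System.out.println("sorts performed: " + store.getSortCount());
    }
}
```

A batch of hundreds of inserts followed by reads costs a single sort here, which is the situation where lazy sorting tends to win.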

Natural way of indexing elements in Flink

Is there a built-in way to index and access indices of individual elements of DataStream/DataSet collection?
Like in typical Java collections, where you know that e.g. the 3rd element of an ArrayList can be obtained by ArrayList.get(2) and, vice versa, ArrayList.indexOf(elem) gives us the index of (the first occurrence of) the specified element. (I'm not asking about extracting elements out of the stream.)
More specifically, when joining DataStreams/DataSets, is there a "natural"/easy way to join elements that came (were created) first, second, etc.?
I know there is a zipWithIndex transformation that assigns sequential indices to elements. I suspect the indices always start with 0? But I also suspect that they aren't necessarily assigned in the order the elements were created in (i.e. by their Event Time). (It also exists only for DataSets.)
This is what I currently tried:
DataSet<Tuple2<Long, Double>> tempsJoIndexed = DataSetUtils.zipWithIndex(tempsJo);
DataSet<Tuple2<Long, Double>> predsLinJoIndexed = DataSetUtils.zipWithIndex(predsLinJo);
DataSet<Tuple3<Double, Double, Double>> joinedTempsJo = tempsJoIndexed
.join(predsLinJoIndexed).where(0).equalTo(0)...
And it seems to create wrong pairs.
I see some possible approaches, but they're either non-Flink or not very nice:
1. I could of course assign an index to each element upon the stream's creation and have e.g. a stream of Tuples.
2. Work with event-time timestamps. (I suspect there isn't a way to key by timestamps, and even if there was, it wouldn't be useful for joining multiple streams like this unless the timestamps are actually assigned as indices.)
3. We could try "collecting" the stream first but then we wouldn't be using Flink anymore.
The first approach seems like the most viable one, but it also seems redundant given that a stream should by definition be a sequential collection, and as such its elements should have a sense of order (e.g. `I'm the 36th element because 35 elements already came before me.`).
I think you're going to have to assign index values to elements, so that you can partition the data sets by this index, and thus ensure that two records which need to be joined are being processed by the same sub-task. Once you've done that, a simple groupBy(index) and reduce() would work.
But assigning increasing ids without gaps isn't trivial, if you want to be reading your source data with parallelism > 1. In that case I'd create a RichMapFunction that uses the runtimeContext sub-task id and number of sub-tasks to calculate non-overlapping and monotonic indexes.
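One possible way to compute such indexes is sketched below in plain Java, without any Flink dependency, so it is an illustration of the arithmetic only. The SubtaskIndexer class and its method names are hypothetical. Inside a RichMapFunction in the Flink 1.x API, the two constructor arguments would presumably come from getRuntimeContext().getIndexOfThisSubtask() and getRuntimeContext().getNumberOfParallelSubtasks().

```java
final class SubtaskIndexer {
    private final int subtaskIndex;  // 0-based id of this parallel instance
    private final int parallelism;   // total number of parallel instances
    private long seen = 0;           // elements indexed so far by this instance

    SubtaskIndexer(int subtaskIndex, int parallelism) {
        if (parallelism <= 0 || subtaskIndex < 0 || subtaskIndex >= parallelism) {
            throw new IllegalArgumentException("need 0 <= subtaskIndex < parallelism");
        }
        this.subtaskIndex = subtaskIndex;
        this.parallelism = parallelism;
    }

    // The n-th element seen by sub-task k gets index k + n * parallelism,
    // so no two sub-tasks can ever produce the same value.
    long nextIndex() {
        long index = subtaskIndex + seen * parallelism;
        seen++;
        return index;
    }

    public static void main(String[] args) {
        SubtaskIndexer first = new SubtaskIndexer(0, 2);
        SubtaskIndexer second = new SubtaskIndexer(1, 2);
        System.out.println(first.nextIndex() + " " + first.nextIndex());
        System.out.println(second.nextIndex() + " " + second.nextIndex());
    }
}
```

The indexes increase within each sub-task and never collide across sub-tasks, but they will have gaps whenever the sub-tasks see different numbers of elements, and they say nothing about the order in which elements were originally created.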

Select nth value of NSArray

How would I go about selecting every nth value of an array and adding them to another array?
For example, say I have an NSArray which has 100 objects and I want to add every 5th object. I understand how to select the 5th object and how to add to a new array etc.; I'm just looking for the best way to do this. This is for image manipulation, so I will be dealing with arrays of up to 2m pixel values.
Is the best way to just use for loops?
You can use striding:
0.stride(to: 100, by: 5)
So to create a new array:
Array(0.stride(to: 10, by: 2).map( { myArray[$0] }))
UPDATE: As Leo Dabus points out, the above will start at element 0 (and take every 2nd). If you want to start at the 5th and take every 5th, you would use:
Array(4.stride(to: 100, by: 5).map( { myArray[$0] }))
Using loops is pretty good: they are easy to read, and they are about as efficient as anything else that you may want to use for this purpose. The only optimization to the for loop approach is to reserve a specific number of elements upfront, because you know how many elements you are going to write.
If you are going to make the same selection from multiple arrays (e.g. processing an array of arrays), consider creating NSIndexSet, and applying it with objectsAtIndexes to perform the selection. This may give your code slightly better readability, because the for loop for creating indexes would be separate from the process of selection.
Finally, if you need to optimize for speed, and your arrays store wrapped primitives, consider using plain arrays instead of NSArray to avoid wrapping and unwrapping. This has a potential of giving you the most improvement, because by eliminating additional memory accesses for unwrapping it would also significantly improve locality of reference, which has crucial importance for cache use optimization.
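The loop approach with capacity reserved upfront might look roughly like this. It is sketched in Java rather than Objective-C; the everyNth name and its start/step parameters are assumptions for illustration. The int[] overload mirrors the suggestion to use plain arrays of primitives instead of wrapped values.

```java
import java.util.ArrayList;
import java.util.List;

final class EveryNth {
    // Number of indices start, start + step, ... that are below size.
    static int countFor(int size, int start, int step) {
        if (start < 0 || step <= 0) {
            throw new IllegalArgumentException("need start >= 0 and step > 0");
        }
        int remaining = size - start;
        return remaining <= 0 ? 0 : (remaining - 1) / step + 1;
    }

    // Boxed version: the result list is created with its final capacity.
    static <T> List<T> everyNth(List<T> source, int start, int step) {
        List<T> result = new ArrayList<>(countFor(source.size(), start, step));
        for (int i = start; i < source.size(); i += step) {
            result.add(source.get(i));
        }
        return result;
    }

    // Primitive version: no wrapping or unwrapping of the pixel values.
    static int[] everyNth(int[] source, int start, int step) {
        int[] result = new int[countFor(source.length, start, step)];
        for (int i = start, out = 0; i < source.length; i += step, out++) {
            result[out] = source[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] pixels = new int[100];
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = i;
        }
        // Start at the 5th element (index 4) and take every 5th.
        int[] picked = everyNth(pixels, 4, 5);
        System.out.println(picked.length + " values, first " + picked[0]);
    }
}
```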

Fast, efficient method of assigning large array of data to array of clusters?

I'm looking for a faster, more efficient method of assigning data gathered from a DAQ to its proper location in a large cluster containing arrays of subclusters.
My current method relies heavily on the OpenG cluster manipulation tools, but with a large data-set the performance is far too slow.
The array and cluster location of each element of data from the DAQ is determined during an initialization phase and doesn't change during acquisition.
Because the data element origin and end points are the same throughout acquisition, I would think an array of memory locations could be created and the data directly assigned to its proper place. I'm just not sure how to implement such a thing.
The following code does what you want:
For each of your cluster elements (AMC, ANLG_PM and PA) you should add a case in the string case structure, for the elements AMC and PA you will need to place a second case structure.
This is really more of a comment, but I do not have the reputation to leave those yet, so here it is:
Regarding adding cases for every possible value of Array name, is there any reason why you cannot use an enum here? Since you are placing it into a cluster anyway, I would suggest making a type-defined enum of your possible array names. That way, when you want to add or remove one, you only have to do it in one place.
You will still need to right-click on your case structures that use this enum and select Add item for every value if you are adding a value, or manually delete the obsolete value if you are removing one. I suppose some maintenance is required either way...

Is there any built in method for sorting in Objective-c?

I have two sorted NSMutableArrays (or I can use any other collection, not critical), I need to insert objects from the first array to the second and preserve sort order in the second array. What is the optimal (fastest) method to do that? I can implement all the known good algorithms, but my question is, if there is already some built-in method? If not, what is the best algorithm in my case?
The real answer would be: it depends, since you are asking: what is the fastest way of inserting objects from one array into another while preserving sort order.
There is no built in way of inserting in the right place of a sorted array. You can achieve the same effect by just adding the two arrays together but it won't be "the fastest way".
What is actually faster depends on many things, such as how much data the arrays contain and the ratio of data in array1 vs array2 (does one array contain much more data than the other?).
NOTE: You should probably begin with the simple solution and only optimize once you experience performance problems. Do measurements with a large data set though, to see that your solution works with whatever data your users may have.
Inserting items from one sorted array into another sorted array
If you want to merge the two arrays by inserting objects in the right place then normal algorithms apply. You should insert the smaller array into the bigger array and try to insert entire sorted sequences where possible instead of every item one by one.
For best performance you should try to make a batch insert using insertObjects:atIndexes: instead of inserting the object one by one.
You can use indexOfObject:inSortedRange:options:usingComparator: to find the index that each item should be inserted in the other array if you specify NSBinarySearchingInsertionIndex for the options. Also, the comparator you are using must be the same as the comparator that sorted the array, otherwise the result is "undefined".
With this in mind, you would do something like this:
Create a mutable index set
For every ITEM in SMALLER ARRAY
    Find the index where to insert ITEM in LONGER ARRAY
    Add (the insertion location + the location in the short array) as the index in the mutable set
Next item
Batch insert all items
The documentation for insertObjects:atIndexes: tells you that each object is inserted at "the corresponding location specified in indexes after earlier insertions have been made." In your case, with two sorted arrays, this means that all items with a lower index will already have been added, and thus you should add the index of the object in the short array to the value returned from indexOfObject:inSortedRange:options:usingComparator:.
Another (probably very premature) optimization you can make is to shrink the sortedRange for every item in the loop, so that you don't search through the part of the array that you already know holds only items smaller than the one being inserted.
There are probably many other optimizations that can be made. You should still measure first! Hopefully this will get you started.
NSArray *newArray=[firstArray arrayByAddingObjectsFromArray:secondArray];
newArray = [newArray sortedArrayUsingSelector:@selector(localizedCaseInsensitiveCompare:)];
I would start by simply adding all of the objects of the first array to the second and then resorting the second. Time how long it takes. If it is acceptable, stop there.
If not, you could try a binary search to find the insertion point in the second array for each item in the first array. Since both arrays are sorted, you might be able to optimise the search by using the last insertion point as the lower bound each time round. Something like this:
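// Index of the most recent insertion; the next search starts just after it.
NSInteger insertionPoint = -1;
for (id object in array1)
{
    // binarySearch:for:lowerBound: is a helper you would write yourself;
    // it returns the index at which object belongs in array2.
    insertionPoint = [self binarySearch: array2 for: object lowerBound: insertionPoint + 1];
    [array2 insertObject: object atIndex: insertionPoint];
}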
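The same loop, including one possible version of the binary search helper, can be sketched in Java (not Objective-C) as follows. The class and method names are made up for illustration, and the lists hold Integer values only to keep the comparison simple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SortedMergeInsert {
    // First index in [low, list.size()) whose element is greater than key.
    static int insertionPoint(List<Integer> list, int key, int low) {
        int lo = low;
        int hi = list.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (list.get(mid) <= key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Inserts every element of the sorted source into the sorted target.
    static void insertAll(List<Integer> source, List<Integer> target) {
        int point = -1;
        for (int value : source) {
            // Everything up to the previous insertion is <= value,
            // so the search can start right after it.
            point = insertionPoint(target, value, point + 1);
            target.add(point, value);
        }
    }

    public static void main(String[] args) {
        List<Integer> target = new ArrayList<>(Arrays.asList(1, 4, 6, 9));
        insertAll(Arrays.asList(2, 4, 7, 10), target);
        System.out.println(target);
    }
}
```

Each add still shifts the tail of the target list, so for two large arrays a single linear merge into a new array, or the batch insert described earlier, may well be faster. Measuring is the only way to know.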
The Cocoa class NSSortDescriptor together with sortedArrayUsingDescriptors: from NSArray should do what you are after.
Since you are using mutable arrays, you might want to use sortUsingDescriptors: which sorts the mutable array without creating a new one.
Look at the documentation to see if any of the NSArray sort methods work for you: http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSArray_Class/NSArray.html (scroll down to the methods; there are 7 built-in ones for sorting). You could probably just combine the two arrays and run sortedArrayUsingComparator: or one of the other methods.

Bind Top 5 Values of a To-Many Core Data Relationship to Text Fields

I am making an application that represents a cell phone bill using Core Data. I have three entities: Bill, Line, and Calls. Bills can have many lines, and lines can have many calls. All of this is set up with relationships. Right now, I have a table view that displays all of the bills. When you double click on a bill, a sheet comes down with a popup box that lists all of the lines on the bill. Below the popup box is a box that has many labels that display various information about that line. Below that information I want to list the top 5 numbers called by that line in that month. Lines has a to-many relationship with Calls, which has two fields, number and minutes. I have all of the calls for the selected line loaded into an NSArrayController with a sort descriptor that properly arranges the values. How do I populate 5 labels with the top 5 values of this array controller?
EDIT: The array of calls is already unique: when I gather the data, I combine all the individual calls into total minutes per number for each month. I just need to sort and display the first 5 records of this combined array.
I may be wrong (and really hope I am), but it looks like you'll need to use brute force on this one. There are no set / array operators that can help, nor does NSPredicate appear to help.
I think this is actually a bit tricky and it looks like you'll have to do some coding. The Core Data Programming Guide says:
"If you execute a fetch directly, you should typically not add Objective-C-based predicates or sort descriptors to the fetch request. Instead you should apply these to the results of the fetch. If you use an array controller, you may need to subclass NSArrayController so you can have it not pass the sort descriptors to the persistent store and instead do the sorting after your data has been fetched."
I think this applies to your case because it's important to consider whether sorting or filtering takes place first in a fetch request (when the fetch request's predicate and sort descriptors are set). This is because you'll be tempted to use the @distinctUnionOfObjects set/array operator. If the list is collapsed to uniques before sorting, it won't help. If it's applied after sorting, you can just set the fetch request's limit to 5 and there are your results.
Given the documentation, I don't know that this is how it will work. Also, in this case, it might be easier to avoid NSArrayController for this particular problem and just use NSTableViewDataSource protocol, but that's beyond the scope of this Q&A.
So, here's one way to do it:
1. Create a predicate to filter for the selected bill's line items.*
2. Create a sort descriptor to sort the line items by their telephone number (which are hopefully in a standardized format internally, else trouble awaits) via @"call.number" in your case.
3. Create a fetch request for the line item entity, with the predicate and sort descriptors, then execute it.**
With those sorted results, it would be nice if you could collapse and "unique" them easily, and again, you'll be tempted to use @distinctUnionOfObjects. Unfortunately, set/array operators won't be any help here (you can't use them directly on NSArray/NSMutableArray or NSSet/NSMutableSet instances). Brute force it is, then.
I'd create a topFive array and loop through the results, adding the number to topFive if it's not there already, until topFive has 5 items or until I'm out of results.
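That brute-force loop is short. Here is a rough Java sketch of it (not Objective-C, and without Core Data), assuming the fetch results have already been reduced to a sorted list of phone number strings. The topDistinct name and the sample numbers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TopDistinct {
    // Walks the already sorted results and keeps the first `limit` distinct values.
    static List<String> topDistinct(List<String> sortedNumbers, int limit) {
        List<String> top = new ArrayList<>(limit);
        for (String number : sortedNumbers) {
            if (top.size() >= limit) {
                break;  // enough entries collected
            }
            if (!top.contains(number)) {
                top.add(number);
            }
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList(
                "555-0100", "555-0100", "555-0199", "555-0123",
                "555-0199", "555-0142", "555-0111", "555-0177");
        System.out.println(topDistinct(results, 5));
    }
}
```

With a limit of 5 the linear contains check is cheap; for a much larger limit a set would be the better choice for the membership test.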
Displaying it in your UI (using Bindings or not) is, as I said, beyond the scope of this Q&A, so I'll leave it there. I'd LOVE to hear if there's a better way to do this - it's definitely one of those "looks like it should be easy but it's not" kind of things. :-)
*You could also use @unionOfObjects in your key path during the fetch itself to get the numbers of the calls of the line items of the selected bill, which would probably be more efficient a fetch, but I'm getting tired of typing, and you get the idea. ;-)
**In practice I'd probably limit the fetch request to something reasonable - some bills (especially for businesses and teenagers) can be quite large.