I have a huge word list of more than 280,000 words that is loaded from an SQLite database into an NSArray. I then use fast enumeration to check whether a string entered by the user matches one of the words in the array. Because the array is so large, it takes about 1-2 seconds on the iPhone 4 to go through it.
How can I improve the performance? Maybe I should make several smaller arrays, one for each letter of the alphabet, so that there is less data to go through?
This is how my database class looks:
static WordDatabase *_database;
+(WordDatabase *) database
{
if (_database == nil) {
_database = [[WordDatabase alloc] init];
}
return _database;
}
- (id) init
{
if ((self = [super init])) {
NSString *sqLiteDb = [[NSBundle mainBundle] pathForResource:@"dictionary" ofType:@"sqlite"];
if (sqlite3_open([sqLiteDb UTF8String], &_database) != SQLITE_OK) {
NSLog(@"Failed to open database!");
}
}
return self;
}
- (NSArray *)dictionaryWords {
NSMutableArray *retval = [[[NSMutableArray alloc] init] autorelease];
NSString *query = @"SELECT word FROM words";
sqlite3_stmt *statement;
if (sqlite3_prepare_v2(_database, [query UTF8String], -1, &statement, nil) == SQLITE_OK) {
while (sqlite3_step(statement) == SQLITE_ROW) {
char *wordChars = (char *) sqlite3_column_text(statement, 0);
// Use an autoreleased string so the intermediate object is not leaked
NSString *name = [[NSString stringWithUTF8String:wordChars] uppercaseString];
[retval addObject:name];
}
sqlite3_finalize(statement);
}
return retval;
}
Then in my main view I initialise it like this:
dictionary = [[NSArray alloc] initWithArray:[WordDatabase database].dictionaryWords];
and finally I go through the array using this method:
- (void) checkWord
{
NSString *userWord = formedWord.wordLabel.string;
NSLog(@"checking dictionary for %@", userWord);
for (NSString *word in dictionary) {
if ([userWord isEqualToString: word]) {
NSLog(#"match found");
}
}
}
There are lots of different ways:
Stick all the words in a dictionary or set; testing for presence is fast.
Break it up as you suggest, or create a tree-type structure of some kind.
Use the database to do the search. Databases are generally pretty good at exactly that, if constructed correctly (for example, with an index on the searched column).
If space isn't an issue, store a hash value of each word and use that for your base lookup. Once the candidates are filtered by hash, compare the actual strings. This reduces the number of costly string comparisons, and hash values are easier to index and sort and make for quick lookups.
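As a minimal sketch of the first suggestion, the word array can be turned into an NSSet once, after which each check is a hash lookup rather than a scan over every element. The helper names WordSetFromArray and WordSetContains are made up for illustration, and the sketch assumes the stored words are already uppercased, as in the question's dictionaryWords method.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: build the set once, e.g. right after loading the words.
static NSSet *WordSetFromArray(NSArray *words) {
    return [NSSet setWithArray:words];
}

// Hypothetical helper: membership test via hashing instead of enumeration.
// The user's input is uppercased to match the stored words.
static BOOL WordSetContains(NSSet *wordSet, NSString *userWord) {
    return [wordSet containsObject:[userWord uppercaseString]];
}
```

Building the set costs one pass over the array, so it pays off when the same list is checked repeatedly.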
I second a dictionary: NSDictionary in Objective-C.
For instance:
// To print out all key-value pairs in the NSDictionary myDict
for(id key in myDict)
NSLog(@"key=%@ value=%@", key, [myDict objectForKey:key]);
I have a search string, where people can use quotes to group phrases together, and mix this with individual keywords. For example, a string like this:
"Something amazing" rooster
I'd like to separate that into an NSArray, so that it would have Something amazing (without quotes) as one element, and rooster as the other.
Neither componentsSeparatedByString nor componentsSeparatedByCharactersInSet seem to fit the bill. Is there an easy way to do this, or should I just code it up myself?
You will probably have to code some of this up yourself, but NSScanner should be a good basis on which to build. If you use the scanUpToCharactersFromSet:intoString: method to look for everything up to the next whitespace or quote character, you can pick off words. Once you encounter a quote character, you can continue to scan using just the quote in the character set to end at, so that spaces within the quotes don't end the token.
I made a simple way to do this using NSScanner:
+ (NSArray *)arrayFromTagString:(NSString *)string {
NSScanner *scanner = [NSScanner scannerWithString:string];
NSString *substring;
NSMutableArray *array = [[NSMutableArray alloc] init];
while (scanner.scanLocation < string.length) {
// test if the first character is a quote
unichar character = [string characterAtIndex:scanner.scanLocation];
if (character == '"') {
// skip the first quote and scan everything up to the next quote into a substring
[scanner setScanLocation:(scanner.scanLocation + 1)];
[scanner scanUpToString:@"\"" intoString:&substring];
[scanner setScanLocation:(scanner.scanLocation + 1)]; // skip the second quote too
}
else {
// scan everything up to the next space into the substring
[scanner scanUpToString:@" " intoString:&substring];
}
// add the substring to the array
[array addObject:substring];
//if not at the end, skip the space character before continuing the loop
if (scanner.scanLocation < string.length) [scanner setScanLocation:(scanner.scanLocation + 1)];
}
return array.copy;
}
This method will convert the array back to a tag string, re-quoting the multi-word tags:
+ (NSString *)tagStringFromArray:(NSArray *)array {
NSMutableString *string = [[NSMutableString alloc] init];
NSRange range;
for (NSString *substring in array) {
if (string.length > 0) {
[string appendString:@" "];
}
range = [substring rangeOfString:@" "];
if (range.location != NSNotFound) {
[string appendFormat:@"\"%@\"", substring];
}
else [string appendString:substring];
}
return string.description;
}
I ended up going with a regular expression as I was already using RegexKitLite, and creating this NSString+SearchExtensions category.
.h:
// NSString+SearchExtensions.h
#import <Foundation/Foundation.h>
@interface NSString (SearchExtensions)
-(NSArray *)searchParts;
@end
.m:
// NSString+SearchExtensions.m
#import "NSString+SearchExtensions.h"
#import "RegexKitLite.h"
@implementation NSString (SearchExtensions)
-(NSArray *)searchParts {
__block NSMutableArray *items = [[NSMutableArray alloc] initWithCapacity:5];
[self enumerateStringsMatchedByRegex:@"\\w+|\"[\\w\\s]*\"" usingBlock: ^(NSInteger captureCount,
NSString * const capturedStrings[captureCount],
const NSRange capturedRanges[captureCount],
volatile BOOL * const stop) {
NSString *result = [capturedStrings[0] stringByReplacingOccurrencesOfRegex:@"\"" withString:@""];
NSLog(@"Match: '%@'", result);
[items addObject:result];
}];
return [items autorelease];
}
@end
This returns an NSArray of strings with the search strings, removing the double quotes that surround the phrases.
If you'll allow a slightly different approach, you could try Dave DeLong's CHCSVParser. It is intended to parse CSV strings, but if you set the space character as the delimiter, I am pretty sure you will get the intended behavior.
Alternatively, you can peek into the code and see how it handles quoted fields - it is published under the MIT license.
I would run -componentsSeparatedByString:@"\"" first, then create a BOOL isPartOfQuote, initialized to NO: the first component is always the text before the first quote (an empty string if the string starts with a ").
Then create a mutable array to return:
NSMutableArray* masterArray = [[NSMutableArray alloc] init];
Then, create a loop over the array returned from the separation:
for(NSString* substring in firstSplitArray) {
NSArray* secondSplit;
if (isPartOfQuote == NO) {
secondSplit = [substring componentsSeparatedByString:@" "];
}
else {
secondSplit = [NSArray arrayWithObject: substring];
}
[masterArray addObjectsFromArray: secondSplit];
isPartOfQuote = !isPartOfQuote;
}
Then return masterArray from the function.
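Put together, the steps above might look like the following sketch. The function name SearchPartsBySplitting is made up for illustration. It assumes the quote flag starts at NO, because componentsSeparatedByString: returns the text before the first quote as the first component, and it skips the empty strings that splitting on spaces can produce.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical function: splits on quotes first, then on spaces outside quotes.
static NSArray *SearchPartsBySplitting(NSString *input) {
    NSMutableArray *parts = [NSMutableArray array];
    BOOL insideQuotes = NO; // components alternate: outside, inside, outside, ...
    for (NSString *chunk in [input componentsSeparatedByString:@"\""]) {
        if (insideQuotes) {
            // Quoted text is kept as a single element
            if ([chunk length] > 0) [parts addObject:chunk];
        } else {
            // Unquoted text is split into individual keywords
            for (NSString *word in [chunk componentsSeparatedByString:@" "]) {
                if ([word length] > 0) [parts addObject:word];
            }
        }
        insideQuotes = !insideQuotes;
    }
    return parts;
}
```

An unterminated quote is treated as if it were closed at the end of the string, which may or may not be the behaviour you want.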
I have an NSMutableArray where each item is an NSMutableDictionary.
NSMutableArray *services = [NSMutableArray new];
NSMutableDictionary *dict = [NSMutableDictionary dictionary];
[dict setObject: aNetService forKey: @"net_service"];
[dict setObject: [aNetService name] forKey: @"net_service_name"];
[self.services addObject:dict];
Then I want to retrieve an item according to the "net_service_name" key. So, I tried the following:
-(void)netServiceBrowser:(NSNetServiceBrowser *)aBrowser didRemoveService:(NSNetService *)aNetService moreComing:(BOOL)more {
NSLog(@"netservname%@",[aNetService name]);
for (int i = 0; i < [services count]; i++)
{
NSDictionary *dict = [services objectAtIndex:i];
NSLog(@"netservname%@",[dict objectForKey:@"net_service_name"]);
if ([NSString stringWithFormat:@"%@",[dict objectForKey:@"net_service_name"]] == [NSString stringWithFormat:@"%@",[aNetService name]]){
NSLog(@"Match");
}
}
}
In the console both NSLog(@"netservname") outputs are the same, but I'm not getting the "Match" message. Can anyone see why? Thanks very much!
Try using
if ([[dict objectForKey:@"net_service_name"] isEqualToString:[aNetService name]])
== checks for identity, that is whether the two objects point to the same memory address.
isEqualToString checks for equality, in this case, that the two strings are the same characters in the same order.
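To illustrate the difference, here is a small sketch with two separately created strings that hold the same characters. The function name is made up, and NSMutableString is used only because it guarantees two distinct objects.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical demo: returns YES when == reports "different" while
// isEqualToString: reports "equal" for the same character sequence.
static BOOL SameCharactersDifferentObjects(void) {
    NSString *a = [NSMutableString stringWithString:@"printer"];
    NSString *b = [NSMutableString stringWithString:@"printer"];
    BOOL samePointer = (a == b);               // compares memory addresses
    BOOL sameContents = [a isEqualToString:b]; // compares characters
    return !samePointer && sameContents;
}
```

With string literals or very short strings, == can happen to be true because the runtime may share or specially encode such objects, which is one more reason not to rely on it.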
[[dict objectForKey:@"net_service_name"] isEqualToString:[aNetService name]]
Try that.
Try this:
[[NSString stringWithFormat:@"%@",[dict objectForKey:@"net_service_name"]] isEqualToString:[NSString stringWithFormat:@"%@",[aNetService name]]]
or
[[dict objectForKey:@"net_service_name"] isEqualToString:[aNetService name]]
NSString isEqualToString:(NSString*) http://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html
See: Case-insensitive NSString comparison
Because Objective-C uses pointers for everything you can't directly compare strings with the == operator. What you're looking for is:
[string isEqualToString: otherString]
Always use isEqualToString: for NSString comparison.
(And in general, always use isEqual: and its variants for NSObject comparisons)
NSObject* and NSString* are pointers, so == does a pointer comparison, which is only true if both pointers point to the exact same address in memory. That is almost never the case for two separately created strings, whereas isEqualToString: checks whether the contents of the strings are identical.
Besides, you should prefer the fast enumeration form for your for loop, and avoid things like stringWithFormat:@"%@", which is totally useless here (you are creating a string using a format that will only contain... another string. Why don't you use the string itself directly?)
// Fast enumeration (NSFastEnumeration) to loop through an NSArray
for (NSDictionary *dict in services)
{
NSLog(@"netservname%@",[dict objectForKey:@"net_service_name"]);
// Lose the [NSString stringWithFormat:@"%@",...] stuff!!
if ([[dict objectForKey:@"net_service_name"] isEqualToString:[aNetService name]]) {
NSLog(@"Match");
}
}
Try replacing the if statement with this one:
if ([[dict objectForKey:@"net_service_name"] compare:[aNetService name]] == NSOrderedSame) {
NSLog(@"Match");
}
I've created a custom sorting by creating a new category for the NSString class. Below is my code.
@implementation NSString (Support)
- (NSComparisonResult)sortByPoint:(NSString *)otherString {
int first = [self calculateWordValue:self];
int second = [self calculateWordValue:otherString];
if (first > second) {
return NSOrderedAscending;
}
else if (first < second) {
return NSOrderedDescending;
}
return NSOrderedSame;
}
- (int)calculateWordValue:(NSString *)word {
int totalValue = 0;
NSString *pointPath = [[NSBundle mainBundle] pathForResource:@"pointvalues" ofType:@"plist"];
NSDictionary *pointDictionary = [[NSDictionary alloc] initWithContentsOfFile:pointPath];
for (int index = 0; index < [word length]; index++) {
unichar currentChar = [word characterAtIndex:index]; // characterAtIndex: returns a unichar
NSString *individual = [[NSString alloc] initWithFormat:@"%C",currentChar];
individual = [individual uppercaseString];
NSArray *numbersForKey = [pointDictionary objectForKey:individual];
NSNumber *num = [numbersForKey objectAtIndex:0];
totalValue += [num intValue];
// cleanup
individual = nil;
numbersForKey = nil;
num = nil;
}
return totalValue;
}
@end
As you can see, I create a point dictionary to determine the point value associated with each character in the alphabet, based on a plist. Then in my view controller, I call
NSArray *sorted = [words sortedArrayUsingSelector:@selector(sortByPoint:)];
to sort my table of words by their point values. However, creating a new dictionary each time the -sortByPoint: method is called is extremely inefficient. Is there a way to create the pointDictionary beforehand and use it for each subsequent call to -calculateWordValue:?
This is a job for the static keyword. If you do this:
static NSDictionary *pointDictionary = nil;
if (pointDictionary==nil) {
NSString *pointPath = [[NSBundle mainBundle] pathForResource:@"pointvalues" ofType:@"plist"];
pointDictionary = [[NSDictionary alloc] initWithContentsOfFile:pointPath];
}
pointDictionary will be persistent for the lifetime of your app.
One other optimization is to build a cache of scores by using this against each of your words:
[dict setObject:[NSNumber numberWithInt:[word calculateWordValue:word]] forKey:word];
Then use the keysSortedByValueUsingSelector: method to extract your list of words (note the selector should be compare:, since the objects being compared are the NSNumbers).
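A minimal sketch of that score cache follows. PlaceholderScore is a made-up stand-in for the plist-based calculateWordValue (it simply counts characters), and WordsSortedByCachedScore is likewise a hypothetical name. Each word is scored exactly once instead of once per comparison.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the real scoring: one point per character.
static int PlaceholderScore(NSString *word) {
    return (int)[word length];
}

// Hypothetical function: scores each word once, then sorts by the cached score.
static NSArray *WordsSortedByCachedScore(NSArray *words) {
    NSMutableDictionary *scores = [NSMutableDictionary dictionaryWithCapacity:[words count]];
    for (NSString *word in words) {
        [scores setObject:[NSNumber numberWithInt:PlaceholderScore(word)] forKey:word];
    }
    // compare: is sent to the NSNumber values, giving ascending score order
    return [scores keysSortedByValueUsingSelector:@selector(compare:)];
}
```

Because the words are dictionary keys, duplicates collapse into one entry. For highest score first, the result can be walked with reverseObjectEnumerator.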
Finally, the word argument on your method is redundant. Use self instead:
-(int)calculateWordValue {
...
for (int index = 0; index < [self length]; index++)
{
char currentChar = [self characterAtIndex:index];
...
}
...
}
Change your sortByPoint:(NSString *) otherString method to take the dictionary as a parameter, and pass it your pre-created dictionary.
sortByPoint:(NSString *)otherString withDictionary:(NSDictionary *)pointDictionary
EDIT: This won't work because of the usage in sortedArrayUsingSelector:. Apologies. Instead, you may be better off creating a wrapper class for your point dictionary as a singleton, which you then obtain a reference to each time your sort function runs.
In calculateWordValue:
NSDictionary *pointDictionary = [[DictWrapper sharedInstance] dictionary];
DictWrapper has an NSDictionary as a property and a class method sharedInstance (to return the singleton). You have to set that dictionary and pre-initialize it before you do your first sort.
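One possible shape for such a wrapper is sketched below. Only the names DictWrapper, sharedInstance and dictionary come from the description above. The use of dispatch_once and a copy property are assumptions about how one might implement it, not the answer's own code.

```objective-c
#import <Foundation/Foundation.h>

@interface DictWrapper : NSObject
// Set this once (e.g. from the plist) before the first sort.
@property (nonatomic, copy) NSDictionary *dictionary;
+ (DictWrapper *)sharedInstance;
@end

@implementation DictWrapper
+ (DictWrapper *)sharedInstance {
    static DictWrapper *shared = nil;
    static dispatch_once_t onceToken;
    // dispatch_once runs the block exactly once, even with concurrent callers
    dispatch_once(&onceToken, ^{
        shared = [[DictWrapper alloc] init];
    });
    return shared;
}
@end
```

At startup you would assign the loaded plist contents to [DictWrapper sharedInstance].dictionary, and calculateWordValue: then only reads it.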
Is there any way to create a new NSString from a format string like @"xxx=%@, yyy=%@" and an NSArray of objects?
In the NSString class there are many methods like:
- (id)initWithFormat:(NSString *)format arguments:(va_list)argList
- (id)initWithFormat:(NSString *)format locale:(id)locale arguments:(va_list)argList
+ (id)stringWithFormat:(NSString *)format, ...
but none of them takes an NSArray as an argument, and I cannot find a way to create a va_list from an NSArray...
It is actually not hard to create a va_list from an NSArray. See Matt Gallagher's excellent article on the subject.
Here is an NSString category to do what you want:
@interface NSString (NSArrayFormatExtension)
+ (id)stringWithFormat:(NSString *)format array:(NSArray*) arguments;
@end
@implementation NSString (NSArrayFormatExtension)
+ (id)stringWithFormat:(NSString *)format array:(NSArray*) arguments
{
char *argList = (char *)malloc(sizeof(NSString *) * arguments.count);
[arguments getObjects:(id *)argList];
NSString* result = [[[NSString alloc] initWithFormat:format arguments:argList] autorelease];
free(argList);
return result;
}
@end
Then:
NSString* s = [NSString stringWithFormat:@"xxx=%@, yyy=%@" array:@[@"XXX", @"YYY"]];
NSLog( @"%@", s );
Unfortunately, for 64-bit, the va_list format has changed, so the above code no longer works. And probably should not be used anyway given it depends on the format that is clearly subject to change. Given there is no really robust way to create a va_list, a better solution is to simply limit the number of arguments to a reasonable maximum (say 10) and then call stringWithFormat with the first 10 arguments, something like this:
+ (id)stringWithFormat:(NSString *)format array:(NSArray*) arguments
{
if ( arguments.count > 10 ) {
@throw [NSException exceptionWithName:NSRangeException reason:@"Maximum of 10 arguments allowed" userInfo:@{@"collection": arguments}];
}
NSArray* a = [arguments arrayByAddingObjectsFromArray:@[@"X",@"X",@"X",@"X",@"X",@"X",@"X",@"X",@"X",@"X"]];
return [NSString stringWithFormat:format, a[0], a[1], a[2], a[3], a[4], a[5], a[6], a[7], a[8], a[9] ];
}
Based on this answer using Automatic Reference Counting (ARC): https://stackoverflow.com/a/8217755/881197
Add a category to NSString with the following method:
+ (id)stringWithFormat:(NSString *)format array:(NSArray *)arguments
{
NSRange range = NSMakeRange(0, [arguments count]);
NSMutableData *data = [NSMutableData dataWithLength:sizeof(id) * [arguments count]];
[arguments getObjects:(__unsafe_unretained id *)data.mutableBytes range:range];
NSString *result = [[NSString alloc] initWithFormat:format arguments:data.mutableBytes];
return result;
}
One solution that came to my mind is that I could create a method that works with a fixed large number of arguments like:
+ (NSString *) stringWithFormat: (NSString *) format arguments: (NSArray *) arguments {
return [NSString stringWithFormat: format ,
(arguments.count>0) ? [arguments objectAtIndex: 0]: nil,
(arguments.count>1) ? [arguments objectAtIndex: 1]: nil,
(arguments.count>2) ? [arguments objectAtIndex: 2]: nil,
...
(arguments.count>20) ? [arguments objectAtIndex: 20]: nil];
}
I could also add a check to see if the format string has more than 21 '%' characters and throw an exception in that case.
@Chuck is correct about the fact that you can't convert an NSArray into varargs. However, I don't recommend searching for the pattern %@ in the string and replacing it each time. (Replacing characters in the middle of a string is generally quite inefficient, and not a good idea if you can accomplish the same thing in a different way.) Here is a more efficient way to create a string with the format you're describing:
NSArray *array = ...
NSAutoreleasePool *pool = [NSAutoreleasePool new];
NSMutableArray *newArray = [NSMutableArray arrayWithCapacity:[array count]];
for (id object in array) {
[newArray addObject:[NSString stringWithFormat:@"x=%@", [object description]]];
}
NSString *composedString = [[newArray componentsJoinedByString:@", "] retain];
[pool drain];
I included the autorelease pool for good housekeeping, since an autoreleased string will be created for each array entry, and the mutable array is autoreleased as well. You could easily make this into a method/function and return composedString without retaining it, and handle the autorelease elsewhere in the code if desired.
This answer is buggy. As noted, there is no solution to this problem that is guaranteed to work when new platforms are introduced other than using the "10 element array" method.
The answer by solidsun was working well, until I went to compile with 64-bit architecture. This caused an error:
EXC_BAD_ADDRESS type EXC_I386_GPFLT
The solution was to use a slightly different approach for passing the argument list to the method:
+ (id)stringWithFormat:(NSString *)format array:(NSArray*) arguments;
{
__unsafe_unretained id * argList = (__unsafe_unretained id *) calloc(1UL, sizeof(id) * arguments.count);
for (NSInteger i = 0; i < arguments.count; i++) {
argList[i] = arguments[i];
}
NSString* result = [[NSString alloc] initWithFormat:format, *argList] ;// arguments:(void *) argList];
free (argList);
return result;
}
Note that this only works for arrays with a single element.
There is no general way to pass an array to a function or method that uses varargs. In this particular case, however, you could fake it by using something like:
for (NSString *currentReplacement in array)
// NSString is immutable, so keep the returned string
string = [string stringByReplacingCharactersInRange:[string rangeOfString:@"%@"]
withString:currentReplacement];
EDIT: The accepted answer claims there is a way to do this, but regardless of how fragile this answer might seem, that approach is far more fragile. It relies on implementation-defined behavior (specifically, the structure of a va_list) that is not guaranteed to remain the same. I maintain that my answer is correct and my proposed solution is less fragile since it only relies on defined features of the language and frameworks.
For those who need a Swift solution, here is an extension to do this in Swift
extension String {
static func stringWithFormat(format: String, argumentsArray: Array<AnyObject>) -> String {
let arguments = argumentsArray.map { $0 as! CVarArgType }
let result = String(format:format, arguments:arguments)
return result
}
}
Yes, it is possible. In GCC targeting Mac OS X, at least, va_list is simply a C array, so you'll make one of ids, then tell the NSArray to fill it:
NSArray *argsArray = [[NSProcessInfo processInfo] arguments];
va_list args = malloc(sizeof(id) * [argsArray count]);
NSAssert1(args != nil, @"Couldn't allocate array for %u arguments", [argsArray count]);
[argsArray getObjects:(id *)args];
//Example: NSLogv is the version of NSLog that takes a va_list instead of separate arguments.
NSString *formatSpecifier = @"\n%@";
NSString *format = [@"Arguments:" stringByAppendingString:[formatSpecifier stringByPaddingToLength:[argsArray count] * 3U withString:formatSpecifier startingAtIndex:0U]];
NSLogv(format, args);
free(args);
You shouldn't rely on this nature in code that should be portable. iPhone developers, this is one thing you should definitely test on the device.
- (NSString *)stringWithFormat:(NSString *)format andArguments:(NSArray *)arguments {
NSMutableString *result = [NSMutableString new];
NSArray *components = format ? [format componentsSeparatedByString:@"%@"] : @[@""];
NSUInteger argumentsCount = [arguments count];
NSUInteger componentsCount = [components count] - 1;
NSUInteger iterationCount = argumentsCount < componentsCount ? argumentsCount : componentsCount;
for (NSUInteger i = 0; i < iterationCount; i++) {
[result appendFormat:@"%@%@", components[i], arguments[i]];
}
[result appendString:[components lastObject]];
return iterationCount == 0 ? [result stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]] : result;
}
Tested with format and arguments:
NSString *format = @"xxx=%@, yyy=%@ last component";
NSArray *arguments = @[@"XXX", @"YYY", @"ZZZ"];
Result: xxx=XXX, yyy=YYY last component
NSString *format = @"xxx=%@, yyy=%@ last component";
NSArray *arguments = @[@"XXX", @"YYY"];
Result: xxx=XXX, yyy=YYY last component
NSString *format = @"xxx=%@, yyy=%@ last component";
NSArray *arguments = @[@"XXX"];
Result: xxx=XXX last component
NSString *format = @"xxx=%@, yyy=%@ last component";
NSArray *arguments = @[];
Result: last component
NSString *format = @"some text";
NSArray *arguments = @[@"XXX", @"YYY", @"ZZZ"];
Result: some text
I found some code on the web that claims this is possible, but I haven't managed to do it myself. In any case, if you don't know the number of arguments in advance, you also need to build the format string dynamically, so I just don't see the point.
You are better off just building the string by iterating over the array.
You might find the stringByAppendingString: or stringByAppendingFormat: instance methods handy.
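As a rough sketch of building the string by iteration, the following uses an NSMutableString with appendFormat: rather than repeated stringByAppendingFormat: calls, which avoids creating a new string per element. The function name JoinedDescription and the "value=" label are made up for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical function: formats every element and joins them with ", ".
static NSString *JoinedDescription(NSArray *values) {
    NSMutableString *result = [NSMutableString string];
    for (id value in values) {
        if ([result length] > 0) {
            [result appendString:@", "]; // separator between entries
        }
        [result appendFormat:@"value=%@", value];
    }
    return result;
}
```

This works for any number of elements because the format is applied per element rather than once for the whole array.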
One can create a category for NSString and make a function which receives a format, an array and returns the string with replaced objects.
@interface NSString (NSArrayFormat)
+ (NSString *)stringWithFormat:(NSString *)format arrayArguments:(NSArray *)arrayArguments;
@end
@implementation NSString (NSArrayFormat)
+ (NSString *)stringWithFormat:(NSString *)format arrayArguments:(NSArray *)arrayArguments {
static NSString *objectSpecifier = @"%@"; // static is redundant because compiler will optimize this string to have same address
NSMutableString *string = [[NSMutableString alloc] init]; // here we'll create the string
NSRange searchRange = NSMakeRange(0, [format length]);
NSRange rangeOfPlaceholder = NSMakeRange(NSNotFound, 0); // variables are declared here because they're needed for NSAsserts
NSUInteger index;
for (index = 0; index < [arrayArguments count]; ++index) {
rangeOfPlaceholder = [format rangeOfString:objectSpecifier options:0 range:searchRange]; // find next object specifier
if (rangeOfPlaceholder.location != NSNotFound) { // if we found one
NSRange substringRange = NSMakeRange(searchRange.location, rangeOfPlaceholder.location - searchRange.location);
NSString *formatSubstring = [format substringWithRange:substringRange];
[string appendString:formatSubstring]; // copy the format from previous specifier up to this one
NSObject *object = [arrayArguments objectAtIndex:index];
NSString *objectDescription = [object description]; // convert object into string
[string appendString:objectDescription];
searchRange.location = rangeOfPlaceholder.location + [objectSpecifier length]; // update the search range in order to minimize search
searchRange.length = [format length] - searchRange.location;
} else {
break;
}
}
if (rangeOfPlaceholder.location != NSNotFound) { // we need to check whether the format still has specifiers
rangeOfPlaceholder = [format rangeOfString:@"%@" options:0 range:searchRange];
}
NSAssert(rangeOfPlaceholder.location == NSNotFound, @"arrayArguments doesn't have enough objects to fill specified format");
NSAssert(index == [arrayArguments count], @"Objects starting with index %lu from arrayArguments have been ignored because there aren't enough object specifiers!", (unsigned long)index);
return string;
}
@end
Because the NSArray is created at runtime, we cannot provide compile-time warnings, but we can use NSAssert to tell us whether the number of specifiers equals the number of objects in the array.
I created a project on GitHub where this category can be found. I also added Chuck's version using 'stringByReplacingCharactersInRange:', plus some tests.
With one million objects in the array, the version with 'stringByReplacingCharactersInRange:' doesn't scale very well (I waited about 2 minutes, then closed the app). The version with NSMutableString built the string in about 4 seconds. The tests were run in the simulator; before use, tests should be done on a real device (use the device with the lowest specs).
Edit: On an iPhone 5s the version with NSMutableString takes 10.471655 s (one million objects); on an iPhone 5 it takes 21.304876 s.
Here's the answer without explicitly creating an array:
NSString *formattedString = [NSString stringWithFormat:@"%@ World, Nice %@", @"Hello", @"Day"];
The first string is the target string to be formatted; the following strings are inserted into it.
No, you won't be able to. Variable argument calls are solved at compile time, and your NSArray has contents only at runtime.