Find a sequence of matching words in two strings

I have a problem when comparing two arrays of NSString.
One of the arrays keeps changing because it is linked to a UITextField: whatever you type is stored in the array.
-(void) tick: (ccTime) dt{
    NSString *level = (NSString *)gameData[2]; //this doesn't change. example: "one two three .."
    NSString *text = (NSString *)tf.text; //text field keeps changing as I write
    NSArray *separatedText = [text componentsSeparatedByString:@" "];
    NSArray *separatedLevel = [level componentsSeparatedByString:@" "];
I want to check if any of the words that you are writing match with any of the words that are stored in the level array.
For example,
Level is: "Comparing two strings"
And I write "comparing"
So this method would report that 1 word matches. If I write "comparing two", it should report that 2 words match.
I tried with this code:
    for (int i=0;i<separatedText.count;i++){
        for (int j=0;j<separatedLevel.count;j++){
            if (![separatedText[i] caseInsensitiveCompare:separatedLevel[j]]){
                NSLog(@"OK");
            }else{
                NSLog(@"NO");
            }
        }
    }
}
but it is not working properly.
Any ideas?
Thank you very much.
EDIT
caseInsensitiveCompare does not return a BOOL, but it works for me that way.
If I run this code with level "one two three" and text "one" the result is:
OK
NO
NO
And with "one two" the result is:
NO
OK
NO
When it should be OK OK NO
EDIT 2
Sorry if I expressed myself wrong.
The result that I want is "OK OK NO" when I write 2 words that match
Or maybe, a result that returns the number of matches.
So with the previous example:
Level: "one two three"
Text: "one two"
Result: "2 matching words"

The issue is that you need to increment i if you find a match, and keep a count of the sequential matches you have made.
I have implemented it as a C function here, but you should have no trouble converting it to an Objective-C class method:
#import <Foundation/Foundation.h>

NSUInteger numSequentialMatches(NSString *s1, NSString *s2) {
    NSUInteger sequence = 0, highestSequence = 0;
    NSArray *a1 = [s1 componentsSeparatedByString:@" "];
    NSArray *a2 = [s2 componentsSeparatedByString:@" "];
    for (NSUInteger i = 0; i < a1.count; i++) {
        for (NSUInteger j = 0; j < a2.count; j++) {
            if ([a1[i] caseInsensitiveCompare:a2[j]] == NSOrderedSame) {
                if (++sequence > highestSequence)
                    highestSequence = sequence;
                // Move on to the next word of s1; stop once s1 is exhausted
                // so that a1[i] is never read out of bounds
                if (++i >= a1.count)
                    break;
            } else {
                sequence = 0;
            }
        }
    }
    return highestSequence;
}

int main(int argc, const char **argv) {
    @autoreleasepool {
        NSLog(@"%lu", (unsigned long)numSequentialMatches(@"one two three", @"one"));
        NSLog(@"%lu", (unsigned long)numSequentialMatches(@"one two three", @"one two"));
    }
    return 0;
}
$ clang -o array array.m -framework Foundation
$ ./array
2013-10-03 15:21:08.166 array[17194:707] 1
2013-10-03 15:21:08.167 array[17194:707] 2

Try it like this, using fast enumeration:
NSArray *separatedText=[NSArray arrayWithObjects:@"one",@"two",@"three",nil];
NSArray *separatedLevel=[NSArray arrayWithObjects:@"one",@"two",nil];
NSString *str1;
NSString *str2;
BOOL isMatch;
for (str1 in separatedText){
    isMatch=NO;
    for (str2 in separatedLevel){
        if ([str1 caseInsensitiveCompare:str2] == NSOrderedSame)
        {
            NSLog(@"OK");
            isMatch=YES;
        }
    }
    if (isMatch==NO)
    {
        NSLog(@"NO");
    }
}

Use NSPredicate instead.
// SELF IN %@ keeps each level word that also appears in separatedText.
// The %@ must not be quoted, and this comparison is case-sensitive.
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"SELF IN %@", separatedText];
NSArray *filteredArray = [separatedLevel filteredArrayUsingPredicate:predicate];
if ([filteredArray count] == 0){
    NSLog(@"No elements");
}
else{
    NSLog(@"Have elements");
}

Related

Replacing several different characters in NSString

I am making an iPad app for personal use and I'm struggling with some character replacement in some strings. For example, I have an NSString which contains "\t\t\t C D". What I want to do is replace every C and every D in it with C# and D#. I have managed to do that, but unfortunately it doesn't look efficient at all to me.
Here is my code so far:
- (IBAction)buttonPressed:(id)sender
{
    if(sender)
    {
        NSError *error;
        NSString *newTab = [[NSString alloc] init];
        NSRegularExpression *regexC = [NSRegularExpression regularExpressionWithPattern:@"C" options:0 error:&error];
        NSRegularExpression *regexD = [NSRegularExpression regularExpressionWithPattern:@"D" options:0 error:&error];
        newTab = [regexC stringByReplacingMatchesInString:self.tab options:0 range:NSMakeRange(0, self.tab.length) withTemplate:@"C#"];
        NSString *newTabAfterFirstRegex = [[NSString alloc] initWithString:newTab];
        newTabAfterFirstRegex = [regexD stringByReplacingMatchesInString:newTab options:0 range:NSMakeRange(0, newTab.length) withTemplate:@"D#"];
        NSLog(@"%@",newTabAfterFirstRegex);
    }
}
Plus, this is just a small piece of test code. What I would really like is an algorithm that checks for instances of all the note names (C C# D D# E F F# G G# A A# B) in a given string, and when the IBAction is triggered replaces each one of them with the next one (so B becomes C).
Any ideas would be very much appreciated!
Thank you very much!

You can set a regular expression (e.g. '[A-G]#?') to match certain strings. With method -matchesInString:options:range: you can loop through all the matches (it will give back a range for each match) and use that range to do the replacements.
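A minimal sketch of that idea, assuming the twelve sharp-only note names from the question and that every capital A-G in the string is a note. TransposeUp and the notes table are hypothetical names for illustration, not part of any API:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: shifts every note name in `tab` up by one step.
NSString *TransposeUp(NSString *tab) {
    NSArray<NSString *> *notes = @[@"C", @"C#", @"D", @"D#", @"E", @"F",
                                   @"F#", @"G", @"G#", @"A", @"A#", @"B"];
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"[A-G]#?"
                                                  options:0
                                                    error:NULL];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:tab options:0 range:NSMakeRange(0, tab.length)];
    NSMutableString *result = [tab mutableCopy];
    // Replace from the last match to the first, so the ranges of the
    // earlier matches stay valid while the string changes length.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *note = [tab substringWithRange:match.range];
        NSUInteger index = [notes indexOfObject:note];
        if (index == NSNotFound) continue; // e.g. "E#", which is not in the table
        [result replaceCharactersInRange:match.range
                              withString:notes[(index + 1) % notes.count]];
    }
    return result;
}
```

With this, TransposeUp(@"\t\t\t C D") gives "\t\t\t C# D#", and B wraps around to C. Doing everything in one pass also avoids the problem of a second replacement touching the output of the first.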

Regular expressions seem like overkill for this. You could avoid the regex overhead by doing two plain string replacements with
- (NSString *)stringByReplacingOccurrencesOfString:(NSString *)target
                                        withString:(NSString *)replacement
Just call it twice. You also don't need the NSString allocations, because each call returns a new string.
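Roughly like this, as a sketch for the C and D case only. SharpenCAndD is a made-up name, and it assumes the input has no sharps yet: an existing "C#" would come out as "C##".

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every C with C# and every D with D#.
NSString *SharpenCAndD(NSString *tab) {
    // Each call returns a new autoreleased string, so no alloc/init is needed.
    NSString *afterC = [tab stringByReplacingOccurrencesOfString:@"C" withString:@"C#"];
    return [afterC stringByReplacingOccurrencesOfString:@"D" withString:@"D#"];
}
```

This works here because the replacement for C does not introduce a D. For the full twelve-note rotation, chained replacements would start overwriting each other's output.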

I created the following methods for encryption the other day. I've tested it for your purpose, and it seems to work.
-(NSString *)ReplaceMe:(NSString *)s {
    // Putting the source into an array
    NSMutableArray *myArray = [[NSMutableArray alloc] init];
    int i;
    for (i = 0; i < s.length; i++) {
        [myArray addObject: [self Mid:s :i :1]];
    }
    // Creating a string with the revised array
    NSMutableString *myString = [NSMutableString new];
    for (i = 0; i < s.length; i++) {
        [myString appendString:[self Conversion:[myArray objectAtIndex:i]]];
    }
    // Final
    return myString;
}
The method above requires two additional functions.
-(NSString *)Mid:(NSString *)str:(NSInteger)s:(NSInteger)l {
    if ((s <= str.length-1) && (s + l <= str.length) && (s >= 0) && (l >= 1)) {
        return [str substringWithRange:NSMakeRange(s, l)];
    }
    else {
        return @"";
    }
}
The other is...
-(NSString *)Conversion:(NSString *)s {
    if ([s isEqualToString:@"C"]) {
        return @"C#";
    }
    else if ([s isEqualToString:@"D"]) {
        return @"D#";
    }
    else {
        return s;
    }
}
You can put other conversion pairs in the function above. The following is an example as to how to use ReplaceMe.
- (IBAction)clickAction:(id)sender {
textField2.text = [self ReplaceMe:textField1.text];
}
So ReplaceMe is quite easy to use.

Algorithm to find anagrams Objective-C

I've got an algorithm to find anagrams within a group of eight-letter words. Effectively it's alphabetizing the letters in the longer word, doing the same with the shorter words one by one, and seeing if they exist in the longer word, like so:
tower = eortw
two = otw
rot = ort
The issue here is that if I look for ort in eortw (or rot in tower), it'll find it, no problem. Rot is found inside tower. However, otw is not inside eortw (or two in tower), because of the R in the middle. Ergo, it doesn't think two is found in tower.
Is there a better way I can do this? I'm trying to do it in Objective-C, and both the eight-letter words and regular words are stored in NSDictionaries (with their normal and alphabetized forms).
I've looked at various other posts re. anagrams on StackOverflow, but none seem to address this particular issue.
Here's what I have so far:
- (BOOL) doesEightLetterWord: (NSString* )haystack containWord: (NSString *)needle {
    for (int i = 0; i < [needle length] + 1; i++) {
        if (!needle) {
            NSLog(@"DONE!");
        }
        NSString *currentCharacter = [needle substringWithRange:NSMakeRange(i, 1)];
        NSCharacterSet *set = [NSCharacterSet characterSetWithCharactersInString: currentCharacter];
        NSLog(@"Current character is %@", currentCharacter);
        if ([haystack rangeOfCharacterFromSet:set].location == NSNotFound) {
            NSLog(@"The letter %@ isn't found in the word %@", currentCharacter, haystack);
            return FALSE;
        } else {
            NSLog(@"The letter %@ is found in the word %@", currentCharacter, haystack);
            int currentLocation = [haystack rangeOfCharacterFromSet: set].location;
            currentLocation++;
            NSString *newHaystack = [haystack substringFromIndex: currentLocation];
            NSString *newNeedle = [needle substringFromIndex: i + 1];
            NSLog(@"newHaystack is %@", newHaystack);
            NSLog(@"newNeedle is %@", newNeedle);
        }
    }
}

If you use only part of the letters it isn't a true anagram.
A good algorithm in your case would be to take the sorted strings and compare them letter by letter, skipping mismatches in the longer word. If you reach the end of the shorter word, you have a match:
char *p1 = shorter_word;
char *p2 = longer_word;
int match = TRUE;
for (; *p1; p1++) {
    while (*p2 && (*p2 != *p1)) {
        p2++;
    }
    if (!*p2) {
        /* Letters of shorter word are not contained in longer word */
        match = FALSE;
        break;
    }
    p2++; /* consume the matched letter so it cannot be matched twice */
}

This is one approach I might take for finding out whether one ordered word contains all of the letters of another ordered word. Note that it won't find true anagrams (that simply requires the two ordered strings to be the same), but it does what I think you're asking for:
+(BOOL) does: (NSString* )longWord contain: (NSString *)shortWord {
    NSString *haystack = [longWord copy];
    NSString *needle = [shortWord copy];
    while([haystack length] > 0 && [needle length] > 0) {
        NSCharacterSet *set = [NSCharacterSet characterSetWithCharactersInString: [needle substringToIndex:1]];
        if ([haystack rangeOfCharacterFromSet:set].location == NSNotFound) {
            return NO;
        }
        haystack = [haystack substringFromIndex: [haystack rangeOfCharacterFromSet: set].location+1];
        needle = [needle substringFromIndex: 1];
    }
    return YES;
}

The simplest (but not most efficient) way might be to use NSCountedSet. We can do this because, for counted sets, [a isSubsetOfSet:b] returns YES if and only if [a countForObject:object] <= [b countForObject:object] for every object in a.
Let's add a category to NSString to do it:
@interface NSString (lukech_superset)
- (BOOL)lukech_isSupersetOfString:(NSString *)needle;
@end
@implementation NSString (lukech_superset)
- (NSCountedSet *)lukech_countedSetOfCharacters {
    NSCountedSet *set = [NSCountedSet set];
    [self enumerateSubstringsInRange:NSMakeRange(0, self.length) options:NSStringEnumerationByComposedCharacterSequences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        [set addObject:substring];
    }];
    return set;
}
- (BOOL)lukech_isSupersetOfString:(NSString *)needle {
    return [[needle lukech_countedSetOfCharacters] isSubsetOfSet:[self lukech_countedSetOfCharacters]];
}
@end

Check if NSString contains all or some characters

I have an NSString called query which contains ~10 characters.
I would like to check to see if a second NSString called word contains all of the characters in query, or some characters, but no other characters which aren't specified in query.
Also, if there is only one occurrence of the character in the query, there can only be one occurrence of the character in the word.
Please could you tell me how to do this?
NSString *query = @"ABCDEFJAKSUSHFKLAFIE";
NSString *word = @"fearing"; //would pass as NO as there is no 'n' in the query var.

The following answers the first half:
NSCharacterSet *nonQueryChars = [[NSCharacterSet characterSetWithCharactersInString:[query lowercaseString]] invertedSet];
NSRange badCharRange = [[word lowercaseString] rangeOfCharacterFromSet:nonQueryChars];
if (badCharRange.location == NSNotFound) {
    // word only has characters in query
} else {
    // found unwanted characters in word
}
I need to think about the second half of the requirement.
Ok, the following code should fulfill both requirements:
- (NSCountedSet *)wordLetters:(NSString *)text {
    NSCountedSet *res = [NSCountedSet set];
    for (NSUInteger i = 0; i < text.length; i++) {
        [res addObject:[text substringWithRange:NSMakeRange(i, 1)]];
    }
    return res;
}
- (void)checkWordAgainstQuery {
    NSString *query = @"ABCDEFJAKSUSHFKLAFIE";
    NSString *word = @"fearing";
    NSCountedSet *queryLetters = [self wordLetters:[query lowercaseString]];
    NSCountedSet *wordLetters = [self wordLetters:[word lowercaseString]];
    BOOL ok = YES;
    for (NSString *wordLetter in wordLetters) {
        NSUInteger wordCount = [wordLetters countForObject:wordLetter];
        // queryCount will be 0 if this word letter isn't in query
        NSUInteger queryCount = [queryLetters countForObject:wordLetter];
        if (wordCount > queryCount) {
            ok = NO;
            break;
        }
    }
    if (ok) {
        // word matches against query
    } else {
        // word has extra letter or too many of a matching letter
    }
}

How can I optimise out this nested for loop?

How can I optimise out this nested for loop?
The program should go through each word in the array created from the word text file, and if it's greater than 8 characters, add it to the goodWords array. But the caveat is that I only want the root word to be in the goodWords array, for example:
If greet is added to the array, I don't want greets or greetings or greeters, etc.
NSString *string = [NSString stringWithContentsOfFile:@"/Users/james/dev/WordParser/word.txt" encoding:NSUTF8StringEncoding error:NULL];
NSArray *words = [string componentsSeparatedByString:@"\r\n"];
NSMutableArray *goodWords = [NSMutableArray array];
BOOL shouldAddToGoodWords = YES;
for (NSString *word in words)
{
    NSLog(@"Word: %@", word);
    if ([word length] > 8)
    {
        NSLog(@"Word is greater than 8");
        for (NSString *existingWord in [goodWords reverseObjectEnumerator])
        {
            NSLog(@"Existing Word: %@", existingWord);
            if ([word rangeOfString:existingWord].location != NSNotFound)
            {
                NSLog(@"Not adding...");
                shouldAddToGoodWords = NO;
                break;
            }
        }
        if (shouldAddToGoodWords)
        {
            NSLog(@"Adding word: %@", word);
            [goodWords addObject:word];
        }
    }
    shouldAddToGoodWords = YES;
}

How about something like this?
//load the words from wherever
NSString * allWords = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
//create a mutable array of the words
NSMutableArray * words = [[allWords componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]] mutableCopy];
//remove any words that are shorter than 8 characters
[words filterUsingPredicate:[NSPredicate predicateWithFormat:@"length >= 8"]];
//sort the words in ascending order
[words sortUsingSelector:@selector(caseInsensitiveCompare:)];
//create a set of indexes (these will be the non-root words)
NSMutableIndexSet * badIndexes = [NSMutableIndexSet indexSet];
//remember our current root word
NSString * currentRoot = nil;
NSUInteger count = [words count];
//loop through the words
for (NSUInteger i = 0; i < count; ++i) {
    NSString * word = [words objectAtIndex:i];
    if (currentRoot == nil) {
        //base case
        currentRoot = word;
    } else if ([word hasPrefix:currentRoot]) {
        //word is a non-root word. remember this index to remove it later
        [badIndexes addIndex:i];
    } else {
        //no match. this word is our new root
        currentRoot = word;
    }
}
//remove the non-root words
[words removeObjectsAtIndexes:badIndexes];
NSLog(@"%@", words);
[words release];
This runs very very quickly on my machine (2.8GHz MBP).

A trie seems suitable for your purpose. It is a tree keyed on the successive characters of each word, and it is useful for detecting whether a word you have already seen is a prefix of a given string.
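A rough sketch of how that could look, using nested NSMutableDictionary objects as trie nodes. All the names here (TrieInsert, TrieHasPrefixOf, RootWords, the "$end" marker) are invented for illustration. It also assumes that only prefixes matter, unlike the rangeOfString: check in the question, and that each root word arrives before its longer forms (for example, because the list is sorted):

```objc
#import <Foundation/Foundation.h>

// Hypothetical marker key; it cannot clash with the one-character keys.
static NSString *const kEndOfWord = @"$end";

// Returns YES if a stored word is a prefix of (or equal to) `word`.
BOOL TrieHasPrefixOf(NSMutableDictionary *root, NSString *word) {
    NSMutableDictionary *node = root;
    for (NSUInteger i = 0; i < word.length; i++) {
        if (node[kEndOfWord]) return YES; // a stored word ends here
        node = node[[word substringWithRange:NSMakeRange(i, 1)]];
        if (!node) return NO;             // no stored word goes this way
    }
    return node[kEndOfWord] != nil;
}

// Adds `word` to the trie, one character per level.
void TrieInsert(NSMutableDictionary *root, NSString *word) {
    NSMutableDictionary *node = root;
    for (NSUInteger i = 0; i < word.length; i++) {
        NSString *key = [word substringWithRange:NSMakeRange(i, 1)];
        NSMutableDictionary *child = node[key];
        if (!child) {
            child = [NSMutableDictionary dictionary];
            node[key] = child;
        }
        node = child;
    }
    node[kEndOfWord] = @YES;
}

// Keeps words longer than 8 characters that do not start with a kept word.
NSArray<NSString *> *RootWords(NSArray<NSString *> *words) {
    NSMutableDictionary *root = [NSMutableDictionary dictionary];
    NSMutableArray<NSString *> *good = [NSMutableArray array];
    for (NSString *word in words) {
        if (word.length <= 8) continue;
        NSString *key = word.lowercaseString;
        if (TrieHasPrefixOf(root, key)) continue;
        TrieInsert(root, key);
        [good addObject:word];
    }
    return good;
}
```

Each lookup costs time proportional to the length of the word rather than to the number of words already kept, which is what removes the inner loop.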

I used an NSSet to ensure that only one copy of each word is added. A word is added only if the set does not already contain it. The code then checks whether any word already in the set is a substring of the new word; if so, the new word is not added. The check is case-insensitive.
What I've written is a refactoring of your code. It's probably not much faster; if you want lookups against the words you have already added to be a lot faster, you really want a tree data structure.
Take a look at red-black trees or B-trees.
Words.txt
objective
objectively
cappucin
cappucino
cappucine
programme
programmer
programmatic
programmatically
Source Code
- (void)addRootWords {
    NSString *textFile = [[NSBundle mainBundle] pathForResource:@"words" ofType:@"txt"];
    NSString *string = [NSString stringWithContentsOfFile:textFile encoding:NSUTF8StringEncoding error:NULL];
    NSArray *wordFile = [string componentsSeparatedByString:@"\n"];
    NSMutableSet *goodWords = [[NSMutableSet alloc] init];
    for (NSString *newWord in wordFile)
    {
        NSLog(@"Word: %@", newWord);
        if ([newWord length] > 8)
        {
            NSLog(@"Word '%@' contains more than 8 characters", newWord);
            BOOL shouldAddWord = NO;
            if ( [goodWords containsObject:newWord] == NO) {
                shouldAddWord = YES;
            }
            for (NSString *existingWord in goodWords)
            {
                NSRange textRange = [[newWord lowercaseString] rangeOfString:[existingWord lowercaseString]];
                if( textRange.location != NSNotFound ) {
                    // newWord contains existingWord as a substring
                    shouldAddWord = NO;
                    break;
                }
                NSLog(@"(word:%@) does not contain (substring:%@)", newWord, existingWord);
                shouldAddWord = YES;
            }
            if (shouldAddWord) {
                NSLog(@"Adding word: %@", newWord);
                [goodWords addObject:newWord];
            }
        }
    }
    NSLog(@"***Added words***");
    int count = 1;
    for (NSString *word in goodWords) {
        NSLog(@"%d: %@", count, word);
        count++;
    }
    [goodWords release];
}
Output:
***Added words***
1: cappucino
2: programme
3: objective
4: programmatic
5: cappucine

How do I divide NSString into smaller words?

Greetings,
I am new to objective c, and I have the following issue:
I have a NSString:
"There are seven words in this phrase"
I want to divide this into 3 smaller strings (and each smaller string can be no longer than 12 characters in length) but must contain whole words separated by a space, so that I end up with:
String1 = "There are" //(length is 9 including space)
String2 = "seven words"// (length is 11)
String3 = "in this" //(length is 7), with the word "phrase" ignored as this would exceed the maximum length of 12..
Currently I am splitting my original string into an array with:
NSArray *piecesOfOriginalString = [originalString componentsSeparatedByString:@" "];
Then I have multiple "if" statements to sort out situations where there are 3 words, but I want to make this more extensible for any string up to 39 letters (13 characters * 3 lines), with any characters beyond that being ignored. Is there an easy way to divide a string based on words or "phrases" up to a certain length (in this case, 12)?

Something similar to this? (Dry-code warning)
NSArray *piecesOfOriginalString = [originalString componentsSeparatedByString:@" "];
NSMutableArray *phrases = [NSMutableArray array];
NSString *chunk = nil;
NSString *lastchunk = nil;
int i, count = [piecesOfOriginalString count];
for (i = 0; i < count; i++) {
    NSString *piece = [piecesOfOriginalString objectAtIndex:i];
    lastchunk = chunk;
    if (chunk) {
        chunk = [chunk stringByAppendingFormat:@" %@", piece];
    } else {
        chunk = piece;
    }
    if ([chunk length] > 12 && lastchunk) {
        // Adding this word overflowed the line: store the previous chunk
        // and start the next line with the current word
        [phrases addObject:lastchunk];
        chunk = piece;
    }
    if ([phrases count] == 3) {
        chunk = nil; // the remaining words are ignored
        break;
    }
}
// Store the last, partially filled line if there is still room for it
if (chunk && [phrases count] < 3) {
    [phrases addObject:chunk];
}

Well, you can keep splitting the string as you're already doing, or you could check whether NSScanner suits your needs. In any case, you're going to have to do the math yourself.
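For what it's worth, here is a sketch of the NSScanner route. WrapWords and its parameters are made-up names; the logic is a plain greedy wrap and assumes no single word is longer than the line limit (a longer word would simply get a line of its own):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: greedy word wrap into at most maxLines lines
// of at most maxLength characters each; leftover words are dropped.
NSArray<NSString *> *WrapWords(NSString *text, NSUInteger maxLength, NSUInteger maxLines) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    NSMutableString *line = [NSMutableString string];
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSString *word = nil;
    // The scanner skips leading whitespace by default, so every
    // successful scan hands back exactly one word.
    while ([scanner scanUpToCharactersFromSet:whitespace intoString:&word]) {
        NSUInteger needed = line.length + (line.length > 0 ? 1 : 0) + word.length;
        if (needed > maxLength && line.length > 0) {
            // The word does not fit: close the current line first.
            [lines addObject:[line copy]];
            [line setString:@""];
            if (lines.count == maxLines) break;
        }
        if (line.length > 0) [line appendString:@" "];
        [line appendString:word];
    }
    if (line.length > 0 && lines.count < maxLines) {
        [lines addObject:[line copy]];
    }
    return lines;
}
```

Called as WrapWords(@"There are seven words in this phrase", 12, 3), this gives "There are", "seven words" and "in this", with "phrase" ignored as in the question.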

Thanks McLemore, that is really helpful! I will try this immediately. My current solution is very similar but less refined: I hard-coded the loops and use individual variables to hold the substrings (TopRow, MidRow, and BottomRow), and it overlooks memory management:
int maxLength = 12; // max chars per line (in each string)
int j=0; // for looping, j is the counter for managing the words in the "for" loop
TopRow = nil; //1st string
MidRow = nil; //2nd string
//BottomRow = nil; //third row string (not implemented yet)
BOOL Row01done = NO; // if YES, then stop trying to fill row 1
BOOL Row02done = NO; // if YES, then stop trying to fill row 2
largeArray = @"Larger string with multiple words";
tempArray = [largeArray componentsSeparatedByString:@" "];
for (j=0; j<[tempArray count]; j=j+1) {
    if (TopRow == nil) {
        TopRow = [tempArray objectAtIndex:j];
    }
    else {
        if (Row01done == YES) {
            if (MidRow == nil) {
                MidRow = [tempArray objectAtIndex:j];
            }
            else {
                if (Row02done == YES) {
                    //row 3 stuff goes here... unless I can rewrite as iterative loop...
                    //will need to uncomment BottomRow = nil; above..
                }
                else {
                    if ([MidRow length] + [[tempArray objectAtIndex:j] length] < maxLength) {
                        MidRow = [MidRow stringByAppendingString:@" "];
                        MidRow = [MidRow stringByAppendingString:[tempArray objectAtIndex:j]];
                    }
                    else {
                        Row02done = YES;
                        //j=j-1; // uncomment once BottomRow loop is implemented
                    }
                }
            }
        }
        else {
            if (([TopRow length] + [[tempArray objectAtIndex:j] length]) < maxLength) {
                TopRow = [TopRow stringByAppendingString:@" "];
                TopRow = [TopRow stringByAppendingString:[tempArray objectAtIndex:j]];
            }
            else {
                Row01done = YES;
                j=j-1; //end of loop without adding the string to TopRow, subtract 1 from j and start over inside Mid Row
            }
        }
    }
}
