Count of chars in NSString or NSMutableString? - objective-c

I've tried this
NSCharacterSet *myCharSet = [NSCharacterSet characterSetWithCharactersInString: myString];
[myCharSet count];
But I get a warning that NSCharacterSet may not respond to count. This is for a desktop app rather than iPhone, where I think the above code works.

I might be missing something here, but what's wrong with simply doing:
NSUInteger characterCount = [myString length];
To just get the number of characters in a string, I don't see any reason to mess around with NSCharacterSet.

That should not work on the iPhone either, as NSCharacterSet is not a subclass of NSSet on either platform.
If you really need to get a count why not subclass NSSet, add the value, then have a method that returns that as an NSCharacterSet on demand for use in anything that needs a character set?

NSString *string = @"0̄ 😄";
__block NSUInteger count = 0;
// Each composed character sequence (a user-visible character) invokes the block once,
// whereas -length reports the number of UTF-16 code units.
[string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    count++;
}];
NSLog(@"%ld %ld", (long)count, (long)[string length]);

Related

Split string into parts

I want to split an NSString into an array of fixed-length parts. How can I do this?
I searched for it, but I only found the componentsSeparatedByString: method and nothing more. It can also be done manually, but is there a faster way to do this?
Depends what you mean by "faster" - if it is processor performance you refer to, I'd guess that it is hard to beat substringWithRange:, but for robust, easy coding of a problem like this, regular expressions can actually come in quite handy.
Here's one that can be used to divide a string into 10-char chunks, allowing the last chunk to be of less than 10 chars:
NSString *pattern = @".{1,10}";
Unfortunately, the Cocoa implementation of the regex machinery is less elegant, but simple enough to use:
NSString *string = @"I want to split NSString into array with fixed-length parts. How can i do this?";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
NSArray *matches = [regex matchesInString:string options:0 range:NSMakeRange(0, [string length])];
NSMutableArray *result = [NSMutableArray array];
for (NSTextCheckingResult *match in matches) {
    [result addObject:[string substringWithRange:match.range]];
}
Break the string into a sequence of NSRanges and then try using NSString's substringWithRange: method.
You can split a string in different ways.
One way is to split by spaces(or any character):
NSString *string = @"Hello World Obj C is Awesome";
NSArray *words = [string componentsSeparatedByString:@" "];
You can also split at exact points in a string:
NSString *word = [string substringWithRange:NSMakeRange(startPoint, FIXED_LENGTH)];
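Simply put it in a loop that advances by the fixed length and save each piece to a mutable array:
NSMutableArray *words = [NSMutableArray array];
for (NSUInteger i = 0; i < [string length]; i += FIXED_LENGTH) {
    // Clamp the last chunk so the range never runs past the end of the string.
    NSUInteger length = MIN((NSUInteger)FIXED_LENGTH, [string length] - i);
    NSString *word = [string substringWithRange:NSMakeRange(i, length)]; // FIXED_LENGTH could be a #define
    [words addObject:word];
}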
Hope this helps.

matching multiple words with enumerateSubstringsInRange in NSMutableAttributedString

I am trying to match the string below but unfortunately it only gives me "nope" as the result. Can anyone help? thanks in advance!
NSMutableAttributedString *text = [[NSMutableAttributedString alloc] initWithString:@"darn thing suddenly erupted without any warning."];
NSString *findMe = @"suddenly erupted";
// The enumeration method belongs to NSString, so it is called on the attributed string's backing string.
[[text string] enumerateSubstringsInRange:NSMakeRange(0, [text length]) options:NSStringEnumerationByWords usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if ([findMe isEqualToString:substring]) {
        NSLog(@"found it");
    }
    else {
        NSLog(@"nope");
    }
}];
Your method is only enumerating separate words. "suddenly erupted" are two words.
Why don't you use -rangeOfString: to find out whether text contains some substring? For example:
NSLog(@"%@", [[text mutableString] rangeOfString:findMe].location == NSNotFound ? @"nope" : @"found it");
enumerateSubstringsInRange:options:usingBlock: has options such as:
NSStringEnumerationByLines
NSStringEnumerationBySentences
NSStringEnumerationByParagraphs
NSStringEnumerationByComposedCharacterSequences
NSStringEnumerationByWords
With NSStringEnumerationByWords the block receives one word at a time, so the comparison only works if findMe is a single word, e.g.
NSString *text = @"darn thing suddenlyerupted without any warning.";
NSString *findMe = @"suddenlyerupted";
So you can't match a substring that spans several words this way. You need to customize the block or move to some other option.

Separate Full Sentences in a block of NSString text

I have been trying to use Regular Expression to separate full sentences in a big block of text. I can't use the componentsSeparatedByCharactersInSet because it will obviously fail with sentences ending in ?!, !!, ... I have seen some external classes to do componentSeparateByRegEx but I prefer doing it without adding an external library.
Here is a sample input
Hi, I am testing. How are you? Wow!! this is the best, and I am happy.
The output should be an array
first element: Hi, I am testing.
second element: How are you?
third element: Wow!!
fourth element: this is the best, and I am happy.
This is what I have, but as I mentioned it doesn't do what I intend. A regular expression would probably do a much better job here.
- (NSArray *)getArrayOfFullSentencesFromBlockOfText:(NSString *)textBlock {
    NSMutableCharacterSet *characterSet = [[NSMutableCharacterSet alloc] init];
    [characterSet addCharactersInString:@".?!"];
    NSArray *sentenceArray = [textBlock componentsSeparatedByCharactersInSet:characterSet];
    return sentenceArray;
}
Thanks for your help,
You want to use -[NSString enumerateSubstringsInRange:options:usingBlock:] with the NSStringEnumerationBySentences option. This will give you every sentence, and it does so in a language-aware manner.
NSArray *fullSentencesFromText(NSString *text) {
NSMutableArray *results = [NSMutableArray array];
[text enumerateSubstringsInRange:NSMakeRange(0, [text length]) options:NSStringEnumerationBySentences usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
[results addObject:substring];
}];
return results;
}
Note, in testing, each substring appears to contain the trailing spaces after the punctuation. You may want to strip those out.
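The trailing whitespace mentioned above can be stripped with stringByTrimmingCharactersInSet:. This is only a minimal sketch; the helper name TrimmedSentencesFromText is hypothetical and not part of Foundation:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: enumerates sentences and trims surrounding whitespace from each one.
static NSArray *TrimmedSentencesFromText(NSString *text) {
    NSMutableArray *results = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    [text enumerateSubstringsInRange:NSMakeRange(0, [text length])
                             options:NSStringEnumerationBySentences
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        NSString *trimmed = [substring stringByTrimmingCharactersInSet:whitespace];
        // Skip anything that was whitespace only.
        if ([trimmed length] > 0) {
            [results addObject:trimmed];
        }
    }];
    return results;
}
```

Calling TrimmedSentencesFromText(msg) on a block of text should then give sentences without the spaces that follow the punctuation.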
Something like this could do the job:
NSString *msg = @"Hi, I am testing. How are you? Wow!! this is the best, and I am happy.";
[msg enumerateSubstringsInRange:NSMakeRange(0, [msg length])
                        options:NSStringEnumerationBySentences | NSStringEnumerationLocalized
                     usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop)
{
    NSLog(@"Sentence:%@", substring);
    // Add each sentence into an array
}];
Or use:
[mutstri enumerateSubstringsInRange:NSMakeRange(0, [mutstri length])
                            options:NSStringEnumerationBySentences
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    NSLog(@"%@", substring);
}];

Objective-C Find the most commonly used words in an NSString

I am trying to write a method:
- (NSDictionary *)wordFrequencyFromString:(NSString *)string {}
where the dictionary returned will have the words and how often they were used in the string provided. Unfortunately, I can't seem to find a way to iterate through words in a string to analyze each one - only each character which seems like a bit more work than necessary. Any suggestions?
NSString has the -enumerateSubstringsInRange:options:usingBlock: method, which lets you enumerate all words directly and leaves details such as defining word boundaries to the standard API:
[s enumerateSubstringsInRange:NSMakeRange(0, [s length])
                      options:NSStringEnumerationByWords
                   usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    NSLog(@"%@", substring);
}];
In the enumeration block you can use either NSDictionary with words as keys and NSNumber as their counts, or use NSCountedSet that provides required functionality for counts.
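As a rough sketch of that idea, the following collects words into an NSCountedSet and then copies the counts into a dictionary of NSNumbers. The function name WordFrequencyFromString and the decision to lowercase every word are assumptions made for illustration, not something the question specifies:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical free-function version of the wordFrequencyFromString: method from the question.
static NSDictionary *WordFrequencyFromString(NSString *string) {
    NSCountedSet *counts = [NSCountedSet set];
    [string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                               options:NSStringEnumerationByWords
                            usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        // Assumption: lowercasing makes "The" and "the" count as the same word.
        [counts addObject:[substring lowercaseString]];
    }];

    // Copy each distinct word and its count into a dictionary of NSNumbers.
    NSMutableDictionary *frequencies = [NSMutableDictionary dictionaryWithCapacity:[counts count]];
    for (NSString *word in counts) {
        frequencies[word] = @([counts countForObject:word]);
    }
    return frequencies;
}
```

Returning the counted set directly would avoid the copy, at the cost of changing the return type.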
You can use componentsSeparatedByCharactersInSet: to split the string and NSCountedSet will count the words for you.
1) Split the string into words using a combination of the punctuation, whitespace and new line character sets:
NSMutableCharacterSet *separators = [NSMutableCharacterSet punctuationCharacterSet];
[separators formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSArray *words = [myString componentsSeparatedByCharactersInSet:separators];
2) Count the occurrences of the words (if you want to disregard capitalization, you can do NSString *myString = [originalString lowercaseString]; before splitting the string into components):
NSCountedSet *frequencies = [NSCountedSet setWithArray:words];
NSUInteger aWordCount = [frequencies countForObject:@"word"];
If you are willing to change your method signature, you can just return the counted set.
Split the string into an array of words using -[NSString componentsSeparatedByCharactersInSet:] first. (Use [[NSCharacterSet letterCharacterSet] invertedSet] as the argument to split on all non-letter characters.)
I used the following approach to get the most common word from an NSString.
- (void)countMostFrequentWordInSpeech:(NSString *)speechString
{
    NSString *string = speechString;
    NSCountedSet *countedSet = [NSCountedSet new];
    [string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                               options:NSStringEnumerationByWords | NSStringEnumerationLocalized
                            usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
        [countedSet addObject:substring];
    }];
    // NSLog(@"%@", countedSet);
    // Sort the counted set so the most frequent word ends up at index 0 of the resulting array
    NSMutableArray *dictArray = [NSMutableArray array];
    [countedSet enumerateObjectsUsingBlock:^(id obj, BOOL *stop) {
        [dictArray addObject:@{@"object": obj,
                               @"count": @([countedSet countForObject:obj])}];
    }];
    NSArray *sortedArrayOfWord = [dictArray sortedArrayUsingDescriptors:@[[NSSortDescriptor sortDescriptorWithKey:@"count" ascending:NO]]];
    if (sortedArrayOfWord.count > 0)
    {
        self.mostFrequentWordLabel.text = [NSString stringWithFormat:@"Frequent Word: %@", [[sortedArrayOfWord[0] valueForKey:@"object"] capitalizedString]];
    }
}
"speechString" is my string from which I have to get most frequent/common words. Object at 0th index of array "sortedArrayOfWord" would be most common word.

How do you get the number of words in a NSTextStorage/NSString?

So my question is basically how do you get the number of words in a NSTextStorage/NSString? I don't want the character length but the word length. Thanks.
If you're on 10.6 or later, the following may be the easiest solution:
- (NSUInteger)numberOfWordsInString:(NSString *)str {
__block NSUInteger count = 0;
[str enumerateSubstringsInRange:NSMakeRange(0, [str length])
options:NSStringEnumerationByWords|NSStringEnumerationSubstringNotRequired
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
count++;
}];
return count;
}
If you want to take the current locale into account when doing word-splitting you can also add NSStringEnumerationLocalized to the options.
You could always find the number of spaces and add one.
To be more accurate one would have to take into account all non-letter characters: commas, full stops, whitespace characters, etc.
[[string componentsSeparatedByString:@" "] count];
When using NSTextStorage, you can use the words method to get to the number of words. It might not be the most memory-efficient way to count words, but it does a pretty good job at ignoring punctuation marks and other non-word characters:
NSString *input = @"one - two three four .";
NSTextStorage *storage = [[NSTextStorage alloc] initWithString:input];
NSLog(@"word count: %lu", (unsigned long)[[storage words] count]);
The output will be word count: 4.
CFStringTokenizer is your friend.
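A minimal sketch of what that might look like, assuming word-unit tokenization with the current locale. The function name WordCountUsingTokenizer is made up for illustration:

```objective-c
#import <Foundation/Foundation.h>
#import <CoreFoundation/CFStringTokenizer.h>

// Hypothetical helper: counts word tokens in str using CFStringTokenizer.
static NSUInteger WordCountUsingTokenizer(NSString *str) {
    CFLocaleRef locale = CFLocaleCopyCurrent();
    CFStringTokenizerRef tokenizer =
        CFStringTokenizerCreate(kCFAllocatorDefault,
                                (__bridge CFStringRef)str,
                                CFRangeMake(0, (CFIndex)[str length]),
                                kCFStringTokenizerUnitWord,
                                locale);
    NSUInteger count = 0;
    // Advance token by token until the tokenizer reports that none are left.
    while (CFStringTokenizerAdvanceToNextToken(tokenizer) != kCFStringTokenizerTokenNone) {
        count++;
    }
    // Core Foundation objects created here are not managed by ARC, so release them manually.
    CFRelease(tokenizer);
    CFRelease(locale);
    return count;
}
```

Compared with the block-based NSString enumeration, this keeps everything at the Core Foundation level, which may matter if blocks are not available on the deployment target.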
Use that:
NSArray *words = [theStorage words];
NSUInteger wordCount = [words count];
Is that your problem?