I'd like to have a function that removes a random set of characters from a string and replaces them with '_', e.g. to create a fill-in-the-blanks type of situation. The way I have it now works, but it's not smart. Also, I don't want to replace spaces with blanks (as you can see in the while loop). Any suggestions on a more efficient way to do this?
blankItem = @"Remove Some Characters";
for (int j = 0; j < totalRemove; j++)
{
    replaceLocation = arc4random() % blankItem.length;
    while ([blankItem characterAtIndex:replaceLocation] == '_' || [blankItem characterAtIndex:replaceLocation] == ' ') {
        replaceLocation = arc4random() % blankItem.length;
    }
    blankItem = [blankItem stringByReplacingCharactersInRange:NSMakeRange(replaceLocation, 1) withString:@"_"];
}
My issue is with the for and while loops in terms of efficiency. But, maybe efficiency isn't of the essence in something this small?
If the number of characters to remove/replace is small compared to the length of the
string, then your solution is good, because the probability of a "collision" in the
while-loop is small. You can improve the method by using a single mutable string instead of
allocating a new string in each step:
NSString *string = @"Remove Some Characters";
int totalRemove = 5;
NSMutableString *result = [string mutableCopy];
for (int j=0; j < totalRemove; j++) {
    int replaceLocation;
    do {
        replaceLocation = arc4random_uniform((int)[result length]);
    } while ([result characterAtIndex:replaceLocation] == '_' || [result characterAtIndex:replaceLocation] == ' ');
    [result replaceCharactersInRange:NSMakeRange(replaceLocation, 1) withString:@"_"];
}
If the number of characters to remove/replace is about the same magnitude as the
length of the string, then a different algorithm might be better.
The following code uses the ideas from "Unique random numbers in an integer array in the C programming language" to replace characters
at random positions with a single loop over all characters of the string.
An additional (first) pass is necessary because of your requirement that space characters
are not replaced.
NSString *string = @"Remove Some Characters";
int totalRemove = 5;
// First pass: Determine number of non-space characters:
__block int count = 0;
[string enumerateSubstringsInRange:NSMakeRange(0, [string length])
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![substring isEqualToString:@" "]) {
        count++;
    }
}];
// Second pass: Replace characters at random positions:
__block int c = count; // Number of remaining non-space characters
__block int r = totalRemove; // Number of remaining characters to replace
NSMutableString *result = [string mutableCopy];
[result enumerateSubstringsInRange:NSMakeRange(0, [result length])
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    if (![substring isEqualToString:@" "]) {
        // Replace this character with probability r/c:
        if (arc4random_uniform(c) < r) {
            [result replaceCharactersInRange:substringRange withString:@"_"];
            r--;
            if (r == 0) *stop = YES; // Stop enumeration, nothing more to do.
        }
        c--;
    }
}];
Another advantage of this solution is that it handles surrogate pairs (e.g. emoji) and composed character sequences correctly, even if these are stored as two separate characters in the string.
I'm trying to figure out the best approach to a problem. I have an essentially random alphanumeric string that I'm generating on the fly:
NSString *string = @"e04325ca24cf20ac6bd6ebf73c376b20ac57192dad83b22602264e92dac076611b51142ae12d2d92022eb2c77f";
You can see that there are no special characters, just numbers and letters, and all the letters are lowercase. Changing all the letters in this string to uppercase is easy:
[string uppercaseString];
The hard part is that I want to capitalize random characters in this string, not all of them. For example, this could be the output on one execution:
E04325cA24CF20ac6bD6eBF73C376b20Ac57192DAD83b22602264e92daC076611b51142AE12D2D92022Eb2C77F
This could be the output on another, since it's random:
e04325ca24cf20aC6bd6eBF73C376B20Ac57192DAd83b22602264E92dAC076611B51142AE12D2d92022EB2c77f
In case it makes this easier, let's say I have two variables as well:
int charsToUppercase = 12;//hardcoded value for how many characters to uppercase here
int totalChars = 90;//total string length
In this instance it would mean that 12 random characters out of the 90 in this string would be uppercased. What I've figured out so far is that I can loop through each char in the string relatively easily:
NSUInteger len = [string length];
unichar buffer[len+1];
[string getCharacters:buffer range:NSMakeRange(0, len)];
NSLog(@"loop through each char");
for(int i = 0; i < len; i++) {
    NSLog(@"%C", buffer[i]);
}
Still stuck with selecting random chars in this loop to uppercase, so not all are uppercased. I'm guessing a condition in the for loop could do the trick well, given that it's random enough.
Here's one way, not particularly concerned with efficiency, but not silly efficiency-wise either: create an array of the characters in the original string, building an index of which ones are letters along the way...
NSString *string = @"e04325ca24cf20ac6bd6ebf73c376b20ac57192dad83b22602264e92dac076611b51142ae12d2d92022eb2c77f";
NSMutableArray *chars = [@[] mutableCopy];
NSMutableArray *letterIndexes = [@[] mutableCopy];
for (int i=0; i<string.length; i++) {
    unichar ch = [string characterAtIndex:i];
    // add each char as a string to a chars collection (%C is the unichar format specifier)
    [chars addObject:[NSString stringWithFormat:@"%C", ch]];
    // record the index of letters
    if ([[NSCharacterSet letterCharacterSet] characterIsMember:ch]) {
        [letterIndexes addObject:@(i)];
    }
}
Now, select randomly from the letterIndexes (removing them as we go) to determine which letters shall be upper case. Convert the member of the chars array at that index to uppercase...
int charsToUppercase = 12;
for (int i=0; i<charsToUppercase && letterIndexes.count; i++) {
    NSInteger randomLetterIndex = arc4random_uniform((u_int32_t)(letterIndexes.count));
    NSInteger indexToUpdate = [letterIndexes[randomLetterIndex] intValue];
    [letterIndexes removeObjectAtIndex:randomLetterIndex];
    [chars replaceObjectAtIndex:indexToUpdate withObject:[chars[indexToUpdate] uppercaseString]];
}
Notice the && check on letterIndexes.count. This guards against the condition where charsToUppercase exceeds the number of letters. The upper bound of conversions to uppercase is all of the letters in the original string.
Now all that's left is to join the chars array into a string...
NSString *result = [chars componentsJoinedByString:@""];
NSLog(@"%@", result);
EDIT Looking at the discussion in the OP comments, you could, instead of a charsToUppercase input parameter, be given a probability of uppercase change as an input. That would compress this idea into a single loop with a little less data transformation...
NSString *string = @"e04325ca24cf20ac6bd6ebf73c376b20ac57192dad83b22602264e92dac076611b51142ae12d2d92022eb2c77f";
float upperCaseProbability = 0.5;
NSMutableString *result = [@"" mutableCopy];
for (int i=0; i<string.length; i++) {
    NSString *chString = [string substringWithRange:NSMakeRange(i, 1)];
    BOOL toUppercase = arc4random_uniform(1000) / 1000.0 < upperCaseProbability;
    if (toUppercase) {
        chString = [chString uppercaseString];
    }
    [result appendString:chString];
}
NSLog(@"%@", result);
However this assumes a given uppercase probability for any character, not any letter, so it won't result in a predetermined number of letters changing case.
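If an exact number of uppercased letters is wanted in a single pass over the characters, one option is to borrow the "pick with probability remaining picks / remaining candidates" idea. The following is only a sketch: the function name UppercaseRandomLetters is made up for illustration, and it assumes the input contains only lowercase letters and digits from the Basic Multilingual Plane, as in the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the answers above): uppercases exactly
// `count` randomly chosen letters of `string`, or every letter if there
// are fewer than `count`.
static NSString *UppercaseRandomLetters(NSString *string, NSUInteger count) {
    NSCharacterSet *letters = [NSCharacterSet letterCharacterSet];
    NSUInteger length = string.length;

    // First pass: count the letters, since digits are never candidates.
    NSUInteger remainingLetters = 0;
    for (NSUInteger i = 0; i < length; i++) {
        if ([letters characterIsMember:[string characterAtIndex:i]]) {
            remainingLetters++;
        }
    }

    // Second pass: pick each letter with probability
    // remainingPicks / remainingLetters.
    NSUInteger remainingPicks = MIN(count, remainingLetters);
    NSMutableString *result = [NSMutableString stringWithCapacity:length];
    for (NSUInteger i = 0; i < length; i++) {
        NSString *ch = [string substringWithRange:NSMakeRange(i, 1)];
        if ([letters characterIsMember:[string characterAtIndex:i]]) {
            if (arc4random_uniform((uint32_t)remainingLetters) < remainingPicks) {
                ch = [ch uppercaseString];
                remainingPicks--;
            }
            remainingLetters--;
        }
        [result appendString:ch];
    }
    return result;
}
```

It could be called as NSLog(@"%@", UppercaseRandomLetters(string, 12)); the count is clamped, so asking for more letters than exist simply uppercases all of them.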
I want to know a simple and fast way to determine if all characters in an NSString are the same.
For example:
NSString *string = @"aaaaaaaaa";
=> return YES
NSString *string = @"aaaaaaabb";
=> return NO
I know that I can achieve it by using a loop but my NSString is long so I prefer a shorter and simpler way.
You can use this: remove every occurrence of the first character and check whether the remaining length is zero:
-(BOOL)sameCharsInString:(NSString *)str{
    if ([str length] == 0 ) return NO;
    return [[str stringByReplacingOccurrencesOfString:[str substringToIndex:1] withString:@""] length] == 0 ? YES : NO;
}
Here are two possibilities that fail as quickly as possible and don't (explicitly) create copies of the original string, which should be advantageous since you said the string was large.
First, use NSScanner to repeatedly try to read the first character in the string. If the loop ends before the scanner has reached the end of the string, there are other characters present.
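NSScanner * scanner = [NSScanner scannerWithString:s];
// By default the scanner skips whitespace and ignores case; disable both
// so that every character is compared exactly.
[scanner setCharactersToBeSkipped:nil];
[scanner setCaseSensitive:YES];
NSString * firstChar = [s substringWithRange:[s rangeOfComposedCharacterSequenceAtIndex:0]];
while( [scanner scanString:firstChar intoString:NULL] ) continue;
BOOL stringContainsOnlyOneCharacter = [scanner isAtEnd];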
Regex is also a good tool for this problem, since "a character followed by any number of repetitions of that character" is very simply expressed with a single back reference:
// Match one of any character at the start of the string,
// followed by any number of repetitions of that same character
// until the end of the string.
NSString * patt = @"^(.)\\1*$";
NSRegularExpression * regEx =
    [NSRegularExpression regularExpressionWithPattern:patt
                                              options:0
                                                error:NULL];
NSArray * matches = [regEx matchesInString:s
                                   options:0
                                     range:(NSRange){0, [s length]}];
BOOL stringContainsOnlyOneCharacter = ([matches count] == 1);
Both these options correctly deal with multi-byte and composed characters; the regex version also does not require an explicit check for the empty string.
Use this loop:
// An empty string has no first character; substringWithRange: would throw
if ([str length] == 0) return NO;
NSString *firstChar = [str substringWithRange:NSMakeRange(0, 1)];
for (int i = 1; i < [str length]; i++) {
    NSString *ch = [str substringWithRange:NSMakeRange(i, 1)];
    if(![ch isEqualToString:firstChar])
    {
        return NO;
    }
}
return YES;
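For strings that may contain emoji or combining marks, the same early-exit check can be sketched over composed character sequences instead of single unichars. This is a minimal sketch, not a definitive implementation: the function name StringHasOnlyOneDistinctCharacter is hypothetical, returning NO for the empty string is a choice made here, and the comparison is literal (a precomposed "é" and a decomposed "e" plus accent are treated as different).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if every user-visible character of `string`
// equals the first one; NO for an empty string.
static BOOL StringHasOnlyOneDistinctCharacter(NSString *string) {
    if (string.length == 0) {
        return NO;
    }
    __block NSString *first = nil;
    __block BOOL allSame = YES;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        if (first == nil) {
            first = substring;   // remember the first visible character
        } else if (![substring isEqualToString:first]) {
            allSame = NO;
            *stop = YES;         // the first mismatch decides the answer
        }
    }];
    return allSame;
}
```

Because the enumeration stops at the first mismatch, a long string that differs early is rejected without scanning the rest.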
I have an NSString containing a Unicode character bigger than U+FFFF, like the MUSICAL SYMBOL G CLEF symbol '𝄞'. I can create the NSString and display it.
NSString *s = @"A\U0001d11eB"; // "A𝄞B"
NSLog(@"String = \"%@\"", s);
The log is correct and displays the 3 characters. This tells me the NSString is well done and there is no encoding problem.
String = "A𝄞B"
But when I try to loop through all characters using the method
- (unichar)characterAtIndex:(NSUInteger)index
everything goes wrong.
The type unichar is 16 bits so I expect to get the wrong character for the musical symbol. But the length of the string is also incorrect!
NSLog(@"Length = %d", (int)[s length]);
for (int i=0; i<[s length]; i++)
{
    NSLog(@" Character %d = %c", i, [s characterAtIndex:i]);
}
displays
Length = 4
Character 0 = A
Character 1 = 4
Character 2 = .
Character 3 = B
What methods should I use to correctly parse my NSString and get my 3 unicode characters?
Ideally the right method should return a type like wchar_t in place of unichar.
Thank you
NSString *s = @"A\U0001d11eB";
NSData *data = [s dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
const wchar_t *wcs = [data bytes];
for (int i = 0; i < [data length]/4; i++) {
    NSLog(@"%#010x", wcs[i]);
}
Output:
0x00000041
0x0001d11e
0x00000042
(The code assumes that wchar_t has a size of 4 bytes and little-endian encoding.)
length and characterAtIndex: do not give the expected result because \U0001d11e
is internally stored as a UTF-16 "surrogate pair".
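To make the surrogate-pair point concrete, here is a small sketch that walks the UTF-16 units and combines each pair into one code point, using the Core Foundation inline helpers CFStringIsSurrogateHighCharacter, CFStringIsSurrogateLowCharacter and CFStringGetLongCharacterForSurrogatePair. The function name CodePointsOfString is hypothetical, and unpaired surrogates are simply passed through unchanged, which is an assumption rather than a recommendation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the Unicode code points of `string` as
// NSNumbers, combining each UTF-16 surrogate pair into a single value.
static NSArray<NSNumber *> *CodePointsOfString(NSString *string) {
    NSMutableArray<NSNumber *> *codePoints = [NSMutableArray array];
    NSUInteger length = string.length;
    for (NSUInteger i = 0; i < length; i++) {
        unichar unit = [string characterAtIndex:i];
        UTF32Char codePoint = unit;
        if (CFStringIsSurrogateHighCharacter(unit) && i + 1 < length) {
            unichar next = [string characterAtIndex:i + 1];
            if (CFStringIsSurrogateLowCharacter(next)) {
                codePoint = CFStringGetLongCharacterForSurrogatePair(unit, next);
                i++; // the low surrogate has been consumed
            }
        }
        [codePoints addObject:@(codePoint)];
    }
    return codePoints;
}
```

For the string @"A\U0001d11eB" this yields three values, 0x41, 0x1D11E and 0x42, because the units 0xD834 and 0xDD1E combine as 0x10000 + ((0xD834 - 0xD800) << 10) + (0xDD1E - 0xDC00).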
Another useful method for general Unicode strings is
[s enumerateSubstringsInRange:NSMakeRange(0, [s length])
                      options:NSStringEnumerationByComposedCharacterSequences
                   usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    NSLog(@"%@", substring);
}];
Output:
A
𝄞
B
I have a UITextField that users will be entering characters into. The question is simple: how can I get its actual length? When the string contains A-Z and 1-9 characters it works as expected, but any emoji or special characters get double counted.
In its simplest form, this just reports 2 characters for some special characters like emoji:
NSLog(@"Field '%@' contains %lu chars", myTextBox.text, (unsigned long)[myTextBox.text length]);
I have tried looping through each character using characterAtIndex, substringFromIndex, etc. and got nowhere.
As per answer below, exact code used to count characters (hope this is the right approach but it works..):
NSString *sString = txtBox.text;
__block int length = 0;
[sString enumerateSubstringsInRange:NSMakeRange(0, [sString length])
                            options:NSStringEnumerationByComposedCharacterSequences
                         usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    length++;
}];
NSLog(@"Total: %d", length);
[myTextBox.text length] returns the count of unichars and not the visible length of the string. é = e + ´, which is 2 unichars. Emoji characters usually consist of more than 1 unichar.
The sample below enumerates each composed character sequence in the string, which means that if you log substringRange, its length can be greater than 1.
__block NSInteger length = 0;
[string enumerateSubstringsInRange:range
                           options:NSStringEnumerationByComposedCharacterSequences
                        usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    length++;
}];
You should go and watch Session 128 - Advanced Text Processing from WWDC 2011. They explain why it is like that. It's really great!
I hope this was of some help.
Cheers!
We can also consider the option below:
const char *cString = [myTextBox.text UTF8String];
int textLength = (int)strlen(cString);
Note that this counts UTF-8 bytes, so it matches the visible length only for ASCII text; emoji and accented characters take several bytes each.
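To compare the different ways of measuring a string, the following sketch logs the UTF-16 unit count, the UTF-8 byte count and the number of composed character sequences side by side. The function names VisibleLength and LogLengths are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: number of user-visible characters
// (composed character sequences) in `string`.
static NSUInteger VisibleLength(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring, NSRange substringRange,
                                         NSRange enclosingRange, BOOL *stop) {
        count++;
    }];
    return count;
}

// Logs the three measures for one string.
static void LogLengths(NSString *string) {
    NSLog(@"UTF-16 units: %lu, UTF-8 bytes: %lu, visible: %lu",
          (unsigned long)string.length,
          (unsigned long)[string lengthOfBytesUsingEncoding:NSUTF8StringEncoding],
          (unsigned long)VisibleLength(string));
}
```

For @"A\U0001d11eB" the three values are 4, 6 and 3, and for @"e\u0301" (e followed by a combining acute accent) they are 2, 3 and 1, so only the composed-sequence count matches what the user sees.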
How can I get the number of times an NSString (for example, @"cake") appears in a larger NSString (for example, @"Cheesecake, apple cake, and cherry pie")?
I need to do this on a lot of strings, so whatever method I use would need to be relatively fast.
Thanks!
This isn't tested, but should be a good start.
NSUInteger count = 0, length = [str length];
NSRange range = NSMakeRange(0, length);
while(range.location != NSNotFound)
{
    range = [str rangeOfString: @"cake" options:0 range:range];
    if(range.location != NSNotFound)
    {
        range = NSMakeRange(range.location + range.length, length - (range.location + range.length));
        count++;
    }
}
A regex like the one below should do the job without an explicit loop...
NSString *string = @"Lots of cakes, with a piece of cake.";
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"cake" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string options:0 range:NSMakeRange(0, [string length])];
NSLog(@"Found %lu", (unsigned long)numberOfMatches);
Only available on iOS 4.0 and later.
I was searching for a better method than mine, but here's another example:
NSString *find = @"cake";
NSString *text = @"Cheesecake, apple cake, and cherry pie";
NSInteger strCount = [text length] - [[text stringByReplacingOccurrencesOfString:find withString:@""] length];
strCount /= [find length];
I would like to know which one is more effective.
And I made an NSString category for better usage:
// NSString+CountString.m
@interface NSString (CountString)
- (NSInteger)countOccurencesOfString:(NSString*)searchString;
@end
@implementation NSString (CountString)
- (NSInteger)countOccurencesOfString:(NSString*)searchString {
    if ([searchString length] == 0) return 0; // avoid dividing by zero below
    NSInteger strCount = [self length] - [[self stringByReplacingOccurrencesOfString:searchString withString:@""] length];
    return strCount / [searchString length];
}
@end
simply call it by:
[text countOccurencesOfString:find];
Optional:
you can modify it to search case insensitive by defining options:
There are a couple ways you could do it. You could iteratively call rangeOfString:options:range:, or you could do something like:
NSArray * portions = [aString componentsSeparatedByString:@"cake"];
NSUInteger cakeCount = [portions count] - 1;
EDIT I was thinking about this question again and I wrote a linear-time algorithm to do the searching (linear to the length of the haystack string):
+ (NSUInteger) numberOfOccurrencesOfString:(NSString *)needle inString:(NSString *)haystack {
    const char * rawNeedle = [needle UTF8String];
    NSUInteger needleLength = strlen(rawNeedle);
    const char * rawHaystack = [haystack UTF8String];
    NSUInteger haystackLength = strlen(rawHaystack);
    NSUInteger needleCount = 0;
    NSUInteger needleIndex = 0;
    for (NSUInteger index = 0; index < haystackLength; ++index) {
        const char thisCharacter = rawHaystack[index];
        if (thisCharacter != rawNeedle[needleIndex]) {
            // They don't match; reset the needle index. Note that this simple
            // reset does not backtrack, so a needle with a repeated prefix can
            // be missed (e.g. "aab" in "aaab").
            needleIndex = 0;
        }
        //resetting the needle might be the beginning of another match
        if (thisCharacter == rawNeedle[needleIndex]) {
            needleIndex++; //char match
            if (needleIndex >= needleLength) {
                needleCount++; //we completed finding the needle
                needleIndex = 0;
            }
        }
    }
    return needleCount;
}
A quicker to type, but probably less efficient solution.
- (int)numberOfOccurencesOfSubstring:(NSString *)substring inString:(NSString*)string
{
    NSArray *components = [string componentsSeparatedByString:substring];
    return components.count-1; // Two substrings will create 3 separated strings in the array.
}
Here is a version done as an extension to NSString (same idea as Matthew Flaschen's answer):
@interface NSString (my_substring_search)
- (unsigned) countOccurencesOf: (NSString *)subString;
@end
@implementation NSString (my_substring_search)
- (unsigned) countOccurencesOf: (NSString *)subString {
    unsigned count = 0;
    unsigned myLength = [self length];
    NSRange uncheckedRange = NSMakeRange(0, myLength);
    for(;;) {
        NSRange foundAtRange = [self rangeOfString:subString
                                           options:0
                                             range:uncheckedRange];
        if (foundAtRange.location == NSNotFound) return count;
        unsigned newLocation = NSMaxRange(foundAtRange);
        uncheckedRange = NSMakeRange(newLocation, myLength-newLocation);
        count++;
    }
}
@end
// Usage, somewhere in your code:
NSString *haystack = @"Cheesecake, apple cake, and cherry pie";
NSString *needle = @"cake";
unsigned count = [haystack countOccurencesOf: needle];
NSLog(@"found %u time%@", count, count == 1 ? @"" : @"s");
If you want to count words, not just substrings, then use CFStringTokenizer.
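As a rough illustration of the word-based approach, the sketch below walks the word tokens produced by CFStringTokenizer and counts those that equal the search word, ignoring case. The function name CountWordOccurrences is hypothetical, the code assumes ARC (hence the __bridge cast), and using the current locale is just one possible choice.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts the word tokens of `text` that equal `word`,
// ignoring case. Substrings inside longer words are not counted.
static NSUInteger CountWordOccurrences(NSString *word, NSString *text) {
    CFLocaleRef locale = CFLocaleCopyCurrent();
    CFStringTokenizerRef tokenizer =
        CFStringTokenizerCreate(kCFAllocatorDefault,
                                (__bridge CFStringRef)text,
                                CFRangeMake(0, (CFIndex)text.length),
                                kCFStringTokenizerUnitWord,
                                locale);
    NSUInteger count = 0;
    while (CFStringTokenizerAdvanceToNextToken(tokenizer) != kCFStringTokenizerTokenNone) {
        CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
        NSString *token = [text substringWithRange:
                           NSMakeRange((NSUInteger)tokenRange.location,
                                       (NSUInteger)tokenRange.length)];
        if ([token caseInsensitiveCompare:word] == NSOrderedSame) {
            count++;
        }
    }
    // Core Foundation objects created here are not managed by ARC.
    CFRelease(tokenizer);
    CFRelease(locale);
    return count;
}
```

With the example from the question, CountWordOccurrences(@"cake", @"Cheesecake, apple cake, and cherry pie") should count only the standalone word, whereas the substring-based answers also count the "cake" inside "Cheesecake".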
Here's another version as a category on NSString:
-(NSUInteger) countOccurrencesOfSubstring:(NSString *) substring {
    if ([self length] == 0 || [substring length] == 0)
        return 0;
    NSInteger result = -1;
    NSRange range = NSMakeRange(0, 0);
    do {
        ++result;
        range = NSMakeRange(range.location + range.length,
                            self.length - (range.location + range.length));
        range = [self rangeOfString:substring options:0 range:range];
    } while (range.location != NSNotFound);
    return result;
}
A Swift solution (updated here to Swift 5 syntax, using Foundation) would be:
import Foundation
var numberOfSubstringAppearance = 0
var searchRange = text.startIndex..<text.endIndex
while let found = text.range(of: substring, options: [], range: searchRange) {
    numberOfSubstringAppearance += 1
    // Continue searching after the match that was just found
    searchRange = found.upperBound..<text.endIndex
}
Matthew Flaschen's answer was a good start for me. Here is what I ended up using in the form of a method. I took a slightly different approach to the loop. This has been tested with empty strings passed to stringToCount and text and with the stringToCount occurring as the first and/or last characters in text.
I use this method regularly to count paragraphs in the passed text (i.e. stringToCount = @"\r").
Hope this is of use to someone.
- (int)countString:(NSString *)stringToCount inText:(NSString *)text{
    int foundCount=0;
    NSRange range = NSMakeRange(0, text.length);
    range = [text rangeOfString:stringToCount options:NSCaseInsensitiveSearch range:range locale:nil];
    while (range.location != NSNotFound) {
        foundCount++;
        range = NSMakeRange(range.location+range.length, text.length-(range.location+range.length));
        range = [text rangeOfString:stringToCount options:NSCaseInsensitiveSearch range:range locale:nil];
    }
    return foundCount;
}
Example call assuming the method is in a class named myHelperClass...
int foundCount = [myHelperClass countString:@"n" inText:@"Now is the time for all good men to come to the aid of their country"];
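// Assumes htmlsource1 and search are NSStrings and count starts at 0.
// The bound i + search.length <= htmlsource1.length includes a match at the
// very end and avoids unsigned underflow when search is longer than the text.
for (NSUInteger i = 0; i + search.length <= htmlsource1.length; i++) {
    NSRange range = NSMakeRange(i, search.length);
    NSString *checker = [htmlsource1 substringWithRange:range];
    if ([search isEqualToString:checker]) {
        count++;
    }
}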
No built-in method. I'd suggest returning a c-string and using a common c-string style algorithm for substring counting... if you really need this to be fast.
If you want to stay in Objective C, this link might help. It describes the basic substring search for NSString. If you work with the ranges, adjust and count, then you'll have a "pure" Objective C solution... albeit, slow.
-(IBAction)search:(id)sender{
    int maincount = 0;
    // Guard: characterAtIndex:0 would throw on an empty search string
    if ([self.substr.text length] == 0) {
        NSLog(@"%d", maincount);
        return;
    }
    for (int i=0; i<[self.txtfmainStr.text length]; i++) {
        unichar c = [self.substr.text characterAtIndex:0];
        unichar cMain = [self.txtfmainStr.text characterAtIndex:i];
        if (c == cMain) {
            int k=i;
            int count=0;
            for (int j = 0; j<[self.substr.text length]; j++) {
                if (k == [self.txtfmainStr.text length]) {
                    break;
                }
                if ([self.txtfmainStr.text characterAtIndex:k] == [self.substr.text characterAtIndex:j]) {
                    count++;
                }
                if (count == [self.substr.text length]) {
                    maincount++;
                }
                k++;
            }
        }
    }
    // Log the total once, after the whole string has been scanned
    NSLog(@"%d", maincount);
}