I'm learning Objective-C and iOS development. I'm trying to recreate some of the projects we did in my Java class (I know the languages are completely different), but I'm running into trouble with one of them. One day we did a Caesar shift in a string manipulation lab. It was really basic in Java: a for loop through the string that changes each character. I can't seem to find any way to change individual characters in Objective-C. I've looked through the NSMutableString and NSString documentation and I know I can do a
[NSString stringByReplacingCharactersInRange:(NSRange) withString:(NSString *)]
but that doesn't really help because I don't know what I'm going to be replacing with. I need to find a way to grab a character at a specific index and change it. Any ideas?
Sounds like you are looking for the [NSString characterAtIndex:(NSUInteger)] method
E.g.
NSString *string = @"abcde";
NSString *character = [NSString stringWithFormat:@"%C", [string characterAtIndex:0]];
NSLog(@"%@", character);
Result: a
Using this, and an NSMutableString, you can build the string you need.
Using appendString: you can add to the end of an NSMutableString.
Probably the best way to do this would be using a good old C-string, as that allows you to change the bytes without the overhead of reallocating a different string every time:
NSString *caesarShift(NSString *input)
{
    // Work on a private, mutable copy of the UTF-8 bytes.
    char *UTF8Str = strdup([input UTF8String]);
    // Use the byte length of the copy; [input length] counts UTF-16 units and can differ.
    size_t length = strlen(UTF8Str);
    for (size_t i = 0; i < length; i++)
    {
        UTF8Str[i] = changeValueOf(UTF8Str[i]); // some code here to change the value
    }
    NSString *result = [NSString stringWithUTF8String:UTF8Str];
    free(UTF8Str);
    return result;
}
This reduces overhead, and although you have to free the data you allocated when you are done, it gives you the advantage of not relying on a high level API, improving performance drastically. (The difference between an array set and a dynamic method lookup is ~5 CPU cycles, which means a lot if you are doing any major sort of encryption)
It may also be worth looking into NSMutableData instead of NSString for this kind of task, since a transformation can produce a stray \0 byte in the middle of the result, which would cut a C string short.
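For the `changeValueOf` placeholder above, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The names `caesar_shift_char` and `caesar_shift_in_place` are made up for illustration, and the sketch assumes that only the ASCII letters A-Z and a-z should be shifted. Every other byte, including the bytes of multi-byte UTF-8 characters, is left alone, so this particular shift never produces a \0.

```c
#include <stddef.h>

/* Shift one ASCII letter by `shift` positions, wrapping within its case.
   Non-letters (and any non-ASCII byte) are returned unchanged. */
char caesar_shift_char(char c, int shift)
{
    /* Normalise the shift into 0..25 so that negative shifts work too. */
    int s = ((shift % 26) + 26) % 26;
    if (c >= 'a' && c <= 'z') return (char)('a' + (c - 'a' + s) % 26);
    if (c >= 'A' && c <= 'Z') return (char)('A' + (c - 'A' + s) % 26);
    return c;
}

/* Shift a NUL-terminated buffer in place. */
void caesar_shift_in_place(char *text, int shift)
{
    for (size_t i = 0; text[i] != '\0'; i++) {
        text[i] = caesar_shift_char(text[i], shift);
    }
}
```

Shifting by 3 and then by -3 should give back the original text, which is a handy sanity check.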
I am trying to implement code that converts const char * to NSString. I would like to try multiple encodings in a specified order until I find one that works. Unfortunately, all the initWith... methods on NSString say that the results are undefined if the encoding doesn't work.
In particular, (sometimes) I would like to try first to encode as NSMacOSRomanStringEncoding which never seems to fail. Instead it just encodes gobbledygook. Is there some kind of check I can perform ahead of time? (Like canBeConvertedToEncoding but in the other direction?)
Instead of trying encodings one by one until you find a match, consider asking NSString to help you out here by using +[NSString stringEncodingForData:encodingOptions:convertedString:usedLossyConversion:], which, given string data and some options, may be able to detect the encoding for you, and return it (along with the actual decoded string).
Specifically for your use-case, since you have a list of encodings you'd like to try, the encodingOptions parameter will allow you to pass those encodings in using the NSStringEncodingDetectionSuggestedEncodingsKey.
So, given a C string and some possible encoding options, you might be able to do something like:
NSString *decodeCString(const char *source, NSArray<NSNumber *> *encodings) {
NSData * const cStringData = [NSData dataWithBytesNoCopy:(void *)source length:strlen(source) freeWhenDone:NO];
NSString *result = nil;
BOOL usedLossyConversion = NO;
NSStringEncoding determinedEncoding = [NSString stringEncodingForData:cStringData
encodingOptions:@{NSStringEncodingDetectionSuggestedEncodingsKey: encodings,
NSStringEncodingDetectionUseOnlySuggestedEncodingsKey: @YES}
convertedString:&result
usedLossyConversion:&usedLossyConversion];
/* Decide whether to do anything with `usedLossyConversion` and `determinedEncoding`. */
return result;
}
Example usage:
NSString *result = decodeCString("Hello, world!", @[@(NSShiftJISStringEncoding), @(NSMacOSRomanStringEncoding), @(NSASCIIStringEncoding)]);
NSLog(@"%@", result); // => "Hello, world!"
If you don't 100% care about using only the list of encodings you want to try, you can drop the NSStringEncodingDetectionUseOnlySuggestedEncodingsKey option.
One thing to note about the encoding array you pass in: although the documentation doesn't promise that the suggested encodings are attempted in order, spelunking through the disassembly of the (current) method implementation shows that the array is enumerated using fast enumeration (i.e., in order). I can imagine that this could change in the future (or have been different in the past) so if this is somehow a hard requirement for you, you could theoretically work around it by repeatedly calling +stringEncodingForData:encodingOptions:convertedString:usedLossyConversion: one encoding at a time in order, but this would likely be incredibly expensive given the complexity of this method.
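If the concern is that a permissive encoding such as Mac OS Roman accepts anything, one cheap check that can be done ahead of time on the raw bytes is whether they are well-formed UTF-8. The sketch below is plain C and is not part of the NSString API; `is_valid_utf8` is a hypothetical helper name. It only answers "could this be UTF-8?", so it complements rather than replaces the detection method described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the first `len` bytes of `s` are well-formed UTF-8:
   correct lead/continuation bytes, no overlong forms, no surrogates,
   and no code points above U+10FFFF. */
bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        size_t extra;        /* continuation bytes expected after b */
        unsigned long min;   /* smallest code point allowed for this length */
        unsigned long cp;    /* decoded code point */

        if (b < 0x80) { i++; continue; }   /* plain ASCII */

        if ((b & 0xE0) == 0xC0)      { extra = 1; min = 0x80;    cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; min = 0x800;   cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; min = 0x10000; cp = b & 0x07; }
        else return false;   /* stray continuation byte or invalid lead byte */

        if (len - i - 1 < extra) return false;   /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                       /* overlong form */
        if (cp > 0x10FFFF) return false;                  /* out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   /* surrogate */

        i += extra + 1;
    }
    return true;
}
```

A byte string that fails this check cannot be UTF-8, so it is reasonable to move on to the next candidate encoding; one that passes is very likely UTF-8, although pure ASCII input passes for many encodings at once.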
I need to allocate lots of NSString objects from cStrings (which come that way from a database), as fast as possible. stringWithCString:encoding: and the like are just too slow - about 10-15 times slower compared to allocating a cString.
However, creating a NSString with a NSString is getting pretty close to cString allocation (about 1.2s for 1M allocations). EDIT: Fixed alloc to use a copy of the string.
const char *n;
const char *s = "Office für iPad: Steve Ballmer macht Hoffnung";
NSString *str = [NSString stringWithUTF8String:s];
int len = strlen(s);
for (int i = 0; i<10000000; i++) {
NSString *s = [[NSString alloc] initWithString:[str copy]];
s = s;
}
cString allocation test (also about 1s for 1M allocations):
for (int i = 0; i<10000000; i++) {
n = malloc(len);
memccpy((void*)n, s, 0, len) ;
n = n;
free(n);
}
But as I said, using stringWithCString and the likes is an order of magnitude slower. The fastest I could get was using initWithBytesNoCopy (about 8s, therefore 8 times slower compared to stringWithString):
NSString *so = [[NSString alloc] initWithBytesNoCopy:(void*)n length:len encoding:NSUTF8StringEncoding freeWhenDone:YES];
So, is there another magic way to make allocations from cStrings faster? I wouldn't even rule out subclassing NSString (and yes, I know it's a class cluster).
EDIT: In instruments I see that NSString's call to CFStringUsingByteStream3 is the root issue.
EDIT 2: According to Instruments, the root issue is __CFFromUTF8. Just looking at the sources [1], this does indeed seem to be quite inefficient and to handle some legacy cases.
https://www.opensource.apple.com/source/CF/CF-476.17/CFBuiltinConverters.c?txt
This seems to me to not be a fair test.
cString allocation test looks to be allocating a byte array and copying data. I can't tell for sure because the variable definitions are not included.
NSString *s = [[NSString alloc] initWithString:str]; is taking an existing NSString (data already in the correct format) and maybe just increments the retain count. Even if a copy is forced the data is still already in the correct encoding and just needs to be copied.
[NSString stringWithUTF8String:s]; has to handle the UTF8 encoding and convert from one encoding (UTF8) to the internal NSString/CFString encoding. The method being used (CFStringUsingByteStream3) has support for multiple encodings (UTF8/UTF16/UTF32/others). A specialized UTF8-only method could be faster, but that raises the question of whether this is really a performance problem or just an exercise.
You can see the source code for CFStringUsingByteStream3 in this file.
As per my comment, and Brian's answer, I think the problem here is that to create NSStrings you're having to parse the UTF-8 strings. So the question arises: do you really need to parse them, then?
If parsing-on-demand is an option then I'd suggest you write a proxy that can impersonate NSString with an interface along the lines of:
@interface BJLazyUTF8String: NSProxy
- (id)initWithBytes:(const char *)bytes length:(size_t)length;
@end
So it's not a subclass of NSString and it doesn't try to provide any real functionality. Inside the init just keep the bytes, e.g. as _bytes, doing whatever is correct for your C memory ownership. Then:
- (NSString *)bjRealString
{
// we'd better create the NSString if we haven't already
if(!_string)
_string = [NSString stringWithUTF8String:_bytes];
return _string;
}
- (void)forwardInvocation:(NSInvocation *)anInvocation
{
// if this is invoked then someone is trying to
// make a call to what they think is a string;
// let's forward that call to a string so that
// it does what they expect
[anInvocation setTarget:[self bjRealString]];
[anInvocation invoke];
}
- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
return [[self bjRealString] methodSignatureForSelector:aSelector];
}
You can then do:
NSString *myString = [[BJLazyUTF8String alloc] initWithBytes:... length:...];
And subsequently treat myString exactly as though it were an NSString.
Microbenchmarks are a great distraction, but rarely useful. In this case, though, there is validity.
Assuming, for the moment, that you've actually measured string creation as being a real source of performance issues, then the real problem can be better expressed as "how do I reduce memory bandwidth?", because that is really where your problems lie; you are causing tons and tons of data to be copied into freshly allocated buffers.
As you've discovered, the fastest you can go is to not copy at all. initWithBytesNoCopy:... exists exactly to solve this case. Thus, you'll want to create a data construct that holds the original string buffer and manages all the NSString instances that point to it as one cohesive unit.
Without thinking it through in detail, you could likely encapsulate the raw buffer in an NSData instance, then use associated objects to create a strong reference from your string instances to that NSData instance. That way, the NSData (and associated memory) will be deallocated when the last string is deallocated.
With the additional detail that this is for a CoreData-esque ORM layer (and, no, I'm not going to suggest yer doin' it wrong because your description really does sound like you need that level of control), then it would seem that your ORM layer would be the ideal place to manage these strings as described above.
I'd also encourage you to investigate something like FMDB to see if it can provide both the encapsulation you need and the flexibility to add your additional features (and the hooks to make it fast).
I would like to programmatically receive a JIRA ticket number, like @"ART-235", and obtain the bare digits / number, @"235".
A question I asked about using regular expressions turned up Regular expressions in an Objective-C Cocoa application with a link to https://developer.apple.com/library/ios/documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html, and it looks indeed like I can have a regular expression such as \D*?(\d+) and retrieve the value via a regular expression.
However, I wanted to check in and ask if there is a less bletcherous way to do this, or is this an example of why Objective-C is called a bit archaic? The second link gives what looks like everything I need, but it smells a little funny. For the objective stated above, do I want to use regular expressions, or is there a more nicely idiomatic way to perform this sort of string manipulation?
Sounds like -componentsSeparatedByString: would do what you need.
Getting pieces of a fixed, known, format that doesn't use paired delimiters or nesting is exactly the kind of thing that regexes are made to do. I don't see a thing wrong with using one here.
To address your question as written (about "iteration"), however, you might want to look at NSScanner, which does move through the characters of a string by "character class", allowing you to evaluate them as you go.
NSString * ticket = @"ART-235";
NSScanner * scanner = [NSScanner scannerWithString:ticket];
[scanner scanUpToCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
intoString:nil];
// As an integer
NSInteger ticketNumber;
[scanner scanInteger:&ticketNumber];
// Or as a string (use this instead of scanInteger: above, not after it)
NSString * ticketString;
[scanner scanCharactersFromSet:[NSCharacterSet decimalDigitCharacterSet]
intoString:&ticketString];
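The same "skip up to the first digit, then read the digits" idea can be sketched in plain C, in case the ticket arrives as a C string. The function name `ticket_number` is an assumption made for this example, and overflow of very long digit runs is not handled.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Skip everything up to the first decimal digit, then parse the run of
   digits that follows. Returns false if the string contains no digits. */
bool ticket_number(const char *ticket, long *out)
{
    const char *p = ticket;
    while (*p != '\0' && !isdigit((unsigned char)*p)) {
        p++;
    }
    if (*p == '\0') return false;   /* no digits anywhere */
    *out = strtol(p, NULL, 10);     /* stops at the first non-digit */
    return true;
}
```

Because the hyphen is skipped along with the letters, "ART-235" yields 235 rather than -235.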
Like other answers have already said: that simple case can be solved using componentsSeparatedByString:@"-".
That said, your original question is how to enumerate individual characters.
Not all characters are the same size; some languages combine more than one character into a new character. When enumerating such a string you most likely want the result of that composition, not the individual pieces. In Objective-C you can enumerate these composed characters like this:
NSString *myString = @"Hello Strings!";
[myString enumerateSubstringsInRange:NSMakeRange(0, myString.length)
options:NSStringEnumerationByComposedCharacterSequences
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
// Do something with the composed character
NSLog(@"%@", substring);
}];
The example above will log each character one by one.
I made a simple method for you that does the trick, provided that the ticket identifiers will always be in a "string-number" format!
-(int) numberFromJiraTicket:(NSString*)ticketId
{
//Get number as string
NSString *number = [[ticketId componentsSeparatedByString:@"-"] lastObject];
//Return the INT representation of the number
return [number intValue];
}
I'm getting an EXC_BAD_ACCESS error, and it's because of this part of the code. Basically, I take an input and do some work on it. After multiple inputs, it throws the error. Am I doing something wrong with my memory here? I'd post the rest of the code, but it's rather long -- and I think this may be where my problem lies (it's where Xcode points me, at least).
-(IBAction) findShows: (id) clicked
{
char urlChars[1000];
[self getEventURL: urlChars];
NSString * theUrl = [[NSString alloc] initWithFormat:@"%s", urlChars];
NSData *data = [NSData dataWithContentsOfURL:[NSURL URLWithString:theUrl]];
int theLength = [data length];
NSString *content = [NSString stringWithUTF8String:[data bytes]];
char eventData[[data length]];
strcpy(eventData, [content UTF8String]);
[self parseEventData: eventData dataLength: theLength];
[whatIsShowing setStringValue:@"Showing events by this artist"];
}
When a crash occurs, there will be a backtrace.
Post it.
Either your program will break in the debugger, and the call stack will be in the debugger UI (or you can type 'bt
With that, the cause of the crash is often quite obvious. Without that, we are left to critique the code.
So, here goes....
char urlChars[1000];
[self getEventURL: urlChars];
This is, at best, a security hole and, at worst, the source of your crash. Any time you are going to copy bytes into a buffer, there should be some way to (a) limit the # of bytes copied in (pass the length of the buffer) and (b) return the # of bytes copied (0 for failure or no bytes copied).
Given the above, what happens if there are 1042 bytes copied into urlChars by getEventURL:? boom
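To make points (a) and (b) concrete, here is a minimal C sketch of a copy routine that takes the buffer size and reports how many bytes it wrote. `copy_bounded` is a hypothetical helper, not something from the question's code; a safer getEventURL: could take a length parameter and delegate to something like it.

```c
#include <stddef.h>
#include <string.h>

/* Copy `src` into `dst`, never writing more than `dst_size` bytes
   (including the terminating NUL). Returns the number of bytes copied,
   not counting the NUL, or 0 if `src` does not fit or `dst_size` is 0. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src);
    if (dst_size == 0 || needed >= dst_size) {
        if (dst_size > 0) dst[0] = '\0';   /* leave a valid empty string */
        return 0;
    }
    memcpy(dst, src, needed + 1);          /* +1 copies the NUL as well */
    return needed;
}
```

With a 1000-byte urlChars buffer, a 1042-byte URL would then come back as a 0 return value that the caller can check, instead of overrunning the stack.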
NSString * theUrl = [[NSString alloc] initWithFormat:@"%s", urlChars];
This is making some assumptions about urlChars that will lead to failure. First, it assumes that urlChars is of a proper %s compatible encoding. Secondly, it assumes that urlChars is NULL terminated (and didn't overflow the buffer).
Best to use one of the various NSString methods that create strings directly from the buffer of bytes using a particular encoding. More precise and more efficient.
NSData *data = [NSData dataWithContentsOfURL:[NSURL URLWithString:theUrl]];
I hope this isn't on the main thread... 'cause it'll block if it is and that'll make your app unresponsive on slow/flaky networks.
int theLength = [data length];
NSString *content = [NSString stringWithUTF8String:[data bytes]];
char eventData[[data length]];
strcpy(eventData, [content UTF8String]);
This is about the least efficient possible way of doing this. There is no need to create an NSString instance just to then turn it into a (char *). Just grab the bytes from the data directly.
Also -- are you sure that the data returned is NULL terminated? If not, that strcpy() is gonna blow right past the end of your eventData buffer, corrupting the stack.
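If a NUL-terminated C string really is needed for the parser, one way to get it without the NSString round trip is to copy exactly the known number of bytes and add the terminator yourself. The sketch below is plain C with a made-up helper name, `copy_bytes_as_c_string`; in the question's code it would presumably be called with [data bytes] and [data length].

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated, NUL-terminated copy of `len` raw bytes.
   The caller owns the result and must free() it. Returns NULL if the
   allocation fails. */
char *copy_bytes_as_c_string(const void *bytes, size_t len)
{
    char *out = malloc(len + 1);   /* one extra byte for the terminator */
    if (out == NULL) return NULL;
    if (len > 0) memcpy(out, bytes, len);
    out[len] = '\0';
    return out;
}
```

Allocating on the heap also avoids the variable-length stack array, which can itself overflow the stack for a large download.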
[self parseEventData: eventData dataLength: theLength];
[whatIsShowing setStringValue:@"Showing events by this artist"];
What kind of data are you parsing that you really want to parse the raw bytes? In almost all cases, such data should be of some kind of structured type; XML or, even, HTML. If so, there is no need to drop down to parsing the raw bytes. (Not that raw data is unheard of -- just odd).
The bytes you get from [content UTF8String] could conceivably be different in number from the value of [data length]. Try using strncpy() instead and see if that still crashes. (It's also possible that getEventURL: sometimes fails to return a string in the format expected, but that's impossible to tell without the source to that method.)
Is it possible that the string contained in urlChars sometimes comes back non-NULL-terminated? You might want to try zeroing out the array, for example using bzero.
Additionally, there are a bunch of techniques for debugging EXC_BAD_ACCESS. Since you're doing a lot of pure C string manipulation, the usual method of turning on NSZombieEnabled may or may not help you (though I recommend turning it on regardless). Another technique you can try is recovering a previous stack frame using GDB. See my previous answer to a similar question if you're interested.
In my opinion the code is too complex. Do not resort to plain C arrays and strings unless you absolutely have to, they are harder to get right. (It’s no rocket science, but if you play with guns all the time, you will shoot yourself in the foot sooner or later.) Even if you insist on parsing plain C strings, isolate the code using the function interface:
// Callers have to mess with char*.
- (void) parseEventData: (char*) data {…}
// Callers can stay in the Objective-C land.
- (void) parseEventData: (NSString* or NSData*) data {
char *unwrappedData = …;
…
}
I’d certainly think twice before I used strcpy in my code. And I think you are leaking theUrl (although that should not cause EXC_BAD_ACCESS in this case). As for the bug itself, you might be hanging on to parts of urlChars or eventData, and when those stack-based variables disappear, you cause the segfault?
I want to replace multiple elements in my string in Objective-C.
In PHP you can do this:
str_replace(array("itemtoreplace", "anotheritemtoreplace", "yetanotheritemtoreplace"), "replacedValue", $string);
However, in Objective-C the only method I know is NSString's stringByReplacingOccurrencesOfString:withString:. Is there any efficient way to replace multiple strings?
This is my current solution (very inefficient and.. well... long)
NSString *newTitle = [[[itemTitleField.text stringByReplacingOccurrencesOfString:@"'" withString:@""] stringByReplacingOccurrencesOfString:@" " withString:@"'"] stringByReplacingOccurrencesOfString:@"^" withString:@""];
See what I mean?
Thanks,
Christian Stewart
If this is something you're regularly going to do in this program or another program, maybe make a method or conditional loop to pass the original string, and multi-dimensional array to hold the strings to find / replace. Probably not the most efficient, but something like this:
// Original String
NSString *originalString = @"My^ mother^ told me not to go' outside' to' play today. Why did I not listen to her?";
// Method Start
// MutableArray of String-pairs Arrays
NSMutableArray *arrayOfStringsToReplace = [NSMutableArray arrayWithObjects:
[NSArray arrayWithObjects:@"'",@"",nil],
[NSArray arrayWithObjects:@" ",@"'",nil],
[NSArray arrayWithObjects:@"^",@"",nil],
nil];
// For or while loop to Find and Replace strings
while ([arrayOfStringsToReplace count] >= 1) {
originalString = [originalString stringByReplacingOccurrencesOfString:[[arrayOfStringsToReplace objectAtIndex:0] objectAtIndex:0]
withString:[[arrayOfStringsToReplace objectAtIndex:0] objectAtIndex:1]];
[arrayOfStringsToReplace removeObjectAtIndex:0];
}
// Method End
Output:
2010-08-29 19:03:15.127 StackOverflow[1214:a0f] My'mother'told'me'not'to'go'outside'to'play'today.'Why'did'I'not'listen'to'her?
There is no more compact way to write this with the Cocoa frameworks. It may appear inefficient from a code standpoint, but in practice this sort of thing is probably not going to come up that often, and unless your input is extremely large and you're doing this incredibly frequently, you will not suffer for it. Consider writing these on three separate lines for readability versus chaining them like that.
You can always write your own function if you're doing something performance-critical that requires batch replace like this. It would even be a fun interview question. :)
Have you considered writing your own method? Tokenize the string and iterate through the tokens, replacing them one by one; there really is no faster way than O(n) to replace words in a string.
Would be a single for loop at most.
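For the specific replacements in the question (drop apostrophes and carets, turn spaces into apostrophes), every pattern is a single character, so the single loop can be sketched in plain C as below. The name `clean_title_in_place` is invented for this example, and the approach only works this simply because no pattern is longer than one character.

```c
#include <stddef.h>

/* Rewrite `text` in place in one pass: apostrophes and carets are dropped,
   spaces become apostrophes. Returns the new length. */
size_t clean_title_in_place(char *text)
{
    size_t out = 0;   /* write position; never runs ahead of the read position */
    for (size_t in = 0; text[in] != '\0'; in++) {
        char c = text[in];
        if (c == '\'' || c == '^') continue;    /* dropped characters */
        text[out++] = (c == ' ') ? '\'' : c;    /* space -> apostrophe */
    }
    text[out] = '\0';
    return out;
}
```

This gives the same result as the three chained replacements because the apostrophes created from spaces are never scanned again, which matches the order of the original calls.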
Add the @ to the start of all the strings, as in
withString:@""
It's missing for a few.