What does stringWithUTF8String do? - objective-c

I searched around to understand what my code is doing, but I couldn't find an explanation of what this one specific line does:
NSString* name = [NSString stringWithUTF8String:countryName];
I know what the rest does (I only had to google how to do this part). It is supposed to take my char* (countryName) and turn it into an NSString so that later on I can compare it using
isEqualToString:
I would just like to know what that line is actually doing to the char*, and what does UTF8String even mean?
I have barely any Objective-C programming experience, so any feedback is helpful :D

You are not totally right.
According to the documentation, this method
Returns a string created by copying the data from a given C array of UTF8-encoded bytes.
So the UTF-8 string here is just a C array of bytes.
See the NSString class reference for details.

It doesn't do anything to the char * string. It's just the input to the method. stringWithUTF8String takes a C-style string (in UTF-8 encoding), and creates an NSString using it as a template.
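To make "a C array of UTF-8 encoded bytes" concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). The helper name utf8_code_point_count is made up for illustration and is not part of Foundation; it assumes its input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Counts Unicode code points in a NUL-terminated UTF-8 C string by
   skipping continuation bytes (those of the form 10xxxxxx).
   Hypothetical helper; assumes the input is well-formed UTF-8. */
size_t utf8_code_point_count(const char *s) {
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != 0; ++p) {
        if ((*p & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the C string "caf\xC3\xA9" ("café"), strlen reports 5 bytes while this helper reports 4 code points. That gap between bytes and characters is why the method needs to know the encoding of the char* it is given.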

Related

How to discover if a c-string can be encoded to NSString with a given encoding

I am trying to implement code that converts const char * to NSString. I would like to try multiple encodings in a specified order until I find one that works. Unfortunately, all the initWith... methods on NSString say that the results are undefined if the encoding doesn't work.
In particular, (sometimes) I would like to try first to encode as NSMacOSRomanStringEncoding which never seems to fail. Instead it just encodes gobbledygook. Is there some kind of check I can perform ahead of time? (Like canBeConvertedToEncoding but in the other direction?)
Instead of trying encodings one by one until you find a match, consider asking NSString to help you out here by using +[NSString stringEncodingForData:encodingOptions:convertedString:usedLossyConversion:], which, given string data and some options, may be able to detect the encoding for you, and return it (along with the actual decoded string).
Specifically for your use-case, since you have a list of encodings you'd like to try, the encodingOptions parameter will allow you to pass those encodings in using the NSStringEncodingDetectionSuggestedEncodingsKey.
So, given a C string and some possible encoding options, you might be able to do something like:
#import <Foundation/Foundation.h>

NSString *decodeCString(const char *source, NSArray<NSNumber *> *encodings) {
    // Wrap the C string's bytes without copying them; the caller keeps ownership.
    NSData * const cStringData = [NSData dataWithBytesNoCopy:(void *)source length:strlen(source) freeWhenDone:NO];
    NSString *result = nil;
    BOOL usedLossyConversion = NO;
    NSStringEncoding determinedEncoding = [NSString stringEncodingForData:cStringData
                                                           encodingOptions:@{NSStringEncodingDetectionSuggestedEncodingsKey: encodings,
                                                                             NSStringEncodingDetectionUseOnlySuggestedEncodingsKey: @YES}
                                                           convertedString:&result
                                                       usedLossyConversion:&usedLossyConversion];
    /* Decide whether to do anything with `usedLossyConversion` and `determinedEncoding`. */
    return result;
}
Example usage:
NSString *result = decodeCString("Hello, world!", @[@(NSShiftJISStringEncoding), @(NSMacOSRomanStringEncoding), @(NSASCIIStringEncoding)]);
NSLog(@"%@", result); // => "Hello, world!"
If you don't 100% care about using only the list of encodings you want to try, you can drop the NSStringEncodingDetectionUseOnlySuggestedEncodingsKey option.
One thing to note about the encoding array you pass in: although the documentation doesn't promise that the suggested encodings are attempted in order, spelunking through the disassembly of the (current) method implementation shows that the array is enumerated using fast enumeration (i.e., in order). I can imagine that this could change in the future (or have been different in the past) so if this is somehow a hard requirement for you, you could theoretically work around it by repeatedly calling +stringEncodingForData:encodingOptions:convertedString:usedLossyConversion: one encoding at a time in order, but this would likely be incredibly expensive given the complexity of this method.
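For UTF-8 specifically, an "ahead of time" check is possible without Foundation, because UTF-8 has a strict byte structure. The following is a minimal sketch in plain C; is_valid_utf8 is a hypothetical helper name, not an NSString API. No equivalent check exists for encodings such as Mac OS Roman, where nearly every byte sequence decodes to something.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the len bytes at s form well-formed UTF-8:
   no overlong forms, no surrogates, nothing above U+10FFFF.
   Hypothetical helper for illustration. */
bool is_valid_utf8(const unsigned char *s, size_t len) {
    assert(s != NULL || len == 0);
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i];
        size_t extra;       /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */
        if (b < 0x80) { i++; continue; }  /* ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */
        if (len - i <= extra) return false;  /* sequence is truncated */
        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                      /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return false;                 /* beyond Unicode */
        i += extra + 1;
    }
    return true;
}
```

A caller could run this on the raw bytes first and fall back to other encodings only when it returns false. It only says that the bytes are structurally valid UTF-8, not that UTF-8 was the encoding the author intended.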

Objective-C equivalent of Swift "\(variable)"

The question title says it all, really. In Swift you use "\()" for string interpolation of a variable. How does one do it in Objective-C?
There is no direct equivalent. The closest you will get is using a string format.
NSString *text = @"Tomiris";
NSString *someString = [NSString stringWithFormat:@"My name is %@", text];
Swift supports this as well:
let text = "Tomiris"
let someString = String(format: "My name is %@", text)
Of course when you use a format string like this (in either language), the biggest issue is that you need to use the correct format specifier for each type of variable. Use %@ for object pointers. Use %d for integer types, etc. It's all documented.
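The specifiers other than the object specifier come from C's printf family, so the matching rule can be sketched in plain C (which also compiles as Objective-C). The function name format_intro and the sample values are made up for illustration; %s is used for the name because plain C has no object specifier.

```c
#include <assert.h>
#include <stdio.h>

/* Writes "My name is <name> and I am <age>" into buf.
   %s must be paired with a C string and %d with an int; a mismatch is
   undefined behavior. Returns what snprintf returns: the length the full
   text needs, not counting the terminating NUL. Hypothetical helper. */
int format_intro(char *buf, size_t size, const char *name, int age) {
    assert(buf != NULL && name != NULL);
    return snprintf(buf, size, "My name is %s and I am %d", name, age);
}
```

With a 64-byte buffer, format_intro(buf, sizeof buf, "Tomiris", 30) fills buf with "My name is Tomiris and I am 30".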
@rmaddy's answer is the gist of it. I just wanted to follow up on his comment that "It's all documented". Symbols like %@ and %d are called string format specifiers, and the documentation can be found at the following links.
Formatting String Objects
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Strings/Articles/FormatStrings.html
String Format Specifiers
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Strings/Articles/formatSpecifiers.html#//apple_ref/doc/uid/TP40004265-SW1
Just telling us noobs "It's all documented" isn't very helpful, because often (if you're like me) you googled your way to this Stack Overflow post at the top of the search results and followed the link in hopes of finding the original documentation!

How to access a character in NSMutableString Objective-C

I have an instance of NSMutableString called MyMutableStr and I want to access its character at index 7.
For example:
unsigned char cMy = [(NSString*) MyMutableStr characterAtIndex:7];
I think this is an ugly way; it's too much code.
My question is: is there a simpler way in Objective-C to access a character in an NSMutableString?
For example, in C we can access a character of a string using the [ ] operator:
unsigned char cMy = MyMutableStr[7];
The way to do it is characterAtIndex:, but you don't need to cast to an NSString pointer, since NSMutableString is a subclass of NSString. So it isn't that long, but if you still don't find it comfortable, I suggest using UTF8String to obtain a C string that you can index with the brackets operator (note that this indexes UTF-8 bytes, which correspond to characters only for ASCII text):
const char* cString= [MyMutableStr UTF8String];
char first= cString[0];
But remember this (taken from NSString class reference):
The returned C string is automatically freed just as a returned object would be released; you should copy the C string if it needs to store it outside of the autorelease context in which the C string is created.
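The "copy the C string" advice from that quote can be sketched in plain C (which also compiles as Objective-C). The helper name copy_c_string is made up for illustration; the copy lives on the heap, so it outlives the pointer returned by UTF8String.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of a NUL-terminated C string, or NULL if
   allocation fails. The caller owns the result and must free() it.
   Hypothetical helper for illustration. */
char *copy_c_string(const char *source) {
    assert(source != NULL);
    size_t size = strlen(source) + 1;  /* include the terminating NUL */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, source, size);
    }
    return copy;
}
```

In Objective-C code this would be called as copy_c_string([MyMutableStr UTF8String]), with a matching free() once the copy is no longer needed.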
As others said, use characterAtIndex:, but there are a few things you might want to consider carefully.
First, you're dealing with a mutable string. You want to be careful to avoid it changing out from under you. One way is to make an immutable copy and use that for the operation.
Second, you're dealing with Unicode so you may want to consider normalizing your string to get a precomposed form as some visual representations may be more than one actual unichar. That's often a stumbling block for folks.

Unfamiliar C syntax in Objective-C context

I am coming to Objective-C from C# without any intermediate knowledge of C. (Yes, yes, I will need to learn C at some point and I fully intend to.) In Apple's Certificate, Key, and Trust Services Programming Guide, there is the following code:
static const UInt8 publicKeyIdentifier[] = "com.apple.sample.publickey\0";
static const UInt8 privateKeyIdentifier[] = "com.apple.sample.privatekey\0";
I have an NSString that I would like to use as an identifier here and for the life of me I can't figure out how to get that into this data structure. Searching through Google has been fruitless also. I looked at the NSString Class Reference and looked at the UTF8String and getCharacters methods but I couldn't get the product into the structure.
What's the simple, easy trick I'm missing?
Those are C strings: Arrays (not NSArrays, but C arrays) of characters. The last character is a NUL, with the numeric value 0.
“UInt8” is the CoreServices name for an unsigned octet, which (on Mac OS X) is the same as an unsigned char.
static means that the array is specific to this file (if it's in file scope) or persists across function calls (if it's inside a method or function body).
const means just what you'd guess: You cannot change the characters in these arrays.
\0 is a NUL, but including it explicitly in a "" literal as shown in those examples is redundant. A "" literal (without the @) is NUL-terminated anyway.
C doesn't specify an encoding. On Mac OS X, it's generally something ASCII-compatible, usually UTF-8.
To convert an NSString to a C-string, use UTF8String or cStringUsingEncoding:. To have the NSString extract the C string into a buffer, use getCString:maxLength:encoding:.
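The point about the redundant \0 can be checked with sizeof. This is a minimal sketch in plain C (also valid Objective-C); the array and function names are made up for illustration, and unsigned char stands in for UInt8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the Apple sample, with the explicit \0 kept. */
static const unsigned char with_explicit_nul[] = "com.apple.sample.publickey\0";
/* The literal already ends in a NUL, so this array is one byte shorter. */
static const unsigned char without_explicit_nul[] = "com.apple.sample.publickey";

/* Bytes stored in each array, counting every NUL. */
size_t size_with_explicit_nul(void)    { return sizeof with_explicit_nul; }
size_t size_without_explicit_nul(void) { return sizeof without_explicit_nul; }

/* Length as a C string: counting stops at the first NUL. */
size_t c_string_length(const unsigned char *bytes) {
    assert(bytes != NULL);
    return strlen((const char *)bytes);
}
```

The identifier is 26 characters long, so the first array occupies 28 bytes (two NULs) and the second 27 (one NUL), while c_string_length reports 26 for both.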
I think some people are missing the point here. Everyone has explained the two constant arrays that are being set up for the tags, but if you want to use an NSString, you can simply add it to the attribute dictionary as-is. You don't have to convert it to anything. For example:
NSString *publicTag = @"com.apple.sample.publickey";
NSString *privateTag = @"com.apple.sample.privatekey";
The rest of the example stays exactly the same. In this case, there is no need for the C string literals at all.
Obtaining a char* (C string) from an NSString isn't the tricky part. (BTW, I'd also suggest UTF8String, it's much simpler.) The Apple-supplied code works because it's assigning a C string literal to the static const array variables. Assigning the result of a function or method call to a const will probably not work.
I recently answered an SO question about defining a constant in Objective-C, which should help your situation. You may have to compromise by getting rid of the const modifier. If it's declared static, you at least know that nobody outside the compilation unit where it's declared can reference it, so just make sure you don't let a reference to it "escape" such that other code could modify it via a pointer, etc.
However, as @Jason points out, you may not even need to convert it to a char* at all. The sample code creates an NSData object for each of these strings. You could just do something like this within the code (replacing steps 1 and 3):
NSData* publicTag = [@"com.apple.sample.publickey" dataUsingEncoding:NSUnicodeStringEncoding];
NSData* privateTag = [@"com.apple.sample.privatekey" dataUsingEncoding:NSUnicodeStringEncoding];
That sure seems easier to me than dealing with the C arrays if you already have an NSString.
Try this:
NSString *newString = @"This is a test string.";
const char *theString;
theString = [newString cStringUsingEncoding:[NSString defaultCStringEncoding]];

Raw strings like Python's in Objective-C

Does Objective-C have raw strings like Python's?
Clarification: a raw string doesn't interpret escape sequences like \n: both the slash and the "n" are separate characters in the string. From the linked Python tutorial:
>>> print 'C:\some\name' # here \n means newline!
C:\some
ame
>>> print r'C:\some\name' # note the r before the quote
C:\some\name
Objective-C is a superset of C. So, the answer is yes. You can write
char* string="hello world";
anywhere. You can then turn it into an NSString later by
NSString* nsstring=[NSString stringWithUTF8String:string];
From your link explaining what you mean by "raw string", the answer is: there is no built in method for what you are asking.
However, you can replace occurrences of one string with another string, so you can replace @"\n" with @"\\n", for example. That should get you close to what you're seeking.
You can use a stringizing macro.
#define MAKE_STRING(x) @#x
NSString *expandedString = MAKE_STRING(
hello world
"even quotes will be escaped"
);
The preprocessed result is
NSString *expandedString = @"hello world \"even quotes will be escaped\"";
As you can see, double quotes are escaped; however, newlines are collapsed into single spaces.
This feature is very suitable to paste some JS code in Objective-C files. Using this feature is safe if you are using C99.
source:
https://gcc.gnu.org/onlinedocs/cpp/Stringizing.html
How, exactly, does the double-stringize trick work?
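The same stringizing behavior can be observed in plain C by dropping the @ from the macro. This is a small sketch (also valid Objective-C); kSnippet and snippet_equals are names made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Plain C form of the macro: #x turns the argument's tokens into a string literal. */
#define MAKE_STRING(x) #x

/* Runs of whitespace between tokens collapse to a single space, and the
   quotes of the embedded string literal are escaped automatically. */
static const char *const kSnippet =
    MAKE_STRING(hello    world "even quotes will be escaped");

/* Returns nonzero when the stringized text equals the expected C string. */
int snippet_equals(const char *expected) {
    assert(expected != NULL);
    return strcmp(kSnippet, expected) == 0;
}
```

Here kSnippet compares equal to "hello world \"even quotes will be escaped\"", which shows both the whitespace collapsing and the automatic escaping of the inner quotes.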
Like everyone said, raw ANSI strings are very easy. Just use simple C strings, or C++ std::string if you feel like compiling Objective-C++.
However, the native string format of Cocoa is UTF-16 (often loosely called UCS-2), built from 2-byte code units. NSString presents its contents as an array of unsigned short, i.e. 16-bit code units, whatever its internal storage. (Just like in Win32 and in Java, by the way.) The systemwide aliases for that datatype are unichar and UniChar. Here's where things become tricky.
GCC includes a wchar_t datatype, and lets you define a raw wide-char string constant like this:
wchar_t *ws = L"This a wide-char string.";
However, by default, this datatype is defined as 4-byte int and therefore is not the same as Cocoa's unichar! You can override that by specifying the following compiler option:
-fshort-wchar
but then you lose the wide-char C RTL functions (wcslen(), wcscpy(), etc.) - the RTL was compiled without that option and assumes 4-byte wchar_t. It's not particularly hard to reimplement these functions by hand. Your call.
Once you have truly 2-byte wchar_t raw strings, you can trivially convert them to NSStrings and back:
wchar_t *ws = L"Hello";
NSString *s = [NSString stringWithCharacters:(const unichar*)ws length:5];
Unlike all other [stringWithXXX] methods, this one does not involve any codepage conversions.
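An alternative that avoids -fshort-wchar is to convert the wide string to 16-bit units explicitly. The following is a minimal sketch in plain C (also valid Objective-C); wide_to_utf16 is a hypothetical helper, and it assumes each wchar_t holds a Unicode code point when wchar_t is 32 bits wide, or a UTF-16 code unit when it is 16 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a NUL-terminated wchar_t string to UTF-16 code units.
   Writes at most capacity units to dst (no terminator) and returns the
   number of units written, or (size_t)-1 if dst is too small.
   Hypothetical helper; see the assumptions stated above. */
size_t wide_to_utf16(const wchar_t *src, uint16_t *dst, size_t capacity) {
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (size_t i = 0; src[i] != L'\0'; i++) {
        uint32_t c = (uint32_t)src[i];
        if (c <= 0xFFFF) {
            if (out + 1 > capacity) return (size_t)-1;
            dst[out++] = (uint16_t)c;
        } else {
            /* Code points above U+FFFF need a surrogate pair in UTF-16. */
            if (out + 2 > capacity) return (size_t)-1;
            c -= 0x10000;
            dst[out++] = (uint16_t)(0xD800 | (c >> 10));
            dst[out++] = (uint16_t)(0xDC00 | (c & 0x3FF));
        }
    }
    return out;
}
```

The resulting buffer and count have the shape that stringWithCharacters:length: expects, since unichar is a 16-bit unsigned type, and the standard wide-char library functions stay usable because wchar_t itself is unchanged.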
Objective-C is a strict superset of C so you are free to use char * and char[] wherever you want (if that's what you call raw strings).
If you mean C-style strings, then yes.