iPhone Mail and special characters - Objective-C

In my iPhone app, I pass email content to the standalone iPhone mail app, but the content is truncated when it contains special characters. It's the same even if I pre-process the content with stringByAddingPercentEscapesUsingEncoding:.

stringByAddingPercentEscapesUsingEncoding: will not escape characters that are valid in a URL, such as &. In this case you need to escape them though because otherwise they would be interpreted as part of the URL's structure (indicating a new parameter) and not as part of the parameter itself. Use CFURLCreateStringByAddingPercentEscapes instead:
NSString *escaped = [(NSString *)CFURLCreateStringByAddingPercentEscapes(kCFAllocatorDefault,
    (CFStringRef)someURLParameter,
    NULL,
    (CFStringRef)@"!*'();:@&=+$,/?%#[]",
    kCFStringEncodingUTF8) autorelease];
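As far as I know, CFURLCreateStringByAddingPercentEscapes has been deprecated since iOS 9, so here is a rough sketch of the same idea using stringByAddingPercentEncodingWithAllowedCharacters:. The helper names EscapeURLParameter and MailtoURLString are made up for illustration, and the code assumes ARC and iOS 7 or later.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: percent-encodes a single URL parameter value,
// escaping the same reserved characters as the legal-characters-to-escape
// argument in the answer above.
static NSString *EscapeURLParameter(NSString *value) {
    NSMutableCharacterSet *allowed =
        [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    [allowed removeCharactersInString:@"!*'();:@&=+$,/?%#[]"];
    return [value stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}

// Hypothetical helper: builds a mailto: URL string whose subject and body
// are escaped individually, so '&' inside the body cannot end the parameter.
static NSString *MailtoURLString(NSString *subject, NSString *body) {
    return [NSString stringWithFormat:@"mailto:?subject=%@&body=%@",
            EscapeURLParameter(subject), EscapeURLParameter(body)];
}
```

The point is to escape each parameter value on its own and only then assemble the URL; escaping the finished URL in one go is what lets '&' slip through.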

Percent escaping characters is only used in URLs. It's not part of the MIME spec.
I don't see why it wouldn't work. Are you sure these are proper UTF-8 characters?! So long as you're passing a string, Mail should package everything in an email itself.

Related

Using Placeholders in a URL string

I have a url that retrieves data from a Web API which looks like this with the entry of "Pizza Hut":
NSString *urlString = @"https://api.nutritionix.com/v1_1/search/Pizza Hut?results=0%3A20&cal_min=0&cal_max=50000&fields=item_name%2Cbrand_name%2Citem_id%2Cbrand_id&appId=MY_APP_ID&appKey=MY_APP_KEY";
This URL will return all the menu items of Pizza Hut.
Now I want to take a step beyond hard coding values, and so I created a text box where users can enter their own restaurant, and the web api should return data.
Here is an example of that:
NSString *urlString = [NSString stringWithFormat:@"https://api.nutritionix.com/v1_1/search/%@?results=0%3A20&cal_min=0&cal_max=50000&fields=item_name%2Cbrand_name%2Citem_id%2Cbrand_id&appId=MY_APP_ID&appKey=MY_APP_KEY", searchText.text];
All I did here was change the "Pizza Hut" to "%@".
However, I get a warning from the compiler saying:
"More '%' conversions than data arguments". As you would expect, the API returns no data, since this code doesn't work.
How would I re-write this string so that I could put the placeholder in there?
You have other percent symbols that need to be escaped properly. You want:
NSString *urlString = [NSString stringWithFormat:@"https://api.nutritionix.com/v1_1/search/%@?results=0%%3A20&cal_min=0&cal_max=50000&fields=item_name%%2Cbrand_name%%2Citem_id%%2Cbrand_id&appId=MY_APP_ID&appKey=MY_APP_KEY", searchText.text];
Basically, add a 2nd % symbol before all of the % symbols that you actually want to appear in the string.
BTW - make sure you properly escape the search text so special characters (such as spaces) are properly encoded.
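To tie both points together, here is a minimal sketch that doubles the literal percent signs and also encodes the search text before it goes into the path. The function name SearchURLString is hypothetical, and the allowed set is my own choice (only the RFC 3986 "unreserved" characters), not something the API documentation prescribes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds the search URL for user-entered text.
static NSString *SearchURLString(NSString *searchText) {
    // Allow only RFC 3986 "unreserved" characters; everything else,
    // including spaces and slashes, is percent-encoded.
    NSCharacterSet *unreserved = [NSCharacterSet characterSetWithCharactersInString:
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"];
    NSString *encoded =
        [searchText stringByAddingPercentEncodingWithAllowedCharacters:unreserved];

    // %@ is the only placeholder; every literal percent sign is doubled (%%).
    return [NSString stringWithFormat:
        @"https://api.nutritionix.com/v1_1/search/%@?results=0%%3A20&cal_min=0&cal_max=50000&fields=item_name%%2Cbrand_name%%2Citem_id%%2Cbrand_id&appId=MY_APP_ID&appKey=MY_APP_KEY",
        encoded];
}
```

You would call it as SearchURLString(searchText.text). Encoding happens before formatting, so the percent signs produced by the encoder are passed as an argument and never interpreted as format specifiers.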

Understanding urls correctly

I'm writing an RSS reader and taking article URLs from feeds, but I often get invalid URLs when parsing with NSXMLParser. Sometimes there are extra characters at the end of the URL (for example \n or \t). I have fixed that issue.
The most difficult problem is URLs with queries that contain characters that must not be percent-encoded.
A working URL for a URL request: http://www.bbc.co.uk/news/education-23809095#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa
The '#' character is replaced with "%23" by the stringByAddingPercentEscapesUsingEncoding: method, and the URL then no longer works: the site says the page was not found. I believe that what follows the '#' character is a query string.
Is there a way to encode any URL from a feed correctly, or at least to always remove the query string from the URL in the XML?
There are two approaches you could use to create a legal URL string: stringByAddingPercentEncodingWithAllowedCharacters:, or the Core Foundation CFURL functions, which give you a whole range of options.
Example 1 (NSCharacterSet):
NSString *nonFormattedURL = @"http://www.bbc.co.uk/news/education-23809095#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa";
NSLog(@"%@", [nonFormattedURL stringByAddingPercentEncodingWithAllowedCharacters:[[NSCharacterSet illegalCharacterSet] invertedSet]]);
This keeps the hash sign in place by inverting illegalCharacterSet from NSCharacterSet. If you'd like more control, you can also create your own mutable set.
Example 2 (CFURL.h):
NSString *nonFormattedURL = @"http://www.bbc.co.uk/news/education-23809095#sa-ns_mchannel=rss&ns_source=PublicRSS20-sa";
CFAllocatorRef allocator = CFAllocatorGetDefault();
CFStringRef formattedURL = CFURLCreateStringByAddingPercentEscapes(allocator,
    (__bridge CFStringRef) nonFormattedURL,
    (__bridge CFStringRef) @"#", // leave unescaped
    (__bridge CFStringRef) @"", // legal characters to be escaped like / = # ? etc
    kCFStringEncodingUTF8); // encoding: must be a CFStringEncoding, not an NSStringEncoding
NSLog(@"%@", (__bridge NSString *) formattedURL);
if (formattedURL) CFRelease(formattedURL); // Create rule: we own the returned string
This does the same as the code above but with much more control: certain characters are replaced with the equivalent percent escape sequence based on the specified encoding; see the log output for an example.
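For the feed case specifically, a sketch along these lines might be enough: trim the stray whitespace, then encode only what is not already legal while leaving '#' and existing percent escapes alone. CleanFeedURLString is a hypothetical name, and treating '%' as allowed assumes the feed never contains a bare percent sign that still needs escaping.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: cleans up a URL string taken from a feed.
static NSString *CleanFeedURLString(NSString *raw) {
    // Drop leading/trailing whitespace such as "\n" and "\t".
    NSString *trimmed = [raw stringByTrimmingCharactersInSet:
        [NSCharacterSet whitespaceAndNewlineCharacterSet]];

    NSMutableCharacterSet *allowed =
        [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    // Keep '#' (fragment separator) and '%' (existing escapes) untouched.
    [allowed addCharactersInString:@"#%"];

    return [trimmed stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}
```

With the BBC URL from the question (plus trailing "\n\t") this should hand back the URL unchanged apart from the trimmed whitespace, while something like "http://example.com/a b" gets its space escaped.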

What are the characters that stringByAddingPercentEscapesUsingEncoding escapes?

I've had to switch from stringByAddingPercentEscapesUsingEncoding to CFURLCreateStringByAddingPercentEscapes because it doesn't escape question marks (?). I'm curious what exactly it does escape, and the rationale behind the partial escaping vs RFC 3986.
Be careful not to leak memory on conversions when using CFStringRef. Here's what I came up with to work with Latin characters, and others. I use this to escape my parameters, not the entire URL. Depending on your use case, you may need to add or remove characters from "escapeChars".
CFStringRef escapeChars = (__bridge CFStringRef)@"%;/?¿:@&=$+,[]#!'()*<>¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ \"\n";
// __bridge (not __bridge_retained) for the input, so url is not over-retained and leaked
NSString *encodedString = (__bridge_transfer NSString *) CFURLCreateStringByAddingPercentEscapes(NULL, (__bridge CFStringRef) url, NULL, escapeChars, kCFStringEncodingUTF8);
I hope this helps.
Some good categories have been created for doing just what you need:
http://iosdevelopertips.com/networking/a-better-url-encoding-method.html
http://www.cocoanetics.com/2009/08/url-encoding/
The rationale for leaving certain characters out is beyond me... except to say that the definition of the function is: Returns a representation of the receiver using a given encoding to determine the percent escapes necessary to convert the receiver into a legal URL string.
To be completely correct, + and & are legal characters within a URL, whereas a space is not. Hence the method will correctly escape a space, but leaves + and & intact.
Reading RFC2396 http://www.ietf.org/rfc/rfc2396.txt - there is a set of reserved and unreserved characters defined. My guess is that none of these characters are escaped by stringByAddingPercentEscapesUsingEncoding.
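If you want to see for yourself which characters a given allowed set would escape, a small sketch like this lists them. Note that it inspects the newer NSCharacterSet-based API (for example URLQueryAllowedCharacterSet), which I am only assuming behaves similarly to stringByAddingPercentEscapesUsingEncoding:; EscapedASCIICharacters is a hypothetical name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the printable ASCII characters that would be
// percent-encoded when `allowed` is passed to
// stringByAddingPercentEncodingWithAllowedCharacters:.
static NSString *EscapedASCIICharacters(NSCharacterSet *allowed) {
    NSMutableString *escaped = [NSMutableString string];
    for (unichar c = 0x20; c < 0x7F; c++) {
        if (![allowed characterIsMember:c]) {
            [escaped appendFormat:@"%C", c];
        }
    }
    return escaped;
}
```

Logging EscapedASCIICharacters([NSCharacterSet URLQueryAllowedCharacterSet]) shows that a space and '#' are in the escaped list while '&', '?' and '+' are not, which matches the behaviour described above.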

Unihan: combining UTF-8 chars

I am using data that involves Chinese Unihan characters in an Objective-C app. I am using a voice recognition program (cmusphinx) that returns a phrase from my data. It returns UTF-8 characters and when returning a Chinese character (which is three bytes) it separates it into three separate characters.
Example: When I want 人, I see: ‰∫∫. This is the proper encoding (E4 BA BA), but my code sees the returned value as three separate characters rather than one.
Actually, my function receives the phrase as an NSString (because of a wrapper), which uses UTF-16. I tried using Objective-C's built-in conversion methods (to UTF-8 and from UTF-16), but these keep my string as three characters.
How can I decode these three separate characters into the one utf-8 codepoint for the Chinese character?
Or how can I properly encode it?
This is code fragment dealing with the cstring returned from sphinx and its encoding to a NSString:
const char * hypothesis = ps_get_hyp(pocketSphinxDecoder, &recognitionScore, &utteranceID);
NSString *hypothesisString = [[NSString alloc] initWithCString:hypothesis encoding:NSMacOSRomanStringEncoding];
Edit: From looking at the addition to your post, you actually do have control over the string encoding. In that case, why are you creating the string with NSMacOSRomanStringEncoding when you're expecting UTF-8? Just change that to NSUTF8StringEncoding.
It sounds like what you're saying is you're being given an NSString that contains UTF-8 data that's being interpreted as a single-byte encoding (e.g. ISO-Latin-1, MacRoman, etc). I'm assuming here that you have no control over the code that creates the NSString, because if you did then the solution is just to change the encoding it's initializing with.
In any case, what you're asking for is a way to take the data in the string and convert it back to UTF-8. You can do this by creating an NSData from the NSString using whatever encoding it was originally created with (you need to know this much, at least, or it won't work), and then you can create a new NSString from the same data using UTF-8.
From the example character you gave (人) it looks like it's being interpreted as MacRoman, so let's go with that. The following code should convert it back:
- (NSString *)fixEncodingOfString:(NSString *)input {
    CFStringEncoding cfEncoding = kCFStringEncodingMacRoman;
    NSStringEncoding encoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding);
    // Recover the original bytes by re-encoding with the wrong encoding
    NSData *data = [input dataUsingEncoding:encoding];
    if (!data) {
        // the string wasn't actually in MacRoman
        return nil;
    }
    // Decode the same bytes as UTF-8 (returns nil if they are not valid UTF-8)
    NSString *output = [[[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding] autorelease];
    return output;
}
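If you do control the line that creates the NSString, the simpler fix from the edit above can be sketched like this. StringFromUTF8Hypothesis is a hypothetical name, and the sketch assumes ARC and that the decoder really returns UTF-8 bytes, as the question says.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decodes the C string from the recognizer as UTF-8
// instead of MacRoman, so a three-byte sequence becomes one character.
static NSString *StringFromUTF8Hypothesis(const char *hypothesis) {
    if (hypothesis == NULL) {
        return nil; // stringWithUTF8String: must not be given NULL
    }
    // Returns nil if the bytes are not valid UTF-8.
    return [NSString stringWithUTF8String:hypothesis];
}
```

Passing the bytes E4 BA BA through this helper gives a string of length 1, whereas decoding the same bytes with NSMacOSRomanStringEncoding gives the three-character string from the question.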

Difference between NSLog and "Print Description" with NSString and UTF-8

I'm confused. I'm parsing a JSON string.
Before parsing, I check the content of the NSString.
In Xcode 4:
When I choose "Print Description" on the NSString variable,
the console shows the value in the \u434 \u433 format of the UTF-8.
When I call NSLog(@"%@", content), the console shows the readable characters of the UTF-8 encoding.
Why is this different? How can I know that the string I got to parse is 100% UTF-8?
Thanks.
If you can see the Cyrillic characters you're looking for, rather than the escapes, through any method, then you're working with a UTF-8 string.
The "-description" method is not what you want to use here. It's more likely to show escaped characters; in particular, any time you store a value in a property list item like an NSArray or NSDictionary, its -description will generally escape any characters other than plain ASCII.
NSLog with %@ is a more reliable guide, because for an NSString it prints the string's contents as-is rather than an escaped representation. If it's showing up in NSLog, it's probably just fine.
If you want to be absolutely sure your string is properly encoded UTF-8, the best way to test it is to display it. Create a text interface element (an NSTextField or UITextField) in your user interface, wire it up, and set your string as the value. If it displays there, it is properly formatted.
Short version: if it shows up in the debugger as escaped characters, it doesn't necessarily mean it's not UTF8. If it's showing up anywhere (including NSLog) with the proper characters, it's probably in the proper encoding. If you want to be sure, set up a test interface element and see how it looks there.
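If you would rather check in code than by eye, one option is to test the raw bytes before they become an NSString, since once you hold an NSString the decoding has already happened. This is a minimal sketch assuming you still have the response as NSData and are using ARC; DataIsValidUTF8 is a hypothetical name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES if the raw bytes decode cleanly as UTF-8.
// initWithData:encoding: returns nil when the bytes are not valid for the
// requested encoding.
static BOOL DataIsValidUTF8(NSData *data) {
    NSString *decoded = [[NSString alloc] initWithData:data
                                              encoding:NSUTF8StringEncoding];
    return decoded != nil;
}
```

A nil result only tells you the bytes are not valid UTF-8; a non-nil result does not prove the sender intended UTF-8, because plain ASCII, for instance, is also valid UTF-8.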