Text encoding problem between NSImage, NSData, and NSXMLDocument - objective-c

I'm attempting to take an NSImage and convert it to a string which I can write in an XML document.
My current attempt looks something like this:
[xmlDocument setCharacterEncoding:@"US-ASCII"];
NSData* data = [image TIFFRepresentation];
NSString* string = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
//Put string inside of NSXMLElement, write out NSXMLDocument.
Reading back in looks something like this:
NSXMLDocument* newXMLDocument = [[NSXMLDocument alloc] initWithData:data options:0 error:outError];
//Here's where it fails. I get:
//Error Domain=NSXMLParserErrorDomain Code=9 UserInfo=0x100195310 "Line 7: Char 0x0 out of allowed range"
First of all, embedding large amounts of binary data in XML is not a good idea, IMHO.
To answer your question, you need an encoding scheme that supports binary data, such as Base64.
See this page for more than one way to represent arbitrary NSData as a Base64-encoded string: http://www.cocoadev.com/index.pl?BaseSixtyFour
UPDATE: The link to Colloquy's NSData additions seems to be broken on that page. Here's the new URL: http://colloquy.info/project/browser/trunk/Additions/NSDataAdditions.m
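On current systems Foundation has Base64 support built in (OS X 10.9 / iOS 7 and later), so a third-party category may not be needed. The following is a minimal sketch, not the code from the linked pages: the helper names XMLSafeStringFromData and DataFromXMLSafeString are made up for illustration, and a short byte array stands in for [image TIFFRepresentation].

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: encode arbitrary bytes as Base64 text, which only
// uses characters that are legal inside an XML element.
static NSString *XMLSafeStringFromData(NSData *data) {
    return [data base64EncodedStringWithOptions:0];
}

// Hypothetical helper: decode the Base64 text back into the original bytes.
// Returns nil if the string is not valid Base64.
static NSData *DataFromXMLSafeString(NSString *string) {
    return [[NSData alloc] initWithBase64EncodedString:string options:0];
}

int main(void) {
    @autoreleasepool {
        // Stand-in for [image TIFFRepresentation]; note the embedded 0x00 byte,
        // which is what the XML parser rejected in the question.
        const unsigned char bytes[] = {0x4D, 0x4D, 0x00, 0x2A, 0xFF};
        NSData *original = [NSData dataWithBytes:bytes length:sizeof(bytes)];

        NSString *encoded = XMLSafeStringFromData(original);
        NSData *decoded = DataFromXMLSafeString(encoded);

        NSLog(@"round trip ok: %d", [decoded isEqualToData:original]);
    }
    return 0;
}
```

The encoded string can then be stored as the string value of an NSXMLElement and decoded again after the document is read back in.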


Initializing NSData with NSData as a string

I have some NSData output that I would like to convert to a string.
NSString * test = [NSString stringWithFormat:@"myfile.txt"];
NSData *myData = [[NSData alloc] initWithContentsOfFile:(@"%@", test)];
This NSData was saved to a file; its contents look like this: 21fa9731 27c67c00 da1c3349 d82470eb 56f97b88 559f406c 6abecbb7 de020007 47a4541d 99c9c5e7 883f8bf1 165fba39
Do you know a way to get this string back as it was in "myfile.txt"?
Thanks !
If the data file is somewhere on disk (not in the app bundle), you must provide the full path to it.
// Use the full path here if the file is outside the app bundle.
NSString *test = @"myfile.txt";
NSData *myData = [[NSData alloc] initWithContentsOfFile:test];
NSString *myString = [[NSString alloc] initWithData:myData
                                           encoding:NSUTF8StringEncoding];
Choose the encoding in which you saved your data file.
I don't know what your problem is. You could share with us how the file was written so that we can get a bit closer.
However, what you are doing is much too complicated and therefore error-prone. The following would do for reading a file with the constant name myfile.txt.
NSData *myData = [[NSData alloc] initWithContentsOfFile:@"myfile.txt"];
Frankly, I don't even know in which directory myfile.txt is expected to be. And there are of course much better ways to deal with constants than using literals directly in the code.
BTW, what do you actually receive in myData? nil?

NSStream write encoding issues

I'm trying to send a string using NSOutputStream, but I can't seem to get the encoding right. Sending the data returned by dataWithContentsOfURL works.
I'm using a Node.js TCP server with the actionHero library.
The server works with netcat and telnet.
- (IBAction)sendText:(id)sender {
    NSString *response = [NSString stringWithFormat:@"%@", [_sendTextField.text stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]];
    NSLog(@"writing %@", response);
    // This line works:
    // NSData *data = [[NSData alloc] initWithData:[NSData dataWithContentsOfURL:[NSURL URLWithString:@"http://www.google.com"]]];
    // This line doesn't work:
    NSData *data = [[NSData alloc] initWithData:[response dataUsingEncoding:NSUTF8StringEncoding]];
    // The write call returns a non-zero byte count.
    NSLog(@"%ld", (long)[outputStream write:[data bytes] maxLength:[data length]]);
}
I get a null streamError in the stream:handleEvent: method.
Not knowing the content of response, I can't give you a specific answer as to why NSUTF8StringEncoding doesn't work with it. Generally speaking, however, if there is a byte sequence in your content that is incompatible with UTF-8, you're going to get nil when you call -dataUsingEncoding:.
There's a strategy that I learned from reading Mike Ash's blog (look under the section "Fallbacks"), and it's served me pretty well in situations such as yours.
To briefly sum it up, first try using NSUTF8StringEncoding. If that doesn't work, try using NSISOLatin1StringEncoding. And if that doesn't work, try using NSMacOSRomanStringEncoding. Mike's blog has the rationale for this.
Found the answer to my own question. It turns out the problem is in actionHero's on-data method, which looks for a \n that the iOS application was not sending. I appended a \n and it's fine now.
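The fix described above can be sketched as follows. This is an illustration under assumptions, not the poster's actual code: LineDelimitedData is a hypothetical helper name, and an in-memory NSOutputStream stands in for the TCP stream so the example runs without a server.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: build the bytes for one line-delimited message,
// appending "\n" only if the text does not already end with one.
static NSData *LineDelimitedData(NSString *text) {
    NSString *line = [text hasSuffix:@"\n"] ? text
                                            : [text stringByAppendingString:@"\n"];
    return [line dataUsingEncoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        NSData *data = LineDelimitedData(@"hello");

        // An in-memory stream stands in for the socket's output stream here.
        NSOutputStream *stream = [NSOutputStream outputStreamToMemory];
        [stream open];
        NSInteger written = [stream write:data.bytes maxLength:data.length];
        [stream close];

        NSLog(@"wrote %ld of %lu bytes", (long)written, (unsigned long)data.length);
    }
    return 0;
}
```

With a real socket, write:maxLength: may accept fewer bytes than requested, so production code would normally loop until everything has been sent.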

Objective-C: NSString not being entirely decoded from UTF-8

I'm querying a web server which returns a JSON string as NSData. The string is in UTF-8 format so it is converted to an NSString like this.
NSString *receivedString = [[NSString alloc] initWithData:receivedData encoding:NSUTF8StringEncoding];
However, some UTF-8 escapes remain in the outputted JSON string which causes my app to behave erratically. Things like \u2019 remain in the string. I've tried everything to remove them and replace them with their actual characters.
The only thing I can think of is to replace the occurrences of UTF-8 escapes with their characters manually, but this is a lot of work if there's a quicker way!
Here's an example of an incorrectly parsed string:
{"title":"The Concept, Framed, The Enquiry, Delilah\u2019s Number 10 ","url":"http://livebrum.co.uk/2012/05/31/the-concept-framed-the-enquiry-delilah\u2019s-number-10","date_range":"31 May 2012","description":"","venue":{"title":"O2 Academy 3 ","url":"http://livebrum.co.uk/venues/o2-academy-3"}
As you can see, the URL hasn't been completely converted.
The \u2019 syntax isn't part of UTF-8 encoding, it's a piece of JSON-specific syntax. NSString parses UTF-8, not JSON, so doesn't understand it.
You should use NSJSONSerialization to parse the JSON then pull the string you want from the output of that.
So, for example:
NSError *error = nil;
id rootObject = [NSJSONSerialization
                    JSONObjectWithData:receivedData
                               options:0
                                 error:&error];
if (!rootObject) {
    // error path here
}
// really you'd validate this properly, but this is just
// an example so I'm going to assume:
// (1) the root object is a dictionary;
// (2) it has a string in it named 'url'
// (technically this code will work no matter what the type
// of the url object as written, but if you carry forward assuming
// a string then you could be in trouble)
NSDictionary *rootDictionary = rootObject;
NSString *url = [rootDictionary objectForKey:@"url"];
NSLog(@"URL was: %@", url);

Is there a way to "auto detect" the encoding of a resource when loading it using stringFromContentsOfURL?

Is there a way to "auto detect" the encoding of a resource when loading it using stringFromContentsOfURL? The current (non-deprecated) method, + (id)stringWithContentsOfURL:(NSURL *)url encoding:(NSStringEncoding)enc error:(NSError **)error;, requires an encoding to be specified. I've noticed that getting it wrong does make a difference for what I want to do. Is there a way to check this somehow and always get it right? (Right now I'm using UTF-8.)
I'd try this function from the docs
Returns a string created by reading data from a given URL and returns by reference the encoding used to interpret the data.
+ (id)stringWithContentsOfURL:(NSURL *)url usedEncoding:(NSStringEncoding *)enc error:(NSError **)error
This seems to guess the encoding and then return it to you.
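A minimal sketch of calling that method is shown below. The helper name ReadStringDetectingEncoding and the temporary file name are made up for illustration, and a file URL is used instead of a web URL so the example runs without network access.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: read a URL as text and report, through outEncoding,
// which encoding Foundation decided to use. Returns nil on failure.
static NSString *ReadStringDetectingEncoding(NSURL *url, NSStringEncoding *outEncoding) {
    NSError *error = nil;
    NSString *text = [NSString stringWithContentsOfURL:url
                                          usedEncoding:outEncoding
                                                 error:&error];
    if (text == nil) {
        NSLog(@"could not read %@: %@", url, error);
    }
    return text;
}

int main(void) {
    @autoreleasepool {
        // Write a small file so there is something to read back.
        NSString *path = [NSTemporaryDirectory()
                             stringByAppendingPathComponent:@"encoding-demo.txt"];
        NSURL *url = [NSURL fileURLWithPath:path];
        [@"hello" writeToURL:url atomically:YES
                    encoding:NSUTF8StringEncoding error:NULL];

        NSStringEncoding used = 0;
        NSString *text = ReadStringDetectingEncoding(url, &used);
        NSLog(@"read \"%@\" using encoding %lu", text, (unsigned long)used);
    }
    return 0;
}
```

If detection fails, the method returns nil and fills in the error, at which point an explicit encoding or a fallback chain can be tried instead.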
What I normally do when converting data (encoding-less string of bytes) to a string is attempt to initialize the string using various different encodings. I would suggest trying the most limiting (charset wise) encodings like ASCII and UTF-8 first, then attempt UTF-16. If none of those are a valid encoding, you should attempt to decode the string using a fallback encoding like NSWindowsCP1252StringEncoding that will almost always work. In order to do this you need to download the page's contents using NSData so that you don't have to re-download for every encoding attempt. Your code might look like this:
NSData *urlData = [NSData dataWithContentsOfURL:aURL];
NSString *theString = [[NSString alloc] initWithData:urlData encoding:NSASCIIStringEncoding];
if (!theString) {
    theString = [[NSString alloc] initWithData:urlData encoding:NSUTF8StringEncoding];
}
if (!theString) {
    theString = [[NSString alloc] initWithData:urlData encoding:NSUTF16StringEncoding];
}
if (!theString) {
    // Last-resort fallback that will almost always produce a string.
    theString = [[NSString alloc] initWithData:urlData encoding:NSWindowsCP1252StringEncoding];
}
// ...
// use theString here...
// ...
[theString release]; // manual reference counting; omit under ARC

objective c - does not read utf-8 encoded file

I'm trying to display some Japanese text on the iOS simulator and an iPod touch. The text is read from an XML file. The header is:
<?xml version="1.0" encoding="utf-8"?>
When the text is in English, it displays fine. However, when the text is Japanese, it comes out as an unintelligible mishmash of single-byte characters.
I have tried saving the file specifically as Unicode using TextEdit. I'm using NSXMLParser to parse the data. Any ideas would be much appreciated.
Here is the parsing code
// Override point for customization after application launch.
NSString *xmlFilePath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:@"questionsutf8.xml"];
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath];
NSData *data = [NSData dataWithBytes:[xmlFileContents UTF8String] length:[xmlFileContents lengthOfBytesUsingEncoding: NSUTF8StringEncoding]];
XMLReader *xmlReader = [[XMLReader alloc] init];
[xmlReader parseXMLData: data];
stringWithContentsOfFile: is a deprecated method. It does not do encoding detection unless the file contains the appropriate byte order mark, otherwise it interprets the file as the default C string encoding (the encoding returned by the +defaultCStringEncoding method). Instead, you should use the non-deprecated [and encoding-detecting] method stringWithContentsOfFile:usedEncoding:error:.
You can use it like this:
NSStringEncoding enc;
NSError *error = nil;
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath
                                                  usedEncoding:&enc
                                                         error:&error];
if (xmlFileContents == nil)
    NSLog(@"%@", error);
First, you should verify with TextWrangler (free from the Mac App Store or barebones.com) that your XML file truly is UTF-8 encoded.
Second, try creating xmlFileContents with +stringWithContentsOfFile:encoding:error:, explicitly specifying UTF-8 encoding. Or, even better, bypass the intermediate string entirely, and create data with +dataWithContentsOfFile:.
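The second suggestion, skipping the intermediate string, might look like the sketch below. It is an illustration only: TextCollector and TextFromXMLFile are hypothetical names, the code assumes ARC, and NSXMLParser is used directly in place of the question's XMLReader class, whose implementation is not shown.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical delegate that concatenates all character data it is given.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation TextCollector
- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}
@end

// Hypothetical helper: hand the file's raw bytes to the parser, which reads
// the encoding from the XML declaration itself. Returns nil on failure.
static NSString *TextFromXMLFile(NSString *path) {
    NSData *data = [NSData dataWithContentsOfFile:path];
    if (data == nil) {
        return nil;
    }
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    TextCollector *collector = [[TextCollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? [collector.text copy] : nil;
}

int main(void) {
    @autoreleasepool {
        // Create a small UTF-8 XML file to parse (stand-in for questionsutf8.xml).
        NSString *path = [NSTemporaryDirectory()
                             stringByAppendingPathComponent:@"questions-demo.xml"];
        NSString *xml = @"<?xml version=\"1.0\" encoding=\"utf-8\"?><q>日本語</q>";
        [xml writeToFile:path atomically:YES encoding:NSUTF8StringEncoding error:NULL];

        NSLog(@"%@", TextFromXMLFile(path));
    }
    return 0;
}
```

Because the bytes are never decoded with a guessed encoding on the way in, there is no opportunity for the default C string encoding to mangle the multi-byte characters.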