How to save a text document in Cocoa with specified NSString encoding? - objective-c

I'm trying to create a simple text editor like TextEdit for Mac OS X, but after many hours of research I can't figure out how to correctly write my document's data to a file. I'm using the Cocoa framework and my application is document-based. Looking around in the Cocoa documentation I found a brief tutorial, "Building a text editor in 15 minutes" or something like that, which implements the following method to write the data to a file:
- (NSData *)dataOfType:(NSString *)typeName error:(NSError **)outError {
[textView breakUndoCoalescing];
NSAttributedString *string=[[textView textStorage] copy];
NSData *data;
NSMutableDictionary *dict=[NSDictionary dictionaryWithObject:NSPlainTextDocumentType forKey:NSDocumentTypeDocumentAttribute];
data=[string dataFromRange:NSMakeRange(0,[string length]) documentAttributes:dict error:outError];
return data;
}
This just works fine, but I'd like to let the user choose the text encoding. I guess this method uses an "automatic" encoding, but how can I write the data using a predefined encoding? I tried using the following code:
- (NSData *)dataOfType:(NSString *)typeName error:(NSError **)outError {
[textView breakUndoCoalescing];
NSAttributedString *string=[[textView textStorage] copy];
NSData *data;
NSInteger saveEncoding=[prefs integerForKey:@"saveEncoding"];
// if the saving encoding is set to "automatic"
if (saveEncoding<0) {
NSMutableDictionary *dict=[NSDictionary dictionaryWithObject:NSPlainTextDocumentType forKey:NSDocumentTypeDocumentAttribute];
data=[string dataFromRange:NSMakeRange(0,[string length]) documentAttributes:dict error:outError];
// else use the encoding specified by the user
} else {
NSMutableDictionary *dict=[NSDictionary dictionaryWithObjectsAndKeys:NSPlainTextDocumentType,NSDocumentTypeDocumentAttribute,saveEncoding,NSCharacterEncodingDocumentAttribute,nil];
data=[string dataFromRange:NSMakeRange(0,[string length]) documentAttributes:dict error:outError];
}
return data;
}
saveEncoding is -1 if the user didn't set a specific encoding, otherwise one of the encodings listed in [NSString availableStringEncodings]. But whenever I try to save my document in an encoding other than UTF-8, the app crashes. The same happens when I try to encode my document with the following code:
NSString *string=[[textView textStorage] string];
data=[string dataUsingEncoding:saveEncoding];
What am I doing wrong? It would be great if someone knows how TextEdit solved this problem.

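Remember that NSDictionary can only store objects. saveEncoding is a plain NSInteger, so passing it to dictionaryWithObjectsAndKeys: makes the dictionary treat the number as an object pointer, which crashes. Wrap it in an NSNumber first:
NSDictionary *dict = [NSDictionary dictionaryWithObjectsAndKeys:
NSPlainTextDocumentType,
NSDocumentTypeDocumentAttribute,
[NSNumber numberWithInteger:saveEncoding],
NSCharacterEncodingDocumentAttribute,
nil];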
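For the string-based route the question also tried, a minimal sketch follows. It assumes ARC and a hypothetical helper named EncodedDataFromString (not part of Cocoa). With lossy conversion disabled, dataUsingEncoding:allowLossyConversion: returns nil when the chosen encoding cannot represent some character, so the failure can be reported through the NSError instead of silently dropping text.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not part of Cocoa: converts text to bytes in the
// user's chosen encoding and reports a failure instead of losing characters.
static NSData *EncodedDataFromString(NSString *text,
                                     NSStringEncoding encoding,
                                     NSError **outError) {
    // Returns nil when some character has no representation in `encoding`.
    NSData *data = [text dataUsingEncoding:encoding allowLossyConversion:NO];
    if (data == nil && outError != NULL) {
        *outError = [NSError errorWithDomain:NSCocoaErrorDomain
                                        code:NSFileWriteInapplicableStringEncodingError
                                    userInfo:@{NSStringEncodingErrorKey : @(encoding)}];
    }
    return data;
}
```

Inside dataOfType:error: this could be called as EncodedDataFromString([[textView textStorage] string], (NSStringEncoding)saveEncoding, outError), once a negative "automatic" value has been handled separately.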

Related

attributes not saving with file

This is probably easy, but I cannot seem to figure it out; maybe it's late. I have a simple program that takes the text from an NSTextView and saves it as RTF. Saving the text itself works great, but I cannot figure out how to get the attributes to tag along.
Code:
NSAttributedString *saveString = [[NSAttributedString alloc]
initWithString:[textView string]];
NSData *writeResults = [saveString
RTFFromRange:NSMakeRange(0, [saveString length])
documentAttributes:?? ];
[writeResults writeToURL:[panel URL] atomically:YES];
I know I need an NSDictionary for the documentAttributes, so how do I get that from the view?
What am I missing?
It seems that you are asking the textView for its string property, which is plain text without any attributes. You need to ask it for its attributedString instead:
NSAttributedString *saveString = textView.attributedString;
You can get the attributes from an attributed string like this:
NSMutableDictionary *allAttributes = [[NSMutableDictionary alloc] init];
[saveString enumerateAttributesInRange:NSMakeRange(0,saveString.length) options:NSAttributedStringEnumerationReverse usingBlock:^(NSDictionary *attrs, NSRange range, BOOL *stop) {
[allAttributes addEntriesFromDictionary:attrs];
}];
// RTFFromRange:documentAttributes: is an NSAttributedString method, so call it on saveString itself, not on its plain string.
NSData *writeResults = [saveString RTFFromRange:NSMakeRange(0,saveString.length) documentAttributes:allAttributes];
I have used this method to get attributes many times; however, I have never saved to RTF, so I don't know exactly how this will turn out. All the attributes will be in the dictionary, however. Note that the character attributes already travel with the attributed string itself; documentAttributes: is meant for document-level attributes.

How to convert &#8211,&#8222 etc in Objective-C

I wrote the server side in Python; it returns a scraped HTML string to the client side, which is written in Objective-C.
But when I try to show the returned string on the client side, it contains &#8211, &#8222, etc. I don't know why it contains these characters.
Do you have any idea? I would like to convert them correctly in Objective-C. Thanks in advance.
If you want to stick with Cocoa you could also try NSAttributedString and initWithHTML:documentAttributes:. You will lose the markup then, though:
NSData *data = [@"<html><p>&#8211 Test</p></html>" dataUsingEncoding:NSUTF8StringEncoding];
NSAttributedString *string = [[NSAttributedString alloc] initWithHTML:data documentAttributes:nil];
NSString *result = [string string];
These are HTML entities (numeric character references).
There is an NSString category for HTML that provides the following methods:
- (NSString *)stringByConvertingHTMLToPlainText;
- (NSString *)stringByDecodingHTMLEntities;
- (NSString *)stringByEncodingHTMLEntities;
- (NSString *)stringWithNewLinesAsBRs;
- (NSString *)stringByRemovingNewLinesAndWhitespace;
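If pulling in a category is not an option, decimal references such as &#8211 can be decoded with Foundation alone. The following is only a sketch under stated assumptions: ARC is enabled, the helper name DecodeDecimalCharacterReferences is hypothetical, and named entities (&amp; and so on) and hexadecimal references are deliberately left untouched.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces decimal references such as &#8211; (the
// trailing semicolon is optional here) with the characters they stand for.
static NSString *DecodeDecimalCharacterReferences(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]{1,7});?"
                                                  options:0
                                                    error:NULL];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    NSMutableString *result = [input mutableCopy];

    // Walk backwards so the ranges of earlier matches stay valid while editing.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
        uint32_t codePoint = (uint32_t)[digits integerValue];
        BOOL isSurrogate = (codePoint >= 0xD800 && codePoint <= 0xDFFF);
        if (codePoint == 0 || codePoint > 0x10FFFF || isSurrogate) {
            continue; // not a valid Unicode scalar value; keep the text as is
        }

        // Encode the code point as one or two UTF-16 units.
        unichar units[2];
        NSUInteger count;
        if (codePoint < 0x10000) {
            units[0] = (unichar)codePoint;
            count = 1;
        } else {
            uint32_t offset = codePoint - 0x10000;
            units[0] = (unichar)(0xD800 + (offset >> 10));
            units[1] = (unichar)(0xDC00 + (offset & 0x3FF));
            count = 2;
        }
        [result replaceCharactersInRange:match.range
                              withString:[NSString stringWithCharacters:units
                                                                 length:count]];
    }
    return result;
}
```

For example, 8211 is U+2013 (en dash) and 8222 is U+201E (low double quotation mark), so those references come back as the punctuation the scraped page originally contained.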

Is there a way to "auto detect" the encoding of a resource when loading it using stringWithContentsOfURL?

Is there a way to "auto detect" the encoding of a resource when loading it using stringWithContentsOfURL? The current (non-deprecated) method, + (id)stringWithContentsOfURL:(NSURL *)url encoding:(NSStringEncoding)enc error:(NSError **)error;, requires an explicit string encoding. I've noticed that getting it wrong does make a difference for what I want to do. Is there a way to check this somehow and always get it right? (Right now I'm using UTF-8.)
I'd try this method from the docs:
Returns a string created by reading data from a given URL and returns by reference the encoding used to interpret the data.
+ (id)stringWithContentsOfURL:(NSURL *)url usedEncoding:(NSStringEncoding *)enc error:(NSError **)error
It seems to guess the encoding and then return it to you through the enc parameter.
What I normally do when converting data (encoding-less string of bytes) to a string is attempt to initialize the string using various different encodings. I would suggest trying the most limiting (charset wise) encodings like ASCII and UTF-8 first, then attempt UTF-16. If none of those are a valid encoding, you should attempt to decode the string using a fallback encoding like NSWindowsCP1252StringEncoding that will almost always work. In order to do this you need to download the page's contents using NSData so that you don't have to re-download for every encoding attempt. Your code might look like this:
NSData * urlData = [NSData dataWithContentsOfURL:aURL];
NSString * theString = [[NSString alloc] initWithData:urlData encoding:NSASCIIStringEncoding];
if (!theString) {
theString = [[NSString alloc] initWithData:urlData encoding:NSUTF8StringEncoding];
}
if (!theString) {
theString = [[NSString alloc] initWithData:urlData encoding:NSUTF16StringEncoding];
}
if (!theString) {
theString = [[NSString alloc] initWithData:urlData encoding:NSWindowsCP1252StringEncoding];
}
// ...
// use theString here...
// ...
[theString release];
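The same fallback chain can be written as a loop that also reports which encoding succeeded. This is a sketch, assuming ARC (unlike the manual release above) and a hypothetical helper name, StringByTryingEncodings. The candidate list mirrors the order suggested above and can be changed freely.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tries each candidate encoding in order and returns the
// first successful decoding. On success, *usedEncoding (if non-NULL) receives
// the encoding that worked. Returns nil if every candidate fails.
static NSString *StringByTryingEncodings(NSData *data,
                                         NSStringEncoding *usedEncoding) {
    const NSStringEncoding candidates[] = {
        NSASCIIStringEncoding,         // most restrictive first
        NSUTF8StringEncoding,
        NSUTF16StringEncoding,
        NSWindowsCP1252StringEncoding, // permissive last-resort fallback
    };
    const size_t count = sizeof(candidates) / sizeof(candidates[0]);

    for (size_t i = 0; i < count; i++) {
        NSString *decoded = [[NSString alloc] initWithData:data
                                                  encoding:candidates[i]];
        if (decoded != nil) {
            if (usedEncoding != NULL) {
                *usedEncoding = candidates[i];
            }
            return decoded;
        }
    }
    return nil;
}
```

Knowing which encoding was used makes it easier to notice when the permissive fallback was hit, since that result is the one most likely to be a wrong guess.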

objective c - does not read utf-8 encoded file

I'm trying to display some Japanese text on the iOS Simulator and an iPod touch. The text is read from an XML file. The header is:
<?xml version="1.0" encoding="utf-8"?>
When the text is in English, it displays fine. However, when the text is Japanese, it comes out as an unintelligible mishmash of single-byte characters.
I have tried saving the file specifically as unicode using TextEdit. I'm using NSXMLParser to parse the data. Any ideas would be much appreciated.
Here is the parsing code
// Override point for customization after application launch.
NSString *xmlFilePath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:@"questionsutf8.xml"];
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath];
NSData *data = [NSData dataWithBytes:[xmlFileContents UTF8String] length:[xmlFileContents lengthOfBytesUsingEncoding: NSUTF8StringEncoding]];
XMLReader *xmlReader = [[XMLReader alloc] init];
[xmlReader parseXMLData: data];
stringWithContentsOfFile: is a deprecated method. It does no encoding detection unless the file contains an appropriate byte order mark; without one, it interprets the file in the default C string encoding (the encoding returned by the +defaultCStringEncoding method). Instead, you should use the non-deprecated, encoding-detecting method stringWithContentsOfFile:usedEncoding:error:.
You can use it like this:
NSStringEncoding enc;
NSError *error;
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath
usedEncoding:&enc
error:&error];
if (xmlFileContents == nil)
{
NSLog(@"%@", error);
return;
}
First, you should verify with TextWrangler (free from the Mac App Store or barebones.com) that your XML file truly is UTF-8 encoded.
Second, try creating xmlFileContents with +stringWithContentsOfFile:encoding:error:, explicitly specifying UTF-8 encoding. Or, even better, bypass the intermediate string entirely and create the data with +dataWithContentsOfFile:.
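To illustrate the last suggestion, here is a small sketch that hands raw bytes straight to NSXMLParser, so the parser applies the encoding declared in the XML header itself. It assumes ARC; TextCollector and TextFromXMLData are hypothetical names used only for this example and stand in for the XMLReader class from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that simply concatenates all character data it sees.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation TextCollector
- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}
@end

// Parses the raw bytes without building an intermediate NSString.
// Returns the collected text, or nil if parsing fails.
static NSString *TextFromXMLData(NSData *data) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    TextCollector *collector = [[TextCollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? [collector.text copy] : nil;
}
```

In the question's code, the data would come from [NSData dataWithContentsOfFile:xmlFilePath] instead of a round trip through stringWithContentsOfFile:.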

Objective-c saving raw text

I implemented saving and loading methods in my document-based application. In the saving method, I have
[NSArchiver archivedDataWithRootObject:[self string]];
where [self string] is an NSString.
When saving a file with just "normal content" inside of it, the contents of the file created are:
streamtypedè#NSStringNSObject+normal content
Is there a way to store in a file just raw text?
Thanks for your help.
NSString has methods for writing its contents directly to a file:
NSString *s = @"Foo bar";
NSError *err = nil;
BOOL result = [s writeToFile:@"/tmp/test.txt" atomically:YES encoding:NSASCIIStringEncoding error:&err];
Since I am new to Cocoa, I don't know if this is the right way to do it, or even a valid way.
But after a quick look at the documentation I found this NSString instance method: - (NSData *)dataUsingEncoding:(NSStringEncoding)encoding
A quick try in a sample project worked fine when I called it from:
- (NSData *)dataOfType:(NSString *)typeName error:(NSError **)outError
So something like this might work for you:
- (NSData *)dataOfType:(NSString *)typeName error:(NSError **)outError {
return [[self string] dataUsingEncoding:NSUnicodeStringEncoding];
}
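Whatever encoding is used for saving, the loading side has to decode with the same one. Below is a sketch of a symmetric pair, assuming ARC and swapping in UTF-8 for the NSUnicodeStringEncoding (UTF-16) used above; the function names are hypothetical, and in a document-based app their bodies would live in dataOfType:error: and readFromData:ofType:error:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helpers. Saving: plain text to raw UTF-8 bytes.
static NSData *RawDataFromText(NSString *text) {
    return [text dataUsingEncoding:NSUTF8StringEncoding];
}

// Loading: raw bytes back to text. Returns nil if the bytes are not
// valid UTF-8, which the caller should report as a read error.
static NSString *TextFromRawData(NSData *data) {
    return [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
}
```

Unlike the NSArchiver output shown in the question, the resulting file contains only the text itself and opens cleanly in any editor that understands UTF-8.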