I need to add a custom attribute to the selected text in an NSTextView. I can do that by getting the attributed string for the selection, adding a custom attribute to it, and then replacing the selection with my new attributed string.
So now I get the text view's attributed string as NSData and write it to a file. Later when I open that file and restore it to the text view my custom attributes are gone! After working out the entire scheme for my custom attribute I find that custom attributes are not saved for you. Look at the IMPORTANT note here: http://developer.apple.com/mac/library/DOCUMENTATION/Cocoa/Conceptual/AttributedStrings/Tasks/RTFAndAttrStrings.html
So I have no idea how to save and restore my documents with this custom attribute. Any help?
The normal way of saving an NSAttributedString is to use RTF, and RTF data is what the -dataFromRange:documentAttributes:error: method of NSAttributedString generates when you ask it for the RTF document type.
However, the RTF format has no support for custom attributes. Instead, you should use the NSCoding protocol to archive your attributed string, which will preserve the custom attributes:
//assume attributedString is your NSAttributedString
//encode the string as NSData
NSData* stringData = [NSKeyedArchiver archivedDataWithRootObject:attributedString];
[stringData writeToFile:pathToFile atomically:YES];
//read the data back in and decode the string
NSData* newStringData = [NSData dataWithContentsOfFile:pathToFile];
NSAttributedString* newString = [NSKeyedUnarchiver unarchiveObjectWithData:newStringData];
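To check that a custom attribute really survives the archive, here is a minimal round-trip sketch. It uses the same NSKeyedArchiver calls as above (newer SDKs flag them as deprecated) and assumes ARC. The helper names and the "MyCustomData" key are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Archives an attributed string and immediately unarchives it again,
// using the same calls as the snippet above (deprecated on newer SDKs).
static NSAttributedString *RoundTripThroughArchive(NSAttributedString *original)
{
    NSData *archived = [NSKeyedArchiver archivedDataWithRootObject:original];
    return [NSKeyedUnarchiver unarchiveObjectWithData:archived];
}

// Builds a sample string whose first five characters carry the
// hypothetical "MyCustomData" attribute.
static NSAttributedString *SampleStringWithCustomAttribute(void)
{
    NSMutableAttributedString *sample =
        [[NSMutableAttributedString alloc] initWithString:@"Hello world"];
    [sample addAttribute:@"MyCustomData" value:@"note" range:NSMakeRange(0, 5)];
    return sample;
}
```

Reading the attribute at index 0 of the restored string should give back the stored value, while an index outside the attributed range should give nil.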
There is a way to save custom attributes to RTF using Cocoa. It relies on the fact that RTF is a text format, and so can be manipulated as a string even if you don't know all the rules of RTF and don't have a custom RTF reader/writer. The procedure I outline below post-processes the RTF both when writing and reading, and I have used this technique personally. One thing to be very careful of is that the text you insert into the RTF uses only 7-bit ASCII and contains no unescaped RTF control characters, namely the backslash and the two braces ( \ { } ).
Here's how you would encode your data:
NSData *GetRtfFromAttributedString(NSAttributedString *text)
{
    NSData *rtfData = nil;
    NSMutableString *rtfString = nil;
    NSString *customData = nil, *encodedData = nil;
    NSRange range;
    NSUInteger dataLocation;
    // Convert the attributed string to RTF
    if ((rtfData = [text RTFFromRange:NSMakeRange(0, [text length]) documentAttributes:nil]) == nil)
        return(nil);
    // Find and encode your custom attributes here. In this example the data is a string and there's at most one of them
    if ((customData = [text attribute:@"MyCustomData" atIndex:0 effectiveRange:&range]) == nil)
        return(rtfData); // No custom data, return RTF as is
    dataLocation = range.location;
    // Get a string representation of the RTF
    rtfString = [[NSMutableString alloc] initWithData:rtfData encoding:NSASCIIStringEncoding];
    // Find the anchor where we'll put our data, namely just before the first paragraph property reset
    range = [rtfString rangeOfString:@"\\pard" options:NSLiteralSearch];
    if (range.location == NSNotFound)
    {
        NSLog(@"Custom data dropped; RTF has no paragraph properties");
        [rtfString release];
        return(rtfData);
    }
    // Insert the starred group containing the custom data and its location
    // (NSUInteger is cast to unsigned long so the format specifier is correct on 32- and 64-bit)
    encodedData = [NSString stringWithFormat:@"{\\*\\my_custom_keyword %lu,%@}\n", (unsigned long)dataLocation, customData];
    [rtfString insertString:encodedData atIndex:range.location];
    // Convert the amended RTF back to a data object
    rtfData = [rtfString dataUsingEncoding:NSASCIIStringEncoding];
    [rtfString release];
    return(rtfData);
}
This technique works because all compliant RTF readers will ignore "starred groups" whose keyword they don't recognize. Therefore you want to be sure your control word will not be recognized by any other reader, so use something likely to be unique, such as a prefix with your company or product name. If your data is complex, or binary, or may contain illegal RTF characters that you don't want to escape, encode it in base64. Be sure you put a space after your keyword.
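As a rough illustration of the two options just mentioned (escaping versus base64), here is a minimal sketch. The helper names are invented for this example, the keyword is the placeholder used above, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Escapes the three characters RTF treats specially. The backslash is
// handled first so the backslashes added for the braces are not doubled.
static NSString *EscapeForRtf(NSString *plain)
{
    NSString *escaped = [plain stringByReplacingOccurrencesOfString:@"\\" withString:@"\\\\"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"{" withString:@"\\{"];
    escaped = [escaped stringByReplacingOccurrencesOfString:@"}" withString:@"\\}"];
    return escaped;
}

// Builds a starred group whose payload is base64. Base64 output uses only
// A-Z, a-z, 0-9, '+', '/' and '=', so it needs no RTF escaping.
static NSString *StarredGroupWithPayload(NSUInteger location, NSData *payload)
{
    NSString *encoded = [payload base64EncodedStringWithOptions:0];
    return [NSString stringWithFormat:@"{\\*\\my_custom_keyword %lu,%@}\n",
            (unsigned long)location, encoded];
}

// Reverses the base64 step; returns nil if the text is not valid base64.
static NSData *PayloadFromBase64(NSString *encoded)
{
    return [[NSData alloc] initWithBase64EncodedString:encoded options:0];
}
```

Base64 is probably the simpler choice when the reader locates the end of the group by searching for the first closing brace, because an escaped brace inside the payload would end that search too early.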
Similarly, when reading the RTF, you search for your control word, extract the data, and restore the attribute. This routine takes as arguments the attributed string and the RTF it was created from.
void RestoreCustomAttributes(NSMutableAttributedString *text, NSData *rtfData)
{
    NSString *rtfString = [[NSString alloc] initWithData:rtfData encoding:NSASCIIStringEncoding];
    NSArray *components = nil;
    NSRange range, endRange;
    // Find the custom data and its end
    range = [rtfString rangeOfString:@"{\\*\\my_custom_keyword " options:NSLiteralSearch];
    if (range.location == NSNotFound)
    {
        [rtfString release];
        return;
    }
    range.location += range.length;
    // Search from the start of the data to the end of the RTF
    // (the length must be computed from range.location; endRange is not set yet)
    endRange = [rtfString rangeOfString:@"}" options:NSLiteralSearch
                                  range:NSMakeRange(range.location, [rtfString length] - range.location)];
    if (endRange.location == NSNotFound)
    {
        [rtfString release];
        return;
    }
    // Get the location and the string data, which are separated by a comma
    range.length = endRange.location - range.location;
    components = [[rtfString substringWithRange:range] componentsSeparatedByString:@","];
    [rtfString release];
    // Assign the custom data back to the attributed string. You should do range checking here (omitted for clarity)
    [text addAttribute:@"MyCustomData" value:[components objectAtIndex:1]
                 range:NSMakeRange([[components objectAtIndex:0] integerValue], 1)];
}
I am working with an Objective-C application. Specifically, I am gathering the dictionary representation of NSUserDefaults with this code:
NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
NSDictionary *userDefaultsDict = [defaults dictionaryRepresentation];
While enumerating keys and objects of the resulting dict, sometimes I find a kind of opaque string that you can see in the following picture:
So it seems like an encoding problem.
If I try to print description of the string, the debugger correctly prints:
Printing description of obj:
tsuqsx
However, if I try to write obj to a file, or use it in any other way, I get an unreadable output like this:
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
Any help is greatly appreciated. Thanks
EDIT: Very Hacky possible Solution that helps explaining what I am trying to do.
After trying all possible solutions based on dataUsingEncoding and back, I ended up with the following solution, absolutely weird, but I post it here, in the hope that it can help somebody to guess the encoding and what to do with unprintable characters:
- (BOOL)isProblematicString:(NSString *)candidateString {
BOOL returnValue = YES;
if ([candidateString length] <= 2) {
return NO;
}
const char *temp = [candidateString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
if (ctr != 1 && ctr < [candidateString length]) {
if (temp[ctr] < 0x10 || temp[ctr] > 0x1F) {
returnValue = NO;
}
}
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
free(dest);
return returnValue;
}
- (NSString *)utf8StringFromUnknownEncodedString:(NSString*)originalUnknownString {
const char *temp = [originalUnknownString UTF8String];
long length = temp[0];
char *dest = malloc(length + 1);
long ctr = 1;
long usefulCounter = 0;
for (ctr = 1;ctr <= length;ctr++) {
if ((ctr - 1) % 3 == 0) {
memcpy(&dest[ctr - usefulCounter - 1],&temp[ctr],1);
} else {
usefulCounter += 1;
}
}
memset(&dest[length],0,1);
NSString *returnValue = [[NSString alloc] initWithUTF8String:dest];
free(dest);
return returnValue;
}
This returns me a string that I can use to build a full UTF8 string. I am looking for a clean solution. Any help is greatly appreciated. Thanks
We're talking about a string which comes from the /Library/Preferences/.GlobalPreferences.plist
(key com.apple.preferences.timezone.new.selected_city).
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
NSLog(@"%@", city); // \^Zt\^\\^]s\^]\^\u\^V\^_q\^]\^[s\^W\^Zx\^P
(lldb) p [city description]
(__NSCFString *) $1 = 0x0000600003f6c240 @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10"
What I would like to achieve is the following:
Detect in some way that the string has the encoding problem.
Convert the string to UTF8 encoding to use it in the rest of the program.
&
After trying all possible solutions based on dataUsingEncoding and back.
This string has no encoding problem and characters like \x1a, \x1c, ... are valid characters.
You can call dataUsingEncoding: with ASCII, UTF-8, ... but all these characters will still be
present. They're called control characters (or non-printing characters). The linked Wikipedia page explains what these characters are and how they're defined in ASCII, extended ASCII and Unicode.
What you're looking for is a way to remove control characters from a string.
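For the detection half of the question, a minimal sketch could look like the following. The function name is made up for this example. It only reports whether any character from NSCharacterSet.controlCharacterSet is present.

```objc
#import <Foundation/Foundation.h>

// Returns YES if the string contains at least one character from
// NSCharacterSet.controlCharacterSet (Unicode categories Cc and Cf).
static BOOL ContainsControlCharacters(NSString *candidate)
{
    // A nil or empty string cannot contain anything; this also avoids
    // misreading the zeroed range returned when messaging nil.
    if (candidate.length == 0) {
        return NO;
    }
    NSRange found = [candidate rangeOfCharacterFromSet:NSCharacterSet.controlCharacterSet];
    return found.location != NSNotFound;
}
```

A string that passes this check can be used as is. One that fails it can be cleaned with any of the removal approaches described next.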
Remove control characters
We can create a category for our new method:
@interface NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters;
@end
@implementation NSString (ControlCharacters)
- (NSString *)stringByRemovingControlCharacters {
    // TODO Remove control characters
    return self;
}
@end
In all examples below, the city variable is created in this way ...
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:@"com.apple.preferences.timezone.new.selected_city"];
... and contains @"\x1at\x1c\x1ds\x1d\x1cu\x16\x1fq\x1d\x1bs\x17\x1ax\x10". Also all
examples below were tested with the following code:
NSString *cityWithoutCC = [city stringByRemovingControlCharacters];
// tsuqsx
NSLog(@"%@", cityWithoutCC);
// {length = 6, bytes = 0x747375717378}
NSLog(@"%@", [cityWithoutCC dataUsingEncoding:NSUTF8StringEncoding]);
Split & join
One way is to utilize the NSCharacterSet.controlCharacterSet.
There's a stringByTrimmingCharactersInSet:
method (NSString), but it removes these characters from the beginning/end only,
which is not what you're looking for. There's a trick you can use:
- (NSString *)stringByRemovingControlCharacters {
NSArray<NSString *> *components = [self componentsSeparatedByCharactersInSet:NSCharacterSet.controlCharacterSet];
return [components componentsJoinedByString:@""];
}
It splits the string by control characters and then joins these components back. Not a very efficient way, but it works.
ICU transform
Another way is to use ICU transform (see ICU User Guide).
There's a stringByApplyingTransform:reverse:
method (NSString), but it only accepts predefined constants. Documentation says:
The constants defined by the NSStringTransform type offer a subset of the functionality provided by the underlying ICU transform functionality. To apply an ICU transform defined in the ICU User Guide that doesn't have a corresponding NSStringTransform constant, create an instance of NSMutableString and call the applyTransform:reverse:range:updatedRange: method instead.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
NSMutableString *result = [self mutableCopy];
[result applyTransform:@"[[:Cc:] [:Cf:]] Remove"
reverse:NO
range:NSMakeRange(0, self.length)
updatedRange:nil];
return result;
}
[:Cc:] represents control characters, [:Cf:] represents format characters. Together they cover the same character set as the already mentioned NSCharacterSet.controlCharacterSet. Documentation:
A character set containing the characters in Unicode General Category Cc and Cf.
Iterate over characters
NSCharacterSet also offers the characterIsMember: method. Here we need to iterate over characters (unichar) and check if it's a control character or not.
Let's update our implementation:
- (NSString *)stringByRemovingControlCharacters {
if (self.length == 0) {
return self;
}
NSUInteger length = self.length;
unichar characters[length];
[self getCharacters:characters];
NSUInteger resultLength = 0;
unichar result[length];
NSCharacterSet *controlCharacterSet = NSCharacterSet.controlCharacterSet;
for (NSUInteger i = 0 ; i < length ; i++) {
if ([controlCharacterSet characterIsMember:characters[i]] == NO) {
result[resultLength++] = characters[i];
}
}
return [NSString stringWithCharacters:result length:resultLength];
}
Here we filter out all characters (unichar) which belong to the controlCharacterSet.
Other ways
There are other ways to iterate over characters, for example: Most efficient way to iterate over all the chars in an NSString.
BBEdit & others
Let's write this string to a file:
NSString *city = [[NSUserDefaults standardUserDefaults]
stringForKey:#"com.apple.preferences.timezone.new.selected_city"];
[city writeToFile:@"/Users/zrzka/city.txt"
atomically:YES
encoding:NSUTF8StringEncoding
error:nil];
It's up to the editor how all these control characters are handled/displayed. Here's an example - Visual Studio Code.
View - Render Control Characters off:
View - Render Control Characters on:
BBEdit displays question marks (upside down), but I'm sure there's a way to
toggle control character rendering. I don't have BBEdit installed to verify it.
I have a game with a public highscore list where I allow players to enter their name (or anything up to 12 characters). I am trying to create a couple of functions to filter out bad words from a list of bad words
I have in a text file. I have two methods:
One to read in the text file:
-(void) getTheBadWordsAndSaveForLater {
badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
badwords = [badWordFile componentsSeparatedByString:@"\n"];
NSLog(@"Number Of Words Found in file: %i",[badwords count]);
for (NSString* words in badwords) {
NSLog(@"Word in Array----- %@",words);
}
}
And one to check a word (NSString*) against the list that I read in:
-(NSString *) removeBadWords :(NSString *) string {
// If I hard code this line below, it works....
// *****************************************************************************
//badwords =[[NSMutableArray alloc] initWithObjects:@"shet",@"shat",@"shut",nil];
// *****************************************************************************
NSLog(@"checking: %@",string);
for (NSString* words in badwords) {
string = [string stringByReplacingOccurrencesOfString:words withString:@"-" options:NSCaseInsensitiveSearch range:NSMakeRange(0, string.length)];
NSLog(@"Word in Array: %@",words);
}
NSLog(@"Cleaned Word Returned: %@",string);
return string;
}
The issue I'm having is that when I hardcode the words into an array (see the commented-out line above), it works like a charm. But when I use the array I read in with the first method, it doesn't work: stringByReplacingOccurrencesOfString:words does not seem to have any effect. I have traced out to the log so I can see if the words are coming through, and they are. That one line just doesn't seem to see the words unless I hardcode them into the array.
Any suggestions?
A couple of thoughts:
You have two lines:
badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
badwords = [badWordFile componentsSeparatedByString:@"\n"];
There's no point in doing that initWithContentsOfFile if you're just going to replace it with the componentsSeparatedByString on the next line. Plus, initWithContentsOfFile assumes the file is a property list (plist), but the rest of your code clearly assumes it's a newline separated text file. Personally, I would have used the plist format (it obviates the need to trim the whitespace from the individual words), but you can use whichever you prefer. Just use one or the other, not both.
If you're staying with the newline separated list of bad words, then just get rid of the line that says initWithContentsOfFile, since you disregard its result anyway. Thus:
- (void)getTheBadWordsAndSaveForLater {
    // these should be local variables, so get rid of your instance variables of the same name
    NSString *badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
    NSString *badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
    // calculate `badwords` solely from `componentsSeparatedByString`, not `initWithContentsOfFile`
    badwords = [badWordFile componentsSeparatedByString:@"\n"];
    // confirm what we got (count is an NSUInteger, so cast it for the format specifier)
    NSLog(@"Found %lu words: %@", (unsigned long)[badwords count], badwords);
}
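If you keep the text file, the whitespace trimming mentioned above might look like the sketch below. The function name is invented for this example. It assumes the list may contain Windows line endings, trailing spaces, or blank lines. An entry such as "shat\r" would otherwise never match a name typed by a player.

```objc
#import <Foundation/Foundation.h>

// Splits the file contents on any newline character, trims surrounding
// whitespace from every entry, and drops entries that end up empty.
static NSArray<NSString *> *WordsFromListText(NSString *text)
{
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    NSCharacterSet *whitespace = NSCharacterSet.whitespaceAndNewlineCharacterSet;
    NSArray<NSString *> *lines =
        [text componentsSeparatedByCharactersInSet:NSCharacterSet.newlineCharacterSet];
    for (NSString *line in lines) {
        NSString *word = [line stringByTrimmingCharactersInSet:whitespace];
        if (word.length > 0) {
            [words addObject:word];
        }
    }
    return words;
}
```

Logging each entry between delimiters, for example with NSLog(@"[%@]", word), makes stray whitespace visible in a way that plain logging does not.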
You might want to look for whole word occurrences only, rather than just the presence of the bad word anywhere:
- (NSString *) removeBadWords:(NSString *) string {
    NSLog(@"checking: %@ for occurrences of these bad words: %@", string, badwords);
    for (NSString* badword in badwords) {
        NSString *searchString = [NSString stringWithFormat:@"\\b%@\\b", badword];
        string = [string stringByReplacingOccurrencesOfString:searchString
                                                   withString:@"-"
                                                      options:NSCaseInsensitiveSearch | NSRegularExpressionSearch
                                                        range:NSMakeRange(0, string.length)];
    }
    NSLog(@"resulted in: %@", string);
    return string;
}
This uses a "regular expression" search, where \b stands for "a boundary between words". Thus, \bhell\b (or, because backslashes have to be quoted in a NSString literal, that's @"\\bhell\\b") will search for the word "hell" that is a separate word, but won't match "hello", for example.
Note, above, I am also logging badwords to see if that variable was reset somehow. That's the only thing that would make sense given the symptoms you describe, namely that the loading of the bad words from the text file works but the replace process fails. So examine badwords before you replace and make sure it's still set properly.
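One caveat with building the pattern straight from the word list: an entry containing a regular expression metacharacter (a dot, a plus sign, a parenthesis) would be interpreted as pattern syntax. A minimal sketch that escapes each word first is shown below. The function name is made up for this example.

```objc
#import <Foundation/Foundation.h>

// Replaces whole-word, case-insensitive occurrences of `badword` with "-".
// The word is escaped so that it is matched literally inside the pattern.
static NSString *MaskWholeWord(NSString *input, NSString *badword)
{
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b",
                         [NSRegularExpression escapedPatternForString:badword]];
    return [input stringByReplacingOccurrencesOfString:pattern
                                            withString:@"-"
                                               options:NSCaseInsensitiveSearch | NSRegularExpressionSearch
                                                 range:NSMakeRange(0, input.length)];
}
```

For example, MaskWholeWord(@"hell hello", @"hell") leaves "hello" alone and returns "- hello".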
I'm getting data from a web service and the result contains some HTML tags, that I'm then trying to convert. For example, I want to replace <P> tags with line breaks, and <STRONG> from HTML to bold text.
Can anyone help me? I've sort of worked out how I can replace text -- I'm halfway there, I think.
if([key isEqualToString:@"Description"]){
    txtDesc.text=[results objectForKey:key];
    NSString * a = txtDesc.text;
    NSString * b = [a stringByReplacingOccurrencesOfString:@"<strong>" withString:@"STRONG TAG"];
    b = [b stringByReplacingOccurrencesOfString:@"<\\/p>" withString:@""];
    b = [b stringByReplacingOccurrencesOfString:@"</p>" withString:@""];
    txtDesc.text=b;
}
Strings do not have attributes like bold. Strings contain only characters, including line breaks. If you want to enrich your string with attributes, have a look at NSAttributedString.
Update:
For those of us who cannot see why attributed strings are the solution, here is a simple piece of code:
- (NSAttributedString*)attributedStringByReplaceHtmlTag:(NSString*)tagName withAttributes:(NSDictionary*)attributes
{
    NSString *openTag = [NSString stringWithFormat:@"<%@>", tagName];
    NSString *closeTag = [NSString stringWithFormat:@"</%@>", tagName];
    NSMutableAttributedString *resultingText = [self mutableCopy];
    while ( YES ) {
        NSString *plainString = [resultingText string];
        NSRange openTagRange = [plainString rangeOfString:openTag];
        if (openTagRange.length==0) {
            break;
        }
        NSRange searchRange;
        searchRange.location = openTagRange.location+openTagRange.length;
        searchRange.length = [plainString length]-searchRange.location;
        NSRange closeTagRange = [plainString rangeOfString:closeTag options:0 range:searchRange];
        // Stop on an unclosed tag; otherwise the NSNotFound location would produce an invalid range below
        if (closeTagRange.location == NSNotFound) {
            break;
        }
        NSRange effectedRange;
        effectedRange.location = openTagRange.location+openTagRange.length;
        effectedRange.length = closeTagRange.location - effectedRange.location;
        [resultingText setAttributes:attributes range:effectedRange];
        // Delete the closing tag first so the opening tag's range stays valid
        [resultingText deleteCharactersInRange:closeTagRange];
        [resultingText deleteCharactersInRange:openTagRange];
    }
    return resultingText;
}
But I did not test it well, because I had to prepare a risotto while programming that. ;-)
As said before, you need to use NSAttributedString.
Its mutable subclass, NSMutableAttributedString, implements the following method, to which you pass an NSDictionary of attributes and the range (of characters) that should receive them:
- (void)setAttributes:(NSDictionary *)attributes range:(NSRange)range;
An example of NSDictionary would be:
@{ NSFontAttributeName: [UIFont systemFontOfSize:24], NSForegroundColorAttributeName: [UIColor greenColor]}
You can look for more information in Apple's documentation
https://developer.apple.com/library/mac/#documentation/Cocoa/Reference/Foundation/Classes/NSAttributedString_Class/Reference/Reference.html
or in Lecture 4 of the amazing iPhone Application Development course from Stanford
http://www.stanford.edu/class/cs193p/cgi-bin/drupal/downloads-2013-winter
I'm trying to save the content of a UITextView which contains lines of text formatted both RTL and LTR.
The problem is that UITextView checks only the first character to format direction. Let's assume I'm in "edit" mode and write this text (__ means spaces):
text1_______________________________________
____________________________________________אקסא
text2_______________________________________
After saving, the RTL alignment of אקסא is lost. When I edit this text again, it looks like this:
text1_______________________________________
אקסא
text2_______________________________________
I'm not able to mix the \u200F and \u200E directional characters in one UITextView.
How can I manage this and correctly save bidirectional text from a UITextView?
Here is a quick proof of concept using NSAttributedString :
- Split the text in paragraphs
- For each paragraph, detect the main language
- Create an attributed text with the correct alignment for the corresponding range
// In a subclass of `UITextView`
+ (NSTextAlignment)alignmentForString:(NSString *)astring {
    NSArray *rightToLeftLanguages = @[@"ar",@"fa",@"he",@"ur",@"ps",@"sd",@"arc",@"bcc",@"bqi",@"ckb",@"dv",@"glk",@"ku",@"pnb",@"mzn"];
    NSString *lang = CFBridgingRelease(CFStringTokenizerCopyBestStringLanguage((__bridge CFStringRef)astring,CFRangeMake(0,[astring length])));
    if (astring.length) {
        if ([rightToLeftLanguages containsObject:lang]) {
            return NSTextAlignmentRight;
        }
    }
    return NSTextAlignmentLeft;
}
- (void)setText:(NSString *)str { // Override
    [super setText:str];
    // Split in paragraph
    NSArray *paragraphs = [self.text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    // Attributed string for the whole string
    NSMutableAttributedString *attribString = [[NSMutableAttributedString alloc]initWithString:self.text];
    NSUInteger loc = 0;
    for(NSString *paragraph in paragraphs) {
        // Find the correct alignment for this paragraph
        NSMutableParagraphStyle *paragraphStyle = [[NSMutableParagraphStyle alloc]init];
        [paragraphStyle setAlignment:[WGTextView alignmentForString:paragraph]];
        // Find its corresponding range in the string
        NSRange range = NSMakeRange(loc, [paragraph length]);
        // Add it to the attributed string
        [attribString addAttribute:NSParagraphStyleAttributeName value:paragraphStyle range:range];
        // Skip this paragraph plus the one-character newline that ended it
        loc += [paragraph length] + 1;
    }
    [super setAttributedText:attribString];
}
Also, I recommend reading the Unicode BiDi Algorithm to manage more complex use cases.
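As an alternative to tracking the offset by hand, Foundation can report each paragraph's range directly. The sketch below (the function name is made up for this example) collects those ranges, and each one could then receive its own paragraph style.

```objc
#import <Foundation/Foundation.h>

// Returns the range of every paragraph in `text`, excluding the line
// terminators, wrapped in NSValue objects.
static NSArray<NSValue *> *ParagraphRanges(NSString *text)
{
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByParagraphs
                          usingBlock:^(NSString *substring, NSRange substringRange,
                                       NSRange enclosingRange, BOOL *stop) {
        // substringRange covers the paragraph text only; enclosingRange
        // would also include the terminator that follows it.
        [ranges addObject:[NSValue valueWithRange:substringRange]];
    }];
    return ranges;
}
```

Because the enumeration treats a CR-LF pair as a single terminator, the ranges stay correct for text with Windows line endings as well.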
I'm writing a simple shift cipher iPhone app as a pet project, and one piece of functionality I'm currently designing is a "universal" decryption of an NSString, that returns an NSArray, all of NSStrings:
- (NSArray*) decryptString: (NSString*)ciphertext{
NSMutableArray* theDecryptions = [NSMutableArray arrayWithCapacity:ALPHABET];
for (int i = 0; i < ALPHABET; ++i) {
NSString* theNewPlainText = [self decryptString:ciphertext ForShift:i];
[theDecryptions insertObject:theNewPlainText
atIndex:i];
}
return theDecryptions;
}
I'd really like to pass this NSArray into another method that attempts to spell check each individual string within the array, and builds a new array that puts the strings with the fewest typo'd words at lower indices, so they're displayed first. I'd like to use the system's dictionary like a text field would, so I can match against words that have been trained into the phone by its user.
My current guess is to split a given string up into words, then spell check each with NSSpellChecker's -checkSpellingOfString:startingAt: and use the number of correct words to sort the array. Is there an existing library method or well-accepted pattern that would help return such a value for a given string?
Well, I found a solution that works using UIKit/UITextChecker. It correctly finds the user's most preferred language dictionary, but I'm not sure if it includes learned words in the actual rangeOfMisspelledWordInString:range:startingAt:wrap:language: method. If it doesn't, calling [UITextChecker hasLearnedWord:] on currentWord inside the bottom if statement should be enough to find user-taught words.
As noted in the comments, it may be prudent to call rangeOfMisspelledWordInString:range:startingAt:wrap:language: with each of the top few languages in [UITextChecker availableLanguages], to help multilingual users.
-(void) checkForDefinedWords {
NSArray* words = [message componentsSeparatedByString:@" "];
NSInteger wordsFound = 0;
UITextChecker* checker = [[UITextChecker alloc] init];
//get the first language in the checker's memory- this is the user's
//preferred language.
//TODO: May want to search with every language (or top few) in the array
NSString* preferredLang = [[UITextChecker availableLanguages] objectAtIndex:0];
//for each word in the array, determine whether it is a valid word
for(NSString* currentWord in words){
NSRange range;
range = [checker rangeOfMisspelledWordInString:currentWord
range:NSMakeRange(0, [currentWord length])
startingAt:0
wrap:NO
language:preferredLang];
//if it is valid (no errors found), increment wordsFound
if (range.location == NSNotFound) {
//NSLog(@"%@ %@", @"Valid Word found:", currentWord);
wordsFound++;
}
else {
//NSLog(@"%@ %@", @"Invalid Word found:", currentWord);
}
}
//After all "words" have been searched, save wordsFound to validWordCount
[self setValidWordCount:wordsFound];
[checker release];
}
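To get from a per-string word count to the ordering asked for in the question, one option is to sort the candidate decryptions by that count. The sketch below is only an illustration. CandidateScorer and CandidatesSortedByScore are invented names, and the scorer block stands in for whatever counting logic (such as the UITextChecker loop above) is actually used.

```objc
#import <Foundation/Foundation.h>

// Hypothetical scoring block: returns the number of valid words in a candidate.
typedef NSUInteger (^CandidateScorer)(NSString *candidate);

// Returns the candidates ordered so that the highest score comes first.
static NSArray<NSString *> *CandidatesSortedByScore(NSArray<NSString *> *candidates,
                                                    CandidateScorer scorer)
{
    return [candidates sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
        NSUInteger scoreA = scorer(a);
        NSUInteger scoreB = scorer(b);
        if (scoreA > scoreB) return NSOrderedAscending;   // higher score sorts first
        if (scoreA < scoreB) return NSOrderedDescending;
        return NSOrderedSame;
    }];
}
```

The scorer runs more than once per candidate during the sort, so if spell checking turns out to be slow it may be worth computing each score once and caching it.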