Cocoa: Anomaly Writing TAB Character to File? - objective-c

I'm using the following to define an NSString containing the Tab character:
#define TAB @"\t"
I'm writing a longer NSString to a file using writeToFile:atomically:encoding:error:, with NSUTF8StringEncoding as the encoding. This longer NSString contains TAB characters.
When I open the resulting file in TextEdit, I see a character that looks like a Japanese glyph in the place of the TAB character. Here is a screen shot of a line that is intended to have two tab characters, but which has these odd characters instead:
odd characters http://www.market-research-services.com/starpowermedia/for_distribution/tab-char-anomaly.png
What is the correct way to #define an NSString that will contain a TAB character to be written to a file of NSUTF8StringEncoding?
Thanks in advance to all for any info.
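For reference, a minimal sketch of defining the tab constant with the `@` string-literal prefix and then checking the bytes that actually land on disk. The helper name `WriteTabbedLineAndReadBytes` and the sample fields are hypothetical. This does not diagnose the glyph seen in TextEdit; it only shows that a tab written with NSUTF8StringEncoding should be the single byte 0x09.

```objc
#import <Foundation/Foundation.h>

// An NSString literal needs the @ prefix; "\t" is the tab escape.
#define TAB @"\t"

// Hypothetical helper: writes "a<TAB>b<TAB>c" to `path` as UTF-8 and
// returns the raw bytes read back from disk, or nil if the write failed.
static NSData *WriteTabbedLineAndReadBytes(NSString *path) {
    NSString *line = [@[@"a", @"b", @"c"] componentsJoinedByString:TAB];
    NSError *error = nil;
    BOOL ok = [line writeToFile:path
                     atomically:YES
                       encoding:NSUTF8StringEncoding
                          error:&error];
    if (!ok) {
        NSLog(@"write failed: %@", error);
        return nil;
    }
    return [NSData dataWithContentsOfFile:path];
}
```

If the bytes on disk are correct, the next thing to check would be which encoding the viewing application chose when it opened the file.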

Related

Objc-DataFile-Unreadable Substring-Unknown to any encoding

I have a data file built by a subsidiary application. I need to locate some substrings contained in the data file. They are identifiable by the character symbols delimiting them, for instance: *!substringqSxt. The substring will vary from one project to another, so I need to locate the symbols delimiting them in order to read the substring that follows. I also printed the file in different encodings to find which one was used and matched the original data file; it turned out to be NSMacOSRomanStringEncoding.
I use rangeOfString: to locate the delimiting symbols. Here is my code:
char *debutAudio ="jjbj";
char *finAudio ="qSxt";
NSString *debutAudioConverted = [[NSString alloc]
    initWithCString:debutAudio
           encoding:NSMacOSRomanStringEncoding];
NSString *finAudioConverted = [[NSString alloc]
    initWithCString:finAudio
           encoding:NSMacOSRomanStringEncoding];
NSRange debutaudioRange = [dataFileContent rangeOfString:debutAudioConverted];
NSRange finaudioRange = [dataFileContent rangeOfString:finAudioConverted];
NSLog(@"range is %@", NSStringFromRange(debutaudioRange));
NSLog(@"range is %@", NSStringFromRange(finaudioRange));
Both NSLog calls print range is {9223372036854775807, 0} (a location of NSNotFound),
so the delimiting strings are not being found.
And if I ask to look for other strings contained in the file like "Settings" the rangeOfString will return the proper location and length.
I thought the file might contain multiple encodings, and tried converting with initWithCString: to every possible encoding, but nothing worked.
Also, if I open the file in TextEdit and use the "Find" function, it will not locate the delimiting string, but will locate other words. My gut tells me it's related. I don't know where to look for info. Could the file be protected? I am reading a copy of it, though.
I have found where the problem occurs. The proper encoding is still Mac OS Roman. The problem is the prefix string *debutAudio "jjbj": there is actually a tiny space, like a quarter space, between each character. I have tried every Unicode space listed here: https://www.cs.tut.fi/~jkorpela/chars/spaces.html#adj
without any success. Next I will try to find a half or quarter space in Mac OS Roman and see if that works.
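One way to find out what actually sits between the visible letters is to look at the raw bytes rather than guess at space characters. The sketch below is only an illustration; the helper name `HexDump` is hypothetical and nothing here is specific to the file in question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the bytes of `data` in the range
// [start, start + length) as space-separated uppercase hex pairs.
// The range is clamped to the end of the data.
static NSString *HexDump(NSData *data, NSUInteger start, NSUInteger length) {
    if (start >= data.length) {
        return @"";
    }
    length = MIN(length, data.length - start);
    const unsigned char *bytes = data.bytes;
    NSMutableArray<NSString *> *parts = [NSMutableArray arrayWithCapacity:length];
    for (NSUInteger i = 0; i < length; i++) {
        [parts addObject:[NSString stringWithFormat:@"%02X", bytes[start + i]]];
    }
    return [parts componentsJoinedByString:@" "];
}
```

Loading the file with `[NSData dataWithContentsOfFile:]` and dumping a region known to contain the delimiter shows the exact byte values between the letters, which can then be used to build the search string.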

How do I remove hidden characters from a NSString?

After copying and pasting text from the web into an NSTextArea in my Mac app, I see
EE
If I copy these 2 letters in a browser I see:
E?E
If I copy them in google translator I get
E 'E
I cannot identify this character in between the two E. But the question is: how do I remove these hidden characters from my NSString?
In your uploaded file the specific hex code for the hidden character is 0x18
(found via Hex Fiend)
This character, along with others, is part of a 'control character set'. The set also contains characters such as tab (0x09) and newline (0x0A), which obviously we don't want to remove.
In Objective-C, we can use the NSCharacterSet controlCharacterSet in conjunction with whitespaceAndNewlineCharacterSet to get just the blank characters that have no rendered width.
NSMutableCharacterSet* zeroWidthCharacterSet = [[NSCharacterSet controlCharacterSet] mutableCopy];
[zeroWidthCharacterSet formIntersectionWithCharacterSet:[[NSCharacterSet whitespaceAndNewlineCharacterSet] invertedSet]];
Then we can simply use the good old split by character set method
string = [[string componentsSeparatedByCharactersInSet:zeroWidthCharacterSet] componentsJoinedByString:@""];
Note that if a special character that needs more than one UTF-8 unit to represent itself (like an emoji) uses 0x18, then stripping it will break the character combination.
Because the control characters are special, I don't believe you'd ever find them in an Emoji sequence.
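Put together, the approach from this answer could be wrapped in one function. This is a sketch rather than a drop-in utility; the function name `StripInvisibleControlCharacters` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Removes characters that are in controlCharacterSet but are NOT
// whitespace or newlines, so tab, LF and CR survive while characters
// such as 0x18 (CAN) are dropped.
static NSString *StripInvisibleControlCharacters(NSString *string) {
    NSMutableCharacterSet *toRemove =
        [[NSCharacterSet controlCharacterSet] mutableCopy];
    [toRemove formIntersectionWithCharacterSet:
        [[NSCharacterSet whitespaceAndNewlineCharacterSet] invertedSet]];
    // Split on every unwanted character, then glue the pieces back together.
    return [[string componentsSeparatedByCharactersInSet:toRemove]
               componentsJoinedByString:@""];
}
```

One caveat worth checking before using this on user-visible text: Apple documents controlCharacterSet as covering Unicode categories Cc and Cf, so format characters such as the zero-width joiner, which some emoji sequences rely on, would presumably be removed as well.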

Insert unicode symbol into label from array

I want to insert a unicode symbol for pi, which is \u03c0 into a label and for it to display the symbol. I am loading this in from an array which was read from a txt file. For example if I have a txt file that contains "\u03c0":
string = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:nil];
array[i] = string;
label.text = array[i];
What I am getting is "\u03c0" as output in the text field, but I want the symbol. What am I doing wrong?
Edit: it seems that my problem is with string encoding, because I am reading the array in from a file. I was using NSUTF8StringEncoding. What should this be changed to in order to allow Unicode?
My guess is that your file contains the escape sequence as literal text (what you would write as \\u03c0 in source code) rather than the actual character. If you have control of the file contents, paste in the actual character, not the sequence, because the editor saves the sequence with the "\" as an ordinary character. If you don't have control, I suggest writing code to detect this escaping, strip the preceding "\", and then use the result in your format.
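If the file cannot be changed, the "detect this escaping" suggestion might look roughly like the following. It is a sketch under assumptions: the function name `DecodeUnicodeEscapes` is hypothetical, and it only handles the four-hex-digit form \uXXXX, converting each match to one UTF-16 unit.

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Replaces every literal backslash-u-XXXX sequence in `input` with the
// UTF-16 unit it names. Other text is left untouched.
static NSString *DecodeUnicodeEscapes(NSString *input) {
    // Regex: a literal backslash, 'u', then exactly four hex digits.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u([0-9a-fA-F]{4})"
                                                  options:0
                                                    error:NULL];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];
    NSMutableString *result = [input mutableCopy];
    // Walk backwards so earlier match ranges stay valid after each edit.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *hex = [input substringWithRange:[match rangeAtIndex:1]];
        unichar unit = (unichar)strtoul(hex.UTF8String, NULL, 16);
        NSString *replacement = [NSString stringWithCharacters:&unit length:1];
        [result replaceCharactersInRange:match.range withString:replacement];
    }
    return result;
}
```

With this in place, the label could be set with something like `label.text = DecodeUnicodeEscapes(array[i]);`.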

Special characters not working with custom font

I'm having trouble with special characters (like â é ü) when using a custom font in my app.
For some reason, it's replacing the accented characters with seemingly random characters. I'm trying to set a UILabel's text to the word Château but both this:
myLabel.text = [NSString stringWithFormat:@"Château"];
and this:
myLabel.text = [NSString stringWithFormat:@"Ch\u00E1teau"];
are setting the label text to Ch,teau. This only happens when using the custom font; when I log the result in the console, the correct character shows up. I've tried setting a different string encoding. These characters do exist within the font (if I type the same word in TextEdit, MS Word, etc. it shows up fine), and I've also validated and inspected the font in Font Book. It's an .OTF font.
Any ideas what's happening?
Not all fonts contain all characters. Open Character Map in Windows, load your custom font, and check whether the character exists. It could be that the author of the font has inadvertently placed it at the wrong code point.
Try:
[NSString stringWithFormat:@"Ch\u00E2teau"]
Note the forward slash was changed to a backslash. The code point here is also U+00E2, which is â; U+00E1 is á.
PS: verified in real program.

How do I remove illegal characters from an NSString?

I am parsing a tab-separated list using an NSScanner, based upon each line and the tabs. However, for some reason the last field in the array (parsed from each row) contains a \r character.
How can I strip this from the NSString that represents the line (or the field)?
If the \r character is at the end (probably because the file being parsed is CRLF), you can just do something like [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]]. (You might want to use an explicitly created '\r' character set instead if you don't want to strip whitespace as well.)
Try using the +[NSCharacterSet newlineCharacterSet] method with NSScanner in your various scanning method calls.
Just FYI, the \r is part of the line ending for a file created in a Windows environment.
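As an illustration of the newlineCharacterSet idea, here is a sketch that splits CRLF text into rows and fields without leaving a trailing \r on the last field. It uses componentsSeparatedByCharactersInSet: instead of NSScanner to keep the example short, and the function name `ParseTabSeparated` is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Splits `text` into rows on any newline character (so both "\n" and
// "\r\n" work) and splits each row into fields on tab.
// Empty lines are skipped; a CRLF pair produces one of these between
// the \r and the \n, which is why no \r ends up in a field.
static NSArray<NSArray<NSString *> *> *ParseTabSeparated(NSString *text) {
    NSMutableArray<NSArray<NSString *> *> *rows = [NSMutableArray array];
    NSArray<NSString *> *lines =
        [text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    for (NSString *line in lines) {
        if (line.length == 0) {
            continue;
        }
        [rows addObject:[line componentsSeparatedByString:@"\t"]];
    }
    return rows;
}
```

Note that this also drops genuinely blank lines in the input; if those matter, trimming each field with a character set built from @"\r" would be the more conservative choice.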