Parsing a .txt document in Xcode - Objective-C

I would like to go through a .txt document and store blocks of text in NSStrings. My problem is that the document contains line breaks, and I don't know how to get rid of them. It would be nice if I could put each individual word into an ordered NSArray and then go through that array to extract information from it. I would like something like this:
// txt file
This is just a test.
End of the text file.
// NSArray and NSStrings
NSArray *wholeDocument = @[@"This", @"is", @"just", @"a", @"test", @"Foo", @"bar.", @"End", @"of", @"the", @"text", @"file."];
NSString *beginDocument = @"This is just a test";
NSString *endDocument = @"End of the text file.";

Try this:
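NSString *str = [NSString stringWithContentsOfFile:@"file.txt" encoding:NSUTF8StringEncoding error:NULL];
NSArray *arr = [str componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
Here, arr will contain all the words that are separated by spaces or line breaks. Note that two adjacent separators (for example, a space followed by a line break) produce an empty string in the array.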
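Because componentsSeparatedByCharactersInSet: returns an empty string wherever two separators sit next to each other, it can help to filter those out before walking the array. The following is a minimal sketch of that idea; the helper name WordsInString is hypothetical and not part of the original answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (name chosen for illustration): splits `text` on spaces,
// tabs and line breaks, then drops the empty strings that adjacent separators
// leave behind.
static NSArray<NSString *> *WordsInString(NSString *text) {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray<NSString *> *parts = [text componentsSeparatedByCharactersInSet:separators];
    NSPredicate *nonEmpty = [NSPredicate predicateWithFormat:@"length > 0"];
    return [parts filteredArrayUsingPredicate:nonEmpty];
}
```

Calling WordsInString on the contents of the text file from the question should then give one array entry per word, in document order.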

Related

Replacing bad words in a string in Objective-C

I have a game with a public high-score list where I allow players to enter their name (or anything up to 12 characters). I am trying to create a couple of functions to filter out bad words, using a list of bad words that I have in a text file. I have two methods:
One to read in the text file:
-(void) getTheBadWordsAndSaveForLater {
    badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
    badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
    badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
    badwords = [badWordFile componentsSeparatedByString:@"\n"];
    NSLog(@"Number Of Words Found in file: %i",[badwords count]);
    for (NSString* words in badwords) {
        NSLog(@"Word in Array----- %@",words);
    }
}
And one to check a word (NSString*) against the list that I read in:
-(NSString *) removeBadWords :(NSString *) string {
    // If I hard code this line below, it works....
    // *****************************************************************************
    //badwords =[[NSMutableArray alloc] initWithObjects:@"shet",@"shat",@"shut",nil];
    // *****************************************************************************
    NSLog(@"checking: %@",string);
    for (NSString* words in badwords) {
        string = [string stringByReplacingOccurrencesOfString:words withString:@"-" options:NSCaseInsensitiveSearch range:NSMakeRange(0, string.length)];
        NSLog(@"Word in Array: %@",words);
    }
    NSLog(@"Cleaned Word Returned: %@",string);
    return string;
}
The issue I'm having is that when I hard-code the words into an array (see the commented-out line above), it works like a charm. But when I use the array I read in with the first method, it doesn't work: stringByReplacingOccurrencesOfString:words does not seem to have any effect. I have traced the words out to the log so I can see whether they are coming through, and they are. That one line just doesn't seem to see the words unless I hard-code them into the array.
Any suggestions?
A couple of thoughts:
You have two lines:
badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
badwords = [badWordFile componentsSeparatedByString:@"\n"];
There's no point in calling initWithContentsOfFile if you're just going to replace the result with componentsSeparatedByString on the next line. Plus, initWithContentsOfFile assumes the file is a property list (plist), but the rest of your code clearly assumes it's a newline-separated text file. Personally, I would have used the plist format (it obviates the need to trim whitespace from the individual words), but you can use whichever you prefer. Just use one or the other, not both.
If you're staying with the newline-separated list of bad words, then just get rid of the line that calls initWithContentsOfFile, since you discard its result anyway. Thus:
- (void)getTheBadWordsAndSaveForLater {
    // these should be local variables, so get rid of your instance variables of the same name
    NSString *badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
    NSString *badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
    // calculate `badwords` solely from `componentsSeparatedByString`, not `initWithContentsOfFile`
    badwords = [badWordFile componentsSeparatedByString:@"\n"];
    // confirm what we got (count is an NSUInteger, so cast it for the %lu format)
    NSLog(@"Found %lu words: %@", (unsigned long)[badwords count], badwords);
}
You might want to look for whole word occurrences only, rather than just the presence of the bad word anywhere:
- (NSString *) removeBadWords:(NSString *) string {
    NSLog(@"checking: %@ for occurrences of these bad words: %@", string, badwords);
    for (NSString* badword in badwords) {
        // escape the word so any regex metacharacters in it are matched literally
        NSString *escaped = [NSRegularExpression escapedPatternForString:badword];
        NSString *searchString = [NSString stringWithFormat:@"\\b%@\\b", escaped];
        string = [string stringByReplacingOccurrencesOfString:searchString
                                                   withString:@"-"
                                                      options:NSCaseInsensitiveSearch | NSRegularExpressionSearch
                                                        range:NSMakeRange(0, string.length)];
    }
    NSLog(@"resulted in: %@", string);
    return string;
}
This uses a "regular expression" search, where \b stands for "a boundary between words". Thus, \bhell\b (or, because backslashes have to be escaped in an NSString literal, @"\\bhell\\b") will search for the word "hell" as a separate word, but won't match "hello", for example.
Note, above, I am also logging badwords to see if that variable was reset somehow. That's the only thing that would make sense given the symptoms you describe, namely that loading the bad words from the text file works but the replace process fails. So examine badwords before you replace and make sure it's still set properly.
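A newline-separated word list often yields entries with stray whitespace (for example a trailing \r from Windows line endings) or an empty last entry, and either one can silently defeat a whole-word search. The sketch below is one way to guard against that, under the assumption that trimming and skipping such entries is acceptable; the function name CensorWords is hypothetical.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces whole-word, case-insensitive occurrences of
// each entry in `badWords` with "-". Entries are trimmed first, and empty
// entries are skipped so they never turn into the pattern \b\b.
static NSString *CensorWords(NSString *input, NSArray<NSString *> *badWords) {
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSString *result = input;
    for (NSString *rawWord in badWords) {
        NSString *word = [rawWord stringByTrimmingCharactersInSet:whitespace];
        if (word.length == 0) {
            continue;
        }
        // Escape regex metacharacters so the word is matched literally.
        NSString *escaped = [NSRegularExpression escapedPatternForString:word];
        NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b", escaped];
        result = [result stringByReplacingOccurrencesOfString:pattern
                                                   withString:@"-"
                                                      options:NSCaseInsensitiveSearch | NSRegularExpressionSearch
                                                        range:NSMakeRange(0, result.length)];
    }
    return result;
}
```

With a word list of "hell" (even if it was read in as "hell\r"), the input "Hell hello" would come back as "- hello": the standalone word is replaced and the longer word is left alone.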

Keep text the same while keeping special characters

I have some text that contains line endings; I would like to put that text into an NSString and still be able to recognize the end of each line.
That is, I don't want to have to place a "\n" at the end of every line.
How can I do this in Objective-C?
I think it's best to hold the text in an NSArray, each element of which is a separate line. You can use -[NSString componentsSeparatedByCharactersInSet:] for that:
NSString *str = @"hello\nworld";
NSArray *lines = [str componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
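As an alternative sketch, NSString also offers enumerateLinesUsingBlock:, which walks the string line by line and treats \n, \r and \r\n each as a single line break. The helper name LinesInString below is hypothetical and only collects what the enumeration yields.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: collects the lines of `text` into an array, letting
// Foundation decide where each line ends.
static NSArray<NSString *> *LinesInString(NSString *text) {
    NSMutableArray<NSString *> *lines = [NSMutableArray array];
    [text enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
        // `line` arrives without its terminating line break.
        [lines addObject:line];
    }];
    return lines;
}
```

This avoids the extra empty entries that a character-set split can produce when the text uses \r\n line endings.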

Not showing smily ( Emoji ) in in UITextView in iOS?

I have stored all the Unicode values (emoji characters) supported by the iPhone in a plist. When I write one directly, as in
- (IBAction)sendButtonSelected:(id)sender {
    NSMutableArray *emoticonsArray = [[NSMutableArray alloc]initWithObjects:@"\ue415",nil];
    NSString *imageNameToPass = [NSString stringWithFormat:@"%@",[emoticonsArray objectAtIndex:0]];
    NSLog(@"imageNameToPass1...%@",imageNameToPass);
    messageTextView.text =imageNameToPass;
}
it shows the emoji in the text view, but as soon as I fetch it from the plist
NSString *plistPath1 = [[NSBundle mainBundle] pathForResource:@"unicodes" ofType:@"plist"];
NSDictionary *dictionary = [[NSDictionary alloc] initWithContentsOfFile:plistPath1];
activeArray= [dictionary objectForKey:categoryString];
NSLog(@"activeArray...%@",activeArray);
emoticonsArrayForHomeEmoji = [[NSMutableArray alloc]initWithCapacity:[activeArray count]];
for(int i=0; i<[activeArray count]; i++)
{
    id objects = (id)[activeArray objectAtIndex:i];
    [emoticonsArrayForHomeEmoji insertObject:objects atIndex:i];
}
NSString *imageNameToPass = [NSString stringWithFormat:@"%@",[emoticonsArrayForHomeEmoji objectAtIndex:0]];
NSLog(@"imageNameToPass1...%@",imageNameToPass);
messageTextView.text =imageNameToPass;
then it shows the Unicode escape as the text \ue415 in the text view instead of the emoji.
What am I doing wrong? Please help me out!
As @AliSoftware rightly said, the plist data will be read as-is, so you can add the emojis to your plist by following these steps:
1) Go to your top bar, and click on Edit.
2) Now select Special Characters
3) Now drag and drop emoji to plist.
For more details, I am adding screenshots; take a look at them.
The \uxxxx notation is only interpreted by the compiler (because source code is usually in ASCII, MacRoman, or some other encoding, and not often UTF-8).
Plist files use the characters directly and are encoded in UTF-8.
So you should insert the emoji character itself into the plist, instead of using the \uxxxx notation, because the plist data will be read as-is.
The Lion and Mountain Lion keyboard palettes contain emoji characters directly, so it should not be difficult to insert the characters when editing the plist anyway.

Why is it that every other object in my array is blank?

I read a CSV file into an array using:
NSString *theWholeTable = [NSString stringWithContentsOfFile:[[NSBundle mainBundle] pathForResource:@"example" ofType:@"csv"]
                                                    encoding:NSUTF8StringEncoding
                                                       error:NULL];
NSArray *tableRows = [theWholeTable componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
And every second object in the array is empty, any idea why?
the CSV file data looks like this:
10,156,326,614,1261,1890,3639,5800,10253,20914
20,107,224,422,867,1299,2501,3986,7047,14374
with 10 and 20 being the start of each new line.
thanks in advance.
Edit
I tried using the following code instead:
NSArray *tableRows = [theWholeTable componentsSeparatedByString:@"\n"];
And that worked the way I wanted it to.
However, I am still unsure why the newlineCharacterSet created empty objects...
If your CSV file comes from a non-UNIX system, it may contain multiple line separators (e.g. \r\n instead of \n). In this case, componentsSeparatedByCharactersInSet will insert empty strings for empty character sequences between \r and \n.
You can remove empty strings from NSArray using this method:
tableRows = [tableRows filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]];
To solve this problem, I used a different approach:
NSArray *tableRows = [theWholeTable componentsSeparatedByCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"\n"]];
That way you don't need to remove any lines.
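One caveat with splitting on \n alone: if the file really uses \r\n line endings, each row keeps a trailing \r. A sketch that sidesteps both problems is to split on the full newline set, skip the empty strings, and then split each row on commas. The helper name RowsInCSV is hypothetical, and the code assumes a simple CSV with no quoted fields or embedded commas.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: turns simple CSV text into an array of rows, where each
// row is an array of field strings. Assumes no quoting or embedded commas.
static NSArray<NSArray<NSString *> *> *RowsInCSV(NSString *csv) {
    NSArray<NSString *> *lines =
        [csv componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    NSMutableArray<NSArray<NSString *> *> *rows = [NSMutableArray array];
    for (NSString *line in lines) {
        // Empty strings appear between \r and \n and after a final line break.
        if (line.length == 0) {
            continue;
        }
        [rows addObject:[line componentsSeparatedByString:@","]];
    }
    return rows;
}
```

For the two-line sample data in the question, this should give two rows of ten fields each, whichever line-ending style the file uses.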

Objective-C: Reading contents of a file into an NSString object doesn't convert unicode

I have a file, which I'm reading into an NSString object using stringWithContentsOfFile. It contains Unicode for Japanese characters such as:
\u305b\u3044\u3075\u304f
which I believe is
せいふく
I would like my NSString object to store the string as the latter, but it is storing it as the former.
The thing I don't quite understand is that when I do this:
NSString *myString = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
It stores it as: \u305b\u3044\u3075\u304f.
But when I hardcode in the string:
NSString *myString = @"\u305b\u3044\u3075\u304f";
It correctly converts it and stores it as: せいふく
Does stringWithContentsOfFile escape the Unicode in some way? Any help will be appreciated.
Thanks.
In the file, \u305b\u3044\u3075\u304f are just ordinary characters (a backslash, a "u" and four hex digits each), so that is what you get in the string. You need to save the actual Japanese characters in the file. That is, store せいふく in the file and that is what will be loaded into the string.
You can try this, though I don't know how feasible it is:
NSArray *unicodeArray = [stringFromFile componentsSeparatedByString:@"\\u"];
NSMutableString *finalString = [[NSMutableString alloc] initWithString:@""];
for (NSString *unicodeString in unicodeArray) {
    if (![unicodeString isEqualToString:@""]) {
        // scanHexInt: expects an unsigned int, so scan into one and narrow afterwards
        unsigned int scannedValue = 0;
        [[NSScanner scannerWithString:unicodeString] scanHexInt:&scannedValue];
        unichar codeValue = (unichar)scannedValue;
        NSString *betaString = [NSString stringWithCharacters:&codeValue length:1];
        [finalString appendString:betaString];
    }
}
// finalString should have せいふく
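The loop above assumes the file contains nothing but \uXXXX escapes. If ordinary text may be mixed in, one possible variation is to find each escape with a regular expression and copy everything else through unchanged. This is only a sketch: the function name DecodeUnicodeEscapes is hypothetical, and it handles only the four-hex-digit \uXXXX form.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces each literal \uXXXX sequence in `input` with
// the UTF-16 code unit it names, leaving all other text untouched.
static NSString *DecodeUnicodeEscapes(NSString *input) {
    // Regex: a backslash, the letter u, then exactly four hex digits.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\\\u([0-9a-fA-F]{4})"
                                                  options:0
                                                    error:NULL];
    NSMutableString *output = [NSMutableString string];
    __block NSUInteger cursor = 0;
    [regex enumerateMatchesInString:input
                            options:0
                              range:NSMakeRange(0, input.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        NSRange whole = match.range;
        // Copy the plain text that precedes this escape.
        [output appendString:[input substringWithRange:NSMakeRange(cursor, whole.location - cursor)]];
        // Convert the four hex digits into one UTF-16 code unit.
        NSString *hex = [input substringWithRange:[match rangeAtIndex:1]];
        unsigned int value = 0;
        [[NSScanner scannerWithString:hex] scanHexInt:&value];
        unichar unit = (unichar)value;
        [output appendString:[NSString stringWithCharacters:&unit length:1]];
        cursor = NSMaxRange(whole);
    }];
    // Copy whatever follows the last escape.
    [output appendString:[input substringFromIndex:cursor]];
    return output;
}
```

Passing the string read from the file through DecodeUnicodeEscapes should then yield せいふく for the sample in the question, while text without escapes comes back unchanged.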
Something like \u305b in an Objective-C string literal is in fact an instruction to the compiler to replace it with the actual UTF-8 byte sequence for that character. The method reading the file is not a compiler, and only reads the bytes it finds. So to get that character (officially called a "code point"), your file must contain the actual UTF-8 byte sequence for that character, and not the symbolic representation \u305b.
It's a bit like \x43. In your source code this is four characters, but it is replaced by one byte with the value 0x43. So if you write @"\x43" to a file, the file will not contain the four characters '\', 'x', '4', '3'; it will contain the single character 'C' (which has the ASCII value 0x43).
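The difference is easy to see by comparing string lengths. In the small sketch below, the constant names are hypothetical; the doubled backslash stands in for what a file containing the literal text \u305b would give you.

```objective-c
#import <Foundation/Foundation.h>

// The compiler resolves this escape, so the string holds one character.
static NSString * const kCompiledEscape = @"\u305b";

// The doubled backslash keeps the escape as plain text: a backslash, 'u' and
// four hex digits, which is what reading the text \u305b from a file yields.
static NSString * const kLiteralEscapeText = @"\\u305b";

// The compiler resolves \x43 to the single character 'C'.
static NSString * const kHexEscape = @"\x43";
```

Here kCompiledEscape.length is 1, kLiteralEscapeText.length is 6, and kHexEscape is equal to @"C".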