Creating substrings from text file - objective-c

I have a text file that contains two lines of numbers. All I want to do is turn each line into a string, then add it to an array (called fields). My problem arises when trying to find the EOF character. I can read from the file with no problem: I turn its contents into an NSString, then pass it to this method.
-(void)parseString:(NSString *)inputString{
    NSLog(@"[parseString] *inputString: %@", inputString);
    //the end of the previous line, this is also the start of the next line
    int endOfPreviousLine = 0;
    //count of how many characters we've gone through
    int charCount = 0;
    //while we haven't gone through every character
    while(charCount <= [inputString length]){
        NSLog(@"[parseString] while loop count %i", charCount);
        //if it's an end of line character or end of file
        if([inputString characterAtIndex:charCount] == '\n' || [inputString characterAtIndex:charCount] == '\0'){
            //add a substring into the array
            [fields addObject:[inputString substringWithRange:NSMakeRange(endOfPreviousLine, charCount)]];
            NSLog(@"[parseString] string added into array: %@", [inputString substringWithRange:NSMakeRange(endOfPreviousLine, charCount)]);
            //set the endOfPreviousLine to the current char count, this is where the next string will start from
            endOfPreviousLine = charCount+1;
        }
        charCount++;
    }
    NSLog(@"[parseString] exited while. endOfPrevious: %i, charCount: %i", endOfPreviousLine, charCount);
}
The contents of my text file look like this:
123
456
I can get the first string (123) no problem. The call would be:
[fields addObject:[inputString substringWithRange:NSMakeRange(0, 3)]];
Next, I make the call for the second String:
[fields addObject:[inputString substringWithRange:NSMakeRange(4, 7)]];
But I get an error, and I think it is because my index is out of bounds. Since the index starts from 0, there is no index 7 (well, I think it's supposed to be the EOF character).
To sum everything up: I don't know how to deal with an index of 7 when there are only 6 characters + the EOF character.
Thanks.

You can use componentsSeparatedByCharactersInSet: to get the effect that you are looking for:
-(NSArray*)parseString:(NSString *)inputString {
return [inputString componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
}
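For reference, the hand-written loop in the question can also be made to work. NSMakeRange takes a location and a length, not a start and an end index, so NSMakeRange(4, 7) asks for 7 characters starting at index 4 and runs past the end of the string. The loop condition `<=` also reads one index past the last character, and an NSString has no EOF or '\0' character to test for. Below is a minimal sketch of a corrected loop. LinesByManualScan is a hypothetical helper name, and it returns the array instead of filling the fields variable from the question:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): splits a string on '\n' by hand.
static NSArray<NSString *> *LinesByManualScan(NSString *input) {
    NSMutableArray<NSString *> *fields = [NSMutableArray array];
    NSUInteger lineStart = 0;
    NSUInteger length = input.length;
    for (NSUInteger i = 0; i <= length; i++) {
        // The end of the string counts as a line terminator;
        // there is no EOF character to look for.
        BOOL atEnd = (i == length);
        if (atEnd || [input characterAtIndex:i] == '\n') {
            if (i > lineStart) {
                // NSMakeRange takes (location, length), not (start, end).
                NSRange range = NSMakeRange(lineStart, i - lineStart);
                [fields addObject:[input substringWithRange:range]];
            }
            lineStart = i + 1;
        }
    }
    return fields;
}
```

Called on the sample contents, LinesByManualScan(@"123\n456") yields an array holding 123 and 456. Empty lines are skipped in this sketch.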

The short answer is to use [inputString componentsSeparatedByString:@"\n"] to get an array of the lines.
Example:
Use the following code to get the lines in an array
NSString *path = [[NSBundle bundleForClass:[self class]] pathForResource:@"aaa" ofType:@"txt"];
// assumes the file is UTF-8 encoded
NSString *str = [[NSString alloc] initWithContentsOfFile:path encoding:NSUTF8StringEncoding error:NULL];
NSArray *lines = [str componentsSeparatedByString:@"\n"];
NSLog(@"str = %@", str);
NSLog(@"lines = %@", lines);
The above code assumes that you have a plain text file called "aaa.txt" in your resources.

Replacing bad words in a string in Objective-C

I have a game with a public highscore list where I allow players to enter their name (or anything up to 12 characters). I am trying to create a couple of functions to filter out bad words, using a list of bad words that I have in a text file. I have two methods:
One to read in the text file:
-(void) getTheBadWordsAndSaveForLater {
    badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
    badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
    badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
    badwords = [badWordFile componentsSeparatedByString:@"\n"];
    NSLog(@"Number Of Words Found in file: %i",[badwords count]);
    for (NSString* words in badwords) {
        NSLog(@"Word in Array----- %@",words);
    }
}
And one to check a word (NSString*) against the list that I read in:
-(NSString *) removeBadWords :(NSString *) string {
    // If I hard code this line below, it works....
    // *****************************************************************************
    //badwords =[[NSMutableArray alloc] initWithObjects:@"shet",@"shat",@"shut",nil];
    // *****************************************************************************
    NSLog(@"checking: %@",string);
    for (NSString* words in badwords) {
        string = [string stringByReplacingOccurrencesOfString:words withString:@"-" options:NSCaseInsensitiveSearch range:NSMakeRange(0, string.length)];
        NSLog(@"Word in Array: %@",words);
    }
    NSLog(@"Cleaned Word Returned: %@",string);
    return string;
}
The issue I'm having is that when I hardcode the words into an array (see the commented-out line above), it works like a charm. But when I use the array I read in with the first method, it doesn't work: stringByReplacingOccurrencesOfString: does not seem to have any effect. I have traced out to the log so I can see whether the words are coming through, and they are. That one line just doesn't seem to see the words unless I hardcode them into the array.
Any suggestions?
A couple of thoughts:
You have two lines:
badwords =[[NSArray alloc] initWithContentsOfFile:badWordFile];
badwords = [badWordFile componentsSeparatedByString:@"\n"];
There's no point in calling initWithContentsOfFile if you're just going to replace the result with componentsSeparatedByString on the next line. Plus, initWithContentsOfFile assumes the file is a property list (plist), but the rest of your code clearly assumes it's a newline-separated text file. Personally, I would have used the plist format (it obviates the need to trim the whitespace from the individual words), but you can use whichever you prefer. Just use one or the other, not both.
If you're staying with the newline-separated list of bad words, then just get rid of the line that says initWithContentsOfFile, since you disregard its result anyway. Thus:
- (void)getTheBadWordsAndSaveForLater {
    // these should be local variables, so get rid of your instance variables of the same name
    NSString *badWordsFilePath = [[NSBundle mainBundle] pathForResource:@"badwords" ofType:@"txt"];
    NSString *badWordFile = [[NSString alloc] initWithContentsOfFile:badWordsFilePath encoding:NSUTF8StringEncoding error:nil];
    // calculate `badwords` solely from `componentsSeparatedByString`, not `initWithContentsOfFile`
    badwords = [badWordFile componentsSeparatedByString:@"\n"];
    // confirm what we got (count is an NSUInteger, so cast it for %lu)
    NSLog(@"Found %lu words: %@", (unsigned long)[badwords count], badwords);
}
You might want to look for whole word occurrences only, rather than just the presence of the bad word anywhere:
- (NSString *)removeBadWords:(NSString *)string {
    NSLog(@"checking: %@ for occurrences of these bad words: %@", string, badwords);
    for (NSString *badword in badwords) {
        // escape the word so any regex metacharacters in it are matched literally
        NSString *escaped = [NSRegularExpression escapedPatternForString:badword];
        NSString *searchString = [NSString stringWithFormat:@"\\b%@\\b", escaped];
        string = [string stringByReplacingOccurrencesOfString:searchString
                                                   withString:@"-"
                                                      options:NSCaseInsensitiveSearch | NSRegularExpressionSearch
                                                        range:NSMakeRange(0, string.length)];
    }
    NSLog(@"resulted in: %@", string);
    return string;
}
This uses a "regular expression" search, where \b stands for "a boundary between words". Thus, \bhell\b (or, because backslashes have to be escaped in an NSString literal, @"\\bhell\\b") will search for the word "hell" as a separate word, but won't match "hello", for example.
Note, above, I am also logging badwords to see if that variable was reset somehow. That's the only thing that would make sense given the symptoms you describe, namely that loading the bad words from the text file works but the replace process fails. So examine badwords before you replace and make sure it's still set properly.
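On the point about trimming whitespace from the individual words: if the text file has Windows line endings or a trailing newline, splitting on "\n" alone leaves a stray "\r" on each word or an empty string at the end of the array. Whether that is what happened in the question is an assumption, not something the question confirms. Here is a minimal sketch of a loader that trims each entry and drops empty ones. CleanedWordsFromText is a hypothetical helper name, and it takes the file contents as a string rather than reading the file itself:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: turns the contents of a newline-separated word list
// into an array of trimmed, non-empty words.
static NSArray<NSString *> *CleanedWordsFromText(NSString *fileContents) {
    NSMutableArray<NSString *> *words = [NSMutableArray array];
    NSCharacterSet *newlines = [NSCharacterSet newlineCharacterSet];
    NSCharacterSet *blanks = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    // Splitting on the newline set handles "\n", "\r" and "\r\n" alike;
    // "\r\n" produces an empty component, which is filtered out below.
    for (NSString *line in [fileContents componentsSeparatedByCharactersInSet:newlines]) {
        NSString *word = [line stringByTrimmingCharactersInSet:blanks];
        if (word.length > 0) {
            [words addObject:word];
        }
    }
    return words;
}
```

The result could be assigned to badwords in place of the componentsSeparatedByString: call.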

for loop parse newline then equal sign and put it in dictionary

NSString *result
result contains:
NC_AllowedWebHosts=
NC_BgeLAN=br1
NC_Doc=/tmp/dhd=
NC_ExPts=1863==
NC_Redirect=1
[...]
binary_custom=/path/to/directory
blocklist=0
blocklist_url=http://list.g.com/?list=
[...]
I am using this code, but I have problems parsing lines with a double == or triple ===, for example.
NSArray *strings = [result componentsSeparatedByCharactersInSet:
                    [NSCharacterSet characterSetWithCharactersInString:@"=\r\n"]];
NSMutableArray *keys = [NSMutableArray new];
NSMutableArray *values = [NSMutableArray new];
for (int i = 0; i+1 < strings.count; i+=2) {
    [keys addObject:strings[i]];
    [values addObject:strings[i+1]];
}
I would like to parse everything based on new line "\r\n" first then everything that is before the first "=" symbol put in a dictionary key, and everything after up to the new line in a dictionary value. This way I can say get me key "NC_ExPts" and value would return "1863==" and so on. Any help would be appreciated.
@Monolo I can read line by line, but I don't know how to split each line at the first "=" and put the two parts into keys and values.
NSArray *lines = [result componentsSeparatedByCharactersInSet:
                  [NSCharacterSet characterSetWithCharactersInString:@"\r\n\n"]];
for (NSString* line in lines) {
    if (line.length) {
        NSLog(@"line: %@", line);
    }
}
You need to read the original text line-by-line, then divide each line by only the first "="-sign. With the method you are using, you divide lines and key-value pairs in one go, meaning that you lose too much information about the structure of the data. This is why you are having difficulties handling lines with "==" in them in the value part.
NSString's enumerateLinesUsingBlock: will take care of the first part, and finding the first "=" in each of those lines is easily done with rangeOfString:.
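A minimal sketch of that approach, assuming the whole text is already in an NSString. KeyValuePairsFromString is a hypothetical helper name, and lines without any "=" are simply skipped, which is a choice made for this sketch:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: splits each line at the FIRST "=" only, so any
// further "=" characters stay in the value.
static NSDictionary<NSString *, NSString *> *KeyValuePairsFromString(NSString *text) {
    NSMutableDictionary<NSString *, NSString *> *pairs = [NSMutableDictionary dictionary];
    [text enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
        NSRange equals = [line rangeOfString:@"="];
        if (equals.location == NSNotFound) {
            return; // no "=" on this line, nothing to store
        }
        NSString *key = [line substringToIndex:equals.location];
        // Everything after the first "=" (possibly an empty string) is the value.
        NSString *value = [line substringFromIndex:NSMaxRange(equals)];
        pairs[key] = value;
    }];
    return pairs;
}
```

With the sample data, looking up NC_ExPts in the returned dictionary gives 1863==, and NC_AllowedWebHosts maps to an empty string.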

Parsing SRT file with Objective C

Text example:
1
00:00:00,000 --> 00:00:01,000
This is the first line
2
00:00:01,000 --> 00:00:02,000
This is the second line
3
00:00:02,000 --> 00:00:03,000
This is the last line
In JavaScript I would parse this with a regular expression certainly. I'm just wondering, is that the best way to do this in Obj C? I'm sure I could figure out a way to do this, but I'm wanting to do it an appropriate way.
I only need to know where to start and I'm happy to do the rest, but for understanding sake I'm going to end up with something like this (pseudo code):
NSDictionary
index -> [0-9]+
start -> hh:mm:ss,mmm
end -> hh:mm:ss,mmm
text -> one of the lines of text
In this case, I'd be parsing three entries into my dictionary.
Some background: I wrote a small app and created a file called stuff.srt containing your examples that resides in the bundle; hence, my means of accessing it.
This is just a quick and dirty thing, a proof-of-concept. Note that it doesn't check results. Real applications always check their results. As you can see, the work takes place in the -applicationDidFinishLaunching: method (I'm working in Mac OS X, not iOS).
EDIT:
It's been pointed out that the code as originally posted didn't handle multiple text lines correctly. To address this, I take advantage of the fact that SRT files use CRLF as their line breaks, and search for two occurrences of this sequence. I then change all occurrences of CRLF in the text string to spaces, based on what I observed here. This doesn't account for leading or trailing spaces in each line of the text.
I changed the contents of the stuff.srt file to this:
1
00:00:00,000 --> 00:00:01,000
This is the first line
and it has a secondary line
2
00:00:01,000 --> 00:00:02,000
This is the second line
3
00:00:02,000 --> 00:00:03,000
This is the last line
and it has a secondary line too
and the code has been revised as follows (I also put everything into an @autoreleasepool directive; there might be a lot of autoreleased objects generated in the course of parsing the file!):
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    NSString *path = [[NSBundle mainBundle] pathForResource:@"stuff" ofType:@"srt"];
    NSString *string = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:NULL];
    NSScanner *scanner = [NSScanner scannerWithString:string];
    while (![scanner isAtEnd])
    {
        @autoreleasepool
        {
            NSString *indexString;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&indexString];
            NSString *startString;
            (void) [scanner scanUpToString:@" --> " intoString:&startString];
            // My string constant doesn't begin with spaces because scanners
            // skip spaces and newlines by default.
            (void) [scanner scanString:@"-->" intoString:NULL];
            NSString *endString;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&endString];
            NSString *textString;
            // (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&textString];
            // BEGIN EDIT
            (void) [scanner scanUpToString:@"\r\n\r\n" intoString:&textString];
            textString = [textString stringByReplacingOccurrencesOfString:@"\r\n" withString:@" "];
            // Addresses trailing space added if CRLF is on a line by itself at the end of the SRT file
            textString = [textString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
            // END EDIT
            NSDictionary *dictionary = [NSDictionary dictionaryWithObjectsAndKeys:
                                        indexString, @"index",
                                        startString, @"start",
                                        endString , @"end",
                                        textString , @"text",
                                        nil];
            NSLog(@"%@", dictionary);
        }
    }
}
The revised output looks like this:
2013-02-09 16:10:17.727 SRTFileScan[4846:303] {
end = "00:00:01,000";
index = 1;
start = "00:00:00,000";
text = "This is the first line and it has a secondary line";
}
2013-02-09 16:10:17.729 SRTFileScan[4846:303] {
end = "00:00:02,000";
index = 2;
start = "00:00:01,000";
text = "This is the second line";
}
2013-02-09 16:10:17.730 SRTFileScan[4846:303] {
end = "00:00:03,000";
index = 3;
start = "00:00:02,000";
text = "This is the last line and it has a secondary line too";
}
One other thing I learned from what I've read today: The SRT file format originated in France, and the comma seen in the input is the decimal separator used there.
Apple has a sample code to parse subtitle files. Check the relevant part here:
https://developer.apple.com/library/mac/samplecode/avsubtitleswriterOSX/Listings/avsubtitleswriter_SubtitlesTextReader_m.html#//apple_ref/doc/uid/DTS40013409-avsubtitleswriter_SubtitlesTextReader_m-DontLinkElementID_5
My suggestion is to use an NSDateFormatter to parse the second line. I would split that string into two strings (see componentsSeparatedByString: in the NSString class reference), and do this while reading the file line by line.
So the loop would be:
If the file still contains data, read the next line;
If the line number is a multiple of 4, allocate a new object. This object should be able to hold two dates, one integer and one string;
If the line number is not a multiple of 4, read the line and assign its value to the corresponding field.
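The splitting step for the timing line can be sketched as follows. Note that this sketch swaps NSDateFormatter for plain arithmetic and converts each hh:mm:ss,mmm timestamp into a number of seconds, which avoids time zone and calendar questions. Both function names are hypothetical, and the separator is assumed to be exactly " --> " as in the sample:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: converts "hh:mm:ss,mmm" into seconds.
// Returns -1 if the timestamp does not have four numeric fields.
static NSTimeInterval SecondsFromSRTTimestamp(NSString *timestamp) {
    NSCharacterSet *separators = [NSCharacterSet characterSetWithCharactersInString:@":,"];
    NSArray<NSString *> *parts = [timestamp componentsSeparatedByCharactersInSet:separators];
    if (parts.count != 4) {
        return -1;
    }
    return parts[0].doubleValue * 3600.0   // hours
         + parts[1].doubleValue * 60.0     // minutes
         + parts[2].doubleValue            // seconds
         + parts[3].doubleValue / 1000.0;  // milliseconds
}

// Hypothetical helper: splits "start --> end" and converts both halves.
static BOOL ParseSRTTimingLine(NSString *line, NSTimeInterval *start, NSTimeInterval *end) {
    NSArray<NSString *> *halves = [line componentsSeparatedByString:@" --> "];
    if (halves.count != 2) {
        return NO;
    }
    NSTimeInterval s = SecondsFromSRTTimestamp(halves[0]);
    NSTimeInterval e = SecondsFromSRTTimestamp(halves[1]);
    if (s < 0 || e < 0) {
        return NO;
    }
    *start = s;
    *end = e;
    return YES;
}
```

Keeping the times as NSTimeInterval values makes them easy to compare against a player's current time. If real dates are needed, the two halves can be handed to an NSDateFormatter instead.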

How to check if NSString begins with a certain character

How do you check if an NSString begins with a certain character (the character *)?
The * is an indicator for the type of the cell, so I need the contents of this NSString without the *, but need to know if the * exists.
You can use the -hasPrefix: method of NSString:
Objective-C:
NSString* output = nil;
if([string hasPrefix:@"*"]) {
    output = [string substringFromIndex:1];
}
Swift:
var output: String?
if string.hasPrefix("*") {
    // dropFirst() replaces the older substringFromIndex/advancedBy spelling
    output = String(string.dropFirst())
}
You can use:
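NSString *newString = nil;
// characterAtIndex: returns a unichar, so compare it with == rather than isEqualToString:
if (myString.length > 0 && [myString characterAtIndex:0] == '*') {
    newString = [myString substringFromIndex:1];
}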
hasPrefix works especially well.
For example, if you were looking for an HTTP URL in an NSString, you would use componentsSeparatedByString: to create an NSArray and then iterate over the array, using hasPrefix: to find the elements that begin with http.
NSArray *allStringsArray = [myStringThatHasHttpUrls componentsSeparatedByString:@" "];
for (NSString *theString in allStringsArray) {
    if ([theString hasPrefix:@"http"]) {
        NSLog(@"The URL is %@", theString);
    }
}
hasPrefix returns a Boolean value that indicates whether a given string matches the beginning characters of the receiver.
- (BOOL)hasPrefix:(NSString *)aString
The parameter aString is the string that you are looking for.
The return value is YES if aString matches the beginning characters of the receiver, otherwise NO. Returns NO if aString is empty.
As a more general answer, try using the hasPrefix method. For example, the code below checks to see if a string begins with 10, which is the error code used to identify a certain problem.
NSString* myString = @"10:Username taken";
if([myString hasPrefix:@"10"]) {
    //display more elegant error message
}
Check whether the first character is an asterisk; if it is, use substringFromIndex: to get the string sans '*'. The function below does the check with rangeOfString: rather than characterAtIndex:.
NSString *stringWithoutAsterisk(NSString *string) {
    NSRange asterisk = [string rangeOfString:@"*"];
    return asterisk.location == 0 ? [string substringFromIndex:1] : string;
}
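Another approach, in case it helps someone:
// the length check keeps substringToIndex: from raising an exception on short strings
if (temp.length >= 4 && [[temp substringToIndex:4] isEqualToString:@"http"]) {
    //starts with http
}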
This might help? :)
http://developer.apple.com/mac/library/documentation/Cocoa/Reference/Foundation/Classes/NSString_Class/Reference/NSString.html#//apple_ref/occ/instm/NSString/characterAtIndex:
Just search for the character at index 0 and compare it against the value you're looking for!
This nice little bit of code I found by chance, and I have yet to see it suggested on Stack. It only works if the characters you want to remove or alter exist, which is convenient in many scenarios. If the character/s does not exist, it won't alter your NSString:
NSString *newString = [yourString stringByReplacingOccurrencesOfString:@"YOUR CHARACTERS YOU WANT TO REMOVE" withString:@"CAN either be EMPTY or WITH TEXT REPLACEMENT"];
This is how I use it:
//declare what to look for
NSString * suffixToRemove = @"</p>";
NSString * prefixToRemove = @"<p>";
NSString * randomCharacter = @"</strong>";
NSString * moreRandom = @"<strong>";
NSString * makeAndSign = @"&amp;";
//I AM INSERTING A VALUE FROM A DATABASE AND HAVE ASSIGNED IT TO returnStr
returnStr = [returnStr stringByReplacingOccurrencesOfString:suffixToRemove withString:@""];
returnStr = [returnStr stringByReplacingOccurrencesOfString:prefixToRemove withString:@""];
returnStr = [returnStr stringByReplacingOccurrencesOfString:randomCharacter withString:@""];
returnStr = [returnStr stringByReplacingOccurrencesOfString:moreRandom withString:@""];
returnStr = [returnStr stringByReplacingOccurrencesOfString:makeAndSign withString:@"&"];
//check the output
NSLog(@"returnStr IS NOW: %@", returnStr);
This one line performs three actions in one:
Checks your string for the character/s you do not want
Replaces them with whatever you like
Does not affect surrounding code
NSString* expectedString = nil;
if([givenString hasPrefix:@"*"])
{
    expectedString = [givenString substringFromIndex:1];
}

scanUpToCharactersFromSet stops after one loop

I'm trying to get the contents of a CSV file into an array. When I've done this before I had one record per line, and used the newline character with scanUpToCharactersFromSet:intoString:, passing newlineCharacterSet as the character set:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet]
intoString:&line])
Now, I'm working with a file where many of the entries themselves contain newline characters. I've tried adding a unique character to the end of each record (a * character) but my loop only runs once. Is there something which is making the while loop break that I don't know about? Here's the code I'm using now:
NSError *error;
NSString *data = [[NSString alloc] initWithContentsOfFile:[[self delegate] filePath] encoding:NSUTF8StringEncoding error:&error];
NSScanner *lineScanner = [NSScanner scannerWithString:data];
NSString *line = nil;
// Start parsing the CSV file
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"]
                                   intoString:&line]) {
    NSArray *elements = [line componentsSeparatedByString:@","];
    NSLog(@"Name: %@", [elements objectAtIndex:1]);
}
Edit: Thanks to Peter's answer below, I found that my scanner was stuck behind the * character. I added this line in the loop:
[lineScanner scanCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:NULL];
and now it's working like it should.
Let's go through one pass at a time:
First:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
The scanner puts everything before the line break into line. It advances up to the newline.
Second:
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
The scanner is already on a line break, so it scans no characters. As documented, since it scanned no characters, it returns NO. Your loop terminates.
The solution is to scan the line break at the end of the loop, to get the scanner past it. You can pass NULL for the output parameter, assuming you don't care what the line break was.
This is correct behavior: If you did/do care what the characters you scanned up to were, this lets you obtain them. That would be more difficult if NSScanner scanned past the characters automatically.
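Put together for the "*"-delimited file from the question, the scan-up-to step followed by the scan-past step can be sketched like this. RecordsSeparatedByAsterisk is a hypothetical helper name, and setting charactersToBeSkipped to nil is a choice made here so that newlines inside a record are left alone:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the records of a "*"-delimited string.
static NSArray<NSString *> *RecordsSeparatedByAsterisk(NSString *data) {
    NSCharacterSet *delimiter = [NSCharacterSet characterSetWithCharactersInString:@"*"];
    NSScanner *scanner = [NSScanner scannerWithString:data];
    // By default a scanner skips whitespace and newlines before each scan;
    // turn that off so records keep their embedded newlines untouched.
    scanner.charactersToBeSkipped = nil;

    NSMutableArray<NSString *> *records = [NSMutableArray array];
    while (!scanner.isAtEnd) {
        NSString *record = nil;
        // Scan up to (but not past) the next delimiter.
        if ([scanner scanUpToCharactersFromSet:delimiter intoString:&record]) {
            [records addObject:record];
        }
        // Step over the delimiter so the next pass starts on fresh text.
        [scanner scanCharactersFromSet:delimiter intoString:NULL];
    }
    return records;
}
```

Each pass either consumes some record text or consumes the delimiter, so the loop always makes progress. A run of consecutive "*" characters is consumed in one go, which means empty records are dropped in this sketch.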
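I think the while condition is wrong. According to the String Programming Guide, it should be something like:
while ([lineScanner isAtEnd] == NO) {
    [lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:&line];
    // ...
    // skip past the "*" delimiter, otherwise the scanner never reaches the end
    [lineScanner scanCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:NULL];
}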