In the context of an iPhone app I am developing, I am parsing some HTML with NSRegularExpression to extract data to map. This information is updated whenever the user "pans" the map to a new location.
This works fine the first time or two through, but on the second or third call the application hangs. I have used Xcode's profiler to confirm I am not leaking memory, and no error is generated (the application does not terminate; it just sits in execution at the point shown below).
When I examine the HTML being parsed, it does not appear incomplete or otherwise garbled when the application hangs.
Furthermore, if I replace the regex code with a collection of explicit address strings, everything works as expected.
- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
// receivedData contains the returned HTML
NSString *result = [[NSString alloc] initWithData:receivedData encoding:NSASCIIStringEncoding];
NSError *error = nil;
NSString *pattern = @"description.*?h4>(.*?)<\\/h4>.*?\"address>[ \\s]*(.*?)<.*?zip>.*?(\\d{5,5}), US<";
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:pattern
options:NSRegularExpressionDotMatchesLineSeparators
error:&error];
__block NSUInteger counter = 0;
// the application hangs on the next line after 1-2 times through
[regex enumerateMatchesInString:result options:0 range:NSMakeRange(0, [result length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
NSRange range = [match rangeAtIndex:2];
NSString *streetAddress =[result substringWithRange:range];
range = [match rangeAtIndex:3];
NSString *cityStateZip = [result substringWithRange:range];
NSString *address = [NSString stringWithFormat:@"%@ %@", streetAddress, cityStateZip];
EKItemInfo *party = [self addItem:address]; // geocode address and then map it
if (++counter > 4) *stop = YES;
}];
[receivedData release];
[result release];
[connection release]; //alloc'd previously, so release here.
}
I realize this is going to be difficult/impossible to duplicate, but I was wondering if anyone has run into a similar issue with NSRegularExpression or if there is something obviously wrong here.
I have also run into an exception with NSRegularExpression. In my case the problem was character encoding, so I wrote code that tries several character encodings in turn. Maybe this code will help you.
+ (NSString *)encodedStringWithContentsOfURL:(NSURL *)url
{
// Get the web page HTML
NSData *data = [NSData dataWithContentsOfURL:url];
// response
int enc_arr[] = {
NSUTF8StringEncoding, // UTF-8
NSShiftJISStringEncoding, // Shift_JIS
NSJapaneseEUCStringEncoding, // EUC-JP
NSISO2022JPStringEncoding, // JIS
NSUnicodeStringEncoding, // Unicode
NSASCIIStringEncoding // ASCII
};
NSString *data_str = nil;
int max = sizeof(enc_arr) / sizeof(enc_arr[0]);
for (int i=0; i<max; i++) {
data_str = [
[NSString alloc]
initWithData : data
encoding : enc_arr[i]
];
if (data_str!=nil) {
break;
}
}
return data_str;
}
You can download the whole category library from GitHub and just run it. I hope this helps.
https://github.com/weed/p120801_CharacterEncodingLibrary
Maybe the answer to this question can be found at: "NSRegularExpression enumerateMatchesInString: [...] usingBlock does never stop".
I solved this issue by passing NSMatchingReportCompletion as an option and setting stop to YES when the match is nil.
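For illustration, here is a minimal sketch of that approach. It is not the asker's code: the helper name FirstMatches and its limit parameter are made up for the example. It relies only on the documented behaviour that NSMatchingReportCompletion makes the block run one extra time after matching has finished, and that match is nil on that call.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns at most `limit` matched substrings of `pattern` in `text`.
static NSArray *FirstMatches(NSString *pattern, NSString *text, NSUInteger limit) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil || text == nil) {
        // Invalid pattern or no input: nothing to enumerate.
        return [NSArray array];
    }

    NSMutableArray *found = [NSMutableArray array];
    [regex enumerateMatchesInString:text
                            options:NSMatchingReportCompletion
                              range:NSMakeRange(0, [text length])
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        // With NSMatchingReportCompletion the block is also called once
        // with a nil match when matching has completed.
        if (match == nil) {
            *stop = YES;
            return;
        }
        [found addObject:[text substringWithRange:[match range]]];
        if ([found count] >= limit) {
            *stop = YES; // enough results, stop early
        }
    }];
    return found;
}
```

Calling FirstMatches(@"\\d+", someString, 5) would collect up to five runs of digits. Checking for nil before touching the match also protects any rangeAtIndex: and substringWithRange: calls inside the block.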
Related
I created a file using the following code:
NSMutableString *tabString = [NSMutableString stringWithCapacity:0]; // it will automatically expand
// write column headings <----- TODO
// now write out the data to be exported
for(int i = 0; i < booksArray.count; i++) {
[tabString appendString:[NSString stringWithFormat:@"%@\t,%@\t,%@\t\n",
[[booksArray objectAtIndex:i] author],
[[booksArray objectAtIndex:i] binding],
[[booksArray objectAtIndex:i] bookDescription]]];
}
if (![self checkForDataFile: @"BnN.tab"]) // does the file exist?
[[NSFileManager defaultManager] createFileAtPath:documentsPath contents: nil attributes:nil]; // create it if not
NSFileHandle *handle;
handle = [NSFileHandle fileHandleForWritingAtPath: [NSString stringWithFormat:@"%@/%@",documentsPath, @"BnN.tab"]]; // <---------- userID?
[handle truncateFileAtOffset:[handle seekToEndOfFile]]; // tell handle where's the file to write
[handle writeData:[tabString dataUsingEncoding:NSUTF8StringEncoding]]; //position handle cursor to the end of file (why??)
This is the code I am using to read back the file (for debugging purposes):
// now read it back
NSString* content = [NSString stringWithContentsOfFile:[NSString stringWithFormat:@"%@/%@",documentsPath, @"BnN.tab"]
encoding:NSUTF8StringEncoding
error: ^{ NSLog(@"error: %@", (NSError **)error);
}];
I am getting 2 build errors on this last statement that says:
Sending 'void (^)(void)' to parameter of incompatible type 'NSError *__autoreleasing *'
and
Use of undeclared identifier 'error'
This is the first time I am using a block to handle method returned errors; I was unable to find any docs in SO or Google showing how to do this. What am I doing wrong?
That function is expecting an NSError** parameter, not a block. The way you should be calling it is something like:
NSError *error = nil;
NSString* content = [NSString stringWithContentsOfFile: [NSString stringWithFormat:@"%@/%@", documentsPath, @"BnN.tab"]
encoding: NSUTF8StringEncoding
error: &error];
if (content == nil) {
NSLog(@"error: %@", error);
}
I apologize if this is a repeat, but I honestly have done my best to research this and haven't come up with much.
I'm making an iPad reference app that will make several large textbooks searchable (maybe 3-4k pages in total). It's a fairly simple idea: the user can choose any combination of texts to search, put in his term, and the app will find those terms in all the texts and return them, indexed in a table view.
The view controller has a series of switches, the value of which get read by a method into an NSSet and passed to the search controller. That part works.
I have a SearchController class which the view controller instantiates and calls a method on:
-(void)performSearchWithString:(NSString *)searchString andTexts:(NSSet *)texts
{
for (id book in texts) {
if ([book isEqual:kBook1]){
NSError *error = nil;
NSURL *url = [[NSBundle mainBundle] URLForResource:@"neiJing" withExtension:@"txt"];
NSString *text = [NSString stringWithContentsOfURL:url encoding:NSStringEncodingConversionAllowLossy error:&error];
[text enumerateSubstringsInRange:NSMakeRange(0, [text length])
options:NSStringEnumerationByWords
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
NSRange found = [substring rangeOfString:text];
if (found.location != NSNotFound) {
NSLog(@"%@", substring);
} else {
NSLog(@"Not found");
}
}];
}
}
}
I seem to have succeeded in enumerating through every word in all the texts, but no return values are found so I just get a never-ending stream of "Not found".
I inserted a test phrase into each text that I know for certain to be there, but it's not coming up.
I have a feeling I'm going about this all wrong. Even if I made this work, the performance hit might be too big for a useable app...I'm still trying to wrap my head around blocks, too. I just haven't found any ready-baked solutions out there for searching large volumes of text and picking out results. If anyone has any hints or references to an open-source library that I might adapt, I would be very grateful.
It looks like you're searching for the whole text inside each substring passed to the block, so the search can never succeed. This line is the problem:
NSRange found = [substring rangeOfString:text];
The code needs to look for something it can find:
NSString *findMe = @"A string we expect to find";
[text enumerateSubstringsInRange:NSMakeRange(0, [text length])
options:NSStringEnumerationByWords
usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
if ([findMe isEqualToString:substring] ) {
NSLog(@"Found %@", substring);
} else {
NSLog(@"Not found");
}
}];
I don't think the block of code is what you want. It loops through each word in the text, whereas you want to find every occurrence of the search string. Here is a loop that sets a new search range based on the last successful match.
-(void)performSearchWithString:(NSString *)searchString andTexts:(NSSet *)texts
{
for (id book in texts) {
if ([book isEqual:kBook1]){
NSError *error = nil;
NSURL *url = [[NSBundle mainBundle] URLForResource:@"neiJing" withExtension:@"txt"];
NSString *text = [NSString stringWithContentsOfURL:url encoding:NSStringEncodingConversionAllowLossy error:&error];
NSRange searchRange = NSMakeRange(0, [text length]);
NSRange match = [text rangeOfString:searchString options:0 range:searchRange];
while (match.location != NSNotFound) {
// match is the range of the current successful match
NSLog(@"matching range -- %@", NSStringFromRange(match));
NSUInteger locationOfNextSearchRange = NSMaxRange(match);
searchRange = NSMakeRange(locationOfNextSearchRange, [text length] - locationOfNextSearchRange);
match = [text rangeOfString:searchString options:0 range:searchRange];
}
}
}
}
I've only been learning Cocoa/Objective-C for a few days, so apologies if this is simple or obvious, but it's got me stumped.
I've written this handler for saving 3 floats to a text file. However, when I run it the file is not saved. Could anyone suggest whether there's an error in my code, or whether something else (like file write permissions) is preventing the file from being written?
Research has led me to look into sandboxing, but that gets confusing very quickly, and I'm hoping that just running the app from Xcode in debug would let me write to my user directory.
Here's the code:
- (IBAction)saveResultsAction:(id)sender {
//Sets up the data to save
NSString *saveLens = [NSString stringWithFormat:@"Screen width is %.02f \n Screen Height is %.02f \n Lens is %.02f:1",
self.myLens.screenWidth,
self.myLens.screenHeight,
self.myLens.lensRatio];
NSSavePanel *save = [NSSavePanel savePanel];
long int result = [save runModal];
if (result == NSOKButton) {
NSURL *selectedFile = [save URL];
NSLog(@"Save URL is %@", selectedFile);
NSString *fileName = [[NSString alloc] initWithFormat:@"%@.txt", selectedFile];
NSLog(@"Appended URL is %@", fileName);
[saveLens writeToFile:fileName
atomically:YES
encoding:NSUTF8StringEncoding
error:nil];
}
}
An NSURL object is not a POSIX path.
It is a URL, and formatting its description into a string does not turn it into a path. Use its path property instead:
NSString *fileName = [selectedFile.path stringByAppendingPathExtension:@"txt"];
But, as said, you shouldn't have to append .txt at all; just use what the panel returns. Otherwise you may get sandbox errors, because you don't have access rights to the modified file name. :)
NSString *fileName = selectedFile.path;
The problem is that you don't need to append the file extension to the URL. The extension is already there. You could directly do this:
if (result == NSOKButton)
{
[saveLens writeToURL: [save URL]
atomically:YES
encoding:NSUTF8StringEncoding
error:nil];
}
I see you've already accepted an answer, but it may also be helpful to know how to debug this type of issue using NSError pointers.
Cocoa uses NSError to richly encapsulate errors from method calls that can fail. (Objective-C also has exceptions, but they're reserved for cases of programmer error, like an array index out of bounds, or a nil parameter that should never be nil.)
When a method accepts an error pointer, it usually also returns a BOOL indicating overall success or failure. Here's how to get more information:
NSError *error = nil;
if (![saveLens writeToFile:fileName
atomically:YES
encoding:NSUTF8StringEncoding
error:&error]) {
NSLog(@"error: %@", error);
}
Or even:
NSError *error = nil;
if (![saveLens writeToFile:fileName
atomically:YES
encoding:NSUTF8StringEncoding
error:&error]) {
[NSApp presentError:error];
}
This is a question about Objective-C. I wrote a program that fetches a whole HTML page and applies a regular expression to it, and I have uploaded it to GitHub. However, an exception sometimes occurs.
The purpose of the program is to extract the "og:image" value with a regular expression match. This is the image Facebook displays when you post a URL. To set this image, you write the following in the HTML:
<meta property="og:image"
content="http://business.nikkeibp.co.jp/article/NBD/20120727/235043/zu1.jpg">
So I wrote a program that gets the whole HTML and finds the og:image part. The code is below:
// Web page address
NSURL *url = [NSURL URLWithString:textField.text];
// Get the web page HTML
NSString *string =
[NSString stringWithContentsOfURL:url encoding:NSUTF8StringEncoding error:nil];
// prepare regular expression to find text
NSError *error = nil;
NSRegularExpression *regexp =
[NSRegularExpression regularExpressionWithPattern:
@"<meta property=\"og:image\" content=\".+\""
options:0
error:&error];
@try {
// find by regular expression
NSTextCheckingResult *match =
[regexp firstMatchInString:string options:0 range:NSMakeRange(0, string.length)];
// get the first result
NSRange resultRange = [match rangeAtIndex:0];
NSLog(@"match=%@", [string substringWithRange:resultRange]);
if (match) {
// get the og:image URL from the find result
NSRange urlRange = NSMakeRange(resultRange.location + 35, resultRange.length - 35 - 1);
NSURL *urlOgImage = [NSURL URLWithString:[string substringWithRange:urlRange]];
imageView.image = [UIImage imageWithData:[NSData dataWithContentsOfURL:urlOgImage]];
}
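}
@catch (NSException *exception) {
// Not shown in the excerpt above; a @try needs a @catch (or @finally) to compile
NSLog(@"exception=%@", exception);
}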
The whole code is on GitHub:
https://github.com/weed/p120728_GetOgImage/blob/master/GetOgImage/ViewController.m
However, this program sometimes throws an exception.
Success case: http://www.nicovideo.jp/watch/1343369790
Failure case: http://business.nikkeibp.co.jp/article/NBD/20120727/235043/?ST=pc
Screenshots are here: https://github.com/weed/p120728_GetOgImage/blob/master/readme.md
Why does the exception occur? Any help would be appreciated. Thank you.
A friend kindly pointed out that I should consider character encoding. The page at the first URL is encoded in UTF-8, and the page at the second in EUC-JP.
With the code below I could get the og:image of the second URL shown above.
- (NSString *)encodedStringWithContentsOfURL:(NSURL *)url
{
// Get the web page HTML
NSData *data = [NSData dataWithContentsOfURL:url];
// response
int enc_arr[] = {
NSUTF8StringEncoding, // UTF-8
NSShiftJISStringEncoding, // Shift_JIS
NSJapaneseEUCStringEncoding, // EUC-JP
NSISO2022JPStringEncoding, // JIS
NSUnicodeStringEncoding, // Unicode
NSASCIIStringEncoding // ASCII
};
NSString *data_str = nil;
int max = sizeof(enc_arr) / sizeof(enc_arr[0]);
for (int i=0; i<max; i++) {
data_str = [
[NSString alloc]
initWithData : data
encoding : enc_arr[i]
];
if (data_str!=nil) {
break;
}
}
return data_str;
}
I made a character-encoding check library named NSString+Encode. The whole code is on GitHub:
https://github.com/weed/p120728_OgImageLibrary
It looks like your regular expression is not matching the second page. Have you tested the HTML source of that page against your regular expression in a regex tester?
Something like this should do the trick: http://regexpal.com/
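One thing a tester would show is that the sample tag in the question has a line break between the property and content attributes, while the pattern expects a single space. The following is only a sketch of a more tolerant pattern, not the asker's code. The helper name OgImageURLString is hypothetical, and it assumes that property comes before content and that both use double quotes. It reads the URL from a capture group instead of computing character offsets, and returns nil when the HTML is nil (for example after a failed decode) or when nothing matches.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the og:image URL string found in `html`, or nil.
static NSString *OgImageURLString(NSString *html) {
    if (html == nil) {
        // Passing nil to NSRegularExpression matching methods raises an exception.
        return nil;
    }
    NSError *error = nil;
    // \s+ accepts spaces, tabs and line breaks between the attributes;
    // ([^"]+) captures everything up to the closing quote of content.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:
                                 @"<meta\\s+property=\"og:image\"\\s+content=\"([^\"]+)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, [html length])];
    if (match == nil) {
        return nil;
    }
    // Group 1 is mandatory in the pattern, so its range is valid whenever there is a match.
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

The result can then be passed to [NSURL URLWithString:] after checking it for nil.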
I need RegexKitLite in my app for string validation.
I have also added libicucore.A.dylib.
I am currently working with Xcode 4.2, base SDK iOS 5.0, Apple LLVM compiler 3.0, architecture armv7.
Adding the regexkit folder to my app causes many errors, such as
Automatic Reference Counting errors,
Cast of Objective-C pointer type 'NSString *' to C pointer type 'CFStringRef', etc.
Please help; where have I gone wrong?
You can also disable ARC for RegexKitLite only, by adding a compiler flag:
Select the project -> your target -> the "Build Phases" tab, open "Compile Sources", and add the flag "-fno-objc-arc" for "RegexKitLite.m".
Update:
If you get:
Undefined symbols:
"_uregex_reset", referenced from:
_rkl_splitArray in RegexKitLite.o
_rkl_replaceAll in RegexKitLite.o
"_uregex_appendTail", referenced from:.......
Then, in the "Build Settings" tab, go to "Linking" -> "Other Linker Flags" and add "-licucore".
You aren't doing anything wrong. Regexkit just hasn't been updated for Automatic Reference Counting (ARC) yet. The big change with ARC, which arrived with Xcode 4.2 and the iOS 5 SDK, is that you no longer write retain, release, or autorelease calls yourself; the compiler inserts them for you. Memory management feels automatic, a bit like Java, except that it happens at compile time instead of at run time and there is no garbage collector.
Anyway, instead of waiting for Regexkit to be updated, you can use NSRegularExpression. Using Apple's own classes is also more future-proof, since Apple keeps them updated from version to version.
Good luck!
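Since the question is about string validation, here is a rough sketch of how that could look with NSRegularExpression alone. The helper name StringMatchesPattern is made up for this example. It wraps the caller's pattern in \A(?:...)\z so that only a match covering the entire string counts as valid.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES only when `pattern` matches the whole of `candidate`.
static BOOL StringMatchesPattern(NSString *candidate, NSString *pattern) {
    if (candidate == nil || pattern == nil) {
        return NO;
    }
    // \A and \z anchor to the very start and end of the input;
    // the non-capturing group keeps alternations in `pattern` intact.
    NSString *anchored = [NSString stringWithFormat:@"\\A(?:%@)\\z", pattern];
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:anchored options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return NO;
    }
    NSUInteger count = [regex numberOfMatchesInString:candidate
                                              options:0
                                                range:NSMakeRange(0, [candidate length])];
    return count > 0;
}
```

For example, StringMatchesPattern(@"AB1234", @"[A-Z]{2}\\d{4}") accepts two capital letters followed by four digits and rejects anything longer or shorter.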
CBGraham is right. Alternatively, you could disable automatic reference counting (Project > Build settings > search for 'automatic reference counting').
You will obviously have to do manual reference counting, but RegexKitLite should build now...
I replaced RegexKitLite with these two methods.
String Results:
+(NSString*) regExString: (NSString *) pattern withString: (NSString *) searchedString {
NSError *error = nil;
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
NSTextCheckingResult *match = [regex firstMatchInString:searchedString options:0 range: NSMakeRange(0, [searchedString length])];
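// Return the first capture group, or an empty string when there is no match or the group did not participate
if (match != nil && match.numberOfRanges > 1 && [match rangeAtIndex:1].location != NSNotFound) {
return [searchedString substringWithRange:[match rangeAtIndex:1]];
} else {
return @"";
}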
}
Array of Results:
+(NSArray *) regExArray:(NSString *) pattern withString: (NSString *) searchedString {
NSMutableArray *results = [[NSMutableArray alloc] init];
NSError *error = nil;
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
NSArray* matches = [regex matchesInString:searchedString options:0 range: NSMakeRange(0, searchedString.length)];
for (NSTextCheckingResult* match in matches) {
NSMutableArray *result = [NSMutableArray array];
NSRange matchRange = [match range];
NSString *numString = [searchedString substringWithRange:matchRange];
[result addObject:numString];
for (NSUInteger i = 1; i < match.numberOfRanges; i++) {
NSRange range = [match rangeAtIndex:i];
// A capture group that did not participate in the match has location NSNotFound
if (range.location == NSNotFound) {
[result addObject:[NSNull null]];
} else {
[result addObject:[searchedString substringWithRange:range]];
}
}
[results addObject:result];
}
return results;
}