Unable to retrieve certain pages using stringWithContentsOfURL - objective-c

I am trying to get HTML files from the web, using stringWithContentsOfURL:. My problem is, sometimes it works but sometimes it doesn't. For example, I tried:
NSString *string = [NSString stringWithContentsOfURL:
    [NSURL URLWithString:@"http://www.google.com/"]
    encoding:encoding1
    error:nil];
NSLog(@"html = %@", string);
This works fine, but when I replace the URL with @"http://www.youtube.com/" I only get "NULL". Does anyone know what's going on? Is it because YouTube has some sort of protection?

Google's home page uses ISO-8859-1 encoding (aka "Latin-1", or NSISOLatin1StringEncoding). YouTube uses UTF-8 (NSUTF8StringEncoding), and the encoding you've specified with your encoding1 variable has to match the web page in question.
If you just want the web page and don't really care what encoding it's in, try this:
NSStringEncoding encoding;
NSError *error = nil;
NSString *string = [NSString stringWithContentsOfURL:
    [NSURL URLWithString:@"http://www.google.com/"]
    usedEncoding:&encoding
    error:&error];
NSLog(@"html = %@", string);
This method will tell you what the encoding was (by writing it to the encoding variable), but you can just throw that away and focus on the string.

Related

Google image from cocoa

I'd like to write a Cocoa program that parses a Google Images results page and extracts the images.
I use code like this:
NSURL *url = [NSURL URLWithString:[NSString stringWithFormat:@"https://www.google.it/search?q=%@&tbm=isch", searchString]];
NSStringEncoding enc;
NSString *test = [NSString stringWithContentsOfURL:url usedEncoding:&enc error:NULL];
The problem is that the page returned this way is different from what a browser shows.
I don't get the imgurl parameter with the URL of the full image, only the thumbnails.
Is there a way to get the complete Google Images results in Cocoa, like I get in Firefox?
Thank you
That is not the right way to do this.
To get a list of images, you should use the Google API for image search.
See the link below for more details:
https://developers.google.com/image-search/v1/jsondevguide
The web service URL looks like this, with your search keyword as the q parameter (soccer in this example):
https://ajax.googleapis.com/ajax/services/search/images?q=soccer&v=1.0

NSStream write encoding issues

I'm trying to send a string using NSOutputStream, but I can't seem to get the encoding right. Sending data obtained from dataWithContentsOfURL: works.
I'm using a Node.js TCP server with the actionHero library.
The server works with netcat and telnet.
- (IBAction)sendText:(id)sender {
    NSString *response = [NSString stringWithFormat:@"%@", [_sendTextField.text stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]];
    NSLog(@"writing %@", response);
    // This line works:
    // NSData *data = [[NSData alloc] initWithData:[NSData dataWithContentsOfURL:[NSURL URLWithString:@"http://www.google.com"]]];
    // This line doesn't work:
    NSData *data = [[NSData alloc] initWithData:[response dataUsingEncoding:NSUTF8StringEncoding]];
    // -write:maxLength: returns a nonzero value (NSInteger, so cast for the format string)
    NSLog(@"%ld", (long)[outputStream write:[data bytes] maxLength:[data length]]);
}
I get a null streamError in the stream:handleEvent: method.
Not knowing the content of response, I can't give you a specific answer as to why NSUTF8StringEncoding doesn't work with it. Generally speaking, however, if there is a byte sequence in your content that is incompatible with UTF-8, you're going to get nil when you call -dataUsingEncoding:.
There's a strategy that I learned from reading Mike Ash's blog (look under the section "Fallbacks"), and it's served me pretty well in situations such as yours.
To briefly sum it up, first try using NSUTF8StringEncoding. If that doesn't work, try using NSISOLatin1StringEncoding. And if that doesn't work, try using NSMacOSRomanStringEncoding. Mike's blog has the rationale for this.
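The fallback order described above could be sketched roughly as follows. This is only an illustration: DecodeWithFallbacks is a hypothetical helper name, not something from the answer or from Mike Ash's blog. It also assumes you are starting from raw bytes (an NSData) and want a string back, which is the direction in which a strict encoding such as UTF-8 can reject the input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): decode raw bytes by
// trying each candidate encoding in order and returning the first success.
static NSString *DecodeWithFallbacks(NSData *data) {
    NSStringEncoding candidates[] = {
        NSUTF8StringEncoding,        // strict: returns nil on invalid byte sequences
        NSISOLatin1StringEncoding,   // single-byte fallback
        NSMacOSRomanStringEncoding,  // last resort
    };
    size_t count = sizeof(candidates) / sizeof(candidates[0]);
    for (size_t i = 0; i < count; i++) {
        NSString *decoded = [[NSString alloc] initWithData:data
                                                  encoding:candidates[i]];
        if (decoded != nil) {
            return decoded;
        }
    }
    return nil; // none of the candidate encodings accepted the bytes
}
```

A string produced by a fallback may not match what the sender intended, so it is probably worth logging which encoding ended up being used.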
Found the answer to my own question. It turns out the problem is in actionHero's on-data handler, which looks for a \n that the iOS application wasn't sending. I appended a \n and it's fine now.
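For anyone hitting the same thing, here is a minimal sketch of sending a newline-terminated message. LineDataForMessage and WriteAll are hypothetical helper names, and the sketch assumes the server treats "\n" as the end of a message, as described above. The loop is there because -write:maxLength: may write fewer bytes than requested.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: encode one message as UTF-8, terminated by "\n",
// which a line-oriented server uses to detect the end of a message.
static NSData *LineDataForMessage(NSString *message) {
    NSString *line = [message stringByAppendingString:@"\n"];
    return [line dataUsingEncoding:NSUTF8StringEncoding];
}

// Hypothetical helper: keep writing until every byte has been sent.
// Returns NO if the stream reports an error or is at capacity.
static BOOL WriteAll(NSOutputStream *stream, NSData *data) {
    const uint8_t *bytes = data.bytes;
    NSUInteger remaining = data.length;
    while (remaining > 0) {
        NSInteger written = [stream write:bytes maxLength:remaining];
        if (written <= 0) {
            return NO;
        }
        bytes += written;
        remaining -= (NSUInteger)written;
    }
    return YES;
}
```

With an already opened stream, the action method would then call something like WriteAll(outputStream, LineDataForMessage(_sendTextField.text)).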

Objective-C: NSString not being entirely decoded from UTF-8

I'm querying a web server which returns a JSON string as NSData. The string is in UTF-8 format so it is converted to an NSString like this.
NSString *receivedString = [[NSString alloc] initWithData:receivedData encoding:NSUTF8StringEncoding];
However, some UTF-8 escapes remain in the outputted JSON string which causes my app to behave erratically. Things like \u2019 remain in the string. I've tried everything to remove them and replace them with their actual characters.
The only thing I can think of is to replace the occurrences of UTF-8 escapes with their characters manually, but that's a lot of work if there's a quicker way!
Here's an example of an incorrectly parsed string:
{"title":"The Concept, Framed, The Enquiry, Delilah\u2019s Number 10 ","url":"http://livebrum.co.uk/2012/05/31/the-concept-framed-the-enquiry-delilah\u2019s-number-10","date_range":"31 May 2012","description":"","venue":{"title":"O2 Academy 3 ","url":"http://livebrum.co.uk/venues/o2-academy-3"}
As you can see, the URL hasn't been completely converted.
Thanks,
The \u2019 syntax isn't part of UTF-8 encoding, it's a piece of JSON-specific syntax. NSString parses UTF-8, not JSON, so doesn't understand it.
You should use NSJSONSerialization to parse the JSON then pull the string you want from the output of that.
So, for example:
NSError *error = nil;
id rootObject = [NSJSONSerialization
    JSONObjectWithData:receivedData
    options:0
    error:&error];
// check the return value; error is only meaningful when parsing failed
if (rootObject == nil)
{
    // error path here
}
// really you'd validate this properly, but this is just
// an example so I'm going to assume:
//
// (1) the root object is a dictionary;
// (2) it has a string in it named 'url'
//
// (technically this code will work no matter what the type
// of the url object as written, but if you carry forward assuming
// a string then you could be in trouble)
NSDictionary *rootDictionary = rootObject;
NSString *url = [rootDictionary objectForKey:@"url"];
NSLog(@"URL was: %@", url);

Url encodings with greek characters

I load the HTML source of a page into an NSString like this:
NSString* url = @"example url";
NSURL *urlRequest = [NSURL URLWithString:url];
NSError *err = nil;
NSString *response = [NSString stringWithContentsOfURL:urlRequest encoding:kCFStringEncodingUTF8 error:&err];
Part of the response looks like this: 2 \u00cf\u0083\u00cf\u0087\u00cf\u008c\u00ce\u00bb\u00ce\u00b9\u00ce\u00b1
How can I get the Greek characters to display correctly in the NSString response?
The encoding of the page is "charset=iso-8859-7".
Ahhh, I understand your question a little bit better now.
NSString has no named NSStringEncoding constant (in the way that NSUTF8StringEncoding is one) for iso-8859-7.
You have two options.
1) Try passing different encodings to +stringWithContentsOfURL:encoding:error: to see if one loads successfully. My first attempt would be NSISOLatin1StringEncoding.
2) I found a third-party library (and NSString category extension) that does do iso-8859-7 conversion. But access to CkoCharset will cost you (or your client) $290 USD. It might be a worthwhile investment to save time and hassle.
https://chilkatsoft.com/charset-objc.asp
and documentation is here:
http://www.chilkatsoft.com/refdoc/objcCkoCharsetRef.html
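Another route that may be worth trying first: Core Foundation defines kCFStringEncodingISOLatinGreek for ISO 8859-7, and CFStringConvertEncodingToNSStringEncoding turns a CFStringEncoding into a value the NSString methods accept. The sketch below assumes the bytes you downloaded really are iso-8859-7; DecodeGreekISO88597 is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: decode bytes that are assumed to be ISO 8859-7 (Greek).
// Returns nil if Foundation cannot decode the data with that encoding.
static NSString *DecodeGreekISO88597(NSData *data) {
    // Map the Core Foundation encoding constant to an NSStringEncoding value.
    NSStringEncoding greek =
        CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingISOLatinGreek);
    return [[NSString alloc] initWithData:data encoding:greek];
}
```

The same converted value can be passed as the encoding: argument of +stringWithContentsOfURL:encoding:error:. Note that CFStringEncoding constants such as kCFStringEncodingUTF8 are not interchangeable with NSStringEncoding values, so they should go through the conversion function as well.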

objective c - does not read utf-8 encoded file

I'm trying to display some Japanese text on the iOS simulator and an iPod touch. The text is read from an XML file. The header is:
<?xml version="1.0" encoding="utf-8"?>
When the text is in English, it displays fine. However, when the text is Japanese, it comes out as an unintelligible mishmash of single-byte characters.
I have tried saving the file specifically as Unicode using TextEdit. I'm using NSXMLParser to parse the data. Any ideas would be much appreciated.
Here is the parsing code:
// Override point for customization after application launch.
NSString *xmlFilePath = [[[NSBundle mainBundle] resourcePath] stringByAppendingPathComponent:@"questionsutf8.xml"];
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath];
NSData *data = [NSData dataWithBytes:[xmlFileContents UTF8String] length:[xmlFileContents lengthOfBytesUsingEncoding: NSUTF8StringEncoding]];
XMLReader *xmlReader = [[XMLReader alloc] init];
[xmlReader parseXMLData: data];
stringWithContentsOfFile: is a deprecated method. It does not do encoding detection unless the file contains the appropriate byte order mark, otherwise it interprets the file as the default C string encoding (the encoding returned by the +defaultCStringEncoding method). Instead, you should use the non-deprecated [and encoding-detecting] method stringWithContentsOfFile:usedEncoding:error:.
You can use it like this:
NSStringEncoding enc;
NSError *error = nil;
NSString *xmlFileContents = [NSString stringWithContentsOfFile:xmlFilePath
    usedEncoding:&enc
    error:&error];
if (xmlFileContents == nil)
{
    NSLog(@"%@", error);
    return;
}
First, you should verify with TextWrangler (free from the Mac App Store or barebones.com) that your XML file truly is UTF-8 encoded.
Second, try creating xmlFileContents with +stringWithContentsOfFile:encoding:error:, explicitly specifying UTF-8 encoding. Or, even better, bypass the intermediate string entirely, and create data with +dataWithContentsOfFile:.
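A rough sketch of the "bypass the intermediate string" idea, using NSXMLParser directly rather than the asker's XMLReader class. TextCollector and TextFromXMLFile are hypothetical names, and the code assumes ARC. The point is that the file's bytes reach the parser untouched, so the encoding declared in the XML header is what gets used.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that simply concatenates all character data it sees.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *text;
@end

@implementation TextCollector
- (instancetype)init {
    if ((self = [super init])) {
        _text = [NSMutableString string];
    }
    return self;
}
// May be called several times per element; append each chunk.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}
@end

// Hypothetical helper: load the raw bytes and hand them straight to the
// parser, with no NSString round trip in between.
static NSString *TextFromXMLFile(NSString *path) {
    NSData *data = [NSData dataWithContentsOfFile:path];
    if (data == nil) {
        return nil; // file missing or unreadable
    }
    TextCollector *collector = [[TextCollector alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = collector;
    return [parser parse] ? [collector.text copy] : nil;
}
```

In the question's code, the same NSData could instead be passed to [xmlReader parseXMLData:data].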