Scrape Offer returns no results - error-handling

I have researched this issue in the database, googled the questions, and found two different errors. I have determined that the 503 errors are Amazon blocking me but I get a different error that is along the same lines but doesn't have the 503 return in it. I'm not able to decipher the difference. Anyone able to help? The errors I am getting are listed below. The html parser I am using are HTML Agility Pack and Just Agile
http://www.amazon.com/gp/offer-listing/1902915208 error The remote server returned an error: (503) Server Unavailable.
and
Scrape Offers returned no results.
The error is not consistent and so not easy to trap. The error returns with in the next scrap. It is very random and does not happen with the same product consistently. I am getting lots of these daily and it is preventing me from updating prices and provide correct information to customers.

So as i found out, the problem was the User-Agent which was send to Amazon.
I added the following command to my CURL-options and it works now - even without an US-Proxy.
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

I put 1 second sleep between every 20 webpage requests. And that solved my problem.

Related

The reason codes from BigQueryError.getError()

Do we have a list of possible reason codes from BigQueryError.getReason() https://googleapis.github.io/google-cloud-java/google-cloud-clients/apidocs/index.html?com/google/cloud/bigquery/package-summary.html ?
BigQueryError is a java class from Google's BigQuery client lib.
We are streaming inserting rows to BigQuery. Would like to know under what kind of reason codes, we can retry the insertion.
Thanks
You can find the BigQuery error codes in the following page
https://cloud.google.com/bigquery/troubleshooting-errors
One important point for HTTP errors from the documentation (not listed in the doc)
If you receive an HTTP response code that doesn't appear in the list
below, the response code indicates an issue or an expected result with
the HTTP request. For example, a 502 error indicates there is an issue
with your network connection. For a full list of HTTP response codes,
see HTTP response codes.

The use of HTTP status to communicate application circumstances

Suppose I ask a question to an criminal register: http://server/demographics/party/{partyId} and that person is not known on that system.
Is that an error? Isn't it a good thing?
When returning 404, it is an error code, Restlet has implemented it as a specific URL cannot be resolved to a route to a server-resource (a handling class), an erroneous situation.
In my opinion, if a system understands a call, and is able to process it without errors, it should return 200 (HTTP-ok), and it should return the information: "We don't know this person".
What is the best thing to do?
There's nothing wrong in returning a 404 status code. It simply means "you asked for a resource, and it doesn't exist". If it did return 200, it would have to somehow return an additional status code telling you that everything went fine, but the resource couldn't be found. That's unnecessary, because 404 already means exactly that.
A status 500 is normally returned to mean "something went wrong, I can't tell you if the resource exists or not". Now if you returned a 404 to mean that, this would be a mistake.
Whether it's a good thing or not is not doesn't have anything to do with HTTP or REST. And BTW, if the register was a file containing the survivors of a disaster, you would probably find it bad to not find the person you looked for, unless maybe if the person is your mother in law, unless you actually love your mother in law. (this is meant as a joke, for those who don't find it obvious).
A web service could be implemented in a way that produces either kind of response.
If it is being implemented in a way that is truer to HTTP/Rest, then '404 not found' would be the appropriate response for not finding something. This is what the 404 status is for.
It may help to think of your request as saying 'I assume this record exists, and I want to see it', and the 404 response as saying 'You are in error that record does not exist'.
If this URL is going to be used by code (rather than be displayed in a browser), then if you don't make a distinction in the response status then you will have to add extra information to the response body to make the difference obvious.
If this URL is going to be displayed to a user in a browser, then the server is still able to display content in the response body for a 404.
Please read: http://archive.oreilly.com/pub/post/restful_error_handling.html#__federated=1
Quoting:
Conclusion:
Human Readable Error Messages: Part of the major appeal of REST based web services is that you can open any browser, type in the right URL, and see an immediate response -- no special tools needed. However, HTTP error codes do not always provide enough information. For example, if we take option 1 above, and request and invalid book ID, we get back a 404 Error Code. From the developer perspective, have we actually typed in the wrong host name, or an invalid book ID? It's not immediately clear. In Option 3 (DAS), we get back a blank page with no information. To view the actual error code, you need to run a network sniffer, or point your browser through a proxy. For all these reasons, I think Option 4 has a lot to offer. It significantly lowers the barrier for new developers, and enables all information related to a web service to be directly viewable within a web browser.
Application Specific Errors: Option 1 has the disadvantage of not being directly viewable within a browser. It also has the additional disadvantage of mapping all HTTP error codes to application specific error codes. HTTP status codes are specific to document retrieval and posting, and these may not map directly to your application domain. For example, one of the DAS error codes relates to invalid genomic coordinates (sequence coordinate is out of bounds/invalid). What HTTP error code would we map to in this case?
Machine Readable Error Codes: As a third criteria, error codes should be easily readable by other applications. For example, the XooMLe application returns back only human readable error messages, e.g. "Invalid Google API key supplied". An application parsing a XooMLe response would have to search for this specific error message, and this can be notoriously brittle -- for example, the XooMLe server might simply change the message to "Invalid Key Supplied". Error codes, such as those provided by DAS are important for programmatic control, and easy creation of exceptions. For example, if XooMLe returned a 1001 error code, a client application could do a quick lookup and immediately throw an InvalidKeyException.
Based on these three criteria, here's my vote for best error handling option:
Use HTTP Status Codes for problems specifically related to HTTP, and not specifically related to your web service.
When an error occurs, always return an XML document detailing the error.
Make sure the XML error document contains both an error code, and a human readable error message. For example:
1001
Invalid Google API key supplied
By following these three simple practices, you can make it significantly easier for others to interface with your service, and react when things go wrong. New developers can easily see valid and invalid requests via a simple web browser, and programs can easily (and more robustly) extract error codes and act appropriately.
The Amazon.com web services API follows the approach of returned XML document can specify an ErrorMsg element.
XooMLe also follows this approach. (XooMLe provides a RESTful API wrapper to the existing SOAP based Google API).
Another approach is by DAS ( Distributed Annotation System) which always returns 200 if there was no HTTP-error and has error information in the HTTP-header, which is less favorable, because it is not human readable, as a browser does not display the HTTP-header.

Use AWS S3 success_action_redirect policy with XHR

I'm using signed POST to upload file directly to amazon S3. I had some trouble with the signature of the policy using PHP but finally fixed it and here is the sample of code.
This xhr request is send in javascript and I'm waiting for an answer from amazon. At first I was using success_action_status setting it to 201 to get the XML response.
What I'd like to do is using the success_action_redirect to call a script on my server to create a record in the database.
The reason why is that I could create the record in the database and if anything wrong happen at this stage I can return an error message directly at this point. Also it saves me another ajax request to my server.
So I've tried to set this up specifying the success_action_redirect to http:\\localhost\callback.php where I have a script that is waiting for some parameters.
But it looks like this script is never called and the response of the xhr.send() is empty.
I think it's a cross-browser issue and I'm wondering if it would be possible to use jsonp somehow to pass-by this?
Any ideas?
UPDATE
Apparently xhr is following redirect natively so it should work but when I specified the success_action_redirect it returns error Server responded with 0 code.
At first I thought it was because the redirect URL was on my local server so I've changed it to an accessible server but no chance.
Anyone knows why it's returning this error message?
I also run into this problem. It seems like nobody has a solution to this like this
maybe the best workaround i have found is something like this.
It seems thet the only workaround includes a second xhr-request to execute the callback manually. therefore the
success_action_status
should be used. Witht his you will get a 201 response if the upload was successful and you can start a second request for the actual callback. For me it looks like the only possible solution at the moment.
Any other solutions?

RestKit 0.10.3 - Object Manager deleteObject

I am wondering what the "default" response from a "DELETE /api/myEntity/1" request is RestKit expecting.
My current web service returns OK (200) status code with empty body. Meaning that the object was successfully deleted.
RestKit triggers the onDidFailWithError method, and also logs some messages to the debug output:
restkit.network:RKObjectLoader.m:300 Unable to find parser for MIME Type 'text/plain'
restkit.network:RKObjectLoader.m:329 Encountered unexpected response with status code: 200 (MIME Type: text/plain ->
The web service is developed by us. So we can return anything else, we just think returning "OK" is enough.
Please advise. Thanks.
For all empty responses the correct status code to be returned should be 204 No Content.
RestKit declares to correctly handle also 200 OK but I experienced some problem with delete, too.
I found some bux fixes done after tag v0.10.3 (see here) so I suggest you to upgrade to a more recent commit.
Be careful to update to newer v0.20 because it is an hard refactoring and a lot of things were changed!

What does "Predicate Mismatch for View"

I am writing a iOS client for a an existing product that uses a legacy SOAP webservice. I got the proper URL to send my SOAP/XML messages too and even have some samples. However, none of them seem to work...
I always get a 404 error with the following error text "Predicate mismatch for View"
I am using an ASIFormDataRequest for the actual request and apending the data (SOAP XML in this case) via [someFormRequest appenData:myData].
I am flat out of ideas here and am wondering what, if anything I am doing wrong. Or should I ping one of the back end guys? Could this error be a result of something on the server side?
This is an error message spit out by the pyramid web framework when attempting to access a URL without supplying all of the required parameters. You definitely want to double check that the URL you are using has all of the required params (headers, query string options, request body, etc) and if you're convinced that what you are sending is correct then but your backend guys because it's definitely a miscommunication or a bug between the two of you.