When I try to run the following extremely simple PhantomJS script, I get a parse error:
var page = require('webpage').create();
page.open('http://compare.nissanusa.com/nissan_compare/NNAComparator/TrimSelect.jsp', function (status) {});
Anyone know why this could be happening? The error message is not helpful at all... It just says "Parse Error".
Could this be a bug in PhantomJS?
I am using PhantomJS version 1.9. I'm able to run the above script with other URLs, but for some reason certain URLs return a parse error...
Any help would be greatly appreciated!
It's simply because there is a JavaScript error on the web site http://compare.nissanusa.com/nissan_compare/NNAComparator/TrimSelect.jsp. The parse error is not caused by your code.
PhantomJS does not cope well with JS errors when loading a page, which is why it's important to add an error handler.
To easily catch an error that occurs in a web page, whether it is a syntax error or another thrown exception, use page.onError.
Here is a basic example:
page.onError = function(msg, trace) {
    var msgStack = ['ERROR: ' + msg];
    if (trace && trace.length) {
        msgStack.push('TRACE:');
        trace.forEach(function(t) {
            msgStack.push(' -> ' + t.file + ': ' + t.line + (t.function ? ' (in function "' + t.function + '")' : ''));
        });
    }
    console.error(msgStack.join('\n'));
};
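For the script in the question, a minimal version would register the handler before calling page.open (the status log and phantom.exit() are my additions here):
var page = require('webpage').create();

// Register the handler before opening the page, so the underlying
// JS error is reported instead of just a bare "Parse Error".
page.onError = function(msg, trace) {
    console.error('Page error: ' + msg);
};

page.open('http://compare.nissanusa.com/nissan_compare/NNAComparator/TrimSelect.jsp', function (status) {
    console.log('status: ' + status);
    phantom.exit();
});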
Related
When I use the should('be.visible') assertion and the element is not on the screen, the assertion fails. That's why I use scrollIntoView(), but although the element then comes onto the screen, scrollIntoView() does not work as expected and throws an error too.
I don't want to use 'exist'; I need to use the 'be.visible' assertion.
Here is the code that last threw the error, left as an example.
Then('click work order from the work order list {int}', (i) => {
    cy.xpath(`//div[contains(@class, 'workorder')] //div[contains(text(), '${getWorkOrderWithNoAttachment(i)} /')]`)
        .scrollIntoView()
        .should('be.visible', { setTimeout: 10000 })
        .click()
})
I'm working on a React Native project where I need to make a progress bar for uploading a file.
Here is my code; kindly check it out.
axios.post(urlRequest, requestData, {
    headers: headerConfig,
    onUploadProgress: function (progressEvent) {
        console.log('progressEvent : ', progressEvent);
        console.dir('progressEvent loaded : ', progressEvent.loaded);
        console.dir('progressEvent total : ', progressEvent.total);
    },
});
In the response of this function, I'm getting the following console output.
As you can see, I'm getting the loaded & total values inside Symbol(original_event), but I can't access them.
Any idea how I can use them?
I think console.dir accepts and logs only an object, not the string 'progressEvent '.
Also, the path to the object you are pointing at, progressEvent.Symbol.loaded, is wrong.
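As a minimal sketch, assuming the standard axios progress event (whose loaded and total fields are plain numbers) and reusing the urlRequest, requestData, and headerConfig names from the question:
axios.post(urlRequest, requestData, {
    headers: headerConfig,
    onUploadProgress: function (progressEvent) {
        // loaded and total are plain numeric fields on the event itself,
        // so read them directly instead of digging into Symbol(original_event).
        var percent = Math.round((progressEvent.loaded * 100) / progressEvent.total);
        console.log('uploaded ' + progressEvent.loaded + ' of ' + progressEvent.total + ' bytes (' + percent + '%)');
    },
});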
var xmlhttp = new XMLHttpRequest();
xmlhttp.onreadystatechange = function () {
    if (this.readyState == 4 && this.status == 200) {
        createPopup(this);
    } else if (this.status == 404) {
        alert("file not found from load");
    }
};
xmlhttp.open("GET", url, true);
xmlhttp.send();
Hi, I am learning HTML and CSS, and now JavaScript with the DOM.
I am trying to parse an XML file and know that I have to use XMLHttpRequest to get the data.
To handle exceptions such as "there is no file" or "the XML has a fault (wrong XML)", I am trying to use the XMLHttpRequest member variables readyState and status to figure out the status of the result.
If there is another way to deal with this problem, let me know.
First, Chrome doesn't give the status value whereas Firefox does with the same code, but it is limited to giving status == 200 when the file exists, regardless of whether the file is valid. Do you know why?
Second, how can I get status == 404? Could you tell me when it occurs?
"Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help, check https://xhr.spec.whatwg.org/" ... This appear in the alert of the Chrome Console...I have the same problem...
How can I access simple csv data?
var webpage = require('webpage');
var csvPage = webpage.create();
var csvUrl = "http://www.scoach.ch/arcmsdownload/023c5c5aa58e6e0ff963ddcdea5ac016/CONTENT.csv/derivatives_2013-05-24.csv";
csvPage.open(csvUrl, function(status){
    console.log("csv: " + csvPage.content);
});
This just gives me an empty HTML document, which is not the expected result :-) I have tried several callbacks, but nothing helped.
Thanks for your Help!
First, I'll just quickly point out that PhantomJS is overkill for this job. Use wget, curl, PHP file_get_contents, etc. However, I'm assuming this is part of a more complicated PhantomJS script, and you have a good reason.
I can only half answer your question, by showing you how to see the missing error messages:
var webpage = require('webpage');
var csvPage = webpage.create();
var csvUrl = "http://www.scoach.ch/arcmsdownload/023c5c5aa58e6e0ff963ddcdea5ac016/CONTENT.csv/derivatives_2013-05-24.csv";
csvPage.open(csvUrl, function(status){
    console.log("status=" + status);
    console.log("csv: " + csvPage.plainText);
    phantom.exit();
});
I made these changes:
Show the status (it is "fail")
Change to use plainText instead of content. (The latter wraps your content in html tags, which you don't want for csv).
Add phantom.exit(), just so it doesn't sit there at the end.
I don't know why the status is "fail", when I can get the file fine with wget. The next troubleshooting step is to add these two lines before calling csvPage.open:
csvPage.onResourceRequested = function (request) {
    console.log('Request ' + JSON.stringify(request, undefined, 4));
};
csvPage.onResourceReceived = function (response) {
    console.log('Receive ' + JSON.stringify(response, undefined, 4));
};
It is returning immediately, with 3878 bytes, even though I see a Content-Length header of 6,335,428. This might be a PhantomJS bug/limitation with either chunked encoding or very large files.
UPDATE: Another idea, for a short-term solution, is to call wget or curl from inside your PhantomJS script, using the new spawn or execFile commands: http://code.google.com/p/phantomjs/source/browse/examples/child_process-examples.js
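A quick sketch of that workaround, assuming wget is available on the PATH (the output filename here is arbitrary):
var spawn = require('child_process').spawn;
var csvUrl = "http://www.scoach.ch/arcmsdownload/023c5c5aa58e6e0ff963ddcdea5ac016/CONTENT.csv/derivatives_2013-05-24.csv";

// Hand the download off to wget, since PhantomJS itself truncates the response.
var child = spawn('wget', ['-O', 'derivatives.csv', csvUrl]);

child.on('exit', function (code) {
    console.log('wget exited with code ' + code);
    phantom.exit();
});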
This SO post might help.
Also note that PhantomJS runs a separate JavaScript environment from NodeJS, so using Node csv libraries isn't an option.
PhantomJS has the loadImages setting, but I want more: how can I tell PhantomJS to skip downloading certain kinds of resources, such as CSS files, etc.?
=====
Good news: this feature has been added.
https://code.google.com/p/phantomjs/issues/detail?id=230
The gist:
page.onResourceRequested = function(requestData, request) {
    if ((/http:\/\/.+?\.css/gi).test(requestData['url']) || requestData['Content-Type'] == 'text/css') {
        console.log('The url of the request is matching. Aborting: ' + requestData['url']);
        request.abort();
    }
};
UPDATED, working!
As of PhantomJS 1.9, the existing answer no longer works. You must use this code:
var webPage = require('webpage');
var page = webPage.create();

page.onResourceRequested = function(requestData, networkRequest) {
    var match = requestData.url.match(/wordfamily.js/g);
    if (match != null) {
        console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
        networkRequest.cancel(); // or .abort()
    }
};
If you use abort() instead of cancel(), it will trigger onResourceError.
You can look at the PhantomJS docs.
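To actually see those resource errors, a minimal onResourceError handler (following the pattern from the PhantomJS docs) looks like this:
page.onResourceError = function(resourceError) {
    console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
    console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
};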
So finally, you can try this: http://github.com/eugenehp/node-crawler
Otherwise, you can still try the approaches below with PhantomJS.
The easy way is to load the page -> parse it -> exclude the unwanted resources -> load it into PhantomJS.
Another way is to simply block the hosts in the firewall.
Optionally, you can use a proxy to block certain URL addresses and queries to them.
One more option: load the page and then remove the unwanted resources, but I don't think that's the right approach here.
Use page.onResourceRequested, as in example loadurlwithoutcss.js:
page.onResourceRequested = function(requestData, request) {
    if ((/http:\/\/.+?\.css/gi).test(requestData['url']) ||
            requestData.headers['Content-Type'] == 'text/css') {
        console.log('The url of the request is matching. Aborting: ' + requestData['url']);
        request.abort();
    }
};
No way for now (PhantomJS 1.7); it does NOT support that.
But a hacky workaround is to use an HTTP proxy, so you can screen out the requests you don't need.
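For example, assuming a filtering proxy is already listening locally on port 8080 (the script name is just a placeholder), you can point PhantomJS at it from the command line:
phantomjs --proxy=127.0.0.1:8080 myscript.js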