PhantomJS and AJAX

I am a total noob with PhantomJS, and I don't do any web development.
However, I want PhantomJS to sit on my page the way a browser does.
When a browser sits on this page, I see a call go out every few seconds (every 5, maybe?). It appears to be an AJAX request.
When I load the page with PhantomJS, it loads successfully, but I never see the call that is made when I am in a browser.
Hopefully this is not a difficult thing. Since I don't have any experience, I'd rather not jump through a bunch of hoops to make it work!
Thanks!
My very basic code is as follows:
var webPage = require('webpage');
var page = webPage.create();

page.open('http://myurl.com/', function(status) {
    console.log('Status: ' + status);
    // Do other things here...
});
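One way to see which requests PhantomJS actually issues (a suggestion, not from the original post) is the documented page.onResourceRequested callback, which fires for every request the page makes. A minimal sketch, assuming the periodic call would show up in the request log:

var webPage = require('webpage');
var page = webPage.create();

// Log every outgoing request so the periodic AJAX call, if PhantomJS
// fires it, shows up on the console.
page.onResourceRequested = function(requestData, networkRequest) {
    console.log('Request #' + requestData.id + ': ' + requestData.url);
};

page.open('http://myurl.com/', function(status) {
    console.log('Status: ' + status);
    // Keep the script alive so timer-driven AJAX calls have a chance to
    // fire; 30 seconds is an arbitrary choice for illustration.
    setTimeout(function() {
        phantom.exit();
    }, 30000);
});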

Related

How to open a page with PhantomJS without running JS or making subsequent requests?

Is there a way to load just the server-generated HTML (without any JS or images)?
The docs seem a little sparse.
The strength of PhantomJS is precisely its ability to emulate a real browser, which opens a page and makes all of the subsequent requests. If you want just the HTML, curl or wget might be a better fit.
Nevertheless, there is a way to avoid running JS or loading images: set the corresponding page settings: http://phantomjs.org/api/webpage/property/settings.html
page.settings.javascriptEnabled = false;
page.settings.loadImages = false;
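A minimal sketch putting those settings in context. Per the settings documentation, they apply only during the initial call to page.open, so they must be set before opening the page (the URL here is a placeholder):

var webPage = require('webpage');
var page = webPage.create();

// These must be set before page.open; changing them afterwards has no effect.
page.settings.javascriptEnabled = false;
page.settings.loadImages = false;

page.open('http://example.com/', function(status) {
    // With JS disabled, page.content is the server-generated HTML only.
    console.log(page.content);
    phantom.exit();
});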

In a Protractor helper function, how can I know if browser.get was ever called?

I'm creating a helper function for my Protractor tests to log in to the application only if necessary.
That lets me run each test in isolation when debugging, and when running the whole suite I get a little performance boost from not having to log in before every test.
I actually have to log in in two situations:
1. No page was ever loaded (browser.get() was not called).
2. We are on a page, but not logged in yet.
Situation 2 is easy. The application has an element that is shown on all pages when the user is logged in.
The hard part is situation 1. If I try to find the element that indicates the user is logged in before loading any page, I get the following error:
Error while waiting for Protractor to sync with the page: "angular could not be found on the window"
And I didn't see anything in the Protractor API or source code that could help me determine whether a page was already loaded.
Any pointers are appreciated.
Nice question, though I don't think it is a good idea to leave your tests in a non-deterministic state.
The only way I can think of to know whether get was called is to check that browser.driver.getCurrentUrl() returns the desired URL, or at least doesn't return about:blank.
But you'd have to wait some time, because getCurrentUrl returns about:blank even after get has been called but the page is not yet fully loaded. So I'd do something like this:
beforeEach(function() {
    var EC = protractor.ExpectedConditions;
    // Get rid of the automatic sync check that would give you the
    // "angular could not be found on the window" error.
    browser.ignoreSynchronization = true;
    browser.wait(EC.presenceOf($('.ng-scope')), 500)
        .then(function() {
            // An Angular element is present - a page was already loaded.
        }, function(err) {
            console.log(err);
            return browser.driver.getCurrentUrl().then(function(url) {
                if (url === 'about:blank') {
                    console.log('no get before');
                    // your login function here - no get was called before this wait
                }
            });
        })
        .then(function() {
            // Re-enable synchronization only after the checks are done.
            browser.ignoreSynchronization = false;
        });
});
The question is whether you even need to check for about:blank; in your case it might be enough to check for that .ng-scope, or for your "login element". But this should give you a kickstart.
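Building on that, a hypothetical loginIfNeeded helper covering both situations might look like the sketch below; the '.logged-in-indicator' selector and the logIn callback are made-up placeholders for your application's "login element" and your own login routine:

// Hypothetical helper: logs in only when necessary. The selector
// '.logged-in-indicator' is a stand-in for your application's "login
// element", and logIn is your own login routine passed in as a callback.
function loginIfNeeded(logIn) {
    var EC = protractor.ExpectedConditions;
    browser.ignoreSynchronization = true;
    return browser.driver.getCurrentUrl().then(function(url) {
        if (url === 'about:blank') {
            // Situation 1: browser.get() was never called.
            browser.ignoreSynchronization = false;
            return logIn();
        }
        // Situation 2: some page is loaded - look for the logged-in marker.
        return browser.wait(EC.presenceOf($('.logged-in-indicator')), 500)
            .then(function() {
                // Already logged in - nothing to do.
                browser.ignoreSynchronization = false;
            }, function() {
                // Marker not found - we are on a page but not logged in.
                browser.ignoreSynchronization = false;
                return logIn();
            });
    });
}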

PhantomJS equivalent of the browser's "Save Page As... Webpage, complete"

For my application I need to programmatically save a copy of a webpage's HTML along with the images and resources needed to render it. Browsers have this functionality in their Save Page As... Webpage, complete option.
It is of course easy to save the rendered HTML of a page using PhantomJS or CasperJS. However, I have not seen any examples that combine this with downloading the associated images and making the DOM changes needed to use the downloaded copies.
Given that this functionality exists in WebKit-based browsers (Chrome, Safari), I'm surprised it isn't in PhantomJS -- or perhaps I just haven't found it!
As an alternative to PhantomJS, you can use CasperJS to achieve the required result. CasperJS is a framework built on top of PhantomJS, with a variety of modules and classes that support and complement it.
An example of a script that you can use is:
casper.test.begin('test script', 0, function(test) {
    casper.start(url);

    casper.then(function myFunction() {
        //...
    });

    casper.run(function() {
        //...
        test.done();
    });
});
With this script, within a "step" you can perform your downloads, whether that's a single image, a document, the full page, or a screenshot.
Take a look at the download and getPageContent methods and at capture / captureSelector in this link.
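Not from the original answer, but as a rough sketch of what such a step could look like using those documented CasperJS methods (the URL and file names are placeholders):

casper.test.begin('save page resources', 0, function(test) {
    casper.start('http://example.com/');

    casper.then(function() {
        // Save the rendered HTML of the current page.
        var fs = require('fs');
        fs.write('page.html', this.getPageContent(), 'w');

        // Download a single resource, e.g. an image referenced by the page.
        this.download('http://example.com/images/logo.png', 'logo.png');

        // Capture a screenshot of the full page.
        this.capture('page.png');
    });

    casper.run(function() {
        test.done();
    });
});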
I hope these pointers can help you to go further!

Testing a live website with QUnit

Can I test live websites using QUnit? For example, can I write a test that says:
Go to google.com
Enter a search term
Click 'Google Search'
Check there are 10 results and 2 ads
Would QUnit be an appropriate tool for this kind of "live" testing?
You can achieve that using QUnit if QUnit is the only tool/testing framework available to you and the page you are testing allows GET requests.
The way to do it with QUnit is to make an AJAX call to the page you are testing using JSONP and get the response. Then you would assert that certain elements exist in the response.
As for Google, it has a very complex page structure on its search results; I would not even attempt anything like this.
I would use QUnit for testing JavaScript components on their own, without dependencies.
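As an illustration of that approach (using the current QUnit async API; the URL is a placeholder, and this only works if the endpoint actually supports JSONP):

QUnit.test('search endpoint responds', function(assert) {
    var done = assert.async();

    jQuery.ajax({
        url: 'http://example.com/search?q=test', // placeholder endpoint
        dataType: 'jsonp',
        // A timeout is needed so jQuery can report JSONP failures.
        timeout: 5000,
        success: function(response) {
            // Assert that the expected data is present in the response.
            assert.ok(response, 'got a response from the live page');
            done();
        },
        error: function() {
            assert.ok(false, 'request failed');
            done();
        }
    });
});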
If you are looking for another tool to do this task, I would recommend Selenium, which would do exactly what you want.
Good luck.
Do you want to test a website you own or a random live website?
If you want to test your own website, you can embed the live site in an iframe and perform actions on the user interface in your tests.
If you want to test live websites like google.com, you need to do this server side, since you can't access them from JavaScript/QUnit (the same-origin policy blocks access to a cross-origin iframe's DOM).
If you were the owner of a site like google.com, you could do:
// 'testframe' is a placeholder id for the iframe embedded in the test page.
var iframe = document.getElementById('testframe');
var submitted = false;

function starttests() {
    if (!submitted) {
        test("testInput", function() {
            expect(1);
            submitted = true;
            // Works only for same-origin pages; a cross-origin iframe's
            // document is not accessible from here.
            var dom = iframe.contentDocument || iframe.contentWindow.document;
            jQuery(dom).find('input[type=text]').val("Testing google.com");
            jQuery(dom).find('form').submit();
            ok(true, "form submitted");
        });
    } else {
        test("testResult", function() {
            var dom = iframe.contentDocument || iframe.contentWindow.document;
            // Check for elements in dom.
        });
    }
}

iframe.onload = starttests;
iframe.src = "http://google.com";

Screen Scraping - still not working

I have browsed through many posts on this and have tried some of the suggestions, but I still don't fully understand it.
I would like to scrape HTML pages where a script runs that displays a link after clicking. Some posts mentioned Firebug and others talked about reverse engineering the code I need, but after trying to reverse engineer it I still don't see how to get the data after tracing the script function.
jQuery('.category-selector').toggle(
    function() {
        var categoryList = jQuery('#category-list');
        categoryList.css('top', jQuery(this).offset().top + 43);
        jQuery('.category-selector img').attr('src', '/images/up_arrow.png');
        categoryList.removeClass('nodisplay');
    },
    function() {
        var categoryList = jQuery('#category-list');
        jQuery('.category-selector img').attr('src', '/images/down_arrow.png');
        categoryList.addClass('nodisplay');
    }
);

jQuery('.category-item a').click(
    function() {
        idToShow = jQuery(this).attr('id').substr(9);
        hideAllExcept(jQuery('#category_' + idToShow));
        jQuery('.category-item a').removeClass('activeLink');
        jQuery(this).addClass('activeLink');
    }
);
I am using VB.NET, and some sites were easy using Firebug, where by looking at the script I was able to pull the data I needed. What would I do in this scenario? The link is http://featured.typepad.com/ and the categories are what I am trying to access. Notice that the URL does not change.
Appreciate any responses.
My best suggestion would be to use Selenium for screen scraping. It is normally used for automated website testing but fits your case well. I've used it to screen scrape AJAX pages on multiple occasions where the page was heavily JavaScript dependent.
http://seleniumhq.org/projects/ide/
You can write your screen scraping code to run in .NET, and it can use Firefox or IE to drive the scrape.
With Selenium, you record a screen scraping session with the Selenium IDE in Firefox (look for the Firefox extension in the link above). That session can be exported as an HTML template or as C# code; it might be able to output VB as well.
You then copy the C# or VB.NET output into a Selenium .NET project that you create and run through NUnit.
I'd suggest looking online for some help getting Selenium started, but this should get you on your way.
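For illustration only (not the C#/VB.NET route described above), here is a rough sketch of the same idea using Selenium's JavaScript bindings (the Node selenium-webdriver package). The selectors come from the jQuery code quoted in the question, and the flow is an assumption about how the category links behave:

// Drive a real browser, let the site's JavaScript run, then read the
// rendered HTML once a category has been clicked.
var webdriver = require('selenium-webdriver');

var driver = new webdriver.Builder().forBrowser('firefox').build();

driver.get('http://featured.typepad.com/')
    .then(function() {
        // Open the category list, as the site's own toggle handler would.
        return driver.findElement(webdriver.By.css('.category-selector')).click();
    })
    .then(function() {
        // Click the first category link; its handler swaps content in place,
        // which is why the URL never changes.
        return driver.findElement(webdriver.By.css('.category-item a')).click();
    })
    .then(function() {
        // Grab the rendered HTML after the JavaScript has run.
        return driver.getPageSource();
    })
    .then(function(html) {
        console.log(html);
        return driver.quit();
    });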