Is there an equivalent of Don Libes's *expect* tool for scripting interaction with web pages?

In the bad old days of interactive console applications, Don Libes created a tool called Expect, which enabled you to write Tcl scripts that interacted with these applications, much as a user would. Expect had two tremendous benefits:
It was possible to script interactions that otherwise would have had to be repeated by hand, tediously. A classic example was dialup Internet access hell (from the days before PPP).
It was possible to write scripts to test one's own interactive applications, programmatically, as part of a regression suite.
Today most interactive applications are on the web, not on the console. Hence my question: is there any tool that provides the ability to interact with web pages and web forms programmatically, much as Expect provides the ability to interact with console applications programmatically?
(The closest thing I am aware of is Chickenfoot.)

You might be looking for Selenium

I've used Selenium RC in conjunction with Python to drive web page interactions programmatically. This has allowed me to write pretty extensive user tests in which forms and inputs are driven and their results are measured.
Check out the Selenium IDE on Firefox (as mentioned above). It allows you to record tests in the browser and play them back, either using the IDE itself, or the Remote Control app.
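To give a flavour of driving forms and inputs from Python, here is a minimal sketch using the current WebDriver bindings rather than the older RC API mentioned above; the URL and field names are placeholders, not from any real site:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get("https://example.com/login")            # placeholder page
        driver.find_element(By.NAME, "username").send_keys("alice")
        driver.find_element(By.NAME, "password").send_keys("secret")
        driver.find_element(By.CSS_SELECTOR, "input[type=submit]").click()
        # Measure the result the way a user would: inspect the rendered page.
        assert "Welcome" in driver.page_source
    finally:
        driver.quit()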

Perl Mechanize works pretty well for this exact issue.
HTTPS and some authentication issues are tricky at times. I will be posting a couple of questions about those in the future.

I did a ton of Expect work in a former life and always thought Don Libes' Expect book was one of the best-written and most enlightening technical books I'd ever seen.
Hands down I would say that Perl's WWW::Mechanize library is what you want. I note above that you were having trouble finding documentation. There is good documentation for it! Look up the module's distribution on search.cpan.org and see what all is packaged with it. There's a FAQ, Cookbook with examples, etc. Plus I've always been able to get help on the web. If you can't get it here, try at use.perl.org or perlmonks.org. WWW::Mechanize's author, Andy Lester, is present on Stack Overflow. (He's also an all around friendly and helpful guy.)
I believe WWW::Mechanize also has a program that is analogous to Expect's autoexpect program: you set up a proxy process running this program as a server, point your browser to it as a proxy, perform the actions you want to automate, and then the proxy program gives you a WWW::Mechanize program for you to use as a base for your project. (If it works like autoexpect, you will certainly want to make modifications from there.)
As mentioned above, WWW::Mechanize is a browser (to be more exact, it is a web client or http client) that happens to be programmable. The last time I looked, there was even work in progress to make it support JavaScript.

In addition to Selenium, if you're doing the Ruby/Rails thing, there's Webrat.

Related

Selenium Webdriver vs Mechanize

I am interested in automating repetitive data entry in some forms for a website I frequent. So far, the tools I've found that could support this in a headless fashion are Selenium WebDriver and Mechanize.
My question is: is there a fundamental technical difference in using one versus the other? Selenium is mostly used for testing, but I've also noticed some folks use it for doing exactly what I'm looking for, namely automating data entry; testing becomes a secondary benefit in that case.
Are there reasons not to use Selenium for what I want to do instead of Mechanize? Or does it not matter, and will both of these tools work?
I'm not asking which is better; I'm asking which is the right tool for the job. Perhaps I'm not understanding the premise behind the purpose of each tool.
These are completely different tools that overlap somewhat in the web-scraping, web-automation and automated data-extraction space.
mechanize is a mature and widely used tool for programmatic web browsing with a lot of built-in features, like cookie handling, browser history and form submission. The key thing to understand here is that mechanize.Browser is not a real browser: it cannot execute or understand JavaScript, and it cannot send the asynchronous requests that are often needed to build a web page.
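As a rough illustration of the mechanize.Browser workflow described above (the URL, form name and field names are placeholders):

    # Minimal mechanize sketch: fetch a page, fill a form, submit it.
    # No JavaScript is ever executed - you only get the raw HTML back.
    import mechanize

    br = mechanize.Browser()
    br.set_handle_robots(False)           # mechanize honours robots.txt by default
    br.open("https://example.com/login")  # placeholder URL
    br.select_form(name="login")          # pick the form by its name attribute
    br["username"] = "alice"              # placeholder field names and values
    br["password"] = "secret"
    response = br.submit()
    print(response.read()[:200])          # raw HTML of the resulting page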
This is where selenium comes into play - it is a browser automation tool that is also widely used in web scraping. selenium usually becomes the "fall-back" tool: when someone cannot scrape a site with mechanize, RoboBrowser or MechanicalSoup (note: other alternatives) because of, for instance, its JavaScript "heaviness", the choice is usually selenium. With selenium you can also go headless, automating a PhantomJS browser or using a virtual display. Performance is a commonly mentioned drawback: with selenium you are working with the target site as a real user in a web browser, which loads the additional files needed to render the page, makes XHR requests, renders, etc.
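For completeness, here is a sketch of "going headless" with current Selenium releases; PhantomJS is no longer maintained, so headless Firefox is shown instead, and the URL is a placeholder:

    # Headless Selenium sketch: a real browser engine runs and executes
    # JavaScript, but no window is shown.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")            # no visible browser window
    driver = webdriver.Firefox(options=options)
    driver.get("https://example.com")            # placeholder URL
    print(driver.title)                          # page title after JS has run
    driver.quit()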
This in itself does not mean you should use selenium everywhere - choose the tool wisely; choose it because it fits the problem better, not because you are more familiar with the instrument.
Also note that you should, first, consider using an API (if provided by the target website) instead of going down to web-scraping. And, if it comes to it, be a good web-scraping citizen:
How to be a good citizen when crawling web sites?
Web scraping etiquette

A simple text based full web-page regression testing

My task is to pick up and continue developing a PHP website for a small business client. The project has no testing code. I want to quickly establish at least very basic regression testing for the backend of the site.
I need to compare the full contents of the web page character by character, and I must be able to see the diff of failed tests.
I need to be able to set up cookies and GET/POST data.
Once every few days I update the local database from the production database. I would then like to have an overview of failed tests and very quickly update my test suites so that everything passes again.
Is using WatiN or Selenium a good idea? My local environment is Linux.
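A browser-less sketch of the kind of char-to-char check described above, using Python's requests and difflib (the URL, cookies, POST data and expected-output file are hypothetical placeholders; this is only an illustration, not WatiN or Selenium):

    # Fetch a page (optionally with cookies and POST data), compare it
    # character for character against a stored expected copy, and print a
    # unified diff when the comparison fails.
    import difflib
    import requests

    def check_page(url, expected_file, cookies=None, post_data=None):
        if post_data is not None:
            resp = requests.post(url, data=post_data, cookies=cookies)
        else:
            resp = requests.get(url, cookies=cookies)
        with open(expected_file, encoding="utf-8") as f:
            expected = f.read()
        if resp.text == expected:
            return True
        for line in difflib.unified_diff(expected.splitlines(),
                                         resp.text.splitlines(),
                                         fromfile=expected_file, tofile=url,
                                         lineterm=""):
            print(line)
        return False

    check_page("http://localhost/backend/report.php", "expected/report.html",
               cookies={"session": "test"}, post_data={"year": "2013"})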
About Selenium (and only Selenium, as I don't know WatiN) - it can only do what you can do in your browser. It can click, type in fields, submit forms, take screenshots (that's a very good one) and set up cookies (so yes to that one). You can always set up GET data through the URL. But I am not aware of any technique in Selenium that would allow you to set up POST data in any way other than navigating in a browser. Also, because the tests run in your browser, they are not particularly fast. E.g., on our product, a single thorough test with ~250 steps takes about 10 minutes on my computer to complete. Of course, you can always divide that between many computers using Selenium Grid. It's just more work.
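A small sketch of the cookie and GET points above with the Selenium Python bindings (the domain, cookie and query string are placeholders):

    from selenium import webdriver

    driver = webdriver.Firefox()
    driver.get("http://localhost/")          # must be on the domain before adding cookies
    driver.add_cookie({"name": "session", "value": "test"})
    # "Setting up GET data" is just navigating to a URL with a query string:
    driver.get("http://localhost/report.php?year=2013&format=full")
    print(driver.page_source[:200])
    driver.quit()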
To conclude - I'd say yes, Selenium is good for your needs, as there are so many ways to write a good test in it that everyone finds their own style. It is good for quick checks and functionality affirmations, but also for full-scale tests, etc. But if you want to do some really advanced stuff, then that's a long-term job. Selenium offers so much functionality in so many different ways that it is definitely a full-time job to understand it all and know how to use it.
Try Selenium IDE for 20 minutes. It is just an addon for Firefox that can record your actions and then replay them. If you like what you see, go for it. If not, hire someone who will.
I'm not sure if I am too late here, but regarding WatiN: it is IE-based only, so if you plan to use any other browser you are better off with Selenium WebDriver (though WatiN has some Firefox support). From what I have found (I have used both WatiN and Selenium), Selenium can achieve more low-level interactions (also see Selenium Grid), but really I think it depends on what you are looking to achieve and on personal preference. If you have time to write your own wrapper to interact with WatiN/Selenium, you will find the tests themselves are rather quick to run. Also, the beauty of automation is that once these tests are written you can run them and walk away while they complete.

How to create an online rebol console?

Where can I find the code for creating an online Rebol console like the one here?
http://tryrebol.esperconsultancy.nl/
Update: for the sandbox system on the server, can't Rebol manage it itself with some security wrapper and its security options?
As for the console itself, I don't know Ruby, so I don't want to use TryRuby - and why would I need it? Can't I mimic the Rebol console itself by "remoting" it somehow? Why can't RT or Esper Consultancy make an open-source version? There's no value in keeping it closed source. Rebol needs to prove it's more open than in the past.
In my opinion, you should aim higher with something like the already open-sourced Try Ruby. You'd type in expressions and it would guide you. Their showcase site is at tryruby.org and is fairly slick.
I modified TryRuby to work with Rebol, and it wound up looking like this: [screenshot omitted]
But I'm not going to run it on my server because I didn't want to belabor the necessary sandboxing/etc. or protections against someone running an infinite loop. I can give you what I've got so far if you want it.
I started a tutorial script here that no one seemed interested in helping me with, so I wandered off to other tasks:
http://www.rebol.net/wiki/Interactive_tutorial_script
I'm not sure what exactly you want. You mention you want a remote REBOL shell instead of a tutoring setup, but that's what the Try REBOL site is. There are several reasons it's not open source:
It's in heavy development. I'm currently changing the code regularly.
So it's not in a release state. Preparing it for release, documenting and publishing it would take a lot of extra work, as with most projects.
It's written in my CMS that's also in heavy development. Even if the Try REBOL site were open source, it wouldn't run. The CMS is not planned to be open sourced soon.
It's not meant as a generic REBOL remoting tool, but as a one-off demo site. If that site is running, what's the use of more of them?
As others have answered, there are many generic solutions for remoting that you could use. Also, most parts of the Try REBOL site are readily available as open source:
Syllable Server, produced and published by us.
The Cheyenne web server.
The HTML source of the web client can be viewed, including my simple JavaScript command service bus.
Syllable Server is an essential part of the site, as the sandboxing is not done with REBOL facilities (except some extra limits in the R3 backend), but with standard Linux facilities.
A truly air tight (do I mean silica tight?) sandbox is close to impossible with R2.
R3 (still in alpha) is looking a lot more promising. The deep technical discussions in flight right now (see CureCode and the AltME/REBOL3 Proposals group) regarding unwinds and protect, which even occasionally mention sandboxes, should lead to an excellent sandbox capability.
Right now, the big advance R3 has that makes Kaj's tryREBOL possible is R3's secure policy settings which make it possible (with some careful wrapper code) to construct an alpha/demo sandbox.
To answer your precise question ("where can I find code..."), you could try asking Kaj for his :)
I'm new to StackOverflow. I'm not sure if this is going to end up as a reply to your comment, or as a new answer.
The somewhat common idea that any project can be open sourced and contributed to by others is a naive view. In the case of my Try REBOL site, it makes no sense. It's not just in heavy development; it's written in a CMS that's also in heavy development. Basically, no one could contribute to it at this point, because I'm the only one who knows my CMS - or in any case its newest features, which I develop by developing Try REBOL and other example sites. So developing Try REBOL means developing the CMS at the same time, and by definition, I'm the only one who can do that.
More generally, my projects are bleeding edge, innovative technology with a strong vision. The vision is mine, and to teach it to others, I have to build it to show how I intended it to work. So there's a catch 22: to enable others to contribute, I have to finish my projects first, because people typically don't understand them until I show them how they work.
There certainly are other projects where mass contribution makes more sense. Still, only the top projects get the contributors. We found that out the hard way. We created Syllable Desktop and Syllable Server with surrounding infrastructure for contributions. These are fairly classic, well understood operating systems that many people could work on in parallel. However, despite years of begging, we get very few contributions.
So, if you feel a burning need to contribute to our projects, please pick one of the many tasks in Syllable to execute. :-)

Automatic testing of web pages (and generating from use cases by DSL)

My goal is:
Our customers could generate new web-tests.
Our continuous integration server makes a test-environment deployment; it should execute the tests against it
The tests could also be run against some other environment.
(Final acceptance tests should still be made by the customer, to test fonts etc., but this would be a great pre-acceptance check of our test environment. Customers could then focus on other things.)
Usually some property (like a text field id) has changed and the tests break within a few weeks. It seems that recorded tests break often, so it's easier to record a new one than to try to maintain and modify an old test.
Now, I have found a whole new approach. Maybe recording is not the right way.
What if our customers could write use cases in their own human-readable language, which the machine would understand and compile into a web test (using a Domain Specific Language, DSL)?
This is not sci-fi; it has already been done, so read on. :-)
I have tried to use these automatic web testing frameworks:
Visual Studio web test (Customers can't execute)
Selenium (Works only with Firefox, our customers have IE)
WatiN (.NET version of Watir, recorder seems to be a bit buggy)
HP Quick Test Pro (Not easy enough to make new tests)
None of these actually provides what I need... but Selenium is the closest one.
Our customers speak Finnish, so at the beginning of a software project, in the specification phase, the user writes a use case like:
Avaa "OmaLomake"
Syötä "Tuomas" kohtaan "nimi"
Paina "Seuraava"
Translation:
Open "MyForm"
Insert "Tuomas" into field "name"
Press "Next"
Now... This is a human-readable use case, but it can also be compiled into an automatic web acceptance test. Open, Insert ... into field and Press are keywords; the others are values.
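To make the idea concrete, here is an illustrative sketch only, not an existing tool: a tiny "compiler" that maps keyword lines like the translated example above onto Selenium WebDriver calls. The base URL and locator conventions are assumptions.

    import re
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()

    def run_step(line):
        # Each keyword maps to one browser action; quoted parts are values.
        if m := re.match(r'Open "(.+)"', line):
            driver.get("http://testserver/" + m.group(1))   # assumed base URL
        elif m := re.match(r'Insert "(.+)" into field "(.+)"', line):
            driver.find_element(By.NAME, m.group(2)).send_keys(m.group(1))
        elif m := re.match(r'Press "(.+)"', line):
            driver.find_element(
                By.XPATH, f'//input[@value="{m.group(1)}"]').click()
        else:
            raise ValueError("Unknown step: " + line)

    use_case = [
        'Open "MyForm"',
        'Insert "Tuomas" into field "name"',
        'Press "Next"',
    ]
    for step in use_case:
        run_step(step)
    driver.quit()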
What kind of DSL tool would be good for this?
Microsoft is making a new DSL-building tool called MGrammar in their Oslo project. It means you can make a custom language so that it is easy for non-technical people to work with machines. (The same basic idea that was tried (and failed) with COBOL and Visual Basic.)
I found that someone has already made this kind of DSL with MGrammar, but it is for Watin, not Selenium:
http://www.codinginstinct.com/2008/11/creating-watin-dsl-using-mgrammar.html
So the continuous integration server process will be:
Fetch a new version from source control (as usual).
Build, run unit tests and analyze the code (as usual).
Make an installation package and tag version in version control (as usual).
Compile use cases to web tests
Run web tests
Accept/Reject the software :-)
Running a web-test in the continuous integration server usually means a lot of configuration work. So, before I try this, I'm curious, what do you think?
Have you used the same kind of setup, and what are your experiences? (What exact environment?)
How about DSL, will it have enough power for use cases or will it be another endless development task? Will the customers ever generate the tests?
First of all, Selenium does work with IE and other browsers as well as Firefox; cross browser support is one of its strengths. Here's the list of supported browsers.
However, if you want a human language-based DSL for writing your tests, take a look at Cucumber - the syntax is almost exactly like your example above. Cucumber already has Finnish language support - see the examples at this link.
FitNesse and Selenium integration tools such as Selenesse (http://github.com/marisaseal/selenesse) or Fitnium (http://www.magneticreason.com/tools/fitnium/fitnium.html) can also serve your purpose. However, you need to decide who will put the element locators into the test cases written by customers. If customers add the locators using recorders, the tests may not be maintainable. If customers write the steps and an automation tester/developer adds the locators using regexes or a custom location strategy, this approach may work.
The TestPlan software uses a specialized language for writing tests. It is highly domain-specific and works very well in web environments. It supports the Selenium backend, so you gain that compatibility, plus it can run without a browser for even faster tests. I have used it on some fairly large web projects in the type of setup you are looking for.
Your example script might look like this:
GotoURL /SomePage
Click MyForm
SubmitForm with
    %Params:name% Tuomas
    %Submit% value:Next
end
That's it. It nicely describes what the user wants to do and is a functioning test. You can combine scripts into units and define custom functions as well. So, if you really wanted, you could write the Finnish equivalents of the names.

How to stress test a javascript-requiring Web App

A similar question was already asked (Performing a Stress Test on Web Application?), but I'd like to test a web application that prevents double-submits and takes some counter-XSRF actions, and therefore requires JavaScript to be evaluated.
Has anybody done stress tests with web apps that require (and use) JS and any experience to share?
JMeter wouldn't work, I guess...
Thanks!
Watir?
Watir is a simple open-source library for automating web browsers.
Watir drives browsers the same way people do. It clicks links, fills in forms, presses buttons. Watir also checks results, such as whether expected text appears on the page.
It drives Internet Explorer, but is also functional with Firefox (and Safari to some extent).
The problem with Watir and Selenium RC or any other full browser solution is that they need a full browser :P
Browsers are very expensive to run, often requiring 300MB or more of RAM. Multiply those requirements by even 100 and you need massive hardware. Fortunately, there is a solution: I recently started a company that does exactly what you're looking for.
Check out http://browsermob.com and you can run a limited test (up to 25 users) to get a feel for the app. Feel free to contact us if you have any questions at all!
One solution that may be worth pursuing is to run Selenium on Amazon EC2 to provide the scalability you need. There is a tutorial over at Selenium using a sample that ships with Selenium Grid. Windows machines are 12.5 cents an hour for a small instance, meaning that a 500-machine test is going to cost $62.50 an hour.
PROS:
Selenium runs in a real browser, meaning that your JavaScript executes as it would on a client
Low cost - trying to do this on your own hardware would cost significantly more
CONS:
You would have to establish network connectivity from Amazon to your application
The testers I work with use Badboy for load testing. I'm fairly certain you can test interactions that use JavaScript, so you should be able to test things like double-submits.
As far as your backend is concerned, it doesn't matter what triggers a request whether it's from JavaScript or a load testing tool as long as the request is valid.
You can create a bunch of fake requests that do lots of different things (hopefully representative of actual usage patterns) and slam your webserver with a load testing tool; a minimal sketch follows the list below.
There's a bunch out there:
JMeter
http_load
Grinder
httperf
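For illustration only (and not one of the tools above), here is a very small concurrent-request sketch in plain Python using requests and a thread pool; the URL and payloads are placeholders, and dedicated load tools give far better reporting:

    from concurrent.futures import ThreadPoolExecutor
    import requests

    URL = "http://localhost/app/submit"                      # placeholder
    payloads = [{"user": f"user{i}", "action": "save"} for i in range(200)]

    def hit(payload):
        # One "fake request"; returns status code and server response time.
        resp = requests.post(URL, data=payload, timeout=10)
        return resp.status_code, resp.elapsed.total_seconds()

    with ThreadPoolExecutor(max_workers=20) as pool:
        results = list(pool.map(hit, payloads))

    errors = sum(1 for code, _ in results if code >= 400)
    avg = sum(t for _, t in results) / len(results)
    print(f"{len(results)} requests, {errors} errors, avg {avg:.3f}s")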
Because JMeter is not a browser, it won't interpret the JavaScript code on the pages you are GETting:
JMeter does not process Javascript or applets embedded in HTML pages.
[JMeter Wiki]
So, what can you do? You can add WebDriver to your JMeter test and, by doing so, have the web pages evaluated in a real browser.
Web Driver Sampler automates the execution and collection of
Performance metrics on the Browser (client-side). A large part of
performance testing, up to this point, has been on the server side of
things. However, with the advancement of technology, HTML5, JS and CSS
improvements, more and more logic and behaviour have been pushed down
to the client. This adds to the overall perceived performance of
website/webapp, but this metric is not available in JMeter. Things
that add to the overall browser execution time may include:
Client-side Javascript execution - eg. AJAX, JS templates
CSS transforms - eg. 3D matrix transforms, animations
3rd party plugins - eg. Facebook like, Double click ads, site analytics, etc
All these things add to the overall browser execution time, and this
project aims to measure the time it takes to complete rendering all
this content.
Official guide: https://jmeter-plugins.org/wiki/WebDriverTutorial/
I've tried Badboy which is OK. The big, fat, heavy tool is SilkTest. It requires a lot of programming to get up and running, but you can get something very solid done!
If you only need to stress test requests from e.g. IIS log files, I have a custom-built tool. I will publish it at CodePlex very soon.
Selenium RC is another alternative.
Also related, check out my recent article on Ajaxian. I think it does a good job of explaining why real browsers do matter and why executing JavaScript is becoming more important for load testing.
http://ajaxian.com/archives/why-load-testing-ajax-is-hard
There's a new tool in this area called k6. It has a way to access the DOM, and I'm planning to try it in a project.
For the background story, you can visit this and this blog; maybe it will help.