Top things to look for when testing web pages/sites

I am looking for a prioritized list of things to check when testing a web site. There are things we can do to improve a site's performance, and there are things that hurt it. Is there a guide that developers and testers can follow strictly to deliver the best website experience?
Related:
What should a developer know before building a public web site?

For starters, Yahoo has a great list: Best Practices for Speeding Up Your Web Site.
For security there is OWASP Top Ten.

That your website works as expected on all major web browsers

The guys from Yahoo! have an excellent Developer's Network. Specifically you can look at their Guide to Exceptional Performance.
I would definitely recommend that you follow their suggestion to download YSlow for Firebug and use it to gauge your site's performance up front, then use its grading system as a guide to improving performance.
Keep in mind that these are general rules of thumb and sometimes it is simply not possible to follow all of their best recommendations for improving performance.

Some security problems to check: whether cross-site scripting (XSS) or cross-site request forgery (CSRF) attacks are possible, and whether pages can be accessed without logging in if you know the URL.
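As a concrete illustration of the last point, here is a minimal sketch of a session-based login gate in PHP (the login.php path and session key are placeholder assumptions, not from the question). Every private page needs a check like this at the top; otherwise anyone who knows the URL can load it:

<?php
// auth_check.php - include at the top of every private page.
session_start();

// No authenticated session? Redirect to the login page
// instead of rendering the protected content.
if (empty($_SESSION['user_id'])) {
    header('Location: /login.php');  // placeholder login URL
    exit;                            // stop rendering the page
}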


Web CMS vs API vs Framework - HELP?! :(

I posed this question to my lecturer, but I would also like a variety of answers in order to better understand this conundrum of mine. Here is the original message with names omitted.
Hi **,
Thank you for your intro lecture today, I look forward to the work involved in the coming weeks.
I am, however, rather confused regarding the terms CMS, API, and Framework. The internet isn't providing much help either, because these terms get thrown around a lot, often for the very same thing!
I have a bit of background in LAMP web development, and I will provide a hypothetical scenario where, hopefully, you can tell me where these terms would fit in.
I am using LAMP (Linux web server with Apache, MySQL and PHP).
I am developing a website whereby the public can watch movies (umm... ignore the legal issues, purely hypothetical and for educational purposes of course!)
I create my MySQL database using phpMyAdmin, and tables will involve 'users', 'categories', 'content' etc.
I now create an 'admin control panel (CP)' which I will refer to as the back-end. Authorised users, depending on their access levels (as determined by their account in the 'users' table) can add/edit/delete various things. These changes are reflected on what I call the front-end.
The front-end is the public facing website, whereby the public visit this website to watch films of their choice.
The back-end (i.e. the Admin CP) controls/regulates the content, the users, and pretty much everything else. Over time, the developers could add more features for extra functionality, e.g. a comments section. Alternatively, a developer could use the Facebook Comments API to embed comments into every 'film' page on the front-end, which makes it a lot easier.
Now back to the main question at hand, is this a web CMS? Where would an API fit into this? Is this a framework?
Note: I'm not using anything like WordPress or Joomla etc., it's all custom coded by myself. Using PHP and HTML5, CSS3, maybe a bit of jQuery too, and of course SQL statements via PHP to interact with the MySQL database.
I appreciate your help in this confusion of mine.
Thanks,
EDIT: I have commented my thoughts based on Justin's input. If I'm on the right track let me know, cheers.
Thanks for the post.
The three terms you have stated are used quite often around the web, and their meanings keep shifting. First you have a CMS. CMS stands for Content Management System - WordPress and Joomla, which you mentioned above, are examples. There, someone has already created the software to build a site/blog without having to mess with PHP, MySQL, and Apache. You work only on the front-end, simply posting your content and making it live; the software does all of the back-end work for you.
An API (Application Programming Interface), simply put, is an interface that a service or application exposes so that you can integrate that service into your own site or application.
A framework, like Bootstrap (created by Twitter), is an easy way to develop the front-end of a site. It gives the learning amateur a chance at developing the front-end while picking up good concepts along the way.
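To make the API idea concrete, here is a minimal sketch of consuming a third-party REST API from PHP with cURL; the endpoint and fields are made up for illustration, but the same pattern applies to something like the Facebook Comments API:

<?php
// Fetch JSON from a hypothetical third-party comments API.
$ch = curl_init('https://api.example.com/v1/comments?film_id=42');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return body as a string
$response = curl_exec($ch);
curl_close($ch);

// Decode the JSON and render each comment, escaping for HTML.
$comments = json_decode($response, true);
foreach ($comments as $comment) {
    echo htmlspecialchars($comment['text']), "<br>\n";
}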

How do I make sure my website can block automation scripts, bots?

I'd like to make sure that my website blocks automation tools like Selenium and QTP. Is there a way to do that?
What settings on a website would cause Selenium to fail?
With due consideration to the comments on the original question asking "why on earth would you do this?", you basically need to follow the same strategy that any site uses to verify that a user is actually human. Methods such as asking users to authenticate or to enter text from images will probably work, but they will likely block Google's crawlers and everything else as well.
Doing anything based on user agent strings or anything like that is mostly useless. Those are trivial to fake.
Rate-limiting connections or similar measures might have limited effectiveness, but you're likely to inadvertently block legitimate web crawlers too.
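If you do want to experiment with rate limiting despite that caveat, a crude per-IP counter in PHP might look like the sketch below (it assumes the APCu extension is enabled; a shared store like Redis would work the same way):

<?php
// Crude per-IP rate limiter: allow at most 30 requests per 60 seconds.
$key = 'rate:' . $_SERVER['REMOTE_ADDR'];

apcu_add($key, 0, 60);        // create the counter with a 60s TTL if missing
$hits = apcu_inc($key);       // atomically increment it

if ($hits > 30) {
    http_response_code(429);  // 429 Too Many Requests
    exit('Rate limit exceeded - try again later.');
}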
While this question seems strange, it is funny, so I tried to investigate the possibilities.
Besides adding a CAPTCHA, which is the best and only real solution, you can block Selenium RC by adding the following JavaScript to your pages (this example redirects to the Google page, but you can do anything you want):
<script>
// Selenium RC runs the application inside a frame whose parent URL
// contains "RemoteRunner.html"; detect that and bail out.
var loc = window.parent.location.toString();
if (loc.indexOf("RemoteRunner.html") != -1) {
    // Running under Selenium RC, so redirect away.
    document.location = "http://www.google.com";
}
</script>
I do not know how you can block other automation tools, and I am not sure whether this will also block Selenium IDE.
To be 100% certain that no automated bots/scripts can be run against your website, don't have a website online. This will meet your requirement with certainty.
CAPTCHAs are easy and cheap to break, thanks to crowdsourcing and OCR methods.
Proxies can be found in the wild for free, or bought in bulk at extremely low cost. Again, this makes limiting connection rates or detecting bots by IP largely useless.
One possible approach, in your application logic, is to increase the time and cost of access with things like phone verification or credit card verification. But then your website will never get off the ground, because nobody will trust your site in its infancy.
Solution: do not expect to put your website online and be able to effectively eliminate bots and scripts from running against it.

What should I check before I release a web application?

I have nearly finished a web application. I need to test it and find the security issues before its release. Are there any methods or guidelines for doing this kind of testing? Are there any tools that can help me check whether my application is ready to go online? Thank you.
I would say:
Check that there are no warnings or errors, even in strict mode (error reporting).
In case you store any sensitive data (passwords, credit cards, etc.), be sure it is protected with strong, well-vetted cryptography (hash passwords; never store them in plain text). Use SSL and try to be somewhat paranoid about it.
Set up your database with specific access rights per action and per host, and do not use the root account (see the sketch after this list).
Perform exhaustive testing (use unit tests when possible). Involve as many people as you can.
Test it in the main browsers (Firefox, Chrome, Opera, Safari, IE) and, if you have time, in others.
Validate all your HTML/CSS against the W3C standards. (recommended)
Depending on the platform you are using, there are profilers that can help you identify bottlenecks in your code. (can be done in later stages)
Tune settings for your web server / script language.
Be sure it is search-engine friendly.
Pray once it is online :)
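For the database point above, a minimal sketch assuming MySQL and PDO (the account name, grants, and database are illustrative):

<?php
// Connect as a least-privilege account instead of root.
// The account would be created with something like:
//   CREATE USER 'webapp'@'localhost' IDENTIFIED BY '...';
//   GRANT SELECT, INSERT, UPDATE ON mydb.* TO 'webapp'@'localhost';
$pdo = new PDO(
    'mysql:host=localhost;dbname=mydb;charset=utf8',
    'webapp',                // limited application account, not root
    'app-password-here',
    [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
);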
This is not a complete list, as it depends on:
which language/platform/web server you are using.
what kind of application you developed (social, financial, management, etc.)
who will use the application (the entire world, a specific company, your family, or just you).
Are you going to sell it? Then you must cover at least most of the previous points.
Does your application handle very sensitive information (such as credit cards)? If so, you should pay a professional (or a company) to check your code, settings, and methods.
This is just my opinion, take it as it is. I would also like to hear what other people suggest.
Good Luck
As well as what's already been suggested, depending on what type of application it is, you can use a vulnerability scanner to scan your application for any vulnerabilities that could lead to hackers gaining entry.
There are quite a few good scanners out there, but note when using them that the results may not be 100% accurate, so verify any findings manually.
For a list of scanners, commercial and free, see: http://projects.webappsec.org/Web-Application-Security-Scanner-List
For more information on scanners: http://en.wikipedia.org/wiki/Web_Application_Security_Scanner
Good luck.
Here you can find a practical checklist to use before launching a website
http://launchlist.net/
And here is a list of all the stuff you forgot to test
http://www.thebraidytester.com/downloads/YouAreNotDoneYet.pdf

Implementing CAPTCHA after 50% of Article

We are planning to put large number of Business Research Reports and Articles from our intranet on to the Internet. However, we don't want others to copy the content and host it on their own.
I read about protection via CAPTCHA and was wondering if this is possible. Readers should be able to read 50% of an article for free, after which they must enter a CAPTCHA to read the rest [this way we make life a little harder for the copycats].
Any pointers on how to implement this? The content is in HTML, and we have programming experience in Perl and PHP. We can hire others if required.
Additionally, search engines will only crawl half of each article; will the site be penalized because the crawler cannot get past the CAPTCHA to the rest of the article?
Thanks.
There's a really good CAPTCHA service provided by reCAPTCHA - http://recaptcha.net/
There is a PHP class that you can use to do all the hard work.
It's important to bear in mind that search engines aren't able to solve a CAPTCHA, so they will only index the first half of the report. As long as this half contains most of the relevant keywords, it shouldn't cause a massive problem. Don't make the mistake of "detecting" a search engine and showing it different content than a normal user sees, as the major search engines treat this as spamming.
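As a rough sketch of how that PHP class was typically wired up (this follows the classic recaptchalib.php interface; the reCAPTCHA API has changed over the years, so treat this as illustrative and check the current docs):

<?php
require_once 'recaptchalib.php';   // the classic reCAPTCHA PHP library

$publickey  = 'your-public-key';   // placeholders for your own keys
$privatekey = 'your-private-key';

if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    // Verify the user's answer against the reCAPTCHA service.
    $resp = recaptcha_check_answer(
        $privatekey,
        $_SERVER['REMOTE_ADDR'],
        $_POST['recaptcha_challenge_field'],
        $_POST['recaptcha_response_field']
    );
    if ($resp->is_valid) {
        // ... reveal the protected second half of the article
    }
} else {
    echo recaptcha_get_html($publickey);  // render the CAPTCHA widget
}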
An alternative solution would be to use a service like Copyscape (http://www.copyscape.com/) to protect your content.
I know this is not what you're asking, but please take into account that CAPTCHAs are universally broken and will not protect your content. You said the first half is free; does that mean you intend to charge for the other half? A CAPTCHA won't help you there at all...
But even if you're just trying to prevent automated scraping, CAPTCHA still won't do the trick. Check out my answer to another captcha question... Or you can go straight to the ppt I presented at OWASP last year.
Readers should be able to read 50% of the article for FREE after which a CAPTCHA should be entered to read the rest of the article
Have your PHP programmer output 50% of the article and add a CAPTCHA at the bottom. If the user enters the correct CAPTCHA answer, output 100% of the article.
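A minimal sketch of that flow, where captcha_is_valid() is a hypothetical wrapper around whatever CAPTCHA library you pick (reCAPTCHA above, phpcaptcha below) and get_article() is a stand-in for your own content loader:

<?php
session_start();

// Remember across requests that this visitor solved the CAPTCHA.
if (isset($_POST['captcha']) && captcha_is_valid($_POST['captcha'])) {
    $_SESSION['unlocked'] = true;
}

$article = get_article($_GET['id']);
$half    = (int) (strlen($article) / 2);  // real code should cut at a
                                          // paragraph boundary, not mid-tag

if (!empty($_SESSION['unlocked'])) {
    echo $article;                        // full article after the CAPTCHA
} else {
    echo substr($article, 0, $half);      // free first half
    // ... render the CAPTCHA form here so the user can unlock the rest
}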
Any pointers on how to implment this ? The content is in HTML and programming experience in Perl, PHP. Can hire others if required.
As a PHP programmer, I use http://www.phpcaptcha.org to implement captcha.
Aditionally, search engine will crawl half of the article and wondering if it will penalize the site for not being able to crawl the rest of the article since it won't be able to crack the CAPTCHA ?
No, it won't penalize you, but that particular section will not be shown in the search results.
As already mentioned reCAPTCHA is a good way to go.
Have a look at Captcha::reCAPTCHA on CPAN which, according to the CPAN reviews, "Works out of the box".
If you want a CAPTCHA, there are plenty of modules that do this on CPAN ;-)
Hope that helps.

Checklist for testing a new site

What are the most common things to test in a new site?
For instance to prevent exploits by bots, malicious users, massive load, etc.?
And just as importantly, what tools and approaches should you use?
(some stress-test tools are really expensive or hard to use; do you write your own? etc.)
Common exploits that should be checked for.
Edit: the reason for this question partly comes from being in the SO beta; however, please refrain from SO beta discussion. The beta got me thinking about my own site, and a good thing too. This is meant to be a checklist of things that I, you, or someone else hasn't thought of before.
Try to break your own site before someone else does. Your web site is basically a publicly accessible API that allows access to a database and other backend systems. Test the URLs as if they were any other API. I like to start by cataloging all URLs that have some sort of permanent effect on the state of the system - this is easy if you are doing Ruby on Rails development or trying to follow a RESTful design pattern. For each of those URLs, try the GET, POST, PUT, and DELETE HTTP methods with different parameters, so you can ensure that you're only giving access to what you want to give access to.
This of course is in addition to obvious: Functional testing, Load Testing, SQL Injection, XSS etc.
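As a quick sketch of that kind of probing in PHP with cURL (the endpoint is made up; in practice you would loop over your cataloged URLs):

<?php
// Probe one URL with several HTTP methods to verify that only the
// intended ones succeed (e.g. DELETE should be rejected for guests).
$url = 'https://example.com/articles/42';  // illustrative endpoint

foreach (array('GET', 'POST', 'PUT', 'DELETE') as $method) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $method);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    echo "$method $url => HTTP $status\n";
}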
Turn off JavaScript and make sure your site can still be navigated.
Even if you want to ignore the small but significant number of people who have it disabled, this will impact search engines as well.
YSlow can give you a quick analysis of different metrics.
Check what friendly bots see (e.g. Google) using Google Webmaster Tools.
Regarding tools for running functional tests of web pages, I've found Selenium IDE to be useful.
The Firefox plug-in (only compatible with Firefox 2 at the moment) lets you capture almost all web events, save them, and replay them in the same browser.
In conjunction with another Firefox add-on, Firebug (https://addons.mozilla.org/en-US/firefox/addon/1843), you can create some very powerful tests.
If you set up Selenium Remote Control, you can then convert the Selenium IDE tests into NUnit tests, which you can run automatically.
I use cruise control and run these web tests as part of a daily build.
The nice thing about using Selenium remote control is that it can run the same functional tests on multiple browsers and operating systems, something that you can't do with the IDE.
Although the web tests will take ages to run, there is a version of Selenium called Selenium Grid that lets you use any old hardware you have spare to run the tests in parallel as part of a computing grid. I have not tried this myself, but it sounds interesting.
All of the above is open source and free, which helped me convince management to use it :-)
For checking the cross-browser and cross-platform look of your site, browsershots.org is maybe the best free tool; it can save a lot of time and money.
There are separate stages to this one.
Firstly there's the technical testing, where you check all technical functionality:
SQL injections
Cross-site Scripting (XSS)
load times
stress levels
Then there's the phase where you have someone completely computer-illiterate sit down and ask them to find something. Not only does it show you where the flaws in your navigational logic are (I find that developers look at things very differently from 'other people'), but they're also guaranteed to find some way to break your site.