What methods do you use to test for scalability in web applications?

Our testing system is pretty rudimentary; fire up a browser, see if it works. Recently we ran into problems, found by our client, with our application where the number of users created a slow-down in the application. The application is basically a huge Word document with people editing their own versions all at the same time. Part of the problem came from not knowing how to test multiple instances at the same time. My partner and I thought about how to test this; one idea was to hire out an internet cafe and hire students for an hour to bang on the app.
What are other ways that people have tried to emulate concurrency in testing their web-based application? Most of the advice here is for specific methodology; I'm asking, how do you test it to make sure that it works?

If you have never checked out Selenium, then you need to. It will allow you to do automated web testing through the browser. OK, so first problem solved.
Now, ideally you could take that same script, load it up on a bunch of boxes, and run them all at once to get some sort of load testing, right? Luckily for you, someone has already figured this out, although it is a paid service: Browser Mob. But it looks like you were willing to spend a little money to do this anyway, and it would probably net you better, more repeatable results.

We usually answer the question "can the web application do more than one thing at a time" by using JMeter to produce a simulated HTTP load on the web server.
I find that it helps to distinguish several different types of testing: concurrency (what happens when two events in the system collide), capacity (what happens when there are many overlapping requests), volume (what happens as data accumulates in the system)...
A large general slowdown, evidenced by response times that fall outside the SLA, is usually related to capacity problems (with contention as a common cause) or volume problems (many users, much data, and the system gets slower over time). The former usually requires some sort of multi-threaded request stream; the latter you can usually manage by preloading the volume and then measuring the response times experienced by a single user.
I generally find that separating the load generator from the actual measurement/instrumentation is a good idea. That can be as simple as having a black box over there to generate a typical load, and sitting here with a stop watch measuring the responsiveness of a typical use case.
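As a rough illustration of the "multi-threaded request stream" idea, here is a minimal Python sketch that fires concurrent GET requests and records response times. It is not a substitute for JMeter. The `DemoHandler` server is a hypothetical stand-in for the application under test, and the user counts are arbitrary assumptions; in practice you would point `run_load` at your own URL.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Ignore any proxy settings so requests to localhost go direct.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the application under test."""

    def do_GET(self):
        time.sleep(0.01)  # pretend to do some work
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def start_demo_server():
    """Start the stand-in server on a free local port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def timed_get(url):
    """Return the wall-clock seconds that one GET request takes."""
    start = time.perf_counter()
    with OPENER.open(url, timeout=10) as response:
        response.read()
    return time.perf_counter() - start


def run_load(url, users, requests_per_user):
    """Issue requests from `users` concurrent workers; return sorted latencies."""
    total = users * requests_per_user
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(timed_get, url) for _ in range(total)]
        return sorted(f.result() for f in futures)


if __name__ == "__main__":
    server = start_demo_server()
    url = f"http://127.0.0.1:{server.server_port}/"
    latencies = run_load(url, users=10, requests_per_user=5)
    median = latencies[len(latencies) // 2]
    print(f"requests={len(latencies)} median={median:.3f}s max={latencies[-1]:.3f}s")
    server.shutdown()
    server.server_close()
```

Comparing the median and maximum against the SLA while increasing `users` gives a first hint of where capacity problems start. The "stop watch" measurement of a typical use case can still be done separately while this load is running.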

JMeter http://jmeter.apache.org/

Related

Can BDD work for Big Data ETL testing?

I was wondering if anyone uses BDD for testing a Big Data ETL application?
I can see how BDD can be used for testing applications that have a client interacting with them, but in the case of a Big Data ETL application there is no client interaction, so it's hard to see what 'When' I might use.
For example:
Given 100 events of type A occur
And 50 events of type B occur after 5 minutes
Then database rows should be:
|Type|Count|Bucket|
|A|100|1|
|B|50|2|
But that seems wrong.
Anyone with any insight?
Can you give me an example of what you'd expect to see in an ETL output?
There are a couple of responses you could give to this. One might be the different kinds of database rows you'd expect, and the fact that some of them will probably be repeated, but not others. That was something that struck me as weird, but if you're used to working with star schemas then you'll probably notice other differences instead.
Normally I'd steer people away from talking about the database, but if you're working with star schemas, I think it's OK to mention the facts and dimensions (I haven't worked with ETL a lot, but I do remember talking through specific examples of these and what I would expect to see).
The alternative is to use the client.
I saw that you said there was no client; however, there's always a client, even if it's one that might exist in the future. There are implications for ETL which run across security, performance and access, amongst others. It's worth having a client, even if it's a string-based or SQL-based toy, to explore the things which might trip you up.
Why are you doing this? What's new about the thing the business or users or customers will be able to do when this is in place, that they can't do already? And can you get hold of an example of that?
"We'll be able to understand how X is performing against Y standard."
Great. Can you give me an example of some X, some Y, and some standard? How will you measure the performance? What data will you be looking for? Should everyone be able to see that data? Can you think of any scenario where someone shouldn't be able to access that?
Those examples become the ETL equivalent of scenarios; the conversations retain the same pattern. You just end up automating them at a different level, since your API is machine-oriented rather than human-oriented, and some of your conversations will be about monitoring instead of testing. Your conversations should still be with the people.
Your "when" will be the query or report that you run, within the data, permission and security context in which you run it.
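To make that concrete, here is a small Python sketch of the asker's example with an explicit "when". Everything in it is an assumption made for illustration: `run_etl` is a toy aggregation standing in for the real pipeline, `report` is a hypothetical query a client might run, and the 5-minute bucket width is guessed from the example table.

```python
from collections import Counter

BUCKET_MINUTES = 5  # assumed bucket width, inferred from the example


def bucket_for(minute):
    """Map a minute offset to a 1-based bucket number."""
    return minute // BUCKET_MINUTES + 1


def run_etl(events):
    """Toy ETL: aggregate (type, minute) events into (type, bucket) counts."""
    return Counter((etype, bucket_for(minute)) for etype, minute in events)


def report(warehouse):
    """The 'when': the query a client would run against the loaded data."""
    return sorted((etype, count, bucket)
                  for (etype, bucket), count in warehouse.items())


def test_events_are_counted_per_bucket():
    # Given 100 events of type A, and 50 events of type B five minutes later
    events = [("A", 0)] * 100 + [("B", 5)] * 50
    warehouse = run_etl(events)
    # When the bucket report is run
    rows = report(warehouse)
    # Then each type is counted in its own bucket (Type, Count, Bucket)
    assert rows == [("A", 100, 1), ("B", 50, 2)]


if __name__ == "__main__":
    test_events_are_counted_per_bucket()
    print("scenario passed")
```

The scenario reads the same as one for an interactive application; only the "when" has moved from a user action to a query against the loaded data.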
BDD always works for application logic inside the Big Data space. Remember the testing triangle principle: have your unit tests, then practice BDD and build your integration and acceptance tests with it, within your sprint. It's not recommended to have your test data maintained externally, so validating the E2E flow with all its moving pieces needs to be lightweight. Practice TDD as well, if the project permits.

UI automation best practices

We have developed some UI automation test cases. Currently we are executing them against an application that is still under development. We have observed that the majority of scripts fail during execution because of application performance issues (a window did not load properly, a window took longer than expected to load, etc.).
To avoid this, we are planning to check which step failed and re-execute it, to see whether the window has loaded properly and, if so, continue execution. But I have a feeling that this approach may mask some of the application's performance issues, and I am not sure whether we should follow it.
I would like to know whether this can be counted as a best practice.
If you implement some mechanism for re-trying the operation that just failed, you'll keep falling in holes because sometimes, a re-try is not possible due to the app being in an unexpected UI state, or similar things.
Usually, each application has an expected, and a worst, response time. Take that time and use it as the maximum timeout for playback configuration.
Always try to predict what should happen when, and script accordingly. Making your script tolerate unexpected UI states (like long delays, etc.) just turns your testing effort into more of a "passive" automation effort.
As a rather crude measure, you could design a recovery scenario that retries the operation at least once (or for a specific period of time). This could help you get a "stable" playback without finding out what timeouts to use.
But generally: If a window takes too long to show up, it is a defect. If your timeout is too low, it is a bug -- in your test robot config. If it is not defined what "takes too long" means, get the performance requirements.
Thus: Fix accordingly.
That's my 2 (OK -- 3) cents :)
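A tool-neutral Python sketch of that advice follows. It waits for a condition up to the worst-case response time and fails loudly if that time is exceeded, rather than retrying blindly. `wait_until` and `StepTimeout` are hypothetical names, not part of QTP or any other tool, and the timeout value is assumed to come from your performance requirements.

```python
import time


class StepTimeout(AssertionError):
    """Raised when a UI condition is not met within the allowed time."""


def wait_until(condition, timeout_s, poll_s=0.05):
    """Poll `condition` until it returns True or `timeout_s` elapses.

    Returns the seconds waited, so slow-but-passing steps can still be
    logged and reviewed instead of being silently masked.
    """
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise StepTimeout(f"condition not met within {timeout_s}s")
        time.sleep(poll_s)


if __name__ == "__main__":
    opened_at = time.monotonic() + 0.2  # pretend the window appears after 0.2s
    waited = wait_until(lambda: time.monotonic() >= opened_at, timeout_s=2.0)
    print(f"window appeared after {waited:.2f}s")
```

Logging the returned wait time for every step keeps the performance signal visible even while the scripts stay green.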
Not the "best" but working practice.
Scripts must be portable from environment to environment (and we all know that test environments are much slower than UAT/Pre-prod, or Production), with minimal or zero maintenance effort.
Therefore:
use synchronization
don't hard-code what can change
make scripts configurable from the outside of QTP IDE
With regards to the little piece of GUI Step Automation, here's a general heuristic and acronym to remember: SEED NATALI.
SEED NATALI acronym stands for the following.
Synchronize till object
Exists
Enabled
Displayed
verify Number of Arguments
verify Type of Arguments
Log test flow
Investigate any issues occurred
Thank you,
Albert Gareev
http://automation-beyond.com/
If the objective is to perform functional testing, then it would be helpful to define a benchmark for the response time of the application in each environment. For example, for a web application the maximum load time might be defined as 20 sec, and for another application as 10 sec. Once you have a clear benchmark, you are in a position to catch the issues.
Please note that there are many criteria (like network bandwidth and server types) which need to be taken into consideration while defining the benchmark for an application.
If you're adding the retries now for a phase in the application development where the performance isn't stable yet, you should make sure to remove them when the application stabilizes.
QTP is sufficient for testing the performance of desktop applications or client-server applications for a single user. If you want to test the performance of a client-server application (e.g. web) for many users, you should perhaps consider using a load testing tool like LoadRunner.

How to get your clients to test

I build web apps for a living.
An important but often painful process is client/user acceptance testing.
How do you manage this process?
i.e. How do you get them to test? Do you give them test scripts?
Do you give them a system to log bugs and change requests/feedback. How do you get the client to understand the difference between a bug and a feature change?
How do you get clients to give you repeatable steps to reproduce a bug/issue?
Are there any good web apps for managing this process? (I'm thinking a Basecamp-like app would be very useful for this.)
Thanks,
Ed
Don't give them test scripts.
To me that invalidates the testing process to a large degree because if you're thinking up test cases your software probably handles them because you've thought of them.
The idea of good testing is that there is a level of independence in testing so you can't cater for known test cases and also the client is likely to think of scenarios that you won't, which is the whole idea.
But how do you motivate them? Well, honestly I'd be surprised if they weren't motivated. I've generally found that motivating them to comment on func specs, requirements and other preliminary documentation is a far tougher battle. By the time you get to testing, you've eliminated an important psychological hurdle in that the software is now "real".
How you handle this depends to a large extent on the nature of your relationship to the client. If you have a formal process with an agreed upon spec, you should really be saying that the client has a certain period to sign off and accept the software and inaction is implied acceptance.
If it's an internal client well then that's harder. It probably all comes down to who's driving the project? Who are the stakeholders? These are the people you need to motivate such activity.
Usually the best method that I've come across for client testing is having them send screenshots of the problem and some of the things they did to create it. By this point, most of the testing should have been done in house and the egregious bugs should be weeded out. Having a system that automatically emails me when an error occurs lets me know they are testing, and I get most of the gory details from the stack trace in the email.

Protection against automation

One of our next projects is supposed to be a MS Windows based game (written in C#, with a winform GUI and an integrated DirectX display-control) for a customer who wants to give away prizes to the best players. This project is meant to run for a couple of years, with championships, ladders, tournaments, player vs. player-action and so on.
One of the main concerns here is cheating, as a player would benefit dramatically if he was able to - for instance - let a custom made bot play the game for him (more in terms of strategy-decisions than in terms of playing many hours).
So my question is: what technical possibilities do we have to detect bot activity? We can of course track the number of hours played, analyze strategies to detect anomalies and so on, but as far as this question is concerned, I would be more interested in knowing details like
how to detect if another application makes periodical screenshots?
how to detect if another application scans our process memory?
what are good ways to determine whether user input (mouse movement, keyboard input) is human-generated and not automated?
is it possible to detect if another application requests information about controls in our application (position of controls etc)?
what other ways exist in which a cheater could gather information about the current game state, feed it to a bot and send the determined actions back to the client?
Your feedback is highly appreciated!
I wrote d2botnet, a .NET Diablo 2 automation engine, a while back, and something you can add to your list of things to watch out for is malformed/invalid/forged packets. I assume this game will communicate over TCP. Packet sniffing and forging are usually the first way games (online ones, anyway) are automated. I know Blizzard would detect malformed packets, something I tried to stay away from sending in d2botnet.
So make sure you detect invalid packets. Encrypt them. Hash them. Do something to make sure they are valid. If you think about it, if someone knows exactly what every packet sent back and forth means, they don't even need to run the client software, which then makes any process-based detection a moot point. So you can also add in some sort of packet-based challenge-response that your client must know how to respond to.
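For illustration only, here is one way the "hash them" and "challenge response" ideas could look in Python, using a keyed hash (HMAC) from the standard library. The packet layout, the `challenge:` prefix and all function names are assumptions made for this sketch. It also assumes a per-session key already exists; since that key has to live in the client, a determined attacker can still extract it, so this raises the bar rather than closing the hole.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def sign(key, payload):
    """Append an HMAC-SHA256 tag so the receiver can detect forged packets."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def verify(key, packet):
    """Return the payload if the tag is valid, otherwise None."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


def new_challenge():
    """Server side: a fresh random challenge for the client to answer."""
    return secrets.token_bytes(16)


def answer_challenge(key, challenge):
    """Client side: prove knowledge of the key without sending it."""
    return hmac.new(key, b"challenge:" + challenge, hashlib.sha256).digest()


def check_answer(key, challenge, answer):
    """Server side: compare the answer in constant time."""
    return hmac.compare_digest(answer, answer_challenge(key, challenge))


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # assumed to be agreed per session
    packet = sign(key, b"move:e2e4")
    print(verify(key, packet))                        # payload accepted
    print(verify(key, b"move:e2e5" + packet[-32:]))   # forged payload rejected
    challenge = new_challenge()
    print(check_answer(key, challenge, answer_challenge(key, challenge)))
```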
Just an idea: what if the 'cheater' runs your software in a virtual machine (like VMware) and takes screenshots of that window? I doubt you can defend against that.
You obviously can't defend against the 'analog gap', e.g. the cheater's system makes external screenshots with a high quality camera - I guess it's only a theoretical issue.
Maybe you should investigate chess sites. There is a lot of money in chess, they don't like bots either - maybe they have come up with a solution already.
The best protection against automation is to not have tasks that require grinding.
That being said, the best way to detect automation is to actively engage the user and require periodic CAPTCHA-like tests (except without the image and so forth). I'd recommend utilizing a database of several thousand simple one-off questions that get posed to the user every so often.
However, based on your question, I'd say your best bet is to not implement the anti-automation features in C#. You stand very little chance of detecting well-written hacks/bots from within managed code, especially when all the hacker has to do is simply go into ring0 to avoid detection via any standard method. I'd recommend a Warden-like approach (download-able module that you can update whenever you feel like) combined with a Kernel-Mode Driver that hooks all of the windows API functions and watches them for "inappropriate" calls. Note, however, that you're going to run into a lot of false positives, so you need to not base your banning system on your automated data. Always have a human look over it before banning.
A common method of listening to keyboard and mouse input in an application is setting a windows hook using SetWindowsHookEx.
Vendors usually try to protect their software during installation so that hacker won't automate and crack/find a serial for their application.
Google the term: "Key Loggers"...
Here's an article that describes the problem and methods to prevent it.
I have no deeper understanding of how PunkBuster and similar software works, but this is the way I'd go:
Intercept calls to the API functions that handle memory access, like ReadProcessMemory, WriteProcessMemory and so on.
You'd detect if your process is involved in the call, log it, and trampoline the call back to the original function.
This should work for screenshot taking too, but there you might want to intercept the BitBlt function.
Here's a basic tutorial concerning the function interception:
Intercepting System API Calls
You should look into what goes into Punkbuster, Valve Anti-Cheat, and some other anti-cheat stuff for some pointers.
Edit: What I mean is, look into how they do it; how they detect that stuff.
I don't know the technical details, but the Internet Chess Club's BlitzIn program seems to have integrated program-switching detection. That's of course for detecting people running a chess engine on the side and not directly applicable to your case, but you may be able to extrapolate the approach to something like: if process X takes more than Z% CPU time over the next Y cycles, it's probably a bot running.
That in addition to a "you must not run anything else while playing the game to be eligible for prizes" as part of the contest rules might work.
Also, a draconian "we might decide at any time, for any reason, that you have been using a bot and disqualify you" rule helps with the heuristic approach above (it is used in ICC chess tournaments with prizes).
All these questions are easily solved by the rule 1 above:
* how to detect if another application makes periodical screenshots?
* how to detect if another application scans our process memory?
* what are good ways to determine whether user input (mouse movement, keyboard input) is human-generated and not automated?
* is it possible to detect if another application requests information about controls in our application (position of controls etc)?
I think a good way to make the problem harder for the crackers is to keep the only authoritative copies of the game state on your servers, only sending updates to and receiving updates from the clients. That way you can embed client validation (that the client hasn't been cracked and thus the detection rules are still in place) in the communication protocol itself. That, and actively monitoring for new weird behavior, might get you close to where you want to be.
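A toy Python sketch of the server-authoritative idea: the client only requests moves, and the server decides whether they are legal and what the resulting state is. The `GameServer` class, the grid board and the one-step movement rule are invented for illustration and say nothing about the actual game.

```python
class GameServer:
    """Holds the only authoritative copy of a (toy) game state."""

    def __init__(self, board_size=8):
        self.board_size = board_size
        self.positions = {}  # player name -> (x, y)

    def join(self, player):
        """Register a player at the origin."""
        self.positions[player] = (0, 0)

    def apply_move(self, player, dx, dy):
        """Validate a requested move.

        Returns (authoritative_position, accepted). An illegal request
        leaves the state untouched, so a modified client gains nothing.
        """
        x, y = self.positions[player]
        nx, ny = x + dx, y + dy
        one_step = abs(dx) + abs(dy) == 1
        on_board = 0 <= nx < self.board_size and 0 <= ny < self.board_size
        accepted = one_step and on_board
        if accepted:
            self.positions[player] = (nx, ny)
        return self.positions[player], accepted


if __name__ == "__main__":
    server = GameServer()
    server.join("alice")
    print(server.apply_move("alice", 1, 0))  # legal single step
    print(server.apply_move("alice", 5, 0))  # "teleport" is rejected
```

Rejected requests are also a useful signal to log, since a legitimate client should rarely or never send them.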

How much a tester should know about internal details of code?

How useful, if at all, is it for the testers on a product team to know about the internal code details of a product? This does not mean they need to know every line of code, but rather have a good idea of how the code is structured, what the object model is, how the various modules are inter-linked, what the inter-dependencies between various features are, etc. This can arguably help them in finding related issues or defects once they hit one. On the other side, this can potentially 'bias' their "user-centric" approach towards evaluating and certifying the product and can affect the testing results in the end.
I have not heard of any specific model for such interaction. (Let's assume a product that users, potentially non-technical ones, consume, and not a framework or API that the testers are testing. In the latter case the testers may need to understand the code in order to test it, because the user is another programmer.)
That entirely depends upon the type of testing being done.
For functional system testing, the testers can and probably should be oblivious to the details of the implementation -- if they know the details they may inadvertently account for that in their test strategy and not properly test the product.
For performance and scalability testing it's often helpful for the testers to have some high-level knowledge of the structure of the codebase, as it's beneficial in identifying potential performance hotspots, and therefore in writing targeted test cases. The reason this is important is that performance testing is generally a broad, open-ended process, so anything that can be done to focus the testing to get results is beneficial to everybody.
This sounds similar to this previous question: Should QA test from a strictly black-box perspective?
I've never seen a circumstance where a tester who knew a lot about the internals of system was disadvantaged.
I would assert that there are self-justifying myths that an uninformed tester is as adequate as, or even better than, a deeply technical one, because:
It allows project managers to use 'random or low quality resources' for testing. The 'as uninformed as the user myth'. If you want this type of testing - get some 'real' users to test your stuff.
Testers are still often seen as cheaper and less valuable than developers. The 'anybody can do blackbox testing myth'.
Development can defer proper testing to the test team. Two myths in one 'we don't need to train testers' and 'only the test team does testing' myths.
What you are looking at here is the difference between black box (no knowledge of the internals), white box (all knowledge) and grey box (some select knowledge).
The answer really depends on the purpose of the code. For integration-heavy projects, knowing where and how the components communicate, even if it is entirely behind the scenes, allows testers to produce appropriate non-functional test cases.
These test cases determine whether or not a component will gracefully handle the lack of availability of a dependency. They can also be used to identify performance-related issues.
For example: As a tester, if I know that the Web UI component defers a request to an orchestration service that does the real work, then I can construct a scenario where the orchestration takes a long time (high load). If the user then performs another request (simulating user impatience), the web service will receive a second request while the first is still going. If we continually repeat this, the web service will eventually die from stress. Without knowing the underlying model it would not be easy to find the problem.
In most cases black box is preferred for functionality testing; as soon as you move towards non-functional or system integration testing, understanding the interactions can assist in ensuring appropriate test coverage.
Not all testers are skilled at or comfortable with understanding the component interactions or internals, so whether it is appropriate is decided on a per-tester/per-system basis.
In almost all cases we start with black box and head towards white as the need arises.
A tester does not need to know internal details.
The application should be tested without any knowledge of internal structure, development problems, or external dependencies.
If you encumber the tester with that additional info, you push him into a certain testing scheme, and the tester should never be pushed in a direction; he should just test from a non-coder view.
There are multiple testing methodologies that require code reviewing, and also those that don't.
The advantage of white-box testing (i.e. reading the code) is that you can tailor your testing to only test areas that you know (from reading the code) will fail.
Disadvantages include time taken away from actual testing in order to understand the code.
Black-box testing (i.e. not reading the code) can be just as good as (or better than?) white-box testing at finding bugs.
Normally both types of testing can happen on one project, developers white-box unit testing, and testers black-box integration testing.
I prefer Black Box testing for final test regimes
In an ideal world...
Testers should know nothing about the internals of the code
They should know everything the customer will, i.e. have the documents/help required to use the system/application (this definitely includes the API description/documents if it's some sort of code deliverable).
If the testers can't manage to find the defects with these limitations, you haven't documented your API/application enough.
If they are dedicated testers (testing is the only thing they do), then I think they should know as little as possible about the code that they are attempting to test.
Too often they try to determine why it's failing; that is the responsibility of the developer, not the tester.
That said I think developers make great testers, because we tend to know the edge cases for certain types of functionality.
Here's an example of a bug which you can't find if you don't know the code internals, because you simply can't test all inputs:
long long int increment(long long int l) {
    if (l == 475636294934LL) return 3;
    return l + 1;
}
However, in this case it would be found if the tester had 100% code coverage as a target, and looked at only enough of the internals to write tests to achieve that.
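To illustrate the coverage point, here is a Python port of the function above (my translation, not the answerer's code) together with a hypothetical `find_failures` helper that checks each input against the documented behaviour. Inputs picked without looking at the code miss the special case; the one extra input that a coverage report would push the tester towards exposes it.

```python
MAGIC = 475636294934  # the special-cased value from the example above


def increment(n):
    """Python port of the buggy C function: wrong for exactly one input."""
    if n == MAGIC:
        return 3
    return n + 1


def find_failures(inputs):
    """Return the inputs for which increment() violates its documented contract."""
    return [n for n in inputs if increment(n) != n + 1]


if __name__ == "__main__":
    black_box_inputs = [0, 1, -1, 2**31, 10**9]
    print(find_failures(black_box_inputs))            # nothing found
    # A coverage report would show the `if` branch was never taken,
    # prompting a test with the one value that reaches it.
    print(find_failures(black_box_inputs + [MAGIC]))  # the bug shows up
```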
Here's an example of a bug which you quite likely won't find if you do know the code internals, because false confidence is contagious. In particular, it is usually not possible for the author of the code to write a test which catches this bug:
int MyConnect(socket *sock) {
    /* socket must have been bound already, but that's OK */
    return RealConnect(sock);
}
If the documentation of MyConnect fails to mention that the socket must be bound, then something unexpected will happen some day (someone will call it unbound, and presumably the socket implementation will select an arbitrary local address). But a tester who can see the code often doesn't have the mindset of "testing" the documentation. Unless they're really on form, they won't notice that there's an assumption in the code not mentioned in the docs, and will just accept the assumption. In contrast, a tester writing from the docs could easily spot the bug, because they'll think "what possible states can a socket be in? I'll do a test for each". Since no constraints are mentioned, there's no reason they won't try the case that fails.
Answer: do both. One way to do this is to write a test suite before you see/write the code, and then add more tests to cover any special cases you introduce in your implementation. This applies whether or not the tester is the same person as the programmer, although obviously if the programmer writes the second kind of test, then only one person in the organisation has to understand the code. It's arguable whether it's a good long-term strategy to have code only one person has ever understood, but it's widespread, because it certainly saves time getting something out the door.
[Edit: I decline to say how these bugs came about. Maybe the programmer of the first one was clinically insane, and for the second one there are some restrictions on the port used, in order to workaround some weird network setup known to occur, and the socket is supposed to have been created via some de-weirdifying API whose existence is mentioned in the general sockets docs, but they neglect to require its use. Clearly in both these cases the programmer has been very careless. But that doesn't affect the point: the examples don't need to be realistic, since if you don't catch bugs that only a very careless programmer would make, then you won't catch all the actual bugs in your code unless you never have a bad day, make a crazy typo, etc.]
I guess it depends on how thorough you want the testing to be. If you just want to sanity check the common scenarios, then by all means, just give the testers / pizza-eaters the application and tell them to go crazy.
However, if you'd like to have a chance at finding edge cases, performance or load issues, or a whole lot of other issues that hide in the depths of your code, you'd probably be better off hiring testers who know how and when to use white box techniques.
Your call.
IMHO, I think the industry view of testers is completely wrong.
Think about it ... you have two plumbers, one is extremely experienced, knows all the rules, the building codes, and can quickly look at something and know if the work is done right or not. The other plumber is good, and gets the job done reliably.
Which one would you want to do the final inspection to make sure you don't come home to a flooded house? In fact, in what other industry do they allow someone who knows hardly anything about the system they are inspecting to actually do the inspection?
I have seen the bar for QA go up over the years, and that makes me happy. In time, QA may become something that devs aspire to be.
In short, not only should they be familiar with the code being tested, but they should have an understanding that rivals the architects of the product, as well as be able to effectively interface with the product owner(s) / customers to ensure that what is being created is actually what they want. But now I am going into a whole separate conversation ...
Will it happen? Probably sooner than you think. I have been able to reduce the number of people needed to do QA, increase the overall effectiveness of the team, and increase the quality of the product simply by hiring very skilled people with dev / architect backgrounds with a strong aptitude for QA. I have lower operating costs, and since the software going out is higher quality, I end up with lower support costs. FWIW ... I have found that while I can backfill the QA guys effectively into a dev role when needed, the opposite is almost always not true.
If there is time, a tester should definitely go through a developer's code. This way, you can improve your tests to get better coverage.
So, if you write your black box tests from the spec and think you will have time to execute all of them and still have time left over, going through the code cannot be a bad idea.
Basically it all depends on how much time you have. Another thing you can do to improve coverage is look at the developers' design documents. Those should give you a good idea of what the code is going to look like.
Testers have the advantage of being familiar with both the dev code and the test code!
I would say they don't need to know the internal code details at all. However they do need to know the required functionality and system rules in full detail - like an analyst. Otherwise they won't test all the functionality, or won't realise when the system misbehaves.
For user acceptance testing the tester does not need to know the internal code details of the app. They only need to know the expected functionality and the business rules. When a bug is reported, whoever is fixing it should know the inter-dependencies between the various features.