How can I find out my website's maximum capacity for simultaneous visitors? - shared-hosting

How can I find out the maximum number of visitors my website can handle at the same time?
I am looking for a kind of stress test for unexpected situations.

You are right to think in terms of a stress test. You need to be able to reproduce the number of users you expect in order to know precisely how many concurrent users your application can handle.
Start with a low number of users and increase it until you reach the point where your app stops answering in an acceptable amount of time.
I'm afraid there is no simple answer to this, but the simplest approach I can think of is to write a simple script that makes GET/POST requests (maybe even using wget) and run it on a farm of machines on Amazon EC2 or something similar, so you can truly reach the max capacity of your infrastructure.
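The ramp-up idea above can be sketched in Python roughly as follows. This is only a minimal illustration, not a replacement for a real load-testing tool: the function names (`fetch`, `run_step`, `find_capacity`), the doubling strategy and the 2-second latency limit are all assumptions made for the example, and a single machine running this script may itself become the bottleneck.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10.0):
    """Issue one GET request and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def run_step(request_fn, users, requests_per_user=5):
    """Run `users` concurrent workers; return (latencies, error_count)."""
    def worker(_):
        latencies, errors = [], 0
        for _ in range(requests_per_user):
            try:
                latencies.append(request_fn())
            except Exception:
                errors += 1  # any failed request counts as an error
        return latencies, errors

    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(worker, range(users)))
    all_latencies = [t for latencies, _ in results for t in latencies]
    total_errors = sum(errors for _, errors in results)
    return all_latencies, total_errors


def find_capacity(request_fn, max_latency=2.0, start_users=1, max_users=256):
    """Double the user count until the slowest request exceeds max_latency
    or any request fails; return the last user count that passed (0 if none)."""
    last_ok = 0
    users = start_users
    while users <= max_users:
        latencies, errors = run_step(request_fn, users)
        if errors or not latencies or max(latencies) > max_latency:
            break
        last_ok = users
        users *= 2
    return last_ok
```

A call such as `find_capacity(lambda: fetch("http://localhost:8000/"))` would run it against a hypothetical local server; only point it at sites you own or are allowed to test.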

If your site is primarily static content, then you will most likely be limited by bandwidth. In this case, an estimate of the capacity can be easily calculated for a given set of expected user activities.
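As a rough illustration of that bandwidth estimate, here is a small Python sketch. The inputs (a 100 Mbit/s link, 500 KB per page including its assets, one page view every 30 seconds per user) are made-up numbers, and the calculation ignores protocol overhead, caching and burstiness.

```python
def static_capacity(bandwidth_mbps, page_kb, seconds_between_pages):
    """Estimate (pages per second, concurrent users) for a bandwidth-bound site."""
    link_bytes_per_s = bandwidth_mbps * 1_000_000 / 8   # Mbit/s -> bytes/s
    pages_per_s = link_bytes_per_s / (page_kb * 1000)   # full page loads per second
    users = pages_per_s * seconds_between_pages         # users browsing at that pace
    return pages_per_s, users


# Assumed example values: 100 Mbit/s, 500 KB pages, one page every 30 s.
print(static_capacity(100, 500, 30))  # (25.0, 750.0)
```

With those assumed numbers the link saturates at about 25 page loads per second, or roughly 750 users browsing at that pace.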
If you have a site that is built on common software, you might be able to find benchmarks of that software that will give you a rough estimate of the capacity you can expect.
If this is a critical site or it is a hand-built or highly customized application, then there is no substitute for testing. You need "web performance load testing" software - google for it. This type of software will simulate many browsers visiting your site at the same time. There is a wide variety of choices, from free to $$$$$$$$s.

Related

How to start writing performance tests on the project (JMeter + Selenium bundle)

I want to start writing performance tests on my project (JMeter + Selenium bundle). The following questions arose:
Where to start? What questions should I set before I start writing tests?
What metrics should I concentrate on first if I'm interested in stability and, partly, performance (for example, the speed of page loading)?
Is it possible to integrate with the existing Test Automation
Framework to avoid writing tests specifically for JMeter from
scratch?
I would be glad of any other advice, tips, links, etc.
You need to determine your goal prior to starting any testing; for example, you can ask around for any SLAs or NFRs you can stick to. If they don't exist, you can mimic real-life usage of your application and increase the load until errors start occurring or response time exceeds reasonable values (it is highly unlikely that a user will wait more than, e.g., 20 seconds for a page to load).
It depends on the nature of the application under test; check out Performance Metrics for Websites for a high-level overview. The key points are:
response time
throughput (requests per second)
application under test health (how much headroom in terms of CPU, RAM, network/disk IO is left)
how the KPIs correlate (e.g. when you add more users, does throughput increase, remain the same or decrease)
how (if) your application scales
etc.
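The first two metrics in the list above can be computed from raw samples along these lines. This is a sketch under assumptions: `latencies` is a list of response times in seconds collected during one test run of `duration_s` seconds, and the nearest-rank percentile is just one of several common percentile definitions. Tools such as JMeter report similar figures themselves.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies, duration_s):
    """Summarize one test run: throughput plus response-time statistics."""
    ordered = sorted(latencies)
    return {
        "requests": len(ordered),
        "throughput_rps": len(ordered) / duration_s,
        "mean_s": sum(ordered) / len(ordered),
        "p50_s": percentile(ordered, 50),
        "p95_s": percentile(ordered, 95),
        "max_s": ordered[-1],
    }
```

Running `summarize` once per load level and comparing `throughput_rps` against the number of simulated users is one simple way to see whether throughput still grows as users are added.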
For JMeter there is one main rule: a JMeter test should represent a real user using a real browser as closely as possible (mind cookies, headers, cache, embedded resources, think times, etc.). You can build a test "skeleton" fairly quickly and easily using JMeter Proxy Server. For Selenium it will be enough to stick to the main OOP principles plus the Page Objects pattern.

PDF generation performance

I can't find enough data about PDF generation performance. I'm planning to create a system, and one of its features is to generate PDFs. They are mostly simple ones of about 3-5 pages with only text and tables, and occasionally a logo.
What's bothering me is the requirement to support high user traffic (about 2500 requests per second).
Do you know any tools (preferably in Java) that are fast and reliable enough to serve that many users as quickly as possible? How long will it take to serve this number of people on a single, average machine? I would appreciate any info about experience on this topic.
You almost certainly have to execute some tests with your typical workload on your typical machine. This is probably the only way you can evaluate whether any tools will be able to do what you need.
2500 requests per second is a non-trivial requirement, so you are right to be concerned. If that 2500/sec is a sustained load and each request has to produce the 3-5 page PDF, you simply might not be able to keep up on a "single average machine". It's not only processing power you'll have to consider, but also memory and IO performance.
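A back-of-the-envelope check using Little's law (average work in progress = arrival rate x time per item) shows why. The 20 ms render time and the 70% target utilization below are purely assumed figures for illustration; the real per-document time has to come from your own measurements.

```python
import math


def required_workers(requests_per_s, ms_per_pdf, target_utilization=0.7):
    """Return (average PDFs in progress, worker threads/cores to provision)."""
    # Little's law: in-progress items = arrival rate * time spent per item.
    busy = requests_per_s * ms_per_pdf / 1000
    # Leave headroom so queues do not grow during bursts.
    return busy, math.ceil(busy / target_utilization)


# Assumed figures: 2500 requests/s, 20 ms of CPU time per PDF.
print(required_workers(2500, 20))  # (50.0, 72)
```

Under those assumptions about 50 documents are being rendered at any instant, which already points at dozens of cores rather than one average machine, before memory per in-flight document is even considered.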
From experience iText is fast and Docmosis has some built-in facilities to distribute load to other hosts. I've seen both working stably under load. Be careful with memory management when you have that many documents on the fly - if you fall behind you might "blow up" no matter what document engine you use.

Concurrent page request comparisons

I have been hoping to find out what different server setups equate to in theory for concurrent page requests, and the answer always seems to be soaked in voodoo and sorcery. What is the approximation of max concurrent page requests for the following setups?
apache+php+mysql(1 server)
apache+php+mysql+caching(like memcached or similar (still one server))
apache+php+mysql+caching+dedicated Database Server (2 servers)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/single dbserver)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/multi dbserver)
+distributed (amazon cloud elastic) -- I know this one is "as much as you can afford" but it would be nice to know when to move to it.
I appreciate any constructive criticism. I am just trying to figure out when it's time to move from one implementation to the next, because each comes with its own implementation cost, either programming-wise or setup-wise.
In your question you talk about caching, and this is probably one of the most important factors in a web architecture with regard to performance and capacity.
Memcache is useful, but before that you should ensure proper HTTP cache directives on your server responses. This does two things: it reduces the number of requests and it speeds up server response times (if you have Apache configured correctly). This can also be improved by using an HTTP accelerator like Varnish and a CDN.
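To make "proper HTTP cache directives" a little more concrete, here is a hedged Python sketch of the kind of `Cache-Control` values one might send for different response types. The three categories and the lifetimes are assumptions chosen for illustration; in practice these headers are usually set in the Apache configuration or in the application framework rather than in a helper like this.

```python
def cache_headers(kind):
    """Return illustrative Cache-Control headers for a kind of response."""
    if kind == "static":
        # Fingerprinted CSS/JS/images: safe to cache for a long time anywhere.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    if kind == "page":
        # Shared HTML that changes occasionally: short lifetime in shared caches.
        return {"Cache-Control": "public, max-age=60"}
    # Anything per-user: keep it out of shared caches entirely.
    return {"Cache-Control": "private, no-store"}
```

Responses marked `public` can be served by Varnish or a CDN without reaching Apache or PHP at all, which is where the capacity gain comes from.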
Another factor to consider is whether your system is stateless. Stateless usually means that it doesn't store sessions on the server and reference them with every request. A good systems architecture relies on state as little as possible: the less state, the more horizontally scalable the system. Most people introduce state when confronted with issues of personalisation, i.e. serving up different content for different users. In such cases you should first investigate using HTML5 session storage (i.e. storing the complete user data in JavaScript on the client, obviously over HTTPS) or, if the data set is smaller, secure JavaScript cookies. That way you can still serve up cached resources and then personalise with JavaScript on the client.
Finally, your stack includes a database tier, another potential bottleneck for performance and capacity. If you are only reading data from the system, then again it should be quite easy to scale horizontally. If there are both reads and writes, it's typically better to keep the read-write datasets in one database and the read-only data in another. You can then use the scaling methods most suited to each.
These setups do not spit out a single answer that you can then compare with the others. The answer depends on far more factors than you have listed.
Even if they did spit out a single answer, it would be just one metric out of dozens. What makes this the most important metric?
Even worse, none of these alternatives is free. Each involves engineering effort and maintenance overhead, which cannot be analysed without understanding your organisation, your app and your cost/revenue structures.
Options like AWS not only involve development effort but may "lock you in" to a solution so you also need to be aware of that.
I know this response is not complete, but I am pointing out that this question touches on a large complicated area that cannot be reduced to a single metric.
I suspect you are approaching this from exactly the wrong end. Do not go looking for technologies and then figure out how to use them. Instead profile your app (measure, measure, measure), figure out the actual problem you are having, and then solve that problem and that problem only.
If you understand the problem and you understand the technology options then you should have an answer.
If you have already done this and the problem is concurrent page requests then I apologise in advance, but I suspect not.

Avoid user access speed getting slow as the number of visitors increases

For an ecommerce website, as the number of visitors keeps increasing, user access speed on the website gets slower.
Is there any solution to avoid user access speed becoming slow as the number of visitors increases?
Many thanks!
I think that the answer depends on many variables. Probably too many.
First of all it depends on these factors:
The software used for the site (is it something written from scratch, something you bought, or an open source ecommerce project?)
The bandwidth available (you can increase it if needed)
The quality of the code (I have seen software that loads several tables when rendering some pages, which makes those pages load very slowly)
The hardware, and how many sessions it can handle concurrently.
etc.
Obviously, if the site slows down when the number of users grows by only a few, then there is probably a problem with the software (configuration, bad code, and so on).
If you provide more details, the answer could be more accurate.

Are there well-identified patterns for software scalability testing?

I've recently become quite interested in identifying patterns for software scalability testing. Due to the variable nature of different software solutions, it seems to me that there are as many good solutions to the problem of scalability testing software as there are to designing and implementing software. To me, that means that we can probably distill some patterns for this type of testing that are widely used.
For the purposes of eliminating ambiguity, I'll say in advance that I'm using the Wikipedia definition of scalability testing.
I'm most interested in answers proposing specific pattern names with thorough descriptions.
All the testing scenarios I am aware of use the same basic structure for the test, which involves generating a number of requests on one or more requesters targeted at the processing agent to be tested. Kurt's answer is an excellent example of this process. Generally you will run the tests to find some thresholds and also run some alternative configurations (fewer nodes, different hardware, etc.) to build up accurate averaged data.
A requester can be a machine, network card, specific software or thread in software that generates the requests. All it does is generate a request that can be processed in some way.
A processing agent is the software, network card, machine that actually processes the request and returns a result.
However what you do with the results determines the type of test you are doing and they are:
Load/Performance Testing: This is the most common one in use. The results are processed to see how much is processed at various levels or in various configurations. Again, what Kurt is looking for above is an example of this.
Balance Testing: A common practice in scaling is to use a load balancing agent which directs requests to a processing agent. The setup is the same as for Load Testing, but the goal is to check the distribution of requests. In some scenarios you need to make sure that an even (or as close to even as is acceptable) balance of requests across processing agents is achieved, and in other scenarios you need to make sure that the processing agent that handled the first request for a specific requester handles all subsequent requests (web farms commonly need this).
Data Safety: With this test the results are collected and the data is compared. What you are looking for here is locking issues (such as a SQL deadlock) that prevent writes, and whether data changes are replicated to the various nodes or repositories you have in use within an acceptable time.
Boundary Testing: This is similar to load testing, except the goal is not to measure processing performance but how the amount of stored data affects performance. For example, if you have a database, how many rows/tables/columns can you have before the I/O performance drops below acceptable levels?
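As an illustration of the two balance-testing checks described above (even distribution and requester affinity), here is a minimal Python sketch. It assumes you can extract a log of `(requester_id, agent_id)` pairs from the test run; the function name, the log format and the 10% tolerance are all hypothetical choices for the example.

```python
from collections import Counter, defaultdict


def balance_report(log, agents, tolerance=0.10):
    """Check request distribution and stickiness.

    log: iterable of (requester_id, agent_id) pairs, one per handled request.
    agents: non-empty list of all processing agents behind the balancer.
    Returns (requests per agent, evenly_balanced, requesters seen on >1 agent).
    """
    log = list(log)
    counts = Counter({agent: 0 for agent in agents})  # include idle agents
    counts.update(agent for _, agent in log)
    expected = len(log) / len(counts)
    evenly_balanced = all(
        abs(n - expected) <= tolerance * expected for n in counts.values()
    )
    agents_per_requester = defaultdict(set)
    for requester, agent in log:
        agents_per_requester[requester].add(agent)
    non_sticky = sorted(r for r, seen in agents_per_requester.items() if len(seen) > 1)
    return dict(counts), evenly_balanced, non_sticky
```

Depending on the scenario, only one of the two results matters: an even split for stateless agents, or an empty `non_sticky` list when each requester must stay on the same agent.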
I would also recommend The Art of Capacity Planning as an excellent book on the subject.
I can add one more type of testing to Robert's list: soak testing. You pick a suitably heavy test load, and then run it for an extended period of time - if your performance tests usually last for an hour, run it overnight, all day, or all week. You monitor both correctness and performance. The idea is to detect any kind of problem which builds up slowly over time: things like memory leaks, packratting, occasional deadlocks, indices needing rebuilding, etc.
This is a different kind of scalability, but it's important. When your system leaves the development shop and goes live, it doesn't just get bigger 'horizontally', by adding more load and more resources, but in the time dimension too: it's going to be running non-stop on the production machines for weeks, months or years, which it hasn't done in development.
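One simple way to spot the slow build-up a soak test is looking for is to sample a resource (memory, open connections, response time) at intervals and fit a trend line. The sketch below uses an ordinary least-squares slope; the sample format and the idea of comparing the slope against a threshold are assumptions for illustration, not a standard procedure.

```python
def growth_per_hour(samples):
    """Least-squares slope of (hours_since_start, value) samples.

    Needs at least two samples taken at different times.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    return cov / var


# Assumed readings: resident memory in MB, sampled once per hour.
readings = [(0, 100), (1, 110), (2, 120), (3, 130)]
print(growth_per_hour(readings))  # 10.0
```

A slope that stays clearly above zero across an overnight or week-long run is a hint of a leak or similar accumulation, whereas a short performance test would show nothing unusual.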