What is the standard acceptable request/response timeout for an API server (and why)?

I'm working on developing both a web client and an API server. I've been doing some research on default timeouts; some suggest 800 ms, others 1200 ms. However, I can't find the reasoning behind these seemingly arbitrary numbers. Can someone help me with this? An explanation of where the numbers come from would be a great help.
Thanks,

TL;DR: please see the paragraph starting with "The arbitrary number" below. The rest is just extra info on the topic.
Although you might know this or have already read this in your research, I can share the following ideas:
Typically the timeout is set based on the expected complexity of a query, the amount of data to be processed, and the expected load on the system when the query occurs (or any other expected operation that may warrant adjusting a timeout). It can also be based on something like the number of requests your API makes to other APIs to handle an incoming request, and what the expectations for those calls are.
The arbitrary number (the "best guess" of whoever developed the software) is typically chosen for a "most requests should complete in some fraction of this time if there is no issue, regardless of what happens" or "this isn't anything to worry about" kind of scenario. The default timeout values are therefore based on the assumption that they cover the vast majority of acceptable, successfully completed requests where no issue is present. The value usually sits somewhere between "this should be plenty of time" and "there is most likely something terribly wrong with this request, let's end it", and most successful requests pass this test by default.
If you have operations that may take several minutes, and you expect this can happen without an actual issue being present, you may want to set the timeout higher than the default so those requests don't time out when there is no real problem. (For example, most commercial APIs put constraints on the number of requests and the time in which they must complete, so that problematic requests don't clog up the system, among other reasons their developers see fit.)
Thus, there really isn't a great answer or standard here beyond looking at the amount of data and the number of requests to be processed, planning for a reasonable ebb and flow of server load, considering how well your code is optimized for the expected load, and so on. It's a bit like error handling for problems you don't know about yet (such as unexpected bugs), but informed by what you already know about your system and its expected usage.
Generally, you won't have many scenarios where the exact timeout matters all that much, but you always want to have one (at least the default) to prepare for the unexpected.
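For a concrete illustration, here is a minimal sketch (Python with the requests library; the endpoint and the 0.5 s connect / 2 s read values are illustrative guesses, not standards) of setting an explicit client-side timeout and treating it as the "something is probably wrong" signal:

```python
import requests

# Hypothetical endpoint; the numbers below are illustrative guesses, not standards.
URL = "https://api.example.com/orders"

try:
    # (connect timeout, read timeout): fail fast if the server is unreachable,
    # but give a normal request comfortably more time than it should ever need.
    response = requests.get(URL, timeout=(0.5, 2.0))
    response.raise_for_status()
    data = response.json()
except requests.exceptions.Timeout:
    # The request exceeded our "best guess" budget -- treat it as a failure
    # (retry, fall back, or surface an error) instead of waiting indefinitely.
    data = None
```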
I found the following article that talks about the topic and some of what I mentioned as well if you haven't seen it already:
https://medium.com/@masnun/always-use-a-timeout-for-http-requests-de4da538b9e3

tl;dr - mostly it comes down to the SLA (Service Level Agreement). If you don't have one, try to optimize the code as much as possible to bring the response time down into the millisecond range.
I'll put the answer in layman's terms since it really depends on various factors.
Let's assume you have an API that performs some operation and gives the result back. If it doesn't perform any complex operations, it's quite simple and you'll get the response within a few milliseconds.
As we move to more and more complex systems where one API talks to another, the time adds up; in the worst case, the API that started the request might get the final response only after 5 seconds, 30 seconds, or even 60 seconds, depending on the number of API calls involved and how well the system is designed.
And we are only considering the happy flow. What if something goes wrong in one of the APIs that gets called internally?
To avoid this bad experience, clients typically put an SLA in place that requires the company/developers to design the system so that it responds within a certain acceptable range.
I once came across a conversation on Google Groups about this that might provide some insight.
So, to answer the question about the acceptable range: if you don't have an SLA, try to optimize the code as much as possible to bring the response time down into the millisecond range.
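To make the SLA idea concrete, here is a minimal sketch (Python with the requests library; the 2-second budget and the URLs are hypothetical) of giving each downstream call only whatever remains of an overall response-time budget, so a chain of internal API calls cannot blow past the SLA:

```python
import time
import requests

# Hypothetical SLA: the whole request must be answered within 2 seconds.
SLA_BUDGET_SECONDS = 2.0

def call_downstream(urls):
    """Call each downstream API with whatever is left of the overall budget."""
    deadline = time.monotonic() + SLA_BUDGET_SECONDS
    results = []
    for url in urls:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("SLA budget exhausted before all calls completed")
        # Each hop may use at most the remaining budget, so the chain as a whole
        # cannot exceed the SLA even if one dependency is slow.
        results.append(requests.get(url, timeout=remaining).json())
    return results
```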

Generally, 1 second is considered acceptable. The reason for this, and why the suggested numbers vary so much, is that most APIs have a lockout if you send requests too fast, while some APIs will let you send requests faster. In my experience, the APIs I have seen request a 1 s (1000 ms) delay between requests to prevent overload or an accidental DDoS, and have a timeout of 30-60 seconds.
Edit: it is also important not to answer another request from the same IP while the first one is still pending, as that would make a DDoS easy.
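As a rough illustration of that spacing (Python with the requests library; the 1 s interval and 30 s timeout are simply the values mentioned above, not a universal rule), a client-side throttle might look like this:

```python
import time
import requests

# Illustrative only: check the provider's documented rate limits rather than
# assuming one request per second applies everywhere.
MIN_INTERVAL_SECONDS = 1.0

def fetch_all(urls):
    last_request = 0.0
    results = []
    for url in urls:
        # Sleep just long enough to keep at least one second between requests.
        wait = MIN_INTERVAL_SECONDS - (time.monotonic() - last_request)
        if wait > 0:
            time.sleep(wait)
        last_request = time.monotonic()
        results.append(requests.get(url, timeout=30).json())
    return results
```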

Related

Can memcache.cas and memcache.add miss?

memcache.set can fail if the server load is too high, and memcache.get can fail if there is no data for the key on the server. These are called misses. (Is this terminology confusing in the get scenario? It didn't really "miss" anything; it did everything correctly, there just was no data, so the term seems misleading.)
However, would operations like memcache.cas and memcache.add also have this problem of misses, and if so, how is a miss defined for them? I believe they are fundamentally some kind of set operation, so a miss could happen here as well?
I couldn't find API documentation that specifies what happens in all of these scenarios. Using memcache clients makes it even more difficult because each client has its own rules; currently I am trying to understand this at the server level.
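For illustration, here is roughly how the different outcomes surface in one client (a sketch using pymemcache; other clients report these results differently, which is part of the difficulty):

```python
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

# add: stores the value only if the key does NOT already exist.
stored = client.add("counter", "1", noreply=False)
# stored == False means the key was already there ("NOT_STORED" at the protocol level).

# cas: compare-and-set using the token returned by gets.
value, cas_token = client.gets("counter")   # (None, None) if the key is absent -- a get miss
if cas_token is not None:
    ok = client.cas("counter", "2", cas_token, noreply=False)
    # ok == True  -> stored
    # ok == False -> someone else modified the key since gets ("EXISTS")
    # ok is None  -> the key disappeared in the meantime ("NOT_FOUND")
```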

UI automation best practices

We have developed some UI automation test cases, and we are currently executing them against an application that is still under development. We observe that, during execution, the majority of scripts fail due to application performance issues (a window did not load properly, a window took more time than expected to load, etc.).
To avoid this, we are planning to detect which step failed during execution and re-execute it, checking whether the window has loaded properly and, if so, continuing execution. But I have a feeling that this approach may mask some of the application's performance issues, and I am not sure whether we should follow it.
I would like to know whether this can be counted as a best practice.
If you implement some mechanism for re-trying the operation that just failed, you'll keep falling into holes, because sometimes a re-try is not possible due to the app being in an unexpected UI state, or similar things.
Usually, each application has an expected and a worst-case response time. Take the worst-case time and use it as the maximum timeout in the playback configuration.
Always try to predict what should happen when, and script accordingly. Making your script tolerate unexpected UI states (like long delays, etc.) just turns your testing effort into a more "passive" automation effort.
As a rather crude measure, you could design a recovery scenario that retries the operation at least once (or for a specific period of time). This could help you get a "stable" playback without finding out what timeouts to use.
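In rough pseudocode terms (sketched here in Python purely to illustrate the "retry at least once" recovery idea, not as QTP syntax):

```python
import time

def retry_once(action, wait_seconds=5):
    """Run a test step; if it fails, wait briefly and try exactly one more time.

    Illustration only -- every silent retry should at least be logged, otherwise
    real performance defects get masked (the concern raised in the question)."""
    try:
        return action()
    except Exception as first_error:
        print(f"step failed ({first_error}); retrying once after {wait_seconds}s")
        time.sleep(wait_seconds)
        return action()
```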
But generally: if a window takes too long to show up, it is a defect. If your timeout is too low, it is a bug -- in your test robot's config. If it is not defined what "takes too long" means, get the performance requirements.
Thus: Fix accordingly.
That's my 2 (OK -- 3) cents :)
Not the "best" but working practice.
Scripts must be portable. From environment to environment (and we all know, that test environments are much slower than UAT/Pre-prod, or Production) - with minimal / zero effort on maintenance.
Therefore:
use synchronization
don't hard-code what can change
make scripts configurable from outside the QTP IDE
With regards to the little piece of GUI step automation, here's a general heuristic and acronym to remember: SEED NATALI (a short sketch follows the list).
The SEED NATALI acronym stands for the following:
Synchronize till object
Exists
Enabled
Displayed
verify Number of Arguments
verify Type of Arguments
Log test flow
Investigate any issues that occur
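To make the "Synchronize till object Exists / Enabled / Displayed" part concrete, here is a minimal sketch (Python with Selenium rather than QTP, against a hypothetical page and locator, purely to illustrate the idea):

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://example.com/login")   # hypothetical page

def ready(drv):
    elements = drv.find_elements(By.ID, "submit")   # Exists?
    return bool(elements) and elements[0].is_displayed() and elements[0].is_enabled()

# Synchronize: poll until the object exists, is displayed, and is enabled,
# or fail after an explicit timeout instead of sleeping a fixed amount.
WebDriverWait(driver, timeout=10).until(ready)
driver.find_element(By.ID, "submit").click()
```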
Thank you,
Albert Gareev
http://automation-beyond.com/
If the objective is to perform functional testing, then it would be helpful to define a benchmark for the response time of the application in each environment. For example, for one web application the maximum load time might be defined as 20 seconds, while for another application it is 10 seconds. Once you have a clear benchmark, you are in a position to catch the issues.
Please note that while defining the benchmark of an application, there are many criteria (like network bandwidth and server types) that need to be taken into consideration.
If you're adding the retries now for a phase of development where the application's performance isn't stable yet, make sure to remove them once the application stabilizes.
QTP is sufficient for testing the performance of desktop or client-server applications for a single user; if you want to test the performance of a client-server application (e.g. web) with many users, you should perhaps consider using a load-testing tool like LoadRunner.

What is the best way of pulling json data in terms of performance?

Currently I am using HttpWebRequest to pull JSON data from an external site, and the performance is not good. Is WCF much better?
I need expert advice on this.
Probably not, but that's not the right question.
To answer it: WCF, which certainly supports JSON, is ultimately going to use HttpWebRequest at the bottom level, and it will certainly have the same network latency. Even more importantly, it will use the same server to get the JSON. WCF has a lot of advantages in building, maintaining, and configuring web services and clients, but it's not magically faster. It's possible that your method of deserializing JSON is really slow compared to what WCF would use by default, but I doubt it.
And that brings up the really important point: find out why the performance is bad. Changing frameworks is only a sensible optimization option if you know what's slow and, by extension, how doing something different would make it less slow. Is it the server? Is it deserialization? Is it the network? Is it authentication or some other request overhead? And so on.
So the real answer is: profile! Once you know what the performance issue really is, you can make an informed decision about whether a framework like WCF would help.
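As a language-agnostic illustration of that profiling step (sketched here in Python against a hypothetical URL; the same split applies to HttpWebRequest in .NET), separate how long the network round trip takes from how long deserialization takes:

```python
import json
import time
import urllib.request

URL = "https://example.com/data.json"    # hypothetical endpoint

t0 = time.perf_counter()
with urllib.request.urlopen(URL, timeout=30) as resp:
    raw = resp.read()                    # network + remote server time
t1 = time.perf_counter()
data = json.loads(raw)                   # deserialization time
t2 = time.perf_counter()

print(f"fetch: {t1 - t0:.3f}s, parse: {t2 - t1:.3f}s")
# If "fetch" dominates, switching client frameworks (e.g. to WCF) won't help;
# the time is being spent on the network or on the remote server.
```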
The short answer is: no.
The longer answer is that WCF is an API which doesn't prescribe a single communication method, but supports several. However, those methods are normally SOAP-based, which involves more overhead than plain JSON, and it seems the world has decided to move on from SOAP.
What sort of performance are you looking for and what are you getting? It may be that you are simply facing physical limitations of network locations, in which case you might look towards making your interface feel more responsive, even if the data is sluggish.
It'd be worth it to see if most of the latency is just in reaching the remote site (e.g. response times are comparable to ping times). Or, perhaps, the problem is the time it takes for the remote site to generate and serve the page. If so, some intermediate caching might be best.
+1 on what Isaac said, but one thing I'd add: if you do use WCF here, it'll internally use HttpWebRequest in most places, so you're not gaining raw performance there. One way you may unintentionally gain performance, however, is in how WCF recycles, reuses, pools, and caches most transport objects internally. So it ultimately comes back to Isaac's advice on profiling.

What methods do you use to test for scalability in web applications?

Our testing system is pretty rudimentary: fire up a browser, see if it works. Recently our client found problems with our application where the number of users created a slowdown in the application. The application is basically a huge Word document with people editing their own versions all at the same time. Part of the problem came from not knowing how to test multiple instances at the same time. My partner and I thought about how to test this; one idea was to rent an internet cafe and hire students for an hour to bang on the app.
What other ways have people tried to emulate concurrency when testing their web-based applications? Most of the advice here is about specific methodologies; I'm asking how you test it to make sure that it works.
If you have never checked out Selenium, then you need to. It will allow you to do automated web testing through the browser. Ok, so first problem solved.
Now, ideally, you could take that same script, load it up on a bunch of boxes, and run them all at once to get some sort of load testing, right? Luckily for you, someone has already figured this out, although it is a paid service: Browser Mob. But it looks like you were willing to spend a little money to do this anyway, and it would probably net you better, more repeatable results.
We usually answer the question "can the web application do more than one thing at a time" by using JMeter to produce a simulated HTTP load on the web server.
I find that it helps to distinguish several different types of testing: concurrency (what happens when two events in the system collide), capacity (what happens when there are many overlapping requests), volume (what happens as data accumulates in the system), and so on.
A huge general slowdown, evidenced by response times that fall outside the SLA, is usually related to capacity problems (with contention as a common cause) or to volume (many users, much data, and the system gets slower over time). The former usually requires some sort of multi-threaded request stream; the latter you can usually manage by preloading the volume and then measuring the response times experienced by a single user.
I generally find that separating the load generator from the actual measurement/instrumentation is a good idea. That can be as simple as having a black box over there generating a typical load while you sit here with a stopwatch measuring the responsiveness of a typical use case.
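As a minimal sketch of that "black box load generator" idea (Python with the requests library; the URL and numbers are purely illustrative, and a dedicated tool like JMeter is usually the better choice), you can generate overlapping requests from worker threads and record the response times:

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://example.com/app"   # hypothetical application URL
CONCURRENT_USERS = 25             # illustrative numbers, not a recommendation
REQUESTS_PER_USER = 20

def one_user(_):
    """Simulate one user issuing a series of requests and time each response."""
    timings = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        requests.get(URL, timeout=30)
        timings.append(time.perf_counter() - start)
    return timings

with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
    all_timings = [t for user in pool.map(one_user, range(CONCURRENT_USERS)) for t in user]

print(f"median: {statistics.median(all_timings):.3f}s, "
      f"p95: {statistics.quantiles(all_timings, n=20)[18]:.3f}s")
```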
JMeter http://jmeter.apache.org/

Track Improvement Requests

We receive 5-10 improvement requests each day from our customers. Some of them are good, and some not so good. I can easily pick out the ones that I agree with, but I'd like a good way to organize the rest, so if we get a lot of similar requests we can prioritize those appropriately.
We have a large backlog of good ideas, so a request usually won't get added to the work queue unless we see strong demand from customers. This makes it impractical to track these requests with our current work item tracking system (TFS). The main goal of organizing them is to see where demand is strongest, so we can determine which features are most important to our users.
Any suggestions are welcome.
I've seen small applications that allow people to add requests and anyone can vote on them. This would show you what people using that particular system are most interested in - although depending on implementation it can be very susceptible to gaming. Look into something like UserVoice. They probably do most of the work for you.
Use a Stack Exchange-style site to let your users create and vote on their priorities. Another alternative would be something similar, like UserVoice, though here on SO we've found that the SO platform seems to work better.
You might also want to compile a set of potential feature requests and use something like SurveyMonkey, Vovici/WebSurveyor, or even Google Docs Forms to collect information from your users on which items they would like to see.