Can memcache.cas and memcache.add miss? - api

memcache.set can fail if the server is under too much load, and memcache.get can fail if the key is not present on the server. These are called misses. [Is this terminology confusing? In the get scenario nothing actually went wrong: the operation completed correctly, there was simply no data, so the term miss seems misleading.]
However, can operations like memcache.cas and memcache.add also miss, and if so, how is a miss defined for them? I believe they are fundamentally variants of a set operation, so I assume a miss could happen there as well.
I couldn't find API documentation that specifies what happens in each of these scenarios. Using memcache clients makes this even more difficult because each client has its own rules. For now I am trying to understand the behaviour at the server level.
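For what it's worth, the behaviour is easiest to see by talking to the server directly. Below is a minimal sketch (Python, assuming a local memcached listening on 127.0.0.1:11211) that speaks the ASCII protocol so the raw server-level replies are visible; the replies noted in the comments are the ones the text protocol defines.

# Minimal sketch: speak the memcached ASCII protocol directly so the
# server-level replies for add and cas are visible. Assumes a local
# memcached on 127.0.0.1:11211; illustration only, not production code.
import socket

def send(sock, data):
    sock.sendall(data)
    return sock.recv(4096).decode()

s = socket.create_connection(("127.0.0.1", 11211))

# add stores only if the key does NOT already exist:
#   first call  -> "STORED"
#   second call -> "NOT_STORED" (the add fails because the key exists)
print(send(s, b"add demo 0 0 5\r\nhello\r\n"))
print(send(s, b"add demo 0 0 5\r\nworld\r\n"))

# gets returns the value plus a CAS token, or just "END" on a miss
print(send(s, b"gets demo\r\n"))

# cas stores only if the supplied token still matches; the server replies
#   "STORED"    on success
#   "EXISTS"    if the item changed since the gets (token mismatch)
#   "NOT_FOUND" if the key no longer exists
# (the token 99999 below is almost certainly stale, so expect "EXISTS")
print(send(s, b"cas demo 0 0 5 99999\r\nagain\r\n"))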

Related

What is the standard acceptable request/response-timeout for API server (and Why)?

I'm working on developing both a web client and an API server. I've been doing some research regarding default timeouts; some are at 800 ms, others 1200 ms. However, I can't find the reasoning behind these seemingly arbitrary numbers. Can someone help me with this? An explanation of the reasoning behind the numbers would be a great help.
Thanks,
TLDR: Please see the paragraph starting with "The arbitrary number" below. The rest is just extra info on the topic.
Although you might know this or have already read this in your research, I can share the following ideas:
Typically the timeout is set depending on the expected complexity of a query, the amount of data to be processed, and the expected load of the system when the query occurs (or any other expected operation that may require attention in terms of modifying a timeout). Also, this can be based on something like the number of requests an API makes to other APIs to handle an incoming request(s) and what those expectations might be.
The arbitrary number (the "best guess" of whoever developed the software) typically comes from planning for a scenario of "most requests should complete in some fraction of this time if nothing is wrong" or "this isn't anything to worry about". Hence the default values for timeouts are largely based on the assumption that they cover the vast majority of acceptable, completed requests where no issue is present. The value is typically set somewhere between "this should be plenty of time" and "there is most likely something terribly wrong with this request, let's end it", and most successful requests pass this test by default.
In the case that you have operations that may take several minutes, and you expect that this can occur without an actual issue being present, you may want to set the timeout higher than the default so your requests don't time out when there is no actual problem (for example, most commercial APIs put constraints on the number of requests and the time in which they must complete, so that problematic requests don't clog up the system, among other reasons as seen by their developers).
Thus, there really isn't a great answer or standard to this aside from just taking a look at the amount of data/requests to be processed, planning for a reasonable ebb and flow of server load, level of optimization of your code compared to the expected load, and so on... It's almost like error-handling but for things that you don't know might happen yet (such as unexpected bugs) but based on things you already know about your system and its expected usage.
Generally, you won't have many scenarios where the timeout really matters all that much but you always want to have one (at least the default) to prepare for the unexpected.
I found the following article that talks about the topic and some of what I mentioned as well if you haven't seen it already:
https://medium.com/@masnun/always-use-a-timeout-for-http-requests-de4da538b9e3
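As a concrete illustration of the article's advice, here is a minimal sketch using Python's requests library; the URL and the 0.8 s / 1.2 s figures are placeholders, not recommendations.

# Always pass an explicit timeout instead of relying on a library default,
# and treat a timeout as an expected, handleable outcome.
import requests

try:
    # (connect timeout, read timeout) in seconds - placeholder values
    resp = requests.get("https://api.example.com/items", timeout=(0.8, 1.2))
    resp.raise_for_status()
    data = resp.json()
except requests.exceptions.Timeout:
    data = None   # over budget: retry, degrade gracefully, or report upstream
except requests.exceptions.RequestException:
    data = None   # any other transport or HTTP error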
tl;dr - Mostly according to the SLA (Service Level Agreement). If there isn't one, try to optimize the code as much as possible to bring the response time down into milliseconds.
I'll put the answer in layman's terms since it really depends on various factors.
Let's assume you have an API that performs some operation and gives the result back. If it doesn't perform any complex operations, it's quite simple and you'll get the response within a few milliseconds.
As we move into more and more complex systems where one API talks to another, the time adds up; in the worst case, the first API that started the request might only get the final response after 5 seconds, 30 seconds, or even 60 seconds, depending on the number of API calls and how well the system is designed.
And we are only considering the happy flow. What if something goes wrong in one of the APIs that gets called internally?
To avoid this bad experience, clients will put an SLA in place that requires the company/developers to design the code so that it gives the response within a certain acceptable range.
I came across a conversation on Google Groups once and it might provide some insight.
So, to answer the question about the acceptable range: if you don't have an SLA, try to optimize the code as much as possible to bring the response time down into milliseconds.
Generally 1 second is considered acceptable. The reason for this, and why the suggested numbers vary so much, is that most APIs have a lockout if you send requests too fast. However, some APIs will let you send requests faster. In my experience, all of the APIs I have seen request a 1 s (1000 ms) delay between requests to prevent overload/accidental DDoS, and have a timeout of 30-60 seconds.
Edit: It is also important not to answer another request from the same IP while the first one is still pending, as that would make a DDoS easy.
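To make that edit concrete, here is a rough sketch of rejecting a second request from an IP while its first one is still in flight (in-memory only; a real deployment behind a load balancer would need a shared store).

# Sketch: allow only one outstanding request per client IP. In-memory only,
# so it works for a single server; a server farm would need a shared store.
import threading

_in_flight = set()
_lock = threading.Lock()

def handle(client_ip, do_work):
    with _lock:
        if client_ip in _in_flight:
            return 429, "Too Many Requests: previous request still pending"
        _in_flight.add(client_ip)
    try:
        return 200, do_work()
    finally:
        with _lock:
            _in_flight.discard(client_ip)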

How to secure real-world load-balanced WCF service against replay attacks

Whilst creating a new WCF service endpoint to improve our web service security, I started looking at how to prevent replay attacks. At first glance this is easy: WCF has a "DetectReplays" flag that you turn on, and everything is sorted. However, even a brief understanding of the mechanism in use (in-memory nonce caching and duplicate rejection) shows that this is not a real-world implementation. Frankly it's baffling that they implement it at all. Anyone sufficiently bothered about security at this level is going to be running more than one server in their web farm, and consequently this mechanism will allow N attacks, where N is the number of servers you have. That nullifies any scaling you have in place to cope with surges in traffic, and may overwhelm the servers. Not to mention the chaos that duplicate create calls will cause.
We could turn on sticky sessions... but let's not do that, as that's a whole different set of problems.
Further investigation shows that Microsoft themselves acknowledge this problem:
https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/preventing-replay-attacks-when-a-wcf-service-is-hosted-in-a-web-farm
Even by Microsoft standards, that is terse, and fairly useless. They acknowledge the problem, indicate that a solution exists, then provide only the most basic hint as to how to implement it.
Googling reveals that no-one out there has written anything about how to use it. Hunting through their source code shows that they internally use this mechanism with an in-memory implementation to provide the default functionality: the SecurityProtocolFactory sets the NonceCache to the in-memory version if nothing has been supplied.
But how do you set up and use a SecurityProtocolFactory in WCF?
I know many will have the reaction that I shouldn't worry about replay attacks, as transport security will take care of this. However, this is no longer true. Amazingly, some optimisations in TLS 1.3 seem to have quietly removed this protection. See https://blog.cloudflare.com/introducing-0-rtt/
So the questions are:
Am I overthinking this; is it really a problem?
Has anyone actually got the Microsoft implementation to work? If so, how!?
What is everyone else doing? Is everyone just ignoring this problem, unaware of the TLS 1.3 issue?
I have tried setting the NonceCache property on the LocalClientSecuritySettings, but to no effect.
// bec is the binding's BindingElementCollection. Use 'as' so a binding that
// isn't symmetric yields null rather than an InvalidCastException.
var sbe = bec.Find<SecurityBindingElement>() as SymmetricSecurityBindingElement;
if (sbe != null)
{
    // Get the LocalClientSecuritySettings from the binding element.
    LocalClientSecuritySettings lc = sbe.LocalClientSettings;
    lc.DetectReplays = true;            // turn replay detection on
    lc.NonceCache = new MyNonceCache(); // custom nonce cache implementation
}
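To make concrete what a farm-safe replacement has to provide, whatever the WCF-specific wiring turns out to be: an atomic "remember this nonce unless it has already been seen, with an expiry" against a store shared by every server. Below is a concept-only sketch (Python with redis-py, a hypothetical shared host; this is not the WCF NonceCache API).

# Concept sketch only: farm-wide replay detection boils down to an atomic
# "store this nonce if unseen, with a TTL" against a cache shared by all
# servers. This is NOT the WCF NonceCache interface, just the idea a custom
# implementation would delegate to.
import redis

r = redis.Redis(host="shared-cache.internal", port=6379)  # hypothetical host

def try_add_nonce(nonce: bytes, ttl_seconds: int) -> bool:
    # SET ... NX EX is atomic: only the first server in the farm to see this
    # nonce gets True; every replay, on any server, gets False.
    return bool(r.set(b"nonce:" + nonce, b"1", nx=True, ex=ttl_seconds))

# A message whose nonce fails try_add_nonce(...) should be rejected as a replay.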

SSL equivalent of givedescriptor() and takedescriptor()

I am converting an old TCP-only server to use SSL (via IBM's GSKit), and one of the problems is getting the SSL handle into the spawned program. The original code passes the raw socket in via calls to givedescriptor() and then uses takedescriptor() to get and then use the passed-in socket.
Is there a GSKit/SSL equivalent of the givedescriptor()/takedescriptor() methods?
givedescriptor() API documentation
UPDATE:
The issue is that the socket and the SSLHandle are created in one process, which initializes the SSL environment, and then need to be passed on to another process entirely - hence the need for give/take descriptor, as the socket/SSLHandle need to be 'given' to the new process (it is actually an RPG program that is submitted and runs independently from the original program).
UPDATE 2:
Looks similar to this question, so I'll have a read of that as well.
From the other article (which doesn't have a code-based answer, but a written solution):
"It looks like the session handles are just pointers to some storage in heap. Due to the design of Single Level Store, you could copy them via shared memory (memmap, shmget/shmat, ...). You just have to ensure that the process that opened the GSK environment doesn't die or the activation group will get cleaned up and those pointers will become invalid. You also will probably need to put a mutex or some other locking primitive around them if you're going to have multiple threads accessing the shared data structure."
UPDATE 3:
This is the example I am using to share the memory between processes - Example: Using semaphore set and shared memory functions - though it still hasn't exactly solved the issue yet.
UPDATE 4:
I thought I'd add more details on why I need to ask the question. I am changing a non-blocking TCP server that is used as a connection point to an IBM i. It has the 'standard' mechanism for handling connections as they come in, creating threads and negotiating the connections in these threads. The threads then create an independent process (via sbmjob). In the vanilla TCP version we can then give the running job the handle of the socket via the give/takedescriptor functions, and it will merrily read from and write to the socket.
So I need an equivalent way of getting the independently running program to be able to write to SSL.
It maybe that this is not possible with the current mechanism.
There is no such thing as an 'SSL handle' known to the operating system and inheritable by child processes or transferable to other processes. The 'SSL handle' will inevitably be a pointer into some opaque data structure in the originating process, as SSL is an application layer protocol, and therefore implemented in the process, not in the kernel. So you can't 'give' an 'SSL handle' to another process and expect it to work.
EDIT
The answers here don't really answer the underlying question, which is how I should do this, so although the bounty has been awarded, I can't accept the only answer.
The answer is that you can't do it.
It maybe that this is not possible with the current mechanism.
Correct. As you've foreseen this possibility in your question, it is difficult to understand why you can't accept it in an answer.
In principle your idea is not impossible! If you believe it is possible, try to find the answer!
Even if every answerer on SO says it is impossible, that is not always the truth!
For example: 15 years ago I tried to find out how I could write a Java applet that could read and write images on a server. Everybody told me it was impossible, but I did not believe it. I tried to find my answer again and again, and I found it: I disassembled an online applet from one specialist, and in its source code I found my answer: using a PHP server we can do it. I asked the owner of this applet about the details of the communication between the Java applet and the PHP server, and he helped me.
You have to find your specialist. That is the first rule for finding the correct answer. Maybe you will find someone on the IBM forum.
The second rule is to read a lot of books by specialists about this. Not only one book; maybe three of them or more.
I would also recommend that you read How do I ask a good question?, because your question does not mention any programming language. And I think we have on SO some specialist who could give you the correct answer.
The first rule on SO for finding the correct specialist is to set the correct tags. Without correct tags only a few people see your question, and it is only a question of luck whether one of them is the correct specialist for you.
Be optimistic and try to believe in yourself! Good luck and success!

In the Diode library for scalajs, what is the distinction between an Action, AsyncAction, and PotAction, and which is appropriate for authentication?

In the Scala and Scala.js library Diode, I have used but not entirely understood the PotAction class, and only recently discovered the AsyncAction class, both of which seem to be favored in situations involving, well, asynchronous requests. While I understand that, I don't entirely understand the design decisions and the naming choices, which seem to suggest a narrower use case.
Specifically, both AsyncAction and PotAction require an initialModel and a next, as though both are modeling an asynchronous request for some kind of refreshable, updateable content rather than a command in the sense of CQRS. I have a somewhat-related question open regarding synchronous actions on form inputs by the way.
I have a few specific use cases in mind. I'd like to know a sketch (not asking for implementation, just the concept) of how you use something like PotAction in conjunction with any of:
Username/password authentication in a conventional flow
OpenAuth-style authentication with a third-party involved and a redirect
Token or cookie authentication behind the scenes
Server-side validation of form inputs
Submission of a command for a remote shell
All of these seem to be a bit different in nature to what I've seen using PotAction but I really want to use it because it has already been helpful when I am, say, rendering something based on the current state of the Pot.
Historically speaking, PotAction came first and then at a later time AsyncAction was generalized out of it (to support PotMap and PotVector), which may explain their relationship a bit. Both provide abstraction and state handling for processing async actions that retrieve remote data. So they were created for a very specific (and common) use case.
I wouldn't, however, use them for authentication as that is typically something you do even before your application is loaded, or any data requested from the server.
Form validation is usually a synchronous thing; you don't do it in the background while the user is doing something else, so again Async/PotAction is not a very good match, nor does it provide much added value.
Finally for the remote command use case PotAction might be a good fit, assuming you want to show the results of the command to the user when they are ready. Perhaps PotStream would be even better, depending on whether the command is producing a steady stream of data or just a single message.
In most cases you should use the various Pot structures for what they were meant for, that is, fetching and updating remote data, and maybe apply some of the ideas or internal models (such as the retry mechanism) to other request types.
All the Pot stuff was separated from Diode core into its own module to emphasize that they are just convenient helpers for working with Diode. Developers should feel free to create their own helpers (and contribute back to Diode!) for new use cases.

Testing fault tolerant code

I'm currently working on a server application where we have agreed to try to maintain a certain level of service. The level of service we want to guarantee is: if a request is accepted by the server and the server sends an acknowledgement to the client, we want to guarantee that the request will happen, even if the server crashes. As requests can be long running and the acknowledgement time needs to be short, we implement this by persisting the request, then sending an acknowledgement to the client, then carrying out the various actions to fulfill the request. As actions are carried out they too are persisted, so the server knows the state of a request on startup, and there are also various reconciliation mechanisms with external systems to check the accuracy of our logs.
This all seems to work fairly well, but we have difficulty saying so with any conviction, as we find it very difficult to test our fault-tolerant code. So far we've come up with two strategies, but neither is entirely satisfactory:
Have an external process watch the server code and then try and kill it off at what the external process thinks is an appropriate point in the test
Add code to the application that will cause it to crash at certain known critical points
My problem with the first strategy is that the external process cannot know the exact state of the application, so we cannot be sure we're hitting the most problematic points in the code. My problem with the second strategy, although it gives more control over where the fault takes place, is that I do not like having code to inject faults inside my application, even with conditional compilation etc. I fear it would be too easy to overlook a fault injection point and have it slip into a production environment.
I think there are three ways to deal with this. First, if available, I would suggest a comprehensive set of integration tests for these various pieces of code, using dependency injection or factory objects to produce broken actions during these integration tests.
Secondly, running the application with random kill -9's, and disabling of network interfaces may be a good way to test these things.
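A rough sketch of that kill-based approach follows (the server command line is a placeholder; the assertions about acknowledged requests would live in the surrounding test harness).

# Rough sketch: start the server, SIGKILL it at a random moment, restart,
# repeat. After each kill, the harness should verify that every request
# acknowledged beforehand is eventually fulfilled.
import random
import signal
import subprocess
import time

SERVER_CMD = ["./server", "--config", "test.conf"]   # placeholder command

for _ in range(50):
    proc = subprocess.Popen(SERVER_CMD)
    time.sleep(random.uniform(0.1, 5.0))   # let it get partway through work
    proc.send_signal(signal.SIGKILL)       # abrupt crash, no cleanup
    proc.wait()
    # ...run the reconciliation/consistency checks here before the next round.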
I would also suggest testing file system failure. How you would do that depends on your OS; on Solaris or FreeBSD I would create a ZFS file system in a file, and then rm the file while the application is running.
If you are using database code, then I would suggest testing failure of the database as well.
Another alternative to dependency injection, and probably the solution I would use, is interceptors. You can enable crash-test interceptors in your code; these would know the state of the application and introduce the failures listed above (or any others you may want to create) at the correct time. It would not require changes to your existing code, just some additional code to wrap it.
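A minimal sketch of such a wrapper in Python follows; the FAULT_POINT environment variable and the persist_request name are just illustrative assumptions.

# Minimal crash-test interceptor: wrap an existing action so that a test can
# make it fail at a named injection point. Nothing fires unless FAULT_POINT
# is explicitly set, which production never does.
import functools
import os

def with_crash_point(name, func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if os.environ.get("FAULT_POINT") == name:
            os._exit(1)   # simulate an abrupt crash just before this action
        return func(*args, **kwargs)
    return wrapper

# Applied where the actions are wired together, not inside the actions:
# persist_request = with_crash_point("persist_request", persist_request)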
A possible answer to the first point is to multiply the experiments with your external process so that the probability of hitting problematic parts of the code is increased. Then you can analyze the core dump file to determine where the code actually crashed.
Another way is to increase observability and/or commandability by stubbing library or kernel calls, i.e., without modifying your application code.
You can find some resources on Fault Injection page of Wikipedia, in particular in Software Implemented Fault Injection section.
Your concern about fault injection is not a fundamental concern. You merely need a foolproof way to prevent such code ending up in deployment. One way to do so is by designing your fault injector as a debugger, i.e. the faults are injected by a process external to your process. This already provides a level of isolation. Furthermore, most OSes provide some kind of access control which prevents debugging unless specifically enabled. In its most primitive form this means limiting it to root; other operating systems require a specific "debug privilege". Naturally, on production nobody will have that, and thus your fault injector cannot even run on production.
Practically, the fault injector can set breakpoints at specific addresses, i.e. a function or even a line of code. You can then react to that, e.g. by terminating the process after a certain breakpoint has been hit three times.
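A rough sketch of driving this from the outside with gdb follows; the function name write_journal_entry is hypothetical, and this assumes debug symbols plus permission to attach.

# Rough sketch: attach a debugger to the running server from a separate
# process, break on a (hypothetical) function, and kill the server the
# moment execution reaches it. Needs gdb, symbols, and attach privileges,
# which is exactly why it cannot run on a locked-down production host.
import subprocess
import sys

pid = sys.argv[1]   # pid of the server under test
subprocess.run([
    "gdb", "--batch",
    "-p", pid,
    "-ex", "break write_journal_entry",   # hypothetical injection point
    "-ex", "continue",                    # run until the breakpoint is hit
    "-ex", "kill",                        # then terminate the inferior
])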
I was just about to write the same as Justin :)
The component I would suggest replacing during testing is the logging component (if you have one; if not, I'd strongly suggest implementing one...). It's relatively easy to replace it with code that generates errors, and the logger usually gets enough information to know the current application state.
It also seems feasible to make sure that the testing code doesn't go into production. I would discourage conditional compilation, though, and rather go with a configuration file to select the logging component.
Using "random" kills might help to detect errors but is not well suited for systematic testing because of its non-determinism. Therefore I wouldn't use it for automatic tests.