400 Bad Request on AWS Dynamo DB as Session Provider for ASP.NET - asp.net-mvc-4

We use AWS's DynamoDB Session Provider in our app to store session data.
I recently moved to an environment where I can have NewRelic monitoring my app, and it started throwing alerts about DynamoDB access. However, NewRelic is the only monitoring tool that picks it up. I cannot see anything related to this problem in my application logging (log4net) or the Windows event viewer.
I searched a lot and even went through the source code of the provider, but came up empty.
I'm getting (400) Bad Request from what seems to be all the calls made during a period of 1 or 2 minutes at a time, happening 3 or 4 times per hour.
The stacktrace I could get is not promising:
at System.Net.HttpWebRequest.GetResponse()
at System.Net.HttpWebRequest.GetResponse()
at Amazon.Runtime.AmazonWebServiceClient.getResponseCallback(IAsyncResult result)
And the offending URL is:
dynamodb.us-east-1.amazonaws.com/Stream/GetResponse
From the time graphs below we can see that all requests are fine most of the time (graph 1), but when the problem occurs the number of successful requests made to DynamoDB drops to 0 (graph 1). At the same time, there is a spike in the number of errors thrown (graph 2).
UPDATE: During a low-usage period on the weekend I ran Fiddler on the production server to see what the error from AWS looks like. I'm getting "The conditional request failed", which seems to happen because the value was updated by another request while this one was still expecting the old value, so the condition no longer matches. Below is a full request/response as a sample.
Request:
POST https://dynamodb.us-east-1.amazonaws.com/ HTTP/1.1
X-Amz-Target: DynamoDB_20120810.UpdateItem
Content-Type: application/x-amz-json-1.0
User-Agent: aws-sdk-dotnet-35/2.0.15.0 .NET Runtime/4.0 .NET Framework/4.0 OS/6.2.9200.0 SessionStateProvider TableSync
Host: dynamodb.us-east-1.amazonaws.com
X-Amz-Date: 20140510T153947Z
X-Amz-Content-SHA256: e7a4886acac6ccf16f0da9be962d3a68bd50e381c202277033d0d2bb3208aa8a
Authorization: AWS4-HMAC-SHA256 Credential=redacted/20140510/us-east-1/dynamodb/aws4_request, SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-target, Signature=redacted
Accept: application/json
X-NewRelic-ID: redacted
X-NewRelic-Transaction: redacted
Content-Length: 399
{
  "TableName": "ASP.NET_SessionState",
  "Key": {
    "SessionId": {
      "S": "redacted"
    }
  },
  "AttributeUpdates": {
    "LockId": {
      "Value": {
        "S": "42a9ed29-7a92-4455-8733-2f56c7d974b3"
      },
      "Action": "PUT"
    },
    "Locked": {
      "Value": {
        "N": "1"
      },
      "Action": "PUT"
    },
    "LockDate": {
      "Value": {
        "S": "2014-05-10T15:39:47.324Z"
      },
      "Action": "PUT"
    }
  },
  "Expected": {
    "Locked": {
      "Value": {
        "N": "0"
      },
      "Exists": true
    }
  },
  "ReturnValues": "ALL_NEW"
}
Response:
HTTP/1.1 400 Bad Request
x-amzn-RequestId: redacted
x-amz-crc32: redacted
Content-Type: application/x-amz-json-1.0
Content-Length: 120
Date: Sat, 10 May 2014 15:33:17 GMT
{
"__type": "com.amazonaws.dynamodb.v20120810#ConditionalCheckFailedException",
"message": "The conditional request failed"
}
Graph 1
Graph 2
Any help is appreciated. Thanks!

The conditional lock failure can occur if your application makes multiple concurrent requests that access the session state. This is common with Ajax calls. The article The Downsides of ASP.NET Session State gives a good explanation of how ASP.NET serializes access to a particular session state, along with some workarounds:
The first issue we'll look at is one that a lot of developers don't know about; by default the ASP.NET pipeline will not process requests belonging to the same session concurrently. It serialises them, i.e. it queues them in the order that they were received so that they are processed serially rather than in parallel. [...]
These errors should not be bubbling up to the application level. The AWS SDK for .NET throws exceptions for conditional update failures, which the session provider interprets as a failure to get the lock. That is passed back to the ASP.NET framework, which queues the request until it can get the lock:
[...] This means that if a request is in progress and another request from the same session arrives, it will be queued to only begin executing when the first request has finished. Why does ASP.NET do this? For concurrency control, so that multiple requests (i.e. multiple threads) do not read and write to session state in an inconsistent way.
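One of those workarounds, if some of your controllers only ever read session data, is to opt them out of the exclusive session lock. This is a rough sketch (not from the original answer; the controller name is made up) using the SessionState attribute from ASP.NET MVC; with read-only session state the provider should not need to issue the conditional "Locked" update for those requests:
using System.Web.Mvc;
using System.Web.SessionState;

// Controllers that only read session data can opt out of the exclusive
// session lock; ASP.NET then requests the session non-exclusively, so the
// DynamoDB provider has no conditional Locked = 0 -> 1 update to lose.
[SessionState(SessionStateBehavior.ReadOnly)]
public class DashboardController : Controller
{
    public ActionResult Index()
    {
        var userName = Session["UserName"] as string; // read-only access
        return View((object)userName);
    }
}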

Update
Norm Johanson's answer surfaces the root cause of the issue at hand. I'm keeping my answer, adjusted accordingly, for the parts that still apply and for the pointers to related issues.
Initial Answer
I haven't faced the exact issue you describe, but it rings a bell regarding similar patterns encountered in the context of investigating the AWS API's Eventual Consistency, see e.g. my answer to Deterministically creating and tagging EC2 instances for more on this. Things have considerably improved since then:
I've just updated that answer to point to the extended documentation on Troubleshooting API Request Errors that has become available in the meantime.
E.g. the AWS SDK for Java added more elaborate exponential backoff handling along the lines of what's proposed in Error Retries and Exponential Backoff in AWS, which also seems to be available in the AWS SDK for .NET to some extent.
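If tuning that retry/backoff behavior is useful, the .NET SDK exposes it on the client configuration. A minimal sketch, assuming the 2.x SDK seen in the User-Agent above; the retry count chosen here is purely illustrative:
using Amazon;
using Amazon.DynamoDBv2;

// Allow more exponential-backoff retry attempts for retryable errors
// (e.g. throttling) before an exception surfaces to the caller.
var config = new AmazonDynamoDBConfig
{
    RegionEndpoint = RegionEndpoint.USEast1,
    MaxErrorRetry = 10   // illustrative value
};
var client = new AmazonDynamoDBClient(config);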
Now, what I suspect is something like this:
New Relic is instrumenting the .NET byte code, which allows them to e.g. log all exceptions, regardless of whether they are handled or not.
Your client is e.g. getting throttled for request limit violations, which causes a retryable 400 - ThrottlingException as per the API Error Codes; i.e. it triggers an exception that is handled and kicks off the exponential retry in turn, so the request eventually succeeds and leaves no trace for other tools.
Update: the exceptions at hand turn out to be the non retryable 400 - ConditionalCheckFailedException, thus this suspicion doesn't apply here.
In that case, the question obviously is what might be causing this - even though the issue description doesn't match yours, the discussion in Performance issue in 2.0.12.0 hints at an ongoing threading issue in the 2.0.x releases of the .NET SDK, which might surface differently depending on the usage pattern at hand?
Update: Norm Johanson's answer surfaces the root cause of the issue at hand.

Related

What HTTP method type should be used for recalculating the related resources

In my application I maintain a couple of Shapes. These shapes are positioned with respect to a reference point, which is considered to be (0, 0). If someone moves this reference point by (x, y), then the distances of all existing Shapes need to be recalculated against the new reference point.
I have a REST API to change this reference point, so internally all the Shapes' distances are recalculated with that (x, y) movement.
What HTTP method should be used for such an operation: POST, PUT or PATCH? I don't find any one of these correct in this context if I go by the definitions of these HTTP methods.
It depends.
We use PUT/PATCH when the client edits a local copy of a resource, and sends the edited version of the document back to the server (aka "remote authoring"). Think "Save File".
We use POST when... well, pretty much for everything else.
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding, 2009
Think HTML forms here - information is dispatched to the origin server in a representation that is completely unrelated to the representation of the resource itself.
For example, in a resource model where we have a resource for the reference point itself:
GET /example
200 OK
Content-Type: application/json
{ "referencePoint": { "x": 0, "y": 0 } }
Then using remote authoring semantics is pretty reasonable:
PUT /example
Content-Type: application/json
{ "referencePoint": { "x": 0, "y": 999 } }
If the representation of the resource were very big (much bigger than the HTTP headers), and the changes were small, then we might use a patch document to communicate the information, rather than a complete document.
PATCH /example
Content-Type: application/json-patch+json
[ { "op": "replace", "path": "/referencePoint/y", "value": 999 } ]
In both cases, the basic idea is the same - we are requesting that the origin server change its copy of the resource to match the edited version on the client.
It is okay to use POST for everything else.
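For completeness, here is a hedged C# sketch of what sending that patch document could look like from a .NET client. The /example URL and the patch body come from the answer above; the class name, method name, and baseUrl parameter are illustrative only:
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class ReferencePointClient
{
    static readonly HttpClient http = new HttpClient();

    // Sends the JSON Patch document shown above to /example.
    public static async Task MoveReferencePointAsync(string baseUrl, int newY)
    {
        var patch = "[ { \"op\": \"replace\", \"path\": \"/referencePoint/y\", \"value\": " + newY + " } ]";
        var request = new HttpRequestMessage(new HttpMethod("PATCH"), baseUrl + "/example")
        {
            Content = new StringContent(patch, Encoding.UTF8, "application/json-patch+json")
        };
        var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
    }
}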

RESTful API best practices on endpoint design

Let's say I'd like to implement two functions:
Register for a course
Pay the course fee
I understood that I might have two RESTful API endpoints like this
register the course for the student:
send POST request to /myapp/api/students/{id}/courses
with request body like
{
"course_id": 26,
"is_discount": true,
"reg_date": "2020-04-23T18:25:43.511Z"
}
create payment record of the student for the course:
send POST request to /myapp/api/payment-records
with request body like
{
"student_id": 204,
"course_id": 26,
"amount": 500
}
My question is, how can this be done in one action (or within one transaction) from the client side, by calling just one RESTful endpoint instead of separating it into two like the above? Because if the payment fails, due to a network failure of the card system for example, then the course registration made by the student should be rolled back accordingly.
Or, should I do it like:
send POST request to /myapp/api/course-registration
with request body like this?
{
  "course": {
    "course_id": 26,
    "is_discount": true,
    "reg_date": "2020-04-23T18:25:43.511Z"
  },
  "payment": {
    "record_id": 1,
    "student_id": 204,
    "course_id": 26,
    "amount": 500
  }
}
My question is, how can this be done in one action (or within one transaction) from the client side, by calling just one RESTful endpoint instead of separating it into two like the above?
How would you do it on the web?
You'd have a web page with a form to collect all of the information you need from the student -- some fields might be pre-filled in, others might be "hidden" so that they aren't part of the presentation. When the form is submitted, the browser would collect all of that information into a single application/x-www-form-urlencoded document, and would include that document in a POST request that targets whatever URI was specified by the form.
Note that you might also have two smaller forms - perhaps part of the same web page, perhaps somewhere else, to do the registration and payment separately.
Two things to note about the target URI. The browser does not care what the spelling of the target-uri is; it's just going to copy that information into the HTTP request. But the browser does care if the target URI is the same as something that is available in its local cache; see the specification of invalidation in RFC 7234.
So a heuristic that you might use for choosing the target is to think about which cached document must be refreshed if the POST is successful, and use the identifier for that document as the target-uri.
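As a rough server-side sketch of the single-endpoint option (one POST to /myapp/api/course-registration), assuming a C# backend: the repository interfaces and all names here are hypothetical, and the only point being illustrated is that both writes live in one transaction, so a payment failure rolls the registration back:
using System.Transactions;

// Assumed repository abstractions; names are illustrative only.
public interface ICourseRepository { void AddRegistration(int studentId, int courseId, bool isDiscount); }
public interface IPaymentRepository { void AddPaymentRecord(int studentId, int courseId, decimal amount); }

public class CourseRegistrationHandler
{
    private readonly ICourseRepository _courses;
    private readonly IPaymentRepository _payments;

    public CourseRegistrationHandler(ICourseRepository courses, IPaymentRepository payments)
    {
        _courses = courses;
        _payments = payments;
    }

    // Handles POST /myapp/api/course-registration: both writes commit together or not at all.
    public void RegisterAndPay(int studentId, int courseId, bool isDiscount, decimal amount)
    {
        using (var tx = new TransactionScope())
        {
            _courses.AddRegistration(studentId, courseId, isDiscount);
            _payments.AddPaymentRecord(studentId, courseId, amount); // assumed to throw on card/network failure
            tx.Complete(); // skipped if the payment call throws, so the registration rolls back
        }
    }
}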

.NET Core custom ILogger middleware to catch long running requests

I am slowly learning .NET Core and am struggling to get my head around the logging middleware. I have written a custom ILogger which logs out to a text file. By default, .NET core will log Information about the length of time each request takes, for example:
Information: Request finished in 154.0077ms 200 application/json
I have a requirement to log any requests that take longer than n ms as a Warning rather than Information, so that in production the default log level can be set to Warning and these slow requests are still captured.
At the moment I'm doing the following, which works, but only if the default log level is set to Information (which I don't want in production). Obviously, changing the default to Warning means that the "request finished" message doesn't even get passed to the logger.
Regex requestFinished = new Regex(@"(.*\s)([0-9]{1,}[.]?[0-9]{1,})([m]?s)");
Match match = requestFinished.Match(message);
if (match.Success)
{
double timeTaken = 0;
Double.TryParse(match.Groups[2].ToString(), out timeTaken);
if (timeTaken > MyService.ExecutionThreshold)
{
logLevel = LogLevel.Warning;
}
}
I suspect there's a much easier way of doing this than the above but I cannot figure it out and can't seem to find any suggestions online. Any help gratefully received.
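One commonly suggested alternative (not from the original thread) is to stop parsing the framework's "Request finished" message and instead time each request yourself in a small piece of middleware, logging at Warning when a threshold is exceeded. A hedged sketch; the class name and threshold are illustrative:
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

// Illustrative middleware: times each request itself and logs slow ones
// at Warning, so the default minimum level can stay at Warning in production.
public class SlowRequestLoggingMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<SlowRequestLoggingMiddleware> _logger;
    private const double ThresholdMs = 500; // illustrative threshold

    public SlowRequestLoggingMiddleware(RequestDelegate next, ILogger<SlowRequestLoggingMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var stopwatch = Stopwatch.StartNew();
        await _next(context);
        stopwatch.Stop();

        if (stopwatch.Elapsed.TotalMilliseconds > ThresholdMs)
        {
            _logger.LogWarning("Slow request {Method} {Path} took {Elapsed} ms",
                context.Request.Method, context.Request.Path, stopwatch.Elapsed.TotalMilliseconds);
        }
    }
}

// Registered in Startup.Configure with: app.UseMiddleware<SlowRequestLoggingMiddleware>();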

REST API design of a resource whose properties are not editable by the client

What's the best way to handle resource properties which must be modified/updated through another method that is not exposed to the API consumer?
Examples:
Requesting a new token to be used for X. The token must be generated following a specific set of business rules/logic.
Requesting/refreshing the exchange rate of a currency after the old rate expires. The rate is for informational purposes and will be used in subsequent transactions.
Note that in the above two examples, the values are properties of a resource and not separate resources in their own right.
What's the best way to handle these scenarios, and others where the API consumer doesn't control the value of the property but needs to request a new one? One option would be to allow a PATCH with that specific property in the request body but not actually update the property to the value specified; instead, run the necessary logic to update the property and return the updated resource.
Let's look at #1 in more detail:
Request:
GET /User/1
Response:
{
"Id": 1,
"Email": "myemail#gmail.com",
"SpecialToken": "12345689"
}
As the consumer of the API, I want to be able to request a new SpecialToken, but the business rules to generate the token are not visible to me.
How do I tell the API that I need a new/refreshed SpecialToken within the REST paradigm?
One thought would be to do:
Request:
PATCH /User/1
{
"SpecialToken": null
}
The server would see this request and know that it needs to refresh the token. The backend will update the SpecialToken with a specific algorithm and return the updated resource:
Response:
{
"Id": 1,
"Email": "myemail#gmail.com",
"SpecialToken": "99999999"
}
This example can be extended to example #2, where SpecialToken is an exchange rate on the resource CurrencyTrade. ExchangeRate is a read-only value that the consumer of the API can't change directly, but can request to be changed/refreshed:
Request:
GET /CurrencyTrade/1
Response:
{
"Id": 1,
"PropertyOne": "Value1",
"PropertyTwo": "Value2",
"ExchangeRate": 1.2
}
Someone consuming the API would need a way to request a new ExchangeRate, but they don't have control over what the value will be; it's strictly a read-only property.
You're really dealing with two different representations of the resource: one for what the client can send via POST / PUT, and one for what the server can return. You are not dealing with the resource itself.
What are the requirements for being able to update a token? What is the token for? Can a token be calculated from the other values in User? This may just be an example, but context will drive how you end up building the system.
Unless there were a requirement which prohibited it, I would probably implement the token generation scenario by "touching" the resource representation using a PUT. Presumably the client can't update the Id field, so it would not be defined in the client's representation.
Request
PUT /User/1 HTTP/1.1
Content-Type: application/vnd.example.api.client+json
{
"Email": "myemail#gmail.com"
}
Response
200 OK
Content-Type: application/vnd.example.api.server+json
{
"Id": 1,
"Email": "myemail#gmail.com",
"SpecialToken": "99999999"
}
From the client's perspective, Email is the only field which is mutable, so this represents the complete representation of the resource when the client sends a message to the server. Since the server's response contains additional, immutable information, it's really sending a different representation of the same resource. (What's confusing is that, in the real world, you don't usually see the media type spelled out so clearly... it's often wrapped in something vague like application/json).
For your exchange rate example, I don't understand why the client would have to tell the server that the exchange rate was stale. If the client knew more about the freshness of the exchange rate than the server did, and the server is serving up the value, it's not a very good service. :) But again, in a scenario like this, I'd "touch" the resource like I did with the User scenario.
There are many approaches to that. I'd say the best one is probably to have a /User/1/SpecialToken resource that gives a 202 Accepted with a message explaining that the resource can't be deleted completely and will be refreshed whenever someone tries to. Then you can do that with a DELETE, with a PUT that replaces it with a null value, or even with a PATCH directly to SpecialToken or to the attribute of User. Despite what someone else mentioned, there's nothing wrong with keeping the SpecialToken value in the User resource. The client won't have to do two requests.
The approach suggested by @AndyDennie, a POST to a TokenRefresher resource, is also fine, but I'd prefer the other approach because it feels less like customized behavior. Once it's clear in your documentation that this resource can't be deleted and that the server simply refreshes it, the client knows it can delete it or set it to null with any standardized action in order to refresh it.
Keep in mind that in a real RESTful API, the hypermedia representation of the user would just have a link labeled "refresh token" for whatever operation is performed, and the semantics of the URI wouldn't matter much.
I reckon you should consider making SpecialToken a resource, and allow consumers of the API to POST to it to retrieve a new instance. Somehow, you'll want to link the User resource to a SpecialToken resource. Remember, one of the central tenets of REST is that you should not depend on out-of-band information, so if you want to stay true to that you'll want to investigate the possibility of using links.
First, let's look at what you've got:
Request:
GET /User/1
Accept: application/json
Response:
200 OK
Content-Type: application/json
{
"Id": 1,
"Email": "myemail#gmail.com",
"SpecialToken": "12345689"
}
While this response does include the SpecialToken property in the object, because the Content-Type is application/json it will not actually mean anything to clients that aren't programmed to understand this particular object structure. A client that just understands JSON will treat this as an object like any other. Let's ignore that for now. Let's just say we go with the idea of using a different resource for the SpecialToken field; it might look something like this:
Request:
GET /User/1/SpecialToken
Accept: application/json
Response:
200 OK
Content-Type: application/json
{
"SpecialToken": "12345689"
}
Because we did a GET, making this call ideally shouldn't modify the resource. The POST method however doesn't follow those same semantics. In fact, it may well be that issuing a POST message to this resource could return a different body. So let's consider the following:
Request:
POST /User/1/SpecialToken
Accept: application/json
Response:
200 OK
Content-Type: application/json
{
"SpecialToken": "98654321"
}
Note how the POST message doesn't include a body. This may seem unconventional, but the HTTP spec doesn't prohibit this and in fact the W3C TAG says it's all right:
Note that it is possible to use POST even without supplying data in an HTTP message body. In this case, the resource is URI addressable, but the POST method indicates to clients that the interaction is unsafe or may have side-effects.
Sounds about right to me. Back in the day, I heard that some servers had problems with POST messages without a body, but I personally have not had a problem with this. Just make sure the Content-Length header is set appropriately and you should be golden.
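As an illustration, a body-less POST from a .NET client could look like the following hedged sketch; an empty StringContent produces an explicit Content-Length of 0, the /User/1/SpecialToken URL is the one from this answer, and the class, method, and baseUrl names are made up:
using System.Net.Http;
using System.Threading.Tasks;

class TokenClient
{
    static readonly HttpClient http = new HttpClient();

    // POST with an empty body; StringContent("") sets Content-Length: 0 explicitly.
    public static async Task<string> RefreshSpecialTokenAsync(string baseUrl)
    {
        var response = await http.PostAsync(baseUrl + "/User/1/SpecialToken", new StringContent(""));
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync(); // e.g. { "SpecialToken": "98654321" }
    }
}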
So with that in mind, this seems like a perfectly valid way (according to REST) to do what you're suggesting. However, remember before when I mentioned the bits about JSON not actually having any application level semantics? Well, this means that in order for your client to actually send a POST to get a new SpecialToken in the first place, it needs to know the URL for that resource, or at least how to craft such a URL. This is considered a bad practice, because it ties the client to the server. Let's illustrate.
Given the following request:
POST /User/1/SpecialToken
Accept: application/json
If the server no longer recognizes the URL /User/1/SpecialToken, it might return a 404 or other appropriate error message and your client is now broken. To fix it, you'll need to change the code responsible. This means your client and server can't evolve independently from each other and you've introduced coupling. Fixing this however, can be relatively easy, provided your client HTTP routines allow you to inspect headers. In that case, you can introduce links to your messages. Let's go back to our first resource:
Request:
GET /User/1
Accept: application/json
Response:
200 OK
Content-Type: application/json
Link: </User/1/SpecialToken>; rel=token
{
"Id": 1,
"Email": "myemail#gmail.com",
"SpecialToken": "12345689"
}
Now, in the response, there's a link specified in the headers. This little addition means your client no longer has to know how to get to the SpecialToken resource; it can just follow the link. While this doesn't take care of all coupling issues (for instance, token is not a registered link relation), it does go a long way. Your server can now change the SpecialToken URL at will, and your client will keep working without having to change.
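For illustration only, a client that discovers the token URL from that Link header rather than hard-coding it might look like this hedged C# sketch; the parsing is deliberately naive, and all names are made up:
using System;
using System.Linq;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

class LinkFollowingClient
{
    static readonly HttpClient http = new HttpClient();

    // Fetches /User/1, pulls the rel=token URL out of the Link header,
    // and POSTs to it -- no hard-coded /SpecialToken path on the client.
    public static async Task RefreshTokenViaLinkAsync(string baseUrl)
    {
        var userResponse = await http.GetAsync(baseUrl + "/User/1");
        userResponse.EnsureSuccessStatusCode();

        var linkHeader = userResponse.Headers.TryGetValues("Link", out var values)
            ? values.FirstOrDefault()
            : null;
        if (linkHeader == null) throw new InvalidOperationException("No Link header found.");

        // Naive parse of: </User/1/SpecialToken>; rel=token
        var match = Regex.Match(linkHeader, @"<([^>]+)>;\s*rel=""?token""?");
        if (!match.Success) throw new InvalidOperationException("No rel=token link found.");

        var tokenResponse = await http.PostAsync(baseUrl + match.Groups[1].Value, new StringContent(""));
        tokenResponse.EnsureSuccessStatusCode();
    }
}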
This is a small example of HATEOAS, short for Hypermedia As The Engine Of Application State, which essentially means that your application discovers how to do things rather than knowing them up front. Someone in the acronym department did get fired for this. To whet your appetite on this topic, there's a really cool talk by Jon Moore that shows an API that makes extensive use of hypermedia. Another nice intro to hypermedia is the writings of Steve Klabnik. This should get you started.
Hope this helps!
Another thought just occurred to me. Rather than model a RefreshToken resource, you could simply POST the existing special token to a RevokedTokens collection that's associated with this User (assuming that only one special token is allowed per user at a given time).
Request:
GET /User/1
Accept: application/hal+json
Response:
200 OK
Content-Type: application/hal+json
{
  "_links": {
    "self": { "href": "/User/1" },
    "token-revocation": { "href": "/User/1/RevokedTokens" }
  },
  "Id": 1,
  "Email": "myemail@gmail.com",
  "SpecialToken": "12345689"
}
Following the token-revocation relation and POSTing the existing special token would then look like this:
Request:
POST /User/1/RevokedTokens
Content-Type: text/plain
123456789
Response:
202 Accepted (or 204 No Content)
A subsequent GET for the user would then have the new special token assigned to it:
Request:
GET /User/1
Accept: application/hal+json
Response:
200 OK
Content-Type: application/hal+json
{
  "_links": {
    "self": { "href": "/User/1" },
    "token-revocation": { "href": "/User/1/RevokedTokens" }
  },
  "Id": 1,
  "Email": "myemail@gmail.com",
  "SpecialToken": "99999999"
}
This has the advantage of modeling an actual resource (a token revocation list) which can affect other resources, rather than modeling a service as a resource (i.e., a token refresher resource).
How about a separate resource which is responsible for refreshing the token within the User resource?
POST /UserTokenRefresher
{
"User":"/User/1"
}
This could return the refreshed User representation (with the new token) in the response.

Choose appropriate HTTP status codes in controversial situations or introduce subcodes?

I am developing an iOS application running against a remote server, with another developer behind it. The project and the API draft we are writing are in their initial phase.
The problem we are faced with is that we are not satisfied with the existing set of conventional status codes described by the HTTP/REST specifications: there are cases where we are uncertain about which code to pick.
Examples to provide minimal context:
Server-side validation errors. E.g. client-side validations pass, but the server API has recently been changed slightly, so the server should return something indicating that this is specifically a validation problem.
An attempt to register a user that already exists. SO topics do not provide any precise guidance on that.
A user is registered and tries to log in without having completed the password confirmation procedure.
Two obvious approaches we see here:
Use e.g. a 400 error for the cases where an appropriate conventional status code cannot be found. This will lead us to parsing error text messages from JSON responses. Obviously, this approach will introduce superfluous complication in the client-side code.
Create our own sub-code system and rely on it in our code. This one involves too many artificial conventions, which will lead us towards becoming too opinionated and arbitrary.
Feeling that the number of such cases is going to grow, we are thinking about introducing custom sub-codes in the JSON responses our server gives (i.e. choosing the second approach).
What I'm asking here:
What are the recommended approaches, strategies, or even glue and hacks for these kinds of situations?
What are the pros and cons of moving away from strictly following the REST/HTTP conventions for status codes?
Thanks!
For validation problems, I use 422 Unprocessable Entity (WebDAV; RFC 4918)
The request was well-formed but was unable to be followed due to semantic errors. This fits because the request did not fail due to malformed syntax, but due to semantics.
Then, in order to communicate, you just need to decide on your error format; so for situation 1, if a required field is missing, you might return a 422 with the following:
{
"field": ["required"]
}
I would treat number two as a validation problem too, since it is really a validation problem on the username, so return a 422 with the following:
{
"username": ["conflict"]
}
Number three I would treat as a 403 Forbidden, because passing an authorization header will not help, and the request will remain forbidden until they do something other than pass credentials.
You could do something like OAuth2 does and return a human-readable description, a constant that people can code against which further clarifies the error, and a URI for more information.
{
"error": "unfinished_registration",
"error_description": "Must finish the user registration process",
"error_uri": "http://yourdocumentation.com"
}
Note: you will find that people disagree on which HTTP codes map to which situation, and on whether 422 should be used since it is part of the WebDAV extensions, and that is fine. The most important thing you can do is document what the codes mean and be consistent, rather than being perfect with what they mean.
There's no such thing as "sub-codes" in HTTP (Microsoft IIS is clearly violating the spec, and should be flogged).
If there's an appropriate status code, use it; don't say "this status code means that in my application" because that's losing the value of generic status codes; you might as well design your own protocol.
After that, if you need to refine the semantics of the status code, use headers and/or the body.
For the use cases you have described, you could use these error codes:
1) 400 Bad Request
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
2) 409 Conflict
The request could not be completed due to a conflict with the current state of the resource. This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request. The response body SHOULD include enough
information for the user to recognize the source of the conflict. Ideally, the response entity would include enough information for the user or user agent to fix the problem; however, that might not be possible and is not required.
Conflicts are most likely to occur in response to a PUT request. For example, if versioning were being used and the entity being PUT included changes to a resource which conflict with those made by an earlier (third-party) request, the server might use the 409 response to indicate that it can't complete the request. In this case, the response entity would likely contain a list of the differences between the two versions in a format defined by the response Content-Type.
3) 401 Unauthorized
The request requires user authentication. The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource. The client MAY repeat the request with a suitable Authorization header field (section 14.8). If the request already included Authorization credentials, then the 401 response indicates that authorization has been refused for those credentials. If the 401 response contains the same challenge as the prior response, and the user agent has already attempted authentication at least once, then the user SHOULD be presented the entity that was given in the response, since that entity might include relevant diagnostic information. HTTP access authentication is explained in "HTTP Authentication: Basic and Digest Access Authentication" [43].
For any other use case that you have, it varies. I would probably go with number 2 if there is truly no standard way of encoding specific errors.