Optimization for GetSecret with Azure Key Vault

Our main goal for now is optimising a processing service.
The service has a system-assigned managed identity with access policies that allow it to get a secret.
This service makes 4 calls to a Key Vault. The first one takes a lot longer than the others. I'm scratching my head, because the Managed Identity token takes 91 µs to obtain. [Application Insights screenshot]
I changed the way the tokens are obtained: the program now obtains a token once and keeps reusing it for subsequent round trips, by registering the credential class as scoped (AddScoped).
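For anyone wanting to do the same, here is a minimal sketch of that idea in ASP.NET Core DI, assuming the Azure.Identity and Azure.Security.KeyVault.Secrets packages; the vault URI is a placeholder. Note that the Azure SDK guidance generally favours singleton registrations over scoped ones here, since each credential/client instance caches tokens and pools connections:

```csharp
using Azure.Core;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

var builder = WebApplication.CreateBuilder(args);

// One credential for the whole app: ManagedIdentityCredential caches its
// access token internally, so later calls skip the token round trip.
builder.Services.AddSingleton<TokenCredential>(new ManagedIdentityCredential());

// One SecretClient as well, so its HTTP connection pool is reused across calls.
builder.Services.AddSingleton(sp => new SecretClient(
    new Uri("https://my-vault.vault.azure.net/"),   // hypothetical vault URI
    sp.GetRequiredService<TokenCredential>()));
```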

I assume the question is why the first call takes more time. If yes, here are a couple of generic reasons which might contribute:
The TLS handshake on the first call takes extra time.
The client might need to create a connection pool.
In the case of Azure Key Vault, the first call does two round trips AFAIK: the first for auth, the second for the real payload.
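If the goal is just to hide that first-call cost, one option is to warm the connection up at startup. A sketch under the assumptions above; the vault URI and secret name are hypothetical:

```csharp
using System;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

// Warm up at startup so the TLS handshake, connection-pool creation, and
// auth round trip all happen before the first real request.
var client = new SecretClient(
    new Uri("https://my-vault.vault.azure.net/"),   // hypothetical vault URI
    new ManagedIdentityCredential());

// Fetch any known secret once and discard the result; subsequent calls
// reuse the pooled connection and the cached token.
_ = await client.GetSecretAsync("warmup-secret");   // hypothetical secret name
```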

Related

Use primary Id as idempotency token on API

I want to know if a primary id (1, 2, 3, ...) can be used as an idempotency token on an API instead of a long UUID string.
Since the id is unique, I think it can be.
Is there any risk?
I would recommend always using random idempotency tokens rather than reusing other values you already have (like primary keys or sequential numbers).
Reusing existing values makes replaying sequences of commands during a debug session difficult: the IDs used for idempotency might overlap after test data is reset, or in other situations where you want to send the same test data twice in a short period and not have the second request ignored.
An attacker may also be able to mount a denial of service against your use of the API if they know enough to make the web service think that an idempotency token has already been used within the deduplication window. If you don't use a random token, you should consider the situations in which someone could disrupt your use of the service by sending their own requests with tokens that you will later use (either guessed, or simply because they aren't generating random tokens themselves). Using either predictable or 'common' tokens in this situation will cause a problem. If the web service does not record tokens for calls made without valid authentication, then this is less of an issue (or not one at all).
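For illustration, a minimal sketch of generating a random key per logical operation; the header name follows the common "Idempotency-Key" convention, and the endpoint is hypothetical:

```csharp
using System;
using System.Net.Http;

var http = new HttpClient { BaseAddress = new Uri("https://api.example.com/") }; // hypothetical API

var request = new HttpRequestMessage(HttpMethod.Post, "orders")
{
    Content = new StringContent("{\"item\":\"book\"}")
};

// A GUID is effectively unguessable and cannot collide with, or be predicted
// from, sequential primary keys.
request.Headers.Add("Idempotency-Key", Guid.NewGuid().ToString());

var response = await http.SendAsync(request);
```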

How to handle secured API in service to service communication

I have a working monolith application (deployed in a container), for which I want to add notifications feature as a separate microservice.
I'm planning for the monolith to emit events to a message bus (RabbitMQ) where they will be received by the new service, which will send the notification to the user. In order to compose a notification, it will need other information about the user from the monolith, so it will call the monolith's REST API to obtain it.
The problem is that access to the monolith's API requires authentication in the form of a token. I was thinking of:
using the secret from the monolith to issue a never-expiring token - I don't think this is a great idea from a security perspective, and I also know that keys sometimes rotate, in which case the token would become invalid eventually anyway
using the message bus to retrieve the information - this does not seem like a good idea either, as the asynchrony would make it very complicated
providing all the info the notification service needs in the event - this would couple them more tightly, and moreover, I plan to also send notifications based on the state of the monolith that are not triggered by an event
removing the authentication from the monolith and implementing it differently (not sure how yet)
My question is, what are some of the good ways this kind of problem can be solved, and also, having just started learning about microservices, is what I am trying to do right in the first place?
When dealing with internal security you should always consider the deployment and how the APIs are exposed to the outside world; an API gateway might be used to simply make it impossible to access internal APIs from outside. In that case, a fixed token might be good enough to ensure that the client is authorized.
In general, though, I would suggest looking into OAuth2 or a JWT-based solution as it helps to validate the identities of the calling system as well as their access grants.
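As one concrete shape of that suggestion, here is a sketch of the OAuth2 client-credentials flow: the notification service exchanges its own client id/secret for a short-lived token, then calls the monolith's API with it. The IdP endpoint, client id, scope, and API route are all hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Json;

var http = new HttpClient();

// 1. Get a short-lived access token from the identity provider.
var tokenResponse = await http.PostAsync(
    "https://idp.example.com/oauth2/token",                     // hypothetical IdP
    new FormUrlEncodedContent(new Dictionary<string, string>
    {
        ["grant_type"] = "client_credentials",
        ["client_id"] = "notification-service",                 // hypothetical
        ["client_secret"] = Environment.GetEnvironmentVariable("CLIENT_SECRET")!,
        ["scope"] = "monolith.api.read",
    }));
var token = (await tokenResponse.Content.ReadFromJsonAsync<TokenReply>())!.access_token;

// 2. Call the monolith's API with the token. It expires on its own, so there
//    is no never-expiring credential to leak or rotate manually.
http.DefaultRequestHeaders.Authorization = new("Bearer", token);
var user = await http.GetStringAsync("https://monolith.internal/api/users/42");

record TokenReply(string access_token, int expires_in);
```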
As for your architecture doubts, you need to consider the following scenarios when building out the solution:
The remote call can fail at any time for unknown reasons; as such, you shouldn't acknowledge the notification event until you're certain that the notification has been processed successfully.
As you've mentioned RabbitMQ, you should aim to keep the notification queue as small as possible; to that effect, a cache that contains the user details might help speed things along (and reduce the chance of failure due to the external system not being available) - a small sketch follows after this list.
If your application sends a lot of notifications to potentially millions of different users, you could consider having a read-only database replica of the users which is accessible to the notification service, and directly read from the database cluster in batches. This reduces the load on the monolith and shifts it to the database layer.
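A minimal sketch of that cache idea, assuming Microsoft.Extensions.Caching.Memory and a typed HttpClient pointed at the monolith; the record shape, route, and TTL are illustrative:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

public record UserDetails(string Email, string DisplayName);

public class UserDetailsProvider
{
    private readonly HttpClient _http;    // base address: the monolith's API
    private readonly IMemoryCache _cache;

    public UserDetailsProvider(HttpClient http, IMemoryCache cache)
    {
        _http = http;
        _cache = cache;
    }

    public async Task<UserDetails?> GetAsync(int userId)
    {
        // Serve repeated notifications for the same user from memory; only
        // fall back to the monolith on a cache miss.
        return await _cache.GetOrCreateAsync($"user:{userId}", async entry =>
        {
            entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
            return await _http.GetFromJsonAsync<UserDetails>($"/api/users/{userId}");
        });
    }
}
```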

App client authentication (login) and CQRS

I'm interested in practical scenarios of authentication/login in a web application when the CQRS pattern is used to build the system.
Say we are using HTTP services for commands/queries, and authentication with JWT (or any other authentication token).
We send command LogInUser with credentials (HTTP request).
Server command handler checks credentials, writes events in the store (if using Event Sourcing).
What then? What should we return as the result of the command? Just an OK result with the authToken? Should the client then query the state from the read service? In that case we just make the whole process longer. And this concern actually applies not only to authentication scenarios but to any scenario where we send a command and expect to get the result of its execution as soon as possible.
I would just like to hear from people who have implemented such things. I want to understand possible practical data/request flows for authentication using CQRS.
Since you are using CQRS, you have decided to separate writing to the application from reading from the application.
To write to the application, you use commands.
To read from the application, you either wait for events, or you query the read model.
[Diagram showing the relation between the different options, taken from the documentation of wolkenkit, a CQRS and event-sourcing framework for JavaScript and Node.js.]
So, when you send your LogInUser command, the command itself does not return anything (of course, when using HTTP there must be a response, but it should just be a 200 OK, so that you can verify that the server received the command and will care about it sooner or later).
Now the server processes the login, verifies the credentials that were sent, and so on, and generates an appropriate UserLoggedIn event. This event gets stored in the event store, and is then sent to the read model.
The read model does two things with this event:
It simply forwards it to the client.
It updates any denormalized tables you may have that are interested in this event, so you can query them later.
So your client has two options:
It can wait for the event after having sent the command. Once the event is received, the client has the JWT.
It can query the read model until a given record has been updated.
As you need to make sure that only the sender of the command is able to receive the JWT, option 1 is actually the only viable way. You can make sure that an event gets only delivered to the client that sent the appropriate command, but you can't have a table that contains all JWTs where people can only read their JWTs before being authenticated. With the read model, you have a chicken-and-egg problem here.
So, to cut a long story short: The client should wait for the appropriate event, and the event contains the JWT. That's it.
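To make the flow concrete, here is a rough ASP.NET Core sketch (not wolkenkit): the command endpoint returns only 200 OK, and the UserLoggedIn event carrying the JWT is pushed back over a per-connection channel. The event channel, connection id, and JWT value are hypothetical stand-ins for whatever push mechanism (WebSocket, SignalR, SSE) you use:

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.MapPost("/commands/LogInUser", (LogInUser cmd) =>
{
    // 1. Verify credentials and append a UserLoggedIn event to the event
    //    store (omitted here).
    var evt = new UserLoggedIn(cmd.Username, Jwt: "eyJ...");  // hypothetical JWT

    // 2. Push the event, JWT included, only to the connection that sent the
    //    command, so no other client can read it.
    EventChannel.PublishTo(cmd.ConnectionId, evt);

    // 3. The HTTP response carries no result; 200 OK just acknowledges receipt.
    return Results.Ok();
});

app.Run();

record LogInUser(string Username, string Password, string ConnectionId);
record UserLoggedIn(string Username, string Jwt);

// Hypothetical per-connection push channel; in practice this would be
// SignalR's IHubContext or similar.
static class EventChannel
{
    public static void PublishTo(string connectionId, object evt) { /* ... */ }
}
```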

In what scenarios is a reliable session recommended?

In a few words, if I am not wrong, a session is used when I want to ensure that messages are sent in order, and to be able to use sessions a reliable connection is needed.
But my doubt is: what kind of applications need that? In my case it is a simple application in which a client requests data from a service, the service gets the data from the database and sends the results to the client. The client can also request to add, modify, or delete data from the database. In this case, do I need a reliable connection and sessions or not?
Thanks.
A session presumes that you want to retain some data for a period of time. That period, as far as sessions are concerned, refers to the client's lifecycle: when the client opens the proxy, both the service instance and the session are created; when the client closes the proxy, the service instance and the session end. There is an exception where closing the proxy does not take effect right away, and this occurs when you invoke a one-way operation: the service keeps working for as long as the operation runs, despite having already received the order to get rid of the instance.
Based on the provided information I assume the best choice would be PerCall. You do not store any data between calls, and every single call can be treated separately. Additionally, set ConcurrencyMode to Multiple so that calls can be processed simultaneously; a sketch of this configuration follows below.
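For reference, a minimal sketch of that configuration in WCF; the contract itself is illustrative:

```csharp
using System.ServiceModel;

[ServiceContract]
public interface IDataService
{
    [OperationContract]
    string GetData(int id);
}

[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.PerCall,  // new instance per call
    ConcurrencyMode = ConcurrencyMode.Multiple)]        // calls processed in parallel
public class DataService : IDataService
{
    public string GetData(int id) => $"record {id}";    // stands in for a DB read
}
```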
Personally, I find sessions useful with MSMQ, whenever I want a specific number of messages to be wrapped into a single queue message. If an error occurs, regardless of which message caused it, the whole queue message is rolled back.

How to keep an API idempotent while receiving multiple requests with the same id at the same time?

From a lot of articles and commercial APIs I have seen, most people make their APIs idempotent by asking the client to provide a requestId or idempotency key (e.g. https://www.masteringmodernpayments.com/blog/idempotent-stripe-requests) and basically store the requestId <-> response map in storage. So if a request comes in whose id is already in this map, the application just returns the stored response.
This is all good to me, but my problem is how do I handle the case where the second call comes in while the first call is still in progress?
So here are my questions:
I guess the ideal behaviour would be for the second call to keep waiting until the first call finishes, and then return the first call's response? Is this how people do it?
If yes, how long should the second call wait for the first call to be finished?
If the second call has a wait time limit and the first call still hasn't finished, what should it tell the client? Should it just not return any response, so the client will time out and retry again?
For Wunderlist we use database constraints to make sure that no request id (which is a column in every one of our tables) is ever used twice. Since our database technology (Postgres) guarantees that it is impossible for two records to be inserted that violate this constraint, we only need to react properly to the potential insertion error. Basically, we outsource this detail to our datastore.
I would recommend, no matter how you go about this, trying not to need coordination inside your application. If you try to detect yourself whether two things are happening at once, there is a high likelihood of bugs. Instead, there might be a system you already use which can make the guarantees you need.
Now, to specifically address your three questions:
For us, since we use database constraints, the database handles making things queue up and wait. This is why I personally prefer the old SQL databases - not for the SQL or relations, but because they are really good at locking and queuing. We use SQL databases as dumb disconnected tables.
This depends a lot on your system. We try to tune all of our timeouts to around 1s in each system and subsystem. We'd rather fail fast than queue up. You can measure and then look at your 99th percentile for timings and just set that as your timeout if you don't know ahead of time.
We would return a 504 HTTP status (and an appropriate response body) to the client. The reason for having an idempotency key is so the client can retry a request - so we are never worried about timing out and letting them do just that. Again, we'd rather time out fast and fix the problems than let things queue up. If things queue up, then even after something is fixed, one has to wait a while for things to get better.
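To illustrate the constraint approach described above, a sketch assuming Postgres via Npgsql; the table, column names, and connection string are hypothetical, not Wunderlist's actual schema:

```csharp
using System;
using Npgsql;

// The unique constraint is what actually enforces idempotency, e.g.:
//   ALTER TABLE tasks ADD CONSTRAINT tasks_request_id_key UNIQUE (request_id);

var requestId = Guid.NewGuid().ToString();  // supplied by the client in practice

await using var conn = new NpgsqlConnection("Host=localhost;Database=app"); // hypothetical
await conn.OpenAsync();

try
{
    await using var cmd = new NpgsqlCommand(
        "INSERT INTO tasks (request_id, title) VALUES (@rid, @title)", conn);
    cmd.Parameters.AddWithValue("rid", requestId);
    cmd.Parameters.AddWithValue("title", "buy milk");
    await cmd.ExecuteNonQueryAsync();
}
catch (PostgresException e) when (e.SqlState == "23505")  // unique_violation
{
    // The request id was already used: a concurrent or earlier request won
    // the race, so look up and return the stored response instead.
}
```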
It's a bit hard to tell whether the second call is from the same client with the same request token, or from a different client.
Normally, in the case of concurrent requests from different clients operating on the same resource, you would also want to implement a versioning strategy alongside a request token for idempotency.
A typical versioning strategy in a relational database might be a version column with a trigger that auto-increments the number each time a record is updated.
With this in place, all clients must specify their request token as well as the version they are updating (typically the If-Match header is used for this, with the version number as the value of the ETag).
On the server side, when it comes time to update the state of the resource, you first check that the version number in the database matches the version supplied in the ETag. If they match, you write the changes and the version increments. Assuming the second request was operating on the same version number as the first, it would then fail with a 412 (or a 409, depending on how you interpret the HTTP specifications), and the client should not retry.
If you really want to stop the second request immediately while the first request is in progress, you are going down the route of pessimistic locking, which doesn't suit REST APIs that well.
In the case where you are actually talking about the client retrying with the same request token because it received a transient network error, it's almost the same case.
Both requests will be running at the same time; the second request will start because the first request has not yet finished and has not yet recorded the request token to the database, but whichever one finishes first will succeed and record the request token.
The other request will receive a version conflict (since the first request has incremented the version), at which point it should recheck the request token table, find its own token in there, assume it was a concurrent request that finished before it did, and return 200.
It seems like a lot, but if you want to cover all the weird and wonderful failure modes when you're dealing with REST, idempotency, and concurrency, this is the way to deal with it.
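For illustration, a rough ASP.NET Core sketch of the version check; the storage layer is a hypothetical stand-in, and a real implementation would do the check and the increment atomically:

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.MapPut("/resources/{id}", (int id, HttpRequest req) =>
{
    // The client sends the version it read as the ETag value in If-Match.
    var ifMatch = req.Headers.IfMatch.ToString().Trim('"');
    var current = Store.Load(id);

    // Stale version: another request (perhaps the concurrent duplicate)
    // already updated the resource, so refuse the write.
    if (ifMatch != current.Version.ToString())
        return Results.StatusCode(412);             // Precondition Failed

    Store.Save(id, current.Version + 1);            // write + version bump
    return Results.Ok();
});

app.Run();

record Resource(int Version);

// Hypothetical storage; a real implementation would do the version check and
// the increment atomically, e.g. UPDATE ... WHERE version = @expected.
static class Store
{
    public static Resource Load(int id) => new(1);
    public static void Save(int id, int newVersion) { /* persist */ }
}
```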