ABAC with Monorepo Microservices: What is the best approach?

At my work, I have a task to research and find solutions to implement ABAC authorization in our microservices, which are organized in a monorepo. We have some products, and we use the concept of realms to organize different clients' data in the same database. Our requirements are roughly these:
A user who is a manager of a company can only see data from their own company and their own employees.
The same company can have N places, each of which can have a manager. The manager of a place can only see the data from that place.
First I thought of building some code to be used in every router of every API to verify authorization and allow or deny the request. Something like this:
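A rough sketch of the idea (all names here are made up for illustration; the check would live in a shared lib used by every API in the monorepo):

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Hypothetical shared-lib contract: decides whether a subject may perform an action on a resource.
public interface IAbacChecker
{
    Task<bool> IsAllowedAsync(string userId, string action, string resourceType,
                              string companyId, string? placeId);
}

[ApiController]
[Route("companies/{companyId}/employees")]
public class EmployeesController : ControllerBase
{
    private readonly IAbacChecker _abac;
    public EmployeesController(IAbacChecker abac) => _abac = abac;

    [HttpGet]
    public async Task<IActionResult> Get(string companyId)
    {
        var userId = User.Identity?.Name ?? string.Empty;

        // every router repeats this kind of check before running its business logic
        if (!await _abac.IsAllowedAsync(userId, "read", "employee", companyId, placeId: null))
            return Forbid();

        // ...load and return the employees of companyId here
        return Ok();
    }
}
```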
The other thing I thought was to create an API instead of a lib.
So, based on this question, I discovered that ABAC can be externalized from the apps (APIs), which makes a lot of sense to me; see the image below.
But then I have some questions.
Is it bad to do what I proposed in the first image or in the second?
How will the PDP know what the user wants to do? Based on the route being called? With this approach, doesn't the single-responsibility principle suffer, since the PDP needs to internalize (step 2) what the other apps do?
The PIP needs to call the database so the PDP can validate the authorization. This can be slow, as the same query is run twice: once to check the policy and again inside the service's business logic.

The modern way of doing this is by decoupling your policy and code - i.e. having a separate microservice for Authorization - here's a part in a talk I gave at OWASP DevSlop about it.
You'd want your code in the middleware to be as simple as possible - basically just querying the Authorization microservice. That service basically becomes your PDP (in XACML terms). This is true for both monoliths and microservices (the assumption is you'll end up having more microservices next to your monolith anyhow).
To implement the Authorization microservice / PDP you can use something like OPA (OpenPolicyAgent.org) and then use OPAL as a PAP and manager for PIPs. (Full disclosure I'm a contributor to both OPA and OPAL)
The query to the PDP should include what the user is doing (but not what the rules are). You can do this based on the route (common when doing service-mesh), but often it's better to define a resource/action layout which becomes part of the query and is independent of the application route. Take a look at the Permit.io Policy-Editor, which does exactly that kind of mapping. (Permit also uses both OPA and OPAL internally; full disclosure: I'm one of the founders of Permit.io.)
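As a minimal sketch (assuming OPA runs as the PDP at http://localhost:8181 with a policy package named authz; the resource/action layout and claim names here are my own illustrative choices), the middleware stays thin and just forwards who/what/which to the PDP:

```csharp
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public record OpaDecision(bool? Result);

public class OpaAuthorizationMiddleware
{
    private readonly RequestDelegate _next;
    private readonly IHttpClientFactory _httpClientFactory;

    public OpaAuthorizationMiddleware(RequestDelegate next, IHttpClientFactory httpClientFactory)
    {
        _next = next;
        _httpClientFactory = httpClientFactory;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var client = _httpClientFactory.CreateClient("opa");

        // describe what the user is trying to do as resource/action, independently of the route
        var payload = new
        {
            input = new
            {
                user = context.User.Identity?.Name,
                realm = context.User.FindFirst("realm")?.Value,
                action = context.Request.Method == "GET" ? "read" : "write",
                resource = new
                {
                    type = "employee",
                    company = context.Request.RouteValues["companyId"]?.ToString()
                }
            }
        };

        // OPA's Data API: POST /v1/data/<package>/<rule> returns {"result": true|false}
        var response = await client.PostAsJsonAsync("http://localhost:8181/v1/data/authz/allow", payload);
        var decision = await response.Content.ReadFromJsonAsync<OpaDecision>();

        if (decision?.Result != true)
        {
            context.Response.StatusCode = StatusCodes.Status403Forbidden;
            return;
        }

        await _next(context);
    }
}
```

The realm/company/place rules themselves live in the Rego policy, so the apps never internalize them; you'd register the middleware with app.UseMiddleware<OpaAuthorizationMiddleware>().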
Very good point. You don't want your PDP to constantly be querying other sources per incoming query (though it's okay to do so for a few edge cases). What you want is to load data gradually in the background in an asynchronous fashion, ideally event-driven (i.e. have events propagate in real time from the data sources into the PDP). This is exactly what you can do with OPAL.

How do you name API paths of the same microservice but w/ different consumers

Context
Let's imagine a simple microservices architecture (e.g. 2-3 microservices). Microservices are domain-based, an API gateway is in place, and everything is how it should be. At the same time, the microservices' APIs are consumed by public mobile applications, an admin UI, and other services for S2S communication; hence, we have three possible API consumers. Depending on the consumer, the response DTOs are different, but the business process might be the same (e.g. the response for the GET /users endpoint has different DTOs for a consumer application and the admin UI, but technically the data is taken from the same DB).
Question
How do you segment APIs in that case? Do you use namespaces like external, internal, etc.?
Also, feel free to share your experience of how you segment APIs.
Thanks in advance!
From my point of view, the APIs should be different depending on the type of consumer that is going to use them.
For example, talking about your use case, the API intended to provide simple user information shouldn't be the same as the one used by an administrator. You should define two different APIs in this case, with different paths like internal/users and external/users as you said, and internally these two endpoints can use the same logic.
This separation is good not only for returning different DTOs from each endpoint but also for defining different security (authentication/authorization) mechanisms for each API, because these requirements will presumably differ between an admin API and a general user one.
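A minimal sketch of that separation, assuming ASP.NET Core (the DTOs and user service are hypothetical): two paths, two DTOs, one shared service.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

public record User(string Id, string Name, string Email, bool IsBlocked);
public record AdminUserDto(string Id, string Name, string Email, bool IsBlocked);
public record PublicUserDto(string Id, string Name);

public interface IUserService
{
    Task<IReadOnlyList<User>> GetUsersAsync();
}

[ApiController]
[Route("internal/users")]
public class InternalUsersController : ControllerBase
{
    private readonly IUserService _users;
    public InternalUsersController(IUserService users) => _users = users;

    [HttpGet] // admin UI gets the full representation (plus its own auth policy)
    public async Task<IEnumerable<AdminUserDto>> Get() =>
        (await _users.GetUsersAsync()).Select(u => new AdminUserDto(u.Id, u.Name, u.Email, u.IsBlocked));
}

[ApiController]
[Route("external/users")]
public class ExternalUsersController : ControllerBase
{
    private readonly IUserService _users;
    public ExternalUsersController(IUserService users) => _users = users;

    [HttpGet] // public/mobile consumers get a trimmed-down representation
    public async Task<IEnumerable<PublicUserDto>> Get() =>
        (await _users.GetUsersAsync()).Select(u => new PublicUserDto(u.Id, u.Name));
}
```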
It depends a bit on the philosophy you want to adopt.
The one suggested by #JArgente is good, in that you'd get good separation, and the role of each is (or at least should be) very clear.
The other approach is layering, which (for the OO programmers out there) is a bit like developing overloads for a method. It assumes that the data required by the derived APIs is provided by the base API. So:
Develop a base API that provides all the data this API family needs to provide. This API might be the one that internal users use (e.g. Admin User), and it could require authentication.
Develop a public facing API that consumes the base API. This one would be your public-facing one.
Each API has a separate API spec; depending on how you do this you can leverage inheritance at the spec level.
Each API also has an actual endpoint which triggers some sort of processing - e.g. logic within the API Gateway itself, or logic handled within a downstream component like a microservice.
The public-facing one can be anonymous, as long as something (e.g. the API Gateway) can make an authenticated call against the base API, using some kind of 'service account'.
The advantage here is that you still get good separation between different APIs and their consumers, but you also get the advantages of inheritance, so that code duplication is reduced (testing effort isn't so diffuse, etc.).
This approach also allows you to run the endpoints on the same API Gateway, or deploy them on separate ones (internal vs external).
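A minimal sketch of the layering idea, assuming ASP.NET Core (the named HttpClient, routes, and DTOs are hypothetical): the anonymous public endpoint consumes the authenticated base API and projects its response down.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.Mvc;

public record BaseUserDto(string Id, string DisplayName, string Email);
public record PublicUserDto(string Id, string DisplayName);

[ApiController]
[Route("external/users")]
public class PublicUsersController : ControllerBase
{
    private readonly HttpClient _baseApi;

    // "BaseUsersApi" would be configured at startup with the base API's address
    // and a service-account bearer token.
    public PublicUsersController(IHttpClientFactory factory) =>
        _baseApi = factory.CreateClient("BaseUsersApi");

    [HttpGet]
    [AllowAnonymous] // the public layer is anonymous; its call to the base API is authenticated
    public async Task<IActionResult> Get()
    {
        var users = await _baseApi.GetFromJsonAsync<List<BaseUserDto>>("internal/users")
                    ?? new List<BaseUserDto>();

        // project the richer base representation down to the public one
        return Ok(users.Select(u => new PublicUserDto(u.Id, u.DisplayName)));
    }
}
```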

Reuse microservices across different projects

We developed a monolithic API to be used as a SaaS.
In the company we also develop custom builds for some customers.
Some of our customers are asking for some features that are already implemented in the monolithic application.
We are thinking about splitting our API into microservices, but our major questions are the following:
Can microservices be reused across different projects?
If we do split, do we create one microservice that everybody uses, or do we create an instance per custom build?
E.g.:
Project A uses "User" and "Project", so we deploy 2 microservices.
Project B uses "User", "Project" and "Store", so we deploy 3 microservices.
Total number of microservices deployed: 5.
If we create an instance of each microservice per custom build, won't it be too hard to manage the communication between all the microservices within the same custom build?
Or do we stick with one instance per microservice that everybody uses, and we specify the project source?
We are using C# GraphQL.
We also thought about creating a NuGet package for each component, so each package will contain:
Exposed GraphQL Queries / Mutations
Its own DB
Its own logic
E.g.:
- Api A installs the "User" & "Project" packages
- 3 DBs are instantiated: "Api.A", "Api.A.User", "Api.A.Project"
- Api B installs the "User", "Project" & "Store" packages
- 4 DBs are instantiated: "Api.B", "Api.B.User", "Api.B.Project" & "Api.B.Store"
But does it make sense to do that ?
In my mind it could be very similar to Hangfire: https://www.hangfire.io/
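To make the packaging idea concrete, I imagine each package exposing a single registration extension, something like this rough sketch (names are illustrative and not tied to a specific GraphQL library):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public static class UserModuleServiceCollectionExtensions
{
    public static IServiceCollection AddUserModule(this IServiceCollection services, string connectionString)
    {
        // the module owns its "Api.X.User" database and hides it behind a repository
        services.AddSingleton<IUserRepository>(_ => new UserRepository(connectionString));

        // the module would also register its GraphQL queries/mutations here,
        // using whatever GraphQL server library the host API is built on
        return services;
    }
}

public interface IUserRepository
{
    Task<IReadOnlyList<string>> GetUserIdsAsync();
}

public class UserRepository : IUserRepository
{
    private readonly string _connectionString;
    public UserRepository(string connectionString) => _connectionString = connectionString;

    public Task<IReadOnlyList<string>> GetUserIdsAsync() =>
        Task.FromResult<IReadOnlyList<string>>(Array.Empty<string>()); // query the module's own DB here
}
```

Api A would then just call services.AddUserModule(...) and a similar AddProjectModule(...) at startup.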
Note that we are currently using AWS Serverless to host our applications.
An important point is that we are a small team (2-4 people).
We are very open minded so any suggestion is good to hear.
Thank you !
First of all, I would like to say that there is no single right way here; I am providing my point of view based on the way we have already done things, hoping it will guide you in finding a solution best suited to your requirements.
So, to restate your dilemma: you have a base vanilla product, which is a SaaS API, and there are customized deployments for some customers as well. But as you are having to build custom deployments for each customer, you are noticing a common pattern, wherein a lot of the functionality is repeated across the SaaS for each customer.
Now assuming I have the requirement correct, I would say micro-services will provide definite benefits in your case in terms of scaling and customer-specific customization which will be managed by independent teams.
But a lot of this depends on how your business logic is structured and how big and vast your customization is. Some of the questions that should drive your solution are:
Can you store customer-specific data in a central data store, or must it stay at the customer's end? How are your databases going to be structured, and how many of them will there be?
How big are the customizations? Are they cosmetic, or do they affect workflows?
How much cross-communication do you expect across the various services like User, Store, and Project, and is there any communication across A.User - B.User or A.Project - B.Store, etc.?
Now moving to some of the important things you might want to consider post answering the above questions.
Consideration 1. If the data stores can be allowed to live in a single central place, you can go ahead with a single cluster where all your microservices can be deployed. But looking at the data provided, I assume you have multiple databases per customer, and I would recommend keeping them separate and not introducing any coupling between them. Thus you may end up with one microservice (or set of microservices) per customer which talks only to that customer's database. (More in fig. 1.)
Consideration 2. As far as the norm goes, customization should be separated from the service itself, and every service should take a configuration input that drives its behavior. Again, depending on how big your customization is, there can be a limit to this configuration, and in those cases I would recommend creating a new service with the customizations built in.
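A minimal sketch of that configuration-driven idea (the option names are illustrative): the vanilla service stays the same and per-customer configuration drives its behavior.

```csharp
// loaded per customer from appsettings, a config service, environment variables, etc.
public record CustomerOptions
{
    public string CustomerId { get; init; } = "";
    public bool RequireApprovalWorkflow { get; init; }  // workflow-level switch
    public string Theme { get; init; } = "default";     // cosmetic switch
}

public class ProjectService
{
    private readonly CustomerOptions _options;
    public ProjectService(CustomerOptions options) => _options = options;

    public string CreateProject(string name)
    {
        // the same vanilla service behaves differently per customer, driven only by configuration
        return _options.RequireApprovalWorkflow
            ? $"Project '{name}' created for {_options.CustomerId}; awaiting approval"
            : $"Project '{name}' created for {_options.CustomerId}";
    }
}
```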
Consideration 3. This is a major factor for deciding the number of microservices you may have, but the boundary of each service should be defined by the domain, for example a User service, a Store service, and a Project service. These are the vanilla services that interact with each other to produce a valid business scenario, and each customer's deployment is just a specialized instance of these services.
OK, now that this is done, let's go over your primary questions...
Can microservices be reused across different projects?
-- Yes you can, but again it depends on how you have designed the business workflow and configuration injection.
If we do split, do we create a microservice that everybody uses or do we create an instance per custom build?
-- An instance per custom build would be the ideal scenario, enabling separation of concerns across projects, since we do not want to mix data boundaries and client-specific sensitive configurations. That said, there might be cases where a single shared microservice is what is demanded, but that should be done with caution.
If we create an instance of each microservice per custom build, won't it be too hard to manage the communication between all the microservices within the same custom builds?
-- Communication across microservices is an important factor and is more often than not unavoidable. Considering you will require some form of cross-microservice communication, you can look at an enterprise bus or API-based communication, depending on your requirements. But in my opinion it is a well-understood, manageable problem.
Or do we stick with one instance per microservice that everybody uses and we specify the project source?
-- I would not recommend this, as the database-injection module example stated in your question doesn't sound like a good idea to me. It would lead to a highly coupled system design, and it might also mean that if one service fails, all your customer sites go down. You surely wouldn't want that!
Now, as they say, a picture is worth a thousand words...

Application Insights strategies for web api serving multiple clients

We have a back end API, running ASP.Net Core, with two front ends: A SPA web site (Vuejs) and a progressive web page (for mobile users). The front ends are basically only client code and all services are on different domains. We don't use cookies as authentication uses bearer tokens.
We've been playing with Application Insights for monitoring, but as the documentation is not very descriptive for our situations, I would like to get some more inputs for what is the best strategy and possibilities for:
Tracking users and metrics without cookies, from e.g. a button click in the application to the server call, the Entity Framework/SQL query (I see that this is currently not supported: "How to enable dependency tracking with Application Insights in an ASP.NET Core project"), data processing, and presentation of the result on the client.
Separating calls from mobile and standard web in an easy manner in Application Insights queries. Any way to show this in the standard charts that show up initially would be beneficial.
Making sure that our strategy will also fit in situations where other external clients will access the API, and we should be able to identify these easily, and see how much load they are creating for the system.
Doing all of the above with the least amount of code.
This might be worthy of several independent questions if you want specifics on any of them. (And generally your last bullet is always implied, isn't it? :))
What have you tried so far? Most of the "best way for you" kinds of things are going to be opinions, though.
For general answers:
re: tracking users...
If you're already doing user info/auth for other purposes, you'd just set the various context.user.* fields with the info you have on the incoming request's telemetry context. All other telemetry that occurs using that same telemetry context would then inherit whatever user info you already have.
re: separating calls from mobile and standard...
If you're already doing this as different services/domains, and you are already using the same instrumentation key for both places, then the domain/host info of page views or requests is already there; you can filter/group on this in the portal or make custom queries in the analytics portal to analyze it that way. If you know which site it is regardless of the host, you could add that as custom properties in the telemetry context; you could also do that to avoid dealing with host info.
re: external callers via an api
Similarly, if you're already exposing an API and using auth, you should (ideally) already know who the inbound callers are, and you can set that info in custom properties as well.
In general, custom properties (string:string key-value pairs) and custom metrics (string:double key-value pairs) are your friends. You can set them on contexts so all the events generated in that context inherit the same properties, or you can explicitly set them on an individual TrackEvent (or any of the other Track* calls) to send specific properties/metrics with any single event.
You can also use telemetry initializers to augment or filter any telemetry that's being generated automatically (like requests or dependencies on the server side, or page views and AJAX dependencies on the client side).
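For example, a telemetry initializer along these lines (a sketch assuming ASP.NET Core; the X-Client-Type header is just one illustrative way to distinguish mobile, web, and external callers):

```csharp
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.AspNetCore.Http;

public class ClientTelemetryInitializer : ITelemetryInitializer
{
    private readonly IHttpContextAccessor _httpContextAccessor;

    public ClientTelemetryInitializer(IHttpContextAccessor httpContextAccessor) =>
        _httpContextAccessor = httpContextAccessor;

    public void Initialize(ITelemetry telemetry)
    {
        var httpContext = _httpContextAccessor.HttpContext;
        if (httpContext == null) return;

        // set the user context so all telemetry in this request inherits it (no cookies needed)
        if (string.IsNullOrEmpty(telemetry.Context.User.AuthenticatedUserId))
            telemetry.Context.User.AuthenticatedUserId = httpContext.User?.Identity?.Name;

        // custom property to separate mobile / web / external callers in portal queries
        if (telemetry is ISupportProperties props &&
            httpContext.Request.Headers.TryGetValue("X-Client-Type", out var clientType) &&
            !props.Properties.ContainsKey("clientType"))
        {
            props.Properties["clientType"] = clientType.ToString();
        }
    }
}

// registration at startup:
//   services.AddHttpContextAccessor();
//   services.AddSingleton<ITelemetryInitializer, ClientTelemetryInitializer>();
```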

RESTful API: Where should I code my workflow?

I am developing a RESTful API. This is my first API, but also my first really big coding project. As such, I'm still learning a lot about architecture etc.
Currently, I have my API set up in the following layers:
HTTP Layer
Resource Layer
Domain Model / Business Logic Layer
Data Access / Repository Layer
Persistent Storage / DB Layer
The issue I have run into at the moment is where do I need to put workflow objects / managers? By workflows, I mean code that evaluates what next step is required by the end user. For example, an e-commerce workflow. User adds item to basket, then checks out, then fills in personal details, then pays. The workflow would be responsible for deciding what steps are next, but also what steps are NOT allowed. For example, a user couldn't cause errors in the API by trying to pay before they have entered personal details (maybe they recall the URI for payments and try to skip a step). The workflow would check to see that all previous steps had been completed, if not, would not allow payment.
Currently, my workflow logic is in the Resource Layer. I am using hypermedia links to present the workflow to the user e.g. providing a 'next step' link. The problem I have with this is that the resource layer is a top level layer, and more aligned with presentation. I feel it needs to know too much about the underlying domain model to effectively evaluate a workflow i.e. it would need to know it has to check the personal_details entity before allowing payment.
This now leads me to thinking that workflows belong in the domain model. This does make a lot more sense, as really workflows are part of the business logic and I think are therefore best placed in the domain layer. After all, replace the Resource Layer with something else, and you would still need the underlying workflows.
But now the problem is that workflows require knowledge of several domain objects to complete their logic. It now feels like they maybe belong in their own layer, between the Resource and Domain Layers?
HTTP Layer
Resource Layer
Workflow Layer
Domain Model / Business Logic Layer
Data Access / Repository Layer
Persistent Storage / DB Layer
I'm just wondering if anyone has any other views or thoughts around this? As I said, I have no past application experience to tell me where workflows should be placed. I'm really just learning this for the first time, so I want to make sure I'm going about it the right way.
Links to articles or blogs that cover this would be greatly appreciated. Love reading up on different implementations.
EDIT
To clarify, I realise that HATEOAS allows the client to navigate through the 'workflow', but there must be something in my API that knows what links to show, i.e. it is really defining the workflow that is allowed. It presents workflow-related links in the resource, but additionally it validates that requests are in sync with the workflow. Whilst I agree that a client will probably only follow the links provided in the resource, the danger (and beauty) of REST is that it's URI-driven, so there is nothing stopping a mischievous client trying to 'skip' steps in the workflow by making an educated guess at the URI. The API needs to spot this and return a 302 response.
The answer to this question has taken me a fair bit of research, but basically the 'workflow' part has nothing to do with REST at all and much more to do with the application layer.
My system had the application logic and the REST API too tightly coupled. I solved my problem by refactoring to reduce the coupling, and now the workflow lives within the context of the application.
REST encourages you to create a vocabulary of nouns (users, products, shopping carts) against an established set of verbs (GET, POST, PUT, DELETE). If you stick to this rule, then in your example the workflow really is defined by the set of interactions the user has with your site. It is how the user uses your app, which is really defined by the UI. Your REST services should react appropriately to invalid state requests, such as attempting to checkout with an empty cart, but the UI may also prevent such requests using script, which is an optional characteristic of REST.
For example, the UI which displays a product to the user might also display a link which would permit the user to add that product to their cart (POST shoppingcart/{productId}). The server really shouldn't care how the user got to that POST, only that it should add that product to the user's cart and return an updated representation of the cart to the user. The UI can then use javascript to determine whether or not to display a link to checkout only if the shopping cart has one or more items.
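For instance, a sketch of that server-side reaction (assuming ASP.NET Core; the cart types are hypothetical): the checkout endpoint re-checks the state itself instead of trusting the client to only follow the links it was given.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

public record Cart(string Id, IReadOnlyList<string> Items, bool HasPersonalDetails);

public interface ICartRepository
{
    Task<Cart?> FindAsync(string cartId);
}

[ApiController]
[Route("carts/{cartId}/checkout")]
public class CheckoutController : ControllerBase
{
    private readonly ICartRepository _carts;
    public CheckoutController(ICartRepository carts) => _carts = carts;

    [HttpPost]
    public async Task<IActionResult> Checkout(string cartId)
    {
        var cart = await _carts.FindAsync(cartId);
        if (cart is null)
            return NotFound();

        // a mischievous client may POST here directly; the server still enforces valid state
        if (cart.Items.Count == 0)
            return Conflict(new { error = "Cannot check out an empty cart." });
        if (!cart.HasPersonalDetails)
            return Conflict(new { error = "Personal details are required before payment." });

        // ...proceed with payment and return the updated representation / next links
        return Ok();
    }
}
```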
So it seems that your workflow lives outside the REST service and is rather defined by the navigation in your pages, which interact with your REST services as the user requests things. It's certainly possible that you might have internal workflows which must occur within your application based on the states setup by the user. But what you seem to be describing is a user interaction within the site, and while that's indeed a workflow, it seems better defined by your UI(s) than by a dedicated server-side component/layer.
You touch on the workflow (aka business logic) part of an API. Technically this is a separate concern from the API part which is the interface. Sure, as you mention, HATEOAS allows you to suggest certain actions which are valid, but you should be careful to maintain statelessness.
In REST applications, there must not be session state stored on the server side. Instead, it must be handled entirely by the client.
So, if there's session state on the server, it's not REST.
For your shopping cart example, you can save state in a separate caching layer like Redis. As for your workflows: you wouldn't want to put business logic like calculating the shopping cart or total bill in a domain model; that belongs in the service layer.
You talked about mischievous users guessing URLs. This is always a concern and should be handled by your security. If the URL to delete a user is DELETE /user/3782 ... they can easily guess how to delete all the users. But you shouldn't rely only on obfuscating the URLs. You should have real security and access checks inside your endpoints checking if each request is valid.
This is the same solution for your shopping cart concerns. You'll need to grant a token which will carry their shopping information and use that to validate each action, regardless of whether they knew the right URL or not. There are no shortcuts when it comes to security.
You may want to re-orient your architecture along the lines of DDD (Domain-Driven Design) and perhaps use an MSA (microservice architecture); that way you can shift from orchestrated workflows to EDA (event-driven architecture) and choreography of micro-processes.

How do I lock down a .net 4.0 WCF Data Service

I've had a WCF Data service published for about 2 months. It's 100% been hacked already. I even noticed the service published on twitter!
Luckily my site was under development and the user entity was only about 80 beta testers.
Still this is a pretty big problem. With the power of E.F. Navigation properties anyone can easily write a script to download all my user data and my valuable domain data that no-one else has. I want to provide non-authenticated access and do things like:
Limit what columns get exposed (e.g. a user's emails)
Limit the number of requests possible per day (e.g. 10 per requesting host address)
Be notified when someone is misusing the service
Limit the results set and expand options on different entity sets
Stuff I haven't yet thought about
Does this make sense or should I drop WCF Data Services - which in theory sounded great, but now that I've got experience with them I'm wondering if they are just good for development and not production (they're kind of fatter than I was expecting).
Thoughts that go beyond my knowledge and suggestions here will be greatly appreciated.
Also posting any links to thorough blog post examples or video presentation that cover ground would be excellent!
I think you need to implement some authentication. There is no other way I can think of to "lock down" a web service. This is one of the advantages of WCF -- it makes implementing complex authentication easy.
On my WCF service, I require a UserContext object, simply comprised of two strings, username and password.
Every method on the service requires that context, and if I haven't added the username/password to the database, it denies the request.
This also makes it simple to track who is abusing the service, as you will have their username/password tied to every request.
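A rough sketch of that pattern on a plain WCF service (the contract, entity, and credential store are illustrative):

```csharp
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class UserContext
{
    [DataMember] public string Username { get; set; }
    [DataMember] public string Password { get; set; }
}

[DataContract]
public class DomainRecord
{
    [DataMember] public int Id { get; set; }
}

[ServiceContract]
public interface IDomainDataService
{
    [OperationContract]
    DomainRecord[] GetRecords(UserContext context);
}

public class DomainDataService : IDomainDataService
{
    public DomainRecord[] GetRecords(UserContext context)
    {
        // every operation validates the caller first; the lookup doubles as an audit trail
        if (context == null || !CredentialStore.IsValid(context.Username, context.Password))
            throw new FaultException("Access denied.");

        return new DomainRecord[0]; // real data access goes here
    }
}

// hypothetical credential check backed by the user database
public static class CredentialStore
{
    public static bool IsValid(string username, string password) => false; // look it up in the DB
}
```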
You should also run it over SSL so other users' credentials will not be easily compromised.
1 - WCF Data Services currently doesn't allow you to easily filter columns on a per-request basis. You could have two EF models (one "public" and one "private") and expose them as two services. The public one is accessible to anybody; the private one sits behind full auth.
2 - This you will have to implement yourself. But for this to work you need some way to identify the user, so it's pretty close to authentication (even if it doesn't require a password or something like that). There's a series of posts about auth over WCF Data Services here: http://blogs.msdn.com/b/astoriateam/archive/tags/authentication/
3 - If you can identify the user as per #2, you can for example count the number or frequency of requests he/she makes and set up a notification based on that. Again, the techniques used for auth should provide you the right hooks.
4 - This is reasonably simple. WCF Data Services allows you to set a hard limit on the size of the response (DataServiceConfiguration.MaxResultsPerCollection) or a soft limit, which means paging. Paging is usually better, since it limits the size of a single response but still allows clients to get all the data with multiple requests. This can be done through DataServiceConfiguration.SetEntitySetPageSize. The expand behavior can be limited by usage of the DataServiceConfiguration.MaxExpandCount and MaxExpandDepth properties.
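A minimal sketch of those knobs in InitializeService (entity set names are examples; MyEntities stands in for your EF context):

```csharp
using System.Data.Services;
using System.Data.Services.Common;

public partial class ProductsDataService : DataService<MyEntities> // MyEntities = your EF context
{
    public static void InitializeService(DataServiceConfiguration config)
    {
        // expose only the sets you intend to expose, read-only
        config.SetEntitySetAccessRule("Products", EntitySetRights.AllRead);

        // soft limit: server-driven paging keeps any single response small
        // (the hard-limit alternative is config.MaxResultsPerCollection)
        config.SetEntitySetPageSize("Products", 50);

        // limit how much related data one request can pull in via $expand
        config.MaxExpandCount = 3;
        config.MaxExpandDepth = 2;

        config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V2;
    }
}
```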
Some other possible techniques to use:
Query interceptors (http://msdn.microsoft.com/en-us/library/dd744842.aspx) - these allow you to filter rows on a per-request basis. Typically used to limit rows based on the user making the request (note that this only allows you to filter rows, not columns); see the sketch after this list.
Service operations - if you define a service operation which returns IQueryable the client can still compose queries on top of it, but it gives you the ability to filter the data before the query is applied. Or you can make certain pieces of information accessible only through service operations (not as easy to use and not queryable, but it gives you full control). (http://msdn.microsoft.com/en-us/library/cc668788.aspx)
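Finally, the query interceptor mentioned above, as a sketch on the same data service class (the Orders set and its OwnerUsername column are hypothetical): rows are filtered for every request against that set, though columns cannot be hidden this way.

```csharp
using System;
using System.Data.Services;
using System.Linq.Expressions;

public partial class ProductsDataService // continues the class from the earlier sketch
{
    [QueryInterceptor("Orders")]
    public Expression<Func<Order, bool>> OnQueryOrders()
    {
        // resolve the caller using whatever auth scheme you chose (see the auth series linked in #2)
        string currentUser = ResolveCurrentUser(); // hypothetical helper

        // only rows owned by the caller survive the query
        return o => o.OwnerUsername == currentUser;
    }

    private static string ResolveCurrentUser() => "anonymous";
}
```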