What naming convention should I use for a custom SCIM schema?

I am struggling to find documentation or a recommendation on how to name the schema id for a custom SCIM resource.
{
  "id": "urn:ietf:params:scim:schemas:mycompany:2.0:MyResource",
  "name": "MyResource",
  "description": "MyResource description",
  "attributes": [
    {
      "name": "name",
      "type": "string",
      "multiValued": false,
      "description": "A human-readable name for MyResource. REQUIRED.",
      "required": true,
      "caseExact": true,
      "mutability": "readWrite",
      "returned": "default",
      "uniqueness": "none"
    }
  ],
  "meta": {
    "resourceType": "Schema",
    "location": "/v2/Schemas/urn:ietf:params:scim:schemas:mycompany:2.0:MyResource"
  }
}
Should it use the same prefix as the built-in schemas? urn:ietf:params:scim:schemas:
Or rather just my own custom prefix? urn:mycompany:scim:schemas:MyResource
I am using SCIM 2.0.

Also, although there are no definitive guidelines for the naming convention of proprietary resources, I would recommend making the schema naming a per-provider configuration within your Service Provider or client.
The reason is that some implementations literally don't care, and you can do what you want. However, Azure AD's SCIM client will not allow you to create a mapping to a custom attribute that does not follow one of these formats:
urn:ietf:params:scim:schemas:extension:2.0:CustomExtensionName:CustomAttribute
or
urn:ietf:params:scim:schemas:extension:CustomExtensionName:2.0:User.CustomAttributeName:value
Where CustomExtensionName, CustomAttribute, and CustomAttributeName:value can be changed to suit your model.
You don't want to implement someone's made-up idea of a standard only to find that it conflicts with another implementation. So make it as dynamic as possible, within reason.
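To illustrate, a rough sketch of such a per-provider setting (the class and property names here are made up for this answer, not part of any SCIM library):

public class ScimProviderOptions
{
    // Configurable per provider/client, e.g. "urn:ietf:params:scim:schemas:extension:2.0"
    // for an Azure AD-style client, or "urn:mycompany:scim:schemas" for one that doesn't care.
    public string CustomSchemaUrnPrefix { get; set; } =
        "urn:ietf:params:scim:schemas:extension:2.0";

    // Builds e.g. "urn:ietf:params:scim:schemas:extension:2.0:CustomExtensionName".
    public string GetExtensionUrn(string extensionName) =>
        $"{CustomSchemaUrnPrefix}:{extensionName}";
}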

Like you, I did not find any resource that clearly indicates the best practices. But the way Oracle does it seems clean:
For a new resource: urn:ietf:params:scim:schemas:mycompany:core:2.0:NewResource
For an attribute extension, i.e. on User: urn:ietf:params:scim:schemas:extension:mycompany:2.0:User

Section 10 of RFC 7643 (the SCIM 2.0 core schema specification) covers the IANA registration of the "scim" namespace ID, along with an optional registration process for adding new schema URIs.
Although I haven't found any complete best practices around this topic either, I would propose using the "urn:ietf:params:scim:schemas" prefix if you intend for your schemas to be standardized and used more broadly, and you are able to follow the registration process and requirements outlined in the RFC.
Otherwise, utilizing a company-based / proprietary namespace seems appropriate, e.g. urn:mycompany:scim:schemas:core:MyResource:1.0 or urn:mycompany:scim:schemas:extension:MyResource:myExtension:1.0.

Having issues with documenting the dynamic keyword in C# ASP.NET Core

We utilize pagination, and we have a handy little class that helps us; some of it looks like this:
public class PagedResponse<T>
{
    public PagedResponse(HttpRequest request, IQueryable<dynamic> queryable,
        string maxPageSizeKey, dynamic options, int pageNumber, int pageSize)
    {
        //code
    }

    public dynamic Data { get; set; }
At some point, we execute this and assign it to Data:
List<dynamic> dataPage = queryable
    .Skip(skip)
    .Take(take)
    .ToList();
Because this class utilizes the dynamic type, here is what Swagger generates (imagine the type we pass to our PagedResponse class is Foo):
{
  "PageNumber": 0,
  "PageSize": 0,
  "MaxPageSizeAllowed": 0,
  "FirstPage": "string",
  "NextPage": "string",
  "LastPage": "string",
  "TotalPages": 0,
  "TotalRecords": 0,
  "Data": {} <---- Here we want it to describe Foo
}
I also notice that when you click on 'schema' it gives it the concatenated name FooPagedResponse.
The fact that swagger doesn't give any information about Data is becoming a sticking point for our React developers who utilize some utility to grab schemas.
Now, here's the thing, if you replaced anywhere that I used dynamic with T, swagger, of course, would be happy. But that's no solution, because we have to use dynamic.
Why? ODataQueryOptions and the possibility of using the OData $select command.
Before we execute the queryable (the .ToList above), we are doing something like this (as well as other OData commands):
if (options.SelectExpand != null)
{
    queryable = options.SelectExpand.ApplyTo(queryable, settings) as IQueryable<dynamic>;
}
(options is the "dynamic options" parameter passed to the constructor; it is the ODataQueryOptions.)
Once you apply a $select, you can no longer assign the .ToList() result to a List<T>; it has to be List<dynamic>, otherwise I get an exception about not being able to convert types.
I can't take away the ability to $select just to get proper documentation. What can I do here?
While this is not a direct answer to the original post, there is a lot to unpack here, too much for a simple comment.
Firstly, the original question is essentially: How to customise the swagger documentation when returning dynamic typed responses
To respond to this directly would require the post to include the swagger configuration as well as an example of the implementation, not just the type.
The general mechanism for extending Swagger docs is by implementing IDocumentFilter or IOperationFilter, so have a read over these resources (a rough filter sketch follows after them):
IOperationFilter and IDocumentFilter in ASP.NET Core
Is there a way to get Swashbuckle to add OData parameters
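For example, here is a rough sketch of an IOperationFilter (Swashbuckle.AspNetCore) that swaps the open-ended Data schema for the concrete item type. Foo, GetFoos and the filter name are assumptions about your app rather than anything Swashbuckle prescribes, so treat it as a starting point only:

using Microsoft.OpenApi.Models;
using Swashbuckle.AspNetCore.SwaggerGen;

// Placeholder for the real payload type from the question.
public class Foo
{
    public string Name { get; set; }
}

public class PagedResponseDataFilter : IOperationFilter
{
    public void Apply(OpenApiOperation operation, OperationFilterContext context)
    {
        // Only touch the action(s) we know return PagedResponse with Foo data.
        if (context.MethodInfo?.Name != "GetFoos")
            return;

        // Ensure a schema for the concrete item type exists in the document.
        var fooSchema = context.SchemaGenerator.GenerateSchema(typeof(Foo), context.SchemaRepository);

        if (!operation.Responses.TryGetValue("200", out var ok) ||
            !ok.Content.TryGetValue("application/json", out var media))
            return;

        // The 200 response schema is usually a $ref to "FooPagedResponse";
        // resolve it so we can replace its open-ended "Data" property.
        var schema = media.Schema;
        if (schema.Reference != null &&
            context.SchemaRepository.Schemas.TryGetValue(schema.Reference.Id, out var resolved))
        {
            schema = resolved;
        }

        schema.Properties["Data"] = new OpenApiSchema
        {
            Type = "array",
            Items = fooSchema
        };
    }
}

You would register it with something like services.AddSwaggerGen(o => o.OperationFilter<PagedResponseDataFilter>()); the same idea works with IDocumentFilter if you prefer to patch the generated document in one place.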
Because you are using a dynamically typed data response, the documentation tools cannot determine the expected type, so you would need to provide that information, if it is available, in another way. There are many different add-ons and code examples out there, but I haven't come across an implementation like this, so I can't provide a specific example. Most of the OData configuration is derived through reflection, so if you have obscured the specific implementation behind dynamic, most of the OData tools and inner workings will simply fail.
Now, here's the thing, if you replaced anywhere that I used dynamic with T, swagger, of course, would be happy. But that's no solution, because we have to use dynamic.
The only reason that your code requires you to use a dynamically typed response is that it has been written in a way that requires it. This is a constraint that you have enforced on the code; it has nothing to do with EF, OData or any other constraints outside of your code. When a $select operation is applied to project a query result, we are not actually changing the shape of the data at all, we are simply omitting non-specified columns; in a practical sense their values will be null.
OData is very specific about this concept of projection: to simplify the processing at both the server and client ends, a $select (or $expand) query option cannot modify the schema at all, it only provides a mask that is applied to the schema. It is perfectly valid to deserialise a response from a $select projection into the base type; however, the missing fields will not be initialized.
It is not always practical to process the projected response in the base type of the request, so many clients, especially any late-bound languages will receive the data in the same shape that it was transferred over the wire in.
While you could go deep into customising the swagger output, this post smells strongly like an XY problem. OData has a built-in mechanism for paging IQueryable<T> responses OOTB; yes, the response is different to the PagedResponse<T> class that you are using, but it is more than adequate for typical data paging scenarios and its implementation is very similar to your custom implementation.
If you use the in-built paging mechanism in OData, using the $skip and $top query options, then your code implementation will be simpler and the swagger documentation would be correct.
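For illustration, a rough sketch of what that looks like with Microsoft.AspNetCore.OData (8.x style); Company and AppDbContext are stand-ins for your own model and data access, and the entity set is assumed to be registered as "Companies" in the EDM:

using System.Linq;
using Microsoft.AspNetCore.OData.Query;
using Microsoft.AspNetCore.OData.Routing.Controllers;
using Microsoft.EntityFrameworkCore;

public class Company
{
    public int Id { get; set; }
    public string CompanyName { get; set; }
    public string TradingName { get; set; }
}

public class AppDbContext : DbContext
{
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options) { }
    public DbSet<Company> Companies => Set<Company>();
}

public class CompaniesController : ODataController
{
    private readonly AppDbContext _db;
    public CompaniesController(AppDbContext db) => _db = db;

    // [EnableQuery] applies $select, $filter, $orderby, $top, $skip and $count
    // for you; PageSize enforces server-driven paging and emits @odata.nextLink.
    [EnableQuery(PageSize = 5)]
    public IQueryable<Company> Get() => _db.Companies;
}

Because the action is typed as IQueryable<Company>, the documentation generated from it can describe the real shape instead of an empty object.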
You have admitted to being new to OData, so before blindly customising standard endpoints it is a valuable experience to first gain an understanding of the default behaviours and see if you can re-align your requirements with the standards.
A key driving reason to adopt OData in the first place is to be standards compliant, so that clients can make calls against your API following standard conventions and code generation tools can create reliable client-side interfaces. Once you start customising the in-built behaviours you must then customise the documentation or $metadata if you want to maintain compatibility with those types of processes.
By default, when a $top query option is provided, the response will include a link to get the next page of results. If you have not enabled the count to be provided automatically, you can use the $count=true query option to enable the overall count to be provided in the output.
The key difference between your custom implementation and the OData implementation is that the client is responsible for managing or maintaining the list of Page Numbers and translating them into $top and $skip values.
This is a deliberate push from the OData team to support virtual paging or load on demand scrolling. The expectation is that the user might get to the bottom of a list and either click to see more, or by virtue of reaching the end of the current list more records would be dynamically added to the previous set, becoming a delayed load yet continuous list.
To select the 3rd page, if the page size is 5, with both a projection and a filter we could use this url:
~/OData/Companies?$top=5&$skip=10&$select=CompanyName,TradingName&$filter=contains(CompanyName,'Bu')&$count=true
{
  "@odata.context": "~/odata/$metadata#Companies(CompanyName,TradingName)",
  "@odata.count": 12,
  "value": [
    {
      "CompanyName": "BROWERS BULBS",
      "TradingName": "Browers Bulbs"
    },
    {
      "CompanyName": "BUSHY FLOWERS",
      "TradingName": "Bushy Flowers"
    }
  ],
  "@odata.nextLink": "~/OData/Companies?$top=5&$skip=10&$select=CompanyName%2CTradingName&$filter=contains%28CompanyName%2C%27Bu%27%29"
}
This table has thousands of rows and over 30 columns, way too much to show here, so it's great that I can demonstrate both $select and $filter operations.
What is interesting here is that this response represents the last page. The client code should be able to interpret this easily, because the number of rows returned was less than the page size, but also because the total number of rows that match the filter criteria was 12, so 2 rows on page 3 when paging by 5 was expected.
This is still enough information on the client side to build out links to the specific individual pages but greatly reduces the processing on the server side and the repetitive content returned in the custom implementation. In other words the exact response from the custom paging implementation can easily be created from the standard response if you had a legacy need for that structure.
Don't reinvent the wheel, just realign it.
Anthony J. D'Angelo

Data replication or API Gateway Aggregation: which one to choose using microservices?

As an example, let's say that I'm building a simple social network. I currently have two services:
Identity, managing the users, their personal data (e-mail, password hashes, etc.), their public profiles (username), and authentication
Social, managing the users' posts, their friends and their feed
The Identity service can give the public profile of a user using its API at /api/users/{id}:
// GET /api/users/1 HTTP/1.1
// Host: my-identity-service
{
  "id": 1,
  "username": "cat_sun_dog"
}
The Social service can give a post with its API at /api/posts/{id}:
// GET /api/posts/5 HTTP/1.1
// Host: my-social-service
{
  "id": 5,
  "content": "Cats are great, dogs are too. But, to be fair, the sun is much better.",
  "authorId": 1
}
That's great, but my client, a web app, would like to show the post with the author's name, and it would preferably receive the following JSON data in one single REST request.
{
  "id": 5,
  "content": "Cats are great, dogs are too. But, to be fair, the sun is much better.",
  "author": {
    "id": 1,
    "username": "cat_sun_dog"
  }
}
I found two main ways to approach this.
Data replication
As described in Microsoft's guide for data and Microsoft's guide for communication between microservices, it's possible for a microservice to replicate the data it needs by setting up an event bus (such as RabbitMQ) and consuming events from other services:
And finally (and this is where most of the issues arise when building microservices), if your initial microservice needs data that's originally owned by other microservices, do not rely on making synchronous requests for that data. Instead, replicate or propagate that data (only the attributes you need) into the initial service's database by using eventual consistency (typically by using integration events, as explained in upcoming sections).
Therefore, the Social service can consume events produced by the Identity service such as UserCreatedEvent and UserUpdatedEvent. Then, the Social service can have in its very own database a copy of all the users, but only the required data (their Id and Username, nothing more).
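A rough sketch of what that consumer could look like (IEventHandler, the event shapes and the in-memory store are made up for this question; a real Social service would persist the copy in its own database and subscribe through whatever bus library is in use):

using System.Collections.Concurrent;
using System.Threading.Tasks;

// Integration events published by the Identity service (illustrative shapes).
public record UserCreatedEvent(int Id, string Username);
public record UserUpdatedEvent(int Id, string Username);

// Hypothetical handler interface that the event-bus library would dispatch to.
public interface IEventHandler<TEvent>
{
    Task HandleAsync(TEvent @event);
}

// The Social service keeps its own minimal copy of user data (Id + Username only).
public class UserReplicationHandler :
    IEventHandler<UserCreatedEvent>,
    IEventHandler<UserUpdatedEvent>
{
    private readonly ConcurrentDictionary<int, string> _usernamesById = new();

    public Task HandleAsync(UserCreatedEvent e)
    {
        _usernamesById[e.Id] = e.Username;
        return Task.CompletedTask;
    }

    public Task HandleAsync(UserUpdatedEvent e)
    {
        _usernamesById[e.Id] = e.Username;
        return Task.CompletedTask;
    }
}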
With this eventually consistent approach, the Social service now has all the required data for the UI, all in one request!
// GET /api/posts/5 HTTP/1.1
// Host: my-social-service
{
  "id": 5,
  "content": "Cats are great, dogs are too. But, to be fair, the sun is much better.",
  "author": {
    "id": 1,
    "username": "cat_sun_dog"
  }
}
Benefits:
Makes the Social service totally independent from the Identity service; it can work totally fine without it
Retrieving the data requires fewer network roundtrips
Provides data for cross-service validation (e.g. check if the given user exists)
Drawbacks and questions:
Takes some time for a change to propagate
The system is absolutely RUINED for some users if some messages fail to get through due to a disaster that fried all your replicated queues!
What if, one day, I need more data from the user, like their ProfilePicture?
What to do if I want to add a new service with the same replicated data?
API Gateway aggregation
As described in Microsoft's guide for data, it's possible to create an API gateway that aggregates data from two requests: one to the Social service, and another to the Identity service.
Therefore, we can have an API gateway action (/api/posts/{id}) implemented as such, in pseudo-code for ASP.NET Core:
[HttpGet("/api/posts/{id}")]
public async Task<IActionResult> GetPost(int id)
{
var post = await _postService.GetPostById(id);
if (post is null)
{
return NotFound();
}
var author = await _userService.GetUserById(post.AuthorId);
return Ok(new
{
Id = post.Id,
Content = post.Content,
Author = new
{
Id = author.Id,
Username = author.Username
}
});
}
Then, a client just uses the API gateway and gets all the data in one query, without any client-side overhead:
// GET /api/posts/5 HTTP/1.1
// Host: my-api-gateway
{
  "id": 5,
  "content": "Cats are great, dogs are too. But, to be fair, the sun is much better.",
  "author": {
    "id": 1,
    "username": "cat_sun_dog"
  }
}
Benefits:
Very easy to implement
Always gives the up-to-date data
Gives a centralized place to cache API queries
Drawbacks and questions:
Increased latency: in this case, it's due to two sequential network roundtrips
The action breaks if the Identity service is down; this can be mitigated using the circuit breaker pattern (see the sketch after this list), but the client still won't see the author's name
Unused data might still get queried and waste resources (but that's marginal most of the time)
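A rough sketch of that circuit breaker mitigation with Polly (v7-style API); IUserService and UserDto mirror the hypothetical services used in the gateway action above and are not part of any library:

using System;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;
using Polly.CircuitBreaker;

public record UserDto(int Id, string Username);

public interface IUserService
{
    Task<UserDto> GetUserById(int id);
}

public class AuthorLookup
{
    // Break the circuit after 3 consecutive failures and keep it open for 30s,
    // so a dead Identity service doesn't slow every aggregated request down.
    private static readonly AsyncCircuitBreakerPolicy _identityCircuit =
        Policy.Handle<HttpRequestException>()
              .CircuitBreakerAsync(exceptionsAllowedBeforeBreaking: 3,
                                   durationOfBreak: TimeSpan.FromSeconds(30));

    private readonly IUserService _userService;
    public AuthorLookup(IUserService userService) => _userService = userService;

    public async Task<UserDto> TryGetAuthorAsync(int authorId)
    {
        try
        {
            return await _identityCircuit.ExecuteAsync(() => _userService.GetUserById(authorId));
        }
        catch (Exception e) when (e is BrokenCircuitException or HttpRequestException)
        {
            // Identity is down or the circuit is open: degrade gracefully and
            // let the gateway return the post without the author's name.
            return null;
        }
    }
}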
Given those two options, aggregation at the API gateway and data replication on individual microservices using events, which one should be used for which situation, and how should they be implemented correctly?
In general, I strongly favor state replication via events in durable log-structured storage over services making synchronous (in the logical sense, even if executed in a non-blocking fashion) queries.
Note that all systems are, at a sufficiently high level, eventually consistent: because we don't stop the world to allow an update to a service to happen, there's always a delay from update to visibility elsewhere (including in a user's mind).
In general, if you lose your datastores, things get ruined. However, logs of immutable events give you active-passive replication for nearly free (you have a consumer of that log which replicates events to another datacenter): in a disaster you can make the passive side active.
If you need more events than you are already publishing, you just add a log. You can seed the log with a backfilled dump of synthesized events from the state before the log existed (e.g. dump out all the current ProfilePictures).
When you think of your event bus as a replicated log (e.g. by implementing it using Kafka), consumption of an event doesn't prevent arbitrarily many other consumers from coming along later (it's just incrementing your read-position in the log). So that allows for other consumers to come along and consume the log for doing their own remix. One of those consumers could be simply replicating the log to another datacenter (enabling that active-passive).
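As a concrete illustration, here is a minimal log consumer using the Confluent.Kafka client; the topic name and group id are made up, and the point is simply that each consumer group tracks its own offset, so new consumers (another service, or a datacenter replicator) can replay the same log later without affecting anyone else:

using System;
using Confluent.Kafka;

var config = new ConsumerConfig
{
    BootstrapServers = "localhost:9092",
    GroupId = "social-service-replicator",      // a second group, e.g. "dr-replicator", reads independently
    AutoOffsetReset = AutoOffsetReset.Earliest, // new groups start from the beginning of the log
    EnableAutoCommit = false
};

using var consumer = new ConsumerBuilder<string, string>(config).Build();
consumer.Subscribe("user-events");

while (true)
{
    var result = consumer.Consume();

    // Apply the event to this service's local view; committing only advances
    // this group's read position and never removes the event for other consumers.
    Console.WriteLine($"{result.Message.Key}: {result.Message.Value}");
    consumer.Commit(result);
}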
Note that once you allow services to maintain their own views of the important bits of data from other services, you are in practice doing Command Query Responsibility Segregation (CQRS); it's thus a good idea to familiarize yourself with CQRS patterns.

Using json schema within pact for contract testing

I started experimenting with Pact a while ago, and I'm wondering if any of you has any idea whether Pact supports JSON schemas.
I'll describe the flow. Suppose you have two microservices:
Microservice A - the consumer
Microservice B - the Provider
The provider exposes an API, basically a POST request, and expects an explicit JSON body payload (some fields are required).
Let's say:
{
  "id": "123",
  "name": "Bob"
}
Both id and name are required properties.
Suppose now that the provider changed its behaviour to expect the same JSON payload but with an additional property:
{
  "id": "123",
  "name": "Bob",
  "extraProperty": "newProperty"
}
My question is: is there any way to detect such a breaking contract using the Pact framework? If not, what do you think is the best way to test such a scenario?
Thanks for the help.
My question is: is there any way to detect such a breaking contract using the Pact framework?
Of course - this is what Pact is designed to do!
If extraProperty is a new required field (in addition to id and name), then when the provider tests run against the contracts generated by the consumer(s) that don't know about the new field, your API will not behave as they expect and your Pact tests will fail.
As to your other question:
I'm wondering if any of you has any idea whether Pact supports JSON schemas.
I'm not sure what you mean - we certainly support JSON formats. If you mean JSON Schema then yes, but you still need to write the tests (see https://docs.pact.io/faq#why-doesnt-pact-use-json-schema).

HAL+JSON hypermedia type not a media type for REST?

Can the HAL+JSON hypermedia type be used in a way that creates a RESTful service?
From what I have read, the client of a RESTful API should not need to treat different resources with special cases. The media type should instead be used to describe what resources are expected to look like.
The HAL spec gives this example:
GET /orders
{
  ...
  "shippedToday": 20,
  ...
}
As a client of this sample HAL+JSON-serving API, I seem to need to know that an "order" has an attribute of shippedToday. That seems to go against the constraint that the client should not need to understand the representation's syntax.
This is not a critique of HAL. The question is to help my (and others') understanding of RESTful API design.
Can the HAL+JSON hypermedia type be used in a way that creates a RESTful service?
Yes, definitely.
The API should have a billboard URL, which in your case could be /.
This is the entry point from which humans and ideally even machines can start to discover your API.
According to the HAL specification, a resource's representation contains an optional property called "_links", which is described here:
It is an object whose property names are link relation types (as defined by RFC5988) and values are either a Link Object or an array of Link Objects.
So these links represent the hypermedia part of your API. The relations can be IANA-registered relations or you can use your own extension relations.
Relations should not be ambiguous; their names should be unique. That's why it is recommended to use URIs from your own domain as names for your own relations. These URIs identify a resource that represents the relation and contains the API documentation: a human- or machine-readable description of your relation.
In your case this would be a relation that describes the state transition to the /orders resource. This should also include a description and explanation of the response and therefore document that e.g. the /orders resource represents a list of orders and has a property called "shippedToday" with a value of type number.
Here is an example response for a GET / HTTP/1.1 request:
HTTP/1.1 200 OK
Content-Type: application/hal+json
{
  "_links": {
    "self": { "href": "/" },
    "http://yourdomain.com/docs/rels/orders": { "href": "/orders" }
  }
}
Under http://yourdomain.com/docs/rels/orders there should be the API docs.

REST standards for posting data

I am using Ember.
The model being posted by ember is
{
  "user": {
    "firstName": "Vivek",
    "lastName": "Muthal"
  }
}
Ember has wrapped the data into a "user" object.
But the service I have written accepts only {"firstName":"string","lastName":"string"}.
So my question is: do REST standards specify that data should be sent/received only in a wrapped object?
Any references would be appreciated, so I can change the service accordingly.
Or else I will modify Ember to use my current service as it is. Thanks.
I suppose that the resource is a User, so the JSON should represent a User. Let's say you have this URI schema:
GET /host/users/{userId}
PUT /host/users/{userId}
POST /host/users
When we do a GET, we expect JSON that represents a User:
{
  "firstName": "Vivek",
  "lastName": "Muthal"
}
There is no need to specify the resource name, because we already mentioned that in our GET request. The same goes for POST: there is no need to mention the resource name in the request body, because it is specified in the request URI. So no, there is no need for the user key.
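For illustration, a minimal service-side sketch (ASP.NET Core here purely as an example; your service could be anything) that accepts and returns the flat representation, with the resource identified only by the URI:

using Microsoft.AspNetCore.Mvc;

public record UserDto(string FirstName, string LastName);

[ApiController]
[Route("users")] // mirrors the GET/POST /host/users schema above (host prefix omitted)
public class UsersController : ControllerBase
{
    // GET /users/{userId}  ->  {"firstName":"Vivek","lastName":"Muthal"}
    [HttpGet("{userId:int}")]
    public ActionResult<UserDto> Get(int userId) => new UserDto("Vivek", "Muthal");

    // POST /users with body {"firstName":"...","lastName":"..."}; no "user" wrapper needed.
    [HttpPost]
    public IActionResult Create([FromBody] UserDto user)
        => CreatedAtAction(nameof(Get), new { userId = 1 }, user);
}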
No. There is no predefined format for the data you send in the body of your HTTP requests. Well ok, the HTTP RFCs do put technical limits on the data you send, but the formatting is entirely up to you.
As such, you can format your data however you want. You just need to represent the resource. You do need to consider whether the JSON for a user should clearly mark the object as a 'user' or not; I would consider it mostly redundant to do so.
REST defines a uniform interface constraint. This constraint states that you have to use standard solutions to create a uniform interface between the service and the clients. This way the clients will be decoupled from the implementation details of the service, because the standards we use are not maintained by the server...
To make it short, you can use any standard MIME type, or define a vendor-specific MIME type. I strongly suggest using an existing hypermedia format like JSON-LD+Hydra or HAL. I guess this kind of serialization is an Ember-specific thing.