Is it better to have one or two REST endpoints for this scenario? - api

I have a microservice that handles the booking process. Half of the data lives in a bookings table in the database and half in an external 3rd-party service that is required to complete the booking.
From a REST perspective, is it better to expose two endpoints on the same microservice, one for each case, i.e.
POST /bookings (pass relevant data)
POST /external-service/bookings (pass relevant data)
or one
POST /bookings (pass all data for both cases)
that creates a new record in the bookings db table and also talks to the external service API to complete the booking?
Personally I am leaning towards the second approach.

From a REST perspective, it is generally recommended to have a single endpoint for each resource; in this case, the booking is the resource.
The second approach, where you have one endpoint for creating a booking and it handles both saving to the database and interacting with the external service, would be more consistent with REST principles.
POST /bookings (pass all data for both cases)
It allows for better separation of concerns as the microservice is responsible for handling the booking resource, and it doesn't expose the details of how the booking is completed to the client. Additionally, it eliminates the need for the client to make multiple requests to different endpoints to complete a single action.
However, this approach may make testing and debugging more challenging if the external service is unavailable or needs to change.
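For illustration, here is a minimal sketch of the single-endpoint approach, using hypothetical names (BookingService, BookingRepository, ExternalBookingClient) that are not from the question:

```java
// A minimal sketch of the single-endpoint approach: one handler that both
// persists the booking and calls the third-party service. All names here
// are hypothetical.
import java.util.UUID;

public class BookingService {

    // Abstractions over the two halves of the data; implementations are not shown.
    interface BookingRepository {
        void save(UUID bookingId, String localDetails);
        void markConfirmed(UUID bookingId, String externalReference);
    }

    interface ExternalBookingClient {
        // Returns the third party's reference for the completed booking.
        String complete(UUID bookingId, String externalDetails) throws Exception;
    }

    private final BookingRepository repository;
    private final ExternalBookingClient externalClient;

    public BookingService(BookingRepository repository, ExternalBookingClient externalClient) {
        this.repository = repository;
        this.externalClient = externalClient;
    }

    // Invoked by the single POST /bookings endpoint with the full payload.
    public UUID createBooking(String localDetails, String externalDetails) throws Exception {
        UUID bookingId = UUID.randomUUID();
        repository.save(bookingId, localDetails);                                 // half of the data goes to the bookings table
        String reference = externalClient.complete(bookingId, externalDetails);   // the other half goes to the 3rd-party service
        repository.markConfirmed(bookingId, reference);                           // record the outcome so failures are visible
        return bookingId;
    }
}
```

If the external call fails after the local insert, you would still need some compensation or retry strategy, which is essentially the caveat mentioned above.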

Related

Handling Large Requests to API

Are there best practices for how to pass large lists between services? I see some recommendations to pass S3 file URLs between services if the payloads can be large, but that seems like a step backwards because if the data is in S3 then the client can't use the server's API schema to validate the request as easily as if the data were passed in a list.
I can't process the data in small batches because it all needs to be processed at once.
Example:
Service B has API 1.
API 1's job is to receive a list of cars and, once all cars are received, take some action on each car. All cars need to be acted on; it's not OK to take the action on only some cars.
Service A wants to send Service B 400,000 cars to store in Service B's database.
Should Service B structure API 1 so that it expects:
A list of cars
A URL for an S3 file that contains a list of cars
Something else
It hugely depends on what the actual requirements and constraints are. Sending a large amount of data via an API in one go is usually not a good idea for multiple reasons (network interruptions, memory consumption, etc.). If you don't have transactional requirements, you can just send the data in small batches and (re)design the API to support that. You could also consider switching entirely from synchronous API calls to asynchronous ones, for example using a messaging pipeline (such as Kafka).
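To make the batching suggestion concrete, here is a rough sketch of the sender side; the Car type, batch size, and CarsApiClient interface are all assumptions, not part of the original question:

```java
// A rough sketch of the batching idea: instead of one 400,000-element request,
// the sender splits the list into fixed-size chunks tied together by an uploadId.
import java.util.List;

public class BatchedSender {

    public record Car(String vin) {}

    interface CarsApiClient {
        // Hypothetical endpoint, e.g. POST /cars/batches on Service B.
        void sendBatch(String uploadId, List<Car> batch, boolean lastBatch);
    }

    private static final int BATCH_SIZE = 1_000;

    public static void sendAll(CarsApiClient client, String uploadId, List<Car> cars) {
        for (int start = 0; start < cars.size(); start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, cars.size());
            boolean last = end == cars.size();
            // The receiver can stage batches and only act once the final batch arrives.
            client.sendBatch(uploadId, cars.subList(start, end), last);
        }
    }
}
```

The receiver could stage batches under the uploadId and act only once the final batch arrives, which keeps the "all cars or none" requirement from the question intact.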

Should API delegate all work to other services?

Suppose we have an API server with several endpoints that serves user requests. I have been wondering what is a good trade-off between implementing logic in the API server vs. delegating to other microservices.
For example, suppose we want to fetch data from a database upon an API call. Should the database query:
Be performed by the API itself?
Delegated to a separate microservice that handles database queries?
Delegated to something even simpler than a microservice, say a lambda function in the cloud?
Thanks for help.
In the case of microservices, the service itself is the owner of its own data. This implies two things:
This service is the only application which has direct access to its own data
If ServiceB wants to perform any operation on this data, it has to do so via ServiceA's API (not directly via the database)
If ServiceB needs to retrieve data frequently and the data is fairly static then the ServiceB could implement a local cache via replication. But the source of truth still remains at ServiceA.
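A minimal sketch of that ownership rule, with invented names (ServiceAClient, fetchCustomer) standing in for a real API client, might look like this:

```java
// ServiceB never reads ServiceA's database; it goes through ServiceA's API and
// may cache the result locally. The client interface and TTL are assumptions.
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ServiceBDataAccess {

    interface ServiceAClient {
        String fetchCustomer(String customerId); // e.g. GET /customers/{id} on ServiceA
    }

    private record CachedValue(String value, Instant fetchedAt) {}

    private final ServiceAClient serviceA;
    private final Map<String, CachedValue> cache = new ConcurrentHashMap<>();
    private final Duration ttl = Duration.ofMinutes(10); // fairly static data, so a generous TTL

    public ServiceBDataAccess(ServiceAClient serviceA) {
        this.serviceA = serviceA;
    }

    public String getCustomer(String customerId) {
        CachedValue cached = cache.get(customerId);
        if (cached != null && cached.fetchedAt().isAfter(Instant.now().minus(ttl))) {
            return cached.value(); // local replica; ServiceA remains the source of truth
        }
        String fresh = serviceA.fetchCustomer(customerId);
        cache.put(customerId, new CachedValue(fresh, Instant.now()));
        return fresh;
    }
}
```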
Afaik microservices split up the problem horizontally, and what you are talking about is splitting it up vertically. Of course it is possible to combine both approaches and have multiple microservices, with each (or some) of them split into smaller services in different layers. This can be both an architectural and a scaling decision, so it is better to check the numbers: what kind of load you expect, what response time you need, and how much you want to spend on it, imho. So better not to solve problems which are non-existent at the moment; maybe in the future you will have them, or not...

Separate or Merge Kafka Consumer and API services together

After recently reading about event-based architecture, I wanted to change my architecture into one making use of such strengths.
I have two services that expose an API (crud, graphql), each based around a different entity and using a different database.
However, now whenever someone deletes a certain type of row in Service A, I need to delete a coupled row in Service B.
So I added Kafka to my design, and whenever I delete the entity in service A, it publishes a notification message into Kafka.
In Service B I am currently consuming the same topic, so whenever a new message is received the service also handles the deletion of the matching entity, because it already has access to that table (the same service already exposes the CRUD API to users).
What I'm not sure about is whether putting the Kafka consumer and the API together in the same service is a good design. It contradicts the principle of single responsibility in microservices, and if there is an issue in one part of the service, it will likely affect the other.
However, creating a new service will also cause me issues: I will have two different services accessing the same table, and I will have to make sure I always maintain them together whenever making changes to the table or database.
What is the best practice in a situation such as this? Is it inevitable for different services to have data coupling, or is it not so bad to use the same service for two similar purposes?
There is nothing wrong with using Kafka... You could do the same with point-to-point service communication (JSON-RPC / gRPC), however.
The real problem you seem to be asking about is dual-writes or race-conditions leading to data inconsistency.
While you could use a single consumer group and one topic-partition to preserve order and locking across consumers interested in those events, that does not lock out other consumer-groups from interacting with the database to perform the same action. Therefore, Kafka itself won't help with this problem.
You'll need external, distributed locks (e.g. Zookeeper can be used here) that fence off your database clients while you are performing actions against it.
To the original question, Kafka Connect offers an API and is also a Producer and Consumer client (and would be recommended for database interactions). So is Confluent Schema Registry, KSQLdb, etc.
I believe that the consumer of your Service B would not be considered "a service" or part of the "service", in the sense that it is not called as part of the code which serves requests. Yet it does provide functionality that is required for the domain function of your microservice. So yes, I would consider the consumer part of the microservice in terms of team/domain responsibility.
There may be different opinions on whether the consumer code should share the same code base/repo as the "service" code. Some people believe that it is better to limit the repo scope to a single "executable", others believe it is beneficial to keep the domain scope and have everything in a single repo. I probably belong to the latter group but do not have a very strong opinion on it. I would argue it is more important to have a central documentation / wiki for the domain that will point to the repos involved etc.
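For concreteness, a minimal sketch of the consumer side described in the question, assuming a hypothetical topic name and repository interface, with error handling and retries omitted:

```java
// Service B listens for deletion events published by Service A and removes the
// coupled row. Topic name, group id and the repository interface are invented.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class EntityDeletionConsumer {

    interface EntityBRepository {
        void deleteByEntityAId(String entityAId); // idempotent: deleting a missing row is a no-op
    }

    public static void run(EntityBRepository repository) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "service-b-deletion-handler");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("entity-a-deleted")); // hypothetical topic published by Service A
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // The message value is assumed to carry the id of the deleted entity.
                    repository.deleteByEntityAId(record.value());
                }
            }
        }
    }
}
```

Whether this class lives in Service B's existing codebase or in its own deployable is exactly the organizational question discussed above; the code itself is the same either way.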

Need help in selecting the right design pattern

We are in the lead business. We capture leads and pass them on to clients based on some rules. Integration with each client varies in nature, e.g. the nature of the API, and in some cases data mapping is also required. We perform the following steps in order to route leads to the client.
Select the client
Check if any client-specific mapping(master data) is required.
Send the lead to the nearest available dealer (optional step)
Call the client API to send the lead
Update the push status of the lead in the database
Note that some of the steps can be optional.
Which design pattern would be suitable to solve this problem? The motive is to simplify integration with each client.
You'll want to isolate (and preferably externalize) the aspects that differ between clients, like the data mapping and API, and generalize as much as possible. One possible force to consider is how easily new clients and their APIs can be accommodated in the future.
I assume you have a lot of clients, and a database or other persistent mechanism that holds this client list, so data-driven routing logic that maps leads to clients shouldn't be a problem. The application itself should be as "dumb" as possible.
Data mapping is often easily described with metadata, and so is also easily data-driven. Mapping metadata is client-specific, so it could easily be kept in your database, associated with each client, in XML or some other format. If the transformations to leads necessary to conform to specific APIs are very complex, the logic could be isolated through the use of a strategy pattern, with the specific strategy selected according to the target client. If an extremely large number of clients and APIs need to be accommodated, I'd bend over backwards to make the API data-driven as well. If you have just a few client types (say fewer than 20), I'd employ some distributed asynchronicity: have the application publish the lead and client info to a topic corresponding to the client type, have subscribed external processors specific to each client type do their thing and publish the results on another single queue, and have a consumer listening to the results queue update the database.
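A small sketch of that strategy idea, with all type and method names invented for illustration:

```java
// Each client type gets its own lead-delivery strategy, selected at runtime
// from client data, so the routing code stays "dumb" and data-driven.
import java.util.Map;

public class LeadRouter {

    public record Lead(String name, String phone) {}

    // The varying part: how a lead is mapped and pushed to a particular client's API.
    interface ClientDeliveryStrategy {
        boolean deliver(Lead lead);
    }

    private final Map<String, ClientDeliveryStrategy> strategiesByClientType;

    public LeadRouter(Map<String, ClientDeliveryStrategy> strategiesByClientType) {
        this.strategiesByClientType = strategiesByClientType;
    }

    public boolean route(String clientType, Lead lead) {
        ClientDeliveryStrategy strategy = strategiesByClientType.get(clientType);
        if (strategy == null) {
            throw new IllegalArgumentException("No delivery strategy registered for " + clientType);
        }
        boolean pushed = strategy.deliver(lead); // client-specific mapping + API call
        // ...update the lead's push status in the database here...
        return pushed;
    }
}
```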
I will divide your problem statement into three parts mentioned below:
1) Integration of API with different clients.
2) Perform some steps in order to route leads to the client.
3) Update push status of the lead to database.
Design patterns involved in above three parts:
1) Integration of the API with different clients: integration with each client varies in nature, e.g. the nature of the API. It seems you have incompatible types of interface, so you should design this section using the Adapter design pattern.
2) Perform some steps in order to route leads to the client: you have different steps of execution, and the next step depends on the previous ones, so you should design this section using the State design pattern.
3) Update the push status of the lead in the database: this shows that you want your database to be notified whenever the push status of a lead changes so that the information is updated, so you should design this section using the Observer design pattern.
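As a brief illustration of part 1, an adapter could wrap each client's incompatible API behind one common interface; CrmSoapClient and its method below are invented stand-ins for a real client library:

```java
// Wrap each client's incompatible SDK/API behind one common interface the
// lead-routing code can use. All names are made up for illustration.
public class ClientAdapters {

    // The common target interface the lead-routing code works against.
    interface LeadReceiver {
        void sendLead(String leadJson);
    }

    // Imaginary third-party client with a different shape.
    static class CrmSoapClient {
        void submitProspect(String xmlPayload) { /* calls the client's SOAP API */ }
    }

    // Adapter translating our interface (and data format) to the client's.
    static class CrmSoapAdapter implements LeadReceiver {
        private final CrmSoapClient client = new CrmSoapClient();

        @Override
        public void sendLead(String leadJson) {
            String xml = "<prospect>" + leadJson + "</prospect>"; // placeholder mapping
            client.submitProspect(xml);
        }
    }
}
```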
Sounds like this falls in the workflow realm.
If you're on Amazon Web Services there's SWF; otherwise, there are plenty of workflow solutions out there for your favorite programming language.

How to organize stateful WCF web services with custom session handling

Consider a company with a large set of functionality to be exposed via web services. A subset of this functionality is used for building up some very complex and computation intensive scenarios, and requires a session to be maintained during this iterative build-up. Each scenario targets one single base structure, representing, say a single customer. That is, a scenario is a series of heavy operations on a single customer structure. The operations can be grouped by which area they target, but basically all operations in the same scenario roots in the same customer structure.
The following decision is given from the outside and cannot be altered: an already-made custom session handler must be used, which basically operates on a session given a simple GUID token sent to/from the client. Therefore, from a technical perspective the session need not be limited to a single service, but can live across multiple services.
Besides the stateful operations, there is also a number of stateless operations.
Given the above decision about the custom session handler, the question is now: how should all these operations be organized? What organization is most elegant?
Some possibilities:
All stateful operations are gathered in one single stateful service, while all stateless operations are grouped into an arbitrary set of services, possibly by the area they target. Possible problem: the single stateful service can become very large.
Both stateful and stateless operations are grouped into smaller services, but stateful and stateless operations are still separated so that no service contains both. Possibly, all session establishment and finalization can be put in a separate thin dedicated service, say SessionService. With this approach we have no huge single stateful service. But is the organization elegant? Why force a strict separation of the stateful and stateless operations at all?
Group all operations by their target, ignoring their statefulness. This gives a number of services with mixed stateful and stateless operations. The former can take the session GUID token as an input argument, and a service behavior can take care of automatically handling the session establishment, given some appropriate naming convention for the session key/token.
Similar to above, a separate dedicated service can take care of session establishment and finalization.
Something else, not mentioned above?
Which organization is the most elegant?
I have implemented what is basically your third option. We have 20+ operations, some of which check the request for a SessionID (GUID), some of which do not (like Ping() and Login(DeviceID)). All the session handling is "custom" in the sense that we're not using WCF persistent sessions; rather, we've created a Login() function that takes a GUID, ID, and Password from client requests for authentication.
On Login(), we validate the DeviceID, UserID and Pwd against the DB, creating a row on the Session table containing StartTime (session only good for <8 hrs) and DeviceID. We then send back to the client a SessionID (GUID) that he/she uses in all subsequent connections, whether uploading or downloading.
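The answer above is WCF/.NET, but the token flow it describes is simple enough to sketch in a language-agnostic way; here is one rendering in Java, where the SessionStore interface and all names are invented, and only the GUID token and the 8-hour window come from the answer itself:

```java
// Login() stores a session row and returns a GUID; every session-bound
// operation re-validates that GUID before doing work.
import java.time.Duration;
import java.time.Instant;
import java.util.UUID;

public class SessionHandler {

    interface SessionStore {
        void insert(UUID sessionId, String deviceId, Instant startTime);
        Instant findStartTime(UUID sessionId); // null if the session does not exist
    }

    private static final Duration MAX_AGE = Duration.ofHours(8);
    private final SessionStore store;

    public SessionHandler(SessionStore store) {
        this.store = store;
    }

    // Called by Login() after the device id / user / password check succeeds.
    public UUID createSession(String deviceId) {
        UUID sessionId = UUID.randomUUID();
        store.insert(sessionId, deviceId, Instant.now());
        return sessionId; // sent back to the client and echoed on every later call
    }

    // Called at the top of every session-bound operation.
    public boolean isValid(UUID sessionId) {
        Instant startTime = store.findStartTime(sessionId);
        return startTime != null && startTime.isAfter(Instant.now().minus(MAX_AGE));
    }
}
```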
As far as organization is concerned, the subs and methods are organized by the device type (iOS, PC, Android) and the type of operation, just to keep the apples from the oranges. But each function that's Session-related always authenticates the request, validating the inbound SessionID. That may seem wasteful, checking each session with each request (again and again), but because we're using BasicHTTPBinding, we're forced to use a stateless model.