Seperate or Merge Kafka Consumer and API services together - api

After recently reading about event-based architecture, I wanted to change my architecture into one making use of such strengths.
I have two services that expose an API (crud, graphql), each based around a different entity and using a different database.
However, now whenever someone deletes a certain type of row in service A, i need to delete a coupled row in Service B.
So I added Kafka to my design, and whenever I delete the entity in service A, it publishes a notification message into Kafka.
In service B I am currently consuming the same topic so whenever a new message is received the Service will also handle the deletion of the matching entity, because it already has access that table because the same service already exposes the CRUD API to users.
What i'm not sure about is whether putting the Kafka Consumer and the API together in the same service is a good design. It contradicts the point of single responsibility in micro services, and whether there is an issue in one part of the service, it will likely affect the second.
However, creating a new service will also cause me issues - i will have 2 different services accessing the same table, and i will have to make sure i always maintain them together, whenever making changes to the table or database.
What is the best practice in a incident such as this? Is it inevitable to have different services have data coupling or is it not so bad to use the same service for two, similiar usages.

There is nothing wrong with using Kafka... You could do the same with point-to-point service communication, however (JSON-RPC / gRPC), however.
The real problem you seem to be asking about is dual-writes or race-conditions leading to data inconsistency.
While you could use a single consumer group and one topic-partition to preserve order and locking across consumers interested in those events, that does not lock out other consumer-groups from interacting with the database to perform the same action. Therefore, Kafka itself won't help with this problem.
You'll need external, distributed locks (e.g. Zookeeper can be used here) that fence off your database clients while you are performing actions against it.
To the original question, Kafka Connect offers an API and is also a Producer and Consumer client (and would be recommended for database interactions). So is Confluent Schema Registry, KSQLdb, etc.

I believe that the consumer of your service B would not be considered "a service" or part of the "service", as in that it is not called as part the code which services requests. Yet it does provide functionality that is required for the domain function of your microservice. So yes I would consider the consumer part of the Microservice in terms of team/domain responsibility.
There may be different opinions on if the consumer code should share the same code base/repo as the "service" code. Some people believe that it is better to limit the repo scope to a single "executable", others believe it is beneficial to keep the domain scope and have everything in a single repo. I probably belong to the latter group but do not have a very strong opinion on it. I would argue it is more important to have a central documentation / wiki for the domain that will point to the repos involved etc.

Related

How to seperate two projects from one solution and communicate with each other?

I have a two project linked with each other, Project A and Project B in same VS solutions.
Now the requirement is, I have to do seperate projects, it means seperate DB, web site name, servers etc.
Now I want to know how I will communicate both the project each other using web API?
Before starting separate I would suggest reading about DDD (Domain Driven Design) to make the rights choices about what functionality and classes could be in one microservice or the other.
Anyway, once you have separated the two services properly, so the communication between them would be minimum, there are mainly two ways to communicate with each other.
One is using synchronous communication using HTTP calls using some HTTP client for your language, for example, in java-spring applications is very common to use Webclient. The pros of this approach is that you don't have to deal with eventual consistency because all happens in the same thread, the main con is that you are coupling the services and if one of them fails the other will fail too.
The second approach is to use asynchronous messaging using some message broker like kafka. With this approach when something happen in one of the service, it will publish the event in the message broker and it will be consumed by the other. Ther is no direct communication so there can't be cascade failures but the problem is that you don't know when exactly the message will be consumed (eventual consistency)
As you see both options have pros and cons and it depends on your use case to choose one or the other

Need help in selecting the right design pattern

We are into the lead business. We capture leads and pass it on to the clients based on some rules. integration to each client very in nature like nature of the API and in some cases, data mapping is also required. We perform the following steps in order to route leads to the client.
Select the client
Check if any client-specific mapping(master data) is required.
Send Lead to nearest available dealer(optional step)
Call client api to send lead
Update push status of the lead to database
Note that some of the steps can be optional.
Which design pattern would be suitable to solve this problem. The motive is to simplify integration to each client.
You'll want to isolate (and preferably externalize) the aspects that differ between clients, like the data mapping and API, and generalize as much as possible. One possible force to consider is how easily new clients and their APIs can be accommodated in the future.
I assume you have a lot of clients, and a database or other persistent mechanism that holds this client list, so data-driven routing logic that maps leads to clients shouldn't be a problem. The application itself should be as "dumb" as possible.
Data mapping is often easily described with meta-data, and also easily data-driven. Mapping meta-data is client specific, so it could easily be kept in your database associated with each client in XML or some other format. If the transformations to leads necessary to conform to specific APIs are very complex, the logic could be isolated through the use of a strategy pattern, with the specific strategy selected according to the target client. If an extremely large number of clients and APIs need to be accommodated, I'd bend over backwards to make the API data-driven as well. If you have just a few client types (say less than 20), I'd employ some distributed asynchronicity, and just have my application publish the lead and client info to a topic corresponding to client-type, and have subscribed external processors specific for each client-type do their thing and publish the results on another single queue. A consumer listing to the results queue would update the database.
I will divide your problem statement into three parts mentioned below:
1) Integration of API with different clients.
2) Perfom some steps in order to route leads to the client.
3) Update push status of the lead to database.
Design patterns involved in above three parts:
1) Integration of API with different clients - Integration to each client vary in nature like the nature of the API. It seems you have incompitable type of interface so, you should design this section by using "Adapter Design Pattern".
2) Perform some steps in order to route leads to the client- You have different steps of execution. Next step is based on the previous steps. So, you should design this section by using "State Design Pattern".
3) Update push status of the lead to database: This statement shows that you want to notify your database whenever push status of the lead happens so that information will be updated into database. So, you should design this section by using "Observer Design Pattern".
Sounds like this falls in the workflow realm.
If you're on Amazon Web Services, there's SWF, otherwise, there's a lot of workflow solutions out there for your favorite programming language.

NServicebus hierarchy and structure

I am just starting out learning NServicebus (and SOA in general) and have a few questions and points I need clarification on regarding how the solution is typically structured and common best practices:
The documentation doesn't really explain what an endpoint is. From what I gather it is a unit of
deployment and your service will have 1 or more endpoints. Is this correct?
Is it considered best practice to have one VS solution per Service you are developing? With a project for messages, then a project for each endpoint, and finally a project that is shared with the endpoints containing your domain layer?
From what I read, services are usually comprised of individual components. Can (or should) any component in the service access the same database, or should it be one database per component?
Thanks for any clarification or insight one can offer.
I will try and answer your questions the best i can...
I'm not sure about the term "Best Practices" i would consider the term "Best Thinking" or "Paradigm"
Q1: yes, an endpoint is a effectively a deployed process that consumes the messages of it's queue.
It's dose not have to belong to a single "Service" (logical) (In the case of a web endpoint for example), an endpoint can have one or more handlers deployed to it.
Q2: I would go with one solution (and later repo) per logical domain service, Inside a solution I would create a project per message handler, because as you scale you will need to move you handler between endpoints, or to their own endpoint depending on scale. Messages however are contracts, so i would put them in a solution/s maybe split commands and events. You may consider something like nuget to publish your message packages.
Q3: A "Service" is a logical composition of autonomous components, each is a vertical slice of functionality, so they can share the same database, but i would say that only one component has the authority to modify it's own data. I would always try to think what would happen when you need to scale.
Dose this make sense?

Two wcf servers vs a wcf server with callback

I have got two applications that need to communicate via WCF:
Called A and B.
A suppose to push values to B for storage/update
B suppose to push a list of values stored in it to A
the senior programmer in my team wants to open a WCF server at A and another WCF server at B.
I claim that one should be the server and the other should be the client and use server call back In order to avoid splitting the interface into two, avoid circular dependency, and duplication of code settings. he doesn't understand it. can anyone help me explain him why his solution is bad code?
It depends on your criteria. Let's assume a client/server model where A is the client and B is the server. You state that B should "push" data to A.
If you truly need push then you should make B into a duplex server. This does put some strain on your bandwith, so if you have a bandwith restriction, this might not be the right choice.
If you can incur some delay at A than you might want to opt for a polling mechanism of your own (maybe based on timing, or some other logic).
If both are not an option, you can try and swap roles. So then make B the client and A the server. It's les intuitive, but it might fit your scenario. If you can incur a delay on storing data, make B poll A for changes in the data and save at an interval.
If there can be no delay in both and bandwith is limited, you do end up with two WCF services. Altough it may look silly at first glance, keep in mind they are services and not servers. It does make things a bit more complex, so I would keep it as a last resort.
A service should encapsulate a set of functionality that other applications can consume. All it does is wait for and respond to requests from other components, it doesn't initiate actions by itself.
If Application B is storing data, then it can of course be provided to Application A as a service. It provides the "service" of storing data without application A having to worry about how or where, and returns successfully stored data. This is exactly the kind of thing that WCF Services are meant to handle.
I am assuming that application A is the one initiating the requests (unless you have an unmentioned 3rd application, one of them must be the initiator). If Application A is initiating actions (for example, it has a UI, or is triggered to do some batch processing etc.) then it should not be modeled as a "service".
I hope that helps :)

SOA and WCF design questions: Is this an unusual system design?

I have found myself responsible for carrying on the development of a system which I did not originally design and can't ask the original designers why certain design decisions were taken, as they are no longer here. I am a junior developer on design issues so didn't really know what to ask when I started on the project which was my first SOA / WCF project.
The system has 7 WCF services, will grow to 9, each self-hosted in a seperate console app/windows service. All of them are single instance and single threaded. All services have the same OperationContract: they expose a Register() and Send() method. When client services want to connect to another service, they first call Register(), then if successful they do all the rest of their communication with Send(). We have a DataContract that has an enum MessageType and a Content propety which can contain other DataContract "payloads." What the service does with the message is determined by the enum MessageType...everything comes through the Send() method and then gets routed to a switch statement...I suspect this is unusual
Register() and Send() are actually OneWay and Async...ALL results from services are returned to client services by a WCF CallbackContract. I believe that the reson for using CallbackContracts is to facilitate the Publish-Subscribe model we are using. The problem is not all of our communication fits publish-subscribe and using CallbackContracts means we have to include source details in returned result messages so clients can work out what the returned results were originally for...again clients have a switch statements to work out what to do with messages arriving from services based on the MessageType (and other embedded details).
In terms of topology: the services form "nodes" in a graph. Each service has hardcoded a list of other services it must connect to when it starts, and wont allow client services to "Register" with it until is has made all of the connections it needs. As an example, we have a LoggingService and a DataAccessService. The DataAccessSevice is a client of the LoggingService and so the DataAccess service will attempt to Register with the LoggingService when it starts. Until it can successfully Register the DataAccess service will not allow any clients to Register with it. The result is that when the system is fired up as a whole the services start up in a cascadeing manner. I don't see this as an issue, but is this unusual?
To make matters more complex, one of the systems requirements is that services or "nodes" do not need to be directly registered with one another in order to send messages to one another, but can communicate via indirect links. For example, say we have 3 services A, B and C connected in a chain, A can send a message to C via B...using 2 hops.
I was actually tasked with this and wrote the routing system, it was fun, but the lead left before I could ask why it was really needed. As far as I can see, there is no reason why services cannot just connect direct to the other services they need. Whats more I had to write a reliability system on top of everything as the requirement was to have reliable messaging across nodes in the system, wheras with simple point-to-point links WCF reliabily does the job.
Prior to this project I had only worked on winforms desktop apps for 3 years, do didn't know any better. My suspicions are things are overcomplicated with this project: I guess to summarise, my questions are:
1) Is this idea of a graph topology with messages hopping over indirect links unusual? Why not just connect services directly to the services that they need to access (which in reality is what we do anyway...I dont think we have any messages hopping)?
2) Is exposing just 2 methods in the OperationContract and using the a MessageType enum to determine what the message is for/what to do with it unusual? Shouldnt a WCF service expose lots of methods with specific purposes instead and the client chooses what methods it wants to call?
3) Is doing all communication back to a client via CallbackContracts unusual. Surely sync or asyc request-response is simpler.
4) Is the idea of a service not allowing client services to connect to it (Register) until it has connected to all of its services (to which it is a client) a sound design? I think this is the only design aspect I agree with, I mean the DataAccessService should not accept clients until it has a connection with the logging service.
I have so many WCF questions, more will come in later threads. Thanks in advance.
Well, the whole things seems a bit odd, agreed.
All of them are single instance and
single threaded.
That's definitely going to come back and cause massive performance headaches - guaranteed. I don't understand why anyone would want to write a singleton WCF service to begin with (except for a few edge cases, where it does make sense), and if you do have a singleton WCF service, to get any decent performance, it must be multi-threaded (which is tricky programming, and is why I almost always advise against it).
All services have the same
OperationContract: they expose a
Register() and Send() method.
That's rather odd, too. So anyone calling will first .Register(), and then call .Send() with different parameters several times?? Funny design, really.... The SOA assumption is that you design your services to be the model of a set of functionality you want to expose to the outside world, e.g. your CustomerService might have methods like GetCustomerByID, GetAllCustomersByCountry, etc. methods - depending on what you need.
Having just a single Send() method with parameters which define what is being done seems a bit.... unusual and not very intuitive / clear.
Is this idea of a graph topology with
messages hopping over indirect links
unusual?
Not necessarily. It can make sense to expose just a single interface to the outside world, and then use some internal backend services to do the actual work. .NET 4 will actually introduce a RoutingService in WCF which makes these kind of scenarios easier. I don't think this is a big no-no.
Is doing all communication back to a
client via CallbackContracts unusual.
Yes, unusual, fragile, messy - if you can ever do without it - go for it. If you have mostly simple calls, like GetCustomerByID - make those a standard Request/Response call - the client requests something (by supplying a Customer ID) and gets back a Customer object as a return value. Much much simpler!
If you do have long-running service calls, that might take minutes or more to complete - then you might consider One-Way calls which just deposit a request into a queue, and that request gets handled later on. Typically, here, you can either deposit the answer into a response queue which the client then checks, or you can have two additional service methods which give you the status of a request (is it done yet?) and a second method to retrieve the result(s) of that request.
Hope that helps to get you started !
All services have the same OperationContract: they expose a Register() and Send() method.
Your design seems unusual at some parts specially exposing only two operations. I haven't worked with WCF, we use Java. But based on my understanding the whole purpose of Web Services is to expose Operations that your partners can utilise.
Having only two Operations looks like odd design to me. You generally expose your API using WSDL. In this case the WSDL would add nothing of value to the partners, unless you have lot of documentation. Generally the operation name should be self-explanatory. Right now your system cannot be used by partners without having internal knowledge.
Is doing all communication back to a client via CallbackContracts unusual. Surely sync or asyc request-response is simpler.
Agree with you. Async should only be used for long running processes. Async adds the overhead of correlation.