I have read articles about the differences between SOAP and REST as a web service communication protocol, but I think that the biggest advantages for REST over SOAP are:
REST is more dynamic, no need to create and update UDDI(Universal Description, Discovery, and Integration).
REST is not restricted to only XML format. RESTful web services can send plain text/JSON/XML.
But SOAP is more standardized (E.g.: security).
So, am I correct in these points?
Unfortunately, there are a lot of misinformation and misconceptions around REST. Not only your question and the answer by #cmd reflect those, but most of the questions and answers related to the subject on Stack Overflow.
SOAP and REST can't be compared directly, since the first is a protocol (or at least tries to be) and the second is an architectural style. This is probably one of the sources of confusion around it, since people tend to call REST any HTTP API that isn't SOAP.
Pushing things a little and trying to establish a comparison, the main difference between SOAP and REST is the degree of coupling between client and server implementations. A SOAP client works like a custom desktop application, tightly coupled to the server. There's a rigid contract between client and server, and everything is expected to break if either side changes anything. You need constant updates following any change, but it's easier to ascertain if the contract is being followed.
A REST client is more like a browser. It's a generic client that knows how to use a protocol and standardized methods, and an application has to fit inside that. You don't violate the protocol standards by creating extra methods, you leverage on the standard methods and create the actions with them on your media type. If done right, there's less coupling, and changes can be dealt with more gracefully. A client is supposed to enter a REST service with zero knowledge of the API, except for the entry point and the media type. In SOAP, the client needs previous knowledge on everything it will be using, or it won't even begin the interaction. Additionally, a REST client can be extended by code-on-demand supplied by the server itself, the classical example being JavaScript code used to drive the interaction with another service on the client-side.
I think these are the crucial points to understand what REST is about, and how it differs from SOAP:
REST is protocol independent. It's not coupled to HTTP. Pretty much like you can follow an ftp link on a website, a REST application can use any protocol for which there is a standardized URI scheme.
REST is not a mapping of CRUD to HTTP methods. Read this answer for a detailed explanation on that.
REST is as standardized as the parts you're using. Security and authentication in HTTP are standardized, so that's what you use when doing REST over HTTP.
REST is not REST without hypermedia and HATEOAS. This means that a client only knows the entry point URI and the resources are supposed to return links the client should follow. Those fancy documentation generators that give URI patterns for everything you can do in a REST API miss the point completely. They are not only documenting something that's supposed to be following the standard, but when you do that, you're coupling the client to one particular moment in the evolution of the API, and any changes on the API have to be documented and applied, or it will break.
REST is the architectural style of the web itself. When you enter Stack Overflow, you know what a User, a Question and an Answer are, you know the media types, and the website provides you with the links to them. A REST API has to do the same. If we designed the web the way people think REST should be done, instead of having a home page with links to Questions and Answers, we'd have a static documentation explaining that in order to view a question, you have to take the URI stackoverflow.com/questions/<id>, replace id with the Question.id and paste that on your browser. That's nonsense, but that's what many people think REST is.
This last point can't be emphasized enough. If your clients are building URIs from templates in documentation and not getting links in the resource representations, that's not REST. Roy Fielding, the author of REST, made it clear on this blog post: REST APIs must be hypertext-driven.
With the above in mind, you'll realize that while REST might not be restricted to XML, to do it correctly with any other format you'll have to design and standardize some format for your links. Hyperlinks are standard in XML, but not in JSON. There are draft standards for JSON, like HAL.
Finally, REST isn't for everyone, and a proof of that is how most people solve their problems very well with the HTTP APIs they mistakenly called REST and never venture beyond that. REST is hard to do sometimes, especially in the beginning, but it pays over time with easier evolution on the server side, and client's resilience to changes. If you need something done quickly and easily, don't bother about getting REST right. It's probably not what you're looking for. If you need something that will have to stay online for years or even decades, then REST is for you.
REST vs SOAP is not the right question to ask.
REST, unlike SOAP is not a protocol.
REST is an architectural style and a design for network-based software architectures.
REST concepts are referred to as resources. A representation of a resource must be stateless. It is represented via some media type. Some examples of media types include XML, JSON, and RDF. Resources are manipulated by components. Components request and manipulate resources via a standard uniform interface. In the case of HTTP, this interface consists of standard HTTP ops e.g. GET, PUT, POST, DELETE.
#Abdulaziz's question does illuminate the fact that REST and HTTP are often used in tandem. This is primarily due to the simplicity of HTTP and its very natural mapping to RESTful principles.
Fundamental REST Principles
Client-Server Communication
Client-server architectures have a very distinct separation of concerns. All applications built in the RESTful style must also be client-server in principle.
Stateless
Each client request to the server requires that its state be fully represented. The server must be able to completely understand the client request without using any server context or server session state. It follows that all state must be kept on the client.
Cacheable
Cache constraints may be used, thus enabling response data to be marked as cacheable or not-cacheable. Any data marked as cacheable may be reused as the response to the same subsequent request.
Uniform Interface
All components must interact through a single uniform interface. Because all component interaction occurs via this interface, interaction with different services is very simple. The interface is the same! This also means that implementation changes can be made in isolation. Such changes, will not affect fundamental component interaction because the uniform interface is always unchanged. One disadvantage is that you are stuck with the interface. If an optimization could be provided to a specific service by changing the interface, you are out of luck as REST prohibits this. On the bright side, however, REST is optimized for the web, hence incredible popularity of REST over HTTP!
The above concepts represent defining characteristics of REST and differentiate the REST architecture from other architectures like web services. It is useful to note that a REST service is a web service, but a web service is not necessarily a REST service.
See this blog post on REST Design Principles for more details on REST and the above stated bullets.
EDIT: update content based on comments
SOAP (Simple Object Access Protocol) and REST (Representation State Transfer) both are beautiful in their way. So I am not comparing them. Instead, I am trying to depict the picture, when I preferred to use REST and when SOAP.
What is payload?
When data is sent over the Internet, each unit transmitted includes both header information and the actual data being sent. The header identifies the source and destination of the packet, while the actual data is referred to as the payload. In general, the payload is the data that is carried on behalf of an application and the data received by the destination system.
Now, for example, I have to send a Telegram and we all know that the cost of the telegram will depend on some words.
So tell me among below mentioned these two messages, which one is cheaper to send?
<name>Arin</name>
or
"name": "Arin"
I know your answer will be the second one although both representing the same message second one is cheaper regarding cost.
So I am trying to say that, sending data over the network in JSON format is cheaper than sending it in XML format regarding payload.
Here is the first benefit or advantages of REST over SOAP. SOAP only support XML, but REST supports different format like text, JSON, XML, etc. And we already know, if we use Json then definitely we will be in better place regarding payload.
Now, SOAP supports the only XML, but it also has its advantages.
Really! How?
SOAP relies on XML in three ways
Envelope – that defines what is in the message and how to process it.
A set of encoding rules for data types, and finally the layout of the procedure calls and responses gathered.
This envelope is sent via a transport (HTTP/HTTPS), and an RPC (Remote Procedure Call) is executed, and the envelope is returned with information in an XML formatted document.
The important point is that one of the advantages of SOAP is the use of the “generic” transport but REST uses HTTP/HTTPS. SOAP can use almost any transport to send the request but REST cannot. So here we got an advantage of using SOAP.
As I already mentioned in above paragraph “REST uses HTTP/HTTPS”, so go a bit deeper on these words.
When we are talking about REST over HTTP, all security measures applied HTTP are inherited, and this is known as transport level security and it secures messages only while it is inside the wire but once you delivered it on the other side you don’t know how many stages it will have to go through before reaching the real point where the data will be processed. And of course, all those stages could use something different than HTTP.So Rest is not safer completely, right?
But SOAP supports SSL just like REST additionally it also supports WS-Security which adds some enterprise security features. WS-Security offers protection from the creation of the message to it’s consumption. So for transport level security whatever loophole we found that can be prevented using WS-Security.
Apart from that, as REST is limited by it's HTTP protocol so it’s transaction support is neither ACID compliant nor can provide two-phase commit across distributed transnational resources.
But SOAP has comprehensive support for both ACID based transaction management for short-lived transactions and compensation based transaction management for long-running transactions. It also supports two-phase commit across distributed resources.
I am not drawing any conclusion, but I will prefer SOAP-based web service while security, transaction, etc. are the main concerns.
Here is the "The Java EE 6 Tutorial" where they have said A RESTful design may be appropriate when the following conditions are met. Have a look.
Hope you enjoyed reading my answer.
REST(REpresentational State Transfer)
REpresentational State of an Object is Transferred is REST i.e. we don't send Object, we send state of Object.
REST is an architectural style. It doesn’t define so many standards like SOAP. REST is for exposing Public APIs(i.e. Facebook API, Google Maps API) over the internet to handle CRUD operations on data. REST is focused on accessing named resources through a single consistent interface.
SOAP(Simple Object Access Protocol)
SOAP brings its own protocol and focuses on exposing pieces of application logic (not data) as services. SOAP exposes operations. SOAP is focused on accessing named operations, each operation implement some business logic. Though SOAP is commonly referred to as web services this is misnomer. SOAP has a very little if anything to do with the Web. REST provides true Web services based on URIs and HTTP.
Why REST?
Since REST uses standard HTTP it is much simpler in just about ever way.
REST is easier to implement, requires less bandwidth and resources.
REST permits many different data formats where as SOAP only permits XML.
REST allows better support for browser clients due to its support for JSON.
REST has better performance and scalability. REST reads can be cached, SOAP based reads cannot be cached.
If security is not a major concern and we have limited resources. Or we want to create an API that will be easily used by other developers publicly then we should go with REST.
If we need Stateless CRUD operations then go with REST.
REST is commonly used in social media, web chat, mobile services and Public APIs like Google Maps.
RESTful service return various MediaTypes for the same resource, depending on the request header parameter "Accept" as application/xml or application/json for POST and /user/1234.json or GET /user/1234.xml for GET.
REST services are meant to be called by the client-side application and not the end user directly.
ST in REST comes from State Transfer. You transfer the state around instead of having the server store it, this makes REST services scalable.
Why SOAP?
SOAP is not very easy to implement and requires more bandwidth and resources.
SOAP message request is processed slower as compared to REST and it does not use web caching mechanism.
WS-Security: While SOAP supports SSL (just like REST) it also supports WS-Security which adds some enterprise security features.
WS-AtomicTransaction: Need ACID Transactions over a service, you’re going to need SOAP.
WS-ReliableMessaging: If your application needs Asynchronous processing and a guaranteed level of reliability and security. Rest doesn’t have a standard messaging system and expects clients to deal with communication failures by retrying.
If the security is a major concern and the resources are not limited then we should use SOAP web services. Like if we are creating a web service for payment gateways, financial and telecommunication related work then we should go with SOAP as here high security is needed.
source1
source2
IMHO you can't compare SOAP and REST where those are two different things.
SOAP is a protocol and REST is a software architectural pattern. There is a lot of misconception in the internet for SOAP vs REST.
SOAP defines XML based message format that web service-enabled applications use to communicate each other over the internet. In order to do that the applications need prior knowledge of the message contract, datatypes, etc..
REST represents the state(as resources) of a server from an URL.It is stateless and clients should not have prior knowledge to interact with server beyond the understanding of hypermedia.
First of all: officially, the correct question would be web services + WSDL + SOAP vs REST.
Because, although the web service, is used in the loose sense, when using the HTTP protocol to transfer data instead of web pages, officially it is a very specific form of that idea. According to the definition, REST is not "web service".
In practice however, everyone ignores that, so let's ignore it too
There are already technical answers, so I'll try to provide some intuition.
Let's say you want to call a function in a remote computer, implemented in some other programming language (this is often called remote procedure call/RPC). Assume that function can be found at a specific URL, provided by the person who wrote it. You have to (somehow) send it a message, and get some response. So, there are two main questions to consider.
what is the format of the message you should send
how should the message be carried back and forth
For the first question, the official definition is WSDL. This is an XML file which describes, in detailed and strict format, what are the parameters, what are their types, names, default values, the name of the function to be called, etc. An example WSDL here shows that the file is human-readable (but not easily).
For the second question, there are various answers. However, the only one used in practice is SOAP. Its main idea is: wrap the previous XML (the actual message) into yet another XML (containing encoding info and other helpful stuff), and send it over HTTP. The POST method of the HTTP is used to send the message, since there is always a body.
The main idea of this whole approach is that you map a URL to a function, that is, to an action. So, if you have a list of customers in some server, and you want to view/update/delete one, you must have 3 URLS:
myapp/read-customer and in the body of the message, pass the id of the customer to be read.
myapp/update-customer and in the body, pass the id of the customer, as well as the new data
myapp/delete-customer and the id in the body
The REST approach sees things differently. A URL should not represent an action, but a thing (called resource in the REST lingo). Since the HTTP protocol (which we are already using) supports verbs, use those verbs to specify what actions to perform on the thing.
So, with the REST approach, customer number 12 would be found on URL myapp/customers/12. To view the customer data, you hit the URL with a GET request. To delete it, the same URL, with a DELETE verb. To update it, again, the same URL with a POST verb, and the new content in the request body.
For more details about the requirements that a service has to fulfil to be considered truly RESTful, see the Richardson maturity model. The article gives examples, and, more importantly, explains why a (so-called) SOAP service, is a level-0 REST service (although, level-0 means low compliance to this model, it's not offensive, and it is still useful in many cases).
Among many others already covered in the many answers, I would highlight that SOAP enables to define a contract, the WSDL, which define the operations supported, complex types, etc.
SOAP is oriented to operations, but REST is oriented at resources.
Personally I would select SOAP for complex interfaces between internal enterprise applications, and REST for public, simpler, stateless interfaces with the outside world.
Addition for:
++ A mistake that’s often made when approaching REST is to think of it as “web services with URLs”—to think of REST as another remote procedure call (RPC) mechanism, like SOAP, but invoked through plain HTTP URLs and without SOAP’s hefty XML namespaces.
++ On the contrary, REST has little to do with RPC. Whereas RPC is service oriented and focused on actions and verbs, REST is resource oriented, emphasizing the things and nouns that comprise an application.
A lot of these answers entirely forgot to mention hypermedia controls (HATEOAS) which is completely fundamental to REST. A few others touched on it, but didn't really explain it so well.
This article should explain the difference between the concepts, without getting into the weeds on specific SOAP features.
REST API
RESTful APIs are the most famous type of API. REST stands REpresentational State Transfer.
REST APIs are APIs that follow standardized principles, properties, and constraints.
You can access resources in the REST API using HTTP verbs.
REST APIs operate on a simple request/response system. You can send a request using these HTTP methods:
GET
POST
PUT
PATCH
DELETE
TRACE
OPTIONS
CONNECT
HEAD
Here are the most common HTTP verbs
GET (read existing data)
POST (create a new response or data)
PATCH (update the data)
DELETE (delete the data)
The client can make requests using HTTP verbs followed by the endpoint.
The endpoint (or route) is the URL you request for. The path determines the resource you’re requesting.
When you send a request to an endpoint, it responds with the relevant data, generally formatted as JSON, XML, plain text, images, HTML, and more.
REST APIs can also be designed with many different endpoints that return different types of data. Accessing multiple endpoints with a REST API requires various API calls.
An actual RESTful API follows the following five constraints:
Client-Server Architecture
The client requests the data from the server with no third-party interpretation.
Statelessness
Statelessness means that every HTTP request happens in complete isolation. Each request contains the information necessary to service the request. The server never relies on information from previous requests. There’s no state.
Cacheability
Responses can be explicitly or implicitly defined as cacheable or non-cacheable to improve scalability and performance. For example, enabling the cache of GET requests can improve the response times of requests for resource data.
Layering
Different layers of the API architecture should work together, creating a scalable system that is easy to update or adjust.
Uniform Interface
Communication between the client and the server must be done in a standardized language that is independent of both. This improves scalability and flexibility.
REST APIs are a good fit for projects that need to be
Flexible
Scalable
Fast
SOAP API
SOAP is a necessary protocol that helped introduce the widespread use of APIs.
SOAP is the acronym for Simple Object Access Protocol.
SOAP is a standardized protocol that relies on XML to make requests and receive responses.
Even though SOAP is based on XML, the SOAP protocol is still in wide usage.
SOAP APIs make data available as a service and are typically used when performing transactions involving multiple API calls or applications where security is the primary consideration.
SOAP was initially developed for Microsoft in 1998 to provide a standard mechanism for integrating services on the internet regardless of the operating system, object model, or programming language.
The “S” in SOAP stands for Simple, and for a good reason — SOAP can be used with less complexity as it requires less coding in the app layer for transactions, security, and other functions.
SOAP has three primary characteristics:
Extensibility of SOAP API
SOAP allows for extensions that introduce more robust features, such as Windows Server Security, Addressing, and more.
Neutrality of SOAP API
SOAP is capable of operating over a wide range of protocols, like UDP, JMS, SMTP, TCP, and HTTP.can operate.
Independence of SOAP API
SOAP API responses are purely based on XML. Therefore SOAP APIs are platform and language independent.
Developers continue to debate the pros and cons of using SOAP and REST. The best one for your project will be the one that aligns with your needs.
SOAP APIs remain a top choice for corporate entities and government organizations that prioritize security, even though REST has largely dominated web applications.
SOAP is more secure than REST as it uses WS-Security for transmission along with Secure Socket Layer
SOAP also has more excellent transactional reliability, which is another reason why SOAP historically has been favored by the banking industry and other large entities.
What is REST
REST stands for representational state transfer, it's actually an architectural style for creating Web API which treats everything(data or functionality) as recourse.
It expects; exposing resources through URI and responding in multiple formats and representational transfer of state of the resources in stateless manner. Here I am talking about two things:
Stateless manner: Provided by HTTP.
Representational transfer of state: For example if we are adding an employee. .
into our system, it's in POST state of HTTP, after this it would be in GET state of HTTP, PUT and DELETE likewise.
REST can use SOAP web services because it is a concept and can use any protocol like HTTP, SOAP.SOAP uses services interfaces to expose the business logic. REST uses URI to expose business logic.
REST is not REST without HATEOAS. This means that a client only knows the entry point URI and the resources are supposed to return links the client should follow. Those fancy documentation generators that give URI patterns for everything you can do in a REST API miss the point completely. They are not only documenting something that's supposed to be following the standard, but when you do that, you're coupling the client to one particular moment in the evolution of the API, and any changes on the API have to be documented and applied, or it will break.
HATEOAS, an abbreviation for Hypermedia As The Engine Of Application State, is a constraint of the REST application architecture that distinguishes it from most other network application architectures. The principle is that a client interacts with a network application entirely through hypermedia provided dynamically by application servers. A REST client needs no prior knowledge about how to interact with any particular application or server beyond a generic understanding of hypermedia. By contrast, in some service-oriented architectures (SOA), clients and servers interact through a fixed interface shared through documentation or an interface description language (IDL).
Reference 1
Reference 2
Although SOAP and REST share similarities over the HTTP protocol, SOAP is a more rigid set of messaging patterns than REST. The rules in SOAP are relevant because we can’t achieve any degree of standardization without them. REST needs no processing as an architecture style and is inherently more versatile. In the spirit of information exchange, both SOAP and REST depend on well-established laws that everybody has decided to abide by.
The choice of SOAP vs. REST is dependent on the programming language you are using the environment you are using and the specifications.
To answer this question it’s useful to understand the evolution of the architecture of distributed applications from simple layered architectures, to object & service based, to resources based, & nowadays we even have event based architectures. Most large systems use a combination of styles.
The first distributed applications had layered architectures. I'll assume everyone here knows what layers are. These structures are neatly organized, and can be stacks or cyclical structures. Effort is made to maintain a unidirectional data flow.
Object-based architectures evolved out of layered architectures and follow a much looser model. Here, each component is an object (often called a distributed object). The objects interact with one another using a mechanism similar to remote procedure calls - when a client binds to a distributed object it loads an implementation of the objects interface into its address space. The RPC stub can marshal a request & receive a response. Likewise the objects interface on the server is an RPC style stub. The structure of these object based systems is not as neatly organized, it looks more like an object graph.
The interface of a distributed object conceals its implementation. As with layered components, if the interface is clearly defined the internal implementation can be altered - even replaced entirely.
Object-based architectures provide the basis for encapsulating services. A service is provided by a self-contained entity, though internally it can make use of other services. Gradually object-based architectures evolved into service-oriented architectures (SOAs).
With SOA, a distributed application is composed of services. These services can be provided across administrative domains - they may be available across the web (i.e. a storage service offered by a cloud provider).
As web services became popular, and more applications started using them, service composition (combining services to form new ones) became more important. One of the problems with SOA was that integrating different services could become extremely complicated.
While SOAP is a protocol, its use implies a service oriented architecture. SOAP attempted to provide a standard for services whereby they would be composable and easily integrated.
Resource-based architectures were a different approach to solving the integration problems of SOA. The idea is to treat the distributed system as a giant collection of resources that are individually managed by components.
This led to the development of RESTful architectures. One thing that characterizes RESTful services is stateless execution. This is different than SOA where the server maintains the state.
So… how do service-specific interfaces, as provided by service-oriented architectures (including those that use SOAP) compare with resource-based architecture like REST?
While REST is simple, it does not provide a simple interface for complex communication schemes. For example, if you are required to use transactions REST is not appropriate, it is better to keep the complex state encapsulated on the server than have the client manage the transaction. But there are many scenarios where the orthogonal use of resources in RESTful architectures greatly simplifies integration of services in what would otherwise mean an explosion of service interfaces. Another tradeoff is resource-based architectures put more complexity on the client & increase traffic over the network while service-based increase the complexity of the server & tax its memory & CPU resources.
Some people have also mentioned common HTTP services or other services that do not satisfy the requirements of RESTful architecture or SOAP. These too can be categorized as either service-based or resource-based. These have the advantage of being simpler to implement. You'd only use such an approach if you knew your service will never need to be integrated across administrative domains since this makes no attempt at fixing the integration issues that arise.
These sorts of HTTP-based services, especially Pseudo-RESTful services are still the most common types. Implementing SOAP is complicated and should only be used if you really need it - i.e. you need a service that's easily integrated across domains and you want it to have a service-interface. There are still cases where this is needed. A true RESTful service is also difficult to implement, though not as difficult as SOAP.
Related
I am implementing a micro-service architecture for the first time.
Some of my services (.NET Core Web APIs) need to communicate with each other through HTTP requests. For that purpose, I am injecting a wrapper around HttpClient.
But I suspect that I am reinventing the wheel. Among micro-service practitioners, is there a pattern or even a third-party library to solve this problem?
In a micro-service architecture, the most important thing is a clear separation of concerns and application boundaries. Imagine a simple setup, with Product and Price micro services
An important concept is each service is master of data, and owns its own database. In this example,
a client of the 'Product' service will make an HTTP call to the Product API.
the product API will make a call to the Price API to get prices for the products
the product API therefore depends on the Price API to create a response
These are the synchronous parts of the process, generally achieved through HTTP calls across boundaries. You'll also have asynchronous parts of your solution, in this example,
the Price API publishes an event to a bus whenever a price is changed
the product API publishes an event whenever a product is created
There may be one or more subscribers to these events, that will respond and probably call an API to retrieve the changed data.
The critical parts of this are clearly defining your API and message contracts, understanding if things will be async or sync, having the right level of telemetry across the entire architecture to track and understand distributed system behaviour, and keeping everything as independently buildable/testable/deployable components.
First and foremost, if you're not using containers, start, along with orchestration (both natively supported in Visual Studio, assuming you have Docker, etc. actually installed). Among the many benefits, you can reference your services via hostname without having to worry about ports and different locations for different environments.
As far as actual communication goes. There's not really a magic solution here. HttpClient is what you use, of course, and generally, yes, you want to have a wrapper around that to abstract away the low-level HTTP communication stuff, so the rest of your code can simply call simple methods on that wrapper.
If you aren't using IHttpClientFactory, start. If you already have a wrapper class, you're halfway there, and with that, not only do you get efficient management of HttpMessageHandlers so you don't exhaust your server's connection pool, but you can also use the Polly integration to handle transient HTTP errors and even do retry policies, circuit breakers, etc. for your microservice connections.
Finally, there is the Refit library which can make things a tad more straight-forward. I find it to have more use with huge third-party APIs like Facebook, Google, etc., though. Since microservices should by design be simple, you're probably not saving much code over just having your own wrapper class. Regardless, the way it works is that you define an interface that represents the API, and then Refit uses that to actually make appropriate requests. It's kind of like a wrapper class for free, but you still need to create the interface.
I have a set of RESTful services that my Angular 5 client uses to perform CRUD and business operations for the application. These are a set of micro services and they use pub/sub message queues to communicate between them, e.g. when a user is created the user server publishes a UserCreated event to the message queue and subscribers can listen for this event and act upon it as required.
Now, this is all good but i was thinking that wouldn’t it be better if the Angular 5 application itself published the event to the message queue rather than making HTTP POST/PUT or DELETE and only make GET requests against the API?
So repeating the example above the Angular 5 client would publish a CreateUserEvent to the message bus (in my case cloud pub/sub), I could then have services subscribe to these events and act upon them. My RESTful services would then only expose GET /users and GET /user/:id for example.
I know that this is doable and I guess what I am describing is CQRS, but I am keen to understand if publishing events to a message bus from the UI is good practice?
The concept of Messaging Bus is very different than Microservices. Probably, the answer to your question lies in the way you look at these two, from architectural perspective.
A messaging bus(whether it is backend specific or frontend specific) is designed in such a way, that it serves the purpose of communication of entities within the confined boundary of an environment, i.e. backend or frontend.
Whereas on the other hand, microservices architecture is designed in such a way that, two different environments that may be backend-frontend or backend-backend, can "effectively" communicate.
So there is a clear separation of motivation behind both the concepts. Now, from your viewpoint, you may use a hybrid approach which might work, and it may also lead to interesting findings related to performance, architectural design or overheads as well.
Publishing directly from the client is possible, but the caveat is that it means that the client needs to have the proper credentials to publish. For this reason, it may be preferable to have the service do the publishing in response to requests sent from the clients.
When is an API NOT RPC ?
After a long long discussion on twitter regarding API design.
I want to try to come to a clear answer when an API is not RPC based.
There seems to be quite a bit of confusion around this Is That REST API Really RPC? Roy Fielding Seems to Think So
That specific link is about REST and RPC. my question just aims to ask about RPC vs not RPC in general, not in the context of HTTP.
The TLDR; definition of when an API IS RPC is "when it hides the network from the developer"
Fair enough.
But back to when is it not.
The Twitter duscussion focused on the HttpClient, as not being RPC.
My argument is that the HttpClient does not really model the network.
There is nothing in there modelling latency.
There is also a very weak representation of network failures.
Most status codes in HTTP are not network related, e.g. NotFound, Payment Required etc.
Depending on language and SDK, the HttpClient may take a string or an URI.
In the case where it takes a string representing the URI, that also doesn't really communicate that there is a network involved, you would need to know what the string represents and whant an URI is.
If you had never seen HTTP before, how would you know by looking at that specific API?
There is an API consuming a string, and returning an object either containing a string or some int response codes.
How would you know?
Most developers of-course knows what HTTP is, but from a strict definition point of view, how would you define an API as not RPC?
Would a checked exception in Java "NetworkException" be enough to clearly communicate that there is a network involved?
If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?
And exactly when is a procedure a procedure in remote terms?
All network communication will result in some code being run on the receiving system.
If we take a person who have never been in touch with technology or protocols, and teach this person a programming language.
What would be the definition of "Not RPC" this person could employ to spot what is RPC and not?
I am being dense and possibly silly here, but I am trying to find the essence of "Not RPC", what exactly is required for this?
Does it for example need to be non blocking in order to play nice with unknown latency?
Or could a non RPC API be blocking?
IMO, if it is blocking, it hides that there will be latency involved, and thus falls into the "Latency is zero" fallacy, meaning it hides the network.
This all seems to be super obvious to everyone else but me, but no one have yet shown a concise answer on what the requirements for not being RPC is.
For me it looks like you're putting all eggs into one basket. Lets start from the very beginning:
API is application programming interface
it is a set of clearly defined methods of communication between various software components
The only definitive property of API is that it is defined/documented way of communication.
If one application/module/component A can use/call another application/module/component B somehow - it means that B provides API and A uses this API.
Usually there are two aspects of API which must be defined:
What exactly is passed into/returned from component (its your application logic)
How exactly data is transferred/serialized (this is technical implementation)
I'm not touching "what" part for obvious reasons.
Lets focus on different ways of "how":
push 4 byte integer to stack, change IP register and read 4 byte integer output from EAX register
connect to socket, write data serialized as byte array and read some response as byte array back
call 911 from any phone, say your address and expect several cars on your driveway
In all these cases you're using some API = you're communicating with other components in some predefined way.
RPC is remote procedure call
computer program causes a procedure (subroutine) to execute in another address space
The only definitive property of RPC is that data is passed across different/remote address spaces, some architecture allow different address spaces on single host, for example x86. As soon as different physical hosts usually do not share address space, any call across network is RPC, but not vice versa.
Note: It is rare, but possible to share memory space across different physical hosts connected in network - then such communication strictly speaking is not RPC, lets omit such cases.
Any RPC call automatically means that you're calling some API. By definition. So we can say that RPC is part of API's "how", it is transport level. As soon as RPC itself does not define actual mechanism, there are could be very different implementations, for example shared memory, DMA, TCP/IP, etc.
I suppose now you can answer your question when an API is not RPC based - When API says so. It is up to API developer to define whether it should/can be called via RPC or not, API can define multiple ways of calling it.
As antonym to RPC you can use "in-process".
So, phrase API IS RPC is "when it hides the network from the developer" is absolutely non-sense. API must define "how" section.
HTTP is hyper text transfer protocol
request–response protocol in the client–server computing model
The only definitive property of HTTP is that it describes protocol/format of request (input arguments) and response.
You can use HTTP in non-RPC API. For example I prefer to think about HTTP as file format. So we can say that HTTP is another part of API's "how". It defines serialization part of API, but does not dictate you transport level.
Note: Some RPC protocols actually define both transport and serialization.
So, HttpClient is the tool which allows you to invoke API and use HTTP encoded request/response, usually such library supposes RPC as transport level. None of these terms mandates network or any particular transport protocol. This is why http client should not declare any kind of network exceptions/errors, but it could throw HTTP errors as exceptions.
Note: Network exceptions could be thrown from TCP/IP RPC implementation for example, HTTP client library could proxy them to you. Unfortunately some libraries wrongly couple HTTP with TCP/IP too much and border between different responsibilities is crossed.
REST is representational state transfer
architectural style for distributed hypermedia systems
It is very wide term, it defines a lot of things from different aspects and at very different levels, most important:
HOW your API should be designed (usage of URI)
HOW your API should be implemented (stateless, HTTP verbs)
HOW your API should be called (client-server)
Client-server, usually assumes cross-process communication. Ie different address spaces, this is why we can say that REST mandates RPC as invoking mechanism to API + HTTP as serialization format.
Now, I suppose you will be able to understand these answers:
How would you know?
when you give client your REST API, it automatically defines some part of your API in terms of other protocols. Ie - to use REST API you must read/know HTTP protocol first. API must define what kind of RPC must be used.
HttpClient does not really model the network.
It should not do this. It works with HTTP semantics.
you would need to know what the string represents and whant an URI
There are two URIs actually:
URI is part of API's "What" section, it defines business object location. It has nothing to do with network or DNS system. You should not understand it.
URI could be part of TCP/IP RPC requirements, in this case it represents domain name/path. But some implementations can work with IP addresses, not URI.
If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?
As I wrote before - we can assume network as RPC always.
Does it for example need to be non blocking in order to play nice with unknown latency?
Its up to API developer:
API can contain functions which suppose asynchronous execution, but client can call them in blocking way
API can contain blocking functions, but client can call them asynchronously
An RPC is a network API with enough layers of abstraction on top. What's enough layers? That's a subjective thing.
Whether an API is an RPC or not doesn't, in my opinion, depend on method names or the names of exceptions/errors you need to handle. We're adults, we know that the call is done over the network. Naming it "SayHello" instead of "SayHelloOverNetwork" doesn't make a difference.
What does make a difference is all of those layers of abstraction - the less resource management and error handling you need to do, the more RPC-ish the code.
And about the specific HttpClient example - I'd say that today, developers doing high-level work consider HTTP to be a transport medium; an alternative to "plain old sockets".
So while a person who does not know what HTTP is might look at "a function that takes a string and returns an error code" as RPC, a modern developer would probably see it as "nitty gritty networking code". He would then say yuck, and put a few more layers of abstraction on top to make method calls from his business logic more RPC-ish. That way, the business logic would have to handle only the most extreme failures.
A) If the documentation told you how to do what you want by calling the API, then you are calling an RPC-like API, not a REST API.
B) If the API itself told you how to call it to accomplish what you want, then that API is not very RPC-like, and might be a REST API.
Programming in an object oriented or a procedural style is like case (A) -- you know what to call and how to call it to accomplish what you want before you write the code to call it. When [I assume Roy F.] says that an RPC-like API hides the network from the developer, he means that the developer can continue to program in this way whether or not his calls are remote -- he doesn't have to care about the network.
When you call a REST API, however, you have to program differently, because you have to let the API tell you what you can do and how you can do it. That's what it means to be "hypertext-driven".
Being hypertext-driven means that your stuff will continue to work when the guy on the other side of the network, who doesn't know or care about your program at all, completely changes what you can do and how you have to do it. Note that the lack of any contract between you and the system you are calling is the fundamental feature of the network that RPC hides.
I am building huge application using microservices architecture. The application will consist of multiple backend microservices (deployed on multiple cloud instances), some of which I would like to connect using rest apis in order to pass data between them.
The application will also expose public api for third parties, but the above mentioned endpoints should be restricted ONLY to other microservices within the same application creating some kind of a private network.
So, my question is:
How to achieve that restricted api access to other microservices within the same application?
If there are better ways to connect microservices than using http transport layer, please mention them.
Please keep the answers server/language agnostic if possible.
Thanks.
Yeah easy. Each client of a micro service has an API key. Micro services only accept requests from clients with a valid API Key.
Also, its good to know that REST is simply a protocol that allows communication between bounded contexts.
It doesn't have to be over HTTP. The requirement is that it has a uniform interface (this is why HTTP is used with its PUT, POST, GET, DELETE... methods) and that it is stateless (all state being transferred through a URI).
So if all your micro services run on the same box, all you need to do is something like this:
class SomeClass implements RestfulMethods {
public function get(params){ // return something}
public function post(params){ // add something}
public function put(params){ // update something}
public function delete(params){ // delete something}
}
Micro services then communicated by interacting with the RestfulMethod implementations of other services.
But if your micorservices are on different machines, its probably best to use HTTP as the transport mechanism.
One way is to use HTTPS for internal MS communication. Lock down the access (using a trust store) to only your services. You can share a certificate among the services for backend communication. Preferably a wildcard certificate. Then it should work as long as your services can be adressed to the same domain. Like *.yourcompany.com.
Once you have it all in place, it should work fine. HTTPS sessions does imply some overhead, but that's primarily in the handshake process. Using keep-alive on your sessions, there shouldn't be much overhead with encrypted channels.
Of course, you can simply add some credentials to your http headers as well. That would be less secure.
RestAPI is not only way to do it, one of the some ideas that i have seeming is about the usage of Service Registry link Eureka (Netflix), Zookeeper (Apache) and others.
Here is an example:
https://github.com/tiarebalbi/qcon2015-sao-paolo-microservices-workshop
...the above mentioned endpoints should be restricted ONLY to other
microservices within the same application...
What you are talking about in a broad sense is authorisation.
Authorisation is the granting or denying of "powers" or "abilities" within your application to authentic users.
Therefore the job of any authorisation mechanism is to validate the "claim" implicit in any inbound API request - that the user is allowed to do the thing encoded in the request.
As an example, imagine I turned up at your API with a PUT request for Widget 1234:
PUT /widgetservice/widget/1234 HTTP/1.1
This could be interpreted as me (Bob Smith, a known user) making a claim that I am allowed to make changes to a widget in your system with id 1234.
Whatever you do to validate this claim, I hope you can see this needs to be done at the application level, rather than at the API level. In fact, authorisation is an application-level concern, rather than an API-level concern (unlike authentication, which is very much an API level concern).
To demonstrate, in our example above, it's theoritically possible I'm allowed to create a new widget, but not to update an existing widget:
POST /widgetservice/widget/1234 HTTP/1.1
Or even I'm allowed to update only widget 1234 and requests to change other widgets should not be allowed
PUT /widgetservice/widget/5678 HTTP/1.1
How to achieve that restricted api access to other microservices
within the same application?
So this becomes a question about how can you build authorisation into your application so that you can validate individual requests coming from known users (in your case your other services in your ecosystem are just another kind of known user).
Well, and apologies but I'm going to be prescriptive here, you could use a claims-based authorisation service, which stores valid claims based on user identity or membership of roles.
It depends largely on how you are handling authentication, and whether or not you are supporting roles as part of that process. You could store claims against individual users but this becomes arduous as the number of users increases. OAuth, despite being pretty heavy to implement, is a leading platform for this.
I am building huge application using microservices architecture
The only thing I will say here is read this first.
The easiest way is to only enable access from the IP address that your microservices are running on.
I know i'm super late for this question :)) but for anyone who came across this thread, Kafka is a great option for operations similar to this question.
based on Kafka's own introduction
Kafka is generally used for two broad classes of applications:
Building real-time streaming data pipelines that reliably get data between systems or applications
Building real-time streaming applications that transform or react to the streams of data
Side Note: Kafka is created by LinkedIn and is being used in many huge companies so it's kindda battle tested.
you can use RabbitMQ
publish your resquests to queue and then consume tasks
I have found myself responsible for carrying on the development of a system which I did not originally design and can't ask the original designers why certain design decisions were taken, as they are no longer here. I am a junior developer on design issues so didn't really know what to ask when I started on the project which was my first SOA / WCF project.
The system has 7 WCF services, will grow to 9, each self-hosted in a seperate console app/windows service. All of them are single instance and single threaded. All services have the same OperationContract: they expose a Register() and Send() method. When client services want to connect to another service, they first call Register(), then if successful they do all the rest of their communication with Send(). We have a DataContract that has an enum MessageType and a Content propety which can contain other DataContract "payloads." What the service does with the message is determined by the enum MessageType...everything comes through the Send() method and then gets routed to a switch statement...I suspect this is unusual
Register() and Send() are actually OneWay and Async...ALL results from services are returned to client services by a WCF CallbackContract. I believe that the reson for using CallbackContracts is to facilitate the Publish-Subscribe model we are using. The problem is not all of our communication fits publish-subscribe and using CallbackContracts means we have to include source details in returned result messages so clients can work out what the returned results were originally for...again clients have a switch statements to work out what to do with messages arriving from services based on the MessageType (and other embedded details).
In terms of topology: the services form "nodes" in a graph. Each service has hardcoded a list of other services it must connect to when it starts, and wont allow client services to "Register" with it until is has made all of the connections it needs. As an example, we have a LoggingService and a DataAccessService. The DataAccessSevice is a client of the LoggingService and so the DataAccess service will attempt to Register with the LoggingService when it starts. Until it can successfully Register the DataAccess service will not allow any clients to Register with it. The result is that when the system is fired up as a whole the services start up in a cascadeing manner. I don't see this as an issue, but is this unusual?
To make matters more complex, one of the systems requirements is that services or "nodes" do not need to be directly registered with one another in order to send messages to one another, but can communicate via indirect links. For example, say we have 3 services A, B and C connected in a chain, A can send a message to C via B...using 2 hops.
I was actually tasked with this and wrote the routing system, it was fun, but the lead left before I could ask why it was really needed. As far as I can see, there is no reason why services cannot just connect direct to the other services they need. Whats more I had to write a reliability system on top of everything as the requirement was to have reliable messaging across nodes in the system, wheras with simple point-to-point links WCF reliabily does the job.
Prior to this project I had only worked on winforms desktop apps for 3 years, do didn't know any better. My suspicions are things are overcomplicated with this project: I guess to summarise, my questions are:
1) Is this idea of a graph topology with messages hopping over indirect links unusual? Why not just connect services directly to the services that they need to access (which in reality is what we do anyway...I dont think we have any messages hopping)?
2) Is exposing just 2 methods in the OperationContract and using the a MessageType enum to determine what the message is for/what to do with it unusual? Shouldnt a WCF service expose lots of methods with specific purposes instead and the client chooses what methods it wants to call?
3) Is doing all communication back to a client via CallbackContracts unusual. Surely sync or asyc request-response is simpler.
4) Is the idea of a service not allowing client services to connect to it (Register) until it has connected to all of its services (to which it is a client) a sound design? I think this is the only design aspect I agree with, I mean the DataAccessService should not accept clients until it has a connection with the logging service.
I have so many WCF questions, more will come in later threads. Thanks in advance.
Well, the whole things seems a bit odd, agreed.
All of them are single instance and
single threaded.
That's definitely going to come back and cause massive performance headaches - guaranteed. I don't understand why anyone would want to write a singleton WCF service to begin with (except for a few edge cases, where it does make sense), and if you do have a singleton WCF service, to get any decent performance, it must be multi-threaded (which is tricky programming, and is why I almost always advise against it).
All services have the same
OperationContract: they expose a
Register() and Send() method.
That's rather odd, too. So anyone calling will first .Register(), and then call .Send() with different parameters several times?? Funny design, really.... The SOA assumption is that you design your services to be the model of a set of functionality you want to expose to the outside world, e.g. your CustomerService might have methods like GetCustomerByID, GetAllCustomersByCountry, etc. methods - depending on what you need.
Having just a single Send() method with parameters which define what is being done seems a bit.... unusual and not very intuitive / clear.
Is this idea of a graph topology with
messages hopping over indirect links
unusual?
Not necessarily. It can make sense to expose just a single interface to the outside world, and then use some internal backend services to do the actual work. .NET 4 will actually introduce a RoutingService in WCF which makes these kind of scenarios easier. I don't think this is a big no-no.
Is doing all communication back to a
client via CallbackContracts unusual.
Yes, unusual, fragile, messy - if you can ever do without it - go for it. If you have mostly simple calls, like GetCustomerByID - make those a standard Request/Response call - the client requests something (by supplying a Customer ID) and gets back a Customer object as a return value. Much much simpler!
If you do have long-running service calls, that might take minutes or more to complete - then you might consider One-Way calls which just deposit a request into a queue, and that request gets handled later on. Typically, here, you can either deposit the answer into a response queue which the client then checks, or you can have two additional service methods which give you the status of a request (is it done yet?) and a second method to retrieve the result(s) of that request.
Hope that helps to get you started !
All services have the same OperationContract: they expose a Register() and Send() method.
Your design seems unusual at some parts specially exposing only two operations. I haven't worked with WCF, we use Java. But based on my understanding the whole purpose of Web Services is to expose Operations that your partners can utilise.
Having only two Operations looks like odd design to me. You generally expose your API using WSDL. In this case the WSDL would add nothing of value to the partners, unless you have lot of documentation. Generally the operation name should be self-explanatory. Right now your system cannot be used by partners without having internal knowledge.
Is doing all communication back to a client via CallbackContracts unusual. Surely sync or asyc request-response is simpler.
Agree with you. Async should only be used for long running processes. Async adds the overhead of correlation.