How to route ElasticSearch to a specific node (not shard)? - lucene

I know it's possible to route a search to a specific shard, but I'm looking to route a search to a specific node. The reason is because some nodes are more powerful than others, and I want to have logic to hit those nodes more than the weaker nodes when conducting a query.
Is this possible? I know, short question but believe me I've done a lot of research and Googling and couldn't find the answer.

If you will set an awareness attribute based on the class of machine, you can use it to automatically route requests within the class. If you are using java-based transport or node clients you can simply assign the same awareness attributes to your clients and they will automatically route request to the nodes with the same set of attributes. If you are using REST clients, you can either connect to machines of desired class or add dedicated client nodes into your cluster, assign them desired awareness attribute and use the to execute queries.

Found the answer:
Just append this in the REST URL when searching "&preference=_primary_first"

Related

What is the difference between performing a procedure on a resource and performing a state transformation?

I'm new to web development and I'm attempting to understand REST. The tutorial I'm watching makes mention of the difference between "procedures" and "state transformation". Stating that REST is based on the notion of "state transformation", but it does not delineate the difference between the two.
This has left me wondering what is the difference between the two? Why can't an operation which transforms the state of a resource also be considered a procedure? After all, 'procedure' sounds like a generic enough term that it would also encompass an operation that would transform the state of a resource.
So, what is the difference between performing a procedure on a resource, and performing a state transformation? Or is it merely a matter of semantics?
I have also tried searching for the answer but can't seem to find anything that will shed light on this.
TL;DR
RPC focues on sending a payload containing method names and arguments in a predefined format. Clients couple tightly to servers through a shared interface (Skeletton classes, WSDL or other interface definition languages (IDLs))
REST focues on decoupling clients from servers and on introducing indirections, like support of multiple different media types to marshal resource state in, and the whole interaction concepts summarized by HATEOAS where hypertext controls are used to drive the application state forward through a domain application protocol / state machine on the server side. Responses usually contain semi-structured data, which usually don't go well with simple CRUD application, that follow the definition of corresponding media type definition (i.e. the HTML spec). If you will the state of a resource is transformed into a representation format adhering to the rules in the media type definition and transferred to the remote side
In network programming, remote procedure call (RPC)-style invocations, i.e. often used in RMI, Corba, SOAP or similar frameworks, will send usually a method name that should be invoked at the server alongside with parameters to feed the method with. The return value is then marshalled into corresponding response and sent back to the caller. What a client could invoke is usually exposed via external stuff, i.e. skeletton classes, WSDL- or other form of contracts and so on. So far, so simple. This is how most of the networking stuff works. However, the drawback here is that a client is tightly coupled to the exposed interface (skeletton classes, WSDL, external documentation) and many problems in internet computing arise due to changes over time that are not adequatly depictable in those interfaces.
If you take a closer look though at how the Web works for decades, change is an inherent part of it. Your browser will just show the most recent state of a resource (Web page) it has. It might eigther got it from its cache or from a server it asked for. If the version available in its cache is older than a predefined threshold value it will ignore the cached value and request a new version. If there happened an update since the last version your browser is automatically served with the new version. Fielding, who was working on the HTTP 1.0 and 1.1 spec back then, analyzed how the interaction on the Web takes place and generalized his findings into the REST architecture design. So, if you will, REST is just Web surfing for applications.
Unfortunately a mojority of enthusiasts and professional have not yet understood what REST really is and there is so much false information available in regards to REST, even here at Stackoverflow most people don't seem to care actually and posts explaining the true nature of REST are downvoted and wrong information upvoted.
So, what does REST different than typical RPC-like method invocations?
First, REST relies on a certain set of uniform interfaces, that are the same for every participant in that architecture. These are i.e. HTTP as transport layer and a naming scheme for resource (URI) so that everyone acts on these fixed principles. This helps to reduce interoperability issues that are just way to common in traditional network programming.
Next, it relies on a basic principle: Servers teach clients what they need to know. But how does a server know what a client need to know? Well, as Jim Webber pointed out, the designer of the application develops a state machine (or domain application protocol) a client will follow through. Think of a checkout system on your favorite online shop. At one point it presents you the items in your trolly and offers you a choice to progress to the next "page" where you can enter the shipping address and on further progressing through the state machine you will be asked for your payment options and so on until at one point to finished the checkout and are served with a "Thank you" page that summarizes your order. Under the hood you just progressed through their protocol on how to place orders and used application controls to progress your client further through their state machine. You therefore got served with some Web forms and links that you used to fulfill your task. In essence, this is what Hypertext as the engine of application state (or HATEOAS for short) is all about.
On the Web HTML forms are used to teach a client about what properties a resource supports, which ones are editable and so on. Besides that, it also teaches clients on the actual URI to send input data to, the HTTP operation to utilze as well as, mostly implicitly given, the media type to marshal the request into. I.e. a regular HTML form will use application/x-www-form-urlencoded as its default media type to send the data to the server. So a full HTTP request for an input of a first and last name may look like this:
POST /path/to/resource HTTP/1.1
Host: acme.org
Connection: close
Accept: */*
User-Agent: ...
Content-Type: application/x-www-form-urlencoded
Content-Length: 32
firstName=Roman&lastName=Vottner
The same data could be sent using a different representation format, if it were supported by the media type the form was issued for. Unfortunately, HTML does not support that many.
Links provided by a server should usually be annotated (or accompanyied) by so called link relation names that put the current resource in relation with the given URI. If you will they are the predicate in a tripple of subject (current page), predicate (link relation name) and object (link target resource). Such names, of course, should be standadized or at least follow the Web linking extension mechanism. URIs itself are opaque, meaning they themselves don't provide meaning and should therefore not get parsed and analyzed at all. A common mistake often seen in so called "REST APIs" is that they have typed resources, i.e. a user resource or a car resource that can be marshalled on the client side to a programming language specific object (i.e. Java object of class User or the like) that is pretty common in traditional RPC-sytle programming. In a REST architecture the representation format however is usually semi-structured data, i.e. a mix of syntax defining control inputs or elements and actual data. As such, a direct mapping from DB-Entry, to Model-Object to a resource itself, as done by so many CRUD applications, is not possible.
Why is this all done in first place?
If you compare traditional network programming a client is usually only able to work with that one server and if something at that server changes clients may be affected and thus stop working. There is a tight coupling between those two apparent. The REST architecture introduces a couple of indirections, i.e. usage of link relations instead of attempting to analyze meaningful URIs as well as usage of a multitude of possible media-types instead of relying on a specified version format, which help to decouple clients from servers. I.e. instead on coupling to the server in regards of the message exchanged, both, client and server couple to media types. Through content-type negotiation a client simply tells the server of its capabilities and the server should generate a response the client can process. Instead of focusing on one message format, REST has the freedom of almost infinite ones as long as both, client and server, support these. The more media types a peer supports, the more likely it will be to interact with other peers in that network.
All these points I've mentioned above lead to a strict decoupling of client and servers, which grant the latter one to evolve freely without having to fear that changes introduce will break clients as neither the transport protocol nor the naming scheme have changed and the changes introdcued are still in scope of the media-type definition. So, well-behaved peers in that network will be able to pick up changes on the fly automatically. This is especially handy if you develop an application that should withstand the sands of time and still server clients in years to come.
If you don't need such properties, there is nothing wrong with not being "RESTful" at all, just don't call such services/APIs REST then. Also, developing REST is for sure more overhead compared to typical RPC-style interactions.

some doubts related to REST API [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
Can we say Like Rest API we have SOAP, XML RPC, JSON RPC, GRAPH QL I mean Rest is just a type of API?
Is Rest is just a mechanism to share data between applications by using HTTP methods?
REST can share data between applications only with HTTP there is no alternative?
What is the relation between REST & CRUD exactly? we are saying HTTP: POST is CRUD: CREATE my question is HTTP: POST will just post the data to server and it's business logic's headache to CREATE A NEW RESOURCE in the server side but how we all are saying HTTP: POST is nothing but CRUD: CREATE here HTTP: POST is just helping to share the data only right then how it is related to CRUD: CREATE? If that is the case we can create a new resource with HTTP: GET by writing business logic right then why so many sites are saying REST is a mechanism to do CRUD operations... But it should be REST is a mechanism to help in Sharing the data between applications right?!!! (I have doubt similarly with HTTP: DELETE, GET, PUT aswell)
Last but not least... what exactly Representational state transfer mean? could you please answer this with a very low level general answer instead of definition.
Can we say Like Rest API we have SOAP, XML RPC, JSON RPC, GRAPH QL I mean Rest is just a type of API?
REST stands for representation state transfer and is just an architectural style, not a technology or protocol. According to Robert C. "Uncle Bob" Martin an architecture is about intent and the intention behind REST is the decoupling of servers and clients in a distributed system.
REST basically just defines a set of constraints that when followed correctly allow servers to change at any time without breaking clients as clients will just depend on the data given by the server and not on any external data or documentation. REST can be regarded as Web surfing for applications. The main premise should always be that a server teaches a client on how certain things can be achieved.
On the Web a server can i.e. teach a client on the supported properties of a resource through the help of HTML forms. Not only does a client learn that way what a server expects as input, it also learns what HTTP operation should be used to send the data to the server, the endpoint URI to send the request to as well as, usually implicitly given, also the media type to convert the input to, which is application/x-www-form-urlencoded usually by default, which transforms an input like the HTML example below for first and lastname to something like this:
fName=Roman&lName=Vottner
Is Rest is just a mechanism to share data between applications by using HTTP methods?
REST itself is protocol agnostic meaning that it is not tide to HTTP itself and could just work on other transport protocols as well. Though the common perception many developers have is that it is based on HTTP. After all, as Jim Webber put it, HTTP is just a transport layer whose domain is the transfer of files or data over the Web. All HTTP does is to send one document from one machine to the next and any business rules we conclude from sending/receiving a request are just a side effect of the actual document management. It is therefore always better to think of a request as a whole document and the HTTP operation define how the document should be stored on the current machine, especially when such a document is already available, instead of thinking of a service method invocations. The latter one is a typical RPC view.
REST can share data between applications only with HTTP there is no alternative?
HTTP is just a transport layer used in a REST architecture. The architecture cares more on the interaction model between client and servers than on the technical nuances of HTTP. As REST itself is transport protocol agnostic it could be used with other, maybe proprietary protocols as well.
What is the relation between REST & CRUD exactly? we are saying HTTP: POST is CRUD: CREATE my question is HTTP: POST will just post the data to server and it's business logic's headache to CREATE A NEW RESOURCE in the server side but how we all are saying HTTP: POST is nothing but CRUD: CREATE here HTTP: POST is just helping to share the data only right then how it is related to CRUD: CREATE? If that is the case we can create a new resource with HTTP: GET by writing business logic right then why so many sites are saying REST is a mechanism to do CRUD operations... But it should be REST is a mechanism to help in Sharing the data between applications right?!!! (I have doubt similarly with HTTP: DELETE, GET, PUT aswell)
Fielding's thesis on the REST architecture style does not contain the term CRUD at all. The term REST nowadays is heavily misused as people probably didn't bother to actually read the thesis, which admittingly is a bit abstract, and just follow what some people thought may be REST but was actually RPC. Nowadays, if a typical stakeholder talks about REST they usually think of a JSON-based HTTP CRUD-API whose supported endpoints are defined in some Web documentation (Swagger, OpenAPI, ...) and where the HTTP operations for POST (= Create), GET (= Read), PUT (= Update), Delete (= Delete) are supported by default. However, this is unfortunately far from the truth. Though people are just to accustomed with their (wrong) definition and don't see or don't care about the actual problem in their misusage. They don't care about a long-lasting service as in 2-5 years the next-gen technology is here that allows to reduce the number of lines of codes even more and if a new "version" of a service is needed, this usually goes hand in hand with a technology switch also, to justify the "cost of change" somehow.
Last but not least... what exactly Representational state transfer mean? could you please answer this with a very low level general answer instead of definition.
Probably the easiest way to grasp how the interaction in a REST architecture should be is by analyzing typical interaction on the Web, the big cousin of REST. You, as a user, usually start by opening your browser (= client) and typing in some URL in the search bar. Next a Web page is rendered on your screen. Behind the curtain a couple of things happened. Besides the whole connection management and any eventual TLS handshake your browser sent at least one GET request to the target server. On sending the GET request, the client included information on his capabilities, i.e. through the Accept HTTP header. This header is used on the server side to decide which representation format to generate and send to the user. On the Web this might be something like text/html or application/xhtml+xml or if some report is generated might be something like application/pdf or application/vnd.ms-excel or the like, depending which representation format fit the data best.
The representation format itself is now a concrete instance of a document following a certain media type specification. I.e. the HTML forms specification defines the supported elements within a <form>...</form> tag pair as well as describes the semantics of each of the elements. The concrete instance may now define a form as such:
<form action="/action_page.php" method="get">
<label for="fname">First name:</label>
<input type="text" id="fname" name="fname"><br><br>
<label for="lname">Last name:</label>
<input type="text" id="lname" name="lname"><br><br>
<input type="submit" value="Submit">
</form>
which should only use elements and attributes defined within the specification else a receiver of that document might not be able to process it correctly.
This process of telling the server which document types a client supports and allowing the server to chose a fitting representation is called content type negotiation and allows the exchange of arbitrary, type-less documents. Of course, both parties need to support and understand at least one common media type to be able to interact with each other. This is similar to a Frenchmen who does not understand a word of Chinese and a Chinese one that does not understand any word of French who need to communicate (for whatever reason), if both speak English they will be able to communicate.
There are loads of different media types already available that all server different purposes. Depending on your needs an all-purpose one, such as text/html, might be sufficient, others such as application/json or even application/hal+json might though lack support for certain needed elements. Existing media types might not support needed elements. In such a case extending such media types and registering those is probably easier than creating a whole media type from scratch.
REST assumes that a resource (i.e. a remote document) contains some internal data, its state. This state can be represented in many different ways. Think of some monthly sails figures. You might ship the data either in a HTML table, as CSV data, as Excel file, as PDF or yet a different representation format. Regardless of the chosen media type to marshal the data in, the actual data at least should express the same. Instead of questioning which media type you want to support, you should better ask how many different ones you want to support as this just increases the likelihood that other clients may interact with your server also.
edit:
I got all the points except 4th point... Could you please elaborate a bit.. So is it just a mechanism that helps in sharing the data between applications by using HTTP methods? we can say like that?
CRUD is a typical term in the context of persistence, especially with databases. REST or more formally the REST architecture itself treats persistence as internal detail. A typical user usually does not care whether some data is persisted into a DB, a local file system or is kept just in memory. All s/he cares is that the server can process it or for storage services also return the same data that was sent to the server.
In regards to the mapping of CRUD to the HTTP operations, if you take a look at HTML you might see that it only supports POST and GET operations. So anything related to C, U or D are performed with POST which is defined to process the enclosed representation according to the resource's own semantic. With POST you are basically allowed to do anything, even retrieving data if you like.
However, HTTP defines certain properties for the respective operations:
safe
idempotent
cacheable
The first property is a promise to clients that a well-behaved server should not alter the state of a resource upon requesting. The second one is a promise in regards to automatic retry attempts caused by i.e. temporary connection issues. And the latter one allows clients to store responses locally and reuse these instead of requesting the same resource again, if the cached content is "fresh-enough".
GET and HEAD are both safe, idempotent and cacheable, meaning that a client can request resources with such an operation without being hold accountable for any eventual changes. Think of a Web spider that is invoking arbitrary URIs all the time to learn new pages over time. If a GET request on a URI would trigger an order of a Pizza or the like, it is basically the server's fault and not the clients one if a crawler would order Pizzas every time such URIs are called.
PUT and DELETE are only idempotent, which basically allows a client to automatically resend a request in case of a network issue as the outcome of the operation leads to the same result regardless whether the request was processed once or multiple times in a row. Note that this property does not consider changes done by other clients to that resource between a resend. Such data would of course be overwritten.
The remaining operations (POST, PATCH, CONNECT, TRACE) are neither of these.
While technically it is sufficient to only use POST for each request, the above mentioned properties should trigger an inner intention to use them, when appropriate. However, as before mentioned, not the client should chose which operation to perform but the server should tell a client which HTTP operation it should use.
In regards to POST vs. PUT, both operations should behave similar on creating a resource. Both need to add a Location header within the response that teaches a client about the location of the new resource. PUT however, in contrast to POST, replaces the current representation of the requested resource with the one provided in the request body. So it already targets the respective resource while for a POST request the server defines where the resource is created. It is allowed to perform certain sanity checks and also to transform the representation to fit the representation format of the current one. It is also allowed to have side effects, i.e. think of Git where a commit creates a new entry on top of the current branch and moves the HEAD to the new commit.
PUT is probably considered as update operation as the replacement of the document more or less has the effect of an update. If no representation was yet available this just has the effect of the creation (including the location header). In the past, unfortunately, many developers used PUT incorrectly by performing a partial update instead of really replacing the whole document. While the spec states that a partial update could be achieved by overlapping resources (i.e. share parts of the same data in multiple resources), the usage of PATCH, which also is used incorrectly most of the time, may be better from a performance standpoint on larger resources.
Due to POSTs definition, one can do anything with it, though historically a document upload in HTML was triggered through this operation that is basically a resource creation on the server side. That POST is used for many other things as well though is not that important for that CRUD paradigm.
In regards to your concerns about the right terminology, most people, according to my experience, simply do not care. They just want to get the job done ASAP and move on. As roughly 90% of the users seem to understand a pretty similar concept when talking about REST (even though this view is flawed) which usually resolves around JSON, HTTP, CRUD, Swagger/OpenAPI, ... they usually only look for quick-win-solutions and more or less agreement on their thought process.
As HTTP (0.9-1.1) is a plain text protocol sending a GET request is not much different from a POST or PUT request, so technically you can create resources with GET request or support payloads on GET requests (semantics of the payload is undefined according to the spec). That's why I mentioned well-behaved client/servers above. In such a case, however, due to the safe property of GET, if you as a server maintainer violate the HTTP protocol you are the one to blame in case something "unexpected" is happening (crawler is ordering 500 Pizzas).

How to establish relationships between Spring Data REST / Spring HATEOAS based (micro) services?

Trying to figure out a pattern for how to handle relationships when using a hypermedia based microservices based on Spring Data Rest or HATEOAS.
If you have service A (Instructor) and Service B (Course) each exist as an a stand alone app.
What is the preferred method for establishing a relationship between the two services. In a manner that does not require columns for IDs of the foreign service. It would be possible for each service to have many other services that need to communicate in the same manor.
Possible solution (Not sure a correct path)
Each service has a second table with a OneToMany with the primary entity within the service. The table would have the following fields:
ID, entityID, rel, relatedID
Then in the opposite service using Spring Data Rest setup a find that queries the join table to find records that match.
The primary goal I want to accomplish would be any service can have relationships with any number of other services without having to have knowledge of the other service.
The basic steps are the following ones:
The service needs to discover the resources of the other service.
The service then adds a link to the resources it renders where necessary.
I have a very rudimentary example of these steps in this repository. The example consists of two services: a service to provide geo-spatial searches for stores. The second service is some rudimentary customer management that optionally integrates with store service if it is currently available.
Here's how the steps are implemented:
Resource discovery
In my example the consuming service (i.e. the customer one) uses Spring HATEOAS' Traverson API to traverse a set of link relations until it finds a link named by-location. This is done in StoreIntegration. So all the client service needs to know is the root URI (taken from the environment in my case) and a set of link relations. It periodically checks the link for existence using a HEAD-request.
This of course can be done in a more sophisticated manner: hard-wiring the base URI into the client service might be considered suboptimal but actually works quite well if you're using DNS anyway (so that you can exchange the actual host behind the URI hard-coded). Nonetheless it's a decent pragmatic approach, still rediscovers the other service if it changes URIs, no additional libraries required.
For an even more sophisticated approach have a look at Netflix' Eureka library which is basically a service registry. Also, you might wanna checkout the Spring Cloud integration we have for that.
Augmenting resources with links
Spring HATEOAS provides a ResourceProcessor API that Spring Data REST leverages. It allows you to manipulate the Resource instance about to be rendered and e.g. add links to it. The implementation for the customers service can be found here.
It basically takes the link just discovered in the steps above and expands it using well-known parameters and thus allows clients to trigger a store geo-search by just following the link.
Beyond that
You can find a more sophisticated variant of this example in the examples projects for Spring Cloud. It takes the very same example but switches to Spring Cloud components such as Eureka integration, gathering metrics, adding UI etc.
In my case I can only derive related items from the service itself. My goal is to abstract the related items to the point that any number of services can be related to a service and only need to lookup ID's or links. One thought was an #ElementCollection named related with a join of the entity ID of the service. Then in the #Embedded have a relLink field and a relatedID field. Then in the repository do a findby to find the relLink and relatedID.
The hope is to keep it abstracted enough to essentially mimic a Many to Many setup.

What is the difference between a cornice.Service and cornice.resource in Cornice?

I have read through the documentation many times over and search all over for the answer to this question but have come up short. Specifically I've looked at Defining the service and the Cornice API for a service and at Defining resource for resource.
I'm currently building a REST API that will have a similar structure to this:
GET /clients # Gets a list of clients
GET /clients/{id} # Gets a specific client
GET /clients/{id}/users # Gets a specific clients users
What would be the best way to go about this? Should I use a service or a resource or both? And, if both, how?
Resources are high-level convenience, services offer lower level control.
I am just learning cornice myself. Looking at the source code, a Resource creates Services internally, one for the item and one for the collection (if a collection path is specified). The resource also adds views to the services for each method defined with an http verb as the name or in the form collection_[verb].
So there is little difference except that resources are a neat, structured way to define services.
The resource decorator uses a url for the collection, as well as a url pattern for an object.
collection_path=/rest/users
path=/rest/users/{id}
The resource decorator is best used in view classes where you can use get/put/post/delete methods on the objects, as well as collection_get, collection_put, etc. on the collection. I have some examples here:
https://github.com/umeboshi2/trumpet/blob/master/trumpet/views/rest/users.py
Since I make heavy use of the resource decorator and view classes, I haven't found a need for the service function, but it allows you to create get,put,post decorators that wrap view callable functions.
If you are using backbone.js on the client side, the resource decorator and url examples work well with the Backbone collections and models.

Querying multiple OData entities for the same search term

I have a client who has a web service providing several different top-level entities. Let's say there are three which are of particular interest: Organisations, Sectors and Activities.
The client wants to be able to search for a term across all three of these entities simultaneously without have to make three separate calls. For example, "return all records whose name contains bread".
While the expand keyword would seem to be the solution at first glance, this only provides a view into the parent entity.
My suspicion is that this cannot be done by virtue of the way in which OData is designed to work, but I need to have a conclusive answer before going back to the client.
Unless the server provides a service operation for this exact purpose (and that would be pretty tricky to design anyway, what type should it return?), then it's not possible in one query.
On the other hand the client can send three queries inside one batch request. So that it's just a single roundtrip to the server. Might be good enough.
You could add a webget to the service to perform this function. You would have to wrap the response objects though.