HttpRequest Host Vulnerabilities - SQL

I would like your advice regarding any security vulnerabilities that could arise from extracting the domain name from the Host property on an HttpRequest.
I have developed a multi-tenant PWA using ASP.NET Core, and I extract the domain (i.e. the tenant) name from the host (HttpRequest.Host), which I use to look up information in a database.
For example, if I had a URL like www.JoeBloggs.com, the extracted domain would be 'JoeBloggs'. Using this I then retrieve the information I require for that tenant.
Information is always sent over an HTTPS connection.
Can the Host value be faked or potentially used in a SQL injection attack if I am using the domain name as part of a database lookup?
Thanks in advance.

Historically there has been a slew of HTTP Host header attacks in which target web servers implicitly trust the Host header value with no (or improper) whitelist checking or sanitization. In short, it is possible to fake this value in certain contexts/configurations.
For concerns regarding SQL injection specifically, you should already be using prepared statements and parameterized queries to mitigate such risks; if you aren't, you should absolutely be working to refactor your SQL interaction code to do so. Even if the Host value sent by a malicious client is intended to exploit a SQL injection vulnerability somewhere downstream from the HTTP server, such a value shouldn't be able to trigger any unintended functionality or data exposure, because parameters bound to queries via mechanisms like prepared statements/parameterized queries aren't interpretable as SQL statements.
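To make the point concrete, here is a minimal sketch of that lookup with the host-derived name bound as a parameter. It's written with Go's database/sql for brevity (the table and column names are made up); the same principle applies in ASP.NET Core via ADO.NET SqlParameter objects or EF Core's parameterized queries:
package tenantstore

import (
    "database/sql"
    "fmt"
)

// Tenant is a hypothetical record looked up by the host-derived name.
// A database/sql driver is assumed to be registered elsewhere in the program.
type Tenant struct {
    ID   int64
    Name string
}

// lookupTenant treats the host-derived name purely as data: it is bound to the
// placeholder, so it can never be interpreted as SQL no matter what a malicious
// client puts into the Host header. ($1 is Postgres-style; use ? for MySQL.)
func lookupTenant(db *sql.DB, tenantName string) (*Tenant, error) {
    const q = `SELECT id, name FROM tenants WHERE name = $1`

    var t Tenant
    if err := db.QueryRow(q, tenantName).Scan(&t.ID, &t.Name); err != nil {
        if err == sql.ErrNoRows {
            return nil, fmt.Errorf("unknown tenant %q", tenantName)
        }
        return nil, err
    }
    return &t, nil
}
Whatever the language, the key property is that the tenant name travels as a bound parameter and is never concatenated into the SQL text.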
Relatedly, if you're using the value of the Host header to determine whether a client should receive any sort of "privileged" information in the response from your server - don't. A Host header value is not a stand-in for a proper authentication/authorization flow and should absolutely never be used as such, considering it can be manipulated rather trivially. You can certainly use it in conjunction with other, more secure methods of authentication/authorization, but using it by itself is a big security no-no.
This advice does not preclude there being a separate exploitable flaw/bug in your database, database driver, or anywhere else in your stack that examines the contents of the Host header.


Differentiating between 404 types

I know the 404 vs 204 debate has been beaten to death, and I understand the argument for using 404 when there is no record in the table corresponding to a REST endpoint request, but it feels like there should be some way of differentiating between "This endpoint is malformed" and "there is no record in the table." For example if I have an endpoint like this:
https://mycloudfront.cloudfront.com/api/my-table/{userId}
Is there a recommended way of configuring error handling on the backend to differentiate between "no resource found because there is no entry for userId" and "no resource found because there is no table named my-table" or "no resource found because there is no cloudfront distribution named mycloudfront"?
I ask because it would be nice on the frontend to inform the end user whether their request did not produce the desired result because they have no data in the table (in which case I would display a message encouraging them to take an action that would generate data) or because something went wrong (in which case I would display an error message).
it would be nice on the frontend to inform the end user whether their request did not produce the desired result because they have no data in the table
That's what the response body is for.
Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition. RFC 9110.
Status codes are metadata in the transfer-of-documents-over-a-network domain (Webber, 2011) - the information indicates to general purpose web components (browsers, proxies, caches, spiders....) the semantics (meaning) of the fields and response body (ex: does the message include a representation of a resource or a representation of an error?)
Bespoke HTTP message handlers (and human operators) are expected to look for information in the body (ex: a 404 for a web page returns a picture of a fail whale and a bunch of links to different resources that might clarify what's gone wrong).
You can also leverage ideas like web linking (RFC 8288), if you want to describe relationships between the error and other resources.
Problem Details (RFC 7807) describes a standardized JSON schema for communicating error information, if you want a JSON representation but prefer not to do all of the schema design yourself.
First and foremost, REST has no endpoints but resources.
there should be some way of differentiating between "This endpoint is malformed" and "there is no record in the table."
By "This endpoint is malformed" I guess you probably mean the request issued to the server doesn't conform to the HTTP specification. As voice already mentioned, HTTP status code are coordination metadata for the outcome of the transportation and not necessarily the outcome of your business logic. Of course you need to come up with a mapping for problems you noticed while applying your business logic to the HTTP transport domain.
Unfortunately, REST is polluted with false assumptions and beliefs. Plenty of people seem to think of it as HTTP-based CRUD mostly done with JSON payloads. But this is just a tiny fraction of what REST really is. At its heart it is a technique used in distributed computing to help decouple clients from servers so that the latter can evolve freely in the future. Clients, on the other hand, are built with possible change in mind as an inherent design decision and therefore end up much more robust towards change.
So, how does REST help to decouple clients from servers?
First, the spelling of a URI is of no importance. The URI needs to be a valid one, but that's basically it. Clients shouldn't parse the URI or try to extract knowledge from it, nor does a URI pattern like /api/user/1 and /api/user/1/stuff mean that both of those URIs are somehow related. That's what link relations are there for.
Next, in order to teach a client what a URI returned by the server is good for, URIs should come with one or more link-relation names, which should either be based on registered ones or at least follow the Web Linking extension mechanism, which basically is just a further URI that does not necessarily need to point to a valid resource. Treat it like a predicate in a (SemWeb) ontology.
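As a small illustration (the URIs and the custom relation name below are made up), a server can attach such relations via the standard Link response header (RFC 8288); in Go that might look like this:
package links

import "net/http"

// ordersHandler is a hypothetical handler; the URIs and the custom relation
// name are examples only, not a convention this answer prescribes.
func ordersHandler(w http.ResponseWriter, r *http.Request) {
    // One registered relation name ("collection") and one extension relation
    // named by a URI, as RFC 8288 allows.
    w.Header().Add("Link", `</api/orders>; rel="collection"`)
    w.Header().Add("Link", `</api/orders/123/payment>; rel="https://acme.com/rel/payment"`)
    w.Header().Set("Content-Type", "application/json")
    w.Write([]byte(`{"id": 123, "status": "open"}`))
}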
Use forms similar to HTML, like HAL-forms, JsonForms or Ion, if your server needs further input from clients. Forms also teach clients which HTTP method to use, which URI to send the request to and which media type to encode the request in, and of course describe the properties the resource has and/or the server expects input for. This information is enough to let a client send HTTP requests that are valid in terms of the transport domain. Note that this does not mean there won't be any issues: requests might still fail to reach the server due to an internet outage on either end, to bad routing that exceeds the maximum number of allowed hops, and so on; but depending on the HTTP method used for sending the request, a client might automatically reissue it once it hits its timeout threshold.
In order to increase the interoperability of the peers in a REST ecosystem, REST has a strong focus on media types. Think of a media type as the binding contract between a client and a server, which should be negotiated between the two. This guarantees that both are capable of exchanging "messages" both understand and are able to process. One of the differences from regular RPC services here is that RPC services are usually restricted to one payload mechanism, while REST supports a more or less unlimited number of payload formats, depending on the support for various media types. A media type is a human-readable description of how a payload should be encoded and processed; besides the syntax of the allowed elements, it also contains a semantic description of the purpose of those elements. A payload issued for plain application/json doesn't really teach a client what the properties of the JSON objects in the payload mean, nor does it really support URIs in the first place. Note, however, that issuing a plain JSON request to the server is fine if the client was "instructed" that way by a form it was acting upon; the server then simply expects that kind of payload. Just look at how a typical HTML document is built up and read some of the tag definitions used within it, and you might get the gist of this paragraph.
Fielding himself was quite vocal about the latter two points in particular in his famous rant:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types ...
So, back to the actual question at hand. Is "there is no record in the table" really a business logic error? You could also design it to return what's currently available and return an empty list. That at least spares you the hassle of mapping that business error onto the transport domain.
If you want or need to express a business logic failure to the client you should, as voice also recommended before, look into application/problem+json (or its XML alternative application/problem+xml) which define properties such as type of failure, general title, status and details among others. The respective type the response is issued for may define further properties specific to that type that are part of the payload. I.e. you may define an extension type of http://acme.com/problem/validation and this extension type defines that the payload needs to contain a target-ref property to identify the element that failed the validation check as well as a property for the actual error message.
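To make that concrete, here is a sketch of such a problem document rendered from Go; the type URI and the target-ref extension member follow the hypothetical validation example above, and the status code chosen here is just an example:
package problem

import (
    "encoding/json"
    "net/http"
)

// ValidationProblem follows the application/problem+json layout (RFC 7807)
// with the hypothetical target-ref extension member from the example above.
type ValidationProblem struct {
    Type      string `json:"type"`
    Title     string `json:"title"`
    Status    int    `json:"status"`
    Detail    string `json:"detail,omitempty"`
    TargetRef string `json:"target-ref,omitempty"`
}

// WriteValidationError writes the problem document with the proper media type.
func WriteValidationError(w http.ResponseWriter, detail, targetRef string) {
    p := ValidationProblem{
        Type:      "http://acme.com/problem/validation",
        Title:     "The request payload failed validation",
        Status:    http.StatusUnprocessableEntity,
        Detail:    detail,
        TargetRef: targetRef,
    }
    w.Header().Set("Content-Type", "application/problem+json")
    w.WriteHeader(p.Status)
    json.NewEncoder(w).Encode(p)
}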
In the end, some general recommendations in terms of REST are:
Design the interactions of clients and servers first as if you were interacting with a typical human-focused Web page, and then translate the interaction steps onto the application domain. REST in the end is nothing more than a generalization of how we humans have interacted with the Web for decades. REST is basically Web surfing for applications rather than humans. Just as we humans follow the outlined state machine of, say, Amazon.com to order some books, computers can do the same. Therefore design the whole interaction between client and server as a state machine that clients just follow along and may exit at certain points.
Allow servers to teach clients what they need to know using the various kinds of form support, and use link relations to put given URIs in context with the current resource.

What is the difference between performing a procedure on a resource and performing a state transformation?

I'm new to web development and I'm attempting to understand REST. The tutorial I'm watching mentions the difference between "procedures" and "state transformation", stating that REST is based on the notion of "state transformation", but it does not delineate the difference between the two.
This has left me wondering what is the difference between the two? Why can't an operation which transforms the state of a resource also be considered a procedure? After all, 'procedure' sounds like a generic enough term that it would also encompass an operation that would transform the state of a resource.
So, what is the difference between performing a procedure on a resource, and performing a state transformation? Or is it merely a matter of semantics?
I have also tried searching for the answer but can't seem to find anything that will shed light on this.
TL;DR
RPC focuses on sending a payload containing method names and arguments in a predefined format. Clients couple tightly to servers through a shared interface (skeleton classes, WSDL or other interface definition languages (IDLs)).
REST focuses on decoupling clients from servers and on introducing indirections, like support for multiple different media types to marshal resource state in, and the whole interaction concept summarized by HATEOAS, where hypertext controls are used to drive the application state forward through a domain application protocol / state machine on the server side. Responses usually contain semi-structured data, which usually doesn't go well with simple CRUD applications, following the corresponding media type definition (e.g. the HTML spec). If you will, the state of a resource is transformed into a representation format adhering to the rules of the media type definition and transferred to the remote side.
In network programming, remote procedure call (RPC)-style invocations, as often used in RMI, CORBA, SOAP or similar frameworks, will usually send a method name that should be invoked at the server along with the parameters to feed the method with. The return value is then marshalled into a corresponding response and sent back to the caller. What a client can invoke is usually exposed externally via skeleton classes, WSDL or other forms of contracts, and so on. So far, so simple. This is how most networking stuff works. However, the drawback here is that a client is tightly coupled to the exposed interface (skeleton classes, WSDL, external documentation), and many problems in internet computing arise due to changes over time that are not adequately depictable in those interfaces.
If you take a closer look at how the Web has worked for decades, though, change is an inherent part of it. Your browser will just show the most recent state of a resource (Web page) it has. It might either have gotten it from its cache or from a server it asked. If the version available in its cache is older than a predefined threshold value, it will ignore the cached value and request a new version. If an update happened since the last version, your browser is automatically served the new version. Fielding, who was working on the HTTP 1.0 and 1.1 specs back then, analyzed how the interaction on the Web takes place and generalized his findings into the REST architectural style. So, if you will, REST is just Web surfing for applications.
Unfortunately a majority of enthusiasts and professionals have not yet understood what REST really is, and there is so much false information available in regards to REST; even here at Stack Overflow most people don't seem to care, and posts explaining the true nature of REST are downvoted while wrong information is upvoted.
So, what does REST do differently than typical RPC-like method invocations?
First, REST relies on a certain set of uniform interfaces that are the same for every participant in that architecture. These are, e.g., HTTP as the transport layer and a naming scheme for resources (URIs), so that everyone acts on these fixed principles. This helps to reduce the interoperability issues that are just way too common in traditional network programming.
Next, it relies on a basic principle: servers teach clients what they need to know. But how does a server know what a client needs to know? Well, as Jim Webber pointed out, the designer of the application develops a state machine (or domain application protocol) a client will follow through. Think of the checkout system of your favorite online shop. At one point it presents you the items in your trolley and offers you a choice to progress to the next "page" where you enter the shipping address, and on further progressing through the state machine you are asked for your payment options and so on, until at some point you have finished the checkout and are served a "Thank you" page that summarizes your order. Under the hood you just progressed through their protocol for how to place orders and used application controls to progress your client further through their state machine. You were therefore served some Web forms and links that you used to fulfill your task. In essence, this is what Hypertext as the engine of application state (or HATEOAS for short) is all about.
On the Web, HTML forms are used to teach a client what properties a resource supports, which ones are editable and so on. Besides that, a form also teaches clients the actual URI to send input data to, the HTTP operation to utilize and, mostly implicitly given, the media type to marshal the request into. E.g. a regular HTML form will use application/x-www-form-urlencoded as its default media type to send the data to the server. So a full HTTP request for an input of a first and last name may look like this:
POST /path/to/resource HTTP/1.1
Host: acme.org
Connection: close
Accept: */*
User-Agent: ...
Content-Type: application/x-www-form-urlencoded
Content-Length: 32
firstName=Roman&lastName=Vottner
The same data could be sent using a different representation format, if it were supported by the media type the form was issued for. Unfortunately, HTML does not support that many.
Links provided by a server should usually be annotated (or accompanied) by so-called link relation names that put the current resource in relation with the given URI. If you will, they are the predicate in a triple of subject (current page), predicate (link relation name) and object (link target resource). Such names, of course, should be standardized or at least follow the Web Linking extension mechanism. URIs themselves are opaque, meaning they don't provide meaning on their own and should therefore not be parsed or analyzed at all. A common mistake often seen in so-called "REST APIs" is that they have typed resources, i.e. a user resource or a car resource that can be marshalled on the client side to a programming-language-specific object (e.g. a Java object of class User or the like), which is pretty common in traditional RPC-style programming. In a REST architecture, however, the representation format is usually semi-structured data, i.e. a mix of syntax-defining control elements and actual data. As such, a direct mapping from DB entry, to model object, to a resource itself, as done by so many CRUD applications, is not possible.
Why is all this done in the first place?
In traditional network programming, a client is usually only able to work with that one server, and if something at that server changes, clients may be affected and thus stop working. There is an apparent tight coupling between the two. The REST architecture introduces a couple of indirections, e.g. the usage of link relations instead of attempting to analyze meaningful URIs, as well as the usage of a multitude of possible media types instead of relying on one fixed message format, which help to decouple clients from servers. I.e. instead of coupling to the server in regard to the messages exchanged, both client and server couple to media types. Through content-type negotiation a client simply tells the server of its capabilities, and the server should generate a response the client can process. Instead of focusing on one message format, REST has the freedom of almost infinitely many, as long as both client and server support them. The more media types a peer supports, the more likely it will be able to interact with other peers in that network.
All the points I've mentioned above lead to a strict decoupling of clients and servers, which allows the latter to evolve freely without having to fear that the changes introduced will break clients, as neither the transport protocol nor the naming scheme has changed and the changes introduced are still within the scope of the media type definition. So, well-behaved peers in that network will be able to pick up changes on the fly automatically. This is especially handy if you develop an application that should withstand the sands of time and still serve clients in years to come.
If you don't need such properties, there is nothing wrong with not being "RESTful" at all, just don't call such services/APIs REST then. Also, developing REST is for sure more overhead compared to typical RPC-style interactions.

some doubts related to REST API [closed]

Can we say that like REST APIs we have SOAP, XML-RPC, JSON-RPC, GraphQL - I mean, is REST just a type of API?
Is REST just a mechanism to share data between applications by using HTTP methods?
Can REST share data between applications only with HTTP, or is there an alternative?
What is the relation between REST & CRUD exactly? We say HTTP POST is CRUD CREATE. My question is: HTTP POST will just post the data to the server, and it is the business logic's job to CREATE A NEW RESOURCE on the server side, so how can we all say HTTP POST is nothing but CRUD CREATE? HTTP POST is just helping to share the data, right? Then how is it related to CRUD CREATE? If that is the case, we could create a new resource with HTTP GET by writing business logic, right? Then why do so many sites say REST is a mechanism to do CRUD operations? Shouldn't it rather be that REST is a mechanism to help share data between applications?! (I have similar doubts about HTTP DELETE, GET and PUT as well.)
Last but not least... what exactly does Representational State Transfer mean? Could you please answer this with a very low-level, general answer instead of a definition?
Can we say that like REST APIs we have SOAP, XML-RPC, JSON-RPC, GraphQL - I mean, is REST just a type of API?
REST stands for representational state transfer and is just an architectural style, not a technology or protocol. According to Robert C. "Uncle Bob" Martin, an architecture is about intent, and the intention behind REST is the decoupling of servers and clients in a distributed system.
REST basically just defines a set of constraints that, when followed correctly, allow servers to change at any time without breaking clients, as clients will just depend on the data given by the server and not on any external data or documentation. REST can be regarded as Web surfing for applications. The main premise should always be that a server teaches a client how certain things can be achieved.
On the Web a server can, for instance, teach a client about the supported properties of a resource with the help of HTML forms. Not only does a client learn that way what a server expects as input, it also learns what HTTP operation should be used to send the data to the server, the endpoint URI to send the request to and, usually implicitly given, the media type to convert the input to, which by default is application/x-www-form-urlencoded. That media type transforms an input like the HTML example below for first and last name into something like this:
fName=Roman&lName=Vottner
Is REST just a mechanism to share data between applications by using HTTP methods?
REST itself is protocol agnostic, meaning that it is not tied to HTTP and could work over other transport protocols as well, though the common perception many developers have is that it is based on HTTP. After all, as Jim Webber put it, HTTP is just a transport layer whose domain is the transfer of files or data over the Web. All HTTP does is send one document from one machine to the next, and any business rules we conclude from sending/receiving a request are just a side effect of the actual document management. It is therefore always better to think of a request as a whole document, and of the HTTP operation as defining how the document should be stored on the target machine, especially when such a document is already available, instead of thinking of service method invocations. The latter is a typical RPC view.
Can REST share data between applications only with HTTP, or is there an alternative?
HTTP is just a transport layer used in a REST architecture. The architectural style cares more about the interaction model between clients and servers than about the technical nuances of HTTP. As REST itself is transport-protocol agnostic, it could be used with other, maybe proprietary, protocols as well.
What is the relation between REST & CRUD exactly? We say HTTP POST is CRUD CREATE. My question is: HTTP POST will just post the data to the server, and it is the business logic's job to CREATE A NEW RESOURCE on the server side, so how can we all say HTTP POST is nothing but CRUD CREATE? HTTP POST is just helping to share the data, right? Then how is it related to CRUD CREATE? If that is the case, we could create a new resource with HTTP GET by writing business logic, right? Then why do so many sites say REST is a mechanism to do CRUD operations? Shouldn't it rather be that REST is a mechanism to help share data between applications?! (I have similar doubts about HTTP DELETE, GET and PUT as well.)
Fielding's thesis on the REST architectural style does not contain the term CRUD at all. The term REST nowadays is heavily misused, as people probably didn't bother to actually read the thesis, which admittedly is a bit abstract, and just followed what some people thought might be REST but was actually RPC. Nowadays, if a typical stakeholder talks about REST they usually think of a JSON-based HTTP CRUD API whose supported endpoints are defined in some Web documentation (Swagger, OpenAPI, ...) and where the HTTP operations POST (= Create), GET (= Read), PUT (= Update), DELETE (= Delete) are supported by default. However, this is unfortunately far from the truth. People are just too accustomed to their (wrong) definition and don't see or don't care about the actual problem in their misuse. They don't care about a long-lasting service, as in 2-5 years the next-gen technology is here that allows reducing the number of lines of code even more, and if a new "version" of a service is needed, this usually goes hand in hand with a technology switch as well, to somehow justify the "cost of change".
Last but not least... what exactly does Representational State Transfer mean? Could you please answer this with a very low-level, general answer instead of a definition?
Probably the easiest way to grasp how the interaction in a REST architecture should work is by analyzing typical interaction on the Web, the big cousin of REST. You, as a user, usually start by opening your browser (= client) and typing some URL into the address bar. Next, a Web page is rendered on your screen. Behind the curtain a couple of things happened. Besides the whole connection management and any eventual TLS handshake, your browser sent at least one GET request to the target server. With the GET request, the client included information on its capabilities, e.g. through the Accept HTTP header. This header is used on the server side to decide which representation format to generate and send to the user. On the Web this might be something like text/html or application/xhtml+xml, or, if some report is generated, something like application/pdf or application/vnd.ms-excel or the like, depending on which representation format fits the data best.
The representation format itself is now a concrete instance of a document following a certain media type specification. I.e. the HTML forms specification defines the supported elements within a <form>...</form> tag pair as well as describes the semantics of each of the elements. The concrete instance may now define a form as such:
<form action="/action_page.php" method="get">
<label for="fname">First name:</label>
<input type="text" id="fname" name="fname"><br><br>
<label for="lname">Last name:</label>
<input type="text" id="lname" name="lname"><br><br>
<input type="submit" value="Submit">
</form>
which should only use elements and attributes defined within the specification, or else a receiver of that document might not be able to process it correctly.
This process of telling the server which document types a client supports and allowing the server to choose a fitting representation is called content-type negotiation and allows the exchange of arbitrary, type-less documents. Of course, both parties need to support and understand at least one common media type to be able to interact with each other. This is similar to a Frenchman who does not understand a word of Chinese and a Chinese person who does not understand a word of French needing to communicate (for whatever reason): if both speak English, they will be able to communicate.
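As a rough illustration of that negotiation step on the server side (the matching below is deliberately naive, ignores quality values, and the payloads are made up), a Go handler might look like this:
package negotiate

import (
    "net/http"
    "strings"
)

// reportHandler picks a representation based on the Accept header.
// Deliberately naive: real servers should honour quality values and ordering.
func reportHandler(w http.ResponseWriter, r *http.Request) {
    accept := r.Header.Get("Accept")
    switch {
    case strings.Contains(accept, "text/csv"):
        w.Header().Set("Content-Type", "text/csv")
        w.Write([]byte("month,sales\nJanuary,42\n"))
    case strings.Contains(accept, "text/html"), accept == "", strings.Contains(accept, "*/*"):
        w.Header().Set("Content-Type", "text/html")
        w.Write([]byte("<table><tr><th>month</th><th>sales</th></tr><tr><td>January</td><td>42</td></tr></table>"))
    default:
        // None of the representations the client accepts are supported here.
        w.WriteHeader(http.StatusNotAcceptable)
    }
}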
There are loads of different media types already available that all serve different purposes. Depending on your needs, an all-purpose one such as text/html might be sufficient, while others such as application/json or even application/hal+json might lack support for certain needed elements. In such a case, extending one of these media types and registering the extension is probably easier than creating a whole media type from scratch.
REST assumes that a resource (i.e. a remote document) contains some internal data, its state. This state can be represented in many different ways. Think of some monthly sales figures. You might ship the data in an HTML table, as CSV data, as an Excel file, as a PDF, or in yet another representation format. Regardless of the media type chosen to marshal the data in, the actual data should at least express the same thing. Instead of asking which media type you want to support, you should rather ask how many different ones you want to support, as this increases the likelihood that other clients will be able to interact with your server as well.
edit:
I got all the points except the 4th point... Could you please elaborate a bit? So is it just a mechanism that helps in sharing data between applications by using HTTP methods? Can we say it like that?
CRUD is a typical term in the context of persistence, especially with databases. REST, or more formally the REST architectural style, treats persistence as an internal detail. A typical user usually does not care whether some data is persisted into a DB, a local file system, or is kept just in memory. All they care about is that the server can process it or, for storage services, also return the same data that was sent to the server.
In regards to the mapping of CRUD to the HTTP operations: if you take a look at HTML, you might see that it only supports POST and GET operations. So anything related to C, U or D is performed with POST, which is defined to process the enclosed representation according to the resource's own semantics. With POST you are basically allowed to do anything, even retrieving data if you like.
However, HTTP defines certain properties for the respective operations:
safe
idempotent
cacheable
The first property is a promise to clients that a well-behaved server should not alter the state of a resource upon such a request. The second one is a promise in regards to automatic retry attempts caused by, e.g., temporary connection issues. And the last one allows clients to store responses locally and reuse them instead of requesting the same resource again, if the cached content is "fresh enough".
GET and HEAD are both safe, idempotent and cacheable, meaning that a client can request resources with such an operation without being held accountable for any eventual changes. Think of a Web spider that is invoking arbitrary URIs all the time to learn new pages over time. If a GET request on a URI triggered an order of a pizza or the like, it is basically the server's fault, not the client's, if a crawler orders pizzas every time such URIs are called.
PUT and DELETE are only idempotent, which basically allows a client to automatically resend a request in case of a network issue, as the outcome of the operation leads to the same result regardless of whether the request was processed once or multiple times in a row. Note that this property does not consider changes made by other clients to that resource between resends. Such data would of course be overwritten.
The remaining operations (POST, PATCH, CONNECT, TRACE) are neither of these.
While technically it is sufficient to only use POST for each request, the above-mentioned properties should encourage you to use the other operations when appropriate. However, as mentioned before, it is not the client that should choose which operation to perform; the server should tell the client which HTTP operation to use.
In regards to POST vs. PUT, both operations should behave similarly when creating a resource. Both need to add a Location header to the response that teaches a client the location of the new resource. PUT, however, in contrast to POST, replaces the current representation of the requested resource with the one provided in the request body. So it already targets the respective resource, while for a POST request the server defines where the resource is created. The server is allowed to perform certain sanity checks and also to transform the representation to fit the representation format of the current one. It is also allowed to have side effects; think of Git, where a commit creates a new entry on top of the current branch and moves the HEAD to the new commit.
PUT is probably considered an update operation because the replacement of the document more or less has the effect of an update. If no representation was available yet, this simply has the effect of a creation (including the Location header). In the past, unfortunately, many developers used PUT incorrectly by performing a partial update instead of really replacing the whole document. While the spec states that a partial update could be achieved by overlapping resources (i.e. sharing parts of the same data across multiple resources), the usage of PATCH, which is also used incorrectly most of the time, may be better from a performance standpoint for larger resources.
Due to POST's definition, one can do anything with it, though historically a document upload in HTML was triggered through this operation, which is basically a resource creation on the server side. That POST is used for many other things as well is, however, not that important for the CRUD paradigm.
In regards to your concerns about the right terminology: most people, according to my experience, simply do not care. They just want to get the job done ASAP and move on. As roughly 90% of users seem to understand a pretty similar concept when talking about REST (even though this view is flawed), which usually revolves around JSON, HTTP, CRUD, Swagger/OpenAPI, ..., they usually only look for quick-win solutions and more or less agreement with their thought process.
As HTTP (0.9-1.1) is a plain-text protocol, sending a GET request is not much different from a POST or PUT request, so technically you can create resources with a GET request or support payloads on GET requests (the semantics of such a payload are undefined according to the spec). That's why I mentioned well-behaved clients/servers above. In such a case, however, due to the safe property of GET, if you as a server maintainer violate the HTTP protocol, you are the one to blame in case something "unexpected" happens (a crawler ordering 500 pizzas).

Multi-tenancy in Golang

I'm currently writing a service in Go where I need to deal with multiple tenants. I have settled on using the one-database, shared-tables approach, using a 'tenant_id' discriminator for tenant separation.
The service is structured like this:
gRPC server -> gRPC Handlers -
                              \_ Managers (SQL)
                              /
HTTP/JSON server -> Handlers -
Two servers, one gRPC (administration) and one HTTP/JSON (public API), each running in its own goroutine and with its own respective handlers that can make use of the functionality of the different managers. The managers (let's call one 'inventory-manager') all live in different root-level packages. These are, as far as I understand it, my domain entities.
In this regard I have some questions:
I cannot find any ORM for Go that supports multiple tenants out there. Is writing my own on top of perhaps the sqlx package a valid option?
Other services in the future will require multi-tenant support too, so I guess I would have to create some library/package anyway.
Today, I resolve the tenants by using a ResolveTenantBySubdomain middleware for the public API server. I then place the resolved tenant id in a context value that is sent with the call to the manager. Inside the different methods in the manager, I get the tenant id from the context value. This is then used with every SQL query/exec call, or an error is returned if the tenant id is missing or invalid. Should I even use context for this purpose?
Resolving the tenant on the gRPC server, I believe I have to use the UnaryInterceptor function for middleware handling. Since the gRPC API interface will only be accessed by other backend services, I guess resolving by subdomain is unnecessary here. But how should I embed the tenant id? In the header?
Really hope I'm asking the right questions.
Regards, Karl.
I cannot find any ORM for Go that supports multiple tenants out there. Is writing my own on top of perhaps the sqlx package a valid option?
ORMs in Go are a controversial topic! Some Go users love them, others hate them and prefer to write SQL manually. This is a matter of personal preference. Asking for specific library recommendations is off-topic here, and in any event, I don't know of any multi-tenant ORM libraries – but there's nothing to prevent you from using a wrapper of sqlx (I work daily on a system which does exactly this).
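To illustrate what such a thin wrapper could look like (all names below are made up; the point is only that the tenant_id discriminator is applied to every query):
package store

import (
    "context"

    "github.com/jmoiron/sqlx"
)

// InventoryStore is a hypothetical "manager" that scopes every query by tenant.
type InventoryStore struct {
    db *sqlx.DB
}

func NewInventoryStore(db *sqlx.DB) *InventoryStore {
    return &InventoryStore{db: db}
}

// Item is a hypothetical inventory row.
type Item struct {
    ID       int64  `db:"id"`
    TenantID int64  `db:"tenant_id"`
    Name     string `db:"name"`
}

// ListItems always filters on the tenant_id discriminator column, so a query
// can never leak another tenant's rows as long as callers pass the right ID.
func (s *InventoryStore) ListItems(ctx context.Context, tenantID int64) ([]Item, error) {
    var items []Item
    err := s.db.SelectContext(ctx, &items,
        `SELECT id, tenant_id, name FROM inventory_items WHERE tenant_id = $1`, tenantID)
    return items, err
}
Whether tenantID comes from the HTTP middleware or a gRPC interceptor is then irrelevant to the manager itself.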
Other services in the future will require multi-tenant support too, so I guess I would have to create some library/package anyway.
It would make sense to abstract this behavior from those internal services in a way which suits your programming and interface schemas, but there are no further details here to answer more concretely.
Today, I resolve the tenants by using a ResolveTenantBySubdomain middleware for the public API server. I then place the resolved tenant id in a context value that is sent with the call to the manager. Inside the different methods in the manager, I get the tenant id from the context value. This is then used with every SQL query/exec call, or an error is returned if the tenant id is missing or invalid. Should I even use context for this purpose?
context.Context is mostly about cancellation, not request propagation. While your use is acceptable according to the documentation for the WithValue function, it's widely considered a bad code smell to use the context package as currently implemented to pass values. Rather than use implicit behavior, which lacks type safety and many other properties, why not be explicit in the function signature of your downstream data layers by passing the tenant ID to the relevant function calls?
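For illustration, a handler in that explicit style might look like the sketch below; it assumes the hypothetical InventoryStore from the earlier sketch plus a made-up TenantStore with an IDByName lookup, and the subdomain parsing is deliberately simplified:
// Assumes "encoding/json", "net/http" and "strings" are imported, plus the
// hypothetical InventoryStore from the earlier sketch and a made-up
// TenantStore exposing IDByName(ctx, name) (int64, error).
func listItemsHandler(inventory *InventoryStore, tenants *TenantStore) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        // Resolve the tenant from the subdomain of the Host header
        // (e.g. "joebloggs.example.com" -> "joebloggs").
        name := strings.Split(r.Host, ".")[0]
        tenantID, err := tenants.IDByName(r.Context(), name)
        if err != nil {
            http.Error(w, "unknown tenant", http.StatusNotFound)
            return
        }
        // The tenant ID is passed explicitly to the data layer instead of
        // being smuggled through a context value.
        items, err := inventory.ListItems(r.Context(), tenantID)
        if err != nil {
            http.Error(w, "internal error", http.StatusInternalServerError)
            return
        }
        w.Header().Set("Content-Type", "application/json")
        json.NewEncoder(w).Encode(items)
    }
}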
Resolving the tenant on the gRPC server, I believe I have to use the UnaryInterceptor function for middleware handling. Since the gRPC API interface will only be accessed by other backend services, I guess resolving by subdomain is unnecessary here. But how should I embed the tenant id? In the header?
The gRPC library is not opinionated about your design choice. You can use a header value (to pass the tenant ID as an "ambient" parameter to the request) or explicitly add a tenant ID parameter to each remote method invocation which requires it.
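If you go the header route, a minimal sketch of a unary interceptor that pulls a tenant ID out of the incoming metadata could look like this (the "tenant-id" key name is just an example, not a convention):
package tenantgrpc

import (
    "context"

    "google.golang.org/grpc"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/metadata"
    "google.golang.org/grpc/status"
)

// TenantIDFromMetadata reads the "tenant-id" metadata value; the key name is
// only an example, so pick whatever convention suits your services.
func TenantIDFromMetadata(ctx context.Context) (string, bool) {
    md, ok := metadata.FromIncomingContext(ctx)
    if !ok {
        return "", false
    }
    vals := md.Get("tenant-id")
    if len(vals) == 0 {
        return "", false
    }
    return vals[0], true
}

// RequireTenant is a unary interceptor that rejects requests lacking the
// tenant-id metadata before they ever reach a handler.
func RequireTenant(ctx context.Context, req interface{},
    info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {

    if _, ok := TenantIDFromMetadata(ctx); !ok {
        return nil, status.Error(codes.InvalidArgument, "missing tenant-id metadata")
    }
    // Handlers call TenantIDFromMetadata themselves and pass the ID explicitly
    // to the managers, in line with the advice above.
    return handler(ctx, req)
}

// Wiring it up: grpc.NewServer(grpc.UnaryInterceptor(RequireTenant))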
Note that passing a tenant ID between your services in this way creates implicit trust between them – if service A makes a request of service B and annotates it with a tenant ID, you assume service A has performed the necessary access control checks to verify that a user of that tenant is indeed making the request. There is nothing in this simple model to prevent a rogue service C from asking service B for information about some arbitrary tenant ID. An alternative implementation would implement a more complex trust-nobody policy whereby each service is provided with sufficient access control information to make its own policy decision as to whether a particular request scoped to a particular tenant should be fulfilled.

Best practice: What to use instead of "User-Agent" in HTTP headers to identify an app?

It looks like you can't always set the "user-agent" header using Ajax (user-agent is somewhat of a reserved keyword, and you can't forge it in some browsers because of security concerns).
When calling my REST service I'd like the caller to give me a clue about who (which application) is using it.
Registration won't be mandatory, it's rather a way to check if there are some external (valuable) clients that still use my web service when I'd like to close it.
So if I can't use "user-agent", is there some name of choice to use instead?
X-Application-Id? X-UserAgent?
Is there some doc that lists all those X-*** headers?
Depending on your application, it may or may not be possible to add custom headers to your requests.
From your question, I assume that you are able to set custom headers. The default would be to use the User-Agent header field as described in the official HTTP specification:
The "User-Agent" header field contains information about the user agent originating the request, which is often used by servers to help identify the scope of reported interoperability problems, to work around or tailor responses to avoid particular user agent limitations, and for analytics regarding browser or operating system use. A user agent SHOULD send a User-Agent field in each request unless specifically configured not to do so.
If you choose to add another custom header, there are no restrictions on or recommendations for the name of the header. Please note that you should not use an "X-" prefix, as described in RFC 6648:
Historically, designers and implementers of application protocols have often distinguished between standardized and unstandardized parameters by prefixing the names of unstandardized parameters with the string "X-" or similar constructs. In practice, that convention causes more problems than it solves. Therefore, this document deprecates the convention for newly defined parameters with textual (as opposed to numerical) names in application protocols.
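For example, a client could identify itself with a plainly named custom header. The header name and value below are just examples, not a standard, and the snippet is in Go for brevity; the same idea applies to any HTTP client that lets you set headers (e.g. setRequestHeader on XMLHttpRequest or the headers option of fetch, where the platform allows it):
package main

import (
    "fmt"
    "net/http"
)

func main() {
    req, err := http.NewRequest(http.MethodGet, "https://api.example.com/v1/things", nil)
    if err != nil {
        panic(err)
    }
    // A plainly named custom header, without the deprecated "X-" prefix.
    req.Header.Set("Application-Id", "my-mobile-app/1.4.2")
    // Still send a User-Agent where the platform allows it.
    req.Header.Set("User-Agent", "my-mobile-app/1.4.2 (contact: ops@example.com)")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status)
}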