How is request value passed in boost - boost-asio

I've started network programming and I can neither understand nor find how asio server knows what client wants. I mean in tutorial on boost.org there was one function, but what if client has some specific request? How to get information what he wants from his request?

Related

Intercepting, manipulating and forwarding API calls to a 3rd party API

this might be somewhat of a weird, long and convoluted question but hear me out.
I am running a licensed 3rd party closed-source proprietary software on my on-premise server that stores and manipulates data, the specifics of what it does are not important. One of the features of this software is that it has an API that accepts requests to insert/manipulate/retrieve data. Because of the poorly designed software, there is no mechanism to write internal scripts (at least not anymore, it has been deprecated in the newest versions) for the software or any events to attach to for writing code that further enhances the functionality of the software (further manipulation of the data according to preset rules, timestamping through a TSA of the incoming packages, etc.).
How can I bypass the need for an internal scripting functionality that still gives me a way to e.g. timestamp an incoming package and return an appropriate response via the API to the sender in case of an error.
I have thought about using the in-built database trigger mechanisms (specifically MongoDB Change Streams API) to intercept the incoming data and adding the required hash and other timestamping-related information directly into the database. This is a neat solution, other than the fact that in case of an error (there have been some instances where our timestamping authority API is down or not responding to requests) there is no way to inform the sender that the timestamping process has not gone through as expected and that the new data will not be accepted into the server (all data on the server must be timestamped by law).
Another way this could be done is by intercepting the API request somehow before it reaches its endpoint, doing whatever needs to be done to the data, and then forwarding the request further to the server's API endpoint and letting it do its thing. If I am not mistaken the concept is somewhat similar to what a reverse proxy does on the network layer - it routes incoming requests according to rules set in the configuration, removes/adds headers to the packets, encrypts the connection to the server, etc.
Finally, my short question to this convoluted setup would be: what is the best way of tackling this problem, are there any software solutions or concepts that I should be researching?

how to design REST API to ask server to wait for resource version to arrive on GET requests?

I work on splitting monoliths into microservices. With the monolith, I had a single source of truth and can just GET /resources/123 right after the PATCH /resources/123 and be sure that the database has the up-to-date data I need.
With microservices and CQRS in place, there is a risk that the query service has not seen yet the latest update to the record when I perform a GET request.
What is the best or standard approach to making sure that the client receives back the up-to-date value? I know that the client may compare resource versions that he receives after PATCH and after GET and retry requests, but is there a known API design to tell the server something like GET /resources/123 and wait up to 5 sec for the resource version 45 or bigger to arrive?
Since a PATCH request allows a response body, to my mind there's nothing wrong with the response including the object after patching. The requestor who sent the PATCH can use the response in lieu of a GET; for others, the eventual consistency delay for the GET isn't observable (since they don't know when the PATCH was issued).
CQRS means to not contort your write model for the sake of reads. If there's a read that is easily performed based on the write model, that read can be done against the write model.
Generally a better design might be for the PATCH request to delay its own response, if that's an option.
However, your GET request can also just 'hang' until it's ready. This generally feels like a better design than polling.
A client could indicate to the server how long it's willing to wait using a Prefer: wait= header: https://datatracker.ietf.org/doc/html/rfc7240#section-4.3
This could be used both for the GET or the PATCH request.
I don't think there's a standard HTTP way to say: this resource is not available right now, but will be in the future. However, there is a standard HTTP header to tell clients when to retry the request:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
This is mainly used for 429 and 503 errors but neither seem appropriate here.
Frankly this is one of the first thing I've heard in a while that could be a good new HTTP status code. 425 Too Early exists but its a different use-case.

When is an API not RPC?

When is an API NOT RPC ?
After a long long discussion on twitter regarding API design.
I want to try to come to a clear answer when an API is not RPC based.
There seems to be quite a bit of confusion around this Is That REST API Really RPC? Roy Fielding Seems to Think So
That specific link is about REST and RPC. my question just aims to ask about RPC vs not RPC in general, not in the context of HTTP.
The TLDR; definition of when an API IS RPC is "when it hides the network from the developer"
Fair enough.
But back to when is it not.
The Twitter duscussion focused on the HttpClient, as not being RPC.
My argument is that the HttpClient does not really model the network.
There is nothing in there modelling latency.
There is also a very weak representation of network failures.
Most status codes in HTTP are not network related, e.g. NotFound, Payment Required etc.
Depending on language and SDK, the HttpClient may take a string or an URI.
In the case where it takes a string representing the URI, that also doesn't really communicate that there is a network involved, you would need to know what the string represents and whant an URI is.
If you had never seen HTTP before, how would you know by looking at that specific API?
There is an API consuming a string, and returning an object either containing a string or some int response codes.
How would you know?
Most developers of-course knows what HTTP is, but from a strict definition point of view, how would you define an API as not RPC?
Would a checked exception in Java "NetworkException" be enough to clearly communicate that there is a network involved?
If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?
And exactly when is a procedure a procedure in remote terms?
All network communication will result in some code being run on the receiving system.
If we take a person who have never been in touch with technology or protocols, and teach this person a programming language.
What would be the definition of "Not RPC" this person could employ to spot what is RPC and not?
I am being dense and possibly silly here, but I am trying to find the essence of "Not RPC", what exactly is required for this?
Does it for example need to be non blocking in order to play nice with unknown latency?
Or could a non RPC API be blocking?
IMO, if it is blocking, it hides that there will be latency involved, and thus falls into the "Latency is zero" fallacy, meaning it hides the network.
This all seems to be super obvious to everyone else but me, but no one have yet shown a concise answer on what the requirements for not being RPC is.
For me it looks like you're putting all eggs into one basket. Lets start from the very beginning:
API is application programming interface
it is a set of clearly defined methods of communication between various software components
The only definitive property of API is that it is defined/documented way of communication.
If one application/module/component A can use/call another application/module/component B somehow - it means that B provides API and A uses this API.
Usually there are two aspects of API which must be defined:
What exactly is passed into/returned from component (its your application logic)
How exactly data is transferred/serialized (this is technical implementation)
I'm not touching "what" part for obvious reasons.
Lets focus on different ways of "how":
push 4 byte integer to stack, change IP register and read 4 byte integer output from EAX register
connect to socket, write data serialized as byte array and read some response as byte array back
call 911 from any phone, say your address and expect several cars on your driveway
In all these cases you're using some API = you're communicating with other components in some predefined way.
RPC is remote procedure call
computer program causes a procedure (subroutine) to execute in another address space
The only definitive property of RPC is that data is passed across different/remote address spaces, some architecture allow different address spaces on single host, for example x86. As soon as different physical hosts usually do not share address space, any call across network is RPC, but not vice versa.
Note: It is rare, but possible to share memory space across different physical hosts connected in network - then such communication strictly speaking is not RPC, lets omit such cases.
Any RPC call automatically means that you're calling some API. By definition. So we can say that RPC is part of API's "how", it is transport level. As soon as RPC itself does not define actual mechanism, there are could be very different implementations, for example shared memory, DMA, TCP/IP, etc.
I suppose now you can answer your question when an API is not RPC based - When API says so. It is up to API developer to define whether it should/can be called via RPC or not, API can define multiple ways of calling it.
As antonym to RPC you can use "in-process".
So, phrase API IS RPC is "when it hides the network from the developer" is absolutely non-sense. API must define "how" section.
HTTP is hyper text transfer protocol
request–response protocol in the client–server computing model
The only definitive property of HTTP is that it describes protocol/format of request (input arguments) and response.
You can use HTTP in non-RPC API. For example I prefer to think about HTTP as file format. So we can say that HTTP is another part of API's "how". It defines serialization part of API, but does not dictate you transport level.
Note: Some RPC protocols actually define both transport and serialization.
So, HttpClient is the tool which allows you to invoke API and use HTTP encoded request/response, usually such library supposes RPC as transport level. None of these terms mandates network or any particular transport protocol. This is why http client should not declare any kind of network exceptions/errors, but it could throw HTTP errors as exceptions.
Note: Network exceptions could be thrown from TCP/IP RPC implementation for example, HTTP client library could proxy them to you. Unfortunately some libraries wrongly couple HTTP with TCP/IP too much and border between different responsibilities is crossed.
REST is representational state transfer
architectural style for distributed hypermedia systems
It is very wide term, it defines a lot of things from different aspects and at very different levels, most important:
HOW your API should be designed (usage of URI)
HOW your API should be implemented (stateless, HTTP verbs)
HOW your API should be called (client-server)
Client-server, usually assumes cross-process communication. Ie different address spaces, this is why we can say that REST mandates RPC as invoking mechanism to API + HTTP as serialization format.
Now, I suppose you will be able to understand these answers:
How would you know?
when you give client your REST API, it automatically defines some part of your API in terms of other protocols. Ie - to use REST API you must read/know HTTP protocol first. API must define what kind of RPC must be used.
HttpClient does not really model the network.
It should not do this. It works with HTTP semantics.
you would need to know what the string represents and whant an URI
There are two URIs actually:
URI is part of API's "What" section, it defines business object location. It has nothing to do with network or DNS system. You should not understand it.
URI could be part of TCP/IP RPC requirements, in this case it represents domain name/path. But some implementations can work with IP addresses, not URI.
If the functions you are calling have descriptive names like "SayHelloOverNetwork" would that be enough to make it not RPC?
As I wrote before - we can assume network as RPC always.
Does it for example need to be non blocking in order to play nice with unknown latency?
Its up to API developer:
API can contain functions which suppose asynchronous execution, but client can call them in blocking way
API can contain blocking functions, but client can call them asynchronously
An RPC is a network API with enough layers of abstraction on top. What's enough layers? That's a subjective thing.
Whether an API is an RPC or not doesn't, in my opinion, depend on method names or the names of exceptions/errors you need to handle. We're adults, we know that the call is done over the network. Naming it "SayHello" instead of "SayHelloOverNetwork" doesn't make a difference.
What does make a difference is all of those layers of abstraction - the less resource management and error handling you need to do, the more RPC-ish the code.
And about the specific HttpClient example - I'd say that today, developers doing high-level work consider HTTP to be a transport medium; an alternative to "plain old sockets".
So while a person who does not know what HTTP is might look at "a function that takes a string and returns an error code" as RPC, a modern developer would probably see it as "nitty gritty networking code". He would then say yuck, and put a few more layers of abstraction on top to make method calls from his business logic more RPC-ish. That way, the business logic would have to handle only the most extreme failures.
A) If the documentation told you how to do what you want by calling the API, then you are calling an RPC-like API, not a REST API.
B) If the API itself told you how to call it to accomplish what you want, then that API is not very RPC-like, and might be a REST API.
Programming in an object oriented or a procedural style is like case (A) -- you know what to call and how to call it to accomplish what you want before you write the code to call it. When [I assume Roy F.] says that an RPC-like API hides the network from the developer, he means that the developer can continue to program in this way whether or not his calls are remote -- he doesn't have to care about the network.
When you call a REST API, however, you have to program differently, because you have to let the API tell you what you can do and how you can do it. That's what it means to be "hypertext-driven".
Being hypertext-driven means that your stuff will continue to work when the guy on the other side of the network, who doesn't know or care about your program at all, completely changes what you can do and how you have to do it. Note that the lack of any contract between you and the system you are calling is the fundamental feature of the network that RPC hides.

How to program pcap with Objective-C and get HTTP request and response values in text format

I am working with pcap in an OS X application to understand packet analysis.
I am working with a app https://github.com/jpiccari/MacAlyzer
but I am getting only raw data but I want to differentiate every domain request into separate and clear way to read request and response value. Please guide me the way to how to develop an application with pcap.
I have tried some code but they translate data into hex format. How do I convert that data into meaningful request and response objects like Charles and Fiddler show?
MacAlyzer wasn't developed for your needs. I know because I'm the author. As already stated, Charles and Fiddler are web proxies and work entirely different (and serve different purposes).
Diving a bit deeper into your question, communication between client and server happens IP-to-IP and not domain-to-domain. Domain information is not contained in the packets at the either the IP or TCP level. Instead computers request domain-to-IP lookup information which is then stored and communication is carried out using the client and server IP addresses.
MacAlyzer, and really libpcap, don't have sophisticated packet dissection (like say Wireshark) and cannot display packet information as verbosely as other programs. Before I lost interest in the project I was planning a library that would allow much richer packet dissection and analysis, but free time became very limited.
As for adding domain information to MacAlyzer, I'll explain at a high-level since it seems you know what you're doing. To include domain information instead of IP address in the Source and Destination columns you could edit function ip_host_string() in ip.m. This function controls how the client and server addresses are displayed. Modifying it to lookup the hostname from IP address and returning the resulting string would cause the domains to be displayed instead of IP addresses.
If you come up with some nice updates, consider submitting a pull request.
Here is the food for thoughts:
http://www.binarytides.com/packet-sniffer-code-c-linux/
Anyway, you will need to use C. Therefore, check the codes of the includes, for example:
http://www.eg.bucknell.edu/~cs363/2014-spring/code/tcp.h
Here is the documentation of "pcap":
http://www-01.ibm.com/support/knowledgecenter/#!/ssw_aix_71/com.ibm.aix.basetrf1/pcap_close.htm

Is a status method necessary for an API?

I am building an API and I was wondering is it worth having a method in an API that returns the status of the API whether its alive or not?
Or is this pointless, and its the API users job to be able to just make a call to the method that they need and if it doesn't return anything due to network issues they handle it as needed?
I think it's quite useful to have a status returned. On the one hand, you can provide more statuses than 'alive' or not and make your API more poweful, and on the other hand, it's more useful for the user, since you can tell him exactly what's going on (e.g. 'maintainance').
But if your WebService isn't available at all due to network issues, then, of course, it's up to the user to catch that exception. But that's not the point, I guess, and it's not something you could control with your API.
It's useless.
The information it returns is completely out of date the moment it is returned to you because the service may fail right after the status return call is dispatched.
Also, if you are load balancing the incoming requests and your status request gets routed to a failing node, the reply (or lack thereof) would look to the client like a problem with the whole API service. In the meantime, all the other nodes could be happily servicing requests. Now your client will think that the whole API service is down but subsequent requests would work just fine (assuming your load balancer would remove the failed node or restart it).
HTTP status codes returned from your application's requests are the correct way of indicating availability. Your clients of course have to be coded to tolerate and handle them.
What is wrong with standard HTTP response status codes? 503 Service Unavailable comes to mind. HTTP clients should already be able to handle that without writing any code special to your API.
Now, if the service is likely to be unavailable frequently and it is expensive for the client to discover that but cheap for the server, then it might be appropriate to have a separate 'health check' URL that can quickly let the client know that the service is available (at the time of the GET on the health check URL).
It is not necessary most of the time. At least when it returns simple true or false. It just makes client code more complicated because it has to call one more method. Even if your client received active=true from service, next useful call may still fail. Let you client make the calls that they need during normal execution and have them handle network, timeout and HTTP errors correctly. Very useful pattern for such cases is called Circuit Breaker.
The reasons where status check may be useful:
If all the normal calls are considered to be expensive there may be an advantage in first calling lightweight status-check method (just to avoid expensive call).
Service can have different statuses and client can change its behavior depending on these statuses.
It might also be worth looking into stateful protocols like XMPP.