Correct way to instrument a Jersey client - jax-rs

My goal is to instrument a Jersey client to collect data on HTTP response/execution time, and I had thought I had the right approach by implementing a JAX-RS ClientRequestFilter and a ClientResponseFilter with code in each to record the start and end of the request. However when used with code like the following:
InputStream is = client.target("https://mytarget")
.request(MediaType.APPLICATION_OCTET_STREAM).get(InputStream.class));
// Actually consume the stream, which seems to block waiting for data
final byte[] bytes = IOUtils.toByteArray(is);
... I experience a significant problem as I only end up measuring what appears to be time to header download, and do not measure time it takes to download the entire entity which seems to be happen as I convert the input stream to a byte array (just an example, in my case I am reading the entity contents as they become available).
My use case is to instrument it for use with a distributed tracer (AWS X-Ray). An alternative I had considered was to use a library for Apache HttpClient for this purpose, but that would require changing the default transport layer for Jersey which seemed like an extreme modification to fix a simple problem. (This is slightly more straightforward with RESTEasy since it does appear to use Apache HttpClient by default so with RESTEasy I might have gone that route.)
Is it possible with Jersey to set up a filter that executes when the last byte is written to the entity? Or is there a better way to go about instrumenting a Jersey client?

Related

Implementing basic S3 compatible API with akka-http

I'm trying to implement the file storage ыукмшсу with basic S3 compatible API using akka-http.
I use s3 java sdk to test my service API and got the problem with the putObject(...) method. I can't consume file properly on my akka-http backend. I wrote simple route for the test purposes:
def putFile(bucket: String, file: String) = put{
extractRequestEntity{ ent =>
val finishedWriting = ent.dataBytes.runWith(FileIO.toPath(new File(s"/tmp/${file}").toPath))
onComplete(finishedWriting) { ioResult =>
complete("Finished writing data: " + ioResult)
}
}
}
It saves file, but file is always corrupted. Looking inside the file I found the lines like these:
"20000;chunk-signature=73c6b865ab5899b5b7596b8c11113a8df439489da42ddb5b8d0c861a0472f8a1".
When I try to PUT file with any other rest client it works as fine as expected.
I know S3 uses "Expect: 100-continue" header and may it he causes problems.
I really can't figure out how to deal with that. Any help appreciated.
This isn't exactly corrupted. Your service is not accounting for one of the four¹ ways S3 supports uploads to be sent on the wire, using Content-Encoding: aws-chunked and x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD.
It's a non-standards-based mechanism for streaming an object, and includes chunks that look exactly like this:
string(IntHexBase(chunk-size)) + ";chunk-signature=" + signature + \r\n + chunk-data + \r\n
...where IntHexBase() is pseudocode for a function that formats an integer as a hexadecimal number as a string.
This chunk-based algorithm is similar to, but not compatible with, Transfer-Encoding: chunked, because it embeds checksums in the stream.
Why did they make up a new HTTP transfer encoding? It's potentially useful on the client side because it eliminates the need to either "read your payload twice or buffer [the entire object payload] in memory [concurrently]" -- one or the other of which is otherwise necessary if you are going to calculate the x-amz-content-sha256 hash before the upload begins, as you otherwise must, since it's required for integrity checking.
I am not overly familiar with the internals of the Java SDK, but this type of upload might be triggered by using .withInputStream() or it might be standard behavor for files too, or for files over a certain size.
Your minimum workaround would be to throw an HTTP error if you see x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD in the request headers since you appear not to have implemented this in your API, but this would most likely only serve to prevent storing objects uploaded by this method. The fact that this isn't already what happens automatically suggests that you haven't implemented x-amz-content-sha256 handling at all, so you are not doing the server-side payload integrity checks that you need to be doing.
For full compatibility, you'll need to implement the algorithm supported by S3 and assumed to be available by the SDKs, unless the SDKs specifically support a mechanism for disabling this algorithm -- which seems unlikely, since it serves a useful purpose, particularly (it appears) for streams whose length is known but that aren't seekable.
¹ one of four -- the other three are a standard PUT, a web-based html form POST, and the multipart API that is recommended for large files and mandatory for files larger than 5 GB.

Uploading a file via Jaxax REST Client interface, with third party server

I need to invoke a remote REST interface handler and submit it a file in request body. Please note that I don't control the server. I cannot change the request to be multipart, the client has to work in accordance to external specification.
So far I managed to make it work like this (omitting headers etc. for brevity):
byte[] data = readFileCompletely ();
client.target (url).request ().post (Entity.entity (data, "file/mimetype"));
This works, but will fail with huge files that don't fit into memory. And since I have no restriction on filesize, this is a concern.
Question: is it somehow possible to use streams or something similar to avoid reading the whole file into memory?
If possible, I'd prefer to avoid implementation-specific extensions. If not, a solution that works with RESTEasy (on Wildfly) is also acceptable.
ReastEasy as well as Jersey support InputStream out of the box so simply use Entity.entity(inputStream, "application/octet-stream"); or whatever Content-Type header you want to set.
You can go low-level and construct the HTTP request using a library such as the plain java.net.URLConnection.
I have not tried it myself but there is example code which reads a local file and writes it to the request stream without loading it into a byte array.
Upload files from Java client to a HTTP server
Of course this solution requires more manual coding but it should work (unless java.net.URLConnection loads the whole file into memory)

HTTPS proxy with support for chunked-encoded requests

I'm developing a simple HTTPS proxy (written in Python) which receives POST/GET requests/responses, applies some transformation and finally forwards the result to the recipient.
I need to handle chunked-encoded requests/responses in a "streaming" fashion, meaning that as soon as a chunk is received the proxy transforms it and forwards it to the recipient.
Before deciding to support chunked-encoded requests, I've been using mitmproxy http://mitmproxy.org/ and it worked perfectly. Unfortunately, I noticed that it waits until the entire body is received before letting me handle the response/request.
How can I implement a proxy supporting chunked-encoded requests/responses? Has anyone of you ever done something like this?
Thanks
EDIT: MORE INFO ON MY USE CASE
I need to handle POST requests and GET responses.
In the POST request I receive a JSON object and I have to encrypt some of its values.
In the GET response I receive a JSON object and I have to decrypt some of its values.
Till now, the following code has worked perfectly:
def handle_request(self, r):
if(r.method=='POST'):
// encryption of r.get_form_urlencoded()
def handle_response(self, r):
if(r.request.method=='GET'):
// decryption of r.content
How can I do the same thing with single chunks?
EDIT: UPDATES
After evaluating different solutions, I decided to go for Squid (proxy) + ICAP (content adaptation).
I've successfully configured Squid and the performance are just great. Unfortunately, I can't find a suitable ICAP server (in Python, if possible) for doing content adaptation (modification). I thought this one https://github.com/netom/pyicap could do the job but looks like it doesn't read the body of myPOST requests.
Do you guys know a Python ICAP server that I can use together with Squid?
Thanks
The answer below is outdated. You can now pass --stream to mitmproxy, whose behaviour is explained in the mitmproxy documentation.
mitmproxy developer here. This is definitely a feature we want for mitmproxy as well, but it's not that trivial and probably not coming very soon. If you really want to implement that yourself, I can recommend two things:
If you have a very specific use case, you can employ libmproxy.protocol.http.HTTPRequest.from_stream for parsing the header and do the body processing yourself.
If you do not want to modify the request/response body, you may find it sufficient to modify mitmproxy itself. In a nutshell, you would need to read the request/response without content (see 1.), modify it to your needs, pass it to the server and then delegate control to the libmproxy.protocol.tcp (see https://github.com/mitmproxy/mitmproxy/blob/master/libmproxy/proxy/server.py#L169)
If you have further questions, don't hesistate to ask here or on mitmproxy's IRC channel.
Re Comment #1:
You can't take too much out of mitmproxy, but at least you get delegate the header parsing & processing.
# ...accept request, socket.makefile() etc...
req = HTTPRequest.from_stream(client_conn.rfile, include_content=False)
# manually forward to the server (req._assemble_head())
# manually receive response body chunk by chunk and forward it to the server, see
# https://github.com/mitmproxy/netlib/blob/master/netlib/http.py#L98
resp = HTTPResponse.from_stream(server_conn.rfile, include_content=False)
# manually forward headers
# manually process body and forward
That being said, this is a fairly complex topic. Eventually, you're better off hacking that directly into libmproxy.protocol.http.HTTPHandler.
Another option, depending on your use case again: Use mitmproxy, set the conntype to tcp and forward traffic as-is and use regex replacements on the content in libmproxy.protocol.tcp . Probably the easiest way, but the most hacky one.
If you can provide some context, I may guide you further in the right direction.
Re Comment #2:
Before we get to the main part: JSON is a really bad choice for streaming/chunking as long as you don't want to encrypt the complete JSON object and treat it as a single string. You should definitely consider something like tnetstrings if you only want to encrypt parts.
Apart from that, hooking into read_chunk works, but first you need to get to the point where you can actually receive chunks over the line. Then, it's as simple as reading the single chunks, encrypting them and forwarding them.

Sending a file from a java client to a server using wcf method?

I want to build a wcf web service so that the client and the server would be able to transfer files between each other. Do you know how I can achieve this? I think I should turn it into a byte array but I have no idea how to do that. The file is also quite big so I must turn on streamed response.
It sounds like you're on the right track. A quick search of the interwebz yielded this link: http://www.codeproject.com/Articles/166763/WCF-Streaming-Upload-Download-Files-Over-HTTP
Your question indicates that you want to send a file from a java client to a WCFd endpoint, but the contents of your question indicate that this should be a bidirectional capability. If this is the case, then you'll need to implement a service endpoint on your client as well. As far as that is concerned, I cannot be of much help, but there are resources out there like this SO question: In-process SOAP service server for Java
As far as practical implementation, I would think that using these two links you should be able to produce some code for your server and client.
As far as reading all bytes of a file, in C# you can use: File.ReadAllBytes It should work as in the following code:
//Read The contents of the file indicated
string fileName = "/path/to/some/file";
//store the binary in a byte array
byte[] buffer = File.ReadAllBytes(fileName);
//do something with those bytes!
Be sure to use the search function in the future:

memory exception using wcf wshttpbinding

I have an application to upload files to a server. I am using nettcpbinding and wshttpbinding. When the files is larger than 200 MB, I get a memory exception. Working around this, I have seen people recommend streaming, and of course it works with nettcpbinding for large files (>1GB), but when using wshttpbinding, what would be the approach?? Should I change to basichttpbinding?? what?? Thanks.
I suggest you expose another end point just to upload such large size data. This can have a binding that supports streaming. In our previous project we needed to do file uploads to server as part of business process. We ended up creating 2 endpoints one just dedicated to file upload, and another for all other business functionality.
The streaming data service can be a generic service to stream any data to the server and maybe return a token for identifying the data on server.For subsequent requests this token can be passed along to manipulate the data on server.
If you don't want to (or cannot because of legit reasons) change the binding nor use streaming, what you can do is have some method with a signature along the lines of the following:
void UploadFile(string fileName, long offset, byte[] data)
Instead of sending the whole file, you send little packets, and tell where the data should be placed. You can add more data, of course, like the whole filesize, CRC of the file to know if the transfer was successful, etc.