Client POST of indeterminate length to a server - akka-http

[Java DSL] I'm trying to post a stream of bytes to a server (as the request body) using the client API in real time, but I won't know the length before the request starts.
I can't figure out how to do this from the akka-http documentation. Has anyone attempted this?

Assuming you have created a materializer from the actor context and have a Source of String elements called mysource:
Http httpContext = Http.get(context().system());

Source<ByteString, NotUsed> chunked =
    mysource
        .map(str -> ByteString.fromString(str.concat("\n")))
        .concat(Source.single(ByteString.empty()));

HttpRequest post = HttpRequest.POST("http://some-server/address")
    .withEntity(HttpEntities.createChunked(ContentTypes.APPLICATION_OCTET_STREAM, chunked))
    .withProtocol(HttpProtocols.HTTP_1_1);

CompletionStage<HttpResponse> result =
    httpContext.singleRequest(post, materializer);
Note that we concatenate a Source containing a single empty ByteString to the original Source in order to signal the end of the chunked stream.
If you are issuing this from within an actor, it's best to use pipe() to send the final response back to the actor.
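For example, a minimal sketch of the pipe pattern (note: depending on your Akka version, the CompletionStage variant of pipe lives in akka.pattern.Patterns or akka.pattern.PatternsCS):

import akka.pattern.Patterns; // use akka.pattern.PatternsCS on older Akka versions

// ... inside the actor that issued the request:
Patterns.pipe(result, context().dispatcher()).to(self());
// The HttpResponse later arrives as an ordinary message,
// so the actor never blocks on the CompletionStage.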

Related

Akka HTTP Source Streaming vs regular request handling

What is the advantage of using Source Streaming vs the regular way of handling requests? My understanding is that in both cases:
The TCP connection will be reused
Back-pressure will be applied between the client and the server
The only advantage of Source Streaming I can see is if there is a very large response and the client prefers to consume it in smaller chunks.
My use case is that I have a very long list of users (millions), and I need to call a service that performs some filtering on the users, and returns a subset.
Currently, on the server side I expose a batch API, and on the client, I just split the users into chunks of 1000, and make X batch calls in parallel using Akka HTTP Host API.
I am considering switching to HTTP streaming, but cannot quite figure out what the value would be.
You are missing one other huge benefit: memory efficiency. With a streamed pipeline from client to server and back, all parties process data safely without the risk of exhausting their memory allocation. This is particularly useful on the server side, where you always have to assume the clients may do something malicious...
Client Request Creation
Suppose the ultimate source of your millions of users is a file. You can create a stream source from this file:
val userFilePath : java.nio.file.Path = ???
val userFileSource = akka.stream.scaladsl.FileIO.fromPath(userFilePath)
This source can then be used to create your HTTP request, which will stream the users to the service:
import akka.http.scaladsl.model.HttpEntity.Chunked
import akka.http.scaladsl.model.{ContentTypes, HttpMethods, HttpRequest}

val httpRequest : HttpRequest =
  HttpRequest(method = HttpMethods.POST,
              uri = "http://filterService.io",
              entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, userFileSource))
This request will now stream the users to the service without reading the entire file into memory. Only a few chunks of data are buffered at a time; therefore, you can send a request with a potentially unbounded number of users and your client will be fine.
Server Request Processing
Similarly, your server can be designed to accept a request with an entity that can potentially be of infinite length.
Your question says the service will filter the users; assume we have a filtering function:
val isValidUser : (String) => Boolean = ???
This can be used to filter the incoming request entity and to create the entity that feeds the response:
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.model.{ContentTypes, HttpResponse}
import akka.http.scaladsl.model.HttpEntity.Chunked
import akka.stream.scaladsl.{Framing, Source}
import akka.util.ByteString

val route = extractDataBytes { userSource =>
  val responseSource : Source[ByteString, _] =
    userSource
      // re-frame the raw byte chunks into one element per newline-delimited user
      .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
      .map(_.utf8String)
      .filter(isValidUser)
      .map(user => ByteString(user + "\n"))
  complete(HttpResponse(entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, responseSource)))
}
Client Response Processing
The client can similarly process the filtered users without reading them all into memory. We can, for example, dispatch the request and send all of the valid users to the console:
import akka.http.scaladsl.Http

// assumes an implicit ActorSystem, Materializer and ExecutionContext are in scope
Http()
  .singleRequest(httpRequest)
  .map { response =>
    response
      .entity
      .dataBytes
      .map(_.utf8String)
      .runForeach(System.out.println)
  }

How to set http response code in Parse Server cloud function?

A parse server cloud function is defined via
Parse.Cloud.define("hello", function(request, response) {..});
On the response, I can call response.success(X) and response.error(Y), and that sets the HTTP response code and the body of the response.
But how do I return a different code, like 201 (Created)?
And how do I set the headers of the response?
thanks, Tim
You are allowed to return any valid JSON from response.success(). Therefore, you could create an object with fields such as code, message, and value, so you can set the code, give it a string descriptor, and pass back the value you normally would, if there is one. This accomplishes what you need, though you will have to keep track of those codes across your platforms. I recommend looking up the standard HTTP response codes and making sure you don't overlap with any of them.

How to use .withoutSizeLimit in Akka-http (client) HttpRequest?

I'm using Akka 2.4.7 to read a web resource that is essentially a stream of JSON objects, delimited with newlines. The stream is practically unlimited in size.
When around 8MB has been consumed, I get an exception:
[error] (run-main-0) EntityStreamSizeException: actual entity size (None) exceeded content length limit (8388608 bytes)! You can configure this by setting `akka.http.[server|client].parsing.max-content-length` or calling `HttpEntity.withSizeLimit` before materializing the dataBytes stream.
The "actual entity size (None)" seems a bit funny, but my real question is, how to use the HttpEntity.withSizeLimit (or in my case, rather .withoutSizeLimit that should be there, as well).
My request code is like this:
val chunks_src: Source[ByteString,_] = Source.single(req)
  .via(connection)
  .flatMapConcat( _.entity.dataBytes )
I tried adding a .map( (x: HttpResponse) => x.withoutSizeLimit ), but it does not compile. What's the role of HttpEntity when doing client-side programming, anyway?
I can change the global config, but that's kind of missing the point. I'd like to flag "no limits" only for a particular request.
As a further question, I understand the need for a max-content-length on the server side, but why affect the client?
References:
Akka 2.4.7: Limiting message entity length
Akka 2.4.7: HttpEntity
I'm far from an expert on this topic, but it would seem you need to add the .withoutSizeLimit() to the entity like:
Source.single(req)
  .via(connection)
  .flatMapConcat( _.entity.withoutSizeLimit().dataBytes )

ASP.NET Web API - Reading querystring/formdata before each request

For reasons outlined here I need to review a set of values from the querystring or formdata before each request (so I can perform some authentication). The keys are the same each time and should be present in each request; however, they will be located in the querystring for GET requests, and in the formdata for POST and other methods.
As this is for authentication purposes, it needs to run before the request is handled; at the moment I am using a MessageHandler.
I can work out whether I should be reading the querystring or formdata based on the method, and when it's a GET I can read the querystring OK using Request.GetQueryNameValuePairs(); however, the problem is reading the formdata when it's a POST.
I can get the formdata using Request.Content.ReadAsFormDataAsync(); however, formdata can only be read once, and once I have read it here it is no longer available to the request (i.e. my controller actions get null models).
What is the most appropriate way to consistently and non-intrusively read the querystring and/or formdata from a request before it reaches the request logic?
Regarding your question of which place would be better: in this case I believe an AuthorizationFilter to be better than a message handler, but either way the problem is related to reading the body multiple times.
After calling Request.Content.ReadAsFormDataAsync() in your message handler, can you try doing the following?

Stream requestBufferedStream = Request.Content.ReadAsStreamAsync().Result;
// Reset to 0: ReadAsFormDataAsync may have read the entire stream, leaving the
// position at the end, so no bytes would be read during parameter binding and
// you would see null values.
requestBufferedStream.Position = 0;

Note: whether a request's content can be read only once or multiple times depends on the host's buffer policy. By default the host's buffer policy is Buffered, in which case you can reset the position back to 0. However, if you explicitly set the policy to Streamed, you cannot reset it.
What about using ActionFilterAttributes?
This code worked well for me:
public HttpResponseMessage AddEditCheck(Check check)
{
    // Recover the underlying HttpRequest from the request properties.
    var request = ((System.Web.HttpContextWrapper)Request.Properties.ToList<KeyValuePair<string, object>>().First().Value).Request;
    var i = request.Form["txtCheckDate"];
    return Request.CreateResponse(HttpStatusCode.OK);
}

Upload file to Solr with HttpClient and MultipartEntity

httpclient, httpmime 4.1.3
I am trying to upload a file through http to a remote server with no success.
Here's my code:
HttpPost method = new HttpPost(solrUrl + "/extract");
method.getParams().setParameter("literal.id", fileId);
method.getParams().setBooleanParameter("commit", true);

MultipartEntity me = new MultipartEntity();
me.addPart("myfile", new InputStreamBody(doubleInput, contentType, fileId));
method.setEntity(me);
//method.setHeader("Content-Type", "multipart/form-data");

HttpClient httpClient = new DefaultHttpClient();
HttpResponse hr = httpClient.execute(method);
The server is Solr.
This is to replace a working bash script that calls curl like this,
curl "http://localhost:8080/solr/update/extract?literal.id=bububu&commit=true" -F "myfile=@bububu.doc"
If I try to set the "Content-Type" header to "multipart/form-data" myself, the receiving side says that there's no boundary (which is true):
HTTP Status 500 - the request was rejected because no multipart boundary was found
If I omit this header, the server issues an error description that, as far as I discovered, indicates that the content type was not multipart [2]:
HTTP Status 400. The request sent by the client was syntactically incorrect ([doc=null] missing required field: id).
This is related to [1], but I couldn't determine the answer from it: I am in the same situation but didn't understand what to do. I was hoping that the MultipartEntity would tell the HttpPost object that it is multipart form data with some boundary, so that I wouldn't have to set the content type myself. I didn't quite get how to provide boundaries to the entities (MultipartEntity doesn't have a method like setBoundary), or how to get the randomly generated boundary so I could specify it via addHeader myself (no getBoundary method either...).
[1] Problem with setting header "Content-Type" in uploading file with HttpClient4
[2] http://lucene.472066.n3.nabble.com/Updating-the-index-with-a-csv-file-td490013.html
I am suspicious of
method.getParams().setParameter("literal.id", fileId);
method.getParams().setBooleanParameter("commit", true);
In the first line, is fileId a string or a file pointer (or something else)? I hope it is a string. As for the second line, you can instead pass it as a normal request parameter.
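For example, here is a minimal sketch (untested) that mirrors the working curl call: it passes literal.id and commit in the query string and lets MultipartEntity generate its own boundary, so no Content-Type header is set manually. Like the original snippet, it assumes solrUrl, fileId, doubleInput, and contentType are already defined, and that fileId needs no URL encoding:

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.InputStreamBody;
import org.apache.http.impl.client.DefaultHttpClient;

// Pass the Solr parameters in the query string, exactly as the curl call does.
HttpPost method = new HttpPost(solrUrl + "/extract?literal.id=" + fileId + "&commit=true");

// Do NOT set the Content-Type header yourself: when the request is executed,
// MultipartEntity generates a random boundary and writes the complete
// multipart/form-data header, including that boundary.
MultipartEntity me = new MultipartEntity();
me.addPart("myfile", new InputStreamBody(doubleInput, contentType, fileId));
method.setEntity(me);

HttpClient httpClient = new DefaultHttpClient();
HttpResponse hr = httpClient.execute(method);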
I am trying to tackle the HTTP Status 400. I don't know much Java (or is that .NET?)
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#4xx_Client_Error