Streaming S3 object to VertX Http Server Response - amazon-s3

The title basically explains itself.
I have a REST endpoint with VertX. Upon hitting it, I have some logic which results in an AWS-S3 object.
Previously I did not upload to S3 but saved the file locally, so I could send it with routerCxt.response().sendFile(file_path...).
Now that the file is in S3, I have to download it locally before I could call the above code.
That is slow and inefficient. I would like to stream S3 object directly to the response object.
In Express, it's something like this. s3.getObject(params).createReadStream().pipe(res);.
I read a little bit, and saw that VertX has a class called Pump. But it is used by vertx.fileSystem() in the examples.
I am not sure how to plug the InputStream from S3's getObjectContent() into vertx.fileSystem() in order to use Pump.
I am not even sure Pump is the correct way because I tried to use Pump to return a local file, and it didn't work.
router.get("/api/test_download").handler(rc -> {
rc.response().setChunked(true).endHandler(endHandlr -> rc.response().end());
vertx.fileSystem().open("/Users/EmptyFiles/empty.json", new OpenOptions(), ares -> {
AsyncFile file = ares.result();
Pump pump = Pump.pump(file, rc.response());
pump.start();
});
});
Is there any example for me to do that?
Thanks

It can be done if you use the Vert.x WebClient to communicate with S3 instead of the Amazon Java Client.
The WebClient can pipe the content to the HTTP server response:
webClient = WebClient.create(vertx, new WebClientOptions().setDefaultHost("s3-us-west-2.amazonaws.com"));
router.get("/api/test_download").handler(rc -> {
HttpServerResponse response = rc.response();
response.setChunked(true);
webClient.get("/my_bucket/test_download")
.as(BodyCodec.pipe(response))
.send(ar -> {
if (ar.failed()) {
rc.fail(ar.cause());
} else {
// Nothing to do: the content has been sent to the client and response.end() has been called
}
});
});
The trick is to use the pipe body codec.
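For anyone wondering what "piping" buys you compared with downloading first: the body is forwarded chunk by chunk, so only one small buffer is in memory at a time. Here is a minimal sketch of that idea with plain JDK streams. It is not Vert.x code and it blocks, so it should not run on an event loop; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamPipe {
    /**
     * Copies everything from source to sink using one reusable buffer of
     * chunkSize bytes, and returns the total number of bytes copied.
     * Memory use stays at chunkSize regardless of how large the source is.
     */
    static long pipe(InputStream source, OutputStream sink, int chunkSize) throws IOException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int read;
        // read() returns -1 once the source is exhausted
        while ((read = source.read(buffer)) != -1) {
            sink.write(buffer, 0, read);
            total += read;
        }
        sink.flush();
        return total;
    }
}
```

With the Amazon client, source would be the stream from getObjectContent() and sink the response's output. In Vert.x the pipe body codec does the equivalent without blocking a thread.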

Design Minimal API and use HttpClient to post a file to it

I have a legacy system interfacing issue that my team has elected to solve by standing up a .NET 7 Minimal API which needs to accept a file upload. It should work for small and large files (let's say at least 500 MiB). The API will be called from a legacy system using HttpClient in a .NET Framework 4.7.1 app.
I can't quite seem to figure out how to design the signature of the Minimal API and how to call it with HttpClient in a way that totally works. It's something I've been hacking at on and off for several days, and haven't documented all of my approaches, but suffice it to say there have been varying results involving, among other things:
4XX and 500 errors returned by the HTTP call
An assortment of exceptions on either side
Calls that throw and never hit a breakpoint on the API side
Calls that get through but the Stream on the API end is not what I expect
Errors being different depending on whether the file being uploaded is small or large
Text files being persisted on the server that contain some of the HTTP headers in addition to their original contents
On the Minimal API side, I've tried all sorts of things in the signature (IFormFile, Stream, PipeReader, HttpRequest). On the calling side, I've tried several approaches (messing with headers, using the Flurl library, various content encodings and MIME types, multipart, etc).
This seems like it should be dead simple, so I'm trying to wipe the slate clean here, start with an example of something that partially works, and hope someone might be able to illuminate the path forward for me.
Example of Minimal API:
// IDocumentStorageManager is an injected dependency that takes an int and a Stream and returns a string of the newly uploaded file's URI
app.MapPost(
"DocumentStorage/CreateDocument2/{documentId:int}",
async (PipeReader pipeReader, int documentId, IDocumentStorageManager documentStorageManager) =>
{
using var ms = new MemoryStream();
await pipeReader.CopyToAsync(ms);
ms.Position = 0;
return await documentStorageManager.CreateDocument(documentId, ms);
});
Call the Minimal API using HttpClient:
// filePath is the path on local disk, uri is the Minimal API's URI
private static async Task<string> UploadWithHttpClient2(string filePath, string uri)
{
var fileStream = File.Open(filePath, FileMode.Open);
var content = new StreamContent(fileStream);
var httpRequestMessage = new HttpRequestMessage(HttpMethod.Post, uri);
var httpClient = new HttpClient();
httpRequestMessage.Content = content;
httpClient.Timeout = TimeSpan.FromMinutes(5);
var result = await httpClient.SendAsync(httpRequestMessage);
return await result.Content.ReadAsStringAsync();
}
In the particular example above, a small (6 bytes) .txt file is uploaded without issue. However, a large (619 MiB) .tif file runs into problems on the call to httpClient.SendAsync which results in the following set of nested Exceptions:
System.Net.Http.HttpRequestException - "Error while copying content to a stream."
System.IO.IOException - "Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host.."
System.Net.Sockets.SocketException - "An existing connection was forcibly closed by the remote host."
What's a decent way of writing a Minimal API and calling it with HttpClient that will work for small and large files?
Kestrel limits request bodies to roughly 30 MB (30,000,000 bytes) by default.
To upload larger files via kestrel you might need to increase the max size limit. This can be done by adding the "RequestSizeLimit" attribute. So for example for 1GB:
app.MapPost(
"DocumentStorage/CreateDocument2/{documentId:int}",
[RequestSizeLimit(1_000_000_000)] async (PipeReader pipeReader, int documentId) =>
{
using var ms = new MemoryStream();
await pipeReader.CopyToAsync(ms);
ms.Position = 0;
return "";
});
You can also remove the size limit globally by setting
builder.WebHost.UseKestrel(o => o.Limits.MaxRequestBodySize = null);
This answer is good, but the RequestSizeLimit attribute doesn't work for minimal APIs because it's an MVC filter. You can use IHttpMaxRequestBodySizeFeature to set the limit instead (assuming you're not running on IIS). I also changed the handler to accept the body as a Stream, which avoids copying into a MemoryStream before calling the CreateDocument API:
app.MapPost(
"DocumentStorage/CreateDocument2/{documentId:int}",
async (Stream stream, int documentId, IDocumentStorageManager documentStorageManager) =>
{
return await documentStorageManager.CreateDocument(documentId, stream);
})
.AddEndpointFilter((context, next) =>
{
const int MaxBytes = 1024 * 1024 * 1024;
var maxRequestBodySizeFeature = context.HttpContext.Features.Get<IHttpMaxRequestBodySizeFeature>();
// The limit can only be changed while the feature is still writable
if (maxRequestBodySizeFeature is not null and { IsReadOnly: false })
{
maxRequestBodySizeFeature.MaxRequestBodySize = MaxBytes;
}
return next(context);
});
If you're running on IIS, configure the request limits there instead; see https://learn.microsoft.com/en-us/iis/configuration/system.webserver/security/requestfiltering/requestlimits/#configuration
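To make the numbers concrete: the 6-byte file fits under the 30,000,000-byte default, the 619 MiB file (649,068,544 bytes) does not, and it fits under either of the 1 GB limits suggested above. Below is a rough Java sketch of a reader that enforces a byte cap while streaming. It is not ASP.NET Core code and only loosely mirrors what a server-side limit does; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class BodyLimit {
    // Kestrel's default MaxRequestBodySize, in bytes.
    static final long KESTREL_DEFAULT_LIMIT = 30_000_000L;

    /** True if a body of bodyBytes would be accepted under maxBytes. */
    static boolean fits(long bodyBytes, long maxBytes) {
        return bodyBytes <= maxBytes;
    }

    /**
     * Streams body to sink and fails as soon as more than maxBytes have
     * been read. Returns the number of bytes copied on success.
     */
    static long copyWithLimit(InputStream body, OutputStream sink, long maxBytes) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = body.read(buffer)) != -1) {
            total += read;
            if (total > maxBytes) {
                // Stop reading; the sender sees the upload cut short.
                throw new IOException("Request body exceeds " + maxBytes + " bytes");
            }
            sink.write(buffer, 0, read);
        }
        return total;
    }
}
```

A server that stops reading part-way through an upload is one plausible cause of the "connection was forcibly closed" exception on the client side.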

How to read the request body with spring webflux

I'm using Spring 5, Netty and Spring WebFlux to develop an API Gateway. Sometimes I want the gateway to stop a request, but I also want to read the request body (to log it, for example) and return an error to the client.
I try to do this in a WebFilter by subscribing to the body.
@Override
public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
if (enabled) {
logger.debug("gateway is enabled. The Request is routed.");
return chain.filter(exchange);
} else {
logger.debug("gateway is disabled. A 404 error is returned.");
exchange.getRequest().getBody().subscribe();
exchange.getResponse().setStatusCode(HttpStatus.NOT_FOUND);
return exchange.getResponse().writeWith(Mono.just(exchange.getResponse().bufferFactory().allocateBuffer(0)));
}
}
This works when the body is small. But when the body is large, only the first element of the Flux is read, so I don't get the entire body. Any idea how to do this?
1. Add "readBody()" to the post route:
builder.routes()
.route("get_route", r -> r.path("/**")
.and().method("GET")
.filters(f -> f.filter(myFilter))
.uri(myUrl))
.route("post_route", r -> r.path("/**")
.and().method("POST")
.and().readBody(String.class, requestBody -> {return true;})
.filters(f -> f.filter(myFilter))
.uri(myUrl))
2. Then you can get the body string in your filter:
String body = exchange.getAttribute("cachedRequestBodyObject");
Advantages:
No blocking.
No need to refill the body for further process.
Works with Spring Boot 2.0.6.RELEASE + Spring Cloud Finchley.SR2 + Spring Cloud Gateway.
The problem here is that you are subscribing manually within the filter, which means you're disconnecting the reading of the request from the rest of the pipeline. Calling subscribe() gives you a Disposable that helps you manage the underlying Subscription.
So you need to connect the whole process as a single pipeline, a bit like:
Flux<DataBuffer> requestBody = exchange.getRequest().getBody();
// decode the request body as a Mono or a Flux
Mono<String> decodedBody = decodeBody(requestBody);
exchange.getResponse().setStatusCode(HttpStatus.NOT_FOUND);
return decodedBody.doOnNext(s -> logger.info(s))
.then(exchange.getResponse().setComplete());
Note that decoding the whole request body as a Mono means your gateway will have to buffer the whole request body in memory.
DataBuffer is, on purpose, a low-level type. If you'd like to decode it as a String (i.e. implement the sample decodeBody method), you can use one of the various Decoder implementations in Spring, like StringDecoder.
Now because this is a rather large and complex space, you can use and/or take a look at Spring Cloud Gateway, which does just that and way more.
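To illustrate why decoding to a single String means buffering everything: a multi-byte character can be split across two chunks, so the chunks have to be joined before (or carefully while) decoding. Here is a minimal sketch of what a decodeBody-style helper has to do, using plain JDK types. ByteBuffer stands in for DataBuffer, UTF-8 is assumed, and the class name is hypothetical; in a real gateway you would use Spring's decoders inside the reactive pipeline instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;

class BodyJoin {
    /**
     * Joins all chunks into one byte array and decodes it once as UTF-8.
     * Decoding each chunk separately could corrupt characters whose bytes
     * straddle a chunk boundary.
     */
    static String decodeBody(List<ByteBuffer> chunks) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (ByteBuffer chunk : chunks) {
            // duplicate() so the caller's buffer position is left untouched
            ByteBuffer view = chunk.duplicate();
            byte[] bytes = new byte[view.remaining()];
            view.get(bytes);
            joined.write(bytes, 0, bytes.length);
        }
        return new String(joined.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The whole body sits in memory at once here, which is exactly the buffering cost mentioned above.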

how to use spring webflux for file streaming

I want to stream a file in a reactive way using spring webflux.
What should my endpoint look like? More specifically, what should the return type be?
@GetMapping("/file")
Flux<???> file() {
//Read file content into this ??? thing .
}
You can return a Resource instance like this:
@GetMapping("/file")
Mono<Resource> file() {
//Create a ClassPathResource, for example
}
Note that this supports Byte Range HTTP requests automatically.
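If byte ranges are unfamiliar: the client sends a header such as Range: bytes=0-99 and the server answers with just that slice of the file. The sketch below shows how a single range maps to inclusive offsets. It is not Spring's implementation (Spring handles this for you when you return a Resource), it ignores multi-range requests, and the class name is hypothetical.

```java
class ByteRange {
    /**
     * Resolves a single-range header such as "bytes=0-99", "bytes=900-" or
     * "bytes=-100" against a resource of the given length.
     * Returns {start, end}, both inclusive.
     */
    static long[] resolve(String header, long length) {
        if (header == null || !header.startsWith("bytes=")) {
            throw new IllegalArgumentException("Only the bytes unit is handled");
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (spec.contains(",") || dash < 0) {
            throw new IllegalArgumentException("Only a single range is handled: " + spec);
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        long start;
        long end = length - 1; // default: up to the final byte
        if (first.isEmpty()) {
            // Suffix form "-N": the final N bytes of the resource
            long suffix = Long.parseLong(last);
            start = Math.max(0, length - suffix);
        } else {
            start = Long.parseLong(first);
            if (!last.isEmpty()) {
                // An end past the resource is clamped to the last byte
                end = Math.min(Long.parseLong(last), length - 1);
            }
        }
        if (start > end) {
            throw new IllegalArgumentException("Range not satisfiable: " + spec);
        }
        return new long[] {start, end};
    }
}
```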

Upload image with RESTSharp (addFile)

I'd like to send a picture from my Windows Phone on a webservice hosted on Windows Azure.
To communicate with my service, I use RESTSharp and I saw that there was a method named addFile for sending file.
RestRequest request;
request = new RestRequest("/report/add", Method.POST);
request.AddFile("test", ConvertToBytes(e.ChosenPhoto), "testfile");
App.Client.ExecuteAsync(request, response =>
{
RestResponse resource = response;
if (response.StatusCode == HttpStatusCode.OK)
{
MessageBox.Show("Your report has been sent! Thank you for your participation!");
}
});
However, I do not know how to retrieve the array of bytes sent when the request arrives at the service.
Can you help me please?
Could you show the code that you use to handle the file server side? It could be that you're looking in the wrong place.
Alternatively, you could try another way to add the file:
request.AddBody(new { myFile = fileByteArray });
Note: In both cases the file will be loaded in memory. This could be a problem for large files.

Grails how to post out to someone else's API

I am writing a Grails app, and I want the controller to hit some other API with a POST and then use the response to generate the page my user sees. I am not able to Google the right terms to find anything about posting to another page and receiving the response with Grails. Links to tutorials or answers like "That's called..." would be much appreciated.
Seems like you are integrating with some sort of RESTful web service. There is a REST client plugin, linked here.
Alternatively, it's quite easy to do this without a plugin, linked here.
I highly recommend letting your controller just be a controller. Abstract your interface with this outside service into some class like OtherApiService or some sort of utility. Keep all the code that communicates with this outside service in one place; that way you can mock your integration component and make testing everywhere else easy. If you do this as a service, you have room to expand, say in the case you want to start storing some data from the API in your own app.
Anyway, copying from the linked documentation (the second link), the following shows how to send a GET to an API and how to set up handlers for success and failure, as well as how to deal with request headers and query params. This should have everything you need.
@Grab(group='org.codehaus.groovy.modules.http-builder', module='http-builder', version='0.5.0-RC2' )
import groovyx.net.http.*
import static groovyx.net.http.ContentType.*
import static groovyx.net.http.Method.*
def http = new HTTPBuilder( 'http://ajax.googleapis.com' )
// perform a GET request, expecting JSON response data
http.request( GET, JSON ) {
uri.path = '/ajax/services/search/web'
uri.query = [ v:'1.0', q: 'Calvin and Hobbes' ]
headers.'User-Agent' = 'Mozilla/5.0 Ubuntu/8.10 Firefox/3.0.4'
// response handler for a success response code:
response.success = { resp, json ->
println resp.statusLine
// parse the JSON response object:
json.responseData.results.each {
println " ${it.titleNoFormatting} : ${it.visibleUrl}"
}
}
// handler for any failure status code:
response.failure = { resp ->
println "Unexpected error: ${resp.statusLine.statusCode} : ${resp.statusLine.reasonPhrase}"
}
}
You might also want to check out this, for some nifty tricks. It has an example with a POST method.
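Since the example above is a GET and the question asks about POST, here is a rough sketch of building a JSON POST with the JDK's own java.net.http client, which can be called from Groovy. This is a plain-JDK alternative rather than HTTPBuilder, it assumes Java 11 or later, and the class name, method name and URL are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

class OtherApiRequests {
    /** Builds (but does not send) a POST request carrying a JSON body. */
    static HttpRequest jsonPost(String url, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

To send it, something like HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) returns the status code and body. Wrapping this in a class like the OtherApiService suggested above keeps the controller thin and makes the call easy to mock.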