'Content-Length too' long when uploading file using Tornado - file-upload

Using a slightly modified version of this Tornado upload app on my development machine, I get the following error from tornado server and a blank page whenever I try to upload large files (+100MB):
[I 130929 07:45:44 httpserver:330] Malformed HTTP request from
127.0.0.1: Content-Length too long
There is no problem uploading files up to ~20MB.
so I'm wondering whether there is any particular file upload limit in Tornado web server? Or does it have someting to do with the machine's available memory. And whatever the reason is, how can I overcome this problem?

Tornado has a configurable limit on upload size (defaulting to 10MB). You can increase the limit by passing max_buffer_size to the HTTPServer constructor (or Application.listen). However, since Tornado (version 3.1) reads the entire upload body into a single contiguous string in memory, it's dangerous to make the limit too high. One popular alternative is the nginx upload module.

Related

Autodesk Forge - problems with very large .zip files

We allow our users to upload files to forge, but to our bucket (they don't need to create their own) as we're only using the model viewer. This means they need to upload to our server first.
The upload method uses the stream from the HttpContent (we're using WebAPI2) and sends it right on into the Forge API methods.
Well, it would, but I get this exception - Error getting value from 'WriteTimeout' on 'System.Net.Http.StreamContent+ReadOnlyStream'.
This means that the Forge API is checking the Write Timeout without checking CanWrite or CanTimeout. Have I found an API bug?
Copying to another stream is feasible but I can't use a debugger to test the file our client is reporting further problems with, because it's 1.1GB and my dev box runs out of memory.

Server timeout when re-assembling the uploaded file

I am running a simple server app to receive uploads from a fine-uploader web client. It is based on the fine-uploader Java example and is running in Tomcat6 with Apache sitting in front of it and using ProxyPass to route the requests. I am running into an occasional problem where the upload gets to 100% but ultimately fails. In the server logs, as well as on the client, I can see that Apache is timing out on the proxy with a 502 error.
After trying and seeing this myself, I realized the problem occurs with really large files. The Java server app was taking longer than 30 seconds to reassemble the chunks into a single file and so Apache would kill the connection and stop waiting. I have increased Apache Timeout to 300 seconds which should largely correct the problem but the potential remains.
Any ideas on other ways to handle this so that the connection between Apache and Tomcat is not killed while the app is assembling the chunks on the server? I am currently using 2 MB chunks and was thinking maybe I should use a larger chunk size. Perhaps with fewer chunks to assemble the server code could do it faster. I could test that but unless the speedup is dramatic it seems like the potential for problems remain and will just be waiting for a large enough upload to come along to trigger them.
It seems like you have two options:
Remove the timeout in Apache.
Delegate the chunk-combination effort to a separate thread, and return a response to the request as soon as possible.
With the latter approach, you will not be able to let Fine Uploader know if the chunk combination operation failed, but perhaps you can perform a few quick sanity checks before responding, such as determining if all chunks are accessible.
There's nothing Fine Uploader can do here, the issue is server side. After Fine Uploader sends the request, its job is done until your server responds.
As you mentioned, it may be reasonable to increase the chunk size or make other changes to speed up the chunk combination operation to lessen the chance of a timeout (if #1 or #2 above are not desirable).

Does apache commons fileupload support chunked uploads?

We have moved to using PLupload for file uploads and found that it can support "chunked" file uploads. The problem is that our server sees one large file upload as multiple smaller files coming in multiple POST requests.
Does anybody know if Apache Commons FileUpload supports chunked uploads?
FWIW looking at the PLupload webpage the "Chunking" they are talking about is not "HTTP Chunking". http://www.plupload.com/index.php
Their marketing term "Chunking" is their concept of sending a large payload up in small and separate HTTP requests. The server is required to have logic to group, stitch up and verify all the small parts. You are better off getting help on their forum on this. There is no reason why this logic can not be created by you on the server side and maybe they have example Java code implementing it.
Useful info and pointer to their upload.php example (maybe you convert to Java and on top of Apache Commons FileUpload) :
http://www.plupload.com/punbb/viewtopic.php?id=1484
What you are observing the small segments of a file arriving like they are separate files is exactly how the "PLupload Chunking" mechanism works. This technique is not defined in any standard, but it is also not an uncommon solution to the problems it addresses.
The "HTTP Chunking" is standard for defining how to transfer a single HTTP Request (and/or HTTP Response) between click/server using a HTTP transfer encoding. This is supported by all webservers and all browsers and has been around for a long time (since HTTP/1.1).

Pyramid/Pylons: How to check if an uploaded file is complete in a POST request?

I'm building a web tool which allows users to upload PDFs to a server using their web browsers. The server is based on Python (Paste + Pyramid).
The problem I have right now is the following: If a user uploads a rather large file (let's say 100 MB) and they cancel the upload before it is completed, my handler code on the server is still called (instead of the request being aborted).
The problem is that the request.POST['myfile'].file is incomplete when that happens. This effectively means that the PDF file is corrupted if I simply write it to some place on the server.
When I watch the server's log, it shows a "broken pipe" exception within the Paste server; however I have no idea how to catch that exception and have it prevent my view/handler code from executing and storing the incomplete file.
Seems like the paster HTTP server does not correctly validate the uploaded form data and simply passes the request down the WSGI pipeline even if the connection (HTTP POST) was closed by the user.
I worked around this issue by simply setting up NGINX to act as a reverse proxy. This also adds some security benefits as it might be better tested than paster.
Update:
My main problem was that I was using runserver (the built in web server of manage.py). After some trial and error we ended up using WSGI.
More specifically, uWSGI and Nginx as web server. Static content is served directly by Nginx while dynamic pages are piped through uWSGI and are handled by the Python web app.
Unless you are doing something fancy (like tracking the upload progress, etc), your pylons controller should not be invoked until the entire file has been uploaded.

Serving dynamic zip files through Apache

One of the responsibilities of my Rails application is to create and serve signed xmls. Any signed xml, once created, never changes. So I store every xml in the public folder and redirect the client appropriately to avoid unnecessary processing from the controller.
Now I want a new feature: every xml is associated with a date, and I'd like to implement the ability to serve a compressed file containing every xml whose date lies in a period specified by the client. Nevertheless, the period cannot be limited to less than one month for the feature to be useful, and this implies some zip files being served will be as big as 50M.
My application is deployed as a Passenger module of Apache. Thus, it's totally unacceptable to serve the file with send_data, since the client will have to wait for the entire compressed file to be generated before the actual download begins. Although I have an idea on how to implement the feature in Rails so the compressed file is produced while being served, I feel my server will get scarce on resources once some lengthy Ruby/Passenger processes are allocated to serve big zip files.
I've read about a better solution to serve static files through Apache, but not dynamic ones.
So, what's the solution to the problem? Do I need something like a custom Apache handler? How do I inform Apache, from my application, how to handle the request, compressing the files and streaming the result simultaneously?
Check out my mod_zip module for Nginx:
http://wiki.nginx.org/NgxZip
You can have a backend script tell Nginx which URL locations to include in the archive, and Nginx will dynamically stream a ZIP file to the client containing those files. The module leverages Nginx's single-threaded proxy code and is extremely lightweight.
The module was first released in 2008 and is fairly mature at this point. From your description I think it will suit your needs.
You simply need to use whatever API you have available for you to create a zip file and write it to the response, flushing the output periodically. If this is serving large zip files, or will be requested frequently, consider running it in a separate process with a high nice/ionice value / low priority.
Worst case, you could run a command-line zip in a low priority process and pass the output along periodically.
it's tricky to do, but I've made a gem called zipline ( http://github.com/fringd/zipline ) that gets things working for me. I want to update it so that it can support plain file handles or paths, right now it assumes you're using carrierwave...
also, you probably can't stream the response with passenger... I had to use unicorn to make streaming work properly... and certain rack middleware can even screw that up (calling response.to_s breaks it)
if anybody still needs this bother me on the github page