How to receive an uploaded file using node.js formidable library and save it to Amazon S3 using knox? - file-upload

I would like to upload a form from a web page and directly save the file to S3 without first saving it to disk. This node.js app will be deployed to Heroku, where there is no local disk to save the file to.
The node-formidable library provides a great way to upload files and save them to disk. I am not sure how to turn off formidable (or connect-form) from saving file first. The Knox library on the other hand provides a way to read a file from the disk and save it on Amazon S3.
1) Is there a way to hook into formidable's events (on Data) to send the stream to Knox's events, so that I can directly save the uploaded file in my Amazon S3 bucket?
2) Are there any libraries or code snippets that can allow me to directly take the uploaded file and save it Amazon S3 using node.js?
There is a similar question here but the answers there do not address NOT saving the file to disk.

It looks like there is no good way to do it. One reason might be that the node-formidable library saves the uploaded file to disk. I could not find any options to do otherwise. The knox library takes the saved file on the disk and using your Amazon S3 credentials uploads it to Amazon.
Since on Heroku I cannot save files locally, I ended up using transloadit service. Though their authentication docs have some learning curve, I found the service useful.
For those who want to use transloadit using node.js, the following code sample may help (transloadit page had only Ruby and PHP examples)
var crypto, signature;
crypto = require('crypto');
signature = crypto.createHmac("sha1", 'auth secret').
update('some string').
digest("hex")
console.log(signature);

this is Andy, creator of AwsSum:
https://github.com/appsattic/node-awssum/
I just released v0.2.0 of this library. It uploads the files that were created by Express' bodyParser() though as you say, this won't work on Heroku:
https://github.com/appsattic/connect-stream-s3
However, I shall be looking at adding the ability to stream from formidable directly to S3 in the next (v0.3.0) version. For the moment though, take a look and see if it can help. :)

Related

Uploading large files with Carrierwave to S3

So I will need to upload large files (zip files that are a few GB large) to S3, and I would like Carrierwave to manage the download/distribution of those files.
Meaning, when a user pays Carrierwave can automagically generate the dynamic URL and send it to them. I know how to do this already, but it just occurred to me that I have never uploaded files via Carrierwave that are bigger than a few dozen MB, much less a few GB to S3.
Given the flakiness of HTTP connections, I figure this is a suboptimal way to do it.
I don't have that many files to upload (maybe 10 - 20 max), and users won't be uploading them. It will be a storefront where the customers will be buying/downloading the files, not uploading them.
It would be nice if there was a way for me to upload the files into my S3 bucket separately (say FTP, git, or some other mechanism) and then just link it to my app through Carrierwave in some way.
What's the best way to approach this?
Also, don't forget that you will encounter the Heroku 30 second timeout when you are uploading the file in the first place.
Don't worry though, there are options:
Direct Upload - S3 supports direct upload where you present a form which uploads directly to s3 bypassing Heroku, you then receive a call back into your application with the uploaded files details for you to process (https://github.com/dwilkie/carrierwave_direct)
Upload to S3 and then expose bucket/folder in your application to connect to your models. We do this approach with a number of clients. They use Transmit (Mac Client) to upload large assets to S3 and then visit their app to link the asset to a Rails model.
Also, I'm pretty sure S3 is an HTTP based service so you're only going to be able to upload via HTTP.

AWS S3 and AjaXplorer

I'm using AjaXplorer to give access to my clients to a shared directory stored in Amazon S3. I installed the SD, configured the plugin (http://ajaxplorer.info/plugins/access/s3/) and could upload and download files but the upload size is limited to my host PHP limit which is 64MB.
Is there a way I can upload directly to S3 without going over my host to improve speed and have S3 limit, no PHP's?
Thanks
I think that is not possible, because the server will first climb to the PHP file and then make transfer to bucket.
Maybe
The only way around this is to use some JQuery or JS that can bypass your server/PHP entirely and stream directly into S3. This involves enabling CORS and creating a signed policy on the fly to allow your uploads, but it can be done!
I ran into just this issue with some inordinately large media files for our website users that I no longer wanted to host on the web servers themselves.
The best place to start, IMHO is here:
https://github.com/blueimp/jQuery-File-Upload
A demo is here:
https://blueimp.github.io/jQuery-File-Upload/
This was written to upload+write files to a variety of locations, including S3. The only tricky bits are getting your MIME type correct for each particular upload, and getting your bucket policy the way you need it.

red5 with s3(i want to customize the path for streaming videos)

I am using red5 for streaming videos in my project and I am able to play the videos from the local system which are saved in default folder "streams".
Now i want to customize the path and want to get the videos from S3. How do i configure red5 to work with S3. Is this a good practice?
I've got code using the IStreamFilenameGenerator works with S3; I'll warn you now that it may not work with the latest jets3 library, but you'll get the point of how it works by looking through the source. One problem / issue that you must understand when using S3 is that you cannot "record" to the bucket on-the-fly; your flv files can only be transferred to S3 once the file is finalized; there is an example upload call in the Application.class. Whereas "play" from S3 will work as expected.
I added the S3 code to the red5-examples repo: https://github.com/Red5/red5-examples
Search for:
https://stackoverflow.com/search?q=IStreamFilenameGenerator
Or https://www.google.com.au/search?q=IStreamFilenameGenerator+example&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:de:official&client=firefox-a
and you will find some examples howto modify the path(s).
You could alternatively also of course simply mount some drive into the streams folder or I guess a symbolic link would even work. But it might be not that flexible as if you can do it with IStreamFilenameGenerator and generate really some string completely like you want it.
Sebastian

Allowing users to download files as a batch from AWS s3 or Cloudfront

I have a website that allows users to search for music tracks and download those they they select as mp3.
I have the site on my server and all of the mp3s on s3 and then distributed via cloudfront. So far so good.
The client now wishes for users to be able to select a number of music track and then download them all in bulk or as a batch instead of 1 at a time.
Usually I would place all the files in a zip and then present the user a link to that new zip file to download. In this case, as the files are on s3 that would require I first copy all the files from s3 to my webserver process them in to a zip and then download from my server.
Is there anyway i can create a zip on s3 or CF or is there someway to batch / group files in to a zip?
Maybe i could set up an EC2 instance to handle this?
I would greatly appreciate some direction.
Best
Joe
I am afraid you won't be able to create the batches w/o additional processing. firing up an EC2 instance might be an option to create a batch per user
I am facing the exact same problem. So far the only thing I was able to find is Amazon's s3sync tool:
https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
In my case, I am using Rails + its Paperclip addon which means that I have no way to easily download all of the user's images in one go, because the files are scattered in a lot of subdirectories.
However, if you can group your user's files in a better way, say like this:
/users/<ID>/images/...
/users/<ID>/songs/...
...etc., then you can solve your problem right away with:
aws s3 sync s3://<your_bucket_name>/users/<user_id>/songs /cache/<user_id>
Do have in mind you'll have to give your server the proper credentials so the S3 CLI tools can work without prompting for usernames/passwords.
And that should sort you.
Additional discussion here:
Downloading an entire S3 bucket?
s3 is single http request based.
So the answer is threads to achieve the same thing
Java api - uses TransferManager
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html
You can get great performance with multi threads.
There is no bulk download sorry.

How to upload a file from the web onto Amazon S3?

I have a link to a file (like so: http://example.com/tmp/database.csv). I want to upload it directly into S3, instead of downloading it on my computer first (and then uploading). Is this possible?
The file will have to move through some application you write. Amazon S3 does not have any mechanism to execute code or pull files, so the only way to do this is to send it directly from the server where the file is hosted or from another server.