I have a program which takes input from S3, generates a text file, and then sends it to the mapper class. I am unable to write the file to S3, from where the mapper can read it later. Now, I realize that we cannot write files to S3 directly, so I am trying to upload the text file created to S3 using copyFromLocalFile(). However, I get a null pointer exception in the following line:
fs.copyFromLocalFile(true, new Path(tgiPath), mapIP);
I am creating the text file in the main function, so I am not sure where exactly it is being created. The only reason for the null pointer exception that I can think of is that the text file is not being written to the local disk. So my question is: how do I write files to the local disk? If I just specify the name of the file while creating it, where is it created and how do I access it?
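(For reference: a bare filename passed to new File(...) is created in the JVM's working directory, System.getProperty("user.dir"). Below is a minimal sketch of writing to an explicit local path first and then copying it up; the bucket and key names are placeholders.)

import java.io.File;
import java.io.FileWriter;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalThenS3 {
    public static void main(String[] args) throws Exception {
        // A bare filename resolves against the JVM working directory:
        System.out.println("cwd = " + System.getProperty("user.dir"));

        // Write to an explicit absolute path so there is no ambiguity.
        File local = new File("/tmp/mapper-input.txt");
        FileWriter writer = new FileWriter(local);
        writer.write("input for the mapper\n");
        writer.close();

        // Copy it up to S3; "my-bucket" and the key are placeholders.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("s3n://my-bucket"), conf);
        fs.copyFromLocalFile(true, new Path(local.getAbsolutePath()),
                new Path("s3n://my-bucket/input/mapper-input.txt"));
    }
}

Also note that if either fs or the destination Path is null at the failing line, that alone explains the exception, so printing both before the call is a quick sanity check.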
Have a look at Jets3t
This seems to be exactly what you need.
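A minimal upload with JetS3t looks roughly like this (credentials and bucket name are placeholders):

import java.io.File;
import org.jets3t.service.S3Service;
import org.jets3t.service.impl.rest.httpclient.RestS3Service;
import org.jets3t.service.model.S3Bucket;
import org.jets3t.service.model.S3Object;
import org.jets3t.service.security.AWSCredentials;

public class JetS3tUpload {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials -- substitute your own keys.
        AWSCredentials credentials = new AWSCredentials("ACCESS_KEY", "SECRET_KEY");
        S3Service service = new RestS3Service(credentials);

        // Upload a local file into an existing bucket.
        S3Bucket bucket = new S3Bucket("my-bucket");
        service.putObject(bucket, new S3Object(new File("/tmp/mapper-input.txt")));
    }
}

Note that AWSCredentials here is JetS3t's own class (org.jets3t.service.security.AWSCredentials), not the one from the AWS SDK.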
Jets3t is awesome, but I am using Google's App Engine, and it doesn't work there because of threading limitations.
I banged my head against the wall until I came up with a solution that worked on App Engine by combining a bunch of existing libraries: http://socialappdev.com/using-amazon-s3-with-google-app-engine-02-2011
I have a fairly common situation, I suppose. I have a website hosted on Amazon EC2 and I'd like to move all dynamic files to Amazon S3. Everything seems OK except for two points:
I'm using the PDFNet library with its WebViewer. To display PDF files in the browser, WebViewer uses a special ".xod" format, and PDFNet provides functionality to convert PDF files to xod. Consider the case where a PDF file was uploaded to S3 and no xod file was created (I'm going to use Lambda to avoid this in the future, but still). In that case, do I have to download the file to my local machine, convert it to a xod file, and upload the xod file to S3? I don't see any other way to do it, but it could cost a lot of traffic.
The second problem is almost the same, but concerns thumbnails. Currently I dynamically resize thumbnails depending on the required resolution, and I'd like to keep that. Amazon Lambda is not suitable in this case; what is the best way to do it?
Why do you say that Lambda is not suitable here?
For pt#1: PDFNet provides a Java library, and you can write a Lambda function in Java (it's possible now) and use that to get effectively infinite scale.
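A rough sketch of such a function, using the AWS SDK for Java; the handler class name is made up, and the actual PDFNet conversion call is left as a comment since its exact signature depends on your PDFNet release:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GetObjectRequest;
import java.io.File;

public class PdfToXodHandler implements RequestHandler<S3Event, String> {
    private final AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

    @Override
    public String handleRequest(S3Event event, Context context) {
        String bucket = event.getRecords().get(0).getS3().getBucket().getName();
        String key = event.getRecords().get(0).getS3().getObject().getKey();

        // Lambda only offers writable scratch space under /tmp.
        File pdf = new File("/tmp/input.pdf");
        File xod = new File("/tmp/output.xod");
        s3.getObject(new GetObjectRequest(bucket, key), pdf);

        // Convert with PDFNet here (e.g. a Convert.toXod-style call --
        // check the exact API against your PDFNet release).

        s3.putObject(bucket, key.replace(".pdf", ".xod"), xod);
        return "converted " + key;
    }
}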
For pt#2: Amazon's tutorial (http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html) gives a detailed example of how to resize images when they are uploaded to S3. The example is in Node.js; you can write a Java version as well if you like.
Note that if you want custom logic for decision making, you can add user-defined metadata while uploading the file to S3 (http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html), which your Lambda function can read to make decisions while resizing.
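For example, with the AWS SDK for Java ("max-width" is a made-up key used purely for illustration):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import java.io.File;

public class MetadataExample {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Attach user-defined metadata at upload time.
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.addUserMetadata("max-width", "800");
        s3.putObject(new PutObjectRequest("my-bucket", "photos/cat.jpg",
                new File("cat.jpg")).withMetadata(metadata));

        // Later (e.g. inside the Lambda) read it back to drive the resize.
        String maxWidth = s3.getObjectMetadata("my-bucket", "photos/cat.jpg")
                .getUserMetadata().get("max-width");
        System.out.println("max-width = " + maxWidth);
    }
}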
Is there a way to run ImageMagick or some other tool on the S3 servers to resize the images?
The only way I know is to first download all the image files to my machine, convert them, and re-upload them to S3. The problem is that there are more than 10,000 files, and I don't want to download them all to my local machine.
Is there a way to convert them on the S3 server itself?
Have a look at this: https://github.com/Turistforeningen/node-s3-uploader.
It is a library providing some features for S3 uploading, including the resizing you want.
Another option is NOT to change the resolution, but to use a service that can convert the images on-the-fly when they are accessed, such as:
Cloudinary
imgix
Also check out the following article on Amazon's compute blog; I found myself here because I had the same question. I think I'm going to implement this in Lambda so I can just specify the size, and see if that helps. My problem is that I have image files on S3 that are 2 MB; I don't want them at full resolution, because the app that retrieves them can take a while to pull a 2 MB image down to a phone. But I don't mind storing them at full resolution if I can get a different size just by specifying it in the URL. Easy!
https://aws.amazon.com/blogs/compute/resize-images-on-the-fly-with-amazon-s3-aws-lambda-and-amazon-api-gateway/
S3 does not, alone, enable arbitrary compute (such as resizing) on the data.
I would suggest looking into AWS Lambda (available in the AWS console), which will allow you to set up a little program (which they call a Lambda) to run when certain events occur in an S3 bucket. You don't need to set up a VM; you only need to supply a few files with a particular entry point. The program can be written in a few languages, namely Node.js, Python, and Java. You'd be able to do it all from the console's web GUI.
Usually these are set up to compute things when new files are uploaded. To trigger the program for files that are already in place on S3, you have to "force" S3 to emit one of the events you can hook into for the files you already have; the list of supported event types is in the S3 notification documentation. Forcing an S3 copy (copy A to B, then delete B), an S3 rename (rename A to A.tmp, then A.tmp back to A), or creating new S3 objects would all work. You essentially just poke your existing files in a way that causes your Lambda to fire. You may also invoke your Lambda manually.
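One way to poke every existing object, sketched with the AWS SDK for Java (the bucket name is a placeholder): an in-place copy must change something, so tweaking the metadata is a convenient minimal change.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.CopyObjectRequest;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class PokeExistingObjects {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        String bucket = "my-bucket"; // placeholder

        // listObjects returns at most 1000 keys per call; with 10,000+
        // files, paginate with listNextBatchOfObjects.
        for (S3ObjectSummary summary : s3.listObjects(bucket).getObjectSummaries()) {
            // Copying an object onto itself requires changing something,
            // e.g. the metadata; the copy counts as a new PUT and fires
            // the ObjectCreated event your Lambda hooks into.
            ObjectMetadata metadata = s3.getObjectMetadata(bucket, summary.getKey());
            metadata.addUserMetadata("touched", "true");
            s3.copyObject(new CopyObjectRequest(bucket, summary.getKey(),
                    bucket, summary.getKey()).withNewObjectMetadata(metadata));
        }
    }
}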
This example shows how to automatically generate a thumbnail out of an image on S3, which you could adapt to your resizing needs and reuse to create your Lambda:
http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-s3-events-adminuser-create-test-function-create-function.html
Also, here is the walkthrough on how to configure your Lambda with certain S3 events:
http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-s3-events-adminuser.html
The sequence of events that I'm trying to make happen in Meteor is:
1. On the client browser, upload a zip file and send it to the server
2. On the server, receive the zip file and hold it in a memory object
3. Unzip the memory object into individual objects representing the contents
4. Process the individual files one at a time
5. Return success/failure status to the client
I have steps 1 and 2 working, using EJSON to stringify the contents of the zip file on the client and again to convert it back to its original form on the server. The problem I'm encountering is when I try to unzip the object on the server. It seems that every unzip library available wants to operate directly on a file or stream, not on a memory object.
I suppose I could write the object to disk and read it back again, but that seems like an unnecessary step. Is there a library available to unzip a memory object? Alternatively, is there a way to create a stream directly from the object that I can then feed to the unzip routine?
Any advice would be greatly appreciated.
You could use the unzip module from npm. It accepts streaming input and allows you to process output without saving to disk.
It will take some work to wrap it for use with Meteor. Your two options are the meteorhacks:npm package or upgrading to the Meteor 1.3 beta.
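(For illustration only, since this thread is about Meteor: the same buffer-to-stream idea in Java, where the in-memory bytes are wrapped in a stream and fed to the unzip routine without ever touching disk. The npm unzip module consumes a stream analogously.)

import java.io.ByteArrayInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class InMemoryUnzip {
    // Unzip an archive held entirely in memory, never touching disk.
    public static void listEntries(byte[] zipBytes) throws Exception {
        ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes));
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            System.out.println(entry.getName());
            zip.closeEntry();
        }
        zip.close();
    }
}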
I want to use Orchard 1.7 with media storage on S3, as I'm deploying to AppHarbor.
So far I'm looking at the S3 Storage provider, but it's a bit out of date.
Has anyone done this? Is there a better way to use S3 with the new media manager?
I've got images uploading to S3, but they don't display when I click the folder.
Here is the gist of my updated S3Provider.
It is missing methods for creating a file, renaming a folder, getting a file, and getting the storage path; any help on how to complete these would be appreciated. However, stepping through the debugger in VS, this doesn't seem to be the root cause of the image display issue above.
Edit
Looks like the file is uploading to S3 but not to the database, due to the GetFile method throwing an error...
Edit 2
Added some code to the GetFile method, and now that works (gist updated); I can upload images. However, the thumbnails are still not working; they just come back as empty tags... I think this is because the media manager is using the Open get method, which is supposed to open a file so you can write a stream to it. I don't know how to achieve this with S3... any ideas welcome.
As part of the AWSSDK NuGet package version 1.5.28.3 you can access an S3FileInfo object. I've used this in my S3 storage file and updated the S3 storage provider.
This seems to work; I need to do a bit more testing on it.
NOTE: I had to add some code to the GetFile method to ensure the permissions were set correctly; otherwise updating the thumbnails overwrote the permissions on the file... I'm sure there is a better way to do this.
I'm currently working out the best structure for a document I'm trying to create. The document is basically a Core Data document that uses SQLite as its store, but uses the Apple-provided NSPersistentDocument+FileWrapperSupport to enable file wrapper support.
The document makes heavy use of media, such as images, videos, audio files, etc. with potentially 1000s of files. So what I'm trying to do is create a structure similar to the following:
/myfile.ext/
/myfile.ext/store.sqlite
/myfile.ext/content/
/myfile.ext/content/images/*
/myfile.ext/content/videos/*
/myfile.ext/content/audio/*
Now, first of all I went down the route of creating a temporary directory and placing all of my media in there, creating the paths and file names ('/content/images/image1.jpg') exactly as I wanted them to appear in the saved file wrapper, and then upon save I attempted to copy them all into the file wrapper...
What I found was that the files were indeed copied into the wrapper with the file structure I wanted, but when the actual wrapper was saved, these files all magically disappeared.
Great.
So, I trashed my existing solution and tried to use file wrappers instead. This solution involved creating a content directory file wrapper when a new document was created, or loading in an existing content directory file wrapper upon opening a document.
When an image was added/modified, I created the necessary directory wrappers inside this root content wrapper (i.e. an images directory wrapper if it didn't already exist, or any other intermediary directory wrappers that needed to be created) and then created a regular file wrapper for the media, removing any existing wrapper for that file name if one was there.
Saving the document was just a case of making sure the content file wrapper was added to the document file wrapper, and the document would save.
Well... it did. The first time. However, any subsequent change (e.g. add an image and save, then replace the image and save again) did not behave as expected, showing only the image from the first save.
So, my question is: first of all, which of the above approaches is the correct one, if either, and what am I doing wrong that makes them fail?
And secondly, as I expect to be managing thousands of images, is using file wrappers the right way to go about this at all?
With that much media in play, you should likely give your users control over whether the media resides in the document or only a reference to the media is included in the document, and the media resides elsewhere, such as in a library/repository managed by your application. Then they could save out a (potentially many times larger) copy with all references resolved.
You might want to zip/unzip any directory so that users don't get confused trying to attach the document to an email. I believe iWork has been doing this with its document bundles for a while now.
As far as what you are doing wrong, no-one can say, as you haven't provided any code demonstrating what you are doing.
Why don't you create a one-off application that lets you select files on disk and saves those files in a document using a file wrapper? This would let you tackle this functionality without any interference from other issues in your application. Once you understand how to use file wrappers, you can port the code back or just write new code that works.