Reprocessing S3 asset with Paperclip

Reprocessing S3 asset with Paperclip - ruby-on-rails-3

Background:
I have implemented user-defined cropping on image uploads roughly as-per Ryan Bates Railscast #182.
This works when set to the :file storage method, but not when set to :s3. S3 storage was working fine before adding the intermediate cropping step.
From the server log, it appears to be looking for the source file locally:
[paperclip] An error was received while processing: #<Paperclip::Errors::NotIdentifiedByImageMagickError: /profiles/pictures/000/001/543/original/headshot.jpg is not recognized by the 'identify' command.>
This file is present on S3, but not locally by this point, as the upload is processed before being cropped (as well as after).
My question:
How can I bring the file down from S3 to the local server before the second process step?
N.B. I have looked at other answers on SO already.
Paperclip looking for file locally for reprocessing when using S3 – seems relevant, but the only answer refers to downgrading Paperclip. I can’t do that, and besides, that answer is neither upvoted nor accepted.
Error reprocessing in Paperclip 2.3.5 – this is about an older version of Paperclip.
Other thoughts:
It has occurred to me that another approach would be to store the file locally until it has been cropped, and then use DelayedJob or something similar to upload it to S3 later on. This will be more work though, so I’d rather avoid it for now.

In order to better understand what's happening, it would be cool to see your model set up. Specifically I'm looking for the "has_attached_file" setup.
Just to cover the basics of what I'm looking for: here's an example
has_attached_file :picture,
path: <optional, default is fine.>
url: ':s3_alias_url',
s3_protocol: 'https',
s3_host_alias: 'cdn.<something>.com' (or, s3.amazonaws.com/bucketname/,
storage: :s3,
s3_credentials: Proc.new{ |a| a.instance.credentials }
When you reprocess an image, it should be brought down into a temp file and processed there, then reuploaded with these settings.
based on the profiles/pictures/000/001/543/original/headshot.jpg it almost looks like it's grabbing your path variable, but not going to your s3 bucket to get that image. so I would check the storage value, specifically.
With more info, I can update my answer appropriately.

Related

Creating thumbnails for images on S3

I have quite common situation, as I suppose. I have website that is lcoated on amazon EC2 and I'd like to move all dynamic files to amazon S3. Everything seems ok, except 2 points:
I'm using library PDFNet with their WebViewer. To display pdf files in browser Webviwer use special ".xod" format. PDFNet provide functionality to convert pdf files to xod format. Let's see an example, when PDF file was upload on S3 and no xod file was created (I'm going to use Lambda to avoid it in future, but still). So in this case I have to download file to my local machine, convert it to xod file and upload xod file on S3(I don't see any other opportunities to do it, but it can take a lot of traffic)?
Second problem is almost the same, but it's linked with thumbnails. Currently I'm dynamically resize thumbnails depending on the required resolution and I'd like to keep it. Amazon Lambda is not situable in this case, what is the best way to do it?

Why do you say that Lambda is not suitable here?
For pt#1 PDFNet gives a library for Java, you can write a lambda function in java (its possible now) and use that to get infinite scale.
For pt#2: Amazons tutorial (http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html) gives a detailed example of how to resize images when uploaded to S3. The example is in nodeJs, you can write a java version as well if you like.
Note that if you want to have custom logic for decision making, you can add attributes while uploading the file in S3 (http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#User-Defined Metadata) which you can use in your lambda function to take decisions while resizing.

Change resolutions of image files stored on S3 server

Is there a way to run imagemagick or some other tool on s3 servers to resize the images.
The way I know is first downloading all the image files on my machine and then convert these files and reupload them on s3 server. The problem is the number of file is more than 10000. I don't want to download all the files on my local machine.
Is there a way to convert it on s3 server itself.

look at it: https://github.com/Turistforeningen/node-s3-uploader.
It is a library providing some features for s3 uploading including resizing as you want

Another option is NOT to change the resolution, but to use a service that can convert the images on-the-fly when they are accessed, such as:
Cloudinary
imgix

Also check out the following article on amazon's compute blog.. I found myself here because i had the same question. I think i'm going to implement this in Lambda so i can just specify the size and see if that helps. My problem is i have image files on s3 that are 2MB.. i dont want them at full resolution because I have an app that is retrieving them and it takes a while sometimes for a phone to pull down a 2MB image. But i dont mind storing them at full resolution if i can get a different size just by specifying it in the URL. easy!
https://aws.amazon.com/blogs/compute/resize-images-on-the-fly-with-amazon-s3-aws-lambda-and-amazon-api-gateway/

S3 does not, alone, enable arbitrary compute (such as resizing) on the data.
I would suggest looking into AWS-Lambda (available in the AWS console), which will allow you to setup a little program (which they call a Lambda) to run when certain events occur in a S3 bucket. You don't need to setup a VM, you only need to specify a few files, with a particular entry point. The program can be written in a few languages, namely node.js python and java. You'd be able to do it all from the console's web GUI.
Usually those are setup for computing things on new files being uploaded. To trigger the program for files that are already in place on S3, you have to "force" S3 to emit one of the events you can hook into for the files you already have. The list is here. Forcing a S3 copy might be sufficient (copy A to B, delete B), an S3 rename operation (rename A to A.tmp, rename A.tmp to A), and creation of new S3 objects would all work. You essentially just poke your existing files in a way that causes your Lambda to fire. You may also invoke your Lambda manually.
This example shows how to automatically generate a thumbnail out of an image on S3, which you could adapt to your resizing needs and reuse to create your Lambda:
http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-s3-events-adminuser-create-test-function-create-function.html
Also, here is the walkthrough on how to configure your lambda with certain S3 events:
http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-s3-events-adminuser.html

Orchard CMS 1.7 Media with S3 Storage

Wanting to use Orchard 1.7 with Media storage on S3 (as I'm deploying to AppHarbor)
So far I'm looking at the S3 Storage provider But its a bit out of date.
Has anyone done this ? is there a better way to use S3 with the new media manager?
I've got images uploading to s3, but they don't display when I click the folder.
here is the Gist of my updated S3Provider
Missing methods for create file, rename folder, get file, and Get storage path. any help on how to complete these would be appreciated.... however stepping through the debugger in VS this doesn't seem to be the root cause of my displaying images issue above.
Edit
Looks like the file is up loading to s3 but not to the database, due to the GetFile method throwing an error...
Edit 2
Added some code to the Get file method. Now that works; (gist updated) Can up load images. However the thumbnails are still not working, they just come back as empty tags ...Think this is because the media manager is using the Open get method - which is supposed to open a file so you can write a stream to it. Don't know how to achieve this with S3... any ideas welcome

As Part of the AWSSKD NuGet package version 1.5.28.3 you can access a S3FileInfo object. I've used this in my S3 Storage File and updated the S3 Storage provider.
This seem to work, need to do a bit more testing on it.
NOTE: I had to add some code on the GetFile Method to ensure the permissions where set correctly otherwise the updating of thumbnails overwrote permissions on the file.... I'm sure there is a better way to do this.

Paperclip & S3: How to change attachment path without reuploading

I have an app with a lot of images uploaded via paperclip and stored on S3. I'm having some trouble where S3 is telling my iOS app that a few of the image keys don't exist (although I see that they indeed DO exist when I take a look at my S3 bucket). One of my theories is that this is being caused by the filenames, so I'd like to simplify my paperclip path.
My existing path is:
:path => "/:class/:style/:id_:basename.:extension"
I'd like it to be
:path => "/:class/:id/:style.:extension"
which is much cleaner.
My problem is that I'm not sure how to go about doing this. My first thought was to change the path format string in the model and then reprocess! all of the attachments, but now I realize that paperclip needs to use the original path to get at the original uploaded images before it can reprocess and save the images to a new path.
Is there a simple, quick way to make this change?
Thanks!

You cannot 'rename' an object in S3. However, there is a copy command that will duplicate the object within S3. After you duplicate the object, delete the original.

Why are RackMultipart* files persisting in my Rails /tmp directory?

I'm using Paperclip (2.3) to handle image uploads on a Rails 3.0.3 app running on Ubuntu. Paperclip is handling the uploads as advertised BUT the RackMultipart* files that are created in the application's /tmp folder persist -- that is, they simply accumulate rather than deleting themselves. I realize that I could use tmpreaper to delete old tmpfiles but I'd really like to find a more elegant (and scalable) solution.
I had a previous issue with temp files (i.e. RackMultipart* files) accumulating in the Rails app's root directory (instead of in /tmp). I resolved this by explicitly setting the temp path in my environment.rb file like so:
ENV['TMPDIR'] = Rails.root.join('tmp')
Is there another environment variable that needs to be set to make sure that the tempfiles are handled properly -- i.e. deleted once they've been saved in the model? I'm not sure if this is a problem with Paperclip or my Rails setup.
I've searched high and low but have made little progress on this. I'd be grateful for any leads.
Sincere thanks.
PS - I'm using currently using S3 for storage. This doesn't seem to be tied to the problem though -- I had the same problem when I was storing the files locally.

The TempFileReaper is the Rack middleware thought to handle this issue.
http://www.rubydoc.info/github/rack/rack/Rack/TempfileReaper
Including this line in the application.rb solves the problem:
config.middleware.use Rack::TempfileReaper

I don't know if this is anymore elegant but this is what I am doing after the file is saved"
tempfile = params[:file].tempfile.path
if File::exists?(tempfile)
File::delete(tempfile)
end

UPDATE: Problem should be resolved in rack-1.6.0.beta2. I see it's already being used in Rails 4.2.0.rc2.
Below workaround served me well for almost a year:
I've added this at the end of controller action that accepts file uploads:
Thread.new { GC.start }
This triggers Garbage Collection of unused Rack::Request objects which also deletes associated temp files. Note it doesn't sweep temp file of current request, but it does remove previous files, and prevents them from accumulating.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas