Paperclip + S3 massive zipping - ruby-on-rails-3

If you have Paperclip + AWS S3 working in your Rails 3 application and you want to zip the attachments related to a model, how do you proceed?

Note: Some questions on Stack Overflow are outdated, and some Paperclip methods are gone.
Let's say we have a User and it :has_many => user_attachments
GC.disable
@user = User.find(params[:user_id])
zip_filename = "User attachments - #{@user.id}.zip" # the file name
tmp_filename = "#{Rails.root}/tmp/#{zip_filename}" # the path
Zip::ZipFile.open(tmp_filename, Zip::ZipFile::CREATE) do |zip|
  @user.user_attachments.each { |e|
    # Copies the S3 object into a local temp file; has_attached_file :attachment (,...)
    attachment = Paperclip.io_adapters.for(e.attachment)
    zip.add("#{e.attachment.original_filename}", attachment.path)
  }
end
# File.binread reads the archive in binary mode and closes the file handle
send_data(File.binread(tmp_filename), :type => 'application/zip', :disposition => 'attachment', :filename => zip_filename)
File.delete tmp_filename
GC.enable
GC.start
The trick is to disable the GC in order to avoid an Errno::ENOENT exception. Otherwise the GC can delete the attachment downloaded from S3 before it gets zipped.
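An alternative to switching the GC off, sketched here with plain Tempfile objects instead of Paperclip adapters, is to keep a reference to every temp file until the zip has been written. This is not from the original answer: fetch_to_tempfile and with_held_tempfiles are hypothetical names, and it assumes the adapter's temp file behaves like a Ruby Tempfile, which is unlinked once the object is garbage collected.

```ruby
require "tempfile"

# Hypothetical stand-in for Paperclip.io_adapters.for(...): writes the body
# to a Tempfile and returns the Tempfile object itself.
def fetch_to_tempfile(body)
  file = Tempfile.new("attachment")
  file.binmode
  file.write(body)
  file.flush
  file
end

# Keeps every Tempfile referenced in `held` while the block runs, so the GC
# cannot finalize (and unlink) them early. Yields the local paths, then
# closes and removes the files explicitly.
def with_held_tempfiles(bodies)
  held = bodies.map { |body| fetch_to_tempfile(body) }
  yield held.map(&:path)
ensure
  held.each(&:close!) if held
end
```

In the controller above, the zip.add calls and the end of the Zip::ZipFile.open block would both have to sit inside the with_held_tempfiles block, so that the files still exist when the archive is written.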
Sources:
to_file broken in master?
io_adapters.for(object.attachment).path failing randomly

Related

How to set the Content Disposition header with Paperclip / fog for a file hosted on S3?

I'm using Paperclip 4.2.0 and fog 1.24.0, and host files on S3. I want to generate an expiring URL that has the "Content-Disposition" header set to "attachment".
Paperclip has an option to pass additional parameters to S3 expiring URLs, but I can't get it working when using Paperclip with Paperclip::Storage::Fog.
This fog issue gives the following solution:
file.url(60.seconds.from_now, { :query => { 'response-content-disposition' => 'attachment' } })
but it does not work for me. My Rails model ResourceDocument has has_attached_file :target. document.target.url(60.seconds.from_now, { :query => { 'response-content-disposition' => 'attachment' } }) returns the same URL as document.target.url(60.seconds.from_now), i.e. no content-disposition is included in the generated URL: "xxx.s3.amazonaws.com/uploads/resource_documents/targets/40/2014-12-01%2017:26:20%20UTC/my_file.csv"
I am using the aws-sdk gems and they work fine for me. I hope this is helpful for you.
gem 'aws-sdk-core'
gem 'aws-sdk'
and model's method:
def download_url
s3 = AWS::S3.new
s3_videos_bucket = 'xxxx' #bucket name goes right here
bucket = s3.buckets[s3_videos_bucket]
object_path = 'xxxx' #file path goes right here
object = bucket.objects[object_path]
object.url_for(:get, {
expires: 60.minutes,
response_content_disposition: 'attachment;'
}).to_s
end
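If the download should also carry a suggested file name, the value passed as response_content_disposition can include a filename parameter. The helper below is a minimal sketch and not part of the original answer: attachment_disposition is a hypothetical name, and it only handles ASCII file names by escaping double quotes and backslashes.

```ruby
# Builds a Content-Disposition value that forces a download and suggests
# a file name. Double quotes and backslashes inside the name are escaped
# so the quoted string stays well formed. (Hypothetical helper.)
def attachment_disposition(filename)
  safe = filename.gsub(/["\\]/) { |ch| "\\#{ch}" }
  %(attachment; filename="#{safe}")
end
```

The result would replace the literal 'attachment;' in download_url, for example response_content_disposition: attachment_disposition('my_file.csv').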

Preventing duplicates when seeding with existing image using Paperclip + Amazon S3

Every time I re-seed my database locally, duplicate images are being created in my Amazon S3 bucket. I think this is happening because I am not seeding correctly, but I don't know the proper way to do it. I've been using the method shown here. I'm using Rails 4, Ruby 2, paperclip 3.5.2, and aws-sdk 1.20.0.
You can see below in my seeds.rb file, I'm trying to set the image to the url of an image that has already been uploaded to the correct folder in my bucket. However, I think using open() here is causing a new, identical file to be saved to the same folder, usually something like http://s3.amazonaws.com/BUCKET_NAME/restaurants/images/1/original/open-uri20131111-22904-xvzitl.?1384211739.
EDIT: so my bucket will have both this file stored as well as http://s3.amazonaws.com/BUCKET_NAME/restaurants/images/1/original/NAME.jpg
Would really appreciate any help!
model
has_attached_file :image,
:styles => { :medium => "300x300>", :thumb => "100x100>" }
seeds.rb
Restaurant.create!( name: ...,
description: ...,
image: open('https://s3.amazonaws.com/<BUCKET NAME>/restaurants/images/1/original/<NAME>.jpg') )
config/initializers/paperclip.rb
Paperclip::Attachment.default_options[:storage] = :s3
Paperclip::Attachment.default_options[:s3_credentials] = {
:access_key_id => ENV['AWS_ACCESS_KEY_ID'],
:secret_access_key => ENV['AWS_SECRET_ACCESS_KEY']
}
Paperclip::Attachment.default_options[:bucket] = ENV['AWS_BUCKET']
Paperclip::Attachment.default_options[:url] = ":s3_path_url"
Paperclip::Attachment.default_options[:path] = "/:class/:attachment/:id/:style/:basename.:extension"
Paperclip::Attachment.default_options[:default_url] = "https://s3.amazonaws.com/<BUCKET NAME>/images/missing.png"
I'm pretty late to the party on this one, but I figure others may still be having the same problem. If you set the attachments on your models to nil before deleting the records, Paperclip will delete the files from S3.

Carrierwave + s3 + fog (Excon::Errors::SocketError)

I'm currently getting the following error: Excon::Errors::SocketError - Broken pipe (Errno::EPIPE) when uploading images bigger than about 150 KB. Images under 150 KB work correctly. Research indicates that others have also experienced this problem, but I have yet to find a solution.
Error message
Excon::Errors::SocketError at /photos
Message Broken pipe (Errno::EPIPE)
File /Users/thmsmxwll/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/openssl/buffering.rb
Line 375
image_uploader.rb
class ImageUploader < CarrierWave::Uploader::Base
include CarrierWave::RMagick
storage :fog
include CarrierWave::MimeTypes
process :set_content_type
def store_dir
"uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
end
version :large do
process :resize_to_limit => [800, 600]
end
end
carrierwave.rb
CarrierWave.configure do |config|
config.fog_credentials = {
:provider => 'AWS',
aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
:region => 'us-east-1'
}
config.fog_directory = 'abcd'
config.fog_public = true
config.fog_attributes = {'Cache-Control'=>'max-age=315576000'}
end
For me, the solution required me to recreate the bucket in the US-Standard region. Originally, the bucket was in the Oregon region and while I wasn't specifying a region in my carrierwave settings, I could not get an upload to complete, even with very small files.
I'm having the same issue. I noticed that it only happens when I upload big files (400 KB); with a smaller file (100 KB) it works fine.

Paperclip- validate pdfs with content_type='application/octet-stream'

I was using Paperclip for file upload, with validations as below:
validates_attachment_content_type :upload, :content_type=>['application/pdf'],
:if => Proc.new { |module_file| !module_file.upload_file_name.blank? },
:message => "must be in '.pdf' format"
But my client complained today that he is not able to upload a PDF. After investigating, I learned from the request headers that the submitted file had content_type=application/octet-stream.
Allowing application/octet-stream would allow many types of files to be uploaded.
Please suggest a solution to deal with this.
Seems like paperclip doesn't detect content type correctly. Here is how I was able to fix it using custom content-type detection and validation (code in model):
VALID_CONTENT_TYPES = ["application/zip", "application/x-zip", "application/x-zip-compressed", "application/pdf", "application/x-pdf"]
before_validation(:on => :create) do |file|
if file.media_content_type == 'application/octet-stream'
mime_type = MIME::Types.type_for(file.media_file_name)
file.media_content_type = mime_type.first.content_type if mime_type.first
end
end
validate :attachment_content_type
def attachment_content_type
errors.add(:media, "type is not allowed") unless VALID_CONTENT_TYPES.include?(self.media_content_type)
end
Based on the above, here's what I ended up with, which is compatible with Paperclip 4.2 and Rails 4:
before_post_process on: :create do
if media_content_type == 'application/octet-stream'
mime_type = MIME::Types.type_for(media_file_name)
self.media_content_type = mime_type.first.to_s if mime_type.first
end
end
For Paperclip 3.3 and Rails 3, I did this a bit differently:
before_validation on: :create do
if media_content_type == 'application/octet-stream'
mime_type = MIME::Types.type_for(media_file_name)
self.media_content_type = mime_type.first if mime_type.first
end
end
validates_attachment :media, content_type: { content_type: VALID_CONTENT_TYPES }
By the way, I needed to do this because testing with Capybara and PhantomJS using attach_file did not generate the correct MIME type for some files.
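The answers above trust the file name to recover the type. A stricter complement, sketched here under the assumption that only PDFs need to be recognised, is to inspect the first bytes of the upload: PDF files start with the signature "%PDF-". The name pdf_signature? is hypothetical, and wiring it into a Paperclip validation is left out.

```ruby
require "stringio"

# Every PDF file begins with these five bytes.
PDF_MAGIC = "%PDF-".b

# Returns true when the IO starts with the PDF signature. The IO is
# rewound before and after reading so later consumers see the full content.
def pdf_signature?(io)
  io.rewind
  header = io.read(PDF_MAGIC.bytesize)
  io.rewind
  header == PDF_MAGIC
end
```

A file reported as application/octet-stream could then be accepted only when pdf_signature? returns true, which avoids accepting arbitrary binaries that merely carry a .pdf extension.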

rails aws-s3 delete file throws AWS::S3::PermanentRedirect error - EU bucket problem?

I'm building a Rails 3 app on Heroku, and I'm using the aws-s3 gem to manipulate files stored in an Amazon S3 EU bucket.
When I try to run AWS::S3::S3Object.delete filename, 'mybucketname', I get the following error:
AWS::S3::PermanentRedirect (The bucket you are attempting to access
must be addressed using the specified endpoint. Please send all future
requests to this endpoint.):
I have added the following to my application.rb file:
AWS::S3::Base.establish_connection!(
:access_key_id => "myAccessKey",
:secret_access_key => "mySecretAccessKey"
)
and the following code to my controller:
def destroy
  song = tape.songs.find(params[:id])
  AWS::S3::S3Object.delete song.filename, 'mybucket'
  song.destroy
  respond_to do |format|
    format.js { render :nothing => true }
  end
end
I found a proposed solution somewhere: add AWS_CALLING_FORMAT: SUBDOMAIN to my amazon_s3.yml file, since aws-s3 supposedly handles EU buckets differently from US ones.
However, this did not work; I get the same error.
Could you please provide any assistance?
Thank you very much for your help.
The problem is that you need to type SUBDOMAIN as an uppercase string in the config. Try that.
You can specify a custom endpoint when initializing the connection:
AWS::S3::Base.establish_connection!(
:access_key_id => 'myAccessKey',
:secret_access_key => 'mySecretAccessKey',
:server => 's3-website-us-west-1.amazonaws.com'
)
You can find the actual endpoint in the AWS console.
The full list of valid options is here: https://github.com/marcel/aws-s3/blob/master/lib/aws/s3/connection.rb#L252
VALID_OPTIONS = [:access_key_id, :secret_access_key, :server, :port, :use_ssl, :persistent, :proxy].freeze
My solution is to set the DEFAULT_HOST constant to the actual service endpoint at initialization time.
In config/initializers/aws_s3.rb:
AWS::S3::DEFAULT_HOST = "s3-ap-northeast-1.amazonaws.com"
AWS::S3::Base.establish_connection!(
:access_key_id => 'access_key_id',
:secret_access_key => 'secret_access_key'
)
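Both answers hard-code the endpoint. A small variation, sketched here as an assumption rather than something the answers propose, reads it from the environment and passes it through the :server option listed in VALID_OPTIONS. s3_connection_options and the S3_ENDPOINT variable are hypothetical names.

```ruby
# Builds the options hash for AWS::S3::Base.establish_connection!.
# `env` defaults to ENV but any Hash-like object works, which keeps the
# helper easy to test. S3_ENDPOINT is a hypothetical variable name.
def s3_connection_options(env = ENV)
  options = {
    :access_key_id     => env.fetch("AWS_ACCESS_KEY_ID"),
    :secret_access_key => env.fetch("AWS_SECRET_ACCESS_KEY")
  }
  server = env["S3_ENDPOINT"].to_s.strip
  # Only override the gem's default host when an endpoint is configured.
  options[:server] = server unless server.empty?
  options
end

# In config/initializers/aws_s3.rb (requires the aws-s3 gem):
#   AWS::S3::Base.establish_connection!(s3_connection_options)
```

With S3_ENDPOINT set to the bucket's regional endpoint, for example s3-ap-northeast-1.amazonaws.com, requests go to that host without reassigning a constant.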