Set content type in S3 when attaching via Paperclip 4?

I'm trying to attach CSV files to a Rails 3 model using Paperclip 4.1.1, but I'm having trouble getting the content type reported by S3 to be text/csv (instead I'm getting text/plain). When I subsequently download the file from S3, the extension gets changed to match the content type instead of preserving the original extension (so test.csv is downloaded as test.txt).
From what I can see, when you upload a file, the FileAdapter caches the content type on creation with whatever value the ContentTypeDetector determined (which calls file -b --mime filename). Unfortunately, CSV files come back as text/plain, which makes sense; how could you really distinguish them? Attempts to set the content type with attachment.instance_write(:content_type, 'text/csv') only set the value in the model and do not affect what gets written to S3.
FileAdapter's content_type is initialized here: https://github.com/thoughtbot/paperclip/blob/v4.0/lib/paperclip/io_adapters/file_adapter.rb#L14
The call that creates that io_adapter:
https://github.com/thoughtbot/paperclip/blob/v4.0/lib/paperclip/attachment.rb#L98
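For what it's worth, the detection can be reproduced outside of an upload. A minimal sketch, assuming Paperclip 4.x's ContentTypeDetector API and a test.csv that exists on disk:
require 'paperclip'
# Ask Paperclip what it would detect for a CSV file; it shells out to
# `file -b --mime`, which reports text/plain for CSV content.
puts Paperclip::ContentTypeDetector.new('test.csv').detect
# => "text/plain"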
I really have a generic upload here (so I can't hard-code the content type in the S3 headers definition in has_attached_file), and I don't really want the content-type spoofing protection. Any ideas/suggestions? I would prefer not to downgrade to 3.5, since that would just delay the pain, but if it's the only way, I'll entertain it...

If you are using fog then you can do something like this:
has_attached_file :report,
  fog_file: lambda { |attachment|
    {
      content_type: 'text/csv',
      content_disposition: "attachment; filename=#{attachment.original_filename}",
    }
  }
If you are using Amazon S3 as your storage provider, then something like this should work:
has_attached_file :report,
  s3_headers: lambda { |attachment|
    {
      'Content-Type' => 'text/csv',
      'Content-Disposition' => "attachment; filename=#{attachment.original_filename}",
    }
  }

I had this problem recently, and neither the post-processing hook nor the lambda worked for me, so I used a workaround. As others have observed, the attachment's values are empty when the s3_headers lambda is called.
Add this line to the model:
attr_accessor :tmp_content_type, :tmp_file_name
Override the file assignment method so we can capture the file info and store it for later use:
def file=(f)
  set_tmp_values(f.path)
  file.assign(f)
end

def set_tmp_values(file_path)
  self.tmp_file_name = File.basename(file_path)
  # MIME::Types is provided by the mime-types gem
  self.tmp_content_type = MIME::Types.type_for(file_path).first.content_type
end
Use the temp vars in the s3_headers lambda:
:s3_headers => lambda { |attachment|
  {
    'Content-Type' => attachment.tmp_content_type,
    'Content-Disposition' => "attachment; filename=\"#{attachment.tmp_file_name}\""
  }
}

Related

Correct code to upload local file to S3 proxy of API Gateway

I created an API Gateway function to work with S3 and imported the Swagger template. After deployment, I tested it from a Node.js project using the npm module aws-api-gateway-client.
It works well for listing buckets, getting bucket info, getting one item, putting a bucket, and putting a plain-text object; however, I am blocked on putting a binary file.
First, I made sure the ACL allows all permissions on S3. Second, binary support was also added for:
image/gif
application/octet-stream
The code snippet is below. The behaviors are:
1) After invokeApi, the callback function is never hit; after some time, the Node.js project stops responding, with no error message at all. The file (such as an image) is very small.
2) On just two occasions the upload seemed to work, but the resulting file was bigger (around 2 MB bigger) than the original, so the file is corrupt.
Could you help me out? Thank you!
var fs = require('fs'); // required but missing from the snippet as posted

var filepathname = './items/';
var filename = 'image1.png';

fs.stat(filepathname + filename, function (err, stats) {
  var fileSize = stats.size;
  // Note: the 'binary' (latin1) encoding yields a string, not a Buffer,
  // which can inflate/corrupt binary data if it is re-encoded downstream.
  fs.readFile(filepathname + filename, 'binary', function (err, data) {
    var len = data.length;
    console.log('file len' + len);
    var pathTemplate = '/my-test-bucket/' + filename;
    var method = 'PUT';
    var params = {
      folder: '',
      item: ''
    };
    var additionalParams = {
      headers: {
        'Content-Type': 'application/octet-stream',
        //'Content-Type': 'image/gif',
        'Content-Length': len
      }
    };
    apigClient.invokeApi(params, pathTemplate, method, additionalParams, data)
      .then(function (result) {
        // never hit :(
        console.log(result);
      })
      .catch(function (result) {
        // never hit :(
        console.log(result);
      });
  });
});
We encountered the same problem. API Gateway is meant for limited payload sizes (10 MB as of this writing); the limits are shown here:
http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
Presigned URL to S3:
Create an S3 presigned URL for the POST from the Lambda or the endpoint where you are trying to post.
How do I put object to amazon s3 using presigned url?
Now POST the image directly to S3.
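For reference, a minimal sketch of generating such a URL with the Ruby SDK (aws-sdk-s3); the bucket and key are placeholders:
require 'aws-sdk-s3'
# Generate a presigned PUT URL valid for five minutes; the client then
# uploads the raw bytes directly to S3 with a plain HTTP PUT.
signer = Aws::S3::Presigner.new
url = signer.presigned_url(:put_object,
                           bucket: 'my-test-bucket',
                           key: 'image1.png',
                           expires_in: 300)
puts url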
Presigned POST:
Apart from posting the image, if you want to post additional properties, you can post them in multipart form format as well.
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#createPresignedPost-property
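A sketch of the same idea with the Ruby SDK, assuming the aws-sdk-s3 resource interface; the bucket name is a placeholder:
require 'aws-sdk-s3'
# Build a presigned POST; post.url is the form action and post.fields
# are the hidden fields to include in the multipart/form-data body.
bucket = Aws::S3::Resource.new.bucket('my-test-bucket')
post = bucket.presigned_post(key: 'uploads/${filename}')
puts post.url
puts post.fields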
If you want to process the file after it is delivered to S3, you can create a trigger on S3 object creation and process it with your Lambda or any endpoint that needs to process it.
Hope it helps.

Fetch an AWS S3 object to use in Rekognition when uploaded via Carrierwave

I have Gallery and Attachment models. A gallery has_many attachments, and essentially all attachments are images referenced in the ':content' attribute of Attachment.
The images are uploaded using the Carrierwave gem and stored in AWS S3 via the fog-aws gem. This works OK. However, I'd like to run image recognition on the uploaded images with Amazon Rekognition.
I've installed the aws-sdk gem and I'm able to instantiate Rekognition without a problem until I call the detect_labels method, at which point I have been unable to use my attached images as arguments to this method.
So far I've tried:
@attachment = Attachment.first
client = Aws::Rekognition::Client.new
resp = client.detect_labels(
  image: @attachment
)
# I GET: expected params[:image] to be a hash... and got class 'Attachment' instead
I've tried using:
client.detect_labels( image: { @attachment })
client.detect_labels( image: { @attachment.content.url })
client.detect_labels( image: { @attachment.content })
All with the same error. I wonder how I can fetch the S3 object from @attachment and, even if I could, how I could use it as an argument to detect_labels.
I've also tried fetching the S3 object directly to try this last bit:
s3 = Aws::S3::Client.new
s3_object = s3.list_objects(bucket: 'my-bucket-name').contents[0]
# and then
client.detect_labels( image: { s3_object })
Still no success...
Any tips?
I finally figured out what the problem was, with help from an AWS forum thread.
The 'Image' hash key takes as its value an object that must be named 's3_object', which in turn needs only the S3 bucket name and the path of the file to be processed. See the corrected example below:
client = Aws::Rekognition::Client.new
resp = client.detect_labels(
  image: {
    s3_object: {
      bucket: "my-bucket-name",
      name: @attachment.content.path,
    },
  }
)
# @attachment.content.path => "uploads/my_picture.jpg"
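Once that works, the response can be consumed as usual; for example, iterating the detected labels (field names as defined by the aws-sdk Rekognition response):
resp.labels.each do |label|
  puts "#{label.name}: #{label.confidence.round(1)}%"
end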

How to set the Content Disposition header with Paperclip / fog for a file hosted on S3?

I'm using Paperclip 4.2.0 and fog 1.24.0, and I host files on S3. I want to generate an expiring URL that has the "Content-Disposition" header set to "attachment".
Paperclip has an option to pass additional parameters to S3 expiring URLs, but I can't get it working when using Paperclip with Paperclip::Storage::Fog.
This fog issue gives the following solution:
file.url(60.seconds.from_now, { :query => { 'response-content-disposition' => 'attachment' } })
but it does not work for me. My Rails model ResourceDocument has has_attached_file :target. document.target.url(60.seconds.from_now, { :query => { 'response-content-disposition' => 'attachment' } }) returns the same URL as document.target.url(60.seconds.from_now), i.e. no content-disposition is included in the generated URL: "xxx.s3.amazonaws.com/uploads/resource_documents/targets/40/2014-12-01%2017:26:20%20UTC/my_file.csv"
I am using the aws-sdk gems and this works fine for me; hope it's helpful for you.
gem 'aws-sdk-core'
gem 'aws-sdk'
and a method on the model:
def download_url
  s3 = AWS::S3.new
  s3_videos_bucket = 'xxxx' # bucket name goes right here
  bucket = s3.buckets[s3_videos_bucket]
  object_path = 'xxxx' # file path goes right here
  object = bucket.objects[object_path]
  object.url_for(:get, {
    expires: 60.minutes,
    response_content_disposition: 'attachment;'
  }).to_s
end
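Then, for example, a controller action can redirect to this expiring URL (a sketch; the action name and lookup are assumptions):
def download
  document = ResourceDocument.find(params[:id])
  redirect_to document.download_url
end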

Preventing duplicates when seeding with existing image using Paperclip + Amazon S3

Every time I re-seed my database locally, duplicate images are being created in my Amazon S3 bucket. I think this is happening because I am not seeding correctly, but I don't know the proper way to do it. I've been using the method shown here. I'm using Rails 4, Ruby 2, paperclip 3.5.2, and aws-sdk 1.20.0.
You can see below in my seeds.rb file that I'm trying to set the image to the URL of an image that has already been uploaded to the correct folder in my bucket. However, I think using open() here is causing a new, identical file to be saved to the same folder, usually something like http://s3.amazonaws.com/BUCKET_NAME/restaurants/images/1/original/open-uri20131111-22904-xvzitl.?1384211739.
EDIT: so my bucket ends up with both this file and http://s3.amazonaws.com/BUCKET_NAME/restaurants/images/1/original/NAME.jpg
Would really appreciate any help!
model
has_attached_file :image,
  :styles => { :medium => "300x300>", :thumb => "100x100>" }
seeds.rb
Restaurant.create!( name: ...,
  description: ...,
  image: open('https://s3.amazonaws.com/<BUCKET NAME>/restaurants/images/1/original/<NAME>.jpg') )
config/initializers/paperclip.rb
Paperclip::Attachment.default_options[:storage] = :s3
Paperclip::Attachment.default_options[:s3_credentials] = {
:access_key_id => ENV['AWS_ACCESS_KEY_ID'],
:secret_access_key => ENV['AWS_SECRET_ACCESS_KEY']
}
Paperclip::Attachment.default_options[:bucket] = ENV['AWS_BUCKET']
Paperclip::Attachment.default_options[:url] = ":s3_path_url"
Paperclip::Attachment.default_options[:path] = "/:class/:attachment/:id/:style/:basename.:extension"
Paperclip::Attachment.default_options[:default_url] = "https://s3.amazonaws.com/<BUCKET NAME>/images/missing.png"
I'm pretty late to the party on this one, but I figure others may still be having the same problem. If you set the attachments on your models to nil before deleting the records, Paperclip will delete the files from S3.
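A minimal sketch of that idea, using the Restaurant model from the question:
# Clear each attachment first so Paperclip deletes the files from S3,
# then remove the records before re-seeding.
Restaurant.find_each do |restaurant|
  restaurant.image = nil
  restaurant.save!
end
Restaurant.delete_all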

Amazon AWS S3 to Force Download Mp3 File instead of Stream It

I'm using Amazon S3 to store mp3 files and then allow our site visitors to download them. I use S3Fox to manage the files; everything seemed to work fine until recently, when we got many complaints from visitors that the mp3 was streamed in the browser instead of triggering the browser's save dialog.
I tried several mp3s and noticed that for some of them the save dialog appears, while others are streamed in the browser. What can I do to force the mp3 file to be downloaded instead of streamed in the web browser?
Any help would be much appreciated.
Thanks
In order to do so you need to set the Content-Disposition header:
Content-Disposition: attachment; filename=song.mp3
I don't think this is possible with S3Fox. You could use Bucket Explorer (not free) or write a script to upload the files.
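Such a script can also fix files that are already in the bucket by copying each object over itself with new headers. A minimal sketch with the Ruby SDK (aws-sdk-s3); the bucket and key are placeholders:
require 'aws-sdk-s3'
# Copy the object onto itself, replacing its metadata so S3 serves it
# with a save-as dialog instead of streaming it.
s3 = Aws::S3::Client.new
s3.copy_object(bucket: 'your_bucket',
               copy_source: 'your_bucket/song.mp3',
               key: 'song.mp3',
               metadata_directive: 'REPLACE',
               content_type: 'audio/mpeg',
               content_disposition: 'attachment; filename=song.mp3')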
OK, it's been a long time since you asked this, but I had the same problem and I'd like to share my solution with the community, just in case someone else needs to solve it. Of course, you can change Content-Type and Content-Disposition from the Amazon S3 console, but the interesting thing is to do it programmatically.
The following code works fine for me:
require_once '../sdk-1.4.2.1/sdk.class.php';

// Instantiate the class
$s3 = new AmazonS3();

// Copy the object over itself and modify the headers
$response = $s3->copy_object(
    array( // Source
        'bucket' => 'your_bucket',
        'filename' => 'Key/To/YourFile'
    ),
    array( // Destination
        'bucket' => 'your_bucket',
        'filename' => 'Key/To/YourFile'
    ),
    array( // Optional parameters
        'headers' => array(
            'Content-Type' => 'application/octet-stream',
            'Content-Disposition' => 'attachment'
        )
    )
);

// Success?
var_dump($response->isOK());
Hope it helps others struggling with the same trouble.
This ended up being my solution for force-downloading files from AWS S3.
In Safari, the files were downloading as .html files until I stopped returning readfile's output and just ran the function on its own.
public function get_download($upload_id)
{
    try {
        $upload = Upload::find($upload_id);

        if ($upload->deleted)
            throw new Exception("This resource has been deleted.");

        if ($upload->filename == '')
            throw new Exception("No downloadable file found. Please email info@clouddueling.com for support.");

        header("Content-Description: File Transfer");
        header("Content-Type: application/octet-stream");
        header("Content-Disposition: attachment; filename={$upload->uploaded_filename};");
        readfile("https://s3.amazonaws.com/stackoverflow/uploads/" . $upload->filename);
        exit;
    } catch (Exception $e) {
        return $e->getMessage();
    }
}
In the S3 Management Console, right-click the file and choose Properties.
Click Metadata.
Click Add more metadata.
Key: Content-Disposition
Value: attachment
Save. That's all.