Why are RackMultipart* files persisting in my Rails /tmp directory?

I'm using Paperclip (2.3) to handle image uploads on a Rails 3.0.3 app running on Ubuntu. Paperclip is handling the uploads as advertised BUT the RackMultipart* files that are created in the application's /tmp folder persist -- that is, they simply accumulate rather than deleting themselves. I realize that I could use tmpreaper to delete old tmpfiles but I'd really like to find a more elegant (and scalable) solution.
I had a previous issue with temp files (i.e. RackMultipart* files) accumulating in the Rails app's root directory (instead of in /tmp). I resolved this by explicitly setting the temp path in my environment.rb file like so:
ENV['TMPDIR'] = Rails.root.join('tmp').to_s
Is there another environment variable that needs to be set to make sure that the tempfiles are handled properly -- i.e. deleted once they've been saved in the model? I'm not sure if this is a problem with Paperclip or my Rails setup.
I've searched high and low but have made little progress on this. I'd be grateful for any leads.
Sincere thanks.
PS - I'm currently using S3 for storage. This doesn't seem to be tied to the problem though -- I had the same problem when I was storing the files locally.

Rack::TempfileReaper is the Rack middleware designed to handle this issue.
http://www.rubydoc.info/github/rack/rack/Rack/TempfileReaper
Adding this line to config/application.rb solves the problem:
config.middleware.use Rack::TempfileReaper
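For context, a minimal sketch of where that line lives, assuming a standard config/application.rb (the MyApp module name is illustrative):
# config/application.rb
module MyApp
  class Application < Rails::Application
    # Deletes request tempfiles (the RackMultipart* files) once the response has been sent
    config.middleware.use Rack::TempfileReaper
  end
end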

I don't know if this is any more elegant, but this is what I am doing after the file is saved:
tempfile = params[:file].tempfile.path
if File.exist?(tempfile)
  File.delete(tempfile)
end

UPDATE: The problem should be resolved in rack 1.6.0.beta2. I see it's already being used in Rails 4.2.0.rc2.
The workaround below served me well for almost a year:
I've added this at the end of the controller action that accepts file uploads:
Thread.new { GC.start }
This triggers garbage collection of unused Rack::Request objects, which also deletes the associated temp files. Note that it doesn't sweep the temp file of the current request, but it does remove files from previous requests and prevents them from accumulating.
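A minimal sketch of the workaround in context; the controller, model, and parameter names are illustrative, not from the original post:
class UploadsController < ApplicationController
  def create
    @photo = Photo.create!(image: params[:file])
    # Run GC off the request thread; collecting unreferenced Rack::Request
    # objects finalizes their Tempfiles, which removes RackMultipart* files
    # left over from previous requests (not the current one).
    Thread.new { GC.start }
    redirect_to @photo
  end
end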

Related

Error importing app from backup on wit.ai

Since this weekend, when I try to create a new app in Wit.ai by importing from a backup just saved from another app, I get a blank error message and the new app receives only part of the information from the backup.
Has anyone encountered the same problem? Any suggestions on why it happens and how to solve it?
If you re-pack the original archive, do not add directory entries to the zip.
zip --no-dir-entries project.zip -r project
adding: project/actions.json (stored 0%)
adding: project/app.json (stored 0%)
adding: project/entities/intent.json (stored 0%)
adding: project/stories.json (stored 0%)
will work.
Adding the directory entry for entities
adding: project/entities/ (stored 0%)
seems to break the importer.
I kept receiving the same error. In my case it had nothing to do with the content, formatting, or encoding of any of the *.json files.
The solution that works for me now is:
1) Export zip from any Wit.ai application (Even completely empty)
2) Copy all *.json files that are meant to be uploaded directly to this zip - overwrite or append files as needed
3) Import your app from a backup using this modified zip file
I had the exact same problems, so if anyone else stumbles upon this: check that your bot is valid and has no empty bookmarks (my first issue), recursive errors (I had this as well), or similar.
I noticed the same issue with the new wit.ai UI, so I just use the previous version of the API for backup and restore.

Reprocessing S3 asset with Paperclip

Background:
I have implemented user-defined cropping on image uploads, roughly as per Ryan Bates' Railscast #182.
This works when set to the :file storage method, but not when set to :s3. S3 storage was working fine before adding the intermediate cropping step.
From the server log, it appears to be looking for the source file locally:
[paperclip] An error was received while processing: #<Paperclip::Errors::NotIdentifiedByImageMagickError: /profiles/pictures/000/001/543/original/headshot.jpg is not recognized by the 'identify' command.>
This file is present on S3, but not locally by this point, as the upload is processed before being cropped (as well as after).
My question:
How can I bring the file down from S3 to the local server before the second process step?
N.B. I have looked at other answers on SO already.
Paperclip looking for file locally for reprocessing when using S3 – seems relevant, but the only answer refers to downgrading Paperclip. I can’t do that, and besides, that answer is neither upvoted nor accepted.
Error reprocessing in Paperclip 2.3.5 – this is about an older version of Paperclip.
Other thoughts:
It has occurred to me that another approach would be to store the file locally until it has been cropped, and then use DelayedJob or something similar to upload it to S3 later on. This will be more work though, so I’d rather avoid it for now.
In order to better understand what's happening, it would be cool to see your model setup. Specifically, I'm looking for the has_attached_file configuration.
Just to cover the basics of what I'm looking for, here's an example:
has_attached_file :picture,
  # path: optional, the default is fine
  url: ':s3_alias_url',
  s3_protocol: 'https',
  s3_host_alias: 'cdn.<something>.com', # or 's3.amazonaws.com/bucketname/'
  storage: :s3,
  s3_credentials: Proc.new { |a| a.instance.credentials }
When you reprocess an image, it should be brought down into a temp file and processed there, then reuploaded with these settings.
Based on the profiles/pictures/000/001/543/original/headshot.jpg path, it almost looks like it's grabbing your path value but not going to your S3 bucket to get that image, so I would check the storage value specifically.
With more info, I can update my answer appropriately.
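Not part of the answer above, but one way to bring the file down from S3 before the second processing step, assuming a Paperclip version that provides Attachment#copy_to_local_file (present in Paperclip 3.x; check your version). Everything here except the picture attachment is illustrative:
# In the model that declares has_attached_file :picture
def crop_with_local_copy!
  local = Tempfile.new(['original', File.extname(picture_file_name)])
  local.binmode
  # Pull the stored original down from S3 to a local path
  picture.copy_to_local_file(:original, local.path)
  # Re-run the styles (including the crop step) against the local copy
  self.picture = File.open(local.path)
  save!
ensure
  if local
    local.close
    local.unlink
  end
end
Note that reassigning the attachment rewrites the stored file name to the temp file's name; preserving the original name takes an extra step (for example via picture.instance_write(:file_name, ...)), which I've left out to keep the sketch short.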

Copying git history to new Rails app?

I decided to do a big coding overhaul in one of my Rails apps where it was easier to start again from scratch and then re-add the pieces from my old code one by one until I had most of it reintegrated. Now I want to take what I have and make it my new app, getting rid of the old one, but I want the old git log from that app to be preserved, and the changeover to be treated like just another commit (albeit a big one). Is there any good way to do this?
I thought of deleting the entire directory structure of the old app except for the .git directory and then copying in the whole directory structure of the new app, adding, and committing, but that seems rather messy. Is there a better way to do it?
That sounds perfect. You'll want a git add -u in there as well to catch any deletions.

How do you edit .tmp file removal settings?

Using Struts 2: when will the .tmp file - that gets created after uploading a file - be deleted?
How can you customize when the .tmp file should be deleted? Do you have to create a copy of it?
Please don't be shy to give some code :)
1. This depends on which version of S2 you're talking about.
S2.2.1 and prior: the file upload interceptor deleted temp files.
S2.2.3 and above: the filter dispatchers start the deletion process, changed due to WW-3490.
2. Assuming you're using a recent version, it might be possible to inject a tweaked Dispatcher, although it's not immediately obvious how; if it is possible, that's the easiest change at the core level.
The easiest approach from a practical standpoint is to copy files in the action, which is also pretty fast on any reasonable file system.

How do i force a file to be deleted? Windows server 2008

On my site a user may upload a file (pic, zip, audio, video, whatever). He may then decide to replace it with a newer revision. This user may upload a file, make a post, then decide to put up a new revision replacing the old one (let's say it's a large zip or tar.gz file). There's a good chance people may be downloading it if he sent out an email or even an IM in the home-user case.
The problem: I need to replace the file while people may be downloading it, and it may be some minutes before it can be deleted. I don't want my code to stall until it can delete the file, or to check every second to see if it's unused (especially bad if another user can start downloading and takes a long time, creating a cycle).
How do I delete the file while users are downloading it? I don't care if their downloads stop; I just care that the file can be replaced and that new downloads get the new revision.
What about referencing the files indirectly?
A mapping script maps a virtual file entry on your site to a real file. If the user wants to upload a new revision of his file, you just update the mapping, not the real file.
You can install a daily task that scans all files and deletes those without a mapping and without open connections.
lajuette's answer is right; the easiest solution is to work around the file locking altogether:
When a user uploads file foo.zip, internally store it as foo-v1.zip.
Create a mapping file somewhere (database, code, whatever) that maps foo.zip to foo-v1.zip.
Rather than exposing a direct link to the file, expose a link to a service that gets the file: mysite.com/Download?foo.zip or something. This service uses the mapping to determine which version of the file to send to the client.
When a new version is uploaded, create foo-v2.zip and update the mapping file.
It wouldn't be that hard to write a scheduled task that cleans up old, un-mapped files.
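A minimal sketch of that download service, written in Ruby/Rails terms since the question doesn't name a framework; the model, column, and path names are all illustrative:
class DownloadsController < ApplicationController
  # GET /download/:name, e.g. /download/foo.zip
  def show
    # The mapping stores: virtual name => current physical file name
    mapping = FileMapping.find_by!(virtual_name: params[:name])
    # Stream whichever revision the mapping currently points at (foo-v1.zip, foo-v2.zip, ...)
    send_file Rails.root.join('uploads', mapping.physical_name),
              filename: mapping.virtual_name
  end
end
# When a new revision arrives, write foo-v2.zip to disk and just flip the pointer;
# clients still downloading foo-v1.zip keep their open handles untouched:
# mapping.update!(physical_name: 'foo-v2.zip')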
If you're opposed to a database, and if the filenames are in a fixed format (such as user/id.ext), you could append a revision number to the id, enumerate the folder using a pattern (user/id-*), and use the latest revision.