Rails 3: Can't uncompress zip after compressing it

I want to compress some files in Ruby on Rails and save the zip file in the tmp folder. I've got a Document model which has a name field with an associated uploader. I'm also using Carrierwave to upload files to Amazon S3. I've got the following code:
class Document < ActiveRecord::Base
  mount_uploader :name, DocumentUploader
  ...
end

def create_zip
  documents = Document.all
  folder = "#{Rails.root}/tmp"
  tmp_filename = "#{folder}/export.zip"
  zip_path = tmp_filename
  Zip::ZipFile::open(zip_path, true) do |zipfile|
    documents.each do |document|
      zipfile.get_output_stream(document.name.identifier) do |io|
        io.write document.name.file.read
      end
    end
  end
end
This creates an export.zip file in my tmp folder, but when I try to open it, Archive Manager (Mac OS X) starts unarchiving it and never finishes. I believe something is missing from my code. The size of the zip file looks plausible, yet it still won't open. Any thoughts? Thanks!

Actually, I found out I could open the zip file using another program (zipeg). However, only the last file from the documents array was in the compressed file. I believe I had been overwriting the previous files, as the only remaining file had the same name in all cases (export, the name of the zip itself).
The code below works for me:
def create_zip
  documents = Document.all
  folder = "#{Rails.root}/tmp"
  tmp_filename = "#{folder}/export.zip"
  zip_path = tmp_filename
  Zip::ZipOutputStream.open(zip_path) do |zos|
    documents.each do |document|
      # Start a new entry named after the uploaded file
      path = document.name_identifier
      zos.put_next_entry(path)
      # Write the file's contents into that entry
      zos.write document.name.file.read
    end
  end
end
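For comparison, the same "one entry per document, each under its own name" idea can be sketched with Python's standard zipfile module. This is only an illustration, not the rubyzip API: the build_zip helper and the sample entry names are made up, and the documents are assumed to be available as (name, bytes) pairs.

```python
import io
import zipfile


def build_zip(documents):
    """Return zip bytes with one entry per (entry_name, data) pair."""
    buffer = io.BytesIO()
    seen = set()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for entry_name, data in documents:
            # Refuse duplicate names instead of silently shadowing an entry.
            if entry_name in seen:
                raise ValueError(f"duplicate entry name: {entry_name}")
            seen.add(entry_name)
            archive.writestr(entry_name, data)
    return buffer.getvalue()


# Hypothetical sample data standing in for the uploaded documents.
payload = build_zip([("a.txt", b"first"), ("b.txt", b"second")])
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    entry_names = archive.namelist()
```

Checking the list of entry names right after writing is a quick way to notice that every document landed under the same name.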

Related

Extracting text from several images

I want to extract text from several images.
I want to do it in colab.
I know how to do it with one image: https://github.com/bhadreshpsavani/ExploringOCR/blob/master/OCRusingTesseract.ipynb
But how do I do it in a loop, since I have more than a hundred pictures?
Thanks in advance!
I uploaded my images to the root directory in Colab and solved the task with the following code:
import os

import pytesseract
from PIL import Image

image_ext = ['.jpg', '.png', '.jpeg']
directory = '/'
for file in os.listdir(directory):
    # Skip anything that is not an image
    ext = os.path.splitext(file)[-1].lower()
    if ext not in image_ext:
        continue
    filename = os.path.join(directory, file)
    extracted_information = pytesseract.image_to_string(Image.open(filename))
    print(extracted_information)
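With more than a hundred pictures it may be handier to keep the text per file instead of printing it. A minimal sketch of that variation follows; the extract_text_from_images helper and its read_text parameter are hypothetical names, and the OCR call is passed in as a function so that the loop itself does not depend on pytesseract being installed.

```python
import os

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def extract_text_from_images(directory, read_text):
    """Map each image file name in `directory` to the text `read_text` returns."""
    results = {}
    for name in sorted(os.listdir(directory)):
        # Skip anything that is not an image.
        if os.path.splitext(name)[1].lower() not in IMAGE_EXTENSIONS:
            continue
        path = os.path.join(directory, name)
        try:
            results[name] = read_text(path)
        except Exception as error:
            # Keep going if one image cannot be read; record the failure.
            results[name] = f"ERROR: {error}"
    return results
```

In the notebook above, read_text could be something like lambda p: pytesseract.image_to_string(Image.open(p)), which reuses the same two calls as the loop in the answer.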

How to rename multiple files from the multiple text files?

I am using Windows 10 and I have files like these:
folder
  2020-04-23_19-30-52_UTC.mp4
  2020-04-23_19-30-52_UTC.txt which contains string "This video is me at a wedding"
  2020-05-25_19-30-52_UTC.mp4
  2020-05-25_19-30-52_UTC.txt which contains string "This video is dogwalk at the sunset"
Each .txt file contains the name for the .mp4 with the same timestamp, and I want to end up with this:
folder
  This video is me at a wedding.mp4
  2020-04-23_19-30-52_UTC.txt
  This video is dogwalk at the sunset.mp4
  2020-05-25_19-30-52_UTC.txt
There are a few ways to achieve this, but I am not that good at coding. My only priority is to get it done, and for now I am not limited to any particular tool or programming language.
Thanks
I'd tackle this problem with Python.
import os

folder = '[path to original folder]'

# Iterate through all the files in the folder
for filename in os.listdir(folder):
    base, ext = os.path.splitext(filename)
    # Only the text files carry the new names
    if ext.lower() != '.txt':
        continue
    # Read the description, dropping any trailing newline or spaces
    with open(os.path.join(folder, filename), 'r') as f:
        new_name = f.read().strip() + '.mp4'
    # The video shares its base name with the text file
    old_name = base + '.mp4'
    if os.path.exists(os.path.join(folder, old_name)):
        os.rename(os.path.join(folder, old_name), os.path.join(folder, new_name))
If any more issues arise please let me know so I can help.
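Since a rename is hard to undo, it can help to build the list of renames first and look at it before touching anything. Here is a rough sketch of that idea with pathlib; plan_renames and apply_renames are hypothetical helper names, the text files are assumed to be UTF-8, and descriptions containing characters that Windows does not allow in file names are not handled.

```python
from pathlib import Path


def plan_renames(folder):
    """Return (old_path, new_path) pairs for every .txt/.mp4 pair in `folder`."""
    plan = []
    for txt_path in sorted(Path(folder).glob("*.txt")):
        video_path = txt_path.with_suffix(".mp4")
        if not video_path.exists():
            continue  # no matching video for this description
        title = txt_path.read_text(encoding="utf-8").strip()
        if not title:
            continue  # an empty description gives no usable name
        plan.append((video_path, video_path.with_name(title + ".mp4")))
    return plan


def apply_renames(plan):
    """Rename each pair, skipping targets that already exist."""
    for old_path, new_path in plan:
        if not new_path.exists():
            old_path.rename(new_path)
```

Printing the result of plan_renames for the folder works as a dry run; apply_renames only needs to be called once the list looks right.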

Rails 5: How to add folders and files in application template script

I would like to add folders and add files (like my own readme.md) to newly created rails apps using application templates.
In template.rb:
require "fileutils"
require "shellwords"

def add_folders
  mkdir views/components/buttons
  mkdir csv/
end

def add_file
  cd csv
  touch user.csv
end

def add_readme
  rm README.md
  touch README.md
  inject_into_file("README.md", "New readme..")
end

after_bundle do
  add_folders
  add_file
  add_readme
end
But I don't know how to do it.
FileUtils covers most of what you want. mkdir_p behaves like the command-line mkdir -p: it creates the full path, including any parent directories that don't exist yet.
File.write (inherited from IO.write) accepts a file name and the content, and overwrites the file if it already exists. There is no need to delete the old file and touch a new one.
Also, you'll want to build your file paths with Rails.root.join. It's similar to File.join in that it helps you build a path without accidentally doubling up the / separators, but it also returns an absolute path on your computer. It keeps the code OS-agnostic too: Unix systems use '/' as the folder separator while Windows uses '\', and Rails.root.join saves you from hard-coding either.
Here's an example of using it on a Unix system:
If Rails.root is '/some/cool/path/here', then Rails.root.join('views', 'components', 'buttons') would be '/some/cool/path/here/views/components/buttons'.
require 'fileutils'
require 'shellwords'

def add_folders
  FileUtils.mkdir_p(Rails.root.join('views', 'components', 'buttons'))
  FileUtils.mkdir_p(Rails.root.join('csv'))
end

def add_file
  FileUtils.touch(Rails.root.join('csv', 'user.csv'))
end

def add_readme
  File.write(Rails.root.join('README.md'), 'New readme..')
end

after_bundle do
  add_folders
  add_file
  add_readme
end

Read contents of .gz file with python

I'm new to Python and am running into issues reading the contents of a .gz file:
I've got a folder full of .gz files that I've downloaded programmatically using a private API. Each .gz file contains an .xml file, so I need to iterate over the directory and extract them.
The problem appears when I programmatically extract these .gz files into their respective .xml versions. The files are created without error, and when I open one in TextWrangler it looks like a regular .xml file, but NOT when I view it in a hex editor. Also, when I open the .xml file programmatically and print its contents, it shows up as a bunch of jumbled (binary?) text.
With the above in mind, if I manually extract one of the files (i.e. using OS X rather than Python), the file looks the way I'd expect in a hex editor.
Here is my code snippet (appropriate imports not shown, but they are glob and gzip):
searchpattern = siteid + "_" + resource + "_*.gz"
for infile in glob.glob(workingDir + searchpattern):
    print infile
    #read the zipped contents (https://docs.python.org/2/library/gzip.html)
    f = gzip.open(infile, 'rb')
    file_content = f.read()
    file_content = str(file_content) #This was an attempt to fix
    print file_content # This shows a bunch of mumbo jumbo
    #write the contents we just read to a new file (uncompressed)
    newfilename = infile[0:-3] # the filename without the ".gz"
    newfilename = newfilename + ".xml"
    fnew = open(newfilename, 'w+b')
    fnew.write(str(file_content))
    fnew.close()
    #delete the .gz version of the file
    #os.remove(infile)
If I run this against gzipped XML, I don't get any issues with the program.
If I compress an XML file, extract it with this program, and diff the original against the program's output, I get no differences.
This program does add an extra ".xml" extension.
So this turns out to be a silly mistake on my part, but I'll post this as a follow-up for anybody else who makes the same mistake I did.
The problem was that I was gzipping content that had already been gzipped earlier in my program. With that in mind, the code snippet in this thread didn't have anything wrong with it, and (technically) neither did the code that created the .gz file. As you can see below, opening the output file normally, instead of with the gzip library, earlier in the program did the trick.
#Download and write the contents of each response to a .gz file
if limitCounter < limit or int(limit) == 0:
    print _name + " " + scopeStartDate + " through " + scopeEndDate + " at " + href
    file = api.get(href)
    gz_file_content = file.content
    #gz_file = gzip.open(workingDir + _name, "wb") # This breaks the program later
    gz_file = open(workingDir + _name, 'wb') # This works.
    gz_file.write(gz_file_content)
    gz_file.close()
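A quick way to spot this kind of double compression is to look at the first two bytes of the decompressed data: gzip streams start with 0x1f 0x8b. The sketch below is written for Python 3 (the thread itself uses Python 2), and is_gzip and gunzip_all_layers are hypothetical helper names.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(data):
    """True if `data` starts with the two-byte gzip signature."""
    return data[:2] == GZIP_MAGIC


def gunzip_all_layers(data, max_layers=5):
    """Decompress repeatedly until the payload no longer looks gzipped."""
    layers = 0
    while is_gzip(data) and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers


# Simulate the mistake: already-gzipped content gets gzipped a second time.
xml = b"<root><item>1</item></root>"
double = gzip.compress(gzip.compress(xml))
restored, layers = gunzip_all_layers(double)
```

If layers comes back greater than one for a real file, something upstream compressed the content more than once.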

No such file or directory when opening file in memory with ZipRuby Zip::File

I'm consuming an API that replies with a zip file in the body of the HTTP response. I'm able to unzip the file and write each entry to disk using the example from the zip-ruby wiki (https://bitbucket.org/winebarrel/zip-ruby/wiki/Home):
Zip::Archive.open('filename.zip') do |ar| # except I'm opening from a string in memory
  ar.each do |zf|
    if zf.directory?
      FileUtils.mkdir_p(zf.name)
    else
      dirname = File.dirname(zf.name)
      FileUtils.mkdir_p(dirname) unless File.exist?(dirname)
      open(zf.name, 'wb') do |f|
        f << zf.read
      end
    end
  end
end
However, I don't want to write the files to disk. Instead, I want to create an Active Record object and set a Paperclip file attachment:
asset = Asset.create(:name => zf.name)
asset.file = open(zf.name, 'r')
asset.save
What's odd is the open statement in the first example that writes the file to disk works consistently. However, when I want to just open the zf (Zip::File) as a generic File in memory, I will sometimes get:
*** Errno::ENOENT Exception: No such file or directory - assets/somefilename.png
How can I assign the Zip::File zipruby creates to the paperclip file without getting this error?
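One likely reason for the intermittent error: open(zf.name, 'r') looks up a path on disk, so it only succeeds when a file with that name happens to exist already (for instance, left over from an earlier run that extracted the archive). The entry's bytes themselves are only available through zf.read. The general idea of keeping every entry in memory can be sketched with Python's standard library. Python is used here purely for illustration, read_entries is a hypothetical helper, and how to hand such an in-memory stream to Paperclip is not covered.

```python
import io
import zipfile


def read_entries(zip_bytes):
    """Yield (name, file-like object) for each file entry, without touching disk."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no content
            stream = io.BytesIO(archive.read(info))
            # Some consumers expect a file-like object to expose a name.
            stream.name = info.filename
            yield info.filename, stream
```

Each yielded stream behaves like an open file (read, seek) but is backed by the bytes taken from the archive, so nothing has to exist on the filesystem.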