How to shrink a KVM hard drive image

When I copy the HD image from one machine to another, the file becomes its full size. Is it possible to shrink it back to its actual size? I have googled a lot but found nothing. I hope someone can give me a clue. Thanks!
Leo

Images can be sparse, and your cp might not preserve that for various reasons. I would (see the sketch below):
check the real file usage (du -sh vm.img or qemu-img info vm.img)
use qemu-img convert -S to recreate a sparse image
convert to qcow2 to get a growing, versatile image instead
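A minimal sketch of those steps (filenames, paths and the 4k hole size are placeholders to adjust):
# Compare the apparent size with the actual disk usage
du -sh vm.img
qemu-img info vm.img
# Recreate the raw image sparsely, punching holes in zeroed regions
qemu-img convert -S 4k -O raw vm.img vm-sparse.img
# Or convert to qcow2, which only allocates clusters that are written to
qemu-img convert -O qcow2 vm.img vm.qcow2
# When copying, ask GNU cp to preserve sparseness explicitly
cp --sparse=always vm.img /path/to/destination/vm.img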

Related

Ghostscript to compress a batch of PDFs

I have no experience of programming.
My PDFs won't display images on the iPad in PDFExpert or GoodNotes as the images are in JPEG2000, from what I could find on the internet.
These are large PDFs, up to 1500-2000 pages with images. One of them was an 80MB or so file. I tried printing it with Foxit to convert the images from JPEG2000 to JPG, but the file size jumped to 800MB... plus it's taking too long.
I stumbled upon Ghostscript, but I have NO clue how to use the command line interface.
I am very short on time. Pretty much need a step by step guide for a small script that converts all my PDFs in one go.
Very sorry about my inexperience and helplessness. Can someone spoon-feed me the steps for this?
EDIT: I want to switch the JPEG2000 to any other format that produces less of an increase in file size and causes a minimal loss in quality (within reason). I have no clue how to use Ghostscript. I basically want to change the compression on the images to something that will display correctly on the iPad while maintaining the quality of the rest of the text, as well as the embedded bookmarks.
I'll repeat that I have NO experience with command line...I don't even know how to point GS to the folder my PDFs are in...
You haven't really said what it is you want. 'Convert' PDFs how exactly ?
Note that switching from JPX (JPEG2000) to JPEG will result in a quality loss, because the image data will be quantised (with a different quantisation scheme to JPX) by the JPEG encoder. You can use a lossless compression scheme instead, but then you won't get the same kind of compression. Either way you won't match the JPX compression ratio, no matter what you use; the result will be larger.
A simple Ghostscript command would be:
gs -sDEVICE=pdfwrite -o out.pdf in.pdf
Because JPEG2000 encoding is (or at least was) patent-encumbered, the pdfwrite device doesn't write images as JPX; by default it will write them several times with different compression schemes, and then use the one that gives the best compression (practically always JPEG).
Getting better results will require a more complex command line, but you'll also have to be more explicit about what exactly you want to achieve, and what the perceived problem with the simplistic command line is. One possible shape of such a command is sketched below.
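As an illustration only (a sketch, not a recommendation; the parameter values are placeholders to tune), a more explicit command could force JPEG (DCT) encoding and downsample high-resolution images:
gs -sDEVICE=pdfwrite \
   -dAutoFilterColorImages=false -dColorImageFilter=/DCTEncode \
   -dDownsampleColorImages=true -dColorImageResolution=150 \
   -o out.pdf in.pdf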
[EDIT]
Well, giving help on executing a command line is a bit off-topic for Stack Overflow; this is supposed to be a site for software developers :-)
Without knowing what operating system you are using it's hard to give you detailed instructions; I also have no idea what an iPad uses. I don't generally use Apple devices, and my only experience is with Macs.
Presumably you know where (in which directory) you installed Ghostscript. Either open a command shell there and type the command ./gs, or execute the command by giving the full path, such as:
/usr/bin/gs
I thought the arguments on the command line were self-explanatory, but....
The -sDEVICE=pdfwrite switch tells Ghostscript to use the pdfwrite device; as you might guess from the name, that device writes PDF files as its output.
The -o switch gives the name (and full path, if required) of the output file.
The final argument is the name (and again, full path if it's not in the current directory) of the input file.
So a command might look like:
/usr/bin/gs -sDEVICE=pdfwrite -o /home/me/output.pdf /home/me/input.pdf
Or if Ghostscript and the input file are in the same directory:
./gs -sDEVICE=pdfwrite -o out.pdf input.pdf
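Since you asked about converting all your PDFs in one go, a minimal batch loop (a bash sketch; the converted/ directory name is a placeholder, and your originals are left untouched) might look like:
mkdir -p converted
for f in *.pdf; do
  gs -sDEVICE=pdfwrite -o "converted/$f" "$f"
done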

Could someone explain Tesseract OCR training to me?

I'm trying to do the training process, but I don't understand even how to start. I would like to train it to read numbers. My images are from the real world, so the reading process didn't go well.
It says that I have to have a ".tif" image with the examples... is that a single image of every number (in this case), or one image with a lot of different examples of each number (same font, though)?
And what about the makebox? The command didn't work here.
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Could someone explain this to me better, at least how to start?
I saw a few programs that do this more quickly; I tried one (SunnyPage 1.8), but it isn't free. Does anyone know any free software that does this? Or a good tutorial?
Using Tesseract 3 on Windows 8 (32-bit).
It is important to patiently follow the training wiki on the Google Code project site, multiple times if needed. It is an open-source library and is constantly evolving.
You will have to create a training image (TIFF) with a lot of different examples of numbers; it should probably contain all the numbers you wish the engine to recognize. A rough outline of the commands is sketched below.
Please consider posting the exact error message you got with makebox.
I think Tesseract is the best free solution available. You have to keep working at it and seek help from the community.
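For orientation only, the overall flow from that wiki looks roughly like this (a sketch; "num" is a hypothetical language code, and the num.font.exp0 naming and the font_properties file follow the wiki's conventions):
# Generate a box file from the training TIFF, then hand-correct it
tesseract num.font.exp0.tif num.font.exp0 batch.nochop makebox
# Train on the corrected box/TIFF pair
tesseract num.font.exp0.tif num.font.exp0 box.train
# Extract the character set and run the clustering steps
unicharset_extractor num.font.exp0.box
mftraining -F font_properties -U unicharset -O num.unicharset num.font.exp0.tr
cntraining num.font.exp0.tr
# Rename the outputs (inttemp, pffmtable, shapetable, normproto) with a
# "num." prefix, then combine everything into num.traineddata
combine_tessdata num.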
There is a very good post from Cédric here explaining the training process for Tesseract.
A good free OCR program is PDF OCR X, which is also based on Tesseract. I tried to copy my notes from German, which I had scanned at 1200dpi, and the results were commendable but not perfect. I found that this website - http://onlineocr.net - is a lot more accurate. If you are not registered, it allows a maximum file size of 4MB from most image formats (BMP, PNG, JPEG etc.) and PDF. It can output them as a Word file, an Excel file or a txt file.
Hope this helps.

Read only one RGB color in Octave

I have to read a few hundred RGB images using the PNG format. I only need one of the colors ( either Red, Green or Blue ), and right now I'm doing something like this:
A = imread(file);
A = A(:, :, 1);
I was wondering if it was possible to only read the values for one color, to make the reading faster. I need this operation to be as fast as possible.
Like @carandraug mentioned, Octave does not provide such a method. You already posted the simplest option. Octave uses ImageMagick as a back-end for reading image files, so there is not much room for optimization here.
Besides, if you really need to speed up the reading process for a rather large quantity of images, you might want to look for alternative reading methods or implement your own. A good place to start is the source code of libpng. Another idea is to convert your RGB PNGs into uncompressed, simple BMPs first; let another fast program of your choice handle the conversion. Create the BMPs, e.g. inside a RAM drive, and read them from Octave with low-level commands (fread). Such strategies can be optimized to some degree, but they are only worth the effort if we are talking about a lot of images. One way of doing the pre-conversion is sketched below.
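As a concrete illustration of the pre-conversion idea (a bash sketch using ImageMagick; the directory layout and the choice of the red channel are placeholders):
mkdir -p red
for f in *.png; do
  # -channel R with -separate keeps only the red channel, written out as a
  # single-channel, uncompressed BMP that is cheap to read back
  convert "$f" -channel R -separate "red/${f%.png}.bmp"
done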

Possibilities to compress an image (size)

I am implementing an application in which I need to find a way to compress images (in size), because it will help me a lot in saving space in the database (server). Please help me with this.
Thanks in advance,
Sekhar Behalam.
Your options are to reduce the dimensions of the images and/or to reduce the quality by increasing the compression. Are the images photographic in nature (JPG is best) or simple solid-colour graphics (use PNGs)?
If the images are JPG (lossy compression) you can simply load and re-save them with a higher compression setting. This can result in a large space saving.
The image quality will of course decline, but you can get away with quite a lot of compression in JPG before it is noticeable. What is acceptable of course is determined by the use of the images (which you have not stated).
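For instance, re-saving at a stronger compression setting can be done server-side with ImageMagick (a sketch; the quality value is a placeholder to tune by eye against your images):
convert input.jpg -quality 60 output.jpg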
Hope this helps.
Also consider pngcrush, a utility that is included with the SDK. In the Project Settings in Xcode, there's an option to "Compress PNG Images"; make sure to check that. Note that this only works for resource images (as far as I know), but you haven't stated whether these images will be user-created at runtime or brought into the app bundle directly.

Tools for JPEG optimization? [closed]

Do you know of any tools (preferably command-line) to automatically and losslessly optimize JPEGs that I could integrate into our build environment? For PNGs I'm currently using PNGOUT, and it generally saves around 40% bandwidth/image size.
At the very least, I would like a tool that can strip metadata from the JPGs - I noticed a strange case where I tried to make a thumbnail from a photograph and couldn't get it smaller than 34 kB. After investigating further, I found that the EXIF data was still part of the image; the thumbnail was 3 kB after removing the metadata.
And beyond that - is it possible to further optimize JPGs losslessly? The PNG optimizer tries different compression strategies, random initialization of the Huffman encoding, etc.
I am aware that most savings come from the JPEG quality parameter, and that it's a rather subjective measure. I'm only looking for a tool that can be run as a build step and that losslessly squeezes a few bytes from the images.
I wrote a GUI for all the image optimization tools I could find, including MozJPEG and jpegoptim, which optimize Huffman tables and progressive scans, and (optionally) remove invisible metadata.
If you don't have a Mac, I also have a basic web interface that works on any platform.
I use libjpeg for lossless operations. It contains a command-line tool, jpegtran, that can do all you want. With the command-line option -copy none all the metadata is stripped, and -optimize does a lossless optimization of the Huffman compression. You can also convert the images to progressive mode with -progressive, but that might cause compatibility problems (does anyone know more about that?)
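For reference, minimal invocations along those lines (a sketch; filenames are placeholders):
# Strip metadata and losslessly optimize the Huffman tables
jpegtran -copy none -optimize -outfile out.jpg in.jpg
# The same, but also converting to progressive mode
jpegtran -copy none -optimize -progressive -outfile out-prog.jpg in.jpg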
[WINDOWS ONLY]
RIOT (Radical Image Optimization Tool)
This is the greatest image optimization tool I have found!
http://luci.criosweb.ro/riot/
You can easily get a 10MB image down to 800KB through sub-sampling.
It supports PNG, GIF, and JPEG.
It even integrates into context menus so you can send pictures straight there.
It allows you to rotate, resize, and compress to a specified size in KB, and more. It also has plugins for GIMP, IrfanView and other programs.
There is also a DLL available if you want to incorporate it into your own programs, e.g. a JavaScript or C++ program.
Another alternative is http://pnggauntlet.com/ - PNGGauntlet takes forever, but it does a pretty good job.
[WINDOWS ONLY]
A new service called JPEGmini produces incredible results. A shame that it's online-only. Edit: it's available for Windows and Mac now.
Tried a number of the suggestions above - I personally was after lossless compression.
My sample image had an original size of 67,737 bytes.
Using kraken.io, it went down to 64,718.
Using jpegtran, it went down to 64,718.
Using Yahoo Smush.it, it went down to 61,746.
Using ImageMagick (-strip), it went down to 65,312.
The smush.py option looks promising, but the installation was too complex for me to do quickly.
jpegrescan looks promising too, but it seems to be Unix-only and I'm using Windows.
JPEGmini is NOT lossless, but I can't tell the difference (down to 22,172).
plinth's Altrasoft jpegstripper app does not work on my Windows 7.
jpegoptim is not available for Windows - no good for me.
RIOT (keeping quality at 100%) got it down to 63,416, and with chroma subsampling set to high it got it down to 61,912 - I don't know whether that is lossless though, and I think it looks lighter than the original.
So my verdict is Yahoo Smush.it if it must be lossless.
I would try ImageMagick. It has tons of command-line options, it's free, and it has a nice license.
http://www.imagemagick.org
There seems to be an option called -strip that may help you:
http://www.imagemagick.org/script/command-line-options.php#strip
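For example (a sketch; filenames are placeholders):
# Copy the image while dropping profiles, EXIF data and comments
convert input.jpg -strip output.jpg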
ImageOptim is really slick. The command-line option posted by the author will populate the GUI and show progress. I used jpegtran for optimizing and converting to progressive, then ImageOptim for further progressive optimizations and for other file types.
Reusing script code also found in this forum (all files are replaced in place):
jpegtran
for file in $(find "$DIR" -type f \( -name "*.jpg" -or -name "*.jpeg" -or -name "*.JPG" \)); do
  echo "found $file for optimizing..."
  jpegtran -copy comments -optimize -progressive -outfile "$file" "$file"
done
ImageOptim
for file in $(find "$DIR" -type f \( -name "*.jpg" -or -name "*.png" -or -name "*.gif" \)); do
  echo "found $file for optimizing..."
  open -a ImageOptim.app "$file"
done
In case anyone's looking, I've written an offline version of Yahoo's Smush.it. It will losslessly optimise PNGs, JPGs and GIFs (animated and static):
http://github.com/thebeansgroup/smush.py
You can use jpegoptim, which will losslessly optimize JPEG files by default. The --strip-all option strips all extra embedded info. You can also specify a lossy mode with the --max switch, which is useful when you have images saved with a very high quality setting that is not necessary for e.g. web content.
You get similar optimization as with jpegtran (see the answer by OutOfMemory), but jpegoptim can't save to progressive JPEGs.
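A couple of typical invocations (a sketch; the filename and the quality threshold are placeholders):
# Lossless optimization, stripping all embedded metadata
jpegoptim --strip-all photo.jpg
# Lossy mode: only re-encode images whose quality factor exceeds 85
jpegoptim --strip-all --max=85 photo.jpg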
I've written a command-line tool called 'picopt' (similar to ImageOptim) that uses external programs to optimize JPEGs, PNGs, GIFs, animated GIFs and even comic book archive contents (CBR/CBZ).
This is suitable for use with homebrew on OS X or Linux systems where you have installed tools like jpegrescan, jpegtran, optipng, gifsicle, etc.
https://github.com/ajslater/picopt
I too would recommend ImageMagick. It has a command-line option to remove EXIF metadata:
mogrify -strip image.jpg
There are plenty of other tools out there that do the same thing.
As far as recompressing JPEGs goes, don't. JPEGs are lossy to start with, so any form of recompression is only going to hurt image quality. However, if you are encoding from lossless source images, some encoders do a better job than others. I have noticed that JPEGs done with Photoshop consistently look better than ones encoded with ImageMagick (despite the same file size), for complicated reasons. Furthermore (and this is relevant to you), I know that at least Photoshop can save JPEGs as "optimized", which means they drop compatibility with some stuff you probably don't care about in order to save a couple of KB. Also, make sure you don't have any colour profiles embedded, and you may be able to save another couple of KB.
I would recommend using http://kraken.io - it's an ultra-fast web app which will optimize your PNG and JPEG files far better than Smush.it does.
I recommend using JpegOptim; it's free and really nice. You can specify the quality, the size you want, and so on, and it's easy to use from the command line.
JpegOptim
May I recommend this for near-transparent (visually lossless) compression:
convert 'yourfile.png' ppm:- | jpeg-recompress -t 97 -q veryhigh -a -m smallfry -s -r -S disable - yourfile.jpg
It uses ImageMagick's convert and jpeg-recompress from jpeg-archive.
Both are open-source and work on Windows, Mac and Linux. You may want to tweak the options above for different quality expectations.