I have gigabytes of data that would be nice to compress with gzip for storage. Can gnuplot open compressed files? If it can't, is there a way to pipe the data to gnuplot so that the uncompressed file doesn't need to be written to disk?
Something like plot "< gzip -dc compressed.gz" — gnuplot treats a filename that starts with < as a command to read the data from a pipe, so the uncompressed file never has to be written to disk.
I am unfamiliar with compression algorithms. Is it possible with zlib or some other library to decompress, modify and recompress only the beginning of a gzip stream and then concatenate it with the compressed remainder of the stream? This would be done in a case where, for example, I need to modify the first bytes of user data (not headers) of a 10GB gzip file so as to avoid decompressing and recompressing the entire file.
No. gzip's deflate compression uses the preceding data (a 32 KB sliding window) as context when compressing the data that follows, so you can't change the preceding data without recompressing the remaining data.
An exception would be if there were breakpoints put in the compressed data originally that reset the history at each breakpoint. In zlib this is accomplished with Z_FULL_FLUSH during compression.
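For illustration, a minimal Python sketch (Python's zlib module exposes Z_FULL_FLUSH) of how such breakpoints could be written in the first place — the block layout is an assumption, and splicing a replacement first block in later would also require fixing up the gzip trailer (CRC-32 and uncompressed length):

import zlib

def compress_in_blocks(blocks):
    # wbits=31 selects the gzip container; level 9 is an arbitrary choice
    comp = zlib.compressobj(9, zlib.DEFLATED, 31)
    out = []
    for block in blocks:
        out.append(comp.compress(block))
        # Z_FULL_FLUSH ends the current deflate block and resets the history,
        # so later blocks do not refer back into this one and the bytes up to
        # this point could be replaced independently.
        out.append(comp.flush(zlib.Z_FULL_FLUSH))
    out.append(comp.flush(zlib.Z_FINISH))
    return b"".join(out)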
I have 1000+ *.tar.gz files, each 4 GB or larger, but the only thing I need is the top 5 lines of each file. I am wondering whether there is a fast way to read these lines without the full decompression process (it takes 3-5 minutes to decompress a single file).
My platform is Linux.
No, there isn't any faster way.
The issue is that a .tar file is a stream of the concatenated original files (plus some metadata), and gzip then compresses the whole archive as a single stream. Therefore, even just to get the list of files, the archive has to be decompressed from the beginning.
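That said, you can at least read the archive as a stream, so nothing is written to disk and the work stops as soon as the lines are in hand. A rough Python sketch, assuming you want the first lines of the first regular file in each archive:

import tarfile

def head_of_first_member(path, n=5):
    # "r|gz" opens the archive in streaming mode: decompression still starts
    # at the beginning of the gzip stream, but members are read sequentially
    # and nothing is extracted to disk.
    with tarfile.open(path, mode="r|gz") as tar:
        for member in tar:
            if member.isfile():
                f = tar.extractfile(member)
                return [f.readline() for _ in range(n)]
    return []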
We are working on a system (on Linux) that has very limited transmission resources. The maximum file size that can be sent as one file is defined, and we would like to send the minimum number of files. Because of this, all files sent are packed and compressed in GZip format (.tar.gz).
There are a lot of small files of different types (binary, text, images...) that should be packed in the most efficient way to send the maximum amount of data every time.
The problem is: is there a way to estimate the size of the .tar.gz file without running the tar utility, so that the best combination of files can be calculated?
Yes, there is a way to estimate the tar size without writing the archive to disk.
tar -czf - /directory/to/archive/ | wc -c
Meaning:
This writes the archive to standard output and pipes it into wc, a tool that counts bytes. The output is the size of the archive in bytes. Technically tar still runs, but nothing is saved to disk.
Source: The Ultimate Tar Command Tutorial with 10 Practical Examples
It depends on what you mean by "small files", but generally, no. If you have a large file that is relatively homogeneous in its contents, then you could compress 100K or 200K from the middle and use that compression ratio as an estimate for the remainder of the file.
For files around 32K or less, you need to compress them to see how big they will be. Also, when you concatenate many small files in a tar file, you will get better compression overall than you would on the small files individually.
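A rough Python sketch of that sampling estimate (the sample size and position are arbitrary choices):

import os, zlib

def estimated_ratio(path, sample_size=200 * 1024):
    # Compress a sample taken from the middle of the file and use its
    # compression ratio as a rough estimate for the rest of the file.
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, (size - sample_size) // 2))
        sample = f.read(sample_size)
    if not sample:
        return 1.0
    return len(zlib.compress(sample, 6)) / len(sample)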
I would recommend a simple greedy approach where you take the largest file whose size plus some overhead is less than the remaining space in the "maximum file size". The overhead is chosen to cover the tar header and the maximum expansion from compression (a fraction of a percent). Then add that to the archive. Repeat.
You can flush the compression at each step to see how big the result is.
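A hypothetical sketch of that greedy loop in Python, budgeting each file at its uncompressed size plus a small overhead (tar header and padding, plus the worst-case expansion gzip can add to incompressible data):

import os, tarfile

def pack_greedy(paths, archive_name, max_size, overhead=1024):
    # Largest-first greedy packing: add each file whose budgeted cost still
    # fits; everything else is left over for the next archive.
    budget = max_size
    packed, leftover = [], []
    with tarfile.open(archive_name, "w:gz") as tar:
        for p in sorted(paths, key=os.path.getsize, reverse=True):
            cost = os.path.getsize(p) + overhead
            if cost <= budget:
                tar.add(p)
                budget -= cost
                packed.append(p)
            else:
                leftover.append(p)
    return packed, leftover

This budget is conservative because compression usually shrinks the data; flushing the compressor after each file, as suggested above, would give the actual compressed size so far and let you pack each archive tighter.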
I have a file upload and I save the bytes of the uploaded files into a database.
Now I want to compress the files before saving them into the database.
I have gone through the site below:
http://programmerpayback.com/2010/01/21/use-silverlight-to-resize-images-and-increase-compression-before-uploading/
The site above has a solution for JPEG and PNG, but I want to compress files of any type, store their bytes in the database, and get back bytes identical to the original files when I read them out.
Please guide me on how to do this.
Thanks,
While JPEG and PNG are often the better form of compression for images, zip files offer decent compression across all sorts of file types.
In Silverlight, you have a few options, the most popular being DotNetZip and #ziplib.
You can install both as NuGet packages.
Such libraries also have the benefit of being able to package multiple files together, something the image compression formats don't offer in any convenient way.
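The round trip the question asks for (compress on upload, get identical bytes back later) looks roughly like this — sketched here with Python's zipfile for brevity, since I can't speak for the exact DotNetZip or #ziplib calls, but both libraries expose analogous APIs:

import io, zipfile

def pack(files):  # files: dict mapping name -> original bytes
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        for name, data in files.items():
            z.writestr(name, data)
    return buf.getvalue()  # store these compressed bytes in the database

def unpack(blob):
    # Zip is lossless, so the extracted bytes match the originals exactly.
    with zipfile.ZipFile(io.BytesIO(blob)) as z:
        return {name: z.read(name) for name in z.namelist()}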
Whenever I try to compress a PDF file to the smallest possible size, using either Ghostscript, pdftk, or pdfopt, I end up with a file about half the size of the original. But lately I am getting files in the 1000 MB range, which compress to, say, a few hundred MB. Can we reduce them further?
The PDFs are made from high-resolution JPEG images; can't we reduce the size of those images to bring the overall size down further?
As far as I know, without degrading the JPEG streams and losing quality, you can try the special feature offered by
Multivalent
https://rg.to/file/c6bd7f31bf8885bcaa69b50ffab7e355/Multivalent20060102.jar.html
java -cp path/to.../multivalent.jar tool.pdf.Compress -compact file.pdf
The resulting output will be compressed in a special way; the resulting file needs the Multivalent browser to be read again.
It is unpredictable how much space you can save (often you cannot save any further space).