how k means algorithm achieve color compression - k-means

am reading about k means algorithm at below link
https://github.com/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.11-K-Means.ipynb
At ln[22] here author mentioned that Input color space is 16 million possible colors. How author came up with 16 million number here. Kindly explain.
Additionally at the end it is mentioned as below
Some detail is certainly lost in the rightmost panel, but the overall
image is still easily recognizable. This image on the right achieves a
compression factor of around 1 million! While this is an interesting
application of k-means, there are certainly better way to compress
information in images. But the example shows the power of thinking
outside of the box with unsupervised methods like k-means.
How compression is done here as we are still using data of shape (273280, 3) after compression. How compression of 1 million is achieved.

Related

Reverse Image search (for image duplicates) on local computer

I have a bunch of poor quality photos that I extracted from a pdf. Somebody I know has the good quality photo's somewhere on her computer(Mac), but it's my understanding that it will be difficult to find them.
I would like to
loop through each poor quality photo
perform a reverse image search using each poor quality photo as the query image and using this persons computer as the database to search for the higher quality images
and create a copy of each high quality image in one destination folder.
Example pseudocode
for each image in poorQualityImages:
search ./macComputer for a higherQualityImage of image
copy higherQualityImage to ./higherQualityImages
I need to perform this action once.
I am looking for a tool, github repo or library which can perform this functionality more so than a deep understanding of content based image retrieval.
There's a post on reddit where someone was trying to do something similar
imgdupes is a program which seems like it almost achieves this, but I do not want to delete the duplicates, I want to copy the highest quality duplicate to a destination folder
Update
Emailed my previous image processing prof and he sent me this
Off the top of my head, nothing out of the box.
No guaranteed solution here, but you can narrow the search space.
You’d need a little program that outputs the MSE or SSIM similarity
index between two images, and then write another program or shell
script that scans the hard drive and computes the MSE between each
image on the hard drive and each query image, then check the images
with the top X percent similarity score.
Something like that. Still not maybe guaranteed to find everything
you want. And if the low quality images are of different pixel
dimensions than the high quality images, you’d have to do some image
scaling to get the similarity index. If the poor quality images have
different aspect ratios, that’s even worse.
So I think it’s not hard but not trivial either. The degree of
difficulty is partly dependent on the nature of the corruption in the
low quality images.
UPDATE
Github project I wrote which achieves what I want
What you are looking for is called image hashing
. In this answer you will find a basic explanation of the concept, as well as a go-to github repo for plug-and-play application.
Basic concept of Hashing
From the repo page: "We have developed a new image hash based on the Marr wavelet that computes a perceptual hash based on edge information with particular emphasis on corners. It has been shown that the human visual system makes special use of certain retinal cells to distinguish corner-like stimuli. It is the belief that this corner information can be used to distinguish digital images that motivates this approach. Basically, the edge information attained from the wavelet is compressed into a fixed length hash of 72 bytes. Binary quantization allows for relatively fast hamming distance computation between hashes. The following scatter plot shows the results on our standard corpus of images. The first plot shows the distances between each image and its attacked counterpart (e.g. the intra distances). The second plot shows the inter distances between altogether different images. While the hash is not designed to handle rotated images, notice how slight rotations still generally fall within a threshold range and thus can usually be matched as identical. However, the real advantage of this hash is for use with our mvp tree indexing structure. Since it is more descriptive than the dct hash (being 72 bytes in length vs. 8 bytes for the dct hash), there are much fewer false matches retrieved for image queries.
"
Another blogpost for an in-depth read, with an application example.
Available Code and Usage
A github repo can be found here. There are obviously more to be found.
After importing the package you can use it to generate and compare hashes:
>>> from PIL import Image
>>> import imagehash
>>> hash = imagehash.average_hash(Image.open('test.png'))
>>> print(hash)
d879f8f89b1bbf
>>> otherhash = imagehash.average_hash(Image.open('other.bmp'))
>>> print(otherhash)
ffff3720200ffff
>>> print(hash == otherhash)
False
>>> print(hash - otherhash)
36
The demo script find_similar_images also on the mentioned github, illustrates how to find similar images in a directory.
Premise
I'll focus my answer on the image processing part, as I believe implementation details e.g. traversing a file system is not the core of your problem. Also, all that follows is just my humble opinion, I am sure that there are better ways to retrieve your image of which I am not aware. Anyway, I agree with what your prof said and I'll follow the same line of thought, so I'll share some ideas on possible similarity indexes you might use.
Answer
MSE and SSIM - This is a possible solution, as suggested by your prof. As I assume the low quality images also have a different resolution than the good ones, remember to downsample the good ones (and not upsample the bad ones).
Image subtraction (1-norm distance) - Subtract two images -> if they are equal you'll get a black image. If they are slightly different, the non-black pixels (or the sum of the pixel intensity) can be used as a similarity index. This is actually the 1-norm distance.
Histogram distance - You can refer to this paper: https://www.cse.huji.ac.il/~werman/Papers/ECCV2010.pdf. Comparing two images' histograms might be potentially robust for your task. Check out this question too: Comparing two histograms
Embedding learning - As I see you included tensorflow, keras or pytorch as tags, let's consider deep learning. This paper came to my
mind: https://arxiv.org/pdf/1503.03832.pdf The idea is to learn a
mapping from the image space to a Euclidian space - i.e. compute an
embedding of the image. In the embedding hyperspace, images are
points. This paper learns an embedding function by minimizing the
triplet loss. The triplet loss is meant to maximize the distance
between images of different classes and minimize the distance between
images of the same class. You could train the same model on a Dataset
like ImageNet. You could augment the dataset with by lowering the
quality of the images, in order to make the model "invariant" to
difference in image quality (e.g. down-sampling followed by
up-sampling, image compression, adding noise, etc.). Once you can
compute embedding, you could compute the Euclidian distance (as a
substitute of the MSE). This might work better than using MSE/SSIM as a similarity indexes. Repo of FaceNet: https://github.com/timesler/facenet-pytorch. Another general purpose approach (not related to faces) which might help you: https://github.com/zegami/image-similarity-clustering.
Siamese networks for predicting similarity score - I am referring to this paper on face verification: http://bmvc2018.org/contents/papers/0410.pdf. The siamese network takes two images as input and outputs a value in the [0, 1]. We can interpret the output as the probability that the two images belong to the same class. You can train a model of this kind to predict 1 for image pairs of the following kind: (good quality image, artificially degraded image). To degrade the image, again, you can combine e.g. down-sampling followed by
up-sampling, image compression, adding noise, etc. Let the model predict 0 for image pairs of different classes (e.g. different images). The output of the network can e used as a similarity index.
Remark 1
These different approaches can also be combined. They all provide you with similarity indexes, so you can very easily average the outcomes.
Remark 2
If you only need to do it once, the effort you need to put in implementing and training deep models might be not justified. I would not suggest it. Still, you can consider it if you can't find any other solution and that Mac is REALLY FULL of images and a manual search is not possible.
If you look at the documentation of imgdupes you will see there is the following option:
--dry-run
dry run (do not delete any files)
So if you run imgdupes with --dry-run you will get a listing of all the duplicate images but it will not actually delete anything. You should be able to process that output to move the images around as you need.
Try similar image finder I have developed to address this problem.
There is an explanation and the algorithm there, so you can implement your own version if needed.

Should the size of the photos be the same for deep learning?

I have lots of image (about 40 GB).
My images are small but they don't have same size.
My images aren't from natural things because I made them from a signal so all pixels are important and I can't crop or delete any pixel.
Is it possible to use deep learning for this kind of images with different shapes?
All pixels are important, please take this into consideration.
I want a model which does not depend on a fixed size input image. Is it possible?
Without knowing what you're trying to learn from the data, it's tough to give a definitive answer:
You could pad all the data at the beginning (or end) of the signal so
they're all the same size. This allows you to keep all the important
pixels, but adds irrelevant information to the image that the network
will most likely ignore.
I've also had good luck with activations where you take a pretrained
network and pull features from the image at a certain part of the
network regardless of size (as long as it's larger than the network
input size). Then run through a classifier.
https://www.mathworks.com/help/deeplearning/ref/activations.html#d117e95083
Or you could window your data, and only process smaller chunks at one
time.
https://www.mathworks.com/help/audio/examples/cocktail-party-source-separation-using-deep-learning-networks.html

Good library for Digital watermarking

Can somebody help me, to find a library, or a detailed description of algorithm, that could embed a Digital watermark(invisible watermark, just a kind of steganography) to a jpeg/png file. But the quality of algorithm, should be great. It should be possible to extract this mark after rotation and expansion(if possible) of image.
Mark is just a key 32bytes.
I found a good site, but the algorithm are made for the NetPBM format, that is dead...
I know that there is a LSB method, but it is not stable to the expansion. Are there something better?
Changing metadata, is not suitable, because it is visible changes.
This maybe won't really be an answer, as I don't think it would be easy to give a magical, precise answer on this question.Watermarking is complex, and the best way to do it is by yourself : this will make things more hard for an attacker trying to reverse engineer your code. One could even read your question here, guess what library you used, and attack your system more easily.
Making Steganography resist to expansion in JPEG images is also very hard, because the JPEG compression is reapplied after the expansion. There are in fact a bunch of JPEG steganography algorithms. Which one you should use, depends on what exactly do you require :
Data confidentiality ?
Message presence confidentiality ?
Message coherence after JPEG changes ?
Resistance to "Known Cover" attacks (when attackers try to find the message, based on the steganographic system) ?
Resistance to "Known Message" attacks (when attackers try to find the steganographic system used, based on the message) ?
From what I know, usually, algorithm that resist to JPEG changes (picture recompression) are often really easier to attack, whereas algorithms that run the "encode" stage during the JPEG compression (after the DCT (lossy) transform, and before the Huffmann (non-lossy) transform) are more prone to resist.
Also, one key factor about steganography is scale : if you have only 32bytes of data to encode in a, say, 256*256px image, don't use an algo that can encode 512bytes of data in the same size. Either use a scalable algorithm, either use an algorithm at its efficient scale.
Also, the best way to do good steganography is to know its limitations,and to know how steganalyzers work. Try these tools, so you can understand what attackers will do to your picture.^
Now, I cannot tell you what steganographic system will be the best for you, but I can give you some indications :
jSteg - Quite old, I don't think it will resist to JPEG changes
OutGuess - Quite old too, but one of the best algorithms
F5 (and F3/F4) - More recent, good algorithm, scientifical research behind.
stegHide
I think all of these are LSB based : the encoding is done during the JPEG compression, after the DCT and Quantization. The only non LSB-based steganography system I heard of was mentionned in this research paper, however, I did not read it to the end yet, so I cannot tell if this will meet your needs.
However, I'm not sure there exists a real steganography algorithm resisting to JPEG compression, to JPEG resize and rotation, resisting to visual and statisticals attacks. Or I'm not aware of it.
Sorry for the lack of precise answer, I tried to give you what I know on the subject, as it's always better to be more informed. Sorry also for the lack of proper English, I'm French, nobody's perfect :)
Pistache is right in what he told you regarding the watermarking implementation algorithms. I will try to help you by showing one algorithm for the given requirements.
Before explaining you the algorithms first I guess that the distinction between the JPG and PNG formats should be done.
JPEG is a lossy format, i.e, the images are susceptible to compression that could remove the watermark. When you open an image for manipulation purposes and you save it, upon the writing procedure, a compression is made by using DCT filtering that removes some important components of the image.
On the other hand, PNG format is lossless, and that means that images are not susceptible to such kind of compression when stored after manipulation.
As a matter of fact, JPEG is used as a watermarking scheme attack due to its compressing characteristic that could remove the watermark if an attacker performed the compression.
Now that you know the difference between both formats, I can tell you a suitable algorithm resistant to the attacks that you mentioned.
Regarding methods to embed a watermark message for PNG files you can use the histogram embedding method. The histogram embedding method changes values on the histogram by changing the values of the neighbor bins. For example imagine that you have a PNG image in grayscale.
Therefore, you'll have only one channel for embedding and that means that you have one histogram with 256 bins. By selecting the neighbor bins x and x+1, you change the values of x and x+1 by moving the pixels with the bright x to x+1 or the other way around, so that (x/(x+1))>T for embedding a '1' or ((x+1)/x)>T for embedding a '0'.
You can repeat the same procedure for the whole histogram length and therefore you can embed in the best case up to 128bits. However this payload is less than what you asked. Therefore I suggest you to split the image into parts, for example blocks, and if you split one image into 4 components you'd be able to embed in the best case up to
512 bits which means 64 bytes.
This method although is very, susceptible to filtering and compression if applied straight in the space domain. Therefore, I suggest you to compute before the DWT of the image and use its low-frequency sub-band. This will provide you better transparency and robustness increased for the warping, resizing etc attacks and compression or filtering as well.
There are other approaches such as LPM (Log Polar Maps) but they are very complex to implement and I think for your case this approach would be fine.
I can suggest you two papers, the first is:
Watermarking digital image and video data. A state-of-the-art overview
This paper will give you some basic notions of watermarking and explain more in detail the LSB algorithm. And the second paper is:
Real-Time Compressed- Domain Video Watermarking Resistance to Geometric Distortions
This paper will explain the algorithm that I just explained now.
Cheers,
I do not know if you are considering approaches different to steganography. Instead of storing data hidden in the pixel data you could create a new data block in the JPEG file and store encripted data.
Take a look at the JPEG file structure on Wikipedia
You can create an application specific data block, using the marker 0xFF 0xEn. Doing so, any change in the image pixels do not change the information stored in the image. Moreover, many image editing software respect custom data blocks and will keep them even after image manipulation.

BIG header: one jpg or several png?

I've read some of posts here about png/jpg/gif but still I'm quite confused..
I've got a big header on my website :
width:850px height:380px weight:108kb
And it's jpg. A woman + gradient + some layers on top and behing her..
Do you think 108kb it's too much? I was thinking about cut it to png pieces..Would that be a bad idea? What's your suggestions?;) THX for help;]
It depends on the nature of the image, if it's a photograph, JPEG would give the highest quality/compression ratio, if it was pixel stuff like writing or clipart, or have some transparency, then it's GIF/PNG to choose (GIF,PNG8 offers one level transparency, while PNG24 offers a levelled transparency for each pixel).
When I'm confused, I usually save the picture in all three formats and decide what gives the best quality/size deal within the size limits I need, I also try to lower down JPEG quality to the level where the image is still good quality (because that varies from image to another).
Also, if it was a photograph with some writing, split it into a JPEG photograph with a transparent GIF writing overlay (because writing edges look distorted in JPEG).
So, when you are confused, give the three a try and decide, with time you'll gain experience what format suits what content the most.
Actually, 108kb for an image of that size isn't abnormal. It's even more optimal to have it in 1 image: your browser only needs to perform 1 get-request. If you break it up into multiple images, the user needs to wait longer for the site to load.
PNG probably wouldn't help you since JPG is usually much more efficient at handling gradients. PNG would be better if you had big unicolored spaces in your image.

Streaming Jpeg Resizer

Does anyone know of any code that does streaming Jpeg resizing. What I mean by this is reading a chunk of an image (depending on the original source and destination size this would obviously vary), and resizing it, allowing for lower memory consumption when resizing very large jpegs. Obviously this wouldn't work for progressive jpegs (or at least it would become much more complicated), but it should be possible for standard jpegs.
The design of JPEG data allows simple resizing to 1/2, 1/4 or 1/8 size. Other variations are possible. These same size reductions are easy to do on progressive jpegs as well and the quantity of data to parse in a progressive file will be much less if you want a reduced size image. Beyond that, your question is not specific enough to know what you really want to do.
Another simple trick to reduce the data size by 33% is to render the image into a RGB565 bitmap instead of RGB24 (if you don't need the full color space).
I don't know of a library that can do this off the shelf, but it's certainly possible.
Lets say your JPEG is using 8x8 pixel MCUs (the units in which pixels are grouped). Lets also say you are reducing by a factor to 12 to 1. The first output pixel needs to be the average of the 12x12 block of pixels at the top left of the input image. To get to the input pixels with a y coordinate greater than 8, you need to have decoded the start of the second row of MCUs. You can't really get to decode those pixels before decoding the whole of the first row of MCUs. In practice, that probably means you'll need to store two rows of decoded MCUs. Still, for a 12000x12000 pixel image (roughly 150 mega pixels) you'd reduce the memory requirements by a factor of 12000/16 = 750. That should be enough for a PC. If you're looking at embedded use, you could horizontally resize the rows of MCUs as you read them, reducing the memory requirements by another factor of 12, at the cost of a little more code complexity.
I'd find a simple jpeg decoder library like Tiny Jpeg Decoder and look at the main loop in the jpeg decode function. In the case of Tiny Jpeg Decoder, the main loop calls decode_MCU, Modify from there. :-)
You've got a bunch of fiddly work to do to make the code work for non 8x8 MCUs and a load more if you want to reduce by a none integer factor. Sounds like fun though. Good luck.