I am attempting to extract hidden data that was embedded with DWT steganography. Then, when I apply compression, nothing happens!
I have used the following code to compress my .bmp image, but no hidden message is extracted after compression is applied. I tried running it in the debugger and it just seems to jump to the end of the code after looping around only once. Any ideas what the problem is? The data extracts fine before compression is applied.
%%%%%%%%%%%%%%%%%%DECODING%%%%%%%%%%%%%%%%%%%%%%%
%clear;
filename = 'newStego.bmp';
stego_image = imread(filename);

compression = 90;
file_compressed = sprintf('compression_%d_percent.jpg', compression);
imwrite(imread(filename), file_compressed, 'Quality', compression);

new_Stego = double(imread(file_compressed));
[LL, LH, HL, HH] = dwt2(new_Stego, 'haar');

message = '';
msgbits = '';
for ii = 1:size(HH,1)*size(HH,2)
    if HH(ii) > 0
        msgbits = strcat(msgbits, '1');
    elseif HH(ii) < 0
        msgbits = strcat(msgbits, '0');
    else
        return;
    end
    if mod(ii,8) == 0
        msgChar = bin2dec(msgbits);
        if msgChar == 0
            break;
        end
        msgChar = char(msgChar);
        message = [message msgChar];
        msgbits = '';
        disp(message);
    end
end
Your compression scheme is lossy, which means that you irreversibly lose some information in compressing your data.
Specifically, JPEG compression transforms the pixel data to the frequency domain and zeroes out many high-frequency components. The DWT detail coefficients (LH, HL and HH) have some parallels to frequency coefficients and so will be strongly affected by this compression (the HH coefficients even more so). Keep in mind that even 100%-quality JPEG compression is lossy, although the distortions are naturally minimised.
If you still want to compress your data, you must do it in a way that doesn't destroy the way you have embedded your information. You have two options:
Use a lossless compression scheme, e.g. png or zip.
Use a different steganography algorithm which is robust to jpeg compression.
Extra: The reason your decoding process only loops around once is that one of the first few HH coefficients is 0, resulting in premature termination. Either that, or the first 8 coefficients are negative, which results in an extracted character of 0, which is your end-of-message condition.
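If you want to see the effect directly, here is a minimal sketch (written in Python with PyWavelets and Pillow as an illustration, not a port of your MATLAB code) that re-saves the stego image both losslessly and lossily and counts how many HH coefficient signs flip; the output file names are placeholders and the image is treated as grayscale:

import numpy as np
import pywt
from PIL import Image

# Load the stego image and take the reference HH (diagonal detail) coefficients.
stego = np.array(Image.open("newStego.bmp").convert("L"), dtype=float)
_, (_, _, HH_ref) = pywt.dwt2(stego, "haar")

# Re-save it two ways: lossy JPEG (quality 90) and lossless PNG.
Image.fromarray(stego.astype(np.uint8)).save("stego_q90.jpg", quality=90)
Image.fromarray(stego.astype(np.uint8)).save("stego.png")

for name in ("stego_q90.jpg", "stego.png"):
    img = np.array(Image.open(name).convert("L"), dtype=float)
    _, (_, _, HH) = pywt.dwt2(img, "haar")
    flipped = np.mean(np.sign(HH) != np.sign(HH_ref))
    print(name, "fraction of HH signs changed:", flipped)
# The PNG should report 0.0; the JPEG will flip many signs, destroying the embedded bits.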
tl;dr: I've got two audio recordings of the same song without timestamps, and I'd like to align them. I believe FFT is the way to go, but while I've come a long way, it feels like I'm right on the edge of understanding enough to make it work, and I would greatly benefit from "you got this part wrong" advice on the FFT. (My education never got into this area.) So I came here seeking ELI5 help.
The journey:
1. Get two recordings at the same sample rate. (done!)
2. Transform them into a waveform (DoubleArray). This doesn't keep any of the meta info like "samples/second", but the FFT math doesn't care until later.
3. Run an FFT on them using a simplified implementation for beginners.
4. Get an Array<Frame>, where each Frame contains an Array<Bin> and each Bin has (amplitude, frequency), because the older implementation hid all the details (like frame width, number of Bins, and ... stuff?) and outputs words I'm familiar with like "amplitude" and "frequency".
5. Try moving to a more robust FFT (Apache Commons).
6. Get an output of 'real' and 'imaginary'. (uh oh)
7. Make the totally incorrect assumption that those were the same thing (amplitude and frequency). Surprise, they aren't!
8. Apache's FFT returns an Array<Complex>, which means it... er... is just one frame's worth? And I should be chopping the song into 1-second chunks and passing each one into the FFT, calling it multiple times? That seems strange; how does it get lower frequencies?
9. To the best of my understanding, the complex number is a way to convey the phase shift and amplitude in one neat container (and you need the phase shift if you want to run the FFT in reverse). The frequency is calculated from the index of the array.
Which works out to (pseudocode in Kotlin)
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder
import javax.sound.sampled.AudioFormat
import javax.sound.sampled.AudioInputStream
import kotlin.math.atan2
import kotlin.math.hypot
import org.apache.commons.math3.complex.Complex
import org.apache.commons.math3.transform.DftNormalization
import org.apache.commons.math3.transform.FastFourierTransformer
import org.apache.commons.math3.transform.TransformType

val audioFile = File("Dream_On.pcm")
val (phases, frequencies, amplitudes) = AudioInputStream(
    audioFile.inputStream(),
    AudioFormat(
        /* encoding = */ AudioFormat.Encoding.PCM_SIGNED,
        /* sampleRate = */ 44100f,
        /* sampleSizeInBits = */ 16,
        /* channels = */ 2,
        /* frameSize = */ 4,
        /* frameRate = */ 44100f,
        /* bigEndian = */ false
    ),
    (audioFile.length() / /* frameSize */ 4)
).use { ais ->
    // Read the raw 16-bit little-endian samples and widen them to doubles.
    val bytes = ais.readAllBytes()
    val shorts = ShortArray(bytes.size / 2)
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts)
    val allWaveform = DoubleArray(shorts.size)
    for (i in shorts.indices) {
        allWaveform[i] = shorts[i].toDouble()
    }
    // Take a power-of-2 window from the middle of the song (findNextPowerOf2 is my own helper).
    val halfwayThroughSong = allWaveform.size / 2
    val moreThanOneSecond = allWaveform.copyOfRange(halfwayThroughSong, halfwayThroughSong + findNextPowerOf2(44100))
    val fft = FastFourierTransformer(DftNormalization.STANDARD)
    val fftResult: Array<Complex> = fft.transform(moreThanOneSecond, TransformType.FORWARD)
    println("fftResult size: ${fftResult.size}")
    // Keep only the first half of the bins (for real input the second half mirrors it).
    val phases = DoubleArray(fftResult.size / 2)
    val amplitudes = DoubleArray(fftResult.size / 2)
    val frequencies = DoubleArray(fftResult.size / 2)
    fftResult.filterIndexed { index, _ -> index < fftResult.size / 2 }.forEachIndexed { idx, complex ->
        phases[idx] = atan2(complex.imaginary, complex.real)
        frequencies[idx] = idx * 44100.0 / fftResult.size
        amplitudes[idx] = hypot(complex.real, complex.imaginary)
    }
    Triple(phases, frequencies, amplitudes)
}
Is my step #8 at all close to the truth? Why would the FFT result return an array as big as my input number of samples? That makes me think I've got the "window" or "frame" part wrong.
I read up on
FFT real/imaginary/abs parts interpretation
Converting Real and Imaginary FFT output to Frequency and Amplitude
Java - Finding frequency and amplitude of audio signal using FFT
An audio recording in waveform is a series of sound energy levels, basically how much sound energy there should be at any one instant. Based on the sample rate, you can think of the whole recording as a graph of energy versus time.
Sound is made of waves, which have frequencies and amplitudes. Unless your recording is of a pure sine wave, it will have many different waves of sound coming and going, which summed together create the total sound that you experience over time. At any one instant of time, you have energy from many different waves added together. Some of those waves may be at their peaks, and some at their valleys, or anywhere in between.
An FFT is a way to convert energy-vs.-time data into amplitude-vs.-frequency data. The input to an FFT is a block of waveform. You can't just give it a single energy level from a one-dimensional point in time, because then there is no way to determine all the waves that add together to make up the amplitude at that point of time. So, you give it a series of amplitudes over some finite period of time.
The FFT then does its math and returns a range of complex numbers that represent the waves of sound over that chunk of time, which, when added together, would recreate the series of energy levels over that block of time. That's why the return value is an array: it represents a bunch of frequency ranges. Together, the total data of the array represents the same energy as the input array.
You can calculate from the complex numbers both phase shift and amplitude for each frequency range represented in the return array.
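For concreteness, here is a minimal numpy sketch of that conversion, assuming a mono test tone at 44.1 kHz rather than your actual recordings:

import numpy as np

fs = 44100
samples = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s of a 440 Hz tone

spectrum = np.fft.rfft(samples)                  # one-sided FFT: N/2 + 1 complex bins
freqs = np.fft.rfftfreq(len(samples), d=1 / fs)  # frequency of each bin in Hz (k * fs / N)
amplitudes = np.abs(spectrum)                    # magnitude = hypot(re, im)
phases = np.angle(spectrum)                      # phase = atan2(im, re)

print(freqs[np.argmax(amplitudes)])              # ~440.0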
Ultimately, I don't see why performing an FFT would get you any closer to syncing your recordings. Admittedly it's not a task I've tried before, but I would think waveform data is already the perfect form for comparing the data and finding matching patterns. If you break your songs up into chunks to perform FFTs on, you can try to find matching FFTs, but they will only match perfectly if your chunks are divided exactly along the same points relative to the beginning of the original recording. And even if you could guarantee that and found matching FFTs, you will only have as much precision as the size of your chunks.
But when I think of apps like Shazam, I realize they must be doing some sort of manipulation of the audio that breaks it down into something simpler for rapid comparison. That possibly involves some FFT manipulation and filtering.
Maybe you could compare FFTs using some algorithm to just find ones that are pretty similar to narrow down to a time range and then compare wave form data in that range to find the exact point of synchronization.
I would imagine the approach that would work well is to find the offset with the maximum cross-correlation between the two recordings: calculate the cross-correlation between the two pieces at various offsets, and expect the maximum to occur at the offset where the two pieces are best aligned.
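As a rough illustration (not tested against your recordings), here is a minimal sketch with scipy.signal; the arrays a and b are stand-ins for your two mono waveforms at the same sample rate:

import numpy as np
from scipy.signal import correlate, correlation_lags

fs = 44100
rng = np.random.default_rng(0)
a = rng.standard_normal(5 * fs)          # stand-in "recording"
b = a[int(1.5 * fs):]                    # same audio, but starting 1.5 s into a

corr = correlate(a, b, mode="full")                  # cross-correlation at every lag
lags = correlation_lags(len(a), len(b), mode="full") # lag (in samples) for each corr value
offset = lags[np.argmax(corr)] / fs                  # b's audio starts ~1.5 s into a
print(offset)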
I can't find the source of the difference in the handling of contrast between versions 1.75.01 and 1.82. Here are two images that show what it used to look like (1.75),
and what it now looks like:
Unfortunately, rolling back is not trivial, as I run into problems with dependencies (especially PIL vs. Pillow). The images are created from a numpy array, and I suspect there is something related to how the numbers are handled (type? rounding?) when the conversion from array to image occurs, but I can't find the bug. Any help will be deeply appreciated.
Edited - New Minimal Example
#!/usr/bin/env python
import numpy as np
from psychopy import visual, core

def makeRow(n, c):
    cp = np.tile(c, [n, n, 3])
    cm = np.tile(-c, [n, n, 3])
    cpm = np.hstack((cp, cm))
    return cpm

def makeCB(r1, r2, nr=99):
    # nr is the repeat number
    (x, y, z) = r1.shape
    if nr == 99:
        nr = x / 2
    else:
        hnr = nr / 2
    rr = np.vstack((r1, r2))
    cb = np.tile(rr, [hnr, hnr / 2, 1])
    return cb

def makeTarg(sqsz, targsz, con):
    wr = makeRow(sqsz, 1)
    br = makeRow(sqsz, -1)
    cb = makeCB(wr, br, targsz)
    t = cb * con
    return t

def main():
    w = visual.Window(size=(400, 400), units="pix", winType='pyglet', colorSpace='rgb')
    fullCon_np = makeTarg(8, 8, 1.0)
    fullCon_i = visual.ImageStim(w, image=fullCon_np, size=fullCon_np.shape[0:2][::-1], pos=(-100, 0), colorSpace='rgb')
    fullCon_ih = visual.ImageStim(w, image=fullCon_np, size=fullCon_np.shape[0:2][::-1], pos=(-100, 0), colorSpace='rgb')
    fullCon_iz = visual.ImageStim(w, image=fullCon_np, size=fullCon_np.shape[0:2][::-1], pos=(-100, 0), colorSpace='rgb')
    fullCon_ih.contrast = 0.5
    fullCon_ih.setPos((-100, 100))
    fullCon_iz.setPos((-100, -100))
    fullCon_iz.contrast = 0.1
    partCon_np = makeTarg(8, 8, 0.1)
    partCon_i = visual.ImageStim(w, image=partCon_np, pos=(0, 0), size=partCon_np.shape[0:2][::-1], colorSpace='rgb')
    zeroCon_np = makeTarg(8, 8, 0.0)
    zeroCon_i = visual.ImageStim(w, image=zeroCon_np, pos=(100, 0), size=zeroCon_np.shape[0:2][::-1], colorSpace='rgb')
    fullCon_i.draw()
    partCon_i.draw()
    fullCon_ih.draw()
    fullCon_iz.draw()
    zeroCon_i.draw()
    w.flip()
    core.wait(15)
    core.quit()

if __name__ == "__main__":
    main()
Which yields this:
The three checkerboards along the horizontal have the contrast changed in the array when it is generated, before conversion to the image. The vertical column on the left shows that changing the image contrast afterwards works fine. The reason I can't use this is that (a) I have collected a lot of data with the last version, and (b) I want to grade the contrast of those big long bars in the centre programmatically by multiplying one array against another, e.g. using a log scale or some other function, and doing the math is easier in numpy.
I still suspect the issue is in the conversion from np.array -> pil.image. The dtype of these arrays is float64, but even if I coerce to float32 nothing changes. If you examine the array before conversion at half contrast, it is filled with 0.5 and -0.5 values, but all the negative numbers are being turned to black, and black is being set to zero at the time of conversion, by psychopy.tools.imagetools.array2image I think.
OK, yes, the problem was to do with the scaling of the array values. Basically, you've found a corner case that PsychoPy isn't handling correctly (i.e. a bug).
Explanation:
PsychoPy has a complex set of transformation rules for handling images/textures; it tries to deduce what you're going to do with this image and whether it should be stored in a way that supports colour manipulations (signed float) or not (it can then be an unsigned byte). In your case PsychoPy was getting this wrong: the fact that the array was filled with floats made PsychoPy think it could do colour transforms, but the fact that it was NxNx3 suggests it shouldn't (we don't want to specify a "color" for something that already has its colour specified for every pixel as RGB values).
Workarounds (any one of these):
Just provide your array as NxN, not NxNx3. This is the right thing to do anyway; it means less for you to compute/store, and by providing "intensity" values these can then be recolored on the fly. This is roughly what you had discovered already in providing just one slice of your NxNx3 array, but the point is that you could/should only create one slice in the first place (see the sketch after this list).
Use GratingStim, which converts everything to signed floating point values rather than trying to work out what's best (potentially then you'd need to work out the spatial frequency stuff though)
You could add a line to fix it by rescaling your array (*0.5+0.5), but you'd have to gate it so that this only happens for this version (we'll fix it before the next release)
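A minimal sketch of option (1), assuming the same 8-pixel checks as your example; the helper function below is hypothetical and not part of PsychoPy:

import numpy as np
from psychopy import visual, core

def make_checkerboard(sq, reps, contrast):
    # Single-channel (NxM) intensity array with values in -1..1, no colour dimension.
    pair = np.hstack((np.ones((sq, sq)), -np.ones((sq, sq))))
    board = np.tile(np.vstack((pair, -pair)), (reps // 2, reps // 2))
    return board * contrast

win = visual.Window(size=(400, 400), units="pix", colorSpace="rgb")
arr = make_checkerboard(8, 8, 0.5)                       # half contrast, set in numpy
stim = visual.ImageStim(win, image=arr, size=arr.shape[::-1])
stim.draw()
win.flip()
core.wait(2)
core.quit()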
Basically, I'm suggesting you do (1) because that already works for past, present and future versions and is more efficient anyway. But thanks for letting us know - I'll try to make sure we catch this one in future
best wishes
Jon
The code is too long for me to read through and work out the actual problem.
What might be the problem is the question of where zero should be. I think for a while numpy arrays were treated as having values 0:1, whereas the rest of PsychoPy expects values to be -1:1, so it might be that you need to rescale your values with array = array*2 - 1 to get back to the old (bad) behaviour. Or check opacity too, which might have a similar issue. If you write a minimal example, I'll read/test it properly.
Thanks
I'm currently using Python 3.3 in combination with PyAudio and NumPy. I took the example from the PyAudio website to play a simple wave file and send that data to the default sound card.
Now I would like to change the volume of the audio, but when I multiply the array by 0.5, I get a lot of noise and distortion.
Here is a code sample:
while len(data) > 0:   # data is bytes, so compare its length rather than against ''
    decodeddata = numpy.frombuffer(data, numpy.int16)   # fromstring is deprecated
    newdata = (decodeddata * 0.5).astype(numpy.int16)
    stream.write(newdata.tobytes())
    data = wf.readframes(CHUNK)
How should I handle multiplication or division on this array without ruining the waveform?
Thanks,
It turned out that the source file's bit depth (24-bit) was not compatible with PortAudio. After exporting to a 16-bit PCM file, the multiplication no longer resulted in distortion.
To handle files of different formats, it is necessary to check the bit depth and rescale accordingly.
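For example, a rough sketch of that check using the standard wave module and numpy (the file name is a placeholder, and only 16-bit and 24-bit little-endian PCM are handled):

import wave
import numpy as np

wf = wave.open("input.wav", "rb")
width = wf.getsampwidth()                      # bytes per sample: 2 (16-bit) or 3 (24-bit)
frames = wf.readframes(wf.getnframes())

if width == 2:                                 # 16-bit PCM maps straight onto int16
    samples = np.frombuffer(frames, dtype=np.int16).astype(np.float64)
elif width == 3:                               # 24-bit PCM: unpack 3-byte samples by hand
    raw = np.frombuffer(frames, dtype=np.uint8).reshape(-1, 3).astype(np.int32)
    samples = raw[:, 0] | (raw[:, 1] << 8) | (raw[:, 2] << 16)
    samples = np.where(samples >= 2 ** 23, samples - 2 ** 24, samples)
    samples = samples.astype(np.float64) / 256.0   # rescale to the 16-bit range
else:
    raise ValueError("unhandled sample width: %d" % width)

quieter = np.clip(samples * 0.5, -32768, 32767).astype(np.int16)
# quieter.tobytes() can then be written to a 16-bit output stream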
I'm trying to make a power spectrum from an experimental dataset which I am reading in, and then fit it to a theoretical curve. Everything is working fine and I'm not getting errors, except that my curve keeps differing from the data by a factor of 1000, and I have absolutely no idea what the problem could be. I've asked a few people, but to no avail. (I hope that you guys will be able to help.)
Anyway, I'm pretty sure that it's not the units, as they were triple-checked by me and two others. Basically, I need to fit a power spectrum to an equation using the least-squares method.
I can't post the whole code, as it's rather long and a bit messy, but this is the Fourier part. (I added comments for all arrays and variables which are not declared in the code.)
#Calculate stuff
Nm = 10**-6 #micro to meter
KbT = 4.10E-21 #Joule
T = 297. #K
l = zvalue*Nm #meter
meany = np.mean(cleandatay*Nm) #meter (cleandatay is the array that I read in from a csv at the start.)
SDy = sum((cleandatay*Nm - meany)**2)/len(cleandatay) #meter^2
FmArray[0][i] = ((KbT*l)/SDy) #N
#print FmArray[0][i]
print float((i*100/len(filelist)))#how many % done?
#fourier
dt = cleant[1]-cleant[0] #timestep
N = len(cleandatay) #Same for cleant, its the corresponding time to cleandatay
Here is where the Fourier part starts: I take the FFT and turn it into a power spectrum. Then I calculate the corresponding frequency steps with the array freqs.
fouriery = np.fft.fft((cleandatay*(10**-6)))
fourierpower = (np.abs(fouriery))**2
fourierpower = fourierpower[1:N/2] #remove 0th datapoint and /2 (remove negative freqs)
fourierpower = fourierpower*dt #*dt to account for steps
freqs = (1.+np.arange((N/2)-1.))/50.
#Least squares method
eta = 8.9E-4 #pa*s
Rbead = 0.5E-6#meter
constant = 2*KbT/(3*eta*pi*Rbead)
omega = 2*pi*freqs #rad/s
Wcarray = 2.*pi*np.arange(0,30, 0.02003) #0.02 = 30/len(freqs)
ChiSq = np.zeros(len(Wcarray))
for k in range(0, len(Wcarray)):
    Py = (constant / (Wcarray[k]**2 + omega**2))
    ChiSq[k] = sum((fourierpower - Py)**2)
    pylab.loglog(omega, Py)
    print k*100/len(Wcarray)
index = np.where(ChiSq == min(ChiSq))
cutoffw = Wcarray[index]
Pygoed = (constant / (Wcarray[index]**2 + omega**2))
print cutoffw
print constant
print min(ChiSq)
pylab.loglog(omega,ChiSq)
So I have no idea what could be going wrong; I think it's the FFT, as nothing else can really go wrong.
Below is the picture I get when I plot all the fit lines against the spectrum; as you can see, it is off by about 1000 (actually exactly 1000, as that leaves a least-squares residue of 10^-22, but I can't just multiply by an arbitrary factor without knowing why).
Just to elaborate on the picture: the green dots are the FFT spectrum, the lines are the fits, the red dot is where it thinks the cutoff frequency is, and the blue line is the chi-squared fit, looking for the lowest value.
Take a look at the documentation for the FFT that you are using. Many FFT implementations introduce a scaling factor, so the raw output is N times the result you expect (where N is the number of samples). Multiplying by 1/N scales the results back into line. (You said that the result is about 1000 too high... could it be that you are using a 1024-point FFT?)
Your library's FFT routine might include a scale factor of 1/sqrt(N).
Check the documentation for the FFT you used, as the proportion of the scale factor allocated between the FFT and the inverse FFT is arbitrary.
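A quick way to see the effect of the normalization convention, sketched with numpy (the 1024-point size and 50 Hz sampling are taken from the question's setup; the test tone is made up):

import numpy as np

# np.fft.fft is unnormalized (no 1/N factor), so |FFT|^2 carries an extra factor of N
# compared with a 1/N-normalized power spectrum.
N = 1024
dt = 1.0 / 50.0                        # 50 Hz sampling, as in the question's freqs array
t = np.arange(N) * dt
x = np.sin(2 * np.pi * 5 * t)          # 5 Hz test tone

X = np.fft.fft(x)
raw_power = np.abs(X[1:N // 2]) ** 2 * dt        # what the question computes
normed = np.abs(X[1:N // 2]) ** 2 * dt / N       # divide by N to normalize

print(raw_power.max() / normed.max())            # exactly N = 1024, i.e. a factor of ~1000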
I want to know how much data can be embedded into images of different sizes.
For example, in a 30 kB image file, how much data can be stored without distorting the image?
It depends on the image type and the algorithm. If I take as an example a 24-bit bitmap image used to store ASCII characters, embedding one bit per pixel:
Number of ASCII characters that can be stored = number of pixels / 8 (one ASCII character = 8 bits).
It depends on two points:
How many bits per pixel are in your image.
How many bits you will embed in each pixel.
OK, let's suppose that your colour model is RGB and each pixel = 8*3 bits (one byte for each colour), and you want to embed 3 bits in each pixel.
Data that can be embedded into the image = (number of pixels * 3) bits.
If you use the LSB of each byte to hide your information, a 30 kB image gives roughly 30,000 bits of available space, i.e. 3,750 bytes.
Since the LSB contributes either 0 or 1 to a byte that takes values from 0-255, the worst-case scenario (modifying every LSB) gives a distortion of 1/256, which equals about 0.4%.
In the statistical average scenario you would get about 0.2% distortion.
So it depends on which bit of the byte you are going to change.
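To make the arithmetic concrete, here is a small Python sketch (using numpy and Pillow; the cover file name is a placeholder) that computes the 1-bit-per-channel LSB capacity and embeds a short message:

import numpy as np
from PIL import Image

# Load a hypothetical 24-bit RGB cover image.
img = np.array(Image.open("cover.bmp").convert("RGB"))
h, w, channels = img.shape

bits_available = h * w * channels              # 1 LSB per colour channel
print("capacity:", bits_available // 8, "bytes")

message = b"hello"
bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
flat = img.reshape(-1).copy()
flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits   # overwrite the LSBs with message bits
stego = flat.reshape(img.shape)
# Each modified byte changes by at most 1 out of 256, i.e. ~0.4%, as in the answer above.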