I have a bit of code attached below that displays a stimulus for a certain number of frames.
from psychopy import visual, logging, event, core

# Create a window to draw in
myWin = visual.Window((600, 600), allowGUI=False, blendMode='add', useFBO=True)
logging.console.setLevel(logging.DEBUG)

# INITIALISE SOME STIMULI
grating1 = visual.GratingStim(myWin, mask="gauss",
                              color=[1.0, 1.0, 1.0], contrast=0.5,
                              size=(1.0, 1.0), sf=(4, 0), ori=45,
                              autoLog=False)  # this stim changes too much for autologging to be useful
grating2 = visual.GratingStim(myWin, mask="gauss",
                              color=[1.0, 1.0, 1.0], opacity=0.5,
                              size=(1.0, 1.0), sf=(4, 0), ori=-45,
                              autoLog=False)  # this stim changes too much for autologging to be useful

for frameN in range(300):
    grating1.draw()
    grating2.draw()
    myWin.flip()  # update the screen
At a 60 Hz refresh rate, 300 frames should be approximately 5 seconds. When I test it out, it is definitely longer than that.
In my experiment, I need durations as short as 2 frames, and it seems that my code isn't going to display that accurately.
Is there a better way to display a stimulus for an exact number of frames? For example, should the grating1.draw() and grating2.draw() calls go before the for-loop?
I appreciate any help - thanks!
The timing discrepancy might be due to the frame rate not being exactly 60Hz. Try using myWin.getActualFrameRate() (PsychoPy documentation) to get the actual frame rate. Multiplying the real frame rate by 5.0 seconds should theoretically allow you to draw for exactly 5.0 seconds.
frame_rate = myWin.getActualFrameRate(nIdentical=60, nMaxFrames=100,
                                      nWarmUpFrames=10, threshold=1)

# The number of frames you want to display is equal to
# the product of frame rate and the number of seconds to display
# the stimulus:
# n_frames = frame_rate * n_seconds
# range() needs an integer, so round the result.
n_frames = int(round(frame_rate * 5.0))

for frameN in range(n_frames):
    grating1.draw()
    grating2.draw()
    myWin.flip()
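One caveat worth adding (my note, not part of the original answer): as I read the PsychoPy docs, getActualFrameRate() returns None when it cannot obtain a stable measurement, so a small fallback avoids a crash in that case:

# Hedged addition: fall back to an assumed nominal 60 Hz refresh if the
# measurement fails, before computing n_frames as above.
if frame_rate is None:
    frame_rate = 60.0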
tl;dr: I've got two audio recordings of the same song without timestamps, and I'd like to align them. I believe FFT is the way to go, but while I've come a long way, it feels like I'm right on the edge of understanding enough to make it work, and I would greatly benefit from some "you got this part wrong" advice on FFT. (My education never got into this area.) So I came here seeking ELI5 help.
The journey:
1. Get two recordings at the same sample rate. (done!)
2. Transform them into a waveform (DoubleArray). This doesn't keep any of the meta info like "samples/second", but the FFT math doesn't care until later.
3. Run an FFT on them using a simplified implementation for beginners.
4. Get an Array<Frame>, where each Frame contains an Array<Bin> and each Bin has (amplitude, frequency), because the older implementation hid all the details (like frame width, and number of Bins, and ... stuff?) and outputs words I'm familiar with like "amplitude" and "frequency".
5. Try moving to a more robust FFT (Apache Commons).
6. Get an output of 'real' and 'imaginary'. (uh oh)
7. Make the totally incorrect assumption that those were the same thing (amplitude and frequency). Surprise, they aren't!
8. Apache's FFT returns an Array<Complex>, which means it... er... is just one frame's worth? And I should be chopping the song into 1-second chunks and passing each one into the FFT, calling it multiple times? That seems strange; how does it get lower frequencies?
To the best of my understanding, the complex number is a way to convey the phase shift and amplitude in one neat container (and you need phase shift if you want to do the FFT in reverse). And the frequency is calculated from the index of the array.
Which works out to (pseudocode in Kotlin)
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder
import javax.sound.sampled.AudioFormat
import javax.sound.sampled.AudioInputStream
import kotlin.math.atan2
import kotlin.math.hypot
import org.apache.commons.math3.complex.Complex
import org.apache.commons.math3.transform.DftNormalization
import org.apache.commons.math3.transform.FastFourierTransformer
import org.apache.commons.math3.transform.TransformType

val audioFile = File("Dream_On.pcm")
val (phases, frequencies, amplitudes) = AudioInputStream(
    audioFile.inputStream(),
    AudioFormat(
        /* encoding = */ AudioFormat.Encoding.PCM_SIGNED,
        /* sampleRate = */ 44100f,
        /* sampleSizeInBits = */ 16,
        /* channels = */ 2,
        /* frameSize = */ 4,
        /* frameRate = */ 44100f,
        /* bigEndian = */ false
    ),
    (audioFile.length() / /* frameSize */ 4)
).use { ais ->
    // Read the raw PCM bytes and convert the 16-bit little-endian samples to doubles
    val bytes = ais.readAllBytes()
    val shorts = ShortArray(bytes.size / 2)
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(shorts)
    val allWaveform = DoubleArray(shorts.size)
    for (i in shorts.indices) {
        allWaveform[i] = shorts[i].toDouble()
    }

    // Take a power-of-two window from the middle of the song
    // (findNextPowerOf2 is a helper defined elsewhere)
    val halfwayThroughSong = allWaveform.size / 2
    val moreThanOneSecond = allWaveform.copyOfRange(halfwayThroughSong, halfwayThroughSong + findNextPowerOf2(44100))

    val fft = FastFourierTransformer(DftNormalization.STANDARD)
    val fftResult: Array<Complex> = fft.transform(moreThanOneSecond, TransformType.FORWARD)
    println("fftResult size: ${fftResult.size}")

    // Only the first half of the bins is unique for a real-valued input
    val phases = DoubleArray(fftResult.size / 2)
    val amplitudes = DoubleArray(fftResult.size / 2)
    val frequencies = DoubleArray(fftResult.size / 2)
    fftResult.filterIndexed { index, _ -> index < fftResult.size / 2 }.forEachIndexed { idx, complex ->
        phases[idx] = atan2(complex.imaginary, complex.real)
        frequencies[idx] = idx * 44100.0 / fftResult.size
        amplitudes[idx] = hypot(complex.real, complex.imaginary)
    }
    Triple(phases, frequencies, amplitudes)
}
Is my step #8 at all close to the truth? Why would the FFT result return an array as big as my input number of samples? That makes me think I've got the "window" or "frame" part wrong.
I read up on
FFT real/imaginary/abs parts interpretation
Converting Real and Imaginary FFT output to Frequency and Amplitude
Java - Finding frequency and amplitude of audio signal using FFT
An audio recording in waveform is a series of sound energy levels, basically how much sound energy there should be at any one instant. Based on the sample rate, you can think of the whole recording as a graph of energy versus time.
Sound is made of waves, which have frequencies and amplitudes. Unless your recording is of a pure sine wave, it will have many different waves of sound coming and going, which summed together create the total sound that you experience over time. At any one instant of time, you have energy from many different waves added together. Some of those waves may be at their peaks, and some at their valleys, or anywhere in between.
An FFT is a way to convert energy-vs.-time data into amplitude-vs.-frequency data. The input to an FFT is a block of waveform. You can't just give it a single energy level from a one-dimensional point in time, because then there is no way to determine all the waves that add together to make up the amplitude at that point of time. So, you give it a series of amplitudes over some finite period of time.
The FFT then does its math and returns a range of complex numbers that represent the waves of sound over that chunk of time, which, when added together, would recreate the series of energy levels over that block of time. That's why the return value is an array: it represents a bunch of frequency ranges. Together, the data in the array represents the same energy as the input array.
You can calculate from the complex numbers both phase shift and amplitude for each frequency range represented in the return array.
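To make that concrete, here is a minimal sketch in Python/numpy (not the Kotlin/Apache Commons API used above; the array names are mine). rfft returns only the non-redundant half of the spectrum for a real-valued input, which is why the Kotlin code keeps just size / 2 bins:

import numpy as np

fs = 44100        # sample rate in Hz (assumed)
window = 4096     # samples per chunk fed to the FFT

# 'waveform' stands in for a 1-D array of samples such as allWaveform above.
for start in range(0, len(waveform) - window, window):
    chunk = waveform[start:start + window]
    spectrum = np.fft.rfft(chunk)                 # complex bins from 0 Hz up to fs/2
    freqs = np.fft.rfftfreq(window, d=1.0 / fs)   # bin index -> frequency in Hz
    amplitudes = np.abs(spectrum)                 # hypot(real, imaginary)
    phases = np.angle(spectrum)                   # atan2(imaginary, real)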
Ultimately, I don't see why performing an FFT would get you any closer to syncing your recordings. Admittedly it's not a task I've tried before, but I would think the waveform data is already in the perfect form for comparing the recordings and finding matching patterns. If you break your songs into chunks to perform FFTs on, then you can try to find matching FFTs, but they will only match perfectly if your chunks are divided at exactly the same points relative to the beginning of the original recording. And even if you could guarantee that and found matching FFTs, you would only have as much precision as the size of your chunks.
But when I think of apps like Shazam, I realize they must be doing some sort of manipulation of the audio that breaks it down into something simpler for rapid comparison. That possibly involves some FFT manipulation and filtering.
Maybe you could compare FFTs using some algorithm to just find ones that are pretty similar to narrow down to a time range and then compare wave form data in that range to find the exact point of synchronization.
I would imagine the approach that would work well would be to find the offset with the maximum cross-correlation between the two recordings. That means calculating the cross-correlation between the two pieces at various offsets; you would expect the maximum cross-correlation to occur at the offset where the two pieces are best aligned.
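A minimal sketch of that idea in Python/numpy (variable names are mine; a and b are the two recordings as mono sample arrays at the same rate). For full songs a direct correlation is slow, so in practice an FFT-based correlation such as scipy.signal.correlate(..., method='fft') would be the way to run it, but the logic is the same:

import numpy as np

# a and b: 1-D arrays of samples, same sample rate (44.1 kHz assumed below).
xcorr = np.correlate(a - a.mean(), b - b.mean(), mode='full')

# Peak index -> relative shift in samples. With numpy's convention, a positive
# lag here means the content of a appears 'lag' samples later than in b
# (a[n] lines up with b[n - lag]); worth double-checking with a known offset.
lag = np.argmax(xcorr) - (len(b) - 1)
offset_seconds = lag / 44100.0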
I'm currently working on Spark, and I collected some performance metrics through a custom Spark listener API for analysis purposes. I tried to make a stacked bar plot that shows the percentage of time the executor spends executing the task, shuffling, or in garbage-collection pauses for three different machine learning algorithms.
Here is a screenshot of what I found:
What caught my attention right after the plot appeared is that the rates are wrong. You can see that the total goes beyond 1 for the kmeans algorithm, and stays below 0.8 for the perceptron.
Here is how I computed the rates:
execution['cpuRate'] = execution['executorCpuTime'] / execution['executorRunTime']
execution['serRate'] = execution['resultSerializationTime'] / execution['executorRunTime']
execution['gcRate'] = execution['jvmGCTime'] / execution['executorRunTime']
execution['shuffleFetchRate'] = execution['shuffleFetchWaitTime'] / execution['executorRunTime']
execution['shuffleWriteRate'] = execution['shuffleWriteTime'] / execution['executorRunTime']
execution = execution[['cpuRate', 'serRate', 'gcRate', 'shuffleFetchRate', 'shuffleWriteRate']]
execution.plot.bar(stacked=True)
I use the Pandas library, and execution is the DataFrame containing the averaged metrics.
My assumption was, of course, that executorRunTime is the sum of the other underlying metrics, but that turns out to be false.
What is the meaning of those times, and how are they related? I mean: what does executorRunTime consist of, if not all the other metrics specified above?
Thanks
According to TaskMetrics.scala:
/**
 * Time the executor spends actually running the task (including fetching shuffle data).
 */
def executorRunTime: Long = _executorRunTime.sum
It is measured in milliseconds.
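A practical consequence (my own reading, not part of the quoted source): executorRunTime is wall-clock time, and the other metrics are not clean, disjoint slices of it, so the ratios should not be expected to sum to exactly 1. Also, if I remember TaskMetrics correctly, executorCpuTime and shuffleWriteTime are reported in nanoseconds while executorRunTime, jvmGCTime and resultSerializationTime are in milliseconds, which by itself can push a naive ratio past 1. A hedged sketch of the plot under those assumptions, reusing your column names:

import pandas as pd

run_ms = execution['executorRunTime']
rates = pd.DataFrame({
    # assumed to be nanoseconds -> convert to milliseconds
    'cpuRate': (execution['executorCpuTime'] / 1e6) / run_ms,
    'shuffleWriteRate': (execution['shuffleWriteTime'] / 1e6) / run_ms,
    # assumed to be milliseconds already
    'serRate': execution['resultSerializationTime'] / run_ms,
    'gcRate': execution['jvmGCTime'] / run_ms,
    'shuffleFetchRate': execution['shuffleFetchWaitTime'] / run_ms,
})
# Whatever run time the listed metrics don't explain; clipped at zero
# because the components can overlap.
rates['other'] = (1 - rates.sum(axis=1)).clip(lower=0)
rates.plot.bar(stacked=True)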
I'm looking for a more efficient way to draw continuous lines in PsychoPy. That's what I've come up with, for now...
edit: the only improvement I could think of is to add a new line only if the mouse has really moved by adding if (mspos1-mspos2).any():
ms = event.Mouse(myWin)
lines = []
mspos1 = ms.getPos()
while True:
    mspos2 = ms.getPos()
    if (mspos1 - mspos2).any():
        lines.append(visual.Line(myWin, start=mspos1, end=mspos2))
    for j in lines:
        j.draw()
    myWin.flip()
    mspos1 = mspos2
edit: I tried it with ShapeStim (code below), hoping that it would work better, but it gets edgy even more quickly...
vertices = [ms.getPos()]
con_line = visual.ShapeStim(myWin,
                            lineColor='red',
                            closeShape=False)
myclock.reset()
i = 0
while myclock.getTime() < 15:
    new_pos = ms.getPos()
    if (vertices[i] - new_pos).any():
        vertices.append(new_pos)
        i += 1
        con_line.vertices = vertices
    con_line.draw()
    myWin.flip()
The problem is that it becomes too resource-demanding to draw that many visual.Lines or to manipulate that many vertices in the visual.ShapeStim on each iteration of the loop. It will hang on the draw (for Lines) or on the vertex assignment (for ShapeStim) long enough that the mouse has moved too far in the meantime, so the line shows discontinuities ("edgy").
So it's a performance issue. Here are two ideas:
Have a lower threshold for the minimum distance travelled by the mouse before you want to add a new coordinate to the line. In the example below I impose the criterion that the mouse position should be at least 10 pixels away from the previous vertex to be recorded. In my testing, this compressed the number of vertices recorded per second to about a third. This strategy alone will postpone the performance issue but not prevent it, so on to...
Use the ShapeStim solution but regularly switch to a new ShapeStim, each with fewer vertices, so that the stimulus to be updated isn't too complex. In the example below I set the limit at 500 vertices before shifting to a new stimulus. There might be a small glitch while generating the new stimulus, but nothing I've noticed.
So combining these two strategies, starting and ending mouse drawing with a press on the keyboard:
# Setting things up
from psychopy import visual, event, core
import numpy as np

# The crucial controls for performance. Adjust to your system/liking.
distance_to_record = 10    # number of pixels between coordinate recordings
screenshot_interval = 500  # number of coordinate recordings before shifting to a new ShapeStim

# Stimuli
myWin = visual.Window(units='pix')
ms = event.Mouse()
myclock = core.Clock()

# The initial ShapeStim in the "stimuli" list. We can refer to the latest
# as stimuli[-1] and will do that throughout the script. The others are
# "finished" and will only be used for draw.
stimuli = [visual.ShapeStim(myWin,
                            lineColor='white',
                            closeShape=False,
                            vertices=np.empty((0, 2)))]

# Wait for a key, then start with this mouse position
event.waitKeys()
stimuli[-1].vertices = np.array([ms.getPos()])
myclock.reset()

while not event.getKeys():
    # Get mouse position
    new_pos = ms.getPos()

    # Calculate the distance moved since the last recorded vertex. Pure Pythagoras.
    # Index -1 is the last row.
    distance_moved = np.sqrt((stimuli[-1].vertices[-1][0] - new_pos[0])**2 +
                             (stimuli[-1].vertices[-1][1] - new_pos[1])**2)

    # If the mouse has moved the minimum required distance, add the new vertex to the ShapeStim...
    if distance_moved > distance_to_record:
        stimuli[-1].vertices = np.append(stimuli[-1].vertices, np.array([new_pos]), axis=0)

        # ... and show it (along with any "full" ShapeStims)
        for stim in stimuli:
            stim.draw()
        myWin.flip()

        # Add a new ShapeStim once the old one is too full
        if len(stimuli[-1].vertices) > screenshot_interval:
            print('new ShapeStim now!')
            stimuli.append(visual.ShapeStim(myWin,
                                            lineColor='white',
                                            closeShape=False,
                                            vertices=[stimuli[-1].vertices[-1]]))  # start from the last vertex
Can anybody spot what is wrong with the code below? It is supposed to average the frame interval (dt) over the previous TIME_STEPS frames.
I'm using Box2D and cocos2d, although I don't think the cocos2d bit is very relevant.
-(void) update: (ccTime) dt
{
    float32 timeStep;
    const int32 velocityIterations = 8;
    const int32 positionIterations = 3;

    // Average the previous TIME_STEPS time steps
    for (int i = 0; i < TIME_STEPS; i++)
    {
        timeStep += previous_time_steps[i];
    }
    timeStep = timeStep / TIME_STEPS;

    // step the world
    [GB2Engine sharedInstance].world->Step(timeStep, velocityIterations, positionIterations);

    for (int i = 0; i < TIME_STEPS - 1; i++)
    {
        previous_time_steps[i] = previous_time_steps[i+1];
    }
    previous_time_steps[TIME_STEPS - 1] = dt;
}
The previous_time_steps array is initially filled with whatever the animation interval is set to.
This doesn't do what I would expect it to. On devices with a low frame rate it speeds up the simulation, and on devices with a high frame rate it slows it down. I'm sure it's something stupid I'm overlooking.
I know Box2D likes to work with fixed time steps, but I really don't have a choice. My game runs at a very variable frame rate on the various devices, so a fixed time step just won't work. The game runs at an average of 40 fps, but on some of the crappier devices, like the first-gen iPad, it runs at barely 30 frames per second. The third-gen iPad runs it at 50/60 frames per second.
I'm open to suggestions on other ways of dealing with this problem too. Any advice would be appreciated.
Something else unusual I should note, in case somebody has some insight into it, is that the debug optimisation level of the build has a huge effect on the above. The frame rate isn't changed much when the optimisation level is set to -Os vs -O0, but with -Os the physics simulation runs much faster than with -O0 when the above code is active. If I just use dt as the interval instead of the above code, then the optimisation level makes no difference.
I'm totally confused by that.
"On devices with a low frame rate it speeds up the simulation and on devices with a high frame rate it slows it down."
That's what using a variable time step is all about. If you only get 10 fps the physics engine will iterate the world faster because the delta time is larger.
PS: If you do any kind of performance tests like these, run them with the release build. That also ensures that (most) logging is disabled and code optimizations are on. It's possible that you simply experience much greater impact on performance from debugging code on older devices.
Also, what value is TIME_STEPS? It shouldn't be more than 10, maybe 20 at most. The alternative to averaging is to use the delta time directly, but if the delta time exceeds a certain threshold (i.e. the frame rate drops below 30 fps), switch to a fixed delta time (cap it). A variable time step below 30 fps can get really ugly, so in such cases it's probably better to let the physics engine slow down with the framerate, or else the game will become harder, if not unplayable, at lower fps.
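To make the capping idea concrete, a minimal sketch (plain Python rather than Objective-C, with world.Step standing in for whatever stepping call the engine exposes):

MAX_DT = 1.0 / 30.0   # never advance the physics by more than 1/30 s per frame

def step_physics(world, dt, velocity_iterations=8, position_iterations=3):
    # Use the real frame time, but cap it so one slow frame cannot make
    # the simulation jump too far forward at once.
    dt = min(dt, MAX_DT)
    world.Step(dt, velocity_iterations, position_iterations)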
I'm trying to show the user an image with a countdown. For this, I'm using a separate thread in which I check when the countdown timer should be started and, once it has, I draw an image for every 6 seconds that have passed.
What's annoying is that when I pass the drawn image to the UI, the quality of the image changes and it looks bad to the user.
This is my little script that handles the drawings:
Try
    remainingTime = (#12:04:00 AM# - (DateTime.Now - local_DateTimeClick)).ToString("HH:mm:ss")
    remainingTimeInSeconds = Convert.ToDateTime(remainingTime).Minute * 60 + Convert.ToDateTime(remainingTime).Second

    If remainingTimeInSeconds Mod 6 = 0 Then
        g.ResetTransform()
        g.TranslateTransform(52, 52)
        g.RotateTransform(230 - remainingTimeInSeconds / 6 * 9)
        'g.CompositingQuality = Drawing2D.CompositingQuality.HighQuality
        'g.SmoothingMode = Drawing2D.SmoothingMode.AntiAlias
        'g.InterpolationMode = Drawing2D.InterpolationMode.HighQualityBicubic
        'g.CompositingMode = Drawing2D.CompositingMode.SourceCopy
        'g.PixelOffsetMode = Drawing2D.PixelOffsetMode.
        g.DrawImage(Tick, 10, 10)
    End If
Catch
    remainingTime = "Times Up"
End Try
In the above section,
- *local_DateTimeClick* is the variable that is set when the countdown should start
- *Tick* is a Bitmap that represents the image I have to draw for every 6 elapsed seconds
- *g* is a Graphics object from the image that I return to the main window.
I also tried changing the properties of g, but there was no positive effect.
Does anyone have any idea what I can do to make this work properly without changing the quality of the returned image? Any tip/advice is welcome.
That's because Drawing2D.SmoothingMode only applies to 2D vector drawing methods such as Graphics.DrawEllipse and Graphics.DrawLine.
It doesn't affect drawing bitmaps with Graphics.DrawImage.
From the .NET documentation:
The smoothing mode specifies whether lines, curves, and the edges of filled areas use smoothing (also called antialiasing).
You have two options at this point:
- Pre-render every possible orientation of the image, and then just display a different Image object at each tick
- Use the .NET vector drawing methods to render your image instead of using a pre-rendered bitmap.
Without knowing the actual image you're transforming, I can't say which would be more appropriate.
From a graphic designer's perspective (the one followed at my company), I think you should crop your images to fit the label:
1. It enhances performance by saving processing power.
2. It looks better.
Regards