I/O Disk Drive Calculations - hardware

So I am studying for an up and coming exam, one of the questions involves calculating various disk drive properties. I have spent a fair while researching sample questions and formula but because I'm a bit unsure on what I have come up with I was wondering could you possibly help confirm my formulas / answers?
Information Provided:
Rotation Speed = 6000 RPM
Surfaces = 6
Sector Size = 512 bytes
Sectors / Track = 500 (average)
Tracks / Surfaces = 1,000
Average Seek Time = 8ms
One Track Seek Time = 0.4 ms
Maximum Seek Time = 10ms
Questions:
Calculate the following
(i) The capacity of the disk
(ii) The maximum transfer rate for a single track
(iii) Calculate the amount of cylinder skew needed (in sectors)
(iv) The Maximum transfer rate (in bytes) across cylinders (with cylinder skew)
My Answers:
(i) Sector Size x Sectors per Track x Tracks per Surface x No. of surfaces
512 x 500 x 1000 x 6 = 1,536,000,000 bytes
(ii) Sectors per Track x Sector Size x Rotation Speed per sec
500 x 512 x (6000/60) = 25,600,000 bytes per sec
(iii) (Track to Track seek time / Time for 1 Rotation) x Sectors per Track + 4
(0.4 / 0.1) x 500 + 4 = 24
(iv) Really unsure about this one to be honest, any tips or help would be much appreciated.
I fairly sure a similar question will appear on my paper so it really would be a great help if any of you guys could confirm my formulas and derived answers for this sample question. Also if anyone could provide a bit of help on that last question it would be great.
Thanks.

(iv) The Maximum transfer rate (in bytes) across cylinders (with cylinder skew)
500 s/t (1 rpm = 500 sectors) x 512 bytes/sector x 6 (reading across all 6 heads maximum)
1 rotation yields 1536000 bytes across 6 heads
you are doing 6000 rpm so that is 6000/60 or 100 rotations per second
so, 153,600,000 bytes per second (divide by 1 million is 153.6 megabytes per second)
takes 1/100th of a second or 10ms to read in a track
then you need a .4ms shift of the heads to then read the next track.
10.0/10.4 gives you a 96.2 percent effective read rate moving the heads perfectly.
you would be able to read at 96% of the 153.6 or 147.5 Mb/s optimally after the first seek.
where 1 Mb = 1,000,000 bytes

Related

What will be the best choice for batch size for one device (using Mirrored Strategy in TF)?

Question:
Suppose you have 4 GPUs (having 2GB memory each) to train your deep learning model. You have 1000 data points in your dataset that takes around 10 GB of storage. What will be the best choice for batch size for one device (using Mirrored Strategy in TF)?
Can someone help me to solve this assignment problem? Thanks in advance.
Each GPU has a memory of 2GB and there are 4 GPUs which means you have a total of 8 GB memory to work with.
Now you can't divide 10 GB of data into 8 GB in one go, so you split 10GB into halves, and have an overall batch size of 500 data points(or rather 512 to be closer to a power of 2)
Now you distribute these 500 data points across the 4 GPUs, getting a batch size of ~128 data points per device.
So overall batch size would be 512 data points, and per GPU batch size would 128.

Getting fuel% from analog data

I am getting analog voltage data, in mV, from a fuel gauge. The calibration readings were taken for every 10% change in the fuel gauge as mentioned below :
0% - 2000mV
10% - 2100mV
20% - 3200mV
30% - 3645mV
40% - 3755mV
50% - 3922mV
60% - 4300mV
70% - 4500mv
80% - 5210mV
90% - 5400mV
100% - 5800mV
The tank capacity is 45L.
Post calibration, I am getting reading from adc as let's say, 3000mV. How to calculate the exact % of fuel left in the tank?
If you plot the transfer function of ADC reading agaist the percentage tank contents you get a graph like this
There appears to be a fair degree of non linearity in the relationship between the sensor and the measured quantity. This could be down to a measurement error that was made while performing the calibration or it could be a true non linear relationship between the sensor reading and the tank contents. Using these results will give fairly inaccurate estimates of tank contents due to the non linearity of the transfer function.
If the relationship is linear or can be described by another mathematical relationship then you can perform an interpolation between known points using this mathematical relationship.
If the relationship is not linear than you will need many more known points in your calibration data so that the errors due to the interpolation between points is minimised.
The percentage value corresponding to the ADC reading can be approximated by finding the entries in the calibration above and below the reading that has been taken - for the ADC reading example in the question these would be the 10% and 20% values
Interpolation_Proportion = (ADC - ADC_Below) / (ADC_Above - ADC_Below) ;
Percent = Percent_Below + (Interpolation_Proportion * (Percent_Above - Percent_Below)) ;
.
Interpolation proportion = (3000-2100)/(3200-2100)
= 900/1100
= 0.82
Percent = 10 + (0.82 * (20 - 10)
= 10 + 8.2
= 18.2%
Capacity = 45 * 18.2 / 100
= 8.19 litres
When plotted it appears that the data id broadly linear, with some outliers. It is likely that this is experimental error or possibly influenced by confounding factors such as electrical noise or temperature variation, or even just the the liquid slopping around! Without details of how the data was gathered and how carefully, it is not possible to determine, but I would ask how many samples were taken per measurement, whether these are averaged or instantaneous and whether the results are exactly repeatable over more than one experiment?
Assuming the results are "indicative" only, then it is probably wisest from the data you do have to assume that the transfer function is linear, and to perform a linear regression from the scatter plot of your test data. That can be most done easily using any spreadsheet charting "trendline" function:
From your date the transfer function is:
Fuel% = (0.0262 x SensormV) - 54.5
So for your example 3000mV, Fuel% = (0.0262 x 3000) - 54.5 = 24.1%
For your 45L tank that equates to about 10.8 Litres.

remoteIO input square wave frequency is low

i input to the audioJack a square wave in the frequency of 2-3 khZ for about 5 seconds.
the square wave is 1 and 0 - no negative values.
i get some periodic signal that going between -32000 to 32000 (but my signal is positive!? )
i have check how many times my values are crossing the zero- i get 500 in 5 seconds, which means 100 per second .
what am i missing here ? 3khz is 3000 per second.
my sampling code is in my previous post :
error in audio Unit code -remoteIO for iphone
any explanation on the frequency domain here? am i missing samples ? how can i improve it? should i do :
float bufferLength = 0.005;
AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration, sizeof(bufferLength), &bufferLength);
status = AudioOutputUnitStart(audioUnit);
thanks alot!

Using `overlap`, `kernel time` and `utilization` to optimize one's kernels

My kernel archive 100% utilization, but the kernel time is at only 3% and there is no time overlap between memory copies and kernels.
Especially the high utilization and the low kernel time don't make sense to me.
So how should I proceed in optimizing my kernel?
I already made sure, that I only have coalesced and pinned memory access, like the profiler recommended.
`Quadro FX 580 utilization = 100.00% (62117.00/62117.00)`
Kernel time = 3.05 % of total GPU time
Memory copy time = 0.9 % of total GPU time
Kernel taking maximum time = Pinned (0.7% of total GPU time)
Memory copy taking maximum time = memcpyHtoD (0.5% of total GPU time)
There is no time overlap between memory copies and kernels on GPU
Furtermore I have no warp serialization, no divergent branches, and no occupancy limiting factor.
Kernel details: Grid size: [4 1 1], Block size: [256 1 1]
Register Ratio: 0.9375 ( 7680 / 8192 ) [10 registers per thread]
Shared Memory Ratio: 0.09375 ( 1536 / 16384 ) [60 bytes per Block]
Active Blocks per SM: 3 (Maximum Active Blocks per SM: 8)
Active threads per SM: 768 (Maximum Active threads per SM: 768)
Potential Occupancy: 1 ( 24 / 24 )
Achieved occupancy: 0.333333 (on 4 SMs)
Occupancy limiting factor: None
p.s. I don't claim that I wrote wundercode, but I just don't know how to proceed from here.
it seems the grid size of your kernel is too small to make full use of SM.
why not decrease block size and increase the grid size.
i think it will do some help.

VB FFT - stuck understanding relationship of results to frequency

Trying to understand an fft (Fast Fourier Transform) routine I'm using (stealing)(recycling)
Input is an array of 512 data points which are a sample waveform.
Test data is generated into this array. fft transforms this array into frequency domain.
Trying to understand relationship between freq, period, sample rate and position in fft array. I'll illustrate with examples:
========================================
Sample rate is 1000 samples/s.
Generate a set of samples at 10Hz.
Input array has peak values at arr(28), arr(128), arr(228) ...
period = 100 sample points
peak value in fft array is at index 6 (excluding a huge value at 0)
========================================
Sample rate is 8000 samples/s
Generate set of samples at 440Hz
Input array peak values include arr(7), arr(25), arr(43), arr(61) ...
period = 18 sample points
peak value in fft array is at index 29 (excluding a huge value at 0)
========================================
How do I relate the index of the peak in the fft array to frequency ?
If you ignore the imaginary part, the frequency distribution is linear across bins:
Frequency#i = (Sampling rate/2)*(i/Nbins).
So for your first example, assumming you had 256 bins, the largest bin corresponds to a frequency of 1000/2 * 6/256 = 11.7 Hz.
Since your input was 10Hz, I'd guess that bin 5 (9.7Hz) also had a big component.
To get better accuracy, you need to take more samples, to get smaller bins.
Your second example gives 8000/2*29/256 = 453Hz. Again, close, but you need more bins.
Your resolution here is only 4000/256 = 15.6Hz.
It would be helpful if you were to provide your sample dataset.
My guess would be that you have what are called sampling artifacts. The strong signal at DC ( frequency 0 ) suggests that this is the case.
You should always ensure that the average value in your input data is zero - find the average and subtract it from each sample point before invoking the fft is good practice.
Along the same lines, you have to be careful about the sampling window artifact. It is important that the first and last data point are close to zero because otherwise the "step" from outside to inside the sampling window has the effect of injecting a whole lot of energy at different frequencies.
The bottom line is that doing an fft analysis requires more care than simply recycling a fft routine found somewhere.
Here are the first 100 sample points of a 10Hz signal as described in the question, massaged to avoid sampling artifacts
> sinx[1:100]
[1] 0.000000e+00 6.279052e-02 1.253332e-01 1.873813e-01 2.486899e-01 3.090170e-01 3.681246e-01 4.257793e-01 4.817537e-01 5.358268e-01
[11] 5.877853e-01 6.374240e-01 6.845471e-01 7.289686e-01 7.705132e-01 8.090170e-01 8.443279e-01 8.763067e-01 9.048271e-01 9.297765e-01
[21] 9.510565e-01 9.685832e-01 9.822873e-01 9.921147e-01 9.980267e-01 1.000000e+00 9.980267e-01 9.921147e-01 9.822873e-01 9.685832e-01
[31] 9.510565e-01 9.297765e-01 9.048271e-01 8.763067e-01 8.443279e-01 8.090170e-01 7.705132e-01 7.289686e-01 6.845471e-01 6.374240e-01
[41] 5.877853e-01 5.358268e-01 4.817537e-01 4.257793e-01 3.681246e-01 3.090170e-01 2.486899e-01 1.873813e-01 1.253332e-01 6.279052e-02
[51] -2.542075e-15 -6.279052e-02 -1.253332e-01 -1.873813e-01 -2.486899e-01 -3.090170e-01 -3.681246e-01 -4.257793e-01 -4.817537e-01 -5.358268e-01
[61] -5.877853e-01 -6.374240e-01 -6.845471e-01 -7.289686e-01 -7.705132e-01 -8.090170e-01 -8.443279e-01 -8.763067e-01 -9.048271e-01 -9.297765e-01
[71] -9.510565e-01 -9.685832e-01 -9.822873e-01 -9.921147e-01 -9.980267e-01 -1.000000e+00 -9.980267e-01 -9.921147e-01 -9.822873e-01 -9.685832e-01
[81] -9.510565e-01 -9.297765e-01 -9.048271e-01 -8.763067e-01 -8.443279e-01 -8.090170e-01 -7.705132e-01 -7.289686e-01 -6.845471e-01 -6.374240e-01
[91] -5.877853e-01 -5.358268e-01 -4.817537e-01 -4.257793e-01 -3.681246e-01 -3.090170e-01 -2.486899e-01 -1.873813e-01 -1.253332e-01 -6.279052e-02
And here is the resulting absolute values of the fft frequency domain
[1] 7.160038e-13 1.008741e-01 2.080408e-01 3.291725e-01 4.753899e-01 6.653660e-01 9.352601e-01 1.368212e+00 2.211653e+00 4.691243e+00 5.001674e+02
[12] 5.293086e+00 2.742218e+00 1.891330e+00 1.462830e+00 1.203175e+00 1.028079e+00 9.014559e-01 8.052577e-01 7.294489e-01
I'm a little rusty too on math and signal processing but with the additional info I can give it a shot.
If you want to know the signal energy per bin you need the magnitude of the complex output. So just looking at the real output is not enough. Even when the input is only real numbers. For every bin the magnitude of the output is sqrt(real^2 + imag^2), just like pythagoras :-)
bins 0 to 449 are positive frequencies from 0 Hz to 500 Hz. bins 500 to 1000 are negative frequencies and should be the same as the positive for a real signal. If you process one buffer every second frequencies and array indices line up nicely. So the peak at index 6 corresponds with 6Hz so that's a bit strange. This might be because you're only looking at the real output data and the real and imaginary data combine to give an expected peak at index 10. The frequencies should map linearly to the bins.
The peaks at 0 indicates a DC offset.
It's been some time since I've done FFT's but here's what I remember
FFT usually takes complex numbers as input and output. So I'm not really sure how the real and imaginary part of the input and output map to the arrays.
I don't really understand what you're doing. In the first example you say you process sample buffers at 10Hz for a sample rate of 1000 Hz? So you should have 10 buffers per second with 100 samples each. I don't get how your input array can be at least 228 samples long.
Usually the first half of the output buffer are frequency bins from 0 frequency (=dc offset) to 1/2 sample rate. and the 2nd half are negative frequencies. if your input is only real data with 0 for the imaginary signal positive and negative frequencies are the same. The relationship of real/imaginary signal on the output contains phase information from your input signal.
The frequency for bin i is i * (samplerate / n), where n is the number of samples in the FFT's input window.
If you're handling audio, since pitch is proportional to log of frequency, the pitch resolution of the bins increases as the frequency does -- it's hard to resolve low frequency signals accurately. To do so you need to use larger FFT windows, which reduces time resolution. There is a tradeoff of frequency against time resolution for a given sample rate.
You mention a bin with a large value at 0 -- this is the bin with frequency 0, i.e. the DC component. If this is large, then presumably your values are generally positive. Bin n/2 (in your case 256) is the Nyquist frequency, half the sample rate, which is the highest frequency that can be resolved in the sampled signal at this rate.
If the signal is real, then bins n/2+1 to n-1 will contain the complex conjugates of bins n/2-1 to 1, respectively. The DC value only appears once.
The samples are, as others have said, equally spaced in the frequency domain (not logarithmic).
For example 1, you should get this:
alt text http://home.comcast.net/~kootsoop/images/SINE1.jpg
For the other example you should get
alt text http://home.comcast.net/~kootsoop/images/SINE2.jpg
So your answers both appear to be correct regarding the peak location.
What I'm not getting is the large DC component. Are you sure you are generating a sine wave as the input? Does the input go negative? For a sinewave, the DC should be close to zero provided you get enough cycles.
Another avenue is to craft a Goertzel's Algorithm of each note center frequency you are looking for.
Once you get one implementation of the algorithm working you can make it such that it takes parameters to set it's center frequency. With that you could easily run 88 of them or what ever you need in a collection and scan for the peak value.
The Goertzel Algorithm is basically a single bin FFT. Using this method you can place your bins logarithmically as musical notes naturally go.
Some pseudo code from Wikipedia:
s_prev = 0
s_prev2 = 0
coeff = 2*cos(2*PI*normalized_frequency);
for each sample, x[n],
s = x[n] + coeff*s_prev - s_prev2;
s_prev2 = s_prev;
s_prev = s;
end
power = s_prev2*s_prev2 + s_prev*s_prev - coeff*s_prev2*s_prev;
The two variables representing the previous two samples are maintained for the next iteration. This can be then used in a streaming application. I thinks perhaps the power calculation should be inside the loop as well. (However it is not depicted as such in the Wiki article.)
In the tone detection case there would be 88 different coeficients, 88 pairs of previous samples and would result in 88 power output samples indicating the relative level in that frequency bin.
WaveyDavey says that he's capturing sound from a mic, thru the audio hardware of his computer, BUT that his results are not zero-centered. This sounds like a problem with the hardware. It SHOULD BE zero-centered.
When the room is quiet, the stream of values coming from the sound API should be very close to 0 amplitude, with slight +- variations for ambient noise. If a vibratory sound is present in the room (e.g. a piano, a flute, a voice) the data stream should show a fundamentally sinusoidal-based wave that goes both positive and negative, and averages near zero. If this is not the case, the system has some funk going on!
-Rick