Find longest segment - pandas

I have a dataframe (or a series) of measured voltage (in V) indexed by timestamps (in seconds). I want to know the duration of the longest segment (=consecutive values) of voltage greater than a threshold.
Example:
time voltage
0.0 1.2
0.1 1.8
0.2 2.2
0.3 2.3
0.4 1.9
0.5 1.5
0.6 2.1
0.7 2.3
0.8 2.2
0.9 1.9
1.0 1.6
In this example, the threshold is 2.0 V and the desired answer is 0.3 seconds (the longest segment contains three consecutive samples above the threshold, at 0.1 s sample spacing).
The real data has 10k or more samples, and the number of segments above the threshold is completely random; it is even possible to have a single segment with all values above the threshold.
I think the first step is to identify these segments and separate them, then compute the duration of each.

You can create a True/False sequence with boolean indexing, then use value_counts and max to get the length of the longest run:
s = df.voltage > 2
(~s).cumsum()[s].value_counts().max()
Output
3
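As a self-contained sketch of that approach, using the example data from the question (`longest_run` is just an illustrative name):

```python
import pandas as pd

df = pd.DataFrame({
    "time":    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    "voltage": [1.2, 1.8, 2.2, 2.3, 1.9, 1.5, 2.1, 2.3, 2.2, 1.9, 1.6],
})

s = df.voltage > 2
# (~s).cumsum() is constant within each run of True values, so it acts
# as a label for each above-threshold segment; value_counts then gives
# the run lengths, and max picks the longest one
longest_run = (~s).cumsum()[s].value_counts().max()
print(longest_run)  # 3 samples; at 0.1 s spacing that is 0.3 s
```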

IIUC
n=2
s=df.voltage.gt(n)
df.time[s].groupby((~s).cumsum()).diff().sum()
Out[1218]: 0.30000000000000004
And if you need the duration of the single longest segment, notice that here it runs from 0.6 to 0.8, which should be 0.2 seconds:
df.time[s].groupby((~s).cumsum()).apply(lambda x : x.diff().sum()).max()
Out[1221]: 0.20000000000000007
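Run end-to-end against the question's example data, the two expressions give (a sketch; `total` and `longest` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    "time":    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    "voltage": [1.2, 1.8, 2.2, 2.3, 1.9, 1.5, 2.1, 2.3, 2.2, 1.9, 1.6],
})

n = 2
s = df.voltage.gt(n)
# total time spent above the threshold, summed over all segments
total = df.time[s].groupby((~s).cumsum()).diff().sum()
# duration of the single longest segment (sum of within-segment steps)
longest = df.time[s].groupby((~s).cumsum()).apply(lambda x: x.diff().sum()).max()
```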


Pandas keep decimal part of values like .998344 after round(2) applied to series

I have a dataset of float values with 6 decimals, which I need to round to two decimals. The problem comes with some floats close to 1: after applying round(2) I get 1.00 instead of 0.99. That may be mathematically right, but I need values like 0.99. My customer needs two decimals as the result, and I can't change that.
ones_list = [0.998344, 0.996176, 0.998344, 0.998082]
df = pd.DataFrame(ones_list, columns=['value_to_round'])
df['value_to_round'].round(2)
1.0
1.0
1.0
1.0
I see a few options:
use floor instead of round (but would you have the same issue with 0.005?)
use clip to cap the column at a maximum value of 0.99 (and possibly set a minimum):
df['value_to_round'].round(2).clip(upper=0.99)
Please refer to the basics of rounding in math: you are trying to round to 2 digits after the dot using .round(2).
If you round 0.999 with .round(2), of course you get 1.0, because the last '9' digit (0.009) rounds up to 0.01, which is added to 0.09 to give 0.1, which is added to 0.9 and finally becomes 1.0.
If you really want values like 0.99, just truncate to the two decimals after the dot. You can try either of the following methods:
import math
# scale up, truncate to an integer, then scale back down
df['value_to_round'] = df['value_to_round'] * 100
df['value_to_round'] = df['value_to_round'].apply(math.trunc)
df['value_to_round'] = df['value_to_round'] / 100
or
# string slicing; note this assumes values of the form 0.xx
df['value_to_round'] = df['value_to_round'].astype(str)
df['value_to_round'] = df['value_to_round'].str[:4]
df['value_to_round'] = df['value_to_round'].astype(float)
I experienced the same thing when trying to display an R-squared value; what I did was just use .round(3), because three decimal digits wouldn't hurt.
I hope this helps :)
df['value_to_round'] = [x if x < 1 else 0.99 for x in df['value_to_round'].round(2)]
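The truncation idea from the answers above can be distilled into a minimal pure-Python sketch (for non-negative values, flooring after scaling by 100 is equivalent to truncating):

```python
import math

values = [0.998344, 0.996176, 0.998344, 0.998082]  # from the question
truncated = [math.floor(v * 100) / 100 for v in values]
print(truncated)  # [0.99, 0.99, 0.99, 0.99]
```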

Confusing Labels for Function Generators and Oscilloscopes in Tinkercad

In Tinkercad, the amplitude definition for function generators and the scale definition for oscilloscopes are quite confusing. Here is a screenshot of Tinkercad's function generator:
On the device, 6.20 V is shown as the peak-to-peak voltage (see the red lines I've marked), but on the panel on the right-hand side we input it as the amplitude (see the green line I've marked). Which one is true?
And I cannot deduce the answer using an oscilloscope, because there is not enough info about the oscilloscope (at least, I couldn't find enough). Here is the input signal from the function generator above:
The answer is not obvious, because the meaning of the 10 V placed on the y-axis is ambiguous. Is it +/- 10 V, i.e. 20 V in total, so the voltage per division is 2 V (first explanation)? Or is it +/- 5 V, i.e. 10 V in total, so the voltage per division is 1 V (second explanation)? In some YouTube lectures the explanation is the first one, but I'm not quite sure: if 6.2 V is the amplitude and the voltage per division is 2 V, that is consistent; but if 6.2 V is the peak-to-peak voltage and the voltage per division is 1 V, that is consistent too. Again, which one is true?
Also, while studying, I've realised that a real-life experiment indicates that the second explanation should be true. Let me explain the experiment step by step.
Theory: Full Wave Rectifier Circuits
Assume we apply V_in as the amplitude; the peak-to-peak voltage is then V_peaktopeak = 2 * V_in. For the output signal we have,
V_out = (V_in - n * V_diode) * R_L / (R_L + r_d),
where n is the number of diodes in conduction, V_diode is the bias of a diode, and R_L is the load resistor. The load resistor is chosen big enough that R_L >> r_d, and we get,
V_out = V_in - n * V_diode.
In a real experiment r_d is between 1 Ω and 25 Ω, and we choose R_L on the order of kilo-ohms. Therefore, we can safely approximate R_L / (R_L + r_d) ≈ 1.
And for DC voltage corresponding to the output signal we have,
V_DC = 2 * V_out / π ≈ 0.637 * V_out.
Scheme of the Circuit in an Experiment
Here is the circuit scheme:
As you can see, for the positive half-period only two of the four diodes are in conduction, and for the negative half-period the other two are. Thus n is 2 for this circuit. Let's build this experiment in Tinkercad. I didn't use a breadboard, to keep the Tinkercad circuit visually similar to the circuit scheme.
Scenario #1 - Theoretical Expectations
Let's assume 6.2 V is the amplitude. Then V_in = 6.2 V and V_peaktopeak is 12.4 V. For the output signal we calculate,
V_out = V_in - n * V_diode = 6.2 V - 2 * 0.7 V = 4.8 V.
And for DC equivalent we theoretically get,
V_DC = 0.637 * V_out = 3.06 V.
But on the multimeter we read 1.06 V, which is nearly a 60% error.
Scenario #2 - Theoretical Expectations
Let's assume 6.2 V is the peak-to-peak voltage. Then V_in = 3.1 V and V_peaktopeak is 6.2 V. For the output signal we calculate,
V_out = V_in - n * V_diode = 3.1 V - 2 * 0.7 V = 1.7 V.
And for DC equivalent we theoretically get,
V_DC = 0.637 * V_out = 1.08 V.
And on the multimeter we read 1.06 V. These values are pretty close to each other.
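The arithmetic of the two scenarios can be checked with a short sketch (`v_dc` is a hypothetical helper implementing V_DC = 0.637 * (V_in - n * V_diode)):

```python
def v_dc(v_in, n_diodes=2, v_diode=0.7):
    """Full-wave rectifier DC output, assuming R_L >> r_d."""
    v_out = v_in - n_diodes * v_diode
    return 0.637 * v_out

scenario1 = v_dc(6.2)  # 6.2 V taken as the amplitude
scenario2 = v_dc(3.1)  # 6.2 V taken as peak-to-peak, so the amplitude is 3.1 V
print(round(scenario1, 2), round(scenario2, 2))  # 3.06 1.08
```

Only the second scenario reproduces the 1.06 V multimeter reading.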
Conclusion
Based on these results, we may conclude that 6.2 V is the peak-to-peak voltage, that the drawing on the function generator is correct, that the tag "Amplitude" in the function generator's description is wrong, and that the y-scale of the oscilloscope represents the total voltage, half of which is positive and half negative.
BUT
I cannot be sure, and since I'll teach this material in my electronics laboratory class, I really need to be sure about this conclusion. So I'm asking here for your opinions, conclusions, or other references that I may have missed.
TinkerCAD refers to the peak-to-peak voltage as amplitude for some reason. I believe the second explanation (+/- 5 V, 10 V total) is correct, based on the x-axis and the frequency value.

How to convert a matrix of one-digit decimal numbers from a LaTeX text file to the Scilab console in a shortcut way?

I have a very big matrix whose entries are one-digit decimal numbers, for example
\begin{bmatrix}
0.3 & 0.2 & 0.1 \\
0.1 & 0.6 & 0.8 \\
0.7 & 0.4 & 0.8
\end{bmatrix}
in a LaTeX text file. My aim is to get this matrix into the Scilab console. Is there any shortcut way to do this? ("Very big" means at most 30 rows and 6 columns.)
First put your matrix in a file named 'matrix': remove the \begin{bmatrix} and \end{bmatrix} lines and reduce each row separator to a single \.
Thus your file 'matrix' should look like:
0.3 & 0.2 & 0.1 \ 0.1 & 0.6 & 0.8 \ 0.7 & 0.4 & 0.8
Now execute the following code in Scilab:
mclose('all')
f = mopen('matrix');
matrix_car = 'M=[['
// read the file character by character, translating the LaTeX
// separators into Scilab matrix syntax
while ~meof(f)
    car = mgetstr(1, f);
    if car == '&' then            // column separator
        matrix_car = matrix_car + ',';
    elseif car == '\' then        // row separator
        matrix_car = matrix_car + '];[';
    elseif car ~= ascii(10) then  // keep everything except newlines
        matrix_car = matrix_car + car;
    end;
end;
matrix_car = matrix_car + ']]'
execstr(matrix_car)
Then you will get a variable called M with the appropriate matrix in it.
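If preprocessing outside Scilab is an option, the same conversion can be sketched in Python (an assumption on my part: the LaTeX source uses the standard \\ row separator; the string `latex` below stands in for the file contents):

```python
latex = r"""\begin{bmatrix}
0.3 & 0.2 & 0.1 \\
0.1 & 0.6 & 0.8 \\
0.7 & 0.4 & 0.8
\end{bmatrix}"""

# strip the environment markers, split rows on \\ and columns on &
body = latex.replace(r"\begin{bmatrix}", "").replace(r"\end{bmatrix}", "")
rows = [r for r in body.split(r"\\") if r.strip()]
matrix = [[float(x) for x in row.split("&")] for row in rows]

# emit a line that can be pasted into the Scilab console
scilab = "M=[" + ";".join(",".join(str(x) for x in row) for row in matrix) + "]"
print(scilab)  # M=[0.3,0.2,0.1;0.1,0.6,0.8;0.7,0.4,0.8]
```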

tensorflow probability per class

I want to get probability per class, for my output, in tensorflow.
Using softmax yields the following.
A : 0.7
B : 0.2
C : 0.1
But what I want is an independent probability per class, like
A probability : 0.8
B probability : 0.6
C probability : 0.7
Instead of using softmax, use tf.nn.sigmoid, which scores each class independently:
tf.nn.sigmoid(<output-tensor>)
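The difference is easy to see in a small pure-Python sketch (the logits are made-up numbers): softmax makes the classes compete, so its outputs always sum to 1, while sigmoid maps each class's score to (0, 1) independently:

```python
import math

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logits = [2.0, 0.5, 1.5]  # hypothetical raw scores for classes A, B, C
probs_softmax = softmax(logits)               # always sums to 1
probs_sigmoid = [sigmoid(x) for x in logits]  # each in (0, 1), independent
```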

AI: Conditional Independence

Conditional Independence Example Photo
The entire pdf lesson
it's on page 8.
I've been looking at this for a long time now; can anyone explain how, for P13, we end up with <0.31, 0.69>? I'm not sure how the α gets distributed here. When I calculate 0.2(0.04 + 0.16 + 0.16) for the x column I get 0.072, so how do we end up with 0.31?
Thank you.
The α is a normalization constant that is supposed to ensure that you have proper probabilities, i.e. values in [0, 1] that sum to 1. As such, it has to be 1 over the sum of all possible values. For your example, we calculate it as follows.
Let's first evaluate the single expressions in the tuple:
0.2 * (0.04 + 0.16 + 0.16) = 0.072
0.8 * (0.04 + 0.16) = 0.16
Notice that these two values do not specify a probability distribution (they don't sum to 1).
Therefore, we calculate the normalization constant α as 1 over the sum of these values:
α = 1 / (0.072 + 0.16) ≈ 4.310345
With this, we normalize the original values as follows:
0.072 * α = 0.310345
0.16 * α = 0.689655
Notice how these values do indeed specify a probability distribution now. (They are in [0, 1] and sum to 1).
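The computation above, as a quick sketch:

```python
# the two unnormalized values from the example
p_x     = 0.2 * (0.04 + 0.16 + 0.16)  # 0.072
p_not_x = 0.8 * (0.04 + 0.16)         # 0.16

alpha = 1 / (p_x + p_not_x)           # the normalization constant α
dist = (p_x * alpha, p_not_x * alpha)
print(dist)  # roughly (0.310345, 0.689655)
```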
I hope this helps :)