How to increase 1d-data resolution through interpolating in between measurements? - numpy

I'm developing a Python script where I receive angular measurements from a motor which has a low-resolution encoder attached to it. The data I get from the motor has a very low resolution (about 5 degrees between measurements). This is an example of the sensor output whilst it is rotating with a constant speed (in degrees):
sensor output = ([5, 5, 5, 5, 5, 10, 10, 10, 10 ,10, 15, 15, 20, 20, 20, 20, 25, 25, 30, 30, 30, 30, 30, 35, 35....])
As you can see, some of these measurements are repeating themselves.
From these measurements, I would like to interpolate in order to get the measurements in between the 1D data-points. For instance, if I at time k receive the angular measurement theta=5 and in the next instance at t=k+1 also receive a measurement of theta=5, I would like to compute an estimate that would be something like theta = 5+(1/5).
I have also been considering some sort of predictive filtering (e.g. Kalman filtering), but I'm not sure which filter to apply, or whether that is even applicable in this case. The estimated output should be linear, since the motor is rotating with a constant angular velocity.
I have tried using numpy.linspace to achieve what I want, but cannot seem to get it to work the way I want:
# Interpolate for every 'theta_div' values in angle received through
# modbus
for k in range(np.size(rx)):
    y = T.readSensorData()  # take measurement (call read sensor function)
    fp = np.linspace(y, y+1, num=theta_div)
    for n in range(theta_div):
        if k % 6 == 0:
            if not y == fp[n]:
                z = fp[n]
            else:
                z = y
    print(z)
So for the sensor readings: ([5, 5, 5, 5, 5, 10, 10, 10, 10 ,10, 15, 15, 20, 20, 20, 20, 25, 25, 30, 30, 30, 30, 30, 35, 35....]) # each element at time=k0...kn
I would like the output to be something similar to:
theta = ([5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17.5, 20...])
So in short, I need some sort of prediction and then update the value with the actual reading from the sensor, similar to the procedure in a Kalman filter.

Why not just make a linear fit?
import numpy as np
import matplotlib.pyplot as plt
measurements = np.array([5, 5, 5, 5, 5, 10, 10, 10, 10, 10, 15, 15, 20, 20, 20, 20, 25, 25, 30, 30, 30, 30, 30, 35, 35])
time_array = np.arange(measurements.shape[0])
# Least-squares straight line through the readings: [slope, intercept]
fitparms = np.polyfit(time_array, measurements, 1)
def line(x, a, b):
    return a*x + b
better_time_array = np.linspace(0, np.max(time_array))
plt.plot(time_array, measurements)
plt.plot(better_time_array, line(better_time_array, fitparms[0], fitparms[1]))
plt.show()
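If the goal is to recover the in-between values rather than a single fitted line, another option is to interpolate linearly between the samples where the encoder value changes. This is not part of the answer above and is only a rough sketch. It assumes the data has already been recorded, since each estimate relies on the next transition, so it is not a real-time predictor. The names `readings`, `change_idx` and `estimate` are illustrative.

```python
import numpy as np

readings = np.array([5, 5, 5, 5, 5, 10, 10, 10, 10, 10, 15, 15, 20, 20, 20, 20,
                     25, 25, 30, 30, 30, 30, 30, 35, 35], dtype=float)
t = np.arange(readings.size)

# Indices of the first sample of each new encoder step (index 0 included).
change_idx = np.flatnonzero(np.diff(readings, prepend=readings[0] - 1))

# Linear interpolation between the transitions; after the last transition
# np.interp simply holds the final value.
estimate = np.interp(t, change_idx, readings[change_idx])

print(estimate[:13])
```

For this data the first estimates are 5, 6, ..., 15, 17.5, 20, which is the shape of output asked for in the question.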

Related

np.array for variable matrix

import numpy as np
data = np.array([[10, 20, 30, 40, 50, 60, 70, 80, 90],
[2, 7, 8, 9, 10, 11],
[3, 12, 13, 14, 15, 16],
[4, 3, 4, 5, 6, 7, 10, 12]],dtype=object)
target = data[:,0]
It has this error.
IndexError                                Traceback (most recent call last)
Input In [82], in <cell line: 9>()
      data = np.array([[10, 20, 30, 40, 50, 60, 70, 80, 90],
                       [2, 7, 8, 9, 10, 11],
                       [3, 12, 13, 14, 15, 16],
                       [4, 3, 4, 5, 6, 7, 10, 12]], dtype=object)
      # Define the target data
----> 9 target = data[:,0]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
May I know how to fix it without changing the elements in the data? Many thanks. When I made all rows the same length, the error message went away, but my real data has rows of variable length.
You have an array of objects, so you can't index along axis=1 because there is none (data.shape -> (4,)).
Use a list comprehension:
out = np.array([a[0] for a in data])
Output: array([10, 2, 3, 4])
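As a self-contained check of the above, here is a small sketch using the data from the question. The names `rows`, `first` and `rest` are made up for illustration.

```python
import numpy as np

rows = [[10, 20, 30, 40, 50, 60, 70, 80, 90],
        [2, 7, 8, 9, 10, 11],
        [3, 12, 13, 14, 15, 16],
        [4, 3, 4, 5, 6, 7, 10, 12]]
data = np.array(rows, dtype=object)

# The ragged rows are stored as four Python lists in a 1-D object array.
print(data.shape)

# First element of every row, and the remaining elements of every row.
first = np.array([row[0] for row in data])
rest = [row[1:] for row in data]
print(first)
```

Because the rows have different lengths, `rest` stays a plain list of lists rather than a 2-D array.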

How to plot my data using Matplotlib with step size

Consider the following code and the graph obtained from it
import matplotlib.pyplot as plt
import numpy as np
fig,axs = plt.subplots(figsize=(10,10))
data1 = [5, 6, 18, 7, 19]
x_ax = [10, 20, 30, 40, 50]
y_ax = [0, 5, 10, 15, 20]
axs.plot(data1,marker="o")
axs.set_xticks(x_ax)
axs.set_xticklabels(labels=x_ax,rotation=45)
axs.set_yticks(y_ax)
axs.set_yticklabels(labels=y_ax,rotation=45)
axs.set_xlabel("X")
axs.set_ylabel("Y")
axs.set_title("Name")
I need to plot my data1 = [5, 6, 18, 7, 19] with a step size of 10. 5 for 10, 6 for 20, 18 for 30, 7 for 40 and 19 for 50. But the plot is taking a step size of one.
How can I modify my code to do the required?
If you don't provide x values to plot, it'll automatically use 0, 1, 2 ....
So in your case you need:
x = range(10, len(data1)*10+1, 10)
axs.plot(x, data1, marker="o")
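The x positions can also be built with NumPy and checked without drawing anything. This is only a sketch; `step` and `pairs` are illustrative names.

```python
import numpy as np

data1 = [5, 6, 18, 7, 19]
step = 10

# One x position per data point: 10, 20, ..., 50.
x = np.arange(1, len(data1) + 1) * step

# Pair each x position with its value to confirm the mapping.
pairs = list(zip(x.tolist(), data1))
print(pairs)
```

Passing this `x` as the first argument, as in `axs.plot(x, data1, marker="o")`, places 5 at 10, 6 at 20, and so on.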

Why extreme large value to 0 frequency fft (numpy.fft.fft method)

I have a signal ts with a mean of roughly 40, and I applied an FFT to it with this code:
from numpy import array
from numpy.fft import fft, fftfreq, fftshift
ts = array([25, 40, 30, 40, 29, 48, 36, 32, 34, 38, 15, 33, 40, 32, 41, 25, 37, 49, 41, 35, 23, 22, 36, 44, 28, 36, 32, 37, 39, 51])
index = fftshift(fftfreq(len(ts)))
ft_ts = fftshift(fft(ts))
output
ft_ts = array([ -76.00000000 +8.34887715e-14j, -57.72501110 +1.17054586e+01j,
7.69492662 +9.79582336e+00j, -29.11145618 -7.22493645e+00j,
14.92140414 +4.58471353e+01j, -26.00000000 -4.67653718e+01j,
-39.61803399 -2.83601821e+01j, -11.34044003 +8.66215368e+00j,
23.68703939 +1.57391882e+01j, -64.88854382 -2.44499549e+01j,
50.00000000 -3.98371686e+01j, 4.09382150 -6.27663403e+00j,
-37.38196601 -3.06708342e+01j, 35.97162964 +1.31929223e+01j,
18.69662985 -2.20453671e+00j, 1048.00000000 +0.00000000e+00j,
18.69662985 +2.20453671e+00j, 35.97162964 -1.31929223e+01j,
-37.38196601 +3.06708342e+01j, 4.09382150 +6.27663403e+00j,
50.00000000 +3.98371686e+01j, -64.88854382 +2.44499549e+01j,
23.68703939 -1.57391882e+01j, -11.34044003 -8.66215368e+00j,
-39.61803399 +2.83601821e+01j, -26.00000000 +4.67653718e+01j,
14.92140414 -4.58471353e+01j, -29.11145618 +7.22493645e+00j,
7.69492662 -9.79582336e+00j, -57.72501110 -1.17054586e+01j])
At frequency 0, ft_ts has a value of 1048. Shouldn't that be the mean of my original signal ts, which is about 40? What happened here?
Many thanks
The FFT is not normalized, so the first term is the sum, not the mean.
For example, take the usual DFT definition, X_k = sum over n = 0..N-1 of x_n * exp(-2*pi*i*k*n/N).
You can see that when k=0 the exponential term is 1, so you just get the sum of x_n.
This is why the first item in fft(np.ones(10)) is 10, not 1. 1 is the mean (since it's an array of ones), and 10 is the sum.
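A quick check with the signal from the question, as a sketch: the zero-frequency term equals the sum, and dividing it by the number of samples gives the mean. The names `spectrum` and `dc` are illustrative.

```python
import numpy as np

ts = np.array([25, 40, 30, 40, 29, 48, 36, 32, 34, 38, 15, 33, 40, 32, 41,
               25, 37, 49, 41, 35, 23, 22, 36, 44, 28, 36, 32, 37, 39, 51])

spectrum = np.fft.fft(ts)
dc = spectrum[0].real  # zero-frequency term of the unnormalized FFT

print(dc, ts.sum())              # both equal the sum of the samples
print(dc / ts.size, ts.mean())   # dividing by N recovers the mean
```

Note that the sum is 1048 over 30 samples, so the actual mean of this signal is about 34.9 rather than 40.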

Extracting the indices of outliers in Linear Regression

The following script computes R-squared value between two numpy arrays(x and y).
The R-squared value is very low due to outliers in the data. How can I extract the indices of those outliers?
import numpy as np, matplotlib.pyplot as plt, scipy.stats as stats
x = np.random.randint(1, 51, 50)  # upper bound is exclusive: values in 1..50
y = np.random.randint(1, 51, 50)
r2 = stats.linregress(x, y).rvalue**2  # rvalue is the correlation coefficient
print(r2)
plt.scatter(x, y)
plt.show()
An outlier is defined as: value-mean > 2*standard deviation.
You can do this with the line
[i for i in range(len(x)) if (abs(x[i] - np.mean(x)) > 2*np.std(x))]
What it does:
A list is constructed from the indices of x, where the element at that index satisfies the condition described above.
A quick test:
x = np.random.randint(1, 51, 50)
this gives me the array:
array([16, 6, 13, 18, 21, 37, 31, 8, 1, 48, 4, 40, 9, 14, 6, 45, 20,
15, 14, 32, 30, 8, 19, 8, 34, 22, 49, 5, 22, 23, 39, 29, 37, 24,
45, 47, 21, 5, 4, 27, 48, 2, 22, 8, 12, 8, 49, 12, 15, 18])
Now I add some outliers manually as there are none initially:
x[4] = 200
x[15] = 178
Let's test:
[i for i in range(len(x)) if (abs(x[i] - np.mean(x)) > 2*np.std(x))]
result:
[4, 15]
Is this what you were looking for?
EDIT:
I added the abs() function in the line above, because without it, values far below the mean would never be flagged. The abs() function takes the absolute value of the deviation.
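The same rule can be written without an explicit Python loop. This is a sketch of a vectorized variant, not a change to the method. It reuses the test array shown above with the two planted outliers, and `deviation` and `outlier_idx` are illustrative names.

```python
import numpy as np

x = np.array([16, 6, 13, 18, 21, 37, 31, 8, 1, 48, 4, 40, 9, 14, 6, 45, 20,
              15, 14, 32, 30, 8, 19, 8, 34, 22, 49, 5, 22, 23, 39, 29, 37, 24,
              45, 47, 21, 5, 4, 27, 48, 2, 22, 8, 12, 8, 49, 12, 15, 18])
x[4] = 200   # planted outliers, as in the test above
x[15] = 178

# Absolute deviation from the mean, compared against two standard deviations.
deviation = np.abs(x - x.mean())
outlier_idx = np.flatnonzero(deviation > 2 * x.std())
print(outlier_idx)
```

Computing the mean and standard deviation once, instead of once per element inside the list comprehension, also avoids repeated work on large arrays.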
I think Sander's approach is the correct one, but if you must see R2 without those outliers before making a decision, here is a way to do it.
Setup data and introduce outlier:
In [1]:
import numpy as np, scipy.stats as stats
np.random.seed(123)
x = np.random.random_integers(1,50,50)
y = np.random.random_integers(1,50,50)
y[5] = 100
Calculate R2 taking out one y value at a time (along with matching x value):
m = np.eye(y.shape[0])
r2 = np.apply_along_axis(lambda a: stats.linregress(np.delete(x, a.argmax()), np.delete(y, a.argmax()))[3]**2, 0, m)
Get index of the biggest outlier:
r2.argmax()
Out[1]:
5
Get R2 when this outlier is taken out:
In [2]:
r2[r2.argmax()]
Out[2]:
0.85892084723588935
Get the value of the outlier:
In [3]:
y[r2.argmax()]
Out[3]:
100
To get top n outliers:
In [4]:
n = 5
sorted_index = r2.argsort()[::-1]
sorted_index[:n]
Out [4]:
array([ 5, 27, 34, 0, 17], dtype=int64)
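One thing to be aware of: in the `scipy.stats.linregress` result, index 2 (`rvalue`) is the correlation coefficient, while index 3 is the p-value. As a rough sketch of the same leave-one-out idea, the version below computes R2 from `np.corrcoef` instead. The synthetic data and the names `r_squared`, `loo_r2` and `worst` are assumptions made for illustration only.

```python
import numpy as np

# Synthetic data: a perfect line with one planted outlier at index 5.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0
y[5] = 100.0

def r_squared(a, b):
    """Squared Pearson correlation between a and b."""
    return np.corrcoef(a, b)[0, 1] ** 2

# R2 of the remaining points when sample i is left out, for every i.
loo_r2 = np.array([r_squared(np.delete(x, i), np.delete(y, i))
                   for i in range(x.size)])

# The sample whose removal raises R2 the most is the strongest outlier.
worst = int(loo_r2.argmax())
print(worst, loo_r2[worst])
```

Sorting `loo_r2` in descending order, as in the answer above, would rank the remaining candidates in the same way.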

Chart Axes in VB.NET

My requirement is to graph (scatter graph) data from 2 arrays. I can now connect the data from the array and use it on the chart. My question is, how do I set the graph's X- and Y- axes to show consistency in their intervals?
For example, I have points from X = {1, 3, 4, 6, 8, 9} and Y = {7, 10, 11, 15, 18, 19}. What I would like to see is that these points are graphed in a scatter manner, but, the intervals for x-axis should be (intervals of) 2 up to 10 (such that it will show 0, 2, 4, 6, 8, 10 on x-axis) and intervals of 5 for the y-axis (such that it will show 5, 10, 15, 20 on y-axis). What code/property should I use/manipulate?
ADDED PART:
I currently have this data:
x_column = {12, 24, 1, 7, 29, 28, 25, 24, 15, 19}
y_column = {3, 5, 8, 3, 3, 3, 3, 3, 19, 15}
Each y_column element is paired with the corresponding x_column element.
Now, I want MyChart to display a scatter graph of the x_column and y_column data in such a way that the x-axis will show 5, 10, 15, 20, 25, 30 and the y-axis will show 2, 4, 6, 8, 10, 12, 14, 16, 18, 20.
My current code is:
' add points
MyChart.Series("Scatter Plot").Points.DataBindXY(x_Column, y_Column)
The code above only adds points.
Try:
Chart1.ChartAreas("Default").AxisX.Interval = 2
Chart1.ChartAreas("Default").AxisY.Interval = 5