Explain np.polyfit and np.polyval for a scatter plot - numpy

I have to make a scatter plot and a linear fit of my data. prediction_08.Dem_Adv and prediction_08.Dem_Win are two columns of data. I know that np.polyfit returns the fit coefficients, but what is np.polyval doing here? I read the documentation, but the explanation is confusing. Can someone explain it clearly?
import numpy as np
import matplotlib.pyplot as plt

plt.plot(prediction_08.Dem_Adv, prediction_08.Dem_Win, 'o')
plt.xlabel("2008 Gallup Democrat Advantage")
plt.ylabel("2008 Election Democrat Win")
fit = np.polyfit(prediction_08.Dem_Adv, prediction_08.Dem_Win, 1)
x = np.linspace(-40, 80, 10)
y = np.polyval(fit, x)
plt.plot(x, y)
print(fit)

np.polyval evaluates the polynomial whose coefficients you got from np.polyfit. For a degree-1 fit the relationship is y = mx + c, so np.polyval multiplies your x values by fit[0] (the slope m) and adds fit[1] (the intercept c).
According to the docs, np.polyval(p, x) computes:
N = len(p)
y = p[0]*x**(N-1) + p[1]*x**(N-2) + ... + p[N-2]*x + p[N-1]
If the relationship is y = ax**2 + bx + c,
fit = np.polyfit(x,y,2)
a = fit[0]
b = fit[1]
c = fit[2]
If you do not want to use the polyval function, you can evaluate the polynomial directly:
y = a*(x**2) + b*(x) + c
This produces the same output as np.polyval(fit, x).
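To see the equivalence concretely, here is a small self-contained check (made-up quadratic data, not the election data above):

import numpy as np

# np.polyval(fit, x) and the hand-written polynomial give the same values.
x = np.linspace(-5, 5, 50)
y = 3 * x**2 - 2 * x + 1 + np.random.normal(scale=0.1, size=x.size)

fit = np.polyfit(x, y, 2)        # fit = [a, b, c], highest power first
a, b, c = fit

y_polyval = np.polyval(fit, x)
y_manual = a * x**2 + b * x + c

print(np.allclose(y_polyval, y_manual))   # True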

Related

Why convert coordinate on image before training yolo?

I understand how to convert a bounding box (x1, y1, x2, y2) to YOLO style (X, Y, W, H).
def convert(size, box):
    # size = (image_width, image_height); as indexed below, box = (xmin, xmax, ymin, ymax)
    dw = 1./size[0]
    dh = 1./size[1]
    x = (box[0] + box[1])/2.0
    y = (box[2] + box[3])/2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x*dw
    w = w*dw
    y = y*dh
    h = h*dh
    return (x,y,w,h)
But I don't understand the advantage of converting the coordinates. Even though I have read a lot of explanations, I have not found a clear answer to this question.
I'd appreciate your answer.
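For reference, here is what the function above returns for a concrete, made-up box, assuming (as the indexing implies) that size is (image_width, image_height) and box is (xmin, xmax, ymin, ymax):

# hypothetical 640x480 image, box from (100, 50) to (300, 250)
print(convert((640, 480), (100, 300, 50, 250)))
# (0.3125, 0.3125, 0.3125, 0.4166...)  -> centre x, centre y, width, height,
# each expressed as a fraction of the image size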

Let A,B, C be fad. Consider the equation X = AX + BX + C. Must a solution X be fad?

Could you help me solve this question?
Here, "fad" means a regular language (one definable by a finite automaton).
Assume juxtaposition (AX) means concatenation, and + means union. Then let A = B = {e} and C = {}, the FAD languages containing only the empty string and nothing at all, respectively, and let X be any non-FAD language. Clearly, the equation X = AX + BX + C holds, since AX = X, BX = X, and X + X + {} = X.
Here are FAs for {e} and {} (proof, if desired, is left as an exercise):
               /-\
--->[q0]-s->q1 | s
               \-/

     /-\
--->q0 | s
     \-/
If juxtaposition and union mean something else, the answer may change. For instance, it's possible that + means concatenation, but then I don't know what to make of juxtaposition (union? intersection?).

How to plot 4-D data embedded in a dataframe in Julia using a subplots approach?

I have a Julia DataFrame where the first 4 columns are dimensions and the 5th one contains the actual data.
I would like to plot it using a subplots approach where the two main plot axis concern the first two dimensions and each subplot then is a contour plot over the remaining two dimensions.
I am almost there with the code below:
using DataFrames, Plots
# plotlyjs() # doesn't work with the plotlyjs backend
pyplot()
X = [1,2,3,4]
Y = [0.1,0.15,0.2]
I = [2,4,6,8,10,12,14]
J = [10,20,30,40,50,60]
df = DataFrame(X=Int64[], Y=Float64[], I=Float64[], J=Float64[], V=Float64[] )
[push!(df,[x,y,i,j,(5*x+20*y+2)*(0.2*i^2+0.5*j^2+3*i*j+2*i^2*j+1)]) for x in X, y in Y, i in I, j in J]
minvalue = minimum(df[:V])
maxvalue = maximum(df[:V])

# Build a Dict keyed by the four dimension values so the data can be
# rearranged into a 4-D array.
function toDict(df, dimCols, valueCol)
    toReturn = Dict()
    for r in eachrow(df)
        keyValues = []
        [push!(keyValues,r[d]) for d in dimCols]
        toReturn[(keyValues...)] = r[valueCol]
    end
    return toReturn
end

dict = toDict(df, [:X,:Y,:I,:J], :V )
M = [dict[(x,y,i,j)] for j in J, i in I, y in Y, x in X ]
yL = length(Y)
xL = length(X)
plot(contour(M[:,:,3,1], ylabel="y = $(string(Y[3]))", zlims=(minvalue,maxvalue)), contour(M[:,:,3,2]), contour(M[:,:,3,3]), contour(M[:,:,3,4]),
     contour(M[:,:,2,1], ylabel="y = $(string(Y[2]))", zlims=(minvalue,maxvalue)), contour(M[:,:,2,2]), contour(M[:,:,2,3]), contour(M[:,:,2,4]),
     contour(M[:,:,1,1], ylabel="y = $(string(Y[1]))", xlabel="x = $(string(X[1]))"), contour(M[:,:,1,2], xlabel="x = $(string(X[2]))"), contour(M[:,:,1,3], xlabel="x = $(string(X[3]))"), contour(M[:,:,1,4], xlabel="x = $(string(X[4]))"),
     layout=(yL,xL) )
This produces a 3×4 grid of contour subplots (figure omitted).
However, the following concerns remain:
How do I automate the creation of each subplot in the plot() call? Do I need to write a macro?
I would like each subplot to have the same limits on the z axis, but zlims seems not to work. Is zlims not yet supported?
How do I hide the z-axis legend (colour bar) on each subplot and instead draw a single one separately, ideally on the right side of the overall plot?
EDIT:
For the first point I don't need a macro: I can create the subplots in a for loop, collect them in an array and pass the array to the plot() call using the splat (...) operator:
plots = []
for y in length(Y):-1:1
    for x in 1:length(X)
        xlabel = y == 1 ? "x = $(string(X[x]))" : ""
        ylabel = x == 1 ? "y = $(string(Y[y]))" : ""
        println("$y - $x")
        plot = contour(I,J,M[:,:,y,x], xlabel=xlabel, ylabel=ylabel, zlims=(minvalue,maxvalue))
        push!(plots,plot)
    end
end
plot(plots..., layout=(yL,xL))

Flatten out y-matrix and repeat x-vector

So I have a vector x and a matrix y where y[i] = [f(x[i]), f(x[i]), f(x[i]), ...] is a row of repeated experimental values of f at x[i]. I need to flatten y so that I end up with two vectors satisfying y[i] = f(x[i]). Here's what I'm using now:
x = np.ravel([[xx]*y.shape[1] for xx in x]); y = np.ravel(y)
Is there a cleaner/faster way?
You could just use np.repeat -
x = x.repeat(y.shape[1])
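As a quick check (with made-up shapes, since the real x and y are not shown), both constructions give the same flattened x:

import numpy as np

# illustrative shapes only: 5 x-values, 3 repeated measurements each
x = np.linspace(0, 1, 5)
y = np.random.rand(5, 3)          # y[i] holds the repeated measurements f(x[i])

x_old = np.ravel([[xx] * y.shape[1] for xx in x])
x_new = x.repeat(y.shape[1])

print(np.array_equal(x_old, x_new))   # True
y_flat = np.ravel(y)                  # y is still flattened row by row, as before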

Iterating over multidimensional Numpy array

What is the fastest way to iterate over all elements in a 3D NumPy array? If array.shape = (r,c,z), there must be something faster than this:
x = np.asarray(range(12)).reshape((1,4,3))

# function that sums nearest neighbor values
# e is my element location, d is the distance
def nn(arr, e, d=1):
    d = e[0]
    r = e[1]
    c = e[2]
    return sum(arr[d,r-1,c-1:c+2]) + sum(arr[d,r+1, c-1:c+2]) + sum(arr[d,r,c-1]) + sum(arr[d,r,c+1])
Instead of creating a nested for loop like the one below to generate the values of e and run nn for each pixel:
for dim in range(z):
    for row in range(r):
        for col in range(c):
            e = (dim, row, col)
I'd like to vectorize my nn function in a way that extracts location information for each element (e = (0,1,1) for example) and iterates over ALL elements in my matrix, without having to manually input each location value of e or create a messy nested for loop. I'm not sure how to apply np.vectorize to this problem. Thanks!
It is easy to vectorize over the d dimension:
def nn(arr, e):
    r,c = e  # (e[0],e[1])
    return (np.sum(arr[:,r-1,c-1:c+2],axis=2) + np.sum(arr[:,r+1,c-1:c+2],axis=2) +
            np.sum(arr[:,r,c-1],axis=?) + np.sum(arr[:,r,c+1],axis=?))
Now just iterate over the row and col dimensions, returning a vector that is assigned to the appropriate slot in x.
for row in <correct range>:
    for col in <correct range>:
        x[:,row,col] = nn(data, (row,col))
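For concreteness, one runnable way this sketch might be filled in (my assumptions: interior cells only, made-up data, and axis=1 for the (z, 3) neighbour slices):

import numpy as np

def nn(arr, e):
    r, c = e
    return (np.sum(arr[:, r - 1, c - 1:c + 2], axis=1)    # three cells above
            + np.sum(arr[:, r + 1, c - 1:c + 2], axis=1)  # three cells below
            + arr[:, r, c - 1] + arr[:, r, c + 1])        # left and right

data = np.arange(2 * 4 * 5, dtype=float).reshape(2, 4, 5)
out = np.zeros_like(data)
for row in range(1, data.shape[1] - 1):
    for col in range(1, data.shape[2] - 1):
        out[:, row, col] = nn(data, (row, col))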
The next step is to make rows and cols broadcastable index arrays, something like
rows = <row indices>[:,None]
cols = <col indices>
arr[:,rows-1,cols+2] + arr[:,rows,cols+2] etc.
so that the neighbour sums for all interior cells are computed at once.
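A hedged, self-contained sketch of that broadcast step, again with made-up data and interior cells only:

import numpy as np

data = np.arange(2 * 4 * 5, dtype=float).reshape(2, 4, 5)

# rows is a column vector and cols a row vector, so expressions like rows - 1
# and cols + 1 broadcast to a full grid of neighbour indices for interior cells.
rows = np.arange(1, data.shape[1] - 1)[:, None]
cols = np.arange(1, data.shape[2] - 1)

out = np.zeros_like(data)
out[:, rows, cols] = (data[:, rows - 1, cols - 1] + data[:, rows - 1, cols] + data[:, rows - 1, cols + 1]
                      + data[:, rows + 1, cols - 1] + data[:, rows + 1, cols] + data[:, rows + 1, cols + 1]
                      + data[:, rows, cols - 1] + data[:, rows, cols + 1])
# out now matches the loop-based result from the previous snippet.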
This kind of problem has come up many times, under various descriptions: convolution, smoothing, filtering, etc.
We could do some searches to find the best, or if you prefer, we could guide you through the steps.
Converting a nested loop calculation to Numpy for speedup
is a question similar to yours. There are only two levels of looping, and the sum expression is different, but I think it has the same issues:
for h in xrange(1, height-1):
    for w in xrange(1, width-1):
        new_gr[h][w] = (gr[h][w] + gr[h][w-1] + gr[h-1][w] +
                        t * gr[h+1][w-1] - 2 * (gr[h][w-1] + t * gr[h-1][w]))
Here's what I ended up doing. Since I'm returning the xv vector and slipping it into the larger 3D array lag, this should speed up the process, right? data is my input dataset.
def nn3d(arr, e):
    r,c = e
    n = np.copy(arr[:,r-1:r+2,c-1:c+2])   # 3x3 neighbourhood in every d-plane
    n[:,1,1] = 0                          # zero out the centre cell
    n3d = np.ma.masked_where(n == nodata, n)
    xv = np.zeros(arr.shape[0])
    for d in range(arr.shape[0]):
        if np.ma.count(n3d[d,:,:]) < 2:
            element = nodata
        else:
            # mean of the valid neighbours (count - 1 excludes the zeroed centre)
            element = np.sum(n3d[d,:,:])/(np.ma.count(n3d[d,:,:])-1)
        xv[d] = element
    return xv

lag = np.zeros(shape = data.shape)
for r in range(1,data.shape[1]-1): #boundary effects
    for c in range(1,data.shape[2]-1):
        lag[:,r,c] = nn3d(data,(r,c))
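(A hedged aside: the inner loop over d could likely be replaced by masked-array reductions. The sketch below mirrors nn3d but is untested against the real data; nodata is just an assumed sentinel value, not the real one from the dataset.)

import numpy as np

def nn3d_vec(arr, e, nodata=-9999.0):
    # same logic as nn3d above, but the per-plane loop is replaced by reductions
    r, c = e
    n = np.copy(arr[:, r-1:r+2, c-1:c+2])
    n[:, 1, 1] = 0                               # ignore the centre cell
    n3d = np.ma.masked_where(n == nodata, n)
    flat = n3d.reshape(n3d.shape[0], -1)         # one row of 9 cells per d-plane
    counts = flat.count(axis=1)                  # valid cells per plane
    means = flat.sum(axis=1) / np.maximum(counts - 1, 1)
    return np.ma.filled(np.ma.where(counts < 2, nodata, means), nodata)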
What you are looking for is probably np.nditer:
a = np.arange(6).reshape(2,3)
for x in np.nditer(a):
    print(x, end=' ')
which prints
0 1 2 3 4 5
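If the location of each element is needed as well (like the e tuples above), np.ndenumerate yields (index, value) pairs:

import numpy as np

a = np.arange(12).reshape(1, 4, 3)
for idx, val in np.ndenumerate(a):
    print(idx, val)    # e.g. (0, 1, 1) 4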