Move for loop into numpy single expression when calling polyfit - numpy

Fairly new to numpy/python here, trying to figure out some less c-like, more numpy-like coding styles.
Background
I have some code that takes a fixed set of x values and several corresponding sets of y values, and tries to find which set of y values is the "most linear".
It loops over each set of y values, calculating and storing the residual from a straight-line fit of those y's against the x's; once the loop has finished, it finds the index of the minimum residual.
This might make more sense with the code below.
import numpy as np
import numpy.polynomial.polynomial as poly
# set of x values
xs = [1,22,33,54]
# multiple sets of y values for each of the x values in 'xs'
ys = np.array([[1, 22, 3, 4],
[2, 3, 1, 5],
[3, 2, 1, 1],
[34,23, 5, 4],
[23,24,29,33],
[5,19, 12, 3]])
# array to store the residual from a linear fit of each of the y's against x
residuals = np.empty(ys.shape[0])
# loop through the sets of ys and calculate the residual of a linear fit for each
for i in range(ys.shape[0]):
    _, stats = poly.polyfit(xs, ys[i], 1, full=True)
    residuals[i] = stats[0][0]
# the 'most linear' of the ys's is at np.argmin:
print('most linear at', np.argmin(residuals))
Question
I'd like to know if it's possible to "numpy'ize" that into a single expression, something like
residuals = get_residuals(xs, ys)
I've tried the following, but no luck (it always passes the full arrays in, not row by row):
# ------ ok try to do it without a loop --------
def wrap(x, y):
    _, stats = poly.polyfit(x, y, 1, full=True)
    return stats[0][0]
res = wrap(xs, ys) # <- fails as passes ys as full 2D array
res = wrap(np.broadcast_to(xs, ys.shape), ys) # <- fails as passes both as 2D arrays
Could anyone give any tips on how to numpy'ize that?

From the numpy.polynomial.polynomial.polyfit docs (not to be confused with numpy.polyfit, which is not interchangeable):
x : array_like, shape (M,)
y : array_like, shape (M,) or (M, K)
Your ys needs to be transposed so that its first dimension matches the length of xs:
def wrap(x, y):
    _, stats = poly.polyfit(x, y.T, 1, full=True)
    return stats[0]
res = wrap(xs, ys)
res
Out[]: array([284.57337884, 5.54709898, 0.41399317, 91.44641638,
6.34982935, 153.03515358])
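As a quick sanity check, the transposed call can be compared against the original loop. This is a minimal sketch: get_residuals is the name proposed in the question, while loop_residuals and most_linear are illustrative names added here.

```python
import numpy as np
import numpy.polynomial.polynomial as poly

xs = np.array([1, 22, 33, 54])
ys = np.array([[1, 22, 3, 4],
               [2, 3, 1, 5],
               [3, 2, 1, 1],
               [34, 23, 5, 4],
               [23, 24, 29, 33],
               [5, 19, 12, 3]])


def get_residuals(x, y):
    # polyfit treats each COLUMN of a 2D y as a separate data set,
    # so the rows of ys are passed in transposed
    _, stats = poly.polyfit(x, y.T, 1, full=True)
    return stats[0]  # one sum of squared residuals per data set


residuals = get_residuals(xs, ys)

# reference values from the explicit row-by-row loop
loop_residuals = np.array(
    [poly.polyfit(xs, row, 1, full=True)[1][0][0] for row in ys])

most_linear = int(np.argmin(residuals))
print('most linear at', most_linear)
```

Both approaches should agree to floating-point precision, so np.argmin can be applied to either result.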

Related

Passing function arguments to list in Optimal Control Application

My intention is to pass the initial guess as an argument of a function instead of defining it directly in the body of the code.
1) Is there a way to do this without getting: TypeError: cannot unpack non-iterable int object
My additional goal is to use this function to iterate over different initial guesses (which also contain floats), for example by defining:
initial_guess = [8, 0.1], [9, 0.1], [10, 0.1], [11, 0.1] and doing:
for i in initial_guess:
    ...
    ...
    result1 = opt.solve_ocp(
        vehicle, horizon, x0, quad_cost, initial_guess[i], log=True,
        minimize_method='trust-constr',
        minimize_options={'finite_diff_rel_step': 0.01},
    )
    ...
    ...
    return(t1, y1, u1)
2) Is there a way to iterate over various floating-point values for the initial_guess list?
Please note that the optimal control function ocp takes initial_guess as a list of the form initial_guess = [f, g], where f and g are floats or integers.
# Set up the cost functions
Q = np.diag([20, 20, 0.01]) # keep lateral error low
R = np.diag([10, 10]) # minimize applied inputs
quad_cost = opt.quadratic_cost(vehicle, Q, R, x0=xf, u0=uf)
# Define the time horizon (and spacing) for the optimization
horizon = np.linspace(0, Tf, Tf, endpoint=True)
# Provide an initial guess (will be extended to entire horizon)
#bend_left = [8, 0.01] # slight left veer
########################################################################################################################
def Approach1(Velocity_guess, Steer_guess):
    # Turn on debug level logging so that we can see what the optimizer is doing
    logging.basicConfig(
        level=logging.DEBUG, filename="steering-integral_cost.log",
        filemode='w', force=True)
    #constraints = [ opt.input_range_constraint(vehicle, [8, -0.1], [12, 0.1]) ]
    initial_guess = [Velocity_guess, Steer_guess]
    # Compute the optimal control, setting step size for gradient calculation (eps)
    start_time = time.process_time()
    result1 = opt.solve_ocp(
        vehicle, horizon, x0, quad_cost, initial_guess, log=True,
        minimize_method='trust-constr',
        minimize_options={'finite_diff_rel_step': 0.01},
    )
    print("* Total time = %5g seconds\n" % (time.process_time() - start_time))
    # If we are running CI tests, make sure we succeeded
    if 'PYCONTROL_TEST_EXAMPLES' in os.environ:
        assert result1.success
    # Extract and plot the results (+ state trajectory)
    t1, u1 = result1.time, result1.inputs
    t1, y1 = ct.input_output_response(vehicle, horizon, u1, x0)
    Final_x_deviation = xf[0] - y1[0][len(y1[0])-1]
    Final_y_deviation = xf[1] - y1[1][len(y1[1])-1]
    V_variation = uf[0] - u1[0][len(u1[0])-1]
    Angle_Variation = uf[1] - u1[1][len(u1[1])-1]
    plot_results(t1, y1, u1, xf, uf, Tf, yf=xf[0:2])
    return(t1, u1, y1)
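One way to iterate over several guesses is to keep them in a list of [velocity, steer] pairs and unpack each pair in the loop header. The sketch below is only an illustration: approach1 is a hypothetical stand-in that returns the guess it would pass on, so that the example runs without the control library, and it does not reproduce the optimizer call.

```python
def approach1(velocity_guess, steer_guess):
    # Hypothetical stand-in for the question's Approach1: it only builds the
    # two-element list that would be handed to the solver as the initial guess.
    initial_guess = [velocity_guess, steer_guess]
    return initial_guess


# A list of pairs. Looping over a single flat pair such as [8, 0.1] and
# unpacking each element is one way to hit "cannot unpack non-iterable int
# object", because each element is then a bare number.
initial_guesses = [[8, 0.1], [9, 0.1], [10, 0.1], [11, 0.1]]

results = []
for velocity, steer in initial_guesses:
    # "velocity, steer" is already the pair itself, so no indexing such as
    # initial_guesses[i] is needed inside the loop
    results.append(approach1(velocity, steer))

print(results)
```

Note that in `for i in initial_guess`, the loop variable i is the element itself (a pair), not an integer index, so `initial_guess[i]` is not a valid lookup.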

In Tensorflow, is there a built in function to compute states over time given a transition matrix?

I have a system given by the recursive relationship x_t = A_t x_(t-1) + b_t. I wish to compute x_t for all t, with A_t, b_t and x_0 given. Is there a built-in function for that? If I use a loop it would be extremely slow. Thanks!
There is sort of a way. Let's say you have your A matrices in a 3D tensor with shape (T, N, N), where T is the total number of time steps and N is the size of your vector. Similarly, B values are in a 2D tensor (T, N). The first step in the computation would be:
x1 = A[0] # x0 + B[0]
Where # represents matrix product. But you can convert this into a single matrix product. Suppose we add a value 1 at the end of x0, and we call that x0p (for prime):
x0p = tf.concat([x0, [1]], axis=0)
And now we build a new 3D tensor Ap with shape (T, N+1, N+1), such that for each A[i] we concatenate B[i] as a new column, and then we add a row with N zeros and a single one at the end:
AwithB = tf.concat([A, tf.expand_dims(B, 2)], axis=2)
AnewRow = tf.concat([tf.zeros((T, 1, N), A.dtype), tf.ones((T, 1, 1), A.dtype)], axis=2)
Ap = tf.concat([AwithB, AnewRow], axis=1)
As it turns out, you can now say:
x1p = Ap[0] # x0p
And therefore:
x2p = Ap[1] # x1p = Ap[1] # Ap[0] # x0p
So we just need to compute the matrix product of all matrices in Ap across the first dimension, with later time steps multiplied on the left. Unfortunately, there does not seem to be a direct operation to compute that with TensorFlow, but you can do it relatively fast with tf.scan:
Ap_prod = tf.scan(lambda acc, m: tf.matmul(m, acc), Ap)[-1]
And with that you just have to do:
xtp = Ap_prod # x0p
Here is a proof of concept (the code is tweaked to support single examples and batches, either in the A and B values or in the x):
import tensorflow as tf
def compute_state(a, b, x):
    # Add final 1 to x
    xp = tf.concat([x, tf.ones_like(x[..., :1])], axis=-1)
    # Add B column to A
    a_b = tf.concat([a, tf.expand_dims(b, axis=-1)], axis=-1)
    # Make new final row for A
    a_row = tf.concat([tf.zeros_like(a[..., :1, :]),
                       tf.ones_like(a[..., :1, :1])], axis=-1)
    # Add new row to A
    ap = tf.concat([a_b, a_row], axis=-2)
    # Move the time axis to the front, because tf.scan iterates over axis 0
    r = ap.shape.rank
    ap = tf.transpose(ap, [r - 3] + list(range(r - 3)) + [r - 2, r - 1])
    # Compute matrix product reduction (later steps multiply from the left)
    ap_prod = tf.scan(lambda acc, m: tf.matmul(m, acc), ap)[-1]
    # Compute final result
    outp = tf.linalg.matvec(ap_prod, xp)
    return outp[..., :-1]
#Test
tf.random.set_seed(0)
a = tf.random.uniform((10, 5, 5), -1, 1)
b = tf.random.uniform((10, 5), -1, 1)
x = tf.random.uniform((5,), -1, 1)
y = compute_state(a, b, x)
# Also works with batches of (a, b) or x
a = tf.random.uniform((100, 10, 5, 5), -1, 1)
b = tf.random.uniform((100, 10, 5), -1, 1)
x = tf.random.uniform((100, 5), -1, 1)
y = compute_state(a, b, x)
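The augmented-matrix idea can be checked independently of TensorFlow. The following is a minimal NumPy sketch, assuming each step is x_t = A[t] x_(t-1) + B[t]; the variable names and the random test data are illustrative only. It compares the plain recursion with the single product of augmented matrices.

```python
from functools import reduce

import numpy as np

rng = np.random.default_rng(0)
T, N = 10, 5
A = rng.uniform(-1, 1, (T, N, N))
B = rng.uniform(-1, 1, (T, N))
x0 = rng.uniform(-1, 1, N)

# Reference: apply the recursion one step at a time
x_loop = x0.copy()
for t in range(T):
    x_loop = A[t] @ x_loop + B[t]

# Augmented matrices: [[A_t, b_t], [0, 1]] acting on [x, 1]
Ap = np.zeros((T, N + 1, N + 1))
Ap[:, :N, :N] = A
Ap[:, :N, N] = B
Ap[:, N, N] = 1.0

# Later steps multiply from the LEFT: Ap[T-1] @ ... @ Ap[1] @ Ap[0]
Ap_prod = reduce(lambda acc, m: m @ acc, Ap)

x0p = np.append(x0, 1.0)
x_aug = (Ap_prod @ x0p)[:N]

print(np.allclose(x_loop, x_aug))
```

The order of the product matters: multiplying the matrices in the opposite order generally gives a different state.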

Slow computation on google colab while solving partial differential equation

I'm using Google Colab to solve the homogeneous heat equation. I had earlier written a program with scipy using sparse matrices, which worked up to N = 10 (a hyperparameter), but I need to run it for N = 4... 1000, and that won't work on my PC. I therefore converted the code to TensorFlow, where I'm unable to use sparse matrices like I could in sympy, and even the GPU/TPU computation is slower than on my PC. The problems I'm facing in the code, and need solutions for, are:
1) tf.contrib has been removed, so I have to use an older version of TensorFlow for the odeint function. Where is it in 2.0?
2) It would be good if the computation could be done with sparse matrices, since the matrices are tridiagonal. I know about the sparse_dense_mul() function, but that returns a dense tensor and wouldn't do the job. The "func" function applies time-independent boundary conditions and then requires multiplication of an (n x n) matrix with an (n x 1) vector, which gives an (n x 1) vector, for multiple matrices.
Also, the program was running faster before I created the class.
It also gives this warning:
WARNING: Logging before flag parsing goes to stderr.
W0829 09:12:24.415445 139855355791232 lazy_loader.py:50]
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
W0829 09:12:24.645356 139855355791232 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/contrib/integrate/python/ops/odes.py:233: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
When I run the code with the loop over range(2, 10), tqdm does not display and the cell keeps running forever, but it works fine for range(2, 5) and the tqdm bar does appear.
#find a way to use sparse matrices
class Heat:
    def __init__(self, N):
        self.N = N
        self.H = 1/N
        self.A = ts.to_dense(ts.SparseTensor(indices=[[0, 0], [0, 1]] + \
            [[i, i+j] for i in range(1, N) for j in [-1, 0, 1]] +[[N, N-1], [N, N]],
            values=self.H*np.array([1/3, 1/6] + [1/6, 2/3, 1/6]*(N-1) + [1/6, 1/3], dtype=np.float32),
            dense_shape=(N+1, N+1 )))
        self.D = ts.to_dense(ts.SparseTensor(indices=[[0, 0], [0, 1]] + [[i, i+j] \
            for i in range(1, N) for j in [-1, 0, 1]] +[[N, N-1], [N, N]],
            values=N*np.array([1-(1), -1 -(-1)] + [-1, 2, -1]*(N-1) + [-1-(-1), 1-(1)], dtype=np.float32),
            dense_shape=(N+1, N+1)))
        self.domain = tf.linspace(0.0, 1.0, N+1)
        def f(k):
            if k == 0:
                return (1 + math.pi**2)*(math.pi*self.H - math.sin(math.pi*self.H))/(math.pi**2*self.H)
            elif k == N:
                return -(1 + math.pi**2)*(-math.pi*self.H + math.sin(math.pi*self.H))/(math.pi**2*self.H)
            else:
                return -2*(1 + math.pi**2)*(math.cos(math.pi*self.H) - 1)*math.sin(math.pi*self.H*k)/(math.pi**2*self.H)
        self.F = tf.constant([f(k) for k in range(N+1)], shape=(N+1,), dtype=tf.float32) #caution! shape changed caution caution 1, N+1(problem) is different from N+1,
        self.exact = tm.scalar_mul(scalar=np.exp(1), x=tf.sin(math.pi*self.domain))
    def error(self):
        return np.linalg.norm(self.exact.numpy() - self.approx, 2)
    def func (self, y, t):
        y = tf.Variable(y)
        y = y[0].assign(0.0)
        y = y[self.N].assign(0.0)
        if self.N**2> 100:
            y_dash = tl.matvec(tf.linalg.inv(self.A), tl.matvec(a=tm.negative(self.D), b=y, a_is_sparse=True) + tm.scalar_mul(scalar=math.exp(t), x=self.F)) #caution! shape changed F is (1, N+1) others too
        else:
            y_dash = tl.matvec(tf.linalg.inv(self.A), tl.matvec(a=tm.negative(self.D), b=y) + tm.scalar_mul(scalar=math.exp(t), x=self.F)) #caution! shape changed F is (1, N+1) others too
        y_dash = tf.Variable(y_dash) #!!y_dash performs Hadamard product like multiplication not matrix-like multiplication;returns 2-D
        y_dash = y_dash[0].assign(0.0)
        y_dash = y_dash[self.N].assign(0.0)
        return y_dash
    def algo_1(self):
        self.approx = tf.contrib.integrate.odeint(
            func=self.func,
            y0=tf.sin(tm.scalar_mul(scalar=math.pi, x=self.domain)),
            t=tf.constant([0.0, 1.0]),
            rtol=1e-06,
            atol=1e-12,
            method='dopri5',
            options={"max_num_steps":10**10},
            full_output=False,
            name=None
        ).numpy()[1]
    def algo_2(self):
        self.approx = tf.contrib.integrate.odeint_fixed(
            func=self.func,
            y0=tf.sin(tm.scalar_mul(scalar=math.pi, x=self.domain)),
            t=tf.constant([0.0, 1.0]),
            dt=tf.constant([self.H**2], dtype=tf.float32),
            method='rk4',
            name=None
        ).numpy()[1]
df = pd.DataFrame(columns=["NumBasis", "Errors"])
Ns = [2**r for r in range(2, 10)]
l =[]
for i in tqdm_notebook(Ns):
    heateqn = Heat(i)
    heateqn.algo_1()
    l.append([i, heateqn.error()])
    df.append({"NumBasis":i, "Errors":heateqn.error()}, ignore_index=True)
    tf.keras.backend.clear_session()
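Because the matrices in the question are tridiagonal, the matrix-vector product can in principle be computed from the three diagonals alone, without building an (N+1) x (N+1) dense matrix. The sketch below illustrates this in plain NumPy rather than TensorFlow; tridiag_matvec is a hypothetical helper name, and the diagonal values are taken from the question's matrix A.

```python
import numpy as np


def tridiag_matvec(lower, main, upper, v):
    """Multiply a tridiagonal matrix, given by its three diagonals, with v."""
    out = main * v
    out[1:] += lower * v[:-1]   # sub-diagonal contribution
    out[:-1] += upper * v[1:]   # super-diagonal contribution
    return out


N = 8
H = 1.0 / N
# Diagonals of the question's matrix A
main = H * np.array([1/3] + [2/3] * (N - 1) + [1/3])
lower = H * np.full(N, 1/6)
upper = H * np.full(N, 1/6)

v = np.sin(np.pi * np.linspace(0.0, 1.0, N + 1))

# Dense reference, built only to check the result
A_dense = np.diag(main) + np.diag(lower, -1) + np.diag(upper, 1)
print(np.allclose(tridiag_matvec(lower, main, upper, v), A_dense @ v))
```

This costs O(N) work and memory per product, compared with O(N^2) for the dense form.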

Pairwise distance between a set of Matrices in Keras/Tensorflow

I want to calculate the pairwise distance between a set of tensors (e.g. 4 tensors). Each one is a 2D tensor. I don't know how to do this in a vectorized way. I wrote the following pseudocode to show what I need:
E.shape => [4,30,30]
sum = 0
for i in range(4):
    for j in range(4):
        res = calculate_distance(E[i],E[j]) # E[i] is one of the 30*30 tensors
        sum = sum + reduce_sum(res)
Here is my last try:
x_ = tf.expand_dims(E, 0)
y_ = tf.expand_dims(E, 1)
s = x_ - y_
P = tf.reduce_sum(tf.norm(s, axis=[-2, -1]))
This code works, but I don't know how to do this for a batch. For instance, when E.shape is [BATCH_SIZE, 4, 30, 30] my code doesn't work and an out-of-memory error occurs. How can I do this efficiently?
Edit: After a day, I found a solution. It's not perfect, but it works:
res = tf.map_fn(lambda x: tf.map_fn(lambda y: tf.map_fn(lambda z: tf.norm(z - x), x), x), E)
res = tf.reduce_mean(tf.square(res))
Your solution with expand_dims should be okay if your batch size is not too large. However, given that your original pseudo code loops over range(4), you should probably expand axes 1 and 2, instead of 0 and 1.
You can check the shape of the tensors to ensure that you're specifying the correct axes. For example,
batch_size = 8
E_np = np.random.rand(batch_size, 4, 30, 30)
E = K.variable(E_np) # shape=(8, 4, 30, 30)
x_ = K.expand_dims(E, 1)
y_ = K.expand_dims(E, 2)
s = x_ - y_ # shape=(8, 4, 4, 30, 30)
distances = tf.norm(s, axis=[-2, -1]) # shape=(8, 4, 4)
P = K.sum(distances, axis=[-2, -1]) # shape=(8,)
Now P will be the sum of pairwise distances between the 4 matrices for each of the 8 samples.
You can also verify that the values in P are the same as what would be computed in your pseudocode:
answer = []
for batch_idx in range(batch_size):
    s = 0
    for i in range(4):
        for j in range(4):
            a = E_np[batch_idx, i]
            b = E_np[batch_idx, j]
            s += np.sqrt(np.trace(np.dot(a - b, (a - b).T)))
    answer.append(s)
print(answer)
[149.45960605637578, 147.2815068236368, 144.97487402393705, 146.04866735065312, 144.25537059201062, 148.9300986019226, 146.61229889228133, 149.34259789169045]
print(K.eval(P).tolist())
[149.4595947265625, 147.281494140625, 144.97488403320312, 146.04867553710938, 144.25537109375, 148.9300994873047, 146.6123046875, 149.34259033203125]
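TensorFlow can compute the Frobenius norm via the tf.norm function; for 2D matrices this is the default (ord='euclidean'). Note that ord=1 over a pair of axes, as used below, gives the matrix 1-norm (the maximum absolute column sum) instead, which is a different quantity.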
The following solution isn't vectorized and assumes that the first dimension in E is known statically:
E = tf.random_normal(shape=[5, 3, 3], dtype=tf.float32)
F = tf.split(E, E.shape[0])
total = tf.reduce_sum([tf.norm(tensor=(lhs-rhs), ord=1, axis=(-2, -1)) for lhs in F for rhs in F])
Update:
An optimized vectorized version of the same code:
E = tf.random_normal(shape=[1024, 4, 30, 30], dtype=tf.float32)
lhs = tf.expand_dims(E, axis=1)
rhs = tf.expand_dims(E, axis=2)
total = tf.reduce_sum(tf.norm(tensor=(lhs - rhs), ord=1, axis=(-2, -1)))
Memory concerns: upon evaluating this code, tf.contrib.memory_stats.MaxBytesInUse() reports a peak memory consumption of 73729792 bytes (about 74 MB), which indicates relatively moderate overhead (the raw lhs-rhs tensor is about 59 MB). Your OOM is most likely caused by the duplication of the BATCH_SIZE dimension when you compute s = x_ - y_, because your batch size is much larger than the number of matrices (1024 vs 4).
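If the (batch, 4, 4, 30, 30) difference tensor is still too large, a different technique from the ones above is to expand the squared distance as ||a - b||^2 = ||a||^2 + ||b||^2 - 2<a, b>, which only needs a (batch, 4, 4) Gram matrix. The sketch below shows this in NumPy under the assumption that the distance is the Frobenius norm; all variable names are illustrative, and the same idea should carry over to TensorFlow with a batched matmul.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 8
E = rng.random((batch_size, 4, 30, 30))

# Flatten each 30x30 matrix into a 900-vector
flat = E.reshape(batch_size, 4, -1)

sq_norms = np.sum(flat ** 2, axis=-1)            # shape (8, 4)
gram = flat @ flat.transpose(0, 2, 1)            # shape (8, 4, 4)

# Squared pairwise distances; clip tiny negatives caused by round-off
d2 = sq_norms[:, :, None] + sq_norms[:, None, :] - 2.0 * gram
distances = np.sqrt(np.maximum(d2, 0.0))         # shape (8, 4, 4)
P = distances.sum(axis=(-2, -1))                 # shape (8,)

# Reference via explicit broadcasting of the differences
diff = E[:, :, None] - E[:, None, :]             # shape (8, 4, 4, 30, 30)
P_ref = np.linalg.norm(diff, axis=(-2, -1)).sum(axis=(-2, -1))

print(np.allclose(P, P_ref))
```

The trade-off is some loss of precision for nearly identical matrices, since the result is a difference of large, similar numbers.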

Row-wise Histogram

Given a 2-dimensional tensor t, what's the fastest way to compute a tensor h where
h[i, :] = tf.histogram_fixed_width(t[i, :], vals, nbins)
I.e. where tf.histogram_fixed_width is called per row of the input tensor t?
It seems that tf.histogram_fixed_width is missing an axis parameter that works like, e.g., tf.reduce_sum's axis parameter.
tf.histogram_fixed_width does indeed work on the entire tensor. You have to loop through the rows explicitly to compute the per-row histograms. Here is a complete working example using TensorFlow's tf.while_loop construct:
import tensorflow as tf
t = tf.random_uniform([2, 2])
i = 0
hist = tf.constant(0, shape=[0, 5], dtype=tf.int32)
def loop_body(i, hist):
    h = tf.histogram_fixed_width(t[i, :], [0.0, 1.0], nbins=5)
    return i+1, tf.concat_v2([hist, tf.expand_dims(h, 0)], axis=0)
i, hist = tf.while_loop(
    lambda i, _: i < 2, loop_body, [i, hist],
    shape_invariants=[tf.TensorShape([]), tf.TensorShape([None, 5])])
sess = tf.InteractiveSession()
print(hist.eval())
Inspired by keveman's answer and because the number of rows of t is fixed and rather small, I chose to use a combination of tf.gather to split rows and tf.pack to join rows. It looks simple and works, will see if it is efficient...
t_histo_rows = [
    tf.histogram_fixed_width(
        tf.gather(t, [row]),
        vals, nbins)
    for row in range(t_num_rows)]
t_histo = tf.pack(t_histo_rows, axis=0)
I would like to propose another implementation.
This implementation can also handle multi axes and unknown dimensions (batching).
def histogram(tensor, nbins=10, axis=None):
    value_range = [tf.reduce_min(tensor), tf.reduce_max(tensor)]
    if axis is None:
        return tf.histogram_fixed_width(tensor, value_range, nbins=nbins)
    else:
        if not hasattr(axis, "__len__"):
            axis = [axis]
        other_axis = [x for x in range(0, len(tensor.shape)) if x not in axis]
        swap = tf.transpose(tensor, [*other_axis, *axis])
        flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
        count = tf.map_fn(lambda x: tf.histogram_fixed_width(x, value_range, nbins=nbins), flat, dtype=(tf.int32))
        return tf.reshape(count, [*np.take([-1 if a is None else a for a in tensor.shape.as_list()], other_axis), nbins])
The only slow part here is tf.map_fn but it is still faster than the other solutions mentioned.
If someone knows an even faster implementation, please comment, since this operation is still very expensive.
The answers above are still slow when running on a GPU. Here is another option, which is faster (at least in my environment) but limited to values in the range 0~1 (you can normalize the values first). The train_equal_mask_nbin tensor can be defined once in advance.
def histogram_v3_nomask(tensor, nbins, row_num, col_num):
    #init mask
    equal_mask_list = []
    for i in range(nbins):
        equal_mask_list.append(tf.ones([row_num, col_num], dtype=tf.int32) * i)
    #[nbins, row, col]
    #[0, row, col] is tensor of shape [row, col] with all value 0
    #[1, row, col] is tensor of shape [row, col] with all value 1
    #....
    train_equal_mask_nbin = tf.stack(equal_mask_list, axis=0)
    #[inst, doc_len] float to int (split the floats equally into bins)
    int_input = tf.cast(tensor * (nbins), dtype=tf.int32)
    #input [row,col] -> copy N times, [nbins, row_num, col_num]
    int_input_nbin_copy = tf.reshape(tf.tile(int_input, [nbins, 1]), [nbins, row_num, col_num])
    #calculate histogram
    histogram = tf.transpose(tf.count_nonzero(tf.equal(train_equal_mask_nbin, int_input_nbin_copy), axis=2))
    return histogram
With the advent of tf.math.bincount, I believe the problem has become much simpler.
Something like this should work:
def hist_fixed_width(x, st, en, nbins):
    x = (x - st) / (en - st)
    x = tf.cast(x * nbins, dtype=tf.int32)
    x = tf.clip_by_value(x, 0, nbins - 1)
    return tf.math.bincount(x, minlength=nbins, axis=-1)
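The same binning arithmetic can be tried out in NumPy, which makes it easy to compare against np.histogram row by row. This is only a sketch of the idea, not the TensorFlow implementation: rowwise_hist is a hypothetical name, and since np.bincount has no axis argument, each row's bin indices are shifted by row * nbins before a single flat bincount.

```python
import numpy as np


def rowwise_hist(x, st, en, nbins):
    """Fixed-width histogram of every row of a 2D array over [st, en]."""
    # Map each value to a bin index, clipping out-of-range values to the ends
    idx = ((x - st) / (en - st) * nbins).astype(np.int64)
    idx = np.clip(idx, 0, nbins - 1)
    rows = x.shape[0]
    # Shift each row into its own block of nbins counters
    offsets = np.arange(rows)[:, None] * nbins
    counts = np.bincount((idx + offsets).ravel(), minlength=rows * nbins)
    return counts.reshape(rows, nbins)


rng = np.random.default_rng(0)
t = rng.random((3, 100))

h = rowwise_hist(t, 0.0, 1.0, 5)
expected = np.stack(
    [np.histogram(row, bins=5, range=(0.0, 1.0))[0] for row in t])

print(h)
print(np.array_equal(h, expected))
```

Each row of h sums to the number of columns in t, because out-of-range values are clipped into the first or last bin rather than dropped.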