I'm using emcee to draw samples from a given ln_prob twice, but both runs yield exactly the same samples.
I am using the same initial state for both samplers, but I don't see why it should matter.
Am I wrong in thinking that the two runs should yield different results?
import emcee
import numpy as np
NWALKERS = 32
NDIM = 2
NSAMPLES = 1000
def ln_gaussian(x):
    # mu = 0, cov = 1
    a = (2*np.pi)** -0.5
    return np.log(a * np.exp(-0.5 * np.dot(x,x)))
p0 = np.random.rand(NWALKERS, NDIM)
sampler1 = emcee.EnsembleSampler(NWALKERS, NDIM, ln_gaussian)
sampler2 = emcee.EnsembleSampler(NWALKERS, NDIM, ln_gaussian)
state1 = sampler1.run_mcmc(p0, 100) # burn in
state2 = sampler2.run_mcmc(p0, 100) # burn in
sampler1.reset()
sampler2.reset()
# run sampler 1k times (x32 walkers)
sampler1.run_mcmc(state1, NSAMPLES)
sampler2.run_mcmc(state2, NSAMPLES)
s1 = sampler1.get_chain(flat=True)
s2 = sampler2.get_chain(flat=True)
s1 - s2
The output is
array([[0., 0.],
[0., 0.],
[0., 0.],
...,
[0., 0.],
[0., 0.],
[0., 0.]])
If I use different initial states
p0 = np.random.rand(NWALKERS, NDIM)
p1 = np.random.rand(NWALKERS, NDIM)
it yields different samples
array([[-0.70474519, -0.09671908],
[-0.31555036, -0.33661664],
[ 0.75735537, 0.01540277],
...,
[ 2.84810783, -2.11736446],
[-0.55164227, -0.26478868],
[ 0.01301593, -1.76233017]])
But why should it matter? I thought it's random.
I have categories as a list of list integers as shown below:
categories = [
[0,2,4,6,8],
[1,3,5,7,9]
]
I have a label tensor y with num_batches integers (as classes):
y = tf.constant([0, 1, 1, 2, 5, 4, 7, 9, 3, 3])
I want to replace each value in y with the index of the category that contains it (here 0 for even, 1 for odd), so that the final result would be:
cat_labels = tf.constant([0, 1, 1, 0, 1, 0, 1, 1, 1, 1])
I can get it by iterating through each value in y like below:
cat_labels = tf.Variable(tf.identity(y))
for idx in range(len(categories)):
    for i, _y in enumerate(y):
        if _y in categories[idx]: # if _y value is in categories[idx]
            cat_labels[i].assign(idx) # replace all of them with idx
But apparently iterating is not allowed when this block is wrapped in a parent function decorated with @tf.function.
Is there a way to apply the logic without iterating, and without converting to NumPy and applying np.isin, while keeping the speedups of tf.function?
Edit: There seem to be workarounds on this like here, but any help on explaining in the context of this use case would be appreciated.
You can try this:
y = tf.constant([0., 1., 1., 2., 5., 4., 7., 9., 3., 3.], dtype=tf.float32)
categories = [[0,2,4,6,8],[1,3,5,7,9]]
c = tf.convert_to_tensor(categories, dtype=tf.float32)
cat_labels = tf.map_fn( # apply an operation on all of the elements of Y
lambda x:tf.gather_nd( # get index of category: 0 or 1 or anything else
tf.cast( # cast dtype of the result of the inner function
tf.where( # get index of the element of Y in categories
tf.equal(c, x)), # search an element of Y within categories
dtype=tf.float32),[0,0]), y)
tf.print(cat_labels, summarize=-1)
# [0 1 1 0 1 0 1 1 1 1]
I am trying to do a simple evaluation (i.e. a forward pass) for a learned LSTM model, and I cannot figure out in what order f_t, i_t, o_t, c_in can be extracted from z. It is my understanding that they are computed in bulk.
Here is the model architecture obtained using Keras:
My input sequence is:
input_seq = np.array([[[0.725323664],
[0.7671179],
[0.805884672]]])
The output should be:
[ 0.83467698]
Using Keras, I have obtained the following parameters for the first LSTM layer:
lstm_1_kernel_0 = np.array([[-0.40927699, -0.53539848, 0.40065038, -0.07722378, 0.30405849, 0.54959822, -0.23097005, 0.4720422, 0.05197877, -0.52746099, -0.5856396, -0.43691438]])
lstm_1_recurrent_kernel_0 = np.array([[-0.25504839, -0.0823682, 0.11609183, 0.41123426, 0.03409858, -0.0647027, -0.59183347, -0.15359771, 0.21647622, 0.24863823, 0.46169096, -0.21100986],
[0.29160395, 0.46513283, 0.33996364, -0.31195125, -0.24458826, -0.09762905, 0.16202784, -0.01602131, 0.34460208, 0.39724654, 0.31806156, 0.1102117],
[-0.15919448, -0.33053166, -0.22857222, -0.04912394, -0.21862955, 0.55346996, 0.38505834, 0.18110731, 0.270677, -0.02759281, 0.42814475, -0.13496138]])
lstm_1_bias_0 = np.array([0., 0., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0.])
# LSTM 1
z_1_lstm_1 = np.dot(x_1_lstm_1, lstm_1_kernel_0) + np.dot(h_0_lstm_1, lstm_1_recurrent_kernel_0) + lstm_1_bias_0
i_1_lstm_1 = z_1_lstm_1[0, 0:3]
f_1_lstm_1 = z_1_lstm_1[0, 3:6]
input_to_c_1_lstm_1 = z_1_lstm_1[0, 6:9]
o_1_lstm_1 = z_1_lstm_1[0, 9:12]
So the question is what is the correct order for i_1_lstm_1, f_1_lstm_1, input_to_c_1_lstm_1, o_1_lstm_1 ?
It's (i, f, c, o). In recurrent.py, in LSTMCell, the weights are constructed by:
self.kernel_i = self.kernel[:, :self.units]
self.kernel_f = self.kernel[:, self.units: self.units * 2]
self.kernel_c = self.kernel[:, self.units * 2: self.units * 3]
self.kernel_o = self.kernel[:, self.units * 3:]
self.recurrent_kernel_i = self.recurrent_kernel[:, :self.units]
self.recurrent_kernel_f = self.recurrent_kernel[:, self.units: self.units * 2]
self.recurrent_kernel_c = self.recurrent_kernel[:, self.units * 2: self.units * 3]
self.recurrent_kernel_o = self.recurrent_kernel[:, self.units * 3:]
if self.use_bias:
    self.bias_i = self.bias[:self.units]
    self.bias_f = self.bias[self.units: self.units * 2]
    self.bias_c = self.bias[self.units * 2: self.units * 3]
    self.bias_o = self.bias[self.units * 3:]
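To make the slicing concrete, here is a minimal NumPy sketch of a single LSTM step that uses the (i, f, c, o) order. It assumes a plain sigmoid for the three gates and tanh for the candidate and the output; older Keras versions default to hard_sigmoid for the recurrent activation, so the exact numbers may differ from the model in the question. The names sigmoid and lstm_step are hypothetical helpers for illustration, not part of Keras.

```python
import numpy as np

def sigmoid(z):
    # Assumed gate activation; swap in hard_sigmoid if the model uses it
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, kernel, recurrent_kernel, bias):
    """One LSTM time step; the 4*units columns are laid out as (i, f, c, o)."""
    units = h_prev.shape[-1]
    # All four pre-activations are computed in bulk, as in the question
    z = np.dot(x_t, kernel) + np.dot(h_prev, recurrent_kernel) + bias
    i = sigmoid(z[..., :units])                          # input gate
    f = sigmoid(z[..., units:2 * units])                 # forget gate
    c_candidate = np.tanh(z[..., 2 * units:3 * units])   # candidate cell state
    o = sigmoid(z[..., 3 * units:])                      # output gate
    c_t = f * c_prev + i * c_candidate
    h_t = o * np.tanh(c_t)
    return h_t, c_t
```

To evaluate a whole sequence, start from h and c equal to zeros of shape (1, units) and call lstm_step once per time step, feeding the returned h and c back in.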
The problem
I have a 1-dimensional numpy array filled mostly with zeros but also containing some groups of non-zero values.
>> import numpy as np
>> a = np.zeros(10)
>> a[2:4] = 2
>> a[6:9] = 3
>> print a
[ 0. 0. 2. 2. 0. 0. 3. 3. 3. 0.]
I want to get the array that contains only the last non-zero group. In other words, all but the last non-zero group should be replaced by zeros. (The groups could be only 1 element long). Like so:
[ 0. 0. 0. 0. 0. 0. 3. 3. 3. 0.]
Non-robust solution
This seems to do the trick. Reverse the array and find the first index where the change between elements is negative. Then replace all subsequent elements with zero. Then flip back. It's a bit long-winded:
>> b = a[::-1]
>> b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
>> c = b[::-1]
>> print c
[ 0. 0. 0. 0. 0. 0. 3. 3. 3. 0.]
Fails for a specific case
However, it is not robust and fails in the following case (because the where command returns an empty list of indices):
>> a = np.zeros(10)
>> a[0:4] = 2
>> print a
[ 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.]
>> b = a[::-1]
>> b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
>> c = b[::-1]
>> print c
Traceback (most recent call last):
File "<ipython-input-81-8cba57558ba8>", line 1, in <module>
runfile('C:/Users/name/test1.py', wdir='C:/Users/name')
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\ProgramData\Anaconda2\lib\site-packages\spyder\utils\site\sitecustomize.py", line 87, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "C:/Users/name/test1.py", line 21, in <module>
b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
IndexError: index 0 is out of bounds for axis 0 with size 0
Fix
So I need to introduce an if clause:
>> b = a[::-1]
>> if len(np.where(np.ediff1d(b) < 0)[0]) > 0:
>>     b[np.where(np.ediff1d(b) < 0)[0][0] + 1:] = 0
>> c = b[::-1]
>> print c
[ 2. 2. 2. 2. 0. 0. 0. 0. 0. 0.]
Is there a more elegant way to do it?
UPDATE
Following on from Divakar's excellent answer and mtrw's question, I would like to extend the specification. The method should also work if the input array has non-zero values that are negative and for groups of non-zero numbers that change within the grouping.
e.g. np.array([1, 0, 0, 4, 5, 4, 5, 0, 0])
This means methods where we check for a positive or negative difference between elements, in order to find the group boundaries, would not work so well.
Approach #1
Since we are after elegance, let's feed ourselves a one-liner -
a[:(a[1:] > a[:-1]).cumsum().argmax()] = 0
Sample run -
In [605]: a
Out[605]: array([ 0., 0., 2., 2., 0., 0., 3., 3., 3., 0.])
In [606]: a[:(a[1:] > a[:-1]).cumsum().argmax()] = 0
In [607]: a
Out[607]: array([ 0., 0., 0., 0., 0., 0., 3., 3., 3., 0.])
Approach #2
The approach above assumes that the values in the last group are greater than 0. If that's not the case, or if a non-zero group may contain different numbers, let's add one more line to get a generic solution -
mask = a != 0
a[:(mask[1:] > mask[:-1]).cumsum().argmax()] = 0
Sample run -
In [667]: a
Out[667]: array([-1, 0, 0, -4, -5, 4, -5, 0, 0])
In [668]: mask = a != 0
In [669]: a[:(mask[1:] > mask[:-1]).cumsum().argmax()] = 0
In [670]: a
Out[670]: array([ 0, 0, 0, -4, -5, 4, -5, 0, 0])
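For reuse, the mask-based idea can be wrapped in a small function. This is only a sketch: the name keep_last_nonzero_group is made up for illustration, and unlike the one-liner it works on a copy and guards against inputs too short to have any neighbouring pairs.

```python
import numpy as np

def keep_last_nonzero_group(a):
    """Return a copy of `a` with every non-zero group except the last set to 0."""
    out = np.array(a, copy=True)
    mask = out != 0
    # True at position k when element k is zero and element k+1 is non-zero,
    # i.e. a group starts right after k
    starts = mask[1:] > mask[:-1]
    # cumsum counts the group starts seen so far; argmax returns the first
    # position where that count reaches its maximum, which is the zero just
    # before the last group (or 0 if no group start was found)
    cut = starts.cumsum().argmax() if starts.size else 0
    out[:cut] = 0
    return out

print(keep_last_nonzero_group([0, 0, 2, 2, 0, 0, 3, 3, 3, 0]))
print(keep_last_nonzero_group([1, 0, 0, 4, 5, 4, 5, 0, 0]))
```

An array whose only group sits at the very start, such as the failing case from the question, comes back unchanged because no group start is detected and the cut index is 0.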
I need to populate two interdependent arrays simultaneously, based on their previous element, like so:
import numpy as np
a = np.zeros(100)
b = np.zeros(100)
c = np.random.random(100)
for num in range(1, len(a)):
    a[num] = b[num-1] + c[num]
    b[num] = b[num-1] + a[num]
Is there a way to truly vectorize this (i.e. not using numpy.vectorize) using numpy? Note that these are arbitrary arrays, not looking for a solution for these specific values.
As mentioned in @Praveen's post, we can write out those expressions for a few iterations to find the closed form, which for c is of course a lower-triangular matrix of powers of 2. Then we just need to add an iteratively scaled b[0] to get the full b. To get a, we simply add shifted versions of b and c.
So, implementation-wise here's a different take on it using some NumPy broadcasting and dot-product for efficiency purposes -
p = 2.0**np.arange(a.size-1)  # float powers of 2; integer 2**k would overflow int64 for k >= 63
scale1 = p[:,None]//p
b_out = np.append(b[0],scale1.dot(c[1:]) + 2*p*b[0])
a_out = np.append(a[0],b_out[:-1] + c[1:])
If a and b are always meant to start at 0, the code for the last two steps would simplify to -
b_out = np.append(0,scale1.dot(c[1:]))
a_out = np.append(0,b_out[:-1] + c[1:])
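As a sanity check on the closed form above, a minimal sketch can compare it against the plain loop from the question. The function names loop_version and closed_form are made up for illustration, and the powers of 2 are computed as floats so that sizes like 100 do not overflow an integer dtype.

```python
import numpy as np

def loop_version(c, a0=0.0, b0=0.0):
    """Reference implementation: the loop from the question."""
    n = len(c)
    a = np.zeros(n)
    b = np.zeros(n)
    a[0], b[0] = a0, b0
    for i in range(1, n):
        a[i] = b[i - 1] + c[i]
        b[i] = b[i - 1] + a[i]
    return a, b

def closed_form(c, a0=0.0, b0=0.0):
    """Broadcasting + dot-product version of the same recurrence."""
    n = len(c)
    p = 2.0 ** np.arange(n - 1)
    # Floor division gives 2**(i-j) on and below the diagonal, 0 above it
    scale = p[:, None] // p
    b = np.append(b0, scale.dot(c[1:]) + 2 * p * b0)
    a = np.append(a0, b[:-1] + c[1:])
    return a, b

c = np.random.default_rng(0).random(40)
a_loop, b_loop = loop_version(c, b0=0.3)
a_vec, b_vec = closed_form(c, b0=0.3)
print(np.allclose(a_loop, a_vec), np.allclose(b_loop, b_vec))
```

Note that the matrix has N*N entries, so this trades the O(N) loop for O(N^2) memory and work; it is mainly worthwhile for moderate N.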
Yes there is:
c = np.arange(100)
a = 2 ** c - 1
b = np.cumsum(a)
Clearly, the updates are:
a_i = b_{i-1} + c_i
b_i = 2*b_{i-1} + c_i
Writing out the recursion,
b_0 = c_0 # I'm not sure if c_0 is to be used
b_1 = 2*b_0 + c_1
    = 2*c_0 + c_1
b_2 = 2*b_1 + c_2
    = 2*(2*c_0 + c_1) + c_2
    = 4*c_0 + 2*c_1 + c_2
b_3 = 2*b_2 + c_3
    = 2*(4*c_0 + 2*c_1 + c_2) + c_3
    = 8*c_0 + 4*c_1 + 2*c_2 + c_3
So it would seem that
b_i = np.sum((2**np.arange(i+1))[::-1] * c[:i+1])
a_i = b_{i-1} + c_i
It's not possible to do a cumulative sum here, because the coefficient of c_i keeps changing.
The easiest way to fully vectorize this is probably to just use a giant matrix. If c has size N:
t = np.zeros((N, N))
x, y = np.tril_indices(N)
t[x, y] = 2.0 ** (x - y)  # float base avoids integer overflow for large N
This gives us:
>>> t
array([[ 1., 0., 0., 0.],
[ 2., 1., 0., 0.],
[ 4., 2., 1., 0.],
[ 8., 4., 2., 1.]])
So now you can do:
b = np.sum(t * c, axis=1)
a = np.zeros(N)
a[1:] = b[:-1] + c[1:]
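Since this derivation takes b_0 = c_0 while the loop in the question starts from b[0] = 0, the two agree when c[0] is set to 0. The following sketch checks that under exactly this assumption; matrix_version is a hypothetical name, and np.tril is used here in place of tril_indices purely for brevity.

```python
import numpy as np

def matrix_version(c):
    """Lower-triangular matrix form of b_i = sum_j 2**(i-j) * c_j for j <= i."""
    n = len(c)
    idx = np.arange(n)
    # t[i, j] = 2**(i - j) on and below the diagonal, 0 above it
    t = np.tril(2.0 ** (idx[:, None] - idx[None, :]))
    b = t @ c
    a = np.zeros(n)
    a[1:] = b[:-1] + c[1:]
    return a, b

n = 20
c = np.random.default_rng(1).random(n)
c[0] = 0.0  # assumption: makes b_0 = c_0 coincide with the question's b[0] = 0

a_vec, b_vec = matrix_version(c)

# Reference loop from the question
a_ref = np.zeros(n)
b_ref = np.zeros(n)
for i in range(1, n):
    a_ref[i] = b_ref[i - 1] + c[i]
    b_ref[i] = b_ref[i - 1] + a_ref[i]

print(np.allclose(a_vec, a_ref), np.allclose(b_vec, b_ref))
```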
I probably wouldn't recommend this solution. From what little I know of computational methods, this doesn't seem numerically stable for large N. But I have the feeling that this would be true of any vectorized solution which performs the summation at the end. Maybe you should try both the for-loop and this piece of code out and see if your errors keep blowing up with the vectorized solution.
I have same problem as described here:
how to perform coordinates affine transformation using python?
I tried to use the method described there, but for some reason I get error messages.
The only change I made to the code was to replace the primary and secondary system points. I created the secondary coordinate points by using a different origin. In the real case I am studying, the measured coordinates will contain some errors.
primary_system1 = (40.0, 1160.0, 0.0)
primary_system2 = (40.0, 40.0, 0.0)
primary_system3 = (260.0, 40.0, 0.0)
primary_system4 = (260.0, 1160.0, 0.0)
secondary_system1 = (610.0, 560.0, 0.0)
secondary_system2 = (610.0,-560.0, 0.0)
secondary_system3 = (390.0, -560.0, 0.0)
secondary_system4 = (390.0, 560.0, 0.0)
The error I get when executing is the following:
Traceback (most recent call last):
File "affine_try.py", line 57, in <module>
secondary_system3, secondary_system4 )
File "affine_try.py", line 22, in solve_affine
A2 = y * x.I
File "/usr/lib/python2.7/dist-packages/numpy/matrixlib/defmatrix.py", line 850, in getI
return asmatrix(func(self))
File "/usr/lib/python2.7/dist-packages/numpy/linalg/linalg.py", line 445, in inv
return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
File "/usr/lib/python2.7/dist-packages/numpy/linalg/linalg.py", line 328, in solve
raise LinAlgError, 'Singular matrix'
numpy.linalg.linalg.LinAlgError: Singular matrix
What might be the problem?
The problem is that your matrix is singular, meaning it's not invertible. Since you're trying to take the inverse of it, that's a problem. The thread that you linked to is a basic solution to your problem, but it's not really the best solution. Rather than just inverting the matrix, what you actually want to do is solve a least-squares minimization problem to find the optimal affine transform matrix for your possibly noisy data. Here's how you would do that:
import numpy as np
primary = np.array([[40., 1160., 0.],
[40., 40., 0.],
[260., 40., 0.],
[260., 1160., 0.]])
secondary = np.array([[610., 560., 0.],
[610., -560., 0.],
[390., -560., 0.],
[390., 560., 0.]])
# Pad the data with ones, so that our transformation can do translations too
n = primary.shape[0]
pad = lambda x: np.hstack([x, np.ones((x.shape[0], 1))])
unpad = lambda x: x[:,:-1]
X = pad(primary)
Y = pad(secondary)
# Solve the least squares problem X * A = Y
# to find our transformation matrix A
A, res, rank, s = np.linalg.lstsq(X, Y)
transform = lambda x: unpad(np.dot(pad(x), A))
print "Target:"
print secondary
print "Result:"
print transform(primary)
print "Max error:", np.abs(secondary - transform(primary)).max()
The reason that your original matrix was singular is that your third coordinate is always zero, so there's no way to tell what the transform on that coordinate should be (zero times anything gives zero, so any value would work).
Printing the value of A tells you the transformation that least-squares has found:
A[np.abs(A) < 1e-10] = 0 # set really small values to zero
print A
results in
[[ -1. 0. 0. 0.]
[ 0. 1. 0. 0.]
[ 0. 0. 0. 0.]
[ 650. -600. 0. 1.]]
which is equivalent to x2 = -x1 + 650, y2 = y1 - 600, z2 = 0 where x1, y1, z1 are the coordinates in your original system and x2, y2, z2 are the coordinates in your new system. As you can see, least-squares just set all the terms related to the third dimension to zero, since your system is really two-dimensional.
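Once A is known, it can be applied to points that were not part of the fit. The sketch below is written for Python 3 and passes rcond=None to np.linalg.lstsq, which newer NumPy versions expect; fit_affine, apply_affine and the test point are made up for illustration.

```python
import numpy as np

primary = np.array([[40., 1160., 0.],
                    [40., 40., 0.],
                    [260., 40., 0.],
                    [260., 1160., 0.]])
secondary = np.array([[610., 560., 0.],
                      [610., -560., 0.],
                      [390., -560., 0.],
                      [390., 560., 0.]])

def pad(pts):
    # Append a column of ones so the transform can include a translation
    return np.hstack([pts, np.ones((pts.shape[0], 1))])

def fit_affine(src, dst):
    """Least-squares solution A of pad(src) . A = pad(dst)."""
    A, _, _, _ = np.linalg.lstsq(pad(src), pad(dst), rcond=None)
    return A

def apply_affine(A, pts):
    """Transform one point or an array of points with the fitted matrix."""
    pts = np.atleast_2d(pts)
    return np.dot(pad(pts), A)[:, :-1]

A = fit_affine(primary, secondary)
new_point = np.array([150., 600., 0.])  # hypothetical point in the primary system
print(apply_affine(A, new_point))
```

With the transform found above (x2 = -x1 + 650, y2 = y1 - 600), this point should land at approximately (500, 0, 0) in the secondary system, up to floating-point noise.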