How to construct an equivalent multivariate normal distribution in tensorflow-probability, using TransformedDistribution? - tensorflow

How to construct an equivalent multivariate normal distribution in tensorflow-probability, using TransformedDistribution and tfb.ScaleMatvecLinearOperator?
I'm reading a tutorial on a bijector in tensorflow_probability: tfp.bijectors.ScaleMatvecLinearOperator.
An example was provided.
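import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors
n = 10000
loc = 0
scale = 0.5
normal = tfd.Normal(loc=loc, scale=scale)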
The code above creates a univariate normal distribution.
tril = tf.random.normal((2, 4, 4))
scale_low_tri = tf.linalg.LinearOperatorLowerTriangular(tril)
scale_low_tri.to_dense()
The code above creates a tensor holding two lower-triangular matrices:
<tf.Tensor: shape=(2, 4, 4), dtype=float32, numpy=
array([[[-0.56953585, 0. , 0. , 0. ],
[ 1.1368589 , 0.32028311, 0. , 0. ],
[-0.8328388 , -1.9963025 , -0.6005632 , 0. ],
[ 0.596155 , -0.214932 , 1.0988408 , -0.41731614]],
[[ 2.0778096 , 0. , 0. , 0. ],
[-1.1863967 , 2.4897904 , 0. , 0. ],
[ 0.38001925, 1.4962028 , 1.7609248 , 0. ],
[ 2.9253726 , 0.7047957 , 0.050508 , 0.58643174]]],
dtype=float32)>
Then a matrix-vector multiplication bijector is created:
scale_lin_op = tfb.ScaleMatvecLinearOperator(scale_low_tri)
After that, a TransformedDistribution is constructed as follows:
mvn = tfd.TransformedDistribution(normal, scale_lin_op, batch_shape=[2], event_shape=[4])  # older API
This should have worked in older versions of tensorflow_probability. However, the constructor of TransformedDistribution has changed and no longer accepts the last two parameters, batch_shape and event_shape. I therefore tried the following to achieve the same thing:
mvn2 = tfd.TransformedDistribution(
    distribution=tfd.Sample(
        normal,
        sample_shape=[4]  # base_dist.event_shape == [4]
    ),
    bijector=scale_lin_op,
)  # batch_shape=[2], event_shape=[4]
mvn2
The result seems to have the correct batch_shape and event_shape:
<tfp.distributions.TransformedDistribution 'scale_matvec_linear_operatorSampleNormal' batch_shape=[2] event_shape=[4] dtype=float32>
Then, another distribution for comparison is created:
mvn3 = tfd.MultivariateNormalLinearOperator(loc=loc, scale=scale_low_tri)
mvn3
According to the tutorial, the TransformedDistribution mvn2 should be equivalent to the MultivariateNormalLinearOperator mvn3.
# Check
xn = normal.sample((n, 2, 4)) # sample_shape = (n, 2, 4)
tf.norm(mvn2.log_prob(xn) - mvn3.log_prob(xn)) / tf.norm(mvn2.log_prob(xn))
<tf.Tensor: shape=(), dtype=float32, numpy=0.7498207>
But in my result they are not equivalent. (If they were, the value above would be 0.)
What have I done wrong?
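One thing worth checking, offered as a hypothesis rather than a confirmed diagnosis: the base distribution here is Normal(loc=0, scale=0.5), while MultivariateNormalLinearOperator(loc, scale=L) describes L applied to a standard normal. Pushing a Normal(0, 0.5) base through L gives an effective scale of 0.5 * L, so the two log-densities would only agree when the base scale is 1. The sketch below illustrates the gap with plain NumPy on a small made-up matrix; mvn_log_prob, L, and x are hypothetical names and values, not part of the tutorial.

```python
import numpy as np

def mvn_log_prob(x, scale_tril):
    """Log density of N(0, L @ L.T) at x, with L = scale_tril (lower triangular)."""
    d = x.shape[-1]
    z = np.linalg.solve(scale_tril, x)  # z = L^{-1} x
    log_abs_det = np.sum(np.log(np.abs(np.diag(scale_tril))))
    return -0.5 * d * np.log(2.0 * np.pi) - log_abs_det - 0.5 * np.dot(z, z)

# Hypothetical 2-D example (not the random matrices from the question).
L = np.array([[1.0, 0.0],
              [0.5, 2.0]])
x = np.array([1.0, 1.0])
base_scale = 0.5  # same value as `scale` in the question

# Standard-normal base pushed through L.
lp_unit_base = mvn_log_prob(x, L)
# Normal(0, base_scale) base pushed through L: effective scale is base_scale * L.
lp_scaled_base = mvn_log_prob(x, base_scale * L)

gap = lp_scaled_base - lp_unit_base
print(lp_unit_base, lp_scaled_base, gap)  # the gap is non-zero
```

If this is the cause, setting scale = 1 in the base Normal (or passing 0.5 times the lower-triangular matrix as the scale of the comparison distribution) should make the two log_prob values agree up to floating-point error.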

Related

Removing row of a tensor in tensorflow

Assuming I have a tensor k
k=tf.random.normal([4,5],0,1)
def sample_without_replacement(logits, K):
    """
    Courtesy of https://github.com/tensorflow/tensorflow/issues/9260#issuecomment-437875125
    """
    logits = tf.transpose(logits)
    # Gumbel noise; adding it to the logits is what randomizes the top-K selection
    z = -tf.math.log(-tf.math.log(tf.random.uniform(tf.shape(logits), 0, 1)))
    _, indices = tf.math.top_k(logits + z, K)
    return indices
indices = sample_without_replacement(k, 2)
k.remove(x, indices_of_size_two)  # pseudocode: this is the operation I am looking for
Which function can I use in place of 'remove' to remove the rows contained in indices from k?
You can remove a specific row from a tensor by using np.delete(). Note that it returns a NumPy array rather than a tensor. For example, to remove the row at index 1 (the second row) from a tensor k:
<tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[-0.02622163, -1.2923028 , 3.2072415 , 1.2431644 , -0.11518966],
[ 1.2594987 , -1.6813043 , -0.4560027 , 1.4999349 , -0.7349123 ],
[ 0.21005473, 1.1832136 , -2.4060364 , -0.59930867, -0.1646447 ],
[ 0.7740495 , 0.48236254, 0.682837 , -0.54411227, 1.0912068 ]],
dtype=float32)>
np.delete(k, obj=1, axis=0)
output:
array([[-0.02622163, -1.2923028 , 3.2072415 , 1.2431644 , -0.11518966],
[ 0.21005473, 1.1832136 , -2.4060364 , -0.59930867, -0.1646447 ],
[ 0.7740495 , 0.48236254, 0.682837 , -0.54411227, 1.0912068 ]],
dtype=float32)
Thank You.
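To remove several rows at once, the same idea extends to a list of indices. The sketch below uses a stand-in NumPy array and made-up indices (rows_to_remove is hypothetical) and shows two equivalent options: np.delete with a list, and a boolean keep-mask. The mask form is the one that carries over most directly to TensorFlow, where tf.boolean_mask accepts the same kind of mask.

```python
import numpy as np

k = np.arange(20, dtype=np.float32).reshape(4, 5)  # stand-in for the (4, 5) tensor
rows_to_remove = [0, 2]                            # hypothetical row indices

# Option 1: np.delete accepts a list of indices.
deleted = np.delete(k, rows_to_remove, axis=0)

# Option 2: build a boolean "keep" mask and index with it.
keep = np.ones(k.shape[0], dtype=bool)
keep[rows_to_remove] = False
masked = k[keep]

print(deleted.shape)  # two rows remain
```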

Addressing polynomial multiplication and division "overflow" issue

I have a list of coefficients of degree-1 polynomials, where row i represents a[i][0]*x + a[i][1]:
a = np.array([[ 1. , 77.48514702],
[ 1. , 0. ],
[ 1. , 2.4239275 ],
[ 1. , 1.21848739],
[ 1. , 0. ],
[ 1. , 1.18181818],
[ 1. , 1.375 ],
[ 1. , 2. ],
[ 1. , 2. ],
[ 1. , 2. ]])
I am running into issues with the following operation, where the two sides should agree but do not:
np.polydiv(reduce(np.polymul, a), a[0])[0] != reduce(np.polymul, a[1:])
where
In [185]: reduce(np.polymul, a[1:])
Out[185]:
array([ 1. , 12.19923307, 63.08691612, 179.21045388,
301.91486027, 301.5756213 , 165.35814595, 38.39582615,
0. , 0. ])
and
In [186]: np.polydiv(reduce(np.polymul, a), a[0])[0]
Out[186]:
array([ 1.00000000e+00, 1.21992331e+01, 6.30869161e+01, 1.79210454e+02,
3.01914860e+02, 3.01575621e+02, 1.65358169e+02, 3.83940472e+01,
1.37845155e-01, -1.06809521e+01])
First, the remainder of np.polydiv(reduce(np.polymul, a), a[0]) is far from 0 (827.61514239, to be exact). Second, the last two terms of the quotient should be 0, but they are 1.37845155e-01 and -1.06809521e+01.
What are my options for improving the accuracy?
There is a slightly more complicated way that keeps the multiply-first-then-divide structure.
First, pick n points xs and evaluate the product of the polynomials in a at each of them:
xs = np.linspace(0, 1., 10)
ys = np.array([np.prod(list(map(lambda r: np.polyval(r, x), a))) for x in xs])
Then do the division on ys instead of on the coefficients:
ys = ys/np.array([np.polyval(a[0], x) for x in xs])
Finally, recover the coefficients by polynomial interpolation through xs and ys:
from scipy.interpolate import lagrange
lagrange(xs, ys)
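The steps above can be checked end to end on a smaller case where the answer is known. This is only a sketch: the three factors are made up, and np.polyfit is swapped in for scipy.interpolate.lagrange so the example needs NumPy alone. With as many sample points as quotient coefficients, the fit is an exact interpolation.

```python
from functools import reduce
import numpy as np

# Hypothetical factors: (x + 5)(x + 2)(x + 3); we divide out the first one.
factors = np.array([[1.0, 5.0], [1.0, 2.0], [1.0, 3.0]])
divisor = factors[0]
n_points = len(factors)  # the quotient has degree len(factors) - 1

xs = np.linspace(0.0, 1.0, n_points)
# Evaluate the full product pointwise instead of multiplying coefficients.
ys = np.array([np.prod([np.polyval(f, x) for f in factors]) for x in xs])
# Divide the values, not the coefficients.
ys = ys / np.polyval(divisor, xs)
# Interpolate to recover the quotient's coefficients (highest degree first).
quotient = np.polyfit(xs, ys, deg=n_points - 1)

print(quotient)                          # close to x^2 + 5x + 6
print(reduce(np.polymul, factors[1:]))   # direct product of the remaining factors
```

For higher degrees the interpolation step itself can become ill-conditioned, so it is worth comparing against reduce(np.polymul, a[1:]) whenever the remaining factors are known.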

Keras how to view node connections?

If I understand a neural net correctly - it is just a graph of nodes and edges where each node in a given layer is connected to every node in the following layer.
The nodes have weights and the edges have weights? And you do some multiplication of these values to get a prediction.
Given a 2 layer model (with 2 input nodes 'a & b' and 1 output node 'c'), this is what I am after:
| source | destination | value |
+--------+-------------+-------+
| a | c | 0.01 |
| b | c | 0.03 |
But when I call model.weights (albeit on a more complex model) I get a bunch of keyless np arrays with no way to tell which values belong to which nodes.
[<tf.Variable 'dense_1/kernel:0' shape=(8, 12) dtype=float32, numpy=
array([[ 0.31751466, 0.20620143, 0.09791961, -0.08813753, 0.2515421 ,
-0.53187364, -0.15702713, 0.0267031 , -0.48389524, -0.13240823,
0.39453653, -0.39209265],
[ 0.31308496, -0.38468117, -0.03970708, 0.2889997 , 0.03803336,
0.04796927, -0.5140167 , 0.04645742, 0.08511442, -0.09435426,
0.03105392, -0.17520434],
[ 0.05365064, -0.05402106, -0.02931813, 0.13150737, 0.08898667,
0.20198704, 0.28716817, 0.21081768, -0.09572094, 0.14665389,
-0.3083644 , -0.47491354],
[-0.36734372, -0.12509695, -0.16984704, -0.19592582, 0.24023046,
-0.28856498, 0.11084742, 0.12101128, 0.00146453, -0.4996385 ,
-0.23521361, 0.24130017],
[ 0.21538568, -0.08531788, -0.32247233, -0.09213281, -0.39390212,
0.05042276, 0.22282743, -0.11438937, -0.00920196, 0.12748554,
-0.02741051, -0.12594655],
[ 0.3057384 , -0.20449257, 0.16837521, 0.21493798, -0.14034544,
0.45435148, -0.0548106 , 0.07033874, 0.39275315, -0.3332669 ,
-0.10222256, 0.14674312],
[ 0.36575058, 0.07205153, -0.14340317, -0.57348907, 0.7167731 ,
-0.29590985, 0.6351 , -0.6615748 , -0.23423046, -0.1065482 ,
0.7084621 , 0.02146828],
[-0.14760445, -0.4926324 , 0.30986223, 0.4067813 , 0.32313958,
-0.39595246, 0.12813015, -0.3088377 , -0.7285755 , 0.6085407 ,
0.39351743, -0.09248918]], dtype=float32)>,
<tf.Variable 'dense_1/bias:0' shape=(12,) dtype=float32, numpy=
array([-1.1890789 , 0. , -0.43765482, 0.5292001 , -0.94201744,
0.44064137, -0.5898111 , 0.8738893 , -0.62948394, 0.9394948 ,
0.47176355, 0. ], dtype=float32)>,
<tf.Variable 'dense_2/kernel:0' shape=(12, 8) dtype=float32, numpy=
array([[ 0.18743241, -0.04509293, 0.26035592, -0.40080604, -0.2120734 ,
0.0604641 , 0.17452721, -0.25245216],
[-0.4116977 , 0.4476785 , 0.13495606, 0.38070595, -0.16811815,
-0.5323667 , -0.41471216, 0.49056184],
[-0.43843648, -0.01767761, 0.03876654, 0.279591 , -0.64866304,
0.4605058 , 0.50288963, 0.46865177],
[-0.50431 , 0.26749972, -0.4822985 , 0.11643535, 0.34190154,
0.28961414, -0.19484225, 0.32788265],
[-0.4659909 , 0.12863334, -0.17177017, 0.27696657, -0.08261362,
0.1787579 , -0.49217325, -0.419283 ],
[-0.31586087, 0.4421215 , -0.35133213, -0.40784043, 0.3213457 ,
0.08262701, -0.20723267, -0.4305911 ],
[-0.32226318, -0.3479017 , -0.48984393, -0.19052912, 0.27398133,
-0.18631694, -0.42036086, -0.31824118],
[-0.04223084, -0.38938865, -0.33997327, -0.7986885 , -0.12062006,
-0.37880445, 0.06364141, 0.41674942],
[-0.07699671, -1.0260301 , -0.38287994, 0.46872973, -0.32630473,
0.37103057, 0.06274027, -0.25317484],
[-0.11334842, 0.29602957, 0.01759415, 0.07748368, -0.0767558 ,
0.13787462, -0.31502756, 0.17331126],
[-0.5030543 , -0.23578712, -0.38978124, 0.01187875, -0.02882512,
-0.5208091 , -0.4208508 , -0.08294159],
[ 0.04435921, 0.545004 , 0.07590699, 0.21470094, -0.46099266,
-0.25307545, -0.31362575, 0.3284188 ]], dtype=float32)>,
<tf.Variable 'dense_2/bias:0' shape=(8,) dtype=float32, numpy=
array([ 0. , 1.3254918 , -0.18484406, -0.0136466 , 1.2459729 ,
-1.331188 , -0.01439124, 0.9184486 ], dtype=float32)>,
<tf.Variable 'dense_3/kernel:0' shape=(8, 1) dtype=float32, numpy=
array([[-0.27390796],
[-0.40990734],
[-0.12878264],
[-0.43434066],
[-0.04099607],
[ 0.57922167],
[ 0.3830525 ],
[-0.47695825]], dtype=float32)>, <tf.Variable 'dense_3/bias:0' shape=(1,) dtype=float32, numpy=array([-1.3391492], dtype=float32)>]
Is there a JSON/dictionary-like way to get what I am after?
The "sources" and "destinations" of those edges don't have names like "a" and "b", they're just the kth neuron of the nth layer. The weights, then, are just an array. For example, weights[n][i][j] might be the weight of the edge connecting the ith neuron of layer n to the jth neuron of layer n+1. In this paradigm, the weights of your textbook example would look like
[[[ 0.8 0.4 0.3 ] [ 0.2 0.9 0.5 ]]
[[ 0.3 0.5 0.9 ]]]
When you take into account the fact that each neuron can have a bias as well as incoming weights, and that different layers' different numbers of neurons would make the 3D array ragged (which is inconvenient), you might find that the most convenient way to store it is as a structure that contains several 2D arrays (each one containing the weights for one pair of layers) and several 1D arrays (each containing the biases for one layer), all of different sizes... which is exactly what the dump you provided shows.
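If a source/destination/value listing is still what you want, it can be built from those 2D arrays. Here is a minimal sketch in plain Python, assuming the kernels[n][i][j] convention described above; edge_table and the "layerN_nodeK" labels are made-up names for illustration. With a real Keras model, the kernels could be collected from each Dense layer's get_weights(), which returns the kernel followed by the bias.

```python
def edge_table(kernels):
    """Flatten per-layer kernels into source/destination/value records.

    kernels[n][i][j] is taken to be the weight from node i of layer n
    to node j of layer n + 1.
    """
    edges = []
    for n, kernel in enumerate(kernels):
        for i, row in enumerate(kernel):
            for j, value in enumerate(row):
                edges.append({
                    "source": f"layer{n}_node{i}",
                    "destination": f"layer{n + 1}_node{j}",
                    "value": float(value),
                })
    return edges

# The 2-input, 1-output example from the question: a -> c and b -> c.
example_kernels = [[[0.01], [0.03]]]
edges = edge_table(example_kernels)
for edge in edges:
    print(edge)
```

The resulting list of dictionaries can be passed to json.dumps directly. Biases are left out here because they belong to nodes rather than edges.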

bad result from numpy corrcoef and minimum spanning tree

I have this code:
import math
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
mm = np.array([[1, 4, 7, 8], [2, 2, 8, 4], [1, 13, 1, 5]])
mm = np.column_stack(mm)
mmCov = np.cov(mm, rowvar=0)
print("covariance\n", mmCov)
# my code to get correlations
mmResCor = np.zeros(shape=(3, 3))
for i in range(len(mmCov)):
    for j in range(len(mmCov[i])):
        mmResCor[i][j] = mmCov[i][j] / (math.sqrt(mmCov[i][i] * mmCov[j][j]))
print("correlaciones a mano\n", mmResCor)
mmCor = np.corrcoef(mmCov, rowvar=0)
print("correlations\n", mmCor)
X = csr_matrix(mmCor)
XX = minimum_spanning_tree(X)
print("minimun spanning tree\n", XX)
First, each column represents a variable, with observations in the rows.
np.corrcoef uses this relation with the covariance matrix:
R_{ij} = \frac{ C_{ij} } { \sqrt{ C_{ii} * C_{jj} } }
When I use np.corrcoef I get this matrix:
correlations
[[ 1. 0.8660254 -0.82603319]
[ 0.8660254 1. -0.99717646]
[-0.82603319 -0.99717646 1. ]]
but when I apply "my code" to get the same result...
mmResCor = np.zeros(shape=(3, 3))
for i in range(len(mmCov)):
    for j in range(len(mmCov[i])):
        mmResCor[i][j] = mmCov[i][j] / (math.sqrt(mmCov[i][i] * mmCov[j][j]))
I get this matrix
correlaciones a mano
[[ 1. 0.67082039 0. ]
[ 0.67082039 1. -0.5 ]
[ 0. -0.5 1. ]]
Why do I get different results if I am supposedly doing the same thing?
One more question:
When I apply minimum_spanning_tree I get this:
minimun spanning tree
(0, 2) -0.826033187631
(1, 2) -0.997176464953
Is there any way to represent these or can I save this result in some variables?
np.corrcoef should take the data as input, but you're passing it the covariance matrix. If you pass the data, you get the same result as your manual computation:
>>> np.corrcoef(mm, rowvar=0)
array([[ 1. , 0.67082039, 0. ],
[ 0.67082039, 1. , -0.5 ],
[ 0. , -0.5 , 1. ]])
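As a quick check of the relation R_ij = C_ij / sqrt(C_ii * C_jj), here is a small sketch that normalizes the covariance matrix in one vectorized step and compares it with np.corrcoef on the data. The variable names (cov, std, manual) are chosen for illustration.

```python
import numpy as np

mm = np.column_stack([[1, 4, 7, 8], [2, 2, 8, 4], [1, 13, 1, 5]])  # columns are variables

cov = np.cov(mm, rowvar=False)
std = np.sqrt(np.diag(cov))        # per-variable standard deviations
manual = cov / np.outer(std, std)  # R_ij = C_ij / sqrt(C_ii * C_jj)

from_data = np.corrcoef(mm, rowvar=False)  # pass the data, not the covariance
print(np.allclose(manual, from_data))
```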
Regarding the minimum spanning tree, I'm not sure what your question is, but the output XX is a sparse matrix which stores a matrix representation of the tree.
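If the goal is to get the tree's edges into ordinary variables, one option is to convert the sparse result. The sketch below uses a small made-up graph (not the correlation matrix from the question) and shows a dense array as well as a list of (row, column, weight) tuples.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical 3-node graph; entry [i, j] is the weight of the edge between i and j.
graph = csr_matrix(np.array([[0, 1, 3],
                             [0, 0, 2],
                             [0, 0, 0]]))
tree = minimum_spanning_tree(graph)

dense = tree.toarray()  # plain 2-D NumPy array, zeros where there is no edge
coo = tree.tocoo()      # exposes parallel row / col / data arrays
edges = [(int(i), int(j), float(w)) for i, j, w in zip(coo.row, coo.col, coo.data)]
print(edges)
```

Keep in mind that zero entries in the input are treated as missing edges, which matters if a correlation happens to be exactly 0.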

Why are the convolution outputs calculated with theano and numpy not the same?

I made a simple example IPython notebook to calculate a convolution with theano and with numpy, but the results are different. Does anybody know where the mistake is?
import theano
import numpy
from theano.sandbox.cuda import dnn
import theano.tensor as T
Define the input image x0:
x0 = numpy.array([[[[ 7.61323881, 0. , 0. , 0. ,
0. , 0. ],
[ 25.58142853, 0. , 0. , 0. ,
0. , 0. ],
[ 7.51445341, 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 12.74498367, 4.96315479, 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. ]]]], dtype='float32')
x0.shape
# (1, 1, 6, 6)
Define the convolution kernel:
w0 = numpy.array([[[[-0.0015835 , -0.00088091, 0.00226375, 0.00378434, 0.00032208,
-0.00396959],
[-0.000179 , 0.00030951, 0.00113849, 0.00012536, -0.00017198,
-0.00318825],
[-0.00263921, -0.00383847, -0.00225416, -0.00250589, -0.00149073,
-0.00287099],
[-0.00149283, -0.00312137, -0.00431571, -0.00394508, -0.00165113,
-0.0012118 ],
[-0.00167376, -0.00169753, -0.00373235, -0.00337372, -0.00025546,
0.00072154],
[-0.00141197, -0.00099017, -0.00091934, -0.00226817, -0.0024105 ,
-0.00333713]]]], dtype='float32')
w0.shape
# (1, 1, 6, 6)
Calculate the convolution with theano and cudnn:
X = T.tensor4('input')
W = T.tensor4('W')
conv_out = dnn.dnn_conv(img=X, kerns=W)
convolution = theano.function([X, W], conv_out)
numpy.array(convolution(x0, w0))
# array([[[[-0.04749081]]]], dtype=float32)
Calculate convolution with numpy (note the result is different):
numpy.sum(x0 * w0)
# -0.097668208
I'm not exactly sure what kind of convolution you are trying to compute, but it seems to me that numpy.sum(x0*w0) might not be the way to do it. Does this help?
import numpy as np
# ... define x0 and w0 like in your example ...
np_convolution = np.fft.irfftn(np.fft.rfftn(x0) * np.fft.rfftn(w0))
The last element of the resulting array, i.e. np_convolution[-1,-1,-1,-1] is -0.047490807560833327, which seems to be the answer you're looking for in your notebook.
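A possible explanation for the mismatch, stated as an assumption about dnn_conv's default mode rather than something confirmed in this thread: dnn_conv computes a true convolution, which flips the kernel along both spatial axes, while numpy.sum(x0 * w0) is a cross-correlation with no flip. The sketch below reproduces both numbers with NumPy alone, using the same x0 and w0 values as the question (x0 is written via its five non-zero entries).

```python
import numpy as np

# Same data as in the question: x0 is zero except for five entries.
x0 = np.zeros((1, 1, 6, 6))
x0[0, 0, 0, 0] = 7.61323881
x0[0, 0, 1, 0] = 25.58142853
x0[0, 0, 2, 0] = 7.51445341
x0[0, 0, 3, 1] = 12.74498367
x0[0, 0, 3, 2] = 4.96315479

w0 = np.array([[[
    [-0.0015835, -0.00088091, 0.00226375, 0.00378434, 0.00032208, -0.00396959],
    [-0.000179, 0.00030951, 0.00113849, 0.00012536, -0.00017198, -0.00318825],
    [-0.00263921, -0.00383847, -0.00225416, -0.00250589, -0.00149073, -0.00287099],
    [-0.00149283, -0.00312137, -0.00431571, -0.00394508, -0.00165113, -0.0012118],
    [-0.00167376, -0.00169753, -0.00373235, -0.00337372, -0.00025546, 0.00072154],
    [-0.00141197, -0.00099017, -0.00091934, -0.00226817, -0.0024105, -0.00333713],
]]])

# No flip: cross-correlation, which is what the question computed with NumPy.
cross_correlation = np.sum(x0 * w0)
# Kernel flipped on both spatial axes: true convolution at full overlap.
true_convolution = np.sum(x0 * w0[:, :, ::-1, ::-1])

print(cross_correlation)  # about -0.0977
print(true_convolution)   # about -0.0475
```

The flipped version agrees with both the Theano output and the FFT-based value above, which is consistent with the kernel-flip explanation.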