Keras how to view node connections? - tensorflow

If I understand a neural net correctly - it is just a graph of nodes and edges where each node in a given layer is connected to every node in the following layer.
The nodes have weights and the edges have weights? And you do some multiplication of these values to get a prediction.
Given a 2 layer model (with 2 input nodes 'a & b' and 1 output node 'c'), this is what I am after:
| source | destination | value |
+--------+-------------+-------+
| a      | c           | 0.01  |
| b      | c           | 0.03  |
But when I call model.weights (albeit on a more complex model) I get a bunch of keyless np arrays with no way to tell which values belong to which nodes.
[<tf.Variable 'dense_1/kernel:0' shape=(8, 12) dtype=float32, numpy=
array([[ 0.31751466, 0.20620143, 0.09791961, -0.08813753, 0.2515421 ,
-0.53187364, -0.15702713, 0.0267031 , -0.48389524, -0.13240823,
0.39453653, -0.39209265],
[ 0.31308496, -0.38468117, -0.03970708, 0.2889997 , 0.03803336,
0.04796927, -0.5140167 , 0.04645742, 0.08511442, -0.09435426,
0.03105392, -0.17520434],
[ 0.05365064, -0.05402106, -0.02931813, 0.13150737, 0.08898667,
0.20198704, 0.28716817, 0.21081768, -0.09572094, 0.14665389,
-0.3083644 , -0.47491354],
[-0.36734372, -0.12509695, -0.16984704, -0.19592582, 0.24023046,
-0.28856498, 0.11084742, 0.12101128, 0.00146453, -0.4996385 ,
-0.23521361, 0.24130017],
[ 0.21538568, -0.08531788, -0.32247233, -0.09213281, -0.39390212,
0.05042276, 0.22282743, -0.11438937, -0.00920196, 0.12748554,
-0.02741051, -0.12594655],
[ 0.3057384 , -0.20449257, 0.16837521, 0.21493798, -0.14034544,
0.45435148, -0.0548106 , 0.07033874, 0.39275315, -0.3332669 ,
-0.10222256, 0.14674312],
[ 0.36575058, 0.07205153, -0.14340317, -0.57348907, 0.7167731 ,
-0.29590985, 0.6351 , -0.6615748 , -0.23423046, -0.1065482 ,
0.7084621 , 0.02146828],
[-0.14760445, -0.4926324 , 0.30986223, 0.4067813 , 0.32313958,
-0.39595246, 0.12813015, -0.3088377 , -0.7285755 , 0.6085407 ,
0.39351743, -0.09248918]], dtype=float32)>,
<tf.Variable 'dense_1/bias:0' shape=(12,) dtype=float32, numpy=
array([-1.1890789 , 0. , -0.43765482, 0.5292001 , -0.94201744,
0.44064137, -0.5898111 , 0.8738893 , -0.62948394, 0.9394948 ,
0.47176355, 0. ], dtype=float32)>,
<tf.Variable 'dense_2/kernel:0' shape=(12, 8) dtype=float32, numpy=
array([[ 0.18743241, -0.04509293, 0.26035592, -0.40080604, -0.2120734 ,
0.0604641 , 0.17452721, -0.25245216],
[-0.4116977 , 0.4476785 , 0.13495606, 0.38070595, -0.16811815,
-0.5323667 , -0.41471216, 0.49056184],
[-0.43843648, -0.01767761, 0.03876654, 0.279591 , -0.64866304,
0.4605058 , 0.50288963, 0.46865177],
[-0.50431 , 0.26749972, -0.4822985 , 0.11643535, 0.34190154,
0.28961414, -0.19484225, 0.32788265],
[-0.4659909 , 0.12863334, -0.17177017, 0.27696657, -0.08261362,
0.1787579 , -0.49217325, -0.419283 ],
[-0.31586087, 0.4421215 , -0.35133213, -0.40784043, 0.3213457 ,
0.08262701, -0.20723267, -0.4305911 ],
[-0.32226318, -0.3479017 , -0.48984393, -0.19052912, 0.27398133,
-0.18631694, -0.42036086, -0.31824118],
[-0.04223084, -0.38938865, -0.33997327, -0.7986885 , -0.12062006,
-0.37880445, 0.06364141, 0.41674942],
[-0.07699671, -1.0260301 , -0.38287994, 0.46872973, -0.32630473,
0.37103057, 0.06274027, -0.25317484],
[-0.11334842, 0.29602957, 0.01759415, 0.07748368, -0.0767558 ,
0.13787462, -0.31502756, 0.17331126],
[-0.5030543 , -0.23578712, -0.38978124, 0.01187875, -0.02882512,
-0.5208091 , -0.4208508 , -0.08294159],
[ 0.04435921, 0.545004 , 0.07590699, 0.21470094, -0.46099266,
-0.25307545, -0.31362575, 0.3284188 ]], dtype=float32)>,
<tf.Variable 'dense_2/bias:0' shape=(8,) dtype=float32, numpy=
array([ 0. , 1.3254918 , -0.18484406, -0.0136466 , 1.2459729 ,
-1.331188 , -0.01439124, 0.9184486 ], dtype=float32)>,
<tf.Variable 'dense_3/kernel:0' shape=(8, 1) dtype=float32, numpy=
array([[-0.27390796],
[-0.40990734],
[-0.12878264],
[-0.43434066],
[-0.04099607],
[ 0.57922167],
[ 0.3830525 ],
[-0.47695825]], dtype=float32)>, <tf.Variable 'dense_3/bias:0' shape=(1,) dtype=float32, numpy=array([-1.3391492], dtype=float32)>]
Is there a JSON/dictionary-like way to get what I am after?

The "sources" and "destinations" of those edges don't have names like "a" and "b"; they're just the kth neuron of the nth layer. The weights, then, are just an array. For example, weights[n][i][j] might be the weight of the edge connecting the ith neuron of layer n to the jth neuron of layer n+1. In this paradigm, the weights of your textbook example would look like
[[[ 0.8 0.4 0.3 ] [ 0.2 0.9 0.5 ]]
[[ 0.3 0.5 0.9 ]]]
When you take into account the fact that each neuron can have a bias as well as incoming weights, and that different layers' different numbers of neurons would make the 3D array ragged (which is inconvenient), you might find that the most convenient way to store it is as a structure that contains several 2D arrays (each one containing the weights for one pair of layers) and several 1D arrays (each containing the biases for one layer), all of different sizes... which is exactly what the dump you provided shows.
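If a dictionary-like view is still wanted, the per-layer 2D arrays can be flattened into source/destination/value records. The following is only a minimal sketch: the function name edge_table and the labels such as layer0_node0 are hypothetical, and it assumes the kernels have already been collected into a list of 2D arrays (with Keras Dense layers, layer.get_weights()[0] is one way to obtain each kernel). Biases are ignored here.

```python
def edge_table(kernels):
    """Flatten a list of 2D kernels into source/destination/value records.

    kernels[n][i][j] is assumed to be the weight of the edge from
    neuron i of layer n to neuron j of layer n + 1.
    """
    rows = []
    for n, kernel in enumerate(kernels):
        for i, outgoing in enumerate(kernel):
            for j, value in enumerate(outgoing):
                rows.append({
                    "source": f"layer{n}_node{i}",        # hypothetical label
                    "destination": f"layer{n + 1}_node{j}",
                    "value": float(value),
                })
    return rows


# Toy kernel for the 2-input, 1-output model from the question
kernels = [[[0.01], [0.03]]]
table = edge_table(kernels)
for row in table:
    print(row)
```

Because the function only iterates over nested sequences, it works the same way on plain lists and on NumPy arrays.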

Related

Removing row of a tensor in tensorflow

Assuming I have a tensor k
k=tf.random.normal([4,5],0,1)
def sample_without_replacement(logits, K):
    """
    Courtesy of https://github.com/tensorflow/tensorflow/issues/9260#issuecomment-437875125
    """
    logits = tf.transpose(logits)
    # Gumbel noise: adding it to the logits before top_k is what makes this a random sample
    z = -tf.math.log(-tf.math.log(tf.random.uniform(tf.shape(logits), 0, 1)))
    _, indices = tf.math.top_k(logits + z, K)
    return indices
indices = sample_without_replacement(k, 2)
k.remove(indices)  # pseudocode: this method does not exist
Which function can I use in place of 'remove' to remove the rows contained in indices from k?
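You can remove a specific row from a tensor by using np.delete(). Note that obj is a zero-based index and that the result is a NumPy array, not a tf.Tensor. For example, to remove the row at index 1 (the second row) from a tensor k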
<tf.Tensor: shape=(4, 5), dtype=float32, numpy=
array([[-0.02622163, -1.2923028 , 3.2072415 , 1.2431644 , -0.11518966],
[ 1.2594987 , -1.6813043 , -0.4560027 , 1.4999349 , -0.7349123 ],
[ 0.21005473, 1.1832136 , -2.4060364 , -0.59930867, -0.1646447 ],
[ 0.7740495 , 0.48236254, 0.682837 , -0.54411227, 1.0912068 ]],
dtype=float32)>
np.delete(k, obj=1, axis=0)
output:
array([[-0.02622163, -1.2923028 , 3.2072415 , 1.2431644 , -0.11518966],
[ 0.21005473, 1.1832136 , -2.4060364 , -0.59930867, -0.1646447 ],
[ 0.7740495 , 0.48236254, 0.682837 , -0.54411227, 1.0912068 ]],
dtype=float32)
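Since the question asks about removing several rows at once, here is a small NumPy-only sketch of two ways to do that. The array k and the index values are stand-ins chosen for illustration, not the sampled indices from the question. The boolean-mask variant is shown because the same mask could presumably be passed to tf.boolean_mask to stay inside TensorFlow.

```python
import numpy as np

# Stand-in for the (4, 5) tensor from the question
k = np.arange(20, dtype=np.float32).reshape(4, 5)
indices = np.array([0, 2])  # hypothetical row indices to drop

# Option 1: np.delete accepts a sequence of indices along the chosen axis
kept_delete = np.delete(k, indices, axis=0)

# Option 2: build a boolean mask that is False for the rows to drop
mask = np.ones(k.shape[0], dtype=bool)
mask[indices] = False
kept_mask = k[mask]

print(kept_delete)
```

Both variants keep rows 1 and 3 of k and leave the original array unchanged.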

How to construct an equivalent multivariate normal distribution in tensorflow-probability, using TransformedDistribution?

How to construct an equivalent multivariate normal distribution in tensorflow-probability, using TransformedDistribution and tfb.ScaleMatvecLinearOperator?
I'm reading a tutorial on a bijector in tensorflow_probability: tfp.bijectors.ScaleMatvecLinearOperator.
The tutorial provides the following example.
n = 10000
loc = 0
scale = 0.5
normal = tfd.Normal(loc=loc, scale=scale)
The code above creates a univariate normal distribution.
tril = tf.random.normal((2, 4, 4))
scale_low_tri = tf.linalg.LinearOperatorLowerTriangular(tril)
scale_low_tri.to_dense()
The code above creates a tensor consisting of 2 lower triangular matrices:
<tf.Tensor: shape=(2, 4, 4), dtype=float32, numpy=
array([[[-0.56953585, 0. , 0. , 0. ],
[ 1.1368589 , 0.32028311, 0. , 0. ],
[-0.8328388 , -1.9963025 , -0.6005632 , 0. ],
[ 0.596155 , -0.214932 , 1.0988408 , -0.41731614]],
[[ 2.0778096 , 0. , 0. , 0. ],
[-1.1863967 , 2.4897904 , 0. , 0. ],
[ 0.38001925, 1.4962028 , 1.7609248 , 0. ],
[ 2.9253726 , 0.7047957 , 0.050508 , 0.58643174]]],
dtype=float32)>
Then a matrix-vector multiplication bijector is created:
scale_lin_op = tfb.ScaleMatvecLinearOperator(scale_low_tri)
After that, a TransformedDistribution is constructed as follows:
mvn = tfd.TransformedDistribution(normal, scale_lin_op, batch_shape=[2], event_shape=[4])
This should have worked in old versions of tensorflow_probability. However, the constructor of TransformedDistribution has since changed and no longer accepts the last two parameters, batch_shape and event_shape. Therefore I tried the following to achieve the same thing:
mvn2 = tfd.TransformedDistribution(
distribution=tfd.Sample(
normal,
sample_shape=[4] # base_dist.event_shape == [4]
),
bijector=scale_lin_op, ) # batch_shape=[2], event_shape=[4]
mvn2
And the result seems to have the correct batch_shape and event_shape
<tfp.distributions.TransformedDistribution 'scale_matvec_linear_operatorSampleNormal' batch_shape=[2] event_shape=[4] dtype=float32>
Then, another distribution for comparison is created:
mvn3 = tfd.MultivariateNormalLinearOperator(loc=loc, scale=scale_low_tri)
mvn3
According to the tutorial, the TransformedDistribution mvn2 should be equivalent to the MultivariateNormalLinearOperator mvn3.
# Check
xn = normal.sample((n, 2, 4)) # sample_shape = (n, 2, 4)
tf.norm(mvn2.log_prob(xn) - mvn3.log_prob(xn)) / tf.norm(mvn2.log_prob(xn))
<tf.Tensor: shape=(), dtype=float32, numpy=0.7498207>
But in my result they are not equivalent. (If they were, the value above would be 0.)
What have I done wrong?

Compare numpy arrays of different shapes

I have two numpy arrays of shapes (4,4) and (9,4)
matrix1 = array([[ 72. , 72. , 72. , 72. ],
[ 72.00396729, 72.00396729, 72.00396729, 72.00396729],
[596.29998779, 596.29998779, 596.29998779, 596.29998779],
[708.83398438, 708.83398438, 708.83398438, 708.83398438]])
matrix2 = array([[ 72.02400208, 77.68997192, 115.6057663 , 105.64997101],
[120.98195648, 77.68997192, 247.19802856, 105.64997101],
[252.6330719 , 77.68997192, 337.25634766, 105.64997101],
[342.63256836, 77.68997192, 365.60125732, 105.64997101],
[ 72.02400208, 113.53997803, 189.65515137, 149.53997803],
[196.87202454, 113.53997803, 308.13119507, 149.53997803],
[315.3480835 , 113.53997803, 405.77023315, 149.53997803],
[412.86999512, 113.53997803, 482.0453186 , 149.53997803],
[ 72.02400208, 155.81002808, 108.98254395, 183.77003479]])
I need to compare all the rows of matrix2 with every row of matrix1. How can this be done without looping over the rows of matrix1?
If it is about element-wise comparison of the rows, then check this example:
import numpy as np
# Generate sample arrays
a = np.random.randint(0, 5, size=(4, 3))
b = np.random.randint(-1, 6, size=(5, 3))
# Compare every row of b with every row of a via broadcasting
a == b[:, None]
The last line does the comparison for you. The output array will have shape (num_of_b_rows, num_of_a_rows, common_num_of_cols): in this case, (5, 4, 3).
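To make the shapes concrete, here is a short sketch with fixed arrays instead of random ones. The values are made up for illustration. It also reduces the element-wise result to a per-row match, which may or may not be what the question is after. For float data like matrix1 and matrix2, np.isclose(a, b[:, None]) could take the place of == if exact equality is too strict.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9],
              [1, 2, 0]])  # shape (4, 3)
b = np.array([[4, 5, 6],
              [1, 2, 3],
              [0, 0, 0],
              [7, 8, 9],
              [1, 2, 9]])  # shape (5, 3)

# b[:, None] has shape (5, 1, 3), so the comparison broadcasts to (5, 4, 3)
elementwise = a == b[:, None]

# row_match[i, j] is True when row i of b equals row j of a in every column
row_match = elementwise.all(axis=-1)

# Index pairs (row of b, row of a) that match completely
b_idx, a_idx = np.nonzero(row_match)
print(list(zip(b_idx.tolist(), a_idx.tolist())))
```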

How to understand the conv2d_transpose in tensorflow

The following is a test for conv2d_transpose.
import tensorflow as tf
import numpy as np
x = tf.constant(np.array([[
[[-67], [-77]],
[[-117], [-127]]
]]), tf.float32)
# shape = (3, 3, 1, 1) -> (height, width, output_channels, input_channels) for conv2d_transpose - 3x3x1 filter
f = tf.constant(np.array([
[[[-1]], [[2]], [[-3]]],
[[[4]], [[-5]], [[6]]],
[[[-7]], [[8]], [[-9]]]
]), tf.float32)
conv = tf.nn.conv2d_transpose(x, f, output_shape=(1, 5, 5, 1), strides=[1, 2, 2, 1], padding='VALID')
The result:
tf.Tensor(
[[[[ 67.]
[ -134.]
[ 278.]
[ -154.]
[ 231.]]
[[ -268.]
[ 335.]
[ -710.]
[ 385.]
[ -462.]]
[[ 586.]
[ -770.]
[ 1620.]
[ -870.]
[ 1074.]]
[[ -468.]
[ 585.]
[-1210.]
[ 635.]
[ -762.]]
[[ 819.]
[ -936.]
[ 1942.]
[-1016.]
[ 1143.]]]], shape=(1, 5, 5, 1), dtype=float32)
To my understanding, it should work as described in Figure 4.5 in the doc.
Therefore, the first element (conv[0,0,0,0]) should be -67*-9=603. Why does it turn out to be 67?
The result may be explained by the following image. But why is the convolution kernel inverted?
To explain this, I made a draw.io figure that illustrates the results you obtained.
I hope the illustration above helps explain why the first element of the transposed-convolution feature map is 67.
A key thing to note:
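Unlike traditional convolution, in transposed convolution each element of the input feature map is multiplied by the whole filter, and the resulting intermediate feature maps are overlaid (summed) on one another to create the final feature map. The stride determines how far apart the overlays are. In our case, stride = 2, hence the filter moves by 2 in both the x and y dimensions for each element of the original downsampled feature map. The top-left output element is covered only by the first overlay, so it equals x[0,0] * f[0,0] = -67 * -1 = 67. Viewed as a direct convolution over the zero-padded input, this is the same as sliding the kernel rotated by 180 degrees, which is why the kernel appears inverted.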
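The overlay description can be checked with a few lines of NumPy. This is a sketch of the single-channel, padding='VALID' case only, and the helper name conv2d_transpose_valid is made up for this example. It is not how TensorFlow implements the op internally.

```python
import numpy as np


def conv2d_transpose_valid(x, f, stride):
    """Single-channel transposed convolution by overlaying scaled filter copies."""
    h, w = x.shape
    kh, kw = f.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            # Each input element scales the whole filter; overlaps are summed
            out[i * stride:i * stride + kh, j * stride:j * stride + kw] += x[i, j] * f
    return out


x = np.array([[-67., -77.],
              [-117., -127.]])
f = np.array([[-1., 2., -3.],
              [4., -5., 6.],
              [-7., 8., -9.]])

out = conv2d_transpose_valid(x, f, stride=2)
print(out)
```

The corner out[0, 0] receives only -67 * -1 = 67, while the centre out[2, 2] is where all four overlays meet (603 + 539 + 351 + 127 = 1620), matching the output in the question.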

Why are the convolution outputs calculated with theano and numpy not the same?

I made a simple example IPython notebook to calculate a convolution with theano and with numpy; however, the results are different. Does anybody know where the mistake is?
import theano
import numpy
from theano.sandbox.cuda import dnn
import theano.tensor as T
Define the input image x0:
x0 = numpy.array([[[[ 7.61323881, 0. , 0. , 0. ,
0. , 0. ],
[ 25.58142853, 0. , 0. , 0. ,
0. , 0. ],
[ 7.51445341, 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 12.74498367, 4.96315479, 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. ],
[ 0. , 0. , 0. , 0. ,
0. , 0. ]]]], dtype='float32')
x0.shape
# (1, 1, 6, 6)
Define the convolution kernel:
w0 = numpy.array([[[[-0.0015835 , -0.00088091, 0.00226375, 0.00378434, 0.00032208,
-0.00396959],
[-0.000179 , 0.00030951, 0.00113849, 0.00012536, -0.00017198,
-0.00318825],
[-0.00263921, -0.00383847, -0.00225416, -0.00250589, -0.00149073,
-0.00287099],
[-0.00149283, -0.00312137, -0.00431571, -0.00394508, -0.00165113,
-0.0012118 ],
[-0.00167376, -0.00169753, -0.00373235, -0.00337372, -0.00025546,
0.00072154],
[-0.00141197, -0.00099017, -0.00091934, -0.00226817, -0.0024105 ,
-0.00333713]]]], dtype='float32')
w0.shape
# (1, 1, 6, 6)
Calculate the convolution with theano and cudnn:
X = T.tensor4('input')
W = T.tensor4('W')
conv_out = dnn.dnn_conv(img=X, kerns=W)
convolution = theano.function([X, W], conv_out)
numpy.array(convolution(x0, w0))
# array([[[[-0.04749081]]]], dtype=float32)
Calculate convolution with numpy (note the result is different):
numpy.sum(x0 * w0)
# -0.097668208
I'm not exactly sure what kind of convolution you are trying to compute, but it seems to me that numpy.sum(x0*w0) might not be the way to do it. Does this help?
import numpy as np
# ... define x0 and w0 like in your example ...
np_convolution = np.fft.irfftn(np.fft.rfftn(x0) * np.fft.rfftn(w0))
The last element of the resulting array, i.e. np_convolution[-1,-1,-1,-1] is -0.047490807560833327, which seems to be the answer you're looking for in your notebook.
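A likely reason for the discrepancy is that a true convolution flips the kernel along both spatial axes, whereas numpy.sum(x0 * w0) computes a cross-correlation with the kernel as is. As far as I know, dnn_conv defaults to the flipping mode, but treat that as an assumption. The sketch below uses random 6x6 stand-ins rather than the arrays from the question and compares the flipped sum with the last element of the FFT-based circular convolution.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6))  # stand-in for x0[0, 0]
w = rng.normal(size=(6, 6))  # stand-in for w0[0, 0]

# Cross-correlation at zero shift: what numpy.sum(x0 * w0) computes
correlation = np.sum(x * w)

# Convolution at full overlap: the kernel is flipped along both axes first
convolution = np.sum(x * w[::-1, ::-1])

# Circular convolution via FFT; its last element is the full-overlap term
fft_conv = np.fft.irfft2(np.fft.rfft2(x) * np.fft.rfft2(w), s=x.shape)

print(correlation, convolution, fft_conv[-1, -1])
```

Applying the same flip to the arrays from the question, numpy.sum(x0 * w0[:, :, ::-1, ::-1]) works out to approximately -0.0475, in line with the theano result.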