About feeding a sparse matrix into the graph - tensorflow

Because the data dimension is too large, I have to use a sparse matrix instead of a dense array.
However, the graph includes a CNN, and when I feed the sparse matrix directly, I'm told the CNN cannot receive a sparse tensor, so I have to run a 'sparse to dense' operation first.
The problem is that the data I feed (multiple sparse matrices) gets collapsed into a two-dimensional sparse matrix. (E.g. I have sparse matrix1 with dim [14,25500] and sparse matrix2 with dim [14,25500]; the shape I want to feed is [2,14,25500], but what I actually get is [28,25500].)
So I have to split the tensor after it enters the graph.
Is there any other way to solve this problem?

tf.stack is your friend
tf.stack([matrix1, matrix2]) # => [2,14,25500]
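Note that tf.stack operates on dense tensors, so it applies after the sparse-to-dense conversion. To build the [2,14,25500] shape while the data is still sparse, one option is a minimal sketch like the following, assuming TF 2.x's tf.sparse API and tiny made-up inputs:
import tensorflow as tf

# Two small sparse matrices standing in for the [14,25500] inputs.
matrix1 = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]], values=[1.0, 2.0],
                                 dense_shape=[14, 25500])
matrix2 = tf.sparse.SparseTensor(indices=[[3, 4]], values=[5.0],
                                 dense_shape=[14, 25500])

# [14,25500] -> [1,14,25500] each, then concatenate along the new axis.
stacked = tf.sparse.concat(axis=0,
                           sp_inputs=[tf.sparse.expand_dims(matrix1, 0),
                                      tf.sparse.expand_dims(matrix2, 0)])
print(stacked.dense_shape)  # [2, 14, 25500]

# Densify once, right before the CNN that cannot consume sparse input.
dense_batch = tf.sparse.to_dense(stacked)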

Related

Working with different Tensor types in Tensorflow

I'm struggling to work with different tensor types and operations between them. For example, a basic division tf.divide(a, b) is giving me the following error:
TypeError: Failed to convert elements of SparseTensor(indices=Tensor("inputs_8_copy:0", shape=(None, 2), dtype=int64), values=Tensor("cond/Cast_1:0", shape=(None,), dtype=float64), dense_shape=Tensor("inputs_10_copy:0", shape=(2,), dtype=int64)) to Tensor. Consider casting elements to a supported type. See https://www.tensorflow.org/api_docs/python/tf/dtypes for supported TF dtypes.
I was able to work around this by calling tf.sparse.to_dense on a and b. But the approach doesn't scale when the dataset is large. Nor does it work in general because I don't know the tensor type of all of the features (I'm working within a preprocessing_fn in TFT and the data comes from BigQuery).
This seems like a very common issue that should have a simple answer, but I can't find any information on it. Something as basic as division shouldn't cause this much trouble, should it?
It is a difficult question, in fact.
For element-wise division in particular, let a_i and b_i be scalars. If a_i = 0 and b_i is not zero, then a_i/b_i = 0; but what if a_i = 0 and b_i = 0, is a_i/b_i = 0?
Even worse, if a_i is not zero and b_i = 0, then a_i/b_i is infinite (and 0/0 is NaN in floating point)!
So if the divisor is a sparse tensor, the result will contain (possibly lots of) Infs and NaNs, unless the indices of both sparse matrices are the same. The same problem arises if you divide a dense matrix by a sparse matrix.
There is a nice workaround to multiply two sparse tensors element-wise here, based on the identity (a+b)^2 = a^2 + b^2 + 2ab, i.e. ab = ((a+b)^2 - a^2 - b^2)/2, where squaring a sparse tensor only touches its stored values.
It is also possible to compute the element-wise inverse of a sparse tensor C: tf.SparseTensor(indices=C.indices, values=1/C.values, dense_shape=C.dense_shape).
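For concreteness, here is a minimal sketch of both tricks, assuming TF 2.x's tf.sparse API (the helper names are made up):
import tensorflow as tf

def sparse_square(s):
    # Squaring touches only the stored values; the zero pattern is unchanged.
    return tf.sparse.SparseTensor(s.indices, tf.square(s.values), s.dense_shape)

def sparse_scale(s, c):
    return tf.sparse.SparseTensor(s.indices, c * s.values, s.dense_shape)

def sparse_mul(a, b):
    # From (a+b)^2 = a^2 + b^2 + 2ab:  ab = ((a+b)^2 - a^2 - b^2) / 2.
    out = sparse_square(tf.sparse.add(a, b))
    out = tf.sparse.add(out, sparse_scale(sparse_square(a), -1.0))
    out = tf.sparse.add(out, sparse_scale(sparse_square(b), -1.0))
    return sparse_scale(out, 0.5)

def sparse_reciprocal(c):
    # Inverts only the stored (non-zero) values; zeros stay zero rather than
    # becoming 1/0, which is exactly the Inf/NaN issue discussed above.
    return tf.sparse.SparseTensor(c.indices, 1.0 / c.values, c.dense_shape)

# Element-wise division a/b then follows as sparse_mul(a, sparse_reciprocal(b)),
# which is only meaningful where the supports of a and b line up.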
So there is this Inf/NaN issue for division. As for mixing a dense tensor and a sparse tensor, one option is to convert the sparse tensor to a dense one, but we want to avoid that. The other direction, converting the dense tensor to a sparse one, can be very inefficient if the tensor is not actually sparse.
All this to say that it does not seem to be a simple problem.

What is embedding_column doing in tensorflow

From the docs it seems to me that it uses an embedding matrix to transform a sparse, one-hot-encoding-like input vector into a dense vector. But how is this different from just using a fully connected layer?
Summarizing the answer from the comments here:
The main difference is efficiency. Instead of encoding data points as very long one-hot vectors and doing a matrix multiplication, embedding_column lets you use index vectors and do a matrix lookup.
To represent categories.
Both one-hot encoding and embedding columns are ways to represent categorical features.
One of the problems with one-hot encoding is that it doesn't encode any relationships between the categories. They are completely independent of each other, so the neural network has no way of knowing which ones are similar to each other.
This problem can be solved by representing each categorical feature with an embedding column. The idea is that each category gets a smaller, dense vector. Its values are weights, similar to the weights used for basic features in a neural network.
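To see the equivalence (and the efficiency gap), here is a minimal sketch with made-up sizes, comparing a direct embedding lookup with the one-hot matrix multiplication it replaces:
import tensorflow as tf

# Made-up sizes: 5 categories embedded into 3 dimensions.
num_categories, embedding_dim = 5, 3
embedding_matrix = tf.random.normal([num_categories, embedding_dim])

category_ids = tf.constant([0, 3, 3, 1])

# Efficient path: index directly into the embedding matrix.
via_lookup = tf.nn.embedding_lookup(embedding_matrix, category_ids)

# Equivalent but wasteful path: one-hot encode, then multiply.
one_hot = tf.one_hot(category_ids, depth=num_categories)
via_matmul = tf.matmul(one_hot, embedding_matrix)

print(tf.reduce_max(tf.abs(via_lookup - via_matmul)))  # ~0.0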
For more:
https://developers.googleblog.com/2017/11/introducing-tensorflow-feature-columns.html

Why do we flatten the data before we feed it into tensorflow?

I'm following the Udacity MNIST tutorial, and MNIST data is originally a 28*28 matrix. However, right before feeding that data, they flatten it into a 1-D array with 784 columns (784 = 28 * 28).
For example,
the original training set shape was (200000, 28, 28):
200000 rows (examples), each a 28*28 matrix.
They converted this into a training set whose shape is (200000, 784).
Can someone explain why they flatten the data out before feeding it to tensorflow?
Because when you're adding a fully connected layer, you always want your data to be a (1 or) 2-dimensional matrix, where each row is the vector representing one data point. That way, the fully connected layer is just a matrix multiplication between your input (of size (batch_size, n_features)) and the weights (of shape (n_features, n_outputs)), plus the bias and the activation function, and you get an output of shape (batch_size, n_outputs). Plus, you really don't need the original shape information in a fully connected layer, so it's OK to lose it.
It would be more complicated and less efficient to get the same result without reshaping first, which is why we always do it before a fully connected layer. For a convolutional layer, by contrast, you want to keep the data in its original (width, height) format.
That is a convention with fully connected layers. Fully connected layers connect every node in the previous layer with every node in the successive layer so locality is not an issue for this type of layer.
Additionally, defining the layer this way lets us compute the next step efficiently with the formula f(Wx + b) = y. This would not be as easy with multidimensional input, and reshaping the input is cheap and easy to accomplish.
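As a minimal sketch, with a small batch standing in for the (200000, 28, 28) training set:
import tensorflow as tf

batch = tf.random.normal([32, 28, 28])   # small stand-in batch

flat = tf.reshape(batch, [32, 28 * 28])  # -> (32, 784)

# The fully connected layer is then one matmul plus a bias.
W = tf.random.normal([784, 10])          # (n_features, n_outputs)
b = tf.zeros([10])
logits = tf.matmul(flat, W) + b          # -> (32, 10)
print(logits.shape)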

How to read SciPy sparse matrix into Tensorflow's placeholder

It's possible to feed dense data this way:
# tf - tensorflow, np - numpy, sess - session
m = np.ones((2, 3), dtype=np.int32)  # dtype matches the placeholder below
placeholder = tf.placeholder(tf.int32, shape=m.shape)
sess.run(placeholder, feed_dict={placeholder: m})
How to read scipy sparse matrix (for example scipy.sparse.csr_matrix) into tf.placeholder or maybe tf.sparse_placeholder ?
I think that currently TF does not have a good way to read sparse data. If you do not want to convert your sparse matrix into a dense one, you can try to construct a sparse tensor.
Here is what the official tutorial tells you:
SparseTensors don't play well with queues. If you use SparseTensors
you have to decode the string records using tf.parse_example after
batching (instead of using tf.parse_single_example before batching).
To feed a SciPy sparse matrix to a TF placeholder:
Option 1: use tf.sparse_placeholder. The answer to Use coo_matrix in TensorFlow below shows how to feed data to a sparse_placeholder.
Option 2: convert the sparse matrix to a dense NumPy matrix and feed it to a tf.placeholder (of course, this is impossible when the converted dense matrix does not fit in memory), as sketched below.
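A minimal sketch of Option 2, assuming the TF 1.x API used in the question and a small made-up matrix:
import numpy as np
import scipy.sparse
import tensorflow as tf

m = scipy.sparse.csr_matrix(np.eye(3, dtype=np.float32))  # stand-in data

placeholder = tf.placeholder(tf.float32, shape=m.shape)
with tf.Session() as sess:
    # .toarray() materializes the full dense matrix, so this only works
    # while the densified data still fits in memory.
    print(sess.run(placeholder, feed_dict={placeholder: m.toarray()}))
Option 1 is worked through in the next answer.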

Use coo_matrix in TensorFlow

I'm doing matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my training data into the matrix.
Is it possible to use a coo_matrix to initialize a variable in TensorFlow?
Or do I have to create a session and feed the data into TensorFlow using sess.run() with feed_dict?
I hope you understand my question and my problem; otherwise comment and I will try to clarify it.
The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. It will probably be easiest to feed a coo_matrix into your program.
A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:
indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
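For instance, the 2x3 dense matrix [[0, 7, 0], [0, 0, 5]] would be represented as:
st = tf.SparseTensor(indices=[[0, 1], [1, 2]],  # coordinates of the non-zeroes
                     values=[7.0, 5.0],         # their values, in the same order
                     dense_shape=[2, 3])        # shape of the equivalent dense matrix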
For example, you could use the following code, which uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue that represents the actual value being fed:
sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
# ...
train_op = ...
coo_matrix = scipy.sparse.coo_matrix(...)
# Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
# SciPy stores the row and column coordinates as separate vectors, so we must
# stack and transpose them to make an indices matrix of the appropriate shape.
tf_coo_matrix = tf.SparseTensorValue(
    indices=np.array([coo_matrix.row, coo_matrix.col]).T,
    values=coo_matrix.data,
    dense_shape=coo_matrix.shape)
Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:
sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})