It's possible to read dense data this way:
import numpy as np
import tensorflow as tf

m = np.ones((2, 3), dtype=np.int32)  # dtype must match the placeholder
placeholder = tf.placeholder(tf.int32, shape=m.shape)
with tf.Session() as sess:
    sess.run(placeholder, feed_dict={placeholder: m})
How can I read a SciPy sparse matrix (for example scipy.sparse.csr_matrix) into a tf.placeholder, or maybe a tf.sparse_placeholder?
I think that currently TF does not have a good way to read sparse data. If you do not want to convert your sparse matrix into a dense one, you can try to construct a sparse tensor.
Here is what the official tutorial tells you:
SparseTensors don't play well with queues. If you use SparseTensors
you have to decode the string records using tf.parse_example after
batching (instead of using tf.parse_single_example before batching).
To feed a SciPy sparse matrix to a TF placeholder:
Option 1: use tf.sparse_placeholder. The question "Use coo_matrix in TensorFlow" shows how to feed data to a sparse_placeholder; a minimal sketch follows after this list.
Option 2: convert the sparse matrix to a dense NumPy matrix and feed it to a tf.placeholder (of course, this is impossible when the converted dense matrix does not fit in memory).
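Here is a minimal sketch of Option 1, assuming TF 1.x graph mode; the tiny identity matrix is only for illustration:

import numpy as np
import scipy.sparse
import tensorflow as tf

# Hypothetical input: any scipy.sparse matrix works after .tocoo().
csr = scipy.sparse.csr_matrix(np.eye(4, dtype=np.float32))
coo = csr.tocoo()  # COO exposes the row/col coordinates directly

sp = tf.sparse_placeholder(dtype=tf.float32, shape=coo.shape)
sp_value = tf.SparseTensorValue(
    indices=np.array([coo.row, coo.col]).T,  # N x 2 coordinate matrix
    values=coo.data,
    dense_shape=coo.shape)

with tf.Session() as sess:
    dense = sess.run(tf.sparse_tensor_to_dense(sp), feed_dict={sp: sp_value})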
Related
I am trying to code a custom metric for a U-Net model implemented using keras/tensorflow. In the metric, I need to apply the OpenCV function cv2.dilate to the ground truth. When I tried to use it, it raised an error because y_true is a tensor and cv2.dilate expects a NumPy array.
Any idea on how to implement this?
I tried to convert the tensor to a numpy array but it is not working.
I searched for a tensorflow implementation of cv2.dilate but couldn't find one.
One possibility, if you are using a simple rectangular kernel in your dilation, is to use tf.nn.max_pool2d as a replacement.
import numpy as np
import tensorflow as tf
import cv2

image = np.random.random((28, 28))
kernel_size = 3
# OpenCV dilation works on a grayscale image with H,W dimensions
dilated_cv = cv2.dilate(image, np.ones((kernel_size, kernel_size), np.uint8))
# TensorFlow max-pooling works with batch and channels: B,H,W,C dimensions
image_w_batch_and_channels = image[None, ..., None]
dilated_tf = tf.nn.max_pool2d(image_w_batch_and_channels, kernel_size, 1, "SAME")
# checking that the results are equal
print(np.allclose(dilated_cv, dilated_tf[0, ..., 0]))
However, given that you mention you are applying the dilation to the ground truth, it does not need to be differentiable. In that case, you can wrap your dilation in a tf.numpy_function:
from functools import partial

# Be sure to pass the correct output type: tf.float64 works in this specific
# case because NumPy defaults to float64, but it might be different in your case.
dilated_tf_npfunc = tf.numpy_function(
    partial(cv2.dilate, kernel=np.ones((kernel_size, kernel_size), np.uint8)),
    [image],
    tf.float64,
)
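Assuming eager execution (the TF 2.x default), you can sanity-check the wrapped call against the direct OpenCV result from the earlier snippet:

# image, kernel_size and dilated_cv as defined above
print(np.allclose(dilated_cv, dilated_tf_npfunc.numpy()))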
The problem can be described as zigzag scanning of a BxNxN tensor, where B is the batch dimension. However, I wonder if there is a TensorFlow implementation using something like tf.tensor_scatter_nd_update, which TensorFlow suggests.
I found a workaround using a 1x1 convolution. Use NumPy to generate a constant permutation conv kernel (TF does not support eager tensor assignment), then reshape the tensor (BxNxN) to Bx1x1x(NxN) before applying tf.nn.conv2d to it. Finally, do some reshape acrobatics to flatten it. A sketch of this workaround follows.
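Here is a minimal sketch of that workaround; the zigzag_indices helper and N=4 are hypothetical illustrations, and the code runs eagerly under TF 2.x:

import numpy as np
import tensorflow as tf

N = 4  # block size, chosen for illustration

def zigzag_indices(n):
    # Flat indices of an n x n block visited in JPEG-style zigzag order.
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  -p[1] if (p[0] + p[1]) % 2 else p[1]))
    return [i * n + j for i, j in order]

# Build the constant permutation kernel in NumPy, since TF tensors
# do not support item assignment.
kernel = np.zeros((1, 1, N * N, N * N), dtype=np.float32)
for k, i in enumerate(zigzag_indices(N)):
    kernel[0, 0, i, k] = 1.0  # output channel k reads flat input index i

x = tf.random.uniform((8, N, N))            # BxNxN input
x_flat = tf.reshape(x, (-1, 1, 1, N * N))   # Bx1x1x(NxN)
zig = tf.nn.conv2d(x_flat, kernel, strides=[1, 1, 1, 1], padding="VALID")
zig = tf.reshape(zig, (-1, N * N))          # B x (N*N), in zigzag order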
I have an autoencoder defined using tf.keras in tensorflow 1.15. I cannot upgrade tensorflow to 2.0 for some specific reasons.
This particular autoencoder is used for anomaly detection. I currently compute the AUC score of the autoencoder as follows:
1. All anomalous inputs are labelled 1 and all normal inputs are labelled 0. This is y_true.
2. I feed the autoencoder unseen inputs and then measure the reconstruction error, like so: errors = np.mean(np.square(data - model.predict(data)), axis=-1)
3. The mean of this array is then taken to be the predicted label, y_pred.
4. I then compute the AUC using auc = metrics.roc_auc_score(y_true, y_pred).
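Condensed into code, that pipeline looks roughly like this (a sketch with toy stand-ins: in the question, model.predict(data) produces the reconstruction, and y_true holds the real labels):

import numpy as np
from sklearn import metrics

rng = np.random.RandomState(0)
data = rng.rand(100, 8)                                          # unseen inputs
reconstruction = data + rng.normal(scale=0.1, size=data.shape)   # model.predict(data)
y_true = rng.randint(0, 2, size=100)                             # 0/1 anomaly labels

errors = np.mean(np.square(data - reconstruction), axis=-1)
auc = metrics.roc_auc_score(y_true, errors)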
This approach works well. I now need to move towards using tf.data.Dataset to feed in my data; previously, it was NumPy arrays. The issue is, I am unable to convert a tf.data.Dataset to a NumPy array and hence unable to compute the mean squared error as in step 2.
Once I have a tf.data.Dataset, I feed it for prediction like so: results = model.predict(x_test)
This yields a NumPy array, results. I want to compute the mean squared error of results against x_test. However, x_test is of type tf.data.Dataset. So the question is: how can I convert a tf.data.Dataset to a NumPy array in tensorflow 1.15, or what is an alternative way to do this?
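One possible approach in TF 1.15 graph mode is to iterate the dataset in a session and collect the batches. A minimal sketch, with hypothetical dataset contents:

import numpy as np
import tensorflow as tf  # TF 1.15, graph mode

x_test = tf.data.Dataset.from_tensor_slices(
    np.random.rand(100, 32).astype(np.float32)).batch(10)

next_batch = x_test.make_one_shot_iterator().get_next()
batches = []
with tf.Session() as sess:
    while True:
        try:
            batches.append(sess.run(next_batch))  # each batch is a NumPy array
        except tf.errors.OutOfRangeError:
            break
x_test_np = np.concatenate(batches, axis=0)  # plain NumPy array again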
I tried searching the documentation online but I can't find anything that gives me an answer. What does the .numpy() function do? The example code given is:
y_true = []
for X_batch, y_batch in mnist_test:
    y_true.append(y_batch.numpy()[0].tolist())
In both PyTorch and TensorFlow, the .numpy() method is pretty much straightforward: it converts a tensor object into a numpy.ndarray object. This implicitly means that the converted tensor will now be processed on the CPU.
Whenever you have a problem understanding some PyTorch function, you can ask help().
import torch
t = torch.tensor([1,2,3])
help(t.numpy)
Out:
Help on built-in function numpy:
numpy(...) method of torch.Tensor instance
numpy() -> numpy.ndarray
Returns :attr:`self` tensor as a NumPy :class:`ndarray`. This tensor and the
returned :class:`ndarray` share the same underlying storage. Changes to
:attr:`self` tensor will be reflected in the :class:`ndarray` and vice versa.
This numpy() function is the converter from torch.Tensor to a NumPy array.
If we look at the code below, we see a simple example where Tensors and NumPy arrays are converted automatically by operations, and explicitly by the .numpy() method.
import numpy as np
import tensorflow as tf

ndarray = np.ones([3, 3])
print("TensorFlow operations convert numpy arrays to Tensors automatically")
tensor = tf.multiply(ndarray, 42)
print(tensor)
print("And NumPy operations convert Tensors to numpy arrays automatically")
print(np.add(tensor, 1))
print("The .numpy() method explicitly converts a Tensor to a numpy array")
print(tensor.numpy())
In the second-to-last print statement, we see that the TensorFlow documentation itself describes .numpy() as the explicit converter from a Tensor to a numpy array.
You may check it out here
I'm doing a matrix factorization in TensorFlow, and I want to use coo_matrix from scipy.sparse because it uses less memory and makes it easy to put all my data into my matrix for the training data.
Is it possible to use coo_matrix to initialize a variable in tensorflow?
Or do I have to create a session and feed the data into tensorflow using sess.run() with a feed_dict?
I hope that you understand my question and my problem; otherwise comment and I will try to fix it.
The closest thing TensorFlow has to scipy.sparse.coo_matrix is tf.SparseTensor, which is the sparse equivalent of tf.Tensor. It will probably be easiest to feed a coo_matrix into your program.
A tf.SparseTensor is a slight generalization of COO matrices, where the tensor is represented as three dense tf.Tensor objects:
indices: An N x D matrix of tf.int64 values in which each row represents the coordinates of a non-zero value. N is the number of non-zeroes, and D is the rank of the equivalent dense tensor (2 in the case of a matrix).
values: A length-N vector of values, where element i is the value of the element whose coordinates are given on row i of indices.
dense_shape: A length-D vector of tf.int64, representing the shape of the equivalent dense tensor.
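As a concrete illustration of these three components (a toy example, not from the original answer), the 3x3 matrix [[0, 7, 0], [0, 0, 0], [5, 0, 0]] would be represented as:

import tensorflow as tf

st = tf.SparseTensor(
    indices=[[0, 1], [2, 0]],  # N=2 non-zeros, D=2 (a matrix)
    values=[7.0, 5.0],
    dense_shape=[3, 3])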
For example, you could use the following code, which uses tf.sparse_placeholder() to define a tf.SparseTensor that you can feed, and a tf.SparseTensorValue that represents the actual value being fed:
sparse_input = tf.sparse_placeholder(dtype=tf.float32, shape=[100, 100])
# ...
train_op = ...
coo_matrix = scipy.sparse.coo_matrix(...)
# Wrap `coo_matrix` in the `tf.SparseTensorValue` form that TensorFlow expects.
# SciPy stores the row and column coordinates as separate vectors, so we must
# stack and transpose them to make an indices matrix of the appropriate shape.
tf_coo_matrix = tf.SparseTensorValue(
    indices=np.array([coo_matrix.row, coo_matrix.col]).T,
    values=coo_matrix.data,
    dense_shape=coo_matrix.shape)
Once you have converted your coo_matrix to a tf.SparseTensorValue, you can feed sparse_input with the tf.SparseTensorValue directly:
sess.run(train_op, feed_dict={sparse_input: tf_coo_matrix})