My dataset is a NumPy array. Some tutorials say that, to take advantage of the GPU, we should convert the NumPy arrays to TensorFlow tensors and then use a TensorFlow model.
But after training, some code uses NumPy functions for testing and interactive use, while the code in the official TensorFlow tutorial keeps using the same TensorFlow model and tf.data.Dataset for testing.
I want to know:
When testing or applying the model in real time, should I use NumPy or TensorFlow tensors and models?
In other words, are there any downsides to using TensorFlow tensors and functions when I am not training?
For example, using
selected_words = tf.argsort(o_j)
instead of
selected_words = np.argsort(o_j)
Since TF tensors live on the GPU and NumPy arrays live on the CPU, converting from GPU to CPU requires a memory allocation and a copy through the CUDA API (see the PyCUDA documentation), which adds a small delay. Such a delay can be a problem during training because of the high-throughput data stream, but I think it can be ignored in most inference use cases. In any case, if selected_words is the desired output, we would normally prefer tf.argsort to keep an elegant end-to-end model. However, if the output is reused in multiple places, like logits, using np.argsort in that specific situation is fine.
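For illustration, here is a minimal sketch of the two options (the tensor and shapes below are made up for the example, not taken from the original post):

import numpy as np
import tensorflow as tf

o_j = tf.random.uniform((1, 10000))  # stand-in for the model's output scores

# Option 1: stay in TensorFlow; the op can run on the GPU and can even be kept
# inside the model so the exported model is end-to-end.
selected_words_tf = tf.argsort(o_j, direction="DESCENDING")

# Option 2: pull the scores back to the host and sort with NumPy; this forces a
# device-to-host copy, which is usually negligible at inference time.
selected_words_np = np.argsort(-o_j.numpy())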
I have a computation with for loops and calls to TensorFlow matrix routines such as tf.lstsq, plus TensorFlow iteration with tf.map_fn. I would like to profile it to see how much parallelism I am getting in tf.map_fn and in the matrix routines that get called.
This doesn't seem to be the use case at all for the TensorFlow Profiler, which is organized around the neural network model training loop.
Is there a way to use Tensorflow Profiler for arbitrary Tensorflow computations, or is the go-to move in this case to use NVidia tools like nvprof?
I figured out that the nvprof, nvvp, and Nsight tools I was looking for are available as a Conda install of cudatoolkit-dev. Usage is described in this gist.
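The TensorFlow Profiler can also be started and stopped programmatically around arbitrary computations, not only a Keras training loop. A rough sketch, assuming TF 2.x and made-up shapes for the kind of lstsq/map_fn computation described above:

import tensorflow as tf

@tf.function
def solve_batch(matrices, rhs):
    # Solve one least-squares problem per batch element with tf.map_fn, so the
    # trace shows how the per-element tf.linalg.lstsq calls are scheduled.
    return tf.map_fn(
        lambda ab: tf.linalg.lstsq(ab[0], ab[1]),
        (matrices, rhs),
        fn_output_signature=tf.float32,
    )

matrices = tf.random.normal((64, 100, 10))
rhs = tf.random.normal((64, 100, 1))

tf.profiler.experimental.start("logdir")  # trace written to ./logdir, viewable in TensorBoard
result = solve_batch(matrices, rhs)
tf.profiler.experimental.stop()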
I have been looking to learn TensorFlow and I have noticed that different functions are used for the same goal. To square a variable for instance, I have seen tf.square(), tf.math.square() and tf.keras.backend.square(). This is the same for most math operations. Are all these the same or is there any difference?
Mathematically, they should produce the same result. However, the functions under tensorflow.math are meant to operate on TensorFlow tensors.
For example, when you write a custom loss or metric, the inputs and outputs should be TensorFlow tensors, so that TensorFlow knows how to take gradients of the function. You can also use the tf.keras.backend.* functions for custom losses and metrics.
Try to use the tensorflow.math functions whenever you can; native operations are preferred because they are officially documented and guaranteed to keep backward compatibility across TF versions such as TF 1.x and TF 2.x.
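As a minimal sketch of that point (my own example, not from the original answer), a custom loss written with tf.math ops stays inside the TF graph, so gradients flow through it:

import tensorflow as tf

def mse_loss(y_true, y_pred):
    # tf.square, tf.math.square, and tf.keras.backend.square would all compute the
    # same thing here; any of them keeps the computation differentiable by TF.
    return tf.reduce_mean(tf.math.square(y_true - y_pred))

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss=mse_loss)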
Since tf.data augmentations are executed only on the CPU, I need a way to run certain augmentations on the TPU for an audio project.
For example,
CPU: TFRecord read -> audio crop -> noise addition.
TPU: spectrogram -> MixUp augmentation.
Most augmentations can be done as a Keras layer on top of the model, but MixUp requires changes to both the input and the label.
Is there a way to do this using the tf.keras APIs?
And if there is any way to move part of the tf.data pipeline to run on the TPU, that would also be helpful.
As you have rightly mentioned, and as per the TensorFlow documentation, tf.data preprocessing is done on the CPU only.
However, as a workaround you can do the preprocessing on the TPU/GPU by applying the transformation function directly inside your model, with something like the code below.
# `transform` is your augmentation/preprocessing function; it must be built from TF ops
inputs = tf.keras.layers.Input((512, 512, 3))
x = tf.keras.layers.Lambda(transform)(inputs)
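A slightly fuller sketch of the same idea (here `augment` is a hypothetical stand-in for your transformation; it only has to be built from TF ops so it can run on the accelerator):

import tensorflow as tf

def augment(images):
    # Any TF-op-based transformation works here; a random flip is just an example.
    return tf.image.random_flip_left_right(images)

inputs = tf.keras.layers.Input((512, 512, 3))
x = tf.keras.layers.Lambda(augment)(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)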
You can follow this Kaggle post for detailed discussion on this topic.
See the TensorFlow guide that discusses preprocessing data before the model or inside the model. By including preprocessing inside the model, the GPU is leveraged instead of the CPU, the model becomes portable, and training/serving skew is reduced. The guide also has multiple recipes to get you started. It doesn't explicitly state that this works on a TPU, but it can be tried.
I am making a neural network using TensorFlow and I ran into a problem trying to use a generator to split my data up: basically, it's too slow.
My training data consists of 52x52 numpy arrays. I need to split each array into a 52x52x3 array before I input it into my NN. As mentioned, I have a generator that does this, but I noticed that even though my NN is running on the GPU, my GPU usage is very low (usually under 10%). I think this might be caused by the generator running on the CPU.
Is there any way of running my generator on the GPU?
What I tried:
- I thought about using pyCUDA to implement the generator on the GPU, but found that TensorFlow and pyCUDA don't support each other.
- I tried using the from_generator function from the Dataset API (a simplified sketch of this attempt is below), as mentioned here:
https://www.tensorflow.org/api_docs/python/tf/contrib/data/Dataset
But while having issues with it, I ran into this GitHub thread mentioning that this function isn't supported to run on the GPU anyway:
https://github.com/tensorflow/tensorflow/issues/13610
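For reference, a simplified sketch of roughly what my from_generator attempt looked like (not my exact code; the channel expansion here is a stand-in for my actual splitting logic, and training_arrays is my in-memory list of 52x52 arrays):

import numpy as np
import tensorflow as tf

def gen():
    for arr in training_arrays:
        # expand each 52x52 array to 52x52x3 on the CPU
        yield np.repeat(arr[..., np.newaxis], 3, axis=-1).astype(np.float32)

dataset = tf.data.Dataset.from_generator(
    gen, output_types=tf.float32, output_shapes=(52, 52, 3)
).batch(32).prefetch(1)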
Any help would be greatly appreciated.
I have a PyTorch model and a TensorFlow model, and I want to train them together on one GPU, following the process below: input --> pytorch model --> output_pytorch --> tensorflow model --> output_tensorflow --> pytorch model.
Is it possible to do this? If the answer is yes, are there any problems I will encounter?
Thanks in advance.
I haven't done this, but it is possible; implementing it can be a little tricky.
You can consider each network as a function. You want to, in some sense, compose these functions to form your combined network, and to do this you can compute the final function by feeding the result of one network into the other, then use the chain rule to compute the derivatives (using the symbolic differentiation from both packages).
I think a good way to implement this might be to wrap the TF model as a PyTorch Function and use tf.gradients to compute the backward pass.
Doing the gradient updates can get hard (because some variables live in TF's computation graph). You could turn the TF variables into PyTorch Variables, replace them with placeholders in the TF computation graph, feed them via feed_dict, and update them using PyTorch's mechanisms, but I think that would be really hard to do. Instead, if you do your updates inside the backward method of the Function, you might be able to get the job done (it is ugly, but it might do the job).
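A rough sketch of the wrapping idea, assuming TF 2.x eager mode (so it uses tf.GradientTape instead of tf.gradients/feed_dict) and a stand-in Keras model called tf_model:

import tensorflow as tf
import torch

tf_model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation="relu")])

class TFBridge(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Copy the PyTorch tensor into TF, run the TF model, and keep the tape
        # so the backward pass can differentiate through the TF part.
        x_tf = tf.convert_to_tensor(x.detach().cpu().numpy())
        with tf.GradientTape() as tape:
            tape.watch(x_tf)
            y_tf = tf_model(x_tf)
        ctx.tape, ctx.x_tf, ctx.y_tf = tape, x_tf, y_tf
        return torch.from_numpy(y_tf.numpy()).to(x.device)

    @staticmethod
    def backward(ctx, grad_out):
        # Push the incoming PyTorch gradient back through the TF computation.
        grad_tf = tf.convert_to_tensor(grad_out.detach().cpu().numpy())
        grad_x = ctx.tape.gradient(ctx.y_tf, ctx.x_tf, output_gradients=grad_tf)
        return torch.from_numpy(grad_x.numpy()).to(grad_out.device)

x = torch.randn(2, 3, requires_grad=True)
y = TFBridge.apply(x)   # pytorch input -> tensorflow model -> pytorch output
y.sum().backward()      # gradients flow back into x through the TF tape
print(x.grad.shape)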