Convert a TensorFlow script to PyTorch (TransformedDistribution)

I am trying to rewrite a TensorFlow script in PyTorch. I have a problem finding the equivalent in torch for the following lines from this script:
import tensorflow_probability as tfp
tfd = tfp.distributions
a_distribution = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0.0, scale=1.0),
    bijector=tfp.bijectors.Chain([
        tfp.bijectors.AffineScalar(shift=self._means,
                                   scale=self._mags),
        tfp.bijectors.Tanh(),
        tfp.bijectors.AffineScalar(shift=mean, scale=std),
    ]),
    event_shape=[mean.shape[-1]],
    batch_shape=[mean.shape[0]])
In particular, I am having a hard time replacing the tfp.bijectors.Chain component.
I wrote the following lines in torch, but I am wondering whether they are compatible with the above TensorFlow code, and whether I can specify the batch_shape somewhere:
base_distribution = torch.normal(0.0, 1.0)
transforms = torch.distributions.transforms.ComposeTransform([
    torch.distributions.transforms.AffineTransform(loc=self._action_means, scale=self._action_mag, event_dim=mean.shape[-1]),
    torch.nn.Tanh(),
    torch.distributions.transforms.AffineTransform(loc=mean, scale=std, event_dim=mean.shape[-1]),
])
a_distribution = torch.distributions.transformed_distribution.TransformedDistribution(base_distribution, transforms)
Any solution?

In PyTorch, the base distribution class Distribution expects both a batch_shape and an event_shape parameter. Notice that the subclass TransformedDistribution does not take such parameters (src code); they are inferred from the base distribution provided on initialization: see here and here.
You already found AffineTransform and ComposeTransform. Keep in mind you must stick with classes from torch.distributions.
This holds for torch.normal, which should be replaced with Normal. With this class, the shape is inferred from the provided loc and scale tensors.
The same goes for nn.Tanh, which should be replaced with TanhTransform.
Here is a minimal example using your transformation pipeline:
Imports:
import torch
from torch.distributions.normal import Normal
from torch.distributions import transforms as tT
from torch.distributions.transformed_distribution import TransformedDistribution
Parameters:
mean = torch.rand(2,2)
std = 1
_action_means, _action_mag = 0, 1
event_dim = mean.shape[-1]
Distribution definition:
a_distribution = TransformedDistribution(
    base_distribution=Normal(loc=torch.full_like(mean, 0),
                             scale=torch.full_like(mean, 1)),
    transforms=tT.ComposeTransform([
        tT.AffineTransform(loc=_action_means, scale=_action_mag, event_dim=event_dim),
        tT.TanhTransform(),
        tT.AffineTransform(loc=mean, scale=std, event_dim=event_dim)]))
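As a quick sanity check (just a sketch using the toy parameters above), you can sample from the resulting distribution and print the shapes PyTorch inferred, since there is no explicit batch_shape argument to pass:
sample = a_distribution.sample()
# batch_shape and event_shape are inferred from the base Normal and the transforms
print(sample.shape)
print(a_distribution.batch_shape, a_distribution.event_shape)
print(a_distribution.log_prob(sample).shape)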

How to use a custom loss function in GPflow 2?

I am new to GPflow and I am trying to figure out how to write a custom loss function to optimize the model. For my purpose, I need to manipulate the predicted output of the GP through different data treatments, and it is the output I get after these treatments that I would like to optimise the GP model against. For that purpose I would like to use the root mean square error as the loss function.
Workflow:
Input -> GP model -> GP_output -> Data treatment -> Predicted_output -> RMSE(Predicted_output, Observations)
I hope this makes sense.
Normally models are optimised doing something like this:
import gpflow as gf
import numpy as np

# GPflow expects 2D arrays of shape (N, D) for both inputs and targets
X = np.linspace(0, 100, num=100).reshape(-1, 1)
n = np.random.normal(scale=8, size=X.shape)
y_obs = 10 * np.sin(X) + n

model = gf.models.GPR(
    data=(X, y_obs),
    kernel=gf.kernels.SquaredExponential(),
)

gf.optimizers.Scipy().minimize(
    model.training_loss, model.trainable_variables, options=optimizer_config
)
I have figured out a workaround using the scipy minimize function to optimise using RMSE, but I would like to stay within the GPflow framework, where I can just pass model.trainable_variables as an argument and have a general function that also works if I have multiple input/output dimensions.
def objective_func(params):
    model.kernel.lengthscales.assign(params[0])
    model.kernel.variance.assign(params[1])
    model.likelihood.variance.assign(params[2])
    GP_output = model.predict_y(X)[0]
    GP_output = GP_output.numpy()
    Predicted_output = data_treatment_func(GP_output)
    return np.sqrt(np.square(np.subtract(Predicted_output, y_obs)).mean())

from scipy.optimize import minimize

res = minimize(objective_func,
               x0=(1.0, 1.0, 1.0))
I found the answer myself.
If you write your objective_func using TensorFlow instead of NumPy (e.g. tf.math.sqrt, tf.reduce_mean) you can simply pass that to gf.optimizers.Scipy().minimize(...) instead of model.training_loss:
import tensorflow as tf

def objective_func():
    GP_output = model.predict_y(X)[0]
    Predicted_output = data_treatment_func(GP_output)
    return tf.sqrt(tf.reduce_mean(tf.square(Predicted_output - y_obs)))

gf.optimizers.Scipy().minimize(
    objective_func, model.trainable_variables, options=optimizer_config
)
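One thing to watch out for: for the gradients to reach model.trainable_variables, the data treatment itself has to stay in TensorFlow ops (no .numpy() conversion inside the objective). A minimal placeholder, assuming the treatment were a simple scaling, might look like this:
def data_treatment_func(gp_output):
    # Placeholder treatment kept in TensorFlow ops so gradients can flow;
    # replace with the real post-processing
    return 2.0 * gp_output + 1.0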

Streamlit with Tensorflow to analyse image and return the probability if is positive or negative

I'm trying to use TensorFlow for machine learning to analyze an image and return the probability of it being positive or negative, based on a model I created (an .h5 file). I couldn't find documentation for exactly that, or a repository, so even a link to read would be awesome.
Link for the application: https://share.streamlit.io/felipelx/hackathon/IDC_Detector.py
Libraries that I'm trying to use.
import numpy as np
import streamlit as st
import tensorflow as tf
from keras.models import load_model
The function to load the model.
@st.cache(allow_output_mutation=True)
def loadIDCModel():
    model_idc = load_model('models/IDC_model.h5', compile=False)
    model_idc.summary()
    return model_idc
The function to process the image, and what I'm trying to check: model.predict runs, but the percentage is not updating; regardless of the image, the value is always the same.
if uploaded_file is not None:
    # transform image to numpy array
    file_bytes = tf.keras.preprocessing.image.load_img(uploaded_file, target_size=(96,96), grayscale=False, interpolation='nearest', color_mode='rgb', keep_aspect_ratio=False)
    c.image(file_bytes, channels="RGB")

    Genrate_pred = st.button("Generate Prediction")
    if Genrate_pred:
        model = loadMetModel()
        input_arr = tf.keras.preprocessing.image.img_to_array(file_bytes)
        input_arr = np.array([input_arr])
        probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
        prediction = probability_model.predict(input_arr)
        dict_pred = {0: 'Benigno/Normal', 1: 'Maligno'}
        result = dict_pred[np.argmax(prediction)]

        value = 0
        if result == 'Benigno/Normal':
            value = str(((prediction[0][0])*100).round(2)) + '%'
        else:
            value = str(((prediction[0][1])*100).round(2)) + '%'

        c.metric('Predição', result, delta=value, delta_color='normal')
Thank you in advance for any help.
The first thing I'm noticing is that your function for loading the model is named loadIDCModel, but then the function you call for loading the model is loadMetModel. When I check your source code, though, it looks like you've already addressed this issue. I'd recommend updating your question to reflect this.
Playing around with your application, I think the issue is your model itself. I tried various images — images containing carcinomas, and even a picture of a cat — and each gave me a probability around 73%. The lowest score I got was 72.74%, and the highest was 73.11% (this one was the cat). It seems that the output percentage is varying slightly, hinting that rather than something being wrong in the code, your model itself is likely at fault. You might need to retrain your model, as it seems to have learned to always return a value of approximately 0.73.
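One way to confirm this (a rough diagnostic sketch; loadIDCModel is the loader from the question, and the inputs are deliberately artificial) is to run the model on a few very different inputs and check whether the softmax outputs barely change:
import numpy as np
import tensorflow as tf

model = loadIDCModel()
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

# Feed deliberately different inputs: random noise, an all-black image and an
# all-white image. A healthy model should not score them all nearly the same.
test_batch = np.stack([
    np.random.rand(96, 96, 3).astype("float32"),
    np.zeros((96, 96, 3), dtype="float32"),
    np.ones((96, 96, 3), dtype="float32"),
])
print(probability_model.predict(test_batch))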

Why does keras (SGD) optimizer.minimize() not reach global minimum in this example?

I'm in the process of completing a TensorFlow tutorial via DataCamp and am transcribing/replicating the code examples I am working through in my own Jupyter notebook.
Here are the original instructions from the coding problem:
I'm running the following snippet of code and am not able to arrive at the same results generated within the tutorial, which I have confirmed are the correct values via a connected scatterplot of x vs. loss_function(x), seen a bit further below.
# imports
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import Variable, keras

def loss_function(x):
    import math
    return 4.0*math.cos(x-1)+np.divide(math.cos(2.0*math.pi*x),x)

# Initialize x_1 and x_2
x_1 = Variable(6.0, np.float32)
x_2 = Variable(0.3, np.float32)

# Define the optimization operation
opt = keras.optimizers.SGD(learning_rate=0.01)

for j in range(100):
    # Perform minimization using the loss function and x_1
    opt.minimize(lambda: loss_function(x_1), var_list=[x_1])
    # Perform minimization using the loss function and x_2
    opt.minimize(lambda: loss_function(x_2), var_list=[x_2])

# Print x_1 and x_2 as numpy arrays
print(x_1.numpy(), x_2.numpy())
I drew a quick connected scatterplot to confirm (successfully) that the loss function I am using gets me back to the same graph provided by the example (seen in the screenshot above).
# Generate loss_function(x) values for given range of x-values
losses = []
for p in np.linspace(0.1, 6.0, 60):
    losses.append(loss_function(p))

# Define x,y coordinates
x_coordinates = list(np.linspace(0.1, 6.0, 60))
y_coordinates = losses

# Plot
plt.scatter(x_coordinates, y_coordinates)
plt.plot(x_coordinates, y_coordinates)
plt.title('Plot of Input values (x) vs. Losses')
plt.xlabel('x')
plt.ylabel('loss_function(x)')
plt.show()
Here are the resulting global and local minima, respectively, as per the DataCamp environment:
4.38 is the correct global minimum, and 0.42 indeed corresponds to the first local minimum on the graph's RHS (when starting from x_2 = 0.3).
And here are the results from my environment, both of which move in the opposite direction from where they should be heading when seeking to minimize the loss value:
I've spent the better part of the last 90 minutes trying to sort out why my results disagree with those of the DataCamp console, and why the optimizer fails to minimize this loss for this simple toy example.
I appreciate any suggestions that you might have after you've run the provided code in your own environment. Many thanks in advance!
As it turned out, the difference in outputs arose from the default precision of tf.divide() (vs. np.divide()) and tf.cos() (vs. math.cos()) -- operations that were used in my transcribed, "custom" definition of loss_function().
The loss_function() had been predefined in the body of the tutorial. When I "inspected" it using the inspect package (via inspect.getsourcelines(loss_function)) in order to redefine it in my own environment, the output of that inspection didn't clearly indicate that tf.divide and tf.cos had been used instead of the NumPy/math counterparts that my version of the code used.
The actual difference is quite small, but it is apparently sufficient to push the optimizer in the opposite direction (away from the two respective minima).
After swapping in tf.divide() and tf.cos() (as seen below), I was able to arrive at the same results as seen in the DC console.
Here is the code for the loss_function that leads back to the same results as seen in the console (screenshot):
def loss_function(x):
    import math
    return 4.0*tf.cos(x-1)+tf.divide(tf.cos(2.0*math.pi*x),x)
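To see the kind of numerical discrepancy being described, here is a quick, purely illustrative comparison of the two formulations at a single point (x = 6.0, the starting value of x_1):
import math
import numpy as np
import tensorflow as tf

x = 6.0
numpy_loss = 4.0*math.cos(x-1) + np.divide(math.cos(2.0*math.pi*x), x)
tf_loss = 4.0*tf.cos(x-1) + tf.divide(tf.cos(2.0*math.pi*x), x)

# The two values agree to several decimals but are not identical (float64 vs
# the float32 defaults of the TF ops) -- the small difference described above.
print(numpy_loss, float(tf_loss))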

Generating a *simple* TensorFlow graph illustration

I'm working on my first deep learning model using TensorFlow in a Jupyter notebook, and I would like to generate simplified graphs which illustrate the various layers of the network. Specifically, graphs such as those pictured in this answer:
This is very simple and clean, and I can understand what's going on, which is more important than capturing 100% of the details. Contrast that with the graph generated by TensorBoard, which is a complete fustercluck:
How can I take a tf.Graph object and automatically generate a graph similar to the one above? Bonus points if it can be displayed in the Jupyter Notebook, too.
In short: you cannot. TF is a low-level library that has no concept of "high-level operations"; it has ops, and those are the only thing it can visualise in the way you are thinking about. In particular, from a math perspective there are no "neurons" in your graph, there are just tensors being multiplied by each other; this additional "semantics" is there only to make it easier for humans to talk about the model, and it is not really encoded in your graph.
What you can do is group nodes yourself by specifying a variable_scope for sections of your graph; after displaying in TensorBoard they will then be shown as a single node, as sketched below. It will not give you the "per-neuron" flavour of visualisation, but at least it will hide many details. Creating nice, visually appealing visualisations of neural nets is an "art" in its own right, and a hard task to do in general.
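A rough sketch of that grouping idea, assuming the TF 1.x graph API this answer is written against (the layer names and sizes below are made up for illustration):
import tensorflow as tf  # TF 1.x graph-mode API

x = tf.placeholder(tf.float32, [None, 784], name="input")

# Everything created inside a variable_scope collapses into one expandable
# node in TensorBoard's graph view.
with tf.variable_scope("hidden_layer"):
    W = tf.get_variable("W", shape=[784, 128])
    b = tf.get_variable("b", shape=[128], initializer=tf.zeros_initializer())
    h = tf.nn.relu(tf.matmul(x, W) + b)

with tf.variable_scope("output_layer"):
    W_out = tf.get_variable("W", shape=[128, 10])
    logits = tf.matmul(h, W_out)

# Write the graph so TensorBoard can render it with the grouped nodes.
writer = tf.summary.FileWriter("./logs", tf.get_default_graph())
writer.close()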
Here's a snippet of code that we use in our PipelineAI notebooks to display our TensorFlow graphs inline within our Jupyter notebooks:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import re
from google.protobuf import text_format
from tensorflow.core.framework import graph_pb2

def convert_graph_to_dot(input_graph, output_dot, is_input_graph_binary):
    graph = graph_pb2.GraphDef()
    with open(input_graph, "rb") as fh:
        if is_input_graph_binary:
            graph.ParseFromString(fh.read())
        else:
            text_format.Merge(fh.read(), graph)
    with open(output_dot, "wt") as fh:
        print("digraph graphname {", file=fh)
        for node in graph.node:
            output_name = node.name
            print("  \"" + output_name + "\" [label=\"" + node.op + "\"];", file=fh)
            for input_full_name in node.input:
                parts = input_full_name.split(":")
                input_name = re.sub(r"^\^", "", parts[0])
                print("  \"" + input_name + "\" -> \"" + output_name + "\";", file=fh)
        print("}", file=fh)
    print("Created dot file '%s' for graph '%s'." % (output_dot, input_graph))

input_graph = '/root/models/optimize_me/linear/cpu/unoptimized_cpu.pb'
output_dot = '/root/notebooks/unoptimized_cpu.dot'
convert_graph_to_dot(input_graph=input_graph, output_dot=output_dot, is_input_graph_binary=True)
Using graphviz, you can convert the .dot to .png using a %%bash magic within your notebook cell:
%%bash
dot -T png /root/notebooks/unoptimized_cpu.dot \
-o /root/notebooks/unoptimized_cpu.png > /tmp/a.out
and finally, display the graph in your notebook:
from IPython.display import Image
Image('/root/notebooks/unoptimized_cpu.png', width=1024, height=768)
Here's an example of a simple linear regression model implemented in TensorFlow:
Here's the optimized version used to deploy and serve the TensorFlow Model in production (also rendered using the above code snippets):
More examples and details of these types of optimizations at http://pipeline.ai

sklearn's `RandomizedSearchCV` not working with `np.random.RandomState`

I am trying to optimize a pipeline and wanted to try giving RandomizedSearchCV a np.random.RandomState object. I can't get it to work, but I can give it other distributions.
Is there a special syntax I can use to give RandomizedSearchCV a np.random.RandomState(0).uniform(0.1, 1.0)?
from scipy import stats
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.grid_search import RandomizedSearchCV
# Generate data
x = np.random.normal(5,1,size=int(1e3))
# Make model
model = KernelDensity()
# Gridsearch for best params
# This one works
search_params = RandomizedSearchCV(model, param_distributions={"bandwidth":stats.uniform(0.1, 1)}, n_iter=30, n_jobs=2)
search_params.fit(x[:, None])
# RandomizedSearchCV(cv=None, error_score='raise',
# estimator=KernelDensity(algorithm='auto', atol=0, bandwidth=1.0, breadth_first=True,
# kernel='gaussian', leaf_size=40, metric='euclidean',
# metric_params=None, rtol=0),
# fit_params={}, iid=True, n_iter=30, n_jobs=2,
# param_distributions={'bandwidth': <scipy.stats._distn_infrastructure.rv_frozen object at 0x106ab7da0>},
# pre_dispatch='2*n_jobs', random_state=None, refit=True,
# scoring=None, verbose=0)
# This one doesn't work :(
search_params = RandomizedSearchCV(model, param_distributions={"bandwidth":np.random.RandomState(0).uniform(0.1, 1)}, n_iter=30, n_jobs=2)
# TypeError: object of type 'float' has no len()
What you observe is expected: the uniform method of an np.random.RandomState object immediately draws a sample at the time of the call, so RandomizedSearchCV is handed a plain float rather than a distribution.
Compared to that, your usage of scipy's stats.uniform() creates a distribution that is yet to be sampled from. (Although I'm not sure it's working as you expect in your case; be careful with the parameters, since stats.uniform(0.1, 1) samples from [0.1, 1.1], not [0.1, 1.0].)
If you want to incorporate something based on np.random.RandomState(), you have to build your own class, as mentioned in the docs:
This example uses the scipy.stats module, which contains many useful distributions for sampling parameters, such as expon, gamma, uniform or randint. In principle, any function can be passed that provides a rvs (random variate sample) method to sample a value. A call to the rvs function should provide independent random samples from possible parameter values on consecutive calls.
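For example, a minimal sketch of such a wrapper (the class name seeded_uniform is just illustrative; model and x are the objects defined in the question):
import numpy as np

class seeded_uniform:
    """Exposes the rvs() interface RandomizedSearchCV expects, backed by a
    seeded np.random.RandomState."""
    def __init__(self, low, high, seed=0):
        self.low, self.high = low, high
        self.rng = np.random.RandomState(seed)

    def rvs(self, random_state=None, size=None):
        # Each call returns a fresh, independent draw from [low, high)
        return self.rng.uniform(self.low, self.high, size=size)

search_params = RandomizedSearchCV(
    model,
    param_distributions={"bandwidth": seeded_uniform(0.1, 1.0)},
    n_iter=30,
    n_jobs=2,
)
search_params.fit(x[:, None])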