Generate covariance matrices in batch (TensorFlow) - tensorflow

I wonder how to generate covariance matrices in batch in TensorFlow. If we use the following code
dim = 3
df = 5
ds = tf.contrib.distributions
scale_sqrt = tf.random_normal([dim, dim], seed=1234)
scale = tf.matmul(scale_sqrt, tf.transpose(scale_sqrt))
sigma = ds.WishartCholesky(df=df, scale=scale).sample()
it would work. But if we try the batch version of this code by adding an additional batch dimension then TF would throw an error. My batch version looks as follows:
dim = 3
df = 5
ds = tf.contrib.distributions
num_per_batch = 10
scale_sqrt = tf.random_normal([num_per_batch, dim, dim], seed=1234)
scale = tf.matmul(scale_sqrt, tf.transpose(scale_sqrt, [0,2,1]))
sigma = ds.WishartCholesky(df=df, scale=scale).sample()
Please let me know how to sample in batch efficiently.

For posteriority, this bug has been fixed, and should be available in the tf-nightly versions (when I call sess.run(...) on sigma, I get back values with no errors).

Related

How to perform custom operations in between keras layers?

I have one input and one output neural network and in between I need to perform small operation. I have two inputs (from the same distribution of either mean 0 or mean 1) which I need to fed to the neural network one at a time and compare the output of each input. After the comparison, I am finally generating the prediction of the model. The implementation is as follows:
from tensorflow import keras
import tensorflow as tf
import numpy as np
#define network
x1 = keras.Input(shape=(1), name="x1")
x2 = keras.Input(shape=(1), name="x2")
model = keras.layers.Dense(20)
model1 = keras.layers.Dense(1)
x11 = model1(model(x1))
x22 = model1(model(x2))
After this I need to perform following operations:
if x11>=x22:
Vm=x1
else:
Vm=x2
Finally I need to do:
out = Vm - 0.5
out= keras.activations.sigmoid(out)
model = keras.Model([x1,x2], out)
model.compile(
optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
loss=tf.keras.losses.binary_crossentropy,
metrics=['accuracy']
)
model.summary()
tf.keras.utils.plot_model(model) #visualize model
I have normally distributed pair of data with same mean (mean 0 and mean 1 as generated below:
#Generating training dataset
from scipy.stats import skewnorm
n=1000 #sample each
s = 1 # scale to change o/p range
X1_0 = skewnorm.rvs(a = 0 ,loc=0, size=n)*s; X1_1 = skewnorm.rvs(a = 0 ,loc=1, size=n)*s #Skewnorm function
X2_0 = skewnorm.rvs(a = 0 ,loc=0, size=n)*s; X2_1 = skewnorm.rvs(a = 0 ,loc=1, size=n)*s #Skewnorm function
X1_train = list(X1_0) + list(X1_1) #append both data
X2_train = list(X2_0) + list(X2_1) #append both data
y_train = [x for x in (0,1) for i in range(0, n)] #make Y for above conditions
#reshape to proper format
X1_train = np.array(X1_train).reshape(-1,1)
X2_train = np.array(X2_train).reshape(-1,1)
y_train = np.array(y_train)
#train model
model.fit([X1_train, X2_train], y_train, epochs=10)
I am not been able to run the program if I include operation
if x11>=x22:
Vm=x1
else:
Vm=x2
in between layers. If I directly work with maximum of outputs as:
Vm = keras.layers.Maximum()([x11,x22])
The program is working fine. But I need to select either x1 or x2 based on the value of x11 and x22.
The problem might be due to the inclusion of the comparison operation while defining structure of the model where there is no value for x11 and x22 (I guess). I am totally new to all these stuffs and so I could not resolve this. I would greatly appreciate any help/suggestions. Thank you.
You can add this functionality via a Lambda layer.
Vm = tf.keras.layers.Lambda(lambda x: tf.where(x[0]>=x[1], x[2], x[3]))([x11, x22, x1, x2])

Scipy Optimize minimize returns the initial value

I am building machine learning models for a certain data set. Then, based on the constraints and bounds for the outputs and inputs, I am trying to find the input parameters for the most minimized answer.
The problem which I am facing is that, when the model is a linear regression model or something like lasso, the minimization works perfectly fine.
However, when the model is "Decision Tree", it constantly returns the very initial value that is given to it. So basically, it does not enforce the constraints.
import numpy as np
import pandas as pd
from scipy.optimize import minimize
I am using the very first sample from the input data set for the optimization. As it is only one sample, I need to reshape it to (1,-1) as well.
x = df_in.iloc[0,:]
x = np.array(x)
x = x.reshape(1,-1)
This is my Objective function:
def objective(x):
x = np.array(x)
x = x.reshape(1,-1)
y = 0
for n in range(df_out.shape[1]):
y = Model[n].predict(x)
Y = y[0]
return Y
Here I am defining the bounds of inputs:
range_max = pd.DataFrame(range_max)
range_min = pd.DataFrame(range_min)
B_max=[]
B_min =[]
for i in range(range_max.shape[0]):
b_max = range_max.iloc[i]
b_min = range_min.iloc[i]
B_max.append(b_max)
B_min.append(b_min)
B_max = pd.DataFrame(B_max)
B_min = pd.DataFrame(B_min)
bnds = pd.concat([B_min, B_max], axis=1)
These are my constraints:
con_min = pd.DataFrame(c_min)
con_max = pd.DataFrame(c_max)
Here I am defining the constraint function:
def const(x):
x = np.array(x)
x = x.reshape(1,-1)
Y = []
for n in range(df_out.shape[1]):
y = Model[n].predict(x)[0]
Y.append(y)
Y = pd.DataFrame(Y)
a4 =[]
for k in range(Y.shape[0]):
a1 = Y.iloc[k,0] - con_min.iloc[k,0]
a2 = con_max.iloc[k, 0] - Y.iloc[k,0]
a3 = [a2,a1]
a4 = np.concatenate([a4, a3])
return a4
c = const(x)
con = {'type': 'ineq', 'fun': const}
This is where I try to minimize. I do not pick a method as the automatically picked model has worked so far.
sol = minimize(fun = objective, x0=x,constraints=con, bounds=bnds)
So the actual constraints are:
c_min = [0.20,1000]
c_max = [0.3,1600]
and the max and min range for the boundaries are:
range_max = [285,200,8,85,0.04,1.6,10,3.5,20,-5]
range_min = [215,170,-1,60,0,1,6,2.5,16,-18]
I think you should check the output of 'sol'. At times, the algorithm is not able to perform line search completely. To check for this, you should check message associated with 'sol'. In such a case, the optimizer returns initial parameters itself. There may be various reasons of this behavior. In a nutshell, please check the output of sol and act accordingly.
Arad,
If you have not yet resolved your issue, try using scipy.optimize.differential_evolution instead of scipy.optimize.minimize. I ran into similar issues, particularly with decision trees because of their step-like behavior resulting in infinite gradients.

Error in prediction in svm classifier after one hot encoding

I have used one hot encoding to my dataset before training my SVM classifier.
which increased number of features in training set to 982. But during
prediction of test dataset which has 7 features i am getting error " X has 7
features per sample; expecting 982". I don't understand how to increase
number of features in test dataset.
My code is:
df = pd.read_csv('train.csv',header=None);
features = df.iloc[:,:-1].values
labels = df.iloc[:,-1].values
encode = LabelEncoder()
features[:,2] = encode.fit_transform(features[:,2])
features[:,3] = encode.fit_transform(features[:,3])
features[:,4] = encode.fit_transform(features[:,4])
features[:,5] = encode.fit_transform(features[:,5])
df1 = pd.DataFrame(features)
#--------------------------- ONE HOT ENCODING --------------------------------#
hotencode = OneHotEncoder(categorical_features=[2])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[14])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[37])
features = hotencode.fit_transform(features).toarray()
hotencode = OneHotEncoder(categorical_features=[466])
features = hotencode.fit_transform(features).toarray()
X = np.array(features)
y = np.array(labels)
clf = svm.LinearSVC()
clf.fit(X,y)
d_test = pd.read_csv('query.csv')
Z_test =np.array(d_test)
confidence = clf.predict(Z_test)
print("The query image belongs to Class ")
print(confidence)
######################### test dataset
query.csv
1 0.076 1 3232236298 2886732679 3128 60604
The short answer: you need to apply the same OHE transform (or LE+OHE in your case) on the test set.
For a good advice, see Scikit Learn OneHotEncoder fit and transform Error: ValueError: X has different shape than during fitting or How to deal with imputation and hot one encoding in pandas?

Using Tensorflow for workload distribution

My code looks like:
import tensorflow as tf
N = 16, num_ckfs = 5
init_variances = tf.placeholder(tf.float64, shape=[ num_ckfs, N],name='inital_variances')
init_states = tf.placeholder(tf.float64, shape=[num_ckfs, N], name='init_states')
#some more code
predicted_state = prior_state_expanded + kalman_gain * diff_expanded
error_covariance = sum_cov_cholesky + tf.batch_matmul(kg , kalman_gain, adj_x=True)
projected_output = tf.batch_matmul(predicted_state,input_vectors_extra, adj_y=True)
session = tf.Session()
# read data from input file
init_var = [10 for i in range(N)]
init_var_ckfs = [init_var for i in range(num_ckfs)]
init_state = [0 for i in range(N)]
init_state_ckfs = [init_state for i in range(num_ckfs)]
for timestep in range(10):
out= session.run([projected_output, predicted_state, error_covariance], {init_variances:init_var_ckfs, init_states:init_state_ckfs })
init_state_ckfs = np.array([i.tolist()[0] for i in out[1]])
init_var_ckfs = np.array([i.diagonal().tolist() for i in out[2]])
This code is for running a Cubature Kalman Filter(CKF) in a batched mode. For example:
num_ckfs = 5
means that this code will run 5 CKFs in parallel. Now, what I would like to do is to distribute the workload to multiple nodes depending upon the value of num_ckfs. For example, if I pass num_ckfs as an argument to the code, and it is set to 20,000, then I would distribute the workload to 4 nodes running 5000 each.
I would like to do this using the distributed version of Tensorflow. Can someone please give me some hints on how this could be achieved? Ideally, I should have to execute the code on a single node and it should then get distributed to as many nodes as defined in
tf.train.ClusterSpec

Bayesian Probabilistic Matrix Factorization (BPMF) with PyMC3: PositiveDefiniteError using `NUTS`

I've implemented the Bayesian Probabilistic Matrix Factorization algorithm using pymc3 in Python. I also implemented it's precursor, Probabilistic Matrix Factorization (PMF). See my previous question for a reference to the data used here.
I'm having trouble drawing MCMC samples using the NUTS sampler. I initialize the model parameters using the MAP from PMF, and the hyperparameters using Gaussian random draws sprinkled around 0. However, I get a PositiveDefiniteError when setting up the step object for the sampler. I've verified that the MAP estimate from PMF is reasonable, so I expect it has something to do with the way the hyperparameters are being initialized. Here is the PMF model:
import pymc3 as pm
import numpy as np
import pandas as pd
import theano
import scipy as sp
data = pd.read_csv('jester-dense-subset-100x20.csv')
n, m = data.shape
test_size = m / 10
train_size = m - test_size
train = data.copy()
train.ix[:,train_size:] = np.nan # remove test set data
train[train.isnull()] = train.mean().mean() # mean value imputation
train = train.values
test = data.copy()
test.ix[:,:train_size] = np.nan # remove train set data
test = test.values
# Low precision reflects uncertainty; prevents overfitting
alpha_u = alpha_v = 1/np.var(train)
alpha = np.ones((n,m)) * 2 # fixed precision for likelihood function
dim = 10 # dimensionality
# Specify the model.
with pm.Model() as pmf:
pmf_U = pm.MvNormal('U', mu=0, tau=alpha_u * np.eye(dim),
shape=(n, dim), testval=np.random.randn(n, dim)*.01)
pmf_V = pm.MvNormal('V', mu=0, tau=alpha_v * np.eye(dim),
shape=(m, dim), testval=np.random.randn(m, dim)*.01)
pmf_R = pm.Normal('R', mu=theano.tensor.dot(pmf_U, pmf_V.T),
tau=alpha, observed=train)
# Find mode of posterior using optimization
start = pm.find_MAP(fmin=sp.optimize.fmin_powell)
And here is BPMF:
n, m = data.shape
dim = 10 # dimensionality
beta_0 = 1 # scaling factor for lambdas; unclear on its use
alpha = np.ones((n,m)) * 2 # fixed precision for likelihood function
logging.info('building the BPMF model')
std = .05 # how much noise to use for model initialization
with pm.Model() as bpmf:
# Specify user feature matrix
lambda_u = pm.Wishart(
'lambda_u', n=dim, V=np.eye(dim), shape=(dim, dim),
testval=np.random.randn(dim, dim) * std)
mu_u = pm.Normal(
'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
testval=np.random.randn(dim) * std)
U = pm.MvNormal(
'U', mu=mu_u, tau=lambda_u, shape=(n, dim),
testval=np.random.randn(n, dim) * std)
# Specify item feature matrix
lambda_v = pm.Wishart(
'lambda_v', n=dim, V=np.eye(dim), shape=(dim, dim),
testval=np.random.randn(dim, dim) * std)
mu_v = pm.Normal(
'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
testval=np.random.randn(dim) * std)
V = pm.MvNormal(
'V', mu=mu_v, tau=lambda_v, shape=(m, dim),
testval=np.random.randn(m, dim) * std)
# Specify rating likelihood function
R = pm.Normal(
'R', mu=theano.tensor.dot(U, V.T), tau=alpha,
observed=train)
# `start` is the start dictionary obtained from running find_MAP for PMF.
for key in bpmf.test_point:
if key not in start:
start[key] = bpmf.test_point[key]
with bpmf:
step = pm.NUTS(scaling=start)
At the last line, I get the following error:
PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [ 0 2 ... 2206 2207 ]
As I understand it, I can't use find_MAP with models that have hyperpriors like BPMF. This is why I'm attempting to initialize with the MAP values from PMF, which uses point estimates for the parameters on U and V rather than parameterized hyperpriors.
Unfortunately the Wishart distribution is non-functional. I recently added a warning here: https://github.com/pymc-devs/pymc3/commit/642f63973ec9f807fb6e55a0fc4b31bdfa1f261e
See here for more discussions on this tricky distribution: https://github.com/pymc-devs/pymc3/issues/538
You could confirm that that's the source by fixing the covariance matrix. If that's the case, I'd try using the JKL prior distribution: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/LKJ_correlation.py