Feeding tensorflow keras architecture with Sparse matrix of type scipy.sparse._csr.csr_matrix - tensorflow

Short Version:
I am trying to feed my data, a sparse matrix of type scipy.sparse._csr.csr_matrix, into a TensorFlow Keras neural network model. I would highly appreciate any guidance. todense() and toarray() are not options for me, and feeding the data in mini-batches is not preferred.
Long version (including my efforts):
The problem concerns a deep learning model with text, categorical and numerical features. My TfidfVectorizer creates a huge matrix which cannot be fed into a model in dense format.
text_cols = ['ca_name']
categorical_cols = ['cua_name','ca_category_modified']
numerical_cols = ['vidim1', 'vidim2', 'vidim3', 'vim', 'vid']
title_transformer = TfidfVectorizer()
numerical_transformer = MinMaxScaler()
categorical_transformer = OneHotEncoder(handle_unknown='ignore')
preprocessor = ColumnTransformer(
transformers=[
('title', title_transformer, text_cols[0]),
('num', numerical_transformer, numerical_cols),
('cat', categorical_transformer, categorical_cols)
])
# df['dur_linreg'] is my numerical target
X_train, X_test, y_train, y_test = train_test_split(df[text_cols+categorical_cols+numerical_cols], df['dur_linreg'], test_size=0.2, random_state=42)
# fit_transform the preprocessor on X_train, only transform X_test
X_train_transformed = preprocessor.fit_transform(X_train)
X_test_transformed = preprocessor.transform(X_test)
I can build and compile a model as following:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(64, activation='relu', input_shape=(X_train_transformed.shape[1],)))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')
But cannot fit it:
history = model.fit(X_train_transformed, y_train, epochs=20, batch_size=32, validation_data=(X_test_transformed, y_test))
InvalidArgumentError: Graph execution error: TypeError: 'SparseTensor' object is not subscriptable
Obviously because I am feeding the model with a sparse scipy.sparse._csr.csr_matrix matrix.
The size of my matrix and my resources prevent me from converting it to:
1) dense format:
X_train_transformed.todense()
MemoryError: Unable to allocate 205. GiB for an array with shape (275189, 100074) and data type float64
2) (obviously) array:
X_train_transformed.toarray()
MemoryError: Unable to allocate 205. GiB for an array with shape (275189, 100074) and data type float64
According to the post "https://stackoverflow.com/questions/41538692/using-sparse-matrices-with-keras-and-tensorflow" there are two approaches:
" Keep it as a scipy sparse matrix, then, when giving Keras a minibatch, make it dense
Keep it sparse all the way through, and use Tensorflow Sparse Tensors"
The second approach is preferred for me as well, so I tried the following.
However, again I could only build and compile the model without a problem:
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.models import Model
input_layer = Input(shape=(X_train_transformed.shape[1],), sparse=True)
dense1 = Dense(64, activation='relu')(input_layer)
dropout1 = Dropout(0.2)(dense1)
dense2 = Dense(64, activation='relu')(dropout1)
dropout2 = Dropout(0.2)(dense2)
output_layer = Dense(1)(dropout2)
model = Model(input_layer, output_layer)
model.compile(optimizer='adam', loss='mean_squared_error')
But cannot fit it:
history = model.fit(X_train_transformed, y_train, validation_data=(X_test_transformed, y_test), epochs=5, batch_size=32)
InvalidArgumentError: Graph execution error: TypeError: 'SparseTensor' object is not subscriptable
Lastly, in case it is relevant I am using Tensorflow version 2.11.0 installed January 2023.
Many Thanks in advance for your help.
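One way to follow the second approach quoted above is to convert the SciPy matrix into a tf.sparse.SparseTensor instead of passing the csr_matrix to fit directly. The following is only a minimal sketch under stated assumptions, not a verified fix for the error shown: the helper name csr_to_sparse_tensor and the tiny demo matrix are illustrative, and it has not been checked against the full 275189 x 100074 data.

```python
import numpy as np
import scipy.sparse as sp
import tensorflow as tf


def csr_to_sparse_tensor(matrix):
    """Convert a SciPy sparse matrix to a tf.sparse.SparseTensor (hypothetical helper)."""
    coo = matrix.tocoo()
    # SparseTensor expects an (nnz, 2) int64 array of [row, col] positions.
    indices = np.stack([coo.row, coo.col], axis=1).astype(np.int64)
    values = coo.data.astype(np.float32)
    tensor = tf.sparse.SparseTensor(indices=indices, values=values, dense_shape=coo.shape)
    # Many sparse ops expect row-major index order, so reorder to be safe.
    return tf.sparse.reorder(tensor)


# Tiny stand-in for X_train_transformed (illustrative data only).
demo = sp.csr_matrix(np.array([[0.0, 1.5, 0.0], [2.0, 0.0, 0.0]], dtype=np.float32))
sparse_x = csr_to_sparse_tensor(demo)
```

The resulting tensor could then be wrapped with tf.data.Dataset.from_tensor_slices((sparse_x, y)).batch(32) and passed to model.fit. That keeps the features sparse in memory, although the data is still consumed batch by batch.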

Related

Implementing Variational Auto Encoder using Tensorflow_Probability NOT on MNIST data set

I know there are many questions related to Variational Auto Encoders. However, this question differs from the existing ones in two aspects: 1) it is implemented using Tensorflow V2 and Tensorflow_probability; 2) it does not use MNIST or any other image data set.
As for the problem itself:
I am trying to implement a VAE using Tensorflow_probability and Keras, and I want to train and evaluate it on some synthetic data sets as part of my research. I provided the code below.
The implementation is done and the loss value decreases during training, but once I try to evaluate the trained model on my test set I face different errors.
I am fairly confident that the issue is related to the input/output shapes, but unfortunately I did not manage to solve it.
Here is the code:
import numpy as np
import tensorflow as tf
import tensorflow.keras as tfk
import tensorflow_probability as tfp
from tensorflow.keras import layers as tfkl
from sklearn.datasets import make_classification
from tensorflow_probability import layers as tfpl
from sklearn.model_selection import train_test_split
tfd = tfp.distributions
n_epochs = 5
n_features = 2
latent_dim = 1
n_units = 4
learning_rate = 1e-3
n_samples = 400
batch_size = 32
# Generate synthetic data / load data sets
x_in, y_in = make_classification(n_samples=n_samples, n_features=n_features, n_informative=2, n_redundant=0,
n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=[0.5, 0.5],
flip_y=0.01, class_sep=1.0, hypercube=True,
shift=0.0, scale=1.0, shuffle=False, random_state=42)
x_in = x_in.astype('float32')
y_in = y_in.astype('float32') # .reshape(-1, 1)
x_train, x_test, y_train, y_test = train_test_split(x_in, y_in, test_size=0.4, random_state=42, shuffle=True)
x_test, x_val, y_test, y_val = train_test_split(x_test, y_test, test_size=0.5, random_state=42, shuffle=True)
print("shapes:", x_train.shape, y_train.shape, x_test.shape, y_test.shape, x_val.shape, y_val.shape)
prior = tfd.Independent(tfd.Normal(loc=[tf.zeros(latent_dim)], scale=1.), reinterpreted_batch_ndims=1)
train_dataset = tf.data.Dataset.from_tensor_slices(x_train).batch(batch_size)
valid_dataset = tf.data.Dataset.from_tensor_slices(x_val).batch(batch_size)
test_dataset = tf.data.Dataset.from_tensor_slices(x_test).batch(batch_size)
encoder = tf.keras.Sequential([
tfkl.InputLayer(input_shape=[n_features, ], name='enc_input'),
tfkl.Lambda(lambda x: tf.cast(x, tf.float32)), # - 0.5
tfkl.Dense(n_units, activation='relu', name='enc_dense1'),
tfkl.Dense(int(n_units / 2), activation='relu', name='enc_dense2'),
tfkl.Dense(tfpl.MultivariateNormalTriL.params_size(latent_dim),
activation=None, name='mvn_triL1'),
tfpl.MultivariateNormalTriL(
# weight >> num_train_samples or some thing except 1 to convert VAE to beta-VAE
latent_dim, activity_regularizer=tfpl.KLDivergenceRegularizer(prior, weight=1.), name='bottleneck'),
])
decoder = tf.keras.Sequential([
tfkl.InputLayer(input_shape=latent_dim, name='dec_input'),
# tfkl.Dense(n_units, activation='relu', name='dec_dense1'),
# tfkl.Dense(int(n_units * 2), activation='relu', name='dec_dense2'),
tfpl.IndependentBernoulli([n_features], tfd.Bernoulli.logits, name='dec_output'),
])
vae = tfk.Model(inputs=encoder.inputs, outputs=decoder(encoder.outputs), name='VAE')
print("encoder:", encoder)
print(" ")
print("encoder.inputs:", encoder.inputs)
print(" ")
print(" encoder.outputs:", encoder.outputs)
print(" ")
print("decoder:", decoder)
print(" ")
print("decoder.inputs:", decoder.inputs)
print(" ")
print("decoder.outputs:", decoder.outputs)
print(" ")
# negative log likelihood i.e the E_{S(eps)} [p(x|z)];
# because the KL term was added in the last layer of the encoder, i.e., via activity_regularizer.
# this loss function takes two arguments, namely the original data points x, and the output of the model,
# which we call it rv_x (because it is a random variable)
negloglik = lambda x, rv_x: -rv_x.log_prob(x)
vae.compile(optimizer=tf.optimizers.Adam(learning_rate=learning_rate),
loss=negloglik,)
vae.summary()
history = vae.fit(train_dataset, epochs=n_epochs, validation_data=valid_dataset,)
print("x.shape:", x_test.shape)
x_hat = vae(x_test)
print("original:")
print(x_test)
print(" ")
print("Decoded Random Samples:")
print(x_hat.sample())
print(" ")
print("Decoded Means:")
print(x_hat.mean())
The Questions:
With the above code I receive the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 80 values, but the requested shape has 160 [Op:Reshape]
As far as I know, I can add as many layers as I want to the decoder model before its output layer, as is done in convolutional VAEs. Am I right?
If I uncomment the following two lines of code in decoder:
# tfkl.Dense(n_units, activation='relu', name='dec_dense1'),
# tfkl.Dense(int(n_units * 2), activation='relu', name='dec_dense2'),
I see the following warnings and the upcoming error:
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dec_dense1/kernel:0', 'dec_dense1/bias:0', 'dec_dense2/kernel:0', 'dec_dense2/bias:0'] when minimizing the loss.
And the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 640 values, but the requested shape has 160 [Op:Reshape]
Now the question is why the decoder layers are not used during training, as the warning indicates.
PS: I also tried passing x_train, x_val and x_test directly during training and evaluation, but it does not help.
Any help would indeed be appreciated.

Cannot convert a symbolic Tensor (dense_2_target_2:0) to a numpy array

I'm trying to implement an SVM as the last layer of a CNN for classification, using this code:
def custom_loss_value(y_true, y_pred):
    print(y_true)
    print(y_pred)
    X = y_pred
    print(X)
    Y = y_true
    Predict = []
    Prob = []
    scaler = StandardScaler()
    # X = scaler.fit_transform(X)
    param_grid = {'C': [0.1, 1, 8, 10], 'gamma': [0.001, 0.01, 0.1, 1]}
    SVM = GridSearchCV(SVC(kernel='rbf',probability=True), cv=3, param_grid=param_grid, scoring='auc', verbose=1)
    SVM.fit(X, Y)
    Final_Model = SVM.best_estimator_
    Predict = Final_Model.predict(X)
    Prob = Final_Model.predict_proba(X)
    return categorical_hinge(tf.convert_to_tensor(Y, dtype=tf.float32), tf.convert_to_tensor(Predict, dtype=tf.float32))
sgd = tf.keras.optimizers.SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss=custom_loss_value, optimizer=sgd, metrics=['accuracy'])
I'm getting the error: Cannot convert a symbolic Tensor (dense_2_target_2:0) to a numpy array
on the line SVM.fit(X, Y).
I also tried converting y_true and y_pred to NumPy arrays, but that produced an error as well.
To train a neural network with gradient descent, the model must be differentiable: you need to be able to take a gradient with respect to every trainable parameter.
Several problems arise in your code:
You can't directly train an SVM inside a Keras loss function. A Keras loss function receives TensorFlow tensors, must be built from TF ops, and must return a TensorFlow tensor. sklearn works with NumPy arrays or lists, but not with symbolic tensors.
It is very hard, and of little practical use, to train an SVM through backpropagation. Something about it can be read here.
You can train an SVM on top of a pretrained model, in place of the fully-connected layer.
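The last suggestion can be sketched as follows: take the activations of a hidden layer of an already trained network as features, then fit an sklearn SVC on those NumPy arrays outside of Keras. Everything in this sketch is an assumption for illustration: the random data, the small dense network standing in for the CNN, the layer name "features", and the SVC hyperparameters.

```python
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

# Illustrative stand-in data: 60 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8)).astype("float32")
y = (X[:, 0] > 0).astype(int)

# Small stand-in for the pretrained network (a real CNN would go here).
inputs = tf.keras.Input(shape=(8,))
hidden = tf.keras.layers.Dense(16, activation="relu", name="features")(inputs)
outputs = tf.keras.layers.Dense(2, activation="softmax")(hidden)
network = tf.keras.Model(inputs, outputs)
network.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")
network.fit(X, y, epochs=2, verbose=0)

# Reuse the trained layers up to the hidden layer as a fixed feature extractor.
feature_extractor = tf.keras.Model(inputs, hidden)
features = feature_extractor.predict(X, verbose=0)  # plain NumPy array

# The SVM is trained by sklearn, not by backpropagation.
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
svm.fit(features, y)
svm_predictions = svm.predict(features)
```

Because the SVM only ever sees the output of predict, no symbolic tensor reaches sklearn, which avoids the conversion error from the question. A grid search such as the one in the question could wrap the SVC in the same way.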

keras, MNIST classification with RNN model, problem about the output shape

I am trying to use Keras' functional API to build a recurrent neural network, but ran into problems with the output shape. Any help will be appreciated.
My code:
import tensorflow as tf
from tensorflow.python.keras.datasets import mnist
from tensorflow.python.keras.layers import Dense, CuDNNLSTM, Dropout
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.utils import normalize
from tensorflow.python.keras.utils import np_utils
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = normalize(x_train, axis=1), normalize(x_test, axis=1)
y_train = np_utils.to_categorical(y_train, 10)
y_test = np_utils.to_categorical(y_test, 10)
feature_input = tf.keras.layers.Input(shape=(28, 28))
x = tf.keras.layers.CuDNNLSTM(128, kernel_regularizer=tf.keras.regularizers.l2(l=0.0004), return_sequences=True)(feature_input)
y = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=feature_input, outputs=y)
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3, validation_data=(x_test, y_test))
ERROR:
ValueError: Error when checking target: expected dense to have 3 dimensions, but got array with shape (60000, 10)
Your data (targets) has shape (60000, 10).
Your model's output ('dense') has shape (None, length, 10).
Here None is the (variable) batch size, length is the middle dimension, which means "time steps" for an LSTM, and 10 is the number of units of the Dense layer.
Now, you don't have any sequence with time steps to process in an LSTM, so it doesn't really make sense. The model is interpreting "image rows" as sequential time steps and "image columns" as independent features. (If this was not your intention, you simply got lucky that it didn't raise an error for putting an image into an LSTM.)
Anyway, you can fix this error with return_sequences=False, which makes the LSTM return only its last output and so drops the length dimension. That does not mean this model is optimal for this case.
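The effect of return_sequences on the output shape can be checked without training. This is a minimal sketch that assumes the generic tf.keras.layers.LSTM layer in place of CuDNNLSTM and leaves out the regularizer; the variable names are illustrative.

```python
import tensorflow as tf

feature_input = tf.keras.layers.Input(shape=(28, 28))

# return_sequences=True keeps one output per time step (per image row).
seq = tf.keras.layers.LSTM(128, return_sequences=True)(feature_input)
seq_out = tf.keras.layers.Dense(10, activation="softmax")(seq)
seq_model = tf.keras.Model(inputs=feature_input, outputs=seq_out)

# return_sequences=False keeps only the last time step.
last = tf.keras.layers.LSTM(128, return_sequences=False)(feature_input)
last_out = tf.keras.layers.Dense(10, activation="softmax")(last)
model = tf.keras.Model(inputs=feature_input, outputs=last_out)

print(seq_model.output_shape)  # (None, 28, 10)
print(model.output_shape)      # (None, 10)
```

Only the second model matches targets of shape (60000, 10). Separately, the question's targets are one-hot encoded with to_categorical, for which categorical_crossentropy is the matching loss; sparse_categorical_crossentropy expects integer class labels.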

Keras model with several inputs and several outputs

I want to build a Keras model with two inputs and two outputs which both use the same architecture/weights. Both outputs are then used to compute a single loss.
Here is a picture of my desired architecture.
This is my pseudo code:
model = LeNet(inputs=[input1, input2, input3],outputs=[output1, output2, output3])
model.compile(optimizer='adam',
loss=my_custom_loss_function([output1, output2, output3], target),
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
Can this approach work?
Do I need to use a different Keras API?
The architecture is fine. Here is a toy example with training data of how it can be defined using keras' functional API:
from keras.models import Model
from keras.layers import Dense, Input
# two separate inputs
in_1 = Input((10,10))
in_2 = Input((10,10))
# both inputs share these layers
dense_1 = Dense(10)
dense_2 = Dense(10)
# both inputs are passed through the layers
out_1 = dense_1(dense_2(in_1))
out_2 = dense_1(dense_2(in_2))
# create and compile the model
model = Model(inputs=[in_1, in_2], outputs=[out_1, out_2])
model.compile(optimizer='adam', loss='mse')
model.summary()
# train the model on some dummy data
import numpy as np
i_1 = np.random.rand(10, 10, 10)
i_2 = np.random.rand(10, 10, 10)
model.fit(x=[i_1, i_2], y=[i_1, i_2])
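Edit: given that you want to compute the losses together, you can use Concatenate() (imported from keras.layers) and pass the result as the model's single output:
output = Concatenate()([out_1, out_2])
Any loss function you pass into model.compile will be applied to output in its combined state. Concatenate() joins along the last axis by default, so after you get the output from a prediction you can split it back into its original parts along that axis, where n is the size of out_1's last dimension:
f = model.predict(...)
out_1, out_2 = f[..., :n], f[..., n:]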
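Put together, the shared-weights model with a single joint loss might look like the sketch below. The loss function joint_loss is a hypothetical placeholder (plain mean squared error over the combined tensor), the data is random, and the tf.keras import paths are an assumption. Concatenate() joins along the last axis by default, so the prediction is split along that axis.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Concatenate, Dense, Input
from tensorflow.keras.models import Model

# Two separate inputs passed through the same two layers (shared weights).
in_1 = Input((10, 10))
in_2 = Input((10, 10))
dense_1 = Dense(10)
dense_2 = Dense(10)
out_1 = dense_1(dense_2(in_1))
out_2 = dense_1(dense_2(in_2))

# Join both outputs on the last axis: shape (batch, 10, 20).
output = Concatenate()([out_1, out_2])


def joint_loss(y_true, y_pred):
    # Hypothetical joint loss: one scalar computed over both outputs at once.
    return tf.reduce_mean(tf.square(y_true - y_pred))


model = Model(inputs=[in_1, in_2], outputs=output)
model.compile(optimizer="adam", loss=joint_loss)

# Dummy data; the target is concatenated the same way as the output.
i_1 = np.random.rand(10, 10, 10).astype("float32")
i_2 = np.random.rand(10, 10, 10).astype("float32")
target = np.concatenate([i_1, i_2], axis=-1)
model.fit(x=[i_1, i_2], y=target, epochs=1, verbose=0)

# Split the prediction back into its two parts along the last axis.
f = model.predict([i_1, i_2], verbose=0)
n = 10  # size of out_1's last dimension
pred_1, pred_2 = f[..., :n], f[..., n:]
```

Inside joint_loss the two halves can be recovered with the same slicing on y_pred if the loss needs to treat them differently.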

How can i create a model in Keras and train it using Tensorflow?

Is it possible to create a model with Keras and without using compile and fit functions in Keras, use Tensorflow to train the model?
Sure. From Keras documentation:
Useful attributes of Model
model.layers is a flattened list of the layers comprising the model graph.
model.inputs is the list of input tensors.
model.outputs is the list of output tensors.
If you use Tensorflow backend, inputs and outputs are Tensorflow tensors, so you can use them without using Keras.
You can use keras to define a complicated graph:
import tensorflow as tf
sess = tf.Session()
from keras import backend as K
K.set_session(sess)
from keras.layers import Dense, Input
from keras.objectives import categorical_crossentropy
img = Input(shape=(784,))
labels = Input(shape=(10,)) #one-hot vector
x = Dense(128, activation='relu')(img)
x = Dense(128, activation='relu')(x)
preds = Dense(10, activation='softmax')(x)
Then use TensorFlow to configure a complicated optimization and training procedure:
loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
init_op = tf.global_variables_initializer()
sess.run(init_op)
# Run training loop
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1]})
Ref: https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html
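The code above relies on TensorFlow 1.x sessions. In TensorFlow 2.x the same idea (a Keras model trained without compile and fit) is usually written with tf.GradientTape. The following is a minimal sketch under stated assumptions: random arrays stand in for the MNIST batches, and the learning rate, batch size and epoch count are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Define the model with Keras only; compile() and fit() are never called.
inputs = tf.keras.Input(shape=(784,))
x = tf.keras.layers.Dense(128, activation="relu")(inputs)
preds = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, preds)

loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Random stand-in data (assumption): 200 flattened images with one-hot labels.
rng = np.random.default_rng(0)
images = rng.random((200, 784)).astype("float32")
labels = tf.keras.utils.to_categorical(rng.integers(0, 10, size=200), 10)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(50)

losses = []
for epoch in range(3):
    for batch_images, batch_labels in dataset:
        # Record the forward pass so gradients can be computed afterwards.
        with tf.GradientTape() as tape:
            batch_preds = model(batch_images, training=True)
            loss = loss_fn(batch_labels, batch_preds)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        losses.append(float(loss))
```

Keras supplies the layers, the loss object and the optimizer here, while the loop itself is plain TensorFlow, so any custom step (gradient clipping, several optimizers, extra losses) can be placed inside it.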