I'm creating a linear regression model and i am receiving an error - tensorflow

I was creating a linear regression model and I used TensorFlow's linear estimator but after I run the linear estimator train function I receive an invalid argument error which says Labels must be <= n_classes - 1.I don't know which part of the code i have gone wrong
this is the code i was running
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv(r"C:\Users\XPRESS\Downloads\CarPrice_Assignment.csv") #load the data
data.head()
#split data into traiing and testing
from sklearn.model_selection import train_test_split
train , test = train_test_split(data,random_state=42,test_size=0.2)
train_x = train
train_y = train.pop('price')
eval_x = test
eval_y = test.pop('price')
lst = list(train_x.columns)
#get numerical and categorical columns
categorical_columns = []
numerical_columns = []
for cat in lst:
if train_x[cat].dtypes == 'object':
categorical_columns.append(_)
for nums in lst:
if nums not in categorical_columns:
numerical_columns.append(nums)
train_x.info()
#convert categorical data to numeric data
feature_columns = []
for feature_name in categorical_columns:
vocabulary = train_x[feature_name].unique()
feature_columns.append(tf.feature_column.categorical_column_with_vocabulary_list(feature_name,vocabulary))
for feature_name in numerical_columns: feature_columns.append(tf.feature_column.numeric_column(feature_name,dtype=tf.float32))
def make_input_fn(data,label,num_epochs=10,shuffle=True,batch_size=32):
def input_fn():
ds = tf.data.Dataset.from_tensor_slices((dict(data),label))
if shuffle:
ds=ds.shuffle(1000)
ds = ds.batch(batch_size).repeat(num_epochs)
return ds
return input_fn
train_input_funtion = make_input_fn(train_x,train_y)
eval_input_function = make_input_fn(eval_x,eval_y,shuffle=False,num_epochs=1)
linear_est = tf.estimator.LinearClassifier(feature_columns=feature_columns)
linear_est.train(train_input_funtion)
this is the error i received
InvalidArgumentError: 2 root error(s) found.
(0) INVALID_ARGUMENT: assertion failed: [Labels must be <= n_classes - 1] [Condition x <= y did not hold element-wise:] [x (head/losses/Cast:0) = ] [[7895][10795][17710]...] [y (head/losses/check_label_range/Const:0) = ] [1]
[[{{function_node head_losses_check_label_range_assert_less_equal_Assert_AssertGuard_false_22323}}{{node Assert}}]]
[[training/Ftrl/gradients/gradients/linear/linear_model/linear/linear_model/linear/linear_model/enginelocation/weighted_sum_grad/Select_1/_1047]]
(1) INVALID_ARGUMENT: assertion failed: [Labels must be <= n_classes - 1] [Condition x <= y did not hold element-wise:] [x (head/losses/Cast:0) = ] [[7895][10795][17710]...] [y (head/losses/check_label_range/Const:0) = ] [1]
[[{{function_node head_losses_check_label_range_assert_less_equal_Assert_AssertGuard_false_22323}}{{node Assert}}]]
0 successful operations.
0 derived errors ignored.
...
[[training/Ftrl/gradients/gradients/linear/linear_model/linear/linear_model/linear/linear_model/enginelocation/weighted_sum_grad/Select_1/_1047]]
(1) INVALID_ARGUMENT: assertion failed: [Labels must be <= n_classes - 1] [Condition x <= y did not hold element-wise:] [x (head/losses/Cast:0) = ] [[7895][10795][17710]...] [y (head/losses/check_label_range/Const:0) = ] [1]
[[{{node Assert}}]]
0 successful operations.
0 derived errors ignored.

You mentioned that you are creating regression, but here you have tf.estimator.LinearClassifier in the code. May be you meant to use tf.estimator.LinearRegressor instead?

Related

ValueError: Found unexpected instance while processing input tensors for keras functional model. tensor

coding: utf-8
import numpy as np
import os
from keras import backend as K
def get_slope(output, slope):
conf = []
for sample in output:
conf.append([slope if num < 0 else 1 for num in sample])
conf = np.array(conf)
return conf
def get_weight(model, data):
weights = np.array(model.get_weights())
get_bn1_layer_output = K.function([model.layers[0].input, K.learning_phase()],[model.layers[2].output])
# output in test mode = 0
bn1_output = get_bn1_layer_output([data, 0])[0]
get_bn2_layer_output = K.function([model.layers[0].input, K.learning_phase()],[model.layers[6].output])
bn2_output = get_bn2_layer_output([data, 0])[0]
relu1_slope = get_slope(bn1_output, 0.3)
relu2_slope = get_slope(bn2_output, 0.3)
all_W = []
for i in range(len(relu1_slope)):
W1 = weights[0].T
W2 = weights[6].T * relu1_slope[i]*weights[2]/np.sqrt(weights[5]+0.001)
W3 = weights[12].T * relu2_slope[i]*weights[8]/np.sqrt(weights[11]+0.001)
W_im = np.matmul(W3, W2)
W = np.matmul(W_im, W1)
all_W.extend(W)
all_W = np.array(all_W)
return all_W
There was a problem when run the get_weight, ValueError: Found unexpected instance while processing input tensors for keras functional model. Expecting KerasTensor which is from tf.keras.Input() or output from keras layer call(). Got: 0
How to solve it? .....

tensorflow Exception encountered when calling layer (type CategoryEncoding)

I'm trying to code a layer to interface between a data set (numerical and categorical features) so it can be fed into a model.
I can't understand the error I get when it comes to categorical columns.
ValueError: Exception encountered when calling layer (type CategoryEncoding).
When output_mode is not 'int', maximum supported output rank is 2. Received
output_mode multi_hot and input shape (10, 7, 1), which would result in output rank 3.
From what I understand, the batch size should not have been counted in, but it is. And that seems to break.
Note that reproducing with only numerical features works fine.
Thank you for your help.
import tensorflow as tf
import pandas as pd
import numpy as np
# Simulate a data set of categorical and numerical values
# Configure simulation specifications: {feature: number of unique categories or None for numerical}
theSimSpecs = {'Cat1': 54, 'Cat2': 2, 'Cat3': 4, 'Num1': None, 'Num2': None}
# theSimSpecs = {'Num1': None, 'Num2': None}
# batch size and timesteps
theBatchSz, theTimeSteps = 10, 4
# Creation of the dataset as pandas.DataFrame
theDFs = []
for theFeature, theUniques in theSimSpecs.items():
if theUniques is None:
theDF = pd.DataFrame(np.random.random(size=theBatchSz * theTimeSteps), columns=[theFeature])
else:
theDF = pd.DataFrame(np.random.randint(low=0, high=theUniques, size=theBatchSz * theTimeSteps),
columns=[theFeature]).astype('category')
theDFs.append(theDF)
theDF = pd.concat(theDFs, axis=1)
# code excerpt
# inventory of the categorical features' values ( None for the numerical)
theCatCodes = {theCol: (theDF[theCol].unique().tolist() if str(theDF[theCol].dtypes) == "category" else None)
for theCol in theDF.columns}
# Creation of the batched tensorflow.data.Dataset
theDS = tf.data.Dataset.from_tensor_slices(dict(theDF))
theDS = theDS.window(size=theTimeSteps, shift=1, stride=1, drop_remainder=True)
theDS = theDS.flat_map(lambda x: tf.data.Dataset.zip(x))
theDS = theDS.batch(batch_size=theTimeSteps, drop_remainder=True)
theDS = theDS.batch(batch_size=theBatchSz, drop_remainder=True)
# extracting one batch
theBatch = next(iter(theDS))
tf.print(theBatch)
# Creation of the components for the interface layer
theFeaturesInputs = {}
theFeaturesEncoded = {}
for theFeature, theCodes in theCatCodes.items():
if theCodes is None: # Pass-through for numerical features
theNumInput = tf.keras.layers.Input(shape=[], dtype=tf.float32, name=theFeature)
theFeaturesInputs[theFeature] = theNumInput
theFeatureExp = tf.expand_dims(input=theNumInput, axis=-1)
theFeaturesEncoded[theFeature] = theFeatureExp
else: # Process for categorical features
theCatInput = tf.keras.layers.Input(shape=[], dtype=tf.int64, name=theFeature)
theFeaturesInputs[theFeature] = theCatInput
theFeatureExp = tf.expand_dims(input=theCatInput, axis=-1)
theEncodingLayer = tf.keras.layers.CategoryEncoding(num_tokens=theSimSpecs[theFeature], name=f"{theFeature}_enc",
output_mode="multi_hot", sparse=False)
theFeaturesEncoded[theFeature] = theEncodingLayer(theFeatureExp)
theStackedInputs = tf.concat(tf.nest.flatten(theFeaturesEncoded), axis=1)
theModel = tf.keras.Model(inputs=theFeaturesInputs, outputs=theStackedInputs)
theOutput = theModel(theBatch)
tf.print(theOutput)

Pytorch: Weighted Covariance

I am trying to implement a PyTorch covariance matrix operator. However, I notice the results are not the same between the Numpy implementation and my attempt, yet I do not understand why.
I define the Bessel-corrected weighted covariance matrix as:
I define the weighted mean as:
I compare the NumPy method and my method as follows:
import numpy as np
import torch
torch.set_printoptions(precision=8)
x = np.random.randn(1000, 3)*1000
w = np.abs(np.random.randn(1000))*1000
x_torch = torch.DoubleTensor(x)
w_torch = torch.DoubleTensor(w)
#calculate weighted means
m_w = torch.sum(x_torch.T*w_torch, axis=1)/torch.sum(w_torch)
m_w_np = np.average(x, axis=0, weights=w)
#calculate weighted covariance matrix
Q = (x_torch-m_w).T
cov_w = (1.0 / (torch.sum(w_torch) - 1))*(w_torch*Q).mm(Q.T)
cov_w_np = np.cov(x.T, aweights=w.T)
print("WEIGHTED MEAN")
print("NUMPY = {0}\n\nTORCH = {1}\n\nDIFFERENCE={2}".format(m_w_np, m_w.numpy(), m_w_np-m_w.numpy()))
print("")
print("")
print("WEIGHTED COVARIANCE")
print("NUMPY = {0}\n\nTORCH = {1}\n\nDIFFERENCE={2}".format(cov_w_np, cov_w.numpy(),cov_w_np-cov_w.numpy()))
This yields the following output:
WEIGHTED MEAN
NUMPY = [-21.10537208 -7.70801723 64.4034329 ]
TORCH = [-21.10537208 -7.70801723 64.4034329 ]
DIFFERENCE=[-7.10542736e-15 -1.77635684e-15 1.42108547e-14]
WEIGHTED COVARIANCE
NUMPY = [[ 989468.17457696 13620.54885133 10723.87790683]
[ 13620.54885133 953966.92486133 21407.69378841]
[ 10723.87790683 21407.69378841 1019646.81044077]]
TORCH = [[ 987952.51042915 13599.68493868 10707.45110536]
[ 13599.68493868 952505.64141296 21374.90155234]
[ 10707.45110536 21374.90155234 1018084.91875621]]
DIFFERENCE=[[1515.6641478 20.86391265 16.42680147]
[ 20.86391265 1461.28344838 32.79223607]
[ 16.42680147 32.79223607 1561.89168456]]

dimension of tf.Variables change after some epochs

I am new to TensorFlow and I am learning.
I define some variables and start training. Everything runs smoothly for the first epochs but suddenly it throws the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Matrix size-incompatible: In[0]: [17952,50], In[1]: [0,20]
[[{{node gradients/Embeddings_1/MatMul_grad/MatMul_1}}]]
[[gradients/Embeddings_1/MatMul_grad/tuple/control_dependency/_1867]]
(1) Invalid argument: Matrix size-incompatible: In[0]: [17952,50], In[1]: [0,20]
[[{{node gradients/Embeddings_1/MatMul_grad/MatMul_1}}]]
My problem is that why it is giving the error after some epochs and not in the first place. Usually, these types of errors are thrown when the graph is built.
This is my code for creating the variables and embedding the trees:
def __init__(self, vocab, embedding):
self.add_model_variables()
with tf.variable_scope("Embeddings", reuse=True):
with tf.device('/cpu:0'):
w_embed = tf.get_variable('WE', [self.vocab_embedding_size, self.embed_size])
b_embed = tf.get_variable('bE', [1, self.embed_size])
embeddings = tf.get_variable('embeddings')
self.embeddings = tf.add(tf.matmul(embeddings, w_embed), b_embed)
def add_model_variables(self):
myinitilizer = tf.random_uniform_initializer(-self.calc_wt_init(),self.calc_wt_init())
with tf.variable_scope('Embeddings'):
with tf.device('/cpu:0'):
w_embed = tf.get_variable('WE', [self.vocab_embedding_size, self.embed_size], initializer = myinitilizer)
b_embed = tf.get_variable('bE', [1, self.embed_size], initializer = myinitilizer)
embeddings = tf.get_variable('embeddings',
initializer=tf.convert_to_tensor(self.pretrained_embedding),
dtype=tf.float32)
with tf.variable_scope('Composition'):
self.W1 = tf.get_variable('W1', [2 * self.embed_size, self.embed_size], initializer = myinitilizer)
self.b1 = tf.get_variable('b1', [1, self.embed_size], initializer = myinitilizer)
with tf.variable_scope('Projection'):
self.U = tf.get_variable('U', [self.embed_size, 1], initializer = myinitilizer)
self.bu = tf.get_variable('bu', [self.max_number_nodes, 1], initializer = myinitilizer)
def embed_tree(self, batch_index):
def combine_children( left_tensor, right_tensor):
return tf.nn.relu(tf.matmul(tf.concat([left_tensor, right_tensor], axis=1, name='combine_children'), self.W1) + self.b1)
def embed_word(word_index):
with tf.device('/cpu:0'):
return tf.expand_dims(tf.gather(self.embeddings, word_index), 0)
def loop_body(node_tensors, i):
node_is_leaf = tf.gather(is_leaf, i)
word = tf.gather(words, i)
left_child = tf.gather(left_children, i)
right_child = tf.gather(right_children, i)
node_tensor = tf.cond(
node_is_leaf,
lambda: embed_word(word),
lambda: combine_children(
node_tensors.read(n-right_child),
node_tensors.read(n-left_child)))
node_tensors = node_tensors.write(i, node_tensor)
i = tf.add(i, 1)
return node_tensors, i
is_leaf = tf.gather(self.batch_is_leaf, batch_index)
left_children = tf.gather(self.batch_left_children, batch_index)
right_children = tf.gather(self.batch_right_children, batch_index)
words = tf.gather(self.batch_words, batch_index)
n = tf.reduce_sum(tf.cast(tf.not_equal(left_children, -1), tf.int32))-2
#iself.batch_operation = tf.print(batch_index,'N::::::::',output_stream=sys.stdout)
node_tensors = tf.TensorArray(tf.float32, size=self.max_number_nodes,
dynamic_size=False, clear_after_read=False, element_shape=[1, self.embed_size])
loop_cond = lambda node_tensors, i: tf.less(i, n+2)
#with tf.control_dependencies([self.batch_operation]):
node_tensors, _ = tf.while_loop(loop_cond, loop_body, [node_tensors, 0], parallel_iterations=1)
tree_embedding = tf.convert_to_tensor(node_tensors.stack())
return tree_embedding
The other problem is that I cannot replicate the error as it happens occasionally.
Update:
When I reduce the batch_size, the chance of getting this error reduces.
Is it possible for this to be because of working close to GPU memory limit?
The tf.gather produces zeros for invalid indices on GPU (it works correctly on CPU however). In other words, Tensorflow does not check for the range of indices while running on GPU.
The errors caused by returned 0s accumulate on the gradient and finally result in confusing error messages that are not related to the original problem.
For reference:
https://github.com/tensorflow/tensorflow/issues/3638
I changed tf.gather to index-based retrieval(a[i]) and the problem is fixed. I don't know exactly why!

Multi-Target and Multi-Class prediction

I am relatively new to machine learning as well as tensorflow. I would like to train the data so that predictions with 2 targets and multiple classes could be made. Is this something that can be done? I was able to implement the algorithm for 1 target but don't know how I need to do it for a second target as well.
An example dataset:
DayOfYear Temperature Flow Visibility
316 8 1 4
285 -1 1 4
326 8 2 5
323 -1 0 3
10 7 3 6
62 8 0 3
56 8 1 4
347 7 2 5
363 7 0 3
77 7 3 6
1 7 1 4
308 -1 2 5
364 7 3 6
If I train (DayOfYear Temperature Flow) I can predict the Visibility quite well. But I need to predict Flow as well somehow. I am pretty sure that Flow will influence Visibility so I am not sure how to go with that.
This is the implementation that I have
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import urllib
import numpy as np
import tensorflow as tf
# Data sets
TRAINING = "/ml_baetterich_learn.csv"
TEST = "/ml_baetterich_test.csv"
VALIDATION = "/ml_baetterich_validation.csv"
def main():
# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=TRAINING,
target_dtype=np.int,
features_dtype=np.int,
target_column=-1)
test_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=TEST,
target_dtype=np.int,
features_dtype=np.int,
target_column=-1)
validation_set = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=VALIDATION,
target_dtype=np.int,
features_dtype=np.int,
target_column=-1)
# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=3)]
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
n_classes=9,
model_dir="/tmp/iris_model")
# Define the training inputs
def get_train_inputs():
x = tf.constant(training_set.data)
y = tf.constant(training_set.target)
return x, y
# Fit model.
classifier.fit(input_fn=get_train_inputs, steps=4000)
# Define the test inputs
def get_test_inputs():
x = tf.constant(test_set.data)
y = tf.constant(test_set.target)
return x, y
# Define the test inputs
def get_validation_inputs():
x = tf.constant(validation_set.data)
y = tf.constant(validation_set.target)
return x, y
# Evaluate accuracy.
accuracy_test_score = classifier.evaluate(input_fn=get_test_inputs,
steps=1)["accuracy"]
accuracy_validation_score = classifier.evaluate(input_fn=get_validation_inputs,
steps=1)["accuracy"]
print ("\nValidation Accuracy: {0:0.2f}\nTest Accuracy: {1:0.2f}\n".format(accuracy_validation_score,accuracy_test_score))
# Classify two new flower samples.
def new_samples():
return np.array(
[[327,8,3],
[47,8,0]], dtype=np.float32)
predictions = list(classifier.predict_classes(input_fn=new_samples))
print(
"New Samples, Class Predictions: {}\n"
.format(predictions))
if __name__ == "__main__":
main()
Option 1: multi-headed model
You could use a multi-headed DNNEstimator model. This treats Flow and Visibility as two separate softmax classification targets, each with their own set of classes. I had to modify the load_csv_without_header helper function to support multiple targets (which could be cleaner, but is not the point here - feel free to ignore its details).
import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile
import csv
import collections
num_flow_classes = 4
num_visib_classes = 7
Dataset = collections.namedtuple('Dataset', ['data', 'target'])
def load_csv_without_header(fn, target_dtype, features_dtype, target_columns):
with gfile.Open(fn) as csv_file:
data_file = csv.reader(csv_file)
data = []
targets = {
target_cols: []
for target_cols in target_columns.keys()
}
for row in data_file:
cols = sorted(target_columns.items(), key=lambda tup: tup[1], reverse=True)
for target_col_name, target_col_i in cols:
targets[target_col_name].append(row.pop(target_col_i))
data.append(np.asarray(row, dtype=features_dtype))
targets = {
target_col_name: np.array(val, dtype=target_dtype)
for target_col_name, val in targets.items()
}
data = np.array(data)
return Dataset(data=data, target=targets)
feature_columns = [
tf.contrib.layers.real_valued_column("", dimension=1),
tf.contrib.layers.real_valued_column("", dimension=2),
]
head = tf.contrib.learn.multi_head([
tf.contrib.learn.multi_class_head(
num_flow_classes, label_name="Flow", head_name="Flow"),
tf.contrib.learn.multi_class_head(
num_visib_classes, label_name="Visibility", head_name="Visibility"),
])
classifier = tf.contrib.learn.DNNEstimator(
feature_columns=feature_columns,
hidden_units=[10, 20, 10],
model_dir="iris_model",
head=head,
)
def get_input_fn(filename):
def input_fn():
dataset = load_csv_without_header(
fn=filename,
target_dtype=np.int,
features_dtype=np.int,
target_columns={"Flow": 2, "Visibility": 3}
)
x = tf.constant(dataset.data)
y = {k: tf.constant(v) for k, v in dataset.target.items()}
return x, y
return input_fn
classifier.fit(input_fn=get_input_fn("tmp_train.csv"), steps=4000)
res = classifier.evaluate(input_fn=get_input_fn("tmp_test.csv"), steps=1)
print("Validation:", res)
Option 2: multi-labeled head
If you keep your CSV data separated by commas, and keep the last column for all the classes a row might have (separated by some token such as space), you can use the following code:
import numpy as np
import tensorflow as tf
all_classes = ["0", "1", "2", "3", "4", "5", "6"]
def k_hot(classes_col, all_classes, delimiter=' '):
table = tf.contrib.lookup.index_table_from_tensor(
mapping=tf.constant(all_classes)
)
classes = tf.string_split(classes_col, delimiter)
ids = table.lookup(classes)
num_items = tf.cast(tf.shape(ids)[0], tf.int64)
num_entries = tf.shape(ids.indices)[0]
y = tf.SparseTensor(
indices=tf.stack([ids.indices[:, 0], ids.values], axis=1),
values=tf.ones(shape=(num_entries,), dtype=tf.int32),
dense_shape=(num_items, len(all_classes)),
)
y = tf.sparse_tensor_to_dense(y, validate_indices=False)
return y
def feature_engineering_fn(features, labels):
labels = k_hot(labels, all_classes)
return features, labels
feature_columns = [
tf.contrib.layers.real_valued_column("", dimension=1), # DayOfYear
tf.contrib.layers.real_valued_column("", dimension=2), # Temperature
]
classifier = tf.contrib.learn.DNNEstimator(
feature_columns=feature_columns,
hidden_units=[10, 20, 10],
model_dir="iris_model",
head=tf.contrib.learn.multi_label_head(n_classes=len(all_classes)),
feature_engineering_fn=feature_engineering_fn,
)
def get_input_fn(filename):
def input_fn():
dataset = tf.contrib.learn.datasets.base.load_csv_without_header(
filename=filename,
target_dtype="S100", # strings of length up to 100 characters
features_dtype=np.int,
target_column=-1
)
x = tf.constant(dataset.data)
y = tf.constant(dataset.target)
return x, y
return input_fn
classifier.fit(input_fn=get_input_fn("tmp_train.csv"), steps=4000)
res = classifier.evaluate(input_fn=get_input_fn("tmp_test.csv"), steps=1)
print("Validation:", res)
We are using DNNEstimator with a multi_label_head, which uses sigmoid crossentropy rather than softmax crossentropy as a loss function. This means that each of the output units/logits are passed through the sigmoid function, which gives the likelihood of the data point belonging to that class, i.e. the classes are computed independently and are not mutually exclusive as they are with softmax crossentropy. This means that you could have between 0 and len(all_classes) classes set for each row in the training set and final predictions.
Also notice that the classes are represented as strings (and k_hot makes the conversion to token indices), so that you could use arbitrary class identifiers such as category UUIDs in e-commerce settings. If the categories in the 3rd and 4th column are different (Flow ID 1 != Visibility ID 1), you could prepend the column name to each class ID, e.g.
316,8,flow1 visibility4
285,-1,flow1 visibility4
326,8,flow2 visibility5
For a description of how k_hot works, see my other SO answer. I decided to use k_hot as a separate function (rather than define it directly in feature_engineering_fn because it's a distinct piece of functionality, and probably TensorFlow will soon have a similar utility function.
Note that if you're now using the first two columns to predict the last two columns, your accuraccy will certainly go down, as the last two columns are highly correlated and using one of them will give you a lot of information about the other. Actually, your code was using only the 3rd column, which was kind of a cheat anyway if the goal is to predict the 3rd and 4th columns.