GridSearchCV: "TypeError: 'StratifiedKFold' object is not iterable" - pandas

I want to perform GridSearchCV in a RandomForestClassifier, but data is not balanced, so I use StratifiedKFold:
from sklearn.model_selection import StratifiedKFold
from sklearn.grid_search import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
param_grid = {'n_estimators':[10, 30, 100, 300], "max_depth": [3, None],
"max_features": [1, 5, 10], "min_samples_leaf": [1, 10, 25, 50], "criterion": ["gini", "entropy"]}
rfc = RandomForestClassifier()
clf = GridSearchCV(rfc, param_grid=param_grid, cv=StratifiedKFold()).fit(X_train, y_train)
But I get an error:
TypeError Traceback (most recent call last)
<ipython-input-597-b08e92c33165> in <module>()
9 rfc = RandomForestClassifier()
10
---> 11 clf = GridSearchCV(rfc, param_grid=param_grid, cv=StratifiedKFold()).fit(X_train, y_train)
c:\python34\lib\site-packages\sklearn\grid_search.py in fit(self, X, y)
811
812 """
--> 813 return self._fit(X, y, ParameterGrid(self.param_grid))
c:\python34\lib\site-packages\sklearn\grid_search.py in _fit(self, X, y, parameter_iterable)
559 self.fit_params, return_parameters=True,
560 error_score=self.error_score)
--> 561 for parameters in parameter_iterable
562 for train, test in cv)
c:\python34\lib\site-packages\sklearn\externals\joblib\parallel.py in __call__(self, iterable)
756 # was dispatched. In particular this covers the edge
757 # case of Parallel used with an exhausted iterator.
--> 758 while self.dispatch_one_batch(iterator):
759 self._iterating = True
760 else:
c:\python34\lib\site-packages\sklearn\externals\joblib\parallel.py in dispatch_one_batch(self, iterator)
601
602 with self._lock:
--> 603 tasks = BatchedCalls(itertools.islice(iterator, batch_size))
604 if len(tasks) == 0:
605 # No more tasks available in the iterator: tell caller to stop.
c:\python34\lib\site-packages\sklearn\externals\joblib\parallel.py in __init__(self, iterator_slice)
125
126 def __init__(self, iterator_slice):
--> 127 self.items = list(iterator_slice)
128 self._size = len(self.items)
c:\python34\lib\site-packages\sklearn\grid_search.py in <genexpr>(.0)
560 error_score=self.error_score)
561 for parameters in parameter_iterable
--> 562 for train, test in cv)
563
564 # Out is a list of triplet: score, estimator, n_test_samples
TypeError: 'StratifiedKFold' object is not iterable
When I write cv=StratifiedKFold(y_train) I have ValueError: The number of folds must be of Integral type. But when I write `cv=5, it works.
I don't understand what is wrong with StratifiedKFold

I had exactly the same problem. The solution that worked for me is to replace:
from sklearn.grid_search import GridSearchCV
with
from sklearn.model_selection import GridSearchCV
Then it should work fine.

The problem here is an API change as mentioned in other answers, however the answers could be more explicit.
The cv parameter documentation states:
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy. Possible inputs
for cv are:
None, to use the default 3-fold cross-validation, integer,
to specify the number of folds.
An object to be used as a
cross-validation generator.
An iterable yielding train/test splits.
For integer/None inputs, if y is binary or multiclass, StratifiedKFold
used. If the estimator is a classifier or if y is neither binary nor
multiclass, KFold is used.
So, whatever the cross validation strategy used, all that is needed is to provide the generator using the function split, as suggested:
kfolds = StratifiedKFold(5)
clf = GridSearchCV(estimator, parameters, scoring=qwk, cv=kfolds.split(xtrain,ytrain))
clf.fit(xtrain, ytrain)

It seems that cv=StratifiedKFold()).fit(X_train, y_train) should be changed to cv=StratifiedKFold()).split(X_train, y_train).

The api changed in the latest version. You used to pass y and now you pass just the number when you create the stratifiedKFold object. You pass the y later.

Related

GridSearchCV with KerasClassifier causes Could not pickle the task to send it to the workers with an adapted Keras layer

Question
GridSearchCV via KerasClassifier causes the error when Keras Normalization has been adapted to data. Without the adapted Normalization, it works. The reason why using Normalization is because it gave better result than simply divide by 255.0.
PicklingError: Could not pickle the task to send it to the workers.
Workaround
By setting n_jobs=1 not to multi-thread, it works but perhaps not much use to run single thread.
Environment
Python 3.9.13
TensorFlow version: 2.10.0
Eager execution is: True
Keras version: 2.10.0
sklearn version: 1.1.3
Code
import numpy as np
import tensorflow as tf
from keras.layers import (
Dense,
Flatten,
Normalization,
Conv2D,
MaxPooling2D,
)
from keras.models import (
Sequential
)
from scikeras.wrappers import (
KerasClassifier,
)
from sklearn.model_selection import (
GridSearchCV
)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
# max_value = float(np.max(x_train))
# x_train, x_test = x_train/max_value, x_test/max_value
input_shape = x_train[0].shape
number_of_classes = 10
# Data Normalization
normalization = Normalization(
name="norm",
input_shape=input_shape, # (32, 32, 3)
axis=-1 # Regard each pixel as a feature
)
normalization.adapt(x_train)
def create_model():
model = Sequential([
# Without the adapted Normalization layer, it works.
normalization,
Conv2D(
name="conv",
filters=32,
kernel_size=(3, 3),
strides=(1, 1),
padding="same",
activation='relu',
input_shape=input_shape
),
MaxPooling2D(
name="maxpool",
pool_size=(2, 2)
),
Flatten(),
Dense(
name="full",
units=100,
activation="relu"
),
Dense(
name="label",
units=number_of_classes,
activation="softmax"
)
])
model.compile(
loss=tf.keras.losses.sparse_categorical_crossentropy,
optimizer='adam',
metrics=['accuracy']
)
return model
model = KerasClassifier(model=create_model, verbose=2)
batch_size = [32]
epochs = [2, 3]
param_grid = dict(batch_size=batch_size, epochs=epochs)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(x_train, y_train)
Log
The above exception was the direct cause of the following exception:
PicklingError Traceback (most recent call last)
Cell In [28], line 7
4 param_grid = dict(batch_size=batch_size, epochs=epochs)
6 grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
----> 7 grid_result = grid.fit(x_train, y_train)
File ~/venv/ml/lib/python3.9/site-packages/sklearn/model_selection/_search.py:875, in BaseSearchCV.fit(self, X, y, groups, **fit_params)
869 results = self._format_results(
870 all_candidate_params, n_splits, all_out, all_more_results
871 )
873 return results
--> 875 self._run_search(evaluate_candidates)
877 # multimetric is determined here because in the case of a callable
878 # self.scoring the return type is only known after calling
879 first_test_score = all_out[0]["test_scores"]
File ~/venv/ml/lib/python3.9/site-packages/sklearn/model_selection/_search.py:1379, in GridSearchCV._run_search(self, evaluate_candidates)
1377 def _run_search(self, evaluate_candidates):
1378 """Search all candidates in param_grid"""
-> 1379 evaluate_candidates(ParameterGrid(self.param_grid))
File ~/venv/ml/lib/python3.9/site-packages/sklearn/model_selection/_search.py:822, in BaseSearchCV.fit.<locals>.evaluate_candidates(candidate_params, cv, more_results)
814 if self.verbose > 0:
815 print(
816 "Fitting {0} folds for each of {1} candidates,"
817 " totalling {2} fits".format(
818 n_splits, n_candidates, n_candidates * n_splits
819 )
820 )
--> 822 out = parallel(
823 delayed(_fit_and_score)(
824 clone(base_estimator),
825 X,
826 y,
827 train=train,
828 test=test,
829 parameters=parameters,
830 split_progress=(split_idx, n_splits),
831 candidate_progress=(cand_idx, n_candidates),
832 **fit_and_score_kwargs,
833 )
834 for (cand_idx, parameters), (split_idx, (train, test)) in product(
835 enumerate(candidate_params), enumerate(cv.split(X, y, groups))
836 )
837 )
839 if len(out) < 1:
840 raise ValueError(
841 "No fits were performed. "
842 "Was the CV iterator empty? "
843 "Were there no candidates?"
844 )
File ~/venv/ml/lib/python3.9/site-packages/joblib/parallel.py:1098, in Parallel.__call__(self, iterable)
1095 self._iterating = False
1097 with self._backend.retrieval_context():
-> 1098 self.retrieve()
1099 # Make sure that we get a last message telling us we are done
1100 elapsed_time = time.time() - self._start_time
File ~/venv/ml/lib/python3.9/site-packages/joblib/parallel.py:975, in Parallel.retrieve(self)
973 try:
974 if getattr(self._backend, 'supports_timeout', False):
--> 975 self._output.extend(job.get(timeout=self.timeout))
976 else:
977 self._output.extend(job.get())
File ~/venv/ml/lib/python3.9/site-packages/joblib/_parallel_backends.py:567, in LokyBackend.wrap_future_result(future, timeout)
564 """Wrapper for Future.result to implement the same behaviour as
565 AsyncResults.get from multiprocessing."""
566 try:
--> 567 return future.result(timeout=timeout)
568 except CfTimeoutError as e:
569 raise TimeoutError from e
File /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py:446, in Future.result(self, timeout)
444 raise CancelledError()
445 elif self._state == FINISHED:
--> 446 return self.__get_result()
447 else:
448 raise TimeoutError()
File /Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/concurrent/futures/_base.py:391, in Future.__get_result(self)
389 if self._exception:
390 try:
--> 391 raise self._exception
392 finally:
393 # Break a reference cycle with the exception in self._exception
394 self = None
PicklingError: Could not pickle the task to send it to the workers.
Research
Keras KerasClassifier gridsearch TypeError: can't pickle _thread.lock objects told Keras did not support pickle was the cause. However, as the code works if the adapted Normalization is not used, not relevant.
GPU can cause the issue but there is no GPU in my environment.
References
SciKeras Basic usageĀ¶
How to Grid Search Hyperparameters for Deep Learning Models in Python with Keras

How do you write a custom activation function in python for Keras?

I'm trying to write a custom activation function for use with Keras. I can not write it with tensorflow primitives as it does properly compute the derivative. I followed How to make a custom activation function with only Python in Tensorflow? and it works very we in creating a tensorflow function. However, when I tried putting it into Keras as an activation function for the classic MNIST demo. I got errors. I also tried the tf_spiky function from the above reference.
Here is the sample code
tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(512, activation=tf_spiky),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
Here's my entire error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-48-73a57f81db19> in <module>
3 tf.keras.layers.Dense(512, activation=tf_spiky),
4 tf.keras.layers.Dropout(0.2),
----> 5 tf.keras.layers.Dense(10, activation=tf.nn.softmax)])
6 x=tf.keras.layers.Activation(tf_spiky)
7 y=tf.keras.layers.Flatten(input_shape=(28, 28))
/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py in _method_wrapper(self, *args, **kwargs)
472 self._setattr_tracking = False # pylint: disable=protected-access
473 try:
--> 474 method(self, *args, **kwargs)
475 finally:
476 self._setattr_tracking = previous_value # pylint: disable=protected-access
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py in __init__(self, layers, name)
106 if layers:
107 for layer in layers:
--> 108 self.add(layer)
109
110 #property
/opt/conda/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py in _method_wrapper(self, *args, **kwargs)
472 self._setattr_tracking = False # pylint: disable=protected-access
473 try:
--> 474 method(self, *args, **kwargs)
475 finally:
476 self._setattr_tracking = previous_value # pylint: disable=protected-access
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py in add(self, layer)
173 # If the model is being built continuously on top of an input layer:
174 # refresh its output.
--> 175 output_tensor = layer(self.outputs[0])
176 if isinstance(output_tensor, list):
177 raise TypeError('All layers in a Sequential model '
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
728
729 # Check input assumptions set before layer building, e.g. input rank.
--> 730 self._assert_input_compatibility(inputs)
731 if input_list and self._dtype is None:
732 try:
/opt/conda/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in _assert_input_compatibility(self, inputs)
1463 if x.shape.ndims is None:
1464 raise ValueError('Input ' + str(input_index) + ' of layer ' +
-> 1465 self.name + ' is incompatible with the layer: '
1466 'its rank is undefined, but the layer requires a '
1467 'defined rank.')
ValueError: Input 0 of layer dense_1 is incompatible with the layer: its rank is undefined, but the layer requires a defined rank.
From this I gather the last Dense layer is unable to get the dimensions of the output after the activation function or something to that. I did see in the tensorflow code that many activation functions register a shape. But either I'm not doing that correctly or I'm going in the wrong direction. But I'm guessing something needs to be done to the tensorflow function to make it an activation function that Keras can use.
I would appreciate any help you can give.
As requested here is the sample codes for tf_spiky, it works as described in the above reference. However, once put into Keras I get the errors shown. This is pretty much as shown in the *How to make a custom activation function with only Python in Tensorflow?" stackoverflow article.
def spiky(x):
print(x)
r = x % 1
if r <= 0.5:
return r
else:
return 0
def d_spiky(x):
r = x % 1
if r <= 0.5:
return 1
else:
return 0
np_spiky = np.vectorize(spiky)
np_d_spiky = np.vectorize(d_spiky)
np_d_spiky_32 = lambda x: np_d_spiky(x).astype(np.float32)
import tensorflow as tf
from tensorflow.python.framework import ops
def tf_d_spiky(x,name=None):
with tf.name_scope(name, "d_spiky", [x]) as name:
y = tf.py_func(np_d_spiky_32,
[x],
[tf.float32],
name=name,
stateful=False)
return y[0]
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
# Need to generate a unique name to avoid duplicates:
rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
tf.RegisterGradient(rnd_name)(grad) # see _MySquareGrad for grad example
g = tf.get_default_graph()
with g.gradient_override_map({"PyFunc": rnd_name}):
return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
def spikygrad(op, grad):
x = op.inputs[0]
n_gr = tf_d_spiky(x)
return grad * n_gr
np_spiky_32 = lambda x: np_spiky(x).astype(np.float32)
def tf_spiky(x, name=None):
with tf.name_scope(name, "spiky", [x]) as name:
y = py_func(np_spiky_32,
[x],
[tf.float32],
name=name,
grad=spikygrad) # <-- here's the call to the gradient
return y[0]
The solution is in this post Output from TensorFlow `py_func` has unknown rank/shape
The easiest fix is to add y[0].set_shape(x.get_shape()) before the return statement in the definition of tf_spiky.
Perhaps someone out there knows how to properly work with tensorflow shape functions. Digging around I found a unchanged_shape shape function in tensorflow.python.framework.common_shapes, which be appropriate here, but I don't know how to attach it to the tf_spiky function. Seems a python decorator is in order here. It would probably be a service to others to explain customizing tensorflow functions with shape functions.

Keras Tensorflow 'Cannot apply softmax to a tensor that is 1D'

I'm going over the Book Deep Learning with Python from F. Chollet.
https://www.manning.com/books/deep-learning-with-python
I'm trying to follow along with the code examples. I just installed keras, and I am getting this error when trying to run this:
from this notebook:
https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/2.1-a-first-look-at-a-neural-network.ipynb
from keras import models
from keras import layers
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
TypeError Traceback (most recent call
last) in ()
4 network = models.Sequential()
5 network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
----> 6 network.add(layers.Dense(10, activation='softmax'))
~/anaconda3/lib/python3.6/site-packages/keras/engine/sequential.py in
add(self, layer)
179 self.inputs = network.get_source_inputs(self.outputs[0])
180 elif self.outputs:
--> 181 output_tensor = layer(self.outputs[0])
182 if isinstance(output_tensor, list):
183 raise TypeError('All layers in a Sequential model '
~/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py in
call(self, inputs, **kwargs)
455 # Actually call the layer,
456 # collecting output(s), mask(s), and shape(s).
--> 457 output = self.call(inputs, **kwargs)
458 output_mask = self.compute_mask(inputs, previous_mask)
459
~/anaconda3/lib/python3.6/site-packages/keras/layers/core.py in
call(self, inputs)
881 output = K.bias_add(output, self.bias, data_format='channels_last')
882 if self.activation is not None:
--> 883 output = self.activation(output)
884 return output
885
~/anaconda3/lib/python3.6/site-packages/keras/activations.py in
softmax(x, axis)
29 raise ValueError('Cannot apply softmax to a tensor that is 1D')
30 elif ndim == 2:
---> 31 return K.softmax(x)
32 elif ndim > 2:
33 e = K.exp(x - K.max(x, axis=axis, keepdims=True))
~/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py
in softmax(x, axis) 3229 A tensor. 3230 """
-> 3231 return tf.nn.softmax(x, axis=axis) 3232 3233
TypeError: softmax() got an unexpected keyword argument 'axis'
I'm wondering if there's something off with my installation?
keras.__version__
2.2.4
If anyone could give me a clue of what to look into.
Seems like you have an incompatible Tensorflow version (which Keras is using as a backend). For details look here

tensorflow estimator custom metrics: use sklearn metrics?

Is there a way to use sklearn metrics as custom metrics in tf.estimator ? I have tried the custom score function below.
from sklearn.metrics import recall_score
def my_score(labels, predictions):
return {'MARecall': recall_score(labels, predictions, average='macro')}
But it does not work :
eval_results = classifier.evaluate(input_fn=tf.estimator.inputs.numpy_input_fn(x={'x': val_x}, y=np.array(val_y), num_epochs=1, batch_size=20, shuffle=False))
... ...
... ...
<ipython-input-81-e433b0457af2> in my_acc(labels, predictions)
1 def my_acc(labels, predictions):
----> 2 return {'WA': np.array(recall_score(labels, predictions, average='micro'))}
3
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py in recall_score(y_true, y_pred, labels, pos_label, average, sample_weight)
1357 average=average,
1358 warn_for=('recall',),
-> 1359 sample_weight=sample_weight)
1360 return r
1361
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py in precision_recall_fscore_support(y_true, y_pred, beta, labels, pos_label, average, warn_for, sample_weight)
1023 raise ValueError("beta should be >0 in the F-beta score")
1024
-> 1025 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
1026 present_labels = unique_labels(y_true, y_pred)
1027
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py in _check_targets(y_true, y_pred)
70 """
71 check_consistent_length(y_true, y_pred)
---> 72 type_true = type_of_target(y_true)
73 type_pred = type_of_target(y_pred)
74
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/utils/multiclass.py in type_of_target(y)
242 if not valid:
243 raise ValueError('Expected array-like (array or non-string sequence), '
--> 244 'got %r' % y)
245
246 sparseseries = (y.__class__.__name__ == 'SparseSeries')
ValueError: Expected array-like (array or non-string sequence), got <tf.Tensor 'fifo_queue_DequeueUpTo:2' shape=(?,) dtype=int64>
Is there a way around this? I have a multi-class classification problem at hand, and need to log both macro and micro averaged scores during training and evaluation.
In order to work properly inside tensorflow training loop, metrics need to be updated. Every function in tf.metrics has an update_op for it. So it's strongly recommended to construct your custom metric using lower level functions like tf.metrics.true_positives I'm not aware of particular formula used in sklearn, you can define your own metric like this
def custom_metric(labels, predictions):
metric1, update_op_m1 = tf.metrics.any_function(labels, predictions)
metric2, update_op_m2 = tf.metrics.any_other_function(labels, predictions)
output = tf.reduce_mean(metric1 + metric2)
return output, tf.group(update_op_m1, update_op_m2) #Note that you need to group all update ops
One solution is to use the metric functions from https://github.com/guillaumegenthial/tf_metrics

Word2Vec Tutorial: Tensorflow TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'

Version of Tensorflow: 1.2.1
Version of Python: 3.5
Operating System: Windows 10
Another poster has asked about this same problem on StackOverflow here, and he appears to be using code from the same Udacity Word2Vec tutorial. So, maybe I'm dense, but the code of this example is so busy and complex that I can't tell what fixed his problem.
The error occurs when I call tf.reduce_means:
loss = tf.reduce_mean(
tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
train_labels, num_sampled, vocabulary_size))
Right before the call to tf.reduce_mean the key variables have the following data types.
train_dataset.dtype
>> tf.int32
train_labels.dtype
>> tf.int32
valid_dataset.dtype
>> tf.int32
embeddings.dtype
>> tf.float32_ref
softmax_weights.dtype
>> tf.float32_ref
softmax_biases.dtype
>> tf.float32_ref
embed.dtype
>> tf.float32
I tried every permutation of data type in the definitions of the variables train_dataset.dtype, train_labels.dtype and valid_dataset.dtype: making them all int64, all float32, all float64, and combinations of integer and floating point. Nothing worked. I didn't try altering the data types of softmax_weight and softmax_biases, because I'm afraid that might foul up the optimization algorithm. Don't these need to be floats to support the calculus that is done during backpropagation? (Tensorflow is often a very opaque black box with documentation that verges on completely useless, so I can suspect things but never know for sure.)
Program Flow at Time of Error:
After the call to reduce_mean program control transfers to sampled_softmax_loss() in file nn_impl.py which in turn calls _compute_sampled_logits():
logits, labels = _compute_sampled_logits(
weights=weights,
biases=biases,
labels=labels,
inputs=inputs,
num_sampled=num_sampled,
num_classes=num_classes,
num_true=num_true,
sampled_values=sampled_values,
subtract_log_q=True,
remove_accidental_hits=remove_accidental_hits,
partition_strategy=partition_strategy,
name=name)
At this point I check the data types of the passed-in parameters and get the following:
weights.dtype
>> tf.float32_ref
biases.dtype
>> tf.float32_ref
labels.dtype
>> tf.float32
inputs.dtype
>> tf.int32
On the very next step an exception occurs, and I am thrown into the StreamWrapper class in file ansitowin32.py. Running to the end, I get the following Traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
489 as_ref=input_arg.is_ref,
--> 490 preferred_dtype=default_dtype)
491 except TypeError as err:
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
740 if ret is None:
--> 741 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
742
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
613 "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614 % (dtype.name, t.dtype.name, str(t)))
615 return t
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
34 loss = tf.reduce_mean(
35 tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36 train_labels, num_sampled, vocabulary_size))
37
38 # Optimizer.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1266 remove_accidental_hits=remove_accidental_hits,
1267 partition_strategy=partition_strategy,
-> 1268 name=name)
1269 sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
1270 logits=logits)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
1005 row_wise_dots = math_ops.multiply(
1006 array_ops.expand_dims(inputs, 1),
-> 1007 array_ops.reshape(true_w, new_true_w_shape))
1008 # We want the row-wise dot plus biases which yields a
1009 # [batch_size, num_true] tensor of true_logits.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
284
285 def multiply(x, y, name=None):
--> 286 return gen_math_ops._mul(x, y, name)
287
288
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
1375 A `Tensor`. Has the same type as `x`.
1376 """
-> 1377 result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
1378 return result
1379
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
524 "%s type %s of argument '%s'." %
525 (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526 inferred_from[input_arg.type_attr]))
527
528 types = [values.dtype]
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
Here's the complete program:
# These are all the modules we'll be using later.
# Make sure you can import them before proceeding further.
# %matplotlib inline
from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE
print("Working directory = %s\n" % os.getcwd())
def read_data(filename):
"""Extract the first file enclosed in a zip file as a list of words"""
with zipfile.ZipFile(filename) as f:
data = tf.compat.as_str(f.read(f.namelist()[0])).split()
return data
filename = 'text8.zip'
words = read_data(filename)
print('Data size %d' % len(words))
vocabulary_size = 50000
def build_dataset(words):
count = [['UNK', -1]]
count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
dictionary = dict()
# Loop through the keys of the count collection dictionary
# (apparently, zeroing out counts)
for word, _ in count:
dictionary[word] = len(dictionary)
data = list()
unk_count = 0 # count of unknown words
for word in words:
if word in dictionary:
index = dictionary[word]
else:
index = 0 # dictionary['UNK']
unk_count = unk_count + 1
data.append(index)
count[0][1] = unk_count
reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
return data, count, dictionary, reverse_dictionary
data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words # Hint to reduce memory.
data_index = 0
def generate_batch(batch_size, num_skips, skip_window):
global data_index
assert batch_size % num_skips == 0
assert num_skips <= 2 * skip_window
batch = np.ndarray(shape=(batch_size), dtype=np.int32)
labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
span = 2 * skip_window + 1 # [ skip_window target skip_window ]
buffer = collections.deque(maxlen=span)
for _ in range(span):
buffer.append(data[data_index])
data_index = (data_index + 1) % len(data)
for i in range(batch_size // num_skips):
target = skip_window # target label at the center of the buffer
targets_to_avoid = [ skip_window ]
for j in range(num_skips):
while target in targets_to_avoid:
target = random.randint(0, span - 1)
targets_to_avoid.append(target)
batch[i * num_skips + j] = buffer[skip_window]
labels[i * num_skips + j, 0] = buffer[target]
buffer.append(data[data_index])
data_index = (data_index + 1) % len(data)
return batch, labels
print('data:', [reverse_dictionary[di] for di in data[:8]])
for num_skips, skip_window in [(2, 1), (4, 2)]:
data_index = 0
batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
print(' batch:', [reverse_dictionary[bi] for bi in batch])
print(' labels:', [reverse_dictionary[li] for li in labels.reshape(8)])
batch_size = 128
embedding_size = 128 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()
with graph.as_default(), tf.device('/cpu:0'):
# Input data.
train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
valid_dataset = tf.constant(valid_examples, dtype=tf.int32)
# Variables.
embeddings = tf.Variable(
tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
softmax_weights = tf.Variable(
tf.truncated_normal([vocabulary_size, embedding_size],
stddev=1.0 / math.sqrt(embedding_size)))
softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))
# Model.
# Look up embeddings for inputs.
embed = tf.nn.embedding_lookup(embeddings, train_dataset)
# Compute the softmax loss, using a sample of the negative labels each time.
loss = tf.reduce_mean(
tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
train_labels, num_sampled, vocabulary_size))
# Optimizer.
# Note: The optimizer will optimize the softmax_weights AND the embeddings.
# This is because the embeddings are defined as a variable quantity and the
# optimizer's `minimize` method will by default modify all variable quantities
# that contribute to the tensor it is passed.
# See docs on `tf.train.Optimizer.minimize()` for more details.
optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)
# Compute the similarity between minibatch examples and all embeddings.
# We use the cosine distance:
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
normalized_embeddings = embeddings / norm
valid_embeddings = tf.nn.embedding_lookup(
normalized_embeddings, valid_dataset)
similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
I had the same issue and it looks like that two parameters that are passed on to the loss function are swapped around.
If you look at the tensorflow description for 'sample_softmax_loss' (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss):
sampled_softmax_loss(
weights,
biases,
labels,
inputs,
num_sampled,
num_classes,
num_true=1,
sampled_values=None,
remove_accidental_hits=True,
partition_strategy='mod',
name='sampled_softmax_loss'
)
The third expected parameter is 'labels' and the fourth 'inputs'. In the supplied code, these two parameters seem to have been switched around. I'm a bit puzzled how this is possible. Maybe this used to be different in an older version of TF. Anyway, swapping those two parameters around will solve the problem.