How to use tensorflow sequence_numeric_column with an RNNClassifier? - tensorflow

I was looking throw the tensorflow contrib API and I wanted to use the RNNClassifier available with Tensorflow 1.13. Contrary to non sequence estimators, this one needs sequence feature columns only. However I was not able to make it work on a toy dataset. I keep getting an error while using sequence_numeric_column.
Here is the structure of my toy dataset:
idSeq,kind,label,size
0,0,dwarf,117.6
0,0,dwarf,134.4
0,0,dwarf,119.0
0,1,human,168.0
0,1,human,145.25
0,2,elve,153.9
0,2,elve,218.49999999999997
0,2,elve,210.9
1,0,dwarf,166.6
1,0,dwarf,168.0
1,0,dwarf,131.6
1,1,human,150.5
1,1,human,208.25
1,1,human,210.0
1,2,elve,199.5
1,2,elve,161.5
1,2,elve,197.6
where idSeq allow us to see which rows belong to which sequence.
I want to predict the "kind" column thanks to the "size" column.
Below there is my code about make my RNN training on my dataset.
import numpy as np
import pandas as pd
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.logging.set_verbosity(tf.logging.INFO)
dataframe = pd.read_csv("data_rnn.csv")
dataframe_test = pd.read_csv("data_rnn_test.csv")
train_x = dataframe
train_y = dataframe.loc[:,(["kind"])]
size_feature_col = tf.contrib.feature_column.sequence_numeric_column('size ')
estimator = tf.contrib.estimator.RNNClassifier(
sequence_feature_columns=[size_feature_col ],
num_units=[32, 16],
cell_type='lstm',
model_dir=None,
n_classes=3,
optimizer='Adagrad'
)
def make_dataset(
batch_size,
x,
y=None,
shuffle=False,
shuffle_buffer_size=1000,
shuffle_seed=1):
"""
An input function for training, evaluation or prediction.
Parameters
----------------------
batch_size: integer
the size of the batch to use for the training of the neural network
x: pandas dataframe
dataframe that contains the features of the samples to study
y: pandas dataframe or array (Default: None)
dataframe or array that contains the values to predict of the samples
to study. If none, we want a dataset for evaluation or prediction.
shuffle: boolean (Default: False)
if True, we shuffle the elements of the dataset
shuffle_buffer_size: integer (Default: 1000)
if we shuffle the elements of the dataset, it is the size of the buffer
used for it.
shuffle_seed : integer
the random seed for the shuffling
Returns
---------------------
dataset.make_one_shot_iterator().get_next(): Tensor
a nested structure of tf.Tensors containing the next element of the
dataset to study
"""
def input_fn():
if y is not None:
dataset = tf.data.Dataset.from_tensor_slices((dict(x), y))
else:
dataset = tf.data.Dataset.from_tensor_slices(dict(x))
if shuffle:
dataset = dataset.shuffle(
buffer_size=shuffle_buffer_size,
seed=shuffle_seed).batch(batch_size).repeat()
else:
dataset = dataset.batch(batch_size)
return dataset.make_one_shot_iterator().get_next()
return input_fn
batch_size = 50
random_seed = 1
input_fn_train = make_dataset(
batch_size=batch_size,
x=train_x,
y=train_y,
shuffle=True,
shuffle_buffer_size=len(train_x),
shuffle_seed=random_seed)
estimator.train(input_fn=input_fn_train, steps=5000)
But I only got the following error :
INFO:tensorflow:Calling model_fn.
Traceback (most recent call last):
File "main.py", line 125, in <module>
estimator.train(input_fn=input_fn_train, steps=5000)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 358, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1124, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1154, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1112, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 512, in _model_fn
config=config)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 332, in _rnn_model_fn
logits, sequence_length_mask = logit_fn(features=features, mode=mode)
File "/usr/local/lib/python3.5/dist-packages/tensorflow_estimator/contrib/estimator/python/estimator/rnn.py", line 226, in rnn_logit_fn
features=features, feature_columns=sequence_feature_columns)
File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 120, in sequence_input_layer
trainable=trainable)
File "/root/.local/lib/python3.5/site-packages/tensorflow/contrib/feature_column/python/feature_column/sequence_feature_column.py", line 496, in _get_sequence_dense_tensor
sp_tensor, default_value=self.default_value)
File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 1432, in sparse_tensor_to_dense
sp_input = _convert_to_sparse_tensor(sp_input)
File "/root/.local/lib/python3.5/site-packages/tensorflow/python/ops/sparse_ops.py", line 68, in _convert_to_sparse_tensor
raise TypeError("Input must be a SparseTensor.")
TypeError: Input must be a SparseTensor.
So I don't understand what I've done wrong because on the documentation, it is written that we have to give a sequence column to the RNNEstimator. They do not say anything about giving sparse tensor.
Thanks in advance for your help and advices.

Related

Incompatible shapes while using triplet loss and pre-trained resnet

I am trying to use pre-trained resnet and fine-tune it using triplet loss. The following code I came up with is a combination of tutorials I found on the topic:
import pathlib
import tensorflow as tf
import tensorflow_addons as tfa
with tf.device('/cpu:0'):
INPUT_SHAPE = (32, 32, 3)
BATCH_SIZE = 16
data_dir = pathlib.Path('/home/user/dataset/')
base_model = tf.keras.applications.ResNet50V2(
weights='imagenet',
pooling='avg',
include_top=False,
input_shape=INPUT_SHAPE,
)
# following two lines are added after edit, originally it was model = base_model
head_model = tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1))(base_model.output)
model = tf.keras.Model(inputs=base_model.input, outputs=head_model)
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rotation_range=10,
zoom_range=0.1,
)
generator = datagen.flow_from_directory(
data_dir,
target_size=INPUT_SHAPE[:2],
batch_size=BATCH_SIZE,
seed=42,
)
model.compile(
optimizer=tf.keras.optimizers.Adam(0.001),
loss=tfa.losses.TripletSemiHardLoss(),
)
model.fit(
generator,
epochs=5,
)
Unfortunately after running the code I get the following error:
Found 4857 images belonging to 83 classes.
Epoch 1/5
Traceback (most recent call last):
File "ReID/external_process.py", line 35, in <module>
model.fit(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
return method(self, *args, **kwargs)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1098, in fit
tmp_logs = train_function(iterator)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
return self._call_flat(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
outputs = execute.execute(
File "/home/user/videolytics/venv_python/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1328 values, but the requested shape has 16
[[{{node TripletSemiHardLoss/PartitionedCall/Reshape}}]] [Op:__inference_train_function_13749]
Function call stack:
train_function
2020-10-23 22:07:09.094736: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
[[{{node PyFunc}}]]
The dataset directory has 83 subdirectories, one per class and each of this subdirectories contains images of given class. The dimension 1328 in the error output is the batch size (16) times number of classes (83), and the dimension 16 is the batch size (both dimensions change accordingly if I change the BATCH_SIZE.
To be honest I do not really understand the error, so any solution or even any kind of indight where is the problem is deeply appreciated.
The problem is that the TripletSemiHardLoss expects
labels y_true to be provided as 1-D integer Tensor with shape [batch_size] of multi-class integer labels
but the flow_from_directory by default generate categorical labels; using class_mode="sparse" should fix the problem.

String input to categorical column via dataset

I'm trying to learn how to use the Estimator API, using input_fn to provide Dataset backed input to a feature_column generated input layer.
My code looks like
import tensorflow as tf import random
tf.logging.set_verbosity(tf.logging.DEBUG)
def input_fn():
def gen():
for i in range(100000):
for j in range(10):
yield {"in": str(j)}, [str(j+1)]
data = tf.data.Dataset.from_generator(gen, ({"in": tf.string}, tf.string))
data = data.batch(10)
iterator = data.make_one_shot_iterator()
return iterator.get_next()
vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
key="in",
vocabulary_list=map(lambda i: str(i), range(11)))
embedding_column = tf.feature_column.embedding_column(
categorical_column=vocabulary_feature_column,
dimension=2)
with tf.Session() as sess:
print(sess.run(input_fn()))
classifier = tf.estimator.DNNClassifier(
feature_columns = [embedding_column],
hidden_units = [5,5],
n_classes = 11,
model_dir = '/tmp/predict/snap')
classifier.train(
input_fn=input_fn)
but running it I get
Traceback (most recent call last):
File "predict.py", line 33, in
input_fn=input_fn)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 302, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 711, in _train_model
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/estimator.py", line 694, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/canned/dnn.py", line 334, in _model_fn
config=config)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/canned/dnn.py", line 190, in _dnn_model_fn
logits = logit_fn(features=features, mode=mode)
File "/usr/lib/python2.7/site-packages/tensorflow/python/estimator/canned/dnn.py", line 89, in dnn_logit_fn
features=features, feature_columns=feature_columns)
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 230, in input_layer
trainable=trainable)
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 1837, in _get_dense_tensor
inputs, weight_collections=weight_collections, trainable=trainable)
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 2123, in _get_sparse_tensors
return _CategoricalColumn.IdWeightPair(inputs.get(self), None)
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 1533, in get
transformed = column._transform_feature(self) # pylint: disable=protected-access
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 2091, in _transform_feature
input_tensor = _to_sparse_input(inputs.get(self.key))
File "/usr/lib/python2.7/site-packages/tensorflow/python/feature_column/feature_column.py", line 1631, in _to_sparse_input
raise ValueError('Undefined input_tensor shape.')
ValueError: Undefined input_tensor shape.
Looking at the tf sources I get the impression I that the categorical_column_with_vocabulary_list expects a tensor as output instead of a string, but I have a hard time understanding how to make my input_fn provide that the right way.
Does anyone have any idea what I'm doing wrong here?
As a comparison, the following code works perfectly fine: https://pastebin.com/28QUNAjA
EDIT
I noticed that replacing tf.data.Dataset.from_generator with tf.data.Dataset.from_tensor_slices makes the code run.
I.e. the following actually works:
import tensorflow as tf
import random
tf.logging.set_verbosity(tf.logging.DEBUG)
def input_fn():
data = tf.data.Dataset.from_tensor_slices(({"in": map(lambda i: str(i), range(10))}, range(1,11)))
data = data.repeat(1000)
data = data.batch(10)
iterator = data.make_one_shot_iterator()
return iterator.get_next()
vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
key="in",
vocabulary_list=map(lambda i: str(i), range(11)))
embedding_column = tf.feature_column.embedding_column(
categorical_column=vocabulary_feature_column,
dimension=2)
with tf.Session() as sess:
print(sess.run(input_fn()))
classifier = tf.estimator.DNNClassifier(
feature_columns = [embedding_column],
hidden_units = [5,5],
n_classes = 11,
model_dir = '/usr/local/google/home/zond/tmp/predict/snap')
classifier.train(
input_fn=input_fn)
This ought to be a bug, so I created https://github.com/tensorflow/tensorflow/issues/15178.

tensorflow serving: confusion on feature_configs data format

I have followed the tensorflow serving tutorial mnist_saved_model.py
and try to train and export a text-cnn-classifier model
The pipeline is
*embedding layer -> cnn -> maxpool -> cnn -> dropout -> output layer
Tensorflow data input :
data_in = tf.placeholder(tf.int32,[None, sequence_length] , name='data_in')
transformed to
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length],
dtype=tf.int64),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
# use tf.identity() to assign name
data_in = tf.identity(tf_example['x'], name='x')
This works for training phase
but at test time
it tells
AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Expects arg[0] to be int64 but string is provided")
I am confused about the above line
feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length],
dtype=tf.int64),}
I changed the line to
feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length],
dtype=tf.string),}
but it gives the following error at training time:
Traceback (most recent call last):
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/tf_serving/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.py", line 222, in <module>
embedded_chars = tf.nn.embedding_lookup(W, data_in)
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/embedding_ops.py", line 122, in embedding_lookup
return maybe_normalize(_do_gather(params[0], ids, name=name))
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/embedding_ops.py", line 42, in _do_gather
return array_ops.gather(params, ids, name=name)
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/gen_array_ops.py", line 1179, in gather
validate_indices=validate_indices, name=name)
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 589, in apply_op
param_name=input_name)
File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'indices' has DataType string not in list of allowed values: int32, int64
Your code is wrong:
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
that means your input is a string, such as sentence's word. Therefore:
feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length],
dtype=tf.int64),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
That means nothing, in my opinion, because you do not though vocabulary transfer string to int. You need load your train data's vocab to get word index!

why dataset.output_shapes returns demension(none) after batching

I'm using the Dataset API for input pipelines in TensorFlow (version: r1.2). I built my dataset and batched it with a batch size of 128. The dataset fed into the RNN.
Unfortunately, the dataset.output_shape returns dimension(none) in the first dimension, so the RNN raises an error:
Traceback (most recent call last):
File "untitled1.py", line 188, in <module>
tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "untitled1.py", line 121, in main
run_training()
File "untitled1.py", line 57, in run_training
is_training=True)
File "/home/harold/huawei/ConvLSTM/ConvLSTM.py", line 216, in inference
initial_state=initial_state)
File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 566, in dynamic_rnn
dtype=dtype)
File "/home/harold/anaconda2/envs/tensorflow_py2.7/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 636, in _dynamic_rnn_loop
"Input size (depth of inputs) must be accessible via shape inference,"
ValueError: Input size (depth of inputs) must be accessible via shape inference, but saw value None.
I think this error is caused by the shape of input, the first dimension should be batch size but not none.
here is the code:
origin_dataset = Dataset.BetweenS_Dataset(FLAGS.data_path)
train_dataset = origin_dataset.train_dataset
test_dataset = origin_dataset.test_dataset
shuffle_train_dataset = train_dataset.shuffle(buffer_size=10000)
shuffle_batch_train_dataset = shuffle_train_dataset.batch(128)
batch_test_dataset = test_dataset.batch(FLAGS.batch_size)
iterator = tf.contrib.data.Iterator.from_structure(
shuffle_batch_train_dataset.output_types,
shuffle_batch_train_dataset.output_shapes)
(images, labels) = iterator.get_next()
training_init_op = iterator.make_initializer(shuffle_batch_train_dataset)
test_init_op = iterator.make_initializer(batch_test_dataset)
print(shuffle_batch_train_dataset.output_shapes)
I print output_shapes and it gives:
(TensorShape([Dimension(None), Dimension(36), Dimension(100)]), TensorShape([Dimension(None)]))
I suppose that it should be 128, because I have batched dataset:
(TensorShape([Dimension(128), Dimension(36), Dimension(100)]), TensorShape([Dimension(128)]))
This feature has been added with the drop_remainder parameter used like the following:
batch_test_dataset = test_dataset.batch(FLAGS.batch_size, drop_remainder=True)
From the docs:
drop_remainder: (Optional.) A tf.bool scalar tf.Tensor, representing whether the last batch should be dropped in the case its has fewer than batch_size elements; the default behavior is not to drop the smaller batch.
They hardcoded batch size in implementation and it always will return None (tf 1.3).
def _padded_shape_to_batch_shape(s):
return tensor_shape.vector(None).concatenate(
tensor_util.constant_value_as_shape(s))
In this way, they can batch all elements (e.g. dataset_size=14, batch_size=5, last_batch_size=4).
You can use dataset.filter and dataset.map to fix this issue
d = contrib.data.Dataset.from_tensor_slices([[5] for x in range(14)])
batch_size = 5
d = d.batch(batch_size)
d = d.filter(lambda e: tf.equal(tf.shape(e)[0], batch_size))
def batch_reshape(e):
return tf.reshape(e, [args.batch_size] + [s if s is not None else -1 for s in e.shape[1:].as_list()])
d = d.map(batch_reshape)
r = d.make_one_shot_iterator().get_next()
print('dataset_output_shape = %s' % r.shape)
with tf.Session() as sess:
while True:
print(sess.run(r))
Output
dataset_output_shape = (5, 1)
[[5][5][5][5][5]]
[[5][5][5][5][5]]
OutOfRangeError

Running LinearClassifier.fit with SparseTensor

I'm trying to create a LinearClassifer with a sparse binary numpy coo matrix (reports) using a SparseTensor. This is with TensorFlow 0.9.0
I do this as follows:
reports_indices = list()
rows,cols = reports.nonzero()
for row,col in zip(rows,cols):
reports_indices.append([row,col])
x_sparsetensor = tf.SparseTensor(
indices=reports_indices,
values=[1] * len(reports_indices),
shape=[reports.shape[0],reports.shape[1]])
The dimensions of reports is 10K by 1.5K.
I then setup the LinearClassifier as follows:
m = tf.contrib.learn.LinearClassifier()
m.fit(x=x_sparsetensor,y=response_vector.todense(),input_fn=None)
Response vector is binary and has a length of 10K. This results in the following error:
Traceback (most recent call last):
File "ddi_prr.py", line 38, in <module>
m.fit(x=x_sparsetensor,y=response_vector.todense(),input_fn=None)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 173, in fit
input_fn, feed_fn = _get_input_fn(x, y, batch_size)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 67, in _get_input_fn
x, y, n_classes=None, batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 117, in setup_train_data_feeder
X, y, n_classes, batch_size, shuffle=shuffle, epochs=epochs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 240, in __init__
batch_size)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 44, in _get_in_out_shape
x_shape = list(x_shape[1:]) if len(x_shape) > 1 else [1]
TypeError: object of type 'Tensor' has no len()
Is my construction incorrect for some reason? It seems that LinearClassifier.fit can't be instantiated with a SparseTensor for x, is that true? Thanks in advance for any help.
As far as I know, passing SparseTensors as x or y arguments to .fit is not supported:
x: matrix or tensor of shape [n_samples, n_features...]. Can be
iterator that returns arrays of features. The training input samples
for fitting the model. If set, input_fn must be None.
Also, SparseTensor is a sparse equivalent of Tensor -- an object representing symbolic computation to be executed. I believe what you would like to use as x is SparseTensorValue.
You can try pass it using other way of passing data to Estimator: input_fn function:
def get_input_fn(sparse_x, y):
def input_fn():
return sparse_x, y
m.fit(input_fn=get_input_fn(x, y))
if it won't work, you may try to produce the SparseTensors inside the input_fn function.