pix2pixHD error with own dataset - tensorflow

I am trying to generate my own images using the pre-trained pix2pixHD model. Github repo found here
The images in the dataset have to be grayscale with no alpha channel. The images in the repo have 16 bitsPerSample, and I have images with both 8 and 16 bitsPerSample.
When I check both my images and the images in the repo using sips -g all, this is the output I get:
pixelWidth: 2048
pixelHeight: 1024
typeIdentifier: public.png
format: png
formatOptions: default
dpiWidth: 72.000
dpiHeight: 72.000
samplesPerPixel: 1
bitsPerSample: 16
hasAlpha: no
space: Gray
The strange thing is that it works with the images that have 8 bitsPerSample.
This is the outcome I get (images not shown): grayscale input, converted label map, and final output.
When I run test.py with 16 bitsPerSample images, it doesn't work.
This is the error it gives me:
model [Pix2PixHDModel] was created
Traceback (most recent call last):
File "test.py", line 26, in <module>
for i, data in enumerate(dataset):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 210, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 230, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/paperspace/Documents/pix2pixHD/data/aligned_dataset.py", line 41, in __getitem__
label_tensor = transform_label(label) * 255.0
File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 309, in __mul__
return self.mul(other)
TypeError: mul received an invalid combination of arguments - got (float), but expected one of:
* (int value)
didn't match because some of the arguments have invalid types: (float)
* (torch.IntTensor other)
didn't match because some of the arguments have invalid types: (float)
I am fairly new to Tensorflow and I have never used pytorch before.
Any idea what this error means and how I can resolve it?

Yes, I think I can help you.
I haven't checked the repository, but from the error trace the problem appears to be the following:
You are multiplying the output of transform_label(label) (presumably a tensor) by the scalar 255.0. This is fine as long as the scalar and the tensor have the same data type. From the error trace, however, it looks as if the output of transform_label() is of data type Int / Long, while 255.0 is a float.
I suggest you try 255 or int(255.0) instead of 255.0.
If this does not resolve your problem, let me know what data type the output of transform_label() is.
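Since the failure only shows up with the 16 bitsPerSample files, it may help to confirm each file's bit depth from Python before it reaches the data loader (sips is only available on macOS). Below is a minimal sketch that reads the IHDR chunk at the start of a PNG file using only the standard library. The function names parse_png_ihdr and png_file_info are hypothetical helpers, not part of pix2pixHD or PyTorch.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def parse_png_ihdr(data):
    """Return (width, height, bit_depth, color_type) from the first bytes of a PNG.

    Hypothetical helper: `data` must hold at least the first 26 bytes of the file.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # After the 8-byte signature: 4-byte chunk length, 4-byte chunk type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    # IHDR starts with width (4 bytes), height (4), bit depth (1), color type (1).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth, color_type


def png_file_info(path):
    """Read only the header bytes of the file at `path` and parse them."""
    with open(path, "rb") as handle:
        return parse_png_ihdr(handle.read(26))
```

A color type of 0 means grayscale without alpha, so a file matching the sips output above would give (2048, 1024, 16, 0), and an 8-bit grayscale file would report a bit depth of 8.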

Related

Pandas to_hdf fails on dataframes containing nullable int dtypes (e.g. Int8Dtype)

I'm trying to reduce the memory consumption of some large data that we work with, so that more data can be appended to it without throwing memory errors. Downcasting floats where possible helps a little, but the major savings I've found have been from casting float64s to Int8 and Int16 where possible. This data contains NaNs. This is unavoidable, and in context there is no value I can replace NaNs with that doesn't change the meaning of the data. The new nullable dtypes are great for this, but I get ValueError: cannot convert float NaN to integer when trying to save the resulting frames to hdf.
I've tried using to_hdf with and without specifying table format, and get different errors (without specifying table format, the error is AttributeError: 'NoneType' object has no attribute 'names').
```
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, np.nan, 5], columns=['A'])
df.to_hdf('Z:/test.hd5', 'data')  # This works

# After casting to a nullable integer dtype, saving fails
df['A'] = df.A.astype(pd.Int8Dtype())
df.to_hdf('Z:/test.hd5', 'data')
Traceback (most recent call last):
File "<ipython-input-51-6b0f3ad26286>", line 1, in <module>
df.to_hdf('Z:/test.hd5', 'data', complevel=9, complib='blosc:zlib')
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 2377, in to_hdf
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 274, in to_hdf
f(store)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 268, in <lambda>
f = lambda store: store.put(key, value, **kwargs)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 889, in put
self._write_to_group(key, value, append=append, **kwargs)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 1415, in _write_to_group
s.write(obj=value, append=append, complib=complib, **kwargs)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 3022, in write
blk.values, items=blk_items)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py", line 2750, in write_array
atom = _tables().Atom.from_dtype(value.dtype)
File "C:\Users\marnoch.hamilton-jon\AppData\Local\Continuum\anaconda3\lib\site-packages\tables\atom.py", line 381, in from_dtype
if basedtype.names:
AttributeError: 'NoneType' object has no attribute 'names'
```
Is this a bug? An intentional limitation? Or have I done something dumb?

This is a bug. See GitHub Issue #26144 for the status.

TFX Transform Rank Mismatch While Loading/Applying TFX Beam Transform Graph

I've already successfully fit a TFTransformOutput to some data (in this case, the Census dataset from UCI common amongst the TF and TFX examples.) I try to apply the transformer with the method transform_raw_features(raw_features) but keep getting the error:
ValueError: Node 'transform/transform/inputs/workclass_copy' has an _output_shapes attribute inconsistent with the GraphDef for output #0: Shapes must be equal rank, but are 0 and 1
Digging into the source code, it seems the error originates in saved_transform_io in the method _partially_apply_saved_transform_impl while doing:
saver = tf_saver.import_meta_graph(meta_graph_def, import_scope=import_scope,
input_map=input_map)
I examined the meta_graph_def produced by TFX TFTransform and Beam and noticed that the graph indeed has a series of copied variables with input/output rank differences. However, that is nothing I have control over.
The column in the error message is "workclass", which is a simple categorical column. What might I be doing incorrectly? What is the best way to debug this? At this point, I've already dug deep into the TF source code, but the error seems to originate with how the TFTransform graph was written, and I'm not sure what levers I have to change or fix that.
This is using TF Transform v0.9 and the corresponding TF v1.9.
Traceback (most recent call last):
File "/home/sahmed/workspace/ml_playground/TFX-TFT/trainers.py", line 449, in parse_csv
transformed_stuff=xformer.transform_raw_features(raw_features)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/output_wrapper.py", line 122, in transform_raw_features
self.transform_savedmodel_dir, raw_features))
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py", line 360, in partially_apply_saved_transform_internal
saved_model_dir, logical_input_map, tensor_replacement_map)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py", line 218, in _partially_apply_saved_transform_impl
input_map=input_map)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1960, in import_meta_graph
**kwargs)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 744, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 422, in import_graph_def
raise ValueError(str(e))
ValueError: Node 'transform/transform/inputs/workclass_copy' has an _output_shapes attribute inconsistent with the GraphDef for output #0: Shapes must be equal rank, but are 0 and 1

The issue is likely that the shape of the workclass tensor is incompatible with what transform_raw_features expects.
TFTransformOutput.transform_raw_features() expects these features to have the same characteristics as described in the metadata given to tft.AnalyzeDataset() similarly to how it's done in this example:
https://github.com/tensorflow/transform/blob/master/examples/simple_example.py#L63
Could you take a look at the metadata used in your pipeline and see that it is compatible with the data fed into TFTransformOutput.transform_raw_features()?

UnicodeDecodeError from tf.train.import_meta_graph

I serialized a Tensorflow model with the following code ...
save_path = self.saver.save(self.session, os.path.join(self.logdir, "model.ckpt"), global_step)
logging.info("Model saved in file: %s" % save_path)
... and I'm now trying to restore it from scratch in a separate file using the following code:
saver = tf.train.import_meta_graph(PROJ_DIR + '/logs/default/model.ckpt-54.meta')
session = tf.Session()
saver.restore(session, PROJ_DIR + '/logs/default/model.ckpt-54')
print('Model restored')
When tf.train.import_meta_graph is called, the following exception is thrown:
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:207] A protocol message was rejected because it was too big (more than 67108864 bytes). To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
Traceback (most recent call last):
File "/home/reid/projects/research/ccg/taggerflow_modified/test/tf_restore.py", line 4, in <module>
saver = tf.train.import_meta_graph(PROJ_DIR + '/logs/default/model.ckpt-54.meta')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1711, in import_meta_graph
read_meta_graph_file(meta_graph_or_file), clear_devices)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1563, in read_meta_graph_file
text_format.Merge(file_content.decode("utf-8"), meta_graph_def)
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa7 in position 1: invalid start byte
For reference, here's the first few lines of <PROJ_DIR>/logs/default/model.ckpt-54.meta:
<A7>:^R<A4>:
9
^CAdd^R^F
^Ax"^AT^R^F
^Ay"^AT^Z^F
^Az"^AT"^Z
^AT^R^Dtype:^O
^M2^K^S^A^B^D^F^E^C ^R^G
I think that Tensorflow is using a different encoding when serializing vs when deserializing. How do we specify the encoding that Tensorflow uses when serializing/deserializing? Or is the solution something different?

I was facing the same issue. Have you ensured that, apart from the .meta, .data-00000-of-00001 and .index files, the file named 'checkpoint' is also in the directory from which you're loading the model?
My issue got resolved after I made sure of this. Hope this helps!
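A quick way to check this before calling tf.train.import_meta_graph is to list which of those files are missing. The following is only a sketch using the standard library: missing_checkpoint_files is a hypothetical helper name, and the set of expected files is taken from the list above rather than from any TensorFlow API.

```python
import glob
import os


def missing_checkpoint_files(prefix):
    """Return the expected files that are absent for a checkpoint prefix.

    `prefix` is a path such as '<PROJ_DIR>/logs/default/model.ckpt-54'.
    Hypothetical helper: the expected file set is an assumption based on
    the files named in this answer.
    """
    directory = os.path.dirname(prefix) or "."
    missing = []
    # The graph definition and the index of saved variables.
    for suffix in (".meta", ".index"):
        if not os.path.isfile(prefix + suffix):
            missing.append(prefix + suffix)
    # Variable values are stored in one or more .data-XXXXX-of-YYYYY shards.
    if not glob.glob(glob.escape(prefix) + ".data-*"):
        missing.append(prefix + ".data-*")
    # The plain-text 'checkpoint' state file lives next to the other files.
    state_file = os.path.join(directory, "checkpoint")
    if not os.path.isfile(state_file):
        missing.append(state_file)
    return missing
```

An empty list means all four kinds of file are present. If the error persists in that case, the cause is probably something else.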

Why does the weight matrix of the mxnet.gluon.nn.Dense object have no shape?

I am trying to follow this nice MXNet Tutorial. I create an extremely simple neural network (two input units, no hidden units and one output unit) by doing this:
from mxnet import gluon
net = gluon.nn.Dense(1, in_units=2)
After that I try to take a look at the shape of the weight matrix (the same way as it is described in the tutorial):
print(net.weight)
As a result I expect to see this:
Parameter dense4_weight (shape=(1, 2), dtype=None)
However, I see the following error message:
Traceback (most recent call last):
File "tmp.py", line 5, in <module>
print(net.weight)
File "/usr/local/lib/python3.6/site-packages/mxnet/gluon/parameter.py", line 120, in __repr__
return s.format(**self.__dict__)
KeyError: 'shape'
Am I doing something wrong?

This is a regression that happened here and has since been fixed on the master branch here. Expect it to be fixed in the next MXNet release.

TypeError: Input 'split_dim' of 'Split' Op has type float32 that does not match expected type of int32

Running a tensorflow model generates the following error message. From Googling, it seems to be caused by the tensorflow version being < 0.12.0. The version of tensorflow I am using is 0.12.0-rc0.
File "/home/ug/GPU-Study/keras/FCN/fcn/tensorflow_fcn/fcn8_vgg.py", line 60, in build
red, green, blue = tf.split(rgb, 3, 3)
File "/devl/tensorflow/tf_0.12/lib/python3.4/site-packages/tensorflow/python/ops/array_ops.py", line 1159, in split
name=name)
File "/devl/tensorflow/tf_0.12/lib/python3.4/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3241, in _split
num_split=num_split, name=name)
File "/devl/tensorflow/tf_0.12/lib/python3.4/site-packages/tensorflow/python/framework/op_def_library.py", line 508, in apply_op
(prefix, dtypes.as_dtype(input_arg.type).name))
TypeError: Input 'split_dim' of 'Split' Op has type float32 that does not match expected type of int32.

In your linked SO post above, @alphaleonis points out that Tensorflow 0.12.0-rc0 and 0.12.0-rc1 are pre-release versions that still contain the old syntax. The new syntax change didn't take effect until v0.12.0.
So update to v0.12 or, better, v1.0.0. Good luck!
Update:
If you cannot update the underlying libraries, then you should use the previous/older/pre-v0.12 signature for tf.split:
# tf.split(axis, num_or_size_splits, value)
In your case use:
red, green, blue = tf.split(3, 3, rgb)
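If the same code has to run against both argument orders, one option is a small wrapper that tries the newer keyword names first and falls back to the older positional order. This is only a sketch. split_compat is a hypothetical helper, and the two classes below are stand-ins that mimic the two signatures so the example runs without TensorFlow; they are not the real library. The keyword names value, num_or_size_splits and axis are those of the newer tf.split, and the older parameter names are the ones shown in the traceback above.

```python
def split_compat(tf_module, value, num_splits, axis):
    """Call tf_module.split with whichever argument order it accepts.

    Hypothetical helper: pass the imported tensorflow module as tf_module.
    """
    try:
        # Newer signature: keyword names are matched by name, so order is irrelevant.
        return tf_module.split(value=value, num_or_size_splits=num_splits, axis=axis)
    except TypeError:
        # Older signature rejects the unknown keywords; use its positional order.
        return tf_module.split(axis, num_splits, value)


class OldStyleTF:
    """Stand-in mimicking the older argument order (not real TensorFlow)."""

    @staticmethod
    def split(split_dim, num_split, value, name="split"):
        return ("old", split_dim, num_split, value)


class NewStyleTF:
    """Stand-in mimicking the newer argument order (not real TensorFlow)."""

    @staticmethod
    def split(value, num_or_size_splits, axis=0, name="split"):
        return ("new", value, num_or_size_splits, axis)


print(split_compat(OldStyleTF, "rgb", 3, 3))
print(split_compat(NewStyleTF, "rgb", 3, 3))
```

With the real library the call would look like red, green, blue = split_compat(tf, rgb, 3, 3). Note that catching TypeError this broadly can also hide an unrelated TypeError raised inside the newer function, so pinning a single TensorFlow version is still the cleaner fix.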