Why does the weight matrix of the mxnet.gluon.nn.Dense object has no shape? - mxnet

I try to follow this nice MXNet Tutorial. I create an extremely simple neural network (two input unit, no hidden units and one output unit) doing this:
from mxnet import gluon
net = gluon.nn.Dense(1, in_units=2)
After that I try to take a look at the shape of the weight matrix (the same way as it is described in the tutorial):
print(net.weight)
As a result I expect to see this:
Parameter dense4_weight (shape=(1, 2), dtype=None)
However, I see the following error message:
Traceback (most recent call last):
File "tmp.py", line 5, in <module>
print(net.weight)
File "/usr/local/lib/python3.6/site-packages/mxnet/gluon/parameter.py", line 120, in __repr__
return s.format(**self.__dict__)
KeyError: 'shape'
Am I doing something wrong?

This is a regression that happened here and has since been fixed on master branch here. Expect it to be fixed in the next MXNet release.

Related

TypeError: can’t convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first (fastai)

I am following the code here:
https://www.kaggle.com/tanlikesmath/diabetic-retinopathy-with-resnet50-oversampling
However, during the metrics calculation, I am getting the following error:
File "main.py", line 50, in <module>
learn.fit_one_cycle(4,max_lr = 2e-3)
...
File "main.py", line 39, in quadratic_kappa
return torch.tensor(cohen_kappa_score(torch.argmax(y_hat,1), y, weights='quadratic'),device='cuda:0')
...
File "/pfs/work7/workspace/scratch/ul_dco32-conda-0/conda/envs/resnet50/lib/python3.8/site-packages/torch/tensor.py", line 486, in __array__
return self.numpy()
TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
Here are the metrics and the model:
def quadratic_kappa(y_hat, y):
return torch.tensor(cohen_kappa_score(torch.argmax(y_hat,1), y, weights='quadratic'),device='cuda:0')
learn = cnn_learner(data, models.resnet50, metrics = [accuracy,quadratic_kappa])
learn.fit_one_cycle(4,max_lr = 2e-3)
As it is being said in the discussion https://discuss.pytorch.org/t/typeerror-can-t-convert-cuda-tensor-to-numpy-use-tensor-cpu-to-copy-the-tensor-to-host-memory-first/32850/6, I have to bring the data back to cpu. But I am slightly lost how to do it.
I tried to add .cpu() all over the metrics but could not solve it so far.
I'm assuming that both y and y_hat are CUDA tensors, that means that you need to bring them both to the CPU for the cohen_kappa_score, not just one.
def quadratic_kappa(y_hat, y):
return torch.tensor(cohen_kappa_score(torch.argmax(y_hat.cpu(),1), y.cpu(), weights='quadratic'),device='cuda:0')
# ^^^ ^^^
Calling .cpu() on a tensor that is already on the CPU has no effect, so it's safe to use in any case.
I went from a CPU to a GPU version and received this error. It was due to passing metrics=[mean_absolute_error,mean_squared_error] to the Learner object (in my case tabular_learner).
Removing the metric parameter solved the issue temporarily for me.

Problem converting tensorflow saved_model to tensorflowjs

I want to convert trained python model (.pb) to tensorflowjs model. To accomplish this, first I saved the model with estimator.export_savedmodel function, then I run the tensorflowjs_converter command on Google Colab. However, no file is created for tensorflowjs. The conversion also gives a lot of warning and ends with an error.
Here is the full code, and please run to see the full output:
https://colab.research.google.com/drive/19k2s8eHpQY9Trps9dyaxPp0HqHWp5qpb
What is the reason of the problem and how can I fix it?
Part of the output:
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
Traceback (most recent call last):
File "/usr/local/bin/tensorflowjs_converter", line 8, in <module>
sys.exit(pip_main())
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 638, in pip_main
main([' '.join(sys.argv[1:])])
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 642, in main
convert(argv[0].split(' '))
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/converter.py", line 591, in convert
strip_debug_ops=args.strip_debug_ops)
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 435, in convert_tf_saved_model
strip_debug_ops=strip_debug_ops)
File "/usr/local/lib/python3.6/dist-packages/tensorflowjs/converters/tf_saved_model_conversion_v2.py", line 141, in optimize_graph
', '.join(unsupported))
ValueError: Unsupported Ops in the model before optimization
ParallelDynamicStitch, StringSplit, Unique, RegexReplace, DynamicPartition, StringToHashBucketFast, ParseExample, LookupTableFindV2, LookupTableSizeV2, SparseFillEmptyRows, StringJoin, AsString, SparseSegmentSqrtN, HashTableV2
Edit:
Seems like it isn't supported:
https://github.com/tensorflow/tfjs/issues/2322
This is because your model has ops that are not supported by tensorflow.js yet. And seems like you missed the missing op name in the output you pasted. Please feel free to update the output with missing op name or file a feature request in the tensorflow.js repo with more details.

TFX Transform Rank Mismatch While Loading/Applying TFX Beam Transform Graph

I've already successfully fit a TFTransformOutput to some data (in this case, the Census dataset from UCI common amongst the TF and TFX examples.) I try to apply the transformer with the method transform_raw_features(raw_features) but keep getting the error:
ValueError: Node 'transform/transform/inputs/workclass_copy' has an
_output_shapes attribute inconsistent with the GraphDef for output #0: Shapes must be equal rank, but are 0 and 1
Digging into the source code, it seems the error originates in saved_transform_io in the method _partially_apply_saved_transform_impl while doing:
saver = tf_saver.import_meta_graph(meta_graph_def, import_scope=import_scope,
input_map=input_map)
I examined the meta_graph_def produced by TFX TFTransform and Beam and notice that the graph indeed has a series of copied variables with input/output rank differences. However, that is nothing I have control over.
The column in the error message is "workclass" which is a simple categorical column. What might I be doing incorrectly? What is the best way to debug this? At this point, I've already dug deep into the TF source code but the error seems to originate with how the TFTransform graph was written, not sure what levers I have to change/fix that.
This is using TF Transform v0.9 and the corresponding TF v1.9
Traceback (most recent call last): File
"/home/sahmed/workspace/ml_playground/TFX-TFT/trainers.py", line 449,
in parse_csv
transformed_stuff=xformer.transform_raw_features(raw_features) File
"/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/output_wrapper.py",
line 122, in transform_raw_features
self.transform_savedmodel_dir, raw_features)) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py",
line 360, in partially_apply_saved_transform_internal
saved_model_dir, logical_input_map, tensor_replacement_map) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow_transform/saved/saved_transform_io.py",
line 218, in _partially_apply_saved_transform_impl
input_map=input_map) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/training/saver.py",
line 1960, in import_meta_graph
**kwargs) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py",
line 744, in import_scoped_meta_graph
producer_op_list=producer_op_list) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py",
line 432, in new_func
return func(*args, **kwargs) File "/home/sahmed/miniconda3/envs/kml2/lib/python2.7/site-packages/tensorflow/python/framework/importer.py",
line 422, in import_graph_def
raise ValueError(str(e)) ValueError: Node 'transform/transform/inputs/workclass_copy' has an _output_shapes
attribute inconsistent with the GraphDef for output #0: Shapes must be
equal rank, but are 0 and 1
The issue is likely that the shape of the workclass tensor is incompatible with what transform_raw_features expects.
TFTransformOutput.transform_raw_features() expects these features to have the same characteristics as described in the metadata given to tft.AnalyzeDataset() similarly to how it's done in this example:
https://github.com/tensorflow/transform/blob/master/examples/simple_example.py#L63
Could you take a look at the metadata used in your pipeline and see that it is compatible with the data fed into TFTransformOutput.transform_raw_features()?

pix2pixHD error with own dataset

I am trying to generate my own images using the pix2pixHD pre-trained model. Github repo found here
The images inside the dataset has to be in grayscale with no alpha channel. The images in the repo has a size of 16 bitPerSample and I have both images in size 8 and 16 bitsPerSample.
When I check both my images and the images in the repo using sips -g all. This is the outcome I get:
pixelWidth: 2048
pixelHeight: 1024
typeIdentifier: public.png
format: png
formatOptions: default
dpiWidth: 72.000
dpiHeight: 72.000
samplesPerPixel: 1
bitsPerSample: 16
hasAlpha: no
space: Gray
The strange thing is that it works with the images that has 8 bitPerSample.
This is the outcome I get:
Grayscale input
Converted label map
Final output
When I run test.py with 16 bitsPerSample images, it doesn't work.
This is the error it gives me:
model [Pix2PixHDModel] was created
Traceback (most recent call last):
File "test.py", line 26, in <module>
for i, data in enumerate(dataset):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 210, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 230, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 42, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/paperspace/Documents/pix2pixHD/data/aligned_dataset.py", line 41, in __getitem__
label_tensor = transform_label(label) * 255.0
File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 309, in __mul__
return self.mul(other)
TypeError: mul received an invalid combination of arguments - got (float), but expected one of:
* (int value)
didn't match because some of the arguments have invalid types: (float)
* (torch.IntTensor other)
didn't match because some of the arguments have invalid types: (float)
I am new fairly to Tensorflow and I have never used pytorch before.
Any idea what this error mean and how can I resolve it?
Yes, I think I can help you.
I haven't checked the repository, but from the error trace the problem appears to be following:
You are performing a multiplication operation betweenn the output of transform_label(label) (presumably a tensor) and a scalar 255.0. This is fine as long as both your scalar and your tensor are of the same datatype. From the error trace however, it looks as if the output of transform_label() is of data type Int / Long, while 255.0 is a float.
I suggest you try 255 or int(255.0) instead of 255.0.
If this does not resolve your problem, let me know what data type the output of transform_label() is.

Clustering of sparse matrix in python and scipy

I'm trying to cluster some data with python and scipy but the following code does not work for reason I do not understand:
from scipy.sparse import *
matrix = dok_matrix((en,en), int)
for pub in pubs:
authors = pub.split(";")
for auth1 in authors:
for auth2 in authors:
if auth1 == auth2: continue
id1 = e2id[auth1]
id2 = e2id[auth2]
matrix[id1, id2] += 1
from scipy.cluster.vq import vq, kmeans2, whiten
result = kmeans2(matrix, 30)
print result
It says:
Traceback (most recent call last):
File "cluster.py", line 40, in <module>
result = kmeans2(matrix, 30)
File "/usr/lib/python2.7/dist-packages/scipy/cluster/vq.py", line 683, in kmeans2
clusters = init(data, k)
File "/usr/lib/python2.7/dist-packages/scipy/cluster/vq.py", line 576, in _krandinit
return init_rankn(data)
File "/usr/lib/python2.7/dist-packages/scipy/cluster/vq.py", line 563, in init_rankn
mu = np.mean(data, 0)
File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 2374, in mean
return mean(axis, dtype, out)
TypeError: mean() takes at most 2 arguments (4 given)
When I'm using kmenas instead of kmenas2 I have the following error:
Traceback (most recent call last):
File "cluster.py", line 40, in <module>
result = kmeans(matrix, 30)
File "/usr/lib/python2.7/dist-packages/scipy/cluster/vq.py", line 507, in kmeans
guess = take(obs, randint(0, No, k), 0)
File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 103, in take
return take(indices, axis, out, mode)
TypeError: take() takes at most 3 arguments (5 given)
I think I have the problems because I'm using sparse matrices but my matrices are too big to fit the memory otherwise. Is there a way to use standard clustering algorithms from scipy with sparse matrices? Or I have to re-implement them myself?
I created a new version of my code to work with vector space
el = len(experts)
pl = len(pubs)
print el, pl
from scipy.sparse import *
P = dok_matrix((pl, el), int)
p_id = 0
for pub in pubs:
authors = pub.split(";")
for auth1 in authors:
if len(auth1) < 2: continue
id1 = e2id[auth1]
P[p_id, id1] = 1
from scipy.cluster.vq import kmeans, kmeans2, whiten
result = kmeans2(P, 30)
print result
But I'm still getting the error:
TypeError: mean() takes at most 2 arguments (4 given)
What am I doing wrong?
K-means cannot be run on distance matrixes.
It needs a vector space to compute means in, that is why it is called k-means. If you want to use a distance matrix, you need to look into purely distance based algorithms such as DBSCAN and OPTICS (both on Wikipedia).
May I suggest, "Affinity Propagation" from scikit-learn? On the work I've been doing with it, I find that it has generally been able to find the 'naturally' occurring clusters within my data set. The inputs into the algorithm are an affinity matrix, or similarity matrix, of any arbitrary similarity measure.
I don't have a good handle on the kind of data you have on hand, so I can't speak to the exact suitability of this method to your data set, but it may be worth a try, perhaps?
Alternatively, if you're looking to cluster graphs, I'd take a look at NetworkX. That might be a useful tool for you. The reason I suggest this is because it looks like the data you're looking to work with networks of authors. Hence, with NetworkX, you can put in an adjacency matrix and find out which authors are clustered together.
For a further elaboration on this, you can see a question that I had asked earlier for inspiration here.