No files matched pattern error with Tensorflow - tensorflow

I started writing the code below to review my dataset. I'm trying to load my images, so I pass the path to the images.
import tensorflow as tf
import json
import numpy as np
from matplotlib import pyplot as plt
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
print(tf.config.list_physical_devices('GPU'))
images = tf.data.Dataset.list_files('data\\images\\*.jpg',shuffle=False)
But I got the error below.
Expected 'tf.Tensor(False, shape=(), dtype=bool)' to be true. Summarized data: b'No files matched pattern:: data\\images\\*.jpg'
As you can see, my folder hierarchy is correct, and there are images in .jpg format in the images folder.
I also tried changing the string to the variations below, but none of them worked.
'\\data\\images\\*.jpg'
'/data/images/*.jpg'
'data/images/*.jpg'
What am I missing here? Can you help me out, please?
UPDATE:
I couldn't make it work with a relative path, so I went with the absolute path, and it worked.
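
A relative pattern is resolved against the process's current working directory, which in a notebook or IDE is not necessarily the folder that contains the script. The sketch below is one way to check this before calling TensorFlow; the helper names absolute_pattern and count_matches are hypothetical, and it only uses the standard library.

```python
import glob
import os


def absolute_pattern(pattern, base_dir=None):
    """Anchor a relative glob pattern to base_dir (default: current working directory)."""
    base = os.getcwd() if base_dir is None else base_dir
    return os.path.abspath(os.path.join(base, pattern))


def count_matches(pattern):
    """Return how many files on disk match the glob pattern."""
    return len(glob.glob(pattern))


if __name__ == "__main__":
    # Show where relative paths are resolved from, and what the pattern expands to.
    print("Working directory:", os.getcwd())
    pattern = absolute_pattern(os.path.join("data", "images", "*.jpg"))
    print(pattern, "->", count_matches(pattern), "file(s)")
    # If the count is non-zero, the same string can be passed on:
    # images = tf.data.Dataset.list_files(pattern, shuffle=False)
```

If the count is zero, the working directory printed above is probably not the project folder, which would explain why only the absolute path worked.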

Related

How to save histogram plots from Tensorboard 2 to disk, just like you can do with scalars?

I am using Tensorboard 2 to visualize my training data and I am able to save scalar plots to disk. However, I am unable to find a way to do this for histogram plots (tf.summary.histogram).
Is it possible to save histogram plots from Tensorboard 2 to disk, just like it is possible to do with scalars? I have looked through the documentation and it seems like this is not supported, but I wanted to confirm with the community before giving up. Any help or suggestions would be greatly appreciated.
There is an open issue to add a download button for histograms. However, this issue has been open for more than 4 years, so I doubt it will be resolved soon.
A workaround is to use the URL that TensorBoard itself uses to get the data.
A short example:
# writing some data to tensorboard
from torch.utils.tensorboard import SummaryWriter
import numpy as np
writer = SummaryWriter('./tmp')
writer.add_histogram('hist', np.arange(10), 0)
Open tensorboard in the browser (here localhost:6006):
Get data as JSON using the template
http://<tb-host>/data/plugin/histograms/histograms?run=<run-name>&tag=<tag-name>.
Here http://localhost:6006/data/plugin/histograms/histograms/?run=.&tag=hist:
Now you can download the data as JSON.
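
The same download can be scripted instead of done in the browser. This is a minimal sketch that assumes the endpoint template quoted above and a TensorBoard server that is reachable without authentication; the function names histogram_url and download_histograms are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def histogram_url(host, run, tag):
    """Build the histogram data URL from the template in the answer."""
    query = urlencode({"run": run, "tag": tag})
    return f"{host.rstrip('/')}/data/plugin/histograms/histograms?{query}"


def download_histograms(host, run, tag, out_path):
    """Fetch the histogram JSON from a running TensorBoard and save it to out_path."""
    with urllib.request.urlopen(histogram_url(host, run, tag)) as response:
        payload = json.load(response)
    with open(out_path, "w") as f:
        json.dump(payload, f)
    return payload


# Example (requires TensorBoard running on localhost:6006):
# download_histograms("http://localhost:6006", ".", "hist", "histograms.json")
```

Using urlencode keeps run and tag names with slashes or spaces valid in the query string.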
Quick comparison with matplotlib:
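import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# load the JSON file downloaded from TensorBoard
with open('histograms.json', 'r') as f:
    d = pd.DataFrame(json.load(f)[0][2])
# the values that were written with add_histogram above
data = np.arange(10)
fig, axes = plt.subplots(1, 2, figsize=(10, 3))
axes[0].bar(d[1], d[2])
axes[0].set_title('tb')
axes[1].hist(data)
axes[1].set_title('original')
plt.show()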

Trying to access `splits['test']` but `splits` is empty. This likely indicate the dataset has not been generated yet

When I try to run the 'Adjust the training configuration' code section in docs\vision\image_classification.ipynb, I see the error 'Trying to access splits['test'] but splits is empty. This likely indicate the dataset has not been generated yet.'
I see that splits in the dataset_info of tensorflow_datasets is empty when using 'cifar10'.
I would like to know the reason.
Please have a look into this:
import tensorflow as tf
import tensorflow_datasets as tfds
dataset, ds_info = tfds.load('cifar10',
                             split=['train', 'test'],
                             shuffle_files=True,
                             as_supervised=True,
                             with_info=True)
print(ds_info.splits['train'].num_examples)
print(ds_info.splits['test'].num_examples)
Output:
50000
10000

Loading a CSV file with no header on my Colab with Pandas read_csv and NumPy loadtxt gave me different results

This is the image of the error on my Colab when I used pd.dtype on pd_data
This is the image of the error on my Colab when I used np.dtype on pd_data
I loaded one CSV file into my Colab notebook in two different ways: with pd.read_csv() and with np.loadtxt(). I assigned the results to pd_data and nd_data, respectively. After that, I printed the shape of each. At this point I got two different shapes even though I loaded the same CSV file.
My question is why I got two different shapes when loading the same data.
This is the link to the ThoraricSurgery.csv file that I used.
from google.colab import drive
drive.mount('/content/drive')
import pandas as pd
pd_data = pd.read_csv('/content/drive/MyDrive/딥러닝과실습1/ThoraricSurgery.csv')
print(pd_data.shape)
print(type(pd_data))
import numpy as np
nd_data = np.loadtxt('/content/drive/MyDrive/딥러닝과실습1/ThoraricSurgery.csv', delimiter=",")
print(nd_data.shape)
print(type(nd_data))
This is the mentioned result.
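
One likely explanation, assuming the file really has no header row as the title says: pd.read_csv() treats the first line as column names by default, so the DataFrame ends up one row shorter than the array from np.loadtxt(), which reads every line as data. The sketch below reproduces this with a small in-memory CSV (made-up values, not the ThoraricSurgery.csv data) and shows that header=None makes the shapes agree.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical headerless CSV with 3 rows and 3 columns.
csv_text = "1,2,3\n4,5,6\n7,8,9\n"

# Default: the first line is consumed as the header, leaving 2 data rows.
with_default = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is treated as data.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# np.loadtxt has no notion of a header unless skiprows is set.
loaded = np.loadtxt(io.StringIO(csv_text), delimiter=",")

print(with_default.shape)  # (2, 3)
print(no_header.shape)     # (3, 3)
print(loaded.shape)        # (3, 3)
```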

How to implement SciBERT with pytorch; error while loading

I am trying to use the SciBERT pre-trained model, namely scibert-scivocab-uncased, in the following way:
!pip install pytorch-pretrained-bert
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM
import logging
import matplotlib.pyplot as plt
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
segments_ids = [1] * len(tokenized_text)
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])
model = BertModel.from_pretrained('/Users/.../Downloads/scibert_scivocab_uncased-3.tar.gz')
And I get the following error:
EOFError: Compressed file ended before the end-of-stream marker was reached
I downloaded the file from the website (https://github.com/allenai/scibert)
I converted it from "tar" to gzip
Nothing worked.
Any hint on how to approach this?
Thank you!
In the new version of pytorch-pretrained-BERT, i.e. transformers, you can do the following to load a pretrained model after you un-tar the archive:
from transformers import AutoModelForTokenClassification, AutoTokenizer
model = AutoModelForTokenClassification.from_pretrained("/your/local/path/to/scibert_scivocab_uncased")
You need to unpack the archive and rename the JSON file to config.json.
Then just pass the path of the folder where you unpacked the package. It should work.
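
The unpack-and-rename step can be sketched with the standard library. This assumes the archive contains exactly one JSON file (the model config) unless a config.json is already present; the function name unpack_and_rename_config is hypothetical.

```python
import tarfile
from pathlib import Path


def unpack_and_rename_config(archive_path, dest_dir):
    """Extract a tar archive and make sure the model folder has a config.json.

    Returns the folder that contains config.json.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Default mode "r" detects gzip compression automatically.
    with tarfile.open(archive_path) as tar:
        tar.extractall(dest)

    json_files = sorted(dest.rglob("*.json"))
    existing = [p for p in json_files if p.name == "config.json"]
    if existing:
        return existing[0].parent
    if len(json_files) != 1:
        # Assumption violated: do not guess which JSON file is the config.
        raise ValueError(f"Expected exactly one JSON file, found {len(json_files)}")

    target = json_files[0].with_name("config.json")
    json_files[0].rename(target)
    return target.parent


# Example (path elided as in the question):
# model_dir = unpack_and_rename_config('/Users/.../Downloads/scibert_scivocab_uncased-3.tar.gz', 'scibert')
# The returned folder can then be passed to from_pretrained(str(model_dir)).
```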

How to plot remote image (from http url)

This must be easy, but I can't figure out how to do it right now without using the urllib module and manually fetching the remote file.
I want to overlay a plot with a remote image (let's say "http://matplotlib.sourceforge.net/_static/logo2.png"), and neither imshow() nor imread() can load the image.
Any ideas which function will allow loading a remote image?
It is easy indeed:
# Python 2 only: on Python 3, use urllib.request.urlopen instead of urllib2.urlopen
import urllib2
import matplotlib.pyplot as plt
# create a file-like object from the url
f = urllib2.urlopen("http://matplotlib.sourceforge.net/_static/logo2.png")
# read the image file in a numpy array
a = plt.imread(f)
plt.imshow(a)
plt.show()
This works for me in a notebook with Python 3.5:
from skimage import io
import matplotlib.pyplot as plt
url = "http://matplotlib.sourceforge.net/_static/logo2.png"  # URL from the question
image = io.imread(url)
plt.imshow(image)
plt.show()
You can do it with this code:
from matplotlib import pyplot as plt
a = plt.imread("http://matplotlib.sourceforge.net/_static/logo2.png")
plt.imshow(a)
plt.show()
pyplot.imread for URLs is deprecated
Passing a URL is deprecated. Please open the URL for reading and pass
the result to Pillow, e.g. with
np.array(PIL.Image.open(urllib.request.urlopen(url))).
Matplotlib suggests using PIL instead. I prefer using imageio, as suggested by SciPy:
imread is deprecated in SciPy 1.0.0, and will be removed in 1.2.0. Use
imageio.imread instead.
imageio.imread(uri, format=None, **kwargs)
Reads an image from the specified file. Returns a numpy array, which
comes with a dict of meta data at its ‘meta’ attribute.
Note that the image data is returned as-is, and may not always have a
dtype of uint8 (and thus may differ from what e.g. PIL returns).
Example:
import matplotlib.pyplot as plt
from imageio import imread
url = "http://matplotlib.sourceforge.net/_static/logo2.png"
img = imread(url)
plt.imshow(img)
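
For comparison, the Pillow route recommended by the Matplotlib deprecation message can be sketched as follows. It assumes Pillow and NumPy are installed; the helper names decode_image and read_remote_image are hypothetical. The response bytes are wrapped in io.BytesIO so that Pillow gets a seekable file object.

```python
import io
import urllib.request

import numpy as np
from PIL import Image


def decode_image(raw_bytes):
    """Decode encoded image bytes (PNG, JPEG, ...) into a NumPy array."""
    return np.array(Image.open(io.BytesIO(raw_bytes)))


def read_remote_image(url):
    """Download an image over HTTP and return it as a NumPy array."""
    with urllib.request.urlopen(url) as response:
        return decode_image(response.read())


# Example (needs network access):
# import matplotlib.pyplot as plt
# img = read_remote_image("http://matplotlib.sourceforge.net/_static/logo2.png")
# plt.imshow(img)
# plt.show()
```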