Displaying tf.summary.text with underscores correctly in Tensorboard - tensorflow

I want to log a few strings containing underscores to TensorBoard. However, the underscores are treated as emphasis somewhere in the pipeline. Here's some example code to illustrate the problem; I've included a few versions that attempt to escape the underscores:
import tensorflow as tf
sess = tf.InteractiveSession()
text0 = """/a/b/c_d/f_g_h_2017"""
text1 = """/a/b/c\_d/f\_g\_h\_2017"""
text2 = """/a/b/c\\_d/f\\_g\\_h\\_2017"""
summary_op0 = tf.summary.text('text', tf.convert_to_tensor(text0))
summary_op1 = tf.summary.text('text', tf.convert_to_tensor(text1))
summary_op2 = tf.summary.text('text', tf.convert_to_tensor(text2))
summary_op = tf.summary.merge([summary_op0, summary_op1, summary_op2])
summary_writer = tf.summary.FileWriter('/tmp/tensorboard', sess.graph)
summary = sess.run(summary_op)
summary_writer.add_summary(summary, 0)
summary_writer.flush()
summary_writer.close()
Here's the output:
How can I get TensorBoard to render strings containing underscores correctly?
Package versions: TensorFlow 1.3.0, TensorBoard 0.1.8

This is working as intended. The docs for tf.summary.text and also for tensorboard.summary.text state that the text will be rendered using Markdown formatting—just like the text in this question and answer—and in Markdown, underscores create italics.
If you don't want this to be the case, you can consider formatting these strings as code, by using either
text0 = """`/a/b/c_d/f_g_h_2017`"""      # backticks: inline code formatting
text1 = """    /a/b/c_d/f_g_h_2017"""    # four-space indent: code block
This yields the following result:
(Disclaimer: I work on TensorBoard.)

According to this GitHub issue, this is a bug with the current TensorBoard and Python 3. For now, using backticks as suggested in the other answer is sufficient to render the underscores correctly.
https://github.com/tensorflow/tensorboard/issues/647#issuecomment-337380296

Related

Streamlit with TensorFlow to analyze an image and return the probability of it being positive or negative

I'm trying to use TensorFlow machine learning to analyze an image and return the probability that it is positive or negative, based on a model I created (extension .h5). I couldn't find documentation exactly for that, or a repository, so even a link to read would be awesome.
Link for the application: https://share.streamlit.io/felipelx/hackathon/IDC_Detector.py
Libraries that I'm trying to use:
import numpy as np
import streamlit as st
import tensorflow as tf
from keras.models import load_model
The function to load the model:
@st.cache(allow_output_mutation=True)
def loadIDCModel():
    model_idc = load_model('models/IDC_model.h5', compile=False)
    model_idc.summary()
    return model_idc
The function that processes the image, and the part I'm trying to get working: model.predict runs, but the percentage is not updating; independent of the image, the value is always the same.
if uploaded_file is not None:
    # transform image to numpy array
    file_bytes = tf.keras.preprocessing.image.load_img(uploaded_file, target_size=(96,96), grayscale=False, interpolation='nearest', color_mode='rgb', keep_aspect_ratio=False)
    c.image(file_bytes, channels="RGB")
    Genrate_pred = st.button("Generate Prediction")
    if Genrate_pred:
        model = loadMetModel()
        input_arr = tf.keras.preprocessing.image.img_to_array(file_bytes)
        input_arr = np.array([input_arr])
        probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
        prediction = probability_model.predict(input_arr)
        dict_pred = {0: 'Benigno/Normal', 1: 'Maligno'}
        result = dict_pred[np.argmax(prediction)]
        value = 0
        if result == 'Benigno/Normal':
            value = str(((prediction[0][0]) * 100).round(2)) + '%'
        else:
            value = str(((prediction[0][1]) * 100).round(2)) + '%'
        c.metric('Predição', result, delta=value, delta_color='normal')
Thank you in advance for any help.
The first thing I'm noticing is that your function for loading the model is named loadIDCModel, but then the function you call for loading the model is loadMetModel. When I check your source code, though, it looks like you've already addressed this issue. I'd recommend updating your question to reflect this.
Playing around with your application, I think the issue is your model itself. I tried various images — images containing carcinomas, and even a picture of a cat — and each gave me a probability around 73%. The lowest score I got was 72.74%, and the highest was 73.11% (this one was the cat). It seems that the output percentage is varying slightly, hinting that rather than something being wrong in the code, your model itself is likely at fault. You might need to retrain your model, as it seems to have learned to always return a value of approximately 0.73.
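If you want to sanity-check this locally, here is a minimal sketch (assuming the same models/IDC_model.h5 file and 96x96 RGB input shape used above) that feeds a few random images to the model; if the raw outputs barely change, the model itself has collapsed rather than the Streamlit code being wrong:
import numpy as np
from keras.models import load_model
# Load the saved model and probe it with random 96x96 RGB "images".
model = load_model('models/IDC_model.h5', compile=False)
for _ in range(5):
    noise = np.random.rand(1, 96, 96, 3).astype('float32')
    print(model.predict(noise))  # near-identical outputs suggest the model needs retraining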

Spacy - erroneous config file

While training NER with custom labels, I created a .json file in exactly the same way as stated in the example, but with my own data.
Then I tried to convert it (both train/dev) to the binary format needed for training using the command:
python -m spacy convert train.json ./ -t spacy
which did result in creating 2 files.
The error I got while launching the training process:
[E923] It looks like there is no proper sample data to initialize the Model of component 'ner'. To check your input data paths and annotation, run: python -m spacy debug data config.cfg
The debug command output is the same.
The problem is that there are overlapping entities. For each word there should be only one tag.
A solution to the problem can be the following (code from spacy_convert_script):
import warnings
import srsly
import spacy
from spacy.tokens import DocBin

for f in ["train.json", "dev.json"]:
    nlp = spacy.blank("en")
    db = DocBin()
    for text, annot in srsly.read_json(f):
        doc = nlp.make_doc(text)
        ents = []
        try:
            for start, end, label in annot["entities"]:
                span = doc.char_span(start, end, label=label)
                if span is None:
                    msg = f"Skipping entity [{start}, {end}, {label}] in the following text because the character span '{doc.text[start:end]}' does not align with token boundaries:\n\n{repr(text)}\n"
                    warnings.warn(msg)
                else:
                    ents.append(span)
            doc.ents = ents
            db.add(doc)
        except:
            print(doc.text, ents)  # see which texts cause the problem
            continue
    db.to_disk(f.split('.')[0] + '.spacy')
That would just skip the texts which cause problems. To instead choose one of the overlapping entities, modify the try block like this:
try:
    x = 0
    for start, end, label in annot["entities"]:
        span = doc.char_span(start, end, label=label)
        if span is None:
            msg = f"Skipping entity [{start}, {end}, {label}] in the following text because the character span '{doc.text[start:end]}' does not align with token boundaries:\n\n{repr(text)}\n"
            warnings.warn(msg)
        else:
            # keep an entity only if it starts at or after the end of the previously kept one
            if start >= x and end > x:
                x = end
                ents.append(span)
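Alternatively, spaCy ships a helper for exactly this situation; a minimal sketch using spacy.util.filter_spans, which keeps the longest non-overlapping spans and avoids the manual bookkeeping above:
from spacy.util import filter_spans
# filter_spans drops overlapping spans, preferring longer (then earlier) ones
doc.ents = filter_spans(ents)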

SpaCy use Lemmatizer as stand-alone component

I want to use SpaCy's lemmatizer as a standalone component (because I have pre-tokenized text, and I don't want to re-concatenate it and run the full pipeline because SpaCy will most likely tokenize differently in some cases).
I found the lemmatizer in the package, but I somehow need to load the dictionaries with the rules to initialize this Lemmatizer.
These files must be somewhere in the English or German model, right? I couldn't find them there.
Where do the LEMMA_INDEX, LEMMA_EXC, and LEMMA_RULES files come from?
from spacy.lemmatizer import Lemmatizer
lemmatizer = Lemmatizer(LEMMA_INDEX, LEMMA_EXC, LEMMA_RULES)
I found a similar question here: Spacy lemmatizer issue/consistency
but this one did not entirely answer how to get these dictionary files from the model. The spacy.lang.* parameter seems to no longer exist in newer versions.
Here's an extracted bit of code I had, that used the SpaCy lemmatizer by itself. I'm not somewhere I can run it so it might have a small bug or two if I made an editing mistake.
Note that in general, you need to know the upos for the word in order to lemmatize correctly. This code will return all the possible lemmas but I would advise modifying it to pass in the correct upos for your word.
class SpacyLemmatizer(object):
    def __init__(self, smodel):
        import spacy
        self.lemmatizer = spacy.load(smodel).vocab.morphology.lemmatizer

    # get the lemmas for every upos
    def getLemmas(self, entry):
        possible_lemmas = set()
        for upos in ('NOUN', 'VERB', 'ADJ', 'ADV'):
            lemmas = self.lemmatizer(entry, upos, morphology=None)
            lemma = lemmas[0]  # See morphology.pyx::lemmatize
            possible_lemmas.add(lemma)
        return possible_lemmas
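As a quick usage sketch (assuming a spaCy 2.x install where vocab.morphology.lemmatizer still exists, and that the en_core_web_sm model is downloaded):
lemmatizer = SpacyLemmatizer("en_core_web_sm")
print(lemmatizer.getLemmas("running"))  # e.g. a set containing 'run'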

Passing commandline argument in google colab

How do I pass command-line arguments when running Python code in Google Colab?
I have written code which takes a file as input via sys.argv. How do I do this?
As far as I know, there is no special way needed to pass command-line arguments to Python code in Colab. This is a working code sample I use when creating TFRecords:
!python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record --image_dir=images/
I don't see any difference between regular command-line Python argument passing and Colab. Please add more code to your question to get better help.
I tried this in a google colab notebook
import sys
sys.argv[0] = "first_arg" # this is to assign the first command line argument
sys.argv[1] = "second_arg" # This line to assign the second arg for example
And it worked for me.
So if you want to run Python code that works like this:
!python test.py --image_folder '/content/image' --workers 2 --Prediction CTC --rgb True
You have to open test.py (or your file) in an editor, and inside you will find lines similar to this:
parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', required=True, help='path to image_folder')
parser.add_argument('--workers', type=int, default=1, help='number of workers')
parser.add_argument('--Prediction', type=str, default='CTC', help='Prediction stage.')
parser.add_argument('--rgb', action='store_true', help='use rgb input')
args = parser.parse_args()
But this will give you the error SystemExit: 2.
Then you have to change it like this:
parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', required=False, default='/content/image', help='path to image_folder')
parser.add_argument('--workers', type=int, default=2, help='number of workers')
parser.add_argument('--Prediction', type=str, default='CTC', help='Prediction stage.')
parser.add_argument('--rgb', action='store_false', help='use rgb input')
parser.add_argument("-f", "--file", required=False)
args = parser.parse_args()
You must add, at the end of the parser.add_argument lines:
parser.add_argument("-f", "--file", required=False)
Then you can access the command-line arguments like this:
image = args.image_folder
Or
img = Image.open(args.image_folder)
workers = args.workers
But if your last line is like this:
args = vars(parser.parse_args())
Then you have to call it like this:
image = args["image_folder"]
Or
img = Image.open(args["image_folder"])
workers = args["workers"]
# Note: action='store_true' makes the flag default to False.
Likewise, action='store_false' makes the flag default to True.
Tested with Google Colab.
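As a side note on the SystemExit: 2 error above: it happens because the Jupyter/Colab kernel passes its own -f <kernel-file> argument that the parser does not recognize. A minimal sketch using parse_known_args(), which simply ignores unrecognized arguments instead of exiting (argument names here are just for illustration):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', default='/content/image', help='path to image_folder')
parser.add_argument('--workers', type=int, default=2, help='number of workers')
# parse_known_args() returns (known_args, unrecognized_args) instead of
# raising SystemExit on the kernel's extra "-f /path/to/kernel.json" argument.
args, _unknown = parser.parse_known_args()
print(args.image_folder, args.workers)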
I made a bioinformatics tool locally on my machine to parse large UniProt protein data files.
The tool I made needs different parameters passed as command-line arguments. After the tool was working locally, I uploaded the data files and Python source files to my Google Drive.
I did not make any changes to my files. I just ran the following command directly in Google Colab:
!python3 drive/MyDrive/uniprot/uniprot_select.py FIELDS "ID,OS,SQ" FROM drive/MyDrive/data/uniprot.dat WHERE "SQ#EYDRRR" FASTA
It works perfectly!
No need for special parsing, no need for additional imports. All the work you normally do locally on your machine can be executed without changes.

How to "append" Op at the beginning of a TensorFlow graph?

I have a GraphDef proto file which I am importing using tf.import_graph_def. Ops can be added at the end of the graph like this:
final_tensor = tf.import_graph_def(graph_def, name='', return_elements=['final_tensor'])
new_tensor = some_op(final_tensor)
But I want to add Ops at the beginning of the graph, so essentially the first Op in the graph_def needs to take the output of my Op as input. How do I do that?
Finally found a way to do this. I am sure the function Yarolsav mentioned in the comments does something similar internally.
new_input = graph_def.node.add()
new_input.op = 'new_op_name' # eg: 'Const', 'Placeholder', 'Add' etc
new_input.name = 'some_new_name'
# set any attributes you want for new_input here
old_input.input[0] = 'some_new_name' # must match with the name above
For details about how to set the attributes, see this file.
The script @Priyatham gives in the link is a good example of how to add a node to a TF graph_def. name, op, input, and attr are the 4 required elements. name and op can be assigned directly, whereas input should use the extend method and attr should use the CopyFrom method for assignment, like:
from tensorflow.core.framework import attr_value_pb2, types_pb2

new_node = graph_def.node.add()
new_node.op = "Cast"
new_node.name = "To_Float"
new_node.input.extend(["To_Float"])  # list the name(s) of the upstream node(s) feeding this op
new_node.attr["DstT"].CopyFrom(attr_value_pb2.AttrValue(type=types_pb2.DT_FLOAT))
new_node.attr["SrcT"].CopyFrom(attr_value_pb2.AttrValue(type=types_pb2.DT_FLOAT))
new_node.attr["Truncate"].CopyFrom(attr_value_pb2.AttrValue(b=True))
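Putting the pieces together, here is a minimal end-to-end sketch of prepending a Placeholder that feeds the graph's original first op and then re-importing the modified GraphDef (TF 1.x APIs; the model.pb path and the node name 'old_first_op' are hypothetical):
import tensorflow as tf
from tensorflow.core.framework import attr_value_pb2, types_pb2
# Load an existing GraphDef (hypothetical path).
graph_def = tf.GraphDef()
with tf.gfile.GFile('model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
# Add a new Placeholder node that will become the graph's input.
new_input = graph_def.node.add()
new_input.op = 'Placeholder'
new_input.name = 'new_input'
new_input.attr['dtype'].CopyFrom(attr_value_pb2.AttrValue(type=types_pb2.DT_FLOAT))
# Rewire the op that used to come first (hypothetical name 'old_first_op')
# so it now consumes the new Placeholder's output.
for node in graph_def.node:
    if node.name == 'old_first_op':
        node.input[0] = 'new_input'
# Re-import the modified GraphDef into the current default graph.
tf.import_graph_def(graph_def, name='')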