What happens to a Wit.ai neural network if I delete an entity?

Suppose I have a bot of medium complexity, and I have this client_location entity.
I want to delete it and create a similar entity that will contain only a portion of its utterances.
Now, from what I understand about neural nets, the meaning of something is stored across the network, so I can't see how it's possible to delete something and expect everything to work as if it had never been there. Everything is interconnected, so maybe it's not that simple to delete stuff.
I'm worried that this old entity I want to delete will collide with the new one I want to create and produce unexpected results.
Is this something to worry about? Can I expect to start fresh when creating a new entity?
More generally: is it safe to refactor a bot, or do I need to create a new bot for each major refactoring?

There is no problem changing something like that. If your change modifies the neural network (changing an intent, for instance), the bot will simply be re-trained.
It is not just training on top of the previous training; it trains from scratch again, so the old version of your bot no longer exists.
It is not obvious to see with wit.ai, but with Dialogflow, for instance, you can see that each modification makes your bot retrain, and the more intents you have, the longer training takes.

Related

TF2 TensorFlow Hub retrained model: expanded my training database twofold, accuracy dropped 30%; after cleaning up the data, it dropped 40%

I need assistance as an ML beginner. I have retrained TF2's model using make_image_classifier through TensorFlow Hub (the command-line approach).
My very first training: (results were attached as a screenshot).
I immediately retrained the model once again, as that one's predictions did not satisfy me, this time using 4 classes instead, and achieved 85% validation accuracy in epoch 4/5 and 81% in epoch 5/5. Because the problem is complex, I decided to expand my database, increasing the amount of data fed to the model.
I have expanded the database more than twofold! The results are shocking and I have no clue what to do. I can't believe this.
If anything, the added data is of much better quality, relevance and diversity, and it was actually collected by myself, acting as a human "discriminator" of the 2nd model's performance (the 81% accuracy one). If I knew how to do it, which is a second question, I would simply add this "feedback data" to my already working model. But I have no idea how to make that happen. Ideally, I'd want to give feedback to the model, letting it train once more on the additional data while understanding that this is feedback rather than a random additional set of data. However, I'd like to try training a new model from scratch on the updated data anyway.
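For the "train my existing model further on the feedback data" part, here is a minimal sketch of what that usually looks like with a saved Keras model. It is not the asker's exact setup: model_dir and new_images are hypothetical paths, and the image size, learning rate, and loss are assumptions about the make_image_classifier export.

```python
# Sketch: continue training a saved model on new "feedback" data
# instead of retraining from scratch. Paths and sizes are assumptions.
import tensorflow as tf

model = tf.keras.models.load_model("model_dir")  # hypothetical export dir

new_data = tf.keras.preprocessing.image_dataset_from_directory(
    "new_images",            # hypothetical: one subfolder per class
    image_size=(224, 224),   # assumed input size of the hub module
    batch_size=32)
# Hub image classifiers usually expect inputs scaled to [0, 1].
new_data = new_data.map(lambda x, y: (x / 255.0, y))

# A lower learning rate than the original run, so the extra epochs
# refine the existing weights rather than overwrite them.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(new_data, epochs=3)
model.save("model_dir_finetuned")
```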
How do I interpret those numbers? What can be wrong with the data? Is the data what's wrong at all, which is my assumption? Why is the training accuracy below 0.5? What could have caused the drop between epoch 2/5 and epoch 3/5 that started the downfall? What are common issues and similar situations when something like this happens?
I'd love to understand what happened behind the curtain at a lower level here, but I'm having difficulty and need guidance. This is as discouraging as the first and second trainings were encouraging.
Possible problems with the data that I can see, though I can't believe they could cause such a drop, especially since they were present during the 1st and 2nd trainings, which went okay:
1-5% of the images could belong to multiple classes (2, or at most 3). In my opinion this is even encouraged for the problem, as the model imitates me, and I want it to struggle with finding out what it is about certain elements that made me classify them as two or more classes simultaneously, discovering the features that made me think they'd satisfy both.
1-3% can be duplicates (a quick way to flag exact duplicates is sketched after this list).
There's a white border around the images (since it didn't seem to affect the first trainings, I didn't bother to remove it).
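On the duplicates point, a quick sketch: hashing file contents flags byte-identical duplicates (dataset_dir is a hypothetical path; near-duplicates would need perceptual hashing instead).

```python
# Sketch: find byte-identical duplicate images by hashing file contents.
import hashlib
from pathlib import Path

seen = {}
for path in Path("dataset_dir").rglob("*.jpg"):  # hypothetical layout
    digest = hashlib.md5(path.read_bytes()).hexdigest()
    if digest in seen:
        print(f"duplicate: {path} == {seen[digest]}")
    else:
        seen[digest] = path
```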
The drop is 40%.
I will now try to use TensorFlow's tutorial to do this through a script rather than the command line, but seeing such a huge drop is very discouraging. I was hoping to see an increase; after all, feeding a model more data is a common suggestion, and I made sure to feed it quality additional data. I appreciate every single suggestion for fine-tuning the model, as I am a beginner and have no experience with what might work and what most probably won't. Thank you.
EDIT: I have cleaned my data by:
removing borders (now they are tiny, white, and sometimes not present at all)
removing dust
removing artifacts, of which there were quite a few in the added data!
Convinced the last point was the issue, I retrained. The results are even worse!
(Training-log screenshots for the "2 classes binary" and "4 classes" runs appeared here.)

Neural Network: Convert HTML Table into JSON data

I'm kinda new to neural networks and just started learning to code them by trying some examples.
Two weeks ago I was searching for an interesting challenge and I found one. But I'm about to give up, because it seems to be too hard for me... Still, I was curious whether any of you can solve this?
The problem: assume there are ".htm" files that contain tables about the same topic, but the table structure isn't the same in every file. For example: we have a lot of ".htm" files containing information about teacher substitutions per day, per school. Because the structure of those ".htm" files isn't the same in every file, it would be hard to program a parser that could extract the data from those tables. So my thought was that this is a task for a neural network.
First Question: Is this a task a neural network can/should handle, or am I mistaken about that?
Because a neural network seemed to me to fit this kind of challenge, I tried to think of an input. I came up with two options:
First input option: take the HTML code (only from the body tag) as a string and convert it to a tensor.
Second input option: convert the HTML tables into images (via canvas, maybe) and feed this input to the DNN through Conv2D layers.
Second Question: Are those options any good? Do you have a better solution?
After that I wanted to figure out how I would make a DNN output this heavily dynamic data for me. My thought was to convert my desired JSON output into tensors and feed them to the DNN while training, and for every prediction I would expect the DNN to return a tensor convertible into JSON output...
Third Question: Is it even possible to get such detailed output from a DNN? And if yes, do you think the output would be suitable for this task?
Last Question: Assuming all my assumptions are correct, wouldn't training this DNN take forever? Let's say you have an RTX 2080 Ti for it. What would you guess?
I guess that's it. I hope I can learn a lot from you guys!
(I'm sorry about my bad English - it's not my native language)
Addition:
Here is a more in-depth example. Let's say we have a ".htm" file that looks like this:
The task would be to get all the relevant information from this table. For example:
All Students from Class "9c" don't have lessons in their 6th hour due to cancellation.
1) This is not a particularly suitable problem for a neural network, as your domain is structured data with clear dependencies inside it. Tree-based ML algorithms tend to show much better results on such problems.
2) Both of your input choices are very unstructured; learning from such data would be nearly impossible. There are clear ways to give more knowledge to the model. For example, you have the same data in different formats, and the difference is only the structure. That means a model needs to learn a mapping from one structure to another; it doesn't need to know anything about the data itself. Hence, words can be tokenized with unique identifiers to remove unnecessary information. The .htm data can be parsed into a tree, and so can the JSON. Then there are different ways to represent graph structures that can be used in an ML model.
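As a quick illustration of the "parse rather than learn" point, a minimal sketch (assuming BeautifulSoup and a table with a header row; the table content is made up) that turns such a table into JSON with no learning at all:

```python
# Sketch: parse an HTML table into JSON records.
import json
from bs4 import BeautifulSoup

html = """
<table>
  <tr><th>Class</th><th>Hour</th><th>Note</th></tr>
  <tr><td>9c</td><td>6</td><td>cancelled</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find_all("tr")
headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
records = [
    dict(zip(headers, (td.get_text(strip=True) for td in row.find_all("td"))))
    for row in rows[1:]
]
print(json.dumps(records, indent=2))  # [{"Class": "9c", "Hour": "6", ...}]
```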
3) It seems that the only adequate option for the output is a sequence of identifiers pointing to unique entities in the text. The whole problem is then similar to seq2seq, best solved by RNNs with an encoder-decoder architecture.
I believe that if there is enough data and the .htm files don't contain a huge amount of noise, the task can be completed. Training time depends hugely on the selected model and its complexity, as well as on the diversity of the initial data.

Training images: considerations for selection

I'm relatively new and am still learning the basics. I've used NVIDIA DIGITS in the past, and am now looking at TensorFlow. While I've been able to fumble my way through creating some models for a few projects I'm working on, I really want to start diving deeper into what I'm doing, how I'm doing it, and ultimately get a better understanding of why.
One area I would like to start with is the images I'm using for training and testing. Can anyone point me to a blog, an article, or a paper, or give me some insight into what I need to consider when selecting images to train a new model on? Until recently, I've been using datasets that were already curated and available for download. Let's say I'm going to start working on a project that involves object detection of ships from a variety of distances and angles.
So my thoughts would be:
1) I need a large quantity of images.
2) The images need to contain ships of the different types I would like to detect (let's just say one class, ships; I don't care what type of ship).
3) I also need images with a great variety of distance and perspective for the different types of ships.
Ultimately, my thoughts are that the images need to reflect the distance, perspective, and types of ships I would ideally want to identify from the video. Seems simple enough.
However, there are a number of questions:
Do the images need to be the same or a similar resolution as the camera I'll be using, for best results?
Do the images all need to be the same resolution?
Can I use a single image and just digitally zoom out on it to give the illusion of different distances?
I'm sure there are a number of other questions that I'm not asking but should be. Are there any guidelines available for building a solid collection of images for training and validation?
I recommend thinking it through end to end, e.g. would you need to classify ship models as a next step? I also recommend going through well-known public datasets and actually working with their structure: how to store data and labels, how to handle preprocessing, etc.
More importantly, what are you trying to achieve? Talking to experts on the topic helps greatly while preparing your own dataset.
Use open-source images if you can, e.g. Flickr, Google, ImageNet.
No, you don't need them to be the same resolution.
It is not ideal to zoom in/out of images to use them as different categories. Image preprocessing and data augmentation already do this to create more distant representations of the same class, as the sketch below shows. This is why I would recommend a hands-on approach with an existing dataset first.
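For illustration, a minimal sketch of that kind of augmentation using Keras preprocessing layers (my assumption about the pipeline, not a specific recommendation; these layers live under tf.keras.layers in recent TF 2.x):

```python
# Sketch: synthesize scale/viewpoint variety via on-the-fly augmentation
# instead of hand-zooming single images into separate categories.
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomZoom(height_factor=(-0.3, 0.3)),  # zoom in and out
    tf.keras.layers.RandomTranslation(0.1, 0.1),
])

# Applied during training, e.g. on a tf.data pipeline:
# train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
```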
Yes, what you need is many different representations of the classes, and a roughly balanced dataset across classes. If you define your data structure well at the beginning, it will save you a ton of time, as you won't have to make changes often.

Which TensorFlow API to use for Seq2Seq

This year Google produced 5 different packages for seq2seq:
- seq2seq (claimed to be general purpose, but inactive)
- nmt (active, but probably only about NMT)
- legacy_seq2seq (clearly legacy)
- contrib/seq2seq (probably not complete)
- tensor2tensor (similar purpose, also under active development)
Which package is actually worth using for an implementation? They all seem to be different approaches, and none of them seems stable enough.
I too have had a headache over this issue: which framework to choose? I want to implement OCR using an encoder-decoder with attention. I tried to implement it using legacy_seq2seq (it was the main library at the time), but it was hard to understand the whole process, and it certainly should not be used any more.
https://github.com/google/seq2seq: to me this looks like an attempt to provide a command-line training script without writing your own code. If you want to train a translation model, this should work, but in other cases it may not (like my OCR), because there is not enough documentation and too few users.
https://github.com/tensorflow/tensor2tensor: this is very similar to the above, but it is maintained and you can add more of your own code, e.g. for reading your own dataset. The basic use case is again translation, but it also enables tasks like image captioning, which is nice. So if you want a ready-to-use library and your problem is txt->txt or image->txt, you could try this. It should also work for OCR. I'm just not sure whether there is enough documentation for each case (like using a CNN as the feature extractor).
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/seq2seq: unlike the above, this is a pure library, which is useful when you want to build a seq2seq model yourself in TF. It has functions for adding attention, computing sequence loss, etc. In my case I chose this option, as it gives much more freedom in choosing each part of the pipeline: the CNN architecture, RNN cell type, bi- or uni-directional RNN, type of decoder, and so on. But you will then need to spend some time getting familiar with the ideas behind it (see the sketch after these recommendations).
https://github.com/tensorflow/nmt: another translation framework, based on the tf.contrib.seq2seq library.
From my perspective you have two options:
If you want to test an idea very fast and be sure you are using very efficient code, use the tensor2tensor library. It should help you get early results or even a very good final model.
If you want to do research, are not sure exactly how the pipeline should look, or want to learn the ideas behind seq2seq, use the library in tf.contrib.seq2seq.
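For the tf.contrib.seq2seq route, here is a minimal TF 1.x sketch of the building blocks the answer mentions: an attention mechanism, an attention-wrapped cell, a training helper, and a basic decoder. The shapes, sizes, and placeholders are made up for illustration; a real model would feed encoder outputs and embedded decoder inputs here.

```python
# Sketch (TF 1.x, where tf.contrib still exists): wiring up the
# tf.contrib.seq2seq building blocks for an attention decoder.
import tensorflow as tf

num_units, vocab_size = 256, 1000  # hypothetical sizes
encoder_outputs = tf.placeholder(tf.float32, [None, None, num_units])
source_lengths = tf.placeholder(tf.int32, [None])
decoder_inputs = tf.placeholder(tf.float32, [None, None, num_units])
target_lengths = tf.placeholder(tf.int32, [None])

attention = tf.contrib.seq2seq.BahdanauAttention(
    num_units, memory=encoder_outputs,
    memory_sequence_length=source_lengths)
cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.nn.rnn_cell.LSTMCell(num_units), attention)
helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, target_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper,
    initial_state=cell.zero_state(tf.shape(source_lengths)[0], tf.float32),
    output_layer=tf.layers.Dense(vocab_size))
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder)
logits = outputs.rnn_output  # [batch, time, vocab_size]
```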

LSTM to improve tokenization

Recently I started toying with TensorFlow, DNNs, etc., and now I'm trying to implement something more serious: information retrieval from short sentences (doctor instructions).
Unfortunately the dataset I have is, as always, quite "dirty". Since I'm trying to use word embeddings, I actually need "clean" data. Take one example:
"Take two pilleach day". There is a missing white space between "pill" and "each". I am implementing a "tokenizer improver" that looks at each sentence and proposes a new tokenization based on the joint probability of each word in the sentence, given the frequency of terms in the whole document (tf). As I was working on it today, a thought came to mind: why bother writing a suboptimal solution to this problem when I can employ powerful learning algorithms such as LSTM networks to do it for me? However, as of today, I have only a feeling that this is actually possible, and as we know, feelings are not the best guide when architecting solutions to such complex problems. I don't know where to begin: what should my training set and learning goal be?
I know this is a broad question, but I know there are many brilliant people here with more knowledge of TensorFlow and neural nets, so I'm sure somebody has either already solved a similar problem or knows how to approach it.
Any guidance is welcome; I don't expect you to solve this for me, of course :)
Kisses and all the best to the whole TensorFlow community :)
I had the same issue. I solved it by using a character-level net. Basically I re-implemented Character-Aware Neural Language Models, kicked out all the "word"-level elements, and stayed purely at the character level.
Training data: I took the data I had, as dirty as it was, used the dirty data as targets, and made it even dirtier to create the inputs.
So your "Take two pilleach day" will be learned, because in many cases you do have a clean, similar phrase, e.g. "Take one pill each morning", which under the regime above serves as the target while you train the net on destroyed inputs like "Take oe pileach mornin".
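A minimal sketch of that corruption step (the drop probabilities are arbitrary): take clean-ish targets and generate noisy inputs by dropping characters and whitespace, producing (noisy input, clean target) training pairs.

```python
# Sketch: corrupt clean targets to synthesize noisy training inputs.
import random

def corrupt(text, drop_prob=0.08, space_drop_prob=0.3):
    out = []
    for ch in text:
        if ch == " " and random.random() < space_drop_prob:
            continue  # missing whitespace, as in "pilleach"
        if ch != " " and random.random() < drop_prob:
            continue  # randomly dropped character
        out.append(ch)
    return "".join(out)

pairs = [(corrupt(t), t) for t in ["Take one pill each morning"]]
print(pairs)  # e.g. [('Take oe pilleach mornin', 'Take one pill each morning')]
```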