When training a GAN to generate images of faces (with the celebrity faces dataset from keggle), I keep on getting ok-ish looking faces, but this very weird noise that looks like fire (and makes the generated faces look unnervingly demonic). An example of the result can be found here. The same phenomenon has appeared all three times I've tried to train it, so I was wondering if anyone knew of a reason why such a specific- and weird - type of noise would keep appearing? Asking mostly out of curiosity about what's going on- since I'm a total beginner if the way of mitigating the effect is really complicated don't worry about trying to explain how to fix it.
Thanks!
Related
I need assistance as a ML beginner. I have retrained TF2's model using make_image_classifier through tensorflow hub (command line approach).
My very first training:
I have immediately retrained the model once again as this one's predictions did not satisfy me -> using 4 classes instead and achieved 85% val.accuracy in Epoch 4/5 and 81% val.accuracy in Epoch 5/5. Because the problem is complex, I have decided to expand upon my db, increasing the number of data fed to the model.
I have expanded the database more than twofold! The results are shocking and I have no clue what to do. I can't believe this.
If anything - the added data is of much better quality, relevance and diversity, and it is actually obtained by myself - a human "discriminator" to the 2nd model's preformance (the 81% acc. one). If I knew how to do it, which is a second question - I would simply add this "feedback data" to my already working model. But I have no idea how to make it happen - ideally, I'd want to give feedback to the bot, allowing it to train once again on the additional data, it understanding that this is a feedback rather than a random additional set of data. However training a new model with updated data from scratch I'd like to try out anyways.
How do I interpret those numbers? What can be wrong with the data? Is it the data what's wrong at all, which is my assumption? Why is the training accuracy below 0.5?? What could have caused the drop between Epoch 2/5' and Epoch 3/5 starting the downfall? What are common issues and similar situations when something like this happens?
I'd love to understand what happened behind the curtains on a more lower level here but I have difficulty, I need guidance. This is as discouraging as the first & second training were encouraging.
Possible problems with the data I can see - but I can't believe they could cause such a drop especially because they were there during 1st&2nd training that went okay:
1-5% images can be in multiple classes (2 classes or maximum 3) - in my opinion this is even encouraged for the problem, as the model imitates me, and I want it to struggle with finding out what it is about certain elements that caused me to classify them as two+ simultaneous classes. Finding out features in them that made me think they'd satisfy both.
1-3% can be duplicates.
There's white border around the images (since it didn't seem to affect the first trainings I didn't bother to remove it)
The drop is 40%.
I will now try to use Tensorflow's tutorial to do it through a script rather than using command line, but seeing such a huge drop is very discouraging, I was hoping to see an increase, after all that's a common suggestion to feed a model more data and I have made sure to feed it quality additional data.. I appreciate every single suggestion for fine-tuning the model, as I am a beginner and I have no experience with what might work and what most probably won't. Thank you.
EDIT: I have cleaned my data by:
removing borders(now they are tiny, white, sometimes not present at all)
removing dust
removing artifacts which there were quite many in the added data!
Convinced the last point was the issue, I retrained. Results are even worse!
2 classes binary
4 classes
I want to be able to label all of the muscles on an athletes body. I got a lot of the images that the athletes are almost in the same body pose but the issue that I am running into is that drawing a box around them makes them inaccurate as it ends up overlapping other muscles. Drawing exact lines around them is a bit difficult as they are a lot of smaller muscles and creates inconsistently over 20-30images. I was wondering if there is a way to feed in a human anatomy and then have tensorflow go in and label all of the muscles in given pictures.
Or I was wondering if you all had a different idea on approach this problem that I'm running into.
I don't have anybody else to ask and I've been researching this for awhile so if I missed or overlooked something please forgive me
The way i see is you need to combine with some prepossessing steps to normalize your target object in the image such as:
identify the human,
identify the pose or skeleton (which nowadays many open-source such as openpose-plus),
the pose estimation results can label the limbs, or part of the body from which you can do something either by hand-crafted image processing or other segmentation model.
Recently I stared toying with tensor flow, dnns etc. now I'm trying to implement something more serious, information retrieval from short sentences (doctor instructions).
Unfortunately the dataset I have is, as always, quite "dirty". As I'm trying to use word embeddings, I actually need "clean" data. Take one example:
"Take two pilleach day". There is a missing white space between pill and each. I am implementing "tokenizer improver" to look at each sentence and propose new tokenization based on joint probability of each word in sentence given the frequency of terms in whole document (tf) . As I was doing it today, a thought came to my mind: why bother writing suboptimal solution for this problem when I can employ powerful learning algorithms such as Lstm networks to do that for me. However, as of today, I have only a feeling that it's actually possible to do that. As we know, feelings are not best when it comes to architecting such complex problems. I don't know where to begin: what should be my training set and learning goal.
I know this is a broad question, but I know there are many brilliant people with more knowledge about tensorflow and neural nets, so I'm sure that somebody has either already solved similar problem or just knows how to approach this problem.
Any guidance is welcome, I do not except you to solve this for me of course:)
Besos and all the best to all the tensorflow community:)
Having the same issue. I solved it by using a character level net. Basically I rewrote Character-Aware Neural Language Models, kicked out the whole "words"-elements and just stayed with the caracter level.
Training Data: I took the data I had, as dirty as it was, used the dirty data as targets and made it even more dirty to create inputs.
So your "Take two pilleach day" will be learned as in many cases you do have a clean and similar phrase, e.g. "Take one pill each morning" that with the regime mentioned will serve as target and you train the net on destroyed inputs like "Take oe pileach mornin"
having read this article about a guy who uses tensorflow to sort cucumber into nine different classes I was wondering if this type of process could be applied to a large number of classes. My idea would be to use it to identify Lego parts.
At the moment, a site like Bricklink describes more than 40,000 different parts so it would be a bit different than the cucumber example but I am wondering if it sounds suitable. There is no easy way to get hundreds of pictures for each part but does the following process sound feasible :
take pictures of a part ;
try to identify the part using tensorflow ;
if it does not identify the correct part, take more pictures and feed the neural network with them ;
go on with the next part.
That way, each time we encounter a new piece we "teach" the network its reference so that it can better be recognized the next time. Like that and after hundreds of iterations monitored by a human, could we imagine tensorflow to be able to recognize the parts? At least the most common ones?
My question might sound stupid but I am not into neural networks so any advice is welcome. At the moment I have not found any way to identify a lego part based on pictures and this "cucumber example" sounds promising so I am looking for some feedback.
Thanks.
You can read about the work of Jacques Mattheij, he actually uses a customized version of Xception1 running on https://keras.io/.
The introduction is Sorting 2 Metric Tons of Lego.
In Sorting 2 Tons of Lego, The software Side you can read:
The hard challenge to deal with next was to get a training set large
enough to make working with 1000+ classes possible. At first this
seemed like an insurmountable problem. I could not figure out how to
make enough images and to label them by hand in acceptable time, even
the most optimistic calculations had me working for 6 months or longer
full-time in order to make a data set that would allow the machine to
work with many classes of parts rather than just a couple.
In the end the solution was staring me in the face for at least a week
before I finally clued in: it doesn’t matter. All that matters is that
the machine labels its own images most of the time and then all I need
to do is correct its mistakes. As it gets better there will be fewer
mistakes. This very rapidly expanded the number of training images.
The first day I managed to hand-label about 500 parts. The next day
the machine added 2000 more, with about half of those labeled wrong.
The resulting 2500 parts where the basis for the next round of
training 3 days later, which resulted in 4000 more parts, 90% of which
were labeled right! So I only had to correct some 400 parts, rinse,
repeat… So, by the end of two weeks there was a dataset of 20K images,
all labeled correctly.
This is far from enough, some classes are severely under-represented
so I need to increase the number of images for those, perhaps I’ll
just run a single batch consisting of nothing but those parts through
the machine. No need for corrections, they’ll all be labeled
identically.
A recent update is Sorting 2 Tons of Lego, Many Questions, Results.
1CHOLLET, François. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv preprint arXiv:1610.02357, 2016.
I have started this using IBM Watson's Visual Recognition.
I had six different bricks to be recognized on the transport belt background.
I am actually thinking about tensorflow, since I can have it running locally.
The codelab : TensorFlow for Poets, describes almost exactly what you want to achieve,
For a demo of the Watson version:
https://www.ibm.com/developerworks/community/blogs/ibmandgoogle/entry/Lego_bricks_recognition_with_Watosn_lego_and_raspberry_pi?lang=en
I've written up a variation on Melinda Green's Buddhabrot method for visualizing the Mandelbrot set. Here it is:
http://pastebin.com/RH6dD77F
To create an animation I rendered hundreds of the individual images with slight variations. The variation is a transformation of the coefficients of the generating function as if they were an abstract vector in a space of coefficients. All of that produced incredible structures in the video...
http://www.youtube.com/watch?v=S2uMAvL_5Fo
The problem? As you can tell, the quality on each image is rather low because it takes forever using the method I came up with (the copies I have on my computer are a little better quality, but still look like old reel-to-reel movies). I'm hoping to find a few methods for increasing quality or lowering output time.
Thanks for any suggestions. I would really like to produce more detailed versions of these. Obviously there is much more structure in the graininess of these images.
You can try something like boxcounting, http://imagej.nih.gov/ij/plugins/fraclac/FLHelp/BoxCounting.htm. If buddhabrot is some sort mandelbrot you can skip some empty boxes. You can use a kd-tree like in packing lightmaps to subdivide the surface.