I've been successfully training image classifiers with Google Cloud AutoML. However, when I have a large number of tags, the evaluation tab for my model shows an incomplete confusion matrix, i.e. only a subset of tags is listed.
How do I view the entire confusion matrix of my model within the UI?
I've seen other Stack Overflow questions which haven't been updated or answered for some time:
Access to entire confusion matrix
Is this currently possible?
Thank you.
In the answer you found, I can see they got the confusion matrix via the API.
As far as I can see, there's no other version of that API, so it should still be valid.
Note that AutoML Vision is in Beta, so the service, as well as the API, might change.
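For reference, a minimal sketch of that API route, assuming the v1beta1 Python client that was current at the time (method and field names may differ in your client library version; the project, region and model IDs are placeholders):

```python
# Hedged sketch: fetch the full confusion matrix via the AutoML API
# instead of the UI. IDs below are placeholders.
from google.cloud import automl_v1beta1 as automl

client = automl.AutoMlClient()
model_full_id = client.model_path("my-project", "us-central1", "my-model-id")

for evaluation in client.list_model_evaluations(model_full_id, ""):
    matrix = evaluation.classification_evaluation_metrics.confusion_matrix
    if not matrix.row:
        continue  # per-label evaluations carry no matrix; skip them
    print("labels:", list(matrix.annotation_spec_id))
    for row in matrix.row:
        print(list(row.example_count))  # one row of counts per true label
```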
I know this question is likely to be closed as "opinion-based", but I could not find any resource online and every link pointed to asking on Stack Overflow, so please be patient.
I'm trying to understand if Tensorflow is the right tool for object detection. I'm not talking about classification, but real object detection and recognition.
My use case is the following: given image A (a live photo), find the matching image inside a catalogue of thousands of different images.
For example: live scanning of a supermarket product, then finding the matching product in a high-res catalogue of images. I'm not interested in knowing whether the product is a shoe or a toothpaste; I want to know the "most matching" image (e.g. Prada model X, or Colgate mint flavoured).
I already have a working script, developed a few years ago with OpenCV, using SURF feature detection with FLANN, but I wanted to know if there's a better tool for the job.
Can anyone point me in the right direction?
While I'm unsure whether it provides a better solution than any you've already implemented, TensorFlow, and deep learning in general, can indeed be used for this purpose. A neural network can be created which takes an image as input and outputs a numeric vector. The Euclidean distance between vectors can be used to determine the similarity between different images, an approach which has been applied effectively for facial recognition (see this paper).
For a starting point in implementing this solution using TensorFlow, see this tutorial.
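To make the idea concrete, here is a minimal sketch of that approach. It uses a pretrained feature extractor as a stand-in for the embedding network; a real FaceNet-style model would be trained with a triplet loss, which this sketch skips:

```python
# Sketch: embed images as vectors with a pretrained network, then rank the
# catalogue by Euclidean distance to the query. MobileNetV2 is just a
# stand-in for a properly trained embedding model.
import numpy as np
import tensorflow as tf

extractor = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg")

def embed(batch):
    """batch: float32 array of shape (n, 224, 224, 3), already preprocessed."""
    v = extractor(batch).numpy()
    return v / np.linalg.norm(v, axis=1, keepdims=True)  # L2-normalise

def best_match(query_vec, catalogue_vecs):
    """Return the index of the catalogue image closest to the query."""
    distances = np.linalg.norm(catalogue_vecs - query_vec, axis=1)
    return int(np.argmin(distances))
```

The catalogue embeddings only need to be computed once and can then be reused for every live query, which is what makes the nearest-neighbour lookup fast.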
I'm currently trying to get my SSDLite network, which I trained with the Tensorflow Object Detection API, working on iOS.
So I'm using the open-source code of SSDMobileNet_CoreML.
The graph already works, with some limitations. For running on iOS I had to extract the FeatureExtractor from my graph and was unable to keep the Preprocessor, Postprocessor and MultipleGridAnchorBox, the same as they did in SSDMobileNet_CoreML.
Here you can see the Anchors they have used.
Since my anchors seem to be a little different, I tried to understand how they got this array.
I found an explanation in a GitHub issue, where the user who created the anchors explains how he got them.
He says:
I just exported them out of the Tensorflow Graph from the import/MultipleGridAnchorGenerator/Identity tensor
I already found the matching tensor in my graph, but I don't know how to export it from the graph and retrieve the correct anchor encoding.
Can somebody explain this to me?
I already figured it out. A little below the quote there was a link to a Python notebook which explains everything in detail.
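For anyone hitting the same issue, this is roughly how the export works: load the frozen graph, look up the anchor tensor by name, and evaluate it in a session. The tensor name below comes from the quoted issue; your graph may use a different one, and if the tensor isn't constant you may need to feed a dummy input image as well.

```python
# Sketch (TF 1.x style, matching the Object Detection API of that era):
# evaluate the anchor tensor from a frozen graph and dump its values.
import tensorflow.compat.v1 as tf

with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="import")

anchors_t = graph.get_tensor_by_name("import/MultipleGridAnchorGenerator/Identity:0")
with tf.Session(graph=graph) as sess:
    anchors = sess.run(anchors_t)  # typically shape (num_anchors, 4)

print(anchors.shape)
print(anchors[:5])                 # first few anchor boxes
```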
I'm taking my first steps in Machine Learning. I've already gone through many TensorFlow.js tutorials, and I'm trying to achieve this: "Realtime Single Object Tracking/Detection".
Something like this -> input: webcam/video -> output: object bounding box
I know there are SSD and YOLO, and other libraries for predicting and locating objects. But the prediction time is very slow (in the browser); I guess it's because the neural network has to discriminate between so many object classes.
https://github.com/ModelDepot/tfjs-yolo-tiny
https://github.com/tensorflow/models/tree/master/research/object_detection
What if I just want to track a single object? Would it be possible? Will the performance be better? Where should I start?
I've been thinking about extracting the pre-trained class (object) from a SavedModel, then continuing training from it. But there don't seem to be any instructions for this around Google.
I found some fantastic code by IBM, which I used in the video in this tweet: https://twitter.com/GantLaborde/status/1125735283343921152?s=20
I extracted that code to make a ReactJS component for detecting Rock/Paper/Scissors here: https://github.com/GantMan/rps_tfjs_demo/blob/master/src/AdvancedModel.js
If you'd like to play with the demo, it's at the bottom of this page: https://rps-tfjs.netlify.com/
All of this is open source and seems to work perfectly fast for detecting a single object in realtime.
I haven't tried Tensorflow yet, but I'm still curious: how does it store the acquired learning of a machine learning program for later use, and in what form, data type and file type?
For example, Tensorflow was used to sort cucumbers in Japan. The computer took a long time to learn, from the example images given, what good cucumbers look like. In what form was that learning saved for future use?
I ask because I think it would be inefficient if the program had to re-learn from the images every time it needs to sort cucumbers.
Ultimately, a high-level way to think about a machine learning model is as three components: the code for the model, the data for that model, and the metadata needed to make the model run.
In Tensorflow, the code for this model is written in Python and is saved in what is known as a GraphDef. This uses a serialization format created at Google called Protobuf. Other libraries commonly use different serialization formats, such as Python's native Pickle.
The main reason you write this code is to "learn" from some training data. What is learned is ultimately a large set of matrices full of numbers: the "weights" of the model. These too are stored using Protobuf, although other formats like HDF5 exist.
Tensorflow also stores metadata associated with this model: for instance, what the input should look like (e.g. an image? some text?) and what the output should be (e.g. a class of image, such as cucumber grade 1 or 2? with scores, or without?). This too is stored in Protobuf.
At prediction time, your code loads up the graph, the weights and the metadata, and takes some input data to produce an output. More information here.
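As a concrete illustration, here is a minimal tf.keras sketch of that save/load cycle (SavedModel is one of several formats TensorFlow supports, and the exact API details vary by version; the directory it writes bundles the graph, the learned weights and the signature metadata):

```python
# Sketch: train once, save to disk, reload later without re-learning.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels)  # the slow "learning" step, done once

model.save("cucumber_model")  # writes a SavedModel: graph + weights + metadata
restored = tf.keras.models.load_model("cucumber_model")  # fast, no retraining
```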
Are you talking about the symbolic math library, or the idea of tensor flow in general? Please be more specific here.
Here are some resources that discuss the library and tensor flow
These are some tutorials
And here is some background on the field
And this is the github page
If you want a more specific answer, please give more details as to what sort of work you are interested in.
Edit: So I'm presuming your question is more related to the general field of machine learning with TensorFlow than to any particular application. Your question is still too vague for this website, but I'll try to point you toward a few resources you might find interesting.
TensorFlow, as used in image recognition, typically operates on an ANN (Artificial Neural Network). What this means is that the TensorFlow library handles the number crunching for the neural network, which I'm sure you can read all about with a quick Google search.
The point is that TensorFlow isn't a form of machine learning itself; it serves rather as a number-crunching library, similar to something like numpy in Python, for large-scale deep learning applications. You should read more here.
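A tiny illustration of that point: at its core, TensorFlow evaluates tensor operations much like numpy evaluates array operations.

```python
# Minimal example of TensorFlow as a number-crunching library.
import tensorflow as tf

a = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
b = tf.constant([[5.0],
                 [6.0]])
print(tf.matmul(a, b).numpy())  # [[17.], [39.]] -- a plain matrix product
```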
My problem statement is as follows :
" Object Detection and Localization using Tensorflow and convolutional neural network "
What I did:
I am done with cat detection from images using the tflearn library. I successfully trained a model using 25000 images of cats and it's working fine with good accuracy.
Current Result:
What I want to do:
If my image consists of two or more objects, for example a cat and a dog together, then my result should be 'cat and dog', and apart from this I have to find the exact location of these two objects in the image (bounding boxes).
I came across many high-level libraries like Darknet and SSD, but I was not able to grasp the concept behind them.
Please guide me on an approach to solve this problem.
Note: I am using supervised learning techniques.
Expected Result:
You have several ways to go about it.
The most straightforward way is to get some suggested bounding boxes using a bounding-box proposal algorithm like selective search, and then run the classification net that you already trained on each of the suggestions (see the sketch below). This is the approach taken by R-CNN.
For more advanced algorithms based on the above approach, I suggest you read about Fast R-CNN and Faster R-CNN.
Look at Object detection with R-CNN? for some basic explanation.
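As a rough sketch of that pipeline, using OpenCV's selective search for the proposals (requires opencv-contrib-python; `classify` is a hypothetical stand-in for the classifier you already trained):

```python
# Sketch of the R-CNN-style approach: propose boxes, classify each crop.
import cv2

image = cv2.imread("scene.jpg")
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(image)
ss.switchToSelectiveSearchFast()
boxes = ss.process()                       # (x, y, w, h) proposals

for (x, y, w, h) in boxes[:200]:           # keep only the top proposals
    crop = image[y:y + h, x:x + w]
    label, score = classify(crop)          # hypothetical: your trained net
    if score > 0.9:
        print(label, (x, y, w, h))         # confident detection + its box
```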
Darknet and SSD are based on a different approach; if you want to understand them, you can read about them at:
http://www.cs.unc.edu/~wliu/papers/ssd.pdf
https://pjreddie.com/media/files/papers/yolo.pdf
Image localization is a complex problem with many different implementations achieving the same result with different efficiency.
There are two main types of implementation:
- localizing objects with regression
- Single Shot Detectors
Read this https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/object_localization_and_detection.html to get a better idea.
Cheers
I have done a similar project (detection + localization) on Indian currencies using PyTorch and ResNet34. Below is the link to my Kaggle notebook; I hope you find it helpful. I manually collected images from the internet, drew bounding boxes around them, and saved their annotation files (Pascal VOC format) using the "LabelImg" annotation tool.
https://www.kaggle.com/shweta2407/objectdetection-on-custom-dataset-resnet34