Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm trying to train a neural network to detect cardboard boxes along with multiple classes of persons (people).
Although it detects and classifies persons easily, it has an incredibly hard time detecting cardboard boxes.
The boxes look like this:
My suspicion is that a box is too simple an object, and the neural network has a hard time detecting it because there are too few features to extract from it.
The division of the dataset looks like this:
personA: 1160
personB: 1651
personC: 2136
person: 1959
box: 2798
The persons wear different safety items and are classified according to those items, but each one is detected as a whole person, not just the item.
I tried the following architectures:
ssd300_inceptionv2
ssd512_inceptionv2
faster_rcnn_inceptionv2
All of these detect and classify persons much better than boxes. I cannot provide the exact mAP (I don't have it).
I used a pretrained COCO model from the TensorFlow model zoo.
Any ideas why it is so hard to detect boxes?
Thanks.
PS: I asked this question on Data Science Stack Exchange but didn't get a relevant answer.
You are starting from a model pre-trained on COCO, which already includes the "person" category but not a "box" category, so it sounds normal to me that the box category is harder.
I don't think your hypothesis is correct, since a CNN should be more than capable of extracting the right features for simple objects as well as complex ones.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
If one image is part of another image, how can I compute its exact location using deep learning?
I can currently compute this by extracting and matching key points with OpenCV, but I would like to solve it with neural networks.
Any ideas on how to design the network and the loss function?
Thanks very much.
This is a detection problem. The simplest approach is to create a network with two heads, one for classification and the other for the bounding box (regression).
You feed the network the image and its label, sum the losses, and do a backward pass. Train for some epochs and you'll get yourself a detection model that you can use to detect what you need. But this is just a simple approach, and it can get much more complex.
You may as well skip this and use an existing detection architecture, or better, a framework, which makes your life much simpler.
For TensorFlow I believe you can use the Object Detection API, and for PyTorch you can use Detectron2 or mmdetection, among others (the original Detectron was built on Caffe2).
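To make the "sum the losses" step concrete, here is a minimal framework-free sketch of a combined loss for one image. It assumes a softmax cross-entropy for the classification head and a smooth L1 loss for the box head; those choices, the function names, and the `box_weight` parameter are my assumptions for illustration, not something the approach above prescribes. In a real framework you would compute the same sum on tensors and call backward on it.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one true class index."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(predicted_box, target_box, beta=1.0):
    """Smooth L1 loss summed over the box coordinates."""
    total = 0.0
    for predicted, target in zip(predicted_box, target_box):
        diff = abs(predicted - target)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total


def detection_loss(class_logits, predicted_box, target_class, target_box, box_weight=1.0):
    """Sum of the classification-head loss and the weighted box-head loss."""
    classification = softmax_cross_entropy(class_logits, target_class)
    regression = smooth_l1(predicted_box, target_box)
    return classification + box_weight * regression
```

The `box_weight` factor is only there because the two terms can live on very different scales; with the default of 1.0 this is the plain sum described above.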
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I want to build a computer vision model that can locate an object in an image, for example the (x, y, width, height) pixel coordinates of the bounding box of somebody's hand. I know of complex object detection algorithms like YOLO and R-CNN, but I am curious why I couldn't just create a vanilla ConvNet with an output layer of 4 neurons (one for each coordinate value) with linear activation functions?
For clarity, I do not want to identify multiple objects in the image. I am assuming that only one hand is present in each image.
Any help would be appreciated!
You for sure can do it, there's no math stopping you or anything. YOLO is designed for multiple objects after all.
Some thoughts though:
Your model will always guess some box, even when there's no hand in the image.
If you do use YOLO, you gain the benefit of a pre-trained network, which makes the model more robust when it is used in new environments.
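On the first point, one possible workaround (my assumption, not something the question asked for) is to add a fifth output that scores whether a hand is present at all, and to report a box only when that score is high enough. A rough sketch of the decoding step, with a hypothetical output layout of (presence_logit, x, y, width, height):

```python
import math


def decode_prediction(outputs, presence_threshold=0.5):
    """Return (x, y, width, height), or None when no hand is predicted.

    `outputs` is assumed to be (presence_logit, x, y, width, height);
    this layout is hypothetical and chosen only for illustration.
    """
    presence_logit, x, y, width, height = outputs
    presence = 1.0 / (1.0 + math.exp(-presence_logit))  # sigmoid
    if presence < presence_threshold:
        return None
    return (x, y, width, height)
```

During training the box loss would then only be applied to images that actually contain a hand, while the presence output is trained on all images.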
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Is there an efficient way to calculate the area covered by trees in a Google Earth image using machine learning? We can retrain a pretrained Inception model on our data with TensorFlow to identify whether there is a tree or not, but I can't think of any way to find out how many trees there are or how much area they cover. Is there anything we can do?
I use Python and TensorFlow for machine learning.
P.S.: I don't know much about machine learning, but I can work through the steps.
In computer vision there are different ways of finding objects in images:
image classification will tell you if an image is something (e.g. this image is a cat)
object detection will tell you where something is in an image (e.g. it will draw a box around a cat)
image segmentation will try to extract the exact contour of something in an image (e.g. the precise contour of the cat, not just a box containing it)
You need a neural network capable of doing the second or third task with aerial images of trees.
Then simply sum all the tree areas and compare the result with the image size.
Here you can find a TensorFlow network for object detection: https://github.com/tensorflow/models/tree/master/research/object_detection
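The last step (summing the tree areas and comparing with the image size) could look roughly like the sketch below. It assumes you already have either a binary segmentation mask or a list of detection boxes from some model; the function names and the box format are hypothetical. For boxes, the sketch rasterizes them first so that overlapping boxes are not counted twice, which a plain sum of box areas would do.

```python
def coverage_from_mask(mask):
    """Fraction of pixels marked as tree in a binary mask (list of rows of 0/1)."""
    total_pixels = sum(len(row) for row in mask)
    tree_pixels = sum(sum(row) for row in mask)
    return tree_pixels / total_pixels if total_pixels else 0.0


def coverage_from_boxes(boxes, image_width, image_height):
    """Fraction of the image covered by the union of detection boxes.

    Each box is assumed to be (x_min, y_min, x_max, y_max) in integer
    pixels, with the max edge exclusive.
    """
    covered = [[False] * image_width for _ in range(image_height)]
    for x_min, y_min, x_max, y_max in boxes:
        # Clip each box to the image before marking its pixels.
        for y in range(max(0, y_min), min(image_height, y_max)):
            for x in range(max(0, x_min), min(image_width, x_max)):
                covered[y][x] = True
    covered_pixels = sum(row.count(True) for row in covered)
    return covered_pixels / (image_width * image_height)
```

To turn the fraction into a real-world area you would also need the ground resolution of the image (how much ground one pixel covers), which this sketch does not know.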
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Sometimes you know from experience or expert knowledge that some variable will play a key role in the model. Is there a way to manually make that variable count more, so that training speeds up and the method incorporates some human knowledge/wisdom/intelligence?
I still think machine learning combined with human knowledge is the strongest weapon we have now.
This might work by scaling your input data accordingly.
On the other hand, the strength of a neural network is that it figures out from the data which features are in fact important and which combinations with other features are important.
You might argue that you'll decrease training time. Somebody else might argue that you're biasing your training in such a way that it might even take more time.
Anyway, if you want to do this, assuming a fully connected layer, you could initialize the weights of the input feature you consider important to larger values.
Another way could be to first pretrain the model with a training loss that has your feature as the target output. Then keep the weights and switch to the actual loss. I have never tried this, but it could work.
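For the input-scaling idea, a minimal sketch could look like this. It assumes tabular data as a list of rows, standardizes every column so they start on a comparable scale, and then multiplies the chosen column by a factor. The function names and the factor are hypothetical, and whether this actually helps training is exactly the open question discussed above.

```python
import math


def standardize(column):
    """Shift a column to zero mean and scale it to unit standard deviation."""
    mean = sum(column) / len(column)
    variance = sum((value - mean) ** 2 for value in column) / len(column)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for constant columns
    return [(value - mean) / std for value in column]


def emphasize_feature(rows, feature_index, factor):
    """Standardize all columns, then scale one column up by `factor`."""
    columns = [standardize(list(column)) for column in zip(*rows)]
    columns[feature_index] = [value * factor for value in columns[feature_index]]
    return [list(row) for row in zip(*columns)]
```

With a fully connected first layer, scaling an input by a factor has the same effect on the first forward pass as multiplying that input's initial weights by the same factor, so this is closely related to the weight-initialization suggestion.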
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to start working on neural networks for my final project. I want a topic that can be completed in 2-3 months of work and that a beginner can understand well, as I am new to this topic and want to learn by doing this project. It should not be very tough to understand and get started with.
You could write a simple OCR using a Hopfield neural network.
A good start would be:
A comparative study of neural network algorithms applied to optical character recognition
Hopfield Networks: A Simple OCR Application
It is a relatively simple, fun project.
It would be even easier if you could use Matlab and some of its modules. But even if you were to implement it in Java or some similar language, I think it should be doable in 3 months for a beginner.
In Matlab, you could start with the following:
Hopfield Neural Network
Hopfield Two Neuron Design
You will need the Neural Network Toolbox, which I think has to be purchased separately.
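If you would rather not depend on Matlab, the core of a Hopfield network is small enough to write by hand. Below is a minimal Python sketch, assuming bipolar (+1/-1) pixel patterns, Hebbian learning, and asynchronous updates; the function names are mine, and a usable OCR would need larger character bitmaps and some care about how many patterns the network can hold.

```python
def train_hopfield(patterns):
    """Hebbian weights for a list of bipolar (+1/-1) patterns of equal length."""
    size = len(patterns[0])
    weights = [[0.0] * size for _ in range(size)]
    for pattern in patterns:
        for i in range(size):
            for j in range(size):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / size
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            activation = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if activation >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state
```

For OCR you would flatten each character bitmap into one such pattern, train on the clean characters, and call `recall` on a noisy scan to pull it back toward the closest stored character.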
Your first question should be what and not how you want to classify. Depending on the problem, you can choose a fitting classifier. It's hard for you to decide the detailed solution before knowing the actual problem.
Simple topics (depending on your personal background) can be text, audio or image analysis. OCR is quite typical (you can use the MNIST database for that, it's well researched so you can compare your own results). To get a general idea of what applications are out there, you should also definitely have a look at the UCI database. It has all sorts of data.
The easiest type of neural network to understand and implement is a single-layer perceptron. To also classify data that is not linearly separable (which is needed in most real-world scenarios), you can use a multi-layer perceptron with 3 layers (in/hidden/out).
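To give an idea of how little code a single-layer perceptron needs, here is a minimal sketch using the classic perceptron learning rule with a step activation. The function names, learning rate, and epoch limit are my own choices for illustration.

```python
def predict(weights, bias, features):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Perceptron learning rule; stops early once an epoch has no mistakes."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in zip(samples, labels):
            error = label - predict(weights, bias, features)
            if error != 0:
                # Move the decision boundary toward the misclassified sample.
                weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias
```

Trained on the four input pairs of a logical AND, this learns a correct separating line. It cannot learn XOR, which is not linearly separable, and that is where the hidden layer of a multi-layer perceptron comes in.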