predict the position of an image in another image [closed] - tensorflow

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
If one image is part of another image, how can I compute its exact location using deep learning?
I can already do this by extracting and matching keypoints with OpenCV, but I would like to solve it with a neural network.
Any ideas on how to design the network and the loss function?
Thanks very much.

This is a detection problem. The simplest approach is to create a network with two heads, one for classification and the other for bounding-box regression.
You feed the network images with their respective labels, sum the two losses, and backpropagate. Train for some epochs and you will have a detection model you can use to detect what you need. This is just a simple approach, though, and it can get much more complex.
You may as well skip this and use an existing detection architecture, or better, a framework that makes your life much easier.
For TensorFlow I believe you can use the Object Detection API, and for PyTorch you can use Detectron2 or mmdetection, among others (the original Detectron was built on Caffe2).
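As a rough illustration of "sum the losses" for a two-head network, here is a minimal framework-free sketch for a single sample. The choice of softmax cross-entropy for the classification head, smooth L1 for the box head, and the `box_weight` balancing factor are assumptions made for this example; the answer above does not prescribe specific losses.

```python
import math

def softmax_cross_entropy(logits, target_class):
    """Cross-entropy for one sample, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_class]

def smooth_l1(pred_box, true_box, beta=1.0):
    """Smooth L1 loss averaged over the box coordinates (x, y, w, h)."""
    total = 0.0
    for p, t in zip(pred_box, true_box):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred_box)

def detection_loss(class_logits, pred_box, target_class, true_box, box_weight=1.0):
    """Sum of the two head losses; box_weight is an assumed balancing knob."""
    cls = softmax_cross_entropy(class_logits, target_class)
    box = smooth_l1(pred_box, true_box)
    return cls + box_weight * box

# Example: two classes, boxes given as normalized (x, y, w, h).
loss = detection_loss([2.0, -1.0], [0.4, 0.5, 0.2, 0.2], 0, [0.5, 0.5, 0.2, 0.3])
print(loss)
```

In a real framework you would compute the same sum on tensors and call the framework's backward pass on it.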

When to use YOLO vs vanilla CNN? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I want to build a computer vision model that can locate an object in an image, for example the (x, y, width, height) pixel coordinates of the bounding box of somebody's hand. I know of complex object detection algorithms like YOLO and R-CNN, but I am curious why I couldn't just create a vanilla ConvNet with an output layer of 4 neurons (one per coordinate value) with linear activation functions?
For clarity, I do not want to identify multiple objects in the image. Assume that only one hand is present in each image.
Any help would be appreciated!
You can certainly do it; there is no math stopping you. YOLO is designed for multiple objects, after all.
Some thoughts though:
Your model will always guess some box, even when there is no hand in the image.
If you do use YOLO, you benefit from a pre-trained network, which makes the model more robust when it is used in new environments.
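One possible way to soften the "always guesses some box" issue is to add a fifth output that scores whether a hand is present at all. This is not part of the answer above; the five-value output layout, the sigmoid on the presence score, and the 0.5 threshold are all assumptions made for this sketch.

```python
import math

def decode_prediction(outputs, threshold=0.5):
    """Turn 5 raw network outputs into a box, or None if no object is likely.

    outputs: (presence_logit, x, y, width, height), a hypothetical layout.
    """
    presence_logit, x, y, w, h = outputs
    presence = 1.0 / (1.0 + math.exp(-presence_logit))  # sigmoid
    if presence < threshold:
        return None  # the model believes there is no hand in the image
    return (x, y, w, h)

print(decode_prediction([3.0, 10, 20, 30, 40]))   # confident: returns the box
print(decode_prediction([-3.0, 10, 20, 30, 40]))  # not confident: returns None
```

During training, the box loss would then be applied only to images that actually contain a hand.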

why can't I reimplement my tensorflow model with pytorch? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I am developing a model in TensorFlow and find that it performs well under my specific evaluation method. But when I port it to PyTorch, I can't achieve the same results. I have checked the model architecture, the weight initialization method, the learning-rate schedule, the weight decay, the momentum and epsilon used in the BN layers, the optimizer, and the data preprocessing. Everything is the same, but I still can't get the same results as in TensorFlow. Has anybody met the same problem?
I did a similar conversion recently.
First you need to make sure that the forward pass produces the same results: disable all randomness, initialize with the same values, give it a very small input, and compare. If there is a discrepancy, disable parts of the network and compare again, enabling layers one by one.
Once the forward pass is confirmed, check the loss, gradients, and weight updates after one forward-backward cycle.
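The layer-by-layer comparison can be sketched as follows. It assumes you have already dumped each layer's output from both implementations into dictionaries of flat float lists, keyed by layer name in forward order; the function names and the tolerance are hypothetical choices for this example.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two flat lists."""
    if len(a) != len(b):
        raise ValueError("shape mismatch: %d vs %d values" % (len(a), len(b)))
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)

def first_divergent_layer(acts_a, acts_b, atol=1e-5):
    """Return (layer_name, diff) for the first layer whose outputs differ.

    acts_a, acts_b: dicts mapping layer name to flattened outputs,
    inserted in forward order (dicts keep insertion order in Python 3.7+).
    Returns None when every layer matches within atol.
    """
    for name in acts_a:
        diff = max_abs_diff(acts_a[name], acts_b[name])
        if diff > atol:
            return name, diff
    return None

# Made-up activations for illustration only.
tf_acts = {"conv1": [0.1, 0.2], "bn1": [0.5, 0.6], "fc": [1.0]}
pt_acts = {"conv1": [0.1, 0.2], "bn1": [0.5, 0.7], "fc": [1.3]}
print(first_divergent_layer(tf_acts, pt_acts))
```

Here the first mismatch is reported at `bn1`, which tells you to inspect that layer's parameters before looking at anything downstream. With NumPy arrays you would typically use `np.allclose` for the same check.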

How can we calculate the area covered by the trees in google earth pic or even just a ratio of Trees to other things [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Is there an efficient way to calculate the area covered by trees in a Google Earth image using machine learning? We can retrain a pre-trained Inception model with TensorFlow to identify whether there is a tree or not, but I can't think of any way to find out how many trees there are or how much area they cover. Is there anything we can do?
I use Python and TensorFlow for machine learning.
P.S.: I don't know much about machine learning, but I can work through steps.
In computer vision there are different ways of finding objects in images:
image classification will tell you if an image is something (e.g. this image is a cat)
object detection will tell you where something is in an image (e.g. it will draw a box around a cat)
image segmentation will try to extract the exact contour of something in an image (e.g. the precise contour of the cat, not just a box containing it)
You need a neural network capable of doing the second or third task with aerial images of trees.
Then simply sum the areas of all the trees and compare the result with the image size.
A TensorFlow implementation of object detection is available here: https://github.com/tensorflow/models/tree/master/research/object_detection
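Once a segmentation model has produced a binary mask, the final "sum and compare" step is only a few lines. This sketch assumes the mask is a 2D list with 1 for tree pixels and 0 otherwise; `metres_per_pixel` is a hypothetical ground resolution that you would have to know for your imagery.

```python
def tree_coverage_ratio(mask):
    """Fraction of pixels labelled as tree (1) in a 2D binary mask."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("empty mask")
    tree_pixels = sum(sum(row) for row in mask)
    return tree_pixels / total

def tree_area_m2(mask, metres_per_pixel):
    """Ground area covered by trees, given the assumed image resolution."""
    tree_pixels = sum(sum(row) for row in mask)
    return tree_pixels * metres_per_pixel ** 2

# Tiny made-up mask: 3 tree pixels out of 12.
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(tree_coverage_ratio(mask))
print(tree_area_m2(mask, 0.5))
```

With detection boxes instead of a mask, summing box areas would overestimate coverage wherever boxes overlap or include ground, so segmentation is the better fit for area estimates.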

Any way to manually make a variable more important in a machine learning model? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Sometimes you know from experience or expert knowledge that some variable will play a key role in a model. Is there a way to manually make that variable count more, so that training speeds up and the method incorporates some human knowledge?
I still think machine learning combined with human knowledge is the strongest weapon we have now.
This might work by scaling your input data accordingly.
On the other hand, the strength of a neural network is that it figures out from the data which features are in fact important, and which combinations with other features are important.
You might argue that you will decrease training time. Somebody else might argue that you are biasing your training in a way that could even make it take longer.
Anyway, if you want to do this, and assuming a fully connected layer, you could initialize the weights of the input feature you consider important with larger values.
Another way could be to first pretrain the model with a training loss that has your feature as an output, then keep the weights and switch to the actual loss. I have never tried this, but it could work.
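The weight-initialization idea might look like the sketch below for a single fully connected layer. The Gaussian initialization, the `boost` factor of 3, and the function name are assumptions chosen for illustration; nothing here guarantees faster training.

```python
import random

def init_weights(n_inputs, n_hidden, important=(), boost=3.0, std=0.1, seed=0):
    """Random Gaussian weights, with rows of 'important' inputs scaled up.

    weights[i][j] connects input feature i to hidden unit j.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    weights = [[rng.gauss(0.0, std) for _ in range(n_hidden)]
               for _ in range(n_inputs)]
    for i in important:
        # Larger initial weights give this feature more initial influence.
        weights[i] = [w * boost for w in weights[i]]
    return weights

base = init_weights(4, 3)
boosted = init_weights(4, 3, important=[2])
print(base[2])
print(boosted[2])
```

The network is still free to shrink these weights during training if the data disagrees with your prior.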

project topic for neural network for freshers? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I want to start working on neural networks for my final project. I want a topic that can be completed in 2-3 months of work and that is understandable for a fresher, as I am new to this subject and want to learn by doing the project. It should not be too hard to understand and get started with.
You could write a simple OCR using a Hopfield neural network.
A good start would be:
A comparative study of neural network algorithms applied to optical character recognition
Hopfield Networks: A Simple OCR Application
It is a relatively simple fun project.
It would be even easier if you could use Matlab and some of its modules. But even if you were to implement it in Java or some similar language, I think it should be doable in 3 months for a beginner.
In Matlab, you could start with the following:
Hopfield Neural Network
Hopfield Two Neuron Design
You will need the Neural Network Toolbox, which I think has to be purchased separately.
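To give a feel for how little code the core of a Hopfield network needs, here is a minimal sketch in plain Python rather than Matlab. It assumes patterns are lists of +1/-1 values (for OCR these would be flattened black-and-white character bitmaps), uses Hebbian learning with a zero diagonal, and updates all units synchronously; the two 8-element patterns are made up for the example.

```python
def train_hopfield(patterns):
    """Hebbian weight matrix for +1/-1 patterns, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def recall(w, state, max_steps=10):
    """Synchronously update all units until the state stops changing."""
    state = list(state)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(w):
            h = sum(wij * sj for wij, sj in zip(row, state))
            # Keep the old value when the input is exactly zero.
            new_state.append(state[i] if h == 0 else (1 if h > 0 else -1))
        if new_state == state:
            break
        state = new_state
    return state

p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train_hopfield([p1, p2])

noisy = [-1] + p1[1:]  # p1 with its first element flipped
print(recall(w, noisy) == p1)
```

The network stores both patterns and pulls the corrupted input back to the closest stored one, which is the mechanism a Hopfield-based OCR relies on.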
Your first question should be what and not how you want to classify. Depending on the problem, you can choose a fitting classifier. It's hard for you to decide the detailed solution before knowing the actual problem.
Simple topics (depending on your personal background) can be text, audio or image analysis. OCR is quite typical (you can use the MNIST database for that, it's well researched so you can compare your own results). To get a general idea of what applications are out there, you should also definitely have a look at the UCI database. It has all sorts of data.
The easiest type of neural network to understand and implement is a single-layer perceptron. To also classify non-linearly separable data (which is needed in most real-world scenarios), you can use a multi-layer perceptron with 3 layers (input/hidden/output).
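A single-layer perceptron fits in a few lines, which also makes the limitation easy to see. The sketch below uses the classic perceptron learning rule with a step activation; the learning rate, epoch count, and the AND/XOR toy data are choices made for this example.

```python
def predict(w, b, x):
    """Step activation on the weighted sum of the inputs."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Perceptron learning rule; stops early once an epoch has no errors."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            update = lr * (y - predict(w, b, x))
            if update != 0:
                w = [wi + update * xi for wi, xi in zip(w, x)]
                b += update
                errors += 1
        if errors == 0:
            break
    return w, b

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND is linearly separable, so the perceptron learns it exactly.
w_and, b_and = train_perceptron(inputs, [0, 0, 0, 1])
print([predict(w_and, b_and, x) for x in inputs])

# XOR is not linearly separable, so some input is always misclassified.
w_xor, b_xor = train_perceptron(inputs, [0, 1, 1, 0])
print([predict(w_xor, b_xor, x) for x in inputs])
```

The XOR failure is exactly the case where a hidden layer, as in the multi-layer perceptron mentioned above, becomes necessary.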