Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 months ago.
I have a target object and many other objects in an image. The target object has a known, pre-defined shape, as shown in the figure.
My task is to detect every instance of the target object in the image and find the angle at which each detected object is oriented relative to the target object. For detection I am using the YOLO-V5-OBB model, which gives me the detection confidence and the rotated bounding box coordinates. See the result below.
I would like to know how the YOLO-OBB model predicts the rotation angle in order to draw rotated bounding boxes around the detected objects.
Before using heavy machine learning models, try using classic computer vision algorithms.
For finding the object:
If the object above is the only object you will be searching for:
Use cv2.HoughCircles().
https://docs.opencv.org/3.4/d4/d70/tutorial_hough_circle.html
If you want to be able to search arbitrary objects:
Try using template matching.
https://pyimagesearch.com/2021/03/22/opencv-template-matching-cv2-matchtemplate/
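To make the template-matching idea concrete, here is a minimal pure-Python sketch of sum-of-squared-differences matching, which is the same idea as OpenCV's TM_SQDIFF mode. The function name `match_template_ssd` and the toy grayscale arrays are made up for illustration; in practice cv2.matchTemplate does this far more efficiently.

```python
def match_template_ssd(image, template):
    """Slide template over image; return ((left, top), score) of the best match.

    Both arguments are 2D lists of grayscale values (hypothetical toy format).
    A lower score means a better match; 0 is an exact match.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_pos, best_score = None, None
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            # Sum of squared differences over the window at (left, top).
            score = sum(
                (image[top + r][left + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (left, top), score
    return best_pos, best_score


# Toy example: a 2x2 pattern hidden in a 4x4 image.
toy_image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
toy_template = [[9, 8], [7, 6]]
position, score = match_template_ssd(toy_image, toy_template)
```

Note that plain template matching is not rotation invariant, so for rotated targets you would likely need to match against several rotated copies of the template.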
After detecting the objects:
Apply a Hough transform to extract the top line, then determine its angle with a line-fitting algorithm.
OpenCV line-fitting (might be deprecated)
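As a sketch of the line-fitting step, assuming you already have the (x, y) points of the extracted line (for example from the Hough step), the orientation can be estimated from the second-order central moments of the points. The function name and the sample points are hypothetical. Keep in mind that image coordinates have y pointing down, so the angle is mirrored relative to the usual math convention.

```python
import math


def line_angle_deg(points):
    """Orientation of the best-fit line through points, in degrees in [0, 180)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Second-order central moments of the point cloud.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Principal-axis angle; the line has no direction, so fold into [0, 180).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(theta) % 180.0


# Hypothetical edge points of the reference object and of a detected object.
reference_angle = line_angle_deg([(0, 0), (1, 0), (2, 0)])
detected_angle = line_angle_deg([(0, 0), (1, 1), (2, 2)])
relative_angle = (detected_angle - reference_angle) % 180.0
```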
I think you could train YOLOv5-OBB to get the coordinates of the rotated bounding box; from those it is easy to find the angle of orientation. Look at detect.py.
https://www.youtube.com/watch?v=iRkCNo9-slY
https://github.com/hukaixuan19970627/yolov5_obb
# Write results
for *poly, conf, cls in reversed(det):
    if save_txt:  # Write to file
        # xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
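Once you have the polygon for a detection, one way to get an orientation from it is to take the direction of its longest edge. The sketch below assumes `poly` holds eight values x1, y1, ..., x4, y4 with the corners listed in order around the box; that layout and the function name are assumptions for illustration, so check them against what detect.py actually yields. This only post-processes the box and says nothing about how the network predicts the angle internally.

```python
import math


def obb_angle_deg(poly):
    """Angle of the longest edge of a 4-corner polygon, in degrees in [0, 180)."""
    # float() also accepts scalar tensors, so raw model output can be passed in.
    corners = [(float(poly[i]), float(poly[i + 1])) for i in range(0, 8, 2)]
    best_length, best_angle = -1.0, 0.0
    for i in range(4):
        (x1, y1), (x2, y2) = corners[i], corners[(i + 1) % 4]
        length = math.hypot(x2 - x1, y2 - y1)
        if length > best_length:
            best_length = length
            best_angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    return best_angle


# Hypothetical boxes: the reference lies flat, the detection stands upright.
reference_angle = obb_angle_deg([0, 0, 4, 0, 4, 2, 0, 2])
detected_angle = obb_angle_deg([0, 0, 2, 0, 2, 4, 0, 4])
relative_angle = (detected_angle - reference_angle) % 180.0
```

Because a rectangle looks the same after a 180 degree turn, this can only recover the orientation modulo 180 degrees.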
Closed 3 years ago.
I am building a web app game which takes an image file as input from the filesystem, divides the image into blocks, then arranges the blocks randomly so that the client-side user can move the blocks to restore the original image.
The platform I will be using is Elixir/Phoenix on the backend and VueJS on the front end. When I tried to manipulate images with Erlang, I had difficulty finding libraries for pixel manipulation. But then I thought, if every user's puzzle image has to be created on the server, it will cost a lot of resources anyway.
Do you think it's better to send the image from the server's filesystem to the client and create the image blocks on the client side with JavaScript, or would you do it with Erlang/Elixir on the server side?
What would be your preferred method on a scientific basis, pros and cons?
This is how I am doing it.
# Args explanation:
# -resize 72x72! - Resizes the passed image; `!` means ignore the aspect ratio
# -crop 24x24 - Crops the image into 24x24 tiles
# -scene 1 - Start numbering the cropped images at index 1
# %02d - zero-padded two-digit index in the output name, e.g. 01, 02, 03
# image_size - Actual image size
# seg_size - Size of the small segments/cropped images
# Note: the geometry strings must not contain spaces (72x72!, not "72 x 72!")
list = [
  "#{img_path}",
  "-resize",
  "#{image_size}x#{image_size}!",
  "-crop",
  "#{seg_size}x#{seg_size}",
  "-scene",
  "1",
  "#{new_tmp}/%02d.png"
]
System.cmd("convert", list)
I would go with the ImageMagick command-line wrapper.
Per the cropping documentation, something like the following would do:
convert my_image: -repage 60x40-5-3 \
  -crop 30x20 +repage my_image_tiles_%d.png
Although there is an Elixir wrapper for IM, called Mogrify, it implements a limited set of operations, so I usually go with System.cmd/3.
For each input file you can create the tiles once; the next time the file is accessed, start with a check. If the file was already cropped, just serve the tiles, otherwise crop it before serving them. This would basically be a clean server-side one-liner that is resistant to reverse engineering, unlike client-side solutions.
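The check-then-crop logic could look roughly like this. It is sketched in Python only to show the control flow (the same shape works in Elixir); `get_tiles`, the cache layout, and the `make_tiles` callback are all hypothetical, with `make_tiles` standing in for whatever runs the `convert` command.

```python
from pathlib import Path


def get_tiles(image_path, cache_root, make_tiles):
    """Return the tile paths for image_path, cropping only on first access.

    make_tiles(image_path, tile_dir) is a caller-supplied function (an
    assumption of this sketch) that writes the PNG tiles into tile_dir.
    """
    tile_dir = Path(cache_root) / Path(image_path).stem
    tiles = sorted(tile_dir.glob("*.png")) if tile_dir.is_dir() else []
    if not tiles:
        # Not cropped yet: create the tiles once, then serve from the cache.
        tile_dir.mkdir(parents=True, exist_ok=True)
        make_tiles(image_path, tile_dir)
        tiles = sorted(tile_dir.glob("*.png"))
    return tiles
```

Keying the cache on the file stem is the simplest option; if two source images can share a name, a hash of the full path would be a safer key.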
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
I am trying to create a bot that mimics my playing behavior. The game is real time. I have access to the game's internal state. The player has a set of actions that they can perform.
Implementing an if/then decision tree bot based on the game's state is easy to do, but it does not result in a realistic human player.
I thought using machine learning and neural networks could solve this problem. My initial approach was to log the game's state and my actions every 100 ms. I fed a sequence of game states and my actions into an LSTM and attempted to predict what action should be performed in the next 100 ms. The problem with this is that 95+% of the time, I (the player) am idle and not sending any input to the game. So the result of training is that the network predicts that the next action after a sequence of game states should be nothing/idle.
I thought about using a different approach where the game state is only logged when the player sends an input. That way the network will not predict that the player should be idle. However, this misses potentially vital information in the game state from the times when the player was not sending input.
Any ideas on how to approach this?
Another approach you can take is to use cost function weights to put each action (i.e. idle and the others) on an equal footing. The problem here is presumably that you have an unbalanced dataset (in terms of labels), e.g. you might have 10 "idle" labels for each "jump" label/action; this in turn pushes the neural network to go with the dominant label.
In that case, you can set the class weights of your cross entropy to [1, 10], i.e. inversely proportional to the frequency of each label. See https://www.tensorflow.org/api_docs/python/tf/losses/softmax_cross_entropy
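To illustrate the weighting itself, independent of any framework, here is a small pure-Python sketch that derives inverse-frequency class weights and applies them to a per-sample cross entropy. The function names and the toy labels are made up for the example; in a real model you would pass the weights to your framework's loss instead.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by (count of most common class) / (its own count)."""
    counts = Counter(labels)
    most_common = max(counts.values())
    return {label: most_common / n for label, n in counts.items()}


def weighted_cross_entropy(probs, true_label, weights):
    """Cross entropy for one sample, scaled by the weight of its true class."""
    return -weights[true_label] * math.log(probs[true_label])


# Toy log: 10 "idle" frames for every "jump" frame, as in the example above.
labels = ["idle"] * 10 + ["jump"]
weights = inverse_frequency_weights(labels)

# A prediction that leans towards "idle" is now penalised 10x more
# when the true action was actually "jump".
prediction = {"idle": 0.9, "jump": 0.1}
loss_if_idle = weighted_cross_entropy(prediction, "idle", weights)
loss_if_jump = weighted_cross_entropy(prediction, "jump", weights)
```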
Closed 6 years ago.
I would like to deploy TensorFlow on a production server. I expect about 200 concurrent users. I will be using the parser and a few of my own deep learning models. I would like to know the peak memory and CPU usage for this setup.
Appreciate your help.
A rough guess (with a lot of variance):
Since you mention deep learning, I assume you mean models with at least 3 layers, including some CNNs and probably RNNs.
If you are using simple 2D or 3D inputs but a complex architecture, it is safe to say that your bottleneck will be the CPU, so you will need to run the models on a GPU.
You also need to be prepared to scale to any number of clients, so a scaling mechanism will be useful from the start.
You also need to know how the workload will be handled: do you have to serve in real time, or is a batch queue acceptable? This changes the requirements enormously.
Once you have figured out these and maybe other details, you can refine your estimate.
Best Regards.
Memory depends on 3 factors:
graph size
batch size for the graph
the size of the queue of incoming requests
Of the three factors, batch size probably has the most impact, since the memory used by the graph is: graph size x batch size
As for the CPU, I suggest running the graph on a GPU. You can run some tests and count the number of inferences per second you can do with your graph and the selected batch size. A well-implemented TensorFlow Serving setup handles concurrency very nicely, and your bound is going to be the graph speed.
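As a back-of-the-envelope sketch of the reasoning above: the graph term follows the "graph size x batch size" rule, and the queue is simply added on top, which is an assumption of this sketch. All numbers are hypothetical placeholders; you would replace them with your own measurements.

```python
def estimate_peak_memory_mb(graph_mb, batch_size, queued_requests, request_mb):
    """Rough peak memory: graph size x batch size, plus the waiting requests."""
    return graph_mb * batch_size + queued_requests * request_mb


def seconds_to_serve(concurrent_users, inferences_per_second):
    """Time to answer one request per user at the measured graph speed."""
    return concurrent_users / inferences_per_second


# Hypothetical figures: a 50 MB graph, batches of 8, 200 queued requests of
# 0.5 MB each, and a measured throughput of 100 inferences per second.
peak_mb = estimate_peak_memory_mb(50, 8, 200, 0.5)
wait_s = seconds_to_serve(200, 100)
```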
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
I have a problem with organizing a non-automated warehouse (with forklifts). At the start of the day, there are some pallets in the pallet racks in the warehouse, and during the day a specific number of lorries import pallets to or export pallets from the warehouse. I want to minimize the travel distances of the forklifts during the day and/or minimize the waiting time of lorries that are processing outgoing deliveries (they are waiting for their lorry to be filled with pallets).
I have come up with some fairly intuitive algorithms, but they do not produce good results compared to the most intuitive method: putting imported pallets in the nearest free rack in the warehouse. I tried to convert this problem to linear programming, but I didn't succeed. I know how to find minimal forklift paths for an individual lorry, but I don't know how to put it all together, because each time a lorry exports or imports some pallets the warehouse state changes (a different pallet layout in the warehouse). I also tried a brute-force method, systematically checking every possibility for the best result, but this doesn't produce results in a reasonable time...
Does anyone have an idea, please (about converting the problem to linear programming)?
Some ideas.
It sounds like you may not be able to cast this problem into LP canonical form. Recall, the canonical form of an LP is
$$\max_{x} \; c^{T} x \quad \text{subject to} \quad A x \le b, \quad x \ge 0$$
If you want to optimize the travel distance of the forklifts, then -c is a vector of costs for operating each forklift, A will be the matrix of size #lorries x #forklifts that contains the optimal distances that you are able to calculate, and the solution x will assign some fraction of the work to each forklift.
You will have to figure out the vector b based on system constraints, i.e. b[i] could be the maximum distance a forklift can drive based on its average speed.
Hopefully you can convert the real-valued solutions to some reasonable integer solution; otherwise you will need to use Integer Linear Programming, which is a much more difficult optimization problem.
Finally, if moving the pallets around in the warehouse changes the cost of the system, then LP will not be applicable and you will have to use some sort of state-space search (best-first, A*, or some other variant) where the state is defined by the location of the pallets, forklifts and lorries.
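To give a feel for the state-space search option, here is a minimal A* sketch on a deliberately tiny toy model: a single aisle of racks, where the state is just the tuple of occupied rack positions and each move slides one pallet to an adjacent free rack at cost 1. The model, the names, and the heuristic are all assumptions for illustration; a real warehouse state would also include forklift and lorry positions.

```python
import heapq
from itertools import count

NUM_RACKS = 5  # hypothetical aisle with racks at positions 0..4


def successors(state):
    """Yield (next_state, step_cost) for sliding one pallet to a free neighbour."""
    occupied = set(state)
    for i, pos in enumerate(state):
        for new_pos in (pos - 1, pos + 1):
            if 0 <= new_pos < NUM_RACKS and new_pos not in occupied:
                moved = state[:i] + (new_pos,) + state[i + 1:]
                yield tuple(sorted(moved)), 1


def a_star(start, goal, heuristic):
    """Return (total_cost, path) of a cheapest move sequence, or None."""
    tie = count()  # tie-breaker so the heap never has to compare paths
    frontier = [(heuristic(start, goal), next(tie), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, _, cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return cost, path
        if cost > best_cost.get(state, float("inf")):
            continue  # stale queue entry; a cheaper route was found already
        for nxt, step in successors(state):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                priority = new_cost + heuristic(nxt, goal)
                heapq.heappush(
                    frontier, (priority, next(tie), new_cost, nxt, path + [nxt])
                )
    return None


def distance_heuristic(state, goal):
    """Lower bound: total distance between sorted current and goal positions."""
    return sum(abs(a - b) for a, b in zip(state, goal))


# Move two pallets from the far end of the aisle to the racks by the door.
total_cost, path = a_star((3, 4), (0, 1), distance_heuristic)
```

The search itself does not care what a state contains, so richer warehouse models only need a different `successors` function and heuristic, although the state space grows quickly.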
Closed 4 years ago.
I'm looking for a way to determine whether a picture is explicit or Safe For Work (SFW).
I am currently looking for an API that is capable of doing this, but so far I haven't had any success.
One of the ideas I had was to use the Google search API, provide a URL to a picture, and check whether or not it appears in the results when safeSearch is enabled, but this will fail for a picture that was added before the crawler got to it.
Alternatively, I'm looking for pointers regarding what to look for in an image to determine how SFW it is. Any suggestions regarding shapes, colors or patterns?
As promised, a SFW paper from Google researchers and a patent for your study procured from this blog entry.
One of my colleagues led the development of the porn classification technology at one of the largest web companies. I will share what he told me about the development of the filter.
The definition of what is explicit varies greatly among jurisdictions, so what is considered explicit in the US might not be in other parts of the world and vice versa. So models need to take the user's origin into account.
A purely image-based approach is almost impossible to use effectively at web scale. The feature space is very complex in terms of how humans judge what is explicit and what is not, and developing appropriate feature extraction technology for images proved to be exceedingly difficult.
Some of the most predictive features are the text on pages that link to the images. These are also among the easiest features to develop.
Building labeled training sets is very difficult, since classifying porn and other explicit content for 8 hours a day tends to take a toll on the labelers. Because of this the turnover is fairly high, with almost no one lasting a year.
Getting high accuracy from the classifiers is still very, very difficult. They worked on it with several PhDs and a very experienced team and still did not achieve the accuracy that you are probably looking for.
If you have a more constrained problem space you can probably achieve a higher accuracy. If you are using image features only the algorithm or model will probably not generalize well and will have a high false positive rate. Best of luck.
See papers:
Detection of Pornographic Digital Images
Jorge A. Marcial-Basilio, Gualberto Aguilar-Torres, Gabriel Sánchez-Pérez, L. Karina Toscano-Medina, and Héctor M. Pérez-Meana
Pornography Detection Using Support Vector Machine
Yu-Chun Lin (林語君) Hung-Wei Tseng (曾宏偉) Chiou-Shann Fuh
Image-Based Pornography Detection
Rigan Ap-apid
De La Salle University, Manila, Philippines
You can also take some hints from existing implementations e.g.:
"The Porn Detection Stick uses advanced image analyzing algorithms that categorize images as potentially harmful by identifying facial features, flesh tone colors, image back grounds, body part shapes, and more."
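As a very rough illustration of the "flesh tone colors" cue, the sketch below computes the fraction of pixels that pass a simple RGB skin-tone rule of thumb. The thresholds are an assumption of this sketch (a commonly cited heuristic, not taken from the papers or the product above), and the function names are made up. As noted earlier, an image-only feature like this is likely to generalize poorly and give many false positives, so treat it at most as one input among several.

```python
def is_skin_rgb(r, g, b):
    """Simple RGB rule of thumb for skin-like colours (assumed thresholds)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio(pixels):
    """Fraction of (r, g, b) pixels that pass the skin-tone rule."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_skin_rgb(r, g, b) for r, g, b in pixels) / len(pixels)


# Toy "image": two skin-like pixels, one grey, one blue.
sample_pixels = [(220, 170, 140), (30, 30, 30), (0, 0, 255), (200, 150, 120)]
ratio = skin_ratio(sample_pixels)
```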