I used Blender to create a model of an apple, like this:
The model in Blender
I exported the model, which gave me a .obj file and a .mtl file. I then copied those two files into the ARToolkit folder (examples/.../models) and changed the models.dat file by adding this:
models/Apple.obj
0.0 0.0 0.0
90.0 1.0 0.0 0.0
10.016 10.016 10.016
MARKER 1
But the model is displayed without its material, like this:
The rendering effect seen in the camera
What should I do to display the model with material?
You didn't provide a lot of information, but based on what I can see, you can try the following.
Sometimes the textures in the material are flipped, so you can try loading the obj with this function:
glmReadOBJ3(model0file, 0, 0,true); // context 0, don't read textures yet.
The last parameter forces the rendering to flip the texture when rendering which might help in your case.
Documentation for this function is here:
https://github.com/artoolkit/artoolkit5/blob/master/include/Eden/glm.h#L386
Let me know if that works for you.
I wonder why YOLO pictures need to have a bounding box.
Assume that we are using Darknet. Each image needs a corresponding .txt file with the same name as the image file. Each line inside the .txt file must have the format below. It's the same for all YOLO frameworks that use bounding boxes for labeling.
<object-class> <x> <y> <width> <height>
Where x, y, width, and height are relative to the image's width and height.
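To make the format concrete, here is a minimal sketch of how one label line could be produced from a pixel-space box. It assumes that x and y denote the box center (as Darknet expects) and that the box is given by its corner coordinates; the function name `to_yolo_line` is only illustrative and not part of any framework.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (corner coordinates) to a normalized YOLO label line.

    Assumption: x and y in the label are the box CENTER, relative to image size.
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 208x208 box centered in a 416x416 image, class 0
print(to_yolo_line(0, 104, 104, 312, 312, 416, 416))
# prints "0 0.500000 0.500000 0.500000 0.500000"
```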
For example, if we go to this page, press the YOLO Darknet TXT button, download the .zip file, and open the train folder, we can see these files:
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.jpg
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.txt
Where the .txt file looks like this
0 0.7055288461538461 0.6538461538461539 0.11658653846153846 0.4110576923076923
1 0.5913461538461539 0.3545673076923077 0.17307692307692307 0.6538461538461539
Every image has the size 416x416. This image looks like this:
My idea is that every image should contain exactly one class, and that the image should be taken with a camera like this.
The camera snapshot should be processed as follows:
Take the camera snapshot
Crop the snapshot to the desired size
Upscale it to a square 416x416 image
Like this:
And then the .txt file that corresponds to each image should look like this:
<object-class> 0 0 1 1
Question
Is this possible with e.g. Darknet or another framework that uses bounding boxes to label the classes?
Instead of letting the software (e.g. Darknet) scale the bounding boxes to 416x416 for every class object, I would do the cropping myself and set the .txt file to x = 0, y = 0, width = 1, height = 1 for every image that contains only one class object.
Is it possible to create a training set in that way and train with it?
A little disclaimer: I am not an expert on this. I am part of a project where we use Darknet, so I have spent some time experimenting.
If I understand it correctly, you want to train with cropped single-class images that have bounding boxes the size of the full image.
It is possible, and I am using something like that, but it is most likely not what you want.
Let me tell you about the problems and unexpected behaviour this method creates.
When you train with images that have full-image bounding boxes, YOLO cannot make proper detections, because during training it also learns the backgrounds and empty spaces of your dataset. More specifically, the objects in your training dataset have to appear in the same context as in your real-life usage. If you train it with images of dogs in the jungle, it won't do a good job of predicting dogs in a house.
If you are only going to use it for classification, you can still train it like this and it will classify fine, but the images you want to predict on should also look like your training dataset. So, looking at your example, if you train on images like the cropped dog picture, your model won't be able to classify the dog in the first image.
For a better example: in my case detection wasn't required. I am working with food images and I only predict the meal on the plate, so I trained with full-image bboxes since every food has one class. It classifies the food perfectly, but the bboxes are always predicted as the full image.
My understanding of the theory is this: if you feed the network only full-image bboxes, it learns that making the box as big as possible results in a lower error, so it optimizes that way. This wastes roughly half of the algorithm, but it works for me.
Also, your images don't need to be 416x416. Darknet resizes whatever size you give it to that size, and you can change the size in the cfg file.
I have code that creates full-size bboxes for all images in a directory if you want to try it quickly. (It overwrites existing annotations, so be careful.)
Finally, the boxes should look like this to be centered and full size. x and y are the center of the bbox, so they should be at the center (half) of the image.
<object-class> 0.5 0.5 1 1
```python
import os

from imagepreprocessing.darknet_functions import create_training_data_yolo, auto_annotation_by_random_points

main_dir = "datasets/my_dataset"

# auto annotating all images by their center points (x,y,w,h)
folders = sorted(os.listdir(main_dir))
for index, folder in enumerate(folders):
    # each sub-folder is one class; its index becomes the class id
    auto_annotation_by_random_points(os.path.join(main_dir, folder), index, annotation_points=((0.5,0.5), (0.5,0.5), (1.0,1.0), (1.0,1.0)))

# creating required files
create_training_data_yolo(main_dir)
```
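If you would rather not install an extra package, the same full-image annotation can be sketched with the standard library only. This is a rough equivalent under a few assumptions of mine: one sub-folder per class, the class id taken from the sorted folder order, and images recognized by the extensions listed in `IMAGE_EXTENSIONS`. The function name `write_full_image_labels` is made up for this example. Like the script above, it overwrites existing .txt annotations.

```python
import os

# Assumption: only these extensions are treated as images
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def write_full_image_labels(main_dir):
    """Write '<class> 0.5 0.5 1 1' next to every image under main_dir.

    Each sub-folder of main_dir is treated as one class. Returns the
    number of label files written. Existing .txt files are overwritten.
    """
    written = 0
    class_folders = sorted(
        d for d in os.listdir(main_dir)
        if os.path.isdir(os.path.join(main_dir, d))
    )
    for class_id, folder in enumerate(class_folders):
        folder_path = os.path.join(main_dir, folder)
        for name in sorted(os.listdir(folder_path)):
            stem, ext = os.path.splitext(name)
            if ext.lower() not in IMAGE_EXTENSIONS:
                continue
            label_path = os.path.join(folder_path, stem + ".txt")
            with open(label_path, "w") as f:
                # centered box covering the whole image
                f.write(f"{class_id} 0.5 0.5 1 1\n")
            written += 1
    return written
```

Calling `write_full_image_labels("datasets/my_dataset")` would then label every image in every class folder with a centered full-size box.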
I'm using AVMutableCompositionLayerInstruction and the setCropRectangleRamp function to create a moving crop effect.
When exporting using AVAssetExportSession I set the output and renderSize to match the crop dimensions.
However, the exported video doesn't seem to follow the moving crop; it just shows the center of the original video.
How do I get the encoder to encode the pixels inside the moving crop?
Figured this out. You need to use setTransformRamp(fromStart:toEnd:timeRange:) with corresponding parameters!
If I want to implement k = k0 + log2(√(w*h)/224) from Feature Pyramid Networks for Object Detection, where and in which file should I make the change?
Note, this formula is for ROI pooling. w and h are the width and height of the ROI, whereas k represents the level of the feature pyramid this ROI should be used on.
*Pointing me to the FasterRCNN meta_architecture file in object_detection would be helpful, but please tell me which method I should change.
Take a look at this document for a rough overview of the process. In a nutshell, you'll have to create a "FeatureExtractor" sub-class for your desired meta-architecture. For FasterRCNN, you can probably start with a copy of our Resnet101 Feature Extractor.
The short answer is that the change won't be trivial as we don't currently support cropping regions from multiple layers. Here is an outline of what would need to change if you would like to pursue this anyway:
Generating a new anchor set
Currently Faster RCNN uses a “GridAnchorGenerator” as the first_stage_anchor_generator - instead you will have to use a MultipleGridAnchorGenerator (same as we use in SSD pipeline).
You will have to add a 32^2 anchor box: for the scales field of the anchor generator, this basically means adding a .125
You will have to modify the code to generate and crop from multiple layers: to start, look for a function in the faster_rcnn_meta_arch file called "_extract_rpn_feature_maps", which is suggestively named, but currently returns just a single tensor! You will also have to add some logic to determine which layer to crop from based on the size of the proposal (Eqn 1 from the paper)
Finally, you will have to create a new feature extractor following the directions that Derek linked to.
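The level-selection logic mentioned in the outline (Eqn 1 from the paper) could look roughly like the sketch below. It is plain Python and not part of the object_detection code base. The floor operation, the default k0 = 4, and the clamping to pyramid levels 2 through 5 follow the FPN paper's setup and are assumptions here, since the question's formula does not state them. The function name `fpn_level` is hypothetical.

```python
import math


def fpn_level(w, h, k0=4, k_min=2, k_max=5, canonical_size=224):
    """Pick the pyramid level for an ROI of width w and height h.

    Implements k = floor(k0 + log2(sqrt(w * h) / 224)), clamped to
    [k_min, k_max]. The defaults are assumptions taken from the FPN paper.
    """
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical_size))
    return max(k_min, min(k_max, k))


print(fpn_level(224, 224))  # 4: an ROI of the canonical size maps to k0
print(fpn_level(112, 112))  # 3: half the scale drops one level
print(fpn_level(32, 32))    # 2: very small ROIs are clamped to the finest level
```

In the real pipeline this decision would have to be made per proposal, on tensors, before cropping from the matching feature map.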
I want to find multiple objects in a scene (the objects look the same, but may differ in scale and rotation, and I don't know in advance what the object to be detected will be). I have implemented the following idea, based on the feature detectors in OpenCV, which works:
detect and compute keypoints from the object
for i < max_objects_todetect; i++
1. detect and compute keypoints from the whole scene
2. match scene and object keypoints with Flannmatcher
3. use findHomography/Ransac to compute the bounding box of the first object (the object which has the most keypoints in the scene with multiple objects)
4. set the pixels in the scene that lie within the computed bounding box to 0 -> in the next loop cycle there are no keypoints left to detect for this object.
The problem with this implementation is that I need to compute the keypoints for the scene multiple times, which takes a lot of computing time (250ms). Does anyone have a better idea for detecting multiple objects?
Thanks Drian
Hello together, I tried ORB, which is indeed faster, and I will try AKAZE.
While testing ORB I encountered the following problem:
Changing the size of my picture doesn't affect the keypoints detected by SURF (it finds the same keypoints in the small and in the big picture, on the right in the linked picture), but it does affect the keypoints detected by ORB. In the small picture I'm not able to find these keypoints. I tried experimenting with the ORB parameters but couldn't make it work.
Picture: http://www.fotos-hochladen.net/view/bildermaf6d3zt.png
SURF:
cv::Ptr<cv::xfeatures2d::SURF> detector = cv::xfeatures2d::SURF::create(100);
ORB:
cv::Ptr<cv::ORB> detector = cv::ORB::create( 1500, 1.05f,16, 31, 0, 2, ORB::HARRIS_SCORE, 2, 10);
Do you know if, and how, it's possible to detect the same keypoints independent of the size of the pictures?
Greetings Drian
I have an .obj file storing a triangle mesh. I wish to record a color for each of the triangle faces. Is there a way to save this information in the .obj file so that software like MeshLab can identify and visualize it?
It's way too late to help you but since I came across the same problem, someone might have the same problem later.
Here is how I would do it :
As you can read on Wikipedia's page about .obj file format,
Some applications support vertex colors, by putting red, green and blue values after x y and z (this precludes specifying w). The color values range from 0 to 1.
This method is now widely supported. I figured out that giving the vertices the proper colors will make most software (including MeshLab) automatically color the faces with them. This is advantageous when you want to store everything in an OBJ file.
v 0.0 0.0 0.0 1.0 0.0 0.0
v 0.0 0.5 0.5 1.0 0.0 0.0
v 0.5 0.5 0.0 1.0 0.0 0.0
f 3//3 2//2 1//1
Will make a red triangle.
Obviously, it will look way better if you use an MTL file. But this is quick to make and useful for testing your code when you do 3D scanning. Also, I did not try this with indexed vertices, so it might not work the same way.
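To get one color per face with this trick, one option is to duplicate the vertices so that every triangle has its own three vertices, all carrying that face's color. Here is a small Python sketch of that idea. The function name `colored_obj_text` and the input layout (0-based vertex indices, RGB values in the range 0 to 1) are my own choices for illustration, and it relies on the viewer supporting the vertex color extension described above.

```python
def colored_obj_text(vertices, faces, face_colors):
    """Build OBJ text where each triangle gets its own colored vertices.

    vertices:    list of (x, y, z) tuples
    faces:       list of (i, j, k) tuples, 0-based indices into vertices
    face_colors: list of (r, g, b) tuples in [0, 1], one per face
    """
    lines = []
    # Duplicate the vertices per face so that faces never share a color
    for face, (r, g, b) in zip(faces, face_colors):
        for vi in face:
            x, y, z = vertices[vi]
            lines.append(f"v {x} {y} {z} {r} {g} {b}")
    # OBJ indices are 1-based; face n uses vertices 3n+1 .. 3n+3
    for n in range(len(faces)):
        base = 3 * n + 1
        lines.append(f"f {base} {base + 1} {base + 2}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    verts = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.5), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5)]
    tris = [(2, 1, 0), (3, 2, 0)]
    colors = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]  # one red face, one blue face
    with open("colored.obj", "w") as f:
        f.write(colored_obj_text(verts, tris, colors))
```

The duplication makes the file larger than an indexed mesh, but it keeps everything in a single OBJ file without an MTL file.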