In the official Panda3D guide for Blender model conversion, under the "Why do my colors look different in Panda3D?" header, it says you need to enable sRGB mode for colors to look the way they do in Blender:
Set your textures to use the Texture.F_srgb or Texture.F_srgb_alpha texture format, which automatically linearizes the colors before they are used in the rendering process.
How exactly do I do that? I load my model as an Actor; I tried getting at the textures with findTexture() but had little to no success.
self.actor = Actor(model_path, animation_dict)
# how to set Texture.F_srgb?
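For reference, here is a minimal, untested sketch of the kind of thing I'm after, assuming the loaded textures can be reached through findAllTextures() on the Actor:
from direct.actor.Actor import Actor
from panda3d.core import Texture

self.actor = Actor(model_path, animation_dict)

# Walk every texture the Actor references and switch it to an sRGB format,
# so the colors get linearized before rendering.
for tex in self.actor.findAllTextures():
    if tex.getNumComponents() == 4:
        tex.setFormat(Texture.F_srgb_alpha)  # texture has an alpha channel
    else:
        tex.setFormat(Texture.F_srgb)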
I wonder why YOLO training images need to have a bounding box.
Assume that we are using Darknet. Each image needs to have a corresponding .txt file with the same name as the image file. Inside the .txt file, each line needs to have the form below. It's the same for all YOLO frameworks that use bounding boxes for labeling.
<object-class> <x> <y> <width> <height>
Where x, y, width, and height are relative to the image's width and height.
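For concreteness, this is how I understand a pixel box maps to such a line (a small sketch with made-up pixel coordinates):
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    # Convert an absolute pixel box into a normalized YOLO annotation line
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center} {y_center} {width} {height}"

# A 416x416 image with one object of class 0 (coordinates are invented)
print(to_yolo_line(0, 269, 187, 318, 358, 416, 416))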
For example, if we go to this page, press the YOLO Darknet TXT button, download the .zip file and then go to the train folder, we can see these files:
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.jpg
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.txt
Where the .txt file looks like this
0 0.7055288461538461 0.6538461538461539 0.11658653846153846 0.4110576923076923
1 0.5913461538461539 0.3545673076923077 0.17307692307692307 0.6538461538461539
Every image has the size 416x416. This image looks like this:
My idea is that every image should have one class, and only one class, and the image should be taken with a camera like this.
This camera snapshot should then be processed as follows:
Take a camera snapshot
Crop the snapshot to the desired size
Upscale it to a 416x416 square
Like this:
And then the .txt file that corresponds to each image should look like this:
<object-class> 0 0 1 1
Question
Is this possible for e.g. Darknet or another framework that uses bounding boxes for labeling the classes?
Instead of letting the software, e.g. Darknet, scale the bounding boxes to 416x416 for every class object, I would do it myself and set the .txt file to x = 0, y = 0, width = 1, height = 1 for every image that contains only one class object.
Is it possible for me to create a training set in that way and train with it?
A little disclaimer: I have to say that I am not an expert on this. I am part of a project where we are using Darknet, so I have had some time to experiment.
So if I understand it right, you want to train with cropped single-class images that have full-image-sized bounding boxes.
It is possible to do, and I am using something like that, but it is most likely not what you want.
Let me tell you about the problems and unexpected behaviour this method creates.
When you train with images that have full-image-sized bounding boxes, YOLO cannot make proper detections, because during training it also learns the backgrounds and empty spaces of your dataset. More specifically, the objects in your training dataset have to be in the same context as your real-life usage. If you train it with dog images taken in the jungle, it won't do a good job of predicting dogs in a house.
If you are only going to use it for classification, you can still train it like this and it will classify fine, but the images you are going to predict on should also look like your training dataset. Looking at your example: if you train on images like the cropped dog picture, your model won't be able to classify the dog in the first image.
For a better example, in my case detection wasn't required. I am working with food images and I only predict the meal on the plate, so I trained with full-image-sized bboxes, since every food item is a single class. It classifies the food perfectly, but the bboxes are always predicted as the full image.
My understanding of the theory here is that if you feed the network only full-image bboxes, it learns that making the box as big as possible results in a lower error rate, so it optimizes that way. This is kind of wasting half of the algorithm, but it works for me.
Also, your images don't need to be 416x416; whatever size you give it, it resizes them to that, and you can also change the input size in the cfg file.
I have some code that makes full-sized bboxes for all images in a directory, if you want to try it quickly (it overwrites existing annotations, so be careful).
Finally, the boxes should look like this for them to be centered and full-size; x and y are the center of the bbox, so they should be at the center (half) of the image.
<object-class> 0.5 0.5 1 1
import os

from imagepreprocessing.darknet_functions import create_training_data_yolo, auto_annotation_by_random_points

main_dir = "datasets/my_dataset"

# auto annotating all images by their center points (x,y,w,h)
folders = sorted(os.listdir(main_dir))
for index, folder in enumerate(folders):
    auto_annotation_by_random_points(os.path.join(main_dir, folder), index,
                                     annotation_points=((0.5, 0.5), (0.5, 0.5), (1.0, 1.0), (1.0, 1.0)))

# creating required files
create_training_data_yolo(main_dir)
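If you don't want to pull in that library, here is a minimal sketch of the same idea in plain Python; it assumes one sub-folder per class and writes the centered full-image box from above for every image (like the library call, it overwrites existing .txt annotations):
import os

main_dir = "datasets/my_dataset"  # one sub-folder per class, as above

for class_id, folder in enumerate(sorted(os.listdir(main_dir))):
    folder_path = os.path.join(main_dir, folder)
    for name in os.listdir(folder_path):
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        txt_path = os.path.join(folder_path, os.path.splitext(name)[0] + ".txt")
        # Centered, full-image-sized box: x=0.5, y=0.5, w=1, h=1
        with open(txt_path, "w") as f:
            f.write(f"{class_id} 0.5 0.5 1 1\n")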
I'm trying to solve some simple captchas using OpenCV and pytesseract. Some captcha samples are:
I tried to remove the noisy dots with some filters:
import cv2
import numpy as np
import pytesseract
img = cv2.imread(image_path)
_, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, np.ones((4, 4), np.uint8), iterations=1)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imwrite('res.png', img)
print(pytesseract.image_to_string('res.png'))
The resulting transformed images are:
Unfortunately, pytesseract only recognizes the first captcha correctly. Is there any better transformation?
Final Update:
As @Neil suggested, I tried to remove the noise by detecting connected pixels. To find connected pixels, I found a function named connectedComponentsWithStats, which detects connected pixels and assigns each group (component) a label. By finding the connected components and removing the ones with a small number of pixels, I managed to get better overall detection accuracy with pytesseract.
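Roughly, the approach looks like this (a simplified sketch, not my exact code; the minimum component area is something to tune):
import cv2
import numpy as np
import pytesseract

img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
# Invert so the text becomes white foreground on a black background
_, bw = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)

num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(bw, connectivity=8)

min_area = 20  # components smaller than this are treated as noise
cleaned = np.zeros_like(bw)
for label in range(1, num_labels):  # label 0 is the background
    if stats[label, cv2.CC_STAT_AREA] >= min_area:
        cleaned[labels == label] = 255

# Back to black text on white before running tesseract
cv2.imwrite('res.png', cv2.bitwise_not(cleaned))
print(pytesseract.image_to_string('res.png'))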
And here are the new resulting images:
I've taken a much more direct approach to filtering ink splotches from PDF documents. I won't share the whole thing since it's a lot of code, but here is the general strategy I adopted (a rough sketch follows the list):
Use Python Pillow library to get an image object where you can manipulate pixels directly.
Binarize the image.
Find all connected pixels and how many pixels are in each group of connected pixels. You can do this using the minesweeper flood-fill algorithm, which is easy to search for.
Set some threshold value of pixels that all legitimate letters are expected to have. This will be dependent on your image resolution.
Replace all black pixels in groups below the threshold with white pixels.
Convert back to image.
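To make that concrete, here is a rough sketch of steps 2-5 (not my actual code, just the general idea: a BFS flood fill over a binarized Pillow image; the threshold values are placeholders):
from collections import deque
from PIL import Image

def remove_small_blobs(path, black_level=128, min_pixels=20):
    # Binarize, group connected black pixels, and whiten groups below the threshold
    img = Image.open(path).convert("L")
    w, h = img.size
    px = img.load()
    for y in range(h):
        for x in range(w):
            px[x, y] = 0 if px[x, y] < black_level else 255

    seen = set()
    for y in range(h):
        for x in range(w):
            if px[x, y] == 0 and (x, y) not in seen:
                # "Minesweeper"-style BFS over 4-connected black pixels
                group, queue = [], deque([(x, y)])
                seen.add((x, y))
                while queue:
                    cx, cy = queue.popleft()
                    group.append((cx, cy))
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h and px[nx, ny] == 0 and (nx, ny) not in seen:
                            seen.add((nx, ny))
                            queue.append((nx, ny))
                if len(group) < min_pixels:
                    for gx, gy in group:
                        px[gx, gy] = 255  # group too small: treat it as an ink splotch
    return img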
Your final output image is too blurry. To enhance the performance of pytesseract you need to sharpen it.
Sharpening is not as easy as blurring, but there exist a few code snippets / tutorials (e.g. http://datahacker.rs/004-how-to-smooth-and-sharpen-an-image-in-opencv/).
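For example, a standard 3x3 sharpening convolution (not taken from that tutorial specifically, just a common starting kernel to experiment with):
import cv2
import numpy as np

img = cv2.imread('res.png')
# Boost the center pixel against its four neighbours
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]], dtype=np.float32)
sharpened = cv2.filter2D(img, -1, kernel)
cv2.imwrite('res_sharp.png', sharpened)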
Rather than chaining blurs, blur once using either a Gaussian or a median blur and experiment with the parameters to get the amount of blur you need. Perhaps try one method after the other, but there is no reason to chain blurs of the same method.
There is an OCR example in Python that detects the characters: save several images, apply the filter and train an SVM algorithm. That may help you. I trained an algorithm with even just a few images and the results were acceptable. Check this link.
Wish you luck
I know the post is a bit old, but I suggest you try this library I developed some time ago. If you have a set of labelled captchas, that service would suit you. Take a look: https://github.com/punkerpunker/captcha_solver
In the README there is a section "Train model on external data" that you might be interested in.
So I am trying to make a graph mapping coral cover degradation. From the attached graph you can see I am almost there; I am just trying to add custom labels to my legend and a custom theme to change up the colours.
This code produces a graph but seems to ignore my custom theme and labels.
Thanks for the help!
library(ggplot2)
library(ggsci)      # scale_fill_npg()
library(ggthemes)   # theme_base()

ggplot(CCR, aes(x = as.factor(Year), y = mean, group = Site, color = Site)) +
  geom_point(size = 3) +
  labs(y = "Hard Coral Cover %", x = "Year") +
  geom_line(size = 2) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd), width = .1, colour = "black") +
  scale_y_continuous(limits = c(0, 110), expand = c(0, 0)) +
  theme(axis.title.y = element_text(face = "bold"), axis.title.x = element_text(face = "bold"), legend.title = element_text(face = "bold")) +
  scale_fill_npg(labels = c("Bouy 3", "Sampela 1", "Pak Kasims", "Kaledupa 1", "Kaledupa Double Spur", "Ridge 1"),
                 guide = guide_legend(title = "Site")) +
  theme_base()
Graph produced from above code
I've been struggling for some time to find a way in MeshLab to include or transfer UVs onto a Poisson-reconstructed model from source meshes. I will try to explain more of what I'm trying to accomplish below.
My source meshes have UVs along with texture data. I need to build a fused model and include the texture data. This is for facial-expression scan reconstruction in a production pipeline which ultimately builds a facial rig for animation. Our source scan data includes marker information which we use to register and build a fused scan model, which is then used to generate a retopologized mesh for blendshapes.
Previously, we were using David3D. http://www.david-3d.com/en/support/downloads
David 3D used Poisson surface reconstruction to create a fused model. The fused model it created brought along the UVs and optimized the source textures into one UV tile. Below I'll post a picture of the result that I'm looking to recreate in MeshLab.
I need to find this solution in MeshLab so that I can build tools to help automate this process. David3D version 5 does not have a development kit to program around.
Is it possible in MeshLab to apply the UVs from the regions of the source meshes onto the Poisson model? Could I use a filter to transfer them? Reproject them?
Or is there another reconstruction method/process within MeshLab that will keep the UVs?
Here is an image of what the resulting UV parametrization looks like from David; the UVs are white on the left half of the image.
[Image: David3D UV layout result]
Thank you,
Dan
No, in MeshLab there is no direct way to transfer UV mapping between two layers.
This is because UV transfer is not, in the general case, a trivial task. It is not simply a matter of assigning to the new surface the "closest" UV of the original mesh: this would not work at UV discontinuities, which are present in the example you linked. Additionally, the two meshes should be almost coincident, otherwise you would also have problems defining the "closest" UV.
There are a couple of ways to do it, but they require manual work and a re-sampling of the texture:
Create a UV mapping of the re-meshed model using whatever tool you may have, then resample the existing texture onto the new parametrization using "transfer: vertex attributes to Texture (1 or 2 meshes)", with texture color as the source.
Load the original mesh and, using the screenshot function, create "virtual" photos of the model (turn off illumination and do NOT use ortho views), adding them as raster layers until the model surface has been fully covered. Then load the new model, which should be in the same space, and texture-map it using "parametrization + texturing" with those registered images.
In MeshLab it is also possible to create a new texture from the original images, if you have a way to import the registered cameras...
TL;DR: UV coords to color channels → Vertex Attribute Transfer → Color channels back to UV coords
I have had very good results kludging it through the color channels, like this (say you are transferring from layer A to layer B):
Make sure A and B are roughly aligned with each other (you can use the ICP filter if needed).
Select layer A, then:
Texture → Convert Per Wedge UV to Per Vertex UV (if you've got wedge coords)
Color Creation → Per Vertex Color Function, and transfer the tex coords to the color channels (assuming UV range 0-1, you'll want to tweak these if your range is larger):
func r = 255.0 * vtu
func g = 255.0 * vtv
func b = 0
Sampling → Vertex Attribute Transfer, and use this to transfer the vertex colors (which now hold texture coordinates) from layer A to layer B.
source mesh = layer A
target mesh = layer B
check Transfer Color
set distance large enough to not miss any spots
Now select layer B, which contains the mapped vertex colors, and do the opposite of what you did for A:
Texture → Per Vertex Texture Function
func u = r / 255.0
func v = g / 255.0
Texture → Convert Per Vertex UV to Per Wedge UV
And that's it.
The results aren't going to be perfect, but in practice I often find them sufficient. In particular:
If the texture is not continuously mapped to layer A (e.g. maybe you've got patches of image mapped to certain areas, etc.), it's very possible for the attribute transfer to B (especially when upsampling) to have some vertices be interpolated across patch boundaries, which will probably lead to visual artifacts along patch boundaries.
UV coords may be quantized by conversion to a color channel and back. (You could maybe eliminate this by stretching U out over all three color channels, then transferring U, then repeating for V -- never tried it though.)
That said, there's a lot of cases it works in.
I may or may not add images / video to this post another day.
PS Meshlab is pretty straightforward to build from source; it might be possible to add a UV coordinate option to the Vertex Attribute Transfer filter. But, to make it more useful, you'd want to make sure that you didn't interpolate across boundary edges in the mapped UV projection. Definitely a project I'd like to work on some day... in theory. If that ever happens I'll post a link here.
How can I do basic face alignment on a 2-dimensional image, with the assumption that I have the positions/coordinates of the mouth and eyes?
Is there any algorithm that I could implement to correct the face alignment on images?
Face (or image) alignment refers to aligning one image (or face in your case) with respect to another (or a reference image/face). It is also referred to as image registration. You can do that using either appearance (intensity-based registration) or key-point locations (feature-based registration). The second category stems from image motion models where one image is considered a displaced version of the other.
In your case the landmark locations (3 points for the eyes and mouth?) provide a good reference set for straightforward feature-based registration. Assuming you have the locations of a set of corresponding points in both 2D images, x_1 and x_2, you can estimate a similarity transform (rotation, translation, scaling), i.e. a planar 2D transform S that maps x_1 to x_2. You can additionally add reflection to that, though for faces this will most likely be unnecessary.
Estimation can be done by forming the normal equations and solving a linear least-squares (LS) problem for the x_1 = S x_2 system, i.e. by linear regression. For the four unknown parameters (1 rotation, 2 translation, 1 scaling) you need at least 2 point correspondences (2 equations each), so your 3 landmarks give an over-determined system. The solution to the LS problem can be obtained via the Direct Linear Transform (e.g. by applying an SVD or a matrix pseudo-inverse). For cases with a sufficiently large number of (e.g. automatically detected) reference points, a RANSAC-type method can be used for point filtering and outlier removal, though that is not your case here.
After estimating S, apply image warping on the second image to get the transformed grid (pixel) coordinates of the entire image 2. The transform will change pixel locations but not their appearance. Unavoidably some of the transformed regions of image 2 will lie outside the grid of image 1, and you can decide on the values for those null locations (e.g. 0, NaN etc.).
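As a concrete sketch in Python/OpenCV (the landmark coordinates here are invented; cv2.estimateAffinePartial2D fits exactly this similarity model and is an alternative to setting up the least-squares system by hand):
import cv2
import numpy as np

# Hypothetical landmark coordinates (left eye, right eye, mouth) in pixels
pts_ref = np.float32([[120, 140], [200, 140], [160, 220]])  # reference face (image 1)
pts_src = np.float32([[130, 160], [212, 148], [176, 232]])  # face to align (image 2)

# Least-squares similarity transform (rotation, uniform scale, translation)
M, inliers = cv2.estimateAffinePartial2D(pts_src, pts_ref)

src = cv2.imread("face2.jpg")
out_w, out_h = 256, 256  # grid of the reference image; pick to suit
aligned = cv2.warpAffine(src, M, (out_w, out_h),
                         flags=cv2.INTER_LINEAR,
                         borderValue=(0, 0, 0))  # value for out-of-grid pixels
cv2.imwrite("face2_aligned.jpg", aligned)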
For more details: R. Szeliski, "Image Alignment and Stitching: A Tutorial" (Section 4.3 "Geometric Registration")
In OpenCV see: Geometric Image Transformations, e.g. cv::getRotationMatrix2D cv::getAffineTransform and cv::warpAffine. Note though that you should estimate and apply a similarity transform (special case of an affine) in order to preserve angles and shapes.
For the face there is a lot of variability in the feature points, so it won't be possible to perfectly fit all feature points with just an affine transform. The only way to align all the points perfectly is to warp the image given the points. Basically, you can triangulate the image given the points and apply an affine warp to each triangle to get a warped image in which all the points are aligned.
Face alignment can be handled based on just the eye positions.
OpenCV, Dlib and MTCNN offer ways to detect faces and eyes. Besides, deepface is a Python-based framework that wraps those methods and offers an out-of-the-box detection and alignment function.
The detectFace function applies detection and then alignment in the background.
#!pip install deepface
from deepface import DeepFace
backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
DeepFace.detectFace("img.jpg", detector_backend = backends[0])
Alternatively, you can apply detection and alignment manually:
import matplotlib.pyplot as plt
from deepface.commons import functions
img = functions.load_image("img.jpg")
backends = ['opencv', 'ssd', 'dlib', 'mtcnn']
detected_face = functions.detect_face(img = img, detector_backend = backends[3])
plt.imshow(detected_face)
aligned_face = functions.align_face(img = img, detector_backend = backends[3])
plt.imshow(aligned_face)
processed_img = functions.detect_face(img = aligned_face, detector_backend = backends[3])
plt.imshow(processed_img)
There's a section Aligning Face Images in OpenCV's Face Recognition guide:
http://docs.opencv.org/trunk/modules/contrib/doc/facerec/facerec_tutorial.html#aligning-face-images
The script aligns given images at the eyes. It's written in Python, but should be easy to translate to other languages. I know of a C# implementation by Sorin Miron:
http://code.google.com/p/stereo-face-recognition/