Is there a way in Accord.NET to match SURF features? - accord.net

I have detected features like this:
Grayscale GS = new Grayscale(10, 10, 10);
GS.Apply(Image1);
GS.Apply(Image2);
List<SpeededUpRobustFeaturePoint> Image1Features;
List<SpeededUpRobustFeaturePoint> Image2Features;
Image1Features = Detector.Transform(Image1).ToList();
Image2Features = Detector.Transform(Image2).ToList();
But how do I match them? Do I write my own implementation of desired metric or I am missing some Accord.NET functionality for featrue matching?

There is an example of image stitching using SURF features, they are using KNearestNeighborMatching.

Related

Simple Captcha Solving

I'm trying to solve some simple captcha using OpenCV and pytesseract. Some of captcha samples are:
I tried to the remove the noisy dots with some filters:
import cv2
import numpy as np
import pytesseract
img = cv2.imread(image_path)
_, img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, np.ones((4, 4), np.uint8), iterations=1)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.medianBlur(img, 3)
img = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imwrite('res.png', img)
print(pytesseract.image_to_string('res.png'))
Resulting tranformed images are:
Unfortunately pytesseract just recognizes first captcha correctly. Any other better transformation?
Final Update:
As #Neil suggested, I tried to remove noise by detecting connected pixels. To find connected pixels, I found a function named connectedComponentsWithStats, whichs detect connected pixels and assigns group (component) a label. By finding connected components and removing the ones with small number of pixels, I managed to get better overall detection accuracy with pytesseract.
And here are the new resulting images:
I've taken a much more direct approach to filtering ink splotches from pdf documents. I won't share the whole thing it's a lot of code, but here is the general strategy I adopted:
Use Python Pillow library to get an image object where you can manipulate pixels directly.
Binarize the image.
Find all connected pixels and how many pixels are in each group of connected pixels. You can do this using the minesweeper algorithm. Which is easy to search for.
Set some threshold value of pixels that all legitimate letters are expected to have. This will be dependent on your image resolution.
replace all black pixels in groups below the threshold with white pixels.
Convert back to image.
Your final output image is too blurry. To enhance the performance of pytesseract you need to sharpen it.
Sharpening is not as easy as blurring, but there exist a few code snippets / tutorials (e.g. http://datahacker.rs/004-how-to-smooth-and-sharpen-an-image-in-opencv/).
Rather than chaining blurs, blur once either using Gaussian or Median Blur, experiment with parameters to get the blur amount you need, perhaps try one method after the other but there is no reason to chain blurs of the same method.
There is an OCR example in python that detect the characters. Save several images and apply the filter and train a SVM algorithm. that may help you. I did trained a algorithm with even few Images but the results were acceptable. Check this link.
Wish you luck
I know the post is a bit old but I suggest you to try this library I've developed some time ago. If you have a set of labelled captchas that service would fit you. Take a look: https://github.com/punkerpunker/captcha_solver
In README there is a section "Train model on external data" that you might be interested in.

Using machine learning to remove background in image of hand-written signature

I am new to machine learning.
I want to prepare a document with a signature at the bottom of it.
For this purpose I am taking a photo of the user's signature for placement in the document.
How can I using machine learning extract only the signature part from the image and place it on the document?
Input example:
Output expected in gif format:
Extract the green image plane. Then take the complementary of the gray value of every pixel as the transparency coefficient. Then you can perform compositing to the destination.
https://en.wikipedia.org/wiki/Alpha_compositing
A simple image-processing technique using OpenCV should work. The idea is to obtain a binary image then bitwise-and the image to remove the non-signature details. Here's the results:
Input image
Binary image
Result
Code
import cv2
# Load image, convert to grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Bitwise-and and color background white
result = cv2.bitwise_and(image, image, mask=thresh)
result[thresh==0] = [255,255,255]
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()
Please do research before posting questions like these. A simple google search of "extract signature from image python" gave so many results.
Git Repo
Git Repo
Stack Overflow
There are many other such alternatives. Please have a look and try a few approaches.
If you still have some questions or doubts, then post the approach you have taken and a discussion is warranted.

Using TFX for designing image piplines

When reading the documentation for TFX, especially in the parts related to pre-processing of the data, I would think the pipeline design is more appropiate for categorical features.
I wanted to know whether TFX could also be used for pipelines involving images.
Yes, TFX could also be used for pipelines involving images.
Especially, in the parts related to pre-processing the data, as per my knowledge, there are no in built functions in Tensorflow Transform.
But the Transformations can be made using Tensorflow Ops. For example, Image Augmentation can be done using tf.image, and so on.
Sample code for Transformation of Images, i.e., converting an image from Color to Grey Scale, by dividing the value of each pixel by 255, using Tensorflow Transform is shown below:
def preprocessing_fn(inputs):
"""Preprocess input columns into transformed columns."""
# Since we are modifying some features and leaving others unchanged, we
# start by setting `outputs` to a copy of `inputs.
outputs = inputs.copy()
# Convert the Image from Color to Grey Scale.
# NUMERIC_FEATURE_KEYS is the names of Columns of Values of Pixels
for key in NUMERIC_FEATURE_KEYS:
outputs[key] = tf.divide(outputs[key], 255)
outputs[LABEL_KEY] = outputs[LABEL_KEY]
return outputs

procedural mesh creation with blender script

What is a clean way to create a procedural mesh in blender with a script? Ideally I'd like to create an empty mesh and start adding vertices and triangles, like this:
mesh = create_mesh()
a = mesh.add_vertex([0, 0, 0])
b = mesh.add_vertex([1, 0, 0])
c = mesh.add_vertex([1, 1, 0])
t = mesh.add_triangle(a, b, c)
Blenders bmesh module is designed for creating and editing mesh data, although you can also turn python arrays into mesh data.
You can find some examples included with blender, all addons are made using python, one helpful addon would be add_mesh_extra_objects which has several examples of creating different types of objects.
There is a blender specific stackexchange site where you will get more help with blender specific scripting. You should also find some existing examples there.

Feature Pyramid Network with tensorflow/models/object_detection

If I want to implement k = k0 + log2(√(w*h)/224) in Feature Pyramid Networks for Object Detection, where and which file should I change?
Note, this formula is for ROI pooling. W and H are the width and height of ROI, whereas k represents the level of the feature pyramid this ROI should be used on.
*saying the FasterRCNN meta_architecture file of in object_detection might be helpful, but please inform me which method I can change.
Take a look at this document for a rough overview of the process. In a nutshell, you'll have to create a "FeatureExtractor" sub-class for you desired meta-architecture. For FasterRCNN, you can probably start with a copy of our Resnet101 Feature Extractor as a starting point.
The short answer is that the change won't be trivial as we don't currently support cropping regions from multiple layers. Here is an outline of what would need to change if you would like to pursue this anyway:
Generating a new anchor set
Currently Faster RCNN uses a “GridAnchorGenerator” as the first_stage_anchor_generator - instead you will have to use a MultipleGridAnchorGenerator (same as we use in SSD pipeline).
You will have to use a 32^2 anchor box -> for the scales field of the anchor generator, basically you will have to add a .125
You will have to modify the code to generate and crop from multiple layers: to start, look for a function in the faster_rcnn_meta_arch file called "_extract_rpn_feature_maps", which is suggestively named, but currently returns just a single tensor! You will also have to add some logic to determine which layer to crop from based on the size of the proposal (Eqn 1 from the paper)
You will have to finally create a new feature extractor following the directions that Derek linked to.