Loading labelled images in ITK-SNAP for mesh rendering

I am trying to get a mesh of the cerebellum. The issue is that the MRI volume does not have high enough resolution to reliably identify the different substructures, so what I am doing is parsing SVG files of delineations of histological staining slides and converting them into filled masks, similar to what you would get after using the polygon tool on one slice of the MRI volume. I do this for around 20 slices.
Can I load a labelled volume into ITK-SNAP where each voxel is labelled with an ID corresponding to some structure, and each structure has its own RGB colour code, similar to the label import feature? When I load this volume into ITK-SNAP, will it recognize the labelled voxels and allow me to use the label interpolation feature to interpolate these labels to unlabelled slices and then export a mesh? Right now what I'm doing is:
import nibabel as nib
import numpy as np
d3 = fixed_3d.astype("int16")
d3[210,:,:][ind] = 1 # ind is the coords inside my structure
new_image = nib.Nifti1Image(d3, affine=np.eye(4))
nib.save(new_image, "vol.nii.gz")
fixed_3d is my atlas volume. I then load this in ITK-SNAP and also do a label import. My label file looks like:
IDX -R- -G- -B- -A-- VIS MSH LABEL
0 0 0 0 0.00 0 0 "Clear"
1 48 126 110 1.00 1 1 "arb"
But when I click on export as surface mesh I get the message "missing mesh for selected label".

I think you are missing a step: registering the histology slides to the MR image. Otherwise your workflow looks good.
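One thing worth double-checking in the snippet above: the segmentation is saved with an identity affine, so it may not line up with the atlas in ITK-SNAP, and it has to be loaded as a segmentation (not as another image layer) for label interpolation and mesh export to apply to it. A minimal sketch of what I would try, assuming the atlas is available on disk as atlas.nii.gz (the filename is just an assumption for illustration):
import nibabel as nib
import numpy as np

# Reuse the atlas geometry so the labels overlay the main image correctly
atlas = nib.load("atlas.nii.gz")            # hypothetical path to the atlas / main image
seg = np.zeros(atlas.shape, dtype=np.int16)

# Paint structure 1 on one slice, as in the question
# seg[210, :, :][ind] = 1                   # ind = coords inside the structure

seg_img = nib.Nifti1Image(seg, affine=atlas.affine, header=atlas.header)
seg_img.set_data_dtype(np.int16)            # keep an integer label type on disk
nib.save(seg_img, "vol.nii.gz")

# In ITK-SNAP: open the atlas as the main image, then load vol.nii.gz as a
# segmentation (if I remember the menus correctly, Segmentation > Open Segmentation),
# and import the label description file. Label interpolation and mesh export operate
# on the active segmentation layer, so if the labels were loaded as a plain image layer
# that could explain the "missing mesh for selected label" message.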

Related

Blender changes the Z-axis value when exporting to gltf 2.0

I created a simple triangle in Blender with three vertices.
The picture shows the coordinates in Blender.
I export the model with default settings, including "+Y up"
However, here are the vertices that the .bin file contains.
vertices:
-1 0 2
1 0 1.5
0 0 -3
indices
0 1 2
As can be seen, the Z-axis values have changed sign.
Why so?
My guess:
Blender, as far as I understand, uses a left-handed coordinate system, and glTF uses a right-handed coordinate system; maybe that's why it changes the Z-axis.
But the glTF specification says: "glTF defines +Y as up, +Z as forward, and -X as right"; why then does the X axis not change? After all, in Blender +X is right.
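For what it's worth, both Blender (Z-up) and glTF (Y-up) are right-handed; the exporter's "+Y up" option conventionally applies an axis conversion, not a handedness flip. A small sketch of that mapping (the usual convention, not taken from the exporter's source):
def blender_to_gltf(v):
    # Z-up to Y-up: effectively a -90 degree rotation about X, (x, y, z) -> (x, z, -y).
    # X is unchanged by this mapping, while Blender's Y lands on glTF's Z with a sign flip.
    x, y, z = v
    return (x, z, -y)

# Hypothetical Blender coordinates, just to illustrate the sign flip on the third axis
print(blender_to_gltf((-1.0, -2.0, 0.0)))   # (-1.0, 0.0, 2.0)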

Images rotated when added to PDF in itext7

I'm using the following extension method I built on top of itext7's com.itextpdf.layout.Document type to apply images to PDF documents in my application:
fun Document.writeImage(imageStream: InputStream, page: Int, x: Float, y: Float, width: Float, height: Float) {
    val imageData = ImageDataFactory.create(imageStream.readBytes())
    val image = Image(imageData)
    val pageHeight = pdfDocument.getPage(page).pageSize.height
    image.scaleAbsolute(width, height)
    val lowerLeftX = x
    val lowerLeftY = pageHeight - y - image.imageScaledHeight
    image.setFixedPosition(page, lowerLeftX, lowerLeftY)
    add(image)
}
Overall, this works -- but with one exception! I've encountered a subset of documents where the images are placed as if the document origin is rotated 90 degrees, even though the content of the document is presented properly oriented underneath.
Here is a redacted copy of one of the PDFs I'm experiencing this issue with. I'm wondering if anyone would be able to tell me why itext7 is having difficulties writing to this document, and what I can do to fix it -- or alternatively, if it's a potential bug in the higher level functionality of com.itextpdf.layout in itext7?
Some Additional Notes
I'm aware that drawing on a PDF works via a series of instructions concatenated to the PDF. The code above works on other PDFs we've had issues with in the past, so com.itextpdf.layout.Document does appear to be normalizing the coordinate space prior to drawing. Thus, the issue I describe above seems to be going undetected by itext?
The rotation metadata in the PDF that itext7 reports from a "good" PDF without this issue seems to be the same as the rotation metadata in PDFs like the one I've linked above. This means I can't perform some kind of brute-force fix through detection.
I would love any solution to not require me to flatten the PDF through any form of broad operation.
I can only talk about the document you've shared.
It contains 4 pages.
The /Rotate property of the first page is 0; for the other pages it is 270 (which defines a 90 degree counterclockwise rotation).
iText indeed tries to normalize the coordinate space for each page.
That's why, when you add an image to pages 2-4 of the document, it is rotated by 270 degrees (90 degrees counterclockwise).
... Even though the content of the document is presented properly oriented underneath.
The content of pages 2-4 looks like:
q
0 -612 792 0 0 612 cm
/Im0 Do
Q
This is an image with a transformation applied.
0 -612 792 0 0 612 cm represents the composite transformation matrix.
From ISO 32000
A transformation matrix in PDF shall be specified by six numbers,
usually in the form of an array containing six elements. In its most
general form, this array is denoted [a b c d e f]; it can represent
any linear transformation from one coordinate system to another.
We can extract the rotation from that matrix.
How to decompose the matrix is explained here:
https://math.stackexchange.com/questions/237369/given-this-transformation-matrix-how-do-i-decompose-it-into-translation-rotati
The rotation is defined by the following matrix:
0 -1
1 0
This is a rotation by -90 (i.e. 270) degrees.
Important note: in this case a positive angle means a counterclockwise rotation.
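To make that concrete, here is a small sketch (plain Python, independent of iText) that pulls the rotation angle out of a cm array [a b c d e f], following the decomposition from the link above:
import math

def rotation_degrees(a, b, c, d, e=0.0, f=0.0):
    # For a rotation-plus-scale matrix, a = sx*cos(q) and b = sx*sin(q),
    # so the rotation angle is atan2(b, a).
    return math.degrees(math.atan2(b, a))

# The cm operator from pages 2-4 of the shared document
print(rotation_degrees(0, -612, 792, 0, 0, 612))   # -90.0, i.e. 270 degrees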
ISO 32000
Rotations shall be produced by [rc rs -rs rc 0 0], where rc = cos(q)
and rs = sin(q) which has the effect of rotating the coordinate system
axes by an angle q counter clockwise.
So the image ends up rotated by the same angle, in the opposite direction, relative to the page.

Can YOLO images have a bounding box that covers the whole picture?

I wonder why YOLO images need to have a bounding box.
Assume that we are using Darknet. Each image needs to have a corresponding .txt file with the same name as the image file. Each line inside the .txt file needs to be of the form below. It's the same for all YOLO frameworks that use bounding boxes for labeling.
<object-class> <x> <y> <width> <height>
Where x, y, width, and height are relative to the image's width and height.
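For concreteness, producing such a line from a pixel-space box is just a normalisation; a tiny sketch with made-up numbers:
def to_yolo_line(object_class, x_min, y_min, box_w, box_h, img_w, img_h):
    # YOLO wants the box centre and size, all relative to the image dimensions
    x_center = (x_min + box_w / 2) / img_w
    y_center = (y_min + box_h / 2) / img_h
    return f"{object_class} {x_center} {y_center} {box_w / img_w} {box_h / img_h}"

# e.g. a 100x200 pixel box with its top-left corner at (50, 60) in a 416x416 image
print(to_yolo_line(0, 50, 60, 100, 200, 416, 416))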
For example, if we go to this page, press the YOLO Darknet TXT button, download the .zip file and then go to the train folder, we can see these files:
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.jpg
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.txt
Where the .txt file looks like this
0 0.7055288461538461 0.6538461538461539 0.11658653846153846 0.4110576923076923
1 0.5913461538461539 0.3545673076923077 0.17307692307692307 0.6538461538461539
Every image has the size 416x416. This image looks like this:
My idea is that every image should have one class. Only one class. And the image should be taken with a camera like this.
This camera snap should then be processed as:
Take a camera snap
Cut the camera snap to the desired size
Upscale it to a 416x416 square
Like this:
And then every .txt file that corresponds to an image should look like this:
<object-class> 0 0 1 1
Question
Is this possible for e.g. Darknet or other frameworks that use bounding boxes for labeling the classes?
Instead of letting the software, e.g. Darknet, upscale the bounding boxes to 416x416 for every class object, I would do it myself and change the .txt file to x = 0, y = 0, width = 1, height = 1 for every image that has only one class object.
Is it possible for me to create a training set in that way and train with it?
A little disclaimer: I have to say that I am not an expert on this; I am part of a project where we are using Darknet, so I have had some time to experiment.
So if I understand it right, you want to train with cropped single-class images that have full-image-sized bounding boxes.
It is possible to do, and I am using something like that, but it is most likely not what you want.
Let me tell you about the problems and unexpected behaviour this method creates.
When you train with images that have full-image-sized bounding boxes, YOLO cannot make proper detections, because during training it also learns the backgrounds and empty spaces of your dataset. More specifically, the objects in your training dataset have to be in the same context as in your real-life usage. If you train it with dog images in the jungle, it won't do a good job of predicting dogs in a house.
If you are only going to use it for classification, you can still train it like this and it still classifies fine, but the images you predict on should also be like your training dataset. So, looking at your example, if you train on images like the cropped dog picture, your model won't be able to classify the dog in the first image.
For a better example, in my case detection wasn't required. I am working with food images and I only predict the meal on the plate, so I trained with full-image-sized bboxes, since every food has one class. It perfectly classifies the food, but the bboxes are always predicted as the full image.
My understanding of the theory here is that if you feed the network only full-image bboxes, it learns that making the box as big as possible results in a lower error rate, so it optimizes that way. This kind of wastes half of the algorithm, but it works for me.
Also, your images don't need to be 416x416; whatever size you give it, it resizes them to that, and you can also change it in the cfg file.
I have some code that makes full-sized bboxes for all images in a directory if you want to try it quickly (it overrides existing annotations, so be careful).
Finally, for the boxes to be centered and full size they should look like this; x and y are the center of the bbox, so they should be the center (half) of the image:
<object-class> 0.5 0.5 1 1
from imagepreprocessing.darknet_functions import create_training_data_yolo, auto_annotation_by_random_points
import os

main_dir = "datasets/my_dataset"

# auto annotating all images by their center points (x,y,w,h)
folders = sorted(os.listdir(main_dir))
for index, folder in enumerate(folders):
    auto_annotation_by_random_points(os.path.join(main_dir, folder), index, annotation_points=((0.5,0.5), (0.5,0.5), (1.0,1.0), (1.0,1.0)))

# creating required files
create_training_data_yolo(main_dir)
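If you would rather not depend on that package, a minimal plain-Python sketch that writes the same centered full-image annotation for every image could look like this (assuming a hypothetical layout with one subfolder per class, with the class index taken from the sorted folder order):
import os

main_dir = "datasets/my_dataset"               # hypothetical dataset root, one subfolder per class
image_exts = (".jpg", ".jpeg", ".png")

for class_id, folder in enumerate(sorted(os.listdir(main_dir))):
    folder_path = os.path.join(main_dir, folder)
    for name in os.listdir(folder_path):
        if not name.lower().endswith(image_exts):
            continue
        # One full-image box per file: centered at (0.5, 0.5), width = height = 1
        txt_path = os.path.join(folder_path, os.path.splitext(name)[0] + ".txt")
        with open(txt_path, "w") as f:
            f.write(f"{class_id} 0.5 0.5 1 1\n")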

Is it possible to train YOLO (any version) for a single class where the image has text data? (Finding the region of equations)

I am wondering if YOLO (any version, especially one tuned for accuracy rather than speed) can be trained on text data. What I am trying to do is find the region in a text image where any equation is present.
For example, I want to find the 2 gray regions of interest in this image so that I can outline and, eventually, crop the equations separately.
I am asking this question because:
First of all, I have not found a case where YOLO is used for text data.
Secondly, how can we customise it for a resolution other than (416, 416), since all the images are either cropped or horizontal, mostly in a (W = 2H) format?
I have implemented the YOLO-v3 version for text data, but using OpenCV, which basically runs on CPU. I want to train the model from scratch.
Please help. Any of Keras, TensorFlow or PyTorch would do.
Here is the code I used for the OpenCV implementation:
import cv2
import numpy as np

height, width = img.shape[:2]  # img is the input image loaded earlier; its dimensions scale the relative box coordinates

net = cv2.dnn.readNet(PATH+"yolov3.weights", PATH+"yolov3.cfg")  # build the model. NOTE: this will only use CPU
layer_names = net.getLayerNames()  # get all the layer names from the network (254 layers in the network)
output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]  # the 3 output layers in total

blob = cv2.dnn.blobFromImage(image=img, scalefactor=0.00392, size=(416,416), mean=(0, 0, 0), swapRB=True)
# output is a numpy array of shape (1,3,416,416). If you need to change the shape, change it in the config file too
# swap BGR to RGB, scale by 1/255, resize, and subtract a mean of 0 from all the RGB values
net.setInput(blob)
outs = net.forward(output_layers)  # list of 3 elements, one per output layer

class_ids = []  # ids of the detected classes
confidences = []  # confidence score of each object present in a bounding box; if 0, no object is present
boxes = []  # all the boxes
for out in outs:  # go through the output layers one by one
    for detection in out:  # go through the detections one by one
        scores = detection[5:]  # probabilities of the 80 classes for the object(s) inside the box
        class_id = np.argmax(scores)  # which class is dominating inside the list
        confidence = scores[class_id]
        if confidence > 0.1:  # consider only boxes whose probability of having an object is > 0.1
            # grid coordinates
            center_x = int(detection[0] * width)  # centre X of the box
            center_y = int(detection[1] * height)  # centre Y of the box
            w = int(detection[2] * width)  # width
            h = int(detection[3] * height)  # height
            # rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])  # collect all the bounding boxes
            confidences.append(float(confidence))  # collect all the confidence scores
            class_ids.append(class_id)  # collect all the class ids
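For completeness, the usual step after this loop (not shown in the question) is non-maximum suppression followed by drawing the surviving boxes; a short sketch using OpenCV's built-in NMSBoxes, with example threshold values:
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.1, 0.4)  # score threshold, NMS (IoU) threshold
for i in np.array(indices).flatten():  # flatten() copes with both old and new OpenCV return shapes
    x, y, w, h = boxes[i]
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(img, str(class_ids[i]), (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
cv2.imwrite("detections.png", img)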
Being an object detector, YOLO can be used for specific text detection only, not for detecting any text that might be present in the image.
For example, YOLO can be trained to do text-based logo detection like this:
I want to find the 2 of the Gray regions of interest in this image so
that I can outline and eventually, crop the equations separately.
Your problem statement talks about detecting any equation (math formula) present in the image, so it can't be done using YOLO alone. I think Mathpix is similar to your use case. They will be using an OCR (Optical Character Recognition) system trained and fine-tuned towards their use case.
Ultimately, to do something like Mathpix, an OCR system customised for your use case is what you need. There won't be a ready-made solution out there for this. You'll have to build one.
Proposed Methods:
Mathematical Formula Detection in Heterogeneous Document Images
A Simple Equation Region Detector for Printed Document Images in Tesseract
Note: Tesseract as-is can't be used, because it is a pre-trained model trained to read any character. You can refer to the 2nd paper to train Tesseract towards your use case.
To get some idea about OCR, you can read about it here.
EDIT:
So the idea is to build your own OCR to detect something that constitutes an equation/math formula, rather than detecting every character. You need a dataset where the equations are marked. Basically, you look for regions with math symbols (say summation, integration, etc.).
Some Tutorials to train your own OCR:
Tesseract training guide
Creating OCR pipeline using CV and DL
Build OCR pipeline
Build Your OCR
Attention OCR
So the idea is that you follow these tutorials to learn how to train and build an OCR for any use case, and then read the research papers I mentioned above, together with the basic ideas I gave, to build an OCR towards your use case.

How can I do template matching in OpenCV with colour?

I have been trying to use OpenCV's template matching function to match templates within images. However, when the images are dark brown and dark green, the template matching does not work so well. I am pretty sure the greyscale conversion is responsible for this, because in greyscale they look very similar.
However, from what I see, cv2.matchTemplate() only takes greyscale images. How can I do coloured template matching? Should I separate the RGB image into 3 images (one red, one green, one blue), treat each one as a greyscale image, apply matchTemplate, and then sum the similarity rating for each pixel position? Is that the way to do it? Or is there a different function or parameter value I can use to make matchTemplate work for coloured images?
You may try this code:
import numpy as np
import cv2

threshold = 0.8

## Read the main and needle images (the paths are placeholders)
imageMainBGR = cv2.imread("main/Image/Path/main.png")
imageNeedleBGR = cv2.imread("needle/Image/Path/needle.png")

## Split both into their B, G, R channels (cv2.imread returns channels in BGR order)
imageMainB, imageMainG, imageMainR = cv2.split(imageMainBGR)
imageNeedleB, imageNeedleG, imageNeedleR = cv2.split(imageNeedleBGR)

## Match each channel separately; a normalized method is used so the 0.8 threshold is meaningful
resultB = cv2.matchTemplate(imageMainB, imageNeedleB, cv2.TM_CCOEFF_NORMED)
resultG = cv2.matchTemplate(imageMainG, imageNeedleG, cv2.TM_CCOEFF_NORMED)
resultR = cv2.matchTemplate(imageMainR, imageNeedleR, cv2.TM_CCOEFF_NORMED)

## Add the channel scores together to get the total score
result = resultB + resultG + resultR
loc = np.where(result >= 3 * threshold)
print("loc: ", loc)
The images I tested with are:
main.png
needle.png
result.png
Remark: this code may not work well on some photos, in which case you may need to modify it further.
Note: the image comes from pexels.com, which is copyright-free. If you have any issue with the image copyright and want the image taken down, feel free to contact me. Thanks.