Forest prediction with folium: how to display only one value of a band

We are predicting forest positions on an Earth Engine map.
Our output has shape (256, 256, 1).
The single band of our prediction contains only 0s and 1s.
We would like to show only the 1s in folium, but the layer we have is showing both the zeros and the ones:
folium.TileLayer(
    tiles=mapid['tile_fetcher'].url_format,
    attr='Google Earth Engine',
    overlay=True,
    name='predictions',
).add_to(map)
map.add_child(folium.LayerControl())
map

Mask out the 0's using the .mask() function.
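For example, a minimal sketch, assuming `prediction` is the ee.Image that mapid was built from (updateMask is the current spelling of the older mask-setting call; the palette is an arbitrary choice):

import ee
import folium

ee.Initialize()

# Masking an image by itself makes every 0-valued pixel transparent,
# so only the 1s remain. prediction.selfMask() does the same thing.
masked = prediction.updateMask(prediction)
mapid = masked.getMapId({'min': 0, 'max': 1, 'palette': ['00FF00']})

folium.TileLayer(
    tiles=mapid['tile_fetcher'].url_format,
    attr='Google Earth Engine',
    overlay=True,
    name='predictions',
).add_to(map)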

Related

Binary treatment based on a continuous variable (Stata)

I want to create a scatter plot showing my treatment assignment on the y-axis and the margin of winning on the x-axis.
To create a binary treatment variable, where a margin over 0 indicates that a Republican candidate won the local election, I ran:
gen republican_win = (margin>0)
Here is a data example:
* Example generated by -dataex-. For more info, type help dataex
clear
input double margin float republican_win
-.356066316366196 0
-.54347825050354 0
-.204092293977737 0
-.449720650911331 1
-.201149433851242 1
-.505899667739868 0
-.206885248422623 1
end
To generate a scatter plot, I ran the following. While the code ran well, I was wondering: would it be possible to display a continuous distribution of the margin of Republican wins and losses?
scatter margin republican_win
You can use the predicted probabilities by storing them in a variable and then plotting them alongside your scatter plot.
I would then reverse the axes to show your logistic distribution.
logit republican_win margin
predict win_hat
twoway scatter win_hat republican_win margin, ///
    connect(l i) msymbol(i O) sort ylabel(0 1)
There are not enough data points in your data example to show a nice fitted curve, but I'm sure it will look better on your whole dataset.

About the Input in a NN

So I am new to NNs and I'm trying to go deeper and apply them to my subject. I would like to ask: can the input of a NN be 2 or more values, for example the measurement of a value, a distance, and a time? Thanks in advance!
Yes, you can have more than one value as your input. In my experience, you typically enter these values as an array. Here is some example code from TensorFlow: https://www.tensorflow.org/datasets/keras_example
In this example you see 784 inputs; each input is one of the pixels in the 28x28 greyscale image.
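For instance, here is a minimal sketch of a network taking three input features per sample (say a measurement, a distance, and a time); all names and numbers here are made up:

import numpy as np
import tensorflow as tf

# Each sample is an array of 3 values; the Dense layer accepts them all at once.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(3,)),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')

X = np.random.rand(100, 3)  # 100 samples x 3 features (measurement, distance, time)
y = np.random.rand(100)     # dummy targets, just to show the shapes
model.fit(X, y, epochs=2, verbose=0)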

Is it possible to train YOLO (any version) for a single class where the image has text data? (Find the region of equations)

I am wondering whether YOLO (any version, especially one tuned for accuracy rather than speed) can be trained on text data. What I am trying to do is find the regions in a text image where any equation is present.
For example, I want to find the 2 gray regions of interest in this image so that I can outline and, eventually, crop the equations separately.
I am asking this question because:
First of all, I have not found a place where YOLO is used for text data.
Secondly, how can we customise it for low resolution, unlike the (416, 416) default, since all the images are either cropped or horizontal, mostly in (W = 2H) format?
I have implemented the YOLO-V3 version for text data, but using OpenCV, which basically runs on CPU only. I want to train the model from scratch.
Please help. Any of Keras, TensorFlow or PyTorch would do.
Here is the code I used for the implementation in OpenCV.
import cv2
import numpy as np

net = cv2.dnn.readNet(PATH+"yolov3.weights", PATH+"yolov3.cfg")  # build the model. NOTE: this will only use the CPU
layer_names = net.getLayerNames()  # all the layer names in the network; 254 layers in this network
output_layers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]  # the 3 output layers in total
# (newer OpenCV versions return a flat array here: use layer_names[i - 1])

height, width = img.shape[:2]  # img loaded earlier, e.g. img = cv2.imread(...)
blob = cv2.dnn.blobFromImage(image=img, scalefactor=0.00392, size=(416, 416), mean=(0, 0, 0), swapRB=True)
# output is a numpy array of shape (1, 3, 416, 416). If you need to change the shape, change it in the config file too.
# swaps BGR to RGB, scales pixel values by 1/255, resizes, and subtracts a mean of 0 from all the RGB channels
net.setInput(blob)
outs = net.forward(output_layers)  # list of 3 elements, one per output layer

class_ids = []  # ids of the detected classes
confidences = []  # confidence scores of objects present in bounding boxes; if 0, no object is present
boxes = []  # all the bounding boxes
for out in outs:  # iterate over the output layers one by one
    for detection in out:  # iterate over the detections one by one
        scores = detection[5:]  # probabilities, for each of the 80 classes, that the object inside the box belongs to it
        class_id = np.argmax(scores)  # which class dominates the list
        confidence = scores[class_id]
        if confidence > 0.1:  # consider only boxes whose probability of containing an object is > 0.1
            # grid coordinates
            center_x = int(detection[0] * width)  # centre X of the box
            center_y = int(detection[1] * height)  # centre Y of the box
            w = int(detection[2] * width)  # width
            h = int(detection[3] * height)  # height
            # rectangle coordinates
            x = int(center_x - w / 2)
            y = int(center_y - h / 2)
            boxes.append([x, y, w, h])  # collect all the bounding boxes
            confidences.append(float(confidence))  # collect all the confidence scores
            class_ids.append(class_id)  # collect all the class ids
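The snippet above only collects the boxes. A possible continuation (not part of the original code) is to apply non-maximum suppression and draw whatever survives; the 0.4 NMS threshold is an assumption:

# Drop overlapping boxes, keeping the most confident one in each cluster.
indices = cv2.dnn.NMSBoxes(boxes, confidences, 0.1, 0.4)
for i in np.array(indices).flatten():
    x, y, w, h = boxes[i]
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)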
Being an object detector, Yolo can be used only for specific text detection, not for detecting any text that might be present in the image.
For example, Yolo can be trained to do text-based logo detection.
"I want to find the 2 of the Gray regions of interest in this image so that I can outline and eventually, crop the equations separately."
Your problem statement talks about detecting any equation (math formula) present in the image, so it can't be done using Yolo alone. I think mathpix is similar to your use case. They will be using an OCR (Optical Character Recognition) system trained and fine-tuned towards their use case.
Eventually, to do something like mathpix, an OCR system customised for your use case is what you need. There won't be any ready-made solution out there for this. You'll have to build one.
Proposed Methods:
Mathematical Formula Detection in Heterogeneous Document Images
A Simple Equation Region Detector for Printed Document Images in Tesseract
Note: Tesseract as it is can't be used, because it is a pre-trained model trained for reading any character. You can refer to the 2nd paper to train Tesseract towards fitting your use case.
To get some idea about OCR, you can read about it here.
EDIT:
So the idea is to build your own OCR that detects what constitutes an equation/math formula rather than detecting every character. You need a data set where the equations are marked. Basically, you look for regions with math symbols (say summation, integration, etc.).
Some Tutorials to train your own OCR:
Tesseract training guide
Creating OCR pipeline using CV and DL
Build OCR pipeline
Build Your OCR
Attention OCR
So the idea is that you follow these tutorials to learn how to train and build an OCR for any use case, and then you read the research papers I mentioned above, along with the basic ideas I gave, to build an OCR towards your use case.

How can I achieve better than 80% on the test set?

My goal is to detect digits from 0 to 9 on a random background. I wrote a dataset generator with the following features:
Grayscale data
Random digit rotation
Random digit blur
43 different fonts
Random noisy blurred background
Here are 1024 samples of my dataset:
1024 testset samples
I adapted the mnist expert model to train the dataset and get almost 100% on the train and validation set.
On the test set I get approximately 80% correct.
Here is a sample. The green digit is the digit predicted:
9 predicted as 5
It seems that my model has some trouble distinguishing between
1 and 7
8 and 3
9 and 6
5 and 9
I need to detect the digit on any background because the test images are not always binary images.
Now my questions:
For the testset generator:
How useful is applying digit rotation? When I rotate a 7, I get a 1 for some fonts. When I rotate a 9, I get a 6 (rotation > 90°).
Does the convolution filter already handle image rotation?
Are 180,000 image samples enough to train the model?
For the model:
Should I increase the image size from 28x28 to 56x56 when I apply a blur filter onto the dataset?
What filter size should I use?
Do I have to increase the number of hidden layers?
Thanks a lot for any guide.
If you are stuck with the different image backgrounds, I suggest you try image filtering, which will give your images a uniform background and foreground, assuming your images are of good quality.
Try this (scikit-image library):
import numpy as np
from skimage import filters as flt
filtered_image = np.array(original_image > flt.threshold_li(original_image))
Then you can use the filtered images for both training and prediction.
I ended up extracting the dataset patches out of existing images instead of using a random background with random digits. This gives us less variance and much better accuracy on the test set.
Here is a working but not particularly performant implementation, which allows us to define the shape and stride size:
def patchify(self, arr, shape, stride):
    patches = []
    arr_shape = arr.shape
    (shape_h, shape_w) = shape
    (stride_h, stride_w) = stride
    num_patches = np.floor(np.array(arr_shape) / np.array(stride))
    (num_patches_row, num_patches_col) = (int(num_patches[0]), int(num_patches[1]))
    for row in range(num_patches_row):
        row_from = row * stride_h
        row_to = row_from + shape_h
        for col in range(num_patches_col):
            col_from = col * stride_w
            col_to = col_from + shape_w
            origin_information = (row_from, row_to, col_from, col_to)
            roi = arr[row_from:row_to, col_from:col_to]
            patches.append((roi, origin_information))
    return patches
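For illustration, a hypothetical call (the function is written as a method, so `gen` is assumed to be the dataset-generator object that owns it, and gray_image a 2D numpy array):

patches = gen.patchify(gray_image, shape=(28, 28), stride=(14, 14))
for roi, (row_from, row_to, col_from, col_to) in patches:
    print(roi.shape, (row_from, col_from))  # each 28x28 patch plus the coordinates it was cut from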
or we can also use scikit-learn, where image is a numpy array:
from sklearn.feature_extraction.image import extract_patches_2d
patches = extract_patches_2d(image, (patch_height, patch_width))

Faster way to perform point-wise interpolation of a numpy array?

I have a 3D datacube, with two spatial dimensions and the third being a multi-band spectrum at each point of the 2D image.
H[x, y, bands]
Given a wavelength (or band number), I would like to extract the 2D image corresponding to that wavelength. This would be simply an array slice like H[:,:,bnd]. Similarly, given a spatial location (i,j) the spectrum at that location is H[i,j].
I would also like to 'smooth' the image spectrally, to counter low-light noise in the spectra. That is, for band bnd, I choose a window of size wind and fit an n-degree polynomial to the spectrum in that window. With polyfit and polyval I can find the fitted spectral value at that point for band bnd.
Now, if I want the whole image of bnd from the fitted value, then I have to perform this windowed-fitting at each (i,j) of the image. I also want the 2nd-derivative image of bnd, that is, the value of the 2nd-derivative of the fitted spectrum at each point.
Running over the points, I could polyfit-polyval-polyder each of the x*y spectra. While this works, this is a point-wise operation. Is there some pytho-numponic way to do this faster?
If you do least-squares polynomial fitting to points (x+dx[i],y[i]) for a fixed set of dx and then evaluate the resulting polynomial at x, the result is a (fixed) linear combination of the y[i]. The same is true for the derivatives of the polynomial. So you just need a linear combination of the slices. Look up "Savitzky-Golay filters".
EDITED to add a brief example of how S-G filters work. I haven't checked any of the details and you should therefore not rely on it to be correct.
So, suppose you take a filter of width 5 and degree 2. That is, for each band (ignoring, for the moment, ones at the start and end) we'll take that one and the two on either side, fit a quadratic curve, and look at its value in the middle.
So, if f(x) ~= ax^2+bx+c and f(-2),f(-1),f(0),f(1),f(2) = p,q,r,s,t then we want 4a-2b+c ~= p, a-b+c ~= q, etc. Least-squares fitting means minimizing (4a-2b+c-p)^2 + (a-b+c-q)^2 + (c-r)^2 + (a+b+c-s)^2 + (4a+2b+c-t)^2, which means (taking partial derivatives w.r.t. a,b,c):
4(4a-2b+c-p)+(a-b+c-q)+(a+b+c-s)+4(4a+2b+c-t)=0
-2(4a-2b+c-p)-(a-b+c-q)+(a+b+c-s)+2(4a+2b+c-t)=0
(4a-2b+c-p)+(a-b+c-q)+(c-r)+(a+b+c-s)+(4a+2b+c-t)=0
or, simplifying,
34a+10c = 4p+q+s+4t
10b = -2p-q+s+2t
10a+5c = p+q+r+s+t
so a,b,c = (2p-q-2r-s+2t)/14, (2(t-p)+(s-q))/10, (-3p+12q+17r+12s-3t)/35.
And of course c is the value of the fitted polynomial at 0, and therefore is the smoothed value we want. So for each spatial position, we have a vector of input spectral data, from which we compute the smoothed spectral data by multiplying by a matrix whose rows (apart from the first and last couple) look like [0 ... 0 -3/35 12/35 17/35 12/35 -3/35 0 ... 0], with the central 17/35 on the main diagonal of the matrix.
So you could do a matrix multiplication for each spatial position; but since it's the same matrix everywhere, you can do it with a single call to tensordot. So if S contains the matrix I just described (er, wait, no, the transpose of the matrix I just described) and A is your 3-dimensional data cube, your spectrally-smoothed data cube would be numpy.tensordot(A, S, axes=1).
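To make that concrete, here is a sketch under the assumptions above (width-5, degree-2 filter, interior bands only; the first and last two bands are passed through unsmoothed):

import numpy as np

n_bands = A.shape[2]  # A is the (x, y, bands) cube from above
coeffs = np.array([-3, 12, 17, 12, -3]) / 35.0  # width-5, degree-2 smoothing weights

# Column k of S holds the weights that produce output band k.
S = np.eye(n_bands)  # identity columns leave the first/last two bands untouched
for k in range(2, n_bands - 2):
    S[k - 2:k + 3, k] = coeffs  # overwrites the diagonal 1 with the centre weight

smoothed = np.tensordot(A, S, axes=1)  # contracts the band axis of A with the rows of S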
This would be a good point at which to repeat my warning: I haven't checked any of the details in the few paragraphs above, which are just meant to give an indication of how it all works and why you can do the whole thing in a single linear-algebra operation.
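As a sanity check, SciPy ships exactly this filter, so the smoothed cube and its second-derivative cube are one call each (a sketch, assuming H is the cube from the question and bnd a chosen band index):

from scipy.signal import savgol_filter

# window_length/polyorder mirror the width-5, degree-2 example above;
# deriv=2 evaluates the fitted polynomial's second derivative instead.
smoothed = savgol_filter(H, window_length=5, polyorder=2, axis=2)
second_deriv = savgol_filter(H, window_length=5, polyorder=2, deriv=2, axis=2)

bnd = 40  # any band index
smoothed_image = smoothed[:, :, bnd]  # the smoothed 2D image at band bnd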