Suppose I have a segmented image as a Numpy array, where each entry in the image is a number from 1, ... C, C+1 where C is the number of segmentation classes, and class C+1 is some background class. I want to find an efficient way to convert this to a contour image (a binary image where a contour pixel will have value 1, and the rest will have values 0), so that any pixel who has a neighbor in its 8-neighbourhood (or 4-neighbourhood) will be a contour pixel.
The inefficient way would be something like:
def isValidLocation(i, j, image_height, image_width):
if i<0:
return False
if i>image_height-1:
return False
if j<0:
return False
if j>image_width-1:
return False
return True
def get8Neighbourhood(i, j, image_height, image_width):
nbd = []
for height_offset in [-1, 0, 1]:
for width_offset in [-1, 0, 1]:
if isValidLocation(i+height_offset, j+width_offset, image_height, image_width):
nbd.append((i+height_offset, j+width_offset))
return nbd
def getContourImage(seg_image):
seg_image_height = seg_image.shape[0]
seg_image_width = seg_image.shape[1]
contour_image = np.zeros([seg_image_height, seg_image_width], dtype=np.uint8)
for i in range(seg_image_height):
for j in range(seg_image_width):
nbd = get8Neighbourhood(i, j, seg_image_height, seg_image_width)
for (m,n) in nbd:
if seg_image[m][n] != seg_image[i][j]:
contour_image[i][j] = 1
return contour_image
I'm looking for a more efficient "vectorized" way of achieving this, as I need to be able to compute this at run time on batches of 8 images at a time in a deep learning context. Any insights appreciated. Visual Example Below. The first image is the original image overlaid over the ground truth segmentation mask (not the best segmentation admittedly...), the second is the output of my code, which looks good, but is way too slow. Takes me about 10 seconds per image with an intel 9900K cpu.
This might work but it might have some limitations which I cannot be sure of without testing on the actual data, so I'll be relying on your feedback.
import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt
# some sample data with few rectangular segments spread out
seg = np.ones((100, 100), dtype=np.int8)
seg[3:10, 3:10] = 20
seg[24:50, 40:70] = 30
seg[55:80, 62:79] = 40
seg[40:70, 10:20] = 50
Now to find the contours, we will convolve the image with a kernel which should give 0 values when convolved within the same segment of the image and <0 or >0 values when convolved over image regions with multiple segments.
# kernel for convolving
k = np.array([[1, -1, -1],
[1, 0, -1],
[1, 1, -1]])
convolved = ndimage.convolve(seg, k)
# contour pixels
non_zeros = np.argwhere(convolved != 0)
plt.scatter(non_zeros[:, 1], non_zeros[:, 0], c='r', marker='.')
As you can see in this sample data the kernel has a small limitation and misses identifying two contour pixels caused due to symmetric nature of data (which I think would be a rare case in actual segmentation outputs)
For better understanding, this is the scenario(occurs at top left and bottom right corners of the rectangle) where the kernel convolution fails to identify the contour i.e. misses one pixel
[ 1, 1, 1]
[ 1, 1, 1]
[ 1, 20, 20]

Based on #sai's idea I came up with this snippet, which yielded the same result much, much faster than my original code. Runs in 0.039 seconds, which when compared to close to 8-10 seconds for the original I'd say is quite a speed-up!
filters = []
for i in [0, 1, 2]:
for j in [0, 1, 2]:
filter = np.zeros([3,3],
if i ==1 and j==1:
filter[i][j] = -1
filter[1][1] = 1
def getCountourImage2(seg_image):
convolved_images = []
for filter in filters:
convoled_image = ndimage.correlate(seg_image, filter, mode='reflect')
convoled_images = np.add.reduce(convolved_images)
seg_image = np.where(convoled_images != 0, 255, 0)
return seg_image


How can I loop over a multidimensional array and save the result in an array?

I want to fill in an empty 4D array. I have created a pre-allocated array (data_4d_smoothed) with 80 x 80 x 44 x 50. I want to loop through all (50) volumes of the data (data_4d), smooth them separately and store the results in data_4d_smoothed. Basically:
data_4d_smoothed = np.zeros(data_4d.shape)
sigma = 0.7
for i in data_4d[:, :, :, i]:
smoothed_vol = gaussian_filter(i, sigma=sigma)
The gaussian_filter should take every volume (the last dimension of the 4d array), do the operation, and save it into data_4d_smoothed. But obviously, this is not a 2D array and I think I need a nested loop to fill this empty list.
I think this should work without looping:
from scipy.ndimage import gaussian_filter
data_4d = np.random.rand(80,80,44,50)
data_4d_smoothed = gaussian_filter(data_4d, sigma = (sigma, sigma, sigma, 0))
Basically make the last dimension's sigma = 0, so that it doesn't do the convolution in that dimension.
data_4d_0 = gaussian_filter(data_4d[..., 0], sigma = sigma) #filter first image
np.allclose(data_4d_0, data_4d_smoothed[..., 0]) #first image from global filter

Move for loop into numpy single expression when calling polyfit

Fairly new to numpy/python here, trying to figure out some less c-like, more numpy-like coding styles.
I've got some code done that takes a fixed set of x values and multiple sets of corresponding y value sets and tries to find which set of the y values are the "most linear".
It does this by going through each set of y values in a loop, calculating and storing the residual from a straight line fit of those y's against the x's, then once the loop has finished finding the index of the minimum residual value.
...sorry this might make a bit more sense with the code below.
import numpy as np
import numpy.polynomial.polynomial as poly
# set of x values
xs = [1,22,33,54]
# multiple sets of y values for each of the x values in 'xs'
ys = np.array([[1, 22, 3, 4],
[2, 3, 1, 5],
[3, 2, 1, 1],
[34,23, 5, 4],
[5,19, 12, 3]])
# array to store the residual from a linear fit of each of the y's against x
residuals = np.empty(ys.shape[0])
# loop through the xs's and calculate the residual of a linear fit for each
for i in range(ys.shape[0]):
_, stats = poly.polyfit(xs, ys[i], 1, full=True)
residuals[i] = stats[0][0]
# the 'most linear' of the ys's is at np.argmin:
print('most linear at', np.argmin(residuals))
I'd like to know if it's possible to "numpy'ize" that into a single expression, something like
residuals = get_residuals(xs, ys)
...I've tried:
I've tried the following, but no luck (it always passes the full arrays in, not row by row):
# ------ ok try to do it without a loop --------
def wrap(x, y):
_, stats = poly.polyfit(x, y, 1, full=True)
return stats[0][0]
res = wrap(xs, ys) # <- fails as passes ys as full 2D array
res = wrap(np.broadcast_to(xs, ys.shape), ys) # <- fails as passes both as 2D arrays
Could anyone give any tips on how to numpy'ize that?
From the numpy.polynomial.polynomial.polyfit docs (not to be confused with numpy.polyfit which is not interchangable)
x : array_like, shape (M,)
y : array_like, shape (M,) or (M, K)
Your ys needs to be transposed to have ys.shape[0] equal to xs.shape
def wrap(x, y):
_, stats = poly.polyfit(x, y.T, 1, full=True)
return stats[0]
res = wrap(xs, ys)
Out[]: array([284.57337884, 5.54709898, 0.41399317, 91.44641638,
6.34982935, 153.03515358])

Alternative implementation of sparse convolution in TensorFlow

I have a special convolution kernel: 1) it has a big size (600x600); 2) it is a sparse filter and consists of mostly 0 values and some 1s. I want to apply this kernel to another big image (2000x2000). Since the filter only has 0s and 1s, the convolution operation in this special case is equivalent to the following steps:
1) Compute the coordinates of the 1s relative to the centre point of the convolution filter; 2) Translate the image by each of the relative coordinates; 3) Sum the resulting translations together.
Assume there are in total n 1s, then the above will result in n translations. We can't have n images present in the memory, since the space needed will be nx2000x2000 and that will result in OOM. I tried using a while loop to save memory space and the code looks like the following:
def sum_conv(acc, curr):
Apply translation to implement convolution. Summation is done in while loop.
:param acc: (batch, size, size, 1)
:param curr: (batch, 2)
translation = tf.reshape(curr, [batch_size, 2])
trans_x = tf.expand_dims(translation[:, 1], axis=-1)
trans_y = tf.expand_dims(translation[:, 0], axis=-1)
ones, zeros = tf.ones_like(trans_x), tf.zeros_like(trans_x)
transform = tf.concat([ones, zeros, trans_x, zeros, ones, trans_y, zeros, zeros], axis=-1)
# Image: (batch, size, size, 1)
conv_image = tf.contrib.image.transform(image, transform)
return acc + conv_image
# Points: (num_points, batch, 2)
# Loop on points (i.e. positions of 1s) to get the results of convolution
i0 = tf.constant(0)
conv_image0 = tf.zeros((batch_size, size, size, 1))
c = lambda i, prev: i < num_points
b = lambda i, prev: (i + 1, sum_conv(prev, tf.gather(points, i, axis=0)))
i, conv_images = tf.while_loop(c, b, (i0, conv_image0))
I have two questions.
1) Can anyone help me come up with a simpler implementation of the above algorithm?
2) Is there any operation in TensorFlow that supports sparse convolution?

High Eigen values always for Edge detection

I am trying to understand Harris detector, using the explanation here. As per explanation, I understand, if we calculate the eigen values, then,
However, when I try to calculate the eigen values are always high. Below is my main image from which I extract parts to calculate eigen values.
For a flat area with no visible features, I get this distribution (on right most) which is good, but eigen values are large
For a linear edge, also I get high eigen values: 16290305.45393251 567780.54606749
For corner, it is expected to get high values, but now I am doubtful if these high values are correct due to above cases.
8958127.80563239 10986758.19436761
Here is my method, translated from matlab code here. Its the vals value I directly get from numpy's linear algebra library.
def plot_derivatives_1(img_rgb, mode=1):
img_rgb = image in rgb color space (3 channeled)
img_1c = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
if mode == 1: # method 1 derivative
Ix = cv2.Sobel(img_1c, cv2.CV_64F, 1, 0, ksize=3)
Iy = cv2.Sobel(img_1c, cv2.CV_64F, 0, 1, ksize=3)
# another method of derivatives
dx = np.array([
[-1, 0, 1],
[-1, 0, 1],
[-1, 0, 1]
dy = np.transpose(dx)
Ix = signal.convolve2d(img_1c, dx, mode='valid')
Iy = signal.convolve2d(img_1c, dy, mode='valid')
Ix, Iy = Ix.astype(np.float64), Iy.astype(np.float64) # else gaussian blur later is failing
# yet to solve why we need A and eigen outputs
A = np.array([
[ np.sum(Ix*Ix), np.sum(Ix*Iy) ],
[ np.sum(Ix*Iy), np.sum(Iy*Iy) ]
vals, V = linalg.eig(A)
lamb = vals/np.max(vals)
print('lambda values:{}'.format(vals))
fig, ax = plt.subplots(1,4, figsize=(20,5))
ax[0].imshow(img_rgb);ax[0].set_title('Input Image')
ax[1].imshow(Ix, cmap='gray');ax[1].set_title('$I_x = \dfrac{\partial I}{\partial x}$')
ax[2].imshow(Iy, cmap='gray');ax[2].set_title('$I_y = \dfrac{\partial I}{\partial y}$')
ax[3].scatter(Ix, Iy);ax[3].set_xlim([-200,200]);ax[3].set_ylim([-200,200]);
ax[3].set_aspect('equal');ax[3].set_title('Derivatives Distribution');
ax[3].axvline(x=0, color = 'r');ax[3].axhline(y=0, color ='r')
return Ix, Iy
A sample call for a case (here shown for corner).
img = cv2.imread(SRC_FOLDER + 'checkersandbooksmall_sample_6.jpg')
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
Ix, Iy = plot_derivatives_1(img_rgb, mode=1)
I use jupyter notebook and the code is just built as I try to understand the concept.
What am I doing wrong to get high eigen values always for all cases?
Custom median pooling in tensorflow

I am trying to implement a median pooling layer in tensorflow.
However there is neither tf.nn.median_pool and neither tf.reduce_median.
Is there a way to implement such pooling layer with the python api ?
You could use something like:
patches = tf.extract_image_patches(tensor, [1, k, k, 1], ...)
m_idx = int(k*k/2+1)
top = tf.top_k(patches, m_idx, sorted=True)
median = tf.slice(top, [0, 0, 0, m_idx-1], [-1, -1, -1, 1])
To accommodate even sized median kernels and multiple channels, you will need to extend this, but this should get you most of the way.
As of March 2017, an easier answer (that under the hood works similarly to how Alex suggested) is to do this:
patches = tf.extract_image_patches(x, [1, k, k, 1], [1, k, k, 1], 4*[1], 'VALID')
medians = tf.contrib.distributions.percentile(patches, 50, axis=3)
For me, Alex's answer is not working for tf 1.4.1.
tf.top_k should be tf.nn.top_k
and should get values of tf.nn.top_k
Also, if the input is [1, H, W, C], either answer could not only work on height and width and neglect the channel.
Channel-wise median-pooling can be done by some addition reshapes on top of other answers:
# assuming NHWC layout
strides = rates = [1, 1, 1, 1]
patches = tf.extract_image_patches(x, [1, k, k, 1], strides, rates, 'VALID')
batch_size = tf.shape(x)[0]
n_channels = tf.shape(x)[-1]
n_patches_h = (tf.shape(x)[1] - k) // strides[1] + 1
n_patches_w = (tf.shape(x)[2] - k) // strides[2] + 1
n_patches = tf.shape(patches)[-1] // n_channels
patches = tf.reshape(patches, [batch_size, k, k, n_patches_h * n_patches_w, n_channels])
medians = tf.contrib.distributions.percentile(patches, 50, axis=[1,2])
medians = tf.reshape(medians, (batch_size, n_patches_h, n_patches_w, n_channels))
Not very efficient though.
I was looking for a median filter for tensorflowjs but can't seem to find one. tfa has a median filter now I think but for tf.js you can use this. Not sure if it would work on nodegpu.
function medianFilter(x, filter, strides, pad) {
//make Kernal
//todo allow for filter as array or number
let filterSize = filter ** 2;
let locs = tf.range(0, filterSize, filterSize );
//makes a bunc of arrays each one reprensentin one of the valuesin the median window ie 2x2 filter i in chanle and 4 out chanles
let f = tf.oneHot(tf.range(0,filterSize,1, 'int32'), filterSize).reshape([filter, filter, 1, filterSize]);
let y = tf.conv2d(x,f,strides,pad);
let m_idx = Math.floor(filterSize/2)+1;
let top = tf.topk(y, m_idx, true);
//note that thse are 3d tensors and if you use 4d ones add a 0 and -1 infron like in above ansowers
let median = tf.slice(top.values, [0,0,m_idx-1], [-1,-1,1] );
return median;