I have a three-channel (color) image and I want to use fancy indexing to find pixels of a specific color and then manipulate their color values.
Let's say I want to change every black pixel (0, 0, 0) to blue (0, 0, 255).
Without fancy indexing it works like this:
for x in range(0, width):
    for y in range(0, height):
        if (image[y, x] == (0, 0, 0)).all():
            image[y, x] = (0, 0, 255)
So for fancy indexing I tried this:
image[(image[:, :] == (0, 0, 0)).all()] = (0, 0, 255)
But it does not work, because it checks the condition for every single channel instead of the whole channel tuple. It also didn't work without the .all(), or with the other variations I tried. I somehow need to specify not to unfold the array beyond the first two dimensions before evaluating the condition (that's what I tried to do with image[:, :], but it does nothing...).
Is there a way to solve this with fancy indexing? The loop solution is very slow.
.all() reduces the whole array to a single scalar value. In your case, you just want to reduce each pixel to a scalar (i.e. to reduce along the last axis). You can do that by specifying the axis:
image[(image[:, :] == (0, 0, 0)).all(axis=2)] = (0, 0, 255)
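For completeness, a tiny runnable demonstration (my own toy array, not the asker's image):

import numpy as np

# 2x2 "image" where three pixels are black and one is not.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 1] = (10, 20, 30)

image[(image == (0, 0, 0)).all(axis=2)] = (0, 0, 255)
print(image[0, 0])  # [  0   0 255] -- black pixel turned blue
print(image[0, 1])  # [10 20 30]    -- non-black pixel untouched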
For a 1-d array, what kind of x gives you argsort(x) == argsort(argsort(x))? A sorted array is a trivial solution,
but you can also have unsorted arrays like [1, 0, 2] or [1, 0, 2, 3].
I'm really curious.
import numpy as np

sorted_array = np.arange(10)
np.testing.assert_array_equal(np.argsort(sorted_array), np.argsort(np.argsort(sorted_array)))
# or
semi_sorted = [1, 0, 2]
np.testing.assert_array_equal(np.argsort(semi_sorted), np.argsort(np.argsort(semi_sorted)))
# or
semi_sorted = [1, 0, 2, 3]
np.testing.assert_array_equal(np.argsort(semi_sorted), np.argsort(np.argsort(semi_sorted)))
# or
semi_sorted = [2, 1, 3, 4, 5]
np.testing.assert_array_equal(np.argsort(semi_sorted), np.argsort(np.argsort(semi_sorted)))
What type of arrays fit the criteria?
To formalize @Alex Riley's intuition:
For any (zero-based) permutation p we have argsort(p) = p^-1, because by definition of argsort, p[argsort(p)] = [0, 1, 2, ...], and [0, 1, 2, ...] viewed as a permutation is the identity.
Now, no matter what x is, argsort(x) is a permutation, so writing p for it we get p = p^-1 or, equivalently, p^2 = id.
What do self-inverse permutations p look like? If p is applied twice, nothing changes, so if the first application of p moves element i to position j, the second application must move j back to i. Since j may equal i, p must consist of swaps of pairs of elements plus elements that stay put. That is also sufficient.
We now know what argsort(x) looks like. What about x itself? For simplicity, let us assume x has only unique elements; otherwise the details of the sort algorithm used would have to be considered. Write s for the sorted x. Then s = x[p]. Permuting both sides with p we get s[p] = x[p^2] = x. So x may be any sequence obtained from an ordered sequence by swapping the positions of some (possibly zero) non-overlapping pairs.
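A quick numerical check of this characterization (my own sketch): start from a sorted sequence, swap a few non-overlapping pairs, and verify the property.

import numpy as np

x = np.arange(10)
x[[0, 3]] = x[[3, 0]]  # swap one pair
x[[5, 6]] = x[[6, 5]]  # swap another, non-overlapping pair
# x is now [3, 1, 2, 0, 4, 6, 5, 7, 8, 9]

p = np.argsort(x)
np.testing.assert_array_equal(p, np.argsort(p))  # argsort(x) == argsort(argsort(x))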
I am trying to count the number of unique colours in an image. I have some code that I think should work; however, when I run it on an image it says I have 252 different colours out of a possible 16,777,216. That seems wrong: given the image is BGR, shouldn't there be many more colours (thousands, not hundreds)?
import cv2
import imutils
import numpy as np

def count_colours(src):
    unique, counts = np.unique(src, return_counts=True)
    print(counts.size)
    return counts.size
src = cv2.imread('../../images/di8.jpg')
src = imutils.resize(src, height=300)
count_colours(src) # outputs 252 different colours!? only?
Is that value correct? And if not, how can I fix my function count_colours()?
Source image:
Edit: is this correct?
def count_colours(src):
    unique, counts = np.unique(src.reshape(-1, src.shape[-1]),
                               axis=0, return_counts=True)
    return counts.size
If you look at the uniques you are getting back, I'm pretty sure you'll find they are scalars.
You need to use the axis keyword:
>>> import numpy as np
>>> from scipy.misc import face
>>>
>>> img = face()
>>> np.unique(img.reshape(-1, img.shape[-1]), axis=0, return_counts=True)
(array([[  0,   0,   5],
        [  0,   0,   7],
        [  0,   0,   9],
        ...,
        [255, 248, 255],
        [255, 249, 255],
        [255, 252, 255]], dtype=uint8), array([1, 2, 2, ..., 1, 1, 1]))
The comment by @Edeki Okoh is correct. You need to find a way to take the color channels into account. There is probably a much cleaner solution, but a hacky way to do this would be something like the following. Each color channel has values from 0 to 255, so we add 1 in order to make sure the multiplied channels always contribute. Blue occupies the last three digits, green the middle three and red the first three, so every value represents a unique color.
b, g, r = cv2.split(src)
shifted_im = b + 1000 * (g + 1) + 1000 * 1000 * (r + 1)
The resulting image has one channel, with each value representing a unique color combination.
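Counting the unique colours is then a single np.unique call on that channel (a short sketch continuing the snippet above):

import numpy as np

# Each value of shifted_im encodes one (b, g, r) combination,
# so the number of unique values equals the number of unique colours.
n_colours = np.unique(shifted_im).size
print(n_colours)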
I think you only counted a single channel, e.g. the R values, out of the full RGB channels. That's why you only get 252 discrete values.
In theory R, G and B can each take 256 discrete states.
256 * 256 * 256 = 16777216
means in total you can have 16,777,216 possible colors.
My suggestion is to pack the RGB uchar CV_8UC3 data into a single 32-bit integer structure such as CV_32SC1.
Given an image as input:
# My small test image, for which I can count the number of colours by hand.
import cv2
import numpy as np

image = cv2.imread('/home/usr/naneDownloads/vuQ9y.png')  # change here
b, g, r = cv2.split(image)
# Bit-shift each channel into its own byte. Note the parentheses: in Python,
# + binds tighter than <<, so b << 16 + g << 8 + r would be wrong.
out_in_32U_2D = (np.int32(b) << 16) + (np.int32(g) << 8) + np.int32(r)
out_in_32U_1D = out_in_32U_2D.reshape(-1)  # convert to 1D
print(np.unique(out_in_32U_1D))       # the distinct 32-bit colour codes
print(len(np.unique(out_in_32U_1D)))  # 37 -- correct when compared with my manual count
This code should give you what you need.
I'm new to TensorFlow. I'm processing a number of feature maps from two images. At a certain point I have N feature maps for each of the two images, and I want to obtain a single feature volume that concatenates the two.
I can simply concat them with tf.concat([features1, features2], 3) in order to obtain a new volume having, for each pixel, the features at that position from both images.
What if I want to concat features with pixels having different coordinates on the two images?
For example, I have a function mapping x, y on the first image to u, v on the second image. Such a function does not follow a shared rule for all pixels (e.g., it's not a simple horizontal translation). Using numpy arrays, the behavior would be this:
for y in range(0, H):
    for x in range(0, W):
        u, v = f(x, y)
        concat[y][x] = np.concatenate([image1[y][x], image2[v][u]])
I tried to slice single-pixel features and concat them together within for loops, but as you know this is very inefficient (and infeasible with large images; the required memory is just too high).
matrix = []
for y in range(0, H):
    row = []
    for x in range(0, W):
        u, v = f(x, y)  # coordinates of the matching pixel in image2
        row.append(tf.concat([tf.slice(image1, [0, y, x, 0], [-1, 1, 1, -1]),
                              tf.slice(image2, [0, v, u, 0], [-1, 1, 1, -1])], 3))
    row_array = tf.concat(row, 2)
    matrix.append(row_array)
result = tf.concat(matrix, 1)
What's the best option, if it exists?
Is there a neat way to compute a color histogram of an image? Maybe by abusing the internal code of tf.histogram_summary? From what I've seen, this code is not very modular and directly calls some C++ code.
Thanks in advance.
I would use tf.unsorted_segment_sum, where the "segment IDs" are computed from the color values and the thing you sum is a tf.ones vector. Note that tf.unsorted_segment_sum is probably better thought of as "bucket sum". It implements dest[segment] += thing_to_sum -- exactly the operation you need for a histogram.
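As a tiny illustration of that bucket-sum behaviour (my own toy example, not from the original answer): summing ones with segment IDs [0, 2, 0, 1] into three buckets gives [2, 1, 1].

import tensorflow as tf

ones = tf.ones([4], dtype=tf.int32)      # the things to sum
segment_ids = tf.constant([0, 2, 0, 1])  # the bucket each element falls into
counts = tf.unsorted_segment_sum(ones, segment_ids, 3)
# counts evaluates to [2, 1, 1]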
In slightly pseudocode (meaning I haven't run this):
binned_values = tf.reshape(tf.floor(img_r * (NUM_BINS-1)), [-1])
binned_values = tf.cast(binned_values, tf.int32)
ones = tf.ones_like(binned_values, dtype=tf.int32)
counts = tf.unsorted_segment_sum(ones, binned_values, NUM_BINS)
You could accomplish this in one pass instead of separating out the r, g, and b values with a split if you wanted to cleverly construct your "ones" to look like "100100..." for red, "010010" for green, etc., but I suspect it would be slower overall, and harder to read. I'd just do the split that you proposed above.
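Filling in that pseudocode, a runnable single-channel version might look like this (my own sketch, written against the TF 1.x API used elsewhere in this thread; img_r is a stand-in channel with values assumed in [0, 1]):

import tensorflow as tf

NUM_BINS = 16
img_r = tf.random_uniform([64, 64])  # stand-in red channel, values in [0, 1)

binned_values = tf.cast(tf.reshape(tf.floor(img_r * (NUM_BINS - 1)), [-1]), tf.int32)
ones = tf.ones_like(binned_values, dtype=tf.int32)
counts = tf.unsorted_segment_sum(ones, binned_values, NUM_BINS)

with tf.Session() as sess:
    print(sess.run(counts))  # NUM_BINS bin counts summing to 64 * 64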
This is what I'm using right now:
# Assumption: img is a tensor of the size [img_width, img_height, 3], normalized to the range [-1, 1].
with tf.variable_scope('color_hist_producer') as scope:
    bin_size = 0.2
    hist_entries = []
    # Split image into single channels.
    img_r, img_g, img_b = tf.split(2, 3, img)
    for img_chan in [img_r, img_g, img_b]:
        for idx, i in enumerate(np.arange(-1, 1, bin_size)):
            gt = tf.greater(img_chan, i)
            leq = tf.less_equal(img_chan, i + bin_size)
            # Combine with logical_and, cast to float and sum up entries -> gives the count for the current bin.
            hist_entries.append(tf.reduce_sum(tf.cast(tf.logical_and(gt, leq), tf.float32)))
    # Pack the scalars into a tensor, then normalize the histogram.
    hist = tf.nn.l2_normalize(tf.pack(hist_entries), 0)
tf.histogram_fixed_width might be what you are looking for...
Full documentation at https://www.tensorflow.org/api_docs/python/tf/histogram_fixed_width
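For example, a minimal sketch (my own, assuming channel values normalized to [0, 1]):

import tensorflow as tf

values = tf.constant([0.05, 0.12, 0.5, 0.51, 0.99])
hist = tf.histogram_fixed_width(values, value_range=[0.0, 1.0], nbins=10)

with tf.Session() as sess:
    print(sess.run(hist))  # [1 1 0 0 0 2 0 0 0 1]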
I have 2 NumPy arrays on which I need to perform some basic math operations.
But I also can't have the result of these operations be greater than 255, due to the type (uint8) of the final numpy array (named magnitude). Any idea? Other than traversing through the array...
# Notice that the data type is "np.uint8", also arrays are 2D
magnitude = np.zeros((org_im_width,org_im_height), dtype=np.uint8)
# "numpy_arr_1" and "numpy_arr_2" both of the same size & type as "magnitude"
# In the following operation, I should limit the number to 255
magnitude = ( (np.int_(numpy_arr_1))**2 + (np.int_(numpy_arr_2))**2 )**0.5
# The following doesn't work obviously:
# magnitude = min(255,((np.int_(numpy_arr_1))**2+(np.int_(numpy_arr_2))**2)**0.5)
First of all, if you assign magnitude = ... after its creation, you are replacing the initial uint8 array with the one obtained from the operation, so magnitude won't be uint8 anymore.
Anyway, in case that is just a mistake in the example, to achieve what you want you can either clamp/clip or normalize the values of the resulting operation:
You can use np.clip, which limits the values of an array to given min and max values:
>>> magnitude = np.clip(operation, 0, 255)
Where operation is the magnitude you calculate. In fact, what you might want is:
>>> magnitude = np.clip(np.sqrt(a**2 + b**2), 0, 255).astype(np.uint8)
Where a and b are your np.int_(numpy_arr_1) and np.int_(numpy_arr_2) respectively, renamed for readability purposes.
Additionally, as in your case all the values are positive, you can replace np.clip by np.minimum:
>>> magnitude = np.minimum(np.sqrt(a**2 + b**2), 255).astype(np.uint8)
However, this just limits the magnitude of the vector to 255 (what you want), but you will lose a lot of information for points of higher magnitude. If the magnitude at some point is 1000, it will be clamped to 255, and so in your final array 1000 = 255: two points with widely different magnitudes (1000 and 255 in this case) end up with the same value.
To avoid this, you can normalize (re-scale) your range of magnitudes to [0, 255]. That is, if your computed magnitude array spans [0, 1000], transform it to [0, 255], so a former 1000 becomes 255 and a former 255 becomes 65 (simple linear scaling).
>>> tmp = np.sqrt(a**2 + b**2).astype(float)
>>> magnitude = (tmp / tmp.max() * 255).astype(np.uint8)
tmp / tmp.max() rescales all the values to the [0, 1] range (since the array is float), and multiplying by 255 rescales them to [0, 255] again.
In case your magnitudes' lower bound is not 0, you can instead re-scale from, say, [200, 1000] to [0, 255], which better represents your data:
>>> tmp = np.sqrt(a**2 + b**2).astype(float)
>>> tmax, tmin = tmp.max(), tmp.min()
>>> magnitude = ((tmp - tmin) / (tmax - tmin) * 255).astype(np.uint8)