Creating image from point list with Numpy, how to speed up? - optimization

The following code seems to be a performance bottleneck:
for x, y, intensity in myarr:
    target_map[x, y] = target_map[x, y] + intensity
The same coordinate can occur multiple times, each time with a different intensity.
Datatypes:
> print myarr.shape, myarr.dtype
(219929, 3) uint32
> print target_map.shape, target_map.dtype
(150, 200) uint32
Is there any way to optimize this loop, other than writing it in C?
This seems to be a related question; however, I couldn't get the accepted answer to work for me: How to convert python list of points to numpy image array?
I get following error message:
Traceback (most recent call last):
File "<pyshell#38>", line 1, in <module>
image[coordinates] = 1
IndexError: too many indices for array

If you convert your 2D coordinates in target_map into flat indices using np.ravel_multi_index, you can use np.unique and np.bincount to speed things up quite a bit:
import numpy as np

def vec_intensity(my_arr, target_map):
    # Flatten the (x, y) pairs into indices of the raveled target_map
    flat_coords = np.ravel_multi_index((my_arr[:, 0], my_arr[:, 1]),
                                       dims=target_map.shape)
    unique_, idx = np.unique(flat_coords, return_inverse=True)
    # bincount returns float64 sums; cast back before the in-place add
    sum_ = np.bincount(idx, weights=my_arr[:, 2]).astype(target_map.dtype)
    target_map.ravel()[unique_] += sum_
    return target_map

def intensity(my_arr, target_map):
    for x, y, intensity in my_arr:
        target_map[x, y] += intensity
    return target_map
#sample data set
rows, cols = 150, 200
items = 219929
myarr = np.empty((items, 3), dtype=np.uint32)
myarr[:, 0] = np.random.randint(rows, size=(items,))
myarr[:, 1] = np.random.randint(cols, size=(items,))
myarr[:, 2] = np.random.randint(100, size=(items,))
And now:
In [6]: %timeit target_map_1 = np.zeros((rows, cols), dtype=np.uint32); target_map_1 = vec_intensity(myarr, target_map_1)
10 loops, best of 3: 53.1 ms per loop
In [7]: %timeit target_map_2 = np.zeros((rows, cols), dtype=np.uint32); target_map_2 = intensity(myarr, target_map_2)
1 loops, best of 3: 934 ms per loop
In [8]: np.all(target_map_1 == target_map_2)
Out[8]: True
That's roughly an 18x speed increase (934 ms vs. 53.1 ms).
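Because the flat indices are small non-negative integers, the np.unique step can probably be skipped by passing minlength to np.bincount. The following is a minimal sketch under that assumption, not the answerer's method. The name accumulate_points is made up for illustration, and the function builds a fresh map instead of updating an existing one.

```python
import numpy as np

def accumulate_points(points, shape):
    """Sum the intensities in points[:, 2] per (x, y) cell of a new map.

    Hypothetical helper: `points` is an (N, 3) integer array of
    x, y, intensity rows and `shape` is the (rows, cols) of the map.
    """
    # Cast to intp so ravel_multi_index accepts the coordinates on any platform.
    xs = points[:, 0].astype(np.intp)
    ys = points[:, 1].astype(np.intp)
    flat = np.ravel_multi_index((xs, ys), shape)
    # bincount sums the weights per flat index; minlength pads unused cells.
    sums = np.bincount(flat, weights=points[:, 2],
                       minlength=shape[0] * shape[1])
    # The sums come back as float64, so convert to the map's integer dtype.
    return sums.reshape(shape).astype(np.uint32)

points = np.array([[0, 0, 5], [0, 0, 7], [1, 2, 3]], dtype=np.uint32)
image = accumulate_points(points, (2, 3))
```

To add to an existing map, the result can simply be added to it, for example target_map += accumulate_points(myarr, target_map.shape).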

Related

Vectorized running bin index calculation with Tensorflow or numpy

I have an integer array like this:
in=[1, 2, 6, 1, 3, 2, 1]
I would like to calculate a running index for the equal values in the array. For the array above the output would be:
out=[0, 0, 0, 1, 0, 1, 2]
So the naive implementation would be to have a counter for all the values. I would like to have a vectorized solution to run it with tensorflow, perhaps with numpy.
I already thought of creating a 2D tensor of shape=(in.shape[0], tf.max(in), ) and writing 1 to the tensor[i, in[i]] cell, and then call a cumsum column-wise, then writing back row-wise. But my input array is quite big (with several 100k entries) with the maximum value of ~500k, thus this sparse matrix wouldn't even fit into the memory.
Do you have better suggestions? Thank you!
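For reference, the naive counter-based implementation mentioned in the question can be sketched as below. It is only a plain-Python baseline for checking vectorized versions on small inputs; the function name running_index_naive is an illustrative assumption.

```python
from collections import defaultdict

def running_index_naive(values):
    """Return, for each element, how many equal values appeared before it."""
    seen = defaultdict(int)  # value -> number of occurrences so far
    out = []
    for v in values:
        out.append(seen[v])
        seen[v] += 1
    return out

result = running_index_naive([1, 2, 6, 1, 3, 2, 1])
```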
Here's a pandas solution:
s = pd.Series([1, 2, 6, 1, 3, 2, 1])
s.groupby(s).cumcount().values
Output:
array([0, 0, 0, 1, 0, 1, 2], dtype=int64)
Test on similar sized data:
s = pd.Series(np.random.randint(0,500000, 100000))
%timeit -n 100 s.groupby(s).cumcount().values
# 23.9 ms ± 562 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
You can use an actual sparse matrix, i.e. use sparse storage. With that an input like a = np.random.randint(0,5*10**5,10**6) is no problem:
import numpy as np
from scipy import sparse
def running(a):
    n,m = a.size,a.max()+1
    aux = sparse.csr_matrix((np.ones_like(a),a,np.arange(n+1)),(n,m)).tocsc()
    msk = aux.indptr[1:] != aux.indptr[:-1]
    indptr = aux.indptr[:-1][msk]
    aux.data[0] = 0
    aux.data[indptr[1:]] -= np.diff(indptr)
    out = np.empty_like(a)
    out[aux.indices] = aux.data.cumsum()
    return out
# alternative method for validation
def use_argsort(a):
    indices = a.argsort(kind="stable")
    ao = a[indices]
    indptr = np.concatenate([[0],(ao[1:] != ao[:-1]).nonzero()[0]+1])
    data = np.ones_like(a)
    data[0] = 0
    data[indptr[1:]] -= np.diff(indptr)
    out = np.empty_like(a)
    out[indices] = data.cumsum()
    return out
in_ = np.array([1, 2, 6, 1, 3, 2, 1])
print("OP example",in_,"->",running(in_))
print("second opinion","->",use_argsort(in_))
from timeit import timeit
A = np.random.randint(0,500_000,1_000_000)
print("large example (500k labels, 1M entries) takes",
      timeit(lambda:running(A),number=10)*100,"ms")
print("using other method takes",
      timeit(lambda:use_argsort(A),number=10)*100,"ms")
print("same result:",(use_argsort(A) == running(A)).all())
Sample run:
OP example [1 2 6 1 3 2 1] -> [0 0 0 1 0 1 2]
second opinion -> [0 0 0 1 0 1 2]
large example (500k labels, 1M entries) takes 84.1427305014804 ms
using other method takes 262.38483290653676 ms
same result: True

Convert numpy array with many dimensions into 2D array with nested numpy arrays

I would like to convert an array with many dimensions (more than 2) into a 2D array where other dimensions would be converted to nested stand-alone arrays.
So if I have an array like numpy.arange(3 * 4 * 5 * 5 * 5).reshape((3, 4, 5, 5, 5)), I would like to convert it to an array of shape (3, 4), where each element would be an array of shape (5, 5, 5). The dtype of the outer array would be object.
For example, for np.arange(8).reshape((1, 1, 2, 2, 2)), the output would be equivalent to:
a = np.ndarray(shape=(1,1), dtype=object)
a[0, 0] = np.arange(8).reshape((1, 1, 2, 2, 2))[0, 0, :, :, :]
How can I do this efficiently?
We can reshape and assign elements from the regular array into the output object dtype array in a single loop that seems to be a tad faster than with two loops, like so -
def reshape_approach(a):
    m,n = a.shape[:2]
    a.shape = (m*n,) + a.shape[2:]
    out = np.empty((m*n),dtype=object)
    for i in range(m*n):
        out[i] = a[i]
    out.shape = (m,n)
    a.shape = (m,n) + a.shape[1:]
    return out
Runtime test
Other approach(es) -
# @Scotty1-'s soln
def simply_assign(a):
    m,n = a.shape[:2]
    out = np.empty((m,n),dtype=object)
    for i in range(m):
        for j in range(n):
            out[i,j] = a[i,j]
    return out
Timings -
In [154]: m,n = 300,400
...: a = np.arange(m * n * 5 * 5 * 5).reshape((m,n, 5, 5, 5))
In [155]: %timeit simply_assign(a)
10 loops, best of 3: 39.4 ms per loop
In [156]: %timeit reshape_approach(a)
10 loops, best of 3: 32.9 ms per loop
With 7D data -
In [160]: m,n,p,q = 30,40,30,40
...: a = np.arange(m * n *p * q * 5 * 5 * 5).reshape((m,n,p,q, 5, 5, 5))
In [161]: %timeit simply_assign(a)
1000 loops, best of 3: 421 µs per loop
In [162]: %timeit reshape_approach(a)
1000 loops, best of 3: 316 µs per loop
Thanks for your hint, Mitar. This is how it should look using dtype=object arrays:
outer_array = np.empty((x.shape[0], x.shape[1]), dtype=object)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        outer_array[i, j] = x[i, j]
Looping may not be the most efficient way to do it, but as far as I know there is no vectorized operation for this task.
I expected that some more reshaping would make this even faster than Divakar's solution, but it does not: Divakar's is faster. Nice solution, Divakar!
def advanced_reshape_solution(x):
    m, n = x.shape[:2]
    sub_arr_size = np.prod(x.shape[2:])
    out_array = np.empty((m * n), dtype=object)
    x_flat_view = x.reshape(-1)
    for i in range(m*n):
        out_array[i] = x_flat_view[i * sub_arr_size:(i + 1) * sub_arr_size].reshape(x.shape[2:])
    return out_array.reshape((m, n))
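To check what any of these approaches should produce, here is a small self-contained sketch using the example from the question. It uses np.ndindex in place of the two nested loops; the helper name to_nested is an assumption for illustration, and no speed claim is made for it.

```python
import numpy as np

def to_nested(a):
    """Wrap the trailing dimensions of `a` into an (m, n) object array."""
    m, n = a.shape[:2]
    out = np.empty((m, n), dtype=object)
    # np.ndindex yields every (i, j) pair, replacing two nested loops.
    for idx in np.ndindex(m, n):
        out[idx] = a[idx]  # each cell holds a view of shape a.shape[2:]
    return out

a = np.arange(8).reshape((1, 1, 2, 2, 2))
nested = to_nested(a)
```

Each cell stores a view into the original array, so no element data is copied.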

Row-wise Histogram

Given a 2-dimensional tensor t, what's the fastest way to compute a tensor h where
h[i, :] = tf.histogram_fixed_width(t[i, :], vals, nbins)
I.e. where tf.histogram_fixed_width is called per row of the input tensor t?
It seems that tf.histogram_fixed_width is missing an axis parameter that works like, e.g., tf.reduce_sum's axis parameter.
tf.histogram_fixed_width works on the entire tensor indeed. You have to loop through the rows explicitly to compute the per-row histograms. Here is a complete working example using TensorFlow's tf.while_loop construct :
import tensorflow as tf
t = tf.random_uniform([2, 2])
i = 0
hist = tf.constant(0, shape=[0, 5], dtype=tf.int32)
def loop_body(i, hist):
    h = tf.histogram_fixed_width(t[i, :], [0.0, 1.0], nbins=5)
    return i+1, tf.concat_v2([hist, tf.expand_dims(h, 0)], axis=0)
i, hist = tf.while_loop(
    lambda i, _: i < 2, loop_body, [i, hist],
    shape_invariants=[tf.TensorShape([]), tf.TensorShape([None, 5])])
sess = tf.InteractiveSession()
print(hist.eval())
Inspired by keveman's answer and because the number of rows of t is fixed and rather small, I chose to use a combination of tf.gather to split rows and tf.pack to join rows. It looks simple and works, will see if it is efficient...
t_histo_rows = [
    tf.histogram_fixed_width(
        tf.gather(t, [row]),
        vals, nbins)
    for row in range(t_num_rows)]
t_histo = tf.pack(t_histo_rows, axis=0)
I would like to propose another implementation.
This implementation can also handle multi axes and unknown dimensions (batching).
def histogram(tensor, nbins=10, axis=None):
    value_range = [tf.reduce_min(tensor), tf.reduce_max(tensor)]
    if axis is None:
        return tf.histogram_fixed_width(tensor, value_range, nbins=nbins)
    else:
        if not hasattr(axis, "__len__"):
            axis = [axis]
        other_axis = [x for x in range(0, len(tensor.shape)) if x not in axis]
        swap = tf.transpose(tensor, [*other_axis, *axis])
        flat = tf.reshape(swap, [-1, *np.take(tensor.shape.as_list(), axis)])
        count = tf.map_fn(lambda x: tf.histogram_fixed_width(x, value_range, nbins=nbins), flat, dtype=(tf.int32))
        return tf.reshape(count, [*np.take([-1 if a is None else a for a in tensor.shape.as_list()], other_axis), nbins])
The only slow part here is tf.map_fn but it is still faster than the other solutions mentioned.
If someone knows an even faster implementation, please comment, since this operation is still very expensive.
The answers above are still slow when running on a GPU. Here is another option, which is faster (at least in my environment), but it is limited to values in the range 0 to 1 (you can normalize the values first). The train_equal_mask_nbin tensor can be defined once in advance.
def histogram_v3_nomask(tensor, nbins, row_num, col_num):
    # init mask
    equal_mask_list = []
    for i in range(nbins):
        equal_mask_list.append(tf.ones([row_num, col_num], dtype=tf.int32) * i)
    # shape [nbins, row, col]
    # [0, row, col] is a tensor of shape [row, col] with all values 0
    # [1, row, col] is a tensor of shape [row, col] with all values 1
    # ....
    train_equal_mask_nbin = tf.stack(equal_mask_list, axis=0)
    # [row, col] float to int (floats split equally into bins)
    int_input = tf.cast(tensor * (nbins), dtype=tf.int32)
    # input [row, col] -> copied nbins times, [nbins, row_num, col_num]
    int_input_nbin_copy = tf.reshape(tf.tile(int_input, [nbins, 1]), [nbins, row_num, col_num])
    # calculate histogram
    histogram = tf.transpose(tf.count_nonzero(tf.equal(train_equal_mask_nbin, int_input_nbin_copy), axis=2))
    return histogram
With the advent of tf.math.bincount, I believe the problem has become much simpler.
Something like this should work:
def hist_fixed_width(x,st,en,nbins):
    x=(x-st)/(en-st)
    x=tf.cast(x*nbins,dtype=tf.int32)
    x=tf.clip_by_value(x,0,nbins-1)
    return tf.math.bincount(x,minlength=nbins,axis=-1)
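The same binning idea can be tried out in plain NumPy, which is handy for checking results on small inputs. The sketch below is a NumPy analogue, not TensorFlow code: np.bincount has no axis argument, so each row's bin indices are shifted by a per-row offset before a single flat count. The name rowwise_hist is an illustrative assumption.

```python
import numpy as np

def rowwise_hist(x, start, end, nbins):
    """Fixed-width histogram of each row of a 2D array over [start, end]."""
    x = np.asarray(x, dtype=np.float64)
    # Map every value to a bin index, clipping out-of-range values
    # (including x == end) into the first or last bin.
    idx = ((x - start) / (end - start) * nbins).astype(np.int64)
    idx = np.clip(idx, 0, nbins - 1)
    rows = x.shape[0]
    # Shift row r by r * nbins so one flat bincount keeps rows separate.
    offsets = np.arange(rows)[:, None] * nbins
    counts = np.bincount((idx + offsets).ravel(), minlength=rows * nbins)
    return counts.reshape(rows, nbins)

hist = rowwise_hist([[0.0, 0.1, 0.9], [0.5, 0.5, 1.0]], 0.0, 1.0, nbins=2)
```

For this input the first row has two values in the lower bin and one in the upper bin, while all three values of the second row land in the upper bin.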

New error during set_labels in pandas 0.19.2: ValueError: Unequal label lengths

After upgrading from Pandas 0.18.1 to 0.19.2, I am getting the following error when I try to add new levels and labels to my dataframe. Any idea what the problem is?
print index
MultiIndex(levels=[[u'1', u'2'], [u'nextLevel']],
labels=[[0, 1], [0, 0]],
names=[u'segment..ASRinfo..supportedUtt', u'label'])
print levels
[['1', '2', 'Total'], ['nextLevel']]
print labels
[[0, 1, 2], [0, 0, 0]]
index = index.set_levels(levels)
print index
MultiIndex(levels=[[u'Supported', u'Unsupported', u'Total'], [u'nextLevel']],
labels=[[0, 1], [0, 0]],
names=[u'segment..ASRinfo..supportedUtt', u'label'])
index = index.set_labels(labels)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-f6fb11fbbb3a> in <module>()
288
289 # Initialize dfplot
--> 290 slice_data()
291
292 if len(resultList)==1:
<ipython-input-11-f6fb11fbbb3a> in slice_data(*args)
71 index = index.set_levels(levels)
72 print index
---> 73 index = index.set_labels(labels)
74 data_slice = data_slice.reindex(index)
75
/Users/user1/anaconda/lib/python2.7/site-packages/pandas/indexes/multi.pyc in set_labels(self, labels, level, inplace, verify_integrity)
350 idx = self._shallow_copy()
351 idx._reset_identity()
--> 352 idx._set_labels(labels, level=level, verify_integrity=verify_integrity)
353 if not inplace:
354 return idx
/Users/user1/anaconda/lib/python2.7/site-packages/pandas/indexes/multi.pyc in _set_labels(self, labels, level, copy, validate, verify_integrity)
285
286 if verify_integrity:
--> 287 self._verify_integrity(labels=new_labels)
288
289 self._labels = new_labels
/Users/user1/anaconda/lib/python2.7/site-packages/pandas/indexes/multi.pyc in _verify_integrity(self, labels, levels)
145 if len(label) != label_length:
146 raise ValueError("Unequal label lengths: %s" %
--> 147 ([len(lab) for lab in labels]))
148 if len(label) and label.max() >= len(level):
149 raise ValueError("On level %d, label max (%d) >= length of"
ValueError: Unequal label lengths: [3, 3]
I'm wondering if it's a bug in the new pandas code. Perhaps self.labels[0] should be labels[0]?
def _verify_integrity(self, labels=None, levels=None):
    """
    Parameters
    ----------
    labels : optional list
        Labels to check for validity. Defaults to current labels.
    levels : optional list
        Levels to check for validity. Defaults to current levels.
    Raises
    ------
    ValueError
        * if length of levels and labels don't match or any label would
        exceed level bounds
    """
    # NOTE: Currently does not check, among other things, that cached
    # nlevels matches nor that sortorder matches actually sortorder.
    labels = labels or self.labels
    levels = levels or self.levels
    if len(levels) != len(labels):
        raise ValueError("Length of levels and labels must match. NOTE:"
                         " this index is in an inconsistent state.")
    label_length = len(self.labels[0])
    for i, (level, label) in enumerate(zip(levels, labels)):
        if len(label) != label_length:
            raise ValueError("Unequal label lengths: %s" %
                             ([len(lab) for lab in labels]))
        if len(label) and label.max() >= len(level):
            raise ValueError("On level %d, label max (%d) >= length of"
                             " level (%d). NOTE: this index is in an"
                             " inconsistent state" % (i, label.max(),
                                                      len(level)))
I tested my fix and it worked! I submitted a bug to Pandas:
https://github.com/pandas-dev/pandas/issues/15157
I'm not sure it's a bug. I suppose pandas could fill all the extra index entries with missing values when done your way, but I think you should use reindex:
df.reindex(index2)
index = pd.MultiIndex(levels=[[u'1', u'2'], [u'nextLevel']],
                      labels=[[0, 1], [0, 0]],
                      names=[u'segment..ASRinfo..supportedUtt', u'label'])
index2 = pd.MultiIndex(levels=[['1', '2', 'Total'], ['nextLevel']],
                       labels=[[0, 1, 2], [0, 0, 0]],
                       names=[u'segment..ASRinfo..supportedUtt', u'label'])
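A minimal, self-contained sketch of the reindex approach follows. The frame, its value column, and the short level names are made-up illustration data, and the indexes are built with MultiIndex.from_tuples so the snippet does not depend on the labels keyword, which newer pandas releases call codes.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the grouped data.
old_index = pd.MultiIndex.from_tuples(
    [("1", "nextLevel"), ("2", "nextLevel")],
    names=["segment", "label"])
df = pd.DataFrame({"value": [10, 20]}, index=old_index)

# Target index with one extra "Total" row.
new_index = pd.MultiIndex.from_tuples(
    [("1", "nextLevel"), ("2", "nextLevel"), ("Total", "nextLevel")],
    names=["segment", "label"])

# reindex keeps the existing rows and fills the new row with NaN.
extended = df.reindex(new_index)
had_gap = bool(extended["value"].isna().iloc[-1])

# Fill the new row afterwards, here with the column sum.
extended.loc[("Total", "nextLevel"), "value"] = df["value"].sum()
```

Building the full target index first and reindexing once avoids having to keep levels and labels consistent by hand.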
I am new to Pandas, and I found the documentation on MultiIndexing difficult to adapt to solving my own problem. Basically, I want to add some extra rows. This is the solution I came up with. There is probably a much better way to do it. Feel free to share if you'd like.
groupbyColumns = ['label0', 'label1']
data_slice = dataframe.groupby(by=groupbyColumns).sum()
index = data_slice.index
levels = list()
for levelIter in range(len(data_slice.index.levels)):
    levels.append([x for x in data_slice.index.levels[levelIter]])
levels[0].append('Total')
if len(resultList)==2:
    levels[-1].append('Difference')
    addIndexCountForDifferenceRow = 1
else:
    addIndexCountForDifferenceRow = 0
# Create new indexing sequence since we are adding Total (and Difference if doing comparison) rows
labels = list()
for labelIter in range(len(data_slice.index.labels)):
    labels.append(list())
if len(data_slice.index.labels)==2:
    labels0 = [x for x in data_slice.index.labels[0]]
    labels1 = [x for x in data_slice.index.labels[1]]
    for iter0 in range(max(labels0)+2):
        for iter1 in range(max(labels1)+1+addIndexCountForDifferenceRow):
            labels[0].append(iter0)
            labels[1].append(iter1)
if len(data_slice.index.labels)==3:
    labels0 = [x for x in data_slice.index.labels[0]]
    labels1 = [x for x in data_slice.index.labels[1]]
    labels2 = [x for x in data_slice.index.labels[2]]
    for iter0 in range(max(labels0)+2):
        for iter1 in range(max(labels1)+1):
            for iter2 in range(max(labels2)+1+addIndexCountForDifferenceRow):
                labels[0].append(iter0)
                labels[1].append(iter1)
                labels[2].append(iter2)
index = index.set_levels(levels)
index = index.set_labels(labels)
data_slice = data_slice.reindex(index)

How can I make a greyscale copy of a Surface in pygame?

In pygame, I have a surface:
im = pygame.image.load('foo.png').convert_alpha()
im = pygame.transform.scale(im, (64, 64))
How can I get a grayscale copy of the image, or convert the image data to grayscale? I have numpy.
Use a Surfarray, and filter it with numpy or Numeric:
def grayscale(self, img):
    arr = pygame.surfarray.array3d(img)
    # luminosity filter
    avgs = [[(r*0.299 + g*0.587 + b*0.114) for (r,g,b) in col] for col in arr]
    arr = numpy.array([[[avg,avg,avg] for avg in col] for col in avgs])
    return pygame.surfarray.make_surface(arr)
After a lot of research, I came up with this solution, because the other answers to this question were too slow for what I needed:
def greyscale(surface: pygame.Surface):
    start = time.time()  # delete me!
    arr = pygame.surfarray.array3d(surface)
    # calculates the avg of the "rgb" values, this reduces the dim by 1
    mean_arr = np.mean(arr, axis=2)
    # restores the dimension from 2 to 3
    mean_arr3d = mean_arr[..., np.newaxis]
    # repeat the avg value obtained before over the axis 2
    new_arr = np.repeat(mean_arr3d[:, :, :], 3, axis=2)
    diff = time.time() - start  # delete me!
    # return the new surface
    return pygame.surfarray.make_surface(new_arr)
I used time.time() to calculate the time cost for this approach, so for a (800, 600, 3) array it takes: 0.026769161224365234 s to run.
As you pointed out, here is a variant preserving the luminance:
def greyscale(surface: pygame.Surface):
    arr = pygame.surfarray.pixels3d(surface)
    # standard luma weights for R, G, B (they sum to 1)
    mean_arr = np.dot(arr[:,:,:], [0.299, 0.587, 0.114])
    mean_arr3d = mean_arr[..., np.newaxis]
    new_arr = np.repeat(mean_arr3d[:, :, :], 3, axis=2)
    return pygame.surfarray.make_surface(new_arr)
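The array part of this conversion can be tested without pygame. The sketch below is a NumPy-only version that works on a (width, height, 3) array such as the one pygame.surfarray.array3d returns. The function name to_grey_array is an illustrative assumption, as is the choice to round and cast back to uint8.

```python
import numpy as np

# Standard luma weights for R, G, B; they sum to 1.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_grey_array(rgb):
    """Return a (w, h, 3) uint8 array with the luminance in every channel."""
    # Weighted sum over the colour axis gives a (w, h) float array.
    luma = rgb.astype(np.float64) @ LUMA_WEIGHTS
    # Round and clip before casting so values stay in the 0..255 range.
    grey = np.clip(np.rint(luma), 0, 255).astype(np.uint8)
    # Copy the single grey channel into three identical channels.
    return np.repeat(grey[..., np.newaxis], 3, axis=2)

rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 255, 255]  # white
rgb[1, 1] = [255, 0, 0]      # pure red
grey_rgb = to_grey_array(rgb)
```

The result can then be handed to pygame.surfarray.make_surface, as in the answers above.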
The easiest way is to iterate over all the pixels in your image and call .get_at(...) and .set_at(...).
This will be pretty slow, so in answer to your implicit suggestion about using NumPy, look at http://www.pygame.org/docs/tut/surfarray/SurfarrayIntro.html. The concepts and most of the code are identical.