A more efficient way of creating an NxM array in Python

In Python, I need to create an NxM matrix in which the (i, j) entry has the value i^2 + j^2.
I'm currently constructing it with two nested for loops, but the array is quite large, the computation takes a long time, and I need to repeat it several times. Is there a more efficient way of constructing such a matrix, perhaps using NumPy?

You can use broadcasting in numpy. You may refer to the official documentation. For example,
import numpy as np
N, M = 3, 4  # whatever values you'd like
a = (np.arange(N)**2).reshape((-1, 1))  # column vector of i^2, shape (N, 1)
b = np.arange(M)**2  # row of j^2, shape (M,)
print(a + b)  # broadcasting gives the (N, M) matrix of i^2 + j^2
Instead of np.arange(), you can build the two 1-D inputs with np.array([...some array...]) to customize the row and column values.
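As a rough sketch of the same idea (the function names below are illustrative, not from the answer), the nested-loop version can be checked against two vectorized equivalents: broadcasting a column against a row, and np.add.outer.

```python
import numpy as np

def squares_sum_loops(n, m):
    """Reference version: two nested Python loops (slow for large n, m)."""
    out = np.empty((n, m), dtype=np.int64)
    for i in range(n):
        for j in range(m):
            out[i, j] = i**2 + j**2
    return out

def squares_sum_broadcast(n, m):
    """Broadcast an (n, 1) column of i against a (1, m) row of j."""
    i = np.arange(n)[:, None]
    j = np.arange(m)[None, :]
    return i**2 + j**2

def squares_sum_outer(n, m):
    """Same result via the outer sum of the two squared ranges."""
    return np.add.outer(np.arange(n)**2, np.arange(m)**2)

print(squares_sum_broadcast(3, 4))
```

All three functions return the same N x M array; the vectorized versions avoid the Python-level loops, which is typically where the time goes for large N and M.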

Related

Numpy iterating over rows

I kind of have the misconception that for loops should be avoided in Numpy for speed reasons, for example
import numpy
a = numpy.array([[2,0,1,3],[0,2,3,1]])
targets = numpy.array([[1,1,1,1,1,1,1]])
output = numpy.zeros((2,1))
for i in range(2):
    output[i] = numpy.mean(targets[a[i]])
Is this a good way to get the mean on selected positions of each row? Feels like there might be ways to slice the array first then apply mean directly.

I think you are looking for this:
targets[a].mean(1)
Note that in your example, targets needs to be 1-D, not 2-D. Otherwise your loop raises an out-of-bounds IndexError, because targets[a[i]] then treats the indices as row indices rather than column indices.

NumPy handles this for you: targets[a] indexes "row-wise", so np.mean(targets[a], axis=1), as suggested by @hpaulj in the comments, does exactly what you want:
import numpy
a = numpy.array([[2,0,1,3],[0,2,3,1]])
targets = numpy.arange(1,6) # To make the results differ
output = numpy.mean(targets[a], axis=1) # the i-th row of targets[a] is targets[a[i]]
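A small check, assuming the same a and targets as above, that the vectorized expression matches the explicit loop over rows:

```python
import numpy as np

a = np.array([[2, 0, 1, 3], [0, 2, 3, 1]])
targets = np.arange(1, 6)  # 1-D, as the answers require

# Explicit loop: one mean per row of indices.
looped = np.array([targets[row].mean() for row in a])

# Fancy indexing builds a (2, 4) array whose i-th row is targets[a[i]],
# then the mean is taken along each row.
vectorized = targets[a].mean(axis=1)

print(vectorized)  # [2.5 2.5]
```

Both rows of a select the same four elements in a different order, which is why the two means coincide here.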

Construct NumPy matrix row by row

I'm trying to construct a 2D NumPy array from values in an extant 2D NumPy array using an iterative process. Using ordinary python lists the process I'm describing would look like so:
coords = #data from file contained in a 2D list
d = #integer
edges = []
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        edges.append(edge)
However, the NumPy array imposes restrictions that do not permit the process shown above. Below I try to do the same thing using NumPy arrays, and it should immediately be clear where the problems are:
coords = np.genfromtxt('Energies.txt', dtype=float, skip_header=1)
d = #integer
#how to initialize?
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        #how to append?
Because .append does not exist for NumPy arrays I need to rely on concatenate or stack instead. But these functions are designed to join existing arrays, and I don't have anything to concatenate or stack until after the first iteration of my loop. So I suppose I need to change my data flow, but I'm unsure how to go about this.
Any help would be greatly appreciated. Thanks in advance.

That function is numpy.meshgrid [1]; it does this by default.
[1] https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.meshgrid.html
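For the pairwise differences described in the question, two possible approaches (a sketch, not taken from the answer above) are to collect the rows in a plain Python list and convert once at the end, or to index all pairs at once with np.triu_indices. The coords values here are made-up stand-ins for the data read from file.

```python
import numpy as np

# Hypothetical stand-in for the file data: d + 1 points, one per row.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
d = len(coords) - 1

# Option 1: append to a Python list, then build the array in one step.
edges_list = []
for i in range(d + 1):
    for j in range(i + 1, d + 1):
        edges_list.append(coords[j] - coords[i])
edges_loop = np.array(edges_list)

# Option 2: get every (i, j) pair with i < j, in the same order as the loops.
i_idx, j_idx = np.triu_indices(d + 1, k=1)
edges_vec = coords[j_idx] - coords[i_idx]

print(edges_vec.shape)  # (6, 2)
```

Appending to a list and converting once avoids repeated concatenate calls, each of which would copy the whole array.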

Numpy: Search for an array A with same pattern in a larger array B

I have two 1D NumPy arrays, A (small) and B (large):
A=np.array([6,7,8,9,10])
B=np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,10])
I want to check whether the elements of A appear in B in the same order,
and get the index in B at which the occurrence of A starts.
Index value returned = 6
Is there a built-in NumPy function that performs this operation?

I have also run into this problem. I think the fastest way, especially for big NumPy arrays, is to convert them to byte strings and search those.
Here is the code I use:
import numpy as np
b=np.array([6,7,8,9,10])
a=np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,10])
# tobytes() replaces the deprecated tostring(); both arrays must share a dtype
a.tobytes().index(b.tobytes())//a.itemsize

I found a nice solution, given by @EdSmith in "Finding Patterns in a Numpy Array".
In short, this is the process:
Store the length of the array being searched for (A in my example).
Find the candidate start positions in the array being searched (B in my example) with np.where, then check each one with np.all.
This is not my code; it comes from the link above. It is simple and easy, and I have only altered it slightly to fit my example. Hope it helps someone :)
Thanks to @EdSmith.
import numpy as np
A=np.array([6,7,8,9,10])
B=np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,10])
N = len(A)
possibles = np.where(B == A[0])[0]
solns = []
for p in possibles:
    check = B[p:p+N]
    if np.all(check == A):
        solns.append(p)
print(solns)
Output
[6]

Try this:
import numpy as np
A=np.array([6,7,8,9,10])
B=np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,10])
r = np.ones_like(B)
for x in range(len(A)):
    # keep positions p where B[p + x] == A[x]; note that np.roll wraps around
    r *= np.roll(B == A[x], -x)
# first index of a match; the answer here is 6
print(np.where(r)[0][0])
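Another option, sketched here under the assumption that NumPy 1.20 or later is available, is to compare every length-len(A) window of B against A using numpy.lib.stride_tricks.sliding_window_view. The helper name find_subarray is made up for this example. Unlike the np.roll approach, the windows never wrap around the end of B.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def find_subarray(haystack, needle):
    """Return all start indices where needle occurs contiguously in haystack."""
    haystack = np.asarray(haystack)
    needle = np.asarray(needle)
    if len(needle) == 0 or len(needle) > len(haystack):
        return np.array([], dtype=np.intp)
    # Shape (len(haystack) - len(needle) + 1, len(needle)); a view, not a copy.
    windows = sliding_window_view(haystack, len(needle))
    return np.flatnonzero((windows == needle).all(axis=1))

A = np.array([6, 7, 8, 9, 10])
B = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 10])
print(find_subarray(B, A))  # [6]
```

The comparison still materializes a boolean array of the same shape as the window view, so for very long patterns the candidate-then-check loop above may use less memory.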

How to simulate random returns with numpy

What is a quick way to simulate random returns? I'm aware of numpy.random, but that alone doesn't tell me how to model asset returns.
I've tried:
import numpy as np
r = np.random.rand(100)
But this doesn't feel accurate. How are others doing this?

I'd suggest one of two approaches:
One: Assume returns are normally distributed with a mean of 0.1% and a standard deviation of about 1%. This looks like:
import numpy as np
np.random.seed(314)
r = np.random.randn(100) / 100 + 0.001
seed(314) puts the random number generator in a specific state, so that if we both use the same seed, we should see the same results.
randn draws from the standard normal distribution.
I'd also recommend using pandas. It's a library that implements a DataFrame object similar to R's data frame.
import pandas as pd
df = pd.DataFrame(r)
You can then plot the cumulative returns like this:
df.add(1).cumprod().plot()
Two:
The second way is to assume returns are log-normally distributed. That means log(1 + r) is normal. In this scenario, we draw normally distributed random numbers and then use those values as the exponent of e. It looks like this:
r = np.exp(np.random.randn(100) / 100 + 0.001) - 1
If you plot it, it looks like this:
pd.DataFrame(r).add(1).cumprod().plot()
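The two approaches can also be written with NumPy's newer Generator interface (np.random.default_rng). This is only a sketch: the variable names are illustrative, and the mean and standard deviation are the same assumed 0.1% and 1% as above.

```python
import numpy as np

rng = np.random.default_rng(314)  # seeded for reproducibility
n_periods = 100
mu, sigma = 0.001, 0.01  # assumed per-period mean and standard deviation

# Approach one: simple returns drawn directly from a normal distribution.
simple_returns = rng.normal(loc=mu, scale=sigma, size=n_periods)

# Approach two: log returns are normal, so simple returns are exp(x) - 1.
lognormal_returns = np.exp(rng.normal(loc=mu, scale=sigma, size=n_periods)) - 1

# Cumulative growth of one unit invested, without needing pandas.
growth_simple = np.cumprod(1 + simple_returns)
growth_lognormal = np.cumprod(1 + lognormal_returns)
```

One practical difference is that returns from the second approach can never fall below -100%, whereas the normal draw in the first approach can in principle.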

Find indices of a list of values in a numpy array

I have a numpy master array. Given another array of search values, with repeating elements, I want to produce the indices of these search values in the master array.
E.g.: master array is [1,2,3,4,5], search array is [4,2,2,3]
Solution: [3,1,1,2]
Is there a "native" numpy function that does this efficiently (meaning at C speed, rather than python speed)?
I'm aware of the following solution, but, first, it's a python list comprehension, and second, it'll search for the index of 2 twice.
ma = np.array([1,2,3,4,5])
sl = np.array([4,2,2,3])
ans = [np.where(ma==i) for i in sl]
Also, if I have to resort to sorting and binary search, I will do it as a last resort (puns not intended at all sorts of levels). I am interested in finding if I'm missing something basic from the numpy library. These lists are very large, so performance is paramount.
Thanks.
Edit:
Before posting I'd tried the following with dismal results:
[np.searchsorted(ma,x) for x in sl]
The solution posted by @pierre is much more performant and exactly what I was looking for.

Would np.searchsorted work for you? It takes the whole search array in one call, and it assumes the master array is sorted.
>>> master = np.array([1,2,3,4,5])
>>> search = np.array([4,2,2,3])
>>> np.searchsorted(master, search)
array([3, 1, 1, 2])
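If the master array is not sorted, or some search values might be missing from it, one possible extension is to pass a sorter to np.searchsorted and verify the hits afterwards. The helper indices_in and the -1 marker for missing values are assumptions made for this sketch, and master is assumed to be non-empty.

```python
import numpy as np

def indices_in(master, search):
    """Index in master of each search value, or -1 where it is absent."""
    master = np.asarray(master)
    search = np.asarray(search)
    order = np.argsort(master, kind="stable")  # positions that sort master
    pos = np.searchsorted(master, search, sorter=order)
    pos = np.clip(pos, 0, len(master) - 1)  # values past the end get clipped
    idx = order[pos]  # map back to positions in the original master
    found = master[idx] == search  # reject values that are not really there
    return np.where(found, idx, -1)

print(indices_in([1, 2, 3, 4, 5], [4, 2, 2, 3]))  # [3 1 1 2]
print(indices_in([5, 3, 1], [1, 5, 7]))           # [ 2  0 -1]
```

If master contains duplicates, this returns the position of the first occurrence in sorted order, because searchsorted defaults to side='left'.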
