IndexError: list index out of range. appending to list - python-3.8

I created a class Matrix. I've been trying to fill this matrix with a randomize method, but this error was shown:
self.data[i].append(value)
IndexError: list index out of range
full code:
import random
class Matrix:
    def __init__(self, rows, cols):
        self.rows=rows
        self.cols=cols
        self.data=[]
    def randomize(self):
        for i in range(self.rows):
            for j in range(self.cols):
                value=random.randint(0, 10)
                self.data[i].append(value)

self.data starts out as an empty list, so self.data[i] does not exist yet when randomize calls self.data[i].append(value). Initialize self.data in __init__ with one empty list per row:
self.data=[[] for x in range(rows)]
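For reference, here is a minimal sketch of the whole class with that one-line change applied. Everything except the initialization of self.data is taken from the question; the 2x3 size at the bottom is just an example value.

```python
import random


class Matrix:
    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols
        # One independent empty list per row, so self.data[i] exists.
        self.data = [[] for _ in range(rows)]

    def randomize(self):
        for i in range(self.rows):
            for _ in range(self.cols):
                # Append one random integer in [0, 10] to row i.
                self.data[i].append(random.randint(0, 10))


m = Matrix(2, 3)
m.randomize()
print(m.data)
```

Note that randomize appends, so calling it a second time on the same object would make every row twice as long.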

Related

Creating 2d array and filling first columns of each row in numpy

I have written the following code for creating a 2D array and filling the first element of each row. I am new to numpy. Is there a better way to do this?
y=np.zeros(N*T1).reshape(N,T1)
x = np.linspace(0,L,num = N)
for k in range(0,N):
    y[k][0] = np.sin(PI*x[k]/L)
Assign to the whole first column at once; np.sin is applied elementwise to x:
y[:, 0] = np.sin(PI*x/L)
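A small sketch to check that the column assignment gives the same result as the loop. The values of N, T1 and L are assumptions (the question does not give them), and np.pi stands in for the question's PI.

```python
import numpy as np

# Assumed example values; the question does not specify N, T1, or L.
N, T1, L = 5, 4, 2.0

x = np.linspace(0, L, num=N)

# Loop version from the question.
y_loop = np.zeros((N, T1))
for k in range(N):
    y_loop[k, 0] = np.sin(np.pi * x[k] / L)

# Vectorized version: a single assignment to the first column.
y_vec = np.zeros((N, T1))
y_vec[:, 0] = np.sin(np.pi * x / L)

print(np.allclose(y_loop, y_vec))
```

np.zeros((N, T1)) also creates the 2D array directly, without the reshape.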

How do I efficiently compute the gyration tensor in numpy?

The gyration tensor of a set of N points in 3d space is defined as
$$S_{mn} = \frac{1}{N} \sum_{i=1}^{N} r_m^{(i)} r_n^{(i)}$$
assuming the condition
$$\sum_{i=1}^{N} \mathbf{r}^{(i)} = 0.$$
How do I compute this in numpy without using an explicit for loop? I know that I can just do something like
import numpy as np
def calculate_gyration_tensor(points):
    '''
    Calculates the gyration tensor of a set of points.
    '''
    COM = centre_of_mass(points)
    gyration_tensor = np.zeros((3, 3))
    for p in points:
        gyration_tensor += np.outer(p-COM, p-COM)
    return gyration_tensor / len(points)
but this quickly becomes inefficient for large N, because of the for loop. Is there a better way to do it?
You can do this with np.einsum:
def gyration(points):
    '''
    Calculate the gyration tensor
    points : numpy array of shape N x 3
    '''
    center = points.mean(0)
    # points centered on the center of mass
    normed_points = points - center[None,:]
    return np.einsum('im,in->mn', normed_points,normed_points)/len(points)
# test
points = np.arange(36).reshape(12,3)
gyration(points)
Output:
array([[107.25, 107.25, 107.25],
[107.25, 107.25, 107.25],
[107.25, 107.25, 107.25]])
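As a sanity check, the sketch below compares the einsum version against the explicit loop and against a plain matrix product, which computes the same sum of outer products. The function names and the random test data are made up for this illustration, and the center of mass is assumed to be the unweighted mean of the points.

```python
import numpy as np


def gyration_loop(points):
    """Reference version: average of outer products of centered points."""
    centered = points - points.mean(axis=0)
    tensor = np.zeros((points.shape[1], points.shape[1]))
    for p in centered:
        tensor += np.outer(p, p)
    return tensor / len(points)


def gyration_einsum(points):
    """Sum over the point index i for every pair of coordinates (m, n)."""
    centered = points - points.mean(axis=0)
    return np.einsum('im,in->mn', centered, centered) / len(points)


def gyration_matmul(points):
    """Same contraction written as a matrix product."""
    centered = points - points.mean(axis=0)
    return centered.T @ centered / len(points)


rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 3))

print(np.allclose(gyration_loop(points), gyration_einsum(points)))
print(np.allclose(gyration_loop(points), gyration_matmul(points)))
```

Both vectorized forms avoid the Python-level loop; which one is faster may depend on N and on the NumPy build, so it is worth timing on your own data.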

NumPy, OOP and callables

I'm implementing a Markov chain Monte Carlo sampler with Metropolis and Barker alphas for numerical integration. I've created a class called MCMCIntegrator. I've loaded it with some attributes; one of them, called g, is the pdf of the function we're trying to integrate (a lambda).
import numpy as np
import scipy.stats as st
class MCMCIntegrator:
    def __init__(self):
        self.g = lambda x: st.gamma.pdf(x, 0, 1, scale=1 / 1.23452676)*np.abs(np.cos(1.123454156))
        self.size = 10000
        self.std = 0.6
        self.real_int = 0.06496359
There are other methods in this class. size is the size of the sample that the class must generate, and std is the standard deviation of the normal kernel, which you will see in a moment. real_int is the value of the integral from 1 to 2 of the function we're integrating; I generated it with an R script. Now, to the problem.
    def _chain(self, method=None):
        """
        Markov chain heat-up with burn-in
        :param method: Metropolis or Barker alpha
        :return: np.array containing the sample
        """
        old = 0
        sample = np.zeros(int(self.size * 1.5))
        i = 0
        if method:
            def alpha(a, b): return min(1, self.g(b) / self.g(a))
        else:
            def alpha(a, b): return self.g(b) / (self.g(a) + self.g(b))
        while i != len(sample):
            if new < 0:
                new = st.norm(loc=old, scale=self.std).rvs()
            alpha = alpha(old, new)
            u = st.uniform.rvs()
            if alpha > u:
                sample[i] = new
                old = new
                i += 1
        return np.array(sample)
When I call the _chain() method, I get the following error:
44 while i != len(sample):
45 new = st.norm(loc=old, scale=self.std).rvs()
---> 46 alpha = alpha(old, new)
47 u = st.uniform.rvs()
48
TypeError: 'numpy.float64' object is not callable
alpha returns a numpy.float64, but I don't know why it's saying it's not callable.
You define a function named alpha, based on a condition, in an 'early' section of the code:
if method:
    def alpha(a, b): return min(1, self.g(b) / self.g(a))
else:
    def alpha(a, b): return self.g(b) / (self.g(a) + self.g(b))
and then in the while loop (a 'later' part of the code), you assign the return value of this function to a variable that is also named alpha.
Because both objects share one name, that assignment rebinds alpha to the returned numpy.float64. The function is never defined again afterwards, so the next call to alpha(old, new) tries to call a number and raises the TypeError.
If it does not interfere with your program logic (it doesn't seem to), renaming the variable fixes the problem.
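The effect can be reproduced without NumPy or SciPy. In the sketch below, shadowed() rebinds the name and fails in the same way, while chain() stores the result under a different name. The density g, the names accept_prob and chain, and the use of the random module instead of scipy.stats are all stand-ins chosen for this illustration, not the code from the question.

```python
import random


def shadowed():
    def alpha(a, b):
        return 0.5
    alpha = alpha(0, 1)   # alpha is now the float 0.5, not the function
    return alpha(0, 1)    # raises TypeError: 'float' object is not callable


def g(x):
    # Hypothetical stand-in for the target density in the question.
    return 1.0 / (1.0 + x * x)


def alpha(a, b):
    # Metropolis acceptance probability.
    return min(1.0, g(b) / g(a))


def chain(n, std=0.6, seed=0):
    rng = random.Random(seed)
    old = 0.0
    sample = []
    while len(sample) < n:
        new = rng.gauss(old, std)
        # A separate name keeps the function `alpha` callable
        # on every later iteration.
        accept_prob = alpha(old, new)
        if accept_prob > rng.random():
            sample.append(new)
            old = new
    return sample


try:
    shadowed()
except TypeError as exc:
    print("shadowed():", exc)

print(len(chain(100)))
```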

Initializing variables in gekko using the array model function

Defining an array of Gekko variables does not allow any arguments to initialize the variables. For example, I am unable to make an array of integer variables using the m.Array function.
I can make an array of variables using this syntax: m.Array(m.Var, (42, 42)). However, I don't know how to make this array an array of integer variables because the m.Var passed in to the m.Array function does not take any arguments.
I have a single variable as an integer variable:
my_var_is_an_integer_var = m.Var(0, lb=0, ub=1, integer=True)
I have an array of variables that are not integer variables:
my_array_vars_are_not_integer_vars = m.Array(m.Var, (42, 42))
I want an array of integer variables, but this throws an error:
my_array_vars_are_integer_vars = m.Array(m.Var(0, lb=0, ub=1, integer=True), (42,42))
How do I initialize the variables in the array to be integer variables?
Error when trying to initialize array as integer variables:
Traceback (most recent call last):
  File "integer_array.py", line 7, in <module>
    my_array_vars_are_not_integer_vars = m.Array(m.Var(0, lb=0, ub=1, integer=True), (42,42))
  File "C:\Users\wills\Anaconda3\lib\site-packages\gekko\gekko.py", line 1831, in Array
    i[...] = f(**args)
TypeError: 'GKVariable' object is not callable
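m.Array calls its first argument once per element (the i[...] = f(**args) line in the traceback), so it needs the function m.Var itself, not the variable returned by m.Var(...). If you need to pass additional arguments when creating a variable array, you can use one of the following options. Option 1 creates a NumPy array while Options 2 and 3 create a Python list.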
Option 1 (Preferred)
Create a numpy array with the m.Array function with additional argument integer=True:
y = m.Array(m.Var,(42,42),lb=0,ub=1,integer=True)
Option 2
Create a 2D list of variables with a list comprehension:
y = [[m.Var(lb=0,ub=1,integer=True) for i in range(42)] for j in range(42)]
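Option 3
Alternatively, you can preallocate a nested list (y) and fill it with binary variables in a loop. Build the rows with a comprehension so that each row is a separate list.
y = [[None]*42 for _ in range(42)]
for i in range(42):
    for j in range(42):
        y[i][j] = m.Var(lb=0,ub=1,integer=True)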
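When a nested list is preallocated and then filled element by element, each row has to be a distinct list object. The pure-Python sketch below (no Gekko needed; the names aliased and independent are made up for this illustration) shows the difference between multiplying the outer list and building it with a comprehension.

```python
rows, cols = 3, 4

# List multiplication copies references: every "row" is the same list.
aliased = [[None] * cols] * rows
aliased[0][0] = "a"

# A comprehension builds a new inner list on each iteration.
independent = [[None] * cols for _ in range(rows)]
independent[0][0] = "a"

print(aliased[1][0])       # the write to row 0 is visible in row 1
print(independent[1][0])   # row 1 is untouched
```

With the aliased form, a fill loop would keep overwriting the same inner list, so all rows would end up referring to the variables created for the last row.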
The UPPER and LOWER bounds can be changed after the variable creation but the integer option is only available at initialization. Don't forget to switch to the APOPT MINLP solver for integer variable solutions with m.options.SOLVER = 1. Below is a complete example that uses all three options but with a 3x4 array for x, y, and z.
from gekko import GEKKO
import numpy as np
m = GEKKO()
# option 1
x = m.Array(m.Var,(3,4),lb=0,ub=1,integer=True)
# option 2
y = [[m.Var(lb=0,ub=1,integer=True) for i in range(4)] for j in range(3)]
# option 3
z = [[None]*4 for _ in range(3)]
for i in range(3):
    for j in range(4):
        z[i][j] = m.Var(lb=0,ub=1,integer=True)
# switch to APOPT
m.options.SOLVER = 1
# define objective function
m.Minimize(m.sum(m.sum(x)))
m.Minimize(m.sum(m.sum(np.array(y))))
m.Minimize(m.sum(m.sum(np.array(z))))
# define equation
m.Equation(x[1,2]==0)
m.Equation(m.sum(x[:,0])==2)
m.Equation(m.sum(x[:,1])==3)
m.Equation(m.sum(x[2,:])==1)
m.solve(disp=True)
print(x)
The objective is to minimize the sum of all the elements in x, y, and z, subject to constraints on one element, two columns, and one row of x. The solution is:
[[[1.0] [1.0] [0.0] [0.0]]
[[1.0] [1.0] [0.0] [0.0]]
[[0.0] [1.0] [0.0] [0.0]]]

TypeError: unhashable type: 'numpy.ndarray' - How to get data from data frame by querying radius from ball tree?

How do I get data by querying a radius from a ball tree? For example:
from sklearn.neighbors import BallTree
import pandas as pd
bt = BallTree(df[['lat','lng']], metric="haversine")
for idx, row in df.iterrows():
    res = df[bt.query_radius(row[['lat','lng']],r=1)]
I want to get the rows in df that are within radius r=1, but it throws a type error:
TypeError: unhashable type: 'numpy.ndarray'
Following the first answer, I got an index-out-of-range error when iterating over the rows:
5183
(5219, 25)
5205
(5219, 25)
5205
(5219, 25)
5221
(5219, 25)
Traceback (most recent call last):
File "/Users/Chu/Documents/dssg2018/sa4.py", line 45, in <module>
df.loc[idx,word]=len(df.iloc[indices[idx]][df[word]==1])/\
IndexError: index 5221 is out of bounds for axis 0 with size 5219
And the code is
bag_of_words = ['beautiful','love','fun','sunrise','sunset','waterfall','relax']
for idx,row in df.iterrows():
    for word in bag_of_words:
        if word in row['caption']:
            df.loc[idx, word] = 1
        else:
            df.loc[idx, word] = 0
bt = BallTree(df[['lat','lng']], metric="haversine")
indices = bt.query_radius(df[['lat','lng']],r=(float(10)/40000)*360)
for idx,row in df.iterrows():
    for word in bag_of_words:
        if word in row['caption']:
            print(idx)
            print(df.shape)
            df.loc[idx,word]=len(df.iloc[indices[idx]][df[word]==1])/\
                np.max([1,len(df.iloc[indices[idx]][df[word]!=1])])
The error is not in the BallTree itself; the indices it returns are not being used correctly to index the DataFrame.
Do it this way:
for idx, row in df.iterrows():
    indices = bt.query_radius(row[['lat','lng']].values.reshape(1,-1), r=1)
    res = df.iloc[[x for b in indices for x in b]]
    # Do what you want to do with res
This will also do (since we are sending only a single point each time):
res = df.iloc[indices[0]]
Explanation:
I'm using scikit-learn 0.20, and the code you wrote above:
df[bt.query_radius(row[['lat','lng']],r=1)]
did not work for me. I needed to make the query point a 2-d array by using reshape().
bt.query_radius() returns an array of arrays of indices within the specified radius r, as mentioned in the documentation:
ind : array of objects, shape = X.shape[:-1]
each element is a numpy integer array listing the indices of neighbors of the corresponding point. Note that unlike the results of a k-neighbors query, the returned neighbors are not sorted by distance by default.
So we need to iterate over two levels of arrays to reach the actual indices of the data.
Once we have the indices, iloc is the way to access rows of a pandas DataFrame by position.
Update:
You don't need to query the tree separately for each point. You can pass all the points at once; the result is an array of arrays in which entry i holds the indices of the points within the radius of point i.
indices = bt.query_radius(df[['lat','lng']], r=1)
for idx, row in df.iterrows():
    # assumes df has the default 0..n-1 index, so idx is also the row position
    nearest_points_index = indices[idx]
    res = df.iloc[nearest_points_index]
    # Do what you want to do with res
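The IndexError in the question (index 5221 with only 5219 rows) suggests that df's index labels are not the positions 0..n-1, for example because rows were dropped earlier; that is an assumption, since the question does not show how df was built. query_radius results are positional, while iterrows yields labels, so a sketch like the following keeps the two apart. The data and the hand-written neighbor arrays are made up and merely stand in for a real BallTree query.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the index has gaps, as it would after dropping rows.
df = pd.DataFrame(
    {"lat": [0.0, 0.1, 5.0], "lng": [0.0, 0.1, 5.0]},
    index=[0, 2, 7],
)

# Stand-in for bt.query_radius(df[['lat', 'lng']], r=...): one array of
# positional neighbor indices per row (hand-written here, not computed).
indices = np.empty(len(df), dtype=object)
indices[0] = np.array([0, 1])
indices[1] = np.array([0, 1])
indices[2] = np.array([2])

neighbor_counts = {}
# enumerate() yields the position; iterrows() yields the label.
for pos, (label, row) in enumerate(df.iterrows()):
    res = df.iloc[indices[pos]]  # indices[label] would fail for label 7
    neighbor_counts[label] = len(res)

print(neighbor_counts)
```

Calling df.reset_index(drop=True) before building the tree is another option, after which labels and positions coincide.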