Pandas 0.21.1 - DataFrame.replace recursion error

I used to run this code with no issue:
data_0 = data_0.replace([-1, 'NULL'], [None, None])
Now, after updating to pandas 0.21.1, the very same line of code gives me:
RecursionError: maximum recursion depth exceeded
Does anybody experience the same issue, and does anybody know how to solve it?
Note: rolling back to pandas 0.20.3 does the trick, but I think it's important to solve this with the latest version.
Thanks

I think this error message depends on what your input data is. Here's an example of input data where this works in the expected way:
data_0 = pd.DataFrame({'x': [-1, 1], 'y': ['NULL', 'foo']})
data_0.replace([-1, 'NULL'], [None, None])
replaces values of -1 and 'NULL' with None:
     x     y
0  NaN  None
1  1.0   foo
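
If your data does trigger the recursion error, one possible workaround (a sketch, assuming np.nan is an acceptable stand-in for None in your use case) is to replace with np.nan instead, which may sidestep the None-handling code path in DataFrame.replace:
import numpy as np
import pandas as pd

data_0 = pd.DataFrame({'x': [-1, 1], 'y': ['NULL', 'foo']})
# Replacing with np.nan instead of None avoids passing None as the
# replacement value, which appears to be what recurses in 0.21.x
data_0 = data_0.replace([-1, 'NULL'], np.nan)
print(data_0)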

Related

Use of plt.plot vs plt.scatter with two variables (x and f(x,y))

I am new to Python and Stack Overflow, so please bear with me.
I was trying to plot using plt.plot and plt.scatter. The former works perfectly well, while the latter does not. Below is the relevant part of the code:
import numpy as np
import matplotlib.pyplot as plt

def vis_cal(u, a):
    return np.exp(2*np.pi*1j*u*np.cos(a))

u = np.array([[1, 2, 3, 4]])
u = u.reshape((4, 1))
a = np.array([[-np.pi, -np.pi/6]])

plt.figure(figsize=(10, 8))
plt.xlabel("Baseline")
plt.ylabel("Vij (Visibility)")
plt.scatter(u, vis_cal(u, a), marker='o', color='blue', label="Vij_ind")
plt.legend(loc="lower left")
plt.show()
This returns an error: ValueError: x and y must be the same size
My questions are:
Why does the difference in array sizes not matter to plt.plot, but matter to plt.scatter?
Does this mean that if I want to use plt.scatter I always need to make sure the arrays have the same size, and otherwise I need to use plt.plot?
Thank you very much
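
For reference, a minimal sketch of the shape behaviour (with a hypothetical stand-in y, not the original vis_cal): plt.plot pairs a (4, 1) x with each column of a (4, 2) y and draws one line per column, while plt.scatter requires x.size == y.size, so the arrays have to be broadcast and flattened first.
import numpy as np
import matplotlib.pyplot as plt

u = np.arange(1, 5).reshape(4, 1)   # shape (4, 1)
y = np.hstack([u, 2 * u])           # shape (4, 2), stand-in for vis_cal(u, a)

plt.plot(u, y)                      # works: one line per column of y

# scatter needs matching sizes, so broadcast u to y's shape and flatten
uu = np.broadcast_to(u, y.shape)
plt.scatter(uu.ravel(), y.ravel(), marker='o')
plt.show()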

Applying scipy.sparse.linalg.svds returns nan values

I am starting to use the scipy.sparse library, and when I try to apply scipy.sparse.linalg.svds, I get NaN values and a RuntimeWarning if there are zero singular values.
I am doing this because in the end I am going to use very large and very sparse matrices with entries only {+1, -1} which are not square (>1100*1000 in size with >0.99 sparsity), and I want to know their rank.
I know approximately what the rank is; it is almost full, so knowing only the last (smallest) singular values can tell me what the rank is exactly.
This is why I chose to work with scipy.sparse.linalg.svds and set which='SM'. If the rank is not full, some of the singular values will be zero. This is my code:
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as la

# 3x3 example of rank 1: two of its three singular values are zero
a = np.array([[0, 0, 0], [0, 0, 0], [1, 1, -1]], dtype='d')
sp_a = sp.csc_matrix(a)

# ask for the k=2 smallest-magnitude singular values
s = la.svds(sp_a, k=2, return_singular_vectors=False, which='SM')
print(s)
The output is:
[ nan 9.45667059e-12]
/usr/lib/python3/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py:1849: RuntimeWarning: invalid value encountered in sqrt
s = np.sqrt(eigvals)
Any thoughts on why this happens?
Is there perhaps another efficient way to find the rank, given that I have a large, non-square, very sparse matrix with almost full rank?
scipy version 1.1.0
numpy version 1.14.5
Linux platform
Thanks in advance
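
One possible alternative for the rank question (a sketch, not a definitive recommendation): scipy.linalg.interpolative.estimate_rank can estimate the numerical rank of a LinearOperator, so the sparse matrix never has to be densified. The matrix below is a hypothetical stand-in for the real data:
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla
import scipy.linalg.interpolative as sli

# Hypothetical stand-in: a 120x100 sparse {+1, -1} matrix whose last
# column duplicates the first, so its rank should be 99, not 100
rng = np.random.RandomState(0)
m = sp.random(120, 99, density=0.1, random_state=rng,
              data_rvs=lambda n: rng.choice([-1.0, 1.0], size=n)).tocsc()
m = sp.hstack([m, m[:, :1]]).tocsc()

# Estimate the numerical rank without forming a dense matrix
op = sla.aslinearoperator(m)
print(sli.estimate_rank(op, eps=1e-8))  # expected: 99 (with high probability)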

Why does Tensorflow output these simple results?

I'm brand new to Tensorflow, but I'm trying to figure out why these results end in ...001, ...002, etc.
I'm following the tutorial here: https://www.tensorflow.org/get_started/get_started
Code:
"""This is a Tensorflow learning script."""
import tensorflow as tf
sess = tf.Session()
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model = W*x + b
sess.run(tf.global_variables_initializer())  # initialize W and b
print(sess.run(linear_model, {x: [1, 2, 3, 4]}))
It looks like a simple math function where, if I were using 2 as the input, it would be (0.3 * 2) + -0.3 = 0.3.
Output:
[ 0. 0.30000001 0.60000002 0.90000004]
I would expect:
[ 0. 0.3 0.6 0.9]
That's floating-point rounding error, because you declared your variables with the tf.float32 dtype. You could use tf.round (https://www.tensorflow.org/api_docs/python/tf/round), but it doesn't seem to have round-to-a-specified-decimal-place capability yet. For that, check out the response in: tf.round() to a specified precision.
The issue is that a floating-point variable (like tf.float32) simply cannot store exactly 0.3, because it is stored in binary. It's like trying to store exactly 1/3 in decimal: it would be 0.33..., but you'd have to go out to infinity to get the exact number (which isn't possible in our mortal realm!).
See the Python docs for a more in-depth review of the subject.
Tensorflow doesn't have a way to deal with decimal numbers yet (as far as I know)! But once the numbers are returned to Python you could round and then convert to a Decimal.
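A small sketch of that post-processing step, using the printed output above as the input:
import numpy as np
from decimal import Decimal

result = np.array([0., 0.30000001, 0.60000002, 0.90000004], dtype=np.float32)

# Round once the values are back in NumPy/Python land...
print(np.round(result, 4))  # -> [0.  0.3 0.6 0.9]

# ...or convert to Decimal for exact decimal arithmetic afterwards
print([Decimal(repr(round(float(v), 4))) for v in result])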

repmat with interlace or Kronecker product in Tensorflow

Suppose I have a tensor:
A=[[1,2,3],[4,5,6]]
Which is a matrix with 2 rows and 3 columns.
I would like to replicate it, say twice, to get the following tensor:
A2 = [[1,2,3],
      [1,2,3],
      [4,5,6],
      [4,5,6]]
Using tf.tile on its own will clearly replicate it differently, so I tried the following code (which works):
A_tiled = tf.reshape(tf.tile(A, [1, 2]), [4, 3])
Unfortunately, it seems to be very slow when the number of columns becomes large. Executing the equivalent in Matlab using a Kronecker product with a vector of ones (Matlab's kron) seems to be much faster.
Can anyone help?
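
For reference, a sketch of two alternatives (assuming a newer TensorFlow than the question used; tf.repeat only exists in TF 1.15+/2.x):
import tensorflow as tf

A = tf.constant([[1, 2, 3], [4, 5, 6]])

# tf.repeat interleaves copies along the chosen axis:
# [[1,2,3],[1,2,3],[4,5,6],[4,5,6]]
A2 = tf.repeat(A, repeats=2, axis=0)

# Equivalent expand_dims + tile + reshape, which tiles along a new
# size-1 axis instead of along the (possibly large) column axis
A3 = tf.reshape(tf.tile(tf.expand_dims(A, 1), [1, 2, 1]), [4, 3])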

Jacobian in Tensorflow

I see many people asking this question here, but I didn't see code that I can execute. I am trying to build two operations, one for dOutput/dInput and one for dOutput/dParameters. I tried
# gradient method 1
jac_Action_wrt_Param = tf.pack(
    [tf.concat(1, [tf.reshape(tf.gradients(action_output[:, idx], param)[0], [1, -1])
                   for param in learnable_param_list])
     for idx in range(action_dim)],
    axis=1, name='jac_Action_wrt_Param')

jac_Action_wrt_State = tf.pack(
    [tf.gradients(action_output[:, idx], state_input)[0] for idx in range(action_dim)],
    axis=1, name='jac_Action_wrt_State')
Here, state is the input and action is the output. Both give None... What did I do wrong?
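
A minimal self-contained sketch (with a hypothetical tiny 3-input/2-output model, and tf.stack, which is the renamed tf.pack) of stacking per-output tf.gradients into a Jacobian; tf.gradients returns None only when there is no path in the graph from that output slice to the tensor you differentiate with respect to:
import tensorflow as tf

# Hypothetical stand-in model: a linear map from 3 inputs to 2 outputs
state_input = tf.placeholder(tf.float32, [None, 3])
W = tf.Variable(tf.ones([3, 2]))
action_output = tf.matmul(state_input, W)  # shape [batch, 2]
action_dim = 2

# Stack d(output_i)/d(input) for each output component i
jac_wrt_state = tf.stack(
    [tf.gradients(action_output[:, i], state_input)[0] for i in range(action_dim)],
    axis=1)  # shape [batch, action_dim, 3]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(jac_wrt_state, {state_input: [[1., 2., 3.]]}))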