Invalid classes inferred from unique values of `y`. Expected: [0 1], got ['N' 'Y'] - data-science

Can anyone help me with this problem? It is an XGBoost error. I got past it with label encoding, but then I run into problems in later steps.

Just encode your "y" column, because the model expects the classes of the target column as integers starting at 0 (here [0 1]), not as strings such as 'N' and 'Y':
df["y"] = df["y"].astype('category').cat.codes
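If later steps need the original labels back (for example, to report predictions as 'N'/'Y'), it may help to keep the mapping around instead of overwriting it. A minimal sketch with pandas only; the toy DataFrame and the `preds` list are made up for illustration and are not from the question:

```python
import pandas as pd

# Toy data standing in for the real DataFrame (hypothetical values).
df = pd.DataFrame({"y": ["N", "Y", "Y", "N"]})

# Convert to a categorical once and keep it, so the mapping is not lost.
y_cat = df["y"].astype("category")
classes = y_cat.cat.categories      # original labels in sorted order
df["y"] = y_cat.cat.codes           # integer codes: N -> 0, Y -> 1

# Integer predictions (as a classifier would return them) mapped back to labels.
preds = [1, 0, 1]
pred_labels = [classes[p] for p in preds]

print(df["y"].tolist())   # [0, 1, 1, 0]
print(pred_labels)        # ['Y', 'N', 'Y']
```

The same idea works with scikit-learn's LabelEncoder, which stores the mapping for you.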

Related

what does the np.array command do?

I have a question about the np.array command.
Let's say the content of caches, when displayed with the print command, is
caches = [array([1,2,3]),array([1,2,3]),...,array([1,2,3])]
Then I executed following code:
train_x = np.array(caches)
When I print the content of train_x I have:
train_x = [[1,2,3],[1,2,3],...,[1,2,3]]
Now, the behavior is exactly what I want, but I do not really understand in depth what the np.array(caches) command has done. Can somebody explain this to me?
Making a 1d array
In [89]: np.array([1,2,3])
Out[89]: array([1, 2, 3])
In [90]: np.array((1,2,3))
Out[90]: array([1, 2, 3])
[1,2,3] is a list; (1,2,3) is a tuple. np.array treats them as the same. (list versus tuple does make a difference when creating structured arrays, but that's a more advanced topic.)
Note the shape is (3,) (shape is a tuple)
Making a 2d array from a nested list - a list of lists:
In [91]: np.array([[1,2],[3,4]])
Out[91]:
array([[1, 2],
       [3, 4]])
In [92]: _.shape
Out[92]: (2, 2)
np.array takes data, not shape information. It infers shape from the data.
array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)
In these examples the object parameter is a list or list of lists. We aren't, at this stage, defining the other parameters.
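Applied to the question's case, a list of equal-length 1d arrays is treated like a nested list and becomes a single 2d array. A small sketch, assuming four arrays in caches (the question elides the real count):

```python
import numpy as np

# A Python list holding four separate 1d arrays (count chosen for illustration).
caches = [np.array([1, 2, 3]) for _ in range(4)]

# np.array infers the shape from the nested data: 4 items, each of length 3.
train_x = np.array(caches)

print(type(caches).__name__, type(train_x).__name__)  # list ndarray
print(train_x.shape)                                   # (4, 3)
print(train_x[:, 0])                                   # [1 1 1 1]
```

The last line shows what the conversion buys you: column indexing works on the 2d array, but not on the original list.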

Pandas 0.21.1 - DataFrame.replace recursion error

I used to run this code with no issues:
data_0 = data_0.replace([-1, 'NULL'], [None, None])
Now, after updating to Pandas 0.21.1, the very same line of code gives me:
RecursionError: maximum recursion depth exceeded
Is anybody else experiencing the same issue, and does anyone know how to solve it?
Note: rolling back to pandas 0.20.3 does the trick, but I think it is important to solve this with the latest version.
Thanks
I think this error message depends on what your input data is. Here's an example of input data where this works in the expected way:
data_0 = pd.DataFrame({'x': [-1, 1], 'y': ['NULL', 'foo']})
data_0.replace([-1, 'NULL'], [None, None])
replaces values of -1 and 'NULL' with None:
     x     y
0  NaN  None
1  1.0   foo
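If the error persists on your data, one thing to try is replacing with np.nan instead of None. This is only a sketch of an alternative and an assumption on my part, not a confirmed fix for the 0.21.1 recursion error:

```python
import numpy as np
import pandas as pd

data_0 = pd.DataFrame({'x': [-1, 1], 'y': ['NULL', 'foo']})

# Replace both sentinel values with a single missing-value marker.
cleaned = data_0.replace([-1, 'NULL'], np.nan)

# Count how many cells are now treated as missing.
n_missing = int(cleaned.isna().sum().sum())
print(n_missing)
```

Both sentinels end up as missing values that isna() and dropna() recognise.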

Position Randomisation by shuffle () in psychopy

I have 4 text stimuli whose locations I want to randomise.
I did this at the beginning of the routine:
Posi=['[4.95,0]','[-4.95,0]','[0,4.95]','[0,-4.95]']
shuffle(Posi)
Then, turning to the builder, I typed
$Posi[0], $Posi[1]
in the ‘position’ column and so on, for the 4 stimuli. I also set that to ‘set every repeat’
But I keep getting this
ValueError: could not convert string to float: [-4.95,0]
I don’t understand how I should change the input, because there is no problem if I just plainly put [x,y] into position.
Thanks!
When you use those single quotes you are telling Python that you are creating a string, that is, a sequence of characters, not a number. Every value has a type which says what it is. '0.44' is a string of characters, not a number. Without the quotes you get numbers you can index and do arithmetic with:
>>> pos = [[0.2,0.0],[0.1,1]]
>>> pos[0]
[0.2, 0.0]
>>> pos[0][0]
0.2
>>> pos[0][0]+ 3.3
3.5
Produce a list of numerical coordinates, not strings
Like brittUWaterloo already stated, you are currently effectively creating a list of strings, not a list of lists (of coordinates), as you intended:
>>> pos = ['[4.95, 0]', '[-4.95, 0]', '[0, 4.95]', '[0, -4.95]']
>>> pos[0]
'[4.95, 0]'
>>> type(pos[0])
<class 'str'>
Note that I also changed the variable name and inserted spaces to produce more readable code that follows common coding style guidelines.
So, the first thing you need to do is remove the quotation marks to stop producing strings:
>>> pos = [[4.95, 0], [-4.95, 0], [0, 4.95], [0, -4.95]]
>>> pos[0]
[4.95, 0]
>>> type(pos[0])
<class 'list'>
Putting it to work in the Builder
Then, turning to the builder, I typed
$Posi[0], $Posi[1]
What you are trying to achieve here is, I believe, using the x, y coordinates of the very first element of the (shuffled) list of possible coordinates. I believe the current syntax is not fully correct; but let's have a closer look at what would potentially happen if it were:
>>> pos[0], pos[1]
([4.95, 0], [-4.95, 0])
This would produce two coordinate pairs (the first two of the shuffled list). That's not what you want. You want the x and y coordinates of the first pair only. To get the first coordinate pair only, you would do (in "pure" Python):
>>> pos[0]
[4.95, 0]
Or, in the Builder, you would enter
$pos[0]
into the respective coordinates field.
Summary
So to sum this up, in your Code component you need to do:
pos = [[4.95, 0], [-4.95, 0], [0, 4.95], [0, -4.95]]
shuffle(pos)
And as coordinate of the Text components, you can then use
$pos[0]
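To see the effect outside PsychoPy, here is a small stand-alone sketch. It uses random.shuffle from the standard library as a stand-in for the shuffle that is available in a Builder Code component, and the loop over four stimuli is just for illustration:

```python
import random

# Numerical coordinate pairs, not strings.
pos = [[4.95, 0], [-4.95, 0], [0, 4.95], [0, -4.95]]

# shuffle() reorders the list in place and returns None.
random.shuffle(pos)

# Each stimulus takes one entry of the shuffled list: pos[0], pos[1], ...
for i, (x, y) in enumerate(pos):
    print(f"stimulus {i}: x={x}, y={y}")
```

Because every stimulus takes a different index of the same shuffled list, no two stimuli can land on the same position.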

What does tf.gather_nd intuitively do?

Can you intuitively explain or give more examples about tf.gather_nd for indexing and slicing into high-dimensional tensors in Tensorflow?
I read the API docs, but they are so concise that I find the function's concept hard to follow.
Ok, so think about it like this:
You are providing a list of index values used to index the given tensor and pull out slices. Each entry along the first dimension of the indices is one lookup to perform. Let's pretend that tensor is just a list of lists.
[[0]] means you want to get one specific slice (a list) at index 0 of the provided tensor, just like this:
[tensor[0]]
[[0], [1]] means you want to get two specific slices, at indices 0 and 1, like this:
[tensor[0], tensor[1]]
Now what if tensor has more than one dimension? We do the same thing:
[[0, 0]] means you want to get one slice at index [0, 0], that is, element 0 of the 0-th list. Like this:
[tensor[0][0]]
[[0, 1], [2, 3]] means you want to return two slices at the indices and dimensions provided. Like this:
[tensor[0][1], tensor[2][3]]
I hope that makes sense. I tried using Python indexing to help explain how it would look in Python to do this to a list of lists.
You provide a tensor and indices representing locations in that tensor. It returns the elements of the tensor corresponding to the indices you provide.
EDIT: An example
import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
sess = tf.Session()
x = [[1,2,3],[4,5,6]]
# Pick x[1][1] and x[1][2]
y = tf.gather_nd(x, [[1,1],[1,2]])
print(sess.run(y))
[5 6]
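The list-of-lists picture above can also be written out in NumPy. This is only a sketch of the semantics for a plain list of index tuples, not TensorFlow's implementation; gather_nd_sketch is a hypothetical helper name:

```python
import numpy as np

def gather_nd_sketch(params, indices):
    """Look up params[idx] for every index tuple idx in indices."""
    params = np.asarray(params)
    # tuple(idx) turns [1, 2] into params[1, 2]; a shorter tuple such as
    # (0,) selects a whole slice, here a row.
    return np.array([params[tuple(idx)] for idx in indices])

x = [[1, 2, 3], [4, 5, 6]]

print(gather_nd_sketch(x, [[1, 1], [1, 2]]))  # [5 6]
print(gather_nd_sketch(x, [[0], [1]]))        # both rows of x
```

Full-length index tuples return single elements, while shorter ones return whole slices, which matches the [[0], [1]] case described above.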

Tensorflow gradients

I am trying to adapt the tf DeepDream tutorial code to work with another model. Right now when I call tf.gradients():
t_grad = tf.gradients(t_score, t_input)[0]
g = sess.run(t_grad, {t_input:img0})
I am getting a type error:
TypeError: Fetch argument None of None has invalid type <type 'NoneType'>,
must be a string or Tensor. (Can not convert a NoneType into a Tensor or
Operation.)
Where should I even start to look for fixing this error?
Is it possible to use tf.gradients() with a model that has an Optimizer in it?
I'm guessing your t_grad has some Nones. None is mathematically equivalent to 0 gradient, but is returned for the special case when the cost doesn't depend on the argument it is differentiated against. There are various reasons why we don't just return 0 instead of None which you can see in discussion here
Because None can be annoying in cases like the one above, or when computing second derivatives, I use the helper function below:
def replace_none_with_zero(l):
    return [0 if i is None else i for i in l]
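A scalar 0 drops the shape of the variable it stands for, which can matter if later code expects one gradient per variable with a matching shape. Here is a sketch of a shape-preserving variant, with NumPy arrays standing in for tensors; fill_missing_grads is a hypothetical name, and with real tensors you would presumably use tf.zeros_like in place of np.zeros_like:

```python
import numpy as np

def fill_missing_grads(grads, variables):
    """Replace each None gradient with zeros shaped like its variable."""
    return [np.zeros_like(v) if g is None else g
            for g, v in zip(grads, variables)]

# Stand-in values: the second "gradient" is missing.
variables = [np.ones((2, 2)), np.ones(3)]
grads = [np.full((2, 2), 0.5), None]

filled = fill_missing_grads(grads, variables)
print([g.shape for g in filled])  # [(2, 2), (3,)]
```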
The following is a helpful tip for debugging tf.gradients()
for an invalid pair of tensors:
grads = tf.gradients(<a tensor>, <another tensor that doesn't depend on the first>)
Even before you try to run tf.gradients in a session, you can see that it is invalid by using print:
print(grads)
It will print [None], a list with a single None in it.
If you try to run it in a session anyway:
results = sess.run(grads)
You will not get None again, instead you get the error message described in the question.
For a valid pair of tensors:
grads = tf.gradients(<a tensor>, <a related tensor>)
print(grads)
You will get something like:
Tensor("gradients_1/sub_grad/Reshape:0", dtype=float32)
In a valid situation:
results = sess.run(grads, {<appropriate feeds>})
print(results)
you get something like
[array([[ 4.97156498e-06, 7.87349381e-06, 9.25197037e-06, ...,
          8.72526925e-06, 6.78442757e-06, 3.85240173e-06],
        [ 7.72772819e-06, 9.26370740e-06, 1.19129227e-05, ...,
          1.27088233e-05, 8.76379818e-06, 6.00637532e-06],
        [ 9.46506498e-06, 1.10620931e-05, 1.43903117e-05, ...,
          1.40718612e-05, 1.08670165e-05, 7.12365863e-06],
        ...,
        [ 1.03536004e-05, 1.03090524e-05, 1.32107480e-05, ...,
          1.40605653e-05, 1.25974075e-05, 8.90011415e-06],
        [ 9.69486427e-06, 8.18045282e-06, 1.12702282e-05, ...,
          1.32554378e-05, 1.13317501e-05, 7.74569162e-06],
        [ 5.61043908e-06, 4.93397192e-06, 6.33513537e-06, ...,
          6.26539259e-06, 4.52598442e-06, 4.10689108e-06]], dtype=float32)]