TypeError: 'NoneType' object is not subscriptable when checking for NoneType - pandas

I am trying to detect NoneType in a single cell of a 1-column, 15-row DataFrame with the following:
if str(row.iloc[13][:]) is None:
    print("YES")
But this causes the error: TypeError: 'NoneType' object is not subscriptable

If row is a Series, then selecting a value by position with
row.iloc[13]
returns a scalar, so you cannot slice it with [:]. Also, once you convert it to a string with str you cannot compare it against None, only against the string 'None':
if str(row.iloc[13]) == 'None':
If you want to compare against None itself:
if row.iloc[13] is None:
Or, to check for either NaN or None:
if pd.isna(row.iloc[13]):
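A quick sketch putting the three checks side by side, on a hypothetical one-column DataFrame with a None in one cell (the names here are made up for illustration):
import pandas as pd

df = pd.DataFrame({'col': ['a', None, 'c']})   # hypothetical 1-column frame
row = df['col']                                # a Series
val = row.iloc[1]                              # selecting by position gives a scalar, here None

print(str(val) == 'None')   # True: string comparison
print(val is None)          # True: identity check against None
print(pd.isna(val))         # True: also catches NaN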

Related

How to solve 'numpy.float64' object is not callable

rsq_admin = smf.ols("Admin~RDS+MS", data=startup1).fit().rsquared()
vif_admin = 1/(1-rsq_admin)
I am trying to find the r-squared value to calculate VIF, but I am getting a 'numpy.float64' object is not callable error.
My dataset's dtype is Float64.
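This one is a different kind of error from the NoneType ones: in statsmodels, rsquared on a fitted OLS result is an attribute (a numpy.float64), not a method, so the trailing parentheses try to call a float. A minimal sketch of the likely fix, reusing the formula and data name from the snippet:
import statsmodels.formula.api as smf

# rsquared is a property of the fitted results, so drop the parentheses
rsq_admin = smf.ols("Admin~RDS+MS", data=startup1).fit().rsquared
vif_admin = 1 / (1 - rsq_admin)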

Tensorflow 2 custom dataset Sequence

I have a dataset in a python dictionary. The structure is as follow:
data.data['0']['input'],data.data['0']['target'],data.data['0']['length']
Both input and target are arrays of size (n,) and length is an int.
I have created a class that subclasses tf.keras.utils.Sequence and defined __getitem__ like this:
def __getitem__(self, idx):
    idx = str(idx)
    return {
        'input': np.asarray(self.data[idx]['input']),
        'target': np.asarray(self.data[idx]['target']),
        'length': self.data[idx]['length']
    }
How can I iterate over such a dataset using tf.data.Dataset? I am getting this error when I try to use from_tensor_slices:
ValueError: Attempt to convert a value with an unsupported type (<class 'dict'>) to a Tensor.
I think you should convert the dictionary to a tensor, as proposed here: convert a dictionary to a tensor. Alternatively, change the dictionary to a text file or to TFRecords. Hope this helps!
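If keeping the nested dictionary as-is is important, another option is to wrap the per-index lookup in a generator and build the dataset with from_generator instead of from_tensor_slices. A sketch, assuming TF 2.4+ (where output_signature is available) and the data.data structure from the question:
import numpy as np
import tensorflow as tf

def gen():
    for idx in sorted(data.data, key=int):   # keys are '0', '1', ...
        item = data.data[idx]
        yield {
            'input': np.asarray(item['input'], dtype=np.float32),
            'target': np.asarray(item['target'], dtype=np.float32),
            'length': np.int32(item['length']),
        }

dataset = tf.data.Dataset.from_generator(
    gen,
    output_signature={
        'input': tf.TensorSpec(shape=(None,), dtype=tf.float32),
        'target': tf.TensorSpec(shape=(None,), dtype=tf.float32),
        'length': tf.TensorSpec(shape=(), dtype=tf.int32),
    })

for element in dataset.take(1):
    print(element['input'].shape, element['length'])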

Using static rnn getting TypeError: Cannot convert value None to a TensorFlow DType

First some of my code:
...
fc_1 = layers.Dense(256, activation='relu')(drop_reshape)
bi_LSTM_2 = layers.Lambda(buildGruLayer)(fc_1)
...
def buildGruLayer(inputs):
    gru_cells = []
    gru_cells.append(tf.contrib.rnn.GRUCell(256))
    gru_cells.append(tf.contrib.rnn.GRUCell(128))
    gru_layers = tf.keras.layers.StackedRNNCells(gru_cells)
    inputs = tf.unstack(inputs, axis=1)
    outputs, _ = tf.contrib.rnn.static_rnn(
        gru_layers,
        inputs,
        dtype='float32')
    return outputs
Error I am getting when running static_rnn is:
raise TypeError("Cannot convert value %r to a TensorFlow DType." % type_value)
TypeError: Cannot convert value None to a TensorFlow DType.
The shape that comes into the layer is (64, 238, 256).
Does anyone have a clue what the problem could be? I already googled the error but couldn't find anything. Any help is much appreciated.
If anyone still needs a solution to this: it's because you need to specify the dtype for the GRUCell, e.g. tf.float32.
Its default is None, which in the documentation defaults to the first dimension of your input data (i.e. the batch dimension, which in TensorFlow is a ? or None).
Check the dtype argument from:
https://www.tensorflow.org/api_docs/python/tf/compat/v1/nn/rnn_cell/GRUCell
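A sketch of what that fix could look like in the code from the question (assuming TF 1.x, where tf.contrib is still available):
def buildGruLayer(inputs):
    # Give each cell an explicit dtype so static_rnn does not receive None
    gru_cells = [
        tf.contrib.rnn.GRUCell(256, dtype=tf.float32),
        tf.contrib.rnn.GRUCell(128, dtype=tf.float32),
    ]
    gru_layers = tf.keras.layers.StackedRNNCells(gru_cells)
    inputs = tf.unstack(inputs, axis=1)   # (64, 238, 256) -> list of 238 tensors of shape (64, 256)
    outputs, _ = tf.contrib.rnn.static_rnn(gru_layers, inputs, dtype=tf.float32)
    return outputs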

Converting a PySpark DataFrame fails on 'NoneType' object

I have a PySpark DataFrame 'data3' with many columns. I am trying to run k-means on all of them except the first two columns. When I run my code, the task always fails with TypeError: float() argument must be a string or a number, not 'NoneType'. What am I doing wrong?
def f(x):
    rel = {}
    #rel['features'] = Vectors.dense(float(x[0]),float(x[1]),float(x[2]),float(x[3]))
    rel['features'] = Vectors.dense(float(x[2]),float(x[3]),float(x[4]),float(x[5]),float(x[6]),float(x[7]),float(x[8]),float(x[9]),float(x[10]),float(x[11]),float(x[12]),float(x[13]),float(x[14]),float(x[15]),float(x[16]),float(x[17]),float(x[18]),float(x[19]),float(x[20]),float(x[21]),float(x[22]),float(x[23]),float(x[24]),float(x[25]),float(x[26]),float(x[27]),float(x[28]),float(x[29]),float(x[30]),float(x[31]),float(x[32]),float(x[33]),float(x[34]),float(x[35]),float(x[36]),float(x[37]),float(x[38]),float(x[39]),float(x[40]),float(x[41]),float(x[42]),float(x[43]),float(x[44]),float(x[45]),float(x[46]),float(x[47]),float(x[48]),float(x[49]))
    return rel
data = data3.rdd.map(lambda p: Row(**f(p))).toDF()
kmeansmodel = KMeans().setK(7).setFeaturesCol('features').setPredictionCol('prediction').fit(data)
TypeError: float() argument must be a string or a number, not 'NoneType'
Your error comes from converting the x values to float in the Vectors.dense(...) line: you probably have missing values.
You can add a guard so that each value is only converted to float when it is not missing. For example:
list_of_xs = [x[2], x[3], x[4], x[5], x[6]]  # ...and so on for the remaining columns
converted = [float(v) if v is not None else 0.0 for v in list_of_xs]  # pick a sensible default
Or drop rows with missing values from the DataFrame first, e.g. data3.dropna().
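Putting that together, a sketch of how f could look with a small helper instead of the long chain of float() calls (assuming missing values should fall back to 0.0, which you may want to change, and that Vectors comes from pyspark.ml.linalg):
from pyspark.ml.linalg import Vectors
from pyspark.sql import Row

def to_float(value, default=0.0):
    # Guard against None before casting
    return float(value) if value is not None else default

def f(x):
    # Columns 2..49, as in the original code, converted with the guard
    return {'features': Vectors.dense([to_float(v) for v in x[2:50]])}

data = data3.rdd.map(lambda p: Row(**f(p))).toDF()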

Getting error while passing Class_weight parameter in Random Forest

I am building a binary classifier. Since my data is unbalanced, I am using class weights. I am getting an error while passing the values; how do I fix this?
Error: ValueError: class_weight must be dict, 'balanced', or None, got: [{0: 0.4, 1: 0.6}]
Code
rf=RandomForestClassifier(n_estimators=1000,oob_score=True,min_samples_leaf=500,class_weight=[{0:.4, 1:.6}])
fit_rf=rf.fit(X_train_res,y_train_res)
Error
\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\class_weight.py in compute_class_weight(class_weight, classes, y)
60 if not isinstance(class_weight, dict):
61 raise ValueError("class_weight must be dict, 'balanced', or None,"
---> 62 " got: %r" % class_weight)
63 for c in class_weight:
64 i = np.searchsorted(classes, c)
ValueError: class_weight must be dict, 'balanced', or None, got: [{0: 0.4, 1: 0.6}]
How can I fix this?
Per the documentation
class_weight : dict, list of dicts, “balanced”,
Therefore, the class_weight parameter accepts a dictionary, a list of dictionaries, or the string "balanced". The error message states that it wants a dictionary, and since you have only one dictionary, a list is not needed.
So, let's try:
rf = RandomForestClassifier(n_estimators=1000,
                            oob_score=True,
                            min_samples_leaf=500,
                            class_weight={0: .4, 1: .6})
fit_rf = rf.fit(X_train_res, y_train_res)
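If hand-tuned weights are not essential, the same parameter also accepts the string 'balanced', which lets scikit-learn derive the weights from the class frequencies (the list-of-dicts form is intended for multi-output problems):
rf = RandomForestClassifier(n_estimators=1000,
                            oob_score=True,
                            min_samples_leaf=500,
                            class_weight='balanced')
fit_rf = rf.fit(X_train_res, y_train_res)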