xgboost TypeError: can not initialize DMatrix from DataFrame - xgboost

I am getting the error below when creating a DMatrix from data in Python.
TypeError: can not initialize DMatrix from DataFrame
Exception AttributeError: "'DMatrix' object has no attribute 'handle'" in <bound method DMatrix.__del__ of <xgboost.core.DMatrix object at 0x584d210>> ignored

Without the accompanying code, my best guess is that you are passing the pandas DataFrame directly. Instead, pass the NumPy representation of the DataFrame,
i.e. pandas.DataFrame.values, as below:
X_train = pd.read_csv("train.csv")
y_train = X_train['label']
X_train.drop(['label'],axis=1,inplace=True)
final_GBM.fit(X_train.values,y_train.values)
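The conversion step on its own can be sketched as follows. This is a minimal illustration, not the asker's actual pipeline: the small in-memory DataFrame is a hypothetical stand-in for train.csv, and the column names f1 and f2 are assumptions. It only shows that .values (or the equivalent .to_numpy()) yields plain NumPy arrays, which is the form the answer suggests passing on.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv: two feature columns plus a 'label' column.
df = pd.DataFrame({
    "f1": [0.1, 0.4, 0.9, 0.3],
    "f2": [1.0, 0.0, 1.0, 0.0],
    "label": [0, 1, 1, 0],
})

y_train = df["label"]
X_train = df.drop(columns=["label"])

# .values (or .to_numpy()) returns a plain numpy.ndarray without the pandas index.
X_array = X_train.values
y_array = y_train.values

print(type(X_array), X_array.shape)
print(type(y_array), y_array.shape)
```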

Related

Dataframe containing np.arrays in each cell into Machine learning method

I have a pandas DataFrame containing np.arrays in each cell. Each array is 1000 samples long. However, when I try to use this as training data for an LSTM, it fails:
model.fit(x_train, y_train, epochs = 15)
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
How do I tackle this? Can't find an answer elsewhere, tried this:
x_train=np.asarray(x_train).astype(np.float32)
but it failed with ValueError: setting an array element with a sequence.
Is there another way to use this sort of numpy arrays as input?
I was trying to train LSTM with my pandas dataframe data containing 1000 sample long np.arrays
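One possible way around this, sketched under assumptions: if every cell holds an array of the same length, the cells can be stacked into a single dense float array instead of an object array. The DataFrame below is hypothetical (4 rows, 2 columns named ch0 and ch1, random data), and the sketch treats each column as one feature, giving the (batch, timesteps, features) layout that Keras LSTM layers expect.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical data: 4 rows, 2 columns, each cell a 1000-sample array.
df = pd.DataFrame({
    "ch0": [rng.normal(size=1000) for _ in range(4)],
    "ch1": [rng.normal(size=1000) for _ in range(4)],
})

# Stack each column's cells into a (rows, samples) array,
# then place the columns on the last axis: (rows, samples, columns).
per_column = [np.stack(df[col].tolist()) for col in df.columns]
x_train = np.stack(per_column, axis=-1).astype(np.float32)

print(x_train.shape, x_train.dtype)  # (4, 1000, 2) float32
```

This only works when all arrays share one length; ragged arrays would need padding or truncation first.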

I cannot convert my pandas dataframe to a tensorflow dataset - Get a Value Error

My data source - https://www.kaggle.com/vbookshelf/respiratory-sound-database
TensorFlow version - 2.4.0
After a bit of Data Cleaning my pandas Dataframe looked like this:
My objective is to build a deep learning model for a classification task, so I read the audio files with scipy.io.wavfile and put each array in the DataFrame as a feature.
All the values of the audio have a shape of (882000,)
My problem is that I want to convert my Pandas Dataframe into a Tensorflow Dataset.
I get this error: ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
I tried using tf.convert_to_tensor and still get the same error. What should I do?

What does the .numpy() function do?

I tried searching for the documentation online but I can't find anything that gives me an answer. What does .numpy() function do? The example code given is:
y_true = []
for X_batch, y_batch in mnist_test:
    y_true.append(y_batch.numpy()[0].tolist())
In both PyTorch and TensorFlow, the .numpy() method is straightforward: it converts a tensor object into a numpy.ndarray object. This implies that the converted data lives in CPU memory and is processed on the CPU from then on.
Whenever you have trouble understanding a PyTorch function, you can ask help().
import torch
t = torch.tensor([1,2,3])
help(t.numpy)
Out:
Help on built-in function numpy:
numpy(...) method of torch.Tensor instance
numpy() -> numpy.ndarray
Returns :attr:`self` tensor as a NumPy :class:`ndarray`. This tensor and the
returned :class:`ndarray` share the same underlying storage. Changes to
:attr:`self` tensor will be reflected in the :class:`ndarray` and vice versa.
This numpy() method is the converter from torch.Tensor to a NumPy array.
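The shared-storage behaviour described in the help text can be checked with a short sketch. It assumes a CPU tensor that does not require gradients; the variable names are illustrative only.

```python
import numpy as np
import torch

t = torch.tensor([1, 2, 3])
a = t.numpy()            # 'a' shares its underlying storage with 't'

t[0] = 10                # an in-place change to the tensor...
print(a.tolist())        # ...is visible through the array: [10, 2, 3]

b = t.clone().numpy()    # clone first to get an independent copy
t[1] = 20
print(a.tolist())        # [10, 20, 3]
print(b.tolist())        # [10, 2, 3]
```

If the array must not change when the tensor does, cloning before the conversion (as with b above) is one option.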
The code below is a simple example in which the .numpy() method converts a Tensor to a NumPy array explicitly.
import numpy as np
import tensorflow as tf
ndarray = np.ones([3, 3])
print("TensorFlow operations convert numpy arrays to Tensors automatically")
tensor = tf.multiply(ndarray, 42)
print(tensor)
print("And NumPy operations convert Tensors to numpy arrays automatically")
print(np.add(tensor, 1))
print("The .numpy() method explicitly converts a Tensor to a numpy array")
print(tensor.numpy())
The second-to-last line of the code shows that the TensorFlow documentation itself describes .numpy() as the explicit converter from a Tensor to a NumPy array.
You may check it out here

How to convert TensorVariable to numpy

I want to convert a TensorVariable to a NumPy array, so I tried:
feature_vector = keras_model.get_layer(blob_name).output.numpy()
But I get this error:
AttributeError: 'TensorVariable' object has no attribute 'numpy'
I also tried:
feature_vector = keras_model.get_layer(blob_name).output
init = tf.compat.v1.global_variables_initializer()
with tf.compat.v1.Session() as sess:
    sess.run(init)
    print(feature_vector.eval())
But I get this error:
theano.gof.fg.MissingInputError: Input 0 of the graph (indices start
from 0), used to compute Shape(/input_1), was not provided and not
given a value. Use the Theano flag exception_verbosity='high', for
more information on this error.
Thank you @Lau. Yes, it turns out I am using Theano, and I fixed the error like this

Tensorflow - Tensorboard Event Accumulator get Tensor from TensorEvent

I am working with Tensorflow and Tensorboard version 1.14.
I would like to perform some offline analysis starting from data I saved during training using tf.summary.tensor_summary().
I am not able to recover that data with the method described here, which uses tf.train.summary_iterator: it recovers scalar data but not the data I saved with the tensor_summary method.
With the EventAccumulator object, though, I am able to recover the data I saved, but it is returned as a TensorEvent object, which has the following attributes:
step
wall_time
tensor_proto
tensor_content
The thing is that I would like to convert this data into a NumPy array. The TensorEvent object certainly has all the information needed (tensor_proto for type and shape, tensor_content for values), but since it is not a Tensor it does not have a .value or a .numpy() method. So how do I transform a TensorEvent object into a NumPy array, or equivalently into a Tensor object and then into a NumPy array?
You can use tf.make_ndarray to convert a TensorProto into a NumPy array:
tensor_np = tf.make_ndarray(tensor_event.tensor_proto)
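A small round trip illustrates the call. This sketch does not read a real event file: it builds a TensorProto with tf.make_tensor_proto as a stand-in for tensor_event.tensor_proto, and the array contents are made up for the example.

```python
import numpy as np
import tensorflow as tf

original = np.arange(6, dtype=np.float32).reshape(2, 3)

# Stand-in for tensor_event.tensor_proto: a TensorProto built from a known array.
proto = tf.make_tensor_proto(original)

# Convert the TensorProto back into a NumPy array; dtype and shape come from the proto.
recovered = tf.make_ndarray(proto)

print(recovered.shape, recovered.dtype)
```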