AttributeError: 'ArrayView' object has no attribute 'A1' - numpy

I have to import a processed h5ad file, but it seems that X has been passed as a numpy array instead of a numpy matrix. See below:
# Read the data
data_path = "/home/bbb5130/snOMICS/maria/msrna.h5ad"
adata = sn.pp.read_h5ad(data_path, pr_process="Yes")
adata
But the output was:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In [15], line 3
1 # Read the data
2 data_path = "/home/bbb5130/snOMICS/maria/msrna.h5ad"
----> 3 adata = sn.pp.read_h5ad(data_path, pr_process="Yes")
4 adata
File ~/miniconda3/envs/snOMICS/lib/python3.9/site-packages/scanet/preprocessing.py:54, in Preprocessing.read_h5ad(cls, filename, pr_process)
51 return sc.read_h5ad(filename)
52 else:
53 # initial preprocessing as it is required later
---> 54 return cls._intial(adata)
File ~/miniconda3/envs/snOMICS/lib/python3.9/site-packages/scanet/preprocessing.py:35, in Preprocessing._intial(adata)
33 adata.var['mt'] = adata.var_names.str.startswith('MT-')
34 mito_genes = adata.var_names.str.startswith('MT-')
---> 35 adata.obs['percent_mito'] = np.sum(adata[:, mito_genes].X, axis=1).A1 / np.sum(adata.X, axis=1).A1
36 sc.pp.calculate_qc_metrics(adata, qc_vars=['mt'], percent_top=None, inplace=True)
37 sc.pp.filter_cells(adata, min_genes=0)
AttributeError: 'ArrayView' object has no attribute 'A1'
Is there any way I can change the format so the file can be read?
Thanks in advance.
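One workaround that may help (a sketch, not a tested fix): .A1 exists on numpy matrices and on the result of summing a scipy sparse matrix, but not on plain numpy arrays, so converting X to a sparse matrix before handing the file to scanet should give the preprocessing code what it expects.
# Sketch: load with plain scanpy, make X sparse, save, re-read with scanet.
# The output filename "msrna_sparse.h5ad" is hypothetical.
import scanpy as sc
from scipy import sparse

adata = sc.read_h5ad(data_path)
adata.X = sparse.csr_matrix(adata.X)  # dense array -> sparse matrix
adata.write_h5ad("msrna_sparse.h5ad")

adata = sn.pp.read_h5ad("msrna_sparse.h5ad", pr_process="Yes")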

Related

TypeError: descriptor 'lower' for 'str' objects doesn't apply to a 'list' object

I want to stem my dataset. Before stemming, I tokenized it with the nltk tokenizer.
You can see the tokenized output in the screenshot of the dataset (column values).
But when I do stemming, it returns this error:
TypeError Traceback (most recent call last)
<ipython-input-102-7700a8e3235b> in <module>()
----> 1 df['Message'] = df['Message'].apply(stemmer.stem)
2 df = df[['Message', 'Category']]
3 df.head()
5 frames
/usr/local/lib/python3.7/dist-packages/Sastrawi/Stemmer/Filter/TextNormalizer.py in normalize_text(text)
2
3 def normalize_text(text):
----> 4 result = str.lower(text)
5 result = re.sub(r'[^a-z0-9 -]', ' ', result, flags = re.IGNORECASE|re.MULTILINE)
6 result = re.sub(r'( +)', ' ', result, flags = re.IGNORECASE|re.MULTILINE)
TypeError: descriptor 'lower' requires a 'str' object but received a 'list'
Hope all you guys can help me
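The traceback shows that stemmer.stem received a list, not a string: nltk tokenization turns each row of 'Message' into a list of tokens, while Sastrawi's normalize_text calls str.lower on its input. A minimal sketch of a fix, assuming each row holds such a token list, is to stem the tokens one by one and join them back together:
# Sketch: stem each nltk token, then rejoin into a single string.
df['Message'] = df['Message'].apply(
    lambda tokens: ' '.join(stemmer.stem(token) for token in tokens)
)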

Why can't I loop over xmltodict?

I've been trying to transform all my logs into dicts with the xmltodict.parse function. The thing is, when I try to convert a single row into a variable, it works fine:
a = xmltodict.parse(df['CONFIG'][0])
Same with
parsed[1] = xmltodict.parse(df['CONFIG'][1])
But when I try to iterate over the entire dataframe and store the results in a dictionary, I get the following:
for ind in df['CONFIG'].index:
    parsed[ind] = xmltodict.parse(df['CONFIG'][ind])
---------------------------------------------------------------------------
ExpatError Traceback (most recent call last)
/tmp/ipykernel_31/1871123186.py in <module>
1 for ind in df['CONFIG'].index:
----> 2 parsed[ind] = xmltodict.parse(df['CONFIG'][ind])
/opt/conda/lib/python3.9/site-packages/xmltodict.py in parse(xml_input, encoding, expat, process_namespaces, namespace_separator, disable_entities, **kwargs)
325 parser.ParseFile(xml_input)
326 else:
--> 327 parser.Parse(xml_input, True)
328 return handler.item
329
ExpatError: syntax error: line 1, column 0
Can you try this?
for ind in range(len(df['CONFIG'])):
    parsed[ind] = xmltodict.parse(df['CONFIG'][ind])
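Switching to range(len(...)) probably won't change anything by itself. ExpatError: syntax error: line 1, column 0 usually means one of the rows is not valid XML at all (for example NaN or an empty string), and the error only surfaces once the loop reaches that row. A sketch that skips such rows, assuming the bad rows are empty or non-string values:
# Sketch: parse only well-formed string rows, report the rest.
import xmltodict
from xml.parsers.expat import ExpatError

parsed = {}
for ind in df['CONFIG'].index:
    value = df['CONFIG'][ind]
    if not isinstance(value, str) or not value.strip():
        continue  # skip NaN / empty rows
    try:
        parsed[ind] = xmltodict.parse(value)
    except ExpatError:
        print(f"Row {ind} is not well-formed XML; skipping")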

Train image classification models with Colab

I followed the template and changed the link, but it doesn't work.
https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/tutorials/model_maker_image_classification.ipynb#scrollTo=3jz5x0JoskPv
This is my dataset:
https://firebasestorage.googleapis.com/v0/b/lol-fypproject.appspot.com/o/lol.tgz?alt=media&token=d07b81bd-442f-4ebe-920e-3772598fbb20
Original code:
image_path = tf.keras.utils.get_file(
    'flower_photos.tgz',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    extract=True)
image_path = os.path.join(os.path.dirname(image_path), 'flower_photos')
I changed it to this:
image_path = tf.keras.utils.get_file(
    'lol.tgz',
    'https://firebasestorage.googleapis.com/v0/b/lol-fypproject.appspot.com/o/lol.tgz?alt=media&token=d07b81bd-442f-4ebe-920e-3772598fbb20',
    extract=True)
image_path = os.path.join(os.path.dirname(image_path), 'lol')
The failing line and the error message are shown below:
data = ImageClassifierDataLoader.from_folder(image_path)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-15-a5e7646aca55> in <module>()
----> 1 data = ImageClassifierDataLoader.from_folder(image_path)
2 train_data, test_data = data.split(0.9)
/usr/local/lib/python3.7/dist-packages/tensorflow_examples/lite/model_maker/core/data_util/image_dataloader.py in from_folder(cls, filename, shuffle)
69 all_image_size = len(all_image_paths)
70 if all_image_size == 0:
---> 71 raise ValueError('Image size is zero')
72
73 if shuffle:
ValueError: Image size is zero
I have found the problem: the folder structure inside my tgz file does not match the sample's.
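For reference, a sketch of how to check what the archive actually extracted and where to point image_path. from_folder expects image_path/<class_name>/<image files>; the top-level folder name 'lol' below is an assumption about the archive's contents:
# Sketch: download and extract, then inspect the extracted layout.
import os
import tensorflow as tf

archive = tf.keras.utils.get_file(
    'lol.tgz',
    'https://firebasestorage.googleapis.com/v0/b/lol-fypproject.appspot.com/o/lol.tgz?alt=media&token=d07b81bd-442f-4ebe-920e-3772598fbb20',
    extract=True)

base_dir = os.path.dirname(archive)
print(os.listdir(base_dir))  # see what the archive actually contains

# Point at the folder whose children are one subfolder per class,
# e.g. lol/class_a/img1.jpg, lol/class_b/img2.jpg.
image_path = os.path.join(base_dir, 'lol')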

"ValueError: arrays must all be same length"

In a machine learning project, suppose I have 3 cat images and 2 dog images. When I make a dataframe for the training data:
# pre-processing train data
filenames = os.listdir('/content/train')
categories = []
for filename in os.listdir('/content/train'):
    category = filename.split('.')[0]
    if category == 'dog':
        categories.append(1)  # 1 for dog and 0 for cat
    else:
        categories.append(0)

# make a dataframe from a dictionary
df = pd.DataFrame(
    {
        'filename': filenames,
        'category': categories
    }
)
It gives an error, which I think is because I don't have the same number of dog and cat images.
ValueError Traceback (most recent call last)
<ipython-input-28-2d4e2440ba41> in <module>()
12 {
13 'filename' : filenames,
---> 14 'category' : categories
15 }
16 )
3 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in extract_index(data)
395 lengths = list(set(raw_lengths))
396 if len(lengths) > 1:
--> 397 raise ValueError("arrays must all be same length")
398
399 if have_dicts:
ValueError: arrays must all be same length
Is there any way to fix it without adding any image to the training dataset?
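The unequal class counts should not matter here: both lists are built from the same directory listing, one entry per file, so they can only end up with different lengths if the directory contents change between the two os.listdir calls. A sketch that lists the directory once, making the two columns equal-length by construction:
# Sketch: derive both columns from a single directory listing.
import os
import pandas as pd

filenames = os.listdir('/content/train')
categories = [1 if name.split('.')[0] == 'dog' else 0 for name in filenames]

df = pd.DataFrame({'filename': filenames, 'category': categories})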

While removing HTML text from a column, an "object of type 'float' has no len()" error is occurring

I am using an Amazon dataset to do sentiment analysis. The dataset content is:
https://i.stack.imgur.com/qcKZp.png
The dataset can be found at:
https://www.kaggle.com/PromptCloudHQ/amazon-reviews-unlocked-mobile-phones
I am trying to remove HTML from the Reviews column. This is what I am doing. Note: the dataset is assigned to df.
df_removedNoise = []

def removingHTML(text):
    soup = BeautifulSoup(text, 'lxml').get_text()
    return soup

def removingNoise(text):
    html_removed = removingHTML(text)
    return html_removed

for i in df["Reviews"]:
    text = removingNoise(i)
    df_removedNoise.append(text)
Even though the Reviews column has object as its datatype, I am still getting an error like this:
TypeError Traceback (most recent call last)
<ipython-input-83-3591f5d7a54f> in <module>
9
10 for i in df["Reviews"]:
---> 11 df_removedNoise.append(removingNoise(i))
<ipython-input-83-3591f5d7a54f> in removingNoise(text)
5
6 def removingNoise(text):
----> 7 html_removed = removingHTML(text)
8 return html_removed
9
<ipython-input-83-3591f5d7a54f> in removingHTML(text)
1 df_removedNoise = []
2 def removingHTML(text):
----> 3 soup = BeautifulSoup(text, 'lxml').get_text()
4 return soup
5
~/anaconda3/lib/python3.7/site-packages/bs4/__init__.py in __init__(self, markup, features, builder, parse_only, from_encoding, exclude_encodings, **kwargs)
244 if hasattr(markup, 'read'): # It's a file-type object.
245 markup = markup.read()
--> 246 elif len(markup) <= 256 and (
247 (isinstance(markup, bytes) and not b'<' in markup)
248 or (isinstance(markup, str) and not '<' in markup)
TypeError: object of type 'float' has no len()
Any help will be appreciated!
Check for NaN with df[df['Reviews'].isnull()]; if you find any, try dropna first.
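An object-dtype column can still contain NaN, which is a float, and that is what BeautifulSoup trips over when it calls len(markup). A sketch of the suggested fix, dropping the missing rows before cleaning:
# Sketch: inspect NaN rows, drop them, then strip HTML from the rest.
from bs4 import BeautifulSoup

print(df[df['Reviews'].isnull()])   # inspect the offending rows

df = df.dropna(subset=['Reviews'])  # drop rows with missing reviews
df_removedNoise = [
    BeautifulSoup(review, 'lxml').get_text() for review in df['Reviews']
]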