I have trouble accessing rows in a DataFrame. My code and the results are as follows. What's the problem? Please help me.
df = pd.read_excel('./eeg_samples/chanlocs67.xlsx', usecols=[0, 3, 4, 5], index_col='labels')
df.index.names = [None]
print(df.head())
print(df.loc['Fp1'])
The result is as follows.
              X        Y        Z
'Fp1'   83.9171  29.4367   -6.990
'Fz'    58.5120  -0.3122   66.462
'F3'    53.1112  50.2438   42.192
'F7'    42.4743  70.2629  -11.420
'FT9'   14.5673  84.0759  -50.429
Traceback (most recent call last):
File "C:\ProgramData\mne-python\1.2.1_0\lib\site-
packages\pandas\core\indexes\base.py", line 3803, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 138, in
pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 165, in
pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 5745, in
pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 5753, in
pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Fp1'
Your index values are coming in with surrounding quotes, so you either need to do this or fix the data in your Excel file:
print(df.loc["'Fp1'"])
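If you would rather fix the DataFrame than quote the key, a small sketch of an alternative (assuming every label is wrapped in single quotes like in the output above) is to strip the quotes from the index right after reading:

# Strip the surrounding single quotes from the index labels.
df.index = df.index.str.strip("'")

print(df.loc['Fp1'])  # the plain label now works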
Related
I have an .npz file in which I have stored a dictionary. The dictionary has some keys and the values are numpy arrays. I want to read the dictionary in the __getitem__() method of the dataset used by my dataloader. When I set the DataLoader num_workers to 1, everything runs fine. But when I increase num_workers, it throws the following error when reading the data from that .npz file:
Traceback (most recent call last):
File "scripts/train.py", line 235, in <module>
train(args)
File "scripts/train.py", line 186, in train
solver(args.epoch, args.verbose)
File "/local-scratch/codebase/cap/lib/solver.py", line 174, in __call__
self._feed(self.dataloader["train"], "train", epoch_id)
File "/local-scratch/codebase/cap/lib/solver.py", line 366, in _feed
for data_dict in dataloader:
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
zipfile.BadZipFile: Caught BadZipFile in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/local-scratch/codebase/cap/lib/dataset.py", line 947, in __getitem__
other_bbox_feat = self.box_features['{}-{}_{}.{}'.format(scene_id, target_object_id, ann_id, object_id)]
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/numpy/lib/npyio.py", line 255, in __getitem__
pickle_kwargs=self.pickle_kwargs)
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/numpy/lib/format.py", line 763, in read_array
data = _read_bytes(fp, read_size, "array data")
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/site-packages/numpy/lib/format.py", line 892, in _read_bytes
r = fp.read(size - len(data))
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/zipfile.py", line 872, in read
data = self._read1(n)
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/zipfile.py", line 962, in _read1
self._update_crc(data)
File "/local-scratch/anaconda3/envs/scanenv/lib/python3.6/zipfile.py", line 890, in _update_crc
raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file 'scene0519_00-13_1.0.npy'
As far as I know, the PyTorch DataLoader uses multiprocessing for data loading. Perhaps the issue is with multiprocessing and .npz files. I would really appreciate any help.
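That suspicion matches a common failure mode: if the .npz is opened once in the dataset's __init__, every forked DataLoader worker shares the same underlying zip file handle, and concurrent reads can interleave and corrupt it (hence the bad CRC). A minimal sketch of one workaround is to store only the path and open the file lazily inside each worker; the class and attribute names below are hypothetical:

import numpy as np
from torch.utils.data import Dataset

class BoxFeatureDataset(Dataset):
    """Hypothetical dataset that opens the .npz lazily, once per worker process."""

    def __init__(self, npz_path, keys):
        self.npz_path = npz_path   # keep only the path, not an open file handle
        self.keys = keys           # the dictionary keys stored in the .npz
        self.box_features = None   # opened on first access in each worker

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        # Each worker process opens its own handle the first time it reads,
        # so no zip handle is shared across forked processes.
        if self.box_features is None:
            self.box_features = np.load(self.npz_path)
        return self.box_features[self.keys[idx]]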
KeyError: 'Name'
>>> df=pd.read_csv(text_file)
>>> print(df)
Name Age
0 Ritesh 32
1 Priyanka 29
>>> print(df['Name'].where(df['Name'] == 'Ritesh'))
Traceback (most recent call last):
File "/Users/reyansh/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Name'
During handling of the above exception, another exception occurred:
read_csv didn't read your file into two columns because your file uses a space as the separator while the default is a comma. Specify the space explicitly:
df = pd.read_csv(text_file, sep=" ")
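With the separator fixed, the original lookup behaves as expected; a quick sanity check (text_file below is a hypothetical path to the same space-separated file):

import pandas as pd

text_file = "people.txt"  # hypothetical path to the space-separated file above

df = pd.read_csv(text_file, sep=" ")             # two columns: Name, Age
print(df['Name'].where(df['Name'] == 'Ritesh'))
# 0    Ritesh
# 1       NaN
# Name: Name, dtype: object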
I'm trying to access a row of the dataframe dtSortedTable by
dtSortedTable.loc[decisionCountSorted.index[0]]
dtSortedTable is
          X0 X1 X2
(D1, G2)   A  B  C
(D2, G1)   A  A  A
(D2, G0)   A  A  C
decisionCountSorted indexes look like:
Index([('D1', 'G2'), ('D2', 'G1'), ('D2', 'G0')], dtype='object')
The indexes of decisionCountSorted are exactly the same as those of dtSortedTable. The indexes are multilevel with 2 levels. Why am I getting the error below? I need to run some tests on decisionCountSorted and extract the corresponding rows from dtSortedTable. Any help would be hugely appreciated!
Traceback (most recent call last):
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'D1'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/tkazi/Documents/Code/logicsim/logic.py", line 183, in <module>
dtFixed = dtConsensus(dtSortedTable,quorumCount)
File "/Users/tkazi/Documents/Code/logicsim/logic.py", line 119, in dtConsensus
print(dtSortedTable.loc[decisionCountSorted.index[0]])
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexing.py", line 1762, in __getitem__
return self._getitem_tuple(key)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexing.py", line 1272, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexing.py", line 1389, in _getitem_lowerdim
section = self._getitem_axis(key, axis=i)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexing.py", line 1965, in _getitem_axis
return self._get_label(key, axis=axis)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexing.py", line 625, in _get_label
return self.obj._xs(label, axis=axis)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/generic.py", line 3537, in xs
loc = self.index.get_loc(key)
File "/usr/local/Caskroom/miniconda/base/envs/logicsim/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'D1'
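The printed Index (dtype='object') suggests the labels are plain Python tuples rather than a true MultiIndex, so .loc treats ('D1', 'G2') as a multi-level key and fails on 'D1'. A minimal sketch of one possible fix, converting the tuple index into a real MultiIndex (the data below is a toy reconstruction of the table shown above):

import pandas as pd

# Toy reconstruction: an object-dtype index holding plain tuples,
# like the one printed in the question.
idx = pd.Index([("D1", "G2"), ("D2", "G1"), ("D2", "G0")], tupleize_cols=False)
dtSortedTable = pd.DataFrame(
    [["A", "B", "C"], ["A", "A", "A"], ["A", "A", "C"]],
    columns=["X0", "X1", "X2"],
    index=idx,
)

# dtSortedTable.loc[("D1", "G2")] raises KeyError: 'D1' here, because .loc
# interprets the tuple as a multi-level key on a non-MultiIndex index.

# Converting the tuples into a genuine 2-level MultiIndex makes the lookup work.
dtSortedTable.index = pd.MultiIndex.from_tuples(dtSortedTable.index)
print(dtSortedTable.loc[("D1", "G2")])
# X0    A
# X1    B
# X2    C
# Name: (D1, G2), dtype: object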
I have a dataframe for each of 20 different sheets. The code ran normally for the first 16 sheets, but on the 17th sheet it raised an error. Here is my code:
A=A.sort_values(by=['timing','id'])
The error was:
Traceback (most recent call last):
File "<ipython-input-24-11bf4f35bb1b>", line 1, in <module>
SessionNumber(5)
File "filepath", line 160
DepthBuyA=DepthBuyA.sort_values(by=['timing','id'])
File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 4411, in sort_values
stacklevel=stacklevel)
File "C:\Anaconda\lib\site-packages\pandas\core\generic.py", line 1382, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'id'
So I thought there must be some problem with the column 'id' on that particular sheet, because the other sheets also had 'id' and none of them raised an error like that. So I tried:
print(A['id'])
And it successfully printed the column 'id' for sheet 17; however, right after printing it, it raised this error:
File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
return self._getitem_column(key)
File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
return self._get_item_cache(key)
File "C:\Anaconda\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
values = self._data.get(item)
File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 4115, in get
loc = self.items.get_loc(item)
File "C:\Anaconda\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'id'
So after that I tried running the code directly in the console, and there was no error.
A=A.sort_values(by=['timing','id'])
So what is the problem, and what can I do to fix it?
Thank you!
I used the column index instead of the name, and it is fine now.
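If that answer means sorting by column position rather than by label, a minimal sketch might look like the first option below; a second, common culprit for this kind of intermittent KeyError is stray whitespace in a header, handled by the second option (both assume A is the dataframe for the problematic sheet, as in the question):

# Option 1: sort by column position instead of label; the positions 0 and 1
# are hypothetical stand-ins for wherever 'timing' and 'id' sit in that sheet.
A = A.sort_values(by=[A.columns[0], A.columns[1]])

# Option 2: strip stray whitespace from the headers, then sort by the cleaned labels.
A.columns = A.columns.str.strip()
A = A.sort_values(by=['timing', 'id'])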
pandas.read_csv works properly when used as a single statement, but it gives a FileNotFoundError when used inside a loop, even though the file exists.
for filename in os.listdir("./Datasets/pollution"):
    print(filename)  # To check which file is under processing
    df = pd.read_csv(filename, sep=",").head(1)
The above lines give the following error.
pollutionData184866.csv <----- The name of the file is printed properly.
Traceback (most recent call last):
File "/home/parnab/PycharmProjects/FinalYearProject/locationExtractor.py", line 13, in <module>
df = pd.read_csv(i, sep=",").head(1)
File "/usr/lib/python3.6/site-packages/pandas/io/parsers.py", line 646, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python3.6/site-packages/pandas/io/parsers.py", line 389, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/lib/python3.6/site-packages/pandas/io/parsers.py", line 730, in __init__
self._make_engine(self.engine)
File "/usr/lib/python3.6/site-packages/pandas/io/parsers.py", line 923, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/lib/python3.6/site-packages/pandas/io/parsers.py", line 1390, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4184)
File "pandas/parser.pyx", line 667, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:8449)
FileNotFoundError: File b'pollutionData184866.csv' does not exist
But when I do
filename = 'pollutionData184866.csv'
df = pd.read_csv(filename, sep=',')
it works fine.
What am I doing wrong?
os.listdir("./Datasets/pollution") returns a list of file names without a path, and according to the path "./Datasets/pollution" you are parsing CSV files that are NOT in the current directory ".". Changing it to glob.glob('./Datasets/pollution/*.csv') should work, because glob.glob() returns a list of matching files/directories including the given path.
Demo:
In [19]: os.listdir('d:/temp/.data/629509')
Out[19]:
['AAON_data.csv',
'AAON_data.png',
'AAPL_data.csv',
'AAPL_data.png',
'AAP_data.csv',
'AAP_data.png']
In [20]: glob.glob('d:/temp/.data/629509/*.csv')
Out[20]:
['d:/temp/.data/629509\\AAON_data.csv',
'd:/temp/.data/629509\\AAPL_data.csv',
'd:/temp/.data/629509\\AAP_data.csv']
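Applied to the original loop, a minimal sketch of either fix (glob for full paths, or os.path.join on the os.listdir result):

import glob
import os

import pandas as pd

# Option 1: glob yields paths that already include the directory.
for filepath in glob.glob("./Datasets/pollution/*.csv"):
    print(filepath)  # check which file is under processing
    df = pd.read_csv(filepath, sep=",").head(1)

# Option 2: keep os.listdir, but join each bare file name back onto the directory.
for filename in os.listdir("./Datasets/pollution"):
    if not filename.endswith(".csv"):
        continue
    filepath = os.path.join("./Datasets/pollution", filename)
    df = pd.read_csv(filepath, sep=",").head(1)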