How to select multiple ranges of rows in a pandas DataFrame [duplicate] - pandas

Given a pandas DataFrame with an index from 0 to 30, I would like to select the rows within several index ranges: [0:5], [10:15] and [20:25].
How can I do that?

Say you have a random pandas DataFrame with 30 rows and 4 columns as follows:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,30,size=(30, 4)), columns=list('ABCD'))
You can then use np.r_ to index into ranges of rows [0:5], [10:15] and [20:25] as follows:
df.loc[np.r_[0:5, 10:15, 20:25], :]
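To make the behaviour explicit: np.r_ concatenates the slices into one integer array, and the stop of each slice is excluded. The following is a minimal sketch, assuming the default RangeIndex (so labels and positions coincide); the names rows, subset and subset_by_position are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 30, size=(30, 4)), columns=list('ABCD'))

# np.r_ turns the three slices into a single array: 0..4, 10..14, 20..24
rows = np.r_[0:5, 10:15, 20:25]

# With the default RangeIndex, label-based and position-based selection agree
subset = df.loc[rows, :]
subset_by_position = df.iloc[rows]

print(subset.index.tolist())
```

If the index is not a plain 0..n-1 range, df.iloc is probably the safer choice, since df.loc would look the numbers up as labels.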


How Should I skip Rows in Pandas DataFrame, that are above the Index Columns name? [duplicate]

I want to have all the column names on one line. How should I do it?
I tried many things but couldn't do it, including renaming columns.
You need to flatten your multiindex column header.
df.columns = df.columns.map('_'.join)
Or using f-string with list comprehension:
df.columns = [f'{i}_{j}' if j else f'{i}' for i, j in df.columns]
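As a small illustration of the list-comprehension variant, here is a sketch on a made-up two-level header. The column tuples and values are hypothetical and only stand in for whatever the real file produces.

```python
import pandas as pd

# Hypothetical two-level header; the second level is empty for the first column
columns = pd.MultiIndex.from_tuples([('id', ''), ('price', 'min'), ('price', 'max')])
df = pd.DataFrame([[1, 10, 20], [2, 30, 40]], columns=columns)

# Join both levels, skipping the separator when the second level is empty
df.columns = [f'{i}_{j}' if j else f'{i}' for i, j in df.columns]

print(df.columns.tolist())
```

The conditional matters for columns whose second level is empty: a plain '_'.join would leave a trailing underscore on them.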

How can I replace value in cells in dataframe? [duplicate]

I have cells with the value '...' and I want to replace them with NaN. My dataframe is called energy.
import pandas as pd
import numpy as np
file_name_energy = "data/Energy Indicators.xls"
energy = pd.read_excel(file_name_energy)
energy.replace('...', np.nan)
I tried to use replace(), but it doesn't work and doesn't raise any error.
You need to reassign your dataframe or use inplace=True:
energy = energy.replace('...', np.nan)  # or: energy.replace('...', np.nan, inplace=True)
But since you're reading an Excel file, why not use the na_values parameter of pandas.read_excel?
Try this:
energy = pd.read_excel(file_name_energy, na_values= ["..."])
From the documentation:
na_values : scalar, str, list-like, or dict, default None
Additional strings to recognize as NA/NaN.
If dict passed, specific per-column NA
values. By default the following values are interpreted as NaN: ‘’,
‘#N/A’, ‘#N/A N/A’, ‘#NA’, ‘-1.#IND’, ‘-1.#QNAN’, ‘-NaN’, ‘-nan’,
‘1.#IND’, ‘1.#QNAN’, ‘<NA>’, ‘N/A’, ‘NA’, ‘NULL’, ‘NaN’, ‘n/a’, ‘nan’,
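To see why a bare replace() call appears to do nothing, here is a minimal sketch on a small in-memory frame. The column names and values are made up and only stand in for the Excel data.

```python
import numpy as np
import pandas as pd

# Small stand-in for the Excel data; the column names are hypothetical
energy = pd.DataFrame({'Country': ['A', 'B', 'C'],
                       'Energy Supply': [321, '...', 1386]})

# replace() returns a new DataFrame; the original is left untouched
cleaned = energy.replace('...', np.nan)

print(energy['Energy Supply'].isna().sum())   # counts NaN in the original
print(cleaned['Energy Supply'].isna().sum())  # counts NaN in the returned copy
```

Only the returned copy has the placeholder replaced, which is why the result has to be assigned back (or passed in through na_values at read time).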

Seaborn boxplot for classification with pandas wide to long [duplicate]

I have data that I would like to train an ML classifier on. The data is in wide format. I'd like to draw a boxplot with seaborn: sns.boxplot(x='variable',y='value', hue='target', data=df_train). How do I reshape the data to be able to pass it to sns.boxplot?
Sample data
import pandas as pd
from sklearn import datasets
X, y = datasets.make_classification(n_samples=100, n_features=5, random_state=1)
df_train = pd.DataFrame(X)
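pd.melt is what you want to use. Add the labels as a column first so they can serve as id_vars:
import seaborn as sns
df_train['y'] = y  # attach the class labels to the wide frame
dfg_train = df_train.melt(id_vars='y')
sns.boxplot(x='variable', y='value', hue='y', data=dfg_train)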
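For a quick look at what the reshape produces, here is a sketch without the plotting step. It assumes the labels are stored in a column named y and uses a small random array as a stand-in for the make_classification output; long_df is an illustrative name.

```python
import numpy as np
import pandas as pd

# Stand-in for the classification data: 6 samples, 3 features, binary labels
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))
y = np.array([0, 1, 0, 1, 1, 0])

df_train = pd.DataFrame(X)
df_train['y'] = y

# Wide to long: one row per (sample, feature) pair, with the label carried along
long_df = df_train.melt(id_vars='y')

print(long_df.columns.tolist())
print(long_df.shape)
```

The resulting variable and value columns are the ones sns.boxplot expects for x and y, with the label column used as hue.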

count the number of strings in a 2-D pandas series [duplicate]

I am trying to count the number of characters in an uneven 2-D pandas series.
df = pd.DataFrame({ 'A' : [['a','b'],['a','c','f'],['a'], ['b','f']]})
I want to count the number of times each character occurs.
Any ideas?
You can use explode() and value_counts().
import pandas as pd
df = pd.DataFrame({ 'A' : [['a','b'],['a','c','f'],['a'], ['b','f']]})
df = df.explode("A")
print(df["A"].value_counts())
Expected output:
a 3
b 2
f 2
c 1
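The two steps can also be chained on the column itself. The sketch below does that and cross-checks the result against collections.Counter from the standard library; the names counts and expected are illustrative.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({'A': [['a', 'b'], ['a', 'c', 'f'], ['a'], ['b', 'f']]})

# explode() yields one row per list element; value_counts() tallies them
counts = df['A'].explode().value_counts()

# Cross-check with a plain Counter over the flattened lists
expected = Counter(item for row in df['A'] for item in row)

print(counts.to_dict() == dict(expected))
```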

How to convert ndarray to pandas DataFrame [duplicate]

I have ndarray data with the shape (231, 31). Now I want to convert this ndarray to a pandas DataFrame with 31 columns. I am using this code:
for i in range(1, 32):
    dataset = pd.DataFrame({'Column{}'.format(i): data[:, i-1]})
But this code only creates the last column, i.e. a DataFrame with 231 rows and just 1 column, whereas I need 31 columns. Is there any way to fix this problem, and why does it happen?
Each loop iteration creates a new DataFrame and overwrites the previous one, which is why only the last column remains.
Create the DataFrame in a single call instead, with pd.DataFrame(data).
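If the Column1 ... Column31 names from the question are wanted as well, they can be passed through the columns argument. This is a minimal sketch using an array of zeros as a stand-in for the real data; column_names is an illustrative name.

```python
import numpy as np
import pandas as pd

# Stand-in array with the shape given in the question
data = np.zeros((231, 31))

# Build all 31 column names up front, then create the frame in one call
column_names = ['Column{}'.format(i) for i in range(1, 32)]
dataset = pd.DataFrame(data, columns=column_names)

print(dataset.shape)
```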