I need to format a dataframe so that some downstream code can accept it, and it has to be in a specific format. The raw dataframe comes with a MultiIndex. When I pass it to the code, I get an IndexError because the code expects a single index.
Raw dataframe
I make a copy of the dataframe and reset the index.
ticker_data_2 = ticker_data.copy().reset_index()
Removed index
I need the timestamp column to be the index, so I set the index to timestamp. But now I have two columns named timestamp. set_index is supposed to remove the timestamp column and make it the index, not make a copy.
ticker_data_2.set_index(ticker_data_2['timestamp'], inplace=True)
Duplicate timestamp columns
How do I fix this so that timestamp appears only as the index, without a second timestamp column?
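One likely cause is that set_index was given the Series ticker_data_2['timestamp'] rather than the column label. When it receives a Series or array, pandas uses the values as the index but leaves the original column in place; when it receives the label 'timestamp', the column is moved into the index (drop=True is the default). A minimal sketch, assuming a made-up two-level frame in place of the real ticker_data (the symbol and close names and all values are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for the multi-index ticker_data in the question.
idx = pd.MultiIndex.from_product(
    [["AAPL"], pd.to_datetime(["2024-01-02", "2024-01-03"])],
    names=["symbol", "timestamp"],
)
ticker_data = pd.DataFrame({"close": [185.6, 184.2]}, index=idx)

# Flatten the MultiIndex into ordinary columns (reset_index returns a new frame).
ticker_data_2 = ticker_data.reset_index()

# Pass the column label, not the Series, so the column is moved into the index.
ticker_data_2 = ticker_data_2.set_index("timestamp")

print(ticker_data_2.columns.tolist())
print(ticker_data_2.index.name)
```

If only the timestamp level has to leave the MultiIndex, ticker_data.reset_index(level='symbol') would be a shorter route under the same assumed level names.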
Related
I have a dataframe in pandas. I have done the following:
data.reset_index(inplace=True)
data.set_index(['Date','Name'],inplace=True)
How can I get the index values (i.e., Date and Name) for a specific row (e.g., for data.iloc[0])?
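One way to read the index of a row is to index data.index positionally, or to take the name attribute of the row returned by iloc. With a two-level index both give a (Date, Name) tuple. A small sketch, assuming made-up sample data (the Value column and all values are hypothetical):

```python
import pandas as pd

# Hypothetical frame indexed by Date and Name, as in the question.
data = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "Name": ["a", "b"],
    "Value": [1, 2],
}).set_index(["Date", "Name"])

# Positional lookup on the index itself returns a (Date, Name) tuple.
first_label = data.index[0]

# The row selected by iloc carries the same label in its name attribute.
same_label = data.iloc[0].name

# Unpack the tuple into its two levels.
date, name = first_label
```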
I have a data frame with a variable number of columns. What is the best way to drop, in place, all columns except the nth column and the index?
You can keep just the n-th column by selecting it explicitly:
df = df[df.columns[n:n+1]]
Note the slice notation, which makes sure you get a DataFrame and not a Series.
The index will naturally stay in df.
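The selection above builds a new frame and rebinds df. If the change really has to happen on the existing object, one option is to drop every other column with inplace=True. A sketch under the assumption that column names are unique (the frame and n are made up for illustration):

```python
import pandas as pd

# Hypothetical three-column frame with a labelled index.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]}, index=["x", "y"])
n = 1  # position of the column to keep

# Slice selection: returns a new DataFrame holding only column n.
kept = df[df.columns[n:n + 1]]

# In-place variant: drop all column labels except the one at position n.
df.drop(columns=df.columns.delete(n), inplace=True)
```

Both leave the index untouched; the second one mutates df itself, so other references to the same object see the change.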
I have a dataframe with a column 'date' (YYYY-MM-DD HH:MM:SS) of datetime64 type.
I want to drop rows by selecting ranges of dates. How can I do this in Python/pandas?
Thank you so much in advance
(I cannot post comments, so I am posting this as an answer.) The following questions also deal with deleting or filtering rows of a data frame based on the value of a given column:
Delete rows from a pandas DataFrame based on a conditional expression involving len(string) giving KeyError
Deleting DataFrame row in Pandas based on column value
Basically, you can pass a boolean array to the indexing operator [ ] of the data frame, which returns the filtered data frame. See the pandas v1.0.1 (!) documentation on how to index data frames. This question is also helpful.
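Applied to the date question, the boolean-array idea could look like the sketch below: build a mask that is True inside the range to drop, then keep the rows where the mask is False. The sample data and the start and end bounds are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a datetime64 'date' column.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01 00:00:00", "2020-01-05 12:00:00",
        "2020-01-10 08:30:00", "2020-01-20 23:59:59",
    ]),
    "value": [1, 2, 3, 4],
})

# Assumed range of dates to remove (both ends inclusive).
start = pd.Timestamp("2020-01-04")
end = pd.Timestamp("2020-01-11")

# True for rows whose date falls inside the range.
in_range = df["date"].between(start, end)

# Keep only the rows outside the range.
filtered = df[~in_range]
```

Several ranges can be combined by OR-ing their masks with | before negating.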
I am trying to reposition the index column in the CSV output of pandas DataFrame.to_csv().
I can order the non-index columns using the columns argument, but it is unclear how to move the index column.
If I have two columns, Name and Age, plus the index, I want the columns to come out in the following order in the resulting CSV: Name, Age, index.
Does anyone know how to do this?
The index cannot be moved; it always comes first in a DataFrame or Series (or, in older pandas versions, a Panel). But you can copy the data from the index to another column.
If you need a last column created from the index:
df['new_last'] = df.index
If you need the new column at a custom position:
df.insert(2, 'new', df.index)
And finally, to prevent writing the index to the CSV (thanks @Vivek Kalyanarangan):
df.to_csv(file, index=False)
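Put together, the steps above might look like this. It is a sketch with made-up data that writes to an in-memory buffer instead of a file, and the new column is named index only to match the order asked for in the question:

```python
import io

import pandas as pd

# Hypothetical frame with the two columns from the question.
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Age": [30, 25]})

# Work on a copy so the original frame keeps its shape.
out = df.copy()

# Copy the index into a regular column, which lands in last position.
out["index"] = out.index

# Write without the real index so it does not also appear first.
buf = io.StringIO()
out.to_csv(buf, index=False)
csv_text = buf.getvalue()
```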
I have a pandas dataframe with 10 columns. I would like to add a column which will uniquely identify every row. I do have to come up with the unique value (it could be as simple as a running sequence). How can I do this? I tried adding the index as a column itself, but for some reason I get a KeyError when I do this.
Add a column built from a range over the length of your index:
df['new'] = range(1, len(df.index)+1)
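A short sketch of that on made-up data, next to a way of copying the index itself into a column. The KeyError in the question may have come from looking the index up by name as if it were a column (for example df['index']); that is a guess, since the failing code is not shown.

```python
import pandas as pd

# Hypothetical frame with a non-numeric index.
df = pd.DataFrame({"a": [10, 20, 30]}, index=["x", "y", "z"])

# Running sequence 1..len(df), unique per row regardless of the index.
df["new"] = range(1, len(df.index) + 1)

# The index is an attribute, not a column, so read it through df.index.
df["row_label"] = df.index
```

The row_label column is only unique if the index itself is unique, whereas the running sequence always is.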