I am trying to reposition the index column in the CSV produced by pandas DataFrame.to_csv().
I can order the non-index columns with the columns argument, but it is unclear how to move the index column.
If I have two columns, Name and Age, plus the index, I want the columns to appear in the resulting CSV in this order: Name, Age, index.
Does anyone know how to do this?
The index cannot be moved; it is always the first column in a DataFrame, Series or Panel. But you can copy the data from the index into another column.
If you need a new last column created from the index:
df['new_last'] = df.index
If you need the new column at a custom position:
df.insert(2, 'new', df.index)
Finally, to prevent the index itself from being written to the CSV (thanks @Vivek Kalyanarangan):
df.to_csv(file, index=False)
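Putting the steps above together, here is a minimal sketch. The sample values and the name of the copied column ("index") are assumptions for illustration; only Name and Age come from the question.

```python
import io
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Age": [31, 45]}, index=[10, 20])

# Copy the index into a regular column appended at the end.
out = df.copy()
out["index"] = out.index

# Write without the real index so only the copied column appears, last.
buffer = io.StringIO()
out.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

The same to_csv call accepts a file path instead of the in-memory buffer used here.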
Related
I have to format a dataframe so some code can receive it. It has to be in a specific format. The raw dataframe I get has a multi-index. When I pass it to the code, it raises an IndexError because the code expects a single index.
Raw dataframe
I make a copy of the dataframe and remove the index.
ticker_data_2 = ticker_data.copy().reset_index()
Removed index
I need the timestamp column to be the index, so I set the index to timestamp. But now I have 2 columns named timestamp. set_index is supposed to remove the timestamp column and place it as the index, not make a copy.
ticker_data_2.set_index(ticker_data_2['timestamp'], inplace=True)
Duplicate timestamp columns
How do I fix this so that timestamp shows up only as the index, without a second timestamp column?
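One possible fix, sketched under assumptions: passing a Series to set_index supplies new index values without touching the columns, whereas passing the column name moves that column into the index (drop=True is the default). The index levels and the close column below are hypothetical stand-ins, since the real ticker_data is not shown.

```python
import pandas as pd

# Hypothetical stand-in for ticker_data: a two-level index (symbol, timestamp).
idx = pd.MultiIndex.from_product(
    [["ABC"], pd.to_datetime(["2021-01-04", "2021-01-05"])],
    names=["symbol", "timestamp"],
)
ticker_data = pd.DataFrame({"close": [1.0, 2.0]}, index=idx)

# reset_index() turns both index levels into ordinary columns.
ticker_data_2 = ticker_data.reset_index()

# Passing the column *name* moves the column into the index instead of copying it.
ticker_data_2 = ticker_data_2.set_index("timestamp")
print(ticker_data_2)
```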
I have a data frame with a variable number of columns. What is the best way to drop, in place, all columns except the nth one and the index?
You can keep just the n-th column by indexing it explicitly:
df = df[df.columns[n:n+1]]
Note the slice notation (n:n+1), which makes sure you get a DataFrame and not a Series.
The index will naturally stay in df.
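A small sketch of this, with made-up column names and n taken as a zero-based position. The iloc form and the in-place drop are alternatives added here for comparison, not part of the answer above.

```python
import pandas as pd

# Hypothetical frame; n is the zero-based position of the column to keep.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]}, index=["x", "y"])
n = 1

# Slicing the column labels (n:n+1) keeps the result a DataFrame.
kept = df[df.columns[n:n+1]]

# Equivalent positional selection with a list indexer.
kept_iloc = df.iloc[:, [n]]

# In-place variant: drop every column label except the one at position n.
trimmed = df.copy()
trimmed.drop(columns=trimmed.columns.delete(n), inplace=True)
```

The in-place variant assumes unique column labels, because drop works by label rather than by position.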
I am selecting row by row as follows:
for i in range(num_rows):
    row = df.iloc[i]
As a result I get a Series object where row.index.values contains the names of the df columns.
But what I wanted is a dataframe with only one row that keeps the dataframe's columns in place.
When I do row.to_frame(), instead of a 1x85 dataframe (1 row, 85 cols) I get an 85x1 dataframe whose index contains the column names, and row.to_frame().columns outputs
Int64Index([0], dtype='int64').
All I want is the original dataframe columns with only one row. How do I do that?
Or how do I turn the row.index values into column labels and change the 85x1 shape to 1x85?
You just need to add .T to transpose the result:
row.to_frame().T
Alternatively, change your for loop by wrapping i in []:
for i in range(num_rows):
    row = df.iloc[[i]]
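To compare the two approaches side by side, here is a minimal sketch on a tiny hypothetical frame (two columns instead of 85):

```python
import pandas as pd

# Hypothetical frame standing in for the 85-column one.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

row_series = df.iloc[0]                # Series; its index holds the column names
row_frame_t = row_series.to_frame().T  # n x 1 frame transposed back to 1 x n
row_frame = df.iloc[[0]]               # list indexer gives a 1 x n DataFrame directly

print(row_frame.shape, row_frame_t.shape)
```

With columns of mixed types, the intermediate Series is upcast to a common dtype (often object), so df.iloc[[i]] is likely the safer choice when the original dtypes matter.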
I have a pandas dataframe with 10 columns. I would like to add a column that uniquely identifies every row. I have to come up with the unique value myself (it could be as simple as a running sequence). How can I do this? I tried adding the index as a column itself, but for some reason I get a KeyError when I do this.
Add a column built from a range over the length of your index:
df['new'] = range(1, len(df.index)+1)
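A short sketch of the running-sequence column on hypothetical data. The KeyError in the question may come from a lookup such as df['index']: the index is not a regular column, so that label does not exist, while df.index works.

```python
import pandas as pd

# Hypothetical frame whose index contains duplicates, so it cannot serve as an ID.
df = pd.DataFrame({"name": ["a", "b", "c"]}, index=[7, 7, 9])

# Running sequence 1..len(df), one value per row.
df["new"] = range(1, len(df.index) + 1)
print(df)
```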
I have a Python dictionary and I created a pandas data frame from it like below:
I want to change the name of the index column to date, but I couldn't do this with data.set_index('date'). How can I do this? Any advice would be appreciated.
data.set_index('date') only makes an existing column named 'date' the index of the dataframe data; it does not rename anything.
You can set the name of the index with data.index.name = 'date'
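As an illustration, here is a minimal sketch with a hypothetical dictionary (the original one is not shown). rename_axis is an alternative that returns a new object instead of modifying data.

```python
import pandas as pd

# Hypothetical dictionary; the keys become the (unnamed) index.
values = {"2020-01-01": 1, "2020-01-02": 2}
data = pd.DataFrame({"value": list(values.values())}, index=list(values.keys()))

# Name the index directly on the existing frame.
data.index.name = "date"

# Alternative: rename_axis returns a new frame with the index named.
renamed = data.rename_axis("date")
print(data)
```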