Can I request all Pandas columns display in an IPython notebook? - pandas

By default, the number of columns displayed by pandas commands is limited to display.max_columns. Is there something like df.showall() that can be used to override this on a per-command bases?

The simplest way I have found to do this is the following:
from IPython.core.display import HTML
HTML(df.to_html())
This will display the whole table in the IPython notebook output cell - all rows and columns. Scrollbars will appear for large tables.
To display all columns but no more than N rows, use:
HTML(df.head(N).to_html())

You can define a context manager:
from contextlib import contextmanager
#contextmanager
def temp_option(option, value):
old_value = pd.get_option(option)
pd.set_option(option, value)
yield
pd.set_option(option, old_value)
Then do what you want, something like
>>>with temp_option('display.max_rows', 200):
print(df)
I thought pandas already had this feature, but I couldn't find it.

Related

How to increase length of ouput table or dataframe in Jupyter Notebook?

I am working on the Jupyter notebook and have been facing issues in increasing the length of the output of the Jupyter Notebook. I can see the output as follows:
I tried increasing the default length of the columns in pandas with no success. Can you please help me with it?
If you were using the typical way to view a dataframe in Jupyter (see my puzzelment about your screenshot in my comments to your original post) it would be things like this:
adapted from answer to 'Pretty-print an entire Pandas Series / DataFrame'
with pd.option_context('display.max_rows', None, 'display.max_columns', None):
display(df)
(Note that will work with the text-based viewing, too. Note it uses print(df) in the answer to 'Pretty-print an entire Pandas Series / DataFrame'.
Adjust the 'display.max_colwidth' if you want the entire column text to show:
with pd.option_context('display.max_rows', None, 'display.max_columns', None,'display.max_colwidth', -1):
display(df)
(If you prefer text like you posted, replace display() with print()
Generally with the solutions above the view window in Jupyter will get scrollbars so you can navigate to view all still.
You can also set the number of rows to show to be lower to save space, see example here.
You may also be interested in Pandas dataframe hide index functionality? or Using python / Jupyter Notebook, how to prevent row numbers from printing?.
As pointed out here, setting some some global options is covered in the Pandas Documentation for top-level options.
For display() to work these days you don't need to do anything extra. But if your are using old Jupyter or it doesn't work then try adding towards the top of your notebook file and running the following as a cell first:
from IPython.display import display

In Jupyter, do you do a running update (hide) a pandas table as you are updating it?

In Jupyter, I am running a long-running computation.
I want to show a Pandas table with the top 25 rows. The top 25 may update each iteration.
However, I don't want to show many Pandas tables. I want to delete / update the existing displayed Pandas table.
How is this possible?
This approach seems usable for matplotlibs but not pandas pretty tables.
You can use clear_output and display the dataframe:
from IPython.display import display, clear_output
# this is just to simulate the delay
import time
i = 1
while i<10:
time.sleep(1)
df = pd.DataFrame(np.random.rand(4,3))
clear_output(wait=True)
display(df)
i += 1

display output pandas dataframe in rstudio

One question please.
I like to use rstudio to code in python and R, but when I print a pandas dataframe I get output that doesn't use all the space. It is not very friendly and it is worse if I have more variables.
As shown in the attached image.
Is there a way to display the columns to the right like we do with tibble in r?
Thanks!
I have tried using these options but it doesn't work.
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

Pandas styling - change font size and format float/apply background gradient

I am building an application that displays stock correlations data in various visual forms, including a matrix with a heatmap applied. My heatmap is created by passing the correlation matrix dataframe into IPy Widgets Output, so I can display it as part of a VBox later on. I have successfully applied a background gradient and formatted my numbers to 2dp. Can anyone help me edit the function to also reduce the font size, I just want to shrink it up a little?
Note: I chose to do this using dataframe styling over matplotlib as I had a number of issues getting the output to display in the way I wanted. I also have a function that downloads the dataframe to excel with the styling applied.
I have tried putting the following line of code at the beginning of my notebook so I can leave it outside of the function, but it seems to get ignored once the dataframe is passed to Output.
pd.options.display.float_format = "{:,.2f}".format
Here is my code sample:
import seaborn as sns
import ipywidgets as ipw
import pandas as pd
import numpy as np
#Sample Data
data = np.random.randint(5,30,size=500)
df = pd.DataFrame(data.reshape((50,10)))
corr = df.corr()
#Function produces dataframe as Output
def output_heatmap_df(df):
out = ipw.Output()
with out:
display(df.style\
.background_gradient(cmap=sns.diverging_palette(220,10, as_cmap=True),axis=None).format("{:,.2f}"))
out.layout.width='1600px'
return out
output_heatmap_df(corr)
In case anyone should come across this, the below code worked for me in the end:
def output_heatmap_df(df):
out = ipw.Output()
with out:
display(df.style\
.background_gradient(cmap=sns.diverging_palette(220,10, as_cmap=True),axis=None).format("{:,.2f}")
.set_properties(**{'text-align':'center','font-size':'10px'})
.set_table_styles([{'selector':'th','props':[('text-align','center'),('font-size','10px')]}])
)
out.layout.width='1600px'
return out

How to Render Math Table Properly in IPython Notebook

The math problem that I'm solving gives different analytical solutions in different scenarios, and I would like to summarize the result in a nice table. IPython Notebook renders the list nicely:
for example:
import sympy
from pandas import DataFrame
from sympy import *
init_printing()
a, b, c, d = symbols('a b c d')
t = [[a/b, b/a], [c/d, d/c]]
t
However, when I summarize the answers into a table using DataFrame, the math cannot be rendered any more:
df = DataFrame(t, index=['Situation 1', 'Situation 2'], columns=['Answer1','Answer2'])
df
"print df.to_latex()" also gives the same result. I also tried "print(latex(t))" but it gives this after compiling in LaTex, which is alright, but I still need to manually convert it to a table:
How should I use DataFrame properly in order to render the math properly? Or is there any other way to export the math result into a table in Latex? Thanks!
Update: 01/25/14
Thanks again to #Jakob for solving the problem. It works perfectly for simple matrices, though there are still some minor problems for more complicated math expressions. But I guess like #asmeurer said, perfection requires an update in IPython and Pandas.
Update: 01/26/14
If I render the result directly, i.e. just print the list, it works fine:
MathJax is currently not able to render tables, hence the most obvious approach (pure latex) does not work.
However, following the advise of #asmeurer you should use an html table and render the cell content as latex. In your case this could be easily achieved by the following intermediate step:
from sympy import latex
tl = map(lambda tc: '$'+latex(tc)+'$',t)
df = DataFrame(tl, index=['Situation 1', 'Situation 2'], columns=['Answer'])
df
which gives:
Update:
In case of two dimensional data, the simple map function will not work directly. To cope with this situation the numpy shape, reshape and ravel functions could be used like:
import numpy as np
t = [[a/b, b/a],[a*a,b*b]]
tl=np.reshape(map(lambda tc: '$'+latex(tc)+'$',np.ravel(t)),np.shape(t))
df = DataFrame(tl, index=['Situation 1', 'Situation 2'], columns=['Answer 1','Answer 2'])
df
This gives:
Update 2:
Pandas crops cell content if the string length exceeds a certain number. E.g a more complicated expression like
t1 = [a/2+b/2+c/2+d/2]
tl=np.reshape(map(lambda tc: '$'+latex(tc)+'$',np.ravel(t1)),np.shape(t1))
df = DataFrame(tl, index=['Situation 1'], columns=['Answer 1'])
df
gives:
To cope with this issue a pandas package option has to be altered, for details see here. For the present case the max_colwidth has to be changed. The default value is 50, hence let's change it to 100:
import pandas as pd
pd.options.display.max_colwidth=100
df
gives: