change scientific notation abbreviation of y axis units to a string - pandas

First I would like to apologize, as I know I am not asking this question correctly (which is why I can't find what is likely a simple answer).
I have a graph.
As you can see, above the y axis it says 1e11, meaning that the units are hundreds of billions. I would like the graph to read "100 Billion" instead of 1e11.
I am not sure what such a notation is called.
To be clear, I am not asking to change all the y-axis tick labels to plain numbers, as in other questions. I only want to change the 1e11 at the top so that it is more readable to those who are less mathematical.
ax.get_yaxis().get_major_formatter().set_scientific(False)
results in an undesired result

import numpy as np
from matplotlib.ticker import FuncFormatter
def billions(x, pos):
    # x is the tick value, pos is the tick position (unused)
    return '$%1.1fB' % (x*1e-9)
formatter = FuncFormatter(billions)
ax.yaxis.set_major_formatter(formatter)
taken from https://matplotlib.org/examples/pylab_examples/custom_ticker1.html
produces
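One way to get a readable scale label, as a rough sketch rather than an official matplotlib recipe: divide the tick values by 1e11 in a FuncFormatter and write the scale above the axis yourself with ax.text. The names SCALE, SCALE_LABEL and scaled, as well as the sample data, are made up for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

SCALE = 1e11                 # assumption: the factor shown as "1e11" on the plot
SCALE_LABEL = "100 Billion"  # the text to show instead of "1e11"

def scaled(x, pos):
    # Show each tick as a multiple of SCALE, e.g. 2.5e11 -> "2.5"
    return f"{x / SCALE:g}"

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0.5e11, 1.2e11, 2.4e11, 3.1e11])  # made-up data
ax.yaxis.set_major_formatter(FuncFormatter(scaled))
# Place the scale text just above the top-left corner of the axes.
ax.text(0.0, 1.01, SCALE_LABEL, transform=ax.transAxes, ha="left", va="bottom")
# call plt.show() or fig.savefig(...) to view the result
```

The tick labels themselves stay short (0.5, 1, 1.5, ...), and only the text above the axis changes.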

Related

How to show min and max values at the end of the axes

I generate plots like below:
from pylab import *
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
import matplotlib.ticker as ticker
rcParams['axes.linewidth'] = 2 # set the value globally
rcParams['font.size'] = 16# set the value globally
rcParams['font.family'] = ['DejaVu Sans']
rcParams['mathtext.fontset'] = 'stix'
rcParams['legend.fontsize'] = 24
rcParams['axes.prop_cycle'] = cycler(color=['grey','b','g','r','orange'])
rc('lines', linewidth=2, linestyle='-',marker='o')
rcParams['axes.xmargin'] = 0
rcParams['axes.ymargin'] = 0
t = arange(0,21,1)
v = 2.0
s = v*t
plt.figure(figsize=(12, 4))
plt.plot(t,s,label=r'$s=%1.1f\cdot t$'%v)
plt.title(r'Distance versus time, $s=v\cdot t$')
plt.xlabel('Time $t$, s')
plt.ylabel('Distance $s$, m')
plt.autoscale(enable=True, axis='both', tight=None)
legend(loc='best')
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
plt.grid()
plt.show()
When I change t = arange(0,21,1) to, for example, t = arange(0,20,1), which gives a maximum value of 19.0 on the x axis, the maximum value disappears from the x axis. The same of course happens on the y axis.
My question is how to force matplotlib to always produce plots where the maximum values are shown right at the ends of the axes, as I always need for my purposes, or at least to make this selectable as an option.
Image from a program I wrote in Fortran some years ago
Matplotlib is more efficient, which is why I use it, but there should be an option like that (the picture above).
Otherwise I always have to look up the max and min in a text window or take additional steps to be sure of the max and min values. I would like to read them from the axes, and the question is: are there such possibilities in matplotlib? If not, I will close the post.
Axes I am thinking about, more or less
I see two ways to solve the problem.
Set the axes automatic limit mode to round numbers
In the rcParams you can do this with
rcParams['axes.autolimit_mode'] = 'round_numbers'
and remove the manual axes limits set with min and max, i.e. delete these two lines:
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
This will produce the image below. Still, the extreme values of the axes are shown at the nearest "round numbers", but the user can approximately catch the data range limits. If you need the exact value to be displayed, you can see the second solution which cannot be directly used from the rcParams.
or – Manually generate axes ticks
This solution means explicitly asking for a given number of ticks. I guess there is a way to automate it depending on the axes size etc. But if you are dealing with more or less the same graph size every time, you can decide on a fixed number of ticks manually. This can be done with
plt.xlim(min(t),max(t))
plt.ylim(min(s),max(s))
plt.xticks(np.linspace(t.min(), t.max(), 7)) # arbitrarily chosen
plt.yticks(np.linspace(s.min(), s.max(), 5)) # arbitrarily chosen
This generates the image below, quite similar to your example image.
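The second solution can be wrapped in a small helper, sketched here under the assumption that evenly spaced ticks between the data minimum and maximum are acceptable. The name endpoint_ticks is made up for this example, and the tick counts are as arbitrary as above.

```python
import numpy as np
import matplotlib.pyplot as plt

def endpoint_ticks(values, n_ticks):
    # Evenly spaced ticks whose first and last entries are the data min and max
    values = np.asarray(values, dtype=float)
    return np.linspace(values.min(), values.max(), n_ticks)

t = np.arange(0, 20, 1)   # the case from the question where the maximum is 19
s = 2.0 * t

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(t, s, marker="o")
ax.set_xlim(t.min(), t.max())
ax.set_ylim(s.min(), s.max())
ax.set_xticks(endpoint_ticks(t, 7))   # arbitrarily chosen number of ticks
ax.set_yticks(endpoint_ticks(s, 5))
# call plt.show() or fig.savefig(...) to view the result
```

The inner ticks will generally not be round numbers (19/6 is not), which is the price of pinning the end ticks to the exact data limits.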

Matplotlib value on top left and remove it

I have this array with 10 values.
I understand that the values in my array have many digits after the decimal point.
But I noticed there is a value in the top left corner of the plot.
Does anyone know what it is and how to remove it?
Thank you in advance.
The array:
0.00409960926442099
0.00409960926442083
0.004099609264420652
0.004099609264420653
0.004099609264420585
0.0040996092644205884
0.004099609264420545
0.004099609264420517
0.004099609264420514
0.004099609264420513
As your values are all very close together, the usual tick labels would all be the same. For example, if you use '%.6f' as the tick format, you'd get '0.004100' for each of the ticks. That would not be very helpful. Therefore, matplotlib puts a base number '4.099609264420e-3' together with a scale factor '1e-16' in the corner to label the yticks. So, every real ytick value is the base plus the scale factor times the displayed tick value.
To get rid of these strange numbers, you have to re-evaluate what exactly you want to achieve with your plot. If you'd set some y-limits (e.g. plt.ylim(0.004099, 0.004100)), you'd get a quite dull horizontal line. Note that 1e-16 is very close to the maximum precision you can get using standard floating-point math.
Here is some demo code to show how it would look with the '%.6f' format:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
plt.plot([0.00409960926442099, 0.00409960926442083, 0.004099609264420652, 0.004099609264420653,
0.004099609264420585, 0.0040996092644205884, 0.004099609264420545, 0.004099609264420517,
0.004099609264420514, 0.004099609264420513])
plt.gca().yaxis.set_major_formatter(mtick.FormatStrFormatter('%.6f'))
plt.tight_layout()
plt.show()
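If the tiny differences are what you actually care about, one option is to plot them directly instead of the raw values. This is only a sketch: the choice of the mean as reference and of 1e-16 as unit are assumptions, and differences this close to floating-point precision are not very trustworthy anyway.

```python
import numpy as np
import matplotlib.pyplot as plt

values = np.array([
    0.00409960926442099, 0.00409960926442083, 0.004099609264420652,
    0.004099609264420653, 0.004099609264420585, 0.0040996092644205884,
    0.004099609264420545, 0.004099609264420517, 0.004099609264420514,
    0.004099609264420513,
])

reference = values.mean()                  # assumed reference level
deviation = (values - reference) / 1e-16   # differences in units of 1e-16

fig, ax = plt.subplots()
ax.plot(deviation, marker="o")
ax.set_ylabel("deviation from mean (units of 1e-16)")
ax.set_title(f"reference = {reference:.6f}")
# call plt.show() or fig.savefig(...) to view the result
```

Because the plotted numbers are now of order one, the default formatter should label the ticks directly, and the base and scale information moves into the axis label and title where it is easier to read.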

Trying to use the sum operator in matplotlib gives an error

So I would like to mention first that I am completely new to basically everything linked to Jupyter Notebook, matplotlib and numpy. That's why I most likely will not be able to express my problem clearly, so I am begging for your patience :) (ah yeah, and my English sucks too, so...)
Anyways, I am trying to create an interactive plot. I want to display the sum of the first n terms of the square wave series, where the value of n can be chosen with a slider. This is what I got so far:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (11,4)
plt.rcParams['figure.dpi'] = 150
from ipywidgets import interact,interactive, fixed, interact_manual
import ipywidgets as widgets
def f(n):
    plt.plot(np.arange(0,10), 1/np.pi * sum(2/(i*np.pi) * (1 - np.cos(i*np.pi)) * np.sin(i*np.arange(0,10)) for i in range(1,n)))
    plt.ylim(-2,2)
interact(f, n=1)
Now, everything works fine until the line where I set my function, so the line with this
plt.plot(np.arange ...)
It gives me the following error:
ValueError: x and y must have same first dimension, but have shapes (10,) and (1,)
I already figured out that this has to do with the usage of the sum() function and using the variable n in it. If I don't put n in the sum, then everything works out nicely and I get my graph.
So, the question basically is what I have to do to make my idea happen.
Thank you for your responses. I know that my post might be very annoying to some of you because of its style or whatever, and I am sorry for that.
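The built-in sum does not collapse the arrays here: adding up a generator of length-10 arrays still gives a length-10 array. The problem is the starting value n=1. Then range(1,n) is empty, sum() of nothing returns the plain integer 0, and y ends up as a single value, which is what the error is telling you: you have 10 x values and only 1 y value. Make sure at least one term is summed, for example by letting the range run up to and including n and keeping n at 1 or above (and write pi, cos and sin with the np. prefix unless you imported them separately):
plt.plot(np.arange(0,10), 1/np.pi * sum(2/(i*np.pi) * (1 - np.cos(i*np.pi)) * np.sin(i*np.arange(0,10)) for i in range(1,n+1)))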
So, for those of you who are interested in the answer (probably there are some): I found a nice and simple solution.
The problem was the fact that the line
interact(f, n= 1)
didn't work on its own. Now that I put it like this,
interact(f, n =widgets.IntSlider(min=2, max=100, step=1, value=2))
so by, most importantly, saying that the slider is supposed to be an IntSlider, everything works just fine!
Thank you for your help anyways! Since I am new on this platform, I don't know how solved questions can be closed, but this one definitely can be closed.
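For reference, a sketch of why n=1 fails and of a version that returns an array for every n. This is not from the posts above: the helper name square_wave_partial_sum and the choice to sum up to and including n are assumptions, while the formula is the one from the question.

```python
import numpy as np

x = np.arange(0, 10)

# With n=1 the generator in the question is empty, and the built-in sum
# of nothing is the plain integer 0, not an array of length 10.
empty_sum = sum(np.sin(i * x) for i in range(1, 1))

def square_wave_partial_sum(x, n):
    """Sum the terms i = 1..n of the series from the question."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)  # array start value keeps the shape even for n = 0
    for i in range(1, n + 1):
        total += 2 / (i * np.pi) * (1 - np.cos(i * np.pi)) * np.sin(i * x)
    return total / np.pi

y = square_wave_partial_sum(x, 5)  # same shape as x, so plt.plot(x, y) works
```

This would also explain why the IntSlider with min=2 helps: with n of at least 2, range(1,n) in the original code is never empty.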

Adding descriptive stats to this plot

In pandas/seaborn:
sns.distplot(combo['resubmits'], kde=False, bins=8)
plt.savefig("g1.png")
Makes a very pretty histogram. I want to include a textual "legend" showing the mean, stdev, n, etc as numbers in a box. You would think this is so common that there's a semi automatic way to do it but I can't find it.
There is a feature request for that.
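However, note that using matplotlib.pyplot.axvline, you can easily mark the mean yourself for now.
from matplotlib import pyplot as plt
plt.axvline(x)
where x=combo['resubmits'].mean(). By default the line spans the full height of the axes. The optional ymin and ymax arguments of axvline are fractions of the axes height between 0 and 1, not data values, so there is no need to pass the maximal bin value.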
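For the boxed "legend" with numbers that the question asks about, a minimal sketch using plain matplotlib could look like this. The helper stats_text and the sample data are made up for illustration (data stands in for combo['resubmits']), and the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

def stats_text(values):
    # Build a small multi-line summary: count, mean, sample standard deviation
    values = np.asarray(values, dtype=float)
    return "\n".join([
        f"n = {values.size}",
        f"mean = {values.mean():.2f}",
        f"std = {values.std(ddof=1):.2f}",
    ])

data = np.array([0, 1, 1, 2, 2, 2, 3, 5])  # hypothetical stand-in data

fig, ax = plt.subplots()
ax.hist(data, bins=8)
ax.axvline(data.mean(), color="k", linestyle="--")  # mark the mean
# Put the summary in a box in the top-right corner (axes coordinates).
ax.text(0.97, 0.95, stats_text(data), transform=ax.transAxes,
        ha="right", va="top", bbox=dict(boxstyle="round", facecolor="white"))
# call plt.show() or fig.savefig("g1.png") to view the result
```

The same ax.text call should work on the axes returned by seaborn, since seaborn draws on ordinary matplotlib axes.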

Better ticks and tick labels with log scale

I am trying to get better looking log-log plots and I almost got what I want except for a minor problem.
The reason my example throws off the standard settings is that the x values are confined within less than one decade and I want to use decimal, not scientific notation.
Allow me to illustrate with an example:
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib as mpl
import numpy as np
x = np.array([0.6,0.83,1.1,1.8,2])
y = np.array([1e-5,1e-4,1e-3,1e-2,0.1])
fig1,ax = plt.subplots()
ax.plot(x,y)
ax.set_xscale('log')
ax.set_yscale('log')
which produces:
There are two problems with the x axis:
The use of scientific notation, which in this case is counterproductive
The horrible "offset" at the lower right corner
After much reading, I added three lines of code:
ax.xaxis.set_major_formatter(mpl.ticker.ScalarFormatter())
ax.xaxis.set_minor_formatter(mpl.ticker.ScalarFormatter())
ax.ticklabel_format(style='plain',axis='x',useOffset=False)
This produces:
My understanding of this is that there are 5 minor ticks and 1 major one. It is much better, but still not perfect:
I would like some additional ticks between 1 and 2
Formatting of label at 1 is wrong. It should be "1.0"
So I inserted the following line before the formatter statement:
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(0.2))
I finally get the ticks I want:
I now have 8 major and 2 minor ticks. Now, this almost looks right except for the fact that the tick labels at 0.6, 0.8 and 2.0 appear bolder than the others. What is the reason for this and how can I correct it?
The reason some of the labels appear bold is that they are drawn twice, once as a major and once as a minor ticklabel. If two texts perfectly overlap, they appear bolder due to antialiasing.
You may decide to only use minor ticklabels and turn the major ones off with a NullLocator.
Since the locations of the ticklabels you wish to have are really specific, there is no automatic locator that would provide them out of the box. For this special case it may be easiest to use a FixedLocator and specify the tick locations you wish to have as a list.
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
x = np.array([0.6,0.83,1.1,1.8,2])
y = np.array([1e-5,1e-4,1e-3,1e-2,0.1])
fig1,ax = plt.subplots(dpi=72, figsize=(6,4))
ax.plot(x,y)
ax.set_xscale('log')
ax.set_yscale('log')
locs = np.append( np.arange(0.1,1,0.1),np.arange(1,10,0.2))
ax.xaxis.set_minor_locator(ticker.FixedLocator(locs))
ax.xaxis.set_major_locator(ticker.NullLocator())
ax.xaxis.set_minor_formatter(ticker.ScalarFormatter())
plt.show()
For a more generic labeling, one could of course subclass a locator, but we would then need to know the logic to use to determine the ticklabels. (As I do not see a well defined logic for the desired ticks from the question, I feel it would be wasted effort to provide such a solution for now.)
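The overlap behind the bold labels can be checked by hand. The sketch below assumes that the default minor ticks on a log axis sit at 2, 3, ..., 9 times each power of ten, so that 0.6, 0.7, 0.8, 0.9 and 2.0 are the minor ticks in view; the variable names are made up for this example.

```python
# Tick positions in tenths, so that integer comparison avoids rounding issues.
major_tenths = set(range(6, 21, 2))  # MultipleLocator(0.2) within 0.6 .. 2.0
minor_tenths = {6, 7, 8, 9, 20}      # assumed log-scale minor ticks in view

# Positions labelled by both formatters are drawn twice and look bold.
overlap = sorted(t / 10 for t in major_tenths & minor_tenths)
print(overlap)  # [0.6, 0.8, 2.0]
```

This matches the question: 8 major ticks, 2 minor ticks that do not coincide with a major one (0.7 and 0.9), and bold labels at exactly 0.6, 0.8 and 2.0.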