Helvetica in Matplotlib with siunitx - matplotlib

I would like to create my graphs to match my LaTeX document and use the Helvetica font for both.
In LaTeX I have set:
\usepackage{helvet}
\renewcommand{\familydefault}{\sfdefault}
The code in Python looks like this:
import matplotlib.pyplot as plt
import numpy as np
import locale
plt.rc('text', usetex=True)
# Note: newer Matplotlib releases expect a single string here rather than a list.
plt.rcParams['text.latex.preamble'] = [
    r'\usepackage[detect-all,locale=DE]{siunitx}',  # SI units, decimal comma
    r'\usepackage{helvet}',  # Helvetica as the font
    r'\usepackage{icomma}']
locale.setlocale(locale.LC_NUMERIC, "de_DE.UTF-8")
plt.ticklabel_format(useLocale=True)
x = [1, 2, 3, 4]
y = [5, 6, 7.2, 8.1]
plt.plot(x, y, marker="o", label="setting1")
plt.xticks(np.arange(1.0, 4.2, step=0.5))
plt.xlabel(r"x (\si{\milli\metre})")  # raw strings keep the LaTeX backslashes intact
plt.ylabel(r"y (\si{\pascal})")
plt.legend()
plt.grid(True)
plt.savefig('test.pdf', bbox_inches='tight')
The problem is that the "Pa" in the figure does not match the "Pa" in the LaTeX document.

Adding this to my matplotlibrc file worked for me.
mathtext.fontset : custom
mathtext.it : Helvetica:italic
Also, I needed to have Helvetica-Oblique.ttf in my /usr/local/anaconda3/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf directory. Olga Botvinnik has some good instructions in her blog. Someday, I'll put a similar set of instructions together on mine.
Note that you will have to clear out the cache under ~/.matplotlib for the change to take effect.
Matplotlib says that custom fontsets are not supported, so this might break in a future update of Matplotlib.
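The two matplotlibrc entries above can also be applied from Python at runtime. The following is a minimal sketch, assuming a font named Helvetica with an italic or oblique face is already installed where Matplotlib can find it. It also prints the cache directory reported by matplotlib.get_cachedir(), which is where the font cache to be cleared lives on the current system.

```python
import matplotlib

# Runtime equivalent of the two matplotlibrc entries from the answer.
matplotlib.rcParams['mathtext.fontset'] = 'custom'
matplotlib.rcParams['mathtext.it'] = 'Helvetica:italic'

# Directory that holds Matplotlib's cache files (including the font cache).
cache_dir = matplotlib.get_cachedir()
print(cache_dir)
```

These mathtext settings affect Matplotlib's built-in math renderer; text rendered with usetex=True is typeset by LaTeX itself and takes its fonts from the LaTeX preamble.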

Related

How to prevent 1e9 from being shown to exponential form in Python matplotlib figure

I've seen this question: How to prevent numbers being changed to exponential form in Python matplotlib figure
However, I have some custom annotations to add, and I'd like matplotlib simply not to show the 1e9 marker. Example code below:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
sns.set() # not necessary, but just to reproduce the photo below
f, a = plt.subplots() # I use the oop interface
pd.DataFrame({'y': [1e9, 2e9, 3e9], 'x': [1, 2, 3]}).set_index('x').plot(ax=a)
Yields:
How do I just not show the 1e9? I have a custom annotation there which says 'billions' and it overlaps.
My thanks to ImportanceOfBeingErnest above, as
a.yaxis.offsetText.set_visible(False)
solves the problem.
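For reference, here is a small self-contained sketch of that fix without pandas or seaborn. The first axes hides the offset text as in the answer. The second axes is an assumption that does not come from the thread: it formats the ticks in billions with a FuncFormatter, so no offset text is produced at all. The variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

x = [1, 2, 3]
y = [1e9, 2e9, 3e9]
fig, (ax_hidden, ax_scaled) = plt.subplots(1, 2)

# Option 1 (from the answer): keep the default ticks, hide the "1e9" text.
ax_hidden.plot(x, y)
ax_hidden.yaxis.offsetText.set_visible(False)

# Option 2 (assumption, not from the thread): label the ticks in billions.
ax_scaled.plot(x, y)
ax_scaled.yaxis.set_major_formatter(
    FuncFormatter(lambda value, pos: f"{value / 1e9:g}"))
ax_scaled.set_ylabel("y (billions)")

fig.canvas.draw()  # force the tick labels to be computed
```

Option 1 leaves the tick labels as 1.0, 1.5, 2.0 and so on, so the custom 'billions' annotation is still needed to give them meaning.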

The decimal point appears as a triangle in inkscape

I am using Inkscape to merge two figures. One of the figures, created using matplotlib, has a mathematical text written over it. To the best of my understanding, the text is rendered using latex by matplotlib. When I import this figure in Inkscape, the decimal points in the text are replaced automatically by small triangles. What could be done? Please have a look at the following image. Thanks in advance.
Edit: Here is an example to produce a similar plot which when imported in inkscape, renders a wrong font.
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [1, 4, 9])
plt.annotate(r'$\alpha = 2.4$', xy = (0.1, 0.9), xycoords = 'axes fraction')
plt.legend()
plt.savefig('test.pdf', bbox_inches = 'tight')

Usetex in Matplotlib

I am trying to produce plots whose axis labels (both formulae and text) are set in the standard LaTeX roman font, but no plot appears, even though the code runs without warnings. Here is a simple scatter plot with TeX code in the axis labels, written according to my best understanding of the documentation:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rc
x = np.linspace(0,1,100)
y = np.random.rand(100,1)
plt.rc('text', usetex=True)
plt.rc('font', family='roman')
plt.scatter(x, y, c='b', s=10)
plt.xlabel(r'$\lambda$ ($\AA$)',size='12')
plt.ylabel(r'$F_\alpha (W/m^2)$ ',size='12')
plt.title(r'A title in \LaTeX typography')
plt.show()
It keeps yielding a message like <matplotlib.figure.Figure at 0x1f75d4750>. I have met this message before, but I have not managed to fix it this time. Saving the plot (as PNG or PDF) does not solve the issue either, and if the problem is related to TeX, I have not found any resource that helps. I use macOS Sierra.
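One thing that can be ruled out quickly is a missing TeX toolchain: with usetex=True, Matplotlib calls an external LaTeX installation, and for raster output it also needs dvipng. The sketch below only checks whether those executables are on the PATH. The helper name usetex_prerequisites is made up for this illustration, and a positive result does not by itself explain why the figure fails to appear.

```python
import shutil


def usetex_prerequisites():
    """Map each external tool to its full path, or None if it is not found."""
    tools = ("latex", "dvipng")
    return {tool: shutil.which(tool) for tool in tools}


found = usetex_prerequisites()
missing = [tool for tool, path in found.items() if path is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("latex and dvipng are both on PATH")
```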

Graphing matplotlib with Python code in a R Markdown document

Is it possible to use Python matplotlib code to draw a graph in RStudio?
For example, the Python matplotlib code below:
import numpy as np
import matplotlib.pyplot as plt
n = 256
X = np.linspace(-np.pi,np.pi,n,endpoint=True)
Y = np.sin(2*X)
plt.plot (X, Y+1, color='blue', alpha=1.00)
plt.plot (X, Y-1, color='blue', alpha=1.00)
plt.show()
Output graph will be:
I then need to write an R Markdown document that includes this code and generates the graph automatically when the document is knitted.
First run install.packages('devtools') to get the install_github function.
Then run install_github("rstudio/reticulate") to install the development version of reticulate.
In the R Markdown document, use the code below to enable the Python engine.
```{r setup, include=FALSE}
library(knitr)
library(reticulate)
knitr::knit_engines$set(python = reticulate::eng_python)
```
Try it: you will get what you want without having to save any image.
One possible solution is to save the plot as an image and then load the file into the Markdown document.
### Call python code sample
```{r,engine='python'}
import numpy as np
import matplotlib.pyplot as plt
n = 256
X = np.linspace(-np.pi,np.pi,n,endpoint=True)
Y = np.sin(2*X)
fig, ax = plt.subplots( nrows=1, ncols=1 )
ax.plot (X, Y+1, color='blue', alpha=1.00)
ax.plot (X, Y-1, color='blue', alpha=1.00)
#plt.show()
fig.savefig('foo.png', bbox_inches='tight')
print("finished")
```
Output image:
![output](foo.png)
#### The End
Output:
You can do that with reticulate, but when following a tutorial you will often run into technicalities that weren't sufficiently explained.
My answer is a little late, but I hope it's a thorough walkthrough of doing it the right way: not rendering the plot and then loading it as a PNG, but having the Python code executed more "natively".
Step 1: Configure Python from RStudio
Insert an R chunk and run the following code to configure the path to the version of Python you want to use. The default Python shipped with most operating systems is often the outdated Python 2 and is not where you install your packages. That is why this step matters: it makes sure RStudio uses the Python instance where your matplotlib library (and the other libraries you will be using for the project) can be found:
library(reticulate)
# change the following to point to the desired path on your system
use_python('/Users/Samuel/anaconda3/bin/python')
# prints the python configuration
py_config()
You should expect to see that your session is configured with the settings you specified:
python: /Users/Samuel/anaconda3/bin/python
libpython: /Users/Samuel/anaconda3/lib/libpython3.6m.dylib
pythonhome: /Users/Samuel/anaconda3:/Users/Samuel/anaconda3
version: 3.6.3 |Anaconda custom (64-bit)| (default, Oct 6 2017, 12:04:38) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
numpy: /Users/Samuel/anaconda3/lib/python3.6/site-packages/numpy
numpy_version: 1.15.2
python versions found:
/Users/Samuel/anaconda3/bin/python
/usr/bin/python
/usr/local/bin/python
/usr/local/bin/python3
/Users/Samuel/.virtualenvs/r-tensorflow/bin/python
Step 2: The familiar plt.show
Add a Python chunk (not R!) in your R Markdown document (see attached screenshot) and you can now write native Python code. This means that the familiar plt.show() and plt.imshow() will work without any extra work. It will be rendered and can be compiled into HTML / PDF using knitr.
This will work:
plt.imshow(my_image, cmap='gray')
Or a more elaborate example:
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
DATADIR = '/Users/Samuel/Datasets/PetImages'
CATEGORIES = ['Dog', 'Cat']
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)  # path to cat or dog dir
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        plt.imshow(img_array, cmap='gray')
        plt.show()
        break
    break
Output:
Step 3: Knit to HTML / PDF / Word etc
Proceed to knit as usual. The end product is a nicely formatted document produced from Python code using R Markdown. RStudio has come a long way, and I'm surprised that its level of support for Python code isn't better known, so I hope anyone who stumbles upon this answer finds it informative and learns something new.
I have been working with reticulate and R Markdown and you should specify your virtual environment. For example my R Markdown starts as follows:
{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, cache.lazy = FALSE)
library(reticulate)
use_condaenv('pytorch') ## yes, you can run pytorch and tensor flow too
Then you can work in either language. So, for plotting with matplotlib, I have found that you need the PyQt5 module to make it all run smoothly. The following makes a nice plot inside R Markdown - it's a separate chunk.
{python plot}
import PyQt5
import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
data = pd.read_csv('Subscriptions.csv',index_col='Date', parse_dates=True)
# make the nice plot
# set the figure size
fig = plt.figure(figsize = (15,10))
# the series
ax1 = fig.add_subplot(211)
ax1.plot(data.index.values, data.Opens, color = 'green', label = 'Opens')
# plot the legend for the first plot
ax1.legend(loc = 'upper right', fontsize = 14)
plt.ylabel('Opens', fontsize=16)
# Hide the top x axis
ax1.axes.get_xaxis().set_visible(False)
####### NOW PLOT THE OTHER SERIES ON A SINGLE PLOT
# plot 212 is the MI series
# plot series
ax2 = fig.add_subplot(212)
ax2.plot(data.index.values, data.Joiners, color = 'orange', label = 'Joiners')
# plot the legend for the second plot
ax2.legend(loc = 'upper right', fontsize = 14)
# set the fontsize for the bottom plot
plt.ylabel('Joiners', fontsize=16)
plt.tight_layout()
plt.show()
You get the following from this:
I don't have the reputation points to add a comment, but Bryan's answer above was the only one to work for me. Adding plt.tight_layout() made the difference. I added that line to the following simple code and the plot displayed.
{python evaluate}
plt.scatter(X_train, y_train, color = 'gray')
plt.plot(X_train, regresssion_model_sklearn.predict(X_train), color = 'red')
plt.ylabel('Salary')
plt.xlabel('Number of Years of Experience')
plt.title('Salary vs. Years of Experience')
plt.tight_layout()
plt.show()

How to pick a new color for each plotted line within a figure in matplotlib?

I'd like to NOT specify a color for each plotted line, and have each line get a distinct color. But if I run:
from matplotlib import pyplot as plt
for i in range(20):
    plt.plot([0, 1], [i, i])
plt.show()
then I get this output:
If you look at the image above, you can see that matplotlib attempts to pick colors for each line that are different, but eventually it re-uses colors - the top ten lines use the same colors as the bottom ten. I just want to stop it from repeating already used colors AND/OR feed it a list of colors to use.
I usually use the second one of these:
import matplotlib.pyplot as plt
from matplotlib.pyplot import cm
import numpy as np
# variable n below should be the number of curves to plot
# version 1:
color = cm.rainbow(np.linspace(0, 1, n))
for i, c in zip(range(n), color):
    plt.plot(x, y, c=c)
# or version 2:
color = iter(cm.rainbow(np.linspace(0, 1, n)))
for i in range(n):
    c = next(color)
    plt.plot(x, y, c=c)
Example of 2:
matplotlib 1.5+
You can use axes.set_prop_cycle (example).
matplotlib 1.0-1.4
You can use axes.set_color_cycle (example).
matplotlib 0.x
You can use Axes.set_default_color_cycle.
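As a rough illustration of Axes.set_prop_cycle on Matplotlib 1.5 or later, the sketch below sets the cycle on a single axes rather than globally. Combining five colors with four line styles is an assumption made for this example (it is not from the answer); it gives 20 distinct combinations, enough for the 20 lines in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from cycler import cycler

colors = ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple"]
linestyles = ["-", "--", ":", "-."]

fig, ax = plt.subplots()
# Multiplying two cyclers yields every (linestyle, color) pair: 4 * 5 = 20.
ax.set_prop_cycle(cycler(linestyle=linestyles) * cycler(color=colors))

lines = [ax.plot([0, 1], [i, i])[0] for i in range(20)]
styles = {(line.get_color(), line.get_linestyle()) for line in lines}
print(len(styles))
```

The cycle has to be set before the lines are plotted, since each plot call takes the next entry from the cycle at the time it runs.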
You can use a predefined "qualitative colormap" like this:
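import matplotlib.pyplot as plt
name = "Accent"
# plt.get_cmap is used because matplotlib.cm.get_cmap was removed in newer releases
cmap = plt.get_cmap(name)  # type: matplotlib.colors.ListedColormap
colors = cmap.colors  # type: list
axes = plt.gca()  # any Axes instance
axes.set_prop_cycle(color=colors)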
Tested on matplotlib 3.0.3. See https://github.com/matplotlib/matplotlib/issues/10840 for discussion on why you can't call axes.set_prop_cycle(color=cmap).
A list of predefined qualitative colormaps is available at https://matplotlib.org/gallery/color/colormap_reference.html
prop_cycle
color_cycle was deprecated in 1.5 in favor of this generalization: http://matplotlib.org/users/whats_new.html#added-axes-prop-cycle-key-to-rcparams
# cycler is a separate package extracted from matplotlib.
from cycler import cycler
import matplotlib.pyplot as plt
plt.rc('axes', prop_cycle=(cycler('color', ['r', 'g', 'b'])))
plt.plot([1, 2])
plt.plot([2, 3])
plt.plot([3, 4])
plt.plot([4, 5])
plt.plot([5, 6])
plt.show()
Also shown in the (now badly named) example: http://matplotlib.org/1.5.1/examples/color/color_cycle_demo.html mentioned at: https://stackoverflow.com/a/4971431/895245
Tested in matplotlib 1.5.1.
I don't know if you can automatically change the color, but you could exploit your loop to generate different colors:
for i in range(20):
    ax1.plot(x, y, color=(0, i / 20.0, 0, 1))
In this case, colors will vary from black to 100% green, but you can tune it if you want.
See the matplotlib plot() docs and look for the color keyword argument.
If you want to feed a list of colors, just make sure that you have a list big enough and then use the index of the loop to select the color
colors = ['r', 'b', ...., 'w']
for i in range(20):
    ax1.plot(x, y, color=colors[i])
You can also change the default color cycle in your matplotlibrc file.
If you don't know where that file is, do the following in python:
import matplotlib
matplotlib.matplotlib_fname()
This will show you the path to your currently used matplotlibrc file.
In that file you will find, among many other settings, the one for axes.color_cycle (axes.prop_cycle in Matplotlib 1.5 and later). Just put in your desired sequence of colors and every plot you make will use it.
Note that you can also use all valid html color names in matplotlib.
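To see what you would be overriding, the default cycle can be inspected from Python. This is a small sketch, assuming Matplotlib 2.0 or later, where the built-in default cycle has ten entries (which matches the repetition after ten lines described in the question). It also checks that an HTML color name is accepted.

```python
import matplotlib
import matplotlib.colors as mcolors

# Path of the matplotlibrc file currently in use.
rc_path = matplotlib.matplotlib_fname()

# Colors of the built-in default cycle (ignores any user matplotlibrc).
default_colors = matplotlib.rcParamsDefault["axes.prop_cycle"].by_key()["color"]

print(rc_path)
print(len(default_colors), default_colors)

# HTML/CSS color names are accepted wherever a color is expected.
print(mcolors.is_color_like("rebeccapurple"))
```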
As Ciro's answer notes, you can use prop_cycle to set a list of colors for matplotlib to cycle through. But how many colors? What if you want to use the same color cycle for lots of plots, with different numbers of lines?
One tactic would be to use a formula like the one from https://gamedev.stackexchange.com/a/46469/22397, to generate an infinite sequence of colors where each color tries to be significantly different from all those that preceded it.
Unfortunately, prop_cycle won't accept infinite sequences - it will hang forever if you pass it one. But we can take, say, the first 1000 colors generated from such a sequence, and set it as the color cycle. That way, for plots with any sane number of lines, you should get distinguishable colors.
Example:
from matplotlib import pyplot as plt
from matplotlib.colors import hsv_to_rgb
from cycler import cycler
# 1000 distinct colors:
colors = [hsv_to_rgb([(i * 0.618033988749895) % 1.0, 1, 1])
          for i in range(1000)]
plt.rc('axes', prop_cycle=(cycler('color', colors)))
for i in range(20):
    plt.plot([1, 0], [i, i])
plt.show()
Output:
Now, all the colors are different - although I admit that I struggle to distinguish a few of them!