Matplotlib x ticks label not regularly spaced - matplotlib

I am plotting some data using seaborn as below:
ax1 = plt.subplot(2, 1, 1)
sns.lineplot(x='time', y='gap_ratio', data=df_gap, ax=ax1)
I am wondering why the x-tick labels are not regularly spaced. As a result, I end up with overlapping x-tick labels. Is there an easy way to fix this?
FYI, the x data is regularly spaced (one data point every minute).
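The snippet alone does not show what causes the irregular spacing, so the following is only a sketch of one common approach, not a confirmed fix. It assumes the time column can be converted to real datetimes, and then places ticks at a fixed interval with an explicit locator and formatter from matplotlib.dates. The toy df_gap, the 30-minute interval and the 45-degree rotation are all illustrative assumptions; ax.plot stands in for sns.lineplot so the example has no seaborn dependency.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical stand-in for df_gap: one data point per minute for three hours.
times = pd.date_range("2021-01-01 09:00", periods=180, freq="min")
df_gap = pd.DataFrame({"time": times, "gap_ratio": np.random.rand(len(times))})

# Make sure the column holds real datetimes rather than strings.
df_gap["time"] = pd.to_datetime(df_gap["time"])

ax1 = plt.subplot(2, 1, 1)
# sns.lineplot(x='time', y='gap_ratio', data=df_gap, ax=ax1) would draw on the same Axes.
ax1.plot(df_gap["time"], df_gap["gap_ratio"])

# Place one major tick on every full and half hour, labelled as HH:MM.
ax1.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 30]))
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
# Rotate the labels so neighbouring labels are less likely to overlap.
ax1.tick_params(axis="x", labelrotation=45)
```

Call plt.show() afterwards to display the figure. If the labels still overlap, a larger interval (for example byminute=[0]) is the simplest knob to turn.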

Related

Is it possible to break x and y axis at the same time on lineplot?

I am working on drawing lineplots with matplotlib.
I checked several posts and understand how a broken axis works in matplotlib (Break // in x axis of matplotlib).
However, I was wondering whether it is possible to break the x and y axes at the same time.
My current plot looks like the one below.
As the graph shows, the x-axis range [2000, 5000] wastes a lot of space.
Because I have more data to draw after 7000, I want to save space.
Is it possible to split the x-axis together with the y-axis?
Or is there another convenient way to hide a specific region of a line plot?
If there is another library enabling this, I am willing to drop matplotlib and adopt others...
Maybe splitting the axis isn't your best choice. I would perhaps try inserting another smaller figure into the open space of your large figure using add_axes(). Here is a small example.
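import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0, 5000, 1000) # create 1000 time stamps
data = 5*t*np.exp(-t/100) # and some fake data
fig, ax = plt.subplots()
ax.plot(t, data)
# size the inset relative to the main axes (60% of its width and height)
box = ax.get_position()
width = box.width*0.6
height = box.height*0.6
# lower-left corner of the inset, in figure coordinates
x = 0.35
y = 0.35
subax = fig.add_axes([x,y,width,height])
subax.plot(t, data)
# zoom the inset in on the first tenth of the time range
subax.axis([0, np.max(t)/10, 0, np.max(data)*1.1])
plt.show()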

Adding grouping ticks to a bar chart

I have a chart created from a pandas DataFrame that looks like this:
I've formatted the ticks with:
ax = df.plot(kind='bar')
ax.set_xticklabels(df.index.strftime('%I %p'))
However, I'd like to add a second set of larger ticks, to achieve this kind of effect:
I've tried many variations of using set_major_locator and set_major_formatter (as well as combining major and minor formatters), but it seems I'm not approaching it correctly, and I wasn't able to find useful examples of similar combined ticks online either.
Does someone have a suggestion on how to achieve something similar to the bottom image?
The dataframe has a datetime index and contains binned data, from something like df.resample(bin_size, label='right', closed='right').sum().
One idea is to set major ticks to display the date (%-d-%b) at noon each day with some padding (e.g., pad=40). This will leave a minor tick gap at noon, so for consistency you could set minor ticks only on the odd hours and give them rotation=90.
Note that this uses matplotlib's bar() since pandas' plot.bar() doesn't play well with the date formatting.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# toy data
dates = pd.date_range('2021-08-07', '2021-08-10', freq='1H')
df = pd.DataFrame({'date': dates, 'value': np.random.randint(10, size=len(dates))}).set_index('date')
# pyplot bar instead of pandas bar
fig, ax = plt.subplots(figsize=(14, 4))
ax.bar(df.index, df.value, width=0.02)
# put day labels at noon
# ('%-d' and '%-I' are platform-specific strftime flags; Windows uses '%#d' and '%#I')
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=[12]))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%-d-%b'))
ax.xaxis.set_tick_params(which='major', pad=40)
# put hour labels on odd hours
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(1, 25, 2)))
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%-I %p'))
ax.xaxis.set_tick_params(which='minor', pad=0, rotation=90)
# add day separators at every midnight tick
ticks = df[df.index.strftime('%H:%M:%S') == '00:00:00'].index
arrowprops = dict(width=2, headwidth=1, headlength=1, shrink=0.02)
for tick in ticks:
    xy = (mdates.date2num(tick), 0) # convert date index to float coordinate
    xytext = (0, -65) # draw downward 65 points
    ax.annotate('', xy=xy, xytext=xytext, textcoords='offset points',
                annotation_clip=False, arrowprops=arrowprops)

Dynamically scaling axes during a matplotlib ArtistAnimation

It appears to be impossible to change the y and x axis view limits during an ArtistAnimation, and have the frames replayed with different axis limits.
The limits seem to be fixed to those set last before the animation function is called.
In the code below, I have two plotting stages. The input data in the second stage is a much smaller subset of the data in the first stage, which has a much wider range.
So, I need to "zoom in" when displaying the second plot (otherwise the plot would be very tiny if the axis limits remain the same).
The two plots are overlaid on two different images (that are of the same size, but different content).
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import matplotlib.image as mpimg
import random
# sample 640x480 image. Actual frame loops through
# many different images, but of same size
image = mpimg.imread('image_demo.png')
fig = plt.figure()
plt.axis('off')
ax = fig.gca()
artists = []
def plot_stage_1():
    # both x, y axis limits automatically set to 0 - 100
    # when we call ax.imshow with this extent
    im_extent = (0, 100, 0, 100) # (xmin, xmax, ymin, ymax)
    im = ax.imshow(image, extent=im_extent, animated=True)
    # y axis is a list of 100 random numbers between 0 and 100
    p, = ax.plot(range(100), random.choices(range(100), k=100))
    # Text label at 90, 90
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 1")
    artists.append([im, t, p])
def plot_stage_2():
    # axes remain at the 0 - 100 limit from the previous
    # imshow extent so both the background image and plot are tiny
    im_extent = (0, 10, 0, 10)
    # so let's update the x, y axis limits
    ax.set_xlim(im_extent[0], im_extent[1])
    ax.set_ylim(im_extent[2], im_extent[3])
    im = ax.imshow(image, extent=im_extent, animated=True)
    p, = ax.plot(range(10), random.choices(range(10), k=10))
    # Text label at 9, 9
    t = ax.text(im_extent[1]*0.9, im_extent[3]*0.9, "Frame 2")
    artists.append([im, t, p])
plot_stage_1()
plot_stage_2()
# clear white space around plot
fig.subplots_adjust(left=0, bottom=0, right=1, top=1, wspace=None, hspace=None)
# set figure size
fig.set_size_inches(6.67, 5.0, True)
anim = animation.ArtistAnimation(fig, artists, interval=2000, repeat=False, blit=False)
plt.show()
If I call just one of the two functions above, the plot is fine. However, if I call both, the axis limits in both frames will be 0 - 10, 0 - 10. So frame 1 will be super zoomed in.
Also calling ax.set_xlim(0, 100), ax.set_ylim(0, 100) in plot_stage_1() doesn't help. The last set_xlim(), set_ylim() calls fix the axis limits throughout all frames in the animation.
I could keep the axis bounds fixed and apply a scaling function to the input data.
However, I'm curious to know whether I can simply change the axis limits -- my code will be better this way, because the actual code is complicated with multiple stages, zooming plots across many different ranges.
Or perhaps I have to rejig my code to use FuncAnimation, instead of ArtistAnimation?
FuncAnimation appears to result in the expected behavior. So I'm changing my code to use that instead of ArtistAnimation.
Still curious to know though, whether this can at all be done using ArtistAnimation.
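For reference, a minimal sketch of how the FuncAnimation variant might look. This is not the actual code from the project; the stages list, the update function and the random stand-in image are assumptions made for illustration. As far as I understand, the view limits belong to the Axes rather than to any single artist, which is why a per-frame callback can change them while a list of artists cannot.

```python
import random

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Hypothetical stand-in for the real background images (480 rows x 640 columns).
image = np.random.rand(480, 640)

# One entry per stage: (upper axis limit, text label), mirroring the two stages above.
stages = [(100, "Frame 1"), (10, "Frame 2")]

fig, ax = plt.subplots()
ax.axis("off")


def update(frame_index):
    """Redraw one stage from scratch, including its own axis limits."""
    upper, label = stages[frame_index]
    ax.clear()      # drop the artists left over from the previous stage
    ax.axis("off")  # re-apply after clear() so the axis decorations stay hidden
    extent = (0, upper, 0, upper)  # (xmin, xmax, ymin, ymax)
    ax.imshow(image, extent=extent)
    ax.plot(range(upper), random.choices(range(upper), k=upper))
    ax.text(extent[1] * 0.9, extent[3] * 0.9, label)
    # The limits are set inside the callback, so each frame gets its own view.
    ax.set_xlim(extent[0], extent[1])
    ax.set_ylim(extent[2], extent[3])


anim = animation.FuncAnimation(fig, update, frames=len(stages),
                               interval=2000, repeat=False, blit=False)
```

Call plt.show() (or anim.save with an installed writer) to play the frames. Clearing and redrawing the whole Axes on every frame is simple but not fast; for many stages it may be worth updating existing artists instead.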

Pandas: How can I plot with separate y-axis, but still control the order?

I am trying to plot multiple time series in one plot. The scales are different, so they need separate y-axis, and I want a specific time series to have its y-axis on the right. I also want that time series to be behind the others. But I find that when I use secondary_y=True, this time series is always brought to the front, even if the code to plot it comes before the others. How can I control the order of the plots when using secondary_y=True (or is there an alternative)?
Furthermore, when I use secondary_y=True, the y-axis on the left no longer adapts to appropriate values. Is there a fix for this?
# imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# dummy data
lenx = 1000
x = range(lenx)
np.random.seed(4)
y1 = np.random.randn(lenx)
y1 = pd.Series(y1, index=x)
y2 = 50.0 + y1.cumsum()
# plot time series.
# use ax to make Pandas plot them in the same plot.
ax = y2.plot.area(secondary_y=True)
y1.plot(ax=ax)
So what I would like is to have the blue area plot behind the green time series, and to have the left y-axis take appropriate values for the green time series:
https://i.stack.imgur.com/6QzPV.png
Perhaps something like the following using matplotlib.axes.Axes.twinx instead of using secondary_y, and then following the approach in this answer to move the twinned axis to the background:
# plot time series.
fig, ax = plt.subplots()
y1.plot(ax=ax, color='green')
ax.set_zorder(10)
ax.patch.set_visible(False)
ax1 = ax.twinx()
y2.plot.area(ax=ax1, color='blue')

Change colour of curve according to its y-value in matplotlib [duplicate]

This question already has answers here:
Having line color vary with data index for line graph in matplotlib?
Set line colors according to colormap
I'm trying to replicate the style of the attached figure using matplotlib's facilities.
Basically, I want to change the colour of the curve according to its y-value using matplotlib.
The plot you've shown doesn't have the color set by the vertical axis of the plot (which is what I would consider the y-value). Instead, it just has 8 different plots overlain, each with a different color, without stating what the color means.
Here's an example of something that looks like your plot:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
# some fake data:
x = np.linspace(0, 2*np.pi, 1000)
fs = np.arange(1, 5.)
ys = np.sin(x*fs[:, None])
for y, f in zip(ys, fs):
    plt.plot(x, y, lw=3, c=cm.hot(f/5))
If you actually want the color of one line to change with respect to its value, you have to kind of hack it, because any given Line2D object can only have one color, as far as I know. One way to do this is to make a scatter plot, where each dot can have any color.
x = np.linspace(0, 2*np.pi, 1000)
y = np.sin(2*x)
plt.scatter(x,y, c=cm.hot(np.abs(y)), edgecolor='none')
Notes:
The color vector should range between 0 and 1, so if y.max() > 1, normalize by it (c=cm.hot(y/y.max())) and make sure all values are non-negative.
I used edgecolor='none' because by default the scatter markers have a black outline, which makes the result look less like a uniform line.
If your data is spaced too far, you'll have to interpolate the data if you don't want gaps between markers.
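An alternative to the scatter approach, which is not part of the answer above, is matplotlib's LineCollection: the curve is split into short segments and each segment gets its own color, so there are no gaps between markers to worry about. The sketch below reuses the same fake data; coloring by |y| at each segment's midpoint and the "hot" colormap are choices made here to match the scatter example, not requirements.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

x = np.linspace(0, 2 * np.pi, 1000)
y = np.sin(2 * x)

# Build one short segment per consecutive pair of points: shape (n - 1, 2, 2).
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Color each segment by |y| at its midpoint; Normalize maps the values into [0, 1].
values = np.abs(0.5 * (y[:-1] + y[1:]))
norm = plt.Normalize(values.min(), values.max())
lc = LineCollection(segments, cmap="hot", norm=norm)
lc.set_array(values)
lc.set_linewidth(3)

fig, ax = plt.subplots()
ax.add_collection(lc)
# Set the limits explicitly so the whole curve is visible.
ax.set_xlim(x.min(), x.max())
ax.set_ylim(y.min() * 1.1, y.max() * 1.1)
```

Call plt.show() to display the result; fig.colorbar(lc, ax=ax) adds a color scale if the meaning of the colors should be visible.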