xticks are not getting displayed in matplotlib

I have a block of code that plots two columns against one column as a line graph.
The (Year, Month) values (the xticks) are not displayed in the plot produced by this simple block of code. Where am I going wrong?
Also, I get the same result with or without plt.subplots. Could someone explain why?
plt.subplots(1, sharex=True)
df.col1.groupby([df["timestamp"].dt.year, df["timestamp"].dt.month]).mean().plot('line')
df.col2.groupby([df["timestamp"].dt.year, df["timestamp"].dt.month]).mean().plot('line').set_ylim(0, )
plt.title('Title')
plt.ylabel('ylabel')
plt.xlabel('(Year, Month)')
plt.legend(('col1', 'col2'))
plt.figure(figsize=(15,8))
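Two details in the snippet above are worth noting. Calling plt.figure(figsize=(15,8)) after plotting opens a new, empty figure rather than resizing the one already drawn. And when no ax argument is given, pandas draws on the current axes (creating them if needed), which is why the result looks the same with or without plt.subplots. The following is only a sketch of one possible approach, not a confirmed fix for the asker's data: it creates the figure first, plots the grouped means against integer positions, and sets the (Year, Month) tick labels explicitly. The sample DataFrame, the "year"/"month" level names, and the label format are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "timestamp": pd.date_range("2021-01-01", periods=200, freq="D"),
    "col1": rng.random(200),
    "col2": rng.random(200),
})

# Group both columns at once by (year, month); the level names are illustrative
keys = [df["timestamp"].dt.year.rename("year"),
        df["timestamp"].dt.month.rename("month")]
monthly = df.groupby(keys)[["col1", "col2"]].mean()

# Create the figure and axes first, with the desired size
fig, ax = plt.subplots(figsize=(15, 8))

# Plot against integer positions so every group gets exactly one tick
positions = np.arange(len(monthly))
ax.plot(positions, monthly["col1"].to_numpy(), label="col1")
ax.plot(positions, monthly["col2"].to_numpy(), label="col2")

# Build one label per (year, month) group and attach it to its tick
labels = [f"{year}-{month:02d}" for year, month in monthly.index]
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45)

ax.set_ylim(bottom=0)
ax.set_title("Title")
ax.set_ylabel("ylabel")
ax.set_xlabel("(Year, Month)")
ax.legend()
```

Working on an explicit ax object keeps the size, the ticks, and both lines tied to the same figure.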

Related

Seaborn - Change the X-Axis Range (Date field)

How can I change the x-axis so that it begins on January 1, 2022? I don't want to set the other bound. The aim is to create a YTD chart. Thanks! (The x-axis field 'Date_reported' has dtype datetime64[ns].) (PS: does anyone know why my figsize statement isn't working? I'm aiming for a 15 by 8 size, but it doesn't seem to work.)
sns.relplot(kind='line', data=df_Final, x='Date_reported', y='New_cases_Mov_avg',
            hue='Continent', linewidth=1, ci=None)
sns.set_style("white")
sns.set_style('ticks')
plt.xlabel("Date Reported")
plt.ylabel("New Cases (Moving Average)")
plt.figure(figsize=(15,8))
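You could create the figure and axes beforehand with the desired figsize and then plot onto them. This requires lineplot instead of relplot, because relplot is a figure-level function that creates its own figure, whereas lineplot accepts an ax argument. Calling plt.figure(figsize=(15,8)) after plotting only opens a new, empty figure.
ax.set_xlim sets the left boundary, and fig.autofmt_xdate rotates the x labels.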
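import datetime
import matplotlib.pyplot as plt
import seaborn as sns

# Set the style before creating the axes so that it takes effect
sns.set_style("white")
sns.set_style('ticks')
fig, ax = plt.subplots(figsize=(15,8))
# In seaborn 0.12 and later, ci=None is deprecated in favor of errorbar=None
sns.lineplot(data=df_Final, x='Date_reported', y='New_cases_Mov_avg',
             hue='Continent', linewidth=1, ci=None, ax=ax)
ax.set_xlabel("Date Reported")
ax.set_ylabel("New Cases (Moving Average)")
# Passing only the left limit leaves the right limit unchanged
ax.set_xlim(datetime.date(2022, 1, 1))
fig.autofmt_xdate()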
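To illustrate the one-sided limit on its own, here is a minimal matplotlib-only sketch. The daily series is made up for the example and stands in for df_Final; the point is only that passing left= to ax.set_xlim moves the left bound and keeps whatever right bound autoscaling had chosen.

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical daily series spanning late 2021 to early 2022
start = datetime.date(2021, 10, 1)
dates = [start + datetime.timedelta(days=i) for i in range(180)]
values = list(range(180))

fig, ax = plt.subplots(figsize=(15, 8))
ax.plot(dates, values)

# Remember the autoscaled right limit before touching the left one
right_before = ax.get_xlim()[1]

# Only the left bound is given, so the right bound is left as it was
ax.set_xlim(left=datetime.date(2022, 1, 1))
left_after, right_after = ax.get_xlim()

# Rotate and align the date labels
fig.autofmt_xdate()
```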

Remove thousands(k) in plotly line plot

I have a plotly line plot whose x-axis is the year and week encoded as an integer. The data is correct for year-weeks 202101, 202102, 202103, but the plot shows them as 202.1K, 202.1K, 202.1K. I would like the axis to show 202101, 202102, 202103 as well. Below is my code.
if chart_choice == 'line':
    print(dff['week'])
    dff = dff.groupby(['product','week'], as_index=False)[['CYSales']].sum()
    fig = px.line(dff, x='week', y=num_value, color='product')
    return fig
Thanks for the help.
Does tickformat = "000" help, perhaps? See also "Plotly for R: Remove the k that appears on the y axis when the dataset contains numbers larger than 1000".

How to display DateTimeIndex x_tick labels

I have a Pandas series with a DateTimeIndex that I'm plotting as a line plot. I'd like my x_ticks and x_tick labels to only be the DateTimeIndex of the series.
Using the code below I'm displaying the x_ticks I want, but I'm also getting both 'Jan 2019' and 'Feb' added to the x_tick labels, as well as the values 30 and 10 at each end of the x-axis (which are the day values of the first and last DateTimeIndex).
w_c = pd.date_range(start=pd.to_datetime('2018-12-30'), end=pd.to_datetime('2019-02-10'), freq='w')
sales = [111.94, 193.44, 143.46, 157.26, 124.8, 206.26, 127.22]
test = pd.Series(sales, index=w_c)
fig,ax = plt.subplots(figsize=(8,7))
ax = test.plot(fontsize=10, color='darkorange', lw=0.8, ylim=(0,250))
ax.xaxis.grid(True, which="both")
ax.xaxis.set_ticklabels(test.index.strftime('%d/%m/%Y'), rotation=25, minor=True)
display(fig)
Can someone tell me how to remove these additional labels? I expect the x_tick labels to be the DateTimeIndex in my test Series only.
See screen shot here with unwanted labels circled in red
One quick solution is to plot the values against their row positions instead of the index, and then set the tick labels manually:
w_c = pd.date_range(start=pd.to_datetime('2018-12-30'), end=pd.to_datetime('2019-02-10'), freq='W')
sales = [111.94, 193.44, 143.46, 157.26, 124.8, 206.26, 127.22]
test = pd.Series(sales, index=w_c)
fig,ax = plt.subplots(figsize=(8,7))
# plot on ranks of rows instead of index
ax.plot(range(len(test)), test, color='darkorange', lw=0.8)
ax.set_ylim(0,250)
ax.xaxis.grid(True, which="both")
# place one tick per row, then label it with the formatted date
ax.set_xticks(range(len(test)))
ax.set_xticklabels(test.index.strftime('%d/%m/%Y').to_list(), rotation=25)
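If a real date axis is preferred over row positions, another option is to plot with matplotlib directly and pin the major ticks to the index values. This is a sketch under that assumption, not the answerer's method: it bypasses pandas' own date formatting (which adds the extra month and year labels) and uses matplotlib.dates.DateFormatter for the label text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

w_c = pd.date_range(start="2018-12-30", end="2019-02-10", freq="W")
sales = [111.94, 193.44, 143.46, 157.26, 124.8, 206.26, 127.22]
test = pd.Series(sales, index=w_c)

# Use plain datetime objects so matplotlib's date handling is used
x = test.index.to_pydatetime()

fig, ax = plt.subplots(figsize=(8, 7))
ax.plot(x, test.to_numpy(), color="darkorange", lw=0.8)
ax.set_ylim(0, 250)

# One major tick per index value, formatted as day/month/year
ax.set_xticks(mdates.date2num(x))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%d/%m/%Y"))
ax.tick_params(axis="x", labelrotation=25)
ax.xaxis.grid(True)
```

Because the x-axis keeps real dates, unevenly spaced observations would still be drawn at their true positions.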

Pandas scatterplot categorical and timeseries axes

I'm looking to create a chart much like nltk's lexical dispersion plot, but I'm drawing a blank on how to construct it. I was thinking that scatter would be my best geom, using '|' as markers and setting the alpha, but I am running into all sorts of issues setting the parameters. An example is below.
I have the dataframe arranged with a datetime index, freq='D', over a 5-year period, and each column represents the count of a particular word used on that date.
For example:
tst = pd.DataFrame(
    index=pd.date_range(datetime.datetime(2010, 1, 1), end=datetime.datetime(2010, 2, 1), freq='D'),
    data=[[randint(0, 5), randint(0, 1), randint(0, 2)] for x in range(32)])
Currently I'm trying something akin to the following:
plt.figure()
tst.plot(kind='scatter', x=tst.index, y=tst.columns, marker='|', color=sns.xkcd_rgb['dodger blue'], alpha=.05, legend=False)
yticks = plt.yticks()[0]
plt.yticks(yticks, top_words)
The above code yields a KeyError:
KeyError: "['2009-12-31T19:00:00.000000000-0500' '2010-01-01T19:00:00.000000000-0500'\n '2010-01-02T19:00:00.000000000-0500' '2010-01-03T19:00:00.000000000-0500'\n '2010-01-04T19:00:00.000000000-0500' '2010-01-05T19:00:00.000000000-0500'\n '2010-01-06T19:00:00.000000000-0500' '2010-01-07T19:00:00.000000000-0500'\n '2010-01-08T19:00:00.000000000-0500' '2010-01-09T19:00:00.000000000-0500'\n '2010-01-10T19:00:00.000000000-0500' '2010-01-11T19:00:00.000000000-0500'\n '2010-01-12T19:00:00.000000000-0500' '2010-01-13T19:00:00.000000000-0500'\n '2010-01-14T19:00:00.000000000-0500' '2010-01-15T19:00:00.000000000-0500'\n '2010-01-16T19:00:00.000000000-0500' '2010-01-17T19:00:00.000000000-0500'\n '2010-01-18T19:00:00.000000000-0500' '2010-01-19T19:00:00.000000000-0500'\n '2010-01-20T19:00:00.000000000-0500' '2010-01-21T19:00:00.000000000-0500'\n '2010-01-22T19:00:00.000000000-0500' '2010-01-23T19:00:00.000000000-0500'\n '2010-01-24T19:00:00.000000000-0500' '2010-01-25T19:00:00.000000000-0500'\n '2010-01-26T19:00:00.000000000-0500' '2010-01-27T19:00:00.000000000-0500'\n '2010-01-28T19:00:00.000000000-0500' '2010-01-29T19:00:00.000000000-0500'\n '2010-01-30T19:00:00.000000000-0500' '2010-01-31T19:00:00.000000000-0500'] not in index"
Any help would be appreciated.
With help, I was able to produce the following:
plt.plot(tst.index, tst, marker='|', color=sns.xkcd_rgb['dodger blue'], alpha=.25, ms=.5, lw=.5)
plt.ylim([-1, 20])
plt.yticks(range(20), top_words)
Unfortunately, it appears that the upper bars only show up when there is a corresponding bar beneath them to build on. That's not how my data looks.
I am not sure you can do this with the .plot method. However, it is easy to do directly in matplotlib:
plt.plot(tst.index, tst, marker='|', lw=0, ms=10)
plt.ylim([-0.5, 5.5])
If you can install seaborn, try stripplot():
import seaborn as sns
sns.stripplot(data=tst, orient='h', marker='|', edgecolor='blue');
Note that I changed your data to make it look a bit more interesting:
tst = pd.DataFrame(index=pd.date_range(datetime.datetime(2010, 1, 1), end=datetime.datetime(2010, 2, 1), freq='D'),
                   data=(150000 * np.random.rand(32, 3)).astype('int'))
More information on seaborn:
http://stanford.edu/~mwaskom/software/seaborn/tutorial/categorical.html
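For a layout closer to a lexical dispersion plot, one possible sketch is to give each word its own row and draw a '|' marker on every date where its count is non-zero. This is an illustration under assumptions, not the method from the answers above: the word list top_words and the random counts are made up, and the count itself is reduced to "used or not used" on each date.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical word counts per day; column names stand in for top_words
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-01", "2010-02-01", freq="D")
top_words = ["alpha", "beta", "gamma"]
tst = pd.DataFrame(rng.integers(0, 3, size=(len(dates), 3)),
                   index=dates, columns=top_words)

fig, ax = plt.subplots(figsize=(10, 3))
for row, word in enumerate(top_words):
    # Keep only the dates on which this word was used at least once
    used = tst.index[tst[word] > 0]
    # Draw markers only (no connecting line) on this word's own row
    ax.plot(used, np.full(len(used), row), linestyle="none",
            marker="|", markersize=12, color="tab:blue")

# One labelled row per word
ax.set_yticks(range(len(top_words)))
ax.set_yticklabels(top_words)
ax.set_ylim(-0.5, len(top_words) - 0.5)
fig.autofmt_xdate()
```

Each row is independent of the others, so a word shows up on a date regardless of what the other words did that day.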

matplotlib candlestick chart bar output error - seems to be plotting more than one timeframe on single bar

I am attempting to plot a candlestick chart using matplotlib, with hourly candles. However my output looks strange and it seems to be plotting multiple "hours" on one candle.
My code is as follows:
cursor = conx.cursor()
query= 'SELECT ticker,date,time,open,low,high,close FROM eurusd WHERE date > "2014-01-28"'
cursor.execute(query)
for line in cursor:
    # appendLine in correct format for candlesticks - date,open,close,high,low
    date=date2num(line[1])
    open=(line[3])
    high=(line[5])
    low=(line[4])
    close=(line[6])
    appendLine = date,open,close,high,low
    candleAr.append(appendLine)
fig = plt.figure()
ax1 = plt.subplot(1,1,1)
candlestick(ax1, candleAr, width=0.6, colorup='g', colordown='r')
ax1.grid(True)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
And my output looks like the following:
Do I have to manipulate the "date2num" function to account for the fact that my data is hourly and not daily?
I managed to answer my own question. Because date2num was applied to the date alone, it returned the same value for every hour of a given day, which crammed all of that day's hourly bars into a single candle. I had to add the date and time together to get a datetime, and then call date2num on that datetime rather than on the date.
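The collapse described here can be reproduced in isolation. The sketch below assumes the time column arrives as a timedelta offset from midnight (as the later addition to a datetime suggests); the specific date and the 24 hourly offsets are made up for illustration.

```python
import datetime
from matplotlib.dates import date2num

# Hypothetical trading day and hourly offsets standing in for the date and time columns
day = datetime.date(2014, 1, 29)
hours = [datetime.timedelta(hours=h) for h in range(24)]

# date2num on the date alone: every hourly bar gets the same x position
collapsed = [date2num(day) for _ in hours]

# Combine the date with midnight, add the offset, then convert
midnight = datetime.datetime.combine(day, datetime.time(0, 0))
distinct = [date2num(midnight + offset) for offset in hours]

print(len(set(collapsed)), len(set(distinct)))
```

Since date2num counts in days, consecutive hourly bars end up 1/24 apart on the x-axis.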
date=[]
open=[]
low=[]
high=[]
close=[]
candleAr=[]
for line in cursor:
    time1=datetime.time(0,0)
    time=datetime.datetime.combine(line[1],time1)
    time=time+line[2]
    appendLine = date2num(time),line[3],line[6],line[5],line[4]