matplotlib - plotting histogram with unique bins

I am trying to plot a histogram, but the x ticks do not come out right.
The plot is intended to show the frequency count of each value (1 to 13) across 10000 rows.
d1 = []
for i in np.arange(1, 10000):
    tmp = np.random.randint(1, 13)
    d1.append(tmp)
d2 = pd.DataFrame(d1)
d2.hist(width = 0.5)
plt.xticks(np.arange(1, 14, 1))
I want to plot the frequency count of each individual value, not of ranges.

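You need to pass explicit bin edges to the histogram, so that each integer value gets its own bin centred on it. Note that np.random.randint(1, 13) excludes the upper bound, so the values run from 1 to 12.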
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
d1 = np.random.randint(1, 13, size=1000)
d2 = pd.DataFrame(d1)
bins = np.arange(0, 13) + 0.5  # edges at 0.5, 1.5, ..., 12.5: one bin per integer
d2.hist(bins=bins, ec="k")     # ec="k" draws black bar edges
plt.xticks(np.arange(1, 13))
plt.show()
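To check that half-integer bin edges really give one count per value, the same edges can be passed to np.histogram and compared with a plain value count. This is only a sketch: the seeded generator and the variable names are assumptions made for the example, not part of the answer above.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded generator (assumption, for repeatability)
data = rng.integers(1, 13, size=1000)   # integers 1..12; the upper bound is exclusive

edges = np.arange(0, 13) + 0.5          # 0.5, 1.5, ..., 12.5
counts, _ = np.histogram(data, bins=edges)

# One bin per integer value, so the histogram should equal a plain value count.
value_counts = np.bincount(data, minlength=13)[1:]

print(dict(zip(range(1, 13), counts.tolist())))
print("matches value counts:", bool((counts == value_counts).all()))
```

Because every integer falls strictly inside exactly one bin, no value lands on a bin edge and the bars line up with the integer ticks.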

Related

Prevent 'darkgrid' ax2 gridlines in twinx() plot from dissecting ax1 curve

I am having a problem preventing the 'darkgrid' grid lines from dissecting a line associated with the X1 axis in a twinx() plot. I can "fix" the problem by not using 'darkgrid' or by passing an empty list to the X2 ticks (and losing the axis labels too, see the last line of the code), but I do want both 'darkgrid' and the X2 axis labels.
#Some imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
sns.set_style("darkgrid")
%matplotlib inline
#Data
d = np.arange(0, 1000, 100)
x1 = d/30 #redmodel
x2 = np.sqrt(d) #bluemodel
#Figure
fig = plt.figure(figsize = (8, 12))
sy1 = 'r-' #redmodel
sy2 = 'b-' #bluemodel
ax1 = fig.add_subplot(111)
ax2 = ax1.twiny()
_ = ax1.plot(x1, d, sy1)
_ = ax1.set_ylabel('D')
_ = ax1.set_ylim(0, 1000)
_ = ax1.set_xlabel('X1')
_ = ax1.set_xlim(0, 31)
_ = ax2.plot(x2, d, sy2)
_ = ax2.set_xlabel('X2')
_ = ax2.set_xlim(0, 31)
#_ = ax2.set_xticks([]) #Empty list passed to omit the xticks; otherwise ax2 gridlines dissect the red line
As I was looking for solutions to this problem I stumbled upon the axes_grid1 toolkit collection of helper classes which has the twin() option (in addition to twinx and twiny) which may be the solution to my problem. But if you know of a simpler one please help me out.
As I understand your question, the problem is that the grid lines of the X2 axis are drawn across the red line. I think you can simply set the z-order of the X2 grid lines explicitly:
_ = ax2.grid(which='major', axis='x', zorder=1.0)
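If the z-order setting does not help (twin axes are normally drawn after, and therefore on top of, the original axes), another option is to switch off only the X2 grid while keeping its ticks and labels. The sketch below is an assumption-laden illustration rather than a confirmed fix: it enables the grid on ax1 by hand as a stand-in for seaborn's 'darkgrid', and it uses the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

d = np.arange(0, 1000, 100)
x1 = d / 30        # red model
x2 = np.sqrt(d)    # blue model

fig, ax1 = plt.subplots(figsize=(8, 12))
ax2 = ax1.twiny()

ax1.plot(x1, d, "r-")
ax1.set_xlabel("X1")
ax1.set_ylabel("D")
ax1.grid(True)     # stand-in for the grid that 'darkgrid' would enable

ax2.plot(x2, d, "b-")
ax2.set_xlabel("X2")
ax2.grid(False)    # drop only the X2 grid lines; ticks and labels stay

fig.canvas.draw()  # render once so ticks and grid lines are materialised
```

The X1 grid remains in the background of the red line, and the X2 axis still shows its tick labels along the top.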

matplotlib subplot of 1:2:1

I'm trying to plot an A4 PDF with the following layout:
1 chart spanning 2 columns
2 charts, each spanning 1 column
1 chart spanning 2 columns
I have the following code:
fig = plt.figure(figsize=(8.27,11.69))
ax = fig.add_subplot(311)
ax = fig.add_subplot(323)
ax = fig.add_subplot(324)
ax = fig.add_subplot(315)
But I'm getting the following error:
ValueError: num must be 1 <= num <= 3, not 5
What am I missing?
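The three-digit argument is read as rows, columns, index, so in a 3x1 grid the last valid index is 3, not 5. This is the correct syntax: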
fig = plt.figure()
ax = fig.add_subplot(311)
ax = fig.add_subplot(323)
ax = fig.add_subplot(324)
ax = fig.add_subplot(313)
However, for this kind of layout, GridSpec is more readable (see https://matplotlib.org/3.1.1/tutorials/intermediate/gridspec.html). The following code yields exactly the same output, but is (at least to me) easier to understand:
fig = plt.figure()
gs = fig.add_gridspec(3, 2)
fig.add_subplot(gs[0, :])
fig.add_subplot(gs[1, 0])
fig.add_subplot(gs[1, 1])
fig.add_subplot(gs[2, :])
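To see why 315 fails while 313 works, the three-digit shorthand can be decoded by hand. The helper below, decode_subplot_code, is hypothetical and written only for illustration; it is not part of matplotlib, and its error message merely imitates the one quoted in the question.

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code into (nrows, ncols, index).

    Hypothetical helper for illustration; not a matplotlib function.
    """
    nrows, ncols, index = (int(ch) for ch in str(code))
    if not 1 <= index <= nrows * ncols:
        raise ValueError(
            f"num must be 1 <= num <= {nrows * ncols}, not {index}"
        )
    return nrows, ncols, index


# The four codes from the corrected answer all decode cleanly.
for code in (311, 323, 324, 313):
    print(code, decode_subplot_code(code))

# The code from the question asks for cell 5 of a 3x1 grid, which has only 3 cells.
try:
    decode_subplot_code(315)
except ValueError as err:
    print("315 ->", err)
```

Codes 323 and 324 address the middle row of a 3x2 grid, while 311 and 313 address the first and last rows of a 3x1 grid, which is how the 1:2:1 layout comes about.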

Data-Visualization Python

Plot 4 different line plots for the 4 companies in the dataframe open_prices. Year goes on the X-axis and stock price on the Y-axis; you will need a (2, 2) plot. Set the figure size to (10, 8) and share the X-axis for better visualization.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nsepy import get_history
import datetime as dt
%matplotlib inline
start = dt.datetime(2015, 1, 1)
end = dt.datetime.today()
infy = get_history(symbol='INFY', start = start, end = end)
infy.index = pd.to_datetime(infy.index)
hdfc = get_history(symbol='HDFC', start = start, end = end)
hdfc.index = pd.to_datetime(hdfc.index)
reliance = get_history(symbol='RELIANCE', start = start, end = end)
reliance.index = pd.to_datetime(reliance.index)
wipro = get_history(symbol='WIPRO', start = start, end = end)
wipro.index = pd.to_datetime(wipro.index)
open_prices = pd.concat([infy['Open'], hdfc['Open'], reliance['Open'],
                         wipro['Open']], axis = 1)
open_prices.columns = ['Infy', 'Hdfc', 'Reliance', 'Wipro']
f, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
axes[0, 0].plot(open_prices.index.year,open_prices.INFY)
axes[0, 1].plot(open_prices.index.year,open_prices.HDB)
axes[1, 0].plot(open_prices.index.year,open_prices.TTM)
axes[1, 1].plot(open_prices.index.year,open_prices.WIT)
I am getting a blank graph. Please help.
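The code below works fine. I changed the following things:
a) the axes are now taken from a single variable, ax, returned by a 2x2 plt.subplots call (axes was never defined); b) the DataFrame column names were incorrect; c) anyone who wants to try this example also needs to install the lxml library.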
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nsepy import get_history
import datetime as dt
start = dt.datetime(2015, 1, 1)
end = dt.datetime.today()
infy = get_history(symbol='INFY', start = start, end = end)
infy.index = pd.to_datetime(infy.index)
hdfc = get_history(symbol='HDFC', start = start, end = end)
hdfc.index = pd.to_datetime(hdfc.index)
reliance = get_history(symbol='RELIANCE', start = start, end = end)
reliance.index = pd.to_datetime(reliance.index)
wipro = get_history(symbol='WIPRO', start = start, end = end)
wipro.index = pd.to_datetime(wipro.index)
open_prices = pd.concat([infy['Open'], hdfc['Open'], reliance['Open'],
                         wipro['Open']], axis = 1)
open_prices.columns = ['Infy', 'Hdfc', 'Reliance', 'Wipro']
print(open_prices.columns)
# 2x2 grid with the requested figure size and a shared X-axis
f, ax = plt.subplots(2, 2, figsize=(10, 8), sharex=True)
ax[0,0].plot(open_prices.index.year,open_prices.Infy)
ax[1,0].plot(open_prices.index.year,open_prices.Hdfc)
ax[0,1].plot(open_prices.index.year,open_prices.Reliance)
ax[1,1].plot(open_prices.index.year,open_prices.Wipro)
plt.show()
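For readers without nsepy or network access, the same layout can be tried with made-up data. This is a minimal sketch under stated assumptions: the prices are synthetic random walks (placeholder values, not market data), the Agg backend is used so it runs headless, and the full date index is plotted instead of index.year so that the lines do not collapse onto a handful of x positions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for open_prices (placeholder values, not market data)
dates = pd.date_range("2015-01-01", periods=500, freq="B")
rng = np.random.default_rng(0)
open_prices = pd.DataFrame(
    rng.normal(0, 1, size=(len(dates), 4)).cumsum(axis=0) + 100,
    index=dates,
    columns=["Infy", "Hdfc", "Reliance", "Wipro"],
)

# One subplot per company, shared X-axis, 10 x 8 inch figure
fig, axes = plt.subplots(2, 2, figsize=(10, 8), sharex=True)
for ax, name in zip(axes.flat, open_prices.columns):
    ax.plot(open_prices.index, open_prices[name])
    ax.set_title(name)
fig.autofmt_xdate()  # rotate the date labels on the bottom row
```

Looping over axes.flat together with the column names keeps each subplot tied to the correct column, which avoids the kind of name mismatch seen in the question.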

Plot datetime x axis with irregular intervals

I want to plot data for a custom date range. My data set covers every day from 8 a.m. to 5 p.m., and I need to plot these time ranges for 5 days. How can I do that so that no time outside those hours appears on the axis? I tried setting an x locator, but the plot still shows the full time range.
import datetime
from matplotlib import dates as mdates
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates
x= pd.date_range(start=datetime.datetime( 2019, 7, 8, 8, 0,0 ), freq='10min',end=datetime.datetime ( 2019, 7, 12, 17, 0 ,0))
start = datetime.time(8, 0, 0)
end = datetime.time(17, 0, 0)
def time_in_range(start, end, x):
    """Return true if x is in the range [start, end]"""
    if start <= end:
        return start <= x <= end
    else:
        return start <= x or x <= end
# Timestamp.to_pydatetime() replaces the removed Timestamp.to_datetime()
t= [i.to_pydatetime() for i in x if time_in_range(start,end,i.to_pydatetime().time())]
df=pd.DataFrame({'d':d,'t':t})
df=df.set_index('t')
fig,axes = plt.subplots()
axes.xaxis_date()
axes.bar(mdates.date2num(list(df.index)),df['d'],align='center',width=0.006,color='pink')
fig.autofmt_xdate()
xformatter = mdates.DateFormatter('%d/%m/%y:%H:%M')
xlocator = mdates.HourLocator(byhour=range(8,17,2))
axes.xaxis.set_major_locator(xlocator)
plt.gcf().axes[0].xaxis.set_major_formatter(xformatter)
plt.show()
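A date axis is continuous, so the hours between 5 p.m. and 8 a.m. still take up space even when no data falls in them. One commonly used workaround, sketched here under assumptions, is to plot against integer positions and write the timestamps into the tick labels. The values array is random placeholder data, since d is not defined in the post, and the Agg backend is used so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x = pd.date_range("2019-07-08 08:00", "2019-07-12 17:00", freq="10min")

# Keep only the samples between 08:00 and 17:00 (both ends included)
in_hours = x[x.indexer_between_time("08:00", "17:00")]

values = np.random.default_rng(0).random(len(in_hours))  # placeholder data
pos = np.arange(len(in_hours))                           # gap-free x positions

fig, ax = plt.subplots(figsize=(12, 4))
ax.bar(pos, values, width=0.8, color="pink")

# Label every second full hour (8, 10, ..., 16) with its real timestamp
tick_pos = [p for p, ts in enumerate(in_hours)
            if ts.minute == 0 and ts.hour in range(8, 17, 2)]
ax.set_xticks(tick_pos)
ax.set_xticklabels(list(in_hours[tick_pos].strftime("%d/%m/%y:%H:%M")),
                   rotation=45, ha="right")
fig.tight_layout()
```

The trade-off is that the x-axis is no longer a true time axis: distances between bars reflect sample order, not elapsed time, so the overnight jump appears as adjacent bars.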

matplotlib scatter plot using axes object in loop

I am having trouble using Matplotlib to plot multiple series in a loop (Matplotlib 1.0.0, Python 2.6.5, ArcGIS 10.0). Forum research pointed me to application of an Axes object, in order to plot multiple series on the same plot. I see how this works well for data generated outside of a loop (sample scripts), but when I insert the same syntax and add the second series into my loop that pulls data from database, I get the following error:
": unsupported operand type(s) for -: 'NoneType' and 'NoneType' Failed to execute (ChartAge8)."
Below is my code - any suggestions or comments are much appreciated!
import arcpy
import os
import matplotlib
import matplotlib.pyplot as plt
#Variables
FC = arcpy.GetParameterAsText(0) #feature class
P1_fld = arcpy.GetParameterAsText(1) #score field to chart
P2_fld = arcpy.GetParameterAsText(2) #score field to chart
plt.subplots_adjust(hspace=0.4)
nsubp = int(arcpy.GetCount_management(FC).getOutput(0)) #pulls n subplots from FC
last_val = object()
#Sub-plot loop
cur = arcpy.SearchCursor(FC, "", "", P1_fld)
i = 0
x1 = 1 # category 1 locator along x-axis
x2 = 2 # category 2 locator along x-axis
fig = plt.figure()
for row in cur:
    y1 = row.getValue(P1_fld)
    y2 = row.getValue(P2_fld)
    i += 1
    ax1 = fig.add_subplot(nsubp, 1, i)
    ax1.scatter(x1, y1, s=10, c='b', marker="s")
    ax1.scatter(x2, y2, s=10, c='r', marker="o")
del row, cur
#Save plot to pdf, open
figPDf = r"path.pdf"
plt.savefig(figPDf)
os.startfile("path.pdf")
If you want to draw several series on the same figure, create the figure object once, outside the loop, and then plot to that same object on every iteration, something like this:
fig = plt.figure()
for row in cur:
    y1 = row.getValue(P1_fld)
    y2 = row.getValue(P2_fld)
    i += 1
    ax1 = fig.add_subplot(nsubp, 1, i)
    ax1.scatter(x1, y1, s=10, c='b', marker="s")
    ax1.scatter(x2, y2, s=10, c='r', marker="o")
del row, cur
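The same pattern can be tried without arcpy. In the sketch below the rows list is a made-up stand-in for the database cursor, and the None checks rest on an assumption: that the NoneType error in the question may come from empty field values reaching scatter. That cause is not confirmed by the post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Placeholder rows standing in for the cursor: (score 1, score 2)
rows = [(3.0, 4.5), (2.0, None), (5.5, 1.0)]
x1, x2 = 1, 2  # category positions along the x-axis

fig = plt.figure()               # created once, outside the loop
fig.subplots_adjust(hspace=0.4)
for i, (y1, y2) in enumerate(rows, start=1):
    ax = fig.add_subplot(len(rows), 1, i)
    # Skip missing values instead of passing None to scatter
    # (assumption: the cursor can return None for empty fields)
    if y1 is not None:
        ax.scatter(x1, y1, s=10, c="b", marker="s")
    if y2 is not None:
        ax.scatter(x2, y2, s=10, c="r", marker="o")
    ax.set_xlim(0, 3)
```

Each row gets its own subplot on the shared figure, and a row with a missing score simply shows one marker instead of two.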