plot histogram after averaging images - numpy

I have the following code to average a set of images (say 100 images) and to plot the histogram of the averaged image (a single image). I am not getting a histogram with this code. Could you please help me fix it?
# reading multiple images
S01 = [i for i in glob.glob("C:/Users/experiment 1/S01/*.tif")]
# averaging images
s = np.array([np.array(Image.open(fname)) for fname in S01])
s_avg = np.array(np.mean(s, axis=0), dtype=np.uint16)
s_out = Image.fromarray(s_avg)
# plot histogram
plt.hist(s_out, bins=auto)
plt.show()

No need to convert s_avg to an Image; you can compute the histogram of the array using .ravel():
import glob
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

fnames = [i for i in glob.glob("C:/Users/experiment 1/S01/*.tif")]
s = np.array([np.array(Image.open(fname)) for fname in fnames])
s_avg = np.mean(s, axis=0)
# plot histogram
plt.hist(s_avg.ravel(), bins=100)
plt.show()
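For completeness, the original attempt fails for two reasons: bins=auto is missing quotes (it should be bins='auto'), and plt.hist expects array-like data (a 2-D array is treated as one dataset per column), not a PIL Image — which is why flattening the averaged array with .ravel() is the fix. A minimal self-contained sketch, with random arrays standing in for the loaded .tif files:

```python
import numpy as np

# random 16-bit arrays standing in for the loaded .tif files (3 images, 4x5 pixels)
rng = np.random.default_rng(0)
s = rng.integers(0, 65535, size=(3, 4, 5)).astype(np.uint16)

# average over the image axis, then flatten to 1-D for the histogram
s_avg = np.mean(s, axis=0)
counts, edges = np.histogram(s_avg.ravel(), bins=10)

print(counts.sum())  # 20 -- one count per pixel of the 4x5 averaged image
```

The same s_avg.ravel() can be passed straight to plt.hist(..., bins='auto') to draw the plot.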

Related

Plotting xarray.DataArray and Geopandas together - aspect ratio errors

I am trying to create two images side by side: one satellite image alone, and next to it, the same satellite image with outlines of agricultural fields. My raster data "raster_clip" is loaded into rioxarray (original satellite image from NAIP, converted from .sid to .tif), and my vector data "ag_clip" is in geopandas. My code is as follows:
fig, (ax1, ax2) = plt.subplots(ncols = 2, figsize=(14,8))
raster_clip.plot.imshow(ax=ax1)
raster_clip.plot.imshow(ax=ax2)
ag_clip.boundary.plot(ax=ax1, color="yellow")
I can't seem to figure out how to get the y axes in each plot to be the same. When the vector data is excluded, then the two plots end up the same shape and size.
I have tried the following:
- Setting sharey=True in the subplots method: doesn't affect the shape of the resulting images, just removes the tick labels on the second image.
- Setting aspect='equal' in the imshow method: leads to the error "plt.imshow's 'aspect' kwarg is not available in xarray", which doesn't make sense because the 'aspect' kwarg is listed in the documentation for xarray.plot.imshow.
- Removing the figsize argument: doesn't affect the ratio of the two plots.
Not entirely related to your question, but I've used Cartopy before for overlaying a GeoDataFrame on a DataArray:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt

plt.figure(figsize=(16, 8))
ax = plt.subplot(projection=ccrs.PlateCarree())
ds.plot(ax=ax)   # ds: the xarray DataArray
gdf.plot(ax=ax)  # gdf: the GeoDataFrame
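If Cartopy isn't an option, the mismatch can also be handled in plain matplotlib: drawing the vector layer can expand one Axes' data limits, so forcing both Axes back to the raster's extent (and an equal aspect) makes the panels identical again. A minimal sketch with synthetic data — the random array and the yellow rectangle are stand-ins for raster_clip and ag_clip, which aren't reproduced here:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, just for the sketch
import matplotlib.pyplot as plt

img = np.random.rand(40, 60)  # stand-in for the satellite raster

fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(14, 8))
ax1.imshow(img)
ax2.imshow(img)
# stand-in for ag_clip.boundary.plot(...); this can expand ax1's limits
ax1.plot([10, 50, 50, 10, 10], [5, 5, 35, 35, 5], color="yellow")

# force both panels back to the image extent and square-pixel aspect
for ax in (ax1, ax2):
    ax.set_xlim(-0.5, img.shape[1] - 0.5)
    ax.set_ylim(img.shape[0] - 0.5, -0.5)  # inverted y, image convention
    ax.set_aspect("equal")
```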

How to change scaling of x/y axis to plot outliers in pandas dataframe?

In a set of datapoints I am trying to graph on a scatterplot, there are a couple of huge anomaly points. For reference, most values range between 0-100, but occasionally there is an anomalous point around 100000. Because of this, any plot I make (scatterplot, box plot, or otherwise) zooms out so far to fit all the points that the 99% of points in the 0-100 range look like a tiny dot. Is there any way I can scale it so that the first 99% of the points are scaled normally and the axis skips ahead to the anomaly's value so it still fits in the graph?
Here is how the graphs look (box plot and scatter plot images not reproduced here).
Use the plt.axis() function with your limits:
plt.axis([x_min, x_max, y_min, y_max])
where x_min, x_max, y_min, and y_max are the coordinate limits for both axes.
You can set the x/y axis scale to log, or just set limits on x/y (with plt.xlim(0, 200), for example) to hide the anomalies from your chart:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('whitegrid')
plt.figure(figsize=(20,12))
data = [1,2,3,4,5,55,1,6,7,24,67,33,41,75,100_000,1_000_000]
plt.subplot(2,2,1)
plt.title('basic boxplot')
sns.boxplot(x=data, flierprops=dict(markerfacecolor='green', markersize=7))
plt.subplot(2,2,2)
plt.title('log x axis')
b = sns.boxplot(x=data, flierprops=dict(markerfacecolor='red', markersize=7))
b.set_xscale('log')
plt.subplot(2,2,3)
plt.title('basic scatter')
hue = ['outlier' if i > 100 else 'basic sample' for i in data]
sns.scatterplot(x=data, y=data, hue=hue)
plt.subplot(2,2,4)
plt.title('log x/y scatter')
s = sns.scatterplot(x=data, y=data, hue=hue)
s.set_xscale('log')
s.set_yscale('log')
plt.show()
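A third option worth knowing about, assuming the data can contain zeros or negatives (where a pure log scale breaks down), is matplotlib's symlog scale: it is linear near zero and logarithmic beyond a threshold, so the 0-100 cluster and the 100 000-scale outliers both stay readable. A minimal sketch reusing the data list from above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, just for the sketch
import matplotlib.pyplot as plt

data = [1, 2, 3, 4, 5, 55, 1, 6, 7, 24, 67, 33, 41, 75, 100_000, 1_000_000]

fig, ax = plt.subplots()
ax.scatter(data, data)
# linear below linthresh=100, logarithmic above it
ax.set_xscale("symlog", linthresh=100)
ax.set_yscale("symlog", linthresh=100)
```

(The linthresh keyword spelling assumes matplotlib 3.3 or later; older versions used linthreshx/linthreshy.)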

Tensorflow how to sample large number of textures from small dataset of large images

I have 100 large-ish (1000x1000) images which I want to use as a training data set for a texture analysis system. I want to randomly generate texture swatches of about 200x200. What is the best way to do this? I would prefer to not preprocess all of the swatches so that each epoch is trained with slightly different swatches.
My initial (naive?) implementation included preprocessing layers in the model that do random crops on the image and just do a ton of epochs to accommodate the small number of large pictures, however after about ~400 epochs TF would crash without exception (it would just exit).
I now find myself coding a data generator (tf.keras.utils.Sequence) that will return a batch of swatches on request, but I feel like I'm reinventing the wheel and it is getting clunky - making me think this can't be the best way.
What is the best way to handle such a situation where you have a somewhat small dataset that you dynamically create more samples from?
I have written a function that will segment an image. Code is below:
import cv2

def image_segment(image_path, img_resize, crop_size):
    image_list = []
    img = cv2.imread(image_path)
    img = cv2.resize(img, img_resize)
    shape = img.shape
    xsteps = int(shape[0] / crop_size[0])
    ysteps = int(shape[1] / crop_size[1])
    print(xsteps, ysteps)
    for i in range(xsteps):
        for j in range(ysteps):
            x = i * crop_size[0]
            xend = x + crop_size[0]
            y = j * crop_size[1]
            yend = y + crop_size[1]
            cropped_image = img[x:xend, y:yend]
            image_list.append(cropped_image)
    return image_list
Below is an example of use:
# This code provides input to the image_segment function
image_path = r'c:\temp\landscape.jpg'  # location of image
width = 1000   # width to resize input image
height = 800   # height to resize input image
image_resize = (width, height)  # resize target (width, height)
crop_width = 200   # width of desired cropped images
crop_height = 400  # height of desired cropped images
# Note: to get the full set of cropped images, width/crop_width and height/crop_height should be integers
crop_size = (crop_height, crop_width)
images = image_segment(image_path, image_resize, crop_size)  # call the function
The code below will display the resized input image and the resultant cropped images:
# this code will display the resized input image and the resultant cropped images
import matplotlib.pyplot as plt

img = cv2.imread(image_path)  # read in the image
input_resized_image = cv2.resize(img, image_resize)  # resize the image
plt.imshow(input_resized_image)  # show the resized input image
r = len(images)
plt.figure(figsize=(20, 20))
for i in range(r):
    plt.subplot(5, 5, i + 1)
    plt.imshow(images[i])
    class_name = str(i)
    plt.title(class_name, color='green', fontsize=16)
    plt.axis('off')
plt.show()
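The grid segmentation above is deterministic; the random swatches the question asks about can be sketched in plain NumPy by picking a random image and a random top-left corner per sample. This is only a sketch of the idea (in a tf.keras.utils.Sequence it would live in __getitem__; shapes here are scaled down from the question's 100 images of 1000x1000):

```python
import numpy as np

def random_swatches(images, n, size=200, rng=None):
    """Cut n random size-by-size crops from a stack of images."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = images.shape[1:3]
    out = []
    for _ in range(n):
        i = rng.integers(0, len(images))   # which image to crop from
        y = rng.integers(0, h - size + 1)  # random top edge
        x = rng.integers(0, w - size + 1)  # random left edge
        out.append(images[i, y:y + size, x:x + size])
    return np.stack(out)

# 10 grayscale 300x300 stand-ins (the real set would be 100 at 1000x1000)
imgs = np.zeros((10, 300, 300), dtype=np.float32)
batch = random_swatches(imgs, 8)
print(batch.shape)  # (8, 200, 200)
```

Because the crops are drawn fresh on every call, each epoch sees slightly different swatches without any preprocessing step.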

Looping across files to overlay contours from different datasets onto one pcolor map

I am trying to make a pcolor plot from one dataset on a polar stereographic grid and then overlay many contours for one level, but these contours are defined from 3-D datasets of their own. I am getting the error:
RuntimeError: Can not put single artist in more than one figure
# load lat/lons
fid = Dataset(ifile_for_lat_lon)
lon = fid.variables['TLON'][:]
lat = fid.variables['TLAT'][:]
fid.close()

def get_basemap():
    fig = plt.figure('Background')
    fig.add_subplot(111)
    m = Basemap(projection='npstere', boundinglat=65, lon_0=270, resolution='l')
    x, y = m(lon, lat)  # compute map projection coordinates
    m.pcolor(x, y, raster_data, vmin=0, vmax=1, cmap='Blues_r')
    m.colorbar(location='right')
    m.drawcoastlines()
    m.fillcontinents(color='grey', lake_color='navy')
    # draw parallels and meridians
    m.drawparallels(np.arange(-80., 81., 20.))
    m.drawmeridians(np.arange(-180., 181., 20.))
    m.drawmapboundary(fill_color='navy')
    plt.legend(loc='lower right')
    return m, x, y
# run get_basemap once to get the background map
m, x, y = get_basemap()
for year in np.arange(1979, 2017):
    year = str(year)
    # contour levels
    clevs = np.array([0.7])
    # load data
    nc_fid = Dataset(ifile, 'r')
    data = nc_fid.variables['data'][0, :, :]
    nc_fid.close()
    cnplot = m.contour(x, y, data, clevs, cmap=plt.cm.jet, linewidth=20)
    # for member in cnplot.collections:
    #     member.remove()
    plt.show()
The first plot of the loop works fine -- the background raster with associated contour is displayed.
As soon as the loop reaches the second iteration, it fails with:
RuntimeError: Can not put single artist in more than one figure
I just want the loop so I can read in each new raster and draw its contours on top. In the end, I need the background raster with many contours overlaid (the later rasters themselves are never displayed, only their contour-level information, drawn over the original background from the first file).

Scatter plot without x-axis

I am trying to visualize some data and have built a scatter plot with this code -
sns.regplot(y="Calls", x="clientid", data=Drop)
This is the output (scatter plot image not reproduced here).
I don't want it to consider the x-axis. I just want to see how the data lie w.r.t y-axis. Is there a way to do that?
As @iayork suggested, you can see the distribution of your points with a stripplot or a swarmplot (you could also combine them with a violinplot). If you need to move the points closer to the y-axis, you can simply adjust the size of the figure so that the width is small compared to the height (here I'm doing 2 subplots on a 4x5 in figure, which means each plot is roughly 2x5 in):
fig, (ax1,ax2) = plt.subplots(1,2, figsize=(4,5))
sns.stripplot(d, orient='vert', ax=ax1)
sns.swarmplot(d, orient='vert', ax=ax2)
plt.tight_layout()
However, I'm going to suggest that maybe you want to use distplot instead. This function is specifically created to show the distribution of your data. Here I'm plotting the KDE of the data, as well as the "rugplot", which shows the position of the points along the y-axis:
fig = plt.figure()
sns.distplot(d, kde=True, vertical=True, rug=True, hist=False, kde_kws=dict(shade=True), rug_kws=dict(lw=2, color='orange'))