image from [3,M,N] to [M,N,3] - numpy

I have an ndarray representing an image with different channels like this:
image = (8,100,100) where 8=channels, 100x100 the actual image per channel
I am interested in extracting the RGB components of that image:
imageRGB = np.take(image, [4,2,1], axis = 0)
This gives me an array of shape (3,100,100) with the RGB components.
However, I need to visualize it, so I need an array of shape (100,100,3). I think it should be quite straightforward, but none of the methods I have tried work.

numpy.einsum is a good tool for this.
Official documentation: https://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html
import numpy as np
imageRGB = np.random.randint(0,5,size=(3,100,101))
# set the last dim to 101 just to make stuff more clear
imageRGB.shape
# (3,100,101)
imageRGB_reshape = np.einsum('kij->ijk',imageRGB)
imageRGB_reshape.shape
# (100,101,3)
In my opinion it's the clearest way to write and read this.
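For comparison, the same axis reordering can also be written with np.transpose or np.moveaxis. This is only a sketch alongside the einsum version above; the variable names are made up for the example.

```python
import numpy as np

# Same toy shape as above: 3 channels, 100 rows, 101 columns
imageRGB = np.random.randint(0, 5, size=(3, 100, 101))

# einsum version from the answer
via_einsum = np.einsum('kij->ijk', imageRGB)

# transpose: new axis order is (old axis 1, old axis 2, old axis 0)
via_transpose = np.transpose(imageRGB, (1, 2, 0))

# moveaxis: move the channel axis (0) to the last position
via_moveaxis = np.moveaxis(imageRGB, 0, -1)

print(via_einsum.shape, via_transpose.shape, via_moveaxis.shape)
```

All three only reorder the axes; no pixel values are changed.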

Wow, thank you! I had never thought of using Einstein summation; it works very well.
Just out of curiosity, is it possible to build it manually?
For example:
R = image[4,:,:]
G = image[2,:,:]
B = image[1,:,:]
imageRGB = ???
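One way to build it manually, as a minimal sketch: stack the three 2D channel arrays along a new last axis with np.stack. The random 8-channel image here is just stand-in data.

```python
import numpy as np

# Stand-in for the 8-channel image from the question
image = np.random.randint(0, 255, size=(8, 100, 100))

R = image[4, :, :]
G = image[2, :, :]
B = image[1, :, :]

# Stack the three (100, 100) arrays along a new last axis -> (100, 100, 3)
imageRGB = np.stack([R, G, B], axis=-1)
print(imageRGB.shape)
```

np.dstack([R, G, B]) should give the same result for 2D inputs.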

Related

Streamlit with Tensorflow to analyse image and return the probability if is positive or negative

I'm trying to use TensorFlow to analyze an image and return the probability that it is positive or negative, based on an existing model (.h5 file). I couldn't find documentation or a repository covering exactly that, so even a link to read would be great.
Link for the application: https://share.streamlit.io/felipelx/hackathon/IDC_Detector.py
Libraries that I'm trying to use.
import numpy as np
import streamlit as st
import tensorflow as tf
from keras.models import load_model
The function to load the model.
#st.cache(allow_output_mutation=True)
def loadIDCModel():
    model_idc = load_model('models/IDC_model.h5', compile=False)
    model_idc.summary()
    return model_idc
The function that processes the image, and what I'm trying to see: model.predict runs, but the percentage does not update; regardless of the image, the value is always the same.
if uploaded_file is not None:
    # transform image to numpy array
    file_bytes = tf.keras.preprocessing.image.load_img(uploaded_file, target_size=(96,96), grayscale = False, interpolation = 'nearest', color_mode = 'rgb', keep_aspect_ratio = False)
    c.image(file_bytes, channels="RGB")
    Genrate_pred = st.button("Generate Prediction")
    if Genrate_pred:
        model = loadMetModel()
        input_arr = tf.keras.preprocessing.image.img_to_array(file_bytes)
        input_arr = np.array([input_arr])
        probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
        prediction = probability_model.predict(input_arr)
        dict_pred = {0: 'Benigno/Normal', 1: 'Maligno'}
        result = dict_pred[np.argmax(prediction)]
        value = 0
        if result == 'Benigno/Normal':
            value = str(((prediction[0][0])*100).round(2)) + '%'
        else:
            value = str(((prediction[0][1])*100).round(2)) + '%'
        c.metric('Predição', result, delta=value, delta_color='normal')
Thank you in advance for any help.
The first thing I'm noticing is that your function for loading the model is named loadIDCModel, but then the function you call for loading the model is loadMetModel. When I check your source code, though, it looks like you've already addressed this issue. I'd recommend updating your question to reflect this.
Playing around with your application, I think the issue is your model itself. I tried various images — images containing carcinomas, and even a picture of a cat — and each gave me a probability around 73%. The lowest score I got was 72.74%, and the highest was 73.11% (this one was the cat). It seems that the output percentage is varying slightly, hinting that rather than something being wrong in the code, your model itself is likely at fault. You might need to retrain your model, as it seems to have learned to always return a value of approximately 0.73.
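One thing that may be worth ruling out before retraining, and this is an assumption since the model file is not shown here: if the saved model already ends in a softmax layer, then the extra tf.keras.layers.Softmax() wrapped around it is applied to probabilities rather than logits. For two classes that caps the reported confidence at about 73%, which is close to the numbers above. A small NumPy sketch of the effect (the softmax helper and the input values are made up for illustration):

```python
import numpy as np

def softmax(x):
    """Plain softmax over a 1D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical model outputs that are ALREADY probabilities
confident = np.array([0.999, 0.001])
unsure = np.array([0.5, 0.5])

# Applying softmax a second time squashes them toward each other
print(softmax(confident))
print(softmax(unsure))

# Upper bound for two classes: softmax of the one-hot vector [1, 0]
max_possible = softmax(np.array([1.0, 0.0]))[0]
print(max_possible)
```

If the model outputs raw logits instead, this does not apply and the extra Softmax layer is fine.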

TensorFlow Binary Classification

I'm trying to make a simple binary image classification with TensorFlow, but the results are just all over the place.
The classifier is supposed to check whether my gate is open or closed. I already have some python scripts to rotate and crop the images to eliminate the surroundings, with an image size of 130w*705h.
Images are below. I know I must be doing something totally wrong, because the images are almost night and day of a difference, yet it still gives completely random results. Any tips? Is there a simpler library or maybe a cloud service I could use for this if TF is too complicated?
Any help is appreciated, thanks!
Gate Closed
Gate Open
Just compute the average grey value of your images and define a threshold. If you want something more sophisticated compute average gradients or something like that. Your problem seems far too simple to use TF or CV.
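A minimal sketch of that idea with NumPy only. The threshold value, the helper names, and the assumption that the open gate is the brighter image are all made up here; in practice you would measure a few labelled images and pick the threshold from them.

```python
import numpy as np

def mean_gray(image_rgb):
    """Average gray value of an (H, W, 3) image, using a plain channel mean."""
    return float(np.asarray(image_rgb, dtype=float).mean())

def is_open(image_rgb, threshold=128.0):
    """Assumption: the open gate shows up brighter than the closed one."""
    return mean_gray(image_rgb) > threshold

# Synthetic stand-ins for the 130w x 705h cropped photos
dark = np.full((705, 130, 3), 40, dtype=np.uint8)
bright = np.full((705, 130, 3), 200, dtype=np.uint8)

print(is_open(dark), is_open(bright))
```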
After taking into consideration Martin's Answer, I decided to go with average grays after some filtering and edge detection.
I think it will work great for my case, thanks!
Some code:
import cv2
import os
import numpy as np
# https://medium.com/sicara/opencv-edge-detection-tutorial-7c3303f10788
inputPath = '/Users/axelsariel/Desktop/GateImages/Cropped/'
# subDir = 'Closed/'
subDir = 'Open/'
# keep only the .JPG files; building a new list avoids removing items while iterating
openImagesList = [f for f in os.listdir(inputPath + subDir) if f.endswith('.JPG')]
index = 0
while True:
    image = openImagesList[index]
    img = cv2.imread(inputPath + subDir + image)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray,11)
    grayFiltered = cv2.bilateralFilter(gray, 7, 50, 50)
    edgesFiltered = cv2.Canny(grayFiltered, 80, 160)
    # show the three processing stages side by side
    images = np.hstack((gray, grayFiltered, edgesFiltered))
    cv2.imshow(image, images)
    key = cv2.waitKey()
    if key == 3:
        index += 1
    elif key == 2:
        index -= 1
    elif key == ord('q'):
        break
cv2.destroyAllWindows()
Average Grays after filtering:
Filtering steps:

Construct NumPy matrix row by row

I'm trying to construct a 2D NumPy array from values in an extant 2D NumPy array using an iterative process. Using ordinary python lists the process I'm describing would look like so:
coords = ...  # data from file contained in a 2D list
d = ...  # integer
edges = []
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        edges.append(edge)
However, the NumPy array imposes restrictions that do not permit the process shown above. Below I try to do the same thing using NumPy arrays, and it should immediately be clear where the problems are:
coords = np.genfromtxt('Energies.txt', dtype=float, skip_header=1)
d = ...  # integer
# how to initialize?
for i in range(d+1):
    for j in range(i+1, d+1):
        edge = coords[j] - coords[i]
        # how to append?
Because .append does not exist for NumPy arrays I need to rely on concatenate or stack instead. But these functions are designed to join existing arrays, and I don't have anything to concatenate or stack until after the first iteration of my loop. So I suppose I need to change my data flow, but I'm unsure how to go about this.
Any help would be greatly appreciated. Thanks in advance.
The function you are looking for is numpy.meshgrid [1]; it does this by default.
[1] https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.meshgrid.html
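Separately from the meshgrid suggestion, here is a sketch of two other common patterns for the loop in the question: collect the rows in a plain Python list and convert once at the end, or index all (i, j) pairs at once with np.triu_indices. The coords values are made-up sample data, and d is assumed to be the number of rows minus one.

```python
import numpy as np

# Made-up sample data standing in for the contents of the file
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
d = coords.shape[0] - 1

# Option 1: append rows to a list, then convert to a 2D array once
edges_list = []
for i in range(d + 1):
    for j in range(i + 1, d + 1):
        edges_list.append(coords[j] - coords[i])
edges_loop = np.array(edges_list)

# Option 2: all index pairs with i < j at once, no Python loop
i_idx, j_idx = np.triu_indices(d + 1, k=1)
edges_vec = coords[j_idx] - coords[i_idx]

print(edges_loop.shape, edges_vec.shape)
```

Both give the rows in the same order as the nested loop, since np.triu_indices walks the upper triangle row by row.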

Convert date/time index of external dataset so that pandas would plot clearly

When a time series data set is indexed with pandas' internal date/time dtype, the index seems to plot cleanly, as here.
But when I already have data files whose date and time column is in its own format, such as [2009-01-01T00:00], is there a way to convert it into an object that the plot can read? Currently my plot looks like the following.
Code:
dir = sorted(glob.glob("bsrn_txt_0100/*.txt"))
gen_raw = (pd.read_csv(file, sep='\t', encoding = "utf-8") for file in dir)
gen = pd.concat(gen_raw, ignore_index=True)
gen.drop(gen.columns[[1,2]], axis=1, inplace=True)
#gen['Date/Time'] = gen['Date/Time'][11:] -> cause error, didnt work
filter = gen[gen['Date/Time'].str.endswith('00') | gen['Date/Time'].str.endswith('30')]
filter['rad_tot'] = filter['Direct radiation [W/m**2]'] + filter['Diffuse radiation [W/m**2]']
lis = np.arange(35040) # used the number of rows, checked by printing. This is for 2009-2010.
plt.xticks(lis, filter['Date/Time'])
plt.plot(lis, filter['rad_tot'], '.')
plt.title('test of generation 2009')
plt.xlabel('Date/Time')
plt.ylabel('radiation total [W/m**2]')
plt.show()
My other idea was to use plotly, but its main purpose seems to be feeding data to the internet. Ideally I would be familiar with all the modules and try it myself, but I am learning pandas and matplotlib as I go.
So I would like to ask whether anyone has experienced similar issues.
I think you need to hide most of the tick labels in a loop:
ax = df.plot(...)
spacing = 10
visible = ax.xaxis.get_ticklabels()[::spacing]
for label in ax.xaxis.get_ticklabels():
    if label not in visible:
        label.set_visible(False)
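Regarding the conversion itself, a minimal sketch, assuming the 'Date/Time' column holds strings like 2009-01-01T00:00: pd.to_datetime can parse them, and once the column is the index, pandas/matplotlib can pick the date ticks themselves. The small DataFrame below is made-up data, and 'rad_tot' just mirrors the column name from the question.

```python
import pandas as pd

# Made-up rows standing in for the concatenated files
gen = pd.DataFrame({
    "Date/Time": ["2009-01-01T00:00", "2009-01-01T00:30", "2009-01-01T01:00"],
    "rad_tot": [0.0, 1.5, 3.0],
})

# Parse the strings into real timestamps
gen["Date/Time"] = pd.to_datetime(gen["Date/Time"], format="%Y-%m-%dT%H:%M")

# Use the timestamps as the index of the series to plot
series = gen.set_index("Date/Time")["rad_tot"]
print(series.index.dtype)

# series.plot(style=".") would then plot against real dates
```

With a datetime column, a filter such as gen["Date/Time"].dt.minute.isin([0, 30]) could also replace the string endswith checks.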

Tensorboard histograms to matplotlib

I would like to "dump" the TensorBoard histograms and plot them with matplotlib, to get plots better suited to a scientific paper.
I managed to hack my way through the summary file using tf.train.summary_iterator and dump the histogram I wanted (a tensorflow.core.framework.summary_pb2.HistogramProto object).
By doing that and implementing what the JavaScript code does with the data (https://github.com/tensorflow/tensorboard/blob/c2fe054231fe77f3a5b05dbc519f713d2e738d1c/tensorboard/plugins/histogram/tf_histogram_dashboard/histogramCore.ts#L104), I managed to get something similar to the TensorBoard plots (same trends), but not the exact same plot.
Can anyone shed some light on this?
Thanks
In order to plot a tensorboard histogram with matplotlib I am doing the following:
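import numpy as np
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# the function name is only illustrative; STEP_COUNT must be defined by the caller
def load_histograms(path):
    event_acc = EventAccumulator(path, size_guidance={
        'histograms': STEP_COUNT,
    })
    event_acc.Reload()
    tags = event_acc.Tags()
    result = {}
    for hist in tags['histograms']:
        histograms = event_acc.Histograms(hist)
        # np.int was removed in recent NumPy versions, so the builtin int is used
        result[hist] = np.array([np.repeat(np.array(h.histogram_value.bucket_limit), np.array(h.histogram_value.bucket).astype(int)) for h in histograms])
    return result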
h.histogram_value.bucket_limit gives me the value and h.histogram_value.bucket the count of this value. So when I repeat the values accordingly (np.repeat(...)), I get a huge array of the expected size. This array can now be plotted with the default matplotlib logic.
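To make the np.repeat step concrete, here is a toy example with made-up bucket limits and counts (not read from a real event file):

```python
import numpy as np

# Made-up histogram: upper bucket limits and how many values fell in each bucket
bucket_limit = np.array([0.5, 1.0, 1.5])
bucket = np.array([2.0, 0.0, 3.0])

# Repeat each limit as many times as its count
samples = np.repeat(bucket_limit, bucket.astype(int))
print(samples)  # [0.5 0.5 1.5 1.5 1.5]
```

The resulting flat array can be handed to plt.hist like any other sample, keeping in mind that every value has been snapped to its bucket limit.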
The best solution is to load all events and reconstruct all the histograms (as in the answer by @khuesmann), but using EventFileLoader instead of EventAccumulator. This gives you a histogram per wall time and step, like the ones TensorBoard plots. It can be extended to return a list of actions by time step and wall time.
Don't forget to check which tag you will use.
from tensorboard.backend.event_processing.event_file_loader import EventFileLoader
# Just in case, PATH_OF_FILE is the path of the file, not the folder
loader = EventFileLoader(PATH_OF_FILE)
# Where to store values
wtimes,steps,actions = [],[],[]
for event in loader.Load():
    wtime = event.wall_time
    step = event.step
    if len(event.summary.value) > 0:
        summary = event.summary.value[0]
        if summary.tag == HISTOGRAM_TAG:
            wtimes += [wtime]*int(summary.histo.num)
            steps += [step] *int(summary.histo.num)
            for num,val in zip(summary.histo.bucket,summary.histo.bucket_limit):
                actions += [val] *int(num)
Bear in mind that TensorFlow approximates the actions and treats them as continuous variables, so even if you have discrete actions (e.g. 0, 1, 3) you will end up with actions such as 0.2, 0.4, 0.9, 1.4, ... In that case, rounding the values will do.
A good solution is the one from @khuesmann, but it only lets you retrieve the accumulated histogram, not the histogram per step, which is the one actually shown in TensorBoard.
If you want the distribution: as far as I understand, TensorBoard usually compresses the histograms to reduce the memory needed to store the data (imagine storing a 2D histogram over 4 million steps; the memory grows quickly). These compressed histograms are accessible like this:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
n2n = EventAccumulator(PATH)
n2n.Reload()
# Check the tags under histograms and choose the one you want
n2n.Tags()
# This will give you the list used by tensorboard
# of the compress histograms by timestep and wall time
n2n.CompressedHistograms(HISTOGRAM_TAG)
The only problem is that it compresses the histogram to nine percentiles (in basis points they are 0, 668, 1587, 3085, 5000, 6915, 8413, 9332, 10000), which correspond to (-Inf, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, Inf) in standard deviations. Check the code here.
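As a quick check of where those basis points come from, the finite ones can be reproduced from the standard normal CDF. This sketch uses only the Python standard library; the helper name is made up.

```python
import math

def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

std_devs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]

# Convert each CDF value to basis points (1/100 of a percent) and round
basis_points = [round(normal_cdf(z) * 10000) for z in std_devs]
print(basis_points)  # [668, 1587, 3085, 5000, 6915, 8413, 9332]
```

The two endpoints, 0 and 10000, correspond to -Inf and Inf.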
I haven't read much, but it wouldn't be hard to reconstruct the temporal histograms that tensorboard shows. If I find a way to do it, I will post it here.
The simplest way is to parse the events with tbparse and plot the histograms with seaborn kde_ridgeplot.
This tutorial generates the stacked distribution plot with around 30 lines of Python code:
Tensorboard preview:
Parse by tbparse & plotted by seaborn:
You can open an issue if you run into any problems during parsing. (I'm the author of tbparse.)