create a lineplot from a few variables - pandas

I have a dataframe with 3 variables, each representing a different time point for the same outcome (e.g. weight):
df = pd.DataFrame({"Time_1": [-4.5, -0.8, -3.0, 0.2, -2.5],
"Time_2": [-3, -0.2, -2.5, 0.3, 1], "Time_3": [-2, 0, -1, 0.5, 1]})
I want to plot a trajectory for this variable identical to this graph:
Where I have a first point of (0,0) for the baseline and three additional points on the X axis with the corresponding values.

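You could just use df.shift().fillna(0).cumsum().plot(marker='D') to get a plot of the 3 variables together. shift() moves every row down by one (so the last row drops out) and fillna(0) fills the now-empty first row, so that the first row is 0 for all the variables; cumsum() then accumulates the values down each column.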
import pandas as pd
df = pd.DataFrame({"Time_1": [-4.5, -0.8, -3.0, 0.2, -2.5],
"Time_2": [-3, -0.2, -2.5, 0.3, 1], "Time_3": [-2, 0, -1, 0.5, 1]})
df.shift().fillna(0).cumsum().plot(marker='D')
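If the goal is instead one line per row across the time points, starting from a baseline of 0 (one possible reading of the question), a minimal sketch could look like this. The column name "Baseline" and the axis labels are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Time_1": [-4.5, -0.8, -3.0, 0.2, -2.5],
    "Time_2": [-3, -0.2, -2.5, 0.3, 1],
    "Time_3": [-2, 0, -1, 0.5, 1],
})

# Prepend a baseline column of zeros ("Baseline" is a hypothetical name).
traj = df.copy()
traj.insert(0, "Baseline", 0.0)

# Transpose so the time points become the index (x-axis) and every
# original row becomes one column, i.e. one line in the plot.
traj = traj.T

ax = traj.plot(marker="D", legend=False)
ax.set_xlabel("Time point")
ax.set_ylabel("Value")
```

Each of the five rows then starts at 0 and passes through its Time_1, Time_2 and Time_3 values.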


Why tensorflow addons F1 Score gave 0 for correct guess?

I am confused. My goal is to train my CNN model with the F1 score. However, the result is weird:
import tensorflow_addons as tfa
import numpy as np
metric = tfa.metrics.F1Score(
num_classes=4, threshold=0.5)
y_true = np.array([
[0, 1, 0, 0],
# [0, 1, 0, 0],
# [1, 0, 0, 0]
], np.int32)
y_pred = np.array([
[0, 1, 0, 0],
# [0.2, 0.6, 0.2, 0.2],
# [0.6, 0.2, 0.2, 0.2]
], np.float32)
metric.update_state(y_true, y_pred)
result = metric.result()
result.numpy()
The expected result is
[1,1,1,1]
So, when I want to get the macro F1 Score, it should be 1 instead of 0.25.
The actual result is
[0, 1, 0, 0]
So, when I use parameter average=macro, the actual result is 0.25.
EDIT:
I am confused. I added another row to y_true, and it works. I expected it to throw an error, but it does not.
import tensorflow_addons as tfa
import numpy as np
metric = tfa.metrics.F1Score(
num_classes=4, threshold=0.5)
y_true = np.array([
[0, 1, 0, 0],
[1, 0, 0, 0]
# [0, 1, 0, 0],
# [1, 0, 0, 0]
], np.int32)
y_pred = np.array([
[0, 1, 0, 0],
# [0.2, 0.6, 0.2, 0.2],
# [0.6, 0.2, 0.2, 0.2]
], np.float32)
metric.update_state(y_true, y_pred)
result = metric.result()
result.numpy()
Is tensorflow addons buggy?
There is no issue with tfa.metrics.F1Score. You have defined 4 classes. Each element of a y_pred row is a class probability, which is set to 1 if it is above the threshold, and the F1 score is then computed per class. In your first example there were no labels or predictions for classes 0, 2 and 3, which is why their scores are zero.
Check the example below:
import numpy as np
import tensorflow_addons as tfa
metric = tfa.metrics.F1Score(num_classes=4, threshold=0.5)
y_true = np.array([
[0, 1, 1, 0],
[0, 0, 0, 1],
[1, 0, 1, 0],
], np.int32)
y_pred = np.array([
[0, 1, 0, 0],
[0, 1, 0, 0],
[0.6, 0, 0.51, 0],
], np.float32)
metric.update_state(y_true, y_pred)
metric.result().numpy()
# per-class result of metrics.F1Score:
[1. , 0.6666667, 0.6666667, 0. ]
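To see where the zeros come from, the per-class computation can be reproduced by hand. The following is a plain NumPy sketch, not the tensorflow_addons implementation; the helper name per_class_f1 is hypothetical, and it assumes that a class with no true positives, false positives or false negatives is scored 0.

```python
import numpy as np

def per_class_f1(y_true, y_pred, threshold=0.5):
    """Per-class F1 for multi-hot labels; empty classes score 0 (assumption)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Binarise the predictions: 1 where the probability exceeds the threshold.
    y_hat = (np.asarray(y_pred) > threshold).astype(np.float64)
    tp = (y_true * y_hat).sum(axis=0)
    fp = ((1 - y_true) * y_hat).sum(axis=0)
    fn = (y_true * (1 - y_hat)).sum(axis=0)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); leave 0 where the denominator is 0.
    return np.divide(2 * tp, denom, out=np.zeros_like(denom), where=denom > 0)

single = per_class_f1([[0, 1, 0, 0]], [[0, 1, 0, 0]])
print(single)         # [0. 1. 0. 0.]
print(single.mean())  # 0.25
```

With a single sample that only involves class 1, the other three classes have nothing to count, so the macro average is 0.25 rather than 1.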

in pyplot hist2D with customized colorbar mark bins outside colorbar range

I'm plotting a weighted 2D histogram with one value assigned to each bin. Here's a minimal example:
import matplotlib.pyplot as plotter
plot_field, axis_field = plotter.subplots()
x = [0.5, 1.5, 2.5, 0.5, 1.5, 2.5, 0.5, 1.5, 2.5]
y = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5]
w = [2, 1, 0, 3, 0, 0, 1, 0, 3]
minimum = 1
bins = [[0, 1, 2, 3], [0, 1, 2, 3]]
histo = plotter.hist2d(x, y, bins=bins, weights=w)
plotter.colorbar(histo[3], extend='min')
plotter.clim(minimum, max(w))
plotter.show()
Restricting the range of the colorbar works fine. However, I want the bins with a weight below the minimum to be marked in some way, either colored differently or indicated in some other way.
Is there a simple way to do this?
Thanks a lot!
You could create your own colormap, for example:
import numpy as np
import matplotlib.pyplot as plotter
from matplotlib.colors import ListedColormap
plot_field, axis_field = plotter.subplots()
# cm.get_cmap was removed in Matplotlib 3.9; pyplot.get_cmap takes the same arguments
viridis = plotter.get_cmap('viridis', 256)
newcolors = viridis(np.linspace(0, 1, 256))
pink = np.array([248/256, 24/256, 148/256, 1])
newcolors[0, :] = pink
newcmp = ListedColormap(newcolors)
x = [0.5, 1.5, 2.5, 0.5, 1.5, 2.5, 0.5, 1.5, 2.5]
y = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5]
w = [2, 1, 0, 3, 0, 0, 1, 0, 3]
minimum = 1
bins = [[0, 1, 2, 3], [0, 1, 2, 3]]
_, _, _, mesh = plotter.hist2d(
x, y, bins=bins, weights=w, cmap=newcmp, vmin=minimum, vmax=max(w)
)
plotter.colorbar(mesh, extend='min')
plotter.show()
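Note that replacing the first colormap entry also recolors bins whose weight is exactly equal to the minimum, because those map to the first entry as well. As a hedged alternative, the sketch below uses the colormap's separate "under" color, which only applies to values strictly below vmin. It assumes Matplotlib 3.4 or newer (for Colormap.with_extremes); the color "deeppink" is an arbitrary choice.

```python
import matplotlib.pyplot as plotter

x = [0.5, 1.5, 2.5, 0.5, 1.5, 2.5, 0.5, 1.5, 2.5]
y = [0.5, 0.5, 0.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5]
w = [2, 1, 0, 3, 0, 0, 1, 0, 3]
minimum = 1
bins = [[0, 1, 2, 3], [0, 1, 2, 3]]

# with_extremes returns a modified copy, so the registered viridis
# colormap itself is left untouched.
cmap = plotter.get_cmap("viridis").with_extremes(under="deeppink")

plot_field, axis_field = plotter.subplots()
_, _, _, mesh = axis_field.hist2d(
    x, y, bins=bins, weights=w, cmap=cmap, vmin=minimum, vmax=max(w)
)
# extend='min' draws the "under" color as a triangle at the colorbar's lower end.
plot_field.colorbar(mesh, ax=axis_field, extend="min")
# Call plotter.show() to display the figure.
```

Bins with weight 0 are drawn in the "under" color, while bins with weight 1 keep the lowest regular viridis color.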

Tensorflow: Reshape a tensor according to a boolean mask

I have a 1D tensor of values:
a = tf.constant([0.1, 0.2, 0.3, 0.4])
and an n-D boolean mask:
b = tf.constant([[1, 1, 0], [0, 1, 1]])
The total number of 1's in b matches the length of a.
How can I get [[0.1, 0.2, 0.0], [0.0, 0.3, 0.4]] from a and b?
import tensorflow as tf
a = tf.constant([0.1, 0.2, 0.3, 0.4])
b = tf.constant([[1, 1, 0], [0, 1, 1]])
# reshape b to a 1D vector
b_res = tf.reshape(b, [-1])
# Get the indices to gather using cumsum
b_cum = tf.cumsum(b_res) - 1
# Gather the elements, multiply by b_res to zero out the unwanted values and reshape back
c = tf.reshape(tf.gather(a, b_cum) * tf.cast(b_res, 'float32'), [-1, 3])
print(c)
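The cumsum trick above can be traced step by step in NumPy. This is only an illustrative sketch: the np.clip guard is an addition for masks that start with 0 (where cumsum - 1 would give an index of -1), and the boolean-mask assignment at the end is a different technique used here purely as a cross-check.

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
b = np.array([[1, 1, 0], [0, 1, 1]])

flat = b.reshape(-1)          # [1 1 0 0 1 1]
idx = np.cumsum(flat) - 1     # [0 1 1 1 2 3]
idx = np.clip(idx, 0, None)   # guard against -1 when the mask starts with 0

# Gather, zero out the positions where the mask is 0, and restore the shape.
c = (a[idx] * flat.astype(a.dtype)).reshape(b.shape)

# Cross-check: write the values of a directly into the masked positions.
expected = np.zeros(b.shape, dtype=a.dtype)
expected[b.astype(bool)] = a

print(np.array_equal(c, expected))  # True
```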

tensorflow how do one get the output the same size as input tensor after segment sum

I'm using the tf.unsorted_segment_sum method of TensorFlow and it works.
For example:
tf.unsorted_segment_sum(tf.constant([0.2, 0.1, 0.5, 0.7, 0.8]),
tf.constant([0, 0, 1, 2, 2]), 3)
Gives the right result:
array([ 0.3, 0.5 , 1.5 ], dtype=float32)
I want to get:
array([0.3, 0.3, 0.5, 1.5, 1.5], dtype=float32)
I've solved it.
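import tensorflow as tf
data = tf.constant([0.2, 0.1, 0.5, 0.7, 0.8])
gr_idx = tf.constant([0, 0, 1, 2, 2])
# idx maps every element to the position of its group among the unique ids
y, idx, count = tf.unique_with_counts(gr_idx)
# per-group sums; segment ids must be sorted (tf.math.segment_sum in TensorFlow 2.x)
group_sum = tf.segment_sum(data, gr_idx)
# copy each group's sum back to every element of that group
group_sup = tf.gather(group_sum, idx)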
answer:
array([0.3, 0.3, 0.5, 1.5, 1.5], dtype=float32)
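The same "sum per group, then index back with the group ids" idea can be checked quickly in NumPy. This sketch swaps in np.bincount for the segment sum and assumes the group ids are non-negative integers.

```python
import numpy as np

data = np.array([0.2, 0.1, 0.5, 0.7, 0.8])
gr_idx = np.array([0, 0, 1, 2, 2])

# One sum per group id: position k holds the sum of the data with gr_idx == k.
group_sum = np.bincount(gr_idx, weights=data)

# Index the per-group sums with the group ids so that every element
# receives the sum of its own group (approximately 0.3, 0.3, 0.5, 1.5, 1.5).
result = group_sum[gr_idx]
print(result)
```

Because the group ids here already run from 0 upwards without gaps, indexing with gr_idx directly plays the role of the gather step.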

Add an extra column to ndarray in python

I have an ndarray as follows.
feature_matrix = [[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]]
I have a position ndarray as follows.
position = [10, 20, 30]
Now I want to add the position value at the beginning of the feature_matrix as follows.
[[10, 0.1, 0.3], [20, 0.7, 0.8], [30, 0.8, 0.8]]
I tried the answers in this: How to add an extra column to an numpy array
E.g.,
feature_matrix = np.concatenate((feature_matrix, position), axis=1)
However, I get the following error:
ValueError: all the input arrays must have same number of dimensions
Please help me to resolve this problem.
This solved my problem. I used np.column_stack.
import numpy as np
feature_matrix = [[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]]
position = [10, 20, 30]
feature_matrix = np.column_stack((position, feature_matrix))
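The problem is the shape of the position array: it is 1-D, with shape (3,), while feature_matrix is 2-D, with shape (3, 2), and np.concatenate requires all inputs to have the same number of dimensions.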
>>> feature_matrix
array([[ 0.1, 0.3],
[ 0.7, 0.8],
[ 0.8, 0.8]])
>>> position
array([10, 20, 30])
>>> position.reshape((3,1))
array([[10],
[20],
[30]])
The solution is (with np.concatenate):
>>> np.concatenate((position.reshape((3,1)), feature_matrix), axis=1)
array([[ 10. , 0.1, 0.3],
[ 20. , 0.7, 0.8],
[ 30. , 0.8, 0.8]])
But np.column_stack is clearly a great fit in your case!
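As a small sketch, the two approaches can be compared side by side. position[:, None] is used here as a shorthand for position.reshape((3, 1)); the variable names via_concat and via_stack are only illustrative.

```python
import numpy as np

feature_matrix = np.array([[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]])
position = np.array([10, 20, 30])

print(feature_matrix.shape, position.shape)  # (3, 2) (3,)

# position[:, None] adds a trailing axis, giving shape (3, 1),
# so both inputs to concatenate are 2-D.
via_concat = np.concatenate((position[:, None], feature_matrix), axis=1)

# column_stack turns 1-D inputs into columns by itself.
via_stack = np.column_stack((position, feature_matrix))

print(via_concat.shape)                       # (3, 3)
print(np.array_equal(via_concat, via_stack))  # True
```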