Scilab - Legend ONLY for a specific set of functions - legend

I would like to draw boundary regions with xfpoly and save the figure with xs2pdf. Then I want to plot two functions inside those boundaries, add a legend for just those two functions, and save the figure again.
My code follows:
clear; clc; xdel(winsid());
t = -2:0.01:2;
x_1 = t.^2; x_2 = t.^4;
xfpoly([-3 -2 -2 -3], [0 0 16 16], color('grey'));
ax = gca();
ax.auto_clear = 'off'; ax.data_bounds = [-3, 0; 3, 3];
ax.box = 'on';
ax.axes_visible = ['on','on','off']; ax.tight_limits = ['on','on','off'];
xfpoly([2 3 3 2], [0 0 16 16], color('grey'));
xfpoly([-1 1 1 -1], [1 1 16 16], color('grey'));
xs2pdf(gcf(), 'fig_1');
plot2d(t, [x_1', x_2'], [color('green'), color('red')]);
legend(['t^2'; 't^4']);
leg_ent = gce();
leg_ent.text = ['';'';'';'t^2'; 't^4']
xs2pdf(gcf(), 'fig_2');

Do you want something like this?
clear;
clc;
t = -2:0.01:2;
x_1 = t.^2; x_2 = t.^4;
scf(0);
clf(0);
// plot the curves first to make the legend easier
plot2d(t, [x_1', x_2'], [color('green'), color('red')]);
legend(['t^2'; 't^4']); // the first two elements are the curves, so there is no need to modify the legend
ax = gca();
ax.auto_clear = 'off';
ax.data_bounds = [-3, 0; 3, 3];
ax.box = 'on';
xfpoly([-3 -2 -2 -3], [0 0 3 3], color('grey'));
xfpoly([2 3 3 2], [0 0 3 3], color('grey'));
xfpoly([-1 1 1 -1], [1 1 3 3], color('grey'));
scf(1);
clf(1);
xfpoly([-3 -2 -2 -3], [0 0 3 3], color('grey')); // ymax should be 3, not 16
xfpoly([2 3 3 2], [0 0 3 3], color('grey'));
xfpoly([-1 1 1 -1], [1 1 3 3], color('grey'));
ax = gca();
ax.auto_clear = 'off';
ax.data_bounds = [-3, 0; 3, 3];
ax.box = 'on';

Atilla's answer brought me to this solution using the pause command:
clear; clc; xdel(winsid());
t = -2:0.01:2;
x_1 = t.^2; x_2 = t.^4;
plot2d(t, [x_1', x_2'], [color('green'), color('red')]); plot_1 = gce();
legend(['t^2'; 't^4']); leg_1 = gce();
plot_1.visible = 'off'; leg_1.visible = 'off';
xfpoly([-3 -2 -2 -3], [0 0 16 16], color('grey'));
xfpoly([2 3 3 2], [0 0 16 16], color('grey'));
xfpoly([-1 1 1 -1], [1 1 16 16], color('grey'));
ax = gca();
ax.box = 'on';
xs2pdf(gcf(), 'fig_1');
// pause
plot_1.visible = 'on'; leg_1.visible = 'on';
xs2pdf(gcf(), 'fig_2');

Related

Indices in Numpy and MATLAB

I have a piece of MATLAB code that I want to convert to Python/NumPy.
I have a matrix ind with dimensions (32768, 24) and another matrix X with dimensions (98304, 6). When I perform the operation
result = X(ind)
the shape of the result is (32768, 24).
But in NumPy, when I perform the same operation
result = X[ind]
I get a result of shape (32768, 24, 6).
I would greatly appreciate help understanding why I get these two different results and how to fix it. I want the result in NumPy to have shape (32768, 24) as well.
In Octave, if I define:
>> X=diag([1,2,3,4])
X =
Diagonal Matrix
1 0 0 0
0 2 0 0
0 0 3 0
0 0 0 4
>> idx = [6 7;10 11]
idx =
6 7
10 11
then the indexing selects a block:
>> X(idx)
ans =
2 0
0 3
The numpy equivalent is
In [312]: X=np.diag([1,2,3,4])
In [313]: X
Out[313]:
array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
In [314]: idx = np.array([[5,6],[9,10]]) # shifted for 0 base indexing
In [315]: np.unravel_index(idx,(4,4)) # raveled to unraveled conversion
Out[315]:
(array([[1, 1],
[2, 2]]),
array([[1, 2],
[1, 2]]))
In [316]: X[_] # this indexes with a tuple of arrays
Out[316]:
array([[2, 0],
[0, 3]])
another way:
In [318]: X.flat[idx]
Out[318]:
array([[2, 0],
[0, 3]])
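One detail the diagonal example above cannot show, because that matrix is symmetric: MATLAB/Octave linear indices walk down the columns first (column-major), while X.flat walks along the rows. The following is a minimal sketch, with a made-up 3x4 matrix and index array chosen only for illustration, of translating 1-based column-major indices into NumPy:

```python
import numpy as np

# Small non-symmetric matrix so that the storage order actually matters
X = np.arange(1, 13).reshape(3, 4)
# MATLAB-style linear indices: 1-based, counted down the columns
ind = np.array([[1, 2], [4, 12]])

# Option 1: flatten in column-major (Fortran) order, then shift to 0-based
result = X.ravel(order="F")[ind - 1]

# Option 2: convert to (row, col) pairs and index with the tuple of arrays
rows, cols = np.unravel_index(ind - 1, X.shape, order="F")
result_2 = X[rows, cols]

print(result)        # same shape as ind
print(result.shape)  # (2, 2)
```

Both options return an array with the shape of ind, which is the behaviour the question asks for.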

Partitioned matrix multiplication in tensorflow or pytorch

Assume I have a matrix P of size [4, 4] that is partitioned into 4 smaller [2, 2] block matrices. How can I efficiently multiply this block matrix by another, smaller (non-partitioned) matrix?
Let's assume our original matrix is:
P = [ 1 1 2 2
      1 1 2 2
      3 3 4 4
      3 3 4 4 ]
which is split into the submatrices:
P_1 = [ 1 1    P_2 = [ 2 2    P_3 = [ 3 3    P_4 = [ 4 4
        1 1 ]          2 2 ]          3 3 ]          4 4 ]
Now our P is:
P = [ P_1 P_2
      P_3 P_4 ]
In the next step, I want to do an element-wise multiplication between P and a smaller matrix whose size equals the number of submatrices:
P * [ 1 0   = [ P_1 0   = [ 1 1 0 0
      0 0 ]     0   0 ]     1 1 0 0
                            0 0 0 0
                            0 0 0 0 ]
You can think of representing your large block matrix in a more efficient way.
For instance, a block matrix
P = [ 1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4]
Can be represented using
a = [ 1 0     b = [ 1 1 0 0     p = [ 1 2
      1 0           0 0 1 1 ]         3 4 ]
      0 1
      0 1 ]
As
P = a @ p @ b
(with @ representing matrix multiplication). Matrices a and b encode the block structure of P, and the small matrix p holds the value of each block.
Now, if you want to multiply p element-wise by a small (2x2) matrix q, you simply compute
a @ (p * q) @ b
A simple PyTorch example:
In [1]: a = torch.tensor([[1., 0], [1., 0], [0., 1], [0, 1]])
In [2]: b = torch.tensor([[1., 1., 0, 0], [0, 0, 1., 1]])
In [3]: p=torch.tensor([[1., 2.], [3., 4.]])
In [4]: q = torch.tensor([[1., 0], [0., 0]])
In [5]: a @ p @ b
Out[5]:
tensor([[1., 1., 2., 2.],
[1., 1., 2., 2.],
[3., 3., 4., 4.],
[3., 3., 4., 4.]])
In [6]: a @ (p*q) @ b
Out[6]:
tensor([[1., 1., 0., 0.],
[1., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
I leave it to you as an exercise how to efficiently produce the "structure" matrices a and b given the sizes of the blocks.
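One possible way to approach that exercise, sketched here in NumPy rather than PyTorch, is to build both structure matrices from Kronecker products of an identity matrix with a column or row of ones. The helper name structure_matrices and its parameters are made up for this illustration and assume all blocks have the same size:

```python
import numpy as np

def structure_matrices(n_block_rows, n_block_cols, block_h, block_w):
    """Build a and b for a grid of equally sized blocks (hypothetical helper)."""
    # a repeats each block row block_h times: shape (n_block_rows * block_h, n_block_rows)
    a = np.kron(np.eye(n_block_rows), np.ones((block_h, 1)))
    # b repeats each block column block_w times: shape (n_block_cols, n_block_cols * block_w)
    b = np.kron(np.eye(n_block_cols), np.ones((1, block_w)))
    return a, b

p = np.array([[1.0, 2.0], [3.0, 4.0]])  # one value per block
q = np.array([[1.0, 0.0], [0.0, 0.0]])  # per-block multiplier

a, b = structure_matrices(2, 2, 2, 2)
P = a @ p @ b             # expands p into the full 4x4 block matrix
masked = a @ (p * q) @ b  # element-wise product applied block by block
print(P)
print(masked)
```

The same construction should carry over to PyTorch tensors, since it only relies on an identity matrix, a block of ones, and a Kronecker product.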
Following is a general Tensorflow-based solution that works for input matrices p (large) and m (small) of arbitrary shapes as long as the sizes of p are divisible by the sizes of m on both axes.
def block_mul(p, m):
    p_x, p_y = p.shape
    m_x, m_y = m.shape
    m_4d = tf.reshape(m, (m_x, 1, m_y, 1))
    m_broadcasted = tf.broadcast_to(m_4d, (m_x, p_x // m_x, m_y, p_y // m_y))
    mp = tf.reshape(m_broadcasted, (p_x, p_y))
    return p * mp
Test:
import tensorflow as tf
tf.enable_eager_execution()
p = tf.reshape(tf.constant(range(36)), (6, 6))
m = tf.reshape(tf.constant(range(9)), (3, 3))
print(f"p:\n{p}\n")
print(f"m:\n{m}\n")
print(f"block_mul(p, m):\n{block_mul(p, m)}")
Output (Python 3.7.3, Tensorflow 1.13.1):
p:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]]
m:
[[0 1 2]
[3 4 5]
[6 7 8]]
block_mul(p, m):
[[ 0 0 2 3 8 10]
[ 0 0 8 9 20 22]
[ 36 39 56 60 80 85]
[ 54 57 80 84 110 115]
[144 150 182 189 224 232]
[180 186 224 231 272 280]]
Another solution that uses implicit broadcasting is the following:
def block_mul2(p, m):
    p_x, p_y = p.shape
    m_x, m_y = m.shape
    p_4d = tf.reshape(p, (m_x, p_x // m_x, m_y, p_y // m_y))
    m_4d = tf.reshape(m, (m_x, 1, m_y, 1))
    return tf.reshape(p_4d * m_4d, (p_x, p_y))
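To see why the reshape-and-broadcast trick works, here is a NumPy transcription of the same idea (an illustrative sketch, not the answer's TensorFlow code; block_mul_np is a made-up name). It is cross-checked against an explicit Kronecker expansion of m, which assumes 2x2 blocks as in the 6x6 test above:

```python
import numpy as np

def block_mul_np(p, m):
    """Multiply every block of p by the matching scalar in m."""
    p_x, p_y = p.shape
    m_x, m_y = m.shape
    # Axes: (block row, row inside block, block column, column inside block)
    p_4d = p.reshape(m_x, p_x // m_x, m_y, p_y // m_y)
    # Singleton axes let each m[i, j] broadcast over its whole block
    m_4d = m.reshape(m_x, 1, m_y, 1)
    return (p_4d * m_4d).reshape(p_x, p_y)

p = np.arange(36).reshape(6, 6)
m = np.arange(9).reshape(3, 3)
out = block_mul_np(p, m)

# Cross-check: expand m to full size explicitly and multiply element-wise
expected = p * np.kron(m, np.ones((2, 2), dtype=p.dtype))
print(np.array_equal(out, expected))  # True
print(out[0])
```

The broadcast version never materializes the expanded copy of m, which is the main difference from the explicit Kronecker expansion.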
I don't know of an efficient method, but you can try these:
Method 1:
Using torch.cat()
import torch
def multiply(a, b):
    # scale each 2x2 block of a by the matching entry of b
    x1 = a[0:2, 0:2]*b[0,0]
    x2 = a[0:2, 2:]*b[0,1]
    x3 = a[2:, 0:2]*b[1,0]
    x4 = a[2:, 2:]*b[1,1]
    # stitch the blocks back together: side by side first, then top over bottom
    return torch.cat((torch.cat((x1, x2), 1), torch.cat((x3, x4), 1)), 0)
a = torch.tensor([[1, 1, 2, 2],[1, 1, 2, 2],[3, 3, 4, 4,],[3, 3, 4, 4]])
b = torch.tensor([[1, 0],[0, 0]])
print(multiply(a, b))
output:
tensor([[1, 1, 0, 0],
[1, 1, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
Method 2:
Using torch.nn.functional.pad()
import torch.nn.functional as F
import torch
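def multiply(a, b):
    # zero-pad b from 2x2 to 4x4; the original entries land in the centre
    b = F.pad(input=b, pad=(1, 1, 1, 1), mode='constant', value=0)
    # fill in the rest of the top-left 2x2 block by hand
    # (this hard-codes the mask for b = [[1, 0], [0, 0]] and is not general)
    b[0,0] = 1
    b[0,1] = 1
    b[1,0] = 1
    return a*b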
a = torch.tensor([[1, 1, 2, 2],[1, 1, 2, 2],[3, 3, 4, 4,],[3, 3, 4, 4]])
b = torch.tensor([[1, 0],[0, 0]])
print(multiply(a, b))
output:
tensor([[1, 1, 0, 0],
[1, 1, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
If the matrices are small, you are probably fine with cat or pad. The factorization solution is very elegant, as is the one with the block_mul implementation.
Another solution is to turn the 2D block matrix into a 3D volume in which each 2D slice is a block (P_1, P_2, P_3, P_4), then use broadcasting to multiply each 2D slice by a scalar, and finally reshape the output. The reshaping is not immediate, but it is doable; the code below is a port from NumPy to PyTorch of https://stackoverflow.com/a/16873755/4892874
In Pytorch:
import torch
h = w = 4
x = torch.ones(h, w)
x[:2, 2:] = 2
x[2:, :2] = 3
x[2:, 2:] = 4
# block size: rows and columns per block
nrows=2
ncols=2
vol3d = x.reshape(h//nrows, nrows, -1, ncols)
vol3d = vol3d.permute(0, 2, 1, 3).reshape(-1, nrows, ncols)
out = vol3d * torch.Tensor([1, 0, 0, 0])[:, None, None].float()
# reshape to original
n, nrows, ncols = out.shape
out = out.reshape(h//nrows, -1, nrows, ncols)
out = out.permute(0, 2, 1, 3)
out = out.reshape(h, w)
print(out)
tensor([[1., 1., 0., 0.],
[1., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
I haven't benchmarked this against the others, but it doesn't consume additional memory as padding would, and it avoids slow operations such as concatenation. It also has the advantage of being easy to understand and visualize.
You can generalize it to any situation by playing with h, w, nrows, ncols.
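As a rough illustration of that generalization, the split and merge steps can be wrapped in two helpers. This sketch uses NumPy instead of PyTorch; the names to_blocks and from_blocks, and the 4x6 example with 2x3 blocks, are made up here and assume the matrix dimensions are divisible by the block size:

```python
import numpy as np

def to_blocks(x, bh, bw):
    """Split an (h, w) matrix into a stack of (bh, bw) blocks, row-major over blocks."""
    h, w = x.shape
    return x.reshape(h // bh, bh, w // bw, bw).swapaxes(1, 2).reshape(-1, bh, bw)

def from_blocks(blocks, h, w):
    """Inverse of to_blocks: reassemble the (h, w) matrix."""
    _, bh, bw = blocks.shape
    return blocks.reshape(h // bh, w // bw, bh, bw).swapaxes(1, 2).reshape(h, w)

x = np.arange(24).reshape(4, 6)
blocks = to_blocks(x, 2, 3)     # four blocks of shape (2, 3)
scale = np.array([1, 0, 0, 2])  # one scalar per block
out = from_blocks(blocks * scale[:, None, None], 4, 6)

print(np.array_equal(from_blocks(blocks, 4, 6), x))  # True: the round trip is lossless
print(out)
```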
Although the other answer may solve the problem, it is not efficient. I came up with another approach (though it is still not perfect). The following implementation needs too much memory when the inputs have 3 or 4 dimensions. For example, for an input of size 20*75*1024*1024, the following calculation needs around 12 GB of RAM.
Here is my implementation:
import tensorflow as tf
tf.enable_eager_execution()
inps = tf.constant([
[1, 1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 2, 2, 2, 2],
[1, 1, 1, 1, 2, 2, 2, 2],
[3, 3, 3, 3, 4, 4, 4, 4],
[3, 3, 3, 3, 4, 4, 4, 4],
[3, 3, 3, 3, 4, 4, 4, 4],
[3, 3, 3, 3, 4, 4, 4, 4]])
on_cells = tf.constant([[1, 0, 0, 1]])
on_cells = tf.expand_dims(on_cells, axis=-1)
# replicate the value to block-size (4*4)
on_cells = tf.tile(on_cells, [1, 1, 4 * 4])
# reshape to a format for permutation
on_cells = tf.reshape(on_cells, (1, 2, 2, 4, 4))
# permutation
on_cells = tf.transpose(on_cells, [0, 1, 3, 2, 4])
# reshape
on_cells = tf.reshape(on_cells, [1, 8, 8])
# element-wise operation
print(inps * on_cells)
Output:
tf.Tensor(
[[[1 1 1 1 0 0 0 0]
[1 1 1 1 0 0 0 0]
[1 1 1 1 0 0 0 0]
[1 1 1 1 0 0 0 0]
[0 0 0 0 4 4 4 4]
[0 0 0 0 4 4 4 4]
[0 0 0 0 4 4 4 4]
[0 0 0 0 4 4 4 4]]], shape=(1, 8, 8), dtype=int32)

Weird behavior of multiply in tensorflow

I am trying to use multiply in my program, but the op seems to behave abnormally: it appears to compute wrong results. Minimal example:
import tensorflow as tf
batchSize = 2
maxSteps = 3
max_cluster_size = 4
x = tf.Variable(tf.random_uniform(dtype=tf.int32, maxval=20, shape=[batchSize, maxSteps, max_cluster_size]))
y = tf.sequence_mask(tf.random_uniform(minval=1, maxval=max_cluster_size-1, dtype=tf.int32, shape=[batchSize, maxSteps]), maxlen=max_cluster_size)
y = tf.cast(y, tf.int32)
z = tf.multiply(x, y)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    x_v = sess.run(x)
    y_v = sess.run(y)
    z_v = sess.run(z)
print(x_v.shape)
print(x_v)
print('----------------------------')
print(y_v.shape)
print(y_v)
print('----------------------------')
print(z_v.shape)
print(z_v)
print('----------------------------')
Result:
(2, 3, 4)
[[[ 7 12 19 3]
[10 18 15 7]
[18 9 2 7]]
[[ 4 5 16 1]
[ 2 14 15 14]
[ 5 18 8 18]]]
----------------------------
(2, 3, 4)
[[[1 1 0 0]
[1 0 0 0]
[1 1 0 0]]
[[1 1 0 0]
[1 1 0 0]
[1 1 0 0]]]
----------------------------
(2, 3, 4)
[[[ 7 12 0 0]
[10 0 0 0]
[18 0 0 0]]
[[ 4 5 0 0]
[ 2 0 0 0]
[ 5 0 0 0]]]
----------------------------
Where z_v is expected to be:
[[[ 7 12 0 0]
[10 0 0 0]
[18 9 0 0]]
[[ 4 5 0 0]
[ 2 14 0 0]
[ 5 18 0 0]]]
When I test multiply in other programs, it works fine.
I suspect this may be related to x and y being random. Can anyone give me a hint?
Instead of these lines:
x_v = sess.run(x)
y_v = sess.run(y)
z_v = sess.run(z)
you need to use this:
x_v, y_v, z_v = sess.run( [ x, y, z ] )
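With the first, separate version, each sess.run call evaluates the graph again. You fetch x_v, then y_v, but sess.run(z) recomputes z's dependencies, so the random op behind y is sampled a second time (x is a tf.Variable, so it keeps the value it got at initialization). The z_v you print was therefore computed from a different y than the y_v you printed. Fetching all three in a single sess.run call evaluates them against the same sample.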
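The effect can be mimicked without TensorFlow. The sketch below is only a plain-Python analogy, not the TensorFlow API: run is a made-up stand-in for a session call that draws a fresh random mask every time it is invoked, while X plays the role of the variable that stays fixed.

```python
import random

X = [7, 12, 19, 3]  # plays the role of the tf.Variable: fixed after initialization

def sample_mask(rng, width=4):
    """Stand-in for the random sequence mask: k ones followed by zeros."""
    k = rng.randint(1, 2)
    return [1 if i < k else 0 for i in range(width)]

def run(fetches, rng):
    """Evaluate every requested name against ONE fresh draw of the mask."""
    y = sample_mask(rng)
    values = {"x": X, "y": y, "z": [a * b for a, b in zip(X, y)]}
    return [values[name] for name in fetches]

rng = random.Random(0)

# Separate calls: y_v and z_v come from two independent draws, so they may disagree
(y_v,) = run(["y"], rng)
(z_v,) = run(["z"], rng)

# Single call: all three values come from the same draw
x_c, y_c, z_c = run(["x", "y", "z"], rng)
consistent = z_c == [a * b for a, b in zip(x_c, y_c)]
print(consistent)  # True
```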

Tensorflow: how to make sure all samples in each batch are with the same label?

I wonder whether there is a way to constrain how batches are generated in TensorFlow. For example, say we are training a CNN on a huge dataset for image classification. Is it possible to force TensorFlow to generate batches in which all samples belong to the same class? For instance, one batch of images all tagged "Apple", and another whose samples are all tagged "Orange".
The reason I ask is that I want to run some experiments to see how different levels of shuffling influence the final trained model. Sample-level shuffling is common practice for CNN training, and everybody does it. I just want to check it myself to gain first-hand knowledge about it.
Thanks!
Dataset.filter() can be used:
import numpy as np
import tensorflow as tf

labels = np.random.randint(0, 10, (10000))
data = np.random.uniform(size=(10000, 5))
ds = tf.data.Dataset.from_tensor_slices((data, labels))
ds = ds.filter(lambda data, labels: tf.equal(labels, 1))  # comment this line out for the unfiltered case
ds = ds.batch(5)
iterator = ds.make_one_shot_iterator()
vals = iterator.get_next()
with tf.Session() as sess:
    for _ in range(5):
        py_data, py_labels = sess.run(vals)
        print(py_labels)
with ds.filter():
> [1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
without ds.filter():
> [8 0 7 6 3]
[2 4 7 6 1]
[1 8 5 5 5]
[7 1 7 4 0]
[7 1 8 0 0]
Edit. The following code shows how to use a feedable iterator to perform batch label selection on the fly. See "Creating an iterator"
import random
import tensorflow as tf

labels = ['Apple'] * 100 + ['Orange'] * 100
data = list(range(200))
random.shuffle(labels)
batch_size = 4
ds_apple = tf.data.Dataset.from_tensor_slices((data, labels)).filter(
    lambda data, label: tf.equal(label, 'Apple')).batch(batch_size)
ds_orange = tf.data.Dataset.from_tensor_slices((data, labels)).filter(
    lambda data, label: tf.equal(label, 'Orange')).batch(batch_size)
handle = tf.placeholder(tf.string, [])
iterator = tf.data.Iterator.from_string_handle(
    handle, ds_apple.output_types, ds_apple.output_shapes)
batch = iterator.get_next()
apple_iterator = ds_apple.make_one_shot_iterator()
orange_iterator = ds_orange.make_one_shot_iterator()
with tf.Session() as sess:
    apple_handle = sess.run(apple_iterator.string_handle())
    orange_handle = sess.run(orange_iterator.string_handle())
    # loop and switch back and forth between apples and oranges
    for _ in range(3):
        feed_dict = {handle: apple_handle}
        print(sess.run(batch, feed_dict=feed_dict))
        feed_dict = {handle: orange_handle}
        print(sess.run(batch, feed_dict=feed_dict))
Typical output for this is as follows. Note that the data values increase monotonically across Apple and Orange batches showing that the iterators are not resetting.
> (array([2, 3, 6, 7], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([0, 1, 4, 5], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))
(array([ 9, 13, 15, 19], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([ 8, 10, 11, 12], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))
(array([21, 22, 23, 25], dtype=int32), array([b'Apple', b'Apple', b'Apple', b'Apple'], dtype=object))
(array([14, 16, 17, 18], dtype=int32), array([b'Orange', b'Orange', b'Orange', b'Orange'], dtype=object))
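For small experiments, the same grouping idea can be sketched in plain Python, independent of tf.data. This is only an illustration: single_label_batches is a made-up helper that buffers samples per label and emits a batch whenever one buffer fills up, so every batch contains a single class.

```python
from collections import defaultdict

def single_label_batches(samples, labels, batch_size):
    """Yield (label, batch) pairs in which every batch holds one label only."""
    buffers = defaultdict(list)
    for sample, label in zip(samples, labels):
        buf = buffers[label]
        buf.append(sample)
        if len(buf) == batch_size:
            yield label, list(buf)
            buf.clear()
    # Emit whatever is left as smaller, still single-label batches
    for label, buf in buffers.items():
        if buf:
            yield label, list(buf)

labels = ["Apple", "Orange"] * 5
samples = list(range(10))
batches = list(single_label_batches(samples, labels, batch_size=2))
for label, batch in batches:
    print(label, batch)
```

Unlike the feedable-iterator version, the order in which the classes alternate here is dictated by the order of the incoming data rather than chosen explicitly.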

Efficiently Creating A Pandas DataFrame From A Numpy 3d array

Suppose we start with
import numpy as np
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
How can this efficiently be made into a pandas DataFrame equivalent to
import pandas as pd
>>> pd.DataFrame({'a': [0, 0, 1, 1], 'b': [1, 3, 5, 7], 'c': [2, 4, 6, 8]})
a b c
0 0 1 2
1 0 3 4
2 1 5 6
3 1 7 8
The idea is to have the a column have the index in the first dimension in the original array, and the rest of the columns be a vertical concatenation of the 2d arrays in the latter two dimensions in the original array.
(This is easy to do with loops; the question is how to do it without them.)
Longer Example
Using @Divakar's excellent suggestion:
>>> np.random.randint(0,9,(4,3,2))
array([[[0, 6],
[6, 4],
[3, 4]],
[[5, 1],
[1, 3],
[6, 4]],
[[8, 0],
[2, 3],
[3, 1]],
[[2, 2],
[0, 0],
[6, 3]]])
Should be made to something like:
>>> pd.DataFrame({
'a': [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
'b': [0, 6, 3, 5, 1, 6, 8, 2, 3, 2, 0, 6],
'c': [6, 4, 4, 1, 3, 4, 0, 3, 1, 2, 0, 3]})
a b c
0 0 0 6
1 0 6 4
2 0 3 4
3 1 5 1
4 1 1 3
5 1 6 4
6 2 8 0
7 2 2 3
8 2 3 1
9 3 2 2
10 3 0 0
11 3 6 3
Here's one approach that does most of the processing on NumPy before finally putting it out as a DataFrame, like so -
m,n,r = a.shape
out_arr = np.column_stack((np.repeat(np.arange(m),n),a.reshape(m*n,-1)))
out_df = pd.DataFrame(out_arr)
If you precisely know that the number of columns would be 2, such that we would have b and c as the last two columns and a as the first one, you can add column names like so -
out_df = pd.DataFrame(out_arr,columns=['a', 'b', 'c'])
Sample run -
>>> a
array([[[2, 0],
[1, 7],
[3, 8]],
[[5, 0],
[0, 7],
[8, 0]],
[[2, 5],
[8, 2],
[1, 2]],
[[5, 3],
[1, 6],
[3, 2]]])
>>> out_df
a b c
0 0 2 0
1 0 1 7
2 0 3 8
3 1 5 0
4 1 0 7
5 1 8 0
6 2 2 5
7 2 8 2
8 2 1 2
9 3 5 3
10 3 1 6
11 3 3 2
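As a quick check of the NumPy part of this approach, the small array from the question can be pushed through the same steps. The intermediate names first_col and rest are introduced here only to make the two ingredients visible:

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
m, n, r = a.shape

# Index along the first dimension, repeated once per row of each 2D slice
first_col = np.repeat(np.arange(m), n)
# Stack the m slices vertically: (m, n, r) -> (m * n, r)
rest = a.reshape(m * n, -1)

out_arr = np.column_stack((first_col, rest))
print(out_arr)
```

Passing out_arr to pd.DataFrame with columns=['a', 'b', 'c'], as in the answer, then gives the frame requested in the question.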
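Using Panel (pd.Panel was deprecated in pandas 0.20 and removed in 0.25, so this only runs on older versions):
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
b = pd.Panel(np.rollaxis(a, 2)).to_frame()
c = b.set_index(b.index.labels[0]).reset_index()
c.columns = list('abc')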
then a is :
[[[1 2]
[3 4]]
[[5 6]
[7 8]]]
b is :
0 1
major minor
0 0 1 2
1 3 4
1 0 5 6
1 7 8
and c is :
a b c
0 0 1 2
1 0 3 4
2 1 5 6
3 1 7 8