TensorFlow: slice Tensor and keep original shape

I have a Tensor tensor of shape (?, 1082), and I want to slice this Tensor into n subparts in a for-loop while keeping the original shape, including the unknown dimension ?.
Example:
lst = []
for n in range(15):
    sub_tensor = tensor[n]  # this will reduce the first dimension
    print(sub_tensor.get_shape())
Print output I'm looking for:
(?, 1082)
(?, 1082)
etc.
How can this be achieved in TensorFlow?

Considering that your problem can have many constraints, I can think of at least three solutions. First, you can use tf.split. I'll use tf.placeholder here, but it's applicable to tensors and variables as well:
p = tf.placeholder(shape=[None,10], dtype=tf.int32)
s1, s2 = tf.split(value=p, num_or_size_splits=2, axis=1)
However, this approach can become unfeasible if the number of required splits is large. Note that tf.split can split along the None axis as well.
Second, you can use tf.slice:
s = tf.slice(p, [0, 2], [-1, 2])
tf.slice works for multidimensional tensors, but it's pretty tricky to use.
Third, you can use the tf.Tensor.__getitem__ method, almost as you described in your question. It acts similarly to NumPy indexing, so this should do the job:
for n in range(10):
    print(p[n, :])
or, with your variable names:
for n in range(15):
    sub_tensor = tensor[n, :]
However, which of these methods to use depends heavily on your particular application. Hope this helps.
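As a minimal sketch of how tf.split preserves the unknown batch dimension (assuming the (?, 1082) placeholder from the question and an arbitrary choice of two equal splits; 1082 = 2 * 541):
import tensorflow as tf

# Placeholder with an unknown batch dimension, as in the question.
tensor = tf.placeholder(shape=[None, 1082], dtype=tf.float32)

# Splitting along axis 1 leaves the unknown first dimension intact.
parts = tf.split(value=tensor, num_or_size_splits=2, axis=1)

for sub_tensor in parts:
    print(sub_tensor.get_shape())  # (?, 541) for each part
Each subpart keeps the leading ? dimension; only the sliced axis shrinks.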

Related

Dataset Preparation for LSTM (multiple variables)

I am struggling to conceptualize the correct way to prepare a time-series dataset for LSTM training. My main concern is: how do I train the network to "remember" the N previous steps? I have two possible approaches in mind, but I am not sure which one is correct.
I am really confused by this; I have tried both approaches (for one variable, however) and both seem to produce plausible results.
1.) The dataset should be in a tensor format like this:
X = [[var1_t_1, var2_t_1, var3_t_1],
     [var1_t_2, var2_t_2, var3_t_2],
     ...]
X.shape = [N, 3]
y = [[target_t_1],
     [target_t_2],
     ...]
y.shape = [N, 1]
During training the LSTM gets N inputs, one for each timestep, and returns N predictions that are used to compute the loss and update the weights.
The network on its own "creates memory" about previous time-step values through its cell states. But for how many previous steps can it create memory? Is there any way to define this memory (if possible, answer with a PyTorch example)?
2.) The dataset should already contain the previous timestep values as features, so a third dimension is necessary, e.g.:
X = [[[var1_t_1, var2_t_1, var3_t_1], [var1_t_2, var2_t_2, var3_t_2], ..., [var1_t_10, var2_t_10, var3_t_10]],
     [[var1_t_2, var2_t_2, var3_t_2], [var1_t_3, var2_t_3, var3_t_3], ..., [var1_t_11, var2_t_11, var3_t_11]],
     ...]
X.shape = [N-10, 10, 3]
y = [[target_t_11],
     [target_t_12],
     ...]
y.shape = [N-10, 1]
In this way we define the number of previous steps the LSTM should try to remember. For the example above, we "ask" the LSTM to remember at least the 10 previous values in order to make a prediction.
Any help clarifying the concept is greatly appreciated. PyTorch code would be extremely welcome as well.
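As a rough illustration of the windowing in approach 2, here is a minimal NumPy sketch (window length 10 and three input variables as in the question; the array names and random data are placeholders):
import numpy as np

window = 10
N, num_vars = 1000, 3

# Hypothetical raw series: N timesteps of 3 variables, plus a target column.
data = np.random.rand(N, num_vars)
target = np.random.rand(N, 1)

# Sample i covers timesteps i .. i+window-1 and is paired with the
# target at timestep i+window.
X = np.stack([data[i:i + window] for i in range(N - window)])
y = target[window:]

print(X.shape)  # (990, 10, 3), i.e. [N-10, 10, 3]
print(y.shape)  # (990, 1),     i.e. [N-10, 1]
Each sample is then a (10, 3) window, which matches the [batch, timesteps, features] layout that batched LSTM implementations typically expect.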

How to multiply tensors with different shapes/dimensions?

I have a convolutional autoencoder model. While an autoencoder typically focuses on reconstructing the input without using any label information, I want to use the class label to perform class conditional scaling/shifting after convolutions. I am curious if utilizing the label in this way might help produce better reconstructions.
num_filters = 32
input_img = layers.Input(shape=(28, 28, 1)) # input image
label = layers.Input(shape=(10,)) # label
# separate scale value for each of the filter dimensions
scale = layers.Dense(num_filters, activation=None)(label)
# conv_0 produces something of shape (None,14,14,32)
conv_0 = layers.Conv2D(num_filters, (3, 3), strides=2, activation=None, padding='same')(input_img)
# TODO: Need help here. Multiply conv_0 by scale along each of the filter dimensions.
# This still outputs something of shape (None,14,14,32)
# Essentially each 14x14x1 slice has its own scalar multiplier
In the example above, the output of the convolutional layer is (14,14,32) and the scale layer is of shape (32,). I want the convolutional output to be multiplied by the corresponding scale value along each filter dimension. For example, if these were numpy arrays I could do something like conv_0[:, :, i] * scale[i] for i in range(32).
I looked at tf.keras.layers.Multiply, but based on the documentation I believe it takes tensors of the same shape as input. How do I work around this?
You don't have to loop. Simply make the two tensors broadcast-compatible:
out = layers.Multiply()([conv_0, tf.expand_dims(tf.expand_dims(scale,axis=1), axis=1)])
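As a quick sanity check of that broadcasting, here is a sketch with random stand-in tensors of the shapes from the question (the batch size of 8 is arbitrary, and plain * is used in place of the Multiply layer):
import tensorflow as tf

conv_out = tf.random.normal([8, 14, 14, 32])  # stand-in for conv_0
scale = tf.random.normal([8, 32])             # per-sample, per-filter scales

# (8, 32) -> (8, 1, 1, 32), which broadcasts against (8, 14, 14, 32)
scaled = conv_out * tf.expand_dims(tf.expand_dims(scale, axis=1), axis=1)

print(scaled.shape)  # (8, 14, 14, 32); each filter map got its own scalar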
I don't know if I actually understood what you are trying to achieve, but I did a quick NumPy test. I believe it should hold in TensorFlow as well:
conv_0 = np.ones([14, 14, 32])
scale = np.array([i + 1 for i in range(32)])
result = conv_0 * scale
Check that each channel-wise slice was scaled element-wise, in this case by the element at index 1 of scale, which is 2:
conv_0_slice_1 = conv_0[:, :, 1]
result_slice_1 = result[:, :, 1]
assert np.allclose(result_slice_1, 2 * conv_0_slice_1)

How to map an array of values for y_true to a single value in order to compare to y_pred in a Tensorflow loss function (Tensorflow/Tensorflow Quantum)

I am trying to implement the circuits listed on page 8 of the following paper: https://arxiv.org/pdf/1905.10876.pdf using TensorFlow Quantum (TFQ). I have previously done so for a subset of circuits using Qiskit, and ended up with accuracies that can be found on page 14 of the following paper: https://arxiv.org/pdf/2003.09887.pdf. In TFQ, my accuracies are way down. I think this delta originates because in TFQ I only used one observable, a Pauli Z operator on the first qubit, and the circuits do not seem to "transfer all knowledge" to the first qubit. I place this in quotes because I am sure there is a better way to describe this. In Qiskit, on the other hand, 16 states (4^2) get mapped to 2 states.
My question: how can I get my accuracies back up?
Potential answer a): some method of "transferring all information" to a single qubit, potentially an ancilla qubit, and doing a readout on this qubit.
Potential answer b) placing a Pauli Z observable on all qubits (4 in total), mapping half of the 16 states to a label 0 and the other half to a label 1. I attempted this in the code below.
My attempt at answer b):
I have a Tensorflow Quantum (TFQ) circuit implemented in Tensorflow. The circuit has multiple observables, which I try to bring together in my loss function. I prefer to use as many standard components as possible, but need to map my quantum states to a label in order to determine the loss. I think what I am trying to achieve is not unique to TFQ. I define my model in the following way:
def circuit():
    data_qubits = cirq.GridQubit.rect(4, 1)
    circuit = cirq.Circuit()
    ...
    return circuit, [cirq.Z(data_qubits[0]), cirq.Z(data_qubits[1]),
                     cirq.Z(data_qubits[2]), cirq.Z(data_qubits[3])]

model_circuit, model_readout = circuit()

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(), dtype=tf.string),
    # The PQC layer returns the expected value of the readout gate, range [-1, 1].
    tfq.layers.PQC(model_circuit, model_readout),
])

# compile model
model.compile(
    loss=loss_mse,
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
    metrics=[])
In loss_mse (mean squared error), I receive a (32, 4) tensor for y_pred. One row could look like:
[-0.2, 0.33, 0.6, 0.3]
This would first have to be mapped from [-1, 1] to a binarized version in [0, 1], so that it looks like:
[0, 1, 1, 1]
Now a table lookup needs to happen, which tells whether this combination is 0 or 1. Finally, the regular (y_true - y_pred)^2 can be computed for that row, followed by an np.sum over all rows. I tried to implement this:
def get_label(measurement):
    if measurement == [0,0,0,0]: return 0
    ...
    elif measurement == [1,1,1,1]: return 0
    else: return -1

def py_call(y_true, y_pred):
    # cast tensor to numpy
    y_pred_np = np.asarray(y_pred)
    loss = np.zeros(len(y_pred))  # could be a single variable with += within the loop
    # evaluate all 32 samples
    for pred in range(len(y_pred_np)):
        # map, binarize and look up
        y_labelled = get_label([0 if y < 0 else 1 for y in y_pred_np[pred]])
        # regular loss comparison
        loss[pred] = (y_labelled - y_true[pred])**2
    # reduce
    loss = np.sum(loss) / len(y_true)
    return loss

@tf.function
def loss_mse(y_true, y_pred):
    loss = tf.py_function(py_call, inp=[y_true, y_pred], Tout=[tf.float64])
    return loss
However, the system appears to still expect a (32, 4) tensor. I would have thought I could simply provide a single loss value (a float). My question: how can I map multiple values for y_true to a single number in order to compare with a single y_pred value in a TensorFlow loss function?
So it looks like there are a couple of things going on here. To answer your question:
how can I map multiple values for y_true to a single number in order to compare with a single y_pred value in a tensorflow loss function?
what you might want is some kind of tf.reduce_* function like tf.reduce_mean or tf.reduce_sum. These functions apply a reduction operation across a given tensor axis, allowing you to convert a tensor of shape (32, 4) to a tensor of shape (32,) or a tensor of shape (4,). Here is a quick snippet:
@tf.function
def my_loss(y_true, y_pred):
    # y_true is shape (32, 4)
    # y_pred is shape (32, 4)
    # Scale from [-1, 1] to [0, 1]
    y_true += 1
    y_true /= 2
    y_pred += 1
    y_pred /= 2
    # These are now both (32,), with the mean taken along the second axis.
    reduced_true = tf.reduce_mean(y_true, axis=1)
    reduced_pred = tf.reduce_mean(y_pred, axis=1)
    # Now a scalar loss.
    loss = tf.reduce_mean((reduced_true - reduced_pred) ** 2)
    return loss
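A quick way to sanity-check that this returns a scalar, assuming the snippet above (the random stand-in data is hypothetical):
y_true = tf.random.uniform([32, 4], minval=-1.0, maxval=1.0)
y_pred = tf.random.uniform([32, 4], minval=-1.0, maxval=1.0)
print(my_loss(y_true, y_pred))  # a single scalar tf.Tensor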
Now, the above isn't exactly what you want, since it's not entirely clear to me what exact reduction rules you have in mind for mapping something like [0,1,1,1] -> 0 vs [0,0,0,0] -> 1.
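If the mapping really is an arbitrary lookup from each 4-bit readout pattern to a 0/1 label, one way to express it with standard ops is sketched below (the 16-entry table values here are placeholders, not the asker's actual mapping; note also that the hard threshold blocks gradients, which is one more reason a smooth reduction like the one above is preferable for training):
import tensorflow as tf

# Hypothetical lookup table: one 0/1 label for each of the 16 bit patterns.
label_table = tf.constant([0., 1., 1., 0., 1., 0., 0., 1.,
                           1., 0., 0., 1., 0., 1., 1., 0.])

def lookup_loss(y_true, y_pred):
    # y_pred: (batch, 4) expectations in [-1, 1] -> bits in {0, 1}
    bits = tf.cast(y_pred > 0, tf.int32)
    # Interpret the 4 bits as an integer index 0..15.
    idx = tf.reduce_sum(bits * tf.constant([8, 4, 2, 1]), axis=1)
    labels = tf.gather(label_table, idx)  # (batch,)
    return tf.reduce_mean((tf.squeeze(y_true) - labels) ** 2)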
Another thing worth mentioning: if all you care about is the sum of the Pauli operators you have term by term in the list [cirq.Z(data_qubits[0]), cirq.Z(data_qubits[1]), cirq.Z(data_qubits[2]), cirq.Z(data_qubits[3])], i.e. the final sum of these expectations, you could just as easily do:
my_operator = sum([cirq.Z(data_qubits[0]), cirq.Z(data_qubits[1]),
                   cirq.Z(data_qubits[2]), cirq.Z(data_qubits[3])])
print(my_operator)
Which should give something like:
cirq.PauliSum(cirq.LinearDict({frozenset({(cirq.GridQubit(0, 0), cirq.Z)}): (1+0j), frozenset({(cirq.GridQubit(0, 1), cirq.Z)}): (1+0j), frozenset({(cirq.GridQubit(0, 2), cirq.Z)}): (1+0j), frozenset({(cirq.GridQubit(0, 3), cirq.Z)}): (1+0j)}))
which is also compatible as a readout operation in the PQC layer. Lastly, I would recommend reading through some of the snippets and examples here:
https://www.tensorflow.org/quantum/api_docs/python/tfq/layers/PQC
and here:
https://www.tensorflow.org/quantum/api_docs/python/tfq/layers/Expectation
which give a pretty good description of the input and output signatures of these layers, as well as the shapes you can expect from them.

Looping over Ragged Tensors in Tensorflow

I wondered if there is any way to loop over ragged tensors, similar to tf.map_fn. My RaggedTensor has a varying number of rows per element, but each row contains 4 points which I would like to retrieve.
The input looks as follows:
ragged_tensor[0] has shape (100, 4)
ragged_tensor[1] has shape (50, 4)
For now I can retrieve all of the points by looping over the first tensor inside the RaggedTensor:
test = tf.map_fn(lambda box: tf.image.crop_to_bounding_box(img, box[0], box[1], box[2], box[3]), tf.cast(boxes, tf.int32), dtype=tf.float32)
Does anyone have any experience with this, or can anyone give me some tips & tricks? All help is appreciated.
This is one way to get the whole array of points:
points = tf.reshape(ragged_tensor.flat_values, [-1, 4])
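A small self-contained illustration of that flattening, with made-up row counts (3 and 2 boxes):
import tensorflow as tf

ragged_tensor = tf.ragged.constant([
    [[1., 2., 3., 4.], [5., 6., 7., 8.], [9., 10., 11., 12.]],
    [[13., 14., 15., 16.], [17., 18., 19., 20.]],
], ragged_rank=1)

points = tf.reshape(ragged_tensor.flat_values, [-1, 4])
print(points.shape)  # (5, 4): all boxes stacked, row boundaries dropped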

Shaping input labels for Tensorflow

Say I have a 1000x500 table, where 500 is the number of columns and 1000 the number of rows.
The rows represent 1000 samples, where each sample is composed of 499 features and 1 label.
If I want to feed this into a TensorFlow model in batches of 20 samples each:
.........................................
inputdata   # is filled and has a shape of 499x1000
inputlabel  # is filled and has a shape of 1x1000
y_ = tf.placeholder(tf.float32, [None, batchSize], name='Labels')
for j in range(numberOfRows // BatchSize):
    sess.run(train_step, feed_dict={x: batch_xs[j], y_: np.reshape(inputlabel[j], (batchSize, 1))})
I've been trying to run my code for two days without any success, so I'll be grateful for any help with the y_ and the reshaping part. The problem I have is understanding how, when I read a batch of 20 data rows, I should shape the labels y_.
First issue: put your batch_size dimension as your first dimension; that's the standard, and a fair number of computations in TensorFlow assume as much.
Second, I don't see a placeholder for your data, X, but you're passing it as a variable to sess.run.
To keep things simple, I suggest you do all this reshaping outside of tensorflow, use numpy. Don't get me wrong, you can absolutely do this in tensorflow, but if slicing and merging are confusing you (they confused everyone the first time), tensorflow will only add to that confusion because you can't simply print the results of a slicing operation as conveniently in tensorflow as you can in numpy to debug your situation.
So to that end, let's do it:
# your data
mydata = np.random.rand(500, 1000)

# tensorflow placeholders
X = tf.placeholder(tf.float32, [batchSize, 499], name='X')
y_ = tf.placeholder(tf.float32, [batchSize, 1], name='y_')

# let's transpose your data so the batch is the first dimension (1000 x 500)
mydata = mydata.T

# Let's split the labels from the data: the first 499 columns are features,
# the last column (index 499) is the label.
data = mydata[:, 0:499]
labels = mydata[:, 499:500]  # keep 2D shape (1000, 1) to match y_

# Now train
for j in range(numOfRows // BatchSize):
    row_from = j * BatchSize
    row_to = j * BatchSize + BatchSize
    sess.run(train_step, feed_dict={
        X: data[row_from:row_to, :],
        y_: labels[row_from:row_to, :]
    })
Don't forget to permute your data; we didn't do it here. I personally like np.random.permutation(1000) to get a random list of indexes, then just take the first BatchSize indexes and np.roll the random permutation. It's a super easy way to iterate through data sets without dealing with index computations or a trailing batch of uneven size.
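A rough sketch of that shuffling trick (NumPy only; variable names are illustrative):
import numpy as np

num_rows, batch_size = 1000, 20
perm = np.random.permutation(num_rows)

for _ in range(num_rows // batch_size):
    batch_idx = perm[:batch_size]      # indexes for this batch
    # batch_x, batch_y = data[batch_idx], labels[batch_idx]
    perm = np.roll(perm, batch_size)   # rotate so the next batch is fresh
Because 1000 is a multiple of 20, every row is visited exactly once per pass and there is no ragged trailing batch.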