How to export adversarial examples for Facenet in Cleverhans? - tensorflow

I am trying to follow this blog to generate adversarial face images against Facenet. The code is here and works fine! My question is how I can export these adversarial images. Perhaps the question is too straightforward for the blog to cover, since it only shows some sample pictures.
I thought it would not be a hard problem, since I know the generated adversarial samples are in "adv". But adv (float32) was computed from faces1 after prewhitening and normalization. To restore 8-bit integer images from adv (float32), I have to reverse the normalization and the prewhitening. It seems that anyone who wants to output images from Facenet has to go through this process.
I am new to Facenet and Cleverhans, so I am not sure whether this is the best way to do it, or whether there is a common way (such as existing functions) to export images from Facenet.
In, we finally got the adversarial samples. I need to export adv to plain int images.
adv =, feed_dict=feed_dict)
In There are some kinda of normalization.
def load_testset(size):
    # Load images paths and labels
    pairs = lfw.read_pairs(pairs_path)
    paths, labels = lfw.get_paths(testset_path, pairs, file_extension)
    # Random choice
    permutation = np.random.choice(len(labels), size, replace=False)
    paths_batch_1 = []
    paths_batch_2 = []
    for index in permutation:
        paths_batch_1.append(paths[index * 2])
        paths_batch_2.append(paths[index * 2 + 1])
    labels = np.asarray(labels)[permutation]
    paths_batch_1 = np.asarray(paths_batch_1)
    paths_batch_2 = np.asarray(paths_batch_2)
    # Load images
    faces1 = facenet.load_data(paths_batch_1, False, False, image_size)
    faces2 = facenet.load_data(paths_batch_2, False, False, image_size)
    # Change pixel values to 0 to 1 values
    min_pixel = min(np.min(faces1), np.min(faces2))
    max_pixel = max(np.max(faces1), np.max(faces2))
    faces1 = (faces1 - min_pixel) / (max_pixel - min_pixel)
    faces2 = (faces2 - min_pixel) / (max_pixel - min_pixel)
In the load_data function, there is a prewhiten process.
nrof_samples = len(image_paths)
images = np.zeros((nrof_samples, image_size, image_size, 3))
for i in range(nrof_samples):
    img = misc.imread(image_paths[i])
    if img.ndim == 2:
        img = to_rgb(img)
    if do_prewhiten:
        img = prewhiten(img)
    img = crop(img, do_random_crop, image_size)
    img = flip(img, do_random_flip)
    images[i,:,:,:] = img
return images
I hope an expert can point me to a hidden function in Facenet or Cleverhans that can directly export the adv images; otherwise, reversing the normalization and prewhitening seems awkward. Thank you very much.

I don't know much about the Facenet code. From your discussion, it seems you will have to save the values of min_pixel and max_pixel to reverse the normalization, and then look at the prewhiten function to see how you can reverse it. I'll email Bruno to see if he has any further comments to help you out.
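As a rough illustration of that reversal, here is a minimal NumPy sketch. It assumes prewhiten subtracts the per-image mean and divides by a standard deviation floored at 1/sqrt(number of elements), and that the per-image statistics were recorded when the images were loaded. The helper names prewhiten_with_stats and restore_uint8 are hypothetical and are not part of Facenet or Cleverhans.

```python
import numpy as np

def prewhiten_with_stats(img):
    """Prewhiten one image and also return the statistics needed to undo it."""
    mean = img.mean()
    # Assumed floor on the standard deviation, to avoid dividing by ~0.
    std_adj = max(img.std(), 1.0 / np.sqrt(img.size))
    return (img - mean) / std_adj, mean, std_adj

def restore_uint8(adv, min_pixel, max_pixel, means, std_adjs):
    """Map a [0, 1]-scaled batch (N, H, W, C) back to 8-bit pixel values."""
    # Undo the min/max scaling applied in load_testset.
    whitened = adv * (max_pixel - min_pixel) + min_pixel
    # Undo the per-image prewhitening; the stats broadcast over H, W, C.
    means = np.asarray(means).reshape(-1, 1, 1, 1)
    std_adjs = np.asarray(std_adjs).reshape(-1, 1, 1, 1)
    pixels = whitened * std_adjs + means
    # Round and clip so the result is a valid uint8 image.
    return np.clip(np.rint(pixels), 0, 255).astype(np.uint8)
```

The returned array can then be written out with any image library. Note that rounding to 8 bits slightly changes the adversarial perturbation, so it may be worth re-checking the exported images against the model.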

EDIT: Now image exporting is included in the Facenet example of Cleverhans:


What is the PyTorch substitute for this TensorFlow code?

I am having trouble converting this code from TensorFlow to PyTorch:
datagen = ImageDataGenerator(
def read_img(filename, size, path):
    img = image.load_img(os.path.join(path, filename), target_size=size)
    # convert image to array
    img = img_to_array(img) / 255
    return img
and then
corona_df = final_train_data[final_train_data['Label_2_Virus_category'] == 'COVID-19']
with_corona_augmented = []
#create a function for augmentation
def augment(name):
img = read_img(name, (255,255), train_img_dir)
i = 0
for batch in tqdm(datagen.flow(tf.expand_dims(img, 0), batch_size=32)):
if i == 20:
i =i+1
#apply the function
I tried doing
transform = transforms.Compose([transforms.Resize(255*255)
train_loader =,corona_df),transform = transform,batch_size =32)
def read_img(path):
img = train_loader()
img = np.asarray(img,dtype='int32')
img = img/255
return img
I tried continuing but got so confused by the errors.
I welcome any feedback. Tell me if I missed something.
Even a small piece of advice would help, thanks!
You can create a custom dataset to read the images. If you have a directory full of images, you can use the default ImageFolder dataset. Otherwise, if your folders are laid out differently, you can write your own custom dataset class. You can look at this link for custom datasets. The DataLoader automatically fetches data from your dataset, reads the images according to the dataset's __getitem__ function, and applies the transformations. So you don't need anything fancy to apply augmentation.
transform = transforms.Compose([ transforms.RandomAffine(20,shear=20,scale=(-0.2,0.2)),
dataset = torchvision.datasets.ImageFolder(train_img_dir, transform=transform)
loader =,batch_size =32,shuffle=True)
for batch in loader:
output = model(batch)

I'm having trouble with the transition of Tensorflow Python to Tensorflow.js in regards to image preprocessing. What am I missing?

I'm having trouble with the transition of Tensorflow Python to Tensorflow.js in regards to image preprocessing
in Python
single_coin = r"C:\temp\coins\20Saint-03o.jpg"
img = image.load_img(single_coin, target_size = (100, 100))
array = image.img_to_array(img)
x = np.expand_dims(array, axis=0)
vimage = np.vstack([x])
prediction =model.predict(vimage)
I get the correct result
[2.8914417e-05 3.5085387e-03 1.9252902e-03 6.2635467e-05 3.7389682e-03
1.2983804e-03 7.4157811e-04 1.4608903e-04 2.7099697e-06 1.1844193e-02
1.3398369e-04 9.3798796e-03 9.7308388e-05 7.3931034e-05 1.9695959e-04
9.6496813e-05 4.2653349e-04 8.7305409e-05 8.1476872e-04 4.9094640e-04
1.3498703e-04 9.6476960e-01]
However, in TensorFlow.js, with the same image passed through the following preprocessing function:
function preprocess(img) {
  let tensor = tf.browser.fromPixels(img)
  const resized = tf.image.resizeBilinear(tensor, [100, 100]).toFloat()
  const offset = tf.scalar(255.0);
  const normalized = tf.scalar(1.0).sub(resized.div(offset));
  const batched = normalized.expandDims(0)
  return batched
}
I get the following result:
I'm obviously not translating the preprocessing appropriately. Does anyone see what I'm missing?
No normalization is applied in the Python code, but the JS code normalizes the pixels. Either apply the same normalization in Python as well, or remove it from the JS code.
A similar answer has been given here
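To make the mismatch concrete, here is a small NumPy sketch that mirrors the arithmetic of the two snippets on the same pixels (resizing is left out). The function names python_pipeline and js_pipeline are illustrative only.

```python
import numpy as np

def python_pipeline(pixels):
    # Mirrors the Keras snippet: raw float pixels in [0, 255], plus a batch axis.
    return np.expand_dims(pixels.astype(np.float32), axis=0)

def js_pipeline(pixels):
    # Mirrors the TensorFlow.js snippet: 1 - pixels / 255, plus a batch axis.
    return np.expand_dims(1.0 - pixels.astype(np.float32) / 255.0, axis=0)

pixels = np.array([[[0, 128, 255]]], dtype=np.uint8)
print(python_pipeline(pixels).ravel())  # values stay on the 0..255 scale
print(js_pipeline(pixels).ravel())      # values are inverted and squeezed into 0..1
```

A model trained on the first kind of input sees a very different value range when fed the second, which is consistent with the differing predictions.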

How to read (decode) tfrecords with API

I have a custom dataset, that I then stored as tfrecord, doing
# toy example data
label = np.asarray([[1,2,3],
[4,5,6]]).reshape(2, 3, -1)
sample = np.stack((label + 200).reshape(2, 3, -1))
def bytes_feature(values):
    """Returns a TF-Feature of bytes.
    Args:
        values: A string.
    Returns:
        A TF-Feature.
    """
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values]))
def labeled_image_to_tfexample(sample_binary_string, label_binary_string):
    return tf.train.Example(features=tf.train.Features(feature={
        'sample/image': bytes_feature(sample_binary_string),
        'sample/label': bytes_feature(label_binary_string)
    }))
def _write_to_tf_record():
with tf.Graph().as_default():
image_placeholder = tf.placeholder(dtype=tf.uint16)
encoded_image = tf.image.encode_png(image_placeholder)
label_placeholder = tf.placeholder(dtype=tf.uint16)
encoded_label = tf.image.encode_png(image_placeholder)
with tf.python_io.TFRecordWriter("./toy.tfrecord") as writer:
with tf.Session() as sess:
feed_dict = {image_placeholder: sample,
label_placeholder: label}
# Encode image and label as binary strings to be written to tf_record
image_string, label_string =, encoded_label),
# Define structure of what is going to be written
file_structure = labeled_image_to_tfexample(image_string, label_string)
However I cannot read it. First I tried (based on , and
def read_tfrecord_low_level():
data_path = "./toy.tfrecord"
filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
reader = tf.TFRecordReader()
_, raw_records =
decode_protocol = {
'sample/image': tf.FixedLenFeature((), tf.int64),
'sample/label': tf.FixedLenFeature((), tf.int64)
enc_example = tf.parse_single_example(raw_records, features=decode_protocol)
recovered_image = enc_example["sample/image"]
recovered_label = enc_example["sample/label"]
return recovered_image, recovered_label
I also tried variations that cast enc_example and decode it, such as in "Unable to read from Tensorflow tfrecord file". However, when I try to evaluate them, my Python session just freezes and gives no output or traceback.
Then I tried using eager execution to see what is happening, but apparently it is only compatible with API. However as far as I understand transformations on API are made on the whole dataset. mentions that a decode function must be written, but doesn't give an example on how to do that. All the tutorials I have found are made for TFRecordReader (which doesn't work for me).
Any help (pinpointing what I am doing wrong/ explaining what is happening/ indications on how to decode tfrecords with API) is highly appreciated.
According to and is the best way to create input pipelines, so I am highly interested on learning that way.
Thanks in advance!
I am not sure why storing the encoded png causes the evaluation to not work, but here is a possible way of working around the problem. Since you mentioned that you would like to use the way of creating input pipelines, I'll show how to use it with your toy example:
label = np.asarray([[1,2,3],
[4,5,6]]).reshape(2, 3, -1)
sample = np.stack((label + 200).reshape(2, 3, -1))
First, the data has to be saved to the TFRecord file. The difference from what you did is that the image is not encoded to png.
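def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
writer = tf.python_io.TFRecordWriter("toy.tfrecord")
# tobytes() is the current NumPy name for the older tostring()
example = tf.train.Example(features=tf.train.Features(feature={
    'label_raw': _bytes_feature(tf.compat.as_bytes(label.tobytes())),
    'sample_raw': _bytes_feature(tf.compat.as_bytes(sample.tobytes()))}))
# Serialize the example into the file, then close the writer
writer.write(example.SerializeToString())
writer.close()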
What happens in the code above is that the arrays are turned into strings (1d objects) and then stored as bytes features.
Then, to read the data back using the and class:
filename = 'toy.tfrecord'
# Create a placeholder that will contain the name of the TFRecord file to use
data_path = tf.placeholder(dtype=tf.string, name="tfrecord_file")
# Create the dataset from the TFRecord file
dataset =
# Use the map function to read every sample from the TFRecord file (_read_from_tfrecord is shown below)
dataset =
# Create an iterator object that enables you to access all the samples in the dataset
iterator =, dataset.output_shapes)
label_tf, sample_tf = iterator.get_next()
# Similarly to tf.Variables, the iterators have to be initialised
iterator_init = iterator.make_initializer(dataset, name="dataset_init")
with tf.Session() as sess:
# Initialise the iterator passing the name of the TFRecord file to the placeholder, feed_dict={data_path: filename})
# Obtain the images and labels back
read_label, read_sample =[label_tf, sample_tf])
The function _read_from_tfrecord() is:
def _read_from_tfrecord(example_proto):
    feature = {
        'label_raw': tf.FixedLenFeature([], tf.string),
        'sample_raw': tf.FixedLenFeature([], tf.string)
    }
    features = tf.parse_example([example_proto], features=feature)
    # Since the arrays were stored as strings, they are now 1d
    label_1d = tf.decode_raw(features['label_raw'], tf.int64)
    sample_1d = tf.decode_raw(features['sample_raw'], tf.int64)
    # In order to make the arrays in their original shape, they have to be reshaped.
    label_restored = tf.reshape(label_1d, tf.stack([2, 3, -1]))
    sample_restored = tf.reshape(sample_1d, tf.stack([2, 3, -1]))
    return label_restored, sample_restored
Instead of hard-coding the shape [2, 3, -1], you could also store that too into the TFRecord file, but for simplicity I didn't do it.
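The idea of storing the shape next to the raw bytes can be sketched in plain NumPy, outside TensorFlow. The helpers pack_array and unpack_array are hypothetical names; in a TFRecord the shape would presumably go into an int64 list feature alongside the bytes feature.

```python
import numpy as np

def pack_array(arr):
    # Keep the raw bytes together with what is needed to rebuild the array.
    return {"raw": arr.tobytes(), "shape": list(arr.shape), "dtype": str(arr.dtype)}

def unpack_array(record):
    # The bytes decode to a flat 1d array, which is then reshaped.
    flat = np.frombuffer(record["raw"], dtype=record["dtype"])
    return flat.reshape(record["shape"])

label = np.asarray([[1, 2, 3], [4, 5, 6]]).reshape(2, 3, -1)
restored = unpack_array(pack_array(label))
```

Storing the dtype as well avoids having to hard-code tf.int64 when decoding.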
I made a little gist with a working example.
Hope this helps!

Soft attention from scratch for video sequences

I am trying to implement soft attention for video sequence classification. There are a lot of implementations and examples for NLP, so I tried following this schema, but for video [1]. Basically, it is an LSTM with an attention model in between.
My code for the attention layer is the following; I am not sure it is implemented correctly.
def attention_layer(self, input, context):
    # Input is a Tensor: [batch_size, lstm_units]
    # Input (Seq_length, batch_size, lstm_units)
    # Context is a LSTMStateTuple: [batch_size, lstm_units]. Hidden_state, output = StateTuple
    hidden_state, _ = context
    weights_y = tf.get_variable("att_weights_Y", [self.lstm_units, self.lstm_units], initializer=tf.contrib.layers.xavier_initializer())
    weights_c = tf.get_variable("att_weights_c", [self.lstm_units, self.lstm_units], initializer=tf.contrib.layers.xavier_initializer())
    z_ = []
    for feat in input:
        # Equation => M = tanh(Wc c + Wy y)
        Wcc = tf.matmul(hidden_state, weights_c)
        Wyy = tf.matmul(feat, weights_y)
        m = tf.add(Wcc, Wyy)
        m = tf.tanh(m, name='M_matrix')
        # Equation => s = softmax(m)
        s = tf.nn.softmax(m, name='softmax_att')
        z = tf.multiply(feat, s)
        # Collect the weighted features of every time step
        z_.append(z)
    out = tf.stack(z_, axis=1)
    out = tf.reduce_sum(out, 1)
    return out, s
So, adding this layer in between my LSTMs (or at the beginning of my 2 LSTMs) makes training very slow. More specifically, it takes a lot of time when I declare my optimizer:
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)
My questions are:
Is the implementation correct? If it is, is there a way to optimize it so that it trains properly?
I was not able to make it work with the seq2seq APIs. Is there any TensorFlow API that allows me to tackle this specific issue?
Does it actually make sense to use this for sequence classification?
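For comparison, here is a minimal NumPy sketch of one common soft-attention formulation; it is not necessarily what the schema above intends. It assumes an extra projection vector v (not present in the code above) that turns each time step's tanh activation into a scalar score, and it normalizes the scores across time steps rather than across features. All names are illustrative.

```python
import numpy as np

def soft_attention(features, hidden, w_y, w_c, v):
    """features: (seq_len, batch, units); hidden: (batch, units)."""
    # M = tanh(Wy y + Wc c), computed for all time steps at once.
    m = np.tanh(features @ w_y + hidden @ w_c)        # (seq_len, batch, units)
    # One scalar score per time step and batch element (assumed projection v).
    scores = m @ v                                    # (seq_len, batch)
    # Numerically stable softmax over the time axis.
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Context vector: attention-weighted sum of the features over time.
    context = (weights[..., None] * features).sum(axis=0)  # (batch, units)
    return context, weights
```

Because the whole sequence is processed with a few batched operations instead of a Python loop over time steps, the same structure in a graph framework would create far fewer nodes, which may help with slow graph construction.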

per_image_whitening in Python

I'm trying to set up TensorFlow to accept one image at a time but I believe I'm getting incorrect results because I pass a regular array without first performing tf.image.per_image_whitening() beforehand. Is there an easy way to do this in Python to an individual image without using the image queue?
Here's my code so far:
im =[0])
im = im.convert('RGB')
im = im.crop((0, 0, cifar10.IMAGE_SIZE, cifar10.IMAGE_SIZE))
(width, height) = im.size
image_array = list(im.getdata())
image_array = np.array(image_array)
image_array = image_array.reshape((1, height, width, 3))
# tf.image.per_image_whitening() should be done here
#mean = np.mean(image_array)
#stddev = np.std(image_array)
#adjusted_stddev = max(stddev, 1.0 / np.sqrt(image_array.size))
feed_dict = {"shuffle_batch:0": image_array}
# predictions always returns something close to [1, 0]
predictions =, feed_dict=feed_dict)
If you want to avoid the image queue and do the predictions one by one, I think
image_array = (image_array - mean) / adjusted_stddev
should be able to do the trick.
If you want to predict in batches, it is a little more complicated, because per_image_whitening (now per_image_standardization) only works on single images. You therefore need to apply it before forming the batch, as shown above, or set up a preprocessing step.
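A minimal NumPy sketch of that preprocessing, assuming the documented formula (x - mean) / max(stddev, 1/sqrt(number of elements)); the function names are illustrative:

```python
import numpy as np

def per_image_standardize(image):
    """Scale one image to zero mean and (approximately) unit variance."""
    image = np.asarray(image, dtype=np.float32)
    mean = image.mean()
    # Floor the standard deviation so uniform images do not divide by zero.
    adjusted_stddev = max(image.std(), 1.0 / np.sqrt(image.size))
    return (image - mean) / adjusted_stddev

def standardize_batch(images):
    """Standardize each image on its own, then stack them into a batch."""
    return np.stack([per_image_standardize(img) for img in images])
```

For a single image, standardize_batch([image]) yields an array of shape (1, height, width, 3), which matches the layout fed through feed_dict in the question.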