I am trying to implement a covariance matrix operator in PyTorch. However, the results of the NumPy implementation and my attempt are not the same, and I do not understand why.
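I define the Bessel-corrected weighted covariance matrix as:
$$\Sigma_w = \frac{1}{\sum_i w_i - 1} \sum_i w_i \,(x_i - \mu_w)(x_i - \mu_w)^T$$
I define the weighted mean as:
$$\mu_w = \frac{\sum_i w_i x_i}{\sum_i w_i}$$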
I compare the NumPy method and my method as follows:
import numpy as np
import torch
torch.set_printoptions(precision=8)
x = np.random.randn(1000, 3)*1000
w = np.abs(np.random.randn(1000))*1000
x_torch = torch.DoubleTensor(x)
w_torch = torch.DoubleTensor(w)
#calculate weighted means
m_w = torch.sum(x_torch.T*w_torch, axis=1)/torch.sum(w_torch)
m_w_np = np.average(x, axis=0, weights=w)
#calculate weighted covariance matrix
Q = (x_torch-m_w).T
cov_w = (1.0 / (torch.sum(w_torch) - 1))*(w_torch*Q).mm(Q.T)
cov_w_np = np.cov(x.T, aweights=w.T)
print("WEIGHTED MEAN")
print("NUMPY = {0}\n\nTORCH = {1}\n\nDIFFERENCE={2}".format(m_w_np, m_w.numpy(), m_w_np-m_w.numpy()))
print("")
print("")
print("WEIGHTED COVARIANCE")
print("NUMPY = {0}\n\nTORCH = {1}\n\nDIFFERENCE={2}".format(cov_w_np, cov_w.numpy(),cov_w_np-cov_w.numpy()))
This yields the following output:
WEIGHTED MEAN
NUMPY = [-21.10537208 -7.70801723 64.4034329 ]
TORCH = [-21.10537208 -7.70801723 64.4034329 ]
DIFFERENCE=[-7.10542736e-15 -1.77635684e-15 1.42108547e-14]
WEIGHTED COVARIANCE
NUMPY = [[ 989468.17457696 13620.54885133 10723.87790683]
[ 13620.54885133 953966.92486133 21407.69378841]
[ 10723.87790683 21407.69378841 1019646.81044077]]
TORCH = [[ 987952.51042915 13599.68493868 10707.45110536]
[ 13599.68493868 952505.64141296 21374.90155234]
[ 10707.45110536 21374.90155234 1018084.91875621]]
DIFFERENCE=[[1515.6641478 20.86391265 16.42680147]
[ 20.86391265 1461.28344838 32.79223607]
[ 16.42680147 32.79223607 1561.89168456]]
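One likely cause, offered as a sketch rather than a confirmed diagnosis: np.cov treats aweights as observation (reliability) weights, and with the default ddof=1 it divides by V1 - V2/V1, where V1 = sum(w) and V2 = sum(w**2), rather than by sum(w) - 1. The snippet below uses NumPy only, with freshly generated data (so the numbers are not the ones printed above), and compares both normalizations against np.cov. The names scatter, cov_question and cov_reliability are made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 3)) * 1000
w = np.abs(rng.standard_normal(1000)) * 1000

# Weighted mean and weighted scatter matrix, as in the question.
m_w = np.average(x, axis=0, weights=w)
Q = (x - m_w).T                      # shape (3, 1000)
scatter = (w * Q) @ Q.T              # sum_i w_i (x_i - m)(x_i - m)^T

v1 = w.sum()
v2 = (w ** 2).sum()

# Normalization used in the question (appropriate for frequency weights).
cov_question = scatter / (v1 - 1.0)
# Normalization np.cov applies to aweights when ddof=1.
cov_reliability = scatter / (v1 - v2 / v1)

cov_np = np.cov(x.T, aweights=w)

print(np.abs(cov_np - cov_question).max())
print(np.abs(cov_np - cov_reliability).max())
```

If this is indeed the cause, replacing torch.sum(w_torch) - 1 with the equivalent of v1 - v2 / v1 in the PyTorch code should bring the two results into agreement. Passing the weights as fweights instead would use sum(w) - 1, but np.cov requires fweights to be integers.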
How can I reuse the same network on two different inputs, something like this?
nn = get_networks()
A = nn(X_input)
B = nn(X_other_input)
C = A + B
model = ...
So that all the weights in nn are the same and only the input branches are different?
In pure TensorFlow you do this with
with tf.variable_scope('something', reuse=tf.AUTO_REUSE):
    # define stuff here
and by carefully naming your layers.
But basically you cannot construct nn in the first place, because you cannot pass a non-called layer to a layer call!
For example:
In [21]: tf.keras.layers.Dense(16)(tf.keras.layers.Dense(8))
...
AttributeError: 'Dense' object has no attribute 'shape'
UPDATE:
I have been accomplishing this by creating an uncompiled model as the sub-network. That "model" can then be passed to other network creation functions. For example, if you have a functional equation that you want to solve, you might approximate the function with a network and then pass that network to the function, which is itself a network.
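A minimal sketch of the sub-network-as-model idea, assuming TensorFlow 2.x with tf.keras. The layer sizes, the input shape and the names shared_net, x_a and x_b are arbitrary choices for illustration, not taken from the original code.

```python
import numpy as np
import tensorflow as tf

# Hypothetical shared sub-network; it is never compiled on its own.
shared_net = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(4),
])

x_a = tf.keras.Input(shape=(3,))
x_b = tf.keras.Input(shape=(3,))

# Calling the same model object twice reuses the same weights.
a = shared_net(x_a)
b = shared_net(x_b)
c = tf.keras.layers.Add()([a, b])

model = tf.keras.Model(inputs=[x_a, x_b], outputs=c)

# Feeding identical data into both branches should give twice the
# output of the shared sub-network, since both branches share weights.
x = np.random.rand(5, 3).astype("float32")
combined = model.predict([x, x], verbose=0)
single = shared_net.predict(x, verbose=0)
print(np.abs(combined - 2 * single).max())
```

The combined model holds only one set of Dense weights, so training through either branch updates the same variables.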
It depends how you would like to reuse it, but the idea is to save your layers once initialized, and use them multiple times later.
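import tensorflow as tf
import tensorflow.keras as keras
import numpy as np

# Layers are stored here once created, so that both networks can share them.
shared_layers = {}

def net1(inputs):
    # Create the layers once and remember them.
    shared_layers["l1"] = keras.layers.Dense(10)
    shared_layers["l2"] = keras.layers.Dense(10)
    return shared_layers["l1"](shared_layers["l2"](keras.layers.Flatten()(inputs)))

def net2(inputs):
    # Reuse the layers (and therefore the weights) created in net1.
    return shared_layers["l1"](shared_layers["l2"](keras.layers.Flatten()(inputs)))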
input1 = keras.layers.Input((2, 2))
input2 = keras.layers.Input((2, 2))
model1 = keras.Model(inputs=input1, outputs=net1(input1))
model1.compile(loss=keras.losses.mean_squared_error, optimizer=keras.optimizers.Adam())
model2 = keras.Model(inputs=input2, outputs=net2(input2))
model2.compile(loss=keras.losses.mean_squared_error, optimizer=keras.optimizers.Adam())
x = np.random.randint(0, 100, (50, 2, 2))
m1 = model1.predict(x)
m2 = model2.predict(x)
print(x[0])
print(m1[0])
print(m2[0])
Outputs are identical:
[ 10.114908 -13.074531 -8.671929 -59.03201 55.389366 1.3610549
-38.051434 8.355987 7.5310936 -27.717983 ]
[ 10.114908 -13.074531 -8.671929 -59.03201 55.389366 1.3610549
-38.051434 8.355987 7.5310936 -27.717983 ]
import numpy as np
import tensorflow as tf
from keras import backend as K
sess = tf.InteractiveSession()
box_scores1 = tf.constant([[[ 9.188682, 11.484599 ],
[10.06533, 7.557296 ]],
[[10.099248, 10.591225 ],
[10.592823 , 7.8770704]]])
box_scores2 = tf.random_normal([2,2,2], mean=10, stddev=1, dtype=tf.float32, seed = 1)
box_class_scores1 = K.max(box_scores1, axis=-1)
box_class_scores2 = K.max(box_scores2, axis=-1)
print(box_scores1.eval())
print(box_scores2.eval())
print(box_class_scores1.eval())
print(box_class_scores2.eval())
Output:
[[[ 9.188682 11.484599 ]
[10.06533 7.557296 ]]
[[10.099248 10.591225 ]
[10.592823 7.8770704]]]
[[[ 9.188682 11.484599 ]
[10.06533 7.557296 ]]
[[10.099248 10.591225 ]
[10.592823 7.8770704]]]
[[11.484599 10.06533 ]
[10.591225 10.592823]]
[[10.242094 10.515779]
[12.083789 11.397354]]
As we can see, the values in box_scores1 and box_scores2 are the same, but the results of the max operation differ. How can the values of box_class_scores1 and box_class_scores2 be different?
Your problem has nothing to do with the max function; it comes from a misunderstanding of how TensorFlow works. Most of its operations are symbolic, so tf.random_normal does not produce random numbers when you call it. It creates a symbolic op that samples from a normal distribution with the given mean and standard deviation.
Each time you evaluate this op, it generates new values. Your first eval of box_scores2 looks fine, but evaluating box_class_scores2 draws a fresh sample and passes that to max, so the result differs from the one computed from the constant tensor.
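A short sketch of one way to get consistent values, assuming TensorFlow 2.x with the tf.compat.v1 graph-mode API (under TensorFlow 1.x the same calls are available directly on tf). Fetching the random tensor and its max in a single run call evaluates the random op once, so the two results belong together. The names scores and class_scores are chosen for this illustration.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

scores = tf1.random_normal([2, 2, 2], mean=10, stddev=1, seed=1)
class_scores = tf.reduce_max(scores, axis=-1)

with tf1.Session() as sess:
    # One run call: the random op executes once and both fetches
    # are computed from the same sample.
    scores_val, class_scores_val = sess.run([scores, class_scores])
    # A second run call draws a new sample from the same op.
    scores_again = sess.run(scores)

print(scores_val)
print(class_scores_val)
print(scores_again)
```

Alternatively, the sampled values can be stored in a variable or a constant if they need to stay fixed across several run calls.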
I'm running the wide_deep.py script for linear regression in TensorFlow.
As part of the process, I have also cloned the models directory.
But I'm getting an error: AttributeError: 'list' object has no attribute 'model_dir'.
If I hard-code this particular variable, I get other errors, such as AttributeError: 'list' object has no attribute 'data_dir', and so on.
Code:
"""Example code for TensorFlow Wide & Deep Tutorial using tf.estimator API."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import shutil
from absl import app as absl_app
from absl import flags
import tensorflow as tf # pylint: disable=g-bad-import-order
from official.utils.flags import core as flags_core
from official.utils.logs import hooks_helper
from official.utils.misc import model_helpers
_CSV_COLUMNS = [
'age', 'workclass', 'fnlwgt', 'education', 'education_num',
'marital_status', 'occupation', 'relationship', 'race', 'gender',
'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
'income_bracket'
]
_CSV_COLUMN_DEFAULTS = [[0], [''], [0], [''], [0], [''], [''], [''], [''], [''],
[0], [0], [0], [''], ['']]
_NUM_EXAMPLES = {
'train': 32561,
'validation': 16281,
}
LOSS_PREFIX = {'wide': 'linear/', 'deep': 'dnn/'}
def define_wide_deep_flags():
  """Add supervised learning flags, as well as wide-deep model type."""
  flags_core.define_base()
  flags.adopt_module_key_flags(flags_core)
  flags.DEFINE_enum(
      name="model_type", short_name="mt", default="wide_deep",
      enum_values=['wide', 'deep', 'wide_deep'],
      help="Select model topology.")
  flags_core.set_defaults(data_dir='/tmp/census_data',
                          model_dir='/tmp/census_model',
                          train_epochs=40,
                          epochs_between_evals=2,
                          batch_size=40)
def build_model_columns():
  """Builds a set of wide and deep feature columns."""
  # Continuous columns
  age = tf.feature_column.numeric_column('age')
  education_num = tf.feature_column.numeric_column('education_num')
  capital_gain = tf.feature_column.numeric_column('capital_gain')
  capital_loss = tf.feature_column.numeric_column('capital_loss')
  hours_per_week = tf.feature_column.numeric_column('hours_per_week')
  education = tf.feature_column.categorical_column_with_vocabulary_list(
      'education', [
          'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
          'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
          '5th-6th', '10th', '1st-4th', 'Preschool', '12th'])
  marital_status = tf.feature_column.categorical_column_with_vocabulary_list(
      'marital_status', [
          'Married-civ-spouse', 'Divorced', 'Married-spouse-absent',
          'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed'])
  relationship = tf.feature_column.categorical_column_with_vocabulary_list(
      'relationship', [
          'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried',
          'Other-relative'])
  workclass = tf.feature_column.categorical_column_with_vocabulary_list(
      'workclass', [
          'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov',
          'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked'])
  # To show an example of hashing:
  occupation = tf.feature_column.categorical_column_with_hash_bucket(
      'occupation', hash_bucket_size=1000)
  # Transformations.
  age_buckets = tf.feature_column.bucketized_column(
      age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
  # Wide columns and deep columns.
  base_columns = [
      education, marital_status, relationship, workclass, occupation,
      age_buckets,
  ]
  crossed_columns = [
      tf.feature_column.crossed_column(
          ['education', 'occupation'], hash_bucket_size=1000),
      tf.feature_column.crossed_column(
          [age_buckets, 'education', 'occupation'], hash_bucket_size=1000),
  ]
  wide_columns = base_columns + crossed_columns
  deep_columns = [
      age,
      education_num,
      capital_gain,
      capital_loss,
      hours_per_week,
      tf.feature_column.indicator_column(workclass),
      tf.feature_column.indicator_column(education),
      tf.feature_column.indicator_column(marital_status),
      tf.feature_column.indicator_column(relationship),
      # To show an example of embedding
      tf.feature_column.embedding_column(occupation, dimension=8),
  ]
  return wide_columns, deep_columns
def build_estimator(model_dir, model_type):
  """Build an estimator appropriate for the given model type."""
  wide_columns, deep_columns = build_model_columns()
  hidden_units = [100, 75, 50, 25]
  # Create a tf.estimator.RunConfig to ensure the model is run on CPU, which
  # trains faster than GPU for this model.
  run_config = tf.estimator.RunConfig().replace(
      session_config=tf.ConfigProto(device_count={'GPU': 0}))
  if model_type == 'wide':
    return tf.estimator.LinearClassifier(
        model_dir=model_dir,
        feature_columns=wide_columns,
        config=run_config)
  elif model_type == 'deep':
    return tf.estimator.DNNClassifier(
        model_dir=model_dir,
        feature_columns=deep_columns,
        hidden_units=hidden_units,
        config=run_config)
  else:
    return tf.estimator.DNNLinearCombinedClassifier(
        model_dir=model_dir,
        linear_feature_columns=wide_columns,
        dnn_feature_columns=deep_columns,
        dnn_hidden_units=hidden_units,
        config=run_config)
def input_fn(data_file, num_epochs, shuffle, batch_size):
  """Generate an input function for the Estimator."""
  assert tf.gfile.Exists(data_file), (
      '%s not found. Please make sure you have run data_download.py and '
      'set the --data_dir argument to the correct path.' % data_file)
  def parse_csv(value):
    print('Parsing', data_file)
    columns = tf.decode_csv(value, record_defaults=_CSV_COLUMN_DEFAULTS)
    features = dict(zip(_CSV_COLUMNS, columns))
    labels = features.pop('income_bracket')
    return features, tf.equal(labels, '>50K')
  # Extract lines from input files using the Dataset API.
  dataset = tf.data.TextLineDataset(data_file)
  if shuffle:
    dataset = dataset.shuffle(buffer_size=_NUM_EXAMPLES['train'])
  dataset = dataset.map(parse_csv, num_parallel_calls=5)
  # We call repeat after shuffling, rather than before, to prevent separate
  # epochs from blending together.
  dataset = dataset.repeat(num_epochs)
  dataset = dataset.batch(batch_size)
  return dataset
def export_model(model, model_type, export_dir):
  """Export to SavedModel format.
  Args:
    model: Estimator object
    model_type: string indicating model type. "wide", "deep" or "wide_deep"
    export_dir: directory to export the model.
  """
  wide_columns, deep_columns = build_model_columns()
  if model_type == 'wide':
    columns = wide_columns
  elif model_type == 'deep':
    columns = deep_columns
  else:
    columns = wide_columns + deep_columns
  feature_spec = tf.feature_column.make_parse_example_spec(columns)
  example_input_fn = (
      tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))
  model.export_savedmodel(export_dir, example_input_fn)
def run_wide_deep(flags_obj):
  """Run Wide-Deep training and eval loop.
  Args:
    flags_obj: An object containing parsed flag values.
  """
  # Clean up the model directory if present
  shutil.rmtree(flags_obj.model_dir, ignore_errors=True)
  model = build_estimator(flags_obj.model_dir, flags_obj.model_type)
  train_file = os.path.join(flags_obj.data_dir, 'adult.data')
  test_file = os.path.join(flags_obj.data_dir, 'adult.test')
  # Train and evaluate the model every `flags.epochs_between_evals` epochs.
  def train_input_fn():
    return input_fn(
        train_file, flags_obj.epochs_between_evals, True, flags_obj.batch_size)
  def eval_input_fn():
    return input_fn(test_file, 1, False, flags_obj.batch_size)
  loss_prefix = LOSS_PREFIX.get(flags_obj.model_type, '')
  train_hooks = hooks_helper.get_train_hooks(
      flags_obj.hooks, batch_size=flags_obj.batch_size,
      tensors_to_log={'average_loss': loss_prefix + 'head/truediv',
                      'loss': loss_prefix + 'head/weighted_loss/Sum'})
  # Train and evaluate the model every `flags.epochs_between_evals` epochs.
  for n in range(flags_obj.train_epochs // flags_obj.epochs_between_evals):
    model.train(input_fn=train_input_fn, hooks=train_hooks)
    results = model.evaluate(input_fn=eval_input_fn)
    # Display evaluation metrics
    print('Results at epoch', (n + 1) * flags_obj.epochs_between_evals)
    print('-' * 60)
    for key in sorted(results):
      print('%s: %s' % (key, results[key]))
    if model_helpers.past_stop_threshold(
        flags_obj.stop_threshold, results['accuracy']):
      break
  # Export the model
  if flags_obj.export_dir is not None:
    export_model(model, flags_obj.model_type, flags_obj.export_dir)
def main(_):
  run_wide_deep(flags.FLAGS)
if __name__ == '__main__':
  tf.logging.set_verbosity(tf.logging.INFO)
  define_wide_deep_flags()
  absl_app.run(main)
Hunter, I tried to run it without hard-coding, but I still faced issues with the attributes, so I tried hard-coding them to avoid this.
But the issue is resolved now.
I cloned the directory again and, instead of copying wide_deep.py to another directory and running it from there (which is what I was doing before), I ran it directly from its original directory, and now it works fine.
I am currently trying to do some work in both Keras and TensorFlow, and I stumbled upon a small thing I do not understand. In the code below, I predict the responses of a network either explicitly via a TensorFlow session or by using the model's predict_on_batch function.
import os
import keras
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model
# Try to standardize output
np.random.seed(1)
tf.set_random_seed(1)
# Building the model
inputs = Input(shape=(224,224,3))
base_model = keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', \
input_tensor=inputs, input_shape=(224, 224, 3))
x = base_model.get_layer("fc2").output
x = Dropout(0.5, name='model_fc_dropout')(x)
x = Dense(2048, activation='sigmoid', name='final_fc')(x)
x = Dropout(0.5, name='final_fc_dropout')(x)
predictions = Dense(1, activation='sigmoid', name='fcout')(x)
model = Model(outputs=predictions, inputs=inputs)
##################################################################
model.compile(loss='binary_crossentropy',
optimizer=tf.train.MomentumOptimizer(learning_rate=5e-4, momentum=0.9),
metrics=['accuracy'])
image_batch = np.random.random((64,224,224,3))
# Outputs predicted by TF
outs = [predictions]
feed_dict={inputs:image_batch, K.learning_phase():0}
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    outputs = sess.run(outs, feed_dict)[0]
print(outputs.flatten())
# Outputs predicted by Keras
outputs = model.predict_on_batch(image_batch)
print(outputs.flatten())
My issue is that I get two different results, even though I tried to remove every source of randomness by setting the seeds to 1 and running the operations on the CPU. Even then, I get the following results:
[ 0.26079229 0.26078743 0.26079154 0.26079673 0.26078942 0.26079443
0.26078886 0.26079088 0.26078972 0.26078728 0.26079121 0.26079452
0.26078513 0.26078424 0.26079014 0.26079312 0.26079521 0.26078743
0.26078558 0.26078537 0.26078674 0.26079136 0.26078632 0.26077667
0.26079312 0.26078999 0.26079065 0.26078704 0.26078928 0.26078624
0.26078892 0.26079202 0.26079065 0.26078689 0.26078963 0.26078749
0.26078817 0.2607986 0.26078528 0.26078412 0.26079187 0.26079246
0.26079226 0.26078457 0.26078099 0.26078072 0.26078376 0.26078475
0.26078326 0.26079389 0.26079792 0.26078579 0.2607882 0.2607961
0.26079237 0.26078218 0.26078638 0.26079753 0.2607787 0.26078618
0.26078096 0.26078594 0.26078215 0.26079002]
and
[ 0.25331706 0.25228402 0.2534174 0.25033095 0.24851511 0.25099936
0.25240892 0.25139931 0.24948661 0.25183493 0.25104815 0.25164133
0.25214729 0.25265765 0.25128496 0.25249782 0.25247478 0.25314394
0.25014618 0.25280923 0.2526398 0.25381723 0.25138992 0.25072744
0.25069866 0.25307226 0.25063521 0.25133523 0.25050756 0.2536433
0.25164688 0.25054023 0.25117773 0.25352773 0.25157067 0.25173825
0.25234801 0.25182116 0.25284401 0.25297374 0.25079012 0.25146705
0.25401884 0.25111189 0.25192681 0.25252578 0.25039044 0.2525287
0.25165257 0.25357804 0.25001243 0.2495154 0.2531895 0.25270832
0.25305843 0.25064403 0.25180396 0.25231308 0.25224048 0.25068772
0.25212681 0.24812476 0.25027585 0.25243458]
Does anybody have an idea what could be going on in the background that could change the results? (These results do not change if one runs them again)
The difference gets even bigger if the network runs on a GPU (Titan X), e.g. the second output is:
[ 0.3302682 0.33054096 0.32677746 0.32830611 0.32972822 0.32807562
0.32850873 0.33161065 0.33009702 0.32811245 0.3285495 0.32966742
0.33050382 0.33156893 0.3300975 0.3298254 0.33350074 0.32991216
0.32990077 0.33203539 0.32692945 0.33036903 0.33102706 0.32648
0.32933888 0.33161271 0.32976636 0.33252293 0.32859167 0.33013415
0.33080408 0.33102706 0.32994759 0.33150592 0.32881773 0.33048317
0.33040857 0.32924038 0.32986534 0.33131596 0.3282761 0.3292698
0.32879189 0.33186096 0.32862625 0.33067161 0.329018 0.33022234
0.32904804 0.32891914 0.33122411 0.32900628 0.33088413 0.32931429
0.3268061 0.32924181 0.32940546 0.32860965 0.32828435 0.3310211
0.33098024 0.32997403 0.33025959 0.33133432]
whereas in the first one, the differences only occur in the 5th and later decimal places:
[ 0.26075357 0.26074868 0.26074538 0.26075155 0.260755 0.26073951
0.26074919 0.26073971 0.26074231 0.26075247 0.2607362 0.26075858
0.26074955 0.26074123 0.26074299 0.26074946 0.26074076 0.26075014
0.26074076 0.26075229 0.26075041 0.26074776 0.26075897 0.26073995
0.260746 0.26074466 0.26073912 0.26075709 0.26075712 0.26073799
0.2607322 0.26075566 0.26075059 0.26073873 0.26074558 0.26074558
0.26074359 0.26073721 0.26074392 0.26074731 0.26074862 0.26074174
0.26074126 0.26074588 0.26073804 0.26074919 0.26074269 0.26074606
0.26075307 0.2607446 0.26074025 0.26074648 0.26074952 0.26073608
0.26073566 0.26073873 0.26074576 0.26074475 0.26074636 0.26073411
0.2607542 0.26074755 0.2607449 0.2607407 ]
The results differ because the variables are initialized differently.
TensorFlow uses this init_op to initialize the variables:
sess.run(init_op)
But Keras uses its own initialization inside its model class, in its own session, not the init_op defined in your code.
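To illustrate the point, here is a small sketch, assuming TensorFlow 2.x with the tf.compat.v1 graph-mode API. Variable values live in a session: values assigned in one session (standing in here for weights that Keras has loaded or initialized in its own session) are not visible in a fresh session that only runs an init op. The variable w and the ops load_op and init_op are made up for this example and are not part of the code in the question.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

w = tf1.get_variable("w", shape=[3], initializer=tf1.zeros_initializer())
# Stands in for loading pretrained weights into the variable.
load_op = w.assign([1.0, 2.0, 3.0], read_value=False)
init_op = tf1.global_variables_initializer()

with tf1.Session() as sess_a:
    sess_a.run(init_op)
    sess_a.run(load_op)
    in_a = sess_a.run(w)   # holds the assigned values

with tf1.Session() as sess_b:
    sess_b.run(init_op)
    in_b = sess_b.run(w)   # holds only what the initializer produced

print(in_a)
print(in_b)
```

If that is what happens in the question, running the tensors in the session Keras already uses (K.get_session() in the Keras backend API) instead of a fresh tf.Session followed by init_op might make the two outputs agree. This suggestion has not been tested against the code above.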