Running the default Temporal Fusion Transformer dataset gives a shape error - tensorflow

I ran the default Temporal Fusion Transformer code, downloaded from GitHub, in Google Colab. After cloning the repository, step 2 fails, so there is no way to test training:
python3 -m script_train_fixed_params volatility outputs yes
The problem is the shape error shown below.
Computing best validation loss
Computing test loss
/usr/local/lib/python3.7/dist-packages/keras/engine/training_v1.py:2079: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
updates=self.state_updates,
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/content/drive/MyDrive/tft_tf2/script_train_fixed_params.py", line 239, in <module>
use_testing_mode=True) # Change to false to use original default params
File "/content/drive/MyDrive/tft_tf2/script_train_fixed_params.py", line 156, in main
targets = data_formatter.format_predictions(output_map["targets"])
File "/content/drive/MyDrive/tft_tf2/data_formatters/volatility.py", line 183, in format_predictions
output[col] = self._target_scaler.inverse_transform(predictions[col])
File "/usr/local/lib/python3.7/dist-packages/sklearn/preprocessing/_data.py", line 1022, in inverse_transform
force_all_finite="allow-nan",
File "/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py", line 773, in check_array
"if it contains a single sample.".format(array)
ValueError: Expected 2D array, got 1D array instead:
array=[-1.43120418 1.58885804 0.28558148 ... -1.50945972 -0.16713021
-0.57365613].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
I tried to modify the code that shapes the prediction DataFrame in data_formatters/volatility.py (line 183, in format_predictions), since I guessed that's where the problem arises, but I couldn't get it to work.

You have to change line 183 in volatility.py to
output[col] = self._target_scaler.inverse_transform(predictions[col].values.reshape(-1, 1))
and line 216 in electricity.py to
sliced_copy[col] = target_scaler.inverse_transform(sliced_copy[col].values.reshape(-1, 1))
Afterwards the electricity example works fine, and I expect the same fix applies to volatility.
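For context, here is a minimal standalone sketch (not the TFT code itself) of why the reshape is needed: recent scikit-learn versions require a 2D array of shape (n_samples, n_features) for inverse_transform, while predictions[col] is a 1D column. The numbers below are made up.
import numpy as np
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(np.arange(10, dtype=float).reshape(-1, 1))  # fitted on a single-feature column

preds = np.array([-1.43, 1.59, 0.29])  # 1D, like predictions[col].values
# scaler.inverse_transform(preds)      # raises: Expected 2D array, got 1D array instead
restored = scaler.inverse_transform(preds.reshape(-1, 1)).ravel()  # reshape to a column, then flatten back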

Related

Python matplotlib/pandas fails with "OverflowError: value too large to convert to npy_uint32"

I am trying to run this variant calling pipeline with 144 samples, so the resulting files are quite big. I managed to get almost to the end, but the last rule (plots_stats) fails with OverflowError: value too large to convert to npy_uint32. This is a Python script that plots from a gzipped TSV file. I guess I just have too many rows in my calls.tsv.gz to be handled. The complete error log is:
Traceback (most recent call last):
File "/[PATH]/workflow_var_calling/.snakemake/scripts/tmp10j_ba31.plot-depths.py", line 16, in <module>
sample_info = calls.loc[:, samples].stack([0, 1]).unstack().reset_index(1, drop=False)
File "/[PATH]/workflow_var_calling/.snakemake/conda/5e32b1f022a698680d2667be14f8a58a/lib/python3.6/site-packages/pandas/core/series.py", line 2899, in unstack
return unstack(self, level, fill_value)
File "/[PATH]/workflow_var_calling/.snakemake/conda/5e32b1f022a698680d2667be14f8a58a/lib/python3.6/site-packages/pandas/core/reshape/reshape.py", line 501, in unstack
constructor=obj._constructor_expanddim)
File "/[PATH]/workflow_var_calling/.snakemake/conda/5e32b1f022a698680d2667be14f8a58a/lib/python3.6/site-packages/pandas/core/reshape/reshape.py", line 116, in __init__
self.index = index.remove_unused_levels()
File "/[PATH]/workflow_var_calling/.snakemake/conda/5e32b1f022a698680d2667be14f8a58a/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 1494, in remove_unused_levels
uniques = algos.unique(lab)
File "/[PATH]/workflow_var_calling/.snakemake/conda/5e32b1f022a698680d2667be14f8a58a/lib/python3.6/site-packages/pandas/core/algorithms.py", line 367, in unique
table = htable(len(values))
File "pandas/_libs/hashtable_class_helper.pxi", line 937, in pandas._libs.hashtable.Int64HashTable.__cinit__
OverflowError: value too large to convert to npy_uint32
Any ideas?
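For context, here is a toy illustration (with made-up sample and field names) of what that stack/unstack line does; the real calls frame is of course far larger:
import pandas as pd

cols = pd.MultiIndex.from_product([['sampleA', 'sampleB'], ['DP', 'GQ']])  # (sample, field) column levels
calls = pd.DataFrame([[10, 99, 12, 88], [11, 97, 13, 85]], columns=cols)
samples = ['sampleA', 'sampleB']

# rows become (variant, sample) pairs; the per-sample fields become the columns
sample_info = calls.loc[:, samples].stack([0, 1]).unstack().reset_index(1, drop=False)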

Solving MINLP with PYOMO/PYSP

Team,
I am currently working on a nonlinear stochastic optimization problem. So far the toolbox has been really helpful, thank you! However, adding a nonlinear constraint causes an error. I use the Gurobi solver. The problem results from the following constraint:
def max_pcr_power_rule(model, t):
    if t == 0:
        return 0 <= battery.P_bat_max - model.P_sc_max[t+1] - model.P_pcr
    else:
        return model.P_trade_c[t+1] + np.sqrt(-2*np.log(rob_opt.max_vio)) \
            * sum(model.U_max_pow[t,i]**2 for i in set_sim.tme_dat_stp)**(0.5) \
            <= battery.P_bat_max - model.P_sc_max[t+1] - model.P_pcr

model.max_pcr_power = Constraint(set_sim.tme_dat_stp, rule=max_pcr_power_rule)
I receive this error message:
Initializing extensive form algorithm for stochastic programming problems.
Exception encountered. Scenario tree manager attempting to shut down.
Traceback (most recent call last):
File "C:\Users\theil\Anaconda3\Scripts\runef-script.py", line 5, in <module>
sys.exit(pyomo.pysp.ef_writer_script.main())
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 863, in main
traceback=options.traceback)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\util\misc.py", line 344, in launch_command
rc = command(options, *cmd_args, **cmd_kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 748, in runef
ef.solve()
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 430, in solve
**solve_kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\parallel\manager.py", line 122, in queue
return self._perform_queue(ah, *args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\parallel\local.py", line 59, in _perform_queue
results = opt.solve(*args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\base\solvers.py", line 599, in solve
self._presolve(*args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\solvers\plugins\solvers\GUROBI.py", line 224, in _presolve
ILMLicensedSystemCallSolver._presolve(self, *args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\solver\shellcmd.py", line 196, in _presolve
OptSolver._presolve(self, *args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\base\solvers.py", line 696, in _presolve
**kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\base\solvers.py", line 767, in _convert_problem
**kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\base\convert.py", line 110, in convert_problem
problem_files, symbol_map = converter.apply(*tmp, **tmpkw)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\solvers\plugins\converter\model.py", line 96, in apply
io_options=io_options)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\core\base\block.py", line 1681, in write
io_options)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\repn\plugins\cpxlp.py", line 176, in call
include_all_variable_bounds=include_all_variable_bounds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\repn\plugins\cpxlp.py", line 719, in _print_model_LP
"with nonlinear terms." % (constraint_data.name))
ValueError: Cannot write legal LP file. Constraint '1.max_pcr_power[1]' has a body with nonlinear terms.
I thought that the problem might lie in the nested formulation of the constraint, i.e. the combination of sum and exponential terms, so I put the sum() term into a separate variable. This didn't change the nonlinear character of the constraint, and the error stayed the same. My other suspicion was that the problem lies with the Gurobi solver, so I tried to use Ipopt instead, which produced the following error message:
Error evaluating constraint 1: can't evaluate pow'(0,0.5).
ERROR: Solver (ipopt) returned non-zero return code (1)
ERROR: See the solver log above for diagnostic information.
Exception encountered. Scenario tree manager attempting to shut down.
Traceback (most recent call last):
File "C:\Users\theil\Anaconda3\Scripts\runef-script.py", line 5, in <module>
sys.exit(pyomo.pysp.ef_writer_script.main())
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 863, in main
traceback=options.traceback)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\util\misc.py", line 344, in launch_command
rc = command(options, *cmd_args, **cmd_kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 748, in runef
ef.solve()
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\pysp\ef_writer_script.py", line 434, in solve
**solve_kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\parallel\manager.py", line 122, in queue
return self._perform_queue(ah, *args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\parallel\local.py", line 59, in _perform_queue
results = opt.solve(*args, **kwds)
File "C:\Users\theil\Anaconda3\lib\site-packages\pyomo\opt\base\solvers.py", line 626, in solve
"Solver (%s) did not exit normally" % self.name)
pyutilib.common._exceptions.ApplicationError: Solver (ipopt) did not exit normally
I am now wondering whether my mistake lies in the formulation of the constraint or in the way I use the solver; otherwise I will have to simplify the problem to make it solvable.
I would be glad if you could point me in the right direction. Thank you!
Best regards
Philipp
As Erwin mentioned in the comment, Gurobi is generally not intended for nonlinear problems.
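For what it's worth, here is a minimal self-contained sketch (not the poster's model) of handing a nonlinear Pyomo model to an NLP-capable solver via SolverFactory; the solver name and bounds are assumptions and depend on what is installed locally:
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, SolverFactory, sqrt

m = ConcreteModel()
m.x = Var(bounds=(1e-6, 10), initialize=1.0)   # bounded away from 0 so sqrt'(x) stays finite,
m.y = Var(bounds=(0, 10), initialize=1.0)      # which is the kind of issue behind pow'(0,0.5)

m.obj = Objective(expr=m.x + m.y)              # minimized by default
m.con = Constraint(expr=sqrt(m.x) + m.y >= 2)  # nonlinear constraint body

opt = SolverFactory('ipopt')  # an NLP solver; the LP writer used for Gurobi rejects nonlinear bodies
results = opt.solve(m, tee=False)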

Using graph_metrics.py with a saved graph

I want to view statistics of my model by saving my graph to a file and then running graph_metrics.py.
I have tried a few different things to write the file; my best effort is:
tf.train.write_graph( session.graph_def, ".", "my_graph", as_text=True )
But here's what happens:
$ python ./util/graph_metrics.py --noinput_binary --graph my_graph
Traceback (most recent call last):
File "./util/graph_metrics.py", line 137, in <module>
tf.app.run()
File ".virtualenv/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "./util/graph_metrics.py", line 85, in main
FLAGS.batch_size)
File "./util/graph_metrics.py", line 109, in calculate_graph_metrics
input_tensor = sess.graph.get_tensor_by_name(input_layer)
File ".virtualenv/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2531, in get_tensor_by_name
return self.as_graph_element(name, allow_tensor=True, allow_operation=False)
File ".virtualenv/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2385, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File ".virtualenv/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2427, in _as_graph_element_locked
"graph." % (repr(name), repr(op_name)))
KeyError: "The name 'Mul:0' refers to a Tensor which does not exist. The operation, 'Mul', does not exist in the graph."
Is there a complete working example of saving a graph, then analyzing it with graph_metrics.py?
This process seems to involve a magic incantation that I haven't yet discovered.
The error you're hitting is because you need to specify the name of your own input node with --input_layer= (it just defaults to Mul:0 because that's what we use in one of our Inception models):
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/graph_metrics.py#L51
The graph_metrics script is still very much a work in progress unfortunately, and you may hit problems with shape inference, but hopefully this should get you past the initial hurdle.
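For example, something along these lines, where input_placeholder:0 is just a stand-in for whatever your graph's input tensor is actually called:
$ python ./util/graph_metrics.py --noinput_binary --graph my_graph --input_layer=input_placeholder:0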

IPython won't start anymore after using os.dup2()

I was just trying out the os.dup2() function to redirect output and typed os.dup2(3, 1), which my IPython (Python 2.7) didn't seem to like.
It crashed, and now it won't start again, yielding this error:
Traceback (most recent call last):
File "/usr/bin/ipython", line 8, in <module>
launch_new_instance()
File "/usr/lib/python2.7/dist-packages/IPython/frontend/terminal/ipapp.py", line 402, in launch_new_instance
app.initialize()
File "<string>", line 2, in initialize
File "/usr/lib/python2.7/dist-packages/IPython/config/application.py", line 84, in catch_config_error
return method(app, *args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/frontend/terminal/ipapp.py", line 312, in initialize
self.init_shell()
File "/usr/lib/python2.7/dist-packages/IPython/frontend/terminal/ipapp.py", line 332, in init_shell
ipython_dir=self.ipython_dir)
File "/usr/lib/python2.7/dist-packages/IPython/config/configurable.py", line 318, in instance
inst = cls(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/IPython/frontend/terminal/interactiveshell.py", line 183, in __init__
user_module=user_module, custom_exceptions=custom_exceptions
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 456, in __init__
self.init_readline()
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 1777, in init_readline
self.refill_readline_hist()
File "/usr/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 1789, in refill_readline_hist
include_latest=True):
File "/usr/lib/python2.7/dist-packages/IPython/core/history.py", line 256, in get_tail
return reversed(list(cur))
DatabaseError: database disk image is malformed
If you suspect this is an IPython bug, please report it at:
https://github.com/ipython/ipython/issues
or send an email to the mailing list at ipython-dev@scipy.org
You can print a more detailed traceback right now with "%tb", or use "%debug"
to interactively debug it.
Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
c.Application.verbose_crash=True
Can anyone help me with that?
Reposting as an answer:
It looks like fd 3 was your IPython history database, and you redirected stdout to it and corrupted it.
To get it to start again, remove or rename ~/.ipython/profile_default/history.sqlite (or ~/.config/ipython/profile_default/history.sqlite on certain IPython versions on Linux).
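For reference, a minimal sketch of redirecting stdout with os.dup2 without clobbering whatever happens to be open on another descriptor (the file name out.log is just an example):
import os, sys

fd = os.open("out.log", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)  # open an explicit target file
saved = os.dup(1)   # keep a copy of the original stdout fd
os.dup2(fd, 1)      # fd 1 (stdout) now writes to out.log
print("this goes to out.log")
sys.stdout.flush()  # flush before switching the descriptor back
os.dup2(saved, 1)   # restore the original stdout
os.close(fd)
os.close(saved)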

Numpy Load OverflowError: length too large

I have an algorithm that runs through a dataset and creates a scipy sparse matrix, which is then saved using:
numpy.savez
on a file opened with:
open(file, 'wb')
The matrix can take up a considerable amount of disk space (about 20 GB for a 30-day run).
After that, those matrices are loaded in other applications like this:
file = open(path_to_file, 'rb')
matrix = load(file)
data = matrix['arr_0']
ind = matrix['arr_1']
indptr = matrix['arr_2']
For a 10-day dataset this worked fine.
For the 30-day dataset the matrix was also created and saved successfully,
but when trying to load it I got this error:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/ubuntu/recsys/Scripts/Neighborhood/s3_CRM_neighborhood.py", line 76, in <module>
data = matrix['arr_0']
File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 241, in __getitem__
return format.read_array(value)
File "/usr/lib/python2.7/dist-packages/numpy/lib/format.py", line 458, in read_array
data = fp.read(int(count * dtype.itemsize))
OverflowError: length too large
If I could successfully create and save the matrices, shouldn't I also be able to load the result? Is there some overhead that is killing the loading? Is it possible to work around this issue?
Thanks in advance.
From the notes for the just-published NumPy 1.8, release candidate 1:
IO compatibility with large files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Large NPZ files >2GB can be loaded on 64-bit systems.
So it seems you hit a known bug that has just been solved.
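For reference, a small sketch of the save/load round trip described in the question, assuming the three arrays are the data/indices/indptr of a scipy CSR matrix (the variable and file names are illustrative):
import numpy as np
from scipy.sparse import csr_matrix

m = csr_matrix(np.eye(3))  # stand-in for the real, much larger matrix
np.savez('matrix.npz', m.data, m.indices, m.indptr)  # stored as arr_0, arr_1, arr_2

archive = np.load('matrix.npz')
data, ind, indptr = archive['arr_0'], archive['arr_1'], archive['arr_2']
restored = csr_matrix((data, ind, indptr), shape=m.shape)  # the shape must be known or saved as well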