How to compute additional statistics in the TFX Evaluator?

In the TFX Evaluator, on top of the metrics described in the TFMA format, I would like to compute statistics about the performance of my model on my dataset. Naturally, I would also like a way to get access to these statistics: either through the output of the component, or by letting the component upload the statistics somewhere.
I guess that some amount of custom code is needed (both for computing and returning the statistics), but I don't really know how much, or what the best way to write it would be. Any ideas on the topic?
Thanks

There are two ways to achieve this, depending on how you see the placement of your functionality in the TFX flow.
Writing a custom TFX component - this requires a lot of effort, and you need to define quite a few things.
Reusing an existing component - instead of writing a component for TFX entirely from scratch, you can inherit an existing component and customize it by overriding the executor functionality; see the sketch below.
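For the second approach, a minimal sketch could look like the following. It assumes a recent TFX release; the module paths, the custom_executor_spec argument, and the custom_stats.json file name are illustrative and may need adjusting for your TFX version.

import json
import os

from tfx.components import Evaluator
from tfx.components.evaluator import executor as evaluator_executor
from tfx.dsl.components.base import executor_spec

class StatsEvaluatorExecutor(evaluator_executor.Executor):

    def Do(self, input_dict, output_dict, exec_properties):
        # Run the standard TFMA evaluation first.
        super().Do(input_dict, output_dict, exec_properties)

        # Compute whatever custom statistics you need (placeholder values).
        my_stats = {'examples_seen': 12345}

        # Write them next to the evaluation output so downstream steps
        # (or an external uploader) can pick them up.
        eval_uri = output_dict['evaluation'][0].uri
        with open(os.path.join(eval_uri, 'custom_stats.json'), 'w') as f:
            json.dump(my_stats, f)

evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    custom_executor_spec=executor_spec.ExecutorClassSpec(StatsEvaluatorExecutor),
)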
I would suggest the following blogs to begin with:
Anatomy of TFX Component
Creating custom Components

Related

What exactly are orchestrators in ML?

Actually, in ML pipelines the components specify their inputs and outputs clearly.
For example, in TFX, StatisticsGen takes its input from ExampleGen and outputs some statistics. So the inputs and outputs are clear, and this is the same for all components. Why, then, do we need orchestrators? If anyone knows, please help me.
In real-life projects, everything can be much more complicated:
the input data can come from different sources: databases, file systems, third-party services. So we need to do classical ETL before we can start working with the data.
you can use different technologies in one pipeline. For instance, Spark as a preprocessing tool, after which you may need an instance with a GPU for the model training.
last, but not least: in production you need to take care of many more things, for instance data validation, model evaluation, etc. I wrote a separate article about how to organize this part using Apache Airflow. A minimal sketch of such a DAG follows below.
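To make the orchestrator's role concrete, here is a minimal, hypothetical Airflow DAG: the task names and callables are placeholders, and the import path assumes Airflow 2.x. The orchestrator contributes exactly the parts the components themselves do not: dependency ordering, scheduling, retries, and monitoring.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print('pull raw data from databases / files / APIs')  # placeholder

def validate():
    print('check schema and data quality')  # placeholder

def train():
    print('train the model, e.g. on a GPU worker')  # placeholder

with DAG(dag_id='ml_pipeline', start_date=datetime(2024, 1, 1),
         schedule_interval=None, catchup=False) as dag:
    t_extract = PythonOperator(task_id='extract', python_callable=extract)
    t_validate = PythonOperator(task_id='validate', python_callable=validate)
    t_train = PythonOperator(task_id='train', python_callable=train)

    # The orchestrator expresses and enforces the execution order.
    t_extract >> t_validate >> t_train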

Reinforcement Learning Agent in FMU

I want to train a reinforcement learning agent on a model which I built in OpenModelica. Using PyFMI, it is no problem to import the FMU, simulate it, and get some results.
My problem is that I don't have a way to "pause" a simulation after each step, get the states, feed my RL agent with them, and return its proposed action as an input.
ModelicaGym seems to be a way to solve this problem: starting a simulation, stopping it, getting the results, defining the next action, and starting the simulation again with the last end time as the new start time (sketched below).
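A rough sketch of that pause/resume pattern in plain PyFMI might look as follows; the variable names are taken from the circuit example further down, and agent_act is a hypothetical stand-in for the RL agent:

from pyfmi import load_fmu

model = load_fmu('circuit.fmu')
opts = model.simulate_options()
t, dt = 0.0, 0.1

for step in range(100):
    # Keep the internal state between segments instead of re-initializing.
    opts['initialize'] = (step == 0)
    res = model.simulate(start_time=t, final_time=t + dt, options=opts)

    state = res['currentSensor1.i'][-1]    # observe the last value
    action = agent_act(state)              # hypothetical RL agent
    model.set('signalVoltage1.v', action)  # apply the chosen action
    t += dt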
Reading a paper from Lund University (https://portal.research.lu.se/portal/files/7201641/pyfmi_tech.pdf) gave me another idea:
Creating a FMU with the Learner, and connecting the two FMUs via PyFMI.Master.
Something along these lines:
from pyfmi import load_fmu
from pyfmi.master import Master

# Load the controller and the plant model as separate FMUs.
controller = load_fmu("controller.fmu")
Circuit = load_fmu("circuit.fmu")

# Wire the sensor output to the controller input, and the controller
# output back to the circuit's voltage source.
connections = [(Circuit, "currentSensor1.i", controller, "feedback1.u2"),
               (controller, "PID.y", Circuit, "signalVoltage1.v")]

models = [Circuit, controller]
master_simulator = Master(models, connections)
res = master_simulator.simulate(final_time=1)
Controlling the circuit with another FMU containing a PID controller works, but is it possible to create an FMU with a reinforcement learning agent inside, including all the other required libraries and packages (Keras, TensorFlow)?
From my point of view, such an implementation could have pretty good performance, and especially for models and learners of higher complexity this could be an interesting approach.
Or am I just chasing dreams, because implementing a reinforcement learning algorithm in an FMU is not possible or would cause other trouble?
Actually, I was a little surprised not to find other people trying to implement this.
Best regards
Henrik
This answer may be quite late, but I nevertheless found your question during my research into the exact same problem. To my understanding, your question is taken up in the paper
Integration and Evaluation of Deep Reinforcement Learning Controller in a Building CoSimulation Environment
[PDF found here]
However in our context, the co-simulation environment that is used for the study of Cyber-Physical Systems is a master of co-simulation, and need to be able to call AI based Python libraries or tools from within the co-simulation. In the co-simulation, the controller requires an interaction with the controlled component at each iteration. Therefore, it is necessary to generate the AI based control components in the form of an FMU as is the case for the other components.
They used a tool called DACCOSIM NG, but later introduced their own methodology to make this approach more streamlined.
I hope you have long since found your own solution.
Maybe you can update your question so it is clearer how the learning agent is implemented, but I understand that it can be used from Python?
The example fmu_with_input_function.py from the PyFMI documentation illustrates how to use a function as input to an FMU. I suppose you can retrieve information from the FMU in this function, like so (untested pseudo-code):
from pyfmi import load_fmu

model = load_fmu('model.fmu')

# Called by the solver; reads the model's response and lets the
# learner propose the next input value.
def input_function(time):
    response = model.get('response_variable_name')
    return learner(response)

# PyFMI expects input as (variable name(s), function of time).
res = model.simulate(final_time=30, input=('input_var_name', input_function))
You have to set up your model FMU so that the variables your learner should change (input_var_name) are input variables or tunable parameters. If you use parameters without variability="tunable", you cannot change them in the course of the simulation.
I would first try with input variables, because tunable parameters are a bit more complicated to treat and might not be implemented correctly in some tools.

How can I implement a local search algorithm in a model created in CPLEX ILOG?

I'm currently working on a large-scale timetabling problem at my university. I'm using CPLEX to create the model and solve it, but due to its size and processing time, I'm considering trying out a local search algorithm such as a genetic algorithm (GA) to solve it. However, I'm lost on how to do this properly. Is there a way of applying a local search to it without having to reformulate the whole model?
One possible way to tackle your problem is to use CPLEX callbacks.
You may implement a heuristic callback. In this callback, you can run your GA within the CPLEX solve and use it to find a feasible solution (which I think is very difficult in various timetabling problems) or to improve your current solution. A sketch follows below.
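As a rough sketch with the legacy callback interface of the CPLEX Python API (improve_with_ga is a hypothetical wrapper around your GA, and the variable indexing is illustrative):

import cplex
from cplex.callbacks import HeuristicCallback

class GAHeuristic(HeuristicCallback):
    def __call__(self):
        # Seed the GA with the current relaxation values...
        relaxation = self.get_values()
        # ...and let it search for an integer-feasible assignment.
        candidate = improve_with_ga(relaxation)  # hypothetical GA wrapper
        if candidate is not None:
            indices = list(range(len(candidate)))  # all variable indices
            self.set_solution((indices, candidate))

model = cplex.Cplex('timetable.lp')
model.register_callback(GAHeuristic)
model.solve()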

What Tensorflow API to use for Seq2Seq

This year Google produced 5 different packages for seq2seq:
seq2seq (claimed to be general purpose, but inactive)
nmt (active, but probably just about NMT)
legacy_seq2seq (clearly legacy)
contrib/seq2seq (probably not complete)
tensor2tensor (similar purpose, also in active development)
Which package is actually worth using for an implementation? It seems they are all different approaches, but none of them is stable enough.
I too have had a headache over this issue of which framework to choose. I want to implement OCR using an encoder-decoder with attention. I tried to implement it using legacy_seq2seq (the main library at the time), but it was hard to understand the whole process; it certainly should not be used any more.
https://github.com/google/seq2seq: to me it looks like an attempt at a command-line training script that spares you writing your own code. If you want to train a translation model it should work, but in other cases it may not (like for my OCR), because there is not enough documentation and too few users.
https://github.com/tensorflow/tensor2tensor: this is very similar to the above, but it is maintained and you can add more of your own code, for example for reading your own dataset. The basic use case is again translation, but it also enables tasks like image captioning, which is nice. So if you want a ready-to-use library and your problem is txt->txt or image->txt, you could try this. It should also work for OCR. I'm just not sure whether there is enough documentation for each case (like using a CNN as the feature extractor).
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/seq2seq: unlike the above, this is just a pure library, which is useful when you want to build a seq2seq model yourself in TensorFlow. It has functions to add attention, sequence loss, etc. In my case I chose this option, as it gives much more freedom in choosing each part of the framework: the CNN architecture, RNN cell type, bi- or unidirectional RNN, type of decoder, and so on. But then you will need to spend some time getting familiar with the ideas behind it. A sketch of these building blocks follows below.
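As a hedged TF 1.x sketch of the building blocks that tf.contrib.seq2seq provides (all shapes and hyperparameters are placeholders, and the encoder is assumed to exist already):

import tensorflow as tf  # TF 1.x

src_len, tgt_len, num_units, vocab_size = 40, 30, 256, 10000

# Placeholders standing in for your encoder outputs and target data.
encoder_outputs = tf.placeholder(tf.float32, [None, src_len, num_units])
decoder_inputs = tf.placeholder(tf.float32, [None, tgt_len, num_units])
targets = tf.placeholder(tf.int32, [None, tgt_len])
target_mask = tf.placeholder(tf.float32, [None, tgt_len])
source_lengths = tf.placeholder(tf.int32, [None])
target_lengths = tf.placeholder(tf.int32, [None])

# Attention over the encoder states, wrapped around an RNN cell.
attention = tf.contrib.seq2seq.LuongAttention(
    num_units, encoder_outputs, memory_sequence_length=source_lengths)
cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.nn.rnn_cell.LSTMCell(num_units), attention)

# Teacher-forced training decoder.
helper = tf.contrib.seq2seq.TrainingHelper(decoder_inputs, target_lengths)
decoder = tf.contrib.seq2seq.BasicDecoder(
    cell, helper,
    initial_state=cell.zero_state(tf.shape(encoder_outputs)[0], tf.float32),
    output_layer=tf.layers.Dense(vocab_size))
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder)

# Sequence loss masks the padding positions via the weights argument.
loss = tf.contrib.seq2seq.sequence_loss(
    outputs.rnn_output, targets, weights=target_mask)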
https://github.com/tensorflow/nmt: another translation framework, based on the tf.contrib.seq2seq library.
From my perspective, you have two options:
If you want to test an idea quickly and be sure you are using very efficient code, use the tensor2tensor library. It should help you get early results or even a very good final model.
If you want to do research without being sure how exactly the pipeline should look, or want to learn the ideas behind seq2seq, use the library in tf.contrib.seq2seq.

Solution Cloning Performance Tips

We are currently trying to improve the performance of a planning problem we've implemented in OptaPlanner. Our model has ~45,000 chained variables and after profiling the application it seems like the main bottleneck is around the cloning. Approximately 90% of the CPU run-time is consumed by the FieldAccessingSolutionCloner method calls.
We've already tried to make our object model more lightweight by reducing the number of Maps and Sets within the planning entities and changing fields to primitives where possible, but from your own OptaPlanner experience, do you have any advice on how to speed up cloning performance?
Have you tried writing a custom cloner? See docs.
The default one needs to rely on reflection, so it's slower.
Also, the structure of your domain model influences how much you need to clone (regardless of whether you go custom or not):
If you delete your solution class and planning entity classes, do your other domain classes still compile?
If yes, then the clone is minimal. If no, it's not.