Exporting Tensorflow Models for Eigen Only Environments - tensorflow

Has anyone seen any work done on this? I'd think this would be a reasonably common use-case. Train model in python, export the graph and map to a sequence of eigen instructions?

I don't believe anything like this is available, but it is definitely something that would be useful. There are some obstacles to overcome though:
Not all operations are implemented by Eigen.
We'd need to know how to generate code for all operations we want to support.
The glue code to allocate buffers and schedule work can get pretty gnarly.
It's still a good idea though, and it might get more attention posted as a feature request on https://github.com/tensorflow/tensorflow/issues/

Related

How can I implement a local search algorithm in a model created on CPLEX ILOG?

I'm currently working on a large scale timetabling problem from my university. I'm using CPLEX to create the model and solve it, but due to it's size and processing time, I'm considering trying out a local search algorithm like G.A to solve it, but I'm lost on how to properly do it. Is there a way of applying a local search on it without having to reformulate the whole model?
one possible manner to tackle your problem is to use the CPLEX callbacks.
You may implement a heuristic callback. In this callback, you can implement your GA within the CPLEX model and use it to find a feasible solution (which I think is very difficult in various timetabling problems) or to improve your current solution.

Can .tflite capture tf.hub.text_embedding_column() processes?

Just a general question here, no reproducible example but thought this might be the right place anyway since its very software specific.
I am building a model which I want to convert to .tflite. It relies on tf.hub.text_embedding_collumn() for feature generation. When I convert to .tflite will this be captured such that the resulting model will take raw text as input rather than a sparse vector representation?
Would be good to know just generally before I invest too much time in this approach. Thanks in advance!
Currently I don't imagine this would work, as we do not support enough string ops to implement that. One approach would be to do this handling through a custom op, but implementing this custom op would require domain knowledge and mitigate the ease-of-use advance of using tf hub in the first place.
There is some interest in defining a set of hub operators that are verified to work well with tflite, but this is not yet ready.

What Tensorflow API to use for Seq2Seq

This year Google produced 5 different packages for seq2seq:
seq2seq (claimed to be general purpose but
inactive)
nmt (active but supposed to be just
about NMT probably)
legacy_seq2seq
(clearly legacy)
contrib/seq2seq
(not complete probably)
tensor2tensor (similar purpose, also
active development)
Which package is actually worth to use for the implementation? It seems they are all different approaches but none of them stable enough.
I've had too a headache about some issue, which framework to choose? I want to implement OCR using Encoder-Decoder with attention. I've been trying to implement it using legacy_seq2seq (it was main library that time), but it was hard to understand all that process, for sure it should not be used any more.
https://github.com/google/seq2seq: for me it looks like trying to making a command line training script with not writing own code. If you want to learn Translation model, this should work but in other case it may not (like for my OCR), because there is not enough of documentation and too little number of users
https://github.com/tensorflow/tensor2tensor: this is very similar to above implementation but it is maintained and you can add more of own code for ex. reading own dataset. The basic usage is again Translation. But it also enable such task like Image Caption, which is nice. So if you want to try ready to use library and your problem is txt->txt or image->txt then you could try this. It should also work for OCR. I'm just not sure it there is enough documentation for each case (like using CNN at feature extractor)
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/seq2seq: apart from above, this is just pure library, which can be useful when you want to create a seq2seq by yourself using TF. It have a function to add Attention, Sequence Loss etc. In my case I chose that option as then I have much more freedom of choosing the each step of framework. I can choose CNN architecture, RNN cell type, Bi or Uni RNN, type of decoder etc. But then you will need to spend some time to get familiar with all the idea behind it.
https://github.com/tensorflow/nmt : another translation framework, based on tf.contrib.seq2seq library
From my perspective you have two option:
If you want to check the idea very fast and be sure that you are using very efficient code, use tensor2tensor library. It should help you to get early results or even very good final model.
If you want to make a research, not being sure how exactly the pipeline should look like or want to learn about idea of seq2seq, use library from tf.contrib.seq2seq.

Converting a deep learning model from GPU powered framework, such as Theano, to a common, easily handled one, such as Numpy

I have been playing around with building some deep learning models in Python and now I have a couple of outcomes I would like to be able to show friends and family.
Unfortunately(?), most of my friends and family aren't really up to the task of installing any of the advanced frameworks that are more or less necessary to have when creating these networks, so I can't just send them my scripts in the present state and hope to have them run.
But then again, I have already created the nets, and just using the finished product is considerably less demanding than making it. We don't need advanced graph compilers or GPU compute powers for the show and tell. We just need the ability to make a few matrix multiplications.
"Just" being a weasel word, regrettably. What I would like to do is convert the the whole model (connectivity,functions and parameters) to a model expressed in e.g. regular Numpy (which, though not part of standard library, is both much easier to install and easier to bundle reliably with a script)
I fail to find any ready solutions to do this. (I find it difficult to pick specific keywords on it for a search engine). But it seems to me that I can't be the first guy who wants to use a ready-made deep learning model on a lower-spec machine operated by people who aren't necessarily inclined to spend months learning how to set the parameters in an artificial neural network.
Are there established ways of transferring a model from e.g. Theano to Numpy?
I'm not necessarily requesting those specific libraries. The main point is I want to go from a GPU-capable framework in the creation phase to one that is trivial to install or bundle in the usage phase, to alleviate or eliminate the threshold the dependencies create for users without extensive technical experience.
An interesting option for you would be to deploy your project to heroku, like explained on this page:
https://github.com/sugyan/tensorflow-mnist

How do you find the most discriminant terms in binary document classification?

I want to use feature selection to find the terms in a document that are most useful for a binary classification task.
I've been looking around:
This mentions Mutual Information and the chi-squared test metric
http://nlp.stanford.edu/IR-book/html/htmledition/feature-selection-1.html
MATLAB has a number of functions as well:
http://www.mathworks.com/help/toolbox/stats/brj0qbu.html
Feature Selection in MATLAB
Of the above, relieff and rankfeatures look promising.
I do not know if my data follows a normal distribution. Any thoughts on which technique performs the best? Are there any newer methods you would suggest? The focus is to increase classification accuracy.
Thank you!
Since the answer is highly dependent on the nature of your data, I'd suggest playing with several options, possibly using a hold-out set for verification.
The easiest path would probably be to use Weka or RapidMiner for experimenting. Choosing from the plethora of options provided by them, you'll probably get acquainted with several other methods.
Having said that, I have found Mutual Information/Infogain to be useful on a large variety of problems.