integration of xgboost with h2o functionalities - xgboost

I see that XGBoost has recently been integrated into the H2O ecosystem. However, the H2O documentation for it seems to be lacking. In particular, I wonder whether it is possible to run a grid search over all XGBoost parameters using h2o.grid.

The H2O-XGBoost documentation is in the Algorithms section of the H2O User Guide, here.
The hyperparameters for Grid Search are also listed with all the other algorithms in the Grid Search section, here.
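For illustration, here is a rough sketch of what such a grid search could look like from the Python client, where H2OGridSearch is the counterpart of R's h2o.grid. The parameter values, the grid_size helper and the run_xgboost_grid wrapper are assumptions made up for this example; the authoritative list of searchable XGBoost hyperparameters is the one in the Grid Search section of the User Guide.

```python
import math

# Hypothetical grid: keys are H2OXGBoostEstimator argument names,
# values are the candidate settings to try (chosen only for illustration).
hyper_params = {
    "max_depth": [3, 5, 7],
    "learn_rate": [0.05, 0.1],
    "sample_rate": [0.8, 1.0],
    "col_sample_rate_per_tree": [0.8, 1.0],
}


def grid_size(params):
    """Number of models a Cartesian search over `params` would build."""
    return math.prod(len(values) for values in params.values())


def run_xgboost_grid(train, x, y):
    """Train one XGBoost model per grid point (hypothetical wrapper).

    Assumes h2o.init() has already been called and that `train` is an
    H2OFrame, `x` a list of predictor column names and `y` the response.
    """
    # Imported lazily so the grid above can be inspected without h2o installed.
    from h2o.estimators.xgboost import H2OXGBoostEstimator
    from h2o.grid.grid_search import H2OGridSearch

    grid = H2OGridSearch(
        model=H2OXGBoostEstimator(ntrees=50, seed=1),  # fixed across all models
        hyper_params=hyper_params,
        search_criteria={"strategy": "Cartesian"},
    )
    grid.train(x=x, y=y, training_frame=train)
    return grid


print(grid_size(hyper_params), "models in the full Cartesian grid")
```

With a large grid, switching the search criteria to a random discrete strategy with a model or time budget is usually more practical than the full Cartesian search.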

Related

TensorFlow 2 documentation for graph-mode

When I check the TensorFlow documentation (Python API docs or guides), it all seems exclusively for eager-mode. Almost all the examples don't even mention this.
For some specific operation/function like tf.nn.relu, this does not really make any difference.
However, for more complex things like tf.data (Dataset API, guide), it likely makes a difference. In particular, all the examples would be different for graph mode.
Where can I find recent documentation (API references, guides, tutorials, examples) for graph mode?
(My current fallback is to check latest TF 1 documentation. But at some point, this will become more and more outdated.)
Or is graph mode so far deprecated that documentation for it is no longer considered necessary?
Graph mode in TensorFlow 2 is different from graph mode in TensorFlow 1. Instead of using sessions and placeholders, TensorFlow 2 uses functions annotated with tf.function. The eager mode examples you see can be executed in graph mode by wrapping them within a tf.function.
If you prefer to use the TensorFlow 1 style of graph mode with sessions and placeholders, you can still do so in TensorFlow 2 by using the tf.compat.v1 module. The API docs in that module describe the TensorFlow 1 style of graph mode. You can find archived guides about TensorFlow 1 graph mode at https://github.com/tensorflow/docs/tree/master/site/en/r1/guide

I want to use a hidden Markov model for data prediction

I am new to machine learning models and data science libraries. I want to use a hidden Markov model for on-the-fly statistical prediction: it should read data from Kafka, build the model, use that model to predict data at run time, and keep doing the same for a continuous stream.
Currently I can only see the hidden Markov model implementation in TensorFlow for Python (a tensorflow_probability distribution). Is there any other library available which can help me achieve the above scenario?
Suggestions can involve Java and Python libraries.
Please feel free to add any resource links that can help me understand how to use TensorFlow for hidden Markov models.
This might be a nice place to start: https://hmmlearn.readthedocs.io/en/latest/tutorial.html
Other alternatives I found are:
Java:
The Mallet library, and its extension GRMM in particular.
Python:
Pomegranate, with its HMM support.
Having said that, my impression is that TensorFlow is a much better known, more active and better supported library. I'd try that first.
I'm searching for a library that would support hierarchical HMMs (HHMM). That would probably require some tweaking of one of the listed ones.
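To make the streaming part of the question more concrete, here is a minimal pure-Python sketch of the prediction step for a discrete HMM: forward filtering, where the belief over hidden states is updated as each observation arrives and then used to predict the next observation. It assumes the start, transition and emission probabilities have already been estimated elsewhere (for example with one of the libraries above); the OnlineHMM class and the numbers are made up for illustration, and the Kafka consumer is replaced by a plain list.

```python
def normalize(values):
    """Scale non-negative numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


class OnlineHMM:
    """Forward filtering for a discrete HMM with known parameters.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of observing symbol k in state i
    """

    def __init__(self, start, trans, emit):
        self.trans = trans
        self.emit = emit
        # Belief over the hidden state at the *next* time step.
        self.belief = list(start)

    def update(self, obs):
        """Condition on a new observation, then step the belief forward."""
        weighted = [b * self.emit[i][obs] for i, b in enumerate(self.belief)]
        posterior = normalize(weighted)
        n = len(posterior)
        self.belief = [
            sum(posterior[i] * self.trans[i][j] for i in range(n))
            for j in range(n)
        ]
        return posterior

    def predict_next(self):
        """Distribution over the next observation given everything seen so far."""
        n_symbols = len(self.emit[0])
        return [
            sum(b * self.emit[i][k] for i, b in enumerate(self.belief))
            for k in range(n_symbols)
        ]


# Illustrative parameters: 2 hidden states, 2 observation symbols.
model = OnlineHMM(
    start=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    emit=[[0.8, 0.2], [0.3, 0.7]],
)

# Stand-in for messages consumed from a stream.
for symbol in [0, 0, 1, 1, 1]:
    model.update(symbol)
    print(symbol, [round(p, 3) for p in model.predict_next()])
```

Re-estimating the parameters themselves on the fly is a separate problem; a common compromise is to refit periodically on a recent window of data and keep filtering in between.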

Has Microsoft abandoned CNTK?

I want to know if CNTK is dead. Release notes on GitHub dated 03/31/2019: "Today’s 2.7 release will be the last main release of CNTK." I've spent months developing software using CNTK and now it appears to be a waste of time and money. I've searched for an answer on numerous sites and still have no answer. Stack Overflow is one of the sites recommended by Microsoft.
From KedengMS, one of the maintainers of CNTK. Reposted from GitHub.
Thanks for all the CNTK supporters, and I am privileged to have worked
on it, and learned a lot in the process. You can continue to use CNTK
for training and inference in the way it currently is, as other
Microsoft internal teams that still runs old models even in
BrainScript or NDL. Stopping adding new features does not mean CNTK is
no longer open source, it just means that going forward, there will be
no new GPU support (say, CUDA 11+), and no major new features added.
For different user scenarios, I think you may have different choices:
Deep learning newcomers: IMO CNTK is still a good entry to understand basics of deep learning, if you found CNTK
documents/tutorials/examples useful. Once you learnt the basic, it
won't be too hard to switch between frameworks. However, the DL field
is changing rapidly and CNTK has already lagged behind in a lot of
ways, so if you need more advanced features like dynamic graph,
PyTorch would be a better choice.
Model maintainers: If you already have CNTK models working, and to maintain it just means training with new data, you can continue to use
CNTK the way you currently use it. Actually, teams inside Microsoft
are doing this too. If there are serious bugs preventing productivity,
they still will be fixed. For inference, you can continue to use CNTK
C/C++/Python/C#/Java APIs, or you may export CNTK models in ONNX
format, and use ONNX Runtime or ORT as a slimmer and faster inference
engine. You'll be surprised to find how much faster it is comparing to
CNTK, and how slimmer the setup is (forget about OpenMPI when you just
need inference!). ORT currently provides C/C++/Python/C# interfaces.
Model builders: If you have CNTK model, and want to use features that are not currently supported in CNTK, please consider switch to
other frameworks like TensorFlow/PyTorch/etc. Our team has done lots
of data reader work inside PyTorch to ensure teams in Microsoft can
switch from CNTK to PyTorch. Besides, we are also in the process of
migrating CNTK specific distributed trainer like BMUF to PyTorch.
Hopefully you'll find that useful too when migrating your model.
The good thing about open source is that the community can continue to
fork/evolve if needed, unlike other Microsoft products that only ship
binaries (Win7 I am looking at you).

Machine learning with TensorFlow and sentiment analysis

I'd like to build a machine learning model for sentiment analysis with TensorFlow. I know it's possible to do machine learning with TensorFlow, but is it possible to build a machine learning model suited to sentiment analysis? I know sentiment analysis can be done with a convolutional neural network (so, deep learning) in TensorFlow, but I'm looking for a solution that only uses machine learning algorithms, not deep learning.
Would you know of good tutorials for starting a GCP machine learning project? Is it possible to start a GCP machine learning project without using a Google API or TensorFlow (just in "regular" Python, but linked to other services like Google BigQuery or Dataprep)?
Thank you very much for your answers :)
In general it wouldn't make much sense to use TensorFlow for non-deep-learning solutions. You can always try scikit-learn for implementing other machine learning techniques in Python, but I do not think that you could get far with sentiment analysis this way. For a nice introduction to sentiment analysis with TensorFlow, I would suggest going through this guide.
As for machine learning on Google Cloud Platform: there are a number of APIs hosted on the platform which use pre-built models, including Cloud Natural Language, which has an analyzeSentiment method. Here you can find a Python tutorial on how to do sentiment analysis using the Python client library of the Cloud Natural Language API.
If you still want to try to build your own model, you can train it (with either scikit-learn or TensorFlow), deploy it, and then use it for predictions in the cloud with the Google Cloud Machine Learning Engine. As for tutorials, I would suggest that you visit the official documentation of any of the above products/services; you will find comprehensive step-by-step guides there for all levels.
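To give an idea of what the non-deep-learning route looks like, here is a toy sketch of a bag-of-words multinomial Naive Bayes classifier in plain Python. It is a stand-in written for illustration, not scikit-learn's implementation; the NaiveBayesSentiment class, the whitespace tokenizer and the four training sentences are all made up, and a real project would need far more data and proper text preprocessing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # log P(label) + sum of log P(word | label)
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, just enough to exercise the code.
texts = [
    "great movie loved it",
    "what a wonderful experience",
    "terrible plot hated it",
    "awful and boring film",
]
labels = ["pos", "pos", "neg", "neg"]

clf = NaiveBayesSentiment().fit(texts, labels)
print(clf.predict("loved this wonderful film"))
print(clf.predict("boring and terrible"))
```

In scikit-learn the same idea is usually expressed as a count or TF-IDF vectorizer followed by a Naive Bayes or linear classifier.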

How to use tensorflow-wavenet

I am trying to use the tensorflow-wavenet program for text to speech.
These are the steps:
1. Download TensorFlow
2. Download librosa
3. Install requirements: pip install -r requirements.txt
4. Download corpus and put into directory named "corpus"
5. Train the machine: python train.py --data_dir=corpus
6. Generate audio: python generate.py --wav_out_path=generated.wav --samples 16000 model.ckpt-1000
After doing this, how can I generate a voice read-out of a text file?
According to the tensorflow-wavenet page:
Currently there is no local conditioning on extra information which would allow context stacks or controlling what speech is generated.
You can find more information about current development of the project by reading the issues on the repository (local conditioning is a desired feature!)
The Wavenet paper compares Wavenet to two TTS baselines, one of which appears to have code for training available online: http://hts.sp.nitech.ac.jp
A recent paper by DeepMind describes one approach to going from text to speech using WaveNet, which I have not tried to implement but which at least states the method they use: they first train one network to predict a spectrogram from text, then train WaveNet to use the same sort of spectrogram as an additional conditional input to produce speech. It's a neat idea, especially since you can train the WaveNet part on some huge database of voice-only data, for which you can extract the spectrogram, and then train the text-to-spectrogram part using a different dataset where you have text.
https://google.github.io/tacotron/publications/tacotron2/index.html has the paper and some example outputs.
There seems to be a bunch of unintuitive engineering around the spectrogram prediction part (no doubt because of the nature of text-to-time learning), but there's some detail in the paper at least. The dataset is proprietary so I've no idea how hard it would be to get any results using other datasets.
For those who may come across this question, there is a newer Python implementation, ForwardTacotron, that readily enables text-to-speech.