I can't seem to find a source that defines semantic layer tools, and I would like to know whether, strictly speaking, SSAS is a semantic layer itself, a tool used to create one, or how else it would be defined.
TFServing and KFServing both deploy models on Kubeflow and let users easily consume a model as a service, without needing to know the details of Kubernetes; the infrastructure layers are hidden.
TFServing comes from TensorFlow; it can also run on Kubeflow or standalone. See: TFServing on Kubeflow.
KFServing comes from Kubeflow and can support multiple frameworks such as PyTorch, TensorFlow, MXNet, etc. See: KFServing.
My question is: what is the main difference between these two projects? If I want to launch my model in production, which should I use? Which has better performance?
KFServing is an abstraction on top of inferencing rather than a replacement. It seeks to simplify deployment and make inferencing clients agnostic to which inference server is doing the actual work behind the scenes (be it TF Serving, Triton (formerly TRT-IS), Seldon, etc.). It does this by seeking agreement among inference server vendors on an inferencing dataplane specification, which allows extra components (such as transformations and explainers) to be more pluggable.
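To make the "agnostic client" point concrete, here is a minimal sketch of a client call against the standardized (V1) predict dataplane; the endpoint and model name are hypothetical:

    import requests

    # Hypothetical cluster endpoint and model name; with KFServing, the same
    # request works regardless of which inference server runs behind it.
    resp = requests.post(
        "http://my-cluster/v1/models/my-model:predict",
        json={"instances": [[1.0, 2.0, 3.0]]})
    print(resp.json())  # e.g. {"predictions": [...]}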
Can anyone please explain this part of the tutorial to me?
Here is the link: https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification
Part:
Customizing the model implementation
Keras is the recommended high-level model API for TensorFlow, and we encourage using Keras models (via tff.learning.from_keras_model or tff.learning.from_compiled_keras_model) in TFF whenever possible.
However, tff.learning provides a lower-level model interface, tff.learning.Model, that exposes the minimal functionality necessary for using a model for federated learning. Directly implementing this interface (possibly still using building blocks like tf.keras.layers) allows for maximum customization without modifying the internals of the federated learning algorithms.
So let's do it all over again from scratch.
Defining model variables, forward pass, and metrics
The first step is to identify the TensorFlow variables we're going to work with. In order to make the following code more legible, let's define a data structure to represent the entire set. This will include variables such as weights and bias that we will train, as well as variables that will hold various cumulative statistics and counters we will update during training, such as loss_sum, accuracy_sum, and num_examples.
    import collections

    MnistVariables = collections.namedtuple(
        'MnistVariables', 'weights bias num_examples loss_sum accuracy_sum')
Roughly analogous to the multiple paths Keras exposes to create a Keras model, TFF exposes multiple distinct ways of creating a tff.learning.Model. One of them is through the constructor functions, tff.learning.from_keras_model or tff.learning.from_compiled_keras_model, but each of these functions constructs and returns an instance of the abstract base class tff.learning.Model; the purpose of this section of the tutorial is to show that it is possible to instead directly construct such an instance by implementing the appropriate methods in the abstract interface.
If it is the collections.namedtuple MnistVariables you are asking about, it is simply a data container class introduced for convenience, to help group the tf.Variables that the TFF runtime will use to track state during training. One important thing to note from the tff.learning.Model documentation, evidenced by this tutorial, is the line:
All tf.Variables should be introduced in __init__
If you are familiar with TensorFlow Variables, you will understand that controlling their instantiation is quite important.
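Concretely, the tutorial goes on to create all of these variables inside a single helper function, so their instantiation happens in one controlled place. A sketch along the lines of the tutorial (shapes correspond to flattened 28x28 MNIST images and 10 classes):

    import tensorflow as tf

    def create_mnist_variables():
        # All tf.Variables are created here, in one controlled place.
        return MnistVariables(
            weights=tf.Variable(
                lambda: tf.zeros(dtype=tf.float32, shape=(784, 10)),
                name='weights', trainable=True),
            bias=tf.Variable(
                lambda: tf.zeros(dtype=tf.float32, shape=(10,)),
                name='bias', trainable=True),
            num_examples=tf.Variable(0.0, name='num_examples', trainable=False),
            loss_sum=tf.Variable(0.0, name='loss_sum', trainable=False),
            accuracy_sum=tf.Variable(0.0, name='accuracy_sum', trainable=False))

A tff.learning.Model implementation can then call this helper from its __init__, which is how the tutorial satisfies the rule quoted above.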
In TensorFlow (as of v1.2.1), it seems that there are (at least) two parallel APIs for constructing computational graphs. There are functions in tf.nn, like conv2d, avg_pool, relu, and dropout, and then there are similar functions in tf.layers, tf.losses, and elsewhere, like tf.layers.conv2d, tf.layers.dense, and tf.layers.dropout.
Superficially, it seems that this situation only serves to confuse: for example, tf.nn.dropout uses a 'keep rate' while tf.layers.dropout uses a 'drop rate' as an argument.
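To make the difference concrete (TF 1.x assumed; the input tensor is made up for illustration):

    import tensorflow as tf  # TF 1.x assumed

    x = tf.placeholder(tf.float32, [None, 128])
    # tf.nn.dropout keeps each unit with probability keep_prob...
    h_nn = tf.nn.dropout(x, keep_prob=0.8)  # drops 20% of units
    # ...while tf.layers.dropout drops each unit with probability rate.
    h_layers = tf.layers.dropout(x, rate=0.2, training=True)  # also drops 20%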
Does this distinction have any practical purpose for the end-user / developer?
If not, is there any plan to clean up the API?
TensorFlow offers, on the one hand, a low-level API (tf.*, tf.nn.*, ...) and, on the other hand, a higher-level API (tf.layers.*, tf.losses.*, ...).
The goal of the higher-level API is to provide functions that greatly simplify the design of the most common neural nets. The lower-level API is there for people with special needs, or who wish to keep finer control over what is going on.
The situation is a bit confusing, though, because some functions have the same or similar names, and there is no clear way to tell at first sight which namespace corresponds to which level of the API.
Now, let's look at conv2d, for example. A striking difference between tf.nn.conv2d and tf.layers.conv2d is that the latter takes care of all the variables needed for weights and biases. A single line of code, and voilà, you have just created a convolutional layer. With tf.nn.conv2d, you have to declare the weights variable yourself before passing it to the function. And as for the biases, they are not handled at all: you need to add them yourself later.
Add to that the fact that tf.layers.conv2d also lets you add regularization and activation in the same function call, and you can imagine how this reduces code size when one's needs are covered by the higher-level API.
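A rough side-by-side sketch of the two styles (TF 1.x assumed; shapes and initializers are illustrative):

    import tensorflow as tf  # TF 1.x assumed

    x = tf.placeholder(tf.float32, [None, 28, 28, 1])

    # Low level: declare the weights yourself and add the biases by hand.
    w = tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1))
    b = tf.Variable(tf.zeros([32]))
    y_nn = tf.nn.relu(tf.nn.bias_add(
        tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME'), b))

    # High level: variables, biases, and activation handled in one call.
    y_layers = tf.layers.conv2d(
        x, filters=32, kernel_size=5, padding='same',
        activation=tf.nn.relu)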
The higher level also makes some decisions by default that could be considered best practices. For example, losses in tf.losses are added to the tf.GraphKeys.LOSSES collection by default, which makes recovery and summation of the various components easy and somewhat standardized. If you use the lower-level API, you need to do all of that yourself. Obviously, you need to be careful when you start mixing low- and high-level API functions there.
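For instance (TF 1.x assumed; the tensors are placeholders for illustration):

    import tensorflow as tf  # TF 1.x assumed

    labels = tf.placeholder(tf.float32, [None, 10])
    logits = tf.placeholder(tf.float32, [None, 10])

    # Each tf.losses.* call registers its result in tf.GraphKeys.LOSSES...
    tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)
    # ...so the total (including regularization losses) comes back in one call.
    total_loss = tf.losses.get_total_loss()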
The higher-level API is also an answer to a great need from people who are otherwise used to similarly high-level functions in other frameworks, Theano aside. This is rather obvious when one considers the number of alternative higher-level APIs built on top of TensorFlow, such as Keras 2 (now part of the official TensorFlow API), slim (in tf.contrib.slim), tflearn, tensorlayer, and the like.
Finally, if I may add a piece of advice: if you are beginning with TensorFlow and do not have a preference for a particular API, I would personally encourage you to stick to the tf.keras.* API:
Its API is friendly and at least as good as the other high-level APIs built on top of the low-level TensorFlow API.
It has a clear namespace within tensorflow (although it can -- and sometimes should -- be used with parts from other namespaces, such as tf.data)
It is now a first-class citizen of TensorFlow (it used to be in tf.contrib.keras), and care is taken to make new TensorFlow features (such as eager execution) compatible with Keras.
Its generic implementation can use other backends such as CNTK, and so does not lock you in to TensorFlow.
I see that Bayesian filters work well for binary choices (spam vs. not spam, male vs. female, etc.). Is there any way for them to categorize multiple values (e.g. php+javascript, house+yard)?
I've seen Naive bayesian classifier - multiple decisions, but I want to know if multiple outputs are possible.
If not, what other approaches would be suggested for categorization (with or without learning)? Especially for PHP.
As the accepted answer of the question you linked to says, "It's definitely possible to have more than two classes." In practice, one approach is to train multiple binary classifiers in parallel (one-vs-rest), e.g. one classifier for php vs. not-php and another classifier for javascript vs. not-javascript.
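To illustrate the one-vs-rest idea (a Python/scikit-learn sketch, since I don't know of an equivalent PHP library; the documents and labels are made up):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.preprocessing import MultiLabelBinarizer

    docs = ["echo and arrays", "closures and prototypes", "mowing the yard"]
    labels = [["php"], ["javascript"], ["house", "yard"]]

    vec = CountVectorizer()
    X = vec.fit_transform(docs)
    Y = MultiLabelBinarizer().fit_transform(labels)  # one column per label

    # One naive Bayes classifier per label: label vs. not-label.
    clf = OneVsRestClassifier(MultinomialNB()).fit(X, Y)
    print(clf.predict(vec.transform(["arrays and closures"])))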
Other widely used multi-class classification methods include
artificial neural networks (e.g. multilayer perceptrons)
(boosted) decision trees
support vector machines
If you have a more detailed follow-up question on this, post it on http://stats.stackexchange.com.
I'm not sure what libraries are available for such a task in PHP, but SWIG is a tool for making libraries written in C/C++ usable from PHP.
I was wondering: is there any good, clean object-oriented (OOP) implementation of Bayesian filtering for spam and text classification? This is just for learning purposes.
I definitely recommend Weka, which is open-source data mining software written in Java:
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
As mentioned above, it ships with a bunch of different classifiers like SVM, Winnow, C4.5, Naive Bayes (of course) and many more (see the API doc).
Note that a lot of classifiers are known to have much better performance than Naive Bayes in the field of spam detection or text classification.
Furthermore, Weka brings you a very powerful GUI.
Check out Chapter 6 of Programming Collective Intelligence
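Chapter 6 of that book builds exactly this kind of classifier step by step. As a rough illustration of the same idea (a hypothetical sketch with Laplace smoothing, not the book's code):

    import math
    from collections import defaultdict

    class NaiveBayesFilter:
        # A minimal OOP naive Bayes text classifier, for learning purposes.

        def __init__(self):
            self.word_counts = defaultdict(lambda: defaultdict(int))  # label -> word -> count
            self.label_counts = defaultdict(int)                      # label -> number of docs

        def train(self, text, label):
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1

        def _log_prob(self, text, label):
            # log P(label) + sum of log P(word | label), with Laplace
            # smoothing so unseen words do not zero out the probability.
            logp = math.log(self.label_counts[label] / sum(self.label_counts.values()))
            total = sum(self.word_counts[label].values())
            vocab = {w for counts in self.word_counts.values() for w in counts}
            for word in text.lower().split():
                count = self.word_counts[label].get(word, 0) + 1
                logp += math.log(count / (total + len(vocab)))
            return logp

        def classify(self, text):
            return max(self.label_counts, key=lambda lb: self._log_prob(text, lb))

    f = NaiveBayesFilter()
    f.train("cheap pills buy now", "spam")
    f.train("meeting notes attached", "ham")
    print(f.classify("buy cheap pills"))  # -> 'spam'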
Maybe https://ci-bayes.dev.java.net/ or http://www.cs.cmu.edu/~javabayes/Home/node2.html? I have never played with either of them, though.
Here is an implementation of Bayesian filtering in C#: A Naive Bayesian Spam Filter for C# (hosted on CodeProject).
nBayes - another C# implementation hosted on CodePlex
PHP Naive Bayesian Filter
The page is in French, but you should be able to find the download link :)