How to apply a higher weight to a feature in BQML? - google-bigquery

I'm trying to train an XGBoost classification model using BQML, but I'd like to give one feature a higher weight. I couldn't find anything in the BQML documentation about assigning feature weights. There is CLASS_WEIGHTS for assigning weights to class labels, but that is not what I want.
I feel like this feature is not available yet and I have to create handmade models using sklearn.

Related

How to extract weights of DQN agent in TF-Agents framework?

I am using TF-Agents for a custom reinforcement learning problem, where I train a DQN (constructed using DqnAgents from the TF-Agents framework) on some features from my custom environment, and separately use a keras convolutional model to extract these features from images. Now I want to combine these two models into a single model and use transfer learning, where I want to initialize the weights of the first part of the network (images-to-features) as well as the second part which would have been the DQN layers in the previous case.
I am trying to build this combined model using keras.layers and wrapping it with the TF-Agents tf.networks.sequential class to bring it into the form required by the DqnAgent() class. (Let's call this statement (a)).
I am able to initialize the image feature extractor network's layers with the weights since I saved it as a .h5 file and am able to obtain numpy arrays of the same. So I am able to do the transfer learning for this part.
The problem is with the DQN layers. I saved the policy from the previous example using the prescribed TensorFlow SavedModel format (pb), which gives me a folder containing the model attributes. However, I am unable to view or extract the weights of my DQN this way, and the recommended tf.saved_model.load('policy_directory') is not transparent about what data I can inspect in the loaded policy. To do the transfer learning described in statement (a), I need to extract the weights of my DQN and assign them to the new network. The documentation seems quite sparse for this transfer learning case.
Can anyone help me in this, by explaining how I can extract weights from the Saved Model method (from the pb file)? Or is there a better way to go about this problem?

Multi-target classification in tensorflow

How to implement multi-target classification in tensorflow? Say I have a list of features f1, f2, ..., fn and I want to predict the class or a value of three targets t1, t2, and t3. Each target belongs to a single class only.
It sounds like you're interested in multinomial logistic regression. In TensorFlow, the most important function for this is tf.nn.softmax_cross_entropy_with_logits_v2.
This site gives a good idea of how the softmax function makes it possible to classify a point into one of multiple categories.
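As a rough illustration of the idea above, here is a minimal sketch in plain Python (not TensorFlow's implementation) of one softmax cross-entropy term per target, summed into a single loss. The function names, the example logits, and the choice to sum the three losses are assumptions made for illustration only.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cross_entropy(logits, label):
    # Negative log-probability assigned to the true class index.
    return -math.log(softmax(logits)[label])

def multi_target_loss(logits_per_target, labels):
    # One softmax head per target; the per-target losses are simply summed
    # (an assumed design choice; a weighted sum is equally possible).
    return sum(
        softmax_cross_entropy(logits, label)
        for logits, label in zip(logits_per_target, labels)
    )

# Hypothetical logits for targets t1, t2, t3 with 3, 2 and 4 classes.
logits_per_target = [[2.0, 0.5, -1.0], [0.1, 0.2], [0.0, 0.0, 0.0, 0.0]]
labels = [0, 1, 3]
loss = multi_target_loss(logits_per_target, labels)
print(loss)
```

In a real model the three logit vectors would come from three output heads that share the same feature layers.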

Feature importance from tf.estimator.BoostedTreesRegressor

I am trying to extract feature importance from a model built in python using tf.estimator.BoostedTreesRegressor.
It looks like the standard way to achieve this is to iterate over all trees in the forest and compute statistics from each tree's feature importances.
Example in sklearn, xgboost. I have not found how to address this issue in tensorflow.
This is not possible at the moment using TensorFlow's premade BoostedTreesRegressor or BoostedTreesClassifier Estimators.

How to perform multi-label classification with tensorflow for the purpose of auto-tagging?

I'm new to tensorflow and would like to know if there is any tutorial or example of a multi-label classification with multiple network outputs.
I'm asking this because I have a collection of articles, in which, each article can have several tags.
Out of the box, tensorflow supports binary multi-label classification via the tf.nn.sigmoid_cross_entropy_with_logits loss function or the like (see the complete list in this question). If your tags are binary, in other words there's a predefined set of possible tags and each one can either be present or not, you can safely go with that and use a single model to classify all labels at once. There are a lot of examples of such networks, e.g. one from this question.
Unfortunately, multi-nomial multi-label classification is not supported in tensorflow. If this is your case, you'd have to build a separate classifier for each label, each using tf.nn.softmax_cross_entropy_with_logits or a similar one.
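To make the binary multi-label case concrete, here is a minimal plain-Python sketch of a per-tag sigmoid cross-entropy, written in the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)). It is not TensorFlow code; the helper names, the tag list, the logits, and the 0.5 cut-off are all assumptions for illustration.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_cross_entropy(logit, target):
    # Stable form of -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)).
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def multi_label_loss(logits, targets):
    # Each tag is an independent binary problem; average the per-tag losses.
    return sum(
        sigmoid_cross_entropy(x, z) for x, z in zip(logits, targets)
    ) / len(logits)

def predict_tags(logits, tag_names, threshold=0.5):
    # Keep every tag whose predicted probability reaches the threshold.
    return [name for x, name in zip(logits, tag_names) if sigmoid(x) >= threshold]

# Hypothetical article with three candidate tags.
tag_names = ["python", "tensorflow", "sql"]
logits = [2.0, -1.0, 0.3]
targets = [1.0, 0.0, 1.0]

print(multi_label_loss(logits, targets))
print(predict_tags(logits, tag_names))
```

Because each tag gets its own sigmoid, an article can receive any number of tags, including none.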

How to choose the threshold of the output of a dnn in tensorflow?

I am currently learning to make neural networks with tensorflow. And the library provides a very convenient way to create one with the estimator DNNClassifier like in this tutorial: https://www.tensorflow.org/get_started/premade_estimators.
However, I can't see how to choose the final threshold of the output layer before making the prediction.
For instance, let's say we have a binary classifier between 'KO' and 'OK'. The end of the neural network computes the probability of each class for a given sample, for instance [0.4,0.6] (so 40% that the answer is 'KO' and 60% that the answer is 'OK'). I assume that the dnn uses a threshold of 0.5 by default, so it will answer 'OK' here. But I want to change this threshold to 0.8, so that if the dnn is not at least 80% sure of 'OK', it will answer 'KO' (in order to tune the FP-rate and the FN-rate).
How can we do that?
Thanks in advance for your help.
The premade estimators are somewhat rigid. The DNNClassifier, for example, does not provide a mechanism to change the loss function or to obtain the logits/probabilities output by the classifier, as you've discovered.
To modify the logic of how predictions are generated, or to modify your loss function, you'll have to create a custom Estimator. This tutorial walks you through that process.
If you haven't invested too much time learning how to use the Estimator API yet, I recommend you also acquaint yourself with Keras, another high-level API for building and training deep learning models in TensorFlow; you might find it easier to build custom models with Keras rather than Estimators.
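Whichever route you take, once the model exposes class probabilities the thresholding itself is a small post-processing step. The following is a minimal sketch in plain Python, assuming the probabilities arrive as [P('KO'), P('OK')] as in the question; the function name and its parameters are hypothetical and not part of any TensorFlow API.

```python
def classify(probabilities, threshold=0.8, positive="OK", negative="KO"):
    # probabilities is assumed to be [P(negative), P(positive)].
    p_positive = probabilities[1]
    # Answer the positive class only when the model is confident enough.
    return positive if p_positive >= threshold else negative

sample = [0.4, 0.6]
print(classify(sample))                 # threshold 0.8
print(classify(sample, threshold=0.5))  # default-style threshold 0.5
```

Raising the threshold trades false positives for false negatives, so it is usually tuned on a validation set rather than fixed in advance.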