How to learn to web scrape/Selenium?


I recently learned how to build ML models using scikit-learn, TensorFlow and Keras, and I applied what I learned to create a model that predicts NBA games. The model ended up doing pretty well, so much so that I'd like to create a self-updating database that scrapes basketball-reference.com/leagues/ regularly to update game data, so that any time I want a prediction for an upcoming NBA game I have the parameters ready without needing to do anything.
For this to work I need to know how to scrape basketball-reference.com/leagues/. I thought I might try Selenium, but it turned out to be more difficult than I expected. Most of my programming/compsci knowledge is self-taught, so I'm not sure where to turn or what to do to learn how to use Selenium.
Does anyone have resources to suggest, or some 'prerequisites' that I should know before I try using Selenium?
I was thinking about checking out a course on Udemy!
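A possible first step before Selenium, offered as a sketch rather than advice taken from this thread: if the table you need is already in the HTML that the server returns (not checked here for basketball-reference.com), a plain HTML parser is enough. Selenium is mainly needed when the data only appears after JavaScript runs in a browser. The example below uses only Python's standard library; SAMPLE_HTML, the team names and the column names are made up for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample; the real page's markup and columns will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Visitor</th><th>PTS</th><th>Home</th><th>PTS</th></tr>
  <tr><td>2024-01-01</td><td>Team A</td><td>101</td><td>Team B</td><td>99</td></tr>
  <tr><td>2024-01-02</td><td>Team C</td><td>110</td><td>Team D</td><td>120</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *games = parser.rows
print(header)
for game in games:
    print(game)
```

For a real page you would download the HTML first (for example with urllib.request) and feed it to the same parser. If the rows are missing from the downloaded HTML, that is the point where a browser-driving tool like Selenium starts to pay off.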

Related

OMNeT++ with Reinforcement Learning Tools [ML]

I am currently failing to find an easy, modular framework that links OpenAI Gym, TensorFlow or Keras with OMNeT++ in such a way that the tools can communicate with each other and I can do online learning.
There are tools like omnetpy and veins-gym. However, one is very strict and not trustworthy (with no certainty that it can bridge to OpenAI Gym, for example), and the other is so poorly documented that one can't figure out how it is supposed to be incorporated into a project.
Given that OMNeT++ is such a big project, how is it possible that it is so disconnected from the ML world?
On top of that, I will still need to use federated learning, so a custom, scrappy solution would be even more difficult.
I found various articles that say “we have used OMNeT++ and Keras or TensorFlow”, etc., but none of them shared their code, so it is kind of mysterious how they did it.
Alternatively, I could use NS3, but as far as I know it has a very steep learning curve. Some ML tools are apparently well documented for NS3, but since I haven't tried to implement anything in NS3 with those tools, I can't know for sure. OMNeT++ was easy to learn for what I need; switching to NS3 still seems like a burden with no clear guarantees.
I would like to ask for help in both directions:
If you have links to good middleware between omnetpp and openai-gym, Keras or similar, and you have used them, please share them with me.
If you have experience with NS3 and ML, using ML middleware to link NS3 with openai-gym, Keras and so on, please share it with me.
I will only be able to finish my POC if I manage to use reinforcement learning tooling online with an OMNeT++ simulation (i.e., the agent decides at simulation runtime which actions to take).
My project is actually complex, but the POC may be simple. I am relying on these tools because I don't have enough experience to build a complex system that translates one domain into another. So any help would be appreciated.
Thank you.

Switching AI learning technique during training?

I'm thinking about creating an AI that tries to play a game. However, it would take forever for it to learn by itself from zero. So I wonder whether it would be possible to start the training with data from existing (human) gameplay and then switch to learning from its own play once the AI reaches the point where it knows the basics.
If it's possible, is there a way to do that with TensorFlow, or should I do it from scratch?

If the game is not too complicated (e.g. not open world), you may be interested in deep reinforcement learning (you mentioned TensorFlow), which does not require existing gameplay. The agent starts by playing the game itself, gradually gathers experience, and develops game-winning strategies from it. Other than that, using supervised learning to create your model sounds difficult and needs more clarification, in my opinion.
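To make the "human data first, then self-play" idea from the question concrete, here is a minimal sketch under heavy assumptions. A six-cell corridor stands in for the game, a lookup table stands in for the neural network, and plain tabular Q-learning stands in for deep reinforcement learning. The environment, the demonstration data and every name below are invented for illustration; TensorFlow is not used.

```python
import random

N_STATES, GOAL = 6, 5
LEFT, RIGHT = 0, 1


def step(state, action):
    """Toy corridor: move left or right; reaching GOAL gives reward 1."""
    nxt = max(0, state - 1) if action == LEFT else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def pretrain_from_demos(demos, bonus=0.5):
    """Phase 1 (imitation): give a head start to actions a human took."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for state, action in demos:
        q[state][action] = bonus
    return q


def q_learning(q, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Phase 2 (reinforcement learning): keep improving the same table."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice((LEFT, RIGHT))  # explore
            else:
                action = max((LEFT, RIGHT), key=lambda a: q[state][a])
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


# Hypothetical human gameplay: the player always walked right.
demos = [(s, RIGHT) for s in range(GOAL)]
q = q_learning(pretrain_from_demos(demos))
policy = [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(GOAL)]
print(policy)
```

The point of the sketch is only the hand-over: phase 2 starts from whatever phase 1 produced instead of from zeros. With a neural network the same pattern would mean training the network on recorded (state, action) pairs first and then continuing to train the same weights with a reinforcement learning loss.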

Curious on how to use some basic machine learning in a web application

A co-worker and I had an idea to create a little web game where a user enters a chunk of data about themselves and the application then writes text that sounds like them, in certain structures. (Trying to leave the idea a little vague.) We are both new to ML and thought this could be a fun first dive.
We have a decent bit of background with PHP, JavaScript (FE and Node), Ruby, and a little bit of other languages, and have been interested in learning Python for ML. We are curious whether a cost-efficient ML library for text can run well alongside a web app, given that most servers lack GPUs.
Perhaps you have to pay for one of the cloud-based systems, but we wanted to find the best entry point for this idea without racking up too much cost. (So far I have been reading about running PyTorch or TensorFlow, but it sounds like you lose a lot of efficiency running on CPUs.)
Thank you!
(My other thought is doing it via an iOS app and trying Apple's ML setup.)

It sounds like you are looking for something like TensorFlow.js.

Yes, before jumping into training something with deep learning (which might even be unnecessary for your purpose), try to build a nice, simple baseline for this.
Before deep learning (just a few years ago), people did similar tasks using n-gram language models: https://web.stanford.edu/~jurafsky/slp3/3.pdf
Essentially, you try to predict the next few words probabilistically given a small context (of n words; typically n is small, like 5 or 6).
This should be a lot of fun to work out and should do quite well with a small amount of data. Such a model will also run blazingly fast, so you don't have to worry about GPUs and compute.
To improve on these results with deep learning, you'll first need to collect a ton of data, and it will take work to make it fast on a web-based platform.
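The n-gram baseline described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a tuned model: it counts which word follows each context and samples from those counts, with no smoothing or backoff. The tiny corpus and the function names are made up, and n=3 is used only to keep the example readable.

```python
import random
from collections import Counter, defaultdict


def train_ngram(text, n=3):
    """Count which word follows each context of n-1 words."""
    words = text.split()
    model = defaultdict(Counter)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context][words[i + n - 1]] += 1
    return model


def generate(model, seed_words, length=10, rng=None):
    """Extend seed_words (must hold n-1 words) by sampling from the counts."""
    rng = rng or random.Random(0)
    out = list(seed_words)
    k = len(seed_words)
    for _ in range(length):
        followers = model.get(tuple(out[-k:]))
        if not followers:
            break  # unseen context; a fuller model would back off here
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return " ".join(out)


# Hypothetical user text; in the web game this would be the user's own writing.
corpus = "i like tea . i like coffee . i like tea with milk ."
model = train_ngram(corpus, n=3)
print(model[("i", "like")])
print(generate(model, ("i", "like"), length=5))
```

Training is a single pass over the text and generation is a dictionary lookup per word, which is why this kind of model runs comfortably on a CPU-only web server.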

What is needed for a recommendation engine based on word/text input

I'm new to machine learning (AI) technology. I'm developing a messenger app for Android/iOS in which I would like to recommend a product from a relatively small product portfolio to users, based on the texts/words/conversations they write.
Example 1:
If the user of the messenger writes a sentence including the words "wine", "dinner", "date", the AI should recommend a bottle of wine to the user.
Example 2:
If the user of the app writes that he drank a good coffee this morning, the AI should recommend a mug to the user.
Example 3:
If the user writes something about a cute boy she met the day before, the AI should recommend a "teddy bear" to the user.
I have been a software developer for almost 20 years, with experience in developing C/C++/Java-based applications (Android and iOS apps) as well as some experience with Google Cloud Platform. ML/AI technology is completely new to me. Okay, I know the basics (input data is needed to train the ML/AI system, etc.), but I wonder if there is already a framework that could help me develop a system that solves the use case described above.
I would appreciate it if you could give me some hints on where and how to start.
Thank you and regards
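For a small portfolio, the three examples above can be approximated without any ML at all, which gives a baseline to compare a trained model against. The following is only a sketch of that keyword-lookup baseline. The product list, the trigger words and the function name are assumptions made up for illustration, and it matches exact words only (no synonyms, typos or context).

```python
import re

# Hypothetical product portfolio: product -> trigger keywords.
PRODUCT_KEYWORDS = {
    "bottle of wine": {"wine", "dinner", "date"},
    "mug": {"coffee", "tea"},
    "teddy bear": {"cute", "boy", "girl"},
}


def recommend(message, min_hits=1):
    """Return the product whose keywords overlap the message most, or None."""
    tokens = set(re.findall(r"[a-z']+", message.lower()))
    best, best_hits = None, 0
    for product, keywords in PRODUCT_KEYWORDS.items():
        hits = len(tokens & keywords)
        if hits > best_hits:
            best, best_hits = product, hits
    return best if best_hits >= min_hits else None


print(recommend("Dinner date tonight, should I bring wine?"))
print(recommend("I drank a good coffee this morning"))
print(recommend("I met a cute boy yesterday"))
print(recommend("The weather is nice"))
```

Once messages stop matching the keyword lists, that is the signal that a learned text classifier is worth the extra effort.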

It is definitely possible to implement such an application. If you want to do it in Google Cloud, you will need some understanding of TensorFlow.
First of all, I recommend doing the Machine Learning Crash Course for a good introduction to machine learning and to start familiarizing yourself with TensorFlow. Afterwards, I recommend taking a look at the TensorFlow tutorials, which will give you a more practical introduction to TensorFlow and include various examples of building/training/testing models.
Once you are familiar with TensorFlow, you can move on to learning how to run jobs in the Machine Learning Engine, starting with the quickstart. The documentation includes detailed guides on how to use the ml-engine, plus multiple samples and tutorials.
Since I believe that your application falls into the recommender system category, here you can see an example model, in Google Cloud ML Engine, of how to recommend items to users based on their previous searches. In your case, you would have to build a model that recommends items to users based on the previous words in their sentences.
The second option, in case you don't want to go through the hassle of building a new model from scratch, is to use the Google Cloud Natural Language API, which you can think of as pre-trained models built on Google's (incredibly big) data. In your case, I believe that the Content Classifying API would help you achieve what your application intends to do. However, the outputs (which you can see here) are limited to what the model was trained to do and might not be specific enough for your application. Still, it is an easy solution, and you can use this API to extract labels/information and send them as input to another model.
I hope that these links give you some foundation on what is possible with TensorFlow in the ML Engine, and that they are useful to you.

What Tensorflow API to use for Seq2Seq

This year Google produced 5 different packages for seq2seq:
seq2seq (claimed to be general purpose but inactive)
nmt (active but supposed to be just about NMT probably)
legacy_seq2seq (clearly legacy)
contrib/seq2seq (not complete probably)
tensor2tensor (similar purpose, also active development)
Which package is actually worth using for an implementation? They all seem to take different approaches, but none of them seems stable enough.

I had the same headache: which framework to choose? I wanted to implement OCR using an encoder-decoder with attention. I tried to implement it using legacy_seq2seq (it was the main library at that time), but the whole process was hard to understand, and it certainly should not be used any more.
https://github.com/google/seq2seq: to me it looks like an attempt to provide a command-line training script so that you don't have to write your own code. If you want to train a translation model, this should work, but in other cases it may not (as with my OCR), because there is not enough documentation and there are too few users.
https://github.com/tensorflow/tensor2tensor: this is very similar to the implementation above, but it is maintained and you can add more of your own code, e.g. for reading your own dataset. The basic usage is again translation, but it also enables tasks like image captioning, which is nice. So if you want to try a ready-to-use library and your problem is txt->txt or image->txt, you could try this. It should also work for OCR. I'm just not sure if there is enough documentation for every case (like using a CNN as the feature extractor).
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/seq2seq: unlike the above, this is a pure library, which can be useful when you want to create a seq2seq model yourself using TF. It has functions to add attention, sequence loss, etc. In my case I chose this option because it gives me much more freedom in choosing each step of the pipeline: I can choose the CNN architecture, the RNN cell type, bidirectional or unidirectional RNN, the type of decoder, etc. But then you will need to spend some time getting familiar with the ideas behind it.
https://github.com/tensorflow/nmt: another translation framework, based on the tf.contrib.seq2seq library.
From my perspective you have two options:
If you want to check an idea very quickly and be sure that you are using very efficient code, use the tensor2tensor library. It should help you get early results or even a very good final model.
If you want to do research, are not sure exactly how the pipeline should look, or want to learn about the idea behind seq2seq, use the tf.contrib.seq2seq library.
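If you go the "learn the idea behind it" route, it may help to see what an attention step computes before reading any library code. The sketch below is plain Python, not the TensorFlow API: it shows dot-product attention, one common variant, for a single decoder step. The vectors and the function names are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(query, encoder_states):
    """Return (weights, context) for one decoder step.

    query: the current decoder state, a list of floats.
    encoder_states: one vector per input position, same length as query.
    """
    # Score each input position by its dot product with the query.
    scores = [sum(q * h for q, h in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    # Context is the weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Hypothetical encoder outputs for a 3-position input and one decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = dot_attention(query, encoder_states)
print(weights)
print(context)
```

In an encoder-decoder for OCR, the encoder states would be the CNN/RNN features along the image width, and the context vector would be fed to the decoder at each output character.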
