dtale show in jupyter book - html-table

I use Jupyter Book to create books from JupyterLab notebooks. dtale tables do not show up in the Jupyter Book output (they do show up in the Jupyter notebook). I read that Jupyter Book expects interactive outputs to produce self-contained HTML that works without requiring any external dependencies to load.
dtale needs a server on port 40000, so it does not fit the bill. However, I control the server that serves the Jupyter Book HTML pages. Is it possible to configure this HTML server to support the dtale requests? Any pointers would be appreciated.
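To illustrate what I mean: the cell output would essentially have to be an iframe pointing back at the running dtale server, something like the sketch below (the host, port, and instance path are placeholders for my setup), which only works if that server stays reachable from wherever the book's HTML is served.

from IPython.display import IFrame

# Placeholder URL of a running dtale instance; the rendered book page must be able to reach it.
IFrame("http://my-book-server:40000/dtale/main/1", width=900, height=500)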


Is there a stealthy headless browser automation tool similar to Puppeteer for Python?

I am aware of the Pyppeteer library and Pyppeteer Stealth, but the problem with them is that the website I am trying to scrape detects Pyppeteer Stealth (the Python port of Puppeteer) and blocks it. The original Puppeteer Stealth on Node.js does work fine on that website, but I would much rather build this scraper in Python since I am far more familiar with it.
Which other stealthy and up-to-date headless browser automation tools are available?
All I need it for is grabbing the HTML content and parsing it with Beautiful Soup. Unfortunately, the requests and requests-html libraries also do not work on this website.
If you don't care that much about the automation part, I would recommend looking into Scrapy (and Scrapy Splash if you need JavaScript to be rendered, which I assume is why you want to use Pyppeteer in the first place), combined with some basic tactics to avoid getting caught as a bot, such as user-agent rotation and proxy rotation.
This is the tactic I am currently using to build a scraper for similarweb.com.
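As a rough sketch of the user-agent rotation piece in Scrapy (the middleware and project names are hypothetical, and in practice you would keep a much larger pool of real, current browser strings):

import random

# Placeholder pool of user-agent strings; extend with real, current browser strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]

class RotateUserAgentMiddleware:
    """Downloader middleware that picks a random User-Agent for every request."""

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(USER_AGENTS)
        return None  # continue through the rest of the middleware chain

Enable it in settings.py with something like DOWNLOADER_MIDDLEWARES = {"myproject.middlewares.RotateUserAgentMiddleware": 400}; proxy rotation works the same way, by setting request.meta["proxy"] in a similar middleware.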

Tensorflow serving performance slow

I have a model (based on Mask_RCNN) which I have exported to a servable. I can run it with TensorFlow Serving in a Docker container locally on my MacBook Pro, and using the JSON API it responds in 15-20 s, which is not fast, but I didn't really expect it to be.
I've tried to serve it on various AWS machines based on the DLAMI, and also tried some Ubuntu AMIs, specifically using a p2.xlarge with a GPU, 4 vCPUs and 61 GB of RAM. When I do this, the same model responds in about 90 s. The configurations are identical, since I've built a Docker image with the model inside it.
I also get a timeout using the AWS example here: https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-tfserving.html
Has anyone else experienced anything similar to this or have any ideas on how I can fix or isolate the problem?
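For reference, this is roughly how I am hitting the REST API and measuring latency (the model name and dummy payload below are placeholders; the real request sends an image tensor matching the servable's signature):

import json
import time

import requests

# Placeholder endpoint; 8501 is the default TensorFlow Serving REST port.
URL = "http://localhost:8501/v1/models/mask_rcnn:predict"

# Placeholder payload; the real one carries an image in the shape the serving signature expects.
payload = {"instances": [{"input_image": [[[0.0, 0.0, 0.0]]]}]}

start = time.time()
resp = requests.post(URL, data=json.dumps(payload), timeout=300)
print(f"status={resp.status_code}, latency={time.time() - start:.1f}s")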

Google Colab variable values lost after VM recycling

I am using a Google Colab Jupyter notebook for algorithm training and have been struggling with an annoying problem. Since Colab runs in a VM environment, all my variables become undefined if my session is idle for a few hours. I come back from lunch and the training dataframe, which takes a while to load, is undefined, so I have to call read_csv again to reload my dataframes.
Does anyone know how to rectify this?
If the notebook is idle for some time, it might get recycled: "Virtual machines are recycled when idle for a while" (see the Colaboratory FAQ).
There is also a hard limit on how long a virtual machine can run (up to about 12 hours).
What could also happen is that your notebook gets disconnected from the internet / Google Colab, which could be an issue with your network.
There is no way to "rectify" this, but if you have processed some data, you can add a step that saves it to Google Drive before the session goes idle.
You can also use a local runtime with Google Colab. That way, the notebook uses your own machine's resources and you won't hit those limits. More on this: https://research.google.com/colaboratory/local-runtimes.html
There are various ways to save your data in the process:
you can save to the notebook VM's file system, e.g. df.to_csv("my_data.csv")
you can import sqlite3, the Python module for the popular SQLite database. The difference between SQLite and other SQL databases is that the DBMS runs inside your application, and data is saved to that application's file system. Info: https://docs.python.org/2/library/sqlite3.html
you can save to your Google Drive (see the sketch after this list), download to your local file system through your browser, upload to GCP... more info here: https://colab.research.google.com/notebooks/io.ipynb#scrollTo=eikfzi8ZT_rW
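A minimal sketch of the save-to-Drive option (the CSV path is a placeholder; Drive asks for authorization the first time you mount it):

from google.colab import drive
import pandas as pd

drive.mount("/content/drive")  # prompts for authorization on first use

df = pd.DataFrame({"a": [1, 2, 3]})  # stand-in for your training dataframe
df.to_csv("/content/drive/My Drive/my_data.csv", index=False)

After a VM recycle you can mount Drive again and read the file back with pd.read_csv, which is usually much faster than recomputing it.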

IBM Watson Text To Speech Bower_Components

I'm pulling my hair out and would welcome some input/advice. I can't get any of the code examples for Watson's Text to Speech service to work. Nor the example code for Amazon Polly or ReadSpeaker, for that matter...
Every time I try to track down the problem, it seems to boil down to something along the lines of "you need to install such-and-such (Composer, curl, Bower, Drush, etc.) via the command line". That's all well and good, except that I'm new to web development and my company currently uses a shared hosting platform for which I do not have command-line access.
Is there any way to get a decent text to speech engine installed on a shared hosting platform, or do I just need to bite the bullet and make the switch to a VPS?
Depending on what you're actually trying to do with any text-to-speech solution, you should at least be able to test from a command line locally and then deploy the solution in code to a shared environment.
Regarding IBM Watson's Text to Speech, it has no dependencies in and of itself. In its most basic form, you're just making a call to a REST API: https://www.ibm.com/watson/developercloud/text-to-speech/api/v1/#synthesize_audio
As an example, using curl from any command line, the following will return a wav file:
curl -X GET -u "{username}":"{password}" \
  --output hello_world.wav \
  "https://stream.watsonplatform.net/text-to-speech/api/v1/synthesize?accept=audio/wav&text=Hello%20world&voice=en-US_AllisonVoice"
How you handle that file will depend on the programming language you use, but there are still no dependencies needed to get started with the examples.
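For example, the same call from Python needs nothing beyond the requests library (the credentials and voice are the same placeholders as in the curl example):

import requests

URL = "https://stream.watsonplatform.net/text-to-speech/api/v1/synthesize"

resp = requests.get(
    URL,
    params={"accept": "audio/wav", "text": "Hello world", "voice": "en-US_AllisonVoice"},
    auth=("{username}", "{password}"),  # placeholder service credentials
)
resp.raise_for_status()

with open("hello_world.wav", "wb") as f:
    f.write(resp.content)  # write the synthesized audio to disk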

How do you access an IPython Notebook remotely?

Is it possible to have an IPython notebook open in your own local browser while it is actually running on a remote machine?
How does one actually access an IPython notebook running remotely using ssh?
Quoth the extensive Jupyter Documentation for Running a Notebook Server:
The Jupyter notebook web application is based on a server-client structure. The notebook server uses a two-process kernel architecture based on ZeroMQ, as well as Tornado for serving HTTP requests.
This document describes how you can secure a notebook server and how to run it on a public interface.
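If you would rather not expose the server on a public interface, the usual alternative is an SSH tunnel: run the notebook server on the remote machine, forward a local port to it, and browse locally. A minimal sketch with placeholder host and ports; it just wraps the standard ssh -N -L invocation in Python:

import subprocess

# On the remote machine, start the server without opening a browser, e.g.:
#   jupyter notebook --no-browser --port=8889
# Locally, forward local port 8888 to port 8889 on the remote machine over SSH.
tunnel = subprocess.Popen(["ssh", "-N", "-L", "8888:localhost:8889", "user@remote-host"])

# While the tunnel process is running, open http://localhost:8888 in your local browser.
# Call tunnel.terminate() to close the forwarding when you are done.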
If you store your IPython notebook on GitHub, a GitHub Gist, or any file service (Dropbox), you can point http://nbviewer.jupyter.org/ to your file and view it online.
Or you can export your notebook to HTML: https://ipython.org/ipython-doc/1/interactive/nbconvert.html
This may be less necessary, as GitHub now displays IPython notebooks directly (try https://github.com/jakevdp/sklearn_pycon2015/blob/master/notebooks/02.1-Machine-Learning-Intro.ipynb).
The code for nbviewer is also on GitHub: https://github.com/jupyter/nbviewer
Let me know if you need to modify notebooks remotely or just view them.
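If you go the nbconvert route mentioned above, here is a minimal Python sketch (the notebook filename is a placeholder; jupyter nbconvert --to html notebook.ipynb does the same from the command line):

import nbformat
from nbconvert import HTMLExporter

# Load the notebook and render it to a standalone HTML page.
nb = nbformat.read("my_notebook.ipynb", as_version=4)
body, _resources = HTMLExporter().from_notebook_node(nb)

with open("my_notebook.html", "w", encoding="utf-8") as f:
    f.write(body)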