Python interaction in Pandas documentation - pandas

In this document, http://pandas.pydata.org/pandas-docs/stable/pandas.pdf, the Python interaction is rendered very nicely.
Where are the LaTeX sources, so I can see how this is done?

The docs are generated using Sphinx.
You can see how by reading the make.py file in the pandas GitHub repository.
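For reference, the rendered Python sessions in the pandas docs come from the IPython Sphinx directive in the .rst sources rather than hand-written LaTeX; a minimal sketch of what such a block looks like (assuming IPython.sphinxext.ipython_directive is enabled in conf.py):
.. ipython:: python

   import pandas as pd
   s = pd.Series([1, 2, 3])
   s.describe()
At build time the directive runs the code and inserts both the input and its output into the generated HTML and PDF.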

Related

Generate sas7bdat files from a pandas dataframe

I would like to know if there is any Python library that supports this conversion. The options I've found so far are SASpy, CSV, or a SQL database, but I was unsuccessful with them.
This is not really a programming question, but I hope it won't be an issue.
I've found this post:
Export pandas dataframe to SAS sas7bdat format
But I was hoping to find updates on newer libraries that support sas7bdat file creation, and to learn how licensing works for SASpy.
The sas7bdat format is very hard to write. Reading it is fairly doable (but pretty hard); writing it is brutal. SAS costs a LOT of money and cannot be purchased (it is leased). My suggestions:
Use one of the products from companies that have done it. Some examples: CoyRoc (SSIS adaptor) $, StatTransfer $, SPSS $$$, SAS (lots of dollar signs). WPS might be able to do it, but they save to their own format to avoid the mess. They probably also support sas7bdat export.
Do not use the sas7bdat format. Consider something else, like the SAS Transport format. Look at my GitHub repository (savian-net) for C# code that can do it. Translate it to Python, or find a Python library that can handle SAS Transport (see the sketch below).
The sas7bdat format is binary, proprietary, and 100% unpublished. Any docs are guesses based upon binary sleuthing. It is based on an old mainframe format, and 'likely remnants' appear to be included. My suggestion is to avoid it like the plague and find an alternative.
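As a concrete illustration of the SAS Transport suggestion above, here is a minimal sketch in Python, assuming the third-party pyreadstat package is installed (the exact arguments are an assumption; check the pyreadstat docs for your version):
import pandas as pd
import pyreadstat

df = pd.DataFrame({'id': [1, 2, 3], 'score': [0.5, 0.7, 0.9]})
# Write the DataFrame as a SAS Transport (.xpt) file, which SAS reads natively
pyreadstat.write_xport(df, 'scores.xpt')
On the SAS side, the transport file can then be read with a LIBNAME XPORT statement and PROC COPY, and saved as a sas7bdat data set if needed.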
An alternative to using xport as Stu suggested: as of Viya 2021.2.6, SAS supports reading externally generated Parquet files via the new Parquet import engine. As such, you could export the data to Parquet from Python, then import that directly into SAS and save it as a .sas7bdat file.
https://communities.sas.com/t5/SAS-Communities-Library/Parquet-Support-in-SAS-Compute-Server/ta-p/811733
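The Python side of that workflow is a one-liner; a minimal sketch, assuming pyarrow (or fastparquet) is installed as pandas' Parquet engine:
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'score': [0.5, 0.7, 0.9]})
# Write a Parquet file that SAS Viya 2021.2.6+ can import directly
df.to_parquet('scores.parquet')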

Cannot find the source code for `tf.quantization.fake_quant_with_min_max_args`

Where can one find the GitHub source code for tf.quantization.fake_quant_with_min_max_args? The TF API documentation has no link to the GitHub source file, and I could not find one on GitHub.
The kernel for this op is defined here:
https://github.com/tensorflow/tensorflow/blob/ac74e1746a28b364230072d4dac5a45077326dc2/tensorflow/core/kernels/fake_quant_ops.cc#L63-L98
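For context, the Python-level API wraps that C++ kernel; a minimal usage sketch (the min/max/num_bits values here are arbitrary, purely for illustration):
import tensorflow as tf

x = tf.constant([-1.5, 0.0, 0.3, 7.2])
# Fake-quantize x to 8 bits, clamping values to the [min, max] range
y = tf.quantization.fake_quant_with_min_max_args(x, min=-6.0, max=6.0, num_bits=8)
print(y)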

How do I link up the NLTK libraries to chatterbot and then use SentimentComparison?

I've managed to download the NLTK data to a local chatterbot data folder, but after that I can't find any documentation on how exactly to get SentimentComparison to work, or how to initialize the VADER lexicon (which apparently gets extracted from NLTK) and then have the chatbot use it all for the output.
Python 3.6.3
chatterbot 0.7.6
NLTK installed here ...chatterbot\data\nltk_data
Q: Can anyone provide me with examples?
You can use the sentiment analysis comparison by setting the statement_comparison_function for your chat bot.
For example:
from chatterbot import ChatBot
from chatterbot.comparisons import sentiment_comparison

chatbot = ChatBot(
    'Example Bot',  # any bot name; the rest of your configuration goes here
    # ...
    statement_comparison_function=sentiment_comparison
)
Your chat bot will download and use the NLTK data that it needs (such as the VADER lexicon) automatically.
More information about the available comparison functions can be found in the ChatterBot documentation: http://chatterbot.readthedocs.io/en/latest/comparisons.html
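A quick usage sketch once the bot is configured as above (the input text is just an example):
# Responses are now selected using sentiment similarity to the input statement
response = chatbot.get_response('I am feeling great today!')
print(response)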

Read streaming data from s3 using pyspark

I would like to leverage Python for its extremely simple text parsing and functional programming capabilities, and also to tap into its rich offering of scientific computing libraries like numpy and scipy, so I would like to use pyspark for this task.
The task I am looking to perform at the outset is to read from a bucket where text files are being written as part of a stream. Could someone paste a code snippet showing how to read streaming data from an S3 path using pyspark? Until recently I thought this could only be done using Scala and Java, but I just found out today that streaming is supported in pyspark from Spark 1.2 onwards; however, I am unsure whether S3 streaming is supported.
The way I used to do it in Scala was to read it in as a HadoopTextFile, I think, and also to use configuration parameters to set the AWS key and secret. How would I do something similar in pyspark?
Any help would be much appreciated.
Thanks in advance.
Check the "Basic Sources" section in the documentation: https://spark.apache.org/docs/latest/streaming-programming-guide.html
I believe you want something like
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext('local[2]', 'my_app')
ssc = StreamingContext(sc, 1)  # 1-second batch interval
# Monitor an S3 prefix for newly written text files
stream = ssc.textFileStream('s3n://...')
stream.pprint()  # at least one output operation is required before starting
ssc.start()
ssc.awaitTermination()
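Regarding setting the AWS key and secret: one commonly used approach in pyspark (an assumption to verify against your Spark and Hadoop versions, since it reaches into the underlying Java Hadoop configuration) is to set the s3n credentials on the SparkContext before creating the stream:
# Set AWS credentials in the underlying Hadoop configuration for the s3n filesystem
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set('fs.s3n.awsAccessKeyId', '<your-access-key>')
hadoop_conf.set('fs.s3n.awsSecretAccessKey', '<your-secret-key>')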

Replacement for gnome.help_display()

I'm looking at porting a pygtk application to Gtk 3 and gobject-introspection. When help is selected in the menu, the code calls gnome.help_display('appname') to display it.
The gnome package is firmly part of Gnome 2 - in Ubuntu it's part of python-gnome2, with lots of Gnome 2 dependencies. I can't find any equivalent package for Gnome 3. Is there any way to achieve the same functionality without depending on Gnome 2?
Apart from that function call, the app has no particular requirement for Gnome libraries. So a desktop-independent way of displaying the help, which is in Docbook format, would be ideal.
You can use Gtk.show_uri(). For instance:
$ python
>>> from gi.repository import Gtk
>>> Gtk.show_uri(None, "help:evince", 0)
The first parameter is the Screen, the second the URI, and the third a timestamp.
With respect to the documentation, I would recommend using Mallard, which is way simpler than DocBook and is oriented toward building topic-based documentation.
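A minimal sketch of wiring Gtk.show_uri into a Gtk 3 help handler (the 'help:appname' URI is a placeholder and assumes the application's help pages are installed where the desktop help viewer, e.g. Yelp, can find them):
from gi.repository import Gtk, Gdk

def on_help_activate(widget):
    # Open the application's help in the desktop help viewer
    Gtk.show_uri(None, 'help:appname', Gdk.CURRENT_TIME)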