How do I link up the NLTK libraries to chatterbot and then use SentimentComparison? - chatterbot

I've managed to download the NLTK data to a local chatterbot data folder, but beyond that I can't find any documentation on how exactly to get SentimentComparison working, how to initialize the VADER lexicon (which is apparently extracted from the NLTK data), or how to have the chatbot use it all for its output.
Python 3.6.3
chatterbot 0.7.6
NLTK installed here ...chatterbot\data\nltk_data
Q: Can anyone provide me with examples?

You can use the sentiment analysis comparison by setting the statement_comparison_function for your chat bot.
For example:
from chatterbot import ChatBot
from chatterbot.comparisons import sentiment_comparison

chatbot = ChatBot(
    # ...
    statement_comparison_function=sentiment_comparison
)
Your chat bot will download and use the NLTK data that it needs (such as the VADER lexicon) automatically.
More information about the available comparison functions can be found in the ChatterBot documentation: http://chatterbot.readthedocs.io/en/latest/comparisons.html
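To illustrate what a sentiment comparison does conceptually: it scores two statements by how close their sentiment polarity is. Below is a minimal, self-contained sketch of that idea using a tiny made-up word list; the real comparison delegates the scoring to NLTK's VADER analyzer, which chatterbot downloads for you:

```python
# Illustrative only: a toy version of sentiment-based comparison.
# The made-up word lists stand in for the VADER lexicon.
POSITIVE = {"great", "good", "love", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "sad"}

def polarity(text):
    """Return a crude sentiment score in [-1, 1]."""
    words = text.lower().split()
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return score / len(words)

def sentiment_similarity(a, b):
    """1.0 when both statements have the same polarity, 0.0 when opposite."""
    return 1 - abs(polarity(a) - polarity(b)) / 2

print(sentiment_similarity("I love this", "This is great"))  # → 1.0
```

A comparison function in this shape (two statements in, a similarity score out) is what `statement_comparison_function` expects; chatterbot uses the score to pick the known statement closest to the input.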

Related

Why am I getting an error in org.apache.jena.query.text?

I am currently working with text search using Jena and Lucene. I have a problem with Apache Lucene, specifically with org.apache.jena.query.text. I wrote the imports like this:
import org.apache.jena.query.text.EntityDefinition;
import org.apache.jena.query.text.TextDatasetFactory;
import org.apache.jena.query.text.TextIndexConfig;
Those three imports cannot be resolved. I am using Lucene 8.4.0.
What should I do? I think it is because of the version of Lucene, but I'm not sure.
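(For what it's worth, the org.apache.jena.query.text classes are not provided by Lucene itself; they live in Jena's jena-text module, so that jar must be on the classpath in addition to the core Jena and Lucene jars. A sketch of the Maven dependency, with the version left open since jena-text must match your Jena version, and each Jena release is built against one specific Lucene version:)

```xml
<!-- Sketch: jena-text provides org.apache.jena.query.text.* -->
<dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>jena-text</artifactId>
    <version><!-- match your Jena version --></version>
</dependency>
```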

Read streaming data from s3 using pyspark

I would like to leverage Python for its extremely simple text parsing and functional programming capabilities, and to tap into its rich offering of scientific computing libraries like numpy and scipy, so I would like to use PySpark for this task.
The task I am looking to perform at the outset is to read from a bucket where text files are being written as part of a stream. Could someone paste a code snippet showing how to read streaming data from an S3 path using PySpark? Until recently I thought this could only be done with Scala and Java, but I just found out that from Spark 1.2 onwards streaming is supported in PySpark as well; I am unsure, however, whether S3 streaming is supported.
The way I used to do it in Scala was to read the data in as a HadoopTextFile, and to use configuration parameters to set the AWS key and secret. How would I do something similar in PySpark?
Any help would be much appreciated.
Thanks in advance.
Check the "Basic Sources" section in the documentation: https://spark.apache.org/docs/latest/streaming-programming-guide.html
I believe you want something like:
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext('local[2]', 'my_app')
ssc = StreamingContext(sc, 1)  # 1-second batch interval
stream = ssc.textFileStream('s3n://...')

# Start the computation; textFileStream picks up new files in the path.
ssc.start()
ssc.awaitTermination()
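As for the AWS credentials mentioned in the question, one common approach is to set them on the SparkContext's Hadoop configuration, mirroring the Scala setup. A sketch (the key values are placeholders, these property names apply to the s3n:// connector, and note that _jsc is an internal handle, so this relies on an implementation detail):

```python
from pyspark import SparkContext

sc = SparkContext('local[2]', 'my_app')

# Credentials for the s3n:// filesystem connector (placeholders):
hadoop_conf = sc._jsc.hadoopConfiguration()
hadoop_conf.set('fs.s3n.awsAccessKeyId', 'YOUR_ACCESS_KEY')
hadoop_conf.set('fs.s3n.awsSecretAccessKey', 'YOUR_SECRET_KEY')
```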

How to add add-ons to Orange3

I am using pandas to do some data analysis. Others in my company are wanting to process data in a similar fashion, but won't want to use a programming language to do it. After significant googling, I found Orange, which has the perfect interface for what I'm wanting people to do. However, the widgets don't do the types of tasks we're looking at. So, I decided to see if I could write my own widgets for Orange to do the tasks.
I'm trying to use Orange3; this seems like the best bet when I'm using WinPython. I must say that going through the documentation for widget creation (for Orange2) and the code for the Orange3 widgets is rather impressive - very nicely written and easy to use to implement what I'm wanting to do.
After writing a couple of widgets, how do I get them into Orange3? The widget creation tutorial is for Orange2 (in Python 2.7), and I haven't gotten it to work for Orange3.
My project is at the moment rather small:
dir/
    orangepandas/
        __init__.py
        owPandasFile.py
        pandasQtTable.py
    setup.py
setup.py currently contains the following:
from setuptools import setup

setup(name='orangepandas',
      version='0.1',
      packages=['orangepandas'],
      entry_points={'Orange.widgets': 'orangepandas = orangepandas'}
      )
When I run python setup.py install on this and then try opening Orange3 canvas, I don't see my shiny new widget in its new group.
After tracing through how Orange3 imports external libraries, it seems that Orange relies on the actual widget files existing on disk, rather than being inside an egg (zipped) file. Adding
zip_safe=False
to the setup options allowed Orange3 to import the widgets correctly. Orange3 uses os.path.exists in cache_can_ignore in canvas/registry/discovery.py to detect whether the path exists at all, and if it doesn't, it doesn't try to import it. Using zip_safe=False makes sure that the add-on stays uncompressed so that the individual files are accessible.
(For the next person who tries to do what I was doing.)
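Putting it together, the question's setup.py with the fix applied would look like this (names exactly as in the question; zip_safe is the only addition):

```python
from setuptools import setup

setup(
    name='orangepandas',
    version='0.1',
    packages=['orangepandas'],
    entry_points={'Orange.widgets': 'orangepandas = orangepandas'},
    # Install uncompressed so Orange3's os.path.exists check in
    # canvas/registry/discovery.py can see the widget files on disk.
    zip_safe=False,
)
```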

Python interaction in Pandas documentation

In this document http://pandas.pydata.org/pandas-docs/stable/pandas.pdf the python interaction is done very nicely.
Where are the LaTeX sources so I can see how this is done?
The docs are generated using Sphinx.
You can see how by reading the make.py file in the pandas GitHub repository.

Replacement for gnome.help_display()

I'm looking at porting a pygtk application to Gtk 3 and gobject-introspection. When help is selected in the menu, the code calls gnome.help_display('appname') to display it.
The gnome package is firmly part of Gnome 2 - in Ubuntu it's part of python-gnome2, with lots of Gnome 2 dependencies. I can't find any equivalent package for Gnome 3. Is there any way to achieve the same functionality without depending on Gnome 2?
Apart from that function call, the app has no particular requirement for Gnome libraries. So a desktop-independent way of displaying the help, which is in Docbook format, would be ideal.
You can use Gtk.show_uri(). For instance:
$ python
>>> from gi.repository import Gtk
>>> Gtk.show_uri(None, "help:evince", 0)
The first parameter is the screen, the second the URI, and the third a timestamp.
With respect to the documentation, I would recommend using Mallard, which is much simpler than DocBook and is oriented toward building topic-based documentation.