Display output from another python script in jupyter notebook - pandas

I run a loop in my jupyter notebook that references another python file using the execfile command.
I want to be able to see all the various prints and outputs from the file I call via execfile. However, I don't see any of the pandas DataFrame printouts. E.g. if the script just says 'df', I don't see the table output of the dataframe, although I do see the output of 'print 5'.
Can someone tell me what options I need to set to enable this output to be viewed?
import pandas as pd

list2loop = ['a', 'b', 'c', 'd']
for each_item in list2loop:
    execfile("test_file.py")
where 'test_file.py' is:
df = pd.DataFrame([each_item])
df
print 3

The solution is simply to use the %run magic instead of execfile.
Say you have a file test.py:
#test.py
print(test_input)
Then you can simply do
for test_input in (1, 2, 3):
    %run -i test.py
The -i flag tells IPython to run the file in IPython's namespace, so the script knows about all your variables, and variables defined in your script are in your namespace afterwards. If you explicitly call sys.exit in your script, you additionally have to use -e.
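The bare `df` in the original script never renders because only the value of an expression typed at the interactive top level gets rich display. Whether you use execfile or %run, calling display() explicitly inside the script makes the table show. A minimal Python 3 sketch (with a plain-print fallback in case IPython is not installed):

```python
import pandas as pd

try:
    from IPython.display import display  # renders rich HTML tables in a notebook
except ImportError:
    display = print  # plain-text fallback outside IPython, for illustration

list2loop = ['a', 'b', 'c', 'd']
for each_item in list2loop:
    df = pd.DataFrame([each_item])
    display(df)  # a bare `df` on its own line would be silently discarded here
```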

Related

Is there a way to get ipython autocompletion when piping a pandas dataframe to a function?

For example, if I have a pipe function:
def process_data(weighting, period, threshold):
    # do stuff
Can I get autocompletion on the process data arguments?
There are a lot of arguments to remember and I would like to make sure they get passed in correctly. In ipython, the function can autocomplete to show me the keyword args which is really neat, but I would like it to do this when piping a pandas dataframe too!
I don't see how this would be possible, but then again, I'm truly in awe of ipython and all its greatness. So, is this possible? If not, are there other hacks that people have come up with?
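For reference, "piping" here means pandas' `DataFrame.pipe`, which passes the frame as the first argument to the function; the function body and column below are made up to match the question's sketch:

```python
import pandas as pd

def process_data(df, weighting, period, threshold):
    # hypothetical pipeline step, mirroring the signature in the question
    return df[df['value'] * weighting > threshold].head(period)

df = pd.DataFrame({'value': [1, 5, 10]})

# df.pipe(f, ...) is equivalent to f(df, ...)
out = df.pipe(process_data, weighting=2, period=2, threshold=4)
```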
Install the pyreadline library.
$ pip install pyreadline
Update:
It seems like this problem is specific to some versions of ipython. The solution is the following:
Run the command below from the terminal:
$ ipython profile create
It will create a default profile at ~/.ipython/profile_default/ipython_config.py
Now edit this ipython_config.py and add the lines below; that will solve the issue.
c = get_config()
c.Completer.use_jedi = False
Reference:
https://github.com/jupyter/notebook/issues/2435
https://ipython.readthedocs.io/en/stable/config/intro.html

How to get values from a different python script and call it in another python script

I have two files: one holding all of my SQL queries in variables, and the other executing them. Is there any way to call the queries from the first file in the second file?
Example:
File1 -
query1 = 'select * from users'
File2 -
import file1

c = conn.cursor()
c.execute(getattribute(query1))
This is the logic I am trying to do. I am new to this so any help would be appreciated.
Depending on how your code is structured, you could probably import the variables you need by adding the following line in File2:
from file1 import query1, query2, query3
That will directly import the variables and you can use them as if they were declared in the second file.
Another option is to keep the import file1 line and refer to your queries as file1.query1.
For more info see this thread: Importing variables from another file?
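If you do want to look a query up by its name at runtime (what the question's made-up `getattribute` seems to be reaching for), the built-in `getattr` works on an imported module. A sketch, with `file1` simulated inline so it runs standalone:

```python
import types

# stand-in for `import file1`; in real code, file1.py would define query1
file1 = types.ModuleType('file1')
file1.query1 = 'select * from users'

sql = file1.query1                       # direct attribute access
sql_dynamic = getattr(file1, 'query1')   # lookup by name at runtime
```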

Snakemake --forceall --dag results in mysterious Error: <stdin>: syntax error in line 1 near 'File' from Graphviz

My attempts to construct a DAG or rulegraph from an RNA-seq pipeline using snakemake result in an error message from Graphviz: Error: <stdin>: syntax error in line 1 near 'File'.
The error can be corrected by commenting out two print commands that have no visible syntax errors. I have tried converting the scripts from UTF-8 to ASCII in Notepad++. Graphviz seems to have issues with these two specific print statements, even though there are other print statements within the pipeline scripts. The error is easily corrected, but it is still annoying: I would like colleagues to be able to construct these diagrams for their publications without hassle, and the print statements inform them of what is happening in the workflow. My pipeline consists of a Snakefile and multiple rule files, as well as a config file. If the offending line is commented out in the Snakefile, then Graphviz takes issue with another line in a rule script.
#######Snakefile
#!/usr/bin/env python
import os
import glob
import re
from os.path import join
import argparse
from collections import defaultdict
import fastq2json
from itertools import chain, combinations
import shutil
from shutil import copyfile

# Testing for sequence file extension
directory = "."
MainDir = os.path.abspath(directory) + "/"
## build the dictionary with the full path for each sequence file
fastq = glob.glob(MainDir + '*/*' + 'R[12]' + '**fastq.gz')
if len(fastq) > 0:
    print('Sequence file extensions have fastq')
    os.system('scripts/Move.sh')
    fastq2json.fastq_json(MainDir)
else:
    print('File extensions are good')
######Rule File
if not config["GroupdFile"]:
    os.system('Rscript scripts/Table.R')
    print('No GroupdFile provided')
snakemake --forceall --rulegraph | dot -Tpdf > dag.pdf should produce a pdf showing the snakemake workflow, but if the two lines aren't commented out it results in Error: <stdin>: syntax error in line 1 near 'File'.
To understand what is going on, take a close look at the command that generates your dag.pdf.
Try out the first part of your command:
snakemake --forceall --rulegraph
What does that do? It prints out the dag in text form.
By using a | symbol you 'pipe' (pass along) this print to the next part of your command:
dot -Tpdf > dag.pdf
And this part makes the actual pdf from the text that is piped, and stores it in dag.pdf. The problem is that when your snakefile executes print statements, those prints also get piped into the second half of your command, which interferes with the making of your dag.pdf.
A somewhat hackish way I solved the issue, so that I can still print but also generate the dag, is to use snakemake's logging functionality. It is not a documented approach, but it works really well for me:
from snakemake.logging import logger
logger.info("your print statement here!")
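An alternative worth noting (my own suggestion, not part of the answer above): write informational messages to stderr, so stdout carries only the DAG text that `dot` consumes. A minimal sketch:

```python
import sys

def info(msg):
    # stderr still reaches the terminal, but is not piped into `dot`
    print(msg, file=sys.stderr)

info('Sequence file extensions have fastq')
```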

conflict between fortran+iso_c_binding (via ctypes or cython) and matplotlib when reading namelist [only with python Anaconda!!]

[EDIT: the problem only applies with python anaconda, not with standard /usr/bin/python2.7]
[FYI: the gist referred to in this post can still be useful for anyone trying to use fortran with ctypes or cython, credit to http://www.fortran90.org/src/best-practices.html]
When using fortran code from within python (via iso_c_binding), either through ctypes or through cython, I ran into a weird incompatibility problem with matplotlib. Basically, if matplotlib is "activated" (via %pylab or by using the pyplot.plot command), reading the namelist omits any decimals! I.e. the value 9.81 is read as 9.00. Without matplotlib, no problem.
I made a minimal working example gist.github.com.
Basically, the fortran module just allows reading a double precision parameter g from a namelist and storing it as a global module variable. It can also print its value and allows setting it directly from the outside. This makes three functions:
read_par
print_par
set_par
You can download the gist example and then run:
make ctypes
python test_ctypes.py
test_ctypes.py contains:
from ctypes import CDLL, c_double
import matplotlib.pyplot as plt
f = CDLL('./lib.so')
print "Read param and print to screen"
f.read_par()
f.print_par()
# calling matplotlib's plot command seem to prevent
# subsequent namelist reading
print "Call matplotlib.pyplot's plot"
plt.plot([1,2],[3,4])
print "Print param just to be sure: everything is fine"
f.print_par()
print "But any new read will lose decimals on the way!"
f.read_par()
f.print_par()
print "Directly set parameter works fine"
f.set_par(c_double(9.81))
f.print_par()
print "But reading from namelist really does not work anymore"
f.read_par()
f.print_par()
With the output:
Read param and print to screen
g 9.8100000000000005
Call matplotlib.pyplot's plot
Print param just to be sure: everything is fine
g 9.8100000000000005
But any new read will lose decimals on the way!
g 9.0000000000000000
Directly set parameter works fine
g 9.8100000000000005
But reading from namelist really does not work anymore
g 9.0000000000000000
The same happens with the cython example (make clean; make cython; python test_cython.py).
Does anyone know what is going on, or whether there is a workaround? The main reason I wrote a wrapper for my fortran code is to be able to play around with a model: set parameters (via the namelist), run it, plot the results, set other parameters, and so on. So for my use case this bug kind of defeats the purpose of interactivity...
Many thanks for any hint.
PS: I am happy to file a bug somewhere, but would not know where (gfortran? matplotlib?)

Saving ipython variable/s to a text file

I have a few lists and arrays in ipython which I should like to save to a text file so that I can then use them in another context. How can this be done?
Look at the %store magic function
important = ['item', 42, 'list']
%store important

... time passes, session is restarted ...

%store -r
%store
Stored variables and their in-db values:
important -> ['item', 42, 'list']
Or, look to pickle.
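If you specifically need a plain text file that other programs can read (rather than IPython's internal store), the standard json module is one simple option; a minimal sketch:

```python
import json

important = ['item', 42, 'list']

# write the list out as text
with open('important.txt', 'w') as f:
    json.dump(important, f)

# later, possibly in another session or script, read it back
with open('important.txt') as f:
    restored = json.load(f)
```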