How to perform input redirection in Python like the bash <? - io-redirection

I want to feed text files to a C program; with bash I can do ./prog < file. How would you do the same in Python?

You can do that via subprocess.check_call:
import subprocess
subprocess.check_call(["prog"], stdin=open("/path/to/file"))
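If you also want the file handle closed deterministically, a with block works too. A minimal sketch; "./prog" and the path are placeholders:
import subprocess

with open("/path/to/file", "rb") as infile:
    subprocess.check_call(["./prog"], stdin=infile)  # equivalent to ./prog < file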


Is it possible in snakemake to produce reports and the DAG images automatically?

I would like to automatically produce the report and the DAG image after running the workflow in snakemake. I would also like to create the report with a given name, e.g. one specified in the config.yaml.
I cannot use the snakemake shell command inside the Snakefile, which is what I would normally use to create the reports manually.
The code I would use for creating the report manually:
snakemake --report
The code for manually creating the DAG image:
snakemake --rulegraph | dot -Tpdf > dag.pdf
How can I do this in the Snakefile?
Thanks for any help!
You could do this, but it looks pretty ugly to me. At the end of your Snakefile add:
onsuccess:
    shell(
        r"""
        snakemake --unlock
        snakemake --report
        snakemake --rulegraph | dot -Tpdf > dag.pdf
        """)
As suggested by FGV's comment, it can be done by using auto_report and providing the DAG from workflow.persistence:
onsuccess:
    from snakemake.report import auto_report
    auto_report(workflow.persistence.dag, "report/report.html")
For the DAG itself, you can export it to a text file and use shell to turn it into a PDF (still inside onsuccess):
    with open("report/dag.txt", "w") as f:
        f.writelines(str(workflow.persistence.dag))
    shell("cat report/dag.txt | dot -Tpdf > report/dag.pdf")
Note that this also works with the rule graph: workflow.persistence.dag.rule_dot()
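Since the question also asks for a report name taken from config.yaml, the target path can come from the config dict. A minimal sketch, assuming a hypothetical report_name key in config.yaml:
onsuccess:
    from snakemake.report import auto_report
    # "report_name" is a hypothetical key, e.g. report_name: report/my_report.html in config.yaml
    auto_report(workflow.persistence.dag, config.get("report_name", "report/report.html"))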

Blender Command line importing files

I am running a script from the Blender command line. All I want to do is run the same script for several files. I have worked out how to open a .blend file in the background and run a script on it, but since only one file is loaded, I cannot run the script on another file.
I looked in the Blender manual, but I could not find a command to import another file.
I got as far as creating a .blend file and running the script:
blender -b background.blend -P pythonfile.py
In addition, if possible, I would appreciate it if you could tell me how to script making the camera track an object with a Track To constraint (Ctrl + T -> Track To Constraint).
Thank you for reading my question.
Blender can only have one blend file open at a time, and any open scripts are cleared out when a new file is opened. What you want is a loop that starts Blender for each blend file, using the same script file.
On *nix systems you can use a simple shell script
#!/bin/sh
for BF in $(ls *.blend)
do
    blender -b ${BF} -P pythonfile.py
done
A more cross-platform solution is to use Python:
from glob import glob
from subprocess import call
for blendFile in glob('*.blend'):
    call(['blender',
          '-b', blendFile,
          '--python', 'pythonfile.py'])
To add a Track To constraint to the Camera, pointing it at Cube:
import bpy

camera = bpy.data.objects['Camera']
c = camera.constraints.new('TRACK_TO')
c.target = bpy.data.objects['Cube']
c.track_axis = 'TRACK_NEGATIVE_Z'
c.up_axis = 'UP_Y'
This is taken from my answer here which also animates the camera going around the object.
import bpy
# alternative using operators; CameraObject is assumed to be the camera object, e.g.:
CameraObject = bpy.data.objects['Camera']
bpy.context.view_layer.objects.active = CameraObject
bpy.ops.object.constraint_add(type='TRACK_TO')
CameraObject.constraints["Track To"].target = bpy.data.objects['ObjectToTrack']
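For completeness, here is a rough sketch of what pythonfile.py might contain so that the loop above does something useful per file. It assumes each .blend contains objects named 'Camera' and 'Cube', adds the constraint, and saves the file in place:
import bpy

# assumes objects named 'Camera' and 'Cube' exist in the currently opened .blend
camera = bpy.data.objects['Camera']
constraint = camera.constraints.new('TRACK_TO')
constraint.target = bpy.data.objects['Cube']
constraint.track_axis = 'TRACK_NEGATIVE_Z'
constraint.up_axis = 'UP_Y'

# write the modified scene back to the same .blend file
bpy.ops.wm.save_mainfile()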

Training a new entity type with spacy

I need help adding a new entity type and training my own model with spaCy's named entity recognition. I first wanted to try the example already available here:
https://github.com/explosion/spaCy/blob/master/examples/training/train_new_entity_type.py
but I'm getting this error:
ipykernel_launcher.py: error: unrecognized arguments: -f /root/.local/share/jupyter/runtime/kernel-c46f384e-5989-4902-a775-7618ffadd54e.json
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2
/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py:2890: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)
I tried to look into all the related questions and answers and couldn't resolve this.
Thank you for your help.
It looks like you're running the code from a Jupyter notebook, right? All spaCy examples are designed as fully standalone scripts to run from the command line. They use the Python library plac for generating the command-line interface, so you can run the script with arguments. Jupyter however seems to add another command-line option -f, which causes a conflict with the existing command-line interface.
As a solution, you could execute the script directly instead, for example:
python train_new_entity_type.py
Or, with command line arguments:
python train_new_entity_type.py --model en_core_web_sm --n-iter 20
Alternatively, you could also remove the @plac.annotations and plac.call(main) and just execute the main() function directly in your notebook.
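For example, after deleting the @plac.annotations(...) decorator and the plac.call(main) line, you can call main() from a notebook cell. A sketch; the keyword arguments below are assumed from that example script, so adjust them to your copy:
# paste the (modified) script into a cell or import it, then:
main(model=None, new_model_name='animal', output_dir='/tmp/new_entity_model', n_iter=30)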

Display output from another python script in jupyter notebook

I run a loop in my Jupyter notebook that references another Python file using the execfile command.
I want to be able to see all the various prints and outputs from the file I call via execfile. However, I don't see any of the pandas DataFrame outputs. E.g. if the file just has df on a line, I don't see the DataFrame table rendered, although I do see the output of 'print 5'.
Can someone tell me what options I need to set to make this output visible?
import pandas as pd
list2loop = ['a', 'b', 'c', 'd']
for each_item in list2loop:
    execfile("test_file.py")
where 'test_file.py' is:
df = pd.DataFrame([each_item])
df
print 3
The solution is simply using the %run magic instead of execfile (whatever execfile is).
Say you have a file test.py:
#test.py
print(test_input)
Then you can simply do
for test_input in (1, 2, 3):
    %run -i test.py
The -i tells %run to execute the file in IPython's namespace, so the script knows about all your variables, and variables defined in the script are in your namespace afterwards. If you explicitly call sys.exit in your script, you additionally have to use -e.
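Applied to the original pandas example, note that a bare df inside a script still produces no output, so the script should call IPython's display() explicitly. A sketch (file and variable names follow the question):
# test_file.py
import pandas as pd
from IPython.display import display

df = pd.DataFrame([each_item])   # each_item comes from the notebook namespace via %run -i
display(df)                      # renders the DataFrame table in the notebook
print(3)
and the notebook cell becomes:
for each_item in ['a', 'b', 'c', 'd']:
    %run -i test_file.py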

How to access cluster_config dict within rule?

I'm working on writing a benchmarking report as part of a workflow, and one of the things I'd like to include is information about the amount of resources requested for each job.
Right now, I can manually require the cluster config file ('cluster.json') as a hardcoded input. Ideally, though, I would like to be able to access the per-rule cluster config information that is passed through the --cluster-config arg. In __init__.py, this is accessed as a dict called cluster_config.
Is there any way of importing or copying this dict directly into the rule?
From the documentation, it looks like you can now use a custom wrapper script to access the job properties (including the cluster config data) when submitting the script to the cluster. Here is an example from the documentation:
#!/usr/bin/env python3
import os
import sys

from snakemake.utils import read_job_properties

jobscript = sys.argv[1]
job_properties = read_job_properties(jobscript)

# do something useful with the threads
threads = job_properties["threads"]

# access a property defined in the cluster configuration file (Snakemake >=3.6.0)
job_properties["cluster"]["time"]

os.system("qsub -t {threads} {script}".format(threads=threads, script=jobscript))
During submission (the last line of the previous example) you could either pass the arguments you want from the cluster.json to the script, or dump the dict into a JSON file, pass the location of that file to the script during submission, and parse the JSON file inside your script. Here is an example of how I would change the submission script to do the latter (untested code):
#!/usr/bin/env python3
import os
import sys
import tempfile
import json

from snakemake.utils import read_job_properties

jobscript = sys.argv[1]
job_properties = read_job_properties(jobscript)
threads = job_properties["threads"]

# dump the job properties (including the "cluster" section) to a temporary JSON file
fd, job_json = tempfile.mkstemp(suffix='.json')
with os.fdopen(fd, 'w') as f:
    json.dump(job_properties, f)

os.system("qsub -t {threads} {script} -- {job_json}".format(threads=threads, script=jobscript, job_json=job_json))
job_json should now appear as the first argument to the job script. Make sure to delete the job_json at the end of the job.
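If you do pass it along, the job can read the file back with plain json. A rough sketch, assuming the path to the JSON file reaches the job as a command-line argument:
import json
import sys

# how the path reaches the job depends on your wrapper and scheduler;
# here it is assumed to be the last command-line argument
with open(sys.argv[-1]) as f:
    job_properties = json.load(f)

cluster_resources = job_properties.get("cluster", {})   # per-rule entries from cluster.json
print(cluster_resources.get("time"))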
From a comment on another answer, it appears that you are just looking to store the job_json somewhere along with the job's output. In that case, it might not be necessary to pass job_json to the job script at all. Just store it in a place of your choosing.
You can easily manage the cluster resources per rule.
Indeed, you have the resources: keyword for this, used like this:
rule one:
    input: ...
    output: ...
    resources:
        gpu=1,
        time="HH:MM:SS"
    threads: 4
    shell: "..."
You can also take the values from the cluster's YAML configuration file, given with the --cluster-config parameter, like this:
rule one:
    input: ...
    output: ...
    resources:
        time=cluster_config["one"]["time"]
    threads: 4
    shell: "..."
When you call snakemake, you then just have to access the resources like this (example for a Slurm cluster):
snakemake --cluster "sbatch -c {threads} -t {resources.time} " --cluster-config cluster.yml
It will send each rule to the cluster with its specific resources.
For more information, you can check the documentation at this link: http://snakemake.readthedocs.io/en/stable/snakefiles/rules.html
Best regards