Snakemake - Tibanna config support

I'm trying to run snakemake --tibanna to deploy Snakemake on AWS using the "Unicorn" Step Function that Tibanna creates.
I can't seem to find a way to set the arguments Tibanna accepts, such as which subnet, availability zone, or security group will be used for the EC2 instance it launches.
Argument example (when running Tibanna without Snakemake):
https://github.com/4dn-dcic/tibanna/blob/master/test_json/unicorn/shelltest4.json#L32
Thanks!

Did you notice this option?
snakemake --help
--tibanna-config TIBANNA_CONFIG [TIBANNA_CONFIG ...]
Additional tibanan config e.g. --tibanna-config
spot_instance=true subnet=<subnet_id> security
group=<security_group_id>
I think it was added recently.
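For example (an untested sketch; the subnet and security group IDs below are placeholders, and the exact key names should be double-checked against your Tibanna version's config schema), you could pass those settings like this:

snakemake --tibanna --tibanna-config spot_instance=true subnet=subnet-0abc1234 security_group=sg-0abc1234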
-jk


In what order is the serverless file evaluated?

I have tried to find out in what order the statements of the serverless file are evaluated (maybe it is more common to say that 'variables are resolved').
I haven't been able to find any information about this and to some extent it makes working with serverless feel like a guessing game for me.
As an example, the latest surprise I got was when I tried to run:
$ sls deploy
serverless.yaml:
useDotenv: true
provider:
  stage: ${env:stage}
  region: ${env:region}

.env:
region=us-west-1
stage=dev
I got an error message stating that env is not available at the time when stage is resolved. This was surprising to me since I have been able to use env to resolve other variables in the provider section, and there is nothing in the syntax to indicate that stage is resolved earlier.
In what order is the serverless file evaluated?
In effect you've created a circular dependency. Stage is special because it is needed to identify which .env file to load: ${env:stage} would be resolved from ${stage}.env, but Serverless needs to know what ${stage} is in order to find ${stage}.env in the first place.
This is why stage is evaluated first.
Stage (and region, actually) are both optional CLI parameters. In your serverless.yml file what you're setting is a default, with the CLI parameter overriding it where different.
Example:
provider:
  stage: staging
  region: ca-central-1
Running serverless deploy --stage prod --region us-west-2 will result in prod and us-west-2 being used for stage and region (respectively) for that deployment.
I'd suggest removing any variable interpolation for stage and instead setting a default, and overriding via CLI when needed.
Then dotenv will know which environment file to use, and complete the rest of the template.
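Putting that together, a minimal sketch (untested, assuming the same .env layout as in the question) would be:

useDotenv: true
provider:
  stage: dev              # default; override with: sls deploy --stage prod
  region: ${env:region}   # resolved from the .env file once the stage is known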

How to start Leiningen with the java -Djavax.net.debug=true option?

I am trying to diagnose a few SSL connectivity issues with Leiningen, specifically which SSL key store and trust store Leiningen is using.
I am behind a corporate firewall and we have self-signed certificates deployed on all our desktops. I am running lein.bat on Windows 10.
Hence I have to start Leiningen with the java -Djavax.net.debug=true option.
The :jvm-opts in project.clj won't work -- I need to make sure Leiningen's own JVM is started with this option.
You can set Leiningen's JVM options by setting the LEIN_JVM_OPTS environment variable before running lein in the same terminal session.
The lein command is just a shell script which eventually invokes java with various options. You can edit this script to see what options are used and/or to modify them.
As Piotrek mentioned, the LEIN_JVM_OPTS environment variable is the canonical way of passing options to the JVM in which lein runs. You can see it used on line 372 of the source code.
For your case:
> export LEIN_JVM_OPTS='-Djavax.net.debug=true'
> lein clean
> lein run
Since you're running Windows, you'll want to look at the lein.bat file instead. You'll still set LEIN_JVM_OPTS, but how you go about it is a bit different. In the Windows command prompt (cmd.exe) you use the set command, leaving off the quotes so they don't become part of the value:
set LEIN_JVM_OPTS=-Djavax.net.debug=true
The command is different if you're using PowerShell; this page on environment variables explains how to set it there.
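For what it's worth, the PowerShell equivalent (a quick sketch, session-local like the cmd.exe version) would be:

$env:LEIN_JVM_OPTS = '-Djavax.net.debug=true'
lein run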

AWS EMR with YARN scheduler

I am creating an AWS EMR cluster using a CloudFormation template. I need to run the steps in parallel, so I am trying to change the YARN scheduler from FIFO to the fair or capacity scheduler.
I have added:
yarn.resourcemanager.scheduler.class : 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler'
Do I need to add a fair-scheduler.xml file in the conf.empty folder? If so, can you please share the XML file?
And if I want to add fair-scheduler.xml through the CloudFormation template, do I need to use a bootstrap action for it? If so, could you provide the bootstrap file?
It looks like, even after changing the scheduler, EMR won't allow jobs to run concurrently.
You can configure your cluster by specifying the configuration in your CloudFormation template.
Here is an example configuration:
- Classification: fair-scheduler
  ConfigurationProperties:
    <key1>: <value1>
    <key2>: <value2>
- Classification: yarn-site
  ConfigurationProperties:
    yarn.acl.enable: true
    yarn.resourcemanager.scheduler.class: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
Please follow these -
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-elasticmapreduce-cluster-configuration.html
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
EMR recently allows you to run multiple steps in parallel -
https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-emr-now-allows-you-to-run-multiple-steps-in-parallel-cancel-running-steps-and-integrate-with-aws-step-functions/
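For context, here is a rough sketch of where the Configurations block sits in the template, together with the newer StepConcurrencyLevel property for parallel steps (untested; the resource name and values are illustrative, and required properties like Instances and the roles are omitted for brevity):

Resources:
  MyEmrCluster:
    Type: AWS::EMR::Cluster
    Properties:
      Name: my-cluster
      ReleaseLabel: emr-5.28.0      # assumption: a release recent enough to support parallel steps
      StepConcurrencyLevel: 5       # run up to 5 steps at once
      Configurations:
        - Classification: yarn-site
          ConfigurationProperties:
            yarn.resourcemanager.scheduler.class: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler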

Drone.io secrets not populating in yml appropriately and documentation seems inaccurate

I am running version 0.8.4 as a container in my lab. The CLI is also at version 0.8.4.
I am trying to use a secret in a command one of my containers is trying to run.
Following the documentation, I need to sign the repo to allow the job to consume the secret, but the drone CLI does not seem to have a drone sign command for me to run. So I created the secret with the --skip-verify=true flag. This creates the secret, but when I run the job it errors out, and the output in the UI shows a blank space where the secret should be injected.
Here is an excerpt of my .drone.yml where I am trying to inject secrets:
-s production -u ${cf_user} -p ${cf_password} --s
I have tried all the following ways to create a secret:
drone secret add <repo_name> --name <key> --value <value> --skip-verify=true
drone secret add <repo_name> --name <key> --value <value>
GUI Creation
I notice that when I create a secret with an all-capital name, the UI shows the name in all lowercase while the CLI shows it in capitals.
I also notice that if I include hyphens in the name and try to use that in my drone.yml the job errors out immediately with a bad substitution error.
Any help understanding what I am doing wrong would be much appreciated!
I got lost in the different documentation available. Should have been looking here rather than secret-guide.
In case I am not alone: I needed to add a secrets block to my pipeline.
I also needed to access them with $SECRET_KEY rather than ${SECRET_KEY}
pipeline:
  publish:
    image: governmentpaas/cf-cli
    secrets: [ cf_user, cf_password ]
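To tie those two points together, a hedged sketch of a step that consumes the secrets (the API endpoint here is a placeholder):

pipeline:
  publish:
    image: governmentpaas/cf-cli
    secrets: [ cf_user, cf_password ]
    commands:
      # secrets surface as uppercase environment variables: $CF_USER, not ${cf_user}
      - cf login -a https://api.example.com -u $CF_USER -p $CF_PASSWORD -s production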
Just a little update on this one, I stumbled over it as well because the docs are inconsistent.
In version 0.8.5 the only things I had to do were:
add the secrets via the CLI or UI
add the secrets array to the step that uses them
There was no need to pass the variables through the environment section.

How to run scripts automatically after deployment in AWS using EB CLI?

I am trying to run a Django server on AWS. My Django app depends on some mathematical Python libraries like numpy, scipy, sklearn, etc. However, there is an issue for which I need to do this after every deployment:
sudo nano /etc/httpd/conf.d/wsgi.conf
(add this line in the file)
WSGIApplicationGroup %{GLOBAL}
sudo /etc/init.d/httpd reload
Basically I need "WSGIApplicationGroup %{GLOBAL}" in my wsgi.conf file, otherwise I get a 504. I am using a custom AMI built on top of Amazon Linux 2014, and I deploy with the EB CLI. However, whenever I deploy, wsgi.conf is reset and no longer contains the line I added, so I have to SSH into the EC2 instance and redo it manually. That adds overhead to every deployment, and it's also not feasible once we scale up (cloning or creating instances resets it too). So is there a way to do this automatically after every deployment?
The content of wsgi.conf is fixed, so I can easily write a script to create it; the issue is how to trigger that script automatically.
PS: I am new to AWS.
You need to use the AWS Elastic Beanstalk feature called .ebextensions: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html
In your case you can't use the files or commands sections, because:
The commands are processed in alphabetical order by name, and they run
before the application and web server are set up and the application
version file is extracted.
You need to use the container_commands section:
They run after the application and web server have been set up and the
application version file has been extracted, but before the
application version is deployed.
Example .ebextensions/01wsgi.config (not tested :-))
container_commands:
  apache_reload:
    command: |
      echo "WSGIApplicationGroup %{GLOBAL}" >> /etc/httpd/conf.d/wsgi.conf
      /etc/init.d/httpd reload
Feel free to tweak my example as you want; for example, you can write your own wsgi.conf file somewhere temporary and then replace the original in the container_commands section, as in the sketch below.
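A hedged sketch of that variant (untested; it assumes you keep a full copy of the wsgi.conf you want, since it overwrites the generated file):

files:
  "/tmp/wsgi.conf.custom":
    mode: "000644"
    owner: root
    group: root
    content: |
      # ... your complete desired wsgi.conf contents here, including:
      WSGIApplicationGroup %{GLOBAL}

container_commands:
  01_replace_wsgi_conf:
    command: cp /tmp/wsgi.conf.custom /etc/httpd/conf.d/wsgi.conf
  02_reload_httpd:
    command: /etc/init.d/httpd reload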