Apache Airflow | Use case for an "Ansible operator" - automation

I am a fun of Ansible for IT tasks automation, testing, CI/CD, provisioning, inventorying, no-code/configuration over code approach, etc. ...
.. I also like the way Airflow handles triggers and different workflow exiting/concatenation logic, notifications, and keep tracks of past runs
... so I am wondering, is there any use case for using both technologies together? (i mean embedding ansible inside airflow)
A quick web search just leads me to stuff like "Ansible VS Airflow", but i see potential for using them both without having to rewrite all Ansible playbooks into python code.
I am quite surprised not to see any Ansible operator for Airflow out there, could you share some thoughts on this?

Related

How to securely set up continuous delivery?

Setup:
Private master repo and every developer has their own private fork.
Currently using CircleCI, but we'd be happy to switch to satisfy requirements
Branches on master repo are protected with merge restrictions
Requirements:
Build + test on forked pull requests
Deploy to different environments based on master repo branch updates
Not all developers can be fully trusted with production credentials
Partial Solution:
Enable building and passing secrets on forked pull requests (Reference)
Use CircleCI contexts to set environment variables per branch. This allows different deploy targets.
Problems:
All repo specific secrets as well as all global contexts are now accessible by anyone who can open a PR.
Even if we disable building on forked pull requests, anyone with write access to at least one repo can access all global contexts.
Question:
This would seems to be a very common use case. How do other companies solve it?
Is CircleCI not the right tool for this? - No, it is not (see below).
Should we build a custom solution?
Edit1:
CircleCI got back to me and surprisingly this is not a use case they support. Looking into other providers now. Above questions are still unanswered.
Edit2:
I've also contacted TravisCi and SemaphoreCi and it appears that only TravisCi supports building forked PRs and not leaking secrets into them (Reference).
SempahoreCi is missing (1) building forked PRs and (2) hiding secrets from the deployment phase in non-master workflows
CircleCi has restricted contexts, but they would require manually changing workflows. Definitely not easy to set up and I don't fully understand how they would work.

Dockerized Gitlab Container Backup

I am using a GitLab docker image for integration testing of a service I'm helping to develop. Ideally, the image would be a preconfigured snapshot of GitLab with different users and repos available to run tests against. So the problem ends up being, what is a good way to automate the creation of 'snapshots' of GitLab (that can then be versioned etc.)?
My current solution to this problem is to use GitLab's built in backup utility via gitlab-rake gitlab:backup:create after getting GitLab to a state that I want. This then lets me use GitLab's gitlab-rake gitlab:backup:restore in a hook when the container is starting up to get the container back to the state that I expect (the backup having been ADDed in the Dockerfile for the image). This has the advantage of being relatively lightweight (backups are on the order of MBs) and the backups can be checked in to version control.
I have tried using docker export along with docker import to save the state of the container and then create an image based on that state. This has the advantage of being easy to automate since it is directly supported by Docker, but ends up being fairly expensive considering what the goal is (having users and repos available to test against). It also would require the images to be pushed to a registry of some kind in order to be easily distributed. Perhaps this is the best solution because it is well supported though.
I suppose my question is, what is the Docker way of approaching a problem like this?

How to write Puppet modules for packages such as tigervnc or openvpn that require the user to set passwords or default settings?

I am learning puppet and am trying to write modules to install services such as tigervnc and openvpn.
The problem is that for tigervnc requires the initial password setting by the user. I have tried using:
"exec {'/usr/bin/echo password | /usr/bin/vncpasswd > ~/.vnc/passwd"
This works if I run it on the command line if I'm logged in as the user but does not work when run via puppet.
The problem with openvnc is that it requires a lot of user interaction for the default settings for certificate generation/certificate authority and key generation.
I have tried using execs with the "pkitool" methods which work to a point but not very well or stable. I am also wary of using many execs if there is a better way to do it.
So to sum up my main question is how to deal with these user interactions when trying to automate installations with puppet, and is there a better way than running lots of execs which to me seem like a last resort ?
Thanks
If setting up a piece of software requires user interaction, I don't really see a way around exec. Keeping its use to a minimum is indeed a sensible design goal.
An economic approach is to
create a script that does all the necessary lifting that Puppet resources cannot perform
make Puppet deploy that script to the agent
run it at appropriate times via exec (along with good creates or onlyif queries)
Scripts that run installation wizards that rely on interactive input should probably rely on expect and friends.

Amazon EC2 Load Testing

I am designing a AWS deployment solution for a new dynamic website project. I have acquired an EC2 instance for testing the environment. Need some help on how do I do a load testing on an Ec2 instance to determine how many HTTP requests it can safely handle... P.S. I am new to the AWS platform.
Thanks...
RedLine offers an EC2 Load Testing solution that will automate the distribution of load tests on your own EC2 instances.
Late to the party but could help someone in the future:
A possible tool for load tests, stress tests, whatever you may call them, is Apache JMeter, but there are plenty of alternatives.
A simple starting setup, further explained in this excellent tutorial on DigitalOcean, can exist of a Thread Group containing an HTTP Request Sampler and a View Results in Table Listener. The Thread Group can be used to configure the amount of "clients" you want to simulate. The Request Sampler will be used to configure the server's properties (hostname, path, etc). The Table View Listener outputs a handy CSV file that can be used to calculate means, compare different types of EC2 instances,...
JMeter is a beautiful program with a GUI that can be run on your local workstation, producing an XML file that can be executed on another EC2 instance, for instance. You can even do simple manual edits to the XML file on your server afterward, if necessary.
Take a look at Amazon's testing policy to make sure you're not doing anything illegal.
A couple of quick points;
Set the environment up exactly like it's supposed to run. If there's a database involved, you'll want to involve that in the testing too. Synthetic <?php echo "ok"; CPU based benchmarks won't help you much since normally very little of the time spent replying to HTTP requests is actual CPU time.
A recommendation is to use a service for the benchmarking. Setting load testing up is not without its complexities, and unless you consider benchmarking your core business, you're probably better off using something like Neustar to load and measure your site (there are many services, they're not necessarily what fits you best, just pulled one out of memory)
Of course you can set a load test up yourself, but getting that done right is not anything that can be described in a few sentences. There are very well paid people that only do that for a living :)
There is good experience in using curl-loader aka Davilka tool, also on Amazon EC2 env
http://curl-loader.sourceforge.net

Allowing a PHP script to ssh, using sudo

I need to allow a PHP script on my local web server, to SSH to another machine to perform a specified task on some files. My httpd runs as _www with low permissions, so setting up direct passwordless SSH is difficult, not to say ill-advised.
The way I do it now is to have a minimal PHP script that sudo-exec's (as me) a shell script which is outside of the document root. The shell script in turn calls (as me) the PHP code that does the actual SSH work, and prints its output. Here's the code.
read_remote_files.php (The script I call from my browser):
exec('sudo -u me -n /home/me/run_php.sh /path/to/my_prog.php', $results);
print $results;
/home/me/run_php.sh (Runs as me, calls whatever it's given):
php $1 2>&1
sudoers:
_www ALL = (me) NOPASSWD: /home/me/run_php.sh
This all works, as my_prog.php is called as me and can SSH as me. It seems it's not too insecure since run_php.sh can't be called directly from a browser (outside document root). The issue I'm having is that my_prog.php isn't called as an HTTP program so doesn't have access to the HTTP environment variables (DOCUMENT_ROOT etc).
Two questions:
Am I making this too complicated?
Is there an easy way for my final script to get the HTTP variables?
Thanks!
Andy
Many systems do stuff like this using a (privileged) cron job that frequently checks for the existence of a file, a database record or some other resource, and then performs actions if there are any.
The huge advantage of this is that there is no direct interaction between the PHP script and the privileged script at all. The PHP script leaves the instructions in a resource, the privileged script fetches it. As long as the instructions can't lead to the system getting compromised or damaged, it's definitely more secure than sudoing.
The disadvantage is that you can't push changes whenever you like; you have to wait until the cron job runs again. But maybe it's an option anyway?
"I need to allow a PHP script on my local web server, to SSH to another machine to perform a specified task on some files."
I think that you are phrasing this in terms of a solution that you have difficulty in getting to work rather than a requirement. Surely what you should be saying is "I want to invoke a task on machine B from a PHP script running under Apache on Machine A." And then research solutions to this -- to which there are many from a simple 'roll-your-own' RPC tunnelled over HTTP(S) to using an XMLRPC or SOA framework.
Two caveats:
Do a phpinfo(); on both machines to check what extensions are available and
Also check your php.ini setting to make sure that your service provider hasn't disabled any functions that you expect to use (or do a Q&D script to echo 'disable_functions = ' . ini_get('disable_functions') . "\n"; ...)
If you browse here and the wider internet you'll find many examples. Here is one that I use for a similar purpose.