EC2 automation tools / strategies?

What tools or strategies are you using for automation of EC2 activities?
I need to be able to bring up a number of EC2 instances, provision various software onto them (primarily Python packages), interact with S3 (primarily downloading data), and run various jobs. I'll be doing this both on demand and on a scheduled basis.
I'm trying to decide if I should:
Create an AMI with all my software loaded on it
or
Launch a plain vanilla Linux AMI instance and scp my software to it
For the provisioning and automation, Boto looks pretty good. Or I could write something with Paramiko. Would you recommend either of those, or anything else I should be looking at?
Basically I'm looking for advice / success stories, let me know what's working for you.

To answer your bullets about selecting AMIs, I would say that it depends on how much software you're installing.
I have been successful with a hybrid approach, where I build an AMI and load my heavyweight and more stable software. This is the stuff that needs to run an installer, or takes considerable time to install (remember that if you re-install a package every time as part of your startup process, you're paying for the install every time). Then, I upload the small and volatile software at provisioning/startup time. In this bucket goes most of the application code, data, etc. That way, I can change my app and not have to touch the AMI.
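For the provisioning/startup-time part of that approach, a minimal sketch with boto3 (the successor to the boto library mentioned in the question) might look like the following; the AMI ID, key pair name, and S3 paths are placeholders, and the user-data script assumes the AWS CLI and an instance role with S3 access are already baked into the AMI:

```python
import boto3

ec2 = boto3.resource("ec2")

# User-data script that runs on first boot: pull the small, volatile
# application code from S3 so the AMI only carries the heavyweight packages.
user_data = """#!/bin/bash
aws s3 cp s3://my-bucket/app.tar.gz /opt/app.tar.gz
mkdir -p /opt/app && tar -xzf /opt/app.tar.gz -C /opt/app
/opt/app/run_job.sh
"""

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",   # the pre-baked AMI (placeholder ID)
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",              # placeholder key pair name
    UserData=user_data,
)
print("launched:", [i.id for i in instances])
```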
The benefits of this approach:
Don't have to pay for running the same software install thousands of times.
AMI can stay fairly stable over time.
Can use software that requires intervention or GUI interaction to install.
Major drawbacks:
Your AMI's OS version will become stale over time.
Your AMI may not be flexible as to the instance type/architecture it will run on. For instance, you may create it on a 32-bit OS and thereby prevent it from running on the High CPU instance types, or vice versa. So you may lock yourself into a pricing scheme.
I don't use Python, so I can't comment on either of the APIs you referenced.

AWS just released the Systems Manager suite, which includes an Automation service that will (among other things) handle your use cases around AMIs.

This question was asked some time ago now, but I believe my answer could be useful to other users. I believe the best automation tools available on the market are provided by cloud management platforms. For example, they offer auto-scaling, configuration management integration (Chef/Puppet), database replication, DNS management...
The most popular cloud management platforms are Scalr (disclaimer: I work there), RightScale and enStratus. Scalr is open-source and released under the Apache 2 license.
Regarding your specific question on AMIs, cloud management platforms usually provide pre-configured AMIs (at Scalr, we call them roles). If you want to create your own AMI built on an existing instance, you'll be able to create snapshots and use them as a template for future instances.


Are there any open source load generators?

I would like to know if there is any open source load generator for stress testing. I am using a JMeter script to run the test. I know there is one called flood.io, but I think it is not free. Can you please recommend some options and documentation on how to use them properly? Thanks!!
Well, JMeter is open source; you can inspect its source repositories to confirm it matches your needs.
The real question is where to run JMeter: you need your own bare-metal machines (or an ordinary desktop/laptop/virtual machine). If you're looking for an open-source operating system, there is a wide range of free and open source Linux distributions, FreeBSD, illumos, etc.
If you don't have any machines, or any money to invest in them, there are some free cloud options like AWS Free Tier or Oracle Cloud Free Tier; also, some SaaS providers offer free pricing plans, for example BlazeMeter gives you 50 virtual users, LoadFocus gives you 20 concurrent users, etc.

Docker- what real value does it bring for our team?

I am very, very new to Docker. Our team has had a very nice deployment pipeline where we have different CI engines for different projects, including Jenkins and TeamCity.
Developers usually check in, CI takes over and deploys, and it's perfectly ready for the test team to test. I always thought this to be a perfect model. Of course, some parts of our implementation have their flaws, but it worked very well for what we wanted.
Now, our DevOps team is introducing Docker, where test teams get a Docker image from the Docker registry every time we run a build from TeamCity. While it sounds really, really fancy, I am still failing to understand the benefit of it.
After my research, my conclusion was that Docker can be a good lightweight replacement for VMs. BUT that only applies if you are already using VMs, and we are not using any. I just do not understand what the real value is here. Also, while searching I found a relatively good link on Docker:
https://www.ctl.io/developers/blog/post/what-is-docker-and-when-to-use-it/
Where they discuss when you should use Docker and one of the point says that:
Use Docker whenever your app needs to go through multiple phases of development (dev/test/qa/prod, try Drone or Shippable, both do Docker CI/CD)
OK. However, they do not elaborate further on why Docker is useful when my app has to go through multiple phases.
And how is it so helpful over a regular dev/test setup when the existing setup is already working smoothly?
First, you are right about comparing it to VMs in that it is similar to a VM. However, docker is incredibly lightweight. This property is the one that surprised me most in the beginning. As opposed to virtual machines, containers share resources much more efficiently. Virtual machines are isolated. Containers can run simultaneously on a host machine with very little overhead. You can configure containers to be able to talk to each other (via volume or port bindings).
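As a rough illustration of how lightweight this is in practice, here is a minimal sketch using the Docker SDK for Python; the image name, host port, and volume path are arbitrary examples, not anything from the question:

```python
import docker

client = docker.from_env()

# Start a throwaway nginx container; it shares the host kernel, so it is
# up in seconds and adds very little overhead compared to a full VM.
web = client.containers.run(
    "nginx:alpine",
    detach=True,
    ports={"80/tcp": 8080},   # port binding: host 8080 -> container 80
    volumes={"/srv/site": {"bind": "/usr/share/nginx/html", "mode": "ro"}},
)
print("running container:", web.short_id)

# Tear it down just as quickly when you are done.
web.stop()
web.remove()
```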
Furthermore, in my team, docker brings the following benefits:
our application consists of one big application and several other microservices. But we want to release it all as one package with inter-dependencies among the applications, which eliminates problems with figuring out which versions of the application and microservices should be deployed together (compatibility), etc. That is, the image contains all you need, and you can bring all applications, or just one of them, up/down using docker-compose. You do not need to deploy; you simply pull the image and fire up one or more containers. If you wish to stop one of the microservices, it can be done without affecting the others.
developers in the team can run the very same image on a local machine, for example to troubleshoot a problem that occurred in production; this means troubleshooting can be done in the same environment as production. This brings environment standardization and no more "but it works on my machine" talk.
another benefit it brings us is the following: we build a docker image, run our tests against it, and push it to the registry only once all these phases succeed, which translates into great portability (see the sketch after this list).
Ability to version the images. You can easily compare the current version with previous versions. If you wish to roll back, that is done smoothly.
Isolating and securing applications. All containers are isolated and you can easily control what goes in and out.
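As a rough sketch of the build → test → push flow mentioned in the list above, using the Docker SDK for Python (the registry URL, image tag, and test command are made-up placeholders, and in practice this usually lives in the CI configuration rather than a one-off script):

```python
import docker

client = docker.from_env()

# 1) Build an image from the project's Dockerfile in the current directory.
#    The registry/tag below are placeholders.
image, _logs = client.images.build(path=".", tag="registry.example.com/myapp:1.0")

# 2) Run the test suite inside a container created from that exact image.
#    run() raises ContainerError if the command exits with a non-zero status.
client.containers.run(image.id, "pytest /app/tests", remove=True)

# 3) Only push to the registry once the build and tests have succeeded.
client.images.push("registry.example.com/myapp", tag="1.0")
```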
It took me a year before I got used to the idea, but now it seems simple enough.
I think part of that comes from the fact that people keep calling Docker a "virtual machine", which is not accurate. That's really just a nickname for what's happening behind the scenes. In a lot of ways, Docker will NOT replace a complete virtualization solution, such as VMWare. It does, however, bring forth a new way of thinking about infrastructure. One that many people have a difficult time wrapping their heads around.
You can start asking yourself: What makes a Linux distribution unique?
Aside from the kernel, everything else is just a "standard way" of organizing binaries, libraries, runtime and configuration files. You need your binaries in /bin, your libs in /lib, your configuration in /etc. User installations get placed under /usr...
Most distributions will keep the main structure from the Unix legacy and add its own quirks. Each one will have its own way to manage and distribute packages. Each will maintain their own versions of libraries, drivers, etc.
The key ingredient is the kernel. That's something they all have in common. Nowadays, recent builds of the Linux kernel are compatible with pretty much all major distributions available. So, aside from /boot, most of everything else is just a matter of having the right files in the right place with the right permissions.
Now, imagine you take all that distribution bundle (except the kernel) and place it all in another directory of your running OS. Taking advantage of the same kernel you are already running, you isolate a new process so that it "thinks" that / is now that directory. Bingo! This process now "thinks" it's running all by itself on another operating system.
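A crude way to see just the "new root directory" part of that idea is a few lines of Python; this is only the chroot step, run as root against a hypothetical pre-populated /srv/rootfs directory, whereas Docker adds namespaces, cgroups, images, and far friendlier tooling on top:

```python
import os

# Assumes root privileges and a directory /srv/rootfs (hypothetical) already
# populated with a distribution's /bin, /lib, /etc, ... -- the "bundle minus
# the kernel" described above.
pid = os.fork()
if pid == 0:
    os.chroot("/srv/rootfs")            # the child now sees this directory as /
    os.chdir("/")
    os.execv("/bin/sh", ["/bin/sh"])    # a shell that "thinks" it runs on another system
else:
    os.waitpid(pid, 0)
```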
Docker builds on top of Linux Containers, which allow us to do exactly that, but in a friendlier and easier way. Don't think of it as a virtual machine. Think of it as process isolation. The running kernel will share the machine's resources with this process, while keeping it isolated from the rest of the system. It's like jails on steroids.
That was a broad simplification. But, given the concept, think about the implications of this idea.
You can have, on the same host, multiple processes with completely different environments that might otherwise conflict with each other. One may be a legacy binary that needs old libraries in place (legacy systems that never die). Another may be the most recent build of a bleeding-edge technology. Sharing the same kernel is efficient and valuable resource management.
The most value I found comes from managing the infrastructure. Once you install Docker on the hosts, configure a swarm, and define a way of deploying containers, you mostly forget about the hosts. Adding users, installing packages, customizing, editing configuration files... All that becomes a development task on your desktop. There's an incentive to script more, to automate more. To keep your hands away from the physical or virtual machines, unless absolutely necessary.
Gone are the days when someone changed some obscure setting on the server to work around some weird application behavior, forgot to tell anyone about it and took a vacation. Changes to the environment can be committed to version control, tracked and improved by everyone on the team. If your datacenter goes through a disaster, recreating the whole environment is a matter of rebuilding images and redeploying containers. Your infrastructure becomes consistent and reproducible, while keeping the doors open to a wide variety of operating systems and customized configurations for each application.
Developers can take advantage of Docker with the ability to recreate dev/staging/production environments on their desktops. No need to pollute a dev machine with application servers and database installations, or even pay the toll of VirtualBox to emulate all that.
Testing can be automated with a higher level of isolation. The Selenium team already has official Docker images. Creating an entire test hub should be a walk in the park with those puppies.
Building custom software, such as compiling Nginx with third-party modules, can also be done inside containers from specialized images. No need to keep an entire server dedicated to it, or even pollute your desktop with all the dependencies and build packages.
Overall, we've been having a great experience with Docker. We've migrated our staging environment to this new platform, and plan to migrate other parts of the infrastructure as well, eventually into production. So far, so good.
I hope you can convince enough people to take a better look at it. I'll admit, it took me some time to get used to the idea. But once you get it, it's actually worth it.

TRAC host that allows plugin customisation

My research and development environment calls for a heavily customised TRAC with a corresponding subversion repository and a binary file store (e.g. WebDAV).
I have my eye on at least 10 plugins that I would like to use (from integration with time tracking software, to specialist mathematics/code rendering). I'd also like to write my own plugins.
I am looking for a commercial host that will allow me to self-manage my TRAC plugins. I've looked into (and contacted) a few of the commercial providers from the TRAC Commercial Services list, including:
Project Locker
Repository Hosting
SVN Repository
Project Locker have described that they do a code review of plugin requests and handle it on their end (unspecified time period). Repository Hosting have said that they "will probably not add support for that in the near future". SVN Repository have said "you won't be able to install any new plugins" and have suggested one of their VPS accounts instead.
Short of managing my own VPS or dedicated server, does anybody know of a commercial SVN/TRAC host who allows paying customers to install their own plugins? I would have thought a chroot environment would have made this a no-brainer!
(Note: this was originally posted on programmers but was down-voted and I was advised to move it here. Quoting from their FAQ: implementation issues or programming tools (ask on Stack Overflow instead))
You'll probably have a hard time finding what you're looking for because, as Craig mentioned in his comment, the concept of commercial hosting services typically revolves around limiting a customer's ability to customize. Keeping things relatively uniform means that the hosting company can manage systems and deploy automated updates much more easily, and won't have to worry about their scripts breaking because of something odd that one customer installed or re-configured.
If you want to be able to install and configure plugins at will, I highly recommend going the VPS route and managing the server yourself. It's easier than you might expect (I was thrown into this situation and was pleasantly surprised). You can start with something like the Bitnami Trac stack, which is a virtual machine image that has a Linux OS plus Trac and all of the support tools (database, webserver, etc) set up and ready to go. If you use that as a starting point, all you should have to do is customize your Trac settings and install your plugins.
If you really don't want to have anything to do with the management aspect, remember that you can always go the VPS route and contract out the administration work separately. It might be easier if the hosting provider and the system admin come from the same company, but it's not a requirement. Given the flexibility and customization that you need, this might be a more realistic option.

Questions About Using Amazon Web Services (AWS) For Remote Development

We are a very small mobile company (building an application for the iphone) and we are currently considering hosting services. We are currently leaning towards Amazon's hosting/web services. Accordingly, I have some questions:
1) Can I create an admin account on AWS and assign user accounts to developers that should have access to most (but not all) features.
2) Do we need to learn / use AWS APIs in the development of our product? I don't like the idea of having to create hooks into a hosting service.
3) It looks like the pricing for AWS scales with usage. So, since we are in development and have only developers accessing the server right now, am I right that the cost will be quite low if anything?
4) How does AWS do version management? We have several developers scattered throughout the country. Each will need to check out the recent build from the server for development on his local box. Basically, something like SVN. Is this possible?
5) I am guessing we need something like a dev, svn, and production server? Is this right? If so, how do I set this up and find out the associated costs?
6) We are considering a few database options, among them NoSQL and Neo4j - will we be able to do this using AWS? The server language will be Java.
Thanks for your time.
To answer your questions:
Yes, kind of. There is Identity and Access Management offered by AWS, but it's not the easiest solution to use. Having said that, it can allow you to lock down some of the access activities on an account so that you have some control over your users. I would say that AWS is still very much a single-user environment for server administrators.
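For what it's worth, the basics of that lock-down can be scripted with boto3 (a sketch only; the user name and managed policy below are examples, not a recommendation):

```python
import boto3

iam = boto3.client("iam")

# Create a developer account and give it read-only access to the AWS account.
iam.create_user(UserName="dev-alice")
iam.attach_user_policy(
    UserName="dev-alice",
    PolicyArn="arn:aws:iam::aws:policy/ReadOnlyAccess",
)
```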
You could get away with using only the management console. Scripting may only be required if you want to run batch or periodic activities (e.g. take a snapshot of all machines at 2am every night).
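For example, that nightly-snapshot job could be a short boto3 script run from cron or a scheduler (a sketch; in practice you would filter to the volumes you actually care about):

```python
import boto3

ec2 = boto3.client("ec2")

# Snapshot every EBS volume in the account -- the kind of periodic task
# worth scripting rather than clicking through the console.
for volume in ec2.describe_volumes()["Volumes"]:
    snapshot = ec2.create_snapshot(
        VolumeId=volume["VolumeId"],
        Description="nightly backup",
    )
    print(volume["VolumeId"], "->", snapshot["SnapshotId"])
```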
Costs for EC2 are low, especially for the Micro machine sizes. But keep in mind that the idea of cloud computing is the availability of on-demand resources for short-term use. If you run dev machines needlessly overnight, you will still be paying! And if someone launches an Extra Large machine (or 30 machine instances), you will suddenly find yourself with bigger bills than expected.
(5. and 6. as well) Amazon EC2 is really about issuing you the boxes. What you do thereafter is fully up to you. You can create snapshots daily of your machines, you can deploy SVN and noSQL etc. etc.
I've been seriously into EC2 for a while now, and lots of companies are starting to look at the idea you propose. There are benefits to giving staff on-demand compute power, without having to manage any infrastructure in-house. But I will re-iterate my first point that EC2 is very much a single-user, server administration environment, which doesn't lend itself to being used as a dev playground without additional tools. (Or at least it becomes a challenging task if you have several devs spread around in your company).
I own a business that helps companies use EC2 for dev/lab/playground type of environments. I won't directly flog it here, but will show a quick demo we just put on DropBox: http://dl.dropbox.com/u/16347737/RequestEC2Machines.html Feel free to request a machine to see how adding process to EC2 can help meet your goals.
I run/develop a website using Amazon EC2 & SimpleDB and I have some comments for you on your questions
1) Can I create an admin account on AWS and assign user accounts to developers that should have access to most (but not all) features?
In my experience, there doesn't seem to be a direct correspondence between Amazon users and users on a single instance. An instance's root account is connected to the Amazon account indirectly through a key pair. That said, I haven't explored this question in detail.
2) Do we need to learn / use AWS APIs in the development of our product? I don't like the idea of having to create hooks into a hosting service.
I manage everything through their web console and Eclipse IDE plugins. I've never had to touch the API yet for development and deployment.
3) It looks like the pricing for AWS scales with usage. So, since we are in development and have only developers accessing the server right now, am I right that the cost will be quite low, if anything?
Micro instances cost the least, and the cost is pretty good if you're just starting an instance for a couple of hours and then stopping it. I never think twice about starting a micro instance to try out something new.
4) How does AWS do version management? We have several developers scattered throughout the country. Each will need to check out the recent build from the server for development on his local box. Basically, something like SVN. Is this possible?
I haven't seen this feature being offered directly by Amazon. You can, of course, keep an instance always on for your repository, with backups.
5) I am guessing we need something like a dev, svn, and production server? Is this right? If so, how do I set this up and find out the associated costs?
EC2 Pricing - http://aws.amazon.com/ec2/pricing/
Amazon Simple Monthly Calculator - http://calculator.s3.amazonaws.com/calc5.html
6) We are considering a few database options, among them NoSQL and Neo4j - will we be able to do this using AWS? The server language will be Java.
Amazon instances can be whatever you want them to be: you can either use a pre-configured AMI to launch an instance, or start off with a bare-bones Ubuntu Server or Windows Server, for example, and build a system with what you want. You can then save a snapshot of that system to launch more instances in the future, or to re-launch if your instance crashes.
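That save-a-snapshot-and-relaunch step can also be scripted; a minimal boto3 sketch (the instance ID and image name are placeholders) would be:

```python
import boto3

ec2 = boto3.resource("ec2")

# Turn a configured, running instance into a reusable AMI that can be
# launched again later or after a crash.
instance = ec2.Instance("i-0123456789abcdef0")   # placeholder instance ID
image = instance.create_image(Name="my-configured-server-v1")
print("new AMI:", image.id)
```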

Installation vs. Virtual Machine Images

I seem to end up evaluating a lot of software. This requires me to constantly install all kinds of things on my system. It creates a huge clutter and I spend a lot of time during the install process, and if I don't like it, then removing everything I've done. Much of my evaluation tends away from the features of the software being evaluated and toward how difficult it is to install. I'm sure I miss good software which may have actually been a better choice, because of this startup cost.
With the advent of VM software like VMWare Player and VirtualBox, it would be much easier to sell someone like me your software, if you just provided an image that I could load into the VM and run. I'd be looking at the features almost immediately rather than fighting with which revision of whatever. The VM would take care of all of this for me.
Am I missing something, or should vendors and OSS start distributing VMs for their wares?
Most of my evaluations are for server side software installed on Linux, so OS licensing is not the issue.
VMs require that the operating system have a valid license key. For free operating systems this wouldn't be an issue, but if you're developing for something like Windows machines, each time they send out a demo version of their software, they're sending out a license key that they would have to pay for.
This would be incredibly expensive for most companies.
The only downside, IMHO, is the size of the images: if you have, say, a 20 MB application, do you really want to download/transfer an entire OS just for that application?
I would say a better approach would be to have a ready-to-go VM and then simply take a snapshot (in VirtualBox; I assume a similar feature exists in other players).
Then simply install the application inside your sandbox environment, and just zap it when done (i.e. return to your snapshot).
Darknight
This can be done for software that runs on open source platforms, and VMware has a library of images which do just this (though the images that are used for evaluating commercial software are generally for infrastructure-type things that have very, very complex installation requirements):
http://www.vmware.com/appliances/
However, if the software is for the Windows platform, you don't really have the opportunity to do this, as Microsoft's Windows licensing would prevent it. Unless you're Microsoft, of course, in which case you can in fact do this - and MS has done this to permit easier evaluation of such software as Visual Studio, SQL Server, and many others:
http://technet.microsoft.com/en-us/bb738372.aspx?ppud=4
Novell has an appliance builder called SUSE Studio that lets you pick the software you want; it builds out a VM with the software (and dependencies, etc.) for you. You can then try out the VM, download it, etc.
Whether the software you're looking for is available or not is a different matter.
Disclaimer: I work for Novell (though not with the Suse team)
But yes, if you can deal with the OS licensing issues, or possibly host trial environments yourself, this is a very effective way for a vendor to demo their app. The problem is that all vendors don't always have the infrastructure (or lack the awareness) to do so.
Microsoft provides fully-provisioned VM's for time-limited trials of their software. So if you want to trial select Microsoft products in that manner, you can do that today.
There is no sign, though, that Microsoft will make this available to third party Windows software vendors.
In the SaaS (Software-as-a-Service) world, you can get fully provisioned virtual servers, based on either Linux or Windows, that include your software of interest on a pay-as-you-go basis. For example, see Amazon Web Services.
For Windows, you may be better off developing a portable application that runs from a USB key. That is how Embarcadero distributes All Access. I received a 4 GB USB key that contained multiple applications. Most could be run straight from the key without installation. I believe Embarcadero will be licensing the technology at some stage.
If you are using a programming language such as Delphi or C++ with little in the way of external dependencies, a portable application is straightforward to develop. For .NET, it is much harder, but it can be done with Mono, or something like Virtual Application Studio.