Is Amazon EC2 recommended for a persistent public facing website? - sharepoint-2010

My company is about to write a new public-facing website in SharePoint (so Windows Server 2008 R2, SQL Server 2008 R2, etc.) and we're looking at using Amazon EC2 to host it. I've read and been told that instances can disappear (often through user error, but also in batches), so I'm skeptical that EC2 is the best idea for us.
I've done research on the Amazon AWS site, but must confess that most of the terminology used is confusing, and Googling my questions often brought me here, so I thought I'd ask my questions here too and see if people can advise me.
1) It's critical that our website be available to the public as much as possible (the usual 99.9% uptime requirement applies). The Amazon EC2 Service Level Agreement commitment is 99.95% availability, which is fine, but what happens if we hit that 0.05% scenario? Would our EC2 instance be lost? Can it be recovered? If so, what would we need to do to ensure that we recover to a not-too-old version of our site?
2) I've read about Amazon Elastic Block Store (EBS), and how it persists independently from the lifetime of the instance. If I understand right, EBS is like having a hard drive, so if the instance is lost we can start a new instance using our EBS volume to recover the latest version, while the 'local instance store' would be lost along with the instance. Is that right?
3) Are 'reserved instances' a more stable option? i.e. are they less likely to disappear? If they do still disappear, what recovery benefits do they offer, if any?
I know these questions are kinda vague, but hopefully you'll be able to offer a newbie some basic info - enough to point me in the right direction for further, deeper research at least.
Many thanks.
Kevin

We rely on AWS for our webservers. I won't use anything else. They're highly scalable, easily configurable and have an absurd uptime. I've never experienced downtime with them. We've been with them for two years.
Reserved instances are cheaper. Get them if you're planning on having that instance for a while. It's simply a cost/budgeting issue.
Never heard of people losing an EC2 instance.
Not terribly knowledgeable about EBS, but S3 is a good way to back up data.
HTH
EDIT:
Came across some links that might be helpful. Cheers.
http://techblog.netflix.com/2010/12/four-reasons-we-choose-amazons-cloud-as.html
http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html

One of the main design goals of AWS is to make fault-tolerant services, that is, services that can recover from failures. They design all of their services with the assumption that something will fail in some way at some point, but that there will be redundancies and other mechanisms in place to recover from those inevitable failures.
In the case of storage services like S3 and SimpleDB, this is achieved primarily by replicating your data across multiple nodes (machines) in multiple data centers. So when one node experiences a hardware failure or one data center experiences a power outage, there's no real down time as the replicas can still service the requests. As a consumer, you aren't even aware of the down nodes or data centers.
EC2 is designed to work similarly, but it is not quite as encapsulated as S3 and SimpleDB, so you'll need to plan for a bit of the work yourself. For example, if you need a web service with guaranteed uptime and availability, you'll want to look into the AWS ELB (Elastic Load Balancing) service. That way, if an instance is down, requests will automatically be routed to other healthy instances. For your data, you can either store it in other AWS services (like S3, SimpleDB and EBS) which have built-in redundancy, or you can build your own solution using similar redundancy techniques.
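To get a feel for why several instances behind a load balancer help, here is a rough sketch of the arithmetic. It assumes instance failures are independent and that the load balancer itself never fails, which is a simplification; the 99% per-instance figure is made up for illustration and is not an AWS number.

```python
def combined_availability(per_instance: float, instances: int) -> float:
    """Chance that at least one of `instances` independent instances is up.

    Assumption: failures are independent and the load balancer is perfect.
    """
    if not 0.0 <= per_instance <= 1.0:
        raise ValueError("per_instance must be between 0 and 1")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # The service is down only if every instance is down at the same time.
    return 1.0 - (1.0 - per_instance) ** instances


if __name__ == "__main__":
    # Hypothetical 99% per-instance availability, 1 to 3 instances.
    for n in (1, 2, 3):
        print(n, combined_availability(0.99, n))
```

In practice failures are often correlated (for example a whole availability zone going down), so real numbers will be worse than this model suggests unless the instances are spread across zones.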

The SLA amounted to nothing when we found out that:
Instances and EBS volumes DID get lost
It takes Amazon more than 2 days to recover from a disaster, and even that not to the full extent
We were the lucky ones, that managed to get back on our feet in less than 2 days. Other companies got stuck with no recovery option.
And what does Amazon recommend? "Don't trust our reliability. Pay for 2 or 3 more copies of your system in different regions, and then you will be safe".
More information can be found here:
http://www.zdnet.com/blog/saas/lightning-strike-zaps-ec2-ireland/1382

tldr: AWS is very reliable if you know what you're doing, a bad idea if you don't.
As you're unfamiliar with the terms, here's a very quick glossary:
AZ - Availability zone. There are several availability zones per region (e.g. 3 in Ireland). They are physically isolated datacentres with different power grids, flood plains, etc., but with connections between them of internal-network quality and speed. It's possible, even likely, that an AZ may become unavailable at some point; I don't think all AZs in a region have ever been down at once, though.
EBS/Instance Store - These are the two main types of storage available to an instance. The best way to describe them: Instance Store is the equivalent of an HDD plugged into your motherboard via SATA, so it's very fast. But what happens if you shut down your instance (or the motherboard fails) and want to start instantly on another board? Amazon completely hides the physical hardware setup, and obviously you aren't going to wait for an engineer to unplug a drive from one server and plug it into another, so they don't even offer this. Instance store is fast but temporary and tied to the physical machine: DO NOT store anything important on it. EBS is the alternative: a very low latency network drive that any server can connect to as though it were local. You can shut down a server, change its size and restart on a completely different server on the other side of the datacentre (again, the physical stuff is hidden); it doesn't matter, because your EBS volume hasn't gone anywhere (by default they're also on multiple physical discs).
Commodity cloud hardware - My interpretation of all the 'cloud hardware fails all the time - it's really risky and unreliable' talk is that yes, AWS hardware is not as reliable as enterprise-level components in a managed datacentre. This doesn't mean it's unreliable, it just means you should build failure as an option into your design.
The first very important thing to note when talking about SLAs is that Amazon states very clearly that the SLA ONLY applies if more than one AZ goes down. So if you do not understand how their service works, build only one server in one AZ, and a generator or router fails, it's your own fault.
As for recovery, that depends: is your entire application state stored on one server? If it is, don't bother with the cloud. If, however, you can cluster your state on multiple servers, store it in RDS or some other persistent DB, or (if your content changes infrequently enough) rely on periodic copies to S3 storage, you'll be fine. Your failure strategy (in order of preference) could be clustered, failover, or auto repair. For the first one you have clustered servers sharing state, so it doesn't matter if you lose a server or an AZ. For the second you only have one live server, but if it goes down you have a failover standing by with the same content. Finally, with auto repair there are two possible situations: if your data is only on one EBS drive, you could start another instance with the same drive and carry on. But if the EBS drive or AZ fails, you will need to be ready with some snapshot in S3 that a completely fresh instance can copy and start up with.
Reserved instances are no more reliable - they're the same hardware; you're just entering into a contract to say "I'll have x machines for y years". That allows AWS to plan better, which makes it cheaper for you.
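To put the percentages from the question (99.9% and 99.95%) into concrete terms, here is a small sketch that converts an availability figure into a downtime budget. It assumes a 365-day year and says nothing about how Amazon actually measures or credits downtime under the SLA.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime permitted in `period_hours` at the given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return period_hours * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    for pct in (99.9, 99.95):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

So 99.95% leaves roughly four and a half hours a year, and 99.9% about twice that. Whether that downtime costs you data depends on where the data lives (EBS, S3 snapshots, instance store), not on the percentage.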

Related

Openstack - hardware requirements

I've been needing a new VM host for some time now, and from working with/on AWS at work, "The Cloud" seems to be a good idea.
I've done some math, and no matter how I count, it's going to be cheaper to do it myself, than colo or something else. Plus, I really like lots of blinking lights :D
A year or so ago, I heard about OpenStack and have been looking cursorily at it since then. It seems big and complex (and scary!), and some friends who have been trying to set it up at work for a year, and are still not quite finished, indicate that it is what it seems :)
However, I like tormenting myself, so I've decided I'm going to give it a try. It does provide all the functionality, and then some, that I need. Theoretically, I could go with Vagrant, but that's not quite half-way to what I want/need.
So, I've been looking at https://en.wikipedia.org/wiki/OpenStack#Components and from that came to the following conclusion:
Required: (Nova, Glance, Horizon, Cinder)
This seems to be the "core" services. I need all of them.
Nova
Compute fabric controller
Glance
Image service (for templates)
Horizon
Dashboard
Cinder
Block storage devices (can work with ZoL w/ 3rd party driver)
Less important: (Barbican, Trove, Designate)
I really don't need any of this, it's more of "could be nice to have at some point".
Barbican
REST API designed for the secure storage, provisioning and management of secrets
Trove
Database-as-a-service provisioning relational and non-relational database engines
Designate
DNS as a Service
Possibly not needed: (Neutron, Keystone)
These ones I don't know if I need. I have DHCP, VLAN, VPN, DNS, LDAP, Kerberos services on the network that work just fine, and I'm not replacing them!
Neutron (previously Quantum)
Network management (DHCP, VLAN)
Keystone
Identity service (can work with existing LDAP servers)
Not needed: (Swift, Ceilometer, Ironic, Zaqar, Searchlight, Sahara, Heat, Manila)
Meh! I'm doing this for me, for my basement and for my own development and enjoyment, so don't need that. Would be nice to go with a fully object based storage, but that's not feasible for me at this time.
Swift
Object storage system
Ceilometer
Telemetry Service (billing)
Ironic
Bare metal provisioning instead of virtual machines
Zaqar
multi-tenant cloud messaging service for Web developers (~ SQS)
Searchlight
Advanced and consistent search capabilities across various OpenStack cloud services
Sahara
Easily and rapidly provision Hadoop (storing and managing vast amounts of data cheaply and efficiently) clusters
Heat
Orchestration layer (store the requirements of a cloud application in a file that defines what resources are necessary for that application)
Manila
Shared File System Service (manage shares in a vendor agnostic framework)
If we don't count storage (I already have my own block storage, which I can use with Cinder and some 3rd party plugins/modules) and compute nodes (everything that's left over will become compute nodes), can I run all this on one machine? With a hot standby/failover?
Everything is going to be connected to the same power jack, same rack, same [outgoing] network cable, so more redundancy than that is overkill. I don't even need that, but "why not" :)
The basic recommendation I've heard is four to six machines. After a lot of pestering of the ones who said that, it turns out they meant "two storage, two controller, two compute", which is what I was thinking as well: running this on two machines should be enough. They're basically only going to run Glance, Horizon and Cinder. And possibly Neutron and Keystone.
None of them seems to be very resource-heavy.
Is there something I'm missing?
Oh, and nothing of this is going to face the 'Net! It's all just for me.
Though it is theoretically possible to bring up OpenStack without Keystone, it is close to impossible in practice and makes the system pretty inconvenient to use.
You can definitely run a full OpenStack on one machine (or even in a VM). Check out DevStack (http://docs.openstack.org/developer/devstack/) -- you just run a shell script to bring up a full working OpenStack setup.
As long as you are not worried about availability and your workload is minimal, a single-node deployment is a pretty good way to get your feet wet.

ELB on Amazon - is it "worth it" in this case?

We're thinking about moving to the Elastic Load Balancer on Amazon. However, it turns out that since we use more than one domain name, we would have to rename some of our applications to limit to a single ELB. Another issue is we currently use free level one certificates, whereas moving to ELB would require moving up to level 2, although that's not a huge deal. Another issue is we don't have a lot of volume at this point, and don't really have a need for load-balancing in terms of traffic alleviation. Also, in the case of a failure of an amazon instance, which seems to be quite rare (have not experienced in several years), we can quickly be up and running by creating another instance and restoring.
OTOH, according to everything I've read about it, people are generally happy with it and recommend it, due to the ease of setup and the value it brings.
Given the above, is it worth it?
since we use more than one domain name, we would have to rename some of our applications to limit to a single ELB
What makes you say this? There's nothing preventing you from launching multiple ELB's if you really want to. And if your application already manages multiple domains properly then there's no reason a single ELB can't handle that either. We currently have one ELB fronting an application on a bunch of EC2 instances that 11 different domains all point to.
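As an illustration of what "manages multiple domains properly" can mean, here is a minimal sketch of an application that picks the site from the Host header, so any number of domains can point at the same load balancer. The domain names and page bodies are placeholders, and this is plain WSGI rather than anything ELB-specific.

```python
# Hypothetical mapping from domain name to content; the names are placeholders.
SITES = {
    "shop.example.com": "Shop front page",
    "blog.example.com": "Blog front page",
}


def app(environ, start_response):
    """WSGI app that serves a different site depending on the Host header."""
    # Drop an optional ":port" suffix and normalise case.
    # (Simplification: bracketed IPv6 hosts are not handled.)
    host = environ.get("HTTP_HOST", "").split(":")[0].lower()
    body = SITES.get(host)
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown site"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    # Port 8000 is arbitrary; behind a load balancer this would be the backend port.
    make_server("", 8000, app).serve_forever()
```

Most web servers and frameworks already do this through virtual hosts or similar, so in many setups no code change is needed at all.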
Another issue is we currently use free level one certificates, whereas moving to ELB would require moving up to level 2, although that's not a huge deal.
Not sure what you mean by "level one" and "level 2". If you're using a self-signed SSL certificate then you'll need to switch to using certificate signed by a third party Certificate Authority, which will indeed cost you some money. Amazon supports all manner of certificates, including simple certs, EV certs, SAN certs, etc. You'll find more information on ELB and SSL certs in the AWS documentation.
Also, in the case of a failure of an amazon instance, which seems to be quite rare (have not experienced in several years), we can quickly be up and running by creating another instance and restoring.
Consider yourself lucky. We've had Amazon instances fail from time to time, and we also regularly get notifications from Amazon that instances need to be rebooted in order to migrate them off of faulty/old hardware.
If you really don't care about being down for a while and feel like you don't need the capacity that a load balancer and multiple appservers provides then there's no reason for you to move to using an ELB. However if you want the reliability of multiple appservers then moving to an ELB is indeed a good idea.
And if you anticipate your traffic level growing then you might want to consider using Amazon's Auto Scaling tools. Using Auto Scaling you basically tell Amazon the minimum number of application servers you want running behind an ELB, and some parameters to indicate when they should automatically launch additional instances if/when load increases.
Our Amazon account rep actually recommended to us that if we had even a single instance that we wanted to minimize downtime of (like a monitoring server, etc) that we should create an Auto Scaling group with a limit of exactly 1 instance in it. That way if the instance ever does die for any reason whatsoever, Amazon will automatically spin up a new replacement instance.
I agree with Bruce; I just wanted to add my 5 cents about Auto Scaling groups (ASG) and "Amazon will automatically spin up a new replacement instance."
This is a really cool way to get a robust hosting solution, but it takes some work to create a CloudFormation template and a bash auto-install script, called from the CloudFormation template, that installs all the server software and deploys your app code.
So if you have 2 instances and an ASG with Min/Max = 2, then if an instance crashes, the ASG will recreate it automatically with all software installed and code deployed, ready to go.
Also, if you need to handle periodic traffic jumps automatically, you can change the ASG to (Min=2, Max=5) and create 2 CloudWatch alarms:
1. if CPU usage is 90% or more for 5 or 10 mins
2. if CPU usage is 30% or less for 5 or 10 mins
Then assign alarm 1 to scale up by 1 additional instance, and assign alarm 2 to destroy any additional instance created by alarm 1.
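The decision logic behind those two alarms can be sketched in a few lines. This is only a simulation of the rule described above (Min=2, Max=5, thresholds of 90% and 30%), not a call into any AWS API; `avg_cpu` stands for the average CPU over the 5 or 10 minute alarm window, and the function name is made up.

```python
def desired_capacity(current: int, avg_cpu: float,
                     minimum: int = 2, maximum: int = 5,
                     high: float = 90.0, low: float = 30.0) -> int:
    """Return the instance count the group should move to.

    Alarm 1 (avg_cpu >= high) adds one instance, capped at `maximum`.
    Alarm 2 (avg_cpu <= low) removes one instance, never below `minimum`.
    """
    if avg_cpu >= high:
        return min(current + 1, maximum)
    if avg_cpu <= low:
        return max(current - 1, minimum)
    return current


if __name__ == "__main__":
    count = 2
    # Hypothetical sequence of averaged CPU readings, one per alarm period.
    for cpu in (95, 97, 60, 25, 20, 10):
        count = desired_capacity(count, cpu)
        print(f"cpu={cpu}% -> {count} instances")
```

The gap between the two thresholds matters: if they were close together, the group would keep adding and removing instances as the load hovered around them.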

Redis active-active replication

I am using Redis version 2.8.3. I want to build a Redis cluster, but this cluster should have multiple masters. That is, I need multiple nodes that accept writes and propagate them to all other nodes.
I was able to build a cluster with one master and multiple slaves. I just edited the slaves' redis.conf files and added this:
slaveof myMasterIp myMasterPort
That's all. Then I tried writing something into the db via the master. It was replicated to all slaves, and I really like that.
But when I tried to write via a slave, it told me that slaves are not allowed to write. After that I set the read-only status of the slave in its redis.conf file to false, and then I could write to the db through it.
But I realized that the write was not replicated to my master, so it was not replicated to the other slaves either.
This means I could not build an active-active cluster.
I tried to find out whether Redis has an active-active cluster capability, but I could not find a definite answer.
Is it possible to build an active-active cluster with Redis?
If it is, how can I do it?
Thank you!
Redis v2.8.3 does not support multi-master setups. The real question, however, is why do you want to set one up? Put differently, what challenge/problem are you trying to solve?
It looks like the challenge you're trying to solve is how to reduce the network load (more on that below) by eliminating over-the-net reads. Since Redis isn't multi-master (yet), the only way to do it is by setting up each app server with a master and a slave (to the other master) - i.e. grand total of 4 Redis instances (and twice the RAM).
The simple scenario is when each app updates only a mutually-exclusive subset of the database's keys. In that scenario this kind of setup may actually be beneficial (at least in the short term). If, however, both apps can touch all keys or if even just one key is "shared" for writes between the apps, then you'll need to bake locking/conflict resolution/etc... logic into your apps to consolidate local master and slave differences (and that may be a bit of an overkill). In either case, however, you'll end up with too many (i.e. more than 1) Redises, which means more admin effort at the very least.
Also note that by colocating app and database on the same server you're setting yourself up for near-certain scalability failure. What will happen when you need more compute resources for your apps or Redis? How will you add yet another app server to the mix?
Which brings me back to the actual problem you are trying to solve - network load. Why exactly is that an issue? Are your apps so throughput-heavy or is the network so thin that you are willing to go to such lengths? Or maybe latency is the issue that you want to resolve? Be that as it may, I recommend that you consider a time-proven design instead, namely separating Redis from the apps and putting it on its own resources. True, the network will hit you in the face and you'll have to work around/with it (which is what everybody else does). On the other hand, you'll have more flexibility and control over your much simpler setup and that, in my book, is a huge gain.
Redis Enterprise has had this feature for quite a while, but if you are looking for an open source solution KeyDB is a fork with Active Active support (called Active Replica).
Setting it up is just a little more work than standard replication:
Both servers must have "active-replica yes" in their respective configuration files
On server B execute the command "replicaof [A address] [A port]"
Server B will drop its database and load server A's dataset
On server A execute the command "replicaof [B address] [B port]"
Server A will drop its database and load server B's dataset (including the data it just transferred in the prior step)
Both servers will now propagate writes to each other. You can test this by writing to a key on Server A and ensuring it is visible on B and vice versa.
https://github.com/JohnSully/KeyDB/wiki/KeyDB-(Redis-Fork):-Active-Replica-Support
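Since the order of those steps matters, here they are written out as data. This sketch only builds the list of settings and commands quoted above for two servers; it does not connect to anything, and the addresses and ports in the example are placeholders.

```python
def active_replica_plan(a_host: str, a_port: int, b_host: str, b_port: int):
    """Return the setup steps, in order, as (server, kind, text) tuples.

    The config line and the replicaof commands are the ones quoted in the
    steps above; the hosts and ports are whatever you pass in.
    """
    return [
        ("A", "config", "active-replica yes"),
        ("B", "config", "active-replica yes"),
        # B first: it drops its own data and loads A's dataset.
        ("B", "command", f"replicaof {a_host} {a_port}"),
        # Then A: it loads B's dataset, which now includes A's original data.
        ("A", "command", f"replicaof {b_host} {b_port}"),
    ]


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    for server, kind, text in active_replica_plan("10.0.0.1", 6379, "10.0.0.2", 6379):
        print(f"server {server} [{kind}]: {text}")
```

Because each server drops its database when it becomes a replica, start with the server whose data you want to keep as A.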

Planning the development of a scalable web application

We have created a product that potentially will generate tons of requests for a data file that resides on our server. Currently we have a shared hosting server that runs a PHP script to query the DB and generate the data file for each user request. This is not efficient, though it has not been a problem so far, but we want to move to a more scalable system, so we're looking into EC2. Our main concerns are being able to handle high amounts of traffic when they occur, and to provide low latency to users downloading the data files.
I'm not 100% sure on how this is all going to work yet but this is the idea:
We use an EC2 instance to host our admin panel and to generate the files that are being served to app users. When any admin makes a change that affects these data files (which are downloaded by users), we make a copy over to S3 using CloudFront. The idea here is to get data cached and waiting on S3 so we can keep our compute times low, and to use CloudFront to get low latency for all users requesting the files.
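A rough sketch of that "generate once, serve many times" idea follows. The `FileStore` class is an in-memory stand-in for S3, not a real client, and the JSON rendering is only a placeholder for whatever format the data file actually uses; all names here are made up for illustration.

```python
import json


class FileStore:
    """In-memory stand-in for an object store such as S3 (assumption)."""

    def __init__(self):
        self.objects = {}
        self.uploads = 0

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.uploads += 1


def render_data_file(records) -> bytes:
    """Placeholder for the expensive DB query plus file generation."""
    return json.dumps(records, sort_keys=True).encode("utf-8")


def publish_if_changed(store: FileStore, key: str, records) -> bool:
    """Regenerate the file after an admin change; upload only if it differs."""
    data = render_data_file(records)
    if store.objects.get(key) == data:
        return False  # nothing changed, keep serving the existing copy
    store.put(key, data)
    return True


if __name__ == "__main__":
    store = FileStore()
    print(publish_if_changed(store, "data.json", [{"id": 1}]))  # first publish
    print(publish_if_changed(store, "data.json", [{"id": 1}]))  # unchanged
```

The point is that user requests never trigger the generation step; they only read the stored copy, which a CDN can then cache.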
I am still learning the system and wanted to know if anyone had any feedback on this idea or insight into how it all might work. I'm also curious about the purpose of projects like Cassandra. My understanding is that simply putting our application on EC2 servers makes it scalable by the nature of the servers. Is Cassandra just about keeping resource usage low, or is there a reason to use a system like this even when on EC2?
CloudFront: http://aws.amazon.com/cloudfront/
EC2: http://aws.amazon.com/ec2/
Cassandra: http://cassandra.apache.org/
Cassandra is a non-relational database engine, and if that is what you need, you should first evaluate Amazon's SimpleDB: a non-relational database engine built on top of S3.
If the file only needs to be updated based on time (daily, hourly, ...) then this seems like a reasonable solution. But you may consider placing a load balancer in front of 2 EC2 images, each running a copy of your application. This would make it easier to scale later and safer if one instance fails.
Some other services you should read up on:
http://aws.amazon.com/elasticloadbalancing/ -- Amazon's load balancer solution.
http://aws.amazon.com/sqs/ -- Used to pass messages between systems, in your DA (distributed architecture). For example if you wanted the systems that create the data file to be different than the ones hosting the site.
http://aws.amazon.com/autoscaling/ -- Allows you to adjust the number of instances online based on traffic
Make sure to have a good backup process with EC2: snapshot your OS drive often and place any volatile data (e.g. database files) on an EBS volume. EC2 doesn't fail often, but when it does you don't have access to the hardware, and if you have an up-to-date snapshot you can just kick a new instance online.
Depending on the datasets, Cassandra can also significantly improve response times for queries.
There is an excellent explanation of the data structure used in NoSQL solutions that may help you see if this is an appropriate solution to help:
WTF is a Super Column

Deploying on EC2

This question is for anyone who has actually used Amazon EC2. I'm looking into what it would take to deploy a server there.
It looks like I can start in VirtualBox, set up my server and then export the image using the provided ec2-tools.
What gets tricky is if I actually want to make configuration changes to my running server, they will not be persistent.
I have some PHP code that I need to be able to deploy (and redeploy) to the system, so I was thinking that EBS would be a good choice there.
I have a massive amount of data that I need stored, but it just so happens that latency is not an issue, so I was thinking something like s3fs might work.
So my question is... What would you do? What does your configuration look like? What have been particular challenges that perhaps you didn't see coming?
We have deployed a large-scale commercial app in the AWS environment.
There are three basic approaches to keeping your changes under control once the server is running, all of which we use in different situations:
Keep the changes in source control. Have a script that is part of your original image that can pull down the latest and greatest. You can pull down PHP code, Apache settings, whatever you need. If you need to restart your instance from your AMI (Amazon Machine Image), just run your script to get the latest code and configuration, and you're good to go.
Use EBS (Elastic Block Store). EBS is like a big external hard drive that you can attach to your instance. Even if your instance goes away, EBS survives. If you later need two (or more) identical instances, you can give each one of them access to what you save in EBS. See https://stackoverflow.com/a/3630707/141172
Burn a new AMI after each change. There's a tool to create a new AMI from a running instance. If EBS is like having an external hard drive, creating a new AMI is like having a DVD-R. You can save the current state of your machine to it. Next time you have to start a new instance, base it on that new AMI. Good to go.
I recommend storing your PHP code in a repository such as SVN, and writing a script that checks the latest code out of the repository and redeploys it when you want to upgrade. You could also have this script run on instance startup so that you get the latest code whenever you spin up a new instance; saves on having to create a new AMI every time.
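A minimal sketch of such a deploy script is below. It assumes the `svn` command-line client is installed on the image; the repository URL and target directory are placeholders, and the function names are made up.

```python
import os
import subprocess


def deploy_command(repo_url: str, target_dir: str) -> list:
    """Pick the svn command: update an existing working copy, else check out."""
    if os.path.isdir(os.path.join(target_dir, ".svn")):
        return ["svn", "update", target_dir]
    return ["svn", "checkout", repo_url, target_dir]


def deploy(repo_url: str, target_dir: str) -> None:
    """Run the chosen svn command; raises if svn exits with an error."""
    subprocess.run(deploy_command(repo_url, target_dir), check=True)


if __name__ == "__main__":
    # Placeholder values; a real script would read these from configuration.
    deploy("https://svn.example.com/repo/trunk", "/var/www/app")
```

Hooking this into the instance's startup sequence means a fresh instance launched from an older AMI still comes up with current code.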
The main challenge that I didn't see coming with EC2 is instance startup time - especially with Windows. Linux instances take 5 to 10 minutes to launch, but I've seen Windows instances take up to 40 minutes; this can be an issue if you want to do dynamic load balancing and start up new instances when your load increases.
I'd suggest the best bet is to simply 'try it'. The charges to run a small instance are not high and data transfer rates are very low - I have moved quite a few GB and my data fees are still less than a dollar(!) in my first month. You will likely end up paying mostly for system time rather than data I suspect.
I haven't deployed yet but have run up an instance, migrated it from Ubuntu 8.04 to 8.10, tried different port security settings, seen what sort of access attempts unknown people have tried (mostly looking for phpadmin), run some testing against it and generally experimented with the config and restart of the components I'm deploying. It has been a good prelude to my end deployment. I won't be starting with a big DB so will be initially sticking with the standard EC2 instance space.
The only negative I have heard is that spammers have caused some of the IP ranges to be subject to spam-blocking - but I have not yet confirmed that.
I suggest you take your VirtualBox approach only after you are more familiar with the EC2 infrastructure. First go to EC2, open an account and follow Amazon's EC2 getting-started guide. This guide will give you enough of an overview of everything (EBS, IPs, connections, and so on) to get you started. We are currently using EC2 for production, and this is how we started.
I hope you become a cloud expert soon.
Per timbo's concern, I was able to nab an IP that so far hasn't legitimately shown up on any spam lists. You will have a few hiccups, since many blacklists are technically whitelists and will have every IP on their list until they are notified that a mail server is running on that IP. It's really easy to get removed: most of them have automated removal request forms, and every one that doesn't has been very cooperative in removing me from their lists. Just be professional, ask if they can give a time and reason for the block and what steps you should take to remove your IP. None of the services I emailed asked me to jump through any hoops; within two or three business days they all informed me my IP had been removed.
Still, if you plan on running a mail server I would recommend reserving IPs now. They're 1 cent for every hour they are not bound to an instance, so it works out to about $7 a month. I went ahead and reserved an extra one as I plan on starting up another instance soon.
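The "about $7 a month" figure is easy to check. This sketch uses the 1 cent per unbound hour quoted above and assumes a 30-day month; current pricing may differ.

```python
def idle_ip_cost(hours_unbound: float, rate_per_hour: float = 0.01) -> float:
    """Cost in dollars of a reserved IP that is not bound to an instance."""
    return hours_unbound * rate_per_hour


if __name__ == "__main__":
    hours_in_month = 24 * 30  # assumption: 30-day month, IP unbound the whole time
    print(f"${idle_ip_cost(hours_in_month):.2f} per month")
```

An IP that is bound to a running instance for part of the month costs proportionally less, since only the unbound hours are counted here.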
I have deployed some simple stuff to EC2 Win2k3 instances. Here's my advice:
Find a tutorial. Sign up for the service. Just spend an afternoon setting up your first server. It's pretty darned easy, though there will be obstacles to overcome. It's not too tough.
When I was fooling with EC2 I think I spent like $2.00 setting up a server and playing with it for a while.
Some of your data will be persistent, but you can connect S3 to EC2 as well.
Just go for it!
With regards to the concerns about blacklisting of mail servers, you can also use Amazon's Simple Email Service (SES), which obviates the need to run the mail server on the EC2 instances.
I had trouble with this as well, but posted a note here in their forums - https://forums.aws.amazon.com/thread.jspa?threadID=80158&tstart=0