Basic AWS EC2 SQL High Availability Query

We are considering moving our application and SQL servers to AWS EC2 instances. Currently, we have one SQL Server Standard instance and we use failover clustering for high availability.
On AWS, we are planning to have EC2 instance(s) with EBS volumes as needed for SQL Server. So, to implement high availability for SQL Server, the only option available is synchronous database mirroring, unless we go for some storage-level replication and implement a cluster on top of it.
This application does not need cross-Availability-Zone (multi-site) high availability.
My basic question is: why do we need to take care of SQL Server high availability at all? I am asking because:
Keeping EC2 up is Amazon's responsibility
EBS volumes get automatically replicated within an Availability Zone.
The only reason I can think of for needing separate high availability is if the EC2 instance becomes unresponsive or something on the OS/driver side gets corrupted. Do you see any other reasons beyond these?
Thanks for taking time to read this question.

If you'd like a managed, highly available SQL Server, Amazon RDS would be a better option, as you would get HA (Multi-AZ), backups, patching and other administrative tasks handled automatically. With RDS, you do not have to manage the underlying EC2 instances and EBS volumes yourself.
With Amazon EC2, you need to take care of application HA (SQL Server in this case) yourself, because this is not (and cannot be) managed by Amazon. For example, if you are hosting your personal blog you could use one instance in a single Availability Zone; for a database with primary-secondary (HA) replication you might use two instances in separate AZs; and for a multi-master distributed database, you might use several instances in several regions. The design and cost of high availability is very context dependent.
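For a rough feel of the RDS route, here is a minimal sketch using boto3; the instance identifier, sizes, region and password are placeholders, not recommendations. The MultiAZ flag is what turns on the managed HA described above.

    # Sketch: provision a Multi-AZ SQL Server Standard instance on RDS with boto3.
    # All identifiers, sizes and the password are placeholders.
    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_instance(
        DBInstanceIdentifier="app-sqlserver",   # hypothetical name
        Engine="sqlserver-se",                  # SQL Server Standard Edition
        LicenseModel="license-included",
        DBInstanceClass="db.m5.xlarge",
        AllocatedStorage=200,                   # GiB of EBS-backed storage
        MasterUsername="admin",
        MasterUserPassword="CHANGE_ME",         # placeholder
        MultiAZ=True,                           # managed HA: synchronous standby in another AZ
    )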
You can try and learn RDS for free with this self-paced lab: https://run.qwiklab.com/focuses/preview/1013?locale=en

Related

Is the horizontal scaling (scale-out) option available in Azure SQL Managed Instance?

Is the horizontal scaling (scale-out) option available in Azure SQL Managed Instance?
Yes, Azure SQL Managed Instance supports scale-out.
You can reference the document that Peter Bons provided in a comment:
Document here:
Scale up/down: Dynamically scale database resources with minimal downtime
Azure SQL Database and SQL Managed Instance enable you to dynamically add more resources to your database with minimal downtime; however, there is a switch over period where connectivity is lost to the database for a short amount of time, which can be mitigated using retry logic.
Scale out: Use read-only replicas to offload read-only query workloads
As part of High Availability architecture, each single database, elastic pool database, and managed instance in the Premium and Business Critical service tier is automatically provisioned with a primary read-write replica and several secondary read-only replicas. The secondary replicas are provisioned with the same compute size as the primary replica. The read scale-out feature allows you to offload read-only workloads using the compute capacity of one of the read-only replicas, instead of running them on the read-write replica.
HTH.
Yes, the scale-out option is available in the Business Critical (BC) tier. BC utilizes three nodes: one primary and two secondaries. They use Always On on the backend. If you need to use it for reporting, just add ApplicationIntent=ReadOnly to the connection string and your application will be routed to one of the secondary nodes.
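As a hedged illustration, routing read-only work to a secondary replica is just a connection-string change; the server name, database, port and credentials below are placeholders for whatever your managed instance exposes (shown here with pyodbc):

    # Sketch: connect with read-only intent so the session is served by a secondary replica.
    # Server, database, port and credentials are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=myinstance.public.abc123.database.windows.net,3342;"  # hypothetical endpoint
        "DATABASE=reporting;"
        "UID=reader;PWD=CHANGE_ME;"
        "ApplicationIntent=ReadOnly;"   # routes the session to a read-only secondary
        "Encrypt=yes;"
    )
    print(conn.execute("SELECT @@SERVERNAME").fetchone())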

Can I replace Redis cache with Cosmos DB?

Can I use Azure Cosmos DB instead of Redis Cache for server-side caching? I feel that Cosmos DB also provides key-value storage, has geo-replication, read/write access and lower latency than Redis Cache.
If you're still reading this 2 years later, note the following. The answer is yes, but the real story is that they work better together. Azure Cache for Redis now has an Enterprise tier through the same Marketplace tile. This gives you the ability to deploy Redis in an active-active model across multiple regions, where all instances are readable and writeable, with conflict resolution built into the different data types that Redis supports. Couple that with higher performance through the Redis Enterprise proxy and up to five nines of availability, and you have additional options to choose from. Azure Cache for Redis Enterprise (ACRE) in front of Cosmos is a real option, as ACRE has sub-millisecond latency capabilities. Note: I work for Redis Labs and have seen this work and have deployed it myself.
Redis is an in-memory datastore, hence its primary use case is in-memory caching. Since it is a key-value store, it has generally limited query ability, allowing lookups only by primary key.
CosmosDB, on the other hand, is a globally distributed, horizontally scalable, multi-model database service. It comes in handy in scenarios where you need the ability to query over heterogeneous data.
The two serve totally different purposes; Microsoft even offers Redis Cache as a service separate from CosmosDB precisely for this reason.
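To make the "better together" point concrete, here is a rough cache-aside sketch using the redis and azure-cosmos Python packages; the hostnames, credentials, database/container names and the one-hour TTL are all assumptions for illustration:

    # Sketch: cache-aside pattern with Redis in front of Cosmos DB.
    # Hostnames, keys, database/container names and the TTL are placeholders.
    import json
    import redis
    from azure.cosmos import CosmosClient

    cache = redis.Redis(host="mycache.redis.cache.windows.net", port=6380,
                        password="CHANGE_ME", ssl=True)
    cosmos = CosmosClient("https://myaccount.documents.azure.com:443/", credential="CHANGE_ME")
    container = cosmos.get_database_client("appdb").get_container_client("profiles")

    def get_profile(user_id: str) -> dict:
        cached = cache.get(f"profile:{user_id}")
        if cached:
            return json.loads(cached)                                     # cache hit: Redis path
        item = container.read_item(item=user_id, partition_key=user_id)   # cache miss: read Cosmos
        cache.setex(f"profile:{user_id}", 3600, json.dumps(item))         # populate with a 1h TTL
        return item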
Cosmos is probably going to be more expensive than Redis, depending on your throughput.
The one big benefit you can get with Cosmos is multiple read regions, so your availability could increase, and the latency for your users could drop if they're reading from a Cosmos region closer to them.

AWS S3 alternatives for private cloud

Right now we have a requirement to migrate from AWS to a private data center. We need to find a potential alternative to AWS S3 for storage.
Currently S3 is used in the following way:
Overall storage size is 10 TB;
Min/Avg/Max object size is 0.5/2/100 MB;
We have N app instances that simultaneously write/read objects, at approximately 50 writes/sec and 30 reads/sec;
This storage should be redundant (Highly Available), Fault Tolerant, Scalable;
A naive implementation could be to store this data on:
Simple NFS storage with some replication functionality added;
A NoSQL DB (for example Cassandra). However, Cassandra would require a number of instances to support this storage (it is not recommended to store more than 1 TB on a single Cassandra node; see Cassandra capacity planning).
What solution would you recommend for such a scenario?
Using MinIO is your best bet if you want private cloud storage. It is AWS S3 compatible, meaning that applications that use AWS S3 can be migrated to MinIO seamlessly. They have a tutorial on how to connect a MinIO server with the AWS CLI, and you can test it against the publicly hosted MinIO server at https://play.min.io:9000. Please refer to AWS CLI with MinIO Server.
You can have a highly available storage system using a MinIO distributed setup. Beware that dynamic expansion is not a feature of the MinIO distributed setup: if you want to expand your cluster, you end up spinning up a new cluster with your desired number of servers/disks and then migrating your data from the old one to the new one.
I find it much easier to use than HDFS. In addition, a lot of technologies outside the Hadoop ecosystem lack HDFS integration. For example, Docker Registry lacks a built-in HDFS storage driver; however, it has an S3 driver, so you can use MinIO as its object storage.
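Because MinIO speaks the S3 API, existing boto3 code usually only needs its endpoint pointed at the MinIO server. A minimal sketch; the credentials and bucket name are placeholders, and the endpoint is the public MinIO sandbox mentioned above (swap in your private server):

    # Sketch: talk to a MinIO server through the standard S3 API with boto3.
    # Endpoint, credentials and bucket name are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://play.min.io:9000",   # or your private MinIO endpoint
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    s3.create_bucket(Bucket="app-objects")
    s3.put_object(Bucket="app-objects", Key="docs/report.pdf", Body=b"hello")
    print(s3.get_object(Bucket="app-objects", Key="docs/report.pdf")["Body"].read())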
There are a bunch of options for an S3-compatible private cloud service. If you like open-source solutions, the OpenStack and Cassandra options mentioned in other answers are good ones. Note that, no matter what you use, you will probably end up setting up a cloud with multiple nodes; that is an unavoidable trade-off for redundancy and availability. There are some good commercial and economical products as well, such as the one from Cloudian.
If you need an object store, I could recommend Elliptics (documentation available in English).
As far as I know, it doesn't have limits on disk storage.
In our Cassandra setup we use SSD disks (for better performance) of less than 200-500 GB each. The ring size would depend on your requirements (read/write latency, replication rate, time to live).
50 writes/sec, 30 reads/sec
This is really quite easy for Cassandra, judging by our own setup.
In that case it depends more on the time to live of your objects.
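As a back-of-the-envelope check on the ring size, here is a rough estimate for the 10 TB figure from the question; the replication factor and the 1 TB per-node limit are assumptions taken from typical capacity-planning guidance, not measurements:

    # Sketch: rough Cassandra node-count estimate for the 10 TB workload above.
    # Replication factor and the per-node limit are assumptions, not recommendations.
    import math

    raw_data_tb = 10          # total object data from the question
    replication_factor = 3    # typical setting for availability
    per_node_tb = 1           # conservative per-node data limit

    nodes = math.ceil(raw_data_tb * replication_factor / per_node_tb)
    print(f"~{nodes} nodes")  # ~30 nodes, before headroom for compaction and repair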
Generally, for a distributed setup you could also look at GlusterFS.
You can use OpenStack Swift
Swift is a highly available, distributed, eventually consistent object/blob store. Organizations can use Swift to store lots of data efficiently, safely, and cheaply.
Learn More on : https://docs.openstack.org/swift/latest/
And https://oldhenhut.com/2016/05/31/s3-vs-swift/
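For a feel of the Swift API, here is a minimal sketch using the python-swiftclient package; the auth URL, project and credentials are placeholders for whatever your Keystone setup provides:

    # Sketch: store and fetch an object in OpenStack Swift via python-swiftclient.
    # Auth URL, project and credentials are placeholders.
    from swiftclient import client as swift

    conn = swift.Connection(
        authurl="https://keystone.example.com:5000/v3",
        user="app-user", key="CHANGE_ME",
        auth_version="3",
        os_options={"project_name": "app", "user_domain_name": "Default",
                    "project_domain_name": "Default"},
    )

    conn.put_container("app-objects")
    conn.put_object("app-objects", "docs/report.pdf", contents=b"hello",
                    content_type="application/pdf")
    headers, body = conn.get_object("app-objects", "docs/report.pdf")
    print(len(body))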

Auto-scaling with Amazon EC2 when SQL is involved

I'm building a whiteboard web app with self-contained "rooms" of clients that runs off Amazon EC2 instances (a single one for now). Commands are sent via websockets to a PHP server, which stores all commands in a SQL database.
Up until now I was using Google Cloud SQL. My plan was to learn how to scale with EC2 and have all instances use the same remote database. I've learned this won't work due to the 200 ms write latency of a remote SQL server vs. the 0.5 ms write latency of a local SQL server. The server makes a write every time a command arrives.
I'm new to scalability and distributed systems. My intuition tells me I either need to use Amazon RDS and hope for millisecond latencies if my EC2 and RDS instances are in the same region, or work with SQL locally on EC2 instances. I'm leaning toward the latter. Here's my issue: EC2 is elastic. What happens when I need to get rid of an instance?
All I can think of right now is somehow replicating the SQL data from each EC2 instance to a master instance (maybe even Google Cloud SQL!). In other words, all reads/writes for each "room" happen locally, and are eventually replicated to the master server for long-term storage. If a "room" is re-opened a week later, a different EC2 instance can grab data from the master server, work with it locally, and replicate changes back before being destroyed.
Does my approach sound correct? Is replication the right concept here? If so, how much support for what I'm trying to do already exists? That is, do I need to set up a master server that manages EC2 instances and distributes/collects the SQL data manually (a 100% custom implementation), or are there existing libraries/mechanisms for SQL and maybe even EC2 instance replication/management? And if my approach is wrong, what are some better approaches? This is one of those times where I don't know what to research on my own. Thanks!
I'd agree with user02525: perhaps look at using ElastiCache Redis; it sounds more in line with what you're doing.
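To sketch what that could look like under the approach described in the question (key names, the ElastiCache endpoint, the flush threshold and flush_to_sql() are all hypothetical), each room's command stream can be appended to a Redis list and written through to the durable SQL store in batches:

    # Sketch: buffer whiteboard commands per room in Redis (e.g. ElastiCache),
    # flushing them to the durable SQL store in batches. Key names, the flush
    # threshold and flush_to_sql() are hypothetical.
    import json
    import redis

    r = redis.Redis(host="my-cluster.cache.amazonaws.com", port=6379)  # placeholder endpoint
    FLUSH_EVERY = 100  # commands per room before writing through to SQL

    def record_command(room_id: str, command: dict) -> None:
        key = f"room:{room_id}:commands"
        length = r.rpush(key, json.dumps(command))   # fast in-memory write on every command
        if length >= FLUSH_EVERY:
            batch = r.lrange(key, 0, -1)
            flush_to_sql(room_id, [json.loads(c) for c in batch])  # hypothetical bulk insert
            r.delete(key)

    def flush_to_sql(room_id: str, commands: list) -> None:
        ...  # bulk INSERT into the long-term SQL store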

Is Amazon EC2 recommended for a persistent public facing website?

My company is about to write a new public-facing website in SharePoint (so Windows Server 2008 R2, SQL Server 2008 R2, etc.) and we're looking at using Amazon EC2 to host it. I've read and been told that instances can disappear (often through user error, but also in batches), so I'm skeptical that EC2 is the best idea for us.
I've done research on the Amazon AWS site, but must confess that most of the terminology used is confusing, and Googling my questions often brought me here, so I thought I'd ask my questions here too and see if people can advise me.
1) It's critical that our website be available to the public as much as possible (the usual 99.9% uptime targets apply). The Amazon EC2 Service Level Agreement commitment is 99.95% availability, which is fine, but what happens if we hit that 0.05% scenario? Would our EC2 instance be lost? Can it be recovered? If so, what would we need to do to ensure that we recover to a not-too-old version of our site?
2) I've read about Amazon Elastic Block Store (EBS), and how it persists independently of the lifetime of the instance. If I understand correctly, EBS is like having a hard drive, so if the instance is lost we can start a new instance using our EBS volume to recover the latest version, while the 'local instance store' would be lost along with the instance. Is that right?
3) Are 'reserved instances' a more stable option? i.e. are they less likely to disappear? If they do still disappear, what recovery benefits do they offer, if any?
I know these questions are kinda vague, but hopefully you'll be able to offer a newbie some basic info, enough to point me in the right direction for further, deeper research at least.
Many thanks.
Kevin
We rely on AWS for our webservers. I won't use anything else. They're highly scalable, easily configurable and have an absurd uptime. I've never experienced downtime with them. We've been with them for two years.
Reserved instances are cheaper. Get them if you're planning on having that instance for a while. It's simply a cost/budgeting issue.
Never heard of people losing an EC2 instance.
Not terribly knowledgeable about EBS, but S3 is a good way to back up data.
HTH
EDIT:
Came across some links that might be helpful. Cheers.
http://techblog.netflix.com/2010/12/four-reasons-we-choose-amazons-cloud-as.html
http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html
http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html
One of the main design goals of AWS is to build fault-tolerant services, that is, services that can recover from failures. They design all of their services with the assumption that something will fail in some way at some point, but that there will be redundancies and other mechanisms in place to recover from those inevitable failures.
In the case of storage services like S3 and SimpleDB, this is achieved primarily by replicating your data across multiple nodes (machines) in multiple data centers. So when one node experiences a hardware failure or one data center experiences a power outage, there is no real downtime, as the replicas can still service the requests. As a consumer, you aren't even aware of the down nodes or data centers.
EC2 is designed to work similarly, but it is not quite as encapsulated as S3 and SimpleDB, so you'll need to plan for a bit of the work yourself. For example, if you need a web service with guaranteed uptime and availability, you'll want to look into the AWS ELB (Elastic Load Balancing) service. That way, if an instance is down, requests will automatically be routed to other healthy instances. For your data, you can either store it in other AWS services (like S3, SimpleDB and EBS) which have built-in redundancy, or you can build your own solution using similar redundancy techniques.
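A rough sketch of the load-balancing part using boto3 against the current Elastic Load Balancing v2 API; the VPC, subnet and instance IDs are placeholders. The key idea is that instances failing the health check simply stop receiving traffic:

    # Sketch: put web instances behind a load balancer with a health check,
    # so traffic is only routed to healthy instances. All IDs are placeholders.
    import boto3

    elb = boto3.client("elbv2", region_name="us-east-1")

    tg = elb.create_target_group(
        Name="web-tg", Protocol="HTTP", Port=80, VpcId="vpc-0123456789abcdef0",
        HealthCheckPath="/health",        # instance is drained from rotation if this fails
    )["TargetGroups"][0]

    lb = elb.create_load_balancer(
        Name="web-lb",
        Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],   # two AZs for availability
    )["LoadBalancers"][0]

    elb.create_listener(LoadBalancerArn=lb["LoadBalancerArn"], Protocol="HTTP", Port=80,
                        DefaultActions=[{"Type": "forward",
                                         "TargetGroupArn": tg["TargetGroupArn"]}])
    elb.register_targets(TargetGroupArn=tg["TargetGroupArn"],
                         Targets=[{"Id": "i-0123456789abcdef0"}])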
The SLA amounts to nothing. We found out that:
Instances and EBS volumes DID get lost
It took Amazon more than 2 days to recover from a disaster, and even then not to the full extent
We were the lucky ones who managed to get back on our feet in less than 2 days. Other companies got stuck with no recovery option.
And what does Amazon recommend? "Don't trust our reliability. Pay for 2 or 3 more copies of your system in different regions, and then you will be safe".
More information can be found here:
http://www.zdnet.com/blog/saas/lightning-strike-zaps-ec2-ireland/1382
tldr: AWS is very reliable if you know what you're doing, a bad idea if you don't.
As you're unfamiliar with the terms, here's a very quick glossary:
AZ - Availability Zone; there are several availability zones per region (e.g. 3 in Ireland). They are physically isolated datacentres with different power grids, flood plains etc., but with internal-network-quality connections between them. It's possible, even likely, that an AZ will become unavailable at some point; I don't think all AZs in a region have ever been down at once, though.
EBS/Instance Store - These are the two main types of storage available to an instance. The best way to describe them: Instance Store is the equivalent of an HDD plugged into your motherboard via SATA; it's very fast, but what happens if you shut down your instance (or the motherboard fails) and want to start instantly on another board? (Amazon completely hides the physical hardware setup.) Obviously you aren't going to wait for an engineer to unplug a drive from one server and move it to another, so they don't even offer this. Instance store is fast but temporary and tied to the physical machine; DO NOT store anything important on it. EBS is the alternative: a very low-latency network drive that any server can connect to as though it were local. You can shut down a server, change its size and restart on a completely different server on the other side of the datacentre (again, the physical side is hidden); it doesn't matter, your EBS volume hasn't gone anywhere (by default EBS volumes are also stored on multiple physical discs).
Commodity cloud hardware - My interpretation of all the "cloud hardware fails all the time, it's really risky and unreliable" talk is that, yes, AWS hardware is not as reliable as enterprise-level components in a managed datacentre. This doesn't mean it's unreliable; it just means you should build failure in as an option in your design.
The first very important thing to note when talking about SLAs is that Amazon states very clearly that the SLA ONLY applies if one or more AZs go down. So if you do not understand how their service works, build only one server in one AZ, and a generator or router fails, it's your own fault.
As for recovery, that depends: is your entire application state stored on one server? If it is, don't bother with the cloud. If, however, you can cluster your state across multiple servers, store it in RDS or some other persistent DB, or your content changes so infrequently that you can make periodic copies to S3 storage, you'll be fine. Your failure strategy (in order of preference) could be clustered, failover, or auto repair. For the first, you have clustered servers sharing state; it doesn't matter if you lose a server or an AZ. For the second, you only have one live server, but if it goes down you have a failover standing by with the same content. Finally, with auto repair there are two possible situations: if your data is only on one EBS drive, you could start another instance with the same drive and carry on; but if the EBS drive or the AZ fails, you will need to be ready with a snapshot in S3 that a completely fresh instance can copy and start up from.
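The auto-repair option above boils down to keeping snapshots around. A minimal sketch (the volume ID, region and tag values are placeholders) of taking an EBS snapshot, which EC2 stores durably in S3 behind the scenes:

    # Sketch: periodic EBS snapshot that a fresh instance could be restored from.
    # The volume ID, region and tag values are placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="eu-west-1")

    snap = ec2.create_snapshot(
        VolumeId="vol-0123456789abcdef0",
        Description="nightly backup of the web content volume",
        TagSpecifications=[{"ResourceType": "snapshot",
                            "Tags": [{"Key": "app", "Value": "website"}]}],
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])
    print("snapshot ready:", snap["SnapshotId"])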
Reserved instances are no more reliable; they're the same hardware, you're just entering into a contract saying you'll have x machines for y years. That allows AWS to plan better, which in turn is cheaper for you.