what is the difference between s3-website-some-region and s3-website.some-region? - amazon-s3

My S3 static website bucket has this full URL:
http://frontend-erjan-result.s3-website-us-east-1.amazonaws.com
Does it matter if the example I'm following has a dot rather than a dash before the region name?
"s3-website-region" vs "s3-website.region"
http://frontend-erjan-result.s3-website.us-east-1.amazonaws.com
Is it crucial, or does it not matter?

The official s3-website endpoints are listed in the S3 documentation (scroll down to “Amazon S3 website endpoints”). There is an apparent inconsistency: some have a hyphen, some have a dot.
To be honest, I didn’t know this myself. My guess is that it has historic reasons; I could imagine that they started with the hyphenated version for the early regions (us-east-1, eu-west-1) and then switched to the dot notation, possibly allowing easier DNS zone management based on AWS Regions. And I guess that at the same time, a fallback was created so that the dot notation is also supported for the hyphenated endpoints in order to be consistent, especially for the purposes of automation. Again, this is just an assumption.
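To compare the two spellings for a given bucket, a small helper can build both candidate hostnames. This is only a sketch: the function name is made up, it does nothing but format strings, and which form is the documented one for a region still has to be checked against the endpoint table in the S3 documentation.

```python
def website_endpoints(bucket: str, region: str) -> dict:
    """Build both candidate S3 website URLs for a bucket.

    Hypothetical helper: it only formats strings and does not check
    which spelling AWS documents for the given region.
    """
    return {
        "dash": f"http://{bucket}.s3-website-{region}.amazonaws.com",
        "dot": f"http://{bucket}.s3-website.{region}.amazonaws.com",
    }


if __name__ == "__main__":
    for style, url in website_endpoints("frontend-erjan-result", "us-east-1").items():
        print(style, url)
```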

Related

AWSDynamoDBObjectMapper or AWSDynamoDB?

The AWS documentation is seemingly endless, and different pages tell me different things. For example, one page tells me that AWSDynamoDBObjectMapper is the entry point to working with DynamoDB, while another tells me that AWSDynamoDB is the entry point to working with DynamoDB. Which class should I be using? Why?
EDIT: One user mentioned he didn't understand the question. To be clearer, I want to know, in general, what the difference is between using AWSDynamoDB and AWSDynamoDBObjectMapper as entry points for interfacing with DynamoDB.
Doc links for both:
AWSDynamoDB
AWSDynamoDBObjectMapper
Since both can clearly read, write, query, and scan, you need to pick out the differences. It appears to me that the ObjectMapper class supports the concept of mapping an AWSDynamoDBModel to a DB vs. directly manipulating specific objects (as AWSDynamoDB does). Moreover, it appears that AWSDynamoDB also supports methods for managing tables directly.
I would speculate that AWSDynamoDB is designed for managing data where the schema is pre-defined on the DB, and AWSDynamoDBObjectMapper is designed for managing data where the schema is defined by the client.
All of this speculation aside though, the most important bit you can glean from the documentation is:
Instead of making the requests to the low-level DynamoDB API directly from your application, we recommend that you use the AWS Software Development Kits (SDKs). The easy-to-use libraries in the AWS SDKs make it unnecessary to call the low-level DynamoDB API directly from your application. The libraries take care of request authentication, serialization, and connection management. For more information, go to Using the AWS SDKs with DynamoDB in the Amazon DynamoDB Developer Guide.
I would recommend this approach rather than worrying about the ambiguity of class documentation.
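The distinction between direct manipulation and object mapping can be illustrated without any SDK. The sketch below is in Python rather than the language of the SDK in question, and `Book`, `to_item` and `from_item` are hypothetical names. It assumes the low-level style works with typed attribute-value maps (`S` for strings, `N` for numbers), while a mapper converts between those maps and model objects.

```python
from dataclasses import dataclass, fields


@dataclass
class Book:
    # Hypothetical model class used only for illustration.
    isbn: str
    title: str
    price: int


def to_item(obj) -> dict:
    """Model object -> low-level attribute-value map (str and int fields only)."""
    item = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if isinstance(value, str):
            item[f.name] = {"S": value}
        else:
            item[f.name] = {"N": str(value)}  # numbers are carried as strings
    return item


def from_item(cls, item: dict):
    """Low-level attribute-value map -> model object."""
    kwargs = {}
    for f in fields(cls):
        type_tag, raw = next(iter(item[f.name].items()))
        kwargs[f.name] = raw if type_tag == "S" else int(raw)
    return cls(**kwargs)


if __name__ == "__main__":
    item = to_item(Book("978-0", "Dune", 12))
    print(item)
    print(from_item(Book, item))
```

A real mapper does this conversion for you, which is roughly the convenience the ObjectMapper class appears to offer over the lower-level client.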

does passing parameter with url affect amazon s3 caching?

I have a file hosted on amazon s3 service http://www.example.com/tempfile.html
Now if I pass a parameter with url like http://www.example.com/tempfile.html?u=2345
Will this be treated as a different url altogether and I will get no benefit of caching?
Generally speaking, adding a query string does indeed yield a different URL, which is still fully cacheable in its own right; see the section Side Effects of GET and HEAD (within Caching in HTTP):
We note one exception to this rule: since some applications have
traditionally used GETs and HEADs with query URLs (those containing a
"?" in the rel_path part) to perform operations with significant side
effects, caches MUST NOT treat responses to such URIs as fresh unless
the server provides an explicit expiration time.
This is actually a widely used, though somewhat controversial, technique to force browsers to replace long-lived static assets like CSS/JS; see e.g. SCdF's answer to What is an elegant way to force browsers to reload cached CSS/JS files? and its comments for an extensive discussion of this and related approaches and their respective pros and cons. The apparently preferred solution for the CSS/JS topic nowadays uses filename fingerprinting rather than adding a query string, as discussed in the accepted and other answers there as well.
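The two cache-busting styles can be sketched side by side. This is a minimal illustration, not a build tool: the function names, the `v` parameter and the 8-character digest length are arbitrary choices made here.

```python
import hashlib
from pathlib import PurePosixPath


def _digest(content: bytes) -> str:
    # Short content hash; it changes whenever the file content changes.
    return hashlib.sha256(content).hexdigest()[:8]


def query_busted(url_path: str, content: bytes) -> str:
    """Query-string style: /css/site.css?v=<hash>"""
    return f"{url_path}?v={_digest(content)}"


def fingerprinted(url_path: str, content: bytes) -> str:
    """Filename fingerprinting style: /css/site.<hash>.css"""
    p = PurePosixPath(url_path)
    return str(p.with_name(f"{p.stem}.{_digest(content)}{p.suffix}"))


if __name__ == "__main__":
    css = b"body { color: black; }"
    print(query_busted("/css/site.css", css))
    print(fingerprinted("/css/site.css", css))
```

Either way the URL changes only when the content does, so unchanged assets keep benefiting from caching.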

How important are website optimizations?

Currently I am running Apache and MySQL and I hear about people talking about GZipping content, something about ETags, using a CDN, adding expire headers, minifying text documents, combining script files, etc. I downloaded a Firefox add-on called YSlow and I noticed that many websites do not employ all of these tactics. I believe even Google has a D rating. So I ask, SO, how important are these optimizations?
They depend highly on your traffic and resources at your disposal.
If you make the website for Joe's Pizza in the middle of nowhere, there is no real need to waste time optimising the site; it will likely have a handful of visits a day.
But Stack Overflow receives thousands of hits a minute (probably more), so they use a CDN, distant expiry headers, minification, etc.
Honestly, if people aren't complaining it's probably not a big deal. If people are complaining, start by looking at the database.
In my years of web development, most web application performance problems have stemmed from the DB (this doesn't mean that all performance problems come from the DB, but it's a good place to start). While I am fascinated by things like minified JS and CSS sprites, I suspect that these things do not make a difference in a "day in the life of your average web developer".
It's good that you consider these things, but unless you are working at an extremely high traffic site, it probably won't make a difference.
It all depends on your application.
Minifying, for example, might be great for an application that depends heavily on external .js files. There is no reason NOT to do this: it adds no runtime overhead and potentially saves quite a few bytes.
Compression is great for certain content types - terrible for others and involves a slight overhead while transporting pages.
CDNs are up to your affordability, content type and how dynamic the content is. You obviously don't need Akamai backing up the average Drupal site.
etc, etc, etc
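The point about compression suiting some content types and not others is easy to check yourself. The sketch below uses repetitive markup as a stand-in for HTML and random bytes as a stand-in for already-compressed media such as JPEGs; both payloads are made up for illustration.

```python
import gzip
import os


def gzip_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    return len(gzip.compress(payload)) / len(payload)


if __name__ == "__main__":
    html_like = b"<li class='item'>hello world</li>\n" * 500
    random_like = os.urandom(len(html_like))  # stand-in for already-compressed data
    print(f"repetitive markup: {gzip_ratio(html_like):.2f}")
    print(f"random bytes:      {gzip_ratio(random_like):.2f}")
```

The markup shrinks to a small fraction of its size, while the random payload does not shrink at all, so the CPU time spent compressing it is wasted.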

Experiences and tips for programming with and for Amazon's cloud servers/apps/tools?

We're looking into developing a product that would use Amazon's cloud tools (EC2, SQS, etc), and I'm curious what tips/gotchas/pointers people that have used these technologies have.
One tip/whatever per post, please.
The Elasticfox plug-in for Mozilla makes doing a lot of the EC2 stuff easier. It can be found at: Elasticfox Firefox Extension for Amazon EC2. This page has links specifically to download the Elasticfox plug-in and also the associated Sourceforge project. Well worth using...
Get a developer account at Right Scale. It's free and a god-send for a guy who hates remembering those dumb commands and arguments. If you only resort to Amazon-supplied tools, you're throwing away your human rights.
We're interested in EC2 where I work. We don't care about web-serving or enterprisey stuff, just massive number crunching for physics, using Python. This EC2 stuff had me befuddled, with most documentation oriented toward businessy applications and using C# or Java, but this slide show clarified much for me, especially for using Python: http://www.datawrangling.com/pycon-2008-elasticwulf-slides
As for SimpleDB, it has a very limited and restrictive query language. If you are planning on having a lot of complex queries, you must first sit down and think about how to organize your data to make those queries possible. One thing that is missing, but will probably be added, is the ability to count the results of a given query, much like SQL's COUNT.
Performance is ok, but I consider the latency maybe a little high.
An important concept to grasp: the file system your EC2 instance lives on while it's running is not persistent. There are tools/services available that let you mount file systems backed by S3 storage, or you can upload to S3 or other storage service from the instance, but when an instance closes the associated file system is no more.
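One way to act on this is to copy anything worth keeping to S3 before the instance goes away. The helper below is a sketch with a hypothetical name; it assumes the client object exposes `upload_file(Filename, Bucket, Key)`, as a boto3 S3 client does, and the bucket and prefix are placeholders.

```python
import os


def persist_directory(s3_client, local_dir: str, bucket: str, prefix: str = "") -> list:
    """Upload every file under local_dir and return the object keys used.

    s3_client is assumed to provide upload_file(Filename, Bucket, Key).
    """
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Use forward slashes in keys regardless of the local OS.
            rel = os.path.relpath(path, local_dir).replace(os.sep, "/")
            key = f"{prefix}{rel}"
            s3_client.upload_file(path, bucket, key)
            uploaded.append(key)
    return uploaded
```

With boto3 installed and credentials configured, the client would typically come from `boto3.client("s3")`.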
As for tools, I've found Amazon's tools to be great, but you should probably be comfortable with the command line if you're taking this route.
For managing your EC2 instances, etc., Amazon also offers the management console (in beta for the past couple of days), which has similar functionality to the Elasticfox Firefox plugin but is a pure web console.
https://console.aws.amazon.com

Amazon SimpleDB

Has anyone considered using something along the lines of the Amazon SimpleDB data store as their backend database?
SQL Server hosting (at least in the UK) is expensive, so could something like this, along with cloud file storage (S3), be used for building apps that can grow with your needs?
Great in theory, but would anyone consider using it? In fact, is anyone actually using it now for real production software? I would love to read your comments.
This is a good analysis of Amazon services from Dare.
S3 handled what I've typically heard described as "blob storage". A typical Web application typically has media files and other resources (images, CSS stylesheets, scripts, video files, etc) that is simply accessed by name/path. However a lot of these resources also have metadata (e.g. a video file on YouTube has metadata about it's rating, who uploaded it, number of views, etc) which need to be stored as well. This need for queryable, schematized storage is where SimpleDB comes in. EC2 provides a virtual server that can be used for computation complete with a local file system instance which isn't persistent if the virtual server goes down for any reason. With SimpleDB and S3 you have the building blocks to build a large class of "Web 2.0" style applications when you throw in the computational capabilities provided by EC2.
However neither S3 nor SimpleDB provides a solution for a developer who simply wants the typical LAMP or WISC developer experience of building a database driven Web application or for applications that may have custom storage needs that don't fit neatly into the buckets of blob storage or schematized storage. Without access to a persistent filesystem, developers on Amazon's cloud computing platform have had to come up with sophisticated solutions involving backing data up manually from EC2 to S3 to get the desired experience.
I just finished writing a library to make porting an app to simpledb in Perl easy, Net::Amazon::SimpleDB::Simple because I found the Amazon client libraries painful. The library isn't on CPAN yet, but it is at http://rjurneyopen.s3.amazonaws.com/SimpleDB/Simple.pm The idea was to make it trivial to stuff hashes in and out of SimpleDB.
I just ported an app to use it. Overall I am impressed with SimpleDB... even inefficient queries take only 2-3 seconds to return. SimpleDB doesn't seem to care about the size of your table, owing to its Erlang/parallel nature. Tablescans are easy for it.
The pain comes from the fact that you can't count, sum or group by. If you plan on doing any of those things... then SimpleDB probably isn't for you. At the moment, in terms of functionality, it exists somewhere in between memcached and MySQL. You can SELECT ORDER BY LIMIT, which is nice. It's also nice that you don't have to scale it yourself, and it's nice that it doesn't care how much you stuff into it. But more advanced operations like analytics are painful at best. You'll have to do your own calculations server side. It's also a big plus that on any computer I can use the simpledb CLI http://code.google.com/p/amazon-simpledb-cli/ to query my data.
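Doing the calculations yourself usually means fetching the items and aggregating them in application code. The sketch below emulates COUNT and SUM with GROUP BY over a list of already-fetched items; the function name and the sample attribute names are hypothetical, and it assumes attribute values arrive as strings.

```python
from collections import defaultdict


def group_stats(items, group_key, value_key):
    """Emulate COUNT(*) and SUM(value) ... GROUP BY group_key on fetched items."""
    stats = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for item in items:
        bucket = stats[item[group_key]]
        bucket["count"] += 1
        bucket["sum"] += float(item[value_key])  # values assumed to arrive as strings
    return dict(stats)


if __name__ == "__main__":
    rows = [
        {"user": "a", "views": "3"},
        {"user": "b", "views": "5"},
        {"user": "a", "views": "4"},
    ]
    print(group_stats(rows, "user", "views"))
```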
There are some confusing 'gotchas.' For instance, attributes can have more than one value, and you have to explicitly set 'replace' when storing items. Also, storing undef or null string results in a library error, instead of deleting that attribute name/value pair or setting it null/empty string.
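The multi-value gotcha is easier to see in a toy model. The function below is an in-memory illustration of the behaviour described here, not a SimpleDB client, and its name and signature are made up: without `replace`, a second write adds another value to the attribute instead of overwriting it.

```python
def put_attribute(item: dict, name: str, value: str, replace: bool = False) -> None:
    """Toy model: attributes hold a list of values unless replace is set."""
    if replace or name not in item:
        item[name] = [value]
    elif value not in item[name]:
        item[name].append(value)


if __name__ == "__main__":
    item = {}
    put_attribute(item, "color", "red")
    put_attribute(item, "color", "blue")  # attribute now holds two values
    print(item)
    put_attribute(item, "color", "green", replace=True)  # previous values are dropped
    print(item)
```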
Learning to think in a largely un-normalized way is a little strange too, which is why I would second the suggestion above that says it is best for new applications. Porting from a SQL app to SimpleDB would be painful because your application logic would have to change. The way you do things is a bit different. The Amazon docs are pretty good at explaining this.
All of this can be abstracted away in a library that sits atop SimpleDB, so for your use of SimpleDB you will want to pick a good library... you probably don't want to deal with it directly. There is some work on the PHP side to make things easy, and there is my library. There is a RAILS activesource, but it doesn't seem to do much for you.
All in all, it's still early in the game, but compared to other APIs (Twitter comes to mind), I have to say that the SimpleDB REST API is pretty simple (especially considering that it is XML) and polite to work with. I would recommend it... depending on the requirements of your application and the economics of your use of it. If you're looking to rapidly scale a service that doesn't put a great load on the DB and don't want to bother with a scalable MySQL/memcache combo... then SimpleDB can offer a 'simple' solution for you.
I expect that its features will continue to grow and it will be a good choice for more and more applications that do more complex and interesting things. But right now it is targeted at and appropriate for your typical Web 2.0 service.
We are using SimpleDB almost exclusively for our new projects. The zero-maintenance, high-availability, no-install aspects are just too good. And for you Ruby developers, check out SimpleRecord, an ActiveRecord-like interface for SimpleDB which makes it super easy to use.
But do you really need SQL Server? Can't you live with PostgreSQL or MySQL? Both have proven to be ok for most tasks.
Now if you need SQL Server features then you're out of luck.
Another option is to rent a server. How expensive is expensive?
(I've used Amazon S3 to store images for an application, it's ok and works fine, at least for that)
I haven't used SimpleDB, but have been using combination of S3, EC2, and MySQL for our application.
As long as you are willing to use SimpleDB, then you might as well consider using MySQL (which is very scalable, and not that expensive).
On the S3 and EC2 side, it is great in practice as well.
SimpleDB works great for many applications. If your project will require a lot of analytic reporting, joining, etc., you may want to consider MySQL or a hybrid model.
If you go SimpleDB, we've developed Radquery.com for our internal use and opened it up to the public.