Cloud9 workspace using S3 bucket as source? - amazon-s3

Given the popularity of hosting static sites from AWS S3 buckets, it would be great to be able to do that from Cloud9 too.
Is there any way I can set up an FTP-based workspace that uses an S3 bucket as the source?
Transmit and other FTP apps have the ability to work directly with an S3 bucket. I did try setting up an FTP workspace in Cloud9 using the following:
Host: s3.amazonaws.com
Username: My-Access-Key
Password: My-Secret-Key
I know it was a long-shot and I have since read confirmation that Amazon doesn't allow simple FTP access to buckets like that.
Any ideas if this is possible?

FTP workspaces on Cloud9 are actually being phased out, so I'd recommend using the mounting feature described in this blog post to mount an FTP source: https://c9.io/site/blog/2014/12/ftp-sftp-mounting-beta
Unfortunately, S3 doesn't support the FTP protocol, so this would have to be a new feature. Luckily we're opening up our SDK to be able to implement features like this. If you're interested in contributing please email us via https://support.c9.io

Codeanywhere (https://codeanywhere.com) does this now. However, you'll have to shell out $7 to $10/month for that capability.
But then again, like Cloud9 (which I'm a big fan of), you get a bunch of features on the Codeanywhere IDE.
I was disappointed when Cloud9 discontinued its efforts on S/FTP. Codeanywhere seems to be taking on the cloud/storage issue head-on by handling cloud access to S3, FTP, SFTP, Google Drive and others.

Related

Mount S3 bucket as an NFS share on an EC2 instance

Long-time reader, but I've usually been able to find the answers I've been looking for in existing posts. This time I haven't been able to.
I am essentially teaching myself AWS CDK from scratch. I've only really just started with it, so not finding anything that helps me on my mission may be a result of not knowing enough yet to ask the right questions... so please bear with me.
So far I've used the AWS CDK with Python to create a stack which creates an S3 bucket and also fires up an EC2 instance with an AWS file storage gateway AMI loaded on it (so running Amazon Linux). This deploys and runs fine. However, I'd now like to programmatically set up the S3 bucket to be accessed via an NFS share on the EC2 instance. From what I've seen I'd assumed it should be fairly trivial, but I keep getting lost in documentation and internet hunts, and I'm not sure I'm looking in the right places or asking search engines the right questions to unlock the path to achieve this.
It looks like I should be able to script something up to make it happen when the instance starts, using user-data, but I'm a bit lost. Is anyone able to throw me some crumbs to follow to find a good way of achieving this, or a better way of achieving what I want (which is basically accessing the S3 bucket contents as though they are files on an EC2 instance)? And if it's trivial enough, just tell me how to do it.
Much appreciated :)
Dan
You are on the right track. user_data can be used for that.
I don't have full code to give you as it's use-case specific (e.g. which OS are you using?), but the user_data would have to download and install s3fs:
s3fs allows Linux and macOS to mount an S3 bucket via FUSE. s3fs preserves the native object format for files, allowing use of other tools like AWS CLI.
However, S3 is an object storage system, and it can't really be mounted on an instance the way you would mount NFS or EBS storage. But with s3fs-fuse you can mimic such behaviour, and for some use cases it will be sufficient.
So what you can do is set up the user_data script through the console, verify that it works, and then basically copy and paste it into CDK. It's more of a trial-and-error approach, but this is a good way to learn.
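To make that concrete, here is a minimal CDK (Python) sketch of the user_data approach, assuming Amazon Linux 2 with s3fs-fuse installed from EPEL; the stack, instance and mount-point names are placeholders rather than anything from the original question.

from aws_cdk import Stack, aws_ec2 as ec2, aws_s3 as s3
from constructs import Construct

class S3MountStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        bucket = s3.Bucket(self, "DataBucket")
        vpc = ec2.Vpc(self, "Vpc", max_azs=1)

        # Shell commands run at first boot: install s3fs-fuse and mount the bucket
        user_data = ec2.UserData.for_linux()
        user_data.add_commands(
            "amazon-linux-extras install -y epel",  # s3fs-fuse lives in EPEL
            "yum install -y s3fs-fuse",
            "mkdir -p /mnt/s3data",
            # Use the instance profile credentials instead of access keys
            f"s3fs {bucket.bucket_name} /mnt/s3data -o iam_role=auto -o allow_other",
        )

        instance = ec2.Instance(
            self, "MountInstance",
            instance_type=ec2.InstanceType("t3.micro"),
            machine_image=ec2.MachineImage.latest_amazon_linux2(),
            vpc=vpc,
            user_data=user_data,
        )
        bucket.grant_read_write(instance.role)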

When to use s3cmd over accessing the S3 API programmatically?

I've been having difficulty understanding when to use the s3cmd program over the Java API. A vendor has documentation on accessing S3 with s3cmd. It is unclear to me, as the bucket names appear to be dynamic and no region is specified. Additionally, I'm reaching out over an endpoint. I've tried writing some Java code to interact with S3 the same way that s3cmd does, but I haven't been able to connect. Overall, it appears to be quite a bit different.
To me s3cmd seems to be a utility to manipulate these files or quickly get at them. Integrating this utility into a Java program seems meaningless.
Anyone have any resources or can help me understand this better?
S3cmd (s3cmd) is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol, such as Google Cloud Storage or DreamHost DreamObjects. It is best suited for power users who are familiar with command line programs. It is also ideal for batch scripts and automated backup to S3, triggered from cron, etc.
S3cmd is written in Python. It's an open source project available under the GNU General Public License v2 (GPLv2) and is free for both commercial and private use. You will only have to pay Amazon for using their storage.
Lots of features and options have been added to S3cmd since its very first release in 2008... we recently counted more than 60 command line options, including multipart uploads, encryption, incremental backup, s3 sync, ACL and Metadata management, S3 bucket size, bucket policies, and more!
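For contrast with the command-line tool, here is a minimal sketch of the same kind of access done programmatically with Python and boto3 (the asker mentions Java, but the AWS SDK for Java exposes the same knobs); the endpoint, keys and bucket name are placeholders. Pointing the SDK at the vendor's endpoint is the programmatic equivalent of s3cmd's host_base setting.

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-vendor.com",  # placeholder vendor endpoint
    aws_access_key_id="MY-ACCESS-KEY",
    aws_secret_access_key="MY-SECRET-KEY",
    region_name="us-east-1",  # many S3-compatible stores ignore the region
)

# Roughly equivalent to `s3cmd ls s3://my-bucket`
for obj in s3.list_objects_v2(Bucket="my-bucket").get("Contents", []):
    print(obj["Key"], obj["Size"])

# Roughly equivalent to `s3cmd put local.txt s3://my-bucket/local.txt`
s3.upload_file("local.txt", "my-bucket", "local.txt")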

AWS FTP behavior

I'm having an issue with my AWS S3 bucket and vsftpd.
I've created a vsftpd instance and mounted an AWS S3 bucket. My issue is that every time I upload a file and the connection is disrupted, the FTP client's retry appends to the existing file in the S3 bucket instead of overwriting it. What should I set in the S3 bucket policy to get overwrite behaviour instead of append?
There are no Amazon S3 configuration settings that would impact this behaviour -- it is totally the result of the software you are using.
It's also worth mentioning that FTP is a rather old protocol and these days there are much better alternatives, such as uploads via the browser or Dropbox-like shared folders.
One of the easiest options is to have your users upload directly to Amazon S3 -- that way, you don't need to run any servers. This could be done by uploading via a browser, or by providing users with some software, such as Cloudberry Explorer or the AWS Command-Line Interface (CLI).
I highly encourage you to stop using FTP these days.
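To illustrate the "upload directly to Amazon S3" suggestion, here is a minimal boto3 sketch that generates a presigned POST a browser form can upload to, with no FTP server in the middle; the bucket name and key prefix are placeholders.

import boto3

s3 = boto3.client("s3")
post = s3.generate_presigned_post(
    Bucket="my-upload-bucket",   # placeholder bucket
    Key="uploads/${filename}",   # S3 substitutes the submitted file name
    ExpiresIn=3600,              # the form is valid for one hour
)
print(post["url"])     # use as the HTML form's action
print(post["fields"])  # include these as hidden form fields in the POST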

Access files stored on Amazon S3 through web browser

Current Situation
I have a project on GitHub that builds after every commit on Travis-CI. After each successful build Travis uploads the artifacts to an S3 bucket. Is there some way for me to easily let anyone access the files in the bucket? I know I could generate a read-only access key, but it'd be easier for the user to access the files through their web browser.
I have website hosting enabled with the root document of "." set.
However, I still get a 403 Forbidden when trying to go to the bucket's endpoint.
The Question
How can I let users easily browse and download artifacts stored on Amazon S3 from their web browser? Preferably without a third-party client.
I found this related question: Directory Listing in S3 Static Website
As it turns out, if you enable public read for the whole bucket, S3 can serve directory listings. The problem is that they are in XML instead of HTML, so they're not very user-friendly.
There are three ways you could go for generating listings:
Generate index.html files for each directory on your own computer, upload them to S3, and update them whenever you add new files to a directory. Very low-tech (see the sketch after this list). Since you're saying you're uploading build files straight from Travis, this may not be that practical, since it would require doing extra work there.
Use a client-side S3 browser tool.
s3-bucket-listing by Rufus Pollock
s3-file-list-page by Adam Pritchard
Use a server-side browser tool.
s3browser (PHP)
s3index (Scala). Going by the existence of a Procfile, it may be readily deployable to Heroku. Not sure, since I don't have any experience with Scala.
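If you go with the first option, here is a minimal sketch of generating the index.html files with Python and boto3 instead of by hand; the bucket name is a placeholder and the script only handles top-level "directories" (prefixes).

import boto3

BUCKET = "my-artifact-bucket"  # placeholder
s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")

for page in paginator.paginate(Bucket=BUCKET, Delimiter="/"):
    for common in page.get("CommonPrefixes", []):
        folder = common["Prefix"]
        keys = [
            obj["Key"]
            for sub in paginator.paginate(Bucket=BUCKET, Prefix=folder)
            for obj in sub.get("Contents", [])
            if not obj["Key"].endswith("index.html")
        ]
        links = "".join(f'<li><a href="/{key}">{key}</a></li>' for key in keys)
        html = f"<html><body><h1>{folder}</h1><ul>{links}</ul></body></html>"
        # Write the listing back into the same prefix so /folder/index.html serves HTML
        s3.put_object(
            Bucket=BUCKET,
            Key=folder + "index.html",
            Body=html.encode("utf-8"),
            ContentType="text/html",
        )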
Filestash is the perfect tool for that:
Log in to your bucket from https://www.filestash.app/s3-browser.html
Create a shared link
Share it with the world
Also Filestash is open source. (Disclaimer: I am the author)
I had the same problem and I fixed it by using the new context menu "Make Public".
Go to https://console.aws.amazon.com/s3/home, select the bucket, and then for each folder or file (or multiple selections) right click and choose "Make Public".
You can use a bucket policy to give anonymous users full read access to your objects. Depending on whether you need them to LIST or just perform a GET, you'll want to tweak this (i.e. permissions for listing the contents of a bucket use the action "s3:ListBucket").
http://docs.aws.amazon.com/AmazonS3/latest/dev/AccessPolicyLanguage_UseCases_s3_a.html
Your policy will look something like the following. You can use the S3 console at http://aws.amazon.com/console to upload it.
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "AddPerm",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::bucket/*"]
    }
  ]
}
If you're truly opening up your objects to the world, you'll want to look into setting up CloudWatch billing alarms so you can shut off permissions to your objects if they become too popular.
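If you would rather apply the policy from code than through the console, here is a minimal boto3 sketch; the bucket name is a placeholder, and note that on newer buckets you may also have to relax the "Block Public Access" settings before a public policy is accepted.

import json
import boto3

BUCKET = "my-public-bucket"  # placeholder
policy = {
    "Version": "2008-10-17",
    "Statement": [{
        "Sid": "AddPerm",
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": ["s3:GetObject"],
        "Resource": [f"arn:aws:s3:::{BUCKET}/*"],
    }],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))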
https://github.com/jupierce/aws-s3-web-browser-file-listing is a solution I developed for this use case. It leverages AWS CloudFront and Lambda@Edge functions to dynamically render and deliver file listings to a client's browser.
To use it, a simple CloudFormation template will create an S3 bucket and have your file server interface up and running in just a few minutes.
There are many viable alternatives, as already suggested by other posters, but I believe this approach has a unique range of benefits:
Completely serverless and built for web-scale.
Open source and free to use (though, of course, you must pay AWS for resource utilization -- such as S3 storage costs).
Simple / static client browser content:
No Ajax or third party libraries to worry about.
No browser compatibility worries.
All backing systems are native AWS components.
You never share account credentials or rely on 3rd party services.
The S3 bucket remains private - allowing you to only expose parts of the bucket.
A custom hostname / SSL certificate can be established for your file server interface.
Some or all of the hosted files can be protected behind a Basic Auth username/password.
An AWS WAF WebACL can be configured to prevent abusive access to the service.

Allowing users to download files as a batch from AWS s3 or Cloudfront

I have a website that allows users to search for music tracks and download the ones they select as MP3s.
I have the site on my server and all of the MP3s on S3, distributed via CloudFront. So far so good.
The client now wishes for users to be able to select a number of music tracks and then download them all in bulk or as a batch, instead of one at a time.
Usually I would place all the files in a zip and then present the user with a link to that new zip file to download. In this case, as the files are on S3, that would require me to first copy all the files from S3 to my web server, process them into a zip, and then serve the download from my server.
Is there any way I can create a zip on S3 or CloudFront, or is there some way to batch/group files into a zip?
Maybe I could set up an EC2 instance to handle this?
I would greatly appreciate some direction.
Best
Joe
I am afraid you won't be able to create the batches without additional processing. Firing up an EC2 instance might be an option to create a batch per user.
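To sketch what that per-user batch could look like in Python with boto3 (bucket and key names are placeholders, and the zip is staged in a local temp file, so this is only suitable for modest batch sizes): download the selected tracks, zip them, push the zip back to S3, and hand back a presigned link.

import tempfile
import zipfile
import boto3

s3 = boto3.client("s3")
SOURCE_BUCKET = "my-music-bucket"  # placeholder
BATCH_BUCKET = "my-batch-bucket"   # placeholder

def build_batch(track_keys, batch_key):
    with tempfile.NamedTemporaryFile(suffix=".zip") as tmp:
        with zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as zf:
            for key in track_keys:
                body = s3.get_object(Bucket=SOURCE_BUCKET, Key=key)["Body"].read()
                zf.writestr(key.split("/")[-1], body)
        tmp.flush()
        s3.upload_file(tmp.name, BATCH_BUCKET, batch_key)
    # Presigned URL the user can use to download the zip for the next hour
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BATCH_BUCKET, "Key": batch_key},
        ExpiresIn=3600,
    )

print(build_batch(["tracks/song1.mp3", "tracks/song2.mp3"], "batches/user-42.zip"))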
I am facing the exact same problem. So far the only thing I was able to find is the AWS CLI's s3 sync command:
https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
In my case, I am using Rails + its Paperclip addon which means that I have no way to easily download all of the user's images in one go, because the files are scattered in a lot of subdirectories.
However, if you can group your user's files in a better way, say like this:
/users/<ID>/images/...
/users/<ID>/songs/...
...etc., then you can solve your problem right away with:
aws s3 sync s3://<your_bucket_name>/users/<user_id>/songs /cache/<user_id>
Do keep in mind that you'll have to give your server the proper credentials so the AWS CLI can work without prompting for usernames/passwords.
And that should sort you.
Additional discussion here:
Downloading an entire S3 bucket?
S3 is based on single HTTP requests.
So the answer is to use threads to achieve the same thing.
The Java API offers TransferManager:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html
You can get great performance with multiple threads.
There is no bulk download, sorry.
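For completeness, here is a minimal Python sketch of the same "use threads" idea with boto3 and a thread pool (the Java TransferManager linked above does the equivalent for you); the bucket, prefix and destination paths are placeholders.

import os
from concurrent.futures import ThreadPoolExecutor
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"        # placeholder
PREFIX = "reports/"         # placeholder
DEST = "/tmp/s3-download"

def download(key):
    local_path = os.path.join(DEST, key)
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    s3.download_file(BUCKET, key, local_path)
    return key

keys = [
    obj["Key"]
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX)
    for obj in page.get("Contents", [])
    if not obj["Key"].endswith("/")  # skip folder marker objects
]

# One GET per object, issued from a pool of worker threads
with ThreadPoolExecutor(max_workers=10) as pool:
    for done in pool.map(download, keys):
        print("downloaded", done)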