I'm trying to upload some files to my bucket on S3 through boto3 on Python.
These files' names are website addresses (for example www.google.com/gmail).
I want the file name to be the website address, but in fact it creates a folder named "www.google.com" with the uploaded file named "gmail" inside it.
I tried to solve it with a double slash and with a backslash before the slash, but it didn't work.
Is there any way to ignore the slash and upload a file whose name is a website address?
Thanks.
You are misunderstanding S3 - it does not actually have a "folder" structure. Every object in a bucket has a unique key, and the object is accessed via that key.
Some S3 utilities (including, to be fair, the AWS console) fake up a "folder" structure, but this has little bearing on how S3 actually works.
Or in other words, don't worry about it. Just create the object with / in its key and everything will work as you expect.
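For example, a minimal sketch with boto3, where the bucket name is a placeholder:

    import boto3

    s3 = boto3.client("s3")

    # The key can contain slashes; S3 stores it as one flat key.
    # "my-example-bucket" is a placeholder bucket name.
    s3.put_object(
        Bucket="my-example-bucket",
        Key="www.google.com/gmail",
        Body=b"file contents",
    )

The console will display this as a "folder" named www.google.com, but the object is stored and retrieved under the single key www.google.com/gmail.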
S3 has a flat structure with no folders. The "folders" you are seeing are a feature in the AWS Console to make it easier to navigate through your objects. The console will group objects in a "folder" based on the prefix before the slash (if there is one).
There's nothing that prevents you from using slashes in S3 object keys. When you use the API via boto, you refer to the object by its full key (slashes included) and you will get the object back.
See: https://docs.aws.amazon.com/AmazonS3/latest/user-guide/using-folders.html
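To see how the console derives those "folders", here is a hedged sketch using list_objects_v2 with a delimiter; the bucket name is again a placeholder:

    import boto3

    s3 = boto3.client("s3")

    # With Delimiter="/", keys that share a prefix before the slash are
    # grouped into CommonPrefixes -- that grouping is all a "folder" is.
    resp = s3.list_objects_v2(Bucket="my-example-bucket", Delimiter="/")
    print([p["Prefix"] for p in resp.get("CommonPrefixes", [])])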
We have an Apache Camel app that is supposed to read files in a certain directory structure in S3, process the files (generating some metadata based on the folder the file is in), submit the data in the file (and metadata) to another system and finally put the consumed files into a different bucket, deleting the original from the incoming bucket.
The behaviour I'm seeing is that when I programmatically create the directory structure in S3, those "folders" are being consumed, so the directory structure disappears.
I know S3 technically does not have folders, just empty files ending in /.
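For context, this is roughly how such folder markers are created programmatically; a hedged boto3 sketch in which the bucket and prefix names are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # A "folder" is just a zero-byte object whose key ends with a slash.
    # "incoming-bucket" and "reports/2024/" are placeholder names.
    s3.put_object(Bucket="incoming-bucket", Key="reports/2024/", Body=b"")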
The twist here is that any "folders" created in the S3 Console are NOT consumed; they stay there as we want them to. Any folders created via the AWS CLI or boto3 are immediately consumed.
The problem is that we do need the folders to be created by automation; there are too many to create by hand.
I've reached out to AWS Support, and they just tell me that there are no differences between how the Console creates folders and how the CLI does it. Support confirmed that the command I used in the CLI is correct.
I think my issue is similar to Apache Camel deleting AWS S3 bucket's folder, but that question has no answer...
How can I get Camel to not "eat" any folders?
Can I create my own directory in s3 using confluent S3SinkConnector?
I know it creates a folder structure; unfortunately, we need a new directory structure.
Additionally, if you want to completely remove the first S3 'folder' ('topics' by default), you can set the topics.dir configuration to the backspace character: \b.
This way, {bucket}/\b/{partitioner_defined_path} becomes {bucket}/{partitioner_defined_path}.
You can change topics.dir, which is then followed by the path produced by the partitioner.class.
If you need "a new directory structure" (quoted because S3 has no directories), then you would need to look at implementing your own Partitioner class.
https://docs.confluent.io/current/connect/kafka-connect-s3/index.html#s3-object-names
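As a rough illustration of where those settings live, here is a hedged sketch that registers an S3 sink connector through the Connect REST API; the worker URL, connector name, topic and bucket are all placeholders:

    import requests

    # Placeholder worker URL, connector name, topic and bucket.
    connector = {
        "name": "example-s3-sink",
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "topics": "example-topic",
            "s3.bucket.name": "example-bucket",
            # Top-level prefix in the bucket (defaults to "topics").
            "topics.dir": "my-prefix",
            # A custom Partitioner class here controls the rest of the key layout.
            "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            "flush.size": "1000",
        },
    }

    requests.post("http://localhost:8083/connectors", json=connector)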
I'm trying to configure Nextcloud to use S3 as the sole path of all files and therefore not hold any files locally.
I guess this can be done but only within a subdirectory. Would it be possible to do it at the root path? It seems the External Storage configuration requires a folder to be entered and / does not seem to be valid.
Recently I used Amazon S3 to build an application, but I found a problem: an S3 bucket name could not contain a period (.) among its labels when I used a hosted-style request over SSL to download files through the browser. For example, the bucket name 'test.bucket' contains a period (.). The browser reports an invalid certificate when I download files using the URL https://test.bucket.s3.amazonaws.com/filename, and the same happens when posting files to the bucket.
After searching the documents, I found the last words in the following url:
BucketRestriction
Additionally, if you want to access a bucket by using a virtual hosted-style request, for example, http://mybucket.s3.amazonaws.com over SSL, the bucket name cannot include a period (.).
So, I really want to know exactly whether a bucket name can include a period (.), such as "a.b", "test.bucket" or "abcd.fdf.fdf".
You can use periods (now) in your S3 bucket names when using SSL. You just have to use the Path Style format. Full explanation here. Path style looks like this:
https://s3.amazonaws.com/your.bucket.name
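If you are using boto3, a hedged sketch of forcing path-style addressing looks like this (the bucket, key and local file names are placeholders):

    import boto3
    from botocore.client import Config

    # Force path-style URLs (https://s3.amazonaws.com/bucket/key) so that
    # the dots in the bucket name do not break the wildcard certificate.
    s3 = boto3.client("s3", config=Config(s3={"addressing_style": "path"}))

    s3.download_file("test.bucket", "filename", "filename")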
I have set up an S3 bucket to host static files.
When using the website endpoint (http://&lt;bucket-name&gt;.s3-website-us-east-1.amazonaws.com/): it forces me to set an index file, and when the file isn't found, it throws an error instead of listing directory contents.
When using the S3 endpoint (&lt;bucket-name&gt;.s3.amazonaws.com): I get an XML listing of the files, but I need an HTML listing where users can click a link to each file.
I have tried setting the permissions of all files and the bucket itself to "List" for "Everyone" in the AWS Console, but still no luck.
I have also tried some of the javascript alternatives, but they either don't work under the website url (that redirects to the index file) or just don't work at all. As a last resort, a collapsible javascript listing would be better than nothing, but I haven't found a good one.
Is this possible? If so, do I need to change permissions, ACL or something else?
I've created a simple bit of JS that creates an HTML directory index of the style you are looking for: https://github.com/rgrp/s3-bucket-listing
The README has specific instructions for handling Amazon S3 "website" buckets: https://github.com/rgrp/s3-bucket-listing#website-buckets
You can see a live example of the script in action on this s3 bucket (in website mode): http://data.openspending.org/
There is also this solution: https://github.com/caussourd/aws-s3-bucket-listing
It is similar to https://github.com/rgrp/s3-bucket-listing, but I couldn't make that one work with Internet Explorer. https://github.com/caussourd/aws-s3-bucket-listing works with IE and also adds the possibility to order the files by name, size and date. On the downside, it doesn't follow folders: only the files at one level are displayed.
This might solve your problem. Security settings for Everyone group:
(you need the bucketexplorer.com software for this)
If you are sharing files over HTTP, you may or may not want people to be able to list the contents of a bucket (folder). If you want the bucket contents to be listed when someone enters the bucket name (http://s3.amazonaws.com/bucket_name/), then edit the Access Control List and give the Everyone group the access level of Read (and do likewise with the contents of the bucket). If you don't want the bucket contents listable but do want to share the files within it, disable Read access for the Everyone group on the bucket itself, and then enable Read access for the individual files within the bucket.
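For reference, a hedged sketch of the equivalent ACL changes with boto3; the bucket and key names are placeholders, and accounts with Block Public Access enabled may reject public ACLs:

    import boto3

    s3 = boto3.client("s3")

    # Placeholder bucket and key names: grant everyone read access to the
    # bucket listing and to one object.
    s3.put_bucket_acl(Bucket="bucket_name", ACL="public-read")
    s3.put_object_acl(Bucket="bucket_name", Key="some-file.html", ACL="public-read")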
I created a much simpler solution. Just place the index.html file in the root of your folder and it will do the job. No configuration required. https://github.com/prabhatsharma/s3-directorylisting
I had a similar problem and created a JavaScript-and-iframe solution that works pretty well for listing directories in S3 website files. You just have to drop a couple of .html files into the directory you want to list. You can find it here:
https://github.com/adam-p/s3-file-list-page
I found s3browser, which allowed me to set up a directory on the main web site that allowed browsing of the s3 bucket. It worked very well and was very easy to set up.
Here is another approach, based on pure JavaScript and the AWS SDK for JavaScript API. It needs no PHP or other engine, just a plain web site (Apache or even IIS).
https://github.com/juvs/s3-bucket-browser
It is not intended to be deployed in your own bucket (to me, that makes no sense).
Using the new IAM Users from AWS, you can provide more specific and secure access to your buckets. There is no need to publish your bucket as a website and make everything public.
If you want to secure access, you can use conventional methods to authenticate users for your current web site.
Hope this helps too!
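To make the IAM suggestion concrete, a hedged sketch of attaching a read-only inline policy to a user with boto3; the user name, policy name and bucket are placeholders:

    import json

    import boto3

    iam = boto3.client("iam")

    # Placeholder user, policy and bucket names: read-only access to one
    # bucket instead of making the bucket public.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    "arn:aws:s3:::example-bucket",
                    "arn:aws:s3:::example-bucket/*",
                ],
            }
        ],
    }

    iam.put_user_policy(
        UserName="example-user",
        PolicyName="example-bucket-read-only",
        PolicyDocument=json.dumps(policy),
    )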