Is it possible to use AWS S3 as the sole storage for a Nextcloud/OpenCloud drive?

I'm trying to configure Nextcloud to use S3 as the sole storage location for all files, and therefore not hold any files locally.
I gather this can be done, but only within a subdirectory. Would it be possible to do it at the root path? The External Storage configuration seems to require a folder to be entered, and / does not appear to be valid.

Related

Apache Camel eats S3 "folders" created programmatically, but not ones created in the AWS S3 Console

We have an Apache Camel app that is supposed to read files in a certain directory structure in S3, process the files (generating some metadata based on the folder the file is in), submit the data in the file (and metadata) to another system and finally put the consumed files into a different bucket, deleting the original from the incoming bucket.
The behaviour I'm seeing is that when I programmatically create the directory structure in S3, those "folders" are consumed, so the directory structure disappears.
I know S3 technically does not have folders, just zero-byte objects whose keys end in /.
The twist is that "folders" created in the S3 Console are NOT consumed; they stay there as we want them to. Folders created via the AWS CLI or boto3 are immediately consumed.
The problem is that we do need the folders to be created by automation; there are too many to create by hand.
I've reached out to AWS Support, and they tell me that there is no difference between how the Console creates folders and how the CLI does it. Support confirmed that the CLI command I used is correct.
I think my issue is similar to Apache Camel deleting AWS S3 bucket's folder, but that question has no answer...
How can I get Camel to not "eat" any folders?
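(For context: creating a "folder" programmatically normally just means writing a zero-byte object whose key ends in /, which, per Support above, is also what the Console's Create folder button does. A minimal boto3 sketch, with placeholder bucket and key names:)

```python
import boto3

s3 = boto3.client("s3")

# A "folder" in S3 is just a zero-byte object whose key ends in "/".
# Bucket and key names here are placeholders.
s3.put_object(Bucket="incoming-bucket", Key="region/2024/")
```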

Migrate moodle data to Amazon s3

I have migrated the Moodle data directory to Amazon S3. Now I am trying to access all the files from S3 storage using the moodle-tool_objectfs plugin.
Attaching my settings screenshot. I am trying to serve all the media files from Amazon S3 instead of the server file system: for example, the site logo, course materials in PDF format, etc.
Thanks for the shout-out, Russell!
It sounds like you have manually migrated the content to S3 rather than relying on this plugin to do the work for you. I'd guess that your manual migration has put the files into a structure/path that the plugin isn't expecting, especially if you have copied your complete moodledata folder into S3 and not just the uploaded user files. (The tool_objectfs plugin does not replace the need for a normal moodledata directory; it just allows the majority of your files to be stored in S3.)
Usually you would have a Moodle site set up with a normal moodledata directory, and then you would install our tool_objectfs plugin, which would migrate files from moodledata to your S3 storage, relying on the plugin to perform the migration for you.

How to create directories in AWS S3 using Apache NiFi putS3Object

I have a working config to push files from a directory on my server to an S3 bucket. NiFi is running on a different server, so I use a GetSFTP processor. The source files live in subfolders that my current PutS3Object config does not preserve, so it jams all of the files at the root level of the S3 bucket. I know there's a way to get PutS3Object to create directories using defined folders. The Object Key is set to ${filename} by default. If it is set to, say, my/directory/${filename}, it creates two folders, my and the subfolder directory, and puts the files inside. However, I do NOT know what to set the Object Key to in order to replicate the files' source directories.
Try ${path}/${filename} based on this in the documentation:
Keeping with the example of a file that is picked up from a local file system, the FlowFile would have an attribute called filename that reflected the name of the file on the file system. Additionally, the FlowFile will have a path attribute that reflects the directory on the file system that this file lived in.

Mounted S3 on EC2 (Directories are not accessible from AWS UI)

Some quick questions:
Does S3 support soft links (symbolic links)?
With S3 mounted on EC2, I can't see a directory created from the Linux EC2 instance in the AWS console UI; however, created files are accessible.
Thanks
Amazon S3 is an object store, not a filesystem. It has a specific set of APIs for uploading, listing, downloading, etc but it does not behave like a normal filesystem.
There are some utilities that can mount S3 as a filesystem (eg Expandrive, Cloudberry Drive, s3fs), but in the background these utilities are actually translating requests into API calls. This can cause some issues -- for example, you can modify a 100MB file on a local disk by just writing one byte to disk. If you wish to modify one byte on S3, you must upload the whole object again. This can cause synchronization problems between your computer and S3, so such methods are not recommended for production situations. (However, they're a great way of uploading/downloading initial data.)
A good in-between option is to use the AWS Command-Line Interface (CLI), which has commands such as aws s3 cp and aws s3 sync and offers a reliable way to upload/download/sync files with Amazon S3.
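If you'd rather script those transfers than shell out to the CLI, boto3 offers rough single-file equivalents (there is no one-call equivalent of aws s3 sync; the bucket and file names below are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Rough equivalent of: aws s3 cp local.txt s3://my-bucket/remote.txt
s3.upload_file("local.txt", "my-bucket", "remote.txt")

# Rough equivalent of: aws s3 cp s3://my-bucket/remote.txt local-copy.txt
s3.download_file("my-bucket", "remote.txt", "local-copy.txt")
```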
To answer your questions...
Amazon S3 does not support a "soft link" (symbolic link). Amazon S3 is an object store, not a file system, so it only contains objects. Objects can also have metadata, which is often used for cache control, redirection, classification, etc.
Amazon S3 does not support directories (sort of). Amazon S3 objects are kept within buckets, and the buckets are 'flat' -- they do not contain directories/sub-folders. However, it does maintain the illusion of directories. For example, if file bar.jpg is stored in the foo directory, then the Key (filename) of the object is foo/bar.jpg. This makes the object 'appear' to be in the foo directory, but that's not how it is stored. The AWS Management Console maintains this illusion by allowing users to create and open Folders, but the actual data is stored 'flat'.
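You can see the illusion at work with boto3 (the bucket name is a placeholder): store an object under a path-like key, and listing with a delimiter reports the prefix that the console renders as a folder.

```python
import boto3

s3 = boto3.client("s3")

# Store an object under a path-like key; no directory is created anywhere.
s3.put_object(Bucket="my-bucket", Key="foo/bar.jpg", Body=b"...")

# Listing with a delimiter makes S3 report "foo/" as a CommonPrefix,
# which is what the console renders as a folder.
resp = s3.list_objects_v2(Bucket="my-bucket", Delimiter="/")
print([p["Prefix"] for p in resp.get("CommonPrefixes", [])])  # ['foo/']
```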
This leads to some interesting behaviours:
You do not need to create a directory to store an object in the directory.
Directories don't exist. Just store a file called images/cat.jpg and the images directory magically appears (even though it doesn't exist).
You cannot rename objects. The Key (filename) is a unique identifier for the object. To 'rename' an object, you must copy it to a new Key and delete the original (see the sketch after this list).
You cannot rename a directory. They don't exist. Instead, rename all the objects within the directory (which really means you have to copy the objects, then delete their old versions).
You might create a directory but not see it. Amazon S3 keeps track of CommonPrefixes to assist in listing objects by path, but it doesn't create traditional directories. So, don't get worried if you create a (pretend) directory and then don't see it. Just store your object with a full-path name and the directory will 'appear'.
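A minimal boto3 sketch of such a 'rename' (all names are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# "Rename" foo/bar.jpg to foo/baz.jpg: S3 has no rename API, so
# copy the object to the new key, then delete the old one.
s3.copy_object(
    Bucket="my-bucket",
    CopySource={"Bucket": "my-bucket", "Key": "foo/bar.jpg"},
    Key="foo/baz.jpg",
)
s3.delete_object(Bucket="my-bucket", Key="foo/bar.jpg")
```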
The above-mentioned utilities take all this into account when allowing an Amazon S3 bucket to be mounted. They translate 'normal' filesystem commands into Amazon S3 API calls, but they can't do everything (eg they might emulate renaming a file but they typically won't let you rename a directory).

Access files in s3n://elasticmapreduce/samples/wordcount/input

How can I access the files sitting in the following S3 folder, which is owned by someone else?
s3n://elasticmapreduce/samples/wordcount/input
The files in s3n://elasticmapreduce/samples/wordcount/input are public, and made available as input by Amazon to the sample word count Hadoop program. The best way to fetch them is to
Start a new Amazon Elastic MapReduce Job Flow (it doesn't matter which one) from the Amazon Web Services console, and make sure that you keep the job alive with the Keep Alive option
Once the EC2 machines have started, find the instances on EC2 from the Amazon Web Services console
ssh into one of the running EC2 instances, using the hadoop user, for example
ssh -i keypair.pem hadoop@ec2-IPADDRESS.compute-1.amazonaws.com
Obtain the files you need, using hadoop dfs -copyToLocal s3://elasticmapreduce/samples/wordcount/input/0002 .
sftp the files to your local system
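Since the objects are public, you can also skip EMR entirely and fetch them anonymously; a boto3 sketch using unsigned requests (the region is an assumption):

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# The bucket is public, so make anonymous (unsigned) requests.
# us-east-1 is an assumed region for this bucket.
s3 = boto3.client("s3", region_name="us-east-1",
                  config=Config(signature_version=UNSIGNED))
s3.download_file("elasticmapreduce", "samples/wordcount/input/0002", "0002")
```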
You can access wordSplitter.py here:
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/wordSplitter.py
You can access the input files here:
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0012
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0011
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0010
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0009
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0008
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0007
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0006
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0005
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0004
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0003
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0002
https://elasticmapreduce.s3.amazonaws.com/samples/wordcount/input/0001
The owner of the folder (more precisely, of the files in the folder) must have made it accessible to anonymous readers.
If that is the case, s3n://x/y... is translated to
http://s3.amazonaws.com/x/y...
or
http://x.s3.amazonaws.com/y...
where x is the name of the bucket and y... is the path within the bucket.
If you want to make sure the file exists, e.g. if you suspect the name was misspelled, you can open the following in your browser:
http://s3.amazonaws.com/x
and you'll see XML describing the "files" (that is, S3 objects) that are available.
Try this:
http://s3.amazonaws.com/elasticmapreduce
I tried this, and it seems that the path you want is not public.
The AWS EMR documentation quotes s3://elasticmapreduce/samples/wordcount/input in one of the "getting started" examples. But s3 is different from s3n, so the input might be available to EMR, but not to HTTP access.
In Amazon S3, there is no concept of folders; a bucket is just a flat collection of objects. But you can list all the files you are interested in from a browser with the following URL:
s3.amazonaws.com/elasticmapreduce?prefix=samples/wordcount/input/
Then you can download them by specifying the whole name, e.g.
s3.amazonaws.com/elasticmapreduce/samples/wordcount/input/0001
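The same listing can be scripted; a boto3 sketch with anonymous (unsigned) access that mirrors the ?prefix= URL above (the region is an assumption):

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous client; us-east-1 is an assumed region for this public bucket.
s3 = boto3.client("s3", region_name="us-east-1",
                  config=Config(signature_version=UNSIGNED))

# Equivalent of: s3.amazonaws.com/elasticmapreduce?prefix=samples/wordcount/input/
resp = s3.list_objects_v2(Bucket="elasticmapreduce",
                          Prefix="samples/wordcount/input/")
for obj in resp.get("Contents", []):
    print(obj["Key"])
```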