riofs keeps failing with 'Transport endpoint is not connected' message

I've been using riofs to mount an AWS S3 bucket as a filesystem on an EC2 instance.
Yes yes, I know... you shouldn't use FUSE for S3 mounting as it's a leaky abstraction.
With that said, I didn't see anyone else post a question about this issue (specifically with riofs), so I'm posting it here.
Why is it happening, and what can I do to fix it?
It looks like FUSE doesn't have a stable version past v2.9.7, so that is what I'm using.
s3fs has an open issue which I'm guessing describes the same problem (https://github.com/s3fs-fuse/s3fs-fuse/issues/152).
I have tried adding -o multireq_max=5 as a riofs command-line option, without success.
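As a stopgap, I'm considering a cron watchdog roughly along these lines (a sketch only, not a fix for the root cause; the mountpoint and the actual riofs mount command are placeholders for my setup). When a FUSE mount dies this way, reading the mountpoint raises ENOTCONN ("Transport endpoint is not connected"), which a small script can detect before lazily unmounting and remounting:
import errno
import os
import subprocess

MOUNTPOINT = "/mnt/s3"          # placeholder
REMOUNT_CMD = ["riofs", "..."]  # placeholder: your riofs mount command

def mount_is_broken(path: str) -> bool:
    try:
        os.listdir(path)
        return False
    except OSError as exc:
        # A dead FUSE mount typically surfaces here as ENOTCONN (errno 107).
        return exc.errno == errno.ENOTCONN

if mount_is_broken(MOUNTPOINT):
    subprocess.run(["fusermount", "-uz", MOUNTPOINT], check=False)  # lazy unmount
    subprocess.run(REMOUNT_CMD, check=True)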

Related

How to disable encryption of Grafana Loki logs fed to S3

This could be a very dumb question to start off with, so I apologise in advance, but skimming through the documentation I didn't find a way to control (in config) the encryption of logs being fed to S3 buckets. I have a setup where Grafana Loki logs are fed to S3 (collected by Fluent Bit from pods, since all of this is deployed in EKS). I have absolutely no problem viewing logs via the Grafana UI, and logs are properly stored in S3 as well, but when I download files from within the bucket they are encrypted.
Is there a config flag I missed, is there more to do to get rid of the encrypted logs, or is there really nothing that can be done in this situation?
I hope I have shared enough information to present the situation/question, but in case it's not clear please feel free to ask. Thanks in advance for any help!
I tried playing around with some config items like sse_encryption: false, but it didn't seem to have any effect. I also tried toggling the insecure flag, but I believe that has more to do with TLS.
The downloaded file from S3 looks like the attached screenshot.

Mount S3 bucket as an NFS share on an EC2 instance

Long-time reader; I've usually been able to find the answers I've been looking for in existing posts, but this time I haven't.
I am essentially teaching myself AWS CDK from scratch. I've only really just started with it, so not finding anything which helps me on my mission may be a result of not knowing enough yet to be asking the right questions... so please bear with me.
Thus far I've used the AWS CDK with Python to create a stack which creates an S3 bucket and also fires up an EC2 instance with an AWS file storage gateway AMI loaded on it (so running Amazon Linux). This deploys and runs fine; however, now I'd like to programmatically set up the S3 bucket to be accessed via an NFS share on the EC2 instance. From what I've seen I'd assumed it is, or should be, fairly trivial, but I keep getting a bit lost in documentation and internet hunts, and I'm not quite sure I'm looking in the right places or asking search engines the right questions to unlock the path to achieve this.
It looks like I should be able to script something up to make it happen when the instance starts, using user-data, but I'm a bit lost. Is anyone able to throw me some crumbs to follow to find a good way of achieving this, or a better way of achieving what I want (which is basically accessing the S3 bucket contents as though they are files on an EC2 instance) - or, if it's trivial enough, tell me how to do it?
Much appreciated :)
Dan
You are on the right track: user_data can be used for that.
I don't have full code to give you as it's use-case specific (e.g. which OS are you using?), but the user_data would have to download and install s3fs:
s3fs allows Linux and macOS to mount an S3 bucket via FUSE. s3fs preserves the native object format for files, allowing use of other tools like the AWS CLI.
However, S3 is an object storage system, and it can't really be mounted on an instance the way you would mount NFS or EBS storage. But with s3fs-fuse you can mimic such behaviour, and for some use cases it will be sufficient.
So what you can do is set up the user_data script through the console, verify that it works, and then basically just copy and paste it into CDK. It's more of a trial-and-see approach, but this is the best way to learn.
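As a hedged sketch of what that user_data could look like in CDK (Python, CDK v2), assuming Amazon Linux 2, an instance role that grants access to the bucket, and placeholder bucket and mount-point names (the exact install commands depend on your OS):
from aws_cdk import aws_ec2 as ec2

def s3fs_user_data(bucket_name: str, mount_point: str = "/mnt/s3") -> ec2.UserData:
    user_data = ec2.UserData.for_linux()
    user_data.add_commands(
        "amazon-linux-extras install -y epel",  # s3fs-fuse is packaged in EPEL
        "yum install -y s3fs-fuse",
        f"mkdir -p {mount_point}",
        # iam_role=auto lets s3fs pick up credentials from the instance profile.
        f"s3fs {bucket_name} {mount_point} -o iam_role=auto -o allow_other",
    )
    return user_data
You would pass the result as the user_data property when constructing the ec2.Instance, and iterate on the commands in the console first, as suggested above.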

Spark unable to write to S3 Encrypted Bucket even after specifying the hadoopConfigs

When I try to write to an S3 bucket which is AES-256 encrypted from my Spark Streaming app running on EMR, it throws a 403. For whatever reason the Spark session is not honoring the "fs.s3a.server-side-encryption-algorithm" config option.
Here is the code I am using:
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.access.key",accessKeyId);
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.secret.key", secretKeyId);
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.server-side-encryption-algorithm","AES256");
When I use regular Java code with the AWS SDK, I can upload the files without any issues.
Somehow the Spark session is not honoring this.
Thanks
Sateesh
Able to resolve it. Silly mistake on my part.
We need to have the following property as well:
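// Likely why this matters (an assumption based on EMR's filesystem layout): fs.s3.* settings
// configure EMRFS, the s3:// filesystem used on EMR, whereas the fs.s3a.* settings above
// only affect the Hadoop S3A connector.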
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3.enableServerSideEncryption","true");

Connect to AWS S3 without API

I've looked everywhere on the Interweb but couldn't find a satisfying answer...
Does anybody know what "protocol" AWS S3 speaks?
Our idea is to write a function for a PLC (no chance to use the provided API) to communicate directly with AWS S3.
For example, PLC to "AWS IoT" works over MQTT/HTTP - how can I skip "AWS IoT"?
I know there is the possibility to put an IoT device in between - but we are evaluating our options right now.
Thank you in advance
All of the AWS services have a documented REST API - the S3 one is here. In addition, all of their libraries are open source so you could likely get some ideas from them too.
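To make "REST API" concrete, here is a minimal sketch (Python standard library only; bucket, key, and region are placeholders) of what a raw request to S3 looks like on the wire. This unsigned form only works for objects that allow anonymous reads; anything private additionally needs the documented AWS Signature Version 4 headers, which you would have to implement on the PLC side.
import http.client

BUCKET = "my-example-bucket"  # placeholder
REGION = "us-east-1"          # placeholder
KEY = "path/to/object.txt"    # placeholder

# S3 GET Object is just an HTTPS GET against the bucket's virtual-hosted endpoint.
conn = http.client.HTTPSConnection(f"{BUCKET}.s3.{REGION}.amazonaws.com")
conn.request("GET", f"/{KEY}")
resp = conn.getresponse()
print(resp.status, resp.reason)
body = resp.read()
conn.close()
Everything else (PUT, listing, multipart uploads) follows the same request/response pattern in the API reference.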

Start an Amazon EC2 instance remotely

Can someone elaborate on the details of how to remotely start an EC2 instance?
I have a Linux box set up locally, and would like to set up a cronjob on it to start an instance in Amazon EC2. How do I do that?
I've never worked with APIs; if there are ways to use them, can someone please explain how to do so...
Pretty simple.
Download the EC2 API tools. There is a CLI with them.
Set EC2_PRIVATE_KEY and EC2_CERT as environment variables, pointing to the private key and certificate files that you generate from the EC2 console.
Then call ec2-reboot-instances instance_id [instance_id ...]
Done.
Refer: http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/ApiReference-cmd-RebootInstances.html
Edit 1
Do I download this directly onto my Linux box? And how do I access the EC2 API's CLI on the Linux box? Sorry to ask so many questions, I just need to know the detailed steps for how to do this.
Yes. Download it from here
If you have unzipped the API in /home/naishe/ec2api, you can call /home/naishe/ec2api/bin/ec2-reboot-instances <instance_id>. Or even better, set the unzipped location as your environment variable EC2_API_HOME and append $EC2_API_HOME/bin to your system's PATH.
Also, try investing some time in the Getting Started doc, which is amazingly simple.
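Since the original goal was a cron job, here is a hedged sketch of a small Python wrapper that cron could invoke, assuming the tools are unzipped under ~/ec2api and using placeholder key, certificate, and instance-ID values. It calls ec2-start-instances (rather than the reboot command above) because the question is about starting an instance; per the tools' setup docs they also expect EC2_HOME and a working Java install.
import os
import subprocess

EC2_API_HOME = os.path.expanduser("~/ec2api")        # assumed unzip location
env = dict(os.environ)
env["EC2_HOME"] = EC2_API_HOME
env["EC2_PRIVATE_KEY"] = "/home/me/pk-XXXXXXXX.pem"  # placeholder
env["EC2_CERT"] = "/home/me/cert-XXXXXXXX.pem"       # placeholder

# ec2-start-instances starts a stopped (EBS-backed) instance; swap in
# ec2-reboot-instances if a reboot is what you actually want.
subprocess.run(
    [os.path.join(EC2_API_HOME, "bin", "ec2-start-instances"), "i-xxxxxxxx"],
    env=env,
    check=True,
)
Cron would then just run this script on whatever schedule you need.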