Apache Flink S3 file system credentials do not work - amazon-s3

I am trying to read a CSV file from Amazon S3, and I need to set the credentials at runtime.
But I can't get past the credentials check.
Is there an alternative or any suggestion?
import org.apache.flink.api.java.ExecutionEnvironment
import org.apache.flink.configuration.Configuration

object AwsS3CSVTest {
  def main(args: Array[String]): Unit = {
    // Trying to pass the S3A credentials programmatically via the Flink configuration
    val conf = new Configuration()
    conf.setString("fs.s3a.access.key", "***")
    conf.setString("fs.s3a.secret.key", "***")
    val env = ExecutionEnvironment.createLocalEnvironment(conf)
    val datafile = env.readCsvFile("s3a://anybucket/anyfile.csv")
      .ignoreFirstLine()
      .fieldDelimiter(";")
      .types(classOf[String], classOf[String], classOf[String], classOf[String], classOf[String], classOf[String])
    datafile.print()
  }
}
00:49:55.558|DEBUG| o.a.h.f.s.AWSCredentialProviderList No credentials from TemporaryAWSCredentialsProvider: org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: Session credentials in Hadoop configuration: No AWS Credentials
00:49:55.558|DEBUG| o.a.h.f.s.AWSCredentialProviderList No credentials from SimpleAWSCredentialsProvider: org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: SimpleAWSCredentialsProvider: No AWS credentials in the Hadoop configuration
00:49:55.558|DEBUG| o.a.h.f.s.AWSCredentialProviderList No credentials provided by EnvironmentVariableCredentialsProvider: com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))

As explained at https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/s3/#configure-access-credentials, you should use IAM roles or access keys, which you configure in flink-conf.yaml. You can't set the credentials in code, because the S3 file systems are loaded through Flink's plugin mechanism.
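A minimal sketch of the corresponding flink-conf.yaml entries, following the linked documentation (the values are placeholders):

s3.access-key: your-access-key
s3.secret-key: your-secret-key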

Related

Terraform tfstate not being created in S3

I am trying to set up a remote backend for my Terraform workflow. My backend block is as follows:
terraform {
  backend "s3" {
    bucket = "terraform-aws-007"
    key    = "global/bananadev/s3/terraform.tfstate"
    region = "eu-west-2"
  }
}
Terraform initialization is successful; however, the state file is being created locally rather than in my S3 bucket.
Any ideas what may be wrong?

How to download a file from a private S3 bucket without the AWS CLI

Is it possible to download a file from AWS S3 without the AWS CLI? On my production server I need to download a config file which is stored in an S3 bucket.
I was thinking of having AWS Systems Manager run a script that would download the config (YAML files) from S3, but we do not want to install the AWS CLI on the production machines. How can I go about this?
You would need some sort of program to call the Amazon S3 API to retrieve the object. For example, a PowerShell script (using AWS Tools for Windows PowerShell) or a Python script that uses the AWS SDK.
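The same idea as a minimal sketch in Java with the AWS SDK for Java v1 (the bucket name, object key, and local path below are placeholders, not values from your setup):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.S3Object;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class DownloadConfig {
    public static void main(String[] args) throws Exception {
        // Credentials are resolved from the default provider chain
        // (environment variables, instance profile, credentials file, etc.)
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Placeholder bucket/key/destination - replace with your own values
        try (S3Object object = s3.getObject("my-config-bucket", "config/application.yml")) {
            Files.copy(object.getObjectContent(),
                       Paths.get("/etc/myapp/application.yml"),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }
}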
You could alternatively generate an Amazon S3 pre-signed URL, which would allow a private object to be downloaded from Amazon S3 via a normal HTTPS call (eg curl). This can be done easily using the AWS SDK for Python, or you could code it yourself without using libraries (it's a bit more complex).
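A rough sketch of generating such a pre-signed URL with the AWS SDK for Java v1 (bucket, key, and expiry below are placeholder values):

import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;

import java.net.URL;
import java.util.Date;

public class PresignConfigUrl {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // URL valid for 15 minutes - placeholder values
        Date expiry = new Date(System.currentTimeMillis() + 15 * 60 * 1000);
        GeneratePresignedUrlRequest request =
                new GeneratePresignedUrlRequest("my-config-bucket", "config/application.yml")
                        .withMethod(HttpMethod.GET)
                        .withExpiration(expiry);

        URL url = s3.generatePresignedUrl(request);
        // The printed URL can then be fetched with any HTTPS client, e.g. curl
        System.out.println(url);
    }
}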
In all examples above, you would need to provide the script/program with a set of IAM Credentials for authenticating with AWS.
Just adding notes for any C# lovers who want to solve this problem with .NET.
First, write the C# code to download the private file as a string:
public string DownloadPrivateFileS3(string fileKey)
{
    string accessKey = "YOURVALUE";
    string accessSecret = "YOURVALUE";
    string bucket = "YOURVALUE";
    // The third argument is the bucket's region - replace with your own
    using (var s3Client = new AmazonS3Client(accessKey, accessSecret, RegionEndpoint.EUWest1))
    {
        var folderPath = "AppData/Websites/Cases";
        var fileTransferUtility = new TransferUtility(s3Client);
        Stream stream = fileTransferUtility.OpenStream(bucket, folderPath + "/" + fileKey);
        using (var memoryStream = new MemoryStream())
        {
            stream.CopyTo(memoryStream);
            var response = memoryStream.ToArray();
            return Convert.ToBase64String(response);
        }
    }
}
Second, write the jQuery code to download the Base64 string as a file:
function downloadPrivateFile(fileName) {
    $.ajax({
        url: 'DownloadPrivateFileS3?fileKey=' + fileName,
        success: function (result) {
            // Build a temporary link that downloads the Base64 payload as a file
            var link = document.createElement('a');
            link.download = fileName;
            link.href = "data:application/octet-stream;base64," + result;
            document.body.appendChild(link);
            link.click();
            document.body.removeChild(link);
        }
    });
}
Call the downloadPrivateFile method with the file name from anywhere in your HTML/C#/jQuery code.
Enjoy! Happy coding.

User doesn't have access to the secret even after enabling access through a policy

I have created a policy so that a developer can access the secret, but when I try to access the secret as a developer it says I don't have permission to access it.
I tried changing the value and adding a new secret, but nothing helped.
path "kv/*" {
capabilities = ["list"]
}
path "kv/dev01"{
capabilities = ["read", "list"]
}
path "kv/dev_01_cred" {
capabilities = ["read", "list"]
}
The developer should be able to read the secrets in dev01 and dev_01_cred.
Currently, the developer can only see the list of secrets in kv.

S3A client and local S3 mock

To create end-to-end local tests of a data workflow, I use a "mock S3" container (e.g. adobe/S3Mock). It seems to work just fine. However, some parts of the system rely on the S3A client, and as far as I can see, its URL format does not allow pointing to a particular nameserver or endpoint.
Is it possible to make S3A work in a local environment?
Are you talking about the ASF Hadoop S3A connector? Nobody has tested it against an S3 mock AFAIK (never seen it before!), but it does work with non-AWS endpoints.
Set fs.s3a.endpoint to the URL of your S3 connection. There are also some settings for switching from https to http (fs.s3a.connection.ssl.enabled = false) and from virtual-host to path-style access (fs.s3a.path.style.access = true) that will be needed.
further reading
Like I said: nobody has done this. We developers just go against the main AWS endpoints with all their problems (latency, inconsistency, error reporting, etc.), precisely because that's what you get in production. But for your local testing, it will simplify your life (and you can run it under Jenkins without having to give it any secrets).
The answer by @stevel worked for me. Here is the code if someone wants to refer to it:
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.AnonymousAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import io.findify.s3mock.S3Mock;
import org.apache.hadoop.conf.Configuration;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;

class S3WriterTest {
    private static S3Mock api;
    private static AmazonS3 mockS3client;

    @BeforeAll
    public static void setUp() {
        // Start mock S3 service using findify
        api = new S3Mock.Builder().withPort(8001).withInMemoryBackend().build();
        api.start();
        /* AWS S3 client setup.
         * The withPathStyleAccessEnabled(true) trick is required to overcome S3's default
         * DNS-based bucket access scheme, which results in attempts to connect to addresses
         * like "bucketname.localhost" and would require a specific DNS setup.
         */
        EndpointConfiguration endpoint = new EndpointConfiguration("http://localhost:8001", "us-west-2");
        mockS3client = AmazonS3ClientBuilder
                .standard()
                .withEndpointConfiguration(endpoint)
                .withPathStyleAccessEnabled(true)
                .withCredentials(new AWSStaticCredentialsProvider(new AnonymousAWSCredentials()))
                .build();
        mockS3client.createBucket("test-bucket");
    }

    @AfterAll
    public static void tearDown() {
        api.shutdown();
    }

    @Test
    void unitTestForHadoopCodeWritingUsingS3A() {
        Configuration hadoopConfig = getTestConfiguration();
        ........
    }

    private static Configuration getTestConfiguration() {
        Configuration config = new Configuration();
        config.set("fs.s3a.endpoint", "http://127.0.0.1:8001");
        config.set("fs.s3a.connection.ssl.enabled", "false");
        config.set("fs.s3a.path.style.access", "true");
        config.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider");
        config.set("fs.s3a.access.key", "foo");
        config.set("fs.s3a.secret.key", "bar");
        return config;
    }
}

How to load external config from an S3 bucket in Grails 2.5.1?

In Grails 3.1.4, there is a way to load an external YAML file as an input stream:
S3Service s3Service = new RestS3Service(awsCredentials)
S3Object downloadedObject = s3Service.getObject("bucketName", "application.yml")
Resource resourceConfig = new InputStreamResource(downloadedObject.getDataInputStream())
YamlPropertiesFactoryBean ypfb = new YamlPropertiesFactoryBean()
ypfb.setResources(resourceConfig)
ypfb.afterPropertiesSet()
I want to do the same in Grails 2.5.1 but could not figure out how. I don't want to download the file locally and set the path in grails.config.location.