Elasticsearch Writes Into S3 Bucket for Metadata But Doesn't Write Into S3 Bucket For Scheduled Snapshots? - amazon-s3

Following the tutorial at http://www.elasticsearch.org/tutorials/2011/08/22/elasticsearch-on-ec2.html, I configured the API access key ID with its secret access key, and Elasticsearch is able to write to the S3 bucket, as these log lines show:
[2012-08-02 04:21:38,793][DEBUG][gateway.s3] [Schultz, Herman] writing to gateway org.elasticsearch.gateway.shared.SharedStorageGateway$2@4e64f6fe ...
[2012-08-02 04:21:39,337][DEBUG][gateway.s3] [Schultz, Herman] wrote to gateway org.elasticsearch.gateway.shared.SharedStorageGateway$2@4e64f6fe, took 543ms
However, when it comes to writing snapshots to the S3 bucket, the following errors appear:
[2012-08-02 04:25:37,303][WARN ][index.gateway.s3] [Schultz, Herman] [plumbline_2012.08.02][3] failed to read commit point [commit-i] java.io.IOException: Failed to get [commit-i]
Caused by: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: E084E2ED1E68E710, AWS Error Code: InvalidAccessKeyId, AWS Error Message: The AWS Access Key Id you provided does not exist in our records.
[2012-08-02 04:36:06,696][WARN ][index.gateway] [Schultz, Herman] [plumbline_2012.08.02][0] failed to snapshot (scheduled) org.elasticsearch.index.gateway.IndexShardGatewaySnapshotFailedException: [plumbline_2012.08.02][0] Failed to perform snapshot (index files)
Why is this happening, given that the access keys I provided can write metadata but cannot create snapshots?

Related

Connection error from AWS Fargate to GCP BigQuery when using Workload Identity

I used Workload Identity from AWS EC2 to GCP BigQuery with the role assigned to the EC2 instance, and it worked fine.
However, when I use Workload Identity from AWS Fargate to GCP BigQuery with the Fargate task role, it does not work.
How should I set up Workload Identity in this case?
I used the libraries below:
implementation(platform("com.google.cloud:libraries-bom:20.9.0"))
implementation("com.google.cloud:google-cloud-bigquery")
The stack trace contains the following messages:
com.google.cloud.bigquery.BigQueryException: Failed to retrieve AWS IAM role.
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115) ~[google-cloud-bigquery-1.137.1.jar!/:1.137.1]
…
at java.base/java.lang.Thread.run(Unknown Source) ~[na:na]
Caused by: java.io.IOException: Failed to retrieve AWS IAM role.
at com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:217) ~[google-auth-library-oauth2-http-0.26.0.jar!/:na]
…
at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.getDataset(HttpBigQueryRpc.java:126) ~[google-cloud-bigquery-1.137.1.jar!/:1.137.1]
... 113 common frames omitted
Caused by: java.net.ConnectException: Invalid argument (connect failed)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:na]
at com.google.auth.oauth2.AwsCredentials.retrieveResource(AwsCredentials.java:214) ~[google-auth-library-oauth2-http-0.26.0.jar!/:na]
... 132 common frames omitted
I faced a similar issue with Google Cloud Storage (GCS).
As Peter mentioned, retrieving the credentials in an AWS Fargate task is not the same as when the code runs on an EC2 instance, so the Google SDK fails to compose the correct AWS credentials to exchange with Google Workload Identity Federation.
I came up with a workaround that saved the trouble of editing core files in "../google/auth/aws.py" by doing 2 things:
1. Get the session credentials with boto3:
import boto3
task_credentials = boto3.Session().get_credentials().get_frozen_credentials()
2. Set the relevant environment variables:
import os
from google.auth import environment_vars
os.environ[environment_vars.AWS_ACCESS_KEY_ID] = task_credentials.access_key
os.environ[environment_vars.AWS_SECRET_ACCESS_KEY] = task_credentials.secret_key
os.environ[environment_vars.AWS_SESSION_TOKEN] = task_credentials.token
Explanation:
I am using Python 3.9 with boto3 and google-cloud==2.4.0; however, it should work with other versions of the Google SDK as long as the following code is present in the "_get_security_credentials" method of the "Credentials" class in the "google.auth.aws" module:
# Check environment variables for permanent credentials first.
# https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html
env_aws_access_key_id = os.environ.get(environment_vars.AWS_ACCESS_KEY_ID)
env_aws_secret_access_key = os.environ.get(
    environment_vars.AWS_SECRET_ACCESS_KEY
)
# This is normally not available for permanent credentials.
env_aws_session_token = os.environ.get(environment_vars.AWS_SESSION_TOKEN)
if env_aws_access_key_id and env_aws_secret_access_key:
    return {
        "access_key_id": env_aws_access_key_id,
        "secret_access_key": env_aws_secret_access_key,
        "security_token": env_aws_session_token,
    }
Caveat:
Code inside an ECS task runs with temporary credentials (ECS assumes the task's role), so you can't generate temporary credentials via AWS STS, as is usually recommended.
Why is this a problem? Because the task's temporary credentials expire and are refreshed, the values copied into the environment variables can go stale. To handle that, you can set up a background function that repeats the operation every 5 minutes or so (I haven't actually hit a case where the temporary credentials expired).
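The background refresh mentioned in the caveat could be sketched roughly as follows. This is only a minimal sketch and not part of the original workaround: the names fetch_task_credentials, export_credentials, and start_refresher are hypothetical, the 300-second default simply mirrors the "every 5 minutes or so" suggestion, and the environment variable names are written as string literals on the assumption that they match the constants in google.auth.environment_vars.

```python
import os
import threading


def fetch_task_credentials():
    """Default provider: read the current task-role credentials via boto3."""
    import boto3  # imported lazily so the refresher can be exercised without AWS

    creds = boto3.Session().get_credentials().get_frozen_credentials()
    return creds.access_key, creds.secret_key, creds.token


def export_credentials(provider=fetch_task_credentials):
    """Copy the provider's credentials into the variables google-auth reads."""
    access_key, secret_key, token = provider()
    os.environ["AWS_ACCESS_KEY_ID"] = access_key
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_key
    os.environ["AWS_SESSION_TOKEN"] = token


def start_refresher(interval_seconds=300, provider=fetch_task_credentials):
    """Export once immediately, then re-export on a daemon thread.

    Returns a threading.Event; call .set() on it to stop the refresher.
    """
    export_credentials(provider)
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout, True once stop is set.
        while not stop.wait(interval_seconds):
            export_credentials(provider)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Calling start_refresher() once at startup, before the first Google client is created, should be enough; the provider argument exists only so the logic can be tried with fake values outside AWS.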
I had the same issue with Python code, but I think the cause is the same.
You're getting this error because retrieving the AWS IAM role on AWS Fargate works differently than on AWS EC2. On EC2 you can get the credentials from the instance metadata, as shown here:
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/s3access
On AWS Fargate, you query the container credentials endpoint instead:
curl 169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
To get around that, do the following:
1. Change the content of the GCP Workload Identity Federation credential file [wif_cred_file] as follows:
import os
wif_cred_file["credential_source"]["url"] = f"http://169.254.170.2{os.environ['AWS_CONTAINER_CREDENTIALS_RELATIVE_URI']}"
2. In the "python3.8/site-packages/google/auth/aws.py" file of the library (try to find the equivalent file in the Java library), update the code as follows:
Comment out this line:
# role_name = self._get_metadata_role_name(request)
Remove role_name from the _get_metadata_security_credentials function arguments.
Alternatively, you can make the change from step 1 in the aws.py file instead; either way should be fine.
That should be it.
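To see what the Fargate endpoint returns from inside the task, a small Python sketch along these lines may help. The helper names are hypothetical, and it assumes only that ECS sets AWS_CONTAINER_CREDENTIALS_RELATIVE_URI in the container, as the curl command above does.

```python
import json
import os
import urllib.request

ECS_CREDENTIALS_HOST = "http://169.254.170.2"


def container_credentials_url(environ=os.environ):
    """Build the task credentials URL from the variable ECS injects."""
    relative_uri = environ.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    if not relative_uri:
        raise RuntimeError(
            "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set; "
            "this does not look like an ECS/Fargate task"
        )
    return ECS_CREDENTIALS_HOST + relative_uri


def fetch_container_credentials(timeout=2.0):
    """Fetch the task-role credentials as a dict.

    The response is expected to contain AccessKeyId, SecretAccessKey,
    Token and Expiration.
    """
    with urllib.request.urlopen(container_credentials_url(), timeout=timeout) as response:
        return json.load(response)
```

Only fetch_container_credentials() touches the network, so the URL construction can be checked anywhere, while the fetch itself works only inside a running task.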

Databricks to S3 - The backend could not get session tokens for path

I'm trying to move data from Databricks DBFS to my S3 bucket, but I'm stuck on this error: The backend could not get session tokens for path /mnt/s3/mybucket-upload/product.csv.gz. Did you remove the AWS key for the mount point?
Moving dbfs:/tmp/databricks2s3/product/part-00000-tid-7154689887306924257-8bd689b8-fc4d-46a1-b207-8a6b51aade55-411806-1-c000.csv.gz to /mnt/s3/mybucket-upload/product.csv.gz
An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.mv.
: com.databricks.backend.daemon.data.common.InvalidMountException:
The backend could not get session tokens for path /mnt/s3/mybucket-upload/product.csv.gz. Did you remove the AWS key for the mount point?
at com.databricks.backend.daemon.data.common.InvalidMountException$.apply(DataMessages.scala:520)
at com.databricks.backend.daemon.data.filesystem.MountEntryResolver.resolve(MountEntryResolver.scala:61)
at com.databricks.backend.daemon.data.client.DBFSV2.resolve(DatabricksFileSystemV2.scala:81)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2$$anonfun$getFileStatus$1$$anonfun$apply$15.apply(DatabricksFileSystemV2.scala:757)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2$$anonfun$getFileStatus$1$$anonfun$apply$15.apply(DatabricksFileSystemV2.scala:756)
at com.databricks.s3a.S3AExeceptionUtils$.convertAWSExceptionToJavaIOException(DatabricksStreamUtils.scala:119)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2$$anonfun$getFileStatus$1.apply(DatabricksFileSystemV2.scala:756)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2$$anonfun$getFileStatus$1.apply(DatabricksFileSystemV2.scala:756)
at com.databricks.logging.UsageLogging$$anonfun$recordOperation$1.apply(UsageLogging.scala:440)
at com.databricks.logging.UsageLogging$$anonfun$withAttributionContext$1.apply(UsageLogging.scala:251)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at com.databricks.logging.UsageLogging$class.withAttributionContext(UsageLogging.scala:246)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:450)
at com.databricks.logging.UsageLogging$class.withAttributionTags(UsageLogging.scala:288)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:450)
at com.databricks.logging.UsageLogging$class.recordOperation(UsageLogging.scala:421)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperation(DatabricksFileSystemV2.scala:450)
at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.getFileStatus(DatabricksFileSystemV2.scala:755)
at com.databricks.backend.daemon.data.client.DatabricksFileSystem.getFileStatus(DatabricksFileSystem.scala:201)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:496)
And here's how I set up the bucket:
from urllib import parse
try:
    dbutils.fs.mount(
        f's3n://{s3_accesskey_id}:{parse.quote(s3_secret_access_key, "")}@mybucket-upload/mylink',
        '/mnt/s3/mybucket-upload/mylink')
except Exception as error:
    # Ignore only the "already mounted" case; re-raise everything else.
    if 'Directory already mounted' not in str(error):
        raise
I also tried passing the AWS credentials directly in the code, but that doesn't work either.
Interestingly, everything works perfectly in DEV.
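One thing worth ruling out when comparing environments is how the secret key is encoded in the mount source. The sketch below isolates that step, assuming the usual key:secret@bucket URI form; build_s3_mount_source is a hypothetical helper and the key values are made up.

```python
from urllib import parse


def build_s3_mount_source(access_key_id, secret_access_key, bucket_path, scheme="s3n"):
    """Return a '<scheme>://<key>:<encoded secret>@<bucket path>' mount source."""
    # safe="" makes quote() encode '/' as well, so a secret containing a slash
    # cannot be mistaken for a path separator in the URI.
    encoded_secret = parse.quote(secret_access_key, safe="")
    return f"{scheme}://{access_key_id}:{encoded_secret}@{bucket_path}"


# Made-up credentials, purely to show the encoding.
print(build_s3_mount_source("AKIAEXAMPLE", "abc/def+ghi", "mybucket-upload/mylink"))
# prints: s3n://AKIAEXAMPLE:abc%2Fdef%2Bghi@mybucket-upload/mylink
```

Printing the source (with the secret masked) in both DEV and the failing environment makes it easier to spot a difference in the key or bucket path.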

AWS S3 Connection in Druid

I have set up a clustered Druid with the configuration as mentioned in the Druid documentation
https://druid.apache.org/docs/latest/tutorials/cluster.html
I am using AWS S3 for deep storage. Following is the snippet of my common configuration file
druid.extensions.loadList=["druid-datasketches", "mysql-metadata-storage", "druid-s3-extensions", "druid-orc-extensions", "druid-lookups-cached-global"]
# For S3:
druid.storage.type=s3
druid.storage.bucket=bucket-name
druid.storage.baseKey=druid/segments
#druid.storage.disableAcl=true
druid.storage.sse.type=s3
#druid.s3.accessKey=...
#druid.s3.secretKey=...
# For S3:
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=bucket-name
druid.indexer.logs.s3Prefix=druid/stage/indexing-logs
While running any ingestion task, I get an Access Denied error:
java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: ; S3 Extended Request ID: ), S3 Extended Request ID:
at org.apache.druid.storage.s3.S3DataSegmentPusher.push(S3DataSegmentPusher.java:103) ~[?:?]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:791) ~[druid-server-0.19.0.jar:0.19.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.19.0.jar:0.19.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.19.0.jar:0.19.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.19.0.jar:0.19.0]
I am using S3 for two purposes:
1. Reading data from S3 and ingesting it. This connection works fine and data is being read from the S3 location.
2. Deep storage. This is where I am getting the error.
I am using the profile information authentication method to provide the S3 credentials, so I have already configured the AWS CLI with the appropriate credentials. Also, the S3 data is encrypted with AES256, so I have added druid.storage.sse.type=s3 to the config file.
Can someone help me out here? I am not able to debug the issue.
You asked how to approach debugging this. Normally I would:
1. SSH onto the EC2 instance and run aws sts get-caller-identity. This tells you which principal your requests are sent from. Then confirm that this principal has the expected S3 access.
2. Confirm that I can write to the bucket in your configuration:
druid.storage.type=s3
druid.storage.bucket=<bucket-name>
druid.storage.baseKey=druid/segments
3. Try some of the other auth methods, such as exporting the keys into the environment as mentioned in the third option, since that is a simple test. Then run step 1 again to confirm that the principal reflects those keys, and try running your code again.
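The first two checks above can also be scripted. The following is a rough sketch rather than anything Druid itself does: check_deep_storage_access and the _access_probe key are hypothetical, and the clients are passed in so that the same function works with boto3 clients or with stand-ins.

```python
def check_deep_storage_access(sts_client, s3_client, bucket, base_key):
    """Return the caller ARN after writing and deleting a small probe object."""
    # Same information as `aws sts get-caller-identity`.
    identity = sts_client.get_caller_identity()
    probe_key = base_key.rstrip("/") + "/_access_probe"
    # ServerSideEncryption="AES256" mirrors druid.storage.sse.type=s3 (SSE-S3).
    s3_client.put_object(
        Bucket=bucket, Key=probe_key, Body=b"probe", ServerSideEncryption="AES256"
    )
    s3_client.delete_object(Bucket=bucket, Key=probe_key)
    return identity["Arn"]


# Example usage on the Druid host (requires boto3 and configured credentials):
#   import boto3
#   arn = check_deep_storage_access(
#       boto3.client("sts"), boto3.client("s3"), "<bucket-name>", "druid/segments")
#   print(arn)
```

If put_object raises an AccessDenied error here as well, the problem lies with the principal's permissions or the bucket policy rather than with the Druid configuration.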

DMS S3 source endpoint connection fails

I get the connection error below when trying to validate the S3 source endpoint in DMS:
Test Endpoint failed: Application-Status: 1020912, Application-Message: Failed to connect to database.
I followed all the steps listed in the links below, but I may still be missing something:
https://aws.amazon.com/premiumsupport/knowledge-center/dms-connection-test-fail-s3/
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.S3.html
The role associated with the endpoint does have access to the endpoint's S3 bucket, and DMS is listed as a trusted entity.
I got this same error when trying to use S3 as a target.
The one thing not mentioned in the documentation, and which turned out to be the root cause for my error, is that the DMS Replication Instance and the Bucket need to be in the same region.
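A quick way to compare the two regions is to ask S3 where the bucket lives. This is a minimal sketch with hypothetical helper names; the replication instance's region is assumed to be known and is passed in as a plain string.

```python
def bucket_region(s3_client, bucket):
    """Return the bucket's region, normalising the legacy values S3 reports."""
    constraint = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    if constraint is None:
        # S3 reports no location constraint for buckets in us-east-1.
        return "us-east-1"
    if constraint == "EU":
        # Legacy alias still returned for some older eu-west-1 buckets.
        return "eu-west-1"
    return constraint


def same_region(s3_client, bucket, replication_instance_region):
    """True if the bucket is in the same region as the replication instance."""
    return bucket_region(s3_client, bucket) == replication_instance_region


# Example usage (requires boto3 and configured credentials):
#   import boto3
#   print(same_region(boto3.client("s3"), "<bucket-name>", "us-west-2"))
```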

Redshift Spectrum / The bucket you are attempting to access must be addressed using the specified endpoint

I created a Parquet file in S3 and an external table pointing to it in Redshift Spectrum. Both my S3 bucket and my Redshift cluster are in us-west-2. I specified the region option when creating the schema.
Queries run smoothly in Athena.
Yet when I run from Redshift client, I get this error:
Amazon Invalid operation: S3 Query Exception (Fetch)
Details:
error: S3 Query Exception (Fetch)
code: 15001
context: Task failed due to an internal error.
HTTP response error code: 301 Message: PermanentRedirect The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
x-amz-request-id: XXXX
query: XXXXX
location: dory_util.cpp:689
process: query0_40 [pid=XXX]
-----------------------------------------------;
AWS has acknowledged the issue and released a patch overnight.
Please make sure that your Redshift cluster is running with at least version 1.0.14016 in us-east-2 or us-west-2 and 1.0.1407 in us-east-1. To apply the patch to Redshift immediately, move the maintenance window of your cluster closer to the current time and day to pick it up at your convenience.