Redshift COPY command failing to load data from S3

We are facing an error while trying to load a huge zip file from an S3 bucket into Redshift, from an EC2 instance and even from Aginity. What is the real issue here?
As far as we have checked, this could be because of the VPC NACL rules, but we are not sure.
Error:
ERROR: Connection timed out after 50000 milliseconds

I also got this error when Enhanced VPC Routing was enabled; check the routing from your Redshift cluster to S3.
There are several ways to let the Redshift cluster reach S3; see the link below:
https://docs.aws.amazon.com/redshift/latest/mgmt/enhanced-vpc-routing.html
I solved this error by setting up a NAT gateway for the private subnet used by my Redshift cluster.

I think you are correct; it might be because of the bucket access rules or the secret/access keys.
Here are some pointers to debug it further if the above doesn't work.
Create a small zip file and try again, to rule out size as the cause (though I don't think that is a likely case).
Split your zip file into multiple files and create a manifest file for loading rather than a single file; a sketch is below.
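For illustration only, a manifest-based COPY could look roughly like the sketch below. The table, bucket, object names, and keys are all placeholders, and it assumes the split parts are gzip-compressed; the manifest itself is a small JSON file in S3 whose entries list the URL of each part.
-- hypothetical manifest at s3://my-bucket/load/manifest.json, e.g.
-- {"entries": [{"url": "s3://my-bucket/load/part.0001.gz", "mandatory": true},
--              {"url": "s3://my-bucket/load/part.0002.gz", "mandatory": true}]}
COPY my_table
FROM 's3://my-bucket/load/manifest.json'
ACCESS_KEY_ID '<access-key-id>'
SECRET_ACCESS_KEY '<secret-access-key>'
MANIFEST
GZIP;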
I hope you will find this useful.

You should create an IAM role that authorizes Amazon Redshift to access other AWS services like S3 on your behalf. You must associate that role with your Amazon Redshift cluster before you can use the role to load or unload data.
Check below link for setting up IAM role:
https://docs.aws.amazon.com/redshift/latest/mgmt/copy-unload-iam-role.html
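Once the role is associated with the cluster, the COPY command can reference it instead of access keys. A minimal sketch, with a hypothetical table, bucket, and role ARN:
COPY my_table
FROM 's3://my-bucket/load/data.gz'
IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftCopyRole'
GZIP;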

I got this error when the Redshift cluster had Enhanced VPC Routing enabled but no route to S3 in the route table. Adding the S3 endpoint fixed the issue (see the Enhanced VPC Routing docs linked above).

Related

Could not find an active AWS Glue VPC interface endpoint. Could not find an active NAT

I'm trying to create an AWS DataBrew job that pulls data from an S3 folder into an AWS RDS SQL Server table, and I receive the following:
"AWS Glue VPC interface endpoint validation failed for SubnetId: subnet-xxx9574. VPC: vpc-xxxdd2. Reason: Could not find an active AWS Glue VPC interface endpoint. Could not find an active NAT."
My DataBrew output is a "Data catalog RDS Tables" target; I have set up a crawler/connection for it and everything is green.
I followed the guide below, which helped solve a different issue, but this error is slightly different.
https://aws.amazon.com/premiumsupport/knowledge-center/glue-s3-endpoint-validation-failed/
I tried to create another endpoint of type 'Interface' instead of 'Gateway', but I'm not really sure whether that's the correct path.
Any guidance? I'm trying to avoid any custom scripts, as this should be something AWS can handle in their interfaces... IMO.
Thanks!

What's the correct URI format when creating scheduled backups from CockroachDB to a linode s3 bucket?

CockroachDB's documentation gives the example:
CREATE SCHEDULE core_schedule_label
FOR BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
How can I modify this to use an S3-compatible service like Linode rather than AWS?
The format is very similar; you just need to override the endpoint with your actual Linode endpoint. A Linode S3 URI can look like:
CREATE SCHEDULE my_own_backup_schedule
FOR BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=accesskeyid&AWS_SECRET_ACCESS_KEY=secret&AWS_REGION=us-east-1&AWS_ENDPOINT=https://us-east-1.linodeobjects.com'
Note that AWS_ENDPOINT is just the host, not the full endpoint with the bucket name. On older versions of CockroachDB, providing the bucket name in AWS_ENDPOINT (like AWS_ENDPOINT=https://us-east-1.linodeobjects.com/test/schedule-test-core) worked, but in 22.1+ backups created like that may fail with the error "failed to list s3 bucket". You can fix this by creating a new backup schedule formatted as above and adding WITH SCHEDULE OPTIONS ignore_existing_backups, so that the validation in current code does not raise an error like "unexpected error occurred when checking for existing backups in s3" when it finds backups written with the older URI.
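Putting that together, the replacement schedule could look roughly like this; the RECURRING clause here ('@daily') is an assumption, so adjust it to your own cadence:
CREATE SCHEDULE my_own_backup_schedule
FOR BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=accesskeyid&AWS_SECRET_ACCESS_KEY=secret&AWS_REGION=us-east-1&AWS_ENDPOINT=https://us-east-1.linodeobjects.com'
RECURRING '@daily'
WITH SCHEDULE OPTIONS ignore_existing_backups;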

AWS Backup from S3 Access Denied

I am trying to set up a simple on-demand backup of an S3 bucket in AWS, and whatever I try I always get an Access Denied error.
I have tried creating a new bucket that is completely public, I've tried setting the access policy on the vault, and I've tried different regions; all have the same result: Access Denied!
The message doesn't say anything other than Access Denied, really helpful!
Can anyone give me some insight into what this message is referring to and, moreover, how I can resolve this issue?
For AWS Backup, you need to set up a service role.
Traditionally you need two policies attached:
[AWSBackupServiceRolePolicyForBackup]
[AWSBackupServiceRolePolicyForRestore]
For S3, it seems there are separate policies that you need to attach to your service role:
[AWSBackupServiceRolePolicyForS3Backup]
[AWSBackupServiceRolePolicyForS3Restore]
Just putting this here for those who will be looking for this answer.
To solve this problem with the AWS CDK (JavaScript/TypeScript), you can use the following examples:
https://github.com/SimonJang/blog-aws-backup-s3/blob/68a05f8cb443411a23f02aa0c188adfe15bab0ff/infrastructure/lib/infrastructure-stack.ts#L63-L200
or this:
https://github.com/finnishtransportagency/hassu/blob/8adc0bea3193ff016a9aaa6abe0411292714bbb8/deployment/lib/hassu-database.ts#L230-L312

Application in EKS fails to access S3 bucket

My application running in EKS (AWS Kubernetes) is failing to access an S3 bucket.
I'm getting 400 Bad Request errors in my app.
I suspect a permission is missing, so for testing I added arn:aws:iam::aws:policy/AmazonS3FullAccess to any role I could find related to my EKS cluster. Still failing.
Using an S3 client from my local computer, I can access the bucket so I suspect I'm missing some configuration.
Any ideas?
OK... the issue was resolved. I'm leaving this here for future reference.
The problem was a mismatch between the bucket region (us-west-2) and the endpoint I had configured in my application. It should have been s3.us-west-2.amazonaws.com.
The error returned by S3 was not clear.
I hope this helps others.

Amazon S3 with Route 53 for static hosting - shared namespace

I'm currently attempting to use Amazon S3 for static hosting for a domain with the word "bucket" in the URL. One of the requirements for static hosting is that the bucket is named after the domain, so I had success setting up bucketdomain.com (not the actual domain), but unfortunately I am unable to set up www.bucketdomain.com, as S3 returns the following error when creating the S3 bucket:
The requested bucket name is not available. The bucket namespace is
shared by all users of the system. Please select a different name and
try again.
Does anyone know a way round this issue?
S3 bucket names live in a global namespace, so it's very possible that someone else took the same bucket name before you could get it. It's also possible that, due to internal replication delays or other such issues, a previously deleted bucket is not yet available for re-use.
It appears the bucket name you are using is not unique enough.