Amazon SES Deliver to S3 bucket action fails sometimes

I have configured receipt rules on Amazon SES with a Deliver to S3 bucket action. I noticed that this step sometimes fails, however, and the email does not get delivered to S3. I tried sending an email with an 8KB attachment and that was fine, but when sending with a 115KB attachment it fails. The docs say the maximum size allowed is 40MB, so I'm not sure what the problem is.
Does anyone have any clues, or know how I can debug SES rule failures?

Related

aws s3: Extra charge against us-east-1 when putting to ap-northeast-1

Our AWS statement came in and we noticed we're being doubly charged for the number of requests.
First charge is for Asia Pacific (Tokyo) (ap-northeast-1) and this is straightforward because it's where our bucket is located. But there's another charge against US East (N. Virginia) (us-east-1) with a similar number of requests.
Long story short, it appears this is happening because we're using the aws s3 command and we haven't specified a region either via the --region option or any of the fallback methods.
Typing aws configure list shows region: Value=<not set> Type=None Location=None.
And yet our aws s3 commands succeed, albeit with this seemingly hidden charge. The presumption is that our requests first go to us-east-1, but since there isn't a bucket there with the name we specified, the request turns around and comes back to ap-northeast-1, where it ultimately succeeds while getting counted twice.
The EC2 instance where the aws command is run is itself in ap-northeast-1, if that counts for anything.
So the question is: is the presumption above a reasonable account of what's happening? (i.e. is it expected behaviour?) And, it seems a bit insidious to me, but is there a proper rationale for this?
What you are seeing is correct. The aws s3 command needs to know the region in order to access the S3 bucket.
Since this has not been provided, it will make a request to us-east-1, which is effectively the default; see the AWS S3 region chart, which shows that us-east-1 does not require a location constraint.
If S3 receives a request for a bucket that is not in that region, it returns a PermanentRedirect response with the correct region for the bucket. The AWS CLI handles this transparently and repeats the request against the correct endpoint, which includes the region.
The easiest way to see this in action is to run commands in debug mode:
aws s3 ls ap-northeast-1-bucket --debug
The output will include:
DEBUG - Response body:
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access
must be addressed using the specified endpoint. Please send all future requests to
this endpoint.</Message>
<Endpoint>ap-northeast-1-bucket.s3.ap-northeast-1.amazonaws.com</Endpoint>
<Bucket>ap-northeast-1</Bucket>
<RequestId>3C4FED2EFFF915E9</RequestId><HostId>...</HostId></Error>
The AWS CLI does not assume the region is the same as that of the calling EC2 instance; this is a long-running point of confusion and feature request.
Additional note: not all AWS services will auto-discover the region in this way; some will simply fail if the region is not set. S3 works because it uses a global namespace, which inherently requires some form of discovery service.
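One way to avoid the extra us-east-1 hop is to pin the bucket's region, either in the CLI configuration or per command. A minimal sketch, reusing the example bucket name from above:

# set a default region for the current profile
aws configure set region ap-northeast-1

# or pass it explicitly on each call
aws s3 ls s3://ap-northeast-1-bucket --region ap-northeast-1

With the region set, the first request goes straight to the Tokyo endpoint, so only one set of requests is billed.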

Can I trust aws-cli to re-upload my data without corrupting when the transfer fails?

I extensively use S3 to store encrypted and compressed backups of my workstations, and I use the AWS CLI to sync them to S3. Sometimes the transfer fails while in progress. I usually just retry it and let it finish.
My question is: does S3 have some kind of check to make sure that the previously failed transfer didn't leave corrupted files? Is syncing again enough to fix the previously failed transfer?
Thanks!
Individual files uploaded to S3 are never stored partially. Either the entire file is uploaded and S3 stores it as an S3 object, or the upload is aborted and no S3 object is stored.
Even in the multipart upload case, multiple parts can be uploaded, but they never form a complete S3 object unless all of the pieces are uploaded and the "Complete Multipart Upload" operation is performed. So there is no need to worry about corruption via partial uploads.
Syncing will certainly be enough to fix the previously failed transfer.
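As a minimal sketch (paths and bucket name are hypothetical), re-running the same sync picks up where the failed run left off: sync compares each local file's size and timestamp against what is already in the bucket and only uploads what is missing or changed.

# safe to re-run after a failed transfer; already-synced files are skipped
aws s3 sync ~/backups s3://my-backup-bucket/backups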
Yes, it looks like the AWS CLI does validate what it uploads and guards against corruption by employing an MD5 checksum.
From https://docs.aws.amazon.com/cli/latest/topic/s3-faq.html
The AWS CLI will perform checksum validation for uploading and downloading files in specific scenarios.
The AWS CLI will calculate and auto-populate the Content-MD5 header for both standard and multipart uploads. If the checksum that S3 calculates does not match the Content-MD5 provided, S3 will not store the object and instead will return an error message back to the AWS CLI.
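If you want to spot-check an object yourself, one approach is to compare a local MD5 against the object's ETag. This only holds for single-part uploads without SSE-KMS, where the ETag is the MD5 of the content; the bucket and key names below are hypothetical.

# MD5 of the local file
md5sum backup.tar.gz.gpg

# ETag of the stored object (equals the MD5 only for single-part, non-KMS uploads)
aws s3api head-object --bucket my-backup-bucket --key backups/backup.tar.gz.gpg --query ETag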

Google transfer service error notification

I've been looking everywhere and can't seem to find the answer. I set up a file transfer service between an S3 bucket and a Google Storage bucket. I know I can see the error messages by clicking on the file transfer, but I want to access the log so I can set up an email notification when an error occurs. Where can I find the log? Or is there another way to set up this email notification?
Google's Transfer Service does not currently have any mechanism for email/pubsub/etc. notifications about the progress of a job or errors it encounters.
Until such a feature exists, I think the closest available solution would be something based on the access logs or notifications directly from GCS or S3 (but that would include other traffic on the bucket, not just Transfer Service). E.g., for errors encountered when writing objects to GCS, you could analyze the access logs or the object change notifications.
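As a sketch of that workaround on the GCS side (topic and bucket names are hypothetical, and this uses Cloud Pub/Sub notifications as the notification mechanism), you could wire up object notifications and access logs with gsutil:

# publish a Pub/Sub message for every object written to the destination bucket
gsutil notification create -t transfer-events -f json gs://my-destination-bucket

# or turn on access logging and analyze the logs for failed writes
gsutil logging set on -b gs://my-log-bucket -o transfer-logs gs://my-destination-bucket

Note that either feed covers all traffic on the bucket, not just the Transfer Service, so you would still need to filter on your side before sending an email.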

Files disappearing from Amazon S3

Here are links to four files that I uploaded in the last week, but have now disappeared from my bucket on S3:
https://gh-resource.s3.amazonaws.com/ghImage/SWjqzgXy9rGCYvpRF-naypyidaw.jpg
https://gh-resource.s3.amazonaws.com/ghImage/SWjqzgXy9rGCYvpRF-london.jpg
https://gh-resource.s3.amazonaws.com/ghImage/SWjqzgXy9rGCYvpRF-brussels.jpg
https://s3.amazonaws.com/gh-resource/ghImage/SWjqzgXy9rGCYvpRF-ottawa.jpg
I know they uploaded successfully because I saw them on my website multiple times before they disappeared. I just re-uploaded the last file above (ottawa) so that I could look at the permissions and see if there was an expiry date or expiry rule. When I looked at the permissions, 'everyone' has read/download permission. The expiry date is None and the expiry rule is N/A. This has been happening regularly for the last year or so. What could be causing this?
You should enable logging on your bucket. This will tell you who/what is deleting your files.
See: Logging Amazon S3 API Calls By Using AWS CloudTrail
I found that if you have an expiry policy set up, you'll also see that in the logs. See Lifecycle and Other Bucket Configurations for more info.
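As a rough sketch of both checks (the trail name is hypothetical; the bucket name is taken from the question), you could add an S3 data-event selector to an existing CloudTrail trail and also confirm whether a lifecycle rule is expiring objects:

# record object-level (data) events for the bucket on an existing trail
aws cloudtrail put-event-selectors --trail-name my-trail --event-selectors '[{"ReadWriteType":"All","IncludeManagementEvents":true,"DataResources":[{"Type":"AWS::S3::Object","Values":["arn:aws:s3:::gh-resource/"]}]}]'

# check whether a lifecycle rule could be deleting objects
aws s3api get-bucket-lifecycle-configuration --bucket gh-resource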

AWS s3 configuration to avoid waiting time for multiple request

I have static content uploaded to an S3 bucket.
When I hit the URL for the first time, the contents take a while to load. It is a single HTML page with multiple CSS and JS files.
So is there any kind of configuration needed at the S3 level to optimize this?
I am trying to figure out settings such as the number of connections, like we have in Apache.
There are no such configuration settings available for Amazon S3. It just works!
Some ideas for speeding your download:
Create a bucket that is located closer to you/your users (less latency)
Zip/gzip your files before uploading to Amazon S3 (faster download; see the sketch below)
Check the Network console in your web browser to determine where the time is being taken
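For the compression idea, a minimal sketch (file and bucket names are hypothetical) is to gzip assets locally and set Content-Encoding so browsers decompress them transparently:

# keep the original and write a gzipped copy
gzip -9 -k app.js

# upload the compressed copy under the original key with the right headers
aws s3 cp app.js.gz s3://my-static-site/app.js --content-encoding gzip --content-type application/javascript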