Boto3 generate presigned URL does not work - amazon-s3

Here is the code I use to create an S3 client and generate a presigned URL. It is fairly standard code and has been running on the server for quite a while. I pulled it out and ran it locally in a Jupyter notebook:
import os
import boto3

def get_s3_client():
    return get_s3(create_session=False)

def get_s3(create_session=False):
    session = boto3.session.Session() if create_session else boto3
    S3_ENDPOINT = os.environ.get('AWS_S3_ENDPOINT')
    if S3_ENDPOINT:
        AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
        AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
        AWS_DEFAULT_REGION = os.environ["AWS_DEFAULT_REGION"]
        s3 = session.client('s3',
                            endpoint_url=S3_ENDPOINT,
                            aws_access_key_id=AWS_ACCESS_KEY_ID,
                            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                            region_name=AWS_DEFAULT_REGION)
    else:
        s3 = session.client('s3', region_name='us-east-2')
    return s3

s3 = get_s3_client()

BUCKET = '[my-bucket-name]'
OBJECT_KEY = '[my-object-name]'

signed_url = s3.generate_presigned_url(
    'get_object',
    ExpiresIn=3600,
    Params={
        "Bucket": BUCKET,
        "Key": OBJECT_KEY,
    }
)
print(signed_url)
When I tried to download the file using the URL in the browser, I got an error message saying "The specified key does not exist." I noticed in the error message that my object key had become "[my-bucket-name]/[my-object-name]" rather than just "[my-object-name]".
Then I used the same bucket/key combination to generate a presigned URL with the AWS CLI, and that one worked as expected. It turns out that the boto3 client method somehow inserted [my-bucket-name] in front of [my-object-name] compared to the AWS CLI result. Here are the results:
From s3.generate_presigned_url()
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-bucket-name]/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAV17K253JHUDLKKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T175014Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=5cdcc38e5933e92b5xed07b58e421e5418c16942cb9ac6ac6429ac65c9f87d64
From aws cli s3 presign
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAYA7K15LJHUDAVKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T155926Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=58208f91985bf3ce72ccf884ba804af30151d158d6ba410dd8fe9d2457369894
I've been working on this and searching for solutions for a day and a half, and I couldn't figure out what was wrong with my implementation. My guess is that I missed some basic but important setting when creating an S3 client with boto3, or something else. Thanks for the help!

OK, mystery solved: I shouldn't pass the endpoint_url=S3_ENDPOINT parameter when creating the S3 client; boto3 figures the endpoint out on its own. After I removed it, everything worked as expected.
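For reference, a minimal sketch of the corrected client creation (same environment variables as above, with endpoint_url simply dropped so boto3 derives the regional endpoint itself):

import os
import boto3

def get_s3(create_session=False):
    # No endpoint_url here; boto3 builds the correct regional S3 endpoint on its own.
    session = boto3.session.Session() if create_session else boto3
    return session.client(
        's3',
        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
        region_name=os.environ.get('AWS_DEFAULT_REGION', 'us-east-2'),
    )

s3 = get_s3()
signed_url = s3.generate_presigned_url(
    'get_object',
    ExpiresIn=3600,
    Params={'Bucket': '[my-bucket-name]', 'Key': '[my-object-name]'},
)
print(signed_url)  # the object key is no longer prefixed with the bucket name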

Related

Using a service account and JSON key which is sent to you to upload data into Google Cloud Storage

I wrote a Python script that uploads files from a local folder into Google Cloud Storage.
I also created a service account with sufficient permissions and tested the script on my computer using that service account's JSON key, and it worked.
Now I have sent the code and the JSON key to someone else to run, but the authentication fails on her side.
Are we missing any authentication step in the GCP UI?
import shutil
import subprocess

from google.cloud import storage

def config_gcloud():
    subprocess.run(
        [
            shutil.which("gcloud"),
            "auth",
            "activate-service-account",
            "--key-file",
            CREDENTIALS_LOCATION,
        ]
    )
    storage_client = storage.Client.from_service_account_json(CREDENTIALS_LOCATION)
    return storage_client

def file_upload(bucket, source, destination):
    storage_client = config_gcloud()
    ...
The error happens in config_gcloud, and it says it is expecting str, path, ... but gets NoneType.
As I said, the code is fine and works on my computer. How can another person use it with the JSON key I sent her? She stored the JSON locally, and the path to the JSON is in the code.
CREDENTIALS_LOCATION is None instead of the correct path, hence the complaint about it being NoneType instead of str | Path.
Also, you don't need that gcloud call; it would only matter for gcloud/gsutil commands, not for the Python client library.
And please post the actual stacktrace of the error next time, not just a misspelled interpretation of it.
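A minimal sketch of what that looks like once the gcloud call is dropped (assuming, purely as an example, that the key path is read from the GOOGLE_APPLICATION_CREDENTIALS environment variable; adjust to however CREDENTIALS_LOCATION is actually set):

import os
from google.cloud import storage

# Assumption for this sketch: the key path comes from an environment variable,
# and we fail loudly if it is missing instead of passing None to the client.
CREDENTIALS_LOCATION = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")

def config_gcloud():
    if not CREDENTIALS_LOCATION:
        raise RuntimeError("Set GOOGLE_APPLICATION_CREDENTIALS to the path of the service account JSON key")
    # No gcloud subprocess call needed; the Python client reads the key file directly.
    return storage.Client.from_service_account_json(CREDENTIALS_LOCATION)

def file_upload(bucket_name, source, destination):
    storage_client = config_gcloud()
    bucket = storage_client.bucket(bucket_name)
    bucket.blob(destination).upload_from_filename(source)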

Can't create an S3 bucket with KMS_MANAGED key and bucketKeyEnabled via CDK

I have an S3 bucket with this configuration:
I'm trying to create a bucket with the same configuration via CDK:
Bucket.Builder.create(this, "test1")
        .bucketName("com.myorg.test1")
        .encryption(BucketEncryption.KMS_MANAGED)
        .bucketKeyEnabled(true)
        .build()
But I'm getting this error:
Error: bucketKeyEnabled is specified, so 'encryption' must be set to KMS (value: MANAGED)
This seems like a bug to me, but I'm relatively new to CDK so I'm not sure. Am I doing something wrong, or is this indeed a bug?
I encountered this issue recently and found the answer. I want to share the findings here in case anyone else gets stuck.
Yes, this was a bug in the AWS-CDK. The fix was merged this month: https://github.com/aws/aws-cdk/pull/22331
According to the CDK docs (https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_s3.Bucket.html#bucketkeyenabled), if bucketKeyEnabled is set to true, S3 uses its own time-limited bucket key instead, which helps reduce cost (see https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-key.html). It is only relevant when encryption is set to BucketEncryption.KMS or BucketEncryption.KMS_MANAGED.
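For reference, with a CDK release that includes that fix, the configuration from the question should synthesize. A minimal sketch in Python (the question uses the Java builder, but the properties map one to one; the stack and construct names here are only placeholders):

from aws_cdk import App, Stack
from aws_cdk import aws_s3 as s3
from constructs import Construct

class Test1Stack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Same settings as the question: SSE-KMS with the AWS-managed key plus an S3 Bucket Key.
        # This only synthesizes on aws-cdk-lib versions that include the fix from PR 22331.
        s3.Bucket(
            self, "test1",
            bucket_name="com.myorg.test1",
            encryption=s3.BucketEncryption.KMS_MANAGED,
            bucket_key_enabled=True,
        )

app = App()
Test1Stack(app, "Test1Stack")
app.synth()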
The bucketKeyEnabled flag, straight from the docs:
Specifies whether Amazon S3 should use an S3 Bucket Key with
server-side encryption using KMS (SSE-KMS)
BucketEncryption has 4 options:
NONE - no encryption
KMS_MANAGED - KMS key managed by AWS
S3_MANAGED - key managed by S3
KMS - key managed by the user; a new KMS key will be created
We don't need to set bucketKeyEnabled at all for this scenario. In this case, all we need is aws/s3, so:
bucketKeyEnabled: need not be set (since this is only for SSE-KMS)
encryption: should be set to BucketEncryption.S3_MANAGED
Example:
const buck = new s3.Bucket(this, "my-bucket", {
  bucketName: "my-test-bucket-1234",
  encryption: s3.BucketEncryption.KMS_MANAGED,
});

Creating an S3 presigned URL (PUT) with custom headers with boto3

My current code looks like this
s3 = boto3.client('s3')
presigned_url = s3.generate_presigned_url(
    'put_object',
    Params={'Bucket': bucket_name, 'Key': object_key},
    ExpiresIn=3600,
    HttpMethod='PUT')
This is working, but I want to include custom headers like x-amz-meta-my-custom-meta-data. I'm pretty sure S3 supports this, so how can I do it with boto3?
It's not clear from the documentation.
Using Python 3.6
It is a NO and is still classified as a feature request as of Oct 2017.
https://github.com/boto/boto3/issues/1294
Hope it helps.
Send it as metadata
s3 = boto3.client('s3')
presigned_url = s3.generate_presigned_url(
    'put_object',
    Params={'Bucket': bucket_name,
            'Key': object_key,
            'Metadata': {'mechaGodzilla': 'anything is possible'}},
    ExpiresIn=3600,
    HttpMethod='PUT')
In your request headers, you must include x-amz-meta-mechaGodzilla: "anything is possible"
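For completeness, a sketch of the client side of that upload using the third-party requests library (the file name is just an example; the metadata header has to match what was included in Params when the URL was signed):

import requests

with open("report.csv", "rb") as f:
    response = requests.put(
        presigned_url,
        data=f,
        headers={"x-amz-meta-mechaGodzilla": "anything is possible"},
    )
# A missing or mismatched metadata header typically results in a 403 SignatureDoesNotMatch.
response.raise_for_status()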

Scrapy with S3 support

I have been struggling with this for the last couple of hours but seem to be blind here. I am trying to establish a link between Scrapy and Amazon S3, but I keep getting an error saying that the bucket does not exist (it does, I checked a dozen times).
The error message:
2016-11-01 22:58:08 [scrapy] ERROR: Error storing csv feed (30 items) in: s3://onvista.s3-website.eu-central-1.amazonaws.com/feeds/vista/2016-11-01T21-57-21.csv
in combination with
botocore.exceptions.ClientError: An error occurred (NoSuchBucket) when calling the PutObject operation: The specified bucket does not exist
My settings.py:
ITEM_PIPELINES = {
    'onvista.pipelines.OnvistaPipeline': 300,
    #'scrapy.pipelines.files.S3FilesStore': 600
}
AWS_ACCESS_KEY_ID = 'key'
AWS_SECRET_ACCESS_KEY = 'secret'
FEED_URI = 's3://onvista.s3-website.eu-central-1.amazonaws.com/feeds/%(name)s/%(time)s.csv'
FEED_FORMAT = 'csv'
Does anyone have a working configuration I could take a glance at?
Instead of referring to an Amazon S3 bucket via its hosted website URL, refer to it by name.
The scrapy Feed Exports documentation gives an example of:
s3://mybucket/scraping/feeds/%(name)s/%(time)s.json
In your case, that would make it:
s3://onvista/feeds/%(name)s/%(time)s.json
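Put back into the settings.py from the question, that would look roughly like this (assuming the bucket is literally named onvista and keeping the CSV format used above rather than the JSON from the documentation example):

ITEM_PIPELINES = {
    'onvista.pipelines.OnvistaPipeline': 300,
}
AWS_ACCESS_KEY_ID = 'key'
AWS_SECRET_ACCESS_KEY = 'secret'
# Refer to the bucket by name, not by its static-website hosting URL.
FEED_URI = 's3://onvista/feeds/%(name)s/%(time)s.csv'
FEED_FORMAT = 'csv'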

boto - more concise way to get key's value from bucket?

I'm trying to figure out a concise way to get data from S3 via boto.
My current code looks like this; s3_manager is simply a class that does all the S3 setup for my app.
log.debug("generating downloader")
downloader = s3_manager()
log.debug("accessing bucket")
bucket_archive = downloader.s3_buckets['#archive']
log.debug("getting key")
key = bucket_archive.get_key(archive_filename)
log.debug("getting key into string")
source = key.get_contents_as_string()
The problem is that, looking at my debug logs, I'm making two requests to Amazon S3:
key = bucket_archive.get_key(archive_filename)
source = key.get_contents_as_string()
Looking at the docs [ http://boto.readthedocs.org/en/latest/ref/s3.html ], it seems that the call to get_key checks whether the key exists, while the second call gets the actual data. Does anyone know of a method to do both at once? A more concise way of doing this with one request would be preferable for our app.
The get_key() method performs a HEAD request on the object to verify that it exists. If you are certain that the bucket and key exist and would prefer not to have the overhead of a HEAD request, you can simply create a Key object directly. Something like this would work:
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket', validate=False)
key = bucket.new_key('myexistingkey')
contents = key.get_contents_as_string()
The validate=False on the call to get_bucket eliminates a GET request that also is intended to validate that the bucket exists.
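If the key might not actually exist, the trade-off is that the failure now surfaces on the download call itself; a small sketch of handling that with boto 2 (bucket and key names are placeholders):

import boto
from boto.exception import S3ResponseError

s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket', validate=False)
key = bucket.new_key('maybe-missing-key')
try:
    contents = key.get_contents_as_string()
except S3ResponseError as err:
    if err.status == 404:
        # Without the earlier existence checks, a missing key shows up here.
        contents = None
    else:
        raise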