I am trying to upload content to Amazon S3 but I am getting this error:
boto3.exceptions.UnknownAPIVersionError: The 's3' resource does not an API version of: ...
Valid API versions are: 2006-03-01
import boto3

s3 = boto3.resource('s3', AWS_ACCESS_KEY_ID, AWS_PRIVATE_KEY)
bucket = s3.Bucket(NAME_OF_BUCKET)
obj = bucket.Object(KEY)
obj.upload_fileobj(FILE_OBJECT)
The error is caused by a DataNotFound exception raised inside boto3.Session, as you can see in the boto3.Session source code. Perhaps the developer didn't anticipate people making the mistake of not passing the correct object.
If you follow the boto3 documentation example, this is the correct way to upload data:
import boto3

s3 = boto3.resource(
    's3',
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_PRIVATE_KEY,
)
bucket = s3.Bucket(NAME_OF_BUCKET)
obj = bucket.Object("prefix/object_key_name")

# You must pass the file object!
with open('filename', 'rb') as fileobject:
    obj.upload_fileobj(fileobject)
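As a side note (not part of the original answer): if the data already lives on disk, boto3's Object.upload_file accepts a filename directly, so the explicit open() isn't needed. A minimal sketch:

# Equivalent shortcut when the data is a file on disk
obj = bucket.Object("prefix/object_key_name")
obj.upload_file("filename")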
I am trying to save a user-sent Telegram voice message directly to S3. This happens inside AWS Lambda so saving to disk and using s3.upload_file(filename,...) will not work. This fails:
def audio_handler(update, context):
    message = update.effective_message
    file = message.voice.get_file()
    s3 = boto3.client('s3')
    s3.upload_file(file, Bucket='mybucket', Key='onelove.ogg')
ValueError: Filename must be a string
If I attempt to use
s3.upload_fileobj(BytesIO(file).getbuffer(), Bucket='mybucket', Key='onelove.ogg')
TypeError: a bytes-like object is required, not 'File'
Voice.get_file returns an object of type File. To download the voice to memory, you can e.g. pass an empty BytesIO object to the out argument of File.download. Please also have a look at the wiki section on working with files and media.
Disclaimer: I'm currently the maintainer of python-telegram-bot.
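A minimal sketch of that approach (my own illustration, assuming the handler from the question and the pre-v20 File.download(out=...) API):

from io import BytesIO

import boto3


def audio_handler(update, context):
    message = update.effective_message
    file = message.voice.get_file()

    # Download the voice message into an in-memory buffer instead of to disk
    buf = BytesIO()
    file.download(out=buf)
    buf.seek(0)  # rewind so upload_fileobj reads from the beginning

    s3 = boto3.client('s3')
    s3.upload_fileobj(buf, Bucket='mybucket', Key='onelove.ogg')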
We're using boto3 with Linode Object Storage, which is compatible with AWS S3 according to their documentation.
Everything seems to work well, except the cross-region copy operation.
When I download an object from the source region/bucket and then upload it to the destination region/bucket, everything works fine (a sketch of that fallback is at the end of this question), but I'd like to avoid that unnecessary download/upload step.
I have a bucket named test-bucket in both regions, and I'd like to copy the object named test-object from the us-east-1 to the us-southeast-1 cluster.
Here is the example code I'm using:
from boto3.session import Session

sess = Session(
    aws_access_key_id='***',
    aws_secret_access_key='***'
)

s3_client_src = sess.client(
    service_name='s3',
    region_name='us-east-1',
    endpoint_url='https://us-east-1.linodeobjects.com'
)

# test-bucket and test-object already exist.
s3_client_trg = sess.client(
    service_name='s3',
    region_name='us-southeast-1',
    endpoint_url='https://us-southeast-1.linodeobjects.com'
)

copy_source = {
    'Bucket': 'test-bucket',
    'Key': 'test-object'
}

s3_client_trg.copy(CopySource=copy_source, Bucket='test-bucket', Key='test-object', SourceClient=s3_client_src)
When I call:
s3_client_src.list_objects(Bucket='test-bucket')['Contents']
it shows that test-object exists. But when I run the copy, it throws the following error:
An error occurred (NoSuchKey) when calling the CopyObject operation: Unknown
Any help is appreciated!
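For reference, a minimal sketch (my own illustration) of the download-then-upload fallback mentioned above, reusing the two clients from the example; get_object/put_object are the standard boto3 calls:

# Works, but streams the data through this machine instead of copying server-side
obj = s3_client_src.get_object(Bucket='test-bucket', Key='test-object')
s3_client_trg.put_object(
    Bucket='test-bucket',
    Key='test-object',
    Body=obj['Body'].read()
)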
I'm trying to copy an S3 object with boto3, like below:
import boto3
client = boto3.client('s3')
client.copy_object(Bucket=bucket_name, ContentEncoding='gzip', CopySource=copy_source, Key=new_key)
The copy itself succeeds, but the ContentEncoding metadata is not added to the object.
When I use the console to add Content-Encoding metadata, there is no problem.
But with the boto3 copy command it does not get set.
Here's the documentation link for client.copy_object():
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.copy_object
The application versions are:
python=2.7.16
boto3=1.0.28
botocore=1.13.50
Thank you in advance.
Try adding MetadataDirective='REPLACE' to your copy_object call:
client.copy_object(Bucket=bucket_name, ContentEncoding='gzip', CopySource=copy_source, Key=new_key, MetadataDirective='REPLACE')
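For context (hedged, but this matches the S3 CopyObject documentation): the default MetadataDirective is COPY, which keeps the source object's metadata and ignores new headers such as ContentEncoding supplied in the request; REPLACE tells S3 to use the metadata from the request instead. A fuller sketch with hypothetical bucket/key names:

import boto3

client = boto3.client('s3')

# Hypothetical source/destination names for illustration
copy_source = {'Bucket': 'source-bucket', 'Key': 'logs/data.json.gz'}

client.copy_object(
    Bucket='dest-bucket',
    Key='logs/data.json.gz',
    CopySource=copy_source,
    MetadataDirective='REPLACE',     # use the metadata given here, not the source object's
    ContentEncoding='gzip',
    ContentType='application/json',  # REPLACE discards old metadata, so restate anything you need
)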
I am trying to unit test a function that writes data to S3 and then reads the same data back from the same S3 location. I am trying to use moto and boto (2.x) to achieve that [1]. The problem is that the service returns that I am forbidden to access the key [2]. A similar problem (even though the error message is a bit different) is reported in the moto GitHub repository [3], but it is not resolved yet.
Has anyone ever successfully tested mocked s3 read/write in PySpark to share some insights?
[1]
import pytest
import boto
from boto.s3.key import Key
from moto import mock_s3
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

_test_bucket = 'test-bucket'
_test_key = 'data.csv'

@pytest.fixture(scope='function')
def spark_context(request):
    conf = SparkConf().setMaster("local[2]").setAppName("pytest-pyspark-local-testing")
    sc = SparkContext(conf=conf)
    sc._jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", 'test-access-key-id')
    sc._jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", 'test-secret-access-key')
    request.addfinalizer(lambda: sc.stop())
    quiet_py4j(sc)
    return sc

spark_test = pytest.mark.usefixtures("spark_context")

@spark_test
@mock_s3
def test_tsv_read_from_and_write_to_s3(spark_context):
    spark = SQLContext(spark_context)
    s3_conn = boto.connect_s3()
    s3_bucket = s3_conn.create_bucket(_test_bucket)
    k = Key(s3_bucket)
    k.key = _test_key
    k.set_contents_from_string('')
    s3_uri = 's3n://{}/{}'.format(_test_bucket, _test_key)
    df = (spark
          .read
          .csv(s3_uri))
[2]
(...)
E py4j.protocol.Py4JJavaError: An error occurred while calling o33.csv.
E : org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/data.csv' - ResponseCode=403, ResponseMessage=Forbidden
(...)
[3]
https://github.com/spulec/moto/issues/1543
moto is a library used to mock AWS resources.
1. Create the resource:
If you try to access an S3 bucket which doesn't exist, AWS will return a Forbidden error.
Usually, we need these resources created even before our tests run. So, create a pytest fixture with autouse set to True:
import pytest
import boto3
from moto import mock_s3

@pytest.fixture(autouse=True)
def fixture_mock_s3():
    with mock_s3():
        conn = boto3.resource('s3', region_name='us-east-1')
        conn.create_bucket(Bucket='MYBUCKET')  # an empty test bucket is created
        yield
The above code creates a mock S3 bucket named "MYBUCKET". The bucket is empty.
The name of the bucket should be the same as that of the original bucket.
With autouse=True, the fixture is automatically applied across tests.
You can confidently run tests, as your tests will not have access to the original bucket.
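As an extra safety net (my own addition, not part of the answer above, but a pattern the moto documentation recommends), you can also point the environment at dummy credentials in an autouse fixture so that a misconfigured test can never reach a real account:

import os
import pytest

@pytest.fixture(autouse=True)
def aws_credentials():
    # Dummy credentials so boto3 never picks up a real profile during tests
    os.environ['AWS_ACCESS_KEY_ID'] = 'testing'
    os.environ['AWS_SECRET_ACCESS_KEY'] = 'testing'
    os.environ['AWS_SECURITY_TOKEN'] = 'testing'
    os.environ['AWS_SESSION_TOKEN'] = 'testing'
    os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'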
2. Define and run tests involving the resource:
Suppose you have code that writes a file to an S3 bucket:
def write_to_s3(filepath: str):
    s3 = boto3.resource('s3', region_name='us-east-1')
    s3.Bucket('MYBUCKET').upload_file(filepath, 'A/B/C/P/data.txt')
This can be tested the following way:
import boto3
from botocore.exceptions import ClientError

def test_write_to_s3():
    dummy_file_path = f"{TEST_DIR}/data/dummy_data.txt"

    # The S3 bucket is created by the fixture and now lies empty:
    # test for emptiness
    s3 = boto3.resource('s3', region_name='us-east-1')
    bucket = s3.Bucket("MYBUCKET")
    objects = list(bucket.objects.all())
    assert objects == []

    # Now, let's write a file to S3
    write_to_s3(dummy_file_path)

    # head_object succeeds only if the object exists (it raises ClientError otherwise)
    s3_client = boto3.client('s3', region_name='us-east-1')
    assert s3_client.head_object(Bucket='MYBUCKET', Key='A/B/C/P/data.txt')
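If you also want to cover the negative case (my own hedged addition; this is what the ClientError import is typically used for), you can assert that head_object fails for a key that was never written:

import pytest

def test_missing_key_raises():
    s3_client = boto3.client('s3', region_name='us-east-1')
    # Nothing was uploaded under this key, so the mocked S3 must report it as missing
    with pytest.raises(ClientError):
        s3_client.head_object(Bucket='MYBUCKET', Key='does/not/exist.txt')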
Code:
import boto3
s3_cli = boto3.client('s3')
object_summary = s3_cli.head_object(
    Bucket='test-cf',
    Key='RDS.template',
    VersionId='szA3ws4bH6k.rDXOEAchlh1x3OgthNEB'
)
print('LastModified: {}'.format(object_summary.get('LastModified')))
print('StorageClass: {}'.format(object_summary.get('StorageClass')))
print('Metadata: {}'.format(object_summary.get('Metadata')))
print('ContentLength(KB): {}'.format(object_summary.get('ContentLength')/1024))
Output:
LastModified: 2017-06-08 09:22:43+00:00
StorageClass: None
Metadata: {}
ContentLength(KB): 15
I am unable to get the StorageClass of the key using the boto3 SDK. I can see the storage class set as STANDARD in the AWS console. I have also tried the s3.ObjectSummary and s3.ObjectVersion methods in the boto3 S3 resource, but they also returned None.
Not sure why it is returning None; let me check my version of boto3. Meanwhile, use the following code to get the storage class:
s3 = boto3.resource('s3')
bucket = s3.Bucket('test-cf')
for obj in bucket.objects.all():
    print(obj.key, obj.storage_class)
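As a hedged side note: in my experience S3 omits the storage-class field from HeadObject responses for STANDARD objects, so if you stay with head_object you can treat a missing value as the default:

# Assumption: a missing StorageClass in the head_object response means the default class
storage_class = object_summary.get('StorageClass') or 'STANDARD'
print('StorageClass: {}'.format(storage_class))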