Difference between boto and boto3 (the AWS Python SDKs) with respect to S3 - amazon-s3

AWS released a note on the S3 path-style deprecation: https://aws.amazon.com/it/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/ . In the documentation, they assure that the AWS SDKs will keep working, apart from some problems with bucket names, as long as the SDK is the latest version. The problem is that AWS has two Python SDKs, boto and boto3. I'm sure that boto3 will have no problems related to the bucket path, but I haven't found anything about boto. Is boto updated together with boto3?

From boto's GitHub page:
Going forward, API updates and all new feature work will be focused on Boto3.
So boto is no longer getting any API updates or new features. If you check the linked GitHub page, the last commit was over a year ago. So it's likely that the S3 path changes won't be reflected in boto.
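For context, the deprecation concerns path-style URLs, where the bucket is the first path segment, in favour of virtual-hosted-style URLs, where the bucket is part of the host name. Below is a minimal sketch of the difference using only the standard library. The helper name to_virtual_hosted and the bucket and key names are made up for illustration; this is not code from boto or boto3.

```python
from urllib.parse import urlsplit, urlunsplit


def to_virtual_hosted(url):
    """Rewrite a path-style S3 URL into virtual-hosted style (hypothetical helper)."""
    parts = urlsplit(url)
    # Path-style: the bucket is the first segment of the path.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("URL has no bucket in its path")
    # Virtual-hosted style: the bucket becomes a prefix of the host name.
    host = f"{bucket}.{parts.netloc}"
    return urlunsplit((parts.scheme, host, "/" + key, parts.query, parts.fragment))


print(to_virtual_hosted("https://s3.us-west-2.amazonaws.com/my-bucket/photos/cat.jpg"))
# prints https://my-bucket.s3.us-west-2.amazonaws.com/photos/cat.jpg
```

In boto3 the addressing style can, as far as I know, be selected explicitly by passing botocore.config.Config(s3={"addressing_style": "virtual"}) when creating the client.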

Related

Will objects with the same name uploaded in AWS s3 be overwritten

I'm uploading images/videos to S3 using their API and putObject.
When I use the upload method of com.amazonaws.services.s3.transfer to post the same PutObjectRequest twice, will the object be overwritten by the latest one? Or will AWS save the object twice with different version IDs?
I didn't find the answer in the official AWS documentation. I've checked SO, but the question there is quite old and I don't know how the current version behaves.
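Yes, the object will be overwritten by the latest upload, because versioning on S3 buckets is disabled by default. If versioning is enabled on the bucket, each upload is kept as a separate version with its own version ID.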
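To check which behaviour applies to a given bucket, something like the following sketch may help. It is written in Python with a boto3 S3 client rather than the Java SDK from the question, and the helper names versioning_enabled and put_twice are hypothetical. The client is passed in as a parameter so the helpers can be tried against a stub.

```python
def versioning_enabled(s3, bucket):
    """Return True if versioning is switched on for the bucket."""
    # "Status" is absent from the response when versioning was never configured.
    return s3.get_bucket_versioning(Bucket=bucket).get("Status") == "Enabled"


def put_twice(s3, bucket, key):
    """Upload two different bodies under the same key; return both version IDs."""
    first = s3.put_object(Bucket=bucket, Key=key, Body=b"first upload")
    second = s3.put_object(Bucket=bucket, Key=key, Body=b"second upload")
    # On an unversioned bucket the responses typically carry no "VersionId"
    # and only the second body remains; with versioning enabled each upload
    # gets its own ID.
    return first.get("VersionId"), second.get("VersionId")


# Usage (needs boto3, credentials and a real bucket; names are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   print(versioning_enabled(s3, "my-bucket"))
#   print(put_twice(s3, "my-bucket", "photo.jpg"))
```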

How would cv2 access videos in AWS s3 bucket

I can use the SageMaker notebook now, but there is a significant problem. When I tried to use cv2.VideoCapture to read a video in the S3 bucket, it said the path doesn't exist. One answer on Stack Overflow said cv2 only supports local files, which means we have to download the videos from the S3 bucket to the notebook, but I don't want to do this. How do you read the videos? Thanks.
One solution I found is to use CloudFront, but would this be charged, and is it fast?
You are using Python in SageMaker, so you could use:
import boto3
s3_client = boto3.client('s3')
s3_client.download_file('deepfake2020', 'dfdc_train_part_1/foo.mp4', 'foo.mp4')
This will download the file from Amazon S3 to the local disk, in a file called foo.mp4.
See: download_file() in boto3
This requires that the SageMaker instance has been granted permissions to access the Amazon S3 bucket.
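If the downloaded copy is only needed while the video is being read, the same download_file() call can target a temporary file. This is a sketch under that assumption; download_to_temp is a hypothetical helper, and the S3 client is passed in so the function does not depend on how it was created.

```python
import os
import tempfile


def download_to_temp(s3, bucket, key):
    """Download s3://bucket/key to a temporary file and return its local path."""
    # Keep the extension (e.g. ".mp4") so decoders can guess the container format.
    suffix = os.path.splitext(key)[1]
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # download_file opens the path itself
    s3.download_file(bucket, key, path)
    return path


# Usage in a notebook (needs boto3, opencv-python and bucket permissions):
#   import boto3, cv2
#   path = download_to_temp(boto3.client("s3"), "deepfake2020", "dfdc_train_part_1/foo.mp4")
#   cap = cv2.VideoCapture(path)
#   ok, frame = cap.read()
#   cap.release()
#   os.remove(path)  # delete the temporary copy when done
```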
This solution also works.
To use AWS SageMaker:
1) Go to the Support Center and ask for an increase of the notebook instance limit. They normally reply within one day.
2) When creating a notebook, change the local disk size to 1 TB (double the data size).
3) Open JupyterLab and type cd SageMaker in the terminal.
4) Use CurlWget to get the download link of the dataset.
5) After downloading, unzip the data:
unzip dfdc_train_all.zip
unzip '*.zip'
There you go.
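The unzip '*.zip' step can also be done from a notebook cell. The following is a rough Python equivalent using only the standard library; unzip_all is a hypothetical helper, and it extracts each archive into the directory that contains it.

```python
import zipfile
from pathlib import Path


def unzip_all(directory):
    """Extract every .zip archive found directly in `directory`, in place."""
    directory = Path(directory)
    extracted = []
    for archive in sorted(directory.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(directory)
        extracted.append(archive.name)
    return extracted


# Usage: unzip_all(".") extracts every archive in the current directory and
# returns the archive names it processed.
```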

Upload multiple files to AWS S3 bucket without overwriting existing objects

I am very new to AWS technology.
I want to add some files to an existing S3 bucket without overwriting existing objects. I am using Spring Boot technology for my project.
Can anyone please suggest how we can add/upload multiple files without overwriting existing objects?
AWS S3 supports object versioning on a bucket. If the same file is uploaded again, S3 keeps every upload in the bucket as a separate version rather than overwriting it.
Versioning can be enabled using the AWS Console or the CLI. You may want to refer to this link for more info.
You probably already found an answer to this, but if you're using the CDK or the CLI you can specify a destinationKeyPrefix. If you want multiple folders in an S3 bucket, which was my case, the folder name will be your destinationKeyPrefix.
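If versioning is not an option, another approach is to check whether a key already exists before uploading. The sketch below uses Python with a boto3 S3 client rather than Spring Boot, and key_exists and upload_if_absent are hypothetical helper names. Note that check-then-upload is not atomic, so two concurrent uploaders could still race.

```python
def key_exists(s3, bucket, key):
    """Return True if an object with exactly this key is already in the bucket."""
    # Keys are listed in lexicographic order, so if `key` exists it is the
    # first result among all keys that start with `key`.
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
    return any(obj["Key"] == key for obj in resp.get("Contents", []))


def upload_if_absent(s3, bucket, key, body):
    """Upload `body` under `key` unless the key is taken; return True if uploaded."""
    if key_exists(s3, bucket, key):
        return False
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return True


# Usage (needs boto3 and credentials; the bucket and keys are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   for name, data in {"a.txt": b"A", "b.txt": b"B"}.items():
#       print(name, upload_if_absent(s3, "my-bucket", "uploads/" + name, data))
```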

AWS S3 stops uploading from my Lenovo® ix2-dl

I have a Lenovo® ix2-dl NAS drive that I set up to back up to AWS S3. It connected fine, but for some reason it only uploads 5% of the data on it. How can I get it to upload all of my data?
I updated my NAS to the latest firmware, 4.1.218.34037.
I recently had issues with the S3 backup feature, where the uploads simply stopped working. No errors, nothing in the logs to indicate an issue. I tested my AWS S3 access key and secret with another method and was able to upload files just fine.
To resolve the issue, I had to create a new AWS S3 bucket, then go into the S3 setup of the Lenovo and provide the required info. I think what made this work for me was making sure the bucket name contained nothing other than letters and numbers. My previous bucket name was similar to lastname.family.pics; my new bucket, which works, is similar to lastname123.
Hope this helps. This feature had worked fine for a long time; perhaps an update came down that has different requirements for the API.
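The "letters and numbers only" rule from this answer can be checked with a small sketch like the one below. This is stricter than what S3 itself requires (S3 also allows dots and hyphens), and whether the dots were really the cause on the NAS is only the answerer's guess. Dots in bucket names are, however, a known source of trouble with virtual-hosted-style HTTPS access. The helper name is_simple_bucket_name is hypothetical.

```python
import re

# 3 to 63 characters, lowercase letters and digits only: a deliberately
# conservative subset of the bucket names S3 accepts.
_SIMPLE_NAME = re.compile(r"[a-z0-9]{3,63}")


def is_simple_bucket_name(name):
    """Return True if the name uses only lowercase letters and digits."""
    return _SIMPLE_NAME.fullmatch(name) is not None


print(is_simple_bucket_name("lastname.family.pics"))  # False
print(is_simple_bucket_name("lastname123"))           # True
```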

Difference between s3cmd, boto and AWS CLI

I am thinking about redeploying my static website to Amazon S3. I need to automate the deployment so I was looking for an API for such tasks. I'm a bit confused over the different options.
Question: What is the difference between s3cmd, the Python library boto and AWS CLI?
s3cmd and AWS CLI are both command line tools. They're well suited if you want to script your deployment through shell scripting (e.g. bash).
AWS CLI gives you simple file-copying abilities through the "s3" command, which should be enough to deploy a static website to an S3 bucket. It also has some small advantages such as being pre-installed on Amazon Linux, if that was where you were working from (it's also easily installable through pip).
One AWS CLI command that may be appropriate to sync a local directory to an S3 bucket:
$ aws s3 sync . s3://mybucket
Full documentation on this command:
http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
Edit: As mentioned by @simon-buchan in a comment, the aws s3api command gives you access to the complete S3 API, but its interface is more "raw".
s3cmd supports everything AWS CLI does, plus adds some more extended functionality on top, although I'm not sure you would require any of it for your purposes. You can see all its commands here:
http://s3tools.org/usage
Installation of s3cmd may be a bit more involved because there don't seem to be packages for it in any distro's main repos.
boto is a Python library, and in fact the official AWS Python SDK. The AWS CLI, also being written in Python, actually uses part of the boto library (botocore). It would be well suited only if you were writing your deployment scripts in Python. There are official SDKs for other popular languages (Java, PHP, etc.) should you prefer:
http://aws.amazon.com/tools/
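If the deployment script is written in Python, uploading a static site could look roughly like this sketch. It assumes boto3 (the newer SDK) rather than the original boto, and upload_directory is a hypothetical helper. Unlike aws s3 sync, it uploads every file each time and never deletes anything from the bucket.

```python
import mimetypes
from pathlib import Path


def upload_directory(s3, root, bucket):
    """Upload every file under `root` to `bucket`, using relative paths as keys."""
    root = Path(root)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        key = path.relative_to(root).as_posix()
        # Set Content-Type so browsers render HTML and CSS instead of downloading them.
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        s3.upload_file(str(path), bucket, key, ExtraArgs={"ContentType": content_type})
        uploaded.append(key)
    return uploaded


# Usage (needs boto3 and credentials; the bucket name is a placeholder):
#   import boto3
#   upload_directory(boto3.client("s3"), ".", "mybucket")
```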
The rawest form of access to S3 is through AWS's REST API. Everything else is built upon it at some point. If you feel adventurous, here's the S3 REST API documentation:
http://docs.aws.amazon.com/AmazonS3/latest/API/APIRest.html