Boto3's Multiupload vs Upload File - amazon-s3

I'm trying to use signed urls in order to allow users to upload large datasets to an S3 Bucket. Halfway through implementing everything, I realized that I can sign any client method, including upload_file() or upload_fileobj() but I'm getting "client doesn't have method: upload_fileobj":
def generate_signed_upload_url(self, filename: str) -> str:
if not filename:
raise ValueError("filename is empty")
timestamp = int(datetime.now().timestamp())
key = f"user/{self._user_id}/{filename}_{timestamp}/{filename}"
return self._s3_client.generate_presigned_url(
ClientMethod="upload_fileobj",
Params={"Bucket": self._bucket, "Key": key},
ExpiresIn=self._EXPIRATION_TIME_30_MIN_IN_SEC,
)
Would I be able to directly sign an upload_file() method and take advantage of the inbuilt multipart uploading? If not, why not?

Related

Using a service account and JSON key which is sent to you to upload data into google cloud storage

I wrote a python script that uploads files from a local folder into Google cloud storage.
I also created a service account with sufficient permission and tested it on my computer using that service account JSON key and it worked.
Now I send the code and JSON key to someone else to run but the authentication fails on her side.
Are we missing any authentication through GCP UI?
def config_gcloud():
subprocess.run(
[
shutil.which("gcloud"),
"auth",
"activate-service-account",
"--key-file",
CREDENTIALS_LOCATION,
]
)
storage_client = storage.Client.from_service_account_json(CREDENTIALS_LOCATION)
return storage_client
def file_upload(bucket, source, destination):
storage_client = config_gcloud()
...
The error happens in the config_cloud and it says it is expecting str, path, ... but gets NonType.
As I said, the code is fine and works on my computer. How anotehr person can use it using JSON key which I sent her?She stored Json locally and path to Json is in the code.
CREDENTIALS_LOCATION is None instead of the correct path, hence it complaining about it being NoneType instead of str|Path.
Also you don't need that gcloud call, that would only matter for gcloud/gsutil commands, not python client stuff.
And please post the actual stacktrace of the error next time, not just a misspelled interpretation of it.

Receiving "Invalid policy document or request headers!"

I am attempting to upload a file to S3 following the examples provided in your documentation and source files. Unfortunately, I'm receiving the following errors when attempting an upload:
[Fine Uploader 5.3.2] Invalid policy document or request headers!
[Fine Uploader 5.3.2] Policy signing failed. Invalid policy document
or request headers!
I found a few posts on here with similar errors, but those solutions didn't help me.
Here is my jQuery:
<script>
$('#fine-uploader').fineUploaderS3({
request: {
endpoint: "http://mybucket.s3.amazonaws.com",
accessKey: "changeme"
},
signature: {
endpoint: "endpoint.php"
},
uploadSuccess: {
endpoint: "success.html"
},
template: 'qq-template'
});
</script>
(Please note that I changed the keys/bucket names for security sake.)
I used your endpoint-cors.php as a model and have included the portions that I modified here:
require 'assets/aws/aws-autoloader.php';
use Aws\S3\S3Client;
// These assume you have the associated AWS keys stored in
// the associated system environment variables
$clientPrivateKey = $_ENV['changeme'];
// These two keys are only needed if the delete file feature is enabled
// or if you are, for example, confirming the file size in a successEndpoint
// handler via S3's SDK, as we are doing in this example.
$serverPublicKey = $_ENV['AWS_SERVER_PUBLIC_KEY'];
$serverPrivateKey = $_ENV['AWS_SERVER_PRIVATE_KEY'];
// The following variables are used when validating the policy document
// sent by the uploader.
$expectedBucketName = $_ENV['mybucket'];
// $expectedMaxSize is the value you set the sizeLimit property of the
// validation option. We assume it is `null` here. If you are performing
// validation, then change this to match the integer value you specified
// otherwise your policy document will be invalid.
// http://docs.fineuploader.com/branch/develop/api/options.html#validation-option
$expectedMaxSize = (isset($_ENV['S3_MAX_FILE_SIZE']) ? $_ENV['S3_MAX_FILE_SIZE'] : null);
I also changed this:
// Only needed in cross-origin setups
function handleCorsRequest() {
// If you are relying on CORS, you will need to adjust the allowed domain here.
header('Access-Control-Allow-Origin: http://test.mydomain.com');
}
The POST seems to work:
POST http://test.mydomain.com/somepath/endpoint.php 200 OK
318ms
...but that's where the success ends.
I think part of the problem is that I'm not sure what to enter for "clientPrivateKey". Is that my "Secret Access Key" I set up with IAM?
And I'm definitely unclear on where I get the serverPublicKey and serverPrivateKey. Where am I generating a key-pair on the S3? I've combed through the docs, and perhaps I missed it.
Thank you in advance for your assistance!
First off, you are using endpoint-cors.php in a non-CORS environment. Communication between the browser and your endpoint appears to be same-origin, based on the URL of your signature endpoint. Switch to the endpoint.php example.
Regarding your questions about the keys, you should have create two distinct IAM users: one for client-side operations (heavily restricted) and one for server-side operations (an admin user). For each user, you'll have an access key (public) and a secret key (private). You always supply Fine Uploader with your client-side public key, and use your client-side private key to sign requests server-side. To perform other, more restricted operations (such as deleting files), you should use your server user keys.

How do I download a file or photo that was sent to my Telegram bot?

I am using the telegram bot API but I cant see anyway to download a filé that was sent to my bot. I get a hash of the file but dont know what to do with it. Is there any way? Thanks.
This is now available!
https://core.telegram.org/bots/api#getfile
Hooray! It was added on Sep 18th (2015):
https://core.telegram.org/bots/api
Usage:
In the JSON of the message you will receive a file_id as before. An example of a message object with a voice file:
{
message_id: 2675,
from: {
id: 10000001,
first_name: 'john',
username: 'john'
},
chat: {
id: 10000001,
first_name: 'john',
username: 'john'
},
date: 1442848171,
voice: {
duration: 2,
mime_type: 'audio/ogg',
file_id: 'AwADBAADYwADO1wlBuF1ogMa7HnMAg', // <------- file_id
file_size: 17746
}
}
Via the API's getFile you can now get the required path information for the file:
https://api.telegram.org/bot<bot_token>/getFile?file_id=the_file_id
This will return an object with file_id, file_size and file_path. You can then use the file_path to download the file:
https://api.telegram.org/file/bot<token>/<file_path>
Note that this link will only be available for an hour. After an hour you can request another link. This means that if you want to host the file somehow and you rather avoid checking and re-checking for fresh links every time you serve it you might be better off downloading the file to your own hosting service.
The maximum size of a file obtained through this method is 20MB.
Error: Obtained when a file large than 20mb is used.(Shown below)
{"ok":false,"error_code":400,"description":"Bad Request: file is too big[size:1556925644]"}
From telegram's docs:
On success, a File object is returned. The file can then be downloaded via the link https://api.telegram.org/file/bot/<file_path>, where <file_path> is taken from the response. It is guaranteed that the link will be valid for at least 1 hour. When the link expires, a new one can be requested by calling getFile again.For the moment, bots can download files of up to 20MB in size.
Yay! It's just added at September 18, 2015
You can use getFile(file_id). This function returns a File object containing file_path. You can download the file through this address:
https://api.telegram.org/file/bot<token>/<file_path>
As mentioned in Telegram Bot API Documentation, the File object will be valid for about one hour. You should call getFile again to get a new File object if the old one expires.
If you are using pyTelegramBotAPI you can download your photo using this code:
raw = message.photo[2].file_id
path = raw+".jpg"
file_info = bot.get_file(raw)
downloaded_file = bot.download_file(file_info.file_path)
with open(path,'wb') as new_file:
new_file.write(downloaded_file)
The method to work with files is not available yet.
Source: telegram on twitter
https://twitter.com/telegram/status/614468951926509568
If you have the file_id then you need to use the sendDocument or sendPhoto methods, if you want to send to yourself, you need to tell your bot your user id or your chat id (the same in one-to-one chat).
Also, pay attention, that Telegram api (by webhook) provides thumbs prop, for images and gifs it will provide thumbnail of file. To get source file, you need to check root object file_id.

boto - more concise way to get key's value from bucket?

I'm trying to figure out a concise way to get data from s3 via boto
my current code looks like this. s3 manager is simply a class that does all the s3 setup for my app.
log.debug("generating downloader")
downloader = s3_manager()
log.debug("accessing bucket")
bucket_archive = downloader.s3_buckets['#archive']
log.debug("getting key")
key = bucket_archive.get_key(archive_filename)
log.debug("getting key into string")
source = key.get_contents_as_string()
the problem is that , looking at my debug logs, i'm making two requests to amazon s3:
key = bucket_archive.get_key(archive_filename)
source = key.get_contents_as_string()
looking at the docs [ http://boto.readthedocs.org/en/latest/ref/s3.html ] , it seems that the call to get_key checks to see if it exists , while the second call gets the actual data. does anyone know of a method to do both at once ? a more concise way of doing this with one request is preferable for our app.
The get_key() method performs a HEAD request on the object to verify that it exists. If you are certain that the bucket and key exist and would prefer not to have the overhead of a HEAD request, you can simply create a Key object directly. Something like this would work:
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket', validate=False)
key = bucket.new_key('myexistingkey')
contents = key.get_contents_as_string()
The validate=False on the call to get_bucket eliminates a GET request that also is intended to validate that the bucket exists.

How do I generate authenticated S3 urls using IAM roles?

When I had my hard-coded access key and secret key present, I was able to generate authenticated urls for users to view private files on S3. This was done with the following code:
import hmac
import sha
import urllib
import time
filename = "".join([s3_url_prefix, filename])
expiry = str(int(time.time()) + 3600)
h = hmac.new(
current_app.config.get("S3_SECRET_KEY", None),
"".join(["GET\n\n\n", expiry, "\n", filename]),
sha
)
signature = urllib.quote_plus(base64.encodestring(h.digest()).strip())
return "".join([
"https://s3.amazonaws.com",
filename,
"?AWSAccessKeyId=",
current_app.config.get("S3_ACCESS_KEY", None),
"&Expires=",
expiry,
"&Signature=",
signature
])
Which gave me something to the effect of https://s3.amazonaws.com/bucket_name/path_to_file?AWSAccessKeyId=xxxxx&Expires=5555555555&Signature=eBFeN32eBb2MwxKk4nhGR1UPhk%3D. Unfortunately, I am unable to store keys in config files for security reasons. For this reason, I switched over to IAM roles. Now, I get my keys using:
_iam = boto.connect_iam()
S3_ACCESS_KEY = _iam.access_key
S3_SECRET_KEY = _iam.secret_key
However, this gives me the error "The AWS Access Key Id you provided does not exist in our records.". From my research, I understand that this is because my IAM keys aren't the actual keys, but instead used with a token. My question therefore, is twofold:
How do I get the token programatically? It doesn't seem that there is a simple iam property I can use.
How do I send the token in the signature? I believe my signature should end up looking something like "".join(["GET\n\n\n", expiry, "\n", token, filename]), but I can't figure out what format to use.
Any help would be greatly appreciated.
There was a change to the generate_url method in https://github.com/boto/boto/commit/99e14f3df039997f54a5377cb6aecc83f22c2670 (June 2012) that made it possible to sign a URL using session credentials. That means you would need to be using a boto version 2.6.0 or later. If you are, you should be able to just do this:
import boto
s3 = boto.connect_s3()
url = s3.generate_url(expires_in=3600, method='GET', bucket='<bucket_name>', key='<key_name>')
What version are you using?