Can I set a timeout when writing an object to S3?

I'm trying to upload files to S3, but if a file takes too long to upload I'd like a Timeout exception to be raised.
I've tried using the Config object from botocore to set a timeout of 1 second, but the upload still spends more than 5 seconds uploading. I would expect an exception to be raised after 1 second.
import requests
from boto3 import Session
from botocore.config import Config
url = 'https://www.hq.nasa.gov/alsj/a17/A17_FlightPlan.pdf' # a large pdf
response = requests.get(url)
session = Session()
config = Config(connect_timeout=1, read_timeout=1, retries={'max_attempts': 0})
client = session.client('s3', config=config)
client.put_object(Body=response.content, Bucket='test', Key='test.pdf')
Does boto3 have a configuration option that allows me to prevent long uploads?
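A minimal client-side sketch, assuming boto3 itself has no single overall-deadline setting (connect_timeout and read_timeout apply to individual socket operations rather than the whole request), is to run put_object in a worker thread and enforce the deadline yourself:
from concurrent.futures import ThreadPoolExecutor, TimeoutError

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(client.put_object, Body=response.content, Bucket='test', Key='test.pdf')
try:
    future.result(timeout=1)  # raises TimeoutError if the upload takes longer than 1 second
except TimeoutError:
    print('upload exceeded the deadline')
finally:
    executor.shutdown(wait=False)  # don't block; the worker thread may still be uploading
Note that the underlying upload keeps running in the background; the deadline only controls how long the caller waits for it.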

Related

Does AWS SageMaker support gRPC prediction requests?

I deployed a SageMaker TensorFlow model from an estimator in local mode, and when trying to call the TensorFlow Serving (TFS) predict endpoint using gRPC I get the error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
I'm doing the gRPC request exactly as in this blog post:
import grpc
import numpy as np  # used for np.array below
from tensorflow.compat.v1 import make_tensor_proto
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
grpc_port = 9000 # Tried also with other ports such as 8500
request = predict_pb2.PredictRequest()
request.model_spec.name = 'model'
request.model_spec.signature_name = 'serving_default'
request.inputs['input_tensor'].CopyFrom(make_tensor_proto(instance))
options = [
    ('grpc.enable_http_proxy', 0),
    ('grpc.max_send_message_length', MAX_GRPC_MESSAGE_LENGTH),
    ('grpc.max_receive_message_length', MAX_GRPC_MESSAGE_LENGTH)
]
channel = grpc.insecure_channel(f'0.0.0.0:{grpc_port}', options=options)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
result_future = stub.Predict.future(request, 30)
output_tensor_proto = result_future.result().outputs['predictions']
output_shape = [dim.size for dim in output_tensor_proto.tensor_shape.dim]
output_np = np.array(output_tensor_proto.float_val).reshape(output_shape)
prediction_json = {'predictions': output_np.tolist()}
Looking at the SageMaker Docker container where TFS is running, I see in the logs that the REST endpoint is exported/exposed but the gRPC one is not, although the gRPC server itself seems to be running:
tensorflow_serving/model_servers/server.cc:417] Running gRPC ModelServer at 0.0.0.0:9000 ...
Unlike for gRPC, in the container logs I can see that the REST endpoint is exported:
tensorflow_serving/model_servers/server.cc:438] Exporting HTTP/REST API at:localhost:8501 ...
Do SageMaker TFS containers even support gRPC? How can one make a gRPC TFS prediction request using SageMaker?
SageMaker endpoints are REST endpoints. You can, however, make gRPC connections within the container. You cannot make the InvokeEndpoint API call via gRPC.
If you are using the SageMaker TensorFlow container, you need to pass an inference.py script that contains the logic to make the gRPC request to TFS.
Kindly see this example inference.py script that makes a gRPC prediction against TensorFlow Serving.
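As a rough illustration of what such a script might contain, the sketch below forwards the incoming request to TFS over gRPC from inside the container; the handler signature, gRPC port, model name, and tensor names are assumptions for illustration, not taken from the linked example:
import json
import grpc
from tensorflow.compat.v1 import make_tensor_proto
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

def handler(data, context):
    # Parse the REST payload that the SageMaker endpoint forwards to the container
    instances = json.loads(data.read().decode('utf-8'))['instances']

    # Build the gRPC PredictRequest for the TFS process running in the same container
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'model'
    request.model_spec.signature_name = 'serving_default'
    request.inputs['input_tensor'].CopyFrom(make_tensor_proto(instances))

    # TFS listens for gRPC on localhost inside the container (port assumed here)
    channel = grpc.insecure_channel('localhost:9000')
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    result = stub.Predict(request, 30.0)  # 30-second deadline

    outputs = result.outputs['predictions']
    shape = [dim.size for dim in outputs.tensor_shape.dim]
    body = json.dumps({'predictions': list(outputs.float_val), 'shape': shape})
    return body, 'application/json'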

Unable to set a cookie for GitHub using Selenium WebDriver

I tried to set a cookie for GitHub using Selenium, but it always failed. After deeper analysis, I found that it was throwing an exception when setting a cookie with the name __Host-user_session_same_site. This seems very strange and I would like to know the reason for this phenomenon.
from selenium import webdriver
from selenium.webdriver.edge.options import Options
from selenium.webdriver.edge.service import Service
import json
import time

driveroptions = Options()
driveroptions.use_chromium = True
driveroptions.add_argument('--start-maximized')
driveroptions.binary_location = r'C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe'
service = Service(
    executable_path=r'C:\Program Files (x86)\Microsoft\Edge\Application\msedgedriver.exe')
driver = webdriver.Edge(options=driveroptions, service=service)
driver.set_page_load_timeout(60)
driver.implicitly_wait(3)

driver.get("https://github.com")
driver.maximize_window()
driver.delete_all_cookies()

with open('cookies.txt', 'r') as f:
    cookies_list = json.load(f)

for cookie in cookies_list:
    cookie['expiry'] = int(time.time() + 10000)
    new_cookie = {k: cookie[k] for k in {'name', 'value', 'domain', 'path', 'expiry'}}
    # if cookie['name'] == '__Host-user_session_same_site':
    #     continue
    driver.add_cookie(new_cookie)
Before that, cookies.txt was exported using f.write(json.dumps(driver.get_cookies())) after I logged into GitHub. If I uncomment the skip above, everything works fine. Otherwise, the program throws an exception: selenium.common.exceptions.UnableToSetCookieException: Message: unable to set cookie. I don't quite understand what is so special about a cookie with this name (__Host-user_session_same_site).
My runtime environment information is as follows.
MicrosoftEdge=103.0.1264.62
MsEdgeDriver=103.0.1264.62
I would be very grateful if I could get your help.
This cookie is set to ensure that browsers that support SameSite cookies can check to see if a request originates from GitHub.
You will find that only this cookie has a sameSite value of Strict; the others are Lax. That is why everything works fine when you skip it. You can set this cookie separately by adding this code:
driver.add_cookie({'name': '__Host-user_session_same_site', 'value': 'itsValue', 'sameSite': 'Strict'})
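If you would rather keep the loop from the question, a sketch of the same idea is to stop dropping the sameSite and secure attributes when rebuilding each cookie (the key names follow the question's code; this assumes a Selenium 4 add_cookie that accepts a sameSite key):
for cookie in cookies_list:
    cookie['expiry'] = int(time.time() + 10000)
    # keep sameSite and secure when the exported cookie has them, so
    # __Host-user_session_same_site is re-added with sameSite=Strict
    wanted = {'name', 'value', 'domain', 'path', 'expiry', 'secure', 'sameSite'}
    new_cookie = {k: cookie[k] for k in wanted if k in cookie}
    driver.add_cookie(new_cookie)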

Reading a feather file from S3 using an API Gateway proxy for S3

I am trying to read a feather file from an S3 bucket using an API Gateway proxy for S3, with no luck. I have tried everything, but every time I get the error below.
line 239, in read_table
reader = _feather.FeatherReader(source, use_memory_map=memory_map)
File "pyarrow\_feather.pyx", line 75, in pyarrow._feather.FeatherReader.__cinit__
File "pyarrow\error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\error.pxi", line 114, in pyarrow.lib.check_status
OSError: Verification of flatbuffer-encoded Footer failed.
I am able to read the same feather file locally, so it looks like nothing is wrong with the feather file itself. It has something to do with how API Gateway is returning the response.
I tried importing this OpenAPI specification for API Gateway to rule out configuration issues, but I still get the same error.
Can anyone please guide me here?
Python code #1
import boto3
from io import BytesIO
import pandas as pd
s3_data = pd.read_feather("https://<api_gateway>/final/s3?key=naxi143/data.feather")
Python code #2
import pandas as pd
import requests
import io
header = {'Accept': 'application/octet-stream'}
resp = requests.get(
    'https://<api_gateway>/final/s3?key=naxi143/data.feather',
    stream=True,
    headers=header
)
# print(resp.json())
resp.raw.decode_content = True
mem_fh = io.BytesIO(resp.raw.read())
print(mem_fh)
pd.read_feather(mem_fh)
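As a sanity check, a short sketch like the one below reads the same object directly from S3 with boto3, bypassing API Gateway entirely; if this works, the problem lies in how the proxy returns the binary body. The bucket name is a placeholder and the key is taken from the question's URL:
import io

import boto3
import pandas as pd

# Direct S3 read, no API Gateway in between.
# 'my-bucket' is a placeholder; the key comes from the question's ?key= parameter.
s3 = boto3.client('s3')
obj = s3.get_object(Bucket='my-bucket', Key='naxi143/data.feather')
df = pd.read_feather(io.BytesIO(obj['Body'].read()))
print(df.shape)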

Uploading an image from Google Drive to Amazon S3 using Apps Script

I am integrating a Google Form with our backend system. The form accepts images, which are stored on Google Drive. I am trying to move the Google Drive images to S3 whenever the form is submitted.
I am using this to fetch an image from Google Drive:
var driveFile = DriveApp.getFileById("imageId");
For uploading the image to S3, I am using the Apps Script library S3-for-Google-Apps-Script. The file is uploaded to S3, but the format is not correct.
The code for uploading the image to S3 is:
var s3 = S3.getInstance(awsAccessKeyId, awsSecretKey);
s3.putObject(bucket, "file name", driveFile.getBlob(), {logRequests:true});
I am not able to open the image after downloading it from S3.
I get the error "It may be damaged or use a file format that Preview doesn’t recognize."
Thanks in advance.
First run:
pip install boto3 googledrivedownloader requests
Then use the code given below:
import boto3
from google_drive_downloader import GoogleDriveDownloader as gdd
import os

ACCESS_KEY = 'get-from-aws'
SECRET_KEY = 'get-from-aws'
SESSION_TOKEN = 'not-mandatory'
REGION_NAME = 'ap-southeast-1'
BUCKET = 'dev-media-uploader'

def drive_to_s3_download(drive_url):
    if "drive.google" not in drive_url:
        return drive_url  # not a Drive URL, nothing to transfer

    client = boto3.client(
        's3',
        aws_access_key_id=ACCESS_KEY,
        aws_secret_access_key=SECRET_KEY,
        region_name=REGION_NAME,
        aws_session_token=SESSION_TOKEN  # optional
    )

    # Extract the file id from the Drive URL and download the file locally
    file_id = drive_url.split('/')[5]
    print(file_id)
    gdd.download_file_from_google_drive(file_id=file_id,
                                        dest_path=f'./{file_id}.jpg',
                                        unzip=True)

    # Upload the local copy to S3, then clean it up
    client.upload_file(Bucket=BUCKET, Key=f"{file_id}.jpg", Filename=f'./{file_id}.jpg')
    os.remove(f'./{file_id}.jpg')
    return f'https://{BUCKET}.s3.amazonaws.com/{file_id}.jpg'

client = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=REGION_NAME,
    aws_session_token=SESSION_TOKEN  # optional
)

# client.download_file(Bucket='test-bucket-drive-to-s3-upload', Key=, Filename=f'./test.jpg')
print(drive_to_s3_download('https://drive.google.com/file/d/1Wlr1PdAv8nX0qt_PWi0SJpx0IYgQDYG6/view?usp=sharing'))
The code above downloads the Drive file locally, uploads it to S3, and returns the S3 URL, which anyone can use to view the file, depending on the bucket's permissions.

Unable to upload files to S3 through JMeter

I am trying to upload files to S3 through a PUT HTTP request from JMeter. I am specifying the URL in the 'Path' field, and the file path and MIME type in the 'Files Upload' section.
I get an 'Access Denied' response from S3. The same URL works fine through Postman and the upload succeeds.
Any help on this?
Given that you are able to successfully upload the file using Postman, you can simply record the associated request using JMeter.
Prepare JMeter for recording. The fastest and easiest way is the JMeter Templates feature: from JMeter's main menu choose File - Templates - Recording and click "Create".
Expand Workbench - HTTP(S) Test Script Recorder and click the "Start" button.
Run Postman using JMeter as the proxy server, like:
C:\Users\Jayashree\AppData\Local\Postman\app-4.9.3\Postman.exe --proxy-server=localhost:8888
Put the file you need to upload into the "bin" folder of your JMeter installation.
Run the request in Postman - JMeter should record it under Test Plan - Thread Group - Recording Controller.
See HTTP(S) Test Script Recorder documentation for more information.
Have you properly specified the AWS credentials in the JMeter PUT request? You need to specify the AWS access key and secret key.
Another solution would be to use the AWS Java SDK from a JSR223 sampler, and make the call using Java code.
I have listed the steps to upload an image to an S3 bucket using JMeter below:
Requirements:
Java 9
aws-java-sdk-s3 JAR 1.11.313 dependencies link
Steps:
Copy the JAR files to JMeterHome/lib/ext/ of your JMeter installation.
Create a Test Plan and click on Thread Group.
Set Number of Threads, Ramp-up period and Loop Count to 1.
Right-click on the Thread Group and add a JSR223 Sampler.
Select Java as the language in the JSR223 Sampler.
Add the following code in the script section of the sampler.
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import com.amazonaws.auth.AWSSessionCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.AmazonS3Exception;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.PutObjectResult;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectInputStream;
import com.amazonaws.regions.Regions;
import com.amazonaws.regions.Region;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.transfer.Download;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;
String accessKey = "xxxxxxx";
String secretKey = "xxxxxxxxx";
String bucketName = "bucketname"; //specify bucketname
String region = "region"; //specify region
BasicAWSCredentials sessionCredentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(region)
        .withCredentials(new AWSStaticCredentialsProvider(sessionCredentials))
        .build();
TransferManager xfer_mgr = TransferManagerBuilder.standard()
        .withS3Client(s3)
        .withDisableParallelDownloads(false)
        .build();
File f = new File("xxx/image.jpg"); //specify path to your image
String objectName = "newimage.jpg"; // the key under which the image will appear in the bucket
Upload xfer = xfer_mgr.upload(bucketName, objectName, f);
xfer.waitForCompletion();
xfer_mgr.shutdownNow();
For more reference, you may check this link.