Unable to upload files to S3 through JMeter

I am trying to upload files to S3 through a PUT HTTP request from JMeter. I am specifying the URL in the 'Path' field and the file path and MIME type in the 'Files Upload' section.
I get an 'Access Denied' response from S3. The same URL works fine through Postman and the upload succeeds.
Any help on this?

Given you are able to successfully upload the file using Postman, you can simply record the associated request using JMeter.
Prepare JMeter for recording. The fastest and easiest way is to use the JMeter Templates feature: from JMeter's main menu choose File - Templates - Recording and click "Create".
Expand Workbench - HTTP(S) Test Script Recorder and click the "Start" button.
Run Postman using JMeter as the proxy server, for example:
C:\Users\Jayashree\AppData\Local\Postman\app-4.9.3\Postman.exe --proxy-server=localhost:8888
Put the file you need to upload in the "bin" folder of your JMeter installation.
Run the request in Postman - JMeter should record it under Test Plan - Thread Group - Recording Controller.
See the HTTP(S) Test Script Recorder documentation for more information.

Have you properly specified the AWS credentials in the JMeter PUT request? You need to specify the AWS access key and secret key.
Another solution would be to use the AWS Java SDK from a JSR223 sampler and make the call using Java code.

I have listed the steps to upload an image to an S3 bucket using JMeter below:
Requirements:
Java 9
aws-java-sdk-s3 JAR 1.11.313 dependencies link
Steps:
Copy the JAR files to JMeterHome/lib/ext/ of your JMeter installation.
Create a Test Plan and click on Thread Group.
Set Number of Threads, Ramp-up period and Loop Count to 1.
Right-click on the Thread Group and add a JSR223 Sampler.
Select Java as the language in the JSR223 Sampler.
Add the following code in the script section of the sampler.
import java.io.File;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;
String accessKey = "xxxxxxx";
String secretKey = "xxxxxxxxx";
String bucketName = "bucketname"; // specify bucket name
String region = "region"; // specify region, e.g. "us-east-1"
BasicAWSCredentials credentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(region)
        .withCredentials(new AWSStaticCredentialsProvider(credentials))
        .build();
TransferManager xfer_mgr = TransferManagerBuilder.standard()
        .withS3Client(s3)
        .withDisableParallelDownloads(false)
        .build();
File f = new File("xxx/image.jpg"); // specify the path to your image
String objectName = "newimage.jpg"; // the key (name) the image will have in the bucket
Upload xfer = xfer_mgr.upload(bucketName, objectName, f);
xfer.waitForCompletion();
xfer_mgr.shutdownNow();
For more reference, you may check this link.

Related

creating boto3 s3 client on Airflow with an s3 connection and s3 hook

I am trying to move my python code to Airflow. I have the following code snippet:
s3_client = boto3.client('s3',
                         region_name="us-west-2",
                         aws_access_key_id=aws_access_key_id,
                         aws_secret_access_key=aws_secret_access_key)
I am trying to recreate this s3_client using Airflow's S3 hook and S3 connection, but can't find a way to do it in any documentation without specifying the aws_access_key_id and the aws_secret_access_key directly in code.
Any help would be appreciated.
You need to define an AWS connection in Admin -> Connections or with the CLI (see the docs).
Once the connection is defined, you can use it in S3Hook.
Your connection object can be set as:
Conn Id: <your_choice_of_conn_id_name>
Conn Type: Amazon Web Services
Login: <aws_access_key>
Password: <aws_secret_key>
Extra: {"region_name": "us-west-2"}
In Airflow, the hooks wrap a Python package. Thus, if your code uses a hook, there shouldn't be a reason to import boto3 directly.
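For illustration, a minimal sketch of using the hook with a connection configured as above; the connection id my_aws_conn, the bucket name, and the file path are assumptions, and the import path shown is for Airflow 2 with the Amazon provider installed:
# Minimal sketch, assuming an Airflow connection named "my_aws_conn" of type
# "Amazon Web Services" configured as described above (import path is for
# Airflow 2 with the apache-airflow-providers-amazon package).
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

hook = S3Hook(aws_conn_id="my_aws_conn")

# The hook exposes convenience methods, e.g. uploading a local file:
hook.load_file(filename="/tmp/data.csv", key="data.csv", bucket_name="my-bucket", replace=True)

# If you still need the underlying boto3 client, the hook can hand it to you:
s3_client = hook.get_conn()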

Accessing NOAA Data via BigQuery

I am trying to access the NOAA data via BigQuery. I used the following code:
import os
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file("my-json-file-path-with-filename.json")
from google.cloud import bigquery
# Create a "Client" object
client = bigquery.Client(credentials=credentials)
But I am getting the following error:
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Can somebody please help? I have already created my service account but am still facing this issue.
import os
from google.cloud.bigquery.client import Client
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
bq_client = Client()
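Once the client authenticates, querying the public NOAA dataset is a regular BigQuery call. A small sketch, assuming the GSOD tables are what you are after (the specific table year is a placeholder):
# Sketch: query the public NOAA GSOD dataset once authentication works.
# The table year (gsod2020) is a placeholder; pick whichever year you need.
query = "SELECT * FROM `bigquery-public-data.noaa_gsod.gsod2020` LIMIT 10"
for row in bq_client.query(query).result():
    print(row)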

Uploading Image from google drive to amazon S3 using app script

I am integrating a Google Form with our backend system. In the form, we accept images via Google Drive. I am trying to move the Google Drive images to S3 whenever a form is submitted.
I am using this to fetch an image from Google Drive:
var driveFile = DriveApp.getFileById("imageId");
For uploading the image to S3, I am using the Apps Script library S3-for-Google-Apps-Script. The file is uploaded to S3, but the format is not correct.
The code for uploading the image to S3 is:
var s3 = S3.getInstance(awsAccessKeyId, awsSecretKey);
s3.putObject(bucket, "file name", driveFile.getBlob(), {logRequests:true});
I am not able to open the image after downloading it from S3.
I get the error "It may be damaged or use a file format that Preview doesn’t recognize."
Thanks in advance.
First run:
pip install boto3 googledrivedownloader requests
Then use the code given below:
import boto3
from google_drive_downloader import GoogleDriveDownloader as gdd
import os

ACCESS_KEY = 'get-from-aws'
SECRET_KEY = 'get-from-aws'
SESSION_TOKEN = 'not-mandatory'
REGION_NAME = 'ap-southeast-1'
BUCKET = 'dev-media-uploader'

def drive_to_s3_download(drive_url):
    if "drive.google" not in drive_url:
        return drive_url  # not a Drive URL, return it unchanged

    client = boto3.client(
        's3',
        aws_access_key_id=ACCESS_KEY,
        aws_secret_access_key=SECRET_KEY,
        region_name=REGION_NAME,
        aws_session_token=SESSION_TOKEN  # optional
    )

    # Extract the file id from a standard Drive share URL
    file_id = drive_url.split('/')[5]
    print(file_id)

    # Download the Drive file locally, upload it to S3, then clean up
    gdd.download_file_from_google_drive(file_id=file_id,
                                        dest_path=f'./{file_id}.jpg',
                                        unzip=True)
    client.upload_file(Bucket=BUCKET, Key=f"{file_id}.jpg", Filename=f'./{file_id}.jpg')
    os.remove(f'./{file_id}.jpg')

    return f'https://{BUCKET}.s3.amazonaws.com/{file_id}.jpg'

# Module-level client, used for the commented-out download test below
client = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    region_name=REGION_NAME,
    aws_session_token=SESSION_TOKEN  # optional
)

# client.download_file(Bucket='test-bucket-drive-to-s3-upload', Key=, Filename=f'./test.jpg')
print(drive_to_s3_download('https://drive.google.com/file/d/1Wlr1PdAv8nX0qt_PWi0SJpx0IYgQDYG6/view?usp=sharing'))
The above code downloads the Drive file locally, uploads it to S3, and returns the S3 URL, through which the file can be viewed by anyone, depending on the bucket/object permissions.
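If the uploaded object still does not open as an image, it may help to set the content type explicitly on upload; a hedged sketch of the upload_file call inside drive_to_s3_download (the image/jpeg MIME type is an assumption about your files):
# Optional: set the Content-Type so browsers and preview tools treat the object as an image.
# 'image/jpeg' is an assumption about the files being uploaded.
client.upload_file(
    Bucket=BUCKET,
    Key=f"{file_id}.jpg",
    Filename=f'./{file_id}.jpg',
    ExtraArgs={'ContentType': 'image/jpeg'}
)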

Pyspark not using TemporaryAWSCredentialsProvider

I'm trying to read files from S3 with PySpark using temporary session credentials, but I keep getting the error:
Received error response: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 403, AWS Service: null, AWS Request ID: XXXXXXXX, AWS Error Code: null, AWS Error Message: Forbidden, S3 Extended Request ID: XXXXXXX
I think the issue might be that the S3A connection needs to use org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider in order to pull in the session token in addition to the standard access key and secret key, but even with setting the fs.s3a.aws.credentials.provider configuration variable, it is still attempting to authenticate with BasicAWSCredentialsProvider. Looking at the logs I see:
DEBUG AWSCredentialsProviderChain:105 - Loading credentials from BasicAWSCredentialsProvider
I've followed the directions here to add the necessary configuration values, but they do not seem to make any difference. Here is the code I'm using to set it up:
import os
import sys
import pyspark
from pyspark.sql import SQLContext
from pyspark.context import SparkContext
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.amazonaws:aws-java-sdk-pom:1.11.83,org.apache.hadoop:hadoop-aws:2.7.3 pyspark-shell'
sc = SparkContext()
sc.setLogLevel("DEBUG")
sc._jsc.hadoopConfiguration().set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", os.environ.get("AWS_ACCESS_KEY_ID"))
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", os.environ.get("AWS_SECRET_ACCESS_KEY"))
sc._jsc.hadoopConfiguration().set("fs.s3a.session.token", os.environ.get("AWS_SESSION_TOKEN"))
sql_context = SQLContext(sc)
Why is TemporaryAWSCredentialsProvider not being used?
Which Hadoop version are you using?
S3A STS support was added in Hadoop 2.8.0, and this was the exact error message I got on Hadoop 2.7.
Wafle is right, it's 2.8+ only.
But you might be able to get away with setting the AWS_ environment variables and having the session secrets picked up that way, since AWS environment variable support has long been in there, and I think it will pick up AWS_SESSION_TOKEN.
See AWS docs
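A minimal sketch of that workaround, mirroring the suggestion above: export the temporary credentials before creating the SparkContext and drop the fs.s3a.* credential settings. The placeholder values are assumptions, and whether the session token is actually honoured depends on the Hadoop/AWS SDK versions in use:
# Sketch of the environment-variable workaround on Hadoop 2.7:
# rely on the AWS SDK's environment-variable credential support instead of fs.s3a.* settings.
import os
os.environ["AWS_ACCESS_KEY_ID"] = "<temporary access key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<temporary secret key>"
os.environ["AWS_SESSION_TOKEN"] = "<session token>"

# ...then create the SparkContext as before, without setting the fs.s3a credential properties.
sc = SparkContext()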

S3 Python client with boto3 SDK

I'd like to make a Python S3 client to store data in the S3 Dynamic Storage service provided by the appcloud. I've discovered the boto3 SDK for Python and was wondering how this works on the appcloud. Locally you install the AWS CLI to configure your credentials, but how do you do that on the cloud? Does someone have experience with creating an S3 Python client for the internal appcloud and could provide me with a short example (boto3 or a different approach)?
Greetings
Edit 1:
Tried this:
import boto3
s3 = boto3.client('s3', endpoint_url='https://ds31s3.swisscom.com/', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET)
s3.create_bucket(Bucket="sc-testbucket1234")
But I got this exception:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://ds31s3.swisscom.com"
import boto3

conn = boto3.resource('s3',
                      region_name='eu-west-1',
                      endpoint_url='https://x',
                      aws_access_key_id='xx',
                      aws_secret_access_key='xx')
conn.create_bucket(Bucket="bucketname")
Works with this configuration (with Python 3.5):
import boto3
conn = boto3.resource('s3', region_name='eu-west-1', endpoint_url=HOST, aws_access_key_id=KEY, aws_secret_access_key=SECRET)
conn.create_bucket(Bucket="pqdjmalsdnf12098")
Thanks to #user3080315
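For completeness, a small sketch of actually storing and reading back an object with the resource created above (the bucket name and key are placeholders):
# Sketch: store and read back an object using the boto3 resource created above.
# The bucket name and key are placeholders.
bucket = conn.Bucket("sc-testbucket1234")
bucket.put_object(Key="hello.txt", Body=b"hello from the appcloud")

obj = conn.Object("sc-testbucket1234", "hello.txt")
print(obj.get()["Body"].read())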