Reading a Feather file from S3 using an API Gateway proxy for S3

I am trying to read a Feather file from an S3 bucket through an API Gateway proxy for S3, with no luck. I have tried everything, but every time I get the error below.
line 239, in read_table
reader = _feather.FeatherReader(source, use_memory_map=memory_map)
File "pyarrow\_feather.pyx", line 75, in pyarrow._feather.FeatherReader.__cinit__
File "pyarrow\error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\error.pxi", line 114, in pyarrow.lib.check_status
OSError: Verification of flatbuffer-encoded Footer failed.
I am able to read the same Feather file locally, so nothing seems to be wrong with the file itself; it must have something to do with how API Gateway returns the response.
I tried importing this OpenAPI specification for API Gateway to rule out configuration issues, but I still get the same error.
Can anyone please guide me here?
Python code #1
import pandas as pd
# Let pandas fetch the file straight from the API Gateway URL
s3_data = pd.read_feather("https://<api_gateway>/final/s3?key=naxi143/data.feather")
Python code #2
import io
import pandas as pd
import requests
header = {'Accept': 'application/octet-stream'}
resp = requests.get(
    'https://<api_gateway>/final/s3?key=naxi143/data.feather',
    stream=True,
    headers=header,
)
resp.raw.decode_content = True
mem_fh = io.BytesIO(resp.raw.read())
pd.read_feather(mem_fh)
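One thing worth checking: if the API has no binary media type configured (for example application/octet-stream or */*), API Gateway returns binary payloads base64-encoded, which would corrupt the flatbuffer footer in exactly this way. A minimal sketch of that check, assuming the same endpoint URL:
import base64
import io
import pandas as pd
import requests
resp = requests.get(
    'https://<api_gateway>/final/s3?key=naxi143/data.feather',
    headers={'Accept': 'application/octet-stream'},
)
data = resp.content
# Feather v1 files start with the magic bytes b'FEA1', Feather v2
# (Arrow IPC) files with b'ARROW1'; anything else suggests the payload
# was re-encoded in transit and needs decoding first.
if data[:4] != b'FEA1' and data[:6] != b'ARROW1':
    data = base64.b64decode(data)
df = pd.read_feather(io.BytesIO(data))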

Related

Accessing NOAA Data via BigQuery

I am trying to access the NOAA data via BigQuery. I used the following code:
from google.oauth2 import service_account
from google.cloud import bigquery
credentials = service_account.Credentials.from_service_account_file("my-json-file-path-with-filename.json")
# Create a "Client" object
client = bigquery.Client(credentials=credentials)
But I am getting the following error:
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Can somebody please help? I have already created my service account but am still facing this issue.
import os
from google.cloud.bigquery.client import Client
# Point the client at the key file via the standard environment variable
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
bq_client = Client()
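Alternatively, keep passing the credentials object explicitly and supply the project ID alongside it, since the client cannot always infer the project from the credentials alone. A sketch, assuming the same key file as above:
from google.cloud import bigquery
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file("my-json-file-path-with-filename.json")
# Service-account credentials carry the project ID of the owning project
client = bigquery.Client(credentials=credentials, project=credentials.project_id)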

Can I set a timeout when writing an object to S3?

I'm trying to upload files to S3, but if the file takes too long to upload I'd like a Timeout exception to be raised.
I've tried using the Config object from botocore to set a timeout of 1 second, but the upload still runs for more than 5 seconds. I would expect an exception to be raised after 1 second.
import requests
from boto3 import Session
from botocore.config import Config
url = 'https://www.hq.nasa.gov/alsj/a17/A17_FlightPlan.pdf' # a large pdf
response = requests.get(url)
session = Session()
config = Config(connect_timeout=1, read_timeout=1, retries={'max_attempts': 0})
client = session.client('s3', config=config)
client.put_object(Body=response.content, Bucket='test', Key='test.pdf')
Does boto3 have a configuration option that allows me to prevent long uploads?
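Not as a single option, as far as I know: connect_timeout and read_timeout bound how long the client waits to establish the connection or to receive data, and a slow upload is mostly socket writes, so neither timer fires while bytes are still trickling out. One workaround (a sketch, not a built-in boto3 feature) is to enforce an overall deadline yourself by running the upload in a worker thread:
import concurrent.futures
import requests
from boto3 import Session
url = 'https://www.hq.nasa.gov/alsj/a17/A17_FlightPlan.pdf'  # a large pdf
body = requests.get(url).content
client = Session().client('s3')
# Run the upload in a worker thread and stop waiting after 1 second.
# Note that this only abandons the wait: the underlying put_object call
# is not interrupted and may still complete in the background.
pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = pool.submit(client.put_object, Body=body, Bucket='test', Key='test.pdf')
try:
    future.result(timeout=1)
except concurrent.futures.TimeoutError:
    print('Upload took longer than 1 second')
finally:
    pool.shutdown(wait=False)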

How can I get the parsed remote address when I use urllib to open a host in Python 2.6?

I need to get the remote (peer) address when making a request to a host.
In Python 3, from the response returned by urlopen I can get the underlying socket and call socket.getpeername() on it to obtain the remote address. But in Python 2.6 urlopen is completely different from Python 3, and I have no idea how to get the remote address there.
First get the underlying Python socket object, then use getpeername():
from urllib2 import urlopen
resp = urlopen("http://baidu.com")
# Walk the private attributes down to the underlying socket (Python 2.x)
print resp.fp._sock.fp._sock.getpeername()

S3 Python client with boto3 SDK

I'd like to make a Python S3 client to store data in the S3 Dynamic Storage service provided by the appcloud. I've discovered the boto3 SDK for Python and was wondering how it works on the appcloud. Locally you install the AWS CLI to configure your credentials, but how do you do that on the cloud? Does someone have experience with creating an S3 Python client for the internal appcloud, and could you provide a short example (boto3 or a different approach)?
Greetings
Edit 1:
Tried this:
import boto3
s3 = boto3.client(
    's3',
    endpoint_url='https://ds31s3.swisscom.com/',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET,
)
s3.create_bucket(Bucket="sc-testbucket1234")
But I got this exception:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://ds31s3.swisscom.com"
import boto3
conn = boto3.resource(
    's3',
    region_name='eu-west-1',
    endpoint_url='https://x',
    aws_access_key_id='xx',
    aws_secret_access_key='xx',
)
conn.create_bucket(Bucket="bucketname")
Works with this configuration (with Python 3.5):
import boto3
conn = boto3.resource('s3', region_name='eu-west-1', endpoint_url=HOST, aws_access_key_id=KEY, aws_secret_access_key=SECRET)
conn.create_bucket(Bucket="pqdjmalsdnf12098")
Thanks to @user3080315
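To confirm the client actually reaches the endpoint before creating buckets, a cheap check (a sketch, using the same HOST/KEY/SECRET placeholders):
import boto3
conn = boto3.resource('s3', region_name='eu-west-1', endpoint_url=HOST, aws_access_key_id=KEY, aws_secret_access_key=SECRET)
# Listing buckets is a quick round-trip that fails fast on a wrong
# endpoint URL or bad credentials
for bucket in conn.buckets.all():
    print(bucket.name)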

Unable to upload files to S3 through JMeter

I am trying to upload files to S3 through a PUT HTTP request from JMeter. I am specifying the URL in the 'Path' field, and the file path and MIME type in the 'Files Upload' section.
I get an 'Access Denied' response from S3. The same URL works fine through Postman and the upload succeeds.
Any help on this?
Given you are able to successfully upload the file using Postman, you can just record the associated request using JMeter:
Prepare JMeter for recording. The fastest and easiest way is using the JMeter Templates feature: from JMeter's main menu choose File - Templates - Recording and click "Create"
Expand Workbench - HTTP(S) Test Script Recorder and click the "Start" button
Run Postman using JMeter as proxy server like:
C:\Users\Jayashree\AppData\Local\Postman\app-4.9.3\Postman.exe --proxy-server=localhost:8888
Put the file you need to upload to the "bin" folder of your JMeter installation
Run the request in Postman - JMeter should record it under Test Plan - Thread Group - Recording Controller
See HTTP(S) Test Script Recorder documentation for more information.
Have you properly specified the AWS credentials in the JMeter PUT request? You need to specify the AWS access key and secret key.
Another solution would be to use the AWS Java SDK from a JSR223 sampler, and make the call using Java code.
I have mentioned the steps to upload an image to an S3 bucket using JMeter below:
Requirements:
Java 9
aws-java-sdk-s3 JAR 1.11.313 dependencies link
Steps:
Copy the JAR files to JMeterHome/lib/ext/ of your JMeter installation.
Create a Test Plan and click on Thread Group.
Set Number of Threads, Ramp-up period and Loop Count to 1.
Right click on the Thread Group and add a JSR223 sampler.
Select Java as the language in the JSR223 sampler.
Add the following code in the script section of the sampler:
import java.io.File;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

String accessKey = "xxxxxxx";
String secretKey = "xxxxxxxxx";
String bucketName = "bucketname"; // specify bucket name
String region = "region";         // specify region, e.g. "us-east-1"

BasicAWSCredentials sessionCredentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withRegion(region)
        .withCredentials(new AWSStaticCredentialsProvider(sessionCredentials))
        .build();

// TransferManager handles multipart uploads and retries for us
TransferManager xfer_mgr = TransferManagerBuilder.standard()
        .withS3Client(s3)
        .withDisableParallelDownloads(false)
        .build();

File f = new File("xxx/image.jpg"); // specify the path to your image
String objectName = "newimage.jpg"; // the key the image will be stored under in the bucket

Upload xfer = xfer_mgr.upload(bucketName, objectName, f);
xfer.waitForCompletion(); // block until the upload finishes
xfer_mgr.shutdownNow();
For more reference, you may check this link.