Accessing NOAA Data via BigQuery - google-bigquery

I am trying to access NOAA data via BigQuery, using the following code:
import os
from google.oauth2 import service_account
from google.cloud import bigquery

credentials = service_account.Credentials.from_service_account_file("my-json-file-path-with-filename.json")

# Create a "Client" object
client = bigquery.Client(credentials=credentials)
But I get the following error:
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Can somebody please help? I have already created my service account but am still facing this issue.

import os
from google.cloud.bigquery.client import Client

# Point the environment variable at your service-account key file;
# the client picks it up automatically at construction time.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
bq_client = Client()
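Once the client authenticates, you can query the public NOAA data directly. A minimal sketch, assuming the bigquery-public-data.noaa_gsod public dataset and a billing-enabled project (table and column names are illustrative; check the dataset schema in the BigQuery console):
import os
from google.cloud.bigquery.client import Client

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_file'
bq_client = Client()

# Pull a small sample of daily weather observations.
query = """
    SELECT stn, year, mo, da, temp
    FROM `bigquery-public-data.noaa_gsod.gsod2020`
    LIMIT 10
"""
for row in bq_client.query(query).result():
    print(row.stn, row.temp)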

Related

Authentication Failure when Accessing Azure Blob Storage through Connection String

We get an authentication failure when we try to create an Azure blob client from a connection string, using the Python v12 SDK with Azure Blob Storage v12.5.0 and Azure Core 1.8.2.
I used
azure-storage-blob == 12.5.0
azure-core == 1.8.2
I tried to access my blob storage account using a connection string with the Python v12 SDK and received the error below. The environment I'm running in is a Python venv in NixShell.
The code calling blob_upload is as follows:
blob_service_client = BlobServiceClient(account_url=<>, credential=<>)
blob_client = blob_service_client.get_blob_client(container=container_name, blob=file)
I printed out blob_client and it looks normal, but the next line, upload_blob, raises the error.
with open(os.path.join(root, file), "rb") as data:
    blob_client.upload_blob(data)
The error message is as follows:
File "<local_address>/.venv/lib/python3.8/site-packages/azure/storage/blob/_upload_helpers.py", in upload_block_blob
return client.upload(
File "<local_address>/.venv/lib/python3.8/site-packages/azure/storage/blob/_generated/operations/_block_blob_operations.py", in upload
raise models.StorageErrorException(response, self._deserialize)
azure.storage.blob._generated.models._models_py3.StorageErrorException: Operation returned an invalid status 'Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.'
So I printed out the HTTP PUT request to Azure Blob Storage and got a [403] response.
The following code works for me with the same package versions as yours:
from azure.storage.blob import BlobServiceClient
blob=BlobServiceClient.from_connection_string(conn_str="your connect string in Access Keys")
with open("./SampleSource.txt", "rb") as data:
blob.upload_blob(data)
Please check your connection string, and also check your PC's clock: Shared Key authentication signs a timestamped request, so a skewed system time can invalidate the signature.
There is a similar issue about the error: AzureStorage Blob Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature
UPDATE:
I tried this code and got the same error:
from azure.storage.blob import BlobServiceClient
from azure.identity import DefaultAzureCredential
token_credential = DefaultAzureCredential()
blob_service_client = BlobServiceClient(account_url="https://pamelastorage123.blob.core.windows.net/", credential=token_credential)
blob_client = blob_service_client.get_blob_client(container="pamelac", blob="New Text Document.txt")
with open("D:/demo/python/New Text Document.txt", "rb") as data:
    blob_client.upload_blob(data)
Then I used AzureCliCredential() instead of DefaultAzureCredential(), authenticated via the Azure CLI with az login, and it worked.
If you use the environment credential, you need to set the corresponding environment variables. In any case, I recommend using a specific credential type instead of DefaultAzureCredential.
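For reference, the working variant only swaps the credential class. A minimal sketch, assuming a prior az login and the same account URL and blob names as above:
from azure.identity import AzureCliCredential
from azure.storage.blob import BlobServiceClient

# AzureCliCredential reuses the token obtained from `az login`.
token_credential = AzureCliCredential()
blob_service_client = BlobServiceClient(account_url="https://pamelastorage123.blob.core.windows.net/", credential=token_credential)
blob_client = blob_service_client.get_blob_client(container="pamelac", blob="New Text Document.txt")
with open("D:/demo/python/New Text Document.txt", "rb") as data:
    blob_client.upload_blob(data)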
For more details about Azure Identity, see here.

Pyspark not using TemporaryAWSCredentialsProvider

I'm trying to read files from S3 with PySpark using temporary session credentials, but I keep getting the error:
Received error response: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 403, AWS Service: null, AWS Request ID: XXXXXXXX, AWS Error Code: null, AWS Error Message: Forbidden, S3 Extended Request ID: XXXXXXX
I think the issue might be that the S3A connector needs to use org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider in order to pull in the session token in addition to the standard access key and secret key. But even with the fs.s3a.aws.credentials.provider configuration variable set, it still attempts to authenticate with BasicAWSCredentialsProvider. Looking at the logs I see:
DEBUG AWSCredentialsProviderChain:105 - Loading credentials from BasicAWSCredentialsProvider
I've followed the directions here to add the necessary configuration values, but they do not seem to make any difference. Here is the code I'm using to set it up:
import os
import sys
import pyspark
from pyspark.sql import SQLContext
from pyspark.context import SparkContext
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages com.amazonaws:aws-java-sdk-pom:1.11.83,org.apache.hadoop:hadoop-aws:2.7.3 pyspark-shell'
sc = SparkContext()
sc.setLogLevel("DEBUG")
sc._jsc.hadoopConfiguration().set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", os.environ.get("AWS_ACCESS_KEY_ID"))
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", os.environ.get("AWS_SECRET_ACCESS_KEY"))
sc._jsc.hadoopConfiguration().set("fs.s3a.session.token", os.environ.get("AWS_SESSION_TOKEN"))
sql_context = SQLContext(sc)
Why is TemporaryAWSCredentialsProvider not being used?
Which Hadoop version are you using?
S3A STS support was added in Hadoop 2.8.0, and this was the exact error message I got on Hadoop 2.7.
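If you can upgrade, the only change the setup code above needs is the hadoop-aws artifact version in PYSPARK_SUBMIT_ARGS. A sketch, with an assumed 2.8.x version; the matching aws-java-sdk release depends on your Hadoop build:
import os

# 2.8.2 is illustrative; use the hadoop-aws release that matches your Hadoop
# installation and let it resolve its own aws-java-sdk dependency.
os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.hadoop:hadoop-aws:2.8.2 pyspark-shell'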
Wafle is right, it's 2.8+ only.
But you might be able to get away with setting the AWS_* environment variables and having the session secrets picked up that way: AWS environment variable support has long been in there, and I think it will pick up AWS_SESSION_TOKEN.
See AWS docs
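A minimal sketch of that environment-variable route, assuming local mode, where the driver JVM inherits the Python process environment if the variables are set before SparkContext() is created:
import os
from pyspark.context import SparkContext

# Placeholders: set these (or export them in the shell) before the JVM starts.
os.environ['AWS_ACCESS_KEY_ID'] = 'xxx'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'xxx'
os.environ['AWS_SESSION_TOKEN'] = 'xxx'

# No fs.s3a.* credential settings needed; the SDK's environment-variable
# provider in the default chain should pick these up.
sc = SparkContext()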

Authentication error in Google Colaboratory

I'm trying to run the BigQuery sample provided in the Colab samples, but I always get the following error from GCP (billing is enabled for the project). All the required permissions are clicked and approved in the OAuth tab. Any insight into this issue would be helpful.
import pandas as pd
project_id = 'my-project-id-178304'
from google.colab import auth
auth.authenticate_user()
WARNING:google.auth._default:No project ID could be determined from the credentials at GOOGLE_APPLICATION_CREDENTIALS. Consider setting the GOOGLE_CLOUD_PROJECT environment variable
This is just a warning which is (in this case) totally harmless.
Happily, it's already been fixed upstream.
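If you want to silence the warning anyway, a minimal sketch is to name the project explicitly, either through the environment variable the message suggests or by passing project_id to the query call (pd.read_gbq assumes the pandas-gbq package, which Colab typically preinstalls):
import os
import pandas as pd
from google.colab import auth

auth.authenticate_user()
project_id = 'my-project-id-178304'

# Either line tells google-auth which project to use.
os.environ['GOOGLE_CLOUD_PROJECT'] = project_id
df = pd.read_gbq('SELECT 1 AS x', project_id=project_id, dialect='standard')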

S3 Python client with boto3 SDK

I'd like to make a Python S3 client to store data in the S3 Dynamic Storage service provided by the appcloud. I've discovered the boto3 SDK for Python and was wondering how it works on the appcloud. Locally you install the AWS CLI to configure your credentials, but how do you do that on the cloud? Does someone have experience with creating an S3 Python client for the internal appcloud, and could they provide a short example (boto3 or a different approach)?
Greetings
Edit 1:
Tried this:
import boto3
s3 = boto3.client('s3', endpoint_url='https://ds31s3.swisscom.com/', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET)
s3.create_bucket(Bucket="sc-testbucket1234")
But I got this exception:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://ds31s3.swisscom.com"
import boto3
conn = boto3.resource('s3',
                      region_name='eu-west-1',
                      endpoint_url='https://x',
                      aws_access_key_id='xx',
                      aws_secret_access_key='xx')
conn.create_bucket(Bucket="bucketname")
Works with this configuration (with Python 3.5):
import boto3
conn = boto3.resource('s3', region_name='eu-west-1', endpoint_url=HOST, aws_access_key_id=KEY, aws_secret_access_key=SECRET)
conn.create_bucket(Bucket="pqdjmalsdnf12098")
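Once the resource connects, uploading an object is one more call. A sketch, assuming a local file test.txt and the bucket created above (HOST, KEY and SECRET as in the previous snippet):
import boto3

conn = boto3.resource('s3', region_name='eu-west-1', endpoint_url=HOST, aws_access_key_id=KEY, aws_secret_access_key=SECRET)
# Upload a local file into the bucket; the second argument is the object key.
conn.Bucket("pqdjmalsdnf12098").upload_file("test.txt", "test.txt")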
Thanks to #user3080315

unable to upload files to S3 through jmeter

I am trying to upload files to S3 through a PUT HTTP request from JMeter. I specify the URL in the 'Path' field and the file path and MIME type in the 'Files Upload' section.
I get an 'Access Denied' response from S3. The same URL works fine through Postman, and the upload succeeds.
Any help on this?
Given you are able to successfully upload the file using Postman you can just record the associated request using JMeter.
Prepare JMeter for recording. The fastest and easiest way is using the JMeter Templates feature: from JMeter's main menu choose File - Templates - Recording and click "Create"
Expand Workbench - HTTP(S) Test Script Recorder and click the "Start" button
Run Postman using JMeter as proxy server like:
C:\Users\Jayashree\AppData\Local\Postman\app-4.9.3\Postman.exe --proxy-server=localhost:8888
Put the file you need to upload to the "bin" folder of your JMeter installation
Run the request in Postman - JMeter should record it under Test Plan - Thread Group - Recording Controller
See HTTP(S) Test Script Recorder documentation for more information.
Have you properly specified the AWS credentials in the JMeter PUT request? You need to specify the AWS access key and secret key.
Another solution would be to use the AWS Java SDK from a JSR223 sampler, and make the call using Java code.
I have listed the steps to upload an image to an S3 bucket using JMeter below:
Requirements:
Java 9
aws-java-sdk-s3 JAR 1.11.313 dependencies link
Steps:
Copy the JAR files to JMeterHome/lib/ext/.
Create a Test Plan and click on Thread Group.
Set Number of Threads, Ramp-up period and Loop Count to 1.
Right-click on the Thread Group and add a JSR223 sampler.
Select Java as the language in the JSR223 sampler.
Add the following code in the script section of the sampler.
import java.io.File;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;
String accessKey = "xxxxxxx";
String secretKey = "xxxxxxxxx";
String bucketName = "bucketname"; //specify bucketname
String region = "region"; //specify region
BasicAWSCredentials sessionCredentials = new BasicAWSCredentials(accessKey, secretKey);
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
.withRegion(region)
.withCredentials(new AWSStaticCredentialsProvider(sessionCredentials))
.build();
TransferManager xfer_mgr = TransferManagerBuilder.standard()
.withS3Client(s3)
.withDisableParallelDownloads(false)
.build();
File f = new File("xxx/image.jpg"); // specify the path to your image
String objectName = "newimage.jpg"; // the key (name) the image will have in the bucket
Upload xfer = xfer_mgr.upload(bucketName, objectName, f);
xfer.waitForCompletion();
xfer_mgr.shutdownNow();
For more reference, you may check this link.