Add custom S3 endpoint for Vertica backup

I am trying to back up a Vertica cluster to an S3-like data store (it supports the S3 protocol) internal to my enterprise network. It uses the same style of credentials (ACCESS KEY and SECRET KEY).
Here's what my .ini file looks like:
[S3]
s3_backup_path = s3://vertica_backups
s3_backup_file_system_path = []:/vertica/backups
s3_concurrency_backup = 10
s3_concurrency_restore = 10
[Transmission]
hardLinkLocal = True
[Database]
dbName = production
dbUser = dbadmin
dbPromptForPassword = False
[Misc]
snapshotName = fullbak1
restorePointLimit = 3
objectRestoreMode = createOrReplace
passwordFile = pwdfile
enableFreeSpaceCheck = True
Where can I supply my specific endpoint? For instance, my S3 store is available at a.b.c.d:80. I have tried changing it to s3_backup_path = a.b.c.d:80://wms_vertica_backups, but I get the error Error: Error in VBR config: Invalid s3_backup_path. Also, I have the ACCESS KEY and SECRET KEY in ~/.aws/credentials.
After going through more resources, I have exported the following environment variables: VBR_BACKUP_STORAGE_ENDPOINT_URL, VBR_BACKUP_STORAGE_ACCESS_KEY_ID, and VBR_BACKUP_STORAGE_SECRET_ACCESS_KEY. vbr init now throws the error Error: Unable to locate credentials Init FAILED. I'm guessing it is still trying to connect to the AWS S3 servers. (I have since removed the credentials from ~/.aws/credentials.)
It's worth adding that I'm running Vertica 8.1.1 in Enterprise Mode.

For anyone looking for something similar, the question was answered in the Vertica forum here

Related

Using a service account and JSON key which is sent to you to upload data into google cloud storage

I wrote a python script that uploads files from a local folder into Google cloud storage.
I also created a service account with sufficient permission and tested it on my computer using that service account JSON key and it worked.
Now I have sent the code and JSON key to someone else to run, but the authentication fails on her side.
Are we missing any authentication through GCP UI?
def config_gcloud():
    subprocess.run(
        [
            shutil.which("gcloud"),
            "auth",
            "activate-service-account",
            "--key-file",
            CREDENTIALS_LOCATION,
        ]
    )
    storage_client = storage.Client.from_service_account_json(CREDENTIALS_LOCATION)
    return storage_client

def file_upload(bucket, source, destination):
    storage_client = config_gcloud()
    ...
The error happens in config_gcloud, and it says it is expecting str, path, ... but gets NoneType.
As I said, the code is fine and works on my computer. How can another person use it with the JSON key I sent her? She stored the JSON locally, and the path to the JSON is in the code.
CREDENTIALS_LOCATION is None instead of the correct path, hence the complaint about getting NoneType instead of str | Path.
Also, you don't need that gcloud call; it only matters for gcloud/gsutil commands, not for the Python client library.
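A minimal sketch of what is actually needed, assuming the key file has been copied to her machine (the GCP_KEY_PATH variable name below is just an example, not something the original script uses):

import os
from google.cloud import storage

# Hypothetical: read the key path from an environment variable so each person
# can point it at wherever they saved the service-account JSON file.
CREDENTIALS_LOCATION = os.environ.get("GCP_KEY_PATH")
if CREDENTIALS_LOCATION is None:
    raise RuntimeError("Set GCP_KEY_PATH to the path of the service-account JSON key")

def config_gcloud():
    # No gcloud subprocess needed; the client reads the key file directly.
    return storage.Client.from_service_account_json(CREDENTIALS_LOCATION)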
And please post the actual stack trace of the error next time, not just a mistyped paraphrase of it.

How to enable s3 Copy Bucket Permissions in Terraform statement

My goal is to copy the data from a set of S3 buckets into the main logging account's bucket. Every time I try to perform:
aws s3 cp s3://sub-account-cloudtrail s3://master-acccount-cloudtrail --profile=admin;
I get
(AccessDenied) when calling the CopyObject operation: Access Denied
I've looked at this post:
How to fix AccessDenied calling CopyObject
I am trying to add the bucket permissions to a Terraform aws_iam_policy_document data source. The statement is written like so:
data "aws_iam_policy_document" "s3" {
  version = "2012-10-17"

  statement {
    sid    = "CopyOobjectPermissions"
    effect = "Allow"
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/ops-mgmt-admin"]
    }
    actions   = ["s3:GetObject", "s3:PutObject", "s3:PutObjectAcl"]
    resources = ["${aws_s3_bucket.nfcisbenchmark_cloudtrail.arn}/*"]
  }

  statement {
    sid     = "CopyBucketPermissions"
    actions = ["s3:ListBucket"]
    effect  = "Allow"
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/ops-mgmt-admin"]
    }
    resources = ["${aws_s3_bucket.nfcisbenchmark_cloudtrail.arn}/*"]
  }
}
My goal is to restrict the permissions to the role that is assumed from the sub-account to the master account. My specific question is: what permissions need to be added to enable copying?
Expected:
Terraform plan runs successfully
Actual:
│ Error: Error putting S3 policy: MalformedPolicy: Action does not apply to any resource(s) in statement
How can I resolve this?
Two things to mention:
In your second statement the resource is wrong; this is why you get the MalformedPolicy error. s3:ListBucket applies to the bucket itself, not to the objects in it, so it should be:
resources = [aws_s3_bucket.nfcisbenchmark_cloudtrail.arn]
Be careful with the identifier. At this point I'm not really sure whether your buckets are in different accounts or not. If they are, the account_id in the identifier should reference the source account. data.aws_caller_identity.current.account_id returns the account ID to which Terraform is authenticated, which usually is the account where you are deploying resources (the destination account). If you are not doing cross-account copying, then it is fine as it is.
Furthermore, in the case of cross-account access, the ops-mgmt-admin role should have a policy attached to it that gives access to get/list/upload objects to the S3 bucket.
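Once the policy applies cleanly, a quick way to check the copy path from code is a small Python/boto3 equivalent of the aws s3 cp call in the question (the object key below is a placeholder):

import boto3

# Uses the same named profile as the CLI command in the question; once the
# bucket policy grants GetObject/PutObject on the objects and ListBucket on
# the bucket itself, this copy should succeed.
session = boto3.Session(profile_name="admin")
s3 = session.client("s3")

s3.copy_object(
    Bucket="master-acccount-cloudtrail",   # destination bucket
    Key="some-object-key",                 # placeholder key
    CopySource={"Bucket": "sub-account-cloudtrail", "Key": "some-object-key"},
)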

Boto3 generate presigned url does not work

Here is the code I use to create an S3 client and generate a presigned URL; it is quite standard and has been running on the server for quite a while. I pulled the code out and ran it locally in a Jupyter notebook:
import os
import boto3

def get_s3_client():
    return get_s3(create_session=False)

def get_s3(create_session=False):
    session = boto3.session.Session() if create_session else boto3
    S3_ENDPOINT = os.environ.get('AWS_S3_ENDPOINT')
    if S3_ENDPOINT:
        AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
        AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
        AWS_DEFAULT_REGION = os.environ["AWS_DEFAULT_REGION"]
        s3 = session.client('s3',
                            endpoint_url=S3_ENDPOINT,
                            aws_access_key_id=AWS_ACCESS_KEY_ID,
                            aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                            region_name=AWS_DEFAULT_REGION)
    else:
        s3 = session.client('s3', region_name='us-east-2')
    return s3

s3 = get_s3_client()

BUCKET = [my-bucket-name]
OBJECT_KEY = [my-object-name]

signed_url = s3.generate_presigned_url(
    'get_object',
    ExpiresIn=3600,
    Params={
        "Bucket": BUCKET,
        "Key": OBJECT_KEY,
    }
)
print(signed_url)
print(signed_url)
When I tried to download the file using the URL in the browser, I got an error message saying "The specified key does not exist." I noticed in the error message that my object key becomes "[my-bucket-name]/[my-object-name]" rather than just "[my-object-name]".
Then I used the same bucket/key combination to generate a presigned URL with the AWS CLI, which works as expected. It turns out that the boto3 s3 client method somehow inserted [my-bucket-name] in front of [my-object-name] compared to the AWS CLI method. Here are the results.
From s3.generate_presigned_url()
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-bucket-name]/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAV17K253JHUDLKKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T175014Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=5cdcc38e5933e92b5xed07b58e421e5418c16942cb9ac6ac6429ac65c9f87d64
From aws cli s3 presign
https://[my-bucket-name].s3.us-east-2.amazonaws.com/[my-object-name]?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAYA7K15LJHUDAVKHB%2F20210520%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210520T155926Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=58208f91985bf3ce72ccf884ba804af30151d158d6ba410dd8fe9d2457369894
I've been working on this and searching for solutions for a day and a half, and I couldn't figure out what is wrong with my implementation. I guess I might have missed some basic but important setting when creating an S3 client with boto3, or something else. Thanks for the help!
OK, mystery solved: I shouldn't provide the endpoint_url=S3_ENDPOINT param when I create the S3 client; boto3 will figure the endpoint out by itself. After I removed it, everything works as expected.
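For reference, a minimal sketch of the working path (the bucket and key names are placeholders; credentials and region come from the usual boto3 lookup):

import boto3

# No endpoint_url here: boto3 derives the regional S3 endpoint itself, so the
# signed URL contains only the object key rather than bucket-name/object-key.
s3 = boto3.client("s3", region_name="us-east-2")

signed_url = s3.generate_presigned_url(
    "get_object",
    ExpiresIn=3600,
    Params={"Bucket": "my-bucket-name", "Key": "my-object-name"},
)
print(signed_url)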

Is there a way BULK INSERT from local Azure Blob Storage?

TL;DR I am trying to point SQL to BULK INSERT from Local Azure Blob Storage
The problem:
Hi all,
I'm trying to connect my local SQL Server database instance to the blob storage emulator as an external data source; however, I'm getting a "Bad or inaccessible location specified" error. Here are the steps I'm taking:
I created the master database key and the credentials as follows:
IF EXISTS (SELECT * FROM sys.symmetric_keys WHERE name = '##MS_DatabaseMasterKey##')
DROP MASTER KEY;
--Create Master Key
CREATE MASTER KEY
ENCRYPTION BY PASSWORD='MyStrongPassword';
and database credentials:
-- Drop DB credentials if they exist
IF EXISTS (SELECT * FROM sys.database_scoped_credentials WHERE name = 'credentials')
DROP DATABASE SCOPED CREDENTIAL credentials;
--Create scoped credentials to connect to Blob
CREATE DATABASE SCOPED CREDENTIAL credentials
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET =
'Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=='; --local storage key
GO
Then I created the following external data source:
CREATE EXTERNAL DATA SOURCE external_source
WITH
(
TYPE = BLOB_STORAGE,
LOCATION = 'http://127.0.0.1:10000/devstoreaccount1/container/some_folder/',
CREDENTIAL = credentials
)
But when I run the BULK INSERT command:
BULK INSERT [dbo].[StagingTable] FROM 'some_file_on_blob_storage.csv' WITH (DATA_SOURCE = 'external_source', FIRSTROW = 1, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')
it fails and returns:
Bad or inaccessible location specified in external data source "external_source".
How can I load a file from Local Blob Storage into SQL Server?
Nick.McDermaid has pointed out the error correctly. From your code and the error message, the error is caused by the wrong LOCATION syntax:
Do not add a trailing /, file name, or shared access signature
parameters at the end of the LOCATION URL when configuring an
external data source for bulk operations.
Ref here: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql?view=sql-server-ver15&tabs=dedicated#examples-bulk-operations
Change the value to LOCATION = 'http://127.0.0.1:10000/devstoreaccount1/container/some_folder' and the error should be solved. I tested it and everything works well.
As for your other question, we cannot answer it directly here. I suggest you post another question with your detailed code. We're all glad to help.
Update:
About your other question: I tested and found that we must set the shared access signature (SAS) 'Allowed resource types' to Object; then we can access the container, its child folders, and the files in the container.
For example, both of the statements I tested work well.
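If you prefer to generate such a SAS token in code rather than in the portal, here is a rough sketch with the azure-storage-blob Python package (the account name and key below are the well-known local storage emulator defaults; adjust the permissions and expiry to your needs):

from datetime import datetime, timedelta, timezone
from azure.storage.blob import (
    AccountSasPermissions,
    ResourceTypes,
    generate_account_sas,
)

# Well-known Azurite / storage emulator development account.
ACCOUNT_NAME = "devstoreaccount1"
ACCOUNT_KEY = "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="

sas_token = generate_account_sas(
    account_name=ACCOUNT_NAME,
    account_key=ACCOUNT_KEY,
    resource_types=ResourceTypes(object=True),   # 'Allowed resource types' = Object
    permission=AccountSasPermissions(read=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(hours=1),
)
print(sas_token)  # this value goes into the SECRET of the database scoped credential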
HTH.

Error on creating a new website bucket with PutBucketWebSite

When trying to create a new website bucket on S3, I get the following (understandable) error.
Of course I don't have the bucket; I'm trying to create a new one.
Error: "The specified bucket does not exist"
Code:
using (AmazonS3 client = AWSClientFactory.CreateAmazonS3Client(
    System.Configuration.ConfigurationManager.AppSettings["AWSAccessKey"],
    System.Configuration.ConfigurationManager.AppSettings["AWSSecretKey"],
    S3Config))
{
    WebsiteConfiguration wSite = new WebsiteConfiguration();
    wSite.IndexDocumentSuffix = "Index.HTML";
    wSite.ErrorDocument = "Error.HTML";

    PutBucketWebsiteRequest request = new PutBucketWebsiteRequest();
    request.BucketName = bucket_name;
    request.WebsiteConfiguration = wSite;

    PutBucketWebsiteResponse response = client.PutBucketWebsite(request);
}
Help please.
Thanks
All the code written above sets a website configuration on an S3 bucket that must already exist in your S3 account. That's why it throws the "The specified bucket does not exist" error when setting the website configuration.
To set the website configuration you need Index.html as your index page and Error.html as your error page. These two files (Index.html and Error.html) should exist in your S3 bucket.
To create the bucket first, you can use the Create Bucket REST API (or the SDK's PutBucket call).
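The question uses the old .NET SDK, but as a rough illustration of the required order of operations, here is a minimal sketch in Python/boto3 (bucket name and region are placeholders):

import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# The bucket has to exist before a website configuration can be applied to it.
s3.create_bucket(Bucket="my-website-bucket")

# Only now can the website configuration call succeed; Index.HTML and
# Error.HTML still need to be uploaded for the site to actually serve pages.
s3.put_bucket_website(
    Bucket="my-website-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "Index.HTML"},
        "ErrorDocument": {"Key": "Error.HTML"},
    },
)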
Thanks