Error on Redshift UNLOAD command - amazon-s3

I am trying to UNLOAD a Redshift table to an S3 bucket, but I am getting errors that I can't resolve.
When using 's3://mybucket/' as the destination (which is the documented way to specify it), I get an error saying S3ServiceException:The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint..
After some research I tried changing the destination to include the full bucket URL, without success.
All these destinations:
's3://mybucket.s3.amazonaws.com/',
's3://mybucket.s3.amazonaws.com/myprefix',
's3://mybucket.s3.eu-west-2.amazonaws.com/',
's3://mybucket.s3.eu-west-2.amazonaws.com/myprefix'
return this error: S3ServiceException:The authorization header is malformed; the region 'eu-west-2' is wrong; expecting 'us-east-1', which is also the error returned when I use a bucket name that doesn't exist.
My Redshift cluster and my S3 buckets all exist in the same region, eu-west-2.
What am I doing wrong?
Appendix:
Full command:
UNLOAD ('select * from mytable')
to 's3://mybucket.s3.amazonaws.com/'
iam_role 'arn:aws:iam::0123456789:role/aws-service-role/redshift.amazonaws.com/AWSServiceRoleForRedshift'
Full errors:
ERROR: S3ServiceException:The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.,Status 301,Error PermanentRedirect,Rid 6ADF2C929FD2BE08,ExtRid vjcTnD02Na/rRtLvWsk5r6p0H0xncMJf6KBK
DETAIL:
-----------------------------------------------
error: S3ServiceException:The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.,Status 301,Error PermanentRedirect,Rid 6ADF2C929FD2BE08,ExtRid vjcTnD02Na/rRtLvWsk5r6p0H0xncMJf6KBK
code: 8001
context: Listing bucket=mybucket prefix=
query: 0
location: s3_unloader.cpp:226
process: padbmaster [pid=30717]
-----------------------------------------------
ERROR: S3ServiceException:The authorization header is malformed; the region 'eu-west-2' is wrong; expecting 'us-east-1',Status 400,Error AuthorizationHeaderMalformed,Rid 559E4184FA02B03F,ExtRid H9oRcFwzStw43ynA+rinTOmynhWfQJlRz0QIcXcm5K7fOmJSRcOcHuVlUlhGebJK5iH2L
DETAIL:
-----------------------------------------------
error: S3ServiceException:The authorization header is malformed; the region 'eu-west-2' is wrong; expecting 'us-east-1',Status 400,Error AuthorizationHeaderMalformed,Rid 559E4184FA02B03F,ExtRid H9oRcFwzStw43ynA+rinTOmynhWfQJlRz0QIcXcm5K7fOmJSRcOcHuVlUlhGebJK5iH2L
code: 8001
context: Listing bucket=mybucket.s3.amazonaws.com prefix=
query: 0
location: s3_unloader.cpp:226
process: padbmaster [pid=30717]
-----------------------------------------------
[Screenshots omitted: "Bucket zone" and "Cluster zone", both showing eu-west-2.]
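For reference, the UNLOAD documentation specifies the destination as the bare bucket name plus an optional key prefix, never the bucket's endpoint hostname, and it offers an optional REGION clause that names the bucket's Region explicitly. A minimal sketch under those assumptions; the role name and prefix below are placeholders, not values from this post:
-- Hedged sketch: destination is the plain bucket name plus an optional prefix.
-- 'MyRedshiftUnloadRole' and 'myprefix' are hypothetical placeholders.
UNLOAD ('select * from mytable')
TO 's3://mybucket/myprefix/'
IAM_ROLE 'arn:aws:iam::0123456789:role/MyRedshiftUnloadRole'
REGION 'eu-west-2';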

Related

camel aws2-s3 component - producer prefix option on upload

I am trying to use the Camel (3.7.5) aws2-s3 component to upload a file to AWS S3 storage. Consuming works just fine with the bucket configured using a specific prefix as the component's consumer option, so the access credentials (accessKey and secretKey) are correct. However, the producer does not work, leaving me with AccessDenied 403. I suspect that it is due to an invalid prefix/path configured on the producer: on the consumer, if I set an invalid prefix I get the identical error (403). On the producer I tried to use the same 'prefix' option, but apparently that does not work; the docs also list "prefix" as a consumer-only option. How do I set the prefix for the producer properly?
That works:
from("aws2-s3://MY_BUCKET?" +
"region=eu-central-1" +
"&accessKey=RAW(XXX)" +
"&secretKey=RAW(YYY)" +
"&prefix=MY_PREFIX").log("tick");
That does not work (403):
from("timer://foo?fixedRate=true&period=5000").routeId("aws-route")
.setHeader(AWS2S3Constants.KEY, simple("TEST.xml"))
.setBody(simple("Hello"))
.to("aws2-s3://MY_BUCKET?"
+ "region=eu-central-1"
+ "&prefix=MY_PREFIX"
+ "&accessKey=RAW(XXX)"
+ "&secretKey=RAW(YYY)");
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: JMCHHA87HN67C2B6, Extended Request ID: 6dEH4iuPXS4dgbJUCtHqw2gfmGuwgbw1cJcvevWpBCnZWJxjCg9oyd1MGWJh++pe2HIe1rT2dws=)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:106)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:84)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:42)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:94)
at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$6(BaseClientHandler.java:252)
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:77)
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:39)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:64)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:34)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:133)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:159)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:112)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:167)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:94)
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:55)
at software.amazon.awssdk.services.s3.DefaultS3Client.listObjects(DefaultS3Client.java:5901)
at org.apache.camel.component.aws2.s3.AWS2S3Endpoint.doStart(AWS2S3Endpoint.java:114)
at org.apache.camel.support.service.BaseService.start(BaseService.java:115)
at org.apache.camel.support.service.ServiceHelper.startService(ServiceHelper.java:84)
at org.apache.camel.processor.SendProcessor.doStart(SendProcessor.java:230)
at org.apache.camel.support.service.BaseService.start(BaseService.java:115)
at org.apache.camel.support.service.ServiceHelper.startService(ServiceHelper.java:84)
at org.apache.camel.support.service.ServiceHelper.startService(ServiceHelper.java:101)
at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler.doStart(RedeliveryErrorHandler.java:1487)
at org.apache.camel.support.ChildServiceSupport.start(ChildServiceSupport.java:60)
... 27 more
When you set the header
.setHeader(AWS2S3Constants.KEY, simple("TEST.xml"))
it should be enough to prepend the prefix to the key yourself:
.setHeader(AWS2S3Constants.KEY, simple("<prefix>/TEST.xml"))
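Putting the question's producer route together with that fix, a minimal sketch (bucket, credentials, and MY_PREFIX are the placeholders from the question):
// Hedged sketch: the prefix is prepended to the object key, since the
// 'prefix' URI option only applies to the consumer side.
from("timer://foo?fixedRate=true&period=5000").routeId("aws-route")
    .setHeader(AWS2S3Constants.KEY, simple("MY_PREFIX/TEST.xml"))
    .setBody(simple("Hello"))
    .to("aws2-s3://MY_BUCKET?"
        + "region=eu-central-1"
        + "&accessKey=RAW(XXX)"
        + "&secretKey=RAW(YYY)");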

AWS IoT SQL Rule

I am trying to execute an IoT Rule as part of a CFT template.
This rule has to ignore messages whose sNumber field starts with F0F1.
The rule is like:
SELECT * FROM 'topic/+/+/+' WHERE 'sNumber' NOT LIKE 'F0F1%'
But I am facing this error:
Resource handler returned message: "Expected a comparison operation:
StringNode(sNumber) 'sNumber' NOT LIKE 'F0F1%'
--------------------------------------------------------------------------------------------------------------------------------^ at 1:34 (Service: Iot, Status Code: 400, Request ID:
75e91f11-05c8-4e22-8cd7-0a3567261695, Extended Request ID: null)"
(RequestToken: 6cd8d39d-1b2d-4076-6253-60212009a63a, HandlerErrorCode:
InvalidRequest)
Can you help me understand what needs to be done to achieve this requirement?
The single quotes around 'sNumber' make it a string literal rather than a field reference, and the IoT SQL grammar does not appear to accept NOT LIKE here (hence the "Expected a comparison operation" error). Try using the startswith() function instead:
SELECT * FROM 'topic/+/+/+' WHERE NOT startswith(sNumber, 'F0F1')
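Since the rule is created from a CFT template, here is a hedged sketch of how the corrected SQL might sit inside an AWS::IoT::TopicRule resource; the resource name, S3 action, bucket, and role ARN are placeholders, not values from the question:
# Hedged sketch of a minimal AWS::IoT::TopicRule; the action details are hypothetical.
IgnoreF0F1Rule:
  Type: AWS::IoT::TopicRule
  Properties:
    TopicRulePayload:
      AwsIotSqlVersion: '2016-03-23'
      RuleDisabled: false
      Sql: "SELECT * FROM 'topic/+/+/+' WHERE NOT startswith(sNumber, 'F0F1')"
      Actions:
        - S3:
            BucketName: my-example-bucket
            Key: ${topic()}/${timestamp()}.json
            RoleArn: arn:aws:iam::123456789012:role/my-iot-rule-role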

Copy data activity fails when sink is immutable

In my Azure data factory pipeline, I'm using a Copy data activity inside a ForEach activity to copy files from an input container to an archive container before processing the files in the input container. This normally works, but today I made the archive container immutable by adding a legal hold policy to it, and the next time the copy data activity ran, it failed with an error (see below). Is there any way around this, since you should be able to add new files to an immutable container?
Error code: 2200
Failure type: User configuration issue
Details:
Failure happened on 'Sink' side. ErrorCode=AdlsGen2OperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ADLS Gen2 operation failed for: Operation returned an invalid status code 'Conflict'. Account: 'mydatalake'. FileSystem: 'raw'. Path: 'Source/ABC/File_2021_03_24.csv'. ErrorCode: 'PathImmutableDueToLegalHold'. Message: 'This operation is not permitted as the path is immutable due to one or more legal holds.'. RequestId: '37f75e88-501a-0026-2fa1-20d52e000000'. TimeStamp: 'Wed, 24 Mar 2021 11:30:54 GMT'..,Source=Microsoft.DataTransfer.ClientLibrary,''Type=Microsoft.Azure.Storage.Data.Models.ErrorSchemaException,Message=Operation returned an invalid status code 'Conflict',Source=Microsoft.DataTransfer.ClientLibrary,'
Source: Pipeline LoadMyData

Presto DDL on S3 using FileHiveMetastore not working

I tried to connect Presto to S3 using FileHiveMetastore with the configurations below, but when I try to create a table with the statement mentioned, it fails with the error message below. Could anyone let me know if the configurations mentioned are wrong?
I could see that it should be possible, as someone has already described connecting this way.
Reference thread: Setup Standalone Hive Metastore Service For Presto and AWS S3
error message:- com.amazonaws.services.s3.model.AmazonS3Exception: The specified bucket does not exist (Service: Amazon S3; Status Code: 404; Error Code: NoSuchBucket; Request ID: 33F01AA7477B12FC)
connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=s3://ap-south-1.amazonaws.com/prestos3test/
hive.s3.aws-access-key=yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
hive.s3.aws-secret-key=zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
hive.s3.endpoint=http://prestos3test.s3-ap-south-1.amazonaws.com
hive.s3.ssl.enabled=false
hive.metastore.uri=thrift://localhost:9083
External Table Creation
CREATE TABLE PropData (
    prop0 integer,
    prop1 integer,
    prop2 varchar,
    prop3 varchar,
    prop4 varchar
)
WITH (
    format = 'ORC',
    external_location = 's3://prestos3test'
)
Thanks
Santosh
I got help from other corners; I thought it would be helpful to others, hence documenting the necessary config below.
connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=s3://prestos3test/
hive.s3.aws-access-key=yyyyyyyyyyyyyyyyyy
hive.s3.aws-secret-key=zzzzzzzzzzzzzzzzzzzzzz
hive.s3.ssl.enabled=false
hive.metastore.uri=thrift://localhost:9083
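With that catalog dir in place, the external_location in the CREATE TABLE can point at the bucket or a prefix inside it, rather than an endpoint-style URL; a hedged sketch, where the propdata/ prefix is a hypothetical example, not from the original post:
-- Hedged sketch: external_location uses the plain bucket name; 'propdata/' is hypothetical.
CREATE TABLE PropData (
    prop0 integer,
    prop1 integer,
    prop2 varchar,
    prop3 varchar,
    prop4 varchar
)
WITH (
    format = 'ORC',
    external_location = 's3://prestos3test/propdata/'
);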
Thanks
Santosh

S3ResponseError: 403 Forbidden.An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist

import boto
from boto.s3.key import Key

try:
    conn = boto.connect_s3(access_key, secret_access_key)
    bucket = conn.get_bucket(bucket_name, validate=False)
    k1 = Key(bucket)
    k1.key = 'Date_Table.csv'
    # k = bucket.get_key('Date_Table.csv')
    k1.make_public()
    k1.get_contents_to_filename(tar)
except Exception as e:
    print(e)
I am getting this error:
S3ResponseError: 403 Forbidden
AccessDenied: Access Denied (RequestId: D9ED8BFF6D6A993E, HostId: aw0KmxskATNBTDUEo3SZdwrNVolAnrt9/pkO/EGlq6X9Gxf36fQiBAWQA7dBSjBNZknMxWDG9GI=)
I tried every possibility and I am still getting the same error. Please guide me on how to solve this issue.
I tried another way, as below, and I get this error:
An error occurred (NoSuchKey) when calling the GetObject operation:
The specified key does not exist.
session = boto3.session.Session(aws_access_key_id=access_key, aws_secret_access_key=secret_access_key,region_name='us-west-2')
print ("session:"+str(session)+"\n")
client = session.client('s3', endpoint_url=s3_url)
print ("client:"+str(client)+"\n")
stuff = client.get_object(Bucket=bucket_name, Key='Date_Table.csv')
print ("stuff:"+str(stuff)+"\n")
stuff.download_file(local_filename)
Always use boto3; boto is deprecated.
As long as you have set up AWS CLI credentials, you don't need to pass hard-coded credentials. Read the boto3 credential setup documentation thoroughly.
There is no reason to initiate a boto3 session unless you are using a different region or user profile.
Take your time and study the difference between service clients (boto3.client) and service resources (boto3.resource).
The low-level boto3.client is easier to use for experiments. Use the high-level boto3.resource if you need to pass around arbitrary objects.
Here is the simple code for boto3.client("s3").download_file:
import boto3
# initiate the proper AWS services client, i.e. S3
s3 = boto3.client("s3")
s3.download_file('your_bucket_name', 'Date_Table.csv', '/your/local/path/and/filename')
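And if you prefer the higher-level resource interface mentioned above, a hedged equivalent sketch (same placeholder bucket name, key, and local path):
import boto3

# Hedged sketch: the resource API performs the same download through a Bucket object.
s3 = boto3.resource("s3")
s3.Bucket("your_bucket_name").download_file("Date_Table.csv", "/your/local/path/and/filename")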