Can we execute AWS Athena Commandline from EMR? - amazon-emr

I am trying to access AWS Athena from EMR using the Athena command line:
aws Athena start-query-execution --query-string --result-configuration
But aws help doesn't list Athena.
Where do we need to run the Athena command line from, and how?

Can you please paste the error you got when you ran the command below?
aws Athena start-query-execution --query-string --result-configuration
Did the command line fail to recognize the parameter "Athena"? Please try the lowercase "athena" instead, since AWS CLI service names are lowercase.
The link below has the Athena CLI documentation:
Athena CLI Documentation

Updating the AWS CLI version on the EMR master node worked:
pip install awscli --upgrade --user
Before the update, the CLI version was:
aws --version
aws-cli/1.11.83 Python/2.7.12 Linux/4.4.35-33.55.amzn1.x86_64 botocore/1.5.46
After the update, the CLI version is:
aws --version
aws-cli/1.15.62 Python/2.7.12 Linux/4.4.35-33.55.amzn1.x86_64 botocore/1.10.61
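Once the CLI recognizes the athena service, the call from the question can be completed roughly as follows. This is only a minimal sketch: the function names run_athena_query and athena_query_state and the example bucket are hypothetical, and it assumes the cluster's instance role (or configured credentials) may use Athena and write to the output location, and that a default region is configured.

```shell
# Hypothetical helper: submit a query to Athena and print its QueryExecutionId.
#   $1 = SQL text
#   $2 = S3 location for results (placeholder, e.g. s3://example-bucket/athena-results/)
run_athena_query() {
    aws athena start-query-execution \
        --query-string "$1" \
        --result-configuration "OutputLocation=$2" \
        --query 'QueryExecutionId' \
        --output text
}

# Hypothetical helper: print the state of a submitted query
# (QUEUED, RUNNING, SUCCEEDED, FAILED or CANCELLED).
#   $1 = QueryExecutionId returned by run_athena_query
athena_query_state() {
    aws athena get-query-execution \
        --query-execution-id "$1" \
        --query 'QueryExecution.Status.State' \
        --output text
}

# Usage (requires the aws command and valid credentials):
#   id=$(run_athena_query "SELECT 1" s3://example-bucket/athena-results/)
#   athena_query_state "$id"
```

Queries that reference tables without a database prefix would probably also need --query-execution-context Database=..., which the sketch leaves out.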

Related

www-data cannot use aws command

I have some PHP that executes a .sh script which runs some aws s3 cp commands, among other things.
However, when this script is executed by www-data, the aws command is not found. I suppose this is because I installed it using pip3 install awscli --upgrade --user, so it is installed only for the user "test". The script runs fine when I call it from the CLI as the test user.
How can www-data use the aws command? Should I just install without --user?
You seem to be correct. Making the awscli binaries available to www-data should fix the problem. The easiest way would probably be to install the AWS CLI for the www-data user, or you may also try option 2 from this link.
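A quick way to confirm the diagnosis is to have the script itself report whether aws is on the PATH of whichever user runs it. The helper below is a sketch, and the name require_cmd is made up for illustration. If it fails under www-data, installing without --user (for example sudo pip3 install awscli --upgrade) should normally place aws in a system-wide location, although www-data will still need its own credentials.

```shell
# Hypothetical helper: fail with a clear message if a command is not on the
# PATH of the user running the script.
#   $1 = command name to look for
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found in PATH ($PATH) for user $(id -un)" >&2
        return 1
    fi
}

# Usage at the top of the .sh script that PHP executes:
#   require_cmd aws || exit 1
```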

Error while enabling server side encryption policy for aws s3 bucket through cli

aws s3api put-bucket-encryption --bucket my-buxket-en --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
I am getting the error below:
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
aws help
aws <command> help
aws <command> <subcommand> help
Unknown options: {SSEAlgorithm:, AES256}}]}', [{ApplyServerSideEncryptionByDefault:
Kindly help me to resolve the error
I'm going to add an answer that explains this for people using Windows, in case they find this and can't figure it out. The Windows command prompt does not treat single quotes as quoting characters, so the JSON has to be wrapped in double quotes with the inner double quotes escaped.
aws s3api put-bucket-encryption --bucket my-bucket-name --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
Needs to become
aws s3api put-bucket-encryption --bucket my-bucket-name --server-side-encryption-configuration "{\"Rules\": [{\"ApplyServerSideEncryptionByDefault\": {\"SSEAlgorithm\": \"AES256\"}}]}"
Once you handle the quotes in the JSON, it works as expected.
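Another way to sidestep shell quoting altogether is to keep the JSON in a file and pass it with the CLI's file:// prefix. The sketch below is written for a POSIX shell; the function names and the file path are hypothetical. On Windows, only the file:// argument carries over.

```shell
# Hypothetical helper: write the default-encryption rule to a JSON file.
#   $1 = path of the file to create
write_sse_config() {
    cat > "$1" <<'EOF'
{
  "Rules": [
    {
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }
  ]
}
EOF
}

# Hypothetical helper: apply the rule file to a bucket.
#   $1 = bucket name, $2 = path to the JSON file
enable_default_encryption() {
    aws s3api put-bucket-encryption \
        --bucket "$1" \
        --server-side-encryption-configuration "file://$2"
}

# Usage (requires the aws command and valid credentials):
#   write_sse_config encryption.json
#   enable_default_encryption my-bucket-name encryption.json
```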
I have examined your AWS CLI syntax and, as far as I can tell, there is nothing wrong with it.
From the error, the issue looks related to the AWS CLI version: you are most likely using an older version of the AWS CLI that does not recognize the server-side-encryption-configuration parameter.
Resolution Steps:
1. Check the current version of your AWS CLI:
aws --version
If the output shows a version lower than 1.18.31, upgrade your AWS CLI as shown below.
2. Upgrade your AWS CLI using pip (or pip3):
To upgrade an existing AWS CLI installation, use the --upgrade option:
pip install --upgrade awscli
OR
pip3 install --upgrade awscli
3. Alternatively, upgrade your AWS CLI using the AWS bundled installer:
curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
unzip awscli-bundle.zip
sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
Note: You may need to log out for the changes to take effect
Hope this helps!
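The version check in step 1 can be scripted. The sketch below parses a line in the format that aws --version prints and compares it with the 1.18.31 threshold named above. The function names cli_version and version_ge are made up for illustration, and the comparison assumes purely numeric version components.

```shell
# Extract "1.15.62" from a line such as "aws-cli/1.15.62 Python/2.7.12 ...".
#   $1 = one line of `aws --version` output
cli_version() {
    printf '%s\n' "$1" | sed -n 's|^aws-cli/\([0-9][0-9.]*\).*|\1|p'
}

# Return 0 if dotted version $1 >= $2, comparing each component numerically.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        [ "${x:-0}" -gt "${y:-0}" ] && return 0
        [ "${x:-0}" -lt "${y:-0}" ] && return 1
        # Drop the leading component; an exhausted version counts as zeros.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Usage (requires the aws command; 2>&1 is there in case the installed
# release prints its version to stderr):
#   current=$(cli_version "$(aws --version 2>&1)")
#   version_ge "$current" 1.18.31 || echo "upgrade needed"
```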

EMR spark-shell not picking up jars

I am using spark-shell and I am unable to pick up external jars. I run spark in EMR.
I run the following command:
spark-shell --jars s3://play/emr/release/1.0/code.jar
I get the following error:
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Warning: Skip remote jar s3://play/emr/release/1.0/code.jar
Thanks in advance.
This is a limitation of Apache Spark itself, not specifically Spark on EMR. When running Spark in client deploy mode (all interactive shells like spark-shell or pyspark, or spark-submit without --deploy-mode cluster or --master yarn-cluster), only local jar paths are allowed.
The reason for this is that in order for Spark to download this remote jar, it must already be running Java code, at which point it is too late to add the jar to its own classpath.
The workaround is to download the jar locally (using the AWS S3 CLI) then specify the local path when running spark-shell or spark-submit.
You can do this with a spark-submit command line on the EMR box itself:
spark-submit --verbose --deploy-mode cluster --class com.your.package.and.Class s3://bucket/path/to/thejar.jar 10
You can also call this command using the AWS Java EMR Client Library or the AWS CLI. The key is to use: '--deploy-mode cluster'
I had the same issue. You can add the "--master yarn --deploy-mode cluster" arguments, which will allow you to run jars from S3 remotely.
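For the interactive shell, the download-first workaround mentioned above might look like the sketch below. The helper name fetch_jar and the /tmp default are assumptions, not part of any answer in this thread.

```shell
# Hypothetical helper: copy a jar from S3 to a local directory and print the
# local path on stdout.
#   $1 = S3 URI of the jar, $2 = local directory (defaults to /tmp)
fetch_jar() {
    dest="${2:-/tmp}/$(basename "$1")"
    # Send the CLI's progress output to stderr so stdout carries only the path.
    aws s3 cp "$1" "$dest" >&2 || return 1
    printf '%s\n' "$dest"
}

# Usage on the EMR master node, with the jar from the question:
#   spark-shell --jars "$(fetch_jar s3://play/emr/release/1.0/code.jar)"
```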

Error: AWS CLI SSH Certificate Verify Failed _ssl.c:581

I am trying to use the sync command from my file system to S3 on a Windows 2008 R2 server.
I have previously had no problem running this command on multiple local machines:
AWS S3 SYNC 'File system Name' S3://'S3 file directory name'
However when I try to run it from this box I get this error:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
Every forum I see is using python scripts but I am just using the simple CLI commands.
Any idea why I am getting this error?
If you are running AWS CLI commands on Windows, the sudo pip commands given in the other answer will not work.
1) To avoid the "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)" error on the CLI, you can use the following format, which skips SSL certificate verification:
aws [aws-service-name] --no-verify-ssl [functions]
2) Your CLI command for S3 sync then becomes:
aws s3 --no-verify-ssl sync 'File system Name' s3://'S3 file directory name'
This worked around the issue for me on Ubuntu 14.04. I cannot confirm whether it is an ideal or complete solution:
sudo pip uninstall certifi
sudo pip install certifi==2015.04.28
From here: https://github.com/aws/aws-cli/issues/1499

Unable to Sync to S3 with s3cmd

After setting up s3cmd and my S3 bucket, when I try this command
sudo s3cmd sync --recursive --preserve /srv s3://MyS3Bucket
I get this error:
ERROR: S3 error: 400 (InvalidRequest): The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.
My s3cmd version is 1.0.0, which is installed by default after following their "deb" installation guide for Ubuntu 12.04.
These days, it is recommended to use the AWS Command-Line Interface (CLI), which also provides a sync capability.
s3cmd version 1.5.2 is necessary for working with regions such as eu-central-1 (Frankfurt) or cn-north-1 (Beijing). Debian packages for it are available in Debian experimental and unstable, and in Ubuntu Wily universe. Alternatively, you can install it from source from https://github.com/s3tools/s3cmd.
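If you go the AWS CLI route suggested above, the rough equivalent of the s3cmd command in the question could be sketched as follows. The helper name sync_to_bucket and its "preview" argument are hypothetical, and the sketch makes no attempt to reproduce what s3cmd's --preserve option does.

```shell
# Hypothetical helper: mirror a local directory to a bucket with the AWS CLI.
#   $1 = local directory, $2 = bucket name
#   $3 = pass "preview" to list what would be copied without uploading
sync_to_bucket() {
    if [ "${3:-}" = "preview" ]; then
        aws s3 sync "$1" "s3://$2" --dryrun
    else
        aws s3 sync "$1" "s3://$2"
    fi
}

# Usage (requires the aws command and valid credentials):
#   sync_to_bucket /srv MyS3Bucket preview
#   sync_to_bucket /srv MyS3Bucket
```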