HiveAccessControlException Permission Denied: user does not have [ALL] privilege - hive

I am very new to Hive and the Hadoop ecosystem.
I am trying to create a new table in Hive, but I keep encountering the HiveAccessControlException shown in the title.
According to some suggestions, I have to set up the Ranger policies, but upon checking, the policies already had their permissions set to "All".
The same permissions were also given in the other policies.
Did I miss something? Thank you.

You might need an HDFS user directory, which can be created by the administrator using sudo -u hdfs hdfs dfs -mkdir /user/<user_id>
To check whether one already exists:
hdfs dfs -ls /user | grep <user_id>
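If it is missing, a minimal sketch of creating it and handing it over to the user (the group used in the chown is an assumption) would be:
sudo -u hdfs hdfs dfs -mkdir -p /user/<user_id>
sudo -u hdfs hdfs dfs -chown <user_id>:<user_id> /user/<user_id>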

I had a similar issue. You may want to check Ranger > Audit > Plugin Status to see if the policy is being applied. If it's not, it may be that you have a jersey-client classpath conflict. More details here:
https://jonmorisissqlblog.blogspot.com/2021/06/ranger-hive-policy-activation-time.html

Related

Which user is connecting to beeline?

When I connect to a database with beeline, I don't need to enter a user and password; I just press Enter and get access to the database.
But when I try to write to the database, I get a permission denied error:
Error: java.io.IOException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=READ, inode="/apps/hive/warehouse
I would like to know which default user I should use to connect to my database in order to add policies to it.
I thought it was hive, but it seems it's not.
Beeline takes a user account when you connect:
beeline -u 'url' -n username
Use hdfs dfs -ls to find table permissions (assuming you're not using Sentry or Ranger to manage permissions).
In any case, you don't use beeline to add permissions. By default, it's simply HDFS user/group permissions using chmod / chown, assuming you have ACLs enabled.
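For example, a quick way to inspect and then fix the ownership of the warehouse path from the error (the hive:hadoop owner/group pair is an assumption, not taken from the question):
hdfs dfs -ls /apps/hive/warehouse
sudo -u hdfs hdfs dfs -chown -R hive:hadoop /apps/hive/warehouse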
Try one of the two options below:
1. Log in with the hadoop user.
2. Give full permissions to the /apps/hive/warehouse folder (see the sketch below).
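A sketch of the second option; note that this opens the warehouse to everyone, so it is really only suitable for a sandbox:
sudo -u hdfs hdfs dfs -chmod -R 777 /apps/hive/warehouse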

How to grant a user account access for GCE Cloud Storage

I have a Compute VM that has storage permission of read-write. When I try to run the following command:
gsutil rsync -R local-dir gs://folder1/prod/www.domain.tld
I get insufficient permission error:
Building synchronization state...
Skipping cloud sub-directory placeholder object (gs://folder1/prod/www.domain.tld/) because such objects aren't needed in (and would interfere with) directories in the local file system
Starting synchronization
Copying file://local-dir/.gitignore [Content-Type=application/octet-stream]...
Uploading gs://folder1/prod/www.domain.tld/.gitignore: 174 B/174 B
AccessDeniedException: 403 Insufficient Permission
I am not sure which account needs the permissions or how I would even assign them. Can someone provide some direction as to what I need to look into? The only thing I can think of is a service account, but I am not sure how to create one or whether that is even what I need to do. Also, once I grant an account access/permissions, how would I use that account to authenticate?
Use the gcloud auth list command to look up the active account. The service account of Compute Engine looks similar to the following:
123845678986-compute@developer.gserviceaccount.com (active)
By default, this service account is a member of your project with Edit permission. Check the ACLs of your GCS bucket and its folders and make sure the GCE service account, or the group it belongs to, has ownership or editing rights on them.
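For example, a sketch of checking the active account and then granting it write access on the bucket from the question (granting WRITE at the bucket level is an assumption about what the sync needs):
gcloud auth list
gsutil acl ch -u 123845678986-compute@developer.gserviceaccount.com:WRITE gs://folder1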

HDFS Plugin for Apache Ranger

I am planning to use Apache Ranger for authorization of my HDFS file system. I have a question about the capabilities of the Apache Ranger plugin. Does the HDFS plugin for Apache Ranger offer more security features than just managing HDFS ACLs? From the limited understanding I have gathered by looking at presentations/blogs, I am unable to comprehend the functions of the HDFS plugin for Apache Ranger.
...and now, with the latest version of Apache Ranger, it is possible to define "deny" rules.
Previously it was only possible to define rules that specify additional "allow" privileges on top of the underlying HDFS ACLs. Hence, if you had the HDFS ACL for a directory set to "777", everybody could access it, independent of any Ranger HDFS policy on top of that ;)
The Apache Ranger plugin for HDFS also provides user access auditing with the following fields:
IP, resource type, timestamp, access granted/denied.
Note that the Ranger plugin does not actually use HDFS ACLs. Ranger policies are added on top of standard HDFS permissions and HDFS ACLs.
You need to be aware that any access rights that are granted on these lower levels cannot be taken away by Ranger anymore.
Apart from that, Ranger gives you the same possibilities as ACLs, plus some more, like granting access by client IP range.
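Given that, a quick sanity check before relying on Ranger policies is to confirm the underlying HDFS permissions are not already wide open (the path below is a placeholder):
hdfs dfs -ls -d /data/secure     # a 777 mode here cannot be taken back by allow-only Ranger policies
hdfs dfs -getfacl /data/secure   # shows any extra HDFS ACL entries on top of the POSIX bits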

Hive add partition with ACL

I am facing some problems with Hive partition creation where the permissions the user has in HDFS are ACL based.
1. I created a normal user in Linux.
2. I gave him permissions recursively on a directory which is referred to by an external table, e.g.:
hdfs dfs -setfacl -R -m default:user:newUserName:rwx /apps/dbname/tblname
3. I checked that the permissions are recursively applied and that I can read and write to the directory, even though I do not have any POSIX permissions on it, i.e. I only have the ACLs.
4. I logged in as newUserName, started Hive, and ran an "alter table add partition" command, where the location for the partition is /apps/dbname/tablename/somefolder
Hive responds with the error: Authorization failed: java.security.AccessControlException: action WRITE not permitted on path hdfs://sandbox.hortonworks.com:8020/apps/dbname/tblname for user newUserName. Use SHOW GRANT to get more details.
What am I missing here? dfs.namenode.acls.enabled is true. I thought that was all that was required for ACLs to work. I am using Hortonworks HDP 2.1.
Thanks
In the current version of Hive (0.13.x), support for HDFS ACLs (introduced in Apache Hadoop 2.4) has not yet been added.
The JIRA below tracks this issue:
https://issues.apache.org/jira/browse/HIVE-7714
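Until that is resolved, one possible workaround sketch is to fall back on plain POSIX permissions for the table location instead of ACL entries (the hive_users group here is hypothetical):
sudo -u hdfs hdfs dfs -chown -R :hive_users /apps/dbname/tblname
sudo -u hdfs hdfs dfs -chmod -R 770 /apps/dbname/tblname
hdfs dfs -getfacl /apps/dbname/tblname   # verify what permissions are actually in effect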

How to specify username when putting files on HDFS from a remote machine?

I have a Hadoop cluster set up and working under a common default username, "user1". I want to put files into Hadoop from a remote machine which is not part of the Hadoop cluster. I configured the Hadoop files on the remote machine in such a way that when
hadoop dfs -put file1 ...
is called from the remote machine, it puts the file1 on the Hadoop cluster.
The only problem is that I am logged in as "user2" on the remote machine, and that doesn't give me the result I expect. In fact, the above command can only be executed on the remote machine as:
hadoop dfs -put file1 /user/user2/testFolder
However, what I really want is to be able to store the file as:
hadoop dfs -put file1 /user/user1/testFolder
If I try to run the last command, Hadoop throws an error because of access permissions. Is there any way I can specify the username within the hadoop dfs command?
I am looking for something like:
hadoop dfs -username user1 file1 /user/user1/testFolder
If you use the HADOOP_USER_NAME env variable you can tell HDFS which user name to operate with. Note that this only works if your cluster isn't using security features (e.g. Kerberos). For example:
HADOOP_USER_NAME=hdfs hadoop dfs -put ...
This may not matter to anybody, but I am using a small hack for this.
I'm exporting HADOOP_USER_NAME in my .bash_profile, so that every time I log in, the user is set.
Just add the following line to your .bash_profile:
export HADOOP_USER_NAME=<your hdfs user>
By default, authentication and authorization are turned off in Hadoop. According to Hadoop: The Definitive Guide (btw, nice book - would recommend buying it):
The user identity that Hadoop uses for permissions in HDFS is determined by running
the whoami command on the client system. Similarly, the group names are derived from
the output of running groups.
So, you can create a new whoami command which returns the required username and put it in the PATH appropriately, so that the created whoami is found before the actual whoami that comes with Linux. Similarly, you can play with the groups command.
This is a hack and won't work once authentication and authorization have been turned on.
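A minimal sketch of that whoami override, reusing user1 and the target folder from the question (again, this only works while security is off):
mkdir -p ~/fakebin
printf '#!/bin/sh\necho user1\n' > ~/fakebin/whoami
chmod +x ~/fakebin/whoami
export PATH=~/fakebin:$PATH
hadoop dfs -put file1 /user/user1/testFolder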
Shell/command way:
Set the HADOOP_USER_NAME variable, and execute the hdfs commands:
export HADOOP_USER_NAME=manjunath
hdfs dfs -put <source> <destination>
Pythonic way:
import os
# must be set before the HDFS client (e.g. an hdfs/hadoop subprocess) is started, so it inherits the variable
os.environ["HADOOP_USER_NAME"] = "manjunath"
There's another post with something similar to this that could provide a workaround for you, using streaming via ssh:
cat file.txt | ssh user1@clusternode "hadoop fs -put - /path/in/hdfs/file.txt"
See "putting a remote file into hadoop without copying it to local disk" for more information.