User Authentication in hadoop Hdfs - authentication

I have integrated milton webdav with hadoop hdfs and able to read/write files to the hdfs cluster.
I have also added the authorization part using linux file permissions so only authorized users can access the hdfs server, however, I am stuck at the authentication part.
It seems hadoop does not provide any in built authentication and the users are identified only through unix 'whoami', meaning I cannot enable password for the specific user.
ref: http://hadoop.apache.org/common/docs/r1.0.3/hdfs_permissions_guide.html
So even if I create a new user and set permissions for it, there is no way to identify whether the user is authenticate or not. Two users with the same username and different password have the access to the all the resources intended for that username.
I am wondering if there is any way to enable user authentication in hdfs (either intrinsic in any new hadoop release or using third party tool like kerbores etc.)
Edit:
Ok, I have checked and it seems that kerberos may be an option but I just want to know if there is any other alternative available for authentication.
Thanks,
-chhavi

Right now kerberos is the only supported "real" authentication protocol. The "simple" protocol is completely trusting the client's whois information.
To setup kerberos authentication I suggest this guide: https://ccp.cloudera.com/display/CDH4DOC/Configuring+Hadoop+Security+in+CDH4
msktutil is a nice tool for creating kerberos keytabs in linux: https://fuhm.net/software/msktutil/
When creating service principals, make sure you have correct DNS settings, i.e. if you have a server named "host1.yourdomain.com", and that resolves to IP 1.2.3.4, then that IP should in turn resolve back to host1.yourdomain.com.
Also note that kerberos Negotiate Authentication headers might be larger than Jetty's built-in header size limit, in that case you need to modify org.apache.hadoop.http.HttpServer and add ret.setHeaderBufferSize(16*1024); in createDefaultChannelConnector(). I had to.

Related

Can Keycloack replace dap authentication?

Sorry but it doesn't accept "hi everyone "
I have several apps which are authenticated by ldap. can keycloack replace this authentication with ldap? and in this way I do not touch to the configuration Of applications ?
Thank you
You can read this article https://wjw465150.gitbooks.io/keycloak-documentation/content/server_admin/topics/user-federation/ldap.html
By default Keycloack copy your ldap data but you can choose keycloack use your ldap data :
By default, Keycloak will import users from LDAP into the local Keycloak user database. This copy of the user is either synchronized on demand, or through a periodic background task. The one exception to this is passwords. Passwords are not imported and password validation is delegated to the LDAP server. The benefits to this approach is that all Keycloak features will work as any extra per-user data that is needed can be stored locally. This approach also reduces load on the LDAP server as uncached users are loaded from the Keycloak database the 2nd time they are accessed. The only load your LDAP server will have is password validation. The downside to this approach is that when a user is first queried, this will require a Keycloak database insert. The import will also have to be synchronized with your LDAP server as needed.
Alternatively, you can choose not to import users into the Keycloak user database. In this case, the common user model that the Keycloak runtime uses is backed only by the LDAP server. This means that if LDAP doesn’t support a piece of data that a Keycloak feature needs that feature will not work. The benefit to this approach is that you do not have the overhead of importing and synchronizing a copy of the LDAP user into the Keycloak user database

Is it possible to use a keytab for group of user in AD

I am using kerberos with Hadoop environment, and I use keytab file to give authentications to different user. Now I have some users to them i have to give same privilege to all them.
So i created a user group and generated a generic keytab file for that active directory group, but failed to validate the keytab file. It gives me an error as mentioned below: kinit: Client 'xyz#BIGDATA.LOCAL' not found in Kerberos database while getting initial credentials
Now the question is, is there possibility to use a keytab file for group in active directory or should i have to use any other way to achieve the same?
You only need to place one keytab on the application server to successfully do Kerberos SSO authentication, not multiple ones. When users access a service which is Kerberos-enabled, they obtain a Kerberos ticket for that service from the KDC. The keytab on the application server decrypts the contents of that ticket, because inside the keytab is a representation of the service running on the application server users want to access, the FQDN of the application server, and the Kerberos realm name which will honor the authentication attempt, and cryptographic hash of the service principal in the KDC. As the passwords in each are the same, authentication succeeds. This is a very implied explanation. The keytab won't be able to determine users group membership however. That is part is authorization, so you'll need to make an LDAP authorization call back to the Directory server if you want to parse group membership.
There's only one exception to this rule that I know of. In a homogenous Microsoft-only Active Directory environment, in which Kerberos is the primary authentication method (it is by default), keytabs are not used. Microsoft application servers can, without a keytab, natively decrypt the Kerberos ticket to determine who the user is and parse that same ticket for user's group information as well, without any need for LDAP calls back to the Directory server. Parsing the Kerberos service ticket for group information is known as reading the PAC. In an AD environment however, non-Microsoft platforms cannot "read the PAC" for group membership, as Microsoft has never exposed how they do this as far as I am aware. See http://searchwindowsserver.techtarget.com/feature/Advanced-Kerberos-topics-From-authentication-to-authorization.

How to handle authentication for GitLab AND ownCloud?

I'm running GitLab and ownCloud on my webserver. I use it mainly for myself, but I want to let some of my friends use it for some projects. My problem: I want to centralize the authentication process. This means I only want one account for OwnCloud and GitLab. I know both support LDAP as authentication backend. However, are there any restrictions I that might encounter?
GitLab CE only allows one LDAP Server and one base DN so keep that in mind if you decide to how you organize your domain. ownCloud on the other hand supports multiple servers and base DN. Don't forget to create a user which will read from the active directory to check credentials.

Test an application using LDAP

I am in the preparation phase of testing an application that will be using LDAP for user authentication. What are some tips or advice you might have for this endeavor?
I don't have a great understanding of LDAP but I believe all that will be used is the application client calls to LDAP with a username and sees if the password matches. Is that an accurate description? Also, what are some edge cases to test? Thanks.
LDAP authentication is often (but not always) a 2-step process: first the application does an LDAP query to locate the "distinguished name" (dn) of the account the user's trying to authenticate to, then it tries to log in ("bind", in LDAP parlance) as that dn, using the user-supplied password.
If the "bind" attempt works, the application knows the password is correct. (The LDAP server is likely not configured to allow the application to actually extract the password and do the comparison itself, for obvious security reasons).
An LDAP-enabled application will typically require a way to configure:
hostname/port of LDAP server
search base (e.g., dc=mycompany,dc=org)
user search filter (e.g. "(|(cn=%userid)(mail=%userid))")
application credentials for the initial LDAP query (unless the server allows anonymous queries)
You will most likely need to support SSL, so you don't send the user's password to the LDAP server in cleartext.
This strongly depends on your technology.
If you are directly querying the ldap you have to care about performance e.g. how fast are your queries, how often you have to do them and if you server is sized appropriately.
If you are using some kind of container that it providing SSO then these will usually have some caching features, etc.. There you have to check if everything is working properly.
You would need ldap_connect and ldap_bind functions. Also once you are binded with the ldap server you can search for entries on the ldap and get the user information.
You should surely check if encodings are handled properly (try umlauts or other weird stuff in passwords).
And take care to test with the LDAP servers you care about, Active Directory, OpenLdap, Apaches LDAP Server, and a few others, they might show different behaviours.
Also check if the client handles 'referrals' correct, e.g. when you try to lookup users in a different domain (e.g. two AD forrests with forrest trust installed).

Does LDAP provide a token after binding, so I don't have to send credentials every time?

I have a web application (PHP, but doesn't matter). It uses LDAP for authentication (already working), and it allows users to search LDAP (already working).
But when searching, I use a generic process account to bind() and then run the search().
What I would like is to use the LDAP account that logs in to be the same account that binds for the searching. But the only way I see to do that is to store the user's credentials in the sessions (bad!).
Nutshell: can I get a "state/session/??" token from LDAP, to bind() and then search() on subsequent http requests?
(btw, using Active Directory.)
Basic LDAP doesn't provide anything like this. The credentials that you present when binding are used for the rest of the connection, so if you could keep an LDAP connection open across multiple HTTP requests (and share LDAP connections among however many server jobs you have running), then you could avoid saving credentials.
There are various extensions to LDAP floating around (including several within Active Directory), so it's possible that one of those adds sessions-across-connections, but if so, I'm not aware of it.
As a sort-of-workaround, because Active Directory supports GSSAPI and because of how Kerberos works, you ought to be able to use the user's credentials to request a Kerberos ticket for accessing LDAP then store that ticket as your "state/session/??" token. This Kerberos ticket would only be valid for accessing LDAP and would auto-expire, so this would avoid the pitfalls of storing the user's credentials in the session. I don't know if your LDAP library supports GSSAPI and would give you enough control to do this or not.