I'm learning about Amazon Neptune, and noticed that:
IAM authentication is not enabled by default
IAM authentication requires AWS Signature v4 for API calls, which increases application complexity
By default, it seems that Amazon Neptune uses anonymous authentication, as I didn't have to provide any API keys, username / password combinations, or certificates for authentication. Additionally, the code sample provided by AWS doesn't include any authentication details.
It appears that the only default security options for Amazon Neptune are network-level VPC Security Groups.
According to the What is Neptune? documentation, the service claims to be "highly secure." In my opinion, a service that does not enable application-level authentication by default is not "highly secure."
Neptune provides multiple levels of security for your database. Security features include network isolation using Amazon VPC, and encryption at rest using keys that you create and control through AWS Key Management Service (AWS KMS). On an encrypted Neptune instance, data in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.
Question: Why does Amazon Neptune use an insecure configuration by default, and is there a way to enable authentication without using the complicated IAM integrated authentication?
You've got some very valid points in there, so let me go through them in detail by providing some context.
By default, it seems that Amazon Neptune uses anonymous authentication...
This is intentional, for a reason. The query languages that Neptune supports right now are Gremlin and SPARQL, both of which are built on top of HTTP/HTTPS without any sort of auth (Basic Auth is supported in Gremlin, but that is not something clients use in production anyway; I'd need at least some form of message-digest auth to call it secure, but unfortunately the language spec does not include this). As these languages are open, there is a lot of open-source client code out there that assumes it is dealing with an unauthenticated endpoint. As a result, purely from an adoption point of view, Neptune chose to keep its request layer unauthenticated by default. If you explore other DB engines within AWS (say Aurora MySQL), the backing DB engine does support auth as its default posture.
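To make this concrete, here is a minimal sketch (Python with the requests library; the cluster endpoint is a placeholder) of what an unauthenticated Gremlin-over-HTTP query against a Neptune endpoint looks like when IAM auth is left disabled:

```python
import requests

# Placeholder endpoint: substitute your own cluster endpoint. Port 8182 is
# Neptune's default for Gremlin and SPARQL over HTTP(S), and the call only
# works from inside the cluster's VPC.
NEPTUNE = "https://my-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com:8182"

# With IAM auth disabled (the default), no credentials of any kind are sent.
response = requests.post(f"{NEPTUNE}/gremlin", json={"gremlin": "g.V().limit(1)"})
print(response.json())
```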
This does not mean that it is the right thing to do, so I'll let someone from the Gremlin/SPARQL community comment on whether the spec should enforce authentication as the default posture or not.
It appears that the only default security options for Amazon Neptune are network-level VPC Security Groups.
SGs provide network-level access control, and we do support TLS 1.2 by default (as of the newest engine versions), so that tightens up your client -> DB connection as well.
The service claims to be "highly secure." In my opinion, a service that does not enable application-level authentication by default is not "highly secure."
In addition to the details called out above, the "highly secure" aspect of Neptune is not limited just to the client -> DB connection. Your data is replicated six ways and stored across three AZs. This involves a lot of communication, not just from the DB but also between the backing storage nodes, and all of it is guarded by industry-standard security protocols. Encryption at rest for the distributed store is an interesting case study on its own. The same standards apply to operator access to the machines, auditing, data safety while snapshotting and restoring, and so on. In short, while I do agree that the default posture should be SigV4 (or some open-standard) auth enabled, I want to make sure you get some clarity on why we claim to be a highly secure DB, much like any other product that AWS provides.
Is there a way to enable authentication without using the complicated IAM integrated authentication?
SigV4 is the standard that most AWS services support. I do agree that it would have been a lot easier if there were an SDK that customers could use directly. We did vend out SigV4 plugins for some of the clients (especially Java and Python), and they have actually seen pretty good uptake. Do try them out and share feedback on which areas of the integration seemed painful, and we'd be more than happy to take a look.
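To give a rough idea of what those plugins take care of, here is a sketch of SigV4-signing a Gremlin HTTP request by hand with botocore (endpoint, region, and credentials are placeholders; the official plugins wrap this up for you):

```python
import json
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.session import Session

# Placeholders: substitute your own cluster endpoint and region.
ENDPOINT = "https://my-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com:8182/gremlin"
REGION = "us-east-1"
payload = json.dumps({"gremlin": "g.V().limit(1)"})

# Sign the request using credentials from the default provider chain.
credentials = Session().get_credentials()
aws_request = AWSRequest(method="POST", url=ENDPOINT, data=payload,
                         headers={"Content-Type": "application/json"})
SigV4Auth(credentials, "neptune-db", REGION).add_auth(aws_request)

# Send the same payload with the SigV4 headers that were just added.
response = requests.post(ENDPOINT, data=payload,
                         headers=dict(aws_request.headers.items()))
print(response.json())
```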
EDIT 1: The OP discussion here was around security between the client and the database, so the security practices in the opaque backing data store that I've quoted above aren't really relevant. In other words, the discussion here is entirely around the data plane API of Neptune and whether we could be secure by default rather than opt-in.
There are primarily two ways to authenticate using Google's reCAPTCHA Enterprise in a non-Google cloud environment (we use AWS). Google recommends using Service Accounts along with their Java client. However, this approach strikes me as less preferable than the second way Google suggests we can authenticate, namely, using Api Keys. Authenticating via an Api Key is easy, and it's similar to how we commonly integrate with other third-party services, except rather than a username and password that we must secure, we have an Api Key that we must secure. Authenticating using Service Accounts, however, requires that we store a Json file on each environment in which we run reCAPTCHA, create an environment variable (GOOGLE_APPLICATION_CREDENTIALS) to point to that file, and then use Google's provided Java client (which leverages the environment variable/Json file) to request reCAPTCHA resources.
If we opt to leverage Api Key authentication, we can use Google’s REST Api along with our preferred Http Client (Akka-Http). We merely include the Api Key in a header that is encrypted as part of TLS in transit. However, if we opt for the Service Accounts method, we must use Google’s Java client, which not only adds a new dependency to our service, but also requires its own execution context because it uses blocking i/o.
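For illustration, the Api Key call is not much more than this (sketched in Python rather than Akka-Http just to keep it short; the project ID, site key, API key, and action names are placeholders):

```python
import requests

# Placeholders: substitute your own values. The API key must be kept
# secret (e.g. encrypted per environment, as described above).
PROJECT_ID = "my-gcp-project"
SITE_KEY = "my-recaptcha-site-key"
API_KEY = "my-api-key"

def create_assessment(token: str, expected_action: str) -> dict:
    """Ask reCAPTCHA Enterprise to score a token produced by the client."""
    url = (f"https://recaptchaenterprise.googleapis.com/v1/"
           f"projects/{PROJECT_ID}/assessments")
    body = {"event": {"token": token,
                      "siteKey": SITE_KEY,
                      "expectedAction": expected_action}}
    # The key travels in a header (or the ?key= query parameter) and is
    # protected by TLS in transit.
    resp = requests.post(url, json=body,
                         headers={"X-goog-api-key": API_KEY}, timeout=10)
    resp.raise_for_status()
    return resp.json()  # contains riskAnalysis, tokenProperties, etc.
```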
My view is that unless there is something that I'm overlooking (something that the Service Accounts approach provides that an encrypted Api Key does not), we should just encrypt an Api Key per environment.
Our Site Key is locked down by our domain, and our Api Key would be encrypted in our source code. Obviously, if someone were to gain access to our unencrypted Api Key, they could perform assessments using our account, but not only would it be exceedingly difficult for an attacker to retrieve our Api Key, the scenario simply doesn't strike me as likely. Performing free reCAPTCHA assessments does not strike me as among the things that a sophisticated attacker would be inclined to do. Am I missing something? I suppose my question is: why would we go through the trouble of creating a service account, using the (inferior) Java client, storing a Json file on each pod, creating an environment variable, etc.? What does that provide us that the Api Key option does not? Does it open up some functionality that I'm overlooking?
I've successfully used Api Keys and it seems to work fine. I have not yet attempted to use Service Accounts as it requires a number of things that I'm disinclined to do. But my worry is that I'm neglecting some security vulnerability of the Api Keys.
After poring over the documentation a bit more, it would seem that there are only two reasons why you'd want to explicitly choose API key-based authentication over an OAuth flow with dedicated service accounts:
Your server's execution environment does not support OAuth flows
You're migrating from reCAPTCHA (non-enterprise) to reCAPTCHA Enterprise and want to ease migration
Beyond that, it seems the choice really comes down to considerations like your organization's security posture and approved authentication patterns. The choice does also materially affect things like how the credentials themselves are provisioned & managed, so if your org happens to already have a robust set of policies in place for the creation and maintenance of service accounts, it'd probably behoove you to go that route.
Google does mention sparingly in the docs that their preferred method of authentication for reCAPTCHA Enterprise is via service accounts, but they also don't give a concrete rationale anywhere that I saw.
We have a self-developed, proprietary user management system and a self-developed single sign-on (OpenID Connect didn't exist at the time).
Our authentication server and our thick clients are in a private network, without internet access.
The task is to integrate a third-party thick client - its users should authenticate against our existing authentication server.
The general idea is to use an existing, future-oriented framework that offers a standard authentication interface (like Keycloak?) and implement our own OpenID Connect authentication provider (or a User Storage SPI for Keycloak).
Is the approach with Keycloak and the User Storage SPI recommendable, or are there better approaches?
As you say, this is a good choice for meeting your immediate requirements:
External client uses a modern OpenID Connect flow, e.g. OIDC for desktop apps
It connects to an Authorization Server with support for standards-based endpoints
Authorization Server has extensible support for data sources and can potentially reach out to your existing user data source
As an example, Curity, where I work, supports multiple data sources and there is a free community edition if useful.
Any provider that meets the same requirements would be fine though - and I've heard some good things about Keycloak.
LONGER TERM
It makes sense to then gradually update other apps to use modern OAuth and OIDC behaviour.
At a suitable point it is worth making the Authorization Server the only place from which Personally Identifiable user data is accessed, and moving the storage there. See this data privacy article for some advantages of this.
I can vouch for the Keycloak User Storage SPI approach. I recently implemented this for a project and it is working pretty well. For any existing user database, I highly recommend it.
I found some example source on GitHub that you could look at (although it needed some modification to run):
https://github.com/mfandre/KeycloakSPI
I have also written an article summarizing my findings working with Keycloak in case you're interested in other features:
https://dev.to/kayesislam/keycloak-as-oidc-provider-42ip
It's extremely customisable.
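To give a feel for what the third-party thick client sees once Keycloak (with a User Storage SPI behind it) is in place, here is a small sketch of OIDC discovery against a realm. The host and realm name are placeholders, and on older Keycloak versions the path is prefixed with /auth:

```python
import requests

# Placeholders: substitute your own internal host and realm.
KEYCLOAK_BASE = "https://keycloak.internal.example.com"
REALM = "my-realm"

# Any standards-compliant OIDC client can discover the endpoints it needs
# from the realm's discovery document, so no client-specific wiring is needed.
discovery = requests.get(
    f"{KEYCLOAK_BASE}/realms/{REALM}/.well-known/openid-configuration",
    timeout=10,
).json()

print(discovery["authorization_endpoint"])
print(discovery["token_endpoint"])
```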
I've been given the task of implementing a user Sign In / Sign Up flow in a react native app. There is nothing too fancy about this app in particular. The usual Sign In, Sign Up (with SMS Verification) and Password Reset screens suffice. So I went looking for identity providers. Auth0 and AWS Cognito were the most suitable finds. Auth0 was considered too expensive by my managers so we discarded it. Which left me with the Cognito option.
According to the docs, it is possible to completely replace the default UI (which is something that pleases the UI/UX team) while still using the underlying infrastructure. One thing that concerns our team very much is security. According to this and this, authorization requests should only be made through external agents (mobile user browsers). So I went digging into aws-amplify's source code and found that ultimately what it does (and correct me if I'm wrong here, please) is just a simple API request to the AWS auth endpoints, passing my ClientId and other attributes.
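To illustrate, what Amplify sends boils down to something like the following (sketched with boto3 because it is compact; the client ID and user credentials are placeholders, and Amplify actually uses the SRP flow rather than the plain password flow shown here):

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Placeholder app client ID: this is the value that ships inside the
# mobile binary and could be recovered by decompiling the app.
CLIENT_ID = "1234567890abcdefghijklmnop"

# No AWS credentials are needed: the user pool endpoint accepts
# unauthenticated InitiateAuth calls, so request signing is disabled here.
cognito = boto3.client("cognito-idp", region_name="us-east-1",
                       config=Config(signature_version=UNSIGNED))

response = cognito.initiate_auth(
    ClientId=CLIENT_ID,
    AuthFlow="USER_PASSWORD_AUTH",  # must be enabled on the app client
    AuthParameters={"USERNAME": "alice", "PASSWORD": "placeholder-password"},
)
print(list(response))  # e.g. ['AuthenticationResult', 'ChallengeParameters', ...]
```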
This got me a little worried about the security of the interactions with AWS. As AWS endpoints are secure, I know a MITM attack is ruled out.
But what keeps an attacker from decompiling my mobile app, getting access to the ClientId, and making direct requests to AWS? Is AWS Amplify really that insecure, or am I missing something here?
There are many attacks that are possible, but at a high level three stand out:
Credential compromise
Social engineering
DoS
Credential compromise
Your account credentials should not be exposed. STS credentials are time-limited and require you to specifically grant the pool permissions to access AWS services:
https://docs.aws.amazon.com/cognito/latest/developerguide/role-based-access-control.html
You need to grant least privilege; follow the approach outlined here:
https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege
Social engineering attack
An exposed ClientId from a decompiled app could be used, but it would need to be combined with other user data. So, as a general rule, lock down everything linked to your account that could be combined with the ClientId in a social-engineering attack.
DoS
AWS provides what it calls "Advanced Security" for user pools:
https://docs.aws.amazon.com/cognito/latest/developerguide/cognito-user-pool-settings-advanced-security.html
This should be considered a requirement when building Cognito apps; it's comprehensive.
Security threats constantly evolve and AWS does a good job of keeping up with them. There are also security advantages in using:
CloudFront
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/security.html
CloudTrail
https://aws.amazon.com/cloudtrail/
What is the best way to secure a connection between an Elasticsearch cluster hosted on Elastic Cloud and a backend, given that we have hundreds of thousands of users and that I want to handle the authorization logic in the backend itself, not in Elasticsearch?
Is it better to create a "system" user in the native realm with all the read and write accesses (it looks like the user feature is intended for real end users), or to use other types of authentication (though SAML, PKI and Kerberos are also end-user oriented)? Or to use other security means, like IP-based restrictions?
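For illustration, the "system" user option would look something like this from the backend with the official Python client (8.x parameter names; the cloud ID and credentials are placeholders):

```python
from elasticsearch import Elasticsearch

# Placeholders: the cloud ID comes from the Elastic Cloud console, and
# "backend-service" would be a native-realm user (or an API key) scoped to
# just the indices and privileges the backend needs.
es = Elasticsearch(
    cloud_id="my-deployment:placeholder",
    basic_auth=("backend-service", "placeholder-password"),
    # or: api_key="base64-encoded-api-key"
)

# All per-user authorization happens in the backend; Elasticsearch only
# ever sees this single service identity.
print(es.info())
```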
I'm used to Elasticsearch service on AWS where authorization is based on IAM roles so I'm a bit lost here.
Edit: 18 months later, there's still no definitive answer on this. If I had to do it again, I would probably end up using JWT.
There is a lot of discussion about microservice architecture. What I am missing, or maybe what I have not yet understood, is how to solve the issue of security and user authentication.
For example: I develop a microservice which provides a Rest Service interface to a workflow engine. The engine is based on JEE and runs on application servers like GlassFish or Wildfly.
One of the core concepts of the workflow engine is that each call is user-centric. This means that, depending on the role and access level of the current user, the workflow engine produces individual results (e.g. a user-centric task list, or processing an open task that depends on the user's role in the process).
Thus, in my eyes, such a service is not accessible from just anywhere. For example, if someone plans to implement a modern Ajax-based JavaScript application that should use the workflow microservice, there are two problems:
1) To avoid cross-origin problems with JavaScript/Ajax, the JavaScript web application needs to be deployed under the same domain that the microservice runs on.
2) If the microservice forces user authentication (which is the case in my scenario), the application needs to provide a transparent authentication mechanism.
The situation becomes more complex if the client needs to access more than one user-centric microservice that forces user authentication.
I always end up with an architecture where all services and the client application run on the same application server under the same domain.
How can these problems be solved? What is the best practice for such an architecture?
Short answer: check out OAuth, and manage caches of credentials in each microservice that needs to access other microservices. By "manage" I mean: be careful with security. Especially, mind who can access those credentials, and let the network topology be your friend. Create a DMZ layer and other internal layers reflecting the dependency graph of your microservices.
Long answer: keep reading. Your question is a good one because there is no simple silver bullet for what you need, even though your problem is quite common.
As with everything related to microservices that I have seen so far, nothing here is really new. Whenever you need a distributed system doing things on behalf of a certain user, you need distributed credentials to enable such a solution. This has been true since mainframe times, and there is no way around it.
Auto SSH is, in a sense, such a thing. Perhaps it sounds like a glorified way to describe something simple, but in the end it enables processes on one machine to use services on another machine.
In the Grid world, the Globus Toolkit, for instance, bases its distributed security on the following:
X.509 certificates;
MyProxy - manages a repository of credentials and helps you define a chain of certificate authorities up to finding the root one, which should be trusted by default;
An extension of OpenSSH, which is the de facto standard SSH implementation for Linux distributions.
OAuth is perhaps what you need. It is a way to provide authorization with extra restrictions. For instance, imagine that a certain user has read and write permission on a certain service. When you issue an OAuth authorization, you do not necessarily give full user powers to the third party; you may give only read access.
CORS, mentioned in another answer, is useful when the end client (typically a web browser) needs single sign-on across web sites. But it seems that your problem is closer to a cluster in which you have many microservices that are managed by you. Nevertheless, you can take advantage of solutions developed in the Grid field to ensure security in a cluster distributed across sites (for high-availability reasons, for instance).
Complete security is unattainable, so all of this is of no use if credentials are valid forever or if you do not take enough care to keep them secret within whatever receives them. For that purpose, I would recommend partitioning your network into layers, each with a different degree of secrecy and exposure to the outside world.
If you do not want the burden of the infrastructure required for OAuth, you can either use basic HTTP authentication or create your own tokens.
When using basic HTTP authentication, the client needs to send credentials on each request, therefore eliminating the need to keep session state on the server side for the purpose of authorization.
If you want to create your own mechanism, then change your login requests so that a token is returned in response to a successful login. Subsequent requests carrying the same token then act like basic HTTP authentication, with the advantage that this happens at the application level (in contrast to the framework or app-server level with basic HTTP authentication).
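As a sketch of the "create your own tokens" route, here is one possible shape for it using signed JWTs (PyJWT and an HMAC secret are just one choice; the names and the key are placeholders):

```python
import datetime
import jwt  # PyJWT

SECRET = "placeholder-signing-key"  # keep out of source control in practice

def issue_token(user_id: str, roles: list[str]) -> str:
    """Returned as the response to a successful login request."""
    claims = {
        "sub": user_id,
        "roles": roles,
        "exp": datetime.datetime.now(datetime.timezone.utc)
               + datetime.timedelta(minutes=30),
    }
    return jwt.encode(claims, SECRET, algorithm="HS256")

def authenticate(token: str) -> dict:
    """Validate the token carried by each subsequent request."""
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on failure,
    # which the service maps to a 401 response.
    return jwt.decode(token, SECRET, algorithms=["HS256"])
```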
Your question is about two independent issues.
Making your service accessible from another origin is easily solved by implementing CORS. For non-browser clients, cross-origin is not an issue at all.
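As a minimal sketch of that first part (shown with Flask purely as an example; the allowed origin is a placeholder):

```python
from flask import Flask, request

app = Flask(__name__)

# Origins you explicitly trust; placeholder value.
ALLOWED_ORIGINS = {"https://app.example.com"}

@app.after_request
def add_cors_headers(response):
    """Allow cross-origin browser calls from whitelisted origins only."""
    origin = request.headers.get("Origin")
    if origin in ALLOWED_ORIGINS:
        response.headers["Access-Control-Allow-Origin"] = origin
        response.headers["Access-Control-Allow-Methods"] = "GET, POST, OPTIONS"
        response.headers["Access-Control-Allow-Headers"] = "Authorization, Content-Type"
    return response
```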
The second problem about service authentication is typically solved using token based authentication.
Any caller of one of your microservices would get an access token from the authorization server or STS for that specific service.
Your client authenticates with the authorization server or STS either through an established session (cookies) or by sending a valid token along with the request.