Using object storage with authorization for a web app - amazon-s3

I'm helping develop a web application (front end and back end) that uses OpenID Connect (with Auth0) for authentication and authorization.
The web app requires authentication to access some public and some restricted information (restrictions are per-user or depend on certain group-related rules).
We want to provide upload/download features for documents such as .pdf files, and we have set up MinIO (pretty similar to AWS S3) for public documents.
However, we can't wrap our heads around restricted-access files:
should we implement OIDC on MinIO so that users access the buckets directly, but with temporary access tokens, allowing for a fine-grained authorization policy,
or should the back-office be the only one to hold keys to MinIO and act as the intermediary between the object storage and users?
Looking for good practices here, thanks in advance for your help.

Interesting question, since PDF docs are web static content unless they contain sensitive data. I would aim to separate secured (API) and non-secured (web) concerns on this one.
UNSECURED RESOURCES
If there is no security involved, connecting to a bucket from the front end makes sense. The bucket contents can also be distributed to a content delivery network, for best global performance. The PDF can be considered a web resource.
SECURED RESOURCES
If a PDF doc contains sensitive data, requests for it need to be treated as API requests. APIs should receive an access token and enforce access to documents via scopes and claims.
You might use a Documents API for this. The implementation might still connect to a bucket, but this might be a different bucket that the browser does not have access to.
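As a rough illustration of what "enforce access via scopes and claims" could look like inside such a Documents API, here is a minimal Python sketch. It assumes the access token has already been validated (signature, issuer, audience, expiry) and decoded into a dict. The scope name read:documents, the groups claim and the Document fields are hypothetical names chosen for the example, not something Auth0 or MinIO defines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    """Metadata the API keeps for each object in the private bucket (hypothetical shape)."""
    key: str
    owner_id: str
    allowed_groups: frozenset = field(default_factory=frozenset)


def can_read(claims: dict, doc: Document) -> bool:
    """Decide access from the claims of an already validated access token."""
    # Scope check: the token must have been issued for reading documents.
    scopes = set(claims.get("scope", "").split())
    if "read:documents" not in scopes:  # assumed scope name
        return False
    # Per-user rule: the owner can always read their own document.
    if claims.get("sub") == doc.owner_id:
        return True
    # Group rule: sharing at least one group with the document grants access.
    groups = set(claims.get("groups", []))  # assumed custom claim
    return bool(groups & doc.allowed_groups)
```

Only when can_read returns True would the API go on to fetch the object from the private bucket with its own storage credentials.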
SUMMARY
This type of solution is often clearer if you think in terms of URL design. E.g. the front end might have 2 document URLs:
publicDocs
secureDocs
By default I would treat docs that users upload as secure, unless they select an upload option such as make public.

Related

How to restrict public user access to s3 buckets or minIO?

I have a question about MinIO or S3 policies. I am using a stand-alone MinIO server for my project. Here is the situation:
There is only one admin account that receives files and uploads them to the MinIO server.
My users need to access just their own uploaded objects. I mean a user is not supposed to see other people's objects publicly (e.g. by visiting a direct link).
Admin users are allowed to see all objects in any circumstances.
1. How can I implement such policies for my project, considering I have my own database for user authentication, and how can I combine the two to authenticate the user?
2. If that is not possible, what other options do I have to ease the process?
Communicate with your storage through the application. Do the policy checks, authentication and authorization in the app, store files to or fetch them from storage, and return the proper response. I guess this is the only way to restrict uploading/downloading files with MinIO.
If you're using a framework like Laravel, the built-in S3 driver works perfectly with MinIO; otherwise it's just a matter of an HTTP call. MinIO provides HTTP APIs.
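A minimal Python sketch of that idea, under stated assumptions: the user dict (with id and role) comes from your own authentication layer, objects are stored under a per-user prefix, and a plain dict stands in for the real MinIO client call. All names here are hypothetical.

```python
class Forbidden(Exception):
    """Raised when the caller may not read the requested object."""


def object_key(owner_id: str, filename: str) -> str:
    # Assumed layout: each user's objects live under their own prefix.
    return f"users/{owner_id}/{filename}"


def fetch_for_user(user: dict, key: str, storage: dict) -> bytes:
    """Check the policy in the application, then read from storage."""
    is_admin = user.get("role") == "admin"
    # The trailing slash keeps user "4" from matching the prefix of user "42".
    owns_key = key.startswith(f"users/{user['id']}/")
    if not (is_admin or owns_key):
        raise Forbidden(key)
    # In a real app this would be a call to the storage server, not a dict lookup.
    return storage[key]
```

Because the bucket itself stays private, a direct link to an object is useless without going through this check.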

How to get short lived access to specific Google Cloud Storage bucket from client mobile app?

I have a mobile app which authenticates users on my server. I'd like to store images of authenticated users in Google Cloud Storage bucket but I'd like to avoid uploading images via my server to google bucket, they should be directly uploaded (or downloaded) from the bucket.
(I also don't want to display another Google login to users to grant access to their bucket)
So my best-case scenario would be that when a user authenticates to my server, my server also generates a short-lived access token for a specific Google storage bucket with read and write access.
I know that service accounts can generate access tokens, but I couldn't find any documentation on whether it is good practice to pass these access tokens from the server to the client app, or whether it is possible to limit the scope of an access token to a specific bucket.
I found the authorization documentation quite confusing, so I am asking here: what would be the best-practice approach to give my app access to Cloud Storage in this case?
I think you are looking for signed urls.
A signed URL is a URL that provides limited permission and time to
make a request. Signed URLs contain authentication information in
their query string, allowing users without credentials to perform
specific actions on a resource.
Here you can see more about them in GCP. Here you have an explanation of how you can adapt them for your program.
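To make the idea concrete, here is a small Python sketch of how a signed URL works in general: the server signs the method, path and expiry with a secret, and whoever serves the object recomputes the signature before honoring the request. This is a generic HMAC illustration, not Google Cloud Storage's actual signing scheme; the secret, parameter names and function names are all assumptions made for the example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret-known-only-to-the-server"  # hypothetical signing key


def _signature(method: str, path: str, expires: int) -> str:
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(method: str, path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return the path plus query parameters carrying the expiry and signature."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _signature(method, path, expires)})
    return f"{path}?{query}"


def verify_url(method: str, url: str, now: Optional[float] = None) -> bool:
    """Accept the request only if it has not expired and the signature matches."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    return hmac.compare_digest(sig, _signature(method, parts.path, expires))
```

In the mobile app scenario, the server would hand out something like sign_url("PUT", "/images/user-1/avatar.png", 900) after authenticating the user, so the client can upload directly without ever holding service account credentials.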

Long lived key/token based way to download google storage bucket objects with curl?

O.k. my fellow devops and coders. I have spent the last week trying to figure this out with Google (GCP) Cloud Storage objects. Here is my objective.
The solution needs to be light weight as it will be used to download images inside a docker image, hence the curl requirement.
The GCP bucket and object needs to be secure and not public.
I need a "long" lived ticket/key/client_ID.
I have tried the OAuth 2.0 setup that Google's documentation mentions, but every time I want to set up an OAuth 2.0 key I do not get the option to have "offline" access. And to top it off, it requires you to enter the source URLs that will be making the auth request.
Also, Google Cloud Storage does not support the key= parameter like some of their other services. So here I have an API key for my project as well as an OAuth JSON file for my service user, and they are useless.
I can get a curl command to work with the temp OAuth bearer key but I need a long term solution for this.
RUN curl -X GET \
-H "Authorization: Bearer ya29.GlsoB-ck37IIrXkvYVZLIr3u_oGB8e60UyUgiP74l4UZ4UkT2aki2TI1ZtROKs6GKB6ZMeYSZWRTjoHQSMA1R0Q9wW9ZSP003MsAnFSVx5FkRd9-XhCu4MIWYTHX" \
-o "/home/shmac/test.tar.gz" \
"https://www.googleapis.com/storage/v1/b/mybucket/o/my.tar.gz?alt=media"
A long term key/ID/secret that will allow me to download a GCP bucket object from any location.
The solution needs to be lightweight as it will be used to download
images inside a docker image, hence the curl requirement.
This is a vague requirement. What is lightweight? No external libraries, everything written in assembly language, must fit in 1 KB, etc.
The GCP bucket and object needs to be secure and not public.
This is a normal requirement. With some exceptions (static file storage for websites, etc.), you want your buckets to be private.
I need a "long" lived ticket/key/client_ID.
My advice is to stop thinking in terms of "long-term keys". The trend in security is toward short-lived keys. In Google Cloud Storage, seven days is considered long-term. 3600 seconds (one hour) is the norm almost everywhere in Google Cloud.
For Google Cloud Storage you have several options. You did not specify the environment, so I will cover user credentials, service account credentials, and presigned-URL based access.
User Credentials
You can authenticate with User Credentials (e.g. username@gmail.com) and save the Refresh Token. Then when an Access Token is required, you can generate one from the Refresh Token. In my website article about learning the Go language, I wrote a program on Day #8 which implements Google OAuth, saves the necessary credentials and creates Access Tokens and ID Tokens as required with no further "login" required. The comments in the source code should help you understand how this is done. https://www.jhanley.com/google-cloud-and-go-my-journey-to-learn-a-new-language-in-30-days/#day_08
This is the choice if you need to use User Credentials. This technique is more complicated, requires protecting the secrets file but will give you refreshable long term tokens.
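For reference, exchanging a saved Refresh Token for a new Access Token is a single form-encoded POST to Google's OAuth 2.0 token endpoint. The Python sketch below only builds the request and parses a response. The function names are mine, and the client ID, client secret and refresh token are placeholders you would load from your protected secrets file.

```python
import json
import urllib.request
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id: str, client_secret: str, refresh_token: str):
    """Build (but do not send) the refresh_token grant request."""
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode()
    return urllib.request.Request(
        TOKEN_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_token_response(raw: bytes) -> tuple:
    """Extract the access token and its lifetime in seconds from the JSON reply."""
    payload = json.loads(raw)
    return payload["access_token"], int(payload["expires_in"])
```

Sending it would look like: with urllib.request.urlopen(req) as resp: token, ttl = parse_token_response(resp.read()). The resulting token then goes into the Authorization: Bearer header of the download request.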
Service Account Credentials
Service Account JSON key files are the standard method for service-to-service authentication and authorization. Using these keys, Access Tokens valid for one hour are generated. When they expire new ones are created. The max time is 3600 seconds.
This is the choice if you are programmatically accessing Cloud Storage with programs under your control (the service account JSON file must be protected).
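Since the tokens last at most an hour, long-running programs usually wrap token creation in a small cache that fetches a new token shortly before the old one expires. Here is a minimal Python sketch of that pattern. The fetch callable is a stand-in for whatever actually mints the token from the service account key, and the class name and the 60-second margin are assumptions of the example.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, int]],
                 clock: Callable[[], float] = time.time, margin: int = 60):
        self._fetch = fetch      # returns (token, lifetime_seconds)
        self._clock = clock      # injectable so the cache can be tested
        self._margin = margin    # renew this many seconds before expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch()
            self._expires_at = now + lifetime
        return self._token
```

Callers simply use cache.get() every time they need a token and never handle expiry themselves.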
Presigned-URLs
This is the standard method of providing access to private Google Cloud Storage objects. This method takes the object URL and generates a signature with an expiration, so that objects can be accessed for a defined period of time. One of your requirements (which is unrealistic) is that you don't want to use source URLs. The maximum lifetime is seven days.
This is the choice if you need to provide access to third-parties to access your Cloud Storage Objects.
IAM Based Access
This method does not use Access Tokens, instead, it uses Identity Tokens. Permissions are assigned to Cloud Storage buckets and objects and not to the IAM member account. This method requires a solid understanding of how Identities work in Google Cloud Storage and is the future direction for Google security - meaning for many services access will be controlled on a service/object basis and not via roles that grant wide access to an entire service in a project. I talk about this in my article on Identity Based Access Control
Summary
You have not clearly defined what will be accessing Cloud Storage, how secrets are stored, if the secrets need to be protected from users (public URL access), etc. The choice depends on a number of factors.
If you read the latest articles on my website I discuss a number of advanced techniques on Identity Based Access Control. These features are starting to appear on a number of Google Services in the beta level commands. This includes Cloud Scheduler, Cloud Pub/Sub, Cloud Functions, Cloud Run, Cloud KMS and soon more. Cloud Storage supports Identity Based Access which requires no permissions at all - the identity is used to control access.

AWS S3: Keep token file accessible to application, not to public users

I'm hosting a static website on S3 that uses an API. My auth token for the API is stored in a JS file, but I want to keep that obscured from public users, but NOT from my application.
At the moment, it looks like you need to make S3 buckets (and all of their files) publicly accessible by everyone, but I want to mask my config file. Is this possible, and if so, what is the best way to do it?
Thanks!
Amazon provides a service called Lambda. It is a serverless computing service. Your use case can be solved using it.
You can write an auth function in Lambda where you can place the api auth token.
You are not really going to be able to completely hide your token, no matter how you mask it: ultimately your browser issues an API call and passes along the credentials, and anyone who cares to look can see them.
What you want to do is use something like AWS Cognito to generate temporary, restricted tokens for each user, even anonymous users.
Cognito Identity supports the creation and token vending process for
unauthenticated users as well as authenticated users. This removes the
friction of an additional login screen in your app, but still enables
you to use temporary, limited privilege credentials to access AWS
resources.
https://aws.amazon.com/cognito/faqs/
If you do this, someone can still see the token being used, but it is time and permission limited - not the keys to the kingdom, so they can't do much with it.

Cloud storage services and session-based file-URLs

I have the following use-case that I am seeking a solution for:
Our website shares files with our clients. The files are stored on a third-party cloud service, while the file access permissions are managed on our website. When a client on our site requests a file that they have permission to see, it will be served directly from the cloud service (instead of through our own web server, using our CPU, RAM and bandwidth).
I see that services like Amazon S3 and Google Cloud Storage use signed URLs with a timeout for this purpose, but I would prefer a solution where that URL is only usable by the client who requested the resource (and not by everyone who has the link during the lifetime of the URL). The reason is that it feels wrong to rely on a validity period of arbitrary length instead of using a one-time token or otherwise validating access to the resource before the request is completed.
Do any of the major services provide a feature that would allow for this? Or is it considered "safe enough" to protect sensitive data behind a random URL + timeout period (to me it feels like the answer to the latter is "no")?