I searched Google for this but didn't find a clue.
Suppose a user has full read/write access to a Redis database. Is there any way to connect to the database in read-only mode?
Redis, up to and including v5, has no notion of user privileges because it does not support users at all. As far as the database is concerned there is a single user, and that user is omnipotent.
That said, since v2.6 replicas have been configured by default to reject writes, which effectively makes them a read-only interface (the replica-read-only configuration directive).
Note: it is expected that in its next major version, Redis will provide user access lists.
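For illustration, here is a minimal sketch of connecting to such a replica: reads work, writes are rejected. It assumes the go-redis client and a replica address, neither of which is part of the original answer.

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// The replica address is an assumption for this example. The replica is
	// expected to have "replica-read-only yes" (the default) in redis.conf.
	rdb := redis.NewClient(&redis.Options{Addr: "replica.example.com:6379"})

	// Reads work as usual.
	val, err := rdb.Get(ctx, "somekey").Result()
	fmt.Println(val, err)

	// Writes are rejected, typically with a READONLY error.
	if err := rdb.Set(ctx, "somekey", "value", 0).Err(); err != nil {
		fmt.Println("write rejected:", err)
	}
}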
Related
How can a single instance of Redis be used in a multi-tenant environment, i.e. with multiple different applications using the same Redis instance?
Suppose I have two apps, one is a Baking App and the other is a Delivery App. Both apps will be using the same Redis instance and both will be saving similar keys with similar key patterns (e.g. userid:uuid -> johnsmith). Obviously, using the same Redis will cause collisions. Is there a way to "namespace" the database so that identical keys are isolated from each other, allowing multiple apps to use the same Redis instance concurrently?
This should also work with Redis search: search and indexing would be isolated per app, so a search in the Delivery App namespace would not fetch anything from the Baking App namespace.
How can this be achieved?
There are multiple things you can do:
You can prefix the keys with the app name, e.g. app1:userid:uuid.
You can use the separate logical databases provided by Redis. Redis supports up to 16 DBs by default, so you can store keys for different apps in different DBs and connect to the respective DB to fetch them.
You can use both of the above methods.
To improve security so that one app cannot access another app's data:
Implement Redis ACLs: if you are using Redis 6+, you can leverage Access Control Lists. Create a user with a password for each app and pass those credentials when making the Redis connection. You can also restrict the permissions, commands, and key patterns available to each user (see the sketch after this list).
Data in a different logical DB cannot be accessed directly, i.e. if your connection uses DB 0, you cannot fetch data from DB 1 over that connection.
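A minimal sketch of how these pieces could fit together, assuming the go-redis client; the app names, credentials, and the ACL rule in the comment are illustrative, not prescribed by this answer.

package main

import (
	"context"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// Hypothetical per-app credentials; a matching Redis 6+ ACL user could be
	// created with something like:
	//   ACL SETUSER delivery_app on >delivery-secret ~delivery:* +@all
	rdb := redis.NewClient(&redis.Options{
		Addr:     "localhost:6379",
		Username: "delivery_app",
		Password: "delivery-secret",
		DB:       1, // a separate logical DB for this app
	})

	// Prefix every key with the app name so identical keys never collide.
	rdb.Set(ctx, "delivery:userid:1b4e28ba", "johnsmith", 0)
	rdb.Get(ctx, "delivery:userid:1b4e28ba")
}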
I have an application where different users may connect to different databases (either MySQL or Postgres). What might be the best way to cache those connections across different databases? I saw some connection pools, but they seem designed for many connections to one database rather than connections to many databases.
PS:
To add more context: I am designing a multi-tenant architecture where each tenant connects to one or more databases. One option is to use a map[string]*sql.DB where the key is the URL of the database, but that hardly scales when we have a large number of databases. Or should we add a sharding layer that routes each incoming request by connection URL, so that each machine holds just the right number of database connections in the form of a map[string]*sql.DB?
An example of the kind of software I want to build is https://www.sigmacomputing.com/, where the user can connect to multiple databases to work with different tables.
Neither MySQL nor Postgres allows connection sharing between multiple database users; a single database user is specified in the connection credentials. If you mean that your different users have their own database credentials, then it is not possible to share connections between them.
If by "different users" you mean your application's users, and they share a single database user to access the DB deeper in the app, then you don't need to do anything in particular to "cache" connections: sql.DB keeps and reuses open connections in its pool by default.
Go automatically opens, closes, and reuses DB connections through a *database/sql.DB. By default it keeps up to 2 idle connections open and opens an unlimited number of new connections under concurrency when all open connections are busy.
If you need to fine-tune pool efficiency versus database load, you may want to adjust the sql.DB configuration with its Set* methods, for example SetMaxOpenConns.
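A small sketch of such tuning; the driver, the DSN, and the numbers are illustrative assumptions, not recommendations.

package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/lib/pq" // Postgres driver, assumed for the example
)

func openDB(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(25)                  // cap concurrent connections per database
	db.SetMaxIdleConns(5)                   // keep a few idle connections warm
	db.SetConnMaxLifetime(30 * time.Minute) // recycle connections periodically
	return db, nil
}

func main() {
	db, err := openDB("postgres://user:pass@localhost:5432/app?sslmode=disable") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}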
You seem to have too many unknowns. In cases like this I would apply good old agile: start with a prototype of what you want to achieve using tools you already know, then benchmark the performance. I think you might be surprised how much Go can handle.
Since you understand how to use map[string]*sql.DB for that purpose, I would go with that. You reach some limit? Add another machine behind HAProxy. Solving a scaling problem doesn't necessarily mean writing a new DB pool in Go. Obviously, if you need that kind of power you can always do it; the pgx Postgres driver has its own pool implementation, so you can get inspiration there. However, doing this right now looks like premature optimization: solving a problem you don't have yet. Building a prototype with map[string]*sql.DB is easy; test it, benchmark it, and you will see whether you need more.
P.S. BTW, you will most likely hit the file descriptor limit before you are able to exhaust memory.
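A rough sketch of what such a map[string]*sql.DB prototype could look like; the driver, names, and limits are illustrative assumptions.

package main

import (
	"database/sql"
	"log"
	"sync"

	_ "github.com/lib/pq" // assumed driver
)

// Registry keeps one *sql.DB pool per database URL, opened lazily.
type Registry struct {
	mu    sync.Mutex
	pools map[string]*sql.DB
}

func NewRegistry() *Registry {
	return &Registry{pools: make(map[string]*sql.DB)}
}

// Get returns the pool for a database URL, opening it on first use.
func (r *Registry) Get(url string) (*sql.DB, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if db, ok := r.pools[url]; ok {
		return db, nil
	}
	db, err := sql.Open("postgres", url)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(5) // keep per-tenant pools small; illustrative value
	r.pools[url] = db
	return db, nil
}

func main() {
	r := NewRegistry()
	if _, err := r.Get("postgres://user:pass@tenant-1.example.com/db"); err != nil { // hypothetical URL
		log.Fatal(err)
	}
}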
Assuming you have multiple users and multiple databases with an N-to-N relation, you could keep a map from database URL to database details (explained below).
Which users have access to which databases should be handled elsewhere anyway, e.g. in a config map or a core database. For the database details, we could have a struct like this:
// Wraps a pooled connection with a lock; needs "sync" and "database/sql".
type DBDetail struct {
	sync.RWMutex
	connection *sql.DB
}
The map would go from database URL to the database's details (dbDetail), and when a user performs a write it calls this:
dbDetail.Lock()
defer dbDetail.Unlock()
and for reads, use RLock and RUnlock instead.
As vearutop said, the connections could be a pain, but with this approach you could keep a single connection per database, or enforce a limit by incrementing and decrementing a counter after taking the lock.
There isn’t necessarily a correct architectural answer here. It depends on some of the constraints of the system.
I have an option for using map[string]*sql.DB where the key is the url of the database, but it can be hardly scaled when we have numerous number of databases.
Whether this will scale sufficiently depends on how numerous the databases are expected to be. If tens or hundreds of concurrent users are expected in the near future, a map is probably sufficient. Often a good next step after using a map is to transition to a more full-featured cache (for example https://github.com/dgraph-io/ristretto).
A factor in the decision between a map and a cache is how you imagine the lifecycle of a database connection. Once a connection is opened, can it remain open for the remainder of the process's lifetime, or do connections need to be closed after minutes of no use to free up resources?
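If connections do need to be closed after a period of no use, one hedged way to handle it is to track the last-used time per pool and evict idle pools periodically. All names and thresholds below are illustrative, not part of the original answer.

package main

import (
	"database/sql"
	"sync"
	"time"
)

type entry struct {
	db       *sql.DB
	lastUsed time.Time
}

// ConnCache tracks when each pool was last used so idle ones can be closed.
type ConnCache struct {
	mu    sync.Mutex
	pools map[string]*entry
}

// Touch records use of a pool so the janitor does not close it.
func (c *ConnCache) Touch(url string, db *sql.DB) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.pools[url] = &entry{db: db, lastUsed: time.Now()}
}

// Janitor closes pools that have not been used within the idle window.
func (c *ConnCache) Janitor(idle time.Duration) {
	for range time.Tick(time.Minute) {
		c.mu.Lock()
		for url, e := range c.pools {
			if time.Since(e.lastUsed) > idle {
				e.db.Close()
				delete(c.pools, url)
			}
		}
		c.mu.Unlock()
	}
}

func main() {
	c := &ConnCache{pools: make(map[string]*entry)}
	go c.Janitor(10 * time.Minute) // close pools idle for more than 10 minutes
	select {}                      // block forever in this sketch
}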
Should we have a sharding layer for each incoming request sharded by connection url, so each machine will contain just the right amount of database connections in the form of map[string]*sql.DB?
The right answer here depends on how many processing nodes are expected and whether there are additional benefits to be gained from routing requests to specific machines. For example, row-level caching and isolating users from each other's requests are advantages that would be gained by sharding users across the pool. But a disadvantage is that you might end up with "hot" nodes, because a single user might generate the majority of the traffic.
Usually, a good strategy in cases like this is to be really explicit about the constraints of the problem. Jeff Dean coined a rule of thumb for such situations:
Ensure your design works if scale changes by 10X or 20X but the right solution for X [is] often not optimal for 100X
https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf
So, if in the near future the system needs to support tens of concurrent users, start with the simplest design that supports tens to hundreds of concurrent users (probably a map or cache with no user sharding). That design will have to change before the system can support thousands of concurrent users, but scaling a system is often a good problem to have, because it usually indicates a successful project.
I am working on an application that uses AWS Cognito to store user data, and I am trying to understand how to manage backup and disaster recovery scenarios for Cognito.
Following are the main queries I have:
What is the availability of this stored user data?
What are the possible scenarios with Cognito that I need to take care of before we go into production?
AWS does not have any published SLA for AWS Cognito, so there is no official guarantee for your data stored in Cognito. As to how safe your data is: AWS Cognito uses other AWS services under the hood (DynamoDB, for example, I think), and data in those services is replicated across Availability Zones.
I guess you are asking about disaster recovery scenarios. There is not much you can do on your end. If you use user pools, there is no feature to export user data as of now. Although you can do so by writing a custom script, a built-in backup feature would be much more efficient and reliable. If you use Federated Identities, there is no way to export and re-use identities. If you use datasets provided by Cognito Sync, you can use Cognito Streams to capture dataset changes. Not exactly a stellar way to back up your data.
In short, there is no official word on availability, and no official backup or DR feature. I have heard there are feature requests for these, but who knows when they will be released. Nor is there much you can do by writing custom code or following best practices. The only thing I can think of is to periodically back up your user pool's data with a custom script using the AdminGetUser API. But again, there are rate limits on how often you can call this API, so a backup done this way can take a long time.
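As an illustration only, such a backup script could look roughly like the sketch below. It uses the AWS SDK for Go and the paginated ListUsers call instead of AdminGetUser (fewer API calls); the region and pool ID are hypothetical.

package main

import (
	"encoding/json"
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/cognitoidentityprovider"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	svc := cognitoidentityprovider.New(sess)

	enc := json.NewEncoder(os.Stdout)
	input := &cognitoidentityprovider.ListUsersInput{
		UserPoolId: aws.String("us-east-1_EXAMPLE"), // hypothetical pool ID
	}
	for {
		out, err := svc.ListUsers(input)
		if err != nil {
			log.Fatal(err)
		}
		for _, u := range out.Users {
			// Dump usernames and attributes; passwords cannot be exported.
			enc.Encode(u)
		}
		if out.PaginationToken == nil {
			break
		}
		input.PaginationToken = out.PaginationToken
	}
}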
AWS now offers an SLA for Cognito. In the event they are unable to meet their availability target (99.9% at the time of writing), you will receive service credits.
Even though there are a couple of third-party solutions available, when restoring a user pool the users are created via the admin flow (they are not truly restored; they are re-created by an admin), and they end up in the "Force Change Password" status. The users are therefore forced to change their password using a temporary password, and that has to be facilitated from the front end of the application.
More info : https://docs.amazonaws.cn/en_us/cognito/latest/developerguide/signing-up-users-in-your-app.html
Tools available:
https://www.npmjs.com/package/cognito-backup
https://github.com/mifi/cognito-backup
https://github.com/rahulpsd18/cognito-backup-restore
https://github.com/serverless-projects/cognito-tool
Please bear in mind that some of these tools are outdated and cannot be used. I have tested cognito-backup-restore and it works as expected.
Also, you have to think about how to secure the user information these tools output. They usually create a JSON file containing all the user information (except passwords, which cannot be backed up), and this file is not encrypted.
The best solution so far is to prevent accidental deletion of user pools with AWS SCPs.
I have an application with a database as its backend.
The application follows a sort of pub-sub model, where users post changes to the application and other peers subscribe to those changes. These changes may happen very frequently or periodically, and all of them have to be written to the database.
Now I am being asked to investigate the possibility of replacing this RDBMS with LDAP. They probably want a unified DB for all applications, but in any case I have to find the advantages and disadvantages of both approaches.
I cannot directly compare an RDBMS with LDAP, as I have almost no knowledge of LDAP, though I have tried to learn some.
I understand that LDAP is designed for directory access and is optimized for reads: write once, read many. I have read that frequent writes will reduce the performance of an LDAP server, as each write triggers the indexing process.
To give a scenario regarding indexing in LDAP: my table has only a couple of columns, say Name and Desc. In LDAP I suppose these would become two attributes, Name and Desc. In my scenario it is Desc that will be updated frequently. I assume only Name will be indexed, so even if Desc changes frequently it won't trigger the indexing process.
It is worth mentioning that the database will be hosted on a cloud platform.
I tried to find out the differences, but I could not find anything conclusive.
LDAP is a protocol; REST is a style of service built on HTTP (another protocol). So if the LDAP server is not to be exposed to the internet, how do you want to get the data from it? Since LDAP is the protocol, you would need direct access to the LDAP server. It's like a database server that you would not expose directly to the internet: you would build an interface to encapsulate it, and that might as well be a REST interface.
The point I'm trying to get across is that one is the transfer protocol and storage backend, and the other is the public interface to its data. It's a bit like asking why MySQL is better than a web interface: you'd never make the MySQL server publicly available, but rather encapsulate its protocol inside an application.
REST is an interface. It doesn't matter how you organize your data behind that interface. When you decide that you want to organize it differently, you can do so without the consumers of your API noticing any change. And you can provide different versions of your API as your service improves.
LDAP on the other hand is an implementation. You can't change the way your data is handled without the consumer noticing it. So there's no way to rearrange your backend without affecting the consumer.
With REST you can therefore change the backend from MySQL to PostgreSQL or even to LDAP without the consumer noticing, which you cannot do if the consumer talks LDAP directly.
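A small sketch of that idea in Go (all names are illustrative): the REST handler depends only on an interface, so the storage behind it, whether MySQL, PostgreSQL, or LDAP, can be swapped without API clients noticing.

package main

import (
	"encoding/json"
	"net/http"
)

// UserStore is the only thing the REST layer knows about the backend.
type UserStore interface {
	Lookup(name string) (map[string]string, error)
}

type server struct{ store UserStore }

func (s *server) handleUser(w http.ResponseWriter, r *http.Request) {
	user, err := s.store.Lookup(r.URL.Query().Get("name"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	json.NewEncoder(w).Encode(user)
}

// memStore stands in for a MySQL- or LDAP-backed implementation; either
// would satisfy UserStore, and switching is invisible to REST clients.
type memStore map[string]map[string]string

func (m memStore) Lookup(name string) (map[string]string, error) { return m[name], nil }

func main() {
	srv := &server{store: memStore{"alice": {"mail": "alice@example.com"}}}
	http.HandleFunc("/user", srv.handleUser)
	http.ListenAndServe(":8080", nil)
}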
Hope that helps
Now that we finally know what you're actually asking, which has nothing to do with your title, the body of your question, or REST, the simple answer is that there is no particular reason to believe that an LDAP server will perform significantly better than an RDBMS in this application, with two riders:
it may not even be feasible, due to the schema issue, and
if it is feasible it may not be semantically suitable, due to the lack of ACID properties, lack of JOINs, and the other issues mentioned in comments.
I will state that this is one of the worst formulated questions I have seen here for some considerable time, and the difficulty of extracting the actual question was extreme.
I want to build my site so that the forum pages use the forum MySQL user, which has privileges on mydb.forum_table and mydb_forum_table2,
and the profile pages use the profile user, which has access to mydb.users and mydb.profiefields,
and so on for the photo gallery, blog, chat, and the rest.
Is this the right way to do it? I'm thinking of the principle of least privilege, but I wonder why I haven't seen other big, well-known CMSes do it.
One of the critical resources for a database is connections. Databases are generally configured with a maximum number of connections, and each time a process needs to make a query, it needs a connection to do so. Database connections are expensive objects to create: they take time and memory, and most importantly, connections are established for a specific user. The generally accepted best practice for web applications is that when the application needs a database connection, it checks a pool for an available one. If there's a free connection in the pool, the web app pulls that connection, uses it as necessary, and then returns it to the pool for reuse. If there are no free connections, the app creates a new one, uses it, and then places it in the pool for reuse.
If you're dealing with an application that uses multiple database users (for privilege management) and you need connection pooling, your application will have to maintain many pools (one per user), which usually means holding at least one connection for every database user it uses. This is inefficient, error-prone, and needlessly complex.
If you're truly intent on limiting your application's access to data, then you should probably investigate how much support your database has for views. If views are well supported, you can create a view (or views) customized to the needs of any given portion of your application.
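A hedged sketch of that idea, executed from Go: create a narrow view and grant the module's user access to the view only. The table, view, and user names are made up for illustration, and the exact GRANT statement depends on your MySQL setup.

package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // assumed MySQL driver
)

func main() {
	// Administrative connection; credentials are illustrative.
	admin, err := sql.Open("mysql", "root:secret@tcp(localhost:3306)/mydb")
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	stmts := []string{
		// The forum module only ever sees these columns.
		`CREATE OR REPLACE VIEW forum_posts_v AS
		   SELECT id, author_id, title, body FROM forum_table`,
		`GRANT SELECT ON mydb.forum_posts_v TO 'forum'@'localhost'`,
	}
	for _, s := range stmts {
		if _, err := admin.Exec(s); err != nil {
			log.Fatal(err)
		}
	}
}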
My recommendation would be to stick to a single database user, and then use the time you just freed up to do more debugging of your application. You'll get better results, and will aggravate fewer DBAs.
If I understand correctly, the question is about implementing module access control based on the permissions on the tables that are used by the module.
I think it would be complicated to maintain (the link between modules and tables), and slow to have to check the permissions on each table accessed by a module.