I know this is a really basic question, yet I did not find much information on it. Say I want to develop a basic multi-user application for notes. In my database I have a table where I store user IDs, usernames and passwords.
I now want to store the users' notes, but a user should only be able to see their own notes. What is the best practice for this? The two possibilities that come to my mind are
Create a table for each user where you store their notes (this probably scales horribly)
Have one big notes table and store the user ID with each note as a foreign key (it just feels a bit "off" to have everything stored in one big table)
Is one of these two ideas used in this exact way in large-scale, real-world projects? If so, is there anything else one has to pay attention to?
In general you need the 2nd option.
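As a rough sketch of what that looks like (table and column names here are just placeholders), every note carries the ID of its owner as a foreign key, and every query filters on it:

    -- One users table, one notes table; each note points back at its owner.
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      VARCHAR(100) NOT NULL UNIQUE,
        password_hash VARCHAR(255) NOT NULL
    );

    CREATE TABLE notes (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),  -- foreign key to the owner
        title   VARCHAR(200),
        body    TEXT
    );

    -- A user only ever sees their own notes because every query filters on user_id:
    SELECT id, title, body
    FROM notes
    WHERE user_id = :current_user_id;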
My advice: please don't write your own auth functions, because that is very hard to get right as a beginner. Much better for this type of application (notes) is to use a serverless backend.
For example Firebase, Supabase and so on.
There you get a database, authentication, row-level security, file storage etc. out of the box.
I'm trying to build a mobile app that involves users following each other. I've seen posts (here) that say it is a cardinal sin to store a user's followees and followers as a list in a SQL database, as each "cell" should only store one discrete value.
However, is this the case for NoSQL, document-based databases? What are the pros and cons of storing followers and followees as a list in the user document, versus storing them in a separate collection?
The only ones I can see right now are that retrieving the follower/followee data could be faster with the former method, since you don't have to index the entire follower/followee collection as with the latter method (or is the time difference negligible?). On the other hand, the former would require 2 writes every time someone follows/unfollows another user, which may be disadvantageous for billing in cloud databases, but might not be a problem if the database is hosted locally (?)
I’m very new to working with databases so I’m hoping to get some insight from more experienced people about long term/large scale effects of this choice. Thanks!
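For reference, the relational design those posts describe keeps one row per follow relationship in a join table (the names below are just illustrative):

    CREATE TABLE follows (
        follower_id INTEGER NOT NULL REFERENCES users(id),
        followee_id INTEGER NOT NULL REFERENCES users(id),
        followed_at TIMESTAMP,
        PRIMARY KEY (follower_id, followee_id)
    );

    -- Everyone user 42 follows:
    SELECT followee_id FROM follows WHERE follower_id = 42;
    -- Everyone who follows user 42:
    SELECT follower_id FROM follows WHERE followee_id = 42;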
I am working at a company that merged with another company a while ago.
There we have several business units that are basically equivalent: one in Europe and one in China. We already have an in-house MariaDB database, which we want to start sharing between the sites.
The problem is that there are different GDPR regulations and contracts that prohibit sharing certain data across sites. So what I can't do is replicate data across sites and then just hide it from the user in the frontend. The private data has to stay at the facility it belongs to.
So my idea was to split each existing table that contains potentially sensitive information into two tables.
Say, table_contracts_private and table_contracts_public.
This still seems pretty doable with basic database replication, replicating only the public tables across sites. But how would you go about publishing private data? And how would I best combine private and public data? Just by using a view?
I just could not find any good mechanisms for this, especially because we would also like to avoid data duplication: the private entries would need to be removed and replaced by the public ones, which would also entail changing all referencing IDs.
Is this a possible application of sharding?
I'd be really grateful, if someone could point me in the right direction, or if someone has a demo project with similar requirements that I could check out.
Cheers
Is this a possible application of sharding?
I wouldn't think so. Sharding is a performance optimization method. What you need is to support legal constraints. Those are two very different problems.
I think you are on the right track. I call this a "walled garden" approach. You create a database with all the non-PII information, using ids only. Nothing that even remotely identifies people directly: no names, addresses, phones, credit cards, and so on. This can be tricky. In some jurisdictions, combinations of demographics count as PII.
Some of those ids then refer to another database where you store all the sensitive information; this is the "walled garden". I would recommend that this second database be on a separate server. It has a very restricted access list. And this is where you implement requirements for things like "forgetting" a customer.
In any case, the point is that sharding is not the right approach. You want an application redesign with privacy and security as the top priorities. Happily, this is not actually that hard to implement, although if the databases keep changing, you may need periodic auditing. For instance, in one database I worked with, we discovered that "coupon codes" sometimes contained unencrypted email addresses. Arrgggh!
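As a rough illustration of the split (table and column names here are invented for the example), the shared database keeps only non-identifying data plus an opaque id, while the walled-garden database on its own, restricted server maps those ids to the sensitive details:

    -- Shared/replicated database: no PII, only opaque ids.
    CREATE TABLE contracts (
        id        INTEGER PRIMARY KEY,
        party_ref INTEGER NOT NULL,   -- opaque id, resolved only inside the walled garden
        value_eur DECIMAL(12,2),
        signed_on DATE
    );

    -- Walled-garden database (separate server, restricted access list): PII lives here.
    CREATE TABLE parties_private (
        party_ref INTEGER PRIMARY KEY,
        full_name VARCHAR(200),
        address   VARCHAR(500),
        email     VARCHAR(200)
    );

    -- "Forgetting" a customer then only touches the walled garden:
    DELETE FROM parties_private WHERE party_ref = 123;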
I am developing a VB.Net application that will run on a LAN, with MS Access as the back end. I have developed many single-user applications, but I don't know about multi-user setups, LANs, managing the DB, etc. How do I make the program multi-user on a LAN, where data will be accessed at the same time? How do I manage such things?
Please give me some help and guidance.
Thanks
Your VB application does not care how many people run it.
Your database, with MS Access, has some serious issues with multiple users. Get away from it if you can. SQL Server has a free version called SQL Express. If you only plan on 2 people, you might be OK with Access for a while but be prepared to support it more.
That was all the easy stuff, now you have to think about how you are going to handle multiple users trying to access and update the same data (concurrency).
Imagine this: you are a user looking at employee record 1, and so is someone else. You change the birthday and save. Then the other user changes their supervisor and saves. How do you know something changed? What do you do if something changed? These are questions I cannot answer for you; you must decide based on your situation.
There are 2 main types of concurrency, optimistic and pessimistic. See this link for a great explanation and discussion of them: optimistic-vs-pessimistic-locking
You can look at this on a table-by-table basis.
If a table is never updated, you don't have to worry about concurrency.
If a table is rarely updated, like a table of states, you can decide if it is worth the extra effort to add concurrency.
Pretty much everything else should have some type of concurrency handling.
Now, the million dollar question, how?
You will find as many ways to handle concurrency as you will find colors in the rainbow. Here are some of the ones I like:
Simple number that you increment with each save. Small and easy (see the sketch after this list).
DateTime stamp - As long as you don't expect to ever have 2 people save the same record during the same second, this is easy. (I personally don't like it by itself.)
User name - Pretty simple; it gives a little bit of an audit by recording who last inserted/edited the record, but it doesn't handle an issue I have seen too often. Imagine the same scenario as above, but you have 2 instances of record 1 open. Now you change the data again, maybe the supervisor, and when you save, you overwrite the changes from your first save with those of the second save.
Guid - VB can create a guid, SQL Server can create a guid and so can Access. It is nice and unique, and most importantly, you can create it on the client so you don't have to requery the database after you save the record to get a refreshed copy.
A combination of these. I like 2 and 3 myself: it gives a mini audit and is unique to the user.
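A minimal sketch of the first approach (names are illustrative, T-SQL-style parameters assumed): the UPDATE only succeeds if the version you originally read is still current, so a save by another user shows up as zero rows affected.

    -- Add a version column to the table you want to protect.
    ALTER TABLE employees ADD row_version INT NOT NULL DEFAULT 0;

    -- Read the record together with its current version.
    SELECT id, birthday, supervisor, row_version FROM employees WHERE id = 1;

    -- Save only if nobody else saved in the meantime, bumping the version as you go.
    UPDATE employees
    SET    birthday = @new_birthday,
           row_version = row_version + 1
    WHERE  id = 1
      AND  row_version = @version_read_earlier;
    -- If this affects 0 rows, someone else changed the record first:
    -- reload it and let the user decide what to do.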
If you use a DataAdapter, by default MS will assume concurrency checking means comparing EVERY field to make sure it did not change. This works, but it is completely unscalable and should not be done.
All of this depends on the size of your application and how you see it being used. Definitely do some more research before you settle on a decision.
There are a number of solutions here.
If I may suggest a drastic alternative, have you considered pairing the client running on the user's computer with a server component (through a web service)? A simpler alternative would be for the client to talk directly to a SQL Server (or other database) instance over the network.*
*I'm not a fan of having client-side apps talk directly to the database. It will mean maintenance headaches in the future, but I included it to give you options.
I found this random example via Google so YMMV.
I am building a web application that will essentially give authenticated users access to massive amounts of data, but I don't want users to have only read-only access. If records are missing fields, or a user has found information to fill those fields or correct already populated data, I would like the user to be able to do so.
However, I'm worried about mean-spirited folks coming in and simply clearing out records out of sheer boredom and am wondering what the best way to prevent this from happening would be.
My first thought is to have users submit edits, with a page devoted to batch approvals of these edits that I or other trusted individuals can skim over. Of course, this would be time-consuming (especially as the database grows larger), and I'm curious to know of any better ways to give users editing privileges.
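One way to sketch that moderation queue (table and column names are just illustrative): proposed edits land in a pending table rather than touching the live record, and the approval page simply works through that queue.

    CREATE TABLE pending_edits (
        id           INTEGER PRIMARY KEY,
        record_id    INTEGER NOT NULL,       -- the record being corrected
        field_name   VARCHAR(100) NOT NULL,
        old_value    TEXT,
        new_value    TEXT,
        submitted_by INTEGER NOT NULL,       -- the user who proposed the edit
        status       VARCHAR(20) NOT NULL DEFAULT 'pending'  -- pending / approved / rejected
    );

    -- The approval page is just a query over the queue:
    SELECT * FROM pending_edits WHERE status = 'pending';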
As you are in Rails, there are a number of plugins that provide auditing and versioning of records -
http://github.com/andersondias/acts_as_auditable
http://github.com/laserlemon/vestal_versions
These should let you build something that allows edits but still support reversions in the worst case scenario.
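Under the hood, plugins like these keep a versions table next to the records being edited; a rough sketch of the idea in plain SQL (names are just illustrative):

    CREATE TABLE record_versions (
        id          INTEGER PRIMARY KEY,
        record_type VARCHAR(100) NOT NULL,  -- e.g. 'Article'
        record_id   INTEGER      NOT NULL,  -- the row that was edited
        user_id     INTEGER,                -- who made the edit
        old_data    TEXT,                   -- serialized copy of the previous state
        created_at  TIMESTAMP    NOT NULL
    );

    -- Reverting a bad edit means restoring the record from its last good version.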
Support rollbacks, like wikis do, to undo malicious edits.
When considering social web app architecture, is it a better approach to record user social patterns in a database or in logs? I thought for sure that behavior, actions and events would be stored strictly in a database, but I noticed that some of the larger social sites out there also track a lot by logging what happens.
Is it good practice to store prominent data about users in a database, and, since thousands of user actions can be generated easily, should those actions simply be logged?
Remember that Facebook, for example, doesn't update users' information per se; they just insert your new information and use the most recent version, keeping the old one. If you plan to take this approach, it is HIGHLY recommended, if not mandatory, to use a NoSQL DB like Cassandra; you'll need speed over integrity.
Information = money. Update = lose information = lose money.
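A sketch of that insert-only idea in relational terms (names are just illustrative; the same pattern carries over to a wide-column store like Cassandra):

    -- Never UPDATE the profile row; append a new version and read the latest one.
    CREATE TABLE user_profile_history (
        user_id     INTEGER   NOT NULL,
        recorded_at TIMESTAMP NOT NULL,
        email       VARCHAR(200),
        city        VARCHAR(100),
        PRIMARY KEY (user_id, recorded_at)
    );

    -- The current profile is simply the most recent row per user:
    SELECT email, city
    FROM   user_profile_history
    WHERE  user_id = 42
    ORDER  BY recorded_at DESC
    LIMIT  1;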
Obviously, it depends on what you want to do with it (and what you mean by "logging").
I'd recommend a flexible database storage. That way you can query it reasonably easily, and also make it flexible to changes later on.
Also, from a privacy point of view, it's appropriate to be able to easily associate items with certain entities so they can be removed, if so requested.
You're making an artificial distinction between "logging" and "database".
Whenever practical, I log to a database, even though this data will effectively be static and never updated. This is because the data analysis is much easier if you can cross-reference the log table with other, non-static data.
Of course, if you have a high volume of things to track, logging to a SQL data table may not be practical, but in that case you should probably be considering some other kind of database for the application.
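For example (table and column names invented for illustration), writing events to a log table keeps them easy to cross-reference with the rest of the schema later:

    CREATE TABLE activity_log (
        id          BIGINT PRIMARY KEY,
        user_id     INTEGER      NOT NULL,
        action      VARCHAR(100) NOT NULL,  -- e.g. 'follow', 'post', 'login'
        occurred_at TIMESTAMP    NOT NULL
    );

    -- Cross-reference the log with non-static data, e.g. activity per country:
    SELECT u.country, l.action, COUNT(*) AS events
    FROM   activity_log l
    JOIN   users u ON u.id = l.user_id
    GROUP  BY u.country, l.action;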