Unique client-supplied ids in a multi-tenant application - sql

We are building a multi-tenant application that supports multiple clients. Let's use an accounting application as an example - each organization has its own accounts, receipts, etc. (each with its own unique id). In our case, the number of clients is small.
There are two options to go about it:
Unique ids (UUIDs) are created by our app, and the client is responsible for maintaining a mapping of their own ids to our UUIDs. This is easier for us to support, but adds complexity on the client's side (they need to maintain an extra UUID, and potentially pass it between their own micro-services).
We let the client specify the ids of the objects in the API call as if they were the only tenant, and we somehow handle the uniqueness in the background.
If we go with the 2nd approach, then we need to combine the clientId with the objectId. We can think of 3 ways to do it:
A DB table per client. We decide which table to use based on the client_id. Requires creating a full set of tables per client, either manually or automatically.
DB composite keys. (I am aware of the performance hit of using strings as primary keys.)
CREATE TABLE User (
  clientId VARCHAR(64) NOT NULL,
  userId   VARCHAR(64) NOT NULL,
  PRIMARY KEY (clientId, userId)
);
Application level: the application maintains both ids, and is responsible for returning the client their id while producing an internal UUID for internal use. The internal id can be optimized for its storage type. For example (Scala):
trait UniqueId[T] {
  val toClientId: String // The unique id we got from the client
  val toInternalId: T    // The unique id we use internally
}

case class Id(client: Client, userId: String) extends UniqueId[String] {
  val toClientId = userId
  // Could be any of...
  val toInternalId = s"${client.name}_${userId}"
  // val toInternalId = MD5(s"${client.name}_${userId}")
}

I do not see any question here, but I suppose you want some advice on the way to go ;-)
If you have a database that is multitenant-aware (Oracle Multitenant, or a DB with partitioning), you could easily implement your solution 1.
I would personally use an internal GUID or int surrogate to identify each object, and would store the client id only once as an attribute...
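For example, a minimal sketch of that surrogate approach (assuming EF Core; the entity and property names are illustrative):

using Microsoft.EntityFrameworkCore;

// The internal surrogate is the primary key; the client-supplied id is
// stored once, as a plain attribute, and kept unique per tenant.
public class User
{
    public long InternalId { get; set; }   // internal surrogate, never exposed to clients
    public string ClientId { get; set; }   // tenant identifier
    public string ExternalId { get; set; } // id supplied by the client
}

public class AppDbContext : DbContext
{
    public DbSet<User> Users => Set<User>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<User>(user =>
        {
            user.HasKey(u => u.InternalId); // narrow, index-friendly primary key
            user.HasIndex(u => new { u.ClientId, u.ExternalId })
                .IsUnique();                // per-tenant uniqueness of client ids
        });
    }
}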

Related

Adding ASP.NET Identity to an existing project with already managed users table

I have an existing Web API project with a users table. In general, User is involved in some key business queries in the system (as other tables keep its 'UserId' foreign key).
These days I'm interested in adding ASP.NET (Core) Identity. Basically, I've already performed the required steps: added a separate Identity table, managed an additional db context (implementing IdentityDbContext), and also added a JWT token service. It looks like everything works fine. However, I am now wondering how I should "link" the authenticated user (who has logged in through the Identity module) to the user found in the original, "business related" db.
What I was thinking of is that upon login, I would retrieve the userId from the original Users table, based on the email (which is used as the username and is found in both the original Users table and the new Identity table), and then keep it as a Claim on the authenticated user. This way, each time the user calls the API (for an Authorize-marked action on the relevant controller), assuming they are authenticated, I will have the relevant userId on hand and will be able to query whatever is needed from the existing business tables.
I guess this can work; however, I'm not sure about this approach, and I was wondering if there are any other options?
Regarding the option I've mentioned above, the main drawback I see is that upon the creation of a new user, the operation has to be performed against 2 different tables, in 2 different DBs. In this case, in order to keep this in one unit of work, is it possible to create a transaction scope that spans 2 different db contexts?
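For illustration, here is roughly what I had in mind (just a sketch; the claim name is made up, and an EF Core context over the business db is assumed):

using System.Security.Claims;

// Upon login, after Identity has validated the credentials:
// look up the business user id by email and keep it as a claim.
var businessUserId = await businessDb.Users
    .Where(u => u.Email == email)
    .Select(u => u.UserId)
    .SingleAsync();
var claims = new List<Claim> { new Claim("business_user_id", businessUserId.ToString()) };
// ...include `claims` in the ClaimsIdentity / JWT that gets issued...

// Later, inside any [Authorize] action:
var userId = User.FindFirst("business_user_id")?.Value;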
You're on the right track.
I faced a similar problem.
Imagine two different microservices.
Identity-Microservice (stores identity information: Username, Password, etc...)
Employees-Microservice (stores employee information: Name, Surname, etc...)
So how to establish a relationship between these two services?
Use queues (RabbitMQ, Kafka, etc...)
An event is created after user registration (UserCreatedEvent { Id, Name, etc... })
The Employees microservice listens for this event and records it in the corresponding table.
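A rough sketch of the event and its handler on the Employees side (broker-agnostic; the RabbitMQ/Kafka wire-up is omitted and all names are illustrative):

using System.Threading.Tasks;

public record UserCreatedEvent(int Id, string Name, string Surname);

public class UserCreatedHandler
{
    private readonly EmployeesDbContext _db; // hypothetical EF Core context

    public UserCreatedHandler(EmployeesDbContext db) => _db = db;

    public async Task HandleAsync(UserCreatedEvent evt)
    {
        // Reuse the identity id as the employee id so both services agree on it.
        _db.Employees.Add(new Employee { Id = evt.Id, Name = evt.Name, Surname = evt.Surname });
        await _db.SaveChangesAsync();
    }
}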
This is the final state:
Identity
Id = 1, UserName = ExampleUserName, Email = Example@Email, etc...
Employee
Id = 1, Name = ExampleName, Surname = ExampleSurname, etc...
Now both services are associated with each other.
Example
If I want to get the information of the employee who is logged in now (a sketch; this assumes the shared id is stored in the name-identifier claim):
var currentEmployeeId = int.Parse(User.FindFirst(ClaimTypes.NameIdentifier).Value); // System.Security.Claims
var employee = _db.Employees.Find(currentEmployeeId);

designing a database schema for aws mobile backend

I am new to databases and SQL and would like to design a database for a fitness app that will keep track of workouts at the gym.
In my app, I have designed a custom workout object that has a name (e.g. 'Chest day'), an ID (some number) and a date (a string). Each workout object contains an array of exercises (another custom object), and each exercise has a property called 'set'. The set is also a custom object with only two numeric properties: the number of reps and the weight (e.g. 10 reps at 50 lbs).
What I thought of is to have one table for the workouts, another for the exercises and another for the sets. The problem is I do not know how to connect the tables (i.e. link multiple exercises to a unique workout and link multiple sets to a unique exercise), and I am not sure if this is even the correct approach.
Also, I planned to set up the backend for this app using the Amazon Web Services Mobile Hub, which provides a NoSQL database.
In NoSQL, you should keep all the attributes in a single table. You shouldn't normalize the data as you would in an RDBMS. Also, try to stay away from joins. The main advantage of NoSQL is that you keep everything as one item, so you don't need a join to get the result.
Advantages of this approach are:
1) Fast response, as all the data is present as one item in the table
2) Schemaless database, i.e. you can add new attributes at any time (no need to alter the table and add new columns)
DynamoDB design for the above use case:
The combination of partition key and sort key should be unique.
name - String (Partition key)
id - Number (Sort key)
date - String
exercise : [array of values] - List data type
custom_set : {rep : 1, weight : 2} - Map data type
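For example, inserting one workout item with the AWS SDK for .NET low-level client might look like this (a sketch; the table name and the values are illustrative):

using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

var client = new AmazonDynamoDBClient();
await client.PutItemAsync(new PutItemRequest
{
    TableName = "Workout", // hypothetical table name
    Item = new Dictionary<string, AttributeValue>
    {
        ["name"] = new AttributeValue { S = "Chest day" },  // partition key
        ["id"]   = new AttributeValue { N = "1" },          // sort key (numbers travel as strings)
        ["date"] = new AttributeValue { S = "2016-01-15" },
        ["exercise"] = new AttributeValue
        {
            L = new List<AttributeValue> { new AttributeValue { S = "Bench press" } }
        },
        ["custom_set"] = new AttributeValue
        {
            M = new Dictionary<string, AttributeValue>
            {
                ["rep"]    = new AttributeValue { N = "10" },
                ["weight"] = new AttributeValue { N = "50" }
            }
        }
    }
});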
Important note:
The important thing while designing a data model for DynamoDB is that all the data retrieval use cases (i.e. the query access patterns) should be known up front, so that the appropriate model can be designed.

Domain Driven Design Auto Incremented Entity Key

Just starting with Domain Driven Design, and I've learned that you should keep your model in a valid state, and that when creating a new instance of a class it's recommended to put all required attributes as constructor parameters.
But when working with auto-incremented keys, I only get this new ID when I call an Add method on my persistence layer. If I instantiate my objects without a key, I think they will be in an invalid state, because they need some sort of unique identifier.
How should I implement my architecture in order to have my IDs before creating a new instance of my entity?
Generated Random IDs
The pragmatic approach here is to use random IDs and generate them before instantiating an entity, e.g. in a factory. GUIDs are a common choice.
And before you ask: No, you won't run out of GUIDs :-)
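As a sketch, generating the id up front in a factory is a one-liner (the Order type is hypothetical):

using System;

public static class OrderFactory
{
    // The entity receives its identity before any database work happens.
    public static Order Create(string customerName) =>
        new Order(Guid.NewGuid(), customerName);
}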
Sequential IDs with ID reservation
If you must use a sequential ID for some reason, then you still have options:
Query a sequence on the DB to get the next ID. This depends on your DB product (Oracle, for example, has sequences).
Create a table with an auto-increment key that you use only as a key reservation table. To get an ID, insert a row into that table - the generated key is now reserved for you, so you can use it as the ID for the entity.
Note that both approaches for sequential IDs require a DB round-trip before you even start creating the entity. This is why the random IDs are usually simpler. So if you can, use random IDs.
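A sketch of the key-reservation variant, assuming SQL Server and ADO.NET; the IdReservation table is hypothetical and contains only a single IDENTITY column:

using Microsoft.Data.SqlClient;

static long ReserveNextId(string connectionString)
{
    using var conn = new SqlConnection(connectionString);
    conn.Open();
    // Inserting a row is the reservation; the generated key is now ours to use.
    using var cmd = new SqlCommand(
        "INSERT INTO IdReservation DEFAULT VALUES; SELECT CAST(SCOPE_IDENTITY() AS bigint);",
        conn);
    return (long)cmd.ExecuteScalar();
}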
DB-generated IDs
Another possibility is to just live with the fact that you don't have the ID at creation time, but only when the insert operation on the DB succeeds. In my experience, this makes entity creation awkward to use, so I avoid it. But for very simple cases, it may be a valid approach.
In addition to theDmi's comments:
1) You can make sure in your factory method that the entity gets stored to the database. This may or may not be applicable to your domain, but if you are sure the entity is going to be saved, that might be a valid approach.
2) You can separate the ID from the database's primary key. I've worked on a case where something only became an order once the customer paid, and at that point it was identified by its invoice id (a sequential id). That doesn't mean that in the database the ID column also had to be the primary key of the object. You could have a primary key in the database (a random GUID) and still have an ID (int?) that is sequential and null if it hasn't been filled yet.

Model simple social network in Azure table service

What is the best table design for a simple social networking website using Azure Table Service?
The website could have millions of users.
Users need to be able to view a list of all other users in the system sorted by the number of mutual connections.
Users must be able to view a list of their connections
User must be able to view content posted by themselves and their connections.
One major design constraint is that Azure table service queries are generally limited to the partition key and row key when there are a large number of records or else they get really slow. Another constraint is that query results are only sorted by the partition key and then the row key.
Try this Design:
UserTable
PK: GUID (a GUID PK will maximize scalability: only one partition, with a single row, on each server)
RK: GUID
... Rest of properties
UserFriendsTable
PK: UserTable.RK (every user, with his friends, on a separate server)
RK: GUID
FriendWith: UserTable.PK + "-" + UserTable.RK (concatenate the PK and RK from the user table, separated with "-"; this helps you execute a fast point query when you try to access a friend's profile)
PostsTable
PK: UserTable.RK + "-" + YYYYMM + random number (this allows Azure to put each month's posts of any user on a separate server; the random number prevents Azure from auto-grouping sequential partitions. You can query posts by filtering on part of the PK, e.g. PK starts with XCtghi94ktY-201411).
RK: use the following code to generate the row key in descending order, so that the latest post comes first:
// Ticks remaining until DateTimeOffset.MaxValue: later posts get smaller values,
// so the natural ascending sort on RK returns the newest posts first.
long ticks = DateTimeOffset.MaxValue.UtcDateTime.Ticks - DateTimeOffset.Now.UtcDateTime.Ticks;
string guid = Guid.NewGuid().ToString("N"); // tie-breaker for posts in the same tick
string suffix = "-";
string rowKey = string.Format("{0:d21}{1}{2}", ticks, suffix, guid);
Post : String
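To query a month of posts by partition-key prefix, the prefix can be expressed as a range filter (a sketch using the Azure.Data.Tables client; connectionString and userRk are assumed to exist):

using Azure.Data.Tables;

// "PartitionKey starts with '<userRk>-201411'" has no native operator in
// Table storage, so it is written as a ge/lt range over the key space.
var table = new TableClient(connectionString, "Posts");
string prefix = userRk + "-201411";
string upper = prefix[..^1] + (char)(prefix[^1] + 1); // first string past the prefix range
var posts = table.Query<TableEntity>(
    filter: $"PartitionKey ge '{prefix}' and PartitionKey lt '{upper}'");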

Best way to maintain data integrity between local and remote sql databases

So I have what would seem like a common question that I can't seem to find an answer to. I'm trying to find the "best practice" for how to architect a database that maintains data locally, then syncs that data to a remote database shared between many clients. To make things clearer, this remote database would have many clients that use it.
For example, say I had a desktop application that stores to-do lists (in SQL), each with individual items. I then want to be able to send that data to a web service that holds a "master" copy of all the different clients' information. I'm not worried about syncing problems as much as I'm just trying to think through the actual architecture of the client's tables and the web service's tables.
Here's an example of how I was thinking about it:
Client Database
list
--list_client_id (primary key, auto-increment)
--list_name
list_item
--list_item_client_id (primary key, auto-increment)
--list_id
--list_item_text
Web Based Master Database (Shared between many clients)
list
--list_master_id
--list_client_id (primary key, auto-increment)
--list_name
--user_id
list_item
--list_item_master_id (primary key, auto-increment)
--list_item_remote_id
--list_id
--list_item_text
--user_id
The idea would be that the client can create to-do lists with items, and sync them with the web service at any given time (i.e. if they lose data connectivity and aren't able to send the information until later, nothing will get out of order). The web service would record the records with the client's ids as just extra fields.
That way, the client can say "update list number 4 with a new name" and the server takes this to mean "update user 12's list number 4 with a new name".
I think the general concept you're working with is headed in the right direction, but you may need to pay careful attention to the use of auto-increment columns. For example, auto-increment on the server is useless if the client is the owner of this ID. Instead, you probably want list.list_master_id to be the auto-increment. Everything else you've mentioned is entirely plausible, though the complexity may increase if there can be multiple clients per user; then the use of an auto-increment alone probably isn't sufficient, and you may need a GUID or a datatype that also includes a client identifier to prevent id collisions.
Without having more details it would be difficult to speculate on what other situations you may need to consider.
SERVER:
list
--id
--name
--user_id
--updated_at
--created_from_device_id
The following 2 tables link all records; they might also be grouped into one table.
list_ids
--list_id
--device_id
--device_record_id
user_ids
--user_id
--device_id
--device_record_id
CLIENT (device_id=5)
list
--id
--name
--user_id
--updated_at
That will allow you to save records as (only showing relevant fields):
server
list: id=1, name=shopping, user_id=1234
user: id=27, name=John Doe
list_ids: list_id=1, device_id=5, device_record_id=999
user_ids: user_id=27, device_id=5, device_record_id=567
client
id=999, name=shopping, user_id=567
This way the clients are totally unaware of any server-side ids, translations can be done quite fast, and you can supply the clients only with information and ids they know of.
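A translation lookup for an incoming client id can then be as simple as this (a sketch, assuming an EF-style context over the list_ids table; names are illustrative):

// Translate (device_id, device_record_id) from the client into the server's list_id.
var serverListId = db.ListIds
    .Where(m => m.DeviceId == deviceId && m.DeviceRecordId == clientRecordId)
    .Select(m => m.ListId)
    .Single();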
I have the same issue with a project I am working on; the solution in my case was to create an extra nullable field in the local tables named remote_id. When synchronizing records from the local to the remote database, if remote_id is null it means that this row has never been synchronized, and the sync needs to return a unique id matching the remote row id.
Local Table            Remote Table
_id (used locally)
remote_id ------------- id
name      ------------- name
In the client application I link tables by the _id field; remotely, I use the remote_id field to fetch data, do joins, etc.
example locally:
Local Client Table      Local ClientType Table              Local ClientType
_id ------------------- client_id
remote_id               client_type_id -------------------- _id
name                    _id                                 remote_id
                        remote_id                           name
                        name
example remotely:
Remote Client Table     Remote ClientType Table             Remote ClientType
id -------------------- client_id
name                    client_type_id -------------------- id
                        name                                name
Without any logic in the code, this scenario would cause data integrity failures, as the client_type table may not match the real id in either the local or the remote tables. Therefore, whenever a remote_id is generated, the server returns a signal to the client application asking it to update the local _id field; this fires a previously created trigger in SQLite that updates the affected tables.
http://www.sqlite.org/lang_createtrigger.html
1- remote_id is generated on the server
2- the server returns a signal to the client
3- the client updates its _id field and fires a trigger that updates the local tables that join on the local _id
Of course, I also use a last_updated field to help with synchronization and to avoid duplicate syncs.
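The sync pass itself can stay very small (a sketch; api.PushAsync and the EF-style context are hypothetical):

// Push rows that have never been synchronized (remote_id IS NULL),
// then store the id the server hands back (steps 1-3 above).
var unsynced = db.Clients.Where(c => c.RemoteId == null).ToList();
foreach (var row in unsynced)
{
    row.RemoteId = await api.PushAsync(row); // server generates and returns the remote id
    // Saving remote_id fires the local trigger that fixes up the joined tables.
}
db.SaveChanges();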