Modeling data in firebase using joins - many-to-many relationship - oop

I'm interested in the new firebase.util package that allows you to join data (paths) and how I might be able to continue modeling with UML as I have become accustomed to over many years. I can see how easy it might be to make one-to-many relationships in this way. And because firebase is hierachical, component relationships are just very natural.
Aggregate relationships can be duck'd as we're all accustomed to this in javascript - enforcing aggregate relationship doesn't seem to me to be a barrier to modeling successful projects using firebase...
My question is if anyone has experimented | had success with | can show examples of how it might be possible to represent many-to-many relationships, perhaps by joining the join paths themselves.
If I don't get much interest in the question I may post my own trial-error results...
Thanks

I have tried to use composite key. For example, user can be member of many rooms. We need two queries: List of room members, and list of user's rooms. So we can have only one collection rooms-users, where key is built like this:
id = [roomId, userId].join()
The truth is, I'm not sure whether it is a good pattern. It seems it can prevent security rules settings https://stackoverflow.com/a/17431390/233902 and maybe even have performance implications.
So maybe two or even more collections are required. Two for many to many, third for relation metadata. As I'm thinking about, collections should be optimized for queries, so composite key is anti-pattern for Firebase.

Related

Collection of relations. NoSQL vs SQL

I understand the difference between NoSQL and SQL, but I still have a small question. It's about a many-to-many relationship. I know that if I need such a relationship, I should use relational databases. However, in my case, I prefer to use document-oriented databases, because I need to store a large number of documents, much larger than the number of entities with relationships.
So, I need to implement user groups. Of course, users can exist outside of groups, therefore they are documents of a separate collection. In addition, one user can be in several groups, which means this is a real many-to-many relationship.
People say that “mongo-way” is to make a user with a list of links to groups and a group with a list of links to users, but this option doesn’t suit me, because sometimes I need to display a list of groups without displaying a list of the group users, which can take most of the document.
As an alternative, I want to use the traditional “relationship tables” that are used for many-to-many relationships in relational databases.
So my question is, what is the practical difference between using such tables in mysql or in mongodb? As far as I know, there are no foreign keys in mongodb, but does that really prevent me from doing something like this? I see the problem only in the fact that you can not get rid of the required _id and its indexes. By the way, in this case, should I create indexes for the UserId and GroupId fields?
Or maybe I should give up the idea to fit everything into one database and use SQL and NoSQL together in one project?
below links might help you for schema designing .
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1

What is the problem with many-to-many relationships?

I feel like I have searched through the internet to find an answer to this question for quite some time now, but without success. Does anyone feel comfortable explaining why many-to-many relationships should be replaced with a bridge table?
Probably most (all??) RDMS implement a M:N relationship by creating a table containing two columns with the FKs.
So there is no advantage to explicitely model the bridge table.
But in most realistic cases you want to store additional information (besides the fact of its existence) about the relationship instance, e.g. timestamp and user from the creation. That means that you need to model the bridge table anyway.

Embeddable vs one to many

I have seen an article in Dzone regarding Post and Post Details (two different entities) and the relations between them. There the post and its details are in different tables. But as I see it, Post Detail is an embeddable part because it cannot be used without the "parent" Post. So what is the logic to separate it in another table?
Please give me a more clear explanation when to use which one?
Embeddable classes represent the state of their parent classes. So to take your example, a StackOverflow POST has an ID which is invariant and used in an unbreakable URL for sharing e.g. http://stackoverflow.com/q/44017535/146325. There are a series of other attributes (state, votes, etc) which are scalar properties. When the post gets edited we have various versions of the text (which are kept and visible to people with sufficient rep). Those are your POST DETAILS.
"what is the logic to separate it in another table?"
Because keeping different things in separate tables is what relational databases do. The standard way of representing this data model is a parent table POST and child table POST_DETAIL with a defined relationship enforced through a foreign key.
Embeddable is a concept from object-oriented programming. Oracle does support object-relational constructs in the database. So it would be possible to define a POST_DETAIL Type and create a POST Table which has a column declared as a nested table of that Type. However, that would be a bad design for two reasons:
The SQL for working with nested tables is clunky. For instance, to get the POST and the latest version of its text would require unnesting the collection of details every time we need to display it. Computationally not much different from joining to a child table and filtering on latest version flag, but harder to optimise.
Children can have children themselves. In the case of Posts, Tags are details because they can vary due to editing. But if you embed TAG in POST_DETAIL embedded in POST how easy would it be to find all the Posts with an [oracle] tag?
This is the difference between Object-Oriented design and relational design.
OO is strongly hierarchical: everything is belongs to something and the way to get the detail is through the parent. This approach works well when dealing with single instances of things, and so is appropriate for UI design.
Relational prioritises commonality: everything of the same type is grouped together with links to other things. This approach is suited for dealing with sets of things, and so is appropriate for data management tasks (do you want to find all the employees who work in BERLIN or whose job is ENGINEER or who are managed by ELLIOTT?)
"give me a more clear explanation when to use which one"
Always store the data relationally in separate tables. Build APIs using OO patterns when it makes sense to do so.

Is there any way to represent many to many relationship without using junction tables?

And in addition to that, when you are drawing out ERD diagrams should you include the "junction" tables as entities, even though they are not explicitly mentioned in the spec, but you can clearly see that it is many to many relationship?
The entities that you include in an ERD really depend upon the intended audience. If you plan on presenting an ERD to software engineers or database administrators then omitting the associative tables would just be confusing. If you are trying to give a high level system overview, then I would advise leaving out everything except the entities that are directly relevant to system operation.
In my personal opinion, junction table aka "cross-walk" tables are important to show data flow.
I think leaving these cross-walk tables out of the ERD, may make viewing the logical flow of the data difficult to comprehend for someone new to the ERD. As your data model becomes more complex, it makes it increasingly difficult to comprehend if you do not show them.

Refactor schema with multiple many-to-many relationships to semantically different but structurally similar entities?

I have a schema with a number of many to many relationships and what I'm seeing is a alot of similar data structure spread out among tables with different names. The intuition I have is that there is a more efficient/desirable way to achieve the same result but I'm not sure what alternative approaches fall into reasonable design/best practices.
Note: Counries, TrafficTypes and People - as they exist now - could all be represented by Id and Name columns, but in the future may have additional fields. Maybe what I'm after is some kind of technique akin to inheritance?
Here's the diagram of what I've got:
Don't lump together things which you think are similar; they may diverge later when you need to store more information about each entity.
Is there a problem with the number of tables you have in your database?
You are probably thinking about the problem from an object oriented design position and thinking you can use some sort of "parent" table to represent the common parts - databases don't work that way.
If you are not careful you will end up with a MUCK or OTLT table.
Off the bat, I would keep seperate enties/objects/(cars vs animals) in seperate tables.
The chances of such enties overlapping properties are slim at best.
Thing is, once those entites start to evolve in your system, you will find that a single table will have hundreds of columns with singles populated per entity.
I don't see how inheritance applies to your case. But if you are interested in inheritance, as it applies to SQL tables or relations, here are some things to look up:
At the level of ER modeling look up "ER model specialization". This is the way the extended ER model diagrams "is A" relationships.
At the level of table design, look up "Class Table Inheritance" and "Shared primary key" for a couple of techniques that, used together, sort of mimic what inheritance does for you in an OOP. You might also want to look up "single table inheritance" for an alternative that's simpler, but can be more wasteful.
Don't worry. You're doing it right.
In a highly normalized schema, you're going to have tons of tables that are nearly identical.
As the others have said, what you're starting to consider (a generic table for multiple things) is a very bad idea and has many drawbacks.
The biggest drawback is that your relationships are made useless by it, in terms of maintaining data integrity. There's nothing stopping someone from assigning a CountryCode where a TrafficTypeId should be, and so on.
Another drawback would be that having one larger table will likely perform worse than many smaller, specialized tables; due to extra, unnecessary blocking.
Your may still want to implement some type of inheritance concept, but that'll be best done in whatever code accesses the database.