I'm working on a contact database in rails 3..
One thing thats really frustrating is how ugly the family relationship code is..
Is there a clean way of doing this in rails?
Basically all contacts are of the contact class (go figure!)
And contacts have many family_relationships (another model)
and many relatives through family_relationships.. The family relationship model also has one family relationship type (another model)
So far i've implemented this using the methods here http://railscasts.com/episodes/163-self-referential-association (using inverse relationships etc..)
But this just doesnt feel very clean.. and if i want to get all the contacts relatives, relationships etc.. i have to drop to raw SQL or join the arrays..
Is there a better (or definitive) way that this kinda thing is done in rails?
The Ancestry gem seems like it solves exactly this kind of problem:
Ancestry is a gem/plugin that allows the records of a Ruby on Rails ActiveRecord model to be organised as a tree structure (or hierarchy). It uses a single, intuitively formatted database column, using a variation on the materialised path pattern. It exposes all the standard tree structure relations (ancestors, parent, root, children, siblings, descendants) and all of them can be fetched in a single sql query. Additional features are STI support, scopes, depth caching, depth constraints, easy migration from older plugins/gems, integrity checking, integrity restoration, arrangement of (sub)tree into hashes and different strategies for dealing with orphaned records.
Related
I have seen an article in Dzone regarding Post and Post Details (two different entities) and the relations between them. There the post and its details are in different tables. But as I see it, Post Detail is an embeddable part because it cannot be used without the "parent" Post. So what is the logic to separate it in another table?
Please give me a more clear explanation when to use which one?
Embeddable classes represent the state of their parent classes. So to take your example, a StackOverflow POST has an ID which is invariant and used in an unbreakable URL for sharing e.g. http://stackoverflow.com/q/44017535/146325. There are a series of other attributes (state, votes, etc) which are scalar properties. When the post gets edited we have various versions of the text (which are kept and visible to people with sufficient rep). Those are your POST DETAILS.
"what is the logic to separate it in another table?"
Because keeping different things in separate tables is what relational databases do. The standard way of representing this data model is a parent table POST and child table POST_DETAIL with a defined relationship enforced through a foreign key.
Embeddable is a concept from object-oriented programming. Oracle does support object-relational constructs in the database. So it would be possible to define a POST_DETAIL Type and create a POST Table which has a column declared as a nested table of that Type. However, that would be a bad design for two reasons:
The SQL for working with nested tables is clunky. For instance, to get the POST and the latest version of its text would require unnesting the collection of details every time we need to display it. Computationally not much different from joining to a child table and filtering on latest version flag, but harder to optimise.
Children can have children themselves. In the case of Posts, Tags are details because they can vary due to editing. But if you embed TAG in POST_DETAIL embedded in POST how easy would it be to find all the Posts with an [oracle] tag?
This is the difference between Object-Oriented design and relational design.
OO is strongly hierarchical: everything is belongs to something and the way to get the detail is through the parent. This approach works well when dealing with single instances of things, and so is appropriate for UI design.
Relational prioritises commonality: everything of the same type is grouped together with links to other things. This approach is suited for dealing with sets of things, and so is appropriate for data management tasks (do you want to find all the employees who work in BERLIN or whose job is ENGINEER or who are managed by ELLIOTT?)
"give me a more clear explanation when to use which one"
Always store the data relationally in separate tables. Build APIs using OO patterns when it makes sense to do so.
I'm interested in the new firebase.util package that allows you to join data (paths) and how I might be able to continue modeling with UML as I have become accustomed to over many years. I can see how easy it might be to make one-to-many relationships in this way. And because firebase is hierachical, component relationships are just very natural.
Aggregate relationships can be duck'd as we're all accustomed to this in javascript - enforcing aggregate relationship doesn't seem to me to be a barrier to modeling successful projects using firebase...
My question is if anyone has experimented | had success with | can show examples of how it might be possible to represent many-to-many relationships, perhaps by joining the join paths themselves.
If I don't get much interest in the question I may post my own trial-error results...
Thanks
I have tried to use composite key. For example, user can be member of many rooms. We need two queries: List of room members, and list of user's rooms. So we can have only one collection rooms-users, where key is built like this:
id = [roomId, userId].join()
The truth is, I'm not sure whether it is a good pattern. It seems it can prevent security rules settings https://stackoverflow.com/a/17431390/233902 and maybe even have performance implications.
So maybe two or even more collections are required. Two for many to many, third for relation metadata. As I'm thinking about, collections should be optimized for queries, so composite key is anti-pattern for Firebase.
I wanna know if there exists conceptual framework or documented techniques to study how to map entities classes to database tables??
Before I was using JPA (Java Persistance API) to map one table -> one entity class, this is: each table's row represented by one object class. Is it the most common/correct way? Has these patterns specific names?
Thanks a lot.
One table to one class is called the "Active Record" pattern. One alternative pattern people use is the Repository pattern. DDD (Domain Driven Design) has the concept of Aggregate Roots which also plays in this space. You should be able to do some reading on that terminology. I am personally not in love with any of those patterns, they all have pros and cons.
For me, there is no recipe, formula or pattern for the ideal class to table relationship. The key point to mapping is that it acts as an adapter layer, isolating the table structure from the entity structure.
If you are designing your application and tables at the same time, I think you should design your tables to minimise round trips, getting as much data to hydrate your entities as possible. Then let your mapper(s) build the entities and any appropriate associations. This might lead you to denormalise your table structure a little more than you might normally.
In my project (ASP.NET MVC + NHibernate) I have all my entities, lets say Documents, described by set of custom metadata. Metadata is contained in a structure that can have multiple tags, categories etc. These terms have the most importance for users seeking the document they want, so it has an impact on views as well as underlying data structures, database querying etc.
From view side of application, what interests me the most are the string values for the terms. Ideally I would like to operate directly on the collections of strings like that:
class MetadataAsSeenInViews
{
public IList<string> Categories;
public IList<string> Tags;
// etc.
}
From model perspective, I could use the same structure, do the simplest-possible ORM mapping and use it in queries like "fetch all documents with metadata exactly like this".
But that kind of structure could turn out useless if the application needs to perform complex database queries like "fetch all documents, for which at least one of categories is IN (cat1, cat2, ..., catN) OR at least one of tags is IN (tag1, ..., tagN)". In that case, for performance reasons, we would probably use numeric keys for categories and tags.
So one can imagine a structure opposite to MetadataAsSeenInViews that operates on numeric keys and provide complex mappings of integers to strings and other way round. But that solution doesn't really satisfy me for several reasons:
it smells like single responsibility violation, as we're dealing with database-specific issues when just wanting to describe Document business object
database keys are leaking through all layers
it adds unnecessary complexity in views
and I believe it doesn't take advantage of what can good ORM do
Ideally I would like to have:
single, as simple as possible metadata structure (ideally like the one at the top) in my whole application
complex querying issues addressed only in the database layer (meaning DB + ORM + at less as possible additional code for data layer)
Do you have any ideas how to structure the code and do the ORM mappings to be as elegant, as effective and as performant as it is possible?
I have found that it is problematic to use domain entities directly in the views. To help decouple things I apply two different techniques.
Most importantly I'm using separate ViewModel classes to pass data to views. When the data corresponds nicely with a domain model entity, AutoMapper can ease the pain of copying data between them, but otherwise a bit of manual wiring is needed. Seems like a lot of work in the beginning but really helps out once the project starts growing, and is especially important if you haven't just designed the database from scratch. I'm also using an intermediate service layer to obtain ViewModels in order to keep the controllers lean and to be able to reuse the logic.
The second option is mostly for performance reasons, but I usually end up creating custom repositories for fetching data that spans entities. That is, I create a custom class to hold the data I'm interested in, and then write custom LINQ (or whatever) to project the result into that. This can often dramatically increase performance over just fetching entities and applying the projection after the data has been retrieved.
Let me know if I haven't been elaborate enough.
The solution I've finally implemented don't fully satisfy me, but it'll do by now.
I've divided my Tags/Categories into "real entities", mapped in NHibernate as separate entities and "references", mapped as components depending from entities they describe.
So in my C# code I have two separate classes - TagEntity and TagReference which both carry the same information, looking from domain perspective. TagEntity knows database id and is managed by NHibernate sessions, whereas TagReference carries only the tag name as string so it is quite handy to use in the whole application and if needed it is still easily convertible to TagEntity using static lookup dictionary.
That entity/reference separation allows me to query the database in more efficient way, joining two tables only, like select from articles join articles_tags ... where articles_tags.tag_id = X without joining the tags table, which will be joined too when doing simple fully-object-oriented NHibernate queries.
I am interested in knowing the pros and cons of creating a custom system supported by a database like the one described below:
It has 6 tables that support it.
Entity: Lets say, anything "physical" that can exist and have detail stored against it
(Hilton Hotel, Tony Taxi, One Bar)
Entity Type: A grouping/type of entity
(Bar, Hotel, Restaurant)
Metadata: Any detail describing or belonging to an entity item
(IR232PH, foo#bar.com, 555-555-555)
Metadata Type: A grouping/type of metadata
(Post Code, Telephone, Email, address)
Entity Relationship: The ability to group any entity item to another
(Entity1-Entity2, Entity3)
Entity Relationship Type: The grouping/type of entity relationship.
I can see how this model is good for Entities that are similar but don't always have the same amount of attributes.
What are the pro/cons of using it as it is for entities as described?
An artist can be performing (relationship type) at a venue.
An artist can be supporting (relationship type) another artist
What would be the pro/cons of using it also to store more standard entities like users of the system?
A user can have a favourite (relationship type) venue/artist/bar etc
A user can have a attending (relationship type) event
Would you take it as far as having the news and blog posts in it?
This is highly subjective, but before I went up the abstraction ladder to where you are suggesting, I'd rather code my application to use DDL to modify the database schema to match the concrete aspects of the actual entities it was using, rather than having a static schema abstracted so far as to be able to store data about any potential entities.
In a way, to be a bit facetious, IMHO, what you are suggesting has already been done.... It is called a Relational Database. Every RDBMS is a software tool designed to be able to model any possible set of entities, and their attributes, in a way that accurately models those entities and the relationships between them.
Although you can certainly store the data in such a data model, there are a couple of problems (at least) with it.
The first problem is controlling the data. When an 'hotel' is described, what is the set of attributes and metadata that must be defined? Which metadata types can legitimately be entered for an hotel? Related to that is 'when I delete an hotel from the list, what else do I have to delete'? When I delete all hotels from the list (and I never want to store information about hotels again), what else do I have to delete? It is terrifically (terrifyingly?) easy to get all sorts of stray extraneous, unreferenced data into the database.
The second problem is retrieving the data. Suppose I want to know all the information about a specific hotel? How do I write a query for that? Actually, even inserting the data is hard, but selecting it is, if anything, harder. If I only want three attributes, it is easy - if the hotel actually has them all. It is harder if the hotel only has two of the three specified. But suppose the hotel has 30 atttributes, which is not a lot. Then it is terrifically difficult.
What you are describing is a souped-up version of a model known as the EAV or Entity-Attribute-Value model of data. It is generally accepted to be a 'bad idea', for all it is a common idea.
What you described is also known as a triplestore. A triple is a subject-object-predicate (Hotel HAS Rooms, Joe Likes HotelX, etc.). There are mechanisms for running these things (triplestore implementations), controlling the data (eg with ontologies) and for querying them, too (eg the SPARQL language). However, this is all fairly bleeding edge stuff and is known to have scalability problems. Nevertheless, combined with NoSQL approaches (index all your hotels in a big document store, etc.), it's an interesting area to keep an eye on.
See: http://en.wikipedia.org/wiki/Triplestore.