RavenDB Modeling - a single document vs multiple documents? - ravendb

Given a simple example as follows, I'd like some guidance on whether to store as a single document vs multiple documents.
class User
{
public string Id;
public string UserName;
public List<Post> Posts;
}
class Post
{
public string Id;
public string Content;
}
Once the data is stored, there are times when I will want all the posts for a given user. Sometimes I might want posts across multiple users that meet particular criteria.
Should I store each User as a document (with Posts embedded), or does it make more sense to store Users and Posts as separate documents, and have some sort of ID in my post to link it back to a User?
Now, what if each user belonged to an Organization (there will be hundreds of organizations in my application)?
class Organization
{
public string Id;
public List<User> users;
}
Should I then stay with the single document approach? In this case I would store one giant document for each organization, which will contain embedded users, which in turn contain embedded posts?

You should keep them as separate documents. User, Organization, and Post are great examples of aggregate entities, and in Raven, each aggregate is usually its own document.
Only entities which are not aggregates should be nested in the same document. For example, in Post you might have a List<Comment>. Comment and Post are both entities, but only Post is an aggregate.
You should instead model them with references:
public class User
{
public string Id { get; set; }
public string Name { get; set; }
public List<string> PostIds { get; set; }
}
public class Post
{
public string Id { get; set; }
public string Content { get; set; }
}
public class Organization
{
public string Id { get; set; }
public List<string> UserIds { get; set; }
}
Optionally, you can denormalize some of the data into your references where appropriate:
public class UserRef
{
public string Id { get; set; }
public string Name { get; set; }
}
public class Organization
{
public string Id { get; set; }
public List<UserRef> Users { get; set; }
}
Denormalizing the user's name into the organization document has the benefit of not needing to fetch each user document when displaying the organization. However, it has the drawback of having to update the organization document any time a user's name is changed. You should weigh the pros and cons of this each time you consider a relationship. There is no one right answer for all cases.
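To make that trade-off concrete, here is a minimal in-memory sketch with no RavenDB calls. RenameUser is a hypothetical helper, not part of any API, and the sample IDs are made up. It shows what a rename involves once the name is denormalized: the user document and every organization document that carries a copy all have to be updated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class UserRef
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Organization
{
    public string Id { get; set; }
    public List<UserRef> Users { get; set; } = new List<UserRef>();
}

public static class DenormalizationDemo
{
    // Hypothetical helper: renames the user and refreshes every denormalized copy.
    // Returns the number of organization documents that had to be modified.
    public static int RenameUser(User user, string newName, IEnumerable<Organization> organizations)
    {
        user.Name = newName;
        int touched = 0;
        foreach (var org in organizations)
        {
            var staleRefs = org.Users.Where(r => r.Id == user.Id).ToList();
            if (staleRefs.Count == 0) continue;
            foreach (var staleRef in staleRefs) staleRef.Name = newName;
            touched++;
        }
        return touched;
    }

    public static void Main()
    {
        var ann = new User { Id = "users/1", Name = "Ann" };
        var org = new Organization { Id = "organizations/1" };
        org.Users.Add(new UserRef { Id = ann.Id, Name = ann.Name });

        int touched = RenameUser(ann, "Anne", new[] { org });
        Console.WriteLine(touched + " organization document(s) updated");
    }
}
```

With plain ID references, the same rename would modify only the user document.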
Also, consider how the data will actually be used. In practice, you will probably find that your Organization class does not need a user list at all. Instead, you could put a string OrganizationId property on the User class. That would be easier to maintain, and if you wanted a list of users in an organization, you could query for that information using an index.
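A minimal sketch of that alternative, using LINQ over an in-memory list as a stand-in for the database. The UsersInOrganization helper and the sample IDs are made up for illustration. Against RavenDB, the same predicate would typically be written against session.Query<User>(), and the server would answer it from an index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    // The user points at its organization; the organization holds no user list.
    public string OrganizationId { get; set; }
}

public static class OrganizationQueries
{
    // Hypothetical helper: in-memory equivalent of "all users in this organization".
    public static List<User> UsersInOrganization(IEnumerable<User> users, string organizationId)
    {
        return users.Where(u => u.OrganizationId == organizationId).ToList();
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new User { Id = "users/1", Name = "Ann", OrganizationId = "organizations/1" },
            new User { Id = "users/2", Name = "Bob", OrganizationId = "organizations/2" },
            new User { Id = "users/3", Name = "Cid", OrganizationId = "organizations/1" },
        };

        foreach (var user in UsersInOrganization(users, "organizations/1"))
            Console.WriteLine(user.Name);
    }
}
```

Moving a user to another organization then means changing one property on one document.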
You should read more in the RavenDB documentation on Document Structure Design and Handling Document Relationships.

Related

Confused about DTOs when reading and editing. How to design a DTO for filling the form in a Vue.js app?

I am trying to develop an enterprise-level application. I have domain and application services. I have created my DTOs for multiple purposes separately. But confused about which way I should use them from the API viewpoint.
I have complex objects, let's say:
public class Entity{
public int Id { get; set; }
public string Name { get; set; }
public int? ManufacturerId { get; set; }
public virtual Manufacturer Manufacturer { get; set; }
}
public class Manufacturer{
public int Id { get; set; }
public string Text { get; set; }
}
And I have corresponding DTOs, now designed with composition. They were separate before.
public class EntityBaseDto{
public string Name { get; set; }
}
public class EntityReadDto : EntityBaseDto{
public string Manufacturer { get; set; }
}
public class EntityWriteDto : EntityBaseDto{
public int? ManufacturerId { get; set; }
}
Now the question is,
I have a table that is filled with a List<EntityReadDto>, which is clear. Previously, EntityReadDto also included the full ManufacturerDto, with its id and text. Whenever I needed to edit one of the entries from the table, I could load the selected dropdown items, lists of tags, and so on from the ids attached to the Manufacturer objects within the read DTOs. Now that is not possible: to simplify the code, I converted them to read-only strings. I have therefore created another endpoint that returns an editable version of the record when needed. For example, EntityWriteDto will be used to fill the form when edit is clicked on a specific item. The changes will be made on that DTO and sent in a PUT request to update the record.
I am not sure whether this approach is OK for these cases. What is the best practice here? I have many objects related to entities from other tables. Is it OK to make a call to get an editable version from the backend, or should the Vue.js app have it right away?

Reference another class using an index vs storing entire instance

In a class which needs to "contain information" about another class (sorry I don't know the terms for this), should I store the reference to that other class as something like an integer/id, or should I store it as an instance of the other class? What is this called, if there is a name for it?
As a very basic example, an app where we want to store what a user's favorite restaurant is:
public class User {
public int id { get; set; }
public string name { get; set; }
// id of restaurant...
// public int favoriteRestaurantId { get; set; }
// ...or entire instance of Restaurant type
// public Restaurant favoriteRestaurant { get; set; }
}
public class Restaurant {
public int id { get; set; }
public string name { get; set; }
}
Note: if you think this is off topic, please explain why this question would be allowed and is a highly rated/useful question, but mine is not: Interface vs Base class Or at the very least tell me what this is "called" so I can research it more myself. As far as I can tell from Stackoverflow's FAQ this question is on topic.
Your first variant
public int favoriteRestaurantId { get; set; }
only makes sense if you are only interested in the id and not the other attributes (name) of the restaurant object. Otherwise you will need some external container that stores all restaurant objects and have to search the container for the restaurant with the given id.
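As an illustration, the "external container" could be as simple as a dictionary keyed by id. This is only a sketch; RestaurantDirectory and its methods are hypothetical names, not something from the question.

```csharp
using System;
using System.Collections.Generic;

public class Restaurant
{
    public int id { get; set; }
    public string name { get; set; }
}

public class User
{
    public int id { get; set; }
    public string name { get; set; }
    public int favoriteRestaurantId { get; set; }
}

// Hypothetical container that owns all Restaurant objects.
public static class RestaurantDirectory
{
    private static readonly Dictionary<int, Restaurant> byId = new Dictionary<int, Restaurant>();

    public static void Add(Restaurant restaurant)
    {
        byId[restaurant.id] = restaurant;
    }

    // Returns null when no restaurant with that id is known.
    public static Restaurant Find(int id)
    {
        Restaurant found;
        return byId.TryGetValue(id, out found) ? found : null;
    }

    public static void Main()
    {
        Add(new Restaurant { id = 7, name = "Luigi's" });
        var user = new User { id = 1, name = "Ann", favoriteRestaurantId = 7 };

        // Every access to the restaurant's other attributes goes through a lookup.
        Restaurant favorite = Find(user.favoriteRestaurantId);
        Console.WriteLine(favorite.name);
    }
}
```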
In your second variant
public Restaurant favoriteRestaurant { get; set; }
if you write
someUser.favoriteRestaurant = someRestaurant;
this also stores a reference to the existing someRestaurant. It will not copy the whole object, at least not in languages like C# and Java.
Only if you do something like
someUser.favoriteRestaurant = new Restaurant(someRestaurant);
will the user have its own copy of the restaurant object.
There are cases where this would make sense, but in your example it is probably not a good idea, for two reasons:
If, for example, the name of someRestaurant changes, the name of favoriteRestaurant should change too. This will not happen automatically if favoriteRestaurant is a copy.
It is a waste of memory.
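The difference between the two assignments can be seen in a small self-contained sketch. The copy constructor on Restaurant is an assumption added for the example; the classes in the question do not define one.

```csharp
using System;

public class Restaurant
{
    public int id { get; set; }
    public string name { get; set; }

    public Restaurant() { }

    // Copy constructor (assumed for this example): duplicates the other object's values.
    public Restaurant(Restaurant other)
    {
        id = other.id;
        name = other.name;
    }
}

public class User
{
    public int id { get; set; }
    public string name { get; set; }
    public Restaurant favoriteRestaurant { get; set; }
}

public static class ReferenceVsCopy
{
    public static void Main()
    {
        var someRestaurant = new Restaurant { id = 1, name = "Old Name" };

        // Stores a reference: both variables point at the same object.
        var byReference = new User { favoriteRestaurant = someRestaurant };

        // Stores an independent copy of the object.
        var byCopy = new User { favoriteRestaurant = new Restaurant(someRestaurant) };

        someRestaurant.name = "New Name";

        Console.WriteLine(byReference.favoriteRestaurant.name); // prints "New Name"
        Console.WriteLine(byCopy.favoriteRestaurant.name);      // prints "Old Name"
    }
}
```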

Inheriting data from ancestor documents in RavenDB

I'm using RavenDB to store three types of curriculum: Units, Lessons, and Activities. These three types all inherit from CurriculumBase:
public abstract class CurriculumBase
{
public string Id { get; set; }
public string Title { get; set; }
public List<string> SubjectAreaIds { get; set; }
// non-relevant properties removed
}
These documents have a hierarchical relationship, so I've modeled the hierarchy as a separate single document as recommended here: Modelling Hierarchical Data with RavenDB
public class CurriculumHierarchy
{
public class Node
{
public string CurriculumId { get; set; }
public string Title { get; set; }
public List<Node> Children { get; set; }
public Node()
{
Children = new List<Node>();
}
}
public List<Node> RootCurriculum { get; set; }
public CurriculumHierarchy()
{
RootCurriculum = new List<Node>();
}
}
I need to be able to do searches across all curriculum documents. For simple properties, that seems easy enough to do with a multi-map index.
However one of the properties I need to be able to search by (in combination with the other search criteria) is SubjectAreaId. I need to be able to get curriculum for which it or any of its ancestors have the specified subject area id(s). In other words, for search purposes, documents should inherit the subjectAreaIds of their ancestors.
I've considered de-normalizing subjectAreaIds, and storing the full calculated set of subjectAreaIds in each document, but that will require updates whenever the hierarchy itself or the subjectAreaIds of any of a given document's ancestors change. I'm hoping this is something I can accomplish with an index, or perhaps an entirely different approach is needed.
You can use LoadDocument to load the parents during indexing.
http://ravendb.net/docs/article-page/3.0/csharp/indexes/indexing-related-documents
The main challenge I encountered was that I had written code in CurriculumHierarchy to get a document's ancestors, but this code isn't executable during indexing.
To solve this, I added a read-only property to CurriculumHierarchy which generates a dictionary of ancestors for each document:
public Dictionary<string, IEnumerable<string>> AncestorLookup
{
get
{
// Not shown: build a dictionary where the key is an
// ID and the value is a list of the IDs for
// that item's ancestors
}
}
This dictionary is serialized by Raven and therefore available for indexing.
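The body of that property is not shown above. One possible way to build such a lookup is a depth-first walk over the tree, sketched below. This is an assumption about how it could be done, not the code actually used: AncestorLookupBuilder is a made-up name, the Node class is trimmed to the fields the walk needs, and each curriculum ID is assumed to appear only once in the hierarchy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for CurriculumHierarchy.Node.
public class Node
{
    public string CurriculumId { get; set; }
    public List<Node> Children { get; set; } = new List<Node>();
}

public static class AncestorLookupBuilder
{
    // Maps each curriculum ID to the IDs of its ancestors, root first.
    public static Dictionary<string, IEnumerable<string>> Build(IEnumerable<Node> roots)
    {
        var lookup = new Dictionary<string, IEnumerable<string>>();
        foreach (var root in roots)
            Visit(root, new List<string>(), lookup);
        return lookup;
    }

    private static void Visit(Node node, List<string> path,
        Dictionary<string, IEnumerable<string>> lookup)
    {
        // Snapshot the current path so later changes do not affect the stored value.
        lookup[node.CurriculumId] = path.ToList();

        path.Add(node.CurriculumId);
        foreach (var child in node.Children)
            Visit(child, path, lookup);
        path.RemoveAt(path.Count - 1);
    }

    public static void Main()
    {
        var activity = new Node { CurriculumId = "activities/1" };
        var lesson = new Node { CurriculumId = "lessons/1", Children = { activity } };
        var unit = new Node { CurriculumId = "units/1", Children = { lesson } };

        var lookup = Build(new[] { unit });
        Console.WriteLine(string.Join(", ", lookup["activities/1"])); // prints "units/1, lessons/1"
    }
}
```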
Then my index ended up looking like this:
public class Curriculum_Search : AbstractMultiMapIndexCreationTask
{
public Curriculum_Search()
{
AddMap<Activity>(
activities =>
from activity in activities
let hierarchy =
LoadDocument<CurriculumHierarchy>("curriculum_hierarchy")
let ancestors =
LoadDocument<CurriculumBase>(hierarchy.AncestorLookup[activity.Id])
select new
{
subjectAreaIds = ancestors.SelectMany(x => x.SubjectAreaIds).Distinct().Union(activity.SubjectAreaIds),
});
// Not shown: Similar AddMap statements for Lessons and Units
}
}
I was a bit concerned about performance, but since there are fewer than 2,000 total curriculum documents, this seems to perform acceptably.

RavenDB Persisting chain of relationships

I'm working on a collaborative document editing tool that's going to use RavenDB for persistence. In my domain I have a document class that looks like this.
public class Document
{
public string Id { get; private set; }
public string Name { get; set; }
public IRevision CurrentRevision {get; private set; }
public string Contents {get { return CurrentRevision.GenerateEditedContent(); }}
}
As you can see that document then has a CurrentRevision property that points to an IRevision object that looks like this.
public interface IRevision
{
IRevision PreviousRevisionAppliedTo { get; }
IRevision NextRevisionApplied { get; set; }
Guid Id { get; }
string GenerateEditedContent();
}
So the basic idea is that the document's contents are generated on the fly by checking out the current revision, which in turn checks its parent revision, and so on and so forth.
Out of the box, RavenDB doesn't seem to handle persisting this chain of object references the way I need it to. I've been trying to persist it by just calling Session.Store(document), and hoping that the list of associated revisions would get stored as well. I've looked into some pieces of the RavenDB framework like custom serializers, but I can't figure out a clear path that would allow me to deserialize and reserialize the data as I would like. What's a good way to handle this situation?

RavenDB document design, patching, and index creation

I am revisiting RavenDB after a brief experiment quite a while ago. At the moment I'm considering document design which is nested 3 levels deep, i.e.
public class UserEvent
{
public UserEvent()
{
Shows = new List<Show>();
}
public readonly string IdPrefix = "Events/";
public string Id { get; set; }
public string Title { get; set; }
public List<Show> Shows { get; set; }
}
public class Show
{
public Show()
{
Entries = new List<ShowEntry>();
}
public readonly string IdPrefix = "Shows/";
public string Id { get; set; }
public string EventId { get; set; }
public string Name { get; set; }
public DateTime Date { get; set; }
public List<ShowEntry> Entries { get; set; }
}
public class ShowEntry
{
public readonly string IdPrefix = "ShowEntries/";
public string Id { get; set; }
public string DogId { get; set; }
public string OwnerName { get; set; }
public EntryClass Class { get; set; }
}
First of all, is this a sensible design? A UserEvent generally has a few (fewer than six) Shows, but a Show can have anywhere from tens to hundreds of ShowEntry items. I have included DogId in ShowEntry, but maybe later I will change it to a property of type Dog. A Dog is of a particular Breed, and a Breed belongs to a Group. The Dog side of the story will have to be another question, but for now I'm interested in the UserEvent side.
If my documents are designed this way, can I use the Patching API to add items to the Entries collection within a Show? I would like to have an index which will summarise Entries based on Dog properties. Will indexes get processed if a document is patched?
Your design certainly looks sensible from an outside perspective. The big question you need to ask yourself is, "What do you plan on querying a majority of the time?"
For instance, Show seems to be a fairly common object that would benefit from being an Aggregate Root (from Domain Driven Design). I find that when organizing my documents, the most important question is, "how often do you plan on querying the object."
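If Show were promoted to its own document, the model might look something like the sketch below. This is only one possible shape, not a recommendation from the RavenDB documentation: the ShowIds list is an assumption, ShowEntry stays embedded in Show, and the EntryClass property is left out because its type is not shown in the question.

```csharp
using System;
using System.Collections.Generic;

public class UserEvent
{
    public string Id { get; set; }
    public string Title { get; set; }
    // References to separate Show documents instead of embedded Show objects.
    public List<string> ShowIds { get; set; } = new List<string>();
}

public class Show
{
    public string Id { get; set; }
    public string EventId { get; set; }
    public string Name { get; set; }
    public DateTime Date { get; set; }
    // Entries remain embedded: they are only meaningful within their Show.
    public List<ShowEntry> Entries { get; set; } = new List<ShowEntry>();
}

public class ShowEntry
{
    public string DogId { get; set; }
    public string OwnerName { get; set; }
}

public static class ShowModelDemo
{
    public static void Main()
    {
        var userEvent = new UserEvent { Id = "Events/1", Title = "Spring Championship" };
        var show = new Show { Id = "Shows/1", EventId = userEvent.Id, Name = "Day One", Date = new DateTime(2015, 5, 1) };
        userEvent.ShowIds.Add(show.Id);

        // Adding an entry changes only the Show document, not the UserEvent.
        show.Entries.Add(new ShowEntry { DogId = "Dogs/1", OwnerName = "Ann" });
        Console.WriteLine(show.Name + ": " + show.Entries.Count + " entry");
    }
}
```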
To answer your last question, patching should definitely cause re-indexing.