What is a proper way to store comments in RavenDB?

I'm thinking about how to design comments. My initial idea was just to store a Comments list in a document:
public class BlogPost
{
...
public IList<Comment> Comments { get; set; }
}
But I need to implement voting, so I'd like each comment to have an identifier (so the server can tell which comment was voted on at the client). RavenDB, however, is not very friendly toward identifiers on nested objects.
So I am unsure whether I should fake a Comment identifier or store comments in a more relational way:
public class Comment
{
public string BlogPostId {get;set;}
public string Text {get;set;}
public IList<CommentVote> Votes {get;set;}
}

I ended up identifying comments by AuthorId + CreationDateTime.
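A minimal sketch of that composite identifier, assuming the comments stay embedded in the BlogPost document. The names CreatedUtc, Key, VoteCount and TryUpvote are hypothetical, and the plain vote counter is a simplification of the IList<CommentVote> from the question. The key is only unique as long as one author never posts two comments with the same timestamp.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Comment
{
    public string AuthorId { get; set; }
    public DateTime CreatedUtc { get; set; }
    public string Text { get; set; }
    public int VoteCount { get; set; }   // simplification: a counter instead of a vote list

    // Hypothetical composite key: author id plus the round-trip ("o") timestamp.
    public string Key =>
        AuthorId + "|" + CreatedUtc.ToString("o", CultureInfo.InvariantCulture);
}

public class BlogPost
{
    public string Id { get; set; }
    public IList<Comment> Comments { get; set; } = new List<Comment>();

    // Finds the embedded comment the client voted on.
    // Returns false when no comment has the given key.
    public bool TryUpvote(string commentKey)
    {
        var comment = Comments.FirstOrDefault(c => c.Key == commentKey);
        if (comment == null) return false;
        comment.VoteCount++;
        return true;
    }
}
```

The client would send the Key back with each vote, and the server would load the post and call TryUpvote before saving the document.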

Related

Reference another class using an index vs storing entire instance

In a class which needs to "contain information" about another class (sorry I don't know the terms for this), should I store the reference to that other class as something like an integer/id, or should I store it as an instance of the other class? What is this called, if there is a name for it?
As a very basic example, an app where we want to store what a user's favorite restaurant is:
public class User {
public int id { get; set; }
public string name { get; set; }
// id of restaurant...
// public int favoriteRestaurantId { get; set; }
// ...or entire instance of Restaurant type
// public Restaurant favoriteRestaurant { get; set; }
}
public class Restaurant {
public int id { get; set; }
public string name { get; set; }
}
Note: if you think this is off topic, please explain why the question "Interface vs Base class" is allowed and highly rated, but mine is not. Or at the very least tell me what this is called so I can research it myself. As far as I can tell from Stack Overflow's FAQ, this question is on topic.
Your first variant
public int favoriteRestaurantId { get; set; }
only makes sense if you are only interested in the id and not the other attributes (name) of the restaurant object. Otherwise you will need some external container that stores all restaurant objects and have to search the container for the restaurant with the given id.
In your second variant
public Restaurant favoriteRestaurant { get; set; }
if you write
someUser.favoriteRestaurant = someRestaurant;
this also stores only a reference to the existing someRestaurant. It does not copy the whole object, at least not in languages like C# and Java.
Only if you do something like
someUser.favoriteRestaurant = new Restaurant(someRestaurant);
will the user have its own copy of the restaurant object.
There are cases where this would make sense, but in your example it is probably not a good idea, for two reasons:
1. If, for example, the name of someRestaurant changes, the name of favoriteRestaurant should change too. This will not happen automatically if favoriteRestaurant is a copy.
2. It is a waste of memory.
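The difference between sharing a reference and holding a copy can be sketched as follows. The copy constructor, the ReferenceVersusCopy class and its RenameAndRead method are assumptions added for illustration; the property names follow the question.

```csharp
using System;

public class Restaurant
{
    public int id { get; set; }
    public string name { get; set; }

    public Restaurant() { }

    // Copy constructor: this is what `new Restaurant(someRestaurant)` relies on.
    public Restaurant(Restaurant other)
    {
        id = other.id;
        name = other.name;
    }
}

public class User
{
    public Restaurant favoriteRestaurant { get; set; }
}

public static class ReferenceVersusCopy
{
    // Renames the original restaurant, then reads the name through
    // a shared reference and through an independent copy.
    public static (string Shared, string Copied) RenameAndRead()
    {
        var someRestaurant = new Restaurant { id = 1, name = "Old Name" };

        // Assignment stores a reference to the same object.
        var alice = new User { favoriteRestaurant = someRestaurant };

        // The copy constructor creates a second, independent object.
        var bob = new User { favoriteRestaurant = new Restaurant(someRestaurant) };

        someRestaurant.name = "New Name";

        return (alice.favoriteRestaurant.name, bob.favoriteRestaurant.name);
    }
}
```

Here the shared reference sees the rename while the copy keeps the old name, which is the first of the two drawbacks listed above.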

RavenDB Modeling - a single document vs multiple documents?

Given a simple example as follows, I'd like some guidance on whether to store as a single document vs multiple documents.
class User
{
public string Id;
public string UserName;
public List<Post> Posts;
}
class Post
{
public string Id;
public string Content;
}
Once the data is stored, there are times when I will want all the posts for a given user. Sometimes I might want posts across multiple users that meet particular criteria.
Should I store each User as a document (with Posts embedded), or does it make more sense to store Users and Posts as separate documents, and have some sort of ID in my post to link it back to a User?
Now, what if each user belonged to an Organization (there will be hundreds of organizations in my application)?
class Organization
{
public string Id;
public List<User> users;
}
Should I then stay with the single document approach? In this case I would store one giant document for each organization, which will contain embedded users, which in turn contain embedded posts?
You should keep them as separate documents. User, Organization, and Post are great examples of aggregate entities, and in Raven, each aggregate is usually its own document.
Only entities which are not aggregates should be nested in the same document. For example, in Post you might have a List<Comment>. Comment and Post are both entities, but only Post is an aggregate.
You should instead model them with references:
public class User
{
public string Id { get; set; }
public string Name { get; set; }
public List<string> PostIds { get; set; }
}
public class Post
{
public string Id { get; set; }
public string Content { get; set; }
}
public class Organization
{
public string Id { get; set; }
public List<string> UserIds { get; set; }
}
Optionally, you can denormalize some of the data into your references where appropriate:
public class UserRef
{
public string Id { get; set; }
public string Name { get; set; }
}
public class Organization
{
public string Id { get; set; }
public List<UserRef> Users { get; set; }
}
Denormalizing the user's name into the organization document has the benefit of not needing to fetch each user document when displaying the organization. However, it has the drawback of having to update the organization document any time a user's name is changed. You should weigh the pros and cons of this each time you consider a relationship. There is no one right answer for all cases.
Also, you should be considering how data will be really used. In practice, you will probably find that your Organization class may not need a user list at all. Instead, you could put a string OrganizationId property on the User class. That would be easier to maintain, and if you wanted a list of users in an organization, you could query for that information using an index.
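A rough sketch of the OrganizationId approach. It runs against an in-memory list (LINQ to Objects) so that it is self-contained; the UserQueries class and the InOrganization method are hypothetical names. Against RavenDB, the same Where clause would be applied to session.Query<User>() and answered from an index.

```csharp
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string OrganizationId { get; set; }   // reference back to the organization
}

public static class UserQueries
{
    // In-memory stand-in for the query. With a RavenDB session, the source
    // would be session.Query<User>() instead of an IEnumerable<User>.
    public static List<User> InOrganization(IEnumerable<User> users, string organizationId)
    {
        return users.Where(u => u.OrganizationId == organizationId).ToList();
    }
}
```

With this shape the Organization document needs no user list at all, so adding or removing a user touches only that user's document.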
You should read more in the RavenDB documentation on Document Structure Design and Handling Document Relationships.

RavenDB Index on SubClasses

I am trying to create indexes for ProviderProfileId, Email, and Address1.
I have created queries that work, but not indexes. I know that inheriting from List for the collections might be part of the problem. Inheriting from List is a carryover from when I had to do a significant amount of XmlSerialization on much older projects, and it became a habit in my modeling. I also noticed that in Raven the serialization is much cleaner than if AddressCollection were just a List. Any thoughts?
Model is similar to
public class Customer {
    public string Id {get;set;}
    public string Name {get;set;}
    // Property name as used by the index map (item.AddressCollection).
    public AddressCollection AddressCollection {get;set;}
    // Property name assumed to mirror the type name.
    public SocialMediaAliasCollection SocialMediaAliasCollection {get;set;}
}
public class SocialMediaAliasCollection : List<SocialMediaProfile> {}
public class SocialMediaProfile {
    public string ProviderProfileId {get;set;}
    public string Email {get;set;}
}
public class AddressCollection : List<Address> {}
public class Address {
    // Named Address1 because a C# member cannot share the name of its enclosing type.
    public string Address1 {get;set;}
    public string City {get;set;}
    public string State {get;set;}
    public string Zip {get;set;}
}
I have now answered this myself; basically, I didn't know LINQ well enough. It made sense once I figured it out. I was trying to create an index over a sub-collection, in this case addresses. I'm not 100% sure this works, but it compiles, and the server does not blow up when I push the index to it.
Map = collection => from item in collection
                    where item.AddressCollection != null
                    // one index entry per address in the sub-collection
                    from item2 in item.AddressCollection
                    select new {
                        item2.City
                    };
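To see what the nested from clause does, here is the same query shape run over in-memory objects with LINQ to Objects. This is only an illustration of the flattening, not a RavenDB index definition; the AddressProjection class and its Cities method are hypothetical, and the model is trimmed to the fields the sketch needs.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Address
{
    public string Address1 { get; set; }
    public string City { get; set; }
}

public class AddressCollection : List<Address> { }

public class Customer
{
    public string Id { get; set; }
    public AddressCollection AddressCollection { get; set; }
}

public static class AddressProjection
{
    // Produces one entry per address. Customers whose collection is null
    // are skipped by the where clause and contribute nothing.
    public static List<string> Cities(IEnumerable<Customer> customers)
    {
        var entries = from customer in customers
                      where customer.AddressCollection != null
                      from address in customer.AddressCollection
                      select new { address.City };

        return entries.Select(e => e.City).ToList();
    }
}
```

The second from clause is what the compiler turns into SelectMany, so a customer with three addresses yields three entries.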

How do I model Revisions of a Post?

Let's say I need to create a model for some portal like Stack Overflow, and I have a class Question.
Is it a good idea to have a class like this?
public class Question
{
public Guid Id { get; set; }
public int IdCreator { get; set; }
public List<QuestionRevision> QuestionRevisions { get; set; }
public List<Comment> Comments { get; set; }
}
and a class QuestionRevisions with fields like Editor and Content?
I would start with something like:
public class Question
    private Guid id
    private List<QuestionRevision> revisions
    private List<Comment> comments
    Question(id : Guid, text : String)
    getRevisions() : List<QuestionRevision>
    addRevision(revision : QuestionRevision) : void
    getComments() : List<Comment>
    addComment(comment : Comment) : void
So the main points here are:
The GUID and the question's text are supplied to the object on construction. These should be validated (i.e. non-null). Consider the Builder pattern if Question requires more setup.
A single revision is added to the question
A single comment is added to the question
Immutable views of the comments and revisions are accessed via the getters.
I almost never like seeing a class that is purely a holder for a collection, like QuestionRevisions. Question is a good choice to manage its own revisions and to use an appropriate internal data structure to store them (e.g. a List is sensible). Without further detail on Editor and Content, I'm not sure I can write any meaningful pseudocode for QuestionRevision.
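In C#, the outline above might look roughly like this. It is a sketch under assumptions: QuestionRevision is given string Editor and Content members purely for illustration, Comment is reduced to a single Text property, and the read-only views are built with List<T>.AsReadOnly().

```csharp
using System;
using System.Collections.Generic;

public class QuestionRevision
{
    // Editor and Content are named in the question; their types are assumed.
    public string Editor { get; }
    public string Content { get; }

    public QuestionRevision(string editor, string content)
    {
        Editor = editor ?? throw new ArgumentNullException(nameof(editor));
        Content = content ?? throw new ArgumentNullException(nameof(content));
    }
}

public class Comment
{
    public string Text { get; set; }
}

public class Question
{
    private readonly List<QuestionRevision> revisions = new List<QuestionRevision>();
    private readonly List<Comment> comments = new List<Comment>();

    public Guid Id { get; }
    public string Text { get; }

    // Id and text are validated once, on construction.
    public Question(Guid id, string text)
    {
        if (id == Guid.Empty) throw new ArgumentException("Id must not be empty.", nameof(id));
        Id = id;
        Text = text ?? throw new ArgumentNullException(nameof(text));
    }

    // Read-only views: callers cannot add or remove items through these.
    public IReadOnlyList<QuestionRevision> Revisions => revisions.AsReadOnly();
    public IReadOnlyList<Comment> Comments => comments.AsReadOnly();

    public void AddRevision(QuestionRevision revision)
    {
        revisions.Add(revision ?? throw new ArgumentNullException(nameof(revision)));
    }

    public void AddComment(Comment comment)
    {
        comments.Add(comment ?? throw new ArgumentNullException(nameof(comment)));
    }
}
```

All changes go through AddRevision and AddComment, so Question stays in control of its own collections.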

How to structure domain model when using NHibernate Search/Lucene

I am messing around with NHibernate Search and Lucene to create a searchable index of legal entities. My domain model looks somewhat like this:
[Indexed]
public abstract class LegalEntity : AggregateRoot
{
public virtual Address Address { get; set; }
}
public class Person : LegalEntity
{
public virtual string FirstNames { get; set; }
public virtual string LastName { get; set; }
}
public class Company: LegalEntity
{
public virtual string Name { get; set; }
}
public class Address : Component
{
public virtual string Street { get; set; }
public virtual string HouseNumber { get; set; }
// etc...
}
As the subclassing implies, LegalEntity is an NHibernate entity specialized as Person and Company, and Address is an NHibernate component.
Now, how would I best go about creating a really Google-like fuzzy search that includes all the fields of a LegalEntity, including those inside the Address component?
My first idea was to implement an AddressFieldBridge to help bring in the fields of the Address component, and then just put [Field] on all the fields, but then I could not find a way to construct a FuzzyQuery as a conjunction of multiple search terms.
My next idea was to create an abstract property tagged with [Field] on LegalEntity, like so:
[Field(Index.Tokenized)]
public abstract string SearchableText { get; }
and then have Person and Company return texts that combine names and all the fields from the Address component into one string, which would then be tokenized and indexed by Lucene.
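A sketch of what that combined-text property could look like. The NHibernate Search attributes and the AggregateRoot and Component base classes from the question are left out so the code compiles on its own, and the Combine helper is a hypothetical addition.

```csharp
using System.Linq;

public class Address
{
    public virtual string Street { get; set; }
    public virtual string HouseNumber { get; set; }
}

public abstract class LegalEntity
{
    public virtual Address Address { get; set; }

    // In the question this property carries [Field(Index.Tokenized)];
    // the attribute is omitted here to avoid the NHibernate Search dependency.
    public abstract string SearchableText { get; }

    // Hypothetical helper: joins the non-empty parts with single spaces.
    protected static string Combine(params string[] parts) =>
        string.Join(" ", parts.Where(p => !string.IsNullOrWhiteSpace(p)));
}

public class Person : LegalEntity
{
    public virtual string FirstNames { get; set; }
    public virtual string LastName { get; set; }

    public override string SearchableText =>
        Combine(FirstNames, LastName, Address?.Street, Address?.HouseNumber);
}

public class Company : LegalEntity
{
    public virtual string Name { get; set; }

    public override string SearchableText =>
        Combine(Name, Address?.Street, Address?.HouseNumber);
}
```

Each subclass decides which of its own fields go into the text, while the address part is handled the same way in both.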
That, however, made me feel kind of icky.
I would like to learn the best and least intrusive (from a domain model perspective) way to accomplish this task - any suggestions are appreciated :)
Take a look at my question.
Andrey created a patch that lets you specify the Lucene mapping outside the domain classes. The syntax isn't very clean (maybe there has been some progress on this; I haven't checked), but I guess it gets the job done.