Document structure for RavenDB


I have a class:
class UserActivity
{
    private IList<Activity> _activities = new List<Activity>();
    public void AddActivity(Activity activity)
    {
        _activities.Add(activity);
    }
    public IEnumerable<Activity> ActivityHistory
    {
        get { return _activities; }
    }
}
The activity history can contain a lot of elements.
I want to store UserActivity in RavenDB together with its history. However, when I first load a UserActivity from the database, ActivityHistory should contain only the last 10 items (for example), and I should still be able to load the other items, or add new ones to the collection and save them.
How can I do that?
As I understand it, I have only one option: store UserActivity and the history items separately.

Gengzu,
Yes, you need to deal with the document as a whole.
You can project parts of the document out into an index, but that is a projection, not an entity.
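The "store them apart" idea from the question could be sketched roughly as follows. This is only an illustration: ActivityRecord, UserId, OccurredAt and ActivityHistoryReader are made-up names, and a plain in-memory sequence stands in for a query over stored activity documents, so nothing here is RavenDB API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: each history entry is its own document
// that points back to the user it belongs to.
public class ActivityRecord
{
    public string Id { get; set; }
    public string UserId { get; set; }
    public DateTime OccurredAt { get; set; }
    public string Description { get; set; }
}

public static class ActivityHistoryReader
{
    // Returns one page of a user's history, newest first.
    // 'source' is an assumption: it stands in for a query over the stored activities.
    public static List<ActivityRecord> GetPage(
        IEnumerable<ActivityRecord> source, string userId, int skip, int take)
    {
        return source
            .Where(a => a.UserId == userId)
            .OrderByDescending(a => a.OccurredAt)
            .Skip(skip)
            .Take(take)
            .ToList();
    }
}
```

With this shape, GetPage(activities, userId, 0, 10) returns the latest 10 items, GetPage(activities, userId, 10, 10) the next 10, and adding an activity means storing one small new document instead of rewriting the whole history.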

Related

Strategy for generating unique identifier for document in RavenDB

Say I have something like a support ticket system (simplified as an example). It has many users and organizations. Each user can be a member of several organizations, but the typical case would be one org => many users, and most of them belong only to this organization. Each organization has a "tag" which is to be used to construct "ticket numbers" for this organization. Let's say we have an org called StackExchange that wants the tag SES.
So if I open the first ticket of today, I want it to be SES140407-01. The next is SES140407-02 and so on. Doesn't have to be two digits after the dash.
How can I make sure this is generated in a way that makes sure it is 100% unique across the organization (no orgs will have the same tag)?
Note: This does not have to be the document ID in the database - that will probably just be a Guid or similar. This is just a ticket reference - kinda like a slug - that will appear in related emails etc. So it has to be unique, and I would prefer if we didn't "waste" the sequential case numbers hilo style.
Is there a practical way to ensure I get a unique ticket number even if two or more people report a new one at almost the same time?
EDIT: Each Organization is a document in RavenDB, and can easily hold a property like LastIssuedTicketId. My challenge is basically to find the best way to read this field, generate a new one, and store this back in a way that is "race condition safe".
Another edit: To be clear - I intend to generate the ticket ID in my own software. What I am looking for is a way to ask RavenDB "what was the last ticket number", and then when I generate the next one after that, "am I the only one using this?" - so that I give my ticket a unique case id, not necessarily related to what RavenDB considers the document id.

For that I use a generic sequence generator written for RavenDB:
public class SequenceGenerator
{
    private static readonly object Lock = new object();
    private readonly IDocumentStore _docStore;
    public SequenceGenerator(IDocumentStore docStore)
    {
        _docStore = docStore;
    }
    public int GetNextSequenceNumber(string sequenceKey)
    {
        // serialize callers within this process to avoid needless conflicts
        lock (Lock)
        {
            using (new TransactionScope(TransactionScopeOption.Suppress))
            {
                while (true)
                {
                    try
                    {
                        var document = GetDocument(sequenceKey);
                        if (document == null)
                        {
                            PutDocument(new JsonDocument
                            {
                                // sending an empty etag means: ensure that the document does NOT exist
                                Etag = Etag.Empty,
                                Metadata = new RavenJObject(),
                                DataAsJson = RavenJObject.FromObject(new { Current = 0 }),
                                Key = sequenceKey
                            });
                            return 0;
                        }
                        var current = document.DataAsJson.Value<int>("Current");
                        current++;
                        document.DataAsJson["Current"] = current;
                        // the put fails with a ConcurrencyException if the etag changed since the read
                        PutDocument(document);
                        return current;
                    }
                    catch (ConcurrencyException)
                    {
                        // expected, we need to retry
                    }
                }
            }
        }
    }
    private void PutDocument(JsonDocument document)
    {
        _docStore.DatabaseCommands.Put(
            document.Key,
            document.Etag,
            document.DataAsJson,
            document.Metadata);
    }
    private JsonDocument GetDocument(string key)
    {
        return _docStore.DatabaseCommands.Get(key);
    }
}
It generates an incremental, unique sequence per sequenceKey. Uniqueness is guaranteed by RavenDB's optimistic concurrency, which is based on the Etag. Each sequence has its own document, which is updated whenever a new sequence number is generated. In addition, the lock reduces extra database calls when several threads run this code at the same moment in the same process (AppDomain). Note that the first number returned for a new key is 0.
For your case you can use it this way:
var sequenceKey = string.Format("{0}{1:yyMMdd}", yourCompanyPrefix, DateTime.Now);
var nextSequenceNumber = new SequenceGenerator(yourDocStore).GetNextSequenceNumber(sequenceKey);
var nextSequenceKey = string.Format("{0}-{1:00}", sequenceKey, nextSequenceNumber);
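To see the read, increment, conditional-write, retry loop in isolation, here is a small sketch that runs without RavenDB. Everything in it is an assumption made for illustration: VersionedCounterStore is a toy in-memory store whose version number plays the role of the Etag, VersionConflictException plays the role of ConcurrencyException, and, unlike the generator above, this sketch starts counting at 1.

```csharp
using System;
using System.Collections.Generic;

// Thrown when the caller's expected version no longer matches the stored one.
public class VersionConflictException : Exception { }

// Toy in-memory store: each key holds a value plus a version number (the "etag").
public class VersionedCounterStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, KeyValuePair<int, long>> _data =
        new Dictionary<string, KeyValuePair<int, long>>();

    // Returns false if the key does not exist yet.
    public bool TryGet(string key, out int value, out long version)
    {
        lock (_gate)
        {
            KeyValuePair<int, long> entry;
            if (_data.TryGetValue(key, out entry))
            {
                value = entry.Key;
                version = entry.Value;
                return true;
            }
            value = 0;
            version = 0;
            return false;
        }
    }

    // Writes only if the stored version equals expectedVersion (0 means "must not exist yet").
    public void Put(string key, int value, long expectedVersion)
    {
        lock (_gate)
        {
            KeyValuePair<int, long> entry;
            long currentVersion = _data.TryGetValue(key, out entry) ? entry.Value : 0;
            if (currentVersion != expectedVersion)
                throw new VersionConflictException();
            _data[key] = new KeyValuePair<int, long>(value, currentVersion + 1);
        }
    }
}

public static class ToySequence
{
    // Read, increment, conditional write; on a conflict, re-read and try again.
    public static int Next(VersionedCounterStore store, string key)
    {
        while (true)
        {
            int current;
            long version;
            bool exists = store.TryGet(key, out current, out version);
            int next = exists ? current + 1 : 1;
            try
            {
                store.Put(key, next, exists ? version : 0);
                return next;
            }
            catch (VersionConflictException)
            {
                // Someone else wrote first; loop and retry with fresh data.
            }
        }
    }
}
```

The point of the design is that a number is only handed out after the conditional write succeeds, so two callers that read the same value cannot both return it: one of them gets the conflict and retries.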

With the repository pattern, should I check whether the queried object exists in memory before making an actual database call?

I have a repository, for example, UserRepository. It returns a user by given userId. I work on web application, so objects are loaded into memory, used, and disposed when the request ends.
So far, when I write a repository, I simply retrieve the data from the database. I don't store the retrieved User object in memory (I mean in a collection inside the repository). When the repository's GetById() method is called, I don't check if the object is already in the collection. I simply query the database.
My questions are:
Should I store retrieved objects in memory, and when a repository's Get method is called, should I check whether the object exists in memory before I make any database call?
Or is the in-memory collection unnecessary, as a web request is a short-lived session and all objects are disposed afterward?

1) Should I store retrieved objects in the memory, and when a repository's Get method is called, should I check if the object exists in the memory first before I make any Database call?
Since your repository should be abstracted enough to simulate the purpose of an in-memory collection, I think it is really up to you and to your use case.
If you store your objects after retrieving them from the database, you will probably end up with an implementation of the so-called IdentityMap. If you do this, it can get very complicated (well, it depends on your domain).
Depending on the infrastructure layer you rely on, you may use the IdentityMap provided by your ORM, if any.
But the real question is: is it worth implementing an IdentityMap?
I mean, we agree that repeating a query may be wrong for two reasons, performance and integrity. Here is a quote from Martin Fowler:
An old proverb says that a man with two watches never knows what time it is. If two watches are confusing, you can get in an even bigger mess with loading objects from a database.
But sometimes you need to be pragmatic and just load them every time you need them.
2) Or is the memory collection unnecessary, as web request is a short-lived session and all objects are disposed afterward
It depends™. In some cases you may have to work with the same object in several places, and then it may be worth it. But if, say, you only need to refresh the user's session identity by loading the user from the database, you do that just once within the whole request.

As is the usual case I don't think there is going to be a "one-size-fits-all".
There may be situations where one may implement a form of caching on a repository when the data is retrieved often, does not go stale too quickly, or simply for efficiency.
However, you could very well implement a type of generic cache decorator that can wrap a repository when you do need this.
So one should take each use case on merit.
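A generic cache decorator along those lines might look like the sketch below. It is a minimal illustration under assumptions: IRepository, CachingRepository and CountingUserRepository are hypothetical names, the cache is a plain dictionary with no expiry, and it is not thread-safe, so it suits a per-request instance rather than a shared one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal repository contract.
public interface IRepository<TKey, TEntity>
{
    TEntity GetById(TKey id);
}

// Decorator: wraps any repository and remembers what it has already loaded.
public class CachingRepository<TKey, TEntity> : IRepository<TKey, TEntity>
{
    private readonly IRepository<TKey, TEntity> _inner;
    private readonly Dictionary<TKey, TEntity> _cache = new Dictionary<TKey, TEntity>();

    public CachingRepository(IRepository<TKey, TEntity> inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        _inner = inner;
    }

    public TEntity GetById(TKey id)
    {
        TEntity cached;
        if (_cache.TryGetValue(id, out cached))
            return cached;               // served from memory, inner repository not called

        TEntity loaded = _inner.GetById(id);
        _cache[id] = loaded;             // note: "not found" (null) results are cached too
        return loaded;
    }
}

// Stand-in for a database-backed repository; counts how often it is hit.
public class CountingUserRepository : IRepository<int, string>
{
    public int Calls;

    public string GetById(int id)
    {
        Calls++;
        return "user-" + id;
    }
}
```

Wrapping is opt-in: new CachingRepository<int, string>(new CountingUserRepository()) caches, while using the inner repository directly keeps the "always query" behaviour, so each use case can choose.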

When you're using an ORM like Entity Framework or NHibernate, it's already taken care of: all loaded entities are tracked via an IdentityMap mechanism, and looking up by key (DbSet.Find in EF) won't even hit the database if the entity is already loaded.
If you're using direct database access or a microORM as base for your repository, you should be careful - without IdentityMap you're essentially working with value objects:
using System;
using System.Collections.Generic;
using System.Linq;
namespace test
{
    internal class Program
    {
        static void Main()
        {
            Console.WriteLine("Identity map");
            var artrepo1 = new ArticleIMRepository();
            var o1 = new Order();
            o1.OrderLines.Add(new OrderLine {Article = artrepo1.GetById(1, "a1", 100), Quantity = 50});
            o1.OrderLines.Add(new OrderLine {Article = artrepo1.GetById(1, "a1", 100), Quantity = 30});
            o1.OrderLines.Add(new OrderLine {Article = artrepo1.GetById(2, "a2", 100), Quantity = 20});
            o1.ConfirmOrder();
            o1.PrintChangedStock();
            /*
            Art. 1/a1, Stock: 20
            Art. 2/a2, Stock: 80
            */
            Console.WriteLine("Value objects");
            var artrepo2 = new ArticleVORepository();
            var o2 = new Order();
            o2.OrderLines.Add(new OrderLine {Article = artrepo2.GetById(1, "a1", 100), Quantity = 50});
            o2.OrderLines.Add(new OrderLine {Article = artrepo2.GetById(1, "a1", 100), Quantity = 30});
            o2.OrderLines.Add(new OrderLine {Article = artrepo2.GetById(2, "a2", 100), Quantity = 20});
            o2.ConfirmOrder();
            o2.PrintChangedStock();
            /*
            Art. 1/a1, Stock: 50
            Art. 1/a1, Stock: 70
            Art. 2/a2, Stock: 80
            */
            Console.ReadLine();
        }
        #region "Domain Model"
        public class Order
        {
            public List<OrderLine> OrderLines = new List<OrderLine>();
            public void ConfirmOrder()
            {
                foreach (OrderLine line in OrderLines)
                {
                    line.Article.Stock -= line.Quantity;
                }
            }
            public void PrintChangedStock()
            {
                foreach (var a in OrderLines.Select(x => x.Article).Distinct())
                {
                    Console.WriteLine("Art. {0}/{1}, Stock: {2}", a.Id, a.Name, a.Stock);
                }
            }
        }
        public class OrderLine
        {
            public Article Article;
            public int Quantity;
        }
        public class Article
        {
            public int Id;
            public string Name;
            public int Stock;
        }
        #endregion
        #region Repositories
        public class ArticleIMRepository
        {
            private static readonly Dictionary<int, Article> Articles = new Dictionary<int, Article>();
            public Article GetById(int id, string name, int stock)
            {
                if (!Articles.ContainsKey(id))
                    Articles.Add(id, new Article {Id = id, Name = name, Stock = stock});
                return Articles[id];
            }
        }
        public class ArticleVORepository
        {
            public Article GetById(int id, string name, int stock)
            {
                return new Article {Id = id, Name = name, Stock = stock};
            }
        }
        #endregion
    }
}

LightSwitch - bulk-loading all requests into one using a domain service

I need to group some data from a SQL Server database and since LightSwitch doesn't support that out-of-the-box I use a Domain Service according to Eric Erhardt's guide.
However my table contains several foreign keys and of course I want the correct related data to be shown in the table (just doing like in the guide will only make the key values show). I solved this by adding a Relationship to my newly created Entity like this:
And my Domain Service class looks like this:
public class AzureDbTestReportData : DomainService
{
    private CountryLawDataDataObjectContext context;
    public CountryLawDataDataObjectContext Context
    {
        get
        {
            if (this.context == null)
            {
                EntityConnectionStringBuilder builder = new EntityConnectionStringBuilder();
                builder.Metadata =
                    "res://*/CountryLawDataData.csdl|res://*/CountryLawDataData.ssdl|res://*/CountryLawDataData.msl";
                builder.Provider = "System.Data.SqlClient";
                builder.ProviderConnectionString =
                    WebConfigurationManager.ConnectionStrings["CountryLawDataData"].ConnectionString;
                this.context = new CountryLawDataDataObjectContext(builder.ConnectionString);
            }
            return this.context;
        }
    }
    /// <summary>
    /// Override the Count method in order for paging to work correctly
    /// </summary>
    protected override int Count<T>(IQueryable<T> query)
    {
        return query.Count();
    }
    [Query(IsDefault = true)]
    public IQueryable<RuleEntryTest> GetRuleEntryTest()
    {
        return this.Context.RuleEntries
            .Select(g =>
                new RuleEntryTest()
                {
                    Id = g.Id,
                    Country = g.Country,
                    BaseField = g.BaseField
                });
    }
}
public class RuleEntryTest
{
    [Key]
    public int Id { get; set; }
    public string Country { get; set; }
    public int BaseField { get; set; }
}
It works: both the Country name and the BaseField load with autocomplete boxes as they should, but it takes a VERY long time. With two columns it takes 5-10 seconds to load one page, and I have 10 more columns I haven't implemented yet.
The reason it takes so long is that each piece of related data (each Country and BaseField) requires its own request. Loading a page looks like this in Fiddler:
This isn't acceptable at all. There should be a way of combining all those calls into one, just as happens when loading the same table without going through the Domain Service.
So, that was a lot of explaining. My question is: is there any way I can make all the related data load at once, or improve the performance in some other way? It should not take 10+ seconds to load a screen.
Thanks for any help or input!

My RIA Service queries are extremely fast, compared to not using them, even when I'm doing aggregation. It might be the fact that you're using "virtual relationships" (which you can tell by the dotted lines between the tables), that you've created using your RuleEntryTest entity.
Why is your original RuleEntry entity not related to both Country & BaseUnit in LightSwitch BEFORE you start creating your RIA entity?
I haven't used Fiddler to see what's happening, but I'd try creating "real" relationships, instead of "virtual" ones, & see if that helps your RIA entity's performance.

Relation many-to-one retrieved from custom cache

This is more of a theoretical question.
I have one table that holds dictionary items and another that holds user data.
The User table contains a lot of many-to-one reference columns pointing at the dictionary item table. It looks like this:
public class User
{
    public int Id;
    public Dictionary Status;
    public Dictionary Type;
    public Dictionary OrganizationUnit;
    ......
}
I want to load all dictionary items when the application starts, and then, when I load a user and access one of its dictionary reference properties, the dictionary object should be taken from the cache.
I know I can use a second-level cache in this scenario, but I'm interested in other solutions. Are there any?
Is it possible to create my own custom type and tell it to use my custom cache to retrieve the dictionary value?

Across multiple sessions, the second-level cache is the best answer. The only other ways I can think of to populate objects from a cache without the second-level cache would be to use an onLoad interceptor (and simply leave your dictionaries unmapped) or to do it manually somewhere in your application.
But why don't you want to use the second-level cache? If your view of caching is very different from the stores that Hibernate already has providers for, you could implement your own provider.

Why not store it in the session? Just pull the record set once, push it into the session, and retrieve it each time you want it. I do something similar for other things, and I believe my method should work for you. In my code I have a session manager that I call directly from any piece of code that needs the session values. I chose this method because I can query the results and I can control the storage and retrieval methods. When relying on NHibernate to do the caching for me, I don't have the granularity of control to make specific record sets available only to specific sessions. I also find that NHibernate is not as efficient as using the session directly. When profiling CPU and memory usage, I find that this method is faster and uses a little less memory. If you want to do it at the site level instead of per session, look into HttpContext.Current.Cache.
The following example works perfectly for storing and retrieving record sets:
// Set the session
SessionManager.User = (Some code to pull the user record with relationships. Set the fetch mode to eager for each relationship else you will just have broken references.)
// Get the session
User myUser = SessionManager.User;
public static class SessionManager
{
    public static User User
    {
        get { return GetSession("MySessionUser") as User; }
        set { SetSession("MySessionUser", value); }
    }
    private static object GetSession(string key)
    {
        // Fix Null reference error
        if (System.Web.HttpContext.Current == null || System.Web.HttpContext.Current.Session == null)
        {
            return null;
        }
        else
        {
            return System.Web.HttpContext.Current.Session[key];
        }
    }
    private static void SetSession(string key, object valueIn)
    {
        // Fix null reference error
        if (System.Web.HttpContext.Current.Session[key] == null)
        {
            System.Web.HttpContext.Current.Session.Add(key, valueIn);
        }
        else
        {
            System.Web.HttpContext.Current.Session[key] = valueIn;
        }
    }
}

Does Fluent NHibernate lazy-load an IList from Criteria?

I want to build a kind of "news feed" in my company's application.
In this scenario, user actions generate "Activity" records of different kinds, and other users see them in their "news feed".
However, an "Activity" is not relevant to all users, and determining the relation takes a complex piece of code.
Here is my Activity class:
public class Activity: IActivity
{
    public virtual int Id { get; set; }
    public virtual ActivityType Type { get; set; }
    public virtual User User { get; set; }
    public virtual bool IsVisibleToUser(User userLook)
    {
        // Complex business calculation etc.
        return true;
    }
}
I want to get the latest 10 news items that are visible to the user. Since the Activity table will be quite large and performance is an issue, I want to follow best practice here.
What I am about to do is fetch the last 25 activities and check whether they fill the list to show to the user. For example, if only 5 of them are visible to the user, I will fetch another 25 activities, and so on.
IList<Activity> resultList = session.CreateCriteria(typeof(Activity))
    .SetMaxResults(25)
    .AddOrder(Order.Desc("Id"))
    .List<Activity>();
What I want to know is: if I get the whole list ordered by Id and check the items one by one for visibility, will NHibernate load only the objects that I actually use, or not?
IList<Activity> resultList = session.CreateCriteria(typeof(Activity))
    .AddOrder(Order.Desc("Id"))
    .List<Activity>();
int count = 0;
foreach (Activity act in resultList)
{
    if (act.IsVisibleToUser(CurrentUser))
    {
        count++;
        // Do something with act
        if (count == 10)
            break;
    }
}
EDIT:
Here is the ActivityMap mapping for the Activity model.
public class ActivityMap : ClassMap<Activity>
{
    public ActivityMap()
    {
        Id(x => x.Id);
        Map(x => x.Type).CustomType(typeof(Int32));
        References(x => x.User).Nullable();
    }
}

If your question is about what the generated SQL would look like, my guess would be:
SELECT
    this_.Id as Id0_0_,
    this_.ActivityType as ActivityType_0_0_,
    --Other fields
FROM dbo.ActivityType this_
WHERE
    --condition
ORDER BY
    --condition
Since you have mentioned that the Activity count is huge, you can make use of ICriteria's SetFirstResult and SetMaxResults.
SetFirstResult(int) indicates the index of the first item that you wish to obtain, and SetMaxResults(int) indicates the number of rows you wish to get, 25 in your case.
List<Activity>() would load all the records into memory at once.
[UPDATE] If you need the records to be returned one by one, make use of Enumerable() -
If you expect your query to return a very large number of objects, but you don't expect to use them all, you might get better performance from the Enumerable() methods, which return a System.Collections.IEnumerable. The iterator will load objects on demand, using the identifiers returned by an initial SQL query (n+1 selects total).
Source - Link

No, the List() method pulls everything into memory at once.
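Given that, the "fetch 25 at a time until 10 visible items are found" plan from the question could be sketched as below. This is a minimal sketch under assumptions: VisibleFeed and TakeVisible are made-up names, and fetchPage(first, max) is a placeholder for a paged Criteria query that uses SetFirstResult(first) and SetMaxResults(max) with the same ordering on every call.

```csharp
using System;
using System.Collections.Generic;

public static class VisibleFeed
{
    // Pulls pages of 'pageSize' items until 'wanted' visible items are collected
    // or the source runs out. fetchPage(first, max) is a stand-in for the paged query.
    public static List<T> TakeVisible<T>(
        Func<int, int, IList<T>> fetchPage,
        Func<T, bool> isVisible,
        int wanted,
        int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        var result = new List<T>();
        int first = 0;
        while (result.Count < wanted)
        {
            IList<T> page = fetchPage(first, pageSize);
            foreach (T item in page)
            {
                if (isVisible(item))
                {
                    result.Add(item);
                    if (result.Count == wanted)
                        return result;   // enough items, stop without fetching more pages
                }
            }
            if (page.Count < pageSize)
                break;                   // a short page means the source is exhausted
            first += pageSize;
        }
        return result;
    }
}
```

Only as many pages as needed are loaded, so in the common case a single query of 25 rows is enough. One caveat of offset-based paging: if new activities are inserted between two page fetches, an item can show up twice or be skipped.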
