Custom NHibernate session implementation

I'm working on a system that performs bulk processing using NHibernate. I know that NHibernate was not designed for bulk processing, but nonetheless the system is working perfectly thanks to a number of optimizations.
The object at the lowest level of granularity (i.e. the root of my aggregates) has a number of string properties that cannot (or, it does not make sense to) be modeled as many-to-one's (e.g. "Comment"). In reality, the fields in the DB corresponding to these properties take only so many values (for example because most - but not all - comments are machine-generated), with the result that when hydrating tons of objects, lots of memory is wasted by having thousands and thousands of instances of strings with the same values.
I was thinking of optimizing this scenario transparently by creating my own NHibernate custom type that enhances NHibernate's StringType by overriding NullSafeGet() and doing a dictionary lookup to return the same instance of each string occurrence over and over. In other words, I would perform a kind of string interning myself. The use of a custom type allows me to select which properties of which objects should be "interned" by just specifying this type in the mapping files.
Ideally, I would like to "stick" this dictionary into the session, so that the lifetime of this string pool is tied to the lifetime of the first-level cache. After all, from our system's point of view, it makes sense to initialize this string pool at the same time a session and its first-level cache are initialized, and to nuke the string pool at the same time a session is closed. It is also a desirable property that concurrent sessions are completely isolated from each other by having their own private dictionaries.
Problem is, I can't find a way to "inject" a custom implementation of NHibernate's session into NHibernate itself so that an IType can access it at NullSafeGet() time, short of creating my own personal NHibernate code branch.
Is there a way to provide NHibernate with a custom session implementation?

I see three different approaches to solve this:
1. Use an interceptor
In the IInterceptor, you get:
    void AfterTransactionBegin(ITransaction tx);
    void BeforeTransactionCompletion(ITransaction tx);
2. Wrap opening and closing the session:
Opening and closing the session is an explicit call. It should be easy to wrap this into a method.
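    public ISession OpenSession()
    {
        // ISessionFactory exposes OpenSession(); there is no CreateSession().
        var session = sessionFactory.OpenSession();
        // StringType here stands for your own custom type; Initialize() resets its string pool.
        StringType.Initialize();
        return session;
    }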
You could make it much nicer. I wrote a transaction service, which has events. Then you could handle begin transaction and end transaction events.
3. Don't attach the string cache to the session
It doesn't need to be related to the session. Strings are immutable objects, so it doesn't hurt to mix them between sessions. To keep the cache from growing without bound, you could write your own bounded cache or use an existing "most recently used" cache: after growing to a certain size, it throws away the oldest items.
This would probably require some time to implement, but would be very nice and easy to use.
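A minimal sketch of such a bounded pool follows, assuming least-recently-used eviction and a single instance shared by all sessions. BoundedStringPool, Intern and the capacity parameter are hypothetical names chosen for illustration, not NHibernate APIs; the custom type from the question would call Intern on each string it reads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of NHibernate: a bounded pool that returns one
// shared instance per distinct string value and evicts the least recently used
// entry once the capacity is exceeded.
public sealed class BoundedStringPool
{
    private readonly int capacity;
    private readonly object gate = new object();
    private readonly Dictionary<string, LinkedListNode<string>> index;
    // Most recently used value first, least recently used value last.
    private readonly LinkedList<string> recency = new LinkedList<string>();

    public BoundedStringPool(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
        index = new Dictionary<string, LinkedListNode<string>>(capacity, StringComparer.Ordinal);
    }

    public int Count
    {
        get { lock (gate) { return index.Count; } }
    }

    // Returns the pooled instance equal to value, adding value itself if it is new.
    public string Intern(string value)
    {
        if (value == null) return null;
        lock (gate)
        {
            LinkedListNode<string> node;
            if (index.TryGetValue(value, out node))
            {
                // Known value: mark it as most recently used and reuse the instance.
                recency.Remove(node);
                recency.AddFirst(node);
                return node.Value;
            }

            node = recency.AddFirst(value);
            index[value] = node;

            if (index.Count > capacity)
            {
                // Over capacity: drop the least recently used entry.
                LinkedListNode<string> oldest = recency.Last;
                recency.RemoveLast();
                index.Remove(oldest.Value);
            }
            return value;
        }
    }
}
```

The lock makes one pool safe to share between concurrent sessions. An evicted string is not invalidated: objects that already hold it keep their reference, and the next occurrence of the same value simply becomes the new pooled instance.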

Related

Static or instance class for reuse between RFC calls?

I am implementing some piece of functionality which must be called in the remote system, so I wrap this functionality into a global class which would be called by an RFC module. The calculations to be done are rather complex, heavy and involve many DB calls, so I am seeking ways to pre-save some results for the future RFC calls. The module will be called very frequently and this trick can save many seconds of runtime.
The question is: should I use static class or instance class in my RFC wrapper?
The idea is to put calculation results into itab attributes of the class and re-use them in future calls. This data will have some validity interval though and after expiration of time it will be invalidated and re-calculated again.
In SAP recommendations we see that static classes are generally not recommended, with some exceptions, and that for re-use SAP recommends sticking to singletons. Does the singleton idea apply to my use case as well?
In my understanding if I put tables into the attributes of static class they will be alive in memory for a while (from ABAPDOCU):
They are persisted in the memory for as long as the current internal session exists
and as can be seen from this magnificent figure by Sandra
the user session will be reused (for how long?) between RFC calls, and so my saved tables/attributes will be reused too.
It is better than shared memory which has its overhead and disadvantages, e.g. it is valid only for current AS instance.
Is it a viable idea, or had I better stick to a singleton with an instance class?
The thing with static classes vs. singletons is more of a question of code style. When you implement the singleton pattern, then the instance is stored in a static attribute. So there is no difference between a singleton and an all-static class when it comes to persistence. And that persistence is only through the internal session. The internal sessions of RFC calls are bound to the RFC session. When the caller is another ABAP program running on a different server, then it will keep its RFC session while it is running, just like a regular internal session would. So you can use static attributes to cache data within one execution of one program, but you can not use them to share data between multiple executions of the program. (When the RFC call comes from a non-ABAP program, then the lifetime of the RFC session will end when the program explicitly closes the session).
So should you use a static class / singleton here? Only if all those requests happen through one run of the same program (but then why don't you cache on the client, or create an RFC function module which accepts requests in bulk?) and the program is only executed once by one user.
If you have multiple requests from multiple users, then you have a typical use-case for shared memory. The argument about the shared memory instance being bound to the application server is pretty irrelevant here, because RFC user sessions are also bound to an application server.
If you need caching across application servers, then there is the option to cache on the database server by creating a database table to cache the results of more time-consuming queries. But retrieving those results would still require a database request, so it only makes sense for queries which take several seconds to complete and still don't return much data. They are not useful for queries which just take milliseconds or where the reason for the runtime is the amount of data they return.

NHibernate and interceptors - measuring/monitoring SQL round-trip times

To get early warning of slow or potentially slow areas, I'd like to have an interceptor for NHibernate that can act as a performance monitor, so that any database operation that takes more than a given time raises an event and (importantly) writes a full stack trace to the application's logs.
Interceptors seemed to be a good window into this. However, having experimented, there doesn't seem to be any way to catch a "just-back-from-SQL" event:
OnPreFlush and OnPostFlush work on full batches where writes are involved, but aren't invoked on read events.
OnPrepareStatement() seems to be the best place to start measuring, but where to stop?
For read events, OnLoad might be the place to stop the clock, but it's called once-per-entity returned - how do I know when I've got to the end of all entities?
For write events, I can't see any post-SQL event (other than those that work on the entire batch - OnPostFlush and OnAfterTransaction; I get the impression OnSave, OnFlushDirty, etc are called before the actual database call occurs - happy to be corrected though).
From what I can tell, documentation is heavily lacking on exactly what the pipeline order is with NHibernate's interaction with the database, and thus when in that pipeline different events and interceptor calls are called in relation to the actual SQL execution itself.
This needs to be permanently available, sitting in the background and requiring no human intervention except when slow queries are detected. It also needs to run headless on a large farm of servers, so interactive tools such as NHibernate Profiler are out: it's literally something that we can enable and forget about, letting it log as and when appropriate.
Is there anything I have missed or misunderstood?
I had a similar problem: I wanted to measure and log all queries that go through NHibernate.
What I did was write a custom batcher factory (in this case I work with Oracle), but you can apply the same technique to any database:
1-) Implement the batcher factory (in this case I am extending the existing factory)
    public class OracleLoggingBatchingBatcherFactory : OracleDataClientBatchingBatcherFactory
    {
        public override IBatcher CreateBatcher(ConnectionManager connectionManager, IInterceptor interceptor)
        {
            return new OracleLoggingBatchingBatcher(connectionManager, interceptor);
        }
    }
2-) Implement the Batcher itself (in this case I am extending existing batcher). Make sure you inherit IBatcher again since we want our new methods to be called
    public class OracleLoggingBatchingBatcher : OracleDataClientBatchingBatcher, IBatcher
    {
        // ....
        // Override ExecuteNonQuery, DoExecuteBatch and ExecuteReader here.
        // You can do all kinds of intercepting, logging or measuring in them.
        // If they are not overridable, implement them with the "new" keyword;
        // because IBatcher is declared again on this class, calls made through
        // the interface will still reach the new methods.
        // Make sure you call the base implementation too, or re-implement the method from scratch.
    }
3-) Register the factory via NHibernate config:
<property name="adonet.factory_class">OracleLoggingBatchingBatcherFactory, MyAssembly</property>
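The measuring itself can be kept separate from the batcher. The sketch below is one possible shape for it, assuming a fixed time threshold and a caller-supplied logging delegate. SlowCallMonitor and Measure are hypothetical names, not NHibernate types; the methods overridden in step 2 could wrap their base call in Measure.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not an NHibernate type: times a single operation and
// reports it, together with the current stack trace, when it runs for at
// least the configured threshold.
public sealed class SlowCallMonitor
{
    private readonly TimeSpan threshold;
    private readonly Action<string> log;

    public SlowCallMonitor(TimeSpan threshold, Action<string> log)
    {
        if (log == null) throw new ArgumentNullException(nameof(log));
        this.threshold = threshold;
        this.log = log;
    }

    public T Measure<T>(string description, Func<T> operation)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            return operation();
        }
        finally
        {
            // Runs whether the operation returned or threw, so failed calls are timed too.
            watch.Stop();
            if (watch.Elapsed >= threshold)
            {
                log(string.Format("Slow call ({0} ms): {1}{2}{3}",
                    watch.ElapsedMilliseconds, description,
                    Environment.NewLine, Environment.StackTrace));
            }
        }
    }
}
```

Because the stack trace is captured on the calling thread inside Measure, it includes the application code that triggered the database call, which is what the question asks to have in the logs.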

nhibernate and sessions, please clarify

I am building a web application, and whenever I make a database call I need a session.
I understand creating a session object is very expensive.
I am following the repository pattern here: http://web.archive.org/web/20110503184234/http://blogs.hibernatingrhinos.com/nhibernate/archive/2008/10/08/the-repository-pattern.aspx
He uses something called a UnitOfWork to get the session.
For a web application, shouldn't I be storing the Session in Request.Items collection? So its only created once per request?
Do I really need UofW?
The session IS the unit of work - it's basically used to store changes until you flush them to the db. Save a static session factory at startup, and use that to create one session per web request - Request.Items seems a valid place to put the session.
The repository pattern is a wrapper over the unit of work. The repository pattern differs from the UoW pattern in that repo.Save(obj) should save the obj to the db straight away, while the UoW waits for a flush.
My advice would be to skip the repository pattern and use the ISession directly (see http://ayende.com/Blog/archive/2009/04/17/repository-is-the-new-singleton.aspx)
In the case of NHibernate the key class is the SessionFactory, which SessionProvider is taking care of for you (if you implement it like that). Keep the SessionFactory alive, and it handles the sessions for you.
I've also seen people save the SessionFactory in their IoC.
Use this to manage your sessions:
HybridSessionBuilder
It manages and gives you access to a single session that's used across the entire application.

Beans, methods, access and change? What is the recommended practice for handling them (i.e. in ColdFusion)?

I am new to programming (6 weeks now). I am reading a lot of books, sites and blogs right now and I learn something new every day.
Right now I am using ColdFusion (job). I have read many of the OOP and CF related articles on the web and I am planning to get into MXUnit next and after that to look at some frameworks.
One thing bothers me and i am not able to find a satisfactory answer. Beans are sometimes described as DataTransferObjects, they hold Data from one or many sources.
What is the recommended practice to handle this data?
Should I use a separate object that reads the data, mutates it and then writes it back to the bean, so that the bean is just storage for data (accessible through getters), or should I implement the methods to manipulate the data in the bean?
I see two options.
1. The bean is only storage, other objects have to do something with its data.
2. The bean is storage and logic, other objects tell it to do something with its data.
The second option seems to me to adhere more to encapsulation while the first seems to be the way that beans are used.
I am sure both options fit someone's need and are recommended in a specific context, but what is recommended in general, especially when someone does not know enough about the greater application picture and is a beginner?
Example:
I have created a bean that holds an item from a database with the item id, a name, and a 1D array. Every array element is a struct that holds a user with its id, its name and its amount of the item. Through a getter I output the data in a table in which I can also change the amount for each user or check a user for deletion from this item.
Where do I put the logic to handle the application user's input?
Do I tell the bean to change its array according to the user input?
Or do I create an object that changes the array and writes that new array into the bean?
(All database access (CreateReadUpdateDelete) is handled through a DataAccessObject that gets the bean as an argument. The DAO also contains a gateway method to read more than one record from the database. I use this method to get a table of items, which I can click to create the bean and its data.)
You're observing something known as "anemic domain model". Yes, it's very common, and no, it's not good OO design. Generally, logic should be with the data it operates on.
However, there's also the matter of separation of concerns - you don't want to stuff everything into the domain model. For example, database access is often considered a technically separate layer and not something the domain models themselves should be doing - it seems you already have that separated. What exactly should and should not be part of the domain model depends on the concrete case - good design can't really be expressed in absolute rules.
Another concern is models that get transferred over the network, e.g. between an app server and a web frontend. You want these to contain only the data itself to reduce bandwidth usage and latency. But that doesn't mean they can't contain logic, since methods are not part of the serialized objects. Derived fields and caches are - but they can usually be marked as transient in some way so that they are not transferred.
Your bean should contain both your data and logic.
Data Transfer Objects are used to transfer objects over the network, such as from ColdFusion to a Flex application in the browser. DTOs only contain relevant fields of an object's data.
Where possible you should try to minimise exposing the internal implementation of your bean, (such as the array of user structs) to other objects. To change the array you should just call mutator functions directly on your bean, such as yourBean.addUser(user) which appends the user struct to the internal array.
No need to create a separate DAO with a composed Gateway object for your data access. Just put all of your database access methods (CRUD plus table queries) into a single Gateway object.

NHibernate sessions - what's the common way to handle sessions in windows applications?

I've just started using NHibernate, and I have some issues that I'm unsure how to solve correctly.
I started out creating a generic repository containing CUD and a couple of search methods. Each of these methods opens a separate session (and transaction if necessary) during the DB operation(s). The problem when doing this (as far as I can tell) is that I can't take advantage of lazy loading of related collections/objects.
As almost every entity relation has .Not.LazyLoad() in the fluent mapping, it results in the entire database being loaded when I request a list of all entities of a given type.
Correct me if I'm wrong, 'cause I'm still a complete newbie when it comes to NHibernate :)
What is most common to do to avoid this? Have one global static session that remains alive as long as the program runs, or what should I do?
Some of the repository code:
    public T GetById(int id)
    {
        using (var session = NHibernateHelper.OpenSession())
        {
            return session.Get<T>(id);
        }
    }
Using the repository to get a Person
    var person = m_PersonRepository.GetById(1); // works fine
    var contactInfo = person.ContactInfo; // Throws exception with message:
    // failed to lazily initialize a collection, no session or session was closed
Your question actually boils down to object caching and reuse. If you load a Foo object from one session, then can you keep hold of it and then at a later point in time lazy load its Bar property?
Each ISession instance is designed to represent a unit of work, and comes with a first level cache that will allow you to retrieve an object multiple times within that unit of work yet only have a single database hit. It is not thread-safe, and should definitely not be used as a static object in a WinForms application.
If you want to use an object when the session under which it was loaded has been disposed, then you need to associate it with a new session using Session.SaveOrUpdate(object) or Session.Update(object).
You can find all of this explained in chapter 10 of the Hibernate documentation.
If this seems inefficient, then look into second-level caching. This is provided at ISessionFactory level - your session factory can be static, and if you enable second-level caching this will effectively build an in-memory cache of much of your data. Second-level caching is only appropriate if there is no underlying service updating your data - if all database updates go via NHibernate, then it is safe.
Edit in light of code posted
Your session usage is at the wrong level - you are using it for a single database get, rather than a unit of work. In this case, your GetById method should take in a session which it uses, and the session instance should be managed at a higher level. Alternatively, your PersonRepository class should manage the session if you prefer, and you should instantiate and dispose an object of this type for each unit of work.
    public T GetById(int id)
    {
        return m_session.Get<T>(id);
    }

    using (var repository = new PersonRepository())
    {
        var person = repository.GetById(1);
        var contactInfo = person.ContactInfo;
    } // make sure the repository Dispose method disposes the session.
The error message you are getting is because there is no longer a session to use to lazy load the collection - you've already disposed it.