NHibernate Search Clustered Lucene Index

We are using NHibernate Search in an application which is going to be clustered.
I have been reading up on the approaches for maintaining separate collections, in particular the master/slave configuration, and I was wondering how to go about implementing it using MSMQ, if indeed an implementation for this exists at this time. The JMS implementation (as described in Hibernate Search in Action) seems a little daunting to me, especially as we are using a .NET environment.
Alternatively, I'm open to suggestions with regard to instantiating local RAMDirectories for the Lucene collections. I know that Lucene can build a RAMDir from an FSDir, and I know how to initialise an NHibernate app with a blank RAMDir, but I'm getting a little lost when it comes to initialising an app with a RAMDir built from an existing (network-shared) FSDir.
Or indeed any other approaches.
Cheers,
Steve
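
On the RAMDir-from-FSDir point: Lucene.NET can do that copy in a single step, via a RAMDirectory constructor that takes another Directory. A minimal sketch in C#, assuming a Lucene.NET version that exposes FSDirectory.Open and the RAMDirectory(Directory) copy constructor (the path is hypothetical):

    using System.IO;
    using Lucene.Net.Store;

    class IndexLoader
    {
        // Copy a network-shared on-disk index into a local in-memory directory.
        public static RAMDirectory LoadIntoRam(string sharedIndexPath)
        {
            // Open the shared index on disk...
            FSDirectory fsDir = FSDirectory.Open(new DirectoryInfo(sharedIndexPath));

            // ...and snapshot it into RAM. The copy is taken at construction time,
            // so later writes to the shared index are not reflected here.
            return new RAMDirectory(fsDir);
        }
    }

Because the RAMDirectory is a point-in-time copy, each node would need to re-run this (or otherwise reopen the index) on some schedule to pick up changes written by the master.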

I have actually recently come across this very problem, primarily because we shared an index across several web apps for real-time updates to the indexes. However, we suffered from index corruption and couldn't really figure out why; it also wouldn't work in a clustered environment.
My approach was this: a service indexed new entities on a schedule at very frequent intervals, and reindexed everything at certain time intervals. I also run an optimize automatically, since NHibernate Search doesn't seem to support auto-optimize yet.
At application start, I index everything into a RAMDirectoryProvider.
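In case it helps anyone, here is roughly what that startup pass can look like. This is a minimal sketch, not production code: the RAMDirectoryProvider itself is configured in the NHibernate Search properties, and I'm assuming the IFullTextSession.Index API from NHibernate.Search:

    using NHibernate;
    using NHibernate.Search;

    public static class StartupIndexer
    {
        // Re-index every entity of type T into whatever DirectoryProvider
        // is configured (a RAMDirectoryProvider, in the setup described above).
        public static void RebuildIndex<T>(ISessionFactory sessionFactory) where T : class
        {
            using (ISession session = sessionFactory.OpenSession())
            {
                IFullTextSession fullTextSession = Search.CreateFullTextSession(session);
                using (ITransaction tx = fullTextSession.BeginTransaction())
                {
                    foreach (object entity in session.CreateCriteria(typeof(T)).List())
                    {
                        fullTextSession.Index(entity); // queue the entity for indexing
                    }
                    tx.Commit(); // index work is flushed when the transaction commits
                }
            }
        }
    }

For large tables you would batch this (index a page at a time, then clear the session) rather than loading everything at once.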
The choice you make depends highly on the data that you want to index, how sensitive you are to delays in that data, and how frequently it changes. In my case, it was to allow text searches across product data for a website, so any delay was fine with me.
I did some brief research on master/slave providers; however, I felt that NHibernate Search is rather immature compared to the original Java implementation.
For me, an optimal solution would be a master/master provider that would cross-apply all index updates on all nodes. I haven't researched how much work it would be to write a DirectoryProvider myself; that would be an option, but a lot of effort as well.

Related

Pros and cons of each key generation strategy in RavenDB

I'm migrating a SQL database of a web application to RavenDB and my team is trying to define what key generation strategy is the best for us.
The main discussion point is whether we use natural keys or surrogate keys. So I would like to know the pros and cons of each strategy in RavenDB.
Thanks
The recommended solution is to let RavenDB handle them for you.
It will generate things like "items/2", etc.
To start with, those are plenty good enough, human readable and easy to generate efficiently.
You can also do things like "users/ayende", but I would wait for that until you have more experience with RavenDB.
Finally, for advanced stuff, you have keys such as "customers/1234/orders/8234", which opens up some really nice options for advanced scenarios.
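
To make that concrete, letting RavenDB assign the key amounts to just not setting the id yourself. A quick sketch with the C# client (the Item class and the server URL are made up for the example):

    using Raven.Client;
    using Raven.Client.Document;

    public class Item
    {
        public string Id { get; set; }   // left null; the client assigns it on Store()
        public string Name { get; set; }
    }

    class Program
    {
        static void Main()
        {
            using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
            using (IDocumentSession session = store.OpenSession())
            {
                var item = new Item { Name = "example" };
                session.Store(item);      // item.Id is now something like "items/1"
                session.SaveChanges();
            }
        }
    }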
If you build too much logic into your keys, you may regret it later unless you know your problem domain well. We do a fair amount of RavenDB work in PHP, since there is no PHP client; we prototype in C# and then usually follow what the C# client creates. Sometimes the auto-pluralizations come out funny, like People becoming Peoples. This normally does not matter unless you are hitting RavenDB directly from JavaScript; then the only downfall is that friends may question your English skills.

Should I choose Hiberlite for integrating SQLite into my Win/iOS application?

I am a composer by profession and my computer science skills are limited though I program quite a bit of the software that I use.
What are the most reasonable ways to approach SQLite integration as a file format and database in an iOS app (it also needs to run on windows, but that is a secondary concern)?
I have been researching Hiberlite, which looks fantastic, but it seems to be little used, and apparently it doesn't run well on embedded systems (iOS?) and chokes when thousands of objects are in play. I haven't been able to get a sense of how severe those bottlenecks are when running under those conditions.
The settings of thousands of objects (~50,000, though that number could expand) would be read every 1-10 seconds and written periodically. Read performance is more critical, as write operations can stutter without affecting the core operation of the app.
Given those conditions, how should I approach SQLite? My understanding is that without something like Hiberlite the entire database (many millions of entries) must be read and rewritten for every entry; is that less efficient? If that is the best approach, is there a good resource to follow for implementing it?
Any advice would be greatly appreciated. My current software that I rely on is beyond buggy and needs refactoring, but due to my inexperience I am having a difficult time finding information about a reasonable approach.
I'm guessing you've probably found a solution for this by now, but I've been interested myself in embedding SQLite on Android and iOS, and I came across many C++-based ORM solutions.
Hiberlite looked possibly not fully mature (I didn't readily see a method of returning subsets of data, which is fairly standard). A framework which did draw my attention was the POCO::Data ORM library. It's based on the stream-oriented mechanism used in the SOCI ORM. The POCO library is modular and optimised for embedded environments (I believe it also has minimal external dependencies). Wikipedia has an article on it that outlines some of its users, of which openFrameworks is one.
The Wt DBO ORM also looked pretty interesting.
I'm listing some of the other C++ ORM frameworks I found here, in no particular order:
http://soci.sourceforge.net
webtoolkit WT DBO ORM
http://debea.net
http://www.qxorm.com
http://sourceforge.net/apps/trac/litesql
http://otl.sourceforge.net
http://cppcms.com/sql/cppdb
http://dtemplatelib.sourceforge.net
http://code.google.com/p/qdjango

Whither NetTiers?

I used NetTiers in a number of projects a job or two back. I found it extremely useful for generating back-end interfaces in ASP.NET webforms. The business and data layers were also pretty sweet. I typically use NHibernate, but I think it may be overkill on these particular projects in terms of the time it will take to get running.
Since then, I've been working on projects where practically everything is end-user facing. However, I've recently gotten a side project that will have a lot of back-end administrative stuff and was wondering if NetTiers is still as well-maintained and clean as it was a couple of years back. It doesn't appear to be, but I don't know if that means that it has actually been abandoned or if it has merely been moved elsewhere. Or is there another product (preferably a set of CodeSmith templates) that might work better for me? All I really need is a clean ActiveRecord model that can hit a SQL database on the backend and generate simple user interfaces for CRUD screens for most of my model objects. I need something that will do deep-loading of object graphs kind of like NetTiers will do as well.
Any suggestions?
I'm currently supporting a large NetTiers application and my experience has generally been one of frustration. I inherited the project and took over maintenance of the templates, fixing a number of bugs in the templates and applying some post-generation scripts to the generated files. IMHO the generated code is overly verbose, suffers from massive duplication, and would benefit from more use of generics. The templates I'm working with didn't dispose of resources correctly (the newer template versions may be better). At one point I considered upgrading to a newer version but the size of the exercise put me off. Useful documentation is difficult to find and getting answers to NetTiers questions is not straight forward. The overall impression I have is one of gradual decline.
If you're just after a simple .Net stack for generating a UI from a SQL database I suggest you take a look at ASP.NET MVC3 with MvcScaffolding and Entity Framework. Add AutoMapper and Munq for DI.
We have been using NetTiers for several years now. I think it tends to look overwhelming to first-time users, in terms of the quantity of stuff generated, and there are a couple of limitations around the DeepLoad functionality and circular references. I too have the feeling that there have not been many updates lately, but overall I've had a great experience using NetTiers with CodeSmith, and of all the ones I've tried, it's clearly our favorite, with huge productivity gains. We use views, custom SPs, the indexes, etc.
In a comment to another reply: we've tried AutoMapper and moved away from it because it fails silently when object structures change. And we moved away from Entity Framework because we don't like hand-coding our DALs. :)

NHibernate with Sql Azure and Sharding

Does anyone have any good sources of information on using NHibernate with SQL Azure and the implications of sharding (because of the 10 GB cap)? I know there are posts on the internet that reference a sharding project for NH, but they are from the third quarter of 2009, and I haven't found anything much more recent on Google.
Relatedly, does anyone have information about manually implementing sharding if the sharding project isn't viable to use yet? Would it just be as simple as creating a session factory for each shard and keeping a collection of factories? It seems like it would be problematic to reproduce the ISession calls through each factory; I suppose it could be achieved by passing operations as Funcs that get invoked on the ISession from each factory, but that seems more like the wrong path to be going down.
I wrote a proof of concept about a month ago using NHibernate on SQL Azure with sharding. As you've pointed out, there are aspects that just do not feel right about it. Until the NH support has evolved, you may have to try a few things to find out what works best for you. I can tell you the general flow of how it worked for us.
We implemented a simple sharding strategy factory that provides strategies that decide which shard to place you in based on our needs. Your needs may vary here. The key is creating strategies that process, merge and order your query results. From there, session creation and usage is all the same as any other session usage, which is highly desirable.
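A rough sketch of that shape, with all names hypothetical: a strategy decides which shard a key lives in, a wrapper holds one ISessionFactory per shard, and cross-shard queries are fanned out so the strategy (or the caller) can merge and order the combined results:

    using System;
    using System.Collections.Generic;
    using NHibernate;

    public interface IShardStrategy
    {
        // Decide which shard a given shard key belongs to.
        string ShardFor(object shardKey);
    }

    public class ShardedSessions
    {
        private readonly IDictionary<string, ISessionFactory> factories;
        private readonly IShardStrategy strategy;

        public ShardedSessions(IDictionary<string, ISessionFactory> factories, IShardStrategy strategy)
        {
            this.factories = factories;
            this.strategy = strategy;
        }

        // Run a unit of work against the single shard the strategy picks.
        public T Execute<T>(object shardKey, Func<ISession, T> work)
        {
            using (ISession session = factories[strategy.ShardFor(shardKey)].OpenSession())
            using (ITransaction tx = session.BeginTransaction())
            {
                T result = work(session);
                tx.Commit();
                return result;
            }
        }

        // Fan a read-only query out to every shard; merging and ordering
        // of the combined results is left to the caller (or a strategy).
        public List<T> QueryAllShards<T>(Func<ISession, IEnumerable<T>> query)
        {
            var results = new List<T>();
            foreach (ISessionFactory factory in factories.Values)
            {
                using (ISession session = factory.OpenSession())
                {
                    results.AddRange(query(session));
                }
            }
            return results;
        }
    }

Once the factories and strategy are wired up, everyday usage stays plain NHibernate, e.g. shardedSessions.Execute(customerId, s => s.Get<Customer>(customerId)).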
EDIT: I know this post by Ayende is a few months old, but it's exactly how we implemented it, and it works. The rumor is that better support in NHibernate is coming.

What is the best way to save my POJOs into Jackrabbit JCR?

I have experienced two ways to save my POJOs into repository nodes for storage in the Jackrabbit JCR:
writing my own layer
and
using Apache Graffito
Writing my own code has proven time-consuming and labor-intensive (I had to write and run a lot of ugly automated tests), though quite flexible.
Using Graffito has been a disappointment because it seems to be a "dead" project, stuck in 2006.
What are some better alternatives?
Another alternative is to completely skip an OCM framework and simply use javax.jcr.Node as a very flexible DAO itself. The fundamental reason why OCM frameworks exist is that with an RDBMS you need a mapping from objects to the relational model. With JCR, which is already very object-oriented (node ~= object), this underlying reason is gone. What is left is that with DAOs you can restrict what your programmers can access in their code (including with the help of autocompletion). But this approach does not really leverage the JCR concept, which means schema-free and flexible programming. Using the JCR API directly in your code is the best way to follow that concept.
Imagine you want to add a new property to an existing node/object later in the life of your application: with an OCM framework you have to modify it as well and make sure it still works properly, whereas with direct access to nodes it is simply a single point of change. I know this is a good way to get problems with typos in, e.g., property names; but this fear is not really backed by reality, since in most cases you will very quickly notice typos or non-matching names when you test your application. A good solution is to use string constants for the common node or property names, even as part of your APIs if you expose the JCR API across them. This still gives you the flexibility to quickly add new properties without having to adapt OCM layers.
For having some constraints on what is allowed or what is mandatory (i.e. a "semi-schema") you can use node types and mixins (since JCR 2.0 you can also change the node type of existing content): this way you can handle it completely at the repository level and don't have to care about typing and constraints inside your application code, apart from catching the exceptions ;-)
But, of course, this choice depends on your requirements and personal preferences.
You might want to have a look at Jackrabbit OCM, which is alive and kicking. Of course, another way is to manually serialize/deserialize the POJOs; for that there are many different options. The question is whether you need a fixed schema to query the objects in JCR. If you just want to serialize to XML, then XStream is a very painless way to do so. If you need a more fixed schema, there is also Betwixt from Apache Commons.
It depends on your needs. When you use javax.jcr.Node directly, your code is heavily coupled to the underlying mechanism. In medium-sized and even some small projects, this is not a good idea. Obviously the question then becomes how to go from the Node to your own domain model. The problem is quite similar to going from a JDBC ResultSet to your own domain model; mind you, the problem is similar from a technical point of view. From a functional point of view, there are huge differences between using JDBC and JCR.
Another deciding factor is whether you can impose a structure on your JCR content or not. Some application domains can (but still match better with JCR than JDBC); in other domains the content may be highly unstructured in nature. In such cases OCM is clearly overkill. I'd still advise writing your own wrapper layer around the javax.jcr.* classes.
There's also https://github.com/ilikeorangutans/omf, a very flexible object-to-JCR mapper. Unfortunately it doesn't have write support yet; however, we're successfully using this framework in a large CMS installation.
There is also the JCROM project at http://code.google.com/p/jcrom/. That project went dormant for a couple of years, but there have been a few new releases as of summer 2013.