Hibernate Search 5.2+ programmatic configuration of facet field - lucene

Before Hibernate Search 5.2 there was no need to explicitly use a @Facet annotation. In 5.2 it became necessary in order to use Lucene’s native faceting API.
I'm using Hibernate Search on external classes that cannot be annotated. Is there a way to define this "facet" programmatically?
For the mapping configuration, there is no issue, because SearchMapping provides a complete programmatic alternative to the @Entity, @Indexed, and @Field annotations. But within this API, and in particular in the EntityMapping class, there is no way to declare that a field will be used in a facet query; the only option is to annotate the field with @Facet.
2018 update:
I've updated to Hibernate Search 5.6.4, and it works with this kind of mapping:
.property("businessProcess", ElementType.METHOD)
    .field()
        .analyze(Analyze.NO)
        .store(Store.YES)
        .facet()
            .name("businessProcess")
            .encoding(FacetEncodingType.STRING)

The workaround you referenced does not configure faceting in Hibernate Search at all (no @Facet annotation nor programmatic equivalent). In recent versions of Hibernate Search, this will not work, because we had to require this metadata in order to fix other bugs.
Using custom facet formatting is very much uncharted territory, and admittedly much harder than it should be. The main reason is that facets were originally, for a reason I cannot fathom, designed to work directly on the entity property instead of the field value. Thus, facets ignore the field bridge. We're working on cleaning up the faceting support in Search 6, but this is one of many works in progress and will take some time.
In the meantime, your easiest option will probably be to just use the built-in formatting.
EDIT: Also, for dates you might want to use numeric facet encoding, so as to perform range faceting (from the first of May to the 30th of May). In that case, the names of the facets are defined at query time, so built-in formatting should not matter.
And there's actually one easy solution to customize formatting in your string-encoded facets, but I didn't mention it since you are using programmatic mapping and probably do not want to change your model: you could add read-only properties returning the exact value you want in your facet (getYear, getMonth, ...), and add fields with faceting on those properties.
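A minimal sketch of that last idea, in plain Java. The Invoice class and its property names are hypothetical and only serve as an illustration; the two derived getters are the read-only properties on which string-encoded facet fields could then be declared, in the same way as the businessProcess property in the question.

```java
import java.time.LocalDate;
import java.time.YearMonth;

// Hypothetical entity: the class and property names are illustrative only.
class Invoice {
    private final LocalDate issuedOn;

    Invoice(LocalDate issuedOn) {
        this.issuedOn = issuedOn;
    }

    public LocalDate getIssuedOn() {
        return issuedOn;
    }

    // Read-only property: a string-encoded facet declared on it would
    // yield one facet value per year, such as "2018".
    public String getIssuedYear() {
        return String.valueOf(issuedOn.getYear());
    }

    // Read-only property: year and zero-padded month, such as "2018-05",
    // so that facet values also sort chronologically as plain strings.
    public String getIssuedMonth() {
        return YearMonth.from(issuedOn).toString();
    }
}
```

Since the getters only compute a value from existing state, the model gains no new persistent data; whether adding them is acceptable on classes you cannot annotate depends on whether you can modify those classes at all.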

Related

Why are aggregate functions like group_by not supported in hibernate search?

I have a use case where I need to fetch results after applying group by in the query.
There is no technical reason, if this is what you mean. We could probably add it, but there simply wasn't enough demand for this feature to make it to the top of our priority list.
If you want to see a feature added to Hibernate Search, feel free to create a ticket on our JIRA instance, describing in detail your use case and the API you would expect.
Note that I am not 100% sure we would implement it for the Lucene backend, since that would probably require a lot of effort. But for people using Elasticsearch behind Hibernate Search, we may at least introduce ways to use Elasticsearch's aggregation support from within Hibernate Search. We are currently experimenting with Hibernate Search 6 and trying this is on my checklist.
In the meantime, if you want us to suggest alternatives, please provide more details about your use case: domain model, mapping, fields you would like to aggregate as part of your "group by"...
Why it's missing
The primary reason this is not supported by Hibernate Search is that no one ever asked for it or contributed it.
Another reason is that the results would be "groups of entities", while the FullTextQuery API returns a List of entities, so this would need a new API specifically to run such queries.
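To make the difference in result shape concrete, here is a small plain-Java sketch. The Book class and the groupByGenre helper are hypothetical, and the grouping happens in memory over hits that were already fetched; it is not an index-level aggregation and not a Hibernate Search API.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupedResults {
    // Hypothetical entity used only for this illustration.
    static final class Book {
        final String title;
        final String genre;

        Book(String title, String genre) {
            this.title = title;
            this.genre = genre;
        }
    }

    // A regular query returns a flat List<Book>. A "group by" result has a
    // different shape: each key maps to the entities that share it.
    static Map<String, List<Book>> groupByGenre(List<Book> hits) {
        return hits.stream()
                .collect(Collectors.groupingBy(b -> b.genre, TreeMap::new, Collectors.toList()));
    }
}
```

This only works for result sets small enough to load fully, which is one reason a dedicated API on top of the index would be needed for the general case.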
How to get it added
We could build it, but if there is not much interest in the feature, it might not be worth the maintenance work.
If you need such a feature I suggest you open an issue on the Hibernate Search issue tracker so that other people can also vote or express interest for it. Ideally, someone needing it like yourself might be willing to create a patch or at least start a proof of concept.
Alternatives
Until Hibernate Search provides direct support for it, you can still run such queries yourself. See "Using IndexReaders directly" in the reference documentation to work on the Lucene index directly.
Using the IndexReaders, you can always read and search the Lucene index using any advanced feature for which Hibernate Search doesn't provide an API.

Are there any specific scenarios to use Liferay search container over Dandelion datatables framework?

Are there any specific scenarios to use Liferay search container over the Dandelion datatables framework, when datatables provide a far better collection of features (such as multi-column sorting, filtering, searching, i18n, etc.) and are easy to integrate too? To rephrase my question: should datatables be preferred over search container in all scenarios?
It's 100% your choice. Search Container is styled like every built-in list of entities within Liferay (because Liferay uses Search Container). Whether you use it or choose any other method/framework/technology is strictly your choice.
Make your choice based on
appearance and level of visual integration you'd like to have
familiarity with the framework
suitability for the job
maintainability of the solution for whoever is going to maintain your code
assumed stability (or level of maintenance) for your solution of choice
If you end up using either one of the proposed solutions or yet another one, so be it. For your future maintainers' sake, just make sure to choose one and standardize on it.
If you're customizing Liferay's UI, you might still need to understand Search Container, but that's a different story.

Finding document/field length in Lucene 4

I'm looking to have the ability to access the length (in terms) of a specific field of a document post-indexing. Preferably, if there is a way without re-indexing I would like to do that. But if re-indexing in a certain way will give easy access to this value, that would also serve.
http://blog.mikemccandless.com/2012/03/new-index-statistics-in-lucene-40.html
That link (scroll down a bit and find the mention of length) talks of accessing the value at indexing time. I wish to be able to do so post-indexing. The link also talks about saving away the value to a doc value, but it gives no examples of how to do so.
If anyone could provide examples of saving the document length, or accessing it post-indexing, it would be incredibly helpful. Thanks.
The mention of that statistic in the article refers to a FieldInvertState. Once you have that, getting the statistics you are looking for should be fairly straightforward (just call getLength, getUniqueTermCount, or whatever you need).
The FieldInvertState is passed into the Similarity, particularly to the call Similarity.computeNorm. The norm value is calculated and stored at index time, rather than evaluated at query time, so making effective use of it would require you to reindex.
The typical way to make use of this would be to create a custom Similarity, possibly extending DefaultSimilarity. Simply overriding the lengthNorm method of DefaultSimilarity would be the simplest approach. Its standard implementation is:
return (float)(1.0 / Math.sqrt(numTerms));
Which you could override with whatever you like.
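As a rough illustration of the arithmetic only, here is a standalone sketch that does not depend on Lucene. The class name, the flooredLengthNorm variant, and its floor parameter are all hypothetical; in a real setup the calculation would live in the overridden lengthNorm of a custom Similarity, with numTerms taken from the FieldInvertState.

```java
class LengthNormSketch {
    // The default calculation quoted above: shorter fields get a higher norm.
    static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical replacement: fields shorter than `floor` terms are all
    // treated as if they had `floor` terms, so very short fields are not
    // boosted over moderately short ones.
    static float flooredLengthNorm(int numTerms, int floor) {
        return (float) (1.0 / Math.sqrt(Math.max(numTerms, floor)));
    }
}
```

With these definitions, a 4-term field gets 0.5 from the default calculation but 0.25 from the floored variant with a floor of 16, while a 100-term field gets 0.1 from both.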
That would work to tweak scoring based on a custom length-based calculation. If that's not what you are looking for, and you just need to be able to fetch that information, I would think the simplest implementation is to store the field and compute the length from the field value returned when you fetch a Document.
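A minimal sketch of that second option, assuming the stored value is available as a String and that whitespace splitting is a good enough approximation of the tokenization. The class and method names are hypothetical. An analyzer that removes stop words, splits on punctuation, or emits synonyms will produce a different count, so to match the indexed length exactly you would need to run the same analyzer over the value instead.

```java
class StoredFieldLength {
    // Counts whitespace-separated tokens in a stored field value.
    // Returns 0 for a missing or blank value.
    static int termCount(String storedValue) {
        if (storedValue == null) {
            return 0;
        }
        String trimmed = storedValue.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }
}
```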

Same business entity for identical tables?

I have a legacy database which has about 10 identical tables (only the name differs).
Is it possible to be able to use the same business entity for all tables without having to create several classes/mapping files?
You can use the entity-name feature if you are using NHibernate v2.1 or higher. It is poorly documented but I am actively using the feature. It has gotten hard to find the documentation on it but look here:
Section 5.3 in
http://docs.jboss.org/hibernate/core/3.2/reference/en/html/mapping.html#mapping-entityname
A couple of things to be aware of. You must now use entity-name instead of class name to refer to the objects. In general it is not an entirely transparent change moving from class names to entity names.
Session actions now require two parameters, for example:
_session.Save("MyEntity", myobject)
The entity-name controls what table the data goes into.
Some HQL queries do not work right anymore, sometimes you must use Criteria instead.
If you need a set of sample code I may be able to post some, but I am far too busy at the moment. I suggest you look at the limited info you can find and set it up for a very simple object and multiple tables to learn how it all works. It does work.
You can create a base class with all the properties, but you still need to map them all.
For that, you can either use copy&paste, XML entities (see example at http://nhibernate.info/doc/nh/en/index.html#inheritance-tableperconcreate-polymorphism), or a code-based mapping method (Fluent or ConfORM). They usually make reuse easier.

Where is the api reference for nhibernate?

I may be going mental, but I can not find any api reference material for nhibernate. I've found plenty of manuals, tutorials, ebooks etc but no api reference. I saw the chm file on the nhibernate sourceforge page, but it doesn't seem to work on any of my PCs (different OSes)
Can someone please point me in the right direction?
I just found this one:
http://web.archive.org/web/20141001063046/http://elliottjorgensen.com/nhibernate-api-ref/index.html
It doesn't seem to be official, but at least it looks like an API reference... unlike the official reference, which mostly describes concepts and mappings without any information about classes and members.
If you're on Windows, get ILSpy and point it at NHibernate.dll. It's not quite the same as real API documentation, but it's not half bad.
There are no class references publicly available on the Internet as far as I know. You may build one from the source. Clone the sources, build the NHibernate.sln solution, then go into the doc folder, ensure you have the prerequisites indicated in the reference\readme.txt file, and run nant doc. This will generate the class reference in the build folder.
Otherwise, the most commonly used APIs are not extensive, and most of them have XML documentation, so IntelliSense works in Visual Studio. The reference documentation has the advantage of giving more context, which probably helps avoid pitfalls such as believing ISession.Update must be used to update entities (this is wrong: you do not need it unless you use detached entities, or entities coming from another session).
Official documentation reference is on https://nhibernate.info.
Sub-links:
Global documentation list
Reference (What I mostly use, especially following sub parts.)
Configuration
Mapping - basic / entities. (Add the mapping xsd definition file to any of your solution folders so that VS knows it and gives you IntelliSense in your hbm mappings.)
Mapping - collections
Querying - general. Do not miss the named queries feature in The IQuery interface.
Querying APIs:
HQL. I mostly use HQL with named queries, in mappings, for queries not dynamically built. They get parsed and validated when building the session factory, which normally occurs at application startup, so it is almost as good as compile-time validation. Check the log4net logs to get detailed reasons for named query parsing failures.
Criteria API. I view it as the historical way of dynamically building queries in code, to be preferred over constructing HQL strings.
QueryOver API. Based on the Criteria API, with lambda expression support for compile-time validation of queried entity names. Should be preferred over the Criteria API in my opinion.
Linq API. Great for dynamically built queries. Bear in mind that its implementation translates your queries to HQL. With complex queries, it may generate unsupported HQL constructs. Having knowledge of HQL capabilities allows a better understanding of how to write a supported Linq query for complex cases. (For example, for a complex order by, it is better to use an explicit linq sub-query in the OrderBy than to use a collection mapped on your queried entity.)
Native SQL. Well, quite self-explanatory. To be used, for example, when you need some special SQL feature not available through the other querying APIs (SQL Server full-text, select for xml, ...) and you do not wish to extend those other APIs. You may also call stored procedures. When using native SQL, I favor SQL named queries.
Modifying data, from Updating objects to Flush, and Exception handling.
Performances.
Batch fetching. About this, you may read my post here for a detailed explanation of why lazy loading can be very efficient with NHibernate, thanks to batch fetching. This single feature will always cause me to prefer NHibernate over Entity Framework, for as long as EF lacks it.
Second level cache. Another great NHibernate feature, lacking native support in EF. Beware, you must use transactions for leveraging this. It allows NHibernate to automatically evict cached entries for you as you change data through your application process. Without transactions, NHibernate will disable the second level cache as soon as you start changing data, for avoiding letting the cache yield you stale data.
Interceptors. This is one of many ways to customize NHibernate's inner workings. NHibernate is very strong at allowing you to extend it. You may also add your own HQL extensions as here, or your own linq2NH extension as here (all are answers from me). And there are other ways; see this list for linq2NH extensibility solutions.
Moreover, a class reference will very likely be close to the Hibernate one. There are so many internal APIs supporting the implementation that it is not very usable.
Why are such APIs not hidden (internal, private, ...)? Not hiding them is required for NHibernate's great extensibility. Those capabilities are a must-have in my opinion. In contrast, it is very hard to fix the shortcomings of some other .NET projects because of their lack of extensibility. (For example, MVC FileResult and the TweakDispositionAsInline I had to use instead of just being able to override some method, or trying to extend linq-to-entities, see this.)
There is a good book that covers a lot, and there is the HTML documentation on the site (which also comes as a book).
(The book is Manning's NHibernate in Action. It is a little outdated, but a good start.)
Here is the link to the online reference