How to do a Lucene.Net search for local results based on geographic location? - lucene

I'm looking for some info so that users can find local results in their Lucene.Net searches.
I would index the latitude/longitude of each location in the document, and query Lucene based on the user's position and a 20 (or 30, 40...) mile range.

The best resource on local search in Lucene is Grant Ingersoll's Location-aware search with Apache Lucene and Solr. The trouble is that Lucene.net lags behind Java Lucene, so the features available in the rather new Lucene 2.9.0 will take a while to trickle into Lucene.net (Lucene.net 2.4.0 came about a year after Java Lucene's 2.4.0). In the meantime, try Spatial.net in Lucene.net's contrib, or try porting Sujit Pal's suggestions from Java to C#.
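Whichever library ends up doing the work, the usual approach to the radius search described in the question is a two-step filter: first a cheap bounding box on the indexed latitude/longitude (which maps onto two range filters), then an exact great-circle distance check on the survivors. Below is a minimal Python sketch of that idea only. It does not use Lucene or Lucene.Net, and the function names and the `(doc_id, lat, lon)` tuple layout are assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def bounding_box(lat, lon, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle.

    Assumes the circle does not reach a pole or cross the 180 degree meridian.
    """
    angular = radius_miles / EARTH_RADIUS_MILES  # radius as an angle, in radians
    dlat = math.degrees(angular)
    dlon = math.degrees(math.asin(math.sin(angular) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def local_results(docs, lat, lon, radius_miles):
    """docs: iterable of (doc_id, lat, lon). Returns [(miles, doc_id)], nearest first."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_miles)
    hits = []
    for doc_id, doc_lat, doc_lon in docs:
        # Cheap pre-filter: in a search index this would be two range filters.
        if not (min_lat <= doc_lat <= max_lat and min_lon <= doc_lon <= max_lon):
            continue
        # Exact check: the box corners lie outside the circle.
        miles = haversine_miles(lat, lon, doc_lat, doc_lon)
        if miles <= radius_miles:
            hits.append((miles, doc_id))
    return sorted(hits)
```

In an index, the two latitude/longitude bounds would be stored as sortable fields so that the box can be applied as range filters, with the distance check run only on the remaining documents.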

Related

Can we use lucene query to have fts_alfresco search?

I want to upgrade my Alfresco server to 5.2, and all my custom webscripts use Lucene queries. Since Lucene indexing has been removed as of Alfresco 5.x and Solr indexing is not instantaneous, I am planning to use fts_alfresco search. While testing, I found that a few Lucene queries can be used for fts_alfresco search without modification. So my question is: will I be able to run fts_alfresco searches using Lucene queries? If not, is there a better way to migrate all my Lucene queries to fts_alfresco?
Thanks in advance.
You will need to test your queries, since there are small differences (for instance, the date range query syntax is not the same), but in general there's no reason why you would not be able to use FTS.
I'm not sure comprehensive documentation exists that lists all those small differences, though. If you find it, please share.
"Alfresco FTS is compatible with most, if not all of the examples here.."
https://community.alfresco.com/docs/DOC-4673-search

What techniques does Solr use to index files?

As you know, there are different techniques for indexing documents in search engines, such as inverted index, Distributed Dynamic Indexing, Semantic Indexing, NGram Indexing, Context Indexing, Big Data, Multilingual Indexing and so on.
I am working with Solr now. Which of these techniques does Solr use to index documents, and how does Solr (or Lucene) use them?
First - this is a very wide area, and most of the terms you're listing aren't index types. They describe product features (or buzzwords) that could be supported regardless of how the index is built behind the scenes.
Solr uses Lucene - which at the core is an inverted index.
The index stores statistics about terms in order to make term-based search more efficient. Lucene's index falls into the family of indexes known as an inverted index. This is because it can list, for a term, the documents that contain it. This is the inverse of the natural relationship, in which documents list terms.
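To make the "term to documents" direction concrete, here is a toy inverted index in Python. It only illustrates the data structure; Lucene's real index also stores term statistics, positions and much more, and the whitespace tokenizer and function names here are assumptions made for the example.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive analysis: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """docs: {doc_id: text}. Returns {term: set of doc_ids containing it}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_all(index, query):
    """Return the ids of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

A lookup never scans the documents themselves: it fetches the set of document ids for each query term and intersects them, which is what makes term-based search cheap.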
There are also many support structures in place to make Lucene even more efficient for certain queries and features. One such feature is the DocValues support - which can be described as a column-oriented store with document -> term mappings to speed up things like faceting.
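The document -> term direction can be pictured as one column per field, addressed by document id. The Python sketch below is a conceptual illustration under that assumption only; it is not how Lucene lays out DocValues on disk, and `category_column` and `facet_counts` are made-up names.

```python
from collections import Counter

# Column-oriented store: position i holds the field value of document i.
category_column = ["book", "dvd", "book", "game", "book"]


def facet_counts(column, matching_doc_ids):
    """Count field values over the documents that matched a query."""
    return Counter(column[doc_id] for doc_id in matching_doc_ids)
```

With the inverted index alone, answering "how many matches fall in each category" would mean walking every term's document list; with a per-document column it is a single lookup per matching document.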
You can see most of these support features in the Codecs API Doc for Lucene 6.3.0. As it's quite a large list, I'll leave it out from the comment itself.
To answer which techniques: under the hood, Solr uses the Lucene APIs, and Lucene's indexing technique is inverted indexing. Solr is simply a complete application with an infrastructure wrapper; the underlying document indexing technique is the one provided by the Lucene APIs.
How does Solr (or Lucene) use these techniques?
Here is a nice overview of Lucene indexing for beginners. It's just a very simplistic overview, but it explains the basics.
Since Solr is a product, most of its available documentation is functional (it does not explain the actual indexing techniques), and since raw usage of Lucene is minimal, the Lucene documentation is not up to the mark. So most of the time, one needs to dig into the Lucene code or API documentation to understand how Lucene works.
Hope it helps!

Creating Lucene Index in a Database - Apache Lucene

I am using the Grails Searchable plugin. It creates index files at a given location. Is there any way in the Searchable plugin to create the Lucene index in a database?
Generally, no.
You can probably attempt to implement your own format but this would require a lot of effort.
I am no expert in Lucene, but I know that it is optimized to offer fast search over the filesystem. So it would be theoretically possible to build a Lucene index over the database, but the main feature of Lucene (being a VERY fast search engine) would be lost.
As a point of interest, Compass supported storage of a Lucene index in a database, using a JdbcDirectory. This was, as far as I can figure, just a bad idea.
Compass, by the way, is now defunct, having been replaced by ElasticSearch.

Working with Katta ( Lucene, Hadoop )

Can anyone provide me with some sample Java code showing how to go about storing the Lucene index in HDFS (Hadoop File System), using Katta?
Katta is a wrapper for Lucene indexes. A folder containing several Lucene indexes forms a Katta index. I worked on this a long time ago and don't have the code handy.
You have to install Katta on each node along with Hadoop. It wouldn't be that difficult if you try. One of my colleagues has written an article on Lucene indexing and searching using Hadoop. It may help you.

Couple o' quick questions on Apache Lucene

-- I don't want to start any religious wars, but a quick google search indicates that Apache Lucene is the preferred open source tool for indexing and searching. Are there others?
-- What file format does Lucene use to store its index file(s)?
Thanks in advance.
Doug
Which are the best alternatives to Lucene? And as a Lucene user, I can say its performance has improved a lot over the last couple of versions (NOT meaning it was slow before!)
It uses a proprietary format; see here.
I suggest you look at Sphinx.
I have experience with Lucene.net, and we had many problems with multithreaded indexing. Lucene stores its index in files, and these files can be locked by antivirus software.
Also, Lucene compares terms as strings, not as numbers, so filtering products by size and price does not work unless the values are indexed in a sortable form (for example, zero-padded).
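The number-comparison problem mentioned above comes from terms being compared as strings, where "9" sorts after "10". A common workaround is to index numbers zero-padded to a fixed width so that string order matches numeric order. The Python sketch below shows only that encoding trick, assuming non-negative integers; it is not Lucene code, and `pad_number` and `in_range` are hypothetical helper names.

```python
def pad_number(value, width=10):
    """Encode a non-negative integer so that string order equals numeric order."""
    if value < 0:
        raise ValueError("this simple scheme only handles non-negative integers")
    return str(value).zfill(width)


def in_range(term, low, high, width=10):
    """String range check on a padded term, as a term-based range filter would do."""
    return pad_number(low, width) <= term <= pad_number(high, width)
```

For example, prices stored as `pad_number(9)` and `pad_number(10)` compare in the expected order, while the raw strings "9" and "10" do not.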