I'm researching RavenDB for my organization, and one of the main research parameters is spatial search. I tried to find out whether RavenDB supports spatial search using SRID, but couldn't find any information.
Does it consider the datum of the stored geometries?
No, RavenDB doesn't use SRID.
For spatial searches, it relies on Spatial4n, a port of Spatial4j: https://github.com/spatial4j/spatial4j
I've developed a website that has a search facility that utilises Neo4j's full text search feature.
In order to build my index, I used the following cypher command:
CALL db.index.fulltext.createNodeIndex("ArticlesIndex", ["Article"], ["title", "abstract"])
I was wondering if there was any way to configure the scoring metric for this index? I believe Neo4j currently uses VSM but I'm hoping to switch it to BM25.
I've checked the Neo4j docs: they mention an optional third argument, config, for createNodeIndex(), but it seems to support only two options, neither of which overrides the default scoring metric.
I'm not exactly proficient with neo4j so any help would be appreciated :)
No, you cannot change the scoring function. Actually there isn't much you can configure except the analysers.
I want to add full-text search functionality to my Spring Boot application. The data should be stored in an SQL database; I have also read that using ES as a primary database is not recommended.
One way I thought of is: create, update and delete operations can be done on both the primary SQL database and in ES (which we can do using the Java High Level REST Client), for example, when inserting a row in SQL, we index it in ES as well, then we perform searches using Elasticsearch.
I think we can also use Hibernate search.
Is it the right approach? Otherwise any suggestions?
The main difference is that Hibernate Search provides integration between JPA and your index of choice (Lucene or Elasticsearch):
Hibernate Search will automatically add/update/delete documents in your full-text index according to changes in your JPA entities (as soon as you commit a transaction).
Hibernate Search will allow you to build a full-text query (full-text world), and retrieve the results as managed entities (JPA world).
As far as I understand, Spring-Data-Elasticsearch is focused on accessing Elasticsearch and has no JPA integration whatsoever. That is to say, you can use Spring-Data-JPA, and you can use Spring-Data-Elasticsearch, but they won't communicate with each other. You will have two separate models, which you will update and query separately.
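To make the "two separate models, updated separately" situation concrete, here is a minimal sketch of the manual dual-write approach from the question, written in Python rather than Java to keep it short. Everything in it is an assumption for illustration: the article table, the create_article and delete_article helpers, and above all InMemoryIndex, which is a toy inverted index standing in for Elasticsearch and not a real client.

```python
import sqlite3
from collections import defaultdict


class InMemoryIndex:
    """Toy inverted index standing in for Elasticsearch (assumption, not a real client)."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of row ids

    def index(self, doc_id, text):
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def delete(self, doc_id):
        for ids in self.postings.values():
            ids.discard(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


def create_article(conn, index, title):
    """Insert into SQL first, commit, then mirror the row into the index."""
    cur = conn.execute("INSERT INTO article (title) VALUES (?)", (title,))
    conn.commit()
    index.index(cur.lastrowid, title)
    return cur.lastrowid


def delete_article(conn, index, article_id):
    """Delete from SQL first, commit, then remove the row from the index."""
    conn.execute("DELETE FROM article WHERE id = ?", (article_id,))
    conn.commit()
    index.delete(article_id)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
index = InMemoryIndex()
first = create_article(conn, index, "Full text search with Lucene")
second = create_article(conn, index, "Spatial search basics")
```

Note that nothing here protects against the index write failing after the SQL commit has succeeded; keeping the two stores consistent is exactly the work that Hibernate Search takes off your hands.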
Some other elements:
If you don't need a distributed index, Hibernate Search can run in embedded Lucene mode, without all the Elasticsearch stack. It will probably be more lightweight.
Hibernate Search is currently not very flexible when it comes to customizing your Elasticsearch mapping or using advanced Elasticsearch features, because of the abstraction layer. That will change in the future, though (Hibernate Search 6).
A Spring-Data-HibernateSearch module is in the works, which would let you benefit from the best of both worlds. It hasn't been released yet and is not well documented, though: https://github.com/snowdrop/spring-boot-hibernate-search-booster
If you need only simple full-text search, consider PostgreSQL. I'm using it for indexing and searching document content: https://www.postgresql.org/docs/9.1/textsearch-controls.html
Both MongoDB and CouchDB use B-trees as the underlying data structure for storing indexes. Does anyone know what the equivalent is for RavenDB? Nothing is mentioned about this in the documentation. Thanks!
RavenDB uses Lucene index.
In order to allow fast queries over your indexes, RavenDB processes
them in the background, executing the queries against the stored
documents and persisting the results to a Lucene index. Lucene is a
full text search engine library (Raven uses the .NET version) which
allows us to perform lightning fast full text searches.
You can read more about indexing in the documentation: How the indexes work
I have an assignment in which I need to build R-tree indexes on a table and query them.
But I can't find a proper tutorial or guide that deals specifically with R-trees in IBM Informix
and with querying an R-tree.
I tried to Google but without much success.
Can anyone provide me with a good starting point?
I don't know what Google search terms you used, but I get to a good selection of information using:
site:ibm.com informix r-tree index
Variations on this search theme are how I usually search for IBM Informix Dynamic Server documentation on the IBM web site, when I'm not accessing the PDF manuals which I keep downloaded.
Specifically, the section Using R-Tree Indexes (Google indexes an older version of IDS; this URL is to the IDS 11.50 Info Center), seems to cover the questions you have. This is from the Spatial DataBlade guide. You could also use the R-Tree Index User's Guide as a source of information. See, for example, When does the Optimizer use an R-tree Index.
If this does not sufficiently point you in the correct direction, you'd better update the question and explain what you don't understand, with cross-references to the documentation where you don't understand what it is saying.
I'm looking into using Lucene and/or Solr to provide search in an RDBMS-powered web application. Unfortunately for me, all the documentation I've skimmed deals with how to get the data out of the index; I'm more concerned with how to build a useful index. Are there any "best practices" for doing this?
Will multiple applications be writing to the database? If so, it's a bit tricky; you have to have some mechanism to identify new records to feed to the Lucene indexer.
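One possible mechanism for spotting new records is a high-water mark: remember the highest primary key already indexed and fetch only rows above it on each run. The sketch below is in Python with sqlite3 for brevity; the doc table, the fetch_unindexed and run_indexer helpers, and the feed callback (which stands in for the Lucene indexer) are all hypothetical, and it assumes monotonically increasing ids.

```python
import sqlite3


def fetch_unindexed(conn, last_indexed_id):
    """Return rows added since the last indexing run, ordered by id."""
    return conn.execute(
        "SELECT id, body FROM doc WHERE id > ? ORDER BY id", (last_indexed_id,)
    ).fetchall()


def run_indexer(conn, last_indexed_id, feed):
    """Feed new rows to the indexer callback and return the new high-water mark."""
    for row_id, body in fetch_unindexed(conn, last_indexed_id):
        feed(row_id, body)
        last_indexed_id = row_id
    return last_indexed_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO doc (body) VALUES (?)", [("first",), ("second",)])

indexed = []  # records what the stand-in indexer was given
mark = run_indexer(conn, 0, lambda i, b: indexed.append((i, b)))
conn.execute("INSERT INTO doc (body) VALUES (?)", ("third",))  # simulates another writer
mark = run_indexer(conn, mark, lambda i, b: indexed.append((i, b)))
```

This only catches inserts. Updates and deletes would need something extra, such as a last-modified column or a change log table.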
Another point to consider is whether you want one index that covers all of your tables or one index per table. In general, I recommend one index, with a field in that index to indicate which table the record came from.
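As a rough illustration of the single-index layout, the following Python sketch keeps one toy inverted index in which every entry records its source table, so a search can cover all tables or be narrowed to one. CombinedIndex and the table names are made up for the example; in Lucene the same idea would be an extra field on each document.

```python
from collections import defaultdict


class CombinedIndex:
    """Toy single index over several tables; each entry records its source table."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> {(table, row_id)}

    def add(self, table, row_id, text):
        for token in text.lower().split():
            self.postings[token].add((table, row_id))

    def search(self, term, table=None):
        """Search all tables, or only the given one when table is set."""
        hits = self.postings.get(term.lower(), set())
        if table is not None:
            hits = {h for h in hits if h[0] == table}
        return sorted(hits)


idx = CombinedIndex()
idx.add("articles", 1, "Lucene index basics")
idx.add("comments", 7, "Great index tutorial")
```

Here idx.search("index") covers both tables, while idx.search("index", table="comments") restricts the hits to one of them.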
Hibernate has support for full text search, if you want to search persistent objects rather than unstructured documents.
There's an OpenSymphony project called Compass of which you should be aware. I have stayed away from it myself, primarily because it seems to be way more complicated than search needs to be. Also, as far as I can tell from the documentation (I confess I haven't found the time necessary to read it all), it stores Lucene segments as blobs in the database. If you're familiar with the Lucene architecture, Compass implements a Lucene Directory on top of the database. I think this is the wrong approach. I would leverage the database's built-in support for indexing and implement a Lucene IndexReader instead. The same criticism applies to distributed cache implementations, etc.
I haven't explored this at all, but take a look at LuSql.
Using Solr would be straightforward as well but there'll be some DRY-violations with the Solr schema.xml and your actual database schema. (FYI, Solr does support wildcards, though.)
We are rolling out our first application that uses Solr tonight. With Solr 1.3, they've included the DataImportHandler, which allows you to specify your database tables (they call them entities) along with their relationships. Once defined, a simple HTTP request will trigger an import of your data.
Take a look at the Solr wiki page for DataImportHandler for details.
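For a rough idea of what that HTTP request looks like, here is a small Python sketch. The host, port, and the /dataimport path are assumptions (the path depends on how the handler is registered in your solrconfig.xml), and build_import_url and trigger_import are hypothetical helper names; full-import is the DataImportHandler command for a complete import.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_import_url(base_url, command="full-import"):
    """Build the DataImportHandler request URL for the given command."""
    return f"{base_url.rstrip('/')}/dataimport?{urlencode({'command': command})}"


def trigger_import(base_url, command="full-import"):
    """Send the request; needs a running Solr with the handler configured."""
    with urlopen(build_import_url(base_url, command)) as response:
        return response.status


# Assumed local Solr base URL; adjust to your deployment.
url = build_import_url("http://localhost:8983/solr/")
```

Calling trigger_import("http://localhost:8983/solr/") would send the request; it is left out of the example because it needs a live Solr instance.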
As an introduction:
Brian McCallister wrote a nice blog post: Using Lucene with OJB.