Using redisearch-py, searching for 'test' and 'TEST' gives me different results. How can I write a query for a case-insensitive full-text search?
Related
I have an Azure DocumentDB collection with 100 documents. Each document holds an array of tokenized search terms so that I can search by keyword.
I was able to search on just one keyword using the SQL query below for DocumentDB:
SELECT VALUE c FROM root c JOIN word IN c.tags WHERE
CONTAINS(LOWER(word), LOWER('keyword'))
However, this only allows searching on a single keyword. I want to be able to search on multiple keywords. For this, I tried the query below:
SELECT * FROM c WHERE ARRAY_CONTAINS(c.tags, "Food") OR
ARRAY_CONTAINS(c.tags, "Dessert") OR ARRAY_CONTAINS(c.tags, "Spicy")
This works, but it is case-sensitive. How do I make it case-insensitive? I tried using the scalar function LOWER like this:
LOWER(c.tags), LOWER("Dessert")
but this doesn't seem to work with ARRAY_CONTAINS.
Any idea how I can perform a case-insensitive search on multiple keywords using SQL query for DocumentDB?
Thanks,
AB
The best way to deal with the case sensitivity is to store the tags in the array in all lower case (or all upper case) and then apply LOWER(<user-input-tag>) at query time.
As for your desire to search on multiple user input tags, your approach of building a series of OR clauses is probably the best approach.
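As a rough sketch of that approach, the helper below lowercases the user's tags and builds one ARRAY_CONTAINS clause per tag, joined with OR. The function name build_tag_query and the @tagN parameter names are made up for illustration, and the sketch assumes every entry in c.tags was already lowercased when the document was stored. The returned dictionary follows the query-plus-parameters shape commonly used for parameterized DocumentDB SQL queries; check your SDK for the exact form it expects.

```python
def build_tag_query(user_tags):
    """Build a parameterized query that matches any of the given tags.

    Assumption: tags in c.tags were stored in lower case.
    """
    # Normalize user input the same way the stored tags were normalized.
    normalized = [tag.strip().lower() for tag in user_tags if tag.strip()]
    if not normalized:
        raise ValueError("at least one non-empty tag is required")

    clauses = []
    parameters = []
    for i, tag in enumerate(normalized):
        name = f"@tag{i}"  # hypothetical parameter naming scheme
        clauses.append(f"ARRAY_CONTAINS(c.tags, {name})")
        parameters.append({"name": name, "value": tag})

    query = "SELECT * FROM c WHERE " + " OR ".join(clauses)
    return {"query": query, "parameters": parameters}


spec = build_tag_query(["Food", "DESSERT", "spicy"])
print(spec["query"])
```

Passing the tags as parameters rather than concatenating them into the query text also avoids quoting problems with user input.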
We have a scenario where we are trying to perform accurate name matching of Items using SOLR.
Query Parameter: Apple
SOLR Indexed Word: Apple-D
In our business case, "Apple" and "Apple-D" are totally different items and therefore SOLR shouldn't return the match.
Is there an option to achieve the same?
You need to change the fieldType used for the field. Use the string fieldType for your field.
The string fieldType makes sure that the value is stored by Solr exactly as it is.
Solr won't apply any analysis to the value, and it won't split it into tokens.
With the string type applied, Apple and Apple-D are indexed as two different terms, because no tokenizing takes place. This gives you the exact match.
Once you have changed the fieldType, re-index your data.
You can use the Solr analysis tool to check how the field is indexed and queried.
Note: whenever you ask a question like this, share your schema.xml.
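To see why the field type matters, here is a toy comparison in Python. It is not Solr code: tokenized_match is a simplified stand-in for an analyzed text field (split on non-alphanumeric characters, lowercase), and string_match is a stand-in for an unanalyzed string field. The actual tokens Solr produces depend on the analyzer chain in your schema.

```python
import re


def tokenized_match(indexed_value, query):
    # Toy stand-in for an analyzed field: split on non-alphanumerics, lowercase.
    tokens = {t.lower() for t in re.split(r"[^0-9A-Za-z]+", indexed_value) if t}
    return query.lower() in tokens


def string_match(indexed_value, query):
    # Toy stand-in for a string field: the whole value is a single term.
    return indexed_value == query


print(tokenized_match("Apple-D", "Apple"))  # the token "apple" is present
print(string_match("Apple-D", "Apple"))     # the whole values differ
```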
I have developed a search application with Lucene. I have created the basic search. Basically, my app works as follows:
My index has many fields (around 40).
Users can query multiple fields, e.g. +NAME:John +SURNAME:Doe
Queries can contain wildcards such as ? and *, e.g. +NAME:J?hn +SURNAME:Do*
Queries can also contain fuzzy terms, e.g. +NAME:Jahn~0.5
Now I want to find out which field(s) contain my search term(s). Since I am using wildcard and fuzzy queries, I cannot just compare strings. How can I do it?
If you need it for debugging purposes, you could use IndexSearcher.explain.
Otherwise, this problem looks like highlighting, so you should be able to find out the fields that matched by:
re-analyzing your document,
or using its term vectors.
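The re-analysis idea can be sketched outside Lucene as follows: take the stored field values of a hit, split them into tokens, and test each token against the query clause for that field. This is only an approximation in Python, not the Lucene API. The function matching_fields and its input format are hypothetical, fnmatchcase is used as a stand-in for Lucene's ? and * wildcards (it also treats [ ] specially), and difflib's similarity ratio is not the same measure Lucene uses for fuzzy queries.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase


def matching_fields(document, field_queries, fuzzy_threshold=0.5):
    """Return the fields whose stored value matches its query clause.

    field_queries maps a field name to (kind, term), where kind is
    "wildcard" or "fuzzy". Both names are illustrative only.
    """
    matched = []
    for field, (kind, term) in field_queries.items():
        term = term.lower()
        # Crude re-analysis: lowercase and split on whitespace.
        tokens = document.get(field, "").lower().split()
        for token in tokens:
            if kind == "wildcard":
                hit = fnmatchcase(token, term)
            else:
                hit = SequenceMatcher(None, token, term).ratio() >= fuzzy_threshold
            if hit:
                matched.append(field)
                break
    return matched


doc = {"NAME": "John", "SURNAME": "Doe", "CITY": "Paris"}
queries = {
    "NAME": ("wildcard", "J?hn"),
    "SURNAME": ("wildcard", "Do*"),
    "CITY": ("wildcard", "Lon*"),
}
print(matching_fields(doc, queries))
```

In a real application the tokens should come from the same analyzer that built the index, or from the stored term vectors, so that the check agrees with what the searcher actually matched.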
I'm currently using the query
SELECT Url FROM Link WHERE CONTAINS(Url, 'href=blah')
It also returns results containing href=/blah. Is there any way to make the query behave more like WHERE Url LIKE '%href=blah%' and still use the full-text catalog?
Your problem is that = and / are both word breakers; in other words, SQL full-text search is actually searching for the words href and blah.
There are a couple of options you could try. First you could filter down the search domain using the fulltext engine, then search the subset of data using LIKE. You'll need to experiment to see how to squeeze out the best performance.
The other option is, if href=blah is a consistent term you could add that to a custom dictionary. A good article on this is here.
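The first option, a coarse full-text filter followed by an exact substring check, can be illustrated with a toy Python model. This is not SQL Server code: fulltext_candidates is a simplified stand-in for the full-text engine (it only sees word tokens and ignores word order), and exact_substring plays the role of the LIKE '%href=blah%' predicate. In SQL the same idea amounts to combining the CONTAINS predicate and the LIKE predicate in one WHERE clause.

```python
import re


def split_words(text):
    # Toy word breaker: anything that is not a letter or digit separates words.
    return [w.lower() for w in re.split(r"[^0-9A-Za-z]+", text) if w]


def fulltext_candidates(urls, phrase):
    # Coarse stage: keep rows that contain every word of the phrase.
    wanted = split_words(phrase)
    return [u for u in urls if all(w in set(split_words(u)) for w in wanted)]


def exact_substring(urls, needle):
    # Precise stage: equivalent in spirit to LIKE '%needle%'.
    return [u for u in urls if needle in u]


urls = ["<a href=blah>", "<a href=/blah>", "<a src=other>"]
candidates = fulltext_candidates(urls, "href=blah")
print(candidates)                                # both href rows survive
print(exact_substring(candidates, "href=blah"))  # only the exact match remains
```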
Is there any way to retrieve the match field/position for each keyword for each matching document from solr?
For example, if the document has title "Retrieving per keyword/field match position in Lucene Solr -- possible?" and the query is "solr keyword", I'd like to get, in addition to the doc-id (I normally only want the doc-id, not the full document), something that can tell me the matches are at:
solr:
title: 9
keyword:
title: 3
I'm pretty sure such info is computed during query execution (for phrase queries), but is it possible to return it to the application?
Thanks!
Debugging Relevance Issues in Search suggests using Solr analysis, which you can reach from the admin URL, using something like http://localhost:8983/solr/admin/analysis.jsp?highlight=on.
This highlights matching terms and gives their position.
AFAIK there is no way to do that directly, but you can use hit highlighting to implement it.
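If the field text (or a highlighted snippet of it) is available on the client, the per-keyword positions can be recomputed there. The sketch below is plain Python rather than a Solr feature: keyword_positions is a hypothetical helper, it tokenizes on non-alphanumeric characters and counts positions from 1, and the numbers will only agree with Solr's if the field's analyzer tokenizes the same way (no stop-word removal, no extra position increments).

```python
import re


def keyword_positions(fields, keywords):
    """Map each keyword to {field name: [1-based token positions]}."""
    wanted = {k.lower() for k in keywords}
    positions = {}
    for field_name, text in fields.items():
        # Toy analysis: lowercase, split on anything that is not a letter or digit.
        tokens = re.findall(r"[0-9A-Za-z]+", text.lower())
        for index, token in enumerate(tokens, start=1):
            if token in wanted:
                positions.setdefault(token, {}).setdefault(field_name, []).append(index)
    return positions


fields = {
    "title": "Retrieving per keyword/field match position in Lucene Solr -- possible?"
}
print(keyword_positions(fields, ["solr", "keyword"]))
```

For the title in the question this yields position 9 for solr and position 3 for keyword, matching the output the asker described.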