searching in solr for specific values with dismax - lucene

I'm using the dismax handler to search Solr records (boosting some fields).
In my index, I have a RetailerId for each document, as well as other fields.
My query needs to search for documents that have this RetailerId as well as keywords:
http://localhost:8983/solr/select?qt=dismax&q=RetailerId:(27 OR 92) AND socks
What is the syntax for such a query?
Thanks!

Dismax does not support boolean operators such as AND and OR in the q parameter (it only understands + and -). For a query like the one you described, you need to use the standard query handler.
UPDATE
I ran a couple of tests, and the fq parameter seems to work with dismax:
/select?qt=dismax&q=socks&fq=RetailerId:(27 OR 92)
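A minimal sketch of how such a URL could be built and encoded from code. The host and port are copied from the question; the helper name build_dismax_url is made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint taken from the question's URL; adjust for your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_dismax_url(keywords, retailer_ids):
    """Hypothetical helper: keywords go to q, the RetailerId restriction to fq."""
    # The boolean logic lives in fq, which is not parsed by dismax.
    id_filter = "RetailerId:(" + " OR ".join(str(i) for i in retailer_ids) + ")"
    params = [("qt", "dismax"), ("q", keywords), ("fq", id_filter)]
    # urlencode percent-encodes ':' and parentheses and turns spaces into '+'.
    return SOLR_SELECT + "?" + urlencode(params)

url = build_dismax_url("socks", [27, 92])
print(url)
```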

If you want to filter by facet, use eDismax (extended DisMax). That way you can say, for instance, q=your query AND facet_name:"facet value"

Related

Searching using SOLR on multiple fields

I have two requirements for my SOLR implementation:
I need to be able to search on multiple fields at the same time (preferably with field boosting). This is possible using dismax parser.
I also have a specific set of indexed fields (for example, a gender field). I need to be able to apply filters on those fields (example: select?q=david&gender:male&status:married). As far as I understand dismax, this is not possible.
Can the second requirement be handled using dismax (or edismax)? For now I am forced to use the standard query parser, even though I really liked dismax.
There is nothing stopping you from using dismax or edismax. Use qf to tell it which fields to search by default, and use fq to apply queries that act as filters.
/select?q=david&fq=gender:male&fq=status:married&qf=name^10 address^3
Filter queries don't affect the score, and each one is cached separately. If you always filter on both gender and status, you could combine them to get a single cache entry instead (fq=gender:male AND status:married).
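A rough sketch of the two options (one fq per field versus a single combined fq). The helper names are invented, and the explicit defType=edismax parameter is an assumption that is not part of the URL above; drop it if your request handler already defaults to (e)dismax.

```python
from urllib.parse import urlencode

def filter_params(filters, combine=False):
    """Hypothetical helper: turn {'field': 'value'} into fq parameters.

    combine=False -> one fq per field (each cached on its own)
    combine=True  -> a single fq joined with AND (one cache entry)
    """
    clauses = [f"{field}:{value}" for field, value in filters.items()]
    if combine:
        return [("fq", " AND ".join(clauses))]
    return [("fq", clause) for clause in clauses]

def build_query(text, qf, filters, combine=False):
    # defType=edismax is an assumption, see the note above.
    params = [("defType", "edismax"), ("q", text), ("qf", qf)]
    params += filter_params(filters, combine)
    return "/select?" + urlencode(params)

filters = {"gender": "male", "status": "married"}
print(build_query("david", "name^10 address^3", filters))
print(build_query("david", "name^10 address^3", filters, combine=True))
```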

Endeca search query on multiple fields

How do I create an Endeca query on a combination of multiple fields (just like a WHERE clause in an SQL query)? Suppose we have three indexed fields:
empId
empName
empGender
Now, I need a query like "where empName like 's%' AND empGender=male"
Thanks.
Firstly, check out Record Filters in the Advanced Development Guide.
If you are trying to use a Record Filter on a property, you will need to enable it explicitly in Developer Studio for that property, whereas Dimensions can be used in a Record Filter automatically. This helps when you have explicit values to filter on, for example empGender.
Your Record Filter can then look as follows:
Nr=AND(empGender:male)
You can further use the Ntk parameter to specify which fields to search. Assuming your empName field is enabled for wildcard searching (configure this in Developer Studio), searching this field will look as follows:
Ntk=empName&Ntt=s*
So, assuming your properties have been configured correctly, your example above will probably end up looking as follows:
Nr=AND(empGender:male)&Ntk=empName&Ntt=s*
To take this one step further, you can specify several Search Filters (i.e. Ntk + Ntt parameters) together. I haven't tried this with wildcards, so you'll need to confirm that yourself, but to combine Search Filters you delimit them with |
Ntk=empName|empId&Ntt=s*|1234*
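The parameter strings above could be assembled along these lines. This is only a sketch: the helper name endeca_query is invented, and joining several record filter operands with a comma inside AND(...) is an assumption that goes beyond the single-operand example shown here.

```python
from urllib.parse import urlencode

def endeca_query(record_filters, searches):
    """Hypothetical helper that builds the Nr / Ntk / Ntt part of a query string.

    record_filters: list of 'property:value' strings
    searches:       list of (search_key, search_terms) pairs
    """
    # Comma-separated operands inside AND(...) are assumed, not taken from the answer.
    nr = "AND(" + ",".join(record_filters) + ")"
    # Search keys and their terms are paired by position, delimited with '|'.
    ntk = "|".join(key for key, _ in searches)
    ntt = "|".join(terms for _, terms in searches)
    # Keep the characters used by the examples above readable instead of percent-encoding them.
    return urlencode({"Nr": nr, "Ntk": ntk, "Ntt": ntt}, safe="():*|,")

single = endeca_query(["empGender:male"], [("empName", "s*")])
combined = endeca_query(["empGender:male"], [("empName", "s*"), ("empId", "1234*")])
print(single)
print(combined)
```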
I suggest you manually build up queries in the Reference Application to confirm you get your expected results and then start to code this up in your application.
radimbe, the problem with record filters for this use case is that they need to be precise. This means you don't get spelling correction, thesaurus expansion, case insensitivity or stemming. It's very unlikely that a user will input precise information like this.
Saraubh, you can do a boolean search to do OR text search queries. You can also use the Endeca Query Language to specify a complex set of boolean logic that goes beyond boolean search and which would incorporate spelling correction, stemming, etc.
In general though, I think for an application like this, you should move away from searching specific individual fields simultaneously and make use of the faceting capabilities of dimensions to guide the user. Additionally, a search box that searches many fields in combination simultaneously in order of importance is really the way to go for a simplified user interface for this sort of application.

lucene join query

Is there a way to issue join queries (http://www.searchworkings.org/blog/-/blogs/query-time-joining-in-lucene) in lucene without directly using Query API? Is it possible to issue query in text form for this requirement? For example:
title:derivatives join(comments:great)
Apache Solr (4.0, not released yet) has a query parser which can handle join queries.
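As a rough illustration of the Solr route: the join query parser uses local-params syntax, {!join from=... to=...}. The sketch below assumes comments are indexed as separate documents with a hypothetical post_id field that points at the id field of the post; neither field name comes from the question.

```python
from urllib.parse import urlencode

def join_filter(from_field, to_field, inner_query):
    """Build a Solr join clause: match inner_query, then map from_field -> to_field."""
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)

# post_id and id are hypothetical field names used only for this example.
fq = join_filter("post_id", "id", "comments:great")
query_string = urlencode({"q": "title:derivatives", "fq": fq})

print(fq)
print(query_string)
```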
If I understand your question, I think you want a query like 'title:derivatives AND comments:great'. Or you can use code like 'queryParser.setDefaultOperator(QueryParser.Operator.AND)' to change the default conjunction operator to AND instead of OR ('OR' is used by default unless you tell Lucene otherwise).

Solr query parser that allows specifying multiple default fields

I would like to use the Dismax query parser because it allows me to specify multiple default search fields (using the 'qf' parameter) as well as other nice features such as field boosting.
However, I want a query parser/scoring algorithm that takes the sum of all field scores, rather than just the max.
Is there a way to configure DisMax to take a sum of scores rather than the max?
Can I specify multiple default search fields using the standard query parser?
Is there a different query parser altogether that would achieve this?
Do I need to write my own query parser?
Any help is greatly appreciated.
Thanks!
Isn't qf=fieldA fieldB what you are looking for?
If fieldA is more important, use qf=fieldA^2 fieldB
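On the sum-versus-max part of the question: dismax also accepts a tie parameter. A disjunction-max score is the best field score plus tie times the sum of the other field scores, so tie=0.0 gives a pure max and tie=1.0 gives a plain sum. The function below is a simplified model of that formula only (boosts and normalization are ignored), and the scores are made-up numbers.

```python
def dismax_score(field_scores, tie=0.0):
    """Simplified model: best field score plus tie times the remaining scores."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others

scores = [0.5, 0.25]  # made-up per-field scores for one document

print(dismax_score(scores, tie=0.0))  # pure max
print(dismax_score(scores, tie=1.0))  # plain sum
```

With this model, adding tie=1.0 to the request would make every matching field contribute fully to the score.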

Search literal within a word

Is there a way to perform a FULLTEXT search which returns literals found within words?
I have been using MATCH(col) AGAINST('+literal*' IN BOOLEAN MODE) but it fails if the text is like:
blah,blah,literal,blah
blahliteralblah
blah,blah,literal
Please note that there is no space after the commas.
I want all three cases above to be returned.
I think it would be better to fetch the entries into an array and then do the text manipulation (in this case, a search) on the fetched data.
Text manipulation and complex queries take more resources inside the database, and if your database contains a lot of data, the query becomes too slow. Moreover, if you are running your query on a shared server, that makes the performance issues worse.
You can easily accomplish what you are trying to do with a regex once you have fetched the data from the database.
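A small sketch of that post-filtering idea, assuming the rows have already been fetched into a list. The sample strings are the ones from the question plus one non-matching row added for contrast.

```python
import re

# Rows as they might come back from the database (last one added for contrast).
rows = [
    "blah,blah,literal,blah",
    "blahliteralblah",
    "blah,blah,literal",
    "nothing to see here",
]

def contains_literal(text, needle="literal"):
    """True if needle occurs anywhere in text, including inside another word."""
    # re.escape keeps the needle literal even if it contains regex metacharacters.
    return re.search(re.escape(needle), text) is not None

matches = [row for row in rows if contains_literal(row)]
print(matches)
```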
UPDATE: My suggestion is the same even if you are running your script on a dedicated server. However, if you want to perform a full-text search for the word "literal" in BOOLEAN MODE as you described, you can remove the + operator (because you are searching for only one word) and construct the query as follows:
SELECT listOfColumnNames FROM tableName WHERE
MATCH (colName)
AGAINST ('literal*' IN BOOLEAN MODE);
However, your query also works fine if you keep the + (AND) operator; I tested it with MySQL 5.1 on an Apache server.
I suggest you read the documentation about full-text search in boolean mode.
The only problem with this query is that it doesn't match the word "literal" if it is a substring inside another word, for example "textliteraltext".
As you noticed, you can't use the * operator at the beginning of a word.
So, to accomplish what you are trying to do, the fastest and easiest way is to follow Paul's suggestion and use the % placeholder:
SELECT listOfColumnNames FROM tableName
WHERE colName LIKE '%literal%';
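To see the LIKE pattern at work without a MySQL server, here is a sketch that uses Python's built-in sqlite3 module as a stand-in. SQLite is not MySQL, so this only illustrates how '%literal%' matches; tableName and colName are the same placeholders used above.

```python
import sqlite3

rows = [
    "blah,blah,literal,blah",
    "blahliteralblah",
    "blah,blah,literal",
    "nothing to see here",
]

# In-memory SQLite database standing in for the real MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (colName TEXT)")
conn.executemany("INSERT INTO tableName (colName) VALUES (?)", [(r,) for r in rows])

# % matches any run of characters, so the needle may sit anywhere in the value.
cursor = conn.execute(
    "SELECT colName FROM tableName WHERE colName LIKE ? ORDER BY rowid",
    ("%literal%",),
)
found = [row[0] for row in cursor]
conn.close()
print(found)
```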