Searching for a word in all fields of an index with Solr 8.9 - indexing

I'm fetching some data from a SQL database using Solr's DIH (Data Import Handler). I created a field all like this:
and I would like to use it to search across all fields. For example, the query "John" should match both a title and an author name.
Currently I have a problem: when I query the all field, it only works on an exact match.
For example, if I search name:lub it returns
"name":"CR2/LUB/ Lub oil pump",
"all":["1706443412665794562",
"2165C92A-D107-48A6-A410-08D92AA77517",
"CR2/BER/CRACK/LUB/OT/10-PU-200C",
"CR2/LUB/ Lub oil pump"],
which is good.
But if I search all:lub, the response shows:
"numFound":0,"start":0,"numFoundExact":true,"docs.
The ultimate goal is to be able to use a single word to search all fields, and to weight the different fields.
For example, if someone searches for John among books, it should be found in both the title and the author fields (by looking in all), and the results should then be weighted so that the title counts more, with the score of each document visible in the response.
Thanks in advance for your advice!
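One common way to weight fields in Solr is the eDisMax query parser, whose qf parameter accepts "field^boost" entries. The sketch below only builds the request parameters; the field names title and author and the boost values are assumptions taken from the example in the question, not from the actual schema.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, field_boosts):
    """Build Solr query parameters that search several fields with weights."""
    # qf takes space-separated "field^boost" entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": qf,
        "fl": "*,score",  # ask Solr to include each document's score
    }


# Assumed field names and boosts, for illustration only.
params = build_edismax_params("John", {"title": 3, "author": 1})
print(urlencode(params))
```

The resulting query string would be appended to the core's select URL; a match in title then contributes more to the score than a match in author.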

Related

How to re-rank documents based on their attributes rather than just their field relevance?

I'm trying to use Solr to re-rank document results based on their relevance to the user who is searching. For example, if I search joann* this could return documents where the Name field is anything from joanna to joanne. What I'm trying to do is to also favor documents that match on certain attributes that I have, for example when both of us have the field Location = "NYC".
So my question is twofold: is there a way to grab and handle a user's information when they are making a query, and is there a way to re-rank based on these additional field values? Would this look more like writing some code or just an expanded query?
It looks to me like you are describing exactly the functionality that Query Re-Ranking provides. Did you check that out?
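As a rough sketch of what such a request could look like: Solr's re-rank parser is invoked through the rq parameter, with the re-rank query passed by reference. The helper name, the default values, and the idea that the application supplies the searching user's location are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_rerank_params(main_query, rerank_query, docs=100, weight=3):
    """Build parameters that re-rank the top `docs` hits of main_query."""
    rq = "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%s}" % (docs, weight)
    return {
        "q": main_query,
        "rq": rq,
        "rqq": rerank_query,  # referenced from rq as $rqq
    }


# The user's location ("NYC") is assumed to come from the application,
# e.g. from the logged-in user's profile.
params = build_rerank_params("Name:joann*", 'Location:"NYC"')
print(urlencode(params))
```

In this form the user's attributes are handled by the calling code, and the re-ranking itself is just an expanded query.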

Sample code of java lucene indexing and searching for creating one document per line

I am very new to Lucene. I have a text file containing hundreds of records with two columns per line. The first column is the userid and the second is the url_list (I guess those will be my document fields).
I need to provide a search feature using Lucene that returns the document containing the entered URL or userid. For that I need to create one Lucene document per line of my text file.
Please suggest some sample code for this.
I'm using Lucene version 3.6.2.
Here is a short but fantastic tutorial on Lucene for starters.
Lucene in 5 minutes
Steps
1) I assume that you are pre-parsing the text file to get hold of each userid and its corresponding URL list. You have to do this yourself; Lucene won't help. Lucene does break up the text that belongs to a single field, but it won't split a line and assign the userid to the userid field and the URLs to the URL field.
2) Read the above tutorial. I highly recommend using the latest version of Lucene, which is 4.1 as of now.
3) Things to remember that are specific to your use-case
Have two fields for each document: USER_ID, URL (of course you may change those names)
Do not ANALYZE (break into tokens) the content of USER_ID field.
I am not sure how you want to store the URL field. You may either not ANALYZE it, or use the StandardAnalyzer, which recognizes a URL without tokenizing.
4) You can find the sample code to index, query, search, retrieve results in the tutorial.
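The pre-parsing in step 1 could be sketched as below. The file layout is an assumption, since the question does not show it: a tab between the two columns and commas between the URLs. Each returned record would then become one document with a USER_ID field and a URL field.

```python
def parse_line(line):
    """Split one record into (user_id, [urls]); return None for blank lines."""
    line = line.strip()
    if not line:
        return None
    # Assumed layout: "<userid>\t<url>,<url>,..."
    user_id, _, url_part = line.partition("\t")
    urls = [u.strip() for u in url_part.split(",") if u.strip()]
    return user_id.strip(), urls


# Hypothetical sample data standing in for the real text file.
sample = "u42\thttp://a.example/x,http://b.example/y\n\nu7\thttp://c.example/\n"
records = [r for r in map(parse_line, sample.splitlines()) if r is not None]
for user_id, urls in records:
    print(user_id, urls)
```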

Lucene term query in tis files

As you know, Lucene first looks up the term in the tii file, which then points into the tis file. My question is how Lucene filters by field.
For example: the tis file has 1 million terms; 999 thousand terms belong to the content field and the other 1 thousand belong to the title field.
So if I query title:city, will Lucene search for the term city without distinguishing fields, i.e. first search the terms of both fields (content and title) and then drop the content matches? Or are there two tis files, one for the content field and one for the title field?
Thanks in advance
A field value alone makes no sense to Lucene. Terms consist of a value ("city") and a field name ("title", "content", ...).
If you search for "title:city", Lucene will only search for the "city" value for field name "title".
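A toy model can make this concrete. The sketch below is not Lucene's on-disk format; it is a minimal in-memory inverted index, with made-up documents, whose keys are (field, value) pairs, so a lookup for one field never touches the terms of another.

```python
from collections import defaultdict


def build_index(docs):
    """Map each (field, token) pair to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for token in text.lower().split():
                index[(field, token)].add(doc_id)
    return index


# Made-up documents for illustration.
docs = {
    1: {"title": "City guide", "content": "Where to eat"},
    2: {"title": "Food", "content": "Best city restaurants"},
}
index = build_index(docs)

title_hits = index.get(("title", "city"), set())
content_hits = index.get(("content", "city"), set())
print(sorted(title_hits), sorted(content_hits))
```

Here "city" in title and "city" in content are two distinct keys, which mirrors the point that a term is a field name plus a value.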

Solr: Search in multiple fields BUT STOP if documents match was found

I want to search in multiple fields in Solr.
(I know the concept of copy fields and I know the (e)dismax search handler.)
So I have an ordered list of fields that I want the terms to be searched against.
1.) SKU
2.) Name
3.) Description
4.) Summary
and so on.
Now, when the query matches a term, let's say in the SKU field, I want this match and no further searches in the subsequent fields.
Only, if there are NO matches at all in the first field (SKU field), the second field (in this case "name") should be used and so on.
Is this possible with Solr?
Do I have to implement my own Lucene Search Handler for this?
Any advice is welcome!
Thank you,
Bernhard
I think your case requires executing 4 different searches. If you implement your very own SearchHandler, you could avoid the penalty of accumulating search results over 4 different requests: you would send one query, and the custom SearchHandler would execute the 4 searches and prepare one result set.
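The same fallback logic can also be sketched on the client side, at the cost of up to one request per field. In this sketch `search` is a hypothetical callable that sends a query string to Solr and returns the matching documents; it is replaced here by an in-memory stand-in so the example runs on its own.

```python
def first_matching_field(search, fields, term):
    """Try each field in priority order; stop at the first one with hits."""
    for field in fields:
        results = search(f"{field}:({term})")
        if results:
            return field, results
    return None, []


# Hypothetical stand-in for a real Solr request.
FAKE_INDEX = {"name:(pump)": ["doc3", "doc9"], "description:(pump)": ["doc1"]}


def fake_search(query):
    return FAKE_INDEX.get(query, [])


field, hits = first_matching_field(
    fake_search, ["sku", "name", "description", "summary"], "pump"
)
print(field, hits)
```

Because nothing matches in sku, the search falls through to name and stops there; description is never queried.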
If my guess is right you want to rank the results based on the order of the fields. If so then you can just use standard query like
q=sku:(query)^4 OR name:(query)^3 OR description:(query)^2 OR summary:(query)
this will rank the results by the order of the fields.
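A small helper can generate such a query from the ordered field list. This is only a sketch: the helper name is made up, and the boosts simply count down from the number of fields, as in the query above.

```python
def boosted_query(fields, term):
    """Join per-field clauses with OR, boosting earlier fields more."""
    n = len(fields)
    clauses = []
    for position, field in enumerate(fields):
        boost = n - position
        clause = f"{field}:({term})"
        if boost > 1:  # a boost of 1 is the default, so leave it out
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


q = boosted_query(["sku", "name", "description", "summary"], "query")
print(q)
```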
Hope this helps.

Apache SOLR search by category

I am using apache-solr-1.4.1 and jdk1.6.0_14.
I have the following scenario.
I have 3 categories of data indexed in SOLR i.e. CITIES, STATES, COUNTRIES.
When I query data from SOLR I need the search result from SOLR based on the following criteria:
In a single query to SOLR I need data fetched from SOLR grouped by each category with a predefined results count for each category.
How can I specify this condition in SOLR?
I have tried to use SOLR Field Collapsing feature, but I am not able to get the desired output from SOLR.
Please suggest.
My solution is not exactly what you have asked for, but it is my take on what SOLR does best, which is full-text search. Instead of grouping the results by "category", I'd suggest you order the results by relevance score but also provide a facet count for the category values. In my experience users expect a "search" to behave like Google, with the best matches at the top. Deviating from this norm confuses the user in most cases.
If you want exactly what you asked for (actual results grouped by category), then you could use a relational database and do a group_by, or write a custom function query with SOLR (I cannot advise on this as I've never done it).
More info: index the data with the appropriate fields, e.g. name, population, etc. But also add a field called "category", which would have a value of either CITIES, STATES or COUNTRIES. Then perform a standard SOLR search, which will return results in order of relevance, i.e. best matches at the top. As part of the request, you can specify facet.field=category, which will return counts of the search results for each of the given categories (in the "facet" results section). In the UI you can then create links for each category facet which perform the original search plus &fq=category:CITIES, etc., thus restricting results to just that category. See the faceting overview on the SOLR wiki for more info.
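The two requests described above could be built roughly as follows. This is a sketch that only assembles query strings; the search term "york" and the helper name are assumptions, while the category field and its values come from the answer.

```python
from urllib.parse import urlencode


def facet_search_params(user_query, category=None):
    """Relevance-ordered search with facet counts on the category field."""
    params = [("q", user_query), ("facet", "true"), ("facet.field", "category")]
    if category is not None:
        # Filter query: restrict results to one category, as a facet link would.
        params.append(("fq", f"category:{category}"))
    return params


initial = urlencode(facet_search_params("york"))
drill_down = urlencode(facet_search_params("york", "CITIES"))
print(initial)
print(drill_down)
```

The first query string is for the initial search; the second is what a category link in the UI would request.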