How to avoid retrieving an entire stored field from Solr - ruby-on-rails-3

I'm using Sunspot and Solr in a Rails app to search ebook contents. For the highlight feature I have to make ebook_content a stored field, so every time I query Solr it sends back the entire content of each book, which makes the query very slow.
How can I get the results without the stored field?

The fl parameter of Solr allows you to specify which fields you want returned in the result. If you had fields id, title, ebook_content, then you could use fl=id,title to omit the ebook_content field. I don't think there's support in Solr for getting all fields except one (e.g. -ebook_content).
Update
If you don't want to return the field in the normal results, but still want highlighting on that field, exclude the field as I described above, then turn on the highlighter:
hl=true
set the field(s) which should be highlighted:
hl.fl=ebook_content
and set the size of the highlighting fragment (in characters):
hl.fragsize=50
Your finished query looks something like this:
?q=search+term&fl=id,title&hl=true&hl.fl=ebook_content&hl.fragsize=50
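As a minimal sketch of how those parameters fit together, the snippet below builds the same query string in Python. The function name build_highlight_query is hypothetical, and the field names are the ones assumed in this answer; letting the library do the URL encoding avoids problems with spaces and commas in the values.

```python
from urllib.parse import urlencode


def build_highlight_query(term, return_fields, highlight_field, fragsize=50):
    """Build a Solr query string that returns only `return_fields`
    while still asking for highlight snippets from `highlight_field`."""
    params = {
        "q": term,
        "fl": ",".join(return_fields),  # fields returned with each document
        "hl": "true",                   # turn the highlighter on
        "hl.fl": highlight_field,       # field to take snippets from
        "hl.fragsize": fragsize,        # snippet size in characters
    }
    return urlencode(params)


# ebook_content is left out of fl, so it is not sent back with the documents.
query_string = build_highlight_query("search term", ["id", "title"], "ebook_content")
print(query_string)
```

The resulting string can be appended to the select handler URL of your core after a "?".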

Related

Apache Solr only return fields that value/query string was found in

I am just getting started with Apache Solr.
I have successfully run through the Apache tutorials and have now created my own collection and indexed my files.
Whilst the documentation is extensive I cannot find if there is a way to query all fields, but only return the fields that the search string/query was found in.
For example, if I have a file:
Filename: Weekly Report For Company X.pdf
Associated / indexed meta-data:
"id":"S:\\Weekly Reports\\JAN\\Weekly Report For Company X.PDF",
"date":["2017-11-02T19:14:07Z"],
"pdf_pdfversion":[1.6],
"company":["Microsoft"],
"access_permission_can_print_degraded":[true],
"subject":["weekly report; reports; weekly"],
"contenttypeid":["0x010100F29081EC69D67544A17D8172A093E42E"],
"dc_format":["application/pdf; version=1.6"],
If I query for "Weekly Report" I only want to return the 'id' and 'subject' fields as these are the only fields that contain the actual queried values. If other fields contained the string, I would want them returned too.
I'm leaning towards 'it cannot be done' (but hope I am wrong) as I liken it to a SQL query. It has to know what fields to return in the SQL statement and does not remove fields based on no matching string.
Since I don't know the matched fields before running the query, I cannot use the field list (fl) option at the point of executing the query.
Is this possible?
This may not be precisely what you want, but you could mimic similar behaviour with highlighting.
All you need to do is create a dismax query with qf listing all the fields that you have, separated by spaces (e.g. qf=id subject company).
Then request highlighting on all of those fields (hl.fl=id,subject,company) and set hl.requireFieldMatch=true, which forces Solr to highlight a field only if the query matched in that field.
In this case the response will have a highlighting section that contains the ids of the matched documents and only the highlighted contents of the matched fields.
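A rough sketch of that approach in Python follows. The helper names (build_matched_fields_query, matched_fields) and the sample response are hypothetical and only illustrate the shape of the highlighting section, which is keyed by document id; nothing here contacts a real Solr server.

```python
from urllib.parse import urlencode


def build_matched_fields_query(term, fields):
    """Build dismax parameters that search and highlight the same fields."""
    params = {
        "defType": "dismax",
        "q": term,
        "qf": " ".join(fields),          # qf is a space-separated field list
        "hl": "true",
        "hl.fl": ",".join(fields),
        "hl.requireFieldMatch": "true",  # highlight only fields the query matched
    }
    return urlencode(params)


def matched_fields(response):
    """Map each document id to the fields that produced highlight snippets."""
    highlighting = response.get("highlighting", {})
    return {
        doc_id: sorted(name for name, snippets in fields.items() if snippets)
        for doc_id, fields in highlighting.items()
    }


# Hypothetical response fragment, used here only to exercise the parser.
sample_response = {
    "highlighting": {
        "doc-1": {"subject": ["<em>weekly</em> <em>report</em>; reports"], "company": []},
        "doc-2": {},
    }
}

query_string = build_matched_fields_query("weekly report", ["subject", "company"])
print(query_string)
print(matched_fields(sample_response))
```

The application can then keep only the fields listed for each id when it renders the result.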

Lucene - exclude fields from being searched

I have a search index and require a lucene query which will conditionally search specified fields. The end result will be that if you're logged into the website, all fields will be searched, or if you're logged out, specified fields will be skipped by modifying the lucene query.
The closest I have at the moment is:
+(term1~ term2~) +_culture:([en-gb TO en-gb] [invariantifieldivaluei TO invariantifieldivaluei]) -FieldToIgnore1:(term1 term2) -FieldToIgnore2:(term1 term2)
The problem with this, however, is that if one of the search terms exists in one of the fields to ignore (FieldToIgnore1 or FieldToIgnore2), the whole document is excluded from the results, even when the terms also match in other fields.
How can this be modified so lucene doesn't even match against the fields to ignore?
Instead of qualifying your search via Lucene and the Smart Search Results web part, have you tried modifying the searchability of the document fields themselves? You can set search parameters on the Page Type or on the index itself.
Go to Page Types --> [your doc type] --> Search fields, and set what fields are and aren't exposed to searching.
Version 9 gives you these settings in the Smart Search app. See these docs for details.
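If the index settings cannot be changed, a different option (not the approach described above) is to build the query so that it names only the fields that may be searched, rather than subtracting the ignored ones. The sketch below is a plain string builder under that assumption. The field names Title and Summary are hypothetical, the terms are not escaped, and whether your search front end accepts field-qualified clauses is something to verify.

```python
# Hypothetical field lists; replace them with the fields in your own index.
ALL_FIELDS = ["Title", "Summary", "FieldToIgnore1", "FieldToIgnore2"]
RESTRICTED_FIELDS = {"FieldToIgnore1", "FieldToIgnore2"}


def build_field_query(terms, searchable_fields, fuzzy=True):
    """Return a Lucene clause that looks for the terms only in the given fields."""
    suffix = "~" if fuzzy else ""
    term_group = " ".join(f"{term}{suffix}" for term in terms)
    clauses = " ".join(f"{field}:({term_group})" for field in searchable_fields)
    return f"+({clauses})"


def query_for_user(terms, logged_in):
    """Search every field for logged-in users, and skip restricted fields otherwise."""
    if logged_in:
        fields = ALL_FIELDS
    else:
        fields = [f for f in ALL_FIELDS if f not in RESTRICTED_FIELDS]
    return build_field_query(terms, fields)


anonymous_query = query_for_user(["term1", "term2"], logged_in=False)
print(anonymous_query)
```

Because the restricted fields are never named for logged-out users, a match inside them can neither include nor exclude a document.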

Adding an extra field to already indexed data Solr

I have indexed approximately 1000 documents in Solr. But all of them are missing a field. I need to add a field to all these documents, and this field will have the same value for all of them. I do not have access to these documents to index them again. Is there any way to do this without re-indexing all the data again?
Unless you've configured your schema to store all values, no, there is no usable way to add a field to the documents without reindexing. If all your fields are stored, you can use atomic updates to add a new field to a document, so you could query Solr for the ids of all existing documents and perform the update that way.
Otherwise you're going to have to go with the suggestion from @michielvoo and return a static value from the query string. But then you could also just append it in your application before returning it to the user, or you could add the field as a default value for the request handler in solrconfig.xml, so that you can edit and change it server side.
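For the atomic update route, a minimal sketch of the JSON payload is shown below. It assumes the unique key field is named id and that every other field is stored; the new field name source and its value are hypothetical. The payload would be sent as a POST with Content-Type application/json to the collection's /update handler.

```python
import json


def build_atomic_updates(doc_ids, field_name, value):
    """Create one atomic 'set' update per document id."""
    return [{"id": doc_id, field_name: {"set": value}} for doc_id in doc_ids]


# Hypothetical ids; in practice collect them by querying Solr with fl=id.
doc_ids = ["doc-1", "doc-2"]
payload = json.dumps(build_atomic_updates(doc_ids, "source", "legacy-import"))
print(payload)
```

With about 1000 documents the ids fit comfortably in a single request, although sending them in smaller batches works just as well.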

Showing more feature attributes in Solr Highlight

How can I get more feature fields from Solr highlight output?
Currently the Highlight just returns the text snippet and docID.
During the indexing step I indexed the feature alongside with other fields I'd like to get back.
Thank you in advance!
You can specify other fields to return highlighting on using the hl.fl parameter. For multiple extra fields, just repeat the parameter. For example, if you want to highlight in the fields author and title, you would append
&hl.fl=author&hl.fl=title
to your Solr query. Take a look at the linked page for other highlighting options.
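A short sketch of generating the repeated parameter in Python: passing a list of pairs to urlencode keeps each hl.fl entry separate. The function name highlight_params and the search term are hypothetical.

```python
from urllib.parse import urlencode


def highlight_params(term, highlight_fields):
    """Build query parameters with one hl.fl entry per highlighted field."""
    pairs = [("q", term), ("hl", "true")]
    pairs += [("hl.fl", field) for field in highlight_fields]
    return urlencode(pairs)


params = highlight_params("solr", ["author", "title"])
print(params)
```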

How to display search results in a new form

I've created a system, and within that system I have a find/search page and a find/search results page. Basically, the find/search page consists of a number of text fields, and the more of them the user completes, the more precise the search will be.
I'm using SQL server 2005 to store the data and I can easily update/insert/save new data but I don't know how to search for the data ...
I want the user to fill out the fields in the find/search form and for the results to appear in the find/search results page. Can this be done?
It depends on what kind of data you need to search.
If it's generic text data, the best way is to use Full-Text Search.
Yes. There are a number of ways you could achieve this. One possible way would be to pass the search criteria to the search results page via query string. Another way which is very similar is to store the search criteria in a session and redirect to the search results page. In either case on the search results page you'd want to take the data and build your SQL query. Depending on what you need you could utilize a full-text search like Kesty has suggested or you could simply use FIELD like '%user entered data%' in your queries. It really depends on your needs.
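To illustrate the LIKE approach, here is a small sketch that builds the query from whichever fields the user filled in. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs on its own; the people table, its columns, and the sample rows are all hypothetical. The user's text is passed as parameters rather than pasted into the SQL string.

```python
import sqlite3

# Hypothetical whitelist of searchable columns; never take column names from user input.
ALLOWED_COLUMNS = {"first_name", "last_name", "city"}


def build_search(criteria):
    """Turn the non-empty form fields into a parameterized LIKE query."""
    clauses, params = [], []
    for column, text in criteria.items():
        text = text.strip()
        if column in ALLOWED_COLUMNS and text:
            clauses.append(f"{column} LIKE ?")
            params.append(f"%{text}%")
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = f"SELECT first_name, last_name, city FROM people WHERE {where} ORDER BY last_name"
    return sql, params


# In-memory database with made-up rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ada", "Lovelace", "London"), ("Alan", "Turing", "London"), ("Grace", "Hopper", "New York")],
)

# The search form left first_name empty and entered "lond" for the city.
sql, params = build_search({"first_name": "", "city": "lond"})
results = conn.execute(sql, params).fetchall()
print(results)
```

The results page would run the same kind of query using the criteria it received through the query string or the session.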