Hey, I am trying to add autocomplete to my application, but it raises an error: sunspot-2.1.1/lib/sunspot/dsl/fields.rb:93:in `rescue in method_missing': undefined method `autocomplete' for #<Sunspot::DSL::Fields:0x000001029b7cd0> (NoMethodError)
Below are the changes I have made. I appreciate your help.
Model
def category_name
self.name
end
searchable do
text :name
autocomplete :category_name, :as => :name
end
Solr Schema.xml
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
<!-- The index analyzer adds parts of the field from 2 - 25 chars including whitespace etc. -->
<analyzer type="index">
<tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="25"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<!-- The query analyzer takes the whole input, whitespace and all -->
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="autosuggest" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.LetterTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.LetterTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
HTML
<input id="category_name" name="search" size="30" type="text" /> <script>$('#category_name').autocomplete('http://127.0.0.1:8982/solr/', 'name', {});</script>
<script>$('#search').autocomplete('http://127.0.0.1.120:8982/solr/', 'search', {});</script>
I switched the JRE from OpenJDK to Oracle's and reinstalled everything. It works now.
Related
Good evening,
When I search for the word "app", the results do not include "apple". But if I search for "app*", they include both "apple" and "app". I don't want to have to type "*" in the search bar. How can I search for just "app" and still get both "apple" and "app"?
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
<filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I tried adding <filter class="solr.ReversedWildcardFilterFactory"/>, but it didn't work.
Can someone help me?
I am using Apache Solr 6.4.1.
Sorry for my bad English.
Use EdgeNGramFilterFactory
EdgeNGramFilterFactory :
This filter generates edge n-gram tokens of sizes within the given range.
Arguments:
minGramSize: (integer, default 1) The minimum gram size.
maxGramSize: (integer, default 1) The maximum gram size.
Example:
With minGramSize = 1 and maxGramSize = 4:
In: "four score"
Tokenizer to Filter: "four", "score"
Out: "f", "fo", "fou", "four", "s", "sc", "sco", "scor"
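The example above can be reproduced with a rough Python sketch. This is not Solr's implementation: it splits on whitespace instead of using a real tokenizer, and the function names edge_ngrams and analyze_index are made up for illustration. It only shows why an indexed "apple" can match a plain "app" query once edge n-grams are stored in the index.

```python
def edge_ngrams(token, min_gram=1, max_gram=4):
    """Return the leading-edge prefixes of token with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze_index(text, min_gram=1, max_gram=4):
    """Whitespace-split and lowercase the text, then expand each token into edge n-grams."""
    grams = []
    for token in text.lower().split():
        grams.extend(edge_ngrams(token, min_gram, max_gram))
    return grams


print(analyze_index("four score"))
# ['f', 'fo', 'fou', 'four', 's', 'sc', 'sco', 'scor']

# With a larger maxGramSize, every prefix of "apple" is indexed,
# so the un-expanded query term "app" finds an exact match.
indexed_terms = set(analyze_index("apple", min_gram=1, max_gram=25))
print("app" in indexed_terms)
# True
```

Because only the index side is expanded, the query analyzer can stay free of n-grams: the query term is compared as-is against the stored prefixes.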
For your case, you can use the following schema:
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="200"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
<filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
Then change the type of your field to text_ngram, for example:
<field name="name" type="text_ngram" indexed="true" stored="false" multiValued="true"/>
Note: Don't forget to reload the core and reindex your data.
I'm trying to build a search with Apache Solr using this schema: http://pastie.org/5114389. When I type "josé" the document is found, but when I type "jose" I get no result.
I searched the internet for an answer and found that I had to use the class, but adding it makes no difference.
I see from your schema that you are using the ASCIIFoldingFilterFactory already on your text fieldType that is assigned to the default field. However, it is only being applied to the indexing of that field. I would suggest that you also apply it to the querying of your field as well, to ensure that your query terms are being folded to match the items in the index. Typically, in a case like this when you add a filter factory to the indexing you would also add it to the querying so that query terms and index terms are all being converted/compared appropriately.
So I would modify your schema to the following:
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" words="mapping-FoldToASCII.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory" />
<filter class="solr.ASCIIFoldingFilterFactory" words="mapping-FoldToASCII.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory" />
</analyzer>
</fieldType>
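To illustrate why the same folding step should run on both sides, here is a small Python sketch. It is only an approximation under stated assumptions: fold_to_ascii strips combining accents via unicodedata, which covers "é" but is much narrower than what ASCIIFoldingFilterFactory handles, and the helper names and the sample document are hypothetical.

```python
import unicodedata


def fold_to_ascii(token):
    """Drop combining accents after NFKD decomposition (rough stand-in for ASCII folding)."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text, fold):
    """Whitespace-tokenize and lowercase the text; fold accents only if requested."""
    tokens = text.lower().split()
    return [fold_to_ascii(t) if fold else t for t in tokens]


def matches(doc_text, query_text, fold_index, fold_query):
    """True if every analyzed query term appears among the analyzed document terms."""
    index_terms = set(analyze(doc_text, fold_index))
    return all(term in index_terms for term in analyze(query_text, fold_query))


doc = "Jos\u00e9 Silva"  # hypothetical document text, "José Silva"

# Same chain on both sides: accented and unaccented queries both match.
print(matches(doc, "jose", fold_index=True, fold_query=True))
print(matches(doc, "jos\u00e9", fold_index=True, fold_query=True))

# Folding on one side only: the unfolded query term no longer equals the indexed term.
print(matches(doc, "jos\u00e9", fold_index=True, fold_query=False))
```

The first two calls return True and the last returns False, which is the kind of asymmetry that appears whenever the index and query analyzers apply different filters.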
I'm building a Symfony 1.4 application that uses Apache Solr to search through a music database. I'm using the tjSolrDoctrineBehaviorPlugin to integrate Apache Solr with my Symfony 1.4 / Doctrine 1.2 app. I'm new to Apache Solr.
The problem is that when I type in the string "Katy Perry - Firework", I only get the results for "Katy Perry", and it seems that everything after the dash "-" in the query is ignored. If I just enter "Katy Perry Firework", the search works properly and the exact song is retrieved. I'm not sure why the dash breaks the search. I thought the WordDelimiterFilterFactory discards non-alphanumeric characters. Are my parameters wrong?
How do I use the tokenizer/filters to ignore dashes or the " - " (space dash space) string? I'm pretty sure users will use dashes in the search bar a lot to separate the artist from the song (" - ").
Here's my schema.xml:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- in this example, we will only use synonyms at query time
<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
-->
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> -->
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
Thanks.
Certain characters have a special function in Lucene (Solr). Read this to find out which and how to escape them.
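As a minimal sketch of the escaping step, the Python helper below puts a backslash in front of each character that the classic Lucene query syntax treats as an operator. The function name escape_lucene is hypothetical, and whether escaping alone fixes the query depends on which query parser your application uses, so treat it as a starting point rather than a definitive fix.

```python
import re

# Characters with a special meaning in the classic Lucene query syntax:
# + - & | ! ( ) { } [ ] ^ " ~ * ? : \ /
_SPECIAL_CHARS = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_lucene(text):
    """Prefix every special query character with a backslash."""
    return _SPECIAL_CHARS.sub(r"\\\1", text)


print(escape_lucene("Katy Perry - Firework"))
# Katy Perry \- Firework
```

The escaped string is what gets sent as the q parameter. An alternative design is to strip the dash from the user input before querying, since the analyzer would discard it anyway.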
I have a site running Drupal, with Apache Solr hosted on Tomcat. I edited the schema.xml under Tomcat to enable UTF-8 support, and that enabled searches for UTF-8 characters.
However, the result set is not what I expected. When searching for content with UTF-8 characters, Apache Solr returns content with the "equivalent" character as well.
Example
A search for lag (law) also returns content with låg (low). These are very different things in Swedish. Is this configurable, and if so, where?
Looks like you have the ASCIIFoldingFilterFactory setup in your schema.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ASCIIFoldingFilterFactory
This is configurable in Solr. When Solr indexes a record (see type="index"), it uses the analyzers and filters you defined in your schema. Likewise, when you issue a search (see type="query"), the query is run through the query analyzer and its filters. Both are defined in the schema. I would suggest using the Solr web interface to analyze your query as well as your indexing procedure.
For example:
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" expand="false" ignoreCase="true" synonyms="synonyms.txt"/>
<filter class="solr.StopFilterFactory" enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
<filter catenateAll="0" catenateNumbers="1" catenateWords="1" class="solr.WordDelimiterFilterFactory" generateNumberParts="1" generateWordParts="1" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
<filter class="solr.StopFilterFactory" enablePositionIncrements="true" ignoreCase="true" words="stopwords.txt"/>
<filter catenateAll="0" catenateNumbers="0" catenateWords="0" class="solr.WordDelimiterFilterFactory" generateNumberParts="1" generateWordParts="1" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
For example, we can add solr.ISOLatin1AccentFilterFactory to replace accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented equivalents.
I would suggest looking at your schema one more time.
OK thank you both!
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
Uncommenting the line above in both type="index" and type="query" did the trick.
Note the <!-- below:
<analyzer type="query">
<!--
<filter class="solr.ASCIIFoldingFilterFactory"/>
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
-->
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory"
protected="protwords.txt"
generateWordParts="1"
generateNumberParts="1"
catenateWords="0"
catenateNumbers="0"
catenateAll="0"
splitOnCaseChange="1"
preserveOriginal="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
I'm using Solr 1.4 and Solr 4 for full-text search inside documents.
At the moment I'm unable to search for whole phrases, like "The dog runs" in the text block "The dog runs through the house."
For this test case I use a simple Solr URL: http://plocalhost:8088/solr/select/?start=0&q="the dog runs"
I'm using a tokenized, stemmed text field with the following options:
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory"
ignoreCase="true"
words="stopwords-de.txt"
enablePositionIncrements="true"
/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.SnowballPorterFilterFactory" language="German" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms-de.txt" ignoreCase="true" expand="true"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
<filter class="solr.SnowballPorterFilterFactory" language="German" />
</analyzer>
</fieldType>
I have no idea why it's not working. :-(
...thank you for any hint.
To answer my own question:
The index-time analyzer uses a stopword list, while the query-time analyzer does NOT. So the phrase stored in the index was not the same as the phrase produced at query time.
I only had to add the StopFilterFactory to the query analyzer.
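For anyone hitting the same thing, the mismatch can be sketched in Python. This is a simplified model, not Solr's code: the stopword set is a hypothetical English stand-in for stopwords-de.txt, stemming and the other filters are left out, and removed stopwords leave a position gap, as with enablePositionIncrements="true".

```python
STOPWORDS = {"the", "a", "an"}  # hypothetical stand-in for the real stopword file


def analyze(text, remove_stopwords):
    """Return (position, term) pairs; a removed stopword leaves a gap in the positions."""
    pairs = []
    for pos, token in enumerate(text.lower().rstrip(".").split()):
        if remove_stopwords and token in STOPWORDS:
            continue
        pairs.append((pos, token))
    return pairs


def phrase_matches(doc, phrase, stop_at_index, stop_at_query):
    """True if the analyzed phrase terms occur in the document at the same relative positions."""
    positions = {}
    for pos, term in analyze(doc, stop_at_index):
        positions.setdefault(term, set()).add(pos)
    query = analyze(phrase, stop_at_query)
    if not query:
        return False
    first_pos, first_term = query[0]
    for start in positions.get(first_term, ()):
        offset = start - first_pos
        if all(qpos + offset in positions.get(qterm, ()) for qpos, qterm in query):
            return True
    return False


doc = "The dog runs through the house."

# Stopwords removed at index time only: the query still asks for "the", which was never indexed.
print(phrase_matches(doc, "the dog runs", stop_at_index=True, stop_at_query=False))

# Stopwords removed on both sides: the phrase lines up again.
print(phrase_matches(doc, "the dog runs", stop_at_index=True, stop_at_query=True))
```

The first call returns False and the second returns True, which matches what I saw before and after adding the StopFilterFactory to the query analyzer.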