Solr query search for multiple instances for single keyword - apache

I'm stuck on this one issue. What I want to do is query a multivalued field and check whether a value occurs at least twice. For example, the field must contain "FREE","FREE" and not just "FREE" or "FREE","IN_USE".
Field
<field name="point_statusses" type="string" indexed="true" stored="true" multiValued="true" />
Type
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
SQL
GROUP_CONCAT(cp.status) as point_statusses
Clarification:
I have an object that has multiple plugs, and each plug has a status of FREE, IN_USE or ERROR. What I want to do is filter on the objects that have two plugs with status FREE, and I can't change the structure of the schema.xml. How do I query for this?

Unfortunately, it cannot be done without changing the schema, because solr.StrField does not preserve term frequency information by default.
Quote from schema.xml:
...
1.2: omitTermFreqAndPositions attribute introduced, true by default
except for text fields.
...
However, if you can apply some changes, then the following will work (tested on Solr 4.5.1):
1) Make one of the following changes to schema:
Change field to text_general (or any solr.TextField field);
<field name="point_statusses" type="text_general" indexed="true" stored="true" multiValued="true" />
OR add omitTermFreqAndPositions="false" to point_statusses definition:
<field name="point_statusses" type="string" indexed="true" stored="true" multiValued="true" omitTermFreqAndPositions="false"/>
2) Filter by term frequency.
Examples:
Search documents having exactly 2 'FREE' point_statusses:
{!frange l=2 u=2}termfreq(point_statusses,'FREE')
Or from 2 to 3 'FREE' point_statusses:
{!frange l=2 u=3}termfreq(point_statusses,'FREE')
The final solr query may look like this:
http://localhost:8983/solr/stack20746538/select?q=*:*&fq={!frange l=2 u=3}termfreq(point_statusses,'FREE')
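When the query is sent from code, the braces, spaces and quotes in the fq parameter need to be URL-encoded. Here is a minimal Python sketch of building that URL, reusing the host and core name from the example above. The helper names build_termfreq_filter and build_select_url are made up for illustration and are not part of any Solr client library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_termfreq_filter(field, term, lower, upper):
    """Build an frange filter on termfreq(field,'term') with inclusive bounds."""
    return "{!frange l=%d u=%d}termfreq(%s,'%s')" % (lower, upper, field, term)

def build_select_url(base_url, fq):
    """Append URL-encoded q and fq parameters to a /select endpoint."""
    params = {"q": "*:*", "fq": fq}
    return base_url + "?" + urlencode(params)

# Hypothetical usage: exactly two FREE values in point_statusses.
fq = build_termfreq_filter("point_statusses", "FREE", 2, 2)
url = build_select_url("http://localhost:8983/solr/stack20746538/select", fq)
print(url)
```

Decoding the query string again should give back the same fq value, which is an easy way to check the encoding before sending the request.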

Related

SOLR index on pdate field included in search results

I am migrating from SOLR 4.10.2 to 8.1.1. For some reason, in the 8.1.1 core, a pdate index named IDX_ExpirationDate is appearing as a field in the search results documents.
I have several other indexes that are defined and (correctly) do not appear in the results. But the index I am having trouble with is the only one based on a pdate.
Here is a sample 8.1.1 response that demonstrates the issue:
"response":{"numFound":58871,"start":0,"docs":[
{
"id":"11111",
"ExpirationDate":"2018-01-26T00:00:00Z",
"_version_":1641033044033798170,
"IDX_ExpirationDate":["2018-01-26T00:00:00Z"]},
{
"id":"22222",
"ExpirationDate":"2018-02-20T00:00:00Z",
"_version_":1641032965380112384,
"IDX_ExpirationDate":["2018-02-20T00:00:00Z"]},
ExpirationDate is supposed to be there, but IDX_ExpirationDate should not. I know that I can probably keep using date, but it is deprecated, and part of the reason for upgrading to 8.1.1 is to use the latest non-deprecated stuff ;-)
I have an index named IDX_ExpirationDate based on a field called ExpirationDate that was a date field in 4.10.2:
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<field name="IDX_ExpirationDate" type="date" indexed="true" stored="false" multiValued="true" />
<field name="ExpirationDate" type = "date" indexed = "true" stored = "true" />
<copyField source="ExpirationDate" dest="IDX_ExpirationDate"/>
In the 8.1.1 core, I have this configured as a pdate:
<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>
<field name="IDX_ExpirationDate" type="pdate" indexed="true" stored="false" multiValued="true" />
<field name="ExpirationDate" type = "pdate" indexed = "true" stored = "true" />
<copyField source="ExpirationDate" dest="IDX_ExpirationDate"/>
Fixed.
According to Shawn Heisey on the solr-user mailing list, the pdate type defaults to docValues="true" and useDocValuesAsStored="true", which makes the field appear in results even though it is not stored.
So I changed the IDX_ExpirationDate by adding useDocValuesAsStored="false", reloaded the index, and it no longer appears in the results:
<field name="IDX_ExpirationDate" type="pdate" indexed="true" stored="false" multiValued="true" useDocValuesAsStored="false"/>

Search for specific string solr

I'm using Apache Solr. I would like to search for a specific text in a string.
For example, I use the query
title:"Hello"
and I get 3 results, because it finds 'Hello' anywhere in the titles,
but I want only one result ---> "Hello"
Do I have to change the schema.xml? Or is there a specific query that matches the exact string?
Define title in Solr Schema as below
<field name="title" type="string" indexed="true" stored="true"/>
Now title:"Hello" will match only documents whose title is exactly Hello (after the change you need to reindex).
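To see why the field type matters, here is a rough Python model of the two behaviours. It is only an illustration: the whitespace split and lowercasing stand in for a real text analyzer, and the sample titles are invented.

```python
def matches_string_field(stored_value, query):
    # An untokenized string field indexes the whole value as one term,
    # so only an exact, case-sensitive match counts.
    return stored_value == query

def matches_text_field(stored_value, query):
    # Simplified stand-in for an analyzed text field:
    # lowercase, split on whitespace, match any single token.
    tokens = stored_value.lower().split()
    return query.lower() in tokens

# Hypothetical titles, not taken from the question.
titles = ["Hello", "Hello World", "Say Hello"]

text_hits = [t for t in titles if matches_text_field(t, "Hello")]
string_hits = [t for t in titles if matches_string_field(t, "Hello")]
print(text_hits)
print(string_hits)
```

With the text-like matching all three titles are hits, while the string-like matching keeps only the title that equals the query.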

Solr + schema.xml creating custom FieldType Object

This is only an example, but it will help me get further.
I have an object "person" with the fields [age, name].
My schema.xml
<field name="age" type="string" indexed="true" stored="false"/>
<field name="name" type="string" indexed="true" stored="false"/>
Everything is OK, but I want to add one more field, "relation" (or parents, children etc.):
Person[age,name,relation] -> Relation also has [age,name]
How can I add a field type for relation to my schema.xml?
<field name="age" type="string" indexed="true" stored="false"/>
<field name="name" type="string" indexed="true" stored="false"/>
<field name="relation" type="???" indexed="true" stored="false"/>
I want to add a field that combines existing fields, like this:
<field name="field1" type="string">
<field name="field2" type="string">
<field name="field3" type="string">
<field name="field4" type="field1,field2,field3">
Solr doesn't really support what you want, so you'd probably index it with a multivalued field containing ids that point to the other documents (is there a reason the age field is a string and not an int?):
<field name="id" type="int" indexed="true" stored="false"/>
<field name="age" type="string" indexed="true" stored="false"/>
<field name="name" type="string" indexed="true" stored="false"/>
<field name="relation" type="int" multiValued="true" indexed="true" stored="false" />
.. and then query all documents with a given relation when displaying a document (making two queries to Solr).
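A small Python sketch of the two-query idea, using the id and relation fields from the schema above. It assumes the relation field of a document holds the ids of the persons it is related to, so the second query looks up every document that lists the displayed person's id. The function name queries_for_person is hypothetical.

```python
def queries_for_person(person_id):
    """Return the two query parameter sets used to display one person."""
    # First query: fetch the person document itself.
    first = {"q": "id:%d" % person_id}
    # Second query: fetch every document whose relation field contains this id.
    second = {"q": "relation:%d" % person_id}
    return first, second

first, second = queries_for_person(1)
print(first["q"])
print(second["q"])
```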
You can also use nested child documents, but it requires a bit more handling (since everything is contained in one document, you'll have to update everything together).
Solr prefers data to be denormalized, and a multivalued field is a step in that direction. But as @MatsLindh said, it involves two queries, because most of the time the child entities are more than just a single field (arrays of strings vs. arrays of entities).
(Parent and child in your case is Person and "relation")
Nested child documents are more of an object relation, like what other frameworks offer. You have parent documents and child documents, Solr maintains the relationship, and you need a field that identifies parents as opposed to children. The good parts about this are:
You can get parent document with child field matching
All child documents for a parent field matching
Finally only one query
Nested documents are a fairly recent addition. We use Lucidworks to interact with Solr, and they suggested not using nested documents, so we ended up with a multivalued field. But if your infrastructure allows it and your Solr version has the feature, I think there is nothing wrong with using it.

What is the use of "multiValued" field type in Solr?

I'm new to Apache Solr. Even after reading the documentation, I'm finding it difficult to clearly understand the functionality and use of the multiValued field property.
What does Solr do internally with a field that is marked as multiValued?
What is the difference in indexing between a field that is multiValued and one that is not?
Can somebody explain with a good example?
Doc says:
multiValued=true|false
True if this field may contain multiple values per document, i.e. if it can appear multiple times in a document
A multivalued field is useful when a field can have more than one value. An easy example is tags: a document can have multiple tags that all need to be indexed. If the tags field is multivalued, the Solr response will return a list instead of a single string value. One point to note is that you need to submit one field element per tag value, like:
<field name="tags">tag1</field>
<field name="tags">tag2</field>
...
<field name="tags">tagn</field>
Once you have all the values indexed, you can search or filter results by any value, e.g. you can find all documents with tag1 using a query like
q=tags:tag1
or use the tags to filter out results like
q=query&fq=tags:tag1
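If you index with JSON instead of XML, the same multivalued field is simply a list. Here is a minimal Python sketch that builds such an update payload. The id value and the helper name build_update_payload are made up for illustration, and actually posting the payload to the update handler is left out.

```python
import json

def build_update_payload(doc_id, tags):
    """Serialize one document with a multivalued tags field as a JSON list of docs."""
    doc = {"id": doc_id, "tags": list(tags)}
    return json.dumps([doc])

# Hypothetical document with three tag values.
payload = build_update_payload("doc1", ["tag1", "tag2", "tag3"])
print(payload)
```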
multiValued defines in the schema whether the field is allowed to have more than one value.
For instance:
if I have a field called id which is multiValued=false, indexing a document such as this:
doc {
id : [ 1, 2]
...
}
would cause an exception to be thrown in the indexing thread and the document will not be indexed (schema validation will fail).
On the other hand if I do have multiple values for a field I would want to set multiValued=true in order to guarantee that indexing is done correctly, for example:
doc {
id : 1
keywords: [ hello, world ]
...
}
In this case you would define "keywords" as a multiValued field.
I use multivalued fields only with copyField, so think of it this way: every field is single-valued unless it is a copyField destination. For example, I have the following fields:
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="subject" type="string" indexed="true" stored="true"/>
<field name="location" type="string" indexed="true" stored="true"/>
I want to query a single field and still search all 4 fields above, so I need a copyField. First create a new field called 'all', then copy everything into it:
<field name="all" type="text" indexed="true" stored="true" multiValued="true"/>
<copyField source="*" dest="all"/>
The field 'all' now needs to be multivalued, because it receives one value from each source field.

Problem with Solr dynamic/copy Field

I have a dynamic field in schema.xml:
<dynamicField name="sec_*" type="text" indexed="true" stored="false"/>
and a regular field:
<field name="Contents" type="text" indexed="true" stored="false" multiValued="true"/>
The dynamic field is copied to the Contents field with:
<copyField source="sec_*" dest="Contents"/>
When I search on a dynamic field, for example "sec_1069:risk", Solr filters out documents that do not contain the field sec_1069 at all. Can anybody help me make Solr keep the documents that don't have that dynamic field?
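Try sec_1069:risk OR (*:* -sec_1069:[* TO *])
The *:* is needed because a purely negative clause inside a boolean query matches nothing on its own; written as a bare -sec_1069:[* TO *] next to the OR, it would act as an exclusion on the whole query instead of adding the documents that lack the field.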
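The intent here is to keep a document if it either matches the term in sec_1069 or has no sec_1069 field at all. That logic can be modelled in plain Python to check what the result set should look like. This is only an in-memory illustration with invented documents and a whitespace split in place of a real analyzer; it does not talk to Solr.

```python
def keep(doc, field, term):
    """Keep docs that lack the field, or whose field text contains the term."""
    if field not in doc:
        return True
    return term in doc[field].lower().split()

# Hypothetical documents for illustration only.
docs = [
    {"id": 1, "sec_1069": "market risk factors"},
    {"id": 2, "sec_1069": "revenue overview"},
    {"id": 3, "sec_2000": "risk"},
]

kept = [d["id"] for d in docs if keep(d, "sec_1069", "risk")]
print(kept)  # [1, 3]
```

Document 2 is the only one dropped: it has the field but not the term.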