Elasticsearch Query String Query returns all documents - apache

I have an index named users.
When I make a request on http://localhost:9200/users/_search?pretty=true with the following query:
curl -X GET "localhost:9200/users/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
"query": {
"query_string": {
"query" : "firstName: Daulet"
}
}
}'
the query returns two users with the following names:
firstName: Daulet
firstName: Daulet Nurlanuly
How do I make the query string query return only the document with firstName: Daulet?
I've read that Elasticsearch uses Apache Lucene's query syntax and that an exact search requires enclosing the value in quotes, as follows:
firstName: "Daulet"
But the query is already enclosed in quotes.
How do I do that using only Query String Query?
** UPDATE **
The response I get when I make a GET request at http://localhost:9200/users:
{
"users": {
"aliases": {},
"mappings": {
"userentity": {
"properties": {
"firstName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "long"
},
"language": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"lastName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
},
"settings": {
"index": {
"refresh_interval": "1s",
"number_of_shards": "5",
"provided_name": "users",
"creation_date": "1530245236170",
"store": {
"type": "fs"
},
"number_of_replicas": "1",
"uuid": "IlE1Ynv2Q462LBttptVaTg",
"version": {
"created": "5060999"
}
}
}
}
}

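You're correct that you need to surround the value with double quotes. Two things are missing: the double quotes have to be escaped inside the JSON string, and the query should target the firstName.keyword field instead of firstName. The firstName field is analyzed text, so a quoted search for Daulet still matches every document containing that token, including Daulet Nurlanuly, whereas firstName.keyword holds the whole unanalyzed value and only matches it exactly: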
curl -X GET "localhost:9200/users/_search?pretty=true" -H 'Content-Type: application/json' -d'
{
"query": {
"query_string": {
"query" : "firstName.keyword:\"Daulet\""
}
}
}'
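The escaping is easy to get wrong when the request body is written by hand. As a minimal sketch, assuming the body is built in Python on the client side, the helper below (exact_match_body is a hypothetical name, not part of any Elasticsearch client) produces the same body and leaves the JSON-level escaping to json.dumps:

```python
import json


def exact_match_body(field, value):
    """Build a query_string body that matches `value` exactly on `field`.

    `field` is expected to be a keyword sub-field such as firstName.keyword.
    """
    # Escape backslashes and double quotes so the value stays a single
    # quoted phrase in the Lucene query syntax.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return {"query": {"query_string": {"query": f'{field}:"{escaped}"'}}}


body = exact_match_body("firstName.keyword", "Daulet")
# json.dumps adds the JSON-level escaping (\") seen in the curl example.
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d or sent with any HTTP client.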

Related

Elastic Search Query for displaying the documents when it contains all the set of specific properties

I am new to the Elasticsearch APIs. I need to query for and list the documents that contain both of the following properties:
"request": "/v3?id=100000" and "type": "GET"
The result should contain only the documents that have both of the above. I have tried the following, but it only checks one of the two:
{
"query": {
"match": {
"type": "GET"
}
}
}
I tried
{
"query": {
"match": {
"type": "GET",
"request: "/v3/id=100000"
}
}
}
It fails.
Can someone suggest a query that lists all the documents with both properties set as above? I am not sure how to use filters; when I try, I get parse exceptions.
My example document:
{
"_index": "logstash-2016.04.22",
"_type": "endpoint-access",
"_id": "fAhTQkDRQTiHKlzuleNA",
"_score": null,
"_source": {
"@version": "1",
"@timestamp": "2016-04-22T15:26:35.153Z",
"offset": "43714176",
"ident": "-",
"auth": "-",
"timestamp": "22/Apr/2016:15:26:35 +0000",
"type": "GET",
"request": "/v3?id=1b32e833-b521",
"httpversion": "1.1",
"response": "500",
"bytes": "265",
"referrer": "-",
"agent": "-",
"x_forwarded_for": "\"101.2.123.24\"",
"host": "101.123.115.167"
},
"sort": [
1461338795153,
1461338795153
]
}
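You can use a bool query with "must" to get the result. Every clause listed under must has to match, so only documents satisfying both conditions are returned: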
{
"query": {
"bool": {
"must": [
{
"match": {
"type": "GET"
}
},
{
"match": {
"request": "/v3?id=100000"
}
}
]
}
}
}
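When the list of required fields varies, the same bool/must structure can be generated instead of typed out. The following is a minimal sketch in Python; must_match_all is a hypothetical helper name, and the field values are taken from the question:

```python
import json


def must_match_all(criteria):
    """Return a bool query in which every field/value pair has to match."""
    # One match clause per field; `must` combines them with a logical AND.
    clauses = [{"match": {field: value}} for field, value in criteria.items()]
    return {"query": {"bool": {"must": clauses}}}


body = must_match_all({"type": "GET", "request": "/v3?id=100000"})
print(json.dumps(body, indent=2))
```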

How do I enable stemming in Elasticsearch?

curl -XPUT "localhost:9200/products" -d '{
"settings": {
"index": {
"number_of_replicas" : 0,
"number_of_shards": 1
}
},
"mappings": {
"products": {
"properties": {
"location" : {
"type" : "geo_point"
}
}
}
}
}'
I currently have a bash script that creates my index. The code is above.
How do I add stemming to it?
The most generic way to do it is to replace the default analyzer with the snowball analyzer. This enables stemming for all dynamically mapped string fields. This is how you can enable the English stemmer:
curl -XPUT "localhost:9200/products" -d '{
"settings": {
"index": {
"number_of_replicas" : 0,
"number_of_shards": 1,
"analysis" :{
"analyzer": {
"default": {
"type" : "snowball",
"language" : "English"
}
}
}
}
},
"mappings": {
"products": {
"properties": {
"location" : {
"type" : "geo_point"
}
}
}
}
}'
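If the index body is assembled by a script rather than kept as a literal, the analyzer block can be merged into the existing settings. This is only a sketch, assuming Python; with_default_stemmer is a hypothetical helper, and the analyzer settings are the ones from the answer above:

```python
import copy
import json


def with_default_stemmer(index_body, language="English"):
    """Return a copy of `index_body` whose default analyzer is snowball."""
    body = copy.deepcopy(index_body)  # leave the caller's dict untouched
    index_settings = body.setdefault("settings", {}).setdefault("index", {})
    analyzers = index_settings.setdefault("analysis", {}).setdefault("analyzer", {})
    analyzers["default"] = {"type": "snowball", "language": language}
    return body


original = {
    "settings": {"index": {"number_of_replicas": 0, "number_of_shards": 1}},
    "mappings": {"products": {"properties": {"location": {"type": "geo_point"}}}},
}
print(json.dumps(with_default_stemmer(original), indent=2))
```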

ElasticSearch: filtering documents based on field length?

Is there a way to filter ElasticSearch documents based on the length of a specific field?
For instance, I have a bunch of documents with the field "body", and I only want to return results where the number of characters in body is > 1000. Is there a way to do this in ES without having to add an extra column with the length in the index?
Use the script filter, like this:
"filtered" : {
"query" : {
...
},
"filter" : {
"script" : {
"script" : "doc['body'].length > 1000"
}
}
}
EDIT
Sorry, meant to reference the query DSL guide on script filters
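You can also create a custom tokenizer and use it in a multi-field, as in the following example. The character_tokenizer emits one token per character, so the token_count sub-field name.length stores the number of characters in name: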
PUT test_index
{
"settings": {
"analysis": {
"analyzer": {
"character_analyzer": {
"type": "custom",
"tokenizer": "character_tokenizer"
}
},
"tokenizer": {
"character_tokenizer": {
"type": "nGram",
"min_gram": 1,
"max_gram": 1
}
}
}
},
"mappings": {
"person": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"words_count": {
"type": "token_count",
"analyzer": "standard"
},
"length": {
"type": "token_count",
"analyzer": "character_analyzer"
}
}
}
}
}
}
}
PUT test_index/person/1
{
"name": "John Smith"
}
PUT test_index/person/2
{
"name": "Rachel Alice Williams"
}
GET test_index/person/_search
{
"query": {
"term": {
"name.length": 10
}
}
}
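The term query above matches one exact length. For the original question (more than a given number of characters), a range query on the same sub-field would be the natural fit. A minimal sketch in Python, assuming the mapping above; longer_than is a hypothetical helper name:

```python
import json


def longer_than(length_field, min_chars):
    """Build a range query on a token_count sub-field that counts characters."""
    # "gt" keeps only documents whose character count exceeds min_chars.
    return {"query": {"range": {length_field: {"gt": min_chars}}}}


body = longer_than("name.length", 10)
print(json.dumps(body, indent=2))
```

With the two sample documents, this body is meant to select only the name longer than 10 characters; for the body field from the question, the threshold would be 1000 on a similarly mapped sub-field.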

Elasticsearch missing filter on query

This is the first time I have used the 'missing' filter, and I am not sure whether I am doing something wrong, as I am not getting what I expect.
Can someone please tell me whether the missing condition is correctly integrated in this query? It should create 5 facets, each counting only the occurrences for which the decimallatitude field is not set in the index or its value is null.
curl -XGET http://my_url:9200/idx_occurrence/Occurrene/_search?pretty=true -d '{
"filter": {
"missing": {
"field": "decimallatitude",
"existence": true,
"null_value": true
}
},
"query": {
"query_string": {
"fields": ["dataset"],
"query": "3",
"default_operator": "AND"
}
},
"facets": {
"test": {
"terms": {
"field": ["kingdom_interpreted"],
"size": 5
}
}
}
}
'
As you can see on the Search API - Filter page, the top-level filter is applied to your query results but not to the facets. To make it apply to the facets as well, use the Filtered Query instead:
curl -XGET http://my_url:9200/idx_occurrence/Occurrene/_search?pretty=true -d '{
"query": {
"filtered": {
"filter": {
"missing": {
"field": "decimallatitude",
"existence": true,
"null_value": true
}
},
"query": {
"query_string": {
"fields": ["dataset"],
"query": "3",
"default_operator": "AND"
}
}
}
},
"facets": {
"test": {
"terms": {
"field": ["kingdom_interpreted"],
"size": 5
}
}
}
}
'
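The restructuring amounts to moving the filter from the top level of the request into the query itself. A small sketch in Python, assuming the request is built programmatically; the helper name filtered is hypothetical, and the filtered query and facets belong to the older Elasticsearch releases this thread refers to:

```python
import json


def filtered(query, filter_clause):
    """Wrap a query and a filter in a `filtered` query (older ES syntax)."""
    return {"filtered": {"query": query, "filter": filter_clause}}


missing_latitude = {
    "missing": {"field": "decimallatitude", "existence": True, "null_value": True}
}
dataset_query = {
    "query_string": {"fields": ["dataset"], "query": "3", "default_operator": "AND"}
}
body = {
    # The filter now lives inside the query, so the facets see it too.
    "query": filtered(dataset_query, missing_latitude),
    "facets": {"test": {"terms": {"field": ["kingdom_interpreted"], "size": 5}}},
}
print(json.dumps(body, indent=2))
```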

elasticsearch / lucene highlight

I'm using ElasticSearch to index documents.
My mapping is:
"mongodocid": {
"boost": 1.0,
"store": "yes",
"type": "string"
},
"fulltext": {
"boost": 1.0,
"index": "analyzed",
"store": "yes",
"type": "string",
"term_vector": "with_positions_offsets"
}
To highlight the complete fulltext I am setting number_of_fragments to 0.
If I do the following Lucene-like string query:
{
"highlight": {
"pre_tags": "<b>",
"fields": {
"fulltext": {
"number_of_fragments": 0
}
},
"post_tags": "</b>"
},
"query": {
"query_string": {
"query": "fulltext:test"
}
},
"size": 100
}
For some documents in the result set the length of the highlighted fulltext is smaller than the fulltext itself.
Since I am setting number_of_fragments to 0 and pre_tags/post_tags are added this should not happen.
Now comes the strange behaviour: If I only search for one of the failing elements by doing this:
{
"highlight": {
"pre_tags": "<b>",
"fields": {
"fulltext": {
"number_of_fragments": 0
}
},
"post_tags": "</b>"
},
"query": {
"query_string": {
"query": "fulltext:test AND mongodocid:4d0a861c2ebef6032c00b1ec"
}
},
"size": 100
}
then all works fine.
Any ideas?
This sounds like an issue that has been fixed in 0.14.0 (see #479). At the time of writing, 0.14.0 hasn't been released yet. Can you try master?