elasticsearch match two fields - sql

How can I get this simple SQL query running on Elasticsearch?
SELECT * FROM [mytype] WHERE a = -23.4807339 AND b = -46.60068
I'm really having trouble with its syntax, and multi-match queries don't work in my case. Which query type should I use?

For queries like yours, the bool filter is preferred over the and filter. See here for the whole story behind this suggestion and why it is considered more efficient.
That being said, I would do it like this:
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "a": -23.4807339 } },
            { "term": { "b": -46.60068 } }
          ]
        }
      }
    }
  }
}

You can approach this with the and filter. For your example, something like:
{
  "size": 100,
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "and" : [
          { "term" : { "a": -23.4807339 } },
          { "term" : { "b": -46.60068 } }
        ]
      }
    }
  }
}
Be sure to direct the query against the correct index and type. Note that I specified the size of the return set as 100 arbitrarily - you'd have to specify a value that fits your use case.
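For instance, assuming the body above is saved to query.json and your index is called myindex (a hypothetical name; mytype is the type from your question), the request could be sent like this:
# "myindex" is a hypothetical index name; "mytype" comes from the question
curl -XPOST 'http://localhost:9200/myindex/mytype/_search' -d @query.json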
There's more on filtered queries here, and more on the and filter here.

Related

What is the default doc sequence of the result from an Elasticsearch filter request?

I recently ran an Elasticsearch filter request like this:
{
  "from" : 0,
  "size" : 10,
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : {
            "terms" : {
              "a_id" : [ 257793, 257798, 257844 ]
            }
          }
        }
      }
    }
  },
  "explain" : false,
  "fields" : "a_id"
}
This finds all docs with a_id in 257793, 257798, 257844, and the results come back as 257844, 257798, 257793. So far so good.
Then I noticed that whatever the order of the terms, the returned docs always come back in the same a_id order. That is, even if I run
"terms" : {
"a_id" : [257798, 257844, 257793 ]
}
The result docs are in the order of 257844, 257798, 257793 as well.
So I'm curious about the mechanism behind Elasticsearch filtering. Can anyone give me a hint?
By default, ES returns results in descending order of _score. You can provide the sort option to say in which order, and based on which field, you want the results returned. For example, to sort on a date field:
{
  "sort": { "date": { "order": "desc" } },
  "query" : {
    "term" : { "user" : "kimchy" }
  }
}
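Applied to your request, a sketch that sorts on a_id itself (assuming ascending order is what you want):
{
  "sort": { "a_id": { "order": "asc" } },
  "query": {
    "filtered": {
      "filter": {
        "terms": { "a_id": [ 257793, 257798, 257844 ] }
      }
    }
  }
}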
You can get more information:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_sorting.html

How to know if a geo coordinate lies within a geo polygon in elasticsearch?

I am using Elasticsearch 1.4.1 - 1.4.4. I'm trying to index a geo polygon shape (document) into my index, and now that the shape is indexed I want to know whether a geo coordinate lies within the boundaries of that particular indexed geo-polygon shape.
GET /city/_search
{
  "query": {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_polygon" : {
          "location" : {
            "points" : [
              [72.776491, 19.259634],
              [72.955705, 19.268060],
              [72.945406, 19.189611],
              [72.987291, 19.169507],
              [72.963945, 19.069596],
              [72.914506, 18.994300],
              [72.873994, 19.007933],
              [72.817689, 18.896882],
              [72.816316, 18.941052],
              [72.816316, 19.113720],
              [72.816316, 19.113720],
              [72.790224, 19.192205],
              [72.776491, 19.259634]
            ]
          }
        }
      }
    }
  }
}
With the above geo_polygon filter I'm able to get all indexed geo-coordinates that lie within the described polygon, but I also need to know whether a non-indexed geo-coordinate lies within this geo polygon. My doubt is whether that is possible in Elasticsearch 1.4.1.
Yes, Percolator can be used to solve this problem.
In the normal Elasticsearch use case, we index our docs and then run queries against the indexed data to retrieve the matching documents.
Percolators work the other way around.
With percolators you register your queries, and then you percolate your documents through the registered queries and get back the queries that match each document.
After going through countless Google results and many blogs, I wasn't able to find anything that explained how to use percolators to solve this problem.
So I'm explaining it with an example, so that other people facing the same problem can take a hint from it and from the solution I found. I'd be glad if someone can improve this answer or share a better approach.
For example, first of all we need to create an index:
PUT /city/
Then we need to add a mapping for the user document, which holds a user's latitude/longitude for percolating against the registered queries:
PUT /city/user/_mapping
{
  "user" : {
    "properties" : {
      "location" : {
        "type" : "geo_point"
      }
    }
  }
}
Now we can register our geo polygon queries as percolators, with the city name (or any other identifier you want) as the id.
PUT /city/.percolator/mumbai
{
  "query": {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_polygon" : {
          "location" : {
            "points" : [
              [72.776491, 19.259634],
              [72.955705, 19.268060],
              [72.945406, 19.189611],
              [72.987291, 19.169507],
              [72.963945, 19.069596],
              [72.914506, 18.994300],
              [72.873994, 19.007933],
              [72.817689, 18.896882],
              [72.816316, 18.941052],
              [72.816316, 19.113720],
              [72.816316, 19.113720],
              [72.790224, 19.192205],
              [72.776491, 19.259634]
            ]
          }
        }
      }
    }
  }
}
Let's register another geo polygon filter for another city
PUT /city/.percolator/delhi
{
  "query": {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_polygon" : {
          "location" : {
            "points" : [
              [76.846998, 28.865160],
              [77.274092, 28.841104],
              [77.282331, 28.753252],
              [77.482832, 28.596619],
              [77.131269, 28.395064],
              [76.846998, 28.865160]
            ]
          }
        }
      }
    }
  }
}
Now we have registered two queries as percolators, and we can confirm this with the following API call:
GET /city/.percolator/_count
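This should report a count of 2; the response looks roughly like this (shard numbers will vary with your setup):
{
  "count": 2,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  }
}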
Now, to know whether a geo point lies within any of the registered cities, we can percolate a user document using the query below.
GET /city/user/_percolate
{
  "doc": {
    "location" : {
      "lat" : 19.088415,
      "lon" : 72.871248
    }
  }
}
This will return _id "mumbai":
{
  "took": 25,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "total": 1,
  "matches": [
    {
      "_index": "city",
      "_id": "mumbai"
    }
  ]
}
Trying another query with a different lat/lon:
GET /city/user/_percolate
{
  "doc": {
    "location" : {
      "lat" : 28.539933,
      "lon" : 77.331770
    }
  }
}
This will return _id "delhi":
{
  "took": 25,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "total": 1,
  "matches": [
    {
      "_index": "city",
      "_id": "delhi"
    }
  ]
}
Let's run another query with a random lat/lon:
GET /city/user/_percolate
{
  "doc": {
    "location" : {
      "lat" : 18.539933,
      "lon" : 45.331770
    }
  }
}
This query will return no matches:
{
  "took": 5,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "total": 0,
  "matches": []
}

elastic search query filter out ids by wildcard

I'm hoping to create a query that filters out IDs matching a wildcard pattern. For instance, I would like to search everything except documents whose ID contains the word current. Is this possible?
Yes, it is possible using a regexp filter/query. I could not figure out a way to do it directly with the complement option, so I've used a bool must_not to solve your problem for the time being. I'll refine the answer later if possible.
POST <index name>/_search
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "bool": {
      "must_not": [
        {
          "regexp": {
            "ID": {
              "value": ".*current.*"
            }
          }
        }
      ]
    }
  }
}
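If you prefer to keep everything inside the query itself rather than a top-level filter, an equivalent sketch using a bool query with must_not and a regexp query would be:
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "must_not": {
        "regexp": {
          "ID": { "value": ".*current.*" }
        }
      }
    }
  }
}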

How to find out result of elasticsearch parsing a query_string?

Is there a way to find out via the Elasticsearch API how a query_string query is actually parsed? You can work that out manually by looking at the Lucene query syntax, but it would be really nice to see some representation of the actual result the parser produces.
As javanna mentioned in the comments, there's the _validate API. Here's what works on my local Elasticsearch (version 1.6):
curl -XGET 'http://localhost:9201/pl/_validate/query?explain&pretty' -d'
{
  "query": {
    "query_string": {
      "query": "a OR (b AND c) OR (d AND NOT(e or f))",
      "default_field": "t"
    }
  }
}
'
pl is the name of an index on my cluster. Different indices can have different analyzers, which is why query validation is executed in the scope of an index.
The result of the above curl is the following:
{
  "valid" : true,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "explanations" : [ {
    "index" : "pl",
    "valid" : true,
    "explanation" : "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter#ce2d82f1)"
  } ]
}
I made one OR lowercase on purpose, and as you can see in the explanation it is interpreted as a token and not as an operator.
As for interpreting the explanation, the format is similar to the +/- operators of the query_string query:
( and ) characters open and close a bool query
a + prefix means the clause will end up in must
a - prefix means the clause will end up in must_not
no prefix means the clause will end up in should (with default_operator equal to OR)
So the above is equivalent to the following:
{
  "bool" : {
    "should" : [
      {
        "term" : { "t" : "a" }
      },
      {
        "bool": {
          "must": [
            {
              "term" : { "t" : "b" }
            },
            {
              "term" : { "t" : "c" }
            }
          ]
        }
      },
      {
        "bool": {
          "must": {
            "term" : { "t" : "d" }
          },
          "must_not": {
            "bool": {
              "should": [
                {
                  "term" : { "t" : "e" }
                },
                {
                  "term" : { "t" : "or" }
                },
                {
                  "term" : { "t" : "f" }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
I used the _validate API quite heavily to debug complex filtered queries with many conditions. It is especially useful if you want to check how the analyzer tokenized input such as a URL, or whether some filter is cached.
There's also an awesome rewrite parameter that I was not aware of until now, which makes the explanation even more detailed, showing the actual Lucene query that will be executed.
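For example, a sketch against the same index, passing rewrite as a query-string parameter (assuming your Elasticsearch version supports it):
# rewrite=true is assumed to be supported by your Elasticsearch version
curl -XGET 'http://localhost:9201/pl/_validate/query?explain&rewrite=true&pretty' -d'
{
  "query": {
    "query_string": {
      "query": "a OR (b AND c)",
      "default_field": "t"
    }
  }
}
'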

Elasticsearch/Tire text query DSL for excluding certain fields from being searched

I have an Elasticsearch query like the following:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "fields": ["title"],
            "query": "test"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 50,
  "sort": [],
  "facets": {}
}
I am able to execute an Elasticsearch query on certain fields by passing a fields param to query_string as shown above. In my index mapping I have around 50 fields indexed. How do I query all but one field, something like an exclude option for query_string? Is that possible with Tire/Elasticsearch?
I assumed it cannot be done and proceeded to fetch all the mappings and parse the hash, which is not ideal.
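A sketch of that workaround expressed as plain requests: first pull the mapping, then pass every field except the one you want to skip explicitly to query_string (the index name and field names below are hypothetical):
# "myindex" is a hypothetical index name; list all mapped fields
curl -XGET 'http://localhost:9200/myindex/_mapping?pretty'

# "title", "body", "author" stand in for all mapped fields except the excluded one
curl -XPOST 'http://localhost:9200/myindex/_search' -d '
{
  "query": {
    "query_string": {
      "fields": ["title", "body", "author"],
      "query": "test"
    }
  }
}'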