Querying data from Elasticsearch

Querying data from Elasticsearch - sql

Using Elasticsearch 7.*, trying to execute SQL query on an index 'com-prod':
GET /com-prod/_search
{
"script_fields": {
"test1": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime"
}
}
}
}
It gives the output and below as one of the hit successfully:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test1" : [
"29958"
]
}
}
Now, I am trying to increment the ElapsedTime by 2, as below:
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "params._source.ElapsedTime + 2"
}
}
}
}
But its actually adding number 2 to the output, as below:
"hits" : [
{
"_index" : "com-prod",
"_type" : "_doc",
"_id" : "abcd",
"_score" : 1.0,
"fields" : {
"test2" : [
"299582"
]
}
}
Please guide what could be wrong here, and how to get the output as 29960.

You are getting 299582, instead of 29960, because the ElapsedTime field is of string type ("29958"), so when you are adding 2 in this using script, 2 gets appended at the end (similar to concat two strings).
So, in order to solve this issue, you can :
Create a new index, with updated mapping of the ElaspsedTIme field of int type, then reindex the data. Then you can use the same search query as given in the question above.
Convert the string to an int type value, using Integer.parseInt()
GET /com-prod/_search
{
"script_fields": {
"test2": {
"script": {
"lang": "painless",
"source": "Integer.parseInt(params._source.ElapsedTime) + 2"
}
}
}
}

Related

Mapping ElasticSearch apache module field

I am new to ES and I am facing a little problem I am struggling with.
I integrated metricbeat apache module with ES and the it works fine.
The problem is that metricbeat apache module reports the KB of web traffic of apache (field apache.status.total_kbytes), instead I would like to create my own field, the name of which would be "apache.status.total_mbytes).
I am trying to create a new mapping via Dev Console using the followind api commands:
PUT /metricbeat-7.2.0/_mapping
{
"settings":{
},
"mappings" : {
"apache.status.total_mbytes" : {
"full_name" : "apache.status.total_mbytes",
"mapping" : {
"total_mbytes" : {
"type" : "long"
}
}
}
}
}
Still ES returns the following error:
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
},
"status" : 400
}
FYI
The following may shed some light
GET /metricbeat-*/_mapping/field/apache.status.total_kbytes
Returns
{
"metricbeat-7.9.2-2020.10.06-000001" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
},
"metricbeat-7.2.0-2020.10.05-000001" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
}
}
What am I missing? Is the _mapping command wrong?
Thanks in advance,

A working example:
Create new index
PUT /metricbeat-7.2.0
{
"settings": {},
"mappings": {
"properties": {
"apache.status.total_kbytes": {
"type": "long"
}
}
}
}
Then GET metricbeat-7.2.0/_mapping/field/apache.status.total_kbytes will result in (same as your example):
{
"metricbeat-7.2.0" : {
"mappings" : {
"apache.status.total_kbytes" : {
"full_name" : "apache.status.total_kbytes",
"mapping" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
}
}
Now if you want to add a new field to an existing mapping use the API this way:
Update an existing index
PUT /metricbeat-7.2.0/_mapping
{
"properties": {
"total_mbytes": {
"type": "long"
}
}
}
Then GET metricbeat-7.2.0/_mapping will show you the updated mapping:
{
"metricbeat-7.2.0" : {
"mappings" : {
"properties" : {
"apache" : {
"properties" : {
"status" : {
"properties" : {
"total_kbytes" : {
"type" : "long"
}
}
}
}
},
"total_mbytes" : {
"type" : "long"
}
}
}
}
}
Also, take a look at Put Mapping Api

mapper_parsing_exception in Elasticsearch(Reason: No type specified for field [X])

I wanted to provide explicit mapping to the fields in my document, So I defined a mapping for my index demo and It looks like this below:
PUT /demo
{
"mappings": {
"properties": {
"X" : {
"X" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"Sub_X" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
After running the query , I am getting error as :
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "No type specified for field [X]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Failed to parse mapping [_doc]: No type specified for field [X]",
"caused_by" : {
"type" : "mapper_parsing_exception",
"reason" : "No type specified for field [X]"
}
},
"status" : 400
}
The field X in json document looks like :
"X" : {
"X" : [
"a"
],
"Sub_X" : [
[
"b"
]
]
},
Please help me out with this elastic search mapper_parse_exception error.

What you have is called nested data type
You have X which in turn contains X and Sub_X.
Mapping:
{
"properties": {
"X": {
"type": "nested"
}
}
}
Data:
{
"X": {
"X": [
"a"
],
"Sub_X": [
[
"b"
]
]
}
}
Query:
{
"query": {
"nested": {
"path": "X",
"query": {
"bool": {
"must": [
{ "match": { "X.X": "a" }},
{ "match": { "X.Sub_X": "b" }}
]
}
}
}
}
}
It outputs the document.

MongoDB - How to make a query to get the average usage?

Here is how the documents in db look like.
/* 1 */
{
"_id" : 1,
"feat" : {
"processName": [
{
"value" : {
"value": "Process1"
}
}
],
"processUsage": [
{
"value" : {
"value": 23.21
}
}
]
}
}
/* 2 */
{
"_id" : 2,
"feat" : {
"processName": [
{
"value" : {
"value": "Process2"
}
}
],
"memoryUsage": [
{
"value" : {
"value": 2.411502e+05
}
}
]
}
}
/* 3 */
{
"_id" : 3,
"feat" : {
"processName": [
{
"value" : {
"value": "Process1"
}
}
],
"processUsage": [
{
"value" : {
"value": 67.42
}
}
]
}
}
/* 4 */
{
"_id" : 4,
"feat" : {
"processName": [
{
"value" : {
"value": "Process3"
}
}
],
"processUsage": [
{
"value" : {
"value": 39.97
}
}
]
}
}
/* 5 */
{
"_id" : 5,
"feat" : {
"processName": [
{
"value" : {
"value": "Process2"
}
}
],
"processUsage": [
{
"value" : {
"value": 21.05
}
}
]
}
}
Each process has entries with processUsage and memoryUsage. What I am interest in is the average processUsage. So, I'd like to ignore the entries with memoryUsage.
I tried $match + $group in an aggregate with $avg, but for each process I just got back as average 0.00000000.
Then I tried my luck with mapReduce using javascript, unfortunately it did not work out either.
Could someone just show me how to do that? By the way, I am using Robomongo 0.8.5
Edit:
The query looks like this:
db.database.aggregate([
{ $match : {"$feat.processUsage.value.value": {$gt : -1}
},
{
$group: {_id: "$feats.processName.value.value", average: {$avg:
"$feats.processUsage.value.value"}
}
])

You can use the following aggregate query:
db.test.aggregate(
[
{
$unwind : "$feat.processUsage"
},
{
$group: {
_id: "$feat.processName.value.value",
average: {$avg:"$feat.processUsage.value.value"}
}
}
]
)
Unwinding in the initial phase will let you filter documents that has processUsage key in document.
Result:
{ "_id" : [ "Process2" ], "average" : 21.05 }
{ "_id" : [ "Process3" ], "average" : 39.97 }
{ "_id" : [ "Process1" ], "average" : 45.315 }

How to know if a geo coordinate lies within a geo polygon in elasticsearch?

I am using elastic search 1.4.1 - 1.4.4. I'm trying to index a geo polygon shape (document) into my index and now when the shape is indexed i want to know if a geo coordinate lies within the boundaries of that particular indexed geo-polygon shape.
GET /city/_search
{
"query":{
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"geo_polygon" : {
"location" : {
"points" : [
[72.776491, 19.259634],
[72.955705, 19.268060],
[72.945406, 19.189611],
[72.987291, 19.169507],
[72.963945, 19.069596],
[72.914506, 18.994300],
[72.873994, 19.007933],
[72.817689, 18.896882],
[72.816316, 18.941052],
[72.816316, 19.113720],
[72.816316, 19.113720],
[72.790224, 19.192205],
[72.776491, 19.259634]
]
}
}
}
}
}
}
With above geo polygon filter i'm able get all indexed geo-coordinates lies within described polygon but i also need to know if a non-indexed geo-coordinate lies with in this geo polygon or not. My doubt is that if that is possible in the elastic search 1.4.1.

Yes, Percolator can be used to solve this problem.
As in normal use case of Elasticsearch, we index our docs into elasticsearch and then we run queries on indexed data to retrieve matched/ required documents.
But percolators works in a different way of it.
In percolators you register your queries and then you percolate your documents through registered queries and gets back the queries which matches your documents.
After going through infinite number of google results and many of blogs i wasn't able to find any thing which could explain how i can use percolators to solve this problem.
So i'm explaining this with an example so that other people facing same problem can take a hint from my problem and the solution i found. I would like if someone can improve my answer or can share a better approach of doing it.
e.g:-
First of all we need to create an index.
PUT /city/
then, we need to add a mapping for user document which consist a user's
latitude-longitude for percolating against registered queries.
PUT /city/user/_mapping
{
"user" : {
"properties" : {
"location" : {
"type" : "geo_point"
}
}
}
}
Now, we can register our geo polygon queries as percolators with id as city name or any other identifier you want to.
PUT /city/.percolator/mumbai
{
"query":{
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"geo_polygon" : {
"location" : {
"points" : [
[72.776491, 19.259634],
[72.955705, 19.268060],
[72.945406, 19.189611],
[72.987291, 19.169507],
[72.963945, 19.069596],
[72.914506, 18.994300],
[72.873994, 19.007933],
[72.817689, 18.896882],
[72.816316, 18.941052],
[72.816316, 19.113720],
[72.816316, 19.113720],
[72.790224, 19.192205],
[72.776491, 19.259634]
]
}
}
}
}
}
}
Let's register another geo polygon filter for another city
PUT /city/.percolator/delhi
{
"query":{
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"geo_polygon" : {
"location" : {
"points" : [
[76.846998, 28.865160],
[77.274092, 28.841104],
[77.282331, 28.753252],
[77.482832, 28.596619],
[77.131269, 28.395064],
[76.846998, 28.865160]
]
}
}
}
}
}
}
Now we have registered 2 queries as percolators and we can make sure by making this API call.
GET /city/.percolator/_count
Now to know if a geo point exist with any of registered cities we can percolate a user document using below query.
GET /city/user/_percolate
{
"doc": {
"location" : {
"lat" : 19.088415,
"lon" : 72.871248
}
}
}
This will return : _id as "mumbai"
{
"took": 25,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"total": 1,
"matches": [
{
"_index": "city",
"_id": "mumbai"
}
]
}
trying another query with different lat-lon
GET /city/user/_percolate
{
"doc": {
"location" : {
"lat" : 28.539933,
"lon" : 77.331770
}
}
}
This will return : _id as "delhi"
{
"took": 25,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"total": 1,
"matches": [
{
"_index": "city",
"_id": "delhi"
}
]
}
Let's run another query with random lat-lon
GET /city/user/_percolate
{
"doc": {
"location" : {
"lat" : 18.539933,
"lon" : 45.331770
}
}
}
and this query will return no matched results.
{
"took": 5,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"total": 0,
"matches": []
}

Sorting ElasricSearch based on size of array type field

I have a ElasticSearch cluster on which I have to perform a sort query based on the size of the object array field 'contents'.
So far I have tried,
{
"size": 10,
"from": 0,
"fields" : ['name'],
"query": {
"match_all": {}
},
"sort" : {
"script" : {
"script" : "doc['contents'].values.length",
"order": "desc"
}
}
}
The above query gives me SearchPhaseExecutionException. The ES query is made from client side using elasticsearch.angular.js.
Any kind of help will be appreciate.

The security has changed for scripts in versions 1.2.x. In ES_HOME/config/scripts create a file called script_score.mvel and add the script:
doc.containsKey('content') == false ? 0 : doc['content'].values.size()
Restart Elasticsearch and change your query to:
{
"size": 10,
"from": 0,
"query": {
"match_all": {}
},
"sort": {
"_script": {
"script": "script_score",
"order": "desc",
"type" : "string"
}
}
}
For more information take a look here:
http://www.elasticsearch.org/blog/scripting-security/

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Querying data from Elasticsearch - sql

Related

Mapping ElasticSearch apache module field

mapper_parsing_exception in Elasticsearch(Reason: No type specified for field [X])

MongoDB - How to make a query to get the average usage?

How to know if a geo coordinate lies within a geo polygon in elasticsearch?

Sorting ElasricSearch based on size of array type field

Categories

Resources