How can I query elasticsearch for only one type of record? - api

I am issuing a query to elasticsearch and I am getting multiple record types. How do I limit the results to one type?

The following query will limit results to records with the type "your_type":
curl -XGET 'http://localhost:9200/_all/your_type/_search?q=your_query'
See http://www.elasticsearch.org/guide/reference/api/search/indices-types.html for more details.
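If you want to hit a specific index rather than all of them, the same pattern works with the index name in the path. This is only a sketch with placeholder names (my_index, my_type) and a request body instead of the q parameter:

# my_index and my_type are placeholder names
curl -XGET 'http://localhost:9200/my_index/my_type/_search' -d '{
  "query": {
    "query_string": { "query": "your_query" }
  }
}'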

You can also use the query DSL to restrict results to a specific type, like this:
$ curl -XGET 'http://localhost:9200/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "type": { "value": "my_type" }
      }
    }
  }
}
'
Update for version 6.1:
Type filter is now replaced by Type Query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-type-query.html
You can use that in both Query and Filter contexts.
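For example, the equivalent of the filter above expressed as a type query (my_type is a placeholder type name) would look roughly like this:

{
  "query": {
    "type": {
      "value": "my_type"
    }
  }
}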

{
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : [
            { "term" : { "_type" : "UserAudit" } },
            { "term" : { "eventType" : "REGISTRATION" } }
          ]
        }
      }
    }
  },
  "aggs" : {
    "monthly" : {
      "date_histogram" : {
        "field" : "timestamp",
        "interval" : "1y"
      },
      "aggs" : {
        "existing_visitor" : {
          "terms" : {
            "field" : "existingGuest"
          }
        }
      }
    }
  }
}
"_type":"UserAudit" condition will look the records only specific to type

On version 2.3 you can query the _type field like this:
{
  "query": {
    "terms": {
      "_type": [ "type_1", "type_2" ]
    }
  }
}
Or if you want to exclude a type:
{
  "query": {
    "bool" : {
      "must_not" : {
        "term" : {
          "_type" : "Hassan"
        }
      }
    }
  }
}

Related

Mapping ElasticSearch apache module field

I am new to ES and I am facing a little problem I am struggling with.
I integrated the metricbeat apache module with ES and it works fine.
The problem is that the metricbeat apache module reports the web traffic of apache in KB (field apache.status.total_kbytes); instead, I would like to create my own field, the name of which would be "apache.status.total_mbytes".
I am trying to create a new mapping via Dev Console using the following API commands:
PUT /metricbeat-7.2.0/_mapping
{
  "settings": {},
  "mappings": {
    "apache.status.total_mbytes": {
      "full_name": "apache.status.total_mbytes",
      "mapping": {
        "total_mbytes": {
          "type": "long"
        }
      }
    }
  }
}
Still ES returns the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "Root mapping definition has unsupported parameters: [settings : {}] [mappings : {apache.status.total_mbytes={mapping={total_mbytes={type=long}}, full_name=apache.status.total_mbytes}}]"
  },
  "status" : 400
}
FYI, the following may shed some light:
GET /metricbeat-*/_mapping/field/apache.status.total_kbytes
Returns
{
  "metricbeat-7.9.2-2020.10.06-000001" : {
    "mappings" : {
      "apache.status.total_kbytes" : {
        "full_name" : "apache.status.total_kbytes",
        "mapping" : {
          "total_kbytes" : {
            "type" : "long"
          }
        }
      }
    }
  },
  "metricbeat-7.2.0-2020.10.05-000001" : {
    "mappings" : {
      "apache.status.total_kbytes" : {
        "full_name" : "apache.status.total_kbytes",
        "mapping" : {
          "total_kbytes" : {
            "type" : "long"
          }
        }
      }
    }
  }
}
What am I missing? Is the _mapping command wrong?
Thanks in advance,
A working example:
Create new index
PUT /metricbeat-7.2.0
{
  "settings": {},
  "mappings": {
    "properties": {
      "apache.status.total_kbytes": {
        "type": "long"
      }
    }
  }
}
Then GET metricbeat-7.2.0/_mapping/field/apache.status.total_kbytes will result in (same as your example):
{
  "metricbeat-7.2.0" : {
    "mappings" : {
      "apache.status.total_kbytes" : {
        "full_name" : "apache.status.total_kbytes",
        "mapping" : {
          "total_kbytes" : {
            "type" : "long"
          }
        }
      }
    }
  }
}
Now if you want to add a new field to an existing mapping use the API this way:
Update an existing index
PUT /metricbeat-7.2.0/_mapping
{
  "properties": {
    "total_mbytes": {
      "type": "long"
    }
  }
}
Then GET metricbeat-7.2.0/_mapping will show you the updated mapping:
{
  "metricbeat-7.2.0" : {
    "mappings" : {
      "properties" : {
        "apache" : {
          "properties" : {
            "status" : {
              "properties" : {
                "total_kbytes" : {
                  "type" : "long"
                }
              }
            }
          }
        },
        "total_mbytes" : {
          "type" : "long"
        }
      }
    }
  }
}
Also, take a look at the Put Mapping API.
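If the goal is to have the new field sit under apache.status like the existing one, a dotted field name in properties should do it. This is only a sketch against the same index, not something the original answer shows:

PUT /metricbeat-7.2.0/_mapping
{
  "properties": {
    "apache.status.total_mbytes": {
      "type": "long"
    }
  }
}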

What is the default doc sequence of the result from an Elasticsearch filter request?

I recently ran an Elasticsearch filter request:
{
  "from" : 0,
  "size" : 10,
  "query" : {
    "filtered" : {
      "filter" : {
        "bool" : {
          "must" : {
            "terms" : {
              "a_id" : [ 257793, 257798, 257844 ]
            }
          }
        }
      }
    }
  },
  "explain" : false,
  "fields" : "a_id"
}
This finds all docs with a_id in 257793, 257798, 257844, and the results come back as 257844, 257798, 257793. So far so good.
Then I find that whatever the order of the term numbers is, the returned docs are always in the same a_id order. That is, even if I run
"terms" : {
"a_id" : [257798, 257844, 257793 ]
}
The result docs are in the order 257844, 257798, 257793 as well.
So I am curious about the mechanism behind Elasticsearch filtering. Can anyone give me a hint?
By default, ES returns results in descending order of _score. You can provide the sort option to say in which order, and based on which fields, you want the results returned. For example, to sort based on a date field:
{
  "sort": { "date": { "order": "desc" } },
  "query": {
    "term": { "user": "kimchy" }
  }
}
You can get more information:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-sort.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_sorting.html
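If you want the documents back in an explicit order, for example ascending by a_id, you could add a sort to the original request. This is only a sketch that reuses the field names from the question and the older filtered query syntax used there:

{
  "from": 0,
  "size": 10,
  "sort": [
    { "a_id": { "order": "asc" } }
  ],
  "query": {
    "filtered": {
      "filter": {
        "terms": { "a_id": [ 257793, 257798, 257844 ] }
      }
    }
  }
}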

Elasticsearch sum aggregation

There are documents with category (number) and piece (long) fields. I need some kind of aggregation that groups these docs by category and sums all the pieces in each group.
Here is how the documents look:
{
  "category": "1",
  "piece": "10.000"
},
{
  "category": "2",
  "piece": "15.000"
},
{
  "category": "2",
  "piece": "10.000"
},
{
  "category": "3",
  "piece": "5.000"
}
...
The query result should look like this:
{
  "category": "1",
  "total_amount": "10.000"
}
{
  "category": "2",
  "total_amount": "25.000"
}
{
  "category": "3",
  "total_amount": "5.000"
}
..
Any advice?
You want to do a terms aggregation on the categories; from the records above I see that you are sending them as strings, so they will be treated as categorical values. Then, as a metric, add a sum sub-aggregation.
This is how it might look:
"aggs" : {
"categories" : {
"terms" : {
"field" : "category"
},
"aggs" : {
"sum_category" : {
"sum": { "field": "piece" }
}
}
}
}
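Wrapped in a full search request, with size set to 0 so that only the aggregation buckets come back, it might look like this (my_index is a placeholder index name):

# my_index is a placeholder index name
POST /my_index/_search
{
  "size": 0,
  "aggs": {
    "categories": {
      "terms": { "field": "category" },
      "aggs": {
        "sum_category": {
          "sum": { "field": "piece" }
        }
      }
    }
  }
}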

Elasticsearch "not" filter inside an "and" filter

I am trying to add a "not" filter inside an "and" filter.
Sample input:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "error",
          "fields": [
            "request"
          ]
        }
      },
      "filter": {
        and: [
          {
            "terms": {
              "hashtag": [
                "br2"
              ]
            },
            "not": {
              "terms": {
                "hashtag": [
                  "br1"
                ]
              }
            }
          }
        ]
      }
    }
  },
}
But the above gives an error; I also tried various combinations, but in vain.
The above is just an example; in short, I need a query in which both an "and" and a "not" filter are present.
You forgot the "filters" array.
Write it like this:
{
  "from" : 0,
  "size" : 25,
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "and" : {
          "filters" : [
            {
              "term" : {
                "field1" : "val1"
              }
            },
            {
              "not" : {
                "filter" : {
                  "term" : {
                    "field2" : "val2"
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}
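On more recent versions the and/not filters are deprecated in favour of bool, so roughly the same query could be written like this. This is only a sketch, not part of the original answer:

{
  "query": {
    "bool": {
      "must": [
        { "terms": { "hashtag": [ "br2" ] } }
      ],
      "must_not": [
        { "terms": { "hashtag": [ "br1" ] } }
      ]
    }
  }
}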

How to find out result of elasticsearch parsing a query_string?

Is there a way to find out via the Elasticsearch API how a query string query is actually parsed? You can work that out manually by looking at the Lucene query syntax, but it would be really nice if you could look at some representation of what the parser actually produces.
As javanna mentioned in the comments, there's the _validate API. Here's what works on my local Elasticsearch (version 1.6):
curl -XGET 'http://localhost:9201/pl/_validate/query?explain&pretty' -d'
{
  "query": {
    "query_string": {
      "query": "a OR (b AND c) OR (d AND NOT(e or f))",
      "default_field": "t"
    }
  }
}
'
pl is the name of an index on my cluster. Different indices can have different analyzers, which is why query validation is executed in the scope of an index.
The result of the above curl is the following:
{
  "valid" : true,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "explanations" : [ {
    "index" : "pl",
    "valid" : true,
    "explanation" : "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter#ce2d82f1)"
  } ]
}
I made one OR lowercase on purpose, and as you can see in the explanation, it is interpreted as a token and not as an operator.
As for interpreting the explanation, the format is similar to the +/- operators of the query string query:
( and ) characters start and end a bool query
a + prefix means the clause will go in must
a - prefix means the clause will go in must_not
no prefix means the clause will go in should (with default_operator equal to OR)
So the above is equivalent to the following:
{
  "bool" : {
    "should" : [
      {
        "term" : { "t" : "a" }
      },
      {
        "bool" : {
          "must" : [
            {
              "term" : { "t" : "b" }
            },
            {
              "term" : { "t" : "c" }
            }
          ]
        }
      },
      {
        "bool" : {
          "must" : {
            "term" : { "t" : "d" }
          },
          "must_not" : {
            "bool" : {
              "should" : [
                {
                  "term" : { "t" : "e" }
                },
                {
                  "term" : { "t" : "or" }
                },
                {
                  "term" : { "t" : "f" }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
I used the _validate API quite heavily to debug complex filtered queries with many conditions. It is especially useful if you want to check how the analyzer tokenized input such as a URL, or whether some filter is cached.
There's also an awesome rewrite parameter that I was not aware of until now, which makes the explanation even more detailed, showing the actual Lucene query that will be executed.
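For example, the same validate call with rewrite enabled might look like this on reasonably recent versions (pl is the same index as above; this sketch is not from the original answer):

# pl is the index used in the earlier example; rewrite=true asks for the rewritten Lucene query
curl -XGET 'http://localhost:9201/pl/_validate/query?explain&rewrite=true&pretty' -d'
{
  "query": {
    "query_string": {
      "query": "a OR (b AND c)",
      "default_field": "t"
    }
  }
}
'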