Elasticsearch sum aggregation - elasticsearch-aggregation

There are documents with category (number) and piece (long) fields. I need some kind of aggregation that groups these docs by category and sums all the pieces in each group.
Here is how the documents look:
{
"category":"1",
"piece":"10.000"
},
{
"category":"2",
"piece":"15.000"
} ,
{
"category":"2",
"piece":"10.000"
},
{
"category":"3",
"piece":"5.000"
}
...
The query result must be like:
{
"category":"1",
"total_amount":"10.000"
}
{
"category":"2",
"total_amount":"25.000"
}
{
"category":"3",
"total_amount":"5.000"
}
..
Any advice?

You want a terms aggregation on category. The records above send category as a string, so its values are treated as discrete terms, and each distinct value gets its own bucket. Then add a sum on piece as the metric inside each bucket. Note that sum only works on numeric fields, so piece has to be mapped as a number (the question says it is a long).
This is how it might look:
"aggs" : {
  "categories" : {
    "terms" : {
      "field" : "category"
    },
    "aggs" : {
      "sum_category" : {
        "sum": { "field": "piece" }
      }
    }
  }
}
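To make the bucket-and-sum behaviour concrete, here is a minimal Python sketch. It builds the same aggregation as a request body and then mimics locally what terms + sum would compute for the sample documents. The helper name sum_by_category, the "size": 0 setting, and the numeric piece values are assumptions for illustration; no Elasticsearch client is involved.

```python
from collections import defaultdict

# Request body wrapping the aggregation; "size": 0 (an assumption) skips the hits.
body = {
    "size": 0,
    "aggs": {
        "categories": {
            "terms": {"field": "category"},
            "aggs": {"sum_category": {"sum": {"field": "piece"}}},
        }
    },
}

# Sample documents from the question, with piece assumed to be stored as a number.
docs = [
    {"category": "1", "piece": 10.0},
    {"category": "2", "piece": 15.0},
    {"category": "2", "piece": 10.0},
    {"category": "3", "piece": 5.0},
]


def sum_by_category(documents):
    """Local stand-in for terms + sum: one running total per category value."""
    totals = defaultdict(float)
    for doc in documents:
        totals[doc["category"]] += doc["piece"]
    return dict(totals)


totals = sum_by_category(docs)
```

In a real response, each entry of aggregations.categories.buckets would carry the category in key and the total in sum_category.value.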

Related

Convert SQL query to Java Elasticsearch Query with grails plugin

How can I convert the following SQL query to Java Elasticsearch Query?
SELECT s.*
FROM supplier_detail_info sdi
JOIN supplier s ON sdi.supplier_id = s.id
JOIN supplier_detail sd ON sdi.supplier_detail_id = sd.id
WHERE (sdi.value = '70' AND sd.id = 1 ) OR (sdi.value = '46' and sd.id = 4);
I have tried the following (excluding the OR clause from above) but failed:
def query = QueryBuilders.boolQuery().
must(QueryBuilders.nestedQuery('supplierDetailInfos', QueryBuilders.matchQuery("supplierDetailInfos.value", '46')))
.must(QueryBuilders.nestedQuery('supplierDetailInfos.supplierDetail', QueryBuilders.matchQuery("supplierDetailInfos.supplierDetail.id", 4)))
This resulted in:
{
"bool" : {
"must" : [ {
"nested" : {
"query" : {
"match" : {
"supplierDetailInfos.value" : {
"query" : "46",
"type" : "boolean"
}
}
},
"path" : "supplierDetailInfos"
}
}, {
"nested" : {
"query" : {
"match" : {
"supplierDetailInfos.supplierDetail.id" : {
"query" : 4,
"type" : "boolean"
}
}
},
"path" : "supplierDetailInfos.supplierDetail"
}
} ]
}
}
I'm using grails 2.3.7 and have the following domains:
class Supplier {
String name
....
static hasMany = [supplierDetailInfos : SupplierDetailInfo]
static searchable = {
only = ['name', 'supplierDetailInfos']
supplierDetailInfos component: true
}
}
class SupplierDetailInfo {
SupplierDetail supplierDetail
Supplier supplier
String value
static searchable = {
only = ['id', 'supplierDetail', 'value']
supplierDetail component: true
}
}
class SupplierDetail {
String name
....
static searchable = {
only = ['id']
}
}
I'm using elasticsearch 2.3.5 and the elastic search plugin "org.grails.plugins:elasticsearch:0.1.0".
From what I understand, must is equivalent to AND and should is equivalent to OR.
The above query should have returned only those suppliers for which the value of supplier_detail_info is '46' when the supplier_detail id is 4. But it returns all suppliers for whom there exists a supplier detail info with value '46'. It simply ignores the 'and sd.id = 4' part of the query. Also, I cannot figure out how to use the OR part.
Instead of a top-level bool query, try a single nested query with a bool inner query and two must clauses, so that both conditions have to hold for the same nested object. Also, I think the match queries should be term queries, since these look like keyword/long fields rather than text fields. If you want OR, make the must a should clause.
{
"query":{
"nested":{
"path":"supplierDetailInfos",
"query":{
"bool":{
"must":[
{
"match":{
"supplierDetailInfos.value":"46"
}
},
{
"match":{
"supplierDetailInfos.supplierDetail.id":4
}
}
]
}
}
}
}
}
I found the solution to my problem thanks to @sramalingam24.
The elasticsearch query can be written as follows:
{
  "query": {
    "nested": {
      "path": "supplierDetailInfos",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "supplierDetailInfos.value": "46"
              }
            },
            {
              "nested": {
                "path": "supplierDetailInfos.supplierDetail",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "match": {
                          "supplierDetailInfos.supplierDetail.id": 4
                        }
                      }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}
The equivalent query can be written using QueryBuilders as follows:
def supplierDetailQuery = QueryBuilders.boolQuery()
def supplierDetailIdQuery = QueryBuilders.nestedQuery('supplierDetailInfos.supplierDetail',
QueryBuilders.boolQuery()
.must(QueryBuilders.matchQuery("supplierDetailInfos.supplierDetail.id", id)))
def supplierDetailIdAndValueQuery = QueryBuilders.boolQuery()
.must(QueryBuilders.matchQuery("supplierDetailInfos.value", value))
.must(supplierDetailIdQuery)
supplierDetailQuery.must(QueryBuilders.nestedQuery('supplierDetailInfos',
supplierDetailIdAndValueQuery))
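The OR part of the original SQL is not covered above. As a hedged sketch, following the earlier hint to use should for OR, the full condition could be assembled like this in Python. The function names value_and_detail and any_of are hypothetical, and the block only builds the request body as a dict; it assumes both supplierDetailInfos and supplierDetailInfos.supplierDetail are mapped as nested, as in the solution above.

```python
import json


def value_and_detail(value, detail_id):
    """One (sdi.value = X AND sd.id = Y) pair, evaluated on the same nested object."""
    return {
        "bool": {
            "must": [
                {"match": {"supplierDetailInfos.value": value}},
                {
                    "nested": {
                        "path": "supplierDetailInfos.supplierDetail",
                        "query": {
                            "match": {
                                "supplierDetailInfos.supplierDetail.id": detail_id
                            }
                        },
                    }
                },
            ]
        }
    }


def any_of(pairs):
    """OR the pairs together: a supplier matches if at least one pair matches."""
    return {
        "query": {
            "nested": {
                "path": "supplierDetailInfos",
                "query": {
                    "bool": {
                        "should": [value_and_detail(v, i) for v, i in pairs],
                        "minimum_should_match": 1,
                    }
                },
            }
        }
    }


# (sdi.value = '70' AND sd.id = 1) OR (sdi.value = '46' AND sd.id = 4)
query = any_of([("70", 1), ("46", 4)])
print(json.dumps(query, indent=2))
```

The same shape maps onto QueryBuilders by calling should(...) once per pair on the inner boolQuery instead of must(...).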

how to count number of keys in embedded mongodb document

I have a MongoDB query (give me the settings where account = 'test1'):
db.collection_name.find({"account" : "test1"}, {settings : 1}).pretty();
where I get the following sample output:
{
"_id" : ObjectId("49830ede4bz08bc0b495f123"),
"settings" : {
"clusterData" : {
"us-south-1" : "cluster1",
"us-east-1" : "cluster2"
},
},
What I'm looking for now is a query that gives me the accounts where clusterData has more than one key.
I'm only interested in listing those accounts with two or more keys.
I've tried this: (but this doesn't work)
db.collection_name.find({'settings.clusterData.1': {$exists: true}}, {account : 1}).pretty();
Is this possible to do with the current data structure? I don't have the option to redesign this schema.
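Your clusterData field is an embedded document, not an array, which is why you cannot filter on its number of elements with a positional path such as settings.clusterData.1. There is a way, though, to get what you want via the aggregation framework: $objectToArray (available since MongoDB 3.4.4) turns the document into an array of key/value pairs, and $size counts them. Try this:
db.collection_name.aggregate([
  // Restrict to one account; drop this stage to check every account.
  { $match: { "account" : "test1" } },
  // Turn clusterData into an array of key/value pairs and count them.
  { $project: {
      "settingsAsArraySize": { $size: { $objectToArray: "$settings.clusterData" } },
      "settings.clusterData": 1
  } },
  // Keep only documents with two or more keys.
  { $match: { "settingsAsArraySize": { $gt: 1 } } },
  // Return just the clusterData field.
  { $project: { "_id": 0, "settings.clusterData": 1 } }
]).pretty();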
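Since the question asks for the accounts rather than the settings, a variant of the pipeline that projects account may be closer to the goal. The Python sketch below writes that variant as a plain list of stages (the form a driver such as PyMongo would accept) and checks the key-counting logic locally. The second document (test2), the field name keyCount, and the helper key_count are hypothetical, and the pipeline assumes every document has settings.clusterData, because $size fails on a missing field.

```python
# Aggregation stages as plain dicts; assumes settings.clusterData exists everywhere.
pipeline = [
    {
        "$project": {
            "account": 1,
            "keyCount": {"$size": {"$objectToArray": "$settings.clusterData"}},
        }
    },
    {"$match": {"keyCount": {"$gt": 1}}},
    {"$project": {"_id": 0, "account": 1}},
]

# Local sample data; the test2 document is made up for illustration.
accounts = [
    {
        "account": "test1",
        "settings": {
            "clusterData": {"us-south-1": "cluster1", "us-east-1": "cluster2"}
        },
    },
    {"account": "test2", "settings": {"clusterData": {"us-south-1": "cluster1"}}},
]


def key_count(doc):
    """Number of keys in settings.clusterData, 0 if the field is absent."""
    return len(doc.get("settings", {}).get("clusterData", {}))


# Same filter as the pipeline, applied in plain Python.
matching = [doc["account"] for doc in accounts if key_count(doc) > 1]
```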

Elastic search aggregation

Is there a way to return only one product when it exists in several colors?
E.g., suppose I have products with the following properties:
brand,color,title
nike, red, air max
nike, blue, air max
Now I want to create an Elasticsearch query that returns only one product from the aggregation but counts it as two belonging to brand nike.
{
"query" : {
"match_all" : {}
},
"aggs" : {
"brand" : {
"terms" : {
"field" : "brand"
},
"aggs" : {
"size" : {
"terms" : {
"field" : "title"
}
}
}
}
}
}
I am not able to get the desired results. What I want is something like: select name,color,title, count(*) title from product group by name,title
I think you want to get one document per group, aggregated by name and title.
This can be done using the top_hits aggregation, nested under the title buckets.
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "brand": {
      "terms": {
        "field": "name"
      },
      "aggs": {
        "size": {
          "terms": {
            "field": "title"
          },
          "aggs": {
            "one_product": {
              "top_hits": {
                "_source": [ "name", "color", "brand" ],
                "size": 1
              }
            }
          }
        }
      }
    }
  }
}
For count, there is always doc_count in returned buckets.
Hope this helps!! If I am missing something, do mention.
Thanks
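As a rough illustration of what this returns, the Python sketch below groups the two sample products by brand and title, keeps one representative document per group, and counts the members, which is what the title buckets' doc_count and a size-1 top_hits provide. The function name group_with_sample is hypothetical, and the local version simply keeps the first document seen, whereas top_hits picks by score unless a sort is given.

```python
# The two sample products from the question.
products = [
    {"brand": "nike", "color": "red", "title": "air max"},
    {"brand": "nike", "color": "blue", "title": "air max"},
]


def group_with_sample(items):
    """Group by (brand, title); keep a count and the first document per group."""
    groups = {}
    for item in items:
        key = (item["brand"], item["title"])
        bucket = groups.setdefault(key, {"doc_count": 0, "sample": item})
        bucket["doc_count"] += 1
    return groups


groups = group_with_sample(products)
for (brand, title), bucket in groups.items():
    print(brand, title, bucket["doc_count"], bucket["sample"]["color"])
```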

ElasticSearch facet: how do I do this?

I have an ElasticSearch index that contains several million products with over a thousand brands.
What query would I have to use to get a list of all the brands in the index?
Sample product entry:
{
_index: main
_type: one
_id: LA37dcdc7D70QygoV4KjfRU0hqUDhPs=
_version: 4
_score: 1
_source: {
pid: S2dcdcd528950_C243
mid: 6540
url: http://being.successfultogether.co.uk/
price: 4
currency: GBP
brand: Reebok
store: Matalan
}
}
Here is an example of generating facets against a selected field within your documents -
curl -XGET '<host>:9200/indices/type/_search' -d '
{
  "query": {
    "match": {
      "store": "Matalan"
    }
  },
  "facets": {
    "brand": {
      "terms": {
        "field": "brand"
      }
    }
  }
}'
I think the all terms facet will get every term for a field:
POST /_all/_search
{
"query" : {
"match_all" : { }
},
"facets" : {
"tag" : {
"terms" : {
"field" : "stub",
"all_terms" : true
}
}
}
}
A terms aggregation (ES 1.0 style, shown below) with a very high size will probably return every term and its count, but it is not efficient, and it is not guaranteed to return them all.
You can read more about size and shard size params with aggregations/faceting here:
Elasticsearch Doco 1.0
POST /_all/_search
{
"aggs" : {
"genders" : {
"terms" : {
"field" : "stub",
"size":1000
}
}
},
"size":0
}
Also, there are faceting plugins that return every term for a field as a list; see here:
Approx Plugin
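For the original question (a list of all brands), a small Python sketch of the aggregation approach could look like the following. It builds the request body and pulls the bucket keys out of a response. The aggregation name brands, both function names, and the sample response are hypothetical; only the aggregations/buckets/key/doc_count response layout is taken from how terms aggregations report their results.

```python
def brand_request(size=1000):
    """Terms aggregation on brand; size caps how many distinct terms come back."""
    return {
        "size": 0,  # no search hits needed, only the buckets
        "aggs": {"brands": {"terms": {"field": "brand", "size": size}}},
    }


def bucket_keys(response, agg_name="brands"):
    """Extract the distinct terms from a terms-aggregation response."""
    return [bucket["key"] for bucket in response["aggregations"][agg_name]["buckets"]]


# Hand-written response in the shape Elasticsearch uses, for illustration only.
sample_response = {
    "aggregations": {
        "brands": {
            "buckets": [
                {"key": "Reebok", "doc_count": 3},
                {"key": "Nike", "doc_count": 2},
            ]
        }
    }
}

body = brand_request(size=2000)
brands = bucket_keys(sample_response)
```

With over a thousand brands, size has to be set above the expected number of distinct values, which is the inefficiency mentioned above.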

How can I query elasticsearch for only one type of record?

I am issuing a query to elasticsearch and I am getting multiple record types. How do I limit the results to one type?
The following query will limit results to records with the type "your_type":
curl -XGET 'http://localhost:9200/_all/your_type/_search?q=your_query'
See http://www.elasticsearch.org/guide/reference/api/search/indices-types.html for more details.
You can also use query dsl to filter out results for specific type like this:
$ curl -XGET 'http://localhost:9200/_search' -d '{
"query": {
"filtered" : {
"filter" : {
"type" : { "value" : "my_type" }
}
}
}
}
'
Update for version 6.1:
Type filter is now replaced by Type Query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-type-query.html
You can use that in both Query and Filter contexts.
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"must" :[{"term":{"_type":"UserAudit"}}, {"term" : {"eventType": "REGISTRATION"}}]
}
}
}
},
"aggs":{
"monthly":{
"date_histogram":{
"field":"timestamp",
"interval":"1y"
},
"aggs":{
"existing_visitor":{
"terms":{
"field":"existingGuest"
}
}
}
}
}
}
The "_type":"UserAudit" condition restricts the results to records of that type.
On version 2.3 you can query the _type field like this:
{
"query": {
"terms": {
"_type": [ "type_1", "type_2" ]
}
}
}
Or if you want to exclude a type:
{
"query": {
"bool" : {
"must_not" : {
"term" : {
"_type" : "Hassan"
}
}
}
}
}
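Several of the snippets above rely on the filtered query, which was deprecated in Elasticsearch 2.0 and removed in 5.0. As a hedged sketch for versions that still expose _type, the same restriction can be written as a bool query with a filter clause. The helper name type_query is hypothetical, and the block only builds the request body as a Python dict.

```python
def type_query(doc_type, extra_terms=None):
    """bool/filter body restricting results to one _type plus optional term filters."""
    filters = [{"term": {"_type": doc_type}}]
    for field, value in (extra_terms or {}).items():
        filters.append({"term": {field: value}})
    return {"query": {"bool": {"filter": filters}}}


# Same conditions as the UserAudit example above, without the filtered wrapper.
query = type_query("UserAudit", {"eventType": "REGISTRATION"})
```

Filter clauses do not contribute to scoring, which suits an exact match on a type name.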