Elasticsearch NEST combination filter issue

I'm using Elasticsearch NEST within an ASP.NET MVC application.
The following Elasticsearch query throws an exception because fields like categories and brands could be null or empty. How do I add if statements and build the filters conditionally? Thank you!
I have to use bool & must to combine (AND) filters for the search criteria. As an example, a user wants products in the "shoes" category from the retailer "macys".
s.Query(q => q
    .Bool(bq => bq
        .Must(
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("Categories", request.Categories))
            ),
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("BrandName", request.Brands))
            ),
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("RetailerName", request.Retailers))
            ),
            mq => mq.Range(r => r
                .OnField("SellingPrice")
                .GreaterOrEquals((double)request.PriceRanges[0].Start)
                .LowerOrEquals((double)request.PriceRanges[0].End)
            )
        )
    )
);

You don't have to worry about null or empty values passed to queries, because NEST has a feature called Conditionless Queries. The documentation says:
If any of the queries would result in an empty query they won't be sent to Elasticsearch.
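NEST does this for you in C#; purely to illustrate the "conditionless" idea, here is how the same guard could be sketched by hand in plain Python (hypothetical helper names, not part of any library):

```python
# Illustration of the "conditionless" idea: clauses built from null/empty
# inputs are simply dropped before the query body is assembled.
def terms_clause(field, values):
    # A terms filter over no values is conditionless -> skip it.
    if not values:
        return None
    return {"terms": {field: values}}

def bool_must(*clauses):
    # Keep only the clauses that actually have content.
    return {"bool": {"must": [c for c in clauses if c is not None]}}

query = bool_must(
    terms_clause("Categories", None),      # dropped
    terms_clause("BrandName", ["brand"]),  # kept
)
print(query)  # {'bool': {'must': [{'terms': {'BrandName': ['brand']}}]}}
```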
The cause of the exception is these lines of code:
mq => mq.Range(r => r
    .OnField("SellingPrice")
    .GreaterOrEquals((double)request.PriceRanges[0].Start)
    .LowerOrEquals((double)request.PriceRanges[0].End)
)
Probably PriceRanges is null or empty, and you are trying to access the Start and End properties of its first element. It would be best to change the request class to something like the one below, assuming you only use the first item from PriceRanges:
class Request
{
    public List<string> Categories { get; set; }
    public List<string> Brands { get; set; }
    public List<string> Retailers { get; set; }
    public double? PriceRangeStart { get; set; }
    public double? PriceRangeEnd { get; set; }
}
Then your NEST query will look like:
s.Query(q => q
    .Bool(bq => bq
        .Must(
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("Categories", request.Categories))
            ),
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("BrandName", request.Brands))
            ),
            mq => mq.Filtered(fq => fq
                .Filter(f => f.Terms("RetailerName", request.Retailers))
            ),
            mq => mq.Range(r => r
                .OnField("SellingPrice")
                .GreaterOrEquals(request.PriceRangeStart)
                .LowerOrEquals(request.PriceRangeEnd)
            )
        )
    ));
For this request object
var request = new Request
{
    Brands = new List<string> { "brand" },
    PriceRangeEnd = 100
};
NEST produces the following Elasticsearch query:
"query": {
    "bool": {
        "must": [
            {
                "filtered": {
                    "filter": {
                        "terms": {
                            "BrandName": [
                                "brand"
                            ]
                        }
                    }
                }
            },
            {
                "range": {
                    "SellingPrice": {
                        "lte": "100"
                    }
                }
            }
        ]
    }
}

Related

Create field from file name condition in logstash

I have several logs with the following names, where [E-1].[P-28], [E-1].[P-45] and [E-1].[P-51] are operators that generate these logs (they do not appear within the data; I can only identify them from the file name):
p2sajava131.srv.gva.es_11101.log.online.[E-1].[P-28].21.01.21.log
p1sajava130.srv.gva.es_11101.log.online.[E-1].[P-45].21.03.04.log
p1sajava130.srv.gva.es_11101.log.online.[E-1].[P-51].21.03.04.log
...
Is it possible to use the translate filter to create a new field?
Something like:
translate {
    field => "[log.file.path]"
    destination => "[operator_name]"
    dictionary => {
        if contains "[E-1].[P-28]" => "OPERATOR-1"
        if contains "[E-1].[P-45]" => "OPERATOR-2"
        if contains "[E-1].[P-51]" => "OPERATOR-3"
    }
}
Thanks!
I don't have an ELK stack here so I can't test, but something like this should work. Note that the right-hand side of =~ in a Logstash conditional is a regular expression, so the brackets and dots have to be escaped (an unescaped [E-1] is parsed as a character class with an invalid reversed range):
if [log][file][path] =~ "\[E-1\]\.\[P-28\]" {
    mutate {
        add_field => { "[operator][name]" => "OPERATOR-1" }
    }
}
if [log][file][path] =~ "\[E-1\]\.\[P-45\]" {
    mutate {
        add_field => { "[operator][name]" => "OPERATOR-2" }
    }
}
if [log][file][path] =~ "\[E-1\]\.\[P-51\]" {
    mutate {
        add_field => { "[operator][name]" => "OPERATOR-3" }
    }
}
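To see why the escaping matters, here is a quick sketch in Python, whose regex semantics are the same on this point:

```python
import re

filename = "p2sajava131.srv.gva.es_11101.log.online.[E-1].[P-28].21.01.21.log"

# Escaped: brackets and dots are treated literally, so this matches.
assert re.search(re.escape("[E-1].[P-28]"), filename) is not None

# Unescaped: "[E-1]" is parsed as a character class containing the
# reversed range E-1, which is a regex syntax error.
try:
    re.search("[E-1].[P-28]", filename)
except re.error as exc:
    print("unescaped pattern is invalid:", exc)
```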

Using RabbitMQ fields in Logstash output

I want to use some fields from RabbitMQ messages in the Logstash Elasticsearch output (as the index name, etc.).
If I use [@metadata][rabbitmq_properties][timestamp] in a filter it works fine, but not in the output statement (config below).
What am I doing wrong?
input {
    rabbitmq {
        host => "rabbitmq:5672"
        user => "user"
        password => "password"
        queue => "queue"
        durable => true
        prefetch_count => 1
        threads => 3
        ack => true
        metadata_enabled => true
    }
}
filter {
    if [@metadata][rabbitmq_properties][timestamp] {
        date {
            match => ["[@metadata][rabbitmq_properties][timestamp]", "UNIX"]
        }
    }
}
output {
    elasticsearch {
        hosts => ['http://elasticsearch:9200']
        index => "%{[@metadata][rabbitmq_properties][IndexName]}_%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
}
Copy the value onto a [@metadata] field with mutate's replace function, as below, and reference that field in the output:
input {
    rabbitmq {
        host => "rabbitmq:5672"
        user => "user"
        password => "password"
        queue => "queue"
        durable => true
        prefetch_count => 1
        threads => 3
        ack => true
        metadata_enabled => true
    }
}
filter {
    if [@metadata][rabbitmq_properties][timestamp] {
        date {
            match => ["[@metadata][rabbitmq_properties][timestamp]", "UNIX"]
        }
    }
    mutate {
        replace => {
            "[@metadata][index]" => "%{[@metadata][rabbitmq_properties][IndexName]}_%{+YYYY.MM.dd}"
        }
    }
}
output {
    elasticsearch {
        hosts => ['http://elasticsearch:9200']
        # the date suffix is already part of [@metadata][index]
        index => "%{[@metadata][index]}"
    }
    stdout { codec => rubydebug }
}

Only strings in InfluxDB

I have this config file in Logstash:
input {
    redis {
        host => "localhost"
        data_type => "list"
        key => "vortex"
        threads => 4
        type => "testrecord"
        codec => "plain"
    }
}
filter {
    kv {
        add_field => {
            "test1" => "yellow"
            "test" => "ife"
            "feild" => "pink"
        }
    }
}
output {
    stdout { codec => rubydebug }
    influxdb {
        db => "toast"
        host => "localhost"
        measurement => "myseries"
        allow_time_override => true
        use_event_fields_for_data_points => true
        exclude_fields => ["@version", "@timestamp", "sequence", "message", "type", "host"]
        send_as_tags => ["bar", "feild", "test1", "test"]
    }
}
and a list in Redis with the following data:
foo=10207 bar=1 sensor2=1 sensor3=33.3 time=1489686662
Everything works fine, but every field in InfluxDB is defined as a string regardless of its value.
Does anybody know how to get around this issue?
The mutate filter's convert option may be what you're looking for here.
filter {
    mutate {
        convert => {
            "value" => "integer"
            "average" => "float"
        }
    }
}
This means you need to know what your fields are beforehand, but it will convert them to the right data types.
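The underlying logic is simple: the kv filter emits every value as a string, and convert coerces the ones you name. The same behaviour, sketched in plain Python purely for illustration (the helper name is made up):

```python
# Sketch of kv + mutate/convert: parse "key=value" pairs, then coerce
# the named string values into numeric types; unnamed fields stay strings.
def parse_kv(line, types):
    event = dict(pair.split("=", 1) for pair in line.split())
    for field, caster in types.items():
        if field in event:
            event[field] = caster(event[field])
    return event

event = parse_kv(
    "foo=10207 bar=1 sensor2=1 sensor3=33.3 time=1489686662",
    {"foo": int, "bar": int, "sensor2": int, "sensor3": float, "time": int},
)
print(event)  # foo/bar/sensor2/time are ints, sensor3 is a float
```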

Logstash with multiple kafka inputs

I am trying to filter Kafka events from multiple topics, but once all events from one topic have been filtered, Logstash is not able to fetch events from the other Kafka topic. I am using topics with 3 partitions and 2 replications. Here is my Logstash config file:
input {
    kafka {
        auto_offset_reset => "smallest"
        consumer_id => "logstashConsumer1"
        topic_id => "unprocessed_log1"
        zk_connect => "192.42.79.67:2181,192.41.85.48:2181,192.10.13.14:2181"
        type => "kafka_type_1"
    }
    kafka {
        auto_offset_reset => "smallest"
        consumer_id => "logstashConsumer1"
        topic_id => "unprocessed_log2"
        zk_connect => "192.42.79.67:2181,192.41.85.48:2181,192.10.13.14:2181"
        type => "kafka_type_2"
    }
}
filter {
    if [type] == "kafka_type_1" {
        csv {
            separator => " "
            source => "data"
        }
    }
    if [type] == "kafka_type_2" {
        csv {
            separator => " "
            source => "data"
        }
    }
}
output {
    stdout { codec => rubydebug { metadata => true } }
}
It's a very late reply, but if you want to consume from multiple input topics and write to multiple output Kafka topics, you can do something like this:
input {
    kafka {
        topics => ["topic1", "topic2"]
        codec => "json"
        bootstrap_servers => "kafka-broker-1:9092,kafka-broker-2:9092,kafka-broker-3:9092"
        decorate_events => true
        group_id => "logstash-multi-topic-consumers"
        consumer_threads => 5
    }
}
output {
    if [@metadata][kafka][topic] == "topic1" {
        kafka {
            codec => "json"
            topic_id => "new_topic1"
            bootstrap_servers => "output-kafka-1:9092"
        }
    }
    else if [@metadata][kafka][topic] == "topic2" {
        kafka {
            codec => "json"
            topic_id => "new_topic2"
            bootstrap_servers => "output-kafka-1:9092"
        }
    }
}
Be careful when specifying your bootstrap servers: use the names on which your Kafka brokers have advertised listeners.
Ref-1: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#plugins-inputs-kafka-group_id
Ref-2: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#plugins-inputs-kafka-decorate_events
The previous answer didn't work for me; it seems Logstash did not recognize the conditional statements in the output. Here is the approach that is correct, at least for my case: I defined tags in the input for both Kafka consumers, and documents (in my case, logs) are ingested into separate indexes related to their consumer topics.
input {
    kafka {
        group_id => "35834"
        topics => ["First-Topic"]
        bootstrap_servers => "localhost:9092"
        codec => json
        tags => ["First-Topic"]
    }
    kafka {
        group_id => "35834"
        topics => ["Second-Topic"]
        bootstrap_servers => "localhost:9092"
        codec => json
        tags => ["Second-Topic"]
    }
}
filter {
}
output {
    if "Second-Topic" in [tags] {
        elasticsearch {
            hosts => ["localhost:9200"]
            document_type => "_doc"
            index => "logger"
        }
        stdout { codec => rubydebug }
    }
    else if "First-Topic" in [tags] {
        elasticsearch {
            hosts => ["localhost:9200"]
            document_type => "_doc"
            index => "saga"
        }
        stdout { codec => rubydebug }
    }
}
Probably this is what you need:
input {
    kafka {
        client_id => "logstash_server"
        topics => ["First-Topic", "Second-Topic"]
        codec => "json"
        decorate_events => true
        bootstrap_servers => "localhost:9092"
    }
}
filter { }
output {
    if [@metadata][kafka][topic] == "First-Topic" {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "logger"
        }
    }
    else if [@metadata][kafka][topic] == "Second-Topic" {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "saga"
        }
    }
    else {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "catchall"
        }
    }
}
There's no need to have two separate Kafka inputs if they point to the same bootstrap servers; you just have to specify the list of topics you want Logstash to read from.
You could also add stdout { codec => rubydebug } if you want, but that's usually only used when debugging; in a prod environment it would cause a lot of noise. document_type => "_doc" can also be used but is not a must, and in newer versions of Elasticsearch (8.0) that option is already deprecated, so I would simply get rid of it.
I also added a final else statement to the output: if for some reason none of the conditions match, it's important to send the events to some default index, in this case "catchall".
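The routing in that output block is just a topic-to-index mapping with a default; sketched in plain Python purely for illustration (names are made up):

```python
# Sketch of the output routing: pick an index per Kafka topic,
# falling back to a catch-all index when the topic is unknown.
TOPIC_TO_INDEX = {"First-Topic": "logger", "Second-Topic": "saga"}

def index_for(topic):
    return TOPIC_TO_INDEX.get(topic, "catchall")

print(index_for("First-Topic"))  # logger
print(index_for("some-other"))   # catchall
```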

CakePHP 3: How to ignore beforeFind for specific queries?

I am working on multilingual posts. I have added beforeFind() in the PostsTable so I can list posts for the current language:
public function beforeFind(Event $event, Query $query) {
    $query->where(['Posts.locale' => I18n::locale()]);
}
To allow users to duplicate posts in different languages, I wrote the following function:
public function duplicate() {
    $this->autoRender = false;
    $post_id = $this->request->data['post_id'];
    $post = $this->Posts
        ->findById($post_id)
        ->select(['website_id', 'category_id', 'locale', 'title', 'slug', 'body', 'image', 'thumb', 'meta_title', 'meta_description', 'other_meta_tags', 'status'])
        ->first()
        ->toArray();
    foreach ($this->request->data['site'] as $site) {
        if ($site['name'] == false) {
            continue;
        }
        $data = array_merge($post, [
            'website_id' => $site['website_id'],
            'locale' => $site['locale'],
            'status' => 'Draft',
            'duplicate' => true
        ]);
        $pageData = $this->Posts->newEntity($data);
        if ($this->Posts->save($pageData)) {
            $this->Flash->success(__('Post has been created.'));
        } else {
            $this->Flash->error(__('Post is not created.'));
        }
    }
    return $this->redirect(['action' => 'edit', $post_id]);
}
To check whether the posts have already been duplicated, I am doing a check in the 'edit' function:
$languages = TableRegistry::get('Websites')->find('languages');
foreach ($languages as $language) {
    $exists[] = $this->Posts
        ->findByTitleAndWebsiteId($post['title'], $language['website_id'])
        ->select(['locale', 'title', 'website_id'])
        ->first();
}
$this->set('exists', $exists);
but as beforeFind() appends its condition to the query above, I am not getting any results. Is there any way I can skip beforeFind() for only certain queries? I tried using the entity as below:
public function beforeFind(Event $event, Query $query) {
    if (isset($entity->duplicate)) {
        return true;
    }
    $query->where(['Posts.locale' => I18n::locale()]);
}
but no luck. Could anyone guide me? Thanks for reading.
There are various possible ways to handle this; one would be to make use of Query::applyOptions() to set an option that you can check in your callback:
$query->applyOptions(['injectLocale' => false]);

public function beforeFind(Event $event, Query $query, ArrayObject $options)
{
    if (!isset($options['injectLocale']) || $options['injectLocale'] !== false) {
        $query->where(['Posts.locale' => I18n::locale()]);
    }
}
Warning: The $options argument is currently passed as an array, while it should be an instance of ArrayObject (#5621)
Callback methods can be skipped with the 'callbacks' find option (note: this is the CakePHP 2 array syntax):
$this->Model->find('all', array(
    'conditions' => array(...),
    'order' => array(...),
    'callbacks' => false
));