Is it possible to use the redis_store output plugin for fluentd to handle a huge amount of logs?

I'm trying to configure fluentd to send logs to Redis on a different server (the pipeline is fluentbit → fluentd → redis → logstash → elastic), but I can't figure out some performance issues. If I save logs only to disk, everything is fine and all logs are saved (thousands per second), but if I add the redis_store section, the same amount of data makes fluentd much slower and its memory keeps growing until the next restart (a day or two).
I know this is probably because the input is faster than the output, but how do I handle that, and how do I send this much data to Redis? Adding more memory is not a solution; it only buys some time. Is it because Redis can't receive data from so many threads (although Redis is not overloaded and there is no queue)? I don't know whether it is a network issue (with this plugin I can't try another protocol), a CPU issue (CPU is at roughly 70%), or the plugin itself.
If I add the Redis plugin, communication with Redis is simply so slow that fluentd can't keep up and accumulates data in memory.
Config
<system>
  workers 4
  root_dir /fluentd/log/buffer/
</system>

<worker 0-3>
  <source>
    @type forward
    bind 0.0.0.0
    port 9880
  </source>

  <label @TEST>
    <match test.**>
      @type forest
      subtype copy
      <template>
        <store>
          @type file
          @id "test-#{worker_id}"
          @log_level debug
          path "fluentd/log/test-#{worker_id}.*.log"
          append true
          <buffer>
            flush_mode interval
            flush_interval 3
            flush_at_shutdown true
          </buffer>
          <format>
            @type single_value
            message_key log
          </format>
        </store>
        <store>
          @type redis_store
          host server_ip
          port 6379
          key test
          store_type list
          <buffer>
            flush_mode interval
            flush_interval 3
            flush_at_shutdown true
            flush_thread_count 4
          </buffer>
        </store>
      </template>
    </match>
  </label>
</worker>
Any tips on how to improve throughput to Redis?
Thank you

If you are using the redisstore plugin, you can get much better performance by installing the hiredis gem, which wraps the C Redis API:
https://github.com/moaikids/fluent-plugin-redisstore
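As a minimal sketch, installing both gems into a plain fluentd installation might look like the commands below; the exact gem command depends on how fluentd is packaged (td-agent uses td-agent-gem), and whether the plugin's Redis client picks up hiredis automatically depends on the plugin and redis gem versions, so treat this as an assumption to verify:

# install the output plugin from the repo above plus the C-based hiredis driver
fluent-gem install fluent-plugin-redisstore
fluent-gem install hiredis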

Related

Logs to Splunk are getting truncated

I am using fluentd to forward my Kubernetes pod logs to Splunk, but in Splunk I am not able to see the full length of the pod logs, as they are getting truncated. For example, we have a single-line log 74,286 characters long, but Splunk shows only 16,385 characters.
What can I do to overcome this issue?
This is how I have configured it in the fluentd ConfigMap:
<match **>
  @id splunk
  @type splunk-hec
  @log_level info
  server "#{ENV['FLUENT_SPLUNK_HOST']}"
  protocol https
  verify false
  host "#{ENV['CLUSTER_NAME']}_#{ENV['NODE_NAME']}"
  token "#{ENV['FLUENT_SPLUNK_TOKEN']}"
  index "#{ENV['SPLUNK_INDEX']}"
  buffer_type memory
  buffer_queue_limit 256
  buffer_chunk_limit 8m
  batch_size_limit 8000000
  flush_interval 1s
</match>
By default, Splunk is supposed to truncate at 10,000 characters. You can change that in your props.conf file.
[mysourcetype]
TRUNCATE = 75000
This would be in addition to the rest of the "magic" 6 settings: TIME_PREFIX, TIME_FORMAT, MAX_TIMESTAMP_LOOKAHEAD, SHOULD_LINEMERGE, and LINE_BREAKER.
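For reference, a hypothetical props.conf stanza that sets all six of these together might look like the following; the values are purely illustrative and should be adapted to your sourcetype:

[mysourcetype]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 40
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 75000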

Fluentd grep + output logs

I have a service deployed into a Kubernetes cluster, with fluentd set up as a DaemonSet, and I need to split the logs it receives so they end up in different S3 buckets.
One bucket would be for all logs generated by Kubernetes and by our debug/error-handling code, and another bucket would be for the subset of logs generated by the service, parsed by a structured logger and identified by a specific field in the JSON. Think of it as one bucket for machine state and errors, and another for "user_id created resource image_id at ts" descriptions of user actions.
The service itself is unaware of fluentd, so I cannot manually set the tag for logs based on which S3 bucket I want them to end up in.
The fluentd.conf I use sets up S3 like this:
<match **>
  # docs: https://docs.fluentd.org/v0.12/articles/out_s3
  # note: this configuration relies on the nodes having an IAM instance profile with access to your S3 bucket
  type copy
  <store>
    type s3
    log_level info
    s3_bucket "#{ENV['S3_BUCKET_NAME']}"
    s3_region "#{ENV['S3_BUCKET_REGION']}"
    aws_key_id "#{ENV['AWS_ACCESS_KEY_ID']}"
    aws_sec_key "#{ENV['AWS_SECRET_ACCESS_KEY']}"
    s3_object_key_format %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
    format json
    time_slice_format %Y/%m/%d
    time_slice_wait 1m
    flush_interval 10m
    utc
    include_time_key true
    include_tag_key true
    buffer_chunk_limit 128m
    buffer_path /var/log/fluentd-buffers/s3.buffer
  </store>
  <store>
    ...
  </store>
</match>
So, what I would like to do is have something like a grep plugin:

<store>
  type grep
  <regexp>
    key type
    pattern client-action
  </regexp>
</store>

which would send the matching logs into a separate S3 bucket from the one defined for all logs.
I am assuming that user action logs are generated by your service and system logs include docker, kubernetes and systemd logs from the nodes.
I found your example yaml file at the official fluent github repo.
If you check out the folder in that link, you'll see two more files called kubernetes.conf and systemd.conf. These files have source sections where they tag their data.
The match section in fluent.conf matches **, i.e. all logs, and sends them to S3. You want to split your log types here.
Your container logs are being tagged kubernetes.* in kubernetes.conf on this line.
So your above config turns into:
<match kubernetes.**>
  @type s3
  # user log s3 bucket
  ...

and for system logs, match every other tag except kubernetes.**.
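As a rough sketch (reusing the v0.12-style out_s3 options from your existing config; USER_S3_BUCKET_NAME is a made-up variable for the second bucket), the split could look like this, since match sections are evaluated in order:

# service/user logs, tagged kubernetes.* by kubernetes.conf
<match kubernetes.**>
  type s3
  log_level info
  s3_bucket "#{ENV['USER_S3_BUCKET_NAME']}"      # hypothetical second bucket for user-action logs
  s3_region "#{ENV['S3_BUCKET_REGION']}"
  s3_object_key_format %{path}%{time_slice}/user-log-%{index}.%{file_extension}
  format json
  time_slice_format %Y/%m/%d
  buffer_path /var/log/fluentd-buffers/s3-user.buffer
</match>

# anything not caught above (system logs) keeps going to the original bucket
<match **>
  type s3
  log_level info
  s3_bucket "#{ENV['S3_BUCKET_NAME']}"
  s3_region "#{ENV['S3_BUCKET_REGION']}"
  s3_object_key_format %{path}%{time_slice}/cluster-log-%{index}.%{file_extension}
  format json
  time_slice_format %Y/%m/%d
  buffer_path /var/log/fluentd-buffers/s3.buffer
</match>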

Collectd not collecting data after changing to "High Counter" (HC) OIDs in the snmp plugin config

I am playing with collectd. Everything works fine until I decide to use "ifHCInOctets" instead of "ifInOctets". Here is my SNMP plugin config:
<Plugin snmp>
  <Data "std_traffic_hc">
    Type "if_octets"
    Table true
    # Instance "IF-MIB::ifDescr"
    Instance "IF-MIB::ifName"
    # Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
    Values "IF-MIB::ifHCInOctets" "IF-MIB::ifHCOutOctets"
  </Data>
  <Host "ABCDESW01-01">
    Address "10.0.3.131"
    Version 1
    Community "xxx"
    Collect "std_traffic_hc"
    Interval 60
  </Host>
</Plugin>
I have also tried ifDescr and ifName in the "Instance" directive, and that did not make a difference. Either one works for the regular OID but not for the HC OID.
I used tcpdump, but I don't see collectd ever requesting the HC OID over SNMP. I do see the SNMP traffic for the Instance OID.
I have also used snmpwalk to confirm that my switch (HP) supports the OID:
# snmpwalk -v2c -cxxx 10.0.3.131 IF-MIB::ifHCInOctets
IF-MIB::ifHCInOctets.1 = Counter64: 0
IF-MIB::ifHCInOctets.2 = Counter64: 356053022
Where did I go wrong?
Thank you!
Well, High Counter OIDs aren't usable with SNMP v1, so I would guess that's the problem here.
You are correctly requesting the OID with version 2c manually on the CLI, but the collectd configuration is set to "Version 1".
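A minimal fix, as a sketch, would be to request SNMPv2c in the Host block (everything else unchanged):

<Host "ABCDESW01-01">
  Address "10.0.3.131"
  Version 2            # 64-bit HC counters (Counter64) require SNMPv2c or v3; v1 only knows 32-bit counters
  Community "xxx"
  Collect "std_traffic_hc"
  Interval 60
</Host>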

How to run two instances of Apache ActiveMQ in one system?

I am using apache-activemq-5.11.1, the stable version that runs on JDK 7 (major version 51.0); I am using JDK 7 Update 80. I get an error if I run it on JDK 6:
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/activemq/console/Main : Unsupported major.minor version 51.0
Coming to my problem: I need to have two running instances of ActiveMQ on my system. I followed these steps to create the two instances:
C:\>cd \apache-activemq-5.11.1
C:\apache-activemq-5.11.1>.\bin\activemq create instance1
C:\apache-activemq-5.11.1>.\bin\activemq create instance2
I changed instance2 to use a different set of port numbers, as below:
<!-- EDITED: apache-activemq-5.11.1\instance2\conf\activemq.xml -->
<transportConnectors>
  <!-- DOS protection, limit concurrent connections to 1000 and frame size to 100MB -->
  <transportConnector name="openwire" uri="tcp://0.0.0.0:61716?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="amqp" uri="amqp://0.0.0.0:5772?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="stomp" uri="stomp://0.0.0.0:61713?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="mqtt" uri="mqtt://0.0.0.0:1983?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="ws" uri="ws://0.0.0.0:61714?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
</transportConnectors>
Now I am starting instance1 and instance2 as follows:
C:\apache-activemq-5.11.1\instance1\bin>instance1 start
C:\apache-activemq-5.11.1\instance2\bin>instance2 start
The second instance I try to start gives the following KahaDB lock problem:
INFO | Refreshing org.apache.activemq.xbean.XBeanBrokerFactory$1#7209d9af: startup date [Thu May 07 16:16:23 IST 2015]; root of context hierarchy
INFO | PListStore:[C:\apache-activemq-5.11.1\data\localhost\tmp_storage] started
INFO | Using Persistence Adapter: KahaDBPersistenceAdapter[C:\apache-activemq-5.11.1\data\kahadb]
INFO | Database C:\apache-activemq-5.11.1\data\kahadb\lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File 'C:\apache-activemq-5.11.1\data\kahadb\lock' could not be locked.
Please give a solution for this db lock issue.
Make a replica of your ActiveMQ installation, e.g. copy apache-activemq-x.xx.x to apache-activemq-x.xx.x_2.
Change the ports in apache-activemq-x.xx.x_2\conf\activemq.xml. Make sure the port numbers you change do not clash with ports already in use.
<!-- EDIT: apache-activemq-5.11.1_2\conf\activemq.xml -->
<transportConnectors>
  <!-- DOS protection, limit concurrent connections to 1000 and frame size to 100MB -->
  <transportConnector name="openwire" uri="tcp://0.0.0.0:61716?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="amqp" uri="amqp://0.0.0.0:5772?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="stomp" uri="stomp://0.0.0.0:61713?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="mqtt" uri="mqtt://0.0.0.0:1983?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
  <transportConnector name="ws" uri="ws://0.0.0.0:61714?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
</transportConnectors>
Along with the port changes, we also have to change the HTTP management console port in jetty.xml:
<bean id="jettyPort" class="org.apache.activemq.web.WebConsolePort" init-method="start">
<!-- the default port number for the web console -->
<property name="host" value="0.0.0.0"/>
<property name="port" value="8162"/>
</bean>
In this way you can run two services of ActiveMQ on one system.
Running additional instances created for ActiveMQ also gives you a failover option: when one instance goes down for some reason, the other instance comes up automatically, because the KahaDB lock held by the first instance is released. For this, the ports should not be changed, since we're configuring the instances for failover mode.
C:\>cd \apache-activemq-5.11.1
C:\apache-activemq-5.11.1>.\bin\activemq create instance1
C:\apache-activemq-5.11.1>.\bin\activemq create instance2
Start the instances without changing any of the configuration. Then, when instance1 goes down for any reason, instance2 comes up.
C:\apache-activemq-5.11.1\instance1\bin>instance1 start
C:\apache-activemq-5.11.1\instance2\bin>instance2 start
I believe this is the purpose of creating multiple instances under ActiveMQ. Beyond this, some more tweakable configuration is also available for KahaDB.
I followed the instructions from:
Running Multiple ActiveMQ Instances on One Machine (DZone)
It works fine on Mac (not tested on Linux).
Note: the instances must be started with instanceNumber start (the console argument/parameter is not valid anymore).
I had the same problem with KahaDB locking, but only on Windows, with ActiveMQ versions 5.13.3 and 5.14.5.
The same author from DZone wrote practically the same post on his blog:
Running multiple ActiveMQ instances on one machine (Blog)
But there is an important update.
You must open the instanceNumber.bat file of each instance in its bin directory and add these two lines:
set ACTIVEMQ_CONF="ACTIVEMQ_HOME/instanceNumber/conf"
set ACTIVEMQ_DATA="ACTIVEMQ_HOME/instanceNumber/data"
where ACTIVEMQ_HOME represents the path of your ActiveMQ installation and instanceNumber is the instance being edited, such as instanceA or instanceB.
What is happening is that you have changed the port numbers correctly, but both instances that you created use the same database (in this case the file-based KahaDB) to store their messages.
So when one instance is up and running, it holds the lock on that database, and the other ActiveMQ instance waits to acquire the lock on this DB.
Essentially this becomes a master/slave configuration.
Look at this section in activemq.xml:
<persistenceAdapter>
  <kahaDB directory="${activemq.data}/kahadb"/>
</persistenceAdapter>
This points to the same location for both instances.
My solution is to copy the entire apache-activemq-x.xx.x folder to a different location, change the port numbers for the second instance, and run them separately (or point each instance's KahaDB at its own directory, as in the sketch below).
This way you will have 2 instances of ActiveMQ running on the same machine.
Hope this helps!
Good luck!
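Alternatively, as an untested sketch, you could keep the single installation and point each created instance's activemq.xml at its own KahaDB directory, so the two brokers no longer compete for one lock file (the paths below are illustrative):

<!-- instance1\conf\activemq.xml -->
<persistenceAdapter>
  <kahaDB directory="C:/apache-activemq-5.11.1/instance1/data/kahadb"/>
</persistenceAdapter>

<!-- instance2\conf\activemq.xml -->
<persistenceAdapter>
  <kahaDB directory="C:/apache-activemq-5.11.1/instance2/data/kahadb"/>
</persistenceAdapter>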
Although load-balancing/fault-tolerant configuration is restricted because of the KahaDB lock, we can use the following kind of connection URL to make use of both ActiveMQ brokers:
failover://(tcp://192.nnn.nn.nn:61616,tcp://192.nnn.nn.nn:61616)?randomize=false
randomize=true would shuffle messages between the two ActiveMQ brokers in active mode, rather than only failing over.
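For illustration, a minimal JMS producer using such a failover URL might look like the following sketch (broker addresses and queue name are placeholders):

import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class FailoverProducer {
    public static void main(String[] args) throws Exception {
        // failover: transparently reconnects to the other broker if the current one goes down;
        // randomize=false tries the brokers in the listed order instead of picking one at random
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "failover://(tcp://192.168.0.1:61616,tcp://192.168.0.2:61616)?randomize=false");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("TEST.QUEUE"));
        producer.send(session.createTextMessage("hello"));
        connection.close();
    }
}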
A complete reference for this can be found at the following Apache site link:
http://activemq.apache.org/failover-transport-reference.html
Still, a high-availability (i.e. cluster) configuration makes things more stable for your app, although Apache would need to advance ActiveMQ high availability for things to work more smoothly. The current Apache ActiveMQ high-availability configuration is described at the following link:
http://activemq.apache.org/clustering.html
Although KahaDB has a file-lock restriction, the following alternative configurations can be used (a sketch of option 1 follows this list):
1) Shared File System Master Slave - a shared file system such as a SAN: http://activemq.apache.org/shared-file-system-master-slave.html
2) JDBC Master Slave - a shared database: http://activemq.apache.org/jdbc-master-slave.html
3) Replicated LevelDB Store - ZooKeeper server: http://activemq.apache.org/replicated-leveldb-store.html
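For example, option 1 essentially boils down to pointing every broker's KahaDB at the same shared mount; the first broker to acquire the file lock becomes the master and the others wait as slaves (the path below is illustrative):

<persistenceAdapter>
  <!-- same shared (SAN/NFS) directory configured on every broker in the master/slave group -->
  <kahaDB directory="/mnt/shared-san/activemq/kahadb"/>
</persistenceAdapter>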
Over and above this, via JCA connectors, ActiveMQ can be plugged into application servers such as JBoss, WebLogic, WebSphere, Geronimo, and GlassFish as a resource adapter. And with products like Apache Camel (Karaf) and JBoss Fuse ESB, HA and clustering of ActiveMQ can also be achieved.

activeMQ master/slave cluster with zookeeper

I have my ActiveMQ connected to ZooKeeper (a cluster of 5 ZooKeeper servers). In the config file "activemq.xml" I have:
<persistenceAdapter>
  <replicatedLevelDB
    directory="${activemq.data}/leveldb"
    replicas="3"
    bind="tcp://0.0.0.0:0"
    zkAddress="blablabla:2181"
    zkPassword="password"
    zkPath="/activemq/leveldb-stores"
    hostname="blabla"
  />
</persistenceAdapter>
Now I have activeMQ-server1 started, and it successfully becomes the master; activeMQ-server2, with the same "activemq.xml" config file, successfully becomes the slave; activeMQ-server3, with the same "activemq.xml" config file, successfully becomes the slave, but it kicks out activeMQ-server2 (which starts giving connection errors).
I think I put the wrong number in replicas. I changed all 3 config files to "replicas="4"", and it still doesn't work.
What would be the correct replicas number with 3 ActiveMQ servers, or am I wrong about some other part? (I only have 1 ZooKeeper listed in the config, since the 5 ZooKeepers can connect to each other and already form a cluster.)
Thanks :)
You need to list all ZooKeeper servers in the zkAddress portion: zkAddress="zoo1.example.org:2181,zoo2.example.org:2181,zoo3.example.org:2181" (taken from the ActiveMQ Replicated LevelDB documentation).
The replicas value is the number of ActiveMQ nodes, not the number of ZooKeeper nodes. So if you have 3 ActiveMQ nodes, set replicas="3", not more. From http://activemq.apache.org/replicated-leveldb-store.html:
replicas property: the number of nodes that will exist in the cluster. At least (replicas/2)+1 nodes must be online to avoid a service outage.
Another thing: all ActiveMQ nodes in the cluster must have the same broker name (MyBroker below):
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="MyBroker" dataDirectory="${activemq.data}">
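Putting the corrections together, a sketch of the persistence section each of the three brokers might carry looks like this (the ZooKeeper addresses and hostname are placeholders; hostname must be set to each broker's own address):

<broker xmlns="http://activemq.apache.org/schema/core" brokerName="MyBroker" dataDirectory="${activemq.data}">
  <persistenceAdapter>
    <replicatedLevelDB
      directory="${activemq.data}/leveldb"
      replicas="3"
      bind="tcp://0.0.0.0:0"
      zkAddress="zoo1.example.org:2181,zoo2.example.org:2181,zoo3.example.org:2181"
      zkPassword="password"
      zkPath="/activemq/leveldb-stores"
      hostname="amq-node1.example.org"/>
  </persistenceAdapter>
</broker>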