Want to get collectd CPU threshold warnings without CPU details - collectd

Dear collectd experts,
I am using collectd on many clients to send telemetry data to a central instance via the network plugin. For this, the clients run collectd with a configuration containing, among others:
LoadPlugin cpu
LoadPlugin network
<Plugin network>
Server "xxx.xxx.xxx.xxx" "yyyy"
</Plugin>
This configuration also includes a threshold definition to send a warning in case CPU usage exceeds 90% over a given time span. The configuration for this is:
LoadPlugin "threshold"
<Plugin "threshold">
<Plugin "cpu">
<Type "percent">
Instance "user"
# start to warn from 90%
WarningMax 90
# every 5 seconds -> 12 hits (1 min)
Hits 12
# keep warning until the value drops below the hysteresis level
Persist true
# hysteresis: warnings stop at 90% - 10% = 80%
Hysteresis 10
</Type>
</Plugin>
</Plugin>
Unfortunately, the collectd client sends the whole data set for each CPU, and not only the warning, when the CPU reaches the threshold. If I remove the cpu plugin, collectd sends nothing at all: neither the CPU details nor the warnings when the threshold is triggered.
Is there a configuration option so that only the threshold-related warnings are sent, but not the full CPU details?
Regards

You can use the Aggregation plugin - https://collectd.org/wiki/index.php/Plugin:Aggregation/Config.
Not sure if it will work with the Threshold plugin, though.
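For reference, the wiki page linked above shows how to collapse the per-core CPU values into a single per-host average; a sketch along the lines of that example (directive names taken from the linked page, not adapted to the asker's Type "percent" setup):

```apacheconf
LoadPlugin "aggregation"
<Plugin "aggregation">
  <Aggregation>
    Plugin "cpu"
    Type "cpu"
    # one aggregated value per host and type instance (user, system, ...)
    GroupBy "Host"
    GroupBy "TypeInstance"
    CalculateAverage true
  </Aggregation>
</Plugin>
```

This reduces the volume of CPU data sent, but whether the Threshold plugin can then be pointed only at the aggregated values is the open question above.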

Related

Is it possible to use redis_store output plugin for fluentd to handle huge amount of logs?

I'm trying to configure fluentd to send logs to Redis on a different server (the path is fluentbit-fluentd-redis-logstash-elastic), but I can't figure out some performance issues. If I save logs only to disk, everything is fine, all logs are saved (thousands per second), but if I add the redis_store section, the same amount of data makes fluentd much slower and its memory keeps growing until the next restart (a day or two).
I know it's probably because the input is faster than the output, but how do I handle that? How do I send this much data to Redis? Adding more memory is not a solution; it only buys some time. Is it because Redis can't receive data from so many threads (but Redis is not overloaded, there is no queue)? I don't know if it is a network issue (but with this plugin I can't try another protocol?), a CPU issue (CPU is at about 70%), or the plugin itself.
If I add the redis plugin, communication with Redis is just so slow that fluentd can't keep up and accumulates data in memory.
Config
<system>
  workers 4
  root_dir /fluentd/log/buffer/
</system>
<worker 0-3>
  <source>
    @type forward
    bind 0.0.0.0
    port 9880
  </source>
  <label @TEST>
    <match test.**>
      @type forest
      subtype copy
      <template>
        <store>
          @type file
          @id "test-#{worker_id}"
          @log_level debug
          path "fluentd/log/test-#{worker_id}.*.log"
          append true
          <buffer>
            flush_mode interval
            flush_interval 3
            flush_at_shutdown true
          </buffer>
          <format>
            @type single_value
            message_key log
          </format>
        </store>
        <store>
          @type redis_store
          host server_ip
          port 6379
          key test
          store_type list
          <buffer>
            flush_mode interval
            flush_interval 3
            flush_at_shutdown true
            flush_thread_count 4
          </buffer>
        </store>
      </template>
    </match>
  </label>
</worker>
Any tips on how to improve throughput to Redis?
Thank you
If you are using the redisstore plugin, you can get much better performance by installing the hiredis gem, which wraps the C Redis API:
https://github.com/moaikids/fluent-plugin-redisstore
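If you try that, a minimal sketch (assuming the plugin talks to Redis through the redis-rb gem, which can use the hiredis C bindings as its protocol driver):

```ruby
# Gemfile sketch: add hiredis alongside the Redis client gem
gem "redis"
gem "hiredis"   # C bindings; redis-rb can use these as a faster driver
```

Whether the plugin picks up hiredis automatically or needs a driver option depends on the plugin and redis-rb versions in use.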

Collectd not collecting data when changed to "High Counter" or HC OIDs in snmp plugin config

I am playing with collectd. Everything worked fine until I decided to use "ifHCInOctets" instead of "ifInOctets". Here is my snmp plugin config.
<Plugin snmp>
<Data "std_traffic_hc">
Type "if_octets"
Table true
# Instance "IF-MIB::ifDescr"
Instance "IF-MIB::ifName"
# Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
Values "IF-MIB::ifHCInOctets" "IF-MIB::ifHCOutOctets"
</Data>
<Host "ABCDESW01-01">
Address "10.0.3.131"
Version 1
Community "xxx"
Collect "std_traffic_hc"
Interval 60
</Host>
</Plugin>
I have also tried ifDescr and ifName in the "Instance" directive, and that did not make a difference. Either one works for the regular OIDs but not the HC OIDs.
I used tcpdump, but I don't see collectd ever sending SNMP requests for the HC OIDs. I do see the SNMP traffic for the Instance OID.
I have also used snmpwalk to confirm that my switch (HP) supports the OID:
# snmpwalk -v2c -cxxx 10.0.3.131 IF-MIB::ifHCInOctets
IF-MIB::ifHCInOctets.1 = Counter64: 0
IF-MIB::ifHCInOctets.2 = Counter64: 356053022
Where did I go wrong?
Thank you!
Well, High Counter OIDs aren't usable with SNMP v1, so I would guess that's the problem here.
You are correctly requesting the OID manually with version 2c on the CLI, but the collectd configuration is set to "Version 1".
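A sketch of the corrected Host block (only the Version directive changes; your snmpwalk with -v2c above suggests the switch accepts v2c):

```apacheconf
<Host "ABCDESW01-01">
    Address "10.0.3.131"
    # SNMP v2c is required for the 64-bit Counter64 (HC) OIDs
    Version 2
    Community "xxx"
    Collect "std_traffic_hc"
    Interval 60
</Host>
```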

How to know if images are really cached into memory ohs 11.1.1.7 (based on apache 2.2)

We have configured the module mod_file_cache, example configuration:
...
LoadModule file_cache_module "${ORACLE_HOME}/ohs/modules/mod_file_cache.so"
...
include "/opt/oraas/appweb/estaticos/media/conf/mmap_media.conf"
...
Example of content of mmap_media.conf:
...
MMapFile /opt/oraas/appweb/estaticos/media/img_fmj_cu1.jpg
MMapFile /opt/oraas/appweb/estaticos/media/img_fmj_cu2.jpg
...
The total number of images to be cached is 48100, around 3.5 GB.
The OHS instance takes around 50 seconds to start; we see the server's memory usage increase and no errors in the logs, but we do not know how to confirm that all those images are really cached in memory.
How can we explicitly confirm that those images are cached in the OHS server's memory?
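As background on what MMapFile does: it maps each listed file into the server process's address space at startup, so requests are served from memory rather than from a per-request disk read. A minimal Python sketch of the same mmap mechanism (the file and its contents are made up for illustration):

```python
import mmap
import os
import tempfile

def map_file(path):
    """Map a file read-only into this process's address space,
    which is essentially what mod_file_cache's MMapFile does."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

# Stand-in for one of the cached images (hypothetical content).
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".jpg")
tmp.write(b"\xff\xd8" + b"\x00" * 1024)  # JPEG magic bytes plus padding
tmp.close()

m = map_file(tmp.name)
header = bytes(m[:2])  # served from the mapping, not a fresh read()
size = len(m)          # the whole file is addressable in memory
m.close()
os.unlink(tmp.name)
```

On Linux, file-backed mappings like this show up in /proc/&lt;pid&gt;/maps of the process that created them, which is one way to check what a running httpd worker actually has mapped.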

Apache configuration fine tuning

I run a very simple website (basically a redirect based on a PHP database) which gets on average 5 visits per second throughout the day, but at peak times (usually 2-3 times a day) this may go up to 300 visits/s or more. I've modified the default Apache settings as follows (based on various info found online, as I'm not an expert):
Start Servers: 5 (default) / 25 (now)
Minimum Spare Servers: 5 (default) / 50 (now)
Maximum Spare Servers: 10 (default) / 100 (now)
Server Limit: 256 (default) / 512 (now)
Max Clients: 150 (default) / 450 (now)
Max Requests Per Child: 10000 (default)
Keep-Alive: On (default) / Off (now)
Timeout: 300 (default)
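Rendered as an httpd.conf block, the current values above would look like this (a sketch; directive names from the prefork MPM, with KeepAlive and Timeout as global directives):

```apacheconf
<IfModule prefork.c>
    StartServers          25
    MinSpareServers       50
    MaxSpareServers      100
    ServerLimit          512
    MaxClients           450
    MaxRequestsPerChild 10000
</IfModule>
KeepAlive Off
Timeout 300
```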
Server (VPS) specs:
4x 3199.998 MHz, 512 KB Cache, QEMU Virtual CPU version (cpu64-rhel6)
8GB RAM (Memory: 8042676k/8912896k available (5223k kernel code, 524700k absent, 345520k reserved, 7119k data, 1264k init))
70GB SSD disk
CENTOS 6.5 x86_64 kvm – server
During average loads the server handles traffic just fine. Problems occur almost every day during peak traffic times, in the form of HTTP time-outs or extremely long response/load times.
The question is: do I need to get a better server, or can I improve response times during peak traffic by further tuning the Apache config? Any help would be appreciated. Thank you!
Maybe you need to enable mod_cache with mod_mem_cache. Another thing I always configure is ulimits:
nofile, to get more sockets
nproc, to get more processes
http://www.webperformance.com/load-testing/blog/2012/12/setting-apache2-ulimit-for-maximum-prefork-performance/
Finally, for TCP and network tuning, check the net.core and net.ipv4 parameters to reduce latency:
http://fasterdata.es.net/host-tuning/linux/
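A sketch of the kind of net.core / net.ipv4 settings meant here (the values are illustrative starting points, not tuned recommendations; apply with sysctl -p after editing):

```ini
# /etc/sysctl.conf -- network tuning sketch
net.core.somaxconn = 1024            # listen backlog ceiling
net.core.netdev_max_backlog = 5000   # packets queued per NIC before drop
net.ipv4.tcp_max_syn_backlog = 4096  # half-open connection queue
net.ipv4.tcp_fin_timeout = 15        # reclaim FIN-WAIT-2 sockets sooner
```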

Apache + Tomcat with mod_jk: maxThread setting upon load balancing

I have Apache + Tomcat setup with mod_jk on 2 servers. Each server has its own Apache+Tomcat pair, and every request is being served by Tomcat load balancing workers on 2 servers.
I have a question about how Apache's maxClient and Tomcat's maxThread should be set.
The default numbers are,
Apache: maxClient=150, Tomcat: maxThread=200
In this configuration, if we had only a single server, it would work just fine, as the Tomcat worker never receives more than 150 incoming connections at once. However, if we are load balancing between 2 servers, could the Tomcat worker receive 150 + (some number from the other server) connections and overflow maxThreads with SEVERE: All threads (200) are currently busy?
If so, should I set Tomcat's maxThread=300 in this case?
Thanks
Setting maxThreads to 300 should be fine - there are no fixed rules; it depends on whether you see any connections being refused.
Increasing it too much causes high memory consumption, but production Tomcats are known to run with 750 threads. See here as well: http://java-monitor.com/forum/showthread.php?t=235
Have you actually got the SEVERE error? I've tested on our Tomcat 6.0.20, and it throws an INFO message when maxThreads is crossed.
INFO: Maximum number of threads (200) created for connector with address null and port 8080
It does not refuse connections until the acceptCount value is crossed. The default is 100.
From the Tomcat docs http://tomcat.apache.org/tomcat-5.5-doc/config/http.html
The maximum queue length for incoming connection requests when all possible request processing threads are in use. Any requests received when the queue is full will be refused. The default value is 100.
The way it works is:
1) As the number of simultaneous requests increases, threads are created up to the configured maximum (the value of the maxThreads attribute). So in your case, the message "Maximum number of threads (200) created" will appear at this point; however, requests will still be queued for service.
2) If still more simultaneous requests are received, they are queued up to the configured maximum (the value of the acceptCount attribute). Thus a total of 300 requests can be accepted without failure (assuming your acceptCount is at the default of 100).
3) Crossing this number results in Connection Refused errors, until resources are available to process them.
So you should be fine until you hit step 3.
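Putting the numbers above together: the relevant attributes live on the Connector element in server.xml. A sketch with the values discussed (maxThreads raised to 300, acceptCount left at its default of 100):

```xml
<!-- server.xml sketch: thread pool plus accept queue as discussed above -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="300"
           acceptCount="100" />
```

With these settings a single connector can have 300 requests in flight plus 100 queued before refusing connections.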