I'm trying to get logstash working in a centralised setup using the docs as an example:
http://logstash.net/docs/1.2.2/tutorials/getting-started-centralized
I've got logstash (as indexer), redis, elasticsearch and standalone kibana3 running on my web server. I then need to run logstash as an agent on another server to collect apache logs and send them to the web server via redis. The number of agents will increase and the logs will vary, but for now I just want to get this working!
I need everything to run as a service so that all is well after reboots etc. All servers are running Ubuntu.
For all logstash instances (indexer and agent), I'm using the following init script (Ubuntu version, second gist):
https://gist.github.com/shadabahmed/5486949#file-logstash-ubuntu
For running redis as a service, I followed the instructions here:
http://redis.io/topics/quickstart (Installing redis more properly)
Elasticsearch is also running as a service.
On the web server, running redis-cli returns PONG correctly. Navigating to the correct Elasticsearch URL returns the correct JSON response. Navigating to the Kibana3 URL gives me the dashboard, but no data. UFW is set to allow the redis port (at the moment from everywhere).
On the web server, my logstash.conf is:
input {
  file {
    path => "/var/log/apache2/access.log"
    type => "apache-access"
    sincedb_path => "/etc/logstash/.sincedb"
  }
  redis {
    host => "127.0.0.1"
    data_type => "list"
    key => "logstash"
    codec => json
  }
}
filter {
  grok {
    type => "apache-access"
    pattern => "%{COMBINEDAPACHELOG}"
  }
}
output {
  elasticsearch {
    embedded => true
  }
  statsd {
    # Count one hit every event by response
    increment => "apache.response.%{response}"
  }
}
From the agent server, I can telnet successfully to the web server IP and redis port. logstash is running. The logstash.conf file is:
input {
  file {
    path => "/var/log/apache2/shift.access.log"
    type => "apache"
    sincedb_path => "/etc/logstash/since_db"
  }
  stdin {
    type => "example"
  }
}
filter {
  if [type] == "apache" {
    grok {
      pattern => "%{COMBINEDAPACHELOG}"
    }
  }
}
output {
  stdout { codec => rubydebug }
  redis { host => ["xx.xx.xx.xx"] data_type => "list" key => "logstash" }
}
If I comment out the stdin and stdout lines, I still don't get a result. The logstash logs do not give me any connection errors - only warnings about the deprecated grok settings format.
I have also tried running logstash from the command line (making sure to stop the daemonised service first). The apache log file is correctly output in the terminal, so I know that logstash is accessing the log correctly. And I can write random strings and they are output in the correct logstash format.
The redis logs on the web server show no sign of trouble...
The frustrating thing is that this has worked once. One message from stdin made it all the way through to Elasticsearch. That was this morning just after getting everything set up. Since then, I have had no luck and I have no idea why!
Any tips/pointers gratefully received... Solving my problem will stop me tearing out more of my hair which will also make my wife happy......
UPDATE
Rather than filling the comments....
Thanks to @Vor and @rutter, I've confirmed that the user running logstash can read/write to the logstash.log file.
I've run the agent with -vv and the logs are populated with e.g.:
{:timestamp=>"2013-12-12T06:27:59.754000+0100", :message=>"config LogStash::Outputs::Redis/@host = [\"XX.XX.XX.XX\"]", :level=>:debug, :file=>"/opt/logstash/logstash.jar!/logstash/config/mixin.rb", :line=>"104"}
I then input random text into the terminal and get stdout results. However, I do not see anything in the logs until AFTER terminating the logstash agent. After the agent is terminated, I get lines like these in the logstash.log:
{:timestamp=>"2013-12-12T06:27:59.835000+0100", :message=>"Pipeline started", :level=>:info, :file=>"/opt/logstash/logstash.jar!/logstash/pipeline.rb", :line=>"69"}
{:timestamp=>"2013-12-12T06:29:22.429000+0100", :message=>"output received", :event=>#<LogStash::Event:0x77962b4d @cancelled=false, @data={"message"=>"test", "@timestamp"=>"2013-12-12T05:29:22.420Z", "@version"=>"1", "type"=>"example", "host"=>"Ubuntu-1204-precise-64-minimal"}>, :level=>:info, :file=>"(eval)", :line=>"16"}
{:timestamp=>"2013-12-12T06:29:22.461000+0100", :level=>:debug, :host=>"XX.XX.XX.XX", :port=>6379, :timeout=>5, :db=>0, :file=>"/opt/logstash/logstash.jar!/logstash/outputs/redis.rb", :line=>"230"}
But while I do get messages in stdout, I get nothing in redis on the other server. I can, however, telnet to the correct port on the other server, and I get "ping/PONG" in telnet, so redis on the other server is working. And there are no errors etc. in the redis logs.
It looks to me very much like the redis plugin on the logstash shipper agent is not working as expected, but for the life of me, I can't see where the breakdown is coming from.
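One way to narrow this down is to ask redis on the web server directly whether anything is arriving in the "logstash" list. The sketch below is only an illustration under some assumptions: redis is unauthenticated on port 6379, the indexer is stopped while checking (a running indexer pops events off the list, so the length may read 0 even when delivery works), and the function names are made up for this example rather than taken from logstash or redis.

```python
import socket


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def parse_integer_reply(reply: bytes) -> int:
    """Parse a RESP integer reply, i.e. ':' followed by digits and CRLF."""
    if not reply.startswith(b":"):
        raise ValueError(f"unexpected reply: {reply!r}")
    return int(reply[1:].strip())


def list_length(host: str, key: str = "logstash", port: int = 6379) -> int:
    """Return LLEN for `key`; needs a reachable redis server."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(encode_command("LLEN", key))
        return parse_integer_reply(conn.recv(64))
```

Calling list_length("xx.xx.xx.xx") from the agent server a few times while typing test messages into stdin should show the number growing if the shipper is really delivering events.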
Related
I'm using syslog-ng 3.37.1 on a VMware Photon 3.0 virtual appliance (preconfigured VM). The appliance is configured to write logs into certain files under /var/log folder as well as to remote syslog servers (optional).
Logs from facility 'auth' and 'authpriv' are configured to write to /var/log/auth.log, as well as send it over to remote syslog server when enabled.
In addition, there are other messages as well from kernel, systemd services as well as other processes, configured to be processed via syslog-ng.
The issue is that logs from a few facilities (such as auth, authpriv, cron etc.) are not processed (received?) by syslog-ng initially. So SSH events and TTY login events are not logged to the file or to the remote server. However, many other events from the kernel, systemd and other processes are logged fine.
Below is the configuration for auth.log, that does not log in the first boot.
filter f_auth { facility(auth) or facility(authpriv); };
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };
I updated the filter as below, without any success:
filter f_auth {
    facility(auth) or facility(authpriv) or
    match('sshd' value('PROGRAM')) or match('systemd-logind' value('PROGRAM'));
};
In the journal I can see the relevant logs; for example, the command below shows the SSH logs.
journalctl -f -u sshd
An additional syslog-ng service restart or config reload during appliance startup does not fix this.
The log file /var/log/auth.log (and also the cron log etc.) stays at zero size during this time. The syslog-ng log looks fine too.
However, if I generate an auth facility event (say, an SSH/TTY login) and manually restart syslog-ng, all the log entries (including old events) are immediately written to the filesystem log (/var/log/auth.log) and also sent to the remote syslog server.
In syslog-ng.log I find the entry below when it starts working that way.
syslog-ng[481]: [date] Failed to seek journal to the saved cursor position; cursor='', error='Invalid argument (22)'
It makes me wonder if it is due to some bad cursor position. However, I can still see other systemd and kernel logs being logged fine. So, not sure.
What could be causing such behaviour? How can I ensure that syslog-ng is able to receive and process these logs without manual intervention?
Below is more detailed configuration for reference:
@version: 3.37
@include "scl.conf"
source s_local {
    system();
    internal();
    udp();
};
destination d_local {
    file("/var/log/messages");
    file("/var/log/messages-kv.log" template("$ISODATE $HOST $(format-welf --scope all-nv-pairs)\n") frac-digits(3));
};
log {
    source(s_local);
    # uncomment this line to open port 514 to receive messages
    #source(s_network);
    destination(d_local);
};
filter f_auth {
    facility(auth) or facility(authpriv); # Also tried facility (auth, authpriv)
};
destination authlog { file("/var/log/auth.log" perm(0600)); };
log { source(s_local); filter(f_auth); destination(authlog); };
destination d_kern { file("/dev/console" perm(0600)); };
filter f_kern { facility(kern); };
log { source(s_local); filter(f_kern); destination(d_kern); };
destination d_cron { file("/var/log/cron" perm(0600)); };
filter f_cron { facility(cron); };
log { source(s_local); filter(f_cron); destination(d_cron); };
destination d_syslogng { file("/var/log/syslog-ng.log" perm(0600)); };
filter f_syslogng { program(syslog-ng); };
log { source(s_local); filter(f_syslogng); destination(d_syslogng); };
# A few more of above kind of configuration follows here.
# Add configuration files that have remote destination, filter and log configuration for remote servers
@include "remote/*.conf"
As can be seen, /var/log/auth.log should hold logs from auth facility, but the log remains blank until subsequent restart of syslog-ng after a syslog config change (via API) or manual login into the system. However, triggering automated restart of syslog-ng using cron (without additional syslog config change) does not help.
Any thoughts, suggestions?
This is probably caused by your real time clock going backwards. The notification mechanism in libsystemd does not work in this case.
There's a proof-of-concept patch in this syslog-ng issue:
https://github.com/syslog-ng/syslog-ng/issues/2836
I've since increased the priority of fixing that problem, as it is causing issues more often than I anticipated.
As a workaround, synchronize the time for your VM, preferably so that boot waits for a sync, and then keep the time synchronized with NTP.
I'm trying to migrate a Docker-based Redis container to AWS ElastiCache. I have the Redis instance running and can connect via the Redis CLI, but when I set up Logstash with the following:
input {
  redis {
    host => "redis<domain>.cache.amazonaws.com"
    data_type => "list"
    key => "logstash"
    codec => msgpack
  }
}
It explodes with this:
[2022-02-02T13:52:27,575][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/usr/share/logstash/logstash.conf"], :thread=>"#<Thread:0x547a32a1 run>"}
[2022-02-02T13:52:28,685][INFO ][logstash.javapipeline ][main] Pipeline Java execution initialization time {"seconds"=>1.11}
[2022-02-02T13:52:28,701][INFO ][logstash.inputs.redis ][main] Registering Redis {:identity=>"redis://@redis<domain>.cache.amazonaws.com:6379/0 list:logstash"}
[2022-02-02T13:52:28,709][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2022-02-02T13:52:28,823][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2022-02-02T13:52:28,837][ERROR][logstash.inputs.redis ][main][08c8cf37082e202fd617f2bc3c642b630c437b5e58521b08cd412f29ed9a10e1] Unexpected error {:message=>"invalid uri scheme ''", :exception=>ArgumentError, :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/redis-4.5.1/lib/redis/client.rb:473:in `_parse_options'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/redis-4.5.1/lib/redis/client.rb:94:in `initialize'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/redis-4.5.1/lib/redis.rb:65:in `initialize'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-redis-3.7.0/lib/logstash/inputs/redis.rb:129:in `new_redis_instance'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-redis-3.7.0/lib/logstash/inputs/redis.rb:134:in `connect'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-redis-3.7.0/lib/logstash/inputs/redis.rb:186:in `list_runner'", "org/jruby/RubyMethod.java:131:in `call'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-input-redis-3.7.0/lib/logstash/inputs/redis.rb:87:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:409:in `inputworker'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:400:in `block in start_input'"]}
but when I then use this configuration to provide the uri:
input {
  redis {
    host => "redis://redis<domain>.cache.amazonaws.com"
    data_type => "list"
    key => "logstash"
    codec => msgpack
  }
}
I get this:
[2022-02-02T13:57:10,475][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/usr/share/logstash/logstash.conf"], :thread=>"#<Thread:0x20738737 run>"}
[2022-02-02T13:57:11,586][INFO ][logstash.javapipeline ][main] Pipeline Java execution initialization time {"seconds"=>1.11}
[2022-02-02T13:57:11,600][INFO ][logstash.inputs.redis ][main] Registering Redis {:identity=>"redis://@redis://redis<domain>.cache.amazonaws.com:6379/0 list:logstash"}
[2022-02-02T13:57:11,605][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2022-02-02T13:57:11,724][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2022-02-02T13:57:11,843][WARN ][logstash.inputs.redis ][main][08c8cf37082e202fd617f2bc3c642b630c437b5e58521b08cd412f29ed9a10e1] Redis connection error {:message=>"Error connecting to Redis on redis://redis<domain>.cache.amazonaws.com:6379 (SocketError)", :exception=>Redis::CannotConnectError}
The latter error looks saner, but the Registering Redis line looks messed up. Neither provides any insight as to why it can't connect, yet I can connect to the Redis instance from the pod. What am I missing here?
It turned out that, on top of the config, I also had an environment variable called REDIS_URL set, which was gazumping the config because the Redis client reads it.
From the readme, I finally discovered:
By default, the client will try to read the REDIS_URL environment
variable and use that as URL to connect to. The above statement is
therefore equivalent to setting this environment variable and calling
Redis.new without arguments.
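A quick pre-flight check for this kind of clash can be sketched as below. It only inspects the environment of the process it runs in, so it would need to run in the same pod/container environment as Logstash. The function name is hypothetical, and the list of accepted schemes (redis, rediss, unix) is my assumption about what the Ruby client accepts, not something stated in the readme excerpt above.

```python
import os
from urllib.parse import urlparse


def describe_redis_url(env=os.environ) -> str:
    """Report whether REDIS_URL is set and whether its scheme looks usable."""
    url = env.get("REDIS_URL")
    if url is None:
        return "REDIS_URL is not set"
    scheme = urlparse(url).scheme
    # Assumed set of schemes; adjust if your client version differs.
    if scheme not in ("redis", "rediss", "unix"):
        return f"REDIS_URL is set but has unexpected scheme {scheme!r}: {url!r}"
    return f"REDIS_URL is set: {url!r}"


if __name__ == "__main__":
    print(describe_redis_url())
```

If the variable turns out to be set and is not meant for Logstash, unsetting it in the container spec is the simplest fix.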
I'm moving data between two ES clusters which are separate. I've added S3 as a common area and have two logstash instances, one that writes to S3 from Elasticsearch and another that reads S3 and loads Elasticsearch.
The problem is that only one document from each index is loaded. The output file written by s3 output plugin is a single long line, with many json documents all run together without commas or opening or closing square brackets for the array. For example, instead of [{"id":1},{"id":2},{"id":3}] the output is writing files which read {"id":1}{"id":2}{"id":3}. In which case only {"id":1} is read by logstash using s3 as an input.
The configuration to go to s3 is:
input {
  elasticsearch {
    hosts => ["${ES_HOST}:${ES_PORT}"]
    index => "${ES_INDEX}"
    password => "${ES_PASS}"
    ssl => "true"
    user => "${ES_USER}"
  }
}
output {
  s3 {
    bucket => "${S3_BUCKET}"
    encoding => "gzip"
    codec => "json"
    prefix => "${S3_PREFIX}/${ES_INDEX}"
    region => "ap-southeast-2"
  }
}
The configuration reading S3 is:
input {
  s3 {
    bucket => "${S3_BUCKET}"
    codec => "json"
    prefix => "${S3_PREFIX}/${ES_INDEX}/"
    region => "ap-southeast-2"
    watch_for_new_files => false
  }
}
output {
  stdout { }
}
In both cases the ${} variables are set in the environment (bash shell).
Both servers are running logstash 7.6.0
PS: I don't think they are important, but the stdout log from logstash says:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/home/ec2-user/logstash-7.6.0/logstash-core/lib/jars/jruby-complete-9.2.9.0.jar) to method sun.nio.ch.NativeThread.signal(long)
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to /home/ec2-user/logstash-7.6.0/logs which is now configured via log4j2.properties
[2020-03-09T01:10:35,168][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-03-09T01:10:35,353][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.6.0"}
[2020-03-09T01:10:37,813][INFO ][org.reflections.Reflections] Reflections took 48 ms to scan 1 urls, producing 20 keys and 40 values
[2020-03-09T01:10:53,476][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2020-03-09T01:10:53,515][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/home/ec2-user/kibana/from_s3.conf"], :thread=>"#<Thread:0x1364485f run>"}
[2020-03-09T01:10:54,561][INFO ][logstash.inputs.s3 ][main] Registering s3 input {:bucket=>"my-bucket-here", :region=>"ap-southeast-2"}
[2020-03-09T01:10:55,334][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-03-09T01:10:55,435][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-03-09T01:10:55,833][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2020-03-09T01:10:57,507][INFO ][logstash.inputs.s3 ][main] Using default generated file for the sincedb {:filename=>"/home/ec2-user/logstash-7.6.0/data/plugins/inputs/s3/sincedb_1906e463a09b003733b719c08277c793"}
/home/ec2-user/logstash-7.6.0/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
my-document-here
}
[2020-03-09T01:10:59,587][INFO ][logstash.runner ] Logstash shut down.
PPS: deleting the since DB allows the one row to load, the file is not changing.
Use the following in your s3 output plugin:
codec => "json_lines"
This delimits each event with a newline.
The s3 input plugin can still use the "json" codec.
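To see why the delimiter matters, here is a small Python sketch (not Logstash code; the function names are made up for illustration). A standard JSON parser handles one document per line without trouble, whereas documents run together on a single line need a decoder that tracks where each one ends.

```python
import json


def parse_json_lines(text: str) -> list:
    """Parse newline-delimited JSON, one document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_concatenated(text: str) -> list:
    """Recover documents that were written back to back with no delimiter."""
    decoder = json.JSONDecoder()
    docs, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here
        if text[pos].isspace():
            pos += 1
            continue
        doc, pos = decoder.raw_decode(text, pos)
        docs.append(doc)
    return docs
```

The second function could be used to salvage files already written with the "json" codec, assuming they are gunzipped first.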
I've set up logstash on a CentOS server to read our production web servers' IIS logs via a CIFS mount.
input {
  file {
    path => "/mnt/remote/server*/W3SVC1/ex*.log"
    type => "w3c"
  }
}
filter {
  grok {
    type => "w3c"
    match => [ "message", "%{HOST:hostname} %{IP:hostip} %{WORD:method} %{URIPATH:request} (?:%{NOTSPACE:param}|-) %{NUMBER:port} (?:%{USER:username}|-) %{IPORHOST:clientip} %{NOTSPACE:httpver} (?:%{NOTSPACE:agent}|-) %{NOTSPACE:cookies} %{NOTSPACE:referer} %{IPORHOST:webhostname} %{NUMBER:status} %{NUMBER:time-taken}" ]
  }
}
But after reading an initial burst of logs, it just dies.
(The elevated data afterwards is from a different data source)
I tried a hack from Jordan from this thread, but it doesn't seem to work:
tail -f /mnt/remote/server1/W3SVC1/ex130913.log | java -jar logstash.jar
We are purposely avoiding installing Java/Logstash on our front-end web servers because of security issues. So, can you think of a way to make this work?
I have a host running syslog-ng. It does all its stuff locally fine (creating log files etc.). However, I would like to forward ALL of its logs to a remote machine, specifically to one facility on the remote machine (local4). I tried playing around with rewrite (set-facility) and templates within the destination (syntax errors), but to no avail.
destination remote_server {
    udp("172.18.192.8" port (514));
    udp("172.18.192.9" port (514));
};
rewrite r_local4 {
    set-facility(local4);
};
filter f_alllogs {
    level (debug...emerg);
};
log {
    source(local);
    filter(f_alllogs);
    rewrite(r_local4)
    destination(remote_server);
};
AFAIK, currently it is not possible to modify the facility of a message in syslog-ng.
Is there a special reason you want to do it?
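For context, on the wire the facility is not a separate field: it is folded into the <PRI> number at the start of each message as facility * 8 + severity, and local4 is facility 20. The Python sketch below is not a syslog-ng feature and the function name is hypothetical. It only illustrates what an external relay would have to do to re-tag a raw message while keeping its severity.

```python
import re

LOCAL4 = 20  # numeric facility code for local4

_PRI = re.compile(r"^<(\d{1,3})>")


def set_facility(line: str, facility: int = LOCAL4) -> str:
    """Rewrite the <PRI> header of a raw syslog line, keeping its severity."""
    match = _PRI.match(line)
    if match is None:
        raise ValueError("line has no <PRI> header")
    severity = int(match.group(1)) % 8  # low three bits carry the severity
    return f"<{facility * 8 + severity}>" + line[match.end():]
```

For example, a message tagged <86> (authpriv, severity info) would come out tagged <166> (local4, severity info).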