Strange "pattern not match" error on fluentd - apache

Can someone tell me if it's normal that fluentd raises this error in the td-agent.log file?
2015-07-31 13:15:19 +0000 [warn]: pattern not match: "- - - [31/Jul/2015:13:15:19 +0000] GET http://172.31.108.218/ HTTP/1.1 200 0 \"-\" \"ELB-HealthChecker/1.0\""
Yet this is a well-formatted apache2 log line:
- - - [31/Jul/2015:13:15:19 +0000] GET http://172.31.108.218/ HTTP/1.1 200 0 \"-\" \"ELB-HealthChecker/1.0\"
And here is the source configuration:
<source>
type tail
format apache2
path /var/log/varnish/varnishncsa.log
pos_file /var/log/td-agent/tmp/access.log.pos
tag "apache2.varnish-access"
</source>
I can't figure out what's wrong with the above.

Instead of finding some way to filter out logs from ELB-HealthChecker, you can set your own format for the Apache access log that is a little more flexible in terms of the first couple fields. I ran into this same error when getting /server-status checks from collectd (using it to monitor for SignalFx).
Setting the source like so:
<source>
type tail
format /^(?<host>[^ ]*(?:\s+[^ ]+)*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
path /var/log/apache2/access.log
pos_file /var/log/td-agent/apache2.pos
tag apache2.log
</source>
Allows both log lines like:
172.18.0.2:80 127.0.0.1 - - [08/Aug/2017:19:58:38 +0000] "GET /server-status?auto HTTP/1.1" 200 508 "-" "collectd/5.7.2.sfx0"
As well as:
192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"
You can test format regex matching using Fluentular.
See related: Fluentd apache log format with multiple host ip
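If Fluentular is not at hand, the same check can be sketched offline. The snippet below is a minimal Python port of the format above, assuming only that Ruby's (?<name>...) groups are rewritten as Python's (?P<name>...); the names APACHE_FLEX and parse are made up for this illustration and are not part of fluentd.

```python
import re

# Python port of the fluentd format above: Ruby's (?<name>...) groups
# become (?P<name>...); the rest of the pattern is unchanged.
APACHE_FLEX = re.compile(
    r'^(?P<host>[^ ]*(?:\s+[^ ]+)*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" (?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^\"]*)" "(?P<agent>[^\"]*)")?$'
)


def parse(line):
    """Return the named fields as a dict, or None if the line does not match."""
    m = APACHE_FLEX.match(line)
    return m.groupdict() if m else None


# The two sample lines from this answer.
collectd_line = ('172.18.0.2:80 127.0.0.1 - - [08/Aug/2017:19:58:38 +0000] '
                 '"GET /server-status?auto HTTP/1.1" 200 508 "-" "collectd/5.7.2.sfx0"')
plain_line = ('192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] '
              '"GET / HTTP/1.1" 200 777 "-" "Opera/12.0"')

for line in (collectd_line, plain_line):
    fields = parse(line)
    print(fields["host"], fields["code"], fields["agent"])
```

Note that the pattern still expects the request to be wrapped in double quotes, so a line whose request is written without quotes would come back as None here.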

The problem is that these ELB-HealthChecker log lines have an empty referer IP field, so they don't match fluentd's apache2 log format.
So the way to fix that is to filter out log lines with the ELB-HealthChecker user agent.

Related

Springboot Webflux accesslog: What are the two numbers at the end please?

A small question about how to interpret a Spring Boot WebFlux app's access log.
Currently, in my access logs, I can see:
2021-07-31 13:46:19.913 INFO [service,,] 10 --- [or-http-epoll-1] reactor.netty.http.server.AccessLog : ip - - [31/Jul/2021:13:46:19 +0000] "GET /health HTTP/1.1" 200 3349 6
2021-07-31 13:47:18.531 INFO [service,,] 10 --- [or-http-epoll-2] reactor.netty.http.server.AccessLog : ip - - [31/Jul/2021:13:47:18 +0000] "GET /health/liveness HTTP/2.0" 200 3312 8
2021-07-31 13:47:33.347 INFO [service,,] 10 --- [or-http-epoll-2] reactor.netty.http.server.AccessLog : ip - - [31/Jul/2021:13:47:33 +0000] "GET /health HTTP/1.1" 200 3349 11
I understand the 200 is probably my HTTP response status, since I return HTTP 200.
But I am having a hard time understanding what the last two numbers are.
3349 6
3312 8
3349 11
Any help?
Thank you
It depends on the log format definition, but it looks like the larger number is the response size in bytes and the smaller one is the processing time of the request in ms.
I'll look at the documentation to see where I'd expect to find the log format definition for a Spring WebFlux app. I'd expect the format to be defined in a similar way to httpd access logs (documentation for those is at https://httpd.apache.org/docs/2.4/logs.html).
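To make that reading concrete, here is a small Python sketch that pulls the last three fields off a line like the ones in the question. It assumes, as the answer guesses, that the line ends with the quoted request followed by status, response size, and duration in ms; TAIL and parse_tail are hypothetical names for this illustration, and the real format should be confirmed against your own logging configuration.

```python
import re

# Assumption: the line ends with `"METHOD URI PROTOCOL" status size duration`,
# which is how the answer above reads it. Verify against your own log format.
TAIL = re.compile(
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) (?P<duration>\d+)$'
)


def parse_tail(line):
    """Extract uri, status, size and duration from the end of an access log line."""
    m = TAIL.search(line)
    if m is None:
        return None
    size = m.group("size")
    return {
        "uri": m.group("uri"),
        "status": int(m.group("status")),
        "bytes": None if size == "-" else int(size),  # "-" would mean no size logged
        "duration_ms": int(m.group("duration")),
    }


# First sample line from the question.
sample = ('2021-07-31 13:46:19.913 INFO [service,,] 10 --- [or-http-epoll-1] '
          'reactor.netty.http.server.AccessLog : ip - - [31/Jul/2021:13:46:19 +0000] '
          '"GET /health HTTP/1.1" 200 3349 6')
print(parse_tail(sample))
```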

I want to exclude some line in the logs read by filebeat and also want to add a tag by using processors in filebeat but it is not working

I want to remove the log lines containing the word "HealthChecker" from the log below, and also add some tags to the payload to be sent to logstash.
My logs:
18.37.33.73 - - [18/Apr/2019:14:49:53 +0530] "GET /products?sort=date&direction=desc HTTP/1.1" 200 8543 "https://codingexplained.com/products/view/124" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A300 Safari/602.1"
20.4.2.88 - - [18/Apr/2019:14:49:54 +0530] "GET / HTTP/1.1" 200 100332 "-" "ELB-HealthChecker/2.0"
18.37.33.73 - - [18/Apr/2019:14:49:55 +0530] "GET /products?sort=date&direction=desc HTTP/1.1" 200 8543 "https://codingexplained.com/products/view/124" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A300 Safari/602.1"
20.4.2.88 - - [18/Apr/2019:14:49:56 +0530] "GET / HTTP/1.1" 200 100332 "-" "ELB-HealthChecker/2.0"
I have already tried putting this configuration under processors in the filebeat.yml file, but it still does not work.
My filebeat.yml file:
filebeat.modules:
- module: apache
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/location/apache_access_2017-09-28.log"]
    # Input configuration (advanced). Any input configuration option
    # can be added under this section.
    processors:
    - add_tags:
        tags: [web, production]
        target: "environment"
    - drop_event:
        when:
          contains:
            message: "ELB-HealthChecker"
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: false
output.console:
  # Boolean flag to enable or disable the output module.
  enabled: true
  codec.json:
    pretty: true
YAML indentation is to blame in your case. "processors" is a top-level element, so this would work:
filebeat.modules:
- module: apache
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/location/apache_access_2017-09-28.log"]
    # Input configuration (advanced). Any input configuration option
    # can be added under this section.
processors:
- add_tags:
    tags: [web, production]
    target: "environment"
- drop_event:
    when:
      contains:
        message: "ELB-HealthChecker"
When in doubt about indentation, refer to the filebeat.full.yml file.
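As a rough illustration of what the two processors are meant to do once they are picked up, here is a plain-Python sketch. It is not Filebeat code: the function names are invented for this example, and it only mimics the intent of add_tags (with target "environment") followed by drop_event with a contains condition on the message field.

```python
# Plain-Python illustration of the intended processor behaviour; not Filebeat code.

def add_tags(event, tags, target="tags"):
    """Append tags to the list stored under `target`, creating it if needed."""
    event.setdefault(target, []).extend(tags)
    return event


def should_drop(event, field, needle):
    """True when the `contains` condition holds, i.e. the event is dropped."""
    return needle in event.get(field, "")


def run_processors(lines):
    """Apply add_tags, then drop_event, in the same order as the config."""
    kept = []
    for line in lines:
        event = {"message": line}
        add_tags(event, ["web", "production"], target="environment")
        if should_drop(event, "message", "ELB-HealthChecker"):
            continue
        kept.append(event)
    return kept


# Two of the log lines from the question: one regular request, one health check.
lines = [
    '18.37.33.73 - - [18/Apr/2019:14:49:53 +0530] '
    '"GET /products?sort=date&direction=desc HTTP/1.1" 200 8543 '
    '"https://codingexplained.com/products/view/124" '
    '"Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 '
    '(KHTML, like Gecko) Version/10.0 Mobile/14A300 Safari/602.1"',
    '20.4.2.88 - - [18/Apr/2019:14:49:54 +0530] "GET / HTTP/1.1" 200 100332 "-" '
    '"ELB-HealthChecker/2.0"',
]

for event in run_processors(lines):
    print(event["environment"], event["message"][:40])
```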

Only a dash written in Apache access log

A normal log looks like this:
111.111.111.111 222.222.222.222 - - [06/Jun/2017:02:19:00 +0900] "GET /monitor/l7check.nhn HTTP/1.1" 200 4 1222 "-" "-"
but some log lines look like this:
111.111.111.111 333.333.333.333 - - [06/Jun/2017:02:18:58 +0900] "-" 408 - 13 "-" "-"
I can't understand the meaning of this log line.
Why does it have only a dash instead of a request line such as GET with a URL?
Is it possible for an entry to be logged without a URL being requested?
https://www.rfc-editor.org/rfc/rfc7231#section-6.5.7
6.5.7. 408 Request Timeout
The 408 (Request Timeout) status code indicates that the server did not receive a complete request message within the time that it was prepared to wait. A server SHOULD send the "close" connection option (Section 6.1 of [RFC7230]) in the response, since 408 implies that the server has decided to close the connection rather than continue waiting. If the client has an outstanding request in transit, the client MAY repeat that request on a new connection.
So, the client connected, but did not send any HTTP request. The server waited, and eventually closed the connection.
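To find such entries in bulk, a short Python sketch like the following can flag them. It assumes the layout shown in the question, where the quoted request immediately follows the bracketed timestamp and is followed by the status code; the names used here are made up for the example.

```python
import re

# Assumes the quoted request comes right after the [timestamp] and is
# followed by a three-digit status, as in the log lines in the question.
REQUEST_AND_STATUS = re.compile(r'\] "(?P<request>[^"]*)" (?P<status>\d{3}) ')


def is_timeout_without_request(line):
    """True for entries logged as "-" with status 408 (no request was received)."""
    m = REQUEST_AND_STATUS.search(line)
    return bool(m) and m.group("request") == "-" and m.group("status") == "408"


normal = ('111.111.111.111 222.222.222.222 - - [06/Jun/2017:02:19:00 +0900] '
          '"GET /monitor/l7check.nhn HTTP/1.1" 200 4 1222 "-" "-"')
timed_out = ('111.111.111.111 333.333.333.333 - - [06/Jun/2017:02:18:58 +0900] '
             '"-" 408 - 13 "-" "-"')

for line in (normal, timed_out):
    print(is_timeout_without_request(line))
```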

Rails Webrick monitoring functionality in apache2?

Hi, I'm wondering if it is possible to get the same WEBrick monitoring functionality in apache2. Doing some quick research on this site and Google, I found that I can use tail -f to monitor the log in real time. But the info I need is not displayed in access.log.
On WEBrick I can see the complete request that came to the server, including all POST parameters sent to it. I'm developing a PhoneGap application that targets a production server running Apache, and I need to double-check my REST requests to the server (exactly as I did in my development environment in Rails with WEBrick). That's why tail -f on access.log doesn't fit my needs.
Does anyone have a solution?
Thanks in advance.
I've found the solution, and it's a silly one. I was looking at the apache2 other_vhosts_access.log file, and the info I was getting was:
migtrace.com:80 89.131.219.51 - - [16/Sep/2013:12:14:14 +0200] "POST /api/reports.json HTTP/1.1" 200 964 "-" "curl/7.29.0"
But if I tail -f the rails production.log what I get is:
Started POST "/api/reports.json" for 89.131.219.51 at 2013-09-16 12:14:14 +0200
Processing by Api::ReportsController#create as JSON
Parameters: {"report"=>{"geo"=>["41.2058334", "1.697777"], "patient_id"=>"X", "patient_token"=>"XXXXXXXXXX", "lunch"=>"{\"pasta\",\"cheese\",\"chocolate\"}", "sex"=>"true"}}
Completed 200 OK in 522ms (Views: 2.3ms | ActiveRecord: 259.3ms)
That is exactly what I need.

Publishing Mercurial Repository on apache problem

I have a repository here: http://repos.joomlaguruteam.com/
I can browse it but I can't clone it.
Every time I try to clone it I get this error:
hg clone http://repos.joomlaguruteam.com/hello
destination directory: hello
requesting all changes
abort: HTTP Error 404: Not Found
and the access log shows this:
115.5.95.59 - - [10/Feb/2011:04:20:33 -0600] "GET /hello?pairs=0000000000000000000000000000000000000000-0000000000000000000000000000000000000000&cmd=between HTTP/1.1" 200 1 "-" "mercurial/proto-1.0"
115.5.95.59 - - [10/Feb/2011:04:20:34 -0600] "GET /hello?cmd=heads HTTP/1.1" 200 41 "-" "mercurial/proto-1.0"
115.5.95.59 - - [10/Feb/2011:04:20:34 -0600] "GET /hello?cmd=changegroup&roots=0000000000000000000000000000000000000000 HTTP/1.1" 404 597 "-" "mercurial/proto-1.0"
What is the problem?
I really hope somebody can help me with this.
Thanks,
Yuan
I was able to clone this by using uncompressed transfer.
If you are using TortoiseHg, check the "Use uncompressed transfer" box.
If you are using the command line, use the --uncompressed flag:
hg clone --uncompressed http://repos.joomlaguruteam.com/hello