I want to exclude some lines from the logs read by filebeat and also add a tag using processors, but it is not working - filebeat

I want to remove the log lines containing the word "HealthChecker" from the log below, and also add some tags to the payload to be sent to Logstash.
My logs:
18.37.33.73 - - [18/Apr/2019:14:49:53 +0530] "GET /products?sort=date&direction=desc HTTP/1.1" 200 8543 "https://codingexplained.com/products/view/124" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A300 Safari/602.1"
20.4.2.88 - - [18/Apr/2019:14:49:54 +0530] "GET / HTTP/1.1" 200 100332 "-" "ELB-HealthChecker/2.0"
18.37.33.73 - - [18/Apr/2019:14:49:55 +0530] "GET /products?sort=date&direction=desc HTTP/1.1" 200 8543 "https://codingexplained.com/products/view/124" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0 like Mac OS X) AppleWebKit/602.1.38 (KHTML, like Gecko) Version/10.0 Mobile/14A300 Safari/602.1"
20.4.2.88 - - [18/Apr/2019:14:49:56 +0530] "GET / HTTP/1.1" 200 100332 "-" "ELB-HealthChecker/2.0"
I have already tried the following processors configuration in my filebeat.yml file, but it still does not work.
My filebeat.yml file:
filebeat.modules:
- module: apache
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/location/apache_access_2017-09-28.log"]
    # Input configuration (advanced). Any input configuration option
    # can be added under this section.
    processors:
      - add_tags:
          tags: [web, production]
          target: "environment"
      - drop_event:
          when:
            contains:
              message: "ELB-HealthChecker"
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
  # Change to true to enable this input configuration.
  enabled: false

output.console:
  # Boolean flag to enable or disable the output module.
  enabled: true
  codec.json:
    pretty: true

The YAML is to blame in your case: processors is a top-level element, so this would work:
filebeat.modules:
- module: apache
  access:
    enabled: true
    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/location/apache_access_2017-09-28.log"]
    # Input configuration (advanced). Any input configuration option
    # can be added under this section.

processors:
  - add_tags:
      tags: [web, production]
      target: "environment"
  - drop_event:
      when:
        contains:
          message: "ELB-HealthChecker"
When in doubt about indentation, refer to the filebeat.full.yml reference file.
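As a quick sanity check of what those two processors should do, here is a minimal Python sketch. This is not filebeat itself, and the dict shape is only illustrative; it just mimics dropping the ELB-HealthChecker lines and attaching the tags:

```python
# Minimal emulation of the intended drop_event + add_tags behaviour.
# The event dict shape below is illustrative, not filebeat's real schema.
log_lines = [
    '18.37.33.73 - - [18/Apr/2019:14:49:53 +0530] "GET /products HTTP/1.1" 200 8543',
    '20.4.2.88 - - [18/Apr/2019:14:49:54 +0530] "GET / HTTP/1.1" 200 100332 "-" "ELB-HealthChecker/2.0"',
]

events = [
    {"message": line, "environment": ["web", "production"]}  # add_tags
    for line in log_lines
    if "ELB-HealthChecker" not in line                       # drop_event / contains
]

print(len(events))  # 1: the health-checker line was dropped
```

With the processors at the top level of filebeat.yml, the console output should likewise show only the non-health-checker events, each carrying the environment tags.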

Related

nginx-proxy/nginx-proxy with SSL

I'm really new to all this reverse-proxy stuff, and I hoped I could get around learning how it works by using this quite popular Docker container: https://github.com/nginx-proxy/nginx-proxy
I'm trying to set up a few Docker instances behind the nginx proxy. The domains are accessible without HTTPS, but for some reason SSL does not seem to work. You can try it yourself:
http://foundry.hahn-webdesign.de/ => works
https://foundry.hahn-webdesign.de/ => 500 - Internal Server Error
Here is my example project which I can't get to work.
Docker Compose File:
version: "3.8"
services:
  nginx-proxy:
    image: nginxproxy/nginx-proxy
    container_name: nginx-proxy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx-proxy/certs:/etc/nginx/certs/:ro
      - ./nginx-proxy/vhost:/etc/nginx/vhost.d/
      - ./nginx-proxy/html:/usr/share/nginx/html/
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - dhparam:/etc/nginx/dhparam
  acme-companion:
    image: nginxproxy/acme-companion
    container_name: acme-companion
    restart: unless-stopped
    volumes:
      - ./nginx-proxy/html:/usr/share/nginx/html/
      - ./nginx-proxy/vhost:/etc/nginx/vhost.d/
      - ./nginx-proxy/certs:/etc/nginx/certs/:rw
      - ./nginx-proxy/acme:/etc/acme.sh
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      - DEFAULT_EMAIL=admin#hahn-webdesign.de
      - NGINX_PROXY_CONTAINER=nginx-proxy
  whoami:
    image: jwilder/whoami
    container_name: foundry
    restart: unless-stopped
    hostname: foundry
    domainname: hahn-webdesign.de
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./nginx-proxy/certs:/etc/nginx/certs
    expose:
      - "8000"
    environment:
      - VIRTUAL_HOST=foundry.hahn-webdesign.de
      - VIRTUAL_PORT=8000
I find the documentation lacking when it comes to SSL examples. Maybe that's because I'm lacking knowledge of how the nginx reverse proxy works at its core.
The directories are all working fine and are accessible.
Certificates are valid and created by the acme-companion.
Can someone please tell me what I have to do to make SSL work in this configuration?
Logs from the docker container when accessing both protocols (http -> https):
nginx.1 | foundry.hahn-webdesign.de 95.90.215.63 - - [29/Dec/2021:11:25:43 +0000] "GET / HTTP/1.1" 200 12 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0" "172.22.0.6:8000",
nginx.1 | foundry.hahn-webdesign.de 95.90.215.63 - - [29/Dec/2021:11:25:48 +0000] "GET / HTTP/2.0" 500 177 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0" "-"
I found the reason:
version: "3.8"
services:
  whoami:
    image: jwilder/whoami
    container_name: foundry
    restart: unless-stopped
    hostname: foundry
    domainname: hahn-webdesign.de
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./nginx-proxy/certs:/etc/nginx/certs
    expose:
      - "8000"
    environment:
      - VIRTUAL_HOST=foundry.hahn-webdesign.de
      - VIRTUAL_PORT=8000
      - LETSENCRYPT_HOST=foundry.hahn-webdesign.de
An existing certificate is not sufficient. If you create a valid certificate but then remove the container that created it, the symlinks vanish. So if you use a dummy container, as suggested in the documentation, you will run into exactly this behaviour.
Adding LETSENCRYPT_HOST recreates the symlinks. So as long as the real containers are reachable, you don't even have to use the dummies.
This environment variable is what actually tells the companion to request a certificate when necessary.

Apache cgi script invoked from browser but not embedded device

I am working on a project that involves two embedded devices, let's call them A and B. Device A is the controller and B is being controlled. My goal is to make an emulator for device B, i.e., something that acts like B so A thinks it's controlling B, while in reality it is controlling my emulator. I don't control A and can't change it.
Control occurs via the controller issuing GET requests that invoke various CGI scripts, so the plan is to install Apache on "my" device, set up CGI, and replicate the various scripts. I am running Apache 2.4.18 on Ubuntu 16.04.5 and have configured it so that it successfully runs the various scripts depending on the URL. As an example, one of the scripts is called 'man_session' and a typical URL issued by device A looks like this: http://192.168.0.14/cgi-bin/man_session?command=get&page=122
I have built a C/C++ program named 'man_session' and have successfully configured Apache to invoke it when this URL is submitted. I can see this in the Apache log:
192.168.0.2 - - [24/Jan/2019:14:38:38 +0000] "GET /cgi-bin/man_session?command=get&page=122 HTTP/1.1" 200 206 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
Also, my script writes to stderr and I can see the output in the log file:
[Thu Jan 24 14:46:10.850123 2019] [cgi:error] [pid 23346:tid 4071617584] [client 192.168.0.2:62339] AH01215: Received man_session command 'command=get&page=122': /home/pi/cgi-bin/man_session
So far so good. The problem I am having is that the script does not get invoked when device A makes the request, only when I make the request via a browser (both Chrome and Internet Explorer work) or curl. The browsers run on my Windows PC and curl runs on the embedded device "B" itself.
When I turn on device A, I can see the URL activity in the log, but the script does not get invoked. Below is a log entry showing a URL that does not invoke the 'man_session' script. It shows status code 400, which according to the HTTP specification means an error "due to malformed syntax". Other differences are the missing referrer and user-agent information and HTTP/1.0 vs HTTP/1.1, but I don't see why these would matter.
192.168.0.9 - - [24/Jan/2019:14:38:12 +0000] "GET /cgi-bin/man_session?command=get&page=7 HTTP/1.0" 400 0 "-" "-"
Note that device A is 192.168.0.9 and my PC is 192.168.0.2. What am I missing here? Why doesn't the above URL invoke the script the way it does when issued by a browser? Is there any place where I can get more information about why the 400 occurs in this case?
After a lot of back and forth, I finally figured out the issue. Steps taken:
Increased the log level to debug (instead of the default 'warn') in apache2.conf.
This caused the following error message to show up in the log:
[Sat Jan 26 02:47:56.974353 2019] [core:debug] [pid 15603:tid 4109366320] vhost.c(794): [client 192.168.0.9:61001] AH02415: [strict] Invalid host name '192.168.000.014'
After a bit of research, I added the following line to the apache2.conf file:
HttpProtocolOptions Unsafe
This fixed it and the scripts are now called as expected.
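For reference, the two apache2.conf changes together look like this (a sketch; note that HttpProtocolOptions Unsafe relaxes request validation server-wide, so consider whether that is acceptable for your deployment):

```
# apache2.conf
# Raising the log level is what surfaced the AH02415 vhost error.
LogLevel debug
# Accept non-strict requests, e.g. the zero-padded host 192.168.000.014
# that device A sends.
HttpProtocolOptions Unsafe
```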

Extract date time from Apache Combined log format using AWS Logs and Cloudwatch

We're using awslogs to collect Apache Combined formatted logs into CloudWatch. It's all capturing fine, but we're getting a "timestamp could not be parsed from message" error.
An example log entry:
::ffff:10.0.0.1 - blahblah [17/Aug/2017:20:31:07 +0000] "GET /favicon-16x16.png HTTP/1.1" 304 - "http://blahblah:3000/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"
Our config for this set of log files looks like this, including our datetime_format entry:
[access_logs]
log_group_name = cromwell
log_stream_name = react-172.31.43.245-access
file = /home/admin/aperian-react/log/*access.log
datetime_format = "%d/%b/%Y:%H%M:%S %z"
multi_line_start_pattern = ::ffff:
time_zone = UTC
encoding = ascii
As you can see, the datetime is mid-line. This is different from most examples for syslogs, etc. We could change our log format, but we'd prefer not to since they flow into other systems as well.
Our datetime_format string was missing a colon. 😒
datetime_format = "%d/%b/%Y:%H%M:%S %z" # wrong
datetime_format = "%d/%b/%Y:%H:%M:%S %z" # correct
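The difference is easy to verify locally, since Python's strptime uses the same format codes (a quick sketch against the timestamp from the example log line):

```python
from datetime import datetime

ts = "17/Aug/2017:20:31:07 +0000"  # timestamp from the example log entry

# The broken format ("%H%M" without the colon) fails to parse:
try:
    datetime.strptime(ts, "%d/%b/%Y:%H%M:%S %z")
except ValueError:
    print("broken format rejected")

# The corrected format parses cleanly:
parsed = datetime.strptime(ts, "%d/%b/%Y:%H:%M:%S %z")
print(parsed.isoformat())  # 2017-08-17T20:31:07+00:00
```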

Strange "pattern not match" error on fluentd

Can someone tell me if it's normal that fluentd raises this error in the td-agent.log file?
2015-07-31 13:15:19 +0000 [warn]: pattern not match: "- - - [31/Jul/2015:13:15:19 +0000] GET http://172.31.108.218/ HTTP/1.1 200 0 \"-\" \"ELB-HealthChecker/1.0\""
While this is a well-formatted apache2 log line:
- - - [31/Jul/2015:13:15:19 +0000] GET http://172.31.108.218/ HTTP/1.1 200 0 \"-\" \"ELB-HealthChecker/1.0\"
And here is the source configuration:
<source>
type tail
format apache2
path /var/log/varnish/varnishncsa.log
pos_file /var/log/td-agent/tmp/access.log.pos
tag "apache2.varnish-access"
</source>
I can't figure out what's wrong above.
Instead of finding some way to filter out the logs from the ELB-HealthChecker, you can set your own format for the Apache access log that is a little more flexible in the first couple of fields. I ran into this same error when receiving /server-status checks from collectd (using it to monitor with SignalFx).
Setting the source like so:
<source>
type tail
format /^(?<host>[^ ]*(?:\s+[^ ]+)*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
path /var/log/apache2/access.log
pos_file /var/log/td-agent/apache2.pos
tag apache2.log
</source>
Allows both log lines like:
172.18.0.2:80 127.0.0.1 - - [08/Aug/2017:19:58:38 +0000] "GET /server-status?auto HTTP/1.1" 200 508 "-" "collectd/5.7.2.sfx0"
As well as:
192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"
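If you'd rather check locally than via a web tool, the same pattern can be exercised with Python (a sketch: fluentd's Ruby-style named groups (?<name>...) are written (?P<name>...) in Python, but the pattern is otherwise unchanged):

```python
import re

# fluentd apache2-style pattern, translated to Python named-group syntax.
PATTERN = re.compile(
    r'^(?P<host>[^ ]*(?:\s+[^ ]+)*) [^ ]* (?P<user>[^ ]*) '
    r'\[(?P<time>[^\]]*)\] "(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

# Both the collectd line and a plain access-log line should match.
for line in (
    '172.18.0.2:80 127.0.0.1 - - [08/Aug/2017:19:58:38 +0000] '
    '"GET /server-status?auto HTTP/1.1" 200 508 "-" "collectd/5.7.2.sfx0"',
    '192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0"',
):
    m = PATTERN.match(line)
    print(m.group("host"), m.group("code"))
```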
You can test format regex matching using Fluentular.
See related: Fluentd apache log format with multiple host ip
The problem is that these ELB-HealthChecker log lines have an empty client IP field, so they don't match fluentd's apache2 log format.
So the way to fix this is to filter out logs with the ELB-HealthChecker user agent.

Apache logs showing strange ^# characters? What does this mean?

My Apache logs are always interrupted by strange characters:
84.196.205.238, 172.23.20.177, 172.23.20.177 - - [05/May/2015:11:48:15 +0200] 0 www.sudinfo.be "GET /sites/default/files/imagecache/pagallery_450x300/552495393_google_street_view HTTP/1.1" 200 32620 "http://www.sudinfo.be/247263/article/culture/medias/2011-11-23/google-street-view-en%C2%A0belgique-comment-trouver-votre-maison" "Mozilla/5.0 (Linux; U; Android 4.2.2; nl-be; GT-P3110 Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30"
^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#^#efault/files/imagecache/pagallery_450x300/2015/01/13/1554554859_B974505865Z.1_20150113094316_000_GVR3PDRHQ.1-0.jpg HTTP/1.1" 200 26033 "http://www.bing.com/images/search?q=leonardo+dicaprio+Met+gala&id=06B1C7410D6458C6A698AC09F3F8C6B7915BFFDE&FORM=IQFRBA" "Mozilla/5.0 (iPad; CPU OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53"
Do you have any idea what the cause could be?
If your web server is externally accessible, then this is probably an artifact of an attempt to hack your server.
ISTR that ^# is how Apache logs a NULL (zero) byte. These are used as padding in attacks such as buffer overflows.
You may want to look at countermeasures such as mod_security:
https://github.com/SpiderLabs/ModSecurity/wiki/ModSecurity-Frequently-Asked-Questions-%28FAQ%29
I hope it is obvious that a fully patched server and application stack is more likely to withstand random attack attempts like this.
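To confirm whether a log really contains zero-byte padding, you can scan it for NUL runs. A minimal sketch, run here against an inline byte string rather than a real file:

```python
# Find runs of NUL (0x00) bytes, which pagers often render as pairs of
# control characters such as the ^# runs seen in the log above.
def nul_runs(data: bytes):
    """Return (offset, length) pairs for each run of NUL bytes."""
    runs, i = [], 0
    while i < len(data):
        if data[i] == 0:
            start = i
            while i < len(data) and data[i] == 0:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs

# Simulated log contents: a good line, then six bytes of NUL padding.
sample = b'good line\n' + b'\x00' * 6 + b'truncated line\n'
print(nul_runs(sample))  # [(10, 6)]
```

To check an actual log, replace the inline sample with the file's bytes (e.g. open("access.log", "rb").read()).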
OK, I finally found out what the problem was. My log files are written to a network filesystem, and my shell client simply had problems reading them over the network.
False alarm, everything is still safe. Thanks for the help.