I set up an ELK stack on AWS Elasticsearch. I'm ingesting Apache logs into ELK, and one of the Apache fields is the vhost. The vhost field contains both www.domain.com and domain.com. I'd like to combine those so I can set up accurate searches and visualize data by vhost. Currently, www.domain.com and domain.com are treated as separate values.
123.456.7.89 - - [13/Feb/2017:18:56:19 +0000] thisdomain.com "GET /about.html HTTP/1.0" 200 1446 "-" "-" Server=aws8 SSL=- 634713 0
123.456.7.89 - - [13/Feb/2017:18:58:19 +0000] www.thisdomain.com "GET /services.html HTTP/1.0" 200 1446 "-" "-" Server=aws8 SSL=- 634713 0
I entered this data into the Elasticsearch setup in AWS, which gave me my field definitions. It also works in CloudWatch.
[ip, user, username, timestamp, vhost, request, status_code, bytes, referrer, browser, server, ssl, timems, times]
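One way to treat the two host names as a single value is to normalize the vhost before it is indexed. The following is a minimal Python sketch of that idea, not an Elasticsearch or Logstash configuration. The regular expression and the helper names (`LOG_PATTERN`, `normalize_vhost`, `parse_line`) are assumptions made for illustration, based only on the two sample lines and the field list above.

```python
import re

# Assumed pattern: it covers the fields up to "bytes" as they appear in the
# sample lines; the trailing fields are left unparsed in this sketch.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<user>\S+) (?P<username>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] (?P<vhost>\S+) '
    r'"(?P<request>[^"]*)" (?P<status_code>\d{3}) (?P<bytes>\S+)'
)


def normalize_vhost(vhost):
    """Lowercase the host name and drop a single leading 'www.' label."""
    host = vhost.lower()
    return host[4:] if host.startswith("www.") else host


def parse_line(line):
    """Return the parsed fields with a normalized vhost, or None if no match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["vhost"] = normalize_vhost(fields["vhost"])
    return fields


samples = [
    '123.456.7.89 - - [13/Feb/2017:18:56:19 +0000] thisdomain.com '
    '"GET /about.html HTTP/1.0" 200 1446 "-" "-" Server=aws8 SSL=- 634713 0',
    '123.456.7.89 - - [13/Feb/2017:18:58:19 +0000] www.thisdomain.com '
    '"GET /services.html HTTP/1.0" 200 1446 "-" "-" Server=aws8 SSL=- 634713 0',
]

# Both sample lines collapse to the same vhost value.
vhosts = {parse_line(line)["vhost"] for line in samples}
print(vhosts)
```

The same normalization could presumably be applied at ingest time instead. The sketch only shows the transformation itself.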
Related
Yesterday we saw strange behavior when reading the access log of Apache httpd. An example is below:
207.46.13.135 - - [25/Sep/2022:15:28:28 +0700] "GET / HTTP/1.1" 302 287 (core.c/0/translate_name) - 140
This is a normal access entry: x.x.x.x - - [26/Sep/2022:14:16:57 +0700] "GET /corp/L003/consumer/theme/vn.ssc.css HTTP/1.1" 200 1043 (core.c/0/handler) - 830
We have directives to proxy and redirect requests going to the system. But why does this request not redirect? (The 302 return code is understandable in debug mode, but why don't we get it in the production log?) We suspect that these IPs use some kind of engine to flood the web server, asking only for the response status and not the content.
I am working on emulating an embedded device that is controlled via HTTP commands. The controller issues URLs such as
http://192.168.0.10/cgi-bin/aw_cam?cmd=QFT&res=1
And these affect the device in specific ways. My goal is to make an emulator of the device so I need to capture and handle all such requests. I have successfully configured Apache to call my scripts and I can get access to the "cmd=QFT&res=1" control string by reading the value of QUERY_STRING. I am using Apache 2.4.18 on Ubuntu 16.04.5. The scripts are written in C++.
The problem I am running into is that some of the commands issued by the controller are of the following form:
http://192.168.0.10/cgi-bin/aw_ptz?cmd=#P80&res=1
http://192.168.0.10/cgi-bin/aw_ptz?cmd=#T50&res=1
For whatever reason, whoever designed the command structure decided to use the # character as part of the command. But since the '#' delimits the fragment part of the URL, the information after it never makes it to my script, which only receives "cmd=".
Is there any way to force Apache to pass the entire string after the ? to my scripts? I cannot change the client or the protocol, only the server side.
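The splitting described above can be reproduced with a standard URL parser. The sketch below is in Python purely for illustration (the scripts in question are C++), and `raw_after_question_mark` is a hypothetical helper, not an Apache or CGI facility. It assumes the script can get hold of the raw request target somehow, which is the part the question is asking about.

```python
from urllib.parse import urlsplit


def raw_after_question_mark(request_target):
    """Return everything after the first '?', treating '#' as ordinary data."""
    _, sep, rest = request_target.partition("?")
    return rest if sep else ""


url = "http://192.168.0.10/cgi-bin/aw_ptz?cmd=#P80&res=1"
parts = urlsplit(url)

# A standards-following parser stops the query at '#'.
print(parts.query)     # cmd=
print(parts.fragment)  # P80&res=1

# Splitting the raw request target by hand keeps the whole command string.
print(raw_after_question_mark("/cgi-bin/aw_ptz?cmd=#P80&res=1"))  # cmd=#P80&res=1
```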
Edit:
The Apache log shows the entire URL (see the portion of the log file below), so even though # is supposed to be a fragment delimiter, it makes it into the log file at least, but not to the CGI script.
192.168.0.9 - - [27/Jan/2019:00:21:10 +0000] "GET /cgi-bin/aw_ptz?cmd=#P53&res=1 HTTP/1.0" 200 151 "-" "-"
192.168.0.9 - - [27/Jan/2019:00:21:11 +0000] "GET /cgi-bin/aw_ptz?cmd=#P66&res=1 HTTP/1.0" 200 151 "-" "-"
192.168.0.9 - - [27/Jan/2019:00:21:11 +0000] "GET /cgi-bin/aw_ptz?cmd=#P99&res=1 HTTP/1.0" 200 151 "-" "-"
192.168.0.9 - - [27/Jan/2019:00:21:11 +0000] "GET /cgi-bin/aw_ptz?cmd=#P76&res=1 HTTP/1.0" 200 151 "-" "-"
This seems to work:
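# mod_rewrite must be switched on for the rule below to apply
RewriteEngine On
# Capture the raw request target on either side of the '#'
RewriteCond %{THE_REQUEST} \s(.*)#(.*)\s
# Proxy to the backend; NE keeps the '#' from being escaped
RewriteRule ^ http://localhost:8000%1#%2 [P,NE]
ProxyPass / http://localhost:8000/
ProxyPassReverse / http://localhost:8000/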
(I had Python's simple HTTP server listening on localhost:8000 to verify that the hash was passed correctly.)
My Apache (Linux/CentOS) is behind a load balancer (AWS ELB). So in my Apache access_log, there are tons of useless logs, like:
- - - [13/Jan/2016:23:38:02 +0800] "GET / HTTP/1.1" 200 16 "-" "ELB-HealthChecker/1.0"
- - - [13/Jan/2016:23:42:39 +0800] "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"
These two kinds of entries flood my access_log. They are totally useless to me (they are just internal health checks), but they confuse my log analyzers, like awstats. So I'd rather not have this kind of log entry.
Is there a way to configure what Apache should NOT log?
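Apache's conditional logging is the server-side route. As a stopgap, the noise can also be stripped after the fact, before the log reaches an analyzer. The following Python sketch shows only that post-processing idea. The marker strings are taken from the two sample entries above, and the names `NOISE_MARKERS`, `is_noise`, and `filter_log` are hypothetical.

```python
# Substrings that identify the two unwanted kinds of entries shown above.
NOISE_MARKERS = ("ELB-HealthChecker/", "(internal dummy connection)")


def is_noise(line):
    """True if the log line comes from a health check or a dummy connection."""
    return any(marker in line for marker in NOISE_MARKERS)


def filter_log(lines):
    """Return only the lines that are not health-check or dummy-connection noise."""
    return [line for line in lines if not is_noise(line)]


sample_log = [
    '- - - [13/Jan/2016:23:38:02 +0800] "GET / HTTP/1.1" 200 16 "-" "ELB-HealthChecker/1.0"',
    '- - - [13/Jan/2016:23:42:39 +0800] "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"',
    # Stand-in for a genuine visitor entry, included only for illustration.
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
]

kept = filter_log(sample_log)
print(len(kept))
```

Matching on the user-agent text is a deliberate simplification. Any real visitor whose request happens to contain one of these strings would be dropped as well.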
I am currently getting 500 errors from Apache when using an alarming probe shell script that was provided to me.
Unfortunately, I have not been able to get to the bottom of why the script generates a 500 error when it accesses content locally on the server, while other methods like wget and telnet work fine.
The following are the Apache access log entries for each of the attempts:
Using Wget
127.0.0.1 - "" [19/Mar/2013:14:31:44 +1100] "GET /index.html HTTP/1.1" 200 1635 "-" "Wget/1.13.3" "-"
Using Telnet
127.0.0.1 - "" [20/Mar/2013:13:12:11 +1100] "GET /index.html HTTP/1.1" 200 1635 "-" "-" "-"
Using the Probe Scripts
127.0.0.1 - - [19/Mar/2013:14:33:56 +1100] "GET /index.html HTTP/1.1" 500 - "-" "" "-"
The only difference I can see is that the probe has a - instead of a "" in the user field (3rd item), which either way tells me it wasn't passed in any of the instances (this is expected, since there is no authentication).
I've bumped up the logging for everything in Apache and can't figure out what is amiss. There is no processing involved, it's a static file, and I have tried other file types too, like images, to no avail.
Does anyone have any ideas or has seen something similar?
Thanks,
Tony
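Comparing the entries field by field makes it easier to see everything that differs, not only the third item. The sketch below parses the Wget and probe lines quoted above and lists the fields whose values differ. The regular expression and the field names are assumptions inferred from the sample lines; the meaning of the last quoted field is unknown, so it is simply called `extra` here.

```python
import re

# Assumed layout, inferred from the three sample entries above.
PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\S+) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<extra>[^"]*)"$'
)


def parse(line):
    """Split one access log entry into named fields."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognized log line: " + line)
    return match.groupdict()


def differing_fields(first, second, ignore=("time",)):
    """Map each field whose value differs to its (first, second) pair."""
    return {
        name: (first[name], second[name])
        for name in first
        if name not in ignore and first[name] != second[name]
    }


wget = ('127.0.0.1 - "" [19/Mar/2013:14:31:44 +1100] '
        '"GET /index.html HTTP/1.1" 200 1635 "-" "Wget/1.13.3" "-"')
probe = ('127.0.0.1 - - [19/Mar/2013:14:33:56 +1100] '
         '"GET /index.html HTTP/1.1" 500 - "-" "" "-"')

diff = differing_fields(parse(wget), parse(probe))
for name, values in sorted(diff.items()):
    print(name, values)
```

For these two lines the differences are in the user, status, size, and agent fields. In particular, the probe's user agent is logged as an empty string, whereas the telnet request, which sent no such header, is logged as "-".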
In Apache access.log, I am used to this kind of access log line:
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
I was checking some Apache access logs this morning and found something I'm not used to:
192.168.1.10- - [20/Feb/2013:00:00:45 +0000] "POST /form/... 404 200 252 "-" "-" 435835
There are multiple status codes. Does it mean the request was sent multiple times (something like a fail/retry mechanism)?