log format for goaccess log analysis - log-analysis

I installed goaccess and am trying to parse and analyse a log file, but I am having trouble with the log format. Does anyone know which format to use for a log like the one below? [updated the log sample]
::1 - - [24/Jun/2013:17:10:39 -0500] "GET /favicon.ico HTTP/1.1" 404 286 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36" 0 -

It worked after using --log-format=COMBINED.
Answer credits to #Pete Darrow.
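For reference, here is a minimal Python sketch of the fields that make up the Apache/NCSA "combined" layout, applied to the sample line above. It only illustrates the format and is not how goaccess parses logs internally; the regex and the `parse_combined` helper are assumptions made for this example. The two trailing fields in the sample (`0 -`) fall outside the combined layout, so the sketch ignores them.

```python
import re

# Combined layout: host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of combined-format fields, or None if the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None


sample = (
    '::1 - - [24/Jun/2013:17:10:39 -0500] "GET /favicon.ico HTTP/1.1" 404 286 "-" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/27.0.1453.116 Safari/537.36" 0 -'
)
fields = parse_combined(sample)
print(fields["host"], fields["status"], fields["size"], fields["request"])
```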

Related

Apache random IPs in access log trying to execute scripts

I just got a quick question. My apache access log has random IPs from China, Japan, etc. It looks like they are trying to execute scripts from where they are.
The log looks like this:
171.117.10.221 - - [29/Jan/2018:08:05:04 -0800] "GET /ogPipe.aspx?name=http://www.dongtaiwang.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.3$
1.202.79.71 - - [29/Jan/2018:08:05:06 -0800] "GET /ogPipe.aspx?name=http://www.epochtimes.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (K$
113.128.104.239 - - [29/Jan/2018:08:05:11 -0800] "GET /ogPipe.aspx?name=http://www.wujieliulan.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Ge$
117.14.157.148 - - [29/Jan/2018:08:05:17 -0800] "GET /ogPipe.aspx?name=http://www.ntdtv.com/ HTTP/1.1" 301 0 "-" "Mozilla/5.0 (Linux; U; Android 4.3; en-us; SM-N900T Build/JSS15J) AppleWebKit/$
110.177.75.106 - - [29/Jan/2018:08:05:37 -0800] "GET /ogPipe.aspx?name=http://www.dongtaiwang.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/$
221.11.229.244 - - [29/Jan/2018:08:05:57 -0800] "GET /ogPipe.aspx?name=http://www.epochtimes.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (Linux; U; Android 4.3; en-us; SM-N900T Build/JSS15J) Appl$
182.101.57.39 - - [29/Jan/2018:08:06:03 -0800] "GET /ogPipe.aspx?name=http://www.epochtimes.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (Linux; U; Android 4.3; en-us; SM-N900T Build/JSS15J) Apple$
113.128.104.88 - - [29/Jan/2018:08:06:13 -0800] "GET /ogPipe.aspx?name=http://www.epochtimes.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (Linux; U; Android 4.3; en-us; SM-N900T Build/JSS15J) Appl$
106.114.65.1 - - [29/Jan/2018:08:06:14 -0800] "GET /ogPipe.aspx?name=http://www.wujieliulan.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45$
113.128.104.148 - - [29/Jan/2018:08:06:31 -0800] "GET /ogPipe.aspx?name=http://www.ntdtv.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46$
114.221.124.84 - - [29/Jan/2018:08:06:45 -0800] "GET /ogPipe.aspx?name=http://www.ntdtv.com/ HTTP/1.1" 404 3847 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 $
172.104.108.109 - - [29/Jan/2018:08:17:50 -0800] "GET / HTTP/1.1" 302 830 "-" "Mozilla/5.0"
(None of these are my IPs, which is why I am posting them here.)
I used an IP lookup site to see where they are. Does anyone have any advice towards what I should do?
It's a new TLS prober from the GFW.
https://example.com/ogPipe.aspx is a tool that bridges some news websites blocked in China (you can see the target websites in the log lines).
The GFW evidently intends to detect it.
Here are my Splunk search results for the last 3 days.
remote_ip.png
user_agent.png
The features of the prober:
Each source IP is a one-shot address
The User-Agent imitates Chrome/Safari/Firefox
The TLS protocol is TLSv1.2
Short answer: Ignore them.
Long answer: There are plenty of vulnerabilities in various web servers and application frameworks that attackers want to exploit. The originating IPs may not be the attackers themselves but machines infected with malware or trojans and remotely controlled by them. The attackers use those machines to probe whether your server is vulnerable, hoping for a bigger reward such as access to your database or passwords.
If you are hosting a .NET Framework application, and especially if you actually serve an "ogPipe.aspx" file, examine every line of its code for security loopholes. Your server log shows these requests ending in HTTP 404 (after a few initial 301 redirects), which means you don't serve ogPipe.aspx, so you are safe.
As general security advice, watch for vulnerability announcements from your software vendors (e.g. Apache / Microsoft) and apply security patches when they are available.

Apache access logs show a domain name where IP addresses usually are

Very rarely, a computer attempting to connect to my server shows up in the log with a domain name where the IP address usually is. Can someone explain why this happens and whether it is something I should keep a closer eye on?
(related log snippet)
403 - ec2-52-53-242-144.us-west-1.compute.amazonaws.com - - [30/Nov/2017:20:26:47 -0500] "OPTIONS / HTTP/1.1" 339 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"

Where statement in Kibana search?

Below are typical log lines from Apache, stored in AWS Elasticsearch. I'd like to add a visualization to my dashboard showing the top referrers. The problem is that many static files have referrers from the site's own domain, which prevents me from seeing the data I want.
Is it possible to have a search expression like "where REFERRER does not contain VHOST"?
123.456.78.9 - - [15/Feb/2017:18:33:25 +0000] example.com "GET / HTTP/1.1" 200 42766 "http://facebook.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_2 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/14A456 Safari/602.1" Server=aws8 SSL=- 8868 0
123.456.78.9 - - [15/Feb/2017:18:33:25 +0000] example.com "GET /js/lib/jquery-ui/jquery-ui.js HTTP/1.1" 200 42766 "http://example.com/" "Mozilla/5.0 (iPhone; CPU iPhone OS 10_0_2 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/14A456 Safari/602.1" Server=aws8 SSL=- 8868 0
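To make the intended filter concrete, here is a small Python sketch of the logic "count referrers whose host is not the request's own vhost", run against shortened copies of the two lines above. This is not a Kibana or Elasticsearch query; the regex, the `top_external_referrers` helper, and the abbreviated user-agent strings are assumptions made for illustration. It compares hostnames rather than doing a raw substring match, which is a slightly stricter reading of "does not contain".

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Layout assumed from the sample lines: ip ident user [time] vhost "request" status bytes "referrer" ...
LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] (?P<vhost>\S+) "[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)


def top_external_referrers(lines, n=10):
    """Count referrer hosts, skipping empty referrers and referrals from the vhost itself."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        referrer, vhost = m["referrer"], m["vhost"]
        if referrer in ("", "-"):
            continue
        host = urlparse(referrer).hostname or ""
        # Treat the vhost and its subdomains as internal traffic.
        if host == vhost or host.endswith("." + vhost):
            continue
        counts[host] += 1
    return counts.most_common(n)


sample = [
    '123.456.78.9 - - [15/Feb/2017:18:33:25 +0000] example.com "GET / HTTP/1.1" '
    '200 42766 "http://facebook.com/" "Mozilla/5.0"',
    '123.456.78.9 - - [15/Feb/2017:18:33:25 +0000] example.com '
    '"GET /js/lib/jquery-ui/jquery-ui.js HTTP/1.1" 200 42766 "http://example.com/" "Mozilla/5.0"',
]
result = top_external_referrers(sample)
print(result)
```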

Apache access log investigation

I have been monitoring Google Analytics for our ecommerce server. Normally we have fewer than 10 visitors at a time, but recently I have been seeing unusual bot activity. Sometimes it jumps to over 50 connections at once, all within a few minutes. I am not sure whether it is a bad crawler or someone committing click fraud on our Google PPC ad campaigns.
The following is a small part of our access_log. Checking the IP addresses does not reveal much. The IP addresses are also unique: I could not find any repeat access from the same IP when comparing over a few days.
76.189.130.73 - - [27/Feb/2016:21:32:25 -0600] "GET /hp-ce260x-toner-cartridge.html HTTP/1.1" 200 11548 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/4E423F"
71.82.43.43 - - [27/Feb/2016:21:32:26 -0600] "GET /hp-cb540a-oem-black-toner-cartridge.html HTTP/1.1" 200 11497 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.0 Safari/537.36"
68.4.69.7 - - [27/Feb/2016:21:32:25 -0600] "GET /hp-c9723a-magenta-laser-toner-cartridge.html HTTP/1.1" 200 11233 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2227.1 Safari/537.36"
50.54.179.218 - - [27/Feb/2016:21:32:26 -0600] "GET /hp-q5942xd-black-toner-cartridge.html HTTP/1.1" 200 11299 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36"
64.213.217.226 - - [27/Feb/2016:21:32:28 -0600] "GET /hp-q2682a-yellow-toner-cartridge.html HTTP/1.1" 200 11336 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.62 Safari/537.36"
50.25.245.238 - - [27/Feb/2016:21:32:29 -0600] "GET /hp-ce255x-oem-high-yield-toner-cartridge.html HTTP/1.1" 200 11196 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2226.0 Safari/537.36"
I am not sure if this is related, but I also see some crawling from ahrefs.com/robot/ and webmeup-crawler.com/; their IP addresses are consistent, though. I have already modified robots.txt to block the ahrefs.com bot.
robots.txt can be abused, but it's mainly meant for crawlers such as Google's bots, which read it to find out what is available to be indexed. I have noticed in my own log that both Google and random IP addresses try a variety of different paths, including these:
/phpMyAdmin/scripts/setup.php
/phpmyadmin/scripts/setup.php
/pma/scripts/setup.php
/robots.txt (Google in this case)
'9\xdd\xb1\xf8\xa1\xa8\xa8\x82\x904\x1f\x84\xbeNv\x7fa\xd9\xd4,)\x98^\xbf\x98\x14\x82q
\x19\xa5\b\x7f\xee\x98\x02\xde_\xa1\x1b\xc0
\x06\xe6\xf2\xba\"!=\xe1\x18?\xb6\xf5$\xb4n0[\x92\xe9_
\x8b[Y5nS\x1d (some kind of hash cracker)
//wp-login.php
/blog//wp-login.php
/wordpress//wp-login.php
/wp//wp-login.php
/?author=1
What they are looking for are mostly pre-created directories from free downloadable templates.
You should know that nearly all IPs starting with 66.249 belong to Google.
The rest you can look up yourself.
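As a rough sketch of that rule of thumb, the check below flags addresses in 66.249.0.0/16 using Python's `ipaddress` module. The /16 range is simply the "starts with 66.249" heuristic from this answer, not an authoritative list of Google addresses, and the `looks_like_google` helper is a hypothetical name used for illustration.

```python
import ipaddress

# Assumption taken from the answer above: treat 66.249.x.x as "probably Google".
GOOGLE_GUESS = ipaddress.ip_network("66.249.0.0/16")


def looks_like_google(ip):
    """Return True if ip parses as an address inside the assumed 66.249.0.0/16 range."""
    try:
        return ipaddress.ip_address(ip) in GOOGLE_GUESS
    except ValueError:
        # Not a valid IP address (e.g. a hostname logged in place of an IP).
        return False


for ip in ("66.249.66.1", "76.189.130.73"):
    print(ip, looks_like_google(ip))
```

A prefix match like this only narrows things down; the remaining addresses still need a manual lookup as described above.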
In your case it looks like the bot(s) are looking for an HP printer to mess with.
Hope this helped

apache2 server sometimes returns no response but logs a 200 status

I'm running an Apache2 web server
Server version: Apache/2.2.22 (Ubuntu)
Server built: Mar 19 2014 21:11:10
Every once in a while it silently fails to return content yet writes a 200 status code to the log file. For example, here is the regular log entry for a particular file.
50.158.90.90 - - [17/Nov/2014:06:18:16 -0800] "GET /beta/images/supported_browsers_64h.png HTTP/1.1" 200 12028 "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
But every once in a while an entry like this shows up.
50.158.90.90 - - [17/Nov/2014:07:30:38 -0800] "GET /beta/images/supported_browsers_64h.png HTTP/1.1" 200 0 "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
Nothing shows up in the error log when this happens. Any idea of what is going on?
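To see how often this happens and which files are affected, one option is to scan the access log for entries logged with status 200 and a body size of 0. The sketch below assumes the log layout shown in the two entries above (request, status, size, then the user agent); the regex and the `empty_ok_responses` helper are illustrative, and the user-agent strings in the sample are shortened.

```python
import re

# Matches: "request" status size, as laid out in the entries above.
ENTRY = re.compile(r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) ')


def empty_ok_responses(lines):
    """Yield (request, line) for entries logged as 200 with a size of 0 or '-'."""
    for line in lines:
        m = ENTRY.search(line)
        if m and m["status"] == "200" and m["size"] in ("0", "-"):
            yield m["request"], line


sample = [
    '50.158.90.90 - - [17/Nov/2014:06:18:16 -0800] '
    '"GET /beta/images/supported_browsers_64h.png HTTP/1.1" 200 12028 "Mozilla/5.0"',
    '50.158.90.90 - - [17/Nov/2014:07:30:38 -0800] '
    '"GET /beta/images/supported_browsers_64h.png HTTP/1.1" 200 0 "Mozilla/5.0"',
]
hits = list(empty_ok_responses(sample))
for request, _ in hits:
    print(request)
```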