Nice remote Apache log viewer

I have a server with 10+ virtual domains (most running MediaWiki). I'd like to be able to watch their traffic remotely with something nicer than tail -f. I could cobble something together, but was wondering if something super-deluxe already exists that involves a minimum of hacking and support. This is mostly to understand what's going on, not so much for security (though it could serve that role too). It must:
Be able to deal with vhost log files
Be able to handle updates every 10 seconds or so
Be free/open source
The nice to haves are:
Browser based display (supported by a web app/daemon on the server)
Support filters (bots, etc)
Features like counters for pages, with click to view history
Show a nice graphical display of a geographic map, timeline, etc
Identify individual browsers
Show link relationships (coming from remote site, to page, to another page)
Be able to identify logfile patterns (editing or creating a page)
I run Debian on the server.
Thanks!

Take a look at Splunk.
I'm not sure if it supports real-time (~10-second) updates, but there are a ton of features and it's pretty easy to get set up.
The free version has some limitations but there is also an enterprise version.

Logstash is the current answer. (=

Depending on the volume, Papertrail could be free for you. It is the closest thing to tail -f and is searchable and archivable, and it can also send alerts based on custom criteria.

Related

Using Cuckoo sandbox platform for dynamically analyzing multiple file samples

I'm trying to run more than one sample at the same time in a single guest VM, for efficiency reasons; something that would be even more efficient than the distributed Cuckoo solution, or using a few guest VMs.
For example, I'd like to submit a few URLs so they will be opened in a few tabs (in IE or FF) in Cuckoo, so I won't need to run a clean VM for each URL.
Then, if any malicious activity is detected in any of the URLs, I'll find the malicious URL, and will make a deeper inspection of its activity using all other cuckoo plugins and modules, etc.
Can you think of a way to do this with Cuckoo, or any workaround?
My use case is that I have A LOT of samples, but only very few are malicious, so running a VM for every one of them would be a waste of resources.
Cuckoo monitors malware activity in the system, records it, and creates a report in a format like JSON. If you try several suspicious links (potential malware) in one VM, you can't track which part of the JSON report (which features) belongs to which link. I believe you need to run different suspicious links/files in different VMs. You can run a few VMs at the same time, though.

configure multiple servers and scale

I have been given a task to configure 1000s of servers with some simple data. Let's say I need to log in to each server (Linux or Windows) and set up the NTP server. I need to come up with some kind of automation framework using Perl. I have some ideas and want to get more.
Here is my thought process:
a) Since there are 1000s of servers, the framework should definitely be able to read in a CSV file, so all inputs can be provided at once as opposed to one at a time.
b) Since there are so many servers, I have to find a way to do things in parallel. I can't go server by server sequentially.
c) I should have an output file that shows the results: which servers were configured successfully and which failed. That way I can compare the input and output files and generate a report.
Should I consider anything else in my framework?
How can I do parallel processing using Perl? (see the sketch below)
Even if you want to stick with Perl, it looks like there are already some alternatives available that would keep you from implementing another framework from scratch.
Check out the comments on http://my.opera.com/cstrep/blog/2010/05/14/puppet-fabric-and-a-perl-alternative for a couple of options.
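If you do end up rolling your own in Perl, the parallel part is commonly handled with Parallel::ForkManager from CPAN. A minimal sketch, assuming a hosts.csv with one hostname per line and a hypothetical configure_host() that does the actual NTP setup:

use strict;
use warnings;
use Parallel::ForkManager;

# Read the host list (here: a trivial one-column CSV, one host per line).
open my $in, '<', 'hosts.csv' or die "hosts.csv: $!";
chomp(my @hosts = <$in>);
close $in;

my $pm = Parallel::ForkManager->new(50);    # run up to 50 hosts at once

# The parent collects each child's result via its exit status.
open my $out, '>', 'results.csv' or die "results.csv: $!";
$pm->run_on_finish(sub {
    my ($pid, $exit, $host) = @_;
    print $out "$host," . ($exit == 0 ? 'ok' : 'failed') . "\n";
});

for my $host (@hosts) {
    $pm->start($host) and next;          # parent: fork and move on
    my $ok = configure_host($host);      # child: the real work goes here
    $pm->finish($ok ? 0 : 1);            # child's exit code reports the result
}
$pm->wait_all_children;

sub configure_host { my ($host) = @_; return 1 }   # hypothetical stub

The results.csv it writes gives you exactly the success/failure report described in point c).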

How to compare test website and live website

We have our production server running our website. Then we have a test server which has the exact same data but with code changes for some new functionality. This web app has over 500 pages.
Is there any program that can:
Log in to the test site
Crawl through each page and then save the page as HTML
Compare it with the same page saved from the live site?
This way we can make sure that new features that we add to our test site will not break the live site when code updates are applied to production.
I am currently trying to use the WinHTTrack website copier and then comparing the test and live folders with a code comparison tool like Beyond Compare. This works OK, but a lot of files show up as changed simply because of the domain name differences.
Looking forward to ideas / solutions for this problem.
Regards
Have you looked at using Watir for this? It's not exactly what you are looking for, but it might give you more granularity in your tests and ensure the site is functionally identical, rather than getting caught up on changing GUIDs, timestamps, and all the other things that tend to change across any significant-size website from day to day as part of its standard functionality.
Apparently you can't make consistent, reproducible builds in your project, can you? I would recommend moving towards that in the long run; it will save you a lot of headaches. That way you would know exactly what was deployed to which server and when, so there would be no more need to bend over backwards to get the deployed sources back like this...
I know this is not a direct solution to your problem... but maybe it is worth weighing whether you would save more in the long run by investing the effort in your build process now, instead of implementing this workaround (and then improving your build process anyway, because one day you will almost surely need to).
wget has a --convert-links option, and there are also some options to preserve cookies that might let you do it logged in: http://drupal.org/node/118759#comment-664498
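A sketch of the kind of invocation meant here (hostname hypothetical; --load-cookies covers the logged-in case):

wget --mirror --convert-links --page-requisites --load-cookies cookies.txt http://test.example.com/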
Use an offline downloader to download all files to your computer from both sources, then compare the folder contents using a free tool like Total Commander.
EDIT
Load both of your sources into a version control system (e.g. CVS) and compare them there.
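Whichever comparison tool you use, the domain-name churn the question mentions can be normalized away first. A minimal Perl sketch, assuming hypothetical folder names and hostnames, that rewrites the test hostname to the live one across a downloaded mirror:

use strict;
use warnings;
use File::Find;

my $test_dir  = 'mirror-test';          # hypothetical HTTrack output folder
my $test_host = 'test.example.com';     # hypothetical hostnames
my $live_host = 'www.example.com';

# Rewrite the test hostname to the live one in every text-like file,
# so a later folder diff only flags genuine content changes.
find(sub {
    return unless -f && /\.(html?|css|js)$/i;
    local $/;                           # slurp the whole file at once
    open my $in, '<', $_ or die "$File::Find::name: $!";
    my $text = <$in>;
    close $in;
    $text =~ s/\Q$test_host\E/$live_host/g;
    open my $out, '>', $_ or die "$File::Find::name: $!";
    print $out $text;
    close $out;
}, $test_dir);

After this pass, a recursive folder compare (Beyond Compare, Total Commander, plain diff -r) should only show real differences.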

browser plugin to test a site's look when migrating

I'm thinking I need a browser plugin that does the following, and if it doesn't exist, it should. I may as well say FF for now, but it could be any browser.
The problem: when moving a website from one server to another, you need migration testing. It is a pain to click on every link by hand and compare it to the old host. You really need 2 machines or have to constantly thrash your hosts file.
The plugin:
Would allow you to specify an alternate hosts entry for a website. 2 entries would make it clear, one for live, one for test.
The plugin would crawl every link on the site, and render the page in the browser, and save an image of the entire page.
It would switch hosts and repeat, and save images in a second folder. Since the rendering engines match, the images should match. We need to switch hosts (like /etc/hosts) so all absolute links are the same for the site.
Now this could be part of the plugin or be external. Since we end up with 2 folders of identically named images, we run an image-diff program on the whole batch. A quick test would be a bdiff or hash (sketched below), or we could get more sophisticated and determine how different each image is.
This would save so much time. So can it be done with existing tools, or do I need to go write it?
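The batch comparison at the end is easy to script. A minimal Perl sketch, assuming two folders of identically named PNG screenshots (folder names hypothetical), that flags pages whose renders differ by checksum:

use strict;
use warnings;
use Digest::MD5;

my ($live_dir, $test_dir) = ('shots-live', 'shots-test');   # hypothetical

sub md5_of {
    my ($path) = @_;
    open my $fh, '<:raw', $path or return 'missing';
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# Identical rendering engines should produce byte-identical images,
# so a plain checksum comparison is enough for a first pass.
opendir my $dh, $live_dir or die "$live_dir: $!";
for my $img (sort grep { /\.png$/i } readdir $dh) {
    print "DIFFERS: $img\n"
        if md5_of("$live_dir/$img") ne md5_of("$test_dir/$img");
}

For a fuzzier "how different is it" score, ImageMagick's compare utility is a common next step.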
Have a look at Selenium, it allows you to script interactions with the browser and verify content.
That is overengineered. What kind of website is it? How big? Which framework (PHP, JSP, Rails, etc.)? Why not copy the website onto the new server and grep the code for specific ties to the old server?
I'd concentrate on why you think the site would differ between two servers, and focus on testing those specific cases rather than the whole site. When a site is moved to a new machine the issues are generally very obvious from looking at a couple of pages.
Presumably they are both looking at the same data source, assuming there is a data source, otherwise a folder diff on the two installations would suffice. This being the case, it should be a simple task to identify which areas of the site are likely to be affected by a server migration.
Also, I wouldn't personally trust a machine matching two images to sign off a system as ready to go live. There just isn't a substitute for real human testing. Yes, it's time-consuming, but how important is your site?
Try http://www.browsercam.com/ - the free trial should allow you to specify the main page and follow links to make screenshots of the sub-pages automatically as well.

Best IT/back-office system hacks? [closed]

Lots of people have things that their systems do for them or for their teams. Source control post-commit hooks are a standard example: having an automated build system that checks out the latest source, compiles, tests, and packages it is a back-office hack that most of us probably use.
What other cool things have you done?
We had one developer in our team who wasn't familiar with the concept of a Subversion conflict. He deduced that if he simply deleted all that weird stuff in his code and clicked resolve, everything would be OK (i.e., knocking out all the other changes in the file...).
Needless to say, after the 5th time this occurred, and the 5th time I had to explain why the defect I had just closed was recurring, I wrote a script.
It would diff the changes to a file to see whether a consecutive checkin deleted all the previous changes and whether they were done by the nameless developer.
It would then send an email to the boss with a description of what happened, and how much work was lost during the checkin.
There was no 7th occurrence.
We have a traffic-light that shows whether our daily build succeeds, has failed tests or simply doesn't build.
Also, we have a light bar that lights up for a few seconds whenever we receive an upload from a customer.
We aren't staffed 24x7 but we have critical processes that run throughout the night. We created an in-house alerts system to notify us of serious system issues, failed mission-critical processes, etc. It uses text-to-speech to create a descriptive message and then connects to our automated dialer to call the appropriate people with the message.
Working at a web design company, I configured our dev server so we could see a working copy of a project in real time via a subdomain name. So if your name was joe and you were working on project jetfuel, you would go to joe.jetfuel.test-example.com and see your changes instantly, without committing.
This was a simple hack that used subdomain names as a partial directory structure. Our htdocs path looked like this: htdocs/tag/project. We had a script (a PHP app that you would access at setup.test-example.com) that would create a new tag name for you, check out whatever version you wanted, and call the deploy script for that project. If it succeeded, it would forward you to the new subdomain. You could then work on this new copy via a Samba share.
This worked really well for us since we always deployed to the same Linux build and our projects had simple database requirements.
Our original reason for doing this was that our developers worked on all kinds of different platforms. Besides fixing the platform problem, this was awesome for viewing changes and testing. We had all kinds of tags, ranging from people's names, trunk versions, and test tags, all the way to prototypes like jquery-menu-hack.jetfuel.test-example.com
Now that I look back I wonder how much easier it would have been to run virtual machines.
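For reference, the subdomain-to-directory mapping described above is roughly what Apache's mod_vhost_alias gives you out of the box; a sketch with hypothetical paths:

# joe.jetfuel.test-example.com -> /var/www/htdocs/joe/jetfuel (htdocs/tag/project)
UseCanonicalName Off
<VirtualHost *:80>
    ServerAlias *.test-example.com
    VirtualDocumentRoot /var/www/htdocs/%1/%2
</VirtualHost>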
We had a dev working on a classic ASP site who didn't believe in source control. The code went from his machine straight to the production box. This led to issues with lost changes or the inability to revert to a stable version. Since CruiseControl.NET has the ability to monitor a directory, I added a project that actually checked in files whenever they were copied to production. Completely backward from CC.NET's original intent, but we didn't lose any more code.
Put in a pre-commit hook that checks that the bug comment refers to an open bug, assigned to the user doing the checkin (SCMBug can do this); a skeleton of this check is sketched below.
Then to make life REALLY interesting, spell check the comments!!
The commit comment, and the one in the code. (spell is my buddy)
Run the code through a code formatter set to the company standard, and diff it against the original: if it's not in the company's official format, reject the commit.
Do a coverage test with the unit test build.
Email all mistakes/errors caused to the development team.
I left OUT the name of the developer. They know they did it.
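A skeleton of that first check, written as a Subversion pre-commit hook in Perl (the bug-tracker lookup is a hypothetical stub; SCMBug provides the real integration):

#!/usr/bin/perl
use strict;
use warnings;

my ($repo, $txn) = @ARGV;   # Subversion passes these to pre-commit hooks

# Pull the pending commit's log message and author out with svnlook.
chomp(my $msg    = qx(svnlook log -t $txn $repo));
chomp(my $author = qx(svnlook author -t $txn $repo));

# Require a bug reference like "bug 1234" in the commit comment.
my ($bug) = $msg =~ /\bbug\s*#?(\d+)/i
    or die "Commit message must reference a bug number\n";

# Hypothetical stub: query your tracker for the bug's state and assignee.
sub bug_is_open_and_assigned { my ($id, $user) = @_; return 1 }

die "Bug $bug is not open and assigned to $author\n"
    unless bug_is_open_and_assigned($bug, $author);

exit 0;   # a non-zero exit (or die) makes Subversion reject the commit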
Not exactly hacks, but a couple of must-haves for IT dev work:
If you're using Subversion, you've got to use CommitMonitor (http://tools.tortoisesvn.net/CommitMonitor). It lets you monitor svn repositories for new commits and then review them. Great if you want to stay on top of what your team is doing. Particularly if you have a couple of juniors that need to be watched. ;)
Rsnapshot (http://www.rsnapshot.org/) is also invaluable - we have complete backup snapshots of our entire filesystem every four hours going back 2 years, and every day beyond that. It's like a data cube for your filesystem! The peace of mind this gives is pure bliss. :)
Hardly a hack, but back in the day, on our speedy VAX 11/730, our overnight process would print the file "BLAMMO.TXT" on the printer if something went amiss. Every morning, the first stop was the printer when coming in.
Back in the dot-com days, about 9 years ago, I had to hack together a failover system between two different locations. We had a funky setup with a PowerBuilder front-end website and a PowerBuilder management tool. Data was stored in MSSQL 7.0. The webservers used IPX to communicate with the SQL Servers (don't ask). Anyway, I was responsible for coming up with a failover plan.
I ended up hacking together some Linux boxes and had them run our external DNS, one at each location. We had a remote site with a webserver and SQL server, and I got SQL transaction replication working over a 128k ISDN IPX connection (of all things). Then I built a monitoring tool at our production site to send packets out to various upstream network handoffs. If we experienced more than a 20% outage at the primary site, the monitoring tool ran a Perl script on the Debian box to change DNS and point to our secondary. Our secondary had a heartbeat with our primary DNS and monitoring station. It would duplicate records unless it lost both connections, in which case it would roll over to pointing DNS at the backup location.
The primary site would shut down its SQL server to break replication. Automated site-to-site failover using a 128k ISDN IPX connection :)
Back at my previous job, we had to audit many tables for data changes (inserts, updates, and deletes). Our support crew had to be able to search through this data to find changes that users made.
The temporary solution that had become semi-permanent was to store each non-SELECT query. However, this was such a large system that the table would grow by about 1.5 GB a day.
The solution I came up with was to create a script that, for all tables in an external list, created the appropriate triggers to audit each table, row, and column, recording before and after values, when, and by whom, and store it all in our new audit table. This table grew to about 10% the size of the older version and stored much more usable data. It enabled us to create a UI to search and view every change made to our data, without requiring any knowledge of SQL for our support team or business users.
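The generator itself can be a short loop over the table list that emits trigger DDL. A minimal Perl sketch (the list file, audit-table layout, and T-SQL-ish dialect are all hypothetical, and a real version would also enumerate each table's columns to capture before/after values):

use strict;
use warnings;

# One table name per line in the external list.
open my $fh, '<', 'audited_tables.txt' or die "audited_tables.txt: $!";
chomp(my @tables = <$fh>);
close $fh;

# Emit one audit trigger per table per operation.
for my $table (@tables) {
    for my $op (qw(INSERT UPDATE DELETE)) {
        print <<"SQL";
CREATE TRIGGER trg_audit_${table}_$op ON $table AFTER $op AS
  INSERT INTO audit_log (table_name, operation, changed_by, changed_at)
  SELECT '$table', '$op', CURRENT_USER, CURRENT_TIMESTAMP;
SQL
    }
}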
This is at a lesser level, but I am fairly proud of a makefile I wrote for compiling code for my research. It only needs to be given your source and header file names, and it can take care of the rest all by itself (though it does make the one assumption that you will not be compiling any header files into objects; only source files get compiled). The other downsides are that it relies on GNU make's second-expansion feature, so I don't know if it works with other make programs, and that the compiler used needs to support something similar to gcc's -MM feature. Here's hoping no one laughs at it.
# Auto-generated header dependencies (built by the prereqs.mk rule below).
-include prereqs.mk
HEADERS=$(SRC_DIR)/gs_lib.h $(SRC_DIR)/gs_structs.h
SOURCES=$(SRC_DIR)/main.cpp $(SRC_DIR)/gs_lib.cpp
OBJECTS=$(patsubst $(SRC_DIR)/%.cpp,$(OBJ_DIR)/%.o,$(SOURCES))
release: FLAGS=$(GEN_FLAGS)$(OPT_FLAGS)
release: $(OBJECTS) prereqs.mk
	$(CXX) $(FLAGS) $(LINKER_FLAGS) $(OUTPUT_FLAG) $(EXECUTABLE) $(OBJECTS)
# Ask the compiler (gcc -MM or similar) for each source's header
# dependencies, then turn each "foo.o: ..." rule into a "foo = ..." variable.
prereqs.mk: $(SOURCES) $(HEADERS)
	$(CXX) $(DIR_FLAGS) $(MAKE_FLAG) $(SOURCES) | sed 's,\([abcdefghijklmnopqrstuvwxyz_]*\).o:,\1= \\\n,' > $@
# Second expansion lets each object depend on the variable named after its
# stem, i.e. the dependency list generated into prereqs.mk above.
.SECONDEXPANSION:
$(OBJECTS): $$($$(patsubst $(OBJ_DIR)/%.o,%,$$@))
	$(CXX) $(FLAGS) $(NO_LINK_FLAG) $(OUTPUT_FLAG) $@ $(patsubst $(OBJ_DIR)/%.o,$(SRC_DIR)/%.cpp,$@)
Obviously I dropped the definition of a number of variables, but I think it gets the idea across.
Since my coding tools and style are compatible with the requirements of this script I like to use it. All I need to do to add (a) new piece(s) of source code is add its name(s) to the appropriate variable and the rest is taken care of.
We have Twitter accounts for many projects which tweet things like commit messages, notices from builds, failed unit tests, deployments, bug-tracking activity - any kind of event associated with the project. Running a client like Gwibber (which displays a pop-up for each new status) is a great way to stay in touch with the activity on the projects you are interested in. Using Twitter is good as you can take advantage of all the 3rd-party apps, such as the iPhone clients.
Add a commit-hook check for VRML/3D-model files with absolute paths to textures/images. f:/maya/my-textures/newproject/xxxx.png just doesn't belong on the server.
Back in 1993, when source control systems were really expensive and unwieldy, the company I worked at had in-house source control built as 4DOS scripts. It wasn't as sophisticated as most current source control systems; for example, it didn't have branching or integration, but it did the basic job of supporting revision history, checkout/checkin, and rudimentary conflict resolution.