Top & httpd - demystifying what is actually running - process

I often use the "top" command to see what is taking up resources. Mostly it comes up with a long list of Apache httpd processes, which is not very useful. Is there any way to see a similar list, but such that I could see which PHP scripts etc. those httpd processes are actually running?

If you're concerned about long running processes (i.e. requests that take more than a second or two to execute), you'll be able to get an idea of them using Apache's mod_status. See the documentation, and an example of the output (from www.apache.org). This isn't unique to PHP, but applies to anything running inside an apache process.
Note that the www.apache.org status output is publicly available presumably for demonstration purposes -- you'd want to restrict access to yours so that not everyone can see it.

There's a top-like ncurses-based utility called apachetop which provides realtime log analysis for Apache. Unfortunately, the project has been abandoned and the code suffers from some bugs, however it's actually very much usable. Just don't run it as root, run it as any user with access to the web server log files and you should be fine.

The php scripts happen so fast, top wouldn't show you very much. Or it would zip by quite quickly. Most webrequests are quite quick.
I think your best bet would be to have some type of real time log processor, that kept an eye on your access logs and updates stats for you of average run time, memory usage and stuff like that.

You could make your PHP pages time themselves and write their path and execution time to file or database. Note that would slow everything down while you were monitoring, but it would serve as a good measuring method.
It wouldn't be that interactive though. You'd be able to get daily or weekly results from it, but it'd be hard to see something meaningful within minutes or hours.

Related

jmeter Load Test Serevr down issues

I was used a load of 100 using ultimate thread group for execution in NON GUI Mode .
The Execution takes place around 5 mins. only . After that my test environment got shut down. I am not able to drill down the issues. What could be the reason for server downs. my environment supports for 500 users.
How do you know your environment supports 500 users?
100 threads don't necessarily map to 100 real users, you need to consider a lot of stuff while designing your test, in particular:
Real users don't hammer the server non-stop, they need some time to "think" between operations. So make sure you add Timers between requests and configure them to represent reasonable think times.
Real users use real browsers, real browsers download embedded resources (images, scripts, styles, fonts, etc) but they do it only once, on subsequent requests the resources are being returned from cache and no actual request is being made. Make sure to add HTTP Cache Manager to your Test Plan
You need to add the load gradually, this way you will be able to state what was amount of threads (virtual users) where response time start exceeding acceptable values or errors start occurring. Generate a HTML Reporting Dashboard, look into metrics and correlate them with the increasing load.
Make sure that your application under test has enough headroom to operate in terms of CPU, RAM, Disk space, etc. You can monitor these counters using JMeter PerfMon Plugin.
Check your application logs, most probably they will have some clue to the root cause of the failure. If you're familiar with the programming language your application is written in - using a profiler tool during the load test can tell you the full story regarding what's going on, what are the most resources consuming functions and objects, etc.

Server load is minimal but website responds poorly

I have VPS on hetzner. Server is located in Germany.
It has 256GB RAM, 6CPUs (12 threads).
I have a file which since yesterday, is requested about 30 times in one second. File has 2 Select, 2 Update, 2 Insert queries, so I assumed (not sure how this works) from this file server has about 180 requests per second. So right after this requests started, all the websites on the server just started loading poorly. I made this file run just one select query and than die. This didn't help. In WHM load is aboiut 0.02.
I've checked for error logs and there is no max_user_connection or any error there.
I have enabled slow query log and checked log file. there is nothing (I've tested it with select sleep(10) and this query was logged).
This is visit statistics, please bring your attention to may 30th:
Bandwidth stats for last 24 hours:
There are many errors like this in ssl_log (diff IPs of course):
188.121.206.150 - - [30/May/2018:19:50:03 +0200] "-" 408 - "-" "-"
I've been searching web a lot and couldn't find any solution. Could anyone at least tell what should I monitor or where. I have full access to anything there is possible inside the server. Any help is appreciated.
UPDATE 1
I have subdomain: banners.analyticson.com (access allowed for now) and there I have all the images and html5 files that are requested.
Take one image for example : https://banners.analyticson.com/img/suy8G1S6RU.jpg
It needs too much time to load. As I noticed, this sub domain has some issue.
Script, that I mentioned earlier (with 6 queries) just tries to get one of those banners to the user, so result of that script is to return one banner from banners.analyticson.com.
UPDATE 2
I've checked my script, it is fine. It takes less than 1 second to complete.
I've also checked Top command and there is a result. I'm not sure if $MEM value is fine.
You're going to have to narrow the problem down...
There are multiple potential issues.
First thing to eliminate would be the performance of your new script on a development laptop - I assume you're using PHP, so use the profiling tools to work out what is going on. If it's a database query, you'll see which one by looking at the profiler.
If your PHP script and database queries are fine, the next thing to look at: it sounds like you've hit some bottleneck resource on your infrastructure. In these cases, scripts that run fine as a single request start queueing for the bottleneck resource, and every new request adds to the queue until the whole server starts to crawl. This can be a bit of a puzzle - start with top and keep digging.
Next, I'd look at configuration of Apache to make sure everything is squeaky clean - Apache used to have a default to do a reverse DNS lookup for every request, which slows the server down rather impressively on production. You may also want to look at your SSL configuration - the error you report is linked to a load balancer issue.
If it's not as simple as memory, CPU etc., you're into more esoteric issues. You may need to ramp up a load testing rig so you can experiment without affecting the live site - typically, I do this on a machine as similar to live as possible, using Apache JMeter to generate load, and find the "inflection point". Typically, you see response times increase linearly with the number of concurrent requests, until you hit the bottleneck resource, at which point the response time increases rapidly. As a simple example, if you have 10 database connections available, response time should increase linearly up to 10 concurrent connections, and then become much larger from 11 up.
Knowing where the inflection point is and being able to recreate it allows you to use PHP profiling tools under load. This is a lot of work.
UPDATE
You're using php-cgi; this is easily the most inefficient way of running PHP scripts. Your server is barely breaking a sweat - CPU and memory basically idle. Here's a comparison for how to run PHP; consider changing to mod_php.

How can you test your web server speed?

Our website seems to be slower than it used to be, how can I test that? And is there a way to find the cause? (eg too many visitors).
Thanks.
There is a rather good tool for performance benchmarking of web servers: Jakarta Jmeter, which is an Apache project, so it's rather well supported and tested.
The key to be able to pinpoint the cause would be to do benchmarking regularly, so you can actually match changes in your benchmark results with events on your server: upgrades, code changes, variations in the number of visitors...
The Firebug add on for Firefox has a Net tab which is useful for debugging issues and testing. Also Fiddler on Windows is nice. And then there is the age old tradition of checking your server error logs for any problems.
A good first step is to make sure you are keeping fairly complete server logs and feed them into a log analyser. This is helpful for giving you a general idea of how long things take and which pages are slowest. It's also a good idea to check your error logs to make sure things are working properly.
Beyond that, things get more complicated as you may need to isolate your webserver, code and database to see if one of these is the bottleneck. Also, Jeff's blog, coding horror had a recent entry on server optimization.
Use Google Analytics to track your site's visitors over time to find out if you are getting more traffic.
You tagged your question with shared-hosting - being on a shared host means that someone else's code running on the same machine as your may be affecting your site's performance.
I'd suggest going with Varkhan and apphacker's suggestion to make sure your site's code is reasonably quick. use Analytics to get some stats and the possibly, depending on how many visitors you are getting and how slow the site is, consider moving away from a shared host.
Try the server speed checker at Bitcatcha.com. The tool ping your website server and record the time needed to get a response. It also pings from 8 different nodes to your server. You are able to at least find out whether it's your server that is slowing your website.

Using a SQL Server for application logging. Pros/Cons?

I have a multi-user application that keeps a centralized logfile for activity. Right now, that logging is going into text files to the tune of about 10MB-50MB / day. The text files are rotated daily by the logger, and we keep the past 4 or 5 days worth. Older than that is of no interest to us.
They're read rarely: either when developing the application for error messages, diagnostic messages, or when the application is in production to do triage on a user-reported problem or a bug.
(This is strictly an application log. Security logging is kept elsewhere.)
But when they are read, they're a pain in the ass. Grepping 10MB text files is no fun even with Perl: the fields (transaction ID, user ID, etc..) in the file are useful, but just text. Messages are written sequentially, one like at a time, so interleaved activity is all mixed up when trying to follow a particular transaction or user.
I'm looking for thoughts on the topic. Anyone done application-level logging with an SQL database and liked it? Hated it?
I think that logging directly to a database is usually a bad idea, and I would avoid it.
The main reason is this: a good log will be most useful when you can use it to debug your application post-mortem, once the error has already occurred and you can't reproduce it. To be able to do that, you need to make sure that the logging itself is reliable. And to make any system reliable, a good start is to keep it simple.
So having a simple file-based log with just a few lines of code (open file, append line, close file or keep it opened, repeat...) will usually be more reliable and useful in the future, when you really need it to work.
On the other hand, logging successfully to an SQL server will require that a lot more components work correctly, and there will be a lot more possible error situations where you won't be able to log the information you need, simply because the log infrastructure itself won't be working. And something worst: a failure in the log procedure (like a database corruption or a deadlock) will probably affect the performance of the application, and then you'll have a situation where a secondary component prevents the application of performing it's primary function.
If you need to do a lot of analysis of the logs and you are not comfortable using text-based tools like grep, then keep the logs in text files, and periodically import them to an SQL database. If the SQL fails you won't loose any log information, and it won't even affect the application's ability to function. Then you can do all the data analysis in the DB.
I think those are the main reasons why I don't do logging to a database, although I have done it in the past. Hope it helps.
We used a Log Database at my last job, and it was great.
We had stored procedures that would spit out overviews of general system health for different metrics that I could load from a web page. We could also quickly spit out a trace for a given app over a given period, and if I wanted it was easy to get that as a text file, if you really just like grep-ing files.
To ensure the logging system does not itself become a problem, there is of course a common code framework we used among different apps that handled writing to the log table. Part of that framework included also logging to a file, in case the problem is with the database itself, and part of it involves cycling the logs. As for the space issues, the log database is on a different backup schedule, and it's really not an issue. Space (not-backed-up) is cheap.
I think that addresses most of the concerns expressed elsewhere. It's all a matter of implementation. But if I stopped here it would still be a case of "not much worse", and that's a bad reason to go the trouble of setting up DB logging. What I liked about this is that it allowed us to do some new things that would be much harder to do with flat files.
There were four main improvements over files. The first is the system overviews I've already mentioned. The second, and imo most important, was a check to see if any app was missing messages where we would normally expect to find them. That kind of thing is near-impossible to spot in traditional file logging unless you spend a lot of time each day reviewing mind-numbing logs for apps that just tell you everything's okay 99% of the time. It's amazing how freeing the view to show missing log entries is. Most days we didn't need to look at most of the log files at all... something that would be dangerous and irresponsible without the database.
That brings up the third improvement. We generated a single daily status e-mail, and it was the only thing we needed to review on days that everything ran normally. The e-mail included showed errors and warnings. Missing logs were re-logged as warning by the same db job that sends the e-mail, and missing the e-mail was a big deal. We could send forward a particular log message to our bug tracker with one click, right from within the daily e-mail (it was html-formatted, pulled data from a web app).
The final improvement was that if we did want to follow a specific app more closely, say after making a change, we could subscribe to an RSS feed for that specific application until we were satisfied. It's harder to do that from a text file.
Where I'm at now, we rely a lot more on third party tools and their logging abilities, and that means going back to a lot more manual review. I really miss the DB, and I'm contemplated writing a tool to read those logs and re-log them into a DB to get these abilities back.
Again, we did this with text files as a fallback, and it's the new abilities that really make the database worthwhile. If all you're gonna do is write to a DB and try to use it the same way you did the old text files, it adds unnecessary complexity and you may as well just use the old text files. It's the ability to build out the system for new features that makes it worthwhile.
yeah, we do it here, and I can't stand it. One problem we have here is if there is a problem with the db (connection, corrupted etc), all logging stops. My other big problem with it is that it's difficult to look through to trace problems. We also have problems here with the table logs taking up too much space, and having to worry about truncating them when we move databases because our logs are so large.
I think its clunky compared to log files. I find it difficult to see the "big picture" with it being stored in the database. I'll admit I'm a log file person, I like being able to open a text file and look through (regex) it instead of using sql to try and search for something.
The last place I worked we had log files of 100 meg plus. They're a little difficult to open, but if you have the right tool it's not that bad. We had a system to log messages too. You could quickly look at the file and determine which set of log entries belonged which process.
We've used SQL Server centralized logging before, and as the previous posted mentioned, the biggest problem was that interrupted connectivity to the database would mean interrupted logging. I actually ended up adding a queuing routine to the logging that would try the DB first, and write to a physical file if it failed. You'd just have to add code to that routine that, on a successful log to the db, would check to see if any other entries are queued locally, and write those too.
I like having everything in a DB, as opposed to physical log files, but just because I like parsing it with reports I've written.
I think the problem you have with logging could be solved with logging to SQL, provided that you are able to split out the fields you are interested in, into different columns. You can't treat the SQL database like a text field and expect it to be better, it won't.
Once you get everything you're interested in logging to the columns you want it in, it's much easier to track the sequential actions of something by being able to isolate it by column. Like if you had an "entry" process, you log everything normally with the text "entry process" being put into the "logtype" column or "process" column. Then when you have problems with the "entry process", a WHERE statement on that column isolates all entry processes.
we do it in our organization in large volumes with SQL Server. In my openion writing to database is better because of the search and filter capability. Performance wise 10 to 50 MB worth of data and keeping it only for 5 days, does not affect your application. Tracking transaction and users will be very easy compare to tracking it from text file since you can filter by transaction or user.
You are mentioning that the files read rarely. So, decide if is it worth putting time in development effort to develop the logging framework? Calculate your time spent on searching the logs from log files in a year vs the time it will take to code and test. If the time spending is 1 hour or more a day to search logs it is better to dump logs in to database. Which can drastically reduce time spend on solving issues.
If you spend less than an hour then you can use some text search tools like "SRSearch", which is a great tool that I used, searches from multiple files in a folder and gives you the results in small snippts ("like google search result"), where you click to open the file with the result interested. There are other Text search tools available too. If the environment is windows, then you have Microsoft LogParser also a good tool available for free where you can query your file like a database if the file is written in a specific format.
Here are some additional pros and cons and the reason why I prefer log files instead of databases:
Space is not that cheap when using VPS's. Recovering space on live database systems is often a huge hassle and you might have to shut down services while recovering space. If your logs is so important that you have to keep them for years (like we do) then this is a real problem. Remember that most databases does not recover space when you delete data as it simply re-uses the space - not much help if you are actually running out of space.
If you access the logs fequently and you have to pull daily reports from a database with one huge log table and millions and millions of records then you will impact the performance of your database services while querying the data from the database.
Log files can be created and older logs archived daily. Depending on the type of logs massive amounts of space can be recovered by archiving logs. We save around 6x the space when we compress our logs and in most cases you'll probably save much more.
Individual smaller log files can be compressed and transferred easily without impacting the server. Previously we had logs ranging in the 100's of GB's worth of data in a database. Moving such large databases between servers becomes a major hassle, especially due to the fact that you have to shut down the database server while doing so. What I'm saying is that maintenance becomes a real pain the day you have to start moving large databases around.
Writing to log files in general are a lot faster than writing to DB. Don't underestimate the speed of your operating system file IO.
Log files only suck if you don't structure your logs properly. You may have to use additional tools and you may even have to develop your own to help process them, but in the end it will be worth it.
You could log to a comma or tab delimited text format, or enable your logs to be exported to a CSV format. When you need to read from a log export your CSV file to a table on your SQL server then you can query with standard SQL statements. To automate the process you could use SQL Integration Services.
I've been reading all the answers and they're great. But in a company I worked due to several restrictions and audit it was mandatory to log into a database. Anyway, we had several ways to log and the solution was to install a pipeline where our programmers could connect to the pipeline and log into database, file, console, or even forwarding log to a port to be consumed by another applications.
This pipeline doesn't interrupt the normal process and keeping a log file at the same time you log into the database ensures you rarely lose a line.
I suggest you investigate further log4net that it's great for this.
http://logging.apache.org/log4net/
I could see it working well, provided you had the capability to filter what needs to be logged and when it needs to be logged. A log file (or table, such as it is) is useless if you can't find what you're looking for or contains unnecessary information.
Since your logs are read rarely, I would write them on file (better performance and reliability).
Then, if and only if you need to read them, I would import the log file in a data base (better analysis).
Doing so, you get the advantages of both methods.

The ideal multi-server LAMP environment

There's alot of information out there on setting up LAMP stacks on a single box, or perhaps moving MySQL onto it's own box, but growing beyond that doesn't seem to be very well documented.
My current web environment is having capacity issues, and so I'm looking for best-practices regarding configuration tuning, determining bottlenecks, security, etc.
I presently host around 400 sites, with a fair need for redundany and security, and so I've grown beyond the single-box solution - but am not at the level of a full ISP or dedicated web-hosting company.
Can anyone point me in the direction of some good expertise on setting up a great apache web-farm with a view to security and future expansion?
My web environment consists of 2 redundant MySQL servers, 2 redundant web-content servers, 2 load balancing front-end apache servers that mount the content via nfs and share apache config and sessions directories between them, and a single "developer's" server which also mounts the web-content via nfs, and contains all the developer accounts.
I'm pretty happy with alot of this setup, but it seems to be choking on the load prematurely.
Thanks!!
--UPDATE--
Turns out the "choking on the load" is related to mod_log_sql, which I use to send my apache logs to a mysql database. By re-configuring the webservers to write their sql statements to a disk file, and then creating a separate process to submit those to the database it allows the webservers to free up their threads much quicker, and handle a much greater load.
You need to be able to identify bottlenecks and test improvements.
To identify bottlenecks, you need to use your system's reporting tools. Some examples:
MySQL has a slow query log.
Linux provides stats like load average, iostat, vmstat, netstat, etc.
Apache has the access log and the server-status page.
Programming languages have profilers, like Pear Benchmark.
Use these tools to identifyy the slowest/biggest offenders and concentrate on them. Try an improvement and measure to see if it actually improves performance.
This becomes a never ending loop for two reasons: there's always something in a complex system that can be faster and as your system grows, different functions will start slowing down.
Based on the description of your system, my first hunch would be disk io and network io on the NFS servers, then I'd look at MySQL query times. I'd also check the performance of the shared sessions.
The schoolbook way of doing it would be to identify the bottlenecks with real empirical data.
Is it the database, apache, network, cpu, memory,io? Do you need more ram, sharding(+), is the DiskIO, the NFS network load, cpu for doing full table scans?
When you find out where the problem is you might run into the problem that its not enough to scale the infrastructure, because of the way the code works, and you end up with the need to either just create more instances of you current setup or make the code different.
I would also recommend as a first step in terms of scalability, off-load your content to a CDN like Edgecast. Use your current two content servers as additional web servers.