As part of my learning process, I thought it would be good if I expand a little more knowledge on what I know about apache. I have several questions, and while I know some of the stuff may require a rather lengthy explanation, I hope you can provide an overview so I know where to go looking. (preferably reference to mod_wsgi)I have read some resources after searching on google, and what I know arrive from there, so please bear with me.
What does the apache lifecyle looks like before, during, and after it receives a http request? Does it spawns a new child process to do the work, or creates a thread in one of the child process?
Does apache by default runs under www-data? So if that's the case, if I want a directory under my project folder to be used for logs, I can change just the folder group to www-data and allows write access?
What user will the python interpreter run under, after being invoked by apache? And what will processes created by Popen or multiprocessing from there run under?
I ran ps U www-data. Why are there so many processes with
S 0:00 /usr/sbin/apache2 -k start
The Apache mpm prefork module handles one connection in one process. To handle connections fast and not spawn processes on demand, apache maintains a process-pool. This explains why you see so many processes in the process-list. If a connection comes in, it is handed to one of the already existing processes.
Some more information is here: http://httpd.apache.org/docs/2.0/en/mod/prefork.html
The answer to question 2) is yes, apache always runs as www-data und you can grant access to any directory by changing it's group permissions to www-data.
Read:
http://www.fmc-modeling.org/category/projects/apache/amp/Apache_Modeling_Project.html
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
http://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess
The first one will tell you all the gory details of how Apache works internally. The latter relate to mod_wsgi specifically and process/threading model.
Related
I have a Apache module that acts as a security filter that allows requests to pass or not. This is a custom made module, I don't want to use any existent module.
I have actually two questions:
The module has its own log file. I'm thinking that the best location should be in /var/log/apache2/ but since the Apache process runs on www-data user, it cannot create files on that path. I want to find a solution for the log file in such way that is not much intrusive (in terms of security) for a typical web server. Where would be the best place and what kind of security attributes should be set?
The module communicates with another process using pipes. I would like to spawn this process from Apache module only when I need it. Where should I locate this binary and how should I set the privileges as less intrusive as possible?
Thanks,
Cezane.
Apache starts under the superuser first and performs the module initialization (calling the module_struct::register_hooks function). There you can create the log files and either chown them to www-data or keep the file descriptor open in order to later use it from the forked and setuided worker processes.
(And if you need an alternative, I think it's also possible to log with syslog and configure it to route your log messages to your log file).
Under the worker process you are already running as the www-data user so there isn't much you can do to further secure the execution. For example, AFAIK, you can't setuid to yet another user or chroot to protect the filesystem.
What you can do to improve the security is to use a system firewall. For example, under AppArmor you could tell the operating system what binaries your Apache module can execute, stopping it from executing any unwanted binaries. And you can limit that binary's filesystem access, preventing it from accessing www-data files that doesn't belong to it.
I am having the most frustrating time trying to edit my httpd.conf file for Apache server. I've done it numerous times on other machines, but on this one, it says it's always in use. I've killed the Apache service and the Apache monitor process.
Does anyone know what else would use this file? Let me know of any other info you need.
you can try the commands fuser or lsof to see/kill the process that uses the file. see this link for some examples (scroll down a bit): http://www.lunarforums.com/vps_hosting_at_lunarpages/useful_linux_scripts_lsof_ps_fuser_netstat-t41474.0.html
I have an application with some cacheing backend and I want to clear the cacheing whenever the webserver is been restarted.
Is there a apache configuration directive or any other way to execute a shell script upon webserver (re)start?
Thanks,
Phil
Adding some more information, as asked by some answers already:
Base system is ofc linux based, in this exact situation: CentOs
Modifying the startup script is unfortunately no option as pointed out by one of the comments already, due to it beeing not configuration file within the respective RPM packages and therefor beeing replaced by updates. Also I think modifying the startup script would be a bad thing in general
I see, that actually linking both "restarting the webserver" and "clearing my app cache" is not exactly what should be tied together. I will consider other alternatives
My situation is as follows: I can define how the virtual host config looks like, but I can not define how the rest of the servers configuration looks like.
The application is actually PHP based (and runs on the symfony framework). Symfony pre-compiles alot of stuff into dynamic php files from what it finds in the static configuration files. We deploy our apps via RPM and after deployment, an webserver restart is actually initiated already, so I thought it might make sense to tie the cache-cleanup to it. But I think after getting all your feedback, it looks like it is better to put the cache cleanup process into the installation process itself.
You haven't provided a lot of detail here, so it's hard to give a concrete answer, but I would suggest that your best option is to write a script which handles restarting apache, and clearing your cache. It would look something like this:
#!/bin/sh
# restart apache
/etc/init.d/httpd graceful
# whatever needs to be done to clear cache
rm -rf /my/cache/dir
Ramy suggests modifying the system startup script for Apache -- this is a bad idea! If and when you update Apache on your server, there is a good chance that your change will be lost.
Dirk suggests that what you are trying to do is probably misguided, and I think he's right. You haven't told us what platform you are running, but I can think of few situations where restarting your webserver and clearing a cache actually need to happen together.
You can modify Startup script for the Apache Web Server in /etc/init.d/httpd and write your own syntax inside it.
chattr +i /etc/init.d/httpd
If you have (root) access to the server you could do this by shell scripts but I would consider if it is the best way of cache management to rely on apache restarts.
My scripts (php, python, etc.) and the scripts of other users on my Linux system are executed by the apache user aka "www-data". Please correct me if I'm wrong, but this might lead to several awkward situations:
I'm able to read the source code of other users' scripts by using a script. I might find hardcoded database passwords.
Files written by scripts and uploads are owned by www-data and might be unreadable or undeleteable by the script owner.
Users will want their upload-folders to be writeable by www-data. Using a script I can now write into other users upload directories.
Users frustrated with these permission problems will start to set file and directory permissions to 777 (just take a look at the Wordpress Support Forum…).
One single exploitable script is enough to endanger all the other users. OS file permission security won't help much to contain the damage.
So how do people nowadays deal with this? What's a reasonable (architecturally correct?) approach to support several web-frameworks on a shared system without weakening traditional file permission based security? Is using fastCGI still the way to go? How do contemporary interfaces (wsgi) and performance strategies fit in?
Thanks for any hints!
as far as i understand this, please correct me if i am wrong!
ad 1. - 4. with wsgi you have the possibility to change and therefor restrict the user/group on per process-basis.
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess
ad 5. with wsgi you can isolate processes.
http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIProcessGroup
quote from mod_wsgi-page:
"An alternate mode of operation available with Apache 2.X on UNIX is 'daemon' mode. This mode operates in similar ways to FASTCGI/SCGI solutions, whereby distinct processes can be dedicated to run a WSGI application. Unlike FASTCGI/SCGI solutions however, neither a separate process supervisor or WSGI adapter is needed when implementing the WSGI application and everything is handled automatically by mod_wsgi.
Because the WSGI applications in daemon mode are being run in their own processes, the impact on the normal Apache child processes used to serve up static files and host applications using Apache modules for PHP, Perl or some other language is much reduced. Daemon processes may if required also be run as a distinct user ensuring that WSGI applications cannot interfere with each other or access information they shouldn't be able to."
All your point are valid.
Points 1, 3 and 5 are solved by setting the open_basedir directive in your Apache config.
2 and 4 are truly annoying, but files uploaded by an web-application is also (hopefully) removable with the same application.
I am building a cgi application, and now I would like it to be like an application that stands and parses each connection, with this, I can have all session variables saved in memory instead of saving them to file(or anyother place) and loading them again on a new connection
I am using lamp within a linux vmware but I can't seem to find how to install the module for it to work and what to change in the httpd.conf. I tried to compile the module, but I couldn't because my apache isn't a regular instalation, its a lamp already built one, and it seems that the mod needs the apache directory to be compiled. I saw some coding examples out there, so I guess is not that hard once its runing ok with Apache
Can you help me with this please?
Thanks,
Joe
Using FastCGI just means that you spawn a number of processes which will handle requests they get from the actual webserver instead of the webserver spawning a new process whenever a request arrives.
Use something like memcached if you want to keep stuff like sessions in memory.