What is the preferred way to reconcile multiple Default Page(s)? - apache

Here's the problem: we have a family (approx. 8) of websites, each hosted on a different subdomain of a single domain common to every member of the family. E.g.,
ecommerce.my_domain.com
forums.my_domain.com
signup.my_domain.com
For various reasons, each subdomain is administered separately from the others--i.e., different servers, completely autonomous with respect to the others regarding nearly every development decision, including choice of web framework--for instance, two are Django, one is Zend, and so on (though all run Apache 2.2). We want to fix this, and someday we will; it just won't be anytime soon.
One direct consequence of this structure is that we have multiple Default page names. By 'Default Page' I'm referring to the page the server defaults to when no page on the subdomain is given--sometimes it's 'index.html', sometimes 'index.php', etc. (I know what they are; it's the fact that there are multiple pages that's a problem.)
(The Default page is the webpage to which your server defaults when no page on the domain is specified. For example, if the "index.html" page is served when you enter "www.my_domain.com", "index.html" is the Default page.)
Here's one problem it causes: our analytics code (JavaScript) will count page views to subdomain1.my_domain.com and subdomain1.my_domain.com/index.html as two separate pages, unless the correct default page is specified. By itself, this can cause a two-fold error in the basic page view measurement. In addition, the analytics system (Google Analytics) only allows a single Default Page to be specified.
After looking into this, it seems one way to do it is at the server (Apache 2.2): (i) create a CGI directory without using ScriptAlias; (ii) use DirectoryIndex to specify a default document when only the directory is requested.
I suppose this can also be done within the web framework that supports each subdomain property, though given we have multiple different frameworks, that option is certainly less appealing.
I would be grateful for the Community's view on the preferred way to do this.

What about using a .htaccess file to handle it?
Just add a line resembling the following, listing the file names in the order you want Apache to look for them when choosing the default:
DirectoryIndex index.html default.php default.htm foo.html index.cgi
.htaccess reference docs: http://httpd.apache.org/docs/current/howto/htaccess.html
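To make the double-counting problem from the question concrete, here is a minimal sketch (in Python, purely for illustration) of normalizing URLs so that a directory and its default page collapse to the same key before page views are counted. The DEFAULT_PAGES tuple and the function name are assumptions made up for this example; they are not part of Apache or Google Analytics, and the real list would have to match whatever each subdomain actually serves.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of default page names used across the subdomains.
DEFAULT_PAGES = ("index.html", "index.htm", "index.php", "default.htm")


def normalize_default_page(url: str) -> str:
    """Return url with a trailing default page name removed from its path."""
    parts = urlsplit(url)
    # Split the path into everything before the last "/" and the last segment.
    head, _, last = parts.path.rpartition("/")
    if last.lower() in DEFAULT_PAGES:
        path = head + "/"
    else:
        # An empty path (bare host name) is treated as the root directory.
        path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

With this, subdomain1.my_domain.com and subdomain1.my_domain.com/index.html both map to the same URL, so a report built on the normalized value would count them as one page.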

Related

(MacOS Server) Apache File Extension Questions

I am running into some sort of issue when trying to access my local website:
Forbidden
You don't have permission to access /index.html on this server.
Apache Server at ffghost.local Port 34580
I'm using macOS X Server 5.2 with Apache 2.4.18. OS X Server automatically creates two default websites (one on port 80 and one on port 443). I created a new website. It was my understanding that Apache would redirect from the default site to the created site automatically once created. This didn't happen. So, in an attempt to begin de-conflicting, I replaced the files where the default site was located with the new website files and all of a sudden I am getting the above 404 message.
I have read a lot of possibilities as to why this may be happening. I've run a syntax checker for Apache in terminal and terminal says syntax is ok. So from there I was going to check into the config files, but there are several, and I just want to know the gist behind them.
There seem to be about 4 file extension types. I don't know what they all mean or if they are active.
.config (I'm assuming this is the active file)
.config.prev (I'm assuming this is a previous version or copy of an active config file and is no longer active)
.config.orig (original file? and is no longer active)
.config.default (???)
Also, OS X Server and Apache seem to have the same files in two different places and I'm a little confused on which one to change. If I change one of them will it be reflected in the other? Do I need to change both of them? Additionally, I don't have DNS set up and am unsure if that was the original issue of not pulling up the new website over the default site.
You are mixing several aspects in your question which makes it complicated to give a helpful answer. For example, you say you get Forbidden when accessing your site, but later you mention a status 404. The former might be due to configuring a user group being allowed to access the site, while the latter just means Not found.
As to your actual question about the config files:
The file just ending in .conf is the one that is being used.
However, the Server app uses a lot of different config files which might be relevant:
Path /Library/Server/Web/Config/apache2 contains the general config files
httpd.conf - general Apache configuration
httpd_server_app.conf - more general configuration
the other files contain configurations for specific applications or webapps (the latter being defined in plist files in /Library/Server/Web/Config/apache2/webapps)
Path /Library/Server/Web/Config/apache2/sites contains config files specific to your websites. They are named something like 0000_127.0.0.1_34543_your.domain.name.conf where 34543 is the configuration for the https (SSL) port, while 34580 would indicate the http port. There is also a file like 0000_127.0.0.1_34543_.conf (no domain name in the file name) which defines the default site.
In addition to these, there are two more configuration files in /Library/Server/Web/Config/proxy which configure the proxy services.
It is not recommended to manually adjust the config files, except for those in the sites subdirectory, because they may get overwritten by the Server app or when updating the Server app.
Important: If you change the files manually, you must restart the Apache server to make the changes effective. Use sudo serveradmin stop web followed by sudo serveradmin start web to do so.
However, I do not know of any detailed documentation of all these files, so I try to stay on the safe side and avoid editing the general config files where possible (only those in sites). I also recommend writing down any manual changes, so they can be reapplied if necessary.
Without exactly knowing what you configured in the Server app and which files you changed and how, I'm afraid it is impossible to say what might have gone wrong. I recommend starting over by removing and re-adding the web sites.
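If it helps to see which site file belongs to which port and domain, the naming scheme described above can be picked apart with a small script. This is only a sketch based on the example names in this answer (prefix, address, port, optional domain, then .conf); the pattern, the SiteConfigName type and the function name are my own assumptions, not something documented by Apple.

```python
import re
from typing import NamedTuple, Optional


class SiteConfigName(NamedTuple):
    prefix: str            # e.g. "0000"
    address: str           # e.g. "127.0.0.1"
    port: int              # e.g. 34543
    domain: Optional[str]  # None for the default site (no domain in the name)


# Assumed layout: <prefix>_<address>_<port>_<domain>.conf
_PATTERN = re.compile(r"^(\d+)_([^_]+)_(\d+)_(.*)\.conf$")


def parse_site_config_name(filename: str) -> Optional[SiteConfigName]:
    """Split a site config file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    prefix, address, port, domain = match.groups()
    return SiteConfigName(prefix, address, int(port), domain or None)
```

Running this over the names in the sites directory gives a quick overview of which file is the default site (domain is None) and which port each file configures, without opening any of them.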

List of served files in apache

I am doing some reverse engineering on a website.
We are using a LAMP stack under CentOS 5, without any commercial/open source framework (Symfony, Laravel, etc.). Just plain PHP with an in-house framework.
I wonder if there is any way to know which files in the server have been used to produce a request.
For example, let's say I am requesting http://myserver.com/index.php.
Let's assume that 'index.php' calls other PHP scripts (e.g. to connect to the database and retrieve some info), and that it also includes a couple of other HTML files, etc.
How can I get the list of those accessed files?
I already tried to enable the server-status directive in apache, and although it is working I can't get what I want (I also passed the 'refresh' parameter)
I also used lsof -c httpd, as suggested in other forums, but it is producing a very big output and I can't find what I'm looking for.
I also read the apache logs, but I am only getting the requests that the server handled.
Some other users suggested to add the PHP directives like 'self', but that means I need to know which files I need to modify to include that directive beforehand (which I don't) and which is precisely what I am trying to find out.
Is it actually possible to trace the internal activity of the server and get those file names and locations?
Regards.
Not that I tried this, but it looks like mod_log_config is the answer to my own question

Aliases on Dreamhost, general management of http request / server errors

I had a hard time deciding how I should manage these errors (404, 500, ...) and when I finally decided, I am encountering problems. This is a reeeeeally long question, I appreciate anyone's attempt to help!
Let me first describe how I decided to set it up. I have several sites hosted on a shared Dreamhost account. In the folder structure that I see, everything of mine on the server is under /home/username, and for example, site1.com's web root is at /home/username/site1.com
I am creating a generic error handler (php script) for errors like 404 not found, 500, etc. that I want to store above the web roots of my sites at /home/username/error_handler/index.php so that I can use an .htaccess file at /home/username/.htaccess which includes something like the following:
ErrorDocument 404 /error_handler/index.php
ErrorDocument 500 /error_handler/index.php
...and many more
When these errors occur on any of my sites, I want it to be directed to /home/username/error_handler/index.php. This is the problem I'm having a hard time figuring out. The ErrorDocument directives above will actually cause Apache to look for /home/username/site1.com/error_handler/index.php.
Anyway, the errors should be redirected to my error handling php script. The script will use $_SERVER['REDIRECT_STATUS'] to get the error code, then use $_SERVER['REDIRECT_URL'] and $_SERVER['HTTP_HOST'] to decide what to do. It will check if an error handler specific to that site exists (for example: site1.com/errors/404.php). If this custom page doesn't exist, it will output a generic message that is slightly more user-friendly and styled, and perhaps will include some contact info for me depending on the error.
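To make the intended dispatch logic clearer, here is a rough sketch of it in Python (the real handler would be the PHP script, reading the status and host from $_SERVER). The function names, the sites_root parameter and the returned strings are placeholders invented for this illustration; the only parts taken from the description are the per-site errors/<status>.php lookup and the generic fallback.

```python
from pathlib import Path
from typing import Optional


def find_site_error_page(sites_root: Path, host: str, status: int) -> Optional[Path]:
    """Return <sites_root>/<host>/errors/<status>.php if that file exists, else None."""
    # The host comes from the request, so reject values that could escape sites_root.
    if not host or "/" in host or "\\" in host or host.startswith("."):
        return None
    candidate = sites_root / host / "errors" / f"{status}.php"
    return candidate if candidate.is_file() else None


def render_error(sites_root: Path, host: str, status: int) -> str:
    """Describe which page would be shown: the site-specific one or the generic fallback."""
    page = find_site_error_page(sites_root, host, status)
    if page is not None:
        return f"use site handler: {page}"
    return f"generic {status} page for {host}"
```

For example, render_error(Path("/home/username"), "site1.com", 404) would pick /home/username/site1.com/errors/404.php when it exists and fall back to the generic message otherwise. Logging or email notification would hang off the same single entry point.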
Doing it this way lets me funnel all these errors through this 1 php script. I can log the errors however I like or send email notifications if I want. It also lets me set up the ErrorDocument Apache directives once for all my sites instead of having to do it for every site. It will also continue to work without modification when I move the site around, since I already have a system that scans the folder structure to figure out where my site roots are when they really aren't at the web root technically speaking. This may not be possible with other solutions like using mod_rewrite for all 404 problems, which I know is common. Or if it is possible, it may be very difficult to do. Plus, I have already done that work, so it will be easy for me to adapt.
When I am working on sites for which I don't have a domain name yet (or sites where the domain name is already in use at the moment), I store them temporarily in site1.com/dev/site3.com for example. Moving the site to site3.com eventually would cause me to have to update the htaccess files if I had one for each site. Changing the domain name would do the same.
Ex: a site stored at site1.com/dev/site3.com would have this in its htaccess file:
ErrorDocument 404 /site1.com/dev/site3.com/error/404.php
And it would have to be changed to this:
ErrorDocument 404 /site3.com/error/404.php
Obviously, this isn't a huge amount of work, but I already manage a lot of sites and I will probably be making more every year, 95% of which will be hosted on my shared DreamHost account. And most of them get moved at least once. So setting up something automatic will save me some effort in the long run.
I already have a system set up for managing site-relative links on all my sites. These links will work whether the site exists in a subdirectory of an existing site, or in their own domain. They also work without change in a local development server despite a difference in the web root location. For example, on the live server, the site-relative http link /img/1.jpg would resolve to the file /home/username/site1.com/img/1.jpg while on my local development server it would resolve to C:\xampp\htdocs\img\1.jpg, despite what I consider the logical site root being at C:\xampp\htdocs\site1.com. I love this system, and it is what gave me the idea to set up something that would work automatically like I expected it to, based on the file structure I used.
So, if I could get it to work, I think this seems like a pretty good system. But I am still very new to apache configuration, mod_rewrite, etc. It's possible there is a much easier and better way to do this. If you know of one, please let me know.
Anyway, all that aside, I can't get it working. The easiest thing would be if I could have the ErrorDocument directive send the requests to folders above the web root. But the path is a URL path relative to the document root. Using the following in /home/username/.htaccess,
ErrorDocument 404 /error_handler/index.php
a request for a non-existent resource causes Apache to look for the file at
site1.com/error_handler/index.php
So I thought I should set up a redirection (on all my sites) that would redirect those URLS to /home/username/error_handler. I tried a few things and couldn't get any of them to work.
Alias seemed like the simplest solution, but it is something that has to be set at server runtime (not sure if that is the right terminology - when the server is started). On my local server, it worked fine using:
Alias /error_handler C:\xampp\htdocs\error_handler2
I changed the local folder to test that the Alias was functioning properly. (On the local server, the URL path specified by the ErrorDocument directive is actually pointing to the right folder, since on my local server the web root is technically C:\xampp\htdocs and the error handler I want to use is stored locally at C:\xampp\htdocs\error_handler\index.php.)
Dreamhost has a web client that can create what I am guessing is an Alias. When I tried to redirect the folder error_handler on site1.com to /home/username/error_handler, it would seem to work right if I typed site1.com/error_handler in the browser. But if I typed site1.com/test1234 (non-existent), it would say there was a 404 error trying to use the error handler. Also, I would have to log in through the web client and point and click (and wait several minutes for the server to restart) every time I wanted to set this up for a new site, even if I could get it to work.
So I tried getting it to work with mod_rewrite, which seems like the most flexible solution. My first attempt looked something like this (stored in /home/username/site1.com/.htaccess for now, though it would eventually be at /home/username/.htaccess):
RewriteEngine On
RewriteRule ^error_handler/index.php$ /home/username/error_handler/index.php
The plain English version of what I was trying to do above is to send requests on any of my sites for error_handler/index.php to /home/username/error_handler/index.php. My misunderstanding was that the substitution would be treated as a file path if it exists. But I missed that the documentation says "(or, in the case of using rewrites in a .htaccess file, relative to your document root)". So instead of rewriting to /home/username/error_handler/index.php, it's actually trying to rewrite to /home/username/site1.com/home/username/error_handler/index.php.
I tried including Options +FollowSymLinks because in the Apache documentation it says this:
To enable the rewrite engine in this context [per-directory re-writes in htaccess], you need to set "RewriteEngine On" and "Options FollowSymLinks" must be enabled. If your administrator has disabled override of FollowSymLinks for a user's directory, then you cannot use the rewrite engine. This restriction is required for security reasons.
I searched around for a while and I couldn't find anything about how Dreamhost handles this (probably because I don't know where to look).
I experimented with RewriteBase because in the Apache documentation it says this:
"This directive is required when you use a relative path in a substitution in per-directory (htaccess) context unless either of the following conditions are true:
The original request, and the substitution, are underneath the DocumentRoot (as opposed to reachable by other means, such as Alias)."
Since this is supposed to be a URL path, in my case it should be RewriteBase /, since all my redirects will be from site1.com/error_handler. I also tried RewriteBase /home/username and RewriteRule ^error_handler/index.php$ error_handler/index.php. However, RewriteBase takes a URL path relative to the document root, so I still need something like an alias. The implication in the quote from the documentation above is that it is possible to use mod_rewrite to send content above the web root. One of the many things I don't know is what the 'other means' besides Alias might be. I believe Alias might not be an option on Dreamhost. At least I couldn't make sense of it.
Why not use error pages in the site root, then include the actual file from the shared section?

How to use relative URL's in website with two base URL's

I have our basic corporate static html website installed in our web root directory and our billing software installed in /portal. I have integrated the websites to look like a single site by including the /menu.tpl smarty template file in the /portal/header.tpl file. However, if I use relative URLs, the menu system doesn't work, as the base URL for the billing script is /portal. I.e., if I create a link to faq.php in menu.tpl and I load a page on the portal site, the link in the menu back to the FAQ page is now /portal/faq.php, whereas if I load a page off the root site the link is just /faq.php as it should be.
The obvious answer is to just use absolute URLs, but I need the site to be portable as I have many developers who need to install and test it.
I can't find any way to resolve this. Any ideas?
I ran into the same problem as you a while ago, and after trying a lot of dead ends, I finally ended up with the following solution:
For any URL you need to be a chameleon, i.e. change its path depending on the environment, insert a PHP function that writes out the correct URL.
If you include the PHP function from a single central file, then you can change all of the URLs in the entire site automatically, based on a setting, or some pre-detected switch such as the current domain name, etc.
Example:
<?php print_base_url_plus("/menu.php"); ?>
... where print_base_url_plus() is a function which prepends the base URL to the given path and prints the result.
You may find that you have to change some of the URLs to be .php, so they are preprocessed by the PHP engine, or you can alter the web settings so that standard .htm files are piped through the PHP engine, just like .php files.
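The core of such a function is just a careful string join. Here is the idea sketched in Python rather than PHP, only to show the logic; base_url_plus and its base_url parameter are names made up for this illustration, and in practice the base would come from your central settings file or from the detected environment.

```python
def base_url_plus(path: str, base_url: str = "") -> str:
    """Prepend base_url to a site-relative path, avoiding doubled or missing slashes."""
    # An empty base_url yields a root-relative link such as "/menu.php".
    return base_url.rstrip("/") + "/" + path.lstrip("/")
```

With an empty base the menu link stays /faq.php, and a developer who installs the whole site under some subfolder only changes the one base_url setting instead of every link in the templates.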

Is sub-domain better or sub-folder?

I have a classifieds site and I want to know whether creating subdomains or sub-folders is better, as I am in a state of confusion.
If we need to take sub-domains then what are the benefits which we can get from sub-domains?
If we need to take sub-folders then what are the benefits we can get from them?
If we create sub-domains then Google considers sub-domains as individual domains and would show only 2 results per page.
So please clear my doubts and let me know which one is better.
Search engines will treat subdomains more like separate domains, so in many cases using sub-folders is the way to go, so that you are not spreading yourself too thin.
Using both is a bad idea, as search engines will try to index both, and one will get flagged as duplicate content.
Here's an article from SEOMoz explaining it in more detail:
http://www.seomoz.org/blog/understanding-root-domains-subdomains-vs-subfolders-microsites
Personally, I go off the logic that a subdomain is a different application / site owned by the same company. A subfolder is part of the same application / site.
It is probably better for your PageRank / search engine listings if you have 'one-big-site', rather than lots of different loosely coupled domains. If the content of the sites is distinctly different, e.g. a personal site might have a gallery or blog subdomain which keeps the content distinctly separate from the main site, then it probably makes sense to use a subdomain; otherwise, I'd stick with folders.
P.S. Side note, dunno if this is important, but web browsers limit how many connections they open at once to a single hostname when downloading the different files that constitute a page (historically two, more in current browsers). So there is a reason for a subdomain in this instance, where it actively speeds up (though, on a fast site, minimally) the page load time.
It depends on what content you want to create subfolder/-domains for.
Is it related to your original site? Then you should definitely use sub-folder as this belongs to the same domain. It's much better for the link juice to spread if you use sub folders.
As for the PageRank, it's better to use subfolders.
A subdomain is considered a new site. A subfolder is not, and will get better rankings if your original site is powerful.
You can use both subdomains and subfolders if you like, but don't forget to use the canonical-tag to avoid duplicate content.
Two relevant links that explain this further:
http://www.searchmarketingstandard.com/when-to-use-subdomains-vs-subfolders
http://www.searchenginejournal.com/subdomains-or-subfolders-which-are-better-for-seo/6849/
Why not both? Have demandb.com/foo and foo.demandb.com go to the same place.
Technically, a subdomain is a different server. The company I work for has a domain with several subdomains where every subdomain is located on a different (virtual or real) computer/server. That way, if one of them crashes, the rest just keeps running.
From a developer's perspective, a subdomain would force everything within the subdomain to be a different application while in a subfolder, the subfolder and subsubfolders could be part of the application in the root folder. When I create web applications, those web applications are often tested first on a test server in a subfolder. Once they make it through the tests, they are moved to the root of their own subdomain.
When two subfolders are related to one another, they're often part of the same application, thus it would be better to keep them in their own subfolder so they can share cookies and sessions more easily.
2 comments:
Use sub folders if you need SSL- then you only need 1 basic certificate for the root
If you use both, make sure you redirect 301 one to the other. That will avoid the search engine duplication issue, but would still be problematic for SSL in certain situations.
If your site can be easily partitioned by the subdomains and each subdomain can operate independently then do it! You can then easily scale out your application by deploying different servers(or clusters) for each subdomain.
Examples:
Craigslist: by region(seattle.craigslist.org, sfbay.craigslist.org, etc)
Livejournal: by community/user
Technically, you can do this with folders, but it requires a web proxy farm, whereas subdomains can be done with simple DNS entries.
It also depends on your needs: whether you want a separate login system for each subdomain, because that won't be possible if you are using sub-folders. Sub-folders share the same session.
For subdomains, you have to set a cookie that is shared across all of them.
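Sharing a cookie across subdomains comes down to setting its Domain attribute to the parent domain. A minimal sketch using Python's standard http.cookies module follows; the cookie name session_id and the helper function are just examples, and a real login cookie would normally also want attributes such as Secure and HttpOnly.

```python
from http.cookies import SimpleCookie


def shared_session_cookie(session_id: str, parent_domain: str) -> str:
    """Build a Set-Cookie header whose cookie is sent to every subdomain of parent_domain."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    # Scope the cookie to the parent domain so that all subdomains receive it.
    cookie["session_id"]["domain"] = "." + parent_domain.lstrip(".")
    cookie["session_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")
```

Calling shared_session_cookie("abc123", "my_domain.com") produces a header that forums.my_domain.com and signup.my_domain.com would both receive back from the browser, whereas a cookie set without the Domain attribute stays with the host that issued it.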