If you follow the principle that an application should run "out of the box" when pulled out of version control, .htaccess should be included. Something doesn't feel right about that though, as it doesn't really feel like part of the application. I'm conflicted; could someone put my mind at rest?
I typically do keep an application's .htaccess in source control. It contains the Apache configuration required for the application to run: rewrite rules that are not specific to the server it is running on, access to environment variables, etc.
Think about it this way - if the .htaccess file contains rewrite rules used by your application, those are effectively functioning as part of the application routing and therefore are part of the application.
If you are packaging an application for others to download and use, you should probably include a skeletal .htaccess file which includes the rules needed to make the application run. If your application is only intended to run on your own server and you keep all its relevant Apache config in .htaccess other than VirtualHost configuration, I would say it does indeed belong in source control.
This is a common problem for us, too, and I know what you mean about 'not feeling right'.
However, for me the issue is not a feeling that .htaccess is not part of the application, as bits of it clearly are. The issue is more that the file mixes application-specific code (routing, etc.) with installation-specific stuff (local URL rewrites, authentication/access rules, etc.). Ideally you would version control the application-specific rules but not the installation-specific rules, but of course this is not possible as both need to be in the same file.
I can think of four approaches to managing this. The optimal one will depend on your particular situation (and may vary from project to project):
1. Michael's second suggestion is the best option if you control the deployments. Namely, keep application-specific code in the .htaccess file, under version control, and any installation-specific code in your main Apache VirtualHost directive. This gives full separation but is not a viable solution if you don't have direct access to the main Apache config or are distributing to third parties.
2. Version control the application-specific elements of your .htaccess file, with clear comment markers, e.g. ### FOO APPLICATION SETTINGS - DO NOT CHANGE ### and ### ADD ANY ADDITIONAL LOCAL CONFIG BELOW THIS LINE ###, and do not version anything that is installation-specific. This is fine if the rules are simple enough not to cause conflicts, but is not great if your skeletal file requires users to modify existing lines, as you will likely end up with merge conflicts down the line. It also runs the risk of unwanted edits getting into your repository if deployments are version-controlled (as is probably the case for dev versions, at least), and the risk of local changes being blown away by an upgrade if they are not (e.g. public zipfile distributions).
3. What we settled on as a good alternative is to version control a file called .htaccess.sample, which can contain both application-specific rules and (where relevant) suggestions for installation-specific rules that users may find useful. This is easy to maintain and deploy whether or not version control is being used, but makes upgrades slightly harder, as users will need to hand-merge any changes into their local .htaccess file. In our case, upgrades are always done by developers, so this is not a big issue, assuming appropriate diff tools are available.
4. Same as #3, except you also provide a script to automatically apply any modifications. More work to set up, but not necessarily more work to maintain, depending on how you implement it (e.g. if it just replaces a block of code between some markers), and possibly worthwhile if you are distributing to a wide user base. This is what WordPress does, for example.
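For the last approach, here is a minimal sketch of the "replace a block between some markers" idea. The marker strings and function names are made up for illustration, and this is not how any particular project implements it; it only shows that local rules outside the markers can survive an upgrade.

```python
from pathlib import Path

# Hypothetical marker lines; any unambiguous comment strings would do.
BEGIN = "# BEGIN FOO APPLICATION SETTINGS"
END = "# END FOO APPLICATION SETTINGS"


def replace_managed_block(existing: str, new_rules: str) -> str:
    """Return `existing` with the managed block replaced by `new_rules`.

    Lines outside the markers (local, installation-specific config) are
    kept untouched. If no complete block is found, one is appended.
    """
    block = [BEGIN, *new_rules.strip().splitlines(), END]
    lines = existing.splitlines()
    stripped = [line.strip() for line in lines]
    try:
        start = stripped.index(BEGIN)
        end = stripped.index(END, start + 1)
    except ValueError:
        # No complete managed block yet: append one after the local rules.
        merged = lines + ([""] if lines else []) + block
    else:
        merged = lines[:start] + block + lines[end + 1:]
    return "\n".join(merged) + "\n"


def update_htaccess(path: Path, new_rules: str) -> None:
    """Rewrite the file at `path` in place, creating it if it is missing."""
    existing = path.read_text() if path.exists() else ""
    path.write_text(replace_managed_block(existing, new_rules))
```

An upgrade script would call update_htaccess(Path(".htaccess"), rules_from_sample) and leave everything outside the markers alone. Running it twice with the same rules gives the same file, which keeps upgrades repeatable.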
Suppose you want to find out about a specific Apache config setting/directive, say LimitRequestBody, in a certain directory path/to/somewhere.
You don't know where the actual file responsible for the setting in effect is. It might be the main Apache config, or any .htaccess on the way down (path/.htaccess, path/to/.htaccess or path/to/somewhere/.htaccess).
Is there a convenient way to find out which actual setting is in effect in path/to/somewhere (and optionally which file the setting originated from)?
If you stick to not using .htaccess, you can get a decent amount of information from mod_info. If you use .htaccess files, there is currently no module that will tell you "what/who caused this", simply because it's very difficult to figure out due to how rules are put in place internally.
I am, as a hobby project, working on having mod_lua (which is a core part of the 2.4 distribution) be able to tell you this in the near future, but right now, the simple answer is: You can't.
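In the meantime, a crude manual workaround is to search the .htaccess files along the path yourself. The sketch below is only a textual search under stated assumptions: the function name and arguments are made up, it knows nothing about the main Apache config, AllowOverride, or sections such as <Files>, and it does not reproduce how Apache merges settings. It only narrows down where to look.

```python
from pathlib import Path


def find_directive(doc_root: Path, rel_path: str, directive: str):
    """Yield (file, line_number, line) for each textual match of `directive`
    in the .htaccess files from `doc_root` down to `doc_root / rel_path`.
    """
    current = doc_root
    directories = [current]
    for part in Path(rel_path).parts:
        current = current / part
        directories.append(current)

    for directory in directories:
        htaccess = directory / ".htaccess"
        if not htaccess.is_file():
            continue
        for number, line in enumerate(htaccess.read_text().splitlines(), 1):
            words = line.split()
            # Skip blank lines and comments; directive names are
            # case-insensitive in Apache config files.
            if not words or words[0].startswith("#"):
                continue
            if words[0].lower() == directive.lower():
                yield htaccess, number, line.strip()
```

For the example in the question, list(find_directive(Path("/var/www"), "path/to/somewhere", "LimitRequestBody")) lists the candidate lines from the shallowest directory to the deepest (the /var/www document root is an assumption). The matches are candidates only, not proof of which setting is in effect.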
I want to know if my Apache server uses .htaccess files or not. If it uses them, then why and how?
How can I know if my Apache server is using .htaccess or not?
Thank you.
As for why, it's a convenient way for shared-hosting providers to give some access to users who would like to set some configuration options. You obviously wouldn't want everyone to have access to the main configuration file for security purposes. It's also useful for development purposes since you can set different options for different directories.
As for how Apache uses the file, I recommend reading the documentation.
As for how to know if Apache is using .htaccess files, it most likely is. I've yet to meet a shared hosting provider that doesn't. And if you are running your own server, I assume you would know how you set it up. Worst case scenario, you could follow this advice from the docs:
A good test for this is to put garbage in your .htaccess file and reload the page. If a server error is not generated, then you almost certainly have AllowOverride None in effect.
I'm using this for one of my applications:
Options +FollowSymLinks -SymLinksIfOwnerMatch
And I worry about the security problems this may bring. Any idea what measures I can take to make this approach as secure as possible?
There's nothing specific you can do to make using those options as secure as possible. The risk in using them is that a user, or a process running under a user, can disclose information or even hijack content by creating symlinks. For example, if an unprivileged user (who may have been compromised) wants to read a file that they normally can't, they can effectively escalate their access by creating a symlink to it from their public_html directory. If Apache can read the file, they can then just access their webpage and read it. There's nothing specific you can do to prevent something like that from happening except to make sure your system is properly patched and configured.
Note that this threat isn't just from users on your system. If you are running a webapp in, say, PHP, and it gets compromised somehow, an attacker can upload a PHP file browser and create symlinks to content outside of your document root (like /etc/passwd or some other file you don't want exposed to the web).
If you're worried about stuff like that, it's better not to use these options.
How can a client detect if a server is using mod_rewrite? Now I know that some mod_rewrite rules are not very obvious. But some are, such as "SEO Friendly URLs". What types of behavior are impossible unless a server is running mod_rewrite?
What types of behavior are impossible unless a server is running mod_rewrite?
The real answer is "none". In theory, any URL could be formed by actual files or directories, including the classical "SEO friendly" URLs.
There is only circumstantial evidence:
The best indication that I can think of is when the entire site structure consists of URLs without .htm .php .html file extensions:
http://domain.com/slugs/house-warming-party
To exclude the possibility of that URL being a directory, request:
http://domain.com/slugs/house-warming-party/index.htm
http://domain.com/slugs/house-warming-party/index.html
http://domain.com/slugs/house-warming-party/index.php
http://domain.com/slugs/house-warming-party/index.asp
... whatever other extensions there are .....
If those requests all fail, it is very likely that the site is using mod_rewrite. However, if they succeed, as Gumbo says, it could also be the MultiViews option fixing the request. Either way, this test is nowhere near reliable!
Depending on what your use case is, you could also try to deduce things from the CMS used on the site. WordPress with mod_rewrite turned on will show a different URL structure than with it turned off. The same holds true for most other CMSes. But of course, this is also a highly imperfect approach.
The use of HTML resources with a .html/.htm/.php ending would point slightly against the use of mod_rewrite, but you can never be sure.
The use of the PATH_INFO variable (also known as poor man's mod_rewrite) would point somewhat strongly against the use of mod_rewrite:
http://example.com/index.php/slugs/house-warming-party
In conclusion, mod_rewrite (like most URL-rewriting tools) is supposed to be a module transparent to the outside world. I know of no sure-fire way to detect it from outside, and there may well be none.
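The index-file probe described earlier in this answer could be scripted roughly as follows. This is only a sketch: the function names are made up, the list of index names is just the four extensions mentioned above, and treating any status of 400 or higher as "failed" is my own assumption. The result is circumstantial evidence at best.

```python
from urllib.parse import urlsplit, urlunsplit

# Index file names to try; extend with whatever else the server might use.
INDEX_NAMES = ("index.htm", "index.html", "index.php", "index.asp")


def index_probe_urls(url, index_names=INDEX_NAMES):
    """Build the URLs to request in order to rule out a real directory."""
    parts = urlsplit(url)
    base_path = parts.path.rstrip("/")
    return [
        urlunsplit((parts.scheme, parts.netloc, f"{base_path}/{name}", "", ""))
        for name in index_names
    ]


def all_probes_failed(statuses):
    """True if every probe returned an error status (400 or above).

    A True result only hints at URL rewriting; it proves nothing.
    """
    statuses = list(statuses)
    return bool(statuses) and all(status >= 400 for status in statuses)
```

You would fetch each URL from index_probe_urls("http://domain.com/slugs/house-warming-party") with the HTTP client of your choice and pass the resulting status codes to all_probes_failed.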
Has anyone put much thought into this? Personally, I think managing endpoints in configuration files is a pain. Are there any pros/cons to doing one over the other?
Only points in favour of configuration files from me.
Managing endpoints in configuration files means that you don't have to update your application if (or perhaps I should say when) the endpoints change.
You can also have several instances of the application running with different endpoints.
I tend to like the config approach myself too, except that the config file can get pretty big.
The one thing I have noticed with WCF configuration is that there is a lot of stuff that you can do from code that you can't do in XML config without adding your own custom extensions. In other words, doing config in code allows more flexibility. Of course, you could also just code your own extensions and use those from configuration.
However, do note what I would consider a 'bug' in Visual Studio: if you start making your own extensions and including them in XML, VS won't like your config file any more and will tag them as errors. If you then try to add a new service through the wizards, it will fail to add the endpoint to the configuration.
This is sort of a followup to my own answer:
After months of having everything in XML configuration, I'm changing everything to construct the endpoints and bindings in code. I found a really good case for having it in code: when you want to have a deployable / shareable .dll that contains WCF clients.
So for example if you have a CommonClients.dll that contains all your WCF interfaces and contracts to communicate with some remote server, then you don't want to also say "here is 100 lines of xml that you also have to drop into your app.config for every client to make it work". Having it all constructed in code works out much better in this case.
There is also a "feature" of .NET 3.5 where if you have some WCF extensions, you have to specify the fully qualified assembly name. This means that if the assembly containing the extensions changes its version number, you have to go change the assembly name in the config file too. It is supposedly fixed in .NET 4 to use a short assembly name and not require the full name.
Offhand, changing an endpoint in a config file doesn't require recompiling the application. This also means that you just need to update your config file when moving an application from Development to UAT to Production.
If you're just coding something for your own use at home, then there's no real difference. However, in a business environment, having the endpoint defined in your config file saves all sorts of headaches.
When using an app.config, your application does not need to be recompiled to adjust to a change. Also, it can be reused in multiple situations with the exact same code. Finally, hardcoding your endpoints (or anything subject to change) is poor coding practice. Don't fear the configuration file; it's declarative programming. You say, "I want to use this endpoint," and it does the work for you.
I generally do programmatic configuration, as I don't want to expose my application's internal structure to the user. The only thing I keep configurable is the service address, but even this I keep in the userSettings section, not system.ServiceModel.
I prefer and recommend the configuration file approach. It offers a lot of flexibility by allowing you to make changes to your server without the need to recompile the application.
If you need security, you can encrypt the config file.
The biggest worry with plain config files could be that they can be accidentally (or on purpose) modified by the end user, causing your app to crash. To overcome this you could make some tests in code to check that the configuration in the config file is OK and, if not, initialize it programmatically to some defaults. I presented how you could do that in another answer to this question.
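To illustrate the "check the config, fall back to defaults" idea in a language-neutral way, here is a small Python sketch rather than WCF code. The setting name endpoint_address, the default URL, and the function name are all hypothetical; the point is only the shape of the check.

```python
from urllib.parse import urlsplit

# Hypothetical built-in default, used when the configured value is unusable.
DEFAULT_ENDPOINT = "http://localhost:8000/service"


def resolve_endpoint(settings, default=DEFAULT_ENDPOINT):
    """Return the configured endpoint if it looks valid, else `default`.

    `settings` is a dict loaded from the user-editable config file.
    """
    candidate = settings.get("endpoint_address")
    if not isinstance(candidate, str):
        return default
    candidate = candidate.strip()
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit rejects some malformed URLs, e.g. a broken IPv6 host.
        return default
    if parts.scheme in ("http", "https") and parts.netloc:
        return candidate
    return default
```

With this in place, a user who mangles the setting gets the default endpoint instead of a crash at startup. Whether silently falling back is better than failing loudly depends on the application.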
It's just a question of how much flexibility you need.
Usually I prefer the config file approach.
Check out the .NET StockTrader app. It uses a repository to store config data and has a separate app to manage the configuration. The setup and structure is pretty advanced and there's a fair bit of head scratching for anyone like me that only has the basics of WCF configuration so far, but I would say it's worth a look.