How do many websites hide their file structure? - apache

When I look at many large sites (e.g. Wikipedia or this site), the URLs look like this:
http://en.wikipedia.org/wiki/StackOverflow
And not like:
http://en.wikipedia.org/wiki.php?article=StackOverflow
http://en.wikipedia.org/wiki.pl?article=StackOverflow
... or even
http://en.wikipedia.org/wiki?article=StackOverflow
I suppose that Wikipedia does not create a separate file for every article (and then use Apache modules like mod_rewrite to hide the file extensions).
But how do they do this? Are they using a special server? Is there a way to configure Apache to act like this? For example, one script is called for every request, and the path of the request is passed to the script, which decides what to print.

These are called friendly or clean URLs.
Have a look at
http://en.wikipedia.org/wiki/Rewrite_engine
http://en.wikipedia.org/wiki/Clean_URL
http://www.petefreitag.com/item/503.cfm
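The "one script handles every request" idea from the question can be sketched roughly as below. This is only an illustration in Python rather than a description of how Wikipedia actually does it; the handler names, the ROUTES table and the returned strings are all made up for the example. It assumes the web server has already been configured (for instance with a rewrite rule) to hand every request path to this single entry point.

```python
# Hypothetical handlers; a real site would render templates or query a database.
def show_home(args):
    return 200, "Home page"


def show_article(args):
    # Expect exactly one argument: the article title taken from the URL.
    if len(args) != 1:
        return 404, "Not found"
    return 200, f"Article: {args[0]}"


# Dispatch table (an assumption for this sketch): first path segment -> handler.
ROUTES = {"": show_home, "wiki": show_article}


def handle_request(path):
    """Single entry point: every request path is passed here."""
    segments = [s for s in path.split("/") if s]
    first = segments[0] if segments else ""
    handler = ROUTES.get(first)
    if handler is None:
        return 404, "Not found"
    return handler(segments[1:])


print(handle_request("/wiki/StackOverflow"))
```

No file named after the article exists anywhere; the script decides what to print purely from the path it was given.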

Related

Is there a quick way to detect redirections?

I am migrating a website and it has many redirections. I would like to generate a list in which I can see all redirects, target and source.
I tried using Cyotek WebCopy, but it seems unable to give me the data I need. Is there a crawling method to do that? Or can this perhaps be obtained from the Apache logs?
Of course you can do it by crawling the website, but I advise against it in this specific situation, because there is an easier solution.
You use Apache, so you are (probably) working with the HTTP/HTTPS protocol. You could rely on the HTTP Referer header; if you use PHP, you can read the previous page from $_SERVER['HTTP_REFERER']. Keep in mind that this header is supplied by the client and may be missing. So, you will need to do the following:
figure out a way to store previous-next page pairs
at the start of each request, store such a pair, knowing what the current URL is and what the previous one was
if necessary, group your URLs and do some aggregation
load the output somewhere and analyze it
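As a rough sketch of the pairing and aggregation steps, the same previous/current pairs could also be pulled out of an existing access log instead of being recorded in PHP. The example below is in Python and assumes the log uses Apache's "combined" format, which includes the Referer field; the function name and the sample lines are invented for illustration. Note that this format records the redirecting URL and the status code, but not the redirect target.

```python
import re
from collections import Counter

# Matches one line of Apache's "combined" log format (assumed here).
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "[^"]*"$'
)

# Status codes that signal a redirect (304 Not Modified is deliberately excluded).
REDIRECT_CODES = {"301", "302", "303", "307", "308"}


def redirect_pairs(lines):
    """Count (referer, requested path) pairs for requests answered with a redirect."""
    pairs = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("status") in REDIRECT_CODES:
            pairs[(m.group("referer"), m.group("path"))] += 1
    return pairs


# Invented sample data, not taken from a real log.
sample = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /old-page HTTP/1.1" 301 234 '
    '"http://www.example.com/start.html" "Mozilla/4.08"',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /new-page HTTP/1.1" 200 5120 '
    '"http://www.example.com/start.html" "Mozilla/4.08"',
]
for (referer, path), count in redirect_pairs(sample).items():
    print(count, referer, "->", path)
```

The resulting counter can then be dumped to a spreadsheet or database for the analysis step.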

Rename wp-login.php with default permalink

For better security, I would like to rename the login URL of my blog to something other than /wp-login.php. I found a plugin that would do the job:
http://wordpress.org/plugins/rename-wp-login/
But the problem is that it works only with non-default permalinks, which is a problem for me, because I use Unicode names for my topics, which could make the links very long and messy with percent-encoding. I wouldn't want to translate every link name to English... that's tedious!
Is there a way to hide wp-login.php and wp-admin from hackers without having to change the permalink form?
Thank you.
You can now use Rename wp-login.php plugin with any kind of permalink structure! ;)
I can suggest one great plugin that has plenty of useful features, including what you want. It uses a different technique that does not depend on permalinks (in short, it does all the magic with .htaccess).
It's called Better WP Security.
Why don't you use a permalink structure like this?
/%post_id%/
I had been tackling one issue for a long time: someone was trying to log in to my website with random passwords.
I got a report of the IP addresses that were hitting wp-login.php.
Besides that, I found a .sd0 file in my root folder, filled with some encrypted code.
I removed it and renamed wp-login.php to wp-login-xx.php.
After renaming the file, you also need to update the files below so that everything keeps working: search each of them for wp-login.php and replace it with the name you chose (wp-login-xx.php).
wp-login.php
wp-includes/general-template.php
wp-includes/pluggable.php
For better security, also update WordPress to the latest version. Note that these are core files, so an update will overwrite the changes and they will have to be reapplied.

How to direct multiple clean URL paths to a single page?

(Hi! This is my first time asking a question on Stack Overflow after years of finding answers here... Thanks!)
I have a dynamic page, and I'd like to have fixed URLs that point to different states of that page. So, for example, "www.mypage.co" (/index.php) is the base page, and it rearranges its content based on user choices. I'd then like to be able to point to "www.mypage.co/contentA" or "www.mypage.co/contentB" in order to automatically load the base page at "www.mypage.co" with the desired content.
At heart the problem is an aesthetic one. I know I could simply write www.mypage.co/index.html?state=contentA to reach the desired end, but I want to keep the URL simple and readable (i.e., clean). Also, due to limitations in my hosting arrangement, I would most appreciate a solution that is server-independent (across LAM[PHP] stacks, at least), if possible.
Also, if I just have incorrect assumptions about how to implement clean URLs, I'd appreciate direction to a good, comprehensive explanation. I can't seem to find one...
You could use an .htaccess file to redirect all requests to one location and then determine from there what you want to return to the client. Look over the .htaccess/dispatch system that Tonic uses.
If you use Apache, you can use mod_rewrite. I have a rule like this where multiple RESTful URLs all go to the same page, using a regex and moving parts of the old URL into parameters for the new URL:
RewriteRule ^/testapp/(name|number|rn|sid|unii|inchikey|formula)(/(startswith))?/?(.*) /testapp/ProxyServlet?objectHandle=Search&actionHandle=drillIn&searchtype=$1&searchterm=$4&startswith=$3 [NC,PT]
That particular regex accepts URLs like
testapp/name
testapp/name/zuchini
testapp/name/startswith/zuchini
and forwards them to the same page.
I also use UrlRewriteFilter for Tomcat, but as you mentioned PHP, it doesn't seem that it would be useful to you.
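To see what the rule above captures, the same regular expression can be tried outside Apache. The following is a small Python sketch, not part of the original setup: the rewrite function is a hypothetical stand-in for what mod_rewrite does with the $1, $3 and $4 back-references, re.IGNORECASE stands in for the [NC] flag, and the paths are written with the leading slash that the rule's pattern expects. An unmatched optional group is treated as an empty string here.

```python
import re

# Same pattern as in the RewriteRule; re.IGNORECASE plays the role of [NC].
PATTERN = re.compile(
    r"^/testapp/(name|number|rn|sid|unii|inchikey|formula)(/(startswith))?/?(.*)",
    re.IGNORECASE,
)


def rewrite(path):
    """Return the internal URL the rule would produce, or None if it does not match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    searchtype = m.group(1)          # $1
    startswith = m.group(3) or ""    # $3, empty when the optional part is absent
    searchterm = m.group(4)          # $4
    return (
        "/testapp/ProxyServlet?objectHandle=Search&actionHandle=drillIn"
        f"&searchtype={searchtype}&searchterm={searchterm}&startswith={startswith}"
    )


for url in ("/testapp/name", "/testapp/name/zuchini", "/testapp/name/startswith/zuchini"):
    print(url, "->", rewrite(url))
```

All three example URLs end up at the same ProxyServlet page and differ only in the query parameters.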

apache mod_rewrite: using database to update rewrite rules

Total newbie at mod_rewrite.
Let's say I want to create nice URLs for every manufacturer on my site,
so I have
www.mysite.com/samsung
www.mysite.com/sony
www.mysite.com/acme
works well enough.
However, if I have hundreds of manufacturers and if they're changing constantly, what then? There are some vague references for something called rewrite map somewhere but nothing that explains it and no tutorials. Can anyone help?
Also, why is this problem not the main topic covered in tutorials for mod_rewrite? How is mod_rewrite possibly useful when you have to maintain it manually (assuming you have new content on your site once in a while)?
There is also mention of needing to have access to httpd.conf
How do I access httpd.conf on my hosting provider's server? How does every other site do this?
Thanks
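Just came across this answer while searching for a similar solution. Searching a bit further, I discovered that mod_rewrite has the RewriteMap directive, which will do exactly what you want without the need to run PHP or another scripting language.
It lets you define a mapping rule with a text file, a DBM file, an external script or an SQL query. Note that the map itself has to be declared in the server or virtual host configuration; it cannot be declared in .htaccess, although it can be used from rules there once declared.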
I hope that helps!
The way this would typically be done is that you would take all URLs that match a specific pattern and route them to a PHP file (or whatever your server-side programming language is) for more complex routing. Something like this:
RewriteRule ^(.*)$ myroute.php?url=$1 [QSA,L]
Then, in your myroute.php file, you can include logic to look at the "url" query string parameter, since it will contain the original URL that came in. Perhaps you could match it to a manufacturer in the database, or whatever else is required.
This example obviously takes all URLs and maps them to myroute.php. Another example might be something like:
RewriteRule ^/manufacturers/(.*)$ manuf.php?name=$1 [QSA,L]
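In this case, it will map URLs like so (in a .htaccess file, drop the leading slash from the pattern, since the directory prefix is stripped before matching):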
/manufacturers/sony => /manuf.php?name=sony
/manufacturers/samsung => /manuf.php?name=samsung
etc...
In this case, your manuf.php file could look up the database based on the name query string parameter.
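The database lookup described above might look roughly like the sketch below. It is written in Python instead of PHP, uses an in-memory SQLite database as a stand-in for whatever the site really uses, and the table name, column names and response strings are assumptions made for the example. The point is that the rewrite rule never changes: adding a manufacturer is just a new database row.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with a few sample manufacturers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE manufacturers (slug TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO manufacturers VALUES (?, ?)",
        [("samsung", "Samsung"), ("sony", "Sony"), ("acme", "Acme")],
    )
    return conn


def route(conn, url):
    """Look up the requested path (the 'url' parameter) among the manufacturers."""
    slug = url.strip("/").lower()
    row = conn.execute(
        "SELECT name FROM manufacturers WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return 404, "Not found"
    return 200, f"Products by {row[0]}"


conn = make_db()
print(route(conn, "sony"))
# A new manufacturer needs no new rewrite rule, only a new row.
conn.execute("INSERT INTO manufacturers VALUES (?, ?)", ("nokia", "Nokia"))
print(route(conn, "nokia"))
```

The parameterised query also keeps the incoming URL from being interpolated directly into the SQL.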

Advantage of placing webpages in separate directories?

Is there really an advantage to placing each webpage in its own directory, compared to putting them all in one directory?
( www.example.com/ and www.example.com/b.php ) vs ( www.example.com/ and www.example.com/b/ )
What you've seen is probably not each file sitting in its own folder, but rather a rewriting/routing engine in action. The basic concept is that you tell the server that "a URL that looks like <this> should point to a file with a filename like <this>, and with <these> parameters". This way, you can create easily readable URLs (which benefit users, developers and search engines alike).
Example:
A user types in domain.com/cats/Garfield/. This could be interpreted as domain.com/index.php?category=cats&cat=Garfield by the server. Thus, the "usage URL" is far cleaner and easier to read and remember.
More info in the Wikipedia article about URL Rewriting.
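The Garfield example can be written out as a tiny translation function. This is a hedged Python sketch of the idea only; the function name is hypothetical, and a real server would normally do this step with its rewrite engine rather than in application code.

```python
from urllib.parse import urlencode


def to_internal_url(path):
    """Translate a clean path like /cats/Garfield/ into the index.php form."""
    segments = [s for s in path.split("/") if s]
    if len(segments) != 2:
        return None  # this sketch only knows the /<category>/<cat>/ shape
    category, cat = segments
    # urlencode escapes any characters that are not safe in a query string.
    return "/index.php?" + urlencode({"category": category, "cat": cat})


print(to_internal_url("/cats/Garfield/"))
```

The user only ever sees the short path; the parameterised form stays internal to the server.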