Keep the query string or use mod_rewrite - Apache

I have seen big companies such as Facebook, Google and Yahoo use a mix of rewritten URLs (mod_rewrite style) and query strings:
Facebook:
https://www.facebook.com/zxzxzx
https://www.facebook.com/events/upcoming?action_history=null
Google:
https://www.google.com.hk/search?q=asddvdfv
https://plus.google.com/u/0/xxx
What is the best practice when dealing with URLs?

mod_rewrite is used to produce clean, readable URLs, which search engines can index sensibly and which users can read and understand. That is why most directory listings and informational pages use rewritten URLs.
When the page is a search, takes many query-string parameters (param1=..., param2=...), or depends on dynamic parameters, a meaningful URL brings little benefit, and a mixed URL should be used instead.
If a listing page already has a meaningful URL (via mod_rewrite) and you also need dynamic parameters such as paging, there is no need to rewrite those as well. Pass them in the query string, as in the second Facebook link.
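To make the split concrete, here is a minimal Python sketch (not Apache configuration) that separates the readable path part of a mixed URL from its dynamic query parameters. The function name split_url is hypothetical; the URL is the second Facebook link from the question.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url):
    """Return (path_segments, query_params) for a mixed URL."""
    parts = urlsplit(url)
    # The readable, "rewritten" part of the URL: /events/upcoming
    segments = [s for s in parts.path.split("/") if s]
    # The dynamic part stays in the query string.
    params = parse_qs(parts.query, keep_blank_values=True)
    return segments, params


segments, params = split_url(
    "https://www.facebook.com/events/upcoming?action_history=null"
)
print(segments)  # ['events', 'upcoming']
print(params)    # {'action_history': ['null']}
```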

Related

In Express router, what is the best way to specially handle "random" urls that aren't handled by anything else?

Say I was making a URL shortener service, and I want it to be able to make URLs like [domain]/xf6B2sT. But I also want to be able to have "normal looking URLs," whether the pages are static or dynamic, and if a normal page is routed, it shouldn't go on to look for a URL of this compact format.
It would be best if you had an algorithmic way to tell whether a URL was a shortened URL or not without looking it up in your database and without comparing to all the regular site URLs. That algorithm just has to be something that allows you to examine a URL and immediately determine whether it's a shortened URL or not. If not, you send it to a router for your site URLs and if it doesn't match there you return a 404. If it does match the format for a shortened URL, then you look it up in the database and go from there.
The algorithm could be whatever you want. It could be that all site URLs have one level of path, such as http://yourdomain.com/site/home, or it could be that all shortened URLs start with some magic character like an x that no site URL will ever start with. There are any number of possible algorithms you could invent. The point is that you need to be able to quickly look at a URL with some JavaScript in your middleware and determine which kind it is without looking up anything in a database.
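As a rough illustration of the "magic character" idea, here is a sketch of the check in Python rather than Express middleware. The rule itself is an assumption made for the example: a shortened URL is a single path segment that starts with x and contains only letters and digits. The names is_short_url and dispatch are hypothetical.

```python
import re

# Assumed format: one path segment, leading "x", then letters/digits only.
SHORT_URL_PATTERN = re.compile(r"/x[A-Za-z0-9]+")


def is_short_url(path):
    """Decide from the path alone, with no database lookup."""
    return SHORT_URL_PATTERN.fullmatch(path) is not None


def dispatch(path):
    """Name the handler that should receive this path."""
    # Short URLs go to the database lookup; everything else goes to
    # the regular site router, which returns a 404 if nothing matches.
    return "shortener" if is_short_url(path) else "site"


print(dispatch("/xf6B2sT"))    # shortener
print(dispatch("/site/home"))  # site
```

With a rule like this, no regular site URL may ever be a single segment starting with x, which is the trade-off the answer describes.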

How to geotarget website with multiple languages but one link only?

I have a website with two languages, which works like this:
example.com/changelanguage.xx?lang=de
redirects to the German version, and calling the same URL again with
example.com/changelanguage.xx?lang=en
redirects to the English version.
After the redirect the URL is still example.com; only the language changes.
How do I add the hreflang attribute here (for Google indexing)?
It’s a bad practice to use the same URL for different (i.e., translated) content.
Consumers, like search engine bots, would use rel-alternate + hreflang markup to find translations. For this to work, you have to provide a different URL for the translated page.
From the search engine’s perspective, using the same URL doesn’t work for its users: when it returns http://example.com/foobar as a search result, it wants to be sure the user gets the language it intended (e.g., someone searching for German terms should get the German page). With your system this isn’t guaranteed; the user might end up with the English version.
Instead, you should represent the language in the URL, e.g. the language code as first path segment:
http://example.com/en/contact
http://example.com/de/kontakt
(Or use different domains/subdomains, or add a query parameter, etc. If you can make sure that translated pages would never have the same URL slug, you could even omit the language codes.)
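Once each translation has its own URL, the rel-alternate + hreflang markup can be generated from a simple mapping. The following Python sketch assumes a hypothetical helper hreflang_links that takes a dict of language code to URL; the URLs are the path-prefix examples from this answer.

```python
from html import escape


def hreflang_links(translations):
    """Build one <link rel="alternate"> element per translation."""
    return [
        '<link rel="alternate" hreflang="{}" href="{}">'.format(
            escape(lang), escape(url)
        )
        for lang, url in translations.items()
    ]


links = hreflang_links({
    "en": "http://example.com/en/contact",
    "de": "http://example.com/de/kontakt",
})
for link in links:
    print(link)
```

Every language version of the page would carry the same set of link elements in its head, so each translation points to all the others.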
This is a year late but https://www.bablic.com/ do exactly this!
Furthermore they can automatically detect the language set in the user's browser and automatically show the user your website in that language!

Accept case-insensitive URL or redirect to "correct" URL?

Let's say that I have a web app that responds to URLs in the format /entities/{entityKey}. In my access logs, I find people visiting both /entities/KEY1 which is how app URLs are generated, as well as the lower case version of /entities/key1. Currently key1 will throw a 404 not found error due to route requirements.
My question is, would you:
Use URL rewriting to rewrite the key to uppercase?
Create 302 redirects from lowercase to uppercase?
Have the application convert to uppercase and handle requests in a case-insensitive fashion?
Most users these days expect URLs to be case-insensitive. I would have the app silently handle the conversion in the background. I don't see it being worth the extra request time to issue a redirect.
If SEO is a concern, you can use a rel="canonical" link element to let Google and other search engines know which URL you want to appear in search results.
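A minimal sketch of the "handle it silently" option, written in Python and independent of any particular framework. The /entities/ prefix comes from the question; the function names and the base URL passed to canonical_link are assumptions for illustration.

```python
ENTITY_PREFIX = "/entities/"


def normalize_entity_key(path):
    """Return the uppercase entity key for a path, or None if no match."""
    if not path.startswith(ENTITY_PREFIX):
        return None
    key = path[len(ENTITY_PREFIX):]
    if not key or "/" in key:
        return None
    # Treat /entities/key1 and /entities/KEY1 as the same resource.
    return key.upper()


def canonical_link(base_url, key):
    """Link element telling search engines which URL is preferred."""
    return '<link rel="canonical" href="{}{}{}">'.format(
        base_url, ENTITY_PREFIX, key
    )


key = normalize_entity_key("/entities/key1")
print(key)                                        # KEY1
print(canonical_link("http://example.com", key))
```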

Why are urls usually lowercase with words separated by a dash and no special characters?

As an example, the Rails parameterize method would create a string like so:
"hello-there-joe-smith" == "Hello There Joe.Smith".parameterize
For legacy reasons, a project I am working on requires uppercase letters as well as periods to be available in a particular URL parameter.
Why would this ever be a problem?
Clarification
The URL type I'm talking about is what is used instead of an id, commonly known as a slug.
Would a Rails app with the following URL run into any issues: http://example.com/Smith.Joe?
This can be a problem both for SEO and for browser caching (and hence performance).
Search engines are case-sensitive, so the same URL in a different case is treated as two URLs.
Browser caching, for example in IE, is case-sensitive. If you access your page as MYPAGE.aspx and somewhere in the code it is written as mypage.aspx, IE treats them as two different pages and fetches the page from the server instead of from the cache.
Dashes should be fine, but underscores should be avoided: http://www.mattcutts.com/blog/dashes-vs-underscores/
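For reference, the lowercase-with-dashes convention can be approximated in a few lines. This is a Python sketch that imitates the visible effect of the Rails parameterize call from the question; it is not the Rails implementation (for instance, it does no transliteration of accented characters), and the name slugify is hypothetical.

```python
import re


def slugify(text, sep="-"):
    """Lowercase the text and replace other characters with a separator."""
    # Anything that is not a lowercase letter, digit, dash or underscore
    # becomes the separator; runs of such characters collapse into one.
    slug = re.sub(r"[^a-z0-9\-_]+", sep, text.lower())
    # Collapse repeated separators and trim them from both ends.
    slug = re.sub(re.escape(sep) + r"{2,}", sep, slug)
    return slug.strip(sep)


print(slugify("Hello There Joe.Smith"))  # hello-there-joe-smith
```

Because the result has a single case and no periods, the case-sensitivity concerns above do not arise for slugs produced this way.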

Apache rewriting eats one level of escaping (%23)

I want to use fancy URLs for a tag filter on my website. The URLs should look like http://example.com/source/+tag1+tag2. This should filter for all items tagged with tag1 and tag2. I came up with the following rewrite rule for that, which I have saved to the root directory of the site:
RewriteRule ^([^+]+)(\+.+)$ $1?tags=$2 [L]
This works fine for all normal tag names, but it fails for the tag name "c#". I know that the hash character is not sent to the server, so the tag name is URL-encoded and the link in the HTML page looks like this: ./+c%23. But the target page only sees the "c" in its tags parameter; the "#" and anything after it is gone.
I have enabled Apache's rewrite logging and saw that it already logs the incoming URL request like …/+c#. This made me think that another level of escaping could be required. So I tried with %2523 which actually passed the rewriting successfully and the whole string "c#" turned up in my page.
But then again, when I access the page with its internal URL like ./?tags=c%23, it already works, too. So why is Apache eating up one level of escaping? Is there a hidden rewrite flag I can use to avoid that? Do I need to use public URLs that are double-encoded for my fancy URLs to work? Or will it be too messy and I should instead just rename my tag to "csharp"?
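I think you need the B flag (so use [L,B]). mod_rewrite has to unescape the URL before matching, so by the time $2 is substituted the %23 has already become a literal #. The B flag escapes non-alphanumeric characters in backreferences before they are inserted into the substitution.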
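The encoding levels involved can be reproduced outside Apache. The Python sketch below only illustrates percent-decoding and re-escaping with the standard library; it does not model mod_rewrite internals, and the function names are hypothetical.

```python
from urllib.parse import quote, unquote


def decode_once(value):
    """One level of percent-decoding, as happens before rule matching."""
    return unquote(value)


def reescape(backref):
    """Percent-encode non-alphanumerics, similar in spirit to the B flag."""
    return quote(backref, safe="")


print(decode_once("+c%23"))    # +c#    (the # is now a literal character)
print(decode_once("+c%2523"))  # +c%23  (double encoding survives one pass)
print(reescape("+c#"))         # %2Bc%23
```

Note that re-escaping also encodes the + separators, so the target page would receive them as literal plus signs rather than as spaces.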