Is there a quick way to detect redirections? - apache

I am migrating a website and it has many redirections. I would like to generate a list in which I can see all redirects, target and source.
I tried using Cyotek WebCopy but it seems to be unable to give the data I need. Is there a crawling method to do that? Or probably this can be accessed in Apache logs?

Of course you can do it by crawling the website, but I advise against it in this specific situation, because there is an easier solution.
You use Apache, so you are (probably) working with HTTP/HTTPS protocol. You could refer to HTTP referrer, if you use PHP, then you can reach the previous page via $_SERVER['HTTP_REFERER']. So, you will need to do the following:
figure out a way to store previous-next page pairs
at the start of each request store such a pair, knowing what the current URL is and what the previous was
maybe you will need to group your URLs and do some aggregation
load the output somewhere and analyze

Related

How to properly use a CDN?

Good evening everyone! Thank you for opening this post.
I currently bought myself the ProCDN from MediaTemple (basically EdgeCast) and have setup a CDN where now I go to cdn-small.DOMAIN.com (or cdn-large.DOMAIN.com) it loads the normal website just fine...
However, I'm not sure which one to use.. Would I use this for the whole complete site to optimize, or use the links to add one by one for each script/stylesheet based on file size? (e.g. All JS/CSS will have the cdn-small while anything larger such as 300kb will have the cdn-large link)
And to say, if the correct way is to load the whole site as one link (e.g. everything is linked normally like js/jquery.js instead of a full link like https://cdn-small.domain.com/js/jquery.js).. Would I set a redirect from DOMAIN.com to cdn-small.DOMAIN.com for the best loading and that they only need to type in the domain not the full sub-CDN-domain?
Apologize if this isn't making sense or anything, but trying to do my best. To put it much more simple terms again is that I'm trying to find the best way to use my cdn-small/cdn-large for my website by having the user enter in the domain (https:// or http://) normally to serve my content as fast as possible near the user.
Kindly appreciate your time for reading this and wish you all a positive weekend.
Here is my live site if it even matters or want to experiement; http://bit.ly/1eGCShX

How to direct multiple clean URL paths to a single page?

(Hi! This is my first time asking a question on Stack Overflow after years of finding answers here... Thanks!)
I have a dynamic page, and I'd like to have fixed URLs that point to different states of that page. So, for example: "www.mypage.co"(/index.php) is the base page, and it rearranges its content based on user choices. I'd then like to be able to point to "www.mypage.co/contentA" or "www.mypage.co/contentB" in order to automatically load base the page at "www.mypage.co" with the desired content.
At heart the problem is an aesthetic one. I know I could simply write www.mypage.co/index.html?state=contentA to reach the desired end, but I want to keep the URL simple and readable (ie, clean). I also, due to limitations in my hosting relationship, would most appreciate a solution that is server-independent (across LAM[PHP] stacks, at least), if possible.
Also, if I just have incorrect assumptions about how to implement clean URLs, I'd appreciate direction to a good, comprehensive explanation. I can't seem to find one...
You could use a htaccess file to redirect all requests to one location and then from there determine what you want to return to the client. Look over the htaccess/dispatch system that Tonic uses.
If you use Apache, you can use mod_rewrite. I have a rule like this where multiple restful urls all go to the same page, using regex and moving parts of the old url into parameters for the new url:
RewriteRule ^/testapp/(name|number|rn|sid|unii|inchikey|formula)(/(startswith))?/?(.*) /testapp/ProxyServlet?objectHandle=Search&actionHandle=drillIn&searchtype=$1&searchterm=$4&startswith=$3 [NC,PT]
That particular regex accepts urls like
testapp/name
testapp/name/zuchini
testapp/name/startswith/zuchini
and forwards them to the same page.
I also use UrlRewriteFilter for Tomcat, but as you mentioned PHP, that doesn't seem that it would be useful.

Planning url rewrite for my web app

I'm working on a site which shows different products for different countries. The current url scheme I'm using is "index.php?country=US" for the main page, and "product.php?country=US&id=1234" to show a product from an specific country.
I'm planning now to implement url rewrite to use cleaner urls. The idea would be using each country as subdomain, and product id as a page. Something like this:
us.example.com/1234 -> product.php?country=US&id=1234
I have full control of my dns records and web server, and currently have set a * A record to point to my IP in order to receive *.example.com requests. This seems to work ok.
Now my question is what other things I'd need to take care of. Is it right to assume that just adding a .htaccess would be enough to handle all requests? Do I need to add VirtualHost to each subdomain I use as well? Would anything else be needed or avoided as well?
I'm basically trying to figure out what the simplest and correct way of designing this would be best.
The data you need to process the country is already in the request URL (from the hostname). Moving this to a GET variable introduces additional complications (how do you deal with POSTs).
You don't need seperate vhosts unless the domains have different SSL certs.

How to use regular urls without the hash symbol in spine.js?

I'm trying to achieve urls in the form of http://localhost:9294/users instead of http://localhost:9294/#/users
This seems possible according to the documentation but I haven't been able to get this working for "bookmarkable" urls.
To clarify, browsing directly to http://localhost:9294/users gives a 404 "Not found: /users"
You can turn on HTML5 History support in Spine like this:
Spine.Route.setup(history: true)
By passing the history: true argument to Spine.Route.setup() that will enable the fancy URLs without hash.
The documentation for this is actually buried a bit, but it's here (second to last section): http://spinejs.com/docs/routing
EDIT:
In order to have urls that can be navigated to directly, you will have to do this "server" side. For example, with Rails, you would have to build a way to take the parameter of the url (in this case "/users"), and pass it to Spine accordingly. Here is an excerpt from the Spine docs:
However, there are some things you need to be aware of when using the
History API. Firstly, every URL you send to navigate() needs to have a
real HTML representation. Although the browser won't request the new
URL at that point, it will be requested if the page is subsequently
reloaded. In other words you can't make up arbitrary URLs, like you
can with hash fragments; every URL passed to the API needs to exist.
One way of implementing this is with server side support.
When browsers request a URL (expecting a HTML response) you first make
sure on server-side that the endpoint exists and is valid. Then you
can just serve up the main application, which will read the URL,
invoking the appropriate routes. For example, let's say your user
navigates to http://example.com/users/1. On the server-side, you check
that the URL /users/1 is valid, and that the User record with an ID of
1 exists. Then you can go ahead and just serve up the JavaScript
application.
The caveat to this approach is that it doesn't give search engine
crawlers any real content. If you want your application to be
crawl-able, you'll have to detect crawler bot requests, and serve them
a 'parallel universe of content'. That is beyond the scope of this
documentation though.
It's definitely a good bit of effort to get this working properly, but it CAN be done. It's not possible to give you a specific answer without knowing the stack you're working with.
I used the following rewrites as explained in this article.
http://www.josscrowcroft.com/2012/code/htaccess-for-html5-history-pushstate-url-routing/

Redirect to SSL/HTTPS

I have a site where users must only be able to access the Login.aspx page via SSL/HTTPS.
Currently I run something like the following in my Page_Load():
If Not (IsSSL) Then
Response.Redirect("https://" + thisDomainName + "/Login.aspx")
End If
Is there a better way of doing this?
I would say without using url re-writing rules or some other http module then no, this is the simplest way to get what you are after.
Seeing as it's only the one page, a more elaborate process would be overkill.
Using the various Request.URL... properties may be better than hard-coding however.