Upgrade URL for SEO from example.com/dbtable_id/ to example.com/dbtable_id/article-title - seo

I have an existing journal website with the following url structure
http://example.com/dbtable_id/
(eg. http://example.com/89348/)
where 89348 is the primary key id of the journal article.
I want to add the title of the article to the url for SEO purposes like
http://example.com/dbtable_id/article-title
(eg. http://example.com/89348/hello-world)
I like this approach because I don't need to change the PHP code since it will still look up the article by dbtable_id. All I have to do is append url friendly titles to relevant links in template files and add one more rule to a .htaccess file.
Is there anything I should be concerned about? Am I following best practices? Will the possibility for mismatch between "dbtable_id" and "article-title" affect SEO?

Some argue that shallow paths are better than deeper paths, but I don't put much stock in this. A semantic page with a messy URL will do better than an unsemantic page with a "perfect" URL.
So I say go for it. As long as the URL doesn't carry any query string parameters, you should be fine.
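As a rough illustration of the "url friendly titles" the question wants to append to links, here is a minimal Python sketch. The site itself is PHP, so the helper names slugify and article_url are hypothetical and only show the idea: the id stays the lookup key and the slug is decoration.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn an article title into a lowercase, hyphen-separated URL slug."""
    # Strip accents so that "Café" becomes "Cafe" instead of losing the letter.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")


def article_url(article_id: int, title: str) -> str:
    """Build a link of the form /<id>/<slug>; only the id is used for lookup."""
    return f"/{article_id}/{slugify(title)}"


print(article_url(89348, "Hello, World!"))
```

Because the slug is derived from the title at link-generation time, the templates never need to store it separately.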

Related

Is rel=self the correct rel tag to use for forum permalinks?

I have been building a forum from scratch with my friends just for fun, and we're starting to see bots and scrapers go by. The problem we're having is that you can load a page /post/1 with four replies, and each reply includes a little permalink to itself /reply/1#reply-1. If I am on /post/1 and navigate to /reply/1, I'll end up right back where I started, just with the anchor to the reply. But! Scrapers have no idea this is the case, so they're opening every /post link and then following every /reply link, and it's causing performance issues, so I've been looking around SEO sites to try to fix it.
I've started using rel=canonical on the /reply page to tell the bots they're all the same, but as far as I can tell that doesn't help me until the bot has already loaded the page, so I still wind up with tons of traffic. Would it be correct to add rel="self" to my Permalink link tags, since they should be the same content? Or would this be misusing rel="self", and is there another, better rel value I should be using instead?
The self link type is not defined for HTML (but for Atom), so it can’t be used in HTML5 documents.
The canonical link type is appropriate for your case (if you make sure that it always points to the correct page, in case the thread is paginated), but it doesn’t prevent bots from crawling the URLs.
If you want to prevent crawling, no link type will help (not even the nofollow link type, but it’s not appropriate for your case anyway). You’d have to use robots.txt, e.g.:
User-agent: *
Disallow: /reply/
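To check what those two lines actually block, one option is Python's standard urllib.robotparser. This is only a sketch: the helper name is_crawlable and the bot name ExampleBot are made up for illustration, and real crawlers may interpret robots.txt slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the answer above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reply/
"""


def is_crawlable(path: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved bot may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


print(is_crawlable("/post/1"))   # thread pages stay crawlable
print(is_crawlable("/reply/1"))  # permalink pages are disallowed
```

Note that this only affects bots that honor robots.txt; scrapers that ignore it will still follow the /reply links.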
That said, you might want to consider changing the permalink design. I think it’s not useful (neither for your users nor for bots) to have such an architecture. It’s a good practice to have exactly one URL per document, and if users want to link to a certain post, there is no reason to require a new page load if it’s actually the same document.
So I would either use the "canonical" URL and add a fragment component (/post/1#reply-1, or what might make more sense: /threads/1#post-1), or (if you think it can be useful for your users) I would create a page that only contains the reply (with a link back to the full thread).

Is there any way to change the structure of urls in Shopify?

Is there any way to change the actual structure of a url in Shopify, rather than just the handle? For instance if I add a product it will be available at the following urls:
/products/some-product
/collections/collection-name/products/some-product
Is there any way I could change this to /collection-name/some-product, i.e. remove unnecessary words from the url?
I also realise you can add redirects, but this isn't what I want.
For the product page, you should not plan around the url that contains 'collections'. If you take a close look at the source code of a product page, you'll see that they all have a rel canonical tag pointing to
../products/some-product
even if the product is displayed within the url
../collections/collection-name/products/some-product
If the collections url doesn't have that canonical tag, add it; otherwise crawlers/robots would consider it duplicate content, because 2 different urls would show the same content.
Once you're ok with the first part, you'll only have
../products/some-product
In that case, you will never be able to change the
../products/
part. But this is a good thing, as it helps Shopify store owners keep their products well organized.
If for some reason you still need more control over urls, you can dig into Application Proxies.

How to direct multiple clean URL paths to a single page?

(Hi! This is my first time asking a question on Stack Overflow after years of finding answers here... Thanks!)
I have a dynamic page, and I'd like to have fixed URLs that point to different states of that page. So, for example: "www.mypage.co"(/index.php) is the base page, and it rearranges its content based on user choices. I'd then like to be able to point to "www.mypage.co/contentA" or "www.mypage.co/contentB" in order to automatically load the base page at "www.mypage.co" with the desired content.
At heart the problem is an aesthetic one. I know I could simply write www.mypage.co/index.html?state=contentA to reach the desired end, but I want to keep the URL simple and readable (ie, clean). I also, due to limitations in my hosting relationship, would most appreciate a solution that is server-independent (across LAM[PHP] stacks, at least), if possible.
Also, if I just have incorrect assumptions about how to implement clean URLs, I'd appreciate direction to a good, comprehensive explanation. I can't seem to find one...
You could use an .htaccess file to redirect all requests to one location and then determine from there what you want to return to the client. Look over the htaccess/dispatch system that Tonic uses.
If you use Apache, you can use mod_rewrite. I have a rule like this where multiple restful urls all go to the same page, using regex and moving parts of the old url into parameters for the new url:
RewriteRule ^/testapp/(name|number|rn|sid|unii|inchikey|formula)(/(startswith))?/?(.*) /testapp/ProxyServlet?objectHandle=Search&actionHandle=drillIn&searchtype=$1&searchterm=$4&startswith=$3 [NC,PT]
That particular regex accepts urls like
testapp/name
testapp/name/zuchini
testapp/name/startswith/zuchini
and forwards them to the same page.
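To see how the capture groups end up as query parameters, the same pattern can be tried out in Python. This is only an approximation of what mod_rewrite does (the function name rewrite is hypothetical, re.IGNORECASE stands in for the NC flag, and urlencode escapes values where Apache would substitute them as-is), but the group numbering is the same.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Same pattern as the RewriteRule above; IGNORECASE mirrors the [NC] flag.
PATTERN = re.compile(
    r"^/testapp/(name|number|rn|sid|unii|inchikey|formula)(/(startswith))?/?(.*)",
    re.IGNORECASE,
)


def rewrite(path: str) -> Optional[str]:
    """Map a clean URL path to the internal servlet URL, or None if no match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    params = {
        "objectHandle": "Search",
        "actionHandle": "drillIn",
        "searchtype": match.group(1),          # $1
        "searchterm": match.group(4),          # $4
        "startswith": match.group(3) or "",    # $3, empty when absent
    }
    return "/testapp/ProxyServlet?" + urlencode(params)


for path in ("/testapp/name", "/testapp/name/zuchini",
             "/testapp/name/startswith/zuchini"):
    print(path, "->", rewrite(path))
```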
I also use UrlRewriteFilter for Tomcat, but as you mentioned PHP, that doesn't seem that it would be useful.

Efficient way to add Canonical tags

If the value of the href for canonical tags is populated via a JavaScript function, would that affect search engine indexing (as search engines ignore JavaScript)?
I'm not sure I fully understand the question as you worded it. But here's my take:
Canonical tags are used to make sure that Google (et al) knows that several different URLs for the same page are, in fact, the same page.
This saves Google a lot of processing time, because it will treat those pages as a single page instead of trying to index every one of them. Also, your domain's search engine ranking will probably go up because Google doesn't think you're duplicating content.
For any page that could be duplicated because of parameters, you should include a canonical link of the page you want known as the original. So yes, it would help in your case. Though you cannot put a canonical link on someone else's domain pointing to your domain, so putting it on a partner's page would not have the intended consequences.
If you want more information, read up here: Google Webmaster Central: Specify Your Canonical

SEO and hard links with dynamic URLs

With ASP.NET MVC (or using HttpHandlers) you can dynamically generate URLs, like the one in this question, which includes the title.
What happens if the title changes (for example, editing it) and there's a link pointing to the page from another site, or Google's Pagerank was calculated for that URL?
I guess it's all lost right? (The link points to nowhere and the pagerank calculated is lost)
If so, is there a way to avoid it?
I use the same system as is in place here, everything after the number in the URL is not used in the db query, then I 301 redirect anything else to be the title.
In other words, if the title changed, then it would redirect to the correct place. I do it in PHP rather than htaccess as it's easier to manage more complex ideas.
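A minimal sketch of that approach, in Python rather than the PHP the answer mentions: look the article up by id only, rebuild the URL from the current title, and answer with a 301 whenever the requested path differs. The names resolve and slugify and the in-memory titles dict are assumptions for illustration; a real site would query its database instead.

```python
import re
from typing import Dict, Optional, Tuple


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def resolve(path: str, titles: Dict[int, str]) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a request path like /<id>/<slug>."""
    match = re.match(r"^/(\d+)(?:/([^/]*))?/?$", path)
    if match is None:
        return 404, None
    article_id = int(match.group(1))
    # Only the id is used for the lookup; the slug is ignored here.
    if article_id not in titles:
        return 404, None
    canonical = f"/{article_id}/{slugify(titles[article_id])}"
    if path != canonical:
        # Stale, missing, or made-up slug: redirect permanently.
        return 301, canonical
    return 200, None


titles = {89348: "Hello World"}  # stand-in for the articles table
print(resolve("/89348/hello-world", titles))
print(resolve("/89348/some-old-title", titles))
```

With this in place, old links keep working after a title edit, and a mismatch between id and slug resolves itself to a single URL.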
I think you're generally best off having the server send a permanent redirect to the new location, if possible.
That way any rank which is gained from third party links should, in theory, be transferred to the new location. I'm not convinced that this happens in practice, but it should.
The way Stack Overflow seems to be implemented, everything after the question number is superfluous as far as linking to the question goes. For instance:
SEO and hard links with dynamic URLs
links to this question, despite the fact that I just made up the 'question title' part out of thin air. So the link will not point to nowhere and the PageRank is not lost (though it may be split between the two URLs, depending on whether or not Google can canonicalize them into a single URL).
Have your app redirect the old URL via a 301 Redirect. This will tell Google to transfer the pagerank to the new URL.
If a document is moved to a different URL, the server should be configured to return an HTTP status code of 301 (Moved Permanently) for the old URL to tell the client where the document has been moved to. With Apache, this can be done using mod_rewrite and RewriteRule.
The best thing to help Google in this instance is to return a permanent redirect on the old URL to the new one.
I'm not an ASP.NET hacker - so I can't recommend the best way to implement this - but Googling the topic looks fairly productive :-)
Yes, all SEO is lost upon a url change: it forks to an entirely new record. The way to handle that is to leave a 301 redirect from the old title to the new one; some search engines (read: Google) are smart enough to pick that up.
EDIT: Fixed to 301 redirect!