Editing the head element on an old blog platform on a post-by-post basis. Is this impossible or am I missing something?

Sorry for being a total rookie.
I am trying to help my professor implement this advice:
Either as a courtesy to Forbes or a favor to yourself, you may want to include the rel="canonical" link element on your cross-posts. To do this, on the content you want to take the backseat in search engines, you add <link rel="canonical" href="URL" /> in the head of the page. The URL should be for the content you want to be favored by search engines. Otherwise, search engines see duplicate content, grow confused, and then get upset. You can read more about the canonical tag here: http://www.mattcutts.com/blog/canonical-link-tag/. Have a great day!
The problem is I am having trouble figuring out how to edit the head element on a post-by-post basis. We are currently on a super old blogging platform (Movable Type 3.2 from 2005), so maybe it is not possible. But I'd like to know if that is likely the reason, so I'm not missing out on a workaround.
If anyone could point me in the right direction, I would greatly appreciate it!

Without knowing much about your installation, I'll give a general description, and hopefully it matches what you see and helps.
In Movable Type, each blog has a "Design" section where you can see and edit the templates for the blog. On this page, the templates that are published once are listed under "Index Templates," and the templates published multiple times, once per entry, per category, etc., are listed under "Archive Templates."
There probably is an archive template called "Entry" (could be renamed) publishing to a path like category/sub-category/entry-basename.php. This is the main template that publishes each entry. Click on this to open the template editor.
This template could be an entire HTML document, or it might have "includes" that look like <MTInclude module=""> or <$mt:Include module=""$> (MT supports varying tag styles).
You may find there is an included module that contains the <head> content, or it might just be right in that template. To "follow" the includes and see those templates, there should be links on the side of the included templates.
Once you find the <head> content, you can add a canonical link tag like this:
<mt:IfArchiveType type="Individual">
<mt:If tag="EntryPermalink">
<link rel="canonical" href="<$mt:EntryPermalink$>" />
</mt:If>
</mt:IfArchiveType>
Depending on your needs, you might want to customize this to output a specific URL structure for other types of content, like category listings. The above will just take care of telling search engines the preferred URL for each entry.
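After republishing, it can help to confirm that the entry pages really contain the link. The following is only a minimal sketch in Python (not part of Movable Type): it pulls every rel="canonical" href out of an HTML string. The sample page and its URL are made up for illustration; in practice you would feed it the HTML of a published entry.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, so split before comparing.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical published entry page, for illustration only.
sample_page = """<html><head>
<title>Example entry</title>
<link rel="canonical" href="http://example.com/2013/05/example-entry.php" />
</head><body></body></html>"""

print(find_canonicals(sample_page))
```

An empty list means the template change did not reach that page (for example, because a different header module is included there).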

@Charlie: maybe I'm missing something, but your solution basically places a canonical link on each entry to… itself, which is a no-no for search engines (the link should point to another page that's considered the canonical one).
@user2359284 you need a way to define the canonical entry for those which need this link. As Shmuel suggested, either reuse an unused field or use a custom field plugin. Then you simply add that link in the header in the proper archive template that outputs your notes. Assuming that the Entry template includes the same header as other templates and that, say, you're using the Keywords field to set the URL, the following code should work (the mt:IfArchiveType test simply ensures it's output in the proper context, which you don't need if your Entry template has its own code for the header):
<mt:IfArchiveType type="Individual">
<link rel="canonical" href="<$mt:EntryKeywords$>" />
</mt:IfArchiveType>
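One caveat with reusing a free-text field: an entry whose Keywords field is empty, or still holds ordinary keywords, would publish a meaningless canonical link. Below is a rough Python sketch (outside Movable Type) of a check you could run over the field values before relying on the template. The function name and the sample values are hypothetical.

```python
from urllib.parse import urlsplit


def is_usable_canonical(value):
    """Return True if value looks like an absolute http(s) URL."""
    value = (value or "").strip()
    # Empty values and anything containing whitespace are treated as keywords.
    if not value or any(ch.isspace() for ch in value):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. broken IPv6 brackets.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical field values, for illustration only.
for candidate in ["http://www.example.com/original-post/", "", "seo, blogging"]:
    print(repr(candidate), is_usable_canonical(candidate))
```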

Related

Is rel=self the correct rel tag to use for forum permalinks?

I have been building a forum from scratch with my friends just for fun, and we're starting to see bots and scrapers go by. The problem we're having is that you can load a page /post/1 with four replies, and each reply includes a little permalink to itself /reply/1#reply-1. If I am on /post/1 and navigate to /reply/1, I'll end up right back where I started, just with the anchor to the reply. But! Scrapers have no idea this is the case, so they're opening every /post link and then following every /reply link, and it's causing performance issues, so I've been looking around SEO sites to try to fix it.
I've started using rel=canonical on the /reply page, to tell the bots they're all the same, but as far as I can tell that doesn't help me until the bot has already loaded the page, and thus I wind up with tons of traffic. Would it be correct to change my
<a href="/reply/1#reply-1">Permalink</a>
tags to
<a rel="self" href="/reply/1#reply-1">Permalink</a>
since they should be the same content? Or would this be misusing rel="self" and there's another, better rel tag I should be using instead?
The self link type is not defined for HTML (but for Atom), so it can’t be used in HTML5 documents.
The canonical link type is appropriate for your case (if you make sure that it always points to the correct page, in case the thread is paginated), but it doesn’t prevent bots from crawling the URLs.
If you want to prevent crawling, no link type will help (not even the nofollow link type, but it’s not appropriate for your case anyway). You’d have to use robots.txt, e.g.:
User-agent: *
Disallow: /reply/
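If you want to sanity-check rules like these before deploying them, Python's standard urllib.robotparser can evaluate them locally. This is just a sketch: the agent name "ExampleBot" is made up, and real crawlers may interpret robots.txt slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, parsed from a string instead of fetched over HTTP.
rules = """User-agent: *
Disallow: /reply/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if the rules permit the given agent to fetch the path."""
    return parser.can_fetch(agent, path)


print(allowed("/post/1"))   # thread page
print(allowed("/reply/1"))  # reply permalink
```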
That said, you might want to consider changing the permalink design. I think it’s not useful (neither for your users nor for bots) to have such an architecture. It’s a good practice to have exactly one URL per document, and if users want to link to a certain post, there is no reason to require a new page load if it’s actually the same document.
So I would either use the "canonical" URL and add a fragment component (/post/1#reply-1, or what might make more sense: /threads/1#post-1), or (if you think it can be useful for your users) I would create a page that only contains the reply (with a link back to the full thread).
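As a rough illustration of the first option, here is a small Python sketch that builds the permalink from the thread URL plus a fragment and renders a reply whose id matches that fragment. The function names and the markup are assumptions for the example, not taken from your forum.

```python
from html import escape


def reply_permalink(thread_id, reply_id):
    """Permalink that stays on the thread page and jumps to one reply."""
    return f"/post/{thread_id}#reply-{reply_id}"


def render_reply(thread_id, reply_id, text):
    """Render one reply with an id matching the fragment in its permalink."""
    href = reply_permalink(thread_id, reply_id)
    return (
        f'<div id="reply-{reply_id}">'
        f"<p>{escape(text)}</p>"
        f'<a href="{href}">Permalink</a>'
        "</div>"
    )


print(render_reply(1, 1, "First reply"))
```

Because the fragment never reaches the server, a bot following these links only ever requests /post/1.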

How to batch rename Tumblr tags?

I have tagged over 700 blog posts with tags containing hyphens, and these tags suddenly stopped working in 2011, because Tumblr decided (without any notice) to forbid hyphens in tags. (I guess hyphens are blocked now because spaces in tags, which are allowed, get changed to hyphens.) Unfortunately, Tumblr is not willing to globally rename all tags containing hyphens (although these tags are of no use anymore → 404).
Now I want to rename my tags myself.
I tried to do it with the "Mass Post Editor" (tumblr.com/mega-editor), but it's not possible to select posts by tag. I'd have to manually select post after post and look if a certain tag was used, and if so, delete it and add a new one instead. This would be a huge job (700 tagged posts, but more than 1000 in total).
So I thought that the Tumblr API might help me. I'm no programmer, but I'd be willing to dig into it, if I could get some help here as a starting point.
I think I need the following process:
select all posts that are tagged with x (= a tag containing hyphens)
tag all these posts with y (= a tag without hyphens)
delete the tag x on all these posts
I'd start this process for every affected tag manually.
I see that the method (or whatever you call it) /posts knows the request parameter tag:
Limits the response to posts with the specified tag
(I guess I can only hope that this works for tags containing hyphens, too.)
After that I'd need a way to add and remove tags from that result set. /post/edit doesn't say anything about tags. Did I miss something? Isn't it possible to add/remove tags with the API?
Have you an idea how I could "easily" rename my tags?
Is it possible with the API? Could you give me a starting point, tip etc. how I could manage to do it?
I don't know if this might be helpful, but I noticed that the search function is still able to find posts "tagged" with tags that contain hyphens.
Example: let's say I have the tag foo-bar. It is linked with /tagged/foo-bar (→ 404). I can find the posts with /search/foo-bar (but this is of course not ideal because it might also find posts that contain (in the body text) words similar/equal to the tag name).
I tried to encode the hyphen (/tagged/foo%2Dbar), but no luck.
just for the record, because this is a popular google search: i've done it! you can use it at http://dev.goose.im/tags/.
i used a combo of PHP and jquery, basing my jquery off of a previous tumblr api script i wrote a year or two ago, and used this tumblr php oauth script for the authentication. if anyone wants me to put up the source code, i'd be happy to.
If you aren't a programmer, how much is your time worth to you? As they say, time is money. Not only do you have to figure out how to use the API, but you also have to choose a language and learn to write in it. That's no small task. You could hire a freelancer for $50 for an hour's worth of work.
To answer your question, yes it is possible to do this with the API. It mentions "These parameters are used for /post, /post/edit and /post/reblog methods." and tags is mentioned as a string of comma separated words.
What you want to do is get a listing of every single blog post using the /posts method. You'll want to look at the "Request" section to figure out the criteria to pass to this URL. You want it to be as general as possible to get a complete listing of all your posts.
After you get a listing of posts, you'll want to iterate over it and modify the tags parameter provided in the response for each post. You'll want to use the id parameter along with /post/edit, which again takes tags as a string.
The simplest language you can use for this task is PHP. You'll want to look at the curl extension to make your requests. You'll want to read up on arrays as you'll be using them a lot. You'll also need to look at explode, implode, str_replace (for the dashes), and foreach for iterating over the result.
When you do this I would highly recommend you use break at the end of your foreach loop so it only affects one post at first. Testing it first will be important, as you don't want to accidentally erase your tags/posts. print and var_dump are good ways to help you debug the code. xdebug is a nice extension that allows you to step through the code line by line as it runs. Netbeans is an IDE that has good xdebug support.
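To make the logic concrete, here is a sketch of the rename step with the network calls left out. It is written in Python rather than PHP, purely for illustration. It assumes each post in the listing carries an id and a list of tags, and it only builds the id/tags parameters you would send to /post/edit; the OAuth request itself is not shown. The limit argument plays the role of the break mentioned above, and the sample posts are made up.

```python
def rename_tag(tags, old, new):
    """Return a copy of tags with old replaced by new, without duplicates."""
    result = []
    for tag in tags:
        replacement = new if tag == old else tag
        if replacement not in result:
            result.append(replacement)
    return result


def plan_edits(posts, old, new, limit=None):
    """Build edit parameters for posts tagged with old; stop after limit posts."""
    edits = []
    for post in posts:
        if old not in post["tags"]:
            continue
        if limit is not None and len(edits) >= limit:
            break
        # /post/edit is described as taking tags as a comma-separated string.
        edits.append({"id": post["id"], "tags": ",".join(rename_tag(post["tags"], old, new))})
    return edits


# Hypothetical listing, shaped like the assumption described above.
sample_posts = [
    {"id": 101, "tags": ["foo-bar", "news"]},
    {"id": 102, "tags": ["photos"]},
    {"id": 103, "tags": ["foo-bar", "foo bar"]},
]

# Dry run on a single post first, then inspect the result before going further.
print(plan_edits(sample_posts, "foo-bar", "foo bar", limit=1))
```

Printing the planned edits before sending anything makes it easy to spot a mistake while it still affects nothing.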
There's also a nice page here to get you started with PHP. You'll need to install PHP on your machine. You don't need to install a web server - for this PHP-CLI (command line) sapi is good enough.

Is googlebot indexing links in html comments?

I got a huge number of NOT FOUND links in Google Webmaster Tools. It looks like the links are coming from a section of code in the footer which was put in an HTML comment.
All pages have the NOARCHIVE tag, so it's probably not a cache issue.
Did this happen to anyone?
A quick Google (ironic, eh?) shows that whilst there is no official word on the subject, the general consensus (through anecdotal and experimental evidence) is that Google will process everything, including content in comment tags. This means that it will indeed index your links, even if they're in comment tags. However, it does not use the content as a source for keyword searches, i.e. anything in an HTML comment is not considered to be part of your page's visible content and is therefore not usable as part of search criteria.
HTML comments are designed to simply specify human-readable information about what your layout is doing, for example signifying where a particular include begins in a page outputted by a PHP script. You shouldn't be using HTML comments to remove large chunks of code in your site. I suggest that you remove the content.
If you don't want Google to follow a link, you can add rel="nofollow" to your hyperlink. You can also use robots.txt to specify directories or URL wildcards that you do not want Google to crawl.
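To find out which commented-out links a crawler might be picking up, you could scan your pages for href values inside HTML comments. The following Python sketch does this with the standard html.parser module; the regular expression is deliberately simple and the sample footer is made up.

```python
import re
from html.parser import HTMLParser

# Matches href="..." or href='...' inside commented-out markup.
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


class CommentLinkFinder(HTMLParser):
    """Collects href values that appear inside HTML comments."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_comment(self, data):
        self.links.extend(HREF_RE.findall(data))


def links_in_comments(html):
    """Return every href found inside an HTML comment in the given page."""
    finder = CommentLinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical footer, for illustration only.
sample = """<footer>
<!-- <a href="/old/page.html">Old</a> <a href='/old/other.html'>Other</a> -->
<a href="/contact.html">Contact</a>
</footer>"""

print(links_in_comments(sample))
```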
References:
http://en.wikipedia.org/wiki/Nofollow
http://en.wikipedia.org/wiki/Robots.txt
http://www.webmasterworld.com/forum3/4270.htm
http://www.codingforums.com/archive/index.php/t-71686.html
If you are talking about links in comments between tags, I don't think they are taking effect with Google Bots as stated there and there.
Regards.

Dynamic blocks in django templates

It is a question about django that has found absolutely no answer for me.
Let's suppose I have a site where I display two blocks in the sidebar :
A list of the last users who've logged in
A list of the last published blog articles
Let's say that these blocks are to be displayed on 80% of the website urls and presented using template files.
The data for these blocks is generated by code (obviously), but not by URL views.
So how can I do such a thing?
You might want to take a look at custom template tags.
Edit: more specifically, look at inclusion template tags.

Is the meta type="title" tag needed and what's the best format for the title tag?

I have seen some websites use the following tag:
<meta type="title" content="Title of the page" />
Is it needed when you have a <title>?
Also, what's the best formatting for a page title? Some ideas:
Page Description :: Company Name
Page Description - Company Name
Page Description <> Company Name
Company Name: Page Description
...
Does it matter to Google/Yahoo/etc? Do you include the company name or a general description of the site in the title on every page?
The <meta type="title"> tag has little rank or relevance to search engine crawlers. The good old <title> tag is far and away the most important element of a good web page.
As for the format of the title, I think there is good advice in this article at Standards Schmandards:
If the title contains the name of the site, the name of the site should be placed at the end of the title. This makes sure that multiple bookmarks from the same site are easy to browse through in the bookmarks folder and listeners to your page get the most important information first.
I would highly suggest that you do include the company name or site name at the end of each title because:
Consistency is always a good idea.
Newer browsers like Firefox 3 allow you to search your history and bookmarks by page titles, so users can easily get a view of all the pages they've visited on your site by simply typing in your company name or site name.
People that use screen readers will have no idea what website they are visiting if it isn't listed somewhere on the page.
However, I would not put a description of the site anywhere but on the home page because that would make the title unnecessarily long and would frustrate screen reader users because they would have to make an extra effort to skip that information on every page they visit.
If you do decide to put the company name in your title, keep these things in mind (also from Standards Schmandards):
The separator character should be distinct so that users understand that it is a separator. (I.e. it should not appear as part of text items in the title).
Prime candidates to use as separators are the vertical bar (|), the dot (·) and the dash (-).
Regardless of the character you pick, it is important to surround it with whitespace. This will aid both sighted visitors and listeners as it will distinguish the character from the title text.
Based on all the information herein, that essentially makes the second example in your question the obvious choice:
<title>
Page Description - Company Name
</title>
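If your titles are generated by code, the convention above (page description first, site name last, separator surrounded by whitespace) can be captured in one small helper. This is only a sketch in Python; the function name and defaults are my own choice.

```python
def format_title(page, site=None, separator="-"):
    """Build 'Page Description - Company Name' with the site name last."""
    # Collapse stray whitespace so the separator spacing stays consistent.
    page = " ".join(page.split())
    if not site:
        return page
    site = " ".join(site.split())
    return f"{page} {separator} {site}"


print(format_title("Page Description", "Company Name"))
```

Passing separator="|" or separator="·" gives the other separators mentioned in the quote.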
Search engines often ignore meta tags because in the past they were used for spamming purposes. The best tag for the title is precisely <title>.
As for the best formatting of the title, there is no single recipe; instead, try to make the title as descriptive as possible of the real contents of the page.
Meta Robots: This tag enjoys full support, but you only need it if you DO NOT want your pages indexed.
Meta Description: This tag enjoys much support, and it is well worth using.
Meta Keywords: This tag is only supported by some major crawlers and probably isn't worth the time to implement.
Meta Else: Any other meta tag you see is ignored by the major crawlers, though they may be used by specialized search engines.
<title> is what you want to use, because it stands out more than meta tags to most search engines.
My suggestion is to put the keywords that matter first, and avoid repeating the name of your business other than on the homepage, because this only serves to dilute the value of the title text.