Not just for spam anymore: NOFOLLOW for skepticism
September 3, 2008
If you’ve been following the blogosphere for a while, you know that there was a huge problem with comment spam around 2003 or so. The problem arose because most blogs and online guestbooks allow you to supply a link to your own blog when you add a comment. Spammers realized that they could use automated robots to add comments and therefore links to the sites they were being paid to promote.
For a while the comment areas of certain blogs were almost unusable, and many bloggers were overwhelmed with the number of comments they had to delete. Exacerbating this problem was the fact that it didn’t matter whether any person actually followed these spam links; it only mattered that the search engines found them. This is because search engines determine the relative importance of web sites by counting how many other sites link to them.
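To make the link-counting idea concrete, here is a toy sketch in Python. It is a deliberate simplification, not how any real search engine ranks pages: it just counts how many distinct sites link to each target. The site names and the `inbound_link_counts` function are made up for illustration.

```python
from collections import defaultdict


def inbound_link_counts(links):
    """Count how many distinct sites link to each target site.

    `links` is an iterable of (source_site, target_site) pairs.
    Links from a site to itself are ignored.
    """
    sources = defaultdict(set)
    for source, target in links:
        if source != target:
            sources[target].add(source)
    return {target: len(found) for target, found in sources.items()}


# Hypothetical data: two blogs each carry a spam comment linking to the
# promoted site, and the promoted site also links to itself.
links = [
    ("blog-a.example", "promoted.example"),
    ("blog-b.example", "promoted.example"),
    ("blog-a.example", "promoted.example"),   # repeat from the same blog
    ("promoted.example", "promoted.example"), # self-link, ignored
    ("blog-a.example", "blog-b.example"),
]
counts = inbound_link_counts(links)
```

In this toy model, every additional blog a spammer manages to comment on adds one to the promoted site's count, whether or not any reader ever clicks the link.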
Naturally it was the folks at Google (and Blogger) who responded to this problem by proposing a slight change to the way links are handled. Understanding how that works requires a little background.
REL attributes and the Semantic Web
HTML has long had an optional attribute on the <LINK> and <A> tags named REL. This attribute allows the author of the page to specify the RELationship between the current page and the target of the link. For example, you can specify that the link is to the next page in a multi-page document, which would look like this in the header of an HTML document:
<link rel="next" href="chapter2.html">
Using REL is a baby step toward something called the semantic web. The intent of the semantic web is that instead of just a large collection of documents, the web will contain more structured content that describes its own inter-relationships. Progress on this concept has been slow, in part because the current web standards are so successful as is. Another reason is that companies like Google have been very clever in building semi-semantic tools on top of the existing non-semantic web.
The REL=NOFOLLOW Attribute
What Google proposed, and everyone quickly adopted, was a REL attribute containing the word “NOFOLLOW” to mark hyperlinks that may have come from untrusted sources. Any link so marked would be ignored by a search engine in determining page rank or other relevance scores. This did take the wind out of the spammers’ sails for a bit, but unfortunately comment spam has not completely abated. This has actually led to a backlash in the blog community against using NOFOLLOW for spam-fighting purposes, which we will discuss later.
So what we are talking about here is very simple. Here’s what a normal hyperlink to another website looks like in HTML:
<a href="http://whatstheharm.net">What's The Harm?</a>
Here is what a “nofollow” version of the same hyperlink looks like:
<a rel="nofollow" href="http://whatstheharm.net">What's The Harm?</a>
As you can see it is very simple indeed, and even someone who edits their own HTML by hand can use this feature without much effort at all.
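If you have many existing pages, the same edit can be scripted. The following is a rough Python sketch, assuming double-quoted attributes and a hand-maintained set of domains you link to only to debunk. The domain `misinformation.example` and the function name `add_nofollow` are placeholders of mine, and a regular expression like this will not cope with every piece of HTML in the wild.

```python
import re
from urllib.parse import urlparse

# Opening <a ...> tags; group 1 is the raw attribute text.
A_TAG = re.compile(r'<a\b([^>]*)>', re.IGNORECASE)
# Double-quoted href and rel attributes only (an assumption of this sketch).
HREF = re.compile(r'(?<![\w-])href\s*=\s*"([^"]*)"', re.IGNORECASE)
REL = re.compile(r'(?<![\w-])rel\s*=\s*"([^"]*)"', re.IGNORECASE)


def add_nofollow(html, domains):
    """Add rel="nofollow" to every <a> tag whose href points at one of `domains`."""

    def fix(match):
        attrs = match.group(1)
        href = HREF.search(attrs)
        if href is None:
            return match.group(0)
        host = (urlparse(href.group(1)).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host not in domains:
            return match.group(0)
        rel = REL.search(attrs)
        if rel is None:
            # No rel attribute yet: add one at the front of the tag.
            return '<a rel="nofollow"' + attrs + ">"
        tokens = rel.group(1).split()
        if any(token.lower() == "nofollow" for token in tokens):
            return match.group(0)
        # Keep existing rel values and append nofollow.
        merged = 'rel="' + " ".join(tokens + ["nofollow"]) + '"'
        return "<a" + attrs[:rel.start()] + merged + attrs[rel.end():] + ">"

    return A_TAG.sub(fix, html)


# Hypothetical list of sites we link to only in order to debunk them.
DEBUNKED = {"misinformation.example"}
before = '<a href="http://misinformation.example/page">claim</a>'
after = add_nofollow(before, DEBUNKED)
```

Links to any domain not in the set are left exactly as they were, so links to fellow skeptics keep their full value.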
All good blog software now marks comment hyperlinks with “NOFOLLOW” automatically by default. Some blogs allow you to configure this behavior for various purposes, such as letting trusted friends post links that are not marked this way. More on this later.
What does this have to do with skepticism?
As Louis Brandeis famously said, “sunlight is the best disinfectant”. Linking directly to misinformation on the web and explaining why it is wrong is like skeptical sunlight. But because Google and the other search engines use hyperlinks to determine the importance of web pages, many skeptics are fearful of doing so because it helps boost the visibility of misinformation on the web. (Granted, each hyperlink is a tiny part of this, but every little bit helps.)
I have seen some skeptics try to deal with this problem by using URL shortening services such as TinyURL to indirectly link to the site they are debunking. I hate to be the one to break it to you, but this does not work! Google still follows these redirects and still includes them in page rank calculations.
I think the correct way to proceed is to continue providing skeptical sunlight through direct linking. For one thing this demonstrates that we are not afraid of those who we oppose. In general they don’t link back to us, and that demonstrates something to casual readers who take note of it. It also potentially allows automated tools to discover the relationships between skeptical writing and the misinformation we are writing about (i.e. the semantic web can notice skepticism specifically).
But while we are doing this we must be constantly vigilant about the page rank issue. Page ranking in Google is vitally important to those who are pushing misinformation on the web. It is how they attract new customers to their vile schemes, whether they be psychics or astrologers or homeopaths or something else. Even if we as skeptics are providing only a minuscule fraction of a misinformation peddler’s page rank, that fraction is too much.
How big is this problem?
I’ve been taken to task occasionally for not providing evidence of what I’m talking about. I will point out that when I mentioned this issue in my Skepticality podcast interview last week the hosts, who are long-time skeptics, were completely unaware of the issue.
But just to follow up a bit, I did some searching. I picked a misinformation site that everyone is familiar with: The Discovery Institute. (Yes, believe me that link is REL=NOFOLLOW. Better yet, don’t believe me. Click View Source and verify for yourself.) I set out to find out who in the skeptical community links to that page.
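Checking View Source by hand gets old quickly. As a rough sketch of automating it, the Python below uses the standard library's `html.parser` to sort a page's links to one domain into those with and without nofollow. The class name `LinkAudit` and the sample markup are my own inventions for illustration; fetching the page source is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkAudit(HTMLParser):
    """Collect links to `domain`, split by whether they carry rel="nofollow"."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.followed = []    # hrefs that pass link credit to the domain
        self.nofollowed = []  # hrefs marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        host = (urlparse(href).hostname or "").lower()
        # Accept the bare domain and any subdomain such as www.
        if host != self.domain and not host.endswith("." + self.domain):
            return
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Made-up page source with one nofollow link, one plain link, one unrelated link.
page = (
    '<p><a REL="NoFollow" href="http://www.discovery.org/">one</a> '
    '<a href="http://www.discovery.org/a">two</a> '
    '<a href="http://unrelated.example/">three</a></p>'
)
audit = LinkAudit("discovery.org")
audit.feed(page)
```

Anything that ends up in `audit.followed` is a link that still needs the attribute.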
I used the advanced search “LINK:” keyword in Google, which is somewhat tedious because it does not play well with others. I.e. if you specify “link:www.discovery.org” in your search, you can’t use any other keywords to narrow it down. In any case, just in the first 100 results from Google I found the following articles that are trying to have a skeptical or scientific viewpoint, but which link to that horrible site without specifying nofollow:
- Despite Overwhelming Evidence, Creationists Cling to Unreality…
- ‘Are Darwin’s Theories Fact or Faith Issues?’ by PZ Myers …
- Crosscut Seattle – The scientific dark age of George Bush
- Nature of Science. An ID free Zone.
- North Texas Skeptics News for 30 April 2002
One of those links is on Richard Dawkins’ web site.
Now some of the above are on news sites and other places where minute control over linking is often not feasible, or is controlled by an editor. However, I also found the following skeptical or science blogs that link to Discovery Institute in their blogroll, so every single page on their site is giving discovery.org a little tiny boost in Google:
And yes, I am trying to embarrass you. You are not helping the skeptical cause, folks.
So, out of the top 100 links to discovery.org according to Google, eight were from skeptical/science articles that disapprove of the site. Another 27 of the top 100 were from Discovery’s own web sites linking back to themselves. Set those aside, and eight of the remaining 73 links, or about 11%, come from skeptics. So over 10% of the link strength that Discovery has in this sample is due to skeptics.
Now you might argue, “Oh, we are just a small fraction of the sites that link to Discovery! Thousands of Christian sites link to it.” If a mad scientist asked you for fifty cents to help him fund his machine to destroy the earth, would you give it to him? A journey of a thousand miles begins with one step.
I would also point out that Discovery Institute is perhaps not the best example, because it is a gigantic and well known site. Many of the other sites skeptics link to are much lesser known. Some are positively obscure. In some of their cases, links from skeptical sites ridiculing them may be a significant proportion of their page rank. We can cut off their oxygen if we choose to.
NOFOLLOW to the rescue
I’m sure most readers are ahead of me at this point, but just to summarize: I believe that skeptics should, as a matter of policy, always use REL=NOFOLLOW on all hyperlinks to websites which we are opposing or debunking. This is important both ethically (we should not be helping them spread their filth) and strategically (it helps the skeptic movement combat their misinformation).
If you edit your website manually, you can use the HTML tags I listed above to mark each hyperlink as appropriate. If you have a page or pages on your site that primarily consist of links to misinformation or woo-woo websites, then you might want to look into the META NOFOLLOW tag that you can put into the header of a page. This tells all crawlers like Google to ignore all of the links on that page.
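The page-level tag is normally written as `<meta name="robots" content="nofollow">` inside the page's `<head>`. To confirm that a page of yours actually carries it, a small Python sketch along these lines could help. The function name `page_is_nofollow` is hypothetical and the check is intentionally simple.

```python
from html.parser import HTMLParser


class _RobotsMeta(HTMLParser):
    """Gather the directives from any <meta name="robots"> tags on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated, e.g. "noindex, nofollow".
        self.directives.update(d.strip().lower() for d in content.split(","))


def page_is_nofollow(html):
    """Return True if the page asks crawlers not to follow any of its links."""
    parser = _RobotsMeta()
    parser.feed(html)
    return "nofollow" in parser.directives


# Made-up page for illustration.
sample = (
    '<html><head><meta name="robots" content="nofollow"></head>'
    "<body>links to woo-woo sites go here</body></html>"
)
result = page_is_nofollow(sample)
```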
If you use higher level software to manage this, read on.
If you run a blog, then you should make sure that the nofollow feature for comment links is enabled. It usually is by default. Otherwise commenters debating you in your own comments will be able to link to their own pseudoscience or misinformation, and you will be helping them advertise their nonsense.
Now, I realize the blogosphere backlash against NOFOLLOW is largely about this very reciprocity of linkage. Bloggers argue that comment links are part of the social contract of the blogosphere. Specifically, allowing people who comment on your blog to derive some benefit in the form of “google juice” from comment links is how you reward them for commenting in the first place. I don’t disagree with that argument. However, as skeptics we need to weigh the relative value of reciprocity with other skeptical bloggers versus the problems posed by linking to the very websites we are opposing.
I think the choice is clear for skeptics. If you wish to explicitly link back to good skeptical commenters, there is always your blogroll and the main content of your blog posts. Some blogs such as Pharyngula and Skepchick make a point of highlighting excellent commenters periodically; that’s a good opportunity to offer some reciprocity as well. As long as you allow open commenting and don’t censor woo-woos when they comment, ethically you must leave the NOFOLLOW option on your blog turned on. Otherwise you are helping the bad guys, whether you want to think about it or not.
For those wanting to tweak this behavior, there are a number of plugins available. You can read about the nofollow options for WordPress here and this article covers nofollow plugins for other platforms as well.
A great deal of skeptical effort gets expended on the various forums. Many search engines are particularly good at indexing forums, so links posted on them get considered very quickly. Some of these sites (such as randi.org) have been around long enough to have a very high page rank themselves, which means links posted there have particular strength.
Note that because vBulletin auto-links URLs by default, you must leave off the “http://” part of the URL when doing this. The NFURL tag will put this back for you to make a proper link when your message posts. Use the preview feature if you are unsure (click Go Advanced and then click Preview).
I believe this tag is a specific modification on the JREF forum; it may not be available on other instances of vBulletin. It does not appear to be available on UK Skeptics.
phpBB (Skeptic Society Forum): There are some readily available mods to make this the default for all links in phpBB. Lo and behold The Skeptics Society has this turned on in their installation. Major props to you folks for being ahead of the game on this one.
Invision Power Board (Skepticality): There does not appear to be a way for a user to do this in their posts by default. The sysop of an IP.Board can make this happen for all user-defined links through some configuration tips described here. Of course, because it is system-wide this hurts skeptics as well as woo-woos who post links in your forum, just like turning it off in blog comments.
SMF: Simple Machines Forum (Skeptics Guide to the Universe Forum): By default I see no way to create nofollow links here; however, there does appear to be an optional NoFollow BBCode plugin that can be added to allow posters to do this manually, just like on the JREF forum. There is also another optional NoFollow All Links plugin that will automatically do this to any link posted in the forum.
Snitz Forum (Skeptic Friends Network): I see no way to do a nofollow link here, and the forum is not configured to do it by default.
If you know of additional info I don’t have here, please add it in the comments below. Easily installed plugins to solve this problem are of particular value.
Conclusion and further reading
I think this is an important issue for skeptical webmasters and bloggers, and we need to take action about it. Otherwise we are undermining our own movement, one link at a time.
For more about other semantic things you can do in your HTML, see the microformats initiative. Particularly relevant here is their summary of the use of REL. Some of those other REL usages may be relevant to skepticism as well. I will write about that at a later date.