URL Canonicalisation (or URL Canonicalization, as Americans spell it) is something I like to talk about a lot. It's a complicated name for a simple concept. It's fundamental to Search Engine Optimisation, and it's also a great way to test how well your web developer (or, more importantly, a potential web developer) understands SEO! Read this blog post and then ask them to explain the concept to you, over the phone or face to face. My guess is that 9 times out of 10 they'll have no idea what you're talking about, so you'll already know more about SEO than they do!
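What is 'URL canonicalisation'?
The problem behind URL canonicalisation is that the same page on a website can usually be displayed by entering several different addresses into your web browser's address bar. Canonicalisation means settling on one of those addresses as the preferred ('canonical') one.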
So, enter any of the following into a web browser:
and with practically all web hosting packages they will all show the same webpage.
To the human eye the results all look the same. Google, however, sees each of the above web addresses as a different thing.
- 1 and 3 appear to be different pages on the same website.
- 1 and 2 (& 3 and 4) appear to be different websites altogether.
Google WILL see that the content on the different web addresses is identical, however, and Google tries hard to avoid showing duplicate content in its results! As a result it will show one website in preference to another, e.g. www.websanity.co.uk in preference to websanity.co.uk. Google decides which, but if you sign up for Google Webmaster Tools you can ask it to prefer one over the other (we usually prefer the www. version).
No big deal so far. HOWEVER, as you build links to your website from other websites, you will probably end up with people linking to both web address versions and, in the case of the home page, to either / or index.php. That means links can end up going to four different places. This matters because it splits the authority of all those hard-earned external links to your site across four different page variations (some of which aren't even shown in the results). What a waste! It can be made worse by sloppy website designs that refer loosely to different versions of the page name within the site, leaking credibility from within your own website.
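To make the "splitting" concrete, here is a small illustrative tally in Python. The link counts are made up purely for the example, and the http:// prefix is an assumption; only the four home page variations come from the paragraph above.

```python
from collections import Counter

# Hypothetical inbound links: the counts are invented for illustration only.
inbound_links = (
    ["http://www.websanity.co.uk/"] * 10
    + ["http://www.websanity.co.uk/index.php"] * 5
    + ["http://websanity.co.uk/"] * 4
    + ["http://websanity.co.uk/index.php"] * 1
)

# Without canonicalisation, each address is credited separately.
links_per_address = Counter(inbound_links)
total_links = sum(links_per_address.values())
best_single_address = max(links_per_address.values())

for address, count in links_per_address.most_common():
    print(f"{count:>3}  {address}")
print(f"Total links earned: {total_links}")
print(f"Links credited to the strongest single address: {best_single_address}")
```

In this made-up case the site has earned 20 links, but the strongest single address is only credited with 10 of them.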
Aside: Some web content management systems/hosting packages will even show the same page for:
Believe it or not, Google sees even these as separate pages, possibly watering down all your hard-earned link authority EVEN FURTHER!
How do I fix URL Canonicalisation issues?
Basically, your website needs to inspect the web address requested of it (any of the versions 1 to 4 above) and REDIRECT it to one preferred version.
i.e. If I type in any of:
The website changes the address to:
This brings all the authority for ANY of the web address versions 1-6 to one single place – no waste!
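The decision your website has to make can be sketched in a few lines of Python. This is only an illustration of the logic, not the server configuration itself: the function names are hypothetical, the http:// prefix is an assumption, and the list of home page file names is just the handful of common ones mentioned in this post, so yours may differ.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: we prefer the www. version, and these are
# the only default page names we care about.
PREFERRED_HOST = "www.websanity.co.uk"
BARE_HOST = "websanity.co.uk"
INDEX_NAMES = {"index.php", "index.htm", "index.html", "index.asp", "index.aspx"}


def canonical_url(url: str) -> str:
    """Return the single preferred address for a requested address."""
    parts = urlsplit(url)

    # Host names are not case sensitive, so compare in lower case.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # An empty path and "/" both mean the home page.
    path = parts.path or "/"

    # Drop a trailing default page name, e.g. /index.php becomes /.
    folder, _, last_segment = path.rpartition("/")
    if last_segment.lower() in INDEX_NAMES:
        path = folder + "/"

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def redirect_target(url: str):
    """Return the address to redirect to, or None if no redirect is needed."""
    target = canonical_url(url)
    return None if target == url else target


for requested in (
    "http://www.websanity.co.uk/",
    "http://www.websanity.co.uk/index.php",
    "http://websanity.co.uk/",
    "http://websanity.co.uk/index.php",
):
    print(requested, "->", redirect_target(requested) or "already canonical")
```

On a real site the redirect would normally be issued by the web server as a permanent (301) redirect, so that search engines treat the preferred address as the page's one true home.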
Try entering some of the addresses 1-6 above in a browser and see what happens. Now try similar things with your own web address (although your home page might be called index.htm, index.html, index.asp, index.aspx or something else completely!).
I'll show you how to do this redirection (as it is called) in a future post. Unfortunately the solution IS somewhat technical AND varies according to the hosting platform you use (Microsoft or Apache), but at least for now you understand what URL Canonicalisation is and whether you have a problem.