If you haven’t heard of canonical uniform resource locators, you’re not alone. Despite sounding like a religious clothing supplier, this relatively new system of webpage identification has become crucial in search engine optimisation. And since SEO influences whether a certain page achieves the much-vaunted first-page position in Google or Bing searches, canonical URLs are increasingly important to a site’s profile and performance.
What is a canonical URL?
It’s essentially a way of designating a particular webpage address as the preferred or master copy, so search engines know that multiple addresses all lead to the same place. Pages can be presented in various forms, such as a website homepage with /home after the main address. Homepages generally have no need of an extra suffix, but some web designers include it to simplify the process of telling one page from another in CMS portals or traffic analysis.
The presence of a trailing slash also confuses matters – is co.uk/ any different to co.uk? Some people believe a trailing slash indicates a directory as opposed to a file, though its inclusion is often based on aesthetic rather than practical reasons. It’s a bit like the old debate about whether to put brackets around landline area codes when writing down a phone number. Sometimes, web hosting or development software eliminates this choice entirely, by making autonomous decisions about subpage or URL designations. Other times, third-party websites come up with their own interpretation of a webpage address.
Homepages are especially prone to ending up with multiple designations – with and without /home or /index, which can themselves be bookended with an optional / symbol. The choice between http or https prefixes further muddies the waters. Without identifying a preferred address, platforms ranging from search engines to aggregators may publish different versions of the same homepage address. That dilutes the presence of each iteration, whereas one dominant URL would be more authoritative – and therefore achieve a higher ranking in SEO results.
This is the process of canonicalization – choosing one address and using it in every instance. And while it’s important to choose carefully between secured and unsecured links, the nominated address should ideally be as short and easy to read as possible.
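As a rough illustration of choosing one address and using it everywhere, the Python sketch below collapses the variants described above (http or https, trailing slashes, /home or /index on a homepage) into a single form. The function name canonicalize, the HOMEPAGE_ALIASES set and the example.com addresses are all hypothetical, and the rules are assumptions for this example rather than anything a search engine prescribes.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of homepage suffixes to treat as aliases of the root address.
HOMEPAGE_ALIASES = {"home", "index", "index.html", "index.php"}


def canonicalize(url, prefer_https=True):
    """Collapse common variants of an address into one preferred form."""
    parts = urlsplit(url)
    scheme = "https" if prefer_https else parts.scheme.lower()
    host = parts.netloc.lower()
    # Dropping empty segments removes leading, trailing and doubled slashes.
    segments = [s for s in parts.path.split("/") if s]
    # Only a lone /home or /index is treated as the homepage itself.
    if len(segments) == 1 and segments[0].lower() in HOMEPAGE_ALIASES:
        segments = []
    path = "/" + "/".join(segments)
    # The query string is kept; any #fragment is discarded.
    return urlunsplit((scheme, host, path, parts.query, ""))


for variant in ("http://www.example.com/home/", "https://WWW.Example.com"):
    print(canonicalize(variant))  # both print https://www.example.com/
```

Whether the preferred form keeps or drops the trailing slash is a matter of taste; what matters is that every variant resolves to the same single choice.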
Why were canonical URLs created?
The origins of canonical URLs date back to the popularity of 301 page redirects. These were a clunky yet well-used way of directing traffic to a particular designation, often bouncing people from one webpage to another and consequently increasing page loading times. Since a page can be reached through multiple versions of its address, and web crawlers regard every address as unique, the confusion was evident.
In the late Noughties, Yahoo and Google began working on a tag that would eliminate duplicate content and simplify the process of identifying page addresses. They proposed creating an element for the head section of a page’s HTML, alongside the meta description tag. In code, it looked like this:
<link rel="canonical" href="http://www.uk2.net" />
This snippet of HTML tells the search engines that the UK2 homepage is located at a precise address, and any ranking results or performance metrics should be attributed to that location. Each page on a site can be given its own canonical snippet, provided it forms part of the same root domain (in this case uk2.net). The search engines then know which version of the address to list in search results.
The tag was introduced in 2009 to much fanfare, and the success of Google and Yahoo’s canonical experiment is reflected in its prevalence nowadays. Although the tag expresses a preference rather than an instruction, search engines have been known to adopt a canonical location even if it returns a 404 error. The rather lengthy term canonicalization describes the process of selecting an optimal URL from the available options, and setting it as the rel=canonical default.
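For sites that build their pages from templates, the snippet can be generated rather than typed by hand. The Python sketch below is one way of doing so; the helper name canonical_link_tag is hypothetical, and escaping the address is a precaution assumed here for URLs that contain characters such as an ampersand.

```python
from html import escape


def canonical_link_tag(url):
    """Return a <link rel="canonical"> element for the given absolute URL."""
    # escape() with quote=True makes & and " safe inside the href attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


print(canonical_link_tag("http://www.uk2.net"))
# prints: <link rel="canonical" href="http://www.uk2.net" />
```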
How do these URLs benefit website owners?
There are several benefits to using this system of page identification:
- Eliminating duplicate results. Imagine if search engines ranked websites with points. If there are four addresses leading to the same page, those points would be divided across four pages, diluting each one’s total score. Nominating a canonical address means all the points go to one place, helping it to score more highly in results pages.
- Clarification. If search engines aren’t told which page is canonical, they guess. And that can lead to a less appropriate URL being published – potentially one the site administrators aren’t even aware of. Equally, there might not be any need for someone to type a link ending in /aboutus.php/ if the page displays perfectly well without those thirteen additional characters.
- More traffic. Lengthier addresses are less enticing to audiences, and having to enter (or even click on) lengthy URLs is widely thought to reduce traffic volumes. When it comes to manually entering website addresses, shorter is preferable; when it comes to brand recall, simpler addresses perform more strongly.
- Analytics results. It’s easier to track and investigate metrics like traffic volumes if the results are all attached to a single page. Platforms like Google Analytics are hugely powerful for guiding edits to web content and improving SEO performance, but it’s important to ensure they’re being supplied with accurate data.
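The points analogy in the first benefit can be worked through in a few lines of Python. The scores, the example.com addresses and the consolidate function are invented purely for illustration; real search engines do not publish a simple points system like this.

```python
from collections import defaultdict


def consolidate(scores, canonical_of):
    """Add up each address's points under its nominated canonical address."""
    totals = defaultdict(int)
    for url, points in scores.items():
        # Addresses with no nominated canonical simply keep their own points.
        totals[canonical_of.get(url, url)] += points
    return dict(totals)


# Four addresses leading to the same page, each holding a quarter of the points.
scores = {
    "http://www.example.com": 25,
    "http://www.example.com/": 25,
    "https://www.example.com/home": 25,
    "https://www.example.com/": 25,
}
preferred = "https://www.example.com/"
canonical_of = {url: preferred for url in scores}

print(consolidate(scores, canonical_of))  # {'https://www.example.com/': 100}
```

The same grouping applies to analytics: visits recorded against four variants only add up to a useful figure once they are attributed to one address.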
How do I incorporate canonicals into my site?
That depends to a large extent on how your site is programmed. For instance, the popular Yoast SEO plugin for WordPress has a canonical URL box in its advanced page settings menu, alongside meta robots index and breadcrumbs title fields. This will inform the search engines about a page’s optimal designation.
It’s also possible to set canonicals across sites, which is useful when content is being republished elsewhere by permission: the republished page names the original address as its canonical. The original content provider is credited with any offsite page views, while the secondary site provides extra content to its audiences. However, this should only be attempted by people familiar with the canonicalization process, since multiple canonical links can behave unpredictably.
Finally, be aware that Google is becoming particularly keen on secured websites. It’s advisable for canonical links to point to HTTPS pages if content anywhere else on the site involves sensitive data or encrypted connections.
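Before relying on a cross-site canonical, it can help to check what a page actually declares. The Python sketch below uses the standard library’s HTML parser to list every canonical link in a page, flag pages that declare more than one distinct address, and note when a canonical points to a different host. The names CanonicalCollector and audit_canonicals, and the example addresses, are hypothetical; this is a minimal check, not a full SEO audit.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.hrefs.append(attr["href"])


def audit_canonicals(html, page_url):
    """Report the canonicals a page declares and two common warning signs."""
    collector = CanonicalCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).netloc.lower()
    return {
        "canonicals": collector.hrefs,
        # More than one distinct address leaves search engines guessing.
        "conflicting": len(set(collector.hrefs)) > 1,
        # Relative hrefs have no host, so they count as same-site.
        "cross_domain": any(
            urlsplit(h).netloc.lower() not in ("", page_host)
            for h in collector.hrefs
        ),
    }


republished = (
    '<html><head><link rel="canonical" '
    'href="https://original.example/post" /></head></html>'
)
print(audit_canonicals(republished, "https://republisher.example/post"))
```

A cross_domain result is expected for republished content; a conflicting result usually means a template or plugin has added a second tag.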