Why I Dig Clean HTML: And Why I Think It Benefits SEO
When we think of Search Engine Optimisation we more often than not think about the keywords, the content, the link building and the rankings. Yes, the content is ultimately the single most important factor (content is King as they say) but there are absolutely other contributing factors for SEO. One of which is your website’s code.
Prior to Attacat I had nearly zero experience with technical SEO, or coding for that matter, and could only just about manage the right-click option to see a website’s source code. Beyond that I wouldn’t have had a clue what to do or what I was looking at. My experience lies with that almighty buzzword ‘content’, a word that gets tutted at on a daily basis by our Managing Director Tim (seriously, he hates it, don’t ask). Nearly a year on, and after various Technical SEO Audits, site investigations and site launches, I’ve finally got to grips (kind of) with the mumbo jumbo that is website source code.
Now don’t get me wrong, I’m no developer, and our resident code geek Chelsey can verify that, but I can definitely tell the difference between SEO-friendly and non-SEO-friendly code. Enough about my life story and more to the point, here are a few basic tips on how clean HTML can benefit SEO:
As I said above, the content on the page will always be the most important factor for SEO; content that is unique, relevant, useful, insightful, fresh and generally “good” is always going to drive more traffic and attract more users than having clean code. Sometimes though it’s worth considering that it isn’t just users that use and view the web page, but those pesky search engine spiders too. Sometimes they’re not so great at deciding whether content is “good” content, especially if it’s very new or really old. What they are good at however is determining if the page is structured in an appropriate and likable manner.
Did you know that different Content Management Systems (CMS) will insert various bits of code into your pages as they see fit? It’s actually very common for them to add unnecessary code that doesn’t serve any purpose and just ends up making your source code look very “messy”. If you’re using something like WordPress or Blogspot, it’s always worth looking over your HTML before publishing to make sure it’s nice and tidy. Don’t panic: WordPress have a load of plugins these days that can help you clean up and optimise your HTML, so help is at hand.
You’ve most probably heard of H1 and H2 tags and why they’re important for SEO (see my previous post on How to SEO your Blog Post for more details), but despite popular belief it doesn’t actually go much deeper than that. Semantic markup in HTML means using tags such as <strong> and <em> to emphasise important text, but you’ll get the same effect by using styling tags such as <b> and <i> (bold and italics). Matt Cutts himself has actually explained that there is no real reason to fret about such things; they don’t have enough of an impact to worry about.
Basically, beyond H1s and H2s, don’t place too much focus on semantic or text styling markup for SEO, as it’s really not going to move you from the sixth page to the first in the SERPs!
That doesn’t mean don’t follow the rules!
All the standard page optimisation rules do and will always apply:
- Ensure every page has relevant and optimised meta data
- Internal links
- Appropriate keyword placement
- Title tags and alt text for images
- Give the search engine spider a helping hand by providing a Sitemap; this allows you to specify a structure and key pages on your website.
These basic code/content optimisation tips should be applied as they’re basically expected as standard by search engines these days. If you don’t meet or exceed that standard, you’re putting yourself at a disadvantage straight away!
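If you fancy checking a few of these basics yourself, here is a minimal Python sketch using only the standard library. It is an illustration rather than a proper audit tool: the names `OnPageAudit` and `audit_page` are made up for this example, and it only looks for a title, a meta description and images with no alt attribute at all.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Records a few on-page basics while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.meta_description = attributes.get("content") or ""
        elif tag == "img" and "alt" not in attributes:
            # Only images with no alt attribute at all are counted;
            # an empty alt="" is left alone on purpose.
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_page(html):
    """Return a small dict summarising the checks above for one page."""
    parser = OnPageAudit()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "images_missing_alt": parser.images_missing_alt,
    }
```

Calling `audit_page(html)` on a page’s source gives you a quick summary to eyeball. An empty title, a `None` meta description or a non-zero missing alt count are the things to chase up.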
High signal to noise ratio
In plain text: more content than code, please. When a website’s source code is particularly code-heavy, this is often referred to as “code bloat” (a phrase I totally and obviously love, by the way). Iframes and Flash are great for making your website look pretty and cool and, yes, they can help to improve the user’s experience, but they’re a nightmare for search engines and SEO. Search engine spiders are getting better at reading these things but they’re not quite where they need to be just yet.
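To put a rough number on “more content than code”, here is a small Python sketch that compares the visible text on a page with the total size of its source. The measure itself is my own assumption for illustration (visible characters divided by total characters); it is not a figure any search engine publishes, and `text_to_code_ratio` is a made-up helper name.

```python
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Collects the text a reader would see, skipping script and style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def text_to_code_ratio(html):
    """Return visible-text characters divided by total source characters."""
    if not html:
        return 0.0
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so indentation does not count as content.
    visible = " ".join("".join(collector.chunks).split())
    return len(visible) / len(html)
```

A page that is mostly markup will score close to 0 and a page that is mostly words will score closer to 1. Treat it as a way to compare two versions of the same page rather than as a target to hit.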
Token example of bad code
So, who can see what’s wrong here? Just above I suggested utilising CSS for styling; keeping it in a separate file would eliminate the majority of the “styles” in the code above, allowing the spider to get to the juicy content quicker. There is more wrong, but I’ll leave you to ponder the rest and maybe give me a shout in the comments if you figure it out.
The more sloppy code you have on your site, the more you’re making search engine spiders sift through to find the valuable content, the content you actually want them to find. In fact, if there’s too much uninterpretable code, a search engine spider will get bored and leave the page without finishing the crawl. It’s definitely worth bearing in mind when you’ve got 3 or 4 iframes on the page.
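As a rough illustration of the kind of inline styling meant here, this Python sketch counts inline `style` attributes and old-school presentational tags in a snippet of markup. Everything in it is hypothetical: `SAMPLE` is a made-up snippet (not the example referred to above), and `count_styling_noise` and the choice of `<font>` and `<center>` as “presentational” tags are my own assumptions.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this illustration only.
SAMPLE = """
<div style="margin:0;padding:0;background-color:#ffffff">
  <font face="Arial" size="2" color="#333333">
    <p style="font-family:Arial;font-size:12px;color:#333333">Welcome to our site</p>
  </font>
</div>
"""

# Tags that exist purely for styling and are better handled in a stylesheet.
PRESENTATIONAL_TAGS = {"font", "center"}


class StylingNoiseCounter(HTMLParser):
    """Counts inline style attributes and presentational tags."""

    def __init__(self):
        super().__init__()
        self.inline_styles = 0
        self.presentational_tags = 0

    def handle_starttag(self, tag, attrs):
        if any(name == "style" for name, _ in attrs):
            self.inline_styles += 1
        if tag in PRESENTATIONAL_TAGS:
            self.presentational_tags += 1


def count_styling_noise(html):
    """Return (inline style attributes, presentational tags) found in html."""
    counter = StylingNoiseCounter()
    counter.feed(html)
    counter.close()
    return counter.inline_styles, counter.presentational_tags
```

The same content written as `<p class="intro">Welcome to our site</p>`, with the styling moved to a separate CSS file, would report no inline styles and no presentational tags.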
Now fly away my pretties and tidy up that code, nobody wants code bloat!