Google Hummingbird is not just another of the Google algorithm updates that have been appearing regularly over the past few years. It is the most important algorithm change introduced by Google since 2001. So says Amit Singhal, Google’s search chief.
Fundamentally, Hummingbird uses semantics to establish what searchers are really looking for with the search terms they use. With this knowledge, Google believes it can provide web pages that are more relevant to their needs than the previous search algorithm could.
Hummingbird is not just another algorithm update, such as Panda and Penguin, but a complete search algorithm replacement. Before we discuss Hummingbird, and the various recent Google algorithm updates, let’s first understand how Google works to provide us with the search engine results pages we find when we use the search engine for information.
1. How Google Works
There are over 60 trillion pages on the World Wide Web, and this number is increasing rapidly. Even with some of the best programmers in the world, Google is facing a hard task to maintain the quality of the information it offers to its customers: those using the search engine for information.
Once you know the fundamentals of how Google goes about that, it will be easier for you to understand the need for the Google Hummingbird algorithm and the various Google algorithm updates that have been in the news over the past year or two. There are four fundamental processes in the entire procedure that provide you with your results pages:
- a) Crawling the web with automated programs known as spiders. These follow link after link and ultimately build up an index of pages based on keywords.
- b) Pages are sorted by content and many other factors which are all stored in the index.
- c) Whenever a user enters a search query, Google will refer to the index for web pages that contain the keywords used in the query. It will then use vocabulary, semantics, spelling, query understanding and other factors to extract relevant documents from the index.
- d) These documents (web pages) are then presented in the search results in order of calculated relevance to the query, involving more than 200 factors relating to the web page. There are 10 web pages offered per results page, ranked in order based upon these 200+ factors.
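The four steps above can be illustrated with a toy inverted index. This is only a minimal sketch: the page URLs, the `build_index` and `search` helpers and the "count matching words" score are all assumptions made for illustration, and bear no relation to Google's real 200+ ranking factors.

```python
from collections import defaultdict

# Hypothetical miniature "web": URL -> page text (stands in for crawled pages).
PAGES = {
    "example.com/a": "hummingbird search algorithm explained",
    "example.com/b": "panda and penguin algorithm updates",
    "example.com/c": "feeding a hummingbird in your garden",
}

def build_index(pages):
    """Steps a and b: map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Steps c and d: look up each query word, then rank URLs by match count."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

index = build_index(PAGES)
results = search(index, "hummingbird algorithm")
```

Here the page that contains both query words is ranked first, and the pages that contain only one of them follow.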
Google uses a large number of algorithms to establish:
- What pages qualify to be indexed
- Which indexed pages qualify to be included in the search results
- The ranking order of these search results
2. Google Algorithms
Most people will have heard of Googlebot. In the early days it was widely misunderstood to be the Google algorithm responsible for the entire search and ranking process. In fact, Googlebot is Google’s web crawler, often referred to as a ‘spider’, and it deals with step 1.a above.
Google actually uses the results of many algorithms before you see the results of your search. Google Hummingbird is just one of these, although a very important one. PageRank was the most enduring, although Google has since retired the public PageRank score it used to display in its toolbar. PageRank was developed by Larry Page and Sergey Brin before they started up Google.
Here, we shall discuss the most recent algorithm updates, then finish with an explanation of why Google Hummingbird helps people seeking information, and how you can use its concept to improve your results.
3. Spambots and Updates
Spambots are designed to seek out the various forms that webspam can take. Spam is not only defined as those annoying emails you get every day, but also includes keyword and link spamming. Google has been fighting a long-term battle against keyword and link spamming, which together can degrade the search experience of a Google search engine user.
Keep in mind that Google is fundamentally a search engine, whose customers are people looking for information – they are not advertisers, and certainly not website owners seeking online exposure. You must meet Google’s Terms of Service, not the other way around, if you want your website or blog to be visible online.
Here are Google’s latest algorithm updates. Each has been designed to improve the relevance of the PageRank algorithm and also to increase the relevance of social media, particularly that of Google’s own Google+.
3.1 Google Algorithm Updates: Google Panda
Google Panda is named after Navneet Panda, a Google engineer who worked on it. It was also known as the ‘Farmer’ update because it tackled the issue of content farms. The fundamental objective of Google Panda was to reduce the influence on PageRank of links that are artificially generated by publishing content on what are known as content farms.
Ezine Articles, for example, lost almost 90% of its web presence when published articles were deleted from Google SERPs, mostly for low word count and poor grammar. That’s why EZA has tightened up on its word count and grammar requirements. Many other content sites suffered the same fate, and some even closed down due to the Google Panda update. The type of content that Panda is designed to downrank or even completely remove from the index is:
- Spun content, where several pages or articles are generated from a master page or article using spinning software.
- Content that has been rewritten from published articles.
- Content or articles created using spinning or scraping software that searches the web and takes snippets from published content. This will likely be hit harder in the near future.
- Content that is badly written with poor grammar and spelling and offers little benefit to the reader. Grammar now counts in Google’s opinion!
- Short articles of only 300 words or under: this is where article directories generally suffered with Panda.
3.2 Google Penguin Update
There have been several major Penguin updates to date, along with some minor changes made to them. Each has the same objectives. The only difference is that Penguin now runs in real time as part of the core algorithm, as recorded in the Moz algorithm change history.
Now we know that the following applies not only to your home page, but also to internal pages and files, and also to blogrolls and archives that can be accessed through the file structure of your site. Ultimately, a blog is no more than a website with a content management system, and all files are accessible by Googlebot unless you prevent it using the robots.txt file or the ‘noindex’ directive.
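As a rough illustration of how a robots file keeps a crawler out of part of a site, the sketch below uses Python's standard `urllib.robotparser` module. The rules, the `/archive/` path and the example.com URLs are hypothetical; this shows how a well-behaved crawler interprets such rules, not how Googlebot is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents that keep Googlebot out of /archive/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archive/
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

# can_fetch() answers: may this user agent crawl this URL under these rules?
archive_ok = parser.can_fetch("Googlebot", "https://example.com/archive/2013.html")
post_ok = parser.can_fetch("Googlebot", "https://example.com/blog/post.html")
```

With these rules the archive URL is disallowed and the blog post URL is allowed.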
Here are the main issues that the Penguin algorithm update in both its forms is intended to tackle:
Article and Content Scraping
More emphasis on rooting out spun content and articles created using article scraping software. Google wants hand-written content, and will search out that which is obviously not.
Too many keywords on a page is another target. Nobody but Google knows for sure what that means, but if you aim for a keyword density (KD) of around 1%, then you should be OK. That is 8 keywords in an 800-word article. You may get away with up to 1.5% for single-word keywords, but we would recommend that as a maximum. In fact, if you just write naturally and ignore keywords, you might even achieve better results.
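The keyword density arithmetic is easy to check for yourself. The snippet below is a minimal sketch for single-word keywords; the `keyword_density` helper and the simple word-splitting rule are assumptions for illustration, since Google does not publish how, or whether, it measures density.

```python
import re

def keyword_density(text, keyword):
    """Return occurrences of a single-word keyword as a percentage of all words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = words.count(keyword.lower())
    return 100.0 * hits / len(words)

# Synthetic 800-word "article" containing the keyword 8 times.
article = " ".join(["hummingbird"] * 8 + ["filler"] * 792)
density = keyword_density(article, "hummingbird")
```

For this synthetic article the function returns 1.0, matching the 8-in-800 figure above.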
Penguin also looks at the anchor text of links back to your web pages. You are now better off using links to your site that predominantly contain your brand name or website/blog name rather than your keywords. In the past, keyword-rich anchor text was the ‘Big Thing’. Now, Penguin will punish you for it: it is looking for your blog title, your website domain name or your brand name to constitute at least 15-20% of your anchor text.
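If you have a list of the anchor texts used in links to your site, you can estimate the branded share yourself. This is a sketch under stated assumptions: the `brand_anchor_share` helper, the sample anchors and the "Acme Widgets" brand are all hypothetical, and simple substring matching is just one possible way to decide that an anchor is branded.

```python
def brand_anchor_share(anchors, brand_terms):
    """Return the percentage of anchor texts that contain any brand term."""
    if not anchors:
        return 0.0
    terms = [term.lower() for term in brand_terms]
    branded = sum(
        1 for anchor in anchors
        if any(term in anchor.lower() for term in terms)
    )
    return 100.0 * branded / len(anchors)

# Hypothetical backlink anchor texts for a hypothetical brand.
anchors = [
    "Acme Widgets",
    "acmewidgets.example",
    "cheap blue widgets",
    "buy widgets online",
    "best widget deals",
]
share = brand_anchor_share(anchors, ["acme widgets", "acmewidgets.example"])
```

Two of the five sample anchors are branded, so the share comes out at 40%, comfortably above the 15-20% guideline mentioned above.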
Other very important linking techniques that the Penguin algorithm update is seeking out include:
- Paid links – never pay for links back to your page.
- Low quality links – these are usually generated using linking software, particularly reciprocal linking software.
- Passing PageRank through text adverts on your site. All adverts should have the ‘nofollow’ attribute.
- Guest posting with keyword-rich anchor text.
- Using linked keywords in forum posts or signatures.
- Footers containing widely distributed links to your site.
- Linking schemes where links are offered automatically and not voluntarily according to the value of your web pages or blog posts.
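For the advert item in the list above, one way to audit your own pages is to look for advert links that lack `rel="nofollow"`. The sketch below uses Python's standard `html.parser` module; the `FollowedLinkFinder` class and the sample advert markup are hypothetical.

```python
from html.parser import HTMLParser

class FollowedLinkFinder(HTMLParser):
    """Collect href values of <a> tags that lack rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "href" in attributes and "nofollow" not in rel_values:
            self.followed.append(attributes["href"])

# Hypothetical advert block: the second link is missing rel="nofollow".
AD_BLOCK = """
<a href="https://ads.example/one" rel="nofollow">Ad one</a>
<a href="https://ads.example/two">Ad two</a>
"""

finder = FollowedLinkFinder()
finder.feed(AD_BLOCK)
followed = finder.followed
```

Any URL that ends up in `followed` is an advert link that would still pass PageRank and should be reviewed.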
Early Penguin releases applied all of the above to your home page, and later releases look deeper into your site. What Google is fundamentally doing is punishing links generated unnaturally, or artificially, and rewarding those obtained from people genuinely impressed by your website or blog, or individual pages within it.
In other words, your backlinks should be from people who genuinely want to provide their visitors with a link to your site for more information. This is another aspect of Google ranking: you get ranking points if you offer your visitors access to further information through outbound links on your site to other authority web pages.
3.3 Google Hummingbird Algorithm
As explained earlier, the Google Hummingbird algorithm is not an update such as Panda or Penguin, but a new search algorithm. It has been designed partially to extend Google’s Latent Semantic Indexing (LSI) algorithm to the search terms used by Google users. LSI has been used by Google to establish the real content on your site for indexing purposes. Hummingbird establishes the meaning of search terms for search purposes.
Introduced in September 2013, this algorithm makes it less necessary to optimize your web pages for long tail keywords, because Hummingbird can extract the meaning of the search term used and apply that to its indexed pages. However, it does mean that you can use long tail keywords in your search because Google will look at each word in your search term to decipher exactly what it is that you are looking for.
When Google introduced Conversational Search (using natural speech) a couple of years back, we all thought it was a gimmick. Now we know different. It was the forerunner of Hummingbird, whereby you can type like you talk to get a better search experience than simply using keywords.
How does Hummingbird Affect You?
Google Hummingbird should not have much effect on your blog or website if you have been writing naturally. Without seriously optimizing your pages for long tail keywords, you can improve traffic if you can foresee the type of wording people will use to find the services or products that you provide.
If you cannot do that, don’t worry about it. The objective of Google Hummingbird in the long run, as it has been for all Google algorithm updates, is to reward natural websites and blogs and tackle those designed to game the search system and attract traffic to low quality web pages. Be natural and honest, and Google will reward you with exposure in the form of the search engine ranking it believes you deserve. Try to game it and cheat, and Google will reward you with no ranking at all, and may even delist you.