Last Updated: 17th October 2019
Imagine the scenario: you’ve spent hours creating high-quality content that both users and search engines will love, and lined up some great outreach opportunities for when it goes live. You hit publish… wait… and then wait some more.
Why is no traffic going to this great webpage?!
Diagnosing indexation problems in Google Search can be a worthwhile endeavour, whether for individual pages, priority pages or at scale, depending on the situation. There are lots of reasons why a webpage might not be indexed in Google, which is why we’ve put together the checklist below to help work out why a page isn’t showing up in the search results. The ‘Inspect URL’ feature in Google Search Console is a great way to examine individual pages, and can be used alongside this checklist if your relevant domain property is verified.
Why is my URL not indexed?
1: Blocked in robots.txt
The robots.txt file (always found at example.com/robots.txt) controls which URLs Google (and other search engines) are allowed to crawl or visit. If crawlers are blocked from visiting certain pages, those pages might not be indexed (although they may still be indexed through other means, for example if other pages link to them). To solve this, delete the offending line from your robots.txt file (but check first that the change won’t inadvertently affect any of your other pages!)
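As an illustration (the /blog/ path here is hypothetical), a robots.txt rule that blocks crawling of a whole section of a site looks like this:

```
User-agent: *
Disallow: /blog/
```

Removing the Disallow line, or narrowing its path, restores crawl access for all compliant crawlers.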
2: Page has a noindex status
We’ve written about the noindex tag quite recently. It is worth repeating, though, that pages with a page-level noindex meta tag (or a server-level X-Robots-Tag noindex header) applied will not show in search once they’ve been crawled by the search engines. The solution is to remove the noindex directives and wait for the page(s) to be recrawled. Simple!
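For reference, the page-level version is a tag in the page’s head section:

```html
<!-- Page-level: remove this tag to allow the page to be indexed -->
<meta name="robots" content="noindex">
```

The server-level equivalent is an HTTP response header of the form X-Robots-Tag: noindex, which is often easier to spot with a header-inspection tool than by viewing the page source.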
3: Canonicalised with the canonical tag
If a page has a canonical URL set pointing elsewhere (telling the search engines that the ‘linked-to’ page is the preferred version to be shown in search), the original URL may not show in Google. If appropriate, removing the canonical tag or changing it to a self-referencing one will allow the page to be indexed.
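By way of example (URLs are illustrative), the difference between a canonical pointing elsewhere and a self-referencing one is a single href:

```html
<!-- On example.com/page-a/: tells Google to prefer page-b, so page-a may not be indexed -->
<link rel="canonical" href="https://example.com/page-b/">

<!-- Self-referencing version: page-a is its own preferred URL and can be indexed -->
<link rel="canonical" href="https://example.com/page-a/">
```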
4: Page removed with the URL removal tool
If, for some reason, the page has been submitted through Google’s page removal tool, it may be temporarily excluded from the index for 90 days. These requests can be undone through the tool itself if desired, which will allow the content to show again.
5: Authorisation needed or Googlebot blocked at server level
If Google reaches a web address that requests authorisation, it won’t be able to crawl or index it, just as a user couldn’t access a page behind a log-in screen without a username and password. This may or may not be the desired behaviour, but it is worth checking for. It is also possible that Google’s bots have been blocked at the server level, which would typically be done through the .htaccess file. Developer help may well be needed to fix these kinds of errors!
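As a rough sketch of what such a block might look like on an Apache server (the exact rule is illustrative, and a real block could equally live in firewall or CDN settings), a .htaccess snippet like this would return a 403 Forbidden response to anything identifying itself as Googlebot:

```apache
# Deny requests whose User-Agent contains "Googlebot" (case-insensitive)
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
RewriteRule .* - [F,L]
```

If a rule like this exists unintentionally, removing it (and re-checking the page with the URL inspection tool) restores Googlebot’s access.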
6: Not crawled due to server overload
Googlebot tries to be a ‘good citizen of the web’ whenever it can. This means that when it detects that a site’s web server is under strain (the server response time is a lot longer than it should be for example) it will generally scale back its efforts to retrieve pages from the site. This may result in some pages not being crawled, or crawled incorrectly. Server speed optimisation is a topic in itself, and does vary from site to site, but loss of crawlability in this case could also be related to the hosting provider. Cheap, shared hosting often has issues with page load speed and, if the cost is justifiable, moving to dedicated hosting may be worth the investment for larger websites.
7: Page removed because of legal complaint
This is fortunately rather rare, but it is possible that webpages will sometimes be taken down from Google due to legal challenges (for example, due to a DMCA violation claim). There isn’t necessarily a specific recourse from this unfortunately, except potentially through legal channels.
8: Page doesn’t meet quality standards
Duplicate, low-quality or thin content may all be reasons why certain pages aren’t included in the index after they’ve been crawled. If so, improving the page’s content and re-requesting indexing are the best solutions.
9: Random bad luck!
Crawling and indexing the billions of pages on the web is rather a taxing job, and sometimes mistakes happen. This isn’t the end of the world, though, as the request indexing feature in Search Console is an easy way of asking Google to re-fetch and index the page in question.
Bonus: Get your content reviewed quickly
In a hurry to get your fixed content reviewed by Google? If you inspect the URL in Google Search Console, you’ll find a handy ‘Request indexing’ feature. This doesn’t have to be used only on new content; it works for existing URLs as well!