It is a common misconception that all pages that fall under “pages not indexed” are errors and must be fixed.
Not all of the pages not indexed are errors.
The job of a skilled SEO is to analyze and interpret these findings and put them into context with the specific situation of the website in question.
The key here is prioritization, as is the case with any SEO audit.
Let’s talk about the “Not Found (404)” category.
A 404 simply means the page was not found, but why is Google listing pages that don't exist?
Well, maybe they once did. Maybe your site is linking to an old page (inspect the URL in GSC to get a list of referring pages).
Maybe there is a case sensitivity issue (URLs are case sensitive on Linux servers but not on Windows).
Maybe there are external links to an old page (this is known as SEO gold – a redirect can reclaim that link value).
Maybe an old version of your site, or a site migration that missed some things, is still being linked to.
There are lots of reasons but you should follow up on all of these errors and identify them.
If you have a site with a handful of errors it’s easy to deal with manually.
If you have a large site, you will need to filter and possibly programmatically deal with these errors. It could well be that there is only one or two root causes that give 1000s of errors.
The point is you have to look and then follow up and actively deal with these issues.
We refer to this process as GSC cleanup. The total errors on your site are a quality signal, don’t waste Google’s time and energy on errors.
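On a large site, grouping the 404 URLs by path prefix is often the fastest way to surface the one or two root causes behind thousands of errors. Here is a minimal Python sketch of that idea; the URLs are hypothetical stand-ins for a "Not Found (404)" export from GSC.

```python
from collections import Counter
from urllib.parse import urlparse

def group_404s_by_prefix(urls):
    """Count 404 URLs by their first path segment to surface root causes."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        prefix = "/" + segments[0] + "/" if segments else "/"
        counts[prefix] += 1
    return counts

# Hypothetical URLs from a GSC "Not Found (404)" export
not_found = [
    "https://example.com/old-blog/post-1",
    "https://example.com/old-blog/post-2",
    "https://example.com/old-blog/post-3",
    "https://example.com/shop/discontinued-item",
]
print(group_404s_by_prefix(not_found).most_common())
# → [('/old-blog/', 3), ('/shop/', 1)]
```

Here one retired blog directory accounts for most of the errors, so a single redirect rule would deal with the whole cluster.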
Google Search Console gives you a breakdown of the different reasons why your pages are not indexed.
You should pay attention to these reasons.
The noindex tag or other noindex instruction (note: there are several methods of index control, not just robots.txt) is a fairly clear signal to Google not to list a page.
It’s OK to have a noindex instruction, however, you need to be very sure that all of the pages that are in this category really should be.
Having accidental noindex instructions is more common than you think.
Sometimes (especially in WordPress) a page is copied from a template and the noindex tag is carried over (it’s just a check box – and it’s usually buried way down in a plugin control and not obvious).
Sometimes a whole directory is disallowed in robots.txt and a new page is put in that directory. This would prevent (or restrict) the page from being crawled and indexed; however, the reason may not be obvious.
Sometimes whole sites get pushed from staging to live with a noindex tag or instruction applied.
Trust me – this happens more often than most people would like to admit.
Not speaking from experience of course 🤦
I recommend using the SEO Pro Extension by Kristina Azarenko; it's my go-to SEO tool for a quick check on various things.
There is a tendency in the SEO industry (for historical reasons) to over-rely on robots.txt as the primary, and sometimes only, index control – in fact, some people still teach this. However, robots.txt only blocks crawling, not indexing; you should be controlling index directives at the page level with either a meta robots tag or an X-Robots-Tag header response. The meta robots tag is the most common and it's what I recommend.
NOTE: with some CMS systems it's not possible (or at least not straightforward) to add a meta robots tag, so in this case modifying server-side configuration files (usually .htaccess on Apache) to send the correct X-Robots-Tag header solves the problem.
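To make the page-level checks concrete, here is a small Python sketch that detects a noindex directive in either place – a meta robots tag in the HTML or an X-Robots-Tag response header. It's a simplified illustration (real crawlers also handle header casing and multiple directives); the sample HTML is hypothetical.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the content of <meta name="robots"> tags."""
    def __init__(self):
        super().__init__()
        self.directives = []
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.directives.append((a.get("content") or "").lower())

def is_noindexed(html, headers):
    """True if the page carries a noindex directive in a meta robots tag
    or in an X-Robots-Tag response header (exact-case lookup, for brevity)."""
    if "noindex" in (headers.get("X-Robots-Tag") or "").lower():
        return True
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in d for d in parser.directives)

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(page, {}))                                       # → True
print(is_noindexed("<html></html>", {"X-Robots-Tag": "noindex"}))   # → True
```

Running a check like this over your "excluded by noindex" URLs confirms which mechanism is responsible before you go hunting through plugin settings or server config.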
The main takeaway here is that you should check all pages excluded by a noindex tag and understand why they are there. Make sure that they should be, and if not – fix it.
It’s an easy fix and an easy win if you find pages that should be indexed.
The "Alternate page with proper canonical tag" category is confusing as it sounds like an error; however, IF canonicalization has been used correctly then this particular category can be considered informational and not an error.
We will need to touch on canonicalization to better understand this category.
A canonical URL, or use of the rel="canonical" tag, is intended to identify the main / master / original source of content.
The canonical tag was introduced so search engines could understand and avoid duplicate content.
With some navigation systems, it’s possible to end up with the same content at multiple URLs. This is duplicate content.
We use the canonical tag to tell search engines what the primary source of the content is.
Any duplicate URL (or near duplicate) that has canonical URL references will be considered as an “Alternate page with the proper canonical tag”.
This is not to be confused with two categories that likely do need attention: "Duplicate without user-selected canonical" and "Duplicate, Google chose different canonical than user" (both covered below). Consider these example URLs:
mysite.com/author/page-1
mysite.com/date/page-1
mysite.com/category/page-1
mysite.com/product/page-1
In all of these example URLs, page-1 is the same content. Let's say /product/page-1 is the master and all the other URLs declare it as their canonical; they would all be listed as "Alternate page with proper canonical tag".
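The bucketing logic is simple enough to sketch in a few lines of Python. Given each URL and the canonical it declares (the mysite.com URLs are the hypothetical examples from above), the master is the page whose canonical points at itself and everything else is an alternate:

```python
def classify_alternates(canonical_map):
    """Split URLs into canonical ("master") pages and their alternates,
    mirroring GSC's "Alternate page with proper canonical tag" bucket."""
    masters = [u for u, c in canonical_map.items() if c == u]
    alternates = [u for u, c in canonical_map.items() if c != u]
    return masters, alternates

# URL -> declared canonical; all four canonicalise to /product/page-1
pages = {
    "https://mysite.com/product/page-1":  "https://mysite.com/product/page-1",
    "https://mysite.com/author/page-1":   "https://mysite.com/product/page-1",
    "https://mysite.com/date/page-1":     "https://mysite.com/product/page-1",
    "https://mysite.com/category/page-1": "https://mysite.com/product/page-1",
}
masters, alternates = classify_alternates(pages)
print(len(alternates))  # → 3
```

Three alternates and one master is exactly what you would expect to see reported, which is why this category is informational rather than an error.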
So this is not an error but informational. You should still check and understand why pages fall into this category. It is also possible that pages or sections of your site wrongly fall under it and that some things do need attention.
There are also other reasons why pages may end up in this category.
As always, the moral of the story is to check and verify. Understand why your pages fall under this category.
A page with redirect is a fairly descriptive term.
This category of “page with redirect” is simply because the URL that was arrived at is redirecting to another URL.
However, that’s often only part of the story.
What type of redirect is it?
Why is there a redirect in the first place?
Should this URL have been discoverable?
Are these URLs being linked to directly from somewhere on your site?
It’s OK to have redirects but they shouldn’t be a large percentage of your total response codes.
Redirects return 3xx response codes, the most common types being the 301 (permanent) and the 302 (temporary).
They act the same for users but are treated quite differently by search engines. A 301 permanent redirect is most commonly used and is what most SEOs will advise, as 301 redirects pass link juice. A 302 temporary redirect, on the other hand, does not and is considered exactly that – temporary. A 302 is a bit of a brick wall in terms of SEO. If there is a 302 redirect, be sure that it really should be one.
I have worked with a major insurance brand that had their top-level URL 302 redirect to a sub-structure because of their antiquated CMS. It happens! This is why we check. Sometimes this can yield major SEO wins when you find an incorrect 302 redirect and can make it a 301.
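Surfacing suspect 302s from a crawl is a quick programmatic check. A minimal sketch, assuming you already have crawl results as a mapping of URL to HTTP status code (the URLs and codes below are hypothetical):

```python
def flag_temporary_redirects(responses):
    """Return URLs answering with a 302, which deserve a second look:
    if the move is really permanent they should be 301s."""
    return [url for url, status in responses.items() if status == 302]

# Hypothetical crawl results: URL -> HTTP status code
crawl = {
    "https://example.com/":         302,  # top-level 302 – worth investigating
    "https://example.com/old-page": 301,
    "https://example.com/products": 200,
}
print(flag_temporary_redirects(crawl))  # → ['https://example.com/']
```

Any URL this flags is a candidate for the kind of easy win described above: confirm the redirect is genuinely temporary, and if not, change it to a 301.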
Next, you can inspect (click the magnifying glass or enter the URL into the top bar in Google Search Console) the URL and see if it is in the site map or if there are referring pages.
A common error is to link to a URL without a trailing slash while the CMS (looking at you, WordPress) automatically 301 redirects it to the version with a trailing slash.
These are two different URLs.
It is best practice not to link to a URL that redirects; ensure that you are linking directly to the version of the URL that returns a 200 (OK) response code. This is not obvious from normal navigation.
If you add a 301 redirect, it's important to update the links on your pages to point at the URL that returns the 200 response code. This can get especially messy after a site migration and/or major update.
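The trailing-slash case in particular is easy to audit programmatically. Here is a rough Python heuristic that flags internal links likely to trigger that CMS redirect; it assumes the site enforces trailing slashes on directory-style URLs, and the example links are hypothetical.

```python
from urllib.parse import urlparse

def links_needing_trailing_slash(links):
    """Flag internal links that omit the trailing slash and would be
    301-redirected by a CMS (WordPress among them) that enforces one."""
    flagged = []
    for link in links:
        path = urlparse(link).path
        last = path.rsplit("/", 1)[-1]
        # Skip the root and paths that look like files (e.g. /logo.png)
        if path and not path.endswith("/") and "." not in last:
            flagged.append(link)
    return flagged

internal_links = [
    "https://example.com/blog",      # redirects to /blog/ – fix the link
    "https://example.com/blog/",     # already correct
    "https://example.com/logo.png",  # a file, no slash expected
]
print(links_needing_trailing_slash(internal_links))
# → ['https://example.com/blog']
```

Running this over an export of your internal links shows exactly which hrefs to update so they hit the 200 version directly.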
It’s best practice not to let the number of redirects get larger than the quantity of 200 response codes.
Again, this requires inspection, digging deep into the data that GSC gives you, and taking action based on your findings.
A server error 5xx is something no dev or webmaster wants to hear.
The 5xx series status code usually means there is an error with the server. This means that the URL request that was being made did not end with the desired result.
Exactly why is not always obvious. The page returned (or not) may give some insight into what went wrong. Also note that this can be a security risk, as an error page can expose information about your server that you don't want public and that would not normally be available.
5xx series errors are usually fairly temporary. Once the issue is resolved on the server, simply submitting a "validate fix" will usually lift the "page not indexed" status for the 5xx error and the correct result is obtained (this is not the case for most other categories, which require some specific action to resolve).
If you notice a spike in 5xx errors you should alert your webmaster or dev team.
It is not good to leave Google hanging with a 5xx error as this could be traffic you are losing and rankings you are not earning with the URLs that encountered the error.
The soft 404 is a somewhat unusual and ambiguous category.
It means that Google thinks it has arrived at what looks like a user-friendly "page not found" page, but the request does not return a 404 response code.
It is important to make sure your 404 error page results in a 404 response code, and that URLs that truly don't exist respond with a 404. Once you have that part down, it's time to deal with the 404 errors separately.
In Google’s documentation on this issue, they harp on the fact that it’s a bad user experience to return a page with a 200 response code that is a broken or incomplete page and or a page that says not found.
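A simple heuristic makes the soft 404 idea concrete: a 200 response whose body reads like an error page is a candidate. This Python sketch is only an approximation (Google's own detection is more involved), and the phrase list is an assumption for illustration:

```python
def looks_like_soft_404(status_code, body_text):
    """Heuristic: a 200 response whose body reads like an error page
    is a candidate soft 404."""
    error_phrases = ("page not found", "no longer available",
                     "nothing found", "doesn't exist")
    body = body_text.lower()
    return status_code == 200 and any(p in body for p in error_phrases)

print(looks_like_soft_404(200, "<h1>Oops! Page not found</h1>"))  # → True
print(looks_like_soft_404(404, "<h1>Oops! Page not found</h1>"))  # → False
```

The second case is the correct configuration: the "not found" page is fine as long as it is served with a genuine 404 status.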
This category covers duplicate pages with no user-selected canonical. First of all, there should be NO pages without a canonical URL declaration.
Canonicalization is one of the higher-priority items to deal with. Many of the errors and page not indexed categories in the search console are in some way related to canonical and or duplication issues.
Duplicate content can be a serious issue as it dilutes your footprint and poses a challenge for Google. Where there is duplicate content, there should be a clear canonical so Google can understand which the primary version is.
In our previous example, all of these URLs return the same content. If none of them reference a canonical URL then they would all be marked as duplicate content.
Google will ultimately decide on a canonical but it’s not something you should leave to chance.
Make sure you are correctly defining canonical URLs for ALL of your pages.
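Checking for a missing canonical declaration is straightforward to automate. A minimal Python sketch using the standard library's HTML parser (the sample markup is hypothetical):

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Collects href values of <link rel="canonical"> tags."""
    def __init__(self):
        super().__init__()
        self.canonicals = []
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonicals.append(a.get("href") or "")

def canonical_of(html):
    """Return the page's declared canonical URL, or None if it has none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals[0] if parser.canonicals else None

good = '<head><link rel="canonical" href="https://example.com/page/"></head>'
print(canonical_of(good))                     # → https://example.com/page/
print(canonical_of("<head></head>") is None)  # → True
```

Any page where this returns None is one where you are leaving the canonical decision to Google – exactly the situation to avoid.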
“Crawled – currently not indexed” indicates that Google’s crawlers have visited the URL, but it has not been added to Google’s index.
For a URL to appear in Google’s search results, it must be included in Google’s index.
When a page is crawled but not indexed, it means that Google’s crawlers have found the page and read its content, but for some reason have decided not to include it in the index.
This can indicate quality issues or thin content. If you have a lot of pages in this category further investigation is needed.
The discovered – currently not indexed category means that Google is aware of your URL but it hasn’t crawled it yet so the page is not in the index and can’t rank. This is often the case for new URLs.
There are methods to prioritize crawl requests and submit your URLs to Google.
Again, if you have lots of URLs in this category it is an indication that there is some kind of quality issue and or your site is not meeting Google’s helpful content bar or you have a low site-wide quality score.
This issue needs further investigation.
This category is indicative of a problem. Usually, if canonical tags are correctly set and appropriate, Google will respect and agree with your canonical declaration.
Where Google finds a different canonical you should investigate why.
This can happen with pagination, where Google may consider the first page in a series to be the canonical. It can also occur where a sub-URL is a duplicate, or where there is a lot of faceted navigation and multiple ways to reach the same or very similar content.
In this article, I have discussed the different reasons why your pages might not be indexed in Google.