Google Indexing Research
October 2023 – Crawling System Issues – InspectionTool Crawlers
After Google announced completion of the Core Update on October 19th, 2023 and the Spam Update on October 20th, 2023, a scattering of anecdotal reports appeared from sites across the globe that were suddenly unable to request indexation via Search Console. The message was Indexing Request Rejected.
On October 23rd, AWM published an article to record the incident contemporaneously, so that there would be something on record about the Hostload Exceeded error messages people were seeing in Search Console. This error message has appeared before, but not for the same reasons. It was then discovered that the error was coming not only from Search Console but also from the Mobile Friendly Test Tool and the Rich Snippet Tools.
During this time, Google attributed the issue to spammers seeking indexation.
As per John Mueller –
I wouldn’t worry about it – normal crawling and indexing is generally fine.
— I am John – ⭐ ⭐ LIVE ⭐ ⭐ (@JohnMu) October 22, 2023
In researching this issue further and covering it on the Crawl Or No Crawl reports on YouTube, it was discovered that the InspectionTool Crawlers, which were introduced in May 2023, were absent from server logs across several test and live sites during this time. See: InspectionTool Crawlers Are MIA, Crawl or No Crawl, 25 October 2023.
On October 26, while preparing data for the Indexation Research and after not seeing the normal inspection-tool crawlers, I wondered what bots would come if I used the mobile friendly test tool. Within seconds it became apparent what was happening and is STILL happening.
In many cases, the InspectionTool Crawlers are sending dozens and even hundreds of (compatible; Google-InspectionTool/1.0) requests in response to a single mobile friendly test. Conversely, for Search Console requests for indexation, the Desktop Googlebot has reappeared for the purpose of render crawling new content, a function that the Chrome build (compatible; Google-InspectionTool/1.0) crawlers had been performing.
The normal number of render pulls for a standard page is dictated by how many pieces make up the page. In the past, when the InspectionTool Chrome crawlers came to render a page, they ran from 3 to 6 crawls each as mobile and desktop, for a total of 6 to 12 crawls.
Today, some of these pages pulled 10 to 15 times the normal number of crawls.
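For anyone who wants to check their own server logs against that baseline, here is a minimal sketch. It assumes the combined log format (user agent as the last quoted field); the sample log lines and the function names are made up for illustration and are not taken from the research data.

```python
import re

# Baseline from the text above: 3 to 6 render pulls each for mobile and desktop.
BASELINE_MIN, BASELINE_MAX = 6, 12

# Assumption: combined log format, where the user agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_inspection_hits(log_lines):
    """Count log lines whose user agent mentions Google-InspectionTool."""
    hits = 0
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if match and "Google-InspectionTool" in match.group(1):
            hits += 1
    return hits


def multiple_of_baseline(hits):
    """Return (low, high): hits as a multiple of the baseline max and min."""
    return hits / BASELINE_MAX, hits / BASELINE_MIN


# Illustrative, made-up log lines.
sample = [
    '203.0.113.5 - - [26/Oct/2023:10:00:01 +0000] "GET /page/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Google-InspectionTool/1.0)"',
    '203.0.113.5 - - [26/Oct/2023:10:00:02 +0000] "GET /style.css HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; Google-InspectionTool/1.0)"',
    '198.51.100.7 - - [26/Oct/2023:10:00:03 +0000] "GET /page/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
hits = count_inspection_hits(sample)
```

In practice you would filter the log down to the minutes right after a single test request before counting, so that unrelated crawls are not included in the total.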
Below is a breakdown of 10 sites, the type of page that was tested and the number of Google-InspectionTool Crawlers that arrived in response to the request for mobile friendliness as per each site’s server logs. I performed the same type of requests across a total of 30 sites and have more in the works, waiting for their server logs to update. At no point was there an error message or any indication of anything other than a successful test in the mobile friendly tool.
This indicates that there is an ongoing issue (today’s date is Oct 26, 2023) in the crawling system of the indexation system. Google engineers have evidently been able to compensate for taking the Inspection Crawlers out of the crawling process by replacing them with the Desktop Chrome-build Googlebots. This is evident when content goes through the Search Console indexation request.
If you’re using the Google Indexing API to request indexation of new content, please note that according to the testing data, no rendering crawls have followed requests made through the API since isolated testing started on September 3, 2023.
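To see which crawler family is doing the render work on a given day, the user agents in a log can be sorted into rough groups. The sketch below is an assumption-laden illustration: it relies only on the substrings Google-InspectionTool, Googlebot and Mobile, and the function names are hypothetical.

```python
from collections import Counter


def classify_user_agent(user_agent):
    """Sort a user-agent string into a rough Google crawler family."""
    if "Google-InspectionTool" in user_agent:
        family = "inspection-tool"
    elif "Googlebot" in user_agent:
        family = "googlebot"
    else:
        return "other"
    # Assumption: smartphone variants carry the "Mobile" token, desktop ones do not.
    device = "mobile" if "Mobile" in user_agent else "desktop"
    return f"{family}-{device}"


def tally(user_agents):
    """Count how many requests fall into each crawler family."""
    return Counter(classify_user_agent(ua) for ua in user_agents)


# Illustrative user-agent strings; W.X.Y.Z stands in for the Chrome version.
agents = [
    "Mozilla/5.0 (compatible; Google-InspectionTool/1.0)",
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36",
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]
counts = tally(agents)
```

This grouping is deliberately coarse. Specialised crawlers such as image bots would land in the desktop bucket, and a user-agent string alone does not prove the request really came from Google.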
This page will be updated once the situation is resolved.
September 2023 During The Helpful Content Update:
From time to time Google has mechanical issues with getting new content findable by its keywords, which is where the traffic comes into your site. There are hundreds of reasons for there to be issues getting content into the findable index, and most of the analysis assumes a fault on the part of the content creator. This information, however, looks at where the issues lie within Google’s indexation system and sub-systems: crawling, indexing, ranking and serving.
During September Google announced the Helpful Content Update. Based on field evidence, in the form of indexation-resistant content that started to see impressions and rankings in Search Console literally within hours of the announcement, I still believe that the Helpful Content Update is about a topic and the confirms of that topic.
The video below reveals the state of indexation during the week of Sept 18 – 25th. There were a number of Chrome build updates in the Google agent crawlers, including the mobile, desktop and InspectionTool crawlers.
How do I get my page or site indexed?
As of today, the Google Indexing API is not running new content through the 2nd pass, or rendering pass, of the system. Since that is an incomplete process, I no longer recommend using the Google Indexing API.
Instead, the research data shows that requests made through Google Search Console are not only a complete process, including both the simple and render passes, but also very quick: less than 24 hours in most of the recent testing.
How to spot indexing issues?
The quickest way to spot them is to set up a test keyword that, when searched, will pull up only that page. But since most content creators don’t understand how to set this up, the next best way is to isolate the URL in the performance report in Search Console and set the date range from a day prior to the publish date of the new content to within a week later. When the indexation system is in working order, the page should start to receive impressions within a week.
Also, at this time this holds true for either data designation in Search Console. Whether the site’s primary crawler is designated as desktop or mobile (smartphone), the same rates of crawling, indexing, ranking and serving are reported for both, which wasn’t the case between January and April 2023.
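The date range described above is easy to compute. The sketch below works out the window and, as an assumption, shapes it into a request body modelled on my understanding of the Search Console API's search analytics query; check the field names against the official documentation before relying on them. The function names and the example URL are hypothetical.

```python
from datetime import date, timedelta


def performance_window(publish_date, days_after=7):
    """Return (start, end): one day before publishing to a week after."""
    start = publish_date - timedelta(days=1)
    end = publish_date + timedelta(days=days_after)
    return start, end


def build_query_body(page_url, publish_date):
    """Build a request body that isolates one URL over the window.

    Assumption: field names follow the Search Console API's
    searchanalytics.query request format.
    """
    start, end = performance_window(publish_date)
    return {
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "dimensions": ["date"],
        "dimensionFilterGroups": [
            {
                "filters": [
                    {
                        "dimension": "page",
                        "operator": "equals",
                        "expression": page_url,
                    }
                ]
            }
        ],
    }


body = build_query_body("https://example.com/new-post/", date(2023, 10, 1))
```

The same window can simply be typed into the date picker of the performance report. The code only saves the arithmetic when many URLs are being tracked.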
Updated Indexation Research for 2023:
The research continues to provide further and deeper insights into the indexation process. In January 2023 the Google Indexing API modified its response and no longer provides the same predictable serving rate. Additionally, the data for sites designated “primary crawler desktop” in Google Search Console indicates an issue between the indexation and serving systems. Whether this is a system error, a filtering error or even a deliberate hobbling is not readily visible in the data. The data clearly shows an issue but not the nature of the internal issue.
Update: May 2023
The testing data shows signs that primary desktop crawler sites are behaving more like the smartphone designated sites. Both showed the same behaviors in the crawling system (working) and the indexing system (working). In the serving system, the mobile designated sites had little to no issue, but the desktop designated sites had shown a marked reduction in the serving of new content since early December 2022.
The most recent two weeks of data reveal that the serving of new content through the indexation systems is performing at a PRE-December rate.
Thank you, everyone, for being so single-minded about having me procure a replay video. Someone in the group was able to provide a truncated version. This isn’t a great recording but it does have the meat of the research. Below is a transcript.
The presentation had a little more information and research data. If you’re interested and want an opportunity to see this again and ask your questions, there is a way below to share your interest. Just let me have your email and I’ll let you know when I can carve out some time to present this again.
Be The First to Get Notified On Next Forensic SEO Live Training
Transcription below. Before you leave, please subscribe to the Crawl or No Crawl YouTube channel
Link to research data log – google index detector
Subscribe to Confessions of an SEO – available everywhere you can get podcasts
Course: Forensic SEO Live Training
Okay. So there, there will not be a quiz on this, so don’t worry about it. But I just wanted to, to show you a lot of, everything that I’m going to share today came from this activity. And I’ll explain as, as we go.
01:60 – 03:02
Hey, Marie, do you mind if I make you a co-host and then if people come in the waiting room, do you mind letting them in? I just want to respect everybody’s time; as they say, time is money. So, we can keep going now. Everybody is set so you’re muted. If you can’t hear me or something weird happens, please unmute yourself and, and say, you know, we lost you. But if it’s a question about what you see, go ahead and type it in the chat and we’ll get to it.
03:07 — 04:26
There are so many little things on this screen. All right. So can you guys see that okay? I see a couple of heads nodding up and down. Okay. So the short message from today is: when it comes to indexation of new content, you better ask Google nicely.
Okay. So I’m not gonna spend a lot of time. I assume most of you know who I am. I’ve been an SEO for 13, 14 years. I’ve lost track. And I am an avid tester. I was in the beginning of SEO testing in 2015. And a lot of what you’re going to see today is, I’m going to do my best to present it as if we were in a test review, where other testers are present.
04:27 – 05:13
We’ve kind of done that before. So everything I’m sharing, I’m going to share it like a test, but we’ve got some proof on there. So I’m going to provide some context. I’m going to show you what I did with the two test sites that sort of led to this discovery, general testing methodology, what some of the updated findings have been since discovering this, and then take some questions and then share with you what else is going on, what else is coming out of, out of this research project. So if everybody’s cool, we will keep going. I see a lot of people are coming in, so, Maria, are there any questions to start? No? OK
05:17 – 06:33
All right, here we go. So I just really want you to know you’re not wasting your time. And this has worked for other people, not just me. So, and actually, the first one I’m going to share is this is from Lee Witcher and Lee’s actually here. So we, I don’t know if you know, is going to just like post everything there. But Lee is someone who I greatly respect. I’ve learned a lot from him, from his approach to SEO, and he has a specialized advanced training for SEOs and specifically those using Cora. And so it works for Lee, and then, now I’ve got some screenshots from Eric St-Cry. He sent me in some graphics to show you that basically everything, once he started doing this, grew tall, like tall green grass. They went from 82 indexed pages to 520 pages.
06:37 – 08:13
And we didn’t really have any way to, document that we were all bitching on Facebook. We’re all complaining with each other. And, Ted Kubaitis in one of his (SEO FIGHT CLUB) shows was like, please, will somebody, we need to start putting this together, putting a body of work together so that we know what’s going on.
So, for right smart or stupid, I went ahead and did it because partially for me, my SEO HTML testing was at a standstill. I could not get any of my test pages to index. And if you can’t get them indexed, you can’t find anything out.
All right. So I’m presuming that this looks very familiar to a lot of people. This was right out of the Search Console: Crawled not indexed. I couldn’t find a screenshot of Discovered not crawled because I don’t have that problem anymore.
Okay. Now the other one, which is always fun: Google probably already has enough content on that topic, so they don’t need you. And this was always a favorite: I don’t have any trouble, therefore, you just did it wrong.
How are we supposed to do it? In simple terms, we’re supposed to go to the site, put in our URL.
10:21 – 12:10
I published a test page every day, which means I had to create seven HTML pages a week and then publish them in the mornings and check on them periodically throughout the day. It got so bad after, by the end of the year, I
And so I created a WordPress site, and after January 1st I modified the testing schedule to two, three times a week. So I’m banging my head against the wall less. And I’m happy to say that it began to work. And out of 33 test pages, 33 were indexed, a 100% indexation, not to rub it in. This is kind of like what it looked like when I first thought, okay, I think I’m onto something. And this is as of last week. I mean, it’s still growing, growing and growing. So it still works.
12:12 – 12:36
And what I found is it matters how you ask Google to index your lab. And I will prove it. All right. So remember I had two sites, and one ran from the end of August to mid-February, test pages every day. And then site two was launched on February 14th and it launched,
12:41 – 13:11
Now for everybody who says, well, the content quality really sucked: it was supposed to, because the plan in testing is to remove as many variables as you can, so you’re only testing one factor. That’s why there are no entities in here. That’s why there are no pictures in here. Nothing. Historically, testing has been the lowest quality content you can come up with for Google.
So for instance, you want to come up with, and I’ll explain this. You want to come up with keywords that Google doesn’t know about, because if you use something that.
13:27 – 13:50
To test within live content on, you can’t isolate your test to see what’s going on. So by using these unique words, we’re able to isolate the test pages and see what’s going on. So there were two, two ways that keywords were on a page. The one is the way that we all do it.
If it had been render processed by Google, so that when it showed up in the index, that’s how you knew that page had been rendered well. So the goal was testing for two things: is Google taking in new content, just very simple content.
And if you’re curious to know how to find words that Google doesn’t know, just pretend you’re typing, and eventually you will come up, you will search for a word that Google doesn’t know. And when you see the little monster fishing, that’s when you know you’ve got the right one. You don’t want one that suggests, oh, did you mean, you know, this JC Penney? No, I did not. So you want to make sure that it’s something that is very, super simple. All right.
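If pretending to type gets tedious, random letter strings can be generated instead. This is only a sketch of one possible approach, not the method used in the research; the function name and default length are made up, and every candidate still has to be searched by hand to confirm that Google returns no results and no "did you mean" suggestion.

```python
import random
import string


def make_test_keyword(length=10, seed=None):
    """Generate a random lowercase string to try as a unique test keyword.

    A seed can be passed to reproduce the same candidate later.
    """
    rng = random.Random(seed)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


# Generate a handful of candidates to check manually in Google.
candidates = [make_test_keyword(seed=n) for n in range(5)]
```

Longer strings are less likely to collide with a real word or a brand name, at the cost of being harder to type when checking rankings later.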
So we’ve got to, we’re comparing these two sites.
16:01 – 17:03
So here’s, you know, everything on the left was the site that started in August. So analytics, Search Console, sitemap, robots.txt, feed XML. I requested indexing via Search Console, submitted sitemaps, and low-level traffic. You know, I didn’t know what it was. And then on the other one, I did everything the same with the exception of, I never requested indexation through the Search Console. And I set the site up directly with the Google Indexing API and Bing IndexNow. And, as I said, low-level traffic. Now I know everybody’s heard, you know, it was like,
Now, a lot of people, when you tell them this, they’re like, that’s not true. Google requires that you have schema. Now, if you’re a tester, this makes sense why you would test it, find out that it didn’t work.
You do not have to believe me. You just try it, convince yourself. Now, the other thing that a tester does is, just because it happened one time doesn’t mean it’s real. So I created three more site setups. The third site was real content, high level of optimization, low-level traffic, submitted through Search Console and fungus. The fourth site, real content, no real optimization, low-level traffic, hooked it up with the API: every single page was indexed within 48 hours.
18:20 – 19:45
Okay. It’s still not enough, right? Because now I’m wondering, is it the low-level traffic then, even though Search Console had it? But I just wanted to eliminate and really isolate that it was the API. So real content, no real optimization, no traffic, hook it up to the, and got the same result. I mean, I was giddy to say the least, and here was the real question: can I go back and do some SEO testing with random alpha? And so I was like, nope, it was a nervous test because I didn’t, I knew I wanted an answer, but I, I knew I couldn’t go down that road.
So if you have new content, there are two ways that you can do it. I’ll, I’ll go through the two different ways, but basically both of them involve connecting your site to the Indexing API and IndexNow. Now, for WordPress, the Rank Math plugin has an instant indexing plugin that lets you hook it up. And then there’s SEO Tools for Excel, which has an Indexing API connector, found that went out when I presented this information to a bunch of testers, which bit so when you publish new content, you just let the automatic submission take care of it. And you do not request indexing from Search Console, unless you just like writing a ton of content and letting it die. And again, this is for now, right? So, if you, I can put this off, I can send this out in an email.
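For readers curious what those connectors submit on a site's behalf, here is a rough sketch of the two request bodies. It reflects my understanding of the public Google Indexing API and IndexNow documentation, so treat the endpoints and field names as assumptions to verify. The helper names, example host and key are hypothetical, authentication is left out, and nothing is actually sent.

```python
import json

# Assumed endpoints, based on the public documentation for each service.
GOOGLE_INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def google_indexing_payload(url):
    """Body for notifying the Google Indexing API that a URL was updated."""
    return {"url": url, "type": "URL_UPDATED"}


def indexnow_payload(host, key, urls):
    """Body for an IndexNow bulk submission.

    Assumption: the key file is hosted at the site root as <key>.txt.
    """
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }


# Hypothetical example values; nothing is sent over the network here.
new_url = "https://example.com/new-post/"
google_body = json.dumps(google_indexing_payload(new_url))
indexnow_body = json.dumps(indexnow_payload("example.com", "abc123", [new_url]))
```

A real submission to the Google endpoint also needs an authorized service account, which is the part the plugins handle for you.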
20:47 – 22:06
So you all have it, but if you want to write it down, cause I I’d say don’t waste any time, get, if you’re having any