These Leaked Documents Uncover How Google Search Ranks Its Results
What you heard about the massive Google documentation leak is true. Thousands of potential ranking factors were exposed, but not through a hack or a whistleblower, as you might imagine. The leak came from documentation for Google’s internal Content API Warehouse.
The original raw documentation of the leak consists of over 2,500 modules in an API doc with about 14,000 ranking factors that Google has used or continues to use to this day.
Some SEO experts have combed through it, and a few even provided their summary and opinion on it. But most SEO companies, freelancers, and in-house marketing teams don’t understand what they are looking at or how to use it to their benefit.
That’s where Tag Marketing’s SEO expert Chris Taglia comes in. Yeah, that’s me! I am going to break down the most important parts of this Google leak and ignore the rest that won’t really help you. I’ll even go a step further than everyone else and explain to you in simple terms how to use it to improve your search engine rankings.
5 Things You Should Always Keep In Mind With Google Search
- Google doesn’t want you to know how they rank you. If you knew, you would be able to “hack” the system. It would create less of a need for you to spend on Google Ads. So of course, they lie and mislead you.
- Ranking has always been a simple concept, but people constantly overthink it. Create quality content that your target audience wants to learn about, in a format they want to consume, and share it across the marketing channels where they spend most of their time.
- The more your content gets linked to and shared, the more likely it will be shown to others who search for similar words or phrasing.
- Readers should always be able to skim your content. If they only read your headlines, subheaders, bold wording, lists, and links, would they get a solid understanding of the topic you are writing about? Do your images help tell the story, or are they an afterthought that adds confusion?
- We were always told that search engines don’t rank websites, they rank webpages. However, good SEO experts know that the website as a whole has to be considered in some way, through signals such as authority on a topic, trustworthiness (not spammy), internal linking, and the role your index/home page plays. We just weren’t sure how much weight the entire website carries in a single web page’s ranking.
10 Topics We’re Going to Review From Google’s Leak Right Now
- Page Quality
- Topical Authority
- Content Cluster Testing
- How Clicks Affect Re-Ranking
- Domain Age
- Archived Versions of Your Website
- Google’s Search Ranking System
- Length Doesn’t Always Matter
- Link Freshness
- Site-Wide Ranking
1. Page Quality
Google has something called pageQuality (PQ) that was mentioned in the leak. It uses an LLM (large language model) to estimate the amount of effort you put into your articles and pages, and whether it can be replicated easily.
Takeaway: Tools, images, videos, unique information, and depth of information are ways to score a high pageQuality.
2. Topical Authority
In the Google search algorithm leak, siteFocusScore, siteRadius, siteEmbeddings, and pageEmbeddings are used for ranking.
These are the two most important:
- siteFocusScore is just what it sounds like: how much a site is focused on a specific topic.
- siteRadius measures how far a page’s topic is from the website’s overall topic because every website is assigned an identity.
Takeaway: Basically, if you are a residential roofing company in Illinois, only talk about residential roofing in Illinois. Not commercial roofing and not national roofing. If you offer other services such as windows, that has nothing to do with roofing, so consider creating another website for windows. You can link from the roofing site directly to your window website. This tells Google that you do it but it isn’t your main focus, roofing is your main focus. Windows is just a side service you offer to customers who hire you for roofing projects.
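To make the idea concrete, here is a minimal sketch of how a focus score and a page radius could be computed from embeddings. The leak names the attributes but does not publish their formulas, so everything here is an assumption: the site embedding is taken as the average of the page embeddings, the radius as cosine distance from that average, and the toy 2-D vectors are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Average the page vectors (assumed stand-in for a site embedding)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def page_radius(page_vec, site_vec):
    """Assumed radius: cosine distance of one page from the site's centroid."""
    return 1.0 - cosine_similarity(page_vec, site_vec)

def site_focus(page_vecs):
    """Assumed focus score: mean similarity of every page to the centroid."""
    site_vec = centroid(page_vecs)
    return sum(cosine_similarity(p, site_vec) for p in page_vecs) / len(page_vecs)

# Toy 2-D "embeddings" (made up): axis 0 = roofing, axis 1 = windows.
roofing_only = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]]
mixed_site = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # last page is about windows

print("focus, roofing only:", round(site_focus(roofing_only), 3))
print("focus, mixed site:  ", round(site_focus(mixed_site), 3))
print("radius of windows page:", round(page_radius([0.0, 1.0], centroid(mixed_site)), 3))
```

In this toy setup the roofing-only site scores a higher focus than the mixed site, and the windows page sits much farther from the site's centroid than the roofing pages do. That is the intuition behind keeping off-topic services on a separate website.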
3. Content Cluster Testing
Google calls this Host NSR (Normalized Site Rank) or siteChunk within the leaked docs. It is a site-wide ranking factor that takes random clusters of content from across your entire website and measures how well they stay on topic. We already knew Google did this within pages, but now we know it measures it across all your pages as well.
Takeaway: A blog or webpage should be as long as it needs to be. Don’t add a bunch of fluff or loosely related stuff to try and reach a certain word count. NSR will hurt you if you do this.
4. How Clicks Affect Re-Ranking
NavBoost is a re-ranking system: after Google initially scores your page, it re-scores it based on clickthrough rates and user behavior. Google has denied this many times, but NavBoost and its use of click data came up in testimony during Google’s antitrust trial. If clicks feed ranking, then clicks you buy through the Google Ads platform may help your organic position in search results as well.
The most interesting part mentioned in this section of the leak is that Google’s Chrome browser not only collects this data but uses it as well to display results.
Takeaway: Google scores you on a topic along with thousands of others. This narrows it down to several hundred pages to show at the top of the search results. If multiple scores are the same, it will determine the winner of that position by clickthrough rates. So following all the principles of a great webpage and distributing it where chances of clicks are higher, including Google’s top ad spots, will improve your position in search results.
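The tie-breaking idea in this takeaway can be sketched in a few lines. This is only an illustration of the logic as I described it, not NavBoost itself: the Result class, the score values, and the click numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    score: float       # hypothetical initial relevance score
    clicks: int        # hypothetical click count
    impressions: int   # hypothetical number of times the result was shown

    @property
    def ctr(self) -> float:
        """Clickthrough rate; 0.0 when the result was never shown."""
        return self.clicks / self.impressions if self.impressions else 0.0

def rerank(results):
    """Order by initial score first; CTR only decides between equal scores."""
    return sorted(results, key=lambda r: (r.score, r.ctr), reverse=True)

results = [
    Result("a.example/roofing", 0.90, 30, 1000),
    Result("b.example/roofing", 0.90, 80, 1000),
    Result("c.example/roofing", 0.95, 10, 1000),
]

ordered = [r.url for r in rerank(results)]
print(ordered)
```

Here c.example stays on top because its initial score is highest, while b.example beats a.example for the next spot because the two are tied on score and b.example has the better clickthrough rate.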
5. Domain Age
Nothing in the leak shows a website’s age being considered directly in rank scoring, but hostAge is mentioned in connection with a sandbox: the data is used to sandbox fresh spam during serving time.
It’s interesting because many SEO experts argue about the importance of domain age. As far as the leak is concerned, the sandbox is for spam and domain age doesn’t matter.
Takeaway: My thought is that websites move from one domain to another and get revamped constantly, which is why Google doesn’t pay much attention to age on its own. The exception is how age relates to spam. I believe the score depends on how long the domain has been active and the website publicly live, compared to how much spam is attached to it. A website that has been up for 5 years with a spam score of 25% is worse off than a website that has been up for 1 year with a spam score of 3%.
6. Archived Versions of Your Website
Google keeps 20 archived versions of your website, the leak revealed. If you update a page, wait for Google to crawl it, and then repeat the process 20 times, you can actually clear the memory of any old versions you don’t want indexed.
This is good to know because Google uses historical versions of your website as a scoring factor, although we don’t know whether an update has to be significant to count as a “new version”.
Takeaway: There is no definition of what Google considers a significant change or what type of changes count as a new version. We don’t know if this Google memory score is only for sitewide changes or on a page-to-page basis. If there is something hurting your rankings, you may want to test this but if you keep your website content fresh and publish often, it should reset your score frequently anyhow.
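If the 20-version limit works like a fixed-size history, the "update it 20 times" trick is easy to picture. Below is a small sketch under two assumptions that the leak does not confirm: that history is kept per page, and that any changed content counts as a new version. The PageHistory class is hypothetical.

```python
from collections import deque

MAX_VERSIONS = 20  # the number of archived versions cited from the leak

class PageHistory:
    """Hypothetical per-page history that keeps only the newest versions."""

    def __init__(self):
        # A deque with maxlen silently drops the oldest entry when full.
        self.versions = deque(maxlen=MAX_VERSIONS)

    def record_crawl(self, content: str) -> None:
        # Assumption: any change at all counts as a new version.
        if not self.versions or self.versions[-1] != content:
            self.versions.append(content)

history = PageHistory()
history.record_crawl("old content you want forgotten")

# Update the page and let it be crawled 20 more times.
for i in range(1, 21):
    history.record_crawl(f"revision {i}")

print(len(history.versions))
print("old content you want forgotten" in history.versions)
```

After the 20th new revision, the oldest stored version is the first revision, and the original content has been pushed out of the history.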
7. Google’s Search Ranking System
Google’s primary scoring system is called Mustang. It scores your webpage content based on eight factors.
- Freshness Twiddler
- Fresh Docs
- Homepage Trust
- Document Length
- Navboost
- Average Term Weight
- Short Content Score
- Title Match Score
Takeaway: The five most interesting takeaways are:
- Links from newer pages have a higher ranking weight than those from older pages.
- Google has a maximum length and word count, so put your most important information early on.
- Emphasis on text such as bolding not only boosts the attention of the reader but Google as well for that term or phrase.
- Clickthrough rate is a re-ranking factor based on user behavior after your webpage or blog post was already scored.
- Short content is not always considered thin content so quality is king over length.
8. Length Doesn’t Always Matter
As mentioned in the Mustang scoring system, short content doesn’t equal thin content. Long copy can also be considered thin if you use a lot of fluff text, repeat yourself, and go off-topic.
Takeaway: Your content should be as long as it needs to be to explain the topic fully. If you are providing a description or meaning of something, a paragraph could be enough for Google to rank it highly. If it’s a complex thing you are trying to describe, maybe a full page will be necessary to accurately and fully explain it.
9. Link Freshness
Fresh links trump older links, according to freshdocs, a link value scorer. We don’t know how large the weight difference is.
Takeaway: You still should include links from old pages to new, and new pages to old, as long as they are relevant and not over-linked. It will help with traffic and is still an element of a high-value page.
10. Site-Wide Ranking
We were always told that search engines rank webpages, not websites. That never made much sense given what we know about topical authority, trust, internal linking, and the fact that your home page usually generates the most organic traffic.
Here are some ranking factors that stood out in the NsrNsrData document.
- titlematchScore: A sitewide title match score that tells how well titles match what the user searched for.
- pnavClicks: We were told that Google ignores your header and footer links, but this primary navigational factor might tell us otherwise because it may be paying attention to which menu links get the most clicks.
- chromeInTotal: Site-wide Chrome views. Google uses your Chrome browser’s data to determine some site-wide scoring.
- chardVariance and chardScoreVariance: Google may be predicting site and/or page quality based on your previous content, so consistency is key.
Actionable Advice to Consider
Here is some actionable advice that you can think about and implement today to improve your Google rankings based on what we learned from the leak.
Remove poorly performing pages. If the user metrics are bad, no links point to the page, and the page has had plenty of opportunity to thrive, then that page should be eliminated. Quality content is still king!
Site-wide ranking factors and scoring averages are mentioned throughout Google’s leaked docs, and it is just as valuable to delete the weakest links as it is to optimize your new article (with some caveats of course).
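Here is a rough sketch of why pruning can matter when site-wide averages are in play. The Page fields, the quality numbers, and the thresholds are all made up for illustration. Google does not expose a per-page quality score, so in practice you would substitute your own analytics data.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Page:
    url: str
    quality: float      # hypothetical 0-1 score built from your own user metrics
    inbound_links: int  # links pointing at the page
    age_days: int       # how long the page has been live

def prune_candidates(pages, min_quality=0.3, min_age_days=180):
    """Pages with bad metrics, no links, and plenty of time to prove themselves."""
    return [
        p for p in pages
        if p.quality < min_quality
        and p.inbound_links == 0
        and p.age_days >= min_age_days
    ]

def site_average(pages):
    """Simple site-wide average of the hypothetical quality score."""
    return mean(p.quality for p in pages)

pages = [
    Page("/roof-repair", 0.8, 12, 400),
    Page("/roof-cost-guide", 0.7, 5, 300),
    Page("/old-promo", 0.1, 0, 700),
    Page("/new-post", 0.2, 0, 20),   # weak so far, but too new to judge
]

to_remove = prune_candidates(pages)
kept = [p for p in pages if p not in to_remove]

print([p.url for p in to_remove])
print(round(site_average(pages), 3), "->", round(site_average(kept), 3))
```

Only the old promo page gets flagged, because the new post has not had a fair chance yet. Dropping that one page lifts the site-wide average in this example, which is the same effect I am describing for site-wide scoring.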
Moving forward you should keep the length and character count of a new blog post or web page to what it needs to be in order to fully explain the topic. Stay focused, eliminate fluff, and stay frequent with publishing new content and updating old.
Also to boost your rankings, you can run Google Ad campaigns for new content to buy some traffic and earn some quick clicks. Have strong headings & subheads, build fresh links, and bold important terms & phrases related to your focus keyword(s).
Your home page (AKA index page) is the most important page on your website because it acts as your table of contents, branding page, an overview of what your website is all about, directs users and crawlers where to go, and lets viewers know what the next steps are.