
Six Google Quality Signals You Can Audit With Search Operators

Six Google ranking signals — sitewide authority, topical focus, freshness honesty, content originality, effort estimation, AI-content correlates — turned into operator audits you can run in a browser tab in five minutes, with worked calculations and live examples instead of theory.

11 min read

Six names. Six audits. Each one runs in a browser tab, on a real domain, in under five minutes — and reads a Google ranking signal that doesn't show up in any analytics dashboard.

The names — siteAuthority, siteFocusScore/siteRadius, lastSignificantUpdate, OriginalContentScore/copycatScore, contentEffort, ractorScore — come from internal Google documentation that surfaced publicly in 2024. They sit inside the scoring stack that decides whether an indexed page outranks the next one. The scores themselves aren't queryable. Their shadows are. Six audits, every one runnable on your own domain, on a competitor, or on any URL in the index — without a paid backlink suite, a log-file parser, or a Search Console export.

Build any of these audits visually

Each query below can be assembled in the Query Builder — pick operators from a list, get a live preview, copy the search link without typing a single colon.

1. siteAuthority: triangulating a sitewide score Google said didn't exist

From 2009 through 2024, Google's public position was that no sitewide authority signal existed. Mueller, Illyes, Cutts before them — all on the record, repeatedly. The leak named the attribute: siteAuthority, applied as a sitewide multiplier before any per-page ranking starts. Operators don't read it. Three independent axes triangulate it.

Axis 1 — index footprint over time

A 30-year archive carries weight a 30-month archive can't fake. The probe:

site:nytimes.com before:2010-01-01

The New York Times' pre-2010 indexed footprint runs into the millions. Run the same probe against any competitor's domain and the comparison is immediate. A B2B SaaS blog launched in 2018 has no pre-2010 footprint at all and only a few thousand indexed pages in total; a regional daily founded in 2003 returns tens of thousands before the cutoff. The absolute number isn't the point; the ratio between the audited domain and three named competitors decides who's heavier.

Narrow into one-year windows and run the same probe across two adjacent years to read consistency:

site:example.com after:2023-01-01 before:2024-01-01
site:example.com after:2024-01-01 before:2025-01-01

Compare the two counts. 200 indexed pages one year and 30 the next means publishing has all but stopped, which is a clear collapse signal. The simple rule: a site that publishes a little every week looks better to Google than a site that published a lot once and then went silent.
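If you run this comparison regularly, a few lines of Python keep the windows and the arithmetic consistent. A minimal sketch, assuming you read the result counts off each SERP by hand and paste them in; the domain and counts below are placeholders:

from urllib.parse import quote_plus

def year_window_query(domain, year):
    # One-year index-footprint window: pages dated inside the calendar year.
    return f"site:{domain} after:{year}-01-01 before:{year + 1}-01-01"

def search_url(query):
    # Search link you can open in a browser tab.
    return "https://www.google.com/search?q=" + quote_plus(query)

domain = "example.com"                     # placeholder domain
for year in (2023, 2024):
    print(search_url(year_window_query(domain, year)))

count_prev, count_curr = 200, 30           # hypothetical counts read off the SERPs
print(f"year-over-year index ratio: {count_curr / count_prev:.2f}")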

Axis 2 — unlinked brand mentions as implicit authority

Linked references shape PageRank. Unlinked references shape siteAuthority — and unlinked references resist link-building manipulation, which is part of why Google weights them. The probe:

intext:"Stripe" -site:stripe.com

Stripe gets named in millions of pages without any link back. That's the implicit-authority surface. Run the same on a regional fintech startup and the count drops by three orders of magnitude. Three things to read: the count itself, the breadth of unique domains in the first five SERP pages (50 separate domains beats 10 from the same network), and the recency of dates. A B2B brand with 40,000 unlinked mentions across 2,000+ unique domains has implicit authority a link audit alone won't surface.
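The breadth check is easy to script once the SERP URLs are in hand. A minimal sketch that builds the probe and counts unique hostnames; the brand, owned domain, and sample URLs are placeholders you would replace with results copied from the first five pages:

from urllib.parse import quote_plus, urlparse

def unlinked_mention_query(brand, owned_domains):
    # Brand named in body text, with the brand's own properties excluded.
    exclusions = " ".join(f"-site:{d}" for d in owned_domains)
    return f'intext:"{brand}" {exclusions}'

query = unlinked_mention_query("Stripe", ["stripe.com"])
print("https://www.google.com/search?q=" + quote_plus(query))

# Result URLs copied by hand from the first five SERP pages (hypothetical sample):
urls = [
    "https://news.example.org/fintech-roundup",
    "https://blog.example.net/payments-stack",
    "https://news.example.org/api-comparison",
]
unique_hosts = {urlparse(u).netloc for u in urls}
print(f"{len(unique_hosts)} unique domains across {len(urls)} results")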

Axis 3 — outbound link quality (Bing only)

Google deprecated link: in 2017. Bing kept the inverse — linkfromdomain: — which lists outbound links from a domain. Authority sites link to authority neighbourhoods. Spam farms link to spam farms. Run this in Bing's search bar (the operator is paste-only, not part of the visual library):

linkfromdomain:bbc.co.uk

Scan the first three pages. Editorial outbound destinations dominate — Reuters, named publishers, government domains, university press releases. Run the same on a content-mill aggregator and the destinations skew toward expired-domain blogs and other aggregators. The pattern correlates with sitewide authority more reliably than any single backlink count.

You won't get a number. You'll get a relative position. That's enough to answer the only question agency pitches actually ask: is this client's domain above or below the three competitors they care about? The full Bing-operator picture is in Bing SEO Operators.

2. siteFocusScore + siteRadius: when topical expansion costs you

Two paired attributes. siteFocusScore measures how tightly a domain stays on a single topic. siteRadius measures how far individual pages drift from that core — high radius is a topical-drift penalty. Read together: high focus + low radius is strong topical authority. Low focus + high radius is the reason a SaaS blog that started writing personal-finance content lost half its rankings.

Real-world reference points make the scale obvious. Stack Overflow sits at near-extreme focus — almost every indexed page is a programming question or answer. Mayo Clinic is the same shape on the medical side. PubMed is the same shape on the academic side.

The operator measurement is a focus ratio: pages with the core term in the title, divided by total indexed pages on the domain. Worked example on a real domain. Take a single-product site like the official Tailwind CSS homepage:

site:tailwindcss.com intitle:"tailwind"

Run two queries: one with intitle:"tailwind" for the numerator, one without for the denominator. Read the result count under each search bar, then divide. The counts at the time of this audit:

site:tailwindcss.com intitle:"tailwind" → ~1,540 results
site:tailwindcss.com → ~1,790 results
focus ratio = 1,540 ÷ 1,790 = 0.86

Almost every page on the domain is about Tailwind. A diffuse general-purpose publisher running the same probe on any single topic gets a focus ratio of a fraction of one percent, because fiction, finance, recipes, and opinion all dilute it.

For a typical small local IT services company — a regional web development agency, a mobile app studio, or a small outsourcing firm with 100–500 indexed pages — the calculation works the same way, but the right ratio is computed across multiple service lines, not one. A single-term probe like intitle:"development" captures only one slice of what the firm actually sells. Run the probe once per core service term, plus once for the total:

site:your-agency.com intitle:"development"
site:your-agency.com intitle:"design"
site:your-agency.com intitle:"consulting"
site:your-agency.com

Sample numbers for a typical 280-page agency site. Divide each per-term count by the total to get a focus ratio per service line:

site:your-agency.com → ~280 results
site:your-agency.com intitle:"development" → ~95 results
site:your-agency.com intitle:"design" → ~42 results
site:your-agency.com intitle:"consulting" → ~28 results

development focus = 95 ÷ 280 = 0.34
design focus = 42 ÷ 280 = 0.15
consulting focus = 28 ÷ 280 = 0.10

Read the shape of the distribution, not a single number. Development is the dominant content footprint at 0.34; design and consulting are smaller specialties at 0.15 and 0.10. For a small services firm this is a healthy shape — one primary service line plus two complementary specialties. Per-line guidance: above 0.30 means strong content investment in that service, 0.10–0.30 is moderate (normal for multi-service firms), below 0.05 means token coverage that probably won't rank. If every line lands below 0.10, service pages are crowded out by blog, careers, and miscellaneous content — the drift signal that should trigger pruning.
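The same arithmetic, scripted so it can be re-run after every content prune. A minimal Python sketch using the sample counts above; the classification bands mirror the per-line guidance in this section, not any published Google threshold:

# Per-term result counts read off the SERPs by hand (sample figures from above).
counts = {"development": 95, "design": 42, "consulting": 28}
total = 280

def classify(ratio):
    # Bands mirror the per-line guidance above; 0.05-0.10 sits between the stated bands.
    if ratio > 0.30:
        return "strong content investment"
    if ratio >= 0.10:
        return "moderate (normal for multi-service firms)"
    if ratio >= 0.05:
        return "thin"
    return "token coverage, unlikely to rank"

for term, count in counts.items():
    ratio = count / total
    print(f"{term} focus = {count} / {total} = {ratio:.2f} ({classify(ratio)})")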

To name the off-topic pages that pulled the per-line ratios down, exclude every core service term at once:

site:your-agency.com -intitle:"development" -intitle:"design" -intitle:"consulting"

Typical surfaces on a small services site that's drifted off topic:

  • Generic productivity listicles. «10 best apps for remote teams», «Morning routines of successful founders» — written for SEO traffic, unrelated to anything the agency actually delivers.
  • Personal-opinion essays from leadership. CEO posts on entrepreneurship, life lessons, year-end retrospectives — interesting to the author, irrelevant to the buyer.
  • Off-topic industry round-ups. «Top 50 startups to watch in 2026», «Tech trends for the year» — copy-paste journalism with no connection to anything the firm actually sells.
  • Holiday and seasonal posts. «Christmas message from the team», «What we're thankful for this year» — natural for social channels, but they shouldn't live on the site as indexable pages.
  • Generic finance/legal/HR explainers. «What is an LLC», «How to write a job description» — broad SEO bait that signals to Google the domain isn't sure what it's about.

Every one of those competes against the firm's actual service pages for crawl budget and topical signal, and adds nothing to ranking on the queries that drive client revenue. Three actions follow. Noindex the lowest-effort listicles. Consolidate personal essays into a single «From the founder» hub at one URL. Move CEO opinion writing to a personal subdomain or a separate domain entirely — the «move drift to a subdomain» pattern below, applied to founder-led services firms.

Thresholds for reading the ratio on your own domain. Above 0.6 — most indexed pages share the core entity, the domain reads as topically focused. Between 0.3 and 0.6 — moderate focus, the domain covers several related themes with depth. Below 0.3 — diffuse, the core topic is one of many, and topical authority on it is hard to claim.

For drift detection, exclude your core term and its known adjacents. Buffer's platform centres on social media — scheduling, engagement, and platform-specific tooling:

site:buffer.com -intitle:"social" -intitle:"instagram" -intitle:"buffer"

The results surface Buffer pages beyond their core platform — content about remote work, salary transparency, company culture, and self-management essays. Substitute the exclusion terms for whatever your own domain's primary entity is. Three questions decide whether expansion is justified. Does it serve the same buyer? Does it use the same expertise? Does it strengthen or fragment the site's primary entity? Two or three yeses keep it. One or none: prune, redirect, or move to a subdomain.
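Generating the drift probe from a list of core terms is a one-liner worth keeping around. This sketch reproduces the Buffer query above and works for any domain and exclusion list you substitute:

from urllib.parse import quote_plus

def drift_query(domain, core_terms):
    # Indexed pages whose titles mention none of the core terms.
    exclusions = " ".join(f'-intitle:"{t}"' for t in core_terms)
    return f"site:{domain} {exclusions}"

query = drift_query("buffer.com", ["social", "instagram", "buffer"])
print(query)
print("https://www.google.com/search?q=" + quote_plus(query))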

3. lastSignificantUpdate: the fake "Updated 2026" trap

Google stores four dates per document, not one. bylineDate is what gets written in the byline. syntacticDate is parsed from URL patterns and structured-data fields. semanticDate is inferred from body content — references, statistics, dated examples. lastSignificantUpdate is the one that matters: the timestamp of the last meaningful change to the page, not the last cosmetic touch.

The most common freshness trick in the industry is changing «Updated 2025» to «Updated 2026» in the H1 and republishing. It doesn't work. lastSignificantUpdate stays anchored to the last real edit. bylineDate shifts. The two now disagree, which is itself a manipulation flag.

Find your own fake-updated pages:

site:example.com intitle:2026 -intext:2026

Pages with «2026» in the title but no occurrence of 2026 anywhere in the body — those are the cosmetic refreshes. Run a complementary probe on a high-output publisher to see the pattern at scale — pages with 2026 in the title where the body is still anchored in 2025:

site:clickup.com/blog intitle:2026 intext:"2025"

A live example surfaces directly. Open this ClickUp blog post — title currently reads as a 2026 article: https://clickup.com/blog/it-strategy-templates/. Now open the Wayback Machine snapshot from October 2025: https://web.archive.org/web/20251016213522/https://clickup.com/blog/it-strategy-templates/. Side-by-side diff: title got bumped, dated examples in the body still reference 2025, no new statistics, no new sections. lastSignificantUpdate stays anchored to the original publication; only bylineDate shifts.

Cross-check with before:/after: filters to flag last-modified vs publish-date desync:

site:example.com after:2026-01-01 intitle:"2025"

Pages stamped «2025» in the title but indexed in 2026 carry inconsistent date metadata that Google's date pipeline flags during normalisation.
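Both freshness probes follow a single template, so they are easy to regenerate for whichever year you audit. A minimal sketch; the domain is a placeholder and the year pair is an assumption you adjust per run:

def fake_update_query(domain, claimed_year):
    # Claimed year in the title, no occurrence of that year anywhere in the body.
    return f"site:{domain} intitle:{claimed_year} -intext:{claimed_year}"

def date_desync_query(domain, index_year, stale_title_year):
    # Pages Google dates in one year while the title still carries the previous year.
    return f'site:{domain} after:{index_year}-01-01 intitle:"{stale_title_year}"'

domain = "example.com"                     # placeholder domain
print(fake_update_query(domain, 2026))
print(date_desync_query(domain, 2026, 2025))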

Date manipulation is logged

Just changing the date in your H1 is the worst possible move. Google logs the rewrite without a meaningful body change as a manipulation signal — your lastSignificantUpdate stays old, but now your bylineDate is provably out of sync with semanticDate. Two flags from one shortcut.

An honest update changes five things: at least one refreshed example with a current date, one new section addressing a question that didn't exist when the post was written, one updated statistic with citation, an internal-link rewire to recently-published related pages, and at least one new external citation. That combination moves semanticDate forward, which moves lastSignificantUpdate with it.

Halfway through the cookbook

The remaining three signals — content originality, effort estimation, AI-content correlates — compound into the same CompressedQualitySignals bundle. Save these queries in the Builder as you go so you can re-run them on a schedule.

4. OriginalContentScore + copycatScore: scraper detection on real domains

Two attributes from the spam-related stack. OriginalContentScore applies primarily to shorter content and rewards uniqueness. copycatScore flags duplication patterns — including legitimate articles that read like syndicated boilerplate.

Pick an 8-word fragment from one of your articles. The fragment must be specific — no common phrases, no brand names, no obvious idioms. Quote it:

intext:"the eight-word fragment goes here exactly" -site:example.com

Three reading patterns. Zero results — the content is unique and the fragment was well-chosen. A handful of low-traffic scraper domains — expected, ignore unless they outrank you. A scraper domain ranking above the original — inverse signal that OriginalContentScore for your URL is below the scraper's, which usually means a canonicalisation or crawl-priority issue, not the scraper's superior content.
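Picking the fragment by hand works fine; if you want candidates pulled programmatically, here is a rough sketch that slices 8-word windows out of an article body and keeps the first one that isn't dominated by common words. The sample text, the stop-word heuristic, and the domain are all illustrative assumptions, not a vetted specificity test:

import re
from urllib.parse import quote_plus

COMMON = {"the", "a", "an", "of", "to", "and", "in", "is", "that", "for", "with"}

def candidate_fragments(text, size=8):
    words = re.findall(r"[A-Za-z0-9'-]+", text)
    for i in range(len(words) - size + 1):
        window = words[i:i + size]
        # Crude specificity filter: skip windows made mostly of common words.
        if sum(w.lower() in COMMON for w in window) <= size // 2:
            yield " ".join(window)

def originality_query(fragment, own_domain):
    return f'intext:"{fragment}" -site:{own_domain}'

article = "On this top, third level, is a private apartment built for Gustave Eiffel."  # placeholder body text
query = originality_query(next(candidate_fragments(article)), "example.com")
print(query)
print("https://www.google.com/search?q=" + quote_plus(query))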

Worked example on a real domain. Pick any moderately specific Wikipedia construction-and-history sentence — the kind tourist sites don't paraphrase:

intext:"On this top, third level, is a private apartment built for Gustave Eiffel" -site:wikipedia.org

Hundreds of mirror sites, Quora answers, and reference farms surface. None outrank Wikipedia for the head term «Eiffel Tower» — that's OriginalContentScore doing what it was built for.

The same pattern works on widely-quoted product taglines. React's canonical tagline appears on react.dev, in the npm package description, and in the opening paragraph of thousands of tutorials and «What is React?» articles:

intext:"A JavaScript library for building user interfaces" -site:react.dev -site:reactjs.org

The SERP fills with tutorial sites, course pages, blog intros, and aggregators quoting the line verbatim. None outrank react.dev for the head term «React» — that's OriginalContentScore working as designed across short, heavily-cited content. For news content, the same probe on a Reuters or AP wire sentence shows the syndication footprint — a single news fragment can be legitimately republished across 200+ outlets without the originals losing their position.

Three actions follow. Pursue DMCA takedowns when scrapers are commercial and dense. Add an explicit canonical and request indexing of the original when crawl priority is the real problem. Ignore the long tail — DMCA churn doesn't scale, and most low-traffic scrapers do no compounding damage. The originality methodology in Competitive Content Audit covers cross-checks for distinguishing real plagiarism from accidental phrase collisions.

5. contentEffort: LLM-based effort estimation

contentEffort is described as an LLM-based estimation of human work invested in a page — research depth, originality, structural complexity, presence of expertise markers. It feeds the Helpful Content System and propagates into CompressedQualitySignals. There's no operator that reads it directly. Four proxy queries get close.

Entity density

A genuine article on a topic mentions the entities a domain expert would mention. Pages that don't mention them read as thin to an LLM-based scorer. Worked example on a payments topic:

site:example.com intitle:"payments api" -intext:"Stripe" -intext:"Plaid" -intext:"ACH"

Articles claiming to cover the payments API space without naming Stripe, Plaid, or ACH read low-effort by definition. Substitute the entities for any topic. CRM article: -intext:"Salesforce" -intext:"HubSpot" -intext:"pipeline". Search-relevance article: -intext:"Okapi BM25" -intext:"vector" -intext:"embedding". Machine-learning intro: -intext:"PyTorch" -intext:"TensorFlow" -intext:"transformer".
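Because the exclusion list changes per topic, a small generator keeps the probes consistent across audits. A sketch assuming a hand-maintained topic-to-entities map; the entities are the ones named in this section and the domain is a placeholder:

def entity_gap_query(domain, title_term, expected_entities):
    # Pages claiming the topic in the title while naming none of the expected entities.
    exclusions = " ".join(f'-intext:"{e}"' for e in expected_entities)
    return f'site:{domain} intitle:"{title_term}" {exclusions}'

TOPIC_ENTITIES = {
    "payments api": ["Stripe", "Plaid", "ACH"],
    "crm": ["Salesforce", "HubSpot", "pipeline"],
    "machine learning": ["PyTorch", "TensorFlow", "transformer"],
}

for topic, entities in TOPIC_ENTITIES.items():
    print(entity_gap_query("example.com", topic, entities))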

For a high-effort reference point, look at how Wikipedia handles entity density on a focused technical topic:

site:wikipedia.org intitle:"Okapi BM25"

The body names every adjacent algorithm, every cited author, every benchmark dataset. That's the bar Google's content-effort scorer was trained on.

First-party language

First-party voice anchors are the cheapest effort signal to add and one of the strongest. Probe for them:

site:example.com (intext:"according to our analysis" OR intext:"we surveyed" OR intext:"in our testing")

Pages that lack any first-party voice anchor — no original survey, no internal data, no proprietary methodology — score low on effort by definition. Stack Overflow's accepted-answer voting, arXiv's citation chains, and every peer-reviewed publication's reference trail are first-party voice variants Google's scorer recognises.
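The voice-anchor probe is the same OR-grouped template every time. A minimal sketch that assembles it from a list of anchor phrases; the phrases are the ones in the query above and the domain is a placeholder:

def first_party_voice_query(domain, anchors):
    # Pages carrying at least one first-party voice anchor.
    grouped = " OR ".join(f'intext:"{a}"' for a in anchors)
    return f"site:{domain} ({grouped})"

anchors = ["according to our analysis", "we surveyed", "in our testing"]
print(first_party_voice_query("example.com", anchors))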

Schema and imagery

Run a site: query, open three results, check whether the page declares an Article schema with author, publisher, and datePublished. A page claiming to be a published article with no author block is one of the cleanest low-effort flags in the index. Original imagery sits outside operator scope — flag for manual check after the audit. Pages with stock photography only, versus pages with diagrams, screenshots, photographed evidence, or original charts. Shorthand: if every illustration could come from a stock library, contentEffort reads thin.
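The schema half of that spot-check can be scripted with the standard library alone. A rough sketch that assumes the page embeds its markup as JSON-LD in a script tag and flags any Article item missing author, publisher, or datePublished; the URL is a placeholder, and pages that declare schema another way won't be caught:

import json
import re
import urllib.request

def article_schema_gaps(url):
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    # Pull every JSON-LD block out of the page.
    blocks = re.findall(
        r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
        html, re.DOTALL | re.IGNORECASE)
    gaps = []
    for block in blocks:
        try:
            data = json.loads(block)
        except ValueError:
            continue
        for item in (data if isinstance(data, list) else [data]):
            if isinstance(item, dict) and "Article" in str(item.get("@type", "")):
                missing = [k for k in ("author", "publisher", "datePublished") if k not in item]
                gaps.append((item.get("@type"), missing))
    return gaps  # empty list: no Article schema declared at all

print(article_schema_gaps("https://example.com/blog/post"))  # placeholder URL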

Three actions follow the audit. Invest more — original research, primary data, expert quotes — when the page covers a high-value topic. Consolidate or redirect when the effort gap is large and the page's traffic is small. Prune outright when neither investment nor consolidation makes sense.

6. ractorScore: the underreported AI-content signal

The most underreported attribute in mainstream post-2024 coverage. Phrasing in adjacent modules connects it to effort, originality, and genuine-author signals. Semantic clustering puts it next to OriginalContentScore and contentEffort. The most defensible reading: an AI-content detector, weighted into CompressedQualitySignals at pre-flight.

No operator reads it. But six operator-detectable signals likely correlate with low ractorScore — the same patterns AI-content workflows produce as side effects:

  • Burst publishing patterns. Twenty articles in a single week from a site that previously published two a month.
  • Templated title and H2 structures. «Top N best X for Y in 2026» repeating across 200 articles with only the entity slot changing.
  • Date schema mismatches. datePublished identical to dateModified across hundreds of pages.
  • Synonym-templated language. The same sentence shape with one or two synonym swaps, repeated across articles.
  • Low entity density. Long pages that name no proper nouns, no products, no people, no places.
  • Suspicious before/after velocity. A long-dormant domain that suddenly publishes 500 pages in one quarter.

Three self-audit queries surface those patterns on your own domain. Publishing-rate spike check:

site:example.com after:2026-04-01 before:2026-04-30

A domain that returned 8 indexed pages last month and 240 this month is showing the kind of burst pattern a content-mill sweep flags.

Title-template detection — pick a formulaic prefix and count:

site:example.com intitle:"top 10 best"

A handful of matches is normal. Two hundred matches across one domain is a content-mill silhouette. The pattern is visible on every consumer-affiliate site that rode the AI-content wave through 2024 and 2025.

Entity-density gap — pages on a topic that don't name the obvious experts, products, or places:

site:example.com intitle:"machine learning" -intext:"PyTorch" -intext:"TensorFlow" -intext:"transformer"

A page about machine learning that names no frameworks, no architectures, no benchmarks reads as model-generated to both a human reader and an LLM-based scorer. Run the same three queries against a competitor and the AI-content patterns sitting in their index right now become legible.
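All three self-audit probes are parameterised by domain, month window, template prefix, and entity list, which makes them easy to regenerate on a schedule. A minimal sketch; every input value is a placeholder, and the month window ends on the first day of the following month rather than the last day of the month used above:

from datetime import date

def burst_query(domain, year, month):
    # One-calendar-month window; the end date is the first day of the following month.
    start = date(year, month, 1)
    end = date(year + (month == 12), month % 12 + 1, 1)
    return f"site:{domain} after:{start:%Y-%m-%d} before:{end:%Y-%m-%d}"

def template_title_query(domain, prefix):
    # Pages whose titles contain the same formulaic phrase.
    return f'site:{domain} intitle:"{prefix}"'

def entity_gap_query(domain, topic, entities):
    # Topic in the title, none of the expected entities in the body.
    exclusions = " ".join(f'-intext:"{e}"' for e in entities)
    return f'site:{domain} intitle:"{topic}" {exclusions}'

domain = "example.com"                     # placeholder domain
print(burst_query(domain, 2026, 4))
print(template_title_query(domain, "top 10 best"))
print(entity_gap_query(domain, "machine learning", ["PyTorch", "TensorFlow", "transformer"]))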

These six signals are not the model.

Hypothesis, not doctrine

Treat findings as flags, not verdicts. ractorScore is one signal in a system of signals — and the leak doesn't include the model weights.

Proximity-aware variants — pairing entities with expected co-occurring terms — refine the signal further; the underlying technique is in AROUND operator proximity search.

Quick reference: six signals, one table

Six rows. One operator template per signal. Total runtime: roughly twenty minutes for one domain.

#  | Attribute                            | Operator query template                                         | What you read
1  | siteAuthority                        | site: over time + intext:"brand" -site: + Bing linkfromdomain:  | Higher / equal / lower than competitor
2  | siteFocusScore + siteRadius          | site:example.com intitle:"core" ÷ total site:                   | Focus ratio (0–1)
3  | lastSignificantUpdate                | site:example.com intitle:2026 -intext:2026                      | Count of fake-updated pages
4  | OriginalContentScore + copycatScore  | intext:"unique fragment" -site:example.com                      | Scraper count + outranking flag
5  | contentEffort                        | Entity-density gap + first-party voice probes                   | Low-effort page count
6  | ractorScore                          | Burst-publishing + template-title + entity-gap combo            | AI-pattern flag count

Run the table monthly on your own domain. Run it once per pitch on a competitor. Run it before any major content investment to catch sitewide signal collapse early — by the time click data reflects the issue, the upstream damage has been compounding for weeks.

Build any of these six audits in seconds

Pick operators visually, get a live preview, copy the search link. The Query Builder runs every workflow in this article without typing a single colon.

Open Query Builder