What Your Devs Forgot to Hide: Finding Public Staging Environments
Most teams hunt for staging sites the wrong way. The real leaks hide on Vercel previews, Netlify subdomains, and Heroku instances Google already indexed — and no obscure URL will keep them out. Here is how to find every one.
Go to Google right now and type site:azurewebsites.net. You will see over 5 million results. Try site:vercel.app or site:amplifyapp.com and you'll find millions more. These aren't intentional public websites. They are auto-generated staging environments, QA servers, and developer pull-request previews that were never meant to see the light of day. And they represent a massive SEO and security blind spot.
Most technical SEOs hunt for these leaks the wrong way. They type site:dev.yourdomain.com, see nothing, and close the tab feeling reassured. That habit is roughly a decade behind how modern web teams actually ship code.
The real leaks don't live on a subdomain of your production site anymore — they live on auto-generated cloud URLs that never touched your DNS. CI/CD platforms ship a fresh preview URL for every pull request. Vercel, Netlify, Heroku, AWS Amplify, Pantheon, WP Engine — each one generates real, browsable, content-identical copies of your site sitting on random subdomains. And every one of them can be indexed.
Security by obscurity has been dead for a decade, but the industry keeps pretending otherwise. A URL like feature-nav-update-v3-xkj82.vercel.app feels private because it looks like random noise. It isn't. The moment a developer opens that URL in a non-incognito Chrome window, Googlebot learns it exists — and from that point on, it's a race between you and the crawler.
The operators used in this guide
This article leans on site:, inurl:, intitle:, and exact-match quotes. If any are unfamiliar, the Operator Reference has one-line definitions and examples. The Query Builder assembles them visually if you'd rather skip the syntax entirely.
How Google finds the "unguessable" URLs
Discovery is passive, not active. You don't have to submit a URL for Google to find it — you just have to leak it into any system Google can see. The modern web has a lot of those systems.
- Chrome user data. A developer opens the preview URL without incognito mode. Chrome's Safe Browsing checks and usage metrics can surface the URL to Google, and it may be queued for crawl within hours.
- Slack and Teams link unfurlers. These services fetch the URL to generate the rich card that appears in chat, and many chat and collaboration tools also run pasted URLs through Google-owned safety services — Safe Browsing and VirusTotal — for phishing and malware checks before rendering the preview. Those lookups put the URL straight into Google's observable surface.
- GitHub, Jira, and public issue trackers. Pull request comments, ticket descriptions, and wiki pages routinely paste staging URLs inline. If the repo is public, so is the URL. Googlebot crawls popular GitHub issues within the day.
- CI/CD integration bots. Vercel, Netlify, and most cloud platforms post preview URLs directly into PR comments as part of their GitHub App. Every public PR is a billboard.
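Every vector above boils down to the same failure: a preview URL pasted as plain text somewhere a bot can read it. You can catch many leaks before Google does by scanning PR comments, chat exports, or ticket dumps yourself. A minimal sketch in Python — the host list is a subset and the function name is illustrative, not from any particular tool:

```python
import re

# Hosts that auto-generate public preview URLs. This subset is
# illustrative -- extend it to the platforms your team actually uses.
PREVIEW_HOSTS = [
    "vercel.app", "netlify.app", "pages.dev", "herokuapp.com",
    "amplifyapp.com", "azurewebsites.net", "appspot.com", "web.app",
]

# Match any URL whose host ends in one of the preview domains.
_PREVIEW_RE = re.compile(
    r"https?://[\w.-]+\.(?:"
    + "|".join(re.escape(h) for h in PREVIEW_HOSTS)
    + r")\S*",
    re.IGNORECASE,
)

def find_preview_urls(text: str) -> list[str]:
    """Return every cloud-preview URL pasted into the given text."""
    return _PREVIEW_RE.findall(text)

comment = "Deployed! Preview: https://feature-nav-update-v3-xkj82.vercel.app/checkout"
print(find_preview_urls(comment))
# -> ['https://feature-nav-update-v3-xkj82.vercel.app/checkout']
```

Run it over a dump of recent PR comments or exported Slack history; any hit is a URL that a link unfurler or public tracker may already have handed to Google.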
Reverse footprint hunting
Stop searching for URLs. Start searching for the DNA of your site.
Every website has fingerprints — footer copy, unique asset paths, custom class names, a specific way of phrasing a product tagline. Search for those fingerprints and explicitly exclude your production domain. Whatever remains is a copy of you that shouldn't exist.
Exact-match text from footer or TOS
Pick the single most unique sentence on your site. The trademark line. The registered-address disclosure. A product tagline nobody else would ever write.
"© 2026 YourBrand Inc. All rights reserved. Registered in Delaware." -site:yourbrand.com -site:facebook.com -site:linkedin.com
The more specific, the better. A generic phrase like "Made with love" drowns you in false positives. A full legal disclosure with a state name and a corporate suffix usually returns between zero and five results — and anything that does come back is either staging, a mirror, or a scraped copy worth investigating. The social network exclusions strip out LinkedIn and Facebook pages that automatically mirror your copyright line in their footer embed.
Brand + dev placeholder text
Staging sites are often incomplete builds. Developers frequently use "Lorem ipsum" or obvious test strings on pages that accidentally get pushed to a live preview URL. Combining your exact brand name with common developer placeholder text is highly effective:
intitle:"YourBrand" ("Lorem ipsum" OR "test product") -site:yourbrand.com
If Google indexes a page that has your brand in the title but contains raw placeholder text, it's almost certainly an exposed dev environment or a forgotten draft template.
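Both fingerprint queries share one shape: an exact-match phrase plus a chain of -site: exclusions. If you audit several brands or domains, it's worth generating them instead of hand-typing. A rough Python helper — the function name and default exclusions are my own choices, not a standard:

```python
def fingerprint_query(phrase, own_domains,
                      extra_exclusions=("facebook.com", "linkedin.com")):
    """Build an exact-match Google query that excludes your own properties.

    own_domains: your production domains; extra_exclusions: social sites
    that mirror your footer copy in embeds (defaults are illustrative).
    """
    exclusions = " ".join(
        f"-site:{d}" for d in list(own_domains) + list(extra_exclusions)
    )
    return f'"{phrase}" {exclusions}'

q = fingerprint_query(
    "© 2026 YourBrand Inc. All rights reserved. Registered in Delaware.",
    ["yourbrand.com"],
)
print(q)
```

Feed it your trademark line, your legal disclosure, and your tagline in turn, and paste each result into Google.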
The auto-generated cloud trap
Every merge, every feature branch, every experimental spin-up generates a new live URL. Most engineering teams have between 20 and 200 active preview deployments at any given moment. Almost nobody keeps a list.
Hit every major staging platform at once
The list of hosts that auto-generate preview URLs is long. Any PaaS that ships a fresh URL per deployment is a candidate, and a single engineering team often pushes to three or four of them in parallel without realizing the previews are public. The table below is the working list — every domain on it has shown up in real SEO audits as a source of indexed staging content.
| Category | Platform | Preview domain |
|---|---|---|
| Frontend / JAMstack | Vercel | vercel.app |
| Frontend / JAMstack | Netlify | netlify.app |
| Frontend / JAMstack | Cloudflare Pages | pages.dev |
| Frontend / JAMstack | GitHub Pages | github.io |
| Frontend / JAMstack | GitLab Pages | gitlab.io |
| Frontend / JAMstack | Deno Deploy | deno.dev |
| Fullstack PaaS | Heroku | herokuapp.com |
| Fullstack PaaS | Railway | railway.app |
| Fullstack PaaS | Render | onrender.com |
| Fullstack PaaS | Fly.io | fly.dev |
| Hyperscaler PaaS | AWS Amplify | amplifyapp.com |
| Hyperscaler PaaS | Azure App Service | azurewebsites.net |
| Hyperscaler PaaS | Google App Engine | appspot.com |
| Hyperscaler PaaS | Google Cloud Run | run.app |
| Hyperscaler PaaS | Firebase Hosting | web.app, firebaseapp.com |
| Hyperscaler PaaS | Cloudflare Workers | workers.dev |
| Managed WordPress | Flywheel | flywheelsites.com |
| Managed WordPress | WP Engine | wpengine.com |
| Managed WordPress | Pantheon | pantheonsite.io |
| Managed WordPress | Kinsta | kinsta.cloud |
Google caps a single query at roughly 32 terms — not enough to fit every platform above in one OR chain. Split the list into two batches — frontend plus fullstack hosts in one, hyperscaler PaaS plus managed WordPress in the other — and run them back-to-back. The parentheses around each OR group are mandatory. Without them, Google applies site: only to the adjacent operand and treats the remaining terms as ordinary web searches across the entire internet.
"YourBrand" (site:vercel.app OR site:netlify.app OR site:pages.dev OR site:github.io OR site:gitlab.io OR site:herokuapp.com OR site:fly.dev OR site:onrender.com OR site:railway.app OR site:flywheelsites.com) -site:yourbrand.com
"YourBrand" (site:amplifyapp.com OR site:azurewebsites.net OR site:appspot.com OR site:run.app OR site:web.app OR site:firebaseapp.com OR site:workers.dev OR site:pantheonsite.io OR site:wpengine.com OR site:kinsta.cloud) -site:yourbrand.com
Run both. Look at the combined result count. Anything above zero means there are preview URLs in the index right now. Expect more hits on Batch 1 — JS frontend previews are the biggest leak source because every pull request gets its own URL by default.
Pro tip: bypassing the duplicate filter
Platforms like Vercel often generate multiple alias URLs for the exact same deployment. Because the content is identical, Google aggressively hides them as "omitted results." To see the true scale of your leak, append &filter=0 to the end of your Google search URL. This forces Google to show every single preview link it has indexed.
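The 32-term cap, the batching, and the &filter=0 trick are all mechanical enough to script. Here's a sketch in Python that splits any host list into cap-safe OR groups and emits ready-to-paste search URLs with the duplicate filter disabled — the per-term cost is a rough heuristic of mine, and the batch boundaries it picks may differ from the manual two-batch split above:

```python
from urllib.parse import quote_plus

HOSTS = [
    "vercel.app", "netlify.app", "pages.dev", "github.io", "gitlab.io",
    "herokuapp.com", "fly.dev", "onrender.com", "railway.app",
    "flywheelsites.com", "amplifyapp.com", "azurewebsites.net",
    "appspot.com", "run.app", "web.app", "firebaseapp.com",
    "workers.dev", "pantheonsite.io", "wpengine.com", "kinsta.cloud",
]

def batched_queries(brand, own_domain, hosts=HOSTS, max_terms=32):
    """Split the host list into OR groups that stay under Google's
    ~32-word query cap. Heuristic: each host costs two words
    (site: operand + OR); reserve four for brand, exclusion, and slack."""
    per_batch = (max_terms - 4) // 2
    queries = []
    for i in range(0, len(hosts), per_batch):
        group = " OR ".join(f"site:{h}" for h in hosts[i:i + per_batch])
        queries.append(f'"{brand}" ({group}) -site:{own_domain}')
    return queries

def search_url(query):
    """Turn a query into a Google URL with the duplicate filter disabled."""
    return f"https://www.google.com/search?q={quote_plus(query)}&filter=0"

for q in batched_queries("YourBrand", "yourbrand.com"):
    print(search_url(q))
```

Paste the printed URLs straight into the browser; filter=0 is already appended, so omitted duplicate previews are shown.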
The robots.txt paradox
Developers reach for the same fix over and over again. "Let's just drop Disallow: / in staging's robots.txt — that'll hide it from Google." It sounds defensive. It's actually catastrophic if the URL has already been discovered.
Google discovers the URL through one of the vectors above. Before you can add a noindex tag to the page, someone ships the robots.txt update with Disallow: /. Now Googlebot can't fetch the page at all — which means it can't see the noindex directive either, and the URL sits in the index permanently, labeled with exactly this string in Search Console:
Indexed, though blocked by robots.txt
It still appears in search results. It still competes with your production page for rankings. Google just can't see its content anymore, so it shows the URL with no snippet — a ghost result that's nearly impossible to evict without filing a manual URL removal through Search Console. And that removal is temporary: six months later, the ghost comes back.
Don't rely on robots.txt to hide staging
Three methods actually hide a staging environment from Googlebot. HTTP Basic Auth — the server requires a password before serving any content. Cloudflare Access or Zero Trust — an identity gate in front of the origin. IP whitelisting at the network layer. Everything else — Disallow: /, obscure URL slugs, noindex meta tags added after the fact — is theater.
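The distinction is observable from the outside: real protection means Googlebot gets no content at all, while theater serves a 200 and hopes. This small classifier sketches the decision table from the paragraphs above — the function name and the labels are mine, not a standard taxonomy:

```python
def staging_protection(status_code, has_noindex_meta, robots_disallows_all):
    """Classify how well a staging URL is actually hidden from Googlebot.

    Real protection (Basic Auth, Cloudflare Access, IP whitelisting) shows
    up as a 401/403: the crawler never sees content. Everything else is
    some grade of theater.
    """
    if status_code in (401, 403):
        return "protected"    # auth gate or IP block: nothing to index
    if status_code == 200 and robots_disallows_all:
        # The paradox: Googlebot can't fetch the page, so it can't see a
        # noindex tag either -- the URL can stay indexed as a ghost result.
        return "ghost-risk"
    if status_code == 200 and has_noindex_meta:
        return "weak"         # works only while robots.txt allows the crawl
    return "exposed"

print(staging_protection(200, True, True))   # noindex hidden behind Disallow: /
print(staging_protection(401, False, False)) # Basic Auth in front
```

Fetch each discovered staging URL yourself (curl works) and feed the status, meta tag, and robots.txt state through the table; anything short of "protected" needs a real gate.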
Treat this as a snapshot, not surveillance
Reverse footprint hunting is powerful, but it's not a real-time monitor. These queries show you what Google has already crawled and decided to keep — nothing more, nothing less. They lag behind fresh deployments by roughly 24 to 72 hours, and they can't see anything Googlebot couldn't see in the first place.
That means you'll catch the staging copies that have been sitting in the index for weeks or months, plus anything that ranked well enough to surface in the first 100 results for your brand terms. You won't catch a Vercel preview that went live two hours ago. And you won't catch anything behind HTTP Basic Auth, Cloudflare Access, or an IP whitelist — those URLs are invisible to Google by design, which is exactly why those three methods are the only real defenses.
So treat this as a recurring audit, not a one-shot fix. Run the cloud-preview hunter monthly at minimum. Teams shipping preview URLs daily should run it weekly and bookmark the query right next to their analytics dashboard. And pair it with Google Search Console's Coverage report — it's a first-party data source you already have access to, and it picks up URLs from your own sitemap submissions that operator-based searches can miss entirely.
If you already read Uncovering Orphaned Index Bloat, treat this article as the cloud-native extension of that methodology. That one covered staging subdomains on your own DNS. This one covers the part that isn't on your DNS at all.
Bonus: enumerate every subdomain on your own DNS
Everything above targets cloud URLs that never touched your DNS. For full coverage, you should also audit the subdomains you do own — and Google's site: operator supports a wildcard that turns this into a single query.
site:*.example.com -site:www.example.com
The wildcard * matches any subdomain segment. Excluding www strips your production site out of the results, leaving only the subdomains you may have forgotten — dev., staging., qa., admin., beta., that abandoned 2019 marketing microsite nobody remembered to take down.
For an even narrower view — only subdomains that look like dev environments — chain a few inurl: filters together:
site:*.example.com (inurl:dev OR inurl:staging OR inurl:qa OR inurl:test OR inurl:beta OR inurl:uat)
Pair this with the cloud-native queries above and you have full coverage. The on-DNS audit catches subdomains your team hosted under your own brand. The cloud-provider hunter catches everything they pushed to Vercel and Netlify. Together they map every public surface Google can see.
Quick reference
"© 2026 YourBrand Inc." -site:yourbrand.com

"YourBrand" "Lorem ipsum" -site:yourbrand.com

"YourBrand" (site:vercel.app OR site:netlify.app OR site:pages.dev OR site:github.io OR site:gitlab.io OR site:herokuapp.com OR site:fly.dev OR site:onrender.com OR site:railway.app OR site:flywheelsites.com) -site:yourbrand.com

"YourBrand" (site:amplifyapp.com OR site:azurewebsites.net OR site:appspot.com OR site:run.app OR site:web.app OR site:firebaseapp.com OR site:workers.dev OR site:pantheonsite.io OR site:wpengine.com OR site:kinsta.cloud) -site:yourbrand.com

site:*.example.com -site:www.example.com

site:*.example.com (inurl:dev OR inurl:staging OR inurl:qa)

Audit your own cloud leaks in under five minutes
Open the Query Builder, pick Google, and drop in your brand name alongside your production domain. The builder assembles the cloud-provider OR group, appends the right exclusions, and hands you a copy-ready query. Paste it into Google, scan the first page of results, and you'll know within minutes whether your team has been leaking previews into the index.
Open Query Builder