# Google Indexing: The Developer's Guide to Search Console
You shipped a feature. The deploy succeeded. Users can reach the page. But Google doesn't know it exists.
Google indexing is the process by which Googlebot discovers, crawls, renders, and stores your pages in its search index. If a page isn't indexed, it doesn't rank. It doesn't appear in search results. It might as well not exist for the 8.5 billion daily searches happening on Google.
Most developers ignore indexing until something breaks. By then, pages have been invisible for weeks. This guide treats Google Search Console (GSC) like what it actually is: a monitoring system for your site's search infrastructure. We'll cover the data model, the common failure modes, and how to pull GSC data directly into your development workflow.
## Why Developers Should Care About Indexing
Here's the uncomfortable truth: your framework doesn't guarantee indexing.
Next.js with SSG? Pages can still be stuck in "Discovered - currently not indexed." A perfectly valid React SPA? Google might render it, or it might not. A site with 10,000 pages? Google's crawl budget means some pages will never be crawled at all.
Indexing is a distributed systems problem. Google allocates finite resources (crawl budget) across billions of pages. Your site competes for those resources. The quality of your technical implementation directly affects how much budget Google allocates to you — and whether crawled pages make it into the index.
Indexing and ranking are separate processes. A page must be indexed before it can rank, but being indexed doesn't guarantee a good position. Think of indexing as getting your data into the database. Ranking is the query that retrieves it.
## The Google Search Console Data Model
If you think about GSC like a database, the mental model clicks immediately.
### The Core Tables
URL Inspection is the single-row lookup. Given a URL, it returns:
```typescript
interface URLInspectionResult {
  indexingState: 'INDEXED' | 'NOT_INDEXED';
  crawlState: 'CRAWLED' | 'DISCOVERED' | 'NOT_FOUND';
  lastCrawlTime: string;    // ISO timestamp
  canonicalUrl: string;     // what Google chose as canonical
  userCanonical: string;    // what you declared
  robotsTxtState: 'ALLOWED' | 'BLOCKED';
  pageFetchState: 'SUCCESSFUL' | 'SOFT_404' | 'REDIRECT' | 'NOT_FOUND';
  verdict: 'PASS' | 'PARTIAL' | 'FAIL' | 'NEUTRAL';
  coverageState: string;    // "Submitted and indexed", "Crawled - currently not indexed", etc.
}
```

Search Analytics is the aggregated query table. It stores impressions, clicks, CTR, and average position — grouped by query, page, country, device, date, or any combination:
```typescript
interface SearchAnalyticsRow {
  keys: string[];       // dimension values, e.g. ["nextjs seo", "/blog/nextjs-seo"]
  clicks: number;
  impressions: number;
  ctr: number;          // 0.0 to 1.0
  position: number;     // average position in search results
}
```

Sitemaps is the registration table. It tracks which sitemaps you've submitted, how many URLs each contains, and when Google last processed them.
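Rows come back at whatever granularity you request, so per-page rollups are on you. One subtlety worth a helper: when you aggregate rows, average position must be weighted by impressions, not averaged naively. A sketch, assuming `keys[1]` holds the page as in the dimension example above:

```typescript
// Simplified row shape, mirroring the interface above.
interface SearchAnalyticsRow {
  keys: string[];
  clicks: number;
  impressions: number;
  ctr: number;
  position: number;
}

// Collapse query-level rows into per-page totals. Position is weighted
// by impressions; a naive mean over rows would overweight rare queries.
function aggregateByPage(
  rows: SearchAnalyticsRow[],
  pageKeyIndex = 1, // which element of `keys` holds the page (assumption)
): Map<string, { clicks: number; impressions: number; position: number }> {
  const acc = new Map<string, { clicks: number; impressions: number; posWeight: number }>();
  for (const row of rows) {
    const page = row.keys[pageKeyIndex];
    const a = acc.get(page) ?? { clicks: 0, impressions: 0, posWeight: 0 };
    a.clicks += row.clicks;
    a.impressions += row.impressions;
    a.posWeight += row.position * row.impressions;
    acc.set(page, a);
  }
  const out = new Map<string, { clicks: number; impressions: number; position: number }>();
  for (const [page, a] of acc) {
    out.set(page, {
      clicks: a.clicks,
      impressions: a.impressions,
      position: a.impressions > 0 ? a.posWeight / a.impressions : 0,
    });
  }
  return out;
}
```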
Coverage (Page Indexing) is the status report. It categorizes every known URL into one of these buckets:
- Valid — Indexed and available in search
- Valid with warnings — Indexed but with issues
- Excluded — Intentionally or unintentionally not indexed
- Error — Something is broken
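If you export URL-level data, a small helper can roll `coverageState` strings up into these four buckets. The string matching below is illustrative, not an official mapping; match against the exact strings your property reports:

```typescript
type CoverageBucket = "valid" | "valid-with-warnings" | "excluded" | "error";

// Map a coverageState string (as shown in GSC) to one of the four buckets.
// The matched substrings are assumptions based on common GSC labels.
function bucketFor(coverageState: string): CoverageBucket {
  const s = coverageState.toLowerCase();
  if (s.includes("indexed, though blocked")) return "valid-with-warnings";
  if (s.includes("submitted and indexed") || s === "indexed") return "valid";
  if (s.includes("error")) return "error";
  return "excluded"; // noindex, duplicates, crawled-not-indexed, etc.
}
```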
### The Relationships
These tables interconnect. A URL that shows "Crawled - currently not indexed" in Coverage will show indexingState: 'NOT_INDEXED' in URL Inspection. A page receiving impressions in Search Analytics is necessarily in the Valid bucket of Coverage.
```
Sitemaps → Coverage → URL Inspection
   ↘ Search Analytics (only for indexed pages)
```
The Search Console API gives you programmatic access to Search Analytics, URL Inspection, and sitemap management (listing, submitting, and deleting sitemaps). The Coverage report's per-URL categorization, however, is UI-only; you can't pull it in bulk via the API. This limitation is why most automated SEO workflows operate on partial data.
## The Five Indexing States (And What Causes Each)
Every URL Google knows about exists in one of these states. Understanding them is like understanding HTTP status codes — you need to know what each means to debug problems.
### 1. Discovered, Not Crawled
Google found the URL (via sitemap, internal link, or external link) but hasn't fetched it yet. This is a queue state. Common on large sites where crawl budget is constrained.
Fix: Improve crawl budget by fixing broken links, reducing redirect chains, and ensuring your server responds quickly.
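Redirect chains are a concrete, measurable part of that fix: each hop is a separate fetch for Googlebot. A minimal sketch that resolves a redirect map (assumed to be built from your server config or a crawl) to find chains worth collapsing:

```typescript
// Given a redirect map (from -> to), resolve a URL to its final target
// and count hops. Any chain with more than one hop should be collapsed
// into a single 301, since every hop costs a fetch.
function resolveChain(
  redirects: Map<string, string>,
  url: string,
  maxHops = 10, // bail out on loops or absurd chains
): { finalUrl: string; hops: number } {
  let current = url;
  let hops = 0;
  while (redirects.has(current) && hops < maxHops) {
    current = redirects.get(current)!;
    hops++;
  }
  return { finalUrl: current, hops };
}
```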
### 2. Crawled, Currently Not Indexed
Google fetched the page but decided not to add it to the index. This is the most frustrating state because it means Google evaluated your content and said "no." Thin content, duplicate content, and low perceived value are the usual causes.
Fix: See our dedicated guide on fixing crawled-not-indexed issues.
### 3. Indexed, Not Submitted in Sitemap
The page is indexed and appearing in search, but it wasn't in your sitemap. Not an error, but it means your sitemap is incomplete. Google found the page through crawling.
Fix: Add the URL to your sitemap. Accurate sitemaps help Google crawl more efficiently.
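Generating the sitemap from your route manifest at build time keeps it from drifting out of date. A minimal sketch of the XML itself (the sitemaps.org protocol only requires `<loc>`; `<lastmod>` is optional but helps Google prioritize recently changed pages):

```typescript
// Build a minimal sitemap from a list of URLs. Only <loc> is required;
// <lastmod> is included when provided.
function buildSitemap(urls: { loc: string; lastmod?: string }[]): string {
  const entries = urls
    .map((u) =>
      [
        "  <url>",
        `    <loc>${u.loc}</loc>`,
        ...(u.lastmod ? [`    <lastmod>${u.lastmod}</lastmod>`] : []),
        "  </url>",
      ].join("\n"),
    )
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</urlset>",
  ].join("\n");
}
```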
### 4. Submitted and Indexed
The ideal state. The page is in your sitemap and in Google's index.
### 5. Excluded (Intentional)
Pages excluded by noindex, robots.txt, canonical pointing elsewhere, or redirect. If intentional, no action needed. If accidental, you've found a bug.
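To tell intentional from accidental, a quick audit script can surface the two most common in-page culprits: a `noindex` robots meta tag and a canonical pointing elsewhere. A regex-based sketch (fine for an audit, not a substitute for a real HTML parser; it assumes the `name`/`rel` attribute comes before `content`/`href`):

```typescript
// Scan raw HTML for signals that exclude a page from the index.
// Returns a list of human-readable findings (empty means no signals found).
function exclusionSignals(html: string, pageUrl: string): string[] {
  const signals: string[] = [];
  // <meta name="robots" content="noindex, ...">
  const robots = html.match(/<meta[^>]+name=["']robots["'][^>]+content=["']([^"']*)["']/i);
  if (robots && /noindex/i.test(robots[1])) signals.push("meta noindex");
  // <link rel="canonical" href="..."> pointing at a different URL
  const canonical = html.match(/<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']*)["']/i);
  if (canonical && canonical[1] !== pageUrl) signals.push(`canonical -> ${canonical[1]}`);
  return signals;
}
```

Run it against pages that show up as Excluded: an empty result for an excluded page points at robots.txt or a redirect instead.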
## The Workflow Problem: Dashboard Context-Switching
Here's how most developers interact with GSC today:
1. Open search.google.com/search-console
2. Navigate to the right property
3. Click through to Page Indexing
4. Find the problem URL
5. Read the status
6. Switch to your editor
7. Try to remember what you just read
8. Make a fix
9. Go back to GSC
10. Request re-indexing
11. Wait 2-14 days
12. Check again
This is the same anti-pattern as checking your database by logging into phpMyAdmin instead of querying it from your application. The data exists. It's just trapped behind a GUI you have to manually operate.
## The GSC API: Programmatic Access
Google exposes two API endpoints that matter:
Search Analytics API — Query performance data (clicks, impressions, CTR, position) with filters and grouping. Limited to 25,000 rows per request, 500 requests per day, and data has a 2-3 day lag.
URL Inspection API — Check the indexing status of individual URLs. Limited to 2,000 inspections per day per property.
```typescript
// Search Analytics query — TypeScript with googleapis
import { google } from 'googleapis';

const searchconsole = google.searchconsole('v1');

const response = await searchconsole.searchanalytics.query({
  siteUrl: 'sc-domain:example.com',
  requestBody: {
    startDate: '2026-03-01',
    endDate: '2026-03-31',
    dimensions: ['page', 'query'],
    rowLimit: 1000,
    dimensionFilterGroups: [{
      filters: [{
        dimension: 'page',
        operator: 'contains',
        expression: '/blog/'
      }]
    }]
  }
});

// response.data.rows contains SearchAnalyticsRow[]
```

For a deep dive into the API, including OAuth2 setup and practical use cases, see our GSC API guide.
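URL Inspection uses the same client; the call itself is shown as a comment below since it needs an authenticated client to execute. What you can plan up front is quota: at 2,000 inspections per day per property, a bulk audit has to be chunked into day-sized batches:

```typescript
// URL Inspection via googleapis (sketch; requires an authenticated client):
//
//   const res = await searchconsole.urlInspection.index.inspect({
//     requestBody: {
//       inspectionUrl: 'https://example.com/blog/post',  // placeholder URL
//       siteUrl: 'sc-domain:example.com',
//     },
//   });
//   // res.data.inspectionResult holds the indexing verdict, coverage
//   // state, canonical info, etc.

// Split a URL list into batches that respect the 2,000/day quota.
function dailyBatches<T>(urls: T[], quota = 2000): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < urls.length; i += quota) {
    batches.push(urls.slice(i, i + quota));
  }
  return batches;
}
```

An audit of 10,000 URLs therefore takes five days at full quota, which is worth knowing before you promise same-day results.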
## Bringing GSC Data Into Your Editor via MCP
The Model Context Protocol (MCP) lets AI coding tools call external services. Instead of copy-pasting data from dashboards, your AI assistant queries GSC data directly.
Rampify's MCP server exposes GSC data through tools your AI can call:
```
# In Claude Code or Cursor
"Which of my pages aren't indexed?"

# AI calls get_gsc_insights() → returns:
# - Pages with indexing issues
# - Performance trends (clicks, impressions, position)
# - Content opportunities based on search data
# - Specific recommendations with file paths
```

This eliminates the context-switching problem. The AI has the data. It knows your codebase. It can suggest specific fixes in specific files. No dashboard tab required.
For the complete setup guide, see Connecting GSC to Your AI Coding Tools via MCP.
The real power isn't just reading GSC data — it's turning it into structured feature specs. When the AI identifies an indexing issue, it can create a spec with acceptance criteria, affected files, and implementation tasks. That spec persists across sessions and tracks resolution. See spec-driven development for the full methodology.
## Common Indexing Problems (And Where to Start)
### JavaScript Rendering Issues
Google uses a headless Chromium instance to render JavaScript, but it's not instantaneous. Pages that rely heavily on client-side rendering may be indexed with incomplete content — or not indexed at all.
Check: Use URL Inspection in GSC to compare the "Google's view" rendered HTML against your source. If critical content is missing from the rendered version, you have a JS rendering problem.
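That comparison is easy to script once you have the rendered HTML (copied from URL Inspection's crawled-page view, or fetched with a headless browser). A sketch, assuming you maintain a short list of strings that must survive rendering:

```typescript
// Report which critical content snippets are absent from the rendered HTML.
// Anything returned here likely never made it past the JS render step.
function missingFromRender(criticalContent: string[], renderedHtml: string): string[] {
  return criticalContent.filter((snippet) => !renderedHtml.includes(snippet));
}
```

Wire it into CI against a prerendered build and a missing snippet becomes a failing test instead of a silent indexing gap.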
Framework-specific guidance: See Google Indexing for Next.js and React.
### Crawl Budget Waste
Every redirect chain, broken link, and duplicate URL eats crawl budget without adding indexed pages. On large sites (10,000+ URLs), this directly reduces how many of your important pages get crawled.
Check: Look at the Coverage report's "Excluded" section. High numbers of "Page with redirect," "Duplicate without user-selected canonical," or "Crawled - currently not indexed" indicate budget waste.
### Sitemap Problems
An outdated or malformed sitemap is like a table of contents that points to the wrong pages. Google trusts your sitemap as a signal of what matters — if it's wrong, crawl priorities are wrong.
Check: Verify your sitemap in GSC under Sitemaps. Look for high "Discovered" counts with low "Indexed" counts — that gap represents pages Google found via your sitemap but chose not to index.
### Forcing Indexing
When you need a page indexed now, you have three options: URL Inspection (manual, one-at-a-time), the Google Indexing API (limited to specific schema types), and IndexNow (not supported by Google, but covers Bing and others).
Details: See Submit URL to Google: What Actually Works.
## Monitoring Indexing Health Over Time
Indexing isn't a one-time fix. It's an ongoing process that requires monitoring, just like uptime or error rates.
Key metrics to track:
Index coverage ratio — What percentage of your submitted URLs are actually indexed? A declining ratio means new pages are being excluded faster than old ones are being added.
Crawl frequency — How often is Googlebot visiting your site? A drop in crawl frequency can precede indexing problems by weeks.
Time to index — How long after publishing does a new page appear in search results? If this increases, something has changed in Google's evaluation of your site quality.
Impression trends — Declining impressions for previously stable pages can indicate de-indexing or ranking drops.
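The coverage ratio reduces to arithmetic you can run on a schedule. A sketch of the ratio plus a naive regression check (the 5% tolerance is an arbitrary starting point, not a recommendation):

```typescript
// Index coverage ratio: indexed URLs / submitted URLs.
function coverageRatio(indexed: number, submitted: number): number {
  return submitted > 0 ? indexed / submitted : 0;
}

// Flag a regression when the latest ratio drops below the trailing
// average of prior samples by more than `tolerance`.
function isRegressing(history: number[], tolerance = 0.05): boolean {
  if (history.length < 2) return false;
  const latest = history[history.length - 1];
  const prior = history.slice(0, -1);
  const avg = prior.reduce((a, b) => a + b, 0) / prior.length;
  return latest < avg - tolerance;
}
```

Run it weekly against sitemap counts and the Coverage report, and a declining ratio becomes an alert instead of a surprise.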
Rampify tracks these metrics automatically and surfaces them through the MCP server, so your AI assistant can flag regressions before they become traffic problems.
## Next Steps
Google indexing is the foundation that determines whether your pages appear in search. Understanding the data model, the failure modes, and the monitoring strategy gives you the same visibility into search infrastructure that you already have into your application infrastructure.
The guides in this series go deeper on each topic:
- Crawled Currently Not Indexed: How to Fix It — The most common indexing problem and 10 specific fixes
- Google Search Console API Guide — Programmatic access with TypeScript examples
- Submit URL to Google — Indexing API, IndexNow, and what actually works
- Google Indexing for Next.js and React — Framework-specific SEO guide
- Connecting GSC to AI Coding Tools via MCP — Bring search data into your editor
Try Spec-Driven Development with Rampify
Scan your site for SEO issues, pull GSC data into your editor, and create structured specs — all from your AI coding tools. No dashboard tab required.
Get Started Free

## Related Reading
Crawled Currently Not Indexed: How to Fix It
Understand why Google crawls but won't index your pages, and apply 10 specific fixes to get them into search results.
Google Search Console API Guide
From OAuth2 setup to practical use cases — build automated SEO intelligence with the GSC API and TypeScript.
Connecting GSC to AI Coding Tools via MCP
Set up the Rampify MCP server and bring Google Search Console data directly into Claude Code, Cursor, and Windsurf.