Glossary
Crawl budget
Crawl budget is the number of URLs Googlebot can and wants to crawl on a site within a given period. It results from two combined factors: the crawl capacity limit (what the server can handle without degradation) and crawl demand (Google's interest in the site's URLs, based on popularity and freshness).
Also known as
- crawl budget
As a practical rule of thumb, crawl budget becomes critical beyond **~10,000 indexable URLs** on a site. Below that threshold, Googlebot typically crawls everything without constraint. Above it, some pages are visited too infrequently, or not at all, which prevents them from being indexed or refreshed. The most exposed sites are e-commerce sites with millions of SKUs and facet combinations, programmatic SEO (pSEO) sites with thousands of generated pages, and media sites with deep archives.
2026 optimizations (official Google guidance + OnCrawl / Botify / SISTRIX research):
1. **Block in robots.txt** URLs with no SEO value (infinite pagination, sort parameters, calendar pages).
2. **Avoid redirect chains**: each 301 hop is an extra request that consumes crawl budget.
3. **Return real 404s instead of soft 404s**: soft 404s keep getting recrawled and waste budget.
4. **Keep the sitemap clean** with accurate lastmod values: Google uses lastmod as a recrawl signal only when it is consistently reliable.
5. **Target a server response time < 200ms**: when responses slow down or errors increase, Google reduces the crawl rate.
6. **Keep internal linking depth ≤ 4 clicks** from the homepage: deeper pages receive very little crawl.
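The click-depth guideline above can be checked with a breadth-first search over the internal link graph. The following is a minimal sketch, assuming the graph has already been extracted by a crawler; the `links` dictionary, its URLs, and the `click_depths` helper are hypothetical names used for illustration.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable URL."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # in BFS, the first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/category", "/about"],
    "/category": ["/category/page-2"],
    "/category/page-2": ["/category/page-3"],
    "/category/page-3": ["/category/page-4"],
    "/category/page-4": ["/product-x"],
}

depths = click_depths(links, "/")
# Pages deeper than 4 clicks are candidates for better internal linking.
too_deep = sorted(url for url, depth in depths.items() if depth > 4)
print(too_deep)
```

Pages that never appear in `depths` are unreachable through internal links (orphans), which is a separate problem worth listing alongside the deep pages.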
The implication for pSEO: publishing 10,000 pages in bulk is a bad idea, because Google tends to sample the set and index only 20–40% of the volume. Publishing progressively (50–200 pages/week) with segmented sitemaps and precise lastmod dates typically achieves 70–90% indexation in 3–6 months.
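A progressive cadence can be planned by splitting the backlog into weekly batches. This is only a sketch of the scheduling arithmetic; the `weekly_batches` helper and the backlog URLs are hypothetical.

```python
def weekly_batches(urls, per_week):
    """Split `urls` into consecutive batches of at most `per_week` items."""
    if per_week <= 0:
        raise ValueError("per_week must be positive")
    return [urls[i:i + per_week] for i in range(0, len(urls), per_week)]

# Hypothetical backlog of generated pages waiting to be published.
backlog = [f"/glossary/term-{n}" for n in range(10_000)]
schedule = weekly_batches(backlog, per_week=200)

print(len(schedule))     # number of weeks needed at this cadence
print(len(schedule[0]))  # size of the first weekly batch
```

At 200 pages per week, a 10,000-page backlog takes 50 weeks to release, so the cadence is worth fixing before the pages are generated.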
In the getchatsocial.com product
getchatsocial.com serves segmented sitemaps by page type (`/sitemap-glossaire.xml`, `/sitemap-comparer.xml`, `/sitemap-cas.xml`) with precise lastmod to optimize crawl prioritization, and publishes new pSEO pages at a progressive cadence rather than in bulk.
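To illustrate what segmented sitemaps with lastmod look like in practice, here is a minimal sketch using Python's standard library. It is not the product's actual implementation: the `example.com` URLs, the dates, and the `build_sitemap` and `segment_by_type` helpers are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build a <urlset> document from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod should reflect the real content change, not the build date.
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")

def segment_by_type(pages):
    """Group (page_type, url, last_modified_date) tuples by page type."""
    segments = {}
    for page_type, loc, modified in pages:
        segments.setdefault(page_type, []).append((loc, modified))
    return segments

# Hypothetical pages and modification dates.
pages = [
    ("glossaire", "https://example.com/glossaire/crawl-budget", date(2026, 1, 12)),
    ("comparer", "https://example.com/comparer/tool-a-vs-tool-b", date(2026, 1, 8)),
    ("glossaire", "https://example.com/glossaire/sitemap", date(2026, 1, 15)),
]

# One sitemap file per page type.
sitemaps = {
    f"sitemap-{page_type}.xml": build_sitemap(entries)
    for page_type, entries in segment_by_type(pages).items()
}
print(sorted(sitemaps))
```

Splitting by page type also makes it easier to compare indexation rates per segment in Search Console, since each sitemap can be inspected separately.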
FAQ
At what site size does crawl budget become a concern?
Practical threshold: ~10,000 indexable URLs. Below that, Googlebot generally crawls everything without significant constraint. Between 10k and 100k, crawl budget optimization starts to matter. Above 100k (e-commerce, media, mature pSEO), it's a first-class concern that can determine whether you achieve 40% or 90% indexation.
How do you monitor your crawl budget?
Search Console > Settings > Crawl stats shows the total crawl requests per day, the total download size, and the server's average response time to Googlebot. To go deeper: analyze server logs (OnCrawl, Botify, Screaming Frog Log File Analyser) to see which URLs Googlebot actually visits and which it ignores. A monthly log audit is sufficient for a medium-sized site.
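For a quick look before reaching for a dedicated tool, a basic log audit can be sketched in a few lines. The example below assumes access logs in the common "combined" format; the sample lines, the `known_urls` set, and the `googlebot_hits` helper are hypothetical. It matches on the user-agent string only, which can be spoofed, so a real audit would also verify the requesting IPs (for instance by reverse DNS).

```python
import re
from collections import Counter

# Extracts the request path and the user agent from a combined-format log line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(log_lines):
    """Count requests per path whose user agent claims to be Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

# Hypothetical log excerpt.
sample_log = [
    '66.249.66.1 - - [12/Jan/2026:10:00:00 +0000] "GET /glossaire/crawl-budget HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [12/Jan/2026:10:00:05 +0000] "GET /glossaire/crawl-budget HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    '66.249.66.1 - - [12/Jan/2026:10:01:00 +0000] "GET /comparer/a-vs-b HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

# Hypothetical list of indexable URLs, e.g. taken from the sitemaps.
known_urls = {"/glossaire/crawl-budget", "/comparer/a-vs-b", "/cas/example"}

hits = googlebot_hits(sample_log)
never_crawled = sorted(known_urls - set(hits))
print(never_crawled)
```

Comparing the crawled paths against the sitemap URLs gives the two lists that matter most: pages Googlebot never visits, and crawled URLs that should not be consuming budget at all.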