Broken Links Checker Crawler
Broken links (404 errors) harm SEO and degrade the user experience. The crawler automatically checks all internal and external links on a website, finds the ones that do not work, and generates a report showing which pages contain them.
What the Crawler Checks
- Internal links: pages within the same domain
- External links: outgoing links to other sites
- Images: URLs in src attributes
- CSS/JS resources: whether static files load
- Redirects: chains longer than 2 hops
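As a rough illustration of how the categories above could be collected from one page, here is a minimal sketch. It uses the standard-library html.parser only to stay dependency-free (BeautifulSoup would do the same job), and the names LinkCollector, extract_links, and URL_ATTRS are hypothetical, not part of the actual crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag -> attribute that holds the URL (assumed mapping for this sketch).
URL_ATTRS = {"a": "href", "img": "src", "script": "src", "link": "href"}


class LinkCollector(HTMLParser):
    """Collects (absolute_url, kind) pairs from one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.site_host = urlparse(page_url).netloc
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_name = URL_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if not value:
            return
        # Resolve relative URLs against the page they were found on.
        absolute = urljoin(self.page_url, value)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            return  # skip mailto:, tel:, javascript: and similar
        if tag == "a":
            kind = "internal" if parsed.netloc == self.site_host else "external"
        elif tag == "img":
            kind = "image"
        else:
            # <script src> and <link href>; note that <link> also covers
            # non-CSS targets such as icons.
            kind = "resource"
        self.links.append((absolute, kind))


def extract_links(html, page_url):
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links
```

Calling extract_links(html, "https://site.example/blog/") returns the links in document order, each tagged as internal, external, image, or resource, so the checker can treat each category separately.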
Implementation
The crawler is written in Python, using asyncio and httpx for concurrent checking and BeautifulSoup for HTML parsing. It requests each link, checks the returned status code, and flags problematic links.
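The concurrent checking step might look roughly like the sketch below. It is an assumption-laden outline, not the actual implementation: check_links, check_with_httpx, the concurrency limit of 10, the 10-second timeout, and the rule "status 400 or higher counts as broken" are all illustrative choices. The fetch function is passed in so that the concurrency logic can be exercised without network access.

```python
import asyncio


async def check_links(urls, fetch, max_concurrency=10):
    """Return (url, status) for every link that looks broken.

    `fetch` is any coroutine function mapping a URL to an HTTP status code.
    A status of None means the request itself failed.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def check_one(url):
        async with semaphore:  # cap the number of requests in flight
            try:
                status = await fetch(url)
            except Exception:
                status = None  # connection error, timeout, bad URL, ...
        return url, status

    # gather() preserves the order of the input URLs.
    results = await asyncio.gather(*(check_one(url) for url in urls))
    return [(url, status) for url, status in results
            if status is None or status >= 400]


async def check_with_httpx(urls):
    # Imported lazily so the rest of the sketch runs without httpx installed.
    import httpx

    async with httpx.AsyncClient(follow_redirects=True, timeout=10.0) as client:
        async def fetch(url):
            response = await client.get(url)
            return response.status_code

        return await check_links(urls, fetch)
```

A run would be started with asyncio.run(check_with_httpx(urls)). With follow_redirects=True, httpx records the followed redirects in response.history, so a redirect-chain check could compare len(response.history) against the 2-hop limit.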
Report Format
The report is a CSV file with the columns broken_url, http_status, found_on_page, and link_text. It is ready for import into Google Sheets or Notion.
Implementation timeline: 1–2 working days.







