AI-based accessibility system
Web Content Accessibility Guidelines (WCAG) 2.1 Level AA is a legal requirement in a growing number of jurisdictions (the ADA in the US, EN 301 549 in the EU, and the European Accessibility Act, mandatory from June 2025). A manual accessibility audit of a 5,000-page website takes 3–4 weeks and costs $40–80K. An AI system completes the audit in a matter of hours and keeps monitoring in CI/CD.
Automated accessibility audit
What the automation finds and what it doesn't
axe-core (Deque) and IBM Equal Access Checker are the standard engines. They automatically detect missing alt text, broken heading hierarchy, low contrast ratios, missing ARIA labels, and keyboard traps. Coverage: roughly 30–40% of WCAG criteria, by Deque's own assessment.
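The contrast check these engines run is defined by WCAG 1.4.3 and is simple enough to sketch directly. Below is a minimal implementation of the formula (function names are illustrative, not axe-core's API):

```python
# Sketch: WCAG 1.4.3 contrast-ratio check, the same rule engines like
# axe-core apply. Function names are illustrative, not a library API.

def _linearize(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linearize(ch) for ch in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

def passes_aa(fg, bg, large_text=False):
    """WCAG 2.1 AA thresholds: 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)

print(round(contrast_ratio((255, 255, 255), (0, 0, 0)), 1))  # → 21.0
```

Note the borderline cases this catches: #777777 on white comes out at about 4.48:1 and fails AA, which eyeballing will not reveal.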
Machine learning covers part of the remaining 60–70%:
- Alt text quality: a classifier (fine-tuned BERT/RoBERTa) judges whether alt text is informative or a mere placeholder ("image.jpg", "photo1"). F1 0.84 on a dataset of 50K annotated images.
- Semantic contrast: not just the contrast ratio (WCAG 1.4.3), but also font legibility, via a CNN classifier on the rendered element.
- Cognitive complexity: Flesch-Kincaid plus ML readability models for assessing Level AAA reading level (WCAG 3.1.5).
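Before reaching for the fine-tuned classifier, a cheap heuristic pre-filter already catches the most common placeholder alt text. A sketch (the patterns are illustrative, not exhaustive):

```python
import re

# Sketch: heuristic pre-filter for uninformative alt text, run before the
# heavier BERT/RoBERTa classifier. Patterns are illustrative examples.

PLACEHOLDER_PATTERNS = [
    r"^\s*$",                                       # empty or whitespace-only
    r"\.(jpe?g|png|gif|webp|svg)\s*$",              # looks like a filename
    r"^(img|image|photo|picture|pic)[\s_\-]?\d*$",  # "photo1", "img_2"
    r"^dsc[\s_\-]?\d{3,}$",                         # camera default names
]

def is_placeholder_alt(alt):
    """True if the alt text is almost certainly uninformative."""
    text = alt.strip().lower()
    return any(re.search(p, text) for p in PLACEHOLDER_PATTERNS)
```

Only text that survives this filter needs to hit the ML model, which keeps inference cost down on large media libraries.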
Computer Vision for visual audit
Screenshot rendering → CV pipeline:
1. A Playwright headless browser renders the pages
2. DETR (DEtection TRansformer) detects UI components
3. A classifier checks touch-target size (WCAG 2.5.5: minimum 44×44 px)
4. A segmentation model checks spacing between interactive elements
Performance: 2500 pages in 40 minutes on 4×A10G GPU.
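Steps 3–4 of the pipeline reduce to geometry once the detector has produced bounding boxes. A minimal sketch, with plain dataclasses standing in for DETR output:

```python
from dataclasses import dataclass

# Sketch: checking detected UI components against the WCAG 2.5.5 minimum
# target size and measuring spacing. The DETR detector would supply these
# boxes; here they are plain dataclasses for illustration.

@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float
    label: str = "button"

MIN_TARGET = 44.0  # CSS px, WCAG 2.5.5

def undersized_targets(boxes):
    """Return interactive elements smaller than the minimum touch target."""
    return [b for b in boxes if b.w < MIN_TARGET or b.h < MIN_TARGET]

def gap(a, b):
    """Euclidean gap between two boxes (0 if they touch or overlap)."""
    dx = max(0.0, max(a.x, b.x) - min(a.x + a.w, b.x + b.w))
    dy = max(0.0, max(a.y, b.y) - min(a.y + a.h, b.y + b.h))
    return (dx * dx + dy * dy) ** 0.5
```

In the real pipeline the same checks run per detected component class, so icon buttons and links can get different thresholds if the design system requires it.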
AI for users with disabilities
Automatic alt text generation
BLIP-2 or LLaVA (Large Language and Vision Assistant) generates image descriptions. For product images in e-commerce: fine-tuning on domain data with product descriptions. After fine-tuning on 12,000 examples with QLoRA, ROUGE-L improves from 0.58 (base BLIP-2) to 0.67.
CMS integration: a webhook fires when an image is uploaded → inference endpoint (TorchServe or vLLM) → the alt text is written to the media library. Latency: 800 ms–1.2 s per image.
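The webhook handler itself is thin glue. A sketch with the inference call abstracted out so it can be exercised offline (event field names and the stubbed model are illustrative, not a specific CMS API):

```python
# Sketch of the CMS webhook handler described above. The inference call is
# injected; in production it would POST to the TorchServe/vLLM endpoint.
# Event field names are illustrative, not a specific CMS's schema.

def handle_image_uploaded(event, infer_alt_text, media_library):
    """Webhook handler: generate alt text for a newly uploaded image.

    `infer_alt_text` abstracts the model endpoint (e.g. an HTTP POST to
    TorchServe); passing it in keeps the handler testable offline.
    """
    image_url = event["url"]
    alt_text = infer_alt_text(image_url)
    media_library[event["media_id"]] = {"url": image_url, "alt": alt_text}
    return alt_text

# Offline usage with a stubbed model:
fake_model = lambda url: "A red mountain bike leaning against a brick wall"
library = {}
alt = handle_image_uploaded(
    {"media_id": "m-42", "url": "https://cdn.example.com/bike.jpg"},
    fake_model,
    library,
)
```

Keeping the model behind a function boundary also makes it easy to swap BLIP-2 for LLaVA without touching the CMS side.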
Subtitles and transcription
Whisper large-v3 (OpenAI) is state-of-the-art ASR: WER 3.1% on LibriSpeech clean. For a video platform: real-time subtitles via streaming inference, batch transcription for VOD. Speaker diarization: pyannote.audio separates speakers in webinars.
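The WER figure above is worth being able to reproduce when evaluating a fine-tuned model on your own audio. Word error rate is word-level edit distance divided by reference length; a minimal sketch:

```python
# Sketch: word error rate (WER), the metric behind the 3.1% figure above.
# WER = (substitutions + deletions + insertions) / reference word count,
# computed via word-level edit distance.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + cost)  # substitution/match
    return dp[len(ref)][len(hyp)] / max(1, len(ref))

# One substitution out of six words: WER = 1/6 ≈ 0.167
score = wer("the cat sat on the mat", "the cat sat on a mat")
```

For production evaluation, normalize text first (casing, punctuation, number formats); raw WER is very sensitive to it.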
Adaptation for poor acoustics: domain fine-tuning via Hugging Face transformers + datasets, 8-bit quantization via bitsandbytes for deployment on CPU infrastructure.
Assistive technologies and NLP
Plain Language Simplification: T5 or Mistral-7B fine-tuned to paraphrase complex legal/medical text into plain language. Quality is measured with SARI, the standard simplification metric: baseline T5-large scores 42.1; after domain fine-tuning, 47.3.
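For intuition about what SARI rewards, here is a deliberately simplified, unigram, single-reference approximation. The real metric averages n-gram precision/recall up to 4-grams over multiple references, so treat this as a toy:

```python
# Toy approximation of SARI: unigram, single-reference, set-based.
# The real metric averages n-gram scores for keep/add/delete operations
# over multiple references; this sketch only conveys the idea.

def _f1(pred, gold):
    if not pred and not gold:
        return 1.0
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    p, r = tp / len(pred), tp / len(gold)
    return 2 * p * r / (p + r)

def sari_unigram(source, output, reference):
    s, o, r = (set(t.lower().split()) for t in (source, output, reference))
    keep = _f1(s & o, s & r)   # words correctly kept from the source
    add = _f1(o - s, r - s)    # words correctly added
    dele = _f1(s - o, s - r)   # words correctly deleted
    return (keep + add + dele) / 3
```

The key property: simply copying the source scores poorly on the delete/add components, which is exactly why SARI is preferred over BLEU or ROUGE for simplification.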
Augmentative and Alternative Communication (AAC) system: a language model predicts the next word/symbol for users with motor impairments. GPT-2 fine-tuned on a corpus of AAC messages reduces keystrokes by 40% compared to a simple n-gram model.
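The 40% figure is measured as keystroke savings. A sketch of the evaluation harness, assuming accepting a top-k prediction costs one keypress (the `predict` callable stands in for the GPT-2 model; the toy model here is illustrative):

```python
# Sketch: measuring keystroke savings from next-word prediction, the basis
# of the 40% figure above. Assumes the user accepts a predicted word with
# one keypress when it appears in the top-k list; `predict` stands in for
# the fine-tuned GPT-2 model.

def keystrokes_with_prediction(words, predict, k=3):
    """Keypresses to enter `words`; selecting a top-k prediction costs 1."""
    total, context = 0, []
    for word in words:
        candidates = predict(context)[:k]
        total += 1 if word in candidates else len(word) + 1  # +1 for space
        context.append(word)
    return total

# Toy "model": always predicts the most common function words.
toy_predict = lambda ctx: ["the", "to", "and"]
sentence = "the dog ran to the park".split()
baseline = sum(len(w) + 1 for w in sentence)  # typing everything out: 24
saved = 1 - keystrokes_with_prediction(sentence, toy_predict) / baseline
```

Even this trivial unigram-style predictor saves about a third of keystrokes on function-word-heavy sentences; the fine-tuned model's context sensitivity is what pushes savings to the reported 40% on real AAC corpora.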
Monitoring in CI/CD
Accessibility regression testing: axe-core + ML extensions are integrated into GitHub Actions. Pull requests are not merged when new WCAG violations are detected. Dashboard: historical violation metrics by component, trend analysis.
Typical mistake: the team adds accessibility checks only to e2e tests that run once a day; by then the violation is already in production. The fix: unit-level accessibility checks in Storybook, Playwright component tests, and a full-page scan in CI.
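The merge gate itself is a small filter over the axe-core result object. A sketch of the pure part, with one possible Playwright invocation shown in comments (the CDN URL and page URL are illustrative):

```python
# Sketch of a CI gate over axe-core JSON results. The helper is pure and
# engine-agnostic; the commented Playwright snippet shows one way to
# produce `results` (URLs are illustrative).

BLOCKING_IMPACTS = {"serious", "critical"}

def blocking_violations(results):
    """Filter an axe-core result object down to merge-blocking violations."""
    return [v for v in results.get("violations", [])
            if v.get("impact") in BLOCKING_IMPACTS]

# Producing `results` with Playwright (not executed here):
# from playwright.sync_api import sync_playwright
# with sync_playwright() as p:
#     page = p.chromium.launch().new_page()
#     page.goto("https://example.com")
#     page.add_script_tag(url="https://cdn.jsdelivr.net/npm/axe-core@4/axe.min.js")
#     results = page.evaluate("axe.run()")  # evaluate awaits the promise

demo = {"violations": [
    {"id": "color-contrast", "impact": "serious"},
    {"id": "region", "impact": "moderate"},
]}
flagged = [v["id"] for v in blocking_violations(demo)]
```

Gating only on serious/critical impacts keeps the PR signal actionable while moderate/minor findings feed the trend dashboard instead of blocking merges.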
Stack
| Component | Tool |
|---|---|
| Automated audit | axe-core, Playwright |
| Alt-text generation | BLIP-2, LLaVA, Hugging Face |
| ASR subtitles | Whisper large-v3 |
| Text simplification | T5, Mistral-7B fine-tuned |
| CV UI analysis | DETR, torchvision |
| CI/CD integration | GitHub Actions, GitLab CI |
Development time: 2–5 months for audit tools. AI-powered assistive features (alt-text generator, subtitles, text simplification) — an additional 3–6 months.