# AI Automated End-to-End Test Generation

We design and deploy artificial intelligence systems, from prototype to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps to make AI work in real business settings, not just in the lab.

## AI Auto-Generation of E2E Tests

E2E tests are the most expensive tests to maintain: they break on any UI change, run slowly, and are unstable (flaky). The main cause of instability is brittle, structure-coupled locators such as `div.container > ul > li:nth-child(3) > a`. An AI generator instead produces Playwright tests with semantic locators (aria-label, data-testid, role) that survive cosmetic layout changes.
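The difference can be made concrete with a small heuristic: positional CSS chains encode DOM structure, while semantic locators encode user-visible intent. A rough brittleness check, a hypothetical helper rather than part of any Playwright API, might look like this:

```python
import re

# Patterns that typically indicate a structure-coupled, brittle selector.
BRITTLE_PATTERNS = [
    r":nth-child\(",   # positional: breaks when siblings are added
    r"(^|\s)div\.",    # tag + class chains mirror layout markup
    r">\s*\w+\s*>",    # deep descendant chains
]

def is_brittle_selector(selector: str) -> bool:
    """Heuristic: does this CSS selector depend on DOM structure?"""
    return any(re.search(p, selector) for p in BRITTLE_PATTERNS)

print(is_brittle_selector("div.container > ul > li:nth-child(3) > a"))  # True
print(is_brittle_selector("[data-testid='checkout-submit-btn']"))       # False
```

A `data-testid` attribute survives a redesign of the surrounding markup; the `nth-child` chain does not.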

### Generate Playwright Tests from Scenario Description

````python
from langchain_openai import ChatOpenAI
import json


class E2ETestGenerator:
    PLAYWRIGHT_PROMPT = """Create a Playwright E2E test in TypeScript.

Scenario: {scenario}
App URL: {base_url}
Test data: {test_data}

Test requirements:
1. Use semantic locators: getByRole, getByLabel, getByText, getByTestId
2. Do NOT use CSS selectors like .class or #id (except data-testid)
3. Add explicit waits: await expect(locator).toBeVisible()
4. For forms, fill fields via getByLabel(), not via selectors
5. Assert after each significant action (not only at the end)
6. Use page.waitForResponse() for AJAX operations
7. Structure: test.describe > test.beforeEach > test

Example of a good locator:
✅ page.getByRole('button', {{ name: 'Create order' }})
✅ page.getByTestId('checkout-submit-btn')
❌ page.locator('button.btn-primary:nth-child(2)')

Return the TypeScript test code."""

    def __init__(self):
        self.llm = ChatOpenAI(model="gpt-4o", temperature=0.1)

    def generate_from_scenario(
        self,
        scenario: str,
        base_url: str,
        test_data: dict,
    ) -> str:
        result = self.llm.invoke(
            self.PLAYWRIGHT_PROMPT.format(
                scenario=scenario,
                base_url=base_url,
                test_data=json.dumps(test_data),
            )
        )
        return result.content

    def generate_from_recording(self, playwright_trace: str) -> str:
        """Improves a test recorded with Playwright Codegen."""
        prompt = f"""Improve this automatically recorded Playwright test.

Original test (from Codegen):
```typescript
{playwright_trace}
```

Fix:
1. Replace CSS selectors with semantic locators
2. Remove redundant clicks
3. Add explicit waits
4. Add assertions that verify the data

Return the improved TypeScript code."""
        return self.llm.invoke(prompt).content
````
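Generated code is still LLM output, so it is worth gating before it lands in the repository. A minimal lint pass, a hypothetical helper illustrating the locator and assertion rules from the prompt above, could reject tests that slip back into CSS selectors:

```python
import re

def lint_generated_test(code: str) -> list[str]:
    """Returns a list of violations against the prompt's locator rules."""
    issues = []
    # No raw CSS class/id selectors inside page.locator(...).
    if re.search(r"page\.locator\(['\"][.#]", code):
        issues.append("uses raw CSS class/id selectors")
    # At least one semantic locator should be present.
    if not re.search(r"getBy(Role|Label|Text|TestId)\(", code):
        issues.append("no semantic locators found")
    # The test must actually assert something.
    if "expect(" not in code:
        issues.append("no assertions")
    return issues

good = "await expect(page.getByRole('button', { name: 'Save' })).toBeVisible();"
bad = "await page.locator('.btn-primary').click();"
print(lint_generated_test(good))  # []
print(lint_generated_test(bad))
```

A non-empty result can trigger a regeneration round with the violations fed back into the prompt.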


### Stability Optimization and Self-Healing

```python
class TestStabilityOptimizer:
    def __init__(self):
        # ChatOpenAI imported in the generator module above
        self.llm = ChatOpenAI(model="gpt-4o", temperature=0.1)

    async def analyze_flakiness(self, test_result: dict) -> str:
        """Analyzes why a test failed and suggests a fix."""
        prompt = f"""Analyze this flaky test failure.

Test: {test_result['test_name']}
Error: {test_result['error']}
Screenshot: {test_result['screenshot_path']}

Likely causes:
- Race condition: element not yet visible
- Network delay: wait for a network response
- Async update: element content changed
- Dynamic ID/class: the selector needs updating

Suggest a fix (semantic locator approach)."""
        response = await self.llm.ainvoke(prompt)
        return response.content
```
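Before sending a failure to the LLM at all, many of the causes listed above can be pre-classified cheaply from the error text. A keyword-based triage sketch, a hypothetical helper mirroring those categories:

```python
def triage_failure(error: str) -> str:
    """Maps a Playwright error message to a likely flakiness cause."""
    e = error.lower()
    if "timeout" in e and "waiting for" in e:
        return "race condition: element not yet visible"
    if "net::" in e or "request failed" in e:
        return "network delay: wait for a network response"
    if "strict mode violation" in e:
        return "selector matched multiple elements: tighten the locator"
    return "unknown: escalate to LLM analysis"

print(triage_failure("Timeout 30000ms exceeded while waiting for getByRole('button')"))
```

Only the "unknown" bucket then needs a (slower, paid) LLM round trip.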

Case study: a React SPA with 80 critical user journeys under E2E coverage. Initial problem: 35% of the tests were flaky, breaking two to three times per week on style changes. After regeneration with semantic locators, flakiness dropped to zero once the suite stabilized. Maintenance time per layout change fell from about 3 hours of manual per-test fixes to zero, since the tests adapt.
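A figure like the 35% above comes from re-running the suite and counting tests that both pass and fail across runs. A sketch of that measurement, as a hypothetical helper:

```python
def flaky_tests(runs: list[dict[str, bool]]) -> set[str]:
    """A test is flaky if it passes in some runs and fails in others."""
    passed, failed = set(), set()
    for run in runs:
        for name, ok in run.items():
            (passed if ok else failed).add(name)
    return passed & failed

runs = [
    {"checkout": True, "login": True, "search": False},
    {"checkout": False, "login": True, "search": False},
]
print(flaky_tests(runs))  # {'checkout'}
```

Note that `search` fails consistently, so it is broken rather than flaky; only `checkout` flips between runs.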

Timeframe: E2E test generation plus stability improvements takes 4–6 weeks; a full framework with cross-browser testing takes 8–10 weeks.