Prompt Versioning System Development


Development of a prompt versioning system

Prompt versioning is Git for LLM instructions. When a prompt changes, you need to know who changed it, what exactly changed, and how the change affected quality, and you need to be able to revert to any previous version.

Principles of versioning

Version immutability: once created, a version never changes. If a prompt needs correcting, a new version is created.

Semantic versioning (major.minor.patch):

  • Major: a fundamental change in the instruction or task
  • Minor: improved wording without changing the task
  • Patch: typo fixes, minor tweaks

Link to results: each version is linked to its metrics on the evaluation set.
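
The three principles above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class name PromptVersion, its fields, and the bump method are all hypothetical. A frozen dataclass blocks attribute reassignment, and a correction always produces a new object with an incremented version number.

```python
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class PromptVersion:
    """One immutable prompt version (hypothetical structure)."""
    name: str
    version: tuple[int, int, int]  # (major, minor, patch)
    content: str
    # Metrics measured on the evaluation set for exactly this content.
    metrics: Mapping[str, float] = field(default_factory=dict)

    @property
    def label(self) -> str:
        """Version as a 'major.minor.patch' string."""
        return ".".join(str(part) for part in self.version)

    def bump(self, level: str, content: str) -> "PromptVersion":
        """Return a NEW version; the current object is never modified."""
        major, minor, patch = self.version
        if level == "major":
            new_version = (major + 1, 0, 0)
        elif level == "minor":
            new_version = (major, minor + 1, 0)
        elif level == "patch":
            new_version = (major, minor, patch + 1)
        else:
            raise ValueError(f"unknown level: {level!r}")
        # Metrics start empty: they have to be re-measured for the new content.
        return PromptVersion(self.name, new_version, content)
```

Leaving the metrics of a bumped version empty is a deliberate choice in this sketch: scores belong to one exact prompt text and should not be inherited.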

Git-based versioning of prompts

For small teams, storing prompts in Git is often sufficient:

prompts/
├── customer-support/
│   ├── system-prompt.v1.txt
│   ├── system-prompt.v2.txt
│   └── system-prompt.current -> system-prompt.v2.txt
├── summarization/
│   ├── prompt.v1.yaml
│   └── prompt.v2.yaml
└── prompts.json  # Index with metadata

# prompts/summarization/prompt.v2.yaml
version: "2.0.0"
name: "document-summarizer"
created: "2024-11-15"
author: "ml-team"
changelog: "Added length constraint, improved tone instruction"
model:
  provider: "openai"
  name: "gpt-4o"
  temperature: 0.2
  max_tokens: 500
variables:
  - name: document
    required: true
  - name: max_sentences
    required: false
    default: "3"
content: |
  Summarize the following document in exactly {{max_sentences}} sentences.
  Be concise and focus on the main points.
  Do not add information not present in the document.

  Document:
  {{document}}
metrics:
  rouge_l: 0.47
  human_rating: 4.2
  eval_set: "summarization-benchmark-v3"
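
A file like this still has to be turned into a final prompt string. The sketch below assumes the YAML has already been parsed into a plain dict (for example with a YAML library) and shows one possible way to apply the variables section: supplied values win, defaults fill the gaps, and a missing required variable is an error. The function name render_prompt is hypothetical.

```python
import re

# Matches {{name}} placeholders, allowing optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(spec: dict, values: dict) -> str:
    """Fill the {{name}} placeholders of spec["content"] (hypothetical helper)."""
    resolved = {}
    for var in spec.get("variables", []):
        name = var["name"]
        if name in values:
            resolved[name] = str(values[name])
        elif "default" in var:
            resolved[name] = str(var["default"])
        elif var.get("required", False):
            raise ValueError(f"missing required variable: {name}")

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in resolved:
            # Placeholder used in the content but not declared or not set.
            raise KeyError(f"undeclared or unset variable: {name}")
        return resolved[name]

    return _PLACEHOLDER.sub(substitute, spec["content"])
```

Rejecting undeclared placeholders keeps the variables list honest: anything the content uses has to be declared in the version file.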

Automatic prompt diffs

import difflib

def diff_prompt_versions(v1_content: str, v2_content: str) -> str:
    """Return a unified diff between two prompt versions."""
    # Split without line endings and pass lineterm="" so that header, hunk and
    # content lines are all newline-free; they are then joined uniformly below.
    v1_lines = v1_content.splitlines()
    v2_lines = v2_content.splitlines()

    diff = difflib.unified_diff(
        v1_lines, v2_lines,
        fromfile="version_1",
        tofile="version_2",
        lineterm="",
    )
    return "\n".join(diff)

def analyze_prompt_change(v1: str, v2: str) -> dict:
    """Summarize the nature of the change between two prompt versions."""
    v1_words = set(v1.lower().split())
    v2_words = set(v2.lower().split())

    added_words = v2_words - v1_words
    removed_words = v1_words - v2_words

    # Compute the similarity ratio once and reuse it for the classification.
    similarity = difflib.SequenceMatcher(None, v1, v2).ratio()

    return {
        "length_change": len(v2) - len(v1),
        # Sort so that the 10-word samples are deterministic across runs.
        "added_words": sorted(added_words)[:10],
        "removed_words": sorted(removed_words)[:10],
        "similarity": similarity,
        # Heuristic: below 0.7 similarity the change is treated as major.
        "change_type": "major" if similarity < 0.7 else "minor",
    }
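
The analysis above distinguishes only major from minor changes, while the versioning scheme also has a patch level. One possible extension is sketched below. The 0.7 boundary comes from the code above; the 0.95 boundary for patch-level edits and the function name suggest_bump are assumptions for illustration and would need tuning on real prompt histories.

```python
import difflib


def suggest_bump(old: str, new: str,
                 major_below: float = 0.7,
                 patch_at_least: float = 0.95) -> str:
    """Suggest a semantic-version level from text similarity (heuristic)."""
    ratio = difflib.SequenceMatcher(None, old, new).ratio()
    if ratio < major_below:
        return "major"   # most of the text was rewritten
    if ratio >= patch_at_least:
        return "patch"   # near-identical text, e.g. a typo fix
    return "minor"       # noticeable rewording, same overall text
```

Character similarity is only a proxy: a one-word edit such as replacing "include" with "exclude" changes the task but would be scored as a patch, so the suggestion is best treated as a default that a reviewer can override.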

Promotion workflow

[Draft] → [In Review] → [Approved] → [Staging] → [Production]
                ↑                         ↓
           Reviewer                  A/B Test (5%)
                                          ↓
                                    Full Rollout / Rollback
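
The workflow can be enforced as a small state machine. The sketch below is one reading of the diagram: the stage names mirror it, while the transition back from review to draft and the rollback from production are assumptions that the diagram does not show explicitly. Stage, ALLOWED and promote are hypothetical names.

```python
from enum import Enum


class Stage(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    STAGING = "staging"
    AB_TEST = "ab_test"          # the 5% A/B test in the diagram
    PRODUCTION = "production"    # full rollout
    ROLLED_BACK = "rolled_back"


# Which stage may follow which; anything not listed is rejected.
ALLOWED = {
    Stage.DRAFT: {Stage.IN_REVIEW},
    Stage.IN_REVIEW: {Stage.APPROVED, Stage.DRAFT},  # send-back is an assumption
    Stage.APPROVED: {Stage.STAGING},
    Stage.STAGING: {Stage.AB_TEST},
    Stage.AB_TEST: {Stage.PRODUCTION, Stage.ROLLED_BACK},
    Stage.PRODUCTION: {Stage.ROLLED_BACK},           # assumption
    Stage.ROLLED_BACK: set(),
}


def promote(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the workflow forbids the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in a single table makes it hard to skip a step by accident: a draft cannot reach production without passing through review, staging and the A/B test.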

Key rule: no prompt is promoted to production without passing the evaluation set. An automated CI job runs the evaluation every time a prompt changes and blocks promotion if quality regresses by more than 3%.
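
The CI check itself can be very small. The sketch below assumes that every metric is "higher is better" and that the 3% limit is a relative drop against the current production version; the text does not specify either point, so both are assumptions, as is the function name regression_gate.

```python
def regression_gate(baseline: dict, candidate: dict,
                    max_regression: float = 0.03) -> list:
    """Return a list of failures; an empty list means promotion may proceed."""
    failures = []
    for metric, base in baseline.items():
        if metric not in candidate:
            # A metric that was not measured counts as a failure.
            failures.append(f"{metric}: missing in candidate")
            continue
        if base <= 0:
            continue  # a relative drop is undefined for a non-positive baseline
        drop = (base - candidate[metric]) / base
        if drop > max_regression:
            failures.append(
                f"{metric}: {base} -> {candidate[metric]} ({drop:.1%} drop)"
            )
    return failures
```

In a CI job, the script would exit with a non-zero status whenever the returned list is non-empty, which is what blocks the promotion.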