Stable Diffusion Self-Hosted Server Deployment

Self-hosted Stable Diffusion gives you complete control over generation: custom models and LoRAs, none of the content-policy restrictions imposed by API services, and predictable costs at high volume. At roughly 5,000+ images per month, self-hosting starts to become cheaper than an API.

Deployment options

Automatic1111 WebUI is the most popular option, with a rich ecosystem of extensions:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui

# Download the model
wget -O models/Stable-diffusion/sd_xl_base_1.0.safetensors \
  https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# Launch with the API enabled
./webui.sh --api --listen --port 7860 --xformers

ComfyUI offers a more flexible, node-based workflow and is better suited to automation:

git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt
python main.py --listen 0.0.0.0 --port 8188

Docker deployment

# docker-compose.yml
version: "3.8"
services:
  stable-diffusion:
    image: universonic/stable-diffusion-webui:latest
    ports:
      - "7860:7860"
    volumes:
      - ./models:/app/stable-diffusion-webui/models
      - ./outputs:/app/stable-diffusion-webui/outputs
    environment:
      - COMMANDLINE_ARGS=--api --xformers --medvram
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    restart: unless-stopped
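On first start the container may spend several minutes downloading and loading a checkpoint, so it is worth gating traffic on a readiness probe. A small sketch; `webui_probe` and its URL are assumptions matching the compose file above:

```python
import time

def wait_until_ready(probe, timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Poll `probe` (a callable returning True once the API answers)
    until it succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if probe():
                return True
        except Exception:
            pass  # connection refused etc. while the server is still starting
        time.sleep(interval)
    return False

def webui_probe(base_url: str = "http://localhost:7860") -> bool:
    """True once the WebUI API responds with its model list."""
    import httpx
    r = httpx.get(f"{base_url}/sdapi/v1/sd-models", timeout=5)
    return r.status_code == 200
```

Usage: `wait_until_ready(webui_probe)` before routing generation requests to the container.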

Automatic1111 API client

import httpx
import base64

class SDWebUIClient:
    def __init__(self, base_url: str = "http://localhost:7860"):
        self.base_url = base_url

    async def txt2img(
        self,
        prompt: str,
        negative_prompt: str = "low quality, blurry",
        width: int = 1024,
        height: int = 1024,
        steps: int = 30,
        cfg_scale: float = 7.0,
        sampler: str = "DPM++ 2M Karras",
        seed: int = -1
    ) -> bytes:
        payload = {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "width": width,
            "height": height,
            "steps": steps,
            "cfg_scale": cfg_scale,
            "sampler_name": sampler,
            "seed": seed,
            "batch_size": 1
        }

        async with httpx.AsyncClient(timeout=120) as client:
            response = await client.post(f"{self.base_url}/sdapi/v1/txt2img", json=payload)
            response.raise_for_status()
            result = response.json()
            return base64.b64decode(result["images"][0])

    async def img2img(self, init_image: bytes, prompt: str, denoising_strength: float = 0.7) -> bytes:
        payload = {
            "init_images": [base64.b64encode(init_image).decode()],
            "prompt": prompt,
            "denoising_strength": denoising_strength,
        }
        async with httpx.AsyncClient(timeout=120) as client:
            response = await client.post(f"{self.base_url}/sdapi/v1/img2img", json=payload)
            return base64.b64decode(response.json()["images"][0])

    async def get_models(self) -> list[str]:
        async with httpx.AsyncClient() as client:
            response = await client.get(f"{self.base_url}/sdapi/v1/sd-models")
            return [m["title"] for m in response.json()]

    async def switch_model(self, model_title: str) -> None:
        async with httpx.AsyncClient(timeout=60) as client:
            await client.post(
                f"{self.base_url}/sdapi/v1/options",
                json={"sd_model_checkpoint": model_title}
            )

Scaling under load

import asyncio
import base64

from celery import Celery

# Multiple GPU workers: one queue per GPU server
app = Celery("sd_tasks", broker="redis://localhost:6379/0")
app.conf.worker_concurrency = 1          # one task at a time per GPU worker
app.conf.worker_prefetch_multiplier = 1  # don't prefetch while the GPU is busy

@app.task(queue="gpu_0")
def generate_on_gpu0(prompt: str, settings: dict) -> str:
    client = SDWebUIClient("http://gpu0-server:7860")
    image = asyncio.run(client.txt2img(prompt, **settings))
    return base64.b64encode(image).decode()  # JSON-serializable result

@app.task(queue="gpu_1")
def generate_on_gpu1(prompt: str, settings: dict) -> str:
    client = SDWebUIClient("http://gpu1-server:7860")
    image = asyncio.run(client.txt2img(prompt, **settings))
    return base64.b64encode(image).decode()
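With one Celery queue per GPU, the submitting side still needs to spread work across them. A minimal round-robin sketch (the task name in the comment is hypothetical; any queue-selection policy can replace the cycle):

```python
import itertools

class RoundRobinDispatcher:
    """Spread generation requests across per-GPU Celery queues in turn."""

    def __init__(self, queues: list[str]):
        self._cycle = itertools.cycle(queues)

    def next_queue(self) -> str:
        """Return the next queue name in rotation."""
        return next(self._cycle)

# Example wiring (assumes the Celery app above):
# dispatcher = RoundRobinDispatcher(["gpu_0", "gpu_1"])
# app.send_task("generate", args=[prompt, settings],
#               queue=dispatcher.next_queue())
```

A round-robin policy is a reasonable default when all GPUs are identical; for mixed hardware, queue-depth-based selection distributes load better.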

TCO: self-hosted vs API

| Volume               | DALL-E 3 standard | FLUX Dev (Replicate) | Self-hosted (RTX 4090) |
|----------------------|-------------------|----------------------|------------------------|
| 1,000 images/month   | $40               | $15                  | $50 (amortization)     |
| 10,000 images/month  | $400              | $150                 | $55                    |
| 100,000 images/month | $4,000            | $1,500               | $100                   |

With an RTX 4090 (~$1,800), the breakeven point is roughly 15,000–20,000 images per month. Deployment time: a basic single-GPU server takes 1–2 business days; a multi-GPU setup with load balancing and monitoring takes about a week.
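The breakeven figure can be sanity-checked with a rough model. The per-image API prices below are derived from the table ($15/1,000 for FLUX Dev, $40/1,000 for DALL-E 3); the 12-month payback horizon and $50/month hosting cost are assumptions, not numbers from the source:

```python
def breakeven_images_per_month(
    gpu_cost: float = 1800.0,            # RTX 4090 price from the text
    payback_months: int = 12,            # assumed amortization horizon
    hosting_per_month: float = 50.0,     # assumed power + colocation
    api_price_per_image: float = 0.015,  # FLUX Dev: $150 / 10,000 images
) -> float:
    """Monthly volume at which self-hosting costs less than the API."""
    self_hosted_monthly = gpu_cost / payback_months + hosting_per_month
    return self_hosted_monthly / api_price_per_image
```

With these assumptions the breakeven lands near 13,000 images/month against FLUX Dev and near 5,000 against DALL-E 3, broadly consistent with the 15,000–20,000 range quoted above once a longer-tail hosting overhead is factored in.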