Development of AI Image Generation Systems
AI image generation: building custom services on Stable Diffusion, FLUX, DALL-E, or Midjourney for avatars, banners, article illustrations, product visualization, and NFTs. Project complexity depends on customization requirements, generation speed, and inference cost.
Model Selection by Task
| Model | Strengths | Inference Cost | Manageability |
|---|---|---|---|
| DALL-E 3 | Text understanding, instruction following | $0.04–0.08/image | High |
| FLUX.1 Dev | Photorealism, detail | $0.003–0.015 (Replicate) | High |
| SDXL | Flexibility, LoRA/ControlNet | Self-hosted from $0.001 | Maximum |
| Midjourney | Artistic style | $0.01–0.04 | Low (no API) |
| Kandinsky 3 | Russian-language prompts | Self-hosted / $0.005 | Medium |
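The tradeoffs in the table can be encoded as a simple routing rule. The sketch below is illustrative, not a real API: the `pick_model` helper and the task keys (`needs_lora`, `language`, `budget_per_image`) are assumptions, and the thresholds mirror the cost column above.

```python
def pick_model(task: dict) -> str:
    """Route a generation request to a model based on the tradeoffs above.

    Hypothetical helper: the task keys and thresholds are illustrative.
    """
    if task.get("needs_lora") or task.get("needs_controlnet"):
        return "sdxl"          # maximum manageability, self-hosted
    if task.get("language") == "ru":
        return "kandinsky3"    # native Russian-language prompts
    if task.get("budget_per_image", 0.04) < 0.01:
        return "flux-dev"      # cheap per-image pricing on Replicate
    return "dall-e-3"          # best text understanding and instruction following

print(pick_model({"needs_lora": True}))  # → sdxl
```

In practice this routing usually lives behind a single `/generate` endpoint so clients never hard-code a model name.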
DALL-E 3 Integration
```python
from openai import AsyncOpenAI
import base64

client = AsyncOpenAI()

async def generate_image_dalle(
    prompt: str,
    size: str = "1024x1024",    # 1024x1024, 1792x1024, 1024x1792
    quality: str = "standard",  # standard, hd
    style: str = "vivid"        # vivid, natural
) -> bytes:
    response = await client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size=size,
        quality=quality,
        style=style,
        n=1,
        response_format="b64_json"
    )
    return base64.b64decode(response.data[0].b64_json)
```
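Calls to hosted APIs fail transiently (rate limits, timeouts), so production code usually wraps them in a retry. The helper below is a hypothetical sketch, not part of the OpenAI SDK; it takes a zero-argument callable that produces a fresh coroutine on each attempt.

```python
import asyncio

async def with_retries(coro_factory, attempts: int = 3, base_delay: float = 1.0):
    """Retry an async call with exponential backoff.

    Hypothetical helper: `coro_factory` is a zero-arg callable returning
    a fresh coroutine, e.g. lambda: generate_image_dalle(prompt).
    """
    for attempt in range(attempts):
        try:
            return await coro_factory()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts, propagate the last error
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Usage: `image = await with_retries(lambda: generate_image_dalle("a red fox, studio light"))`.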
FLUX via Replicate API
```python
import replicate
import httpx

async def generate_image_flux(
    prompt: str,
    aspect_ratio: str = "1:1",
    num_outputs: int = 1
) -> list[bytes]:
    output = await replicate.async_run(
        "black-forest-labs/flux-dev",
        input={
            "prompt": prompt,
            "aspect_ratio": aspect_ratio,
            "num_outputs": num_outputs,
            "guidance": 3.5,
            "num_inference_steps": 28,
            "output_format": "webp",
            "output_quality": 90
        }
    )
    # Download each generated image from its temporary URL
    images = []
    async with httpx.AsyncClient() as http:
        for url in output:
            resp = await http.get(str(url))
            images.append(resp.content)
    return images
```
Self-hosted via ComfyUI
```python
import json
import uuid
import urllib.parse
import urllib.request
import websocket  # from the websocket-client package

class ComfyUIClient:
    def __init__(self, server_address: str = "127.0.0.1:8188"):
        self.server_address = server_address
        self.client_id = str(uuid.uuid4())

    def queue_prompt(self, workflow: dict) -> str:
        data = json.dumps({"prompt": workflow, "client_id": self.client_id}).encode("utf-8")
        req = urllib.request.Request(f"http://{self.server_address}/prompt", data=data)
        return json.loads(urllib.request.urlopen(req).read())["prompt_id"]

    def wait_for_completion(self, prompt_id: str) -> None:
        # Block until the server reports that this prompt has finished executing
        ws = websocket.WebSocket()
        ws.connect(f"ws://{self.server_address}/ws?clientId={self.client_id}")
        try:
            while True:
                message = ws.recv()
                if not isinstance(message, str):
                    continue  # skip binary preview frames
                msg = json.loads(message)
                data = msg.get("data", {})
                if msg.get("type") == "executing" and data.get("node") is None \
                        and data.get("prompt_id") == prompt_id:
                    break  # a null node means the graph is done
        finally:
            ws.close()

    def get_image(self, filename: str, subfolder: str, folder_type: str) -> bytes:
        query = urllib.parse.urlencode({"filename": filename, "subfolder": subfolder, "type": folder_type})
        return urllib.request.urlopen(f"http://{self.server_address}/view?{query}").read()
```
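The `workflow` dict passed to `queue_prompt` is a ComfyUI API-format graph: nodes keyed by id, each with a `class_type` and `inputs` that reference other nodes as `[node_id, output_index]`. A minimal SDXL text-to-image graph can be built like this (the checkpoint filename is an assumption — use whatever is in your `models/checkpoints` directory):

```python
def build_txt2img_workflow(prompt: str, negative: str = "", seed: int = 0,
                           width: int = 1024, height: int = 1024) -> dict:
    """Minimal ComfyUI API-format workflow for SDXL text-to-image.

    Assumption: sd_xl_base_1.0.safetensors is present on the server.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",           # positive prompt
              "inputs": {"text": prompt, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",           # negative prompt
              "inputs": {"text": negative, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                         "latent_image": ["4", 0], "seed": seed, "steps": 25,
                         "cfg": 7.0, "sampler_name": "euler",
                         "scheduler": "normal", "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "gen"}},
    }
```

Usage: `prompt_id = client.queue_prompt(build_txt2img_workflow("a red fox, studio light"))`.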
Queue Processing and Scaling
```python
import asyncio
import json

import redis
from celery import Celery

app = Celery("image_gen", broker="redis://localhost:6379/0")
redis_client = redis.Redis(host="localhost", port=6379, db=0)

@app.task(bind=True, max_retries=3)
def generate_image_task(self, job_id: str, prompt: str, settings: dict):
    try:
        # Pop the routing key so the remaining settings can be passed through
        model = settings.pop("model", "dalle")
        if model == "dalle":
            image = asyncio.run(generate_image_dalle(prompt, **settings))
        elif model == "flux":
            image = asyncio.run(generate_image_flux(prompt, **settings))[0]
        else:
            raise ValueError(f"Unknown model: {model}")
        url = upload_to_storage(job_id, image)  # save to S3/MinIO
        # Notify subscribers that the job is done
        redis_client.publish(f"job:{job_id}", json.dumps({"status": "done", "url": url}))
        return url
    except Exception as exc:
        raise self.retry(exc=exc, countdown=30)
```
Typical Architectural Patterns
- Synchronous API (up to 10 RPS): FastAPI → Replicate/OpenAI API → S3. Response time 3–15 s.
- Async with queue (10–100 RPS): FastAPI → Redis → Celery workers → GPU server → S3. Webhook on completion.
- Self-hosted GPU cluster (100+ RPS): ComfyUI + Ray Serve + multiple GPU nodes behind a load balancer.
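The RPS thresholds above follow from simple capacity arithmetic (Little's law: concurrent jobs = arrival rate × service time). The sketch below is a back-of-the-envelope sizing helper, assuming one image in flight per GPU worker unless batching is used.

```python
import math

def workers_needed(target_rps: float, seconds_per_image: float,
                   images_per_worker: int = 1) -> int:
    """Little's law sizing: concurrent jobs = arrival rate x service time."""
    concurrent = target_rps * seconds_per_image
    return math.ceil(concurrent / images_per_worker)

# e.g. sustaining 100 RPS at ~4 s per SDXL image, one image per GPU at a time:
print(workers_needed(100, 4.0))  # → 400 GPU workers (batching reduces this)
```

This is why the 100+ RPS tier requires a cluster: batching, lower step counts, and distillation-style fast models all attack the `seconds_per_image` term.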
Timeline: REST API with generation via DALL-E/FLUX — 1 week. Self-hosted SDXL with queue — 2–3 weeks. Full platform with customization and billing — 2–3 months.







