Developing a multi-tenant AI platform (SaaS) for B2B clients
Multi-tenant AI SaaS is an architectural pattern where a single codebase serves multiple B2B clients (tenants) with data isolation, customization, and independent billing. The right architecture determines the scalability and security of the platform.
Multi-tenancy models
Shared Database, Shared Schema (most cost-effective): All tenants in a single database, differentiated by the tenant_id column. Simplicity of operations, complexity of ensuring isolation.
Shared Database, Separate Schema: Each tenant receives a separate PostgreSQL schema. Better isolation and the ability to create per-tenant indexes.
Separate Database per Tenant: Each tenant gets its own database. Maximum isolation, but operational complexity becomes significant at 1,000+ tenants.
For an AI platform with ML models, the optimal solution is: Shared DB + Separate Schema for transactional data + separate S3 prefixes for ML artifacts of each tenant.
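This split can be captured in a small naming convention. A minimal sketch; `tenant_schema` and `artifact_prefix` are hypothetical helpers, not part of any library:

```python
import uuid

def tenant_schema(tenant_id: str) -> str:
    """PostgreSQL schema name for a tenant (hyphens are not valid
    in unquoted identifiers, so use the 32-char hex form)."""
    return f"tenant_{uuid.UUID(tenant_id).hex}"

def artifact_prefix(tenant_id: str, model_name: str) -> str:
    """S3 prefix under which a tenant's ML artifacts are stored."""
    return f"ml-artifacts/{tenant_id}/{model_name}/"
```

Deriving both names from the tenant UUID keeps the mapping deterministic, so no extra lookup table is needed when routing queries or artifact reads.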
Row-Level Security in PostgreSQL
-- Enable RLS to isolate tenant data
ALTER TABLE predictions ENABLE ROW LEVEL SECURITY;
-- Policy: each tenant sees only its own rows
CREATE POLICY tenant_isolation ON predictions
    USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
-- Set the tenant context for the connection
-- (SET LOCAL applies until the end of the current transaction)
SET LOCAL app.current_tenant_id = '550e8400-e29b-41d4-a716-446655440000';
# FastAPI middleware that sets the tenant context
@app.middleware("http")
async def tenant_context_middleware(request: Request, call_next):
    tenant_id = await resolve_tenant(request)
    request.state.tenant_id = tenant_id
    async with db.acquire() as conn:
        async with conn.transaction():
            # Set the tenant context for all queries on this connection.
            # set_config() with a bound parameter avoids SQL injection;
            # the third argument (is_local = true) scopes the setting
            # to the current transaction, matching SET LOCAL.
            await conn.execute(
                "SELECT set_config('app.current_tenant_id', $1, true)",
                str(tenant_id),
            )
            request.state.db_conn = conn
            response = await call_next(request)
    return response
Tenant-specific AI configuration
from dataclasses import dataclass, field

@dataclass
class TenantAIConfig:
    tenant_id: str
    # Models the tenant is allowed to call
    allowed_models: list[str] = field(default_factory=list)
    # Custom system prompt (branding)
    system_prompt_override: str | None = None
    # Usage limits
    monthly_token_limit: int = 1_000_000
    concurrent_request_limit: int = 10
    # The tenant's fine-tuned models
    custom_models: list[str] = field(default_factory=list)
    # Data retention
    prediction_log_retention_days: int = 90
    # Compliance settings
    pii_detection_enabled: bool = True
    audit_log_enabled: bool = True
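The `monthly_token_limit` field implies a per-tenant usage counter that is checked before each request. A minimal sketch of that budget check; `UsageWindow` is a hypothetical name, and real counters would live in a shared store rather than in memory:

```python
from dataclasses import dataclass

@dataclass
class UsageWindow:
    """Token usage for one tenant within the current billing month."""
    tokens_used: int
    monthly_token_limit: int

    def remaining(self) -> int:
        return max(self.monthly_token_limit - self.tokens_used, 0)

    def allows(self, requested_tokens: int) -> bool:
        """True if the request fits within the remaining monthly budget."""
        return requested_tokens <= self.remaining()
```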
class TenantAwareInferenceService:
    async def predict(self, tenant_id: str, model_name: str,
                      inputs: dict) -> dict:
        config = await self.get_tenant_config(tenant_id)
        # Permission check
        if model_name not in config.allowed_models:
            raise PermissionError(f"Model '{model_name}' not allowed for this tenant")
        # Rate-limit check
        if not await self.rate_limiter.check(tenant_id, config.concurrent_request_limit):
            raise RateLimitError("Concurrent request limit exceeded")
        # Apply the tenant-specific system prompt
        if config.system_prompt_override and 'system' in inputs:
            inputs['system'] = config.system_prompt_override + "\n\n" + inputs['system']
        # PII detection before the inputs reach the model
        if config.pii_detection_enabled:
            inputs = await self.pii_detector.redact(inputs)
        result = await self.inference_engine.run(model_name, inputs)
        # Audit logging, isolated per tenant
        await self.audit_log.record(tenant_id, model_name, inputs, result)
        return result
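The `rate_limiter.check` call above enforces `concurrent_request_limit`. An in-process sketch using a per-tenant counter; a multi-instance deployment would keep these counters in Redis instead, and `ConcurrencyLimiter` is a hypothetical class:

```python
import asyncio
from collections import defaultdict

class ConcurrencyLimiter:
    """Per-tenant limit on in-flight requests (single-process sketch)."""

    def __init__(self) -> None:
        self._active: dict[str, int] = defaultdict(int)
        self._lock = asyncio.Lock()

    async def check(self, tenant_id: str, limit: int) -> bool:
        """Acquire a slot; returns False if the tenant is at its limit."""
        async with self._lock:
            if self._active[tenant_id] >= limit:
                return False
            self._active[tenant_id] += 1
            return True

    async def release(self, tenant_id: str) -> None:
        """Free a slot once the request finishes (call from a finally block)."""
        async with self._lock:
            self._active[tenant_id] = max(self._active[tenant_id] - 1, 0)
```

Note that `predict` as written never releases the slot; a production version would wrap the inference call in try/finally so slots are freed even on errors.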
Onboarding a new tenant
class TenantOnboardingService:
    async def provision_tenant(self, signup_data: dict) -> tuple[Tenant, str]:
        tenant = await self.db.create_tenant(signup_data)
        # Create an isolated data schema
        await self.db_manager.create_schema(tenant.id)
        await self.db_manager.run_migrations(tenant.id)
        # Create S3 prefixes
        await self.storage.create_tenant_prefix(tenant.id)
        # Default configuration
        await self.config_store.create_default_config(tenant.id)
        # Issue API keys
        api_key = await self.auth.create_api_key(tenant.id, scope="all")
        # Welcome email with instructions
        await self.email.send_welcome(tenant, api_key)
        return tenant, api_key
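The `create_schema` step boils down to a handful of DDL statements derived from the tenant id. A sketch of what such a generator might look like; `provisioning_statements` and the table list are illustrative, not the actual migration set:

```python
import uuid

def provisioning_statements(tenant_id: str) -> list[str]:
    """DDL to provision an isolated schema for a new tenant.
    A real system would run its full migration suite instead."""
    schema = f"tenant_{uuid.UUID(tenant_id).hex}"
    return [
        f'CREATE SCHEMA IF NOT EXISTS "{schema}"',
        f'CREATE TABLE IF NOT EXISTS "{schema}".predictions ('
        "id UUID PRIMARY KEY, "
        "tenant_id UUID NOT NULL, "
        "created_at TIMESTAMPTZ NOT NULL DEFAULT now())",
    ]
```

Generating the schema name from the UUID (rather than from user-supplied input) also sidesteps identifier-injection issues in the DDL.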
Per-tenant isolation of ML models
Tenants can fine-tune base models using their own data. Each fine-tuned model is isolated and accessible only to the tenant who created it:
# Fine-tuning endpoint
@app.post("/v1/fine-tuning/jobs")
async def create_fine_tuning_job(request: FineTuningRequest,
                                 tenant_id: str = Depends(get_tenant)):
    # The tenant's data → the tenant's fine-tuned model
    job = await fine_tuning_service.create_job(
        tenant_id=tenant_id,
        base_model=request.model,
        training_file=request.training_file,  # File in the tenant-specific S3 prefix
    )
    return job
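Once a tenant has fine-tuned models, inference requests must enforce ownership: base models are shared, while a fine-tuned model resolves only for the tenant that created it. A sketch of that lookup; `resolve_model` and its registry arguments are hypothetical:

```python
def resolve_model(tenant_id: str, model_name: str,
                  base_models: set[str],
                  custom_model_owners: dict[str, str]) -> str:
    """Return the model id to run, enforcing per-tenant model isolation."""
    if model_name in base_models:
        return model_name  # shared base model: available to everyone
    owner = custom_model_owners.get(model_name)
    if owner != tenant_id:
        # Deliberately the same error for "doesn't exist" and "not yours",
        # so tenants cannot probe for other tenants' model names.
        raise PermissionError(f"Model '{model_name}' not found for this tenant")
    return model_name
```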
Multi-tenant AI SaaS development timeline: 3-5 months depending on the number of tenants, isolation requirements, and the complexity of the AI functionality.