Setting Up JupyterHub for AI/ML Team Collaboration
JupyterHub is a multi-user Jupyter server where each team member gets an isolated environment with shared access to data and GPUs. This solves a common problem for ML teams: "Everything works locally, but it doesn't reproduce on the server."
Installation on Kubernetes (Zero to JupyterHub)
# Add the Helm repository
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update
# config.yaml
cat > config.yaml << 'EOF'
hub:
  config:
    JupyterHub:
      # Required so the hub uses the GitHub authenticator configured below
      authenticator_class: github
    Authenticator:
      admin_users:
        - admin
    GitHubOAuthenticator:
      client_id: "your-github-client-id"
      client_secret: "your-github-client-secret"
      oauth_callback_url: "https://jupyter.company.com/hub/oauth_callback"
      allowed_organizations:
        - your-github-org
      # read:org lets the hub check organization membership
      scope:
        - read:org
singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: "python-3.11"
  profileList:
    - display_name: "CPU Standard (4 CPU, 16GB RAM)"
      description: "For EDA and light training"
      default: true
      # Enforce the limits advertised in the display name
      kubespawner_override:
        cpu_limit: 4
        mem_limit: "16G"
    - display_name: "GPU Instance (1x A100 40GB)"
      description: "For model training"
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "1"
    - display_name: "GPU Large (2x A100 80GB)"
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "2"
  storage:
    capacity: 50Gi
    homeMountPath: /home/jovyan
    # Shared dataset storage (read-only for users).
    # In this chart, extraVolumes and extraVolumeMounts belong under singleuser.storage.
    extraVolumes:
      - name: shared-datasets
        persistentVolumeClaim:
          claimName: shared-datasets-pvc
          readOnly: true
    extraVolumeMounts:
      - name: shared-datasets
        mountPath: /data/shared
        readOnly: true
EOF
helm install jupyterhub jupyterhub/jupyterhub \
  --namespace jhub --create-namespace \
  --values config.yaml
Custom Docker images for ML
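FROM jupyter/datascience-notebook:python-3.11
USER root
# libgomp1 is the OpenMP runtime used by LightGBM
RUN apt-get update && apt-get install -y --no-install-recommends \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*
USER ${NB_UID}
# PyTorch (CUDA 12.1 wheels). --index-url replaces PyPI for the whole pip
# command, so PyTorch is installed in a separate step.
RUN pip install --no-cache-dir \
    torch==2.2.0 torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/cu121
# Remaining ML dependencies, from PyPI
RUN pip install --no-cache-dir \
    transformers==4.38.0 \
    datasets \
    accelerate \
    peft \
    mlflow==2.11.0 \
    "dvc[s3]" \
    great_expectations \
    lightgbm xgboost catboost \
    optuna \
    shap \
    wandb
# MLflow tracking server URL
ENV MLFLOW_TRACKING_URI=http://mlflow.internal:5000
# DVC remote config, installed system-wide: files baked into /home/jovyan are
# hidden once the per-user home volume is mounted there
COPY dvc_config /etc/xdg/dvc/config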
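For JupyterHub to spawn this image, the singleuser.image block in config.yaml has to point at the registry where the image is pushed. A minimal sketch follows; the registry path, tag, and pull-secret name are hypothetical placeholders, not part of the original setup:

```yaml
# config.yaml fragment; image name and tag are hypothetical placeholders
singleuser:
  image:
    name: registry.company.com/ml/jupyter-ml
    tag: "2024.03"
    # Hypothetical secret name; only needed when the registry is private
    pullSecrets:
      - registry-credentials
```

After editing config.yaml, running helm upgrade jupyterhub jupyterhub/jupyterhub --namespace jhub --values config.yaml should roll the change out. Pinning an explicit tag rather than latest keeps every user on the same set of dependencies.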
Resource management
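A ResourceQuota on the JupyterHub namespace caps total consumption across all user pods. Once a quota covers limits.memory or requests.cpu, Kubernetes rejects any pod in that namespace that does not declare the corresponding value:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: jhub-quota
  namespace: jhub
spec:
  hard:
    requests.nvidia.com/gpu: "8"  # At most 8 GPUs in use at once
    limits.memory: "512Gi"
    requests.cpu: "64"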
PriorityClass for GPU workloads: research notebooks run at low priority and production inference at high priority, so the scheduler can preempt research pods when inference needs a GPU.
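The two priority levels could be declared as follows. This is a sketch only: the class names and numeric values are illustrative assumptions, and the only real requirement is that the inference class has the larger value.

```yaml
# Names and values are illustrative assumptions
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: research-low
value: 1000
globalDefault: false
description: "Interactive research notebooks; may be preempted"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: inference-high
value: 100000
globalDefault: false
description: "Production inference; scheduled ahead of research pods"
```

A pod opts into a class through the priorityClassName field of its spec. PriorityClass objects are cluster-scoped, so they are created once rather than per namespace.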
Integration with ML infrastructure
MLflow is automatically accessible from all notebooks via an environment variable. DVC is configured with corporate remote storage. The shared dataset folder with the latest dataset versions is mounted read-only. Git pre-commit hooks are installed globally to standardize code.
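The Dockerfile bakes MLFLOW_TRACKING_URI into the image with ENV. A possible alternative, shown here as a sketch, sets the same variable through the chart's singleuser.extraEnv, so the tracking URL can change without an image rebuild:

```yaml
# config.yaml fragment; same URL as in the Dockerfile
singleuser:
  extraEnv:
    MLFLOW_TRACKING_URI: "http://mlflow.internal:5000"
```

Either way, MLflow client code in a notebook picks the tracking server up from the environment, so no URL needs to be hard-coded.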
Typical result: an ML team of 10+ people works in a unified environment without "works on my machine" issues, with shared access to GPU resources and centralized experiment tracking.







