JupyterHub Setup for AI ML Team Collaboration

We design and deploy artificial intelligence systems, from prototypes to production-ready solutions. Our team combines expertise in machine learning, data engineering, and MLOps to make AI work not just in the lab, but in real business settings.

Setting up JupyterHub for Teamwork with AI/ML

JupyterHub is a multi-user Jupyter server where each team member gets an isolated environment with shared access to data and GPUs. This solves a common problem for ML teams: "Everything works locally, but it doesn't reproduce on the server."

Installation on Kubernetes (Zero to JupyterHub)

# Add the Helm repository
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update

# config.yaml
cat > config.yaml << 'EOF'
hub:
  config:
    JupyterHub:
      authenticator_class: github
    Authenticator:
      admin_users:
        - admin
    GitHubOAuthenticator:
      client_id: "your-github-client-id"
      client_secret: "your-github-client-secret"
      oauth_callback_url: "https://jupyter.company.com/hub/oauth_callback"
      allowed_organizations:
        - your-github-org

singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: "python-3.11"
  profileList:
    - display_name: "CPU Standard (4 CPU, 16GB RAM)"
      description: "For EDA and light training"
      default: true
      kubespawner_override:
        cpu_limit: 4
        mem_limit: "16G"
    - display_name: "GPU Instance (1x A100 40GB)"
      description: "For model training"
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "1"
    - display_name: "GPU Large (2x A100 80GB)"
      kubespawner_override:
        extra_resource_limits:
          nvidia.com/gpu: "2"
  storage:
    capacity: 50Gi
    homeMountPath: /home/jovyan
  # Shared dataset storage (read-only for users)
  extraVolumes:
    - name: shared-datasets
      persistentVolumeClaim:
        claimName: shared-datasets-pvc
        readOnly: true
  extraVolumeMounts:
    - name: shared-datasets
      mountPath: /data/shared
      readOnly: true
EOF

helm install jupyterhub jupyterhub/jupyterhub \
  --namespace jhub --create-namespace \
  --values config.yaml

Custom Docker images for ML

FROM jupyter/datascience-notebook:python-3.11

USER root
RUN apt-get update && apt-get install -y \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*

USER ${NB_UID}

# ML dependencies
RUN pip install --no-cache-dir \
    torch==2.2.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 \
    transformers==4.38.0 \
    datasets \
    accelerate \
    peft \
    mlflow==2.11.0 \
    dvc[s3] \
    great_expectations \
    lightgbm xgboost catboost \
    optuna \
    shap \
    wandb

# MLflow tracking server URL
ENV MLFLOW_TRACKING_URI=http://mlflow.internal:5000

# DVC remote config
# --chown keeps the config writable by the notebook user, not root
COPY --chown=${NB_UID}:${NB_GID} dvc_config /home/jovyan/.dvc/config
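Inside a running notebook, the environment variable and the shared mount baked in above can be resolved with a few lines of standard-library Python. This is a sketch: the `dataset_path` helper and the `SHARED_DATA_DIR` override are hypothetical conveniences, not part of JupyterHub or MLflow.

```python
import os
from pathlib import Path

# Defaults match the Dockerfile and Helm config above; both can be
# overridden via the environment (SHARED_DATA_DIR is a hypothetical knob).
SHARED_DATA = Path(os.environ.get("SHARED_DATA_DIR", "/data/shared"))
MLFLOW_URI = os.environ.get("MLFLOW_TRACKING_URI", "http://mlflow.internal:5000")

def dataset_path(name: str) -> Path:
    """Return the location of a named dataset under the read-only shared mount."""
    return SHARED_DATA / name

print(dataset_path("reviews-v3.parquet"))
print(MLFLOW_URI)
```

Centralizing path resolution in one helper means notebooks never hard-code mount points, so the same code runs unchanged if the shared volume is remounted elsewhere.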

Resource management

The ResourceQuota for a Kubernetes namespace limits the total consumption:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: jhub-quota
  namespace: jhub
spec:
  hard:
    requests.nvidia.com/gpu: "8"    # at most 8 GPUs requested at once
    limits.memory: "512Gi"
    requests.cpu: "64"

PriorityClass for GPU workloads: research notebooks run at low priority so that high-priority production inference can preempt them when GPUs are scarce.
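One way to express that split is a pair of PriorityClass objects (a sketch; the class names and values are illustrative, not prescribed by JupyterHub):

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: research-notebooks
value: 100                 # low: these pods can be preempted
description: "Interactive research sessions on shared GPUs"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: production-inference
value: 100000              # high: preempts research pods when GPUs are scarce
description: "Latency-sensitive model serving"
```

The wide gap between the two values leaves room to slot in intermediate classes later (e.g. scheduled retraining jobs) without renumbering.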

Integration with ML infrastructure

MLflow is automatically reachable from every notebook via the MLFLOW_TRACKING_URI environment variable baked into the image. DVC is pre-configured against the corporate remote storage. A shared folder with the latest dataset versions is mounted read-only at /data/shared. Git pre-commit hooks are installed globally to enforce a common code style.
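The globally installed hooks mentioned above might look like this in a shared `.pre-commit-config.yaml` (a sketch; the exact hook selection and pinned revisions are an assumption, not taken from the source):

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 24.2.0
    hooks:
      - id: black          # uniform code formatting
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort          # consistent import ordering
  - repo: https://github.com/kynan/nbstripout
    rev: 0.7.1
    hooks:
      - id: nbstripout     # strip notebook outputs before commit
```

Stripping notebook outputs is the key hook for a JupyterHub team: it keeps diffs reviewable and prevents large rendered cells from bloating the repository.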

Typical result: an ML team of 10+ people works in a unified environment without "works on my machine" issues, with shared access to GPU resources and centralized experiment tracking.