Setting up MLflow for experiment tracking
MLflow Tracking is an open-source API and UI for logging ML experiments. Parameters, metrics, artifacts, and models are recorded in one place, so runs can be compared and reproduced.
Quick start
# Local server (for development)
pip install mlflow
mlflow server --host 0.0.0.0 --port 5000

# Production: PostgreSQL backend store and S3 for artifacts
mlflow server \
  --backend-store-uri postgresql://user:pass@localhost/mlflow \
  --default-artifact-root s3://my-mlops-bucket/mlflow \
  --host 0.0.0.0 --port 5000
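The production command above embeds the database password directly in the URI. One possible way to avoid hard-coding it is to assemble the arguments from environment-style settings, as in the sketch below. The function `build_server_command` and the `MLFLOW_DB_*`, `MLFLOW_ARTIFACT_ROOT`, and `MLFLOW_PORT` variable names are assumptions made for illustration; they are not defined by MLflow.

```python
import shlex
from urllib.parse import quote


def build_server_command(env: dict) -> list:
    """Assemble `mlflow server` arguments from environment-style settings.

    The variable names read here are hypothetical, chosen for this example.
    """
    # Percent-encode credentials so characters such as '@' or '/' in the
    # password do not break the connection URI.
    user = quote(env["MLFLOW_DB_USER"], safe="")
    password = quote(env["MLFLOW_DB_PASSWORD"], safe="")
    host = env.get("MLFLOW_DB_HOST", "localhost")
    backend_uri = f"postgresql://{user}:{password}@{host}/mlflow"
    return [
        "mlflow", "server",
        "--backend-store-uri", backend_uri,
        "--default-artifact-root", env["MLFLOW_ARTIFACT_ROOT"],
        "--host", "0.0.0.0",
        "--port", env.get("MLFLOW_PORT", "5000"),
    ]


example_env = {
    "MLFLOW_DB_USER": "user",
    "MLFLOW_DB_PASSWORD": "p@ss/word",
    "MLFLOW_ARTIFACT_ROOT": "s3://my-mlops-bucket/mlflow",
}
command = build_server_command(example_env)
print(shlex.join(command))
```

To start the server for real, the same list could be passed to `subprocess.run(build_server_command(dict(os.environ)), check=True)`.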
Experiment logging
import mlflow
import mlflow.sklearn

mlflow.set_tracking_uri("http://mlflow-server:5000")

# model, train_loader, val_loader, train_one_epoch, evaluate and feature_imp
# are assumed to be defined elsewhere in the training script.
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_params({"batch_size": 32, "epochs": 10, "optimizer": "adam"})

    # Log metrics per epoch
    for epoch in range(10):
        train_loss = train_one_epoch(model, train_loader)
        val_loss, val_acc = evaluate(model, val_loader)
        mlflow.log_metrics({"train_loss": train_loss, "val_loss": val_loss, "val_acc": val_acc}, step=epoch)

    # Final metrics
    mlflow.log_metric("test_f1", 0.924)

    # Artifacts
    mlflow.log_artifact("confusion_matrix.png")
    mlflow.log_dict({"feature_importance": feature_imp}, "artifacts/feature_importance.json")

    # Model: use the flavor that matches the model type
    # (for example, mlflow.pytorch.log_model for a PyTorch model)
    mlflow.sklearn.log_model(model, "model", registered_model_name="my-classifier")
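`mlflow.log_params` takes a flat mapping of names to values. When hyperparameters live in a nested config, one option is to flatten it into dotted keys before logging. The sketch below shows this; `flatten_params` and the sample `config` are hypothetical names introduced for illustration, not part of MLflow.

```python
from typing import Any, Dict, Mapping


def flatten_params(config: Mapping[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested config into dotted keys such as "optimizer.name"."""
    flat: Dict[str, Any] = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Recurse into nested sections, carrying the dotted prefix along.
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical nested training config.
config = {
    "batch_size": 32,
    "epochs": 10,
    "optimizer": {"name": "adam", "learning_rate": 0.01},
}
params = flatten_params(config)
print(params)

# Inside an active run, the flat dict can be passed straight to MLflow:
# with mlflow.start_run():
#     mlflow.log_params(params)
```

Dotted keys keep related parameters grouped when runs are sorted or filtered by parameter name.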
Autologging
import mlflow

# Automatic logging without manual logging calls
mlflow.autolog()  # enables autologging for all supported frameworks

# Or for a specific framework
mlflow.sklearn.autolog(log_models=True, log_input_examples=True)
mlflow.pytorch.autolog(log_every_n_epoch=1)  # applies to PyTorch Lightning training
mlflow.transformers.autolog()
Docker for production MLflow
services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.14.0
    ports: ["5000:5000"]
    environment:
      - MLFLOW_S3_ENDPOINT_URL=https://storage.yandexcloud.net
      - AWS_ACCESS_KEY_ID=${YC_ACCESS_KEY}
      - AWS_SECRET_ACCESS_KEY=${YC_SECRET_KEY}
    command: >
      mlflow server
      --backend-store-uri postgresql://mlflow:${DB_PASS}@postgres:5432/mlflow
      --default-artifact-root s3://mlops-bucket/mlflow
      --host 0.0.0.0
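After the container starts, training jobs may try to connect before the server is ready. A minimal sketch of a readiness check is shown below, assuming the tracking server answers on a `/health` endpoint (as recent MLflow server versions do). The function name `wait_for_mlflow` and its timeout defaults are assumptions chosen for this example.

```python
import time
import urllib.error
import urllib.request


def wait_for_mlflow(base_url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll <base_url>/health until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    health_url = base_url.rstrip("/") + "/health"
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(health_url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet; retry after a short pause
        time.sleep(interval_s)
    return False


# Example usage before starting a training job:
# if not wait_for_mlflow("http://localhost:5000"):
#     raise RuntimeError("MLflow tracking server did not become ready")
```

Polling from the client side keeps the compose file unchanged; a compose-level healthcheck would be an alternative design.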







