FLAML (Microsoft) Integration for Fast AutoML
FLAML (Fast and Lightweight AutoML) from Microsoft Research is an AutoML library with a focus on minimal search time while maintaining high quality. It is used in Azure AutoML and Microsoft Fabric.
The key idea of FLAML
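Cost-efficient search via CFO (Cost-Frugal Optimization), optionally blended with Bayesian optimization (BlendSearch):
- Starts with cheap configurations and a small sample of the data, and moves to larger samples only as the search progresses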
- Doesn't waste time on obviously bad configurations
- Adaptively redistributes the budget between algorithms
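The three points above can be illustrated with a toy search loop. The sketch below is a generic successive-halving-style procedure in plain Python, not FLAML's actual CFO implementation. `noisy_score` and `frugal_search` are hypothetical names, and the scores are simulated rather than coming from real models.

```python
import random


def noisy_score(true_quality: float, sample_size: int, rng: random.Random) -> float:
    """Simulate a validation score: the noise shrinks as the sample grows."""
    noise = rng.gauss(0.0, 1.0) / (sample_size ** 0.5)
    return true_quality + noise


def frugal_search(qualities: dict, start_size: int = 100, max_size: int = 1600, seed: int = 0):
    """Keep the better half of the candidates each round and double the sample size."""
    rng = random.Random(seed)
    candidates = list(qualities)
    size = start_size
    evaluations = []  # (name, sample_size) pairs, i.e. where the budget went
    while len(candidates) > 1 and size <= max_size:
        scores = {}
        for name in candidates:
            scores[name] = noisy_score(qualities[name], size, rng)
            evaluations.append((name, size))
        ranked = sorted(candidates, key=scores.get, reverse=True)
        candidates = ranked[: max(1, len(ranked) // 2)]  # drop the weaker half early
        size *= 2  # spend more data only on the survivors
    return candidates[0], evaluations
```

With four candidates, this loop runs four evaluations on 100 samples and two on 200 samples, instead of evaluating every candidate on the full data.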
Basic examples
Classification:
from flaml import AutoML

# X_train, y_train: training features and binary labels prepared elsewhere
automl = AutoML()
automl.fit(
    X_train, y_train,
    task='classification',
    time_budget=120,  # seconds
    metric='roc_auc',
    n_jobs=-1,
    eval_method='cv',
    n_splits=5,
    estimator_list=['lgbm', 'xgboost', 'rf', 'extra_tree'],
)
print(f'Best model: {automl.best_estimator}')
# FLAML minimizes a loss; for metric='roc_auc' the loss is 1 - AUC
print(f'Best AUC: {1 - automl.best_loss:.4f}')
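A common pattern is to keep the `fit` arguments in a dictionary so the same settings can be reused and logged. The sketch below assumes FLAML is installed. `make_settings` and `run_automl` are hypothetical helper names, and `log_file_name` and `seed` are optional `fit` arguments added here for illustration; they are not part of the example above.

```python
def make_settings(time_budget: int = 120, metric: str = "roc_auc") -> dict:
    """Collect AutoML.fit keyword arguments in one place."""
    return {
        "task": "classification",
        "time_budget": time_budget,  # seconds
        "metric": metric,
        "n_jobs": -1,
        "eval_method": "cv",
        "n_splits": 5,
        "estimator_list": ["lgbm", "xgboost", "rf", "extra_tree"],
        "log_file_name": "flaml_search.log",  # assumption: any writable path
        "seed": 42,  # assumption: fixed seed for reproducibility
    }


def run_automl(X_train, y_train, X_test, settings: dict):
    """Fit FLAML with the given settings and return the model with test probabilities."""
    from flaml import AutoML  # imported lazily so make_settings works without FLAML

    automl = AutoML()
    automl.fit(X_train, y_train, **settings)
    # predict_proba returns one column per class; column 1 is the positive class
    y_proba = automl.predict_proba(X_test)[:, 1]
    return automl, y_proba
```

Calling `run_automl(X_train, y_train, X_test, make_settings(time_budget=60))` would then run a one-minute search with the same learners as above.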
Time series with FLAML:
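from flaml import AutoML

# For ts_forecast, the first column of X_train must be the timestamp
automl = AutoML()
automl.fit(
    X_train, y_train,
    task='ts_forecast',
    time_budget=300,  # seconds
    period=7,  # forecast horizon (7 days)
    eval_method='holdout',
    estimator_list=['prophet', 'arima', 'lgbm', 'xgboost'],
)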
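For forecasting, the held-out data should be the most recent observations rather than a random sample. Below is a minimal sketch of such a split in plain Python, assuming daily data. `chronological_split` is a hypothetical helper and the series values are made up. With a pandas DataFrame the same idea is `df.iloc[:-horizon]` and `df.iloc[-horizon:]`.

```python
from datetime import date, timedelta


def chronological_split(rows, horizon: int):
    """Split (timestamp, value) rows so the last `horizon` points form the test set."""
    if horizon <= 0 or horizon >= len(rows):
        raise ValueError("horizon must be between 1 and len(rows) - 1")
    ordered = sorted(rows, key=lambda row: row[0])  # oldest first
    return ordered[:-horizon], ordered[-horizon:]


# Toy daily series: 30 days of made-up values
start = date(2024, 1, 1)
series = [(start + timedelta(days=i), float(i % 7)) for i in range(30)]
train, test = chronological_split(series, horizon=7)
```

Here `horizon` plays the same role as `period=7` in the FLAML call: the model is asked to predict as many steps ahead as are held out.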
Integration with MLflow
from flaml import AutoML
from sklearn.metrics import roc_auc_score
import mlflow

def flaml_with_mlflow(X_train, y_train, X_test, y_test, run_name: str):
    with mlflow.start_run(run_name=run_name):
        automl = AutoML()
        automl.fit(X_train, y_train, task='classification', time_budget=300, metric='roc_auc')
        mlflow.log_param('best_estimator', automl.best_estimator)
        mlflow.log_param('best_config', str(automl.best_config))
        # best_loss is 1 - AUC for metric='roc_auc'
        mlflow.log_metric('val_roc_auc', 1 - automl.best_loss)
        # Evaluate on the held-out test set
        y_proba = automl.predict_proba(X_test)[:, 1]
        test_auc = roc_auc_score(y_test, y_proba)
        mlflow.log_metric('test_roc_auc', test_auc)
        mlflow.sklearn.log_model(automl, 'flaml_model')
    return automl
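Logging `best_config` as a single string makes runs hard to compare in the MLflow UI. One possible refinement is to log each hyperparameter as its own parameter. `flatten_config` and `log_best_config` are hypothetical helpers, and the `cfg` prefix is an arbitrary choice.

```python
def flatten_config(config: dict, prefix: str = "cfg") -> dict:
    """Turn a (possibly nested) config dict into flat 'prefix.key' -> str(value) pairs."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            flat.update(flatten_config(value, name))
        else:
            flat[name] = str(value)
    return flat


def log_best_config(automl) -> None:
    """Log each tuned hyperparameter as its own MLflow parameter."""
    import mlflow  # imported lazily; intended to be called inside mlflow.start_run()

    mlflow.log_params(flatten_config(automl.best_config))
```

Inside the `with mlflow.start_run(...)` block, `log_best_config(automl)` could replace or complement the `best_config` string parameter.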
FLAML optional extras: flaml[nlp] adds hyperparameter search for Transformer fine-tuning (HuggingFace). flaml[blendsearch] installs what is needed for BlendSearch, a more advanced search strategy.
Timeframe: Basic FLAML integration – 1 day. MLflow tracking, custom estimators, parallel training, Azure AutoML pipeline – 1-2 weeks.







