AI-based sustainability and ESG management system
ESG reporting is moving from PR documents to audited data: the CSRD (Corporate Sustainability Reporting Directive) requires double materiality assessments starting in 2024, and the SEC climate disclosure rules require assured Scope 1/2 emissions data. A company with 200 suppliers and 15 production sites cannot realistically collect and consolidate ESG data by hand, without automation.
Automating ESG data collection
The core problem is that the data is scattered across some 40 sources: power-system SCADA, ERP (SAP, Oracle), supplier portals, payment systems (used to calculate travel emissions), and utility bills. None of them share a standard format.
ETL pipeline for ESG
Apache Airflow handles orchestration: each source gets its own DAG whose output is transformed into a single ESG data schema (GRI- or ESRS-aligned). Storage: PostgreSQL or Snowflake with an ESG data model (entity: facility, activity_type, period, value, unit, source, confidence_score).
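A minimal sketch of the unified record that every source DAG emits, using the field names from the data model above. The unit whitelist and validation rules are illustrative assumptions, not part of the original design:

```python
from dataclasses import dataclass

VALID_UNITS = {"kWh", "MWh", "tCO2e", "m3", "kg"}  # illustrative subset

@dataclass(frozen=True)
class ESGRecord:
    facility: str
    activity_type: str       # e.g. "electricity_purchased", "natural_gas"
    period: str              # reporting period, e.g. "2024-Q1"
    value: float
    unit: str
    source: str              # originating system, e.g. "SAP", "SCADA"
    confidence_score: float  # 0..1, set by the ingestion pipeline

    def __post_init__(self):
        # Reject records that would poison downstream aggregation
        if self.unit not in VALID_UNITS:
            raise ValueError(f"unknown unit: {self.unit}")
        if not 0.0 <= self.confidence_score <= 1.0:
            raise ValueError("confidence_score must be in [0, 1]")

rec = ESGRecord("Plant-A", "electricity_purchased", "2024-Q1",
                125_000.0, "kWh", "SCADA", 0.95)
```

Validating at the record boundary means every downstream consumer (emissions calculation, reporting) can trust units and confidence scores without re-checking them.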
LLM component: automatic classification of utility bills and invoices into ESG categories (Scope 1/2/3 emissions, water, waste). GPT-4o or Claude 3.5 Sonnet with structured output (JSON schema) achieves 0.91 precision on a test dataset of 3,000 documents, versus 0.67 for a rule-based classifier.
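A sketch of the validation layer that sits between the LLM's structured output and the pipeline. The schema fields (`category`, `confidence`) are assumptions for illustration; the actual API call to the model is omitted:

```python
import json

# Categories the LLM is constrained to via the JSON schema (illustrative names)
ALLOWED_CATEGORIES = {"scope1", "scope2", "scope3", "water", "waste"}

def parse_classification(raw: str) -> dict:
    """Validate the model's JSON output; reject anything off-schema
    before it enters the ESG database."""
    doc = json.loads(raw)
    if doc.get("category") not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {doc.get('category')}")
    conf = doc.get("confidence")
    if not isinstance(conf, (int, float)) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a number in [0, 1]")
    return doc

out = parse_classification('{"category": "scope2", "confidence": 0.93}')
```

Even with schema-constrained decoding, a hard validation step is cheap insurance: a malformed or off-schema response fails loudly instead of silently entering the emissions ledger.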
Emissions calculation
Scope 1: direct combustion, calculated as activity data × emission factor from the IPCC/DEFRA databases. Scope 2: purchased electricity × a location-based or market-based factor (for RE100 compliance). Scope 3: 15 categories, of which category 1 (purchased goods) and category 11 (use of sold products) are the most labor-intensive.
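The Scope 1/2 formulas above reduce to simple multiplications once the factors are sourced. A sketch, with made-up factor values (not official IPCC/DEFRA numbers):

```python
def scope1_emissions(activity, ef):
    """Direct combustion: activity data (e.g. m3 of gas) x emission factor
    (tCO2e per activity unit)."""
    return activity * ef

def scope2_emissions(kwh, location_factor, market_factor=None):
    """Purchased electricity: location-based grid factor by default,
    market-based factor if a supplier contract (e.g. a renewable PPA) exists."""
    factor = market_factor if market_factor is not None else location_factor
    return kwh * factor

# Illustrative factors only
gas = scope1_emissions(10_000, 0.002)            # natural gas combustion
power = scope2_emissions(500_000, 0.0004)        # location-based
green = scope2_emissions(500_000, 0.0004, 0.0)   # 100% renewable PPA
```

The market-based path is what makes RE100 claims auditable: the same kWh figure yields different Scope 2 totals depending on whether a contractual zero-carbon factor applies.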
ML task for Scope 3 Cat 1: spend-based estimation (supplier costs × emission intensity according to EEIO tables) + physical data where available. The hybrid model reduces the estimation uncertainty from ±40% (pure spend-based) to ±18%.
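The hybrid logic can be sketched as a per-supplier fallback: use reported physical data where it exists, otherwise spend × EEIO intensity. All names and numbers below are hypothetical:

```python
def scope3_cat1_estimate(spend_by_supplier, physical_by_supplier, eeio_intensity):
    """Hybrid Scope 3 Category 1 estimate (tCO2e): supplier-reported physical
    data wins; spend-based EEIO estimation is the fallback."""
    total = 0.0
    for supplier, spend in spend_by_supplier.items():
        if supplier in physical_by_supplier:
            total += physical_by_supplier[supplier]   # reported actuals
        else:
            total += spend * eeio_intensity[supplier] # tCO2e per currency unit
    return total

est = scope3_cat1_estimate(
    {"steel_co": 1_000_000, "logistics_co": 200_000},  # annual spend
    {"steel_co": 1_800.0},                             # CDP-reported tCO2e
    {"logistics_co": 0.0005},                          # illustrative intensity
)
```

The uncertainty reduction comes from exactly this substitution: every supplier moved from the spend-based branch to the physical-data branch replaces a sector average with a measured value.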
Monitoring and anomalies
Energy Management System (EnMS): energy consumption time series with a 15-minute resolution. Prophet or N-BEATS for baseline consumption forecasting. A deviation of >2σ from the forecast during working hours is considered an anomaly (e.g., leak, suboptimal equipment operation, open warehouse doors). At a manufacturing facility with 1,200 employees, the system found 14 anomalies over 3 months, saving $180,000/year in energy costs.
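The >2σ rule can be sketched without the forecasting model itself: given a baseline forecast (from Prophet or N-BEATS) and actual readings, flag intervals whose residual exceeds the threshold. Estimating σ from the shown residuals is a simplification; in practice it comes from historical residuals:

```python
import statistics

def flag_anomalies(actual, forecast, threshold_sigma=2.0):
    """Return indices of intervals where consumption deviates more than
    threshold_sigma standard deviations from the baseline forecast."""
    residuals = [a - f for a, f in zip(actual, forecast)]
    sigma = statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals)
            if abs(r) > threshold_sigma * sigma]

# Eight 15-minute intervals; a consumption spike at interval 5
forecast = [100.0] * 8
actual = [101, 99, 100, 102, 98, 160, 101, 99]
anoms = flag_anomalies(actual, forecast)
```

Each flagged interval then goes to an operator with context (facility, meter, time of day), since the same deviation can mean a leak, an open warehouse door, or a legitimate production change.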
Scope 3 Category 4: Upstream transportation
Integration with TMS (Transport Management System): each shipment → distance × load factor × emission factor (transport type, fuel). ML route optimizer with ESG constraint: CO2 budget per shipment as a hard constraint, cost as an objective.
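The per-shipment formula can be sketched as follows. The emission factors and the load-factor treatment (dividing to penalize partial loads and empty running) are illustrative assumptions, not official figures:

```python
TRANSPORT_EF = {          # illustrative gCO2e per tonne-km
    "road_diesel": 62.0,
    "rail": 22.0,
    "sea": 8.0,
}

def shipment_emissions(distance_km, cargo_t, load_factor, mode):
    """Per-shipment emissions in kgCO2e: distance x cargo, adjusted for
    vehicle utilisation, x mode-specific factor."""
    tonne_km = distance_km * cargo_t / load_factor  # low utilisation costs more
    return tonne_km * TRANSPORT_EF[mode] / 1000.0

kg = shipment_emissions(800, 12, 0.8, "road_diesel")
```

With this function in place, the route optimizer's CO2 budget becomes a hard constraint: any candidate route whose summed `shipment_emissions` exceeds the budget is infeasible, and cost is minimized only over the remaining routes.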
ESG scoring of suppliers
Supply chain sustainability rating: 200+ suppliers, data from CDP questionnaires, Ecovadis, open databases (Refinitiv, MSCI ESG). The XGBoost classifier predicts the likelihood of an ESG incident at a supplier (regulatory fine, scandal, environmental accident) over a 12-month horizon. AUROC 0.78 on holdout.
Features: CDP disclosure score, industry benchmark, GDELT news sentiment (negative mentions), geographical risk index (Climate Risk Index), company size, country.
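The feature list above might be assembled into a model input like this. The registry field names are assumptions about the internal supplier database, not a real schema:

```python
def supplier_features(s: dict) -> list:
    """Build the numeric feature vector for the ESG-incident classifier
    from a supplier registry record (field names are hypothetical)."""
    return [
        s["cdp_disclosure_score"],           # CDP score, 0..100
        s["industry_benchmark_percentile"],  # 0..1 within sector
        s["gdelt_negative_mentions_90d"],    # negative news count, 90 days
        s["climate_risk_index"],             # country-level climate risk
        s["log_revenue"],                    # company-size proxy
        s["country_governance_score"],       # 0..1 geographical risk input
    ]

x = supplier_features({
    "cdp_disclosure_score": 72,
    "industry_benchmark_percentile": 0.4,
    "gdelt_negative_mentions_90d": 3,
    "climate_risk_index": 54.2,
    "log_revenue": 8.1,
    "country_governance_score": 0.7,
})
```

Keeping feature assembly in one explicit function matters for auditability: when a supplier disputes its rating, every model input can be traced back to a named source.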
NLP news monitoring: RSS + NewsAPI → BERT-based sentiment classifier for monitoring ESG risks in the news feed. Named Entity Recognition (NER) links mentions to specific providers from the registry.
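The entity-linking step can be sketched with stdlib fuzzy matching. A production linker would add alias tables and legal-form stripping; this is a minimal illustration with hypothetical company names:

```python
import difflib

def link_mentions(mentions, registry):
    """Link NER-extracted company mentions to suppliers in the registry
    via case-insensitive fuzzy matching."""
    lowered = [r.lower() for r in registry]
    links = {}
    for m in mentions:
        match = difflib.get_close_matches(m.lower(), lowered, n=1, cutoff=0.8)
        if match:
            # recover the original-cased registry entry
            links[m] = next(r for r in registry if r.lower() == match[0])
    return links

registry = ["Acme Steel GmbH", "Borealis Logistics"]
found = link_mentions(["acme steel gmbh", "Unknown Corp"], registry)
```

Unlinked mentions are dropped rather than guessed: a negative article wrongly attached to the wrong supplier is worse for the scoring model than a missed mention.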
Automation of ESG reporting
Report generation
LLM (GPT-4o, Claude) + RAG for internal ESG data: generating narrative sections of GRI/ESRS reports from structured data. Report template + tables → 80% of the text is generated automatically, verified and supplemented by an expert.
An important caveat: the LLM must not hallucinate numbers. Architecture: every numerical claim is linked to a specific database record via a citation mechanism; if the LLM cannot cite a source for a number, the number is omitted from the text.
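A simplified sketch of that citation gate: numeric claims carry a citation tag, and any sentence whose tag does not resolve to a matching database record is dropped rather than risked. The `[ref:ID]` tag format is an assumption for illustration:

```python
import re

def cite_checked(text: str, db: dict) -> str:
    """Keep only sentences whose numeric claim [ref:ID] resolves to a
    database record with the same value; drop unverifiable sentences."""
    kept = []
    for sentence in text.split(". "):
        m = re.search(r"([\d.]+)\s*\[ref:(\w+)\]", sentence)
        if m is None:
            kept.append(sentence)  # no numeric claim: pass through
        elif db.get(m.group(2)) == float(m.group(1)):
            kept.append(sentence)  # number matches its source record
        # else: drop the sentence rather than publish an unverified figure
    return ". ".join(kept)

db = {"e42": 1250.0}
draft = ("Scope 1 emissions were 1250 [ref:e42] tCO2e. "
         "They fell 900 [ref:e99] tCO2e")
out = cite_checked(draft, db)
```

Dropping the sentence entirely, rather than just the number, keeps the expert reviewer's job simple: everything that survives the gate is backed by a record.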
Double Materiality Assessment (CSRD)
Materiality matrix: 2 axes – financial materiality (the impact of ESG factors on a company's finances) and impact materiality (the company's impact on society/the environment). ML component: clustering and prioritizing ESG topics based on stakeholder survey data and industry benchmarks.
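The two-axis matrix reduces each ESG topic to a quadrant decision. A minimal sketch, assuming scores normalized to 0..1 and an illustrative threshold (in practice the cut-off is set per CSRD guidance and stakeholder input):

```python
def materiality_quadrant(financial: float, impact: float, threshold: float = 0.5):
    """Place an ESG topic on the double-materiality matrix.
    financial: impact of the topic on company finances, 0..1
    impact:    company's impact on society/environment, 0..1"""
    if financial >= threshold and impact >= threshold:
        return "material (report under CSRD)"
    if financial >= threshold:
        return "financially material"
    if impact >= threshold:
        return "impact material"
    return "not material"

q = materiality_quadrant(0.8, 0.7)
```

Under CSRD's double-materiality logic a topic is reportable if it crosses either threshold, so only the fourth branch exempts a topic from disclosure.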
Stack
| Layer | Technologies |
|---|---|
| Data orchestration | Apache Airflow, dbt |
| Storage | Snowflake, PostgreSQL |
| Emissions calculation | Python, IPCC/DEFRA factors, pyCO2SYS |
| ML models | XGBoost, PyTorch, Hugging Face |
| LLM for reports | GPT-4o, Claude 3.5 (Azure/Anthropic API) |
| Monitoring | Grafana, Apache Flink |
Development period: 4–10 months depending on the number of data sources and the requirements for the coverage of reporting standards.