Business Situation
A Europe-based sustainability intelligence platform empowers investors to make informed decisions aligned with environmental, social, and governance (ESG) criteria by analyzing the sustainability performance of their portfolios. As part of its product evolution, the client aimed to:
- Train a custom Large Language Model (LLM) to enhance its ESG recommendation engine
- Standardize data annotation across more than 400 key performance indicators (KPIs)
- Create a robust, scalable standard operating procedure (SOP) and annotation framework for long-term model accuracy and consistent analysis
The goal: build a next-generation ESG insight engine that is faster, smarter, and more precise.
SGA Approach
SG Analytics (SGA) was engaged as a strategic partner to take end-to-end ownership of the LLM training and data annotation lifecycle, from framework design and ESG analyst training to internal QA validation, client-tool integration, and final output testing.
This hybrid engagement combined our proprietary artificial intelligence (AI) tools, ESG domain expertise, and full integration with the client’s in-house platform to ensure aligned, efficient, and high-quality outcomes.
Our Integrated Solution
1. Strategic Alignment & Discovery
- Conducted in-depth discovery workshops with the client’s product, research, and data science teams
- Mapped platform requirements to ESG frameworks, including the Sustainable Finance Disclosure Regulation (SFDR), the Global Reporting Initiative (GRI), and the UN Sustainable Development Goals (SDGs), across 16 sectors and 2,000 companies
- Identified opportunities to enhance the client’s recommendation engine via structured LLM-ready inputs
2. Framework Design & SOP Development
- Developed custom standard operating procedures (SOPs) and annotation guideline documents covering:
– Over 60 sustainability metrics across Environment, Social, and Governance
– Over 400 ESG KPIs, indicators, and context flags
- Defined annotation hierarchies, threshold rules, and text classification standards
Result: A scalable annotation blueprint embedded with ESG taxonomy alignment and model compatibility
3. AI-Integrated Annotation & Validation Workflow
- Deployed SGA’s internal IDEAT-QA engine (AI-enhanced quality assurance module) to:
– Pre-validate LLM-ready text blocks
– Auto-flag inconsistencies, missing attributes, or contextual errors
– Create audit logs for model backtesting and reinforcement learning
- Subject matter expert (SME) review loops ensured contextual understanding, especially for complex or ambiguous ESG metrics
Result: A two-layer validation approach combining machine consistency with human insight
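As an illustration only, the two-layer flow described above could be sketched roughly as follows. This is not the IDEAT-QA implementation, whose internals are not described here. The field names (`kpi_id`, `pillar`, `text`, `value`), the `AuditLog` class, and the sample records are all hypothetical, chosen simply to show machine checks flagging missing attributes and inconsistencies, logging every check, and routing flagged blocks to SME review.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical required attributes for one annotated text block.
REQUIRED_FIELDS = ("kpi_id", "pillar", "text", "value")
VALID_PILLARS = {"Environment", "Social", "Governance"}


@dataclass
class AuditLog:
    """Keeps one entry per checked block so results can be reviewed later."""
    entries: list = field(default_factory=list)

    def record(self, block_id, issues):
        self.entries.append({
            "block_id": block_id,
            "issues": issues,
            "needs_sme_review": bool(issues),
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })


def find_issues(block):
    """Return a list of human-readable problems found in one block."""
    issues = []
    for name in REQUIRED_FIELDS:
        if block.get(name) in (None, ""):
            issues.append(f"missing attribute: {name}")
    pillar = block.get("pillar")
    if pillar and pillar not in VALID_PILLARS:
        issues.append(f"unknown pillar: {pillar}")
    return issues


def pre_validate(blocks, log):
    """Layer 1: machine checks. Returns the blocks to route to SME review (layer 2)."""
    flagged = []
    for block in blocks:
        issues = find_issues(block)
        log.record(block.get("id"), issues)
        if issues:
            flagged.append(block)
    return flagged


# Hypothetical sample data: the second block has a misspelled pillar and no value.
blocks = [
    {"id": "b1", "kpi_id": "E-001", "pillar": "Environment",
     "text": "Scope 1 emissions fell 12%.", "value": "-12%"},
    {"id": "b2", "kpi_id": "G-014", "pillar": "Governence",
     "text": "Board has 40% women.", "value": ""},
]
log = AuditLog()
flagged = pre_validate(blocks, log)
```

In this sketch every block produces an audit entry, whether or not it passes, so the log can later show what the automated layer saw before any human correction.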
4. Analyst Training & Client Platform Integration
- Trained and deployed a 120-member team of specialized ESG analysts, skilled in:
– Interpreting sustainability disclosures across industries
– Using the client’s annotation tool to tag, flag, and structure data
– Adapting to LLM behavior and evolution
- Conducted knowledge transfer sessions and test-run validations using the client’s live environment
5. Annotation Execution & QA Delivery
- Processed 12,000 sustainability documents (annual reports, ESG disclosures, and frameworks)
- Applied standardized tagging to extract verifiable insights on:
– Emissions, renewable energy use, gender equity, board diversity, etc.
- Performed iterative QA reviews with auto-assist tools and expert checkpoints
Key Takeaways
- Standardization was the Catalyst: Developing SOPs and metric-specific guidelines ensured annotation integrity and LLM-readiness
- AI Validation Tools Scaled Confidence: IDEAT-QA improved consistency and reduced annotation rework by over 30%
- Expertise Made the Difference: SMEs bridged the gap between raw disclosures and nuanced ESG interpretation. The team achieved over 98% data accuracy through a combination of multi-tier quality checks, technical data validations, and expert human review
- Collaboration Enabled Innovation: Close integration with the client’s platform ensured a unified delivery pipeline and continuous learning loop