Inside the Foundation Model Transparency Index: How Researchers Score AI Giants on 100 Disclosure Metrics
Hook
Foundation models like GPT-4 reach more people every year, yet their developers disclose less about them over time. The Foundation Model Transparency Index documents this troubling trend with hard data.
Context
Foundation models have followed the same opacity trajectory as social media platforms before them. Companies deploy systems affecting millions of users while revealing little about training data sources, labor practices, computational resources, safety mitigations, or downstream impacts. This information asymmetry creates accountability gaps: researchers can’t reproduce results, policymakers can’t craft informed regulations, and users can’t make educated choices about which models to trust.
Stanford’s Center for Research on Foundation Models, in collaboration with Princeton’s Center for Information Technology Policy and the MIT Media Lab, created the Foundation Model Transparency Index to systematically measure and track this opacity. Rather than building software tools, they’ve constructed a rigorous assessment framework with 100 specific indicators spanning the entire foundation model lifecycle—from upstream resources like training data and compute, through model characteristics like capabilities and risks, to downstream factors like distribution channels and usage policies. The repository serves as both a transparency benchmark and a public accountability mechanism, documenting which companies disclose what information across multiple assessment cycles since October 2023.
Technical Insight
The FMTI repository structures transparency assessment as a hierarchical data architecture rather than executable code. The framework organizes 100 indicators into domains and subdomains, each grounded in academic literature about AI transparency. For example, the upstream resources domain includes indicators about data provenance, labor practices, and computational costs, while the model characteristics domain covers capability evaluations, known limitations, and safety mitigations.
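The hierarchy described above can be sketched as a small data structure. This is an illustrative model only: the domain, subdomain, and indicator names below are invented for the example and are not the official FMTI taxonomy.

```python
# Toy sketch of a domain -> subdomain -> indicator hierarchy.
# Names are illustrative, not the official FMTI indicator list.
from dataclasses import dataclass, field

@dataclass
class Indicator:
    name: str
    description: str

@dataclass
class Domain:
    name: str
    subdomains: dict = field(default_factory=dict)  # subdomain -> [Indicator]

    def count(self) -> int:
        # Total indicators across all subdomains of this domain.
        return sum(len(inds) for inds in self.subdomains.values())

upstream = Domain("upstream resources", {
    "data":    [Indicator("data_provenance", "Training data sources disclosed")],
    "labor":   [Indicator("labor_wages", "Annotator wages disclosed")],
    "compute": [Indicator("compute_cost", "Training compute cost disclosed")],
})

print(upstream.count())  # 3 indicators in this toy taxonomy
```

In the real index, the 100 indicators distribute across three such top-level domains (upstream, model, downstream), each grounded in cited literature.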
The scoring methodology implements a validation pipeline with multiple checkpoints. Two researchers independently score each company on all 100 indicators, comparing results and resolving disagreements through structured discussion. Companies then receive their preliminary scores and can contest assessments they believe are inaccurate, triggering a two-month feedback loop with virtual meetings to finalize scores. This dual-validation approach reduces subjective bias while maintaining academic independence.
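The dual-scoring step reduces to a simple comparison: each researcher produces an independent 0/1 score per indicator, and mismatches are routed to structured discussion. A minimal sketch, with invented indicator names and scores:

```python
# Two researchers score the same company independently
# (1 = information disclosed, 0 = not disclosed).
# Indicators and values are made up for illustration.
scorer_a = {"data_provenance": 1, "compute_cost": 0, "model_evals": 1}
scorer_b = {"data_provenance": 1, "compute_cost": 1, "model_evals": 1}

# Indicators where the scorers disagree go to structured discussion.
disagreements = [k for k in scorer_a if scorer_a[k] != scorer_b[k]]
agreement_rate = 1 - len(disagreements) / len(scorer_a)

print(disagreements)               # flagged for resolution
print(round(agreement_rate, 2))    # inter-rater agreement
```

Only after disagreements are resolved do companies see preliminary scores and enter the contestation window.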
The repository appears to contain indicator definitions, scoring materials, transparency reports, and visualizations organized by assessment period. The December 2025 update demonstrates the framework’s evolution: researchers added new indicators reflecting current AI ecosystem developments while raising the bar on existing indicators to ensure disclosures remain substantively useful. This addresses a critical challenge: as companies learn to game transparency metrics, the framework must adapt to maintain a meaningful signal.
The repository includes all raw scoring data under CC-BY licensing, enabling reproducible research. Each company’s transparency report consolidates publicly available information about their flagship model, creating a standardized comparison layer across heterogeneous disclosure practices. When OpenAI publishes a system card and Meta releases a model card, the FMTI framework normalizes these different formats against the same 100-indicator rubric.
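Once heterogeneous disclosures are reduced to the same rubric, cross-company comparison is just vector arithmetic: each company becomes a 0/1 vector over the shared indicators, and the headline score is the fraction satisfied. A sketch with invented company names and scores (the real data uses 100 indicators):

```python
# Each company's report, normalized to the same indicator rubric.
# Company names and 0/1 values here are invented for illustration.
indicators = ["data_provenance", "labor_practices", "compute",
              "evals", "usage_policy"]

reports = {
    "CompanyA": {"data_provenance": 1, "labor_practices": 0, "compute": 1,
                 "evals": 1, "usage_policy": 1},
    "CompanyB": {"data_provenance": 0, "labor_practices": 0, "compute": 0,
                 "evals": 1, "usage_policy": 1},
}

# Transparency score = fraction of indicators satisfied.
scores = {company: sum(r[i] for i in indicators) / len(indicators)
          for company, r in reports.items()}
print(scores)  # {'CompanyA': 0.8, 'CompanyB': 0.4}
```

This normalization is what makes an OpenAI system card and a Meta model card directly comparable, despite their very different formats.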
The 2025 assessment expanded coverage to 13 companies, including first-time evaluations of DeepSeek, Alibaba, xAI, and Midjourney. This required methodological flexibility: 7 companies submitted formal transparency reports in response to researcher outreach, while the FMTI team compiled reports for 6 others based on publicly available information. This hybrid approach balances comprehensive coverage against the reality that many companies decline formal participation.
Citing the work follows standard academic formats:
@article{bommasaniklyman2024fmti,
  author = {Bommasani, Rishi and
            Klyman, Kevin and
            Kapoor, Sayash and
            Longpre, Shayne and
            Xiong, Betty and
            Maslej, Nestor and
            Liang, Percy},
  title  = {The Foundation Model Transparency Index v1.1},
  month  = may,
  year   = 2024,
  url    = {https://arxiv.org/abs/2310.12941}
}
The multi-institutional authorship—spanning Stanford’s Institute for Human-Centered AI, Princeton’s Center for Information Technology Policy, and the MIT Media Lab—gives the index academic credibility that purely commercial transparency efforts lack. This institutional backing matters when companies contest scores, as researchers can maintain independence from industry pressure.
Gotcha
The FMTI repository is fundamentally a research dataset, not a software library or automation tool. Don’t expect programmatic query capabilities or automated monitoring systems. Every assessment requires manual researcher interpretation of diverse disclosure formats, making this inherently labor-intensive work that doesn’t scale through code.
The scoring methodology, despite validation processes, ultimately relies on researcher judgment calls. When an indicator asks whether a company discloses “sufficient detail” about training data composition, reasonable people can disagree about what constitutes sufficient. The framework attempts to minimize subjectivity through dual-researcher scoring and company feedback loops, but these mechanisms slow the assessment cycle and limit how frequently the index can update. The indicators themselves also evolve between versions—2025 indicators are more stringent than 2024—which complicates longitudinal comparisons. Did a company’s score drop because they became less transparent, or because researchers raised the bar?
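One way to partially disentangle the two explanations is to compare years only on the indicators that appear unchanged in both rubric versions. A sketch, with invented indicators and scores:

```python
# Year-over-year comparison restricted to the stable indicator subset.
# Indicator names and 0/1 scores are invented for illustration.
scores_2024 = {"data_provenance": 1, "compute_cost": 1, "usage_policy": 0}
scores_2025 = {"data_provenance": 1, "compute_cost": 0, "usage_policy": 1,
               "synthetic_data": 0}  # hypothetical new 2025 indicator

# Only indicators present in both years contribute to the delta.
shared = scores_2024.keys() & scores_2025.keys()
delta = sum(scores_2025[k] - scores_2024[k] for k in shared)

print(sorted(shared), delta)  # change measured on the stable subset only
```

Even this is imperfect: an indicator can keep its name while its scoring threshold tightens, so longitudinal claims still need a careful reading of the rubric changelog.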
Coverage limitations constrain the repository’s comprehensiveness. The 13 companies assessed represent major players, but the foundation model ecosystem includes hundreds of smaller developers whose transparency practices remain unexamined. Companies that decline participation and lack sufficient public information simply don’t get scored, creating selection bias toward larger, more visible organizations. This means the index captures transparency trends among AI giants while potentially missing important developments in the long tail of foundation model development.
Verdict
Use the FMTI repository if you’re conducting AI policy research, writing about foundation model governance, building transparency advocacy campaigns, or analyzing industry disclosure trends over time. The standardized 100-indicator framework and validated scoring data provide unmatched rigor for cross-company comparisons. Journalists covering AI accountability, academic researchers studying algorithmic transparency, and policymakers drafting disclosure requirements will find this an essential reference. The CC-BY licensing enables citation and reuse in reports, papers, and public advocacy.

Skip it if you need executable transparency tools, real-time monitoring capabilities, or technical implementations you can integrate into development workflows. This is pure research infrastructure—valuable for understanding and benchmarking transparency, but offering no software components for automation or integration. Also skip it if you’re focused on smaller AI companies outside the 13-company assessment scope, as coverage remains limited to major foundation model developers willing to participate or prominent enough for independent assessment.