This AI Paper from Cohere for AI Introduces a Multi-faceted Approach to AI Governance by Rethinking Compute Thresholds


As AI systems become more advanced, ensuring their safe and ethical deployment has become a critical concern for researchers and policymakers. One of the pressing issues in AI governance is the management of risks associated with increasingly powerful AI systems. These risks include potential misuse, ethical concerns, and unintended consequences that could arise from AI’s growing capabilities. Policymakers are exploring various strategies to mitigate these risks, but the challenge lies in accurately predicting and controlling the potential harms AI systems might cause as they scale.

Current governance strategies often rely on defining thresholds for the computational power (measured in FLOP, or floating-point operations) used to train AI models. These thresholds are intended to identify and regulate AI systems that exceed certain levels of computational intensity, under the assumption that higher compute correlates with greater risk. Frameworks like the White House Executive Order on AI Safety and the EU AI Act have incorporated these thresholds into their policies.
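To make the idea of a compute threshold concrete, here is a minimal sketch in Python. It assumes the widely used rule of thumb that training compute is roughly 6 x parameters x training tokens, which is an approximation and not something the paper or either policy prescribes. The threshold values are taken from the public policy texts rather than from this article, and the model size and token count are purely hypothetical.

```python
# Rough training-compute check against fixed FLOP thresholds.
# Assumption: training FLOP ~= 6 * N * D (N = parameters, D = training tokens).
# This heuristic is illustrative only; it is not the paper's method.

# Assumed threshold values, drawn from the public policy texts.
EO_THRESHOLD_FLOP = 1e26         # US Executive Order reporting threshold (assumed)
EU_AI_ACT_THRESHOLD_FLOP = 1e25  # EU AI Act systemic-risk presumption (assumed)


def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOP with the 6 * N * D rule of thumb."""
    return 6.0 * n_params * n_tokens


def exceeds_threshold(flop: float, threshold: float) -> bool:
    """Return True when the estimated compute is above the given threshold."""
    return flop > threshold


# Hypothetical model: 13 billion parameters trained on 2 trillion tokens.
flop = estimate_training_flop(13e9, 2e12)

print(f"Estimated training compute: {flop:.2e} FLOP")
print("Above EU AI Act threshold:", exceeds_threshold(flop, EU_AI_ACT_THRESHOLD_FLOP))
print("Above Executive Order threshold:", exceeds_threshold(flop, EO_THRESHOLD_FLOP))
```

Under these assumptions, the hypothetical model sits well below both thresholds. A rule based on compute alone would therefore not flag it, whatever its actual capabilities are.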

A researcher at Cohere for AI has introduced a critical examination of these compute thresholds as a governance tool. The paper argues that current implementations are shortsighted and fail to mitigate risks effectively, and it emphasizes that the relationship between compute and risk is highly uncertain and rapidly evolving. Instead of relying solely on compute thresholds, it suggests a more nuanced approach to AI governance that considers multiple factors influencing AI’s risk profile.

The proposed approach advocates for a dynamic and comprehensive evaluation of AI systems rather than fixed compute thresholds. This includes better specifying FLOP as a metric, considering additional dimensions of AI performance and risk, and implementing adaptive thresholds that adjust to the evolving landscape of AI capabilities. The researchers recommend enhancing transparency and standardization in reporting AI risks and aligning governance practices with the actual performance and potential harms of AI systems. This comprehensive method involves examining factors such as the quality of training data, optimization techniques, and the specific applications of AI models to ensure a more accurate assessment of potential risks.

The research highlights that fixed compute thresholds often miss significant risks associated with smaller, highly optimized AI models. Empirical evidence suggests that many current policies fail to account for the rapid advancements and optimization techniques that can make smaller models as capable and risky as larger ones. For instance, models with fewer than 13 billion parameters have been shown to outperform larger models with over 176 billion parameters in certain tasks. This oversight indicates that compute thresholds, as currently applied, are unreliable predictors of AI risks and need substantial revision to be effective.

One noteworthy result from the research is that smaller models, when optimized, can achieve performance levels comparable to much larger models. For example, the study found that smaller models could reach up to 77.15% performance scores on benchmark tests, a significant improvement from the 38.59% average just two years prior. Furthermore, the researchers pointed out that the current thresholds, such as those set by the EU AI Act and the White House Executive Order, do not capture the nuances of model performance and risk, as they primarily focus on the sheer amount of compute without considering the specific capabilities and optimizations of the models.

In conclusion, the research underscores the inadequacy of compute thresholds as a standalone governance tool for AI. The problem lies in the unpredictable relationship between compute and risk, necessitating a more flexible and informed approach to regulation. The proposed solution involves shifting towards dynamic thresholds and multi-faceted risk assessments that can better anticipate and mitigate the risks posed by advanced AI systems. Researchers emphasize the need for policies that evolve with the technology and accurately reflect the complexities of modern AI development.

