Public concern about AI safety has grown significantly in recent years. As AI systems become more powerful, a key question is how we make sure they do what we actually want. Now, researchers suggest that rather than trying to eliminate misalignment between AI and humans, we should embrace and manage it through a diverse ecosystem of AI systems that can balance and correct one another.

"While we have shown that sufficiently strong AI cannot be fully controlled or predicted, we also demonstrate that agents can be influenced by other agents without central control, and that greater diversity and openness influence their behavior. As these systems get more powerful, ensuring they remain beneficial to and aligned with humanity becomes more important," said Dr. Hector Zenil, senior author and Senior Lecturer/Associate Professor at King's Institute for AI and the School of Biomedical Engineering & Imaging Sciences.

Published in PNAS Nexus, the paper uses mathematical principles to demonstrate that an AI system powerful enough to exhibit artificial general intelligence will inevitably explore behaviors we didn't predict or plan for, making perfect guaranteed alignment impossible.

However, rather than trying to create a single, perfectly controlled AI, the researchers propose what they call 'agentic neurodivergence'—a diverse ecosystem of AI systems with different goals, values and approaches. The idea mirrors natural ecosystems, where diversity fosters resilience through adaptability.
