Samuel G. B. Johnson, Amir-Hossein Karimi, Yoshua Bengio, et al., Imagining and building wise machines: The centrality of AI metacognition, arXiv:2411.02478v1 [cs.AI].
Abstract: Recent advances in artificial intelligence (AI) have produced systems capable of increasingly sophisticated performance on cognitive tasks. However, AI systems still struggle in critical ways: unpredictable and novel environments (robustness), lack of transparency in their reasoning (explainability), challenges in communication and commitment (cooperation), and risks due to potential harmful actions (safety). We argue that these shortcomings stem from one overarching failure: AI systems lack wisdom. Drawing from cognitive and social sciences, we define wisdom as the ability to navigate intractable problems - those that are ambiguous, radically uncertain, novel, chaotic, or computationally explosive - through effective task-level and metacognitive strategies. While AI research has focused on task-level strategies, metacognition - the ability to reflect on and regulate one’s thought processes - is underdeveloped in AI systems. In humans, metacognitive strategies such as recognizing the limits of one’s knowledge, considering diverse perspectives, and adapting to context are essential for wise decision-making. We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety. By focusing on developing wise AI, we suggest an alternative to aligning AI with specific human values - a task fraught with conceptual and practical difficulties. Instead, wise AI systems can thoughtfully navigate complex situations, account for diverse human values, and avoid harmful actions. We discuss potential approaches to building wise AI, including benchmarking metacognitive abilities and training AI systems to employ wise reasoning. Prioritizing metacognition in AI research will lead to systems that act not only intelligently but also wisely in complex, real-world situations.
From the article itself, which I have only skimmed:
At first blush, in the cognitive and social sciences, the concept of ‘wisdom’ seems to bring together many superficially unrelated characteristics. Consider the following examples of human wisdom:
- Willa’s children are bitterly arguing about money. Willa draws on her life experience to show them why they should instead compromise in the short term and prioritize their sibling relationship in the long term.
- Daphne is a world-class cardiologist. Nonetheless, she consults with a much more junior colleague when she recognizes that the colleague knows more about a patient’s history than she does.
- Ron is a political consultant who formulates possible scenarios to ensure his candidate will win. To help generate scenarios, he not only imagines best-case scenarios, but also imagines that his client has lost the election and considers possible reasons that might have contributed to the loss.
Life experience, intellectual humility, and scenario planning do not seem to share much in common beyond all being positive attributes. But being able to solve tricky integrals, crack clever jokes, and compose beautiful sonnets are also positive attributes—yet these don’t constitute wisdom.
Hmmmm.... Really? On the other hand, in discussing the difference between what I had to do in analyzing Spielberg’s Jaws and what ChatGPT had to do when I asked it to do a Girardian interpretation, I concluded:
The deeper point is that there is a world of difference between what ChatGPT was doing when I piloted it into Jaws and Girard and what I eventually did when I watched Jaws and decided to look around to see what I could see. How is it that, in that process, Girard came to me? I wasn’t looking for Girard. I wasn’t looking for anything in particular. How do we teach a computer to look around for nothing in particular and come up with something interesting?
I hesitate to say that my interpretive process involved wisdom – in part because wisdom is a heavily freighted word and I have doubts about myself on that score – but it does have some of the open-ended quality of those three examples – “intractable problems” is the term used in the article. I drew on my years of experience as a critic (first example); I consulted with my friend David, who, while certainly not a junior colleague, knows Girard’s work better than I do (second example); and I explored the movie by imagining how it would have gone if, for example, Quint hadn’t been killed (third example). Those skills strike me as central to being a literary critic.
It does seem to me that the ordinary business of literary criticism is quite different from the kinds of problems used to assess and benchmark AIs. I’m thinking especially of LLMs, including the recent ones with inference-time scaling. It’s one thing to scan the web for information on a specific topic and then compile a report. It’s quite another to come up with a plausible interpretation of a literary text you’ve just read or a movie you’ve just watched. The breathless hype that accompanies AI these days strikes me as utterly oblivious to the skills required in literary criticism. Are the people who promote the impending arrival of AGI really that poorly educated and intellectually impoverished? Their assessment of human capability is truncated in ways they neither understand nor acknowledge. Do they lack wisdom?
The highlighted text “though I have doubts about that myself” – is that highlighted for emphasis? Or to link? There’s nowhere that it goes. Just curious.
Emphasis, to draw your attention.