Friday, November 25, 2022

Meta announces CICERO, an AI that plays Diplomacy

Gary Marcus and Ernest Davis have posted an interesting evaluation of Cicero: What does Meta AI’s Diplomacy-winning Cicero Mean for AI? [Hint: It’s not all about scaling]

First:

The first thing to realize is that Cicero is a very complex system. Its high-level structure is considerably more complex than systems like AlphaZero, which mastered Go and chess, or GPT-3 which focuses purely on sequences of words. Some of that complexity is immediately apparent in the flowchart; whereas a lot of recent models are something like data-in, action out, with some kind of unified system (say a Transformer) in between, Cicero is heavily prestructured, in advance of any learning or training, with a carefully-designed bespoke architecture that is divided into multiple modules and streams, each with their own specialization.

A marvel, but...

Cicero is in many ways a marvel; it has achieved by far the deepest and most extensive integration of language and action in a dynamic world of any AI system built to date. It has also succeeded in carrying out complex interactions with humans of a form not previously seen.

But it is also striking in how it does that. Strikingly, and in opposition to much of the Zeitgeist, Cicero relies quite heavily on hand-crafting, both in the data sets, and in the architecture; in this sense it is in many ways more reminiscent of classical “Good Old Fashioned AI” than deep learning systems that tend to be less structured, and less customized to particular problems. There is far more innateness here than we have typically seen in recent AI systems.

Also, it is worth noting that some aspects of Cicero use a neurosymbolic approach to AI, such as the association of messages in language with symbolic representation of actions, the built-in (innate) understanding of dialogue structure, the nature of lying as a phenomenon that modifies the significance of utterances, and so forth.

That said, it’s less clear to us how generalizable the particulars of Cicero are.

In sum:

Cicero makes extensive use of machine learning, but is hardly a poster child for simply making ever bigger models (so-called “scaling maximalism”), nor for the currently popular view of “end-to-end” machine learning in which some single general learning algorithm applies across the board, with little internal structure and zero innate knowledge. At execution time, Cicero consists of a complex array of separate hand-crafted modules with complex interactions. At training time, it draws on a wide range of training materials, some built by experts specifically for Cicero, some synthesized in programs hand-crafted by experts. [...]

Our final takeaway? We have known for some time that machine learning is valuable; but too often nowadays ML is taken as a universal solvent, as if the rest of AI was irrelevant, and left to do everything on its own. Cicero may change that calculus. If Cicero is any guide, machine learning may ultimately prove to be even more valuable if it is embedded in highly structured systems, with a fair amount of innate, sometimes neurosymbolic machinery.

There's much more in their article.
