Friday, June 7, 2024

Relationships among words, metalingual definition, and interpretability

First, I talk about how natural language is its own metalanguage, which allows us to define new words in terms of existing ones. Then I discuss the concept of justice in terms of the mechanism of metalingual definition proposed by David Hays some years ago. I conclude with some remarks on interpretability in view of Anthropic’s recent research on features.

The metalingual function of language in defining word meaning

In a famous essay published in 1960, “Linguistics and Poetics,” Roman Jakobson listed six functions of language. While the essay focused on the poetic function, as the title indicates, I’m interested in a different function, which he called the metalingual function:

A distinction has been made in modern logic between two levels of language: “object language” speaking of objects and “metalanguage” speaking of language. But metalanguage is not only a necessary scientific tool utilized by logicians and linguists; it plays also an important role in our everyday language. Like Molière's Jourdain who used prose without knowing it, we practice metalanguage without realizing the metalingual character of our operations. Whenever the addresser and/or the addressee need to check up whether they use the same code, speech is focused on the code: it performs a METALINGUAL (i.e., glossing) function.

In the process of explicating that function Jakobson pointed out that it can be used to define words, noting that “any process of language learning, in particular child acquisition of the mother tongue, makes wide use of such metalingual operations.”

Not all words get their meaning in that way. Many words have their meanings grounded in sensorimotor experience. We would like to know what percentage of words have their meanings grounded in sensorimotor experience and what percentage have their meanings grounded in other words. In 2016 Stevan Harnad and his colleagues published an article investigating this problem, “The Latent Structure of Dictionaries.” They examined the structure of the vocabularies in two dictionaries, one with roughly 47,000 words and the other with roughly 69,000 words. They found that a large majority of the words were defined in terms of a relatively small number of words that are themselves grounded in sensorimotor features (p. 649):

So in our view the mental lexicon is itself hybrid—a dual-code representational system consisting of learned sensorimotor feature (affordance) detectors for the grounding words (and any later hybrid words) plus recombinatory and purely symbolic (i.e., verbal) definitions and descriptions for the referents of the words that are learned through words alone.
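To make the idea of a small grounding vocabulary concrete, here is a minimal sketch in Python. It is not the method used in the article; the toy dictionary, the choice of grounded words, and the function name `learnable_words` are all assumptions made up for illustration. The sketch starts from a set of words assumed to be grounded in sensorimotor experience and repeatedly adds any word whose definition uses only words that are already known.

```python
from typing import Dict, Set

# Hypothetical toy dictionary: each word maps to the set of words used to define it.
TOY_DICTIONARY: Dict[str, Set[str]] = {
    "gift": {"give", "thing"},
    "reward": {"gift", "want"},
    "charity": {"give", "person", "reward"},
    "big": {"large"},    # circular pair: each word is defined only by the other
    "large": {"big"},
}

# Words assumed (for illustration only) to be grounded in sensorimotor experience.
GROUNDED: Set[str] = {"give", "thing", "person", "want"}


def learnable_words(dictionary: Dict[str, Set[str]], grounded: Set[str]) -> Set[str]:
    """Return every word reachable from the grounded set through definitions alone."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, definition in dictionary.items():
            # A word becomes known once every word in its definition is known.
            if word not in known and definition <= known:
                known.add(word)
                changed = True
    return known


if __name__ == "__main__":
    known = learnable_words(TOY_DICTIONARY, GROUNDED)
    print(sorted(known - GROUNDED))
```

In this toy case "gift," "reward," and "charity" are all reached through words alone, while the circular pair "big" and "large" stays out of reach until one of them is grounded directly.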

More recently Briony Banks, Anna M. Borghi, Raphaël Fargier et al. reviewed the literature on abstract concepts, “Consensus Paper: Current Perspectives on Abstract Concepts and Future Research Directions.” They noted that “many theories have also argued that our understanding and representation of abstract concepts relies more on language than the sensorimotor dimension, and particularly linguistic distributional relations.”

Given that LLMs have been constructed in an environment consisting entirely of words, the apparent fact that most words are defined in terms of other words seems highly salient.

Metalingual definition and the concept of justice

Back in the 1970s David Hays was interested in the idea that words can be used to define the meaning of other words. He talked specifically of metalingual definition. He used charity as his prototypical example: Charity is when someone does something nice for another without thought of reward. Any story that exhibits that pattern of relationships between its actors and their actions is a story about charity. The concept inheres in that pattern of relationships as a whole and not in any of the individual components of the pattern.

Notice that the definition itself contains an abstract concept, reward. Taken as a computational mechanism, which was his point, metalingual definition is thus recursive, allowing definitions to be nested within definitions. One of Hays’s students, Brian Phillips, implemented the idea in his doctoral dissertation using tragedy as his example. I recently took the definition that Brian Phillips used and used it to test ChatGPT, which had no trouble applying it to specific examples and determining whether or not they met the conditions set forth in the definition.
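The recursive character of metalingual definition can be illustrated with a small Python sketch. This is not the mechanism Hays proposed or the one Phillips implemented; the `Definition` class, the `exhibits` function, and the fact strings are hypothetical names invented here. A story is reduced to a set of plain-language facts, and a concept is defined by facts it requires plus other concepts it requires or excludes, so that checking "charity" means first checking the nested concept "reward expected." The sketch assumes the definitions contain no cycles.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Definition:
    """A concept defined by required facts plus nested concepts (all hypothetical)."""
    required_facts: FrozenSet[str] = frozenset()
    required_concepts: FrozenSet[str] = frozenset()
    excluded_concepts: FrozenSet[str] = frozenset()


# Illustrative definitions only: charity is doing something nice without thought of reward.
DEFINITIONS: Dict[str, Definition] = {
    "reward expected": Definition(
        required_facts=frozenset({"agent anticipates benefit in return"}),
    ),
    "charity": Definition(
        required_facts=frozenset({"agent does something nice for other"}),
        excluded_concepts=frozenset({"reward expected"}),
    ),
}


def exhibits(concept: str, story: FrozenSet[str], definitions: Dict[str, Definition]) -> bool:
    """Check a story against a definition, recursing into nested concepts."""
    d = definitions[concept]
    return (
        d.required_facts <= story
        and all(exhibits(c, story, definitions) for c in d.required_concepts)
        and not any(exhibits(c, story, definitions) for c in d.excluded_concepts)
    )


if __name__ == "__main__":
    selfless = frozenset({"agent does something nice for other"})
    calculating = selfless | {"agent anticipates benefit in return"}
    print(exhibits("charity", selfless, DEFINITIONS))
    print(exhibits("charity", calculating, DEFINITIONS))
```

Note that neither story needs to contain the word "charity." The concept is recognized from the pattern of facts as a whole, which is the point of the definition.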

With this before us I ask: What is justice? That is to say, what kind of a thing is justice? It’s a virtue, no? Yes, but I’m looking for something even more general, more abstract. It’s a concept, an idea, no? Of course it is. And just what are those things? Philosophers have been pondering that question for years. Cognitive scientists have been asking that question as well. When David Hays proposed that abstract concepts can be defined by stories, he was proposing an answer to that question. Abstract concepts, such as justice, are defined by relationships among words.

* * * * *

I’ve devoted a great deal of attention to ChatGPT’s ability to deal with metalingual definition. Justice is one of the first concepts I investigated, back in December of 2022. I’ve continued to investigate that concept. I have appended my most recent session to the end of this post.

That investigation has three parts. First, I ask it to tell me two stories involving justice and I specify that the word “justice” not appear anywhere in the stories. The point of that restriction is to make it clear that the meaning of the term does not reside in the word itself. The stories exhibit justice, but do not name it. Note that in a second session, which I’ve placed in a second appendix, I give ChatGPT the two stories, one after the other, and ask it what they’re about. It realizes that they are about justice.

After asking ChatGPT to tell me stories involving justice I ask it to define the term. The first definition is fairly long and has five numbered points, each specifying a particular kind of justice. So I ask it for a single paragraph and then a single sentence. It provides both. Note that both the long definition and the single paragraph definition begin with pretty much the same information that the single sentence contains.

Finally, I ask ChatGPT to explain the relationship between the stories and the definition, which it does in a paragraph of 112 words. Here’s the first sentence: “The relationship between the definition of justice and stories about justice lies in the way these narratives illustrate and bring to life the abstract principles of fairness, equity, and moral rightness.”

Interpretability of LLMs

What does this have to do with the interpretability of large language models? To a first approximation, it seems to me that LLMs are about the relationships between words. The transformer is presented with strings of words during training and asked to predict the next word. In the process of making those predictions, it constructs a complex model of how words are related to one another.

Thus we might say that justice is a certain pattern of relationships among words. But what pattern? The pattern that gives us stories, stories which may not even contain the word “justice,” or the pattern that gives us definitions and could, I assume, produce essays and even books if necessary? Those are distinctly different patterns of relationships; one might even think about them as being orthogonal, at least informally. One pattern is about justice in the context of story and the other is about justice in the context of definition. Finally, what about the pattern that explicates the relationship between the stories and the definitions?

In Scaling Monosemanticity, researchers at Anthropic identified features in Claude 3 Sonnet, where features are understood to be “directions in their activation spaces.” In their discussion, they note “that features often respond to both abstract discussion and concrete examples of a concept,” which is certainly something that I’d expect to be the case. One thing that bothers me about the discussion is that there is no sense of the model as capturing relationships between words. Given that these features are very abstract objects it’s not clear to me just what that misgiving means, but I worry that the concept of features invites reification.
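For readers who want something concrete to hold on to, here is a toy Python sketch of what it means to read a feature as a direction in activation space. It is not Anthropic's method (their features come from sparse autoencoders trained on the model's activations); the three-dimensional vectors, the made-up numbers, and the function name `feature_activation` are all assumptions chosen for illustration. The sketch projects an activation vector onto a unit-length direction and clips negative values to zero.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v: Sequence[float]) -> List[float]:
    """Scale a nonzero vector to length 1."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def feature_activation(activation: Sequence[float], direction: Sequence[float]) -> float:
    """Project an activation onto a feature direction, clipping negatives to zero."""
    return max(0.0, dot(activation, unit(direction)))


if __name__ == "__main__":
    # Hypothetical feature direction and activation vectors; the numbers are invented.
    justice_direction = [3.0, 4.0, 0.0]
    abstract_discussion = [1.0, 1.0, 5.0]    # e.g. a passage defining the concept
    concrete_example = [2.0, 0.5, -3.0]      # e.g. a story that exhibits the concept
    unrelated_text = [-1.0, -1.0, 2.0]

    for name, vec in [
        ("abstract", abstract_discussion),
        ("concrete", concrete_example),
        ("unrelated", unrelated_text),
    ]:
        print(name, round(feature_activation(vec, justice_direction), 3))
```

Two quite different activation vectors can both have a positive projection onto the same direction, which is one way of picturing a feature that responds to both abstract discussion and concrete examples. The projection is a single number, though, and by itself it says nothing about the relationships among words that produced those activations.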

A digression into neuroscience: Some years ago I had quite a bit of correspondence with the late Walter Freeman, who did pioneering work in thinking about the brain in terms of chaos theory and complex neurodynamics. He believed that percepts and concepts were located in populations of neurons rather than single neurons. I’m deeply sympathetic to that view, and have been ever since I read Karl Pribram on neural holography. Nonetheless I asked him about visual neurons that had very complex activation properties, responding to such things as a monkey’s paw or an image of Bill Clinton. Don’t such examples lend support to the idea of a so-called “grandmother cell”? His reply was no, they don’t. In such a complex system, you’re bound to find individual neurons with all sorts of odd response characteristics.

I feel a bit like that with these features. While they don’t seem to be individual neurons, it’s not clear what they are. Robert_AIZI has expressed a similar reservation. Thus he has noted:

I think Anthropic successfully demonstrated (in the paper and with Golden Gate Claude) that this feature, at very high activation levels, corresponds to the Golden Gate Bridge. But on a median instance of text where this feature is active, it is "irrelevant" to the Golden Gate Bridge, according to their own autointerpretability metric! I view this as analogous to naming water "the drowning liquid", or Boeing the "door exploding company". Yes, in extremis, water and Boeing are associated with drowning and door blowouts, but any interpretation that ends there would be limited.

Just what IS this feature?

I’m not surprised that with judicious and determined poking around we can find interpretable “features” in these models. But whether or not we’re carving LLMs at their joints, to invoke Plato’s famous remark, that’s another matter.

It’s early days.

Appendix 1: ChatGPT on justice: stories and definitions

What follows the asterisks is a single session with ChatGPT. As always, my prompts are set in bold-face while ChatGPT’s responses are plain-face, except for those strings that ChatGPT sets in bold-face.

* * * * *

Tell me a story about justice, but don’t use the word justice anywhere in the story.

In a small, bustling town surrounded by rolling hills, there lived a blacksmith named Alden. Alden was known far and wide for his craftsmanship, but even more so for his unwavering sense of fairness. Every day, townsfolk would visit his forge, not just to commission work but also to seek his counsel in disputes.

One day, a wealthy merchant named Garrick approached Alden with a request. Garrick wanted a new set of iron gates for his mansion, promising a hefty sum for the job. However, there was a condition: the gates had to be completed within a fortnight, an almost impossible task given their intricate design. Alden, valuing his word above all, agreed to the challenge.

Meanwhile, a humble farmer named Elara arrived at the forge with a different plea. Her plow had broken, and with harvest season approaching, she desperately needed it repaired. She had little money to offer, only a basket of fresh produce from her farm. Alden, understanding the farmer's plight, promised to repair the plow without delay.

The days flew by, and Alden worked tirelessly. He would spend long hours at the anvil, his hammer ringing out in the stillness of the night. The townsfolk, witnessing his dedication, marveled at his commitment. However, whispers began to spread about the blacksmith’s ability to meet both promises.

On the final day, Garrick arrived at the forge, eager to see his new gates. To his dismay, Alden informed him that the gates were not yet finished. Enraged, the merchant accused Alden of breaking his promise and stormed out, vowing to ruin the blacksmith's reputation. Elara, on the other hand, found her plow perfectly mended, ready for the harvest.

News of the incident spread quickly. Many sided with Garrick, believing Alden had favored the farmer unfairly. Others defended Alden, citing his history of integrity. The town's mayor decided to intervene, calling for a gathering in the town square to settle the matter.

At the meeting, Alden stood before the crowd, his face etched with weariness but his eyes steady. He explained his actions, recounting how Elara's need was immediate and crucial for her livelihood, whereas Garrick's gates, though important, could wait a little longer. He spoke of the value of compassion and the importance of helping those in dire need.

The mayor, after listening to both sides, addressed the crowd. He spoke of the blacksmith’s long-standing reputation for fairness and the importance of considering the greater good in every action. The crowd fell silent, contemplating his words. Slowly, murmurs of agreement began to ripple through the assembly.

Garrick, witnessing the town’s collective judgment, felt a pang of guilt. Realizing his own actions had been driven by pride and impatience, he approached Alden and offered a sincere apology. Alden, ever gracious, accepted it and promised to complete the gates within a few days.

From that day on, Alden's forge became a symbol of integrity and balance. The townsfolk learned to appreciate the blacksmith's wisdom even more, understanding that true fairness often required looking beyond immediate desires to see the broader picture. And so, Alden's legacy endured, a testament to the enduring power of doing what is right.

Give me another story about justice, still without using the word.

In a bustling city by the sea, there was a renowned marketplace where merchants from distant lands came to trade their goods. The heart of this vibrant market was a stall run by an elderly woman named Marisol. Known for her keen eye and sharp wit, Marisol dealt in rare spices and exotic herbs, always ensuring her customers received fair prices and honest measures.

One summer, a wealthy merchant named Rodrigo arrived with a caravan full of silks, spices, and jewels. His reputation preceded him; he was known for his cunning and ruthless business tactics. Rodrigo quickly set up his grand stall, drawing customers with his dazzling wares and extravagant promises.

It wasn’t long before whispers spread through the market that Rodrigo was cheating his customers. He would weigh his spices on doctored scales and sell diluted oils at the price of pure ones. Many fell victim to his deceit, but few dared to challenge him, for his wealth and influence were intimidating.

One day, Marisol noticed a young girl leaving Rodrigo’s stall in tears. Her hands clutched a small pouch of saffron, and Marisol could see the despair in her eyes. Gently, she called the girl over and asked what had happened. Through sobs, the girl explained that she had saved for months to buy the saffron for her sick mother, only to realize it was mixed with cheap, yellow threads.

Determined to set things right, Marisol devised a plan. She gathered the honest merchants of the market and shared her concerns. Together, they approached the city’s magistrate, a fair-minded man who valued the integrity of the marketplace above all.

The next day, under the watchful eyes of the magistrate, Marisol and the merchants confronted Rodrigo. They demanded he reweigh his goods on their scales and prove the purity of his oils. As the crowd gathered, Rodrigo’s deceit was laid bare for all to see. His scales were exposed as fraudulent, and his goods were found wanting.

The magistrate ordered Rodrigo to repay every customer he had cheated and banned him from the marketplace until he could prove his honesty. As the crowd dispersed, Marisol approached the young girl, handing her a fresh pouch of pure saffron with a warm smile.

With Rodrigo gone, the market thrived once more, free from the shadow of dishonesty. Marisol returned to her stall, her heart lightened by the knowledge that she had upheld the market's true spirit. Through her actions, she had ensured that fairness and integrity would always prevail, reminding everyone that even the smallest act of courage can restore balance in the world.

Define “justice.”

"Justice" is a complex and multifaceted concept that generally refers to the principle of moral rightness, fairness, and equity. It involves the fair treatment of individuals and the equitable distribution of benefits and burdens in society. Here are key aspects of justice:

1. Legal Justice: The administration of laws in a manner that ensures fairness, impartiality, and due process. This includes the fair treatment of individuals within the legal system, ensuring that laws are applied consistently and that legal rights are upheld.

2. Social Justice: The pursuit of a society where all individuals have equal opportunities and access to resources and services. It addresses issues of inequality, discrimination, and oppression, seeking to create a fair and inclusive society.

3. Distributive Justice: The equitable allocation of resources and wealth within a society. This involves ensuring that benefits and burdens are distributed in a manner that is considered fair and just, often taking into account individuals' needs and contributions.

4. Retributive Justice: The aspect of justice concerned with the appropriate punishment for wrongdoing. It aims to ensure that those who commit offenses receive fair and proportional consequences, reflecting the severity of their actions.

5. Restorative Justice: A process that emphasizes repairing the harm caused by criminal behavior through reconciliation between offenders and victims. It focuses on healing, accountability, and making amends rather than purely punitive measures.

Overall, justice is about ensuring that individuals are treated with fairness, respect, and equality, and that systems and institutions operate in a way that upholds these principles.

Give me a one paragraph definition.

Justice is the principle of fairness and moral rightness in the treatment of individuals and the resolution of disputes, ensuring that people receive what they are due and that wrongs are appropriately addressed. It involves the application of laws, ethical standards, and societal norms to uphold rights, maintain order, and promote equity, striving to balance individual and collective interests in a fair and impartial manner.

Now give me a single sentence definition.

Justice is the principle of fairness and moral rightness in ensuring individuals receive their due rights and responsibilities, and wrongdoings are appropriately addressed.

Explain the relationship between the definition of justice and stories about justice.

The relationship between the definition of justice and stories about justice lies in the way these narratives illustrate and bring to life the abstract principles of fairness, equity, and moral rightness. Stories about justice serve as concrete examples that demonstrate how these principles are applied in various contexts, showcasing the challenges, decisions, and outcomes involved in achieving just resolutions. They help to contextualize and humanize the concept of justice, making it more relatable and understandable by depicting the struggles and triumphs of individuals and communities in their pursuit of what is fair and right. Through these narratives, the complexities and nuances of justice are explored, highlighting its significance and impact on society.

Appendix 2: ChatGPT recognizes justice

I’m going to tell you a story. I want you to tell me what the story is about. Here’s the story:

In a small, bustling town surrounded by rolling hills, there lived a blacksmith named Alden. Alden was known far and wide for his craftsmanship, but even more so for his unwavering sense of fairness. Every day, townsfolk would visit his forge, not just to commission work but also to seek his counsel in disputes.

One day, a wealthy merchant named Garrick approached Alden with a request. Garrick wanted a new set of iron gates for his mansion, promising a hefty sum for the job. However, there was a condition: the gates had to be completed within a fortnight, an almost impossible task given their intricate design. Alden, valuing his word above all, agreed to the challenge.

Meanwhile, a humble farmer named Elara arrived at the forge with a different plea. Her plow had broken, and with harvest season approaching, she desperately needed it repaired. She had little money to offer, only a basket of fresh produce from her farm. Alden, understanding the farmer's plight, promised to repair the plow without delay.

The days flew by, and Alden worked tirelessly. He would spend long hours at the anvil, his hammer ringing out in the stillness of the night. The townsfolk, witnessing his dedication, marveled at his commitment. However, whispers began to spread about the blacksmith’s ability to meet both promises.

On the final day, Garrick arrived at the forge, eager to see his new gates. To his dismay, Alden informed him that the gates were not yet finished. Enraged, the merchant accused Alden of breaking his promise and stormed out, vowing to ruin the blacksmith's reputation. Elara, on the other hand, found her plow perfectly mended, ready for the harvest.

News of the incident spread quickly. Many sided with Garrick, believing Alden had favored the farmer unfairly. Others defended Alden, citing his history of integrity. The town's mayor decided to intervene, calling for a gathering in the town square to settle the matter.

At the meeting, Alden stood before the crowd, his face etched with weariness but his eyes steady. He explained his actions, recounting how Elara's need was immediate and crucial for her livelihood, whereas Garrick's gates, though important, could wait a little longer. He spoke of the value of compassion and the importance of helping those in dire need.

The mayor, after listening to both sides, addressed the crowd. He spoke of the blacksmith’s long-standing reputation for fairness and the importance of considering the greater good in every action. The crowd fell silent, contemplating his words. Slowly, murmurs of agreement began to ripple through the assembly.

Garrick, witnessing the town’s collective judgment, felt a pang of guilt. Realizing his own actions had been driven by pride and impatience, he approached Alden and offered a sincere apology. Alden, ever gracious, accepted it and promised to complete the gates within a few days.

From that day on, Alden's forge became a symbol of integrity and balance. The townsfolk learned to appreciate the blacksmith's wisdom even more, understanding that true fairness often required looking beyond immediate desires to see the broader picture. And so, Alden's legacy endured, a testament to the enduring power of doing what is right.

The story is about integrity, fairness, and the importance of prioritizing the needs of others. It highlights the blacksmith Alden's commitment to his principles, demonstrating how true fairness sometimes involves making difficult decisions that balance immediate needs against future promises. The narrative also explores themes of compassion, community judgment, and reconciliation, ultimately showing that integrity and wisdom are valued and respected by society. The resolution, where the merchant Garrick realizes his mistake and the townsfolk support Alden, reinforces the idea that doing what is right often involves considering the greater good and showing empathy.

Comment: Note that “justice” does not appear in this response. However, I note that my thesaurus lists “fairness” as the first synonym for “justice.”

Let’s do it again. But let’s use a different story. Here it is:

In a bustling city by the sea, there was a renowned marketplace where merchants from distant lands came to trade their goods. The heart of this vibrant market was a stall run by an elderly woman named Marisol. Known for her keen eye and sharp wit, Marisol dealt in rare spices and exotic herbs, always ensuring her customers received fair prices and honest measures.

One summer, a wealthy merchant named Rodrigo arrived with a caravan full of silks, spices, and jewels. His reputation preceded him; he was known for his cunning and ruthless business tactics. Rodrigo quickly set up his grand stall, drawing customers with his dazzling wares and extravagant promises.

It wasn’t long before whispers spread through the market that Rodrigo was cheating his customers. He would weigh his spices on doctored scales and sell diluted oils at the price of pure ones. Many fell victim to his deceit, but few dared to challenge him, for his wealth and influence were intimidating.

One day, Marisol noticed a young girl leaving Rodrigo’s stall in tears. Her hands clutched a small pouch of saffron, and Marisol could see the despair in her eyes. Gently, she called the girl over and asked what had happened. Through sobs, the girl explained that she had saved for months to buy the saffron for her sick mother, only to realize it was mixed with cheap, yellow threads.

Determined to set things right, Marisol devised a plan. She gathered the honest merchants of the market and shared her concerns. Together, they approached the city’s magistrate, a fair-minded man who valued the integrity of the marketplace above all.

The next day, under the watchful eyes of the magistrate, Marisol and the merchants confronted Rodrigo. They demanded he reweigh his goods on their scales and prove the purity of his oils. As the crowd gathered, Rodrigo’s deceit was laid bare for all to see. His scales were exposed as fraudulent, and his goods were found wanting.

The magistrate ordered Rodrigo to repay every customer he had cheated and banned him from the marketplace until he could prove his honesty. As the crowd dispersed, Marisol approached the young girl, handing her a fresh pouch of pure saffron with a warm smile.

With Rodrigo gone, the market thrived once more, free from the shadow of dishonesty. Marisol returned to her stall, her heart lightened by the knowledge that she had upheld the market's true spirit. Through her actions, she had ensured that fairness and integrity would always prevail, reminding everyone that even the smallest act of courage can restore balance in the world.

The story is about justice, integrity, and the power of collective action in confronting dishonesty. It centers on Marisol, an elderly woman known for her fair trade practices, who stands against Rodrigo, a deceitful merchant exploiting customers. When Marisol discovers Rodrigo's dishonesty, she unites the honest merchants and seeks the intervention of the city magistrate. The exposure of Rodrigo's fraud and the subsequent restitution ordered by the magistrate restore the market's integrity. The narrative underscores the importance of courage and solidarity in upholding fairness and protecting the community from exploitation, demonstrating that even a small act of bravery can have a significant impact on ensuring justice and honesty.

Comment: Note that “justice” appears in both the first and last sentences of this response.

Could the first story be about justice as well?

Yes, the first story can also be seen as being about justice. While it primarily focuses on integrity and fairness, the themes of justice are evident in the resolution of the conflict and the actions taken by the characters. Here's how the first story can be interpreted as being about justice:

1. Alden's Actions: Alden, the blacksmith, strives to balance his commitments to both the wealthy merchant Garrick and the humble farmer Elara. His decision to prioritize the immediate needs of Elara, whose livelihood depended on the timely repair of her plow, reflects a sense of justice that goes beyond contractual obligations. Alden's choice to help the more vulnerable party in urgent need demonstrates his understanding of true justice, which considers the broader context and impact of his actions.

2. Community Judgment: When Garrick accuses Alden of breaking his promise, the community becomes involved, reflecting the social aspect of justice. The townsfolk's initial division and subsequent gathering to hear Alden's explanation highlight the role of public opinion and communal decision-making in achieving justice.

3. Mayor's Intervention: The mayor's role in convening the town meeting and listening to both sides before making a judgment is a clear representation of a judicial process. The mayor's emphasis on Alden's history of fairness and the importance of compassion underscores a balanced and just approach to resolving the dispute.

4. Resolution and Reconciliation: The resolution, where Garrick realizes his mistake and apologizes, and Alden graciously accepts the apology, reinforces the restorative aspect of justice. It shows that justice is not only about punishment but also about understanding, correction, and reconciliation.

Overall, the first story highlights how justice involves fairness, empathy, and the willingness to see beyond immediate self-interest to consider the greater good. By acting with integrity and prioritizing urgent needs, Alden embodies the principles of justice, demonstrating that true fairness often requires a nuanced and compassionate approach.

Comment: I would have been happy with just “yes,” or “yes” plus a short comment. By that criterion, everything about this response is gravy. However, the fact that ChatGPT could give a detailed justification for its answer makes it clear that it does know how to apply a definition to a case in a revealing way.
