Sunday, February 9, 2025

A clue about the mind: “Is-A” sentences

I'm bumping this post from 2011 to the top of the queue. Why? Because it is about the relationship between word order in sentences and order in the process of parsing sentences. That makes it relevant to my ongoing research into the nature of processes in LLMs.

Somewhere in his Problems in General Linguistics, my copy of which is, alas, in storage, Emile Benveniste has a chapter on sentences hanging on the auxiliary “to be.” As Benveniste was a linguist of the Old School, when being a linguist meant familiarity with many languages, including (and this is important for this particular topic) classical Greek, the chapter had examples from many languages, making it tough sledding for a monoglot like me.

While the content of this post certainly arises out of my thinking about that chapter, in the absence of actually having the text in front of me, I hesitate to assert a stronger relationship than that. I note only that, for Benveniste, the auxiliary “to be” was fraught with metaphysical significance. For the concept of being derives from “to be.” Where would philosophy be without Being? Thus, when Benveniste pondered such sentences, he wasn’t merely commenting on language. He was doing philosophy, or, if not quite that, camping out on philosophy’s doorstep.

I’m interested in such sentences because I believe they are a DEEP CLUE about how the mind works. I just don’t know what to make of the clue.

Is-A Sentences

So, I'm interested in word order in assertions such as the following:
(1) Fido is a beagle.
(2) Beagles are dogs.
(3) Dogs are beasts.
They all move from an element in a class (whether an individual, Fido, or another class, beagles) to a class containing it. None of them move in the opposite direction. Consider what happens when you try to go the opposite way. In the following sentence the class is mentioned first, then the subclass:
(4) Beagle is the kind of animal of which Fido is an instance.
In particular, note that (4) has a metalingual character that (1) does not. That is, (4) explicitly asserts that we are dealing with classification. One can do that metalingual job in various ways, but, as far as I can tell, one can't avoid it. That is, one cannot construct a proper English sentence that relates a genus and a species and mentions the genus first without ‘looping through’ some kind of metalingual construction on the way from genus to species.

Why?

What does this asymmetry tell us about the underlying mechanisms? Why don't we have sentences such as:
(5) Beagle za di Fido.
In this case "za di" is the inverse of "is a". English has no such sentences & no such inverse.
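
To make the directionality concrete, here is a minimal sketch in Python. It is only an analogy and makes no claim about how the mind, or an LLM, actually stores such relations. The table IS_A and the functions ancestors and za_di are hypothetical names invented for this illustration. The table holds sentences (1) through (3) as links that point from member to class only.

```python
# Hypothetical toy taxonomy built from sentences (1)-(3).
# Each link points one way only: from a member to the class containing it.
IS_A = {"Fido": "beagle", "beagle": "dog", "dog": "beast"}


def ancestors(term):
    """Follow "is a" links upward from term to ever-larger classes."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def za_di(cls):
    """Hypothetical inverse of "is a": list the immediate members of cls.

    The stored links only point upward, so this has to scan every link
    rather than follow one directly.
    """
    return [member for member, parent in IS_A.items() if parent == cls]


print(ancestors("Fido"))
print(za_di("beagle"))
```

In this toy layout, going from Fido to beagle is a single lookup, while going from beagle to Fido takes a search over the whole table. A different layout could store the links in both directions, so the sketch illustrates the asymmetry rather than explaining it.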

So, how widespread is this asymmetry and is there any explanation of this directionality?

But, consider . . .

I sent a query on that matter to a listserv, I forget which one, and got two replies that add some complexity to the matter. Rich Rhodes, Linguistics at UC Berkeley, tells me that in Ojibwa the word order is reversed, so that the class comes before the individual, but the asymmetry remains. He then offers a comment, which he qualifies as a quick guess:
My guess is that there is no compelling discourse function (like information flow) which makes it desirable to invert classificational equatives. Hence we only get the "unmarked" order. Subject-predicate in theme-rheme languages (like English) and predicate-subject in rheme-theme languages (like Ojibwe).
So, what's the nature of the mechanism that determines the "unmarked" order? That's what I want to know.

Lee Pearcy, Episcopal Academy in Merion, Pa., offered these examples:
(6) The beagle is Fido.
(7) The dogs are beagles.
(8) The beasts are dogs.
As stand-alone sentences, they seem a bit awkward to me. But they fare better in answers to questions, e.g.:
What’s that dog?
Which dog? The beagle is Fido and the terrier is Max.

What’re those animals?
The dogs are beagles, the cats are Persians.
In those contexts, the matter of class or classification is raised by the question, thus making it present in the discourse and so available as a point of attachment in the answer.

Further clues, anyone?

7 comments:

  1. This is so interesting. This is some of what makes writing poems so challenging. One clue I have is that when writing, the reader in English goes left to right to acknowledge word order that casts the foundation of meaning. And germane to meaning is the matter of directionality in time. The writer orders the words one after the other such that the reader is expected to follow the same order. This establishes a direction in time. How this architecture works in play with the poem itself is something the writer must bear in mind in order to respect (1) the vibrancy of the poem (2) the weight of spaces and line breaks in playing the orders of thought (3) respecting the implied meaning of words or phrases that is indirect but which is as much the subject and energy of the poem as the actual words, indeed, can provide context without being put into words more powerfully than the poem could state if it relied solely upon the architecture of the word order. The writer is always playing with the reader's imagination, assuming the presence of the reader's inner dialogue responding to the words as recorded, although of course can't know exactly. A writer can make assumptions that will probably contribute to whether the reader considers the poem to be congruent with her own aesthetics, or dissonant and unlikable. George Oppen is -- to my mind -- the undisputed master of utilizing "to be" judiciously with a discrimination that respects all prepositions in a way that each preposition becomes a keystone to the juxtaposition of words and space in the poem in a way that augments rather than argues the direct and indirect "message" of the poem. -- Your sister. (Comment is only a clue, and I could probably nail down more clearly if you are interested.)

    1. For what it's worth, Sally, I've been thinking about this particular problem off and on for decades. And it's now become central to the thinking I've been doing about how large language models (like the one in ChatGPT) work. It's an important and subtle issue.

  2. I can appreciate thinking about it for decades. What particular tools in the kit? by which you think "I've got it!" and then question again the nature of the moving parts of the problem? It is probably the most important issue in writing a poem that is able to maintain its vibrancy through decades and becoming "timeless".

  3. Hi Bill.

    Re: "So, what's the nature of the mechanism that determines the "unmarked" order? That's what I want to know." "Further clues, anyone?"

    Serendipity! I archived this comment of yours Bill, to send to Tyson Yunkaporta! Re his book Sand Talk. I have only had a brief email exchange with Tyson (probably not on his radar). Life intervened. I believe Tyson would approve of your comment.

    Your comment was in Scott Aaronson's blog in response to Matteo Villa, Shtetl Optimized "The Problem of Human Specialness in the Age of AI"...
    "Bill Benzon Says: 
    Comment #37 February 13th, 2024 at 7:28 am
    ...
    [BB:] "I’ve been wondering the same thing. It’s not as though the universe was made for us or is somehow ours to do with as we see fit. It just is.
    "From Benzon and Hays, The Evolution of Cognition, 1990"...
    https://scottaaronson.blog/?p=7784#comment-1967973

    Great comment Bill. If you were to email Tyson, I'll bet Boston to a brick, he will get back to you with answers and or contacts of value re your questions above.

    Dr Tyson Yunkaporta
    FOUNDER / SENIOR RESEARCH FELLOW
    "As the founder of the IKS Lab at Deakin, The Indigenous Knowledge Systems Lab was established in early 2021 by Dr Tyson Yunkaporta, author of ‘Sand Talk: How Indigenous Thinking Can Save the World’. The IKS Lab is an activist, public-facing think-tank, rooted in a strong evidence base of research. It uses Indigenous Knowledges as a prompt and provocateur for seeing, thinking, and doing things differently."
    ...
    https://ikslab.deakin.edu.au/about-us/

    Tyson loves a yarn.

    "Sand Talk: How Indigenous Thinking Can Save the World"...
    "As he says, “In a cross-cultural dialogue, we might see that the problem with this model is that every time you create a new layer of derivatives...you double the size of the system, you do not merely double the risk...you multiply it exponentially”

    "... Aboriginal Australians don’t have a word for safety. Instead, I learned that protocols of protection are more critical than trusting an abstract system to provide safety.
    ...
    "Yarning, in Aboriginal culture, is based on sharing stories and coming to decisions through mutual respect, active listening and humor. There is no talking stick in Australian Aboriginal Yarning (That’s something the Iroquois created), just an organic back-and-forth and the creation of a space without a stage to share experiences, to draw on the ground and sketch ideas out to illustrate a point.
    ...
    theconversationfactory dot com / sand-talk-how-indigenous-thinking-can-save-the-world-tyson-yunkaporta

    "An Aboriginal Language Pedagogy Framework for Western New South Wales " Tyson Yunkaporta
    ab-Original (2019) 3 (1): 130–136

    "Deep Time Diligence"
    An Interview with Tyson Yunkaporta
    February 19, 2024
    Transcript
    https://emergencemagazine.org/conversation/deep-time-diligence/

    Greek and Ojibwa / Indigenous peoples of the Northeastern Woodlands are preceded by the longest continuous living culture, First Nations people of Australia. (how humiliating, having to be defined in the context of "Australia")...
    "... the Aboriginal people consisted of complex cultural societies with more than 250 languages[6] and varying degrees of technology and settlements. Languages (or dialects) and language-associated groups of people are connected with stretches of territory known as "Country", with which they have a profound spiritual connection".
    Wikipedia

    Every group, as you're probably aware, has a totemic animal; they do not conceptually separate from nature. I am sure my description is wanting.
    So an Aboriginal person will have a nuanced answer to "What’s that dog?"

    And transmission of culture is embedded via 3 generations at a time with language + dance + music + yarn + sand drawings.
    We whiteys are lacking in such a comprehensive culture transmission. Their culture is in their bones / dreaming. I envy them.

    Hope you and Tyson have a great yarn.
    Cheers,
    Dipity.

  4. And more...

    A couple of sentence structure papers and researchers.
    "Papers in Australian Linguistics No. 17 - ANU Open Research
    Institute of Linguistics
    Australian Linguistics No. 17. A-71, iv + 277 pages. Pacific Linguistics, The Australian National University, 1988. DOI:10.15144/PL-A71.cover ©1988
    ...
    THE STRUCTURE AND SYSTEM OF BURARRA SENTENCES
    Kathleen Glasgow (p.205)
    ...Table 1: Overall view of Burarra sentence structure
    ...STRUCTURE AND SYSTEM OF BURARRA SENTENCES p.249
    Tables 2 and 3 have been condensed to show the larger system of Burarra sentences in Table 4.
    ...
    https://openresearch-repository.anu.edu.au/server/api/core/bitstreams/71213d81-1aae-4a06-815d-0d50d76475c5/content

    My fave...
    "Karl Friston: The Physics of Sentience"
    Dr. Karl Friston, University College London, ... August 19th, 2023.
    https://m.youtube.com/watch?v=5HDvoJzvulY

    And...
    Associate Professor
    Stephen Wilson
    ASSOCIATE PROFESSOR IN SPEECH PATHOLOGY
    "Coverbs and Complex Predicates in Wagiman
    Stephen Wilson
    "Wagiman is an Australian Aboriginal language spoken in the Top End of the Northern Territory. It possesses an unusual open class of words which the author calls "coverbs".
    ...
    "After discussing a wide range of relevant data from the language, he outlines a formal analysis within Lexical Functional Grammar. He argues that to account for the intricacy of Wagiman complex predicates, it is necessary for the grammar to make explicit reference to representations of lexical meaning, and he proposes that complex predicate formation can be seen as the fusion of these semantic representations."
    https://web.stanford.edu/group/cslipublications/cslipublications/site/1575861720.shtml

    Associate Professor
    Stephen Wilson
    ASSOCIATE PROFESSOR IN SPEECH PATHOLOGY
    School of Health and Rehabilitation Sciences, Faculty of Health, Medicine and Behavioural Sciences
    Background
    "I am a cognitive neuroscientist with a research focus on the neural basis of language. My research is focused on three related questions:
    - How is language processed in the brain?
    - How does brain damage affect language processing in individuals 
    Fields of Research
    University of Queensland

    I hope this is of use Bill, and not a distraction. Yet as many ignore Aboriginal culture and language, having something to say about an ontology of inclusion with humans and nature may assist humans and gaia at once. ymmv.
    Regards,
    Seren Dipity.

    1. Well, right now I'm mostly interested in word order in English because that's the language that dominates the LLMs. Ultimately, that's a different story. And I'm not going to be the one who works it out. That's going to be done by people who actually know those languages. But thanks for the tips. Who knows?

      Back when I was interested in graffiti I found some interesting stuff about aboriginal sand drawings.

  5. Cognition
     2014 
    DOI: 10.1016/j.cognition.2014.03.004

    "The semantic origins of word order"

    Marieke Schouwstra
    Henriëtte de Swart

    https://scite.ai/reports/the-semantic-origins-of-word-ak3VLM

    SD.
