Tuesday, December 27, 2022

On limits to the ability of LLMs to approximate the mind’s structure

We assume that the mind has some structure. That structure is a function of 1) the structure of the world (which includes other humans) and 2) the brain’s ability to ‘model’ that structure. See my post, World, mind, and learnability: A note on the metaphysical structure of the cosmos.

The structure of the mind is not open to us through direct inspection. We need to approximate it by various indirect methods. Several academic disciplines are devoted to this job. Artificial intelligence approaches it by constructing computer programs that behave ‘intelligently.’ Deep learning is one such approach. Large Language Models are an approach that has recently received a great deal of attention.

Large Language Models

LLMs attempt to approximate the mind’s structure through a procedure in which the model repeatedly guesses the next word in the string it is currently examining. The model is then adjusted according to how far its guess departs from the actual next word. In practice the ‘string’ in question is a concatenation of many, many strings that have been scraped from the Web.
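To make that procedure concrete, here is a minimal sketch of a next-word prediction training loop. It is only an illustration under my own assumptions: the toy corpus, the tiny one-word-of-context model, and the hyperparameters stand in for the transformer architectures and Web-scale corpora of real LLMs, but the objective is the same, predict the next token and adjust the model by how far its prediction misses.

# A minimal sketch of the "guess the next word" training loop that underlies LLMs.
# Everything here (the toy corpus, the tiny one-word-of-context model, the
# hyperparameters) is an illustrative assumption; real LLMs use transformer
# architectures and much larger corpora, but the objective is the same.

import torch
import torch.nn as nn
import torch.nn.functional as F

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
stoi = {w: i for i, w in enumerate(vocab)}          # word -> integer id
ids = torch.tensor([stoi[w] for w in corpus])       # the training "string" as token ids

class TinyNextWordModel(nn.Module):
    """Predicts a distribution over the next word given the current word."""
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, x):
        return self.out(self.embed(x))              # unnormalized scores (logits)

model = TinyNextWordModel(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.1)

for step in range(200):
    inputs, targets = ids[:-1], ids[1:]             # each word predicts its successor
    logits = model(inputs)
    # Cross-entropy measures how far the model's guess is from the actual next word;
    # the gradient step nudges the model toward better guesses.
    loss = F.cross_entropy(logits, targets)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.3f}")

Real systems differ enormously in scale and architecture, but the learning signal is still this one: how well the model predicts the next token from the tokens before it.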

Are there any theorems on what kinds of structures are discoverable through such a procedure? One current view posits that we can reach AGI – whatever that is – simply by scaling up. What does that view presuppose about the structure of the mind? Is it possible that there exists a mind – perhaps a super-intelligence of some kind – whose structure cannot be approximated by such a procedure?

Current LLMs are known to have difficulty with so-called common-sense reasoning. A non-trivial component of common-sense reasoning consists of examples that are “close” to the physical world. Can this difficulty be overcome simply by scaling up? Why or why not? If it cannot, that suggests that there IS some kind of mental structure that cannot be discovered by the standard guess-the-next-word procedure. How can we characterize that structure?

I believe that a 1975 paper by Miriam Yevick, Holographic or fourier logic, speaks to this issue, and I have appended a complete reference along with its abstract and concluding summary. I also have a recent post setting forth her ideas: Miriam Yevick on why both symbols and networks are necessary for artificial minds.

If Yevick is correct, then mental structures and processes of the kind she calls holographic may not be very well approximated through the standard LLM training procedure. Beyond this, I note that human minds are open to the external world and always “pursuing” it. I suggest that this implies they must necessarily “run ahead” of any model trained on texts, no matter how large the corpus.

Miriam Yevick 1975

I believe that a 1975 paper by Miriam Yevick speaks to this issue: Holographic or fourier logic, Pattern Recognition, Vol. 7, No. 2, 1975, pp. 197-213. The abstract:

A tentative model of a system whose objects are patterns on transparencies and whose primitive operations are those of holography is presented. A formalism is developed in which a variety of operations is expressed in terms of two primitives: recording the hologram and filtering. Some elements of a holographic algebra of sets are given. Some distinctive concepts of a holographic logic are examined, such as holographic identity, equality, containment and “association”. It is argued that a logic in which objects are defined by their “associations” is more akin to visual apprehension than description in terms of sequential strings of symbols.

Concluding summary:

It has recently been conjectured that neural holograms enter as units in the thought process. If holographic processes do occur in the brain and are instrumental in thought, then the logical operations implicit in these processes could be considered as intuitive and enter as units in our mental and mathematical computations.

It has also been said that: "if we want the computer to have eyes, we shall first have to give him instruction in the facts of life".

We maintain in this paper that a language of thought in which holographic operations enter as primitives is essentially different from one in which the same operations are carried out sequentially and hence over a finite time span, as would be the case if they were computed by a neural network. Our assumption is that "holographic thought" utilizes the associative properties of holograms in "one shot". Similarly we maintain that apprehension proceeds from the very beginning via two modes, the aural and the optical; whereas the verbal string is natural to the first, the pattern as such is natural to the second: the essentially instantaneous nature of the optical process captures the apprehension as a global unit whose meaning is expressed in the first place in terms of "associations" with other such units.

We are hence led to search for a language of patterns based on the logic of holographic processes. In the first part of this paper we identify and express a number of derived holographic operations in terms of the two primitive operations of “recording the hologram” and “filtering.” We also derive some elements of a holographic algebra of patterns. In the second part some potentially distinctive aspects of a holographic logic are examined, such as holographic identity (directly related to randomness), equality, containment and "association". The notion of the Gödel Pattern is introduced as a bridge between such an associative, optical language and the usual sequential string of symbols of a formal language considered as "mere scratches on paper".

We speculate on the potential relation of the notion of holographic association to the, as yet unclarified, notion of "connotation" in logic. We also find that some of the concepts developed in this paper graze the boundaries of both the uncertainty principle and undecidability questions in self-referential systems. They may, perhaps, open up further insights into the connection, if any, between these two.
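As a rough computational gloss on Yevick’s two primitives, the sketch below treats “recording the hologram” as storing the conjugate Fourier transform of a pattern (a matched filter) and “filtering” as Fourier-domain correlation with a probe. The NumPy rendering, the random test patterns, and the peak-based match test are my own assumptions, not Yevick’s formalism; the point is only to show the “one shot,” whole-pattern, associative character of the operations she contrasts with sequential strings of symbols.

# A rough gloss on Yevick's two holographic primitives. "Recording" is modeled
# as keeping the conjugate Fourier transform of a pattern (a matched filter);
# "filtering" is Fourier-domain correlation of a probe against the recording.
# The random patterns and the peak test are illustrative assumptions only.

import numpy as np

rng = np.random.default_rng(0)

def record_hologram(pattern):
    """'Record' a pattern: keep the conjugate of its 2D Fourier transform."""
    return np.conj(np.fft.fft2(pattern))

def filter_with(hologram, probe):
    """'Filter' a probe through the hologram: correlation via the Fourier domain."""
    correlation = np.fft.ifft2(np.fft.fft2(probe) * hologram)
    return np.abs(correlation)

# Two random zero-mean "transparencies" on a 64 x 64 grid.
pattern_a = rng.standard_normal((64, 64))
pattern_b = rng.standard_normal((64, 64))

hologram_a = record_hologram(pattern_a)

# The recorded pattern correlates sharply with itself; an unrelated pattern does not.
match = filter_with(hologram_a, pattern_a)
mismatch = filter_with(hologram_a, pattern_b)

print("peak-to-mean (match):   ", match.max() / match.mean())
print("peak-to-mean (mismatch):", mismatch.max() / mismatch.mean())

Filtering the recorded pattern against itself produces a sharp correlation peak, while an unrelated pattern does not: the comparison is made over the whole pattern at once, by association, rather than symbol by symbol.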
