Tuesday, April 26, 2022

Why is simple arithmetic difficult for deep learning systems?

Marcus points this out at two points in the video: c. 18:25 (multiplication of 2-digit numbers), c. 19:49 (3-digit addition). Why is this so difficult for deep learning models to grasp? It suggests a failure to distinguish between semantic and episodic memory, to use terms from Old School symbolic computation.

The question interests me because arithmetic calculation has well-understood procedures. We know how people do it. And by that I mean that there’s nothing important about the process that’s hidden, unlike our use of ordinary language. The mechanisms of both sentence-level grammar and discourse structure are unconscious.

It's pretty clear to me that arithmetic requires episodic structure, to introduce a term from old symbolic-systems AI and computational linguistics. That’s obvious from the fact that we don’t teach it to children until grammar school, which is roughly when episodic level cognition kicks in (see the paper Hays and I did, Principles and Development of Natural Intelligence).

I note that, while arithmetic is simple, it is simple only in that there are no subtle conceptual issues involved. But fluency requires years of drill. First the child must learn to count; that gives numbers meaning. Once that is well in hand, children are drilled in arithmetic tables for the elementary operations: addition, subtraction, multiplication, and division. The learning of addition and subtraction tables proceeds along with exercises in counting, adding, and subtracting items in collections. Once this is going smoothly one learns the procedures for multiple-digit addition and subtraction, multiple-operand addition, and then multiplication and division. Multiple-digit division is the most difficult because it requires guessing, which is then checked by actual calculation (multiplication followed by subtraction).
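To make that last point concrete, here is a rough sketch (mine, in Python, not anything from Marcus or the video) of schoolbook long division as a guess-and-check loop: at each position you guess a quotient digit, check the guess by multiplying and subtracting, and revise it if the check fails.

```python
def long_divide(dividend, divisor):
    """Schoolbook long division as guess-and-check.

    At each step we guess a quotient digit, check the guess by
    multiplication and comparison, and lower it if it is too big --
    the same correction loop a child performs on paper.
    """
    quotient_digits = []
    remainder = 0
    for digit in str(dividend):              # bring down one digit at a time
        remainder = remainder * 10 + int(digit)
        guess = 9                            # provisional guess for this digit
        while guess * divisor > remainder:   # check: multiply and compare
            guess -= 1                       # guess was too big, revise it
        remainder -= guess * divisor         # subtract to get the new remainder
        quotient_digits.append(str(guess))
    return int("".join(quotient_digits)), remainder

# Example: 7429 divided by 23 gives quotient 323, remainder 0
print(long_divide(7429, 23))   # (323, 0)
```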

Why do such intellectually simple procedures require so much drill? Because each individual step must be correct. One mistake anywhere, and the whole calculation is thrown off. You need to recall atomic facts (from the tables) many times in a given calculation and keep track of intermediate results. The human mind is not well-suited to that. It doesn’t come naturally. Drill is required. That drill is being managed by episodic cognition.
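As a rough illustration of how many atomic recalls even a small calculation involves (again my sketch, in Python, not drawn from any particular system), here is multi-digit addition written out as explicit single-digit table lookups plus a running carry; one wrong lookup anywhere corrupts the final answer.

```python
# Single-digit addition "table": the atomic facts a child memorizes by drill.
ADDITION_TABLE = {(a, b): a + b for a in range(10) for b in range(10)}

def add_by_column(x, y):
    """Multi-digit addition as a sequence of table lookups plus a carry.

    Each column requires recalling an atomic fact and tracking an
    intermediate result (the carry); a single wrong recall anywhere
    throws off the whole answer.
    """
    xs, ys = str(x)[::-1], str(y)[::-1]           # work right to left
    carry, digits = 0, []
    for i in range(max(len(xs), len(ys))):
        a = int(xs[i]) if i < len(xs) else 0
        b = int(ys[i]) if i < len(ys) else 0
        column = ADDITION_TABLE[(a, b)] + carry   # table lookup, then add carry
        digits.append(str(column % 10))           # write down one digit
        carry = column // 10                      # remember the carry
    if carry:
        digits.append(str(carry))
    return int("".join(reversed(digits)))

print(add_by_column(478, 367))   # 845
```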

Obviously machine learning cannot pick up that kind of episodic structure. The question is: Can it pick up any kind of episodic structure at all? I don’t know.

When humans produce the kind of coherent prose that these AI devices do, they are using episodic cognition. But that episodic cognition is unconscious. Do machine learning systems pick up episodic cognition of that kind? As I say, I don’t know. But I can imagine that they do not. If not, then what are they doing to produce such convincing simulacra of coherent prose? I am tempted to say they are doing it all with systemic-level cognition, but that may be a mistake as well. They’re doing it with some other mechanism, one that doesn’t differentiate between semantic and episodic level cognition, not to mention gnomonic (see Principles above).

The fact that these systems can only produce relatively short passages of coherent prose suggests a failure at the episodic level. The fact that they can produce nonsense and things that are not true suggests a failure at the gnomonic level.

* * * * *

Check out that tweet thread for more examples. 

Here's a more recent post on this subject.
