Bumped this Oct 2017 post to the top of the queue on general principle, and to remind me to think about this issue some more.
* * * * *
If the eye were not sun-like, the sun’s light it would not see.
– Johann Wolfgang von Goethe
Thinking about corpus linguistics, machine learning, neural nets, deep learning and such. One of the thoughts that keeps flitting through my mind goes something like this:
How come, for example, we can set a computer system crunching through two HUGE parallel piles of texts in two languages and produce a system that can then make passable translations from one of those languages to the other WITHOUT, however, UNDERSTANDING any language whatsoever? Surely the fact that THAT – and similar things – is possible tells us something about something, but what?

As far as I know these systems have arisen through trying things out and seeing what works. The theoretical basis for them is thin. Oh, the math may be robust, but that's not quite the same.
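To give the flavor of that crunching, here's a minimal sketch – a toy IBM Model 1 word aligner, the old statistical-MT workhorse rather than the neural systems at issue, run over a three-sentence corpus I've invented for the example. Nothing in it knows what a house is; it just counts co-occurrences and re-weights them:

```python
# Toy IBM Model 1: learn word-translation probabilities t(f|e) from a
# parallel corpus by expectation-maximization. No meaning anywhere, just
# co-occurrence statistics. Corpus invented for the example.
from collections import defaultdict

corpus = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("a house", "una casa"),
]
pairs = [(e.split(), f.split()) for e, f in corpus]

f_vocab = {w for _, f in pairs for w in f}
t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform start for t(f|e)

for _ in range(20):  # EM iterations
    count = defaultdict(float)  # expected alignment counts
    total = defaultdict(float)
    for e_words, f_words in pairs:
        for f in f_words:
            norm = sum(t[(f, e)] for e in e_words)
            for e in e_words:  # E-step: split f's count across the e words
                c = t[(f, e)] / norm
                count[(f, e)] += c
                total[e] += c
    for (f, e), c in count.items():  # M-step: re-estimate t(f|e)
        t[(f, e)] = c / total[e]

# The statistics alone pair 'house' with 'casa':
print(max((p, f) for (f, e), p in t.items() if e == "house"))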
Understanding involves the relationship between text and world. That's what we were trying to do back in the 1970s and into the 1980s, create systems that understood natural language texts. We created systems that had an internal logic relating concepts to one another so one could make inferences, and even construct new assertions. That effort collapsed, and it collapsed because THE WORLD. Yes, combinatorial explosion. Brittleness, that's a bit closer. Lack of common-sense knowledge, still closer: that's knowledge of the world, lots of it, much of it trivial, but necessary. But these symbolic systems were also merely symbolic; they weren't coupled to sensory and motor systems – and through them to the world itself.
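For contrast, here's what those old symbolic systems looked like in miniature – a hand-built semantic network in the spirit of Collins and Quillian, with inference by inheritance up the is-a links. Every fact is invented and hand-coded, and note the brittleness: ask it anything outside the net and it simply fails:

```python
# A hand-built semantic network with inference by inheritance, a toy of
# the old symbolic approach. Every fact is hand-coded.
facts = {
    ("canary", "isa"): "bird",
    ("bird", "isa"): "animal",
    ("bird", "can"): "fly",
    ("animal", "has"): "skin",
}

def infer(concept, relation):
    """Walk up the is-a chain until the relation is found (inheritance)."""
    while concept is not None:
        if (concept, relation) in facts:
            return facts[(concept, relation)]
        concept = facts.get((concept, "isa"))
    return None

print(infer("canary", "can"))   # fly  -- inherited from 'bird'
print(infer("canary", "has"))   # skin -- inherited from 'animal'
print(infer("canary", "eats"))  # None -- brittle: nothing beyond the net
```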
And now we have these systems that utterly lack an internal logic relating concepts one to the other, and yet they succeed, after a fashion, where we failed (back in the day). How is it that crunching over HUGE PILES of texts is a workable proxy for understanding the world? THAT’s the question. Surely there’s some kind of theorem here.
The thing about each of the texts in those huge piles is that they were created by a mind engaged with the world. That is, each text reflects the interaction of a mind with the world. What the machine seems to be doing, in crunching over all these texts, is recovering a simulacrum of the mind's contribution to those texts, and that's sufficient to get something useful done. Or is it a simulacrum of the world's contribution to those texts? Does it matter? Can we tell?
THAT’s what I’m wondering about.
I think.
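One concrete reading of "recovering a simulacrum", offered as a sketch rather than a claim about any particular system: in distributional semantics, words that minds deployed in similar contexts end up with similar vectors, without the machine ever touching the world. Over an invented scrap of text:

```python
# Distributional sketch: build context-count vectors from raw text and
# compare them. 'cat' and 'dog' come out similar because whoever wrote
# the (invented) text used them in similar contexts -- no world required.
from collections import Counter
from math import sqrt

tokens = "the cat sat on the mat the dog sat on the rug the cat chased the dog".split()

def context_vector(word, window=2):
    vec = Counter()
    for i, w in enumerate(tokens):
        if w == word:
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vec[tokens[j]] += 1  # count each neighboring word
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    return dot / (sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values())))

print(round(cosine(context_vector("cat"), context_vector("dog")), 2))  # ~0.88
```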
Think of the world as Borges's fabled Library of Babel. Most of the texts – and they are just texts, strings of graphic symbols – in that world are gibberish. Imagine, however, that we have combed through this library and have managed to collect a large pile of meaningful texts. Only an infinitesimal fraction of the texts is meaningful, and we've managed to find millions of them. So, we crunch through this pile and, voilà! we can now generate more texts, all of which are almost as intelligible and coherent as the originals, the true texts. And yet our machines don't understand a thing. They just crunch the texts, dumber than those monkeys seeking Shakespeare with their random typing.
THAT, I think, is what’s going on in deep learning and so forth.
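In miniature, and with an invented training text, the crunch-and-generate move looks like this – a bigram model that emits passable-looking strings while understanding nothing:

```python
# A bigram generator: record every observed successor of every word in an
# invented training text, then emit a chain of observed transitions. The
# output looks passable; nothing is understood.
import random
from collections import defaultdict

text = ("the library contains all books . most books are gibberish . "
        "some books are meaningful . the meaningful books interest us .").split()

model = defaultdict(list)
for a, b in zip(text, text[1:]):
    model[a].append(b)  # every successor ever seen after word a

random.seed(0)
word, out = "the", ["the"]
for _ in range(12):
    word = random.choice(model[word])  # pick any observed successor
    out.append(word)
print(" ".join(out))
```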
If so, doesn’t that tell us something about the world? Something about the world that makes it intelligible? For not all possible worlds are intelligible.
The world Borges imagines in that story, “The Library of Babel”, is not an intelligible world. Why not?
Remember, we're using this story as a metaphor; in this case, we're using it to think about corpus linguistics, machine learning, and the rest. In this usage each volume in the library represents an encounter between someone's mind and the world. Most such encounters are ephemeral and forgotten. Only some of them yield intelligible texts. Those are the ones that interest us.
The problem with the library as Borges describes it is that there's no way of finding the 'useful' or 'interesting' books in it. They all look alike. That world is, for all practical purposes, unintelligible. You've got to read each one of them all the way through in order to determine whether or not it contains anything sensible.
Imagine, though, that every stack containing a sensible book carried some mark. Few of the stacks, of course, would contain such a mark. You'd have to wander far and wide before you found one. But that's surely better than having to examine each book, page by page, on each shelf in each stack; now you only have to examine each book in the marked stacks. That's an improvement, no? NOW the world becomes intelligible. One can live in it.
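The arithmetic behind that improvement is worth making explicit. With invented numbers – a thousand stacks of a hundred books each, a handful of stacks marked – the cost drops from reading everything to walking the aisles plus reading only the marked stacks:

```python
# Invented numbers, just to make the improvement concrete.
n_stacks, books_per_stack, n_marked = 1000, 100, 5

unmarked_cost = n_stacks * books_per_stack           # read every book everywhere
marked_cost = n_stacks + n_marked * books_per_stack  # walk aisles, read marked stacks

print(unmarked_cost, marked_cost)  # 100000 vs 1500
```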
* * * * *
As for those humanists who worry about some conflict between “close reading” and “distant reading”, get over it. Neither is a kind of reading, as the term is ordinarily understood. Both usages are doing undisclosed mythological/ideological work. Drop the nonsense and try to think about what’s really going on.
It’s hard, I know. But at this point we really have no choice. We’ve extracted all we can from those myths of reading. Now they’re just returning garbage.
Time to come up out of the cave.
* * * * *

you think the ant going on its appointed rounds has "understanding"?

you think the spade going down on its appointed rounds has "understanding"? I think you are overly excited about technology.
I think you can't read.
Tangential and yet in the same realm of information-creating: John Wheeler points out that attempts have been made to weigh heat, i.e., the increased agitation in the atom that creates heat. Unsuccessful, although this can be easily calculated. (There is a story of Count Rutherford and a bathtub...) I do think that our information capacity is an emergent property of this kind, and that, at its essence in our biological thermodynamic compilations, it is this weight of heat that creates the conditions that are the geometrodynamic "works" informing the lattice of 4-D spacetime and the passage of time. The math that looks promising to me in its lattice configuration was discussed in a 3QD article. https://www.3quarksdaily.com/3quarksdaily/2019/06/making-ice-in-vietnam.html

These lattices would allow for the emergence of both the local and non-local phenomena of the evolution of language and music. (Not a binary coded system, as you know.)
This is my drop in the bucket!