
Monday, August 11, 2025

ChatGPT’s graphics abilities @3QD

I’ve got a new piece out in 3 Quarks Daily:

ChatGPT Makes Images. It’s FUN! (and illuminating)

I open by talking about how I came to buy a Macintosh in 1984 and how that, in turn, led me to publish an article in Byte Magazine, which was the most technically sophisticated of the magazines that were created to serve the home computer market. The article, “The Visual Mind and the Macintosh,” argued that the Mac’s image-making capacity made it especially well-suited as a tool for thinking.

That, however, is not the theme of my 3QD article, which is mostly a rubric for posting various images that I’d created with ChatGPT. I’d originally intended to come back to that theme at the end of the article, but by the time I got there it seemed rather labored and unnecessary. But I’d like to say a little about that here. Then I’m going to talk about what appears to be the “subtext” of the 3QD piece.

Visual Thinking

I did a lot of drawing as a kid. At one point I was drawing flying saucers. After I’d read Tom Sawyer – or was it after my father read it to me before bedtime? – I drew maps for caves. You’ll remember that at one point Tom Sawyer and Becky Thatcher get lost in a cave and are spooked by Indian Joe.

How do you draw a map for a cave? You just draw a complicated pattern of squiggly lines with only one path from the entrance to some chamber deep in the network. Later on I’d draw hypothetical railroad layouts. That is to say, another network diagram. I also made plans for spacecraft, such as this:

In graduate school I did lots of cognitive network diagrams, like this:

I later went on to write Visualization: The Second Computer Revolution (1989), with Richard Friedhoff. After that I wrote an encyclopedia article about Visual Thinking. There I emphasized the importance of visual thinking in science generally, and computer science in particular.

Johanna Drucker explains the conceptual power of diagrams in Graphesis (2014):

In a landmark 1987 essay, “Why a Diagram Is (Sometimes) Worth Ten Thousand Words,” Herbert Simon and Jill Larkin argue that a diagram is fundamentally computational, and that the graphical distribution of elements in spatial relation to each other supported “perceptual inferences” that could not be properly structured in linear expressions, whether these were linguistic or mathematical. They state at the outset that “a data structure in which information is indexed by two-dimensional location is what we call a diagrammatic representation.” They argue that the spatial features of diagrams are directly related to a concept of location, and that location performs certain functions. Locations exercise constraints and express values through relations, whether a machine or human being is processing the instructions. Larkin and Simon were examining computational load and efficiency, so they looked at data representations from the point of view of a three part process: search, recognition, and inference. Their point was that visual organization plays a major role in diagrammatic structures in ways that are unique and specific to these graphical expressions. [p. 106]

That’s why I like diagrams. The physical act of making such diagrams is relatively simple. But making them intellectually meaningful, that’s something else. Alas, ChatGPT isn’t up to that yet. But there was no point in saying that in my 3QD piece, which was about something else.

Themes in “ChatGPT Makes Images”

Once I’d settled on the idea of doing an article about images made by ChatGPT, the choice of images came quickly. There seems to be a logic there. I start with a painting I did as a child, a painting of rockets on Mars. I then show two renderings that ChatGPT did of that painting, one in the style of a Japanese print and the other in the style of a Byzantine mosaic. That is to say, these are exotic and strange images.

Then I started with a graffiti photo and had ChatGPT use that as a cue for images of a search for remnants of a lost civilization (Indiana Jones), a search that eventually led to an alien planet. More exoticism. Then came two images of a Hindu superhero, Kama Carnatica. Still more of the exotic.

But then we return to childhood, and early childhood at that, to the world of a four-year-old. I showed a four-panel comic where a young boy tames a tsunami by naming it. Think about that. It doesn’t seem particularly exotic, but it is about taming the unknown by naming. Very human. Very Biblical. Very intellectual.

What happens next? I bring back my Hindu superhero and place her in four very well-known paintings: The Birth of Venus, Mona Lisa, Whistler’s Mother, and Portrait of Madame X. What’s going on: domesticating the exotic by assimilating it to Western art, or othering Western art by making it exotic? Does it matter?

I conclude with three mandalas. The first is an iris. A Georgia O’Keeffe iris? Exotic. Then a mandala about written symbols. The final mandala is about my life, with links back to Mars and Kama Carnatica.

There’s a logic there, myth logic.

That four-year-old girl

Why did I tell that story about Valerie? This is an article about ChatGPT’s ability to generate images. That story is, at best, tangential to the article. It’s there because it’s part of the prompt I used to generate some images I’ve included in the article.

I note that the whole essay is a bit over 4000 words long, plus the images. That story is introduced a bit after the halfway point; it’s in the middle of the essay. So, in the middle of this essay about strange new technology that some find scary, I place a sweet story about eliciting hugs from a four-year-old girl. That’s followed by a short comic about a little boy who overcomes his fear of a tsunami by using language, by naming it.

On the one hand, those cartoon images are just more examples of what ChatGPT can do. I could easily have used other examples. But I chose those. And I’m pretty sure I chose them for the story they tell, though I wasn’t quite conscious of that at the time. And those images, in turn, entailed the story about Valerie.

That purely incidental and peripheral story is thus at the heart of this article. What follows that section of the article? A section about the “whitewashing” of history. To be sure, I’m not (entirely) serious about that whitewashing. But I’m certainly aware that that sort of thing is a charged issue and that the images I produced in that section evoke “cultural appropriation.” I thought seriously about that, and decided to go ahead.

As I’ve said before, there’s a logic there, a cultural logic.

3 comments:

  1. I found your blog via your 3quarks article (which I found via scour.ing). In your article, I really liked your writing tone and openness to image creation with ChatGPT. Hence why I went searching and found your blog here. And momma mia. What a treasure trove here. You publish about 1000 posts a year.

    I feel like I need a "new savanna" highlights reel to dig into some of your favorite posts. For now, I'm going to explore your "urban pastoral" tag.

    Also, I'm going to request your book, "Beethoven's anvil" via interlibrary loan.

    I'm glad the internet continues to offer such great discoveries such as your blog.

    Replies
    1. Welcome to the savanna, Matt. Here's a link to a page that serves as a guide to the blog. It's several years out of date so there are no links to newer stuff, but it will give some idea about the variety of material here: https://new-savanna.blogspot.com/p/about-new-savanna.html

  2. Just zipped over to your website. Web comics. Here's a post where I have some comments on Scott McCloud's wonderful book, Understanding Comics.
