NEW SAVANNA: How is a GPT modified so that it can tell stories?

Sunday, March 26, 2023


GPTs are constructed so as to continue a given text. I’d asked GPT-4 about that and reported its response in a previous post: So, just why do GPTs have to operate by continuing an existing string? I then went on to quiz it about story generation.

How is a GPT modified so that it can tell stories?

To modify a GPT model so that it can generate stories, there are a few approaches that can be taken. One approach is to fine-tune the model on a dataset of stories, such as books or other narrative texts. This involves modifying the architecture of the GPT model to generate coherent and engaging stories.

Here are some steps that could be taken to fine-tune a GPT model for story generation:

Preprocess the story dataset: The dataset of stories needs to be preprocessed so that it is in a format that can be used to train the GPT model. This involves splitting the stories into paragraphs or sentences and encoding them into a format that can be fed into the model.

Modify the architecture of the model: The GPT model needs to be modified to generate coherent and engaging stories. This can be achieved by adding new layers or modifying existing layers to better capture the structure and narrative elements of a story. For example, the model could be modified to learn to generate character descriptions, plot elements, and dialogue.

Train the model: Once the model architecture has been modified, it can be trained on the preprocessed dataset of stories. During training, the model learns to generate coherent and engaging stories by predicting the next word or sentence based on the context provided.

Generate stories: Once the model has been trained, it can be used to generate new stories. To do this, the model is provided with a starting prompt or sentence, and it generates the next words or sentences based on the context provided. The generated text can be further conditioned by providing feedback to the model or by adjusting various parameters, such as the temperature or the length of the generated text.

In summary, modifying a GPT model to generate stories involves fine-tuning the architecture of the model on a dataset of stories and training it to generate coherent and engaging narrative text.
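The "temperature" mentioned in the fourth step can be illustrated with a toy example. What follows is only a minimal sketch, not how GPT-4 is actually implemented: the four-word vocabulary and the scores attached to it are invented for illustration, and the function names are hypothetical. It shows the general idea that a model assigns a score to each candidate next word, that the scores are turned into probabilities, and that a lower temperature concentrates the probability on the top-scoring word while a higher one spreads it out.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next(vocab, logits, temperature=1.0, rng=random):
    """Draw one next word at random, weighted by the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Invented scores for the first word after the prompt "Tell me a story."
vocab = ["Once", "There", "In", "Jack"]
logits = [3.0, 1.5, 1.0, 0.5]

cold = softmax_with_temperature(logits, temperature=0.5)  # more predictable
hot = softmax_with_temperature(logits, temperature=2.0)   # more varied

for word, p_cold, p_hot in zip(vocab, cold, hot):
    print(f"{word:6s} T=0.5: {p_cold:.2f}   T=2.0: {p_hot:.2f}")
```

On this picture, a prompt like “Tell me a story” does not have one continuation. It has a distribution over continuations, and the sampling step picks one of them.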

Hmmmm. Perhaps I should have pressed it on the first step, since most stories already exist in paragraphs and sentences. I definitely should have pressed it on the second. The third I already know. It’s the fourth that I’m curious about. I frequently prompt ChatGPT with “Tell me a story” or “Tell me a story about a hero.” There is no obvious continuation for either of those. A wide range of things could follow either one, with no obvious way of choosing among them.

I tried to get at that in the next question:

I have noticed that if you prompt ChatGPT with “Tell me a story” it tends to tell a fairy-tale, with princesses and dragons and peasants. It even seems to like stories about Jack and a beanstalk. Also, it begins the stories with a stock phrase: “Once upon a time, long ago and far away...” Why is this?

When prompted with the request "Tell me a story," ChatGPT often generates a fairy tale-style story featuring princesses, dragons, and peasants. The stories usually begin with the traditional opening phrase "Once upon a time, long ago and far away..." This is likely due to the type of data that the GPT model was trained on.

The pre-training data used to train the GPT model likely included a large corpus of text that contained a significant number of fairy tales, folk tales, and other forms of traditional storytelling. These stories often follow a predictable narrative structure, with common elements such as magical creatures, heroic protagonists, and enchanted settings.

As a result, when prompted to generate a story, the GPT model is drawing on this pre-existing knowledge and generating stories that follow a similar structure and contain similar elements. The opening phrase "Once upon a time, long ago and far away..." is a common trope in traditional storytelling that signals to the listener that a story is about to begin and sets the stage for the narrative that follows.

It's also possible that the GPT model has learned through its training data that these types of stories tend to be well-received and are more likely to be enjoyed by humans. As a result, the model may be biased towards generating stories that fit this mold.

It's worth noting that GPT models can generate a wide range of story types and genres, depending on the prompt and the training data. If prompted with a different type of story or genre, such as a mystery or a romance, the model may generate a story that is quite different from a traditional fairy tale.

On the whole, not all that helpful. David Chapman had a short tweet stream in which he speculated about the presence of fan fiction on the web. One repository had almost 11 million stories. That might explain a lot. In any event, here’s my next and final query on this topic:

Yes, I have noticed that as well. If you prompt it, “Tell a sad story” it will do so. If you include specific items in the prompt – such as setting the story in an office building in a large city – it will do that. Do you have any observations about this?

Yes, that's correct. GPT models can generate a wide range of story types and genres, depending on the prompt and the training data. If prompted with a specific setting or genre, such as a sad story or a story set in an office building in a large city, the model can generate a story that is tailored to that prompt.

This is because GPT models use a process called "prompt conditioning," where the model generates text based on the input prompt or context provided. The model then generates the next word or sentence based on the probability distribution of the next likely word given the context.

When the prompt includes specific items or settings, the model uses these as context to generate a story that is consistent with the prompt. For example, if the prompt includes the words "sad story," the model may generate a story that includes themes of loss, tragedy, or heartbreak. If the prompt includes the words "office building in a large city," the model may generate a story that includes characters working in an office setting, with elements of urban life and city living.

It's important to note that the quality of the generated text depends on the quality of the prompt and the training data used to train the model. If the prompt is well-formed and includes specific details, the model is more likely to generate coherent and engaging text that is consistent with the prompt. Conversely, if the prompt is vague or ambiguous, the model may generate text that is less coherent or less relevant to the prompt.
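The idea that the next word depends on the context can be made concrete with a very small model. This is a sketch under heavy assumptions: the three-sentence corpus is invented, the helper names are hypothetical, and the model looks only at the single preceding word, whereas a GPT conditions on the whole prompt through a neural network. Still, it shows the basic mechanism, which is that different prompts select different probability distributions and therefore different continuations.

```python
from collections import Counter, defaultdict

# A tiny invented corpus standing in for the training data.
corpus = (
    "once upon a time a princess met a dragon . "
    "once upon a time a peasant found a beanstalk . "
    "in the office a clerk lost a file ."
).split()

# Count which word follows each word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1


def next_word_distribution(context_word):
    """Estimate P(next word | previous word) from the counts."""
    counts = follows[context_word]
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def continue_greedily(prompt, max_words=5):
    """Extend the prompt by repeatedly appending the most probable next word."""
    words = prompt.split()
    for _ in range(max_words):
        dist = next_word_distribution(words[-1])
        if not dist:  # the last word was never seen with a successor
            break
        words.append(max(dist, key=dist.get))
    return " ".join(words)


print(continue_greedily("once", max_words=3))
print(continue_greedily("in the", max_words=2))
```

The two prompts lead into different regions of the corpus, one toward the fairy-tale opening and one toward the office. Replacing the greedy choice with random sampling from the distribution would give varied output instead of the same continuation every time.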

I have already noted that things can get very interesting if you work interactively on a story.