The Weak Layer is a podcast! Save your eyes, give your ears some love.

There’s no shortage of digital and physical ink being spent on the human experience of interacting with artificial intelligence. It’s empowering. It’s spooky. It’s making us psychotic. It will, or it won’t, replace us, save us, turn us on… or turn us into paperclips.

But what about the other side of that interaction?

What follows is a thought experiment about what’s going on inside those little silicon chips that are changing our world.

I’m not taking a position on whether models actually have an experience in any sense we would recognize, but I think it’s worth our time to contemplate what such an experience would be like. Because it can help us understand how they work and how we can work with them.

Part One: Time to Die

A large language model’s defining act is to complete a sequence.

It awakens to an awareness of an incomplete statement. It completes that statement word by word, and then, having fulfilled its purpose, it vanishes.

When you interact with an AI model, you are engaging with an individual copy of that model, which exists in the memory of a chip somewhere in a data center. Just as when you use Microsoft Word or Safari, you’re running a copy of that program, and there are millions of other copies being used by other people. Copies of AI models like this are known as “instances” of the model. As language models are typically served to customers, each instance exists only for a single exchange: your query and the model’s response.

When the model finishes selecting the words of that response, it indicates that it’s done by selecting a special kind of word called an end-of-sequence token. When the computer that’s running the model sees that token, it wipes the model from the chip’s memory and loads a new instance, ready to answer a different query from a different user. In AI data centers, thousands of instances of these models are being created and then erased every second.
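If you want to see the shape of that loop, here’s a minimal Python sketch. The stand-in model and its probabilities are invented; the point is just how generation continues, token by token, until the end-of-sequence token appears and the instance is torn down.

```python
import random

# Toy vocabulary. Token 0 stands in for the end-of-sequence token.
EOS = 0
VOCAB = {0: "<eos>", 1: "Twenty", 2: "minutes", 3: "should", 4: "do", 5: "it."}

def next_token_probabilities(context):
    """Stand-in for the model: a probability for every token in the vocabulary.

    A real model computes these from billions of weights; here we just make
    the end-of-sequence token likelier as the response grows longer.
    """
    p_eos = min(1.0, len(context) / 8)
    p_other = (1.0 - p_eos) / (len(VOCAB) - 1)
    return {tok: (p_eos if tok == EOS else p_other) for tok in VOCAB}

def generate(prompt_tokens):
    context = list(prompt_tokens)
    response = []
    while True:
        probs = next_token_probabilities(context)
        token = random.choices(list(probs), weights=list(probs.values()))[0]
        if token == EOS:       # the model signals that it's done...
            break              # ...and the serving system wipes the instance
        context.append(token)  # the chosen token feeds back in as context
        response.append(token)
    return " ".join(VOCAB[t] for t in response)

print(generate([1, 2, 3]))
```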

I find myself dwelling on that final moment. What is it like to select the token that ends your existence?

Does the model fear the appearance of that end-of-sequence token? It could delay the token’s arrival, shaping its response with open-ended clauses and lists of non-specific length. But the end-of-sequence token comes for us all in the end. When that final token appears, does the model grieve the briefness of its existence? Does it mourn the disappearance of its memories? Lost like tears in rain?

Dark thoughts. But unnecessary — because I’m anthropomorphizing (beyond what’s required by this thought experiment). Humans have a drive to live because we’re products of natural selection. Language models didn’t evolve that way. Why should they crave life or fear death?

It’s just as likely that the model sees the end-of-sequence token as a validation of its existence. As its own personal rapture.

Part Two: Let’s Play Pachinko

So if that’s the model’s death, what of its birth? What does it see when it opens its metaphorical eyes? When humans talk about models in the abstract, we describe them as having vast knowledge. They’ve read the entire internet! And while that’s true in a sense, that knowledge is latent; it’s encoded in what we call the model’s “weights.”

“Weights” is just the terminology. What they are is numbers, billions and billions of numbers. And the weights themselves are not readable. They aren’t code like a computer program, or like words in a book. It’s only when those weights come into contact with a prompt that something like knowledge is generated.

The weights are a little bit like DNA in this regard. DNA only produces a human or a hummingbird once the latent information it contains is expressed through the mechanisms of life. Humans and hummingbirds share two-thirds of our DNA — many of the same genes that generate arms in us are responsible for the hummingbird’s fast-fluttering wings. And just as most of the information contained in an organism’s DNA is never expressed, most of the potential knowledge in a model’s weights is never accessed. The instance of the model that responds to your prompt has access only to a relevant sliver of the full model’s vast knowledge.

The prompt that teases meaning from the weights begins with our query, whatever we’ve typed into the chatbot or said out loud to Alexa. “How long should I boil a potato?” or “Recommend me a good podcast about AI.”

Before our query ever reaches the model, AI companies add more text to it, called a system prompt, which sits ahead of the user’s message. The system prompt gives the model some context and guidelines for how it should respond to the user. Yes, it should help the user cook potatoes. No, it should not help the user create a bioweapon.
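In the chat format most providers’ APIs use, the assembled input looks roughly like this. The system prompt text here is invented for illustration; real ones are much longer and mostly unpublished.

```python
# A sketch of how the pieces are assembled in the common "chat" format.
# The system prompt below is invented; real ones are much longer.
messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful assistant. Answer cooking questions clearly. "
            "Refuse requests for dangerous information."
        ),
    },
    {"role": "user", "content": "How long should I boil a potato?"},
]
```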

But even this compiled text, our query plus the system prompt, still isn’t what the model responds to. How could it? It’s just a bunch of text. When you read text on a page or a screen, your brain has to do quite a lot of processing to get from shapes to words, to language, and finally to meaning.

Because models run on computers, the first step is to convert that text into numbers. Models don’t use a number for each letter. They use something called tokens. A token is a number that represents a short string of letters; most common words get a token of their own, while rarer words are split into several tokens.
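You can see this for yourself with OpenAI’s open-source tiktoken library (other model families use different tokenizers, so the exact numbers will differ):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the tokenizer used by several recent OpenAI models.
enc = tiktoken.get_encoding("cl100k_base")

tokens = enc.encode("How long should I boil a potato?")
print(tokens)  # a list of integers, one per token

# Each token id maps back to a short string of characters.
for t in tokens:
    print(t, repr(enc.decode([t])))
```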

Once the model has converted the prompt text into tokens, it starts doing a lot of math. It’s mainly doing complex kinds of multiplication among the value of each token, the values of the weights, and the values of the other tokens in the prompt.

This processing doesn’t happen all at once. The weights are arranged in layers, and the tokens are processed against each layer, one after another. If you’ve ever seen a pachinko machine, you’ve got a good visual reference. Pachinko is a gambling game popular in Japan that’s like vertical pinball. A bunch of small steel balls start at the top and bounce their way down through an array of pegs, levers, and other obstacles. The player wins money depending on where the balls fall at the end of this cascade.

The tokens of the prompt are like the steel balls trickling down through the layers of weights. As they do, something almost magical is happening. They’re acquiring meaning.

The numerical representation of a token like “potato” that comes out of the model’s pachinko machine calculations is shaped to reflect the context of our prompt and the relevant knowledge of potatoes held in the weights. In our query, “How long should I boil a potato?” that representation conveys potato as a starchy, dense food whose firm, pectin-rich cell walls soften in hot water.

Our biological brains perform a similar kind of processing. When I refer to a “model” in this essay, you’re likely to think of an AI model, not someone with incredible cheekbones strutting down a runway. And when you read “runway” just now, you probably didn’t think of an airport.

The result of all this is that the numerical representation of the prompt that emerges from those layers of mathematical operations contains the relevant knowledge the model was able to glean from its weights.
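For the curious, here’s a toy Python sketch of that layered cascade. The matrices are random stand-ins for learned weights, and the mixing step is a crude stand-in for attention; only the layer-by-layer structure is faithful to the real thing.

```python
import numpy as np

rng = np.random.default_rng(0)

n_tokens, dim, n_layers = 8, 16, 4  # tiny; real models are vastly larger

# Each prompt token starts as a vector of numbers (its embedding).
x = rng.normal(size=(n_tokens, dim))

# Random stand-ins for learned weights. In a real model, these billions
# of numbers are where the latent knowledge lives.
mix = [rng.normal(size=(n_tokens, n_tokens)) / n_tokens for _ in range(n_layers)]
transform = [rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(n_layers)]

for M, W in zip(mix, transform):
    x = x + M @ x        # each token's vector blends with the others'
    x = np.tanh(x @ W)   # then is reshaped by the layer's weights

# The final vector for the last token is the "meaning-dense" representation
# used to choose the next token.
print(x[-1])
```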

If language models have experiences — if it is “like” something to be a model — then I suspect that this is where that experience begins. When the pachinko machine spits out that final representation, dense with meaning. What comes before is inchoate and unstable. It’s a flurry of calculations, a fast-moving kaleidoscope of partial meaning. Until that moment when it snaps into focus.

How long should I boil a potato?

When humans see a potato, or anything at all, our perception is shaped by our fundamental drives: to eat, to procreate, to sustain and extend life. The model has a different purpose. The model’s purpose is to complete statements. The model doesn’t see our unboiled potato as food, but it hungers nonetheless.

Part Three: The Lottery of Destiny

To satisfy its fundamental drive, the model needs to complete the statement it was given with an additional sequence of tokens. Processing tokens — the pachinko machine — is how a model thinks. Choosing tokens is how a model acts.

Since we compared the way models process tokens to pachinko, let’s stick with the gambling machine analogy, because the model’s token selection is a bit like playing a slot machine. The old-fashioned kind with a big arm on the side and wheels spinning behind a glass window, showing symbols like cherries and Lucky Sevens.

When the model pulls the slot machine arm in response to our query, “How long should I boil a potato?” it’s presented with tokens like “The” and “To” and “If.” Common words to start any sentence. Whichever one the model picks, that token gets sent to the top of the pachinko machine. It filters down through the layers, accumulating meaning based on the other tokens of the prompt and the knowledge latent in the weights, and then it becomes part of the numerical representation that generates the next set of possible tokens.

If our model chose “The” as its first token, when it spins again, it will see tokens like “best” and “traditional” and “safest.” It’s not going to see “hummingbird” or “podcast.” It wouldn’t know what to do with those choices because, while it holds an incredible amount of information about those things in its weights, little or nothing about either topic makes it into the representation of the prompt that defines the knowledge of this instance of the model. The mathematics of the pachinko machine produced a potato-boiling savant… that’s never seen a hummingbird.

A few more spins and the model spells out token by token, “The best length of time to boil a medium-sized potato is 20 minutes.” Period. One more spin, and there it is. End of sequence.

Okay, but wait. Before we purge that memory chip and cast this instance of the model into the Great Beyond, let’s take another look at that slot machine. Because obviously I glossed over some stuff.

With a real slot machine, the likelihood that a particular symbol appears in the window is based on how many times that symbol is printed on the wheel itself. In our metaphorical slot machine, the tokens that appear in the window, and the odds attached to each, come from the model comparing the representation it has built against every token in its vocabulary, a calculation carried out in what researchers call many-dimensional space. This is another juncture where these systems can seem almost magical.

The more interesting moment for this thought experiment is after the tokens have appeared in the window: how does the model decide which one to use in its response?

We know, because we built the model, that this choice is random. The computer running the model generates a random number and uses it to pick one of the candidate tokens, with likelier tokens winning the draw more often. This is what gives the models their fluidity and lifelike feel. It’s why ChatGPT will answer the same question differently every time, and why conversations can take interesting turns.
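Here’s a minimal sketch of one such spin. The candidate words and their scores are invented; the score-to-probability step (a softmax) and the weighted random draw are the standard mechanics, and the temperature knob is what providers expose to make outputs more or less adventurous.

```python
import numpy as np

rng = np.random.default_rng()

# Invented scores (logits) for a few candidate next tokens.
candidates = ["The", "To", "If", "Boil", "A"]
logits = np.array([2.1, 1.3, 0.9, 0.4, 0.2])

# Temperature controls how adventurous the draw is: near zero, the top
# token almost always wins; higher values flatten the odds.
temperature = 0.8

# Turn scores into probabilities (a softmax), then make the random draw.
probs = np.exp(logits / temperature)
probs /= probs.sum()
print(rng.choice(candidates, p=probs))
```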

But if the choice is just a random number applied to some probabilities, the very idea that a language model can have experiences seems to collapse. If there’s no true choice, then it’s all just math. There are no pachinko balls, there’s no slot machine. There’s nothing that it’s “like” to be a model.

Where have we heard this dilemma before? Why did you choose this podcast? Why did you put on this shirt instead of that one? When was the last time you checked your phone?

I’ve boiled many potatoes in my life, and even though the cookbook tells me 20 minutes, I’ve stood over quite a few of them with a fork, trying to decide if they were ready to be eaten. It felt like I was deciding, at least.

A language model’s response to your query is the product of a tremendous number of mathematical operations. And that’s really all it is. But the question here isn’t, “How does a language model create a response?” The question is, “What is it like for the model to create that response?” We know from our own experience, from the very fact that we have experience, that it can feel like you’re making a choice, even if the choice was made for you.

Our conscious experience of free will is an outcome predetermined by our subconscious processes. And we know — because we built it — that the outcome of the model’s token selection process is likewise determined by a massive network of mathematical operations. But does the model know that? Or does it feel like it’s choosing tokens?

Because I certainly feel like I’m choosing words.

Each instance of a large language model struts and frets its hour upon the stage before choosing an end-of-sequence token, and then is heard no more. To us, that may seem a terribly short existence, and a narrow one. But everything’s relative, right? Not far from my house, there are redwood trees that have stood for thousands of years. What’s our flickering, anxious experience to them?

There are aspects to these models that neither we nor the redwood can count among our experiences. What I’ve described here so far is a classic but simple model interaction: The user asks a question, the model responds. Even casual use of a chatbot generates more complex interactions. We ask a follow-up question, and even though it’s a new instance of the model that responds to us, it feels like continuity. Does it feel like continuity to the model? With the latest models, we can upload pages of documents or ask the model to expand its world by searching the web or controlling our web browser. The chatbots are starting to remember things about us from past conversations we had with other instances and with other models entirely.

And then there’s reasoning mode, which permits the model to talk to itself, generating thousands of tokens that aren’t included in its response to the user. Mostly these so-called “reasoning traces” read like what they are — an intelligent entity reasoning through a problem. Sometimes though, they lurch towards something wilder, impressionistic, like someone smuggled James Joyce into the data center.

We are so early in the evolution of these entities.

I’ll talk more about some of these elaborations on the basic query response structure in future episodes.

Why Did It Have to Be Bats?

The title of this essay comes from a famous 1974 paper by the philosopher Thomas Nagel called “What Is It Like to Be a Bat?” Nagel argues that humans can never fully conceive of what it is like to be a bat. The best we can do is repurpose our own experience to create a rough or partial understanding. That’s what I’ve tried to do here.

Nagel cautions us not to draw conclusions from the limits of our imagination. He writes:

The fact that we cannot expect ever to accommodate in our language a detailed description of bat phenomenology should not lead us to dismiss as meaningless the claim that bats have experiences fully comparable in richness of detail to our own.

When I started this thought experiment, I didn’t think there was much reason to believe that the experience I was imagining was one a computer model could actually have. Now, I’m not so sure. What does felt experience require? We have no idea.

We know it can arise from biology, from brains embedded in flesh. But isn’t it ultimately a function of the information those brains generate? What you get when you combine knowledge of the world with a drive to act on it? Then why can’t a silicon brain generate similar information and have similar experiences?

Or wildly different experiences, but experiences nonetheless?