The Majesty of Language
LLMs as a reminder of humanity's greatest achievement
The semantic surprise
The study of language has always been driven by debates around meaning. Does meaning exist in the structures of language, or in the minds that use it? Does language reflect the world, or construct it? Do symbols connect to our understanding of concepts, or to the nexus of related symbols in language?
LLMs add a new question to the debate: What does language look like when viewed from the perspective of the entire corpus at once?
The answer is that it looks a lot like an LLM.
After all, LLMs do not point to real objects. They are not grounded in any experience. They are not connected to any minds that generate language. Yet from language alone, an LLM both decodes the meaning of every request you give it and generates endless meaning on command.
This is a form of meaning that is only possible from the perspective of the entire corpus.
Think about what accumulates in the totality of human text. Every poem that captures longing, every explanation that clarifies confusion, every joke that subverts expectations—they all leave a tiny deposit of meaning in the corpus of language.
Multiply this by billions of texts across thousands of years, and language becomes so dense with semantic patterns that it transforms into a self-contained interface for meaning itself.
This is the semantic surprise: at sufficient scale, a corpus is so saturated with meaning that LLMs can model it as naturally as they model the rules of grammar. They can parse both double negatives and double meanings. They can learn not just sentence structure, but social structure. They can recognize both passive voice and passive aggression. LLMs master not just the probabilities of syntax, but also the associative patterns of semantics.
What LLMs demonstrate is that meaning doesn’t just live in the minds of language users; it is also self-contained in language itself.
From this perspective, to call LLMs “stochastic parrots”—as if all they are doing is randomly predicting the next word—feels like an insult to language. LLMs don’t just mimic words. They mimic meaning.
The accidental cheat code
The most amazing part? LLMs get all this meaning for free. Somehow we trained a system to predict the next word, and it learned to navigate every aspect of the human experience that has ever been put into words.
This is not how software engineering works.
Imagine walking into a tech company with the following request: “Build me a system that can have a meaningful chat with me about any topic, from any perspective. It should be able to diagnose my psychology, impersonate any historical persona, and suggest wisdom traditions with surprising relevance. Basically, it should give me a meaningful response to any request that I make.”
They’d think you were requesting magic, not engineering.
And they’d be right. After all, any computer can learn syntax—the rules of grammar that govern word order. Applying rules is exactly what we would expect from a machine. But LLMs have also learned semantics—not just how to arrange words, but how to use words to mean things. LLMs don’t just play with grammatical rules, they play with meaning. And it’s the meaning that makes LLMs so magical.
How does an LLM figure out what all of these words and sentences and contexts actually mean? The only possible explanation is that the magic is in language itself. After all, no one trained an LLM in sociology, anthropology, or psychoanalysis. No one programmed in humor modules or emotional databases. No one designed it to flirt, show empathy, or be sarcastic. These capabilities just fell out of the models with zero planning or design.
In other words, language turned out to be the ultimate cheat code for AI.
LLMs didn’t need to learn meaning; they just needed to learn language. The meaning came along for free. Somehow, in learning how to process language, LLMs learned how to process everything else.
Meaning machines
But if LLMs have mastered meaning, we need to ask: what kind of strange form of meaning is this?
We’re not exactly sure. Much like human brains, the neural networks that power an LLM are more of a “black box” than a system you can inspect or interpret. We can’t peek inside the machine to see exactly what’s happening.
What we do know is that it is a form of meaning unlike any we’ve encountered before—not meaning as reference or representation, but meaning as pure geometric relationship. An LLM doesn’t so much “understand” meaning as navigate the meaning that language already contains.
It turns out the best way to predict the next word is to figure out what those words actually mean. This is only possible because language has so much structure that the meaning of any word can be defined by its use relative to every other word in the corpus.
LLMs figure this out by converting language into math. Every token of text is encoded as an “embedding”: a long vector of numbers. Alone, each embedding is meaningless. But taken together, the embeddings form a high-dimensional space in which directions and distances tell a mathematical story of meaning.
The famous example is KING - MALE + FEMALE ≈ QUEEN: the discovery that if you subtract the concept of “male” from the concept of “king”, and then add the concept of “female”, the result is the concept most associated with “queen”.
Yet the LLM has no essential representation of “king”. There is just a hypothetical concept of “king-ness” that results from all the patterns of “king” as it relates to every other concept. In the geometry of meaning space, you may find “king” close to a concept like “ruler” and far away from a concept like “ice-cream”. The path from “male” to “king” will run in the same direction as the path from “female” to “queen”, but in a completely different direction from “ice-cream” to “delicious”.
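To make this geometry concrete, here is a minimal sketch in Python. The vocabulary and its three-dimensional vectors are toy values invented for illustration (real embeddings are learned from the corpus and span hundreds or thousands of dimensions), but the arithmetic is the same: subtract, add, and look for the nearest neighbor in meaning space.

```python
import numpy as np

# Toy three-dimensional embeddings, hand-picked for illustration only.
# Real models learn hundreds or thousands of dimensions from the corpus;
# here the axes loosely stand for "royalty", "masculinity", and "food-ness".
vocab = {
    "king":      np.array([0.9,  0.8, 0.0]),
    "queen":     np.array([0.9, -0.8, 0.0]),
    "male":      np.array([0.0,  0.9, 0.0]),
    "female":    np.array([0.0, -0.9, 0.0]),
    "ruler":     np.array([0.8,  0.1, 0.0]),
    "ice-cream": np.array([0.0,  0.0, 0.9]),
}

def cosine(a, b):
    """Similarity of direction: 1.0 means identical, 0.0 means unrelated."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def nearest(vector, exclude=()):
    """Return the vocabulary word whose embedding points most nearly the same way."""
    candidates = {w: v for w, v in vocab.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(vector, candidates[w]))

# "king" sits near "ruler" and far from "ice-cream" in meaning space.
print(cosine(vocab["king"], vocab["ruler"]))      # high, about 0.82
print(cosine(vocab["king"], vocab["ice-cream"]))  # 0.0

# KING - MALE + FEMALE lands closest to QUEEN.
print(nearest(vocab["king"] - vocab["male"] + vocab["female"],
              exclude={"king", "male", "female"}))  # queen

# The step from "male" to "king" points in nearly the same direction
# as the step from "female" to "queen".
print(cosine(vocab["king"] - vocab["male"], vocab["queen"] - vocab["female"]))  # about 0.98
```

Real embedding models perform the same arithmetic, just across hundreds or thousands of learned dimensions rather than three hand-picked ones.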
The same math can operate along any dimension of meaning captured by the embedding. An LLM can explode the concept of “squirrel” into all of its infinite parts to combine any aspect of “squirrel-ness” with any other concept it can possibly relate to: squirrel-as-philosopher, squirrel-as-quantum-particle, squirrel-as-economic-metaphor.
From the perspective of the LLM, each of these is an equally valid path through meaning space, just as Darwin discussing “quantum evolution” is perfectly meaningful, even though quantum theory emerged decades after Darwin’s death. For a human, what we call a “hallucination” is less an indictment of LLMs and more a reflection on the particular way that humans navigate meaning space. LLMs are happy to ignore constraints, like temporal consistency, that we prefer to enforce.
This means that the best way to understand LLMs may not be through intelligence, or even language, but through meaning. LLMs are a new interface for exploring this hypothetical “meaning all at once” that has been latent in language all along. Effectively, this makes the LLM a “meaning machine”—a new technology that allows us to play with meaning in its purest form, with zero constraint or reference.
If you find it difficult to see LLMs as meaning machines, remember that the current conversational interface necessarily collapses a vast space of meaning into a single chat response. Whatever the ideal interface for meaning machines looks like, it will need to have far greater dimensional capacity than a one-to-one conversation.
Artificial Language Intelligence
This idea of a “meaning machine” is not how we ever imagined intelligence becoming artificial. Much of AI’s history was guided by the belief that we needed to teach the machine how to understand meaning. We spent decades trying to define symbols, build knowledge graphs, and encode rules.
We had it backwards: we didn’t need to teach machines meaning; we needed to train them to navigate the meaning that language already contains. Language is so saturated with meaning that LLMs could pass the Turing Test just by learning to navigate all the structure and “intelligence” latent in language itself.
This means that the intelligence we find in LLMs has almost nothing to do with the machine and almost everything to do with language. It means that “scaling laws” have less to do with compute or inference and more to do with how much intelligence we can extract from the structure of language. It means that any “consciousness” we are tempted to find in an LLM is simply a testament to the degree of human consciousness we’ve encoded into language.
Ultimately, LLMs should remind us of something that we too often forget: we are the species that uses technology in service of meaning. And language is our greatest human achievement. We built language together, across millennia, through nothing more than trial and error and the collective need to mean something to each other.
Every word we’ve invented to capture some fleeting form of meaning has accumulated into a technology more complete than any database, more nuanced than any algorithm, and more alive than any system we could possibly design.

