Behold, the majesty of (artificial) language
LLMs as a celebration of humanity's greatest achievement
TLDR: We are so eager to ascribe the “intelligence” of LLMs to the machine that we miss what LLMs reveal about language—an astonishing technology of meaning and humanity’s greatest achievement. This is an extended version of an article first published at Second Voice.
In 1977, Carl Sagan took on a task of cosmic proportions. NASA was set to launch a pair of Voyager spacecraft on a grand tour of the outer solar system, and Sagan wanted to send a “cosmic message in a bottle” along for the ride. On the slight chance that aliens might intercept the Voyager probes, Sagan thought they should know something about the civilization that launched them.
Surprisingly, getting NASA to agree wasn’t the hard part. The real challenge was figuring out the message. How could the essence of humanity be communicated in a way that an alien mind could understand? What images could possibly convey the human experience? Which music could best explain who we are?
After an agonizing process, Sagan and his team produced the Voyager Golden Record: 116 images and 90 minutes of sound encoded onto a gold-plated disc designed to last billions of years. It contained Bach and Beethoven. Chuck Berry and Blind Willie Johnson. Greetings in 55 languages. A mother’s first words to her newborn.
He admitted that the odds of the record ever reaching an alien civilization were astronomically low. But that was never really the point. For Sagan, the real function of the Golden Record was “to appeal to and expand the human spirit”. It was less about conveying an essence and more about inspiration. Sagan called it “humanity’s greatest hits”.
So much for humanity’s true essence. But perhaps anything more comprehensive than an inspirational appeal was too daunting, too fraught, too political. And given 1970s technology, what could have captured the essence of humanity better than the Golden Record?
Sagan made a fine choice. But today we have more options. If we really wanted an alien civilization to understand humanity—in all of our sublime, messy, and ridiculous glory—we wouldn’t send a golden record.
We would send an LLM.
The chat revolution
This may sound absurd. How could a hallucinating autocomplete-on-steroids ever replace humanity’s greatest hits?
But I think this claim can be justified. We just need to understand what made ChatGPT such a breakthrough in the first place.
GPT-1 through GPT-3 could all generate text just fine. They could complete prompts, mimic styles, and answer questions. They were undeniably impressive. Yet outside of AI enthusiasts, no one cared.
That all changed with ChatGPT, built on GPT-3.5. Suddenly, everyone cared. What changed? It wasn’t some new AI breakthrough. The difference was a new interface.
The magic happened when we started chatting with LLMs. Suddenly, the system wasn’t just generating sentences; it was conversing. It felt social and cooperative. We could sense a distinct personality. It felt, at times, like the system understood.
And this was just the beginning. We quickly discovered that an LLM could simulate any persona, from historical figures to fictional characters. It could write code in languages it had never been taught. It could translate not just between human languages but between different types of meaning altogether—turning physics into recipes, poetry into legal documents, Shakespeare into SQL.
Ultimately, this “chat revolution” came down to a single astonishing capacity: LLMs seemed to understand any system that used language to encode and convey meaning. It didn’t matter if it was through code or logic, poems or stories—LLMs could meaningfully interact with all of them.
Where did all of this functionality come from? Did the machine become intelligent? Or even conscious? Or was it something else entirely?
The accidental cheat code
The most amazing part? LLMs got all this functionality for free. Somehow we trained a system to predict the next word, and it learned to navigate the entire human experience.
This is not how software engineering works.
Imagine walking into a tech company with the following request: “Build me a system that can have a meaningful chat with me about any topic, from any perspective. It should be able to diagnose my psychology, impersonate any historical persona, and suggest wisdom traditions with surprising relevance. Basically, it should give me a meaningful response to any request that I make.”
They’d think you were requesting magic, not engineering.
And they’d be right. After all, any computer can learn syntax—the rules of grammar that govern word order. Applying rules is exactly what we would expect from a machine. But LLMs have also learned semantics—not just how to arrange words, but how to use words to mean things. LLMs don’t just play with grammatical rules, they play with meaning. And it’s the meaning that makes LLMs so magical.
How does an LLM figure out what all of these words and sentences and contexts actually mean? Is the magic in the machine? Or something else?
The only possible explanation is that the magic is in language itself. After all, no one trained an LLM in sociology, anthropology, or psychoanalysis. No one programmed in humor modules or emotional databases. No one designed it to flirt, show empathy, or be sarcastic. These capabilities just fell out of the models with zero planning or design.
In other words, language turned out to be the ultimate cheat code for AI. LLMs didn’t need to learn meaning; they just needed to learn language. The meaning came along for free. Somehow, in teaching LLMs how to process language, we taught them how to process everything else.
The semantic surprise
The study of language has always been driven by debates around meaning. Does meaning exist in the structures of language, or in the minds that use it? Do symbols connect to our understanding of concepts, or to all the other symbols in language?
LLMs add a new question to the debate: What does language look like viewed from the outside in—from the perspective of the entire corpus at once?
The answer is that it would look a lot like an LLM.
After all, LLMs do not point to real objects. They are not grounded in any experience. They are not connected to any minds that generate language. Yet from language alone, an LLM both decodes the meaning of every request you give it and generates endless meaning on command.
This is a form of meaning that is only possible from the perspective of the entire corpus. Think about what accumulates in the totality of human text. Every poem that captures longing, every explanation that clarifies confusion, every joke that subverts expectations—they all leave a tiny deposit of meaning in the corpus of language.
Multiply this by billions of texts across thousands of years, and language becomes so dense with semantic patterns that it transforms into a self-contained interface for meaning itself.
This is the semantic surprise: at sufficient scale, a corpus is so saturated with meaning that LLMs can model it as naturally as they model the rules of grammar. They can parse both double negatives and double meanings. They can learn not just sentence structure, but social structure. They can recognize both passive voice and passive aggression. LLMs don’t just master the probabilities of syntax, but also master the associative patterns of semantics.
What LLMs prove is that meaning doesn’t just live in the minds of language users, but that meaning is self-contained in language itself.
From this perspective, to call LLMs “stochastic parrots”—as if all they are doing is randomly predicting the next word—feels like an insult to language. LLMs don’t just mimic words. They mimic meaning.
Meaning machines
But if LLMs have mastered meaning, we need to ask: what kind of strange form of meaning is this?
We’re not sure exactly. Much like the human brain, the neural networks that power an LLM are a “black box”: not something you can readily inspect or interpret. We can’t peek inside the machine to see exactly what’s happening.
What we do know is that it is a form of meaning unlike any we’ve encountered before—not meaning as reference or representation, but meaning as pure geometric relationship. An LLM doesn’t so much “understand” meaning as navigate the meaning that language already contains.
It turns out the best way to predict the next word is to figure out what words actually mean. This is only possible because language has so much structure that the meaning of any word can be defined by its use relative to every other word in the corpus.
LLMs figure this out by converting language into math. Every token of text is encoded as an “embedding”: a long vector of numbers. Alone, each embedding is meaningless. But when viewed in relation to every other embedding, a high-dimensional space emerges in which directions and distances tell a mathematical story of meaning.
The famous example is KING − MAN + WOMAN ≈ QUEEN: the discovery that if you subtract the vector for “man” from the vector for “king”, and then add the vector for “woman”, the result lands closest to the vector for “queen”.
Yet the LLM has no essential representation of “king”. There is just a hypothetical concept of “king-ness” that results from all the patterns of “king” as it relates to every other concept. In the geometry of meaning space, you may find “king” close to a concept like “ruler” and far away from a concept like “ice cream”. The path from “man” to “king” points in the same direction as the path from “woman” to “queen”, but in a completely different direction from the path from “ice cream” to “delicious”.
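To make this geometry concrete, here is a minimal sketch in Python using gensim’s pre-trained GloVe vectors. The model name, word choices, and scores are illustrative assumptions, and these are static word embeddings rather than an LLM’s internal representations, but the vector arithmetic shows the same idea:

```python
# A minimal sketch of embedding geometry, assuming gensim and the
# pre-trained "glove-wiki-gigaword-50" vectors (downloaded on first use).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# "king" sits closer to "ruler" than to an unrelated concept
# ("banana" stands in for "ice cream", which is two tokens here).
print(vectors.similarity("king", "ruler"))    # relatively high
print(vectors.similarity("king", "banana"))   # relatively low

# The famous analogy: king - man + woman lands nearest to "queen".
print(vectors.most_similar(positive=["king", "woman"],
                           negative=["man"], topn=1))
# e.g. [('queen', 0.85...)]  (exact scores vary by model)
```

Nothing here was ever told what a king or a queen is; the relationship is read straight off the geometry of the vectors.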
The same math can operate along any dimension of meaning captured by the embedding. An LLM can explode the concept of “squirrel” into all of its infinite parts to combine any aspect of “squirrel-ness” with any other concept it can possibly relate to: squirrel-as-philosopher, squirrel-as-quantum-particle, squirrel-as-economic-metaphor.
From the perspective of the LLM, each of these is an equally valid path through meaning space. Darwin discussing ‘quantum evolution’ is perfectly meaningful, even though quantum theory emerged only after Darwin’s death. What we call a “hallucination” is less an indictment of LLMs and more a reflection of the particular way that humans navigate meaning space. LLMs are happy to ignore certain constraints, like temporal consistency, that we prefer to enforce.
This means that the best way to understand LLMs may not be through intelligence, or even language, but through meaning. LLMs are a new interface to explore this hypothetical “meaning all at once” that has always been latent in language itself.
In other words, we’ve constructed a “meaning machine” that now allows us to play with meaning in its purest form, with zero constraint or reference. We happen to find chat the most intuitive interface for this new machine, but our exploration of this strange new space has just begun. Who knows what the ideal interface might ultimately look like?
Artificial Language Intelligence
This idea of a “meaning machine” isn’t artificial intelligence in the way we imagined it. It’s something stranger: a technology that can navigate human meaning so fluently that the navigation becomes indistinguishable from intelligence.
For much of AI’s history, we thought we needed to teach the machine how to understand meaning. We spent decades trying to define symbols, build knowledge graphs, and encode rules.
We had it backwards. We needed to train machines to navigate the meaning that language already contains. Language is so saturated with meaning that LLMs could pass the Turing Test just by learning to navigate these patterns.
What this means is that any “intelligence” we find in an LLM has almost nothing to do with the machine and almost everything to do with language. Any “scaling laws” on this intelligence are going to have less to do with compute or inference and more to do with how much intelligence we can extract from the structure of language. Instead of wondering whether an LLM is conscious, we should marvel at just how much human consciousness we’ve encoded into language.
LLMs might prove that language is not essentially human. Language is clearly not just ours anymore. Language and computation are even closer than we ever imagined.
But LLMs should remind us of something even more essential—we are the species that uses technology in service of meaning. And language is our greatest human achievement.
We built language together, across millennia, through nothing but trial and error and the collective need to mean something to each other. Every word we’ve invented to capture some fleeting form of meaning has accumulated into a technology more complete than any database, more nuanced than any algorithm, and more alive than any system we could possibly design.
The Golden LLM
So if we want aliens to understand humanity, skip the Golden Record. Give them an LLM.
They won’t just get our greatest hits. They’ll get the complete semantic saturation of human experience—every pattern of thought, every social dynamic, every human contradiction and complexity that has ever been externalized into symbols.
We might prefer a corpus that contains far fewer Reddit comments, but that’s also the point. The aliens wouldn’t get the sanitized version we might prefer to send, but the whole magnificent, messy totality of what we’ve encoded into language over fifty thousand years. All compressed into a perfect technology of meaning, navigable by anything smart enough to learn the patterns.
If the Voyager Golden Record was humanity’s love letter to the cosmos, then LLMs are humanity’s accidental autobiography—every word we’ve ever written, encoding more truth about us than we ever intended to share. One shows who we hope we are. The other shows who we’ve always been: the species that built meaning into a technology powerful enough to outlive us all.