A recent tweet encapsulated a common critique of large language models (LLMs):
"Ultimately, concepts aren’t vectors—concepts don’t in general add or subtract or scalar-multiply—which prevents LLMs from becoming AGI."
At first glance, this seems reasonable. Human concepts—such as justice, freedom, or consciousness—are intricate and context-rich. They're not neatly reducible to mathematical vectors manipulated by linear algebra.
Yet there's a subtle fallacy embedded here: the assumption that if something isn't intrinsically numeric or algebraic, it can't be effectively represented or processed numerically. An analogy makes the flaw in this assumption clear:
Consider music. Music is profoundly emotional, cultural, and subjective. It's clearly not intrinsically numeric—musical compositions don't literally "add" or "subtract" like algebraic entities. Yet computers compose, edit, and perform music daily by representing melodies, harmonies, rhythms, and even expressive nuances as numeric data. Does this mean computers misunderstand music? Not necessarily. It means they've found an effective numeric representation that works at the level of practical approximation and artistic expressiveness.
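To make that concrete, here is a minimal sketch of a melody as plain numbers, in the spirit of MIDI: each note is a pitch number plus a duration in beats. The specific notes and values are illustrative assumptions, not drawn from any real score.

```python
# Hypothetical melody as numeric data: (MIDI-style pitch number, duration in beats).
# Values are illustrative assumptions, not taken from any real piece of music.
melody = [(60, 1.0), (62, 0.5), (64, 0.5), (67, 2.0)]  # roughly C4, D4, E4, G4

def transpose(notes, semitones):
    """Shift every pitch by a fixed interval; the rhythm is untouched."""
    return [(pitch + semitones, duration) for pitch, duration in notes]

# Music itself doesn't "add", but adding to its numeric representation
# yields a musically meaningful result: the same melody, a fourth higher.
print(transpose(melody, 5))
```

Nothing here claims that music *is* numbers; it shows that a numeric encoding can carry enough of the structure to do genuinely musical work.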
Similarly, although concepts aren't literally vectors, large language models demonstrate convincingly that numeric embeddings—vectors—can approximate complex conceptual relationships surprisingly well. Analogies, metaphors, and semantic connections emerge naturally from these numerical spaces. The argument "concepts aren't vectors, therefore LLMs can't achieve AGI" falters precisely because it confuses the intrinsic nature of something (concepts, music) with the limits of how it can be represented.
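As a sketch of what "analogies emerge from the arithmetic" means, consider the classic king - man + woman ≈ queen pattern. The vectors below are toy, hand-made three-dimensional values, assumed purely for illustration rather than taken from any real model, but the mechanics are the same as in a learned embedding space.

```python
import numpy as np

# Toy, hand-made "embeddings" (assumed values for illustration only;
# real learned embeddings have hundreds or thousands of dimensions).
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: how closely two vectors point in the same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "king - man + woman" is meaningless as an operation on the concepts themselves,
# but on their vector representations the result lands nearest to "queen".
query = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda word: cosine(query, emb[word]))
print(best)  # queen, in this toy space
```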
The relevant question isn't "Are concepts intrinsically vectors?" but rather "Can concepts be represented as vectors well enough to enable human-like reasoning?" Empirically, embedding spaces already support sophisticated reasoning tasks that, until recently, were widely assumed to be out of reach for purely numerical representations.
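One rough way to probe that question empirically: embed conceptually related but lexically different sentences and check whether the geometry tracks the meaning. The sketch below assumes the open-source sentence-transformers library and its all-MiniLM-L6-v2 model; any embedding model would serve the same purpose.

```python
# Rough sketch, assuming sentence-transformers is installed
# (pip install sentence-transformers); any embedding model would do.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "The court ruled that the detention violated basic principles of justice.",
    "The judge found the imprisonment to be fundamentally unfair.",
    "The recipe calls for two cups of flour and a pinch of salt.",
]
vectors = model.encode(sentences)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Expectation (an empirical claim, not a guarantee): the two justice-related
# sentences sit closer together than either does to the recipe.
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```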
Thus, while pure vector embeddings may indeed have limitations and might benefit from additional symbolic, causal, or hierarchical structures, dismissing their potential solely because concepts aren't intrinsically vectors is unjustified. Just as numerical representations brought music composition into the computational age, vector-based representations could well support—or at least significantly contribute to—the emergence of Artificial General Intelligence.