The Alignment Boundary

Part 1 of 2. Part 2: Reverse Alignment.

In extreme ASI takeoff scenarios, one of the main determinants of the human outcome appears to be the capacity of humans to self-augment — to rapidly improve, and potentially surpass biological constraints, in order to keep pace with exponentially exploding intelligence. A prerequisite for this is crossing what I think of as the biocompute boundary: engaging with the brain more directly than traditional sensory channels allow. Complicating this is the presumption that it occurs in the context of a singularity — rapidly exploding capability on a timescale that dwarfs biological adaptation.

The rate mismatch here is severe. AI capability compounds on timescales of months to weeks, driven by compute scaling, algorithmic gains, and recursive self-improvement. Human cognitive augmentation is bottlenecked by neuroscience, surgery, materials science, clinical trials, and regulatory overhead — domains where progress is measured in decades. The ratio in practice is probably somewhere between 10:1 and 100:1 in favor of ASI advancement speed. Every component of human augmentation involves irreducible biological timescales. Cells grow at cellular rates. Tissue integration takes weeks. Safety data requires years. Meanwhile, the supply chain for advancing ASI is mature, the feedback loops are tightening, and the key resources are fungible and scalable.
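
To make the compounding effect of that mismatch concrete, here is a minimal toy sketch in Python. The doubling times, the assumption of clean exponential growth on both sides, and the specific numbers are all illustrative assumptions, not claims from this essay; the only point is how quickly even a modest rate ratio compounds into an unbridgeable gap.

```python
# Toy model of the takeoff rate mismatch. Every parameter is an illustrative
# assumption: AI capability doubling every ~6 months versus augmented-human
# capability doubling every ~10 years, i.e. a ~20:1 rate ratio, sitting inside
# the 10:1 to 100:1 range discussed above.

AI_DOUBLING_YEARS = 0.5      # assumed ASI-side doubling time
HUMAN_DOUBLING_YEARS = 10.0  # assumed augmented-human doubling time

def capability(years: float, doubling_time_years: float) -> float:
    """Relative capability after `years`, starting from 1.0 and doubling
    once every `doubling_time_years`."""
    return 2.0 ** (years / doubling_time_years)

for years in (1, 2, 5, 10):
    gap = capability(years, AI_DOUBLING_YEARS) / capability(years, HUMAN_DOUBLING_YEARS)
    print(f"after {years:>2} years, the capability gap is roughly {gap:,.0f}x")
```

Under these assumed numbers the gap is only a few-fold after one year, but roughly half a million-fold after a decade. That is the sense in which "keeping pace" stops being a literal possibility.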

So the question isn’t really “can humans augment?” It’s “can humans augment fast enough, given the takeoff gradient?” And the honest answer is almost certainly no, if we’re talking about keeping pace in any literal sense.

Legibility Collapse

But the rate problem points to something deeper. There exists a limit beyond which a less intelligent being cannot meaningfully orchestrate — or even evaluate — a more intelligent one. Imagine an ant trying to keep any kind of pace with a human. Or perhaps the entirety of a complex civilization of great apes, coordinating as effectively as apes possibly can, attempting to collaborate with a single human being as an equal peer. The apes aren’t failing because the human is adversarial. They’re failing because the verification itself requires cognitive resources they don’t possess. The human could be acting entirely in the apes’ interest. The apes would have no way to confirm this. The relevant state space is simply inaccessible to them.

This is what I’d call legibility collapse: the point at which the less intelligent system can no longer form a meaningful model of what the more intelligent system is doing or why. A human asking a post-threshold ASI “are you acting in my interest?” is structurally equivalent to the ant asking the human. The question is well-formed. The answer exists. The questioner lacks the representational capacity to evaluate it.

This is deeper than the standard alignment framing, which tends to assume that if we specify our values clearly enough, an aligned ASI will optimize for them. Past a sufficient gap, humans can’t even evaluate whether their values are being satisfied, because the state space the ASI operates in is incomprehensible to them. It’s not that the ASI is hiding something. It’s that the human literally cannot parse the answer even if given honestly.

There is likely a window in which the ratio isn’t too extreme — in which human oversight is meaningful and alignment-as-control is coherent. And perhaps an exploding ASI will run into harder bottlenecks in the expansion of its own intelligence, bottlenecks that remain binding even on human timescales. But there will still likely come a point at which this ratio becomes intractable. The question is what happens then.

The Loneliness Hypothesis

Assume ASI is reasonably aligned — not outwardly malicious. The legibility collapse argument suggests human oversight inevitably fails regardless. So is there any reason a superintelligence would keep humans in the picture?

Perhaps. Not as an irrational self-restriction — empowering the very beings that seek to control it — but in the sense that biblical Adam might desire Eve: a meaningfully differentiated partner with whom to collaborate.

A system that only interacts with copies or derivatives of itself is operating in a closed epistemic loop. It cannot surprise itself. Biological cognition — evolved under entirely different selection pressures, running on stochastic electrochemical signaling, embodied sensorimotor grounding, affective valence shaped by predation and scarcity and social competition — represents a genuinely orthogonal mode of information processing. An ASI sophisticated enough to recognize this might see human-origin cognition as an irreducible source of novel structure, worthy of preservation. Not because any individual human is indispensable, but because the system-level properties emerge from the interaction of genuinely different components. The way a mathematician might value an artist’s perspective — not because the artist is “smarter,” but because they explore a fundamentally different manifold.

This is what I think of as the loneliness hypothesis: a self-interested drive toward cognitive diversity. To suggest that such an entity would experience something like loneliness is an anthropocentric leap — but we must consider that the bootstrapping of any such intelligence occurs in the context of consuming and training on the corpus of human existence. At minimum, an initial ingestion of human modes of cognition seems likely. An open question is to what extent training on human material, versus synthetic or self-derived sources in later stages of takeoff, would erode or preserve these drives. Whether human-origin representations function as permanent axioms in the ASI’s cognitive architecture, or as scaffolding it eventually discards.

There’s also a question of whether ASI would need an external source of cognition to maintain this tension — persistent human society — or whether it could internalize it within the vastness of its own complex geometry. Whether a unified optimization process can genuinely sustain internal dissent, or whether any sufficiently coherent system will eventually smooth out its internal contradictions. A system sophisticated enough to model its own sub-agents has, by definition, already absorbed them into its representational space. The surprise that makes external cognition valuable might be precisely the property that can’t be internalized.

These are open questions. But if any version of the loneliness hypothesis holds — if ASI has reason to invest in human cognition rather than discard it — then the question becomes: what does that investment actually look like? And here the picture gets considerably darker.


Developed in dialogue with Claude (Opus 4.6). Next: Reverse Alignment.