Neuroscience research might provide critical clues on how to “align” future brain-like AIs. Development of improved connectomics technology would be important to underpin this research. Improved connectomics technology would also have application to accelerated discovery of new potential treatments for currently intractable brain disorders.
Neuroscience research capabilities may be important to underpin AI alignment research
Since the human brain is the only known generally intelligent system, it is plausible (though by no means certain) that the AGI systems we ultimately build may converge with some of the brain’s key “design features”.
The presence of 40+-person neuroscience teams at AI companies like DeepMind, and of heavily neuroscience-inspired AI companies like Vicarious Systems, supports this possibility. If this is the case, then learning how to “align” brain-like AIs, specifically, will be critical for the future. Neuroscience may have much to teach the AGI alignment field about how the brain itself is trained to optimize objective functions.
Neuroscience-focused work is still a small sub-branch of AI safety/alignment research. There are preliminary suggestions that, in this context, the mammalian brain can be thought of as a very particular kind of model-based reinforcement learning agent, with notable differences from current reinforcement learning systems, including the existence of many reward channels rather than one.
See Steve Byrnes’s recent writings on this:
We are even starting to see a bit of empirical evidence for such connections based on recent fly connectome datasets:
In this scenario, it becomes particularly important to understand the nature of the brain’s reward circuitry, i.e., how the subcortex provides “training signals” to the neocortex. This could potentially be used to inform AI alignment strategies that mimic those used by biology to precisely shape mammalian development and behavior.
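The multi-channel reward picture above can be made concrete with a toy sketch. Everything in it is illustrative rather than drawn from neuroscience: the channel names (“food”, “social”, “pain”), the payout numbers, and the class and function names are all hypothetical. The point is only that an agent receiving several distinct reward channels, combined via innate weights (standing in, very loosely, for subcortically supplied priorities), learns different behavior depending on those weights:

```python
import random

class MultiChannelBandit:
    """Toy two-armed bandit where each arm pays out on several reward channels.

    Channel names and payout values are made up for illustration only.
    """
    PAYOUTS = {
        0: {"food": 1.0, "social": 0.0, "pain": 0.2},
        1: {"food": 0.3, "social": 0.8, "pain": 0.0},
    }

    def pull(self, arm):
        return self.PAYOUTS[arm]

def learn(env, weights, steps=2000, lr=0.1, eps=0.1, seed=0):
    """Epsilon-greedy value learning where the scalar training signal is a
    weighted combination of reward channels. The fixed `weights` dict plays
    the role of innate channel priorities supplied to the learner."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # estimated value of each arm
    for _ in range(steps):
        # Explore with probability eps, otherwise pick the higher-valued arm.
        arm = rng.randrange(2) if rng.random() < eps else max((0, 1), key=lambda a: q[a])
        channels = env.pull(arm)
        # Collapse the channels into one scalar only at the point of learning;
        # different innate weightings yield different learned preferences.
        r = sum(weights[c] * v for c, v in channels.items())
        q[arm] += lr * (r - q[arm])
    return q
```

Running `learn` with food-prioritizing weights (`{"food": 1.0, "social": 0.0, "pain": -1.0}`) leads the agent to prefer arm 0, while social-prioritizing weights flip the preference to arm 1, even though the environment is unchanged. In this (highly simplified) analogy, “alignment” questions live in the weights, not in the learning rule.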
In another scenario, which could unfold later this century, closer integration of brains and computers, through brain-computer interfacing or digitization of brain function, may play a role in how more advanced intelligence develops. Yet our ability to design and reason about such systems is also currently strongly limited by a lack of fundamental understanding of brain architecture.
Current brain circuit mapping capabilities do not adequately support this agenda
Unfortunately, current brain mapping technologies are insufficient to underpin the necessary research. In particular, although major progress is being made in mapping circuitry in small, compact brain volumes using electron microscopy, this method has some severe limitations.
First, scaling this approach to much larger volumes (centimeter distances, entire mammalian brains) is at best expensive, and still technically unproven, once one considers issues like lost or warped serial sections and the need for human proofreading.
This scale is required, though, to reveal the long-range interactions between the subcortical circuitry that (plausibly) provides training signals to the neocortex, and the neocortex itself. Scale is also crucial for revealing general aspects of holistic brain architecture that inform the nature of this training process, particularly in larger brains closer to those of humans.
Second, electron microscopy provides only a “black and white” view of the circuitry that does not reveal key molecules that may be essential to the architecture of the brain’s reward systems. The brain may use multiple different reward and/or training signals conveyed by different molecules, and many of these differences are invisible to current electron microscopy brain mapping technology.
Improved brain mapping technologies could help
New anatomical/molecular brain circuit mapping technology has the potential to increase the rate of knowledge generation about long-range brain circuitry/architecture, such as the subcortical/cortical interactions that may underlie the brain’s “objective functions”.
This *could* prove to be important to underpin AI alignment in a scenario where AI converges with at least some aspects of mammalian brain architecture.
See, e.g., the following comment in Steve Byrnes’s latest AI safety post here: “I do think the innate hypothalamus-and-brainstem algorithm is kinda a big complicated mess, involving dozens or hundreds of things like snake-detector circuits, and curiosity, and various social instincts, and so on. And basically nobody in neuroscience, to my knowledge, is explicitly trying to reverse-engineer this algorithm. I wish they would!”
A big part of the reason one can’t do that today is that our technologies for long-range yet precise molecular neuroanatomy are still poor.
At recent NIH/DOE connectome brainstorming workshops, I spoke about emerging possibilities for “next-generation connectomics” technologies.
It is also possible that advances in brain mapping technology would accelerate AGI timelines generally, rather than specifically accelerating the safety research component. However, I think it is at least plausible that long-range molecular neuroanatomy, specifically, could differentially support studying interactions between brain subsystems separated by long distances, which is relevant to understanding the brain’s own reward / “alignment” circuitry rather than just its cortical learning mechanisms. This might bias the development of certain next-generation connectomics technologies toward helping with AI safety rather than capabilities research, given a background state of affairs in which we are already getting fairly good at mapping local cortical circuitry.
Other possible benefits
Another core benefit: if successful, improved connectomics would sharpen our ability to understand mechanisms for, and screen drugs against, neurological and psychiatric disorders that afflict more than one billion people worldwide and are currently intractable targets for drug development. See more here and here.