Prioritizing AI Welfare Means Prioritizing Consciousness Research

CurtTigges

Note: I see I missed AI Welfare Debate week by a few hours. Nevertheless, I have elected to share this post, since it was already written.

Executive Summary

This essay argues that making AI welfare an EA priority requires fundamental research into consciousness itself. I propose shifting focus from current AI ethics debates to more foundational questions about the physical correlates of consciousness, the binding problem, and the substrate question. By advancing our understanding of consciousness, we can develop more robust frameworks for considering the welfare of diverse forms of sentience, including AI. The essay outlines the importance of this research, potential approaches, and its far-reaching implications for the future of sentience in our universe.

Introduction

What does it mean to make AI welfare a priority? I argue that the most effective approach involves fundamental research into consciousness itself. While solving the hard problem of consciousness (the question of why/how matter generates qualia) may be infeasible or impossible, we can still make significant progress in understanding physical correlates of consciousness in ways that are relevant for AI. This progress, even though it is unlikely to give us definitive answers about consciousness, can at least help us reduce some of our uncertainties, which will be crucial for making informed decisions about AI welfare.

I suggest that taking AI welfare seriously at this stage should be less about asking LLMs about how they feel (though this could be an important part of future research) or altering conditions of deployment to suit their stated preferences. Instead, I propose turning our focus to more foundational questions: the physical correlates of phenomenal experience, the binding problem, and the substrate question. By concentrating on these areas and employing methods such as research into physical correlates of consciousness and BCI-enabled exploration of other substrates, we can make phenomenal experience more legible and perhaps even quantifiable to a degree. This increased understanding will enable us to better prioritize what truly matters in AI welfare.

Below, I will explore the importance of these fundamental questions and outline practical approaches to this research. By addressing these foundational issues, we can develop a more robust framework for understanding and addressing AI welfare, grounded in a deeper comprehension of consciousness itself.

Philosophical Foundations

Sentientism

I am a sentientist and take it as an axiom that what is important with regard to the treatment of entities (AI, human, transhuman, animal, alien, etc.) is the nature of “their” phenomenal experience, direct or indirect. (I put “their” in quotes because this experience might not be holistic, fully combined, or tied to identity.) I’m aware that there is debate about whether this is required for moral patienthood, but I will not address that question here. Instead, I proceed with the axiom that the effect on the general phenomenon of conscious experience is what matters for welfare, whether that effect is direct (“we treat this person well because we care about their experience”) or indirect (“we treat this star/planet well because it matters to some entity with phenomenal experience, whether from an aesthetic, preference-based, or needs-based perspective”). (I am not a fan of Chalmers’s Vulcan argument for multiple reasons, including that it is trivially obvious even among humans that what “harms” people depends on the phenomenal structure of their individual minds: some people are extreme masochists, other people don’t care if you take their possessions, and so on. One could also ask questions like: What is “good” for organisms that, for example, die or are killed as a necessary part of the act of reproduction?)

Identity-Free Ethics

I also subscribe to Parfit's positions on identity, which have profound implications for how we consider consciousness and welfare, particularly in the context of AI. Parfit argues against a strict notion of personal identity, suggesting instead that psychological continuity and connectedness are what matter for survival and ethical consideration.

Consider these two scenarios:

1. The relationship between person X at point A and person X at point B (e.g., you now and you in 10 years)
2. The relationship between person X at point A who causally affects person Y at point B (e.g., you now and your hypothetical uploaded mind in 10 years)

Parfit contends that the difference between these scenarios is one of degree, not of kind. In both cases, there's a causal chain of psychological states, memories, and experiences connecting the earlier and later persons. The fact that in the second scenario, this chain involves a change in substrate (from biological to digital) doesn't fundamentally alter the nature of the relationship.

This view has several important implications for AI welfare:

Continuity over discrete identity: Instead of focusing on discrete, clearly bounded "individuals," we should consider the continuity of conscious experiences. This is particularly relevant for AI systems that might not have clear boundaries of selfhood or might exist across distributed networks.
Gradients of similarity: We should think in terms of degrees of psychological similarity or continuity, rather than binary distinctions between "same" and "different" individuals. This could inform how we consider the welfare of AI systems as they evolve and change over time.
Fusion and fission of minds: Parfit's view allows us to ethically consider scenarios where conscious experiences might merge or split, which could be relevant for AI systems that can be copied, merged, or partitioned.
Substrate independence: If psychological continuity is what matters, not the underlying substrate, this supports the idea that consciousness and welfare considerations could apply to non-biological systems like AIs.
Time-slice welfare: This perspective encourages us to focus on the welfare of momentary experiential states, rather than long-term identities. This could be particularly relevant for AI systems that might not have long-term stable identities in the way humans do.

As such, I am mostly interested in the question of sentience and phenomenal consciousness without strict regard for identity and "individuals." This approach allows us to consider the welfare of AI systems in a more nuanced way, focusing on the quality and intensity of conscious experiences rather than getting caught up in questions of whether an AI system is a distinct, persistent individual in the way we typically think about human persons.

This perspective also aligns well with the challenges we face in understanding and ethically considering highly distributed or rapidly changing AI systems. It provides a framework for thinking about welfare that can adapt to novel forms of consciousness that might emerge in artificial systems.

What We Need to Discern

Legible Measures of Welfare

From the position of sentientism, the important question is: How do we make phenomenal experience legible (and ideally, measurable) in a way that's meaningful across diverse forms of consciousness? This is the only way we can prioritize our actions for the sake of sentient beings. Making sentience legible will inevitably involve understanding experiential reality along a variety of dimensions (e.g., valence and arousal).

Welfare Theories Without Regard for Sentience Are Insufficient

It’s easy to claim that we should consider AI welfare without first making experience legible in this way, and even arguments that don’t assume sentience carry some weight. Welfare, according to some definitions, doesn’t require sentience at all. But we simply cannot say how much weight to give to the welfare of an entity unless we reduce some of our uncertainty about its sentience.

A given AI might perform an order of magnitude more computation than the neurons of a human cerebral cortex–but even if we assume there is some sentience involved, we know nothing about the extent, valence or content of that experience. Humans certainly aren’t experiencing everything that they are computing! (Or at least, the part we’re aware of isn’t.) Is the welfare of this 10x-human AI worth that of 10 humans? Or 1/10th of a human? Or 0/10th?

Thought experiment: Would you give your life to save a billion human lives? What about a billion insects? How would you decide whether to do this for AIs, and what number of AIs would compel action? (If death/deletion isn’t compelling, imagine the question is about “X number of years of apparent suffering” instead).

To make our models more legible and specific, we ideally need to have some level of confidence about:

Which computations involve sentience? And which substrates or forms of computation are compatible with this sentience?
What is the nature of their phenomenological experience?
What is the extent, magnitude, or “amount” of their experience?
How is this experience bound into a unified process? (We probably don’t care much about micro-experiential zombies.)

Reframing the Question

Our specific debate question is whether AI welfare should be an EA priority. But if we approach this from a sentientist perspective, I argue that a focus on AI welfare means that we must first focus on antecedent questions that will have import not only for AI, but also for brain uploads, BCI-enhanced humans, and an entire spectrum of future entities that use non-biological substrates to some degree. Imagine a human with an advanced nanotechnological BCI such that their metacortex includes a significant synthetic computation module: Will this represent an expansion of consciousness, such that this enhanced human’s mind is the equivalent of multiple natural human minds in terms of phenomenal magnitude? Or will these synthetic computation modules merely deliver content to the existing conscious biological substrate of the human’s organic brain? We currently lack a good way to think about this, but if Kurzweil’s latest projections (and those of Neuralink and Kernel) are correct, it will likely be of critical importance at some point in the next decade.

Why This Is One of the Most Important Questions of the Century

Non-biological computation and resource consumption will likely soon vastly exceed the amount of (presumably conscious) biological computation. Total world compute capacity (taking just GPUs and TPUs into account) was around 4e21 FP32 FLOP/s in Q1 2023 [3]. If we go by AI Impacts’ median estimate for the FLOP/s in a human brain (1e15) [2] or Kurzweil’s current estimate (1e14) [1], this is roughly 4e6-4e7 human-equivalents. Importantly, I am not claiming that this is equivalent to human cognition or consciousness, only to neural computation specifically. (And it’s not entirely implausible that it could take 1e25 FLOP/s to simulate a human’s functional consciousness at full resolution, as per that same AI Impacts report.)

Considering the massive GPU investments in 2023 and 2024, this has likely expanded significantly. If this total compute capacity triples every year (which I suspect is probably very conservative), then, counting from the Q1 2023 baseline and taking a world population of about 8e9, we’re only about 5-7 years away from having more digital human-brain-equivalents than actual humans, though this number is highly uncertain and we do not actually know how much computation would be required for human-level consciousness as opposed to cognitive equivalents. Additionally, even if matmuls in silicon have some level of consciousness, we are highly uncertain about how consciousness scales with compute, if at all.
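The back-of-the-envelope arithmetic in this section can be reproduced in a few lines of Python. This is a rough sketch under stated assumptions, not a forecast: the compute and per-brain figures are the ones quoted in the text, while the world population of about 8e9 and the helper names (`human_equivalents`, `years_to_parity`) are illustrative assumptions that do not come from the cited sources.

```python
import math

# Figures quoted in the text.
WORLD_COMPUTE_FLOPS = 4e21   # GPUs + TPUs, FP32 FLOP/s, Q1 2023
ANNUAL_GROWTH = 3.0          # the "triples every year" scenario

# Assumption for illustration: roughly 8 billion living humans.
HUMAN_POPULATION = 8e9

BRAIN_FLOPS_ESTIMATES = {
    "Kurzweil (1e14)": 1e14,
    "AI Impacts median (1e15)": 1e15,
}


def human_equivalents(total_flops, brain_flops):
    """Number of human brains' worth of raw computation in total_flops."""
    return total_flops / brain_flops


def years_to_parity(total_flops, brain_flops, population, growth):
    """Years of compounding growth until brain-equivalents exceed population."""
    ratio = population / human_equivalents(total_flops, brain_flops)
    if ratio <= 1.0:
        return 0.0  # parity has already been reached
    return math.log(ratio) / math.log(growth)


for label, brain_flops in BRAIN_FLOPS_ESTIMATES.items():
    equivalents = human_equivalents(WORLD_COMPUTE_FLOPS, brain_flops)
    years = years_to_parity(
        WORLD_COMPUTE_FLOPS, brain_flops, HUMAN_POPULATION, ANNUAL_GROWTH
    )
    print(f"{label}: {equivalents:.1e} brain-equivalents, parity in ~{years:.1f} years")
```

With these inputs the crossover comes out at roughly five to seven years after the Q1 2023 baseline. Swapping in the 1e25 FLOP/s upper estimate, or a slower growth rate, pushes it out considerably, which is one more reason to treat the figure as highly uncertain.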

If most of this compute is used for AI inference and training, the moral consequences of digital sentience might become enormous very quickly (regardless of how correct our initial estimates are). But we have little to no information about whether this is the case because we don’t know anything about digital sentience yet.

Interestingly, Kurzweil [1] seems to think that much of this capacity will serve as directly connected compute additions to the human brain, provided we and our AI research assistants are able to make sufficient progress on BCIs in the early 2030s. Whether this will be the case remains to be seen, but the question of digital sentience remains important nevertheless.

What We Would Do If We Were Serious About This Question

Any issue that has the potential to affect humanity-scale amounts of sentience deserves a great deal of attention, much more than consciousness research has gotten so far. Two obstacles stand in the way: tractability (consciousness research seems hard) and recent psychological paradigms, which have largely rejected human reports about the nature of their experience. Fortunately, the latter obstacle has waned considerably, though some stigma remains.

If we are to do research into consciousness, it seems necessary to accept human reports as evidence, even if they are not seen as definitive. Human reports gathered under various interventions can reduce uncertainty and update our priors, even if they are not absolutely reliable. The existing field of neurophenomenology takes this approach. I suggest that this field deserves significant expansion and investment over the next decade.
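One way to picture how imperfect reports can still move our estimates is a simple Bayesian update. The sketch below is purely illustrative: the hypothesis, the prior, and the two likelihoods are hypothetical numbers invented for the example, not measurements from any study, and treating successive reports as independent is an idealization.

```python
def update(prior, p_report_if_true, p_report_if_false):
    """Return P(hypothesis | report) via Bayes' rule for a binary hypothesis."""
    evidence = prior * p_report_if_true + (1.0 - prior) * p_report_if_false
    return prior * p_report_if_true / evidence


# Hypothetical setup: a theory predicts that some intervention changes
# experience in a particular way, and subjects report that change.
PRIOR = 0.5               # assumed starting credence in the theory
P_REPORT_IF_TRUE = 0.7    # assumed chance of such a report if the theory holds
P_REPORT_IF_FALSE = 0.4   # assumed chance of the same report otherwise

credences = [PRIOR]
for _ in range(3):  # three reports, treated as independent
    credences.append(update(credences[-1], P_REPORT_IF_TRUE, P_REPORT_IF_FALSE))

print([round(c, 3) for c in credences])
```

Even though each report here is only weakly diagnostic, the credence climbs from 0.5 to about 0.84 after three of them. A report that is exactly as likely under either hypothesis would leave the credence unchanged.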

The following lines of research seem important, though I'm sure there are others:

Neurophenomenological research:
1. Biological correlates of consciousness: What are the results of various interventions on phenomenological consciousness? Can we intervene in ways that test various theories of consciousness? For instance, can we manipulate the integration of information in neural networks and observe corresponding changes in reported conscious experiences?
2. BCI-based studies: As brain-computer interface technology advances, we can explore how direct neural interfaces affect conscious experience. This could provide insights into how consciousness might manifest in artificial systems. I suspect that many future lines of research will flow through these kinds of tools, not only because they enable us to record neural activity more precisely, but also because by manipulating variables and seeing what humans report in response we will be able to learn an enormous amount. For example, is it possible to expand the breadth of consciousness in a way that spans both biological and synthetic components? Can we perform temporary types of ablation that seem to disrupt consciousness?
3. Disciplined research into the structure and valence of consciousness through meditation and psychedelics: These altered states of consciousness can offer unique perspectives on the nature of experience, potentially revealing aspects of consciousness that are not apparent in ordinary waking states.
Alignments with biological correlates:
1. Mechanistic interpretability for experiential correlates: As we develop more sophisticated AI models, we should strive to understand their internal representations and processes in ways that are relevant to theories of consciousness. By comparing these to what we know about biological correlates of consciousness, we might gain insights into potential forms of machine consciousness.
Testing multiple theories of consciousness: We need to design experiments that can differentiate between various theories of consciousness (e.g., Integrated Information Theory, Global Workspace Theory, various physicalist theories). This will help us narrow down the most plausible explanations for conscious experience.
Interdisciplinary collaboration: Given the complex nature of consciousness, we need to foster collaboration between neuroscientists, philosophers, AI researchers, and experts in other relevant fields. This could involve creating dedicated research centers or funding programs that encourage cross-disciplinary work on consciousness and AI welfare.
Ethical frameworks: As we gain more understanding of consciousness, we can then proceed to develop robust ethical frameworks for dealing with potentially sentient AI systems. I think it will be important to avoid developing these frameworks prematurely. However, when we are able to create them, they should address questions of AI rights, responsible development practices, and how to weigh AI welfare against human welfare.

Conclusion

The question of AI welfare is not just a philosophical curiosity—it is a pressing issue that could have profound implications for the future of sentience in our universe. Within a generation, digital minds may outweigh the total sum of human minds (at least in terms of computation), and this will likely include personality uploads and enhanced humans as well.

By prioritizing research into the physical correlates of consciousness, the binding problem, and the substrate question, we can begin to develop a more nuanced and informed approach not only to AI welfare, but to the entire spectrum of future beings enabled through artificial computation. This research won't just benefit AI and digital minds—it has the potential to revolutionize our understanding of consciousness itself, with far-reaching implications for human enhancement, digital sentience, and our ethical obligations to all forms of conscious experience.

These lines of research are challenging and fraught with uncertainties. We may never fully solve the hard problem of consciousness, but we can make significant progress in reducing our uncertainties and developing more robust frameworks for considering the welfare of diverse forms of sentience.

As effective altruists, we have a responsibility to consider the welfare of all sentient beings—including those we might create. By making consciousness research a priority, we're not just preparing for a future with potentially sentient AI; we're expanding our ethical circle and deepening our understanding of what it means to be conscious.

The computational power at our disposal is growing at an unprecedented rate, potentially creating vast amounts of artificial cognition. Whether this translates to an explosion of conscious experience—with all its attendant joys and sufferings—is a question we cannot afford to ignore. This research should be made a priority now, before we inadvertently create legions of suffering digital minds or miss the opportunity to foster flourishing new forms of consciousness.

My specific recommendations for the EA community:

Funding: Allocate significant resources to interdisciplinary consciousness research programs.
Collaboration: Foster partnerships between EA organizations, academic institutions, and tech companies to advance this research.
Policy advocacy: Push for increased government funding and support for consciousness studies and their ethical implications.
Public engagement: Promote public understanding of these issues to build broader support for this crucial work.
Talent pipeline: Encourage EA-aligned individuals to pursue careers in neuroscience, philosophy of mind, and related fields.

By taking these steps, we can work towards a future where we approach the development of AI and other advanced technologies with a deep understanding of consciousness and a commitment to the welfare of all sentient beings.

[1] Kurzweil, Ray, The Singularity is Nearer (2024)
[2] AI Impacts, Brain performance in FLOPS: https://aiimpacts.org/brain-performance-in-flops/
[3] AI Impacts, Computing capacity of all GPUs and TPUs: https://wiki.aiimpacts.org/doku.php?id=ai_timelines:hardware_and_ai_timelines:computing_capacity_of_all_gpus_and_tpus
