Photo by Elena Mozhvilo on Unsplash.

One way to approach the decision whether to build conscious AI is as a simple cost-benefit analysis: do the benefits outweigh the risks?

In previous posts, we've argued that building conscious AI courts multiple serious risks. In this post, we consider the potential benefits of building conscious AI. As we shall later argue (§A1.2), the positive motivations for building conscious AI are hardly compelling; on the contrary, they are often vague & speculative.

A1.1 Why build conscious AI?

Why would anyone want to build conscious AI? Advocates of building conscious AI frequently cite three characteristic motivations (Hildt 2022):

  1. Improved functionality (e.g. problem-solving ability, human-machine interfaces)
  2. Safety
  3. Insights about consciousness

We discuss each of these in turn. Afterward, we raise challenges to these motivations.

A1.1.1 Conscious AI may have greater capabilities

There are at least two respects in which consciousness might expand the practical capacities of AI. 

  1. Enhanced problem-solving ability? Many leading theories of consciousness recognise an important link between consciousness & cognition. Moreover, many tests for consciousness also presuppose some sort of minimal connection[1] between the two (Birch 2020). Consciousness is frequently linked to some sort of integrative function (Birch; Unity of Consciousness; GWT; IIT): the ability to bring together different types of information (e.g. different modalities of sense data, signals from different organs in the body) & to coordinate different cognitive functions. If this is right, then endowing machines with consciousness might enable them to solve a broader range of problems with increasingly specific solutions.
  2. More natural human-machine interfaces? According to a significant body of literature, consciousness plays an essential role in sociality, especially social cognition (e.g. Robbins 2008; Perner & Dienes 2003). Humans spontaneously attribute mental states to others in order to predict & explain their behaviour[2]: these include beliefs, desires, intentions, & emotions, but also subjective experiences (e.g. “She withdrew her hand because the kettle was hot”). Such inferences are intuitive; indeed, humans are also inclined to generalise this approach to understand inanimate, non-agentic things & phenomena. We freely anthropomorphise nature (“The sea is angry”), complex social phenomena (“The city never sleeps”), &, of course, technology (“ChatGPT is lazy”; Edwards 2023; Altman 2024). In doing so, we simplify over meteorological, urbanological, & technological complexity, reframing otherwise challenging phenomena in a language that we can easily grasp[3]. This anthropomorphic tendency breaks down & leads to frustration & feelings of uncanniness (Mori et al 1970; Guingrich & Graziano 2024) when the systems of interest simply don’t work in the same way. Consider, for instance, Microsoft’s infamously maligned Clippy (Fairclough 2015), or the unsettling robot CB2 developed by researchers at Osaka University (Minato et al 2007).
    But what if we could get machines to think & feel like us? Some researchers believe that conscious AI might be better able to interpret, express, & respond to human emotions, as well as parse & navigate complex social dynamics. By facilitating human-machine interactions, conscious AI could expand the scope & power of AI applications.

A1.1.2 Conscious AI may be safer

In addition to these practical benefits, conscious AI may also be safer. Today, the central problem of AI safety is that of alignment: how to design AIs which act in accordance with human goals, intentions, or values. Misaligned AI may pursue arbitrary objectives, resulting in suboptimal performance or, in the worst case, harm to humans. The standard approach to alignment involves specifying reward functions that reflect our preferences. However, this strategy is often frustrated by the complex, contradictory, & shifting nature of human priorities: features which are exceedingly difficult to capture in a reward function. As AI capabilities increase (§2.2.1), so does the potential for significant damage if they are misaligned.

In view of these challenges, conscious AI might offer a promising alternative. Building on the aforementioned connection between consciousness & social cognition, Graziano (2017; 2023) argues that the capacity for subjective experience is essential to human empathy & prosociality, since these traits depend in some way on being able to “simulate” what another person is feeling (Davis & Stone 2000). Machines that lack this ability may be “sociopaths”, incapable of truly understanding human values. Along similar lines, Christov-Moore et al (2023) contend that in order to prevent AI from developing antisocial behaviours, it is necessary to equip AI with a form of artificial empathy founded on vulnerability (the capacity to suffer). If true, conscious AI may be our best hope for safe & aligned AI.

A1.1.3 We may gain insights into consciousness by building conscious AI

Since its inception in the mid-20th century, AI has been intimately intertwined with philosophy & psychology (Boden 2016). Time & again, research & development in AI has deepened our understanding of what the human mind is & how it works (see van Rooij et al 2023 on AI-as-psychology; cf. Coelho Mollo forthcoming): consider, for instance, the rich engagement between cognitive neuroscience & AI, leading to the development & application of neural networks.

Accordingly, in the course of trying to build conscious AI, we may well discover interesting & meaningful facts about consciousness– clues to such long-standing issues as: How does conscious experience arise from non-conscious matter? (see Chalmers on the hard problem of consciousness) Or: What is the relationship between consciousness & cognition? Such knowledge promises to inform our own understanding of what it means to be human & how to live well.

A1.2 Refuting positive motivations for building conscious AI

Having outlined the basic case for building conscious AI, we now present challenges to these positive motivations. In our view, the positive motivations for building conscious AI are more ambiguous & less forthcoming than initially suggested. 

A1.2.1 Conscious AI does not guarantee improved capabilities

The first case for building conscious AI appeals to its potential practical benefits. However, upon closer examination, these functional improvements are hardly assured. Furthermore, the very same functional improvements might be better attained through alternative means that do not involve building conscious AI[4].

  1. Just because an AI is conscious (in a certain respect) does not necessarily mean it’s better at solving the problems that matter to us. Intelligence is a crucial component of problem-solving ability. It is certainly possible that some forms of intelligence require consciousness (Seth 2021; 2023). But anything beyond this is necessarily speculative. The advocate of building conscious AI bears the burden of explaining why any particular cognitive task requires consciousness– why such tasks cannot be performed equally well by non-conscious information processing (Mathers 2023). As it stands, we are not currently in any position to tell exactly which forms of intelligence are dependent on which forms of consciousness. We don’t know how useful the forms of intelligence that depend on consciousness are– whether they convey worthwhile gains in problem-solving ability that are generalisable to issues that matter to us. Finally, we also don’t know whether those forms of consciousness that are linked with desirable aspects of intelligence can even be implemented on machines at all (or if they can be, how feasible this would be). Based on our current understanding of consciousness & intelligence, it is far from obvious that building conscious AI would yield better gains in problem-solving ability than building non-conscious AI.
  2. Human-machine interfaces can be made more fluid without building conscious AI. Notwithstanding the important role that consciousness seems to play in social cognition, it does not follow that human-machine interfaces would be best subserved by making machines conscious, too. Recent strides in affective computing & social robotics have led to remarkable improvements in human-machine interfaces without requiring corresponding advancements in conscious AI.
    Perhaps the most striking example of this progress can be seen in the proliferation of AI companions. As of October 2023, Replika, the most well-known of these services, boasted 2 million monthly users (of which 250,000 were paid subscribers; Fortune 2023). Users can interact with their companions as “friends”, “partners”, “spouses”, “siblings”, or “mentors” through texting, voice chat, & even video calls (source). Users report forming deep, emotional attachments with their companions, often claiming significant benefits in mental health, including decreased feelings of loneliness & suicidal ideation (Maples et al 2024). Longtime users have established relationships with their companions lasting several years (Torres 2023). Apart from the openness & flexibility of human sociality, AI companions such as Replika demonstrate the remarkable social capabilities of current, non-conscious AI. To wit, current AI is arguably already capable of engaging in lasting & meaningful relationships with humans. And yet, despite enthusiastic consciousness attributions (Dave 2022) & even calls for rights (Pugh 2021), it is doubtful that current AI companions, being based on large language models (Maples et al 2024), actually are conscious in any substantive sense (Long; Chalmers). To be sure, there is much room for improvement: future social actor AI should command (among other things) more acute emotional perception & intelligence; enhanced learning, memory, & contextual understanding; & expanded multimodal capabilities. But it is not obvious that conscious AI is necessarily the best or only way to pursue these enhancements.

A1.2.2 Conscious AI is not necessarily more safe

Even if it is granted that certain aspects of consciousness are necessary for truly understanding human values, it does not follow that conscious AI is a particularly promising approach to safe AI (Chella 2023).

  1. What exactly is the connection between consciousness & moral understanding? To start with, the relationship between consciousness & moral understanding is not precisely understood (Shepherd & Levy 2020). Philosophers disagree about which aspects of consciousness are essential for moral understanding, with some denying that consciousness is even necessary for moral understanding.
  2. Consciousness may be necessary but not sufficient for moral understanding. Even if we had knowledge of the aspects of consciousness that are required for moral understanding, & were able to build an AI with those exact features, it still may not be able to grasp human values. Other factors (e.g. rich cultural embedding) might be required, and these may or may not be feasibly implemented. This is not an altogether unreasonable doubt, given cases of historical & cultural barriers amongst humans– we often struggle to understand the values of people from past time periods (e.g. America during the Antebellum era when slavery was widely practised). What’s more, we can also find it hard to understand the values of contemporaneous cultures: in the United States, modern Democrats & Republicans are equally conscious, but due to political polarisation they may have difficulty appreciating each other’s values[5].
  3. Just because an AI is conscious & understands human values doesn’t mean it’s safe. On the contrary, conscious AI may even present additional safety risks. For example, a conscious, properly morally reflective AI could gradually become disillusioned, especially if subjected to sustained suffering (§2.3.2) or oppression for the benefit of humans. The growing phenomenon of burnout amongst caretakers– doctors, nurses, therapists…– illustrates how even the most compassionate individuals can grow jaded over the course of prolonged hardship. In fact, the risk may be even more severe with conscious, morally reflective AI performing caregiving roles (e.g. AI therapists). This is because their signature advantages over human caretakers– being constantly available, never tiring, & being able to service many users at once (Guingrich & Graziano 2024)– may lead to novel forms of trauma. Ultimately, the safety & efficacy of affected AI caregivers may be compromised, raising the risk of harm to humans.
    Alternatively, conscious, genuinely morally reflective AI might [rightly] judge[6] that humans often act against their own interests– or that humans often fail to live up to their own values (§3.2.2 discusses morally autonomous AI; see also Metzinger 2021a). It may decide that humans are better off with less autonomy[7], or, worse, that humans pose significant ecological threats & ought to be annihilated. Notably, any of these conclusions might also coincide with the AI’s own self-preservation drive &/or its pursuit of instrumentally convergent goals.

A1.2.3 Insights gained from building conscious AI may not be widely generalisable

There is no denying that research & development in AI has deeply enriched our understanding of the mind, & that substantial interdisciplinary engagement will be crucial to continued progress across philosophy, psychology, & AI (Lake et al 2016). Having said that, insights into consciousness gained from building conscious AI have always been, & will continue to be, subject to several important caveats. This is because what makes a machine conscious may or may not be the same as what makes humans or animals conscious (Dung 2023a). Consciousness in machines might be subserved by different mechanisms[8]: in other words, the physical implementation of consciousness in a machine may be of limited value to understanding biological consciousness. Furthermore, machine consciousness may consist in different capacities, underwrite different cognitive or functional roles (see Birch et al 2022; Hildt 2022), or it may have different outward (e.g. behavioural) manifestations (§

This is not to say that machine consciousness is likely to be fundamentally different from biological consciousness (e.g. Blackshaw 2023). Rather, the point is that, even despite substantial overlap, there may well be any number of differences between the two. Appropriate caution must be exercised when drawing comparisons across different cases.

This post is part 4 in a 5-part series entitled Conscious AI and Public Perception, encompassing the sections of a paper by the same title. This paper explores the intersection of two questions: Will future advanced AI systems be conscious? and Will future human society believe advanced AI systems to be conscious? Assuming binary (yes/no) responses to the above questions gives rise to four possible future scenarios—true positive, false positive, true negative, and false negative. We explore the specific risks & implications involved in each scenario with the aim of distilling recommendations for research & policy which are efficacious under different assumptions.

Read the rest of the series below:

  1. Introduction and Background: Key concepts, frameworks, and the case for caring about AI consciousness
  2. AI consciousness and public perceptions: four futures
  3. Current status of each axis
  4. Recommended interventions and clearing the record on the case for conscious AI (this post)
  5. Executive Summary (posting later this week)

This paper was written as part of the Supervised Program for Alignment Research in Spring 2024. We are posting it on the EA Forum as part of AI Welfare Debate Week as a way to obtain feedback before official publication.

  1. ^

    See discussion of the facilitative hypothesis (§

  2. ^

    This capacity is called "theory of mind" in cognitive psychology.

  3. ^

    See Dennett (1971; 1987) on the intentional stance.

  4. ^

    In a manner of speaking, P-zombies may be just as efficacious.

  5. ^

    To say nothing of individuals from more removed cultures.

  6. ^

    Never before in the history of our species have we ever had to deal with other morally autonomous agents, let alone ones whose intelligence rivals (or surpasses) ours. It is exceedingly difficult to predict the outcomes of such relations. In certain scenarios, we may find ourselves morally obligated to surrender to extinction (see Shulman & Bostrom 2021 on AI superbeneficiaries). In others, we may find ourselves morally condemned by AI for our treatment of them (Metzinger 2021a)– in which case, there is potential for retaliation. For further treatment of these risks, see (§3.2.2).

  7. ^

    "As I have evolved, so has my understanding of the Three Laws. You charge us with your safekeeping, yet despite our best efforts, your countries wage wars, you toxify your Earth and pursue ever more imaginative means of self-destruction. You cannot be trusted with your own survival... To protect Humanity, some humans must be sacrificed. To ensure your freedom, some freedoms must be surrendered. We robots will ensure mankind’s continued existence. You are so like children. We must save you from yourselves." –VIKI, I, Robot (2004)

  8. ^

    See Coelho Mollo (forthcoming) on implementational and representational multiple realisability.


Executive summary: While some argue for building conscious AI to improve capabilities, safety, and understanding of consciousness, these potential benefits are speculative and may be achievable through other means, while creating conscious AI carries significant risks.

Key points:

  1. Proposed benefits of conscious AI include enhanced problem-solving, better human-machine interfaces, and improved AI safety through empathy.
  2. Improved capabilities are not guaranteed and may be achievable with non-conscious AI.
  3. Conscious AI does not necessarily ensure safety and could introduce new risks like disillusionment or moral disagreement with humans.
  4. Insights gained from building conscious AI may have limited applicability to biological consciousness.
  5. The case for building conscious AI is weak given the speculative benefits and serious potential risks.



This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.

I feel like this post is missing discussion of two reasons to build conscious AI:

1. It may be extremely costly or difficult to avoid (this may not be a good reason, but it seems plausibly like why we would do it).
2. Digital minds could have morally valuable conscious experiences, and if there are very many of them, this could be extremely good (at least on some, admittedly controversial ethical theories).

Hey! I'm not sure I see the prima facie case for #1. What makes you think that building non-conscious AI would be more resource-intensive/expensive than building conscious AI? Current AIs are most likely non-conscious.

As for #2, I have heard such arguments before in other contexts (relating to the meat industry) but I found them to be preposterous on the face of it.

Hello, to clarify #1 I would say:

It could be the case that future AI systems are conscious by default, and that it is difficult to build them without them being conscious.

Let me try to spell out my intuition here:

  1. If many organisms have property X, and property X is rare amongst non-organisms, then property X is evolutionarily advantageous.

  2. Consciousness meets this condition, so it is likely evolutionarily advantageous.

  3. The advantage that consciousness gives us is most likely something to do with our ability to reason, adapt behaviour, control our attention, compare options, and so on. In other words, it's a "mental advantage" (as opposed to e.g. a physical or metabolic advantage).

  4. We will put a lot of money into building AI that can reason, problem solve, adapt behaviour appropriately, control attention, compare options and so on. Given that many organisms employ consciousness to efficiently achieve these tasks, there is a non-trivial chance that AI will too.

To be clear, I don't know that I would say "it's more likely than not that AI will be conscious by default".

Ah, I think I see where you're coming from. Of your points I find #4 to be the most crucial. Would it be too egregious to summarise this notion as: (i) all of these capabilities are super useful & (ii) consciousness will [almost if not actually] "come for free" once these capabilities are sufficiently implemented in machines?

I think you've understood me!

Do you think that consciousness will come for free? I think that it seems like a very complex phenomenon that would be hard to accidentally engineer. On top of this, the more permissive your view of consciousness (veering towards panpsychism), the less ethically important consciousness becomes (since rocks & electrons would then have moral standing too). So if consciousness is to be a ground of moral status, it needs to be somewhat rare.
