Quick takes

If you've liked my writing in the past, I wanted to share that I've started a Substack: https://peterwildeford.substack.com/

Ever wanted a top forecaster to help you navigate the news? Want to know the latest in AI? I'm doing all that on my Substack: forecast-driven analysis of AI, national security, innovation, and emerging technology!

Something that I personally would find super valuable is seeing you work through a forecasting problem "live" (in text). Take an AI question that you would like to forecast, and then describe how you actually go about making that forecast: the information you seek out, how you analyze it, and especially how you make it quantitative. That would

  1. make the forecast process more transparent for someone who wanted to apply skepticism to your bottom line
  2. help me "compare notes", ie work through the same forecasting question that you pose, come to a conclusion, a
... (read more)
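
A minimal sketch (in Python) of the kind of quantitative step being asked for above: start from a base rate, then adjust the odds with a rough Bayes factor for each piece of evidence. The question, base rate, and factors are made-up placeholders, not anyone's actual forecast.

```python
# Toy sketch of a quantitative forecasting step: start from a base rate,
# then nudge the odds with rough Bayes factors for each piece of evidence.
# The question, base rate, and factors below are all made-up placeholders.

def to_odds(p: float) -> float:
    return p / (1 - p)

def to_prob(odds: float) -> float:
    return odds / (1 + odds)

# Hypothetical question: "Will benchmark X be saturated by end of 2026?"
base_rate = 0.30  # assumed historical rate for similar benchmarks (placeholder)

# Rough likelihood ratios for each piece of evidence (all assumptions):
evidence_factors = {
    "recent scaling results look strong": 2.0,
    "benchmark authors plan a harder v2": 0.7,
}

odds = to_odds(base_rate)
for description, factor in evidence_factors.items():
    odds *= factor
    print(f"{description}: factor {factor} -> P = {to_prob(odds):.2f}")

print(f"Final forecast: {to_prob(odds):.0%}")
```

The point of writing it out this way is that each adjustment becomes something a reader can dispute individually, which is exactly the transparency the comment above is asking for.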

I sometimes say, in a provocative/hyperbolic sense, that the concept of "neglectedness" has been a disaster for EA. I do think the concept is significantly over-used (ironically, it's not neglected!), and people should just look directly at the importance and tractability of a cause at current margins.

Maybe neglectedness is useful as a heuristic for scanning thousands of potential cause areas. But ultimately, it's just a heuristic for tractability: how many resources are going towards something is evidence about whether additional resources are likely to be i... (read more)

Showing 3 of 11 replies

But neglectedness as a heuristic is very good precisely for narrowing down what you think the good opportunity is. Every neglected field is a subset of a non-neglected field. So pointing out that great grants have come in some subset of a non-neglected field doesn't tell us anything.

To be specific, it's really important that EA identifies the area within that neglected field where resources aren't flowing, to minimize funging risk. Imagine that AI safety polling had not been neglected and that in fact there were tons of think tanks who planned to do AI saf... (read more)

4
tlevin
I think the opposite might be true: when you apply it to broad areas, you're likely to mistake low neglectedness for a signal of low tractability, and you should just look at "are there good opportunities at current margins." When you start looking at individual solutions, it starts being quite relevant whether they have already been tried. (This point was already made here.)
2
tlevin
What is gained by adding the third thing? If the answer to #2 is "yes," then why does it matter if the answer to #3 is "a lot," and likewise in the opposite case, where the answers are "no" and "very few"? Edit: actually yeah the "will someone else" point seems quite relevant.

I wish more work focused on digital minds really focused on answering the following questions, rather than merely investigating how plausible it is that digital minds similar to current-day AIs could be sentient:

  1. What do good sets of scenarios for post-AGI governance need to look like to create good futures/avoid terrible ones (or whatever normative focus we want), assuming digital minds are the dominant moral patients going into the future?
     1a) How does this differ depending on what sorts of things can be digital minds, e.g. whether sentient AIs are likely to

... (read more)
3
Bradford Saad
I'd also like to see more work on digital minds macrostrategy questions such as 1-3. To that end, I'll take this opportunity to mention that the Future Impact Group is accepting applications for projects on digital minds (among other topics) through EoD on March 8 for its part-time fellowship program. I'm set to be a project lead for the upcoming cohort and would welcome applications from people who'd want to work with me on a digital minds macrostrategy project. (I suggest some possible projects here but am open to others.)  I think the other project leads listed for AI sentience are all great and would highly recommend applying to work with any of them on a digital minds project (though I'm unsure if any of them are open to macrostrategy projects).
8
Ryan Greenblatt
I think work of the sort you're discussing isn't typically called digital minds work. I would just describe this as "trying to ensure better futures (from a scope-sensitive longtermist perspective) other than via avoiding AI takeover, human power grabs, or extinction (from some other source)". This just incidentally ends up being about digital entities/beings/value because that's where the vast majority of the value probably lives.

The way you phrase (1) seems to imply that you think large fractions of expected moral value (in the long run) will be in the minds of laborers (AIs we created to be useful) rather than things intentionally created to provide value/disvalue. I'm skeptical.

You're sort of right on the first point, and I've definitely counted that work in my views on the area. I generally prefer to refer to it as 'making sure the future goes well for non-humans' - but I've had that misinterpreted as just focused on animals.

I think for me the fact that the minds will be non-human, and probably digital, matters a lot. Firstly, I think arguments for longtermism probably don't work if the future is mostly just humans. Secondly, the fact that these beings are digital minds, and maybe digital minds very different to us, means a lot... (read more)

Anyone else get a pig butchering scam attempt lately via DM on the forum?

I just got the following message 

> Happy day to you, I am [X] i saw your profile today and i like it very much,which makes me to write to you to let you know that i am interested in you,therefore i will like you to write me back so that i will tell you further about myself and send you also my picture for you to know me physically. 

[EMAIL]

I reported the user on their profile and opened a support request but just FYI


 

We've got 'em. Apologies to anyone else who got this message. 

4
Toby Tremlett🔹
Thanks for sharing, Seth. Would you mind DMing me their name? I'll ban the account, and mods will look into this.

(x-posted from LW)

Single examples almost never provide overwhelming evidence. They can provide strong evidence, but not overwhelming.

Imagine someone arguing the following:
 

1. You make a superficially compelling argument for invading Iraq

2. A similar argument, if you squint, can be used to support invading Vietnam

3. It was wrong to invade Vietnam

4. Therefore, your argument can be ignored, and it provides ~0 evidence for the invasion of Iraq.

In my opinion, 1-4 is not reasonable. I think it's just not a good line of reasoning. Regardless of whether you'... (read more)
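
To put rough, purely illustrative numbers on "strong but not overwhelming": a single observation that is, say, ten times likelier under one hypothesis than the other moves a 50% prior to about 91%, a big shift but far from certainty. A minimal sketch with made-up Bayes factors:

```python
# Toy illustration: how much a single piece of evidence with a given
# Bayes factor moves a 50% prior. Numbers are purely illustrative.

def posterior(prior: float, bayes_factor: float) -> float:
    odds = (prior / (1 - prior)) * bayes_factor
    return odds / (1 + odds)

for bf in [2, 10, 100]:
    print(f"Bayes factor {bf:>3}: 50% prior -> {posterior(0.5, bf):.1%}")

# Bayes factor   2: 50% prior -> 66.7%
# Bayes factor  10: 50% prior -> 90.9%
# Bayes factor 100: 50% prior -> 99.0%
```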

1-4 is only unreasonable because you've written a strawman version of 4. Here is a version that makes total sense:

1. You make a superficially compelling argument for invading Iraq

2. A similar argument, if you squint, can be used to support invading Vietnam

3. This argument for invading Vietnam was wrong because it made mistakes X, Y, and Z

4. Your argument for invading Iraq also makes mistakes X, Y and Z

5. Therefore, your argument is also wrong. 

Steps 1-3 are not strictly necessary here, but they add supporting evidence to the claims. 

As far as I c... (read more)

So long and thanks for all the fish. 

I am deactivating my account.[1] My unfortunate best guess is that at this point there is little point in my commenting more on the EA Forum, and at least a bit of harm caused by it. I am sad to leave behind so much that I have helped build and create, and even sadder to see my own actions indirectly contribute to much harm.

I think many people on the forum are great, and at many points in time this forum was one of the best places for thinking and talking and learning about many of the world's most important top... (read more)

Showing 3 of 12 replies
-34
titotal
19
Ben_West🔸
It feels appropriate that this post has a lot of hearts and simultaneously disagree reacts. We will miss you, even (perhaps especially) those of us who often disagreed with you.  I would love to reflect with you on the other side of the singularity. If we make it through alive, I think there's a decent chance that it will be in part thanks to your work.

After nearly 7 years, I intend to soon step down as Executive Director of CEEALAR, founded by me as the EA Hotel in 2018. I will remain a Trustee, but take more of a back seat role. This is in order to focus more of my efforts on slowing down/pausing/stopping AGI/ASI, which for some time now I've thought of as being the most important, neglected and urgent cause.

We are hiring for my replacement. Please apply if you think you'd be good in the role! Or send on to others you'd like to see in the role. I'm hoping that we find someone who is highly passionate ab... (read more)

I haven't visited CEEALAR and I don't know how impactful it has been, but one thing I've always admired about you via your work on this project is your grit and agency. When you thought it was a good idea back in 2018, you went ahead and bought the place. When you needed funding, you asked and wrote a lot about what was needed. You clearly care a lot about this project, and that really shows. I hope your successor will too.

I'm reminded of Lizka's Invisible Impact post. It's easy to spot flaws in projects that actually materialise but hard/impossible to crit... (read more)

Instead of "Goodharting", I like the potential names "Positive Alignment" and "Negative Alignment."

"Positive Alignment" means that the motivated party changes their actions in ways the incentive creator likes. "Negative Alignment" means the opposite.

Whenever there are incentives offered to certain people/agents, there are likely to be cases of both Positive Alignment and Negative Alignment. The net effect will likely be either positive or negative. 

"Goodharting" is fairly vague and typically just refers to just the "Negative Alignment" portion.&n... (read more)

I think the term "goodharting" is great. All you have to do is look up Goodhart's law to understand what is being talked about: the AI is optimising for the metric you evaluated it on, rather than the thing you actually want it to do.

Your suggestions would rob this term of its specific technical meaning, which makes things much vaguer and harder to talk about.
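
A toy illustration of the Goodhart dynamic described above (all numbers invented): when the evaluator can only see a proxy score, picking the action that maximizes the proxy can select for gaming the metric rather than for the thing actually wanted.

```python
# Toy Goodhart's-law illustration: an optimizer that picks actions by a
# proxy metric (true value + gaming) ends up favoring the most gameable
# action, not the most genuinely valuable one. All numbers are made up.

actions = {
    # name: (true_value, gaming)  -- proxy score = true_value + gaming
    "do the task well":       (10, 0),
    "do the task adequately": (6, 1),
    "game the metric":        (2, 12),
}

def proxy_score(true_value: float, gaming: float) -> float:
    return true_value + gaming

best_by_proxy = max(actions, key=lambda a: proxy_score(*actions[a]))
best_by_true = max(actions, key=lambda a: actions[a][0])

print(f"Chosen by proxy metric: {best_by_proxy!r} (true value {actions[best_by_proxy][0]})")
print(f"Actually best action:   {best_by_true!r} (true value {actions[best_by_true][0]})")
```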

I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI. (This assumes strong AI progress in the next 5-20 years)

  • AI auditors could track everything (starting with some key things) done for an experiment, then flag if there was significant evidence of deception / stats gaming / etc. For example, maybe a scientist has an AI recording their screen whenever it's on, but able to preserve necessary privacy and throw out the irrelevant data. (A rough sketch of this idea follows below.)
  • AI auditors could review any experi
... (read more)
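
A very rough sketch of what the auditor bullet above could look like in practice. This is a toy, rule-based stand-in for the AI component; the log format, thresholds, and example values are all invented for illustration.

```python
# Toy stand-in for an "AI auditor": scan a structured experiment log for
# simple signs of selective reporting. A real system would presumably use
# an LLM over much richer records; the log format here is invented.

from typing import Dict, List

def audit_log(analyses: List[Dict]) -> List[str]:
    flags = []
    reported = [a for a in analyses if a["reported"]]
    unreported = [a for a in analyses if not a["reported"]]
    if unreported and all(a["p_value"] < 0.05 for a in reported):
        flags.append(
            f"{len(unreported)} analyses were run but not reported; "
            "all reported results are significant (possible selective reporting)."
        )
    if sum(1 for a in analyses if 0.04 <= a["p_value"] < 0.05) >= 2:
        flags.append("Multiple p-values just under 0.05 (possible p-hacking).")
    return flags

log = [
    {"name": "primary outcome", "p_value": 0.041, "reported": True},
    {"name": "subgroup A",      "p_value": 0.048, "reported": True},
    {"name": "subgroup B",      "p_value": 0.600, "reported": False},
]
for flag in audit_log(log):
    print("FLAG:", flag)
```

A real auditor would need far richer records than this, which is where the screen-recording idea above comes in.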
Showing 3 of 7 replies
3
Parker_Whitfill
Agreed with this. I'm very optimistic about AI solving a lot of incentive problems in science. I don't know if the end case (full audits) as you mention will happen, but I am very confident we will move in a better direction than where we are now. I'm working on some software now that will help a bit in this direction!
4
titotal
I don't mind you using LLMs for elucidating discussion, although I don't think asking it to rate arguments is very valuable. The additional details of having subfield-specific auditors that are opt-in do lessen my objections significantly. Of course, the issue of what counts as a subfield is kinda thorny. It would make most sense for, as Claude suggests, journals to have an "auditor verified" badge, but then maybe you're giving too much power over content to the journals, which usually stick to accept/reject decisions (and even that can get quite political).

Coming back to your original statement, ultimately I just don't buy that any of this can lead to "incredibly low rates of fraud/bias". If someone wants to do fraud or bias, they will just game the tools, or submit to journals with weak/nonexistent auditors. Perhaps the black box nature of AI might even make it easier to hide this kind of thing.

Next: there are large areas of science where a tool telling you the best techniques to use will never be particularly useful. On the one hand there is research like mine, where it's so frontier that the "best practices" to put into such an auditor don't exist yet. On the other, you have statistics stuff that is so well known that there already exist software tools that implement the best practices: you just have to load up a well-documented R package. What does an AI auditor add to this?

If I was tasked with reducing bias and fraud, I would mainly push for data transparency requirements in journal publications, and for beefing up the incentives for careful peer review, which is currently unpaid and unrewarding labour. Perhaps AI tools could be useful in parts of that process, but I don't see it as anywhere near as important as those other two things.

This context is useful, thanks.

Looking back, I think this part of my first comment was poorly worded:
> I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI.

I meant 
> I imagine that scientists will [soon have the ability to] [be unusually transparent and provide incredibly low rates of fraud/bias], using AI.

So it's not that this will lead to low rates of fraud/bias, but that AI will help enable that for scientists willing to go along with it - but at the same time... (read more)

If antinatal advocacy were effective, wouldn't it make sense to pursue it on animal welfare grounds? Aren't most new humans extremely net negative?

I have a 3YO so hold fire!

  • Most new humans will likely consume hundreds (thousands?) of factory-farmed animals over their lifetime, creating a substantial negative impact that might outweigh the positive contributions of that human life (rough arithmetic sketched below)
  • Probably of far less consequence, the environmental footprint of each new human also indirectly harms wild animals through habitat destruction, pollution, and climate change (TBH I am being very speculative on this point).
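
A back-of-the-envelope version of the first bullet's arithmetic. Every input is an adjustable assumption for illustration, not a claim:

```python
# Back-of-the-envelope sketch for the first bullet. Every number here is an
# adjustable assumption, not a claim; plug in your own estimates.

animals_eaten_per_year = 30   # assumed land animals consumed per person per year
share_factory_farmed = 0.9    # assumed share of those raised in factory farms
life_expectancy_years = 80    # assumed lifespan of the new person

lifetime_factory_farmed = (
    animals_eaten_per_year * share_factory_farmed * life_expectancy_years
)
print(f"Rough lifetime total: ~{lifetime_factory_farmed:,.0f} factory-farmed animals")
# With these assumptions: ~2,160, i.e. "thousands" rather than "hundreds".
```

Swapping in different assumptions changes the total, but the hundreds-to-thousands range the bullet suggests is hard to avoid under most of them.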
3
David Mathers🔸
Some people are going to say that destroying nature is a positive impact of new humans, because they think wild animals have net negative lives. 

ooft, good point. 

As AI improves, there's a window for people to get involved and make changes regarding AI alignment and policy.

The window arguably starts small, then widens as it becomes clearer what to do.

But at some point, as we get too close to TAI, I expect that the window narrows. The key decisions get made by a smaller and smaller group of people, and these people have less ability to get help from others, given the quickening pace of things.

For example, at T minus 1 month, there might ultimately be a group of 10 people with key decision-making authority on the most power... (read more)

3
Peter
Hmm, maybe it could still be good to try things in case timelines are a bit longer or an unexpected opportunity arises? For example, what if you thought it was 2 years but it's actually 3-5?

I wasn't trying to make the argument that it would definitely be clear when this window closes. I'm very unsure of this. I also expect that different people have different beliefs, and that it makes sense for them to then take corresponding actions. 

Mini EA Forum Update

We've updated the user menu in the site header! 🎉 I'm really excited, since I think it looks way better and is much easier to use.

We've pulled out all the "New ___" items to a submenu, except for "New question" which you can still do from the "New post" page (it's still a tab there, as is linkpost). And you can see your quick takes via your profile page. See more discussion in the relevant PR.

Let us know what you think! 😊

Bonus: we've also added Bluesky to the list of profile links, feel free to add yours!

I love having the profile button at the top; I currently find not having that a bit disorientating

If we could have LLM agents that could inspect other software applications (including LLM agents) and make strong claims about them, that could open up a bunch of neat possibilities.

  • There could be assurances that apps won't share/store information.
  • There could be assurances that apps won't be controlled by any actor.
  • There could be assurances that apps can't be changed in certain ways (eventually).

I assume that all of this should provide most of the benefits people ascribe to blockchains, but without the costs of being on a blockchain. (A rough sketch of what such an assurance could look like follows below.)

Some neat opt... (read more)
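
A minimal sketch of what a blockchain-free assurance could look like under these assumptions: an auditor (eventually an LLM agent) inspects a specific version of an app and signs claims tied to the code's hash. The claim wording, auditor name, and HMAC "signature" are placeholders for illustration.

```python
# Minimal sketch of a blockchain-free attestation: an auditor (eventually an
# LLM agent) inspects a specific version of an app and signs claims tied to
# the code's hash. The claims and the signing key here are placeholders.

import hashlib, hmac, json

AUDITOR_SECRET = b"placeholder-signing-key"  # stand-in for a real signature scheme

def attest(app_source: bytes, claims: list, auditor: str) -> dict:
    attestation = {
        "auditor": auditor,
        "code_sha256": hashlib.sha256(app_source).hexdigest(),
        "claims": claims,  # e.g. "does not store user messages"
    }
    payload = json.dumps(attestation, sort_keys=True).encode()
    attestation["signature"] = hmac.new(AUDITOR_SECRET, payload, "sha256").hexdigest()
    return attestation

def verify(app_source: bytes, attestation: dict) -> bool:
    # A real verifier would also check the auditor's signature; here we just
    # confirm the claims refer to this exact version of the code.
    return hashlib.sha256(app_source).hexdigest() == attestation["code_sha256"]

source = b"...app source code..."
att = attest(source, ["does not share or store user data"], auditor="audit-agent-v0")
print(verify(source, att), verify(source + b"patch", att))  # True False
```

The hard part, of course, is getting auditors whose claims actually deserve trust; the data structure itself is the easy bit.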

Reading the Emergent Misalignment paper and comments on the associated Twitter thread has helped me clarify the distinction[1] between what companies call "aligned" vs "jailbroken" models. 

"Aligned" in the sense that AI companies like DeepMind, Anthropic and OpenAI mean it = aligned to the purposes of the AI company that made the model. Or as Eliezer puts it, "corporate alignment." For example, a user may want the model to help edit racist text or the press release of an asteroid impact startup but this may go against the desired morals and/or co... (read more)

Showing 3 of 5 replies
2
titotal
Thank you for laying that out, that is elucidatory. And behind all this I guess is the belief that if we don't succeed in "technical alignment", the default is that the AI will be "aligned" to an alien goal, the pursuit of which will involve humanity's disempowerment or destruction? If this is the belief, I could see why you would find technical alignment superior.

I, personally, don't buy that this will be the default: I think the default will be some shitty approximation of the goals of the corporation that made it, localised mostly to the scenarios it was trained in. From the point of view of someone like me, technical alignment actually sounds dangerous to pursue: it would allow someone to imbue an AI with world domination plans and potentially actually succeed.
3
Jonas Hallgren
FWIW, I find that if you analyze places where we've successfully aligned things in the past (social systems or biology etc.), the 1st and 2nd types of alignment really don't break down in that way. After doing Agent Foundations for a while I'm just really against the alignment frame, and I'm personally hoping that more research in this direction will happen so that we get more evidence that other types of solutions are needed (e.g. alignment of complex systems, such as has happened in biology and social systems in the past).

That sounds like Cooperative AI: https://www.cooperativeai.com/post/new-report-multi-agent-risks-from-advanced-ai

In addition to wreaking havoc with USAID, the rule of law, whatever little had been started in Washington about AI safety, etc., the US government has, as you all know, decided to go after trans people. I'm neither trans nor an American, but I think it's really not nice of them to do that, and I'd like to do something about it, if I can.

To some extent, of course, it's the inner deontologist within me speaking here: trans people are relatively few, arguably in less immediate danger than African children dying of AIDS, and the main reason why I feel an urge ... (read more)

Thinking about the idea of an "Evaluation Consent Policy" for charitable projects. 

For example, for a certain charitable project I produce, I'd explicitly consent to allow anyone online, including friends and enemies, to candidly review it to their heart's content. They're free to use methods like LLMs to do this.

Such a policy can give limited consent. For example:

  • You can't break laws when doing this evaluation
  • You can't lie/cheat/steal to get information for this evaluation
  • Consent is only provided for under 3 years
  • Consent is only provided starting in
... (read more)

A quick announcement that Magnify mentee applications are now open!

Magnify mentee applications are currently open! 

We would love to hear from you if you are a woman, non-binary person, or trans person of any gender who is enthusiastic about pursuing a high-impact career using evidence-based approaches. Please apply here by the 18th of March.

​Past mentees have been most successful when they have a clear sense of what they would like to achieve through the 6-month mentorship program. We look to match pairings based on the needs and availabil... (read more)

Does anyone want to develop a version of Talos/Tarbell/AIM/Horizon but for comms roles? 

Last year, we at EA Netherlands spent considerable time developing such a programme idea in collaboration with someone with significant experience in science and technology communications. Our funding application wasn't successful but I think the idea still has legs, especially given the results of the MCF talent needs survey. If you're interested, let me know and perhaps we can hand over what we developed and you can give it another go. 

4
Patrick Gruban 🔸
The fellowships you mention want to have impact-focused people working in mainly non-impact-focused organizations or starting new ones. The MCF question was about people working within the movement. Communications is also a broad field, with a strategist having different skill requirements than someone working in PR or in digital marketing. So, I'm not sure if the comparison works.

That said, I could see a version of this where experienced people from different communications backgrounds get introduced to EA ideas to be better prepared for applying to organizations within the field. At Successif, we have some experienced communication people in our advising program whom we help to navigate the AI risk field and get introduced to ideas and organizations. We're also planning an advocacy fellowship based on our existing media course. Perhaps something like this could also be useful in the EA space. I'm happy to share our experiences if anyone wants to work on this.

That’s a very good point; perhaps the fellowships run by the School for Moral Ambition are a better analogy.

The USDA secretary released a strategy yesterday on lowering egg prices. It was explained originally in a WSJ opinion piece (paywalled) and is summarized here without the paywall.

Five points to the strategy:

  • $500 million for a biosecurity program to limit transmission of avian flu
  • $400 million to farmers to recover after an outbreak
  • $100 million for vaccines
  • Look to ease regulations, especially overriding California Proposition 12 that banned the sale of eggs from caged hens.
  • Look to allow temporary imports of eggs

Key concerns:

  • Enacting a key goal of the EATS Act to overri
... (read more)

I just created a new Discord server for generated AI safety reports (i.e. using Deep Research or other tools). Would be excited to see you join (P.S. OpenAI now provides users on the Plus plan with 10 Deep Research queries per month).

https://discord.gg/bSR2hRhA
