Bio

“In the day I would be reminded of those men and women,
Brave, setting up signals across vast distances,
Considering a nameless way of living, of almost unimagined values.”

How others can help me

I would greatly appreciate anonymous feedback, or just feedback in general. Doesn't have to be anonymous.

Comments

Answer by Emrik3

Principles are great!  I call them "stone-tips".  My latest one is:

Look out for wumps and woozles!

It's one of my favorites. ^^  It basically very-sorta translates to bikeshedding (idionym: "margin-fuzzing"), procrastination paradox (idionym: "marginal-choice trap" + attention selection history + LDT), and information cascades / short-circuits / double-counting of evidence…  but a lot gets lost in translation.  Especially the cuteness.

The stone-tip closest to my heart, however, is:

I wanna help others, but like a lot for real!

I think EA is basically sorta that… but a lot gets confusing in implementation.

I first learned this lesson in my youth when, after climbing to the top of a leaderboard in a puzzle game I'd invested >2k hours into, I was surpassed so hard by my nemesis that I had to reflect on what I was doing.  Thing is, they didn't just surpass me and everybody else, but instead continued to break their own records several times over.

Slightly embarrassed by having congratulated myself for my merely-best performance, I had to ask "how does one become like that?"

My problem was that I'd always just been trying to get better than the people around me, whereas their target was the inanimate structure of the problem itself.  When I broke a record, I said "finally!" and considered myself complete.  But when they did the same, they said "cool!", and then kept going.  The only way to defeat them would be to not try to defeat them, and instead to fight the perceived limits of the game itself.

To some extent, I am what I am today, because I at one point aspired to be better than Aisi.

If evolutionary biology metaphors for social epistemology are your cup of tea, you may find this discussion I had with ChatGPT interesting. 🍵

(Also, sorry for not optimizing this; but I rarely find time to write anything publishable, so I thought just sharing as-is was better than not sharing at all. I recommend the footnotes btw!)

Glossary/metaphors

Background

Once upon a time, the common ancestor of the palm trees Howea forsteriana and Howea belmoreana on Lord Howe Island would pollinate each other more or less uniformly during each flowering cycle. This was "panmictic" because everybody was equally likely to mix with everybody else.

Then there came a day when the counterfactual descendants had had enough. Due to varying soil profiles on the island, they all had to compromise between fitness for each soil type—or purely specialize in one and accept the loss of all seeds which landed on the wrong soil. "This seems inefficient," one of them observed. A few of them nodded in agreement and conspired to gradually desynchronize their flowering intervals from their conspecifics, so that they would primarily pollinate each other rather than having to uniformly mix with everybody. They had created a cline.

And a cline, once established, permits the gene pools of the assortatively-pollinating palms to further specialize toward different mesa-niches within their original meta-niche. Given that a crossbreed between palms adapted for different soil types is going to be less adaptive for either niche,[1] you have a positive feedback cycle where they increasingly desynchronize (to minimize crossbreeding) and increasingly specialize. Solve for the general equilibrium and you get sympatric speciation.[2]

Notice that their freedom to specialize toward their respective mesa-niches is proportional to their reproductive isolation (or inversely proportional to the gene flow between them). The more panmictic they are, the more selection-pressure there is on them to retain 1) genetic performance across the population-weighted distribution of all the mesa-niches in the environment, and 2) cross-compatibility with the entire population (since you can't choose your mates if you're a wind-pollinating palm tree).[3]

From evo bio to socioepistemology

I love this as a metaphor for social epistemology, and the potential detrimental effects of "panmictic communication". Sorta related to the Zollman effect, but more general. If you have an epistemic community that is trying to grow knowledge about a range of different "epistemic niches", then widespread pollination (communication) is obviously good because it protects against e.g. inbreeding depression of local subgroups (e.g. echo chambers, groupthink, etc.), and because researchers can coordinate to avoid redundant work, and because ideas tend to inspire other ideas; but it can also be detrimental because researchers who try to keep up with the ideas and technical jargon being developed across the community (especially related to everything that becomes a "hot topic") will have less time and relative curiosity to specialize in their focus area ("outbreeding depression").

A particularly good example of this is the effective altruism community. Given that they aspire to prioritize between all the world's problems, and due to the very high-dimensional search space generalized altruism implies, and due to how tight-knit the community's discussion fora are (the EA forum, LessWrong, EAGs, etc.), they tend to learn an extremely wide range of topics. I think this is awesome, and usually produces better results than narrow academic fields, but nonetheless there's a tradeoff here.

The rather untargeted gene-flow implied by wind-pollination is a good match for the mostly-online meme-flow of the EA community. You might think that EAs will adequately speciate and evolve toward subniches due to the intractability of keeping up with everything, and indeed there are many subcommunities that branch into different focus areas. But if you take cognitive biases into account, and the constant desire people have to be *relevant* to the largest audience they can find (preferential attachment wrt hot topics), plus fear-of-missing-out, and fear of being "caught unaware" of some newly-developed jargon (causing people to spend time learning everything that risks being mentioned in live conversations[4]), it's likely that they could benefit from smarter and more fractal ways to specialize their niches. Part of that may involve more "horizontally segmented" communication.

Tagging @Holly_Elmore because evobio metaphors are definitely your cup of tea, and a lot of this is inspired by stuff I first learned from you. Thanks! : )

  1. ^

    Think of it like... if you're programming something based on the assumption that it will run on Linux xor Windows, it's gonna be much easier to reach a given level of quality compared to if you require it to be cross-compatible.

  2. ^

    Sympatric speciation is rare because the pressure to be compatible with your conspecifics is usually quite high (Allee effects, network effects). But it is still possible once selection-pressures from "disruptive selection" exceed the "heritage threshold" relative to each mesa-niche.[5]

  3. ^

    This homogenization of evolutionary selection-pressures is akin to markets converging to an equilibrium price. It too depends on panmixia of customers and sellers for a given product. If customers are able to buy from anybody anywhere, differential pricing (i.e. trying to sell your product at above or below equilibrium price for a subgroup of customers) becomes impossible.

  4. ^

    This is also known (by me and at least one other person...) as the "jabber loop":

    This highlights the utter absurdity of being afraid of having our ignorance exposed, and going 'round judging each other for what we don't know. If we all worry overmuch about what we don't know, we'll all get stuck reading and talking about stuff in the Jabber loop. The more of our collective time we give to the Jabber loop, the more unusual it will be to be ignorant of what's in there, which means the social punishments for Jabber-ignorance will get even harsher.

  5. ^

    To take this up a notch: sympatric speciation occurs when a cline in the population extends across a separatrix (red) in the dynamic landscape, and the attractors (blue) on each side overpower the cohering forces from Allee effects (orange). This is the doodle I drew on a post-it note to illustrate that pattern in a different context:

    I dub him the mascot of bullshit-math. Isn't he pretty?

(Publishing comment-draft that's been sitting here two years, since I thought it was good (even if super-unfinished…), and I may wish to link to it in future discussions. As always, feel free to not-engage and just be awesome. Also feel free to not be awesome, since awesomeness can only be achieved by choice (thus, awesomeness may be proportional to how free you feel to not be it).)

Yes! This relates to what I call costs of compromise.

Costs of compromise

As you allude to by the exponential decay of the green dots in your last graph, there are exponential costs to compromising what you are optimizing for in order to appeal to a wider variety of interests. On the flip-side, how usefwl to a subgroup you can expect to be is exponentially proportional to how purely you optimize for that particular subset of people (depending on how independent the optimization criteria are). This strategy is also known as "horizontal segmentation".[1]

The benefits of segmentation ought to be compared against what is plausibly an exponential decay in the number of people who fit a marginally smaller subset of optimization criteria. So it's not obvious in general whether you should on the margin try to aim more purely for a subset, or aim for broader appeal.
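A toy model may make the trade-off concrete (all numbers here are made up for illustration; `fit_rate` and `gain` are hypothetical parameters, not from the original text): each extra criterion you optimize for multiplies per-person usefwlness by some gain, but exponentially shrinks the audience that fits all the criteria.

```python
def audience_size(n_criteria, base_pop=10_000, fit_rate=0.3):
    """Each extra optimization criterion filters the audience:
    only a fraction of people fit it, hence exponential decay."""
    return base_pop * fit_rate ** n_criteria

def value_per_person(n_criteria, gain=2.0):
    """Each criterion you purely optimize for multiplies how usefwl
    you are to the people who remain."""
    return gain ** n_criteria

def total_value(n_criteria, base_pop=10_000, fit_rate=0.3, gain=2.0):
    """Total usefwlness = (people reached) x (usefwlness per person)."""
    return (audience_size(n_criteria, base_pop, fit_rate)
            * value_per_person(n_criteria, gain))

# Whether narrowing helps hinges on whether gain * fit_rate > 1:
# with gain=2.0, fit_rate=0.3 (product 0.6 < 1), broader appeal wins;
# with gain=4.0 (product 1.2 > 1), purer specialization wins.
```

In this sketch the "not obvious in general" point falls out directly: the margin favors specializing exactly when the per-person gain outweighs the audience shrinkage.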

Specialization vs generalization

This relates to what I think is one of the main mysteries/trade-offs in optimization: specialization vs generalization. It explains why scaling your company can make it more efficient (economies of scale),[2] why the brain is modular,[3] and how Howea palm trees can speciate without the aid of geographic isolation (aka sympatric speciation constrained by genetic swamping) by optimising their gene pools for differentially-acidic patches of soil and evolving separate flowering intervals in order to avoid pollinating each other.[4]

Conjunctive search

When you search for a single thing that fits two or more criteria, that's called "conjunctive search". In the image, try to find an object that's both [colour: green] and [shape: X].

My claim is that this analogizes to how your brain searches for conjunctive ideas: a vast array of preconscious ideas are selected from a distribution of distractors that score high in either one of the criteria.
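A minimal sketch of why conjunctive search is costly (a hypothetical display, not from any vision library): every distractor passes one of the two feature tests, so no single-feature filter isolates the target, and you end up checking items serially.

```python
import random

def make_display(n_distractors, rng):
    """Build a display where half the distractors are green O's and half are
    red X's, so each distractor matches exactly one criterion; one green X
    (the conjunctive target) is hidden among them."""
    items = [("green", "O") if i % 2 == 0 else ("red", "X")
             for i in range(n_distractors)]
    items.append(("green", "X"))  # the target
    rng.shuffle(items)
    return items

def serial_search_cost(items):
    """Check items one at a time until both criteria match. Expected cost
    grows linearly with display size (serial, unlike single-feature search)."""
    for checks, (colour, shape) in enumerate(items, start=1):
        if colour == "green" and shape == "X":
            return checks
    return len(items)

rng = random.Random(0)
avg_cost = sum(serial_search_cost(make_display(20, rng))
               for _ in range(2_000)) / 2_000
# For a 21-item display the target sits at a uniformly random position,
# so the average cost is about (21 + 1) / 2 = 11 checks.
```

The analogy to idea-search: distractor ideas that score high on only one criterion can't be filtered out cheaply, so candidates that satisfy the conjunction are expensive to surface.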

10d6 vs 1d60

Preamble: When you throw 10 six-sided dice (written as "10d6"), the probability of getting a max roll is much lower than if you were throwing a single 60-sided die ("1d60"). But if we assume the 10 six-sided dice are strongly correlated, that has the effect of squishing the normal distribution toward the uniform distribution, and you're much more likely to roll extreme values.

Moral: Your probability of sampling extreme values from a distribution depends on the number of variables that make it up (i.e. how many factors are convolved over), and the extent to which they are independent. Thus, costs of compromise are much steeper if you're sampling for outliers (a realm which includes most creative thinking and altruistic projects).
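To make the "convolved factors" point concrete, here's a sketch using exact enumeration (no simulation) of how much rarer extreme sums are for 10d6 than for 1d60:

```python
def dice_sum_dist(n_dice, sides):
    """Exact probability distribution of the sum of n independent fair dice,
    built by repeated convolution of the single-die distribution."""
    dist = {0: 1.0}
    for _ in range(n_dice):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, 0.0) + p / sides
        dist = new
    return dist

ten_d6 = dice_sum_dist(10, 6)   # sums 10..60, bell-shaped
one_d60 = dice_sum_dist(1, 60)  # sums 1..60, uniform

# Probability of an extreme outcome (sum >= 55):
tail_10d6 = sum(p for s, p in ten_d6.items() if s >= 55)   # ~5e-5
tail_1d60 = sum(p for s, p in one_d60.items() if s >= 55)  # 6/60 = 0.1

# A perfectly correlated 10d6 behaves like one d6 copied ten times: the sum
# is uniform over {10, 20, ..., 60}, so the max alone has probability 1/6.
```

Independence is what buys the bell shape: convolve more independent factors and the extreme tail collapses, which is exactly why outlier-hunting punishes conjunctive compromise so hard.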

Spaghetti-sauce fallacies 🍝

If you maximally optimize a single spaghetti sauce for profit, there exists a global optimum for some taste, quantity, and price. You might then declare that this is the best you can do, and indeed this is a common fallacy I will promptly give numerous examples of. [TODO…]

But if you instead allow yourself to optimize several different spaghetti sauces, each one tailored to a specific market, you can make much more profit compared to if you have to conjunctively optimize a single thing.

Thus, a spaghetti-sauce fallacy is when somebody asks "how can we optimize thing T more for criteria C?" when they should be asking "how can we chunk/segment T into k cohesive/dimensionally-reduced segments so we can optimize for {C1, ..., Ck} disjunctively?"


People rarely vote based on usefwlness in the first place

As a sidenote: People don't actually vote (/allocate karma) based on what they find usefwl. That's a rare case. Instead, people overwhelmingly vote based on what they (intuitively) expect others will find usefwl. This rapidly turns into a Keynesian Status Contest with many implications. Information about people's underlying preferences (or what they personally find usefwl) is lost as information cascades are amplified by recursive predictions. This explains approximately everything wrong about the social world.

Already in childhood, we learn to praise (and by extension vote) based on what kinds of praise other people will praise us for. This works so well as a general heuristic that it gets internalized and we stop being able to notice it as an underlying motivation for everything we do.

  1. ^

    See e.g. spaghetti sauce.

  2. ^

    Scale allows subunits (e.g. employees) to specialize at subtasks.

  3. ^

    Every time a subunit of the brain has to pull double-duty with respect to what it adapts to, the optimization criteria compete for its adaptation—this is also known as "pleiotropy" in evobio, and "polytely" in… some ppl called it that and it's a good word.

  4. ^

    This palm-tree example (and others) is partially optimized/goodharted for seeming impressive, but I leave it in because it also happens to be deliciously interesting and possibly entertaining as an example of costs of compromise. I want to emphasize how ubiquitous this trade-off is.

Oh, this is excellent! I do a version of this, but I haven't paid enough attention to what I do to give it a name. "Blurting" is perfect.

I try to make sure to always notice my immediate reaction to something, so I can more reliably tell what my more sophisticated reasoning modules transform that reaction into. Almost all the search-process imbalances (eg. filtered recollections, motivated stopping, etc.) come into play during the sophistication, so it's inherently risky. But refusing to reason past the blurt is equally inadvisable.

This is interesting from a predictive-processing perspective.[1] The first thing I do when I hear someone I respect tell me their opinion, is to compare that statement to my prior mental model of the world. That's the fast check. If it conflicts, I aspire to mentally blurt out that reaction to myself.

It takes longer to generate an alternative mental model (ie. sophistication) that is able to predict the world described by the other person's statement, and there's a lot more room for bias to enter via the mental equivalent of multiple comparisons. Thus, if I'm overly prone to conform, that bias will show itself after I've already blurted out "huh!" and made note of my prior. The blurt helps me avoid the failure mode of conforming and feeling like that's what I believed all along.

Blurting is a faster and more usefwl variation on writing down your predictions in advance.

  1. ^

    Speculation. I'm not very familiar with predictive processing, but the claim seems plausible to me on alternative models as well.

I disagree a little bit with the credibility of some of the examples, and want to double-click others. But regardless, I think this is a very productive train of thought and thank you for writing it up. Interesting!

And btw, if you feel like a topic of investigation "might not fit into the EA genre", and yet you feel like it could be important based on first-principles reasoning, my guess is that that's a very important lead to pursue. Reluctance to step outside the genre, and thinking that the goal is to "do EA-like things", is exactly the kind of dynamic that's likely to lead the whole community to overlook something important.

I'm not sure. I used to call it "technical" and "testimonial evidence" before I encountered "gears-level" on LW. While evidence is just evidence and Bayesian updating stays the same, it's usefwl to distinguish between these two categories because if you have a high-trust community that frequently updates on each others' opinions, you risk information cascades and double-counting of evidence.

Information cascades develop consistently in a laboratory situation [for naively rational reasons, in which other incentives to go along with the crowd are minimized]. Some decision sequences result in reverse cascades, where initial misrepresentative signals start a chain of incorrect [but naively rational] decisions that is not broken by more representative signals received later. - (Anderson & Holt, 1998)
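A sketch of how such a cascade locks in, using a naive-observer variant of the Anderson & Holt urn setup (the parameters and the `run_cascade` helper are my own illustration, not from the paper): each agent gets a private signal that's right with probability 2/3, sees everyone's public guesses, and — this is the double-counting — treats each guess as if it were an independent signal.

```python
import math
import random

def run_cascade(n_agents=50, p_signal=2/3, seed=0):
    """Each agent combines the public record with one private signal and
    guesses the state with higher posterior odds. Once the public record
    favours one side by two net guesses, it outweighs any single private
    signal, so every later agent herds and private information stops
    entering the record."""
    rng = random.Random(seed)
    w = math.log(p_signal / (1 - p_signal))  # evidence weight of one signal
    public = 0.0   # accumulated public log-odds in favour of the true state
    guesses = []
    for _ in range(n_agents):
        signal = +1 if rng.random() < p_signal else -1  # +1 = correct signal
        total = public + signal * w
        guess = signal if total == 0 else (+1 if total > 0 else -1)
        guesses.append(guess)
        # Naive observers double-count: the guess is logged as if it were an
        # independent signal, even when it merely echoed the crowd.
        public += guess * w
    return guesses
```

Once the net record leads by two, no single contrary signal can flip it, so the herd is permanent — including the "reverse cascade" case where two early wrong signals lock everyone onto the wrong answer.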

Additionally, if your model of a thing has "gears", then there are multiple things about the physical world that, if you saw them change, would change your expectations about the thing.

Let's say you're talking to someone you think is smarter than you. You start out with different estimates and different models that produce those estimates. From Ben Pace's a Sketch of Good Communication:

Here you can see both blue and red have gears. And since you think their estimate is likely to be much better than yours, and you want to get some of that amazing decision-guiding power, you throw out your model and adopt their estimate (cuz you don't understand or don't have all the parts of their model):

Here, you have "destructively deferred" in order to arrive at your interlocutor's probability estimate. Basically zombified. You no longer have any gears, even if the accuracy of your estimate has potentially increased a little.

An alternative is to try to hold your all-things-considered estimates separate from your independent impressions (that you get from your models). But this is often hard and confusing, and they bleed into each other over time.

"When someone gives you gears-level evidence, and you update on their opinion because of that, that still constitutes deferring."

This was badly written. I just mean that updating on their opinion, as opposed to just taking the patterns and trying to adjust for the fact that you received them through filters, is still updating on testimony. I'm saying nothing special here, just that you might be tricking yourself into deferring (instead of impartially evaluating patterns) by letting the gearsy arguments woozle you.

I wrote a bit about how testimonial evidence can be "filtered" in the paradox of expert opinion:

If you want to know whether string theory is true and you're not able to evaluate the technical arguments yourself, who do you go to for advice? Well, seems obvious. Ask the experts. They're likely the most informed on the issue. Unfortunately, they've also been heavily selected for belief in the hypothesis. It's unlikely they'd bother becoming string theorists in the first place unless they believed in it.

If you want to know whether God exists, who do you ask? Philosophers of religion agree: 70% accept or lean towards theism, compared to 16% of all PhilPapers Survey respondents.

If you want to know whether to take transformative AI seriously, what now?

Some selected comments or posts I've written

  • Taxonomy of cheats, multiplex case analysis, worst-case alignment
  • "You never make decisions, you only ever decide between strategies"
  • My take on deference
  • Dumb
  • Quick reasons for bubbliness
  • Against blind updates
  • The Expert's Paradox, and the Funder's Paradox
  • Isthmus patterns
  • Jabber loop
  • Paradox of Expert Opinion
  • Rampant obvious errors
  • Arbital - Absorbing barrier
  • "Decoy prestige"
  • "prestige gradient"
  • Braindump and recommendations on coordination and institutional decision-making
  • Social epistemology braindump (I no longer endorse most of this, but it has patterns)

Other posts I like

  • The Goddess of Everything Else - Scott Alexander
    • “The Goddess of Cancer created you; once you were hers, but no longer. Throughout the long years I was picking away at her power. Through long generations of suffering I chiseled and chiseled. Now finally nothing is left of the nature with which she imbued you. She never again will hold sway over you or your loved ones. I am the Goddess of Everything Else and my powers are devious and subtle. I won you by pieces and hence you will all be my children. You are no longer driven to multiply conquer and kill by your nature. Go forth and do everything else, till the end of all ages.”
  • A Forum post can be short - Lizka
    • Succinctly demonstrates how often people goodhart on length or other irrelevant criteria like effort moralisation. A culture for appreciating posts for the practical value they add to you specifically, would incentivise writers to pay more attention to whether they are optimising for expected usefwlness or just signalling.
  • Changing the world through slack & hobbies - Steven Byrnes
    • Unsurprisingly, there's a theme to what kind of posts I like. Posts that are about de-Goodharting ourselves.
  • Hero Licensing - Eliezer Yudkowsky
    • Stop apologising, just do the thing. People might ridicule you for believing in yourself, but just do the thing.
  • A Sketch of Good Communication - Ben Pace
    • Highlights the danger of deferring if you're trying to be an Explorer in an epistemic community.
  • Holding a Program in One's Head - Paul Graham
    • "A good programmer working intensively on his own code can hold it in his mind the way a mathematician holds a problem he's working on. Mathematicians don't answer questions by working them out on paper the way schoolchildren are taught to. They do more in their heads: they try to understand a problem space well enough that they can walk around it the way you can walk around the memory of the house you grew up in. At its best programming is the same. You hold the whole program in your head, and you can manipulate it at will.

      That's particularly valuable at the start of a project, because initially the most important thing is to be able to change what you're doing. Not just to solve the problem in a different way, but to change the problem you're solving."

My latest tragic belief is that in order to improve my ability to think (so as to help others more competently) I ought to gradually isolate myself from all sources of misaligned social motivation.  And that nearly all my social motivation is misaligned relative to the motivations I can (learn to) generate within myself.  So I aim to extinguish all communication before the year ends (with exception for Maria).

I'm posting this comment in order to redirect some of this social motivation into the project of isolation itself.  Well, that, plus I notice that part of my motivation comes from wanting to realify an interesting narrative about myself; and partly in order to publicify an excuse for why I've ceased (and aim to cease more) writing/communicating.
