This work has come out of my Undergraduate dissertation. I haven't shared or discussed these results much before putting this up.  Message me if you'd like the code :)

Edit: 16th April. After helpful comments, especially from Geoffrey, I now believe this method only identifies shifts in the happiness scale (not stretches). Have edited to make this clearer.

TLDR

  • Life satisfaction (LS) appears flat over time, despite massive economic growth — the “Easterlin Paradox.”
  • Some argue that happiness is rising, but we’re reporting it more conservatively — a phenomenon called rescaling.
  • I test rescaling using long-run German panel data, looking at whether the association between reported happiness and three “get-me-out-of-here” actions (divorce, job resignation, and hospitalisation) changes over time.
  • If people are getting happier (and rescaling is occurring), the probability of these actions should become less strongly linked to reported LS, but it doesn't.
  • I find little evidence of rescaling. We should probably take self-reported happiness scores at face value.

1. Background: The Happiness Paradox

Humans today live longer, richer, and healthier lives than at any point in history, yet we seem no happier for it. Self-reported life satisfaction (LS), usually measured on a 0–10 scale, has remained remarkably flat over the last few decades, even in countries like Germany, the UK, China, and India that have experienced huge GDP growth. As Michael Plant has written, the empirical evidence for this is fairly strong.

This is the Easterlin Paradox. It is a paradox because, at any given point in time, income is strongly linked to happiness, as I've written on the forum before. This should feel uncomfortable for anyone who believes that economic progress should make lives better, including me and others in the EA/Progress Studies worlds.

Assuming we agree on the empirical facts (i.e., self-reported happiness isn't increasing), there are a few potential explanations:

  • Hedonic adaptation: as life gets better, our expectations rise just as fast — so we don’t feel happier.
  • Social comparison: we care about relative, not absolute, gains.
  • Rescaling: maybe happiness is increasing, but the 0–10 reporting scale has shifted or stretched.

It’s that third one — rescaling — that I try to empirically test here.

2. What is “Rescaling”?

The rescaling hypothesis suggests that the reporting function (how we map our true wellbeing onto a 0–10 scale) changes over time. You can think of the reporting function like a ruler, converting underlying happiness to reported Life Satisfaction. Suppose two people, one from 1990 and another from 2020, both report their Life Satisfaction (LS) = 8/10. The second person could be happier if their reporting function (i.e., their ruler) has either shifted or stretched upwards.

I believe the method proposed below only identifies shifting in the happiness ruler: that is, when the happiness levels corresponding to the 'best' and 'worst' possible lives both shift upwards. This process is illustrated below.

From: https://link.springer.com/article/10.1007/s10902-021-00460-8

If rescaling is happening, then all the flat LS lines we see in national statistics might be wrong. It could mean that people are getting happier — we’re just measuring it badly.

There’s very little work testing this directly: probably fewer than 10 papers, and only 1 relating to the Easterlin Paradox (Prati & Senik, 2025). They use a memory-based approach: ask people how happy they used to be and compare that to what they said at the time. If I said “7/10” in 2010, but remember it as “6/10” in 2015, that suggests my internal scale has changed. Using this method, Prati & Senik estimate that happiness in the U.S. might be underreported by 80–140%. (!)

This work is interesting, but it relies entirely on people's memories. We might employ motivated reasoning to justify to ourselves that life is getting better. The authors attempt to correct for this.

3. My Approach: Use Actions, Not Memories

If people are getting happier over time — but reporting it on a stricter, shifted scale — then the link between how happy someone says they are, and what they do when they're unhappy, should weaken over time. 

In other words: if life satisfaction is increasing, but the reporting scale is [edit] shifting, then big life decisions — like leaving a job or ending a relationship — should become less predictable from reported happiness.

This assumes that the relationship between underlying happiness and 'exit actions' is: 1) negative and 2) convex (i.e., it gets flatter at higher happiness levels). For example, as people's happiness with their family increases towards infinity, the probability that they get divorced falls towards 0.

Indeed, Kaiser & Oswald (2022) found that reported domain satisfaction was a strong, convex predictor of these GMEOH/exit actions:

Kaiser & Oswald (2022)

If the scale has shifted upwards, we would effectively be sampling datapoints from further right on these graphs – where the relationship is flatter. This would result in a fall in the predictive power of life satisfaction (a flatter/weaker relationship).
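To make the attenuation argument concrete, here is a minimal sketch. It assumes a hypothetical exit-probability curve, P(h) = 0.3·exp(−0.3·h), chosen only because it is negative and convex. The functional form, the parameter values, the size of the shift, and the names `exit_prob` and `reported_slope` are all illustrative assumptions, not estimates from the data.

```python
import math

def exit_prob(h):
    """Hypothetical exit probability: decreasing and convex in latent happiness h."""
    return 0.3 * math.exp(-0.3 * h)

def reported_slope(shift, lo=2, hi=8):
    """Average change in exit probability per reported LS point between lo and hi,
    when latent happiness is h = reported LS + shift."""
    return (exit_prob(hi + shift) - exit_prob(lo + shift)) / (hi - lo)

baseline = reported_slope(shift=0)  # reporting scale before any shift
shifted = reported_slope(shift=3)   # same reports now map to higher latent happiness

print(f"slope without shift: {baseline:.4f}")
print(f"slope with shift:    {shifted:.4f}")
```

Under these assumptions both slopes are negative, but the shifted one is smaller in absolute terms: the same reported scores now sample a flatter part of the curve.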

I focus on three major exit or "get-me-out-of-here" (GMEOH) decisions:

  • GMEOH_Marriage → Divorce
  • GMEOH_Job → Voluntary resignation, transfer, or self-employment exit
  • GMEOH_Health → Hospitalisation

and two measures of happiness, in the year preceding this event:

  • Domain satisfaction → Separate questions on satisfaction with family, work, and health (0–10 scale)
  • Overall life satisfaction → “All things considered, how satisfied are you with your life?”

Method & Data

Using the German SOEP panel (1990–2022) — the longest-running panel dataset on happiness — I analyse:

  • ~700,000 person-year observations
  • ~6,500 divorces
  • ~24,000 job exits
  • ~83,000 hospitalisations

I run logistic regressions, predicting each GMEOH event using lagged happiness (life satisfaction in the previous period), and interact this with decade dummies to test whether the LS–action link weakens over time.

If the happiness scale is shifting upwards, then we’d expect the effect of LS on these actions to attenuate across decades.[1]
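A minimal sketch of this kind of specification, under loud assumptions: it uses two periods (early and late) instead of decade dummies, a hand-rolled Newton-Raphson fit so that it needs no external libraries, and simulated exit rates from a hypothetical process in which the LS slope weakens from −0.4 to −0.2. None of these numbers come from the SOEP, and `solve`, `fit_logit`, and `true_rate` are illustrative names; in practice a statistics package would do the fitting.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_logit(X, y, n_iter=25):
    """Newton-Raphson logistic regression; y can be a 0/1 outcome or an exit rate."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for a in range(k):
                grad[a] += xi[a] * (yi - p)
                for c in range(k):
                    hess[a][c] += xi[a] * xi[c] * p * (1.0 - p)
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def true_rate(ls_lag, late):
    """Hypothetical data-generating process: LS slope is -0.4 early, -0.2 late."""
    z = -1.0 - 0.4 * ls_lag + 0.2 * ls_lag * late
    return 1.0 / (1.0 + math.exp(-z))

# Design columns: intercept, lagged LS, late-period dummy, lagged LS x late period.
X, y = [], []
for late in (0, 1):
    for ls_lag in range(11):
        X.append([1.0, float(ls_lag), float(late), float(ls_lag * late)])
        y.append(true_rate(ls_lag, late))

beta = fit_logit(X, y)
print("LS slope (early period):", round(beta[1], 3))
print("LS x late interaction:  ", round(beta[3], 3))
```

The coefficient on the interaction column is the quantity of interest: a positive value means the negative LS slope has moved towards zero in the later period, i.e. attenuation.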

4. Results

Exit Actions & Happiness with a Life Domain

  • The relationships between life satisfaction in a particular domain – family, health, work – and a 'get-me-out-of-here' action were basically stable over time.
  • No evidence of attenuation/rescaling here.

(I plot the basic models here – I also included individual fixed effects, but this made little difference and is more fiddly to plot.)

Exit Actions & Overall Happiness

I then combined all three exit actions into a single binary variable, which equals 1 if a person took any of the GMEOH actions in a year (got divorced, went to hospital, or quit their job).

  • I regressed this combined variable on lagged overall life satisfaction, including an interaction term for time periods.
  • This increases statistical power, since domain-specific satisfaction isn't always measured.

Surprisingly, the relationship appears to get stronger over time. That’s the opposite of what rescaling predicts!

5. Conclusion

I started this project with a weak prior that maybe the Easterlin Paradox is a result of measurement error: maybe happiness really is rising — we’re just measuring it badly.

But I didn’t find that. Across three exit actions — divorce, job resignation, and hospital visits — the link with self-reported happiness is stable, or getting stronger. That suggests people use the happiness scale consistently over time. In short: rescaling (at least scale shifts) is not driving the Easterlin Paradox — at least not in Germany.

I also think the implications of the Easterlin Paradox are more profound than people realise. Decades of economic progress have made us no (or at least very little) happier! Isn't that insane!

 

More Stuff to do here

There's a bunch of limitations here, and a bunch more work that could be done. 

A few limitations that come to mind: 

  • This identifies shifts, not stretches in the reporting function – I think!
  • Divorce isn't voluntary – it isn't necessarily a 'get-me-out-of-here' action
  • I could have done some non-parametric regressions – i.e., not logit.

Other work

  • This could be replicated on UK and/or Australian data. The Oswald/Kaiser paper also does this. 

[1] The results were robust to smaller time dummies too.

Comments (23)

The phenomenon you describe as "rescaling" is generally known as a (violation of) measurement invariance in psychometrics. It is typically tested by observing whether the measurement model (i.e., the relationship between the unobservable psychological construct and the measured indicators of that construct) differs across groups (a comprehensive evaluation of different approaches is in Millsap, 2011).

I would interpret the tests of measurement invariance you use.... 

If people are getting happier over time — but reporting it on a stretched or stricter scale — then the link between how happy someone says they are, and what they do when they're unhappy, should weaken over time.

In other words: if life satisfaction is increasing, but the reporting scale is stretching, then big life decisions — like leaving a job or ending a relationship — should become less predictable from reported happiness

 

....to actually be measures of "prediction invariance": which holds when a measure has the same regression coefficient with respect to an external criterion across different groups or time.

But as Borsboom (2006) points out, prediction invariance and measurement invariance might actually be in tension with each other under a wide range of situations. Here's a relevant quotation:

In 1997 Millsap published an important paper in Psychological Methods on the relation between prediction invariance and measurement invariance. The paper showed that, under realistic conditions, prediction invariance does not support measurement invariance. In fact, prediction invariance is generally indicative of violations of measurement invariance: if two groups differ in their latent means, and a test has prediction invariance across the levels of the grouping variable, it must have measurement bias with regard to group membership. Conversely, when a test is measurement invariant, it will generally show differences in predictive regression parameters.

This is stretching my knowledge of the topic beyond its bounds, but this issue seems related to the general inconsistency between measurement invariance and selection invariance, which has been explored independently in psychometrics and machine learning (e.g., the chapters on facial recognition and  recidivism in The Alignment Problem). 

Thanks a lot for this. I hadn't actually come across these terms; that's super useful. I'll have to read  both these articles when I get a chance, will report back.

To synthesize a few of the comments on this post -- This comment sounds like a general instance of the issue that @geoffrey points out in another comment: what @Charlie Harrison is describing as a violation of "prediction invariance" may just be a violation of "measurement invariance"; in particular because happiness (the real thing, not the measure) may have a different relationship with GMEOH events over time.

I basically agree with this critique of the results in the post, but want to add that I nonetheless think this is a very cool piece of research and I am excited to see more exploration along these lines!

One idea that I had -- maybe someone has done something like this? -- is to ask people to watch a film or read a novel and rate the life satisfaction of the characters in the story. For instance, they might be asked to answer a question like "How much does Jane Eyre feel satisfied by her life, on a scale of 1-10?". (Note that we aren't asking how much the respondent empathizes with Jane or would enjoy being her, simply how much satisfaction they believe Jane gets from Jane's life.) This might allow us to get a shared baseline for comparison. If people's assessments of Jane's life go up or down over time, (or differ between people) it seems unlikely that this is a result of a violation of "prediction invariance", since Jane Eyre is an unchanging novel with fixed facts about how Jane feels. Instead, it seems like this would indicate a change in measurement: i.e. how people assign numerical scores to particular welfare states.

haha, yes, people have done this! This is called 'vignette-adjustment'. You basically get people to read short stories and rate how happy they think the character is. There are a few potential issues with this method: (1) vignettes aren't included in long-term panel data; (2) people might interpret the character's latent happiness differently based on their own happiness.

Oh, great, thanks so much! I'll check this out.

Anchoring vignettes may also sometimes lack stability within persons. That said, it's par for the course that any one source of evidence for invariance is going to have its strengths and weaknesses. We'll always be looking for convergence across methods rather than a single cure-all. 

Always enjoy your posts, you tend to have fresh takes and clear analyses on topics that feel well-trodden. 

That said, I think I'm mainly confused if the Easterlin paradox is even a thing, and hence whether there's anything to explain. On the one hand there are writeups like Michael Plant's summarising the evidence for it, which you referenced. On the other hand, my introduction to Easterlin's paradox was via Our World in Data's happiness and life satisfaction article, which summarised the evidence for happiness rising over time in most countries here and explained away the Easterlin paradox here as being due to either survey questions changing over time (in Japan's case, chart below) or to inequality making growth not benefit the majority of people (in the US's case).

GDP per capita vs. Life satisfaction across survey questions

Plant's writeup says that 

The reply that Easterlin and O’Connor (2022) make is that Stevenson, Wolfers, and co. are looking over too short a time horizon. They point out that the critique looks at segments of ten years and to really test the paradox requires looking over a longer time period, which is what Easterlin and O'Connor (2022) do themselves. Easterlin and O'Connor (2022) write that they don't really understand why Stevenson and Wolfers are using these short time segments rather than the longer ones.

But the chart for Japan above, which is from Stevenson and Wolfers (2008), spans half a century, not ten years, so Easterlin and O'Connor's objection is irrelevant to the chart. 

This OWID chart (which is also the first one you see on Wikipedia) is the one I always think about when people bring up the Easterlin paradox:

But Plant's writeup references the book Origins of Happiness (Clark et al., 2019) which has this chart instead:

So is the Easterlin paradox really a thing or not? Why do the data seem to contradict each other? Is it all mainly changing survey questions and rising inequality? On priors I highly doubt it's that trivially resolvable, but I don't have a good sense of what's going on.

I wish there was an adversarial collaboration on this between folks who think it's a thing and those who think it isn't.

Hey Mo, thanks so much!

I don't have a particularly strong view on this.

I guess:

First, there are differences in the metrics used – the life satisfaction (0-10) is more granular than the 4 category response questions. 

Additionally, in the plot from OWID, a lot of the data seems quite short-term – e.g., 10 years or so. Easterlin always emphasises that the paradox is across the whole economic cycle, but a country might experience continuous growth in the space of a decade.

My overall view – several happiness economists I've spoken to basically think the Easterlin Paradox is correct (at least, to be specific: self-reported national life satisfaction is flat in the long-run), so I defer to them.

It's worth saying that the fact that most arrows go up on the OWiD chart could just point to two independent trends, one of growth rising almost everywhere and another of happiness rising almost everywhere, for two completely independent reasons. Without cases where negative or zero growth persists for a long time, it's hard to rule this out. 

It could in theory, but OWID's summary of the evidence mostly persuades me otherwise. Again I'm mostly thinking about how the Easterlin paradox would explain this OWID chart: 

I'm guessing Easterlin et al would probably counter that OWID didn't look at a long-enough timeframe (a decade is too short), and I can't immediately see what the timeframe is in this chart, so there's that.

This is a really neat analysis idea.

At the same time, my hunch is that all three of these exit actions have gotten easier to do and more common from 1990 to 2022. I believe divorce has gotten less stigmatized, the job market rewards more hopping around, and (I think) hospitalization has been recommended more.

If that "easier-to-do" effect is large enough, then it'd be compatible with a very wide range of happiness trends (rising/falling/stable + rescaling/no rescaling). Wondering if you have any thoughts on that.

Hi Geoffrey,

Thank you!

It's possible that these 3 exit actions have gotten easier to do, over time. Intuitively, though, this would be pushing in the same direction as rescaling: e.g., if getting a divorce is easier, it takes less unhappiness to push me to do it. This would mean the relationship should (also) get flatter. So, still surprising, that the relationship is constant (or even getting stronger). 

Ah I missed the point about the relationship getting flatter before. Thanks for flagging that.

I think I'm more confused about our disagreement now. Let me give you a toy example to show you how I'm thinking about this. So there's three variables here:

  • latent life satisfaction, which ranges from 0 to infinity
  • reported life satisfaction, which ranges from 0 to 10 and increases with latent life satisfaction
  • probability of divorce, which ranges from 0% to 100% and decreases with latent life satisfaction

And we assume for the sake of contradiction that rescaling is true. One example could be:

  • In t=1, latent life satisfaction = 1 * reported life satisfaction
    • Both are bounded from [0,10]
  • In t=2, latent life satisfaction = 2 * reported life satisfaction.
    • Reported life satisfaction still ranges from [0,10]
    • But latent life satisfaction now ranges from [0,20]

Let's say that's true. Let's also assume people divorce less as they get happier (and let's ignore my earlier 'divorce gets easier' objection). One example could be:

  • In t=1 and t=2, probability of divorce = 0.40 - latent life satisfaction/100. That implies:
    • In t=1, probability of divorce ranges from [0.40, 0.30]
    • In t=2, probability of divorce ranges from [0.40, 0.20]
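A quick numerical check of this toy example, as a sketch. The function names are purely illustrative, and the only inputs are the numbers already stated above (a stretch factor of 1 or 2, and probability of divorce = 0.40 − latent life satisfaction/100).

```python
def divorce_prob(latent_ls):
    """Toy rule from the example: divorce probability falls linearly in latent LS."""
    return 0.40 - latent_ls / 100

def prob_range(stretch):
    """Divorce probability at reported LS = 0 and 10 when latent = stretch * reported."""
    return divorce_prob(stretch * 0), divorce_prob(stretch * 10)

def slope_per_reported_point(stretch):
    """Change in divorce probability per one-point rise in reported LS."""
    at_zero, at_ten = prob_range(stretch)
    return (at_ten - at_zero) / 10

for t, stretch in ((1, 1), (2, 2)):
    at_zero, at_ten = prob_range(stretch)
    print(f"t={t}: P(divorce) from {at_zero:.2f} to {at_ten:.2f}, "
          f"slope {slope_per_reported_point(stretch):.3f} per reported point")
```

With a linear probability rule, doubling the stretch doubles the slope in reported LS, which is the steepening described here.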

And so if I got the logic right, rescaling should accentuate (make steeper) the relationship between probability of divorce and reported life satisfaction. But I think you're claiming rescaling should attenuate (make flatter) the relationship. So it seems like we're differing somewhere. Any idea where?

I think rescaling could make it steeper or flatter, depending on the particular rescaling. Consider that there is nothing that requires the rescaling to be a linear transformation of the original scale (like you've written in your example). A rescaling that compresses the life satisfaction scores that were initially 0-5 into the range 0-3, while leaving the life satisfaction score of 8-10 unaffected will have a different effect on the slope than if we disproportionately compress the top end of life satisfaction scores.

Sorry if I expressed this poorly -- it's quite late :)

Hi Zachary, yeah, see the other comment I just wrote. I think stretching could plausibly magnify or attenuate the relationship, whilst shifting likely wouldn't. 

While I agree in principle, I think the evidence is that the happiness scale doesn't compress at one end. There's a bunch of evidence that people use happiness scales linearly. I refer to Michael Plant's report (pp20-22 ish): https://wellbeing.hmc.ox.ac.uk/wp-content/uploads/2024/02/2401-WP-A-Happy-Probability-DOI.pdf

Thanks for this example, Geoffrey. Hm, that's interesting! This has gotten a bit more complicated than I thought.

It seems: 

  1. Surprisingly, scale stretching could lead to attenuation or magnification depending on the underlying relationship (which is unobserved)

Let h be latent happiness; let LS be reported happiness. 

Your example:

So yes, the gradient gets steeper. 

Consider another function. (This is also decreasing in h)

 

i.e., the gradient gets flatter. 
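A worked version of these two cases, as a sketch. The first uses the numbers from Geoffrey's example. The second assumes P(h) = 1/h (for h > 0), which is the function used in point 2 below; that choice is an assumption for illustration and may not be the exact function originally intended here.

```latex
\begin{align*}
\text{Linear case: } & P(h) = 0.40 - \tfrac{h}{100} \\
T=1:\ h = LS \;&\Rightarrow\; P = 0.40 - \tfrac{LS}{100}, & \frac{dP}{dLS} &= -\tfrac{1}{100} \\
T=2:\ h = 2\,LS \;&\Rightarrow\; P = 0.40 - \tfrac{LS}{50}, & \frac{dP}{dLS} &= -\tfrac{1}{50} \quad \text{(steeper)} \\[6pt]
\text{Convex case (assumed): } & P(h) = \tfrac{1}{h} \\
T=1:\ h = LS \;&\Rightarrow\; P = \tfrac{1}{LS}, & \frac{dP}{dLS} &= -\tfrac{1}{LS^{2}} \\
T=2:\ h = 2\,LS \;&\Rightarrow\; P = \tfrac{1}{2\,LS}, & \frac{dP}{dLS} &= -\tfrac{1}{2\,LS^{2}} \quad \text{(flatter)}
\end{align*}
```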

2. Scale shifting should always lead to attenuation (if the underlying relationship is negative and convex, as stated in the piece)

Your linear probability function doesn't satisfy convexity. But convexity seems more realistic, given that the plots from Oswald/Kaiser look less-than-linear and probabilities are bounded (whilst happiness is not).

Again consider:

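T=1: LS = h => P(h) = 1/LS, so the gradient in LS is -1/LS^2

T=2: LS = h-5 <=> h = LS+5 => P(h) = 1/(LS+5), so the gradient in LS is -1/(LS+5)^2, which is flatter at every value of LS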

Overall, I think the fact that the relationship stays the same is some weak evidence against shifting, but not against stretching. FWIW, in the quality-of-life literature, shifting occurs but there is little stretching.

Interesting! I think my intuition going into this has always been stretching so that's something I could rethink

Hello Charlie. This looks like an interesting research question. However, I do have a few comments on the interpretation and statistical modelling. Both statistical comments are on subtle issues, which I would not expect an undergraduate to be aware of. Many PhD students won't be aware of them either!

On interpretation with respect to the Easterlin paradox: your working model, as far as I can tell, appears to assume that quit rates decrease in a person's latent happiness, but not in their reported happiness. However, if shifts in reporting are caused by social comparisons (i.e., all the other jobs or relationships you see around you improving) then from that direction rescaling no longer implies a flatter relationship, as the quality of the jobs or relationships available upon quitting has increased. However, your results are indicative that other forms of rescaling are not occurring, e.g., changes in culture. I think this distinction is important for interpretation.

The first of the statistical comments is that these changes in the probabilities of making a change could also be explained by a general increase in the ease of, or tendency towards, getting a hospital appointment or new job. This stems from the non-linearity of the logistic function. Logistic regression models a latent variable that determines someone's tendency to quit and converts this into a probability. At low probabilities, an increase in this latent variable has little effect on the probability of separation due to the flatness of the curve mapping the latent variable into probabilities. However, the same increase at values of the variable that give probabilities closer to 0.5 will have a big effect on the probability of separation as the curve is steep in this region. As your research question is conceptual (you're interested in whether life satisfaction scales map to an individual's underlying willingness to quit in the same way over time), rather than predicting the probability of separations, the regression coefficients on time interacted with life satisfaction should be the parameter of interest rather than the probabilities. These effects can often go in different directions. A better explanation of this issue with examples is available here: https://datacolada.org/57 I also don't know whether results will be sensitive to assuming a logit functional form relative to other reasonable distributions, such as a probit.
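A small sketch of the non-linearity point. The baseline index values (−4 and 0) and the size of the increase (0.5) are arbitrary numbers picked for illustration, and `prob_change` is a hypothetical helper name.

```python
import math

def logistic(z):
    """Map a latent index z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def prob_change(z, delta=0.5):
    """Change in probability when the latent index rises from z to z + delta."""
    return logistic(z + delta) - logistic(z)

rare = prob_change(-4.0)   # baseline probability of roughly 2%
common = prob_change(0.0)  # baseline probability of 50%

print(f"same latent increase at a rare baseline:   {rare:+.4f}")
print(f"same latent increase at a common baseline: {common:+.4f}")
```

The same increase in the latent index moves the probability far more near 0.5 than near 0, which is why comparing probabilities across periods with different base rates can mislead even when the coefficients are unchanged.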

Another, more minor, comment is that you need to be careful when adding individual fixed effects to models like this, which you mentioned you did as a robustness check. In non-linear models, such as a logit, doing this often creates an incidental parameters problem that makes your estimates inconsistent. In this case, you would also be dealing with the issue that it is impossible to separately identify age, time, and cohort effects. Holding the individual constant, the coefficient of a change in time with life satisfaction would include both the time effect you are interested in and an effect of ageing that you are not.

I'd be happy to discuss any of these issues with you in more detail.

Very nice, thanks!

If people are getting happier (and rescaling is occuring) the probability of these actions should become less linked to reported LS — but they don’t.

When I first read this sentence I thought that your argument makes perfect sense, but then when I read

Surprisingly, the relationship appears to get stronger over time. That’s the opposite of what rescaling predicts!

in the Overall happiness section, my first thought was: "well, I guess people are getting more demanding". And now I am confused. I could imagine thinking about "people don't settle for half-good any more" as a kind of increased happiness (even if calling it "satisfaction" would be strange).

Independent of this, my personal impression about economic wealth ever since childhood has been that my physical needs are essentially saturated and my social environment is massively more important for my subjective well-being. And although the latter is influenced by wealth, it is much more strongly affected by culture. And although I can think of plenty of cultural developments that I am thankful for, I can also think of many that don't push me in a healthy direction.

Sorry – this is unclear. 

"If people are getting happier (and rescaling is occuring) the probability of these actions should become less linked to reported LS"

This means, specifically, a flatter gradient (i.e., 'attenuation') – smaller in absolute terms. In reality, I found a slightly increasing (absolute) gradient/steeper. I can change that sentence.

I could imagine thinking about "people don't settle for half-good any more" as a kind of increased happiness

This feels similar to Geoffrey's comment. It could be that it takes less unhappiness for people to take decisive life action now. But, this should mean a flatter gradient (same direction as rescaling)

And yeah, this points towards culture/social comparison/expectations being more important than absolute £.

Thanks for engaging!

This means, specifically, a flatter gradient (i.e., 'attenuation') – smaller in absolute terms. In reality, I found a slightly increasing (absolute) gradient/steeper. I can change that sentence.

I don't think that is necessary - my confusion is more about grasping how the aspects play together :) I'm afraid I will have to make myself a few drawings to get a better grasp.

All good. Easy to tie yourself in knots with this ...
