TLDR: Feel free to download or make a copy of this Sheets to calculate the parameters of uniform, normal, loguniform, lognormal, pareto and logistic distributions, including their mean, median and mode, based on the values of 2 quantiles.
Scenario
Given a random variable $X$ with cumulative distribution function $F$, and two values $x_1$ and $x_2$ respecting the quantiles $q_1$ and $q_2$, i.e. $F(x_1) = q_1$ and $F(x_2) = q_2$, what are the parameters of the underlying distribution? Answering this question is relevant to determine the expected value (mean) of $X$.
As a concrete example, $X$ could represent the cost-effectiveness distribution of an intervention whose 10th and 90th percentiles are 5 and 15. For this case, the inputs would be:
- $x_1 = 5$ and $q_1 = 0.1$.
- $x_2 = 15$ and $q_2 = 0.9$.
Distribution parameters
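The relationship between the values $x$ and quantiles $q$ of $X$ is described by:
- For a uniform distribution with minimum $a$ and maximum $b$:
- $x = a + (b - a) q$.
- For a normal distribution with mean $\mu$ and standard deviation $\sigma$, denoting $\Phi^{-1}$ as the quantile function of the standard normal distribution[1]:
- $x = \mu + \sigma \Phi^{-1}(q)$.
- For a pareto distribution with minimum $x_m$ and tail index $\alpha$:
- $x = x_m (1 - q)^{-1/\alpha}$.
- For a logistic distribution with mean $\mu$ and scale $s$:
- $x = \mu + s \ln\left(\frac{q}{1 - q}\right)$.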
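As a rough illustration, the relationships above can be written as quantile functions that map a quantile to a value. This is a minimal sketch using the standard closed forms for each distribution; the function and parameter names are my own, and `statistics.NormalDist().inv_cdf` from the Python standard library stands in for the standard normal quantile function.

```python
import math
from statistics import NormalDist


def uniform_quantile(q, a, b):
    """Value at quantile q for a uniform distribution on [a, b]."""
    return a + (b - a) * q


def normal_quantile(q, mu, sigma):
    """Value at quantile q for a normal distribution (mean mu, sd sigma)."""
    return mu + sigma * NormalDist().inv_cdf(q)


def pareto_quantile(q, x_min, alpha):
    """Value at quantile q for a pareto distribution (minimum x_min, tail index alpha)."""
    return x_min * (1 - q) ** (-1 / alpha)


def logistic_quantile(q, mu, s):
    """Value at quantile q for a logistic distribution (mean mu, scale s)."""
    return mu + s * math.log(q / (1 - q))
```

For a loguniform or lognormal distribution, the same uniform or normal function would give $\ln(x)$, so the value is its exponential.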
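The parameters which define the distributions can be determined by solving a system of 2 equations, one for $(x_1, q_1)$ and one for $(x_2, q_2)$, for each of the above relationships:
- For a uniform distribution:
- Minimum: $a = \frac{x_1 q_2 - x_2 q_1}{q_2 - q_1}$.
- Maximum: $b = a + \frac{x_2 - x_1}{q_2 - q_1}$.
- For a normal distribution:
- Standard deviation: $\sigma = \frac{x_2 - x_1}{\Phi^{-1}(q_2) - \Phi^{-1}(q_1)}$.
- Mean: $\mu = x_1 - \sigma \Phi^{-1}(q_1)$.
- For a loguniform or lognormal distribution:
- $\ln(X)$ follows a uniform or normal distribution.
- This means the parameters referring to the logarithm of $X$ can be calculated by replacing $x_1$ and $x_2$ with $\ln(x_1)$ and $\ln(x_2)$.
- For a pareto distribution:
- Tail index: $\alpha = \frac{\ln\left(\frac{1 - q_1}{1 - q_2}\right)}{\ln\left(\frac{x_2}{x_1}\right)}$.
- Minimum: $x_m = x_1 (1 - q_1)^{1/\alpha}$.
- For a logistic distribution:
- Scale: $s = \frac{x_2 - x_1}{\ln\left(\frac{q_2}{1 - q_2}\right) - \ln\left(\frac{q_1}{1 - q_1}\right)}$.
- Mean: $\mu = x_1 - s \ln\left(\frac{q_1}{1 - q_1}\right)$.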
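For readers who prefer code to a spreadsheet, here is a minimal sketch of the same two-equation solutions in Python. It is not the sheet's implementation: the function names are hypothetical, the formulas are the standard closed-form solutions for each distribution, and the standard library's `statistics.NormalDist().inv_cdf` is used for the standard normal quantile function (`scipy.stats.norm.ppf` would work equally well).

```python
import math
from statistics import NormalDist


def fit_uniform(x1, q1, x2, q2):
    """Return (minimum, maximum) of a uniform distribution."""
    width = (x2 - x1) / (q2 - q1)  # maximum minus minimum
    a = x1 - width * q1
    return a, a + width


def fit_normal(x1, q1, x2, q2):
    """Return (mean, standard deviation) of a normal distribution."""
    z1, z2 = NormalDist().inv_cdf(q1), NormalDist().inv_cdf(q2)
    sigma = (x2 - x1) / (z2 - z1)
    return x1 - sigma * z1, sigma


def fit_loguniform(x1, q1, x2, q2):
    """Return (minimum, maximum) of ln(X), which is uniform."""
    return fit_uniform(math.log(x1), q1, math.log(x2), q2)


def fit_lognormal(x1, q1, x2, q2):
    """Return (mean, standard deviation) of ln(X), which is normal."""
    return fit_normal(math.log(x1), q1, math.log(x2), q2)


def fit_pareto(x1, q1, x2, q2):
    """Return (minimum, tail index) of a pareto distribution."""
    alpha = math.log((1 - q1) / (1 - q2)) / math.log(x2 / x1)
    return x1 * (1 - q1) ** (1 / alpha), alpha


def fit_logistic(x1, q1, x2, q2):
    """Return (mean, scale) of a logistic distribution."""
    l1, l2 = math.log(q1 / (1 - q1)), math.log(q2 / (1 - q2))
    s = (x2 - x1) / (l2 - l1)
    return x1 - s * l1, s


# Example from the post: 10th and 90th percentiles of 5 and 15.
inputs = (5, 0.1, 15, 0.9)
for fit in (fit_uniform, fit_normal, fit_loguniform,
            fit_lognormal, fit_pareto, fit_logistic):
    print(fit.__name__, fit(*inputs))
```

Because the two quantiles in the example are symmetric around 0.5, the normal and logistic fits both have a mean of 10, the midpoint of 5 and 15.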
Feel free to download or make a copy of this Sheets to calculate the parameters of uniform, normal, loguniform, lognormal, pareto and logistic distributions, based on the above formulas. I have also included formulas for the mean, median and mode of all these distributions.
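The post does not reproduce the sheet's formulas for the mean, median and mode, so the sketch below only shows the standard closed forms for the three less obvious cases (for the normal and logistic distributions all three equal the mean parameter, and for the uniform distribution the mean and median are the midpoint). The function names and the choice of parameterisation are assumptions of mine, not taken from the sheet.

```python
import math


def lognormal_stats(mu, sigma):
    """mu and sigma are the mean and standard deviation of ln(X)."""
    return {
        "mean": math.exp(mu + sigma ** 2 / 2),
        "median": math.exp(mu),
        "mode": math.exp(mu - sigma ** 2),
    }


def loguniform_stats(a, b):
    """a and b are the minimum and maximum of X itself (ln(X) is uniform)."""
    return {
        "mean": (b - a) / math.log(b / a),
        "median": math.sqrt(a * b),
        "mode": a,  # the density is proportional to 1/x, so it peaks at the minimum
    }


def pareto_stats(x_min, alpha):
    """x_min is the minimum and alpha the tail index."""
    # The mean is infinite when the tail index is 1 or lower.
    mean = alpha * x_min / (alpha - 1) if alpha > 1 else math.inf
    return {
        "mean": mean,
        "median": x_min * 2 ** (1 / alpha),
        "mode": x_min,
    }
```

Note that for heavy-tailed distributions the mean can be much larger than the median, which is why fitting the distribution matters when the goal is the expected value.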
[1] The standard normal distribution has mean 0 and standard deviation 1. The quantile function is the inverse of the cumulative distribution function. The quantile function of the normal distribution can be calculated via NORMINV in Sheets, and scipy.stats.norm.ppf in Python.
One can very often avoid running Monte Carlo simulations, at least ones with more than 1 distribution, by estimating the means of the input distributions using the sheet. For example, if X and Y are independent random variables, the expected value of Z = X*Y is E(Z) = E(X)*E(Y), and one can use the sheet to calculate each of these factors (provided both X and Y follow one of the distributions covered there).
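A quick way to sanity-check this is to compare the product of the analytical means with a simulated mean. The sketch below assumes two independent lognormal variables with made-up parameters (they are illustrative, not from the post), and uses the standard lognormal mean exp(mu + sigma^2/2).

```python
import math
import random


def lognormal_mean(mu, sigma):
    """Mean of a lognormal whose logarithm has mean mu and sd sigma."""
    return math.exp(mu + sigma ** 2 / 2)


def simulated_mean_of_product(mu_x, sigma_x, mu_y, sigma_y, n, seed=0):
    """Monte Carlo estimate of E(X*Y) for independent lognormal X and Y."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += rng.lognormvariate(mu_x, sigma_x) * rng.lognormvariate(mu_y, sigma_y)
    return total / n


# Hypothetical parameters of ln(X) and ln(Y).
analytic = lognormal_mean(0.0, 0.5) * lognormal_mean(1.0, 0.3)
simulated = simulated_mean_of_product(0.0, 0.5, 1.0, 0.3, n=200_000)
print(f"E(X)*E(Y) = {analytic:.4f}, simulated E(X*Y) = {simulated:.4f}")
```

The two numbers should agree up to sampling noise, which shrinks as n grows.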
Useful stuff! I was working on something similar months ago and ended up eyeballing things.
I think the links to the sheets are broken though, they just link to this page
Sorry, and thanks! They are working now.
The hyperlink on the word "this" (in both instances) is broken. I don't see how to get to the calculator.
Sorry, and thanks! They are working now.
Thank you for this! I had been trying to solve this exact problem recently, and I wasn't sure if I was doing it right. And this spreadsheet is much more convenient than the way I was doing it.
Great to know that you found it useful!