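Given a random variable \(X,\) the \(p\)th percentile of \(X,\) with \(p\) written as a proportion between \(0\) and \(1\) (so the \(25\)th percentile uses \(p = 0.25\)), is the number \(c_p\) defined as follows: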

- Let \([a, b]\) be the interval of all values \(c\) for which \(P(X \leq c) \geq p\) and \(P(X \geq c) \geq 1-p.\)
- Linearly interpolate over the interval \([a, b]:\) \(c_p = (1-p)a + pb.\)

Often there is exactly one point \(c\) such that \(P(X \leq c) \geq p\) and \(P(X \geq c) \geq 1-p,\) in which case \(a = b = c\) and \(c_p = c.\)

The \(50\)th percentile is called the median. The median is a value \(c\) such that half the distribution is above \(c\) and half the distribution is below \(c.\)

The \(25\)th percentile is called the first quartile, the \(50\)th percentile is called the second quartile, and the \(75\)th percentile is called the third quartile.

Let \(X\) be uniform on \(\{3, 5, 8, 9\}.\) What is the median of \(X?\) What is the first quartile?

Solution:

For any value \(c \in [5,8],\) \(P(X \leq c) \geq \frac{1}{2}\) and \(P(X \geq c) \geq \frac{1}{2}.\) So, the median of \(X\) is
\[\frac{1}{2} \cdot 5 + \frac{1}{2} \cdot 8 = 6.5\]
For any value \(c \in [3, 5],\) \(P(X \leq c) \geq \frac{1}{4}\) and \(P(X \geq c) \geq \frac{3}{4}.\) So, the first quartile of \(X\) is
\[\frac{3}{4} \cdot 3 + \frac{1}{4} \cdot 5 = 3.5\]
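The interval definition can be checked numerically. The following is a minimal sketch, assuming a finite discrete distribution stored as a dictionary; the function name `discrete_percentile` and the use of exact fractions are illustrative choices, not part of the original lesson.

```python
from fractions import Fraction


def discrete_percentile(pmf, p):
    """Percentile of a finite discrete distribution via the interval definition.

    pmf: dict mapping each value to its probability (assumed to sum to 1).
    p:   the percentile as a proportion strictly between 0 and 1.
    """
    values = sorted(pmf)
    # Collect the support points c with P(X <= c) >= p and P(X >= c) >= 1 - p.
    # For a finite distribution the endpoints a and b are always support points.
    qualifying = []
    for c in values:
        at_most = sum(pmf[v] for v in values if v <= c)
        at_least = sum(pmf[v] for v in values if v >= c)
        if at_most >= p and at_least >= 1 - p:
            qualifying.append(c)
    a, b = qualifying[0], qualifying[-1]
    # Linear interpolation over [a, b].
    return (1 - p) * a + p * b


# X uniform on {3, 5, 8, 9}
uniform = {v: Fraction(1, 4) for v in (3, 5, 8, 9)}
median = discrete_percentile(uniform, Fraction(1, 2))
first_quartile = discrete_percentile(uniform, Fraction(1, 4))
print(float(median), float(first_quartile))
```

Exact fractions avoid floating-point trouble in the comparisons with \(p\) and \(1-p,\) which matters here because several of the probabilities equal the thresholds exactly.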

Let \(X\) have the Geometric\((0.2)\) distribution. Find the quartiles of the distribution of \(X.\)

Solution:

The first quartile, \(c_{25},\) can be found by solving the following:
\[\sum_{i=1}^{c_{25}} (0.8)^{i-1} \cdot 0.2 = 0.25\]
First, we rewrite the sum as something easier to solve. Factor \(0.2\) and \(0.8^{-1}\) out of the sum; together they give \(0.2 \cdot 0.8^{-1} = 0.25.\) We get
\[\sum_{i=1}^{c_{25}} (0.8)^{i-1} \cdot 0.2 = 0.25\sum_{i=1}^{c_{25}} (0.8)^i\]
Next, use the identity
\[\sum_{i=1}^n r^i = \frac{r(1-r^n)}{1-r}\]
Plugging in \(c_{25}\) for \(n\) and \(0.8\) for \(r,\) we get
\begin{align}
0.25\sum_{i=1}^{c_{25}} (0.8)^i & = 0.25 \cdot \frac{0.8(1-0.8^{c_{25}})}{1-0.8} \\
& = 1-0.8^{c_{25}}
\end{align}
Now we can solve for \(c_{25}.\)
\begin{align}
& 1-0.8^{c_{25}} = 0.25 \Rightarrow \\
& 0.8^{c_{25}} = 0.75 \Rightarrow \\
& c_{25}\ln(0.8) = \ln(0.75) \Rightarrow \\
& c_{25} = \frac{\ln(0.75)}{\ln(0.8)} \approx 1.29
\end{align}
Since \(X\) takes only integer values and no integer \(c\) satisfies \(P(X \leq c) = 0.25,\) take the ceiling of \(1.29\) to get the smallest integer whose cumulative probability is at least \(0.25.\) So, \(c_{25} = 2.\) You can check that \(P(X \leq 2) = 0.2 + 0.2 \cdot 0.8 = 0.36 > 0.25\) and \(P(X \geq 2) = 1 - P(X = 1) = 1-0.2 = 0.8 > 0.75.\)

The second quartile, or median, of \(X\) must satisfy
\[1-0.8^{c_{50}} = 0.5\]
This can be solved to yield
\[c_{50} = \frac{\ln(0.5)}{\ln(0.8)} \approx 3.11\]
Taking the ceiling, we get \(c_{50} = 4.\) You can check:
\begin{align}
& P(X \leq 4) = 0.2 + 0.2 \cdot 0.8 + 0.2 \cdot 0.8^2 + 0.2 \cdot 0.8^3 = 0.5904 > 0.5 \\
& P(X \geq 4) = 1 - 0.2 - 0.2 \cdot 0.8 - 0.2 \cdot 0.8^2 = 0.512 > 0.5
\end{align}

Lastly, the third quartile can be found by solving
\[1-0.8^{c_{75}} = 0.75\]
This can be solved to yield
\[c_{75} = \frac{\ln(0.25)}{\ln(0.8)} \approx 6.21\]
Taking the ceiling, we get \(c_{75} = 7.\) You can check:
\begin{align}
& P(X \leq 7) = 0.2 \sum_{i=1}^7 (0.8)^{i-1} \approx 0.79 > 0.75 \\
& P(X \geq 7) = 1 - 0.2 \sum_{i=1}^6 (0.8)^{i-1} \approx 0.26 > 0.25
\end{align}
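The three quartile calculations follow the same pattern, so they can be sketched in a few lines of code. This is a minimal sketch, assuming the Geometric distribution counts trials up to and including the first success, so that \(P(X \leq n) = 1 - 0.8^n;\) the helper names `geometric_cdf` and `geometric_percentile` are hypothetical. The ceiling shortcut also assumes the unrounded solution is not an integer. If it were, two adjacent integers would satisfy both conditions and the interpolation rule from the definition would apply instead.

```python
import math


def geometric_cdf(n, success=0.2):
    """P(X <= n) for X counting trials up to and including the first success."""
    return 1 - (1 - success) ** n


def geometric_percentile(p, success=0.2):
    """Smallest integer n with P(X <= n) >= p (assumes the exact root is not an integer)."""
    exact = math.log(1 - p) / math.log(1 - success)
    return math.ceil(exact)


for p in (0.25, 0.5, 0.75):
    n = geometric_percentile(p)
    at_most = geometric_cdf(n)           # P(X <= n), should be at least p
    at_least = 1 - geometric_cdf(n - 1)  # P(X >= n), should be at least 1 - p
    print(p, n, round(at_most, 4), round(at_least, 4))
```

Each printed row gives the percentile level, the resulting quartile, and the two probabilities that the checks in the solution compare against \(p\) and \(1-p.\)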

Let \(X\) be a continuous random variable with pdf \(f(x) = 2-2x\) for \(0 < x < 1.\) Find the median of the distribution of \(X\) and the \(10\)th percentile.

Solution:

The median is the \(50\)th percentile. We can find the median, \(c_{50},\) by solving the following:
\[\int_0^{c_{50}} (2 - 2x) dx = 0.5\]
Taking the integral, the equation becomes
\[2c_{50} - c_{50}^2 = 0.5\]
Using the quadratic formula, the only solution between \(0\) and \(1\) is \(c_{50} = 1-\frac{\sqrt{2}}{2} \approx 0.29.\) This means that \(X\) has a \(0.5\) probability of being less than \(0.29\) and a \(0.5\) probability of being greater than \(0.29.\)
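For reference, the quadratic-formula step for the median can be written out as follows, after rearranging the equation into standard form:
\begin{align}
& c_{50}^2 - 2c_{50} + 0.5 = 0 \Rightarrow \\
& c_{50} = \frac{2 \pm \sqrt{4 - 2}}{2} = 1 \pm \frac{\sqrt{2}}{2}
\end{align}
The root \(1 + \frac{\sqrt{2}}{2} \approx 1.71\) lies outside the interval \((0, 1),\) which leaves \(c_{50} = 1 - \frac{\sqrt{2}}{2}.\)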

To find the \(10\)th percentile, find \(c_{10}\) by solving the following:
\[\int_0^{c_{10}} (2-2x) dx = 0.1\]
Taking the integral, the equation becomes
\[2c_{10} - c_{10}^2 = 0.1\]
Using the quadratic formula, the only solution between \(0\) and \(1\) is \(c_{10} = 1-\frac{3}{\sqrt{10}} \approx 0.05.\) This means that \(X\) has a \(0.1\) probability of being less than \(0.05\) and a \(0.9\) probability of being greater than \(0.05.\)
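When the equation \(F(c) = p\) has no convenient closed form, the same percentiles can be approximated numerically. The sketch below uses bisection on the cdf \(F(x) = 2x - x^2\) obtained from the integral above. Bisection is a substituted technique for illustration rather than the method used in the solution, and the function names are hypothetical.

```python
import math


def cdf(x):
    """CDF of the pdf f(x) = 2 - 2x on (0, 1): F(x) = 2x - x^2."""
    return 2 * x - x * x


def percentile_by_bisection(p, lo=0.0, hi=1.0, tol=1e-12):
    """Approximate the solution of F(c) = p on [lo, hi], using that F is increasing there."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies at or to the left of mid
    return (lo + hi) / 2


median = percentile_by_bisection(0.5)
tenth = percentile_by_bisection(0.1)

# Compare with the closed-form answers from the quadratic formula.
print(median, 1 - math.sqrt(2) / 2)
print(tenth, 1 - 3 / math.sqrt(10))
```

For a continuous distribution with a strictly increasing cdf there is exactly one solution, so no interpolation step is needed.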

Check your understanding:

For problems 1 and 2, let \(X\) be a discrete random variable with pmf \[p(0)=0.1, p(2)=0.2, p(3.5)=0.3, p(5)=0.2, p(12)=0.2\]

1. What is the median of \(X?\)

2. What is the \(17\)th percentile of \(X?\)

For problems 3 and 4, let \(X\) be a continuous random variable with pdf \(f(x) = \frac{1}{9}x^2\) for \(0 < x < 3.\)

3. What is the third quartile of \(X?\)

4. What is the \(88\)th percentile of \(X?\)
