# Random Variable Generation

Updated 2017-11-05

Given a uniform random variable, how do we generate random numbers with a specific probability distribution?

## Inverse Transform Algorithm

### Discrete Random Variables

To generate discrete random variables with distribution $$\mathbb P(X=x_i)=p_i$$, for $$i=1,\dotsc,n$$, we use the following algorithm.

1. Generate continuous uniform random variable $$Y\sim U(0,1)$$
2. Set $$X$$ based on where $$Y$$ is on the distribution:
$X=\begin{cases} x_1& Y\in[0,p_1)\\ x_2 & Y\in[p_1, p_1+p_2)\\ x_3 & Y\in[p_1 + p_2, p_1 + p_2+p_3)\\ \vdots & \vdots\\ x_n & Y\in[\sum_{i=1}^{n-1}p_i,1) \end{cases}$
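The two steps above can be sketched in Python (the function name `sample_discrete` and its list-based interface are illustrative choices, not from the text):

```python
import random

def sample_discrete(xs, ps):
    """Inverse transform sampling: P(X = xs[i]) = ps[i].

    Walks the cumulative sums of ps and returns the x_i whose
    interval [p_1 + ... + p_{i-1}, p_1 + ... + p_i) contains Y.
    """
    y = random.random()            # Step 1: Y ~ U(0, 1)
    cumulative = 0.0
    for x, p in zip(xs, ps):       # Step 2: locate Y among the intervals
        cumulative += p
        if y < cumulative:
            return x
    return xs[-1]                  # guard against floating-point round-off
```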

### Continuous Random Variables

To generate continuous random variables with distribution $$F_X(x)$$, follow these steps:

1. Generate continuous uniform random variable $$Y\sim U(0,1)$$
2. Find the inverse of the continuous distribution $$F_X^{-1}(x)$$
3. Set $$X=F_X^{-1}(Y)$$
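As a concrete sketch, consider the exponential distribution $$F_X(x)=1-e^{-\lambda x}$$, whose inverse is $$F_X^{-1}(y)=-\frac{\ln(1-y)}{\lambda}$$ (the exponential example is my illustration, not from the text):

```python
import math
import random

def sample_exponential(lam):
    """Inverse transform sampling for the exponential distribution.

    F(x) = 1 - exp(-lam * x) for x >= 0, so F^{-1}(y) = -ln(1 - y) / lam.
    """
    y = random.random()                 # Step 1: Y ~ U(0, 1)
    return -math.log(1.0 - y) / lam    # Steps 2-3: X = F^{-1}(Y)
```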

“Pseudo” Proof: Suppose we have the uniform random variable $$Y\sim U(0,1)$$ and we use the inverse distribution function $$F_X^{-1}$$ to obtain random $$X$$. Note that the distribution function for $$Y$$ is, by definition, $$F_Y(y)=\mathbb P(Y\leq y)=y$$ for $$y\in[0,1]$$.

\begin{align} \mathbb P(X\leq x)&=\mathbb P(F_X^{-1}(Y)\leq x)\\ &=\mathbb P(Y\leq F_X(x))\\ &=F_X(x) \end{align}

This is exactly the distribution function we wanted.

Note: This method does not work directly on Gaussian random variables because the Gaussian CDF has no closed-form inverse; use the Polar Algorithm for Gaussian random variables instead.

## Polar Algorithm

As noted above, we can’t use the inverse transform method for Gaussian random variables. To simulate a continuous random variable with distribution $$N(0,1)$$, follow these steps:

1. Generate continuous uniform random variables $$Y_1\sim U(0,1)$$ and $$Y_2\sim U(0, 1)$$
2. Set $$D=-2\ln(Y_1)$$
3. Set $$\Theta=2\pi Y_2$$
4. Set $$X=\sqrt{D}\cos(\Theta)$$; $$X$$ is now a random variable with distribution $$N(0,1)$$
5. Set $$Z=\sqrt{D}\sin(\Theta)$$; $$Z$$ is now a random variable with distribution $$N(0,1)$$ that is also independent from $$X$$

To obtain a random variable with non-standard normal distribution $$Z\sim N(\mu, \sigma^2)$$, we use the fact that a linear transformation of a Gaussian random variable is still Gaussian.

1. Generate $$X\sim N(0,1)$$ using steps 1-4 above
2. Set $$Z=\sigma X + \mu$$; $$Z$$ is now a random variable with distribution $$Z\sim N(\mu, \sigma^2)$$
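Steps 1–4 plus the scaling step can be sketched as follows (using `1 - random.random()` to keep the logarithm’s argument strictly positive, since `random.random()` can return 0):

```python
import math
import random

def sample_normal(mu=0.0, sigma=1.0):
    """Polar algorithm: two independent uniforms -> one N(mu, sigma^2) draw."""
    y1 = 1.0 - random.random()          # in (0, 1], avoids log(0)
    y2 = random.random()
    d = -2.0 * math.log(y1)
    theta = 2.0 * math.pi * y2
    x = math.sqrt(d) * math.cos(theta)  # X ~ N(0, 1)
    return sigma * x + mu               # linear transform -> N(mu, sigma^2)
```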

## Composition Method

Use this method if the distribution function $$F_X$$ can be written as a weighted sum (mixture) of other distribution functions $$F_{X_i}$$:

$F_X(x)=\sum_{i=1}^{n}p_i F_{X_i}(x)\qquad\sum_{i=1}^n p_i=1,\qquad p_i\geq 0$

where each $$F_{X_i}$$ is a CDF.

To get a random variable with distribution $$F_X$$, follow these steps:

1. Generate a random variable $$I$$ with discrete distribution $$\mathbb P(I=i)=p_i$$ for $$i=1,2,\dotsc,n$$ using the Inverse Transform method for discrete random variables
2. Generate a random variable $$Y_I$$ with distribution $$F_{X_I}$$ using the Inverse Transform method for continuous random variables
3. Set $$X=Y_I$$; $$X$$ is now a random variable with distribution $$X\sim F_X(x)$$
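The three steps can be sketched generically; passing the weights as a list and the components as inverse-CDF callables is my own framing:

```python
import random

def sample_composition(ps, inverse_cdfs):
    """Composition method: ps are the mixture weights p_i (summing to 1),
    inverse_cdfs[i] is the inverse CDF of component F_{X_i}."""
    # Step 1: draw component index I with P(I = i) = ps[i]
    # (discrete inverse transform on the weights).
    u = random.random()
    cumulative = 0.0
    index = len(ps) - 1                  # fallback for round-off
    for i, p in enumerate(ps):
        cumulative += p
        if u < cumulative:
            index = i
            break
    # Step 2: sample from the chosen component via its inverse CDF.
    return inverse_cdfs[index](random.random())
```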

## Acceptance-Rejection Method

We can use this method when it is hard to generate a random variable $$X$$ with distribution $$F_X$$ using the previous methods.

So we find a proposal distribution $$F_Y$$ that is easy to sample from. Note that the supports of the density functions $$f_X$$ and $$f_Y$$ must be the same.

Next, we need an upper bound $$a$$ such that $$a\geq \frac{f_X(x)}{f_Y(x)}$$ for all $$x$$.

With all that in mind, we may begin the algorithm:

1. Generate a random variable $$Y$$ from the distribution $$F_Y$$ using methods previously described
2. Generate a uniform random variable $$Z\sim U(0,1)$$
3. Check if $$Z \leq \frac{f_X(Y)}{a f_Y(Y)}$$
• If true (accept), then set $$X=Y$$
• Else (reject), go back to step 1 and try again

Note: The probability that $$Y$$ is accepted is $$\frac{1}{a}$$, so we should always choose $$a$$ as small as possible (necessarily $$a\geq 1$$).
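The loop above can be sketched generically; taking the proposal sampler and the two densities as callables is my own framing:

```python
import random

def sample_acceptance_rejection(sample_proposal, f_target, f_proposal, a):
    """Acceptance-rejection: a must satisfy a >= f_target(x) / f_proposal(x)
    for every x in the (shared) support."""
    while True:
        y = sample_proposal()            # Step 1: Y ~ F_Y
        z = random.random()              # Step 2: Z ~ U(0, 1)
        if z <= f_target(y) / (a * f_proposal(y)):
            return y                     # Step 3: accept, X = Y
        # otherwise reject and loop back to step 1
```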

Example: half-normal distribution. The half-normal distribution has density $$f_X(x)=\frac{2}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$$ for $$x\geq 0$$. Its inverse CDF is very difficult to find, so we use the A-R method.

The support of this density is $$[0,\infty)$$, so the exponential distribution is an appropriate choice of proposal distribution. Its density is $$f_Y(x)=e^{-x}$$.

Thus

$\frac{f_X(x)}{f_Y(x)}=\frac{2}{\sqrt{2\pi}}e^{-\frac{x^2}{2}+x}$

Setting the derivative to zero (and checking that the second derivative is negative there), we find the maximum is at $$x=1$$. Thus, our upper bound is

$\frac{f_X(1)}{f_Y(1)}=\frac{2}{\sqrt{2\pi}}e^\frac12\approx1.32=a$

(We round up, not down, so that $$a$$ remains a valid upper bound on the ratio.)

Now we apply the A-R algorithm. First we generate an exponential random variable, $$Y\sim f_Y$$.

Next, we generate $$Z\sim U(0,1)$$.

Lastly, we set $$X=Y$$ if $$Z\leq \frac{1}{1.32}\cdot\frac{2}{\sqrt{2\pi}}e^{Y-\frac{Y^2}{2}}$$; otherwise we reject $$Y$$ and start over.
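Putting the worked example together (the code computes $$a=\frac{2}{\sqrt{2\pi}}e^{1/2}$$ exactly rather than using a rounded value, so it is guaranteed to be an upper bound):

```python
import math
import random

A = 2.0 / math.sqrt(2.0 * math.pi) * math.exp(0.5)    # exact bound, ~1.32

def sample_half_normal():
    """A-R sampling of the half-normal density with an Exp(1) proposal."""
    while True:
        y = -math.log(1.0 - random.random())          # Y ~ Exp(1), inverse transform
        z = random.random()                           # Z ~ U(0, 1)
        ratio = (2.0 / math.sqrt(2.0 * math.pi)) * math.exp(y - y * y / 2.0) / A
        if z <= ratio:
            return y                                  # accept: X = Y
```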

## Vector Random Variables Generation

Suppose we have a vector of two random variables $$(X, Y)\sim F_{X,Y}(x,y)$$.

If $$X$$ and $$Y$$ are independent, great! We can just generate each of them independently.

If not, then the algorithm is as follows:

1. Express the joint PDF of the two random variables as a conditional PDF

$f_{X,Y}(x,y)=f_{Y\vert X}(y\vert x)f_X(x)$
2. For $$i=1,2,\dotsc,n$$ do:

1. Generate $$x_i\sim f_X(x)$$
2. Generate $$y_i\sim f_{Y\vert X}(y\vert x_i)$$

Example:

Suppose $$f_{X,Y}(x,y)=x+y$$ and $$0\leq x,y \leq 1$$, then the marginal PDF for $$X$$ is

$f_X(x)=\int_0^1 f_{X,Y}(x,y)\mathrm dy=x+\frac12$

Next, using the marginal PDF for $$X$$, we can find the conditional PDF:

$f_{Y\vert X}(y\vert x)=\frac{f_{X,Y}(x,y)}{f_X(x)}=\frac{x+y}{x+\frac{1}{2}}$

Following the rest of the algorithm, for each $$i$$ we sample $$x_i\sim f_X(x)=x+\frac12$$ and then $$y_i\sim f_{Y\vert X}(y\vert x_i)=\frac{x_i + y}{x_i + \frac12}$$.
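For this particular joint density both inverse CDFs have closed forms: $$F_X(x)=\frac{x^2+x}{2}$$ and $$F_{Y\vert X}(y\vert x)=\frac{y^2/2+xy}{x+1/2}$$ are each quadratic in the variable being sampled, so both steps can use the inverse transform. A minimal sketch:

```python
import math
import random

def sample_xy():
    """Sample (X, Y) with joint density f(x, y) = x + y on [0, 1]^2.

    Solving F_X(x) = (x^2 + x) / 2 = u for x, and
    F_{Y|X}(y|x) = (y^2 / 2 + x*y) / (x + 1/2) = v for y,
    gives the closed-form inverse CDFs used below.
    """
    u = random.random()
    x = (-1.0 + math.sqrt(1.0 + 8.0 * u)) / 2.0        # x_i ~ f_X(x) = x + 1/2
    v = random.random()
    y = -x + math.sqrt(x * x + v * (2.0 * x + 1.0))    # y_i ~ f_{Y|X}(y | x_i)
    return x, y
```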