# Chapter 2 Discrete random variables

The outcome of a random experiment can often be expressed as a number; when it cannot, it is usually at least possible to summarize the relevant part of the outcome in a number.

A random variable is the numerical outcome of a random experiment: a number obtained at random.

## 2.1 Definition of random variable

Associated with a random experiment we have a family of events with a $$\sigma$$-algebra structure, while in $${\mathbb R}$$ the Borel sets (intervals together with their complements and countable unions and intersections) also form a $$\sigma$$-algebra.

A random variable is a measurable mapping from the sample space associated with a random experiment into the set of real numbers, $$X:S\rightarrow{\mathbb R}$$.

It associates each outcome of the random experiment with a real number, and it is measurable because the inverse image of every Borel set belongs to the $$\sigma$$-algebra of events.

Events associated with a random variable

For any Borel set $$A\subset{\mathbb R}$$, its inverse image through $$X$$, given by $X^{-1}(A)=\{s\in S:\, X(s)\in A\},$ is an event, and as such we can compute its probability. \begin{align*} P(X\in A)&=P(X^{-1}(A))\\ &=P(\{s\in S:\,X(s)\in A\})\,. \end{align*}

If $$A=\{x\}$$ is a singleton, then $$P(X=x)=P(X^{-1}(\{x\}))=P(\{s\in S:\,X(s)=x\})$$.

We can then ignore the original sample space $$S$$ and work with the probability induced on $${\mathbb R}$$, given by $$P_X(A)=P(X\in A)$$ for any Borel set $$A\subset{\mathbb R}$$.

The support of a r.v. (discrete and continuous variables)

The support or range of a random variable, $$X(S)$$, is the set of all values that it can assume.

• If $$X(S)$$ is a finite or denumerable set, then $$X$$ is a discrete random variable.
• If $$X(S)$$ contains all the elements in an interval of real numbers, then $$X$$ is a continuous random variable.

## 2.2 Discrete r.v.s, probability mass function, and cumulative distribution function

Probability mass function The probability mass function associates each real number $$x$$ with the probability that the random variable $$X$$ takes exactly that value, $p(x)=P(X=x).$

Properties of the probability mass function

• $$0\leq p(x)\leq 1$$ for every $$x\in{\mathbb R}$$
• if $$X(S)=\{x_i\}_{i\in I}$$, then $$\sum_{i\in I} p(x_i)=1$$

The probability that $$X$$ lies in any Borel set $$A\subset{\mathbb R}$$ is $P(X\in A)=\sum_{x_i\in X(S)\cap A}p(x_i).$
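For instance, with the fair 6-face die used in the examples below, the probability of the event $$A=\{2,4,6\}$$ is just the sum of the pmf over the support points inside $$A$$. A short base-R sketch (no extra packages assumed):

```r
p <- rep(1/6, 6); names(p) <- 1:6   # pmf of a fair 6-face die
A <- c(2, 4, 6)                     # the event "even outcome"
sum(p[names(p) %in% A])             # P(X in A) = 3/6 = 0.5
```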

Cumulative distribution function, cdf The cumulative distribution function (cdf) of r.v. $$X$$ evaluated at $$x\in{\mathbb R}$$ is the probability that $$X$$ is not greater than $$x$$, $F(x)=P(X\leq x)=\sum_{x_i\leq x}p(x_i),\quad\text{where the sum runs over }x_i\in X(S).$

Properties of the cdf of a discrete random variable

• $$\lim\limits_{x\rightarrow-\infty}F(x)=0$$;
• $$\lim\limits_{x\rightarrow+\infty}F(x)=1$$;
• $$F$$ is nondecreasing;
• $$F$$ is right-continuous.

Example (6-face fair die)

$$X\equiv$$ 'outcome of a roll of a 6-face fair die' $p(x)=P(X=x)=\left\{\begin{array}{ll}1/6&\textrm{ if }x\in\{1,2,3,4,5,6\}\\ 0&\textrm{ otherwise}\end{array}\right.,$

$F(x)=P(X\leq x)=\left\{\begin{array}{ll}0&\textrm{ if }x<1\\1/6&\textrm{ if }1\leq x<2\\2/6&\textrm{ if }2\leq x<3\\3/6&\textrm{ if }3\leq x<4\\4/6&\textrm{ if }4\leq x<5\\5/6&\textrm{ if }5\leq x<6\\1&\textrm{ if }x\geq 6\end{array}\right..$

library(prob)
table(rolldie(1)$X1)
##
## 1 2 3 4 5 6
## 1 1 1 1 1 1
barplot(table(rolldie(1)$X1)/6,col="light blue")

plot(ecdf(rolldie(1)$X1))
ecdf(rolldie(1)$X1)(3)
## [1] 0.5

## 2.3 Mean, variance, and quantiles

Mean or expectation The mean or expectation of $$X$$ is its probability-weighted average, ${\mathbb E}[X]=\sum_{i\in I} x_ip_X(x_i)\,.$

Transformation of a random variable If $$X$$ is a discrete r.v. and $$g:{\mathbb R}\rightarrow{\mathbb R}$$ is a function, then $$g(X)$$ is a discrete r.v. with probability mass function $$p_{g(X)}(y)=P(g(X)=y)=\sum_{g(x_i)=y}P(X=x_i)$$.
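A base-R sketch of this formula with the fair die and the (purely illustrative, non-injective) choice $$g(x)=|x-3|$$: support points with the same image pool their probabilities.

```r
x <- 1:6; p <- rep(1/6, 6)    # fair-die support and pmf
g <- abs(x - 3)               # a non-injective transformation
tapply(p, g, sum)             # p_{g(X)}(y): sums p(x_i) over {x_i : g(x_i) = y}
# y = 0, 1, 2, 3 with probabilities 1/6, 2/6, 2/6, 1/6
```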

Properties of the mean For any real numbers $$a,b\in{\mathbb R}$$, any function $$g:{\mathbb R}\rightarrow{\mathbb R}$$, and r.v. $$X$$,

• $${\mathbb E}[aX+b]=a{\mathbb E}[X]+b$$;
• $${\mathbb E}[g(X)]=\sum_{i\in I} g(x_i)p_X(x_i)$$;
• $${\mathbb E}[(X-{\mathbb E}[X])^2]=\min_{x\in{\mathbb R}}{\mathbb E}[(X-x)^2]$$.
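The first two properties can be checked numerically for the fair die with a few lines of base R (the choice $$a=2$$, $$b=1$$ is arbitrary):

```r
x <- 1:6; p <- rep(1/6, 6)    # fair-die support and pmf
EX <- sum(x * p)              # E[X] = 3.5
sum((2 * x + 1) * p)          # E[2X+1] = 2*E[X] + 1 = 8
sum((x - EX)^2 * p)           # E[(X-E[X])^2], the minimal value of E[(X-c)^2]
```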

Variance The variance is a measure of the scatter of the distribution of r.v. $$X$$. It is the expected squared distance of $$X$$ from its mean, ${\rm Var}[X]={\mathbb E}\left[(X-{\mathbb E}[X])^2\right]=\sum_{i\in I}(x_i-{\mathbb E}[X])^2p_X(x_i)\,.$

Properties of the variance

• $${\rm Var}[X]\geq 0$$;
• $${\rm Var}[X]={\mathbb E}[X^2]-{\mathbb E}[X]^2$$;
• $${\rm Var}[aX+b]=a^2{\rm Var}[X]$$, for any $$a,b\in{\mathbb R}$$.

The standard deviation of $$X$$ is the (positive) square root of its variance, $\sigma_X=\sqrt{{\rm Var}[X]}\,.$

Example (6-face fair die)

$X\equiv\text{'outcome of a roll of a 6-face fair die'}$ ${\mathbb E}[X]=1\cdot\frac{1}{6}+2\cdot\frac{1}{6}+3\cdot\frac{1}{6}+4\cdot\frac{1}{6}+5\cdot\frac{1}{6}+6\cdot\frac{1}{6}=3.5\,.$

set.seed(1)
x=sim(probspace(rolldie(1)),ntrials=1000)
mean(x$X1)
## [1] 3.488
sample((1:6),size=1000,replace=T)

${\mathbb E}[X^2]=1^2\cdot\frac{1}{6}+2^2\cdot\frac{1}{6}+3^2\cdot\frac{1}{6}+4^2\cdot\frac{1}{6}+5^2\cdot\frac{1}{6}+6^2\cdot\frac{1}{6}=15.1667\,,$ ${\rm Var}[X]={\mathbb E}[X^2]-{\mathbb E}[X]^2=2.9167\,,$ $\sigma_X=\sqrt{{\rm Var}[X]}=1.7078\,.$

var(x$X1)
## [1] 2.880737
sd(x$X1)
## [1] 1.697273

Median The median is the most central value with respect to the distribution of a random variable $$X$$ in the sense that $P(X\leq{\rm Me}_X)\geq 1/2\quad\text{and}\quad P(X\geq{\rm Me}_X)\geq 1/2\,.$

Example (6-face fair die) Any value in the interval $$[3,4]$$ is a median of the outcome of the die.

Properties of the median

• $${\rm Me}_{aX+b}=a{\rm Me}_X+b$$, for any $$a,b\in{\mathbb R}$$;
• $${\rm Me}_{g(X)}=g({\rm Me}_X)$$ if $$g$$ is monotone;
• $${\mathbb E}|X-{\rm Me}_X|=\min_{x\in{\mathbb R}}{\mathbb E}|X-x|$$.
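The last property can be seen numerically for the die: $${\mathbb E}|X-m|$$ is constant over the whole interval of medians $$[3,4]$$ and larger outside it. A base-R sketch (the evaluation points are arbitrary):

```r
x <- 1:6; p <- rep(1/6, 6)                  # fair-die support and pmf
mad_at <- function(m) sum(abs(x - m) * p)   # E|X - m|
sapply(c(2, 3, 3.5, 4, 5), mad_at)          # 11/6, 3/2, 3/2, 3/2, 11/6
```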

Quantiles For $$0<\alpha<1$$, the $$\alpha$$-quantile of random variable $$X$$ is a number $$q_\alpha$$ such that $P(X\leq q_\alpha)\geq \alpha\quad\text{and}\quad P(X\geq q_\alpha)\geq 1-\alpha\,.$

The quantile function of random variable $$X$$ is defined as $F^{-1}_X(\alpha)=\inf\{x:\,F(x)\geq\alpha\}.$ A quantile function defined like this has the following properties:

• $$\lim\limits_{\alpha\downarrow 0}F^{-1}_X(\alpha)=\inf X(S)$$;
• $$\lim\limits_{\alpha\uparrow 1}F^{-1}_X(\alpha)=\sup X(S)$$;
• it is nondecreasing;
• it is left-continuous.

Example (6-face fair die)

$$X\equiv$$ 'outcome of a roll of a 6-face fair die'

$F^{-1}(x)=\left\{\begin{array}{ll}1&\textrm{ if }\,\,\,\,\,\,0<x\leq 1/6\\2&\textrm{ if }1/6<x\leq 2/6\\3&\textrm{ if }2/6<x\leq 3/6\\4&\textrm{ if }3/6<x\leq 4/6\\5&\textrm{ if }4/6<x\leq 5/6\\6&\textrm{ if }5/6<x\leq 1\end{array}\right..$
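In base R, `quantile(..., type = 1)` computes exactly this left-continuous inverse of the empirical cdf, so applied to the six equally likely faces it reproduces the table above:

```r
# type = 1 is the inverse-ecdf definition of the quantile function
quantile(1:6, probs = c(1/6, 0.5, 5/6), type = 1)
# F^{-1}(1/6) = 1, F^{-1}(1/2) = 3, F^{-1}(5/6) = 5
```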

## 2.4 The Bernoulli process

A Bernoulli trial is a random experiment that can only result in two possible outcomes. Commonly we refer to these outcomes as success (S) and failure (F). The probability of success is denoted by $$0<p<1$$, and the experiment can be repeated independently as many times as needed.

### 2.4.1 Binomial distribution binom(size,prob)

Consider a Bernoulli trial with probability of success $$p$$ that is carried out independently $$n$$ times. A Binomial random variable $$X$$ with parameters $$n$$ and $$p$$ represents the number of trials that result in success.

$X\sim{\rm B}(n,p)$ $P(X=r)={n\choose r}p^r(1-p)^{n-r},\quad r\in\{0,1,\ldots,n\}$

dbinom(r,size=n,prob=p)

${\mathbb E}[X]=np\quad;\quad{\rm Var}[X]=np(1-p)$

If $$X\sim{\rm B}(n_1,p)$$ and $$Y\sim{\rm B}(n_2,p)$$ are indep., then $$X+Y\sim{\rm B}(n_1+n_2,p)$$.
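A quick simulation sketch of this additivity property (the sizes $$n_1=3$$, $$n_2=7$$ and the number of replications are arbitrary choices):

```r
set.seed(1)
x <- rbinom(10000, size = 3, prob = 0.4)   # X ~ B(3, 0.4)
y <- rbinom(10000, size = 7, prob = 0.4)   # Y ~ B(7, 0.4), independent of X
mean(x + y)                                # close to E[X+Y] = 10*0.4 = 4
ecdf(x + y)(4)                             # close to pbinom(4, size = 10, prob = 0.4)
```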

Binomial probability mass function dbinom

r=c(0:10)
barplot(dbinom(r,size=10,prob=0.4),
names.arg=r,col="light blue")

Binomial random observations generation rbinom

set.seed(1)
x=rbinom(1000,size=10,prob=0.4)
barplot(table(x)/1000,col="light blue")

Binomial cumulative distribution function pbinom

t=seq(-1,11,by=.01)
plot(t,pbinom(t,size=10,prob=0.4),type="l")

Binomial empirical cumulative distribution function

plot(ecdf(x))

Binomial ecdf and cdf

sum(x<=4)/1000
## [1] 0.632
ecdf(x)(4)
## [1] 0.632
pbinom(4,size=10,prob=0.4)
## [1] 0.6331033

Binomial quantile function

t=seq(0,1,by=.0001)
plot(t,qbinom(t,size=10,prob=0.4),type="l")

### 2.4.2 Geometric (Pascal’s) distribution geom(prob)

Consider a Bernoulli trial with probability of success $$p$$ that is repeated independently. The number of trials that result in failure before the first success follows a Geometric distribution with parameter $$p$$.

$X\sim{\mathcal G}(p)$

$P(X=r)=p(1-p)^r,\quad r\in\{0,1,2,\ldots\}$

${\mathbb E}[X]=\frac{1-p}{p}\quad ;\quad{\rm Var}[X]=\frac{1-p}{p^2}$
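These two formulas can be checked numerically with `dgeom` by truncating the infinite sums (the tail beyond $$r=200$$ is negligible for $$p=0.4$$):

```r
p <- 0.4; r <- 0:200                        # truncated support
m <- sum(r * dgeom(r, prob = p))            # ~ (1-p)/p   = 1.5
v <- sum(r^2 * dgeom(r, prob = p)) - m^2    # ~ (1-p)/p^2 = 3.75
c(m, v)
```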

Geometric probability mass function dgeom

r=c(0:10)
barplot(dgeom(r,prob=0.4),
names.arg=r,col="light blue")

### 2.4.3 Negative Binomial distribution nbinom(size,prob)

Consider a Bernoulli trial with probability of success $$p$$ that is repeated independently. The number of failures (trials that result in failure) before the $$k$$-th success follows a Negative Binomial distribution with parameters $$k$$ and $$p$$.

$X\sim{\rm NB}(k,p)$ $P(X=r)={r+k-1\choose r}p^k(1-p)^r,\quad r\in\{0,1,2,\ldots\}$ ${\mathbb E}[X]=\frac{k(1-p)}{p}\quad;\quad{\rm Var}[X]=\frac{k(1-p)}{p^2}$
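The pmf and mean formulas above match R's `dnbinom`, which also counts failures before the `size`-th success; the values $$k=3$$, $$p=0.4$$, $$r=2$$ below are an arbitrary illustration:

```r
k <- 3; p <- 0.4
dnbinom(2, size = k, prob = p)              # P(X = 2)
choose(2 + k - 1, 2) * p^k * (1 - p)^2      # same value from the formula: 0.13824
sum((0:300) * dnbinom(0:300, size = k, prob = p))  # ~ k(1-p)/p = 4.5
```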

## 2.5 Hypergeometric distribution hyper(m,n,k)

Consider a finite population with $$N_1+N_2$$ objects, such that $$N_1$$ are of type 1 and $$N_2$$ are of type 2. A total of $$k$$ objects are selected from the population without replacement. The number of objects of type 1 in the selection follows a Hypergeometric distribution with parameters $$N_1$$, $$N_2$$, and $$k$$.

$X\sim{\rm H}(N_1,N_2,k)$

$P(X=r)=\frac{{N_1\choose r}{N_2\choose k-r}}{{N_1+N_2\choose k}},\quad r\in\{\max\{0,k-N_2\},\ldots,\min\{k,N_1\}\}$ ${\mathbb E}[X]=\frac{kN_1}{N_1+N_2}\quad;\quad{\rm Var}[X]=k\cdot \frac{N_1N_2}{(N_1+N_2)^2}\cdot\frac{N_1+N_2-k}{N_1+N_2-1}$
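In R's `hyper(m,n,k)` parameterization, `m` is the number of type-1 objects ($$N_1$$), `n` the number of type-2 objects ($$N_2$$), and `k` the number of draws. The values $$N_1=5$$, $$N_2=15$$, $$k=4$$ below are an arbitrary illustration of the pmf formula:

```r
N1 <- 5; N2 <- 15; k <- 4
dhyper(2, m = N1, n = N2, k = k)                        # P(X = 2)
choose(N1, 2) * choose(N2, k - 2) / choose(N1 + N2, k)  # same value from the formula
```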

## 2.6 Poisson distribution pois(lambda)

The number of events that occur in a region of space (or time), independently of one another and at a constant rate $$\lambda>0$$, follows a Poisson distribution with parameter $$\lambda$$.

$X\sim{\mathcal P}(\lambda)$

$P(X=r)=e^{-\lambda}\frac{\lambda^r}{r!},\quad r\in\{0,1,2,\ldots\}$ ${\mathbb E}[X]=\lambda\quad;\quad{\rm Var}[X]=\lambda$

If $$X\sim{\mathcal P}(\lambda_1)$$ and $$Y\sim{\mathcal P}(\lambda_2)$$ are independent, then $$X+Y\sim{\mathcal P}(\lambda_1+\lambda_2)$$.
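A quick simulation sketch of this additivity property (the rates $$\lambda_1=1$$, $$\lambda_2=1.5$$ and the number of replications are arbitrary choices):

```r
set.seed(1)
x <- rpois(10000, lambda = 1)        # X ~ P(1)
y <- rpois(10000, lambda = 1.5)      # Y ~ P(1.5), independent of X
mean(x + y)                          # close to 1 + 1.5 = 2.5
dpois(0:3, lambda = 2.5)             # exact pmf of the sum at r = 0,...,3
```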

Poisson probability mass function $$\lambda=1.5$$

t=c(0:10)
barplot(dpois(t,lambda=1.5),
names.arg=t,col="light blue")

Poisson probability mass function $$\lambda=10$$

t=c(0:30)
barplot(dpois(t,lambda=10),
names.arg=t,col="light blue")

Discrete distributions in R

| Distribution | R command |
|---|---|
| Binomial, $${\rm B}(n,p)$$ | binom(size,prob) |
| Geometric, $${\mathcal G}(p)$$ | geom(prob) |
| Negative Binomial, $${\rm NB}(k,p)$$ | nbinom(size,prob) |
| Hypergeometric, $${\rm H}(N_1,N_2,k)$$ | hyper(m,n,k) |
| Poisson, $${\mathcal P}(\lambda)$$ | pois(lambda) |

| Function | R prefix |
|---|---|
| probability mass function | d |
| cumulative probability | p |
| quantile function | q |
| random numbers | r |