This LaTeX document is available as postscript or as Adobe PDF.

Statistics Background
L. R. Schaeffer, March 1999

Random Variables

A random variable is a real-valued function defined on a sample space. A random variable is designated by a capital letter, say Y, and the value of Y, depending on the outcome of the experiment, is denoted by a small letter, say y. The sample space determines the range of values that Y can be assigned.

Random variables can be either discrete or continuous. A discrete random variable can assume only a finite number of distinct values, such as zero or one. A continuous random variable can assume any value within the range of the sample space.

Discrete Random Variables

In the discrete case, the probability that Y takes the value y is defined as the sum of the probabilities of all sample points that are assigned the value y. That is,

P(Y=y) = p(y).

The probability distribution of Y lists the probabilities for each value of y. Suppose Y can take on four values with the following probabilities:

  y    p(y)
  0    1/8
  1    1/4
  2    1/4
  3    3/8

Any other values of y are assumed to have p(y) = 0, and the sum of the probabilities is 1. The expected value of a discrete random variable is defined as

E(Y) = Σ_y y p(y),

where the sum is over all values of y.

For the example above,

E(Y) = ( 0 (1/8) + 1 (1/4) + 2 (1/4) + 3 (3/8) ) = 1.875.

Similarly, the expected value of a function of Y, say g(Y), is given by

E[g(Y)] = Σ_y g(y) p(y).

Suppose g(Y) = Y²; then

E(Y²) = ( 0 (1/8) + 1 (1/4) + 4 (1/4) + 9 (3/8) ) = 4.625.

The variance of discrete random variable Y is

Var(Y) = E[(Y - E(Y))²] = E(Y²) - [E(Y)]².

For the example,

Var(Y) = 4.625 - (1.875)² = 1.109375.
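The discrete example can be checked directly; a minimal Python sketch using the probabilities from the table above:

```python
# Probability distribution from the example: p(0)=1/8, p(1)=1/4, p(2)=1/4, p(3)=3/8
p = {0: 1/8, 1: 1/4, 2: 1/4, 3: 3/8}

def expect(g, p):
    """E[g(Y)] = sum of g(y) * p(y) over all values y."""
    return sum(g(y) * pr for y, pr in p.items())

mean = expect(lambda y: y, p)        # E(Y) = 1.875
second = expect(lambda y: y**2, p)   # E(Y^2) = 4.625
variance = second - mean**2          # Var(Y) = 1.109375
```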

Binomial Distribution

A common discrete distribution is the binomial distribution. A binomial event can take on only two possible outcomes, success or failure, zero or one, heads or tails, diseased or not diseased, and so on. The probability of one outcome is q and the probability of the other outcome is 1-q. Trials, or a succession of binomial events, are assumed to be independent. The random variable Y is the number of successes. The probability distribution is given by

p(y) = (n choose y) q^y (1 - q)^(n-y),

for y = 0, 1, ..., n and 0 ≤ q ≤ 1. The number of trials is n. The expected value and variance of the binomial distribution are

E(Y) = nq and Var(Y) = nq(1 - q).
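The mean and variance formulas can be confirmed numerically; a short Python sketch with illustrative values n = 10 and q = 0.3 (not from the text):

```python
import math

def binom_pmf(y, n, q):
    # P(Y = y) = C(n, y) * q^y * (1 - q)^(n - y)
    return math.comb(n, y) * q**y * (1 - q)**(n - y)

n, q = 10, 0.3                      # illustrative values
pmf = [binom_pmf(y, n, q) for y in range(n + 1)]

mean = sum(y * p for y, p in enumerate(pmf))
var = sum(y**2 * p for y, p in enumerate(pmf)) - mean**2
# mean matches n*q = 3.0 and var matches n*q*(1-q) = 2.1
```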

Poisson Distribution

A Poisson probability distribution provides a good model for the probability distribution of the number Y of rare events that occur in a given space, time, volume, or any other dimension, where λ is the average value of Y. An example in animal breeding might be the number of quality embryos produced by a cow during superovulation, which can range from 0 to 20 (or more). The Poisson probability distribution is given by

p(y) = λ^y e^(-λ) / y!,

for y = 0, 1, 2, ... and λ > 0. Also,

E(Y) = Var(Y) = λ.
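The equality of the Poisson mean and variance can be checked numerically by summing the distribution far into its tail; a Python sketch with an illustrative average of λ = 4:

```python
import math

def poisson_pmf(y, lam):
    # P(Y = y) = lam^y * exp(-lam) / y!
    return lam**y * math.exp(-lam) / math.factorial(y)

lam = 4.0   # illustrative average, e.g. quality embryos per superovulation
# truncate the infinite sum far into the tail
pmf = [poisson_pmf(y, lam) for y in range(100)]

mean = sum(y * p for y, p in enumerate(pmf))
var = sum(y**2 * p for y, p in enumerate(pmf)) - mean**2
# both mean and var match lam = 4.0
```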

General Results

Expectations

If Y represents a random variable from some defined population, then the expectation of Y is denoted by

E(Y) = μ,

where E means expected value. Now let y and z represent vectors of random variables (or random vector variables), let k represent a scalar constant, and let A be a matrix of constants; then

1. E(k) = k,
2. E(kY) = k E(Y),
3. E(ky) = k E(y),
4. E(y + z) = E(y) + E(z),
5. E(Ay) = A E(y),

and if E(y) = μ, then E(Ay) = Aμ.

The mean of a population is also known as the first moment of the distribution. The exact form of the distribution for a random variable will determine the form of the estimator of the mean and of other parameters of the distribution.
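Because averaging is itself a linear operation, the sample mean of Ay equals A times the sample mean of y exactly, not just approximately. A minimal Python sketch with an arbitrary matrix A (all values illustrative):

```python
import random
random.seed(1)

def matvec(A, v):
    # ordinary matrix-vector product
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1.0, 2.0], [0.0, -1.0]]    # arbitrary matrix of constants
ys = [[random.gauss(5, 1), random.gauss(-2, 3)] for _ in range(1000)]

# sample mean of y, and sample mean of Ay
ybar = [sum(y[j] for y in ys) / len(ys) for j in range(2)]
Ay_bar = [sum(matvec(A, y)[j] for y in ys) / len(ys) for j in range(2)]
# Ay_bar and matvec(A, ybar) agree up to float rounding: E(Ay) = A E(y)
```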

Variance-Covariance Matrices

The variance of a scalar random variable, Y, is defined as

Var(Y) = E(Y²) - E(Y)E(Y) = E[(Y - E(Y))²],

and is commonly represented as σ². With two scalar random variables, say Y1 and Y2, the covariance between them is defined as

Cov(Y1, Y2) = E(Y1Y2) - E(Y1)E(Y2),

and is represented as σ12. These definitions can be extended to a vector of random variables, y, to give a variance-covariance matrix of order equal to the length of y.

A variance-covariance (VCV) matrix of a random vector contains variances on the diagonals and covariances on the off-diagonals. A VCV matrix is square, symmetric and should always be positive definite or positive semi-definite. Another commonly used name for VCV matrix is dispersion matrix.

Let A be a matrix of constants conformable for multiplication with the vector y; then

Var(Ay) = A Var(y) A'.

If we have two sets of functions of y, say A1y and A2y, then

Cov(A1y, A2y) = A1 Var(y) A2'.

Similarly, if we have two functions of different random vectors, say Ax and By, and Cov(x, y) = C, then

Cov(Ax, By) = ACB'.
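For a single linear function a'y, the rule Var(a'y) = a'Va can be checked exactly against a small discrete joint distribution; a Python sketch (the joint probabilities and the vector a are illustrative):

```python
# Exact check of Var(a'y) = a'Va for a small discrete random vector y.
# Illustrative joint distribution over y = (y1, y2).
pts = [((0, 0), 0.25), ((1, 0), 0.25), ((0, 1), 0.25), ((1, 2), 0.25)]

def E(f):
    # expectation over the discrete joint distribution
    return sum(f(y) * p for y, p in pts)

mu = [E(lambda y, j=j: y[j]) for j in range(2)]
V = [[E(lambda y, i=i, j=j: y[i] * y[j]) - mu[i] * mu[j]
      for j in range(2)] for i in range(2)]

a = [2.0, -1.0]
lin = lambda y: sum(ai * yi for ai, yi in zip(a, y))
var_direct = E(lambda y: lin(y) ** 2) - E(lin) ** 2        # Var(a'y) directly
var_matrix = sum(a[i] * V[i][j] * a[j]                      # a'Va
                 for i in range(2) for j in range(2))
# var_direct and var_matrix agree
```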

Continuous Distributions

Consider measuring the amount of milk given by a dairy cow at a particular milking. Even if a machine of perfect accuracy were used, the amount of milk would be a unique point on a continuum of possible values, such as 32.35769842.... kg of milk. As such it is mathematically impossible to assign a nonzero probability to all of the infinite possible points in the continuum. Thus, a different method of describing a probability distribution of a continuous random variable must be used. The sum of the probabilities (if they could be assigned) through the continuum is still assumed to sum to 1. The cumulative distribution function of a random variable is

F(y) = P(Y ≤ y),

for -∞ < y < +∞. As y approaches -∞, F(y) approaches 0. As y approaches +∞, F(y) approaches 1. Thus, F(y) is said to be a nondecreasing function of y. If a < b, then F(a) ≤ F(b).

If F(y) is the cumulative distribution function of Y, then the probability density function of Y is given by

f(y) = dF(y)/dy,

wherever the derivative exists. For f(y) to be a probability density function, always

f(y) ≥ 0 and ∫_{-∞}^{+∞} f(y) dy = 1.

Conversely,

F(y) = ∫_{-∞}^{y} f(t) dt.
The expected value of a continuous random variable Y is

E(Y) = ∫_{-∞}^{+∞} y f(y) dy,

provided that the integral exists. If g(Y) is a function of Y, then

E[g(Y)] = ∫_{-∞}^{+∞} g(y) f(y) dy,

provided that the integral exists. Finally,

Var(Y) = E(Y²) - [E(Y)]².

The Uniform Distribution

The basis for the majority of random number generators is a uniform distribution. A random variable Y has a continuous uniform probability distribution on the interval (a, b) if and only if the density function of Y is

f(y) = 1/(b - a) for a ≤ y ≤ b, and f(y) = 0 elsewhere.

The parameters of the density function are a and b. Also,

E(Y) = (a + b)/2 and Var(Y) = (b - a)²/12.
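The uniform mean and variance can be recovered by integrating against the density numerically; a Python midpoint-rule sketch with an illustrative interval (a, b) = (2, 5):

```python
a, b = 2.0, 5.0              # illustrative interval
f = 1.0 / (b - a)            # uniform density on [a, b]
n = 100_000
h = (b - a) / n

mean = second = 0.0
for i in range(n):           # midpoint rule for E(Y) and E(Y^2)
    y = a + (i + 0.5) * h
    mean += y * f * h
    second += y * y * f * h
var = second - mean ** 2
# mean ~ (a + b)/2 = 3.5, var ~ (b - a)^2/12 = 0.75
```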

The Normal Distribution

A random variable Y has a normal probability distribution if and only if

f(y) = (2πσ²)^(-1/2) exp[-(y - μ)²/(2σ²)],

for -∞ < y < +∞, where σ² is the variance of Y and μ is the expected value of Y.

For the random vector y, the multivariate normal density function is

f(y) = (2π)^(-n/2) |V|^(-1/2) exp[-½(y - μ)'V^(-1)(y - μ)],

denoted as y ~ N(μ, V), where V is the variance-covariance matrix of y and n is the length of y. Note that the determinant of V must be positive, otherwise the density function is undefined.
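A minimal Python sketch of the density for the order-2 case; with μ = 0 and V = I the formula should reduce to the product of two standard normal densities, 1/(2π) at the origin:

```python
import math

def mvn2_density(y, mu, V):
    # order-2 multivariate normal density:
    # f(y) = (2 pi)^(-n/2) |V|^(-1/2) exp(-0.5 (y - mu)' V^(-1) (y - mu))
    det = V[0][0] * V[1][1] - V[0][1] * V[1][0]   # must be positive
    inv = [[ V[1][1] / det, -V[0][1] / det],
           [-V[1][0] / det,  V[0][0] / det]]
    d = [y[0] - mu[0], y[1] - mu[1]]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return (2 * math.pi) ** -1.0 * det ** -0.5 * math.exp(-0.5 * q)

val = mvn2_density([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```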

Chi-Square Distribution

• If y ~ N(0, I), then y'y ~ χ²_n, where χ²_n denotes a central chi-square distribution with n degrees of freedom and n is the length of the vector y. The mean of the central chi-square distribution is n, and the variance is 2n. The probability distribution function is

f(v) = v^((n-2)/2) e^(-v/2) / [2^(n/2) Γ(n/2)],

for v = y'y > 0.

• If y ~ N(μ, I), then y'y ~ χ²_n(λ), where λ is the noncentrality parameter, which is equal to ½μ'μ. The mean of a noncentral chi-square distribution is n + 2λ and the variance is 2n + 8λ.

• If y ~ N(μ, V), then y'Qy has a noncentral chi-square distribution only if QV is idempotent, i.e. (QV)(QV) = QV. The noncentrality parameter is λ = ½μ'Qμ, and the mean and variance of the distribution are tr(QV) + 2λ and 2 tr(QV) + 8λ, respectively.

• If there are two quadratic forms of y, say y'Qy and y'Py, and both quadratic forms have chi-square distributions, then the two quadratic forms are independent if QVP = 0. Independence of quadratic forms is necessary for the construction of valid tests of hypotheses.
The chi-square distribution is used in hypothesis testing, for example in contingency tables. It is also used as a prior distribution for variances in Bayesian analyses. Chi-square variables are components of the following two distributions.
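The first result above can be checked by simulation; a hedged Python sketch (the degrees of freedom, sample size, and seed are arbitrary):

```python
import random
random.seed(42)

n, reps = 5, 20_000
# y'y with y ~ N(0, I) of length n is central chi-square with n d.f.
draws = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
         for _ in range(reps)]

mean = sum(draws) / reps
var = sum(d * d for d in draws) / reps - mean ** 2
# theory: mean = n = 5, variance = 2n = 10 (approximately, by simulation)
```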

The t-distribution

The t-distribution is based on the ratio of two independent random variables. The first is from a univariate normal distribution, and the second is from a central chi-square distribution. Let x ~ N(0, 1) and u ~ χ²_n, with x and u being independent; then

t = x / (u/n)^(1/2)

has a t-distribution with n degrees of freedom. The mean of a t-distribution is the mean of the x variable, and the variance is n/(n - 2), where n is the degrees of freedom of the distribution.
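The construction of a t variable from a normal and an independent chi-square can be simulated directly; a Python sketch with illustrative n = 10 (so the variance should be near 10/8 = 1.25):

```python
import math
import random
random.seed(7)

n, reps = 10, 50_000

def t_draw(n):
    x = random.gauss(0.0, 1.0)                              # N(0, 1)
    u = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))  # chi-square, n d.f.
    return x / math.sqrt(u / n)                             # t with n d.f.

ts = [t_draw(n) for _ in range(reps)]
mean = sum(ts) / reps
var = sum(t * t for t in ts) / reps - mean ** 2
# theory: mean = 0, variance = n/(n - 2) = 1.25 (approximately)
```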

The F-distribution

The central F-distribution is based on the ratio of two independent central chi-square variables. Let u ~ χ²_n and w ~ χ²_m, with u and w being independent; then

F = (u/n) / (w/m)

has an F-distribution with n and m degrees of freedom. The mean of the F-distribution is m/(m - 2), and the variance is

Var(F) = 2m²(n + m - 2) / [n(m - 2)²(m - 4)].
Tables of F-values have been constructed for various probability levels as criteria to test if the numerator chi-square variable has a noncentral chi-square distribution. If the calculated F-value is greater than the value in the tables, then u is implied to have a noncentral chi-square distribution, otherwise we assume that u has a central chi-square distribution.

The square of a t-distribution variable gives a variable that has an F-distribution with 1 and n degrees of freedom.

Noncentral F-distributions exist depending on whether the numerator or denominator variables have noncentral chi-square distributions. Tables for noncentral F-distributions generally do not exist because of the difficulty in predicting the noncentrality parameters. However, using random chi-square generators it is possible to numerically calculate an expected noncentral F value for specific situations. When both the numerator and denominator chi-square variables are from noncentral distributions, then their ratio follows a doubly noncentral F-distribution.
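The central F-ratio construction can also be checked by simulation; a Python sketch with illustrative degrees of freedom n = 10 and m = 20 (so the mean should be near 20/18):

```python
import random
random.seed(11)

n, m, reps = 10, 20, 50_000

def chi2(k):
    # sum of k squared standard normals: central chi-square with k d.f.
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

fs = [(chi2(n) / n) / (chi2(m) / m) for _ in range(reps)]
mean = sum(fs) / reps
# theory: mean = m/(m - 2) = 20/18 (approximately)
```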

Quadratic Forms

A quadratic form is a sum of squares of elements of a vector. The general form is y'Qy, where y is a vector of random variables, and Q is a regulator matrix. The regulator matrix can take on various forms and values depending on the situation. Usually Q is a symmetric matrix. Examples of different Q matrices are as follows:

1. Q = I, then y'Qy = y'y, which is a total sum of squares of the elements in y.

2. Q = J(1/n), then y'Qy = (1'y)²/n, where n is the length of y. Note that J = 11', so that y'Jy = (1'y)(1'y) and 1'y is the sum of the elements in y.

3. Q = (I - J(1/n))(1/(n - 1)), then y'Qy gives the variance of the elements in y.
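The three regulator matrices can be applied to a small illustrative vector; a Python sketch (y = (1, 2, 3, 4) is arbitrary):

```python
# The three regulator matrices applied to an illustrative vector y
y = [1.0, 2.0, 3.0, 4.0]
n = len(y)

def quad(Q, y):
    # y'Qy = sum over i, j of y_i * Q_ij * y_j
    return sum(y[i] * Q[i][j] * y[j] for i in range(n) for j in range(n))

I = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
J = [[1.0] * n for _ in range(n)]                          # matrix of ones

total_ss = quad(I, y)                                  # y'y = 30
mean_sq = quad([[v / n for v in row] for row in J], y) # (sum y)^2 / n = 25
Q3 = [[(I[i][j] - J[i][j] / n) / (n - 1) for j in range(n)]
      for i in range(n)]
sample_var = quad(Q3, y)                               # variance of the elements
```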

The expected value of a quadratic form is

E(y'Qy) = E[tr(y'Qy)] = E[tr(Qyy')] = tr(Q E(yy')).

However,

Var(y) = E(yy') - E(y)E(y)',

so that

E(yy') = Var(y) + E(y)E(y)',

then

E(y'Qy) = tr(Q Var(y)) + E(y)'Q E(y).

If we let E(y) = μ and Var(y) = V, then

E(y'Qy) = tr(QV) + μ'Qμ.

The expectation of a quadratic form does not depend on the distribution of y. However, the variance of a quadratic form requires that y follows a multivariate normal distribution. Without showing the derivation, the variance of a quadratic form, assuming y has a multivariate normal distribution, is

Var(y'Qy) = 2 tr(QVQV) + 4μ'QVQμ.

The quadratic form, y'Qy, has a chi-square distribution if

QVQV = QV,

or the single condition that QV is idempotent. Then if

y ~ N(μ, V),

the expected value of y'Qy is tr(QV) + μ'Qμ and the variance is 2 tr(QV) + 4μ'Qμ, which are the usual results for a noncentral chi-square variable.
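The mean of a quadratic form, E(y'Qy) = tr(QV) + μ'Qμ, can be checked by simulation for a simple case; a Python sketch where Q, μ, and the diagonal V are illustrative, chosen so independent normal draws suffice:

```python
import random
random.seed(3)

# E(y'Qy) = tr(QV) + mu'Q mu, checked by simulation with
# mu = (1, 2) and V = diag(4, 9) (illustrative values)
Q = [[2.0, 1.0], [1.0, 3.0]]
mu, sd = [1.0, 2.0], [2.0, 3.0]
reps = 200_000

def quad(Q, y):
    return sum(y[i] * Q[i][j] * y[j] for i in range(2) for j in range(2))

avg = sum(quad(Q, [random.gauss(m, s) for m, s in zip(mu, sd)])
          for _ in range(reps)) / reps

tr_QV = Q[0][0] * sd[0] ** 2 + Q[1][1] * sd[1] ** 2  # tr(QV) = 35, V diagonal
mu_Q_mu = quad(Q, mu)                                 # mu'Q mu = 18
# theory: avg is near tr_QV + mu_Q_mu = 53 (approximately)
```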

The covariance between two quadratic forms, say y'Qy and y'Py, is

Cov(y'Qy, y'Py) = 2 tr(QVPV) + 4μ'QVPμ.

The covariance is zero if QVP = 0, in which case the two quadratic forms are said to be independent.

Bilinear Forms

A bilinear form is represented as x'Ay, where x and y are two different random vectors, possibly of different lengths, and A is the regulator matrix. If E(x) = μx and E(y) = μy, with Cov(x, y) = C, then

E(x'Ay) = tr(AC') + μx'Aμy,

and if C = 0, then

E(x'Ay) = μx'Aμy.

Usually, the lengths of x and y are equal and A is symmetric. If A is also positive definite, then x'Ay is a sum of cross-products. Bilinear forms occur in the estimation of covariances. Note that a bilinear form may be written as a quadratic form as follows: Let

w = (x' y')',

then

Q = [ 0  A ; A'  0 ],

and

x'Ay = ½ w'Qw.
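The stacking identity can be verified numerically; a Python sketch in which x, y, and A are illustrative (the factor ½ is folded into the regulator matrix):

```python
# Bilinear form x'Ay rewritten as a quadratic form in the stacked vector w
x = [1.0, 2.0]
y = [3.0, -1.0]
A = [[1.0, 0.5], [0.5, 2.0]]

bilinear = sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

w = x + y                       # w = (x', y')', length 4
# Q = 0.5 * [[0, A], [A', 0]] so that w'Qw = x'Ay
Q = [[0.0, 0.0, 0.5 * A[0][0], 0.5 * A[0][1]],
     [0.0, 0.0, 0.5 * A[1][0], 0.5 * A[1][1]],
     [0.5 * A[0][0], 0.5 * A[1][0], 0.0, 0.0],
     [0.5 * A[0][1], 0.5 * A[1][1], 0.0, 0.0]]
quadratic = sum(w[i] * Q[i][j] * w[j] for i in range(4) for j in range(4))
# quadratic equals bilinear
```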


Larry Schaeffer
1999-02-26