cov (x)
cov (x, opt)
cov (x, y)
cov (x, y, opt)
Compute the covariance matrix.
If each row of x and y is an observation, and each column is
a variable, then the (i, j)-th entry of cov (x, y) is the covariance
between the i-th variable in x and the j-th variable in y.
cov (x, y) = 1/(N-1) * SUM_i (x(i) - mean (x)) * (y(i) - mean (y))
where N is the length of the x and y vectors.
If called with one argument, compute cov (x, x), the covariance
between the columns of x.
The argument opt determines the type of normalization to use. Valid values are
0
normalize with N-1; this provides the best unbiased estimator of the covariance [default]
1
normalize with N; this provides the second moment around the mean
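The definition and the two normalizations can be checked numerically. The following is a minimal pure-Python sketch of the formula (not Octave code, and not how Octave implements cov); the helper name cov_columns and the sample data are hypothetical, chosen only for illustration.

```python
def cov_columns(x, y=None, opt=0):
    """Covariance between the columns of x and the columns of y.

    x and y are lists of rows (one observation per row). opt=0 divides
    by N-1 and opt=1 divides by N, mirroring the description above.
    """
    if y is None:
        y = x  # one-argument form: covariance between the columns of x
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same number of observations")
    denom = n - 1 if opt == 0 else n
    # Column means of each input.
    mean_x = [sum(col) / n for col in zip(*x)]
    mean_y = [sum(col) / n for col in zip(*y)]
    # Entry (i, j) sums the products of the deviations of column i of x
    # and column j of y over all observations.
    return [
        [
            sum((rx[i] - mean_x[i]) * (ry[j] - mean_y[j])
                for rx, ry in zip(x, y)) / denom
            for j in range(len(mean_y))
        ]
        for i in range(len(mean_x))
    ]


x = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
print(cov_columns(x))         # N-1 normalization: [[1.0, 3.5], [3.5, 13.0]]
print(cov_columns(x, opt=1))  # N normalization: each entry scaled by 2/3
```

With three observations, switching from N-1 to N multiplies every entry by (N-1)/N = 2/3.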
Compatibility Note: Octave always treats rows of x and y
as multivariate random variables.
For two inputs, however, MATLAB treats x and y as two
univariate distributions regardless of their shapes, and will calculate
cov ([x(:), y(:)]) whenever the number of elements in x and y is equal.
This will result in a 2x2 matrix.
Code relying on MATLAB’s definition will need to be changed when
running in Octave.
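To make the MATLAB convention concrete, here is a hedged pure-Python sketch (not Octave or MATLAB code) of what cov ([x(:), y(:)]) amounts to: both inputs are stacked column by column into single vectors, and the result is always 2x2. The helper names flatten_column_major and cov_two_vectors are hypothetical.

```python
def flatten_column_major(m):
    """Stack the columns of a list-of-rows matrix, as x(:) does."""
    return [v for col in zip(*m) for v in col]


def cov_two_vectors(x, y):
    """2x2 covariance matrix of x and y treated as two univariate samples."""
    xf = flatten_column_major(x)
    yf = flatten_column_major(y)
    if len(xf) != len(yf):
        raise ValueError("x and y must have the same number of elements")
    n = len(xf)
    mx = sum(xf) / n
    my = sum(yf) / n

    def c(a, ma, b, mb):
        # Sample covariance with N-1 normalization.
        return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)

    return [[c(xf, mx, xf, mx), c(xf, mx, yf, my)],
            [c(yf, my, xf, mx), c(yf, my, yf, my)]]


x = [[1.0, 2.0], [3.0, 4.0]]
y = [[2.0, 1.0], [4.0, 3.0]]
result = cov_two_vectors(x, y)
print(len(result), len(result[0]))  # 2 2, regardless of the input shapes
print(result[0][1])                 # covariance between the stacked vectors
```

The diagonal holds the variances of the two stacked vectors, and the off-diagonal entries hold their covariance.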
See also: corr.
corr (x)
corr (x, y)
Compute a matrix of correlation coefficients.
If each row of x and y is an observation and each column is
a variable, then the (i, j)-th entry of corr (x, y) is the correlation
between the i-th variable in x and the j-th variable in y.
corr (x, y) = cov (x, y) / (std (x) * std (y))
If called with one argument, compute corr (x, x), the correlation
between the columns of x.
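For two plain vectors, the relation between correlation, covariance, and standard deviation can be sketched as follows. This is an illustrative pure-Python example (not Octave code); the helper name corr_vectors and the data are hypothetical.

```python
import math


def corr_vectors(x, y):
    """Correlation of two equal-length vectors: cov / (std_x * std_y)."""
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    mx = sum(x) / n
    my = sum(y) / n
    # Sample covariance and standard deviations, all normalized by N-1.
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    return cov_xy / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]
print(corr_vectors(x, y))                # close to, but below, 1
print(corr_vectors(x, [-v for v in x]))  # perfectly anti-correlated
```

Because the same normalization appears in the numerator and the denominator, the result does not depend on whether N-1 or N is used.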
See also: cov.
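r = corrcoef (x)
r = corrcoef (x, y)
r = corrcoef (…, param, value, …)
[r, p] = corrcoef (…)
[r, p, lci, hci] = corrcoef (…)
Compute a matrix of correlation coefficients.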
x is an array where each column contains a variable and each row is an observation.
If a second input y (of the same size as x) is given then calculate the correlation coefficients between x and y.
param, value are optional pairs of parameters and values which modify the calculation. Valid options are:
"alpha"
Significance level used for the bounds of the confidence interval, lci and hci; the confidence level is 1 - alpha. Default is 0.05, i.e., a 95% confidence interval.
"rows"
Determine processing of NaN values. Acceptable values are "all",
"complete", and "pairwise". Default is "all".
With "complete", only the rows without NaN values are considered.
With "pairwise", the selection of NaN-free rows is made for each
pair of variables.
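The difference between "complete" and "pairwise" is only in which rows are kept before the coefficients are computed. The pure-Python sketch below illustrates that row selection under the description above; it is not Octave code and does not claim to reproduce corrcoef's implementation. The helper names complete_rows and pairwise_rows are hypothetical.

```python
import math


def complete_rows(data):
    """Keep only rows that contain no NaN in any column."""
    return [row for row in data if not any(math.isnan(v) for v in row)]


def pairwise_rows(data, i, j):
    """Keep rows in which columns i and j are both NaN-free."""
    return [row for row in data
            if not (math.isnan(row[i]) or math.isnan(row[j]))]


nan = float("nan")
data = [
    [1.0, 2.0, 3.0],
    [2.0, nan, 1.0],
    [3.0, 5.0, nan],
    [4.0, 7.0, 8.0],
]
print(len(complete_rows(data)))        # 2 rows usable for every pair
print(len(pairwise_rows(data, 0, 1)))  # 3 rows usable for columns 0 and 1
print(len(pairwise_rows(data, 1, 2)))  # 2 rows usable for columns 1 and 2
```

Pairwise selection can therefore use more observations for some pairs than complete selection does, at the cost of each coefficient being based on a different subset of rows.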
Output r is a matrix of Pearson’s product moment correlation coefficients for each pair of variables.
Output p is a matrix of pairwise p-values for testing the null hypothesis that the correlation coefficient is zero.
Outputs lci and hci are matrices containing, respectively, the lower and upper bounds of the confidence interval of each correlation coefficient (95% by default; see "alpha").
rho = spearman (x)
rho = spearman (x, y)
Compute Spearman’s rank correlation coefficient rho.
For two data vectors x and y, Spearman’s rho is the correlation coefficient of the ranks of x and y.
If x and y are drawn from independent distributions, rho has zero mean
and variance 1 / (N - 1), where N is the length of the x and y vectors,
and is asymptotically normally distributed.
spearman (x) is equivalent to spearman (x, x).
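The definition of rho as the correlation coefficient of the ranks can be sketched in pure Python as below (not Octave code). The helper names are hypothetical, and giving tied values the average of their ranks is an assumption of this sketch.

```python
import math


def ranks(v):
    """1-based ranks of v; tied values share the average of their ranks."""
    order = sorted(range(len(v)), key=lambda k: v[k])
    r = [0.0] * len(v)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and v[order[end + 1]] == v[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            r[k] = avg
        pos = end + 1
    return r


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a)
                    * sum((q - mb) ** 2 for q in b))
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    return pearson(ranks(x), ranks(y))


x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.0, 8.0, 27.0, 64.0, 125.0]  # monotone but nonlinear in x
print(spearman_rho(x, y))          # ranks agree exactly
print(1 / (len(x) - 1))            # variance of rho under independence
```

Because only the ranks matter, any strictly increasing transformation of x or y leaves rho unchanged.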
tau = kendall (x)
tau = kendall (x, y)
Compute Kendall’s tau.
For two data vectors x, y of common length N, Kendall’s tau is the correlation of the signs of all rank differences of x and y; i.e., if both x and y have distinct entries, then
tau = 1/(N * (N-1)) * SUM_{i,j} sign (q(i) - q(j)) * sign (r(i) - r(j))
in which the q(i) and r(i) are the ranks of x and y, respectively.
If x and y are drawn from independent distributions, Kendall’s tau is
asymptotically normal with mean 0 and variance
(2 * (2N+5)) / (9 * N * (N-1)).
kendall (x) is equivalent to kendall (x, x).
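The double sum in the definition of tau, and the variance under independence, can be evaluated directly for small inputs. The following pure-Python sketch (not Octave code) assumes that x and y each have distinct entries, in which case the sign of a rank difference equals the sign of the corresponding value difference. The helper names are hypothetical.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau for vectors with distinct entries.

    Sums sign products over all ordered pairs (i, j) and divides by
    N * (N-1), following the formula above.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same length")
    total = sum(sign(x[i] - x[j]) * sign(y[i] - y[j])
                for i in range(n) for j in range(n))
    return total / (n * (n - 1))


def tau_null_variance(n):
    """Asymptotic variance of tau for independent x and y."""
    return (2 * (2 * n + 5)) / (9 * n * (n - 1))


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 4.0]  # one swapped pair relative to x
print(kendall_tau(x, y))
print(tau_null_variance(len(x)))
```

In this example, 5 of the 6 unordered pairs are concordant and 1 is discordant, so tau equals (5 - 1) / 6 = 2/3.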