Next: , Previous: Correlation, Up: Statistics


21.7 Weighted Samples

The functions described in this section allow the computation of statistics for weighted samples. The functions accept an array of samples, x_i, with associated weights, w_i. Each sample x_i is considered as having been drawn from a Gaussian distribution with variance \sigma_i^2. The sample weight w_i is defined as the reciprocal of this variance, w_i = 1/\sigma_i^2. Setting a weight to zero corresponds to removing a sample from a dataset.

— Function: double gsl_stats_wmean (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

This function returns the weighted mean of the dataset data with stride stride and length n, using the set of weights w with stride wstride and length n. The weighted mean is defined as,

          \Hat\mu = (\sum w_i x_i) / (\sum w_i)
— Function: double gsl_stats_wvariance (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

This function returns the estimated variance of the dataset data with stride stride and length n, using the set of weights w with stride wstride and length n. The estimated variance of a weighted dataset is calculated as,

          \Hat\sigma^2 = ((\sum w_i)/((\sum w_i)^2 - \sum (w_i^2)))
                          \sum w_i (x_i - \Hat\mu)^2

Note that this expression reduces to an unweighted variance with the familiar 1/(N-1) factor when there are N equal non-zero weights.

— Function: double gsl_stats_wvariance_m (const double w[], size_t wstride, const double data[], size_t stride, size_t n, double wmean)

This function returns the estimated variance of the weighted dataset data using the given weighted mean wmean.

— Function: double gsl_stats_wsd (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

The standard deviation is defined as the square root of the variance. This function returns the square root of the corresponding variance function gsl_stats_wvariance above.

— Function: double gsl_stats_wsd_m (const double w[], size_t wstride, const double data[], size_t stride, size_t n, double wmean)

This function returns the square root of the corresponding variance function gsl_stats_wvariance_m above.

— Function: double gsl_stats_wvariance_with_fixed_mean (const double w[], size_t wstride, const double data[], size_t stride, size_t n, const double mean)

This function computes an unbiased estimate of the variance of the weighted dataset data when the population mean mean of the underlying distribution is known a priori. In this case the estimator for the variance replaces the sample mean \Hat\mu by the known population mean \mu,

          \Hat\sigma^2 = (\sum w_i (x_i - \mu)^2) / (\sum w_i)
— Function: double gsl_stats_wsd_with_fixed_mean (const double w[], size_t wstride, const double data[], size_t stride, size_t n, const double mean)

The standard deviation is defined as the square root of the variance. This function returns the square root of the corresponding variance function above.

— Function: double gsl_stats_wtss (const double w[], const size_t wstride, const double data[], size_t stride, size_t n)
— Function: double gsl_stats_wtss_m (const double w[], const size_t wstride, const double data[], size_t stride, size_t n, double wmean)

These functions return the weighted total sum of squares (TSS) of data about the weighted mean. For gsl_stats_wtss_m the user-supplied value of wmean is used, and for gsl_stats_wtss it is computed using gsl_stats_wmean.

          TSS =  \sum w_i (x_i - wmean)^2
— Function: double gsl_stats_wabsdev (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

This function computes the weighted absolute deviation from the weighted mean of data. The absolute deviation from the mean is defined as,

          absdev = (\sum w_i |x_i - \Hat\mu|) / (\sum w_i)
— Function: double gsl_stats_wabsdev_m (const double w[], size_t wstride, const double data[], size_t stride, size_t n, double wmean)

This function computes the absolute deviation of the weighted dataset data about the given weighted mean wmean.

— Function: double gsl_stats_wskew (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

This function computes the weighted skewness of the dataset data.

          skew = (\sum w_i ((x_i - \Hat x)/\Hat \sigma)^3) / (\sum w_i)
— Function: double gsl_stats_wskew_m_sd (const double w[], size_t wstride, const double data[], size_t stride, size_t n, double wmean, double wsd)

This function computes the weighted skewness of the dataset data using the given values of the weighted mean and weighted standard deviation, wmean and wsd.

— Function: double gsl_stats_wkurtosis (const double w[], size_t wstride, const double data[], size_t stride, size_t n)

This function computes the weighted kurtosis of the dataset data.

          kurtosis = ((\sum w_i ((x_i - \Hat x)/\Hat \sigma)^4) / (\sum w_i)) - 3
— Function: double gsl_stats_wkurtosis_m_sd (const double w[], size_t wstride, const double data[], size_t stride, size_t n, double wmean, double wsd)

This function computes the weighted kurtosis of the dataset data using the given values of the weighted mean and weighted standard deviation, wmean and wsd.