ILIAS
Release_4_0_x_branch Revision 61816
Public Member Functions | |
Math_Stats ($nullOption=STATS_REJECT_NULL) | |
Constructor for the class. | |
setData ($arr, $opt=STATS_DATA_SIMPLE) | |
Sets and verifies the data, checking for nulls and using the current null handling option. | |
getData ($expanded=false) | |
Returns the data which might have been modified according to the current null handling options. | |
setNullOption ($nullOption) | |
Sets the null handling option. | |
studentize () | |
Transforms the data by subtracting the mean from each entry and dividing by the standard deviation. | |
center () | |
Transforms the data by subtracting the mean from each entry. | |
calc ($mode, $returnErrorObject=true) | |
Calculates the basic or full statistics for the data set. | |
calcBasic ($returnErrorObject=true) | |
Calculates a basic set of statistics. | |
calcFull ($returnErrorObject=true) | |
Calculates a full set of statistics. | |
min () | |
Calculates the minimum of a data set. | |
max () | |
Calculates the maximum of a data set. | |
sum () | |
Calculates SUM { xi }. Handles cumulative data sets correctly. | |
sum2 () | |
Calculates SUM { (xi)^2 }. Handles cumulative data sets correctly. | |
sumN ($n) | |
Calculates SUM { (xi)^n }. Handles cumulative data sets correctly. | |
product () | |
Calculates PROD { (xi) }, the product of all observations. Handles cumulative data sets correctly. | |
productN ($n) | |
Calculates PROD { (xi)^n }, the product of the nth powers of all observations. Handles cumulative data sets correctly. | |
count () | |
Calculates the number of data points in the set. Handles cumulative data sets correctly. | |
mean () | |
Calculates the mean (average) of the data points in the set. Handles cumulative data sets correctly. | |
range () | |
Calculates the range of the data set = max - min. | |
variance () | |
Calculates the variance (unbiased) of the data points in the set. Handles cumulative data sets correctly. | |
stDev () | |
Calculates the standard deviation (unbiased) of the data points in the set. Handles cumulative data sets correctly. | |
varianceWithMean ($mean) | |
Calculates the variance (unbiased) of the data points in the set given a fixed mean (average) value. | |
stDevWithMean ($mean) | |
Calculates the standard deviation (unbiased) of the data points in the set given a fixed mean (average) value. | |
absDev () | |
Calculates the absolute deviation of the data points in the set. Handles cumulative data sets correctly. | |
absDevWithMean ($mean) | |
Calculates the absolute deviation of the data points in the set given a fixed mean (average) value. | |
skewness () | |
Calculates the skewness of the data distribution in the set. The skewness measures the degree of asymmetry of a distribution, and is related to the third central moment of a distribution. | |
kurtosis () | |
Calculates the kurtosis of the data distribution in the set. The kurtosis measures the degree of peakedness of a distribution. | |
median () | |
Calculates the median of a data set. | |
mode () | |
Calculates the mode of a data set. | |
midrange () | |
Calculates the midrange of a data set. | |
geometricMean () | |
Calculates the geometric mean of the data points in the set. Handles cumulative data sets correctly. | |
harmonicMean () | |
Calculates the harmonic mean of the data points in the set. Handles cumulative data sets correctly. | |
sampleCentralMoment ($n) | |
Calculates the nth central moment (m{n}) of a data set. | |
sampleRawMoment ($n) | |
Calculates the nth raw moment (m{n}) of a data set. | |
coeffOfVariation () | |
Calculates the coefficient of variation of a data set. | |
stdErrorOfMean () | |
Calculates the standard error of the mean. | |
frequency () | |
Calculates the value frequency table of a data set. | |
quartiles () | |
The quartiles are defined as the values that divide a sorted data set into four equal-sized subsets, and correspond to the 25th, 50th, and 75th percentiles. | |
interquartileMean () | |
The interquartile mean is defined as the mean of the values left after discarding the lower 25% and top 25% ranked values. | |
interquartileRange () | |
The interquartile range is the distance between the 75th and 25th percentiles. | |
quartileDeviation () | |
The quartile deviation is half of the interquartile range value. | |
quartileVariationCoefficient () | |
The quartile variation coefficient is defined as follows: | |
quartileSkewnessCoefficient () | |
The quartile skewness coefficient (also known as Bowley skewness) is defined as follows: | |
percentile ($p) | |
The pth percentile is the value such that p% of a sorted data set is smaller than it, and (100 - p)% of the data is larger. | |
__sumdiff ($power, $mean=null) | |
Utility function to calculate: SUM { (xi - mean)^n }. | |
__calcVariance ($mean=null) | |
Utility function to calculate the variance with or without a fixed mean. | |
__calcAbsoluteDeviation ($mean=null) | |
Utility function to calculate the absolute deviation with or without a fixed mean. | |
__sumabsdev ($mean=null) | |
Utility function to calculate: SUM { | xi - mean | }. | |
__format ($v, $useErrorObject=true) | |
Utility function to format a PEAR_Error to be used by calc(), calcBasic() and calcFull() | |
_validate () | |
Utility function to validate the data and modify it according to the current null handling option. |
Data Fields | |
$_data = null | |
$_dataExpanded = null | |
$_dataOption = null | |
$_nullOption | |
$_calculatedValues = array() |
Base::__calcAbsoluteDeviation | ( | $mean = null | ) |
Utility function to calculate the absolute deviation with or without a fixed mean.
private
$mean | the fixed mean to use, null as default |
Definition at line 1488 of file Stats.php.
References __sumabsdev(), count(), PEAR\isError(), and PEAR\raiseError().
Referenced by absDev(), and absDevWithMean().
Base::__calcVariance | ( | $mean = null | ) |
Utility function to calculate the variance with or without a fixed mean.
private
$mean | the fixed mean to use, null as default |
Definition at line 1460 of file Stats.php.
References __sumdiff(), count(), PEAR\isError(), and PEAR\raiseError().
Referenced by variance(), and varianceWithMean().
Base::__format | ( | $v, $useErrorObject = true | ) |
Utility function to format a PEAR_Error to be used by calc(), calcBasic() and calcFull()
private
mixed | $v | value to be formatted |
boolean | $returnErrorObject | whether the raw PEAR_Error (when true, default), or only the error message will be returned (when false) |
Definition at line 1545 of file Stats.php.
References PEAR\isError().
Referenced by calcBasic(), and calcFull().
Base::__sumabsdev | ( | $mean = null | ) |
Utility function to calculate: SUM { | xi - mean | }.
private
optional double | $mean | the mean value for the set or population |
Definition at line 1513 of file Stats.php.
References mean(), PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by __calcAbsoluteDeviation().
Base::__sumdiff | ( | $power, $mean = null | ) |
Utility function to calculate: SUM { (xi - mean)^n }.
private
numeric | $power | the exponent |
optional double | $mean | the data set mean value |
Definition at line 1428 of file Stats.php.
References PEAR\isError(), mean(), PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by __calcVariance(), kurtosis(), sampleCentralMoment(), and skewness().
Base::_validate | ( | ) |
Utility function to validate the data and modify it according to the current null handling option.
private
Definition at line 1562 of file Stats.php.
References $d, $key, count(), PEAR\raiseError(), STATS_DATA_CUMMULATIVE, STATS_IGNORE_NULL, STATS_REJECT_NULL, and STATS_USE_NULL_AS_ZERO.
Referenced by setData().
Base::absDev | ( | ) |
Calculates the absolute deviation of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 750 of file Stats.php.
References __calcAbsoluteDeviation(), and PEAR\isError().
Referenced by calcFull().
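As a rough illustration of the quantity absDev() reports, the Python sketch below computes the mean absolute deviation, SUM { | xi - mean | } / N, with an optional fixed mean in the spirit of absDevWithMean(). It is not the PHP implementation: the name `abs_dev` is hypothetical, and dividing the sum by N is an assumption based on the references to __sumabsdev() and count().

```python
def abs_dev(data, mean=None):
    """Mean absolute deviation around the sample mean or a fixed mean."""
    n = len(data)
    if n == 0:
        raise ValueError("data set is empty")
    if mean is None:
        # No fixed mean given: use the sample mean.
        mean = sum(data) / n
    # SUM { | xi - mean | } divided by the number of points (assumed divisor).
    return sum(abs(x - mean) for x in data) / n
```

For example, `abs_dev([1, 2, 3, 4])` averages the distances 1.5, 0.5, 0.5 and 1.5 from the mean 2.5.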
Base::absDevWithMean | ( | $mean | ) |
Calculates the absolute deviation of the data points in the set given a fixed mean (average) value.
Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.
public
numeric | $mean | the fixed mean value |
Definition at line 773 of file Stats.php.
References __calcAbsoluteDeviation().
Base::calc | ( | $mode, $returnErrorObject = true | ) |
Calculates the basic or full statistics for the data set.
public
int | $mode | one of STATS_BASIC or STATS_FULL |
boolean | $returnErrorObject | whether the raw PEAR_Error (when true, default), or only the error message will be returned (when false), if an error happens. |
Definition at line 326 of file Stats.php.
References calcBasic(), calcFull(), elseif(), PEAR\raiseError(), STATS_BASIC, and STATS_FULL.
Base::calcBasic | ( | $returnErrorObject = true | ) |
Calculates a basic set of statistics.
public
boolean | $returnErrorObject | whether the raw PEAR_Error (when true, default), or only the error message will be returned (when false), if an error happens. |
Definition at line 349 of file Stats.php.
References __format(), count(), max(), mean(), min(), range(), stDev(), sum(), sum2(), and variance().
Referenced by calc().
Base::calcFull | ( | $returnErrorObject = true | ) |
Calculates a full set of statistics.
public
boolean | $returnErrorObject | whether the raw PEAR_Error (when true, default), or only the error message will be returned (when false), if an error happens. |
Definition at line 373 of file Stats.php.
References __format(), absDev(), coeffOfVariation(), count(), frequency(), geometricMean(), harmonicMean(), interquartileMean(), interquartileRange(), kurtosis(), max(), mean(), median(), midrange(), min(), mode(), quartileDeviation(), quartiles(), quartileSkewnessCoefficient(), quartileVariationCoefficient(), range(), sampleCentralMoment(), sampleRawMoment(), skewness(), stdErrorOfMean(), stDev(), sum(), sum2(), and variance().
Referenced by calc().
Base::center | ( | ) |
Transforms the data by subtracting the mean from each entry.
This will reset all pre-calculated values to their original (unset) defaults.
public
Definition at line 295 of file Stats.php.
References PEAR\isError(), mean(), setData(), and STATS_DATA_CUMMULATIVE.
Base::coeffOfVariation | ( | ) |
Calculates the coefficient of variation of a data set.
The coefficient of variation measures the spread of a set of data as a proportion of its mean. It is often expressed as a percentage. Handles cumulative data sets correctly.
public
Definition at line 1109 of file Stats.php.
References PEAR\isError(), mean(), PEAR\raiseError(), and stDev().
Referenced by calcFull().
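A minimal Python sketch of the ratio described above, standard deviation divided by mean, may help. It is an illustration rather than the PHP code: `coeff_of_variation` is a hypothetical name, and the sample (n - 1) standard deviation is used on the assumption that it matches the "unbiased" stDev() this method references.

```python
import statistics


def coeff_of_variation(data):
    """Spread of the data as a proportion of its mean (stdev / mean)."""
    mean = statistics.mean(data)
    if mean == 0:
        # The ratio is undefined when the mean is zero.
        raise ZeroDivisionError("mean is zero")
    # statistics.stdev uses the n - 1 (sample) divisor.
    return statistics.stdev(data) / mean
```

Multiplying the result by 100 gives the percentage form mentioned above.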
Base::count | ( | ) |
Calculates the number of data points in the set. Handles cumulative data sets correctly.
public
Definition at line 599 of file Stats.php.
References PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by __calcAbsoluteDeviation(), __calcVariance(), _validate(), calcBasic(), calcFull(), geometricMean(), harmonicMean(), kurtosis(), mean(), median(), percentile(), sampleCentralMoment(), sampleRawMoment(), skewness(), and stdErrorOfMean().
Base::frequency | ( | ) |
Calculates the value frequency table of a data set.
Handles cumulative data sets correctly.
public
Definition at line 1171 of file Stats.php.
References $_data, PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcFull(), and mode().
Base::geometricMean | ( | ) |
Calculates the geometric mean of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 965 of file Stats.php.
References count(), PEAR\isError(), product(), and PEAR\raiseError().
Referenced by calcFull().
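The geometric mean is the Nth root of the product of the N observations, which is consistent with this method referencing product() and count(). The Python sketch below shows that textbook definition only; `geometric_mean` is a hypothetical name, and restricting the input to positive values is an assumption made to keep the root real.

```python
import math


def geometric_mean(data):
    """Nth root of the product of N positive observations."""
    n = len(data)
    if n == 0:
        raise ValueError("data set is empty")
    if any(x <= 0 for x in data):
        # Assumption: only positive values are accepted in this sketch.
        raise ValueError("all values must be positive")
    return math.prod(data) ** (1.0 / n)
```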
Base::getData | ( | $expanded = false | ) |
Returns the data which might have been modified according to the current null handling options.
public
boolean | $expanded | whether to return an expanded list, default is false |
Definition at line 217 of file Stats.php.
References $_data, $_dataExpanded, PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by interquartileMean().
Base::harmonicMean | ( | ) |
Calculates the harmonic mean of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 995 of file Stats.php.
References count(), PEAR\isError(), PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcFull().
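For reference, the harmonic mean is N divided by the sum of the reciprocals of the observations. The Python sketch below illustrates that standard definition; it is not taken from Stats.php, and the name `harmonic_mean` and the explicit rejection of zero values are assumptions.

```python
def harmonic_mean(data):
    """N / SUM { 1 / xi } for a data set without zero values."""
    n = len(data)
    if n == 0:
        raise ValueError("data set is empty")
    if any(x == 0 for x in data):
        # A zero value has no reciprocal, so the mean is undefined.
        raise ValueError("data set contains a zero value")
    return n / sum(1.0 / x for x in data)
```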
Base::interquartileMean | ( | ) |
The interquartile mean is defined as the mean of the values left after discarding the lower 25% and top 25% ranked values, i.e.:
interquart mean = mean(<P(25),P(75)>)
where: P = percentile
Definition at line 1235 of file Stats.php.
References $n, getData(), PEAR\isError(), quartiles(), and PEAR\raiseError().
Referenced by calcFull().
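One way to read the formula above is sketched here in Python: keep the values lying between the 25th and 75th percentiles and average them. The function name `interquartile_mean` is hypothetical, the two percentiles are passed in rather than computed, and treating both bounds as inclusive is an assumption that the source does not confirm.

```python
def interquartile_mean(data, p25, p75):
    """Mean of the values between the 25th and 75th percentiles."""
    # Assumption: both percentile bounds are inclusive.
    kept = [x for x in data if p25 <= x <= p75]
    if not kept:
        raise ValueError("no values fall between the given percentiles")
    return sum(kept) / len(kept)
```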
Base::interquartileRange | ( | ) |
The interquartile range is the distance between the 75th and 25th percentiles.
Basically the range of the middle 50% of the data set, and thus is not affected by outliers or extreme values.
interquart range = P(75) - P(25)
where: P = percentile
public
Definition at line 1273 of file Stats.php.
References PEAR\isError(), and quartiles().
Referenced by calcFull(), and quartileDeviation().
Base::kurtosis | ( | ) |
Calculates the kurtosis of the data distribution in the set. The kurtosis measures the degree of peakedness of a distribution.
It is also called the "excess" or "excess coefficient", and is a normalized form of the fourth central moment of a distribution.
A normal distribution has a kurtosis = 0.
A narrow and peaked (leptokurtic) distribution has a kurtosis > 0.
A flat and wide (platykurtic) distribution has a kurtosis < 0.
Handles cumulative data sets correctly.
public
Definition at line 832 of file Stats.php.
References __sumdiff(), count(), PEAR\isError(), and stDev().
Referenced by calcFull().
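The description above (a normalized fourth central moment that is 0 for a normal distribution) can be sketched in Python as SUM { (xi - mean)^4 } / (N * s^4) - 3. This is a guess at the normalization suggested by the references to __sumdiff(), count() and stDev(), not a transcription of Stats.php; `excess_kurtosis` is a hypothetical name, and using the unbiased (n - 1) standard deviation for s is an assumption.

```python
import statistics


def excess_kurtosis(data):
    """Normalized fourth central moment minus 3 (0 for a normal shape)."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # unbiased (n - 1) standard deviation
    if sd == 0:
        raise ZeroDivisionError("standard deviation is zero")
    sum_diff4 = sum((x - mean) ** 4 for x in data)
    return sum_diff4 / (n * sd ** 4) - 3.0
```

A flat data set such as `[1, 2, 3, 4, 5]` gives a negative value under this formula, in line with the platykurtic case above.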
Base::Math_Stats | ( | $nullOption = STATS_REJECT_NULL | ) |
Constructor for the class.
public
optional int | $nullOption | how to handle null values |
Definition at line 176 of file Stats.php.
Base::max | ( | ) |
Calculates the maximum of a data set.
Handles cumulative data sets correctly.
public
Definition at line 451 of file Stats.php.
References PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcBasic(), calcFull(), midrange(), and range().
Base::mean | ( | ) |
Calculates the mean (average) of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 624 of file Stats.php.
References count(), PEAR\isError(), and sum().
Referenced by __sumabsdev(), __sumdiff(), calcBasic(), calcFull(), center(), coeffOfVariation(), and studentize().
Base::median | ( | ) |
Calculates the median of a data set.
The median is the value such that half of the points are below it in a sorted data set. If the number of values is odd, it is the middle item. If the number of values is even, it is the average of the two middle items. Handles cumulative data sets correctly.
public
Definition at line 864 of file Stats.php.
References $_data, $_dataExpanded, $n, count(), PEAR\isError(), PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcFull().
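The odd/even rule described above can be sketched in a few lines of Python. This is an illustration of the rule for a simple (non-cumulative) list, not the PHP implementation, and the name `median` is chosen only for the example.

```python
def median(data):
    """Middle item of the sorted data, or the average of the two middle items."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data set is empty")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle item.
        return ordered[mid]
    # Even count: average of the two middle items.
    return (ordered[mid - 1] + ordered[mid]) / 2
```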
Base::midrange | ( | ) |
Calculates the midrange of a data set.
The midrange is the average of the minimum and maximum of the data set. Handles cumulative data sets correctly.
public
Definition at line 940 of file Stats.php.
References PEAR\isError(), max(), and min().
Referenced by calcFull().
Base::min | ( | ) |
Calculates the minimum of a data set.
Handles cumulative data sets correctly.
public
Definition at line 427 of file Stats.php.
References PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcBasic(), calcFull(), midrange(), and range().
Base::mode | ( | ) |
Calculates the mode of a data set.
The mode is the value with the highest frequency in the data set. There can be more than one mode. Handles cumulative data sets correctly.
public
Definition at line 900 of file Stats.php.
References $_data, frequency(), PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by calcFull().
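Because there can be more than one mode, a sketch of the idea returns a list: build a frequency table, then keep every value that reaches the highest count. The Python below is illustrative only; `modes` is a hypothetical name, and returning the modes sorted is an assumption about presentation, not documented behavior.

```python
from collections import Counter


def modes(data):
    """All values that occur with the highest frequency."""
    freq = Counter(data)  # value -> number of occurrences
    if not freq:
        raise ValueError("data set is empty")
    top = max(freq.values())
    return sorted(value for value, count in freq.items() if count == top)
```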
Base::percentile | ( | $p | ) |
The pth percentile is the value such that p% of a sorted data set is smaller than it, and (100 - p)% of the data is larger.
A quick algorithm to pick the appropriate value from a sorted data set is as follows:
The median is the 50th percentile value.
public
numeric | $p | the percentile to estimate, e.g. 25 for 25th percentile |
Definition at line 1389 of file Stats.php.
References $_data, $_dataExpanded, $data, count(), elseif(), PEAR\isError(), and STATS_DATA_CUMMULATIVE.
Referenced by quartiles().
Base::product | ( | ) |
Calculates PROD { (xi) }, the product of all observations. Handles cumulative data sets correctly.
public
Definition at line 546 of file Stats.php.
References PEAR\isError(), and productN().
Referenced by geometricMean().
Base::productN | ( | $n | ) |
Calculates PROD { (xi)^n }, the product of the nth powers of all observations. Handles cumulative data sets correctly.
public
numeric | $n | the exponent |
Definition at line 567 of file Stats.php.
References $n, PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by product().
Base::quartileDeviation | ( | ) |
The quartile deviation is half of the interquartile range value.
quart dev = (P(75) - P(25)) / 2
where: P = percentile
public
Definition at line 1298 of file Stats.php.
References interquartileRange(), and PEAR\isError().
Referenced by calcFull().
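The two formulas involved here, interquart range = P(75) - P(25) and quart dev = (P(75) - P(25)) / 2, translate directly into code. In the Python sketch below the percentiles are passed in as plain numbers, and both function names are hypothetical.

```python
def interquartile_range(p25, p75):
    """Distance between the 75th and 25th percentiles."""
    return p75 - p25


def quartile_deviation(p25, p75):
    """Half of the interquartile range."""
    return interquartile_range(p25, p75) / 2
```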
Base::quartiles | ( | ) |
The quartiles are defined as the values that divide a sorted data set into four equal-sized subsets, and correspond to the 25th, 50th, and 75th percentiles.
public
Definition at line 1199 of file Stats.php.
References PEAR\isError(), and percentile().
Referenced by calcFull(), interquartileMean(), interquartileRange(), quartileSkewnessCoefficient(), and quartileVariationCoefficient().
Base::quartileSkewnessCoefficient | ( | ) |
The quartile skewness coefficient (also known as Bowley skewness) is defined as follows:
quart skewness coeff = (P(25) - 2*P(50) + P(75)) / (P(75) - P(25))
where: P = percentile
Definition at line 1349 of file Stats.php.
References $d, PEAR\isError(), and quartiles().
Referenced by calcFull().
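The formula above can be written out as a small Python function. The sketch takes the three quartiles as arguments instead of computing them, and the name `quartile_skewness_coeff` and the zero-range check are assumptions added for the example.

```python
def quartile_skewness_coeff(p25, p50, p75):
    """Bowley skewness: (P(25) - 2*P(50) + P(75)) / (P(75) - P(25))."""
    spread = p75 - p25
    if spread == 0:
        raise ZeroDivisionError("P(75) equals P(25)")
    return (p25 - 2 * p50 + p75) / spread
```

When the median sits exactly halfway between the outer quartiles, the numerator vanishes and the coefficient is 0.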
Base::quartileVariationCoefficient | ( | ) |
The quartile variation coefficient is defined as follows:
quart var coeff = 100 * (P(75) - P(25)) / (P(75) + P(25))
where: P = percentile
Definition at line 1321 of file Stats.php.
References $d, PEAR\isError(), and quartiles().
Referenced by calcFull().
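Written as Python, the formula above becomes a one-line computation on the two outer quartiles. The function name `quartile_variation_coeff` is hypothetical and the guard against a zero denominator is an assumption.

```python
def quartile_variation_coeff(p25, p75):
    """100 * (P(75) - P(25)) / (P(75) + P(25))."""
    total = p75 + p25
    if total == 0:
        raise ZeroDivisionError("P(75) + P(25) is zero")
    return 100 * (p75 - p25) / total
```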
Base::range | ( | ) |
Calculates the range of the data set = max - min.
public
Definition at line 645 of file Stats.php.
References PEAR\isError(), max(), and min().
Referenced by calcBasic(), and calcFull().
Base::sampleCentralMoment | ( | $n | ) |
Calculates the nth central moment (m{n}) of a data set.
The definition of a sample central moment is:
m{n} = 1/N * SUM { (xi - avg)^n }
where: N = sample size, avg = sample mean.
public
integer | $n | moment to calculate |
Definition at line 1040 of file Stats.php.
References $n, __sumdiff(), count(), PEAR\isError(), and PEAR\raiseError().
Referenced by calcFull().
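The definition m{n} = 1/N * SUM { (xi - avg)^n } can be illustrated with the following Python sketch for a simple list of values. The function name `sample_central_moment` is hypothetical, and the code is not the PHP implementation.

```python
def sample_central_moment(data, n):
    """nth central moment: 1/N * SUM { (xi - avg)^n }."""
    size = len(data)
    if size == 0:
        raise ValueError("data set is empty")
    avg = sum(data) / size
    return sum((x - avg) ** n for x in data) / size
```

For a symmetric data set such as `[1, 2, 3, 4, 5]` the third central moment is 0, since the positive and negative cubes cancel.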
Base::sampleRawMoment | ( | $n | ) |
Calculates the nth raw moment (m{n}) of a data set.
The definition of a sample raw moment is:
m{n} = 1/N * SUM { xi^n }
where: N = sample size.
public
integer | $n | moment to calculate |
Definition at line 1076 of file Stats.php.
References $n, count(), PEAR\isError(), PEAR\raiseError(), and sumN().
Referenced by calcFull().
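The raw moment m{n} = 1/N * SUM { xi^n } differs from the central moment only in that the mean is not subtracted first. A short Python sketch, with the hypothetical name `sample_raw_moment`:

```python
def sample_raw_moment(data, n):
    """nth raw moment: 1/N * SUM { xi^n }."""
    size = len(data)
    if size == 0:
        raise ValueError("data set is empty")
    return sum(x ** n for x in data) / size
```

With n = 1 this reduces to the ordinary mean of the data.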
Base::setData | ( | $arr, $opt = STATS_DATA_SIMPLE | ) |
Sets and verifies the data, checking for nulls and using the current null handling option.
public
array | $arr | the data set |
optional int | $opt | data format: STATS_DATA_CUMMULATIVE or STATS_DATA_SIMPLE (default) |
Definition at line 189 of file Stats.php.
References _validate(), PEAR\raiseError(), STATS_DATA_CUMMULATIVE, and STATS_DATA_SIMPLE.
Referenced by center(), and studentize().
Base::setNullOption | ( | $nullOption | ) |
Sets the null handling option.
Must be called before assigning a new data set containing null values.
public
Definition at line 236 of file Stats.php.
References PEAR\raiseError(), STATS_IGNORE_NULL, STATS_REJECT_NULL, and STATS_USE_NULL_AS_ZERO.
Base::skewness | ( | ) |
Calculates the skewness of the data distribution in the set. The skewness measures the degree of asymmetry of a distribution, and is related to the third central moment of a distribution.
A normal distribution has a skewness = 0.
A distribution with a tail off towards the high end of the scale (positive skew) has a skewness > 0.
A distribution with a tail off towards the low end of the scale (negative skew) has a skewness < 0.
Handles cumulative data sets correctly.
public
Definition at line 795 of file Stats.php.
References __sumdiff(), count(), PEAR\isError(), and stDev().
Referenced by calcFull().
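A possible reading of "related to the third central moment" is SUM { (xi - mean)^3 } / (N * s^3), sketched below in Python. The normalization is an assumption inferred from the references to __sumdiff(), count() and stDev(); `skewness` here is an illustrative function, not the PHP method, and s is taken to be the unbiased (n - 1) standard deviation.

```python
import statistics


def skewness(data):
    """Third central moment normalized by the cube of the standard deviation."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # unbiased (n - 1) standard deviation
    if sd == 0:
        raise ZeroDivisionError("standard deviation is zero")
    sum_diff3 = sum((x - mean) ** 3 for x in data)
    return sum_diff3 / (n * sd ** 3)
```

A symmetric set such as `[1, 2, 3, 4, 5]` yields 0, while a set with one large value, such as `[1, 1, 1, 10]`, yields a positive result.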
Base::stdErrorOfMean | ( | ) |
Calculates the standard error of the mean.
It is the standard deviation of the sampling distribution of the mean. The formula is:
S.E. Mean = SD / (N)^(1/2)
This formula does not assume a normal distribution, and shows that the size of the standard error of the mean is inversely proportional to the square root of the sample size.
public
Definition at line 1146 of file Stats.php.
References count(), PEAR\isError(), and stDev().
Referenced by calcFull().
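The formula S.E. Mean = SD / (N)^(1/2) is sketched below in Python. The name `std_error_of_mean` is hypothetical, and SD is assumed to be the unbiased (n - 1) standard deviation, since this method references stDev().

```python
import math
import statistics


def std_error_of_mean(data):
    """Standard deviation divided by the square root of the sample size."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two data points are required")
    return statistics.stdev(data) / math.sqrt(n)
```

Quadrupling the sample size while keeping the same spread halves the result, which is the inverse square-root relationship noted above.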
Base::stDev | ( | ) |
Calculates the standard deviation (unbiased) of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 691 of file Stats.php.
References PEAR\isError(), and variance().
Referenced by calcBasic(), calcFull(), coeffOfVariation(), kurtosis(), skewness(), stdErrorOfMean(), and studentize().
Base::stDevWithMean | ( | $mean | ) |
Calculates the standard deviation (unbiased) of the data points in the set given a fixed mean (average) value.
Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.
public
numeric | $mean | the fixed mean value |
Definition at line 731 of file Stats.php.
References PEAR\isError(), and varianceWithMean().
Base::studentize | ( | ) |
Transforms the data by subtracting the mean from each entry and dividing by the standard deviation.
This will reset all pre-calculated values to their original (unset) defaults.
public
Definition at line 259 of file Stats.php.
References PEAR\isError(), mean(), PEAR\raiseError(), setData(), STATS_DATA_CUMMULATIVE, and stDev().
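Studentizing is commonly written as z = (x - mean) / stDev for every entry x. The Python sketch below shows that transformation on a plain list; it returns a new list instead of replacing stored data, the name `studentize` is reused only for illustration, and the unbiased (n - 1) standard deviation is assumed.

```python
import statistics


def studentize(data):
    """Return (x - mean) / stdev for each entry of the data set."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # unbiased (n - 1) standard deviation
    if sd == 0:
        raise ZeroDivisionError("standard deviation is zero")
    return [(x - mean) / sd for x in data]
```

The transformed values have mean 0 and unit standard deviation; dropping the division gives the centering performed by center().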
Base::sum | ( | ) |
Calculates SUM { xi }. Handles cumulative data sets correctly.
public
Definition at line 476 of file Stats.php.
References PEAR\isError(), and sumN().
Referenced by calcBasic(), calcFull(), and mean().
Base::sum2 | ( | ) |
Calculates SUM { (xi)^2 }. Handles cumulative data sets correctly.
public
Definition at line 498 of file Stats.php.
References PEAR\isError(), and sumN().
Referenced by calcBasic(), and calcFull().
Base::sumN | ( | $n | ) |
Calculates SUM { (xi)^n }. Handles cumulative data sets correctly.
public
numeric | $n | the exponent |
Definition at line 521 of file Stats.php.
References $n, PEAR\raiseError(), and STATS_DATA_CUMMULATIVE.
Referenced by sampleRawMoment(), sum(), and sum2().
Base::variance | ( | ) |
Calculates the variance (unbiased) of the data points in the set. Handles cumulative data sets correctly.
public
Definition at line 671 of file Stats.php.
References __calcVariance(), and PEAR\isError().
Referenced by calcBasic(), calcFull(), and stDev().
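"Unbiased" variance usually means dividing the sum of squared deviations by N - 1 rather than N. The Python sketch below follows that convention and accepts an optional fixed mean, in the spirit of varianceWithMean(). It is an illustration, not the PHP implementation: the function name `variance` is reused for readability, and keeping the N - 1 divisor when a fixed mean is supplied is an assumption.

```python
def variance(data, mean=None):
    """SUM { (xi - mean)^2 } / (N - 1), with an optional fixed mean."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two data points are required")
    if mean is None:
        # No fixed mean given: use the sample mean.
        mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)
```

The standard deviation reported by stDev() would then be the square root of this value.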
Base::varianceWithMean | ( | $mean | ) |
Calculates the variance (unbiased) of the data points in the set given a fixed mean (average) value.
Not used in calcBasic(), calcFull() or calc(). Handles cumulative data sets correctly.
public
numeric | $mean | the fixed mean value |
Definition at line 715 of file Stats.php.
References __calcVariance().
Referenced by stDevWithMean().
Base::$_data = null |
Definition at line 129 of file Stats.php.
Referenced by frequency(), getData(), median(), mode(), and percentile().
Base::$_dataExpanded = null |
Definition at line 138 of file Stats.php.
Referenced by getData(), median(), and percentile().