18 November 2023

Measures of Dispersion

Experience with many data sets shows that two distributions can have identical central values and yet differ widely in how their values are spread. A measure of central tendency alone therefore says nothing about the distribution or spread of the data. Measures of dispersion are the metrics developed to describe this spread numerically.

Some definitions for measures of dispersion:

    "The degree to which numerical data tend to spread about an average value is called the variation or dispersion of data."  -Spiegel

    "Dispersion is the measure of the variation of the items."  -A.L. Bowley

Desirable properties of a measure of dispersion:

  • It should be rigidly defined.
  • It should be easy to calculate and easy to understand.
  • It should be based on all the observations.
  • It should be amenable to further mathematical treatment.
  • It should be affected as little as possible by fluctuations of sampling.

Different types of measures of dispersion:

  1. Range
  2. Interquartile range and quartile deviation
  3. Mean deviation
  4. Median absolute deviation
  5. Variance
  6. Standard deviation
  7. Coefficient of variation

Range: The range of a distribution is the difference between its largest and smallest observations. If A is the largest value in a data set and B is the smallest, then Range = X_max − X_min = A − B.
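
As a small illustration of the range, and of why a central value alone is not enough, here is a minimal Python sketch. The function name value_range and the two samples are made up for this example; they are not from any particular data set.

```python
from statistics import mean


def value_range(data):
    """Range = largest observation minus smallest observation."""
    if not data:
        raise ValueError("range is undefined for an empty data set")
    return max(data) - min(data)


# Two made-up samples with the same mean but very different spread.
tight = [48, 50, 52]
wide = [10, 50, 90]

print(mean(tight), mean(wide))                # 50 50
print(value_range(tight), value_range(wide))  # 4 80
```

Both samples share the mean 50, yet the second has a range twenty times larger, which is exactly the information a measure of dispersion is meant to capture.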

Quartile Deviation: The interquartile range is the difference Q3 − Q1, where Q1 and Q3 denote the first and third quartiles of the distribution, respectively. The quartile deviation (also called the semi-interquartile range), usually denoted by Q, is half of this difference: Q = (Q3 − Q1)/2.

    While the quartile deviation is more informative than the basic range because it is based on the middle 50% of the data, its reliability is limited since it ignores the remaining half, namely the lowest 25% and the highest 25% of the observations.
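
The calculation can be sketched in Python as follows. This is only one possible version: textbooks and software use several slightly different conventions for locating quartiles, and the choice of method="inclusive" in the standard library's statistics.quantiles is an assumption made here, as are the function names.

```python
from statistics import quantiles


def interquartile_range(data):
    """Q3 - Q1, using the 'inclusive' quartile convention (an assumption)."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1


def quartile_deviation(data):
    """Half of the interquartile range: (Q3 - Q1) / 2."""
    return interquartile_range(data) / 2


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # made-up data
print(interquartile_range(sample))      # 4.0  (Q1 = 3, Q3 = 7)
print(quartile_deviation(sample))       # 2.0
```

With a different quartile convention the numbers can shift slightly for small samples, so it is worth stating which one is used when reporting Q.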

Mean Deviation: The range and the quartile deviation depend only on particular positional values of a data set when measuring dispersion. The mean deviation, on the other hand, is based on all the observations. It is the average of the absolute differences between each value and a chosen central measure A, typically the mean, median, or mode: Mean deviation = (1/n) Σ |x_i − A|.
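
A minimal Python sketch of the mean deviation is given below. The function name mean_deviation, its center parameter, and the sample values are illustrative choices, not part of any standard library.

```python
from statistics import mean, median


def mean_deviation(data, center=None):
    """Average absolute deviation of the data about `center`.

    If no center is supplied, the arithmetic mean is used.
    """
    if not data:
        raise ValueError("mean deviation is undefined for an empty data set")
    if center is None:
        center = mean(data)
    return sum(abs(x - center) for x in data) / len(data)


sample = [1, 2, 3, 4, 10]                        # made-up data
print(mean_deviation(sample))                    # about the mean (4): 2.4
print(mean_deviation(sample, median(sample)))    # about the median (3): 2.2
```

The value about the median is the smaller of the two here, which matches the general result that the mean deviation is least when measured from the median.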

Sheppard's Correction for Moments:

    When dealing with a grouped frequency distribution, we assume that the frequencies of each class are concentrated at the midpoint of its class interval. This assumption is fairly accurate if the distribution is symmetrical or nearly symmetrical and the class intervals are smaller than 1/20th of the total range. However, this isn't always the case, so the moments computed from the midpoints can carry an inaccuracy known as the 'grouping error'. W.F. Sheppard demonstrated that if the frequency distribution is continuous and tapers off gradually to zero at both ends, the error introduced by the midpoint assumption can be corrected. These adjustments are referred to as Sheppard's corrections.



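    μ2 (corrected) = μ2 − h²/12
    μ3 (corrected) = μ3
    μ4 (corrected) = μ4 − (1/2)h²μ2 + (7/240)h⁴

    where μ2, μ3 and μ4 are the central moments computed from the class midpoints and h is the width of the class interval.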
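
Applying the corrections is mechanical once the central moments of the grouped data are known. The sketch below assumes the standard form of Sheppard's corrections (second moment reduced by h²/12, third moment unchanged, fourth moment adjusted by −h²μ2/2 + 7h⁴/240). The function name and the input moments are hypothetical values chosen only for illustration.

```python
def sheppard_corrected_moments(mu2, mu3, mu4, h):
    """Apply Sheppard's corrections to grouped-data central moments.

    mu2, mu3, mu4 : central moments computed from class midpoints
    h             : common width of the class intervals
    Returns the corrected (mu2, mu3, mu4) as a tuple.
    """
    mu2_corrected = mu2 - h**2 / 12
    mu3_corrected = mu3  # the third moment needs no correction
    mu4_corrected = mu4 - 0.5 * h**2 * mu2 + (7 / 240) * h**4
    return mu2_corrected, mu3_corrected, mu4_corrected


# Hypothetical moments for a distribution grouped into classes of width 2.
m2, m3, m4 = sheppard_corrected_moments(mu2=8.0, mu3=1.5, mu4=200.0, h=2.0)
print(round(m2, 4), m3, round(m4, 4))
```

Note that the corrections only make sense under the conditions stated above: a continuous distribution that tails off gradually at both ends, grouped into classes of equal width.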
