18 November 2023

Measures of Skewness and Measures of Kurtosis

 

Measures of Skewness

    Skewness means 'lack of symmetry'. We study skewness to get an idea of the shape of the curve that can be drawn from the given data. A distribution is said to be skewed if the mean, median and mode do not coincide, that is, if Mean ≠ Median ≠ Mode.
    If the left tail of the frequency curve is more elongated than the right tail, the distribution is negatively skewed; if the right tail is more elongated, it is positively skewed.
    The various measures of skewness are:

        Sk = M − Md
        Sk = M − M0

    where M is the mean, Md is the median and M0 is the mode of the distribution. These are the absolute measures of skewness. As with dispersion, we do not use these absolute measures to compare two series; instead we calculate relative measures, called coefficients of skewness, which are pure numbers independent of the units of measurement.
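
    As a rough illustration of the difference between an absolute and a relative measure, the sketch below computes M − Md and one common coefficient of skewness, Karl Pearson's median-based coefficient 3(M − Md)/σ. The choice of this particular coefficient, the use of the population standard deviation, the function names and the sample values are all assumptions made for illustration; they are not taken from the text above.

```python
import statistics


def absolute_skewness(data):
    """Absolute measure Sk = M - Md (mean minus median), in the data's own units."""
    return statistics.mean(data) - statistics.median(data)


def pearson_median_skewness(data):
    """Relative measure 3 * (M - Md) / sigma, a pure number with no units."""
    sigma = statistics.pstdev(data)  # population standard deviation (assumed choice)
    return 3 * absolute_skewness(data) / sigma


# Hypothetical sample with a long right tail.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 20]
print(absolute_skewness(sample))        # positive: mean exceeds median
print(pearson_median_skewness(sample))  # positive, and unchanged if units change
```

    Multiplying every observation by 10 multiplies the absolute measure by 10 but leaves the coefficient unchanged, which is why coefficients are preferred when comparing two series.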

Measures of Dispersion

 Experience with many data sets shows that two series can have the same central value while the spread of their values differs widely. A central value alone therefore gives no insight into how the data are distributed around it. Specific metrics have been developed to describe this spread numerically, and they are termed measures of dispersion.

Some definitions for measures of dispersion:

    "The degree to which numerical data tend to spread about an average value is called the variation or dispersion of data."  -Spiegel

    "Dispersion is the measure of the variation of the items."  -A.L. Bowley

Properties of Measures of Dispersion:

  • It should be rigidly defined.
  • It should be easy to calculate and easy to understand.
  • It should be based on all the observations.
  • It should be amenable to further mathematical treatment.
  • It should be affected as little as possible by fluctuations of sampling.

Different types of measures of dispersion:

  1. Range
  2. Interquartile range and quartile deviation
  3. Mean deviation
  4. Median absolute deviation
  5. Variance
  6. Standard deviation
  7. Coefficient of variation
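
 The last three measures in this list are not defined further in this section. As a minimal sketch using their standard textbook definitions (population variance, its square root, and the standard deviation expressed as a percentage of the mean), they can be computed as follows. The function name and the sample data are hypothetical.

```python
import statistics


def variance_sd_cv(data):
    """Return (variance, standard deviation, coefficient of variation in %)."""
    mean = statistics.mean(data)
    variance = statistics.pvariance(data)  # mean of squared deviations from the mean
    sd = statistics.pstdev(data)           # positive square root of the variance
    cv = 100 * sd / mean                   # unit-free, so comparable across series
    return variance, sd, cv


# Hypothetical observations.
print(variance_sd_cv([2, 4, 4, 4, 5, 5, 7, 9]))
```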

Range: The range of a distribution is the difference between its largest and smallest observations. If A is the largest value (X_max) and B is the smallest (X_min), then Range = X_max − X_min = A − B.

Quartile Deviation: The interquartile range is Q3 − Q1, where Q1 and Q3 denote the first and third quartiles of the distribution, respectively. The quartile deviation (or semi-interquartile range), often denoted Q, is half of this: Q = (1/2)(Q3 − Q1).

    While the Interquartile range is more informative than the basic range because it incorporates 50% of the data, its reliability is limited since it doesn't account for the remaining half of the data.
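
    The positional measures described so far can be sketched in a few lines. Quartiles can be computed under several conventions; the "inclusive" method used here is an assumed choice, and the function name and data are hypothetical.

```python
import statistics


def spread_by_position(data):
    """Range, interquartile range and quartile deviation of a list of numbers."""
    # quantiles() with n=4 returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "range": max(data) - min(data),
        "interquartile_range": q3 - q1,
        "quartile_deviation": (q3 - q1) / 2,
    }


# Hypothetical observations.
print(spread_by_position([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

    Note that replacing the largest value 9 by 900 changes the range drastically but leaves the quartile-based measures untouched, since they ignore the outer halves of the data.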

Mean Deviation: The Range and Quartile Deviations focus on specific positions in a data set when measuring dispersion. On the other hand, the Mean Deviation considers all data points. It represents the average of the absolute differences between each value and a chosen central measure, typically the mean, median, or mode.
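
A minimal sketch of the mean deviation follows. The function name, the `about` parameter and the sample data are hypothetical; by default the deviations are taken about the arithmetic mean, and any other central value can be passed in.

```python
import statistics


def mean_deviation(data, about=None):
    """Average absolute deviation of the observations from a central value."""
    if about is None:
        about = statistics.mean(data)  # default central value: the mean
    return sum(abs(x - about) for x in data) / len(data)


# Hypothetical data containing one extreme value.
data = [1, 2, 3, 4, 100]
print(mean_deviation(data))                                 # about the mean
print(mean_deviation(data, about=statistics.median(data)))  # about the median
```

The mean deviation is smallest when taken about the median, which the second call illustrates for this sample.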

Sheppard's Correction for Moments:

    When dealing with a grouped frequency distribution, we typically take the midpoint of class intervals as the concentration point for frequencies. This approach is fairly accurate if the distribution is either symmetrical or nearly symmetrical and the class intervals are smaller than 1/20th of the total range. However, this isn't always the case, so there can be inaccuracies introduced, termed as the 'grouping error'. W.F. Sheppard demonstrated that if the frequency distribution is continuous and gradually reduces to zero on both ends, we can correct this midpoint assumption error. This adjustment is referred to as Sheppard's corrections.



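        μ2 (corrected) = μ2 − h²/12
        μ3 (corrected) = μ3
        μ4 (corrected) = μ4 − (1/2) h² μ2 + (7/240) h⁴

    where μ2, μ3 and μ4 are the central moments computed from the class midpoints and h is the width of the class interval.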
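
    Sheppard's corrections are simple enough to apply mechanically. The sketch below uses the standard form of the corrections for the second, third and fourth central moments; the function name and the numeric inputs are hypothetical.

```python
def sheppard_corrected_moments(mu2, mu3, mu4, h):
    """Apply Sheppard's corrections to central moments of grouped data.

    mu2, mu3, mu4: central moments computed from class midpoints.
    h: common width of the class intervals.
    """
    mu2_c = mu2 - h**2 / 12
    mu3_c = mu3  # the third central moment needs no correction
    mu4_c = mu4 - (h**2 / 2) * mu2 + (7 / 240) * h**4
    return mu2_c, mu3_c, mu4_c


# Hypothetical raw moments from a grouped table with class width 2.
print(sheppard_corrected_moments(mu2=10.0, mu3=0.0, mu4=300.0, h=2))
```

    The corrections shrink as h gets smaller and vanish when h = 0, which matches the idea that finer classes introduce less grouping error.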

Diagrammatic and Graphical Representation

    Diagrams give a bird's-eye view of complex data. They leave a lasting impression and are easy to understand, even for a layperson. They save time and labor and also facilitate comparison. One-dimensional diagrams include the simple bar diagram, multiple bar diagram, sub-divided (component) bar diagram and percentage bar diagram. Two-dimensional diagrams include rectangles, circles and pie diagrams. Three-dimensional diagrams include cubes, cylinders and spheres. Pictograms are non-dimensional diagrams.

  • Bar diagram
    A bar diagram represents the magnitude of a single factor according to time periods, places, items, etc. When the magnitude of the factor is given together with its sub-factors, each bar is further sub-divided into components in proportion to the magnitudes of the sub-factors. Such a diagram is known as a sub-divided bar diagram.



  • Multiple Bar diagram
    In a multiple bar diagram, adjoining bars are drawn for each period or place, one bar per factor, with heights in proportion to the values of the factors and in the same order within every group. Each bar of a group is given a different pattern or color to make it easily distinguishable, and the same pattern is retained in all the groups. A constant distance is maintained between the groups of bars drawn for periods or places. Such a diagram is known as a multiple or compound bar diagram.



  • Deviation bar diagram
    A deviation bar diagram is suitable for showing net deviations over various years or across different countries, places, etc. Positive deviations are shown to the right of the base line and negative deviations to the left of the same base line.


  • Duo-directional bar diagram
    It is used to exhibit the two aspects of a single factor at a glance given for different periods or places. In this type of diagram, one part of the bar remains above the base line and the other below the base line. The heights of the bars below and above the line are in proportion to the values of the two aspects separately whereas the bar as a whole represents the factor. 

  • Paired bar diagram

    When two related factors with different units of measurement are to be compared across various periods or places, paired bar diagrams are suitable. The periods or places are usually shown in a strip, and the bars for the two factors are drawn either horizontally to the right and left of a vertical strip or vertically above and below a horizontal strip.




  • Sliding bar chart
    It is a bilateral chart in which two components of a factor are represented by two parts of the bar. One part is on the left and the other is on the right of the base line. The scale may be in absolute numbers or in percentages. Such a chart is suitable for data such as the numbers arrested in criminal cases or the patients operated on for different diseases.

  •  Broken bar diagram
    Often an investigator comes across cases where some figures are very large compared to others. If the scale is chosen to portray the small values properly, the bars for the large values grow to an unmanageable size. If instead the scale is chosen to display the large values properly, the bars for the small values become too short to be seen. To remove this discrepancy, broken bars are constructed.
    First, a small scale is taken and bars are erected for all periods or places up to the highest of the small values, or up to a suitable rounded value. Then, after a gap, another base line is drawn and a new scale is chosen for the large values. Bars for the remaining values are constructed on the new base line. Such a bar diagram is known as a broken bar diagram. These diagrams must be interpreted very carefully to avoid wrong conclusions.


  • Line diagram 
    A line diagram is a one-dimensional diagram in which the height of the line represents the frequency corresponding to the value of the item or a factor.


  • Pie-chart
    A pie-chart is a circular diagram which is usually used for depicting the components of a single factor. The circle is divided into segments which are in proportion to the size of the components. They are shown by different patterns or colors to make them attractive.
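
    Dividing the circle "in proportion to the size of the components" amounts to giving each component a central angle of 360° times its share of the total. A minimal sketch, with a hypothetical function name and hypothetical component values:

```python
def pie_angles(components):
    """Map each component name to its central angle in degrees."""
    total = sum(components.values())
    return {name: 360 * value / total for name, value in components.items()}


# Hypothetical monthly expenditure broken into three components.
print(pie_angles({"food": 50, "rent": 30, "other": 20}))
```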


  • Histogram
    A histogram is a bar diagram suited to a frequency distribution with continuous classes. The width of each bar equals the class interval, and the heights of the bars are in proportion to the frequencies of the respective classes. In this diagram adjacent bars touch each other, but no bar overlaps another.
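
    Before a histogram can be drawn, the observations have to be tallied into continuous classes of equal width. The sketch below does this and prints a crude text version of the bars. The function name, the convention that each class includes its lower limit and excludes its upper limit, and the data are assumptions for illustration.

```python
def class_frequencies(data, lower, width, n_classes):
    """Tally data into n_classes equal-width classes starting at `lower`.

    Each class is [start, end): it includes its lower limit and excludes
    its upper limit, so adjacent classes touch without overlapping.
    """
    counts = [0] * n_classes
    for x in data:
        k = int((x - lower) // width)  # index of the class containing x
        if 0 <= k < n_classes:
            counts[k] += 1
    return [
        (lower + i * width, lower + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


# Hypothetical observations grouped into classes of width 5.
table = class_frequencies([1, 2, 5, 7, 10, 12, 14, 15, 19], lower=0, width=5, n_classes=4)
for start, end, freq in table:
    print(f"{start:>3}-{end:<3} {'#' * freq}")  # bar length proportional to frequency
```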


  • Frequency polygon
    When the mid-points of the tops of the adjacent bars of a histogram are joined in order, then the graph of lines so obtained is called a frequency polygon.



  • Frequency curve
    A frequency curve is a graphical representation of frequencies corresponding to their variate values by a smooth curve. A smoothened frequency polygon represents a frequency curve.



  • Graph
    A graph is a display of points and lines. Paired values (x, y), called the co-ordinates of a point, are plotted on graph paper after suitable scales are chosen along the X-axis and Y-axis. The plotted points are joined by straight lines in their sequence of occurrence. The figure so obtained is called a graph. A graph depicts trend, fluctuations, variability, etc., very prominently. Two or more graphs drawn on the same graph paper with common scales along the X-axis and Y-axis greatly facilitate the comparison of data.




  • Ogive curve
    It is a graph plotted for the variate values and their corresponding cumulative frequencies of a frequency distribution. Its shape is like an elongated S. An ogive is prepared for either a 'less than' type or a 'more than' type cumulative distribution. It is useful for finding quartiles, deciles and percentiles.
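
    The points of both types of ogive, and the reading of a quartile from the 'less than' ogive, can be sketched as follows. Reading a value off the curve is approximated here by linear interpolation between plotted points, which is an assumption; the function names and the frequency table are hypothetical.

```python
from itertools import accumulate


def less_than_ogive(upper_limits, freqs):
    """Points (upper class limit, number of observations below it)."""
    return list(zip(upper_limits, accumulate(freqs)))


def more_than_ogive(lower_limits, freqs):
    """Points (lower class limit, number of observations at or above it)."""
    total = sum(freqs)
    below = accumulate([0] + list(freqs[:-1]))  # observations below each lower limit
    return list(zip(lower_limits, (total - b for b in below)))


def read_from_ogive(points, target, start):
    """Variate value at which the 'less than' ogive reaches `target`.

    `start` is the lower limit of the first class, where the curve is at 0.
    """
    prev_x, prev_c = start, 0
    for x, c in points:
        if c >= target and c > prev_c:
            return prev_x + (target - prev_c) / (c - prev_c) * (x - prev_x)
        prev_x, prev_c = x, c
    raise ValueError("target exceeds the total frequency")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with these frequencies.
freqs = [5, 10, 15, 10]
lt = less_than_ogive([10, 20, 30, 40], freqs)
mt = more_than_ogive([0, 10, 20, 30], freqs)
n = sum(freqs)
print(lt, mt)
print(read_from_ogive(lt, n / 4, start=0))  # first quartile
print(read_from_ogive(lt, n / 2, start=0))  # median
```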



  • Lorenz curve
    The variate values giving information about the segment of population are ignored. Cumulative totals for the magnitude or frequencies of the two other factors are found out separately. Cumulative totals are expressed as percentage of their respective grand totals. Paired cumulative percentages are plotted on a graph paper choosing same scale along axes from 0 to 100. Plotted points are joined by a smooth free hand curve. This curve always starts from the origin and terminates at the end point (100, 100).
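
    The construction steps above can be sketched as a small function that returns the points to be plotted. The function name and the two series (number of persons and total income per group) are hypothetical.

```python
from itertools import accumulate


def lorenz_points(x_values, y_values):
    """Paired cumulative percentages of two series, starting from the origin."""
    x_total, y_total = sum(x_values), sum(y_values)
    xs = [100 * c / x_total for c in accumulate(x_values)]
    ys = [100 * c / y_total for c in accumulate(y_values)]
    return [(0.0, 0.0)] + list(zip(xs, ys))


# Hypothetical groups ordered from poorest to richest.
persons = [40, 30, 20, 10]
income = [10, 20, 30, 40]
for point in lorenz_points(persons, income):
    print(point)
```

    The further the plotted points fall below the straight line joining (0, 0) and (100, 100), the more unequal the distribution of the second factor over the first.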
