Chapter 10. STATISTICS
10.1 Introduction: A statistic is an algebraic expression that
combines scores into a single number. Statistics serve two functions: they
estimate parameters in population models, and they describe the data. As a
discipline, statistics is a collection of methods for planning experiments,
obtaining data, and then organizing, summarizing, presenting, analyzing, and
interpreting those data and drawing conclusions from them.
10.2 Measures of Central Tendency:
A measure of central tendency is a single score that represents a whole
distribution. The three measures of central tendency that will be discussed
this semester are the mode, median, and mean.
10.2.1 Mean:
The mean, or arithmetic mean, is the most commonly used measure of central
tendency. It is the sum of the numbers divided by the number of numbers. The
symbol μ is used for the mean of a population. The symbol M is used for the
mean of a sample. The formula for μ is

$\mu = \dfrac{\sum X}{N}$

where ΣX is the sum of all the numbers in the given set and N is the total
number of numbers in the set.
Example: Evaluate the mean of 1, 2, 5, 6, 4.
Solution: N = 5
ΣX = 1 + 2 + 5 + 6 + 4 = 18
So the mean is ΣX/N = 18/5 = 3.6
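The calculation above can be sketched in Python. This is a minimal illustration, not part of the original text; the function name `arithmetic_mean` is an assumption chosen for this example.

```python
def arithmetic_mean(scores):
    """Return the sum of the scores divided by how many scores there are."""
    if not scores:
        raise ValueError("the mean of an empty set is undefined")
    return sum(scores) / len(scores)


# The worked example: sum = 18, N = 5.
print(arithmetic_mean([1, 2, 5, 6, 4]))  # 3.6
```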
10.2.2 Median: The median is the middle of a distribution: half the scores are
above the median and half are below it. To find it, first arrange the numbers
in order.
When there is an odd number of numbers in a given set, the median is simply the
middle number.
For example: in the set 2, 8, 9 the median is 8.
When there is an even number of numbers, the median is the mean of the two
middle numbers.
For example: the median of the numbers 2, 4, 7, 12 is
(4 + 7)/2 = 5.5
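The odd and even cases can be written as one short routine. The sketch below is illustrative only; the function name `median` is an assumption, and the list is sorted first because the "middle" only makes sense for ordered scores.

```python
def median(scores):
    """Return the middle score, or the mean of the two middle scores."""
    if not scores:
        raise ValueError("the median of an empty set is undefined")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 8, 9]))      # 8
print(median([2, 4, 7, 12]))  # 5.5
```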
10.2.3 Mode: The mode of a list of numbers is the number that occurs most
frequently. It is important to note that there can be more than one mode, and
that if no number occurs more than once in the set, then there is no mode for
that set of numbers.
Example: Find the mode of
14, 52, 12, 14, 15, 14, 21, 14, 17, 27, 14
Solution: In the above set of numbers, the most frequently occurring number is
14, which occurs 5 times. Hence,
Mode = 14
A distribution may have more than one mode if the two most frequently occurring
scores occur the same number of times. Such distributions are called bimodal.
Example: 12, 11, 14, 12, 54, 10, 11, 21, 24, 12, 11
Solution: 12 and 11 each occur 3 times, so
Modes = 12 and 11
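Counting how often each value occurs makes both examples mechanical. The following is a small sketch, assuming Python's standard `collections.Counter`; the function name `modes` is illustrative, and returning an empty list when nothing repeats follows the convention stated in this section.

```python
from collections import Counter


def modes(scores):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(scores)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No number occurs more than once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([14, 52, 12, 14, 15, 14, 21, 14, 17, 27, 14]))  # [14]
print(modes([12, 11, 14, 12, 54, 10, 11, 21, 24, 12, 11]))  # [11, 12]
```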
10.3 Measures of Variability: Variability refers to the spread or dispersion of
scores. A distribution of scores is said to be highly variable if the scores
differ widely from one another.
Consider the following two sets of scores:
Set A: 4, 5, 6, 7, 8
Set B: 2, 4, 6, 8, 10
We can see that the means for these two sets of scores are the same:
For Set A the mean is (4 + 5 + 6 + 7 + 8)/5 = 6
For Set B the mean is (2 + 4 + 6 + 8 + 10)/5 = 6
Although these two sets of scores have the same mean, they differ in how spread
out the scores are. The scores in Set A vary over a smaller set of values (4
through 8) than those in Set B, which vary from 2 through 10.
10.3.1 Range: The range is the difference between the lowest and highest
values.
In {4, 6, 9, 3, 7} the lowest value is 3 and the highest is 9, so the range is
9 - 3 = 6.
(In other contexts, range can also mean all the output values of a function.)
When the scores are whole numbers, the range is sometimes counted inclusively,
so that both end values are included:
Range = Highest Score - Lowest Score + 1
For Set A the Range = (8 - 4) + 1 = 4 + 1 = 5
For Set B the Range = (10 - 2) + 1 = 8 + 1 = 9
As expected, the set of scores with the greater spread (Set B) has a larger
range than the set with less variability (Set A).
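Both range calculations used in this section can be sketched as follows. The function names `simple_range` and `inclusive_range` are labels chosen for this example rather than terms from the text: the first is highest minus lowest, the second adds 1 as in the Set A and Set B calculations.

```python
def simple_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def inclusive_range(scores):
    """Highest score minus lowest score, plus 1 (counts both end values)."""
    return max(scores) - min(scores) + 1


print(simple_range([4, 6, 9, 3, 7]))        # 6
print(inclusive_range([4, 5, 6, 7, 8]))     # 5  (Set A)
print(inclusive_range([2, 4, 6, 8, 10]))    # 9  (Set B)
```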
10.3.2 Variance and Standard Deviation: The variance is the measure of
variability about the mean and is symbolized by σ². The square root of the
variance is called the standard deviation. It is represented by σ.
10.3.2.1 Methods of finding variance and standard deviation:
Case I: When each term has frequency 1
Let $x_1, x_2, \ldots, x_n$ be the n given observations and let $\bar{x}$ be
their mean. Then the variance is given by

$\sigma^2 = \dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n} = \dfrac{\sum d_i^2}{n}$

where $d_i = x_i - \bar{x}$. Therefore,

$\sigma = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}} = \sqrt{\dfrac{\sum d_i^2}{n}}$
Example: Find the variance and standard deviation of the following
observations:
10, 11, 8, 15, 16
Solution: Mean = $\bar{x}$ = (10 + 11 + 8 + 15 + 16)/5 = 60/5 = 12

| Variable xi | Deviation from the mean, di = xi - 12 | di² |
| 10          | 10 - 12 = -2                          | 4   |
| 11          | 11 - 12 = -1                          | 1   |
| 8           | 8 - 12 = -4                           | 16  |
| 15          | 15 - 12 = 3                           | 9   |
| 16          | 16 - 12 = 4                           | 16  |
|             |                                       | Σdi² = 46 |

Therefore,
Variance = σ² = Σdi²/n = 46/5 = 9.2
and
standard deviation = σ = √9.2 ≈ 3.033
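The Case I procedure (find the mean, square each deviation, average the squares, take the square root) can be checked with a few lines of Python. This is a sketch under the section's definition, which divides by n; the function names are illustrative.

```python
import math


def population_variance(observations):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(observations)
    mean = sum(observations) / n
    return sum((x - mean) ** 2 for x in observations) / n


def population_sd(observations):
    """Square root of the population variance."""
    return math.sqrt(population_variance(observations))


data = [10, 11, 8, 15, 16]
print(population_variance(data))       # 9.2
print(round(population_sd(data), 3))   # 3.033
```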
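Case II: When the frequencies of the variable are given. If the value $x_i$
occurs with frequency $f_i$ and $d_i = x_i - \bar{x}$, the variance and
standard deviation are given by

$\sigma^2 = \dfrac{\sum f_i d_i^2}{\sum f_i}$ and S.D. $= \sigma = \sqrt{\dfrac{\sum f_i d_i^2}{\sum f_i}}$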
Example: Find the variance and standard deviation from the given distribution
table.

| Variable (xi)  | 2 | 4 | 6 | 8  | 10 | 12 | 14 | 16 |
| Frequency (fi) | 4 | 4 | 5 | 15 | 8  | 5  | 4  | 5  |
Solution: The mean is $\bar{x}$ = Σfixi/Σfi = 450/50 = 9, so di = xi - 9.

| Variable xi | Frequency fi | fixi        | di = xi - 9 | di² | fi di²        |
| 2           | 4            | 8           | -7          | 49  | 196           |
| 4           | 4            | 16          | -5          | 25  | 100           |
| 6           | 5            | 30          | -3          | 9   | 45            |
| 8           | 15           | 120         | -1          | 1   | 15            |
| 10          | 8            | 80          | 1           | 1   | 8             |
| 12          | 5            | 60          | 3           | 9   | 45            |
| 14          | 4            | 56          | 5           | 25  | 100           |
| 16          | 5            | 80          | 7           | 49  | 245           |
|             | Σfi = 50     | Σfixi = 450 |             |     | Σfi di² = 754 |

Therefore,
Variance = σ² = Σfi di²/Σfi = 754/50 = 15.08
Standard deviation = σ = √15.08 ≈ 3.88
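The frequency-weighted calculation can be reproduced from the values and frequencies in the solution table. The sketch below is illustrative only; the function name `frequency_variance` is an assumption.

```python
import math


def frequency_variance(values, frequencies):
    """Variance of a frequency distribution: sum(f * d^2) / sum(f)."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    return sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / total


# Values and frequencies taken from the solution table.
values = [2, 4, 6, 8, 10, 12, 14, 16]
frequencies = [4, 4, 5, 15, 8, 5, 4, 5]

variance = frequency_variance(values, frequencies)
print(variance)                       # 15.08
print(round(math.sqrt(variance), 2))  # 3.88
```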
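Case III: When the mean is a decimal fraction.
In this case we measure the deviations $d_i = x_i - A$ from a convenient
assumed mean A and use the formula

$\sigma^2 = \dfrac{\sum f_i d_i^2}{\sum f_i} - \left(\dfrac{\sum f_i d_i}{\sum f_i}\right)^2$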
Example: The scores of 10 students in an examination, in which the maximum
marks were 50, are given below. Find the variance.

Taking the assumed mean A = 34:

| Variable xi | Frequency fi | di = xi - A | di² | fi di²        | fi di       |
| 19          | 1            | -15         | 225 | 225           | -15         |
| 22          | 1            | -12         | 144 | 144           | -12         |
| 27          | 1            | -7          | 49  | 49            | -7          |
| 28          | 2            | -6          | 36  | 72            | -12         |
| 34          | 1            | 0           | 0   | 0             | 0           |
| 35          | 1            | 1           | 1   | 1             | 1           |
| 36          | 1            | 2           | 4   | 4             | 2           |
| 41          | 1            | 7           | 49  | 49            | 7           |
| 48          | 1            | 14          | 196 | 196           | 14          |
|             | Σfi = 10     |             |     | Σfi di² = 740 | Σfi di = -22 |

Using the formula

$\sigma^2 = \dfrac{\sum f_i d_i^2}{\sum f_i} - \left(\dfrac{\sum f_i d_i}{\sum f_i}\right)^2$

= (740/10) - (-22/10)²
= 74 - 4.84
= 69.16
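The assumed-mean shortcut can be sketched in Python as below. The function name `variance_assumed_mean` is illustrative, and A = 34 is read off the table (it is the score whose deviation is 0). The result does not depend on which A is chosen, which is why any convenient value may be used.

```python
def variance_assumed_mean(values, frequencies, assumed_mean):
    """Variance via deviations d = x - A from an assumed mean A."""
    total = sum(frequencies)
    deviations = [x - assumed_mean for x in values]
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(frequencies, deviations))
    return sum_fd2 / total - (sum_fd / total) ** 2


scores = [19, 22, 27, 28, 34, 35, 36, 41, 48]
frequencies = [1, 1, 1, 2, 1, 1, 1, 1, 1]

print(round(variance_assumed_mean(scores, frequencies, 34), 2))  # 69.16
# A different assumed mean gives the same variance.
print(round(variance_assumed_mean(scores, frequencies, 30), 2))  # 69.16
```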
10.3.3 Variance of grouped data:
Step Deviation method: When data are grouped into a frequency distribution
having class intervals of equal size h, the formula used is

$\sigma^2 = h^2 \left[ \dfrac{\sum f_i u_i^2}{\sum f_i} - \left(\dfrac{\sum f_i u_i}{\sum f_i}\right)^2 \right]$

where $u_i = \dfrac{x_i - A}{h}$, A being the assumed mean.
Example: Calculate the mean and standard deviation for the following
distribution:

| Marks   | Number of students |
| 20 - 30 | 3                  |
| 30 - 40 | 6                  |
| 40 - 50 | 13                 |
| 50 - 60 | 15                 |
| 60 - 70 | 14                 |
| 70 - 80 | 5                  |
| 80 - 90 | 4                  |
Solution: First find the mid-values of the class intervals and choose a
suitable assumed mean A. Here A = 55 and h = 10.

| Class interval | Frequency fi | Mid-value xi | ui = (xi - A)/h | fi ui      | fi ui²        |
| 20 - 30        | 3            | 25           | -3              | -9         | 27            |
| 30 - 40        | 6            | 35           | -2              | -12        | 24            |
| 40 - 50        | 13           | 45           | -1              | -13        | 13            |
| 50 - 60        | 15           | 55           | 0               | 0          | 0             |
| 60 - 70        | 14           | 65           | 1               | 14         | 14            |
| 70 - 80        | 5            | 75           | 2               | 10         | 20            |
| 80 - 90        | 4            | 85           | 3               | 12         | 36            |
|                | Σfi = 60     |              |                 | Σfi ui = 2 | Σfi ui² = 134 |
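Therefore,
Mean = $\bar{x} = A + \dfrac{\sum f_i u_i}{\sum f_i} \times h$
= 55 + (2/60) × 10
= 55 + 0.33
= 55.33

Variance = $\sigma^2 = h^2 \left[ \dfrac{\sum f_i u_i^2}{\sum f_i} - \left(\dfrac{\sum f_i u_i}{\sum f_i}\right)^2 \right]$
= 100 [(134/60) - (2/60)²]
= 100 [2.2333 - 0.0011]
≈ 223.22
Hence, standard deviation = σ = √223.22 ≈ 14.94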
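The step deviation method can be checked end to end with the sketch below. The function name `step_deviation_stats` and its parameter names are assumptions made for this illustration; the mid-values, frequencies, A = 55, and h = 10 come from the worked example.

```python
import math


def step_deviation_stats(midpoints, frequencies, assumed_mean, class_width):
    """Return (mean, variance, standard deviation) by the step deviation method."""
    total = sum(frequencies)
    # Step deviations u = (x - A) / h for each class mid-value.
    u = [(x - assumed_mean) / class_width for x in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(frequencies, u))
    sum_fu2 = sum(f * ui * ui for f, ui in zip(frequencies, u))
    mean = assumed_mean + (sum_fu / total) * class_width
    variance = class_width ** 2 * (sum_fu2 / total - (sum_fu / total) ** 2)
    return mean, variance, math.sqrt(variance)


midpoints = [25, 35, 45, 55, 65, 75, 85]
frequencies = [3, 6, 13, 15, 14, 5, 4]

mean, variance, sd = step_deviation_stats(midpoints, frequencies, 55, 10)
print(round(mean, 2), round(variance, 2), round(sd, 2))  # 55.33 223.22 14.94
```

Carried at full precision, the variance is 2009/9, about 223.22, and the standard deviation is about 14.94.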
10.4 Mean Deviation about the Median: The mean of the absolute deviations of
the observations from their median is called the mean deviation about the
median.
Thus, if $x_1, x_2, \ldots, x_n$ are the n observations and M is their median,
then

Mean deviation $= \dfrac{1}{n} \sum_{i=1}^{n} |x_i - M|$
10.4.1 Methods of finding Mean deviation about Median:
Case I: For discrete series:
i) Let n1 be the number of those xi for which xi ≥ M, and let n2 be the number
of those xi for which xi < M. Then n1 + n2 = n.
ii) Let s1 and s2 denote the sums of these n1 and n2 observations respectively.
Then,

Mean deviation $= \dfrac{(s_1 - s_2) - (n_1 - n_2) M}{n_1 + n_2}$
Case II: For Grouped Data:
i) Let n1 be the sum of the frequencies fi of those xi for which xi ≥ M, and
let s1 = Σ fi xi for these values.
ii) Let n2 be the sum of the frequencies fi of those xi for which xi < M, and
let s2 = Σ fi xi for these values.

Mean deviation $= \dfrac{(s_1 - s_2) - (n_1 - n_2) M}{n_1 + n_2}$
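The shortcut formula can be compared against the direct definition in a short sketch. The function below is illustrative only: its name is an assumption, the median M is passed in rather than computed, and the group at or above M is taken as the first group (n1, s1) with the group below M as the second (n2, s2). Setting every frequency to 1 gives the discrete case. The sample data reuse the observations 10, 11, 8, 15, 16 from the variance example, whose median is 11.

```python
def mean_deviation_about_median(values, frequencies, median):
    """Return the mean deviation about M computed directly and by the shortcut."""
    pairs = list(zip(values, frequencies))
    total = sum(frequencies)

    # Direct definition: frequency-weighted mean of |x - M|.
    direct = sum(f * abs(x - median) for x, f in pairs) / total

    # Shortcut: split the data at M and use the group counts and sums.
    n1 = sum(f for x, f in pairs if x >= median)
    n2 = sum(f for x, f in pairs if x < median)
    s1 = sum(f * x for x, f in pairs if x >= median)
    s2 = sum(f * x for x, f in pairs if x < median)
    shortcut = ((s1 - s2) - (n1 - n2) * median) / (n1 + n2)

    return direct, shortcut


direct, shortcut = mean_deviation_about_median([10, 11, 8, 15, 16], [1] * 5, 11)
print(direct, shortcut)  # 2.6 2.6
```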
