9. THEORY OF ATTRIBUTES
9.1 CLASSES AND CLASS FREQUENCIES:
The different attributes are themselves called classes, and the number of observations assigned to a class is called its class frequency, denoted by enclosing the class symbol in brackets. Thus (A) stands for the frequency of A, and (AB) for the number of objects possessing both attributes A and B.
9.1.1 Order of classes and class frequencies:
A class represented by n attributes is called a class of the nth order, and the corresponding frequency a frequency of the nth order. Thus (A) is a class frequency of order 1; (AB), (AC), (βγ), etc., are class frequencies of the second order; (ABC), (Aβγ), (αβC), etc., are frequencies of the third order, and so on. N, the total number of members of the population, without any specification of attributes, is reckoned as a frequency of order zero.
In a dichotomous classification with respect to n attributes, the number of class frequencies of order r is ${}^{n}C_{r} \cdot 2^{r}$, since r attributes out of n can be selected in ${}^{n}C_{r}$ ways and each of the r attributes contributes two symbols, one representing the positive part (e.g., A) and the other the negative part (e.g., α). Thus the total number of class frequencies of all orders, for n attributes, is:
$$\sum_{r=0}^{n} {}^{n}C_{r}\, 2^{r} = (1 + 2)^{n} = 3^{n}$$
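The count can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original notes: the function name `class_symbols` and the representation of each attribute as a (positive, negative) pair of symbols are illustrative assumptions. It lists the classes of each order for three attributes and confirms that order r contributes nCr · 2^r classes, giving 3^n in all.

```python
from itertools import combinations, product
from math import comb


def class_symbols(attributes, order):
    """Return the symbols of every class of the given order.

    `attributes` is a list of (positive, negative) symbol pairs,
    e.g. ("A", "α"); the empty string stands for the zero-order class N.
    """
    classes = []
    # choose which `order` attributes specify the class ...
    for chosen in combinations(attributes, order):
        # ... then take the positive or the negative symbol of each one
        for symbols in product(*chosen):
            classes.append("".join(symbols))
    return classes


attributes = [("A", "α"), ("B", "β"), ("C", "γ")]
n = len(attributes)
counts = [len(class_symbols(attributes, r)) for r in range(n + 1)]

# each order r should contribute nCr * 2^r classes
assert counts == [comb(n, r) * 2**r for r in range(n + 1)]
print(counts)       # [1, 6, 12, 8]
print(sum(counts))  # 27, i.e. 3**3
```

For three attributes this gives 1 class of order zero (N itself), 6 of the first order, 12 of the second and 8 of the third.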
9.1.2 Relation between class frequencies:
The class frequencies of the various orders are not all independent of one another; any class frequency can always be expressed in terms of class frequencies of higher order.
Thus
N = (A) + (α) = (B) + (β) = (C) + (γ), etc.
Also, since each of these A’s or α’s can be either B’s or β’s, we have
(A) = (AB) + (Aβ) and (α) = (αB) + (αβ)
Similarly,
(B) = (AB) + (αB) and (β) = (Aβ) + (αβ)
(AB) = (ABC) + (ABγ), (Aβ) = (AβC) + (Aβγ)
(αB) = (αBC) + (αBγ), (αβ) = (αβC) + (αβγ)
and so on. Thus
(A) = (AB) + (Aβ) = (ABC) + (ABγ) + (AβC) + (Aβγ)
(β) = (Aβ) + (αβ) = (AβC) + (Aβγ) + (αβC) + (αβγ), etc.
The classes of highest order are called the ultimate classes and their frequencies, the ultimate class frequencies. Thus in the case of n attributes, the ultimate class frequencies are the frequencies of the nth order. For example, the class frequencies (ABC), (ABγ), (AβC), (Aβγ), (αBC), (αBγ), (αβC), (αβγ) are the eight ultimate class frequencies for three attributes A, B and C.
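Since every class frequency is a sum of ultimate class frequencies, a table of the ultimate frequencies determines all the others. The sketch below illustrates this in Python; the helper `class_frequency` and the numerical figures are hypothetical, chosen only for illustration.

```python
def class_frequency(ultimate, symbols):
    """Sum the ultimate class frequencies of all classes containing every given symbol.

    `symbols` is a string such as "A", "Aβ" or "" (the empty string gives N).
    """
    return sum(
        freq
        for cls, freq in ultimate.items()
        if all(s in cls for s in symbols)
    )


# Hypothetical ultimate class frequencies for three attributes A, B, C.
ultimate = {
    "ABC": 10, "ABγ": 15, "AβC": 5, "Aβγ": 20,
    "αBC": 8, "αBγ": 12, "αβC": 6, "αβγ": 24,
}

N = class_frequency(ultimate, "")
print(N)                               # 100
print(class_frequency(ultimate, "A"))  # 50
print(class_frequency(ultimate, "AB")) # 25
print(class_frequency(ultimate, "β"))  # 55

# (A) = (AB) + (Aβ), as in the relations above
assert class_frequency(ultimate, "A") == (
    class_frequency(ultimate, "AB") + class_frequency(ultimate, "Aβ")
)
```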
9.2 CLASS SYMBOLS AS OPERATORS:
Let us write symbolically
A.N = (A)
which means that the operation of dichotomizing N according to A gives the class frequency (A). Similarly, we write
α.N = (α)
Adding, we get
A.N + α.N = (A) + (α)
⇒ (A + α).N = (A) + (α)
⇒ (A + α).N = N
⇒ A + α = 1
Thus in symbolic expressions we can replace A by (1 - α) and α by (1 - A). Similarly, B can be replaced by (1 - β) and β by (1 - B), and so on.
Dichotomizing (B) according to A, let us write
A.(B) = (AB)
Similarly, B.(A) = (BA). Since (AB) and (BA) denote the same class,
A.(B) = B.(A) = (AB) = AB.N
which amounts to dichotomizing N according to AB.
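The operator notation gives a quick way to express frequencies of negative classes in terms of positive ones: for instance, (αβ) = αβ.N = (1 - A)(1 - B).N = N - (A) - (B) + (AB). A small Python sketch of this expansion for two attributes follows; the function name and the figures are assumptions made for illustration.

```python
def negative_class_frequencies(N, A, B, AB):
    """Return ((Aβ), (αB), (αβ)) from N, (A), (B) and (AB).

    Each line mirrors a symbolic expansion using α = 1 - A and β = 1 - B.
    """
    A_beta = A - AB              # Aβ.N = A(1 - B).N = (A) - (AB)
    alpha_B = B - AB             # αB.N = (1 - A)B.N = (B) - (AB)
    alpha_beta = N - A - B + AB  # αβ.N = (1 - A)(1 - B).N
    return A_beta, alpha_B, alpha_beta


# Hypothetical figures: N = 100, (A) = 50, (B) = 45, (AB) = 25
print(negative_class_frequencies(100, 50, 45, 25))  # (25, 20, 30)
```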
9.3 CONSISTENCY OF DATA:
Class frequencies which have been or might have been observed within one and the same population are said to be consistent if they conform with one another and do not conflict in any way. For example, the figures (A) = 20, (AB) = 25 are inconsistent, since (AB) cannot be greater than (A) if both are observed in the same population.
‘Consistency’ of a set of class frequencies may be defined as the property that none of them is negative; otherwise, the data are said to be ‘inconsistent’.
Since any class frequency can be expressed as the sum of some of the ultimate class frequencies, it is necessarily non-negative if all the ultimate class frequencies are non-negative. This provides a criterion for testing the consistency of the data: it suffices to verify that no ultimate class frequency is negative.
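For two attributes, the test can be sketched in Python as follows. The function names are illustrative, and the values N = 100 and (B) = 40 are assumed here only to complete the example (A) = 20, (AB) = 25 given in the text, which does not state them.

```python
def ultimate_frequencies(N, A, B, AB):
    """Ultimate class frequencies for two attributes, from N, (A), (B), (AB)."""
    return {
        "AB": AB,
        "Aβ": A - AB,
        "αB": B - AB,
        "αβ": N - A - B + AB,
    }


def is_consistent(N, A, B, AB):
    """Data are consistent when no ultimate class frequency is negative."""
    return all(f >= 0 for f in ultimate_frequencies(N, A, B, AB).values())


# (A) = 20 and (AB) = 25 are from the text; N and (B) are assumed values.
print(ultimate_frequencies(N=100, A=20, B=40, AB=25)["Aβ"])  # -5
print(is_consistent(N=100, A=20, B=40, AB=25))               # False
print(is_consistent(N=100, A=50, B=45, AB=25))               # True
```

The negative value of (Aβ) is what exposes the inconsistency in the first set of figures.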
9.4 INDEPENDENCE OF ATTRIBUTES:
Two attributes A and B are said to be independent if there exists no relationship of any kind between them. If A and B are independent, we would expect
(i) the same proportion of A’s amongst the B’s as amongst the β’s,
(ii) the same proportion of B’s amongst the A’s as amongst the α’s.
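9.4.1 Criterion of independence:
If A and B are independent, then (i) gives
$$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)} \qquad (1)$$
and, subtracting each side from 1,
$$\frac{(\alpha B)}{(B)} = \frac{(\alpha\beta)}{(\beta)}$$
Similarly, (ii) gives
$$\frac{(AB)}{(A)} = \frac{(\alpha B)}{(\alpha)} \qquad (2)$$
and, subtracting each side from 1,
$$\frac{(A\beta)}{(A)} = \frac{(\alpha\beta)}{(\alpha)}$$
In fact (1) ⇒ (2) and vice versa. For example, (1) gives
$$\frac{(AB)}{(B)} = \frac{(A\beta)}{(\beta)} = \frac{(AB) + (A\beta)}{(B) + (\beta)} = \frac{(A)}{N} \qquad (3)$$
$$\Rightarrow \frac{(AB)}{(A)} = \frac{(B)}{N} = \frac{(B) - (AB)}{N - (A)} = \frac{(\alpha B)}{(\alpha)}$$
which is (2). Similarly, starting from (2), we would arrive at (1).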
It becomes
easier to grasp the nature of the above relations if the frequencies are
supposed to be grouped into a table with two rows and two columns as follows:
| Attributes | A | α | Total |
|---|---|---|---|
| B | (AB) | (αB) | (B) |
| β | (Aβ) | (αβ) | (β) |
| Total | (A) | (α) | N |
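A second criterion of independence may be obtained in terms of the class frequencies of the first order. (3) gives
$$(AB) = \frac{(A)(B)}{N}$$
$$\Rightarrow \frac{(AB)}{N} = \frac{(A)}{N} \cdot \frac{(B)}{N}$$
which leads to the following fundamental rule:
“If the attributes A and B are independent, the proportion of AB’s in the population is equal to the product of the proportions of A’s and B’s in the population.”
We may obtain a third criterion of independence in terms of second-order class frequencies, as follows. Applying the same rule to each of the four second-order classes,
$$(AB)(\alpha\beta) = \frac{(A)(B)}{N} \cdot \frac{(\alpha)(\beta)}{N} = \frac{(A)(\beta)}{N} \cdot \frac{(\alpha)(B)}{N} = (A\beta)(\alpha B)$$
$$\Rightarrow (AB)(\alpha\beta) - (A\beta)(\alpha B) = 0 \qquad (4)$$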
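9.4.2 Symbols (AB)₀ and δ:
Let us write
$$(AB)_0 = \frac{(A)(B)}{N}$$
which is the value of (AB) under the hypothesis that the attributes A and B are independent. Let
$$\delta = (AB) - (AB)_0$$
denote the excess of (AB) over (AB)₀. Then
$$\begin{aligned}
\delta &= (AB) - \frac{(A)(B)}{N} = \frac{1}{N}\left[ N(AB) - (A)(B) \right] \\
&= \frac{1}{N}\left[ \{(AB) + (A\beta) + (\alpha B) + (\alpha\beta)\}(AB) - \{(AB) + (A\beta)\}\{(AB) + (\alpha B)\} \right] \\
&= \frac{1}{N}\left[ (AB)(\alpha\beta) - (A\beta)(\alpha B) \right]
\end{aligned}$$
Hence, by (4), δ = 0 if A and B are independent.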
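The two expressions for δ, namely (AB) - (A)(B)/N and [(AB)(αβ) - (Aβ)(αB)]/N, can be compared numerically. The Python sketch below does so for a 2 × 2 table; the function name `delta_two_ways` and both sets of figures are hypothetical.

```python
from math import isclose


def delta_two_ways(AB, A_beta, alpha_B, alpha_beta):
    """Return δ computed (a) as (AB) - (AB)0 and (b) from the cross-products."""
    N = AB + A_beta + alpha_B + alpha_beta
    A = AB + A_beta   # (A) = (AB) + (Aβ)
    B = AB + alpha_B  # (B) = (AB) + (αB)
    expected = A * B / N  # (AB)0, the value of (AB) under independence
    direct = AB - expected
    cross = (AB * alpha_beta - A_beta * alpha_B) / N
    return direct, cross


# Hypothetical table in which A and B are not independent
direct, cross = delta_two_ways(AB=25, A_beta=25, alpha_B=20, alpha_beta=30)
print(direct, cross)  # 2.5 2.5
assert isclose(direct, cross)

# Hypothetical table satisfying (AB)(αβ) = (Aβ)(αB), so that δ = 0
print(delta_two_ways(AB=20, A_beta=30, alpha_B=20, alpha_beta=30))  # (0.0, 0.0)
```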
9.5 ASSOCIATION OF ATTRIBUTES:
Two attributes A and B are said to be associated if they are not independent but are related in some way or other. They are said to be
positively associated if $(AB) > \frac{(A)(B)}{N}$,
negatively associated if $(AB) < \frac{(A)(B)}{N}$.
In other words, two attributes A and B are positively associated if δ > 0, negatively associated if δ < 0, and independent if δ = 0.
9.5.1 Yule’s coefficient of association:
As a measure of the intensity of association between two attributes A and B, G. Udny Yule gave the coefficient of association Q, defined as follows:
$$Q = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)} = \frac{N\delta}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$
If A and B are independent, δ = 0 ⇒ Q = 0.
If A and B are completely associated, then
either (AB) = (A) ⇒ (Aβ) = 0
or (AB) = (B) ⇒ (αB) = 0
and in each case Q = +1.
If A and B are in complete dissociation, then either (AB) = 0 or (αβ) = 0, and we get Q = -1.
Hence -1 ≤ Q ≤ 1.
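As an illustration, Q can be computed directly from the four second-order frequencies. The following Python sketch assumes a hypothetical function name `yule_q` and hypothetical tables; it raises an error when both cross-products vanish, a case in which the ratio is undefined.

```python
def yule_q(AB, A_beta, alpha_B, alpha_beta):
    """Yule's coefficient of association Q for a 2 x 2 table of attributes."""
    same = AB * alpha_beta      # (AB)(αβ)
    cross = A_beta * alpha_B    # (Aβ)(αB)
    if same + cross == 0:
        raise ValueError("Q is undefined when both cross-products are zero")
    return (same - cross) / (same + cross)


# Hypothetical tables, one for each case discussed above
print(yule_q(20, 30, 20, 30))  # 0.0   independence: (AB)(αβ) = (Aβ)(αB)
print(yule_q(30, 0, 10, 60))   # 1.0   complete association: (Aβ) = 0
print(yule_q(0, 40, 30, 30))   # -1.0  complete dissociation: (AB) = 0
print(yule_q(25, 25, 20, 30))  # 0.2   moderate positive association
```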
