Theory of attributes

Chapter 3 …

Attributes

Introduction:

Literally, an attribute means a quality. The theory of attributes deals with qualitative characteristics, which are not measurable. An attribute is marked by its presence or absence in a member of a given population.

There are two types of characteristics.

i) Quantitative (Variable) characteristics and

ii) Qualitative (Attribute) characteristics

Quantitative characteristics

This is a measurable characteristic. For example, in a class of 100 students, let the characteristic under study be the height of a student. We can measure the height of each student. Similarly, characteristics such as age, weight and wages are measurable.

Qualitative characteristics

Qualitative characteristics are non-measurable characteristics and are called attributes. An attribute is a quality or characteristic which cannot be measured but which is known to be present or absent in an individual. For example, for the same class of 100 students, let the characteristic under study be the sex of a student. Here no numerical measure can be attached; we can only record whether the given student is male or female. Thus the whole group can be divided into two non-overlapping groups, males and females, and each student belongs to one of these two groups.

Definition:

A characteristic which is qualitative in nature, and according to which each member of the population belongs to one and only one of a number of distinct classes into which the entire population can be divided, is called an attribute. Examples of attributes are sex, literacy, mother tongue, religion, nationality, blindness, honesty, color, smoking and non-smoking, drinking and non-drinking, etc.

Dichotomy:

If the universe (population) is divided into two complementary sub-classes and no more, with all the units possessing the attribute in one class and the rest, not possessing the attribute, in the other class, the classification is called ‘dichotomy’. For example, according to the sex of students, a class of students is divided into two classes: one of boys and the other of girls.

Manifold Classification:

A classification in which the whole group is divided into more than two non-overlapping classes is called manifold classification. For example, if we classify the same class of students according to their mother tongue, we may consider the languages Marathi, Gujarati, Hindi and all others. In this case the class of 100 students will be divided into four distinct classes.

***

Notations

The capital letters A, B, C… will denote the presence of attribute, while the small letters a, b, c will denote the absence of attribute. The absence of attribute is also denoted by α, β, γ,…etc. Thus, if A represents male, a (α) represents the female. If B denotes the literate, then b (β) denotes the illiterate.

A, B, C, … are called positive attributes, while a, b, c, … (α, β, γ, …) are called negative attributes.

Class

The group of members possessing a particular attribute will be termed as the class.

Class A

The group of members possessing an attribute A will be termed as the class A.

Class Frequency

The number of members belonging to any class is called the class frequency or frequency of that class, and is denoted by enclosing the corresponding class symbol in brackets. Thus, if in the class of 100 students there are 60 boys and 40 girls, then

Frequency of class A = (A) = 60 and frequency of class a or α = (a) = (α) = 40.

Combination of attributes

Combinations of attributes are denoted by grouping together the letters concerned. For example, if A denotes the attribute male and B denotes the attribute literate, then

(AB) stands for the number of literate men.

(Aβ) stands for the number of illiterate men.

(αB) stands for the number of literate women.

(αβ) stands for the number of illiterate women.

Again, let a third attribute, employed, be denoted by C. Then

(ABC) stands for the number of employed literate men.

(AβC) stands for the number of employed illiterate men.

(αBC) stands for the number of employed literate women.

(αβC) stands for the number of employed illiterate women.

(ABγ) stands for the number of unemployed literate men.

(Aβγ) stands for the number of unemployed illiterate men.

(αBγ) stands for the number of unemployed literate women.

(αβγ) stands for the number of unemployed illiterate women.
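The notation above can be illustrated with a short Python sketch that tallies third order class frequencies from individual records. The records and the helper name class_symbol are invented purely for illustration; they are not part of the original notes.

```python
from collections import Counter

# Hypothetical records (made-up data): (is_male, is_literate, is_employed).
records = [
    (True, True, True),
    (True, True, False),
    (True, False, True),
    (False, True, True),
    (False, False, False),
    (False, True, False),
]

def class_symbol(is_male, is_literate, is_employed):
    """Build the third order class symbol, e.g. 'ABγ' for an unemployed literate man."""
    return (
        ("A" if is_male else "α")
        + ("B" if is_literate else "β")
        + ("C" if is_employed else "γ")
    )

# Class frequency = number of members falling in each class.
frequencies = Counter(class_symbol(*record) for record in records)

print(frequencies["ABC"])  # (ABC): employed literate men
print(frequencies["αBγ"])  # (αBγ): unemployed literate women
```

Every record lands in exactly one of the eight classes, so the eight counts always add up to the number of records.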

Positive and Negative classes

The attributes denoted by capital letters A, B, C, … are called positive attributes and those denoted by Greek letters α, β, γ, … are called negative attributes. Thus A, B, C, … are positive classes, and class frequencies of the type N, (A), (AB), (ABC) etc. are known as positive class frequencies. Likewise α, β, γ, … are negative classes, and class frequencies of the type (α), (αβ), (αβγ) etc. are known as negative class frequencies.

Order of class and class frequencies

The number of attributes in a class denotes the order of the class, and the frequency of that class is a frequency of the same order. Thus A, B, C, …, α, β, γ, … are first order classes, and (A), (B), (α), (β) etc. are the corresponding first order class frequencies. Similarly, AB, AC, αβ etc. are second order classes and (AB), (AC), (αβ) etc. are the corresponding second order class frequencies; ABC, ABγ, αβC etc. are third order classes and (ABC), (ABγ), (αβC) etc. are the corresponding third order class frequencies.

***

Relation between class frequencies:

The class frequencies of various orders are not all independent of each other: any class frequency of a lower order can be expressed in terms of class frequencies of a higher order. To obtain these relations, let us consider a group of N individuals dichotomized according to three attributes: sex, literacy and employment.

Let the classes of males, literates and employed persons be denoted by A, B and C respectively. Then the number of males added to the number of females will equal the total N,

i.e. (A) + (α) = N … (1).

Similarly we have, (B) + (β) = N and (C) + (γ) = N.

Now the number of literate males (AB) added to the number of illiterate males (Aβ) will give the total number of males,

i.e. (AB) + (Aβ) = (A) … (2).

Similarly, (αB) + (αβ) = (α)…(3).

From (1), (2) and (3) we get N = (A) + (α) = (AB) + (Aβ) + (αB) + (αβ) …. (4)

	A	α	Total
B	(AB)	(αB)	(B)
β	(Aβ)	(αβ)	(β)
Total	(A)	(α)	N

Similarly the sum of employed literate males and unemployed literate males will give the total of literate males.

i.e. (ABC) + (ABγ) = (AB) …. (5).

Similarly, (AβC) + (Aβγ) = (Aβ), (αBC) + (αBγ) = (αB) and (αβC) + (αβγ) = (αβ). Hence we have, N = (A) + (α) = (AB) + (Aβ) + (αB) + (αβ)

= (ABC) + (ABγ) + (AβC) + (Aβγ) + (αBC) + (αBγ) + (αβC) + (αβγ)…. (6)

	A			α			Total
	C	γ	Total	C	γ	Total	Total
B	(ABC)	(ABγ)	(AB)	(αBC)	(αBγ)	(αB)	(B)
β	(AβC)	(Aβγ)	(Aβ)	(αβC)	(αβγ)	(αβ)	(β)
Total	(AC)	(Aγ)	(A)	(αC)	(αγ)	(α)	(N)
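As a minimal sketch of relations (1) to (6), the following Python snippet recovers lower order frequencies by summing ultimate (third order) frequencies. The numbers in ultimate and the helper name frequency are hypothetical, chosen only to illustrate the tables above.

```python
# Hypothetical third order class frequencies (made-up numbers).
ultimate = {
    "ABC": 20, "ABγ": 10, "AβC": 15, "Aβγ": 5,
    "αBC": 12, "αBγ": 8, "αβC": 18, "αβγ": 12,
}

def frequency(symbols, ultimate):
    """Frequency of a class such as 'A', 'Aβ' or 'αC'.

    Sums every third order class whose symbol contains all the requested
    letters; the empty string gives the total N.
    """
    return sum(
        count
        for cls, count in ultimate.items()
        if all(letter in cls for letter in symbols)
    )

N = frequency("", ultimate)

# Relation (1): (A) + (α) = N
print(frequency("A", ultimate) + frequency("α", ultimate) == N)
# Relation (2): (AB) + (Aβ) = (A)
print(frequency("AB", ultimate) + frequency("Aβ", ultimate) == frequency("A", ultimate))
# Relation (5): (ABC) + (ABγ) = (AB)
print(ultimate["ABC"] + ultimate["ABγ"] == frequency("AB", ultimate))
```

Each margin of the table is obtained this way, by adding the cells of the row or column it summarizes.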

Total number of classes

For one attribute,

we have classes of order zero and of order one,

i.e. class of order zero: N, and

classes of order 1: A, α.

So in all we have 3 classes. Thus for one attribute, 3¹ = 3.

In the case of two attributes,

we have one zero order class, four first order classes and four second order classes, namely N; A, α, B, β; and AB, Aβ, αB, αβ respectively. So in all we have 9 classes. Thus for two attributes, 3² = 9.

Similarly, in the case of three attributes we have classes of order 0, 1, 2 and 3. They can be shown in tabular form as follows:

Order	Classes	Number of classes
0	N	1
1	A, B, C, α, β, γ	6
2	AB, AC, BC, Aβ, αB, αβ, Aγ, βC, Bγ, βγ, αC, αγ	12
3	ABC, ABγ, AβC, Aβγ, αBC, αβC, αBγ, αβγ	8

Total number of classes = 27. Thus for three attributes, 3³ = 27. Similarly, we can easily verify that for 'n' attributes there will be 3ⁿ classes in total.
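The 3ⁿ count can be checked by enumeration: each attribute is either left out of the class symbol, present as the positive letter, or present as the negative letter. The sketch below assumes this reading; the function name all_classes is hypothetical.

```python
from itertools import product

def all_classes(attributes):
    """List every class for the given attributes.

    attributes: list of (positive, negative) symbol pairs, e.g. [("A", "α")].
    For each attribute there are three choices: omit it (""), take the
    positive symbol, or take the negative symbol.
    """
    classes = []
    for choice in product(*[("", pos, neg) for pos, neg in attributes]):
        symbol = "".join(choice)
        classes.append(symbol if symbol else "N")  # no letters at all: the class N
    return classes

classes = all_classes([("A", "α"), ("B", "β"), ("C", "γ")])

# Count the classes of each order (order = number of letters in the symbol).
by_order = {}
for cls in classes:
    order = 0 if cls == "N" else len(cls)
    by_order[order] = by_order.get(order, 0) + 1

print(len(classes))
print(by_order)
```

For three attributes the counts by order come out as 1, 6, 12 and 8, matching the table above.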

Ultimate class frequencies:

The frequencies of the classes of the highest order are called ultimate class frequencies. If we have 'n' attributes then the frequencies of the n th order classes will be the ultimate class frequencies. If we have 'n' attributes then the number of classes of highest order will be 2ⁿ. Hence in case of two attributes, there will be 2² = 4 highest order classes and hence the ultimate class frequencies are 4, namely (AB), (Aβ), (αB) & (αβ), and in case of three attributes, there will be 2³ = 8 highest order classes and hence the ultimate class frequencies are 8, namely (ABC), (AβC), (αBC), (αβC), (ABγ), (Aβγ), (αBγ) & (αβγ).

Properties of Ultimate class frequencies:

i) In case of 'n' attributes, there are 2ⁿ ultimate class frequencies

ii) The sum of all the ultimate class frequencies is equal to N.

iii) All the remaining class frequencies of different orders can be expressed in terms of ultimate class frequencies. Thus, knowing all the ultimate class frequencies, we can find all the remaining class frequencies.

Fundamental Set of Class Frequencies:

A set of 2ⁿ class frequencies from which all the remaining class frequencies can be found is called a fundamental set.

The set of ultimate class frequencies and that of positive class frequencies are the fundamental sets. Any set of 2ⁿ frequencies from which we can obtain all the positive class frequencies will also form a fundamental set. In case of data on two attributes A and B, the set of ultimate class frequencies; {(AB), (Aβ), (αB), (αβ)} and the set of positive class frequencies; {(N), (A), (B), (AB)} are the fundamental sets.
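To illustrate that both sets carry the same information for two attributes, the following sketch converts one fundamental set into the other and back. The function names and the numbers are assumptions made for illustration only.

```python
def ultimate_from_positive(N, A, B, AB):
    """Ultimate class frequencies from the positive set {N, (A), (B), (AB)}."""
    return {
        "AB": AB,
        "Aβ": A - AB,          # (Aβ) = (A) - (AB)
        "αB": B - AB,          # (αB) = (B) - (AB)
        "αβ": N - A - B + AB,  # (αβ) = (α) - (αB) = N - (A) - (B) + (AB)
    }

def positive_from_ultimate(u):
    """Positive class frequencies from the ultimate set."""
    return {
        "N": u["AB"] + u["Aβ"] + u["αB"] + u["αβ"],
        "A": u["AB"] + u["Aβ"],
        "B": u["AB"] + u["αB"],
        "AB": u["AB"],
    }

# Hypothetical data: N = 100, (A) = 60, (B) = 50, (AB) = 35.
ultimate = ultimate_from_positive(N=100, A=60, B=50, AB=35)
print(ultimate)
print(positive_from_ultimate(ultimate))
```

The round trip returns the original four numbers, which is what makes either set fundamental.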

***

Method of Operator N:

Since the set of positive class frequencies can specify all the remaining frequencies, it becomes necessary to find a method by which every other class frequency can be easily expressed in terms of positive class frequencies. Such a method is provided by method of operator N. Here N is used as an operator and when it operates on a class symbol, it gives the frequency of that class.

Thus, A•N = (A), α•N = (α)

Adding, A•N + α•N = (A) + (α) = N,

i.e. (A + α) • N = N, so that symbolically A + α = 1, giving A = 1 – α or α = 1 – A. Similarly, β = 1 – B and γ = 1 – C.

These relations convert the negative class symbols into positive class symbols.

For example, (αβ) = αβ • N = (1 - A) (1 - B) • N

= (1 - A – B + AB) • N

= N – (A) – (B) + (AB)

(Aβγ) = Aβγ • N = A (1 - B) (1 - C) • N

= (A – AB – AC + ABC) • N

= (A) – (AB) – (AC) + (ABC)

(αβγ) = αβγ • N= (1 - A) (1 - B) (1 - C) • N

= (1 – A - B + AB) (1 – C) • N

= (1 – A – B + AB – C + AC + BC – ABC) • N

= N – (A) – (B) – (C) + (AB) + (AC) + (BC) – (ABC)

This way we can express every class frequency in terms of positive class frequencies.
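The expansion can be mechanized. The sketch below substitutes α = 1 – A, β = 1 – B, γ = 1 – C and multiplies out, giving the signed positive class terms. The names expand and frequency and the sample numbers are hypothetical and not part of the method as stated above.

```python
from itertools import combinations

NEGATIVE_TO_POSITIVE = {"α": "A", "β": "B", "γ": "C"}

def expand(symbol):
    """Expand a class symbol into signed positive class terms.

    Each negative letter contributes a factor (1 - positive letter).
    Multiplying out gives one term per subset of the negative letters,
    with sign (-1) ** (size of the subset). The key "N" is the term 1 • N.
    """
    positives = [s for s in symbol if s not in NEGATIVE_TO_POSITIVE]
    negatives = [NEGATIVE_TO_POSITIVE[s] for s in symbol if s in NEGATIVE_TO_POSITIVE]
    terms = {}
    for size in range(len(negatives) + 1):
        for subset in combinations(negatives, size):
            key = "".join(sorted(positives + list(subset))) or "N"
            terms[key] = terms.get(key, 0) + (-1) ** size
    return terms

def frequency(symbol, positive):
    """Class frequency of `symbol` computed from positive class frequencies."""
    return sum(coef * positive[key] for key, coef in expand(symbol).items())

# Hypothetical positive class frequencies (made-up numbers).
positive = {"N": 100, "A": 50, "B": 50, "C": 65,
            "AB": 30, "AC": 35, "BC": 32, "ABC": 20}

print(expand("αβ"))                 # terms of N - (A) - (B) + (AB)
print(frequency("Aβγ", positive))   # (A) - (AB) - (AC) + (ABC)
print(frequency("αβγ", positive))
```

The dictionary returned by expand mirrors the worked expansions above term by term.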

Note: The relationship among various order class frequencies in case of three attributes A, B and C can be expressed in tabular form as follows. With the help of this table we can determine the missing frequencies easily. (Suggested by Mr. K. R. Pawar and Modified by Dr. B. G. Kore)

	A			α			Total
	C	γ	Total	C	γ	Total	C	γ	Total
B	(ABC)			(αBC)			(BC)
		(ABγ)			(αBγ)			(Bγ)
			(AB)			(αB)			(B)
β	(AβC)			(αβC)			(βC)
		(Aβγ)			(αβγ)			(βγ)
			(Aβ)			(αβ)			(β)
Total	(AC)			(αC)			(C)
		(Aγ)			(αγ)			(γ)
			(A)			(α)			N

***

Consistency of Data:

Data are said to be consistent if the given class frequencies conform with one another and do not conflict in any way. For instance, if (A) = 40 and (AB) = 42, then (A) and (AB) are inconsistent, since there cannot be more than 40 units which possess attribute A as well as attribute B.

For consistency, it is necessary that no class frequency be negative. Since all class frequencies can be expressed as sums of ultimate class frequencies, it is also sufficient for consistency that every ultimate class frequency be non-negative.

Conditions for Consistency of data:

i) For a single attribute A, the conditions for consistency are:

(i) (A) ≥ 0, (ii) (α) ≥ 0 ⇒ (A) ≤ N ∵ N = (A) + (α) ….. (*)

i.e. the frequency of every first order class is less than or equal to N.

ii) For two attributes A and B, the conditions for consistency are:

(i) (AB) ≥ 0,

(ii) (Aβ) ≥ 0 ⇒ (AB) ≤ (A),

(iii) (αB) ≥ 0 ⇒ (AB) ≤ (B) ….. (**)

From (ii) and (iii), (AB) must be less than or equal to the smaller of (A) and (B).

(iv) (αβ) ≥ 0 ⇒ (AB) ≥ (A) + (B) – N ∵ (αβ) = N -(A)-(B)+(AB)
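Taken together, conditions (i) to (iv) bound (AB) from below by the larger of 0 and (A) + (B) – N, and from above by the smaller of (A) and (B). A small sketch of this check follows; the function names are hypothetical, and the values of N and (B) in the first example are assumed, since the earlier illustration only gives (A) = 40 and (AB) = 42.

```python
def ab_bounds(N, A, B):
    """Lower and upper limits for (AB) implied by conditions (i) to (iv)."""
    return max(0, A + B - N), min(A, B)

def is_consistent(N, A, B, AB):
    """True if N, (A), (B), (AB) satisfy the single- and two-attribute conditions."""
    lower, upper = ab_bounds(N, A, B)
    return 0 <= A <= N and 0 <= B <= N and lower <= AB <= upper

# (A) = 40 and (AB) = 42 as in the text; N = 100 and (B) = 50 are assumed values.
print(is_consistent(N=100, A=40, B=50, AB=42))  # violates (AB) <= (A)
print(is_consistent(N=100, A=60, B=50, AB=35))
print(is_consistent(N=100, A=60, B=50, AB=5))   # violates (AB) >= (A) + (B) - N
```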

iii) For three attributes A, B and C, the conditions for consistency are:

(i) (ABC) ≥ 0

(ii) (ABγ) ≥ 0 ⇒ (ABC)≤(AB) ∵ (ABγ)=(AB)-(ABC)

(iii) (AβC) ≥ 0 ⇒ (ABC) ≤ (AC),

(iv) (αBC) ≥ 0 ⇒ (ABC) ≤ (BC)

∴ From (ii), (iii) and (iv), the third order class frequency (ABC) cannot exceed the smallest of the three corresponding second order frequencies (AB), (AC) and (BC).

(v) (Aβγ) ≥ 0 ⇒ (ABC) ≥ (AB) + (AC) – (A)

∵ (Aβγ) = (A) – (AB) – (AC) + (ABC)

(vi) (αBγ) ≥ 0 ⇒ (ABC) ≥ (AB) + (BC) – (B)

(vii) (αβC) ≥ 0 ⇒ (ABC) ≥ (AC) + (BC) – (C)

(viii) (αβγ) ≥ 0 ⇒ (ABC) ≤ (AB) + (BC) + (AC)-(A)-(B)-(C)+N

∵ (αβγ) = N - (A) - (B) - (C) + (AB) + (AC) + (BC) - (ABC)

From (i) and (viii) we have,

(AB) + (BC) + (AC) ≥ (A) + (B) + (C) - N

From (ii) and (vii) (AB) ≥ (AC) + (BC) - (C)

From (iii) and (vi) (AC) ≥ (AB) + (BC) - (B)

From (iv) and (v) (BC) ≥ (AB) + (AC) - (A)

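Conditions (i) to (viii) amount to requiring that all eight ultimate class frequencies be non-negative, so a consistency check can simply compute them and report any that are negative. This is a sketch under that reading; the function names and both data sets are invented for illustration.

```python
def ultimate_from_positive(N, A, B, C, AB, AC, BC, ABC):
    """Ultimate class frequencies for three attributes from the positive ones."""
    return {
        "ABC": ABC,
        "ABγ": AB - ABC,
        "AβC": AC - ABC,
        "αBC": BC - ABC,
        "Aβγ": A - AB - AC + ABC,
        "αBγ": B - AB - BC + ABC,
        "αβC": C - AC - BC + ABC,
        "αβγ": N - A - B - C + AB + AC + BC - ABC,
    }

def inconsistent_classes(**positive):
    """Return the ultimate classes whose computed frequency is negative."""
    ultimate = ultimate_from_positive(**positive)
    return [cls for cls, count in ultimate.items() if count < 0]

# Hypothetical consistent data: no class is reported.
print(inconsistent_classes(N=100, A=50, B=50, C=65, AB=30, AC=35, BC=32, ABC=20))

# Hypothetical inconsistent data: here (BC) < (AB) + (AC) - (A).
print(inconsistent_classes(N=100, A=50, B=60, C=50, AB=45, AC=40, BC=20, ABC=15))
```

An empty list means the data are consistent; otherwise the listed classes show which condition fails.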

***

Independence of Attributes

Two attributes A and B are said to be independent if there does not exist any kind of relationship between them.

In other words, two attributes A and B are said to be independent (i) if the proportion of A’s among the B’s is the same as that among the β’s, or equivalently (ii) if the proportion of B’s among the A’s is the same as that among the α’s. For example, if smoking and sex are independent, the proportion of smokers among males must be the same as the proportion of smokers among females.
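Criterion (i) can be checked directly from the four ultimate class frequencies by comparing (AB)/(B) with (Aβ)/(β). The sketch below uses exact fractions to avoid rounding; the function names and the numbers are hypothetical, and it assumes both (B) and (β) are non-zero.

```python
from fractions import Fraction

def proportions_of_A(AB, A_beta, alpha_B, alpha_beta):
    """Return the proportion of A's among the B's and among the β's."""
    B = AB + alpha_B             # (B) = (AB) + (αB)
    beta = A_beta + alpha_beta   # (β) = (Aβ) + (αβ)
    if B == 0 or beta == 0:
        raise ValueError("both (B) and (β) must be non-zero")
    return Fraction(AB, B), Fraction(A_beta, beta)

def is_independent(AB, A_beta, alpha_B, alpha_beta):
    """True if the proportion of A's is the same among the B's and the β's."""
    in_B, in_beta = proportions_of_A(AB, A_beta, alpha_B, alpha_beta)
    return in_B == in_beta

# Hypothetical frequencies (AB), (Aβ), (αB), (αβ).
print(is_independent(30, 20, 30, 20))  # equal proportions in both groups
print(is_independent(40, 10, 20, 30))  # unequal proportions
```

In real data the two proportions are rarely exactly equal, so an exact comparison like this only illustrates the definition.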

***

Association of Attributes:

Two attributes A and B are said to be associated if they are not independent but are related in some way or other. In other words, two attributes are said to be associated if the proportion of one of them in the presence of the other is not equal to that in the absence of the other; the first proportion may be either greater or smaller than the second.

For example, if we find that the proportion of smokers among males is not the same as that among females, we shall say that the attributes ‘being a male’ and ‘being a smoker’ are associated. Formally, we can define association of two attributes A and B as follows: