1. Introduction
Pawlak is credited with creating the “rough set theory” [1], a
mathematical tool for dealing with vagueness or uncertainty.
Since 1982, the theory and applications of rough sets have
developed impressively. There are many applications of rough
set theory, especially in data analysis, artificial intelligence,
and cognitive sciences [2–4]. Some basic aspects of the
research of rough sets and several applications have recently
been presented by Pawlak and Skowron [5, 6]. Rough set
theory [5–8] is an extension of set theory in which a
subset of a universe is described by a pair of ordinary
sets called the lower and upper approximation. Yao [9]
broadly classified methods for the development of rough
set theory into two classes, namely, the constructive and
axiomatic (algebraic) approaches. In constructive methods,
lower and upper approximations are constructed from the
primitive notions, such as equivalence relations on a universe
and neighborhood systems. In rough sets, the equivalence
classes are the building blocks for the construction of the
lower and upper approximations. The lower approximation
of a given set is the union of all the equivalence classes
which are subsets of the set, and the upper approximation
is the union of all the equivalence classes which have a
nonempty intersection with the set. It is well known that a
partition induces an equivalence relation on a set and vice
versa. The properties of rough sets can thus be examined
via either partition or equivalence classes. Rough sets are a
suitable mathematical model of vague concepts. The main
idea of rough sets corresponds to the lower and upper
approximations. Pawlak’s definitions for lower and upper
approximations were originally introduced with reference
to an equivalence relation. Many interesting properties of
the lower and upper approximations have been derived by
Pawlak and Skowron [5, 6] based on the equivalence relations.
However, the equivalence relation appears to be a stringent
condition that may limit the applicability of Pawlak’s rough
set model. Many extensions have been made in recent years
by replacing equivalence relation or partition by notions
such as binary relations [10–12], neighborhood systems, and
Boolean algebras [12–16]. Abu-Donia [17] discussed three
types of upper (lower) approximations based on the right
neighborhoods of a general relation, and also generalized
these types in two ways by using a family of finite binary
relations. Many proposals have been made for generalizing
and interpreting rough sets [4, 18–25]. In 1983, Abd El-
Monsef et al. [26] introduced the concept of 𝛽-open sets.
In 1986, Maki [27] introduced the concept of ⋀-sets in
topological spaces as the sets that coincide with their kernel.
The kernel of a set 𝐴 is the intersection of all open supersets of
𝐴. In 2004, Noiri and Hatir [28] introduced the ⋀sp-sets (or
⋀𝛽-sets) and investigated some of their properties. In 2008,
Abu-Donia and Salama [29] introduced and investigated the
concept of 𝛽-approximation space. The theory of rough sets
can be generalized in several directions. Within the set-theoretic
framework, generalizations of the element based
definition can be obtained by using nonequivalence binary
relations [9, 23, 30–32], generalizations of the granule based
definition can be obtained by using coverings [12, 30, 33–
35], and generalizations of subsystem based definition can be
obtained by using other subsystems [36, 37]. In the standard
rough set model, the same subsystem is used to define lower
and upper approximation operators. When generalizing the
subsystem based definition, one may use two subsystems,
one for the lower approximation operator and the other
for the upper approximation operator. Yao [24] defined a
pair of generalized approximation operators by replacing the
equivalence classes with the family of open sets for the lower
approximation operator (an interior operator) and with the
family of closed sets for the upper approximation operator (a
closure operator). In this paper, we use a new subsystem, called
⋀𝛽-sets, to define new types of lower and upper approximation
operators, called the ⋀𝛽-lower approximation and the ⋀𝛽-upper
approximation. We study ⋀𝛽-rough sets, compare this concept
with classical rough sets, and give some counterexamples.
2. Basic Concepts
A topological space [10] is a pair (𝑋, 𝜏) consisting of a
set 𝑋 and a family 𝜏 of subsets of 𝑋 satisfying the following
conditions:
(1) 𝜙, 𝑋 ∈ 𝜏,
(2) 𝜏 is closed under arbitrary union,
(3) 𝜏 is closed under finite intersection.
The pair (𝑋, 𝜏) is called a topological space, the elements
of 𝑋 are called points of the space, the subsets of 𝑋 belonging
to 𝜏 are called open sets in the space, and the complements
of the subsets of 𝑋 belonging to 𝜏 are called closed sets. The
family 𝜏 of open subsets of 𝑋 is also called a topology for 𝑋.
$\overline{A} = \bigcap\{F \subseteq X : A \subseteq F \text{ and } F \text{ is closed}\}$ is called the $\tau$-closure of a subset $A \subseteq X$.
Evidently, $\overline{A}$ is the smallest closed subset of $X$ which contains $A$. Note that $A$ is closed if and only if $A = \overline{A}$.
$A^{\circ} = \bigcup\{G \subseteq X : G \subseteq A \text{ and } G \text{ is open}\}$ is called the $\tau$-interior of a subset $A \subseteq X$.
Evidently, $A^{\circ}$ is the union of all open subsets of $X$ contained in $A$. Note that $A$ is open if and only if $A = A^{\circ}$. Finally, $b(A) = \overline{A} - A^{\circ}$ is called the $\tau$-boundary of a subset $A \subseteq X$.
Let $A$ be a subset of a topological space $(X, \tau)$, and let $\overline{A}$, $A^{\circ}$, and $b(A)$ be the closure, interior, and boundary of $A$, respectively. $A$ is exact if $b(A) = \phi$; otherwise $A$ is rough. It is clear that $A$ is exact if and only if $\overline{A} = A^{\circ}$.
Definition 1 (see [26]). A subset 𝐴 of a topological space
(𝑋, 𝜏) is called 𝛽-open if $A \subseteq \overline{(\overline{A})^{\circ}}$.
The complement of 𝛽-open set is called 𝛽-closed set.
We denote the family of all 𝛽-open (resp., 𝛽-closed) sets by
𝛽𝑂(𝑋) (resp., 𝛽𝐶(𝑋)).
Remark 2. For any topological space (𝑋, 𝜏), we have 𝜏 ⊆ 𝛽𝑂(𝑋).
Definition 3 (see [28]). Let 𝐴 be a subset of a topological space
(𝑋, 𝜏). A subset ⋀𝛽(𝐴) is defined as follows: ⋀𝛽(𝐴) = ⋂{𝐺 :
𝐴 ⊆ 𝐺, 𝐺 ∈ 𝛽𝑂(𝑋)}.
The complement of a ⋀𝛽(𝐴)-set is called a ⋁𝛽(𝐴)-set.
Noiri and Hatir [28] stated some properties of ⋀𝛽(𝐴) in
the following lemma.
Lemma 4. For subsets 𝐴, 𝐵, and 𝐴𝛼 (𝛼 ∈ Δ) of a topological
space (𝑋, 𝜏), the following hold.
(1) 𝐴 ⊆ ⋀𝛽(𝐴).
(2) If 𝐴 ⊆ 𝐵, then ⋀𝛽(𝐴) ⊆ ⋀𝛽(𝐵).
(3) ⋀𝛽(⋀𝛽(𝐴)) = ⋀𝛽(𝐴).
(4) If 𝐴 ∈ 𝛽𝑂(𝑋), then 𝐴 = ⋀𝛽(𝐴).
(5) ⋀𝛽(⋃{𝐴𝛼 : 𝛼 ∈ Δ}) = ⋃{⋀𝛽(𝐴𝛼) : 𝛼 ∈ Δ}.
(6) ⋀𝛽(⋂{𝐴𝛼 : 𝛼 ∈ Δ}) ⊆ ⋂{⋀𝛽(𝐴𝛼) : 𝛼 ∈ Δ}.
Definition 5. A subset 𝐴 of a topological space (𝑋, 𝜏) is called
a ⋀𝛽-set if 𝐴 = ⋀𝛽(𝐴).
Lemma 6. For subsets 𝐴 and 𝐴𝛼, 𝛼 ∈ Δ of a topological space
(𝑋, 𝜏), the following hold.
(1) ⋀𝛽(𝐴) is a ⋀𝛽-set.
(2) If 𝐴 is 𝛽-open, then 𝐴 is a ⋀𝛽-set.
(3) If 𝐴𝛼 is a ⋀𝛽-set for each 𝛼 ∈ Δ, then ⋃𝛼∈Δ 𝐴𝛼 is a ⋀𝛽-set.
(4) If 𝐴𝛼 is a ⋀𝛽-set for each 𝛼 ∈ Δ, then ⋂𝛼∈Δ 𝐴𝛼 is a ⋀𝛽-set.
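Definition 5 and Lemma 6 admit a direct computational reading. The following is a minimal Python sketch (our own illustration, not from the paper) that computes the kernel ⋀𝛽(𝐴) as the intersection of all 𝛽-open supersets of 𝐴 and tests the fixed-point property; the 𝛽-open family below is hand-picked purely for illustration, not derived from a topology.

def lambda_beta(A, beta_open_family, X):
    # Kernel of Definition 3: intersection of all G in βO(X) with A ⊆ G.
    # X itself is β-open, so the intersection is never over an empty family.
    out = set(X)
    for G in beta_open_family:
        if set(A) <= set(G):
            out &= set(G)
    return out

def is_lambda_beta_set(A, beta_open_family, X):
    # Definition 5: A is a ⋀β-set iff A = ⋀β(A).
    return set(A) == lambda_beta(A, beta_open_family, X)

# Hand-picked illustrative family on X = {1, 2, 3} (assumed, for illustration only).
X = {1, 2, 3}
bo = [set(), {1}, {1, 2}, {2, 3}, X]
print(lambda_beta({2}, bo, X))   # {1,2} ∩ {2,3} ∩ X = {2}, so {2} is a ⋀β-set
print(lambda_beta({3}, bo, X))   # {2,3} ∩ X = {2,3}, so {3} is not a ⋀β-set

Note how the sketch reflects Lemma 4(1) and 4(3): the kernel always contains 𝐴 and is idempotent.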
Definition 7. Let (𝑋, 𝜏) be a topological space; a subset 𝐴 ⊆
𝑋 is called:
(1) 𝛼-open [38] if $A \subseteq (\overline{A^{\circ}})^{\circ}$,
(2) preopen [39] if $A \subseteq (\overline{A})^{\circ}$.
Remark 8. The class of all ⋀𝛽-sets contains each of the classes
of open (resp., 𝛼-open, preopen, and 𝛽-open) sets, as shown in
the following diagram:

Open → 𝛼-open → Preopen → 𝛽-open → ⋀𝛽-set. (1)
Motivation for rough set theory has come from the need
to represent subsets of a universe in terms of equivalence
classes of a partition of that universe. The partition characterizes
a topological space, called approximation space 𝐾 =
(𝑋, 𝑅), where 𝑋 is a set called the universe and 𝑅 is an
equivalence relation [2]. The equivalence classes of 𝑅 are also
known as the granules, elementary sets, or blocks; we will use
𝑅𝑥 ⊆ 𝑋 to denote the equivalence class containing 𝑥 ∈ 𝑋. In
the approximation space, we consider two operators

$\underline{R}(A) = \{x \in X : R_x \subseteq A\},$
$\overline{R}(A) = \{x \in X : R_x \cap A \neq \phi\},$ (2)

called the lower approximation and upper approximation of
$A \subseteq X$, respectively. Also, let $POS_R(A) = \underline{R}(A)$ denote the
positive region of 𝐴, $NEG_R(A) = X - \overline{R}(A)$ denote the
negative region of 𝐴, and $BN_R(A) = \overline{R}(A) - \underline{R}(A)$ denote the
borderline region of 𝐴.
Let 𝑋 be a finite nonempty universe and 𝐴 ⊆ 𝑋. The degree
of completeness can also be characterized by the accuracy
measure, as follows:

$\alpha_R(A) = \dfrac{|\underline{R}(A)|}{|\overline{R}(A)|}, \quad A \neq \phi,$ (3)
where | ⋅ | represents the cardinality of a set. Accuracy measures
try to express the degree of completeness of knowledge.
$\alpha_R(A)$ is able to capture how large the boundary region
of the data set is; however, it cannot easily capture the
structure of the knowledge. A fundamental advantage of
rough set theory is the ability to handle a category that cannot
be sharply defined given a knowledge base. Characteristics
of the potential data sets can be measured through the
rough sets framework. We can measure inexactness and
express topological characterization of imprecision with the
following.
(1) If $\underline{R}(A) \neq \phi$ and $\overline{R}(A) \neq X$, then 𝐴 is roughly 𝑅-definable.
(2) If $\underline{R}(A) = \phi$ and $\overline{R}(A) \neq X$, then 𝐴 is internally 𝑅-undefinable.
(3) If $\underline{R}(A) \neq \phi$ and $\overline{R}(A) = X$, then 𝐴 is externally 𝑅-undefinable.
(4) If $\underline{R}(A) = \phi$ and $\overline{R}(A) = X$, then 𝐴 is totally 𝑅-undefinable.
We denote the set of all roughly 𝑅-definable (resp., internally
𝑅-undefinable, externally 𝑅-undefinable, and totally 𝑅-
undefinable) sets by RD(𝑋) (resp., IUD(𝑋), EUD(𝑋), and
TUD(𝑋)).
With $\alpha_R(A)$ and the classifications above, we can characterize
rough sets by the size of the boundary region and by structure.
Rough sets are treated as a special case of relative sets and
integrated with the notion of Belnap’s logic [22].
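To make the operators in (2) and (3) concrete, here is a minimal Python sketch (our own illustration; the universe, partition, and set below are invented) that computes the lower and upper approximations, the accuracy measure, and the four-way classification above for an equivalence relation given by its classes.

def lower(classes, A):
    # Union of the equivalence classes contained in A.
    return {x for c in classes if c <= A for x in c}

def upper(classes, A):
    # Union of the equivalence classes meeting A.
    return {x for c in classes if c & A for x in c}

def accuracy(classes, A):
    # α_R(A) = |lower| / |upper|, for nonempty A.
    return len(lower(classes, A)) / len(upper(classes, A))

def classify(classes, A, X):
    lo, up = lower(classes, A), upper(classes, A)
    if lo and up != X:
        return "roughly R-definable"
    if not lo and up != X:
        return "internally R-undefinable"
    if lo and up == X:
        return "externally R-undefinable"
    return "totally R-undefinable"

X = {1, 2, 3, 4, 5, 6}
classes = [{1, 2}, {3, 4}, {5, 6}]           # a partition of X
A = {1, 2, 3}
print(lower(classes, A), upper(classes, A))  # {1, 2} and {1, 2, 3, 4}
print(accuracy(classes, A))                  # 0.5
print(classify(classes, A, X))               # roughly R-definable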
Remark 9. We denote by 𝑅𝛽 the relation used to generate a subbase
for a topology 𝜏 on 𝑋 and the class of 𝛽-open sets 𝛽𝑂(𝑋).
Also, we denote the 𝛽-approximation space by (𝑋, 𝑅𝛽).
Definition 10. Let (𝑋, 𝑅𝛽) be a 𝛽-approximation space. The 𝛽-lower
(resp., 𝛽-upper) approximation of any nonempty subset
𝐴 of 𝑋 is defined as:

$\underline{R}_{\beta}(A) = \bigcup\{G \in \beta O(X) : G \subseteq A\},$
$\overline{R}_{\beta}(A) = \bigcap\{F \in \beta C(X) : F \supseteq A\}.$
We can obtain the 𝛽-approximation operators as follows.
(1) Get the right neighborhoods 𝑥𝑅 from the given
relation 𝑅 as 𝑥𝑅 = {𝑦 : 𝑥𝑅𝑦}.
(2) Use the right neighborhoods 𝑥𝑅 as a subbase to generate
the topology 𝜏.
(3) Use the open sets in the topology 𝜏 to get the family
of 𝛽-open sets (from Definition 1).
(4) Use the set of all 𝛽-open sets to get the 𝛽-approximation
operators (from Definition 10).
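The four steps can be prototyped for a small finite universe. The sketch below is ours (all function names are our own, and the brute-force enumeration is exponential in |𝑋|, so it only illustrates the constructions); the relation used is the one of Example 17 below.

from itertools import chain, combinations

def powerset(X):
    xs = list(X)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def topology_from_subbase(X, subbase):
    # Step 2: finite intersections of subbase members form a base
    # (the empty intersection is X); arbitrary unions of base members give τ.
    U = frozenset(X)
    base = {U}
    for r in range(1, len(subbase) + 1):
        for combo in combinations(list(subbase), r):
            s = U
            for g in combo:
                s = s & g
            base.add(s)
    tau = {frozenset()}
    base = list(base)
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            tau.add(frozenset().union(*combo))
    return tau

def interior(A, tau):
    return frozenset().union(*(G for G in tau if G <= A))

def closure(A, tau, X):
    U = frozenset(X)
    return U.intersection(*((U - G) for G in tau if A <= (U - G)))

def beta_open_family(X, tau):
    # Step 3: A is β-open iff A ⊆ cl(int(cl(A))) (Definition 1).
    return {A for A in powerset(X)
            if A <= closure(interior(closure(A, tau, X), tau), tau, X)}

def beta_lower(A, bo):
    # Step 4: union of the β-open subsets of A (Definition 10).
    return frozenset().union(*(G for G in bo if G <= A))

def beta_upper(A, bo, X):
    # Intersection of the β-closed supersets of A (complements of β-open sets).
    U = frozenset(X)
    return U.intersection(*((U - G) for G in bo if A <= (U - G)))

# Step 1: right neighborhoods xR = {y : xRy} of the relation in Example 17.
X = {'a', 'b', 'c', 'd'}
R = {('a', 'a'), ('a', 'c'), ('a', 'd'), ('b', 'b'), ('b', 'd'),
     ('c', 'a'), ('c', 'b'), ('c', 'd'), ('d', 'a')}
xR = {x: frozenset(y for (u, y) in R if u == x) for x in X}
tau = topology_from_subbase(X, set(xR.values()))
bo = beta_open_family(X, tau)
A = frozenset({'b', 'd'})
print(sorted(beta_lower(A, bo)), sorted(beta_upper(A, bo, X)))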
Definition 11. Let (𝑋, 𝑅𝛽) be a 𝛽-approximation space and
𝐴 ⊆ 𝑋. Then there are memberships $\underline{\in}$, $\overline{\in}$, $\underline{\in}_{\beta}$, and
$\overline{\in}_{\beta}$, called strong, weak, 𝛽-strong, and 𝛽-weak memberships,
respectively, defined by
(1) $x \underline{\in} A$ iff $x \in \underline{R}(A)$,
(2) $x \overline{\in} A$ iff $x \in \overline{R}(A)$,
(3) $x \underline{\in}_{\beta} A$ iff $x \in \underline{R}_{\beta}(A)$,
(4) $x \overline{\in}_{\beta} A$ iff $x \in \overline{R}_{\beta}(A)$.
Remark 12. According to Definition 11, the 𝛽-lower and 𝛽-upper
approximations of a set 𝐴 ⊆ 𝑋 can be written as
$\underline{R}_{\beta}(A) = \{x \in X : x \underline{\in}_{\beta} A\},$
$\overline{R}_{\beta}(A) = \{x \in X : x \overline{\in}_{\beta} A\}.$
Definition 13. Let (𝑋, 𝑅𝛽) be a 𝛽-approximation space and
𝐴 ⊆ 𝑋. The 𝛽-accuracy measure of 𝐴 is defined as follows:

$\alpha_{R_{\beta}}(A) = \dfrac{|\underline{R}_{\beta}(A)|}{|\overline{R}_{\beta}(A)|}, \quad A \neq \phi.$ (4)
Definition 14. Let (𝑋, 𝑅𝛽) be a 𝛽-approximation space; the set
𝐴 ⊆ 𝑋 is called
(1) roughly 𝑅𝛽-definable if $\underline{R}_{\beta}(A) \neq \phi$ and $\overline{R}_{\beta}(A) \neq X$,
(2) internally 𝑅𝛽-undefinable if $\underline{R}_{\beta}(A) = \phi$ and $\overline{R}_{\beta}(A) \neq X$,
(3) externally 𝑅𝛽-undefinable if $\underline{R}_{\beta}(A) \neq \phi$ and $\overline{R}_{\beta}(A) = X$,
(4) totally 𝑅𝛽-undefinable if $\underline{R}_{\beta}(A) = \phi$ and $\overline{R}_{\beta}(A) = X$.
We denote the set of all roughly 𝑅𝛽-definable (resp.,
internally 𝑅𝛽-undefinable, externally 𝑅𝛽-undefinable, and
totally 𝑅𝛽-undefinable) sets by 𝛽RD(𝑋) (resp., 𝛽IUD(𝑋),
𝛽EUD(𝑋) and 𝛽TUD(𝑋)).
Remark 15. For any 𝛽-approximation space (𝑋, 𝑅𝛽) the following
hold:
(1) 𝛽RD(𝑋) ⊇ RD(𝑋),
(2) 𝛽IUD(𝑋) ⊆ IUD(𝑋),
(3) 𝛽EUD(𝑋) ⊆ EUD(𝑋),
(4) 𝛽TUD(𝑋) ⊆ TUD(𝑋).
3. A New Type of Rough Classification
Based on ⋀𝛽-Sets
In this section, we introduce and investigate the concept
of ⋀𝛽-approximation spaces. Also, we introduce the concepts
of ⋀𝛽-lower approximation and ⋀𝛽-upper approximation for
any subset and study their properties.
Remark 16. We denote by 𝑅∧𝛽 the relation used to generate a
subbase for a topology 𝜏 on 𝑋 and the class of ⋀𝛽-sets.
Also, we denote the ⋀𝛽-approximation space by (𝑋, 𝑅∧𝛽).
Example 17. Let 𝑋 = {𝑎, 𝑏, 𝑐, 𝑑} be a universe and 𝑅∧𝛽 a
relation defined by 𝑅∧𝛽 = {(𝑎, 𝑎), (𝑎, 𝑐), (𝑎, 𝑑), (𝑏, 𝑏), (𝑏, 𝑑),
(𝑐, 𝑎), (𝑐, 𝑏), (𝑐, 𝑑), (𝑑, 𝑎)}; thus 𝑎𝑅∧𝛽 = {𝑎, 𝑐, 𝑑}, 𝑏𝑅∧𝛽 = {𝑏, 𝑑},
𝑐𝑅∧𝛽 = {𝑎, 𝑏, 𝑑}, and 𝑑𝑅∧𝛽 = {𝑎}.
Then the topology associated with this relation is
𝜏 = {𝑋, 𝜙, {𝑎}, {𝑑}, {𝑎, 𝑑}, {𝑏, 𝑑}, {𝑎, 𝑏, 𝑑}, {𝑎, 𝑐, 𝑑}}, and the family
of ⋀𝛽-sets is {𝑋, 𝜙, {𝑎}, {𝑐}, {𝑑}, {𝑎, 𝑐}, {𝑎, 𝑑}, {𝑏, 𝑑}, {𝑐, 𝑑},
{𝑎, 𝑏, 𝑑}, {𝑎, 𝑐, 𝑑}, {𝑏, 𝑐, 𝑑}}. So (𝑋, 𝑅∧𝛽) is a ⋀𝛽-approximation space.
Definition 18. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space. The ⋀𝛽-lower
approximation and ⋀𝛽-upper approximation of any
nonempty subset 𝐴 of 𝑋 are defined as

$\underline{\Lambda}_{\beta}(A) = \bigcup\{G : G \text{ is a } \Lambda_{\beta}\text{-set}, G \subseteq A\},$
$\overline{\Lambda}_{\beta}(A) = \bigcap\{F : F \text{ is a } \vee_{\beta}\text{-set}, F \supseteq A\}.$
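Continuing the Python sketch from Section 2 (reusing powerset, tau, bo, and X computed there), the ⋀𝛽-operators of Definition 18 can be prototyped by first collecting the ⋀𝛽-sets as the fixed points of the kernel operator; complements of ⋀𝛽-sets play the role of the ⋁𝛽-sets.

def lambda_beta_sets(X, bo):
    # ⋀β-sets: fixed points A = ⋀β(A) of the kernel (Definition 5).
    U = frozenset(X)
    out = set()
    for A in powerset(X):
        kernel = U.intersection(*(G for G in bo if A <= G))  # X is β-open
        if kernel == A:
            out.add(A)
    return out

def lambda_lower(A, lsets):
    # Union of the ⋀β-sets contained in A.
    return frozenset().union(*(G for G in lsets if G <= A))

def lambda_upper(A, lsets, X):
    # Intersection of the ⋁β-sets (complements of ⋀β-sets) containing A.
    U = frozenset(X)
    return U.intersection(*((U - G) for G in lsets if A <= (U - G)))

lsets = lambda_beta_sets(X, bo)
A = frozenset({'c', 'd'})
print(sorted(lambda_lower(A, lsets)), sorted(lambda_upper(A, lsets, X)))
# expected from Example 17 and Table 1: ['c', 'd'] and ['b', 'c', 'd']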
The following proposition shows the properties of ⋀𝛽-
lower approximation and ⋀𝛽-upper approximation of any
nonempty subset.
Proposition 19. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space
and 𝐴, 𝐵 ⊆ 𝑋. Then:
(1) $\underline{\Lambda}_{\beta}(A) \subseteq A \subseteq \overline{\Lambda}_{\beta}(A)$,
(2) $\underline{\Lambda}_{\beta}(\phi) = \overline{\Lambda}_{\beta}(\phi) = \phi$ and $\underline{\Lambda}_{\beta}(X) = \overline{\Lambda}_{\beta}(X) = X$,
(3) if $A \subseteq B$, then $\underline{\Lambda}_{\beta}(A) \subseteq \underline{\Lambda}_{\beta}(B)$ and $\overline{\Lambda}_{\beta}(A) \subseteq \overline{\Lambda}_{\beta}(B)$,
(4) $\underline{\Lambda}_{\beta}(X \setminus A) = X \setminus \overline{\Lambda}_{\beta}(A)$,
(5) $\overline{\Lambda}_{\beta}(X \setminus A) = X \setminus \underline{\Lambda}_{\beta}(A)$,
(6) $\underline{\Lambda}_{\beta}(\underline{\Lambda}_{\beta}(A)) = \underline{\Lambda}_{\beta}(A)$,
(7) $\overline{\Lambda}_{\beta}(\overline{\Lambda}_{\beta}(A)) = \overline{\Lambda}_{\beta}(A)$,
(8) $\underline{\Lambda}_{\beta}(\underline{\Lambda}_{\beta}(A)) \subseteq \overline{\Lambda}_{\beta}(\underline{\Lambda}_{\beta}(A))$,
(9) $\underline{\Lambda}_{\beta}(\overline{\Lambda}_{\beta}(A)) \subseteq \overline{\Lambda}_{\beta}(\overline{\Lambda}_{\beta}(A))$,
(10) $\underline{\Lambda}_{\beta}(A \cup B) \supseteq \underline{\Lambda}_{\beta}(A) \cup \underline{\Lambda}_{\beta}(B)$,
(11) $\overline{\Lambda}_{\beta}(A \cup B) \supseteq \overline{\Lambda}_{\beta}(A) \cup \overline{\Lambda}_{\beta}(B)$,
(12) $\underline{\Lambda}_{\beta}(A \cap B) \subseteq \underline{\Lambda}_{\beta}(A) \cap \underline{\Lambda}_{\beta}(B)$,
(13) $\overline{\Lambda}_{\beta}(A \cap B) \subseteq \overline{\Lambda}_{\beta}(A) \cap \overline{\Lambda}_{\beta}(B)$.
Definition 20. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space. The
universe 𝑋 can be divided into 24 regions with respect to any
𝐴 ⊆ 𝑋, as follows.
(1) The internal edge of 𝐴: $\underline{Edg}(A) = A - \underline{R}(A)$.
(2) The 𝛽-internal edge of 𝐴: $\beta\underline{Edg}(A) = A - \underline{R}_{\beta}(A)$.
(3) The ⋀𝛽-internal edge of 𝐴: $\Lambda_{\beta}\underline{Edg}(A) = A - \underline{\Lambda}_{\beta}(A)$.
(4) The external edge of 𝐴: $\overline{Edg}(A) = \overline{R}(A) - A$.
(5) The 𝛽-external edge of 𝐴: $\beta\overline{Edg}(A) = \overline{R}_{\beta}(A) - A$.
(6) The ⋀𝛽-external edge of 𝐴: $\Lambda_{\beta}\overline{Edg}(A) = \overline{\Lambda}_{\beta}(A) - A$.
(7) The boundary of 𝐴: $b(A) = \overline{R}(A) - \underline{R}(A)$.
(8) The 𝛽-boundary of 𝐴: $b_{\beta}(A) = \overline{R}_{\beta}(A) - \underline{R}_{\beta}(A)$.
(9) The ⋀𝛽-boundary of 𝐴: $b_{\Lambda_{\beta}}(A) = \overline{\Lambda}_{\beta}(A) - \underline{\Lambda}_{\beta}(A)$.
(10) The exterior of 𝐴: $ext(A) = X - \overline{R}(A)$.
(11) The 𝛽-exterior of 𝐴: $\beta ext(A) = X - \overline{R}_{\beta}(A)$.
(12) The ⋀𝛽-exterior of 𝐴: $\Lambda_{\beta} ext(A) = X - \overline{\Lambda}_{\beta}(A)$.
(13) $\underline{R}_{\beta}(A) - \underline{R}(A)$.
(14) $\overline{R}(A) - \underline{\Lambda}_{\beta}(A)$.
(15) $\overline{R}(A) - \overline{\Lambda}_{\beta}(A)$.
(16) $\overline{R}_{\beta}(A) - \underline{R}(A)$.
(17) $\overline{R}_{\beta}(A) - \underline{\Lambda}_{\beta}(A)$.
(18) $\overline{R}_{\beta}(A) - \overline{\Lambda}_{\beta}(A)$.
(19) $\overline{R}(A) - \overline{R}_{\beta}(A)$.
(20) $\overline{\Lambda}_{\beta}(A) - \underline{R}_{\beta}(A)$.
(21) $\overline{\Lambda}_{\beta}(A) - \underline{R}(A)$.
(22) $\underline{\Lambda}_{\beta}(A) - \underline{R}_{\beta}(A)$.
(23) $\underline{\Lambda}_{\beta}(A) - \underline{R}(A)$.
(24) $\overline{R}(A) - \underline{R}_{\beta}(A)$.
Remark 21. As shown in Figure 1, the study of ⋀𝛽-approximation
spaces is a generalization of the study of approximation
spaces, because the elements of the regions $[\underline{R}_{\beta}(A) - \underline{R}(A)]$,
$[\underline{\Lambda}_{\beta}(A) - \underline{R}(A)]$, and $[\underline{\Lambda}_{\beta}(A) - \underline{R}_{\beta}(A)]$ are well defined inside 𝐴,
while these points were undefinable in Pawlak's approximation
spaces. Also, the elements of the regions $[\overline{R}_{\beta}(A) - \overline{\Lambda}_{\beta}(A)]$,
$[\overline{R}(A) - \overline{\Lambda}_{\beta}(A)]$, and $[\overline{R}(A) - \overline{R}_{\beta}(A)]$ do not belong
to 𝐴, while these elements were not well defined in Pawlak's
approximation spaces.
Figure 1 shows the above 24 regions.
Theorem 22. For any topological space (𝑋, 𝜏) generated by a
binary relation 𝑅 on 𝑋, we have $\underline{R}(A) \subseteq \underline{R}_{\beta}(A) \subseteq \underline{\Lambda}_{\beta}(A) \subseteq A \subseteq \overline{\Lambda}_{\beta}(A) \subseteq \overline{R}_{\beta}(A) \subseteq \overline{R}(A)$.

Proof. $\underline{R}(A) = \bigcup\{G \in \tau : G \subseteq A\} \subseteq \bigcup\{G \in \beta O(X) : G \subseteq A\} = \underline{R}_{\beta}(A) \subseteq \bigcup\{G : G \text{ is a } \Lambda_{\beta}\text{-set}, G \subseteq A\} = \underline{\Lambda}_{\beta}(A) \subseteq A$; that is,
$\underline{R}(A) \subseteq \underline{R}_{\beta}(A) \subseteq \underline{\Lambda}_{\beta}(A) \subseteq A$.

Also, $\overline{R}(A) = \bigcap\{F \in \tau^{c} : F \supseteq A\} \supseteq \bigcap\{F \in \beta C(X) : F \supseteq A\} = \overline{R}_{\beta}(A) \supseteq \bigcap\{F : F \text{ is a } \vee_{\beta}\text{-set}, F \supseteq A\} = \overline{\Lambda}_{\beta}(A) \supseteq A$; that is,
$\overline{R}(A) \supseteq \overline{R}_{\beta}(A) \supseteq \overline{\Lambda}_{\beta}(A) \supseteq A$.

Consequently, $\underline{R}(A) \subseteq \underline{R}_{\beta}(A) \subseteq \underline{\Lambda}_{\beta}(A) \subseteq A \subseteq \overline{\Lambda}_{\beta}(A) \subseteq \overline{R}_{\beta}(A) \subseteq \overline{R}(A)$.

Figure 1: the nested regions formed by the seven sets $\underline{R}(A)$, $\underline{R}_{\beta}(A)$, $\underline{\Lambda}_{\beta}(A)$, $A$, $\overline{\Lambda}_{\beta}(A)$, $\overline{R}_{\beta}(A)$, and $\overline{R}(A)$.
Definition 23. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space and
𝐴 ⊆ 𝑋. Then there are memberships $\underline{\in}_{\wedge_{\beta}}$ and $\overline{\in}_{\wedge_{\beta}}$, called
⋀𝛽-strong and ⋀𝛽-weak memberships, respectively, defined by
(1) $x \underline{\in}_{\wedge_{\beta}} A$ iff $x \in \underline{\Lambda}_{\beta}(A)$,
(2) $x \overline{\in}_{\wedge_{\beta}} A$ iff $x \in \overline{\Lambda}_{\beta}(A)$.
Remark 24. According to Definition 23, the ⋀𝛽-lower and ⋀𝛽-upper
approximations of a set 𝐴 ⊆ 𝑋 can be written as
(1) $\underline{\Lambda}_{\beta}(A) = \{x \in X : x \underline{\in}_{\wedge_{\beta}} A\}$,
(2) $\overline{\Lambda}_{\beta}(A) = \{x \in X : x \overline{\in}_{\wedge_{\beta}} A\}$.
Remark 25. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space and
𝐴 ⊆ 𝑋. Then
(1) $x \underline{\in} A \Rightarrow x \underline{\in}_{\beta} A \Rightarrow x \underline{\in}_{\wedge_{\beta}} A$,
(2) $x \overline{\in}_{\wedge_{\beta}} A \Rightarrow x \overline{\in}_{\beta} A \Rightarrow x \overline{\in} A$.
The converse of Remark 25 may not be true in general as
seen in the following example.
Example 26. Consider Example 17. Let 𝐴 = {𝑏, 𝑐}; we have
$c \underline{\in}_{\wedge_{\beta}} A$ but not $c \underline{\in}_{\beta} A$. Let 𝐴 = {𝑐, 𝑑}; then $c \underline{\in}_{\beta} A$ but not
$c \underline{\in} A$. Let 𝐴 = {𝑑}; we have $c \overline{\in} A$ but not $c \overline{\in}_{\beta} A$. Let
𝐴 = {𝑎, 𝑑}; then $c \overline{\in}_{\beta} A$ but not $c \overline{\in}_{\wedge_{\beta}} A$.
Table 1

The set 𝐴 ⊆ 𝑋    𝛼𝑅(𝐴)    𝛼𝑅𝛽(𝐴)    𝛼∧𝛽(𝐴)
{𝑎}              1/2      1         1
{𝑏}              0        0         0
{𝑐}              0        0         1
{𝑑}              1/3      1/2       1/2
{𝑎, 𝑏}           1/3      1/2       1/2
{𝑎, 𝑐}           1/2      1         1
{𝑎, 𝑑}           1/2      1/2       2/3
{𝑏, 𝑐}           0        0         1/2
{𝑏, 𝑑}           2/3      1         1
{𝑐, 𝑑}           1/3      2/3       2/3
{𝑎, 𝑏, 𝑐}        1/3      2/3       2/3
{𝑎, 𝑏, 𝑑}        3/4      3/4       1
{𝑎, 𝑐, 𝑑}        3/4      3/4       3/4
{𝑏, 𝑐, 𝑑}        2/3      1         1
Let 𝑋 be a finite nonempty universe and 𝐴 ⊆ 𝑋. We can
characterize the degree of completeness by a new tool, named
the ⋀𝛽-accuracy measure, defined as follows:

$\alpha_{\wedge_{\beta}}(A) = \dfrac{|\underline{\Lambda}_{\beta}(A)|}{|\overline{\Lambda}_{\beta}(A)|}, \quad A \neq \phi.$ (5)
Example 27. In Example 17, we can deduce the following table,
showing the degree of the accuracy measure $\alpha_R(A)$, the 𝛽-accuracy
measure $\alpha_{R_{\beta}}(A)$, and the ⋀𝛽-accuracy measure $\alpha_{\wedge_{\beta}}(A)$ for some
sets; see Table 1.
We see that the degree of exactness of the set 𝐴 = {𝑎, 𝑐} is
50% using the accuracy measure but 100% using the ⋀𝛽-accuracy
measure. Also, for the set 𝐴 = {𝑎, 𝑏, 𝑑}, the 𝛽-accuracy measure
equals 75% while the ⋀𝛽-accuracy measure equals 100%.
Consequently, the ⋀𝛽-accuracy measure is better than the
accuracy and 𝛽-accuracy measures in this case.
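With the sketches built so far (τ-operators, 𝛽-operators, and ⋀𝛽-operators), the three measures can be compared numerically; the helpers below are ours. For 𝐴 = {𝑎, 𝑐} this reproduces the Table 1 row 1/2, 1, 1.

def r_lower(A, tau):
    # Union of the open subsets of A.
    return frozenset().union(*(G for G in tau if G <= A))

def r_upper(A, tau, X):
    # Intersection of the closed supersets of A.
    U = frozenset(X)
    return U.intersection(*((U - G) for G in tau if A <= (U - G)))

A = frozenset({'a', 'c'})
print(len(r_lower(A, tau)) / len(r_upper(A, tau, X)),                # α_R(A)  = 0.5
      len(beta_lower(A, bo)) / len(beta_upper(A, bo, X)),            # α_Rβ(A) = 1.0
      len(lambda_lower(A, lsets)) / len(lambda_upper(A, lsets, X)))  # α_Λβ(A) = 1.0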
We investigate ⋀𝛽-rough equality and ⋀𝛽-rough inclusion
based on the rough equality and rough inclusion introduced
by Novotný and Pawlak in [7, 40].
Definition 28. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space and
𝐴, 𝐵 ⊆ 𝑋. Then we say that 𝐴 and 𝐵 are
(i) ⋀𝛽-roughly bottom equal ($A \sim_{\wedge_{\beta}} B$) if $\underline{\Lambda}_{\beta}(A) = \underline{\Lambda}_{\beta}(B)$,
(ii) ⋀𝛽-roughly top equal ($A \simeq_{\wedge_{\beta}} B$) if $\overline{\Lambda}_{\beta}(A) = \overline{\Lambda}_{\beta}(B)$,
(iii) ⋀𝛽-roughly equal ($A \approx_{\wedge_{\beta}} B$) if ($A \sim_{\wedge_{\beta}} B$) and ($A \simeq_{\wedge_{\beta}} B$).
Example 29. In Example 17, the sets {𝑎, 𝑐} and {𝑎, 𝑏, 𝑐} are
⋀𝛽-roughly bottom equal, and the sets {𝑐, 𝑑} and {𝑏, 𝑐, 𝑑} are
⋀𝛽-roughly top equal.
Definition 30. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space and
𝐴, 𝐵 ⊆ 𝑋. Then we say that
(i) 𝐴 is ⋀𝛽-roughly bottom included in 𝐵 ($A \underset{\sim}{\subset}_{\wedge_{\beta}} B$) if
$\underline{\Lambda}_{\beta}(A) \subseteq \underline{\Lambda}_{\beta}(B)$,
(ii) 𝐴 is ⋀𝛽-roughly top included in 𝐵 ($A \widetilde{\subset}_{\wedge_{\beta}} B$) if
$\overline{\Lambda}_{\beta}(A) \subseteq \overline{\Lambda}_{\beta}(B)$,
(iii) 𝐴 is ⋀𝛽-roughly included in 𝐵 if both (i) and (ii) hold.
Example 31. In Example 17, {𝑏, 𝑐} is ⋀𝛽-roughly bottom
included in {𝑎, 𝑏, 𝑐}, and {𝑏, 𝑐} is ⋀𝛽-roughly top included in
{𝑎, 𝑏, 𝑐}. Hence, {𝑏, 𝑐} is ⋀𝛽-roughly included in {𝑎, 𝑏, 𝑐}.
4. ⋀𝛽-Rough Sets
In this section, we introduce the new concept of ⋀𝛽-rough sets.
Definition 32. For any ⋀𝛽-approximation space (𝑋, 𝑅∧𝛽), a
subset 𝐴 of 𝑋 is called:
(1) ⋀𝛽-definable (⋀𝛽-exact) if $\overline{\Lambda}_{\beta}(A) = \underline{\Lambda}_{\beta}(A)$,
(2) ⋀𝛽-rough if $\overline{\Lambda}_{\beta}(A) \neq \underline{\Lambda}_{\beta}(A)$.
Example 33. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space as in
Example 17. The set {𝑐} is ⋀𝛽-exact, while {𝑐, 𝑑} is a ⋀𝛽-rough set.
Proposition 34. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space.
Then
(1) every exact set in 𝑋 is 𝛽-exact,
(2) every 𝛽-exact set in 𝑋 is ⋀𝛽-exact,
(3) every ⋀𝛽-rough set in 𝑋 is 𝛽-rough,
(4) every 𝛽-rough set in 𝑋 is rough.
Proof. Obvious.
The converse of all parts of Proposition 34 may not be true
in general, as seen in the following example.

Example 35. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space as
in Example 17. Then the set {𝑏, 𝑑} is 𝛽-exact but not exact, the
set {𝑐} is ⋀𝛽-exact but not 𝛽-exact, the set {𝑐} is 𝛽-rough but
not ⋀𝛽-rough, and the set {𝑎, 𝑐} is rough but not 𝛽-rough.
Definition 36. Let (𝑋, 𝑅∧𝛽) be a ⋀𝛽-approximation space; the
set 𝐴 ⊆ 𝑋 is called:
(1) roughly ⋀𝛽-definable if $\underline{\Lambda}_{\beta}(A) \neq \phi$ and $\overline{\Lambda}_{\beta}(A) \neq X$,
(2) internally ⋀𝛽-undefinable if $\underline{\Lambda}_{\beta}(A) = \phi$ and $\overline{\Lambda}_{\beta}(A) \neq X$,
(3) externally ⋀𝛽-undefinable if $\underline{\Lambda}_{\beta}(A) \neq \phi$ and $\overline{\Lambda}_{\beta}(A) = X$,
(4) totally ⋀𝛽-undefinable if $\underline{\Lambda}_{\beta}(A) = \phi$ and $\overline{\Lambda}_{\beta}(A) = X$.

We denote the set of all roughly ⋀𝛽-definable (resp.,
internally ⋀𝛽-undefinable, externally ⋀𝛽-undefinable, and
totally ⋀𝛽-undefinable) sets by ⋀𝛽RD(𝑋) (resp., ⋀𝛽IUD(𝑋),
⋀𝛽EUD(𝑋), and ⋀𝛽TUD(𝑋)).
Remark 37. For any ⋀𝛽-approximation space (𝑋, 𝑅∧𝛽), the
following hold:
(1) ⋀𝛽RD(𝑋) ⊇ 𝛽RD(𝑋) ⊇ RD(𝑋),
(2) ⋀𝛽IUD(𝑋) ⊆ 𝛽IUD(𝑋) ⊆ IUD(𝑋),
(3) ⋀𝛽EUD(𝑋) ⊆ 𝛽EUD(𝑋) ⊆ EUD(𝑋),
(4) ⋀𝛽TUD(𝑋) ⊆ 𝛽TUD(𝑋) ⊆ TUD(𝑋).
Example 38. In Example 17, we have the set {𝑎, 𝑑} ∈
⋀𝛽RD(𝑋) but {𝑎, 𝑑} ∉ 𝛽RD(𝑋). The set {𝑐} ∈ 𝛽IUD(𝑋)
but {𝑐} ∉ ⋀𝛽IUD(𝑋). The set {𝑎, 𝑏, 𝑑} ∈ 𝛽EUD(𝑋) but
{𝑎, 𝑏, 𝑑} ∉ ⋀𝛽EUD(𝑋).
ROUGH SET-BASED FEATURE SELECTION
The main aim of feature selection (FS) is to determine
a minimal feature subset from a problem
domain while retaining a suitably high accuracy
in representing the original features. In many real
world problems, FS is a must due to the abundance
of noisy, irrelevant, or misleading features. By removing these
factors, learning-from-data techniques can benefit greatly.

Figure 2. Feature selection
2.6 ROUGH SETS BASED ATTRIBUTE REDUCTION METHODS
Rough Sets Attribute Reduction (RSAR) is a filter-based feature reduction
tool which may be used to extract knowledge from the information system and
which is able to retain the information content while reducing the amount of
knowledge involved. Rough Sets analysis is performed based only on the
information given in the data set, and it requires no external parameters to
operate. It works only by making use of the granularity structure of the data.
But it still makes the minimal model assumption that, with every object in the
universe of discourse, there is some information available which is a true and
accurate reflection of the real world.
The optimal criterion for Rough Sets feature selection is to find the shortest or
minimal reducts while obtaining high-quality classifiers based on the selected
features. Another criterion can be the number of rules generated by the reducts
obtained. The usefulness of a feature or a feature subset is determined by both its
relevancy and redundancy. A feature is said to be relevant if it is predictive for the
decision feature, otherwise it is irrelevant. A feature is considered to be redundant if
it is highly correlated with other features. Hence the search for a good feature
subset involves finding those features that are highly correlated with the decision
features but not correlated with each other.
The problem of finding a reduct of an information system has been the
subject of much research (Alpigini et al., 2002; Swiniarski and Skowron, 2003). The
most basic solution to finding such a subset is to simply generate all possible subsets
and select those with a maximum Rough Sets dependency degree. But this method is
expensive, involves very high time complexity, and is suitable only for simple
datasets. To avoid unwanted computational effort, an element of pruning can be
introduced by selecting the reducts of minimal cardinality. But the problem of
finding the shortest reduct is NP-hard, and algorithms that rely on heuristics to find
short reducts are being investigated.
2.6.1 Quick Reduct
Chouchoulas and Shen (2001) proposed a Quick Reduct Algorithm (QRA)
given in Figure 2.1. The QRA attempts to calculate a reduct without exhaustively
generating all possible subsets. It starts with an empty set and adds in turn, one at a
time, those attributes that result in the greatest increase in the Rough Sets
dependency metric until the maximum possible dependency degree is reached.
2.6.2 Reverse Reduct
This method (Figure 2.2) adopts the backward elimination of attributes as
opposed to the forward selection process in Quick Reduct. Initially, all attributes
appear in the reduct candidate; the ones that are least informative are incrementally
removed until no further attribute can be eliminated without introducing
inconsistencies. But this method is too costly for large datasets since the algorithm
has to begin with all features and evaluate large feature subsets. But the
computational complexity involved in this method is the same as that in Quick
Reduct.
As both Quick Reduct and Reverse Reduct perform well, sometimes a
combination of these two strategies in one algorithm also performs well. For
instance, initially the search continues in a forward direction and then resorts to
backward steps intermittently to remove less important features before proceeding.
But this method is also not guaranteed to find a minimal subset: even using the
dependency function to discriminate between features, the search may be led down a
non-minimal path, and it is impossible to predict which combination of attributes will
lead to an optimal reduct based on changes in dependency with the addition or
deletion of single attributes. But it definitely results in a close-to-minimal subset,
which is still useful in greatly reducing the dimensionality of the dataset.
Figure 2.1 Quick Reduct Algorithm

Quick Reduct Algorithm QRA(C, D)
C, the set of all conditional features;
D, the set of decision features.
(1) R ← {}
(2) do
(3)   T ← R
(4)   for each x ∈ (C − R)
(5)     if γ_{R ∪ {x}}(D) > γ_T(D), where γ_R(D) = |POS_R(D)| / |U|
(6)       T ← R ∪ {x}
(7)   R ← T
(8) until γ_R(D) = γ_C(D)
(9) return R

Figure 2.2 Reverse Reduct Algorithm

Reverse Reduct Algorithm(C, D)
C, the conditional attributes;
D, the decision attributes.
(1) R ← C
(2) for each a ∈ C
(3)   if γ_{R − {a}}(D) = 1
(4)     R ← R − {a}
(5) return R
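For concreteness, a compact Python rendering of both strategies (our own sketch, not the thesis code; the decision table maps each object to its attribute values, and the dependency function γ is computed by brute force):

def partition(table, attrs):
    # Indiscernibility classes of the objects under the given attributes.
    blocks = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in sorted(attrs))
        blocks.setdefault(key, set()).add(obj)
    return list(blocks.values())

def gamma(table, attrs, d):
    # Dependency degree γ_attrs(D) = |POS_attrs(D)| / |U|.
    if not attrs:
        return 0.0
    pos = sum(len(b) for b in partition(table, attrs)
              if len({table[o][d] for o in b}) == 1)
    return pos / len(table)

def quick_reduct(table, C, d):
    # Forward selection: greedily add the attribute that raises γ the most.
    R, best = set(), gamma(table, C, d)
    while gamma(table, R, d) < best:
        x = max(C - R, key=lambda a: gamma(table, R | {a}, d))
        R |= {x}
    return R

def reverse_reduct(table, C, d):
    # Backward elimination (assumes consistent data, so γ_C(D) = 1).
    R = set(C)
    for a in sorted(C):
        if gamma(table, R - {a}, d) == 1.0:
            R -= {a}
    return R

# Toy decision table: object -> attribute values; 'd' is the decision attribute.
T = {1: {'a': 0, 'b': 0, 'd': 0},
     2: {'a': 1, 'b': 0, 'd': 1},
     3: {'a': 1, 'b': 1, 'd': 1}}
print(quick_reduct(T, {'a', 'b'}, 'd'), reverse_reduct(T, {'a', 'b'}, 'd'))  # {'a'} {'a'}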
2.6.3 Modified Quick Reduct Algorithm
When the cardinality of the candidate attribute subset becomes one and its
dependency value is greater than or equal to the threshold value t, then the attribute
should be selected while testing the single attributes. This modification is carried out
on the existing QRA. The algorithm is given in Figure 2.3.
Figure 2.3 Modified Quick Reduct Algorithm

Modified Quick Reduct Algorithm (C, D)
C, the set of all conditional features;
D, the set of decision features.
(1) R ← {}
(2) do
(3)   T ← R
(4)   for each x ∈ (C − R)
(5)     if card(T) = 1 and γ_T(D) ≥ t
(6)       R ← T

2.7 DISCERNIBILITY MATRIX APPROACH

Reducts for Rough Sets may also be found using discernibility matrices
(Peters and Skowron, 2004). If (U, C ∪ D) is a decision table, then a
discernibility matrix is a symmetric |U| × |U| matrix with entries defined as

$c_{ij} = \{a \in C : a(x_i) \neq a(x_j)\}, \quad i, j = 1, 2, \ldots, |U|.$ (2.13)

The discernibility function is the conjunction, taken over all nonempty entries,
of the disjunctions $\vee c^{*}_{ij}$, where $c^{*}_{ij} = \{a^{*} : a \in c_{ij}\}$. (2.14)

The minimal reducts of a system are the set of all prime implicants of the
discernibility function. Although this method is guaranteed to discover all minimal
subsets, it is an expensive operation, making the method impractical for even
medium-sized data sets.
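A sketch of the matrix construction (ours, reusing the toy table T from the Section 2.6 sketch; entries follow equation (2.13)):

def discernibility_matrix(table, C):
    # c_ij = {a in C : a(x_i) != a(x_j)} for every pair of distinct objects.
    objs = sorted(table)
    return {(oi, oj): {a for a in C if table[oi][a] != table[oj][a]}
            for i, oi in enumerate(objs) for oj in objs[:i]}

print(discernibility_matrix(T, {'a', 'b'}))
# {(2, 1): {'a'}, (3, 1): {'a', 'b'}, (3, 2): {'b'}}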
2.8 ENTROPY BASED REDUCTION
Another technique for Rough Sets feature selection is Entropy-Based
Reduction (EBR), developed from the work carried out in (Jensen and Shen, 2004a,
2004b). This approach is based on the entropy heuristic employed by machine
learning techniques such as C4.5 (Quinlan, 1993). Dash and Liu (1997) adopted a
similar approach, using an entropy measure for the ranking of features. EBR works by
examining a dataset and determining those attributes that provide the most gain in
information.
The entropy of an attribute A, which can take the values a_1, a_2, ..., a_m, with
respect to the conclusion C, whose possible values are c_1, c_2, ..., c_n, is
defined as

$H(C \mid A) = -\sum_{j=1}^{m} p(a_j) \sum_{i=1}^{n} p(c_i \mid a_j) \log_2 p(c_i \mid a_j).$ (2.15)
The EBR Algorithm is given in Figure 2.4. The EBR algorithm requires no
threshold in order to function. The search for the best feature subset is stopped when
the entropy of the resulting subset is equal to that of the entire feature set. For consistent
data, the final entropy of the subset will be zero. Hence the technique can be used
for finding Rough Sets reducts when the data is consistent.
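A runnable rendering of the entropy heuristic (our sketch, reusing `partition` and the toy table T from the Section 2.6 sketch; a small tolerance guards the floating-point comparison):

from math import log2
from collections import Counter

def H(table, attrs, d):
    # Conditional entropy H(D | attrs): decision entropy within each block.
    n, h = len(table), 0.0
    for block in partition(table, attrs):
        counts = Counter(table[o][d] for o in block)
        h -= (len(block) / n) * sum((c / len(block)) * log2(c / len(block))
                                    for c in counts.values())
    return h

def ebr(table, C, d, eps=1e-12):
    # Greedily add the attribute that lowers H the most, until H matches H_C(D).
    R, target = set(), H(table, C, d)
    while H(table, R, d) > target + eps:
        x = min(C - R, key=lambda a: H(table, R | {a}, d))
        R |= {x}
    return R

print(ebr(T, {'a', 'b'}, 'd'))  # {'a'} for the toy table: H(D | {a}) is already 0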
Figure 2.4 Entropy-Based Reduct Algorithm

Entropy-based reduction EBR(C, D)
C, the set of all conditional features;
D, the set of decision features.
(1) R ← {}
(2) do
(3)   T ← R
(4)   for each x ∈ (C − R)
(5)     if H(R ∪ {x}) < H(T)
(6)       T ← R ∪ {x}
(7)   R ← T
(8) until H_R(D) = H_C(D)
(9) return R

2.9 HEURISTICS FUNCTION BASED FEATURE SELECTION

The Rough Sets concepts can be employed to define heuristic
functions (Hu, 1995; Zhong et al., 2001). Some heuristic functions are analysed in
this section.
2.9.1 Significance Oriented Method
In this method, the significance of the features is used as a heuristic. Each
time the most significant feature from the unselected features is added to generate
the next candidate feature subset. The significance of an attribute a, denoted by
SIG(a), is the increase in the dependency between the condition attributes and the
decision attributes as a result of the addition of a. The heuristic is to select features causing the
dependency to increase faster. The dependency between condition attributes and
decision attributes is defined as
$\gamma_R(D) = \dfrac{|POS_R(D)|}{|U|},$ (2.16)

where $|POS_R(D)|$ is the cardinality of the positive region and $|U|$ is the
cardinality of the universe. In this case, the heuristic function is defined as

$SIG(a) = \gamma_{R \cup \{a\}}(D) - \gamma_R(D),$ (2.17)

where R is the set of currently selected features and D is the decision attribute.
This method considers only the dependency of the selected features but does
not take into account the quality of the potential classification rules. The quality of
the rules can be assessed with the help of two parameters: (i) the size of the consistent
instances and (ii) the support of each rule.
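With the `gamma` function from the Section 2.6 sketch, the significance measure of equation (2.17) is a one-liner (ours):

def sig(table, R, a, d):
    # SIG(a) = γ_{R∪{a}}(D) − γ_R(D): the dependency gained by adding a.
    return gamma(table, R | {a}, d) - gamma(table, R, d)

print(sig(T, set(), 'a', 'd'), sig(T, set(), 'b', 'd'))  # 1.0 versus about 0.333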
2.9.2 Support Oriented Methods
Zhang et al (2001) proposed a heuristic function that considers not only the
significance of the attribute but also takes into account the support of the most
significant rule. The heuristic selects feature ‘a’ such , by adding ‘a’ to the current
set, the size of consistent instances increase faster and the support of the most
52
significant rule is larger than by adding any other features. This heuristic function is
defined as follows:
FR aCardPOS DMAXSizePOS INDR aR a R a
U U U , X / (2.18)
The first factor (POSRUaD) indicates the cardinality of the consistent
instances. The second factor denotes the maximal size out of indiscernibility classes
included in the positive region which denotes the support of the most significant
rule. Hence this heuristic is referred to as Maximum Support Heuristic. But the
limitation of the Maximum Support Heuristic is that it selects features causing the
highest support of the most significant rule rather than the highest overall quality of
the potential rules. That is, it only considers a local optimum instead of a global
optimum of the potential rules.
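The two factors of equation (2.18) can be prototyped as follows (our sketch, reusing `partition` from the Section 2.6 sketch; the consistent blocks are the indiscernibility classes inside the positive region, and the largest one is the support of the most significant rule):

def max_support_heuristic(table, R, a, d):
    # F(R, a) = Card(POS_{R∪{a}}(D)) × (size of the largest consistent block).
    blocks = partition(table, R | {a})
    consistent = [b for b in blocks if len({table[o][d] for o in b}) == 1]
    pos = sum(len(b) for b in consistent)
    return pos * max((len(b) for b in consistent), default=0)

print(max_support_heuristic(T, set(), 'a', 'd'))  # 3 * 2 = 6 for the toy table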
2.9.3 Average Support Heuristic
To overcome the limitations of the Maximum Support Heuristic, Zhang et al.
proposed a new heuristic function called the Average Support Heuristic, which
considers the overall quality Q of the potential set of rules. Q is the average support
of the most significant rules for each of the decision classes.
The overall quality Q(R, a) of the potential set of rules is defined as

$Q(R, a) = \dfrac{1}{n} \sum_{i=1}^{n} S(R, a, d_i),$ (2.19)

where

$S(R, a, d_i) = MaxSize(POS_{R \cup \{a\}}(D = d_i) / IND(R \cup \{a\}))$ (2.20)

is the support of the most significant rule for the decision class {D = d_i}, where D is the
decision attribute taking the values {d_1, d_2, d_3, ..., d_n}.

The Average Support Heuristic function is defined as

$F(R, a) = Card(POS_{R \cup \{a\}}(D)) \times Q(R, a).$ (2.21)
2.9.4 Parameterized Average Support Heuristic
The heuristic functions above generally consider only the positive region, as in the
traditional Rough Sets model. They ignore the information provided by the
boundary region, which corresponds to inconsistent instances. The concept of lower
approximation has to be broadened to include predictive instances that are excluded
by the traditional lower approximation. Predictive instances refer to instances that
may produce productive rules, which hold true with high probability but are not
necessarily 100% true. Zhang and Yao (2004) proposed a new definition of lower
approximation, based on which an improvement of the Average Support Heuristic,
called the Parameterized Average Support Heuristic (PASH), is built.
The PASH function is defined as

$F(R, a) = Card(POS_{R \cup \{a\}}(D)) \times Q(R, a),$ (2.22)

where $Card(POS_{R \cup \{a\}}(D))$ is the cardinality of the positive region and
Q(R, a) is the overall quality of the potential rules based on predictive instances. The
main advantage of PASH is that it considers the overall quality of the potential rules,
thus producing a set of rules with a balanced support distribution over all decision
classes. It requires a user-defined parameter that adjusts the level of approximation.
2.9.5 Johnson Reducer
This is a simple greedy heuristic algorithm that is often applied to the
discernibility function to find a single reduct (Ohrn, 1999). Reducts found by this
process have no guarantee of minimality but are generally close to minimal.
The Johnson Reducer Algorithm is given in Figure 2.5. The algorithm begins
by setting the current reduct candidate, R, to the empty set. Then, each conditional
attribute appearing in the discernibility function is evaluated according to the
heuristic measure. For the standard Johnson algorithm, this is typically a count of
the number of appearances an attribute makes within clauses; attributes which
appear more frequently are considered to be more significant. The attribute with the
highest heuristic value is added to the reduct candidate and all clauses in the
discernibility function containing this attribute are removed. When all clauses are
removed, the algorithm terminates and returns the reduct R.
Figure 2.5 Johnson Reducer Algorithm

Johnson Reducer Algorithm(C, f_D)
C, the set of conditional attributes;
f_D, the discernibility function.
(1) R ← {}; bestc ← 0
(2) while (f_D not empty)
(3)   for each a ∈ C that appears in f_D
(4)     c ← heuristic(a)
(5)     if (c > bestc)
(6)       bestc ← c; bestAttr ← a
(7)   R ← R ∪ {bestAttr}
(8)   f_D ← removeClauses(f_D, bestAttr)
(9) return R
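A runnable rendering of the greedy loop (our sketch; the discernibility function is given as a list of clauses, each a set of attributes, and the count-of-appearances heuristic is used):

from collections import Counter

def johnson_reducer(clauses):
    # Repeatedly add the attribute occurring in the most clauses,
    # then drop every clause containing it.
    clauses = [set(c) for c in clauses if c]
    R = set()
    while clauses:
        counts = Counter(a for c in clauses for a in c)
        best = max(counts, key=counts.get)
        R.add(best)
        clauses = [c for c in clauses if best not in c]
    return R

print(johnson_reducer([{'a'}, {'a', 'b'}, {'b', 'c'}]))  # {'a', 'b'} (or {'a', 'c'}, on ties)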
2.9.6 Compressibility Algorithm
Starzyk et al. (2000) proposed a new method for the generation of all reducts
in an information system by manipulating the clauses in discernibility functions. The
concept of strong compressibility is introduced and applied in conjunction with an
expansion algorithm. The simplification process applies where clause attributes are
either simultaneously present or absent in all clauses. In such a situation, the
attributes may be replaced by a single representative attribute. The compressibility
algorithm uses concepts such as the absorption and expansion laws from Boolean
algebra, together with strong compressibility, to simplify the discernibility function.
2.10 DYNAMIC REDUCTS
Reducts generated from an information system are sensitive to changes in the
system. This can be seen by removing a randomly chosen set of objects from the
original object set. Those reducts frequently occurring in random subtables can be
considered to be stable. These are the reducts encompassed by dynamic reducts
(Bazan et al., 1994). The Dynamic Reduct Algorithm is presented in Figure 2.6.
For a decision table $A = (U, C \cup D)$, any system $B = (U', C \cup D)$, where
$U' \subseteq U$, is called a subtable of A. If F is a family of subtables of A, then

$DR(A, F) = Red(A, d) \cap \bigcap_{B \in F} Red(B, d)$ (2.23)

defines the set of F-dynamic reducts of A. From equation (2.23), it follows that a
relative reduct of A is dynamic if it is also a reduct of all subtables in F.
By introducing a threshold $0 \leq \varepsilon \leq 1$, the concept of $(F, \varepsilon)$-dynamic reducts
can be defined:

$DR_{\varepsilon}(A, F) = \{C \in Red(A, d) : s_F(C) \geq 1 - \varepsilon\},$ (2.24)

where

$s_F(C) = \dfrac{|\{B \in F : C \in Red(B, d)\}|}{|F|}$ (2.25)

is the F-stability coefficient of C.

This definition removes the previous restriction that a dynamic reduct must
appear in every generated subtable. Hence, a reduct is considered to be dynamic if it
appears in a certain proportion of the subtables, determined by the value of $\varepsilon$.
A comparison of dynamic and non-dynamic approaches can be found in
(Bazan, 1998), where various methods were tested on extracting laws from decision
tables.
Figure 2.6 Dynamic Reduct Algorithm

DynamicReduct(A, ε, its)
A, the original decision table;
ε, the dynamic reduct threshold;
its, the number of iterations.
(1) R ← {}
(2) T ← calculateAllReducts(A)
(3) for j = 1 ... its
(4)   A_j ← deleteRandomRows(A)
(5)   R ← R ∪ calculateAllReducts(A_j)
(6) for each C ∈ T
(7)   if s_F(C, R) ≥ ε
(8)     output C
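The stability coefficient (2.25) and the subtable sampling of Figure 2.6 can be sketched as follows (ours; `all_reducts` is an assumed helper that enumerates every reduct of a table and is not shown):

import random

def random_subtables(table, its, keep=0.8):
    # Subtables obtained by deleting randomly chosen rows, as in Figure 2.6.
    rows = list(table)
    return [{r: table[r] for r in random.sample(rows, max(1, int(keep * len(rows))))}
            for _ in range(its)]

def stability_coefficient(C, subtables, all_reducts):
    # s_F(C) = |{B in F : C is a reduct of B}| / |F|   (equation 2.25).
    hits = sum(1 for B in subtables if C in all_reducts(B))
    return hits / len(subtables)

# A reduct C would then be reported as dynamic when
# stability_coefficient(C, F, all_reducts) >= 1 - epsilon.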