Demystifying Ontology
Emad Khazraee
Drexel University, Philadelphia, USA
Xia Lin
Drexel University, Philadelphia, USA
Abstract: The term “ontology” is used in different communities multifariously, in a nearly anarchic
way. Ironically, the major function of ontology itself is to explicate the meaning of terms and
concepts. Therefore, different conceptions of this term impede collaboration and exchange of
expertise between different domains and communities. Thus, providing a clear image of the
different notions of ontology is a precondition of communication. This paper studies different
notions of ontology and attempts to compare these different conceptions, and to organize them
into a model to facilitate collaboration in this field. The use of an ontology gamut model is proposed
instead of the one-dimensional ontology spectra used in the past. This model can be used as the
basis for agreement to clarify the term ontology among different communities by providing levels
of formality, semantics and complexity. The coordinates of each ontology in this gamut help with
understanding the specific conception of that ontology.
1. Introduction
The purpose of this paper is to clarify the use of the term “ontology”. Currently it
is used by different communities in a nearly anarchic fashion. The irony is that one
of the major functions of ontology is to explicate the meanings of terms and
concepts. Ontologies are developed and used in a variety of domains including
librarianship, information and computer science, and artificial intelligence.
Different conceptions of this term are an impediment to collaboration and
exchange of expertise between different domains. Therefore, it becomes
a precondition of communication to provide a clear image of the different
notions of the term ontology. In other words, the polysemy of the term can
be tolerated if different communities are able to clearly describe their specific
understanding and usage of the term when exchanging ideas with each other.
Earlier efforts to reconcile the disparate uses of “ontology” were mostly
presented as linear ontology spectrum models, considering only one dimension
of ontologies, such as semantic power, formality or complexity. These models
help to disambiguate the concept of ontology; however, they cannot
clearly differentiate the subtle differences in some cases. This paper attempts
to compare different conceptions of ontology and organize them into a model
based on the differentiated dimensions of an ontology in a novel way. Such
a model can provide a common basis for promoting collaboration among
different communities using ontologies.
2. What is ontology?
The term ontology has a long history in philosophy. It is known as the synonym
of metaphysics (meaning what comes after physics), and was developed into
a discipline by Aristotle, who used the term “first philosophy” instead (Smith,
2003). The word ontology comes from the Greek ontos and logos, standing for
being and word, respectively (Sowa, 2000). Smith (2003) traced the first use
of the term in the early 17th century by two philosophers, Jacob Lorhard and
Rudolf Göckel, and noted its first appearance in English, dating back to 1721 in
Bailey’s dictionary. According to Smith (1989), the term “formal ontology” was
first used by Husserl in his Logical Investigations. The term formal ontology causes
confusion in many cases; thus, we shall differentiate between “formal ontology”
and “formalized ontology.” The term “ontology” has been used in the computer
and information science community since the 1980s, and during this period,
different notions of this term have proliferated. These differing notions will be
reviewed and differentiated in this paper.
The starting point for explicating the usage of the term ontology is
the clarification of our question itself. The question, “what is ontology?” should
be distinguished from the question, “what is an ontology?” The former question
is concerned with ontology as a discipline in philosophy, while the latter refers to
ontology as an artefact. To disambiguate between these situations, “Ontology”
with an uppercase “O” will serve as the subject of inquiry in the first question,
as a discipline, while the term “ontology” with a lowercase “o” will be used in the
sense of the second question, as an artefact (Guarino, Oberle & Staab, 2009;
Daconta, Obrst & Smith, 2003).
The main concern in Ontology as a discipline in philosophy is to answer the
question, “what exists?” Therefore, it is a methodological study of the account
of existence. “Philosophical ontology is the science of what is, of the kinds and
structures of objects, properties, events, processes and relations in every area
of reality.” In philosophy, it looks for “definitive and exhaustive classifications of
entities in all spheres of being” (Smith & Welty, 2001: i). Ontology is the study of
all concrete and abstract entities that make up the world.
Two sources for forming Ontology are observation and reasoning. Observation
provides knowledge about the world and reasoning turns it into an abstract
framework (Sowa, 2000). However, we should distinguish between the
epistemological (what we know or what we believe that exists) and ontological
(what exists) issues. Smith (2003) differentiates between two approaches toward
Ontology as substantialist and fluxist, recognizing substances (continuants)
and events/processes (occurrents), respectively, as the main focus of Ontology
as a discipline. The other division is between adequatists and reductionists.
Adequatists pursue Ontology at all levels of aggregation (microphysical
to cosmological), while reductionists seek to reduce reality to its simplest
constituents.
On the other hand, ontology as an artefact is “a technical term ... that is
designed for a purpose, which is to enable the modelling of knowledge
about some domain, real or imagined” (Gruber, 2009). In the second sense, an
ontology is a knowledge engineering artefact, sometimes known as a formal
ontology or applied ontology. Legg (2008) argues that “kinds and structures”
in Smith’s (2003) definition above are essentially “categories.” Information
science then adopted the term ontology based on this assertion. Smith (2003)
explains that the motivation behind the use of the term ontology in computer
and information science is a “Tower of Babel” problem. Its goal is to resolve
terminological and conceptual conflicts, and it is thus seen as a shared taxonomy
of entities. This notion aligns with the early ideas that searched for a shared
taxonomy of concepts as a silver bullet for the Semantic Web. Smith defines
an ontology in information science as “a dictionary of terms formulated in a
canonical syntax and with commonly accepted definitions designed to yield
a lexical framework for knowledge-representation which can be shared by
different communities” (Smith, 2003: 6). He also recognizes a more ambitious
definition for ontology in this context as a formal theory, which includes both
definitions and a supporting framework of axioms. For example, it contains
terminology (Tboxes), assertions (Aboxes), and rules of inference in a system
based on Description Logics (Baader et al., 2003).
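The split between terminology, assertions, and rules of inference can be made concrete with a small sketch. The following Python fragment is only a toy illustration under stated assumptions, not a Description Logic reasoner: the concept names, the individual `my_thesaurus`, and the helper functions are all hypothetical and were chosen for this example.

```python
# Toy illustration only: a real DL system supports far richer axioms.

# TBox: terminological knowledge (subclass axioms, child -> parent).
TBOX = {
    "Thesaurus": "KnowledgeOrganizationSystem",
    "Glossary": "KnowledgeOrganizationSystem",
    "KnowledgeOrganizationSystem": "Artefact",
}

# ABox: assertions about individuals (individual -> asserted concept).
ABOX = {
    "my_thesaurus": "Thesaurus",
}


def superclasses(concept, tbox):
    """Return all concepts that subsume `concept`, nearest first."""
    result = []
    seen = {concept}
    while concept in tbox:
        concept = tbox[concept]
        if concept in seen:  # guard against cyclic axioms
            break
        seen.add(concept)
        result.append(concept)
    return result


def instance_types(individual, abox, tbox):
    """Inference rule: an instance of C is an instance of every superclass of C."""
    asserted = abox[individual]
    return [asserted] + superclasses(asserted, tbox)
```

Calling `instance_types("my_thesaurus", ABOX, TBOX)` derives memberships that were never asserted directly, which is the kind of reasoning the more ambitious definition has in view.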
Zuniga (2001) defines an ontology as a formal language designed with a specific
functional purpose(s) in mind to represent a particular domain of knowledge.
This marks an important difference: Ontology in philosophy is more
descriptive, whereas IS ontologies are designed for practical applications. One of
the important points Zuniga mentions is the different uses of the term “formal”.
In one sense, formal is about the nature of investigation, which is general and
applicable to all domains of reality. In this sense, the use of the term formal
is dealing with general categories such as thing, process, and matter; thus
a formal ontology (in the sense used by Husserl) deploys these categories to
codify what exists (Poli & Obrst, 2010). In the other sense, formal means the
use of symbolism in a deductive system. Formal is used mainly in the second
sense by the information science community (Zuniga, 2001). Poli & Obrst (2010)
emphasize that we should differentiate between formal Ontology (use of formal
in the first sense) and a formalized ontology, which implies the use of formal
language to represent an ontology. Thus, they proposed the use of “categorial”
instead of “formal.”
The distinction between Ontology (as a discipline) and ontology (as a
knowledge artefact) is the most radical distinction among different usages of
the term. The other differences are subtler. To reduce
confusion, Poli & Obrst (2010) suggest a terminology for these different usages:
ontology-as-categorial-analysis (ontology_c) and ontology-as-technology
(ontology_t), respectively. This paper focuses mostly on ontology_t according
to this terminology. It should be noted that closer relationships between these
two realms of ontology would benefit both parties.
The term ontology is used in the computer science (CS), artificial intelligence
(AI), and information science (IS) communities with a variety of meanings. This
term was first used by McCarthy in 1980 during his studies from the perspective
of a logicist in AI (Smith & Welty, 2001). Different notions proliferated
shortly afterwards, and in the early 1990s Gruber (1993) provided the
most widely accepted definition of the term ontology as “a formal specification
of a conceptualization”. One of the reasons that this definition became widely
accepted in the three communities mentioned is its generality. Consequently,
each community can use it for its own purposes. Guarino & Giaretta (1995)
revealed the deficiencies of this definition. To provide an account of how the
term is used, a brief review of the literature is presented.
The most general interpretation of Gruber’s definition claims that an ontology is a
list of existing concepts or entities in a domain. This set of terms or vocabulary
can be structured to form a hierarchy or lattice. Soergel (1999) believes that
ontology is a new brand invented to be used in place of classification. He defines
ontologies as “a shallow classification of basic categories or a classification used
in linguistics, data element definition, or knowledge management” (cf. Soergel,
1997: 1). More sophisticated interpretations do not accept the former definition
and place emphasis on the use of formalization in an ontology. In this sense, an
ontology is a structure that is expressed in a formal language and can be shared
among different agents. This interpretation forms a spectrum of knowledge
artefacts, known as the ontology spectrum (Uschold & Gruninger, 1996). At one
end of the ontology spectrum, there are lightweight structures with minimal
semantics and formalization, and at the other end, there are structures with
richer semantics and formalization. This spectrum has been mentioned by
various researchers in the domain (Daconta, Obrst & Smith, 2003; McGuinness,
2003; Smith & Welty, 2001; Guarino, Oberle & Staab, 2009). Moreover, some
authors provide a similar organization of a broader group called Knowledge
Organization Systems (KOS) (Zeng, 2008). The following figures illustrate the
four spectra.
Figure 1: Ontology spectrum based on formal semantics adopted from (Daconta, Obrst &
Smith, 2003)
Figure 3: Ontology spectrum based on formal complexity adopted from
(Smith & Welty, 2001)
Figure 4: Ontology spectrum based on formality adopted from (Guarino, Oberle & Staab,
2009)
of concepts (sometimes called classes) in a domain of discourse, properties of
each concept describing various features and attributes of the concepts (also
called slots, roles or properties), and restrictions on slots (also called facets or
role restrictions)” (Noy & McGuinness, 2001: 3).
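The three ingredients named in this definition (concepts, their properties or slots, and restrictions on slots) map naturally onto a small data structure. The sketch below is a hedged illustration in Python, not the representation used by any particular ontology editor; the class names `Concept` and `Slot`, the two facets shown, and the `Book` example are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a concept, with facets restricting its values."""
    name: str
    value_type: type          # facet: allowed value type
    max_cardinality: int = 1  # facet: how many values are allowed


@dataclass
class Concept:
    """A class in the domain of discourse."""
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot):
        self.slots[slot.name] = slot

    def check(self, slot_name, values):
        """Return True if `values` satisfy the facets of the named slot."""
        slot = self.slots[slot_name]
        return (len(values) <= slot.max_cardinality
                and all(isinstance(v, slot.value_type) for v in values))


# Hypothetical example: a "Book" concept with a single-valued "title" slot.
book = Concept("Book")
book.add_slot(Slot("title", value_type=str, max_cardinality=1))
```

Here `book.check("title", ["Demystifying Ontology"])` succeeds, while a list of two titles or a non-string value violates a facet and is rejected.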
Uschold & Gruninger (1996: 97) define ontologies as “the shared understanding
of some domain of interest which may be used as a unifying framework” called
“agreements about a shared conceptualization”. These agreements consist of
three components: conceptual frameworks for knowledge modelling, protocols
for exchanging content among agents, and agreements on representation of
theories. They conclude that in knowledge sharing “ontologies are specified
in the form of definitions of representational vocabulary”. The functions of an
ontology in this perspective are categorized as communication, interoperability,
and system engineering (i.e. specification, reliability and reusability). Eight
years later they used Gruber’s definition as the standard definition of ontology,
indicating that the main difference between approaches to ontologies is in
the level of specification of the meaning of terms used in the ontology. They
demonstrate that there is a continuum that starts with little specification of
meaning and ends with rigorously formalized theories (Uschold & Gruninger,
2004). They emphasize that the promise of ontologies is “a shared and common
understanding of a domain that can be communicated between people and
application systems” (Uschold & Gruninger, 2004: 61). The main focus of these
ontologies is on shared or common understanding. These ontologies can thus
be seen as collaborative agreements for sharing and exchanging information.
Guarino, Oberle & Staab (2009) provide a more detailed definition of ontology
by explicating the important aspects of the definition as “a formal, explicit
specification of a shared conceptualization”. They indicate that in order to
achieve a reliable definition, it is necessary to clarify the following concepts:
conceptualization, formal, specification and shared. They distinguish
between conceptualization as a structure of intensional relationships and
conceptualization as a structure of extensional relationships, highlighting it as
one of the fundamental sources of confusion regarding ontologies. Finally, they
define an ontology as “a set of axioms, i.e. a logical theory designed to capture
the intended models corresponding to a certain conceptualization and to
exclude unintended ones”1 (Guarino, Oberle & Staab, 2009: 8). They expand the
use of the semiotic triangle, and use the logically precise meaning of “denotes”
instead of the weakly defined relationship of “invokes”. Accordingly, they define
four generic types of ontologies: top-level ontologies, reference ontologies,
core ontologies, and application ontologies (Guarino, Oberle & Staab, 2009: 15-19).
1 They provide this precise definition: “Let C be a conceptualization, and L a logical language with
vocabulary V and ontological commitment K. An ontology O_K for C with vocabulary V and ontological
commitment K is a logical theory consisting of a set of formulas of L, designed so that the set
of its models approximates to the set of intended models of L according to K” (Guarino, Oberle &
Staab, 2009: 11).
3. Ontology gamut
In this paper we examine the different uses of the term ontology, as a discipline
and as a knowledge engineering artefact (ontology_t). The latter in turn consists of
a set of artefacts from simple structures of terms (for sharing or to clarify meaning)
to logical theories (representing a universe of discourse) as explicated by
Guarino, Oberle & Staab (2009) and recognized as the ambitious definition of
ontology in information science by Smith (2003). Based on the literature, three
main dimensions have been distinguished for ontologies. Different authors
provided different spectra of ontologies in computer and information science
using one characteristic or differentia (Daconta, Obrst & Smith, 2003; Guarino,
Oberle & Staab, 2009; McGuinness, 2003; Smith & Welty, 2001). Ontologies
were grouped based on the degree of semantics, expressivity, formality, or
complexity.
It is important to recognize that these dimensions are not necessarily equal or
positively correlated. The level of semantic content is related to the expressive
power of the ontology. However, semantic level is not necessarily correlated to
the degree of formalization of a structure. Even formalized structures may be
used to discard problems raised by semantic content. For example, a database
schema (DB schema) may contain less semantic content than a term glossary
that defines terms in natural language and is shared among users. Thesauri
consist of definitions in natural language, are slightly structured, and their
preferred terms are selected based on specific rules. In this case, the semantic
content level is higher than a DB schema, while the structure and formality
could be weaker. Thesauri occupy different places in the spectrum; they have
structure but they might not use a formal language for representation.
This paper expands on previous models of the ontology spectrum to create a
new model called an “ontology gamut.” Instead of merely using “formalism” or
“formal semantics” as the differentia to recognize different types of ontologies,
we can use a two-dimensional model similar to the RGB gamut. In this model
(Figure 5), the two axes represent the degree of formalization and semantic
content. Three main families of ontologies are differentiated in this model. On
the left side of the gamut, there are formal structures with little or no semantic
content (e.g. DB schemas). At the bottom, there are semantic structures with no
formalizations (e.g. term lists, glossaries and synonym rings). To the upper right
side, both the level of semantic content and completeness of a formal system
are higher than the other two families, and at its extreme corner, the knowledge
artefact is a logical theory containing both definitions and a supporting
framework of axioms.
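As a rough sketch of how coordinates in such a gamut could be used, the Python fragment below places an artefact on the two axes and assigns it to one of the three families. It is an illustration under explicit assumptions: the numeric coordinates, the 0.5 threshold, and the names `Artefact` and `family` are hypothetical, since the model itself assigns no numeric values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artefact:
    name: str
    formalization: float     # 0.0 (none) to 1.0 (complete formal system)
    semantic_content: float  # 0.0 (none) to 1.0 (rich)


def family(artefact, threshold=0.5):
    """Assign an artefact to one of the three families of the gamut.

    The threshold is an arbitrary illustrative cut-off, not part of the model.
    """
    formal = artefact.formalization >= threshold
    semantic = artefact.semantic_content >= threshold
    if formal and semantic:
        return "logical theory"
    if formal:
        return "formal structure"
    if semantic:
        return "vocabulary and classification"
    return "unclassified"


# Illustrative coordinates only; the paper assigns no numeric values.
db_schema = Artefact("DB schema", formalization=0.8, semantic_content=0.2)
glossary = Artefact("glossary", formalization=0.1, semantic_content=0.7)
axiomatized = Artefact("axiomatized ontology", formalization=0.9, semantic_content=0.9)
```

With these assumed coordinates, `family(db_schema)` yields the formal-structure family and `family(glossary)` the vocabulary family. The point of the sketch is that two artefacts with the same formality can still land in different families once semantic content is treated as a separate axis.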
Here we should clarify what the level of semantic content means. In this model,
the level of semantic content is related to the extent to which the constructs
in the knowledge artefact convey meaning. In other words, it is the degree of
correspondence between its constructs and the things in the external world
or the universe of discourse. Simply, semantic content here is the content that
clarifies the relationships between terms (concepts or symbols in the system)
and what they stand for. A glossary may use natural language with a high level
of expressivity to convey the meanings of terms in their specific context of use.
Therefore, this glossary can be used as the basis for shared agreement within a
community but it cannot necessarily be used by machines. Since machines do
not understand meaning, formalisms are created in order to use machines for
manipulating data and discarding semantics. There may be machine-readable
formal structures that do not necessarily convey any meaning or refer to external
entities. However, we can use formal semantics to restrict interpretations,
reduce vagueness, and use machine power to do formal reasoning.
Accordingly, in the ontology gamut, DB schemas are mostly simple
representations of relationships (not necessarily of ontic entities) for machines.
Formal with regard to this family refers to the fact that expressions must be
machine-readable, and therefore, the target audience is machines. This family is
mostly referenced by the DB community and sometimes the AI community. The
goal of the vocabulary and classification family is collaborative agreement within
a community (e.g. indexers or users). With humans as their target audience, they
provide rich content to clarify meaning using mostly natural language. This
family is mostly referenced by the Library and Information Science community.
The Logical Theory family contains both definitions and a supporting framework
of axioms and inference rules. They are designed to enable reasoning using
machine power for a domain of knowledge. Formal in this family refers to
restricting interpretations as well as providing machine-readable models. The
target audience in this case is then both humans and machines. This family is of
most interest to the AI and Knowledge Representation community.
4. Conclusion
This paper proposes the use of an ontology gamut model instead of the
one-dimensional ontology spectra used in the past. This model can be used
as the basis for agreement on clarifying the use of the term ontology among
different communities. Positioning each ontology within this gamut can help
in understanding the specific conceptions of the term “ontology” adopted by
each group. This in turn helps us to better collaborate on exchanging expertise
among communities, because each conception of ontology entails specific
terminology, methods and tools. The notion of an ontology gamut provides
an overarching realm for what can be described as a “clan” of knowledge
engineering artefacts that may be loosely called ontologies. Within this clan
of ontologies as technology, a family might be of more interest to information
and computer science and the AI community, which covers the territory from
shared vocabularies represented for machines to logical theories for domains
of knowledge.
The neighbouring territory of this clan of ontologies is Ontology as a discipline.
This important discipline cannot be ignored in knowledge engineering practice.
Smith (2003) and Poli & Obrst (2010) indicated that knowledge engineers
and philosophers could learn mutual lessons through collaboration. Employing
philosophical precision in the ontology design process can prevent many
errors. Ontology as a discipline may also provide insight into the challenges of
ontology design such as approximation and vagueness. In return, philosophers
can learn from the challenges of knowledge engineering and receive help in
materializing their attempts. Therefore, we should also consider including
Ontology as a discipline in the mapping of the ontology gamut.
Acknowledgment
We express our appreciation for the anonymous reviewers’ comments, which
greatly contributed to enhancing the quality of the article.
References
Baader, F. et al. (eds.) (2003). The description logic handbook: theory, implementation, and
applications. Cambridge, UK: Cambridge University Press.
Daconta, M.; Obrst, L. J.; Smith, K. (2003). The semantic web: a guide to the future of XML,
web services, and knowledge management. Indianapolis: Wiley.
Guarino, N.; Oberle, D.; Staab, S. (2009). What Is an ontology? In: Handbook on ontologies.
2nd ed. Edited by S. Staab, R. Studer. Berlin, etc.: Springer, 2009. (International
handbooks on information systems), pp. 1-17.
Legg, C. (2008). Ontologies on the Semantic Web. Annual Review of Information Science
and Technology, 41 (1), pp. 407-451.
McGuinness, D. L. (2003). Ontologies come of age. In: Spinning the Semantic Web: bringing
the World Wide Web to its full potential. Edited by Dieter Fensel et al. Cambridge, Mass:
The MIT Press, pp. 171–196. Available at: http://www-ksl.stanford.edu/people/dlm/
papers/ontologies-come-of-age-mit-press-%28with-citation%29.htm.
Noy, N. F.; McGuinness, D. L. (2001). Ontology development 101: A guide to creating your
first ontology. Stanford Knowledge Systems Laboratory Technical Report KSL-01-05
and Stanford Medical Informatics Technical Report SMI-2001-0880. Palo Alto,
CA: University of Stanford. Available at: http://www.ksl.stanford.edu/people/dlm/
papers/ontology-tutorial-noy-mcguinness.pdf
Poli, R.; Obrst, L. (2010). The interplay between ontology as categorial analysis and
ontology as technology. In: Theory and applications of ontology. Vol 2. Computer
applications. Edited by R. Poli, M. Healy, A. Kameas. Berlin, etc.: Springer, pp. 1-26.
Smith, B. (1989). Logic and formal ontology. In: Husserl’s phenomenology: a textbook.
Edited by J. N. Mohanty, W. McKenna. Lanham: University Press of America, pp. 29-
67.
Smith, B. (2003). Ontology. In: Blackwell guide to the philosophy of computing and
information. Edited by L. Floridi. Oxford: Blackwell Publishers, pp. 155-166.
Smith, B.; Welty, C. (2001). Ontology: towards a new synthesis. In: Proceedings of the
2nd International Conference on Formal Ontology in Information Systems (FOIS ’01)
Ogunquit, Maine, USA, October 17-19, 2001. Edited by N. Guarino, B. Smith, C. Welty.
New York, NY: ACM, pp. 3-9.
Soergel, D. (1999). The rise of ontologies or the reinvention of classification. Journal of the
American Society for Information Science, 50 (12), pp. 1119-1120.
Uschold, M.; Gruninger, M. (2004). Ontologies and semantics for seamless connectivity.
SIGMOD Rec., 33 (4), pp. 58-64. Available at: http://www.sigmod.org/sigmod/record/
issues/0412/12.uschold-9.pdf.
Systems (FOIS ‘01), Ogunquit, Maine, USA, October 17-19, 2001. Edited by N. Guarino, B.
Smith, C. Welty. New York, NY: ACM. pp. 187-197. Available at: http://portal.acm.org/
ft_gateway.cfm?id=505187&type=pdf.
XIA LIN is an associate professor at the iSchool@Drexel, College of Information Science and
Technology, Drexel University, Philadelphia, Pennsylvania, USA. His research areas include
information organization, knowledge mapping, information visualization, digital libraries,
information retrieval, and visual interface design. He has published more than 60 research papers
in these areas and received significant funding for research and doctoral education. His
visualization prototypes have been presented and demonstrated in many national and international
conferences. Currently, he serves on the Editorial Board of the International Journal of Information
Visualization. Dr. Lin has a Ph.D. in Information Science from the University of Maryland at College
Park and a Master of Librarianship from Emory University. Prior to joining Drexel, Dr. Lin was an
assistant professor at the University of Kentucky.