
XML Schema

What’s in a Schema?
A Schema is an XML document (a DTD is not).
Because it is an XML document, it must have a root element.
 • The root element is <schema>
Within the root element, there can be
 • Any number and combination of
    Inclusions
    Imports
    Re-definitions
    Annotations
 • Followed by any number and combination of
    Simple and complex data type definitions
    Element and attribute definitions
    Model group definitions
    Annotations
Referencing XML Schema in XML documents
XML Namespaces
Purpose of namespaces
From the perspective of an XML document writer:
 • Just like namespaces in programming languages
 • Allows you to reuse certain names in an XML document by qualifying them with prefixes
 • Also makes it clear which schema individual elements are associated with
 • Examples on the next page
Sample namespaces
<?xml version="1.0"?>
<library xmlns="http://masters.cs.uchicago.edu/library">
  <book id="1234">
    <title>Intro to XML</title>
    <authors>
      <person id="ARS">
        <name>Andrew Siegel</name>
      </person>
    </authors>
  </book>
</library>

This introduces a namespace at the document root level. All elements that carry the namespace prefix (blank in this case, as we will see on the next slide) belong to the http://masters.cs.uchicago.edu/library namespace (either directly or indirectly). Unprefixed attributes such as id are not placed in the default namespace.
Sample namespaces
<?xml version="1.0"?>
<lib:library xmlns:lib="http://masters.cs.uchicago.edu/library">
  <lib:book id="1234">
    <lib:title>Intro to XML</lib:title>
    <lib:authors>
      <lib:person id="ARS">
        <lib:name>Andrew Siegel</lib:name>
      </lib:person>
    </lib:authors>
  </lib:book>
</lib:library>

Same as the previous slide, but now the namespace is not the default (i.e. blank) one. All elements must be prefixed with lib:
Note that the choice of 'lib' is totally arbitrary.
Also note that the namespace http://... is a URI. It does not have to point to any web resource. The standard requires that a URI be used for uniqueness.
Note also that the attributes do not use the namespace directly. This is the default behavior, as we will see, but it can be changed.
Sample namespaces
<?xml version="1.0"?>
<l:library xmlns:l="http://masters.cs.uchicago.edu/library">
  <l:book id="1234">
    <l:title>Intro to XML</l:title>
    <l:authors>
      <l:person id="ARS">
        <l:name>Andrew Siegel</l:name>
      </l:person>
    </l:authors>
  </l:book>
</l:library>

Same as the previous slide, but now we choose to use "l" as our shortcut rather than "lib".
Pathological case
<?xml version="1.0"?>
<l:library xmlns:l="http://masters.cs.uchicago.edu/library">
  <l:book id="1234" xmlns:lib="http://masters.cs.uchicago.edu/library">
    <l:title>Intro to XML</l:title>
    <lib:authors>
      <l:person id="ARS">
        <name>Andrew Siegel</name>
      </l:person>
    </lib:authors>
  </l:book>
</l:library>

Note that you can introduce many prefixes for the same namespace (not recommended) and mix them however you choose. However, they only propagate down from the element where they were introduced. The unprefixed <name> element here is in no namespace at all, because no default namespace is declared.
So What
So far, this doesn't accomplish much.
It just allows us to reuse the same names within a single XML document that we control.
Next, we examine the relationship between a schema and the corresponding .xml document.
XML-schema relationship

<library xmlns="http://masters.cs.uchicago.edu/library">
  <book id="1234" available="yes">
    <isbn>123456789 </isbn>
    <title> Intro to XML</title>
  </book>
</library>

<xs:schema targetNamespace="http://masters.cs.uchicago.edu/library"
    elementFormDefault="qualified" attributeFormDefault="unqualified"
    xmlns:lib="http://masters.cs.uchicago.edu/library"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  …
</xs:schema>

The namespace declared with xmlns on <library> and the targetNamespace of the schema must match. You can have other namespaces as well, but you must declare at least the targetNamespace.
The xmlns:lib and xmlns:xs declarations in the schema just introduce namespaces, as we did before.
More on previous example
It is very important to realize that there are two things going on here:
 • Introduction of whatever namespaces we want and their associated prefixes. This is easy.
 • In the schema, using one of these namespaces as our targetNamespace. This does not put the markup of the schema document itself (xs:element, xs:attribute, ...) into that namespace. Rather, it is the namespace of the components that the schema defines, and it is used to resolve all references to those components!
More …
Recall that when we link an XML document to a schema that has no target namespace, we specify the location of the schema with
xsi:noNamespaceSchemaLocation="..."
For a schema with a targetNamespace, we use the xsi:schemaLocation attribute instead.
This takes pairs of values (a namespace followed by a schema URI), such as
<library
xsi:schemaLocation="http://masters.cs.uchicago.edu/cspp53025/bc
benchmark.xsd"
>
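As a rough sketch of how the pieces fit together, the instance below declares the xsi prefix and gives xsi:schemaLocation a single namespace/location pair. The namespace URI is reused from the earlier library slides, and the file name library.xsd is a hypothetical placeholder, not a file from the course.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance: library.xsd is an assumed file name. -->
<library xmlns="http://masters.cs.uchicago.edu/library"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://masters.cs.uchicago.edu/library library.xsd">
  <book id="1234">
    <title>Intro to XML</title>
  </book>
</library>
```

The first token of the pair must equal the schema's targetNamespace; the second tells the processor where it may look for the schema document. A schema without a target namespace would instead be referenced with xsi:noNamespaceSchemaLocation and a single location value.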
Reminder: Global Elements
Global definition: “All the components (elements,
attributes, simple and complex types, element
and attribute groups) can be defined directly at
the top level of the schema, directly under the
xs:schema document element. Their definition is
said to be “global”, and they can be referenced
elsewhere in the schema, as well as in any
schema that has imported or included this
schema.”
Qualified vs. Unqualified
The declaration of a targetNamespace gives a choice between defining local elements/attributes that
1. Must be explicitly associated with the targetNamespace in instance documents: "qualified"
2. Must not be explicitly associated with the targetNamespace: "unqualified"

Note: top-level (global) elements must always be qualified!


Examples
<xs:element name="book" form="qualified"/>
<xs:element name="isbn" form="qualified"/>
<xs:attribute name="lang" form="unqualified"/>
<xs:attribute name="character" form="unqualified"/>

The form attribute is only allowed on local declarations. <xs:schema .../> can also specify the behavior for the entire document using

elementFormDefault="qualified | unqualified"
attributeFormDefault="qualified | unqualified"

These themselves have defaults (both unqualified).


Combining schema
The main purpose of using namespaces is to combine schemas from different locations.
Since each schema document defines only one targetNamespace, this can only be done by merging different schemas to form a larger schema.
Include statement
The xs:include feature allows direct textual inclusion of the top-level components of another schema:
 • <xs:include schemaLocation="student.xsd"/>
Much like include in programming languages.
Included elements are then top-level elements of the new schema.
No precedence rules for name conflicts!
Good for including types defined in other schemas.
More on Include
Include has a few limitations:
 Cannot handle namespaces (see import next)
Name conflicts very likely to occur
 Cannot include parts of another schema. It is all or
nothing
This is mitigated by xs:redefine, which allows the inclusion to
change parts of the included file
 By extension of the included element
 By restriction of the included element

Similar to extension/restriction of complex types
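A minimal sketch of xs:redefine, assuming two hypothetical files base.xsd and main.xsd (neither comes from the course material). The redefinition restricts the included type in place, so every use of nameType in the combined schema picks up the tighter limit.

```xml
<!-- File 1, base.xsd (hypothetical): nameType is an unrestricted string -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="nameType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
</xs:schema>

<!-- File 2, main.xsd (hypothetical): includes base.xsd and redefines nameType -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:redefine schemaLocation="base.xsd">
    <!-- A redefined type must derive from itself: base is the same name -->
    <xs:simpleType name="nameType">
      <xs:restriction base="nameType">
        <xs:maxLength value="40"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:redefine>
  <xs:element name="student" type="nameType"/>
</xs:schema>
```

Like xs:include, xs:redefine only works when both files share the same target namespace (or, as here, have none).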


Import
xs:import is like include, but it handles namespaces
 • <xs:import namespace="http://masters.cs.uchicago.edu/library"
   schemaLocation="libTypes.xsd"/>
Note that this namespace mapping (an xmlns:prefix declaration) must be defined in the importing schema.
Guarantees that building-block pieces of schema will not have name conflicts!
SchemaLocation
One important additional note is that schemaLocation, whether used in import or at the root level of an XML document, is only a suggestion to the XML processor!
If omitted, "the schema author is leaving the identification of the schema to the instance, application, or user" via other mechanisms.
Fast Reactor Schema example
How to express units?
<xs:element name="conductivity">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:decimal">
<xs:attribute name="units" use="required" type="unitsType"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>

<xs:simpleType name="unitsType">
<xs:restriction base="xs:string">
<xs:enumeration value="K"/>
<xs:enumeration value="g_per_cm3"/>
<xs:enumeration value="W_per_cm-K"/>
<xs:enumeration value="kW_per_m"/>
</xs:restriction>
</xs:simpleType>
Exercise 1
Suggest a good solution for describing the units
attribute (or element if you prefer), keeping in
mind the following:
 Each physical property has different combinations of
units in general
 There are typical ways of expressing the values (e.g.
grams vs Mgrams), but we might not want to prohibit
unusual choices
 We would like robust validation
 We might want to compute new properties from
combinations of existing properties
 We want something not too difficult to parse
A Solution
My initial solution is particularly bad for pretty obvious reasons.
One idea: use a vocabulary that understands the concept of a fraction, multiplication, etc., so that the structure of the units is easy to ascertain.
This is time-consuming to construct, so build off of an existing vocabulary.
Math Markup Language
MathML is an XML notation for describing mathematical notation and capturing both its structure and content. The goal of MathML is to allow mathematics to be served, received, and processed on the World Wide Web, just as HTML has enabled this functionality for text.
Presentation vs. Content
Mathematical expressions are built up as syntax trees
- of layout schemata in Presentation-MathML
- of logical subexpressions in Content-MathML

Example: (a+b)^2

Presentation MathML:
<msup>
  <mfenced>
    <mi>a</mi>
    <mo>+</mo>
    <mi>b</mi>
  </mfenced>
  <mn>2</mn>
</msup>

Content MathML:
<apply>
  <power/>
  <apply>
    <plus/>
    <ci>a</ci>
    <ci>b</ci>
  </apply>
  <cn>2</cn>
</apply>

Note that the root element of the MathML schema is <math>.


Using MathML
Assuming we can apply MathML fairly easily by itself, how do we incorporate it into our fast_reactor schema?
Note that an XML instance document can only directly point to one schema …
xs:import
We can combine schemas by designing a base schema that uses <xs:import> to combine with other schemas.
Syntax is, e.g.:
<xs:import
schemaLocation="http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"
namespace="http://www.w3.org/1998/Math/MathML">
</xs:import>
 • schemaLocation gives the URI of the MathML schema
 • namespace must refer to the MathML targetNamespace


Simpler example
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:form="http://masters.cs.uchicago.edu/ace104"
xmlns:m="http://www.w3.org/1998/Math/MathML"
targetNamespace="http://masters.cs.uchicago.edu/ace104"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:import schemaLocation="http://www.w3.org/Math/XMLSchema/mathml2/mathml2.xsd"
namespace="http://www.w3.org/1998/Math/MathML"></xs:import>
<xs:element name="formula">
<xs:complexType>
<xs:sequence>
<xs:element ref="m:math"/>
<xs:element name="description" type="xs:anySimpleType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Corresponding XML
<formula
    xmlns="http://masters.cs.uchicago.edu/ace104"
    xmlns:mml="http://www.w3.org/1998/Math/MathML"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://masters.cs.uchicago.edu/ace104 file:formula.xsd">
  <!-- The schema references m:math, so the MathML content sits inside a math root -->
  <mml:math>
    <mml:mrow>
      <mml:msup> <mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msup>
      <mml:mo>+</mml:mo>
      <mml:mrow>
        <!-- &#x2062; is the invisible-times operator: "four times x" -->
        <mml:mn>4</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>x</mml:mi>
      </mml:mrow>
      <mml:mo>+</mml:mo>
      <mml:mn>4</mml:mn>
    </mml:mrow>
  </mml:math>
  <description>x squared plus four times x plus four</description>
</formula>
Qualified vs. Unqualified
How would the previous slide look if we specified
 • elementFormDefault="qualified"
 • attributeFormDefault="qualified"
What can we infer about mathml2.xsd regarding qualification?
Qualified vs. Unqualified
Note that it is the declaration in mathml2.xsd that determines namespace qualification for the elements that are defined in that schema.
 • This is defined as
   elementFormDefault="qualified"
   attributeFormDefault="unqualified"
Use of namespaces within Schema
Note that the targetNamespace prefix is often used within the schema itself. Example:
<xs:schema targetNamespace="http://dyomedea.com/ns/library"
    elementFormDefault="qualified" attributeFormDefault="unqualified"
    xmlns:lib="http://dyomedea.com/ns/library"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" type="lib:bookType"/>

  <xs:complexType name="bookType">
    <xs:sequence>
      ..
Name Conflicts
XML Schemas allow name reuse (DTDs don't).
Where can the same name be used, and where will there be a name conflict?
 • Type definitions are placed in one symbol space.
 • Element declarations are placed in a second symbol space.
 • Attribute declarations are placed in a third symbol space.
 • Hence, you can have a type, an element, and an attribute all with the same name!
 • Also, each type definition creates a new symbol space.

What's Legal?
Legal
 • Element, attribute, type (complex or simple) with the same name
 • Same name in different Symbol Spaces
 • Same name in different namespaces
(The relationship between symbol spaces and namespaces is confusing. We will explore it shortly.)

Illegal
 • Same name and same Symbol Space but different type

Legal
 • Same name and same Symbol Space and same type
<xsd:element name="foo">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="bar" type="xsd:string"/>
      ...
      <xsd:element name="bar" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
Is this legal or illegal?

<xsd:element name="foo">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="bar" type="xsd:string"/>
      ...
      <xsd:element name="bar" type="xsd:integer"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

Is this legal or illegal?


<xsd:element name="BookOnCars">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="Chapter">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="Title" type="xsd:string"/>
            <xsd:element name="Section">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="Title" type="xsd:string"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="Title" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:complexType name="Title">
  <xsd:sequence>
    <xsd:element name="CarManufacturer" type="xsd:string"/>
    <xsd:element name="Year" type="year"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:element name="Title" type="xsd:string"/>
<xsd:attribute name="Title" type="xsd:string"/>

Global element symbol space: BookOnCars, Title
Global attribute symbol space: Title
Global type symbol space: Title

In the same schema, each complex type also creates its own local symbol space:
 • Anonymous type of BookOnCars: Chapter, Title
 • Anonymous type of Chapter: Title, Section
 • Anonymous type of Section: Title
 • Named type Title: CarManufacturer, Year
Global/Local Elements and Namespaces
Only global elements are in the namespace!
Local elements are associated with the global elements.
In our example, the only elements in the namespace are BookOnCars and Title (the globally-declared Title element).
BookOnCars has two local elements associated with it: Chapter and Title.
 • Chapter has two elements associated with it: Title and Section
 • Section has one element associated with it: Title

Same Situation with Attributes
Attributes in an XML document are in the same
situation as the schema local elements - there
can be many attributes with the same name in
an XML document; thus, there would be many
name collisions if attributes were made part of a
default namespace. Instead, we say that
attributes are not in a default namespace.
Rather, they are associated with elements which
are in the namespace. (See next slide for
example)
elementFormDefault
In all of our examples thus far we have set the value of this schema attribute to "qualified". The "qualified" value means that in an instance document all elements, global and local, must be qualified.
Alternatively, you can assign elementFormDefault the value "unqualified". The "unqualified" value means that in an instance document only the global elements are qualified, and the local elements must not be qualified.
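To make the difference concrete, here is a small sketch with one global element (book) and one local element (title). The namespace http://example.org/lib and all element names are made up for illustration.

```xml
<!-- Schema sketch (hypothetical names): book is global, title is local -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/lib"
           elementFormDefault="qualified">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

<!-- Valid instance with elementFormDefault="qualified":
     both book and title are in the target namespace -->
<lib:book xmlns:lib="http://example.org/lib">
  <lib:title>Intro to XML</lib:title>
</lib:book>

<!-- Valid instance if the schema instead said elementFormDefault="unqualified":
     only the global element book is qualified, title is in no namespace -->
<lib:book xmlns:lib="http://example.org/lib">
  <title>Intro to XML</title>
</lib:book>
```

With the "unqualified" setting, declaring the target namespace as the default namespace (xmlns="...") on book would make the instance invalid, because the unprefixed title would then inherit that namespace.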
References, Explicit, and Anonymous Types
Some design choices to consider
Using References
Recall that you don't have to have the content of an element defined in the
nested fashion as just shown
<xs:element name="rooms">
<xs:complexType>
<xs:sequence>
<xs:element name="room">
<xs:complexType>
<xs:sequence>
<xs:element name="capacity" type="xs:decimal"/>

You can define the element globally and use a reference to it instead.
<xs:element name="rooms">
<xs:complexType>
<xs:sequence>
<xs:element ref="room"/>
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:element name="room">

</xs:element>
Rooms Schema using References

<?xml version="1.0" encoding="UTF-8"?>


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="rooms">
<xs:complexType>
<xs:sequence>
<xs:element ref="room" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="room">
<xs:complexType>
<xs:sequence>
<xs:element name="capacity" type="xs:decimal"/>
<xs:element name="equiptmentList"/>
<xs:element ref="features" minOccurs="0" maxOccurs="1"/>
</xs:sequence>
<xs:attribute name="name" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>

<xs:element name="features">
<xs:complexType>
<xs:sequence>
<xs:element name="feature" type="xs:string" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Types

Also, recall that named types themselves can be defined independently of any
element and used to create elements.

<xsd:element name="Robot">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="Sensor_List" minOccurs="0"/>
<xsd:element ref="Specification_List" minOccurs="0"/>
<xsd:element ref="Note" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
OR
<xsd:element name="Robot" type="RoboType"/>
<xsd:complexType name="RoboType">
<xsd:sequence>
<xsd:element ref="Sensor_List" minOccurs="0"/>
<xsd:element ref="Specification_List" minOccurs="0"/>
<xsd:element ref="Note" minOccurs="0"/>
</xsd:sequence>
</xsd:complexType>
Types vs. Refs vs. Anonymous
Questions to the class
 • What are the relative advantages of using refs vs. anonymous types?
 • What is the difference between a ref and a type definition?
XML Simple Types
CSPP51038 shortcourse
Simple Types
Recall that simple types are composed of text-only values.
All attributes are of simple type.
Elements with text and no attributes are of simple type.
Careful: elements with text and attributes have a complex type, which is said to have simple content.
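The distinction can be sketched with two declarations. The element names isbn and price and the currency attribute are hypothetical, chosen only for illustration.

```xml
<!-- Simple type: text only, no attributes -->
<xs:element name="isbn" type="xs:token"/>

<!-- Complex type with simple content: text plus an attribute -->
<xs:element name="price">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<!-- Matching instance fragments -->
<isbn>0131103628</isbn>
<price currency="USD">19.95</price>
```

Adding even one attribute forces the move from a simple type to a complex type, although the text content is still validated against a simple type (xs:decimal here).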
Types
There are 44 built-in simple types defined in the XML Schema namespace.
For your homework, you must use reasonably restrictive types in your schema.
See Costello slides for details


 http://www.xfront.com/#tutorials
 Copyright © [2002]. Roger L. Costello. All Rights Reserved.
Defining New Types
Can define new simple types built off of existing types.
Types are derived from other, more general types.
Type hierarchy:
http://www.w3.org/TR/xmlschema-2/#built-in-datatypes
Simple Type
<xs:simpleType>

Defines new simple types.


 Character data
 No children (i.e., no subelements)
 No attributes.

Can create new simple types by:


 Restriction
 List
 Union
Derivation by Restriction
12 facets that describe possible restrictions on a built-in data type
 • length, minLength, maxLength
 • pattern (uses a regular expression language)
 • enumeration
 • whiteSpace
 • minInclusive, maxInclusive
 • minExclusive, maxExclusive
 • totalDigits, fractionDigits
Not all facets apply to all built-in data types
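As an illustration of the numeric facets, the sketch below combines three of them on xs:decimal. The type name priceType is hypothetical.

```xml
<!-- Hypothetical type: non-negative decimal, at most 6 digits, at most 2 after the point -->
<xs:simpleType name="priceType">
  <xs:restriction base="xs:decimal">
    <xs:minInclusive value="0"/>
    <xs:totalDigits value="6"/>
    <xs:fractionDigits value="2"/>
  </xs:restriction>
</xs:simpleType>
```

Under this definition 1234.56 is accepted, while 1234.567 (three fraction digits), 12345678 (eight digits in total), and -1 (below minInclusive) are rejected. The same facets would not be applicable to xs:string, which is what "not all facets apply to all built-in data types" means in practice.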
Restriction syntax

<xsd:simpleType name="typeName">
  <xsd:restriction base="builtIn_or_userDef">
    <facet1 value="Value"/>
    <facet2 value="Value"/>
    <facet3 value="Value"/>
    <!-- etc. -->
  </xsd:restriction>
</xsd:simpleType>
List and Union syntax

<xsd:simpleType name="typeName">
  <xsd:list itemType="simpleType"/>
</xsd:simpleType>

<xsd:simpleType name="typeName">
  <xsd:union memberTypes="simpleType1 simpleType2 ..."/>
</xsd:simpleType>
Restriction examples
<!-- restricting upper and lower value of an integer -->
<xs:simpleType name="myInteger">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="-2"/>
    <xs:maxInclusive value="5"/>
  </xs:restriction>
</xs:simpleType>

<!-- enumeration of car names -->
<xs:simpleType name="cars">
  <xs:restriction base="xs:string">
    <xs:enumeration value="pinto"/>
    <xs:enumeration value="gremlin"/>
    <xs:enumeration value="pacer"/>
  </xs:restriction>
</xs:simpleType>
More examples
<xs:simpleType name="CapitalizedNameWS">
  <xs:restriction base="xs:token">
    <xs:pattern value="([A-Z]([a-z]*) ?)+"/>
  </xs:restriction>
</xs:simpleType>

A token is a string with no tabs or line breaks, no leading or trailing whitespace, and no runs of more than one space.

<!-- disable scientific notation -->
<xs:simpleType name="nonScientific">
  <xs:restriction base="xs:float">
    <xs:pattern value="[^eE]*"/>
  </xs:restriction>
</xs:simpleType>
Examples
<!-- a list of exactly three ints -->
<xs:simpleType name="testList">
  <xs:list itemType="xs:int"/>
</xs:simpleType>

<xs:simpleType name="constrList" final="list">
  <xs:restriction base="testList">
    <xs:maxLength value="3"/>
    <xs:minLength value="3"/>
  </xs:restriction>
</xs:simpleType>
Union
<!-- a list of mixed-type items with at most three entries -->
<xs:simpleType name="testUnion">
  <xs:union memberTypes="xs:string xs:int xs:boolean"/>
</xs:simpleType>

<xs:simpleType name="unionList">
  <xs:list itemType="testUnion"/>
</xs:simpleType>

<xs:simpleType name="fixedUnionList">
  <xs:restriction base="unionList">
    <xs:maxLength value="3"/>
  </xs:restriction>
</xs:simpleType>
Examples
<!-- example: two facets together -->
<xs:simpleType name="myString">
  <xs:restriction base="xs:string">
    <xs:length value="3"/>
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>

Weird facet: whitespace is normalized (here, collapsed) before validation
“Fixed” attribute

<!-- disallows modification of the minInclusive facet during derivation -->
<xs:simpleType name="minInclusive">
  <xs:restriction base="xs:float">
    <xs:minInclusive value="10" fixed="true"/>
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>
Exercise
1. Create a new simple type that is a list of exactly three xs:token values, each of which is between 9 and 10 characters long.

2. Create a new simple type that is a list of 10-20 elements, each member of which can be either an xs:int greater than zero or an xs:date.

Syntax reminder for restriction, list, and union:

<xsd:simpleType name="typeName">
  <xsd:restriction base="builtIn_or_userDef">
    <facet1 value="Value"/>
    <facet2 value="Value"/>
    <facet3 value="Value"/>
    <!-- etc. -->
  </xsd:restriction>
</xsd:simpleType>

<xsd:simpleType name="typeName">
  <xsd:list itemType="simpleType"/>
</xsd:simpleType>

<xsd:simpleType name="typeName">
  <xsd:union memberTypes="simpleType1 simpleType2 ..."/>
</xsd:simpleType>
Answer 1
<xs:simpleType name="foo1">
<xs:restriction base="xs:token">
<xs:maxLength value="10"> </xs:maxLength>
<xs:minLength value="9"> </xs:minLength>
</xs:restriction>
</xs:simpleType>

<xs:simpleType name="foo2">
<xs:list itemType="foo1"> </xs:list>
</xs:simpleType>

<xs:simpleType name="foo3">
<xs:restriction base="foo2">
<xs:length value="3"> </xs:length>
</xs:restriction>
</xs:simpleType>
Answer 2
<xs:simpleType name="posInt">
<xs:restriction base="xs:int">
<xs:minInclusive value="1"/>
</xs:restriction>
</xs:simpleType>

<xs:simpleType name="listofUnions">
<xs:list>
<xs:simpleType>
<xs:union memberTypes="posInt xs:date"/>
</xs:simpleType>
</xs:list>
</xs:simpleType>

<xs:simpleType name="restrictedListofUnions">
<xs:restriction base="listofUnions">
<xs:minLength value="10"/>
<xs:maxLength value="20"/>
</xs:restriction>
</xs:simpleType>
Complex Types
Complex vs. Simple types
Recall that these are fundamentally different.
Simple types exist independently of the markup and could be used to describe data in entirely different scenarios (RDBMS, CSV files, etc.).
Complex types are a description of the markup structure.
XML names (e.g. xs:restriction) are reused for both kinds of types, but their meanings can be very different.
Content model review
• Recall the definition of the various content models

Content model    Mixed   Complex   Simple   Empty
Child elements   Yes     Yes       No       No
Child text       Yes     No        Yes      No

• Recall that complex types are defined differently in Schema depending on whether they have simple or complex content models
Named vs. Anonymous Types
Complex types can be either named (global) or anonymous (local).
Global definitions must have names and be defined one level below the <schema> element.
Global definitions can then be used in any element using the type attribute.
New complex types can be derived from global definitions.
Creation vs. Derivation
For simple types only derivation is possible (cannot create new simple types from scratch).
For complex types the opposite is true:
 • No native complex types
 • Can define new content models from scratch or extend or restrict existing ones
Complex types with simple content models
This is the case closest to simple types, so we begin here.

<xs:complexType name="tokenWithLang">
  <xs:simpleContent>
    <xs:extension base="xs:token">
      <xs:attribute ref="lang"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:element name="title" type="tokenWithLang"/>


Derivation by extension

Can extend this complex type to add attributes as follows:

<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="tokenWithLang">
        <xs:attribute name="note" type="xs:token"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
Derivation by restriction
Can restrict text node in regular way
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="tokenWIthLangAndNote">
<xs:simpleContent>
<xs:extension base="xs:token">
<xs:attribute name="lang" type="xs:language"/>
<xs:attribute name="note" type="xs:token"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>

<xs:element name="title">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="tokenWIthLangAndNote">
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:schema>
Derivation by restriction
Can remove an attribute using prohibited
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="tokenWIthLangAndNote">
<xs:simpleContent>
<xs:extension base="xs:token">
<xs:attribute name="lang" type="xs:language"/>
<xs:attribute name="note" type="xs:token"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>

<xs:element name="title">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="tokenWIthLangAndNote">
<xs:attribute name="note" use="prohibited"/>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:schema>
Derivation by restriction
Can restrict the value of attributes
<xs:element name="title">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="tokenWIthLangAndNote">
<xs:attribute name="lang">
<xs:simpleType>
<xs:restriction base="xs:language">
<xs:enumeration value="en"/>
<xs:enumeration value="es"/>
</xs:restriction>
</xs:simpleType>
</xs:attribute>
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Derivation of Complex Content by extension

<?xml version="1.0" encoding="UTF-8"?>


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:complexType name="basePerson">
<xs:sequence>
<xs:element ref="name"/>
<xs:element ref="born"/>
</xs:sequence>
<xs:attribute ref="id"/>
</xs:complexType>
<!-- The extension below is similar to adding a second sequence after the existing one -->

<xs:element name="author">
<xs:complexType>
<xs:complexContent>
<xs:extension base="basePerson">
<xs:sequence>
<xs:element ref="dead" minOccurs="0"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
</xs:schema>
Restricting complex content models
<xs:complexType name="person">
  <xs:sequence>
    <xs:element ref="name"/>
    <xs:element ref="born"/>
    <xs:element ref="dead" minOccurs="0"/>
    <xs:element ref="qualification" minOccurs="0"/>
  </xs:sequence>
  <xs:attribute ref="id"/>
</xs:complexType>

<xs:element name="author">
  <xs:complexType>
    <xs:complexContent>
      <xs:restriction base="person">
        <xs:sequence>
          <xs:element ref="name"/>
          <xs:element ref="born"/>
          <xs:element ref="dead" minOccurs="0"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:element>

<xs:element name="character">
  <xs:complexType>
    <xs:complexContent>
      <xs:restriction base="person">
        <xs:sequence>
          <xs:element ref="name"/>
          <xs:element ref="born"/>
          <xs:element ref="qualification"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:element>
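For illustration, instance fragments along the following lines would fit the two restricted types. The values are placeholders, and the sketch assumes that name, born, dead, and qualification are globally declared elements with text content (their declarations are not shown on the slide).

```xml
<!-- author: qualification is no longer allowed, dead remains optional -->
<author>
  <name>Charles M. Schulz</name>
  <born>1922-11-26</born>
  <dead>2000-02-12</dead>
</author>

<!-- character: dead is no longer allowed, qualification is now required -->
<character>
  <name>Snoopy</name>
  <born>1950-10-04</born>
  <qualification>extroverted beagle</qualification>
</character>
```

Both fragments are also valid against the base type person, which is the point of derivation by restriction: every instance of the restricted type must remain a valid instance of its base.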
Key, keyref, ID, IDRef
Big Picture
How can we write a schema that requires things like the following:
 • A certain element must be unique based on some criterion
    E.g. each book has a different isbn
 • A certain element may only refer to some element already defined in the given XML document
    E.g. every aisle can only contain a product from the productList
These and similar things can be accomplished using either keys or IDs
ID/IDRef
Allow you to associate a unique identifier with a given class of elements.
ID and IDREF are a carryover from DTDs.
They solve the same problem as key/keyref but have more limitations.
Still, some people find them simpler to use.


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="library">
<xs:complexType>
<xs:sequence>
<xs:element name="authorList" type="authorListType"/>
<xs:element name="bookList" type="bookListType"/>
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:complexType name="bookListType">
<xs:sequence>
<xs:element name="book" type="bookType"/>
</xs:sequence>
</xs:complexType>

<xs:complexType name="authorListType">
<xs:sequence>
<xs:element name="author" type="authorType" minOccurs="1" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="authorType">
<xs:sequence>
<xs:element name="last" type="xs:string"/>
<xs:element name="first" type="xs:string"/>
</xs:sequence>
<xs:attribute name="identifier" type="xs:ID" use="required"/>
</xs:complexType>

<xs:complexType name="bookType">
<xs:sequence>
<xs:element name="isbn" type="xs:token"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="author-ref">
<xs:complexType>
<xs:attribute name="ref" type="xs:IDREF" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>

</xs:schema>
<?xml version="1.0" encoding="UTF-8"?>
<library xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:/Users/siegela/Desktop/cspp51038/examples/IDExample
<authorList>
<author identifier="ID000">
<last>last0</last>
<first>first0</first>
</author>
<author identifier="ID001">
<last>last0</last>
<first>first0</first>
</author>
</authorList>
<bookList>
<book>
<isbn>isbn0</isbn>
<title>title0</title>
<author-ref ref="ID000"/>
</book>
</bookList>
</library>
Key/keyref
Much more flexible and arguably more sensible than IDs.
Different philosophy: keys are not types but rather elements of the schema.
XPath is used to select which field(s) are used as keys for a given element.
Unique
The Unique construct is very similar to Key. It applies to an element and defines a simple check for uniqueness. The selector is evaluated relative to the element that carries the constraint, so a rule about the books of a library is declared on library:
<xs:element name="library">
  <xs:complexType>

  </xs:complexType>
  <xs:unique name="book">
    <xs:selector xpath="book"/>
    <xs:field xpath="isbn"/>
  </xs:unique>
</xs:element>

"For each library, each book identified by its isbn must be unique"
Key
Key is identical to unique, with the additional restriction that all
nodes corresponding to all the fields are required.

Syntax is the same.
Key references

A keyref defines a reference to a key or unique.

Note that the refer attribute can only name a key or unique whose values
are in scope: one declared on the same element as the keyref, or on one
of that element's descendants.
Uniqueness & Keys
DTDs provide the ID attribute datatype for uniqueness (i.e., an ID
value must be unique throughout the entire document, and the XML
parser enforces this).

XML Schema has much enhanced uniqueness capabilities. It:
- enables you to define element content to be unique.
- enables you to define non-ID attributes to be unique.
- enables you to define a combination of element content and
  attributes to be unique.
- enables you to distinguish between unique and key.
- enables you to declare the range of the document over which
  something is unique.
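A minimal sketch of two of these capabilities together: a uniqueness rule that combines element content with an attribute, scoped to one part of the document. All names here (inventory, warehouse, item, sku, lot, itemPerWarehouse) are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="inventory">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="warehouse" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="item" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="sku" type="xs:string"/>
                  </xs:sequence>
                  <xs:attribute name="lot" type="xs:string" use="required"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
          <!-- Declared on warehouse, so the range is one warehouse at a time:
               the same sku/lot pair may appear again in a different warehouse -->
          <xs:unique name="itemPerWarehouse">
            <xs:selector xpath="item"/>
            <!-- element content and a non-ID attribute combined into one value -->
            <xs:field xpath="sku"/>
            <xs:field xpath="@lot"/>
          </xs:unique>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Moving the unique declaration up to the inventory element (with selector warehouse/item) would widen the range to the whole document.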
unique vs key
Key: an element or attribute (or combination thereof) which is defined
to be a key must:
- always be present (minOccurs must be greater than zero)
- be non-nillable (i.e., nillable="false")
- be unique

Key implies unique, but unique does not imply key.
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="workPlan">
<xs:complexType>
<xs:sequence>
<xs:element name="assignment" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="assignee" type="xs:string"> </xs:element>
<xs:element name="taskName" type="xs:string"> </xs:element>
<xs:element name="length" type="xs:int"> </xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="employee" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="social" type="xs:string"/>
<xs:element name="last" type="xs:string"/>
<xs:element name="first" type="xs:string"/>
</xs:sequence>
</xs:complexType>

</xs:element>
</xs:sequence>
</xs:complexType>
<!-- "Each employee element of workPlan is unique based on its social field." -->
<xs:unique name="social_key">
<xs:selector xpath="employee"/>
<xs:field xpath="social"/>
</xs:unique>

</xs:element>
</xs:schema>
<?xml version="1.0" encoding="UTF-8"?>
<workPlan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:workPlan.xsd">
<assignment>
<assignee>assignee0</assignee>
<taskName>taskName0</taskName>
<length>2147483647</length>
</assignment>
<employee>
<social>social0</social>
<last>last0</last>
<first>first0</first>
</employee>
<employee> <!-- This won't validate: the social value below repeats social0 -->
<social>social0</social>
<last>last0</last>
<first>first0</first>
</employee>
</workPlan>
Adding keyref
<xs:unique name="social_key">
<xs:selector xpath="employee"/>
<xs:field xpath="social"/>
</xs:unique>

<xs:keyref name="social_keyref" refer="social_key">
<xs:selector xpath="assignment"/>
<xs:field xpath="assignee"/>
</xs:keyref>
This says: "the assignee field of each assignment child of the current node must
contain a social_key value."
<?xml version="1.0" encoding="UTF-8"?>
<workPlan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="file:workPlan.xsd">
<assignment>
<assignee>social3</assignee> <!-- This won't validate: no employee has social3 -->
<taskName>taskName0</taskName>
<length>2147483647</length>
</assignment>
<employee>
<social>social0</social>
<last>last0</last>
<first>first0</first>
</employee>
<employee>
<social>social1</social>
<last>last0</last>
<first>first0</first>
</employee>
</workPlan>
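For contrast, a sketch of an instance that should satisfy both constraints, assuming the workPlan schema shown earlier with social_key and social_keyref declared on workPlan. Only the assignee and length values differ from the failing document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<workPlan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="file:workPlan.xsd">
  <assignment>
    <!-- matches the social value of the second employee, so the keyref resolves -->
    <assignee>social1</assignee>
    <taskName>taskName0</taskName>
    <length>5</length>
  </assignment>
  <employee>
    <social>social0</social>
    <last>last0</last>
    <first>first0</first>
  </employee>
  <employee>
    <social>social1</social>
    <last>last0</last>
    <first>first0</first>
  </employee>
</workPlan>
```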
<xs:key name="social_key">
<xs:selector xpath="employee"/>
<xs:field xpath="social"/>
</xs:key>

<xs:keyref name="social_keyref" refer="social_key">
<xs:selector xpath="assignment"/>
<xs:field xpath="assignee"/>
</xs:keyref>

Note: key can be used almost identically to unique. The only difference is
that the field must exist and be non-nillable.
Using ISBN as a Key
When a book is published it has an ISBN, which is guaranteed to be unique.

In the BookStore we should be able to express that each Book's ISBN
element is unique. Further, let's make the ISBN elements keys (i.e.,
both unique and required to exist).
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.books.org"
xmlns="http://www.books.org"
xmlns:bk="http://www.books.org"
elementFormDefault="qualified">
<xsd:element name="BookStore">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Book" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"/>
<xsd:element name="Author" type="xsd:string"/>
<xsd:element name="Date" type="xsd:string"/>
<xsd:element name="ISBN" type="xsd:string"/>
<xsd:element name="Publisher" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<xsd:key name="PK">
<xsd:selector xpath="bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:key>
</xsd:element>
</xsd:schema>
<xsd:element name="BookStore">
...
<xsd:key name="PK">
<xsd:selector xpath="bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:key>
</xsd:element>

"Within <BookStore> we define a key, called PK. Select each <Book>, and
within each <Book> the ISBN element is a key."

In other words, within <BookStore> each <Book> must have an <ISBN> and
it must be unique.

This is nice! We are using the content of a field as a key. (We are no
longer limited to ID attributes for defining uniqueness.)
Must namespace-qualify XPath expressions
(when elementFormDefault="qualified")
Note that we namespace-qualified the XPath references:
<xsd:key name="PK">
<xsd:selector xpath="bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:key>

This is required, even though the targetNamespace is the default
namespace. It is an XPath requirement: an unprefixed name in an XPath
expression always means "no namespace", so the default namespace
declaration does not apply to it.

Note that if the schema had instead set elementFormDefault="unqualified",
then Book and ISBN would be in no namespace and the XPath expressions
would not be namespace-qualified.
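A sketch of the unqualified variant, assuming a cut-down BookStore schema with only Title and ISBN. With elementFormDefault="unqualified", the locally declared Book and ISBN elements belong to no namespace, so the selector and field paths carry no prefix.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="unqualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Book" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="Title" type="xsd:string"/>
              <xsd:element name="ISBN" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
    <!-- Book and ISBN are local elements, so here they are in no namespace -->
    <xsd:key name="PK">
      <xsd:selector xpath="Book"/>
      <xsd:field xpath="ISBN"/>
    </xsd:key>
  </xsd:element>
</xsd:schema>
```

An instance of this variant would qualify only the root element, for example <bk:BookStore xmlns:bk="http://www.books.org"> with unprefixed <Book> children, which is one reason many schemas prefer the qualified form.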
<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=
"http://www.books.org
BookStore.xsd">
<Book>
<Title>My Life and Times</Title>
<Author>Paul McCartney</Author>
<Date>1998</Date>
<ISBN>1-56592-235-2</ISBN>
<Publisher>McMillin Publishing</Publisher>
</Book>
<Book>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
<Author>Richard Bach</Author>
<Date>1977</Date>
<ISBN>0-440-34319-4</ISBN>
<Publisher>Dell Publishing Co.</Publisher>
</Book>
<Book>
<Title>The First and Last Freedom</Title>
<Author>J. Krishnamurti</Author>
<Date>1954</Date>
<ISBN>0-06-064831-7</ISBN>
<Publisher>Harper &amp; Row</Publisher>
</Book>
</BookStore>

A schema validator will verify that each Book has an ISBN element and
that the values are all unique.
Notes about <key>
It must be nested within an <element>
It must come at the end of <element> (after the content
model, and attribute declarations)
Use the <selector> element as a child of <key> to select
a set of elements for which the key applies.
Use the <field> element as a child of <key> to identify
the element or attribute that is to be the key
- There can be multiple <field> elements. See next example.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.CostelloReunion.org"
xmlns="http://www.CostelloReunion.org"
xmlns:reunion="http://www.CostelloReunion.org"
elementFormDefault="qualified">
<xsd:element name="Y2KFamilyReunion">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Participants" >
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Name" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="First" type="xsd:string"/>
<xsd:element name="Last" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<xsd:key name="PK">
<xsd:selector xpath="reunion:Participants/reunion:Name"/>
<xsd:field xpath="reunion:First"/>
<xsd:field xpath="reunion:Last"/>
</xsd:key>
</xsd:element>
</xsd:schema>

The key is the combination of the First and Last names.

<?xml version="1.0"?>
<Y2KFamilyReunion xmlns="http://www.CostelloReunion.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=
"http://www.CostelloReunion.org
Y2KFamilyReunion.xsd">
<Participants>
<Name><First>Peter</First><Last>Brown</Last></Name>
<Name><First>Peter</First><Last>Costello</Last></Name>
</Participants>
</Y2KFamilyReunion>

A schema validator will verify that each First name/Last name
combination is unique.
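As an illustration of what the combined key rejects, the hypothetical instance below adds a third Name whose First and Last both repeat an earlier entry. Sharing only a first name, as the first two entries do, is fine.

```xml
<?xml version="1.0"?>
<Y2KFamilyReunion xmlns="http://www.CostelloReunion.org">
  <Participants>
    <Name><First>Peter</First><Last>Brown</Last></Name>
    <Name><First>Peter</First><Last>Costello</Last></Name>
    <!-- Same First AND same Last as the first Name: violates key PK -->
    <Name><First>Peter</First><Last>Brown</Last></Name>
  </Participants>
</Y2KFamilyReunion>
```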
unique
The <unique> element is used exactly like the <key> element. It has a
<selector> and one or more <field> elements, just as <key> does.

The only difference is that the schema validator will simply validate
that, whenever present, the values are unique.
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.books.org"
xmlns="http://www.books.org"
xmlns:bk="http://www.books.org"
elementFormDefault="qualified">
<xsd:element name="BookStore">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Book" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"/>
<xsd:element name="Author" type="xsd:string"/>
<xsd:element name="Date" type="xsd:string"/>
<!-- Note: ISBN is optional -->
<xsd:element name="ISBN" type="xsd:string" minOccurs="0"/>
<xsd:element name="Publisher" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<!-- Require every ISBN be unique -->
<xsd:unique name="UNIQ">
<xsd:selector xpath="bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:unique>
</xsd:element>
</xsd:schema>
<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=
"http://www.books.org
BookStore24.xsd">
<Book>
<Title>My Life and Times</Title>
<Author>Paul McCartney</Author>
<Date>1998</Date>
<Publisher>McMillin Publishing</Publisher>
</Book>
<Book>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
<Author>Richard Bach</Author>
<Date>1977</Date>
<ISBN>0-440-34319-4</ISBN>
<Publisher>Dell Publishing Co.</Publisher>
</Book>
<Book>
<Title>The First and Last Freedom</Title>
<Author>J. Krishnamurti</Author>
<Date>1954</Date>
<ISBN>0-06-064831-7</ISBN>
<Publisher>Harper &amp; Row</Publisher>
</Book>
</BookStore>

A schema validator will verify that each Book which has an ISBN element
has a unique value. (Note that the first Book does not have an ISBN.
That's perfectly valid!)
Referencing a key
Recall that when an element is declared to be of type IDREF, that
element must reference an ID attribute, and an XML parser will verify
that the IDREF value corresponds to a legitimate ID value.

Similarly, you can define a keyref which asserts, "the value of this
element must match the value of an element referred to by this".
<?xml version="1.0"?>
<Library xmlns="http://www.library.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=
"http://www.library.org
AuthorSigningAtLibrary.xsd">
<Books>
<Book>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
<Author>Richard Bach</Author>
<Date>1977</Date>
<ISBN>0-440-34319-4</ISBN> <!-- a key element -->
<Publisher>Dell Publishing Co.</Publisher>
</Book>
...
</Books>
<GuestAuthors>
<Author>
<Name>Richard Bach</Name>
<BookForSigning>
<Title>Illusions The Adventures of a Reluctant Messiah</Title>
<ISBN>0-440-34319-4</ISBN> <!-- a keyref element -->
</BookForSigning>
</Author>
</GuestAuthors>
</Library>

Suppose that we define a key for ISBN (i.e., each Book must have an ISBN
and it must be unique). We would like to ensure that the ISBN for the
GuestAuthor matches one of the ISBNs listed under Books.
<xsd:element name="Library">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="Books">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="Book" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<xsd:element ref="GuestAuthors"/>
</xsd:sequence>
</xsd:complexType>
<xsd:key name="PK">
<xsd:selector xpath="bk:Books/bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:key>
<xsd:keyref name="isbnRef" refer="PK">
<xsd:selector xpath="bk:GuestAuthors/bk:Author/bk:BookForSigning"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:keyref>
</xsd:element>
<xsd:key name="PK">
<xsd:selector xpath="bk:Books/bk:Book"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:key>

This tells the schema validator to validate that every Book (in Books)
has an ISBN, and that the ISBN must be unique.

<xsd:keyref name="isbnRef" refer="PK">
<xsd:selector xpath="bk:GuestAuthors/bk:Author/bk:BookForSigning"/>
<xsd:field xpath="bk:ISBN"/>
</xsd:keyref>

This tells the schema validator that the ISBN of the Book that the
Author is signing must refer to one of the ISBN elements in the
collection defined by the PK key.
Note about key and keyref
If there are 2 fields in the key, then there must be 2 fields in the
keyref; if there are 3 fields in the key, then there must be 3 fields in
the keyref; and so on.

Further, the fields in the keyref must match the fields of the key in
type and position.
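A minimal sketch of a two-field key with a matching two-field keyref. The element and constraint names (Reunion, Participant, Speaker, NameType, participantKey, speakerRef) are hypothetical, and no target namespace is used so the XPath expressions stay unprefixed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="NameType">
    <xs:sequence>
      <xs:element name="First" type="xs:string"/>
      <xs:element name="Last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Reunion">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Participant" type="NameType" maxOccurs="unbounded"/>
        <xs:element name="Speaker" type="NameType" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <!-- key with two fields: (First, Last) -->
    <xs:key name="participantKey">
      <xs:selector xpath="Participant"/>
      <xs:field xpath="First"/>
      <xs:field xpath="Last"/>
    </xs:key>
    <!-- keyref with the same number of fields, in the same order, of the same type -->
    <xs:keyref name="speakerRef" refer="participantKey">
      <xs:selector xpath="Speaker"/>
      <xs:field xpath="First"/>
      <xs:field xpath="Last"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```

Swapping the two field elements inside speakerRef would compare a speaker's Last against a participant's First, which is why position matters.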
Extra Slides
Whitespace facet
The whiteSpace facet controls how white space in the element will be
processed.

There are three possible values for the whiteSpace facet:
- "preserve" causes the processor to keep all white space as-is
- "replace" causes the processor to replace each tab, carriage return,
  and line feed with a space character
- "collapse" causes the processor to do what "replace" does, then
  replace each run of spaces with a single space character and remove
  leading and trailing spaces

<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
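A small sketch of the "collapse" setting applied through a named type. The names CollapsedString and title are hypothetical. Note that the facet element is spelled whiteSpace, with a capital S.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="CollapsedString">
    <xs:restriction base="xs:string">
      <!-- trim both ends and squeeze inner runs of white space to one space -->
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="title" type="CollapsedString"/>
</xs:schema>
```

Under this type, an instance such as <title>   Intro   to XML  </title> is treated for validation purposes as the value "Intro to XML". The facet affects the value the validator works with, not the characters stored in the file.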
Type Extensions
A third way of creating a complex type is to extend another
complex type (like OO inheritance)

<xs:element name="Employee" type="PersonInfoType"/>


<xs:complexType name="PersonNameType">
<xs:sequence>
<xs:element name="FirstName" type="xs:string"/>
<xs:element name="LastName" type="xs:string"/>
</xs:sequence>
</xs:complexType>

<xs:complexType name="PersonInfoType">
<xs:complexContent>
<xs:extension base="PersonNameType">
<xs:sequence>
<xs:element name="Address" type="xs:string"/>
<xs:element name="City" type="xs:string"/>
<xs:element name="Country" type="xs:string"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
Type Extensions (use)

An element whose type is an extension of another type is written as
though its content were all defined in a single type:
<Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="TypeExtension.xsd">
<FirstName>King</FirstName>
<LastName>Arthur</LastName>
<Address>Round Table</Address>
<City>Camelot</City>
<Country>England</Country>
</Employee>
Simple Content in Complex Type
If a type contains only simple content (text and attributes), a
<simpleContent> element can be put inside the <complexType>.

<simpleContent> must have either an <extension> or a <restriction>.

This example is from the (Bridge of Death) Episode Dialog:

<xs:element name="dialog">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="speaker" type="xs:string" use="required"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
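For illustration, an instance that should validate against the dialog declaration above: text content plus the required speaker attribute. The speaker name and line of text are placeholders, not taken from the original example file.

```xml
<!-- valid: string content and the required speaker attribute -->
<dialog speaker="Arthur">Placeholder line of dialog.</dialog>
```

Omitting the speaker attribute, or nesting a child element inside dialog, would be rejected, since the type allows only character data and that one attribute.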
Model Groups
Model Groups are used to define an element that has either
- mixed content (elements and text mixed)
- element content

Model Groups can be
- all
  the elements specified must all be there, but in any order
- choice
  exactly one of the elements specified must be there
- sequence
  all of the elements specified must appear in the specified order
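A short sketch combining two of these groups, with hypothetical element names (Contact, Name, Email, Phone). With the default minOccurs and maxOccurs of 1, a choice admits exactly one of its alternatives.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Contact">
    <xs:complexType>
      <!-- sequence: Name must come first, then the choice -->
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <!-- choice: exactly one of Email or Phone -->
        <xs:choice>
          <xs:element name="Email" type="xs:string"/>
          <xs:element name="Phone" type="xs:string"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Setting minOccurs="0" on the choice would make the Email/Phone part optional, and maxOccurs="unbounded" would allow any mix of the two in any order.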
"All" Model Group
The following schema specifies 3 elements and mixed content
<xs:element name="BookCover">
<xs:complexType mixed="true">
<xs:all minOccurs="0" maxOccurs="1">
<xs:element name="BookTitle" type="xs:string"/>
<xs:element name="Author" type="xs:string"/>
<xs:element name="Publisher" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>

The following XML file is valid in the above schema


<BookCover xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="AllModelGroup.xsd">
Title: <BookTitle>The Holy Grail</BookTitle>
Published: <Publisher>Moose</Publisher>
Author: <Author>Monty Python</Author>
</BookCover>
Some guidelines for
Schema design
Designing a Schema
Analogous to database schema design: look for intuitive names

Can start with an E-R diagram, and then convert
- Attributes to Attributes
- Subobjects to Subelements
- Relationships to IDREFS

Normalization? It still makes sense to avoid repetition whenever possible:
- If you have an Enrolment document, only list IDs of students, not their
  names.
- Store names in a separate document
- Leave it to tools to connect them
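The normalization advice can be sketched with two small hypothetical documents. All element names, attribute names, and ID values below are invented for illustration. The enrolment document carries only student IDs:

```xml
<!-- enrolment.xml (hypothetical): references students by ID only -->
<Enrolment course="XML101">
  <Student ref="S1001"/>
  <Student ref="S1002"/>
</Enrolment>
```

The names live once, in a separate document:

```xml
<!-- students.xml (hypothetical): the single place where names are stored -->
<Students>
  <Student id="S1001"><Name>Placeholder Name One</Name></Student>
  <Student id="S1002"><Name>Placeholder Name Two</Name></Student>
</Students>
```

Because the two files are separate documents, ID/IDREF and key/keyref cannot check the link between them. Joining them is left to application code or tooling.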
Designing a Schema (cont.)
Difficulties:
- Many more degrees of freedom than with database schemas:
- e.g. one can associate information with something by including it as an
  attribute or a subelement.

<ADDRESS NAME="Martin Sheen" Street="1222 Alameda Drive"
City="Carmel" State="CA" ZIP="40145"/>

<ADDRESS>
<NAME> Martin Sheen </NAME>
...
<ZIP> 40145 </ZIP>
</ADDRESS>

- ELEMENTS are more extensible: use them when there is a possibility that
  more substructure will be added.
- ATTRIBUTES are easier to search on.
“Rules” for Designing a Schema
Never leave structure out. The following is definitely a bad
idea:
<ADDRESS> Martin Sheen 1222 Alameda Drive,
Carmel, CA 40145 </ADDRESS>

Better would be:
<ADDRESS firstName="Martin" lastName="Sheen"
streetNum="1222" streetName="Alameda Drive"
city="Carmel" state="CA" zip="40145" />
Or:
<ADDRESS>
<name>
<first>Martin</first><last>Sheen</last>
</name>
<street>
<num>1222</num><name>Alameda Drive</name>
</street>
<city>Carmel</city>
<state>CA</state><zip>40145</zip>
</ADDRESS>
More “Rules” for Designing a Schema
When to use Elements (instead of attributes)
- Do not put large text blocks inside an attribute
(Bad Idea) <book type="memoir" content="Bravely bold Sir
Robin rode forth from Camelot.
He was not afraid to die, O brave Sir Robin.
He was not at all afraid to be killed in nasty ways,
Brave, brave, brave, brave Sir Robin!

He was not in the least bit scared to be mashed into a pulp,
Or to have his eyes gouged out and his elbows broken,
To have his kneecaps split and his body burned away
And his limbs all hacked and mangled, brave Sir Robin!

His head smashed in and his heart cut out
And his liver removed and his bowels unplugged…"/>
- Elements are more flexible, so use an Element if you think
  you might have to add more substructure later on.
More “Rules” for Designing Schemas

More on when to use Elements (instead of Attributes)
- Use an embedded element when the information you are recording is a
  constituent part of the parent element
  - one's head and one's height are both inherent to a human being; you
    can't be a conventionally structured human being without having a
    head and having a height
  - one's head is a constituent part and one's height isn't: you can cut
    off my head, but not my height
- Use embedded elements for complex structure validation (obvious)
- Use embedded elements when you need to show order (attributes are not
  ordered)
More “Rules” for Designing Schemas

When to use Attributes instead of Elements
- Use an attribute when the information is inherent to the parent but
  not a constituent part (height instead of head)
- Use attributes to stress the one-to-one relationship among pieces of
  information
  - to stress that the element represents a tuple of information
  - dangerous rule, though
    - Leads to the extreme formulation that a <chapter> element can
      have a TITLE= attribute
    - And then to the conclusion that it really ought to have a
      CONTENT= attribute too
    - Then you find yourself writing the entire document as an empty
      element with an attribute value as long as the Quest for the Holy
      Grail
- Use attributes for simple datatype validation (obviously)
