XML Guide: Informatica Powercenter (Version 9.1.0)
XML Guide
Informatica PowerCenter XML Guide
Version 9.1.0
March 2010
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form,
by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international
Patents and other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in
DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in
writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica On
Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and Informatica
Master Data Management are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company
and product names may be trade names or trademarks of their respective owners.
Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights
reserved. Copyright Sun Microsystems. All rights reserved. Copyright RSA Security Inc. All Rights Reserved. Copyright Ordinal Technology Corp. All rights
reserved. Copyright Aandacht c.v. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright 2007 Isomorphic Software. All rights reserved. Copyright Meta
Integration Technology, Inc. All rights reserved. Copyright Oracle. All rights reserved. Copyright Adobe Systems Incorporated. All rights reserved. Copyright DataArt,
Inc. All rights reserved. Copyright ComponentSource. All rights reserved. Copyright Microsoft Corporation. All rights reserved. Copyright Rogue Wave Software, Inc. All
rights reserved. Copyright Teradata Corporation. All rights reserved. Copyright Yahoo! Inc. All rights reserved. Copyright Glyph & Cog, LLC. All rights reserved.
Copyright Thinkmap, Inc. All rights reserved. Copyright Clearpace Software Limited. All rights reserved. Copyright Information Builders, Inc. All rights reserved.
Copyright OSS Nokalva, Inc. All rights reserved. Copyright Edifecs, Inc. All rights reserved.
This product includes software developed by the Apache Software Foundation (http://www.apache.org/), and other software which is licensed under the Apache License,
Version 2.0 (the "License"). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under the License.
This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright
1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be found at http://
www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, "as-is", without warranty of any kind, either express or implied, including but not
limited to the implied warranties of merchantability and fitness for a particular purpose.
The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine,
and Vanderbilt University, Copyright 1993-2006, all rights reserved.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of
this software is subject to terms available at http://www.openssl.org.
This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this
software are subject to terms available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without
fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
The product includes software copyright 2001-2005 MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available
at http://www.dom4j.org/license.html.
The product includes software copyright 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://svn.dojotoolkit.org/dojo/trunk/LICENSE.
This product includes ICU software which is copyright International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this
software are subject to terms available at http://source.icu-project.org/repos/icu/icu/trunk/license.html.
This product includes software copyright 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at
http://www.gnu.org/software/kawa/Software-License.html.
This product includes OSSP UUID software which is Copyright 2002 Ralf S. Engelschall, Copyright 2002 The OSSP Project Copyright 2002 Cable & Wireless
Deutschland. Permissions and limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.
This product includes software developed by Boost (http://www.boost.org/) or under the Boost software license. Permissions and limitations regarding this software are subject
to terms available at http://www.boost.org/LICENSE_1_0.txt.
This product includes software copyright 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at
http://www.pcre.org/license.txt.
This product includes software copyright 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://www.eclipse.org/org/documents/epl-v10.php.
This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html, http://www.bosrup.com/web/overlib/?License, http://www.stlport.org/doc/
license.html, http://www.asm.ow2.org/license.html, http://www.cryptix.org/LICENSE.TXT, http://hsqldb.org/web/hsqlLicense.html, http://httpunit.sourceforge.net/doc/
license.html, http://jung.sourceforge.net/license.txt , http://www.gzip.org/zlib/zlib_license.html, http://www.openldap.org/software/release/license.html, http://www.libssh2.org,
http://slf4j.org/license.html, http://www.sente.ch/software/OpenSourceLicense.html, http://fusesource.com/downloads/license-agreements/fuse-message-broker-v-5-3-license-
agreement, http://antlr.org/license.html, http://aopalliance.sourceforge.net/, http://www.bouncycastle.org/licence.html, http://www.jgraph.com/jgraphdownload.html, http://
www.jgraph.com/jgraphdownload.html, http://www.jcraft.com/jsch/LICENSE.txt and http://jotm.objectweb.org/bsd_license.html.
This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php), the Common Development and Distribution
License (http://www.opensource.org/licenses/cddl1.php) the Common Public License (http://www.opensource.org/licenses/cpl1.0.php) and the BSD License (http://
www.opensource.org/licenses/bsd-license.php).
This product includes software copyright 2003-2006 Joe Walnes, 2006-2007 XStream Committers. All rights reserved. Permissions and limitations regarding this software
are subject to terms available at http://xstream.codehaus.org/license.html. This product includes software developed by the Indiana University Extreme! Lab. For further
information please visit http://www.extreme.indiana.edu/.
This Software is protected by U.S. Patent Numbers 5,794,246; 6,014,670; 6,016,501; 6,029,178; 6,032,158; 6,035,307; 6,044,374; 6,092,086; 6,208,990; 6,339,775;
6,640,226; 6,789,096; 6,820,077; 6,823,373; 6,850,947; 6,895,471; 7,117,215; 7,162,643; 7,254,590; 7,281,001; 7,421,458; 7,496,588; 7,523,121; 7,584,422; 7,720,842;
7,721,270; and 7,774,791, international Patents and other Patents Pending.
DISCLAIMER: Informatica Corporation provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.
NOTICES
This Informatica product (the Software) includes certain drivers (the DataDirect Drivers) from DataDirect Technologies, an operating company of Progress Software
Corporation (DataDirect) which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT INFORMED OF
THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT LIMITATION, BREACH
OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Preface, vi
  Informatica Resources, vi
    Informatica Customer Portal, vi
    Informatica Documentation, vi
    Informatica Web Site, vi
    Informatica How-To Library, vi
    Informatica Knowledge Base, vii
    Informatica Multimedia Knowledge Base, vii
    Informatica Global Customer Support, vii
Table of Contents i
Chapter 2: Using XML with PowerCenter, 20
  Using XML with PowerCenter Overview, 20
  Importing XML Metadata, 21
    Importing Metadata from an XML File, 21
    Importing Metadata from a DTD File, 22
    Importing Metadata from an XML Schema, 23
    Creating Metadata from Relational Definitions, 25
    Creating Metadata from Flat Files, 25
  Understanding XML Views, 26
    Creating Custom XML Views, 26
    Rules and Guidelines for XML Views, 26
  Understanding Hierarchical Relationships, 27
    Normalized Views, 27
    Denormalized Views, 28
  Understanding Entity Relationships, 29
    Rules and Guidelines for Entity Relationships, 30
    Type 1 Entity Relationship Example, 30
    Type II Entity Relationship Example, 32
    Using Substitution Groups in an XML Definition, 33
  Working with Circular References, 34
  Understanding View Rows, 35
    Using XPath Query Predicates, 36
    Rules and Guidelines for Using View Rows, 36
  Pivoting Columns, 37
    Using Multiple-Level Pivots, 38
  Limitations, 39
Chapter 4: Using the XML Editor, 51
  Using the XML Editor Overview, 51
    XML Navigator, 52
    XML Workspace, 53
    Columns Window, 53
  Creating and Editing Views, 53
    Creating an XML View, 54
    Adding Columns to a View, 54
    Deleting Columns from a View, 56
    Expanding a Complex Type, 56
    Importing anyType Elements, 56
    Applying Content to anyAttribute or ANY Elements, 57
    Using anySimpleType in the XML Editor, 57
    Adding a Pass-Through Port, 58
    Adding a FileName Column, 58
  Creating an XPath Query Predicate, 58
    Querying the Value of an Element or Attribute, 59
    Testing for Elements or Attributes, 60
    XPath Query Predicate Rules and Guidelines, 60
    Steps for Creating an XPath Query Predicate, 60
  Maintaining View Relationships, 62
    Creating a Relationship Between Views, 62
    Creating a Type Relationship, 62
    Re-Creating Entity Relationships, 63
  Viewing Schema Components, 63
    Updating a Namespace, 64
    Navigating to Components, 64
    Searching for Components, 65
    Viewing a Simple or Complex Type Hierarchy, 65
    Viewing XML Metadata, 66
    Validating XML Definitions, 66
  Setting XML View Options, 66
    Generating All Hierarchy Foreign Keys, 67
    Generating Rows in Circular Relationships, 68
    Generating Hierarchy Relationship Rows, 69
    Setting the Force Row Option, 70
    Generating Rows for Views in Type Relationships, 71
  Troubleshooting Working with the XML Editor, 73
Appendix A: XML Datatype Reference, 101
  XML and Transformation Datatypes, 101
    Unsupported Datatypes, 103
    XML Date Format, 103
Index, 120
Preface
The XML Guide is written for developers and software engineers responsible for working with XML in a data
warehouse environment. Before you use the XML Guide, ensure that you have a solid understanding of XML
concepts and of the operating systems, flat files, or mainframe systems in your environment. Also, ensure that you are
familiar with the interface requirements for your supporting applications.
Informatica Resources
Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have
questions, comments, or ideas about this documentation, contact the Informatica Documentation team through
email at infa_documentation@informatica.com. We will use your feedback to improve our documentation. Let us
know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your product,
navigate to Product Documentation from http://mysupport.informatica.com.
Informatica Knowledge Base
As an Informatica customer, you can access the Informatica Knowledge Base at http://mysupport.informatica.com.
Use the Knowledge Base to search for documented solutions to known technical issues about Informatica
products. You can also find answers to frequently asked questions, technical white papers, and technical tips. If
you have questions, comments, or ideas about the Knowledge Base, contact the Informatica Knowledge Base
team through email at KB_Feedback@informatica.com.
Use the following telephone numbers to contact Informatica Global Customer Support:
North America / South America Europe / Middle East / Africa Asia / Australia
Standard Rate
France: 0805 804632
Germany: 01805 702702
Netherlands: 030 6022 797
CHAPTER 1
XML Concepts
This chapter includes the following topics:
XML Files, 2
DTD Files, 5
Cardinality, 9
Component Groups, 17
XML Path, 19
Code Pages, 19
You can import XML definitions into PowerCenter from the following file types:
XML file. An XML file contains data and metadata. An XML file can reference a Document Type Definition file
(DTD) or an XML schema definition (XSD) for validation.
DTD file. A DTD file defines the element types, attributes, and entities in an XML file. A DTD file provides some
constraints on the XML file structure but a DTD file does not contain any data.
XML schema. An XML schema defines elements, attributes, and type definitions. Schemas contain simple and
complex types. A simple type is an XML element or attribute that contains text. A complex type is an XML
element that contains other elements and attributes.
Schemas support element, attribute, and substitution groups that you can reference throughout a schema. Use
substitution groups to substitute one element with another in an XML instance document. Schemas also
support inheritance for elements, complex types, and element and attribute groups.
XML Files
XML files contain tags that identify data in the XML file, but not the format of the data. The basic component of an
XML file is an element. An XML element includes an element start tag, element content, and element end tag. Every
XML file must have a single root element, which is defined by a start tag at the top of the file and a matching end tag at
the bottom. The root element encloses all the other elements in the file.
An XML file models a hierarchical database. The position of an element in an XML hierarchy represents its
relationships to other elements. An element can contain child elements, and elements can inherit characteristics
from other elements.
Book is the root element and it contains the title and chapter elements. Book is the parent element of title and
chapter, and chapter is the parent of heading. Title and chapter are sibling elements because they have the same
parent.
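The relationships described above can be sketched in Python with the standard xml.etree.ElementTree module. The sample document is an assumption reconstructed from the description (book, title, chapter, heading), and the element content is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document that matches the description above.
BOOK_XML = """
<book>
  <title>Fun with XML</title>
  <chapter>
    <heading>Understanding XML</heading>
  </chapter>
</book>
"""

root = ET.fromstring(BOOK_XML)

# Map each child element to its parent so that relationships can be looked up.
parent_of = {child: parent for parent in root.iter() for child in parent}


def siblings(a, b):
    """Return True when two elements share the same parent."""
    return a in parent_of and b in parent_of and parent_of[a] is parent_of[b]


title = root.find("title")
chapter = root.find("chapter")
heading = chapter.find("heading")

print("root element:", root.tag)
print("parent of heading:", parent_of[heading].tag)
print("title and chapter are siblings:", siblings(title, chapter))
```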
An element can have attributes that provide additional information about the element. In the following example, the
attribute graphic_type describes the content of picture:
<picture graphic_type="gif">computer.gif</picture>
1. Root element.
2. Element data.
3. Enclosure element.
4. Element tags.
5. Element data.
6. Attribute value.
7. Attribute tag.
An XML file has a hierarchical structure. An XML hierarchy includes the following elements:
Enclosure element. An element that contains other elements but does not contain data. An enclosure element
can include other enclosure elements.
Global element. An element that is a direct child of the root element. You can reference global elements
throughout an XML schema.
Leaf element. An element that does not contain other elements. A leaf element is the lowest level element in
the XML hierarchy.
Local element. An element that is nested in another element. You can reference local elements only within the
context of the parent element.
Multiple-occurring element. An element that occurs more than once within its parent element. Enclosure
elements can be multiple-occurring elements.
Parent chain. The succession of child-parent elements that traces the path from an element to the root.
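As a rough illustration of these terms, the following Python sketch classifies the elements of a small document. The sample XML and the helper function names are hypothetical, and the enclosure test is a simplification that only checks the text directly inside the element.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample; the element names and values are illustrative only.
STORE_XML = """
<store>
  <address>
    <city>Austin</city>
  </address>
  <address>
    <city>Boston</city>
  </address>
  <name>Main</name>
</store>
"""

root = ET.fromstring(STORE_XML)
parent_of = {child: parent for parent in root.iter() for child in parent}


def is_leaf(elem):
    """A leaf element does not contain other elements."""
    return len(elem) == 0


def is_enclosure(elem):
    """An enclosure element contains other elements but no data."""
    return len(elem) > 0 and not (elem.text or "").strip()


def multiple_occurring_tags(parent):
    """Names of child elements that occur more than once within the parent."""
    counts = Counter(child.tag for child in parent)
    return sorted(tag for tag, n in counts.items() if n > 1)


def parent_chain(elem):
    """Element names on the path from an element up to the root."""
    chain = [elem.tag]
    while elem in parent_of:
        elem = parent_of[elem]
        chain.append(elem.tag)
    return chain


city = root.find("address/city")
print(parent_chain(city))
print(multiple_occurring_tags(root))
```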
The following figure shows some elements in an XML hierarchy:
To reference the location and name of a DTD file, use the DOCTYPE declaration in an XML file. The DOCTYPE
declaration also names the root element for the XML file.
For example, the following XML file references the location of the note.dtd file:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM
"http://www.w3schools.com/dtd/note.dtd">
<note>
<body>XML Data</body>
</note>
To reference a schema, use the schemaLocation declaration. The schemaLocation contains the location and
name of a schema.
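In an instance document, the schema reference is normally written as the schemaLocation attribute from the XML Schema instance namespace, and its value is a whitespace-separated list of namespace URI and location pairs. The following Python sketch reads that attribute. The namespace URI http://example.com/note and the file name note.xsd are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes such as schemaLocation.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document; the namespace URI and file name are made up.
NOTE_XML = """
<note xmlns="http://example.com/note"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://example.com/note note.xsd">
  <body>XML Data</body>
</note>
"""


def schema_locations(xml_text):
    """Return {namespace URI: schema location} read from xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}schemaLocation" % XSI, "")
    tokens = value.split()
    # Tokens alternate between a namespace URI and the location of its schema.
    return dict(zip(tokens[0::2], tokens[1::2]))


print(schema_locations(NOTE_XML))
```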
Unicode Encoding
An XML file contains an encoding attribute that indicates the code page in the file. The most common encodings
are UTF-8 and UTF-16. UTF-8 represents a character with one to four bytes, depending on the Unicode symbol.
UTF-16 represents a character with one or two 16-bit words.
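The difference in storage size can be checked with a short Python sketch. The sample characters are chosen for illustration only. Characters outside the Basic Multilingual Plane, such as the last sample, take two 16-bit words in UTF-16.

```python
# Sample characters chosen for illustration:
# Latin letter A, e with acute accent, euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def encoded_lengths(ch):
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used because plain "utf-16" prepends a 2-byte byte order mark.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for ch in samples:
    utf8_len, utf16_len = encoded_lengths(ch)
    print("U+%04X: %d byte(s) in UTF-8, %d byte(s) in UTF-16" % (ord(ch), utf8_len, utf16_len))
```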
DTD Files
A Document Type Definition (DTD) file defines the element types and attributes in an XML file. A DTD file also
provides some constraints on the XML file structure. A DTD file does not contain any data or element datatypes.
1. Element
2. Attribute
3. Element list
4. Element occurrence
5. Attribute value option
6. Attribute name
DTD Elements
In the DTD file, an element declaration defines an XML element. An element declaration has the following syntax:
<!ELEMENT product (#PCDATA)>
The DTD description defines the XML tag <product>. The description (#PCDATA) specifies parsed character data.
Parsed data is the text between the start tag and the end tag of an XML element. Parsed character data is text
without child elements.
The following example shows a DTD description of an element with two child elements:
<!ELEMENT boat (brand, type) >
<!ELEMENT brand (#PCDATA) >
<!ELEMENT type (#PCDATA) >
Brand and type are child elements of boat. Each child element can contain characters. In this example, brand and
type can occur once inside the element boat. The following DTD description specifies that brand must occur one or
more times for a boat:
<!ELEMENT boat (brand+) >
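One way to see what these element declarations allow is to translate the content model into a regular expression over the names of the child elements. The Python sketch below does this by hand, because the standard xml.etree.ElementTree module does not validate against a DTD. The function names are hypothetical, and the sketch handles only element names, sequences, choices, and the ?, *, and + symbols, not #PCDATA or mixed content.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate a DTD element content model into a regular expression."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[()|,*+?]", model):
        if token == ",":
            continue  # a sequence is plain concatenation in a regular expression
        if token in ("(", ")", "|", "*", "+", "?"):
            parts.append(token)  # these symbols mean the same in both notations
        else:
            # Each element name is matched together with a trailing space.
            parts.append("(?:%s )" % re.escape(token))
    return re.compile("".join(parts))


def children_match(model, xml_text):
    """Return True when the children of the root element fit the content model."""
    root = ET.fromstring(xml_text)
    child_names = "".join(child.tag + " " for child in root)
    return content_model_to_regex(model).fullmatch(child_names) is not None


print(children_match("(brand, type)", "<boat><brand>A</brand><type>B</type></boat>"))
print(children_match("(brand+)", "<boat><brand>A</brand><brand>B</brand></boat>"))
print(children_match("(brand+)", "<boat/>"))
```

The first two calls describe documents that fit their content models. The third call fails because brand+ requires at least one brand element.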
DTD Attributes
Attributes provide additional information about elements. In an XML file, an attribute occurs inside the start tag of
an element. In a DTD file, an ATTLIST declaration defines the attributes for an element.
Attribute_type. The kind of attribute. The most common attribute type is CDATA. A CDATA attribute is
character data.
Default_value. The value of the attribute if no attribute value occurs in the XML file.
- #FIXED. The XML file must contain the default value from the DTD file. A valid XML file can contain the same
attribute value as the DTD, or the XML file can have no attribute value. You must specify a default value with
this option.
The following example shows an attribute with a fixed value:
<!ATTLIST product product_name CDATA #FIXED "vacuum">
The element name is product. The attribute is product_name. The attribute has a default value, vacuum.
1. Element name.
2. Attribute
3. Attribute type and null construction
4. Element datatype
5. Element data
6. Element list and occurrence
7. Element list and datatype
RELATED TOPICS:
Simple and Complex XML Types on page 11
Namespace. A collection of elements and attribute names identified by a Uniform Resource Identifier (URI)
reference in an XML file. Namespace differentiates between elements that come from different sources.
Name. A tag that contains the name of an element or attribute.
Datatype. A classification of a data element, such as numeric, string, Boolean, or time. XML supports custom
datatypes and inheritance.
Namespace
A namespace contains a URI to identify schema location. A URI is a string of characters that identifies an internet
resource. A URI is an abstraction of a URL. A URL locates a resource, but a URI identifies a resource. A DTD or
schema file does not have to exist at the URI location.
An XML namespace identifies groups of elements. A namespace can identify elements and attributes from
different XML files or distinguish meanings between elements. For example, you can distinguish meanings for the
element table by declaring different namespaces, such as math:table and furniture:table. XML is case sensitive.
The namespace Math:table is different from the namespace math:table.
You can declare a namespace at the root level of an XML file, or you can declare a namespace inside any element
in an XML structure. When you declare multiple namespaces in the same XML file, you use a namespace prefix to
associate an element with a namespace. A namespace declaration appears in the XML file as an attribute that
starts with xmlns. Declare the namespace prefix with the xmlns attribute. You can create a prefix name of any
length.
One namespace has math elements, and the other namespace has furniture elements. Each namespace has an
element called table, but the elements contain different types of data. The namespace prefix distinguishes
between the math table and the furniture table.
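The following Python sketch shows how a parser keeps two elements named table apart when they belong to different namespaces. The document, the namespace URIs under example.com, and the element content are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the namespace URIs and element content are made up.
DOC = """
<example xmlns:math="http://example.com/math"
         xmlns:furniture="http://example.com/furniture">
  <math:table>4 x 6</math:table>
  <furniture:table>oak</furniture:table>
</example>
"""

# Prefix-to-URI map used only for the lookups below.
NS = {
    "math": "http://example.com/math",
    "furniture": "http://example.com/furniture",
}

root = ET.fromstring(DOC)
math_table = root.find("math:table", NS)
furniture_table = root.find("furniture:table", NS)

# ElementTree stores the namespace URI, not the prefix, in the element name.
print(math_table.tag, "->", math_table.text)
print(furniture_table.tag, "->", furniture_table.text)
```

Because the parser expands each prefix to its URI, the two table elements have different names even though their local names are the same.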
xmlns:xs="http://www.w3.org/2001/XMLSchema"
    Namespace that contains the native XML schema and datatypes. In this example, each schema component has the prefix of xs.
elementFormDefault="qualified"
    Specifies that any element in the schema must have a namespace in the XML file.
Name
In an XML file, each tag is the name of an element or attribute. In a DTD file, the tag <!ELEMENT> specifies the
name of an element, and the tag <!ATTLIST> indicates the set of attributes for an element. In a schema file,
<element name> specifies the name of an element and <attribute name> specifies the name of an attribute.
When you import an XML definition, the element tags become column names in the PowerCenter definition, by
default.
Hierarchy
An XML file models a hierarchical database. The position of an element in an XML hierarchy represents its
relationship to other elements. For example, an element can contain child elements, and elements can inherit
characteristics from other elements.
Cardinality
Element cardinality in a DTD or schema file is the number of times an element occurs in an XML file. Element
cardinality affects how you structure groups in an XML definition. Absolute cardinality and relative cardinality of
elements affect the structure of an XML definition.
Absolute Cardinality
The absolute cardinality of an element is the number of times an element occurs within its parent element in an
XML hierarchy. DTD and XML schema files describe the absolute cardinality of elements within the hierarchy. A
DTD file uses symbols, and an XML schema file uses the minOccurs and maxOccurs attributes to describe
the absolute cardinality of an element.
For example, an element has an absolute cardinality of once (1) if the element occurs once within its parent
element. However, the element might occur many times within an XML hierarchy if the parent element has a
cardinality of one or more (+).
The absolute cardinality of an element determines its null constraint. An element that has an absolute cardinality
of one or more (+) cannot have null values, but an element with a cardinality of zero or more (*) can have null
values. An attribute marked as fixed or required in an XML schema or DTD file cannot have null values, but an
implied attribute can have null values.
The following table describes how DTD and XML schema files represent cardinality:
Note: You can declare a maximum number of occurrences or unlimited occurrences in a schema.
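The correspondence between the two notations can be sketched with a Store element. The schema below is an illustrative assumption, not the sample file from this guide, and the child element names are hypothetical. Each comment gives the DTD symbol that matches the minOccurs and maxOccurs combination on the next line.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Store">
    <xs:complexType>
      <xs:sequence>
        <!-- Once (1). DTD: Name. minOccurs and maxOccurs both default to 1. -->
        <xs:element name="Name" type="xs:string"/>
        <!-- Zero or one. DTD: Phone? -->
        <xs:element name="Phone" type="xs:string" minOccurs="0" maxOccurs="1"/>
        <!-- Zero or more. DTD: Product* -->
        <xs:element name="Product" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
        <!-- One or more. DTD: Address+ -->
        <xs:element name="Address" type="xs:string" minOccurs="1" maxOccurs="unbounded"/>
        <!-- Bounded range, schema only. A DTD has no symbol for one to three occurrences. -->
        <xs:element name="Manager" type="xs:string" minOccurs="1" maxOccurs="3"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A DTD would express the first four declarations as a content model such as (Name, Phone?, Product*, Address+).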
The following figure shows the absolute cardinality of elements in a sample XML file:
1. Element Address occurs more than once within Store. Its absolute cardinality is one or more (+).
2. Element City occurs once within its parent element Address. Its absolute cardinality is once (1).
3. Element Sales occurs zero or more times within its parent element Product. Its absolute cardinality is zero or more (*).
Relative Cardinality
Relative cardinality is the relationship of an element to another element in the XML hierarchy. An element can
have a one-to-one, one-to-many, or many-to-many relationship to another element in the hierarchy.
An element has a one-to-many relationship with another element if every occurrence of one element can have
multiple occurrences of another element. For example, an employee element can have multiple email addresses.
Employee and email address have a one-to-many relationship.
An element has a many-to-many relationship with another element if an XML file can have multiple occurrences of
both elements. For example, an employee might have multiple email addresses and multiple street addresses.
Email address and street address have a many-to-many relationship.
The following figure shows the relative cardinality between elements in a sample XML file:
1. One-to-many relationship. For every occurrence of SNAME, there can be many occurrences of ADDRESS and, therefore, many
occurrences of CITY.
2. Many-to-many relationship. For every occurrence of STATE, there can be multiple occurrences of YTDSALES. For every occurrence of
YTDSALES, there can be many occurrences of STATE.
3. One-to-one relationship. For every occurrence of PNAME, there is one occurrence of PPRICE.
For more information about XML datatypes, see the W3C specifications for XML datatypes at
http://www.w3.org/TR/xmlschema-2.
Simple Types
A simple datatype is an XML element or attribute that contains text. A simple type is indivisible. Simple types
cannot have attributes, but attributes are simple types.
Atomic Types
An atomic datatype is a basic datatype such as a Boolean, string, integer, decimal, or date. To define custom
atomic datatypes, add restrictions to an atomic datatype to limit the content. Use a facet to define which values to
restrict or allow.
A facet is an expression that defines minimum or maximum values, specific values, or a data pattern of valid
values. For example, a pattern facet restricts an element to an expression of data values. An enumeration facet
lists the legal values for an element.
The following example contains a pattern facet that restricts an element to a lowercase letter between a and z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
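An enumeration facet follows the same structure as the pattern facet. The sketch below assumes a hypothetical element named size with three legal values. The element name and the values are illustrative and do not come from this guide.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: only the three listed strings are valid content. -->
  <xs:element name="size">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="small"/>
        <xs:enumeration value="medium"/>
        <xs:enumeration value="large"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```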
Lists
A list is an array collection of atomic types, such as a list of strings that represent names. The list itemType
defines the datatype of the list components.
An XML file might contain the following data in the names list:
<names>Joe Bob Harry Atlee Will</names>
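A schema declaration that accepts this data might look like the following sketch. The type name nameList is a hypothetical name chosen for illustration. The itemType attribute sets the datatype of each list component to string.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical type name. Each whitespace-separated token is one list item. -->
  <xs:simpleType name="nameList">
    <xs:list itemType="xs:string"/>
  </xs:simpleType>
  <xs:element name="names" type="nameList"/>
</xs:schema>
```

Because whitespace separates list items, the names element above contains five items.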
The following figure shows a schema file containing a shoesize union that contains sizenames and sizenums lists:
The union defines sizenames and sizenums as union member types. Sizenames defines a list of string values.
Sizenums defines a list of decimal values.
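The union described above can be sketched as follows. The figure from the guide is not reproduced here, so treat this fragment as an assumption about its general shape. It is built only from the names sizenames, sizenums, and shoesize given in the text.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- List of string values, for example: small medium large -->
  <xs:simpleType name="sizenames">
    <xs:list itemType="xs:string"/>
  </xs:simpleType>
  <!-- List of decimal values, for example: 7 8.5 10 -->
  <xs:simpleType name="sizenums">
    <xs:list itemType="xs:decimal"/>
  </xs:simpleType>
  <!-- The union accepts a value that is valid for either member type. -->
  <xs:simpleType name="shoesize">
    <xs:union memberTypes="sizenames sizenums"/>
  </xs:simpleType>
  <xs:element name="shoesize" type="shoesize"/>
</xs:schema>
```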
Complex Types
A complex type aggregates a collection of simple types into a logical unit. For example, a customer type might
include the customer number, name, street address, town, city, and zip code. A complex type can also reference
other complex types or element and attribute groups.
XML supports complex type inheritance. When you define a complex type, you can create other complex types
that inherit the components of the base type. In a type relationship, the base type is the complex type from which
you derive another type. A derived complex type inherits elements from the base type.
An extended complex type is a derived type that inherits elements from a base type and includes additional
elements. For example, a customer_purchases type might inherit its definition from the customer complex type,
but the customer_purchases type adds item, cost, and date_sold elements.
The following figure shows derived complex types that restrict and extend the base complex type:
In the above figure, the base type is PublicationType. BookType extends PublicationType and includes the ISBN
and Publisher elements.
Abstract Elements
Sometimes a schema contains a base type that defines the basic structure of a complex element but does not
contain all the components. Derived complex types extend the base type with more components. Since the base
type is not a complete definition, you might not want to use the base type in an XML file. You can declare the base
type element to be abstract. An abstract element is not valid in an XML file. Only the derived elements are valid.
To define an abstract element, add an abstract attribute with the value true. The default is false.
For example, PublicationType is an abstract element. BookType inherits the elements in PublicationType, but also
includes ISBN and Publisher elements. Since PublicationType is abstract, a PublicationType element is not valid
in the XML file. An XML file can contain the derived type, BookType.
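One way to express this in a schema is to mark the base complex type as abstract, as in the sketch below. The fragment is simplified for illustration: it reuses the PublicationType and BookType names from the text, assumes Title and Date as the base elements, and assumes string datatypes.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The base type is abstract, so it cannot be used directly in an XML file. -->
  <xs:complexType name="PublicationType" abstract="true">
    <xs:sequence>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Date" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <!-- The derived type adds ISBN and Publisher and is valid in an XML file. -->
  <xs:complexType name="BookType">
    <xs:complexContent>
      <xs:extension base="PublicationType">
        <xs:sequence>
          <xs:element name="ISBN" type="xs:string"/>
          <xs:element name="Publisher" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="Publication" type="PublicationType"/>
</xs:schema>
```

With this schema, a Publication element in an XML file must name the derived type, for example with an xsi:type attribute set to BookType. A Publication element that relies on the abstract base type alone does not validate.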
Use the following elements and attributes to allow any type of data:
anyType element. Allows an element to be any datatype in the associated XML file.
anySimpleType element. Allows an element to be any simpleType in the associated XML file.
ANY content element. Allows an element to be any element already defined in the schema.
anyAttribute attribute. Allows an element to have any attribute already defined in the schema.
anyType Elements
An anyType element can be any datatype in an XML instance document. Declare an element to be anyType when
the element contains different types of data.
The following schema describes a person with a first name, last name, and an age element that is anyType:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:anyType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The following XML instance document includes a date type and a number in the age element:
<person>
<firstname>Danny</firstname>
<lastname>Russell</lastname>
<age>1959-03-03</age>
</person>
<person>
<firstname>Carla</firstname>
<lastname>Havers</lastname>
<age>46</age>
</person>
Both types are valid for the schema. If you do not declare a datatype for an element in a schema, the element
defaults to anyType when you import the schema in the Designer.
The following schema describes a person with a first name, last name, and other element that is anySimpleType:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="other" type="xs:anySimpleType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
The following XML instance document substitutes the anySimpleType element with a string datatype:
<person>
<firstname>Kathy</firstname>
<lastname>Russell</lastname>
<other>Cissy</other>
</person>
The following XML instance document substitutes the anySimpleType element with a numeric datatype:
<person>
<firstname>Kathy</firstname>
<lastname>Russell</lastname>
<other>34</other>
</person>
When you specify ANY content, you use the keyword ANY instead of an element name and element type.
The following schema describes a person with a first name, last name, and an element that is ANY content:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:any minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="son" type="xs:string"/>
<xs:element name="daughter" type="xs:string"/>
The schema includes a son element and a daughter element. You can substitute the ANY element for the son or
daughter element in the XML instance document:
<person>
<firstname>Danny</firstname>
<lastname>Russell</lastname>
<son>Atlee</son>
</person>
<person>
<firstname>Christine</firstname>
<lastname>Slade</lastname>
<daughter>Susie</daughter>
</person>
The following schema describes a person with a first name, last name, and an attribute that is anyAttribute:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
<xs:anyAttribute/>
</xs:complexType>
</xs:element>
The following XML instance document substitutes anyAttribute with the gender attribute:
<person gender="female">
<firstname>Anita</firstname>
<lastname>Ficks</lastname>
</person>
<person gender="male">
<firstname>Jim</firstname>
<lastname>Geimer</lastname>
</person>
Component Groups
You can create the following groups of components in an XML schema:
Element and attribute group. Group of elements or attributes that you can reference throughout a schema.
Substitution group. Group of elements that you can substitute with other elements from the same group.
The following example shows the schema syntax for an element group:
<xs:group name="Songs">
<xs:sequence>
<xs:element name="songTitle" type="xs:string" />
<xs:element name="artist" type="xs:string" />
<xs:element name="publisher" type="xs:string" />
</xs:sequence>
</xs:group>
The following example shows the schema syntax for an attribute group:
<xs:attributeGroup name="Songs">
<xs:attribute name="songTitle" type="xs:string" />
<xs:attribute name="artist" type="xs:string" />
<xs:attribute name="publisher" type="xs:string" />
</xs:attributeGroup>
The following element groups provide constraints on XML data:
Sequence group. All elements in an XML file must occur in the order that the schema lists them. For example,
OrderHeader requires the customerName first, then orderNumber, and then orderDate:
<xs:group name="OrderHeader">
<xs:sequence>
<xs:element name="customerName" type="xs:string" />
<xs:element name="orderNumber" type="xs:integer" />
<xs:element name="orderDate" type="xs:date" />
</xs:sequence>
</xs:group>
Choice group. One element in the group can occur in an XML file. For example, the CustomerInfo group lists a
choice of elements for the XML file:
<xs:group name="CustomerInfo">
<xs:choice>
<xs:element name="customerName" type="xs:string" />
<xs:element name="customerID" type="xs:integer" />
<xs:element name="customerNumber" type="xs:integer" />
</xs:choice>
</xs:group>
All group. All elements must occur in the XML file or none at all. The elements can occur in any order. For
example, CustomerInfo requires all or none of the three elements:
<xs:group name="CustomerInfo">
<xs:all>
<xs:element name="customerName" type="xs:string" />
<xs:element name="customerAddress" type="xs:string" />
<xs:element name="customerPhone" type="xs:string" />
</xs:all>
</xs:group>
Substitution Groups
Use substitution groups to replace one element with another in an XML file. For example, if you have addresses
from Canada and the United States, you can create an address type for Canada and another type for the United
States. You can create a substitution group that accepts either type of address.
The following schema fragment shows an Address base type and the derived types CAN_Address and
USA_Address:
<xs:complexType name="Address">
<xs:sequence>
<xs:element name="Name" type="xs:string" />
<xs:element name="Street" type="xs:string"
minOccurs="1" maxOccurs="3" />
<xs:element name="City" type="xs:string" />
</xs:sequence>
</xs:complexType>
<xs:element name="MailAddress" type="Address" />
<xs:complexType name="CAN_Address">
<xs:complexContent>
<xs:extension base="Address">
<xs:sequence>
<xs:element name="Province" type="xs:string" />
<xs:element name="PostalCode" type="CAN_PostalCode"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="USA_Address">
<xs:complexContent>
<xs:extension base="Address">
<xs:sequence>
<xs:element name="State" type="USPS_StateCode" />
<xs:element name="ZIP" type="USPS_ZIP"/>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
CAN_Address includes Province and PostalCode, and USA_Address includes State and ZIP. The MailAddress
substitution group includes both address types.
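The fragment above declares the types but not the group membership. In a schema, an element joins a substitution group through the substitutionGroup attribute, as in the sketch below. The member element names CAN_MailAddress and USA_MailAddress are hypothetical. The first line repeats the MailAddress declaration from the fragment so that the head of the group is visible.

```xml
<!-- Head of the substitution group, as declared in the schema fragment. -->
<xs:element name="MailAddress" type="Address"/>
<!-- Hypothetical member elements. Each can appear wherever MailAddress is allowed. -->
<xs:element name="CAN_MailAddress" type="CAN_Address" substitutionGroup="MailAddress"/>
<xs:element name="USA_MailAddress" type="USA_Address" substitutionGroup="MailAddress"/>
```

A member element must have the same type as the head element or a type derived from it. CAN_Address and USA_Address both extend Address, so both declarations meet that rule.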
RELATED TOPICS:
Using Substitution Groups in an XML Definition on page 33
XML Path
XML Path Language (XPath) is a language that describes a way to locate items in an XML file. XPath uses an addressing
syntax based on the route through the hierarchy from the root to an element or attribute. An XML path can contain
long schema component names.
XPath uses a slash (/) to distinguish between elements in the hierarchy. XML attributes are preceded by @ in the
XPath.
You can create a query on an element or attribute XPath to filter XML data.
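The following sketch pairs a small XML file with XPath expressions that address its elements and attributes. The instance data is hypothetical, and the expressions use standard XPath syntax. The predicate syntax that the XML Editor accepts may differ in detail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance data used only to illustrate the paths below. -->
<Store>
  <Name>Downtown</Name>
  <Product ID="101">
    <PName>Lamp</PName>
    <Price>25.00</Price>
  </Product>
  <Product ID="102">
    <PName>Desk</PName>
    <Price>140.00</Price>
  </Product>
</Store>
<!--
  /Store/Name                  selects the Name element under the root Store.
  /Store/Product/PName         selects both PName elements: Lamp and Desk.
  /Store/Product/@ID           selects the ID attributes: 101 and 102.
  /Store/Product[Price > 100]  selects only the Product with a Price above 100: Desk.
-->
```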
RELATED TOPICS:
Using XPath Query Predicates on page 36
Code Pages
XML files contain an encoding declaration that indicates the code page used in the file. The most common code
pages in XML are UTF-8 and UTF-16. All XML parsers support these code pages. For information on the XML
character encoding specification, see the W3C website at http://www.w3c.org.
PowerCenter supports the same set of code pages for XML files that it supports for relational databases and other
flat files. PowerCenter does not support a user-defined code page.
When you create an XML source or target definition, the Designer assigns the PowerCenter Client code page to
the definition. If you import an XML schema that contains a code page assignment, the XML Wizard displays the
code page from the schema. However, the XML Wizard does not apply that code page to the XML definition you
create in the repository.
You cannot configure the code page for an XML source definition. The Integration Service converts XML source
files to Unicode when it parses them.
You can configure the code page for a target XML definition in the Designer. You can also change the code page
for an XML target instance in session properties.
CHAPTER 2
An XML definition can contain multiple groups. In an XML definition, groups are called views. The relationship
between elements in the XML hierarchy defines the relationship between the views. When you create an XML
definition, the Designer creates views for multiple-occurring elements and complex types in a schema by default.
The relative cardinality of elements in an XML hierarchy affects how PowerCenter creates views in an XML
definition. Relative cardinality determines if elements can be part of the same view.
The Designer defines relationships between the views in an XML definition by keys. Source definitions do not
require keys, but target views must have them. Each view has a primary key that is an XML element or a
generated key.
When you create an XML definition, you can create a hierarchical model or an entity relationship model of the XML
data. When you create a hierarchical model, you create a normalized or denormalized hierarchy. A normalized
hierarchy contains separate views for multiple-occurring elements. A denormalized hierarchy has one view with
duplicate data for multiple-occurring elements.
If you create an entity model, the Designer creates views for complex types and multiple-occurring elements. The
Designer creates an XML definition that models the inheritance and circular relationships the schema provides.
Importing XML Metadata
When you import an XML definition, the Designer creates a schema in the repository for the definition. The
repository schema provides the structure from which you edit and validate the XML definition.
XML files
DTD files
Flat files
The following figure shows a sample XML file. The root element is Employees. Employee is a multiple occurring
element. The Employee element contains the LastName, FirstName, and Address. The Employee element also
contains the multiple-occurring elements: Phone and Email.
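An XML file with this structure might look like the following sketch. The element names follow the description above. The data values and the content of Address are hypothetical and stand in for the figure from the guide.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical data. Only the element names come from the description. -->
<Employees>
  <Employee>
    <LastName>Slade</LastName>
    <FirstName>Christine</FirstName>
    <Address>100 Main Street</Address>
    <!-- Phone and Email are multiple-occurring within Employee. -->
    <Phone>555-0100</Phone>
    <Phone>555-0101</Phone>
    <Email>cslade@example.com</Email>
    <Email>christine@example.com</Email>
  </Employee>
  <Employee>
    <LastName>Geimer</LastName>
    <FirstName>Jim</FirstName>
    <Address>22 Lake Road</Address>
    <Phone>555-0102</Phone>
    <Email>jgeimer@example.com</Email>
  </Employee>
</Employees>
```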
When you import an XML file, you do not need all of the XML data to create an XML definition. You need enough
data to accurately show the hierarchy of the XML file.
The Designer can create an XML definition from an XML file that references a DTD file or XML schema. If an XML
file has a reference to a DTD or an XML schema on another node, the node that hosts the PowerCenter Client
must have access to the node where the schema resides so the Designer can read the schema. The XML file
contains a uniform resource identifier (URI), which is the address of the DTD or XML schema.
When you import a DTD file, you can change the datatypes for the elements in the XML definition. You can
change the null constraint, but you cannot change element cardinality.
If you import an XML file with an associated DTD, the Designer creates a definition based on the DTD structure.
The XML file, ProductInfo.xml, uses the Product element from the associated DTD, StoreInfo.dtd. Product includes
the multiple-occurring Sales element:
The Designer creates the following source definition. The ProductInfo definition contains the Product and Sales
groups. The XML file determines what elements to include in the definition. The DTD file determines the structure
of the XML definition:
Each simple type definition in an XML schema is a restriction of another simple type definition in the schema.
Atomic datatypes, such as Boolean, string, or integer, restrict the anySimpleType datatype. When you define a
simple datatype in an XML schema, you derive a new datatype from an existing datatype. For example, you can
derive a restricted integer type that holds only numbers from 1 to 20. The base type is integer.
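The restricted integer type described above can be sketched as follows. The type name OneToTwenty and the element name quantity are hypothetical names chosen for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Derived simple type: the base type is integer, limited to 1 through 20. -->
  <xs:simpleType name="OneToTwenty">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="20"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="quantity" type="OneToTwenty"/>
</xs:schema>
```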
When you derive a complex datatype from another datatype, you create a new datatype that contains the elements
of the base type. You can add new elements to the derived type or create restrictions on the inherited elements.
The Designer creates views for derived types without duplicating the columns that represent inherited
components. This reduces metadata and decreases the size of the XML definition in the repository.
The following figure shows a schema with simple and complex derived types:
The MailAddress element is an Address type. A derived type, CAN_Address, inherits the Name, City, and Street
from the Address type, and extends Address by adding a Province and PostalCode. PostalCode is a simple type
called CAN_PostalCode.
When you import an XML schema, every simple type or attribute in a complex type can become a column in an
XML definition. Complex types become views.
The following figure shows an XML definition from the schema if you import the schema with the default options:
The CAN_Address view contains the elements that are unique for its type. The root element is MailAddress. The
Address type contains Name, Street, and City. The CAN_Address view has a foreign key to Address. CAN_Address
includes Province and PostalCode.
The CAN_Address view does not contain the Name, Street, and City that it inherits from Address.
The following figure shows a sample XML target definition from the relational definitions, Orders and Order_Items.
The root is XRoot. XRoot encloses Orders and Order_Items. Order_Items has a foreign key that points to Orders.
The following figure shows a sample XML source definition from flat files orders and products. Products and
Orders have a foreign key to the root view:
Each view in a target definition must be related to at least one other view. Therefore, each view needs at least
one key to establish its relationship with another view. If you do not designate the keys, the Designer generates
primary and foreign keys in the target views. You can define primary and foreign keys for views if you create the
views and relationships in the XML Editor instead of allowing the Designer to create them for you.
When the Designer creates a primary or foreign key column, it assigns a column name with a prefix. In an XML
definition, the prefixes are XPK_ for a generated primary key column and XFK_ for a generated foreign key
column. The Designer uses the prefix FK_ for a foreign key that points to a primary key.
For example, when the Designer creates a primary key column for the Sales group, the Designer names the
column XPK_Sales. When the Designer creates a foreign key column connecting a sales group to another group,
it names the column XFK_Sales. You can rename any column name that the Designer creates.
If a mapping contains an XML source, the Integration Service creates the values for the generated primary key
columns in the source definition when you run the session. You can configure start values for the generated keys.
The elements in the views and the relationship between views are dependent on the schema the Designer creates
in the repository when you import the definition. The XML Editor validates XML definitions using the rules for valid
views.
A view can be related to several other views, and a view can have multiple foreign keys.
- The target root view requires a primary key, but the target root does not require a foreign key.
- A target leaf view requires a foreign key, but the target leaf view does not require a primary key.
An enclosure element cannot be a key.
A foreign key always refers to a primary key in another group. You cannot use self-referencing keys.
A generated foreign key column always refers to a generated primary key column.
- Elements that have a one-to-many relationship can be part of the same normalized or denormalized view.
- Elements that have a many-to-many relationship cannot be part of the same view.
Normalized views. An XML definition with normalized views reduces redundancy by separating multiple-
occurring data into separate views. The views are related by primary and foreign keys.
Denormalized views. An XML definition with a denormalized view has all the elements of the hierarchy that
are not unique to derived complex types in the view. A source or target definition can contain one denormalized
view.
Normalized Views
When the Designer generates a normalized view, it establishes the root element and the multiple-occurring
elements that become views in an XML definition.
The following figure shows a DTD file and the elements that become views in a normalized XML definition:
Store is the root element. Address, Product, Employee, and Sales are multiple-occurring elements.
The definition has normalized views. The root view is Store. The Address, Product, and Sales views have foreign
keys to Store. The Sales view has a foreign key to the Product view.
The following figure shows a data preview for each view in the source definition:
Denormalized Views
When the Designer generates a denormalized view, it creates one view and puts all elements of the hierarchy into
the view. All the elements in a denormalized view belong to the same parent chain. Denormalized views, like
denormalized tables, generate duplicate data.
The Designer can generate denormalized views for XML definitions that contain more than one multiple-occurring
element if the multiple-occurring elements have a one-to-many relationship and are all part of the same parent
chain.
Product and Sales are multiple-occurring elements. Because the multiple-occurring elements have a one-to-many
relationship, the Designer can create a single denormalized view that includes all elements.
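A DTD with this shape might look like the sketch below. It is a hypothetical reconstruction for illustration, not the ProdAndSales.dtd file from the guide. Only Product and Sales are named in the text. The root element and the remaining child elements are assumptions.

```xml
<!-- Hypothetical DTD: Product and Sales are multiple-occurring and share one parent chain. -->
<!ELEMENT ProdAndSales (Product+)>
<!ELEMENT Product (PName, PPrice, Sales*)>
<!ELEMENT PName (#PCDATA)>
<!ELEMENT PPrice (#PCDATA)>
<!ELEMENT Sales (Region, YTDSales)>
<!ELEMENT Region (#PCDATA)>
<!ELEMENT YTDSales (#PCDATA)>
```

Under these assumptions, a single denormalized view would hold PName, PPrice, Region, and YTDSales as columns, and the product values would repeat in each row that carries a different Sales occurrence.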
The following figure shows the denormalized view for ProdAndSales.dtd in a source definition:
The Designer creates a single view for all the elements in the ProdAndSales hierarchy. Because a DTD file does
not define datatypes, the Designer assigns a datatype of string to all columns. The denormalized view does not
need a primary or foreign key.
The following figure shows a data preview for the denormalized view:
When you work with XML schemas, you can reference parts of the schema rather than repeat the same
information in schema components. A component can inherit the elements and attributes of another component
and restrict or extend the elements from the component. For example, you might use a complex type as a base for
If you create views manually or re-create entity relationships in the XML Editor, you choose how you want to
structure the metadata. When you create an XML definition based on an XML schema that uses inheritance, you
can generate separate views for the base type and derived type. You might create inheritance relationships if you
plan to map the XML data to normalized relational tables.
An XML Type I inheritance relationship is a relationship between two views. Each view root is a global complex
type. One view is derived from the other.
You can create an inheritance relationship between a column and a view. This is an XML Type II inheritance
relationship.
An entity represents a portion of an XML, DTD, or XML schema hierarchy. This hierarchy does not need to start
at the root of the XML file.
The Designer uses entities defined in a DTD file to create entity relationships.
The Designer uses type structures defined in an XML schema to generate entity relationships.
The Designer creates a new entity when it encounters a multiple-occurring element under a parent element.
The Designer generates a separate view for each member of a substitution group.
The Designer generates primary keys and foreign keys to relate separate entities.
The following schema contains a PublicationType, BookType, and MagazineType. PublicationType is the base
type. A publication includes Title, Author, and Date. BookType and MagazineType are derived types that extend
the PublicationType. Book has ISBN and Publisher, and Magazine has Volume and Edition.
<xsd:complexType name="PublicationType">
<xsd:sequence>
<xsd:element name="Title" type="xsd:string"/>
<xsd:element name="Author" type="xsd:string" maxOccurs="unbounded"/>
<xsd:element name="Date" type="xsd:string"/>
</xsd:sequence>
</xsd:complexType>
<xsd:element name="Publication" type="PublicationType"/>
<xsd:complexType name="BookType">
<xsd:complexContent>
<xsd:extension base="PublicationType">
<xsd:sequence>
<xsd:element name="ISBN" type="xsd:string"/>
<xsd:element name="Publisher" type="xsd:string"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
<xsd:complexType name="MagazineType">
<xsd:complexContent>
<xsd:extension base="PublicationType">
<xsd:sequence>
<xsd:element name="Volume" type="xsd:string"/>
<xsd:element name="Edition" type="xsd:string"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>
When you create XML views as entities in an XML definition, the Title and Date metadata from PublicationType do
not repeat in BookType or MagazineType. Instead, these views contain the metadata that distinguishes them from
the PublicationType. They have foreign keys that link them to PublicationType.
This example reduces metadata explosion because none of the elements in the base type repeat in the
derived types.
The following figure shows the default views the Designer generates from the schema:
The following figure shows an XML file that has a publication, a magazine, and books:
PublicationType view. Contains the title and date for each publication.
BookType view. Contains the ISBN and publisher. BookType contains a foreign key to PublicationType.
MagazineType view. Contains volume and edition. MagazineType also contains a foreign key to the
PublicationType.
Author view. Contains authors for all the publications. The Designer generates a separate view for Author
because Author is a multiple-occurring element. Each publication can contain multiple authors.
For example, the following schema defines a complex type called EmployeeType. EmployeeType contains
EmployeeNumber and EmployeeName elements.
EmpStatusType includes an element called Employee that extends EmployeeType. Employee includes an
EmployeeStatus element.
<xs:element name="Employee_Payroll">
<xs:complexType>
<xs:sequence>
<xs:element name="EmployeeStatus" type="EmpStatusType" maxOccurs="unbounded"></xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:complexType name="EmpStatusType">
<xs:sequence>
<xs:element name="Employee" minOccurs="0" maxOccurs="1">
<xs:complexType>
<xs:complexContent>
<xs:extension base="EmployeeType">
<xs:sequence>
<xs:element name="EmployeeStatus" type="xs:string">
</xs:element>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:complexType name="EmployeeType">
<xs:sequence>
When you import the schema, the Designer creates views for Employee_Payroll, EmployeeType, and
EmployeeStatus. The EmployeeStatus view contains a column called Employee. Employee is derived from
EmployeeType.
The following figure shows the XML views the Designer creates from the schema:
The following figure shows a sample portion of an XML schema containing substitution groups:
The following figure shows a circular reference in the XML Editor workspace:
You might use the Part XML definition to read the following XML file in a session:
<Part>
<ID>1</ID>
<Name>Big Part</Name>
<Type>L</Type>
<Part>
<ID>1.A</ID>
In the XML file, Part 1 contains Part 1.A, and Part 1.A contains Part 1.A.B.
The following figure shows the data and the keys that a session might generate from the XML source:
Note: You cannot run a session that contains a circular XML reference if the session is enabled for constraint-
based loading. The session rejects all rows.
The Integration Service uses a view row to determine when to read and write data for an XML view. You can set a
view row at any single-occurring or multiple-occurring element. Once you set the view row, every element you add to the
view has a one-to-one correspondence with the view row.
For example, the Employees view contains elements Employee, Name, Firstname, and Lastname. When you set
the view row to Employee, the Integration Service extracts data using the following algorithm:
For every (Employees/Employee)
extract ./Name/Firstname/Lastname
Employee, Address, and Email are multiple-occurring elements. You can create a view that contains the following
elements:
EMPLOYEE
ADDRESS
NAME
If you set the view row as Address, the Integration Service extracts a Name for every Employee/Address in the
XML data. You cannot add Email to this view because you would create a many-to-many relationship between
Address and Email.
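The effect of this view row can be sketched with a small data sample. The XML data below is hypothetical, and the rows listed in the comment illustrate the behavior described above. They are not output captured from the Integration Service.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical source data: element contents are illustrative only. -->
<Employees>
  <Employee>
    <Name>Kathy Russell</Name>
    <Address>100 Main Street</Address>
    <Address>22 Lake Road</Address>
    <Email>kathy@example.com</Email>
    <Email>krussell@example.com</Email>
  </Employee>
</Employees>
<!--
  View row = Employees/Employee/Address, with columns Name and Address:
    row 1: Kathy Russell | 100 Main Street
    row 2: Kathy Russell | 22 Lake Road
  Email is not tied to a particular Address, so adding it to this view would
  require pairing every Address with every Email, a many-to-many relationship.
-->
```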
For example, you can add one instance of Email as a pivoted column to the Employee view. The view would
contain the following elements:
EMPLOYEE
ADDRESS
NAME
EMAIL[1]
The view might have the view row, EMPLOYEE/ADDRESS/EMAIL[1]. The Integration Service extracts data for the first
instance of Employee/Address/Email.
To create a query in an XML view, you create an XPath query predicate in the XML Editor. XPath is a language
that describes a way to locate items in an XML document. XPath uses an addressing syntax based on the path
through the XML hierarchy from a root component. You can create an XPath query predicate for elements in the
view row or elements and attributes that have an XPath that includes the view row.
An XPath query predicate includes an element or attribute to extract, and the query predicate that determines the
criteria. You can verify the value of an element or attribute, or you can verify that an element or attribute exists in
the source XML data.
Every view must have a view row, which must be an element or complex type.
The view root is the top-level element in a view. The view root is the parent to all the other elements in the view.
The view row can be the same as the view root unless the view is denormalized.
Two views can have the same view row in an XML source or XML Parser transformation.
The view row element must be the lowest multiple-occurring element in the view. A view cannot contain many-
to-many relationships.
If you add a multiple-occurring element to a view with no other multiple-occurring element, you change the view
row to the new element by default. If the view already has a multiple-occurring element, you cannot add
another multiple-occurring element.
You do not need to specify a view row when you create an empty view. However, as soon as you add a column
to the view, the Designer creates the view row. This is true even if you add just the primary key.
You can change a view row at a later time, but you cannot change a view root unless there are no schema
components in the view.
You can specify a view row that consists of a pivoted element, such as:
Product/Order[2]/Customer
An effective view row for a view is the path of view rows from the top of a hierarchy relationship down to the
view row in the view. A view can have multiple effective view rows because the view can have multiple
hierarchy relationships in the XML definition.
You can specify options in the XML Editor that affect how view rows and effective view rows affect data output.
If you have this type of element in an XML source, use pivoting to treat occurrences of elements as separate
columns in a group. To pivot occurrences of an element in an XML view, create a column for each occurrence you
want to represent in the definition. In the monthly sales example, if you want to represent all 12 occurrences as
columns, create 12 sales columns in the view. If you want to represent the sales of one quarter, create three
columns. When you run a session, the Integration Service ignores any XML data for the occurrences that you do
not include in the definition.
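As a rough Python illustration of pivoting, the sketch below maps the first three occurrences of a repeating element to separate columns and ignores the rest. The sample data, the column names SALES_1 to SALES_3, and the choice to return None for a missing occurrence are assumptions, not documented PowerCenter behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical data with four occurrences of SALES.
SAMPLE = """
<PRODUCT>
  <NAME>Widget</NAME>
  <SALES>100</SALES>
  <SALES>120</SALES>
  <SALES>90</SALES>
  <SALES>150</SALES>
</PRODUCT>
"""

def pivot_occurrences(parent, tag, count):
    """Map the first `count` occurrences of `tag` to columns tag_1..tag_count."""
    occurrences = parent.findall(tag)
    row = {}
    for i in range(count):
        # Assumption: an occurrence that is absent from the data becomes None.
        value = occurrences[i].text if i < len(occurrences) else None
        row[f"{tag}_{i + 1}"] = value
    return row

product = ET.fromstring(SAMPLE)
quarter = pivot_occurrences(product, "SALES", 3)   # fourth occurrence is ignored
```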
You can pivot columns when you add or edit a view in the XML source definition.
You can pivot simple types and complex types. You cannot pivot a primary key column. When you pivot columns
in a view, the resulting group structure must follow the rules for a valid normalized or denormalized view. The
Designer displays warnings and errors if the pivoted column invalidates a view.
Pivoting affects an element only in the view where you pivot the element. When you pivot an element in a view, you do
not change the same element in another view.
The following example shows the ADDRESS element of the StoreInfo XML file pivoted into two sets of address
columns:
The first occurrence of Address pivots to home address columns with the prefix HOM_. The second occurrence of
Address pivots to office address columns with the prefix OFC_. XPath shows the two sets of columns that come from
the same elements.
The first and second occurrences of Address appear as columns in the group:
The XPath STORE/PRODUCT[2]/ORDER[1]/ORDERNAME refers to the order name for the first order for the second product
in the store. The XPath STORE/PRODUCT[2]/ORDER/CUSTOMER[1] refers to the first customer for all orders of the second
product.
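The two XPath expressions can be tried outside PowerCenter with the standard library xml.etree.ElementTree module, which supports positional predicates. The sample data is hypothetical, and the paths are written relative to the STORE root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical store data: two products, the second with two orders.
SAMPLE = """
<STORE>
  <PRODUCT>
    <ORDER><ORDERNAME>A1</ORDERNAME><CUSTOMER>Ann</CUSTOMER></ORDER>
  </PRODUCT>
  <PRODUCT>
    <ORDER><ORDERNAME>B1</ORDERNAME><CUSTOMER>Bob</CUSTOMER><CUSTOMER>Cy</CUSTOMER></ORDER>
    <ORDER><ORDERNAME>B2</ORDERNAME><CUSTOMER>Dee</CUSTOMER></ORDER>
  </PRODUCT>
</STORE>
"""

store = ET.fromstring(SAMPLE)

# STORE/PRODUCT[2]/ORDER[1]/ORDERNAME: first order of the second product.
first_order_name = store.findtext("PRODUCT[2]/ORDER[1]/ORDERNAME")

# STORE/PRODUCT[2]/ORDER/CUSTOMER[1]: first customer of every order
# of the second product.
first_customers = [c.text for c in store.findall("PRODUCT[2]/ORDER/CUSTOMER[1]")]
```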
If you pivot a view row, any column in the XML view that occurs below the view row must have an XPath that
matches the XPath of the view row.
The following columns have the same occurrence of Trade in the XPath:
Transaction/Trade[1]/Date
Transaction/Trade[1]/Price
Transaction/Trade[1]/Person[1]/Firstname
You cannot create a column with the following XPath in the view:
Transaction/Trade[2]/Date
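The rule can be expressed as a simple prefix check on path steps. The following Python sketch is an illustration only; the function name matches_view_row is hypothetical and the check is not the Designer's actual validation logic.

```python
def matches_view_row(column_xpath, view_row_xpath):
    """Return True if the column XPath starts with the steps of the view row XPath."""
    row_steps = view_row_xpath.split("/")
    column_steps = column_xpath.split("/")
    # Compare whole steps so that Trade[1] and Trade[2] are treated as different.
    return column_steps[:len(row_steps)] == row_steps

VIEW_ROW = "Transaction/Trade[1]"
valid = matches_view_row("Transaction/Trade[1]/Person[1]/Firstname", VIEW_ROW)
invalid = matches_view_row("Transaction/Trade[2]/Date", VIEW_ROW)
```

Here valid is True and invalid is False, which mirrors the allowed and disallowed columns listed above.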
Concatenated columns. A column cannot be a concatenation of two elements. For example, you cannot
create a column FULLNAME that refers to a concatenation of two elements FIRSTNAME and LASTNAME.
Composite keys. A key cannot be a concatenation of two elements. For example, you cannot create a key
CUSTOMERID that refers to a concatenation of two elements LASTNAME and PHONENUMBER.
Parsed lists. PowerCenter stores a list type as one string that contains all array elements. PowerCenter does
not parse the respective simple types from the string.
CHAPTER 3
XML files
DTD files
Relational definitions
When you create XML definitions, you import files with the XML Wizard and organize metadata into XML views.
XML views are groups of columns containing the elements and attributes in the XML file. The wizard can generate
views for you, or you can create custom views.
You can create relationships between views in the XML Wizard. You can create hierarchy relationships or entity
relationships.
You can synchronize an XML definition against an XML schema file if the structure of the schema changes.
Importing an XML Source Definition
When you import a source definition from an XML schema or DTD file, the Designer can provide an accurate
definition of the data based on the description provided in the DTD or XML schema file. When you import a source
definition based on an XML file without an associated DTD or XML schema, the XML Wizard determines the types
and occurrences of the data based on data represented in the XML file. When you create the XML definition, you
can get unexpected results. For example, the Designer might define an inaccurate scale attribute for string
columns. If you export the XML source definition and import the definition with the inaccurate scale attributes,
errors occur.
After you create an XML source definition, you cannot change the source definition to any other source type.
Conversely, you cannot change other types of source definition to XML definitions.
The XML Wizard uses keys to relate the XML views and reconstruct the XML hierarchy. You can choose to
generate views and primary keys, or you can create views and specify keys. When you create custom views, you
can select roots and choose how to handle metadata expansion.
The XML Wizard saves the XML hierarchy and the view information as an XML schema in the repository. When
you import an XML definition, the ability to change the cardinality and datatype of the elements in the hierarchy
depends on the type of file you are importing. For example, DTD and XML files do not store datatype information.
When you import these files to create an XML definition, you can configure datatype, precision, and scale in the
Designer. If you import an XML schema, you can change the precision and scale.
You cannot create an XML source definition from an XML file of exported repository objects.
When you import a source definition, the Designer applies a default code page to the XML definition in the
repository. The code page is based on the PowerCenter Client code page. You cannot change the code page for an XML
source definition, but you can change the code page for an XML target definition after you create it.
Override all infinite lengths. You can specify a default length for components with undefined lengths, such as
strings. If you do not set a default length, the precision for these components is set to infinite. Infinite
precision can cause DTM buffer size errors when you run a session with large files.
Analyze elements/attributes in standalone XML as global declarations. Choose this option to create global
declarations of standalone XML elements or attributes. You can reuse global elements by referencing them in other
parts of the schema. When you clear this option, the standalone XML is a local declaration.
Create an XML view for an enclosure element. You can create a separate view for an enclosure element if the
element can occur more than once and the child elements can occur more than once. An enclosure element is an
element that has no text content or attributes but has child elements.
Pivot elements into columns. You can pivot leaf elements if they have an occurrence limit. You can pivot
elements in source definitions only.
Ignore fixed element and attribute values. You can ignore fixed values in a schema and allow other element
values in the data.
Ignore prohibited attributes. You can declare an attribute as prohibited in an XML schema. Prohibited attributes
restrict complex types. When you import the schema or file, you can choose to ignore the prohibited attributes.
Generate names for XML columns. You can choose to name XML columns with a sequence of numbers or with the
element or attribute name from the schema. If you use names, choose from the following options:
- When the XML column refers to an attribute, prefix it with the element name. PowerCenter uses the following
format for the name of the XML column: NameOfElement_NameOfAttribute
- Prefix the XML view name for every XML column. PowerCenter uses the following format for the name of the XML
column: NameOfView_NameOfElement
- Prefix the XML view name for every foreign-key column. PowerCenter uses the following format for the name of a
generated foreign key column: FK_NameOfView_NameOfParentView_NameOfPKColumn
Maximum length for a column name is 80 characters. PowerCenter truncates column names longer than 80 characters.
If a column name is not unique, PowerCenter adds a numeric suffix to keep the name unique.
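The naming formats and the 80-character limit can be sketched in Python as follows. The function names are hypothetical, and the exact way PowerCenter chooses the numeric suffix is not documented here, so the suffix scheme (1, 2, 3, ... kept within the 80-character limit) is an assumption.

```python
MAX_NAME_LENGTH = 80  # maximum column name length stated above

def attribute_column_name(element, attribute):
    # Format: NameOfElement_NameOfAttribute
    return f"{element}_{attribute}"

def view_prefixed_name(view, element):
    # Format: NameOfView_NameOfElement
    return f"{view}_{element}"

def foreign_key_name(view, parent_view, pk_column):
    # Format: FK_NameOfView_NameOfParentView_NameOfPKColumn
    return f"FK_{view}_{parent_view}_{pk_column}"

def make_unique(name, used):
    """Truncate to 80 characters and add a numeric suffix on a collision.

    Assumption: the suffix counts up from 1 and replaces trailing
    characters when the name is already at the maximum length.
    """
    base = name[:MAX_NAME_LENGTH]
    candidate = base
    n = 0
    while candidate in used:
        n += 1
        suffix = str(n)
        candidate = base[:MAX_NAME_LENGTH - len(suffix)] + suffix
    used.add(candidate)
    return candidate
```

For example, calling make_unique twice with the same 100-character name yields two distinct 80-character names under these assumptions.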
The XML Wizard provides options for creating views in the definition, or you can create the views manually in the
XML Editor.
Generate entity relationships. If you create entity relationships, the XML Wizard generates views for multiple-
occurring or referenced elements and complex types.
Generate hierarchy relationships. When you create hierarchical relationships, each reference to a
component expands under its parent element. You can generate normalized or denormalized XML views in a
hierarchy relationship.
- Normalized XML views. When you generate a normalized XML view, elements and attributes appear once.
Multiple-occurring elements, or elements in one-to-many relationships appear in different views related by
keys.
- Denormalized XML views. When you generate a denormalized XML view, all elements and attributes appear
in one view. The Designer does not model many-to-many relationships between elements and attributes in an
XML definition.
Create custom XML views. You can specify any global element as a root when creating a custom XML view.
You can choose to reduce metadata explosion for elements, complex types, and inherited complex types.
Synchronize XML definitions. You can update one or more XML definitions when their underlying schemas
change.
Skip Create XML views. When you choose to skip creating the XML views, you can define them later in the
XML Editor. When you define views in the XML Editor you can define the views to match targets and simplify
the mapping.
Create XML views for the elements and attributes in the XML file. When you import an XML file with an
associated schema, you can create XML views for just the elements and attributes in the XML file, instead of all
the components in the schema.
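The difference between the normalized and denormalized options in the list above can be pictured with a small Python sketch. The sample data, the key column names PK_PRODUCT and FK_PRODUCT, and the sequential key values are assumptions for illustration and do not reflect the exact rows the Designer or Integration Service produce.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical data: PRODUCT and ORDER are in a one-to-many relationship.
SAMPLE = """
<STORE>
  <PRODUCT><NAME>Pen</NAME><ORDER>o1</ORDER><ORDER>o2</ORDER></PRODUCT>
  <PRODUCT><NAME>Ink</NAME><ORDER>o3</ORDER></PRODUCT>
</STORE>
"""

def normalized_rows(store):
    """Two views related by keys; each element value appears once."""
    keys = count(1)
    products, orders = [], []
    for product in store.findall("PRODUCT"):
        pk = next(keys)
        products.append({"PK_PRODUCT": pk, "NAME": product.findtext("NAME")})
        for order in product.findall("ORDER"):
            orders.append({"FK_PRODUCT": pk, "ORDER": order.text})
    return products, orders

def denormalized_rows(store):
    """One view; the parent value repeats for every child occurrence."""
    rows = []
    for product in store.findall("PRODUCT"):
        for order in product.findall("ORDER"):
            rows.append({"NAME": product.findtext("NAME"), "ORDER": order.text})
    return rows

store = ET.fromstring(SAMPLE)
products, orders = normalized_rows(store)
flat = denormalized_rows(store)
```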
If you choose to generate entity or hierarchy relationships, the Designer chooses a default root and creates the
XML views. If the XML definition requires more than 400 views, a message appears that the definition is too large.
You can manually create views in the XML Editor. Import the XML definition and choose to create custom views or
skip generating XML views.
When you import a definition from an XML schema that has no global elements, the Designer cannot create a root
view in the XML definition. The Designer displays a message that there is no global element.
After you create an XML view, you cannot change the configuration options you set for the view. For example, if
you create a normalized XML view, you cannot change the view to denormalized. You must import a new XML
source definition and select the denormalized option.
To import part of a schema, import an XML file that references the schema. Select the option to create XML views
only for the elements and attributes in the XML file.
The Designer limits metadata to what is in the XML file. If you change the XML file and import the XML schema
again, the XML definition changes.
Generates views for multiple-occurring and referenced elements and complex types.
When the Designer generates entity relationships, the Designer generates different entities for complex types,
global elements, and multiple-occurring elements based on the relationships in the schema.
If you want to create groups other than the default groups, or if you want to combine elements from different
complex types, you can create custom XML views.
When you view an XML source definition in the XML Editor, you can see the relationships between each element
in the XML hierarchy. For each relationship between views, the XML Editor generates links based on the type of
relationship between the views.
You can specify how you want to generate metadata associated with the view. You can reduce metadata
explosion for elements, complex types, and inherited complex types by generating entity relationships. If you do
not reduce the metadata references, the Designer generates hierarchy relationships and expands all child
elements under their parent elements.
In this example, the Bookstore element is selected as root and the Book123 element is cleared as root element.
If you use references within an XML schema, you might want to reduce the number of times the Designer includes
the metadata associated with a reference. The XML Wizard provides the following options to reduce metadata
references:
Reduce element explosion. The Designer creates a view for any multiple-occurring element or any element
that is referenced by more than one other element. Each view can have multiple hierarchical relationships with
other views in the definition.
Reduce complex type explosion. The Designer creates an XML view for each referenced complex type or
multiple-occurring element. The XML view can have multiple type relationships with other views. If the schema
uses inherited complex types, you can also reduce explosion of inherited complex types.
Reduce complex type inheritance explosion. For any inherited type, the XML Wizard creates a type
relationship.
When you reduce metadata explosion, the Designer creates entity relationships between the XML views it
generates.
Flat files
URLs
XML files
DTDs
Schema files
When you synchronize an XML definition, the Designer updates the XML schema in the Schema Navigator, but it
does not update the views in the XML definition. You can manually update the view columns in the XML Editor
after you synchronize the XML definition.
Note: Verify that you synchronize the XML definition with the source that you used to create the definition. If you
synchronize an XML definition with a source that you did not use to create the definition, the Designer cannot
synchronize the definitions and loses metadata. Click Edit > Revert to Saved to restore the XML definition.
1. Right-click the top of the definition in the Source Analyzer workspace. Select Edit.
2. On the Table tab, edit the following settings:
Business Name. Descriptive name for the source definition. You can edit Business Name by clicking the Rename
button.
Description. Description of the source. Character limit is 2,000 bytes/K, where K is the maximum number of
bytes for each character in the repository code page. Enter links to business documentation.
Code Page. Read Only. Not applicable for XML source files. The Integration Service converts all XML source
files to Unicode.
XPath. Path of the element referenced by the current column in the XML hierarchy. XPath does not
display for generated primary or foreign keys.
Business Name. User-defined descriptive name for the column. If Business Name is not visible in the window,
scroll to the right to view or modify the column.
4. If you configure the session to read a file list, and you want to write the source file name to each target row,
click the Properties tab and select Add Currently Processed Flat File Name Port.
The Designer adds the CurrentlyProcessedFileName port to the Columns tab. It is the last column in the first
group. The Integration Service uses this port to return the source file name. The CurrentlyProcessedFileName
port is a string port with default precision of 256 characters.
To remove the CurrentlyProcessedFileName port, click the Properties tab and clear Add Currently Processed
Flat File Name Port.
5. Click the Metadata Extensions tab to create, edit, and delete user-defined metadata extensions.
6. Click OK.
7. Click Repository > Save to save changes to the repository.
When you create an XML target definition, the XML Wizard generates keys to relate each group to the root.
EMAIL and PHONE belong to the same parent element, but they do not belong to the same parent chain. You
cannot put them in the same denormalized view. To put all the elements of employee in one view, you can pivot
one of the multiple-occurring elements.
Follow these steps to add two multiple-occurring elements to the same view:
How can I match the EMPNO and SALARY in the same view?
The DTD example is ambiguous. The definition is equivalent to the following:
<!ELEMENT EMPLOYEE (EMPNO+, SALARY+)>
In the DTD example, EMPLOYEE has multiple-occurring elements EMPNO and SALARY. You cannot have two
multiple-occurring elements in the same view.
When EMPNO and SALARY are in different views, you can still combine the data in a mapping. Use two
instances of the same source definition and use a Joiner transformation.
When I import this XML file, the Designer drops the ISBN element. Why does this happen? How can I get
the Designer to include the ISBN element?
Use the schema to import the XML definition. When you use an XML file to import an XML definition, the
Designer reads the first element as simple content because the element has no child elements. The Designer
discards the ISBN child element from the second Book instance. If you use a schema to import the definition,
the Designer uses the schema definition to determine how to read XML data.
Verify the XML file accurately represents the associated schema. If you use an XML file to import a source
definition, verify the XML file is an accurate representation of the structure in the corresponding XML schema.
Use the XML Editor to create views, modify components, add columns, and maintain view relationships in the
workspace. When you update an XML definition, the Designer propagates the changes to any mapping that
includes the source. Some changes to XML definitions can invalidate mappings.
Note: If you make significant changes to the source you used to create an XML definition, you can synchronize
the definition to the new source rather than editing the definition manually.
Navigator
XML Workspace
Columns Window
The following figure shows the XML Editor:
1. Navigator
2. Columns Window
3. XML Workspace
The XML Editor uses icons to represent XML component types. To view a legend that describes the icons in the
XML Editor, click View > Legend.
XML Navigator
The Navigator displays the schema in a hierarchical form and provides information about selected components.
You can sort components in the Navigator by type, hierarchy, or namespace. You can also expand a component to
see components below the component in the hierarchy.
The Navigator toolbar provides shortcuts to most of the Navigator functions. The toolbar also provides navigation
arrows you can click to find previously displayed components in the hierarchy quickly.
Properties tab. Displays information about a selected component such as the component type, length, and
occurrence.
Actions tab. Provides a list of options to view more information about the selected component.
Properties Tab
The Properties tab displays information about a component you select in the Navigator. If the component is a
complex element, you can view element properties in the schema, such as namespace, type, and content. When
you view a simple element or attribute, the Properties tab shows the type and length of the element. The
Properties tab also displays annotations.
If you import the definition from an XML file, you can edit the datatype and cardinality from the Properties tab. If
you create the definition from a DTD file, you can edit the component type.
Actions Tab
The Actions tab lists options you use to see more information about a selected component. You can also reverse
changes you make to components from the Actions tab.
The following options appear on the Actions tab, depending on the properties of the component you select:
ComplexType references. Displays the schema components that are of this type.
ComplexType hierarchy. Displays the complex types derived from the selected component.
SimpleType reference. Displays all the components that are this type.
Propagate SimpleType values. Propagates the length and scale values to all the components of this
SimpleType.
Element references. Displays the components that reference the selected element.
Child components. Displays the global schema components that the selected component uses.
Revert simpleType. Changes the type, length, and precision values back to the original value if you have
changed them.
XML view references. Displays all the XML views and columns that reference the selected component.
XML Workspace
The XML workspace displays the XML views and the relationships between the views. You can create XML views
in the workspace and define relationships between views.
The XML workspace toolbar provides shortcuts to most of the functions that you can do in the workspace.
You can modify the size of the XML workspace in the following ways:
Reduce the workspace. Click the Zoom button on the Workspace toolbar.
Columns Window
The Columns window displays the columns for a view in the workspace. Use the Columns window to name
columns that you add. If you use pivoted columns, you use the Columns window to select and rename
occurrences of multiple-occurring elements. You can also specify options, such as Not Null, Force Row, Hierarchy
or Type Relationship Row, and Non-Recursive Row. These options affect how the Integration Service writes data
to XML targets.
Add a FileName column. Add a column to generate a new file name for each XML target file.
You can add a column to an XML view when the following conditions are true:
The component path starts from the element in the schema that is the view root for that view.
The component does not violate normalization or pivoting rules. For example, you cannot add more than one
multiple-occurring element to a view.
You can add mixed content elements as either simple or complex types.
If you add a pivoted column to a view, a default occurrence number appears in the Columns window. This number
indicates which occurrence of the element to use in the column. You can change the occurrence number or add
more occurrences of the element as new columns. If you do not rename the columns, the XML Editor adds a
sequence number to each pivoted column name.
Note: You cannot change a pivot value if the pivot value is part of a view row.
When you view a complex type in the XPath Navigator, you can view the derived types.
When you import a schema with an anyType element, the anyType element appears in the Schema Navigator as a
string datatype. The XML Wizard does not add the anyType element to a view.
You can drag the anyType element to a view. If you drag anyType to a view, the XML Wizard creates a string port.
You can change the anyType element to another global complex type in the schema.
When you define an element as anySimpleType, the Designer creates an anySimpleType column for the element
when you import the schema. When you use the column in a mapping, the XML Source Qualifier maps this type to
a string.
Once you generate the pass-through port, you add another port to pass the data through the transformation. This
port is the reference port. In an XML Parser transformation, the pass-through port passes the data into the
transformation and the reference port passes the data out of the transformation. In an XML Generator, the pass-
through port passes the data out of the transformation and the reference port passes data into the transformation.
If you have pass-through ports in an XML definition, you can determine the corresponding reference ports.
When you use the FileName column, you set up an Expression transformation or other transformation in the
mapping to generate the unique file names to pass to FileName column.
When you create an XPath query predicate in the XML Editor, the XML Editor provides elements, attributes,
operators, and functions to build the query. You can select the components, enter components, or copy
components into a query. The XML Editor validates each query that you create.
You can query the value of an element or attribute, or you can verify that an element or attribute exists.
The query expression is in brackets. The XPath of Dept is abbreviated by ./ to indicate that the path is continuing
from Employee.
The following XPath query predicate extracts employees if the last name is Smith:
EMPLOYEE[./NAME/LASTNAME='SMITH']
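The effect of a value test and an existence test can be approximated in Python with the standard library. The sample data is hypothetical, and EMPLOYEE[./DEPT] is an assumed example of an existence test. The filtering is done in plain Python because xml.etree.ElementTree supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: only the first employee has a DEPT element.
SAMPLE = """
<COMPANY>
  <EMPLOYEE><NAME><LASTNAME>SMITH</LASTNAME></NAME><DEPT>100</DEPT></EMPLOYEE>
  <EMPLOYEE><NAME><LASTNAME>JONES</LASTNAME></NAME></EMPLOYEE>
</COMPANY>
"""

company = ET.fromstring(SAMPLE)

# Value test, similar in intent to EMPLOYEE[./NAME/LASTNAME='SMITH'].
# findtext reads only the first matching LASTNAME under each employee.
smiths = [e for e in company.findall("EMPLOYEE")
          if e.findtext("NAME/LASTNAME") == "SMITH"]

# Existence test, similar in intent to EMPLOYEE[./DEPT].
with_dept = [e for e in company.findall("EMPLOYEE")
             if e.find("DEPT") is not None]
```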
Use Boolean or numeric operators in an XPath query predicate. You can also use string, numeric, and boolean
functions in a query.
For example, an XML file might contain a NAME element with mixed content:
<NAME>
Kathy
<MIDDLE> Mary </MIDDLE>
Russell
</NAME>
Element NAME has the value Kathy, a child element MIDDLE, and a second value Russell. The NAME
column value is KathyRussell. However, the Integration Service evaluates the NAME as Kathy.
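The two text values of a mixed content element can be inspected with the standard library xml.etree.ElementTree module, which stores the text before the first child in text and the text after a child in that child's tail. This sketch only shows where the two values live in the parsed document; it does not reproduce how the Integration Service evaluates the element.

```python
import xml.etree.ElementTree as ET

# The NAME element from the example above.
name = ET.fromstring("<NAME>\nKathy\n<MIDDLE> Mary </MIDDLE>\nRussell\n</NAME>")

# Text that precedes the first child element.
first_text = (name.text or "").strip()

# Text that follows the MIDDLE child element.
middle = name.find("MIDDLE")
second_text = (middle.tail or "").strip()

# Concatenation of the element's own text values, excluding MIDDLE.
column_value = first_text + second_text
```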
Boolean Operators
Use the following Boolean operators in an XPath query predicate:
and or < <= > >= = !=
Use the following XPath query predicate to extract employees in department 100 with the last name Jones:
EMPLOYEE [./DEPT = '100' and ./ENAME/LASTNAME = 'JONES']
Numeric Operators
Use the following numeric operators in an XPath query predicate:
+ - * div mod
Use the following XPath query predicate to extract products when the price is greater than cost plus tax:
PRODUCT[./PRICE > ./COST + ./TAX]
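A comparable filter in Python converts each element value to a number before comparing. The sample data and the helper name number are assumptions; the helper only roughly mirrors XPath numeric conversion by returning NaN when a value is missing or not numeric.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: cost plus tax is 11.50 for both products.
SAMPLE = """
<CATALOG>
  <PRODUCT><ID>1</ID><PRICE>12.00</PRICE><COST>10.00</COST><TAX>1.50</TAX></PRODUCT>
  <PRODUCT><ID>2</ID><PRICE>11.00</PRICE><COST>10.00</COST><TAX>1.50</TAX></PRODUCT>
</CATALOG>
"""

def number(element, path):
    """Convert the text at `path` to a float, or NaN if missing or invalid."""
    text = element.findtext(path)
    try:
        return float(text)
    except (TypeError, ValueError):
        return float("nan")

catalog = ET.fromstring(SAMPLE)

# Keep products whose price is greater than cost plus tax.
selected = [p.findtext("ID") for p in catalog.findall("PRODUCT")
            if number(p, "PRICE") > number(p, "COST") + number(p, "TAX")]
```

Any comparison that involves NaN is false, so a product with a missing value is not selected.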
String. Use string functions to test substring values, concatenate strings, or translate strings into other strings.
The following XPath query predicate determines if an employee's full name is equal to the concatenation of last
name and first name:
EMPLOYEE[./FULLNAME=concat(./ENAME/LASTNAME,./ENAME/FIRSTNAME)]
Numeric. Use numeric functions with element and attribute values. Numeric functions operate on numbers and
return integers. For example, the following XPath query predicate rounds discount and tests if the result is
greater than 15:
ORDER_ITEMS[round(./DISCOUNT) > 15]
Boolean. Boolean functions return either true or false. Use them to test elements, check the language
attribute, or force a true or false result. For example, a string is true if the string length is greater than zero:
boolean(string)
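The behavior of the three function types can be approximated in Python. The function names below are hypothetical stand-ins written for illustration, following the XPath 1.0 definitions of concat, round, and boolean. They are not PowerCenter functions, and the rounding helper does not handle NaN or infinite values.

```python
import math

def xpath_concat(*parts):
    """String function: join the arguments in order, like concat()."""
    return "".join(parts)

def xpath_round(x):
    """Numeric function: nearest integer, with ties rounded toward positive infinity."""
    return math.floor(x + 0.5)

def xpath_boolean_of_string(s):
    """Boolean function applied to a string: true when the length is non-zero."""
    return len(s) > 0

full_name = xpath_concat("JONES", "ANN")
rounded = xpath_round(14.5)
is_true = xpath_boolean_of_string("0")
```

Note that the built-in Python round function rounds 14.5 to 14, so it is not a drop-in match for the XPath behavior sketched here.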
When you run a session, the Integration Service extracts employee data from the XML source if the employee's
department has a department name. Otherwise, the Integration Service does not extract the employee data.
You can configure an XPath query predicate for any element in a view row.
For example, if a view row is Company/Dept, you can create the following XPath query predicate:
COMPANY[./DEPT=100]
You can match content.
You can add an XPath query predicate to a column if the column occurs below the view row in the view XML
hierarchy and the column XPath includes the view row.
For example, if the view row is Product/Toys[1], you can create the following XPath query predicate:
Product/Toys[1][./Sales > 100]
The following example shows an invalid XPath query predicate for the Product/Toys[1] view row:
Product/Toys[2][./Sales > 100]
Product/Toys[1] is the view row. You cannot use Product/Toys[2].
Use a single-occurring element or attribute. You cannot create an XPath query predicate on a multiple-
occurring element.
You cannot create an XPath query predicate on an enclosure element because an enclosure element contains
no values.
Operator Description
+ Add
- Subtract
* Multiply
div Divide
mod Modulus
or Boolean or
= Equal
!= Not equal
Create a relationship between views. Define relationships between views in the workspace.
Create a type relationship. Define a type relationship between a column in a view and a type view in the
workspace.
Re-create entity relationships. Generate views and relationships using the same options as in the XML
Wizard.
Refresh shared XML views. Save existing views but update them.
5. Click Next.
The Recreate Entity Relationships dialog box appears.
6. To display a child component, select a shared element or complex type and click the name.
7. To exclude a child component, clear the element in the Exclude Child Components pane.
To generate a new view, select the element or complex type. When you create the new entity relationships,
you generate a view with that element as a view root.
Update the namespace. Change the location of a schema or the default namespace in an XML target.
Navigate to components. Find components by navigating from a component to another component or area of
the XML Editor window.
Arrange views in the workspace. Arrange the views in the workspace hierarchically. To arrange views in the
workspace, click Layout > Arrange, or right-click the workspace and select Arrange.
Search for components. Find components in the Navigator or in the workspace.
Display the hierarchy of simple or complex types. View a hierarchy of the simple or complex types in the
XML schema.
View XML metadata. View an XML file, schema, or DTD that the XML Editor creates from the XML definition.
Preview XML data. Display an XML view using sample data from an external XML file.
Validate the XML definition. Validate the XML definition and view errors.
If you create a target XML definition that has one or more namespaces, you can choose a default namespace.
When you run a session, the Integration Service writes the elements and attributes from the default namespace
without a namespace prefix.
Do not use xml or xmlns as a namespace prefix. The xml prefix is bound to the
http://www.w3.org/XML/1998/namespace namespace by default. The xmlns prefix binds elements to namespaces in an
XML schema.
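The effect of a default namespace on the output can be seen with a short Python sketch. It uses the standard library serializer, not the Integration Service, and requires Python 3.8 or later for the default_namespace argument. The namespace URI and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/store"   # hypothetical namespace URI

# Build a small document in which every element belongs to NS.
root = ET.Element(f"{{{NS}}}STORE")
ET.SubElement(root, f"{{{NS}}}PRODUCT").text = "Pen"

# Without a default namespace, the serializer generates a prefix.
prefixed = ET.tostring(root, encoding="unicode")

# With NS as the default namespace, elements are written without a prefix.
unprefixed = ET.tostring(root, encoding="unicode", default_namespace=NS)
```

In the second output the namespace is declared once with xmlns and the element names carry no prefix.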
To update a namespace:
Navigating to Components
To quickly find components in large XML definitions, select a workspace component to navigate from and select a
navigation option. For example, if you click a foreign key in a view, you can navigate to the associated primary key
or to the column in the Columns window. You can navigate between components in the workspace, the Columns
window, and the Navigator.
To navigate to components:
Primary key. Highlights the primary key associated with a selected foreign key.
Referenced column. Highlights the referenced column associated with a pass-through port in an XML
Parser or Generator transformation.
XPath Navigator. Displays the path to the selected component.
XML view. Highlights a view in the workspace that contains the selected column from the Columns window.
To search using a partial key, enter the first few characters of the column or component name.
You can display a hierarchy of the complex types in the schema definition. To view a hierarchy of complex types,
click View > ComplexType Hierarchy. A window displays a hierarchy of the complex types in the schema. Select a
component from the ComplexType Hierarchy window to navigate to the component in the schema.
1. To view the metadata as a sample XML document, choose a global component in the Navigator.
2. Click View > XML Metadata.
The View XML Metadata dialog box appears.
3. Choose to display the XML definition as an XML file, a DTD file, or an XML schema.
If you use multiple namespaces, choose the namespace to use.
A default application or text editor displays the metadata.
4. To save a copy of the XML, DTD, or XML schema file, click Save As.
5. Enter a new file name.
If the default display application is a text editor, you need to include the appropriate file suffix with the file
name. The suffix is .xml, .dtd, or .xsd, depending on what type of file you are working with.
Use the following methods to determine when the Integration Service generates rows from an XML source:
Generate all the foreign keys in a view. By default, the Integration Service generates values for one foreign
key in a view. If a view has more than one foreign key, the other foreign keys have null values. You can
generate values for all the foreign keys.
Stop recursive reads in a circular relationship. By default, the Integration Service generates rows for all
occurrences of data in a circular relationship. You can generate a row for just the first occurrence of recursive
data.
Generate a row for a child view if a parent exists. By default, the Integration Service creates rows for all
views with data in the view row. You can generate a row for a child view only when a parent view has data.
Generate a row for a view without view row data. By default, the Integration Service generates data for a
view when the view row has data. You can generate a row for a view that has no data in the view row.
The following figure shows the Pedigree_View with two foreign keys:
If you select the All Hierarchy Foreign Keys option for the Pedigree_View, the Integration Service generates key
values for FK_Species and FK_Animal.
The following figure shows sample data for Pedigree_View with the All Hierarchy Foreign Keys option:
If you clear the All Hierarchy Foreign Keys option, the Integration Service generates key values for one foreign key
column. In this example, the Integration Service generates values for FK_Species because Species_View is the
closest parent of Pedigree_View in the XML hierarchy. The FK_Animal foreign key has null values.
The following figure shows sample data for Pedigree_View if you clear the All Hierarchy Foreign Keys option:
The following XML file contains a Part element with a circular reference:
<?xml version="1.0" encoding="utf-8"?>
<Vehicle xmlns="http://www.PartInvoice.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.PartInvoice.org part.xsd">
<VehInfo>
<make>Honda</make>
<model>Civic</model>
<type>Compact</type>
</VehInfo>
<Part>
<ID>123</ID>
<Name>Body</Name>
<Type>Exterior</Type>
<Part>
<ID>111</ID>
<Name>DoorFL</Name>
<Type>Exterior</Type>
<Part>
<ID>1112</ID>
<Name>KnobL</Name>
<Type>Exterior</Type>
</Part>
<Part>
<ID>1113</ID>
<Name>Window</Name>
<Type>Exterior</Type>
</Part>
</Part>
</Part>
</Vehicle>
The Part element is the view row for the X_par_Part view in the XML definition:
When you run a session, the Integration Service generates rows for Part 123 and all of the component Parts:
When you select NonRecursive Row, the Integration Service reads the first occurrence of the Part element and
generates one row of data for Part 123.
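The difference between the default behavior and NonRecursive Row can be approximated with a short Python sketch. This is an emulation of the row output described above, not the Integration Service logic. The document is an abbreviated copy of the Vehicle sample, and the function name part_rows is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"p": "http://www.PartInvoice.org"}

# Abbreviated copy of the Vehicle sample above.
DOC = """<Vehicle xmlns="http://www.PartInvoice.org">
  <Part><ID>123</ID><Name>Body</Name>
    <Part><ID>111</ID><Name>DoorFL</Name>
      <Part><ID>1112</ID><Name>KnobL</Name></Part>
      <Part><ID>1113</ID><Name>Window</Name></Part>
    </Part>
  </Part>
</Vehicle>"""

def part_rows(parent, non_recursive=False):
    """Collect one (ID, Name) row per Part, descending unless non_recursive."""
    rows = []
    for part in parent.findall("p:Part", NS):
        rows.append((part.findtext("p:ID", namespaces=NS),
                     part.findtext("p:Name", namespaces=NS)))
        if not non_recursive:
            rows.extend(part_rows(part))  # follow the circular Part reference
    return rows

root = ET.fromstring(DOC)
all_rows = part_rows(root)                         # default: every occurrence
first_only = part_rows(root, non_recursive=True)   # NonRecursive Row: first occurrence

print(all_rows)
print(first_only)
```

The default call returns rows for Parts 123, 111, 1112, and 1113, while the non-recursive call returns the single row for Part 123.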
For example, an XML definition might have a hierarchy consisting of an Employee view and an Address view.
Employee is the parent view. The address data can include Employee\Address or Store\Address. You can
choose to output Employee\Address.
The following XML file has an Address within the Store element and an Address within the Employee element:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE STORE >
<STORE SID="BE1752">
<SNAME>Mud and Sawdust Furniture Store</SNAME>
<ADDRESS>
<STREETADDRESS>335 Westshore Road</STREETADDRESS>
<CITY>Fausta City</CITY>
<STATE>CA</STATE>
<ZIP>97584</ZIP>
</ADDRESS>
<EMPLOYEE DEPID="34">
<ENAME>
<LASTNAME>Bacon</LASTNAME>
<FIRSTNAME>Allyn</FIRSTNAME>
</ENAME>
<ADDRESS>
<STREETADDRESS>1000 Seaport Blvd</STREETADDRESS>
<CITY>Redwood City</CITY>
<STATE>CA</STATE>
<ZIP>94063</ZIP>
</ADDRESS>
<EPHONE>(408)226-7415</EPHONE>
<EPHONE>(650)687-6831</EPHONE>
</EMPLOYEE>
</STORE>
The following figure shows a hierarchical relationship between the Employee and Address views in the XML
definition:
By default, the Integration Service generates a row for each occurrence of the Address element. The Integration
Service generates one row for the Store\Address and another for Employee\Address.
When you select the Hierarchy Relationship Row option, the Integration Service generates rows in a session as
follows:
The Integration Service generates a row for the Address view when the Employee view has corresponding data
in a session.
The Integration Service generates a row representing the Employee\Address hierarchy relationship.
The following shows the Address data if you select the Hierarchy Relationship Row option:
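As a rough illustration of the row selection, the following Python sketch reads an abbreviated copy of the STORE sample and returns Address rows with and without the option. It emulates the described output only. The function name address_rows and its flag are hypothetical, not PowerCenter settings.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the STORE sample above.
DOC = """<STORE SID="BE1752">
  <ADDRESS><STREETADDRESS>335 Westshore Road</STREETADDRESS><CITY>Fausta City</CITY></ADDRESS>
  <EMPLOYEE DEPID="34">
    <ADDRESS><STREETADDRESS>1000 Seaport Blvd</STREETADDRESS><CITY>Redwood City</CITY></ADDRESS>
  </EMPLOYEE>
</STORE>"""

def address_rows(store, hierarchy_relationship_row=False):
    """Return (parent, street, city) rows for ADDRESS elements."""
    rows = []
    # ElementTree keeps no parent pointer, so map each child to its parent first.
    parents = {child: parent for parent in store.iter() for child in parent}
    for address in store.iter("ADDRESS"):
        parent_tag = parents[address].tag
        if hierarchy_relationship_row and parent_tag != "EMPLOYEE":
            continue  # keep only addresses that belong to an Employee
        rows.append((parent_tag,
                     address.findtext("STREETADDRESS"),
                     address.findtext("CITY")))
    return rows

root = ET.fromstring(DOC)
default_rows = address_rows(root)
employee_only = address_rows(root, hierarchy_relationship_row=True)

print(default_rows)
print(employee_only)
```

The default call returns one row for the store address and one for the employee address. With the flag set, only the employee address row remains.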
The following figure shows the zip element as the view row:
By default, the Integration Service generates a row for every occurrence of the zip element within the address
element.
For example, you might process the following XML file in a session:
<?xml version="1.0" ?>
<company xmlns="http://www.example.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.org forcerow.xsd" >
<name>company1</name>
<address>
<street>stree1</street>
<city>city1</city>
<zip>1001</zip>
</address>
<employee>
<name>emp1</name>
<address>
<street>emp1_street</street>
<city>empl_city</city>
</address>
</employee>
<employee>
<name>emp2</name>
<address>
<street>emp2_street</street>
By default, the Integration Service generates the following rows for the Address view:
The Integration Service does not generate a row for emp1, because the view row is zip, and emp1 has no data for
the zip element.
If you enable Force Row, you can output the street and city elements with or without the zip. The session
generates a row for emp1, even though emp1 does not have data for the zip element.
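The Force Row behavior can be sketched in Python as follows. The sketch uses only the company and emp1 addresses from the sample above, omits the namespace declarations for brevity, and treats zip as the view row. The function name address_rows and the force_row flag are hypothetical.

```python
import xml.etree.ElementTree as ET

# Only the company and emp1 addresses from the sample above; namespaces omitted.
DOC = """<company>
  <address><street>stree1</street><city>city1</city><zip>1001</zip></address>
  <employee>
    <name>emp1</name>
    <address><street>emp1_street</street><city>empl_city</city></address>
  </employee>
</company>"""

def address_rows(company, force_row=False):
    """Return (street, city, zip) rows; zip acts as the view row."""
    rows = []
    for address in company.iter("address"):
        zip_code = address.findtext("zip")  # None when the element is absent
        if zip_code is None and not force_row:
            continue  # no view row data, so no row by default
        rows.append((address.findtext("street"),
                     address.findtext("city"),
                     zip_code))
    return rows

root = ET.fromstring(DOC)
default_rows = address_rows(root)
forced_rows = address_rows(root, force_row=True)

print(default_rows)
print(forced_rows)
```

Without the flag, only the company address produces a row. With the flag, the emp1 address also produces a row, with a null zip value.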
For example, a definition might have a hierarchy that includes BillToAddress and ShipToAddress. If you want to
generate rows for the BillToAddress, use the Type Relationship Row option.
The PartInvoice view contains invoice data. The view includes a BillToAddress. The type relationship in the XML
definition is between BillToAddress and AddressType.
To limit the AddressType data to BillToAddress, select the X_par_PartInvoice view in the XML Editor workspace.
Choose the Type Relationship Row option. When you run a session, the Integration Service generates Address
rows for BillToAddress but not ShipToAddress. ShipToAddress is not in the type relationship.
Example
The following example shows how to limit data to specific types in a type relationship. The example uses the
PartInvoice view and the AddressType view.
The following XML file contains invoice data that includes the BillToAddress and ShipToAddress:
<?xml version="1.0" encoding="utf-8"?>
<Invoices xmlns="http://www.PartInvoice.org"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.PartInvoice.org Part.xsd">
<PartInvoice InvoiceNum="185" DateShipped="2005-01-01">
<PartOrder>
<PartID>HLG100</PartID>
<PartName>Halogen Bulb</PartName>
<Quantity>2</Quantity>
<UnitPrice>35</UnitPrice>
</PartOrder>
<BillToAddress>
<Street>2100 Seaport Blvd</Street>
<City>Redwood City</City>
<State>CA</State>
<Zip>94063</Zip>
<Country>USA</Country>
</BillToAddress>
<ShipToAddress xsi:type="USAddressType">
<Street>3350 W Bayshore Rd</Street>
<City>Palo Alto</City>
<State>CA</State>
<Zip>97890</Zip>
<Country>USA</Country>
<PostalCode>97890</PostalCode>
</ShipToAddress>
</PartInvoice>
</Invoices>
To generate AddressType rows related to the PartInvoice view, set the Type Relationship Row option for the
PartInvoice view. The Type Relationship Row option generates the following row for the BillToAddress:
It does not generate a row for ShipToAddress because ShipToAddress is not in the type relationship.
I cannot find the DTD or XML schema file that I created when I viewed XML metadata.
The DTD or XML schema file that you can view is a temporary file that the Designer creates for viewing. If you
want to use the file for other purposes, save the file with another name and directory when you view it.
When I add columns to XML source views, the hierarchy in the source XML file remains the same.
When you add columns to XML source views, you do not add elements to the underlying hierarchy. The XML
hierarchy that you import remains the same no matter how you create the views or how you map the columns in a
view to the elements in the hierarchy. You can modify the datatypes and the cardinality of the elements, but you
cannot modify the structure of the hierarchy.
Import the definition from an XML schema or file. You can create a target definition from an XML, DTD, or
XML schema file. You can import XML file definitions from a URL or a local node. If you import an XML file with
an associated DTD, the XML Wizard uses the DTD to generate the XML file.
Create an XML target definition based on an XML source definition. You can drag an existing XML source
definition into the Target Designer. If you create an XML target definition, the Designer creates a target
definition based on the hierarchy of the XML definition.
Create an XML target based on a relational file definition. You can import an XML target definition from a
relational or flat file repository definition.
In addition to creating XML target definitions, you can complete the following tasks with XML targets in the Target
Designer:
Edit target properties. Edit an XML target definition to add comments documenting changes to target XML,
DTD, or XML schema files.
Synchronize target definitions. You can synchronize the target XML definition to a schema if you need to
make changes. When you synchronize the definition, you update the XML definition instead of importing the
schema if the schema changes.
Importing an XML Target Definition from an XML File
You can import XML definitions from an XML schema, DTD file, or XML file. You can import local files or files that
you reference with a URL. To ensure that the Designer can provide an accurate definition of the data, import a
target definition from an XML schema.
You can choose from the following options to create XML views:
Create entity relationships. Use this option to create views for multiple-occurring or referenced elements and
complex types. You create relationships between views instead of creating one large hierarchy.
Create hierarchical relationships. Use this option to create a root and expand the XML components under
the root. You can choose to create normalized or denormalized views. If you choose normalized, every element
or attribute appears once. One-to-many relationships become separate XML views with keys to relate the
views. If you create denormalized XML views, all elements and attributes display in one hierarchical group.
Create custom XML views. Use this option to select multiple global elements as roots for XML views and
select options for reducing metadata explosion.
Skip creating XML views. Use this option to import the definition without creating any views. If you choose
this option, use the XML Editor to create XML views in the workspace at a later time.
Synchronize XML definitions. Use this option to update one or more XML definitions when their underlying
schemas change.
Tip: Import DTD or XML schema files instead of XML files. If you import an XML file with an associated DTD,
the XML Wizard uses the DTD.
To import XML target definitions:
When you create an XML target definition from an XML source definition, you create a duplicate of the XML
source definition.
A valid XML source definition does not necessarily create a valid XML target definition. To ensure that you
create a valid target definition, validate the target.
If you create a relational target definition, the Designer creates the relational target definitions based on the
groups in the XML source definition. Each group in the XML source definition becomes a target definition.
The Designer creates the same relationship between the target definitions as between the groups in the
source definition.
1. Drag an XML source definition from the Navigator into the Target Designer workspace.
The XML Export dialog box appears.
2. Select to create a relational or XML target. Click OK.
The target definition appears in the Target Designer workspace. If you select relational targets, more than one
target definition might appear in the workspace, depending on the source.
Select Table. Name of the target definition. To change the name, click the Rename button.
Business Name. Descriptive name for the target table. Edit the Business Name using the Rename button.
Description. Description of the target table. Character limit is 2,000 bytes/K, where K is the maximum number
of bytes for each character in the repository code page. Enter links to business documentation.
Code Page. Select the code page to use in the target definition.
Keywords. Use keywords to organize and identify targets. Keywords might include developer names,
mappings, or XML schema names. Use keywords to perform searches in the Repository Manager.
Select Table. Displays the target definition you are editing. To choose a different definition to edit, select
one from the list of definitions you have available in the workspace.
Precision. Size of column. You can change precision for some datatypes, such as string.
Scale. Maximum number of digits after the decimal point for numeric values.
Key Type. Type of key the XML Wizard generates to link the views.
XPath. Path through the XML file hierarchy that locates an item.
5. On the Properties tab, you can modify the transformation attributes of the target definition.
If you use a source-based commit session or Transaction Control transformation with the XML target, you can
define how you want to flush data to the target.
Edit the following attributes:
Select Table. Displays the source definition you are editing. To choose a different source definition to edit,
select the source definition from the list.
Duplicate Group Row Handling. Choose one of these options to handle processing duplicate rows in the target:
- First Row. The Integration Service passes the first duplicate row to the target. Rows
following with the same primary key are rejected.
- Last Row. The Integration Service passes the last duplicate row to the target.
- Error. The Integration Service passes the first row to the target. Rows with duplicate
primary keys increment the error count. The session fails when the error count reaches
the error threshold.
DTD Reference. DTD or XML schema file name for the target XML file. The Integration Service adds the
document type declaration to the XML file when you create the XML file.
On Commit. The Integration Service can generate multiple XML files or append to one XML file after a
commit. Use one of the following options:
- Ignore Commit. The Integration Service creates an XML file and writes to the XML file at
end of file.
- Create New Document. Creates a new XML file at each commit.
- Append to Document. Writes to the same XML file after each commit.
Cache Directory. Directory for the XML target cache files. Default is the $PMCacheDir service process variable.
Cache Size. Total size in bytes for the XML target cache. Default is auto.
6. On the Metadata Extensions tab, you can create, modify, delete, and promote non-reusable metadata
extensions, and update their values. You can also update the values of reusable metadata extensions.
7. Click OK.
8. Click Repository > Save.
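The three Duplicate Group Row Handling options in step 5 can be compared with a small Python emulation. It is a sketch of the described outcomes, assuming rows arrive as (primary key, payload) pairs. The function name handle_duplicates and the error_threshold parameter are hypothetical stand-ins for the session settings.

```python
def handle_duplicates(rows, mode="First Row", error_threshold=1):
    """Emulate duplicate-row handling for rows given as (primary_key, payload) pairs."""
    if mode not in ("First Row", "Last Row", "Error"):
        raise ValueError(f"unknown mode: {mode}")
    kept = {}    # primary key -> payload passed to the target
    errors = 0
    for key, payload in rows:
        if key not in kept:
            kept[key] = payload
        elif mode == "Last Row":
            kept[key] = payload      # the later duplicate replaces the earlier one
        elif mode == "Error":
            errors += 1              # the first row stays; the duplicate counts as an error
            if errors >= error_threshold:
                raise RuntimeError("error threshold reached")
        # "First Row": later duplicates are rejected
    return list(kept.items())

rows = [(1, "a"), (2, "b"), (1, "c")]
first = handle_duplicates(rows, "First Row")
last = handle_duplicates(rows, "Last Row")

print(first)
print(last)
```

With the sample rows, First Row keeps payload "a" for key 1 and Last Row keeps payload "c". Error mode raises an exception as soon as the duplicate count reaches the threshold, which mirrors a failed session.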
PowerCenter validates target XML views when you perform the following tasks:
The Designer does limited validation when you save or fetch an XML target from the repository.
The XML Editor validates each step when you edit XML in the XML workspace.
You can choose to validate a target definition that you are editing in the XML Editor.
The Designer validates XML target connections when the Designer validates mappings.
The Designer uses rules to validate hierarchy relationships, type relationships, and inheritance relationships.
A view that has a root at a type cannot be a standalone view. The view must be a child in an inheritance
relationship or the view must have a type relationship with another view. An XML target is invalid if the XML
target has no views that are rooted at an element.
You must connect a view with a multiple-occurring view row to another view.
An XML target is invalid if the XML target has no view root at an element.
You can separate parent and child views by other elements, but if you have a choice of two parents for a view,
you must use the closest one. Determine the closest parent by the path of the effective view row. One parent
comes before the other in the path. Choose the view that comes second in the path.
You must connect all views with the same view root in the same hierarchy. The definition cannot contain
multiple trees for the same view root.
An XML view can have a hierarchical relationship to itself if the view row and the view root are identical for the
view.
A column in a view, V1, can have a type relationship to a view, V2, if the view roots are the same type, or the
V2 view root type is derived from the V1 view root. Both view roots must be global complex types.
If a column in a view has a type relationship to another view, you cannot expand the column.
View-to-view inheritance. A view is a derived type of another view. Both views must have global complex view
roots.
A view can have an inheritance relationship to another view if its view root is a complex type derived from the
view root type of the other view.
A view can be a parent in multiple inheritance relationships, but a view can be a child in just one inheritance
relationship.
Column-to-view inheritance. The column is an element of a local complex type, Type1, and the view is rooted
at a global complex type, Type2. Type1 is derived from Type2.
A column in a view can have an inheritance relationship to another view if the column is a local complex type
and the type is derived from the view root type of the other view.
If a column in a view, V1, has an inheritance relationship to a view, V2, you cannot put the content of V2 into
view V1.
The following components affect how you map an XML target in a mapping:
Active sources
Root elements
Abstract elements
FileName columns
Active Sources
An active source is a transformation that can return a different number of rows for each input row. The Integration
Service can load data from different active sources to an XML target. However, all ports within a single group of
the XML target must receive data from the same active source.
Aggregator
Joiner
MQ Source Qualifier
Rank
Source Qualifier
SQL
1. Right-click the target definition in the Mapping Designer and select Edit.
2. Click the Properties tab.
3. Click the arrow in the Root Element value column.
The Select Root dialog box appears.
4. Select an element from the list.
If you connect one port in a group, you must connect both the foreign key and primary key ports for the group.
If you connect a foreign key port in a group, you must connect the associated primary key port in the other
group. If you do not connect the primary key port of the root group, you do not need to connect the associated
foreign key ports in the other groups.
If you use an XML schema with a default attribute value, you must connect the attribute port to create the
default attribute in the target. If you pass a null value through the connected port, the Integration Service writes
the default value to the target.
During a session, if the Integration Service loads data to an abstract type, then the Integration Service should also
have data for a non-abstract derived type associated with the abstract type. If the derived type has no data, then
the Integration Service does not write the abstract element in the target XML file.
Note: If you are creating a new XML file on each commit, you need to dynamically name each XML file you
create. If you do not dynamically name each XML file, the Integration Service overwrites the XML file from the
previous commit.
The Integration Service generates a new XML file for each distinct primary key value in the root group of the
target. You add a FileName column to set different names for each file. Each name overrides the output file name
in the session properties.
Example
The following example shows a mapping containing an XML target with a FileName column:
The Expression transformation generates a file name from the Country XML element and passes the value to the
FileName column. The mapping passes a country to the target root, which is called Client. Whenever the Client
value changes, the Integration Service creates a new XML file. The Integration Service creates a list file that
contains each XML target file name. The Integration Service lists the absolute path to each file in the list.
The list file name is based on the output file name from the session properties:
revenue_file.xml.lst
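The file-per-value behavior and the list file can be approximated in Python. This sketch is an illustration under assumptions, not the Integration Service writer. The Revenue element, the sample file names, and the function write_client_files are hypothetical. Only the Client root name and the revenue_file.xml.lst list name come from the example above.

```python
import os
import tempfile
from itertools import groupby

def write_client_files(rows, out_dir, list_name="revenue_file.xml.lst"):
    """Write one XML file per FileName value and a list file naming each one."""
    written = []
    # groupby starts a new group whenever the FileName value changes.
    for file_name, group in groupby(rows, key=lambda row: row["FileName"]):
        path = os.path.abspath(os.path.join(out_dir, file_name))
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("<Client>\n")
            for row in group:
                handle.write(f"  <Revenue>{row['Revenue']}</Revenue>\n")
            handle.write("</Client>\n")
        written.append(path)
    # The list file holds the absolute path of every XML file written.
    list_path = os.path.join(out_dir, list_name)
    with open(list_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(written) + "\n")
    return written, list_path

rows = [
    {"FileName": "revenue_USA.xml", "Revenue": 100},
    {"FileName": "revenue_USA.xml", "Revenue": 250},
    {"FileName": "revenue_France.xml", "Revenue": 75},
]
with tempfile.TemporaryDirectory() as tmp:
    files, list_file = write_client_files(rows, tmp)
    names = [os.path.basename(path) for path in files]
    with open(list_file, encoding="utf-8") as handle:
        listed = handle.read().splitlines()

print(names)
```

The sketch assumes the rows arrive sorted by file name, so each change in the FileName value starts a new file.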
For example, the ContactInfo element in the following DTD is an enclosure element. The enclosure element has
no text content, but has maxOccurs > 1. The child elements also have maxOccurs > 1.
<!ELEMENT HR (EMPLOYEE+)>
<!ELEMENT EMPLOYEE (LASTNAME,FIRSTNAME,ADDRESS+,CONTACTINFO+)>
<!ATTLIST EMPLOYEE EMPID CDATA #REQUIRED>
<!ELEMENT LASTNAME (#PCDATA)>
<!ELEMENT FIRSTNAME (#PCDATA)>
<!ELEMENT ADDRESS (STREETADDRESS,CITY,STATE,ZIP)>
<!ELEMENT STREETADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT STATE (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT CONTACTINFO (PHONE+,EMERGCONTACT+)>
<!ELEMENT PHONE (#PCDATA)>
<!ELEMENT EMERGCONTACT (#PCDATA)>
If you do not create XML views for enclosure elements in the source definition, you do not create the
ContactInfo element in the source.
The XML Wizard creates the following source and target definitions:
The wizard does not include the ContactInfo element in the source definition because you chose not to create
views for enclosure elements when you created the source. However, the wizard includes the ContactInfo element
in the target definition.
The XML target definition I created from my relational sources contains all elements, but no attributes. How
can I modify the target hierarchy so that I can mark certain data as attributes?
You cannot modify the component types that the wizard creates from relational tables. However, you can view a
DTD or an XML schema file of the target XML hierarchy. Save the DTD or XML schema file with a new file name.
When you add an XML source definition to a mapping, you need to connect the source definition to an XML
Source Qualifier transformation. The XML Source Qualifier transformation defines the data elements that the
Integration Service reads during a session. The source qualifier determines how PowerCenter reads the
source data.
You can manually add a source qualifier transformation, or you can create a source qualifier transformation by
default when you add a source definition to a mapping.
You can edit some of the properties and add metadata extensions to an XML Source Qualifier transformation.
When you connect an XML Source Qualifier transformation in a mapping, you must follow rules to create a valid
mapping.
Source Qualifier transformation. You can link one XML source definition to one XML Source Qualifier
transformation.
You can link ports of one XML Source Qualifier group to ports of different transformations to form separate data
flows. However, you cannot link ports from more than one group in an XML Source Qualifier transformation to
ports in the same target transformation.
If you drag columns from more than one group to a transformation, the Designer copies the columns of all the
groups to the transformation. However, the Designer links only the ports of the first group to the corresponding
ports of the new columns in the transformation.
You can add an XML Source Qualifier transformation to a mapping by dragging an XML source definition into the
Mapping Designer workspace or by manually creating the source qualifier.
Select Transformation Displays the transformation you are editing. To choose a different transformation to edit,
select the transformation from the list.
3. Click the Ports tab to view the XML Source Qualifier transformation ports.
Use the Sequence column to set start values for generated keys in XML groups. You can enter a different
value for each generated key. Sequence keys are of bigint datatype. Whenever you change these values, the
sequence numbers restart the next time you run a session.
4. Click the Properties tab to configure properties that affect how the Integration Service runs the mapping
during a session.
The following table describes the XML Source Qualifier properties:
Select Transformation Displays the transformation you are editing. To choose a different transformation to edit,
select the transformation from the list.
Tracing Level Determines the amount of information about this transformation that the Integration Service
writes to the session log when it runs the workflow. You can override this tracing level when
you configure a session.
Reset At the end of a session, the Integration Service resets the start values to the start values for
the current session.
Restart At the beginning of a session, the Integration Service starts the generated key sequence for
all groups at one.
5. Click the Metadata Extensions tab to create, edit, and delete user-defined metadata extensions.
You can create, modify, delete, and promote non-reusable metadata extensions, and update their values. You
can also update the values of reusable metadata extensions.
6. Click OK.
Default value. The sequence value for a key that appears in the XML Source Qualifier when you first create
the source qualifier. The default is 1 for each key.
Start value. A sequence number value for a key at the start of a session. You can view the start values in the
XML Source Qualifier transformation before you run a workflow.
Current value. A sequence value for a key during a session.
The start values for the generated keys display in the Sequence column in the XML Source Qualifier.
Note: If you edit the sequence start values on the Ports tab, you must save the changes and exit the Designer
before you run a workflow.
Reset. At the end of a session, the Integration Service resets the start values back to the start values for the
current session. For example, at the beginning of a session, the start value of a key is 2000. At the end of a
session, the current value is 2500. When the session completes, the start value in the repository remains at
2000. You might use this option when you are testing and you want to generate the same key numbers the next
time you run a session.
Restart. At the beginning of a session, the Integration Service restarts the start values using the default value.
For example, if the start value for a key is 1005, and you select Restart, the Integration Service changes the
start value to 1. You might use this option if the keys are getting large and you will have no duplicate key
conflicts if you restart numbering.
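The Reset and Restart options can be compared with a short Python model of one generated key sequence. This is a simplified sketch. It assumes that, when neither option is selected, the stored start value advances to the current value at the end of the session. The function name run_session is hypothetical.

```python
def run_session(start_value, rows_read, option=None, default_value=1):
    """Return (keys generated, start value stored for the next session)."""
    if option == "Restart":
        start_value = default_value       # numbering begins again at the default value
    keys = list(range(start_value, start_value + rows_read))
    next_start = start_value + rows_read  # current value when the session ends
    if option == "Reset":
        next_start = start_value          # the stored start value stays unchanged
    return keys, next_start

reset_keys, reset_next = run_session(2000, 500, option="Reset")
restart_keys, restart_next = run_session(1005, 3, option="Restart")
plain_keys, plain_next = run_session(2000, 500)

print(reset_keys[0], reset_keys[-1], reset_next)
print(restart_keys, restart_next)
print(plain_next)
```

With Reset, a session that starts at 2000 leaves the stored start value at 2000, so the next run generates the same keys. With Restart, a stored start value of 1005 is ignored and numbering begins at 1.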
The Designer enforces concatenation rules when you connect objects in a mapping. Therefore, you need to
organize the groups in the XML source definition so that each group contains all the information you require in one
pipeline branch.
Consider the following rules when you connect an XML Source Qualifier transformation in a mapping:
You can link ports from one group in an XML Source Qualifier transformation to ports in one input
group of another transformation. You can copy the columns of several groups to one transformation, but you
can link the ports of only one group to the corresponding ports in the transformation.
You can link ports from one group in an XML Source Qualifier transformation to ports in more than one
transformation. Each group in an XML Source Qualifier transformation can be a source of data for more than
one pipeline branch. Data can pass from one group to several different transformations.
You might want to calculate the total YTD sales for each product in the XML file regardless of region. Besides
sales, you also want the names and prices of each product.
To do this, you need both product and sales information in the same transformation. However, when you import
the StoreInfo.xml file, the Designer creates separate groups for products and sales by default.
Since you cannot link both the Product and the Sales groups to the same single input group transformation, you
can create the mapping in one of the following ways:
Join the data from the two groups using a Joiner transformation.
The following figure shows a denormalized group Product_Sales containing a combination of columns from both
the Product and Sales groups:
To create the denormalized group, edit the source definition in the Source Analyzer. You can either create a new
group or modify an existing group. Add product and sales columns to the group in order to do the sales calculation
in the Aggregator transformation. Use the XML Editor to create the group and validate the group.
You can then send the data from the Joiner transformation to an Aggregator transformation to calculate the
YTDSales for each product.
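The join followed by aggregation can be sketched in plain Python to show the data flow. The rows, the key column names PK_Product and FK_Product, and the other column names are assumptions made for illustration. They are not taken from StoreInfo.xml.

```python
from collections import defaultdict

# Hypothetical rows from the Product and Sales groups; column names are assumptions.
products = [
    {"PK_Product": 1, "Name": "Widget", "Price": 9.5},
    {"PK_Product": 2, "Name": "Gadget", "Price": 20.0},
]
sales = [
    {"FK_Product": 1, "Region": "East", "YTDSales": 100},
    {"FK_Product": 1, "Region": "West", "YTDSales": 150},
    {"FK_Product": 2, "Region": "East", "YTDSales": 40},
]

# Joiner step: match each sales row to its product through the key columns.
by_key = {product["PK_Product"]: product for product in products}
joined = [{**by_key[sale["FK_Product"]], **sale}
          for sale in sales if sale["FK_Product"] in by_key]

# Aggregator step: total YTD sales for each product, regardless of region.
totals = defaultdict(int)
for row in joined:
    totals[(row["Name"], row["Price"])] += row["YTDSales"]

summary = [(name, price, total) for (name, price), total in totals.items()]
print(summary)
```

Each joined row carries both product and sales columns, which is what the Aggregator transformation needs in a single input group.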
The following figure shows how you can create two instances of the same XML source and join data from two XML
Source Qualifier transformations:
I cannot break the link between the XML source definition and its source qualifier.
The XML Source Qualifier transformation columns match the corresponding XML source definition columns. You
cannot remove or modify the links between an XML source definition and its XML Source Qualifier transformation.
When you remove an XML source definition, the Designer removes its XML Source Qualifier transformation.
XML Parser transformation. The XML Parser transformation reads XML from one input port and outputs data
to one or more groups.
XML Generator transformation. The XML Generator transformation reads data from one or more sources and
generates XML. The XML Generator transformation has a single output port.
Use a midstream XML transformation to extract XML data from messaging systems, such as TIBCO, WebSphere
MQ, or from other sources, such as files or databases. The XML transformation functionality is similar to the XML
source and target functionality, except the midstream XML transformation parses the XML or generates the
document in the pipeline.
Midstream XML transformations support the same XML schema components that the XML Wizard and XML Editor
support. In addition, XML transformations support the following functionality:
Pass-through ports. Use pass-through ports to pass non-XML data through the midstream transformation.
These fields are not part of the XML schema definition, but you use them to generate denormalized XML
groups. You use these fields in the same manner as top-level XML elements. You can also use a pass-through
field as a primary key for the top-level group in the XML definition.
Real-time processing. Use a midstream XML transformation to process data as BLOBs from messaging
systems.
Support for multiple partitions. You can generate different XML documents for each partition.
When the Integration Service processes an XML Parser transformation, it reads a row of XML data, parses the
XML, and returns data through output groups. The XML Parser transformation returns non-XML data in
pass-through ports. You can parse XML messages from sources such as JMS or IBM WebSphere MQ.
The XML Parser transformation has one input group and one or more output groups. The input group has one
input port, DataInput, which accepts an XML document in a string.
When you create an XML Parser transformation, use the XML Wizard to import an XML, DTD, or XML schema file.
For example, you can import the following Employee DTD file:
<!ELEMENT EMPLOYEES (EMPLOYEE+)>
<!ELEMENT EMPLOYEE (LASTNAME, FIRSTNAME, ADDRESS, PHONE+, EMAIL*, EMPLOYMENT)>
<!ATTLIST EMPLOYEE EMPID CDATA #REQUIRED
DEPTID CDATA #REQUIRED>
<!ELEMENT LASTNAME (#PCDATA)>
<!ELEMENT FIRSTNAME (#PCDATA)>
<!ELEMENT ADDRESS (STREETADDRESS, CITY, STATE, ZIP)>
<!ELEMENT STREETADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT STATE (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT PHONE (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
<!ELEMENT EMPLOYMENT (DATEOFHIRE, SALARY+)>
<!ATTLIST EMPLOYMENT EMPLSTAT (PF|PP|TF|TP|O) "PF">
<!ELEMENT DATEOFHIRE (#PCDATA)>
<!ELEMENT SALARY (#PCDATA)>
The Designer creates a root view, X_Employees. X_Employees is the parent of X_Employee. X_Employee is a
parent of X_Salary, X_Phone, and X_Email.
Each view in the XML Parser transformation has at least one key to establish its relationship with another view. If
you do not designate the keys in the XML Editor, the Designer creates the primary and foreign keys for each view.
The keys are of datatype bigint. The keys are called generated keys because the Integration Service creates the
key values each time it returns a row from the XML Parser transformation.
When the Designer creates a primary or foreign key column, it assigns a column name with a prefix. In an XML
definition, the prefix is XPK_ for a generated primary key column and XFK_ for a generated foreign key column. A
foreign key always refers to a primary key in another group. A generated foreign key column always refers to a
generated primary key column.
For example, the group X_Employee has the XPK_Employee primary key. The Designer creates foreign key
columns that connect the X_Phone, X_Email, and X_Salary to the X_Employee group. Each group has the foreign
key column XFK_Employee.
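The key relationships described above can be sketched in Python. This is an illustration only, not the Integration Service implementation: the sample document is abbreviated and is not validated against the Employee DTD, the function name parse_employees is hypothetical, and the key sequence is assumed to start at 1.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Abbreviated sample; it is not validated against the Employee DTD.
SAMPLE = """<EMPLOYEES>
  <EMPLOYEE EMPID="1" DEPTID="10">
    <LASTNAME>Baer</LASTNAME><FIRSTNAME>John</FIRSTNAME>
    <PHONE>555-0101</PHONE><PHONE>555-0102</PHONE>
    <EMAIL>jbaer@example.com</EMAIL>
  </EMPLOYEE>
  <EMPLOYEE EMPID="2" DEPTID="20">
    <LASTNAME>Davis</LASTNAME><FIRSTNAME>Bernice</FIRSTNAME>
    <PHONE>555-0201</PHONE>
  </EMPLOYEE>
</EMPLOYEES>"""


def parse_employees(xml_text, start=1):
    """Return rows for the X_Employee, X_Phone, and X_Email groups."""
    keys = count(start)  # generated key sequence for XPK_Employee (assumed start)
    groups = {"X_Employee": [], "X_Phone": [], "X_Email": []}
    for emp in ET.fromstring(xml_text).iter("EMPLOYEE"):
        xpk_employee = next(keys)
        groups["X_Employee"].append({
            "XPK_Employee": xpk_employee,
            "EMPID": emp.get("EMPID"),
            "LASTNAME": emp.findtext("LASTNAME"),
        })
        # Each child row carries the parent key in its XFK_Employee column.
        for tag, group in (("PHONE", "X_Phone"), ("EMAIL", "X_Email")):
            for child in emp.findall(tag):
                groups[group].append({"XFK_Employee": xpk_employee, tag: child.text})
    return groups
```

Every row in the X_Phone and X_Email groups can be joined back to its employee through the XFK_Employee value.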
The repository stores the key values. You cannot change the values in the repository, but you can choose to reset
or restart the sequence numbers after a session.
For example, a real-time PowerCenter session reads XML messages from a WebSphere MQSeries source. The
session runs with a source-based commit. A message in the commit transaction has an invalid XML payload. To
prevent the commit from failing, you can configure the XML Parser transformation to return the invalid XML to a
separate output group from the valid data. The XML Parser transformation processes the valid XML messages and
completes the transaction.
To configure the XML Parser transformation to validate the XML, enable the Route Invalid Payload Through Data
Flow option on the Midstream XML Parser tab. The Designer adds the following ports to the XML Parser
transformation:
Invalid_Payload. Returns invalid XML messages to the pipeline. If the XML payload is valid, the
Invalid_Payload port contains a null value. This port has the same precision as the DataInput port.
Error_Status. Contains the error string or status returned from the XML validation. If the XML is valid for the
current row, Error_Status contains a null value. This port has the same precision as the DataInput port.
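The routing behavior of the two ports can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name route_payload is hypothetical, and the standard library parser used here checks only that the XML is well formed. It does not validate against a schema as the XML Parser transformation does.

```python
import xml.etree.ElementTree as ET


def route_payload(data_input):
    """Return (parsed_root, invalid_payload, error_status) for one message.

    Valid XML: the parsed root is returned and both error ports are None.
    Invalid XML: the original message and the error text are returned.
    """
    try:
        root = ET.fromstring(data_input)
    except ET.ParseError as exc:
        # Well-formedness check only; schema validation is out of scope here.
        return None, data_input, str(exc)
    return root, None, None
```

A caller can send rows with a non-null invalid payload to an error target and pass the remaining rows downstream.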
The following mapping shows an XML Parser transformation that routes invalid XML messages to an Errors target
table:
MQSeries source definition. Contains employee XML data in the message data field.
Source Qualifier transformation. Reads data from the WebSphere MQSeries. Contains a set of ports that
represent the message header fields and the message data field.
XML Parser transformation. Receives the XML message data in the DataInput port. When the XML is valid,
the XML Parser transformation returns the employee data and passes it to a target. When the XML is not valid,
the XML Parser transformation returns the XML in the Invalid_Payload port. It returns an error message in the
Error_Status port.
Employees target definition. Receives rows of valid employee data.
Configure the XML Schema Location attribute in the session properties for the transformation. Enter the name and
location of the schema to validate the XML against. You can configure workflow, session, or mapping variables
and parameters for the XML schema definition. You can configure multiple schemas for validation if you separate
them with semi-colons.
You can use a DTD for validation if you include it in the input XML payload. You cannot configure a DTD in the
XML Schema Location attribute or use it to route invalid XML data to the Invalid_Payload port.
If you enable XML streaming, verify that the precision for the Invalid_Payload port matches the maximum message
size. If the port precision is less than the message size, the XML Parser transformation returns truncated XML in
the Invalid_Payload port, and writes an error in the session log.
When you enable XML streaming, the XML Parser transformation receives data in segments that are less than or
equal to the port size. When the XML file is larger than the port size, the PowerCenter Integration Service passes
more than one row to the XML Parser transformation. Each XML row has a row type of streaming. The last row
has a row type of insert.
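The segmentation described above can be sketched in Python. The function name stream_rows is hypothetical, and the sketch assumes that the port size is measured in characters; it illustrates only the row types, not how the Integration Service buffers data.

```python
def stream_rows(xml_text, port_size):
    """Split a document into (row_type, segment) rows no longer than port_size."""
    segments = [xml_text[i:i + port_size] for i in range(0, len(xml_text), port_size)]
    rows = []
    for index, segment in enumerate(segments):
        # Every row except the last is a streaming row; the last row is an insert.
        row_type = "insert" if index == len(segments) - 1 else "streaming"
        rows.append((row_type, segment))
    return rows
```

A document that fits in the port produces a single row with a row type of insert.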
The input port precision must be equal to or greater than the output port precision of the object that passes the
XML to the XML Parser transformation. When most of the XML documents are small, but some messages are
large, set the XML Parser transformation port size to the size of the smaller messages for best performance.
If you enable XML streaming, you must also enable XML streaming for the source or transformation that is passing
the XML data to the XML Parser transformation. If you do not enable streaming, the XML Parser receives the XML
in one row, which might slow performance.
To enable XML streaming in the XML Parser transformation, select Enable XML Input Streaming in the XML
Parser transformation session properties. If you enable XML streaming in the source or transformation, but you do
not enable it for the XML Parser transformation, the XML Parser transformation cannot process the XML file.
When you enable XML streaming and an error occurs in the XML document, the PowerCenter Integration Service
writes the XML document to the session log by default. You can configure the session to write the XML document
to the error log file when an error occurs. Enable Log Source Row Data in the session properties. When you
enable logging, and an error occurs in the XML document, the PowerCenter Integration Service generates a row
error. The PowerCenter Integration Service writes the XML document to the error log file and it increments the
error count.
For example, on Windows 32-bit, the Integration Service rounds the number after 17 digits.
1234.567890123456789 is converted to 1234.567890123460800.
On HP-UX 32-bit, the Integration Service rounds the number after 34 digits.
Use an XML Generator transformation to combine input that comes from several sources to create an XML
document. For example, use the transformation to combine the XML data from two TIBCO sources into one TIBCO target.
The XML Generator transformation is similar to an XML target definition. When the Integration Service processes
an XML Generator transformation, it writes rows of XML data. The Integration Service can also process pass-
through fields containing non-XML data in the transformation.
The XML Generator transformation has one or more input groups and one output group. The output group has one
port, DataOutput, which generates a string data BLOB XML document. The output group contains the pass-
through port when you create pass-through fields.
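As a rough illustration of several input groups feeding one output document, the following Python sketch builds an XML string from employee rows and phone rows. The function name generate_employees and the row layout are assumptions made for this example; the key column names follow the XPK_ and XFK_ convention used elsewhere in this guide.

```python
import xml.etree.ElementTree as ET


def generate_employees(employee_rows, phone_rows):
    """Combine two input groups into one XML document returned as a string."""
    root = ET.Element("EMPLOYEES")
    by_key = {}
    for row in employee_rows:
        emp = ET.SubElement(root, "EMPLOYEE", EMPID=str(row["EMPID"]))
        ET.SubElement(emp, "LASTNAME").text = row["LASTNAME"]
        by_key[row["XPK_Employee"]] = emp
    for row in phone_rows:
        parent = by_key.get(row["XFK_Employee"])
        if parent is not None:  # rows without a matching parent are skipped here
            ET.SubElement(parent, "PHONE").text = row["PHONE"]
    # The single string result plays the role of the DataOutput port.
    return ET.tostring(root, encoding="unicode")
```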
To synchronize a midstream XML transformation, use the Transformation Developer or Mapping Designer.
You use the Source Analyzer and Target Designer to synchronize source and target XML definitions.
When you create a midstream XML transformation in the Mapping Designer, the following rules apply:
If you make the transformation reusable, you can change some of the transformation properties from the
Mapping Designer. You cannot add pass-through ports or metadata extensions.
If you create a non-reusable transformation, you can edit the transformation from the Mapping Designer.
When you configure a midstream XML transformation, you can configure components on the following tabs:
Transformation tab. Rename the transformation and add a description on the Transformation tab.
Ports tab. Display the transformation ports and attributes that you create on the XML Parser or XML Generator
tab.
Properties tab. Update the tracing level.
Initialization Properties tab. Create run-time properties that an external procedure uses during initialization.
Metadata Extensions tab. Extend the metadata stored in the repository by associating information with
repository objects, such as an XML transformation.
Port Attribute Definitions tab. Define port attributes that apply to all ports in the transformation.
Midstream XML Parser or XML Generator tab. Create pass-through ports using this tab. Pass-through ports
enable you to pass non-XML data through the transformation. For the XML Parser transformation, you can
choose to reset sequence numbers if you use sequence numbering to generate XML column names. For the
XML Generator transformation, you can choose to create a new XML document on commits.
Properties Tab
Configure the midstream XML transformation properties on the Properties tab.
The following table describes the options you can change on the Properties tab:
Transformation Description
Runtime Location Location that contains the DLL or shared library. Default is $PMExtProcDir. Enter a path relative to the
Integration Service node that runs the XML session.
If this property is blank, the Integration Service uses the environment variable defined on the
Integration Service node to locate the DLL or shared library.
You must copy all DLLs or shared libraries to the runtime location or to the environment variable
defined on the Integration Service node. The Integration Service fails to load the procedure when it
cannot locate the DLL, shared library, or a referenced file.
Tracing Level Amount of detail displayed in the session log for this transformation. Default is Normal.
Transformation Scope Indicates how the Integration Service applies the transformation logic to incoming data. You can
choose one of the following transformation scope values for the XML Parser transformation:
- Row. Applies the transformation logic to one row of data at a time. Flushes the rows generated
for all the output groups before processing the next row.
- Transaction. Applies the transformation logic to all rows in a transaction. Flushes generated rows
at transaction boundaries, when output blocks fill up, and at end of file.
- All Input. Applies the transformation logic to all incoming data. Flush generated rows only when
the output blocks fill up and at end of file.
For the XML Generator transformation, the Designer sets the transformation scope to all input when
you set the On Commit setting to Ignore Commit. The Designer sets the transformation scope to the
transaction level if you set On Commit to Create New Doc.
Output is Repeatable Indicates if the order of the output data is consistent between session runs.
- Never. The order of the output data is inconsistent between session runs.
- Based On Input Order. The output order is consistent between session runs when the input data
order is consistent between session runs.
- Always. The order of the output data is consistent between session runs even if the order of the
input data is inconsistent between session runs.
Default is Based on Input Order for the XML Parser transformation. Default is Always for the XML
Generator transformation.
Requires Single Thread per Partition Indicates if the Integration Service must process each partition with one thread.
Output is Deterministic Indicates whether the transformation generates the same output data between session runs. You must
enable this property to perform recovery on sessions that use this transformation. Default is enabled.
Warning: If you configure a transformation as repeatable and deterministic, it is your responsibility to ensure that
the data is repeatable and deterministic. If you try to recover a session with transformations that do not produce
the same data between the session and the recovery, the recovery process can result in corrupted data.
You can access the XML Editor from the Midstream XML Parser Tab. Click the XML Editor button.
Note: When you access the XML Editor, you cannot update Edit Transformations until you exit the XML Editor.
The following table describes the options you can change on the Midstream XML Parser tab:
Transformation Description
Precision Length of the column. Default DataInput port precision is 64K. Default precision for a
pass-through port is 20. You can increase the precision.
Restart Always start the generated key sequence at 1. Each time you run a session, the key sequence
values in all groups of the XML definition start over at 1.
Reset At the end of a session, reset the value sequence for all generated keys in all groups. Reset
the sequence numbers back to where they were before the session.
Route Invalid Payload Through Data Flow Validate the XML against a schema. If the XML is not valid for the schema, a row error occurs.
The XML Parser transformation returns the XML and associated error messages to a separate
output group.
Note: If you do not select Reset or Restart, the sequence numbers in the generated keys increase from session to
session. If you select the Restart or Reset option, you update the Restart or Reset property that appears on the
Initialization Properties tab. You cannot change these options from the Initialization Properties tab, however.
You can access the XML Editor from the Midstream XML Generator Tab. Click the XML Editor button. When you
access the XML Editor, you cannot edit transformation properties until you exit the XML Editor.
The following table describes the options you can change on the XML Generator transformation tab:
Transformation Setting Description
Precision Length of the column. Default DataOutput port precision is 64K. Default precision for a pass-through
port is 20. You can increase the precision.
On Commit The Integration Service can generate multiple XML documents after a commit. Use one of the
following options:
- Ignore Commit. The Integration Service creates the XML document and writes data to the
document at end of file. Use this option if two different sources are connected to the XML
Generator transformation.
- Create New Document. Creates a new XML document at each commit. Use this option if you are
running a real-time session.
When a session uses multiple partitions, the Integration Service generates a separate XML
document for each partition, regardless of On Commit settings. If you select Create New Document,
the Integration Service creates new documents for each partition.
Note: The Designer sets the transformation scope to all input when you set the On Commit setting to Ignore
Commit. The Designer sets the transformation scope to the transaction level if you set On Commit to Create New
Doc.
When you define a pass-through port in the midstream transformation, you add the pass-through port to either the
DataInput group in the XML Parser transformation or the DataOutput group in the XML Generator transformation.
Once you generate the port, you use the XML Editor to add a corresponding reference port to another view in the
XML definition. In the XML Parser transformation, the pass-through port is an input port, and the corresponding
reference port is an output port. In the XML Generator transformation, the pass-through port is an output port and
the associated reference port is an input port.
You can change the datatypes in XML definitions and in midstream XML transformations if you import an XML file
to create the definition. You cannot change XML datatypes when you import them from an XML schema. You
cannot change the transformation datatypes for XML sources within a mapping.
The following table describes the XML and corresponding transformation datatypes:
Datatype Transformation Range
date Date/Time Jan 1, 0001 A.D. to Dec 31, 9999 A.D. (precision to the nanosecond)
dateTime Date/Time Jan 1, 0001 A.D. to Dec 31, 9999 A.D. (precision to the nanosecond)
gMonthDay Date/Time Jan 1, 0001 A.D. to Dec 31, 9999 A.D. (precision to the nanosecond)
gYearMonth Date/Time Jan 1, 0001 A.D. to Dec 31, 9999 A.D. (precision to the nanosecond)
time Date/Time Jan 1, 0001 A.D. to Dec 31, 9999 A.D. (precision to the nanosecond)
Unsupported Datatypes
PowerCenter does not support the following XML datatypes:
binary
century
month
nstring
number
recurringDate
recurringDay
recurringDuration
timeDuration
timeInstant
timePeriod
uriReference
year
Use a date, time, or datetime element in either of the following formats within a session:
CCYY-MM
- or -
CCYY-MM-DD/Thh
The format of the first datetime element in an XML file determines the format of all subsequent values of the
element. If the Integration Service reads a value for the same date, time, or datetime element that has a different
format, the Integration Service rejects the row.
For example, if the first value of a datetime element is in the following format:
CCYY-MM-DDThh:mm:ss
the Integration Service rejects a row that contains the element in the following format:
CCYY-MM-DD
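The rule that the first value fixes the format can be sketched in Python. The guide does not describe how the Integration Service compares formats, so the digit-shape comparison below is an assumption, and the function name split_by_first_format is hypothetical.

```python
import re


def split_by_first_format(values):
    """Accept values whose layout matches the first value; reject the rest."""
    accepted, rejected = [], []
    expected = None
    for value in values:
        # Replace every digit with 9 so only the layout of the value remains.
        shape = re.sub(r"\d", "9", value)
        if expected is None:
            expected = shape  # the first value determines the format
        if shape == expected:
            accepted.append(value)
        else:
            rejected.append(value)
    return accepted, rejected
```

With a first value of 2010-03-01T10:15:00, a later value of 2010-03-02 has a different layout and falls into the rejected list.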
boolean
ceiling
concat
contains
false
floor
lang
normalize-space
not
number
round
starts-with
string
string-length
substring
substring-after
substring-before
translate
true
Use an XPath query predicate in an XML view to filter XML source data. In a session, the Integration Service
extracts data from a source XML file based on the query. If all queries return TRUE, the Integration Service
extracts data for the view.
An XPath query predicate includes an element or attribute to extract, and the query that determines the criteria.
You can verify the value of an element or attribute, or you can verify that an element or attribute exists in the
source XML data.
This appendix describes each function used in an XPath query predicate. Functions accept arguments and return
values. When you create a function, you can include components from the elements and attributes in the XML
view, and you can add literal values. When you specify a literal, you must enclose the literal in single or double
quotes.
String. Use string functions to test substring values, concatenate strings, or translate strings into other strings.
For example, the following XPath query predicate determines if an employee's full name is equal to the
concatenation of last name and first name:
EMPLOYEE[./FULLNAME=concat(./ENAME/LASTNAME,./ENAME/FIRSTNAME)]
Numeric. Use numeric functions with element and attribute values. Numeric functions operate on numbers and
return integers. For example, the following XPath query predicate rounds discount and tests if the result is
greater than 15:
ORDER_ITEMS[round(./DISCOUNT) > 15]
Boolean. Use Boolean functions to test elements, check the language attribute, or force a true or false result.
For example, the following XPath query predicate returns true if the length of the string is greater than zero:
boolean(string)
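To show the effect of a query predicate on source data, the following Python sketch applies the logic of the concat predicate from the String example to a small document. The sample data and the function name matching_employees are assumptions for this illustration. The comparison is written in plain Python rather than as an XPath expression.

```python
import xml.etree.ElementTree as ET

EMPLOYEES = """<EMPLOYEES>
  <EMPLOYEE><FULLNAME>BaerJohn</FULLNAME>
    <ENAME><LASTNAME>Baer</LASTNAME><FIRSTNAME>John</FIRSTNAME></ENAME></EMPLOYEE>
  <EMPLOYEE><FULLNAME>John Baer</FULLNAME>
    <ENAME><LASTNAME>Baer</LASTNAME><FIRSTNAME>John</FIRSTNAME></ENAME></EMPLOYEE>
</EMPLOYEES>"""


def matching_employees(xml_text):
    """Keep EMPLOYEE elements where FULLNAME equals LASTNAME followed by FIRSTNAME."""
    root = ET.fromstring(xml_text)
    matches = []
    for emp in root.findall("EMPLOYEE"):
        full = emp.findtext("FULLNAME", default="")
        joined = (emp.findtext("ENAME/LASTNAME", default="")
                  + emp.findtext("ENAME/FIRSTNAME", default=""))
        if full == joined:  # same test as FULLNAME=concat(LASTNAME, FIRSTNAME)
            matches.append(emp)
    return matches
```

Only the first employee in the sample passes the test, because the second full name contains a space and lists the first name first.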
The following table describes XPath query predicate string functions:
normalize-space normalize-space ( string ) Strips leading and trailing white space from a string.
substring substring ( string, start [ ,length ] ) Returns a portion of a string starting at a specified position.
substring-after substring-after ( string, substring ) Returns the characters in a string that occur after a substring.
substring-before substring-before ( string, substring ) Returns the characters in a string that occur before a
substring.
translate translate ( string1, string2, string3 ) Converts the characters in a string to other characters.
ceiling ceiling ( number ) Rounds a number to the smallest integer that is greater
than or equal to the passed number.
floor floor ( number ) Rounds a number to the largest integer that is less than or
equal to the passed number.
not not ( condition ) Returns TRUE if a Boolean condition is FALSE and FALSE
if the Boolean condition is TRUE.
boolean
Converts a value to Boolean.
Syntax
boolean ( object )
Argument Description
Return Value
Boolean.
A string returns TRUE if its length is not zero, otherwise it returns FALSE.
A number returns FALSE if it is zero or not a number (NaN), otherwise it returns TRUE.
Example
The following example verifies that a name has characters:
boolean ( NAME )
NAME RETURN VALUE
Lilah TRUE
FALSE
ceiling
Rounds a number to the smallest integer that is greater than or equal to the passed number.
Syntax
ceiling ( number )
Argument Description
Return Value
Integer.
Example
The following expression returns the price rounded up to the nearest integer:
ceiling ( PRICE )
PRICE RETURN VALUE
39.79 40
125.12 126
74.24 75
NULL NULL
-100.99 -100
100 100
concat
Concatenates two strings.
Argument Description
Return Value
String.
If one of the strings is NULL, concat ignores it and returns the other string.
Example
The following expression concatenates FIRSTNAME and LASTNAME:
concat( FIRSTNAME, LASTNAME )
FIRSTNAME LASTNAME RETURN VALUE
John Baer JohnBaer
NULL Campbell Campbell
Greg NULL Greg
NULL NULL NULL
Tip
The concat function does not add spaces to strings. To add a space between two strings, you can write an
expression that contains nested concat functions. For example, the following expression adds a space to the end
of the first name and concatenates first name to the last name:
concat ( concat ( FIRST_NAME, " " ), LAST_NAME )
FIRST_NAME LAST_NAME RETURN VALUE
John Baer John Baer
NULL Campbell Campbell (includes leading space)
Greg NULL Greg
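The NULL handling shown in the tables above can be mirrored in a few lines of Python. This is a sketch of the behavior as this guide documents it, with None standing in for NULL; it is not the Integration Service code.

```python
def concat(first, second):
    """Concatenate two strings, ignoring a NULL (None) argument."""
    if first is None and second is None:
        return None
    return (first or "") + (second or "")


# Nested calls add the separating space, as in the tip above.
full_name = concat(concat("John", " "), "Baer")
```

When the first name is NULL, the nested call still returns the space, which is why the result for Campbell carries a leading space.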
contains
Determines if a string contains another string.
Syntax
contains( string, substring )
Argument Description
string String datatype. Passes the string to examine. The argument is case sensitive.
substring String datatype. Passes the string to search for in the string. The argument is case sensitive.
Return Value
Boolean.
Example
The following expression returns TRUE if the NAME contains SHORTNAME:
contains( NAME, SHORTNAME )
NAME SHORTNAME RETURN VALUE
John Baer FALSE
SuzyQ Suzy TRUE
WorldPeace World TRUE
CASE_SENSITIVE case FALSE
false
Always returns FALSE. Use this function to set a Boolean to FALSE.
Syntax
false ()
Return Value
FALSE.
Example
Combine the false function with other functions to force a FALSE result. The following expressions return FALSE:
EXPRESSION RETURN VALUE
( salary ) = false() FALSE
A/B = false () FALSE
starts-with ( name, 'T' ) = false() FALSE
floor
Rounds a number to the largest integer that is less than or equal to the passed number.
Syntax
floor( number )
Argument Description
Return Value
Integer.
Example
The following expression returns the largest integer less than or equal to the value in BANK_BALANCE:
floor( BANK_BALANCE )
BANK_BALANCE RETURN VALUE
39.79 39
lang
Returns TRUE if the element has an xml:lang attribute that is the same language as the code argument. Use the
lang function to select XML by language. The xml:lang attribute is a code that identifies the language of the
element content. An element might include text in several languages.
Syntax
lang ( code )
Argument Description
Return Value
Boolean.
Example
The following expression examines the element content language code:
lang ( 'en' )
XML RETURN VALUE
<Phrase xml:lang="es"> FALSE
El perro esta en la casa.
</Phrase>
<Phrase xml:lang="en"> TRUE
The dog is in the house.
</Phrase>
normalize-space
Removes leading and trailing white space from a string. White space contains characters that do not display, such
as the space character and the tab character. This function replaces sequences of white space with a single space.
Syntax
normalize-space ( string )
Argument Description
Return Value
String.
Example
The following expression removes excess white space from a name:
normalize-space ( NAME )
NAME RETURN VALUE
Jack Dog Jack Dog
Harry Cat Harry Cat
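A Python equivalent makes the two steps explicit: trim the ends, then collapse each internal run of white space to one space. The sketch assumes that white space means the space, tab, carriage return, and line feed characters, and it returns None for a NULL input, which is an assumption not stated in this section.

```python
import re


def normalize_space(text):
    """Trim the string and collapse runs of white space to a single space."""
    if text is None:
        return None
    # Split on runs of space, tab, CR, or LF and drop the empty edge pieces.
    parts = [part for part in re.split(r"[ \t\r\n]+", text) if part]
    return " ".join(parts)
```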
not
Returns the inverse of a Boolean condition. The function returns TRUE if a condition is false, and returns FALSE if
a condition is true.
Syntax
not ( condition )
Argument Description
Return Value
Boolean.
Example
The following expression returns the inverse of a Boolean condition:
not ( EMPLOYEE = concat ( FIRSTNAME, LASTNAME ))
EMPLOYEE FIRSTNAME LASTNAME RETURN
Fullname Full Name FALSE
Lastname Lastname First TRUE
number
Converts a string or Boolean value to a number.
Syntax
number ( value )
Argument Description
A string converts to a number if the string contains a numeric character. The string can contain white space
and include a minus sign followed by a number. White space can follow the number in the string. Any other
string is Not a Number (NaN).
A Boolean TRUE converts to 1. A Boolean FALSE converts to 0.
If a value passed as an argument to the function is not a number, the function returns Not A Number (NaN).
Example
The following expression converts payment to a number:
number ( PAYMENT )
PAYMENT RETURN VALUE
850.00 850.00
TRUE 1
FALSE 0
AB NaN
round
Rounds a number to the nearest integer. If the number is exactly halfway between two integers, round returns the higher integer.
Syntax
round ( number )
Argument Description
number Numeric value. Passes a numeric datatype or an expression that results in a number.
Return Value
Integer.
Example
The following expression rounds BANK_BALANCE:
round( BANK_BALANCE )
BANK_BALANCE RETURN VALUE
12.34 12
12.50 13
-18.99 -19
NULL NULL
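The rounding rule differs from the built-in round function in some languages. In Python, for example, round(12.5) returns 12 because ties go to the even integer. A sketch that reproduces the table above, with None standing in for NULL and a hypothetical function name, looks like this:

```python
import math


def xpath_round(number):
    """Round to the nearest integer; a tie goes to the higher integer."""
    if number is None:
        return None
    return math.floor(number + 0.5)
```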
starts-with
Returns TRUE if the first string starts with the second string. Otherwise, returns FALSE.
Syntax
starts-with ( string, substring )
Argument Description
string String datatype. Passes the string to search. The string is case sensitive.
substring String datatype. Passes the substring to search for in the string. The substring is case sensitive.
Return Value
Boolean.
Example
The following expression determines if NAME starts with FIRSTNAME:
starts-with ( NAME, FIRSTNAME )
NAME FIRSTNAME RETURN VALUE
Kathy Russell Kathy TRUE
Joe Abril Mark FALSE
string
Converts a number or Boolean to a string.
Syntax
string ( value )
Argument Description
Return Value
String.
Returns an empty string if no value is passed. Returns NULL if a null value is passed.
If the number is an integer, the function returns a string in decimal form with no decimal point and no leading
zeros.
If the number is not an integer, the function returns a string including a decimal point with at least one digit
before the decimal point, and at least one digit after the decimal point.
If the number is negative, the function returns a string that contains a minus sign (-).
Example
The following expression returns a string from the Boolean argument STATUS:
string( STATUS )
STATUS RETURN VALUE
TRUE true
FALSE false
NULL NULL
string-length
Returns the number of characters in a string, including trailing blanks.
Syntax
string-length ( string )
Argument Description
Return Value
Integer.
Example
The following expression returns the length of the customer name:
string-length ( CUSTOMER_NAME )
CUSTOMER_NAME RETURN VALUE
Bernice Davis 13
NULL NULL
John Baer 9
substring
Returns a portion of a string starting at a specified position. Substring includes blanks as characters in the string.
Syntax
substring ( string, start [ ,length ] )
Argument Description
start Integer datatype. Passes the position in the string to start counting. If the start position is a positive
number, substring locates the start position by counting from the beginning of the string. The first
character is one. If the start position is a negative number, substring locates the start position by
counting from the end of the string.
length Integer datatype. Must be greater than zero. Passes the number of characters to return in a string. If you
omit the length argument, substring returns all of the characters from the start position to the end of the
string.
Return Value
String.
When the string contains a numeric value, the function returns a character string.
If you pass a negative integer or zero as the length argument, the function returns an empty string.
Examples
The following expression returns the area code in PHONE:
substring( PHONE, 1, 3 )
PHONE RETURN VALUE
809-555-3915 809
NULL NULL
The following expression returns the PHONE without the area code:
substring ( PHONE, 5, 8 )
PHONE RETURN VALUE
808-555-0269 555-0269
NULL NULL
You can pass a negative start value to start from the right side of the string. The expression reads the source
string from left to right for the length argument:
substring ( PHONE, -8, 3 )
PHONE RETURN VALUE
808-555-0269 555
809-555-3915 555
357-687-6708 687
NULL NULL
When the length argument is longer than the string, substring returns all the characters from the start position to
the end of the string. For example:
substring ( 'abcd', 2, 8 )
returns bcd.
substring ( 'abcd', -2, 8 )
returns cd.
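The positional rules above, including the negative start position, can be checked with a short Python sketch. It follows the behavior documented in this section rather than any other XPath implementation. The handling of a start position of zero is not covered by this guide and is an assumption here.

```python
def substring(text, start, length=None):
    """Return part of a string using a 1-based start position."""
    if text is None:
        return None
    if length is not None and length <= 0:
        return ""
    if start > 0:
        begin = start - 1                  # count from the beginning
    elif start < 0:
        begin = max(len(text) + start, 0)  # count from the end
    else:
        begin = 0                          # assumption: treat 0 like 1
    end = len(text) if length is None else begin + length
    return text[begin:end]
```

A length that runs past the end of the string is harmless, because slicing stops at the last character.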
substring-after
Returns the part of a string that occurs after a substring.
Syntax
substring-after ( string, substring )
Argument Description
Return Value
String.
Example
The following expression returns the string of characters in PHONE that occurs after the area code (415):
substring-after ( PHONE, '(415)' )
PHONE RETURN VALUE
(415)555-1212 555-1212
(408)368-4017
NULL NULL
(415)366-7621 366-7621
substring-before
Returns the part of a string that occurs before a substring.
Syntax
substring-before ( string, substring )
Argument Description
substring String datatype. Passes the substring to search for in the string.
Return Value
String.
Example
The following expression returns the number that occurs in a Third Street address:
substring-before ( ADDRESS, ' Third Street' )
ADDRESS RETURN VALUE
100 Third Street 100
250 Third Street 250
600 Third Street 600
NULL NULL
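The substring-after and substring-before functions behave like a split on the first occurrence of the search string. The following Python sketch mirrors the two example tables, with None standing in for NULL. Returning an empty string when the substring is not found matches the empty result shown for (408)368-4017.

```python
def substring_after(text, marker):
    """Return the characters after the first occurrence of marker."""
    if text is None:
        return None
    if marker == "":
        return text
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring_before(text, marker):
    """Return the characters before the first occurrence of marker."""
    if text is None:
        return None
    if marker == "":
        return ""
    head, found, _ = text.partition(marker)
    return head if found else ""
```

Whether the search literal includes the space before Third Street decides whether the returned number carries a trailing space.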
translate
Converts the characters in a string to other characters. The function uses two other strings as translation pairs.
Syntax
translate ( string1, string2, string3 )
Argument Description
string2 String datatype. Passes the string that defines which characters to translate. Translate replaces each
character in string1 with a number indicating its position in string2.
string3 String datatype. Passes the string that defines what the characters from encrypted string1 should translate
to. Translate replaces each character in encrypted string1 with a character in string3 at the position
number from string2.
Return Value
String.
Example
The following expression translates a string using two other strings:
translate ( EXPRESSION, STRING2, STRING3 )
EXPRESSION STRING2 STRING3 RETURN VALUE
A Space Odissei i y A Space Odyssey
rats tras TCAS CATS
bar abc ABC BAr
Translate does not change a character in EXPRESSION if the character does not occur in string2. If a character
occurs in EXPRESSION and string2, but does not occur in string3, the character does not occur in the returned
string.
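The position-based mapping can be sketched in Python as a lookup table built from string2 and string3. The function below reproduces the rows of the example table and the two rules that follow it, with None standing in for NULL. Using the first occurrence of a repeated character in string2 is an assumption taken from the XPath 1.0 definition of translate.

```python
def translate(text, string2, string3):
    """Replace each character found in string2 with its partner in string3."""
    if text is None:
        return None
    mapping = {}
    for position, char in enumerate(string2):
        if char not in mapping:  # first occurrence wins
            # A character with no partner in string3 is removed from the result.
            mapping[char] = string3[position] if position < len(string3) else ""
    # Characters that do not occur in string2 pass through unchanged.
    return "".join(mapping.get(char, char) for char in text)
```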
true
Always returns TRUE. Use this function to set a Boolean to TRUE.
Syntax
true ()
Return Value
Boolean TRUE.
Example
The following example returns TRUE for each expression:
EXPRESSION RETURN VALUE
( decision ) = true () TRUE
A/B = true () TRUE
( starts-with ( name, 'T' )) = true () TRUE
using in XML views 26
XML Parser transformation 92
F
L
facets
description 12
false function lang function
syntax 110 description 106
FileName column syntax 111
adding to an XML view 58 leaf element
passing to XML target 81 overview 2
floor function legend
description 106 understanding XML Editor icons 51
syntax 110 limitations
flushing data using XML sources and targets 39
XML targets 80 lists
functions description 12
using in XPath queries 60 local element
overview 2
G
generate names for XML columns M
description 41 mappings
generated keys connecting abstract elements 80
description 26 using XML targets 79
sequence numbering 86 XML Source Qualifier transformation 87
global declarations XML target ports 80
option to create 41 message IDs
global element XML Generator transformations 100
overview 2
Index 121
metadata explosion pass-through ports
description 30 adding to XML views 98, 99
reducing 46 description 58
metadata extensions generating 100
in XML source qualifiers 86 passive transformations
in XML sources 47 XML Source Qualifier 84
in XML targets 76 pattern facet
midstream XML transformation description 12
creating 96 pivoting
general properties 97 adding pivoted columns 54
Generator properties 99 deleting pivoted columns 56
overview 91 in Advanced Options 41
Parser properties 98 setting multiple levels 38
reset generated key sequence 98 XML columns 37
mode button ports
using the XPath Navigator 54 XML Source Qualifier transformation 87
multiple-level pivots XML targets 80
description 38 precision
multiple-occurring element overriding infinite length 41
overview 2 prefix
updating namespace 64
properties
122 Index
U
sequence group
description 17
simple types unions
description 12 description 13
viewing in a hierarchy 65
single-occurring element
V
overview 2
Skip Create XML Views
setting custom views 45 validating
start value target rules 78
generated keys 86 XML definitions 66
starts-with function XPath queries 60
description 106 view row
syntax 113 description 35
streaming XML guidelines for using 36
logging errors 95 views
XML Parser transformation 95 creating relationships 62
string function description 20
description 106 generating entity relationships 29
syntax 114 generating hierarchical relationships 27
string-length function setting options 41
description 106
syntax 115
X
strings
counting characters 115
returning part of 115 XML
substitution groups attributes 42
description 33 character encoding 19
in XML definitions 33 code pages 19, 41
in XML schema files 18 comparing datatypes to transformation 101
substring function datatypes 101
description 106 description 2
syntax 115 extracting large XML files from a CLOB 100
substring-after function path 19
description 106 synchronizing definitions with schemas 46
syntax 117 XML datatypes
substring-before function rounding doubles 95
description 106 XML definitions
syntax 117 creating from flat files 49
synchronizing creating from relational files 49
midstream XML transformations 96 creating from repository definitions 49
XML definitions 46 synchronizing with sources 46
XML Editor
adding a pass-through port 58
T adding columns to views 54
creating new views 54
targets creating type relationships 62
specifying a root element 80 creating view relationships 62
transaction control point creating XPath query predicates 58
XML targets 80 deleting columns 56
transformation datatypes expanding complex types 56
comparing to XML 101 pass-through fields 98, 99
transformations understanding the icons legend 51
XML Source Qualifier 84 updating namespace 64
translate function using ANY content 57
description 106 using the Columns window 54
syntax 118 validating definitions 66
troubleshooting XML file
XML Source Qualifier transformation 90 importing an XML target definition from 75
XML sources 49 naming 58
XML targets 82 XML Generator transformation
true function DataOutput port 95
syntax 118 example 95
type relationships overview 91
creating in the workspace 62 pass-through ports 100
XML groups
all group 17
choice group 17
Index 123
creating custom 26 XML sources
creating groups from relational tables 25 creating a target from 75
element and attribute groups 17 limitations 39
generating denormalized groups 28 overview 40, 51
generating normalized groups 27 troubleshooting 49
modifying source groups 41 XML targets
substitution groups 18 active sources 79
using substitution groups 33 creating groups from relational tables 25
XML hierarchy editing target properties 76
child element 2 flushing data 80
creating hierarchy relationships 44 limitations 39
enclosure element 2 multi-line attributes 42
global element 2 On Commit session property 80
leaf element 2 port connections 80
local element 2 setting default attributes 80
multiple-occurring element 2 troubleshooting 82
parent chain 2 using in mapping 79
parent element 2 XML views
single-occurring element 2 adding columns 54
xml lang attribute adding pass-through fields 98, 99
description 106 combining data 88
XML metadata creating 42
cardinality 9 creating hierarchy relationships 44
description of types 7 creating new views 54
extracting from substitution groups 33 creating relationships between 62
extracting from XML schemas 23 creating with XML Wizard 42
from substitution groups 18 filtering data 58
hierarchy 9 generating custom views 45
name 9 generating entity relationships 44
null constraint 9 pivoting columns 37
viewing 66 Skip Create XML View option 45
XML Parser transformation XML Wizard
Datainput port 92 generating custom XML views 45
Enable Log Source Row Data 95 generating entity relationships 44
Error_Status port 93 generating hierarchy relationships 44
example 92 selecting root elements 45
generated keys 92 synchronizing XML definitions 46
input validation 93 XPath
Invalid_Payload port 93 adding pivoted columns 54
overview 91 adding query operators 60
rounding Double datatypes 95 creating a query predicate 58
Route Invalid Payload Through Data Flow option 98 description 19
streaming XML files 95 expanding complex types 56
XML Path using query predicates 36
description 19 validating queries 60
XML rules XPath query predicate
pivoting groups 37 functions 105
XML groups from relational tables 25
XML target port connections 80
XML schema
complex types 13
importing metadata from 23
setting default attributes 80
XML schema definition (XSD)
described 6
XML Source Qualifier transformation
adding to mapping 84
creating by default 85
example 88
manually creating 85
overview 84
port connections 87
troubleshooting 90
using in a mapping 87
124 Index