SAP Admin
Administrator Guide
Content
2 Architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1 Architecture overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Servers and services. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Information Steward architecture overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Services. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Metadata integrators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Distributed architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Component communication channels. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Port assignments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
DSN-less and TNS-less connections. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .23
2.3 SAP integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Information workflows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Adding a table to a Data Insight project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .25
Profiling data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Scheduling and running a Metadata Management integrator source. . . . . . . . . . . . . . . . . . . . . . . . 26
Creating a custom cleansing package with Cleansing Package Builder. . . . . . . . . . . . . . . . . . . . . . .26
Creating a data cleansing solution in Data Cleansing Advisor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Denying access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Setting user email address for notifications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .47
5.4 User rights in Data Insight. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Data Insight predefined user groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .49
Type-specific rights for Data Insight objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Type-specific rights for source or failed data connections. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Customizing rights on Data Insight objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.5 User rights in Metadata Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Metadata Management pre-defined user groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Type-specific rights for Metadata Management objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Assigning users to specific Metadata Management objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.6 User rights in Metapedia. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Metapedia pre-defined user groups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Type-specific rights for Metapedia objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Assigning users to specific Metapedia objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.7 User rights in Cleansing Package Builder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Group rights for cleansing packages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.8 User rights for Information Steward administrative tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Viewing and editing repository information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
User rights in Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
6 Repository Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.1 Repository management overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2 Viewing and editing repository information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Viewing and editing Data Cleansing Advisor repository information. . . . . . . . . . . . . . . . . . . . . . . . . 78
6.3 Enabling Windows authentication for Microsoft SQL Server repositories. . . . . . . . . . . . . . . . . . . . . . . .79
6.4 Backing up your repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.5 Repository utility overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Creating a new repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Recovering the repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Upgrading the repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Completing the repository recovery. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Repository utility parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Displaying and editing Data Insight connection parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Deleting a Data Insight connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Setting connection type and permissions to access failed data. . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.4 Data Insight projects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Creating a project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Editing a project description. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Enterprise project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Deleting a project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.5 Data Insight tasks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .122
Scheduling a task in the CMC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Export tasks for external scheduler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Recurrence options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Configuring for task completion notification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Rule threshold notification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Monitoring a task. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Pausing and resuming a schedule. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Deleting failed data for a failed rule task. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Common runtime parameters for Information Steward. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Running SAP BusinessObjects Enterprise Metadata Integrator with Windows AD authentication. . . . . . 171
8.5 Runtime parameters for integrator sources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Common runtime parameters for metadata integrators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Runtime parameters for specific metadata integrators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Changing runtime parameters for integrator sources. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
8.6 Viewing integrator run progress, history, and log files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
8.7 Other options in the Integrator History page. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
8.8 Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Crystal Report message . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Desktop Intelligence document error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Out of memory error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Parsing failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Parsing failure for derived table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Unable to parse SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Unable to retrieve SQL to parse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Connection with Data Federator Designer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .192
8.9 Grouping Metadata Integrator sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Creating source groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .192
Modifying source groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .192
Deleting source groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
8.10 Configuring BI Launch Pad for Information Steward. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
8.11 Displaying user-defined reports in Information Steward. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .194
10.2 Match Review connections. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Defining a Match Review connection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .207
IBM DB2 connection parameters for Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Microsoft SQL Server connection parameters for Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . 209
Oracle connection parameters for Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .209
SAP HANA connection parameters for Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
SAP ASE connection parameters for Match Review. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
Distributed processing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Scheduling tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Queuing tasks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Degree of parallelism. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Grid computing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Using SAP applications as a source. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Multi-threaded file read. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Data Insight result set optimization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Performance settings for input data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Settings to control repository size. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Settings for Cleansing Package Builder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .256
Settings for Metadata Management. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
Settings for Metapedia. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
13.5 Best practices for performance and scalability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
General best practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Data Insight best practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
Metadata Management best practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Cleansing Package Builder best practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Match Review best practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
13.6 SAP tables and ABAP-supported functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Exporting and importing Match Review configurations (task) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
15 Supportability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
15.1 Information Steward logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Metadata Management logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .297
Data Insight logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .299
Metadata Browsing Service and View Data Service logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
Viewing Information Steward logs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .301
15.2 Monitoring with CA Wily Introscope. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .302
CA Wily Introscope Prerequisites. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
Workflows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
18 Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
18.1 Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
1 Accessing Information Steward for administrative tasks
Perform administrative tasks for SAP Information Steward on the SAP BusinessObjects BI platform Central
Management Console (CMC).
You must have administrator permission for the relevant module (Data Insight, Metadata Management, or Data Review) to perform tasks in the CMC.
How you access the CMC depends on your system setup. If the CMC is installed on your computer, use the Start menu (Windows). If the CMC is installed on a different computer, enter the URL for the CMC in a browser window. The URL contains the name of the computer and the port number. For example:
http://<webserver>:8080/BOE/CMC
Replace <webserver> with the name of the web server machine. If you changed the default virtual directory on the web server, adjust the URL accordingly. If necessary, change the default port number to the number you provided when you installed the SAP BusinessObjects BI platform.
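The URL pattern above can be sketched as follows. This is only an illustration of assembling the address; the host name is a placeholder (assumption), and 8080 is the default port mentioned above.

```python
# Build the CMC URL from your deployment's values. The host name below is a
# placeholder (assumption); replace it with your web server machine name.
webserver = "bi-server.example.com"  # assumption: your BI platform web host
port = 8080                          # default; change if you chose another port

cmc_url = f"http://{webserver}:{port}/BOE/CMC"
print(cmc_url)  # → http://bi-server.example.com:8080/BOE/CMC
```

If you renamed the `BOE` virtual directory during deployment, substitute your own path segment in place of `BOE/CMC`.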
○ To open the Information Steward tab, click the CMC Home drop-down arrow and select Information Steward. The Information Steward page opens.
○ To perform tasks such as configuring Information Steward settings or managing utilities, stay on the CMC Home page and click Applications under the Manage list. Then right-click Information Steward Application and select a task from the context menu.
○ To close the CMC, click Log Off in the upper right corner of the Central Management Console window.
2 Architecture
This section outlines the overall platform architecture and the system and service components that make up the SAP Information Steward platform. This information helps administrators understand the system essentials and plan for system deployment, management, and maintenance.
Information Steward uses SAP BusinessObjects Business Intelligence (BI) platform for managing user security,
scheduling integrator sources as tasks and utilities, and managing sources and on demand services.
Information Steward uses SAP Data Services for profiling, rule tasks, browsing metadata, and viewing data.
The following diagram shows the servers and services for SAP BusinessObjects BI platform, SAP Data Services,
and SAP Information Steward. The services marked with a yellow star are used by the Data Insight module of
Information Steward.
2.1.1 Servers and services
This guide uses the terms server and service with the following meanings:
● A server is an operating-system-level process (also known as a daemon) that hosts one or more services. For example, the SAP Enterprise Information Management (EIM) Adaptive Processing Server and the SAP Information Steward Job Server each host multiple services. A server runs under a specific operating system account and has its own process ID (PID).
● A service is a server subsystem that performs a specific function. The service runs within the memory space of its server, under the PID of the parent container (the server). For example, the SAP Data Services Metadata Browsing Service is a subsystem that runs within the SAP EIM Adaptive Processing Server.
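The server/service distinction can be illustrated with a minimal sketch: one operating-system process (the "server") hosts several subsystems (the "services"), and every service runs under the server's PID. The service names below are illustrative, not actual Information Steward components.

```python
import os
import threading

def service(name, pids):
    # Each "service" records the PID it runs under, which is the
    # PID of the hosting "server" process, not a PID of its own.
    pids[name] = os.getpid()

# Two illustrative services hosted inside this one server process.
pids = {}
threads = [threading.Thread(target=service, args=(name, pids))
           for name in ("MetadataBrowsing", "ViewData")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both services share the PID of the parent container (the server).
assert all(p == os.getpid() for p in pids.values())
```

This mirrors the guide's model: stopping the server process stops every service it hosts, because the services have no process of their own.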
2.1.2 Information Steward architecture overview

SAP Information Steward uses SAP BusinessObjects Business Intelligence (BI) platform and SAP Data Services
and inherits the scalable architecture that these two platforms provide. This architecture allows deployment of
server components from the following applications:
The server components can be located on different machines for flexibility, reliability, scalability, and better performance.
SAP Information Steward requires SAP BusinessObjects BI Platform for the following functionality:
2.1.2.2 SAP BusinessObjects Business Intelligence (BI) platform components usage
The following table describes how SAP Information Steward uses each pertinent SAP BusinessObjects BI platform component, or the corresponding Information platform services (IPS) component if you are not using SAP BusinessObjects BI clients such as Web Intelligence documents or Crystal Reports.
Table 1: BI platform or IPS component usage for Information Steward

Web Tier
Deploys Information Steward on the BI platform Central Management Console (CMC), through which administrative tasks for Information Steward are performed.

Central Management Server (CMS)
Maintains a database of information about your BI platform system, including the following:
● Information about users and groups
● Security levels
● Schedule information
● BI platform content and servers
Stores the following objects in the Metadata Management module of Information Steward:
Note
Because integrator source configurations and source group definitions are stored in the CMS, you can use the upgrade management tool to move them from one version of the CMS to another. The schedules and rights information are considered dependencies of these configurations. For details, see the SAP Information Steward Upgrade Guide.
For more information about the CMS, see the Business Intelligence Platform Administrator Guide. If you installed Information platform services (IPS), see the Information platform services Administrator Guide.

Information Platform Scheduling Services
Executes profiling tasks and integrator tasks for the Information Steward Job Server, which may host the following services for Information Steward:

Information Platform Processing Services
Creates the SAP Enterprise Information Management (EIM) Adaptive Processing Server, which is required during Information Steward installation. The SAP EIM Adaptive Processing Server uses the platform processing services to host the following services:

Information Steward requires the following Data Services servers and service:

The Data Services Job Server provides the system management tools listed in the table below. These management tools are required during the first installation of Information Steward.
Administrator Guide
14 PUBLIC Architecture
Table 2: System management tools

Repository Manager
The Repository Manager creates the required Data Insight objects in the Information Steward repository. The Information Steward installer invokes the Repository Manager automatically when creating the repository the first time the installer is run.

Server Manager
The Server Manager creates the Information Steward Job Server group and job servers and associates them with the Information Steward repository. To add job servers to the Information Steward Job Server group, you must manually invoke the Server Manager.
The Data Services Job Server provides the engine processes that perform the Data Insight data profiling and
validation rule tasks. The engine processes use parallel execution and in-memory processing to deliver high data
throughput and scalability.
The Data Services Metadata Browsing Service provides the capabilities to browse and import the metadata from
Data Insight connections.
The Data Services View Data Service provides the capabilities to view the source data from Data Insight
connections.
Note
When you install Data Services, ensure that you select the following components:
SAP Information Steward is deployed on a Web server using the Central Management Console (CMC) and a Web
application server.
Users perform administrative tasks on the CMC to set up the various Information Steward modules.
Table 3: Administrative tasks

Manage user security
Set up and maintain user security permissions for the following Information Steward modules:
● Data Insight
● Metadata Management
● Cleansing Package Builder
● Data Review

Configure and run metadata integrators
● Create and maintain metadata integrators and manage run schedules.
● Schedule Data Insight profile and rule tasks.
● Run or schedule Information Steward utilities.

Define source groups
Define source groups to subset the metadata when viewing relationships such as same as, impact, and lineage.

Control behavior and performance
Configure application settings that affect the behavior and performance of Data Insight profile and rule tasks.
With a Web application server, data stewards analyze data and associated metadata in Information Steward
modules as shown in the table below.
Table 4: Module tasks to analyze data and associated metadata

Data Insight
● Profile data in tables or files and analyze the resulting profile attributes.
● Define rules and set up data quality scorecards.
● View data quality score trends and view sample data that failed each rule.
● View data quality impact on dependent data sources.
● Create data cleansing solutions based on your data's content-type identification results and SAP best practices for your specific data.

Metadata Management
● View metadata from different sources and search for objects without the need to know the source or application in which it exists.
● Add annotations to an object and define custom attributes and values for an object.
● Run pre-defined reports that answer typical business questions, such as which universe objects are excluded from reports, or which reports use a particular table or source.
● View impact and lineage of objects within the same source or within different sources.

Metapedia
Define Metapedia terms related to business data and organize terms into categories. You can also define and implement policy sets and statements that are derived from business policies.

Cleansing Package Builder
● Define cleansing packages to parse and standardize data.
● Publish cleansing packages.
● Export cleansing packages so they can be imported to SAP Data Services to generate base Data Cleanse transforms for jobs that cleanse the data.

Worklist for Data Reviewer
● Review results of automated matching on a regular basis and make necessary corrections.
● Reassign the master record in a match group.
● Review Metapedia terms and policy sets and statements to approve or reject them.
● Review Data Insight rules for accuracy and approve or reject them.
Note
The Information Steward web application must be installed on the same web application server as that of the
SAP BusinessObjects BI platform.
For specific version compatibility, refer to the Product Availability Matrix on the SAP customer portal at http://service.sap.com/PAM.

For more information about the CMS, see the SAP BusinessObjects Enterprise Administrator Guide, Central Management Console.
2.1.3 Services
The following table describes each of the services that are pertinent to SAP Information Steward.
Cleansing Package Builder Enterprise Information Performs data analysis when Must already have installed on
Auto-analysis Service Management (EIM) Adaptive you create custom cleansing this computer:
Processing Server packages in Cleansing
● SAP Business Intelligence
Package Builder.
(BI) platform Platform
● Data analysis uses infor Processing Services
mation provided in the
custom cleansing pack
age wizard.
● Creates abstract version
of records using data
analysis and statistical
analysis information.
● Groups abstracted re
cords and processes with
data inference algorithms.
● Creates suggestions
shown in Design mode of
Cleansing Package
Builder.
Administrator Guide
Architecture PUBLIC 17
Service Server on which service Service description Deployment comments
runs
Cleansing Package Builder EIM Adaptive Processing Performs main functionality of Must already have installed on
Core Service Server Cleansing Package Builder, this computer:
such as:
● SAP BI platform Platform
● Create and open cleans Processing Services
ing packages
● Design mode
● Advanced mode
Cleansing Package Builder EIM Adaptive Processing Converts published cleansing Must already have installed on
Publishing Service Server packages to reference data this computer:
format used by Data Services.
● SAP BI platform Platform
Processing Services
Note
Can take a significant pe
riod of time for large
cleansing packages.
Information Steward EIM Adaptive Processing Performs tasks on Data Must already have installed on
Administrator Task Service Server Services such as: this computer:
Data Cleansing Advisor EIM Adaptive Processing Performs functions for setting Must already have installed on
Service Server up data cleansing solutions in this computer:
Data Cleansing Advisor (in
● SAP BI platform Platform
Data Insight).
Processing Services
Data Service Metadata EIM Adaptive Processing Provides Data Insight users Must already be installed on
Browsing Service Server the capability to browse and this computer:
import metadata from differ
● SAP Data Services
ent data sources:
● SAP BI platform Platform
● Relational database sys Processing Services
tems such as
○ Oracle
○ Microsoft SQL Server
Data Services View Data EIM Adaptive Processing Provides the capability to view Must already be installed on
Service Server the external data in Data this computer:
Insight connections in
● SAP Data Services
Information Steward.
● SAP BI platform Platform
Processing Services
Administrator Guide
18 PUBLIC Architecture
Information Steward Data Review Service
Server on which service runs: EIM Adaptive Processing Server
Service description:
● Checks the input table for new match groups that are ready for review.
● If match results are ready for review, creates a new match review task.
Deployment comments: Must already be installed on this computer:
● SAP BI platform Platform Processing Services

Information Steward Data Review Task Scheduling Service
Server on which service runs: Information Steward Job Server
Service description: Processes scheduled Data Review tasks in the Central Management Console (CMC).
Deployment comments: Must already be installed on this computer:
● SAP BI platform Platform Scheduling Services

Information Steward Application Service
Server on which service runs: EIM Adaptive Processing Server
Service description:
● Enables Information Steward web application access (read and write) to the Information Steward repository.
● Provides metadata object relationships (such as data lineage and change impact analysis) for Metadata Management.
Deployment comments: Must already be installed on this computer:
● SAP BI platform Platform Processing Services
Recommended:
● Install the Application Service on a different computer than the web application server.
Note
You can deploy multiple Application Services for load balancing and availability.
Information Steward Metadata Search Service
Server on which service runs: EIM Adaptive Processing Server
Service description: Finds an object that exists in any integrator source while viewing metadata in Information Steward Metadata Management, using the Lucene Search Engine.
● Constructs the search index during the execution of the Metadata Integrators.
● The File Repository Server stores the compressed search index.
● Updates the search index with changes to Metapedia:
○ terms
○ categories
○ policy sets
○ policy statements
○ custom attributes
○ annotations
Deployment comments: If the Metadata Search Service is not available during index construction and update processes, search might return incorrect results. For these situations, Information Steward provides a utility to reconstruct the search index.
Note
You can deploy multiple search services for load balancing and availability.
Information Steward Task Scheduling Service
Server on which service runs: Information Steward Job Server
Service description: Processes scheduled Data Insight profile and rule tasks in the Central Management Console (CMC).
Deployment comments: Must already be installed on this computer:
● SAP BI platform Platform Scheduling Services

Information Steward Integrator Scheduling Service
Server on which service runs: Information Steward Job Server
Service description: Processes scheduled Metadata Management integrator sources in the CMC.
Deployment comments: Must already be installed on this computer:
● SAP BI platform Platform Scheduling Services

Information Steward Integrator Service
Server on which service runs: EIM Adaptive Processing Server
Service description: Helper service used for testing connections to integrator sources.
Deployment comments: Installed with the integrators.
● Collect metadata from source systems and store the collected metadata into the SAP Information Steward
repository.
● Run at regular intervals based on a schedule.
● Update the existing metadata.
● Run on one job server or multiple job servers for load balancing and high availability.
● Run as a separate process.
Table 6: Metadata integrators

SAP BusinessObjects Enterprise Metadata Integrator: Collects metadata about objects such as universes, Crystal Reports, Web Intelligence documents, and Desktop Intelligence documents.

SAP NetWeaver Business Warehouse Metadata Integrator: Collects metadata about objects such as Queries, InfoProviders, InfoObjects, Transformations, and DataSources from an SAP NetWeaver Business Warehouse system.

Common Warehouse Model (CWM) Metadata Integrator: Collects metadata about objects such as catalogs, schemas, and tables from the relational package of CWM.

SAP Data Federator Metadata Integrator: Collects metadata about objects such as projects, catalogs, data sources, and mapping rules from a Data Federator repository.

SAP Data Services Metadata Integrator: Collects metadata about objects such as source tables, target tables, and column mappings from a Data Services repository.

Meta Integration Model Bridge (MIMB) Metadata Integrator (also known as MITI Integrator): Collects metadata from other third-party sources such as the following:
● Data modeling metadata such as SAP Sybase PowerDesigner (Version 16.0 and older), Embarcadero ER/Studio, and Oracle Designer
● Extract, Transform, and Load (ETL) metadata such as Oracle Warehouse Builder and Microsoft SQL Server Integration Services (SSIS)
● OLAP and BI metadata such as IBM DB2 Cube Views, Oracle OLAP, and Cognos BI Reporting

Relational Databases Metadata Integrator: Collects metadata from relational database management systems (RDBMS), which can be DB2, MySQL, Oracle, SQL Server, Java Database Connectivity (JDBC), or an SAP Universe connection. Collected metadata includes the definition of objects such as tables, views, synonyms, and aliases.

SAP HANA Metadata Integrator: Collects metadata about objects such as tables, views, and procedures from an SAP HANA database.

SAP Sybase PowerDesigner Metadata Integrator: Collects physical data model metadata for SAP Metadata Management, and collects glossary information for SAP Metapedia from the SAP Sybase PowerDesigner repository, using PowerDesigner Version 16.1 and newer.
Note
The SAP Sybase PowerDesigner application must be installed on the same machine as SAP Information Steward Metadata Management.

Excel Metadata Integrator: Collects metadata from database models that aren't supported by other metadata integrators. For example, collect metadata from dBase, Microsoft Access, Microsoft FoxPro, Hadoop, SQLite, and so on. In addition, use the Excel Metadata Integrator to create your own metadata without using a physical database source. For example, use Unified Modeling Language to create metadata for a relational database design.
2.2 Distributed architecture
With SAP Information Steward, you can distribute software components across multiple computers to best
support the traffic and connectivity requirements of your network. You can create a minimally distributed system
designed for developing and testing or a highly distributed system that can scale with the demands of a
production environment.
Table 7: Component communication channels

CORBA: All SAP BusinessObjects Business Intelligence (BI) solutions communicate with the Central Management Server (CMS) and each other over CORBA.

HTTP/HTTPS: The Central Management Console (CMC), SAP Information Steward, and the SAP Data Services Management Console communicate with the web application server over HTTP or HTTPS. The SAP Data Federator metadata integrator uses HTTP/HTTPS to communicate with the Data Federator server.

TCP/IP: Communication between Data Services and Information Steward sub-components deployed on the SAP BusinessObjects BI platform and the Data Services Job Server or Data Services engine uses TCP/IP, with or without SSL.
On each host system, verify that all ports to be used by SAP Information Steward components are available and
not in use by other programs.
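As an illustration, a quick way to test whether a port is already in use on a host (a generic sketch, not an SAP-provided tool):

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on the given TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0

# Example: check the default Information Steward ports before installing.
for p in (5005, 8080):
    print(p, "in use" if port_in_use(p) else "free")
```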
For a development system, you can install many components on the same host. Installing on a single host
simplifies many connections between components (the host name is always the same), but you must still define
connections based on the TCP/IP protocol.
This table details the default ports used by Information Steward components:
Table 8: Default port assignments

5005: Remote Job Server port that receives commands to run an integrator on a previous-version system.

8080: Default web application server port that listens for HTTP requests from web browsers. For information about editing this port number, see "Viewing and editing Data Cleansing Advisor repository information" in the Information Steward Administrator's Guide.
Connections to databases also require access to a port that is defined when the database is set up. For details,
see the SAP BusinessObjects Business Intelligence platform Administrator Guide.
Related Information
Administrator Guide: Viewing and editing Data Cleansing Advisor repository information [page 78]
Information Steward provides server name connections (also known as DSN-less and TNS-less connections) to
databases that you use as an Information Steward repository, profiling source, or storage for data that failed
validation rules. Server name connections eliminate the need to configure the same DSN or TNS entries on every
machine in a distributed environment.
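The difference can be illustrated with ODBC-style connection strings (the values below are hypothetical, not taken from this guide):

```python
# DSN-based: relies on an "IS_REPO" DSN entry configured on every machine.
dsn_based = "DSN=IS_REPO;UID=repo_user;PWD=secret"

# Server-name ("DSN-less"): carries the server, port, and database itself,
# so no per-machine DSN entry is needed.
dsn_less = ("DRIVER={SQL Server};SERVER=dbhost01,1433;"
            "DATABASE=is_repo;UID=repo_user;PWD=secret")

print("server embedded in connection string:", "SERVER=" in dsn_less)
```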
For the Information Steward repository, the following database types are supported:
Table 9:
Database Required information
For Data Insight profiling sources and failed data storage, the following database types are supported for DSN-less and TNS-less connections:
● DB2 UDB
● Informix
● MySQL
● Netezza
● Oracle
● SAP HANA
● SQL Anywhere
● Sybase IQ
● Teradata
Note
For the most current list of supported databases for server name connections, see the Release Notes.
SAP Information Steward integrates with your existing SAP infrastructure with the following SAP tools:
The servers and services within SAP information platform services, SAP Data Services, and SAP Information Steward communicate with each other to accomplish a task. The following sections describe some of the process flows as they occur in these products.
This workflow describes the process of adding a table to a Data Insight project.
1. The user selects Add Tables in the Workspace Home window in the Data Insight tab to access the
Browse Metadata window.
2. The web application server passes the request to the Central Management Server (CMS) and returns a list of
connections that the user can view, assuming the user has appropriate permissions to view the connections.
3. If the user has the appropriate rights to view the selected connection, the CMS sends the request to the Data
Services Metadata Browsing Service.
4. The Data Services Metadata Browsing Service obtains the metadata from the connection and sends the
metadata to the web application server.
5. The web application server displays the metadata in the Data Insight Browse Metadata window.
6. When the user selects a table and clicks Add to Project, the web application asks the Application Service to
store the metadata in the Information Steward repository.
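Steps 2 and 3 amount to a rights-based filter over the connection list; a minimal sketch (the rights model shown is illustrative, not the actual CMS API):

```python
def viewable_connections(all_connections, user_rights):
    """Return only the connections on which the user holds the View right."""
    return [c for c in all_connections
            if "View" in user_rights.get(c, set())]

# Hypothetical connection names and rights assignments.
rights = {"conn_hr": {"View"}, "conn_finance": set()}
print(viewable_connections(["conn_hr", "conn_finance"], rights))  # → ['conn_hr']
```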
This workflow describes the process of running a profile task in Data Insight. Running validation rules follows similar steps.
1. The user selects the name of a table or file in the Workspace Home window in the Data Insight tab and clicks
Profile.
2. The user chooses the tables in the Workspace Home window of the Data Insight tab.
3. The user saves the task and schedules it to run.
4. The web application server passes the request to the Central Management Server (CMS).
5. The Information Steward web application determines from the CMS system if the user has the right to run
profile tasks on the connection that contains the table or file.
6. The administrator determines if the user has the right to create a profile task for the connection, and has the
right to schedule the task. If so, the task is scheduled in the CMS system.
7. When the scheduled time arrives, the CMS sends the task information to the Information Steward Task
Scheduling Service.
8. The Information Steward Task Scheduling Service sends the profile task to the Data Services Job Server.
9. The Data Services Job Server partitions the profile task based on the performance application settings.
10. The Data Services Job Server executes the profile task and stores the results in the Information Steward
repository.
11. The web application server displays the profile results in the Data Insight Workspace Home window.
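Step 9's partitioning can be pictured as a simple row-range split (a generic sketch; the actual Data Services partitioning logic and its performance settings are not documented here):

```python
def partition_rows(total_rows, degree_of_parallelism):
    """Split a row range into roughly equal partitions, one per worker."""
    base, extra = divmod(total_rows, degree_of_parallelism)
    partitions, start = [], 0
    for i in range(degree_of_parallelism):
        size = base + (1 if i < extra else 0)
        partitions.append((start, start + size))
        start += size
    return partitions

print(partition_rows(10, 3))  # → [(0, 4), (4, 7), (7, 10)]
```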
This workflow describes the process of scheduling and running a Metadata Management integrator source to
collect metadata.
1. The user schedules an integrator source in the Central Management Console (CMC) and the request is sent to
the CMS system.
2. The CMS system determines if the user has the appropriate rights to schedule the integrator source.
3. If the user has the appropriate rights to schedule the object, the CMS commits the scheduled integrator
request to the CMS system.
4. When the scheduled time arrives, the CMS finds a suitable Information Steward Job Server based on the Job
Server group associated with the integrator and passes the job.
If the process has an SAP BusinessObjects Enterprise 3.x source system, the process contacts the registered
remote job server and passes along the integrator process information.
5. The integrator process collects metadata and stores the metadata in the Information Steward repository.
6. The integrator process generates the Metadata Management search index files and loads them to the Input
File Repository Server.
7. After uploading the search index files, the integrator source notifies the Metadata Management search
service.
8. The Metadata Management search service downloads the generated index files and consolidates them into a
master index file.
9. The Information Steward Integrator Scheduling Service updates the CMS with the job status.
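Step 8's consolidation resembles merging per-run indexes into one master index. The real service works with Lucene index files; this dictionary-based sketch only illustrates the idea:

```python
def consolidate(master, new_indexes):
    """Merge per-integrator-run term indexes into the master index."""
    for idx in new_indexes:
        for term, refs in idx.items():
            master.setdefault(term, set()).update(refs)
    return master

# Hypothetical per-run indexes mapping search terms to object references.
runs = [{"customer": {"obj1"}}, {"customer": {"obj2"}, "order": {"obj3"}}]
master = consolidate({}, runs)
print(sorted(master))  # → ['customer', 'order']
```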
This workflow describes the process of creating and publishing a custom cleansing package in Cleansing Package
Builder.
1. In SAP Information Steward, the user clicks the Cleansing Package Builder tab.
2. The Cleansing Package Builder (CPB) application sends the user's login information to the CPB Web Service.
3. The CPB Web Service sends the information to the SAP solutions for enterprise information management
(EIM) Adaptive Processing server.
The SAP solutions for EIM Adaptive Processing Server runs on the SAP BusinessObjects Business Intelligence platform.
4. The SAP solutions for EIM Adaptive Processing Server determines which rights the user has in CPB.
5. The information is sent back through the CPB Web Service to the CPB application.
The user sees the cleansing packages they have the rights to view.
6. In the Cleansing Packages Tasks screen, the user selects New Cleansing Package > Custom Cleansing Package to start creating a cleansing package.
The user provides the necessary information and sample data to create the cleansing package.
7. The CPB application sends the information through the CPB Web Service to the CPB Core Service, using the
SAP Enterprise SDK mechanism.
The CPB Core Service handles the main functions of CPB. The CPB Core Service runs on the SAP solutions
for EIM Adaptive Processing Server.
8. The CPB Core Service sends the response back through the CPB Web Service to the CPB application.
The new cleansing package is created in CPB.
9. The application communicates with the CPB Auto-Analysis Service through the CPB Web Service.
The CPB Auto-Analysis Service analyzes the data to create suggestions of standard forms and variations. The
CPB Auto-Analysis Service runs on the SAP solutions for EIM Adaptive Processing Server.
10. When the user has finished refining the cleansing package, the user clicks Publish on the Cleansing Packages
Tasks screen.
11. The CPB application communicates with the CPB Publishing Service through the CPB Web Service.
The CPB Publishing Service assists in the cleansing package's conversion to the reference data format used
by SAP Data Services. The CPB Publishing Service runs on the SAP solutions for EIM Adaptive Processing
Server.
12. The published cleansing package information is sent to the Input File Repository, where it is stored and can be
accessed by Data Services.
The Input File Repository runs on the SAP BusinessObjects BI platform.
13. Data Services communicates directly with the SAP BusinessObjects BI platform to sync with the published
cleansing packages.
This workflow describes the process of creating and publishing a data cleansing solution with Data Cleansing
Advisor.
1. The user selects an input source in the Workspace Home window in Data Insight, and chooses Profile > Content Type.
2. When content type profiling is complete, the Data Cleansing Advisor icon appears next to the name of the
input source.
3. The user clicks the Data Cleansing Advisor icon to start the Data Cleansing Advisor wizard.
4. The Data Cleansing Advisor wizard guides the user to provide necessary information for creating a data
cleansing solution.
5. For each wizard step, the web application server communicates with the Data Cleansing Advisor service to
retrieve optimal results based on the user's current selection.
6. When the user clicks Finish in the Data Cleansing Advisor wizard, the web application server passes the
request to the Data Cleansing Advisor service.
7. The creation process for this data cleansing solution is started on the EIM Adaptive Processing Server, and
the user is redirected to Data Cleansing Solutions list in Data Cleansing Advisor. The data cleansing solution
that the user just created is listed with a status of Creating.
8. The Data Cleansing Advisor service stores all user settings in the CMS so that they can be retrieved for processing later.
9. The Data Cleansing Advisor service first copies data from the input source to the Data Cleansing Advisor local repository. A new Data Services job is started on the Data Services Job Server for this step.
10. Data Cleansing Advisor cleanses the data using content-type profiling results and options selected in the Data
Cleansing Advisor wizard. Two Data Services jobs are started on Data Services Job Server for this step.
11. Data Cleansing Advisor performs matching. A new Data Services job is started on Data Services Job Server
for this step.
12. The data cleansing solution is listed with a Ready status. The user can view the results and edit cleanse and
match settings in Data Cleansing Advisor.
13. When the user is satisfied with the data cleansing solution, the user clicks Publish Solution. Publish Solution applies only to the data cleansing solution that is currently open.
14. Data Cleansing Advisor makes the data cleansing solution available to the Data Services Workbench user.
3 Securing SAP BusinessObjects
Information Steward
SAP Information Steward uses the security framework that SAP BusinessObjects Business Intelligence platform
(BI platform) provides.
The SAP BusinessObjects BI platform architecture addresses the many security concerns that affect today's
businesses and organizations. It supports features such as distributed security, single sign-on, resource access
security, granular object rights, and third-party authentication in order to protect against unauthorized access.
For details about how BI platform addresses enterprise security concerns, see the SAP Business Intelligence
Platform Administrator's Guide. For more information about securing the platform, see the topic “Securing the BI
platform” in the same guide (SAP Business Intelligence Platform Administrator's Guide).
As a security requirement, the latest Adobe Flash Player must be installed in the web browser used for Information Steward. JavaScript must also be enabled in the browser; it does not require a separate installation.
SAP Information Steward is a web-based application that uses enterprise security provided by SAP Business
Intelligence platform. Information Steward takes advantage of the following SAP Enterprise security features:
● Secure connections
● Reverse proxy servers
Note
You should have a virus scanner running on the Business Intelligence platform server to help avoid any
downtime on server systems due to corrupt files. The virus scanner can quarantine any malicious files
uploaded to FRS, for example, Cleansing Package Builder sample data or Metadata Management Integrator
configuration files.
SAP Information Steward has access to the following data which might contain sensitive information:
● Source data in Data Insight connections on which users run profile and rule tasks
● Sample data from profiling results that Data Insight stores in the Information Steward repository
● Sample data that failed validation rules that Data Insight stores in the Information Steward repository
● All data that failed validation rules that a user chooses to store in a database accessed through a Data Insight
connection.
● Results data in Match Review is saved in the staging repository. The user must secure the staging repository.
The Database Administrator (DBA) secures the data in these databases by managing user permissions on them.
The following information is stored in the Information Steward repository during these processes:
● Data Insight information and profile sample data is stored when profiling
● Data Insight sample failed data is stored when rule tasks are calculated
● Match Review source identification is stored when a review task is running
In addition, the Data Insight Administrator or Administrator controls access to the data by using the Central Management Server (CMS) to manage the following rights on the Data Insight connections:
● View Data
● Profile/Rule permission
● View Sample Data
● Export Data
Information Steward uses the SAP Enterprise cryptography which is designed to protect sensitive data stored in
the CMS repository. Sensitive data includes user credentials, data source connectivity data, and any other info
objects that store passwords. This data is encrypted to ensure privacy, keep it free from corruption, and maintain
access control. Sample data stored in the Information Steward repository is protected by the user access
permissions. For more information, see the "Overview of SAP BusinessObjects Enterprise data security" section
of the SAP BusinessObjects Enterprise Administrator's Guide.
Encryption of sensitive information, such as passwords, is done in the following Information Steward areas:
Storing cookies
A cookie is a small text file that stores session state on the client side: the user's web browser caches the cookie
for later use. The following cookies are used by Information Steward: MM_Logon, user, cms, authentication, sap
id, sap client number, token, vintela sso, metadata login.
You can configure the Secure Sockets Layer (SSL) protocol for SAP Information Steward by:
● configuring the web application (see the Business Intelligence platform Web Application Deployment Guide)
Note
When configuring the web application, be certain that the HTTP access to the Business Intelligence
platform web application is maintained after enabling SSL. Otherwise, the main Information Steward web
application is not able to initialize.
● configuring the Business Intelligence platform (see the Business Intelligence platform Administrator Guide)
● configuring the Remote Job Server (below)
Configure the Remote Job server
The SAP Enterprise Metadata Integrator in Information Steward can collect metadata from an SAP
BusinessObjects Enterprise XI 3.x system by using the Remote Job Server. When you install Information Steward,
you install the Remote Job Server component on the computer where the Enterprise XI 3.x system resides. Then
you use the Information Steward Service Configuration to configure the Remote Job Server. For information
about installing the Remote Job Server on the SAP BusinessObjects Enterprise XI 3.x system, see “Remote Job
Server Installation” in the Installation Guide.
If you are using the Secure Sockets Layer (SSL) protocol for all network communication between clients and
servers in your SAP BusinessObjects Enterprise XI 3.x and SAP Business intelligence platform 4.0 deployments,
you can use SSL for the network communication between the Remote Job Server and the Metadata Integrator on
Information Steward. In this environment:
● The server is the Remote Job Server on the SAP BusinessObjects Enterprise XI 3.x. To enable SSL, the server
must have both keystore and truststore files defined.
● The client is the Metadata Integrator on Information Steward. To enable SSL, the client must use the same
truststore and password as the server.
To set up SSL between the Remote Job Server and the Metadata Integrator, you need to perform the following
tasks:
● Create keystore and truststore files for the Remote Job Server and copy the truststore file to the Business
intelligence platform 4.0 system.
● Configure the location of SAP BusinessObjects Enterprise XI 3.x SSL certificates and key file names (from
Server Intelligence Agent (SIA)).
Related Information
Creating the keystore and truststore files for the Remote Job Server [page 32]
Configuring the SSL protocol for the Remote Job Server [page 33]
SSL and HTTPS setup for Information Steward [page 35]
To set up SSL protocol for communication to the Remote Job Server, use the keytool command to:
● Create a certificate and store it in a keystore file on the computer where SAP BusinessObjects Enterprise XI
3.x resides
● Create a trust certificate and store it in a truststore file on the computer where SAP BusinessObjects
Enterprise XI 3.x resides
● Copy the trust certificate into a truststore file on the computer where SAP BusinessObjects Business Intelligence Platform 4.0 resides
1. On the computer where you installed SAP BusinessObjects Enterprise XI 3.x, generate a keystore file and
export it:
a. Open a cmd window and go to the directory where Metadata Management configuration files are stored
for Information Steward.
For example, type the following command:
d. Import the certificate from the export file to create the truststore file.
For example, type the following command to import the certificate into a truststore file named
is.truststore.keystore:
This command stores the certificate in the truststore file in the InformationSteward\MM\Config
directory.
2. Copy the truststore file to the computer where SAP Business Intelligence Platform 4.0 resides.
a. Ensure that the is.truststore.keystore file is in a directory that is accessible to both the computer where SAP BusinessObjects Enterprise XI 3.x is installed and the computer where SAP Business Intelligence Platform 4.0 is installed.
b. Copy the is.truststore.keystore file to the directory where Metadata Management configuration
files are stored for Information Steward.
For example:
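Taken together, the keystore and truststore steps above might look like the following command sketch. The alias, the is.keystore file name, and the passwords are hypothetical illustrations; only the is.truststore.keystore name comes from this guide, and the exact commands for your deployment may differ:

```shell
# 1. Generate a key pair in a keystore (hypothetical alias and password).
keytool -genkeypair -alias remotejobserver -keyalg RSA \
        -keystore is.keystore -storepass changeit \
        -dname "CN=remotejobserver"

# 2. Export the certificate from the keystore.
keytool -exportcert -alias remotejobserver -keystore is.keystore \
        -storepass changeit -file remotejobserver.cer

# 3. Import the certificate into the truststore file named in this guide.
keytool -importcert -alias remotejobserver -file remotejobserver.cer \
        -keystore is.truststore.keystore -storepass changeit -noprompt

# 4. Copy is.truststore.keystore to the BI platform 4.0 computer.
```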
1. Obtain the SSL certificate and key file names from the SIA Properties file.
a. Click Start > All Programs > SAP BI platform 4 > SAP BI platform Central Configuration Manager to open the CCM.
b. Right-click Server Intelligence Agent and choose Properties.
c. Obtain the values for the following options so that you can enter them in a later step:
○ SSL Certificates Folder
○ Server SSL Certificate File
○ SSL Trusted Certificate File
○ SSL Private Key File
○ SSL Private Key File Passphrase File
For more information about these options, see the SAP Business Intelligence Platform Administrator's
Guide.
d. Click Cancel to close the Server Intelligence Agent Properties window.
For more information about SSL servers, see SAP Business Intelligence Platform Administrator's Guide:
Configuring servers for SSL.
Note
To run an integrator source with SSL enabled on the Remote Job Server, set the following runtime JVM
parameter.
-Dbusinessobjects.migration=on
Related Information
Creating the keystore and truststore files for the Remote Job Server [page 32]
Setting the runtime parameter for SAP Enterprise 3.x integrator with SSL [page 180]
3.2.3.3 SSL and HTTPS setup for Information Steward
These steps will help you avoid potential problems running Cleansing Package Builder if you have configured the
Secure Socket Layer (SSL) protocol to work with an HTTPS setup to access a web server for SAP Information
Steward.
These steps vary based on the web server you use. Information Steward’s default web server is Apache Tomcat.
Before you start these steps, you should read about how to deploy and undeploy web services using the WDeploy tool in the Web Application Deployment Guide, located on the customer portal. For instructions specific to Information Steward, read about how to use WDeploy in the Installation Guide.
<transportReceiver name="https"
class="org.apache.axis2.transport.http.SimpleHTTPServer">
<parameter name="port">443</parameter>
<parameter name="hostname">https://[host]</parameter>
</transportReceiver>
○ The port number is the SSL port number that you use with your BusinessObjects Enterprise (BOE) and
Information Steward web service (we show “443” as an example).
○ Host is the name of the server that hosts the Information Steward application. For example, if you use a
web application named “myserver” to access Information Steward, set [host] to myserver (omitting the
brackets).
4. Undeploy the CPBWebservice using WDeploy. (See the topic “To undeploy one web component” in the
Installation Guide).
5. Perform the predeployment steps explicitly for CPBWebservice. For more information see the topic “WDeploy
prerequisites” in the Installation Guide.
6. Redeploy the CPBWebservice. For details see the topic “Deploying web applications with WDeploy” in the
Web Application Deployment Guide.
SAP Information Steward can be deployed in an environment with one or more reverse proxy servers. A reverse
proxy server is typically deployed in front of the web application servers in order to hide them behind a single IP
address. This configuration routes all Internet traffic that is addressed to private web application servers through
the reverse proxy server, hiding private IP addresses.
Because the reverse proxy server translates the public URLs to internal URLs, it must be configured with the URLs
of the Information Steward web applications that are deployed on the internal network.
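The URL translation can be pictured as a prefix rewrite (the addresses below are hypothetical; the actual mapping is configured in the proxy server itself, not in code):

```python
PUBLIC_BASE = "https://proxy.example.com/IS"      # assumed public prefix
INTERNAL_BASE = "http://10.0.0.12:8080/BOE/IS"    # assumed internal app server

def to_internal(public_url):
    """Map a public URL to the internal web application server URL."""
    if not public_url.startswith(PUBLIC_BASE):
        raise ValueError("not served by this proxy")
    return INTERNAL_BASE + public_url[len(PUBLIC_BASE):]

print(to_internal(PUBLIC_BASE + "/logon"))  # → http://10.0.0.12:8080/BOE/IS/logon
```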
For information about supported reverse proxy servers and how to configure them, see “Information platform
services and reverse proxy servers” and “Configuring reverse proxy servers for Information platform” in the SAP
information platform services Administrator's Guide.
4 Information Steward configuration
settings
Use configuration settings to establish directory locations, to control software behavior when performing tasks,
and to control performance and scalability.
The Information Steward Settings window in the Central Management Console (CMC) contains many settings that
control how the software behaves when performing tasks in Data Insight, Metadata Management, Metapedia,
Cleansing Package Builder, and Match Review.
Administrators or users with the required permissions can set or change the Information Steward configuration
settings.
4.2 Information Steward settings option descriptions
The Information Steward Settings window in the Applications area of the Central Management Console (CMC)
contains options that control software behavior and job performance in Information Steward modes.
Table 10: Information Steward settings

Group: Debug Options
Parameter: Data Services engine options
Default value: N/A
Description: Values passed as command-line parameters to the Data Services engine.

Group: Permissions (Data Insight)
Parameter: Connection on which to base failed data
Default value: Source Connection
Description: Connection type on which to base view and export permission for failed data. Required permissions:
● View data
● View sample data
● Export data

Group: Data Review and Worklist settings
Parameter: Data Review tasks retention period
Default value: -1
Description: Number of days before Match Review tasks are deleted. The longer you keep the data, the larger the repository size.

Group: Data Cleansing Advisor Settings
Parameter: Data Cleansing Advisor retention period
Default value: 120
Description: Number of days before Data Cleansing Advisor staged data is deleted. The longer you keep the data, the larger the repository size.

Group: Metadata Management Options
Parameter: Custom reports directory
Default value: N/A
Description: Full path to the directory for custom reports, such as Crystal reports, and the associated properties file. Applicable for Crystal Reports version 12 or later.
Related Information
5 Users and Groups Management
Use SAP BusinessObjects Business Intelligence platform security to create users and authorize user access to the
objects and actions within the Information Steward modules:
● Data Insight
● Metadata Management
● Metapedia (under Metadata Management in the Central Management Console)
● Cleansing Package Builder
● Match Review
Information Steward provides pre-defined user groups that have specific rights on objects unique to each module.
These user groups enable you to grant rights to multiple users by adding the users to a group instead of modifying
the rights for each user account individually. You also have the ability to create your own user groups.
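The group model described above can be sketched in a few lines of Python. This is a conceptual illustration only; the group and right names are hypothetical, and this is not an Information Steward API. The key idea is that a user's effective rights are the union of the rights of every group the user belongs to, so adding a user to a group grants all of that group's rights at once.

```python
# Conceptual sketch of group-based rights (hypothetical names,
# not an SAP Information Steward API).

GROUP_RIGHTS = {
    "Data Insight User": {"view_objects"},
    "Data Insight Analyst": {"view_objects", "create_profile_task", "create_rule"},
    "Data Insight Rule Approver": {"view_objects", "create_profile_task",
                                   "create_rule", "approve_rule"},
}

def effective_rights(user_groups):
    """A user's effective rights: the union across all group memberships."""
    rights = set()
    for group in user_groups:
        rights |= GROUP_RIGHTS.get(group, set())
    return rights

# Membership in a group grants every right of that group at once.
print(sorted(effective_rights(["Data Insight User", "Data Insight Rule Approver"])))
```

This is why managing membership in a handful of groups scales better than editing rights on each user account individually.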
Related Information
Each pre-defined user group has specific rights to objects in Information Steward modules.
The following diagram provides an overview of the pre-defined Information Steward user groups and their relation
to the Administrator group in SAP BusinessObjects Business Intelligence platform.
● The Administrator group for the Business Intelligence platform:
○ Creates users and custom groups for all Information Steward modules
○ Performs all tasks within all Information Steward modules
○ Grants users access to cleansing packages
The table below describes the administrator's role for each Information Steward module.

Data Insight
● Grants users and groups access to connections and projects. (By default, all Data Insight pre-defined groups are granted access to all connections and projects.)
● Accesses all Data Insight actions.

Data Review
● Creates connections for match review and schedules for match review configurations.
● Accesses all Match Review actions.
Within each Information Steward module, additional user groups have specific rights for the objects within that
module. For example, the Data Insight Analyst group can create profile tasks and rules, but only the Data Insight
Rule Approver can approve rules.
Related Information
This section contains the steps to create users and add them to Information Steward groups.
Create user accounts and assign them to groups or assign rights to control their access to objects in SAP
Information Steward.
To create users:
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Administrator group.
2. At the CMC home page, click Users and Groups.
For more information, see the “Single Sign-On Setup” topic in the SAP BusinessObjects Business
Intelligence platform Administrator Guide.
b. Type the account name, full name, email, and description information.
Tip
Use the description area to include extra information about the user or account.
If your license agreement is not based on user roles, specify a connection type for the user account.
○ Choose Concurrent User if this user belongs to a license agreement that states the number of users
allowed to be connected at one time.
○ Choose Named User if this user belongs to a license agreement that associates a specific user with a
license. Named user licenses are useful for people who require access to SAP BusinessObjects Enterprise
regardless of the number of other people who are currently connected.
7. Click Create & Close.
The user is added to the system and is automatically added to the Everyone group. An inbox is automatically
created for the user, as is an Enterprise alias. You can now add the user to a group or specify rights for the user.
For more information, see the “Managing users and groups” topic in the SAP BusinessObjects Business
Intelligence platform Administrator Guide.
Groups are collections of users who share the same rights to different objects. SAP Information Steward provides
groups for Data Insight and Metadata Management, such as Data Insight Analyst group and Metadata
Management User group.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Administrator group.
2. At the CMC home page, click Users and Groups.
3. Select the User List or Group List node in the navigation tree.
4. Select the name of the user or user group in the right panel.
6. On the Join Group dialog box, select the Group List node in the navigation tree.
7. Select one or more names from the Available Groups list, and click > to place them in the Destination Group(s)
list.
8. Click OK.
Related Information
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. At the CMC home page, click Information Steward.
Administrators must enter a user's email information in the Central Management Console (CMC) so the user can
receive email notifications at certain stages of an object's creation, approval, and management.
For example, a Metapedia term requires an author to create the term and an approver to approve it. Each user chosen for one of these roles receives email notifications when the term is submitted for approval and when it is approved.
To set up a user for email notifications, log on to the Central Management Console (CMC) and follow these steps:
1. Select Users and Groups from the CMC Home drop down menu.
2. Click User List from the navigation tree at left.
A list of all users appears at right.
Related Information
The Data Insight module of SAP Information Steward contains the following objects that have specific rights that
allow various actions on them.
● Connections through which users view data sources and import tables and files to profile the data. In addition
to rights to the connection, a user must also be granted permission on the source data:
For database connections, the Database Administrator must grant privileges on the tables to the user.
For file connections, the users that run the following services must have permissions on the directory where
the file resides:
○ Information Steward Web Application Server (for example, Tomcat)
○ Data Services service
○ Server Intelligence Agent that runs EIMAdaptiveProcessingServer and ISJobServer
● Views that can join tables and files from multiple connections.
● Projects that contain profile tasks, rule tasks, and scorecards in specific business areas, such as HR or Sales.
● Profile tasks that collect profile attributes to help you determine the quality and structure of the data.
● Rule tasks that validate the data according to your business and quality rules.
The following diagram shows users and groups who are granted rights to access Connections, Projects, and Tasks
for Data Insight.
Related Information
Tip
Users working on the same project should have access to the same set of connections so that they can
collaborate on the same data set within a project.
The following table describes the predefined user groups in ascending order of rights.
Table 12:
User group Description
Data Insight User Users can view the connections, projects, source data, profile results, sample profile data,
rules, sample data that failed rules, and scorecard results.
Data Insight Analyst Users have all of the rights of a Data Insight User, plus the following rights:
Data Insight Rule Approver Users have all of the rights of a Data Insight Analyst, plus the right to approve and reject rules.
Data Insight Scorecard Manager Users have all of the rights of a Data Insight Analyst, plus the right to create and edit scorecards that consist of rules for specific Key Data Domains.
Data Insight Administrator Users have all of the user group rights listed above plus the following rights:
● Configure, edit, delete, run, schedule, view history of Information Steward utilities
● Create, edit, and delete Data Insight connections and projects
● Configure Information Steward application settings
● Change Information Steward repository user and password
Related Information
Group rights for Data Insight folders and objects in the CMC [page 51]
Group rights for connections [page 53]
User rights for views [page 55]
Group rights for projects [page 54]
Group rights for tasks [page 56]
Rights are the base units for controlling user access to the objects, users, applications, servers, and other features
in SAP BusinessObjects Enterprise.
Type-specific rights are rights that affect specific object types only, such as Data Insight connections, Data Insight
projects, profile tasks, or rule tasks.
Rights are set on objects, such as a Data Insight connection or project, rather than on the "principals" (the users
and groups) who access them. By default, the pre-defined Data Insight user groups are granted access to newly
created connections and projects. If you want some users to access only certain connections and projects, then
do not add them to a pre-defined group, but add their user names to the list of principals for each individual
connection and project and assign the appropriate type-specific rights. For example, to give a user access to a
particular connection, you add the user to the list of principals who have access to the connection.
Type-specific rights consist of the following:
For more information about user rights, see "How rights work in SAP Enterprise" in the Enterprise Administrator's
Guide.
Related Information
Each pre-defined Data Insight user group provides specific rights on the folders and objects in the CMC, as the following table shows.

(Groups: Data Insight Administrator, Data Insight Scorecard Manager, Data Insight Rule Approver, Data Insight Analyst, Data Insight User)

Data Insight folder, View objects: View Data Insight folder on the CMC. Granted to all five groups.
Connections folder, View objects: View connections in Connections folder. Granted to all five groups.
Data Insight Connection, View objects: View connection properties. Granted to all groups except Data Insight User.
Projects folder, View objects: View Projects folder. Granted to all five groups.
Projects, View objects: View project properties. Granted to all groups except Data Insight User.
Profile task or Rule task, View objects: View task properties. Granted to all five groups.
Profile task or Rule task, View instance: View task history and logs. Granted to all groups except Data Insight User.
Related Information
5.4.2.2 Group rights for connections
SAP Information Steward provides pre-defined Data Insight user groups that have specific rights to connections.
You can add users to these groups to control their rights to connections.
For example, in Data Insight, if a user has the right to view a connection, the connection is visible in the Browse
Metadata window. In addition, the tables and views in that connection are visible.
(Groups: Data Insight Administrator, Data Insight Scorecard Manager, Data Insight Rule Approver, Data Insight Analyst, Data Insight User)

View objects: View connections and tables, browse metadata in the connection. Granted to all five groups.
View Data: View external data in the connection. Granted to all five groups.
View Sample Data: View profile sample data and sample data that failed rules. Granted to all five groups.
Export Data: Export viewed data, profile sample data, and sample data that failed rules. Granted to all five groups.
Profile/Rule permission: Create profile tasks and rule tasks. Granted to all groups except Data Insight User.
Note
● Rights to a Data Insight connection are granted to users or groups when the Data Insight Administrator
adds them to the Principals list for the connection.
● The Administrator and the Data Insight Administrator can create, edit, and delete connections in the CMC.
● For a database connection, the Database Administrator must grant the Data Insight user access to the
tables.
● For a file connection, the users that run the following services must have permissions on the directory
where the file resides:
○ Information Steward Web Application Server (for example, Tomcat)
○ Data Services service
○ Server Intelligence Agent that runs EIM Adaptive Processing Server and IS Job Server
Related Information
Group rights for Data Insight folders and objects in the CMC [page 51]
Group rights for Data Insight folders and objects in the CMC [page 51]
User rights in Data Insight [page 48]
Data Insight predefined user groups [page 49]
Assigning users to specific Data Insight objects [page 59]
Each pre-defined Data Insight user group provides specific rights on projects in Information Steward, as the
following table shows.
(Groups: Data Insight Administrator, Data Insight Scorecard Manager, Data Insight Rule Approver, Data Insight Analyst, Data Insight User)

Edit objects: Add and remove tables and files in the project; add, edit, and remove views; update the Preferences page. Granted to all groups except Data Insight User.
● The Add Tables and Remove buttons are enabled in the Workspace if a user has the right to edit a project.
● The Bind and Delete buttons are enabled on the Rule tab if a user has the right to manage rules.
● The Profile and Calculate Score buttons are enabled in the Workspace if a user has the right to add objects to
the project.
Note
Only the Administrator and Data Insight Administrator can create, edit, and delete projects in the CMC. For
more information, see Group rights for Data Insight folders and objects in the CMC [page 51].
The rights each user has on a view are inherited from the rights the user has on the connections that make up the view.
For example, suppose View1 consists of the following connections and tables:
● ConnectionA, Table1
● ConnectionB, Table2
Suppose User1 has the Edit right on ConnectionA but not on ConnectionB. Therefore, User1 cannot edit View1
because the denied Edit right is inherited from ConnectionB.
Similarly, if User2 has the Edit right on ConnectionA and ConnectionB, then User2 can edit View1.
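The inheritance rule in this example can be sketched as follows. This is a conceptual model, not product code, and the names are hypothetical: a user holds a right on a view only when that same right is granted on every connection the view draws from.

```python
# Sketch of view-rights inheritance (hypothetical model, not product code):
# a right on a view is effective only if it is granted on ALL of the
# view's source connections.

def has_right_on_view(right, user_rights_by_connection, view_connections):
    """True only if `right` is granted on every source connection of the view."""
    return all(right in user_rights_by_connection.get(conn, set())
               for conn in view_connections)

view1 = ["ConnectionA", "ConnectionB"]

user1 = {"ConnectionA": {"Edit"}}                           # no Edit on ConnectionB
user2 = {"ConnectionA": {"Edit"}, "ConnectionB": {"Edit"}}  # Edit on both

print(has_right_on_view("Edit", user1, view1))  # False: ConnectionB blocks it
print(has_right_on_view("Edit", user2, view1))  # True: granted on both
```

A single missing grant on any source connection is enough to withhold the right on the view.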
This inheritance applies to all of the rights on views, as the following table shows.
View objects: The view name and columns are visible in the Workspace Home window. Requires the View objects right on all source connections.
View Data: Look at the source data in each table or file that makes up a view. Requires the View Data right on all source connections.
View Sample Data: View profile sample data and sample data that failed rules. Requires the View Sample Data right on all source connections.
Profile/Rule permission: Create profile tasks and rule tasks. Requires the Profile/Rule right on all source connections.
Export Data: Export viewed data, profile sample data, and sample data that failed rules. Requires the Export Data right on all source connections.
Note
The following actions require rights on the project:
● To add, edit, or remove views, a user must have the Edit objects right on the project.
● To copy a view, a user must have the Add objects right on the project.
5.4.2.5 Group rights for tasks
Each pre-defined Data Insight user group provides specific rights on profile tasks and rule tasks in Information
Steward, as the following table shows.
(Groups: Data Insight Administrator, Data Insight Scorecard Manager, Data Insight Rule Approver, Data Insight Analyst, Data Insight User)

View objects: View the task in the Tasks tab of the Workspace. Granted to all five groups.
Delete objects: Delete profile or rule task. Granted to all groups except Data Insight User.
Related Information
Group rights for Data Insight folders and objects in the CMC [page 51]
Administrators control user rights and the access connection type for viewing and exporting failed data.
Users can view failed data, view more failed data, and export failed data when they have permission to access the
connection and the following specific permissions in the connection:
● View data
● View sample data
● Export data
Administrators control the access connection on which the users view or export failed data. The option is
Connection on which to base failed data and it is in the Central Management Console in Data Insight Settings.
Table 18: Connection on which to base failed data

Source connection: Users need permission to the source connection to view and export failed data.
Failed data connection: Users need permission to the failed data connection to view and export failed data.
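The combined check can be sketched as follows. The names are hypothetical and this is not an Information Steward API: the administrator's setting selects which connection the permissions are evaluated against, and all three permissions must be present on that connection.

```python
# Sketch of the failed-data access check (assumed names, not a real API).
# The setting chooses which connection is checked; all three required
# permissions must be granted there.

REQUIRED = {"view_data", "view_sample_data", "export_data"}

def can_view_failed_data(setting, perms_by_connection, source_conn, failed_conn):
    """Evaluate the three required permissions against the configured connection."""
    conn = source_conn if setting == "Source Connection" else failed_conn
    return REQUIRED <= perms_by_connection.get(conn, set())

perms = {"HR_source": set(REQUIRED), "HR_failed": {"view_data"}}

print(can_view_failed_data("Source Connection", perms, "HR_source", "HR_failed"))       # True
print(can_view_failed_data("Failed data connection", perms, "HR_source", "HR_failed"))  # False
```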
Related Information
To facilitate user management, assign users to a pre-defined Information Steward user group. By default,
Information Steward assigns all pre-defined user groups to all Data Insight connections and projects. However,
you might want to limit a user's access in the following ways:
● Add the user or group to only a subset of connections. In addition, create projects for specific business areas,
such as HR or Sales, and assign only certain users to access these projects to create profile tasks and rule
tasks.
● Deny a right for an existing user group on a specific connection or project.
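The effect of a denial can be sketched as follows. This is a simplified, hypothetical model of grant/deny resolution; in the BI platform, an explicitly denied right takes precedence over a granted one.

```python
# Simplified sketch of grant/deny resolution (hypothetical model):
# an explicit denial on an object overrides a right granted via groups.

def resolve(right, granted_via_groups, explicitly_denied):
    """A right is effective only if granted somewhere and never explicitly denied."""
    return right in granted_via_groups and right not in explicitly_denied

group_rights = {"view_objects", "create_profile_task", "approve_rule"}
denied_on_hr_project = {"approve_rule"}

print(resolve("create_profile_task", group_rights, denied_on_hr_project))  # True
print(resolve("approve_rule", group_rights, denied_on_hr_project))         # False
```

This is why denying a single right on one connection or project is enough to restrict a user who otherwise inherits broad rights from a pre-defined group.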
Related Information
5.4.4.1 Denying user rights to specific Data Insight objects
By default, the pre-defined Data Insight user groups are added to the access list of connections and projects when
you create them. However, you might want to deny one or a small subset of users access to a specific Data Insight
connection and project.
To allow a user or group to access all but one specific Data Insight connection and project:
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. Add the user name to a pre-defined Data Insight group because the user would still have access to most
connections and projects. For details, see Adding users and user groups to Information Steward groups [page
46].
3. At the CMC home page, click Information Steward.
4. Select the object type.
Note
To select multiple names, hold down the Ctrl key when you click each name.
a. Expand the Application node and select Data Insight Project.
b. Click the Denied column for each right that you want to deny this user or group.
For example, to deny the right to create scorecards and approve rules in this project, click the Denied column
for the following Specific Rights for Data Insight Connection:
○ Approve Rule
○ Manage Rule
13. To deny profile or rule task rights:
a. Expand the Application node and select Information Steward Profiler Task.
b. Click the Override General Global column and the Denied column for each right that you want to deny this
user or group.
For example, to deny the right to schedule a profile task or rule task, click the Override General Global column
and the Denied column for the following General Rights for Data Insight Profiler Task:
○ Reschedule instances that the user owns
○ Schedule document to run
○ View document instances
14. Click OK and verify that the list under Right Name does not display the rights you denied.
15. Click the name of the principal you just added, click View Security, and verify that the rights you denied
show the red icon in the Status column.
16. Click OK and close the User Security window.
Related Information
Administrators can limit user or group access to only specific Data Insight connections, projects, and tasks.
Add the user or group to the list of principals for each specific Data Insight object.
Note
You must assign the user to both the connection and the project to enable them to add tables or files, create
profile tasks, and create rules.
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. At the CMC home page, click Information Steward.
3. Select the object type.
○ For a connection:
1. Select the Connections node in the tree panel.
2. Select the connection name in the right panel.
○ For a project:
1. Expand the Data Insight node, and expand the Projects node in the tree panel.
2. Select the project name in the tree panel.
○ For a profile task or rule task:
1. Expand the Data Insight node, and expand Projects node in the tree panel.
2. Select the task name in the right panel.
Related Information
Whenever a project is created, all pre-defined Data Insight user groups are automatically added to its principal list.
This feature facilitates user rights management when you want the same user or same group of users to have
rights on all projects.
You might want to limit rights of a subset of users to only one project. For example, you might want to limit the
Manage Scorecards right on the Human Resources project to only User A, and you want only User B to have the
Manage Scorecards right on the Finance project.
To restrict the Manage Scorecards right to a specific user for each project:
1. Create User A and User B. For details, see Creating users for Information Steward [page 45].
2. Add User A and User B to the Data Insight Analyst user group. For more information, see Adding users and
user groups to Information Steward groups [page 46].
This Data Insight Analyst user group has all of the rights of the Data Insight Scorecard Manager user group
except the Manage Scorecard right, which you will grant to specific users within a project in subsequent
steps.
3. Create the Human Resources project and the Finance project. For details, see Creating a project [page 120].
4. To grant User A the Manage Scorecard right on the Human Resources project:
a. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
b. At the CMC home page, click Information Steward, expand the Data Insight node, and expand the Projects
node in the Tree panel.
c. Select the Human Resources project and click Manage > Security > User Security.
d. On the User Security page, click Add Principals.
e. In the list of Available users/groups, select User A and click the > button to move the names to the
Selected users/groups list.
f. Click Add and Assign Security.
g. Click the Advanced tab.
h. Click the Add/Remove Rights link.
i. Expand the Application node and click Information Steward Project.
j. Click the Granted column for the Manage Scorecards right under Specific Rights for Data Insight
Connection.
k. Click OK and verify that the list under Right Name displays the Manage Scorecards right.
l. Click OK and close the User Security window.
5. Repeat steps 4a through 4l in the Finance project for User B.
Related Information
The Metadata Management module of Information Steward contains the following objects, which have object-specific rights that allow various actions on them.
● Metadata Management application through which users can view relationships (such as Same As, Impact,
and Lineage) between integrator sources.
● Integrator Sources through which users collect metadata.
● Integrator Source Groups to subset the metadata when viewing relationships.
The CMS manages security information, such as user accounts, group memberships, and object rights that define
user and group privileges. When a user attempts an action on a Metadata Management object, the CMS
authorizes the action only after it verifies that the user's account or group membership has sufficient privileges.
Related Information
Information Steward provides the following Metadata Management user groups to enable you to change the
rights for multiple users in one place (a group) instead of modifying the rights for each user account individually.
Metadata Management User: Users that can view metadata only in the Metadata Management tab of Information Steward.
Metadata Management Data Steward: Users that have all the rights of a Metadata Management User, plus additional rights.
Metadata Management Administrator: Users that have all the rights of a Metadata Management Data Steward, plus the following rights:
● Create, edit, and delete Metadata Management integrator sources and source groups
● Run and schedule Metadata Integrators
● Create custom attributes and edit values of custom attributes
● Configure, edit, delete, schedule, view history of Information Steward utilities
● Manage Metapedia users and rights and re-assign terms to other groups
Related Information
Type-specific rights affect only specific object types, such as integrator sources. The following topics describe
type-specific rights for each Information Steward object in the CMC and Information Steward.
Related Information
Group rights for Metadata Management folders and objects [page 63]
Group rights for Metadata Management objects in Information Steward [page 65]
Each pre-defined Metadata Management user group provides specific rights to the folders and objects in the
Central Management Console (CMC).
(Groups: Metadata Management Administrator, Metadata Management Data Steward, Metadata Management User)

Integrator Source folder, View objects: View integrator sources, their run history, their logs, and so on. Granted to all three groups.
Metapedia folder, View objects: View terms, categories, policy sets, and policy statements in Information Steward. Granted to all three groups.
Each pre-defined Metadata Management user group provides specific rights on objects in Information Steward, as
the following table shows.
Table 21: Rights for Metadata Management objects in Information Steward
(Groups: Metadata Management Administrator, Metadata Management Data Steward, Metadata Management User)

View objects (granted to all three groups):
● View all of the metadata objects and their relationships
● View the custom attributes and values
● View the Preferences page
● Search all metadata sources
● View Metadata Management lineage from the View Lineage option on the Documents tab of BI Launch Pad
By default, the pre-defined Metadata Management user groups are added to the access list of integrator sources
and source groups when you create them.
You might want to allow only certain users to configure integrator sources, define source groups, or define
Metapedia categories and terms or policy sets and statements. In these cases, you would add the user or group to
the specific Metadata Management object's access list (instead of adding to a pre-defined Metadata Management
user group).
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Metadata Management Administrator group.
2. At the CMC home page, click Information Steward.
3. Expand the Metadata Management node in the Tree panel.
4. Select the object type.
Note
To select multiple names, hold down the Ctrl key when you click each name.
b. Click the > button to move the names to the Selected users/groups list.
8. Click Add and Assign Security.
9. Click the Advanced tab.
10. Click the Add/Remove Rights link.
11. To assign integrator source rights:
a. Expand the Application node and select Metadata Management Integrator configuration.
b. Click the Override General Global column and the Granted (green check mark icon) column for each right
that you want this user or group to have.
For example, to grant the right to schedule and view integrator instances, click the Override General Global
column and the Granted column for the following general items under General Global Rights:
○ Pause and resume document instances
○ Schedule document to run
○ View document instances
12. To assign Metapedia rights:
a. Expand the Content node and select Folder.
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to create, edit, and delete Metapedia categories and terms, click the Granted
column for the following General Rights for Folder:
○ Add objects to folder
○ Delete objects
○ Edit objects
○ View objects
13. To assign source group rights:
a. Expand the Application node and select Metadata Management Source Group.
b. Click the Override General Global column and the Granted (green check mark icon) column for each right
that you want this user or group to have.
For example, to grant the right to schedule and view integrator instances, click the Override General Global
column and the Granted column for the following General Rights for Metadata Management Integrator
configuration:
○ Pause and resume document instances
○ Schedule document to run
○ View document instances
14. Click OK and verify that the list under Right Name displays the rights you just added.
15. Click OK and verify that the list of principals includes the name or names you just added.
16. Close the User Security window.
In Metapedia, users create objects, such as business terms and policy statements, that adhere to their business
requirements and practices.
Users create categories for terms, and policy sets for policy statements. After authors have created a term or
policy, it is sent through an approval process.
The CMS manages security information, such as user accounts, group memberships, and object rights that define
user and group privileges. When a user attempts an action on a Metapedia object, the CMS authorizes the action
only after it verifies that the user's account or group membership has sufficient privileges.
Related Information
SAP Information Steward provides the following Metapedia user groups to enable you to change the rights for
multiple users in one place (a group) instead of modifying the rights for each user account individually.
Table 22: Metapedia pre-defined user groups

Metapedia User: Users that can view terms, categories, policy sets, and policy statements in the Metapedia tab of Information Steward.

Note
A Metapedia author can optionally assign a Metapedia user as an observer when creating a term, policy set, or policy statement. An observer may receive email notifications when a term, policy set, or policy statement is approved or deleted.

Metapedia Author: Users that have all the rights of a Metapedia User, plus additional rights.
Metapedia Approver: Users that have all the rights of a Metapedia Author, plus the rights to approve or reject terms, policy sets, and policy statements.
Approvers receive all email notifications that a Metapedia Author receives, plus an email notification when a term, policy set, or policy statement is submitted for approval, is near or past a task due date, or when a policy set or statement has expired.
Metapedia Administrator: Users that have all the rights of a Metapedia Approver, plus the rights to manage terms, policy sets, and policy statements. For example, they can re-assign terms, policy sets, and policy statements to other users or groups, and edit terms, policy sets, and policy statements that are not assigned to them as authors.
Type-specific rights affect only certain object types, such as integrator sources or Metapedia objects. The
following topics describe type-specific rights for each Information Steward object in the CMC and Information
Steward.
Each pre-defined Metapedia user group provides specific rights on the folders and objects in the Central
Management Console (CMC).
Table 23: Rights for Metapedia folders in the CMC
CMC Folder or Object | Right Name | Description | Administrator | Approver | Author | User
Metapedia folder | View objects | View terms and categories in Information Steward | Yes | Yes | Yes | Yes
Each pre-defined Metapedia user group provides specific rights on objects in Information Steward.
Right Name | Description | Pre-defined User Groups
… | ● Edit categories ● Edit terms ● Add terms to categories ● Relate terms ● Associate objects to a term ● Update Preferences page
Delete objects | ● Delete related terms ● Delete associated objects ● Delete associated terms ● Delete term task ● Delete categories ● Delete terms
You might want to allow only certain users to define Metapedia categories and terms. In these cases, you would
add the user or group to the specific Metapedia object's access list (instead of adding to a pre-defined Metapedia
user group).
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Metapedia Administrator group.
2. At the CMC home page, click Information Steward.
3. Expand the Metadata Management node in the Tree panel.
4. Select Metapedia.
Note
To select multiple names, hold down the Ctrl key when you click each name.
b. Click the > button to move the names to the Selected users/groups list.
8. Click Add and Assign Security.
9. Click the Advanced tab.
10. Click the Add/Remove Rights link.
11. To assign Metapedia rights:
a. Expand the Content node and select Folder.
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to create, edit, and delete Metapedia categories and terms, click the Granted
column for the following General Rights for Folder:
○ Add objects to folder
○ Delete objects
○ Edit objects
○ View objects
12. To assign application rights:
a. Expand the Application node and select Information Steward Metapedia.
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to create, edit, and delete Metapedia categories and terms, click the Granted
column for the following Specific Rights for Information Steward Metapedia:
○ Approve Term
○ Export Data
○ Import data
○ Manage Term
13. Click OK and verify that the list under Right Name displays the rights you just added.
14. Click OK and verify that the list of principals includes the name or names you just added.
15. Close the User Security window.
The Cleansing Package Builder module of SAP Information Steward contains the following objects to which you
control access.
● Private cleansing packages: Private cleansing packages are viewed or edited by the user who owns them and
are listed under My Cleansing Packages. Private cleansing packages include those created by using the New
Cleansing Package Wizard or by importing a published cleansing package.
● Published cleansing packages: Published cleansing packages are cleansing packages included with SAP
Information Steward or cleansing packages which a data steward created and then published. Published
cleansing packages are available to all users and can be used in an SAP Data Services Data Cleanse transform
or imported and used as the basis for a new cleansing package.
5.7.1 Group rights for cleansing packages
The Administrator and the pre-defined Cleansing Package Builder User groups can perform specific actions on
cleansing packages, as the following table shows.
Action | Description | Performed in | Administrator | Cleansing Package Builder User
Edit | Change, delete, and rename your own private cleansing packages. | Cleansing Package Builder | Yes | Yes
View | Browse your own private cleansing packages and all published cleansing packages. | Cleansing Package Builder | Yes | Yes
Control Cleansing Package Builder server | Start and stop the Cleansing Package Builder Service. | Central Management Console | Yes | No
Set up users and groups | Create users and add users to groups. | Central Management Console | Yes | No
View all | See all private and published cleansing packages. | Central Management Console | Yes | No
The following table describes the Information Steward actions in the Applications area of the CMC. To perform
any of these Information Steward actions, a user must belong to one of the following groups:
● Administrator
● Data Insight Administrator
● Metadata Management Administrator
Table 26: Actions for Information Steward in the CMC Applications area
Item | Action | Description
View Data Services Job Server | View | View the list of job servers in the Information Steward job server group
You might want to view or edit the SAP Information Steward repository connection information for situations such
as the following:
1. Log on to the Central Management Console (CMC) with a user name that is a member of the Administrator
group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the Application Name list, select Information Steward Application.
4. Click Action > Configure Repository.
The connection information for the Information Steward repository was defined at installation time.
○ For most of the database types, you can only view the connection information here.
○ If the database type is Oracle RAC, you can modify the connection string here if you want to add another
server or tune parameters for failover.
Note
You can change the user name and password for the Information Steward repository, but you must have
the appropriate credentials to access the Information Steward database. If you change the user name and
password, you must restart the Web Application Server and Server Intelligence Agent (SIA).
The Match Review module of SAP Information Steward contains the following objects that have specific rights
that allow various actions on them.
● Connections through which users view match results tables in a staging database to profile the data.
● Match review configurations
● Match review tasks through which users decide whether a record belongs in a match group or whether the
master record should be reassigned.
Note
In addition to rights to the connection, the Database Administrator must grant privileges on the tables to the
user in the staging database.
SAP Information Steward provides pre-defined Match Review user groups to facilitate the assignment of rights on
connections, configurations, and tasks. These groups enable you to change the rights for multiple users in one
place (a group) instead of modifying the rights for each user account individually.
The following table describes the pre-defined user groups in ascending order of rights.
User group | Description
Data Review User | Users who are participants in match review as a reviewer or approver and have the following rights:
Data Review Configuration Manager | Users that have all the rights of a Data Review User, plus the following rights:
● Creates and manages match review configurations.
● Runs match review configurations immediately.
● Manages match review tasks as follows:
○ Edits task settings.
○ Cancels the task if the task is obsolete.
○ Force completes the task if the remaining match review process can be bypassed.
Data Review Administrator Users that have all the above rights on Match Review objects, plus the following rights:
Type-specific rights affect only specific object types, such as Match Review configurations or tasks. The following
topic describes type-specific rights for each Information Steward object in the CMC and SAP Information
Steward.
Related Information
Group rights for Data Review folder and Match Review objects in the CMC [page 76]
Each pre-defined Data Review user group provides specific rights on the folders and objects in the Central
Management Console (CMC), as the following table shows.
CMC Folder or Object | Right Name | Description | Data Review Administrator | Data Review Data Steward | Data Review User
Connection folder | View objects | View connections in the Connections folder | Yes | Yes | Yes
Data Review folder | View objects | View objects in the Data Review folder | Yes | Yes | Yes
Match Review Configuration folder | View objects | View objects in the Match Review Configuration folder | Yes | Yes | Yes
6 Repository Management
This section describes tasks and tools to manage the Information Steward repository.
You might want to view or edit the SAP Information Steward repository connection information for situations such
as the following:
1. Log on to the Central Management Console (CMC) with a user name that is a member of the Administrator
group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the Application Name list, select Information Steward Application.
○ For most of the database types, you can only view the connection information here.
○ If the database type is Oracle RAC, you can modify the connection string here if you want to add another
server or tune parameters for failover.
Note
You can change the user name and password for the Information Steward repository, but you must have
the appropriate credentials to access the Information Steward database. If you change the user name and
password, you must restart the Web Application Server and Server Intelligence Agent (SIA).
Data Cleansing Advisor stores staged data in a local repository. The staged data allows users to preview cleansing
and matching changes to the data without changing their actual data.
You can edit the server port number for the Data Cleansing Advisor local repository. You can also view the server
name, which is the machine name where Information Steward was installed.
1. Log on to the Central Management Console (CMC) with a user name that is a member of the Administrator
group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the Application Name list, select Information Steward Application.
4. Click Configure Data Cleansing Advisor.
The connection information for the Data Cleansing Advisor local repository was defined when Data Cleansing
Advisor was first accessed. You can view the server name here, but you cannot change it.
5. If you need to change the server port number, enter the new value in the Server Port field.
If you change the port number, you must restart the Enterprise Information Management Adaptive Processing Server (EIM APS). Because stopping and restarting the EIM APS stops all services that run under it, notify the users of those services and restart the EIM APS when it is not in use.
1. Navigate to Servers.
2. Under Service Categories, choose Enterprise Information Management Services.
3. Select the EIM Adaptive Processing Server.
4. Use the toolbar buttons to stop and then restart it.
However, after installation you can enable Windows authentication in the Central Management Console (CMC).
Additionally, by default the Web application server (such as Tomcat) and the Server Intelligence Agent (SIA) use
the local System Account. Therefore, you also need to change this setting and enter the Windows domain user
name and password, then restart the servers.
To create a new repository with Windows authentication enabled, set the ISRepositoryUtility parameter
repoMSSQLWinAuth to true.
1. Log on to the Central Management Console (CMC) with a user name that is a member of the Administrator
group.
2. From the drop-down menu, select Applications.
3. Double-click Information Steward Application to open its settings interface.
4. Select Configure Repository.
5. Select the Windows Authentication check box and click Save.
6. Open the BI platform Central Configuration Manager (CCM) and stop the Web application server and the SIA.
7. For the Web application server, disable the System Account and enter the Windows domain user name and
password.
8. Ensure the associated Data Services Windows service is also configured to use the same domain account configured for the Web application server. The account and password setting is in the Windows Services control panel: right-click the SAP Data Services service, select Properties, select the Log On tab, and confirm the account and password.
9. Restart the Data Services Windows service, Web application server, and SIA.
We recommend that you create a backup policy for your SAP Information Steward repository based on the
frequency of the following tasks:
● Add or change Data Insight connections, projects, views, file formats, rules, rule bindings, or scorecards
● Add or change Metadata Management integrator sources or custom attributes
● Define or change Metapedia terms and categories
● Change the include source with failed data setting in an existing rule task
Use the backup utility of your relational database management system that you used to create your Information
Steward repository.
Normally, you create the Information Steward repository during installation. However, you can use the IS
Repository utility (ISRepositoryUtility) in the following situations:
Caution
If you reset your repository by using the create mode of the Repository utility, you will lose all your
existing contents.
● You want to manually upgrade your repository from a prior version of Information Steward to the current
version.
On Windows platforms:
● The Repository utility uses the parameter values in the InstallUtility.properties file in the <SAP_BusinessObjects>\InformationSteward\MM\config directory by default.
On UNIX platforms:
You can override the default parameter values in InstallUtility.properties by either of the following
actions:
● As a best practice, make a backup of the InstallUtility.properties file and then change the parameter
values in InstallUtility.properties.
● Enter the parameters with their new values in the command line when you run the Information Steward
Repository utility.
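As a rough sketch, the backup-and-edit practice above can be scripted. The parameter name repoServer comes from the utility's parameter list; the sample file contents and the new value are invented for illustration:

```shell
#!/bin/sh
# Sketch: back up InstallUtility.properties, then change one parameter value.
# The sample file contents and the new server name are illustrative only.
CFG=./InstallUtility.properties
printf 'repoServer=olddb\nrepoDatabase=is_repo\n' > "$CFG"    # sample file for this sketch

cp "$CFG" "$CFG.bak"                                          # best practice: keep a backup
sed 's/^repoServer=.*/repoServer=newdb/' "$CFG.bak" > "$CFG"  # rewrite the live copy

grep '^repoServer=' "$CFG"                                    # prints: repoServer=newdb
```

Alternatively, pass the changed parameter and its value on the ISRepositoryUtility command line, which overrides the value in the file.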
To create a new SAP Information Steward repository and associate it with the current Central Management Server (CMS) system database:
Note
To use an Oracle database for the Information Steward repository, ensure that the JDBC driver exists in the
following directory:
○ On Windows platforms, <SAP_BusinessObjects>\InformationSteward\MM\lib\ext
○ On UNIX platforms, <SAP_BusinessObjects>/InformationSteward/MM/lib/ext
If the JDBC driver is not in this directory, you must download and install it. For more information, see the
Installation Guide.
2. Stop your web application to prevent users from accessing the Information Steward repository while you create the new repository.
3. Make a backup of the InstallUtility.properties file, which is the default configuration file for the
ISRepositoryUtility command.
○ On Windows platforms, back up the properties file located in <SAP_BusinessObjects>\InformationSteward\MM\config\InstallUtility.properties.
○ On UNIX platforms, back up the properties file located in <SAP_BusinessObjects>/InformationSteward/MM/config/InstallUtility.properties.
4. Open InstallUtility.properties and change the values of the connection parameters to the values of
the database you created in step 1.
5. If you want to overwrite an existing repository, change the value of the -o parameter in the
InstallUtility.properties file to true.
Caution
If you specify true for the -o parameter, the utility deletes all objects in the repository.
6. On the machine where you performed the Primary installation type for Information Steward, run
ISRepositoryUtility from the command line.
a. Open a command window and change to the bin directory where Information Steward is installed.
Note
The password for both the CMS database and the Information Steward repository must be specified in
the command line. Do not store passwords in the configuration or properties files.
7. After you create the repository, restart the Web Application Server and CMS.
Note
If you run Data Insight profiling and rule tasks and you created a new repository with different connection
information than in Step 1, you must access the SAP Data Services Server Manager to associate the
Information Steward job server with the new repository.
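The note in step 6 (passwords belong on the command line, never in the properties file) can be verified mechanically before you run the utility. The helper below is an illustrative sketch, not part of Information Steward; the sample file name and contents are invented:

```shell
#!/bin/sh
# Sketch: confirm no password keys are stored in the properties file before
# running ISRepositoryUtility. Point PROPS at your real file; the sample
# created here is for illustration only.
PROPS=./sample_InstallUtility.properties
printf 'repoServer=dbhost\nrepoDatabase=is_repo\n' > "$PROPS"

if grep -qiE '^(boePassword|password) *=' "$PROPS"; then
    echo "ERROR: remove password entries from $PROPS" >&2
    exit 1
fi
echo "OK: no passwords stored in $PROPS"
```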
6.5.2 Recovering the repository
To recover an Information Steward repository and synchronize it with the current CMS system database:
1. Stop your web application to prevent users from accessing the Information Steward repository during
recovery.
2. Restore your Information Steward repository to a backup that you created with your database backup utility.
3. Make a backup of the InstallUtility.properties file, which is the default configuration file for the
ISRepositoryUtility command.
ISRepositoryUtility recover boePassword "myPass1" password "myPass2" reportFilePath "C:<SAP_BusinessObjects>\InformationSteward\MM\log\recovery_report.log"
Note
The recovery report file will contain information about the Information Steward repository objects that
were updated or created and those objects that require further action.
6. Look for the completion message in the command line or in the log <SAP_BusinessObjects>\InformationSteward\MM\log\ISRepositoryUtility_<timestamp>.log.
When the utility completes successfully, the following message appears:
Note
When the recovery has finished, the Update Search Index utility is automatically scheduled to run five
minutes later.
After the ISRepositoryUtility utility completes, the following objects will be recovered to their status when
the database backup was taken:
Note
Schedules and Data Insight profile and rule tasks that have been deleted after the database backup was
created will not be recovered. Therefore, they will be deleted from the recovered repository.
Remember
You must complete the repository recovery by taking the actions indicated by the messages in the recovery
report file.
1. Stop your web application to prevent users from accessing the Information Steward repository during
upgrade.
2. Make a backup of the InstallUtility.properties file, which is the default configuration file for the
ISRepositoryUtility command.
Note
The password for both the CMS database and the Information Steward repository must be specified in
the command line. Do not store passwords in the configuration or properties files.
5. After you upgrade the repository, restart the Web Application Server and CMS.
6. If you are migrating to new hardware and upgrading Information Steward, use the appropriate tool for the
following upgrade paths:
○ If you are upgrading from SAP BusinessObjects Enterprise XI 3.1 and Metadata Management 3.1 to BI
Platform 4.0.x and Information Steward 4.1, use the Upgrade Manager to move Information Steward
objects (such as Metadata Management Integrator Sources).
○ If you are upgrading from BI Platform 4.0 and Information Steward 4.0 to BI Platform 4.0 and Information
Steward 4.1, use the SAP Promotion Management tool to move Information Steward objects (such as
cleansing packages).
6.5.4 Completing the repository recovery
After you run the Repository utility in recover mode, you might need to take additional actions to synchronize
the CMS system database.
1. Open the recovery report file that the Repository utility created in the file path that you specified in the
reportFilePath parameter.
2. If you see the following message in the recovery report file, you can skip the rest of this procedure.
3. If you see the following message in the recovery report file, review the subsequent messages to obtain the
names of the objects that require additional actions.
By default, the Repository utility uses the parameter values in the InstallUtility.properties file. You can
override the default parameter values by either of the following actions:
● As a best practice, make a backup of the InstallUtility.properties file and then change the parameter
values in InstallUtility.properties.
● Enter the parameters with their new values in the command line when you run this utility.
Note
You must always specify the passwords for both the CMS database and the Information Steward repository in
the command line when you run the Repository utility.
Table 29:
Parameter | Accepted Values | Required | Description
-c | Absolute path of the configuration file | No | Location and name of the configuration file whose parameter values this utility will use. Default: <SAP_BusinessObjects>/MM/config/InstallUtility.properties
Note
All parameters and their descriptions also display if you specify no parameters when you run ISRepositoryUtility.
Default: create
reportFilePath | Absolute path of the recovery report file | Yes, if mode is recover | Location of the recovery report file that this utility will generate when running in recover mode. The path must be enclosed with double quotation marks. For example:
boeUser | Any value accepted by the CMS | Yes | Administrator user account to connect to the CMS
boePassword | Any value accepted by the CMS | Yes | Administrator user account password to connect to the CMS
boeAuthentication | Any value accepted by the CMS | Yes | Authentication method to connect the user to the CMS. Default: secEnterprise
user | Any value accepted by the database type | Yes | Information Steward repository user name
repoServer | Any value accepted by the database type | Yes | Information Steward repository server name
repoServerPort | Integer value | Yes | Port to communicate with the Information Steward repository server
repoDatabaseEngine | See the list in the Description column. | Yes | Information Steward repository database type:
● DB2: DB2 v9
● Microsoft SQL Server: MS SQL Server 2005, MS SQL Server 2008
● SQL Anywhere: Sybase SQL Anywhere 12
● Oracle: Oracle 10, Oracle 11
● SAP HANA: NewDB
● Sybase ASE: Sybase ASE 15, Sybase ASE 16
repoDatabase | Any value accepted by the database server | Yes | Information Steward repository database name
repoConnectionString | Any value accepted by the database server | Yes | Information Steward repository connection information. For Microsoft SQL Server, if the SQL Server is an instance name, use this parameter to pass it. For example: repoConnectionString SQLServerExpress
repoMSSQLWinAuth | True, false (default: False) | No | For Microsoft SQL Server repositories, specifies whether to use Windows authentication instead of Microsoft SQL Server authentication. For example: ISRepositoryUtility create repoMSSQLWinAuth true boePassword test1234 -o true
7 Data Insight Administration
Each deployment of SAP Information Steward supports multiple users in one or more Data Insight projects to
assess and monitor the quality of data from various sources. In Data Insight, users work with data within projects
and connections:
Table 30:
Data Insight term | Description
Project | Collaborative workspace for data stewards and data analysts to assess and monitor the data quality of a specific domain and for a specific purpose (such as customer quality assessment, sales system migration, and master data quality monitoring).
Connection | Defines the parameters for Information Steward to access a data source. A data source can be a relational database, application, or file.
For more details about what users can do in Data Insight, see the Data Insight section in the User Guide.
Before Data Insight users can perform tasks, administrators must set up each user in the Central Management
Console (CMC) and perform other setup tasks:
After a Data Insight user creates tasks (profile or rule), the administrator can perform the following tasks:
● Create schedules to run the profile task and rule task at regular intervals.
● Modify the default schedule to run utilities if the frequency of the profile and rule tasks warrant it.
● Modify application-level settings to change the default configuration for Data Insight.
Some of the Data Insight settings in the Information Steward Settings window control task performance by increasing or decreasing the number of needed resources, and by controlling the repository size.
The following parameters affect the amount of data that Data Insight needs to process for profiles and rules. Reduce the amount of data to process to conserve resources and to promote efficient processing.
The following parameters, in the Data Insight subcategory of the settings, affect the size of the Information Steward repository. If you increase these values, the repository size also increases and you might need to free space more often.
7.3 Data Insight Connections
Create connections to databases, applications, and files for storing information from Data Insight processes.
You can create connections for SAP Data Insight for the following reasons:
Table 31:
Reason for connecting | Details
Profiling data | Configure connections to the following types of data sources to collect profile attributes that can help you determine the data quality and structure:
● Databases (such as Microsoft SQL Server, IBM DB2, SAP HANA, Oracle, MySQL, Informix IDS, SAP ASE, and ODBC)
● Applications (such as SAP Business Suite and SAP NetWeaver Business Warehouse)
● Text files
Failed data | Configure connections to supported databases to store failed data records and related information about data that failed your quality validation rules. You can create multiple failed data connections. For example, create a separate failed data connection for each rule task.
Ensure that you have the proper permissions to access the database connection:
● If you profile and run validation rules on the data, you must have permissions to read the metadata and data
from the source tables.
● If you store failed data, you must have permissions to create, modify, and delete tables, and to create stored
procedures.
Note
For a complete list of supported databases, applications, and their versions, see the Platform Availability Matrix
on the SAP Support Portal at https://service.sap.com/PAM .
Before following the steps below, log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight Administrator group or that has Create rights for Connections in Information Steward.
1. At the CMC home page, click Information Steward.
2. Select the Connections node in the tree panel at left.
Note
The parameters vary based on the database type you choose. For database-specific parameters and
information, see the applicable connection parameter topic listed for the applicable database type.
Table 32:
Parameter Description
Connection Name (Required) Enter a name for the connection. Name requirements:
○ Maximum length is 64 characters
○ Can be multi-byte
○ Case insensitive
○ Can include underscores and spaces
○ Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
Bulk Loading (Required) Select Yes to enable bulk loading for failed data. Bulk loading could improve the processing speed when the software searches for failed data in large database files.
Bulk Loading is available when the Purpose is for data that failed rules. Applicable for the following database types:
○ DB2
○ Oracle
○ SAP HANA
Rows per commit The number of rows processed before a commit. Default is 1000. Setting the
rows per commit to a larger number of rows may increase performance.
Database Type (Required) Select the database type that contains the data that you want to use
for profiling or failed data.
Database Version (Required) Enter current version number for the database that you chose in
Database Type.
Server Name (Required) Enter the server name that the connection will use.
Database Name (Required) Enter the name of the database that you will use for this connection.
Windows Authentication (Required for Microsoft SQL Server) Select Yes to enable Windows authentication. Select No to disable Windows authentication.
User Name, Password (Required) Enter the user name and password that will be used to access this
connection.
Unsupported Data Types (Required) Select the action that the system should take if the database contains unsupported data types.
Note
This parameter is only available when the purpose of the connection is for
data profiling.
Language, Client Code Page, Server Code Page (Required) Select a valid option from the drop-down lists or select Default.
Note
This parameter is only available when the purpose of the connection is for
data that failed rules.
Append failed data (Required) Select Yes to append failed data to the existing table. Select No to
save failed data to a new table.
Note
This parameter is only available when the purpose of the connection is for
data that failed rules.
If you select to include the source data with failed data, and you have a primary
key defined in your source data, selecting Yes for this option also appends
changed information to the source data as well as failed data. When you select
No, the software overwrites the failed data with the data from the last run.
Note
This parameter is only available when the purpose of the connection is for
data profiling.
5. (Optional) Click Test Connection to verify that Information Steward can connect to the newly defined
connection.
If the connection fails, correct applicable connection information and click Test Connection again.
6. When you are finished with the Create Connection dialog box, click Save.
The newly configured connection appears in the list of connections in Information Steward > Connections.
After you save the connection, you cannot change the Connection Name, Connection Type, Purpose, or the connection parameters that uniquely identify a database.
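A proposed Connection Name can be pre-checked against the requirements in Table 32 before you create the connection. The helper below is an illustrative sketch, not an Information Steward tool, and it accepts ASCII letters, digits, underscores, and spaces only (the "can be multi-byte" rule would need a locale-aware variant):

```shell
#!/bin/sh
# Sketch: validate a proposed Data Insight connection name against the
# Table 32 rules (maximum 64 characters; underscores and spaces allowed;
# no other special characters). ASCII-only for simplicity.
validate_name() {
    name="$1"
    if [ "${#name}" -gt 64 ]; then
        echo "too long"
        return 1
    fi
    if printf '%s' "$name" | grep -q '[^A-Za-z0-9_ ]'; then
        echo "invalid character"
        return 1
    fi
    echo "valid"
}

validate_name "bad;name" || true      # prints: invalid character
validate_name "Customer Staging_DB"   # prints: valid
```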
You must also authorize users so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
Note
If you configure the system environment variables for the database (for example, ORACLE_HOME,
LD_LIBRARY_PATH, SHLIB_PATH, locale settings, ulimit), you must restart the Server Intelligence Agent (SIA).
Users cannot access the Central Management Server (CMS) while the SIA is stopped. Therefore, you should
consider performing this configuration during scheduled down time to limit the effect on your users.
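For example, for an Oracle source on UNIX, the variables might be set in the profile of the account that runs the SIA before the restart. Every path and value below is an example, not an Information Steward default:

```shell
#!/bin/sh
# Sketch: example environment for an Oracle Data Insight source.
# All paths and values are illustrative; set them for the account that runs
# the Server Intelligence Agent, then restart the SIA.
export ORACLE_HOME=/opt/oracle/client                     # example client install path
export LD_LIBRARY_PATH="$ORACLE_HOME/lib:${LD_LIBRARY_PATH:-}"
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8                 # example locale setting
ulimit -n 4096 2>/dev/null || true                        # raise open-file limit if permitted

echo "ORACLE_HOME=$ORACLE_HOME"                           # prints: ORACLE_HOME=/opt/oracle/client
```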
Table 33:
DB2 option | Possible values | Description
Database version | DB2 UDB <version number> | Select the version of your DB2 client. This is the version of DB2 that this Data Insight connection accesses.
Use Data Source Name (DSN) | Yes, No | Select whether or not to use DSN to connect to the database. Default is No, and a server name (also known as DSN-less) connection will be used. For a DSN-less connection, you must fill in Server name, Database name, and Port Number.
Server name Refer to the requirements of your Type the DB2 database server name.
database
This option is required if Use Data Source Name (DSN) is
set to No.
Database name Refer to the requirements of your Type the name of the database defined in DB2.
database
This option is required if Use Data Source Name (DSN) is
set to No.
Port Number Five digit integer Type the port number to connect to this database.
Default: 50000 This option is required if Use Data Source Name (DSN) is
set to No.
Data Source Name
  Possible values: Refer to the requirements of your database.
  Type the data source name defined in DB2 for connecting to your database.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which SAP Information Steward accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Bulk Loading
  Possible values: Yes or No (default: No)
  Select Yes to enable bulk loading for failed data. Bulk loading could improve the processing speed when the software searches for failed data in large database files.
  Note: This parameter is only available when the purpose of the connection is for data that failed rules.

Rows per commit
  Possible values: Positive integer (default: 1000)
  For bulk loading. The number of rows processed before a commit. Setting the rows per commit to a larger number of rows may increase performance.

Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
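For a DSN-less connection, the three required options above (Server name, Database name, Port Number) are what the database client ultimately needs. As a rough illustration, not part of Information Steward, the standard IBM DB2 CLI/ODBC keywords can be assembled like this (verify the keyword names against your DB2 client documentation):

```python
def db2_dsnless_connect_string(server, database, port=50000, uid="", pwd=""):
    """Assemble an IBM DB2 CLI/ODBC keyword string for a DSN-less
    (server name) connection, mirroring the three required options
    above: Server name, Database name, and Port Number.  The default
    port 50000 matches the table's documented default."""
    return (
        f"DATABASE={database};HOSTNAME={server};PORT={port};"
        f"PROTOCOL=TCPIP;UID={uid};PWD={pwd};"
    )
```

A DSN connection, by contrast, replaces all of these keywords with a single catalogued data source name.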
Table 34: Informix connection parameters

Database version
  Possible values: Informix IDS <version number>
  Select the version of your Informix client. This is the version of Informix that this profile connection accesses.

Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server name
  Possible values: Refer to the requirements of your database.
  Type the Informix database server name. This option is required if Use Data Source Name (DSN) is set to No.

Database name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in Informix. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Four-digit integer (default: 1526)
  Type the port number to connect to this database. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Type the Data Source Name defined in the ODBC Administrator. This option is required if Use Data Source Name (DSN) is set to Yes.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which SAP Information Steward accesses the database.
Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.

Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
To use Microsoft SQL Server as a profile source when SAP Information Steward is running on a UNIX platform,
you must use an ODBC driver, such as the DataDirect ODBC driver.
For more information about how to obtain the driver, see the Product Availability Matrix at http://service.sap.com/PAM.
Table 35: Microsoft SQL Server connection parameters

Database version
  Possible values: Microsoft SQL Server <version number>
  Select the version of your SQL Server client. This is the version of SQL Server that this profile source accesses.

Server Name
  Possible values: Computer name, fully qualified domain name, or IP address
  Enter the name of the machine where the SQL Server instance is located.
Database Name
  Possible values: Refer to the requirements of your database.
  Enter the name of the database to which the profiler connects.

Windows Authentication
  Possible values: Yes or No (default: No)
  Indicate whether or not Windows Authentication is used. The default is No, which means Microsoft SQL Server Authentication is used.

User Name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which Information Steward accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.

Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 36: MySQL connection parameters

Database version
  Possible values: MySQL <version number>
  Select the version of your MySQL client. This is the version of MySQL that this profile connection accesses.
Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server name
  Possible values: Refer to the requirements of your database.
  Type the MySQL database server name. This option is required if Use Data Source Name (DSN) is set to No.

Database name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in MySQL. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Refer to the requirements of your database.
  Type the port number to connect to this database. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 37: Netezza connection parameters

Database version
  Possible values: Netezza NPS <version number>
  Select the version of your Netezza client. This is the version of Netezza that this profile connection accesses.

Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server Name
  Possible values: Refer to the requirements of your database.
  Type the Netezza database server name. This option is required if Use Data Source Name (DSN) is set to No.

Database Name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in Netezza. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Refer to the requirements of your database.
  Enter the number of the database port. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.
Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.

Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 38: ODBC connection parameters

Data Source Name
  Possible values: Refer to the requirements of your database.
  Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.

Additional information
  Possible values: Alphanumeric characters and underscores, or blank
  Enter information for any additional parameters that the data source supports (parameters that the data source's ODBC driver and database support). Use the format: <parameter1=value1; parameter2=value2>

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.
VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.

Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
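The `<parameter1=value1; parameter2=value2>` format for the Additional information field can be produced programmatically. The sketch below is only an illustration; the parameter names in the usage example are hypothetical and must be ones your ODBC driver actually supports:

```python
def format_additional_info(params):
    """Render extra driver parameters in the
    'parameter1=value1; parameter2=value2' format that the
    Additional information field expects.  The caller supplies an
    ordered mapping of parameter names to values."""
    return "; ".join(f"{key}={value}" for key, value in params.items())

# Hypothetical driver parameters, purely for illustration:
extra = format_additional_info({"QueryTimeout": "60", "ApplicationName": "IS"})
```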
Table 39: Oracle connection parameters

Database version
  Possible values: Oracle <version number>
  Select the version of your Oracle client. This is the version of Oracle that this profile connection accesses.

Use TNS Name
  Possible values: Yes or No
  Select whether or not to use TNS to connect to the database.

Server Name
  Possible values: Computer name, fully qualified domain name, or IP address
  Enter the name of the machine where the Oracle Server instance is located. This option is required if Use TNS Name is set to No.

Instance Name
  Possible values: Refer to the requirements of your database.
  Enter the System ID for the Oracle database. This option is required if Use TNS Name is set to No.
Port Number
  Possible values: Four-digit integer (default: 1521)
  Enter the port number to connect to this Oracle Server. This option is required if Use TNS Name is set to No.

Database Connection Name
  Possible values: Refer to the requirements of your database.
  Enter an existing Oracle connection through which the software accesses sources defined in this profile connection.

User Name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Bulk Loading
  Possible values: Yes or No (default: No)
  Select Yes to enable bulk loading for failed data. Bulk loading could improve the processing speed when the software searches for failed data in large database files.
  Note: This parameter is only available when the purpose of the connection is for data that failed rules.

Rows per commit
  Possible values: Positive integer (default: 1000)
  For bulk loading. The number of rows processed before a commit. Setting the rows per commit to a larger number of rows may increase performance.

Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
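The Rows per commit trade-off can be illustrated with a small batching sketch (not Information Steward code): fewer, larger commits generally mean higher throughput, at the cost of more memory per batch and more rework if a batch fails.

```python
def write_in_batches(rows, write_batch, rows_per_commit=1000):
    """Accumulate rows and flush ('commit') every rows_per_commit
    records, mirroring the behavior the Rows per commit option
    controls.  Returns the number of commits performed."""
    batch, commits = [], 0
    for row in rows:
        batch.append(row)
        if len(batch) >= rows_per_commit:
            write_batch(batch)   # one commit per full batch
            commits += 1
            batch = []
    if batch:                    # flush the final partial batch
        write_batch(batch)
        commits += 1
    return commits
```

With 2,500 rows and the default of 1,000 rows per commit, this performs three commits; raising the batch size to 2,500 would reduce that to one.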
7.3.1.8 SAP HANA connection parameters
Table 40: SAP HANA connection parameters

Database version
  Possible values: HANA <version number>
  Select the version of your SAP HANA client. This is the version of SAP HANA that this Data Insight connection accesses.

Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server name
  Possible values: Refer to the requirements of your database.
  Type the SAP HANA database server name. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Five-digit integer (default: 30015)
  Enter the port number to connect to this SAP HANA server. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Type the data source name defined in SAP HANA for connecting to your database.

Bulk Loading
  Possible values: Yes or No (default: No)
  Select Yes to enable bulk loading for failed data. Bulk loading could improve the processing speed when the software searches for failed data in large database files. Ensure that the user account has SAVEPOINT ADMIN privileges.
  Note: This parameter is only available when the purpose of the connection is for data that failed rules.

Rows per commit
  Possible values: Positive integer (default: 1000)
  For bulk loading. The number of rows processed before a commit. Setting the rows per commit to a larger number of rows may increase performance.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which SAP Information Steward accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.
Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
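The default port 30015 follows SAP HANA's usual 3<instance number>15 pattern for the SQL port of a single-container system: instance 00 gives 30015. A small helper, offered as an illustration only (multi-container systems use different port rules), makes the arithmetic explicit:

```python
def hana_sql_port(instance_number):
    """Derive the default SQL port for a single-container SAP HANA
    system: 3<NN>15, where NN is the two-digit instance number.
    The documented default of 30015 corresponds to instance 00."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be 00-99")
    return 30000 + instance_number * 100 + 15
```

For example, instance 01 yields port 30115.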
Table 41: SQL Anywhere connection parameters

Database version
  Possible values: SQL Anywhere <version number>
  Select the version of your SQL Anywhere client. This is the version of SQL Anywhere that this profile connection accesses.
Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server name
  Possible values: Refer to the requirements of your database.
  Type the SQL Anywhere database server name. This option is required if Use Data Source Name (DSN) is set to No.

Database name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in SQL Anywhere.

Port Number
  Possible values: Refer to the requirements of your database.
  Type the port number to connect to this database. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.

User name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR or Do not import (default: Import as VARCHAR)
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 42: SAP ASE connection parameters

Database version
  Possible values: SAP ASE <version number>
  Select the version of your SAP ASE client. This is the version of SAP ASE (Sybase) that this profile connection accesses.

Server Name
  Possible values: Computer name
  Enter the name of the computer where the SAP ASE instance is located.

Database Name
  Possible values: Refer to the requirements of your database.
  Enter the name of the database to which the profiler connects.

User Name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 43: Sybase IQ connection parameters

Database version
  Possible values: Currently supported versions
  Select the version of your Sybase IQ client. This is the version of Sybase IQ that this datastore accesses.

Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Host Name
  Possible values: Computer name or IP address
  Type the computer name or IP address.

Database Name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in Sybase IQ. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Four-digit integer (default: 2638)
  Type the number of the database port. This option is required if Use Data Source Name (DSN) is set to No.

Server name
  Possible values: Refer to the requirements of your database.
  Type the Sybase IQ database server name. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Select or type the Data Source Name defined in the ODBC Administrator for connecting to your database.

User Name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.
Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.

Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
Table 44: Teradata connection parameters

Database version
  Possible values: Teradata <version number>
  Select the version of your Teradata client. This is the version of Teradata that this datastore accesses.

Use Data Source Name (DSN)
  Possible values: Yes or No
  Select whether or not to use a DSN to connect to the database.

Server name
  Possible values: Refer to the requirements of your database.
  Type the Teradata database server name. This option is required if Use Data Source Name (DSN) is set to No.
Database name
  Possible values: Refer to the requirements of your database.
  Type the name of the database defined in Teradata. This option is required if Use Data Source Name (DSN) is set to No.

Port Number
  Possible values: Four-digit integer (default: 8888)
  Type the port number to connect to this database. This option is required if Use Data Source Name (DSN) is set to No.

Data Source Name
  Possible values: Refer to the requirements of your database.
  Type the Data Source Name defined in the ODBC Administrator for connecting to your database.

Method to Read Data
  Possible values: ODBC or Parallel Transporter Export Operator
  Select the method to use to read data.
  Note: This option is only available for data profiling and cannot be changed after the connection is created.

Log Directory
  Possible values: Directory path
  The directory in which to write log files.

Teradata Director Program ID
  Possible values: Alphanumeric characters, underscores, and punctuation
  Identifies the name of the Teradata database to load. Required only for Parallel Transporter Export Operator.

User Name
  Possible values: The value is specific to the database server and language.
  Enter the user name of the account through which the software accesses the database.

Password
  Possible values: The value is specific to the database server and language.
  Enter the user's password.

Unsupported Data Types
  Possible values: Import as VARCHAR
  Select the action to take to handle unsupported data types.

VARCHAR Size
  Possible values: Any positive integer from 1 to 4000 (default: 255)
  Maximum size of the imported VARCHAR data.

Language
  Possible values: Select the correct language for your database server.
  Language abbreviation specified in the ISO 639-2/T standard.

Client Code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database client.

Server code page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.
  Code page of the database server.
Default is No.
Note: This parameter is only available when the purpose of the connection is for data that failed rules.
You must define a connection to any application that contains data you want to profile so that you can determine the quality and structure of the data.
1. Log on to the Central Management Console (CMC) with a user name that has the following authorizations:
○ Authorization to read data on the source application system. For authorizations to SAP Applications, see
“SAP user authorizations” in the SAP Data Services Supplement for SAP.
○ Either belongs to the Data Insight Administrator group or has the Create right on the Connections node.
2. At the CMC home page, click Information Steward.
3. Select the Connections node in the navigation tree on the left.
Option: Connection Name
Description: Name that you want to use for this Data Insight source.
○ Maximum length is 64 characters
○ Can be multi-byte
○ Case insensitive
○ Can include underscores and spaces
○ Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
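The naming rules above are mechanical enough to check before opening the dialog. The following validator is an illustrative sketch of those rules, not an SAP API:

```python
# Special characters the Create Connection dialog rejects, per the
# documented list above.
FORBIDDEN = set('?!@#$%^&*()-+={}[]:";\'/\\|.,`~')

def is_valid_connection_name(name):
    """Check a proposed connection name against the documented rules:
    non-empty, at most 64 characters, and none of the forbidden
    special characters.  Underscores and spaces are allowed; the
    check is length-based, so multi-byte names are handled too."""
    if not name or len(name) > 64:
        return False
    return not any(ch in FORBIDDEN for ch in name)
```

Note that case-insensitivity matters for uniqueness rather than validity, so it is not checked here.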
6. In the Connection Type drop-down list, select the Application connection value.
7. In the Application Type drop-down list, select one of the following applications that contains the data you want
to profile:
For information about the specific data components that Information Steward can profile, see the Product Availability Matrix at http://service.sap.com/PAM.
8. Enter the relevant connection information for the application type.
9. Complete the parameters based on the descriptions found in SAP Applications connection parameters [page
112]. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
10. Click Save.
The newly configured connection appears in the list of connections on the right of the Information Steward
page.
After you create a connection, you must authorize users to perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
The SAP NetWeaver Business Warehouse connection has the same options as the SAP Applications connection
type.
When you set up a connection to an SAP application, you must complete the related parameters in the Create
Connection dialog box in the CMC. To start setting up the connection, follow the steps in Defining a Data Insight
connection to an application [page 111].
The table below contains descriptions of the parameters that apply to this type of connection.
Note
The Data Insight module of Information Steward uses the same connections as Data Services to connect to
SAP BW or SAP ECC.
Table 45: SAP Applications connection parameters

Server Name
  Possible values: Computer name, fully qualified domain name, or IP address
  Name of the remote SAP application computer (host) to which the software connects.

Client Number
  Possible values: 000-999 (default: 800)
  The three-digit SAP client number.

System Number
  Possible values: 00-99 (default: 00)
  The two-digit SAP system number.

User Name
  Possible values: Alphanumeric characters and underscores
  Enter the name of the account through which the software accesses the SAP application server.

Application Language
  Possible values: E - English, G - German, F - French, J - Japanese
  Select the login language from the drop-down list. You can also enter a customized SAP language code in this option. For example, you can type S for Spanish or I for Italian.

Code Page
  Possible values: See "Supported locales and encodings" in the SAP Data Services Reference Guide.

ABAP execution option
  Possible values: Execute preloaded or Generate and execute
  Determines the source for the ABAP program generation.
  Execute preloaded: Runs a previously generated and uploaded ABAP program. Typically, this is used during the production stage of a project.
  Note: When you choose Execute preloaded, you must export the ABAP programs used by Information Steward to the SAP system in which the Data Insight view resides so that they can be loaded into the SAP application.
Job class priority
  Possible values: High, Medium, Low (default: Low)
  Sets the order in which jobs from this connection are processed in the SAP system. Consider your system and the importance of jobs from this connection compared to jobs from other defined connections.

Working directory on SAP server
  Possible values: Directory path
  A directory on the SAP application server where the software can write intermediate files. This directory also stores the transport file used by the FTP and shared-directory data transfer methods.

Generated ABAP directory
  Possible values: Directory path
  You must specify this directory if any of the following options are selected:

Security Profile
  Possible values: Refer to the requirements of the application.
  Specify the security profile for Information Steward to read data from the SAP application.

Number of connection retries
  Possible values: Positive integer (default: 3)
  The number of times the software tries to establish a connection with the SAP application server.

Interval between retries (sec)
  Possible values: Positive integer (default: 10)
  The time delay in seconds between connection retries.
Data transfer method
  Possible values: RFC, Shared directory, or FTP
  Define how to retrieve data from the SAP application server to the SAP Data Services server.
  RFC: Use to stream data from the source SAP system directly to the Data Services data flow process using RFC. For secure data transfer, configure SNC authentication with the required SNC quality of protection (in the connection Authentication options). If you select this data transfer method, RFC destination appears.
  Note: If your Data Insight view includes two or more large SAP tables, you can specify the ABAP data transfer method in the View Editor of the Information Steward web application. For reading large data sets, the ABAP method offers better performance.

RFC destination
  Possible values: SAPDS or <Destination name>
  For the RFC data transfer method, enter an RFC destination. You can keep the default name of SAPDS and create a destination of the same name in the source SAP system, or you can enter the name of an existing destination.

Application Shared Directory
  Possible values: Directory path
  If you selected the Shared directory data transfer method, specify the SAP application server directory that stores data extracted by SAP Data Services, and to which both Data Services and the SAP application server have direct access. After the extraction is completed, Information Steward picks up the file for further processing.

FTP
  The following options are visible if you selected the FTP data transfer method.

Local directory
  Possible values: Directory path
  If you selected the FTP data transfer method, select a client-side directory to which data from the SAP application server downloads.

FTP relative path
  Possible values: Directory path
  Indicate the path from the FTP root directory to the SAP server's working directory. When you select FTP, this directory is required.
Administrator Guide
Data Insight Administration PUBLIC 115
SAP Applications option Possible values Description
FTP host name Computer (host) name, fully quali Must be defined to use FTP.
FTP login user name Alphanumeric characters and un Must be defined to use FTP.
derscores
FTP login password Alphanumeric characters and un Enter the FTP password.
derscores, or blank
This topic contains descriptions for the two execute options for generating an ABAP program for Data Insight
views with SAP tables.
Administrators choose the source of the ABAP program files while setting up the connection to the SAP table
source in the Central Management Console (CMC). There are two options for the Execute option parameter in the
connection setup dialog box that determine the source for the ABAP program.
● Execute preloaded: runs ABAP program files that are already loaded on the SAP Business Suite or SAP NetWeaver Business Warehouse application.
The administrator's selection for the Execute option parameter may depend on factors such as the environment
(development, testing, production) and the organization's security policies. For more information about ABAP
program files, see the SAP Data Services Supplement for SAP.
Related Information
7.3.4 Defining a Data Insight connection to a file
Before you can profile a flat file or Excel spreadsheet file, you must define a connection to the location of the file that contains the data whose quality and structure you want to examine.
Ensure that the flat file or Excel spreadsheet file meets the following requirements:
● The users running the following services must have permissions on the shared directory where the file
resides:
○ Web Application Server (for example, Tomcat)
○ Data Services
○ Server Intelligence Agent
● The file must be in a shared location that is accessible to the Data Services Job Server and the View Data Server, which are components of SAP Data Services.
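Because several services need read access to the shared file, a quick pre-check from the Job Server host can prevent a failed profiling run later. The following is an illustrative sketch only, not part of Information Steward; the file path is a placeholder.

```shell
#!/bin/sh
# Illustrative pre-check: confirm the shared file exists and is readable
# by the account that runs the service. The path below is a placeholder.
FILE="/shared/profiling/customers.csv"
if [ -r "$FILE" ]; then
    echo "OK: $FILE is readable"
else
    echo "FAIL: $FILE is missing or not readable" >&2
fi
```

Run the same check under each service account (for example, the Data Services and web application server users) to confirm that all of them can reach the directory.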
1. Open the Information Steward area of the Central Management Console (CMC).
Table 47:
Parameter Description
Connection Name Name that you want to use for this data source. Name requirements include:
○ Maximum length is 64 characters
○ Can be multi-byte
○ Case insensitive
○ Can include underscores and spaces
○ Cannot include other special characters such as ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
4. In the Connection Type drop list, select File connection. A file connection type includes both flat files and Excel
spreadsheet files.
5. In the Directory path box, enter the path for the file.
6. Click Save.
After you create a connection, you must authorize users to perform tasks such as viewing the data, running profile tasks, and running validation rules on the data. For more information, see the topic User rights in Data Insight.
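The connection name rules above (maximum of 64 characters; letters, digits, underscores, and spaces; no other special characters) can be sanity-checked before you create the connection. This is an illustrative sketch only; it covers the ASCII case and ignores multi-byte names, which the product also accepts.

```shell
#!/bin/sh
# Illustrative check of the connection-name rules (ASCII only; the product
# itself also accepts multi-byte names).
is_valid_name() {
    printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_ ]{1,64}$'
}
is_valid_name "Customer_Data 2024" && echo "valid"
is_valid_name "bad/name?" || echo "invalid"
```

A check like this is useful in scripts that create many connections, since the name cannot be changed after the connection is saved.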
Related Information
7.3.5 Displaying and editing Data Insight connection
parameters
Situations when you might want to view or change Data Insight connection parameters include:
● You need to view the connection parameters on a development or test system so that you can recreate the
Data Insight connection when you move to a production system.
● You have several source systems and you want to ensure that you are connecting to the appropriate source.
1. Ensure that you have Edit rights on the connection or you are a member of the Data Insight Administrator
group.
2. Go to the Information Steward area of the CMC.
3. Expand the Data Insight node in the Tree panel, and select Connections.
4. In the list of connections on the right, select the name of your connection, click Action in the top menu tool
bar, and select Properties.
5. Type your changes in the following fields:
○ Description
○ Database version
○ User Name
○ Password
○ Language
○ Client Code Page
○ Server Code Page
6. If you want to verify that Information Steward can connect successfully before you save this Data Insight
connection, click Test connection.
7. Click Save.
The edited description appears in the list of connections on the right pane of the Information Steward page.
The following table shows the Data Insight objects that you can delete from the Central Management Console
(CMC).
For information about dependencies when deleting connections and projects on the Central Management Console
(CMC), see “Deleting a connection” and “Deleting a project” in the Administrator Guide.
Note
You cannot delete a connection if a table or file is being used in a project. You must remove the table or file
from all projects on Information Steward before you can delete the connection or project in the CMC.
Table 48: Delete object dependents and dependencies
Data Insight object to delete from CMC | Object dependencies that prevent deletion because they are used in another project | Dependent objects that will also be deleted
Project | A warning message appears, but you can click OK to delete all dependent objects. | ● Referenced project
● Rule
● View
● Data cleansing solutions (Published data cleansing solutions are still available in Data Services Workbench.)
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Connections in Information Steward.
2. At the CMC home page, click Information Steward.
3. Click the Connections node in the Tree panel.
4. From the list in the right pane, select the name of the connection and click Manage > Delete.
5. To confirm that you want to delete this connection, click OK in the warning pop-up window.
6. If the following message appears, you must delete each table from the Workspace of each Data Insight project
listed in the message.
Administrators determine the connection type on which their users view and export failed data, and they set the
specific user permissions for each user in that connection.
Choose to base the failed data on the source or the failed data connection. The default setting is the source data
connection.
1. In the Central Management Console (CMC) open Applications and select Information Steward Application.
2. Select Configure Application in the navigation pane at left.
3. Scroll to Permissions in the Information Steward Settings page at right.
4. Set the connection option in Connection on which to base failed data: Source Connection or Failed Data
Connection.
5. Click Save.
6. Open the Information Steward page and select Connections under the Information Steward node at left.
7. Select the applicable connection. For example, if you chose the failed data connection for the Access failed
data based on option, select the applicable connection that has the purpose For data that failed rules.
○ Export Data
○ View Data
○ View Sample Data
14. Click OK.
15. Verify your choices in the Advanced tab and click OK.
Related Information
A project is a collaborative workspace for data stewards and data analysts to assess and monitor the data quality
of a specific domain and for a specific purpose (such as customer quality assessment, sales system migration,
and master data quality monitoring).
Create a Data Insight project in the Central Management Console (CMC) to allow your users to define the project's
tasks to profile and validate data in SAP Information Steward.
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that has the
Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The Information Steward page opens with the Information Steward node selected in the Tree panel.
3. Expand the Data Insight node in the Tree panel, and select Projects.
Option Description
Name Name that you want to use for this profile project.
○ Maximum length is 64 characters
○ Can be multi-byte
○ Case insensitive
○ Can include underscores and spaces
○ Cannot include other special characters: ?!@#$
%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the project.
6. Click Save.
Note
After you save the project, you cannot change its name.
The new project appears in the list of projects on the right pane of the Information Steward page.
After you create a project, you must grant users rights to perform actions such as creating profile and rule tasks, running these tasks, or creating scorecards.
Related Information
To edit a Data Insight project, you must have Edit rights on the project or be a member of the Data Insight
Administrator group.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Edit right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. In the list of projects on the right, select the name of your project and click Action > Properties.
5. Type the changes you want in the Description field.
6. Click Save to see the edited description in the list of projects on the right pane of the Information Steward
page.
7.4.3 Enterprise project
SAP Information Steward provides a special project named Enterprise, which enables the display of Data Insight
profiling results and scores in the Metadata Management module in impact and lineage diagrams and in the
details for tables and columns.
You view profile results and data quality scores at the table and column level for tables under the Data Insight
Connection category in the Metadata Management tab. To enable the display of profiling and rule results on the
impact and lineage diagrams, you must run the profiling and rule tasks in the Enterprise Project in Data Insight.
For details, see topics “Running a profile task in the Enterprise Project” and “Running a rule task in the Enterprise
Project” in the User Guide.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. From the list in the right pane, select the name of the project and click Manage > Delete.
5. To confirm that you want to delete this project, click OK in the warning pop-up window.
Caution
When you delete a project, you delete all of its contents, including unapproved rules, tasks, scorecards, profile results, sample data, views, and so forth.
Create Data Insight tasks to run immediately in SAP Information Steward or create a schedule for a recurring task
in the Central Management Console (CMC).
Create and run tasks for the following Data Insight task types:
● Profiles
● Rules
● Calculate Scorecard utility
Additionally, export task execution commands to run Data Insight tasks using an external scheduler.
Related Information
Schedule tasks to occur once, on a recurring schedule, or based on an event in the Central Management Console
(CMC).
To perform these steps, you must be an administrator, or a user with the applicable permissions.
2. Expand Data Insight Projects in the left pane, and select the name of your project.
Node | Description
Parameters | Set both database and file log levels. Enter JVM arguments and any additional runtime arguments.
Instance Title | Keep the original task name, or enter a different name to specify this particular schedule.
Recurrence | Set the interval of time in which to run the task. For example, select to run the task each hour, once a week, and so on.
5. Click Schedule.
Related Information
To use an external scheduler to run tasks, export generated commands from the Central Management Console
(CMC) or execute a command utility.
You must be an administrator or a user with scheduling privileges with the right to track activities to export tasks
from the CMC.
Some users prefer to use an external scheduler to run all tasks, including their Information Steward tasks. To use
an external scheduler, users export task commands and copy them to a batch file that runs their external
scheduler.
There are two ways to export tasks from the CMC for an external scheduler: generate and export the task execution commands in the CMC, or run the ISTaskExecuteCommand utility.
You can export task execution commands for the following objects:
● Profile tasks
● Rule tasks
● All profile and rule tasks in a project
● Scorecard calculation
Additionally, you can view task history and download the log file from either Information Steward or the CMC.
Related Information
Exporting execution commands for profile and rule tasks [page 125]
Exporting execution commands for calculate scorecard utility [page 126]
Exporting execution commands for tasks in a project [page 127]
ISTaskExecuteCommand utility for Windows and Linux [page 128]
Include scores in execution commands [page 132]
7.5.2.1 Exporting execution commands for profile and rule
tasks
Use the Central Management Console (CMC) to generate execution commands for profile and rule tasks. Then
copy the commands to a batch file to run tasks with an external scheduler.
To perform these steps, you must be a user with administrator rights, or a user with scheduling privileges and
rights to track activities.
The Information Steward page opens with the Information Steward node selected in the pane at left.
2. Expand Data Insight Projects in the tree pane and then select the project name.
The applicable command line parameters appear in the Command Line Parameters box. For example, here is
a command without JVM arguments or additional arguments:
Sample Code
7. Replace the following variables in the command with your information:
○ yourusername
○ yourpassword
○ yourserver
8. Copy and paste the completed command into the applicable batch file.
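The substitution in steps 7 and 8 can also be scripted. The sketch below is illustrative only: the real generated command comes from the CMC and its exact flags depend on your installation, so the template here is a truncated placeholder, and the user name, password, and CMS host are invented example values.

```shell
#!/bin/sh
# Illustrative only: substitute your environment values into an exported
# command template before pasting the result into the scheduler batch file.
# The template is a placeholder, not documented command syntax.
TEMPLATE='ISTaskExecuteCommand.bat ... yourusername ... yourpassword ... yourserver ...'
CMD=$(printf '%s' "$TEMPLATE" | sed \
    -e 's/yourusername/jsmith/' \
    -e 's/yourpassword/MyP@ss/' \
    -e 's/yourserver/cms-host:6400/')
printf '%s\n' "$CMD"
```

Scripting the substitution keeps credentials out of the copy-and-paste step and makes it easy to regenerate the batch file when a password changes.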
Related Information
7.5.2.2 Exporting execution commands for calculate
scorecard utility
Use the Central Management Console (CMC) to generate execution commands for the Calculate Scorecard
utility. Then copy the commands to a batch file to run tasks in an external scheduler.
To perform these steps, you must be a user with administrator rights, or a user with scheduling privileges and
rights to track activities.
The applicable command line parameters appear in the Command Line Parameters box. For example, here is
a command without JVM arguments or additional arguments:
Sample Code
9. Replace the following variables in the command with your information:
○ yourusername
○ yourpassword
○ yourserver
10. Copy and paste the generated command into your batch file.
Related Information
7.5.2.3 Exporting execution commands for tasks in a project
Use the Central Management Console (CMC) to generate execution commands for all tasks in a project. Then
copy the commands to a batch file to run tasks in an external scheduler.
To perform these steps, you must be a user with administrator rights, or a user with scheduling privileges and
rights to track activities.
The Information Steward page opens with the Information Steward node selected in the pane at left.
The software assigns a -configurationTaskObject identification number to each task in the project, and
lists them in the Command Line Parameters text box. For example, here is a list of commands for each task in
a project without JVM arguments or additional arguments:
Sample Code
8. Replace the following variables in each command with your information:
○ yourusername
○ yourpassword
○ yourserver
9. Copy and paste the generated commands into your batch file.
Related Information
7.5.2.4 ISTaskExecuteCommand utility for Windows and
Linux
Run the task execute command utility for Windows or Linux to schedule tasks on an external scheduler.
When you use the ISTaskExecuteCommand, you do not need to generate a command in the Central Management
Console (CMC) to obtain an internal task identification number. ISTaskExecuteCommand requires only the project and task names instead of generated task identification numbers. For example:
Sample Code
You can reuse the same ISTaskExecuteCommand with the task name and project name for each run of that task.
You can even reuse the command with that task name and project name when you move from one environment to
another because the project name and task name do not change from one environment to the next.
There is one exception. When you export all tasks in a project, you use the -configurationTaskObject
identification number for each task in the ISTaskExecuteCommand utility. Therefore, if your task object
identification numbers have changed, which could happen when you move to a new environment, regenerate the
command in the CMC. You can either use the newly regenerated command with the new identification numbers,
or update your existing ISTaskExecuteCommand utility for the project with the new identification numbers. The
advantages and disadvantages for using the generated command or the utility are as follows:
● If you use the newly generated command from the CMC, you re-enter all of your information, such as user
name and password, for each task. You could copy and paste this information to make this process less
tedious.
● If you use your existing ISTaskExecuteCommand, you replace all task identification numbers with the newly
generated identification numbers from the CMC. If none of your information has changed, such as user name
and password, you can reuse that information.
The following examples use the ISTaskExecuteCommand.bat utility for Windows. Substitute the .sh version of the command if you are working in a Linux environment.
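Because the project and task names stay stable across environments, the same name-based parameters can be reused against each CMS. The sketch below only prints the command for each environment to show this; the host names are placeholders, and -configurationTaskObjectName and -configurationProjectObjectName are the parameters described in this section.

```shell
#!/bin/sh
# Illustrative wrapper: the name-based part of the command never changes
# between environments; only the target CMS differs. Host names and the
# utility path are placeholders.
TASK="Customer_Profile"
PROJECT="CRM_Quality"
for CMS in dev-cms:6400 prod-cms:6400; do
    echo "[$CMS] ISTaskExecuteCommand.sh -configurationTaskObjectName '$TASK' -configurationProjectObjectName '$PROJECT'"
done
```

This is the reason to prefer the name-based utility over the generated -configurationTaskObject identification numbers when you expect to promote schedules from development to production.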
Example
Export parameters to run a given named task
Sample Code
Example
Export parameters for calculate scorecard utility
Sample Code
Example
Export parameters for a project
Sample Code
Example
Export parameters with JVM parameters and additional arguments
Sample Code
Related Information
The generated export execution command (generated command) from the Central Management Console (CMC)
contains common parameters that you can also use when you create the ISTaskExecuteCommand.bat or .sh
utility (command utility).
The following table contains information about the common parameters contained in either the generated
command or the utility.
Table 50:
Parameter | Description | Use
-configurationTaskObjectName | Name of the task being exported | Required if you do not include the -configurationTaskObject identification number parameter
-configurationProjectObjectName | Name of the project being exported | Required if you do not include the -configurationTaskObject identification number parameter
Caution
The task fails if the -configurationTaskObject identification number has changed since the last time you used the ISTaskExecuteCommand, even when the task name and project name have not changed.
Related Information
Include scores in execution commands [page 132]
When you export task commands to run rule tasks or calculate scorecards with an external scheduler, you can set
up your commands to include scores.
The software returns a score when the task completes successfully and the task involves one table or view. The
score is in the form of an exit code.
The software translates return scores to a range of large negative values to ensure that there is no overlap
between existing error codes and return scores.
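The idea of a non-overlapping negative band can be sketched as below. The base offset is an assumption purely for illustration; the actual offset Information Steward uses is not documented in this section.

```shell
#!/bin/sh
# Illustrative sketch of a non-overlapping mapping (the real offset used by
# the software may differ): a 0-100 score is shifted into a large negative
# band so it can never collide with ordinary positive error codes.
BASE=-1000000   # assumed offset, for illustration only
score_to_code() { echo $(( BASE - $1 )); }
code_to_score() { echo $(( BASE - $1 )); }

code=$(score_to_code 87)    # -1000087
echo "exit code: $code"
echo "recovered score: $(code_to_score "$code")"
```

An external scheduler can apply the inverse mapping to an exit code in that band to recover the score, and treat any value outside the band as an ordinary error code.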
Note
Error codes take precedence over return codes. The software considers the highest error code to be the most
severe.
The software returns a score if you set the -returnScore parameter to true in -programArgs. The following
example shows a generated command with the return score parameter set to true:
Sample Code
The software does not return a score under the following circumstances:
Include a score parameter when you generate code to export a task to an external scheduler.
To perform these steps, you must be a user with administrator rights, or a user with scheduling privileges and
rights to track activities.
The Information Steward page opens with the Information Steward node selected in the pane at left.
5. Optional. Enter applicable JVM arguments in the JVM Arguments field.
6. Enter the following parameter and value in the Additional Arguments text entry field: -returnScore true.
7. Click Generate.
Sample Code
Related Information
Include a score parameter when you generate code to export a calculate scorecard task to an external scheduler.
To perform these steps, you must be a user with administrator rights, or a user with scheduling privileges and
rights to track activities.
Sample Code
Related Information
When you schedule an integrator source or an SAP Information Steward utility, you can choose the frequency to
run it in the Recurrence option. The following table describes each recurrence option and shows the additional
relevant values that you must select for each recurrence option.
Recurrence Option | Description
Once | The utility will run once only. Select the values for Start Date/Time and End Date/Time.
Hourly | The utility will run every N hours and X minutes. Select the values for Hour(N), Minute(X), Start Date/Time, and End Date/Time.
Daily | The utility will run once every N days. Select the value for Days(N).
Weekly | The utility will run once every week on the selected days. Select the days of the week and the values for Start Date/Time and End Date/Time.
Monthly | The utility will run once every N months. Select the values for Month(N), Start Date/Time, and End Date/Time.
Nth Day of Month | The utility will run on the Nth day of each month. Select the values for Day(N), Start Date/Time, and End Date/Time.
1st Monday of Month | The utility will run on the first Monday of each month. Select the values for Start Date/Time and End Date/Time.
Last Day of Month | The utility will run on the last day of each month. Select the values for Start Date/Time and End Date/Time.
X Day of Nth Week of the Month | The utility will run on the X day of the Nth week of each month. Select the values for Week(N), Day(X), Start Date/Time, and End Date/Time.
Calendar | The utility will run on the days you specified as "run" days on a calendar you have created in the Calendars management area of the CMC.
The Data Insight Administrator or someone in the Administrator group can configure the server to provide email
notifications in the Central Management Console (CMC). A Data Insight user must have already created the
profile task or rule task in Information Steward (for more information, see the User Guide).
When a task is scheduled to run via the Central Management Console (CMC), you can be notified whether the task
completed successfully or with errors. A profiling task is considered to be in error when profiling any of the tables
fails, either because of an infrastructure error or invalid source information (such as an invalid connection, table,
column and so on).
The calculate score task may fail only when a table in the task was unable to generate its score, either because of
an infrastructure error or invalid source information (such as an invalid connection, table, column and so on).
2. Choose Data Insight > Projects > <your project name>, and then highlight the profiling or calculate score task that you want to schedule and be notified of the completion.
Table 52:
Option Description
Use Job Server defaults Select to use the settings already defined in the Job Server.
Set values to be used here The following options override those defined in the Job Server.
Subject Specify the default subject heading used in emails containing system alerts.
Message Specify the default message to include in emails containing system alerts.
5. In the CMC Home window, click Servers. Select the ISJobServer. If more than one is available, configure each
one.
6. Choose Manage > Properties, and then click Destination in the navigation pane.
7. Select Email from the Destination drop-down list and then click Add.
8. To set up a notification server for completed processing, you must enter information into the Domain, Host,
and Port fields. All other fields are optional. The following table describes the fields on the Destination page:
Table 53:
Option Description
Domain (required) Enter the fully qualified domain of the SMTP server.
Port (required) Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication Select Plain or Login if the job server must be authenticated using one of these methods
in order to send email.
User Name Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password Provide the Job Server with the password for the SMTP server.
From Provide the return email address. Users can override this default when they schedule an
object.
Note
It is recommended that you keep the default %SI_EMAIL_ADDRESS% placeholder. If you specify a specific email address or recipient, all system alerts are sent to that address by default.
CC Specify which recipient(s) should receive carbon copies of alerts sent through email.
Subject Specify the default subject heading used in emails containing system alerts.
Message Specify the default message to include in emails containing system alerts.
Add placeholder You can add placeholder variables to the message body using the Add placeholder list.
For example, you can add the report title, author, or the URL for the viewer in which you
want the email recipient to view the report.
9. Click Save.
Related Information
7.5.5 Rule threshold notification
The Data Insight Administrator or someone in the Administrator group can configure the server to provide email
notifications in the Central Management Console (CMC). A Data Insight user must have already created the
profile task or rule task on Information Steward (for more information, see the User Guide).
When you create a rule task, you can also create email notifications that will alert business users when a rule score
meets a specific sensitivity setting. You can send the notifications to an email address, a CMC user, a CMC group,
or any combination of these. You can set up multiple email notifications for a single rule task.
You must configure the notification server before processing the task so that the server has the correct information to send in the notification. Email notifications are sent when the task is complete.
3. Choose Manage > Properties, and then select Destination in the navigation pane.
4. Select Email from the Destination drop-down list and then click Add.
5. To set up a notification server for rules, you must enter information into the Domain, Host, Port and From
fields. All other fields are optional. The following table describes the fields on the Destination page:
Table 54:
Option Description
Domain (required) Enter the fully qualified domain of the SMTP server.
Port (required) Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication Select Plain or Login if the job server must be authenticated using one of these methods
in order to send email.
User Name Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password Provide the Job Server with the password for the SMTP server.
From (required) Provide the return email address. Users can override this default when they schedule an
object.
To Not used in this scenario. The email address specified when creating the task is used.
Add placeholder You can add placeholder variables to the message body using the Add placeholder list.
For example, you can add the report title, author, or the URL for the viewer in which you
want the email recipient to view the report.
6. Click Save.
Related Information
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or Administrator group.
2. At the CMC home page, click Information Steward.
The Information Steward page opens with the Information Steward node selected in the Tree panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
A list of tasks appears in the right panel with the date and time each was last run.
5. To update the Last Run Status and Last Run columns, click the Refresh icon.
6. To view the history of a task, select its name and click Action > History in the top menu tool bar.
The Data Insight Task history pane displays each instance the task was executed, with the status, start time, end time, and duration. The Schedule Status column can contain the following values:
Table 55:
Schedule Status Description
Pending Task is scheduled to run one time. When it actually runs, the status changes to “Running”.
Recurring Task is scheduled to recur. When it actually runs, there will be another instance with status
“Running”.
Note
The Database Log shows a subset of the messages in the log file.
b. To find specific messages in the Database Log window, enter a string in the text box and click Filter.
For example, you might enter error to see if there are any errors.
c. To close the Database Log window, click the X in the upper right corner.
8. To save a copy of a task log:
a. Scroll to the right of the Data Insight Task History page, and click the Download link in the Log File column
in the row of the utility instance you want.
b. Click Save.
c. On the Save As window, browse to the directory where you want to save the log and change the default
file name if you want.
Note
This downloaded log file contains more messages than the Database Log because its default logging level is
set lower.
9. To close the Data Insight Task History page, click the X in the upper right corner.
Related Information
You can pause a recurring schedule for a task when you do not want the task to run at its regularly scheduled time; it does not run again until you resume the schedule.
To pause a schedule:
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that has the
Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The Information Steward page opens with the Information Steward node selected in the Tree panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
5. Select the name of the task from the list on the right panel.
7. Select the task instance that has the value "Recurring" in the Schedule Status column and click Action > Pause in the top menu tool bar.
8. When you are ready to resume the recurring schedule, select the task instance that has the value "Paused" in the Schedule Status column and click Action > Resume in the top menu tool bar.
7.5.8 Deleting failed data for a failed rule task
You can choose to have the software delete conflicting data from the failed data repository under certain
circumstances.
Follow these steps to delete failed data from the Central Management Console (CMC) when the rule task fails because of structural changes to the source data or changes to the rule task setup between runs.
If the option is not active in the drop-down menu, the rule task failed for reasons other than from structural
changes to the source data or changes to the rule task settings. Look at the log file to find out why the task
failed.
An Error Reason message appears describing why the rule task failed, and it includes the failed data error
code.
5. Click Delete Failed Data and Run to continue with the deletion.
The application deletes the conflicting data from the failed data repository. See the User Guide for a list of failed
data error codes and descriptions.
Note
For convenience, set the Clean Up Failed Data utility to clean all of your failed data repositories on a set
schedule. The software default setting is 0 days so that you do not lose any history when you upgrade an
installation.
Change the default values of the runtime parameters when you schedule a task.
When you schedule a profile task, rule task, or an integrator source, you can change the default values of the
runtime parameters in the Parameters page when you schedule the instance.
The following table describes the runtime parameters that are applicable to all metadata integrators, profile tasks,
and rule tasks.
Database Log Level This log is located in the Information Steward repository. You can view this log while a Data Insight task or metadata integrator is running.
The default logging level is Information. Usually you can keep the default logging level. However, if you need to provide more detailed information about your integrator run, you can change the level to Integrator Trace so that the log includes tracing information. For a description of log levels, see Log levels [page 297].
File Log Level A Data Insight task or a metadata integrator creates this log in the SAP installation directory and copies it to the file repository server. You can download this log file after the Data Insight task or metadata integrator run completes.
The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your task or integrator run, you can change the level to Integrator Trace so that the log includes tracing information. For instructions to change log levels, see Metadata Browsing Service and View Data Service logs [page 300].
JVM Arguments Optional Java Virtual Machine (JVM) arguments for the Data Insight task or metadata integrator process (for example, memory settings).
Additional Arguments Optional runtime parameters for the metadata integrator source or Data Insight task. For more information, see Runtime parameters for specific metadata integrators [page 173].
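The log levels above behave as thresholds: a lower (more verbose) level admits more messages, which is why the downloaded file log (default Configuration) typically contains more entries than the database log (default Information). A minimal sketch of that filtering idea; the level names and their ordering are assumed from this guide's descriptions and are not an Information Steward API:

```python
# Illustrative only: level names/ordering assumed for this sketch.
LEVELS = ["None", "Error", "Warning", "Information", "Configuration", "Integrator Trace"]

def visible(message_level: str, configured_level: str) -> bool:
    """A message is kept when its level is at or below the configured threshold."""
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)

# The file log (default Configuration) keeps messages that the database log
# (default Information) drops.
assert visible("Configuration", "Configuration")
assert not visible("Configuration", "Information")
```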
8 Metadata Management Administration
The Metadata Management module of SAP Information Steward collects metadata about objects from different
source systems and stores the metadata in a repository. Source systems include Business Intelligence (SAP
BusinessObjects Enterprise and SAP NetWeaver Business Warehouse), Data Modeling, Data Integration (SAP
Data Services and SAP Data Federation), and Relational Database systems.
When you access Information Steward as a user that belongs to the Metadata Management Administrator group,
you can perform the following tasks in the Central Management Console (CMC):
SAP metadata integrators collect metadata from repository sources that you configure, and they populate the
SAP Information Steward repository with the collected metadata.
When you install Information Steward, you can select the following metadata integrators:
● SAP BusinessObjects Enterprise Metadata Integrator
● SAP NetWeaver Business Warehouse Metadata Integrator
● SAP HANA Metadata Integrator
● SAP ERP Central Component Integrator
● SAP Hadoop Metadata Integrator
● Common Warehouse Model (CWM) Metadata Integrator
● SAP BusinessObjects Data Federator Metadata Integrator
● SAP BusinessObjects Data Services Metadata Integrator
● Relational Database (RDBMS) Metadata Integrator
● MITI — Meta Integration Model Bridge (MIMB) Metadata Integrator
● SAP Sybase PowerDesigner Metadata Integrator
● Excel Metadata Integrator
8.2.1 Configuring sources for SAP BusinessObjects Enterprise Metadata Integrator
This integrator collects metadata for Universes, Crystal Reports, Web Intelligence documents, Dashboard Objects, and Desktop Intelligence documents.
Ensure that you selected the SAP BusinessObjects Enterprise Metadata Integrator when you installed Information
Steward.
To configure the Enterprise Integrator, you must belong to the Metadata Management Administrator group or have the Add Objects right on the Integrator Sources folder.
Table 57:
Option Description
Name Name that you want to use for this metadata integrator source. The maximum length is 128 char
acters.
CMS Server Name Host name of the CMS. This value is required.
Note
The version of SAP BusinessObjects Enterprise installed on the Metadata Management host must match the version of SAP BusinessObjects Enterprise that this CMS manages.
User Name The default value is Administrator. If you want a user other than Administrator to run the Metadata Integrator, change the value to the appropriate name.
Password Password to connect to the CMS server to register and run the Metadata Integrator.
Authentication Method Process that CMS uses to verify the identity of a user who attempts to access the system. See
the Business Intelligence Platform Administrator's Guide for available processes.
If you specify Windows AD, you must set JVM parameters in the Enterprise Information
Management Services server and in the runtime parameters for the Metadata Integrator. For
more information, see the topics Configuring sources with Windows AD authentication and Running a Metadata Integrator with Windows AD authentication (see links in Related Information below).
InfoView Integration User Name of the BI Launchpad (formerly known as InfoView) user that invokes the Information Steward lineage diagrams when View Lineage is selected for a document in the Documents List of BI Launchpad.
Password InfoView user password used to connect to Information Steward to display the lineage diagram for a document in InfoView.
5. Click Test connection to verify that Information Steward can connect successfully before you save this
source.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the Information Steward
page.
Related Information
SAP BusinessObjects Business Intelligence (BI) platform components usage [page 13]
Running a Metadata Integrator immediately [page 169]
Defining a schedule to run a Metadata Integrator [page 170]
Viewing integrator run progress, history, and log files [page 187]
Configuring sources with Windows AD authentication [page 145]
Running SAP BusinessObjects Enterprise Metadata Integrator with Windows AD authentication [page 171]
8.2.1.1 Checkpointing
SAP Information Steward can run the SAP BusinessObjects Enterprise Metadata Integrator for extended periods
of time to collect large quantities of objects. If unexpected problems occur during object collection, Information
Steward automatically records warning, error, and failure incidents in your log file for you to analyze later.
As additional failure management, Information Steward uses an automatic checkpointing mechanism with preset
"safe start" points to ensure that processing restarts from the nearest "safe start" point (instead of from the
beginning of the job). Regardless of the reason for the failure (power outage, accidental shutdown, or some other
incident), the next time you run the SAP BusinessObjects Enterprise Metadata Integrator, it restarts from the safe
start point to finish object collection in the least amount of time.
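The restart behavior described above can be sketched as follows. The batch size and the dictionary standing in for persistent checkpoint storage are invented for illustration; Information Steward manages its checkpoints internally:

```python
def collect_objects(objects, state, batch_size=100):
    """Process objects in batches, recording a checkpoint after each batch.

    `state` is a dict standing in for persistent checkpoint storage. On a rerun
    after a failure, processing resumes from state["done"] (the nearest "safe
    start" point) instead of from index 0.
    """
    start = state.get("done", 0)
    for i in range(start, len(objects), batch_size):
        batch = objects[i:i + batch_size]
        for obj in batch:
            pass  # collect metadata for obj here
        state["done"] = i + len(batch)  # safe start point for the next run
    return state.get("done", 0)

# First run fails after 200 objects; the rerun finishes only the remaining 50.
state = {"done": 200}
assert collect_objects(list(range(250)), state) == 250
```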
8.2.1.2 Configuring sources with Windows AD authentication
If you specified Windows AD when you configured your integrator source for SAP BusinessObjects Enterprise Metadata Integrator, you must set JVM parameters in the EIMAdaptiveProcessingServer server.
1. Log on to the Central Management Console (CMC) as a user that belongs to the Administrator group, and
then go to the Servers management area.
2. Expand the Service Categories node and select Enterprise Information Management Services.
3. In the right pane, right-click EIMAdaptiveProcessingServer and click Stop Server.
4. Double-click EIMAdaptiveProcessingServer to open the server properties page.
5. At the beginning of the text box for Command Line Parameters, enter the JVM parameters
java.security.auth.login.config and java.security.krb5.conf and set them to your file
locations.
For example, enter the following text:
-Djava.security.auth.login.config=c:/winnt/bsclogin.conf
-Djava.security.krb5.conf=c:/winnt/krb5.ini
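The two files referenced by these parameters are standard Kerberos artifacts rather than Information Steward files. Hedged examples of their typical contents follow; the realm name, KDC host, and file paths are placeholders that you must replace with your Active Directory domain's values.

A typical bscLogin.conf (JAAS login configuration used by the BI platform for Kerberos):

```
com.businessobjects.security.jgss.initiate {
    com.sun.security.auth.module.Krb5LoginModule required;
};
```

A minimal krb5.ini:

```
[libdefaults]
    default_realm = EXAMPLE.COM
    dns_lookup_kdc = true
    udp_preference_limit = 1

[realms]
    EXAMPLE.COM = {
        kdc = dc01.example.com
        default_domain = example.com
    }
```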
To run the integrator source using Windows AD authentication, you must specify the required runtime
parameters.
Related Information
Running SAP BusinessObjects Enterprise Metadata Integrator with Windows AD authentication [page 171]
8.2.2 Configuring sources for SAP NetWeaver Business
Warehouse Metadata Integrator
This section describes how to configure the Metadata Integrator for SAP NetWeaver Business Warehouse.
Note
Ensure that you selected the SAP NetWeaver Business Warehouse Metadata Integrator when you installed SAP
Information Steward.
To configure an SAP NetWeaver Business Warehouse integrator source, you must have the Create or Add
permission on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select SAP NetWeaver Business Warehouse.
4. On the New Integrator Source page, enter the following information.
Option Description
Name Name that you want to use for this integrator source. The maximum length of an integrator source
name is 128 characters.
Connection Type One of the following connection types for this source:
○ Custom Application Server
○ Group/Server Selection
Application Server SAP Application Server host name when Connection Type is Custom Application Server.
Message Server SAP NetWeaver BW Message Server host name when Connection Type is Group/Server Selection.
SAProuter String (Optional) String that contains the information required by SAProuter to set up a connection between
the Metadata Integrator and the SAP NetWeaver BW system. The string contains the host name, the
service port, and the password, if one was given.
SAP User Name of the user that will connect to the SAP NetWeaver BW system.
SAP Password Password for the user that will connect to the SAP NetWeaver BW system.
Language Language to use for the descriptions of SAP NetWeaver BW objects. Specify the 2-character ISO code
for the language (for example, en for English).
5. To verify that Information Steward can connect successfully before you save this source, click Test
connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
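The SAProuter String option above follows the standard SAProuter route-string syntax: /H/ introduces a host, /S/ a service or port, and /P/ a password when one is set. A hedged example, in which the hostnames and port are placeholders:

```
/H/saprouter.example.com/S/3299/H/bw01.example.com
```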
8.2.3 Configuring sources for SAP Data Services Metadata Integrator
This section describes how to configure the Metadata Integrator for SAP Data Services.
Note
Ensure that you selected the SAP Data Services Metadata Integrator when you installed SAP Information
Steward.
To configure the Data Services Integrator, you must have the Create or Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select Data Services.
4. On the New Integrator Source page, verify that Data Services appears in the Integrator Type drop-down list.
5. Enter the following Data Services information.
Table 58:
Option Description
Name Name that you want to use for this source. The maximum length of an integrator source
name is 128 characters.
Database Type The database type of the Data Services repository. The available database types are:
○ DB2
○ Microsoft SQL Server
○ MySQL
○ Oracle
○ Sybase
○ HANA
Computer Name Name of the computer where the Data Services repository resides.
Datasource, Database Name, or Service Name The name of the database, data source, or service name. Specify the following name for the database type of the Data Services repository:
○ DB2 - Data source name
○ Microsoft SQL Server - Database name
○ MySQL - Database name
○ Oracle - SID/Service name
○ Sybase - Database name
○ HANA - Database name
Windows Authentication (Only applies to a Microsoft SQL Server connection type) Indicate whether or not Windows Authentication is used to log in to Microsoft SQL Server. The default is No, which means Microsoft SQL Server Authentication is used.
Database User Name of the user that will connect to the Data Services repository.
Database User is not displayed for a Microsoft SQL Server connection type when Windows
authentication has been selected.
Database Password The password for the user that will connect to the Data Services repository.
Database Password is not displayed for a Microsoft SQL Server connection type when Windows authentication has been selected.
HTTP Scheme Protocol to use when connecting to the Data Services Management Console web application to access Auto Documentation reports. The default is HTTP.
6. If you want to verify that Metadata Management can connect successfully before you save this source, click Test Connection.
7. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the window.
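As a quick reference for the Datasource, Database Name, or Service Name option above, the identifier the integrator expects varies by repository database type. A hedged sketch of that mapping; the dictionary mirrors the documented list, but the helper function itself is illustrative and not an Information Steward API:

```python
# Which identifier to enter for each Data Services repository database type,
# per the option table above. Illustrative only.
IDENTIFIER_FIELD = {
    "DB2": "data source name",
    "Microsoft SQL Server": "database name",
    "MySQL": "database name",
    "Oracle": "SID/service name",
    "Sybase": "database name",
    "HANA": "database name",
}

def identifier_prompt(db_type: str) -> str:
    """Return which identifier to enter for the given repository database type."""
    try:
        return IDENTIFIER_FIELD[db_type]
    except KeyError:
        raise ValueError(f"unsupported repository database type: {db_type}") from None

assert identifier_prompt("Oracle") == "SID/service name"
```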
8.2.4 Configuring sources for SAP ERP Central Component
integrator
The SAP ERP Central Component integrator allows you to collect basic Data Model information from the SAP
System, including tables, views, structures, table fields, data elements and data domains, and so on.
Note
Ensure that you select the SAP ERP Central Component option when you install SAP Information Steward.
To configure the Metadata Integrator for SAP ERP Central Component (also known as SAP ECC) you must have
either Create or Add permissions on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select SAP ERP Central Component.
4. On the New Integrator Source page, enter the following information.
Option Description
Name Name that you want to use for this integrator source. The maximum length of an integrator source
name is 128 characters.
Connection Type One of the following connection types for this source:
○ Custom Application Server
○ Group/Server Selection
Application Server SAP Application Server host name when Connection Type is Custom Application Server.
Message Server SAP Message Server host name when Connection Type is Group/Server Selection.
SAProuter String (Optional) String that contains the information required by SAProuter to set up a connection between
the Metadata Integrator and the SAP system. The string contains the host name, the service port, and
the password, if one was given.
SAP User Name of the user that will connect to the SAP system.
SAP Password Password for the user that will connect to the SAP system.
Language Language to use for the descriptions of SAP system objects. Specify the 2-character ISO code for the
language (for example, en for English).
5. To verify that Information Steward can connect successfully before you save this source, click Test
connection.
6. Click Save.
8.2.5 Configuring sources for SAP Hadoop Metadata Integrator
The Hadoop integrator supports the collection of metadata for standard relational objects.
Note
Ensure that you select the Apache Hadoop Metadata Integrator option when you install SAP Information
Steward. If you are upgrading and you have other integrators installed, the Apache Hadoop Metadata
Integrator will be automatically installed.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. On the New Integrator Source page, select SAP Hadoop Metadata Integrator from the Integrator Type drop-
down list.
4. Enter the following information:
Option Description
Name Name that you want to use for this integrator source. The maximum length of an integrator
source name is 128 characters.
Metastore Node Host Name The fully qualified domain name (FQDN) for the Hadoop DataNode that is running the Hive Metastore service.
Namenode Host Name The fully qualified domain name for the Hadoop NameNode.
HDFS Port The port number for the Hadoop HDFS NameNode.
Client Directory Libraries The path to the directory located on the Information Steward machine that contains all the required Hadoop libraries.
Hadoop Client Home The parent location for the service configuration files hive-site.xml and core-site.xml.
Hive Metastore Port The port number for the machine that is running the Metastore service.
Truststore location and name The full path to the TrustStore. This option is required only when the Hive server is using SSL to connect to the client.
5. To verify that Information Steward can connect successfully before you save this source, click Test
connection.
6. Click Save.
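Before saving the source, it can help to confirm that the directories named in the table actually contain what the integrator expects: the client libraries directory holds the required Hadoop JARs, and the Hadoop Client Home is the parent of hive-site.xml and core-site.xml. A hedged local sanity check; the function and its checks are illustrative, not part of Information Steward:

```python
from pathlib import Path

def check_hadoop_source(client_libs: str, client_home: str) -> list[str]:
    """Return a list of problems found with a Hadoop integrator source's paths.

    Mirrors the options above: `client_libs` should contain the required
    Hadoop library JARs, and `client_home` should be the parent of the
    Hive/Hadoop service configuration files. Illustrative local check only.
    """
    problems = []
    libs = Path(client_libs)
    if not any(libs.glob("*.jar")):
        problems.append(f"no JAR files found in {client_libs}")
    for conf in ("hive-site.xml", "core-site.xml"):
        if not (Path(client_home) / conf).is_file():
            problems.append(f"missing {conf} under {client_home}")
    return problems
```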
8.2.6 Configuring sources for SAP HANA Metadata Integrator
Before you can configure a database source, an administrator must grant you permission to create configuration sources.
To run the HANA Metadata Integrator, you must also have the required permissions granted.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the Manage drop-down list and select New Integrator Source.
The New Integrator Source dialog opens.
3. Select SAP HANA from the Integrator Type drop-down list.
4. Complete the remaining options based on the option descriptions in the table:
Table 59:
Option Description
Name Name of the HANA integrator source (maximum length is 128 characters)
Computer Name Host name on which the HANA database server is running
Table Schema (Optional) Name of the table schema. Limits metadata collection to the named schema.
Note
The software collects all tables and views that belong to the specified schema. If a view
in the specified schema is a Relational view or Information Model view that references a
different table schema, the metadata from the referenced table will also be collected.
Packages (Optional) Name of the package or packages. Limits metadata collection to the named package or packages and any child packages.
5. Click Test connection to verify that Information Steward can successfully connect to the source before you
save the settings.
6. Click Save.
The newly configured source appears in the Integrator Sources list.
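For reference when filling in Computer Name, SAP HANA clients conventionally reach a single-container system's SQL port at 3<instance>15, where <instance> is the two-digit instance number. A hedged helper illustrating that convention; the instance number is not an Information Steward option, so confirm actual ports with your HANA administrator:

```python
def hana_sql_port(instance: int) -> int:
    """Standard SQL port for a single-container HANA system: 3<instance>15."""
    if not 0 <= instance <= 99:
        raise ValueError("instance number must be 00-99")
    return 30015 + instance * 100

assert hana_sql_port(0) == 30015   # instance 00
assert hana_sql_port(2) == 30215   # instance 02
```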
8.2.7 Configuring sources for SAP Sybase PowerDesigner metadata integrator
This topic describes the prerequisites to configuring sources for the PowerDesigner metadata integrator.
Complete the following prerequisites before you follow the steps to configure sources for the PowerDesigner metadata integrator:
● Select the SAP Sybase PowerDesigner metadata integrator when you install Information Steward. If you did not do this, see the Information Steward Installation Guide (for Windows or UNIX) for instructions about adding features after installation.
● Install the SAP Sybase PowerDesigner application on the machine where the PowerDesigner metadata
integrator will be running.
● Define a PowerDesigner repository connection in the PowerDesigner application. (Find instructions in the SAP
Sybase PowerDesigner documentation.)
Note
Make note of the PowerDesigner application user name and password, and the name of the PowerDesigner
repository. You will need this information when you configure a source for the PowerDesigner metadata
integrator.
● Ensure that you have user permission to view Metapedia users and approvers. You must have these
permissions before you can configure sources for the PowerDesigner metadata integrator.
● Ensure that you have the proper permissions to configure the PowerDesigner metadata integrator in the
Central Management Console (CMC).
● Change the identity information for the PowerDesigner DCOM (Microsoft Distributed Component Object
Model) on the machine that contains both the PowerDesigner and Information Steward application
installations (see the topic "Changing DCOM user information for PowerDesigner").
8.2.7.1 Changing DCOM user information for PowerDesigner
Before you can run the SAP Sybase PowerDesigner metadata integrator in SAP Information Steward, you must change the user for the machine's PowerDesigner DCOM (Microsoft Distributed Component Object Model).
Note
The PowerDesigner application is not accessible when you make this change.
Follow these steps to change the user information for the PowerDesigner DCOM on the machine that contains
both the Information Steward and PowerDesigner application installations:
1. Open the Component Services window (for example, run dcomcnfg).
2. Expand the Component Services node and select Computers > My Computer > DCOM Config.
3. Right-click PowerDesigner in the DCOM Config list and select Properties.
The PowerDesigner Properties window opens.
4. Open the Identity tab.
5. Select This user and enter the applicable User, Password, and Confirm password information.
6. Click OK.
Remember
You must reset the user setting in the Identity tab to the previous setting before you can use the
PowerDesigner application that is installed on the same machine as Information Steward.
8.2.7.2 Configuring PowerDesigner metadata integrator
This topic provides steps for you to configure sources for the SAP Sybase PowerDesigner metadata integrator in
the Central Management Console (CMC).
Remember
Make sure that you have completed all of the prerequisites listed in the topic "Configuring sources for SAP
Sybase PowerDesigner metadata integrator" before following these steps.
Table 60:
Option Description
Integrator Type Select SAP Sybase PowerDesigner from the drop-down list.
Name Enter a name for this metadata integrator source. The maximum length is 128 characters.
Repository Connection Enter the name of the PowerDesigner repository connection definition.
Collect Glossary Select True to obtain glossary entries from the PowerDesigner repository through the
PowerDesigner application (version 16.1 and newer). Any term collected from
PowerDesigner is automatically set to the approved state in Metapedia.
Select False to disable glossary collection from the PowerDesigner repository. When you
select False, all other glossary collection settings are removed from the setup, and you are
finished creating the metadata integrator.
Metapedia Author Select the applicable Metapedia author from the drop-down list. This author will be assigned to any imported glossary terms collected by the PowerDesigner metadata integrator.
Metapedia Approver Select the applicable Metapedia approver from the drop-down list. This approver will be assigned to any imported glossary terms collected by the PowerDesigner metadata integrator.
Metapedia Observer Select the applicable Metapedia observer from the drop-down list. The observer receives
an email notification when a term is approved or deleted. The observer must have
Metapedia view rights or at least be a member of the Metapedia users group.
Force update from PowerDesigner Select False to collect terms that are unique to PowerDesigner. Existing Metapedia terms are not affected.
Select True to collect all terms from the PowerDesigner repository, even when they exist in Metapedia.
○ If Metapedia already has the term, all applicable components in Metapedia are overwritten by the term components collected from PowerDesigner.
○ If the term component exists in Metapedia but it does not exist in PowerDesigner, the component remains as is in Metapedia.
○ If the Metapedia term is in a different category than the PowerDesigner term, the term remains in the Metapedia category.
Note
You can see if a term in Metapedia came from the PowerDesigner integrator by looking
at the comment in the upper right corner of the Metapedia Term Properties window.
Note
Once a term collected from PowerDesigner has been modified, it becomes a Metapedia term and can be updated from PowerDesigner again only if this option is set to True and you run the metadata integrator again.
4. (Optional.) Before you save the metadata integrator, you can verify that Information Steward can connect to
it by clicking Test Connection.
Remember
If you have not followed the steps to change your DCOM user information and completed the other important prerequisites, the connection will be unsuccessful and you will receive an error.
5. Click Save.
Related Information
Configuring sources for SAP Sybase PowerDesigner metadata integrator [page 152]
Changing DCOM user information for PowerDesigner [page 153]
8.2.8 Configuring sources for Common Warehouse Model (CWM) Metadata Integrator
This section describes how to configure the Metadata Integrator for Common Warehouse Model (CWM).
Note
Ensure that you selected the CWM Metadata Integrator when you installed SAP Information Steward.
To configure the CWM Integrator, you must have the right to add objects in the Integrator Sources folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select Common Warehouse Modeling.
4. On the CWM Integrator Configuration page, enter the following information.
Table 61:
Option Description
Source Name Name that you want to use for this source. The maximum length of an integrator source name is
128 characters.
File Name Name of the file with the CWM content. For example: C:\data\cwm_export.xml
This value is required. The file should be accessible from the computer where the Metadata
Management web browser is running.
Note
Metadata Management copies this file to the Input File Repository Server on SAP Business
Intelligence Platform. Therefore, if the original file is subsequently updated, you must take the
following steps to obtain the updates before you run the Integrator again:
○ Update the configuration to recopy the CWM file.
1. From the Integrator Sources list, select the CWM integrator source name and click Action > Properties.
The file name displays in the comments under the File Name text box, prefaced with "frs:".
2. Browse to the original file again.
3. Click Save.
○ Create a new schedule for the CWM integrator because the old schedule has a copy of
the previous file.
1. With the CWM integrator source name still selected in the Integrator list, click Action > Schedules.
2. Select the Recurrence and Parameter options that you want.
3. Click Schedule.
5. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
8.2.9 Configuring sources for SAP Data Federator Metadata Integrator
This section describes how to configure the Metadata Integrator for SAP Data Federator.
Note
Ensure that you selected the SAP Data Federator Metadata Integrator when you installed SAP Information
Steward.
To configure an SAP Data Federator integrator source, you must have the Create or Add right on the integrator
source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select Data Federator.
4. On the New Integrator Source page, enter the following information.
Option Description
Name Name that you want to use for this source. The maximum length of an integrator source name
is 128 characters.
DF Designer Server Address Name or IP address of the computer where the Data Federator Designer resides. For example, if you installed the Data Federator Designer on the same computer as the Data Federator Integrator, type localhost.
DF Designer Server Port Port number for the Data Federator Designer. The default value is 3081.
User name Name of the user that will connect to the Data Federator Designer.
Password Password for the user that will connect to the Data Federator Designer.
5. If you want to verify that Metadata Management can connect successfully before you save this source, click
Test connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
8.2.10 Configuring sources for Meta Integration Metadata Bridge (MIMB)
This section describes how to configure the Metadata Integrator for Meta Integration® Metadata Bridge (MIMB).
For a description of the objects collected by the MIMB Integrator, see the MIMB documentation at http://www.metaintegration.net/Products/MIMB/Documentation/.
Note
Ensure that you selected the Meta Integration Metadata Bridge (MIMB) Metadata Integrator when you installed
SAP Information Steward.
To configure the MIMB Integrator, you must have the Create or Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
2. Click the down arrow next to Manage in the top menu tool bar and select New Integrator Source.
3. In the Integrator Type drop-down list, select Meta Integration Metadata Bridge.
4. On the New Integrator Source page, enter values for Name and Description. The maximum length of an
integrator source name is 128 characters.
5. In the Bridge drop-down list, select the type of integrator source from which you want to collect metadata and
follow the instructions on the user interface to configure the connection information.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
8.2.11 Configuring sources for Relational Database System
Metadata Integrator
Configure and run the Metadata Integrator for the following types of databases by adding objects to the Integrator
Sources folder:
● Universe Connection
● DB2
● Microsoft SQL Server
● MySQL
● Oracle
● JDBC
● Sybase ASE
● Sybase IQ
Note
Ensure that you selected the Relational Database System Metadata Integrator option when you installed SAP
Information Steward.
Follow these steps to configure the sources for a Relational Database System Metadata Integrator:
Note
When you select the Connection Type, the remaining options change based on the connection type you
choose.
The following table explains all options regardless of the connection type you choose.
Table 62:
Option Description
Integrator Type Select Relational Database from the Integrator Type drop-down list.
Connection Type Select the type of database for which you want to collect metadata from the drop-down list. For the Relational Database System Metadata Integrator, the applicable database types are:
○ Universe Connection
○ DB2
○ Microsoft SQL Server
○ MySQL
○ Oracle
○ JDBC (for databases such as Teradata)
○ Sybase ASE
○ Sybase IQ
Note
For Oracle and MySQL you must install the applicable JDBC driver. For information, see Configuring sources for JDBC connections [page 161].
Note
For more information about the Universe Connection configuration, see Configuring sources for universe connections [page 163].
Note
The remaining options that appear are based on the
connection type you choose.
Note
Select a value for Connections when Connection Type is
set to Universe Connection (the drop-down list displays
the secure connections defined in the CMS).
Computer name Enter the name of the host on which the database server is
running.
Database Name, or Service name Enter the name of the DB2 database, Microsoft SQL Server
database, MySQL database, or Oracle database service
(SID).
Database User Enter the name of the user or owner of the database or data
source.
Database Password Enter the password of the user for the database or data
source.
Table Schema (Optional) Enter the name of the schema that you want to import from this database source. To specify multiple schemas, separate the name of each schema with a comma (,).
Note
When the schema name includes a comma, use double quotation marks around the schema name.
If you do not specify a schema name and you set the connection type to DB2, Microsoft SQL Server, or Oracle, Metadata Management imports all available schemas for SQL Server or DB2, or uses the user name to import the schema for Oracle.
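The comma-and-quote convention above matches standard CSV quoting, so the rule can be sketched as follows (Python, illustrative only; the schema names FINANCE, HR, and SALES,EU are hypothetical):

```python
import csv

# A Table Schema value listing three schemas; the third name
# contains a comma, so it is wrapped in double quotation marks.
value = 'FINANCE,HR,"SALES,EU"'

# Standard CSV parsing recovers the individual schema names.
schemas = next(csv.reader([value]))
print(schemas)  # ['FINANCE', 'HR', 'SALES,EU']
```

This only illustrates the quoting convention; whether Information Steward applies full CSV semantics beyond this simple case is not documented here.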
5. Click Save.
The newly configured source appears in Metadata Management Integrator Sources in the left pane.
8.2.11.1 Configuring sources for JDBC connections
Follow the steps below if you plan to use a JDBC source (such as Teradata) for the Relational Database Metadata
Integrator:
1. Obtain the JDBC driver from your database server web site or utilities CD.
2. Unzip the JDBC driver into a folder such as:
c:\temp\teradata
3. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
Administrator Guide
Metadata Management Administration PUBLIC 161
5. Select Relational Database from the Integrator Type drop-down list.
6. Specify the following JDBC connection parameters:
Table 63:
JDBC parameter Description
Driver Name of the JDBC driver class that you obtained in step 1
above.
URL URL address that specifies the JDBC connection to the database.
Table Schema (Optional) Specify the name of the schema that you want to import from this source database. To specify multiple schemas, separate the name of each schema with a comma (,).
Note
When the schema name includes a comma, use double quotation marks around the schema name.
Note
In a distributed deployment, you must set Library Files to
the classpath on the computer where the integrator
runs.
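As an illustration, for a Teradata source the two connection parameters might look like the following (the host and database names are hypothetical; verify the driver class name and URL format against your JDBC driver's documentation):

```
Driver: com.teradata.jdbc.TeraDriver
URL:    jdbc:teradata://tdhost.example.com/DATABASE=sales_db
```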
7. Click Test connection to verify that Metadata Management can connect successfully.
8. If the connection works (based on the results of the Test Connection), click Save.
The newly configured source appears in the list of Integrator Sources in the left pane.
Related Information
Configuring sources for Relational Database System Metadata Integrator [page 159]
Running a Metadata Integrator immediately [page 169]
Defining a schedule to run a Metadata Integrator [page 170]
Viewing integrator run progress, history, and log files [page 187]
8.2.11.2 Configuring sources for universe connections
The Relational Database Integrator can collect metadata from secured universe connections that use JDBC and
ODBC. For the most current list of supported universe connection types, refer to the Release Notes.
1. If you want to configure a universe connection source that uses a JDBC connection, perform the following
steps:
a. Obtain the JDBC driver from your database server web site or utilities CD.
b. Unzip the JDBC driver into a folder such as the following:
c:\temp\teradata
2. If you want to configure a universe connection source that uses an ODBC connection, ensure that the ODBC Datasource exists on the computer where the integrator will run.
3. Log on to the Central Management Console (CMC) and access the Information Steward area. For details, see
Accessing Information Steward for administrative tasks [page 9].
The Information Steward page opens with the Integrator Sources node selected in the tree on the left.
4. Take one of the following actions to access the New Integrator Source page.
○ Click the left-most icon, Create an Integrator source, in top menu bar.
○ On the Manage menu, point to New and click Integrator Source.
Option Description
Connections The name of the Central Management Server (CMS) connection. The drop-down list displays the secure connections defined in the CMS.
Table Schema (Optional) Specify the name of the schema that you want to import from this source database. To
specify multiple schemas, separate the name of each schema with a comma (,).
Note
When the schema name includes a comma, use double quotation marks around the schema name.
If you do not specify a schema name, Metadata Management either imports all available schemas for
SQL Server or DB2, or uses the user name to import the schema for Oracle.
Library Files The full paths to the Java library files (separated by semicolons) required by the Universe Connection, for example the JDBC driver JAR files extracted in step 1.
Note
In a distributed deployment, you must set Library Files to the classpath on the computer where the integrator runs.
7. Click Test connection to verify that Metadata Management can connect successfully before you save this
source.
8. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Information
Configuring sources for Relational Database System Metadata Integrator [page 159]
Running a Metadata Integrator immediately [page 169]
Defining a schedule to run a Metadata Integrator [page 170]
Viewing integrator run progress, history, and log files [page 187]
Use provided templates to configure either a relational database source or a Business Intelligence source for the
Excel content integrator.
Ensure that your administrator selected the Excel metadata content integrator during installation, so that the
software installs the template files. You can also obtain the template files when you perform the following steps.
You must belong to the Metadata Management administrator group to be able to configure the Excel content
integrator.
To select an existing Excel workbook that you created previously based on an original template file:
a. Click Browse located next to File Name and browse to and select the Excel file that you created previously.
b. Select the applicable template model type from the Model Name dropdown list. Options are Relational
Database or Business Intelligence.
c. Click Save.
d. In the Save As window, change the template name to a new name, and select a location. Click Save.
e. In the New Integrator Source dialog, click Browse and select the Excel file that you just created.
f. Click Save.
The Excel content integrator source appears in the Metadata Management Integrator Sources folder in the CMC. Configure the Excel workbook before you run the Excel metadata integrator source.
You manage integrator sources and instances in the SAP Information Steward area of the CMC.
From the list of configured integrator sources, you can select an integrator source and perform a task from
Manage or Actions in the top menu tool bar.
You can perform the following tasks from the Manage menu.
Table 64:
Manage task Description
User Security Manage user security for Integrator Sources, Source Groups, or Metapedia objects.
Delete Delete this source configuration and its associated schedules, source runs, and
logs.
Note
This option is only available when you have selected an item in the main page.
Purge Remove all integrator source runs. This option keeps the source configuration, file
logs, and schedules.
Note
This option is only available when you have selected an item in the main page.
You can perform the following tasks from the Actions menu.
Table 65:
Action task Description
History View the current and previous executions of this Metadata Integrator source.
Properties View and edit the configuration information for this Metadata Integrator source.
You can view and modify the definition of an integrator source in its Properties dialog box to change its
description, connection information, and other pertinent information for the integrator source.
● To view the definition, you must have the right to View the integrator source.
● To modify the definition, you must have the right to Edit the integrator source.
8.3.2 Deleting an integrator source
You might want to delete an integrator source in situations such as the following:
Note
If you rename your integrator source, you lose all the previously collected metadata.
To delete an integrator source, you must belong to the Metadata Management Administrator user group or have
the right to delete the integrator source.
4. Select the integrator source and click Manage > Delete in the top menu tool bar.
Note
If you delete an integrator source, you also delete the metadata from that source that was stored in the
Metadata Management repository.
Each time you run a metadata integrator or SAP Information Steward utility, SAP Information Steward creates a
new instance and log files for it. By default, the maximum number of instances to keep is 100. When this
maximum number is exceeded, SAP BusinessObjects Enterprise deletes the oldest instance and its associated log
file.
Note
The Purge utility deletes the database log in the Metadata Management repository for each instance that was
deleted.
To change the limits to delete integrator source instances, you must have Full Control access level on the
Metadata Management folder.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator or Administrator user group.
2. From the CMC click Information Steward.
The Information Steward page opens with the Information Steward node selected in the tree panel at left.
3. Expand the Metadata Management node.
4. Click Actions > Limits in the top menu bar.
The Limits window for Metadata Management appears.
5. If you want to change the default value of 100 maximum number of instances to keep:
a. Select the check box for the option Delete excess instances when there are more than N instances.
b. Enter a new number in the box under this option.
c. Click Update to save your changes.
6. If you want to specify a maximum number of instances to keep for a specific user or group:
a. Click the Add button next to Delete excess instances for the following users/groups.
b. Select the user or group name from the Available users/groups pane and click >.
c. Click OK.
d. If you want to change the default value of 100 maximum number of instances to keep, type a new number
under Maximum instance count per object per user.
e. Click Update to save your changes.
7. If you want to specify a maximum number of days to keep instances for a specific user or group:
a. Click the Add button next to Delete instances after N days for the following users or groups.
b. Select the user or group name from the Available users/groups pane and click >.
c. Click OK.
d. If you want to change the maximum number of instances to keep (default value 100), type a new number
under Maximum instance count per object per user.
e. Click Update to save your changes.
8. To close the Limits window for Metadata Management, click the X in the upper right corner.
When updates to SAP Information Steward result in a change in the metadata integrator, the metadata objects
collected by the integrator are not displayed until the integrator source is run again. This scenario is shown in the
following ways:
● In Information Steward, on the Metadata Management directory page, the objects for the metadata integrator
do not display. Instead, there is a message that states: "Run required: Ask your Information Steward
administrator to run the integrator source in the CMC".
● In the Central Management Console (CMC), in the Integrator Sources node, the Last run column shows Run
required for upgrade.
To collect the metadata and view the metadata objects in Information Steward, run the Metadata Integrator
immediately or wait for the next scheduled run.
Run the Metadata Integrator to collect the metadata for each source that you configured. When you select
Integrator Sources, all configured integrator sources appear.
When you select an integrator source from the list in the right pane, you can run it immediately or define a
schedule to run it.
1. Open the Information Steward page in the CMC and select the Metadata Management node from the left
pane.
2. Expand the Integrator Sources node. All configured integrator sources appear in the center pane.
Note
For the SAP NetWeaver BW integrator to have authorization rights to collect metadata objects from SAP
NetWeaver BI, you must install the correct support package based on the version of SAP NetWeaver BI you are
running. For a list of the SAP NetWeaver BI versions and the applicable support package related to each, see
the topic “SAP NetWeaver BI version and applicable support package for BW integrator” in the Information
Steward Install Guide.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator or Administrator user group.
2. Click Information Steward.
The Information Steward page opens with the Information Steward node selected in the tree panel at left.
3. Expand the Metadata Management node and click Integrator Sources.
4. From the list of configured sources that appears in the center pane, select the integrator source that you want
by clicking anywhere on the row except its type.
Note
If you click the integrator Source Type, you display the version and customer support information for the
integrator source. If you double-click the row, you open the Properties dialog box for the integrator source.
Tip
You can also click the icon Run selected object(s) now in the icon bar under Manage and Actions.
6. To view the progress of the integrator run, select the integrator source and click Actions > History.
Tip
If you select Now in the Run object option under Actions > Schedule > Recurrence and click Schedule, the Integrator History page displays automatically.
Related Information
Viewing integrator run progress, history, and log files [page 187]
Computing and storing lineage information for reporting [page 214]
To define a schedule for an integrator source, you must have the proper user or group rights.
Note
If you click the Source Type, you display the version and customer support information for the metadata
integrator.
10. (Optional) To change the default values for runtime parameters for the integrator, select the Parameters node
on the left. (For runtime parameter information, see the topic Common runtime parameters for metadata
integrators.)
11. Click Schedule.
12. If you use impact and lineage reports (available from the Reports option on the View menu of Information Steward), you must recompute the contents of the lineage staging table to incorporate changes from the integrator runs. As with a regular integrator schedule, you can set up a schedule to compute the lineage staging table at regular intervals.
Follow these steps to run the SAP BusinessObjects Enterprise Metadata Integrator with Windows AD
authentication:
1. From the Central Management Console (CMC) click Information Steward in the Organize list and expand the
Metadata Management node in the tree pane at left.
2. Click Integrator Sources and select the applicable integrator source name that appears in the main pane at
right.
-Djava.security.auth.login.config=c:/winnt/bsclogin.conf
-Djava.security.krb5.conf=c:/winnt/krb5.ini
6. (Optional) To change the default instance title, select Instance Title from the tree pane and change it to a
unique name that describes this schedule.
7. Select Recurrence from the tree pane and choose the frequency in the Run object drop-down list.
8. Choose additional relevant values for the selected recurrence option as applicable.
9. (Optional) To receive a notification email when the integrator task has completed, select Notification from the
tree pane. For more information about email notifications, see the Business Intelligence Platform
Administrator Guide.
10. (Optional) To trigger the execution of a metadata integrator when an event occurs, select Events from the tree
pane. For more information about scheduling, see the Business Intelligence Platform Administrator Guide.
11. Click Schedule.
Users with administrator rights can set runtime parameters for different integrator sources. Runtime parameters
change the normal or default actions when you run an integrator source.
The following table describes the runtime parameters that are applicable to all metadata integrators.
Table 66:
Runtime parameter Description
Database Log Level This log is in the SAP Metadata Management Repository. You can view this log while the Metadata Integrator is running.
The default logging level is Information. Usually you can keep the default logging level. However, if you need to provide more detailed information about your integrator run, choose Integrator Trace for the Database Log Level.
File Log Level The Metadata Integrator creates this log in the SAP installation directory and copies it to the File Repository Server. You can download this log file after the Metadata Integrator run completes.
The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your integrator run, you can change the level to Integrator Trace.
JVM Arguments The Metadata Management job server creates a Java process to perform the metadata collection. Use the JVM Arguments parameter to configure runtime parameters for the Java process.
For example, if the metadata source is very large, you might want to increase the default memory size.
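For example, to raise the Java heap limit you might enter the standard JVM maximum-heap flag in JVM Arguments (the value shown is illustrative; size it to your metadata source):

```
-Xmx2048m
```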
Additional Arguments Optional runtime parameters for the metadata integrator source.
Some runtime parameters apply only to specific metadata integrators. The table below lists these integrators and their runtime parameters.
Table 67:
Metadata Integrator Type Parameters
SAP BusinessObjects Enterprise Metadata Integrator Parameters to collect a subset of Central Management Console (CMC) objects:
● Universe Folders: For more information, see the topic Selective CMS object collection [page 176].
● Update Options: For more information, see the topics Runtime parameters for SAP BusinessObjects Enterprise Metadata Integrator [page 174] and Runtime parameters for SAP Sybase PowerDesigner Metadata Integrator [page 184].
● JVM Arguments: For more information, see the topics Runtime parameters for SAP BusinessObjects Enterprise Metadata Integrator [page 174], SAP BusinessObjects Enterprise 3.x metadata collection with SSL [page 180], and Configuring sources with Windows AD authentication [page 145].
● Additional Arguments: For additional information, see the topics User collection parameters [page 175] and Separate processes parameters [page 179].
SAP NetWeaver Business Warehouse Metadata Integrator
● Parameters that adjust the number of threads used for the metadata collection.
● Parameters to filter the queries or workbooks to collect.
SAP HANA Metadata Integrator Parameters that specify the SAP HANA Studio version to use for the configured
HANA database.
Note
Applicable for users who have multiple versions of SAP HANA Studio installed, and
who want to change the version that was set by the CMC during installation.
SAP Sybase PowerDesigner Metadata Integrator Update Option parameters determine which physical data models to collect upon connecting with the repository.
For more information, see Runtime parameters for SAP Sybase PowerDesigner Metadata Integrator [page 184].
There are several types of runtime parameters for the SAP BusinessObjects Enterprise Metadata Integrator.
● SAP Enterprise XI 3.x system
● Metadata collection control for reports
The SAP BusinessObjects Enterprise Metadata Integrator provides the following runtime parameter to adjust
memory usage when collecting user permissions.
Table 68:
Runtime parameter Description Default value
Note
If you specify true, increase the memory with the -Xmx parameter in JVM Arguments. If enough memory is available, you can set this value as high as -Xmx1500m, where "m" indicates megabytes.
8.5.2.1.2 Selective CMS object collection
Selective collection of SAP BusinessObjects Enterprise Metadata Integrator objects in the SAP Central
Management Server (CMS) can reduce processing time and provide a specific view of your Enterprise system
objects.
To set options for selective collections, schedule the applicable integrator source in the Central Management
Console (CMC) and complete the collection options in the Parameters page.
Use a selective CMS object collection strategy to run the integrator multiple times, and each time collect a
different component. Use the specific update options to control the content of your collections and to control how
to update existing collection objects. You could also incrementally collect CMS metadata for a large SAP
BusinessObjects Enterprise deployment by using selective collection.
When you schedule a metadata integrator source in the Central Management Console (CMC), choose options in
the Parameters page to selectively collect objects.
Option Description
Update existing objects and add newly selected objects Adds objects to a previous object collection for the integrator
source, and updates existing objects that are new or have
changed since the last run.
Delete Existing objects before starting object collection Deletes the objects collected from previous runs of the integrator source and adds only the objects that you select for this run.
Example
Update existing objects and add newly selected objects
For the first run of your integrator source, you select to collect only Web Intelligence document objects. On the
next run, you select to collect Crystal Reports and associated universes in addition to the Web Intelligence
document objects. With the Update existing objects and add newly selected objects option selected, the
software does the following:
● adds the Crystal Reports objects to the existing Web Intelligence document objects in the Metadata
Management tab
● collects only the Web Intelligence document objects that are new or have changed since the last run and
adds them to the Metadata Management tab
Example
Delete existing objects before starting object collection
You run the Enterprise metadata integrator on a specific source multiple times, and the software has collected
numerous objects in the Metadata Management tab in Information Steward. To view only the Crystal Reports
and associated universe objects in the Metadata Management tab, select Delete Existing objects before starting
object collection when you schedule the next run in CMC. Then select the Collect Crystal Reports and
associated Universes option, and deselect all other collection options. When the job runs, the software does the
following:
● deletes the current objects in the Metadata Management tab for the specific source
● collects the Crystal Reports and associated universe objects to display in the Metadata Management tab of
Information Steward for the specific source
Collect objects from universe, public, or personal folders when you schedule the Enterprise integrator to run in the Central Management Console (CMC).
Anyone using the Metadata Management tab in Information Steward can see the contents of the universe, public,
or personal folders. However, the software requires the proper permissions before you can run collections on the
objects in these folders. For example, only you or an administrator can run a collection on objects in your personal
folder.
Note
Metadata Integrator does not collect metadata from inboxes or categories.
Related Information
Parameters page option descriptions [page 178]
Collect objects from Universe, Public, or Personal folders by setting corresponding collection options under each
group in the Parameters page in the Central Management Console (CMC).
The following table describes the options in the Parameters page of the CMC where you can select collection
options, including the options described for selective collections.
Table 70:
Option Description
Folder Name Expression Specifies the names of the folders that you want in the collection using a Java Regular Expression. There is a Folder Name Expression for Universe, Public, and Personal folders.
Collect Universes (Universe Folders) Collects the SAP Enterprise universe metadata from the folders specified in the
Folder Name Expression option.
Note
● If you uncheck this option and choose to collect any report that uses a universe, the integrator collects the universe and Lumira stories metadata as well.
● If you select only this option and uncheck all of the report options, the integrator also collects Lumira datasets.
● Only datasets built in Lumira 1.1.8 contain the universe references collected by the integrator.
Collect Web Intelligence Documents and source Universes Collects Web Intelligence documents and source universes from Public or Personal folders.
Collect Desktop Intelligence Documents and source Universes Collects Desktop Intelligence documents and source universes from Public or Personal folders.
Collect Crystal Reports and associated Universes Collects Crystal Reports and associated universes from Public or Personal folders.
Collect Dashboards Objects and associated Universes Collects Dashboards Objects metadata from public or personal folders with UNX universe and BEX query as data sources.
Collect Analysis OLAP Workspace Collects metadata from SAP BusinessObjects Analysis OLAP workspaces from
public folders.
Collect Design Studio Application Collects metadata from SAP BusinessObjects Design Studio Applications from
public folders.
Database Log Level Sets the level of detail in the database log located in the Metadata Management
Repository.
File Log Level Sets the level of detail in the file log located in the SAP installation directory.
Update existing objects and add newly selected objects Collects a subset of physical data models based on the changes made since the last run. This option is selected by default. Using this option may reduce processing time.
Delete Existing objects before starting object collection Collects all physical data models, overwriting models in the repository.
JVM Arguments (Optional) Configures runtime parameters for the Java process.
Additional Arguments (Optional) Configures runtime parameters for the metadata integrator source.
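As an illustration, a Folder Name Expression in Java regular-expression syntax might look like the following (the folder names are hypothetical):

```
Finance Reports|Sales.*
```

This pattern matches the folder named Finance Reports and any folder whose name begins with Sales.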
The SAP BusinessObjects Enterprise Metadata Integrator uses a separate process for each of the following
collections:
● Metadata collection for users, groups, and folders
● Metadata collection for Reports
● Metadata collection for Universes
Each separate process uses separate resources (memory and computer) to improve performance and
throughput during the Metadata Integrator run.
The Metadata Integrator provides the following runtime parameters for running separate processes. Set these
parameters in Additional Arguments on the Parameters page.
Table 71:
Runtime parameter Description Default value
univProcessWorkLimit Specifies the number of universes that are collected by each universe process. Default value: 50
reportProcessWorkLimit Specifies the number of reports that are collected by each report process. Default value: 300
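For example, the two parameters might be entered together in Additional Arguments as name=value pairs (this form is an assumption; confirm the expected syntax for your Information Steward version):

```
univProcessWorkLimit=25 reportProcessWorkLimit=150
```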
Note
You cannot run remote integrators if Federal Information Processing Standards (FIPS) mode is enabled on
BusinessObjects Enterprise XI 3.x.
Follow these steps to set the runtime parameter for the SAP BusinessObjects Enterprise XI 3.x integrator source
when SSL is enabled:
1. From the Central Management Console (CMC) click Information Steward in the Organize list and expand the
Metadata Management node in the tree pane at left.
2. Expand Metadata Management and click Integrator Sources.
3. From the list of configured sources that appears on the right, select the source by clicking anywhere on the
row except its type.
Note
If you click the source type, you display the version and customer support information for the Metadata
Integrator.
The SAP BusinessObjects Enterprise Metadata Integrator provides the following run-time parameters to control
metadata collection for reports.
Set these parameters in Additional Arguments in the Parameters page of the Central Management Console
(CMC).
Table 72:
Run-time parameter Description Default value
collectRptFldVar You can disable collection of field levels for Crystal Reports and Web Intelligence documents. Field levels include:
● Report fields
● Formula fields (Crystal Reports only)
● SQL expression fields
● Running total fields
● Variables (Web Intelligence only)
Default value: true
collectReportsByUniverse Restricts the collection of reports. When set to true, the integrator collects only reports referencing collected universes.
Default value: true
The SAP NetWeaver Business Warehouse Metadata Integrator provides runtime parameters to adjust the number
of threads to use when collecting metadata from the SAP system and to filter the queries or workbooks to collect.
Table 73:
Number of Threads: Specifies the number of threads to use when collecting metadata from the SAP system. You might want to increase the number of threads if your SAP NetWeaver BW system has a large number of objects and available work processes. The default value is 5. In a multiprocessor environment, you can increase this value. The number of threads is limited by the number of processors configured on the SAP NetWeaver BW server. If the number of threads is greater than the available processors, the thread is put on a queue and processed when a processor becomes available.
Query Name Expression: Specifies, as a Java regular expression, the names of the queries that you want in the collection.
Workbook Name Expression: Specifies, as a Java regular expression, the names of the workbooks that you want the integrator source to collect.
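Because the Query Name Expression and Workbook Name Expression filters are Java regular expressions, you can sanity-check a pattern before scheduling the integrator. The sketch below uses Python's re module, whose syntax matches Java's for these constructs; the pattern and query names are hypothetical, not from the guide:

```python
import re

# Hypothetical filter: collect only queries whose technical names
# start with SALES_ or FIN_.
pattern = re.compile(r"(SALES_|FIN_).*")

queries = ["SALES_Q1_REVENUE", "FIN_BALANCE", "HR_HEADCOUNT"]
# fullmatch mirrors Java's Matcher.matches(): the whole name must match.
collected = [q for q in queries if pattern.fullmatch(q)]
print(collected)  # ['SALES_Q1_REVENUE', 'FIN_BALANCE']
```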
Note
For the SAP NetWeaver BW integrator to have authorization rights to collect metadata objects from SAP
NetWeaver BI, you must install the correct support package based on the version of SAP NetWeaver BI you are
running. For a list of the SAP NetWeaver BI versions and the applicable support package related to each, see
the topic “SAP NetWeaver BI version and applicable support package for BW integrator” in the Information
Steward Install Guide.
8.5.2.3 Runtime parameter for SAP HANA integrator source
If you have multiple versions of SAP HANA Studio installed on the same system as SAP Information Steward, you can specify which version to use for specific SAP HANA integrator sources.
Use the runtime parameter in the table below when you have multiple versions of HANA Studio installed and you want to associate a different version in Information Steward than the one automatically selected by the HANA integrator.
Table 74:
hdbStudioVersion <version.number>: Specifies the HANA Studio version number for the configured HANA database. Example: hdbStudioVersion 1.0.32
Note
Do not include the build number with the version number in the parameter. For example, if you want to use version 1.0.32 build 54621 (shown as 1.0.32.54621 in the About dialog of HANA Studio), enter 1.0.32 in the <version.number> portion of the parameter.
Determine the SAP HANA Studio version that you want to specify by opening the applicable HANA Studio application and choosing Help > About. Use the version number without the build number. For example, if you see Version 1.0.32 Build id: 321111040920 (54621) in the About dialog, use 1.0.32 for the HANA Studio version and omit the build number ID, 321111040920 (54621).
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Expand the Metadata Management node and then expand Integrator Sources.
The integrator that you just scheduled will process using the HANA Studio version that you specified in the CMC.
The PowerDesigner Metadata Integrator has runtime parameters for selecting which physical data models are collected when the Central Management Server (CMS) connects to the PowerDesigner repository and for specifying the PowerDesigner installation location.
Table 75:
Update Option: These parameters are located on the Parameters page of the CMS and apply only to physical data models.
● Delete existing objects before starting object collection: Collects all physical data models, overwriting models in the repository.
● Update existing objects and add newly selected objects: Collects a subset of physical data models based on the changes made since the last run. This option is selected by default, and using it may reduce processing time.
pdHomeDir: Enter this parameter in the Additional Arguments field on the Parameters page to specify the PowerDesigner installation location. Use this parameter if the integrator is unable to automatically detect the PowerDesigner installation or if the detected location is wrong. For example, to override the installation location, enter pdHomeDir "C:/Sybase/Power Designer SP03/"
The runtime parameters for the Relational Database Metadata Integrator include parameters in which you can enter regular expressions that include or exclude specified table names or view names at runtime.
Table 76:
Table Name Expression: Enter a regular expression to include or exclude named tables. For example, the following expression excludes tables whose names contain PRO:
^(?!.*(PRO)).*$
Note
Even if the regular expression excludes the tables, all dependent objects appear in the Metadata Management tab, and all dependencies and data objects appear in the impact and lineage diagrams.
View Name Expression: Enter a regular expression to include or exclude named views. For example, the following regular expression includes views whose names contain sales, cost, or dates:
(.*(sales|cost|dates)).*$
Note
Even if the regular expression excludes the views, all dependent objects appear in the Metadata Management tab, and all dependencies and data objects appear in the impact and lineage diagrams.
Include public synonyms (for Oracle only): For Oracle databases only, specifies whether all public synonyms are collected. This option is selected by default; when selected, all public synonyms are collected. The option does not appear for any other database type.
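You can verify how these expressions behave before a collection run. The sketch below checks the two example patterns with Python's re module, which shares the negative-lookahead and alternation syntax used here; the table and view names are hypothetical:

```python
import re

# The guide's example patterns. Matching is case-sensitive.
table_expr = re.compile(r"^(?!.*(PRO)).*$")           # excludes names containing PRO
view_expr = re.compile(r"(.*(sales|cost|dates)).*$")  # includes sales/cost/dates views

tables = ["CUSTOMERS", "PRODUCTS", "ORDERS"]
kept_tables = [t for t in tables if table_expr.match(t)]
print(kept_tables)  # ['CUSTOMERS', 'ORDERS'] -- PRODUCTS contains "PRO"

views = ["v_sales_2023", "v_inventory", "v_cost_center"]
kept_views = [v for v in views if view_expr.match(v)]
print(kept_views)  # ['v_sales_2023', 'v_cost_center']
```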
Aside from the common runtime parameters, you can use the following:
Table 77:
atbl true: Allows you to collect all the tables in the ECC system.
Note
Only the tables under the predefined root Data Model are collected. If you do not add this parameter, the information you get is limited to the fixed Data Model that is predefined by SAP.
ProgramArgument[] defineProgramArguments(): Allows you to configure source information. The following options are available for this parameter:
Table 78:
● AppServer
● GroupServer
● SAPUserId (String): The name of the user connecting to the SAP system.
● SAPPassword (String): The password for the user connecting to the SAP system.
When you schedule an integrator source, you can change the default values of the runtime parameters on the Parameters page in the Central Management Console (CMC).
Before you follow the steps below, make sure you understand all of the settings on the Parameters page.
To change the runtime parameters for an integrator source:
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group and click Information Steward on the CMC home page.
2. Expand the Metadata Management node.
3. Select the Integrator Sources node, and select the applicable integrator source.
You can view date information, history, and database logs for an integrator source. Open the Central Management Console (CMC), navigate to the Information Steward area, and choose Metadata Management > Integrator Sources.
A list of all existing integrator sources appears in the main pane. Each integrator source contains additional
information as applicable in the following columns:
● Name
● Type
● Category
● Last Run
● Description
● Created
1. Select the name of the applicable integrator source from the list of all configured integrator sources and click Actions > History.
The Integrator History dialog opens with the following schedule information shown in columns:
Table 79:
Schedule Name: Name provided in the Instance Title box when the instance was scheduled.
Log File: Link to the log file that contains progress messages for the run.
2. View the database log file while the integrator source run is in progress. Either select Actions > Database Log or click the View the database log icon in the menu bar at the top of the Integrator History dialog. Use the Refresh option in the database log to reveal additional information as the run progresses.
3. After the integrator source has finished or has failed, use the methods from step 2 to view information, or
click the Download link in the Log File column.
4. Click the “X” icon in the upper right corner of the database log page to close the log file.
By default, Metadata Management writes high-level messages (number of universes processed, number of
reports processed, and so on) to the log. You can change the message level on the configuration page for the
integrator source.
The Integrator History page provides other options that enable you to obtain history information from the integrator run.
At the top of the Integrator History page in the main pane, there is a menu bar that includes additional actions:
● Various icons for specific actions
Table 80:
Pause: Pauses the instance that is scheduled to run or scheduled as a recurring instance. The selected instance is not canceled.
Delete: Deletes the selected integrator instance from the history list.
The icons that are shown in the top menu bar are slightly different from the options in the Actions menu.
Table 81:
Pause selected instance: Pauses the instance that is scheduled to run or scheduled as a recurring instance. The selected instance is not canceled.
Resume paused instance: Restarts the paused instance. Only available when the instance has been paused.
View the database log: View messages that indicate which metadata objects have been collected from the integrator source.
8.8 Troubleshooting
You may encounter some error and warning messages as you view database logs and file logs for each Metadata
Integrator run. The error and warning messages are related to the following items:
● Crystal Reports
● Desktop Intelligence document
● Out of memory error
● Parsing failure errors
● SQL parse errors
● Data Federator Designer error
8.8.1 Crystal Report message
Crystal Report [reportname]. Unable to find class for universe object [objectname]. Universe class cannot be
uniquely identified for object [objectname]. Data association cannot be established directly through the object.
Reference will be established directly to the column.
Cause: The SAP BusinessObjects Enterprise Metadata Integrator cannot uniquely identify a universe object that is
used to create a Crystal Report when the object has the same name as another object in a different universe class.
Therefore, the integrator cannot establish the correct relationship between the universe object and the report.
However, the integrator can establish a relationship between the source table or column and the report because
the SQL parser can find the source column used by the universe object.
Cause: Data providers in the Desktop Intelligence document refer to an invalid or non-existent universe.
Action: Open the Desktop Intelligence document and edit the data provider to specify a valid universe.
Error occurred during initialization of VM Could not reserve enough space for object heap
Cause: The SAP BusinessObjects Enterprise Metadata Integrator does not have enough memory to run.
Action: Decrease the value of the MaxPermSize run-time parameter in the JVM Arguments on the Parameters
page when you schedule the integrator source. For example, enter -XX:MaxPermSize=256m and rerun the
Metadata Integrator.
Cause: The SAP BusinessObjects Enterprise Metadata Integrator cannot collect the metadata for a universe derived table if the SQL used in the derived table is of the form SELECT * FROM TABLE.
Action: Always use the fully-qualified column names in the projection list of the SELECT clause (for example, SELECT CUSTOMER.ID, CUSTOMER.NAME FROM CUSTOMER rather than SELECT * FROM CUSTOMER).
Parsing failure for derived Table <table_name>. Unable to find table associated with column <column_name>
Cause: The SQL parser in the SAP BusinessObjects Enterprise Metadata Integrator requires column names in a
derived table to be qualified by the table name: <table_name.column_name>. If you do not qualify the column
name, the Metadata Integrator cannot associate the column to the correct table.
Action: Fully qualify the column reference or add the tables used by the derived tables to the universe. The
Metadata Integrator treats the universe tables as a system catalog to find the table and column references.
Cause: The SQL parser in Metadata Management has limited parsing capabilities to extract column names and
table names to build the relationships. For example, if a Metadata Integrator fails to parse the SQL for a view, it
cannot build the source-target relationship between the view and the table or tables upon which the view is based.
However, the Metadata Integrators collect the SQL statement and the Metadata Management Explorer displays it.
Action: Analyze the SQL statement in the Metadata Management Explorer and establish a user-defined
relationship for these tables and columns.
Cause: You do not have sufficient privileges to extract metadata about Web Intelligence documents.
Action: Do one of the following:
● Have your administrator change your security profile to give you permission to refresh Web Intelligence documents.
● Run the Metadata Integrator with a different user id that has permission to refresh Web Intelligence documents.
Cause: If a database connection is not configured for Trusted Authentication in SAP BusinessObjects Business
Intelligence (BI) Platform, you must supply the user id and password at runtime. If you try to collect metadata for
a report that uses a non-Trusted connection to the database, the report collection fails.
Action: Configure both your SAP BusinessObjects BI Platform server and client to enable Trusted Authentication.
For details, see the SAP BusinessObjects Business Intelligence Platform Administrator Guide.
Cause: The extract for Web Intelligence documents fails if you create your Web Intelligence documents with the Refresh on Open option and the computer on which you run the SAP BusinessObjects Enterprise Metadata Integrator does not have a connection to the source database on which the reports are defined.
Action: Do one of the following:
● Run SAP Information Steward on the computer where SAP BusinessObjects Enterprise is installed.
● Define the database connection on the computer where you run SAP Metadata Management.
8.8.8 Connection with Data Federator Designer
If the Data Federator Integrator connects successfully to the Data Federator Designer, but Data Federator returns
an error:
SAP Information Steward provides the capability to group metadata sources into groups such as Development
System, Test System, and Production System. After the groups are defined, you can view impact and lineage
diagrams for a specific Source Group.
You must have been assigned the right to add objects before you can follow these steps.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select Metadata Management > Integrator Sources from the tree in the left pane.
A list of existing integrator sources appears in the main pane.
You can see the new source group listed with the other source groups by choosing Metadata Management > Source Groups.
You must have been assigned the right to edit objects before you can follow these steps.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
You must have been assigned the right to delete objects before you can follow these steps.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
Configuring BI Launch Pad enables users on SAP BusinessObjects Business Intelligence (BI) platform to access
Information Steward on a different computer with SAP BusinessObjects Information platform services (IPS).
If your BI Launch Pad is on BI platform 4.0 SPx, you must install the ISAddOn utility on your BI platform 4.0 SPx system. For details, see "Installing the IS AddOn utility" in the Installation Guide.
You must configure BI Launch Pad in Information Steward if Information Steward is installed on IPS and BI Launch
Pad users will view lineage of reports (such as Crystal Reports or Web Intelligence documents) on a different
computer, or view terms and categories associated with the lineage of reports on a different computer.
For details about lineage diagrams, see the “Relationships for universe objects and reports” section in the User
Guide. For details about Metapedia terms and categories, see the “Understanding terms and categories” topic in
the User Guide.
Note
This BI Launch Pad integration configuration is only applicable to BI platform 4.1 SP1 and later versions of BI
platform.
1. On the computer where IPS is installed, log in to the Central Management Console (CMC) with a user name
that belongs to the Metadata Management Administrator group or the Administrator group.
Note
When you open the CMC web interface, make sure you use the web server name instead of "localhost". For example, instead of using http://localhost:<port_number>/BOE/CMC, use http://<webserver>:<port_number>/BOE/CMC. Using the localhost URL is not supported in the software. See "Accessing Information Steward for administrative tasks" in the Administrator Guide for details.
2. At the top of the CMC Home page, select Applications from the navigation drop-down list located at the top
left of the screen.
3. Right-click on Information Steward Application in the Application Name list and select BI Launch Pad
Integration from the drop-down menu.
4. Enter the connection information for the BI platform Central Management Server (CMS) for your BI Launch
Pad application, and click Save.
5. On the computer where BI platform is installed, assign the View right for the Information Steward Application object to the user who will access Information Steward from BI Launch Pad. For more details, see the topic "To assign principals to an access control list for an object" in the SAP BusinessObjects Business Intelligence platform 4.1 Administrator Guide.
You can display custom Crystal reports that run against tables in the Information Steward repository.
Use a properties file to define the business question that the report results will answer (for example, "which tables have similar names in different integrator sources").
1. Create a custom reports directory on a single web application server or on multiple servers.
Create the directory on a single web application server when the directory path should be available on or
accessible from the web application server machine.
Create the directory on multiple web application servers when the same directory path and content is
maintained on each server machine or when an NFS-shared directory path that is accessible from all
application servers is used. (In the latter case, the user account under which the application server is running
must have read permissions on the directory contents.)
2. For each Crystal report, create a properties file and name it using one of the following conventions:
○ When the Crystal report is in the default language of the operating system, use
reportfilename.rpt.properties.
○ When the Crystal report uses a different preferred viewing locale than the operating system, use
reportfilename.rpt_<language>.properties.
3. In the properties file, add a name entry with the following attributes:
Table 82:
name: Name of the Crystal report to display in the Information Steward web application.
description: Description of the Crystal report to display in the Information Steward user interface.
categories: Category(ies) that this report belongs to. Use a semicolon (;) to separate multiple categories. Categories can be predefined, or you can create new categories.
use_is_repo_connection: Source system for the tables on which a Crystal report will run. Set to True to connect to the Information Steward repository; set to False to use the repository connection defined in the report. The default value is True.
The following example shows a name entry in a properties file for a new custom report. The new report will list table names repeated in multiple integrator sources and put the table names in two categories: "Table name reports" for new categories and "Usage reports" for existing categories.
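A properties file matching that description might look like the following sketch. The name and description values are hypothetical; the key=value form is standard Java properties syntax.

```
name=Duplicate table names
description=Lists table names that are repeated in multiple integrator sources
categories=Table name reports;Usage reports
use_is_repo_connection=True
```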
4. Add each Crystal report and associated properties file to the custom reports directory.
5. Specify the custom reports directory in the Central Management Console (CMC):
a. Log on to the CMC with a user name that belongs to the Metadata Management Administrator group or to
the Administrator group.
b. In the navigation list at the top of the CMC home page, select Applications.
c. In the Applications Name list, select Information Steward.
d. Select Configure Application in the navigation pane at left.
e. Find the Metadata Management Options group.
f. Enter a full path for the directory location in Custom Reports Directory.
g. Click Save.
9 Cleansing Package Builder Administration
Ownership can be reassigned only by an Information Steward administrator, and you can change ownership only for private cleansing packages. Published cleansing packages are either not owned (unlinked) or linked to private cleansing packages; therefore, when you change the ownership of a private cleansing package, the new owner can automatically republish to the linked cleansing package.
○ Click the link in the Name or Kind column for the desired cleansing package.
5. In the Properties window, choose a different owner in the Owner drop-down list, and click Save.
9.3 Changing the description of a cleansing package
You can edit the description of published or private cleansing packages.
○ Click the link in the Name or Kind column for the desired cleansing package.
5. In the Properties window, edit the description, and click Save.
The status of a cleansing package is displayed in the popup dialog that appears when you hover your mouse
pointer over the name of the cleansing package in the Cleansing Package Tasks screen and is also indicated by the
cleansing package icon.
It may take some time for a cleansing package with a Busy status to complete the operation and change to a
Ready state. The state of a cleansing package with a Busy status cannot be changed by an Information Steward
administrator. You can either wait for the operation to complete or delete the cleansing package.
When a cleansing package is opened for editing, it enters a locked state so that no other user may edit it. To close a cleansing package, return to the Cleansing Package Tasks screen, switch to another cleansing package, or log off from Cleansing Package Builder. If the browser window is closed or the computer is shut down without logging off, the cleansing package may become locked.
Note
When making significant changes to a cleansing package, you may want to save a copy by clicking Save As in
the Cleansing Package Tasks window. If your cleansing package becomes locked, you will have a copy to work
on.
A cleansing package may become locked when it is in any of the following states:
1. (Data steward) When you encounter a locked cleansing package in the Cleansing Package Tasks screen, ask your Information Steward administrator to unlock it.
2. (Information Steward administrator) Change the cleansing package state from Locked to Error.
a. Log in to the Central Management Console (CMC).
b. Select Information Steward.
c. Expand the Cleansing Package node and select Private or Published.
d. Right-click the desired cleansing package, and choose Properties.
e. Change the state to Error and click Save.
f. Notify the data steward that the cleansing package state is updated to Error.
The data steward must verify the condition of the cleansing package prior to further use.
3. (Data steward) When notified that the cleansing package is unlocked and moved to the Error state, do the following:
a. From the Cleansing Package Tasks screen, open the cleansing package.
b. Verify the condition of the cleansing package and that it displays information as expected.
c. Close the cleansing package.
If the cleansing package is returned to a Ready state and its condition was as you expected, you may use
it.
If the cleansing package returns to the Error state or the condition was not as expected, the cleansing
package is corrupt and should be deleted.
Run a content upgrade to apply release changes to your applicable private SAP-supplied person and firm (global)
cleansing packages.
The content upgrade process can upgrade the following objects, when applicable:
● Standard forms
● Variations
● Reference data
● Rules
If you have created and modified copies of the SAP-supplied person and firm cleansing packages, you can perform a content upgrade on your private copies without losing any modifications.
The Content Upgrade dialog box lists only the eligible private person and firm cleansing packages. A cleansing package is ineligible in any of the following situations:
● Created using the Create Person and Firm Cleansing Package dialog box
● Created as a single domain person and firm cleansing package
● Already upgraded to the latest version
You can choose to automatically publish a private cleansing package through the content upgrade process. If a
published package already exists for the private cleansing package, the software lists the name of the existing
cleansing package in the Content Upgrade dialog box, and republishes the cleansing package when the upgrade
process completes.
When a content upgrade is available, we notify you in the Upgrade Guide, which accompanies each Information Steward release package. For information about the most recent upgrade requirements, see the topic "Changes in version <current version>" in the latest Upgrade Guide.
The software writes all errors and warnings for a content upgrade to a log file.
Administrators can find the log file in the platform's logging folder.
The log file name includes the name of the service where Cleansing Package Builder's Publishing Service is
installed. By default, the location is in the EIM Adaptive Processing Server. Use the date and time listed in the file
explorer to help determine the correct log file.
Example
In Windows, find the log file in <%BOE_Install%>\SAP BusinessObjects Enterprise XI <current_version>\logging
Example
Log file name: aps_<system_name>.EIMAdaptiveProcessingServer_trace.<number>.glf
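When several trace files accumulate, the newest one can be picked out programmatically. The sketch below is a hypothetical helper, not part of Information Steward; it assumes only the file-name pattern shown above and a logging folder path you supply:

```python
import glob
import os

def newest_trace_log(log_dir):
    """Return the most recently modified EIM APS trace log in log_dir, or None."""
    pattern = os.path.join(log_dir, "aps_*.EIMAdaptiveProcessingServer_trace.*.glf")
    logs = glob.glob(pattern)
    # Pick the file with the latest modification time.
    return max(logs, key=os.path.getmtime) if logs else None
```

For example, calling newest_trace_log with the logging folder path returns the full path of the latest .glf trace file, saving a manual sort by date in the file explorer.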
9.5.3 Content upgrade states
During content upgrade, the cleansing package icon indicates that the package is locked or busy.
When content upgrade completes, there may be warnings even when the icon indicates the cleansing package is
“ready”. An error icon indicates that there were errors during the content upgrade.
Note
When errors occur during content upgrade, the software does not make any content changes to that cleansing
package. You must fix the errors and run a content upgrade again on that cleansing package.
To find out if there are warnings for a cleansing package in the ready state, move your mouse pointer over the
cleansing package name for a pop-up message. If there is no message, the content upgrade was successful.
Otherwise the pop-up message indicates that you should contact your administrator to view the log file for a list of
warning messages.
If an error icon appears with the cleansing package, you can access the pop-up message in the same manner as
for warnings. The pop-up message instructs you to contact your administrator.
Table 83:
Error: Content upgrade failed and the software did not upgrade the cleansing package.
You can view cleansing package properties, including states and statuses, in the following locations:
Table 84:
Cleansing Package Tasks screen: Hover your mouse pointer over the desired cleansing package to display the properties sheet.
Information Steward area of the Central Management Console: You must have Information Steward administrator privileges to perform these steps.
You can see a cleansing package state in the properties sheet. Open the properties sheet by hovering your mouse
pointer over the desired cleansing package in the Cleansing Package Tasks screen. The following table describes
the possible states for a cleansing package.
Table 85:
Ready: Cleansing package is in good condition and available for editing or viewing. When a content upgrade is complete, the package returns to the Ready state; however, there may be errors. Hover your mouse pointer over the package to see if there are any content upgrade errors.
Publishing: Cleansing package is in the process of being published. Wait for publishing to complete.
Locked: Cleansing package is locked, either because a user has opened it for editing, it is undergoing a content upgrade, it is being published, or there is an error in the cleansing package. If there is an error, unlock the cleansing package in the CMC.
Error: Cleansing package has errors. If the error is a result of a failed content upgrade, the software did not upgrade the cleansing package.
Canceling analysis: The cleansing package auto-analysis process was canceled (by the data steward). It may take some time for the cleansing package to move to the Ready state.
Canceling publish: The cleansing package publishing process was canceled (by the data steward). It may take some time for the cleansing package to move to the Ready state.
You may also notice an icon appearing next to the cleansing package name. The icon indicates the cleansing
package status.
A cleansing package may have one of the following statuses: Ready, Busy, Locked, or Error.
The status of a cleansing package appears in the properties sheet that opens when you hover your mouse pointer over the cleansing package name in the Cleansing Package Tasks screen, and is also indicated by the cleansing package icon. The following table shows each status and the possible user actions:
Table 86:
Status Icon Possible user action
Busy Wait for the process (auto-analysis or publishing) to complete or cancel the process.
, , or
Administrator Guide
202 PUBLIC Cleansing Package Builder Administration
Status Icon Possible user action
Locked When a cleansing package is opened for editing, its status changes to Locked so that no other
, , or
user may edit it.
Caution
Ensure that you do not have the cleansing package open in a different browser window, or
if the browser closed before the cleansing package was closed, wait at least 20 minutes for
Cleansing Package Builder to automatically close the cleansing package and restore it to a
Ready state.
Contact your Information Steward administrator to unlock the cleansing package from the Central Management Console (CMC). Unlocking a cleansing package changes its status to Error.
Error When an error icon appears, it is either because the file is corrupt or because the administrator has unlocked it. See the Locked status for details.
The state of a cleansing package with a Busy status cannot be changed by an Information Steward administrator.
You can either wait for the operation to complete or delete the cleansing package.
A cleansing package with a Locked status can be unlocked by an Information Steward administrator. Unlocking a
cleansing package changes its status to Error. Before further use, the condition of the cleansing package must be
verified by a data steward.
10 Match Review Administration
In an enterprise landscape, the same record may exist in different source systems, but with some variations.
Duplicate records may also exist in the same system. The process of identifying and consolidating duplicate data
is important because it delivers direct cost savings by eliminating redundant data, and improves business process
efficiency. Typically, automated data quality processes are deployed to cleanse data, standardize it, and
deduplicate matching records. However, manual intervention is unavoidable when these automated processes
cannot determine the duplicate records with enough confidence. The Match Review module in SAP Information
Steward enables business users or data stewards to review the results of automated matching and to make any
necessary corrections.
The Match Review workflow involves processes in SAP Data Services and SAP Information Steward, in addition to
tasks performed by administrators, data stewards, reviewers, and approvers.
The key tasks to incorporate Match Review in the overall data quality process are as follows:
1. A Data Services job loads the results of an automated matching process to a match results table in a staging
database. (This job typically involves address cleansing, data cleansing, standardization, and matching.)
2. The Data Services job groups similar records that are possible duplicates into match groups.
3. A configuration manager configures the Match Review process, and an administrator schedules it.
4. Information Steward monitors the staging database at scheduled intervals and triggers the match review
process if new records are available for review.
5. Information Steward groups match groups in a match review task. A match review task contains all new
match groups with the same job instance ID that need to be reviewed and (if required) approved.
6. Information Steward adds the task to the worklist of the assigned reviewers and approvers.
7. Reviewers review the match groups and unmatch any that are not duplicates. If Match Review is configured to
allow reviewers to reassign the master record in match groups, reviewers can reassign the master record if
necessary.
8. If Match Review is configured to require reviewers' actions to be approved, the review results are submitted to
approvers who approve or reject the actions.
9. A data steward monitors the overall progress of the match review and ensures that sufficient reviewers and
approvers are assigned to finish the task on time.
10. Information Steward posts the match review results by updating the match results table. For auditing
purposes, Information Steward stores a complete history of all changes to records and match groups.
11. The data quality process picks up the match review results from the staging database and integrates the data
into the target application or database systems.
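Steps 4 through 6 above can be sketched as a polling pass over the staging area: find completed job runs, then bundle all of that run's match groups into one review task. The table and column names below are illustrative placeholders, not Information Steward's actual schema:

```python
import sqlite3

# Illustrative staging schema; real table and column names are defined
# by the ETL developer and the match review configuration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_status (job_run_id TEXT, status TEXT)")
conn.execute("CREATE TABLE match_results (job_run_id TEXT, group_no TEXT, rank TEXT, score INTEGER)")
conn.executemany("INSERT INTO match_results VALUES (?,?,?,?)", [
    ("RUN_1", "G1", "M", 100), ("RUN_1", "G1", "S", 92),
    ("RUN_1", "G2", "M", 100), ("RUN_1", "G2", "S", 85),
])
conn.execute("INSERT INTO job_status VALUES ('RUN_1', 'Pending Match Review')")

def new_review_tasks(conn):
    """One review task per completed job run: all match groups with that job run ID."""
    tasks = {}
    ready = conn.execute(
        "SELECT job_run_id FROM job_status WHERE status = 'Pending Match Review'")
    for (run_id,) in ready.fetchall():
        groups = [g for (g,) in conn.execute(
            "SELECT DISTINCT group_no FROM match_results WHERE job_run_id = ?", (run_id,))]
        tasks[run_id] = sorted(groups)
    return tasks

print(new_review_tasks(conn))  # {'RUN_1': ['G1', 'G2']}
```

The point of keying tasks on the job run ID is that each delta load from the same run arrives in the reviewer's worklist as a single task, as described in step 5.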
More information:
● User Guide, “Creating or editing a match review configuration”
An extract, transform, and load (ETL) developer creates and configures an SAP Data Services automated
matching job. The match groups, which are the output of this job, must be stored in a match result table in a
staging database. A member of the Data Review Administrator group defines a connection to the staging
database so that Information Steward can access it.
Table 87:
Required by match review:
Source System: Unique identifier of the system from which the source records originate.
Source Record ID: Primary key of the record in the source system. The combination of SOURCE_SYSTEM and SOURCE_RECORD_ID uniquely identifies the source record.
Job Run ID: Unique job run ID of each SAP Data Services job that adds records to the match result table. This field is required to group match groups into one match review task that appears in the reviewer's worklist. Each delta load from the same job run ID is grouped together as one task.
Output from Data Services job:
Match Group Number: Alphanumeric string that uniquely identifies a match group.
Match Group Rank: Indicates whether the record is a master record (M) or a subordinate record (S).
Match Score: Number between 0 and 100 indicating the similarity between the subordinate record and the master record for the MTC_GROUP_NUMBER. A high number indicates a high similarity.
Source data fields:
Source Column 1 through Source Column N: Fields from the various source systems. The reviewer uses these fields for comparing records in match review.
Note
Your administrator is responsible for purging the match result table.
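The constraints implied by Table 87 can be checked in code. The tuple layout and helper below are illustrative only; your staging table's actual column names are whatever the ETL developer defined:

```python
# Each row mirrors Table 87: (source_system, source_record_id, job_run_id,
# group_number, group_rank, match_score). Names and layout are illustrative.
rows = [
    ("CRM", "1001", "RUN_1", "G1", "M", 100),
    ("ERP", "2001", "RUN_1", "G1", "S", 91),
    ("ERP", "2002", "RUN_1", "G1", "S", 78),
]

def validate_match_group(rows):
    """Check a single match group against the table's rules."""
    # Exactly one master record (rank M); the rest are subordinates (S).
    masters = [r for r in rows if r[4] == "M"]
    assert len(masters) == 1, "a match group has exactly one master record"
    # Match score is a number between 0 and 100.
    for r in rows:
        assert 0 <= r[5] <= 100, "match score out of range"
    # SOURCE_SYSTEM + SOURCE_RECORD_ID must uniquely identify each record.
    keys = {(r[0], r[1]) for r in rows}
    assert len(keys) == len(rows), "duplicate source record key"
    return masters[0]

print(validate_match_group(rows)[1])  # prints 1001, the master's source record ID
```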
A Data Review administrator defines the match review connection to the staging database and defines the users
or user groups and their permissions for this connection. (The administrator must be assigned to the Data Review
Administrator group in the Central Management Console (CMC) under Users and Groups.)
The administrator also defines the connection to the staging database in the CMC under Information Steward > Connections. By default, any user from the Data Review Administrator group has full permission for the Connections folder.
The administrator also defines a Match Review job status table in the staging area. The table is updated at the end
of the SAP Data Services job to indicate that the job is complete and that the results are ready for review.
The Match Review job status table must have specific columns as described in the table below.
Table 88:
Job Run ID: Each Data Services job that adds records to the match result table has a unique job run ID. This field is required to group match groups into one match review task that appears in the reviewer's worklist. Each delta load from the same job run ID is grouped together as one task.
Job Status: Status of the job represented by the job run ID. When the Data Services job is complete and the match results are loaded to the staging database, Information Steward expects the job status to be updated to the value specified in the match review configuration. The default value is Pending Match Review. This ensures that Information Steward does not create the match review task while the Data Services job is still in progress and while the match results for the same job run are being added to the staging table.
Job Status Value: Shows the unique column representing the Data Services job status.
Indicates Status Change: Select to monitor the progress of the Match Review. For example, when the job status value changes from "In Progress (I)" to "Completed (C)", you could start the next process in your workflow.
For Match Review, the duplicate records are the output of an SAP Data Services match transform job, and they
reside in a table in a staging database. Information Steward provides the "For data review" connection purpose for
the staging database. This section describes the connection parameters to the staging database, which can be
different database types.
Note
For a complete list of supported databases and their versions, see the Platform Availability Matrix available at
http://service.sap.com/PAM .
1. Ensure that you have the proper privileges on the staging database. You must have privileges to read the
metadata and data from the match result tables.
2. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Review
Administrator group or that has the Create right on Connections in Information Steward.
3. At the CMC home page, click Information Steward.
4. Select the Connections node in the Tree panel.
Table 89:
Option Description
Connection Name Name that you want to use for this Match Review connection.
○ Maximum length is 64 characters
○ Can be multi-byte
○ Case insensitive
○ Can include underscores and spaces
○ Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
7. In the Connection Type drop-down list, select the Database connection value.
8. In the Purpose of connection drop-down list, select For data review.
9. In the Database Type drop-down list, select the database that contains the match review results.
10. Enter the relevant connection information for the database type.
11. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
12. Click Save.
Note
After you save the connection, you cannot change its name, connection type, purpose, or the connection parameters that uniquely identify a database.
The newly configured connection appears in the list on the right of the Information Steward page.
After you create a connection, you must authorize users to it so that they can perform tasks such as creating or editing match review configurations, running a match review configuration, and force-completing a match review task.
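The connection-name rules listed in Table 89 can be checked mechanically. A sketch only; the CMC performs its own validation when you save the connection:

```python
# The special characters that Table 89 disallows in a connection name.
FORBIDDEN = set("?!@#$%^&*()-+={}[]:\";'/\\|.,`~")

def is_valid_connection_name(name):
    """Mirror the rules in Table 89: non-empty, at most 64 characters,
    underscores and spaces allowed, none of the listed special characters.
    (Illustrative helper, not part of Information Steward.)"""
    if not name or len(name) > 64:
        return False
    return not any(ch in FORBIDDEN for ch in name)

print(is_valid_connection_name("Match Review_Staging"))  # True
print(is_valid_connection_name("staging?db"))            # False
```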
Table 90:
Server Name (refer to the requirements of your database): Type the name of your DB2 server.
Port Number (five-digit port number; default: 50000): Port number to connect to this DB2 server.
Database Name (refer to the requirements of your database): Type the name of the staging database to which Match Review connects.
User Name (specific to the database server and language): Enter the user name of the account through which SAP Information Steward accesses the database.
Password (specific to the database server and language): Enter the user's password.
Table 91:
Server Name (computer name, fully qualified domain name, or IP address): Enter the name of the machine where the SQL Server instance is located.
Port Number (four-digit port number; default: 1433): Port number to connect to this Microsoft SQL Server.
Database Name (refer to the requirements of your database): Enter the name of the staging database to which Match Review connects.
User Name (specific to the database server and language): Enter the user name of the account through which Information Steward accesses the database.
Password (specific to the database server and language): Enter the user's password.
Table 92:
RAC Connection (Yes or No; default: No): Indicate whether or not to use Oracle RAC to connect to the Match Review staging database.
Server Name (computer name, fully qualified domain name, or IP address): Enter the name of the machine where the Oracle server instance is located.
Port Number (four-digit port number; default: 1521): Port number to connect to this Oracle server.
Instance Name (specific to the database server and language): Enter the Oracle instance name or SID.
User Name (specific to the database server and language): Enter the user name of the account through which the software accesses the database.
Password (specific to the database server and language): Enter the user's password.
Table 93:
Server Name (computer name): Enter the name of the computer where the SAP HANA server is located.
Port Number (five-digit port number; default: 30015): Port number to connect to this SAP HANA server.
Database Name (refer to the requirements of your database): (Optional) Enter the name of the staging database to which Match Review connects.
User Name (specific to the database server and language): Enter the user name of the account through which Information Steward accesses the database.
Password (specific to the database server and language): Enter the user's password.
Schema Name (refer to the requirements of your database): Name of the database schema.
Table 94:
Server Name (computer name): Enter the name of the computer where the SAP ASE server is located.
Port Number (five-digit port number; default: 50000): Port number to connect to this SAP ASE server.
Database Name (refer to the requirements of your database): Enter the name of the staging database to which Match Review connects.
User Name (specific to the database server and language): Enter the user name of the account through which the software accesses the database.
Password (specific to the database server and language): Enter the user's password.
11 Information Steward Utilities
SAP Information Steward provides utilities that help you analyze your data using scorecards and lineage reports,
increase disk space by purging files and updating search indexes, and sending email notifications for specific
tasks. You manage utilities in the CMC.
Table 95:
Utility: Calculate Scorecard
Description: Calculates scores of key data domains regularly for data quality scorecards.
Note: You can also recalculate scorecards immediately from the Data Insight tab in Information Steward when you select Now for the Show score as of option.
Configurable Properties: None
Default Schedule: Daily (you can change this default schedule)
Utility: Clean up Failed Data
Description: Purges old and obsolete failed data from all of your failed data repository connections on a periodic basis.
Configurable Properties: None
Default Schedule: 0 (zero) days
Note: The default schedule is 0 (zero) days so that your existing failed data history is not deleted when you upgrade the software. A negative number (-1) causes the utility to never clean up history. A positive number causes the utility to clean up history every set number of days (for example, set to 5 to run the utility every 5 days).
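Read literally, the Clean up Failed Data setting encodes three behaviors. A small sketch of that interpretation (an illustration of the note above, not actual Information Steward code):

```python
def cleanup_due(setting_days, days_since_last_run):
    """Interpret the Clean up Failed Data schedule setting described above.
    A negative number means never clean up; 0 (the upgrade-safe default)
    leaves existing failed-data history untouched; a positive number N
    runs the cleanup every N days. Illustrative only."""
    if setting_days <= 0:
        return False
    return days_since_last_run >= setting_days

print(cleanup_due(-1, 365))  # False: never clean up
print(cleanup_due(0, 365))   # False: upgrade-safe default
print(cleanup_due(5, 5))     # True: run every 5 days
```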
Utility: Compute Lineage Report
Description: Computes and stores end-to-end impact and lineage information across all integrator sources for the Reports option in the Open drop-down list in the Metadata Management module. Information Steward provides a configured Compute Lineage Report utility that computes impact and lineage for only the integrator sources that contain changes since the last time the computation was run. You can also create another configuration to compute impact and lineage across all integrator sources.
Configurable Properties: Mode
Default Schedule: None (you either create a schedule for a configured utility or run it immediately)
Utility: Purge
Description: Increases disk space in the Information Steward repository in the following ways:
● Deletes database logs after integrator sources have been deleted.
● Deletes profile results, scores, and sample data that have exceeded the configured retention period.
● Deletes match review tasks that have exceeded the configured retention period.
● Deletes Data Cleansing Advisor's staged data tables that have exceeded the configured retention period. (Users can restore data cleansing solutions, if necessary, even if staged data has been purged.)
● Purges logically deleted information that includes sample data for profiling and rule tasks.
Note: If you want to delete profile results, scores, and sample data for an individual table or file before the retention period has been reached, see the User Guide.
Configurable Properties: None
Default Schedule: Daily (you can change this default schedule)
Utility: Email Notification
Description: Sends a reminder email notification to applicable users when a specific task is near or past the due date. This applies to match review, rules, and term approval tasks.
Configurable Properties: None
Default Schedule: Daily
Note: The system checks daily, but only one email reminder is sent per due date type (near or passed) per task.
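The one-reminder-per-due-date-type-per-task rule amounts to deduplicating on a (task, due date type) key. A rough sketch of that behavior:

```python
def reminders_to_send(tasks, already_sent):
    """One email per (task, due date type) pair, per the note above.
    tasks: list of (task_id, due_type) where due_type is 'near' or 'passed'.
    already_sent: set of (task_id, due_type) pairs already notified.
    Illustrative model, not Information Steward's implementation."""
    out = []
    for key in tasks:
        if key not in already_sent:
            out.append(key)
            already_sent.add(key)  # remember so the daily check skips it next time
    return out

sent = set()
print(reminders_to_send([("T1", "near")], sent))    # [('T1', 'near')]
print(reminders_to_send([("T1", "near")], sent))    # []  daily re-check sends no duplicate
print(reminders_to_send([("T1", "passed")], sent))  # [('T1', 'passed')]  new due-date type
```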
Related Information
Monitoring utility executions [page 218]
Modifying utility configurations [page 219]
Creating a utility configuration [page 220]
Deleting failed data for a failed rule task [page 140]
The SAP Information Steward Repository provides a lineage staging table, MMT_Alternate_Relationship, that
consolidates end-to-end impact and lineage information across all integrator sources. The Metadata Management
module provides pre-defined Crystal Reports from this table. You can also create your own reports from this
table. To view the reports on the Reports option in the Open drop-down list in the Metadata Management tab, they
must be Crystal Reports (see "Defining custom reports" in the User Guide).
Before generating reports that rely on this lineage staging table, you should update the lineage information in the
lineage staging table. You can either schedule or run the Compute Lineage Report utility on demand to ensure
those reports contain the latest lineage information.
The following activities can change the lineage information, and it is recommended that you run the lineage
computation after any of these activities occur:
● Run an Integrator to collect metadata from a source system (see "Running a Metadata Integrator" in the
Metadata Management Administration section).
● Change preferences for relationships between objects (see "Changing preferences for relationships" in the
User Guide). The data in the lineage staging table uses the values in Impact and Lineage Preferences and
Object Equivalency Rules to determine impact and lineage relationships across different integrator sources.
● Establish or modify a user-defined relationship of type Impact or Same As (see "Establishing user-defined
relationships between objects" in the User Guide).
The search feature of the Metadata Management module of SAP Information Steward allows you to search for an
object that might exist in any metadata integrator source. When you run a metadata integrator source, Metadata
Management updates the search index with any changed metadata.
You might need to recreate the search indexes in situations such as the following:
● The Search Server was disabled and could not create the index while running a metadata integrator source.
● The search index is corrupted.
● To change the default frequency that the Calculate Scorecard utility is run to generate rule results for the data
quality trend graphs in Data Insight.
● To change the default frequency that the Purge utility is run to increase space in the Information Steward
repository.
● To schedule the Compute Lineage Report utility for Reports on Metadata Management.
● To schedule the Update Search Index utility in Metadata Management.
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the Applications Name list.
11. If you want to trigger the execution of this utility when an event occurs, expand Events, and fill in the
appropriate information. For more information about Events, see the SAP Business Intelligence Platform
Administrator Guide.
12. Click Schedule.
13. If you want this newly created schedule to override the default recurring schedule for the Purge or Calculate
Scorecard utility, delete the old recurring instance.
a. From the list of Utility Configurations, select the name of the utility whose schedule you want to delete.
b. Click Actions > History.
c. Select the recurring schedule that you want to delete and click the delete icon in the menu bar.
● Change the default frequency that the Calculate Scorecard utility is run to generate rule results for the data quality trend graphs in Data Insight.
● Change the default frequency that the Purge utility is run to increase space in the Information Steward repository.
● If you set up a recurring schedule for the Compute Lineage Report utility and you want to change the schedule to compute the lineage information for Reports on Metadata Management.
● If you set up a recurring schedule for the Update Search Index utility and you want to change the schedule to rebuild the search indexes in Metadata Management.
To reschedule a utility:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the Applications Name list.
6. In the top menu tool bar, click Actions > History.
7. On the Utility History screen, select the schedule name that has a schedule status of Recurring and click Reschedule in the top menu bar.
a. Click Recurrence in the navigation tree in the left pane of the Reschedule window.
b. Select the frequency in the Run object drop-down list.
c. Select the additional relevant values for the recurrence option.
d. If you want to provide a different name for this schedule, click Instance Title in the navigation tree and
enter the name.
8. Click Schedule.
The newly created Recurring schedule appears on the Utility History screen.
9. Delete the original Recurring schedule:
a. From the list on the Utility History window, select the original Recurring schedule.
b. Click the Delete icon.
1. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of the
following administration groups:
○ Data Insight Administrator
○ Metadata Management Administrator
○ Administrator
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the Applications Name list.
Caution
If an instance of the lineage report utility is still running, do not start another lineage report utility run.
Starting another utility run might cause deadlocks or delays. The same behavior might occur if you stop the
utility and start another instance right away because the process might still be running in the repository.
7. On the Utility Configurations screen, click the Refresh icon to update the Last Run column for the utility
configuration.
1. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of the
following administration groups:
○ Data Insight Administrator
○ Metadata Management Administrator
○ Administrator
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the Applications Name list.
6. To view the status of the utility run, select the utility configuration name and click Actions > History.
The Schedule Status column can contain the following values:
Table 96:
Schedule Status Description
Pending The utility is scheduled to run one time. When it actually runs, there will be another instance
with status “Running."
Recurring The utility is scheduled to recur. When it actually runs, there will be another instance with
status “Running."
c. To close the Database Log window, click the X in the upper right corner.
8. To save a copy of a utility log:
a. Scroll to the right of the Utility History screen, and click the Download link in the Log File column in the row
of the utility instance you want.
b. Click Save.
c. On the Save As window, browse to the directory where you want to save the log and optionally change the
default file name.
9. To close the Utility History screen, click the X in the upper right corner.
Information Steward provides a default configuration for each of the utilities. You can modify the configuration settings for the following utilities:
● Compute Lineage Report
● Update Search Index
Note
The Calculate Scorecard, Purge, and Clean up Failed Data utilities do not have configuration parameters.
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. Select Applications from the navigation list at the top of CMC Home.
3. Double-click Information Steward Application in the Application Name list.
4. Select Manage Utilities from the navigation list at left.
5. Select the applicable utility.
7. For a Compute Lineage Report utility, you can change the following parameters:
○ Description
○ Mode
Mode can be set to one of the following values:
○ Full mode recalculates all impact and lineage information and repopulates the entire lineage staging
table.
Note
If you select Full mode, the computation can take a long time to run because it recalculates impact
and lineage information across all integrator sources.
○ Optimized mode (the default) recalculates impact and lineage information for only the integrator
sources that contain changes since the last time the computation was run. For example, if only one
integrator was run, the computation only recalculates impact and lineage information corresponding
to that integrator source and updates the lineage staging table.
8. For an Update Search Index utility, you can change the following parameters:
○ Description
○ Integrator Source
○ All Sources recreates the search index for all integrator sources that you have configured.
○ The specific name of an integrator source that appears in the list.
9. Click Save.
SAP Information Steward provides a default configuration for each of the utilities. You can define another
configuration with different settings and still keep the default configuration for the following utilities:
● Compute Lineage Report utility
The default configuration for the Compute Lineage Report utility has Mode set to Optimized which
recalculates lineage information in the lineage staging table for only integrator sources that have changed
since the utility was last run. You might want to configure another Compute Lineage Report utility with Mode
set to Full to recalculate lineage information across all integrator sources.
● Update Search Index utility
The default configuration for the Update Search Index utility has Integrator Source set to All Sources which
rebuilds the search indexes in Metadata Management for all integrator sources. You might want to configure
another Update Search Index utility with Integrator Source set to only one of the integrator sources to rebuild
indexes for the metadata collected for only that integrator source.
Note
You cannot define another configuration for the Calculate Scorecard, Purge, Clean up Failed Data, and Email
Notification utilities.
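The practical difference between the two Compute Lineage Report configurations described above is which integrator sources get recomputed. In effect:

```python
def sources_to_recompute(mode, all_sources, changed_since_last_run):
    """Full recalculates every integrator source; Optimized (the default)
    recalculates only sources changed since the last run. Conceptual
    sketch of the mode described above, not Information Steward code."""
    if mode == "Full":
        return sorted(all_sources)
    if mode == "Optimized":
        return sorted(changed_since_last_run)
    raise ValueError("mode must be 'Full' or 'Optimized'")

# Hypothetical integrator sources for illustration.
all_srcs = {"BI_platform", "Data_Services", "HANA_views"}
changed = {"Data_Services"}
print(sources_to_recompute("Optimized", all_srcs, changed))  # ['Data_Services']
print(sources_to_recompute("Full", all_srcs, changed))       # all three sources
```

This is why a Full-mode run can take much longer: its input is every configured integrator source, regardless of what changed.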
1. Log in to the Central Management Console (CMC) with a user name that belongs to any of the following
groups:
○ Metadata Management Administrator
○ Administrator
2. Select Applications from the navigation list at the top of CMC Home.
3. Double-click Information Steward Application to open the Information Steward Settings window.
4. Select Manage Utilities from the navigation list at left.
5. On the Utilities Configurations window, click Manage > New Utility Configuration in the top menu tool bar.
6. In the Utility Type drop-down list, select the utility you want to create a new configuration for.
7. Type a Name and Description for the utility.
8. If you choose to create a new Report Lineage Utility:
a. To recalculate the entire impact and lineage information across all integrator sources, change the Mode
default value from Optimized to Full.
9. If you choose to create a new Search Index Utility:
a. To rebuild indexes for the metadata collected for only one integrator source, select its name from the
Integrator Source drop-down list.
10. Click Save.
The new name appears on the new Utility Configurations window.
Related Information
Modifying utility configurations [page 219]
12 Server Management
● SAP Business Intelligence platform servers that are collections of services running under a Server Intelligence
Agent (SIA) on a host. Information Steward uses the following servers and services:
Enterprise Information Management Adaptive Processing Server which has the following services:
○ Cleansing Package Builder Auto-analysis Service
○ Cleansing Package Builder Core Service
○ Cleansing Package Builder Publishing Service
○ Information Steward Administration Task Service
○ Information Steward Data Review Service
○ Data Cleansing Advisor Service
○ Application Service
○ Metadata Search Service
○ Data Services Metadata Browsing Service
○ Data Services View Data Service
Information Steward Job Server which has the following services:
○ Information Steward Task Scheduling Service
○ Information Steward Integrator Scheduling Service
○ Information Steward Data Review Task Scheduling Service
● SAP Data Services Job Server which executes the Data Insight profiling tasks. You can create multiple Job
Servers, each on a different computer, to use parallel execution for the profiling tasks.
For a description of these servers and services, see Services [page 17].
This section describes how to manage the above servers for Information Steward.
To verify that the Information Steward servers are running and enabled:
1. From the CMC Home page, select Servers from the CMC Home drop-down list.
2. Expand the Service Categories node and select Enterprise Information Management Services.
The list of servers in the right pane includes a State column that provides the status for each server in the list.
3. Verify that the following servers are “Running” and “Enabled”:
○ AdaptiveProcessingServer
○ JobServer
4. If a server is not running or enabled, do the following:
a. Select the server name from the list.
b. Click Actions from the toolbar and select Start Server or Enable Server.
For more information, see “Services” in the Information Steward Administrator Guide.
Follow these steps to verify that the Information Steward features that were chosen during Information Steward
installation have the corresponding services in the Central Management Console (CMC):
1. Open the CMC and select Servers from the CMC Home drop-down list.
2. Expand Service Categories in the left pane and select Enterprise Information Management Services.
The right pane lists information about the following kinds of servers:
○ AdaptiveProcessingServer (EIM)
○ JobServer (IS)
3. Right-click the applicable server row and select Stop Server from the drop-down list.
The State column changes to Stopped.
4. Right-click the applicable server row again and select Select Services from the drop-down list.
The Select Services: <computer_name>.<server_type> window opens.
5. Verify that the list of services in the Services of <computer_name>.<server_type> column in the right pane is
correct for each feature that was installed.
The table below lists each feature and the applicable service for each type of server.
Table 97: Services

● Information Steward Task Server
  ○ EIM Adaptive Processing Server: Information Steward Administrator Task Service
  ○ Information Steward Job Server (IS JobServer): Information Steward Task Scheduling Service; Information Steward Integrator Scheduling Service
● Information Steward Data Review Server
  ○ EIM Adaptive Processing Server: N/A
  ○ Information Steward Job Server (IS JobServer): Information Steward Data Review Task Scheduling Service
6. To add any missing service to the Services of <computer_name>.<server_type> list, select the service from the Available services list in the left pane and use the arrow button > to move it to the list in the right pane.
Note
If you want to be able to use Information Steward's new Data Cleansing Advisor feature, you must move
the Data Cleansing Advisor service from the list of available services to the list of services for
EIMAdaptiveProcessingServer.
The Data Services installation process configures the following services (under the server EIMAdaptiveProcessingServer) with default settings.
These services are used by Information Steward to connect to and view data in profiling sources. You might want to change the configuration settings to more effectively integrate Information Steward with your hardware, software, and network configurations.
Note
Not all changes take effect immediately. If a setting cannot change immediately, the Properties window displays both the current setting (in red text) and the updated setting. When you return to the Servers management area, the server is marked as Stale. When you restart the server, it uses the updated settings from the Properties dialog box and the Stale flag is removed.
You can change the following properties of the Metadata Browsing Service.

Table 98: Metadata Browsing Service configuration parameters

● Service Name: Name of the service configuration. Alphanumeric string with a maximum length of 64; the Service Name cannot contain any spaces.
● Maximum Data Source Connections: Maximum number of data source connections that can be opened at any time under a service instance. Any integer. Default value: 200.
● Retry attempts to launch Service Provider: Maximum number of attempts to launch a new service provider when there is contention to access a shared service provider. Default value: 1.
● Stateful Connection Timeout (seconds): Maximum duration for which a stateful connection is open. Stateful connections include SAP Applications and SAP BW Source. Default value: 1200.
● Stateless Connection Timeout (seconds): Maximum duration for which a stateless connection is open. Stateless connections include all relational database sources. Default value: 1200.
● Recycle Threshold: Maximum number of requests that will be processed by a service before the Data Services backend engine is recycled to free memory that was allocated for metadata browsing. Default value: 50000.
● Log Level: Level of logging of trace messages to the log file. Possible values: Information Steward logs:
● Collect Connection Statistics: Enable or disable the collection of statistic information for each open connection. Default is enabled.
● Listener Port: Port number used to communicate with the Data Services backend engine. Four-digit port number that is not currently in use. Default value: 4010. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
● JMX Connector Port: Port number used for the JMX Connector. Four-digit port number that is not currently in use. Default value: 4011. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
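Before assigning or changing the Listener and JMX Connector ports above, you can verify that a candidate port is not already in use. A minimal sketch (the host and port values are illustrative; run it on the machine that hosts the EIMAdaptiveProcessingServer):

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently listening on the TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1)
        # connect_ex returns 0 when something accepts the connection
        return s.connect_ex((host, port)) != 0

# Default Metadata Browsing Service ports from the table above.
for port in (4010, 4011):
    print(port, "free" if port_is_free(port) else "in use")
```

Remember that after changing either port you must still restart the EIMAdaptiveProcessingServer, as noted in the table.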
You can change the following properties of the View Data Service.

Table 99: View Data Service configuration parameters

● Service Name: Name of the service configuration. Alphanumeric string with a maximum length of 64; the Service Name cannot contain any spaces.
● Listener Port: Port number used to communicate with the Data Services backend engine. Four-digit integer. Default value: 4012. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
● JMX Connector Port: Port number used for the JMX Connector. Four-digit integer. Default value: 4013. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
● Batch Size (kilobytes): Size of the data to be stored in a view data response. Minimum value: 1000.
● Minimum Shared Service Providers: Minimum number of shared Data Services backend engines that need to be launched at the startup time of the service. Default value: 1.
● Maximum Shared Service Providers: Maximum number of shared Data Services backend engines that can be launched to service the view data requests. Default value: 5.
● Maximum Dedicated Service Providers: Maximum number of dedicated Data Services backend engines that can be launched at any instant of time. Default value: 10.
● Recycle Threshold: Maximum number of requests that will be processed by a service before the Data Services backend engine is recycled to free memory that was allocated for viewing data. Any integer. Default value: 200.
● Number of attempts to launch service provider: Number of attempts to be made to launch the Data Services backend engine instance. Default value: 1.
● Maximum idle time for shared service provider (minutes): Maximum number of minutes that a Data Services backend engine can remain without processing any requests; after this time is exceeded, the engine is shut down. Default value: 120.
● Log Level: Level of logging of trace messages to the log file. Possible values: Information Steward logs:
Information Steward sends email notifications to a specified user for rules, terms, and match review tasks. For email notifications to function properly, administrators must configure settings on the Destination page of the adaptive job server (the Information Steward job server).
Note
You must complete the Domain Name, Host Name, and From (sender email address) fields, and the Port entry must be numeric. If these elements are not present, or the Port number is not numeric, a warning message appears the next time you log in to Information Steward.
7. Complete the rest of the parameters as applicable and click Save & Close.
8. Go to Applications > Information Steward Application > Configure Application. In the Data Review and Worklist Settings section, enter a worklist URL for email notifications. The default value is set to localhost.
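The requirements in the note above can be expressed as a small sanity check. This is an illustrative sketch only; the dictionary keys mirror the field labels on the Destination page and are not the CMC's internal setting names:

```python
def validate_email_destination(settings):
    """Check the Destination-page requirements: Domain Name, Host Name,
    and From must be filled in, and Port must be numeric."""
    problems = []
    for field in ("Domain Name", "Host Name", "From"):
        if not settings.get(field):
            problems.append(f"missing required field: {field}")
    if not str(settings.get("Port", "")).isdigit():
        problems.append("Port must be numeric")
    return problems

# A complete configuration produces no warnings.
ok = {"Domain Name": "example.com", "Host Name": "smtp.example.com",
      "From": "steward@example.com", "Port": "25"}
bad = {"Domain Name": "example.com", "Port": "smtp"}
```

Any non-empty result corresponds to the warning message Information Steward shows at the next login.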
The SAP Data Services Job Server performs the profile and rule tasks on data in Data Insight connections. The Information Steward Task Server sends Data Insight profile tasks to the Data Services Job Server, which partitions the data and uses parallel processing to deliver high data throughput and scalability.
This section contains the tasks to manage Data Services Job Servers for Information Steward.
Related Information
Configuring a Data Services Job Server for Data Insight [page 230]
Adding Data Services Job Servers for Data Insight [page 231]
Displaying job servers for Information Steward [page 231]
Removing a job server [page 232]
To run Data Insight profile and rule tasks or Data Cleansing Advisor, you must use the Data Services Server Manager to create a job server and associate it with the Information Steward repository. This association adds the job server to a pre-defined Information Steward job server group that Data Insight uses to run tasks or create data cleansing solutions. For details about how job server groups improve performance, see the "Performance and Scalability Considerations" section in the Administrator Guide.
To configure a job server and associate it with the Information Steward repository:
1. Access the Data Services Server Manager from the Windows Start menu: Start > All Programs > SAP Data Services <version number> > Data Services Server Manager, where <version number> is your current version of Data Services.
2. In the Job Server tab, click Configuration Editor.
3. In the Job Server Configuration Editor window, click Add.
4. In the Job Server Properties window, enter a name for Job Server name.
Note
The Job Server name you enter for Information Steward must be different from the Job Server used by
Data Services.
5. In the Associated Repositories section, click Add and fill in the Repository Information of the Information
Steward repository that you want to associate with this Job Server.
○ Database type
○ Database Server name
○ Database name
○ Username
○ Password
6. Click Apply and OK.
7. When the Job Server Configuration Editor window displays the job server you just added, click OK.
8. Click Close and Restart to restart the job server with the updated configurations.
9. Click OK when you receive a message saying that Data Services Service will restart.
For more information about using the Data Services Server Manager, see “Server management” in the SAP Data
Services Administrator Guide.
If you installed additional Data Services Job Servers on multiple computers, you can use them to run Data Insight profile and rule tasks even though Information Steward is not installed on those computers.
On each computer where a Data Services Job Server is installed, you must add a job server to the pre-defined Information Steward job server group. For details, see Configuring a Data Services Job Server for Data Insight [page 230].
For more information about job server groups, see "Performance and Scalability Considerations" in the Administrator Guide.
To display the job servers associated with your Information Steward repository:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the Applications Name list.
4. Click Action > View Data Services Job Server in the top menu toolbar.
The List of Data Service Job Servers for Information Steward screen displays:
● The name of each Data Services Job Server associated with the Information Steward repository.
● The computer name and port number for each Data Services Job Server.
Related Information
Configuring a Data Services Job Server for Data Insight [page 230]
To remove a job server from the Information Steward job server group on Data Services:
1. On each computer where you installed additional Data Service Job Servers, access the Data Services Server
Manager from the Windows Start menu:
Start > Programs > SAP Data Services 4.1 > Data Services Server Manager
2. On the Job Server tab, click Configuration Editor.
3. On the Job Server Configuration Editor window, select the name of the job server you want to delete and click
Delete.
4. Click Yes on the prompt that asks if you want to remove persistent cache tables.
5. In the Job Server Properties window, in the Associated Repositories section, ensure the name of your
Information Steward repository is selected and click Delete.
6. In the Repository Information section, enter the password of your Information Steward repository and click
Apply.
7. Click OK to return to the Job Server Configuration Editor window.
8. Click OK to return to the Job Server tab of the Data Services Server Manager window.
9. Click Close and Restart and then click OK to restart the Data Services job server with the updated
configuration.
Related Information
Configuring a Data Services Job Server for Data Insight [page 230]
Adding Data Services Job Servers for Data Insight [page 231]
Displaying job servers for Information Steward [page 231]
13 Performance and Scalability Considerations
The main Information Steward functions that influence performance and sizing are as follows.
Data profiling
Information Steward can perform basic and advanced data profiling operations to collect information about data
attributes like minimum and maximum values, pattern distribution, data dependency, uniqueness, address
profiling, and so on. These operations require intense, complex computations and are affected by the amount of
data that is processed.
Information Steward validates source data against business rules to monitor data quality and generate scores.
Validation is a computation-heavy process that affects performance based on the number of records and the
number of rules.
The complexity of rules also affects performance. For example, if you have lookup functions in rule processing, the
processing takes more time and disk space. Typically, lookup tables are small, but if they are big, the tables can
adversely affect performance.
After a custom cleansing package is initially created, the Auto-analysis service for Cleansing Package Builder runs. The Auto-analysis service analyzes the sample data that was provided during the steps of the wizard to create a custom cleansing package. Part of this analysis groups similar records, which generates parsing rules based on analysis of combinations of parsed values found in the data. With a large number of records and parsed values, this becomes very computation- and memory-intensive.
Metadata integrators
Metadata integrators collect metadata about different objects in the source systems and store it in the
Information Steward repository. If a large amount of metadata is being collected, it can affect performance and
the size of the repository.
You can browse metadata for tables and columns and also view source data. If the table has a large number of
columns, the View Data operation is affected because a lot of data must be accessed, transferred, and displayed
in the user interface.
After the metadata is collected, Information Steward can do end-to-end lineage and impact calculations.
Typically, these are periodically run as utilities to capture the latest information. If the amount of metadata is
large, this calculation can affect performance.
Processing tables in join conditions can slow processing speed based on the table size. Increase processing speed
by ranking (ordering) the tables in a join and setting caching options.
13.2 Architecture
The following diagram shows the architectural components for SAP Business Intelligence platform, SAP Data
Services, and SAP Information Steward. The resource-intensive servers and services are indicated with a red
asterisk:
● Data Services Job Server processes data profiling and rule validation tasks in Data Insight.
● Cleansing Package Builder Auto-analysis Service analyzes sample data and generates parsing rules for
Cleansing Package Builder.
● Information Steward Integrator Scheduling Service processes Metadata Integrators.
● Application Service performs lineage and impact analysis for Metadata Management and provides the
Information Steward web application access to the Information Steward repository.
● Data Services Metadata Browsing Service obtains metadata from Data Insight connections.
● Data Services View Data Service obtains the source data from Data Insight connections.
● Web Application Server handles requests from users on the web applications for Information Steward.
For more details about these servers and services, see Services [page 17].
13.3 Factors that influence performance and sizing
Information Steward works on large amounts of data and metadata. The following factors affect performance and sizing: they determine the required processing power (CPUs), RAM, and hard disk space (for temporary files during processing), as well as the size of the Information Steward repository.
The amount of data is calculated from the number of records and the number of columns. In general, the more data that is processed, the more time and resources it requires. This is true for both profiling and rule validation operations. The data may come from one or more sources, multiple tables within a source, views, or files.
This factor affects the required CPU, RAM, and hard disk space and the Information Steward repository size. The larger the record size, the more resources are required for efficient processing.
If the data being processed has many columns, it requires more time and resources. Column profiling on many columns, or on columns that contain long textual data, also affects performance. In short, the longer the record, the more resources are required.
13.3.1.2 Data characteristics and profiling type
For distribution profiling such as Value, Pattern, or Word distribution, if the data has many distinct values,
patterns, or words, it requires more resources and time to process.
This factor affects the required CPUs, RAM, and Information Steward repository size. The more distinct the data,
the more resources required.
Information Steward is a multi-user web-based application. As the number of concurrent users increases, nearly
all aspects of the application are affected. More users may mean more profiling tasks, more scorecard views and
rule execution, and so on. The key word is concurrent.
If all users run tasks concurrently, it affects the required CPUs, RAM, and hard disk, and the Information Steward
repository size. If most of the users are just viewing the scorecard, then the performance depends more on the
web application server where the Information Steward repository is created.
Improve performance when processing views that contain SAP tables by following these best practices.
Performance is sometimes affected when you process views that contain SAP tables. SAP tables are usually large
and joining two large SAP tables in a view can further affect performance. Follow these best practices for better
performance when processing views that contain SAP tables:
● When possible, only join SAP tables from the same source.
● Use only SAP ABAP-supported functions in mapping, filters, or join condition expressions.
● Reduce the amount of data the software has to analyze by utilizing all of the tools designed to create views
that are specific to your needs.
● If your view includes one or more large SAP tables, specify the ABAP data transfer method in the view
definition and ensure that the SAP application connection parameter ABAP execution option is set to
Generate and Execute.
● Use the Set optimal order option in the Join Conditions tab, and set the larger source tables to the first
positions.
For better performance, use ABAP-supported functions in your views that contain SAP tables.
Certain functions are supported for use with SAP tables and ABAP programs. You can use any available function when forming your expressions, but functions that are not ABAP-supported are processed differently from ABAP-supported functions and will likely take longer to process.
ABAP-supported functions:
● Absolute (abs)
● Decode (decode)
● Left trim (ltrim)
● Left trim ext (ltrim_blanks_ext)
● Replace null (nvl)
● Right trim (rtrim)
● Right trim ext (rtrim_blanks_ext)
● String length (length)
● Substring (substr)
● System date (sysdate)
● System time (systime)
● To lower case (lower)
● To upper case (upper)
Learn about sizing the number of metadata sources and concurrent users
13.3.2.1 Number of metadata sources
Metadata integrators collect metadata from various sources. As the number of metadata sources increases, it
takes more resources and time. This factor affects the required CPUs, RAM, and Information Steward repository
size.
If multiple users view impact and lineage information concurrently, the response time is affected.
Learn about sizing sample data, number of concurrent users, and types of cleansing packages.
For Cleansing Package Builder, the amount of sample data used to create the cleansing package affects the
performance of the Auto-analysis service. This in turn can affect response time for the user interface.
13.3.3.2 Data characteristics
When you create custom cleansing packages, if there are many parsed values per row, it requires more resources
and time to analyze and create parsing rules. For Cleansing Package Builder, the larger the data set, the more
CPU processing power is required. The more parsed values per row, the more RAM that is required.
The more users that create custom cleansing packages concurrently, the more CPUs and RAM are required.
Information Steward uses the SAP Business Intelligence platform and the Data Services platform for most of the heavy computational work. It inherits the service-oriented architecture provided by these platforms to support a reliable, flexible, highly available, and high-performance environment.
Here are some features and recommendations for using the platforms for performance. These are not mutually exclusive, nor are they sufficient by themselves in all cases. Use a combination of the following approaches to improve throughput, reliability, and availability of the deployment.
Related Information
Degree of parallelism [page 245]
Grid computing [page 248]
Multi-threaded file read [page 251]
Data Insight result set optimization [page 252]
Performance settings for input data [page 252]
Settings to control repository size [page 253]
Settings for Metadata Management [page 257]
Settings for Cleansing Package Builder [page 256]
1. Web tier level: Deploy multiple instances of Information Steward web application for high availability and large
numbers of concurrent users.
2. Business Intelligence platform and Information Steward services level: Deploy multiple Business Intelligence
platform services and/or Information Steward services in a distributed environment for load balancing and
scalability. For example, you can deploy multiple metadata integrators on different servers for load balancing.
You can also have multiple Information Steward task servers on different servers for high availability.
3. Data Services Job Server level: A Data Services Job Server group used for a given Information Steward
deployment can have one or more Data Services Job Servers added to it. As the need for Data Services
processing increases (for Data Insight operations), you can scale up by adding more Data Services Job
Servers to the Job Servers group.
Business Intelligence platform provides a distributed scalable architecture. This means that services that are
needed for a specific functionality can be distributed across machines in the given landscape. As long as the
services are in the same Business Intelligence platform environment, it doesn't matter which machine they are on;
they just need to be in the same CMS cluster. The Information Steward web application and the Information
Steward repositories can be on different machines.
Information Steward uses some Business Intelligence platform services, and also has its own services that can be
distributed across machines for better throughput. The general principle is that if one of the services needs many
resources, then it should be on a separate machine. Similarly, if you add capacity to existing hardware, it can be
used for more than one service.
The Data Insight module of Information Steward uses the Data Services Job Server, which supports distributed
processing.
This section offers some recommendations on different combinations of Information Steward services that can
be combined or decoupled.
The most important part of the processing for Data Insight is done by the Data Services Job Server. You can
install Data Services Job Server on multiple machines and make them part of the single job server group that is
used for Information Steward. The profiling and rules tasks are distributed by the Information Steward Job Server
to the Data Services Job Server group. The actual tasks are executed by a specific Data Services Job Server
based on the resource availability on that server. So, if one server is busy, the task can be processed by another
server. In this way multiple profiling and rule tasks can be executed simultaneously.
● Application Service
● Metadata integrators
Generally, these two processes should be run on separate servers. If they are on the same server, they should run
at different times. If there are many metadata integrators and they collect a lot of metadata from different
sources, each one of them could be deployed on a separate server.
Tip
Schedule intensive processes, such as metadata integrators, during non-business hours.
For information about how to deploy integrators on separate servers, see “Scenario 3: Scaling for Metadata
Management” in the Master Guide.
The Auto-analysis service is a resource-intensive service in Cleansing Package Builder. Its CPU and RAM requirements depend on the number of parsed values found in the data. If multiple concurrent users create large cleansing packages, it is recommended that you dedicate one server to the Auto-analysis service and allocate enough memory to the Java process.
The Information Steward repository stores all of the metadata collected, profiling and rule results, and sample data. The repository should be on a separate database server. To avoid resource contention, this should not be the same database server that contains the source data. The Information Steward repository may or may not share a database server with the Business Intelligence platform repository.
The Information Steward repository should be on the same subnetwork as the Data Services Job Server that processes large amounts of data and as the metadata integrator that processes the largest amount of metadata.
Typically, the Information Steward web application is installed on a separate server alongside other web applications. No other services or repositories are installed with the web application, so that response time for Information Steward user interface users is not affected by services that process data.
Information Steward runs many tasks that are resource intensive. Business Intelligence platform provides the
ability to schedule them. This ability can be used to distribute the tasks so that the same resources can be utilized
for multiple purposes. The following can be scheduled:
● Profiling tasks
● Rule tasks
● Match review tasks
● Metadata integrators
● Calculate Scorecard utility
● Compute Report Lineage utility
● Purge utility
● Clean up Failed Data utility
● Update search index utility
If you schedule these tasks so that they run at different times and when few users access the system, you can
achieve good performance with limited resources. This time slicing is highly recommended for profiling, rules
tasks, and metadata integrators. For example, if you have users that process profiling tasks on demand during
business hours, then the metadata integrators and rules task should be scheduled during non-business hours.
If there are large profiling jobs, they should be scheduled during non-business hours and ideally on a dedicated
powerful server.
Information Steward can queue Data Insight tasks when many tasks are requested to run at the same time. Queueing depends on the configuration of the Average Concurrent Tasks option. Based on this setting and the number of Data Services Job Servers in the group, Information Steward calculates the total number of tasks allowed to run simultaneously in a given landscape. Only that many tasks are sent to the Data Services Job Server group for processing; the remaining tasks are queued. As soon as one of the running tasks finishes, the next task in the queue is processed.
Using this setting, you can control how many Data Insight-related processes are running so that resources can be utilized and scheduled for other processes running on the system.
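The queueing behavior described above can be sketched as follows. Note one assumption the guide does not state explicitly: the sketch takes the simultaneous-task limit to be the Average Concurrent Tasks setting multiplied by the number of Job Servers in the group.

```python
from collections import deque

def dispatch_tasks(tasks, average_concurrent_tasks, num_job_servers):
    """Send up to the calculated limit of tasks to the job server group;
    queue the rest until a running task finishes. Illustrative only."""
    # Assumed formula: limit = setting value x number of job servers
    limit = average_concurrent_tasks * num_job_servers
    running = list(tasks[:limit])
    queued = deque(tasks[limit:])
    return running, queued

running, queued = dispatch_tasks([f"task{i}" for i in range(10)],
                                 average_concurrent_tasks=2,
                                 num_job_servers=3)
```

With 10 requested tasks, a limit of 2 x 3 = 6 run immediately and 4 wait in the queue; as each running task completes, the next queued task is submitted.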
For Data Insight functionality, Information Steward uses the Data Services engine. The Data Services engine
supports parallel processing in multiple ways, one of which is Degree of Parallelism (DOP). The basic idea is to
split a single Data Services job into multiple processing units and utilize available processing power (CPUs) on a
server to work on those processing units in parallel. The distribution of work is different for profiling versus rule
processing.
Note
DOP is only used for Data Insight functionality for column profiling and rule processing. Metadata Management
and Cleansing Package Builder do not use DOP.
Note
In general, do not set the DOP more than the number of available CPUs. To fine tune the performance, set the
DOP value based on the number of concurrent tasks and available hard disk and RAM resources. Gradually
increase the value of DOP to reach an optimal setting. For more information, see the SAP Data Services
Performance Optimization Guide.
For column profiling, the task is distributed proportionately across the columns: the task is split into DOP execution units, and each execution unit processes Number of Columns / DOP columns.
For example, a column profiling task for 100 million rows with 20 columns and DOP = 4 is broken down into execution units that process 5 columns each for all 100 million rows. The data is "partitioned" into groups of 5 columns for each execution unit to process.
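The partitioning in the example above can be sketched as follows (an illustrative model, not Data Services internals); rule processing divides work over rules in the same way:

```python
import math

def partition_columns(columns, dop):
    """Split a profiling task into DOP execution units, each handling
    roughly len(columns) / dop columns; every unit scans all rows."""
    per_unit = math.ceil(len(columns) / dop)
    return [columns[i:i + per_unit]
            for i in range(0, len(columns), per_unit)]

units = partition_columns([f"col{i}" for i in range(20)], dop=4)
# 4 execution units of 5 columns each, matching the example above
```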
Advanced profiling tasks (dependency, redundancy, and uniqueness) require complex sorting operations, so it is important to optimize the degree of parallelism settings. The degree of parallelism setting is used to execute sorting operations in parallel to increase throughput.
13.4.5.3 Rule processing
For rule processing, the work is divided by the number of rules (as opposed to the number of columns for profiling): the task is split into DOP execution units, and each execution unit processes Number of Rules / DOP rules.
For example, a rule execution task for 100 million rows with 20 rules and DOP = 4 is broken down into execution units that process 5 rules each for all 100 million rows.
Adding CPUs and increasing DOP does not guarantee improved throughput. When many processes run in parallel,
they also share other hardware resources such as RAM and disk space.
When the Data Services engine processes data, it creates temporary work files in the Pageable Cache Directory.
Naturally, if many processes are running simultaneously, all of them create temporary files in the same location.
Because this directory is accessed by all of the processes simultaneously, there is a potential for disk contention.
In most environments, depending on the disk capacity and speed, you will reach a point where increasing DOP will
not improve performance proportionately.
Therefore, it is important to enhance all aspects of the hardware together: the number of CPUs, hard disk
capacity and speed, and RAM. You need very efficient disk access to go along with an increased DOP. Make
sure the pageable cache directory is set accordingly.
Increasing DOP is most beneficial in the following situations:
● When you have only a few very powerful machines with a lot of processing power, RAM, and fast disk access.
● When you have a large amount of data for profiling or rule tasks.
Note
If you run many profile tasks simultaneously with DOP > 1, each of them could be split into multiple execution
units. For example, 4 tasks with a DOP of 4 could result in 16 execution units. Now there are 16 processes
competing for resources (CPU, RAM, and hard disk) on the same machine. So it is always a good idea to
schedule jobs efficiently or to use multiple Data Services Job servers.
Note
DOP is a global setting that affects the entire landscape.
13.4.6 Grid computing
You can perform grid computing using a Data Services Job Server group. This group is a logical group of
multiple Data Services Job Servers. When you install Information Steward, you can assign a single Data Services
Job Server group to that Information Steward instance. There are two ways you can utilize the Job Server group
with the Distribution level setting: distribution level table and distribution level sub-table.
A single profiling or rule task can work on one or more "tables". The term "table" is used in a general sense: a
set of records with rows and columns. In reality, the data can come from an RDBMS table, a flat file, an SAP
application, and so on.
Note
DOP and distribution level are global settings and affect the entire landscape.
13.4.6.1 Distribution level table
When you set the distribution level to Distribution level table, each table of the task is executed on a separate
Data Services Job Server in the group. The DOP setting still applies on each machine, so the task for a
particular table can be further parallelized on that server.
For example, if you have 8 tables in a task and it is submitted to a Data Services Job Server group with 8 Data
Services Job Servers, then each Data Services Job Server processes one table. If the DOP is set to 4, each
Data Services Job Server tries to parallelize its table into 4 execution units. There is no interdependency
between the different job servers; they share no resources.
When a Data Services Job Server group receives a task that involves multiple tables and the distribution level is
set to Table, it uses an intelligent algorithm that chooses the Data Services servers based on the available
resources. If a particular server is busy, then the task is submitted to a relatively less busy Data Services server.
These calculations are based on the number of CPUs, RAM, and so on. If you have two Data Services servers, one
with many resources and another with low resources, it is quite possible that the bigger server gets a
proportionally higher number of tasks to execute.
You can also choose servers purely in a "round robin" fashion, in which case the task is submitted to the
next available Data Services server. For more information, see the SAP Data Services Administrator Guide.
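The two dispatch strategies can be sketched conceptually. This is a simplified illustration, not the actual Data Services algorithm, which also weighs CPU count, RAM, and current load; the server records below are our own invention.

```python
import itertools

def pick_least_busy(servers):
    """Resource-based dispatch: choose the server with the most free
    capacity, so bigger or idler servers receive more tasks."""
    return max(servers, key=lambda s: s["capacity"] - s["load"])

def round_robin_dispatch(servers):
    """Round-robin dispatch: cycle through the servers in order,
    ignoring their current load."""
    return itertools.cycle(servers)

servers = [
    {"name": "big", "capacity": 16, "load": 4},   # many free CPUs
    {"name": "small", "capacity": 4, "load": 1},  # few free CPUs
]
print(pick_least_busy(servers)["name"])  # big
```

As the guide notes, the resource-based strategy tends to route proportionally more tasks to the bigger server, while round robin ignores capacity entirely.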
13.4.6.2 Distribution level sub-table
Note
This setting is only effective for column profiling. Use this setting with caution, as it may have a negative impact
on the performance of other types of profiling, such as advanced profiling.
When you set the Task distribution level option to Sub table, Data Services distributes a single table task across
multiple machines in the network based on the DOP setting. The basic idea is to split a single task into multiple
independent execution units and send them to different Data Services Job Servers for execution. You can think of
this as DOP, but, instead of multiple CPUs of a single machine, this is across multiple machines.
For example, suppose you have a column profiling task with 100 million rows and 40 columns, the distribution
level is set to Sub-table, the DOP is 8, and there are 8 Data Services Job Servers in the group. The task is split
into 8 execution units of 5 columns each and sent to the 8 Data Services Job Servers, which can then execute
them in parallel. There is no sharing of CPU or RAM, but all of the Data Services Job Servers share the same
pageable cache directory. You must ensure this directory location is shared and accessible to all Data Services
Job Servers. The location should have a very efficient disk, and the network for this setup should be very fast
so that it does not become a bottleneck; otherwise, the gains of parallel processing are negated.
Configuring grid computing is similar to configuring the degree of parallelism, but with the additional aspect of
the distribution level.
● Distribution level table: Use when you have many concurrent profiling and rule tasks that work on large
amounts of data. Set the distribution level to Table so that individual tasks are sent to different servers.
● Distribution level sub-table: Use when you have very few column profiling tasks on very large amounts of
data. In this case, use the distribution level Sub-table. It is important that these machines share an efficient
hard disk and are connected by a fast network.
Information Steward allows direct integration with the SAP Business Warehouse and the SAP ERP Central
Component (ECC) system. One of the main benefits is the ability to connect directly to the production SAP
systems to perform data profiling and data quality analysis based on the actual, most timely data, instead of
connecting to a data warehouse, which is loaded infrequently. To fully utilize this advantage without risking the
performance and user experience on the production ECC system, consider these requirements.
To utilize the back-end resources in the most efficient and sustainable way, the connection user defined for
the interaction between Information Steward and the SAP back-end system should be linked to background
processing. This is the recommended setup for the connection user.
Using a dialog user for the connection between Information Steward and the SAP back-end system should be
considered carefully, and only for smaller data sets not exceeding 50,000 records from a medium-width table.
In dialog mode, processing blocks a dialog process and its resources for the duration of the run. Using this
approach on larger data sets would require changing the heap size for the maximum private memory of a dialog
process, and therefore affects the overall memory required by the ECC system. In addition, extracting larger
data sets will most likely significantly exceed the recommended maximum work process runtime for a dialog
process on the ABAP Web Application Server. The extraction runtime varies with the number of records and the
number of columns in the source table. Because of the significant resources required and the extended runtime
and allocation of a dialog process, using a dialog user for the communication is not recommended.
You can connect to the SAP ECC system and retrieve data for profiling and data quality analysis using one of the
following methods.
Data transfer via the second and third methods has slightly longer turnaround times, but consumes fewer
resources on the back-end system.
It is recommended that you perform optimized requests for data extraction from the SAP system when you define
Information Steward profiling tasks. Consider this optimization in all scenarios: specify only a subset of the
columns in your profile task and use filters to focus on a specific data set. The more optimized the request,
the faster the extraction and overall process execution. Information Steward then requests only the data
defined by the filter; the filter criteria are passed to the SAP system.
When reading flat files, the Data Services engine can split the work across multiple threads to achieve higher
throughput. With multi-threaded file processing, the Data Services engine reads large chunks of data and
processes them simultaneously, so the CPUs are not waiting for file input/output operations to finish.
You can set the number of file processing threads up to the number of available CPUs. Use this setting when
you run profile or rule tasks on a large file.
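A minimal sketch of the idea, reading a file in chunks and handing them to a thread pool. Python threads are used purely for illustration; this is not how the Data Services engine is implemented, and `process_chunk` is a stand-in for real profiling work.

```python
import concurrent.futures
import io
import itertools

def process_chunk(chunk: str) -> int:
    # Stand-in for profiling work: count non-empty records in the chunk.
    return sum(1 for line in chunk.splitlines() if line.strip())

def profile_file(fileobj, n_threads: int = 4, chunk_lines: int = 1000) -> int:
    """Read the file in large chunks and process them concurrently,
    so processing does not wait on each individual read."""
    chunks = []
    while True:
        lines = list(itertools.islice(fileobj, chunk_lines))
        if not lines:
            break
        chunks.append("".join(lines))
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_threads) as ex:
        return sum(ex.map(process_chunk, chunks))

# Four lines, one of them blank -> three non-empty records.
print(profile_file(io.StringIO("a\nb\n\nc\n"), n_threads=2, chunk_lines=2))  # 3
```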
This optimization applies to all scheduled profile and rule tasks. Information Steward provides the ability to
optimize redundant tasks for Data Insight: if the profile or rule results set for a particular table is already
available, the table is not processed again. This is controlled by the Optimization Period setting. Data in
some tables does not change often, and the results set is refreshed at a specified frequency, so there is no
need to execute the task again within that period. If the data is not going to change within a profiling task's
period, there is no need to process the task again. Similarly, if scorecards are calculated only nightly, there
is no need to recalculate the score more often.
Suppose the Optimization Period is set to 24 hours. A rule task was executed for Table1 and the results set is
already stored. If the same task is run again within 24 hours, it is not processed again. Now imagine a single
rule task that involves both Table1 and Table2. In this case, rules are not executed on Table1, but they are
processed for Table2, because Table2 does not have a results set available.
If you want to always obtain the latest results due to the changing nature of the data, set the Optimization Period
to the expected period of change in data.
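The reuse check described above can be sketched as follows. The names and data shapes are ours, for illustration only; they are not SAP code.

```python
from datetime import datetime, timedelta

def tables_to_process(tables, stored_results, optimization_period_hours, now):
    """Return only the tables whose stored results are missing or older
    than the Optimization Period; the rest reuse their stored results."""
    cutoff = now - timedelta(hours=optimization_period_hours)
    return [t for t in tables
            if t not in stored_results or stored_results[t] < cutoff]

now = datetime(2024, 1, 2, 12, 0)
stored = {"Table1": datetime(2024, 1, 2, 0, 0)}  # results are 12 hours old
# With a 24-hour Optimization Period, Table1 is skipped and Table2 runs:
print(tables_to_process(["Table1", "Table2"], stored, 24, now))  # ['Table2']
```

Shortening the period below the age of the stored results forces the table to be processed again, matching the guidance on setting the period to the expected rate of change.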
Note
Any profiling or rule tasks that are run on demand do not use this optimization and all of the data is processed.
For Data Insight functionality, Information Steward provides the following settings to improve performance.
These settings are available for both profile and rule tasks when you create a task. Defaults are set in Configure
Applications.
Maximum input size and sampling rate
When you create a rule or profiling task, specify the maximum number of rows and the rate at which you want to
sample them. Because processing time and resource requirements are proportional to the number of records being
processed, these settings are very important; set them only as high as the task requires. For example, if you
know that a table has 100 million rows and you want to obtain a quick sense of your data profile, set the Max
Input Size to 1 million rows and the Sampling Rate to 100. Every 100th record is processed, up to a maximum of
1 million records.
Filter condition
You can also control exactly which data is processed using the filter condition. Because the number of records
affects performance, set the filter condition and process only the amount of data required.
Suppose you have 10 million records for all countries and there are 1 million records for the U.S. If you are
interested in profiling data for the U.S. only, you should set the filter for country = US. In this way, only 1 million
records are processed.
To further improve performance (if applicable), you can combine the filter condition with Max Input Size and
Sampling Rate.
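The combined effect of the three settings is simple arithmetic, sketched below. This illustrates the counts only, not the engine's implementation; `filter_fraction` is our stand-in for the share of rows that pass the filter condition.

```python
def rows_processed(total_rows, sampling_rate, max_input_size, filter_fraction=1.0):
    """Rows that actually get profiled: apply the filter first, then
    take every Nth record (sampling_rate), capped at max_input_size."""
    after_filter = int(total_rows * filter_fraction)
    return min(after_filter // sampling_rate, max_input_size)

# 100 million rows, every 100th record, capped at 1 million:
print(rows_processed(100_000_000, 100, 1_000_000))  # 1000000
# 10 million rows, a filter keeping the 10% U.S. records, no sampling:
print(rows_processed(10_000_000, 1, 2_000_000, filter_fraction=0.1))  # 1000000
```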
For Data Insight functionality, Information Steward provides the following settings to control the size of the
Information Steward repository. The repository size depends partly on the number of records processed, because
the more data, the larger the potential results set. However, the repository size can be controlled even for
very large amounts of data.
13.4.11.1 Profiling
Several settings affect the size of the Information Steward repository. Choose these numbers carefully based on
what your data domain experts require to understand data. The lower the number, the smaller the repository.
You can set a high number, but the repository size increases. Also, the response time for viewing sample data is
affected because more rows need to be read from the database, transported over the network, and rendered in
the browser.
Table 100:
Option Description
Max sample data size  The sample size for each profile attribute.
Number of distinct values  The number of distinct values to store for the value distribution result.
Number of patterns  The number of patterns to store for the pattern distribution result.
Number of words  The number of words to store for the word distribution result.
Results retention period  The number of days before the profiling results are deleted. The longer you keep the results, the bigger the repository. For more information, see the Purge utility.
The size of the Information Steward repository is controlled by the size of the rule data and the length of time the
data is saved to produce trend data.
Table 101:
Option Description
Max sample data size  The number of failed records to save for each rule. The higher the number, the more records are available to view as sample data. This results in a larger repository size, and the response time for viewing failed sample data is affected.
Score retention period  The number of days before the scores are deleted. The longer you keep the score data, the larger the repository size.
The size of the Information Steward repository is also controlled by the amount of metadata that is collected and
retained. Optimize the amount by selectively choosing the components that you are interested in for different
metadata integrators.
The size of the Information Steward repository is controlled by the size of the match review data and the
length of time the data is saved to produce trend data.
Table 102:
Option Description
Data Review tasks retention period  The number of days before the Match Review tasks are deleted. The longer you keep the data, the larger the repository size. For more information, see the Purge utility in Utilities overview [page 212].
The size of the Data Cleansing Advisor local repository is controlled by the size of the Data Cleansing Advisor
staged data and the length of time the data is saved.
Table 103:
Option Description
Data Cleansing Advisor retention period  The number of days before the Data Cleansing Advisor staged data is deleted. The longer you keep the data, the larger the repository size. The default storage period is 120 days. If you set the value to -1, the staged data is never purged.
To improve Cleansing Package Builder performance, set the runtime parameter options and schedule when
cleansing packages are published.
If there are many parsed values (more than 20) per row in the data being used for Cleansing Package Builder,
then the Auto-Analysis service requires adjustment to the JVM runtime parameter that controls memory
allocation for EIMAPS. If the memory cap for Java is left at 1GB, even though the system has 16GB, the service will
run out of memory. The best practice is to allocate 2 to 3GB of memory to the Java services via the -Xmx setting.
Schedule cleansing package processing during non-business hours. Depending on the size of the cleansing
package, publishing can be a time-consuming task. SAP-supplied cleansing packages for Name-Title-Firm are
typically very large. If you publish frequently and resources are available, use a dedicated server for
publishing.
13.4.13 Settings for Metadata Management
To improve Metadata Management performance, set the options for runtime parameters and memory allocation for
utilities and for impact and lineage diagrams.
● JVM arguments for metadata integrators that collect a large amount of metadata should be adjusted for
higher memory allocation. It is recommended that you update the JVM parameters on the integrator
parameters page to -Xms1024m -Xmx4096m.
● Run-time parameters for the maximum number of concurrent processes to collect metadata should be set to
the number of CPUs that can be dedicated for metadata collection. Typically metadata integrators are
installed on independent servers, so you can set it to the number of CPUs on the server. These parameters for
parallel processing may be different for different metadata integrators.
● Each metadata integrator provides some method of performance improvement specifically for the type of
metadata that it collects. For example, with SAP Enterprise Metadata Integrator, you can reduce processing
time by selectively choosing different components.
● The first time an integrator is run, set the run-time parameter Update Option to Delete existing objects
before starting object collection. For subsequent runs, change it to Update existing objects and add newly
selected objects. For example, with the SAP BusinessObjects Enterprise Metadata Integrator, you can first
collect metadata only for Web Intelligence documents (by selecting only that component, as described in the
previous paragraph). For that first run, set the option to Delete existing. In the next run, you might collect
all of the Crystal Reports metadata; for these subsequent runs, set the parameter to Update existing.
If multiple users simultaneously view large impact and lineage diagrams on Metadata Management, you should
increase the memory that your web application server will use.
To increase the JVM heap size and perm size of the Tomcat web application server:
1. Open a command-line window and access the Apache Tomcat Properties window.
2. Select the Java tab.
3. In Maximum memory pool, replace the default value (1024 MB) with a larger value. For example, type the
value 4096.
4. Click OK.
5. Restart the Tomcat web application server.
The Metadata Management utilities Compute Lineage Report and Update Search Index have configuration
parameters. Run Compute Lineage Report in Optimized mode so that it is updated incrementally.
Metapedia is based on the concept of how terms are associated with metadata objects and Data Insight objects,
and organized into categories.
A Metapedia term is a word or phrase that defines a business concept in your organization.
When creating a term, we recommend that you give it a unique name. From an information governance
perspective, it’s important that each term have only one definition that applies to the whole organization.
There may be situations in which multiple terms must share the same name. To allow terms to share the same
name, set the Allow Duplicated Term Names option to Yes. This option is set to No by default.
Note
When creating a term in Metapedia, the information you enter into the Technical name field must be unique. If a
technical name already exists in the Information Steward repository, an error displays and you won't be able to
save the term.
This option can be found under Applications > Information Steward Applications > Configure Application >
Metapedia Options.
13.5 Best practices for performance and scalability
Learn about the best practices for each module in Information Steward.
● When using a distributed environment, enable and run only the servers that are necessary. For more
information, see the topic “Enabling and disabling servers” in the Business Intelligence Platform Administrator
Guide.
● Use dedicated servers for resource intensive servers like the Data Services Job Server, Metadata Integrators,
and the Cleansing Package Builder Auto-Analysis service.
● Consider these best practices guidelines when deploying the Application Service:
○ The Application Service can be combined on the same computer with the Search Service, Data Review
Service, and Administrative Task Service.
○ The Application Service should be on a separate computer than the web application server to obtain
higher throughput.
○ The Application Service can be on its own computer, or it can be combined with any Metadata Integrator.
The rationale for this combination is that Metadata Integrators usually run at night or other non-business
hours, and the Application Service runs during normal business hours when users are performing tasks
on the Information Steward web application (such as viewing impact and lineage relationships, adding
tables to profile, or creating Metapedia terms).
● Install the Information Steward Web Application on a separate server. The Business Intelligence platform Web
Tier must be installed on the same computer as the Information Steward Web Application. If you do not use
Tomcat, you need to manually deploy the Information Steward Web Application.
● If you have many concurrent users, you can use multiple Information Steward web applications with Load
Balancer.
For more information, see the topic “Fail-over and load balancing” in the SAP BusinessObjects Business
Intelligence platform Web Application Deployment Guide.
● To obtain a higher throughput, the Information Steward repository should be on a separate computer but in
the same sub-network as the Information Steward Web applications, Enterprise Information Management
Adaptive Processing Server, Information Steward Job Server, and Data Services Job Server.
● Make sure that the database server for the Information Steward repository is tuned and has enough
resources.
● Allocate enough memory and hard disk space to individual servers as needed.
● Follow good scheduling practices to make sure that resource intensive tasks do not overlap each other.
Schedule them to run during non-business hours so that on-demand request performance is not affected.
Related Information
Metadata Management best practices [page 262]
Cleansing Package Builder best practices [page 263]
Data Insight best practices
Profiling and rule tasks can consume a large amount of processing resources. If you expect your SAP
Information Steward Data Insight profiling and rule tasks to be resource intensive, the following best
practices may help reduce the processing resources required.
To improve the execution of Data Insight profiling tasks, take advantage of the SAP Data Services Job Server
groups and parallel execution. Put the Data Services Job Server on multiple computers that are separate from the
web application server.
To perform the following tasks, you must access the Data Services Server Manager on each computer.
● Add a Data Services job server and associate it with the Information Steward repository. For more
information, see Adding Data Services Job Servers for Data Insight [page 231].
● Specify the path of the pageable cache that will be shared by all job servers in the Pageable cache directory
option.
There are several more best practices to help reduce processing resources.
● For a predictable distribution of tasks when using multiple Data Services Job Servers, ensure that the
hardware and software configurations are homogeneous. This means that they should all have similar CPU
and RAM capacity.
● Irrespective of whether you use DOP and/or multiple Data Services Job Servers, set the pageable cache
directory on a high-speed, high-capacity disk.
● If you are processing flat files, store them on a high speed disk so that read performance is good.
● Process only data that must be processed. Use settings such as Max input size, Input sampling rate, and Filter
conditions appropriately.
● When working with Data Insight views, use the correct join and filter conditions so that you are pulling in only
required rows. Also use the join ranking and caching options for faster processing.
● Choose only columns that you want to profile or the rules that you want to calculate for the score. Selecting all
columns may lead to redundant processing.
● Word distribution profiling is done only on a few columns. Do not choose this for all columns. Otherwise,
performance and the size of the Information Steward repository is affected.
● Lookup functions in rule processing take more time and disk space. Typically, lookup tables are small, but
large lookup tables can adversely affect performance, so ensure that the tables on which lookup is performed
are small. As an alternative to lookup, you can use the SQL function.
● Choose the DOP settings and the distribution level carefully. Remember that these are global settings and
affect the entire landscape.
● When doing column profiling on large amounts of data with many small-capacity Data Services Job Servers, use
the Sub-table distribution level. Because this setting can have an adverse effect on other types of tasks,
consider changing it back to Table level after the column profiling task is done.
● Store the reference data required for address profiling on a high speed and high capacity disk.
● Make sure that the database server that contains source data is tuned and has enough resources.
● Schedule the Purge utility to run during non-business hours. If column profiling and rules are executed many
times a day, try to schedule the Purge utility to run more than once, so that it can increase free disk space in
the repository.
1. Choose the data retrieval method (synchronous or asynchronous) based on the data set size and
performance requirements of your SAP system.
2. For smaller data sets, use the synchronous method.
3. For larger data sets, use the asynchronous method, where the data from SAP systems is written into a file
that Information Steward uses.
It is recommended to use background processing on the SAP back end, which is controlled by the user type of
the defined connection user.
Dialog processing should be considered carefully. If you use it, adjust run-time parameters such as the heap
size for the maximum private memory and the maximum work process runtime.
How scorecards and projects are organized depends on how business users want to view the scorecards. But this
organization can affect the response time for users. If a project contains many scorecards, details of all the
scorecards must be retrieved for viewing. So it is a good idea to create multiple projects according to the area of
interest and have a limited number of scorecards in those projects.
For example, you could create projects for different geographical locations. Within each project, you could have
different scorecards, such as Customer, Vendor, and so on. Or you could create projects based on Customer,
Vendor, and so on, and then have scorecards based on geography.
The organization also helps you decide what data sources you want to use and the filter conditions involved for
profiling and rule execution.
Another benefit of proper organization is that you can control the user security per project and restrict access to
only specific users.
To avoid future problems, different Data Insight user groups should work together at the beginning of the project
to decide these aspects.
To help reduce processing resources when Data Insight views consist of joins of tables and files from multiple
sources, follow these best practices.
● Use appropriate filter conditions so that you are processing only the required rows.
● Use the join ranking and caching options for faster processing.
● Generally, put SAP tables first.
● Move join filter conditions from the ON clause to the WHERE clause.
● If a filter applies to all tables, specify the filter instead of relying on the join to filter the data.
● Edit the view and change the join order on the Join Conditions tab with the Set Optimal Order feature,
setting the biggest table as the first item in the list and the smallest table as the last item.
● Be careful of filtering by functions. Some functions are not pushed down (the database might not support
them), and they may behave unexpectedly. If possible, rely on simple constant filters.
● If using a Left Outer join, and it is possible to filter most of the right table, use a view on top of views. In other
words, create a view of the left table, create a view of the right table with the filter specified, and then join the
two views. This way, the filter is applied before the outer join, and it will reduce the amount of data for the join
operation. It is important to reduce the data set to thousands of rows instead of millions.
Metadata Management best practices
● Metadata integrators for SAP Enterprise, Data Services, and SAP Business Warehouse should be installed on
their own dedicated servers if they require large processing time or they run in overlapping time periods with
other metadata integrators or Data Insight tasks.
● Any Metadata Integrator can be combined with the Application Service. The rationale for this combination is
that Metadata Integrators usually run at night or other non-business hours, and the Application Service runs
during normal business hours when users are performing tasks on the Information Steward web application
(such as viewing impact and lineage relationships, adding tables to profile, or creating Metapedia terms).
● Another guideline to consider for Metadata Search Service:
○ Can be on its own computer, or it can be combined with any Metadata Integrator. The rationale for this
combination is that Metadata Integrators usually run at night or other non-business hours, and the
Search Server runs during normal business hours when users are searching on the Metadata
Management tab of Information Steward.
● The File Repository Servers should be installed on a server with a high speed and high capacity disk.
● Adjust runtime parameters correctly.
Cleansing Package Builder best practices
● Obtain higher throughput by putting the Cleansing Package Builder Auto-Analysis service on a dedicated
server.
● Use sample data that represents the various patterns in your whole data set. Large amounts of sample data
with many repeating patterns lead to redundant processing overhead.
● Runtime parameters should be set correctly for Auto-Analysis service. Specifically, memory requirements are
very important. If enough memory is not made available to the process, it will run out of memory. If the
memory cap for Java is left at 1GB, even though the system has 16GB, the service will run out of memory. The
best practice is to allocate 2 to 3GB of memory to the Java services via the -Xmx setting.
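For reference, the JVM heap cap is raised with the -Xmx option. A hypothetical launch line is shown below; the JAR name is a placeholder, and in a real installation the Java options for the service are set in its service configuration rather than on an ad hoc command line.

```shell
# Hypothetical example: only the heap flags matter here.
# -Xms sets the initial heap size; -Xmx caps the maximum heap at 3 GB.
java -Xms1g -Xmx3g -jar auto-analysis-service.jar
```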
Follow these best-practice guidelines when defining a match review configuration.
● Select only the columns that you want to review.
● Create filters to focus on specific data.
● Ensure that the database management system has an index created on the filter column.
Also see: User Guide: Match Review, Creating or editing a match review configuration
For better performance, use ABAP-supported functions in your views that contain SAP tables.
Certain functions are supported for use with SAP tables and ABAP programs. You can use any available function when forming your expressions, but functions that are not ABAP-supported are processed differently from the ABAP-supported functions and will likely take longer to process.
ABAP-supported functions:
● Absolute (abs)
● Decode (decode)
● Left trim (ltrim)
● Left trim ext (ltrim_blanks_ext)
● Replace null (nvl)
● Right trim (rtrim)
● Right trim ext (rtrim_blanks_ext)
● String length (length)
● Substring (substr)
● System date (sysdate)
● System time (systime)
● To lower case (lower)
● To upper case (upper)
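For illustration, a validation expression composed only of ABAP-supported functions from the list above might look like the following sketch (the column name CUSTOMER_NAME is invented for the example):

```
length(rtrim(ltrim(upper(nvl(CUSTOMER_NAME, ''))))) > 0
```

An expression that instead used a function outside this list would likely take longer to process, as noted above.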
14 Migration between landscapes
Migration as it relates to SAP Information Steward is the process of moving application configurations between
landscapes.
The application life cycle often involves multiple landscapes, with each landscape used for a different phase or set
of activities. Although the exact practices vary by company, typically the life cycle moves between non-production
(development and test) and production phases.
You can use SAP Information Steward in both phases. Each phase could involve different computers and require
different security settings. For example, the initial test may require only limited sample data and low security,
while final testing may require a full emulation of the production environment including strict security. The
software provides mechanisms for moving objects between landscapes.
In development and test environments, you define and test the following objects for each module of SAP Information Steward:
● Data Insight
Define profile tasks, rules, and scorecards that instruct Information Steward in your data quality
requirements. The software stores the rule definitions so that you can reuse them or modify them as your
system evolves.
● Metadata Management
Define integrator sources, integrator source groups, and integrator source instances that collect metadata to
determine the relationships of data in one source to data in another source.
● Metapedia
Implement a business glossary of terms related to your business data and organize the terms hierarchically into categories. Terms can be associated with metadata objects and with Data Insight objects such as rules and scorecards.
● Cleansing Package Builder
Create and refine person and firm or custom cleansing packages. Cleansing packages contain the parsing
rules and other information that defines how the Data Cleanse transform in SAP Data Services should parse
and standardize name, firm, operational, or product data.
● Match Review
Define and run match review configurations to create match review tasks. The tasks display match groups
created by the Match transform in Data Services. Reviewers review the match groups and make any
necessary corrections.
After you define the objects, use SAP Information Steward to test the execution of your application. At this point,
you can test for errors and trace the flow of execution without exposing production data to any risk. If you
discover errors during this phase, you can correct them and retest the application.
The software provides feedback through trace, error, and monitor logs during this phase.
After migrating and configuring your Information Steward environment to a production environment, consider
taking the following actions:
During production
After you move the software into production, monitor it in the CMC for performance and results in the following
ways:
● Monitor your Data Insight tasks and Metadata Integrator runs and the time it takes for them to complete.
The trace and monitoring logs provide information about each task and run.
You can customize the log details. However, the more information you request in the logs, the longer the task
or integrator runs. Balance run time against the information necessary to analyze performance.
● Check the accuracy of your data.
Evaluate results from production runs and when necessary, return to the development or test landscapes to
optimize performance and refine your target requirements.
Also see:
● User Guide: Cleansing Package Builder, Publishing a cleansing package, Exporting an ATL file for Data
Services
A cleansing package typically moves through the following phases:
● Create it in Cleansing Package Builder in a non-production environment and test it using a Data Cleanse job.
● Promote it to a production environment and use it in Data Cleanse jobs.
● Maintain and revise it when new data is added or a threshold is no longer met.
As a cleansing package progresses between phases you can move it between landscapes using the Promotion
Management tool in the Central Management Console (CMC).
Typically, Information Steward and the Data Services test environment both access the same Central
Management System (CMS). In this case a published cleansing package and any changes to it are available in
both Information Steward and Data Services.
Cleansing packages are created and modified in Cleansing Package Builder. During the initial creation phase, a private cleansing package is created: either a custom cleansing package based on a sample of your data, or a person and firm cleansing package that can be based on another cleansing package. After the cleansing package meets the initial “done” criteria, it is published. The published cleansing package is available for use in a Data Cleanse job.
The Data Cleanse job is run in a Data Services test environment using a larger data sample than was used in the
initial creation of the cleansing package. Based on the results of the Data Cleanse job, in Cleansing Package
Builder the private cleansing package may be refined or modified and then republished using the same name. The
Data Cleanse job would then be run again to check the new results.
Production phase
Once the cleansing package has been refined and the results of the Data Cleanse job produce the required
cleansed data, the published cleansing package is promoted to a production environment.
To promote a cleansing package, use the Promotion Management tool in the CMC. Set up a job, and then add
cleansing packages to it. In the Add Objects window, the cleansing packages available for promotion are found in
Data Quality > Cleansing Packages > Published.
Note
Cleansing packages are referred to as “Objects” or “InfoObjects” in the CMC.
Note
You must use the same mechanism to remove a cleansing package from a system that you used to add it. If
you used Promotion Management to promote a cleansing package, you must use the Promotion Management
rollback option to remove it (this rolls back to the previous version). If you used the Data Services installation
wizard to install the cleansing package, you must use the Data Services installation wizard to uninstall it.
For more information about Promotion Management, see the Business Intelligence platform Administrator Guide.
Maintenance phase
In the production environment, the cleansed data produced by Data Cleanse jobs is monitored. If new data is
added or the threshold is no longer met, then the cleansing package is modified and tested in the test
environment until the desired results are obtained. After that, the cleansing package is once again promoted to
the production environment.
You can migrate your Information Steward content between system landscapes, such as between development,
test, and production environments.
Prerequisites:
When migrating SAP Information Steward between landscapes, ensure that the same version of each prerequisite product is installed and configured in all landscapes.
Note
It is recommended that you install Information platform services (IPS) to provide the flexibility to upgrade Data Services and Information Steward independently of the BI platform.
Although each landscape must contain the prerequisite software, the landscapes themselves do not need to be
exactly identical. For example, in your source landscape you might have an Oracle database for your Information
Steward repository and a WebSphere web application server. Your target landscape might have a SQL Server
database for your Information Steward repository and a Tomcat web application server.
You can migrate many objects at the same time using the Promotion Management tool in the Central Management Console (CMC). Alternatively, you can migrate individual objects through the Information Steward import and export feature.
1. Ensure that the prerequisite software has been installed and configured.
2. Create an inventory of objects in your source Information Steward environment (the environment from which
you want to migrate).
Take note of objects you want to migrate or recreate in your target environment, such as:
○ Projects or objects within a project
You can export all project content together or choose individual objects within a project, such as specific
rules, views, tasks, and so on.
○ Connections
○ Schedules
○ Cleansing packages
○ Match Review configurations
○ Custom attributes
○ Metapedia terms and categories
○ Users and their group membership information
The objects in your inventory are unique to your environment and needs.
3. In your source Information Steward environment, export the following items:
○ custom attribute definitions
○ projects (an exported project includes rules, rule bindings, views, file formats, key data domains, tasks,
and tables)
○ Metapedia terms and categories
○ Match Review configurations
4. In your target landscape (the landscape to which you want to migrate your Information Steward
environment), create projects which correspond to those in your source landscape.
The contents of a project can be migrated, but the project definition cannot.
5. In your target Information Steward environment, import the files you exported in step 3.
Note
You must import custom attribute definitions before you import project content or Metapedia content.
6. Promote your published cleansing packages from the CMS in your source landscape to the CMS in your
target landscape using the promotion management tool in the Central Management Console (CMC).
The cleansing packages available for promotion are found in Data Quality > Cleansing Packages >
Published.
For more information about the promotion management tool, see the SAP Business Intelligence platform
Administrator Guide.
7. As needed, recreate the objects in your inventory that have not been migrated. Define Data Insight
connections, configure Metadata Integrators, and so on.
8. If your target is IPS, but your source has BI launch pad users that want to access Information Steward
features such as viewing the lineage of a document (Crystal Report or Web Intelligence document) or viewing
the Metapedia definition, configure the BI launch pad from the Information Steward application in the CMC.
For more information see Configuring BI Launch Pad for Information Steward [page 193].
14.2.1 Objects supported for migration
Use the Promotion Management tool in the Central Management Console (CMC) to move Information Steward
objects from an existing Information Steward environment to the new Information Steward environment. For
instructions see Migrating CMS content using Promotion Management [page 272].
All the information associated with each object, including user security, is retained. For more information about the Promotion Management tool, see the CMC online help (Help > Help).
Note
Users who are upgrading from SAP BusinessObjects Enterprise XI 3.1 and SAP Metadata Management 3.1 to BI
Platform 4.0.x and Information Steward 4.1 should use the Upgrade Manager utility.
1. From the CMC Home page, select Promotion Management from the Manage list. The Promotion Jobs tab
opens.
2. Select New Job and complete the options as applicable.
3. Click Create. The new job is created, and the Add Objects from the system dialog box opens.
You can move the following objects from the Add Objects from the system dialog box. For each object, select the
objects and click Add, or Add & Close.
Table 104:

● Object: Profiling connections
  Path in Promotion Management: Select Information Steward Connections from the file tree at left.
● Object: Information Steward security (users and user groups)
  Path in Promotion Management:
  1. Select the applicable user type (such as User or User Groups) from the file tree at left.
  2. Choose the applicable user groups.
Use the Promotion Management tool in the Central Management Console (CMC) to move objects from an existing
Data Services or Information Steward 4.x environment to the latest version's environment.
When you use Promotion Management for migration, all the information associated with the object, including user
security, is retained. For complete information about Promotion Management, see the SAP BusinessObjects
Business Intelligence platform Administrator Guide or Information platform services Administrator Guide.
1. Log in to the CMC on the target server (the machine that has IPS 4.0 SP5 or higher).
2. Under the CMC Home drop-down list, select Promotion Management.
3. Connect to the source system.
6. From the Add Objects window, choose one or more objects listed in the table, and then click Add.
The table includes both Data Services and Information Steward objects. Choose the objects that you want to
promote.
Table 105:

● Product: Data Services
  Object: Published cleansing packages
  Path in Promotion Management: Data Quality > Cleansing Packages > Published
● Product: Data Services
  Object: Data Services security (users and user groups)
  Path in Promotion Management: Users or User Groups. For more information, see the SAP BusinessObjects Business Intelligence platform Administrator Guide.
● Product: Information Steward
  Object: Profiling tasks and rule tasks
  Path in Promotion Management:
  1. Select a project object from Information Steward > Enterprise Profiling > Projects.
  2. Click Manage Dependencies.
  3. Select the profiling tasks and rule tasks to upgrade.
  4. Click Apply & Close.
● Product: Information Steward
  Object: User-created utilities and scheduling information for Information Steward utilities
  Path in Promotion Management: Information Steward > Metadata Management > Utilities
● Product: Information Steward
  Object: Metapedia user security settings
  Path in Promotion Management: Metadata Management > Metapedia folder
● Product: Information Steward
  Object: Published cleansing packages
  Path in Promotion Management: Data Quality > Cleansing Packages > Published
  Note: Skip cleansing packages if they were migrated as a part of the Data Services migration.
● Product: Information Steward
  Object: Information Steward security (users and user groups)
  Path in Promotion Management: Users or User Groups. For more information, see the SAP BusinessObjects Business Intelligence platform Administrator Guide.
Note
When selecting the repositories to migrate, review the list of existing repositories and carry forward only
the valid ones. If there are any obsolete or invalid repositories, deselect those from the promotion job.
Note
Promotion jobs can take a long time to complete when there is a large cleansing package. For this reason,
you might want to create a separate migration job for cleansing packages. If you have multiple cleansing
packages, create several separate migration jobs.
Note
Objects can be added later with the Add Objects icon.
8. (Optional) Choose Manage Dependencies and select any dependent objects that you would like to migrate.
9. Click Promote.
10. (Optional) Promote security settings by following the sub-steps:
a. Choose Security Settings.
b. Select Promote Security, Promote Object Security, and Promote User Security.
c. Click Save.
d. Choose Test Promote.
e. Click the Test Promote button. Verify that the test promotion is successful.
Note
If there are many objects to be promoted, this may take some time to complete.
11. If you created several migration jobs, select them and then click Promote to complete the migration of the
CMS content.
Note
Depending on the size of the projects and the contents, it might take several minutes to complete the
promotion job. If the content is too big, consider breaking the contents into multiple migration jobs.
12. Click History to verify the job status. Click the Refresh icon as necessary until the job is completed.
13. Log in to the CMC on the destination system to verify that all objects (repositories, users, and user groups, if any) have been migrated.
Note
In the new Information platform services/Data Services landscape, the EIM APS Services configuration parameters are set to default values. If you previously modified any of the service configuration parameters, log on to the CMC to change the parameters to your custom settings.
Note
In the new Information platform services/Data Services landscape, the Data Services application settings are set to default values. If you previously modified any of these settings, log on to the CMC and change the options to your custom settings (CMC Home > Applications > Data Services Application).
See also:
● Data Services and Information Steward Master Guide: Separate Information platform services and BI
platform
Extra migration steps are required when the ABAP program needs to be preloaded by the SAP application in a
production environment.
Administrators set up connections to SAP table sources in the Central Management Console (CMC). A parameter
in the connection setup specifies the source for the ABAP program. The ABAP execution option parameter has
two values: Execute preloaded and Generate and execute. The Execute preloaded option requires that the ABAP
program be preloaded by the SAP application.
When you move your Information Steward software to a new landscape you migrate Data Insight views as usual.
When you migrate views that contain SAP tables, ABAP programs may also need to be moved. There are two
scenarios when you migrate views containing SAP tables to a new environment based on the ABAP execution
option parameter that your administrator sets in the connection parameters.
The table below contains the processes involved when you migrate from a test environment to a production
environment based on the ABAP execution option setting in the connection setup.
Table 106: ABAP program generation based on execution option in a production environment

Execute preloaded:
With this setting, Information Steward does not generate the ABAP programs. Therefore, users must migrate from test to production in the following way:
● Users export the applicable ABAP programs from the test environment using the Export ABAP programs export option in Information Steward.
● Users import the ABAP programs from the test environment to the production environment using their specific SAP application import process.
● Users perform the basic Information Steward migration steps to migrate projects and objects to the production environment.
● The Information Steward application in the production environment automatically uses the ABAP program that was imported from the test environment (in the second bullet point above).

Generate and execute:
With this setting, Information Steward is enabled to generate the ABAP program in either the test environment or the production environment.
● Users migrate projects and other objects from test to production as usual.
Note
Because of the ABAP execution option setting, the production environment allows Information Steward to preload the ABAP programs in the production environment, so there is no need to export ABAP programs from Information Steward in the test environment. If you specify the ABAP data transfer method in your Data Insight view, the view generates the ABAP scripts to read data from SAP tables. For reading large data sets, the ABAP transfer method provides better performance. However, when you migrate from one system to another, you must export the ABAP script from one SAP system and import it into the target SAP system. If the Data Insight view uses the default data transfer method defined in the CMC, the production environment allows Information Steward to preload the ABAP programs in the production environment, so there is no need to export ABAP programs from Information Steward in the test environment.
Even though it seems less complicated to choose the Generate and execute setting, there are reasons to choose
the Execute preloaded option.
Example
Let's say that you have been testing Information Steward in a test environment, and you are ready to migrate to
a production environment. In your test environment, the execute setting is Generate and Execute, so
Information Steward has been generating the ABAP programs. Your production environment security policy
requires that applications like Information Steward are not allowed to generate any ABAP programs. Therefore,
you need to follow the process listed in the Execute preloaded column in the table above. Also see "Work with
SAP tables in views" in the User Guide.
14.3.1 ABAP program naming conventions
When you export an ABAP program from your test environment, Information Steward provides the option to
rename it. Renaming an ABAP program file prevents possible conflicts with any existing programs in the
environment. If you decide to rename the ABAP program, follow the naming convention noted below:
Note
The slash character is only allowed in the start position and cannot be used anywhere else in the name.
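The slash rule can be checked mechanically. Below is a minimal sketch; the helper and the sample program names are invented, and it checks only the slash rule stated in the note, not any other SAP naming requirement.

```python
def slash_position_ok(name: str) -> bool:
    """Return True if any slash in the name appears only in the start position."""
    # Per the note above: "/" may appear as the first character only.
    return "/" not in name[1:]

print(slash_position_ok("/ZIS_COPY1"))  # slash in start position: allowed
print(slash_position_ok("ZIS/COPY1"))   # slash elsewhere: not allowed
```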
You can move certain objects belonging to the following Information Steward modules:
● Data Insight
● Metadata Management
● Metapedia
● Cleansing Package Builder
● Match Review
● General objects
For objects migrated by exporting from the source, then importing to the target, only the current version of each
object is migrated. History is not preserved. Additionally, user security is not migrated with this mechanism.
Related Information
Match Review object portability [page 280]
Information Steward general object portability [page 280]
Table 107:

● Object: ABAP programs
  Mechanism to move: Export from source, then import to target SAP application.
● Object: Tasks (rule and profiling)
  Mechanism to move: Export from source, then import to target.
● Object: Custom attribute definitions and values
  Mechanism to move: Export from source, then import to target.
You can move Metadata Management objects between landscapes using various mechanisms.
Table 108:

● Object: User-defined relationships between objects
  Mechanism to move: Export from source, then import to target.
Cleansing Package Builder objects that you can move between landscapes:
Table 109:

● Object: Published cleansing packages
  Mechanism to move: Promotion Management in the CMC.
  Comments: The cleansing packages available for promotion are found in Data Quality > Cleansing Packages > Published.
● Object: Data Services ATL files
  Mechanism to move: Recreate in new landscape.
  Comments: To create an ATL file in the new landscape, import a cleansing package as a private cleansing package, and then republish it.
14.4.4 Match Review object portability
You can move Match Review objects between landscapes using various mechanisms.
Table 110:

● Object: Configurations
  Mechanism to move: Export from source, then import to target.
  Comments: For details, see the Related Information.
Information Steward general objects that you can move between landscapes:
Table 111:

● Object: Data Services Job Servers
  Mechanism to move: Recreate in new landscape.
  Comments: Used to run Data Insight and Match Review tasks.
● Object: User security
  Mechanism to move: Either promote using Promotion Management or create in new landscape.
  Comments: Promotion Management will move your current users, user groups, and permissions. If your new environment has different users, it may be more efficient to create new user security than to promote existing information.
To share objects and the contents of projects from one system to another, you can export them from one system
and then import them into another system.
For example, if you perform all of your setup and testing on a development system, you can export projects and
objects from that system and import them into the production system to make the projects and objects available
in the production system.
You can also import and export some of these objects individually.
If you want to keep custom attributes associated with rules that you export and import, you must import the
custom attribute definitions before you import the rules.
Restriction
Rules and views that contain the following functions cannot be exported directly from SAP Information Steward
to SAP Data Services:
● Lookup
● Data dependency
● Is unique
You must obtain the appropriate permissions from your administrator before you can perform these steps.
Before you export a project to a .zip file, make sure that the project is valid. You cannot export an invalid project.
A project is invalid if it contains invalid objects such as rules with syntax errors, invalid or incomplete key data
domains, or invalid view definitions.
1. In Data Insight select the project that you want to export and open Workspace Home.
2. Click the Export icon located in the upper right corner of Workspace Home and select Project from the drop-down menu.
The Export summary for the selected project dialog box opens listing everything that will be exported.
3. Click OK to proceed.
4. When the message stating that the export to .zip was successful appears, click Save.
5. In the save dialog box, enter a name and choose a location for the .zip file, or accept the default name
(Project_yyyymmdd_hhmmss.zip) and location. Click Save.
The .zip file containing the project is exported to the location that you specified.
14.5.1.2 Exporting an object
You export objects so that you can share them between projects.
You must obtain the appropriate permissions from your administrator before you can perform these steps.
You cannot export invalid objects. An object is invalid if its definition contains errors. For example, an invalid rule has syntax errors; an invalid key data domain contains invalid rules.
You can export the following objects:
● Rules
● Rule bindings
● Views
● File formats
● Key data domains
● Tasks
● Tables
● ABAP programs
Tip
Most of these objects are contained in a project and are exported with the project (see Export and import projects and objects [page 281]). To save time, export the project and then export the individual objects that are not exported with the project.
To export an object:
1. In Data Insight open the project that contains the object(s) that you want to export.
2. In Workspace Home, select the tab for the object type that you want to export and then select one or more
objects that you want to export.
For example, if you want to export a rule, go to the Rules tab, and select the rule that you want to export.
Note
If you are exporting ABAP programs, select the export option from Workspace Home.
3. Click the Export icon and select the object from the drop-down menu. The options available under the Export icon vary based on the object you are exporting:
If the object is valid, the file is exported but not saved to your local disk yet. (Errors are displayed in an Error
window if applicable.)
Table 112:

● Object: Tables
  Tab: Workspace Home > Profile Results
  Export menu option: Export > Tables
  Description: Exports the selected tables or views to a file. An exported table includes the table index. When you export a view, associated source tables and connections are also included.
● Object: Profile results
  Tab: Workspace Home > Profile Results
  Export menu option: Export > Profile Results
  Description: Exports basic profile results of a table, view, and/or file to a Microsoft Excel spreadsheet.
● Object: Tables with rule bindings
  Tab: Workspace Home > Rule Results
  Export menu option: Export > Tables with Rule Bindings
  Description: Exports rules bound to the columns; tables, views, and/or flat file sources and file formats that are bound to the rules; connections; and rule-binding definitions.
  Note: You can export tables, views, or flat file sources and file formats that contain rule bindings.
● Object: Rules to file
  Tab: Rules (on the left side)
  Export menu option: Export > Rules to file
  Description: Exports rules to a file that you can then import into another Information Steward project. When exporting rules with dependencies (for example, the rule may have a dependency on a connection due to a SQL or Lookup function in the expression), the dependencies are also included in the exported file. Exporting rules also includes associated test data and the results of the test data.
● Object: Rules to Data Services
  Tab: Rules (on the left side)
  Export menu option: Export > Rules to Data Services
  Description: Exports rules to Data Services using one of the following options:
  ○ Export to Data Services ATL file
  ○ Export to Data Services repository
  Note: If the rule contains an expression that includes lookup value, exists, is data dependent, or is unique functions, you must choose the option to export rules to file.
● Object: Key data domains
  Tab: Scorecard Setup (on the left side)
  Export menu option: Export > Key Data Domains
  Description: Exports key data domains to a file.
● Object: File formats
  Tab: File Formats (on the left side)
  Export menu option: Export > File Formats
  Description: Exports file formats to a file.
● Object: ABAP program files
  Tab: Workspace Home
  Export menu option: Export > ABAP programs
  Description: Exports ABAP programs when you have SAP tables in your views.
  Note: There is no ABAP program tab; you access the Export button from the Workspace Home window.
4. In the message window, click Save to download the file to your local hard disk.
5. In the Select location for download window, either accept the default name
(Object_yyyymmdd_hhmmss.zip) or enter a different name and choose a location for the .zip file, which
contains the exported object. Click Save.
You might want to reuse content types in another Information Steward instance. You can export from one system
and import into another. When you export the content type, you are also exporting any training data associated
with the content type and the status (Disabled, Enabled, OK, and so on).
While you are exporting content types, another user might be using or changing the state of a content type; for example, a user might be disabling a custom content type. The following table shows how these situations are handled.
Table 113:
Disabling: The content types are exported, and the state on the target system is set to Disabled.
Enabling: The content types are exported, and the state on the target system is set to Disabled.
Training: The content types are exported. Any "learned" training data is carried over to the target system.
Error: The content types are exported, and the state on the target system is set to <blank>.
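Table 113 amounts to a small state mapping. A sketch, where the dictionary encoding is illustrative and the values come straight from the table:

```python
# Target-system state per Table 113, keyed by the in-flight state at export
# time. None means the state is unchanged (only training data is carried over).
EXPORT_STATE_HANDLING = {
    "Disabling": "Disabled",
    "Enabling": "Disabled",   # still set to Disabled on the target system
    "Training": None,         # "learned" training data travels with the export
    "Error": "",              # the target state is left blank
}

def target_state(source_state):
    """State a content type has on the target system after import."""
    return EXPORT_STATE_HANDLING[source_state]
```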
14.5.1.4 Exporting rule bindings
When you export rule bindings, you can migrate completely to another instance of Information Steward or to another repository. For example, you can move the entire object from a development environment to a test environment, and then to a production environment. When you export rule bindings, you also export the following:
Note
You can only export tables, views, or flat file sources and file formats that contain rule bindings.
1. From the Workspace Home Rule Results tab, select the tables that have the rule bindings you want to export.
The rules are exported, compressed and saved in a ZIP file in the location that you specified. You can import the
ZIP file in another instance of Information Steward.
The rule name can contain only alphanumeric characters, spaces, underscores, and East Asian scripts to comply
with Data Services file naming requirements.
You must obtain the appropriate permissions from your administrator before you can perform these steps.
Restriction
Rules and views that contain the following functions cannot be exported directly from Information Steward to
Data Services:
● Lookup
● Data dependent
● Is unique
2. Click the Export icon and select Rules to Data Services from the drop-down menu.
The Export Rules to Data Services dialog box opens, containing various export options and a list of repositories to which the object can be exported. The repositories must already be set up in the Central Management Console, and you must have permission to access the repository.
3. In the Export Rules to Data Services dialog box, select the following options as needed.
Table 114:
Option: Export dependent objects
Description: Includes objects that rely on the selected rules. Examples include tables and connection information.

Option: Export to Data Services ATL file
Description: Saves the rules to an ATL file so that you can manually import them when you are using Data Services. You do not choose a repository when you choose this option. The Export to Data Services repository option is not available when you choose this option.

Option: Export to Data Services repository
Description: Places the rules directly into the repository that you choose from the list of repositories that appears in the lower portion of the dialog box. The Export to Data Services ATL file option is not available when you choose this option.
4. Click Export.
After the export is complete, the Export Rules to Data Services dialog box contains export statistics and a list
of the rules that were exported, including the status of each.
5. Click the buttons located at the bottom of the Export Rules to Data Services dialog box as applicable.
Download Summary: Saves high-level information, such as the names of the exported rules and whether each rule exported successfully or failed.
Download ATL: Saves the rules in an ATL file. This button appears only when you have chosen the Export to Data Services ATL file option.
6. After you click any button described above, a Select location for download dialog box appears. Enter a new
name for the file or accept the default file name and the location, and then click Save. The file name must
comply with Data Services naming requirements.
7. Click Close to return to the Rules tab.
You can import a project that was exported from another instance of SAP Information Steward. When you import
a project, you import all of its associated objects:
1. Log on to Information Steward with a user name that belongs to one of the following groups:
○ Data Insight Analyst
○ Data Insight Scorecard Administrator
○ Data Insight Rule Approver
○ Data Insight Administrator
2. Open the Data Insight tab and select the project into which you want to import.
3. If the project you will import contains rules that have custom attributes associated with them, you must
import the custom attribute definitions before you import the project. To import custom attributes, see
Importing and exporting custom attribute definitions [page 295].
If you select this option, you can enter a comment regarding rule approval in the box below the option. The
comment applies to all rules in the ZIP file.
If you do not select this option, the rules will have an “Editing” status.
9. Click Next.
The Import window now shows that you are on step 2 of the import process: Map connection and schema.
10. Map the source connection in the imported file to the target connection and schema.
a. For any file connection in the From column, select the target file connection in the drop-down list in the To
column for each file connection.
Note
If the rule is bound to a column in any flat files, ensure that the target file connection contains the same
flat files as the imported connection.
b. For each database connection in the From column, if the target connection differs from the source, select
the target connection in the drop-down list in the To column.
Note
The connection must already be defined on this Information Steward system.
c. For each schema in the connection in the From column, if the target schema name differs from the
source, select the target schema name in the drop-down list in the To column.
11. Click Import to proceed.
The Import window now shows a summary of everything that was imported.
12. Click Download summary to save high-level details such as the names of the imported objects and whether
they were imported successfully to a .csv file. Click Download log to save in-depth details of the import
process to a text file. If you choose to download the summary or log, the Select location for download window
opens, where you name the file (or accept the default), choose its location, and click Save.
13. Click Finish to close the Import window.
The project and its associated objects are imported into your project.
Before you can use an imported data cleansing solution, you must restore its staged data, which is now archived.
Then open the data cleansing solution to see its results.
You can import the following objects that were exported from another instance of SAP Information Steward.
● Rules
● Rule bindings
● Views
● File formats
● Key data domains
● Tasks
● Tables
Tip
If you need to import all objects from a project, just import the whole project instead.
Ensure that the target Data Insight connections (for any tables or files on which the object was created or is bound
to) are defined in the Central Management Console (CMC) before you follow these steps.
1. Log on to Information Steward with a user name that belongs to one of the following groups:
○ Data Insight Analyst
○ Data Insight Scorecard Administrator
○ Data Insight Rule Approver
○ Data Insight Administrator
2. Open the Data Insight tab and select the project into which you want to import an object.
3. If you will import rules that have custom attributes associated with them, you must import the custom
attribute definitions before you import the rules. To import custom attributes, see Importing and exporting
custom attribute definitions [page 295].
○ If the imported objects should replace existing objects when their names are identical, select Overwrite
existing objects. Overwriting objects also overwrites any dependent objects.
○ If you are importing a rule or rule-related object, select Automatically approve imported rules to give the
imported rules an “Approved” status.
If you select this option, you can enter a comment regarding rule approval in the box below the option.
The comment applies to all rules in the ZIP file.
If you do not select this option, the rules will have an “Editing” status.
8. If you are importing rule bindings, views, key data domains, tasks, or tables, click Next to map the source
connection in the imported file to the target connection and schema.
a. For any file connection in the From column, select the target file connection in the drop-down list in the To
column.
Note
If the rule is bound to a column in any flat files, ensure that the target file connection contains the same
flat files as the imported connection.
b. For each database connection in the From column, if the target connection differs from the source, select
the target connection in the drop-down list in the To column.
Note
The connection must already be defined on this Information Steward system.
c. For each schema in the connection in the From column, if the target schema name differs from the
source, select the target schema name in the drop-down list in the To column.
9. To import the object, click Import.
The Import window now shows a summary of everything that was imported.
10. Click Download summary to save high-level details such as the names of the imported objects and whether
they were imported successfully to a .csv file. Click Download log to save in-depth details of the import
process to a text file. If you choose to download the summary or log, the Select location for download window
opens, where you name the file (or accept the default), choose its location, and click Save.
11. To close the Import window, click Finish.
The project and its associated objects are imported into your project.
14.5.1.8 Importing content types
After exporting content types, you will have an XML file within a compressed ZIP file. Use the ZIP file to import
content types.
The content types are imported. You can choose Manage > Content Types, and then click Edit to view the available training data.
When importing content types, another user might be using or changing the state of the content type.
You can import rules that were exported from other instances of Information Steward. Before you import, the target connection must be defined, and you can choose whether to overwrite tables.
When you import new rules, the importing user is assigned as the author of each rule. When you import existing rules, the currently listed author remains assigned as the author.
To import rules:
1. Ensure that the target connection (for the table that your rule is bound to) is defined on the Central
Management Console (CMC).
2. In the upper-right corner, click Import project and associated objects.
3. In the Import dialog box, select a rule to import.
a. Click Browse, navigate to the location of the rules ZIP file, and then click Next.
b. Select any of the following options that you want to implement during importing.
Table 116: Import options
Option: Automatically approve imported rules
Description: When selected, all of the imported rules have an "Approved" status. Select an approver from the list, and enter approval comments; the comment applies to all rules in the ZIP file. When this option is not selected, the rules have an "Editing" status.
Note
When rules are imported, they are private.

Option: Overwrite existing objects
Description: When selected, overwrites any dependencies between the table and the rule, for example, when the rule has a lookup function within it.
Note
Only rules without an associated worklist task are imported.
Note
Rules are overwritten only when one of the following conditions is true:
○ The person importing has administrator rule rights.
○ The person importing is assigned as the author of an existing rule.
○ The person importing belongs to the user group assigned as the author.
Note
When a public rule is overwritten during import, it has an "Editing" status in the project to which it is imported.
4. Click Import.
5. On the Rules Import window, you can choose to download the following files:
○ Select Download summary to view high-level details, such as the names of the imported rules and whether they imported successfully or failed to import.
○ Select Download log to view in-depth details, such as detailed error messages.
6. Click Finish to return to the Rules tab.
Before you can import rule bindings, you must export them and define (if not already defined) any connection that
contains a table to which the rule is bound.
You might want to import rule bindings into another instance of Information Steward, or to move the object to
another environment such as development, test, or production. When you import rule bindings, you also import
the following:
1. Ensure that the target connection (for the table that your rule is bound to) is defined on the Central
Management Console (CMC).
2. Open the Rule Results tab from the Workspace Home page and click the import rule bindings icon.
3. Click Browse to find the location of the rule binding ZIP file, and then click Open.
4. Select the options (if any) that you want to implement during importing.
Automatically approve imported rules: When this option is selected, all of the rules that are imported will have an "Approved" status. Select an approver from the list. Enter an approval comment in the text box. The comment applies to all rules in the ZIP file. When this option is not selected, the rules will have an "Editing" status.
Overwrite existing tables: When selected, overwrites any information for the existing table.
Overwrite existing file formats and flat file sources: When selected, overwrites any information for the existing file format and flat file sources.
5. Click Next.
The Mapping from imported connection and schema to target pane appears.
6. Map the source connection in the imported file to the target connection and schema.
a. For any file connection in the From column, select the target file connection in the drop-down list in the To
column.
Note
If the rule is bound to a column in any flat files, ensure that the target file connection contains the same
flat files as the imported connection.
b. For each database connection in the From column, if the target connection differs from the source, select
the target connection in the drop-down list in the To column.
Note
The connection must already be defined on this Information Steward system.
c. For each schema in the connection in the From column, if the target schema name differs from the
source, select the target schema name in the drop-down list in the To column.
7. Click Import.
8. On the Rule Bindings Import window, you can choose to download the following files:
○ Select Download summary to view high-level details such as the names of the rules, connections, rule
bindings, tables, file formats, and views and whether they imported successfully or failed to import.
○ Select Download log to view in-depth details such as detailed error messages, the names of the
connections, rule bindings, rules, and so on.
9. Click Finish to return to the Rules tab.
You can export basic profile results of a table, view, and/or file to a Microsoft Excel spreadsheet so that you can
share the information with others who do not have access to the profile results in Information Steward.
1. Log on to Information Steward with a user name that belongs to one of the following groups:
○ Data Insight Analyst
○ Data Insight Administrator
2. Open the Data Insight tab and select the project that contains the profile results you want to export.
3. In the Workspace Home, select one or more tables, views, or files. If you select a table, view, or file that has no
profile results, then the profile results cannot be exported.
For each connection, an Excel file is created. The file name is connection_name.xls. All of the Excel files of all
the selected connections are compressed in a .zip file. The default name of the .zip file is
IS_ProfileResult_date_time.zip, but you can choose a different name. The Excel file includes the following
information:
● The Version worksheet contains the release version number of the Information Steward product.
● The Parameters worksheet contains the project name, the data source name, and the names of the tables, views, and/or files that were exported.
● For each selected table, view, or file, an Excel worksheet is created by table, view, or file name.
● The worksheet header includes a Profile Run Summary.
● The column summary, properties, and basic profile results are exported in a table format.
● Each table, view, or file column's Value, Pattern, and Word distribution count has a hyperlink to a separate
table that contains its details.
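The export layout described above (one <connection_name>.xls per connection, bundled into a single ZIP file) can be sketched like this; the connection names are hypothetical:

```python
import io
import zipfile

# Hypothetical connection names; the export writes one <connection_name>.xls
# per connection, then bundles them all into a single ZIP file.
connections = ["SALES_DB", "HR_DB"]

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in connections:
        zf.writestr(f"{name}.xls", b"...basic profile results...")

with zipfile.ZipFile(buffer) as zf:
    print(zf.namelist())  # ['SALES_DB.xls', 'HR_DB.xls']
```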
14.5.3 Export tasks for external scheduler
Instead of scheduling tasks in the Central Management Console (CMC), you can export tasks to run them with an
external scheduler.
Running Information Steward tasks with an external scheduler is convenient if you run other tasks using the same
external scheduler. To be able to export task commands, you must be an administrator or a user with scheduling
privileges and the right to track activities. You can export the following task types:
Export these tasks by generating commands in the CMC, and copy the commands into a batch file. View log files and history information in either Information Steward or the CMC.
To learn more about exporting commands for running tasks with an external scheduler, see the Administrator
Guide.
You can export custom attribute definitions and then import them into another instance of Information Steward to
share them between systems.
Note
If you want to retain your custom attributes associated with rules, you must export them and then import them
before you import the rules.
Exported custom attributes are stored in an .xml file (CustomAttributeDefinitions.xml), within a .zip file
(named CustomAttribute_date_time.zip by default).
Importing custom attributes
You can export Match Review configurations and then import them into another instance of Information Steward
to share them between systems.
Exported Match Review configurations are stored in an .xml file (CustomAttributeDefinitions.xml), within
a .zip file (named MRConfiguration_date_time.zip by default).
1. Log on to Information Steward with a user name that belongs to one of the following groups:
○ Data Review Configuration Manager
○ Data Review Administrator
2. In the Manage list, choose Match Review Configurations.
3. Click the Import icon.
4. In the Import Match Review Configuration(s) window, click Browse to open a window where you choose the file.
5. In the Select file to upload window, select the .zip file that you want to import, and click Open.
6. Back in the Import Match Review Configuration(s) window, if you want a different name for the configuration on the target system, change the name in Configuration Name.
7. If you want a different name for the target connection, select one in the drop-down list.
8. Click Import.
9. Select the name of the configuration you want to export and click the Export icon.
15 Supportability
For each profiling and rule task and each Metadata Integrator run, SAP Information Steward writes information to the following logs:
● Database log: Use the database log as an audit trail. This log is stored in the Information Steward repository. You can view this log while the Metadata Integrator or a Data Insight profile or rule task is running.
The default logging level for the database log is Information, which writes informational messages (such as the number of reports processed) as well as any warning and error messages. It is recommended that you keep the logging level for the database log at a high level so that the log does not occupy a large amount of disk space.
● File log: Use the file log for more detailed information about a Metadata Integrator or Data Insight profile or rule task run. The Metadata Integrator creates this log in the Business Objects installation directory and copies it to the File Repository Server. You can download this log file after the Metadata Integrator run completes.
The default logging level for the file log is Configuration, which writes static configuration messages as well as informational, warning, and error messages. You can change the logging level for the file log if you want more detailed information. If your logs occupy a large amount of space, you can reduce the maximum number of instances or days to keep logs.
Each log level logs all messages at that level or higher. For example, the default logging level, Information, logs informational, warning, and error messages. Similarly, if you change the log level to Integrator trace, Information Steward logs trace, configuration, informational, warning, and error messages.
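The "level or higher" behavior can be sketched as an ordered severity check. The level names come from the text above; the exact ordering below is inferred from its examples:

```python
# Severity order inferred from the examples above: a configured level logs
# messages at its own position in this list or higher.
LEVELS = ["Trace", "Configuration", "Information", "Warning", "Error"]

def should_log(message_level, configured_level):
    """True if a message at message_level passes a log configured at configured_level."""
    return LEVELS.index(message_level) >= LEVELS.index(configured_level)

print(should_log("Warning", "Information"))        # True
print(should_log("Configuration", "Information"))  # False
```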
To change the Metadata Management log levels, you must have the Schedule right on the integrator source.
1. On the Information Steward page in the Central Management Console (CMC), expand the Metadata
Management node, and expand the Integrator Sources node to display all configured integrator sources.
2. Select the integrator source for which you want to change the logging level by clicking anywhere on the row
except its type.
Note
If you click the integrator type, you display the version and customer support information for the
integrator.
Future runs of the recurring schedule for this integrator source will use the logging level you specified.
3. Select the integrator source and click Action > History in the top menu tool bar.
The Integrator History pane displays each schedule in the right panel.
4. Select the schedule name and click the icon for View the database log.
The Database log shows the task messages, which are a subset of the messages in the log file.
5. To find specific messages in the Database log window, enter a string in the text box and click Filter.
For example, you might enter error to see if there are any errors.
Note
For information about troubleshooting integrator sources, see Troubleshooting [page 189].
6. To close the Database log window, click the X in the upper right corner.
1. On the Information Steward page in the Central Management Console (CMC), expand the Data Insight node,
and expand the Projects node.
2. Select the project source, and select the project for which you want to change the log level by clicking
anywhere on the row except its type.
3. Select Action > Schedule in the top menu tool bar.
4. Click the Parameters node in the tree on the left.
5. From the drop-down list, select the log level that you want for Database Log Level or File Log Level.
6. Click Schedule.
Future runs of the recurring schedule for this Data Insight profile or rule task will use the logging level you
specified.
A list of tasks appears in the right panel with the date and time each was last run.
5. Select the task and click Action > History in the top menu tool bar.
The Data Insight Task history pane displays each instance the task was executed.
6. Select the instance name and click the icon for View the database log.
The Database log shows the task messages, which are a subset of the messages in the log file.
7. To find specific messages in the Database log window, enter a string in the text box and click Filter.
For example, you might enter error to see if there are any errors.
Note
For information about troubleshooting Cleansing Package Builder, see “Troubleshooting” in the User
Guide.
8. To close the Database log window, click the X in the upper right corner.
15.1.2.3 Web Services log
If an error occurs when using the Information Steward Web Services, change the log level to show more details.
4. Click Action > Configure Web Services in the top menu tool bar.
5. From the Log Level drop-down list, select one of the following values:
Table 119:
Log level: Debug
Description: Logs debugging, informational, warning, and error messages. To avoid security risks associated with clear-text passwords and logins stored in this log file, use a lower log level setting.
6. Run your Information Steward Web Services client and view the Web Services log in the Information Steward
log file, which is located in this directory:
The Metadata Browsing Service and View Data Service are used by SAP Information Steward to connect to and view data in profiling sources. Communication occurs between the following components:
Log locations
The following log files are in the platform log directory where the Central Management Server (CMS) is installed:
%BOE Install%\SAP BusinessObjects\SAP BusinessObjects Enterprise XI <version>\logging
Requests and responses between the Information Steward Web Application and the Information Steward APS
Services (within the SAP solutions for EIM Adaptive Processing Server) are contained in the
InformationSteward.Explorer.log.
Requests and responses that occur within the SAP solutions for EIM Adaptive Processing Server, either between the Data Services APS services and the Information Steward APS services or between the Data Services APS services and the Data Services backend engine, are contained in the following logs:
● ICC.MetaDataService_<hostname>_<timestamp>_<version>.log
● ICC.Viewdataservice_<hostname>_<timestamp>_<version>.log
Requests and responses for communication between the Data Services backend engine and the Data Services APS services within the EIM Adaptive Processing Server are contained in the following logs:
%DS_COMMON_DIR%\log\MetadataService
%DS_COMMON_DIR%\log\ViewdataService
Look for the most recent log file associated with job execution:
<hostname>_<serviceprovider>_<timestamp>_trace<version>.txt
Log levels
For Information Steward and Data Services logs, you can configure the level of information collected as shown in
the following table:
Table 120:
Log level Information Steward Data Services
Finer All traces, requests, and responses All traces, requests, and responses
Related Information
Configuring Metadata Browsing Service and View Data Service [page 225]
Metadata Browsing Service configuration parameters [page 226]
View Data Services configuration parameters [page 227]
The log files are in the platform log directory where the CMS is installed, <%BOE Install%>\SAP BusinessObjects Enterprise XI 4.0\logging. For example: C:\Program Files (x86)\SAP BusinessObjects\SAP BusinessObjects Enterprise XI 4.0\logging
● InformationSteward.Administrator.log
● InformationSteward.AdminService.log
● InformationSteward.ApplicationService.log
● InformationSteward.DataReviewService.log
● InformationSteward.DataReviewTaskSchedulingService.log
● InformationSteward.Explorer.log
● InformationSteward.IntegratorService.log
● InformationSteward.MetadataService.log
● InformationSteward.SchedulingService.log
● InformationSteward.SearchService.log
● InformationSteward.TaskSchedulingService.log
● InformationSteward.ViewdataService.log
● EIMAdaptiveProcessingServer.glf (Cleansing Package Builder information)
● EIMAdaptiveProcessingServer_dcatrace.glf (Data Cleansing Advisor information)
On machines where only the Web Application is installed, the log files (InformationSteward.Administrator.log and InformationSteward.Explorer.log) are stored in the Web Application temp directory. For example: C:\Program Files (x86)\SAP BusinessObjects\Tomcat6\temp\ICC
To use CA Wily Introscope for integration with Information Steward, you must have the following prerequisites:
Note
You can also configure the Introscope agent by selecting Servers > Server node > Placeholders in the CMC. The Introscope Enterprise Manager host and port are also configured here so that the Introscope agent can communicate with the monitoring application.
For more information:
● SAP Business Intelligence Platform Administrator Guide: "Integrating the monitoring application with SAP Solution Manager"
15.2.2 Workflows
● Browse metadata
● Add tables to project
● View data
● View profiling results
● View profiling sample data
● Export/import rules/rule binding
● Import Metapedia terms
● Metadata Management browsing
● Test rules
● View scorecard
● View lineage
You can use CA Wily Introscope, as part of SAP Solution Manager, to measure Information Steward performance. When you install the platform, the Introscope agent is provided for your deployment.
Introscope agents collect performance metrics from SAP Business Intelligence platform Java back-end servers, as well as information from the surrounding computing environment, and report these metrics to the Enterprise Manager.
16 Metadata Management export utility
SAP Information Steward provides an XML export utility, MMObjectExporter.bat, which allows you to export an entire Metadata Management configuration or a subset of objects to an XML file. The XML exported by this utility follows the XML Schema Definition file located at <Information Steward Installdir>\MM\xml\ObjectExportSchema.xsd.
● MITI-related objects
● The relationship between objects across integrator configurations.
The utility is installed on the machine on which you installed this product. You specify its output by using required
and optional command line arguments.
A list of command line arguments is included in the comments at the top of the MMObjectExporter.bat file.
Example
In this example, at a command prompt positioned in the installation directory's subdirectory, the user enters
the following command and arguments:
Note
Ensure you enter the entire command on one line.
Table 122:
Entry: CMS_source
Description: Uses the configuration argument and represents the name of the configuration for export to XML. Alternatively, you can specify the configuration ID.

Entry: "c:\temp\first exported.xml"
Description: Uses the filename argument and represents the path and name of the created XML file. Here the argument is in quotation marks because the file name contains a space.

Entry: boeUser Jane
Description: Uses the boeUser argument to specify the name of the CMS user.

Entry: boePassword MyPw
Description: Uses the boePassword argument to specify the password for the CMS user.

Entry: mainObject Universe
Description: Uses the mainObject argument to select all Universes in the configuration. Alternatively, the argument could be more specific. For example, the syntax mainObject "Universe=MyUniverse" selects only the Universe named MyUniverse, and the syntax mainObject "Universe=MyUniverse,YourUniverse" selects the two Universes named MyUniverse and YourUniverse. The parents and children of these objects are also included, because the includeParents and includeChildren arguments are not invoked and therefore default to TRUE.
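Putting the table's example values together, the command line might be assembled like the following sketch. The argument-name-then-value style and the quoting of values with spaces follow the table, but the exact invocation syntax of MMObjectExporter.bat is an assumption for illustration:

```python
# Example argument values taken from Table 122; the assembled command line is
# illustrative, not the utility's authoritative syntax.
ARGS = [
    ("configuration", "CMS_source"),
    ("filename", r"c:\temp\first exported.xml"),
    ("boeUser", "Jane"),
    ("boePassword", "MyPw"),
    ("mainObject", "Universe"),
]

def build_command(args):
    parts = ["MMObjectExporter.bat"]
    for name, value in args:
        # Quote values that contain spaces, as the table notes for the file name.
        parts += [name, f'"{value}"' if " " in value else value]
    return " ".join(parts)  # the whole command goes on one line

print(build_command(ARGS))
```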
17 SAP information resources
A global network of SAP technology experts provides customer support, education, and consulting to ensure
maximum information management benefit to your business.
Table 123:

Customer Support, Consulting, and Education services
http://service.sap.com/
http://help.sap.com/bois/
Information about SAP Business User Support programs, as well as links to technical articles, downloads, and online discussions.

Supported platforms (Product Availability Matrix)
https://service.sap.com/PAM
Information about supported platforms for SAP Information Steward, with a search function to quickly find information related to your platform.

Product tutorials
http://scn.sap.com/docs/DOC-8751
Tutorials that have been developed to help you find answers to the most common questions about using SAP Information Steward.

Forums on SCN (SAP Community Network)
https://go.sap.com/community/topic/information-steward.html
Discussions that include information and comments from other SAP users and the means to share your knowledge with the community.

EIM Wiki page on SCN
https://wiki.scn.sap.com/wiki/display/EIM/Enterprise+Information+Management+-+EIM
The means with which to contribute content, post comments, and organize information in a hierarchical manner so that information is easy to find.
18 Glossary
18.1 Glossary
accuracy The extent to which data objects correctly represent the real-world values for which they
were designed.
address profiling A process that componentizes and measures address data with dictionary data.
annotation User notes added to an object in Metadata Management and Data Insight.
assigned For a record, the state of containing some parsed values assigned to attributes and
other parsed values not assigned to attributes. The unassigned parsed values may be
noise data.
association A relationship between terms contained in a Metapedia business glossary and metadata
objects.
catalog A relational object type that, in a relational database management system (RDBMS),
corresponds to a database. Each deployment contains two such object types, one for
the datasources and one for the target tables.
class A user-defined folder that contains related objects that have a common purpose in the
universe.
classification The type of situation that applies to a value. A classification informs you how a variation
may be used.
cleansing package The parsing rules and other information that define how to parse and standardize the
data of a specific data domain.
Cleansing Package Builder A module of Information Steward that allows a data steward to create and modify cleansing packages for any data domain. A cleansing package is then used to process the data in accordance with package guidelines through SAP Data Services.
connection A named set of parameters that defines how one or more BusinessObjects applications
can access relational or OLAP database middleware.
consistency The extent to which distinct data instances provide non-conflicting information about
the same underlying data object.
content type A definition that provides insight and meaning to a specific column of data.
context definition A method that allows users to specify context when data contains a pattern or contains
parsed values that have a special meaning when used together, such as a range of
acceptable values.
custom attribute Properties you add to existing metadata objects that, once defined, can be searched for
and viewed.
custom cleansing package The parsing rules and other information that you have defined in order to parse and manipulate all types of data, including operational and product data.
data flow A reusable object containing steps to define the transformation of information from
source to target.
data governance The processes by which an enterprise manages its data assets. These processes should
ensure that the assets are of high quality and can be trusted.
Data Insight project A collaborative space for data stewards and data analysts to assess and monitor the
data quality of a specific domain and for a specific purpose (such as customer quality
assessment, sales system migration, and master data quality monitoring).
data model A structure that defines a given domain and which refers to a logical model that
comprises tables and fields in a hierarchical structure.
data steward A person who manages data as an asset, is an expert in their data domain, and is responsible for the quality of the data.
data type The format used to store a value, which can imply a default format for displaying and
entering the value.
database One or more large structured sets of persistent data, usually associated with software used to update and query the data. A relational database organizes the data, and the relationships between them, into tables.
datasource A pointer to a source of data; it points to and represents the data that is kept in a data-access application.
dependency profiling A process that determines whether the data in one column or table is based on the
results of another column or table.
detail In .unv universes, an object that provides descriptive data about a dimension. A detail is the equivalent of an attribute in .unx universes.
directory structure A hierarchy within Information Steward that is organized into folders that contain four
categories, namely Data Integration, Business Intelligence, Data Modeling, and
Relational Databases.
domain A structure used to classify content that is specific to a particular area. In a person and
firm cleansing package, a domain is equivalent to a locale. Variations may belong to
more than one domain.
driver record A record that drives the comparison process. Driver records are part of a break group
and are compared with passenger records to determine matches.
duplicate record A record in a match group that represents the same real world entity as the remaining
records in the same match group.
Excel file format Definition of an Excel file that indicates the worksheet and the cells within that worksheet
from which to access data.
extract A process by which Information Steward copies information from source systems and
loads it into the repository.
failed data A set of tables that contain information about all the records that failed validation rules, as well as the rules, project, and connections used.
file format A flat file definition that includes the column name, data type, and delimiter or character width. This is equivalent to the schema for a relational database table.
flat file format Definition of a flat file that includes the column name, data type, and delimiter or character width. This is equivalent to the schema for a relational database table.
fully assigned For a record, the state of having all of its parsed values assigned to attributes.
global domain A special content domain that contains all variations and all of their associated
properties.
impact analysis The process of identifying the object or objects that will be affected if you change or
remove other connected objects.
impact diagram A graphical representation of the object(s) that will be affected if you change or remove
other connected objects.
InfoCube A type of InfoProvider that describes a self-contained dataset, for example, from a
business oriented area.
integrator source A named set of parameters that describes how a metadata integrator can access a data
source.
integrity The extent to which data is not missing important relationship linkages.
job server The software that receives requests to start and stop jobs.
join To combine information from two tables by performing a lookup on every record of the primary table. The result of joining is itself called a join.
key data domains A set of related data objects or key data entities.
lineage analysis The process of identifying from which source or sources a metadata object obtains its
data.
lineage diagram A diagram that shows where the data comes from and what sources provide the data for
this object.
logical model A data structure that is independent of any specific physical database implementation.
mapping rule A description of how the rows in a target table are generated from the datasources; can
include filters, formulas, and relationships to convert values in your datasource to values
expected in the target.
match A pair or group of records that are found to be identical, based on the criteria you set.
match review task A collection of match groups identified by an automated match job that require manual
review.
metadata integrator An application that collects information about objects in a source system and integrates
it in one or more related source systems.
metadata object A unit of information that the software creates from an object in a source system.
Metapedia A custom glossary within Information Steward that you use to define and organize terms
and categories related to your business data.
Metapedia auto search The process of searching for objects whose name, description, annotation, or custom attributes contain the name, synonym, or keyword of the term.
Metapedia category The organization system for grouping Metapedia terms to denote a common
functionality. Categories can contain sub-categories, and you can associate Metapedia
terms to more than one category.
Metapedia custom import The process of transferring terms and categories from a spreadsheet that contains columns that differ from those in the spreadsheet that Metapedia created when exporting terms.
Metapedia default import The process of transferring terms and categories from a spreadsheet that Metapedia created when exporting terms.
Metapedia export The process of copying Metapedia terms and categories to a spreadsheet.
Metapedia term A word or phrase that defines a business concept in your organization.
MultiProvider An SAP NetWeaver Business Warehouse object that combines data from several
InfoProviders and makes it available for reporting.
non-match A record that the data steward determines is not a duplicate and therefore does not belong in the match group.
object equivalency rule A naming rule that indicates that an object in one source system is the same physical object in another source system.
object tray A temporary holding space for objects that you want to export or define a relationship
for in Metadata Management.
Open Hub Destination An SAP NetWeaver Business Warehouse object within the open hub service that
contains all information about a target system for data in an InfoProvider. The target
system can be external.
output format The format of the cleansed data (that is, the output data) as defined by the domain-
specific parsing rules.
parent-child A hierarchical relationship where one object is subordinate to another. In this hierarchy,
the parent is one level above the child; a parent can have several children, but a child can
have only one parent. For example, a table can have multiple columns, but a column can
belong to only one table.
parsing rule A rule that determines how data is classified based on a pattern within the data and how
the data is mapped to specific attributes.
pattern definition A regular expression that describes a pattern found in uncleansed data. A pattern
definition is used in a user-defined pattern rule.
pattern name The name for a regular expression pattern. A pattern name is part of the user-defined
pattern rule definition.
person and firm cleansing package A cleansing package that parses party or name and firm information such as given name, family name, prename, title, phone number, and firm or company name.
policy set A collection of policy statements. Policy sets support efficient operations and business process execution by organizing information about business policies, processes, and best practices. Policy sets can contain other policy sets.
policy statement Supports a particular area of the business or a particular business process (for example,
procurement, competitive bidding, or travel). Statements set guidelines for ensuring the
proper management of an organization’s data.
possible match A record in a match group that is a suspected duplicate record and needs to be reviewed
by a data steward to determine its match status.
primary key A column that is guaranteed to contain unique values, and whose values identify all of
the rows in a table.
private cleansing package A cleansing package that can be viewed or edited only by the user who owns it.
profile A process that generates attributes about the data such as minimum and maximum
values, pattern distribution and data dependency to help data analysts discover and
understand data anomalies.
profile task A task to profile one or more tables, views and/or flat files. This task can be scheduled or
executed on demand.
private rule A rule that is visible to only the project in which it was created.
properties file A collection of information that appears on the Report tab of each report; it includes
such information as the name and description of the report, and the source(s) of the
information contained in the report.
published cleansing package A cleansing package that is either SAP-supplied or created and published by a data steward; it is available to all users and can be used in a Data Services transform.
quality dimensions A category for rules such as accuracy and completeness. This helps to organize your
rules and provides a score that contributes to the scorecard value.
query views An SAP NetWeaver Business Warehouse object consisting of a modified view of the data
in a query or an external InfoProvider.
repository A set of tables that hold user-created and predefined system objects, source and target
metadata, and transformation rules.
rule action The section of a parsing rule that defines the operation to perform when data matches
the rule.
rule binding The assignment of a validation rule to a specific table column in a connection.
rule definition The section of a parsing rule that defines the components included in the rule and the
order they must appear in the data to match the rule.
rule task A task to run rules bound to one or more tables, views, and/or flat files. This task can be scheduled or executed on demand.
rule type A descriptor that indicates the source of a parsing rule as user-generated or auto-generated.
same as relationship The association between two objects indicating that they are identical physical objects.
Only objects of the same object type can have this kind of association.
score A numerical result calculated by dividing the number of records that pass a rule by the total number of records. For example, if 95 of 100 records pass a rule, the score is 95 percent.
scorecard A high level data quality view of a key data domain based on business data quality
objectives.
server instance A database, data source, or service in a relational database management system.
service A subsystem that performs a specific function and runs under the process ID of the
parent server.
source An object that provides data that is copied or transformed to become part of the target
object.
source system A software application from which SAP BusinessObjects Information Steward extracts
and organizes metadata into directory structures, enabling you to navigate and analyze
the metadata.
standard form The standardized or normalized form of a variation, which is displayed after cleansing.
sub-category Within a category, the organization system for grouping terms to denote a common
functionality.
synonym Another name for an object in the same system. For example, a synonym for a relational
table exists in the same database as the table.
target The object into which the application loads extracted and transformed data in a data
flow.
target schema A set of tables. A project can only contain one target schema, and its name is always
target schema.
timeliness The extent to which data is sufficiently up-to-date for the task at hand.
trace To display the objects through which the data passes from the source to the final target,
or from the target to the first source.
transfer rules An SAP NetWeaver Business Warehouse object that determines how the data for a
DataSource is to be moved to the InfoSource. The uploaded data is transformed using
transfer rules.
transformation An SAP NetWeaver Business Warehouse object that consists of functions for unloading,
loading, and formatting data between different data sources and data targets that use
data streams.
transformation name The identity of the universe object, if the target is a measure, or the data flow name, if
the source data was taken from an Extract, Transform, and Load (ETL) system.
uniqueness The extent to which the data for a set of columns is not repeated.
uniqueness profiling A process that determines whether the exact piece of data is repeated within the same
column or differing columns.
universe An abstraction of a data source that presents data to users in non-technical terms.
usage scenario An example that is typical of the kinds of tasks you’d like to perform with the software.
validation rules A method that assesses the quality of data in the source system. These rules are bound
to one or more columns to derive a score.
web template An SAP NetWeaver Business Warehouse object consisting of an HTML document that
determines the structure of a Web application.
Important Disclaimers and Legal Information
Coding Samples
Any software coding and/or code lines / strings ("Code") included in this documentation are only examples and are not intended to be used in a productive system
environment. The Code is only intended to better explain and visualize the syntax and phrasing rules of certain coding. SAP does not warrant the correctness and
completeness of the Code given herein, and SAP shall not be liable for errors or damages caused by the usage of the Code, unless damages were caused by SAP
intentionally or by SAP's gross negligence.
Accessibility
The information contained in the SAP documentation represents SAP's current view of accessibility criteria as of the date of publication; it is in no way intended to be a
binding guideline on how to ensure accessibility of software products. SAP in particular disclaims any liability in relation to this document. This disclaimer, however, does
not apply in cases of willful misconduct or gross negligence of SAP. Furthermore, this document does not result in any direct or indirect contractual obligations of SAP.
Gender-Neutral Language
As far as possible, SAP documentation is gender neutral. Depending on the context, the reader is addressed directly with "you", or a gender-neutral noun (such as "sales person" or "working days") is used. If, however, when referring to members of both sexes, the third-person singular cannot be avoided or a gender-neutral noun does not exist, SAP reserves the right to use the masculine form of the noun and pronoun. This is to ensure that the documentation remains comprehensible.
Internet Hyperlinks
The SAP documentation may contain hyperlinks to the Internet. These hyperlinks are intended to serve as a hint about where to find related information. SAP does not
warrant the availability and correctness of this related information or the ability of this information to serve a particular purpose. SAP shall not be liable for any damages
caused by the use of related information unless damages have been caused by SAP's gross negligence or willful misconduct. All links are categorized for transparency
(see: http://help.sap.com/disclaimer).