
Fault Solution Administration Guide Helix 11.1

Uploaded by

Yasir Khan

Fault Solution Administration Guide

11.1
Confidentiality, Copyright Notice & Disclaimer
Due to a policy of continuous product development and refinement, TEOCO Corporation or a
TEOCO affiliate company (“TEOCO”) reserves the right to alter the specifications,
representation, descriptions and all other matters outlined in this publication without prior
notice. No part of this document, taken as a whole or separately, shall be deemed to be part
of any contract for a product or commitment of any kind. Furthermore, this document is
provided “As Is” and without any warranty.
This document is the property of TEOCO, which owns the sole and full rights including
copyright. TEOCO retains the sole property rights to all information contained in this
document, and without the written consent of TEOCO given by contract or otherwise in
writing, the document must not be copied, reprinted or reproduced in any manner or form, nor
transmitted in any form or by any means: electronic, mechanical, magnetic or otherwise,
either wholly or in part.
The information herein is designated highly confidential and is subject to all restrictions in any
law regarding such matters and the relevant confidentiality and non-disclosure clauses or
agreements issued with TEOCO prior to or after the disclosure. All the information in this
document is to be safeguarded and all steps must be taken to prevent it from being disclosed
to any person or entity other than the direct entity that received it directly from TEOCO.
TEOCO and Helix are trademarks of TEOCO.
All other company, brand or product names are trademarks or service marks of their
respective holders.
This is a legal notice and may not be removed or altered in any way.
COPYRIGHT © 2020 TEOCO Corporation or a TEOCO affiliate company.
All rights reserved.

Your feedback is important to us: The TEOCO Documentation team takes many measures
in order to ensure that our work is of the highest quality.

If you found errors or feel that information is missing, please send your Documentation-
related feedback to Documentation@teoco.com

Thank you,

The TEOCO Documentation team


Table of Contents
What is the Fault Management Solution?
  Who Should Use this Guide?
  How this Guide is Organized
  Additional Reading
Alarm Collection
  Network Alarm Collection
  Application Alarm Collection
    Correlation Alarms
    Service Alarms
    TrafficGuard Alarms
  Alarm Structure
Alarm Management
  Alarm Monitoring
    Alarm Class Concept
    Toggling Alarms
    Repeated Alarms
    Maintenance Calendar
    Schematic Views for FM
    GEO Maps for FM
    FaultPro
    FM Screener
    FM Alarms Summary
    Anomaly & Trend Information
    Alarm Prediction
    Site View Display
    FM Notifications
  Alarm Correlation
    Correlator TRS
    Correlator ES
    Machine Learning Root-cause Analysis (RCA)
  Reporting
  Alarm Handling
System Description
  Engines
    FM Engine(s)
    FM History
    FaM Admin
    FM Analytics
    Correlators
    External APIs
  Clients
    Cruiser Client
    Light Cruiser Monitoring Client
    History Analysis Client
    Administration Client
  Architecture
    Active/Active architecture
    Apache Kafka and Zoo Keeper
    Distributed Cache Architecture
Workflows
  Post-Installation Workflow
    Displaying the Cruiser System Folder Names in non-English Languages
  Post-Upgrade Workflow
  Defining the Operator Working Environment
Configuration
  Overview
  Enrichment Rules
  Action Rules
    Condition
    Modifications/Actions
    Delay
    Activation Time
    Example of Possible Rules
  Association Rules
    Condition
    Activation Time
  Toggle Rules
  Repeated Rules
  Display Rules
  Trouble Ticket Integration
    Overview
    Trouble Ticket Mapping Rules
    NeTkT Plugin
  GEO Maps
    Setting GEO Maps Configuration
    Setting Base Configuration Region Coordinates
    Map Display Parameters
    MapsConfig-project.xml Structure Example
  Flooding Protection
    The Flooding Algorithm
    FamEngine Flood System Properties
    Flooding of History Alarms
    Flooding in FamProxy
    Client Protection from Large Amount of Alarms
  FM Screener
    User Actions
    Historic Investigations
  Severity Management
  Worklog Management
  Project Fields Configuration
    Activating Project Fields
    Configuring the Display Name of Alarm Fields
    Making Alarm Fields Visible
  Configuring “Copy the Alarm Fields as Text”
  Summary View Configuration
    Project Summary View Icons Configuration
  FaultPro Configuration
  Site View Configuration
    Icons Configuration
    Tooltip Configuration
    Additional Details Configuration
    Service Details Configuration
    KPI Presentation
    Site View Refresh Rate
  Anomaly & Trend Configuration
    About config.xml
    Selecting the PredictiveObjects (for Both Trend and Anomaly)
    Defining the HistoryResolution (for Both Trend and Anomaly)
    Configuring the Anomaly Learning Phase
    Configuring the Score Coloring (for Both Trend and Anomaly)
    config.xml File Example
  Alarms Prediction Configuration
    Offline
    Online
  ServiceImpact Configuration
  Recognizing PM Entity Name in Alarms
  Maintenance Calendar Configuration
    Maintenance Calendar Architecture
    DB Plug-in Configuration
    Maintenance Calendar Module Configuration
    FamMaintenace Module Configuration
  Machine Learning Root Cause Analysis (RCA) Configuration
    Learning
    Learning Investigations
    Run-Time
    Correlation Graph
  Opening Clients
    Opening FM Cruiser from External Applications
    Opening FM History from External Applications
Maintenance
  Verifying that All Components are Running
    J2EE Components
    FM Services
  Running FM Modules
  Checking the System Queues
  Checking the Memory Consumption
  History Table Partitioning
  TEOCO Monitor
Troubleshooting
  Log Files
    J2EE Server and Client Log Files
    FM Services Log Files
  Server Troubleshooting
    Server Components Are Up and Functioning
    Data Loss and Restart
    Alarms Display/Update Is Delayed
    History Data Is Delayed
    Insufficient Oracle Connections
    Hazelcast Disconnections
  Client Troubleshooting
    The Installation Starts and then Fails and an Error Message Appears
    The Installation States that an Old Installation is Interfering with the Installation
    The .Net Framework Installation Fails
    The Application Starts but Fails with a ‘Could not initialize’ Message
    The Application Starts but Fails with an ‘Error Installing application’ Message
    The Application Starts but Some Operations are not Available
    Drop-down and Context Menus are Displayed Behind the Main Window
    Cruiser Shows ‘Disconnected’ Status
    Delay in Display/Update of the Alarms
  Statistics
    FM Module Statistics
  Client Performance Considerations
Appendix A: Active Alarm Attributes
Appendix B: History Alarm Attributes
Appendix C: Project Active Alarm Attributes
Appendix D: Modules Configurable Properties
  FamAdmin
  FamEngine
  FamHistory
  JFam
  FamProxy
  FamAnalytics
  WinFam (Cruiser Client)
  FaMAdminModule (FM Admin Client)
  HistoryAnalisysModule (FM History Client)

What is the Fault Management Solution?


Helix’s Fault Management (FM) solution provides users with the ability to receive, view, track,
and analyze faults from any source throughout the telecommunications network, or from
alarm-generating applications.
FM, acting as a basic layer for the network manager, receives alarms in a standard format from
agents throughout the network. It also receives alarms and messages from Network Elements
in their proprietary formats and converts them into the standard format. All alarms and
messages received are stored in a historical database.
The Fault Management Solution has two main functions:
• Alarm Collection
• Alarm Management

Who Should Use this Guide?


This guide is intended for administrators and system integrators of the FM system.

Note: To prevent problems, we recommend that only one administrator/system integrator
modify the settings at a time. For example, if two users modify the same rule at the same
time, the operation that finishes last takes effect and the other is discarded without any
warning.

How this Guide is Organized


• The System Description section describes the main components of the FM system
and how they interact with each other. This information is important for understanding
how to maintain the system.
• The Workflows section describes the procedures that should be performed after
installing or upgrading the FM system. It also describes the high-level
procedures for configuring the FM system, such as defining the operator's
working environment.
• The Configuration section describes the settings that you can use to
fine-tune the FM system to meet your requirements.
• The Maintenance section describes various maintenance procedures, including
how to verify that components are running and how to run processes and services.
• The Troubleshooting section describes the log files created by the system modules
and explains how to use them for troubleshooting. It also provides important
information about collecting Helix and FM statistics and about performance tuning.
• The Identifying the FM Modules' version section describes how to identify the version
numbers for error reporting purposes.
• The FM Server Utilities section describes utilities for raising and resolving alarms.

Additional Reading
For administration tasks specific to J2EE modules, refer to the Helix Administration Guide.


Alarm Collection
Alarm collection can be divided into two main types:
• Network alarms
• Application alarms

Network Alarm Collection


Network alarm collection is performed by FM’s Mediation layer. The Mediation layer
continuously monitors agents, network elements, and other managed objects using Mediation
libraries. A library is a set of definitions that determines how data coming from network
elements is interpreted and enriched before being passed on to FM. Collection is
performed both actively, by polling the network elements for their health and state, and
passively, by collecting the alarms and messages that the network elements send to the
manager.
The Mediation layer’s alarm collection processes run continuously unless the system
administrator disables them. The system administrator can enable or disable alarm
collection for a specific network element or globally, for all network elements.
For more information, refer to the Fault Solution Implementation Guide.
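
As a rough illustration of what a Mediation library does, the sketch below converts a proprietary message into a common alarm record. The raw message layout, the field names, the severity codes, and the function name are all hypothetical; they are not taken from Helix or from any actual Mediation library.

```python
from dataclasses import dataclass


# Hypothetical common alarm record; the real attributes are listed in Appendix A.
@dataclass
class StandardAlarm:
    source: str
    severity: str
    text: str


# Hypothetical vendor severity codes mapped to common severity names.
SEVERITY_MAP = {"1": "CRITICAL", "2": "MAJOR", "3": "MINOR"}


def parse_vendor_message(raw: str) -> StandardAlarm:
    """Convert an assumed 'element|code|text' message into the common format."""
    element, code, text = raw.split("|", 2)
    return StandardAlarm(
        source=element.strip(),
        severity=SEVERITY_MAP.get(code.strip(), "UNKNOWN"),
        text=text.strip(),
    )


alarm = parse_vendor_message("BSC-7 | 2 | Link down on port 3")
print(alarm)
```

In this sketch, unknown severity codes fall back to "UNKNOWN" rather than raising an error, so that a malformed message is still passed on instead of being lost.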

Application Alarm Collection


Correlation Alarms
The Correlation modules included in the Fault Management solution can generate a derived
alarm representing the root cause for a group of child alarms.
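
The idea of a derived alarm can be sketched as follows. This is only an illustration: the grouping key (a shared parent element), the field names, and the minimum group size are assumptions, not the logic used by the Helix Correlation modules.

```python
from collections import defaultdict


def derive_root_cause_alarms(alarms, min_children=2):
    """Group alarms by an assumed 'parent' field and emit one derived alarm per group."""
    groups = defaultdict(list)
    for alarm in alarms:
        groups[alarm["parent"]].append(alarm["id"])

    derived = []
    for parent, child_ids in groups.items():
        # Only groups with enough child alarms produce a derived alarm.
        if len(child_ids) >= min_children:
            derived.append({
                "id": f"derived-{parent}",
                "text": f"Suspected root cause at {parent}",
                "children": child_ids,
            })
    return derived


alarms = [
    {"id": "a1", "parent": "router-1"},
    {"id": "a2", "parent": "router-1"},
    {"id": "a3", "parent": "router-2"},
]
derived = derive_root_cause_alarms(alarms)
print(derived)
```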

Service Alarms
ServiceImpact is a Service Management product that can integrate with the Fault
Management Solution. ServiceImpact performs further analysis and abstraction of alarms by
relating them to end-to-end services such as a customer line, a data service, or IPTV. This
capability enables service providers to prioritize restoration procedures based on the type of
affected services and customers rather than on the type of impacted network resource. The
analysis is based on the alarms’ contents and on the relationship between services and
equipment as described in the Base Configuration module. The ServiceImpact module shows
the details of faulty services, including the impact on the service and customers, and
generates service alarms, which are displayed in the FM module.

TrafficGuard Alarms
TrafficGuard is a Performance Management (PM) product. It provides enhanced threshold
capabilities based on existing performance data. Whenever threshold conditions are
breached, TrafficGuard generates a Threshold Crossing Alarm which can be sent to FM and
viewed by operators.
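
A threshold check of this kind can be sketched as below. The KPI name, the alarm fields, and the simple upper-bound comparison are assumptions made for illustration; TrafficGuard's actual threshold definitions are not described in this guide.

```python
def check_threshold(kpi_name, value, threshold):
    """Return a hypothetical Threshold Crossing Alarm if value exceeds threshold, else None."""
    if value > threshold:
        return {
            "type": "ThresholdCrossingAlarm",
            "kpi": kpi_name,
            "value": value,
            "threshold": threshold,
        }
    return None


# Two assumed performance samples for the same KPI.
samples = [("dropped_call_rate", 1.2), ("dropped_call_rate", 3.5)]

tcas = []
for name, value in samples:
    alarm = check_threshold(name, value, threshold=2.0)
    if alarm is not None:
        tcas.append(alarm)

print(tcas)
```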


Alarm Structure
The fields available for active alarms are listed in Appendix A: Active Alarm Attributes.
In addition, there is a set of fields intended for project-specific use. These fields
can be populated by a project-specific Mediation library, by Enrichment Rules, or by
project-specific logic. Refer to Project Fields Configuration for more information.
Alarm History extends active alarms with additional fields, which are listed in Appendix B:
History Alarm Attributes.
The fields can be customized (that is, the field label or description can be changed) by editing
ProjectActiveAlarm.xml and ProjectHistoryAlarm.xml in the project metadata of the JFam
module. Refer to the Configuration section for more information.


Alarm Management
Alarm management includes the following:
• Alarm Monitoring
• Alarm Correlation
• Reporting
• Alarm Handling

Alarm Monitoring
The FM client displays alarms to users in real time. It notifies operators in the Cruiser
about alarms that are raised or cleared. The basic FM displays are also available, with a
limited set of tools, in the Light Cruiser application. Operators can view additional
information about an alarm in the alarm details, and can even view the alarm’s raw data if it
was received from the network element. Audible notification is also available. In addition, FM
can send emails or SMS messages about important alarms using the FM Notification mechanism.
To reduce the number of alarms that the operator has to handle, FM detects repeated alarms
(sequential UP events for the same logicID) and toggled alarms (sequential UP-DOWN events
for the same logicID that recur frequently) and hides them from the operator. In addition,
various actions can be automated through rules defined in the FM Admin module.
Once alarms are received and displayed, users can investigate the alarms further and handle
them by acknowledging them, deferring them, or clearing them. Cleared alarms can stay
visible in monitoring clients for a predefined period of time and then be removed. All (active
and cleared) alarms and messages received are stored in a historical database that can be
accessed to produce historical reports using the History Analysis tool. All events and actions
that were applied to specific alarm can be investigated through the Event Log display.
FM has a bidirectional connection with NeTkT, TEOCO’s trouble-ticket product allowing
creation of a new ticket for the alarm or appending it to an already existing ticket. Integration
with other trouble ticket systems is also available.

Alarm Class Concept


The alarm class concept provides the ability for each NOC user to manage only the relevant
alarms under his/her "jurisdiction" (for example, geographical area or technology).
In Helix, alarms are associated with alarm classes on the basis of common attributes, such as
the alarm type (for example, application alarms or infrastructure alarms), or the area in which
the alarms originated. The association is performed through Enrichment Rules.
Once alarms are associated with alarm classes, the alarm classes can be associated with
users to enable specific users to work on alarms under their "jurisdiction" only, and not on
those of others. Association is performed through the TEOCO Admin GUI, where the user can
be linked to a BP class, which in turn is linked to Alarm Classes.
A user associated with an alarm class can either operate on alarms of that class or
only view their details. Alarms of other classes are invisible to the user.
Alarm class definitions should be configured using Enrichment rules.
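
The visibility model described above can be sketched as a two-step lookup from user to BP class to alarm classes. This is only an illustration: the user names, BP class names, alarm class names, and the "operate"/"view" labels are hypothetical and do not reflect the actual Helix data model.

```python
# Hypothetical mappings; in Helix these links are maintained in the TEOCO Admin GUI.
USER_TO_BP_CLASS = {"alice": "north_noc", "bob": "core_team"}
BP_CLASS_TO_ALARM_CLASSES = {
    "north_noc": {"RADIO_NORTH": "operate", "CORE": "view"},
    "core_team": {"CORE": "operate"},
}


def access_level(user, alarm_class):
    """Return "operate", "view", or None (the alarm is invisible to the user)."""
    bp_class = USER_TO_BP_CLASS.get(user)
    return BP_CLASS_TO_ALARM_CLASSES.get(bp_class, {}).get(alarm_class)


def visible_alarms(user, alarms):
    """Keep only the alarms whose class is linked to the user's BP class."""
    return [a for a in alarms if access_level(user, a["alarm_class"]) is not None]
```

With these sample mappings, "bob" sees only alarms of the CORE class, while "alice" can operate on RADIO_NORTH alarms and only view CORE alarms.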


Toggling Alarms
The Alarm Toggling feature is used to reduce the number of flipping alarm instances.
If X (by default 3) or more instances of the same alarm are raised and cleared within Y
minutes (by default 10), the alarm is marked as “toggle”. The first 2 (assuming X = 3)
instances of the alarm are treated as regular ones, but the third one remains active (with a
“toggle” mark) regardless of its CLEAR event, and the following alarm instances are ignored.
That is, the following alarm instances “belong” to the third instance and are not treated as
separate instances. By default, data from toggling instances is not copied to the “hosting”
alarm, but this behavior can be changed through Toggle Rules.
The alarm remains toggled until there is a “silence” period of Z minutes (by default 15), that
is, a period with no UP or DOWN events for this alarm. If the last event was UP, the third
instance remains in toggle state as an active instance. Otherwise, the instance is treated as
cleared.
The fourth and subsequent instances are not seen in history as separate instances. All the
toggle events can be seen in the history log of the third instance. Note, however, that the
server unifies all toggling events (per alarm) that occurred in the same second, so some
events may be missing in the history. The buffering time can be configured through
FamEngine’s toggleRepeat.bufferingTime property.
Toggle parameters can be configured through the FamEngine server properties (see
FamEngine). It is also possible to change the configuration for a certain group of alarms
through the Toggle Rules in the FM Admin application.
Sequence Example
• First alarm instance is raised at 10:00 and cleared at 10:01 (regular instance).
• Second instance is raised at 10:02 and cleared at 10:03 (regular instance).
• Third instance is raised at 10:04. FM recognizes that the two previous instances
occurred less than 10 minutes ago and marks this instance as toggled.
• The third instance is cleared at 10:05. The instance remains active in toggle state.
• Fourth instance is raised at 10:06 and cleared at 10:07. No new instance is created.
The user still sees the third instance as active.
• Fifth instance is raised at 10:08 and cleared at 10:09. No new instance is created. The
user still sees the third instance as active.
• At 10:24 (15 minutes afterward), the toggle mark is removed from the third instance
and the user sees it as a regular cleared instance.
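
The toggle logic and the sequence above can be traced with a small simulation. This is a simplified sketch, not the FamEngine implementation: it counts UP events within the Y-minute window and assumes each earlier instance was cleared before the next one was raised. The function name, the outcome labels, and the use of whole minutes as the time unit are illustrative assumptions.

```python
X_INSTANCES = 3     # instances needed to mark an alarm as toggle (default per the text)
Y_WINDOW_MIN = 10   # detection window, in minutes
Z_SILENCE_MIN = 15  # silence period that ends the toggle state, in minutes


def trace_toggle(events, now):
    """Classify the UP/DOWN events of one alarm.

    events: list of (minute, "UP" or "DOWN"), sorted by time.
    now:    current time in minutes, used for the final silence check.
    Returns (per-event outcomes, final state of the last instance).
    """
    outcomes = []
    recent_ups = []   # UP times of regular instances inside the window
    toggled = False
    last_time, last_kind = None, None

    for minute, kind in events:
        # A long enough silence before this event ends the toggle state.
        if toggled and minute - last_time >= Z_SILENCE_MIN:
            toggled, recent_ups = False, []
        if kind == "UP":
            if toggled:
                outcomes.append("absorbed")  # no new instance is created
            else:
                recent_ups = [t for t in recent_ups if minute - t < Y_WINDOW_MIN]
                recent_ups.append(minute)
                toggled = len(recent_ups) >= X_INSTANCES
                outcomes.append("toggle" if toggled else "regular")
        else:
            outcomes.append("kept active" if toggled else "cleared")
        last_time, last_kind = minute, kind

    if last_kind == "UP":
        final = "active"
    elif toggled and now - last_time < Z_SILENCE_MIN:
        final = "active (toggle)"
    else:
        final = "cleared"
    return outcomes, final


# The sequence from the example, in minutes after 10:00.
events = [(0, "UP"), (1, "DOWN"), (2, "UP"), (3, "DOWN"), (4, "UP"),
          (5, "DOWN"), (6, "UP"), (7, "DOWN"), (8, "UP"), (9, "DOWN")]
outcomes, final = trace_toggle(events, now=24)
```

Evaluated at 10:24 (minute 24), the third instance is reported as cleared; evaluated at 10:10, it is still active in toggle state.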

Repeated Alarms
When an alarm is raised and another alarm with the same Logic ID is already active, the new
alarm instance is considered a Repeated alarm. Repeated alarms are automatically
suppressed by FM and do not appear as new rows in the Active Alarms display.
The data of the Repeated alarm is copied to the original alarm unless dictated otherwise by
Repeated Rules. The original alarm stores the number of repeated alarms that occurred
and the time of the last occurrence.
All the repeated events can be seen in the history log of the alarm. Note, however, that the
server unifies all repeated events (per alarm) that occurred in the same second, so some
events may be missing in the history. The buffering time can be
configured through FamEngine’s toggleRepeat.bufferingTime property.
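
The suppression of Repeated alarms can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (logic_id, time, repeat_count, last_repeat_time) are hypothetical, the default copy behavior is modeled as copying every field except the raise time, and Repeated Rules and event buffering are not modeled.

```python
def on_alarm_up(active_alarms, event):
    """Apply an UP event; return True if it was suppressed as a Repeated alarm."""
    original = active_alarms.get(event["logic_id"])
    if original is None:
        # First occurrence: becomes a new row in the active alarms.
        active_alarms[event["logic_id"]] = {
            **event, "repeat_count": 0, "last_repeat_time": None}
        return False
    # Repeated alarm: copy its data to the original, but keep the original raise time.
    original.update({k: v for k, v in event.items() if k != "time"})
    original["repeat_count"] += 1
    original["last_repeat_time"] = event["time"]
    return True
```

The dictionary holds one entry per Logic ID, so a repeated UP event never creates a second row for the same alarm.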


Maintenance Calendar
Planned maintenance is part of a communication supplier's routine activities, which include
fixing network problems, element maintenance, and network element upgrades.
Planned maintenance activities can create many FM alarms that do not indicate actual
problems.
The Maintenance Calendar feature helps NOC operators handle planned maintenance activities
and special event alarms by displaying relevant alarm-maintenance information.
To configure this feature, refer to Maintenance Calendar Configuration.

Schematic Views for FM


Schematic Views for FM is a complementary add-on to the Fault Management (FM) product.
It displays user-defined sections of a network's topology and the internal structure of the
equipment. It displays the system's managed objects as well as the relationships between
them. Data is displayed graphically, making it easy to access and comprehend. Schematic
Views for FM can display an object’s alarm status as received from the system.

GEO Maps for FM


GEO Maps for FM is an add-on to the Fault Management (FM) product. It provides alarm
geographical visualization with the standard Alarms Map navigation capabilities (such as
zooming and panning). The map shows alarmed objects based on the user-selected folder,
considering all filters and criteria of the folder. Each site is displayed as an icon, colored
according to its highest alarm severity, with its name next to it. By default, the background
layer source is Open Street Maps. Optionally, each project can purchase a Google Maps or a
Bing Maps license and use it in addition to Open Street Maps.

FaultPro
FaultPro is an optional add-on that helps telecom service providers achieve a high
level of NOC efficiency. It provides the capability for automatic problem correction: it is
designed to solve problems automatically (or semi-automatically), freeing the NOC
personnel from having to deal with them.
FaultPro operates in the following modes:
• Automatic Mode: network commands and scripts are activated via automation rules
that meet predefined conditions. The scripts and commands are developed in the
Mediation layer’s NCI module.
• Manual Mode: the Send Network Commands module can be accessed from the
Alarm Monitoring application for manually activating commands, scripts, Telnet
sessions to devices, and so on. The list of available commands and scripts is based on
the alarming network element and on the conditions defined in the association rules.

FM Screener
FM Screener is an optional feature that increases operational efficiency by reducing the
number of alarms that NOC operators have to manage. The module analyzes which alarms
are considered unnecessary and automatically marks them as SPAM. In addition, it gives
end users and the system administrator control over the list of SPAM alarms. They can
easily add or remove SPAM indications from the Cruiser and/or from the administrator's GUI.


FM Alarms Summary
The Alarms Summary master mode provides the NOC user with a summary visualization of
the network status. The summary visualization can be produced for any folder, and can
therefore be adjusted to each use case. The summary criteria are configurable and are
based on attributes of the network element instances, for example, types, vendors, and
geographic location. The display includes color visualization of the network status.
The summary view information can be displayed in both gallery (icon) and list (grid) view.
There are predefined icons in the gallery view that can be configured by the administrator.
In addition, the Alarms Summary provides displays of the alarm distribution by selected alarm
attributes.
If ServiceImpact is installed and the user is permitted to use it, service and customer displays
are available.

Anomaly & Trend Information


Anomaly and trend information provides NOC and engineering users with value-added
data about outliers and behavior trends that have a high probability of predicting malfunctions,
in order to improve fault investigation. This additional information is built upon historical alarm
data and calculated by running certain analytic algorithms on the history.
The following 2 types of predictive information are available:
• Anomaly: calculates anomalous network behavior as a number between 0 and 100
(0 indicates lowest anomaly and 100 indicates highest anomaly).
• Trend: calculates trends/quantities of increasing/decreasing alarms as a number
between -100 and 100 (-100 indicates highest negative trend, 100 indicates highest
positive trend, and 0 indicates lowest trend).
When the Analytics feature is used, the Alarms Summary master mode becomes the
Analytics master mode and predictive fields and graphs are added to the display.
For information about configuring the Analytics predictive information options, see Anomaly &
Trend Configuration.

Alarm Prediction
Alarm Prediction is a tool that predicts network failures and alerts about them based on an
advanced machine-learning algorithm.
The algorithm scans the alarms history and builds a model that can predict the failure before it
occurs. A mathematical likelihood score is assigned to each predicted alarm and the ones
that receive a high likelihood score are triggered and presented in Cruiser for the NOC
engineers to investigate.
This prediction algorithm is completely network agnostic and fully automated. The tool works
on network data and does not require any hard logic implementation using rules or external
reference data.
For information about configuring the Alarms Prediction options, see Alarms Prediction
Configuration.


Site View Display


The Site View display presents information about the network elements associated with a
selected site, based on the BC information, including the links between the network elements
and alarm information for each object.
It can be opened for a selected site or for the From Site of a selected alarm.
It is accessible from the alarm display, GEO Maps display, and the Ribbon.

FM Notifications
The FM Notification mechanism enables you to notify specified users by email or SMS about
network changes that are reflected in the FM system. The notification mechanism is built from
the following main functionalities:
• Notification contacts and groups: managed in the TEOCO Admin GUI, these list the
available users and groups to send notifications to. Contacts and groups can be
migrated from the Helix user list or the operator's organization LDAP system. For more
information, see the Notification Mechanism chapter in the TEOCO Admin User Guide.
• Notification templates: managed in the FM Admin GUI, these provide the ability to
create notification email or SMS templates. A template can contain a placeholder
for any alarm field. For more information, see the Fault Administrator User Guide.
• Notification rules: managed in the FM Admin GUI using the Action rule definition, these
provide the ability to define the exact criteria, the template to use, and the users/groups to
send notifications to. Using action rules, you can also send a notification to an ad-hoc
user that is not listed in the Notification contact list. For more information, see the Fault
Administrator User Guide.
• General mail configuration: managed at the infrastructure level. Refer to the Mail
Server Configuration and SmsByMail Service Configuration chapters in the Helix
Administration Guide.

Note: SMS availability depends on specific SMSC plug-in requirements.

To control the sender’s email address and name that appear in sent mails, refer to the FamEngine
notification.rule.sender.email and notification.rule.sender.name properties.

Alarm Correlation
The Fault Management product offers several modules for identifying the root cause of
network failures. These modules significantly reduce the volume of alarms that network
operators have to manage, and significantly shorten the time required to figure out what went
wrong in the network.

Correlator TRS
Correlator TRS is an optional topology-based Reasoning System that provides a probabilistic
topology-based root cause analysis. It uses the network’s topology and probabilities to identify
the root cause of alarms. It is capable of making correct decisions even when some alarms
arrive late.


Correlator ES
Correlator ES is an FM add-on that uses If-Then type business rules to identify the root cause
of alarms. It uses correlation rules to analyze a group of alarms and identify the root cause
“parent” alarms, which reflect actual faults and require fixing, and symptomatic child alarms
that are secondary reactions to the primary faults, and as such do not require any action.
Correlator ES creates derived alarms when no alarm in the group adequately describes the
root-cause and suppresses false alarms generated as a result of maintenance activities.

Machine Learning Root-cause Analysis (RCA)


The Machine Learning RCA analytical algorithms developed for FM add another level of
automation to Fault Management, which extends the traditional rule-based RCA with even
more dynamic and adaptive mechanisms. The algorithms study and analyze the stream of
alarms reaching the system, suggesting groupings and correlations between alarms, and
tagging the potential root-cause alarms among them.
This mechanism can significantly improve the identification of parent alarms (that is,
the root causes) in scenarios that were not predefined and when new elements are introduced to
the network. In terms of NSOC efficiency, using such analytics reduces the number of
alarms the controllers need to manage and assists them in fixing the core of the
problems identified in the network.
For configuration, refer to Root cause analysis configuration.

Reporting
FM Reporter is an optional product that enables service providers to easily access web-based
reports that provide a detailed view of current and historical alarms. It also enables the users
to detect critical problems and developing trends, and take proactive actions before these
events escalate into a crisis. It includes predefined reports and enables the user to create
customized reports.

Alarm Handling
FM offers the following options for handling alarms:
• Opening a trouble ticket using the NeTkT product. Integration with other Trouble Ticket
management systems is also available.
• Sending commands to network elements, using the FaultPro module, which is part of
the FM product suite.
• Marking alarms as SPAM/Premium (using FM Screener).
• Changing the internal state of the alarms using Acknowledge and Defer commands.
• Adding comments (using work logs) to the alarm.
• Creating manual parent/child correlation between alarms.

Note: Some of these Helix options can be automated through Action Rules.


System Description
FM is based on two main layers: Engines and Clients.

[Figure: FM system layers. Monitoring Clients and Administration Clients connect over HTTP to the Engines layer: Fam Engine (J2EE), Fam History (J2EE), Fam Admin (J2EE), Correlation Engines (N2/J2EE), and External APIs (J2EE/N2). Surrounding components: Mediation, PM Alarms, Service Impact, and Trouble Tickets (NeTkT).]

Engines
FM Engine(s)
FM Engine is the major component responsible for the handling and distribution of alarms
(including communicating with the Mediation layer that in turn communicates with the
network), manual and automatic alarms command execution, mail/SMS notifications, and
many other activities.
To improve scalability and performance of the FM system it is possible to install multiple FM
Engines that will divide the work between them.
FM Engine is a J2EE module that must be deployed in its own EAR. For more information
about J2EE deployment and configuration, see the Helix Administration Guide.
• FamProxy is a supplement to FM Engine, providing infrastructure for developing FM
applications. It is installed automatically in the required EARs.
• JFam module is an additional automatically installed supplement.

FM History
The FaM History (J2EE) module is responsible for the persistence of history data and events
in the database.


FaM Admin
FaM Admin (J2EE) is the server-side component responsible for the administration services.

FM Analytics
FM Analytics is an optional module responsible for Analytics Predictive Information
calculation.

Correlators
There are three optional correlation engines:
• Correlator TRS: based on N2 technology, supplied as part of the FaM API Service
module.
• Correlator ES (drools): based on RedHat BRMS, “ES” module.
• Correlator RCA: “FamRCA” module.

External APIs
There are additional modules that provide capabilities of alarm information communication
with external systems. Available protocols are SNMP, message bus (JMS), and web services.

Clients
Cruiser Client
Cruiser is the Helix Fault Management client. It leverages intelligent event-processing
capabilities, advanced Fault Management concepts, and a new telecom-oriented graphical
interface to create the most comprehensive and robust Fault Management solution. Cruiser
enables users to efficiently identify, monitor, and resolve network incidents detected in hybrid
and Next Generation communication networks. The intuitive graphical user interface
streamlines quick problem resolution by providing a consolidated, highly filtered, and
prioritized view of network faults.
The Cruiser Monitoring client is composed of the following modules:
• FamShell
• WinFam
• MapsModule (optional)
• FaultProModule (optional)


Light Cruiser Monitoring Client


Light Cruiser is a limited “light weight” application that includes a subset of the Cruiser
functionalities for specific uses. The light version provides the user with the known Cruiser
capabilities of handling alarms with the same look and feel, but without certain functionalities.
The Light Cruiser Monitoring client is composed of the following modules:
• FamLightShell
• WinFam

History Analysis Client


The FM History client is an investigation tool that enables you to quickly retrieve and view the
alarm history according to selected criteria. It also enables you to investigate alarm problems
and to browse the history in general.
The History Analysis client is composed of the following modules:
• FamHistoryShell
• HistoryAnalysisModule

Administration Client
The FM Admin client enables administrators to perform the administration tasks that are
required to configure the Fault Management solution to best meet the alarm monitoring
requirements.
The application offers the following main functions: alarm rule creation, Trouble Ticket
Mapping rule definition, FM Configuration, and TRS Correlation rule definition.
The Administration client is composed of the following modules:
• FamAdminShell
• FamAdminModule


Architecture
The following diagram provides a detailed data flow between server-side components:

[Figure: data flow between server-side components. Components shown: Mediation, FM Mediation Proxy, FM Engine instances (each with a NeTkT plugin connected to NeTkT), FM Proxy, FM History (writing historic alarms), FM Admin (Config DB, deploy rules/SYNC), N2 modules (TG/TRS/ES/SNMP) with FM Data subscription, and the Distributed Cache (Hazelcast) holding the active alarms model. Traffic flows through the Kafka History, Admin, Commands, and Events topics.]


Active/Active architecture
To improve scalability, performance, and fault tolerance of the FM system, it is possible to
install multiple instances of FM Engine that will divide the work between them.
The system can survive a crash of FamEngine instances as long as at least one instance
continues to work. Note, however, that some events being processed by a FamEngine
instance at the time of its crash will be lost.
A relevant trouble ticket plugin (when one exists) should be installed on every FamEngine EAR.

Apache Kafka and ZooKeeper


The FM architecture relies heavily on the Apache Kafka streaming platform, together with
Apache ZooKeeper, to deliver traffic between FM components.
Kafka brokers and ZooKeeper are installed and configured by ISM. Please refer to the ISM
installation guide for more details.
Kafka documentation can be found at http://kafka.apache.org/23/documentation.html.

Distributed Cache Architecture


The Hazelcast distributed cache (http://www.hazelcast.com/) is used to hold the active alarm
information in memory and to distribute the alarm events between the FamEngine and the
FamProxy instances.
Cache data is held in dedicated EAR(s), usually named FamCache.
Such EARs have only JFam installed, and their wls-XXX.ksh file should contain the following line:
export DISTRIBUTED_CACHE=true
Usually, it is enough to have one EAR, but for large projects consider dividing data between
several EARs.
The performance and health of these EARs are crucial for the entire FM system functionality
and should be monitored constantly.


Workflows

Post-Installation Workflow
The following workflow defines the post-installation steps required to configure the Fault
Management solution.
1. Install all the required Fault Management solution components. Refer to the Helix
Server Installation Guide.
2. Define the library list and activate the library. See the Fault Solution Implementation
Guide for details.
3. Configure the GUI labels and tooltips.
4. Configure FM.
5. Define the users, groups, and roles in the TEOCO Admin application. See the
TEOCO Admin User Guide for details.
6. Define the alarm classes.
7. If necessary, define the project roles in the TEOCO Admin application. See the
TEOCO Admin User Guide for details.
8. Map the alarm classes to user groups. See the TEOCO Admin User Guide for details.
9. (Optional) Define the NCI Commands. See the NCI2 Admin User Guide for details.
10. (Optional) Complete the Locale Configuration for Projects Displaying UI in non-
English languages.
11. Define the users' working environment (such as folders) for the Cruiser and FM
History applications.
12. (Optional) If NeTkT is installed, integrate the FM system with NeTkT. See the NeTkT
Integration Guide for details. If another Trouble Ticket system is used, perform the
necessary steps to integrate with that system.
13. Verify that all required components are running.
14. Define the operators' working environment.
15. (Optional) Validate that the system is functioning properly by using the bench_sim
utility (alarm simulator).

Displaying the Cruiser System Folder Names in non-English Languages
In projects that use Russian locales, in addition to appropriate localization configuration (refer
to “Multi-Language Support Settings” in the Helix Administration Guide) the following
post-installation step is required.


To display the Cruiser system folder names in Russian:


1. On the server, go to
WinFam_<version number>.zip\release\weblogic\delivery\db\Oracle\.
2. From the Oracle client, run the script PERSONALIZATION_ENTRIES_RUS.sql.

Note: To prevent corrupted text, the Oracle client should be configured to use the same
character set as the database.

Post-Upgrade Workflow
The following workflow defines the post-upgrade steps required to configure the Fault
Management solution.
1. Check that all the prerequisites are installed on the client and server. See the Helix
Server Installation Guide.
2. Upgrade all the required Fault Management solution components. See the Helix
Server Installation Guide.
3. The following notes are relevant for projects upgrading from versions prior to 8.0:
a. Due to a major change in the FM architecture, the existing project metadata files
may not work. They should be sent to TEOCO S&D for revision. Files supplied
together with the JFam release may be used as a temporary solution until
TEOCO’s recommendation is received.
b. Raise rules and Automation rules were merged into unified Action Rules. While
migration is automatic, we recommend revising the migrated rules.
c. Some alarm fields were removed or made invisible. We recommend revising rules
in FM Admin, folders in Cruiser, and saved queries in FM History. If they use
removed or invisible fields, change the rules to use valid fields.
d. Hook functions of the alarm handler do not exist anymore. Their logic should be
reimplemented using existing means of FM. For example, using Enrichment,
Repeated, and Toggling rules.
e. Alarm Handler Prefs are deprecated. Their values are taken into account during
the upgrade, but from this version onwards, the entire configuration is defined
through the FamEngine properties.

4. Update the library list (if required) and activate the library. See the Fault Solution
Implementation Guide for details.
5. Check and adjust the FM configuration.
6. If required, define new alarm classes and map them to BP classes in TEOCO Admin.
See the TEOCO Admin User Guide for details.
7. Define NCI Commands if required. See the NCI2 User Guide for details.
8. If required, fine-tune the integration between FM and NeTkT.
9. Verify that all required components are running.

The FM History display can be opened from external applications by opening a URL with the
appropriate parameters.
The URL prefix is:
http://[your server name]:[port]/FaMHistoryShell/FaMShellActivator.jsp?
The URL parameters are:

Name          Type     Description

active        boolean  Mandatory. Always true.

field         string   The name of the alarm field to filter by (when filtering by a
                       single field).

value         string   The value of the alarm field to filter by (when filtering by a
                       single field).

timecriteria  string   Relative time (<N> Hours, Days, Weeks, or Months), in the format
                       H/D/W/M<N>, where H=Hours, D=Days, W=Weeks, M=Months.
                       For example, W10 indicates 10 weeks.

allparents    boolean  Determines whether to open the Correlation Tree window or just
                       filter by the following parameters.
                       Set as true if you want to open the Correlation Tree window.
                       Set as false if you do not want to open the Correlation Tree
                       window, but you want to filter the records by the following
                       parameters.
                       Ignore this parameter if you just want to filter by a single
                       field/value (for backward compatibility).

LogicID       string   The value of the LogicID field of the alarm to filter by.

DateTimeUp    string   Full date and time, including milliseconds.

PCStatus      string   Parent Child Status, values according to available values in JFam.

ObjectID      int      The value of the ObjectID field of the alarm to filter by.

ObjectType    int      The value of the ObjectType field of the alarm to filter by.

Example:
http://dc50-dev-helix91:3600/FaMHistoryShell/FaMHistoryShellActivator.jsp?activate=True&PCStatus=PARENT&ObjectID=123456&ObjectType=78&timecriteria=W10&LogicID=comcast_test_3&allparents=false&DateTimeUp=20/03/2017 16:43:31.092
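
An external application might assemble such a URL programmatically. The following sketch uses only the Python standard library; the helper names are hypothetical, and percent-encoding the parameter values (for example, the space in DateTimeUp) is an assumption that should be verified against the target server. The page name and the activate parameter are spelled as in the example above, which differs from the URL prefix and from the parameter table (active), so confirm the correct spelling for your installation.

```python
from urllib.parse import quote, urlencode

TIME_UNITS = {"H": "Hours", "D": "Days", "W": "Weeks", "M": "Months"}


def time_criteria(unit, count):
    """Build a relative-time value such as "W10" (10 weeks)."""
    if unit not in TIME_UNITS or count <= 0:
        raise ValueError(f"unsupported time criteria: {unit}{count}")
    return f"{unit}{count}"


def history_url(server, port, page, params):
    """Assemble the activation URL; values are percent-encoded (an assumption)."""
    return f"http://{server}:{port}/{page}?{urlencode(params, quote_via=quote)}"


url = history_url(
    "dc50-dev-helix91", 3600,
    "FaMHistoryShell/FaMHistoryShellActivator.jsp",  # page name as in the example
    {
        "activate": "True",  # spelled as in the example; the table lists "active"
        "PCStatus": "PARENT",
        "timecriteria": time_criteria("W", 10),
        "LogicID": "comcast_test_3",
        "allparents": "false",
        "DateTimeUp": "20/03/2017 16:43:31.092",
    },
)
```

Validating the time unit before building the URL catches malformed criteria early, rather than relying on the server to reject them.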


Defining the Operator Working Environment


As an administrator, there are several actions that you can perform in terms of defining the
operator working environment:
• Working folders and filters: you can prepare folders to be used on the System or
Group level.
• Master layouts and layouts: you can prepare System and Group level layouts per
folder and master mode.
• Notification filters: you can use filters to configure the System and Group level criteria
for the alarm notification pop-up window.
For more information, see the Administration section of the Cruiser User Guide.


Configuration

Overview
The FM system can be configured through the following means:
• FM Admin GUI.
• TEOCO Admin GUI.
• Module properties: FaM Engine and other J2EE modules can be configured by changing
module properties in the appropriate jcore_cfg.xml file. The Modules Configurable
Properties chapter details the available module properties. Refer to the Helix
Administration Guide for more details. Changes take effect after the relevant WebLogic
server (EAR) is restarted.
• Project metadata files: certain configurations require changing project metadata files.

Notes:

• We recommend verifying the Fault configuration, especially rules, in the test environment
before applying it in the production environment. We advise using the bench_sim
utility to test the FM Admin rules.
• We recommend that two or more administrators/system integrators do not define
rules of the same type simultaneously. For example, if two users modify the same rule at
the same time, the last finished operation is executed and the other is ignored without
any warning.

Enrichment Rules
Enrichment rules are a powerful tool that allows populating or changing the alarm data at any
stage of the alarm life cycle. It is possible to define several rules where each one serves its
own set of alarms. The entire configuration is performed using the FM Admin GUI.
Each rule has the following properties:
Condition
A rule is applied only to alarms matching the criteria. Criteria can refer to conditions on any
alarm field, combined with nested logical “AND”/“OR”/“NOT” operators. Rules are triggered only
on events specified in the condition, such as: Acknowledge, trouble ticket creation,
parent/child connect/disconnect, and so on.
In addition, a JavaScript expression (including Mediation Lookups) can be used to define the
criteria.
Change Alarm Fields Values
Enrich the alarm by setting or changing the alarm fields with new updated information.
Lookups and JavaScript can be used to populate the alarm fields.
Modify Alarm Class
Change the alarm class of the alarm.
Activation Time
Defines the date/time period when the rule is active. It is possible to define start and end dates
and/or weekdays and/or hours of the day.

Example of Possible Rules
Update Addition Info field with technician name in charge of the alarmed site.
For more information, refer to the Fault Administrator User Guide.
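
The way a nested condition gates an enrichment can be sketched as follows. This is an illustration only and not the FM Admin rule format: the tuple-based condition syntax, the "EQ" leaf operator, the rule keys, and all field names and values are hypothetical. The sample rule mirrors the example above by adding the technician in charge of the alarmed site.

```python
def matches(condition, alarm):
    """Evaluate a nested AND/OR/NOT condition against an alarm (a dict of fields)."""
    op, *args = condition
    if op == "AND":
        return all(matches(c, alarm) for c in args)
    if op == "OR":
        return any(matches(c, alarm) for c in args)
    if op == "NOT":
        return not matches(args[0], alarm)
    if op == "EQ":  # leaf: field equals value
        field, value = args
        return alarm.get(field) == value
    raise ValueError(f"unknown operator: {op}")


def enrich(alarm, rule, lookup):
    """Return a copy of the alarm with the rule's target field set from a lookup table."""
    if not matches(rule["condition"], alarm):
        return alarm
    enriched = dict(alarm)
    enriched[rule["target_field"]] = lookup.get(alarm.get(rule["lookup_key"]), "")
    return enriched


# Hypothetical rule: add the technician in charge of the alarmed site.
rule = {
    "condition": ("AND", ("EQ", "severity", "CRITICAL"),
                  ("NOT", ("EQ", "alarm_class", "TEST"))),
    "lookup_key": "site",
    "target_field": "additional_info",
}
technicians = {"SITE-17": "J. Doe"}
alarm = {"site": "SITE-17", "severity": "CRITICAL", "alarm_class": "RADIO"}
enriched = enrich(alarm, rule, technicians)
```

Returning a copy rather than modifying the alarm in place keeps the original event available, for example for comparison or logging.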

Action Rules
Action rules are a powerful tool that allows triggering actions at any stage of the alarm life cycle.
It is possible to define several rules where each one serves its own set of alarms. The entire
configuration is performed using the FM Admin GUI.

Condition
A rule is applied only to alarms matching the criteria. Criteria can refer to conditions on any
alarm field, combined with nested logical “AND”/“OR”/“NOT” operators. Rules are triggered only
on events specified in the condition, such as: Acknowledge, trouble ticket creation,
parent/child connect/disconnect, and so on.
In addition, a JavaScript expression (including Mediation Lookups) can be used to define the
criteria.
Starting from version 8.0, the behavior of the “duration” condition has changed. The duration
of the alarm (the period from the alarm UP time) is checked once at the time of the rule
evaluation. If the alarm duration does not match the condition, the rule is rejected.

Modifications/Actions
The following actions can be applied to the alarm:
 Acknowledge/Undo Acknowledge—change of the alarm internal status, usually
means that alarm was noticed by the operator.
 Create/Disconnect trouble ticket.
 Reject alarm—alarm will be ignored by the system with no further tracking.
 Inhibit alarm—alarm will not be shown in the monitoring clients, but will be tracked in
the system.
 Apply association—copy work logs and trouble tickets from the previous alarm
instance if it was cleared within X (defined in the rule) minutes, that is, if the previous
instance is close to the current one. Copying trouble tickets means that the new alarm instance
is appended to the trouble tickets of the previous alarm instance.
 Do not send to Correlation—alarm will not be sent to a correlation system.
 Create trouble ticket for the alarm.
 Defer alarm—change of the alarm internal status, usually means that the alarm will be
handled later.
 Apply escalation—alarm severity will be raised automatically if alarm is not
acknowledged or cleared within X (defined in rule) minutes.
 Defer/Undo Defer—'snooze' mechanism. The alarm will be in deferred status for the
specified amount of time.
 Alarm Down—clear the alarm.
 Prioritize—raise the alarm priority.
 Create worklog.
 Run NCI command.
 Notification—send email/SMS for specified users.


Delay
Alarm actions can be delayed for a specific amount of time. The action is performed at
the end of the period if the alarm is still active and still matches the rule criteria. For
example, a trouble ticket can be created only 10 minutes after the alarm was raised, and only
if the alarm is still active.
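
The delayed check described above can be sketched as follows. This is a minimal illustration only, not the product implementation: the Alarm class, its fields, and the should_run_delayed_action function are hypothetical names, and the sketch assumes that the delay is counted from the alarm UP time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable


@dataclass
class Alarm:
    # Hypothetical alarm model used only for this sketch.
    up_time: datetime
    active: bool
    severity: str


def should_run_delayed_action(alarm: Alarm,
                              matches: Callable[[Alarm], bool],
                              delay: timedelta,
                              now: datetime) -> bool:
    """Return True when the delay has elapsed and the alarm still qualifies."""
    if now < alarm.up_time + delay:
        return False  # the delay period has not ended yet
    # At the end of the period the alarm must still be active and still match.
    return alarm.active and matches(alarm)
```

With a 10-minute delay, the function returns False five minutes after the alarm was raised, True at ten minutes if the alarm is still active and matching, and False if the alarm was cleared in the meantime.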

Activation Time
Defines the date/time period when the rule is active. It is possible to define start and end
dates and/or weekdays and/or hours of the day.
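
As a rough sketch of how such an activation window could be evaluated, the following combines the optional date range, weekdays, and hours. The ActivationTime class and its field names are hypothetical, and the sketch assumes that a part that is not defined does not restrict the rule.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional, Set, Tuple


@dataclass
class ActivationTime:
    # All parts are optional; an unset part does not restrict the rule.
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    weekdays: Optional[Set[int]] = None       # 0 = Monday ... 6 = Sunday
    hours: Optional[Tuple[int, int]] = None   # (from_hour, to_hour), to_hour exclusive

    def is_active(self, moment: datetime) -> bool:
        if self.start_date and moment.date() < self.start_date:
            return False
        if self.end_date and moment.date() > self.end_date:
            return False
        if self.weekdays is not None and moment.weekday() not in self.weekdays:
            return False
        if self.hours is not None:
            from_hour, to_hour = self.hours
            if not (from_hour <= moment.hour < to_hour):
                return False
        return True
```

For example, ActivationTime(weekdays={0, 1, 2, 3, 4}, hours=(8, 18)) describes a rule that is active only on working days during working hours.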

Example of Possible Rules


Open a trouble ticket automatically when the alarm status changes to Acknowledge.
For more information, see the Fault Administrator User Guide.

Association Rules
Association rules enable you to “associate” an alarm with programs, web links, and NCI
commands. Invocation parameters are defined using JavaScript expressions that
refer to alarm field values and Mediation Lookup results.
The Cruiser user can execute the programs and commands associated with the alarm
using the right-click menu. This differs from action rules, which are executed automatically by
the system.
Programs are executed on the user’s local PC and therefore must be properly installed and
configured there.

Condition
The rule is applied only to alarms matching the criteria. Criteria can refer to all the alarm
field conditions with nested logical “AND”/“OR”/“NOT” operators between them.

Activation Time
Defines the date/time period when the rule is active. It is possible to define start and end
dates and/or weekdays and/or hours of the day.

Toggle Rules
The Toggle rules enable you to change the toggling alarm parameters (such as Toggle On,
Toggle Off, and Toggle Depth) and decide which alarm fields should be updated in each
toggling alarm instance.
The rule is applied only to alarms matching the criteria. Criteria can refer to all the alarm
field conditions with nested logical “AND”/“OR”/“NOT” operators between them.
Toggle rules are always defined as Blocking. This means that when a rule is executed, it
prevents the execution of the remaining rules with the same criteria.
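
The Blocking behavior can be pictured with the short sketch below. It is an illustration under assumptions, not the product logic: the rule list, the alarm dictionary, and the first_blocking_match function are hypothetical, and the sketch assumes that rules are evaluated in a defined order.

```python
from typing import Callable, Dict, List, Optional, Tuple

Alarm = Dict[str, str]
Rule = Tuple[str, Callable[[Alarm], bool]]  # (rule name, criteria)


def first_blocking_match(rules: List[Rule], alarm: Alarm) -> Optional[str]:
    """Return the name of the first rule whose criteria match the alarm."""
    for name, criteria in rules:
        if criteria(alarm):
            return name  # a Blocking rule stops evaluation of the remaining rules
    return None
```

In this sketch, a catch-all rule placed last is executed only for alarms that no earlier rule has matched.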
For more information, see the Fault Administrator User Guide.


Repeated Rules
The Repeated rules enable you to define whether to update the alarm fields with the
repeating alarm’s fields.
The rule is applied only to alarms matching the criteria. Criteria can refer to all the alarm
field conditions with nested logical “AND”/“OR”/“NOT” operators between them.
Repeated rules are always defined as Blocking. This means that when a rule is executed, it
prevents the execution of the remaining rules with the same criteria.
For more information, see the Fault Administrator User Guide.

Display Rules
Display rules let you set special display attributes for selected groups of FM alarms, so that
they are displayed using Italic, Underscore, and/or different text and/or background colors.
This helps the NOC operators notice these special alarms easily.
The rule is applied only to alarms matching the criteria. Criteria can refer to all the alarm
field conditions with nested logical “AND”/“OR”/“NOT” operators between them.
In addition, a JavaScript expression (including Mediation Lookups) can be used to define the
criteria.
When the alarm matches several rules, the coloring instructions are unified. In case of
conflict, the later rule overwrites the previous instructions.
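
The unification of coloring instructions can be sketched as a simple ordered merge. The attribute names and the merge_display_rules function below are hypothetical and serve only to illustrate the "later rule wins" behavior.

```python
from typing import Dict, List

DisplayAttrs = Dict[str, object]


def merge_display_rules(matching_rules: List[DisplayAttrs]) -> DisplayAttrs:
    """Unify the display attributes of all matching rules, in rule order."""
    merged: DisplayAttrs = {}
    for attrs in matching_rules:
        merged.update(attrs)  # on conflict, the later rule overwrites the earlier one
    return merged
```

If the first matching rule sets italic text in red and a later one sets blue text on a yellow background, the alarm is shown in italic blue text on a yellow background.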
For more information, see the Fault Administrator User Guide.

Trouble Ticket Integration


Overview
A pluggable architecture for integration with various trouble ticket systems is available. A
plug-in (J2EE module) is responsible for reporting TT system capabilities (such as ticket
attributes) and communicating all commands and requests to the TT system and back.
Relevant plug-ins must be installed in all FM Engine EARs and in all FM EARs (that is, all
EARs that have the FamProxy module installed).

The following operations exist:
 Create new ticket for the alarm.
 Append the alarm to an existing ticket (chosen by the user).
 Disconnect the alarm from the ticket (the ticket that was created for the alarm or to
which the alarm was appended).
 View the ticket details in the TT system.
 Fetch the tickets from the TT system based on certain criteria. For example, when a user
chooses the ticket for the Append operation.
 Pass the originating alarm worklog to the TT system.

Note: Worklogs that existed before ticket creation and worklogs of the appended
alarms are passed too.

 Update ticket after originating alarm has changed (for example, cleared).
 Update ticket status in FM system after it was changed in the TT system.

Note: Some operations may not be supported by the specific plugin.

Trouble Ticket Mapping Rules


As the TT structure is different from the Alarm structure, the alarm must be transformed into
the TT structure. The Mapping rules mechanism is the tool that enables the administrator to
define the transformation.
We refer to three structures:
 Main ticket—the structure that holds the ticket. Some values must be specified at the
time a ticket is created for an alarm and some will be populated later during ticket
processing.
 Appended—the structure holds the information about the appended alarm with
one-to-many relation to the main ticket.
 Activity—this structure holds changes done to alarm/ticket with a one-to-many relation
to the main ticket.
It is important to understand that structure attributes and types may differ between TT
systems and even between projects. Changes in these structures may require changes in the
mapping rules, such as adding a mapping for a new mandatory attribute.
Mapping rules are managed using the FM Admin GUI. Each rule covers a subset of alarms
(by filter) and defines mapping for the Create, Append, Update, and Worklog (Activity)
operations. ‘Update’ mapping is used when a ticket is updated with alarm changes. At this
stage, the ticket was already modified as a result of the ticket processing and therefore only a
small subset of ticket fields is updated. Usually, the mapping for these fields is the same as in
‘create’ mapping.
JavaScript expressions can be used for the mapping. In addition to
standard JS functions, an expression can refer to alarm fields and the Mediation Lookup
function.


NeTkT Plugin
NeTkT Plugin is a TT plugin to TEOCO’s NeTkT system.
For configuration details, refer to the NeTkT Administration Guide.

GEO Maps
To successfully implement GEO Maps into the Fault solution, alarms should be populated
with the correct Object Type and Object ID and the Eqp Num should get the NE’s Object ID.

Setting GEO Maps Configuration


GEO Maps for FM is a licensed feature.
Geographical location is stored in Base Configuration in the Site Object. Therefore, it can be
accessed directly by the Site ID or by Equipment (having Site ID information).
By default, FM extracts the location according to equipment ID stored in the Ancestor Object
ID alarm field. It is possible to specify another field using the property
fam.engine.enrichment.site.alarmSiteIDField.
If the Site ID information appears in the alarm, it can be used to extract the location directly.
fam.engine.enrichment.site.topologyType should indicate the BC entry with the correct
coordinates (SITE or EQUIPMENT) and fam.engine.enrichment.site.alarmSiteIDField should
point to the alarm field with the ID information.
Example – by SITE
fam.engine.enrichment.site.topologyType=SITE
fam.engine.enrichment.site.alarmSiteIDField=EquipmentNumber
In this case, FamEngine takes the values from the EquipmentNumber alarm field and treats
them as SITE_ID from the CMM_SITE table and then it takes the coordinates of this site.
The flow is: Alarm fields with SITE_ID > SITE > Coordinates.
Example – by EQUIPMENT
fam.engine.enrichment.site.topologyType=EQUIPMENT
fam.engine.enrichment.site.alarmSiteIDField=EquipmentNumber
In this case, FamEngine takes the values from the EquipmentNumber alarm field and treats
them as EQP_ID from the CMM_EQP table, and then it takes the SITE of this EQP, and then
its coordinates.
The flow is: Alarm field with EQP_ID > EQP > SITE > Coordinates.
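
The two lookup flows can be summarized with the sketch below. It is a simplified illustration: plain dictionaries stand in for the CMM_SITE and CMM_EQP tables, and the resolve_coordinates function and its parameter names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def resolve_coordinates(alarm: Dict[str, str],
                        topology_type: str,
                        alarm_id_field: str,
                        eqp_to_site: Dict[str, str],
                        site_coords: Dict[str, Coordinates]) -> Optional[Coordinates]:
    """Follow the SITE or EQUIPMENT flow to find the alarm's coordinates."""
    key = alarm.get(alarm_id_field)
    if key is None:
        return None
    if topology_type == "SITE":
        site_id = key                    # alarm field > SITE
    elif topology_type == "EQUIPMENT":
        site_id = eqp_to_site.get(key)   # alarm field > EQP > SITE
    else:
        raise ValueError(f"unsupported topology type: {topology_type}")
    return site_coords.get(site_id) if site_id is not None else None
```

Here topology_type plays the role of fam.engine.enrichment.site.topologyType and alarm_id_field plays the role of fam.engine.enrichment.site.alarmSiteIDField.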
To save Base Configuration access time, geographical information is cached in FM. A site
location rarely changes, so there is usually no need to refresh the cache. In projects where
site locations do change (for example, in the case of mobile sites), the property
fam.engine.enrichment.site.cache.refresh.enable should be set to true. This causes FM to
refresh the cache after every Base Configuration change (NetImport run). Optionally, it is
possible to perform a periodic refresh every X seconds
(fam.engine.enrichment.site.cache.refresh.interval property).
The site and region locations are kept in the Base Configuration module.

In addition to refreshing the configuration, we recommend restarting FamEngine and the FAM EAR
after making GEO Maps configuration changes.
These settings result in the appropriate values appearing in Cruiser’s SiteID, Lat, and Long
fields.

Setting Base Configuration Region Coordinates


Region and site coordinates are used by the Cruiser GEO Maps for positioning the regions
and sites over the maps. The site coordinates are inserted to Base Configuration using
NetImport. The region coordinates are calculated automatically based on their site
coordinates. The calculation ignores sites outside the defined range. The site coordinates
range is stored in the Base Configuration module.

Note: To provide correct Cruiser GEO Map displays, all the sites and regions stored in the
Base Configuration module should contain correct coordinates.

The coordinates range is found in cmm_db.cmm_codes_table. The relevant
CODES_TABLE_NAME is COORDINATES_RANGE.
The default coordinates range includes the whole world as follows:
 Min Long X -180
 Max Long X 180
 Min Lat Y -90
 Max Lat Y 90

Note: Two coordinates cannot have the same value.

To update the coordinate range:


 Use the CMM_DB.PA_UPD_REGION_COORD.UPD_COORD_RANGE function.
In the standard data flow, NetImport automatically calculates the region coordinates
based on the site coordinates. It calculates only coordinates that were not calculated
yet (meaning that the coordinates are empty).
If needed, the procedure that calculates the coordinates can also be run manually for
a specific region or for all regions.

To manually calculate coordinates for all regions with empty (null) coordinates:
 Use the CMM_DB.PA_UPD_REGION_COORD.UPD_REGION_COORD_ALL
function.

Note: To recalculate all region coordinates, empty all the existing coordinates before
running the procedure.

To manually calculate coordinates for a specific region based on Config ID:


 Use the CMM_DB.PA_UPD_REGION_COORD.UPD_REGION_COORD_BY_CONFIG_ID
function with the Config ID as a parameter.


Map Display Parameters


The MapsConfig-project.xml file resides in the project metadata of the WinFam module and
enables you to define the parameters of the Alarms Map master mode display. After
installation, the file contains all the required elements with their default parameter values. You
have to change their values to match the project’s configuration and concepts. You can
add/delete elements as required.

The Layers Entry


The Layers entry defines the different Alarms Map layers display. Each layer is defined as a
layer entry under layers.
In addition, the layers entry includes the DefaultDescriptionTemplate entry, which is the
default template used to display the bubble window in levels for which the Description
Template is not defined or not valid. It is taken from the WinFaM Metadata.
Each layer entry contains the following elements:

Name                  Description

Level                 Defines the layer’s level in the Maps module. It must match its level
                      definition in Helix’s Network Data Storage. Level 1 is the highest (for
                      example, Country) and Level 5 is the lowest (for example, Secondary
                      Region). Level 0 defines the sites configuration.

Name                  Defines the layer’s name. It must match its name in Helix’s WinFaM (for
                      example, Level 0’s name is Sites, Level 1’s name is Country, and Level
                      2’s name is State).

Description Template  Defines the name of the template used to display the bubble window that
                      shows the details of an element of this layer on the map. It is taken from
                      the WinFaM Metadata.

Image                 Defines the name of the image file used to display this layer icon in the
                      Alarms Map’s Layers pane. It is taken from the WinFam images folder.

MinAlt & MaxAlt       Defines the map’s altitude range (minimum and maximum) in meters for
                      which this layer is displayed. We recommend that the MinAlt of each layer
                      be equal to the MaxAlt of the layer under it to make sure that exactly one
                      layer is displayed at any altitude.

BoundingNorth,        Defines the layer’s area as a rectangle by its latitude and longitude
BoundingSouth,        boundaries in decimal degrees.
BoundingWest, &
BoundingEast

IsEnabled             If it is false, this layer is not used in the display.

Categories            Defines the elements included in this layer, as described in the following
                      table.


The Categories and Category Entries
The Categories entry defines the elements included in the layer. An element in the layer is
defined as a category. A layer can include any number of category items. Usually, the Sites
layer includes several elements and all the other layers include only one (default) element.
Each category entry contains the following elements:

Name          Description

id            The category’s ID. It is relevant only if the layer includes more than one
              element.

IsDefault     If it is true, this is a default category and it is used to define any category
              that does not have a valid id or if no other category is defined.

Image         Defines the category’s image when it has no alarms. It is relevant in a
              non-default category.

DefaultImage  Defines the default Image. It is used for a category with no alarms that
              does not have an Image or its Image is not valid. It is relevant in a default
              category.

pair          Defines the mapping between the severity and the icon that represents it.

The pair Entry


Each pair contains the following parameters:

Name          Description

Value         One of the severities available for this category. It must be an available
              Helix severity.

Image         The image to be displayed when Value is the category’s severity.

Notes:

 The severity of a category in layer 1 is defined as the highest alarm severity it has.
 The severity of a category in any other layer is defined as the highest severity of the
elements included in it (of a lower level).


The homeview Entry


The homeview entry defines the default map settings to be displayed when the Go to home
location toolbar button is clicked or when the Alarm Map is displayed without any
sites/regions.
The homeview entry contains the following parameters:

Name          Description

Latitude      The home location map latitude.

Longitude     The home location map longitude.

Altitude      The home location map altitude.

Description   The home location map name or description. It is not displayed on the
              map. It is used to provide information about the homeview location to the
              user viewing the MapsConfig-project.xml contents.

MapsConfig-project.xml Structure Example


The following is an example of a typical MapsConfig-project.xml structure.

It contains the following layers (top-down):
 Layer 1’s name is Country. Its image is 003-gray.png, and its altitude range is
350,000-5,000,000 meters.
 Layer 2’s name is State. Its image is 004-gray.png, and its altitude range is
200,000-500,000 meters.
 Layer 3’s name is City. Its image is 005-gray.png, and its altitude range is
100,000-200,000 meters.
 Layer 4’s name is Region4. Its image is 006-gray.png, and its altitude range is
50,000-100,000 meters.
 Layer 5’s name is Region5. It is not enabled. Therefore, its image and altitude range
are not defined.
 Layer 0’s name is Sites. Its image is 001-gray.png, and its altitude range is
500-50,000 meters. It also has a Description Template (DefaultBubbleTemplate.xml),
which is also the DefaultDescriptionTemplate.
All the layers are between latitudes of 3-31 degrees and longitudes of 65-93 degrees.
Layer 0 contains 9 categories. Its category 1 is a default one and has a general default image
(001-gray.png). Its category 2 is not a default one and has a normal behavior image
(009-gray.png). Both categories have the severity pairs Critical, Major, Minor, and
Warning, with matching icon images.
When the Go to home location toolbar button is clicked or the Alarm Map is displayed
without a view area definition, the Latitude is 21 degrees, the Longitude is 78 degrees, and
the Altitude is 2,600,000 meters. This homeview location defines India.

Flooding Protection
Alarm flooding is a situation where an exceedingly large number of alarms is raised at a rate
higher than the FM Server can handle. When this happens, the flooding protection
mechanism ensures that the FM Server keeps processing alarms even though its
resources are busy. This is done by rejecting certain alarms while saving them in files.
The mechanism uses two configurable protection levels:
 Level 1—when crossing level 1 threshold, only alarms defined in FM Admin rules are
automatically rejected by FM Server and saved to files (by default all alarms with
priority <= 4).
Flooding reject rules are configurable and can be changed by the project.
 Level 2—upon continuous massive alarm flooding and when level 1 reject rules are not
enough, when level 2 threshold is crossed, all alarms are rejected.
When an alarm flooding situation is detected, an indication is sent to the TEOCO Monitor
application and a special alarm indicating the flood is sent to Cruiser. Orange borders are
added to the Cruiser grid to warn the NOC operators that certain alarms are being
rejected.
NOC users can download and open the rejected alarms files by right-clicking the flooding
alarm.
When the FM Server detects that the alarm flooding is over, it automatically stops rejecting
alarms, the flooding alarms are cleared (and saved to history), and the orange grid border is
removed.
The Flooding algorithm uses 3 configurable thresholds:
T1 [100,000] < T2 [200,000] < T3 [300,000]


These thresholds can be configured by the following FamProxy Flooding properties:


 T1—flood.minor.threshold
 T2— flood.major.threshold
 T3— flood.severe.threshold

Note: We recommend not changing default values of these threshold properties without
consulting TEOCO first.

These thresholds are defined in terms of the total number of events in all the queues.
We assume that the system hardware can process at least 1,000 events/sec.
Therefore, the default T1 of 100,000 queued events is equivalent to a delay of 100 seconds.
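
The relation between queue size and delay is simple division. The sketch below applies it to the three default thresholds; the function name is hypothetical, and the 1,000 events/sec rate is the assumption stated in the text.

```python
def queue_delay_seconds(queued_events: int, events_per_second: int = 1000) -> float:
    """Approximate processing delay represented by a queue of the given size."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return queued_events / events_per_second


# Default thresholds from this section, expressed as delays at 1,000 events/sec.
DEFAULT_THRESHOLDS = {"T1": 100_000, "T2": 200_000, "T3": 300_000}
DELAYS = {name: queue_delay_seconds(size) for name, size in DEFAULT_THRESHOLDS.items()}
```

At the assumed rate, T1, T2, and T3 correspond to delays of 100, 200, and 300 seconds. On hardware that processes events faster, the same thresholds represent proportionally shorter delays.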

Note: The files which contain the rejected alarms are saved in the database and the customer
should maintain the database and clear old files.


The Flooding Algorithm

The Flooding algorithm uses the following parameters:


 Queues—number of alarms in the queues
 Flooding—indicates if flooding situation was detected
 ThrowEvents—indicates if Up events collecting is stopped and thrown events are
written to files
The Flooding algorithm is as follows:
1. While Queues is less than T1 and Flooding is Off, nothing is done.
2. When Queues passes T1, accumulating statistics is started.
3. While Queues is between T1 and T2 and Flooding is Off, accumulating statistics
continues.
4. When Queues passes T2 and Flooding is Off:
a. Flooding becomes On. This indicates the flood state in the client.
b. Flood rules are activated:
 Events rejected by a flood rule are written to a file or files.
 A flood alarm with the file name is sent and presented in the FM client.
c. Toggle and repeated buffers are increased:
 If the buffer is set to 0 (zero), it is increased to 30.
 If not, it is increased to the flood configured values (X 30).
5. When Queues passes T3:
o ThrowEvents becomes On.
o Up events collecting is stopped and thrown events are written to files.
o A flood alarm with the file name is sent and presented in the FM client.
6. When Queues drops back below T3 and ThrowEvents is On:
a. ThrowEvents becomes Off.
b. Events collection is resumed.
7. When Queues drops back below T1 and Flooding is On:
a. Flooding becomes Off.
b. Flood indication is removed from the FM client.
c. Flood rules are deactivated.
d. Toggle and repeated buffer sizes are reset.
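
The threshold transitions of the algorithm can be sketched as a small state machine. This is an illustrative model only: the FloodState class is hypothetical, statistics collection and buffer resizing are left out, and steps 6 and 7 are read as the queue size dropping back below T3 and T1 respectively.

```python
from dataclasses import dataclass


@dataclass
class FloodState:
    t1: int = 100_000
    t2: int = 200_000
    t3: int = 300_000
    flooding: bool = False
    throw_events: bool = False

    def update(self, queues: int) -> None:
        """Apply the threshold transitions for the current total queue size."""
        if queues >= self.t2 and not self.flooding:
            self.flooding = True        # step 4: flood rules are activated
        if queues >= self.t3:
            self.throw_events = True    # step 5: Up events are thrown to files
        elif self.throw_events:
            self.throw_events = False   # step 6: events collection is resumed
        if queues < self.t1 and self.flooding:
            self.flooding = False       # step 7: flood rules are deactivated
```

Note the gap between the thresholds that turn Flooding on (T2) and off (T1): once a flood is declared, the flood rules stay active until the queues drain well below the level that triggered them, which prevents the state from flapping around a single threshold.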

FamEngine Flood System Properties

Note: Consult TEOCO before changing any of these properties.

Property Name                       Type     Default Value  Description

history.flood.OfflineQueueEnabled   boolean  true           Enables sending history data to offline
                                                            files when JMS is flooded.

history.flood.ThresholdBytes        int      500000000      The number of pending bytes in the JMS
                                                            queue that triggers the OfflineQueue.

flood.minor.threshold               int      100000         When FamEngine’s total queue size
                                                            surpasses this value, the flood state is set
                                                            to MINOR. The FamEngine itself is not
                                                            declared as flooded at this stage.

flood.major.threshold               int      200000         When FamEngine’s total queue size
                                                            surpasses this value, the flood state is set
                                                            to MAJOR and FamEngine is declared as
                                                            flooded.

flood.severe.threshold              int      300000         When FamEngine’s total queue size
                                                            surpasses this value, the flood state is set
                                                            to Severe. All necessary preventative
                                                            actions are taken at this stage.


Flooding of History Alarms


Just as a flood can be caused by an active alarm overload, a flood of history alarms can also
be caused when the system cannot handle the amount of history alarms and events it needs
to process and store in the database. This kind of flood is usually caused by:
1. History database performance issues.
2. Memory problems in the FamHistory module.
3. Communication problem between the FamEngine and FamHistory modules.
In case 1, the flood will probably appear as growing queues in the FamHistory module. The
mechanism that handles this kind of flood is the same as for active alarms and uses 3
thresholds (configured in the FamHistory module). When the queues grow and reach T2, the JMS
communication between the FamEngine and FamHistory modules is blocked to reduce the load on
the FamHistory module. Until the queues drop back to T1, history alarms and events
are aggregated in the FamEngine module.
In cases 2 and 3, when the FamHistory module does not consume history alarms and events
at a sufficient rate, FamEngine first aggregates them in the JMS queue. When a certain
threshold of aggregated pending data is reached (500MB by default, configured via the
FamEngine history.flood.ThresholdBytes system property), an indication is sent to the TEOCO
Monitor module, and history alarms and events are dumped to files under the system project
virtual folder configured by JCore’s vdir.project system property (that is,
$BASE_DIR/ttij2ee/project/metadata/vdir/FamEngine/flood/history_queue_storage/). When
the flood is over, the stored data is processed as usual, so no history data should be
lost (as opposed to an active alarms flood, in which alarms can be rejected).

Flooding in FamProxy
The FamProxy module implements a flood mechanism with 3 thresholds to prevent it from
crashing when it cannot handle the rate of active alarms received from
FamEngine. A flood in FamProxy is usually caused when it does not have enough
resources (usually memory), when the servers have too many event subscribers, or when
heavy subscribers prevent it from keeping up with the event rate.
When T2 is reached, FamProxy has 2 different strategies for handling it, according to
the flood.handler.mode system property setting:
 BLOCK_EVENTS—blocking FamEngine. The FamProxy JMS queue causes the
alarms to be aggregated in FamEngine until the queues size is less than T2. This
setting is suitable when FamProxy serves clients such as Cruiser and it ensures that
the events rate does not surpass the rate it can handle.
 DISCONNECT_SUBSCRIBERS—disconnecting subscribers that have filled the
queues and reconnecting them. This setting is suitable when FamProxy servers have
heavy subscribers such as the ES or ServiceImpact external modules.
When T2 is reached, a flood indication is also sent to the TEOCO Monitor module and when
T3 is reached, FamProxy restarts itself to cause all subscribers and queues to be reset.
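
The choice of anti-flood action in FamProxy can be sketched as follows. The function and the returned labels are hypothetical and only mirror the description above; the default values correspond to the T2 and T3 thresholds.

```python
def famproxy_flood_action(queue_size: int, mode: str,
                          t2: int = 200_000, t3: int = 300_000) -> str:
    """Return a label for the anti-flood action taken at the given queue size."""
    if mode not in ("BLOCK_EVENTS", "DISCONNECT_SUBSCRIBERS"):
        raise ValueError(f"unknown flood.handler.mode: {mode}")
    if queue_size >= t3:
        return "RESTART"   # T3: FamProxy restarts, resetting subscribers and queues
    if queue_size >= t2:
        return mode        # T2: act according to flood.handler.mode
    return "NONE"          # below T2 no anti-flood action is taken
```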


Setting Flooding Properties


The following Flooding properties should be set in the FamProxy file according to the project’s
needs:

Property Name                       Type     Default Value  Description

flood.handler.enabled               boolean  false          Indicates if FamProxy handles alarm
                                                            floods (either by blocking new events or
                                                            by disconnecting queued subscribers, as
                                                            determined by the flood.handler.mode
                                                            property). When this property is set to
                                                            false, alarm floods are not handled by
                                                            FamProxy, which can eventually lead to
                                                            FamProxy EAR performance degradation
                                                            and/or memory crash.

flood.handler.mode                  string   BLOCK_EVENTS   When flood handling is enabled, this
                                                            property indicates how a flood is handled
                                                            when a critical threshold is reached.
                                                            The available options are:
                                                             BLOCK_EVENTS: blocks alarm
                                                            events flowing from FamEngine. This
                                                            in turn can cause queues to
                                                            accumulate in FamEngine, which may
                                                            eventually lead to a FamEngine flood.
                                                             DISCONNECT_SUBSCRIBERS: explicitly
                                                            disconnects subscribers with large
                                                            queues or all subscribers that
                                                            correspond to FamProxy restart
                                                            procedure.


The following Flooding properties are also available but we recommend not changing their
default values. If you must change them, consult R&D first.

Note: Consult TEOCO before changing any of these properties.

Property Name                       Type     Default Value  Description

flood.minor.threshold               int      100000         T1: flood threshold indicating a minor
                                                            delay in alarm events processing. No
                                                            anti-flood action is taken at this stage.

flood.major.threshold               int      200000         T2: flood threshold indicating a major
                                                            delay in alarm events processing.
                                                            FamProxy flood indication is turned on
                                                            and FamProxy takes anti-flood actions
                                                            as indicated by the flood.handler.mode
                                                            system property.

flood.severe.threshold              int      300000         T3: flood threshold indicating a major
                                                            delay in alarm events processing and a
                                                            potential process stability risk. At this
                                                            stage, FamProxy is restarted.

Client Protection from Large Amount of Alarms


A problematic situation may arise not only from a high event rate, but
also from a large number of active alarms in the system.
Usually, the FM server has no problem handling such a situation as long as enough resources
are allocated.
However, Cruiser is limited in the number of alarms it can display.
If a specific user has more alarms than this limit, Cruiser fetches only the last 200,000
alarms on startup (WinFam "winfam.alarms.filtering.threshold1" property), sorted according to
Repeated Time. The repeated time of the last alarm is remembered as a threshold, and from then
on only alarms with a repeated time more recent than the threshold are displayed in Cruiser,
until the next synchronization with the server.
However, if the number of alarms displayed in Cruiser still increases despite the threshold
(for example, when new alarms arrive faster than alarms are cleared) and crosses 220,000
alarms ("winfam.alarms.filtering.threshold2" property), Cruiser forces a new synchronization
with the server, reducing the number of alarms to 200,000 again.
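
The two-threshold behavior can be sketched as follows. The synchronize and needs_resync functions are hypothetical, and repeated times are represented as plain numbers; only the two default values come from the WinFam properties named above.

```python
from typing import List

THRESHOLD1 = 200_000  # winfam.alarms.filtering.threshold1
THRESHOLD2 = 220_000  # winfam.alarms.filtering.threshold2


def synchronize(repeated_times: List[float], limit: int = THRESHOLD1):
    """Keep the `limit` most recent alarms; return them and the cut-off time."""
    kept = sorted(repeated_times, reverse=True)[:limit]
    # A cut-off is remembered only when some alarms were actually left out.
    cutoff = kept[-1] if kept and len(repeated_times) > limit else None
    return kept, cutoff


def needs_resync(displayed_count: int, resync_at: int = THRESHOLD2) -> bool:
    """True when the number of displayed alarms crosses the second threshold."""
    return displayed_count > resync_at
```

Keeping the second threshold above the first leaves room for new alarms to arrive between synchronizations, so a forced synchronization does not have to be triggered by every new alarm.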
It is also possible to specify additional filters on the alarms fetched from the server through
Cruiser UI.


FM Screener
FM Screener is a licensed feature.
The purpose of this engine is to generate two lists of the Logic IDs:
 black list—Logic IDs classified as SPAM, that most certainly can be ignored by the
operator
 white list—Logic IDs classified as Premium, that most certainly should be handled by
the operator.
When a new alarm is raised with an ID appearing in one of the lists, it will be automatically
classified as SPAM or Premium accordingly. All other alarms will be “standard” as before the
feature was introduced.
The SPAM/Premium status can be used in other Cruiser filters as well.
The engine takes the decision based on the following two inputs: user actions and historic
actions.

User Actions
The following actions will implicitly classify an alarm as Premium:
 Worklog (can be changed by “anti.spam.perform.non.spam.action.on.worklog”
property)
 Trouble ticket (can be changed by “anti.spam.perform.non.spam.action.on.tt”)
 Parent/child relations (can be changed by
“anti.spam.perform.non.spam.action.on.correlation”)
 Explicit “mark as non-spam” command
The following action will remove the SPAM indication and make an alarm regular:
 Acknowledge (can be changed by “anti.spam.remove.spam.indication.on.ack”)
The following action will explicitly classify an alarm as SPAM:
 Explicit “mark as spam” command. However, if the alarm ID is already classified as
Premium, special user privilege is required.

Note: The above actions not only change the state of the current alarm instance, but also add
its ID to SPAM/Premium lists and therefore affect instances that follow.

When the number of Screener alarms in the screener DB table reaches 2M (FamEngine
"anti.spam.table.space.threshold" property), this feature is
automatically disabled, and a notification is sent to the log and presented in FM Admin.
 The feature can later be manually enabled.
 When the feature is automatically disabled:
o An error message is presented when FM Admin is opened.
 Screener actions should be disabled in Cruiser.
 Screener master mode in FM Admin should be disabled.


Historic Investigations
This feature enables investigation of historic actions that were done or not done to the alarms.
There are two types of queries:
 Queries generating SPAM list
 Queries generating Premium list
By default, queries run once a day at 1:00 AM (the hour is set by
"anti.spam.system.query.daily.execution.hour") on every day of the week (the days are set by
“anti.spam.system.query.execution.days”). A new run of the query overrides all IDs
found in the previous run.
An administrator can enable or disable specific queries through the FamAdmin GUI. Disabling a
query removes all IDs found by it from the black/white lists.
There are two levels of priorities on the decisions above:
1. User actions have priority over historic queries.
Therefore if, for example, a historic query classifies the ID as Premium, but a user
executed a ‘mark as spam’ command, the ID will be classified as SPAM.
2. Premium classification has priority over SPAM classification.
Therefore, if, for example, one query classifies the ID as SPAM and another as
Premium, the ID will be classified as Premium.
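The two priority rules above can be pictured as a small decision function. The following is a minimal sketch, not the product's implementation; the function name and the string constants are illustrative assumptions.

```python
from typing import Iterable, Optional

# Illustrative labels; the product's internal representation is not documented here.
SPAM = "SPAM"
PREMIUM = "Premium"


def resolve_classification(
    user_action: Optional[str], query_results: Iterable[str]
) -> Optional[str]:
    """Return the final classification of an alarm ID, or None for a regular alarm."""
    # Rule 1: an explicit user action overrides anything found by historic queries.
    if user_action is not None:
        return user_action
    results = set(query_results)
    # Rule 2: among historic queries, Premium has priority over SPAM.
    if PREMIUM in results:
        return PREMIUM
    if SPAM in results:
        return SPAM
    return None


# A query says Premium, but the user marked the ID as spam: the user action wins.
print(resolve_classification(SPAM, [PREMIUM]))
# Two queries disagree and there is no user action: Premium wins.
print(resolve_classification(None, [SPAM, PREMIUM]))
```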
An administrator can view the generated SPAM and Premium lists in FM Admin and modify
the SPAM indication (swap SPAM/Premium or make the alarm “regular”).
The decisions that caused an ID to be added to the SPAM/Premium lists are also available in FM
Admin.

Severity Management
FM Admin GUI enables you to specify the conversion of alarm priority (1-9) to severity
(Critical/Major/Minor/Warning).
It is also possible to configure the background and text color for every degree of severity. It
will affect the display of the alarms in the monitoring clients.
The system is supplied with 8 standard severity icons. There are four severity categories
(critical, major, minor, and warning), with two sizes for each (16*16 and 24*24). By default,
they are stored in the Project vdir directory for the JFAM product.
If you change the default severity colors that are supplied with the product, you can also
change the icons that will be displayed by loading your own customized icons.
For more information on working with J2EE, refer to the Helix Administration Guide.

Worklog Management
The FM Admin GUI enables you to specify the worklog types and templates that are
available to a user creating a worklog.


Project Fields Configuration


Activating Project Fields
There is a predefined set of fields intended for specific project usage. These fields can be
populated by a project specific Mediation library, Enrichment rules, or project specific logic.
See Appendix C: Project Active Alarm Attributes.
By default, project fields are not part of the alarm.

To activate desired fields:


1. Uncomment the desired fields in ProjectActiveAlarm.xml
($BASE_DIR/ttij2ee/project/metadata/JFam/classes), and change both ‘label’ and
“description” attributes to appropriate meaningful texts.
If the field is populated by the mediation library, the “logicalName” attribute should be
identical to the RC attribute name for the mapping to be automatic.
It is forbidden to change any other parameter in the fields.
2. Uncomment the required fields in ProjectHistoryAlarm.xml
($BASE_DIR/ttij2ee/project/metadata/JFam/classes/history) in the “DbMapping”
section.
3. Add the fields to the relevant projections by editing ProjectActiveAlarm.xml. Refer to
the Helix Administration Guide for the exact syntax:
o If the fields will be populated by the Mediation library, add them to the
“MediationMappableAttributes” projection.
 To use a field in a condition for Repeated or Toggle rules, add it to the
“ToggleRepeatedRejectRulesFilterableAttributes” projection.
 To specify a field in Repeated or Toggle rules, add it to the
“ToggleRepeatedRulesModifiableAttributes” projection.
 To change the value of a field through Enrichment rules, add it to the
“EnrichmentRulesModifiableAttributes” projection.
o To prevent users from changing the value of a field when editing an alarm, add it
to the “NonUpdatableAttributes” projection.
4. Restart the J2EE system.
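Before restarting, it can help to confirm that every field added to a projection was actually uncommented. The following is a minimal sketch of such a check, not a tool supplied with the product. The function name and the trimmed sample are illustrative assumptions; the element and attribute names follow the ProjectActiveAlarm.xml example in this section.

```python
import xml.etree.ElementTree as ET
from typing import Set

# Trimmed, hypothetical sample in the shape of ProjectActiveAlarm.xml.
SAMPLE = """
<Classes>
  <Class name="JFam:ProjectActiveAlarm" superClass="JFam:ActiveAlarm">
    <!-- <Attribute name="Proj_Varchar_255_2" type="string"/> -->
    <Attribute name="Proj_Varchar_255_1" type="string" size="255"/>
    <ProjectionRef referencedName="MediationMappableAttributes"
                   name="MediationMappableAttributesEx">
      <AddAttributeRef>Proj_Varchar_255_1</AddAttributeRef>
      <AddAttributeRef>Proj_Varchar_255_2</AddAttributeRef>
    </ProjectionRef>
  </Class>
</Classes>
"""


def undefined_projection_fields(xml_text: str) -> Set[str]:
    """Return fields referenced by a projection but not defined as active attributes."""
    # The default parser discards XML comments, so commented-out
    # attributes are not counted as defined.
    root = ET.fromstring(xml_text)
    missing: Set[str] = set()
    for cls in root.iter("Class"):
        active = {attr.get("name") for attr in cls.findall("Attribute")}
        for ref in cls.iter("AddAttributeRef"):
            name = (ref.text or "").strip()
            if name not in active:
                missing.add(name)
    return missing


print(sorted(undefined_projection_fields(SAMPLE)))
```

The check only sees attributes declared in the project file, so a reference to an attribute inherited from the superclass would also be reported; treat the result as a hint for project fields only.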

Example
The following example activates the project field “Proj_Varchar_255_1”, populates it through
the Mediation library, and makes it available for Toggle/Repeated rules filtering.
ProjectActiveAlarm.xml:
<Classes>
  <Class name="JFam:ProjectActiveAlarm" superClass="JFam:ActiveAlarm">
    <!--
    <Attribute name="Proj_Varchar_1024_1" type="string" logicalName="Proj_Varchar_1024_1"
               label="Proj_Varchar_1024_1" size="1024"/>
    <Attribute name="Proj_Varchar_512_1" type="string" logicalName="Proj_Varchar_512_1"
               label="Proj_Varchar_512_1" size="512"/>
    ……
    -->
    <Attribute name="Proj_Varchar_255_1" type="string" logicalName="Proj_Varchar_255_1"
               label="Project field" description="Very useful project field" size="255"/>
    <!--
    <Attribute name="Proj_Varchar_255_2" type="string" logicalName="Proj_Varchar_255_2" …
    <Attribute name="Proj_Varchar_255_3" type="string" logicalName="Proj_Varchar_255_3" …
    …….
    <Attribute name="Proj_Int_15" type="int" logicalName="Proj_Int_15" label="Proj_Int_15"/>
    -->
    <ProjectionRef referencedName="MediationMappableAttributes"
                   name="MediationMappableAttributesEx">
      <AddAttributeRef>Proj_Varchar_255_1</AddAttributeRef>
    </ProjectionRef>
    <ProjectionRef referencedName="ToggleRepeatedRejectRulesFilterableAttributes"
                   name="ToggleRepeatedRejectRulesFilterableAttributesEx">
      <AddAttributeRef>Proj_Varchar_255_1</AddAttributeRef>
    </ProjectionRef>
  </Class>
</Classes>

Configuring the Display Name of Alarm Fields


This option enables you to change the presentation of Active and History alarm fields in table
column names, and to control labels and tooltips in relevant FM windows.
A list of Active and History alarms attributes appears in Appendixes A and B.
To change the label and/or tooltip of an active alarm, you need to edit the
ProjectActiveAlarm.xml file, located in project metadata of the JFam module. The change
affects all labels in alarm monitoring clients including folder criteria, Alarm Up, Alarm Info, and
so on.
To change the label and/or tooltip of a history alarm, you need to edit ProjectHistoryAlarm.xml
file located in project metadata of the JFam module. The change affects all labels in the
History Analysis application.


To configure project labels:


 In the relevant project file, create a Refinements section as shown below, and add the
attributes you would like to change as follows:
o name—the ID of the attribute as it appears in the appendix
o label—the desired label of the attribute
o description—the desired tooltip of the attribute
Example:

Note: After the change, you will need to restart all FM related EARs.

Making Alarm Fields Visible


Some rarely used alarm fields are not visible by default. If you need to use them, edit the
ProjectActiveAlarm.xml file (located in the project metadata of the JFam module) as
follows:
<Classes>
<Class name="JFam:ProjectActiveAlarm" superClass="JFam:ActiveAlarm">
<Refinements>
<AttributeRefinement name="ModuleType">
<DisplayComponent visible="true"/>
</AttributeRefinement>
</Refinements>
</Class>
</Classes>

Note: After the change you will need to restart all FM related EARs.

Configuring “Copy the Alarm Fields as Text”


The “Copy the Alarm Fields as Text” feature enables copying alarm fields to the clipboard in a
two-column format. The first column displays the field names and the second column displays
the field values.
To change the default order of the fields, specify the required order in the “CopyAsText”
projection of ProjectActiveAlarm.xml that can be found in the JFam project metadata.


Summary View Configuration


Project Summary View Icons Configuration
The Summary View provides the NOC operator a summary visualization of the network
status. Using the Summary View master mode, the summary visualization can be done for
any folder, and thus can be adjusted per each use case.
The summary criteria are configurable and can be based, for example, on network element
instances, types, vendors, geographic location, or any other alarm field. Each summary item is
colored according to the highest-severity alarm within the summary group.
The Summary View information can be displayed as both gallery (icons) and list (grid) view.
There are predefined icons in the gallery view that can be configured as described below.

To change the predefined Summary View icons:


1. In the server, go to the path:
$BASE_DIR/j2ee/project/metadata/WinFam/dotNetBundle/
If it does not exist yet, create this folder.
2. In this folder, create/add a file named IconResources.xaml.
3. Base the content of this file on the IconResources.xaml File Example, as follows:
a. Make sure it includes definitions for the following 6 key resources:
 genericPathData
 eqpPathData
 eqptypePathData
 servicePathData
 sitePathData
 NotDefinedPathData
b. The project can override the content of these icons with the desired Geometry
(path). The geometry can be provided by a graphic designer.
c. The project can also add new definitions for other fields. For example, to add a
special icon for the field "Vendor", use the following key convention:
Key="VendorPathData"
That is, <field name>PathData, where field name is the attribute name as defined in
JFam.


It should look like this example:


<sys:String x:Key="VendorPathData" >F1 M 28.000,9.000 L
23.000,14.000 L 23.000,17.000 L 28.000,12.000 L 28.000,9.000 Z M
22.000,14.000 L 0.000,14.000 L 0.000,17.000 L 22.000,17.000 L
22.000,14.000 Z M 4.000,0.000 L 0.000,4.000 L 22.000,4.000 L
26.000,0.000 L 4.000,0.000 Z M 1.000,8.000 L 19.000,8.000 L
19.000,7.000 L 1.000,7.000 L 1.000,8.000 Z M 20.000,8.000 L
21.000,8.000 L 21.000,7.000 L 20.000,7.000 L 20.000,8.000 Z M
1.000,10.000 L 21.000,10.000 L 21.000,9.000 L 1.000,9.000 L
1.000,10.000 Z M 0.000,5.000 L 22.000,5.000 L 22.000,12.000 L
0.000,12.000 L 0.000,5.000 Z M 28.000,4.000 L 28.000,7.000 L
23.000,12.000 L 23.000,9.000 L 28.000,4.000 Z M 28.000,0.000 L
28.000,3.000 L 23.000,8.000 L 23.000,5.000 L 28.000,0.000 Z M
1.000,22.000 L 2.000,22.000 L 2.000,21.000 L 1.000,21.000 L
1.000,22.000 Z M 3.000,22.000 L 21.000,22.000 L 21.000,21.000 L
3.000,21.000 L 3.000,22.000 Z M 1.000,24.000 L 21.000,24.000 L
21.000,23.000 L 1.000,23.000 L 1.000,24.000 Z M 0.000,19.000 L
22.000,19.000 L 22.000,26.000 L 0.000,26.000 L 0.000,19.000 Z M
28.000,14.000 L 28.000,21.000 L 23.000,26.000 L 23.000,19.000 L
28.000,14.000 Z</sys:String>
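The key convention and the six required keys can be verified with a short script before the file is deployed. The following is a minimal sketch, assuming the file is a well-formed ResourceDictionary in the shape of the example in this section; the helper names and the shortened geometry strings are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List

XAML_NS = "http://schemas.microsoft.com/winfx/2006/xaml"

# The six key resources this guide requires in IconResources.xaml.
REQUIRED_KEYS = [
    "genericPathData",
    "eqpPathData",
    "eqptypePathData",
    "servicePathData",
    "sitePathData",
    "NotDefinedPathData",
]

# Hypothetical, shortened sample; real geometry strings are much longer.
SAMPLE = """
<ResourceDictionary xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
                    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
                    xmlns:sys="clr-namespace:System;assembly=mscorlib">
  <sys:String x:Key="genericPathData">F1 M 0,0 L 1,1 Z</sys:String>
  <sys:String x:Key="VendorPathData">F1 M 0,0 L 1,1 Z</sys:String>
</ResourceDictionary>
"""


def resource_key(field_name: str) -> str:
    """Build the resource key for an alarm field: <field name>PathData."""
    return f"{field_name}PathData"


def missing_icon_keys(xaml_text: str) -> List[str]:
    """List the required keys that the resource dictionary does not define."""
    root = ET.fromstring(xaml_text)
    # x:Key is a namespaced attribute, so it is looked up by its qualified name.
    present = {child.get(f"{{{XAML_NS}}}Key") for child in root}
    return [key for key in REQUIRED_KEYS if key not in present]


print(resource_key("Vendor"))
print(missing_icon_keys(SAMPLE))
```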

IconResources.xaml File Example


<ResourceDictionary xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
xmlns:sys="clr-namespace:System;assembly=mscorlib"
>
<sys:String x:Key="genericPathData">F1 M 15.000,28.000 L 28.000,23.000 L 28.000,5.000 L 15.000,10.000 L
15.000,28.000 Z M 13.000,10.000 L 0.000,5.000 L 0.000,23.000 L 13.000,28.000 L 13.000,10.000 Z M 14.000,8.000
L 0.000,3.000 L 14.000,0.000 L 28.000,3.000 L 14.000,8.000 Z</sys:String>
<sys:String x:Key="eqpPathData" >F1 M 22.962,6.339 C 23.598,5.708 24.000,4.843 24.000,3.878 C 24.000,2.913
23.598,2.047 22.962,1.417 L 22.251,2.127 C 22.705,2.578 23.000,3.187 23.000,3.878 C 23.000,4.569 22.705,5.178
22.251,5.629 L 22.962,6.339 Z M 24.378,7.756 C 25.375,6.761 26.000,5.395 26.000,3.878 C 26.000,2.361
25.375,0.995 24.378,0.000 L 23.671,0.707 C 24.489,1.517 25.000,2.634 25.000,3.878 C 25.000,5.122 24.489,6.239
23.671,7.049 L 24.378,7.756 Z M 18.749,5.629 C 18.295,5.178 18.000,4.569 18.000,3.878 C 18.000,3.187
18.295,2.578 18.749,2.127 L 18.038,1.417 C 17.402,2.047 17.000,2.913 17.000,3.878 C 17.000,4.843 17.402,5.708
18.038,6.339 L 18.749,5.629 Z M 17.329,7.049 C 16.511,6.239 16.000,5.122 16.000,3.878 C 16.000,2.634
16.511,1.517 17.329,0.707 L 16.622,0.000 C 15.625,0.995 15.000,2.361 15.000,3.878 C 15.000,5.395 15.625,6.761
16.622,7.756 L 17.329,7.049 Z M 20.500,2.378 C 19.672,2.378 19.000,3.049 19.000,3.878 C 19.000,4.707
19.672,5.378 20.500,5.378 C 21.328,5.378 22.000,4.707 22.000,3.878 C 22.000,3.049 21.328,2.378 20.500,2.378 Z
M 21.000,6.378 L 21.000,12.378 L 20.000,12.378 L 20.000,6.378 M 4.000,13.378 L 0.000,17.378 L 22.000,17.378 L
26.000,13.378 L 4.000,13.378 Z M 1.000,21.378 L 2.000,21.378 L 2.000,20.378 L 1.000,20.378 L 1.000,21.378 Z M
3.000,21.378 L 21.000,21.378 L 21.000,20.378 L 3.000,20.378 L 3.000,21.378 Z M 1.000,23.378 L 21.000,23.378 L
21.000,22.378 L 1.000,22.378 L 1.000,23.378 Z M 0.000,18.378 L 22.000,18.378 L 22.000,25.378 L 0.000,25.378 L
0.000,18.378 Z M 28.000,13.378 L 28.000,20.378 L 23.000,25.378 L 23.000,18.378 L 28.000,13.378 Z</sys:String>
<sys:String x:Key="eqptypePathData" >F1 M 28.000,9.000 L 23.000,14.000 L 23.000,17.000 L 28.000,12.000 L
28.000,9.000 Z M 22.000,14.000 L 0.000,14.000 L 0.000,17.000 L 22.000,17.000 L 22.000,14.000 Z M 4.000,0.000 L
0.000,4.000 L 22.000,4.000 L 26.000,0.000 L 4.000,0.000 Z M 1.000,8.000 L 19.000,8.000 L 19.000,7.000 L
1.000,7.000 L 1.000,8.000 Z M 20.000,8.000 L 21.000,8.000 L 21.000,7.000 L 20.000,7.000 L 20.000,8.000 Z M
1.000,10.000 L 21.000,10.000 L 21.000,9.000 L 1.000,9.000 L 1.000,10.000 Z M 0.000,5.000 L 22.000,5.000 L
22.000,12.000 L 0.000,12.000 L 0.000,5.000 Z M 28.000,4.000 L 28.000,7.000 L 23.000,12.000 L 23.000,9.000 L
28.000,4.000 Z M 28.000,0.000 L 28.000,3.000 L 23.000,8.000 L 23.000,5.000 L 28.000,0.000 Z M 1.000,22.000 L
2.000,22.000 L 2.000,21.000 L 1.000,21.000 L 1.000,22.000 Z M 3.000,22.000 L 21.000,22.000 L 21.000,21.000 L
3.000,21.000 L 3.000,22.000 Z M 1.000,24.000 L 21.000,24.000 L 21.000,23.000 L 1.000,23.000 L 1.000,24.000 Z M
0.000,19.000 L 22.000,19.000 L 22.000,26.000 L 0.000,26.000 L 0.000,19.000 Z M 28.000,14.000 L 28.000,21.000 L
23.000,26.000 L 23.000,19.000 L 28.000,14.000 Z</sys:String>

<sys:String x:Key="servicePathData" >F1 M 16.000,13.731 C 7.163,13.731 0.000,14.747 0.000,16.000 C
0.000,17.253 7.163,18.269 16.000,18.269 C 24.837,18.269 32.000,17.253 32.000,16.000 C 32.000,14.747
24.837,13.731 16.000,13.731 L 16.000,13.731 Z M 16.000,14.731 C 23.566,14.731 28.335,15.427 30.246,16.000 C
28.335,16.573 23.566,17.269 16.000,17.269 C 8.434,17.269 3.665,16.573 1.754,16.000 C 3.665,15.427
8.434,14.731 16.000,14.731 M 16.000,8.146 C 7.163,8.146 0.000,11.662 0.000,16.000 C 0.000,20.337 7.163,23.854
16.000,23.854 C 24.837,23.854 32.000,20.337 32.000,16.000 C 32.000,11.662 24.837,8.146 16.000,8.146 L
16.000,8.146 Z M 16.000,9.146 C 24.131,9.146 31.000,12.285 31.000,16.000 C 31.000,19.715 24.131,22.854
16.000,22.854 C 7.869,22.854 1.000,19.715 1.000,16.000 C 1.000,12.285 7.869,9.146 16.000,9.146 M 16.000,3.371
C 7.163,3.371 0.000,9.025 0.000,16.000 C 0.000,22.976 7.163,28.629 16.000,28.629 C 24.837,28.629
32.000,22.976 32.000,16.000 C 32.000,9.025 24.837,3.371 16.000,3.371 L 16.000,3.371 Z M 16.000,4.371 C
24.271,4.371 31.000,9.588 31.000,16.000 C 31.000,22.412 24.271,27.629 16.000,27.629 C 7.729,27.629
1.000,22.412 1.000,16.000 C 1.000,9.588 7.729,4.371 16.000,4.371 M 16.000,0.000 C 14.747,0.000 13.731,7.164
13.731,16.000 C 13.731,24.836 14.747,32.000 16.000,32.000 C 17.253,32.000 18.269,24.836 18.269,16.000 C
18.269,7.164 17.253,0.000 16.000,0.000 L 16.000,0.000 Z M 16.000,1.754 C 16.573,3.665 17.269,8.434
17.269,16.000 C 17.269,23.566 16.573,28.335 16.000,30.246 C 15.427,28.335 14.731,23.566 14.731,16.000 C
14.731,8.434 15.427,3.665 16.000,1.754 M 16.000,0.000 C 11.662,0.000 8.146,7.164 8.146,16.000 C 8.146,24.836
11.662,32.000 16.000,32.000 C 20.337,32.000 23.854,24.836 23.854,16.000 C 23.854,7.164 20.337,0.000
16.000,0.000 L 16.000,0.000 Z M 16.000,1.000 C 19.716,1.000 22.854,7.869 22.854,16.000 C 22.854,24.131
19.716,31.000 16.000,31.000 C 12.285,31.000 9.146,24.131 9.146,16.000 C 9.146,7.869 12.285,1.000 16.000,1.000
M 16.000,0.000 C 9.025,0.000 3.371,7.164 3.371,16.000 C 3.371,24.836 9.025,32.000 16.000,32.000 C
22.975,32.000 28.628,24.836 28.628,16.000 C 28.628,7.164 22.975,0.000 16.000,0.000 L 16.000,0.000 Z M
16.000,1.000 C 22.412,1.000 27.628,7.729 27.628,16.000 C 27.628,24.271 22.412,31.000 16.000,31.000 C
9.588,31.000 4.371,24.271 4.371,16.000 C 4.371,7.729 9.588,1.000 16.000,1.000 M 16.000,0.000 C 7.163,0.000
0.000,7.164 0.000,16.000 C 0.000,24.836 7.163,32.000 16.000,32.000 C 24.837,32.000 32.000,24.836
32.000,16.000 C 32.000,7.164 24.837,0.000 16.000,0.000 L 16.000,0.000 Z M 16.000,1.000 C 24.271,1.000
31.000,7.729 31.000,16.000 C 31.000,24.271 24.271,31.000 16.000,31.000 C 7.729,31.000 1.000,24.271
1.000,16.000 C 1.000,7.729 7.729,1.000 16.000,1.000</sys:String>

<sys:String x:Key="sitePathData" >F1 M 20.000,17.000 L 19.000,18.000 L 19.000,14.000 L 20.000,13.000 L


20.000,17.000 Z M 18.000,19.000 L 17.000,20.000 L 17.000,16.000 L 18.000,15.000 L 18.000,19.000 Z M
18.000,14.000 L 17.000,15.000 L 17.000,11.000 L 18.000,10.000 L 18.000,14.000 Z M 20.000,12.000 L
19.000,13.000 L 19.000,9.000 L 20.000,8.000 L 20.000,12.000 Z M 20.000,7.000 L 19.000,8.000 L 19.000,4.000 L
20.000,3.000 L 20.000,7.000 Z M 18.000,9.000 L 17.000,10.000 L 17.000,6.000 L 18.000,5.000 L 18.000,9.000 Z M
16.000,5.000 L 16.000,28.000 L 21.000,23.000 L 21.000,0.000 L 16.000,5.000 Z M 15.000,4.000 L 0.000,4.000 L
4.000,0.000 L 19.000,0.000 L 15.000,4.000 Z M 5.000,21.000 L 2.000,21.000 L 2.000,17.000 L 5.000,17.000 L
5.000,21.000 Z M 13.000,21.000 L 10.000,21.000 L 10.000,17.000 L 13.000,17.000 L 13.000,21.000 Z M
5.000,16.000 L 2.000,16.000 L 2.000,12.000 L 5.000,12.000 L 5.000,16.000 Z M 13.000,16.000 L 10.000,16.000 L
10.000,12.000 L 13.000,12.000 L 13.000,16.000 Z M 9.000,21.000 L 6.000,21.000 L 6.000,17.000 L 9.000,17.000 L
9.000,21.000 Z M 9.000,16.000 L 6.000,16.000 L 6.000,12.000 L 9.000,12.000 L 9.000,16.000 Z M 9.000,11.000 L
6.000,11.000 L 6.000,7.000 L 9.000,7.000 L 9.000,11.000 Z M 13.000,11.000 L 10.000,11.000 L 10.000,7.000 L
13.000,7.000 L 13.000,11.000 Z M 5.000,11.000 L 2.000,11.000 L 2.000,7.000 L 5.000,7.000 L 5.000,11.000 Z M
0.000,5.000 L 0.000,28.000 L 6.000,28.000 L 6.000,23.000 L 9.000,23.000 L 9.000,28.000 L 15.000,28.000 L
15.000,5.000 L 0.000,5.000 Z</sys:String>
<sys:String x:Key="NotDefinedPathData" >F1 M 19.397,12.810 C 19.089,13.249 18.497,13.811 17.620,14.494 L
16.756,15.166 C 16.285,15.531 15.973,15.959 15.818,16.447 C 15.721,16.757 15.668,17.236 15.660,17.888 L
12.353,17.888 C 12.400,16.513 12.530,15.562 12.741,15.037 C 12.952,14.513 13.496,13.908 14.372,13.225 L
15.261,12.529 C 15.553,12.310 15.788,12.069 15.967,11.809 C 16.291,11.361 16.454,10.869 16.454,10.332 C
16.454,9.714 16.273,9.149 15.912,8.641 C 15.551,8.133 14.891,7.878 13.933,7.878 C 12.991,7.878 12.323,8.191
11.930,8.818 C 11.536,9.445 11.339,10.096 11.339,10.771 L 7.812,10.771 C 7.909,8.452 8.719,6.809 10.240,5.840
C 11.200,5.222 12.381,4.912 13.780,4.912 C 15.619,4.912 17.147,5.352 18.364,6.230 C 19.580,7.109 20.188,8.411
20.188,10.137 C 20.188,11.194 19.925,12.086 19.397,12.810 Z M 15.917,23.088 L 12.267,23.088 L 12.267,19.561
L 15.917,19.561 L 15.917,23.088 Z M 14.000,0.000 C 6.268,0.000 0.000,6.268 0.000,14.000 C 0.000,21.732
6.268,28.000 14.000,28.000 C 21.732,28.000 28.000,21.732 28.000,14.000 C 28.000,6.268 21.732,0.000
14.000,0.000 Z</sys:String>
</ResourceDictionary>

FaultPro Configuration
FaultPro can be configured to connect to a specific NE using multiple protocols. It can also be
configured with more than one set of credentials, each with different connection parameters,
for each protocol. Each of these sets is defined as an Access in Communication Admin (see the
Communication Admin User Guide).
To configure the required option, set the NeWorkMode property in the JCore cfg of the
FaultProModule EAR. The available values are protocol (default) and access. If there is
more than one access for the selected protocol, FaultPro selects the first in the list.
The ShowNE Commands property determines whether available NCI commands are
displayed in FaultPro (in addition to the ones defined in the Association rules).

Site View Configuration

Note: Restart Cruiser to see the changes after they are made.

The Site View option provides the user a custom-based Site topology view for a selected site.
It provides a graphical display of all the objects that are associated with the chosen site,
based on the BC information and the relevant links between the objects and alarm information
for each object. It can be opened for a selected site or for the From Site of a selected alarm.
It is accessible from the alarm display, GEO Map display, and the Ribbon.

Icons Configuration
To change the Site View icon for a specific attribute:
1. In the SiteViewSetIconByFieldName property, set the name of the attribute by which
you want to determine the icons.
2. In the server, go to the path:
$BASE_DIR/j2ee/project/metadata/WinFam/dotNetBundle/
If it does not exist yet, create this folder.
3. In this folder, create/add a file named IconResources.xaml.
4. In this file, define the new node icon as in IconResources.xaml File Example. The
resource key should be <value>PathData, where value is the value of the attribute
selected in step 1 in the relevant node MD Class.
It should look like this example:
<sys:String x:Key="routerPathData" >F1 M 28.000,9.000 L 23.000,14.000
L 23.000,17.000 L 28.000,12.000 L 28.000,9.000 Z M 22.000,14.000 L
0.000,14.000 L 0.000,17.000 L 22.000,17.000 L 22.000,14.000 Z M
4.000,0.000 L 0.000,4.000 L 22.000,4.000 L 26.000,0.000 L 4.000,0.000
Z M 1.000,8.000 L 19.000,8.000 L 19.000,7.000 L 1.000,7.000 L
1.000,8.000 Z M 20.000,8.000 L 21.000,8.000 L 21.000,7.000 L
20.000,7.000 L 20.000,8.000 Z M 1.000,10.000 L 21.000,10.000 L
21.000,9.000 L 1.000,9.000 L 1.000,10.000 Z M 0.000,5.000 L
22.000,5.000 L 22.000,12.000 L 0.000,12.000 L 0.000,5.000 Z M
28.000,4.000 L 28.000,7.000 L 23.000,12.000 L 23.000,9.000 L
28.000,4.000 Z M 28.000,0.000 L 28.000,3.000 L 23.000,8.000 L
23.000,5.000 L 28.000,0.000 Z M 1.000,22.000 L 2.000,22.000 L
2.000,21.000 L 1.000,21.000 L 1.000,22.000 Z M 3.000,22.000 L
21.000,22.000 L 21.000,21.000 L 3.000,21.000 L 3.000,22.000 Z M
1.000,24.000 L 21.000,24.000 L 21.000,23.000 L 1.000,23.000 L
1.000,24.000 Z M 0.000,19.000 L 22.000,19.000 L 22.000,26.000 L
0.000,26.000 L 0.000,19.000 Z M 28.000,14.000 L 28.000,21.000 L
23.000,26.000 L 23.000,19.000 L 28.000,14.000 Z</sys:String>


Tooltip Configuration
To change Site View tooltip details:
1. Go to $BASE_DIR/ttij2ee/project/metadata/SchematicViewsServer/classes/.
2. Select the relevant file. For example, to change the tooltip of a node, go to
ProjectViewNode.xml.
3. Add a projection override called CruiserVisibleFields.
By default, the available Site View fields are:

Column Name     Description
Region1_Name    The first region level name (for example, country) in which the network element is located
Region2_Name    The second region level name (for example, state) in which the network element is located
Region3_Name    The third region level name (for example, city) in which the network element is located
Site            The site in which the network element is located
Vendor          The name of the network element vendor
IPAddress       The network element IP address
EqpType         The type of the network element
MainFunction    The functionality of the network element (the first keyword of category Function)

For more details about overriding, refer to the MD Class Refinement chapter in the
Helix Administration Guide.
4. Restart the EAR where SchematicViewsServer is deployed.

Additional Details Configuration


To change Site View Additional Details tab configuration:
1. Go to $BASE_DIR/ttij2ee/project/metadata/SchematicViewsServer/classes/.
2. Select the relevant file. For example, to change the additional details of a node, go to
ProjectViewNode.xml.
3. Add a projection override called CruiserAdditionalFields and indicate the required
fields.

Note: If this projection is not defined, all the following fields are presented.

By default, the available Site View fields are:

Column Name     Description
Region1_Name    The first region level name (for example, country) in which the network element is located
Region2_Name    The second region level name (for example, state) in which the network element is located
Region3_Name    The third region level name (for example, city) in which the network element is located
Site            The site in which the network element is located
Vendor          The name of the network element vendor
IPAddress       The network element IP address
EqpType         The type of the network element
MainFunction    The functionality of the network element (the first keyword of category Function)

For more details about overriding, refer to the MD Class Refinement chapter in the
Helix Administration Guide.
4. Restart the EAR where SchematicViewsServer is deployed.

Service Details Configuration


The Service Details Configuration is defined in the MD Class called
BCAPI:SOC.SOCServiceForObject in the
$BASE_DIR/ttij2ee/project/metadata/BCAPI/classes/SOC/ directory.

To control the fields in the details presentation of the selected service in the
Service Status tab:
1. For overriding the definitions, refer to the MD Class Refinement chapter in the Helix
Administration Guide.
2. Restart the EAR where BCAPI is deployed.
By default, the available service fields are:

Column Name          Description
Customer Names       The customers hiring this service (if any)
ID                   The service identifier
Service Name         The name of the service
Type ID              The service type identifier
Service Type         The service type name
Service Class Code   The service class code (1=Gold, 2=Silver, 3=Bronze)
Service Class        The service class (Gold, Silver, or Bronze)
Importance           Indicates whether the service is important. If the service importance in BC is High or Medium, the icon is displayed
Description          Free text describing the service
Internal Type        The service internal type. The available options are:
                     CFS: Customer Facing Service
                     RFS: Resource Facing Service (internal services in the model that do not receive service alarms)
Internal ID          The internal name of the service
External ID          The name of the service as given by the external customer

In the following example, the NEW_FIELD field is added to the new
CMM_DB.NEW_SERVICE_ENRICHMENT_VW view:
<?xml version="1.0" encoding="UTF-8" ?>
<Classes>
  <Class name="BCAPI:SOC.SOCServiceForObject"
         superClass="BCAPI:SOC.SOCServiceWithCustomers">
    <Attribute name="NewField" type="string" />
    <DBMapping mainTable="CMM_DB.NEW_SERVICE_ENRICHMENT_VW"
               vendor="Oracle" extendsMapping="false">
      <Table>
        <PrimaryKeys>
          <Column name="SERVICE_INSTANCE_ID" />
        </PrimaryKeys>
        <Attribute name="ObjectId" columnName="SERVICE_INSTANCE_ID" />
        <Attribute name="Name" columnName="SERVICE_INST_NAME" />
        <Attribute name="ServiceTypeId" columnName="SERVICE_TYPE_ID" />
        <Attribute name="ServiceTypeName" columnName="SERVICE_TYPE_NAME" />
        <Attribute name="ServiceClassCode" columnName="SERVICE_INSTANCE_CLASS" />
        <Attribute name="ServiceClassDescr" columnName="CODES_DESCRIPTION" />
        <Attribute name="ServiceImportance" columnName="SERVICE_INSTANCE_IMPORTANCE" />
        <Attribute name="ServiceDescription" columnName="SERVICE_INSTANCE_DESCRIPTION" />
        <Attribute name="ServiceInternalType" columnName="SERVICE_INTERNAL_TYPE" />
        <Attribute name="ServiceInternalId" columnName="SRVINST_INTERNAL_ID" />
        <Attribute name="ServiceExternalId" columnName="SERVICE_INSTANCE_CODE" />
        <Attribute name="ServiceCustomers" columnName="CONCAT_CUST_NAMES" />
        <Attribute name="NewField" columnName="NEW_FIELD" />
      </Table>
    </DBMapping>
  </Class>
</Classes>


KPI Presentation
This feature presents the status of the KPI selected for the entity as a dot, colored according
to the Coloring Thresholding rules. When hovering over the entity, it presents the name of the
KPI and its value. To enable this feature, you have to configure the mapping between the
entity and its default counter.
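The coloring can be pictured as a range lookup over rules shaped like the KPIRules.xml example later in this section. The following is a minimal sketch, not the client's actual logic: the function names are illustrative, and treating FromVal and ToVal as inclusive bounds is an assumption that the guide does not confirm.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Tuple

# KPI names are expected as [db schema].[counter set name].[column name].
KPI_NAME = re.compile(r"^[^.\s]+\.[^.\s]+\.[^.\s]+$")

# Shortened sample with a single rule, in the shape of KPIRules.xml.
SAMPLE = """
<ArrayOfKPIRule>
  <KPIRule>
    <EntityName>ROUTER</EntityName>
    <KPIName>ALL_IP.STD_RTRIF_TRAF_ROUT.IF_IN_UCAST_PKTS</KPIName>
    <Ranges>
      <KPIRange><ColorName>Green</ColorName><FromVal>86</FromVal><ToVal>100</ToVal></KPIRange>
      <KPIRange><ColorName>Blue</ColorName><FromVal>76</FromVal><ToVal>85</ToVal></KPIRange>
      <KPIRange><ColorName>Purple</ColorName><FromVal>60</FromVal><ToVal>75</ToVal></KPIRange>
      <KPIRange><ColorName>Red</ColorName><FromVal>0</FromVal><ToVal>59</ToVal></KPIRange>
    </Ranges>
  </KPIRule>
</ArrayOfKPIRule>
"""

Range = Tuple[float, float, str]


def load_rules(xml_text: str) -> Dict[str, Tuple[str, List[Range]]]:
    """Map each entity name to its KPI name and list of (from, to, color) ranges."""
    rules: Dict[str, Tuple[str, List[Range]]] = {}
    for rule in ET.fromstring(xml_text).findall("KPIRule"):
        entity = rule.findtext("EntityName") or ""
        kpi = rule.findtext("KPIName") or ""
        if not KPI_NAME.match(kpi):
            raise ValueError(f"KPI name is not schema.counterset.column: {kpi!r}")
        ranges = [
            (float(r.findtext("FromVal")), float(r.findtext("ToVal")), r.findtext("ColorName"))
            for r in rule.iter("KPIRange")
        ]
        rules[entity] = (kpi, ranges)
    return rules


def color_for(rules: Dict[str, Tuple[str, List[Range]]], entity: str, value: float) -> Optional[str]:
    """Return the color of the first range containing value, or None if no range matches."""
    _, ranges = rules[entity]
    for low, high, color in ranges:
        if low <= value <= high:  # assumption: both bounds are inclusive
            return color
    return None


rules = load_rules(SAMPLE)
print(color_for(rules, "ROUTER", 90))
```

Under this inclusive reading, a fractional value such as 85.5 falls between the Blue and Green ranges and matches nothing, so it is worth checking with your PM implementation whether KPI values can be fractional before choosing range boundaries.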

To enable the KPI Presentation feature:


1. In the server, go to the path:
$BASE_DIR/ttij2ee/project/metadata/WinFam/winfam
2. Copy the KPIRules.xml example shown below to this location.
3. Edit it according to your PM implementation.
The KPI Name should be in the format:
[db schema].[counter set name].[column name]
For example:
<KPIName>ALL_IP.STD_RTRIF_TRAF_ROUT.IF_IN_UCAST_PKTS</KPIName>
KPIRules.xml example

<?xml version="1.0" encoding="utf-8" ?>
<ArrayOfKPIRule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <KPIRule>
    <EntityName>ROUTER</EntityName>
    <KPIName>ALL_IP.STD_RTRIF_TRAF_ROUT.IF_IN_UCAST_PKTS</KPIName>
    <Ranges>
      <KPIRange>
        <ColorName>Green</ColorName>
        <FromVal>86</FromVal>
        <ToVal>100</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Blue</ColorName>
        <FromVal>76</FromVal>
        <ToVal>85</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Purple</ColorName>
        <FromVal>60</FromVal>
        <ToVal>75</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Red</ColorName>
        <FromVal>0</FromVal>
        <ToVal>59</ToVal>
      </KPIRange>
    </Ranges>
  </KPIRule>
  <KPIRule>
    <EntityName>RNC</EntityName>
    <KPIName>Dummy2.Dummy2.Dummy2</KPIName>
    <Ranges>
      <KPIRange>
        <ColorName>Green</ColorName>
        <FromVal>86</FromVal>
        <ToVal>100</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Blue</ColorName>
        <FromVal>76</FromVal>
        <ToVal>85</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Purple</ColorName>
        <FromVal>60</FromVal>
        <ToVal>75</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Red</ColorName>
        <FromVal>0</FromVal>
        <ToVal>59</ToVal>
      </KPIRange>
    </Ranges>
  </KPIRule>
  <KPIRule>
    <EntityName>Cell</EntityName>
    <KPIName>Dummy3.Dummy3.Dummy3</KPIName>
    <Ranges>
      <KPIRange>
        <ColorName>Green</ColorName>
        <FromVal>86</FromVal>
        <ToVal>100</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Blue</ColorName>
        <FromVal>76</FromVal>
        <ToVal>85</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Purple</ColorName>
        <FromVal>60</FromVal>
        <ToVal>75</ToVal>
      </KPIRange>
      <KPIRange>
        <ColorName>Red</ColorName>
        <FromVal>0</FromVal>
        <ToVal>59</ToVal>
      </KPIRange>
    </Ranges>
  </KPIRule>
</ArrayOfKPIRule>

Site View Refresh Rate


The Site View refresh rates are determined by the SiteViewRefreshRate and
SiteViewKPIRefreshRate WinFam (Cruiser client) properties.
You can override these properties to modify the update mechanism for the topology data and
for the KPI data.


Anomaly & Trend Configuration


About config.xml
The Analytics feature is activated only if included in your license. It also requires the
FamAnalytics module to be installed. We recommend installing this module in a separate
EAR.
The config.xml file ($BASE_DIR/ttij2ee/project/metadata/FamAnalytics/config) enables you
to configure the Analytics parameters.
The installation includes a default file that can be tuned later.
The file includes the following main sections:
 TrendConfig—enables you to configure the Trend parameters using the following
subsections:
o PredictiveObject—enables you to select the alarm attribute (singles or pairs)
for which the Trend will be calculated and indicate the alarm attribute whose
Trend value will be displayed in the Trend Analytics alarm field.
o HistoryResolution—enables you to define the way the Trend behaviour
information will be collected.
 AnomalyConfig—enables you to select the Anomaly parameters using the following
subsections:
o GraphField—enables you to select the alarm attribute for the Anomaly graph.
o LearningPhase—the Anomaly algorithm requires prior knowledge of the
attributes behavior. This section enables you to define the way this information
will be collected.
o PredictiveObject—enables you to select the alarm attributes (singles or pairs)
for which the Anomaly will be calculated and indicate the alarm attribute whose
Anomaly value will be displayed in the Anomaly Analytics alarm field.
o HistoryResolution—enables you to define the way the Anomaly behaviour
information will be collected.
 PredictiveRangeConfig—enables you to define the severity colour ranges for both
Trend and Anomaly.
The following sections describe the config.xml file syntax with the parameters (which you can
change) bolded.

Selecting the PredictiveObjects (for Both Trend and Anomaly)


In this section, you select the alarm attributes (singles or pairs) for which the Trend/Anomaly
score will be calculated. Attribute name can be any valid active alarm attribute.
For example, if you choose the FromSite attribute, the algorithm will calculate the score for
every site found in the alarm history.
Syntax
The syntax for selecting a single alarm attribute is:
<PredictiveObject >
<AlarmAttribute1>attribute name</AlarmAttribute1>
</PredictiveObject>

The syntax for selecting a pair of alarm attributes is:
<PredictiveObject>
<AlarmAttribute1>attribute name1</AlarmAttribute1>
<AlarmAttribute2>attribute name2</AlarmAttribute2>
</PredictiveObject>
The syntax for selecting an alarm attribute score to be displayed in the Trend/Anomaly alarm
field:
<PredictiveObject AlarmEnrichment="true">
<AlarmAttribute1>attribute name</AlarmAttribute1>
</PredictiveObject>
Only one <PredictiveObject> can be selected as a source for the alarm enrichment.
For example, if you choose FromSite as the Predictive object for the alarm enrichment and
an alarm has a FromSite attribute with the value “site1”, the alarm will be enriched with the
score calculated for “site1”.
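The single-source rule above can be checked before deploying a changed file. The following is a minimal sketch, not part of the product: it assumes the section is available as an XML string, and the sample fragment and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment that follows the syntax described above.
SAMPLE = """
<TrendConfig>
  <PredictiveObject AlarmEnrichment="true">
    <AlarmAttribute1>FromSite</AlarmAttribute1>
  </PredictiveObject>
  <PredictiveObject>
    <AlarmAttribute1>EquipmentName</AlarmAttribute1>
    <AlarmAttribute2>Domain</AlarmAttribute2>
  </PredictiveObject>
</TrendConfig>
"""

def enrichment_sources(section_xml):
    """Return the attribute tuples of every PredictiveObject marked for enrichment."""
    root = ET.fromstring(section_xml)
    sources = []
    for obj in root.findall("PredictiveObject"):
        if obj.get("AlarmEnrichment", "false").lower() == "true":
            attrs = tuple(
                child.text.strip()
                for child in obj
                if child.tag in ("AlarmAttribute1", "AlarmAttribute2")
            )
            sources.append(attrs)
    return sources

def check_single_enrichment(section_xml):
    """Raise ValueError if more than one PredictiveObject is an enrichment source."""
    sources = enrichment_sources(section_xml)
    if len(sources) > 1:
        raise ValueError("more than one enrichment source: %r" % (sources,))
    return sources
```

Running check_single_enrichment(SAMPLE) returns the single FromSite source; marking a second object with AlarmEnrichment="true" makes it raise.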
Formal Schema


Defining the HistoryResolution (for Both Trend and Anomaly)


In this section, you define the way the information about Trend/Anomaly behavior will be
collected.
The following parameters are configured:
 History range—period (in days) of alarms history scanned to calculate the score
(for example, 7 days)
 Aggregation method— daily/hourly
 Scheduling—how often the score is recalculated
 Name—for convenience and GUI presentation every resolution is given a name
It is possible to define more than one history resolution. In this case, the score will be
calculated for every existing resolution.
For example, if you configure score calculation for sites (FromSite attribute) and define two
periods, of 7 and 30 days, every site will have two scores. In the Summary View GUI you will
be able to choose which of the scores to display.
For alarm enrichment to work, one (and only one) of the history resolutions must be marked
as the source of the enrichment.

Performance Considerations
By default, at least 20 points are required for the score calculation. For example, a 7-day
history with daily resolution produces only 7 points, which is not sufficient. Conversely,
scanning large portions of the history to produce many points may degrade the performance
of the algorithm.
Depending on the configuration, the history data size, and the database performance you
may need to optimize the database performance by using additional indexes.
We recommend monitoring the database performance during the initial period and validating
smooth and optimal database functionality.
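The point count can be estimated before choosing a resolution. This is a rough sketch under the assumption that DAILY aggregation yields one point per history day and HOURLY yields 24; the function names are illustrative only.

```python
MIN_POINTS = 20  # default minimum stated above

def point_count(history_days, aggregation_type):
    """Approximate number of points produced by one history resolution."""
    per_day = {"DAILY": 1, "HOURLY": 24}[aggregation_type.upper()]
    return history_days * per_day

def has_enough_points(history_days, aggregation_type, minimum=MIN_POINTS):
    """True when the resolution yields at least the required number of points."""
    return point_count(history_days, aggregation_type) >= minimum
```

Under these assumptions, 7 days with DAILY aggregation gives 7 points and fails the check, while 7 days with HOURLY aggregation gives 168 points and passes.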

Formal Schema


Syntax
The syntax for defining a history resolution score to appear in the Trend/Anomaly alarm field is:
<HistoryResolution AlarmEnrichment="true">

The syntax for defining the history resolution name in the client is:
<Name>resolution name</Name>
The syntax for defining the history days number in the client is:
<HistoryDaysRange>history days number</HistoryDaysRange>

The syntax for defining the aggregation period in the graph is:
<AggregationType>aggregation period</AggregationType>
Aggregation period can be DAILY or HOURLY.

The syntax for running the calculation once a day is:


<ExecutionSetting>
<Daily>time in day (HH:MM)</Daily>
</ExecutionSetting>
For example, time in day can be 04:30.
The syntax for running the calculation once every hour is:
<ExecutionSetting>
<Hourly>minutes after round hour</Hourly>
</ExecutionSetting>
For example, if <minutes after round hour> is 15, the calculation runs at 00:15, 01:15, 02:15,
03:15, and so on until 23:15.
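The two scheduling forms can be illustrated as follows. This is only a sketch of how the values are interpreted according to the description above; the helper names are hypothetical and the product's own scheduler is not shown.

```python
def hourly_run_times(minutes_after_hour):
    """List the 24 daily execution times for an <Hourly> value."""
    if not 0 <= minutes_after_hour <= 59:
        raise ValueError("minutes must be between 0 and 59")
    return ["%02d:%02d" % (hour, minutes_after_hour) for hour in range(24)]

def daily_run_time(value):
    """Validate a <Daily> value in HH:MM form and return (hour, minute)."""
    hh, mm = value.split(":")
    hour, minute = int(hh), int(mm)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("invalid time of day: %r" % value)
    return hour, minute
```

For an <Hourly> value of 15 the first helper lists 00:15 through 23:15, and daily_run_time("04:30") yields hour 4, minute 30.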

Configuring the Anomaly Learning Phase


In addition to scanning an alarm history as described above, the anomaly algorithm score
calculation is based on a learning phase that should be run periodically.
The following parameters are configured:
 Period (in days) of alarms history scanned
 The schedule of the learning phase run
Syntax
The syntax for selecting the number of learning days for the Anomaly algorithm is:
<HistoryDaysRange>number of learning days</HistoryDaysRange>

The syntax for defining how often the learning phase will run is:
<ExecutionDaysInterval>days number</ExecutionDaysInterval>
For example, 7 means that the learning process is scheduled to run every 7 days.

The syntax for defining at which time of day the learning phase will run is:
<ExecutionTimeOfDay>time of day (HH:MM)</ExecutionTimeOfDay>

Performance Considerations
During the learning phase, a new process is forked. This process consumes about
the same amount of memory as the EAR process where the FamAnalytics module is running.
Therefore, you have to plan the resources of the machine accordingly.

Configuring the Score Coloring (for Both Trend and Anomaly)


In this section you can configure how analytics scores will be colored in the Summary View
GUI.
The syntax for defining one range and its color is:
<PredictiveRangeData>
<Name>range name</Name>
<PredictiveRange>
<MinValue>score minimum value</MinValue>
<MaxValue>score maximum value</MaxValue>
</PredictiveRange>
<Color>range color</Color>
</PredictiveRangeData>
These definitions must cover the entire range from -100 to 100 without overlaps.
The color (range color) is defined using the syntax #RRGGBB, where each color component
(red, green, and blue) is represented by two hexadecimal digits. For example, #ff0000 is red,
#00ff00 is green, and #0000ff is blue.
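The coverage and color rules above lend themselves to an automated check. The sketch below assumes integer score bounds (as in the example file of this guide) and uses hypothetical names; it is not the product's own validation.

```python
import re

COLOR_RE = re.compile(r"^#[0-9a-fA-F]{6}$")

# Ranges taken from the example file in this guide: (name, min, max, color).
EXAMPLE_RANGES = [
    ("Low", 0, 25, "#99bedc"),
    ("Moderate", 26, 50, "#ffc600"),
    ("Significant", 51, 75, "#ff8135"),
    ("Serious", 76, 100, "#ff413f"),
    ("Low (decrease)", -25, -1, "#99bedc"),
    ("Moderate (decrease)", -50, -26, "#ffc600"),
    ("Significant (decrease)", -75, -51, "#ff8135"),
    ("Serious (decrease)", -100, -76, "#ff413f"),
]

def validate_ranges(ranges, low=-100, high=100):
    """Return a list of problems: bad colors, gaps, or overlaps (integer bounds assumed)."""
    problems = []
    for name, lo, hi, color in ranges:
        if lo > hi:
            problems.append("%s: MinValue is greater than MaxValue" % name)
        if not COLOR_RE.match(color):
            problems.append("%s: bad color %r" % (name, color))
    expected = low  # next score value that still needs to be covered
    for name, lo, hi, _ in sorted(ranges, key=lambda r: r[1]):
        if lo > expected:
            problems.append("gap before %s: %d..%d" % (name, expected, lo - 1))
        elif lo < expected:
            problems.append("%s overlaps the previous range" % name)
        expected = max(expected, hi + 1)
    if expected <= high:
        problems.append("gap at the top: %d..%d" % (expected, high))
    return problems
```

For the example ranges the function reports no problems; removing the Low range produces a gap report for 0..25.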

Formal Schema


config.xml File Example


In the config.xml file example, you can see the following definitions:

TrendConfig PredictiveObjects
The selected Trend single alarm attributes are EquipmentName, EquipmentType,
DeviceName, DeviceType, FromSite, ServiceName, Vendor, Domain, and Area.
The EquipmentName score is selected to be displayed in the Trend Analytics alarm field.

TrendConfig HistoryResolution
The defined resolutions are:
 Daily for 30 days with title 30 Days and execution at 04:30.
 Hourly for 7 days with title 7 Days and execution at 05:30.
 Hourly for 1 day with title 1 Day and execution at 15 minutes past every hour (XX:15).

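AnomalyConfig PredictiveObjects
The selected Anomaly single alarm attributes are EquipmentName, EquipmentType, DeviceName,
DeviceType, FromSite, ServiceName, Vendor, Domain, and Area.
The EquipmentName score is selected to be displayed in the Anomaly Analytics alarm field.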

AnomalyConfig HistoryResolution


The defined resolutions are:
 Daily for 30 days with title 30 Days and execution at 03:30.
 Hourly for 7 days with title 7 Days and execution at 04:30.
 Hourly for 1 day with title 1 Day and execution at 30 minutes past every hour (XX:30).

PredictiveRangeConfig (for Both Trend and Anomaly)


The defined ranges are:
 Low from 0 to 25 with color #99bedc
 Moderate from 26 to 50 with color #ffc600
 Significant from 51 to 75 with color #ff8135
 Serious from 76 to 100 with color #ff413f
 Low (decrease) from -25 to -1 with color #99bedc
 Moderate (decrease) from -50 to -26 with color #ffc600
 Significant (decrease) from -75 to -51 with color #ff8135
 Serious (decrease) from -100 to -76 with color #ff413f

Config File
<?xml version="1.0"?>
<AnalyticsConfig>

<TrendConfig>
<PredictiveObject AlarmEnrichment="true">
<AlarmAttribute1>EquipmentName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>

<AlarmAttribute1>EquipmentType</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>DeviceName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>DeviceType</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>FromSite</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>ServiceName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Vendor</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Domain</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Area</AlarmAttribute1>
</PredictiveObject>

<HistoryResolution AlarmEnrichment="true">
<Name>30 Days</Name>
<HistoryDaysRange>30</HistoryDaysRange>
<AggregationType>DAILY</AggregationType>
<ExecutionSetting>
<Daily>04:30</Daily>
</ExecutionSetting>
</HistoryResolution>
<HistoryResolution>
<Name>7 Days</Name>
<HistoryDaysRange>7</HistoryDaysRange>
<AggregationType>HOURLY</AggregationType>
<ExecutionSetting>
<Daily>05:30</Daily>
</ExecutionSetting>
</HistoryResolution>
<HistoryResolution>
<Name>1 Day</Name>
<HistoryDaysRange>1</HistoryDaysRange>
<AggregationType>HOURLY</AggregationType>
<ExecutionSetting>
<Hourly>15</Hourly>
</ExecutionSetting>
</HistoryResolution>
</TrendConfig>

<AnomalyConfig>
<GraphField>Keyword</GraphField>
<LearningPhase>
<HistoryDaysRange>90</HistoryDaysRange>
<AlarmNameAttribute>Keyword</AlarmNameAttribute>
<AlarmEqpAttribute>EquipmentName</AlarmEqpAttribute>
<ExecutionDaysInterval>7</ExecutionDaysInterval>
<ExecutionTimeOfDay>02:00</ExecutionTimeOfDay>
<MinSupport>100</MinSupport>
<MinInterest>2</MinInterest>
<MinConf>0</MinConf>
<MinIS>0</MinIS>
</LearningPhase>

<PredictiveObject AlarmEnrichment="true">
<AlarmAttribute1>EquipmentName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>EquipmentType</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>DeviceName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>DeviceType</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>FromSite</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>ServiceName</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Vendor</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Domain</AlarmAttribute1>
</PredictiveObject>
<PredictiveObject>
<AlarmAttribute1>Area</AlarmAttribute1>
</PredictiveObject>

<HistoryResolution AlarmEnrichment="true">
<Name>30 Days</Name>
<HistoryDaysRange>30</HistoryDaysRange>
<AggregationType>DAILY</AggregationType>
<ExecutionSetting>
<Daily>03:30</Daily>
</ExecutionSetting>
</HistoryResolution>
<HistoryResolution>
<Name>7 Days</Name>
<HistoryDaysRange>7</HistoryDaysRange>
<AggregationType>HOURLY</AggregationType>
<ExecutionSetting>
<Daily>04:30</Daily>
</ExecutionSetting>
</HistoryResolution>
<HistoryResolution>
<Name>1 Day</Name>
<HistoryDaysRange>1</HistoryDaysRange>
<AggregationType>HOURLY</AggregationType>
<ExecutionSetting>
<Hourly>30</Hourly>
</ExecutionSetting>
</HistoryResolution>
</AnomalyConfig>

<PredictiveRangeConfig>
<PredictiveRangeData>
<Name>Low</Name>
<PredictiveRange>
<MinValue>0</MinValue>
<MaxValue>25</MaxValue>
</PredictiveRange>
<Color>#99bedc</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Moderate</Name>
<PredictiveRange>
<MinValue>26</MinValue>
<MaxValue>50</MaxValue>
</PredictiveRange>
<Color>#ffc600</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Significant</Name>
<PredictiveRange>
<MinValue>51</MinValue>
<MaxValue>75</MaxValue>
</PredictiveRange>
<Color>#ff8135</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Serious</Name>
<PredictiveRange>
<MinValue>76</MinValue>
<MaxValue>100</MaxValue>
</PredictiveRange>
<Color>#ff413f</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Low (decrease)</Name>
<PredictiveRange>
<MinValue>-25</MinValue>
<MaxValue>-1</MaxValue>
</PredictiveRange>
<Color>#99bedc</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Moderate (decrease)</Name>
<PredictiveRange>
<MinValue>-50</MinValue>
<MaxValue>-26</MaxValue>
</PredictiveRange>
<Color>#ffc600</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Significant (decrease)</Name>
<PredictiveRange>
<MinValue>-75</MinValue>
<MaxValue>-51</MaxValue>
</PredictiveRange>
<Color>#ff8135</Color>
</PredictiveRangeData>
<PredictiveRangeData>
<Name>Serious (decrease)</Name>
<PredictiveRange>
<MinValue>-100</MinValue>
<MaxValue>-76</MaxValue>
</PredictiveRange>
<Color>#ff413f</Color>
</PredictiveRangeData>
</PredictiveRangeConfig>

</AnalyticsConfig>


Alarms Prediction Configuration


This feature is implemented by the FmPredictor module and contains two phases:
 Offline (learning)—this part runs periodically (by default, once a week) over the alarm
history DB, preparing the data set.
 Online (real time)—this part runs periodically (by default every half an hour), fetches
the most recently raised alarms (by default, those raised in the last 2 hours), and raises
predicted alarms based on the offline results.
Both phases are Python processes forked from the managed server process where
FmPredictor is installed.

Offline
The algorithm runs every 7 days ("offline.day.interval" property) at 2:00 AM ("offline.time"
property). If the execution fails, the algorithm is retried every 30 minutes
("predictor.offline.retry.minutes.interval" property).
The algorithm analyzes the last 92 days ("number.of.days.for.offline.algorithm" property) of
historic alarm data.
Part of the historic data is used as control data to check the correctness of the predictions.
As a result, every predicted alarm name has the following two KPIs:
 Precision—the share of the algorithm's predictions that were correct (correct
predictions / all predictions)
 Recall—the share of the actual alarms that the algorithm predicted correctly (correct
predictions / all alarms)
Predictions with precision less than 0.5 ("offline.param.min.precision") or recall less
than 0.5 ("offline.param.min.recall") are ignored and dropped.
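The two KPIs and the drop rule can be written out as a short calculation. This is a sketch of the formulas given above, not the FmPredictor implementation; the function names are illustrative and the thresholds are the documented defaults.

```python
def precision_recall(correct_predictions, all_predictions, all_alarms):
    """Compute (precision, recall) from the counts defined above."""
    precision = correct_predictions / all_predictions if all_predictions else 0.0
    recall = correct_predictions / all_alarms if all_alarms else 0.0
    return precision, recall

def keep_prediction(precision, recall, min_precision=0.5, min_recall=0.5):
    """A prediction is dropped when either KPI is below its minimum."""
    return precision >= min_precision and recall >= min_recall
```

For example, 6 correct predictions out of 8 predictions and 10 actual alarms give a precision of 0.75 and a recall of 0.6, so the prediction is kept. With 3 correct out of 10 predictions, precision is 0.3 and the prediction is dropped.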

Pre-requisites
For the prediction algorithm to provide the best results, the following prerequisites are
essential:
1. Preferably 3 months of alarm history data is required (but not less than 1
month).
2. Alarm data should include the following information:
o Alarm Name—the name of the alarm, as provided by the vendor (for example,
AIS, Power Failure, or LOS)
o From Site—the site where the alarm originated from
o Area—the geographic hierarchy above SITE
o District—the geographic hierarchy above AREA
o Eqp Name—the equipment that originated the alarm
o Alarmed Object—the non-equipment or sub-equipment entity, such as
Card/Interface/Channel/Link
o Object ID
o Ancestor object ID
o Site ID

Alarm Filtering
It is possible to filter the alarms that are used as input to the offline and online
algorithms by specifying SQL criteria in the FmPredictor property "offline.param.where".
The criteria are evaluated against the history_db.NEW_HIST_MAIN table. Alarms for which the
criteria evaluate to true are included in the offline algorithm.

Online
The online algorithm runs every 15 minutes ("predictor.online.minutes.interval"), checks
alarms raised in the last 2 hours ("hours.of.history.alarms.for.online"), and correlates them
with the model built by the offline algorithm.
The algorithm predicts and raises the most specific alarms in three levels:
1. Alarmed Object (object id) with Alarm Name
2. Equipment Name (ancestor object id) with Alarm Name
3. Site (site id) with Alarm Name

A predicted alarm will be raised only for the predictions with:


 likelihood higher than 0.5 ("filter.raise.alarm.likelihood.range")
 priority higher than 7 ("filter.raise.alarm.priority.range")

A predicted alarm will have the following special fields populated:


 Prediction Alarm—a Boolean field that states if the alarm is a prediction alarm
 Likelihood of the prediction
 Precision of the prediction
 Recall of the prediction
 Prediction Avg Time—average time expected until the real alarm is raised
 Prediction Max Time
 Clear Reason
 Prediction Level—Site, Equipment, or Alarmed Object

A predicted alarm is cleared automatically once the real alarm that it predicted is raised. If
the real alarm does not occur, the predicted alarm is cleared after the prediction
max time has expired. The Clear Reason field of the alarm contains the reason for the
clearance.
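The raise and clear rules described in this section can be summarized in a few lines. This is a hedged sketch only: the function names and the clear reason strings are hypothetical, and the thresholds are the defaults quoted above.

```python
from datetime import datetime, timedelta

def should_raise(likelihood, priority, min_likelihood=0.5, min_priority=7):
    """A predicted alarm is raised only above both thresholds."""
    return likelihood > min_likelihood and priority > min_priority

def clear_reason(raised_at, max_time, now, real_alarm_seen):
    """Return a clear reason string, or None while the prediction stays active.

    The reason texts are placeholders, not the values used by the product.
    """
    if real_alarm_seen:
        return "real alarm raised"
    if now - raised_at >= max_time:
        return "prediction max time expired"
    return None
```

A prediction with likelihood 0.8 and priority 8 is raised; it then stays active until either the real alarm arrives or the maximum prediction time elapses.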


ServiceImpact Configuration
The Cruiser Alarms Summary mode can provide different displays of the existing services and
customers, based on ServiceImpact information.
The ServiceImpact Admin enables the administrator to set the ServiceImpact system
definition as described in the ServiceImpact Admin User Guide.
For more information about ServiceImpact implementation, see the ServiceImpact
Implementation Guide.

Recognizing PM Entity Name in Alarms


The ability to add a PM entity name into the alarm’s 'Alarmed Object Entity' field is based on
the BC’s PM_ENTITY_MAPPING_RULES table, which contains rules for defining the entity
name based on the combination of object type and ID.
For any additional nonstandard project PM entity, an appropriate row should be added
manually to this table to make it recognizable in the relevant alarms. For more information,
see the PM Implementation Guide.

Maintenance Calendar Configuration


Planned maintenance is part of the routine work of a communications supplier and includes
activities such as fixing network problems, element maintenance, and network element upgrades.
Planned maintenance activities can create many FM alarms that do not indicate actual
problems.
The Maintenance Calendar feature helps the NOC operator handle planned
maintenance activities and special event alarms by displaying relevant alarm-maintenance
information. Alarm-maintenance fields are also available in the Active Alarms and Alarm
Information displays.
The history of the alarm maintenance changes is saved in the History log.


Maintenance Calendar Architecture

The Maintenance Calendar mechanism includes the following main components:


 MC—the Maintenance Calendar module is responsible for calculating up-to-date NE
maintenance statuses according to the information provided by the plug-in. The
information is collected by running a full refresh when the Helix system starts up and
partial refreshes at a predefined rate. In addition, full refreshes are run at a lower rate
to handle deletions.
 Plug-ins—the Maintenance plug-in is responsible for connecting to the Maintenance
system and extracting relevant information. The DB plug-in is supplied as part of the
Helix release and reads Maintenance information from a set of tables. Additional plug-ins
can be developed based on the project needs, for example, to retrieve data through a
Web service.
 FM Maintenance—an FM module that polls the Maintenance Calendar for updated
maintenance information and updates the alarms with the maintenance status of their
alarmed object.
 MC DB tables—a set of tables delivered by the product that the project can populate
with maintenance data for the DB plugin to use.


Maintenance Calendar Jobs


A Maintenance Calendar Job is used to describe a maintenance task. It has an id and a name
and includes objects that are part of its maintenance task. A Maintenance Calendar Job has
time frames defining when it takes place.
Maintenance time frames can be defined on 2 levels:
1. on job level
2. on object level
A single object in the job can have multiple time frames.
If a Job Object does not have its own time frames, or if it is missing either the start or the
end time, it inherits them from its corresponding job.

Note: A time frame is valid only if it has both the start and end time defined.
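The inheritance rule can be sketched as follows. This is one possible reading of the rule, assuming a single job-level time frame and that a missing bound is filled from that frame; the function name and data layout are hypothetical.

```python
from datetime import datetime

def effective_frames(object_frames, job_frame):
    """Resolve an object's time frames, falling back to the job-level frame.

    object_frames: list of (start, end) tuples; either bound may be None.
    job_frame: one (start, end) tuple for the job (assumption: a single frame).
    Only frames with both bounds defined are returned, since others are invalid.
    """
    job_start, job_end = job_frame
    candidates = object_frames or [job_frame]
    resolved = []
    for start, end in candidates:
        start = start if start is not None else job_start
        end = end if end is not None else job_end
        if start is not None and end is not None:
            resolved.append((start, end))
    return resolved
```

An object without frames receives the job frame, an object frame with only a start time takes its end time from the job, and a frame that still lacks a bound after inheritance is discarded.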

DB Plug-in Configuration
When using the DB plug-in, the Maintenance Calendar jobs are defined in the following tables
in the CONFIG_DB database.

Table MAINT_JOB
This table contains the Maintenance Calendar Jobs.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job
JOB_NAME          VARCHAR2(256)   Mandatory           The name of the job
DESCRIPTION       VARCHAR2(256)   Optional            The description of the job
LAST_UPDATE_DATE  TIMESTAMP       Mandatory           The last time this entry was changed in the DB
IS_DELETED        NUMBER(1)       Must be 0           Functionality not in use

Table MAINT_JOB_EXT
This table can be used to enrich maintenance jobs with project specific attributes. See the
details below.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job

Table MAINT_JOB_TIME_FRAME
This table contains Maintenance Calendar Job time frames.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job
START_DATE        TIMESTAMP       Optional            The start time of the job maintenance task
END_DATE          TIMESTAMP       Optional            The end time of the job maintenance task

Table MAINT_OBJECT
This table contains Maintenance Calendar Job Objects.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job
OBJECT_ID         NUMBER(9)       Mandatory           The unique identifier of the Object
OBJECT_TYPE       NUMBER(9)       Optional            Not in use

Table MAINT_OBJECT_EXT
This table can be used to enrich maintenance objects with project specific attributes.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job
OBJECT_ID         NUMBER(9)       Mandatory           The unique identifier of the Object


Table MAINT_OBJECT_TIME_FRAME
This table contains the Maintenance Calendar Job Object time frames.

Field Name        DB field type   Mandatory/optional  Description
JOB_ID            VARCHAR2(256)   Mandatory           The unique identifier of the job
OBJECT_ID         NUMBER(9)       Mandatory           The unique identifier of the Object
START_DATE        TIMESTAMP       Optional            The start time of the Object in the maintenance task
END_DATE          TIMESTAMP       Optional            The end time of the Object in the maintenance task

Adding Project Attributes to Maintenance Jobs


Maintenance job details can be presented in the Cruiser.

To add attributes to a maintenance job:


1. Add columns to the MAINT_JOB_EXT table.
2. Add DB mapping of these fields to the MD class ProjectMaintenanceJob, under the
project directory of MaintenanceDBPlugin.


Maintenance Calendar Module Configuration


Property Name              Type  Mandatory  Default Value  Allowed Values  Description
fullRefreshInterval        int   Yes        604800         -               Defines the time interval in seconds between two consecutive full refreshes (all the data is retaken and the internal MC data is updated). This is the only way to remove jobs deleted from the tables (because if they no longer exist, there is no update date to modify).
partialRefreshInterval     int   Yes        30             -               Defines the time interval in seconds between two consecutive partial refreshes (all the data with a last update date after the last partial update is retaken). This works only in the DB plug-in, and is usually not needed, because a full update can usually be supported at the required minimal refresh gap.
statusUpdateInterval       int   Yes        30             -               Defines the time interval in seconds between two consecutive maintenance status updates. This means that if an object enters or leaves a maintenance time frame, this thread updates its status.
slidingWindowPeriodFuture  int   Yes        604800         -               Defines how far forward in seconds objects are still relevant for future status
slidingWindowPeriodPast    int   Yes        604800         -               Defines how far backward in seconds objects are still relevant for past status
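The role of the two sliding windows can be pictured with a small classification function. This is one plausible interpretation only, assuming a time frame counts as Future when it starts within the future window and as Past when it ended within the past window; the function name and return values are illustrative.

```python
from datetime import datetime, timedelta

def maintenance_status(start, end, now, window_future=604800, window_past=604800):
    """Classify a time frame relative to 'now'; None means outside both windows."""
    if start <= now <= end:
        return "Current"
    if now < start <= now + timedelta(seconds=window_future):
        return "Future"
    if now - timedelta(seconds=window_past) <= end < now:
        return "Past"
    return None
```

With the default windows of 604800 seconds (7 days), a frame that starts in two days is reported as Future, while a frame that starts in six weeks is not yet relevant.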


FamMaintenance Module Configuration


The FamMaintenance module connects periodically to the Maintenance Calendar engine and
retrieves changes of NE statuses.
Alarms related to the NE are updated with the following information:
 The occurrence time (Current, Future, or Past) of the related maintenance activity
 The names of the related maintenance activities
 The start date and time of the earliest related maintenance job
 The end date and time of the latest related maintenance job
 The Object ID that the maintenance information is based on. For example, it can be
inherited from the parent alarm

Property Name                 Type     Mandatory  Default Value  Allowed Values                             Description
SyncInterval                  int      Yes        60             -                                          The time interval in seconds between two consecutive refreshes from the MC module
AncestorObjectIDZAttr         -        No         Proj_Int_1     A Project Active Alarms field of int type  The alarm field that stores the ancestor ObjectIDZ value
inheritParentMaintenanceData  boolean  Yes        true           true/false                                 Indicates whether child alarms without maintenance data should inherit their parent maintenance data
toCheckAncestorObjectID       boolean  Yes        true           true/false                                 Indicates whether maintenance data should be associated with the alarms according to Equipment Number if no data is associated by the ObjectID field
toCheckAncestorObjectIDZ      boolean  Yes        false          true/false                                 Indicates whether maintenance data should be associated with the alarms according to AncestorObjectIDZ (the value of the AncestorObjectIDZAttr field) if no data is associated by the ObjectID field (and by AncestorObjectID if toCheckAncestorObjectID == true)
inheritParentMaintenanceData  boolean  Yes        false          true/false                                 Child alarm inherits the maintenance data from its parent alarm
MaintenanceSyncInterval       int      Yes        120            -                                          Synchronization interval with Maintenance Calendar engine
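The fallback order implied by these properties can be sketched as a lookup chain. This is an interpretation of the property descriptions, not the module's code: the dictionary keys, the function name, and the position of parent inheritance at the end of the chain are assumptions.

```python
def find_maintenance(alarm, maintenance_by_object, parent_data=None,
                     check_ancestor=True, check_ancestor_z=False,
                     inherit_parent=True):
    """Return maintenance data for an alarm, trying each object ID in turn.

    alarm: dict with hypothetical keys ObjectID, AncestorObjectID, AncestorObjectIDZ.
    maintenance_by_object: dict mapping an object ID to its maintenance data.
    parent_data: maintenance data of the parent alarm, if any.
    """
    keys = [alarm.get("ObjectID")]
    if check_ancestor:        # toCheckAncestorObjectID
        keys.append(alarm.get("AncestorObjectID"))
    if check_ancestor_z:      # toCheckAncestorObjectIDZ
        keys.append(alarm.get("AncestorObjectIDZ"))
    for key in keys:
        if key is not None and key in maintenance_by_object:
            return maintenance_by_object[key]
    if inherit_parent:        # inheritParentMaintenanceData
        return parent_data
    return None
```

The alarm's own ObjectID always wins; the ancestor IDs are consulted only when their flags are enabled, and the parent alarm's data is used last.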

Machine Learning Root Cause Analysis (RCA) Configuration


The logic is implemented in the FamRCA module. We recommend installing it in a separate
EAR.
There are two logical parts:
 Learning—analyze historic alarm data and divide related alarms into clusters
 Runtime—form parent/child relations between alarms belonging to the same cluster

Learning
The learning algorithm runs periodically. It fetches relevant historic alarm data, analyzes
correlation between alarms and divides related alarms into clusters. Clusters are stored in the
database to be used by the runtime logic.
The learning algorithm is executed in a separate process forked from the Managed Server
process where the FamRCA EAR is deployed. The process may require significant memory, CPU,
and DB resources. We highly recommend monitoring the first runs of the learning process to
validate that it has all the required resources and completes successfully.

Alarms Dataset Prerequisites and Recommendations
For the Machine Learning RCA algorithms to provide the best clusters and root causes, the
following prerequisites for the fault data are essential:
1. Preferably, 3 months of alarm data is required (but not less than 1 month).
2. The alarm data should be as informative as possible. It must include the following
information in the different fields:
o Alarm Identifier—a specific identifier of the alarm. This means that instances of
the same alarm are raised with the same Alarm Identifier. This field cannot be
empty and should not contain any redundant data, such as time, temperature,
and internal system index.
o Alert Name—the name of the alarm, as provided by the vendor (for example,
AIS, Power Failure, or LOS).
o Severity/Priority—the severity or priority of the alarm.
o Managed Object—the name of the object that raised the alarm.

Configuration
The following attributes control the algorithm execution parameters (such as time of day,
interval, and retry interval) and define the data to be collected (such as alarm name, keyword,
severity, and the range/resolution of the data).

Property Name                           Type    Refreshable  Default Value  Description
learning.execution.timeOfDay            string  Yes          02:00          Time of Day (HH:MM format) to run the Learning algorithm
learning.historyRange.days              int     Yes          92             Number of days to examine history alarms
learning.interval.days                  int     Yes          7              Interval between executions of the Learning algorithm
learning.retry.minutes.interval         int     No           60             Learning retry interval on failure
learning.aggregationResolution.minutes  int     Yes          1              Resolution in minutes for aggregating alarms with the same LogicID
alarm.keyAttribute                      string  No           LogicID        The alarm attribute to be used by the engine as a key
learning.keywordAttribute               string  Yes          Keyword        The alarm attribute to be treated as the Keyword in the RCA Offline learning phase
learning.severityAttribute              string  Yes          Priority       The alarm attribute to be treated as the Severity in the RCA Offline learning phase


In addition, it is possible to fine-tune the learning algorithm parameters through
metadata/vdir/learning/rca_learning_params.txt.
Please consult TEOCO if you believe such tuning is required.

Troubleshooting
To check whether the learning phase has run, look at the
metadata/vdir/learning directory under the EAR folder.
 Three files are created during the initial data collection: deep.txt, kwrds.txt, and
lid2kwrd.txt.
Their creation time shows when the learning phase started.
 rca_learning_log.out and rca_learning_errors.err are output files of the algorithm
itself, showing its progress. The existence of the .err file means that the learning phase
has failed. Usually, this happens when there is insufficient data or when there are
mismatches among the three input files due to a momentary discrepancy. If the error occurs
again after rerunning the algorithm, it indicates that some specific data (usually the Keyword
attribute) contains invalid content.
 rca_out.txt is the output of the algorithm, which is stored in the table
history_db.rca_scores. The file time is the time at which the phase ended. However, the
database records hold no time information, so it is not possible to determine whether the
records there are obsolete.
Usually, there are 3 areas affecting the success of the Learning phase that need to be
checked:
 Proper configuration—check the time of execution and the existence of sufficient data
in history_db.new_hist_main for the defined ‘historyRange’
 Corrupted/mismatching data—check the aforementioned output and .err files of the
rca_learning algorithm
 DB issues—if the rca_out.txt file was created, but the table does not contain any info (or
contains old info), check the jcore.log of the EAR for database errors when storing the
records
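The file checks listed above can be gathered into one helper. This is a minimal sketch that only inspects the file names mentioned in this section; the function name is hypothetical, the directory path must be supplied by the caller, and file modification time is used as an approximation of creation time.

```python
from datetime import datetime
from pathlib import Path

INPUT_FILES = ("deep.txt", "kwrds.txt", "lid2kwrd.txt")

def learning_status(learning_dir):
    """Summarize the state of the learning directory from its files."""
    directory = Path(learning_dir)

    def mtime(name):
        # Modification time of the file, or None when it does not exist.
        path = directory / name
        return datetime.fromtimestamp(path.stat().st_mtime) if path.is_file() else None

    input_times = [mtime(name) for name in INPUT_FILES]
    present = [t for t in input_times if t is not None]
    return {
        "inputs_present": len(present) == len(INPUT_FILES),
        "started_at": min(present, default=None),
        "error_file_present": (directory / "rca_learning_errors.err").is_file(),
        "finished_at": mtime("rca_out.txt"),
    }
```

A result with inputs_present set but no finished_at, or with error_file_present set, points to the output and .err files as the next things to inspect.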

Manual Run
To run the learning phase manually, execute the following from the JavaScript console of the
FamRCA application:
Packages.teoco.famrca.offline.LearningTask.runLearning();

Learning Investigations
The Sentinel UI lets you display and explore the results of the learning run:
 Visual presentations of the generated clusters
 Investigation of a specific cluster:
o Information about the cluster alarms
o Graph displaying times when alarm instances of the cluster occurred
The data for the investigation is fetched from the database. We recommend monitoring the
first executions of the flow and performing database adjustments if required.


Configuration
<Property name="widget.fault.alarmsList.amountOfAlarms" type="int"
public="true">
<Value>500</Value>
<Description summary="Defines the amount of alarms which
is get from the server for Active Alarms table" />
</Property>

<Property name="widget.mlrca.clusters.queryLimit" type="int"
public="true">
<Value>1000</Value>
<Description summary="Defines the amount of RCA clusters
to retreive from the Server for the ML-RCA widget" />
</Property>

<Property name="widget.mlrca.autorefresh.interval" type="int"
public="true">
<Value>3600</Value>
<Description summary="Defines MLRCA autorefresh interval
(seconds)" />
</Property>

<Property name="widget.mlrca.alarmsList.maxNumAlarms" type="int"
public="true">
<Value>1000</Value>
<Description summary="Defines the limit of Alarms to
retreive from the Server for the ML-RCA Investigation widget" />
</Property>

<Property name="widget.mlrca.investigationGraph.maxLogicIds"
type="int" public="true">
<Value>50</Value>
<Description summary="Defines the maximum logic ids that
can be shown in the investigation graph" />
</Property>


Run-Time
Correlation Decisions
The description below is based on the default parameters.
When an alarm belonging to one of the clusters is raised, it forms a “potential family”. New
alarms that belong to the same cluster and are raised within 180 seconds (“online.maxTime”
property) of the raise time of the first alarm are added to this potential family. Cleared
alarms are removed. If the potential family has at least 4 (“online.minCorrelationCluster”
property) active alarms during the 180 seconds, it becomes a “real family”; otherwise it
is destroyed.

Once a real family is formed, there are two possibilities:


1. The alarm with the highest RCA score is chosen as the parent. All other alarms
become its children.
2. When all children have the same score (within the epsilon precision), a derived alarm
is raised having all alarms as its children.
The relation decision is taken (or revised) every 60 seconds (“online.updateTime” property),
based on the existing alarms in the family, until 180 seconds have passed.
After 180 seconds passed, the family is stored in the memory for additional 1440 minutes
(“online.maxRetentionTime”) or until the parent is cleared, waiting for late alarms with
DateTimeUp falling into the 180 seconds range. Such late alarms are added as children to the
current parent, but do not trigger the revision of a parent.
A new alarm not falling into any existing family forms a new “potential family”.
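
The decision rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's implementation: the `Alarm` class, its `score` field and the `decide` function are hypothetical names, times are plain seconds, and only a single decision point is modelled (the periodic revision and the retention cache are left out).

```python
from dataclasses import dataclass

MAX_TIME_S = 180   # default of online.maxTime
MIN_CLUSTER = 4    # default of online.minCorrelationCluster


@dataclass
class Alarm:
    logic_id: str
    raised_at: float      # raise time in seconds (hypothetical representation)
    score: float          # RCA score (hypothetical field name)
    cleared: bool = False


def decide(alarms, epsilon=None):
    """Return ("none", None), ("parent", alarm) or ("derived", None)."""
    if not alarms:
        return ("none", None)
    first = min(a.raised_at for a in alarms)
    # Only active alarms raised within the window of the first alarm count.
    active = [a for a in alarms
              if not a.cleared and a.raised_at - first <= MAX_TIME_S]
    if len(active) < MIN_CLUSTER:
        return ("none", None)          # the potential family is destroyed
    top = max(active, key=lambda a: a.score)
    tolerance = epsilon or 0.0         # no epsilon means exact comparison
    if all(abs(a.score - top.score) <= tolerance for a in active):
        return ("derived", None)       # all scores equal: raise a derived alarm
    return ("parent", top)


alarms = [Alarm("a", 0, 0.9), Alarm("b", 10, 0.5),
          Alarm("c", 20, 0.5), Alarm("d", 30, 0.5)]
kind, parent = decide(alarms)
print(kind, parent.logic_id if parent else None)
```

With equal scores the same call returns a derived-alarm decision, and with fewer than four active alarms inside the window it returns no family at all.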

Alarm Filtering
It is possible to limit the volume of alarms that participate in the RCA correlation process.
To do this, edit the
$BASEDIR/ttij2ee/project/metadata/FamRCA/filter/RCAJSAlarmFilter.filter file so that it
contains a JS expression written in terms of alarm attribute names.
The expression should return TRUE for alarms that you DO NOT want to be processed.
For example, to exclude alarms from the site1 site, the expression should be:
FromSite == 'site1'

Derived Alarms Population


It is possible to customize the way the fields of a derived alarm are populated, based on its
child alarms.
To do this, edit the FamRCA\js\JSDeriveAlarm.js file under the project Metadata folder.
The JS code receives the derived alarm, with its LogicID already set, as the ‘alarm’ variable
(of type ActiveAlarm), and the set of its children as the ‘children’ variable (of type Set).
The code should populate the desired fields in ‘alarm’ and return it.
Example:
// 'alarm' (ActiveAlarm) and 'children' (Set) are supplied to the script.
var text = "";
var childrenArray = children.toArray();
var i;
// Concatenate the LogicIDs of all child alarms.
for (i = 0; i < childrenArray.length; i++) {
    text += childrenArray[i].getLogicID() + " * ";
}
alarm.setAlarmText(text);
// This must be the end of this JS
alarm;

Configuration

Property Name | Type | Refreshable | Default Value | Description
alarm.keyAttribute (the same property as the one from the “Learning” phase) | string | No | LogicID | The alarm attribute the engine will use as key
online.alarm.timeAttribute | string | No | DateTimeUp | The alarm attribute the online engine will consider as alarm date
online.maxTime | int | Yes | 180 | Maximum time (seconds) FM accumulates the active alarms for a specific family. After this time, the chosen parent alarm will not change
online.maxRetentionTime | int | Yes | 1440 | The maximum time (minutes) to hold a family in cache, awaiting late alarms. Late alarms are only added as children to the chosen parent, but are not considered as potential parents
online.minCorrelationCluster | int | Yes | 4 | The minimum number of alarms to be considered a family
online.updateTime | int | Yes | 60 | Repeating period of time in which (intermediate) correlation decisions are taken. Each time, a new parent can be chosen and children can be added
online.topScore.epsilon | float | Yes | | Cluster TopScore Epsilon, for consideration of the same Top Score. When null, no epsilon is used.


Correlation Graph
When looking at a specific correlation in Cruiser, it is possible to see a correlation graph
showing the history of the alarms participating in the correlation.

Configuration
The following properties affect the graph presentation:
• WinFam:AlarmConnectionsGraphDisplayFieldName: the name of the alarm field
to be displayed in the Correlation Tree graph. By default: LogicID
• FamProxy:alarm.service.offspring.history.timespan: how far back in time the
graph extends, in hours. By default: 720 hours
• FamProxy:alarm.service.offspring.history.gap: the graph resolution in minutes.
By default: 10 minutes
It is possible to see a tooltip with additional alarm information.
The displayed fields are determined by the “RCAChartAttributes” projection in the
ProjectActiveAlarm MD class. By default, the following fields are displayed:
• EquipmentName
• FromSite
• AlarmText
The same projection determines which alarm information is saved when exporting to Excel.


Opening Clients
Opening FM Cruiser from External Applications

This feature enables you to open the FM Cruiser display from external applications. It is done
by opening a URL using the appropriate parameters.
The URL prefix is:
https://[apache-host]:[apache-port]/FaMShell_[EAR name]/FaMShellActivator.jsp?

The URL parameters are:

Name | Type | Default Value | Description
activate | string | | A true value sets the focus on Cruiser.
FilterKey | string | | Indicates filtering by a predefined filter. Use the key that FiltersService provided you.
TempFilterKey | string | | Indicates filtering by a dynamic filter you created in your application. Use the key that FiltersService provided you. To be used when you have a complex condition.
FolderCaption | string | | Provides a name for the opened tab.
NavigationMode | string | | Sets the Cruiser Master Mode for the display. The available values are: active (Active Alarms), correlated (Correlated Alarms), history (Alarms History), ge (GEO Maps), summary (Alarms Summary/Analytics), and siteview (Site View). For siteview, you have to specify a site id, as follows: FieldName=SiteID&FieldValue=[your site id]&FieldType=int&NavigationMode=siteview. For opening Site View from CAFÉ by Site Band ID, set FieldName=SBID and FieldValue=[its value]. To open a specific Site View tab, use the siteTabName parameter.
FieldName | string | | With FieldType and FieldValue, indicates filtering by one or more alarm fields. To filter by one value of a selected alarm field, set FieldName to its name, FieldType to its type, and FieldValue to its value. To filter by several values of a selected alarm field, enumerate FieldValue as follows: FieldName=ObjectID&FieldValue0=78123&FieldValue1=571987&FieldValue2=271987&FieldType=int
FieldType | string | string | Indicates the type of FieldName. Can also be int.
FieldValue | string | | The value of the alarm field named by FieldName. FieldName must be an attribute name from the JFam ProjectActiveAlarm MD class.
siteTabName | | | Enables opening a specific Site View tab. The available values are: ServiceAlarms and ServiceStatus.

Notes:

• To open a filtered drill down folder, one of the filter parameters (FilterKey,
TempFilterKey, and FieldName) must be set.
• All the parameters are optional and single-entry.

Example of opening the ServiceStatus Site View tab


http://dc50-dev-gold20:3600/FaMShell_FM/FaMShellActivator.jsp?FieldName=SBID&FieldValue=51633&FieldType=int&NavigationMode=siteview&siteTabName=ServiceStatus
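
An external application can assemble such a URL from the prefix and parameters described above. The sketch below is a hedged Python illustration: the function name `cruiser_url` and the host name `fm-host` are made up for the example, and only the parameters documented in this section are used. A single value is sent as FieldValue, several values are enumerated as FieldValue0, FieldValue1, and so on.

```python
from urllib.parse import urlencode


def cruiser_url(host, port, ear_name, field_name=None, field_values=(),
                field_type="string", **extra):
    """Build an FM Cruiser activation URL (illustrative helper)."""
    base = f"https://{host}:{port}/FaMShell_{ear_name}/FaMShellActivator.jsp?"
    params = []
    if field_name is not None:
        params.append(("FieldName", field_name))
        values = list(field_values)
        if len(values) == 1:
            params.append(("FieldValue", values[0]))
        else:
            # Several values of one field are enumerated: FieldValue0..N
            params.extend((f"FieldValue{i}", v) for i, v in enumerate(values))
        params.append(("FieldType", field_type))
    # Remaining documented parameters, e.g. NavigationMode or siteTabName.
    params.extend(extra.items())
    return base + urlencode(params)


site_view = cruiser_url("fm-host", 3600, "FM", "SBID", [51633], "int",
                        NavigationMode="siteview", siteTabName="ServiceStatus")
multi_value = cruiser_url("fm-host", 3600, "FM", "ObjectID",
                          [78123, 571987, 271987], "int")
print(site_view)
print(multi_value)
```

Using `urlencode` also takes care of escaping values that contain spaces or other reserved characters, such as a FolderCaption.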

When opening the Cruiser client from an external client, the available applications are Cruiser
and Light Cruiser. The following FamProxy properties are used to determine the required
application:
• preferableMonitoringClient: the preferred FM application to open from an external client
• secondPreferableMonitoringClient: the second FM application choice to open from
an external client
The selection of the FM application to be opened changes for the different external
applications as follows.

Opening from Schematic Views:
1. If Schematic Views was triggered from one of the FM clients, it opens the triggering
application.
2. If Schematic Views was triggered from Sentinel:
a. User permissions and project installation are checked.
b. If the user has permission for only one application, this application is opened.
c. If the user has permission for more than one application, the FaM Proxy properties
are checked and the first available application is opened in the following order:
i. preferableMonitoringClient
ii. secondPreferableMonitoringClient
iii. The remaining application

Note: If neither property is defined, Cruiser (the default) is opened.
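
The selection order for the Sentinel case can be summarized as a short Python sketch. It is only an illustration of the rules above: `choose_fm_client` is a hypothetical name, applications are represented by plain strings, and returning `None` when the user has no permission at all is an assumption, since the guide does not describe that case.

```python
def choose_fm_client(permitted, preferred=None, second_preferred=None,
                     default="Cruiser"):
    """Pick the FM application to open when triggered from Sentinel."""
    permitted = list(permitted)
    if not permitted:
        return None                 # assumption: nothing can be opened
    if len(permitted) == 1:
        return permitted[0]         # only one permitted application
    if preferred is None and second_preferred is None:
        return default              # neither property is defined
    for candidate in (preferred, second_preferred):
        if candidate in permitted:
            return candidate        # first available, in property order
    return permitted[0]             # fall back to a remaining application


print(choose_fm_client(["Cruiser", "Light Cruiser"],
                       preferred="Light Cruiser"))
```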

Opening FM History from External Applications


This feature enables you to open the FM History display from external applications. It is done
by opening a URL using the appropriate parameters.
The URL prefix is:
https://[apache-host]:[apache-port]/FaMHistoryShell_[EAR Name]/FaMShellActivator.jsp?

The URL parameters are:

Name | Type | Description
active | boolean | Mandatory. Always true.
field | string | The name of the alarm field to filter by (when filtering by a single field).
value | string | The value of the alarm field to filter by (when filtering by a single field).
timecriteria | string | Relative time (<N> Hours, Days, Weeks, or Months), in the format H/D/W/M<N>, where H=Hours, D=Days, W=Weeks, M=Months. For example, W10 indicates 10 weeks.
allparents | boolean | Determines whether to open the Correlation Tree window or just filter by the following parameters. Set to true to open the Correlation Tree window. Set to false to filter the records by the following parameters without opening the Correlation Tree window. Ignore this parameter if you just want to filter by a single field/value (for backward compatibility).
LogicID | string | The value of the LogicID field of the alarm to filter by.
DateTimeUp | string | Full date and time, including milliseconds.
PCStatus | string | Parent Child Status, values according to available values in JFam.
ObjectID | int | The value of the ObjectID field of the alarm to filter by.
ObjectType | int | The value of the ObjectType field of the alarm to filter by.

Example:
https://dc50-dev-helix91:3600/FaMHistoryShell_FM/FaMHistoryShellActivator.jsp?activate=True&PCStatus=PARENT&ObjectID=123456&ObjectType=78&timecriteria=W10&LogicID=comcast_test_3&allparents=false&DateTimeUp=20/03/2017 16:43:31.092
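
The timecriteria value used in the example (W10) follows the H/D/W/M<N> format from the parameter table. A calling application could produce it with a small helper such as the Python sketch below. The function name `time_criteria` is hypothetical, and rejecting non-positive or fractional amounts is an assumption made for the example, not a documented rule.

```python
UNIT_CODES = {"hours": "H", "days": "D", "weeks": "W", "months": "M"}


def time_criteria(amount, unit):
    """Format a relative time as H/D/W/M<N>, for example W10 for 10 weeks."""
    unit_key = unit.lower()
    if unit_key not in UNIT_CODES:
        raise ValueError(f"unsupported unit: {unit}")
    # Assumption: only positive whole numbers make sense here.
    if int(amount) != amount or amount <= 0:
        raise ValueError("amount must be a positive whole number")
    return f"{UNIT_CODES[unit_key]}{int(amount)}"


print(time_criteria(10, "weeks"))
```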


Maintenance

Verifying that All Components are Running


This option enables you to verify that the following components are running:
• J2EE Components
• FM Services

J2EE Components
To verify the J2EE components are running:
1. From the Sentinel application, open the TEOCO Admin application.
2. Click System Configuration.
A list of all installed J2EE managed servers and their status appears.

FM Services
To verify that the FaMAPI Service is running:
1. Type the following command:
ps -ef | grep fam_api
2. Verify that you receive the following results:
9602 9267 0 10:21:56 pts/34 0:09 connect -daemon fam_api_for_alarms.connect
9267 1359 0 10:21:50 pts/34 0:00 connect -daemon fam_api_parent.connect


To verify that the Correlation TRS is running:


1. Type the following command:
ps -ef | grep fam_cor_trs
2. Verify that you receive the following results:
lnxgold9 13135 1 0 Mar18 ? 00:00:15 connect -daemon fam_cor_trs_parent.connect
lnxgold9 13538 13135 0 Mar18 ? 00:01:08 connect -daemon fam_cor_trs_lookup.connect
lnxgold9 13634 13135 1 Mar18 ? 01:40:42 connect -daemon fam_cor_trs_full.connect
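
The two manual checks above can be combined into one scripted check. The following Python sketch is an illustration only: it assumes the `ps -ef` output has the shape shown above (each daemon line ends with the name of its .connect file), and the names `EXPECTED`, `SAMPLE_PS` and `missing_processes` are introduced for the example.

```python
# Connect files expected per service, taken from the listings above.
EXPECTED = {
    "FaMAPI": ["fam_api_for_alarms.connect", "fam_api_parent.connect"],
    "Correlation TRS": ["fam_cor_trs_parent.connect",
                        "fam_cor_trs_lookup.connect",
                        "fam_cor_trs_full.connect"],
}

# Sample output in which one Correlation TRS process is not running.
SAMPLE_PS = """\
lnxgold9 13135 1 0 Mar18 ? 00:00:15 connect -daemon fam_cor_trs_parent.connect
lnxgold9 13538 13135 0 Mar18 ? 00:01:08 connect -daemon fam_cor_trs_lookup.connect
9602 9267 0 10:21:56 pts/34 0:09 connect -daemon fam_api_for_alarms.connect
9267 1359 0 10:21:50 pts/34 0:00 connect -daemon fam_api_parent.connect
"""


def missing_processes(ps_output, expected=EXPECTED):
    """Return {service: [missing connect files]} for incomplete services."""
    running = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if "connect" in fields and "-daemon" in fields:
            running.add(fields[-1])   # last field is the .connect file name
    missing = {}
    for service, names in expected.items():
        absent = [name for name in names if name not in running]
        if absent:
            missing[service] = absent
    return missing


print(missing_processes(SAMPLE_PS))
```

On a live server, the input text could be taken from `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout` instead of the sample string.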

Running FM Modules
All modules (N2/J2EE) are started, stopped, and restarted through the
$BASE_DIR/integration/scripts/netrak.ksh utility. It is possible to refer to a specific module
(for example, FamEngine or FAM_SERVICES) or to the entire system.

Note: Because the processes are related to one another, restarting one of them can cause
an implicit restart or reconnect of others.

Checking the System Queues


Constantly monitor the queue statistics information to identify queues in the system that may
cause delays in alarm processing.

Checking the Memory Consumption


Insufficient memory may lead to delays in processing and display of the alarms.
For detailed information, activate verbose GC mode (in wls-XXX.ksh files) and analyze the
GC log files.

History Table Partitioning


FM History data is kept by default for 30 days using the THIN OUT mechanism.
However, for medium and large products we recommend partitioning the tables on a daily
basis.

TEOCO Monitor
TEOCO Monitor is the recommended way to monitor FM processes and FM health.
The following parameters can be monitored:
• FM processes running status
• Memory consumption
• FaM Engine up/down status
• Number of active alarms in the system
• Rate of incoming events
• Size of queues in the FM system

Troubleshooting

Log Files
J2EE Server and Client Log Files
For more detailed information about the content of each log file, how to locate errors in the
log files, examples of messages, and a description of how to change a log level, refer to the
Diagnostics and Troubleshooting section of the Helix Administration Guide.

FM Services Log Files


The FM Services log files are located in $FAM_SERVICES_LOG_DIR. For FaMAPI, see
fam_api_for_alarms.log and fam_dst_for_alarms.log.

Server Troubleshooting
Server Components Are Up and Functioning
Check server logs to validate that Fam/FamEngine/FamHistory/FamCache EARs are running
smoothly.
Check if these EARs suffer from memory shortage.

Data Loss and Restart


While all components are running correctly, data should not be lost. However, restarting any
of the FM related components (FamEngine/FamHistory/FamCache/Fam) may lead to losing
alarm related data.
When critical, synchronization with the NE should be initiated.

Alarms Display/Update Is Delayed


Check the queue statistics to make sure that the alarm rate is within the supported range.
A memory shortage often slows down processing.
Check the health of all the FM related EARs. Sometimes problems in one EAR can affect the
entire system.
In some cases, network load may prevent efficient distribution of the alarm data between
components.

History Data Is Delayed


In addition to the checks mentioned, you are advised to check queue statistics of the FM
History.


Insufficient Oracle Connections


If the error “weblogic.jdbc.extensions.PoolLimitSQLException:
weblogic.common.resourcepool.ResourceLimitException: No resources currently available in
pool JCoreDS to allocate to applications, please increase the size of the pool and retry.”
appears, use the WebLogic console to increase the maximum number of JDBC pool
connections to 200.
If the problem persists, consult the DBA to check if there are problems with connection
allocation.

Hazelcast Disconnections
In some edge cases (sometimes during a heap dump after Out Of Memory) you may see
EARs disconnecting from the Hazelcast cluster. The messages are seen in the XXX.stderr log
files.
While disconnected, the EAR is not part of the cluster and it is NOT possible to access the
cached alarm information or receive alarm notifications.
If such a disconnection is not expected (an expected case is, for example, an EAR that was
shut down) and the connection is not restored shortly, you will have to restart the entire FM.

Client Troubleshooting
The following sections discuss various client-related problems that may occur.

The Installation Starts and then Fails and an Error Message Appears
The error message that appears contains a Details button. Clicking the button opens a log file
with the exact reason for the failure. If the reason is due to an old installation, see the next
section for more details.

The Installation States that an Old Installation is Interfering with the Installation
To resolve this problem:
1. Go to:


2. Select Clean installation cache and click OK.

3. Try to open the application again.

The .Net Framework Installation Fails


There can be various reasons for this:
• The Background Intelligent Transfer Service (BITS) is not running. Start the service
from the Services window.
• A more common case is that the user is not a local administrator. Consult your IT
department.

The Application Starts but Fails with a ‘Could not initialize’ Message
This means that a fatal error occurred during the application startup. More information can be
found in the application log file.

The Application Starts but Fails with an ‘Error Installing application’ Message
Verify with IT that you are authorized to write to the c:\Netrac folder, or make sure that the
user is a local administrator.

The Application Starts but Some Operations are not Available


FM client applications are subject to role permissions. Users may have security restrictions
which prevent them from performing certain operations. Check that the proper roles were
assigned to the user group in the TEOCO Admin.

Drop-down and Context Menus are Displayed Behind the Main Window
Due to known Microsoft issues in WPF applications, in rare cases of overlapping windows,
the drop-down and context menus are displayed behind the client's main window. For more
details and scenarios, refer to: http://support.microsoft.com/kb/943326.

Cruiser Shows ‘Disconnected’ Status


FamProxy or FamEngine is down, or they have disconnected from one another. Check the
server logs for details.

Delay in Display/Update of the Alarms


1. Check the server queue statistics files for queue build-up.
2. Check if the server processes suffer from memory starvation.
3. Check the network performance.

Statistics
FM Module Statistics
FM Module Stats Log
Each FM module that processes either active or history alarms at runtime (that is, the FaM
Engine, FaM History, and FaM Proxy modules) writes a statistics log that periodically prints its
internal alarm processing state. This feature is active by default, and all statistics data is
written to a cyclic log file in the module’s EAR logs directory.
The stats file names for the 3 modules are: FamEngineQueueStats.log,
FamHistoryQueueStats.log, and FamProxyQueueStats.log.
The statistics are written at a 5-minute interval by default, which can be changed via the
relevant system property for each module:
• FamEngine module system property: “fam.engine.chain.stats.interval”.
• FamHistory module system property: “fam.history.chain.stats.interval”.
• FamProxy module system property: “fam.proxy.chain.stats.interval”.

Stats Log Common Structure


The statistics log file is an XML style log with a main chain element representing FaM
Engine’s alarm processing chain and all its elements that perform the various alarm
processing.
The Chain element contains the following attributes:
• id: AlarmProcessingChainID’s fixed value.
• startTime: date and time when the chain started running.
• time: date and time of the current statistics snapshot.

The Component element contains the following major data:
• name attribute: chain component’s unique name (for example, FamEventBuilder,
CommandProcessing, AlarmDataEnricher, NetworkEventHandler, AlarmFetcher, or
CommandExecution)
• MsgReceived element: how many messages have been received since chain start
time.
• MsgProcessed element: how many messages have been successfully processed
since chain start time.
• QueueSize element: how many messages are queued and waiting to be processed.
• AvgProcessTime element: the component’s average processing time.
• CustomData element: a free text element with component specific stats.
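
To follow queue sizes over time, a snapshot can be read with a standard XML parser. The Python sketch below is based on assumptions: the exact layout of the stats log is not reproduced in this guide, so the sample snapshot is a guess built only from the element and attribute names listed above (Chain, Component, name, MsgReceived, MsgProcessed, QueueSize). Adjust it to the real file before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshot: nesting and values are illustrative assumptions.
SNAPSHOT = """
<Chain id="AlarmProcessingChainID" startTime="2024-01-01 00:00:00"
       time="2024-01-01 00:05:00">
  <Component name="NetworkEventHandler">
    <MsgReceived>1200</MsgReceived>
    <MsgProcessed>1200</MsgProcessed>
    <QueueSize>0</QueueSize>
  </Component>
  <Component name="AlarmFetcher">
    <MsgReceived>1200</MsgReceived>
    <MsgProcessed>700</MsgProcessed>
    <QueueSize>500</QueueSize>
  </Component>
</Chain>
"""


def queue_sizes(snapshot_xml):
    """Return {component name: queue size} for one statistics snapshot."""
    chain = ET.fromstring(snapshot_xml)
    sizes = {}
    for component in chain.iter("Component"):
        # A missing QueueSize element is treated as an empty queue.
        size_text = component.findtext("QueueSize", default="0")
        sizes[component.get("name")] = int(size_text)
    return sizes


sizes = queue_sizes(SNAPSHOT)
print(sizes)
```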

Using the Statistics File to Detect Processing Problems


Normally, when alarm processing is working properly, message queues are not expected to
build up. Under certain conditions, such as alarm floods or low resource availability (low
memory or a slow DB connection), message queues can build up. When such a queue is
detected in one component, you can take the following steps:
1. Check FaM Engine’s logs for relevant errors.
2. If the queue is in FamEngine’s AlarmFetcher or DistributedCache components (in the
QueuedCommands element), there might be distributed cache related problems.
3. Check EAR memory to detect “out of memory” issues.
4. Calculate the message rate (going a few snapshots back) to detect an alarm flood.
5. Check the queue state (by comparing it to former snapshots) to see whether it is currently
increasing or decreasing.
FM History module queues in any of the persistency components usually indicate
performance or unavailability problems related to the history_db DB schema.
As a rule of thumb, a queue that stays larger than several hundred messages for more than
one consecutive snapshot can indicate a processing problem. A queue with a few dozen
messages is quite normal and does not necessarily indicate any problem.
Most statistics in an Engine’s FamEngineQueueStats are local; for example, they count only
the events processed by that FamEngine. Statistics related to the amount of data in the
Distributed Cache depict the global amount of data across all Engines.
To estimate the total amounts of data or event rates, statistics from all FamEngine servers
should be collected.
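
The checks above (rule of thumb, message rate, and queue direction) can be sketched in Python as follows. This is a hedged illustration: the function names are hypothetical, the threshold of 300 is only one possible reading of "several hundred messages", and the 300-second interval is the default 5-minute snapshot interval. The inputs are lists of values taken from consecutive snapshots, oldest first.

```python
def sustained_queue(queue_sizes, threshold=300, min_snapshots=2):
    """True if the latest consecutive snapshots all exceed the threshold."""
    recent = queue_sizes[-min_snapshots:]
    return len(recent) == min_snapshots and all(s > threshold for s in recent)


def message_rate(received_counts, interval_s=300):
    """Average messages per second between the first and last snapshot."""
    if len(received_counts) < 2:
        return 0.0
    elapsed = interval_s * (len(received_counts) - 1)
    return (received_counts[-1] - received_counts[0]) / elapsed


def queue_trend(queue_sizes):
    """Compare the latest snapshot with the previous one."""
    if len(queue_sizes) < 2:
        return "unknown"
    delta = queue_sizes[-1] - queue_sizes[-2]
    if delta > 0:
        return "increasing"
    if delta < 0:
        return "decreasing"
    return "stable"


history = [10, 450, 600]                       # QueueSize per snapshot
print(sustained_queue(history), queue_trend(history))
print(message_rate([1000, 4000, 7000]))        # MsgReceived per snapshot
```

Because most counters are local to one FamEngine, the same calculation would need to be repeated per server and the rates summed to estimate a system-wide figure.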

Main FaM Engine Stats File Components


• NetworkEventHandler: shows how many alarm messages have been received
from all sources, either from the network (Mediation) or from various FM clients.
The TotalNetworkMessages element shows how many of these events specifically
arrived from the network.


• RejectRules: shows how many reject rules are currently active and how many
alarms were rejected. The UnRejectedAlarms element indicates the number of alarms
that were once rejected and currently are not (due to either reject rules or alarm data
change).
• AlarmFetcher: handles distributed cache alarm fetching. As this component works
asynchronously, its internal queue is stated in the QueuedCommands element.
• EnrichmentRules: shows how many enrichment rules are deployed and how many
alarms were enriched by them. When queues are observed at this point, they might be
related to a slow enrichment rule using a Mediation Lookup.
• DistributedCache: updates alarms in distributed caches and shows the cache sizes
of alarms and alarm related data (such as work-logs and TTs). As this component
works asynchronously, its internal queue is stated in the QueuedCommands
element. This component also contains the AlarmsPersistencyQueue element, which
shows whether there is any DB persistency queue related to the FAM_DB.active_alarms
table. Such a queue can indicate a DB performance or availability problem.
• EventPublish: sends alarm events from the FaM Engine to various FaM Proxy
instances. Growing queues might be related to an event flood, memory, or
networking issues.
• HistoryAlarmsPublisher: publishes events to the FM History module for DB
persistency. Queues can grow here due to memory or networking issues affecting
connectivity to the Kafka brokers.

Main FaM History Stats File Components


• HistoryDistributor: shows how many history alarm and history event messages have
been received from all FaM Engine servers before they are distributed to the various
persistency components.
• <XXX>Persistence:<YYY>: all these components are responsible for the persistency of a
specific history alarm event type, such as work-logs, trouble-tickets, defers, and repeats.
When such a component has a large queue, it indicates either a flood of events or a history
DB performance problem in the corresponding history table.
• HistoryAlarmsPersistence: performs history alarms persistency and updates each
alarm change in the history alarms related DB tables. Queues here indicate a problem in
the history_db tables NEW_HIST_MAIN, HIST_MAIN_PROJECT, or both.

Main FaM Proxy Stats File Components


• EventsDistributer: indicates how many active alarm events have been received from
all FaM Engines and each event type count.
• CacheHandler: indicates alarms cache size and handles Cruiser fetch & subscribe
requests.
• AlarmEventDispatcher: sends all the events to FaM Proxy subscriber manager that
publishes them to all the subscribed clients (Cruiser instances and other applications
relying on alarm events, such as FamApi, ServiceImpact, and Correlator ES). Statistics
for the component also indicate how many FaM Proxy clients are currently subscribed,
total pending events, and top 10 subscribers with pending event queues.


Client Performance Considerations


Due to the distributed nature of the application, it is vulnerable to traffic congestion. As a
result, performance may deteriorate, especially in high load scenarios.
Other factors that may affect performance are:
• Number of concurrent open views. Although the alarms displayed in different views are
not replicated (the same data cache is used), managing views and applying folder
criteria may be CPU-intensive.
• Folder criteria complexity: number of attributes.
• High load.
• High alarm rate.
For FM History, we do not recommend running queries with large-scale time criteria (for
example, ‘all alarms for the last year’).
The number of open queries is more critical for the FM History client than for the Cruiser client
because each view holds distinct alarms. For example, opening 5 queries with 10000 alarms
each adds up to 50000 alarms in the application.


Appendix A: Active Alarm Attributes


Attribute Label | Description | Attribute ID
Eqp IP Address | CLLI/IP address of the alarmed equipment | AccessID
Alarm Clearing Type | Alarm clearing type (Automatic/Manual) | AccessType
Ack User Login Name | The login name of the user that performed the acknowledge action | AckUserLoginName
Ack User Name | The full name of the user that performed the acknowledge action | AckUserName
External Help Code | Link to external help file | ActionCode
Additional Details | Additional details | Additional details
Additional Info 1 | Additional information 1 | AdditionalInfo
Additional Info 2 | Additional information 2 | AdditionalInfo2
Additional Info 3 | Additional information 3 | AdditionalInfo3
Additional Info 4 | Additional information 4 | AdditionalInfo4
Probable Cause Description | Probable cause description (populated by the library) | AdditionalText
Cleared | Indicates that the alarm was cleared | AlarmCleared
Alarmed Object Entity | Type of the alarmed object | Alarmed Object Entity
Alarmed Object Vendor | Alarmed object vendor | Alarmed Object Vendor
Alarmed Object Model | Model of the alarmed object | Alarmed ObjectModel
Description | Alarm description text | AlarmText
Area | The geographical area of the alarm's origin | Area
Business Value | Business value of the alarmed object | BusinessValue
Clear User Login Name | The login name of the user that performed the clear action | ClearUserLoginName
Clear User Name | The full name of the user that performed the clear action | ClearUserName
Clear Reason | The reason for clearing the alarm | ClearReason


Cluster Quality | Indicates the quality of the MLRCA cluster | ClusterQuality
Confidence | Confidence probability of the P/C relation in % (TRS) | Confidence
P/C | Correlation type (P-Parent, C-Child) | CorrelationStatus
Counter Set Name | Counter set name | Counter Set Name
Eqp Time | Datetime the alarm was raised as reported by the equipment | Date2
Additional Date 1 | Additional date 1 | Date3
Time Down | Datetime the alarm was cleared | DateTimeDown
Time Up | Datetime the alarm was raised | DateTimeUp
Defer End Time | The time the alarm will be undeferred | DeferEndTime
Defer Start Time | The time the alarm was deferred | DeferTime
Alarmed Object Name | The complete path of the alarmed object entity | DeviceName
Alarmed Object Type | The type of the alarmed object entity | DeviceType
District | The geographical district of the alarm origin | District
Domain | Domain | Domain
Element Status | Element operational status | ElementStatus
Eqp Name | The specific equipment instance that generated the alarm | EquipmentName
Eqp Identifier | AID/NEID | EquipmentName2
Ancestor Object ID | Object identifier of the specific equipment that generated the alarm (ancestor) | EquipmentNumber
Eqp Type | Equipment type | EquipmentType
Prioritize Time | Datetime when the alarm priority was raised | EscalationTime
Original Priority | The priority of the alarm before raising the priority | EscalOriginalPriority


First Ack Date | Datetime the alarm was acknowledged for the first time | FirstAckDate
From Site | Site containing the alarmed object | FromSite
Importance | Service Priority | Importance
Inhibit | Indicates that the alarm is not visible to the users in Active Alarm view | Inhibit
Ack | Indicates that the alarm has been acknowledged | IsAck
Automatic Correlation | Indicates if the alarm was correlated by a correlation system | IsAutomaticCorrelation
Is Cleared Manually | Indicates that the alarm was cleared manually | IsClearedManually
Confirmed Correlation | Indicates if the correlation association with the child is confirmed | IsConfirmedCorrelation
Defer Status | Defer status | IsDeferred
Derived | Indicates that the alarm was raised by a correlation system to be a parent of other alarms | IsDerived
Prioritized | Indicates that the alarm priority was raised | IsEscal
Golden Site | Alarm received from a golden site | IsGoldenSite
Additional Boolean 1 | Additional Boolean 1 | IsMaintenance
TT Association | Indicates if alarm has a new Trouble Ticket, appended Trouble Ticket, both, or multiple appended Trouble Ticket | IsTT
Work Log | Indicates that the alarm has at least one Work Log | IsWLog
Keyword | Alarm name | Keyword
KPI Name | KPI name | KPIName
KPI Value | KPI value | KPIValue
Alarm Last Action | Last action that was performed on the alarm | LastChangeAction


Alarm Last Update Time | Datetime of the last update performed on the alarm | LastChangeTime
Last Change User Login Name | The login name of the user who initiated the last alarm change | LastChangeUserLoginName
Last Change User Name | The full name of the user that initiated the last alarm change | LastChangeUserName
Last Toggle Time | Datetime of the last toggled-up or toggled-down event | LastToggleTime
Last Child Change Time | The last time a child alarm was added or removed to the alarm | LastUpdate
Logic ID | Alarm identifier | LogicID
Additional Info 5 | Additional Information 5 | MaintenanceRegion
Module Name | MIB name for SNMP alarms or library name for others | ModuleName
Module Type | Concatenation of 'Vendor' and 'Equipment Type' | ModuleType
Number of Child Alarms | Number of child alarms | NumChildren
Object ID | Object identifier (used with 'Object Type' as the alarming object identifier) | ObjectID
Object Type | Object Type identifier (used with 'Object ID' as the alarmed object identifier) | ObjectType
P/C Status | Basic Correlation status - Parent, Child, Intermediate, or Orphan | PCStatus
Additional Info 6 | Additional Information 6 | PlannedWork
Prediction Level | The prediction object: site, equipment or alarmed object | PredictionLevel
Predicted Alarm | Indicates the alarm is predicted | PredictionAlarm
Prediction Likelihood (%) | The probability of the alarm to be realized | PredictionLikelihood
Prediction Precision (%) | The prediction estimated accuracy | PredictionPrecision
Prediction Recall (%) | The prediction estimated coverage | PredictionRecall
Prediction Median Time | The average datetime when the predicted alarm is expected to raise | PredictionAvgTime


Prediction Max Time The maximum datetime when the predicted PredictionMaxTime
alarm is expected to raise

Priority Alarm priority that can range from 1-9. 1 is Priority


the lowest priority indicating lower alarm
severity and 9 is highest priority indicating
critical alarms.

Probable Cause Code Probable cause code ProbableCause

Probable Cause FM Probable cause description (populated by ProbableCauseName


Description FM)

Proposed Repair Action Proposed repair action ProposedRepairAction

Repeated Count Counter of the repeated instances of the RepeatedCount


alarm

Repeated Time Datetime of the last repeated alarm RepeatedTime


instance

Service Affecting Indicates that the alarm causes service ServiceAffect


affection

Service Name Service name ServiceName

Severity Indicates the level of severity of the alarm Severity

Site Category Site category SiteCategory

Site ID Site identifier SiteID

Site Latitude Site latitude SiteLatitude

Site Longitude Site longitude SiteLongitude

Site Name Site name SiteName

Region ID Region identifier SiteRegionID

SPAM Status SPAM status SpamStatus

Additional Int 1 Additional integer 1 SpecificProblem

Collection Time Resolution KPI granularity TimeResolution

Toggle Count Toggle up/down flip count ToggleCount

Toggling State Specifies whether the alarm is toggled up, toggled down or not toggled ToggleStatus


Reporting Element The reporting object that generated the alarm Topology

To Site Site connected with the 'From Site' ToSite

Trend Indication TrendIndication

TT ID Trouble Ticket Identifier TT ID

TT Description Trouble Ticket description TTDescription

TT Error Message Indicates that an error message was received during the last trouble ticket operation TTErrorMessage

TT Is Open Indicates whether the Trouble Ticket is open or closed TTIsOpen

TT Last Update Datetime of the last status update of the Trouble Ticket TTLastUpdate

TT Status Trouble Ticket status TTStatus

TT User Name The user that performed the last TT action TTUser

TT User Login Name The login of the user that performed the last Trouble Ticket action TTUserLoginName

Alarm Type Alarm category Type

Unmanaged Object Alarmed object that is not managed in the Base Configuration module Unmanaged Object

Eqp Vendor The equipment vendor Vendor

Work Log Count Counter of the created Work Logs WorklogCount

Last Work Log Date The time the last Work Log was created WorkLogDate

Last Work Log Entry Last Work Log text WorkLogText

Last Work Log Type Name Last Work Log type name WorkLogTypeName

Last Work Log User The full name of the user that added the last Work Log WorkLogUser

Last Work Log User Login Name The login name of the user that added the last Work Log WorkLogUserLoginName

Work Order Existing Work Order number WorkOrder


Trend Analytic Indicates the last instance trend analytic score Trend

Anomaly Analytic Indicates the last instance anomaly analytic score Anomaly

Ancestor Object ID Z The identifier of the equipment connected to the equipment that generated the alarm AncestorObjectIDZ

Eqp Name Z The name of the equipment connected to the equipment that generated the alarm EqpNameZ

Customer The customer related to the equipment that generated the alarm Customer

Maintenance Status The occurrence time (Current, Future, or Past) of the related maintenance activity MaintenanceStatus

Maintenance Name The names of the related maintenance activities MaintenanceName

Maintenance Start Datetime The start date and time of the earliest related maintenance job MaintenanceStartDatetime

Maintenance End Datetime The end date and time of the latest related maintenance job MaintenanceEndDatetime

RCA Score The calculated root-cause score, as defined by the A-RCA algorithm RCAScore


Appendix B: History Alarm Attributes


Attribute Label Description Attribute ID

TT Association Type Trouble Ticket - active alarm association type (CREATE/APPEND) TTAssociationType

Multiple TT Indicates whether the alarm is associated with multiple trouble tickets MultipleTT

TT is Assigned TTAssigned

TT is Appended The alarm is appended to a TT TTAppended

TT Create/Append TT operation: create/append TTFunction

Is Parent The alarm is a Parent IsParent

Duplication Status Toggling, repeating or normal alarm information TogRepStatus

Seconds Duration Duration in seconds SecondsDuration

Hours Duration Duration in hours HoursDuration

Month Duration Duration in months MonthDuration

Was Deferred The alarm was deferred WasDeferred

Was Acknowledged The alarm was acknowledged WasAcknowledged

Was Toggling The alarm was in Toggling state WasToggling

Was SPAM The alarm was marked as SPAM WasSpam

Was Premium The alarm was marked as PremiumRel WasNonSpam

Was Parent The alarm was a Parent WasParent

Was Child The alarm was a Child WasChild

Was Orphan The alarm was an Orphan WasOrphan

TT Was Assigned TT was Assigned TTWasAssigned

TT Was Appended The alarm was appended to a TT TTWasAppended

TT Was Disconnected The alarm was disconnected from the TT TTWasDisconnected

TT Was Multiple The alarm was connected to multiple TTs TTWasMultiple


Appendix C: Project Active Alarm Attributes


All project attributes are predefined, hardcoded, and disabled by default.
To enable specific attributes, uncomment them in the ProjectActiveAlarm.xml metadata file
and in the ProjectHistoryAlarm.xml file, providing a custom label for each.
Project attributes are part of both the Active and History alarm models.

Attribute ID Type and Length Amount

Proj_Varchar_1024_1 Varchar 1024 1 field

Proj_Varchar_512_1 to Proj_Varchar_512_9 Varchar 512 9 fields

Proj_Varchar_255_1 to Proj_Varchar_255_70 Varchar 255 70 fields

Proj_Datetime_1 to Proj_Datetime_5 Datetime 5 fields

Proj_Int_1 to Proj_Int_15 Int 15 fields

Proj_Double_1 to Proj_Double_5 Double 5 fields
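
The table above gives only the first and last ID of each attribute family. The following is a minimal Python sketch that enumerates every project attribute ID, assuming the IDs in between are numbered sequentially from 1 (the guide implies this but does not state it). The names `PROJECT_ATTRIBUTE_FAMILIES` and `project_attribute_ids` are illustrative only and are not part of the product.

```python
# Families of project attributes as listed in the table above:
# (ID prefix, number of fields). Sequential numbering is an assumption.
PROJECT_ATTRIBUTE_FAMILIES = [
    ("Proj_Varchar_1024", 1),
    ("Proj_Varchar_512", 9),
    ("Proj_Varchar_255", 70),
    ("Proj_Datetime", 5),
    ("Proj_Int", 15),
    ("Proj_Double", 5),
]


def project_attribute_ids():
    """Return every project attribute ID, numbered from 1 within each family."""
    return [
        f"{prefix}_{index}"
        for prefix, count in PROJECT_ATTRIBUTE_FAMILIES
        for index in range(1, count + 1)
    ]


ids = project_attribute_ids()
print(len(ids))          # total number of project attributes
print(ids[0], ids[-1])   # first and last generated ID
```

A list like this can help when checking which entries to uncomment in the metadata files.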


Appendix D: Modules Configurable Properties


The following sections summarize the configurable properties of the FM-related modules.
To change a property value, edit the appropriate jcore_cfg.xml file and refresh the configuration.
Refer to the Helix Admin Guide for more details.
Make sure you understand the implications of your changes, and consult TEOCO S&D
when relevant.

FamAdmin
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
reload.topology.mode | string | Yes | Scheduler | Scheduler (topology is reloaded at a scheduled hour); Signal (topology is reloaded on a signal received from NetImport) | Reload TRS topology mode: the method by which the topology is reloaded
scheduler.reloadTopologyHour | int | Yes | 0 | | Topology reload hour: 0-23
scheduler.reloadTopologyRetryInterval | int | Yes | 15 | | Retry interval for topology reloading (minutes)
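
As an illustration of how the Scheduler mode values could combine, here is a minimal Python sketch. It assumes the reload runs at the top of the configured hour and that a failed reload is retried after the configured number of minutes; the guide does not describe the exact scheduling logic, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def next_scheduled_reload(now: datetime, reload_hour: int = 0) -> datetime:
    """Next occurrence of the reload hour (assumed: minute 0 of that hour)."""
    if not 0 <= reload_hour <= 23:
        raise ValueError("reload hour must be in the range 0-23")
    candidate = now.replace(hour=reload_hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # The configured hour has already passed today, so use tomorrow.
        candidate += timedelta(days=1)
    return candidate


def next_retry(failed_at: datetime, retry_interval_minutes: int = 15) -> datetime:
    """Time of the next retry, assumed to be counted from the failure time."""
    return failed_at + timedelta(minutes=retry_interval_minutes)


now = datetime(2024, 1, 1, 10, 30)
print(next_scheduled_reload(now, reload_hour=0))
print(next_retry(now))
```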

FamEngine
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
AlarmHandler.socket.receive.buffer.size | int | No | 8192 | | Socket receive buffer size (in bytes)
AlarmHandlerPort | int | No | 5902 | | Listening port of Alarm Handler
HostName | string | No | | | Host name of Alarm Handler
kafka.alarmsTopic | string | Yes | FmEngine | | Kafka Alarms MAIN topic
kafka.producer.maxRequestSize | int | No | 10485760 | | Kafka producer max request size in bytes
RunHistoryMigrationEnrichment | boolean | No | true | | Run severity enrichment during history migration
action.rule.association.copy.ack.state.behavior | int | No | 1 | 0 (Don't copy); 1 (Copy); 2 (Copy only if alarm was associated to a TT) | Indicates if and how acknowledge is performed on Action Rule association
action.rule.association.copy.prev.instance.ack.username | boolean | No | false | | Indicates whether the alarm should take the user name of the user who acknowledged the previous instance
add.tt.activity.to.appended.tt | boolean | No | true | | Indicates whether to add TT activity to the appended TT on Worklog creation
alarm.autoAckPC | boolean | No | true | | Auto ACK when connecting Parent and Child
alarm.autoAckTT | boolean | No | true | | Auto ACK when creating/appending Trouble Tickets
alarm.autoAckTTSucceed | boolean | No | true | | Auto ACK when a create/append Trouble Ticket request succeeds
alarm.autoAckWL | boolean | No | true | | Auto ACK when creating a Worklog
alarm.bereavedTaskIntervalSeconds | int | No | 30 | | Period of the task that marks Child alarms as Orphans
alarm.canDropDerived | boolean | No | 1022;false | | A Derived alarm can be dropped like a Normal alarm
alarm.correlationTaskIntervalSeconds | int | No | 30 | | Period of the task that unmarks alarms for correlation
alarm.correlationUnmarkMinutes | int | No | 1026;1 | | 0: do not unmark marked alarms; n: unmark after n minutes
alarm.derivedTimeoutMinutes | int | No | 1019;3 | | Automatically drop derived alarms with no children after n minutes
alarm.downRemoveIntervalSeconds | int | No | 30 | | Period of the task that checks pending Remove commands
alarm.downToRemoveSeconds | int | No | 1034;0 | | n: seconds to wait before clearing; 0: feature off
alarm.orphanTimeoutSeconds | int | No | 1012;30 | | Seconds needed for a Child alarm whose parent went down to become an Orphan. Alarm Handler
alarm.toggleClearFromRuleAsManual | boolean | No | false | | AlarmDown from rules forces a Clear even when the alarm is toggling, as with a manual AlarmDown
alarm.toggleDepth | int | Yes | 1002;3 | | Number of consecutive rises for an alarm to become Toggling
alarm.toggleHistoryLimit | int | No | 2 | | Number of preceding Up and Down flips stored in history for every toggling alarm
alarm.toggleIsUpdateTimeDown | boolean | No | false | | On toggle down, whether to update DateTimeDown
alarm.toggleOffMinutes | int | Yes | 1005;10 | | Minutes range for a toggling alarm to become not toggling
alarm.toggleOnMinutes | int | Yes | 1004;15 | | Minutes range analyzed by the process to determine that an alarm is toggling
alarm.toggleTimeoutIntervalSeconds | int | No | 30 | | Period of the task that checks for idle toggling alarms and untoggles them
alarm.toggleTrackingCleanupIntervalSeconds | int | No | 120 | | Period of the maintenance task that cleans up the toggling and repeat mechanism cache
alarm.trackers.scheduler.frequency | int | No | 30 | | Frequency of tracker tasks performing delayed actions such as undefer, timeout, escalation, and so on
alarm.unackChildAlarms | int | No | 1080;1 | | Unack child alarms when the parent disconnects: 1 (yes), 0 (no)
analytics.serviceAcquire.retry.minutes | int | No | 1 | | Time before retrying to reacquire the FamAnalytics service on failure
anti.spam.global.disable | boolean | No | true | | Anti-spam global (overriding) DISABLE flag
anti.spam.perform.non.spam.action.on.correlation | boolean | Yes | true | | Perform non-spam action on correlated parent alarm
anti.spam.perform.non.spam.action.on.tt | boolean | Yes | true | | Perform non-spam action on trouble ticket
anti.spam.perform.non.spam.action.on.worklog | boolean | Yes | true | | Perform non-spam action on work log
anti.spam.remove.spam.indication.on.ack | boolean | Yes | true | | Remove the spam indication after an alarm has been acknowledged
anti.spam.system.query.daily.execution.hour | string | Yes | 01:00 | | Anti-spam system query daily execution hour (0-23)
anti.spam.system.query.execution.days | string | Yes | 1,2,3,4,5,6,7 | | Anti-spam system query execution days
anti.spam.system.query.execution.enabled | boolean | Yes | true | | Anti-spam system query execution flag
anti.spam.system.query.execution.timeout | int | Yes | 9600 | | Anti-spam system query execution timeout (seconds)
anti.spam.table.space.limit.feature.enable | boolean | Yes | true | | Anti-spam feature that checks whether the spam query result size is larger than the threshold; if it is, the Screener feature is disabled
anti.spam.table.space.threshold | int | Yes | 2000000 | | Threshold for the anti-spam table space limit feature: when the number of results is higher than the threshold, the anti-spam Screener feature is disabled
append.child.to.parent.tt | int | Yes | 1061;1 | 1 (Append); 2 (Do not append); 3 (Append without sending) | Indicates whether to append child alarms to a TT that their parent is creating
append.child.to.parent.tt.on.pc.relation | boolean | Yes | 1065;false | | Indicates whether to append a newly created child to the parent's TT
append.tt.on.duplicate.tt | boolean | Yes | 1064;true | | Indicates whether to append the alarm to an existing TT
application.displayName | string | Yes | FM Engine | | FM Engine application display name
auto.ack.with.user.context | boolean | Yes | false | | Perform auto acknowledge after WL/TT/PC commands with user context
bc.net.import.clear.alarms.Enabled | boolean | No | false | | Clear alarms due to absence in BC
bc.net.import.entities | string | No | RESOURCE-GROUP,APPLICATION,SUBSCRIBER-DEVICE,CUSTOMER,SERVICE,FACILITY,IP,INTRF,SECTOR,CARD,NE | | Names of the entities relevant to FE
cache.persistency.queues.stats.interval | int | No | 30 | | Cache persistency queues statistics fetch interval
cache.persistency.queues.stats.interval.during.flood | int | No | 15 | | Cache persistency queues statistics fetch interval during flood
disable.vendor.and.equipment.type.enrichment | boolean | No | false | | When set to true, the automatic enrichment of the Vendor and Equipment attributes of alarms is prevented
engine.eventsStoreInterval.seconds | int | No | 10 | | Interval for storing event count per partition in cache
engine.failover.frozenEvents.count | int | No | 4 | | Number of times partition events count may be frozen when polling it
engine.failover.frozenHeartbeat.count | int | No | 3 | | Number of times another engine's heartbeat may be frozen when polling it
engine.failover.idleNoHeartbeat.seconds | int | No | 60 | | Time to wait for another engine to report heartbeat
engine.failover.waitForAvailability.seconds | int | No | 10 | | Seconds to wait until another Engine notifies about a partition availability
engine.heartbeatInterval.seconds | int | No | 10 | | Interval for heartbeat timestamps of every FamEngine (stored in distributed cache)
engine.support.multiple.locks.in.chain | boolean | Yes | false | | Specifies whether multiple locks in a chain are supported
fam.engine.chain.stats.interval | int | Yes | 5 | | Time interval for queue statistics gathering
fam.engine.enrichment.site.alarmSiteIDField | string | Yes | EquipmentNumber | | The field in the alarm class that holds the site/eqp ID value
fam.engine.enrichment.site.cache.refresh.enable | boolean | Yes | true | | Whether site data cache refresh is enabled
fam.engine.enrichment.site.cache.refresh.interval | int | Yes | 0 | | The time between cache refresh cycles, in seconds
fam.engine.enrichment.site.chain.max.bulk.data.size | int | No | 10 | | Maximum size of aggregated events
fam.engine.enrichment.site.chain.max.data.aggregate.time | int | No | 150 | | Maximum time in milliseconds for event aggregation
fam.engine.enrichment.site.topologyType | string | Yes | EQUIPMENT | SITE (Site); EQUIPMENT (Equipment) | The type of topology data that the alarmSiteIDField holds
fam.events.publish.aggregate.time | int | No | 300 | | Maximum time in milliseconds for history event aggregation
fam.events.publish.max.bulk.size | int | No | 1000 | | Maximum amount of aggregated history events
flood.block.bench_sim.events | boolean | No | false | | Block BenchSim simulator events on severe flood
flood.major.threshold | int | No | 200000 | | Major flood threshold
flood.minor.threshold | int | No | 100000 | | Minor flood threshold
flood.severe.threshold | int | No | 300000 | | Severe flood threshold
flood.watchtask.interval | int | No | 5 | | Flood watch-task interval
history.Enabled | boolean | No | true | | Enables sending data to the History Module
history.alarm.db.data.thinout | int | No | 1094;30 | | Indicates how many days history alarm data is kept in the DB
history.events.publish.aggregate.time | int | No | 300 | | Maximum time in milliseconds for history event aggregation
history.events.publish.max.bulk.size | int | No | 1000 | | Maximum amount of aggregated history events
manager.watchtask.interval | int | Yes | 30 | | Watchdog interval
mediation.command.default.user.name | string | No | MED | | Mediation Commands default user name
nci.automation.user.login.name | string | Yes | nsa | | User login name used to perform NCI commands in Action Rules
notification.rule.sender.email | string | No | admin@teoco.com | | Email address of the mail sender in notification rules
notification.rule.sender.name | string | No | System Admin | | Name of the mail/SMS sender in notification rules
rejected.alarms.persistency.aggregate.time | int | No | 300 | | Maximum time in milliseconds for rejected alarms aggregation
rejected.alarms.persistency.max.bulk.size | int | No | 1000 | | Maximum amount of aggregated rejected alarms
rr.migration.enabled | boolean | No | true | | Indicates whether Raise Rules Migration is enabled
site.data.enrichment.enabled | boolean | Yes | true | | Site data enrichment enabled indication
styleGuideVersion | string | Yes | 1.1 | | The style guide version to be supported
sync.generateRepeatedAlarms | boolean | No | 1078;false | | Defines whether repeated alarms are generated in the sync process
sync.timeout | int | No | 1800 | | Timeout in seconds of the sync operation
sync.timeoutTaskFrequency | int | No | 60 | | Frequency of the sync timeout task
toggleRepeat.bufferingTime | int | No | 1000 | | Maximum delay [ms] before a buffer is cleared
toggleRepeat.bufferingTime.floodMultiplier | int | No | 30 | | Multiplier of buffering time during flood
tt.create.children.tickets | boolean | Yes | false | | Create tickets for children (regardless of appending the main ticket)
tt.create.retry.remove.on.clean.alarm | boolean | Yes | false | | Specifies whether a create TT request that should be retried is cancelled on clear alarm
tt.customLastChange.attribute | boolean | Yes | | | Name of the DateTime attribute to be used for storing the LastUpdateTime of a custom set of attributes
tt.no.association.status.list | string | No | | | TT status list that prevents a TT from being associated
tt.onChange.lastUpdateDate.isIndependentField | boolean | Yes | true | | On TT change, lastUpdateDate may be changed independently of other fields
tt.request.retry.expiration.period | int | Yes | 60 | | TT requests retry expiration period in minutes
tt.request.retry.on.failure | boolean | Yes | false | | Specifies whether to retry failed TT requests
tt.request.retry.period | int | Yes | 60 | | TT requests retry period in seconds
tt.statusLastChange.attribute | string | No | | | Name of the DateTime attribute to be used for storing the LastUpdateTime of TT_STATUS
tt.system.enabled | boolean | No | true | | Indicates whether the TT system is enabled
tt.update.appended | boolean | Yes | false | | Update appended tickets on AlarmUpdate, using APPEND mapping
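
The toggling properties in the FamEngine table (alarm.toggleDepth, alarm.toggleOnMinutes, alarm.toggleOffMinutes) interact with each other. The Python sketch below shows one possible reading of their descriptions: an alarm is treated as toggling once it has risen at least `toggle_depth` times within the last `toggle_on_minutes`, and it stops toggling after `toggle_off_minutes` without a flip. This is an interpretation for illustration only, not the FamEngine algorithm, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def is_toggling(rise_times, now, toggle_depth=3, toggle_on_minutes=15):
    """True if the alarm rose at least toggle_depth times in the analyzed window."""
    window_start = now - timedelta(minutes=toggle_on_minutes)
    recent_rises = [t for t in rise_times if window_start <= t <= now]
    return len(recent_rises) >= toggle_depth


def toggling_expired(last_flip, now, toggle_off_minutes=10):
    """True if the alarm has been idle long enough to stop being toggling."""
    return now - last_flip >= timedelta(minutes=toggle_off_minutes)


now = datetime(2024, 1, 1, 12, 0)
rises = [now - timedelta(minutes=m) for m in (14, 7, 1)]
print(is_toggling(rises, now))                               # three rises inside 15 minutes
print(toggling_expired(now - timedelta(minutes=12), now))    # idle for 12 minutes
```

The defaults used here (3, 15, and 10) are the values after the semicolon in the Default Value column.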

FamHistory
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
fam.history.chain.maxBulkDataSize | int | No | 24000 | | Maximum size of aggregated events
fam.history.chain.maxDataAggregateTime | int | No | 3000 | | Maximum time in milliseconds for event aggregation
fam.history.chain.stats.interval | int | Yes | 5 | | Time interval for queue statistics gathering
fam.history.converter.maxThreads | int | No | 5 | | Maximum number of threads for parallel conversion of Active to History
fam.history.converter.minBatch | int | No | 1000 | | Minimal number of records to split into threads when converting Active to History
fam.history.persistence.batchSize | int | No | 2000 | | Optimized batch size of alarms for DB persistence
fam.history.persistence.maxBatchThreads | int | No | 12 | | Maximum number of threads for parallel batch persistence
flood.major.threshold | int | No | 200000 | | Major flood threshold
flood.minor.threshold | int | No | 100000 | | Minor flood threshold
flood.severe.threshold | int | No | 300000 | | Severe flood threshold
flood.watchtask.interval | int | No | 5 | | Flood watch-task interval
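
The three flood thresholds define escalating levels. The Python sketch below shows how a measured count could be mapped to a level using the default values. The guide does not say which quantity is compared with the thresholds or whether the comparison is inclusive, so both are assumptions here, and `flood_level` is a hypothetical helper.

```python
def flood_level(count, minor=100_000, major=200_000, severe=300_000):
    """Classify a count against the flood thresholds (inclusive comparison assumed)."""
    if not minor <= major <= severe:
        raise ValueError("thresholds must satisfy minor <= major <= severe")
    if count >= severe:
        return "SEVERE"
    if count >= major:
        return "MAJOR"
    if count >= minor:
        return "MINOR"
    return "NONE"


for sample in (50_000, 150_000, 250_000, 350_000):
    print(sample, flood_level(sample))
```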

JFam
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
alarmClass.updateCache.interval | int | Yes | 300 | | AlarmClass cache update interval
alarms.load.conversionBatchSize | int | No | 2000 | | Batch size when using parallel conversion
alarms.load.parallelism | int | No | 0 | | Parallelism for converting alarms from XML during load from DB. 0: no parallelism
cache.alarmQuery.maxThreads | int | No | 10 | | Max threads for alarm queries within Executor
kafka.autoCommitConsumer | boolean | Yes | true | | Kafka Consumer AutoCommit
kafka.autoCommitConsumer.interval | int | | 1000 | | Kafka AutoCommit interval (ms)
kafka.bootstrap.servers | string | Yes | localhost:9092 | | Kafka bootstrap servers list
kafka.famEngineServiceTopic | string | Yes | fmMultiEngineServiceEvents | | Kafka topic for sending commands to ALL
kafka.famEventsTopic | string | Yes | fmProxyFamEvents | | Kafka topic for sending Fam Events from FamEngine to FamProxy
kafka.historyEventsTopic | string | Yes | fmHistoryEvents | | Kafka topic for sending History Events from FamEngine to FamHistory
kafka.polling.timeout | int | Yes | 1000 | | Kafka Consumer polling timeout (ms)
kafka.producer.acks.config | string | Yes | all | | Kafka producer acks config
kafka.producer.batch.size.config | int | Yes | 1048576 | | Kafka producer batch size config
kafka.producer.buffer.memory.config | int | Yes | 33554432 | | Kafka producer buffer memory config
kafka.producer.compression | string | Yes | none | | Kafka producer compression type (none, gzip, snappy, lz4, zstd)
kafka.producer.linger.ms.config | int | Yes | 1 | | Kafka producer linger config in ms
kafka.producer.retries.config | int | Yes | 0 | | Kafka producer retries config
kafka.proxyDHEventsTopic | string | Yes | fmProxyDHEvents | | Kafka topic for sending Dist.Handler Events from FamEngine to FamProxy
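
The kafka.producer.* properties in the JFam table resemble standard Apache Kafka producer settings. The Python sketch below translates them into the standard Kafka configuration keys (`acks`, `batch.size`, `buffer.memory`, `compression.type`, `linger.ms`, `retries`, `bootstrap.servers`). The mapping is inferred from the property names only; the guide does not state how JFam passes these values to the Kafka client, so treat it as an assumption.

```python
# Assumed mapping from JFam property names to standard Kafka producer keys.
JFAM_TO_KAFKA = {
    "kafka.bootstrap.servers": "bootstrap.servers",
    "kafka.producer.acks.config": "acks",
    "kafka.producer.batch.size.config": "batch.size",
    "kafka.producer.buffer.memory.config": "buffer.memory",
    "kafka.producer.compression": "compression.type",
    "kafka.producer.linger.ms.config": "linger.ms",
    "kafka.producer.retries.config": "retries",
}

# Default values as listed in the table above.
JFAM_DEFAULTS = {
    "kafka.bootstrap.servers": "localhost:9092",
    "kafka.producer.acks.config": "all",
    "kafka.producer.batch.size.config": 1048576,
    "kafka.producer.buffer.memory.config": 33554432,
    "kafka.producer.compression": "none",
    "kafka.producer.linger.ms.config": 1,
    "kafka.producer.retries.config": 0,
}


def to_kafka_producer_config(jfam_properties):
    """Translate JFam properties into Kafka producer keys, ignoring unknown ones."""
    return {
        JFAM_TO_KAFKA[name]: value
        for name, value in jfam_properties.items()
        if name in JFAM_TO_KAFKA
    }


for key, value in sorted(to_kafka_producer_config(JFAM_DEFAULTS).items()):
    print(f"{key}={value}")
```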

FamProxy
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
alarm.chain.request.timeout | int | No | 180 | | Timeout in seconds for an alarms chain request to be performed
alarm.service.offspring.history.gap | int | No | 10 | | Time gap of offspring alarm up history, in minutes
alarm.service.offspring.history.timeSpan | int | No | 720 | | Time span of offspring alarm up history, in hours
fam.proxy.chain.stats.interval | int | Yes | 5 | | Time interval for queue statistics gathering
famEngine.failover.retry.seconds | int | No | 30 | | Time to wait before retrying to get FamEngine state after Engine failover
famEngine.failover.retry.seconds | int | No | 30 | | Time in seconds to retry access to the engine activeAlarmBI in case of exception (try to get activeAlarmBI from another engine)
famEngine.retry.access.activeAlarmBI.seconds.between.each.try | int | No | 1 | | Time to wait between each try to access the engine activeAlarmBI in the period of famEngine.retry.access.activeAlarmBI.interval.seconds
flood.handler.enabled | boolean | Yes | true | | Indicates whether flood handling is enabled
flood.handler.mode | string | Yes | DISCONNECT_SUBSCRIBERS | BLOCK_EVENTS (Block Alarm Events); DISCONNECT_SUBSCRIBERS (Disconnect Subscribers) | Indicates the mode in which FamProxy handles floods
flood.major.threshold | int | No | 200000 | | Major flood threshold
flood.minor.threshold | int | No | 100000 | | Minor flood threshold
flood.queued.subscriber.threshold | int | No | 15 | | Queued Subscriber Percentage Threshold
flood.severe.threshold | int | No | 300000 | | Severe flood threshold
flood.watchtask.interval | int | No | 5 | | Flood watch-task interval
history.alarms.fetch.limit | int | No | 10000 | | History alarms fetch limit
manager.watchtask.interval | int | Yes | 30 | | Watchdog interval
preferableMonitoringClient | string | Yes | Cruiser | Cruiser (Cruiser); Light Cruiser (Light Cruiser) | Preferable monitoring client
secondPreferableMonitoringClient | string | Yes | Light Cruiser | Cruiser (Cruiser); Light Cruiser (Light Cruiser) | Second choice preferable monitoring client
ws.client.subscriber.idle.timeout | int | No | 600 | | Idle timeout for the subscription mechanism (secs) used via web service

FamAnalytics
Property Name | Type | Mandatory | Default Value | Allowed Values | Description
anomaly.learning.infoWeight | float | No | 0.5 | | Learning Scoring - Info Weight
learning.retry.minutes.interval | int | Yes | 60 | | Learning retry interval on failure
min.predictive.result.samples | int | Yes | 20 | | Minimum number of samples for trend/anomaly calculation
predictive.result.date.format | string | Yes | yyyy-MM-dd | | Time format for daily predictive results
predictive.result.datetime.format | string | Yes | yyyy-MM-dd HH:mm | | Time format for hourly predictive results
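
The two format defaults look like Java-style date patterns (yyyy-MM-dd and yyyy-MM-dd HH:mm); that reading is an assumption, as the guide does not name the pattern syntax. The Python sketch below shows what timestamps in those two shapes look like, using the equivalent `strftime` directives.

```python
from datetime import datetime

# Python strftime equivalents of the assumed Java-style patterns.
DAILY_FORMAT = "%Y-%m-%d"          # yyyy-MM-dd
HOURLY_FORMAT = "%Y-%m-%d %H:%M"   # yyyy-MM-dd HH:mm

sample = datetime(2024, 3, 5, 14, 0)
print(sample.strftime(DAILY_FORMAT))    # 2024-03-05
print(sample.strftime(HOURLY_FORMAT))   # 2024-03-05 14:00

# Parsing works with the same directives.
parsed = datetime.strptime("2024-03-05 14:00", HOURLY_FORMAT)
print(parsed.hour)                      # 14
```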

WinFam (Cruiser Client)


Property Name | Type | Mandatory | Default Value | Allowed Values | Description
AlarmConnectionsGraphDisplayFieldName | string | No | LogicID | Any alarm attribute name | The name of the alarm field to be displayed in the Correlation Tree graph
AllowDragFolder | boolean | No | true | | Allows moving alarm folders in the Navigation pane
DisableActionsOnClearedAlarms | string | No | All Actions | All Actions (Disable all actions); Only TT Actions (Disable only TT actions); Only Alarm Actions (Disable only alarm actions); No Action (Do not disable any action) | Disable actions on cleared alarms
EnableSiteView | boolean | No | true | | Indicates whether to enable Site View
RefreshAnalyticsRate | int | No | 30 | | Rate, in minutes, at which analytics data is refreshed
RefreshSIDataRate | int | No | 30 | | Rate, in seconds, at which Service Impact data is refreshed
ShowNECommands | boolean | Yes | true | | Determines whether to display NCI NE Commands
SiteViewKPIRefreshRate | int | No | 60 | | The rate for refreshing Site View KPI data, in minutes
SiteViewLinkAggregationThreshold | int | No | 4 | | The number of links between the same 2 nodes from which links should be aggregated
SiteViewName | string | No | Site - for Cruiser | | The name of the site view for Cruiser in SV server
SiteViewRefreshRate | int | No | 60 | | The rate for refreshing Site View, in minutes
SiteViewSetIconByFieldName | string | No | MainFunction | | The name of the attribute in the SV server MD class by which the project icons are set
SiteViewSpecialKeywordValue | string | No | Modem | | The value of MainFunction that indicates that the node should be colored in grey
UseMCJobID | boolean | Yes | true | | Determines whether to display the Job ID or the Job Name of a Maintenance Calendar job
UseServerTime | boolean | No | false | | Indicates whether time zone conversion is supported for alarms
client.ui.enable.effects | boolean | No | true | | Indicates whether to enable client UI effects
dotnet.application.style | string | No | Black | | .Net application style
isInfrastructure | boolean | Yes | false | | Defines whether the module is infrastructure
lookup.retrieveLimit | int | Yes | 1000 | | Lookup retrieve limit
maxCriteriaItems | int | No | 200 | | Max number of items in criteria
maxMCdays | int | No | 14 | More than 0 | Max number of days displayed in the Maintenance Calendar view
netkt.integration.type | int | No | 1 | 0 (Local Netkt integration); 1 (Web Netkt integration) | Netkt integration type
remedy.server.name | string | No | | | Remedy server name
winfam.aggr.alarms.sliding.window.limit | int | No | 1000 | | Maximum count of alarms that fit aggregated folder criteria
winfam.aggr.folders.count.limit | int | No | 20 | | Maximum available count of aggregated folders
winfam.alarms.alarms_update.refresh_rate_seconds | int | No | 3 | | Sleep interval between client alarm updates
winfam.alarms.auto_sync_time_interval_minutes | int | No | 240 | | Time interval (in minutes) between automatic alarm synchronizations between the client and the server. Sync is disabled if this value is less than or equal to 0
winfam.alarms.canDeleteDerived | boolean | No | 1022 | | Can delete derived alarms
winfam.alarms.filtering.threshold1 | int | No | 200000 | | Alarms filtering threshold
winfam.alarms.filtering.threshold2 | int | No | 220000 | | Alarms filtering refresh threshold
winfam.alarms.grid.font | string | No | Tahoma|11 | | Default alarms grid font
winfam.alarms.trouble_ticket.disable_conversion | boolean | Yes | true | | Disables trouble ticket conversion functionality
winfam.alarms.trouble_ticket.id.prefix | string | No | 1296 | | Trouble Ticket identifier prefix
winfam.alarms.trouble_ticket.max_ttid_length | int | No | 1297 | | Maximum Trouble Ticket identifier length
winfam.assocLinksPreviewEnabled | boolean | No | false | | Association links preview enabled flag
winfam.display.alarm.descr.without.new.lines | boolean | No | false | | When true, the description grid field is displayed without new lines and the tooltip with new lines
winfam.display.repeated.time.in.group.title | boolean | No | false | | Group by tag, displays the last repeated time in the group
winfam.dynamic.folder.reload.timeout | int | No | 600 | | Timeout of dynamic folder refresh (in seconds)
winfam.enable.alarms.count | int | No | 1000 | | Perform alarms counting for system folders
winfam.export.pdf.maxrows | int | No | 1000 | | Maximum rows to export to PDF (maximum allowed is 1000)
winfam.filters.useregularexpressions | boolean | No | true | | True to use the regular expressions engine in alarm criteria, False otherwise
winfam.history.views.maximum_number_of_views | int | No | 2 | | The maximum number of query result views that can be open in parallel
winfam.include.endpoints.in.link.alarms | boolean | No | false | | Indicates whether port (endpoint) alarms are included in link alarms
winfam.maps.bing.applicationId | string | No | | | Bing Maps Application ID
winfam.maps.google.channel | string | No | | | Google Maps Channel
winfam.maps.google.clientId | string | No | | | Google Maps Client ID
winfam.maps.tileLayer.name | string | No | Background Map | | The name of the Tile Service used in the maps
winfam.maps.tileLayer.url | string | No | https://api.maptiler.com/maps/streets/256/{z}/{x}/{y}.png?key= | | The URL of the Tile Service used in the maps
winfam.maps.weather.scale | string | No | | | Scale to display weather
winfam.max_alarms_for_tt_multiple_create | int | No | 1 | | Max number of alarms that can be selected for multiple TT creation
winfam.max.memory.needed.google.earth | int | No | 200 | | Memory needed for opening a new Google Earth tab (in MB)
winfam.menu.max_nci_commands | int | No | 25 | | Max number of NCI Commands in Ribbon/RC menu
winfam.notifications.fadeout_duration_seconds | int | No | 6 | |
winfam.notifications.fadeout_timeout_seconds | int | No | 4 | |
winfam.notifications.notifications_template | string | No | ##LogicID## - ##AlarmText## | | Alarms notification popup template
winfam.notifications.sound_file_location | string | No | \\Images\\Notification.wav | | Alarms notification popup template
winfam.print.maxrows | int | No | 1000 | | Maximum rows to print (maximum allowed is 1000)
winfam.selected.alarms.limit | int | No | 3000 | | Maximum count of alarms that can be selected in a folder
winfam.servers.alarm_service_compression | string | No | Gzip | | Alarm Service transport mode
winfam.sitesview.tabs.order | string | No | Alarms,Service Alarms,Service Status,Additional Details | | Comma-separated list that defines the SitesView tabs order
winfam.use.last.named.layout | boolean | No | false | | When true, the application opens with the last used named layout; otherwise, it opens as it was closed.
winfam.views.force.choosing.worklogType | boolean | No | false | | Indicates whether to force selecting a work log type
winfam.views.max_drill_down_folders_opened | int | No | 5 | | Max number of drill-down folders opened concurrently
winfam.views.max_tt_to_retrieve | int | No | 1000 | | Maximum number of TTs to retrieve
winfam.views.maximum_number_of_map_views | int | No | 2 | | Maximum number of map views allowed
winfam.views.maximum_number_of_views | int | No | 15 | | Maximum number of views allowed
winfam.views.navigator.folders.group_folder_owner_display_location | string | No | Label | | Defines where to present the group folder owners (label, tooltip)
winfam.views.preview_column_name | string | No | Alarm Text | |
winfam_grid_search_only_visible_fields | boolean | Yes | true | | If this property is true, the search is done on visible fields only. Otherwise, the search is done on all fields.
winfam_max_supported_alarm_number | int | No | 80000 | | Total max alarms supported by FM client

winfam_max_supported_folder_alarm_number | int | No | 30000 | | Folder max alarms supported by FM client
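
The default value of winfam.maps.tileLayer.url contains {z}, {x} and {y} placeholders. The sketch below shows one way such a template could be filled in, assuming the placeholders follow the common XYZ (Web Mercator) tile convention; the guide does not state this explicitly. The function names lonlat_to_tile and fill_tile_url are hypothetical and are not part of the FM client.

```python
import math

# Default value of winfam.maps.tileLayer.url as listed in the table above.
# The key parameter is empty in the default and must be supplied by the administrator.
TILE_URL_TEMPLATE = "https://api.maptiler.com/maps/streets/256/{z}/{x}/{y}.png?key="


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Convert a longitude/latitude pair to XYZ tile indices.

    Assumption: standard Web Mercator tiling, where zoom level z
    splits the world into 2**z by 2**z tiles.
    """
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid index range so edge coordinates stay on the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def fill_tile_url(template, z, x, y):
    """Replace the {z}, {x} and {y} placeholders with concrete values."""
    return (
        template.replace("{z}", str(z))
        .replace("{x}", str(x))
        .replace("{y}", str(y))
    )


if __name__ == "__main__":
    tile_x, tile_y = lonlat_to_tile(0.0, 0.0, 1)
    print(fill_tile_url(TILE_URL_TEMPLATE, 1, tile_x, tile_y))
```

Plain string replacement is used instead of str.format so that any other braces in a customized URL are left untouched.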

FaMAdminModule (FM Admin Client)


Property Name | Type | Mandatory | Default Value | Allowed Values | Description
dotnet.application.style | string | Yes | Black | | .Net application style
dotnet.jsValidationType | string | Yes | Evaluation | | jsValidationType
isInfrastructure | boolean | Yes | false | | Defines whether the module is infrastructure
lookup.retrieveLimit | int | Yes | 1000 | | Lookup Retrieve Limit

HistoryAnalisysModule (FM History Client)


Property Name | Type | Mandatory | Default Value | Allowed Values | Description
EnableRegenrateAlarm | boolean | No | false | | Enables Regenerate Alarm
UseServerTime | boolean | No | false | | Indicates whether Time-Zone conversion is supported for alarms
isInfrastructure | boolean | No | false | | Defines whether the module is infrastructure
winfam.history.disable_query_edit_for_regular_users | boolean | No | false | | When true, regular users cannot modify existing queries in the FM History application
