ABCs of z/OS System Programming Volume 12
Paul Rogers
Alvaro Salla
ibm.com/redbooks
International Technical Support Organization
January 2010
SG24-7621-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page vii.
This edition applies to Version 1 Release 11 of z/OS (5694-A01) and to subsequent releases and
modifications until otherwise indicated in new editions.
Contents
Notices  vii
Trademarks  viii
Preface  ix
The team who wrote this book  x
Become a published author  x
Comments welcome  x
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not give you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any
manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring
any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs.
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
CICSPlex®, CICS®, DB2 Connect™, DB2®, Domino®, DS8000®, ECKD™, Enterprise Storage Server®, ESCON®, FICON®, Geographically Dispersed Parallel Sysplex™, IBM®, IMS™, IMS/ESA®, Language Environment®, Lotus®, MQSeries®, NetView®, Parallel Sysplex®, PR/SM™, RACF®, Redbooks®, Redbooks (logo)®, S/390®, SOMobjects®, System z10™, System z9®, System z®, Tivoli®, VTAM®, WebSphere®, z/Architecture®, z/OS®, z/VM®, z9®, zSeries®
Novell, the Novell logo, and the N logo are registered trademarks of Novell, Inc. in the United States and other
countries.
SAP R/3, SAP, and SAP logos are trademarks or registered trademarks of SAP AG in Germany and in
several other countries.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.
Installations today process different types of work with different response times. Every
installation wants to make the best use of its resources, maintain the highest possible
throughput, and achieve the best possible system responsiveness. You can realize such
results by using workload management. This IBM® Redbooks® publication introduces you to
the concepts of workload management utilizing Workload Manager (WLM).
Workload Manager allows you to define performance goals and assign a business
importance to each goal. You define the goals for work in business terms, and the system
decides how much resource, such as CPU and storage, should be given to the work. The
system matches resources to the work to meet those goals, and constantly monitors and
adapts processing to meet the goals. Reporting reflects how well the system is doing compared to its goals; installations need to know whether their performance goals are being achieved and what the system is accomplishing in terms of those goals.
Alvaro Salla is an IBM retiree who worked for IBM for more than 30 years, specializing in
large systems. He has co-authored many IBM Redbooks publications and spent many years
teaching about large systems from S/360 to S/390®. He has a degree in Chemical
Engineering from the University of Sao Paulo, Brazil.
Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you
will develop a network of contacts in IBM development labs, and increase your productivity
and marketability.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an e-mail to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
The z/OS component that makes this possible is the Workload Manager (WLM). Among
several functions, WLM implements dynamic workload balancing, which is an intelligent
distribution of the arriving transactions across the z/OS systems.
The idea behind z/OS Workload Manager is to make a “contract” between the installation
(your company) and the operating system. The installation classifies the transactions running
on the z/OS operating system in distinct service classes. Each service class defines goals for
the transaction that express the expectation of how the transaction should perform. WLM
uses these goal definitions to manage the transaction across all systems of a sysplex
environment. Figure 1-1 on page 2 illustrates the concepts of workload management.
Workload management
Before the introduction of MVS workload management, MVS required you to translate your
data processing goals from high-level objectives about what work needs to be done into the
extremely technical terms that the system can understand. This translation not only required highly skilled staff, but could also be protracted, error-prone, and eventually at odds with the original business goals. Multi-system, sysplex, parallel processing, and data sharing
environments add to the complexity. MVS workload management provides a solution for
managing workload distribution, workload balancing, and distributing resources to competing
workloads. MVS workload management is the combined cooperation of various subsystems
(CICS®, IMS/ESA®, JES, APPC, TSO/E, z/OS UNIX System Services, DDF, DB2®, SOM,
LSFM, and Internet Connection Server) with the MVS workload management (WLM)
component.
Performance goals
Installations today process different types of work with different completion and resource
requirements. Every installation wants to make the best use of its resources, maintain the highest possible throughput, and achieve the best possible system responsiveness. Workload
management makes this possible. With workload management, you define performance
goals and assign a business importance to each goal. You define the goals for work in
business terms, and the system decides how much resource, such as CPU and storage,
should be given to it to meet the goal. WLM will constantly monitor the system and adapt
processing to meet the goals.
When you drive a car between two locations, the time it takes is determined by the average
speed that you are allowed to drive, the amount of traffic on the streets, and the traffic
regulations at crossings and intersections. Based on this, you can identify the time as the
basic measure to go from a start to a destination point; you can also see that the amount of
traffic is the determining factor in how fast you can travel between these two points. Although
the driving speed is usually regulated and constant, the number of cars on the street
determines how long you have to wait at intersections and crossings.
As a result, you can identify the crossings, traffic lights, and intersections as the points where
you encounter delays on your way through the city. A traffic management system can use
these considerations to manage the running traffic. For example, dedicated lanes can allow
buses and taxis to pass the waiting traffic during rush hours and therefore travel faster than
passenger vehicles. This is a common model of prioritizing a certain type of vehicle over
others in a traffic system.
Using this example, you see that it is possible to use time as a measure between two points
for driving in a city. You also identified the points where contention may occur, and found a
simple but efficient method of prioritizing and managing the traffic.
Transaction managers
In a system, the identification of a transaction is supported by middleware (transaction managers) and by z/OS; they tell WLM when a new transaction enters the system and when
it leaves the system. WLM provides constructs to separate the transaction into distinct
classes (service classes). These constructs are known as classification rules. Classification
rules allow an installation to deal with different types of transactions running on the system.
Contention occurs when several transactions want to use the same system resource. Resources include CPUs, I/O devices, and storage, as well as software constructs such as processes (dispatchable units) and address spaces. These resources provide the capability to execute programs, and serialization points allow the programs to access resources or serialized control areas while keeping data integrity. WLM monitors a subset of these resources to understand how much of each resource a transaction requires or waits for.
Service classes
The classification rules and the WLM observation of how the transaction uses the resources
provide the base for managing the system through transaction priorities. This management is
performed based on the goals the installation defines for the transaction in the service class.
After classifying the transactions into distinct service classes, the installation associates a
goal with each service class; the goal determines how much service the transactions in the service class are able to receive, which WLM enforces by setting different priority values.
Business importance
In addition to the goal, it is also necessary to define how important it is to achieve one service
class transaction goal when compared with other transaction goals. In the case where several
service classes do not achieve their target goals, the importance will assist in deciding which
service class should be helped in getting resources (in other words, have a priority raise).
[Figure: A sysplex environment with two LPARs, SYSA and SYSB, each running a WLM address space; the WLM address spaces communicate through XCF, and the WLM address space contains the service definition.]
An XCF group is a set of related members that a multisystem application defines to XCF. A
member is a specific function, or instance, of the application. A member resides on one
system and can communicate with other members of the same group across the sysplex.
Communication between group members on different systems occurs over the signaling
paths that connect the systems; on the same system, communication between group
members occurs through local signaling services.
Note: The WLM couple data set should be duplexed; an alternate copy is used for backup.
Service definition
The service definition, as illustrated in Figure 1-4 on page 7, contains all the information
about the installation that is needed for workload management processing. There is one
service definition for the entire sysplex. The service level administrator specifies the service
definition through the WLM administrative application. The service level administrator sets up
“policies” within the service definition to specify the goals for work. A service level
administrator must understand how to organize work and be able to assign it performance
objectives.
Service definition
When you set up your service definition, you identify the workloads, the resource groups, the
service classes, the service class periods, and the goals based on your performance
objectives. Then you define classification rules and one or more service policies. This
information makes up the base service definition.
The service definition contains one or more service policies with constructs:
Service classes
Workloads
Classification rules
To start workload management processing, you must define at least one service policy. You
can activate only one policy at a time.
Service classes
Service classes are subdivided into periods and group work with similar performance goals,
business importance, and resource requirements for management and reporting purposes.
You assign performance goals to the periods within a service class.
Workloads
A workload aggregates a set of service classes for reporting purposes. All work running in the
installation is divided into workloads. Your installation may already have a concept of
workload. A workload is a group of work that is meaningful for an installation to monitor. For
example, all the work created by a development group could be a workload, or all the work
started by an application, or in a subsystem.
Classification rules
Classification rules determine how to assign incoming work to a service class and report
class. Classification of work depends on having the rules defined for the correct subsystem
type. Classification rules are the rules you define to categorize work into service classes, and
optionally report classes, based on work qualifiers. A work qualifier is what identifies a work
request to the system. The first qualifier is the subsystem type that receives the work request.
Report classes
Report classes group work for reporting purposes. They are commonly used to provide more
granular reporting for subsets of work within a single service class.
[Figure 1-5: Structure of the service definition: service policies contain workloads; each workload contains service classes; each service class contains periods, and each period has a goal and an importance. Classification rules, report classes, resource groups, application environments, and scheduling environments complete the service definition.]
WLM constructs
The service definitions and WLM constructs used for performance management are shown in
Figure 1-5. Using an ISPF application, the installation defines a service definition that is
stored in a WLM couple data set. The service definition is composed of service policies. If the
goals or the importance of your transactions should vary during the day, the month or the
year, it is good practice to use several policies. However, you can activate only one policy at
a time.
The service policy is comprised of workloads, and these workloads are defined only for
accounting purposes. A workload is a named collection of work to be reported as a unit. You
can arrange workloads by subsystem (CICS, IMS™) or by major application (Production,
Batch, Office). Logically, a workload is a collection of service classes. For each workload, you
must have at least one service class that is composed of service class periods that indicate
the goal and the importance of the associated transaction. There are also classification rules
to link the external properties of a transaction to a service class. Optionally, there are other constructs, such as:
Resource groups
Application environments
Scheduling environments
Scheduling environments
Scheduling environments are lists of resource names along with their required states. If an
MVS image satisfies all of the requirements in a scheduling environment, then units of work
associated with that scheduling environment can be assigned to that MVS image.
Resource groups
Resource groups define processor capacity boundaries across the sysplex. You can assign a
minimum and maximum amount of CPU service units per second to work by assigning a
service class to a resource group.
[Figure 1-6: A service policy contains workloads; each workload groups service classes, and each service class period has a goal and an importance.]
Service policy
Figure 1-6 illustrates the relationship between workloads and service classes. A service
policy is a named set of overrides to the performance goals and processing capacity
boundaries in the service definition. A policy applies to all of the work running in a sysplex.
Because processing requirements change at different times, service level requirements may
change at different times. If you have performance goals that apply to different times or a
business need to limit access to processor capacity at certain times, you can define multiple
policies. To start workload management processing, you must define at least one service
policy. As previously mentioned, you can activate only one policy at a time.
The active policy can be switched by the following MVS console command to reflect changes
in performance objectives; it is a sysplex-wide command:
V WLM,POLICY=policyname[,REFRESH]
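For example, to switch the whole sysplex to a policy named WEEKEND (one of the sample policy names used later in this book):

    V WLM,POLICY=WEEKEND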
Workloads
A workload is a named collection of service classes to be accounted and reported as a unit.
Note that the workload concept does not affect the management of transactions. You can
arrange workloads by subsystem (CICS, IMS), by major type of service (production, batch,
office), or by line of business (ATM, inventory, department).
The RMF Workload Activity report groups performance data by workload and also by service
class periods within workloads.
Service classes
Service classes are divided into periods. Work is grouped by similar performance goals,
business importance, and resource requirements for management and reporting purposes.
Goals are assigned to the periods within a service class.
When any kind of work enters the system, such as transactions or batch jobs, this work is
classified by WLM by assigning the new work a service class.
Note: WLM allows no more than 100 service classes to be defined in a service definition. The default, however, is 128, which sets aside as much space as you will ever need for service classes, as well as a little extra for other WLM objects.
You define the goals for work in business terms; that is, importance 1 work is the highest
priority work. The system then decides how much resource, such as CPU and storage,
should be given to the work to meet its goal.
Workloads
The definition of workloads and performance goals is a part of a service definition. Workloads
aggregate a set of service classes together for reporting purposes. To workload
management, “work” is a demand for service, such as a batch job, an APPC, CICS, DB2, or
IMS transaction, a TSO/E logon, a TSO/E command, or an SOM request. All work running in
the installation is divided into workloads.
Figure 1-7 displays a Workload Selection List. In the figure, the workloads defined in the
service policy are comprised of a named collection of work to be reported as a unit. As
mentioned earlier, you can arrange workloads by subsystem (CICS, IMS), by major
application (TSO, batch, office) or by line of business (UNIX System Services, Inventory,
department). Logically, a workload is a collection of service classes. Every workload name
must be unique for all defined workloads in a service definition.
Note: A workload logically consists of a group of one or more service classes. You
associate a workload with a service class in the Service Class panel. Enter your workloads
before creating your service classes.
Service classes
A service class, as illustrated in Figure 1-8, is a named group of work within a workload with
the following similar performance characteristics:
Performance goals
Resource requirements
Business importance to the installation
Performance periods
Because some work has variable resource requirements, workload management provides
performance periods where you specify a series of varying goals and importances. You can
define up to eight performance periods for each service class. You can also assign a service
class to a resource group if its CPU service must be either protected or limited.
Execution velocity goals
This goal is a measure of how fast work should run when ready, without being delayed for WLM-managed resources. Velocity is a percentage from 1 to 99.
Discretionary goals
Discretionary goals are for low priority work for which you do not have any particular
performance goal. Workload management then processes the work using resources that are not required to meet the goals of other service classes. Defining a response time goal may not be
appropriate for some types of batch work, such as jobs with very long execution times.
Goal importance
Importance is a reflection of how important it is that the service class goal be achieved.
Workload management uses importance only when work is not meeting its goal. Importance
indicates the order in which work should receive resources when work is not achieving its
goal. Importance is required for all goal types except discretionary. Importance applies on a
performance period level and you can change importance from period to period. Importance
is assigned in five levels, from 1 to 5, with 1 being the highest importance.
Note: When there is insufficient capacity for all work in the system to meet its goals,
business importance is used to determine which work should give up resources and which
work should receive more. You assign an importance to a service class period, which
indicates how important it is that the goal be met relative to other goals. Importance plays a
role only when a service class period is not meeting its goal.
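As a rough illustration of this decision logic (our own sketch, not WLM's actual algorithm), the following Python fragment picks a receiver of resources among service class periods that miss their goals. It uses the performance index (PI), introduced next, as the measure of goal achievement; a PI above 1 means the goal is being missed. The field names are illustrative:

    # Hypothetical sketch: choose which service class period to help.
    # Among periods missing their goal (PI > 1.0), the most important
    # (numerically lowest importance) is helped first; ties are broken
    # by how badly the goal is missed (higher PI first).
    def pick_receiver(periods):
        missing = [p for p in periods if p["pi"] > 1.0]
        if not missing:
            return None
        return min(missing, key=lambda p: (p["importance"], -p["pi"]))

    pick_receiver([
        {"name": "ONLINE", "importance": 1, "pi": 1.3},
        {"name": "BATCH",  "importance": 3, "pi": 2.0},
    ])   # returns the ONLINE period: importance 1 outranks importance 3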
Performance index
The performance index (PI) indicates the achievement of the WLM-defined goals for the
service class. The performance index is a calculation of how well work is meeting its defined
goal.
For work with response time goals, the performance index is the actual divided by the
goal.
For work with velocity goals, the performance index is the goal divided by the actual.
A performance index of 1.0 indicates the service class period is exactly meeting its goal. A
performance index greater than 1 indicates the service class period is missing its goal. A
performance index less than 1.0 indicates the service class period is beating its goal. Work
with a discretionary goal is defined to have 0.81 as a performance index.
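The two rules above can be written out as a small Python sketch (ours, for illustration only; the function name is hypothetical):

    def performance_index(goal, actual, goal_type):
        # Response time goal: PI = actual response time / goal
        if goal_type == "response_time":
            return actual / goal
        # Velocity goal: PI = goal velocity / actual velocity
        if goal_type == "velocity":
            return goal / actual
        # Discretionary work is defined to have a PI of 0.81
        return 0.81

    performance_index(0.5, 0.4, "response_time")   # 0.8 -> beating the goal
    performance_index(30, 20, "velocity")          # 1.5 -> missing the goal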
SYSTEM and SYSSTCx can also be assigned to address spaces or transactions through the classification rules. However, this assignment is not recommended.
Workload: IMS (three service classes)
  Importance 1: goal 99% complete < 1 second
  Importance 2: goal 90% complete < 5 seconds
  Importance 3: goal 85% complete < 10 seconds
Business goals
As previously mentioned, with workload management you define performance goals and
assign a business importance to each goal. You define the goals for work in business terms,
and the system decides how much resource, such as CPU and storage, should be given to
the work to meet its goal.
Note: Figure 1-10 shows a percentile response time goal, which is the percentage of work
in that period that should complete within the response time. The percentile must be a
whole number.
The figure shows three different service classes defined in the same workload.
An installation should know what it expects to accomplish in the form of performance goals,
as well as how important it is to the business that each performance goal be achieved. With
workload management, you define performance goals for work, and the system matches
resources to the work to meet those goals, constantly monitoring and adapting processing to
meet the goals. Reporting reflects how well the system is doing compared to its goals.
Performance administration
Performance administration is the process of defining and adjusting performance goals.
Workload management introduces the role of the service level administrator. The service
level administrator is responsible for defining the installation's performance goals based on business needs and current performance.
Some installations might already have this kind of information in a Service Level Agreement
(SLA). The service definition applies to all types of work, including CICS, IMS, TSO/E, z/OS
UNIX System Services, JES, APPC/MVS, LSFM, DDF, DB2, SOM, Internet Connection
Server (also referred to as IWEB) and others. You can specify goals for all MVS-managed
work, whether online transactions or batch jobs. The goals defined in the service definition
apply to all work in the sysplex.
Because the service definition terminology is similar to the terminology found in a Service
Level Agreement (SLA), a service level administrator can communicate with the installation
user community, with upper level management, and with MVS using the same terminology.
When the service level requirements change, the service level administrator can adjust the
corresponding workload management terms, without having to convert them into low-level
MVS parameters.
[Figure 1-11: Examples of WLM goal types, including a critical batch workload, production TSO with 95% of period 1 transactions completing in under 0.5 seconds, and a discretionary goal for period 3.]
Figure 1-11 displays various types of WLM goals. These are discussed in more detail in the
following sections.
Average response time should be used when enclaves or address spaces produce a
reasonable rate of ended transactions. Some examples could be TSO, Batch, CICS, IMS,
OMVS, APPC, WebSphere® MQ, DB2 parallel query, DB2 stored procedure, DDF, and
HTTP.
Note: Do not use this type of goal for address spaces where the transaction is the address
space itself, or where there is no ending transaction, such as with DFSMShsm, RACF, and
DFSORT.
For statistical reasons, transactions that are appropriate for an average response time goal
should have at least 10 transaction completions within 20 minutes. With fewer completions, it might be difficult
for WLM to react correctly because a single transaction might have a significant impact and
affect the calculated response time. WLM is more efficient when it can compare a statistically
valuable number based on multiple ended transactions with the defined goal.
Using time as a measure to manage the transaction on a system requires the system to have
the ability to capture it. z/OS provides processes (dispatchable units as tasks and service
requests), enclaves, and address spaces to execute the application programs. Sometimes
these constructs are started once and live for a very long time. This means that WLM needs
assistance from the middleware (such as CICS, IMS DC) to capture the response time. WLM
also supports an infrastructure that allows the middleware to identify individual transaction
requests, capture their response times, and thus allow WLM to manage them in their own
service classes. This means that on z/OS, WLM is able to manage the service classes of
transactions that belong to the same application and that are executed by a set of server
address spaces.
To summarize, the percentile response time goal is the percentage of transactions in that
period that should complete within the response time (for example, 80% of transactions
ended in 0.5 seconds). If the response time distribution is skewed relative to the average, using a percentile goal is more useful than using an average goal. For example,
nine transactions with a response time of 0.5 seconds and one transaction with a response
time of 3 seconds have an average response time of 0.75, but 90% of them have a response
time of 0.5 seconds.
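The arithmetic of this example can be checked with a few lines of Python:

    times = [0.5] * 9 + [3.0]                  # nine 0.5s transactions, one 3s
    average = sum(times) / len(times)           # 0.75 seconds
    pct_within = sum(1 for t in times if t <= 0.5) / len(times)   # 0.90
    # 90% of the transactions completed within 0.5 seconds, even though
    # the single 3-second outlier pushed the average up to 0.75 seconds.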
Note: Workload Manager does not delay or limit transactions to achieve the response time
goal when extra processing capacity exists, unless you are explicitly using a WLM capping
function.
Discretionary goal
Refer to 1.14, “Discretionary goals” on page 27 for detailed information about this topic.
Then, when defining a service class period you specify a goal, an importance, and a duration
for a performance period. The duration is the amount of service that period should consume before moving on to the next period. The duration is specified in service units, as shown in
Figure 1-12 on page 23.
Service units
The amount of system resources an address space or enclave consumes is measured in
service units. Service units are calculated based on the CPU, SRB, I/O, and storage (MSO)
service that an address space consumes.
Service units are the basis for period switching within a service class that has multiple
periods. The duration of a service class period is specified in terms of service units. When an
address space or enclave running in the service class period has consumed the amount of
service specified by the duration, workload management moves it to the next period. The
work is managed to the goal and importance of the new period.
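A minimal sketch of this period-switching logic follows (our simplification, not WLM code). Durations are per-period service unit amounts, with None marking the last period, which has no duration:

    def current_period(consumed_su, durations):
        # Walk through the periods, subtracting each period's duration
        # from the accumulated service until the work no longer
        # qualifies for a switch to the next period.
        for i, duration in enumerate(durations):
            if duration is None or consumed_su < duration:
                return i + 1
            consumed_su -= duration
        return len(durations)

    current_period(150, [200, None])   # 1: still within the first period
    current_period(250, [200, None])   # 2: 200 SUs consumed, moved on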
Because not all kinds of services are equal in every installation, you can assign additional
weight to one kind of service over another. This weight is called a service coefficient.
The amount of service consumed by an address space is computed by the following formula:
service = (CPU x CPU Service Units)
+ (SRB x SRB Service Units)
+ (IOC x I/O Service Units)
+ (MSO x Storage Service Units)
CPU, SRB, IOC, and MSO are installation-defined service definition coefficients, as follows:
CPU Service Units = task (TCB) execution time, multiplied by an SRM constant that is
CPU model-dependent. Included in the execution time is the time used by the address
space while executing in cross-memory mode (that is, during either secondary addressing
mode or a cross-memory call). This execution time is not counted for the address space
that is the target of the cross-memory reference.
SRB Service Units = service request block (SRB) execution time for both local and global
SRBs, multiplied by an SRM constant that is CPU model-dependent. Included in the
execution time is the time used by the address space while executing in cross-memory
mode (that is, during either secondary addressing mode or a cross-memory call). This
execution time is not counted for the address space that is the target of the cross-memory
reference.
I/O Service Units = measurement of individual data set I/O activity and JES spool reads
and writes for all data sets associated with the address space. SRM calculates I/O service
using either I/O block (EXCP) counts or device connect time (DCTI), as specified on the
IOSRVC keyword in the IEAIPSxx parmlib member.
Note that if DCTI is used to calculate I/O service, then operations to VIO data sets and to
devices that the channel measurement facility does not time are not included in the I/O
service total. When an address space executes in cross-memory mode (that is, during
either secondary addressing mode or a cross memory call), the EXCP counts or the DCTI
will be included in the I/O service total. This I/O service is not counted for the address
space that is the target of the cross-memory reference.
Storage Service Units = (central page frames) x (CPU service units) x 1/50, where 1/50 is
a scaling factor designed to bring the storage service component in line with the CPU
component.
Note that the main storage page frames used by an address space while referencing the
private virtual storage of another address space through a cross-service are not included
in the storage service unit calculation. These frames are counted for the address space
whose virtual storage is being referenced.
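Putting the formula and the coefficient definitions together, a hypothetical calculation might look like the following Python sketch. The coefficient values and consumption figures are made up for illustration; only the structure follows the formula above:

    def service(cpu_su, srb_su, io_su, frames, coeff):
        # Storage service units = central page frames x CPU service units x 1/50
        mso_su = frames * cpu_su / 50.0
        return (coeff["CPU"] * cpu_su +
                coeff["SRB"] * srb_su +
                coeff["IOC"] * io_su +
                coeff["MSO"] * mso_su)

    # Example with illustrative coefficients and consumption figures:
    service(cpu_su=1000, srb_su=100, io_su=500, frames=200,
            coeff={"CPU": 1.0, "SRB": 1.0, "IOC": 0.5, "MSO": 0.0001})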
Execution Velocity % = (Using CPU + Using I/O) /
                       (Using CPU + Using I/O + Delay CPU + Delay I/O +
                        Delay Storage + Delay by Server AS) x 100

Performance index: PI = Vgoal / Vactual
Execution velocity goals define how fast work should run when ready, without being delayed
for processor, storage, I/O access, and address space queue delay. Execution velocity goals
are intended for work for which response time goals are not appropriate, such as started
tasks or long-running batch work. Figure 1-13 illustrates execution velocity and discretionary
goals.
Execution velocity
The speed in the system can now be expressed by the acceptable amount of delay for the
transaction when it executes. The speed is known as execution velocity, which is a number
between 0 and 100. 0 means that the transaction was completely delayed over the
measurement interval. 100 means that all measured samples are using samples and that the
transaction did not encounter any delays for resources managed by WLM. Generally we may
say:
EXEC VELOCITY% = USING SAMPLES / (USING SAMPLES + DELAY SAMPLES) x 100
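In sampling terms, this is simply the following (a sketch of ours around the formula above):

    def execution_velocity(using_samples, delay_samples):
        # 0 = completely delayed during the interval;
        # 100 = no delays observed for WLM-managed resources
        return 100.0 * using_samples / (using_samples + delay_samples)

    execution_velocity(30, 70)   # 30.0: using samples in 30% of observations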
To include the I/O figures in both Using and Delay samples in the Execution Velocity goal
formula, you need the option I/O Management to be set to YES in the service policy.
Processor delays also include processor capping delays, which come from resource group
capping. In this case WLM cannot do anything to avoid such delays because they are directly
caused by capping, as required by the installation. Storage delays consist of a variety of delay reasons (such as page faulting and WLM address space swapping). Queue delays can either be JES queue delays (for batch job transactions) or address space queue delays for applications like WebSphere or DB2 stored procedures, which exploit the WLM queuing services.
On the other hand, we have an execution velocity goal. An execution velocity of 100% means no delay for the resources managed by WLM (such as CPU, I/O, and storage), and therefore it translates into maximal speed. But what does an execution velocity goal of 30% mean?
If you want to define a reliable goal that can be achieved under all circumstances, then you
have to understand how the system behaves during rush hour. Such goals may be easily
achieved during other times because there is no contention and everything can progress very
quickly.
Do not define an execution velocity goal higher than 85%. Such a goal is very difficult to achieve and causes WLM to expend effort on it without a clear benefit.
When there are two service classes for different transactions, one for which you can measure the response time and one for which you cannot, you need a way to compare the potential goal settings against each other. Because it is very difficult to understand how an execution velocity translates into a response time, it helps to capture the execution velocity of transactions that are managed towards response time goals, so that you can see what their behavior means as an execution velocity. The way to normalize these different types of goals is the calculation of a performance index (PI), a number that indicates (to you and to WLM) how far the service class period is from its goal, regardless of the type of the goal.
Discretionary goals
With discretionary goals, as listed in Figure 1-14, WLM decides how best to run this work.
Because the prime WLM responsibility is to match resources to work, discretionary goals are
best used for work for which you do not have a specific performance goal. For a service class
with multiple performance periods, you can specify discretionary only as the goal in the last
performance period.
Discretionary work is run using any system resources not required to meet the goals of other
work. If certain types of other work are overachieving their goals, that work may be “capped”
so that the resources may be diverted to run discretionary work.
Discretionary goals are for low priority work for which you do not have any particular
performance goal. Workload management then processes the work using resources not
required to meet the goals of other service classes.
In addition to setting response time or velocity goals, you might want to have a transaction
running in the system for which no concrete goal must be achieved. This transaction should
only run when the transactions with a specific goal are satisfied. This transaction is classified
to service classes with a discretionary goal and thus tells the system that it should only run
when service classes with higher importance do not need the resources to achieve their
goals.
Figure 1-15 Assigning service classes to incoming work using classification rules
IWMCLSFY service
When a subsystem's transaction manager receives a work request, it should issue the
IWMCLSFY service to associate an incoming work request with a service class. WLM
receives this classification request and then uses the classification rules defined by the
installation to classify the new work by assigning a service class. Every piece of work that
enters the system will have a service class assigned to it via the IWMCLSFY service.
Figure 1-15 illustrates assigning service classes to incoming work using classification rules.
Classification rules
Classification rules are the rules that WLM uses to associate a performance goal or a
resource group with work by associating incoming work with a service class. Optionally,
classification rules can also associate incoming work with a report class, similar to a report
performance group. These classification rules, defined by an installation, are the rules you
define to categorize work into service classes, and optionally report classes, based on work
qualifiers.
The classification rules for a subsystem are specified in terms of transaction qualifiers such
as job name or transaction class. These qualifiers identify groups of transactions that should
have the same performance goals and importance.
Work qualifiers
Each subsystem has its own transaction work qualifiers. The attributes of incoming work are
compared to these qualifiers and, if there is a match, the rule is used to assign a service class
to the work. A subsystem can also have a default class for work that does not match any of
the rules.
Note: Not all work qualifiers are valid for every subsystem type; they are
subsystem-dependent.
Note: For many of these qualifiers, you can specify classification groups by adding a G to
the type abbreviation; see Figure 1-16.
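A greatly simplified sketch of qualifier matching follows (real classification rules are hierarchical and support nesting, masking, and qualifier groups; the rule structure, names, and the BATCHLOW default class here are illustrative, while BATCHHI is a service class name used later in this chapter):

    def classify(work, rules, default_class):
        # Each rule pairs a work qualifier and value with a service class.
        for qualifier, value, service_class in rules:
            if work.get(qualifier) == value:
                return service_class
        # Work that matches no rule falls through to the subsystem default.
        return default_class

    rules = [("jobname", "PAYROLL", "BATCHHI")]
    classify({"subsystem": "JES", "jobname": "PAYROLL"}, rules, "BATCHLOW")
    # -> "BATCHHI"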
Service classes
A service class is a named group of work within a workload with the following similar
performance characteristics:
Performance goals
Resource requirements
Business importance to the installation
WLM manages a service class period as a single entity when allocating resources to meet
performance goals. A service class can be associated with only one workload. You can
define up to 100 service classes.
Figure 1-17 on page 32 shows an example of a service class (BATCHHI) with just one period
with an Execution Velocity% goal of 60% and an importance of 3.
Note: You can define multiple performance periods for work in subsystems which use
address spaces or enclaves to represent individual work requests. Multiple periods are not
supported for work in the IMS and CICS subsystem work environments because service
units are accumulated to the address space, not the individual transactions. So, the system
cannot track a duration for those transactions.
Sometimes the installation knows in advance that the transactions coming from one
department are trivial. In such cases you simply assign to them a service class with a difficult
and very important goal. This translates (by WLM) into high priorities for the dispatchable
units executing the programs of such transactions. However, when the same department
mixes trivial and non-trivial transactions, you may use the concept of service class periods to
protect the trivial ones.
The value of duration is determined empirically by watching RMF data about the percentage
of transactions ending in the first period. If this is more than 80%, then the duration must be
decreased. If this is less than 80%, then the duration must be increased.
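Expressed as a sketch of that empirical rule (the 10% adjustment step is an arbitrary choice of ours, not a WLM value):

    def adjust_duration(duration, pct_ending_first_period, target=0.80):
        # More than 80% of transactions end in period 1: shrink the duration.
        if pct_ending_first_period > target:
            return duration * 0.9
        # Fewer than 80% end in period 1: grow the duration.
        if pct_ending_first_period < target:
            return duration * 1.1
        return duration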
ASCH, CB, CICS, DB2, DDF, EWLM, IMS, IWEB, JES, LSFM, MQ, NETV, OMVS, SOM, STC, SYSH, TCP, TSO
Figure 1-19 WLM supported subsystems
A service definition contains all of the information necessary for workload management
processing. Based on this information, you should be able to set up a preliminary service
definition using the worksheets provided. Then, you can enter your service definition into the
ISPF administrative application with ease.
[Figure 2-1: The WLM ISPF administrative dialog installs the service definition on the primary and secondary WLM couple data sets; a service policy from the installed service definition is then activated.]
Service definition
The service definition (SD) is used to express your installation’s business goals to your
sysplex. The SD must be installed on a WLM couple data set, and a service policy (contained
in this SD) activated. Only service policies in the SD installed on the WLM couple data set can
be activated. Figure 2-1 illustrates setting up a service policy. A WLM couple data set has an
automatic backup. You must define at least one policy in a service definition. A service definition contains policies, workloads, service classes, classification rules, and, optionally, resource groups, application environments, and scheduling environments.
ISPF tables
The WLM SD is stored in ISPF tables, or in XML format. When a new release adds certain
specifications to the SD, changes to the ISPF tables are required. When you open an SD that
has a table structure different (older) from the one currently used by the WLM application, the
WLM application automatically updates the SD structure. After this occurs, the SD cannot be
read by older levels of the WLM application unless the compatibility APARs are installed.
Definition options
When you define your service definition for the first time, define it in the following order:
Policies - A policy consists of a name, a description, and policy overrides. The first time
you set up a service definition, define a policy name and description. If you do not have a
business need to change your goals, you can run with one service policy, without any
policy overrides. All the new service policies must share the same set of service classes
and workloads. The difference is the numeric values of the goals. Then, when you add a
new policy, you override the original; refer to Figure 2-3 on page 43. You use a policy
override only if you have a business need to change a goal for a certain time, such as for
the weekend or nighttime. You can define your policy overrides after you have defined your service classes.
Workloads - A workload logically consists of a group of one or more service classes. You
associate a workload with a service class in the Service Class panel. The reason for
having workloads is for introducing a unit of account. Enter your workloads before creating
your service classes.
Resource groups (optional) - A resource group is a minimum or maximum amount of
processing capacity. You associate a resource group with a service class in the Service
Class panel. Enter resource groups before creating your service classes.
Service classes - A service class is a group of transactions with similar performance
goals, resource requirements, or business importance. You make the association with a
workload and a resource group in the service class panel.
A service policy override enables an installation to apply different sets of goals under different
sets of conditions (time of day, disaster scenarios, and so on) with no disruption of ongoing
work.
Alternate policy
An alternate policy can be activated by the console command V WLM,P=NIGHT, which means
that policy-switching can be automated. Policy changes take effect across the entire sysplex.
The base policy contains a set of service class period definitions. For each period, the
definition includes an importance, a goal type, and a goal value. For the service class as a
whole, the policy may also specify CPU-related constraints: “CPU Critical” and a resource
group assignment (which may provide a maximum, a minimum, or both). Any or all of these
may be overridden for any or all of the service classes.
In Figure 2-3 on page 43, the characteristics of SC1 and SC3 change, either on issuance of
the V WLM,P=NIGHT console command or when IWMARIN0 is used to activate the NIGHT
policy. In the overridden policy, SC1 gets a new resource group assignment but none of its
period goals are changed. SC2 is not overridden at all. SC3's period goals change.
Only one override policy can be in effect at any time; you cannot partially override another
override. An override cannot affect the classification rules, resource group values, or anything
other than the attributes of a service class and its periods.
In the following sections we guide you through the WLM panels. These panels illustrate how
to create a WLM policy. Figure 2-5 displays the first panel that is shown when you request the
WLM application under ISPF.
Entering WLM
Note that when you have access to this panel, you can use the Help facility to clarify concepts
as needed. To reach the next panel, press Enter. The full ISPF application is called
IWMARIN0.
Definition menu
On the top left corner of the panel, as shown in Figure 2-7, notice the panel ID (IWMPA01, in
this case). This name is presented as a consequence of the TSO/ISPF command PANELID.
This menu lists WLM constructs that may be assigned to define the service definition. The
name of the service definition is a name you choose and you define some text that describes
the service policy definition that is being created.
The Definition Menu is considered the home base panel. A significant amount of function is
accessible from this panel, and you will find yourself returning to it frequently. One quick way
to do that from just about anywhere in the application is to press F4=Return. On the definition
menu header line appears the functionality level (top left side) and the WLM application level
(top right side). The functionality LEVEL021 represents the highest level function in the
current service definition. The WLM Appl LEVEL021 is the level of the WLM ISPF application.
Definition name
Choose a name for the definition. Optionally, you can define a description for the policy.
Select option 1 to create a policy and press the Enter key. You will reach the Create a Service
Policy panel shown in Figure 2-8 on page 49.
Define a policy
On this panel (IWMAP1D), you must type in the name you want to give your policy.
Optionally, you can type in a description of the new service definition.
Service Policy Name . . . . . SHIFT1   (Required)
Description . . . . . . . . . Policy for first shift
Choose names that relate to the installation policies you need to create. The service policy
reflects goals for work, expressed in terms commonly used in service level agreements
(SLAs). Because the terms are similar to those commonly used in an SLA, you can
communicate with users, with business partners, and with z/OS using the same terminology.
The service policy is a named set of overrides to the performance goals and processing
capacity boundaries in the service definition. A policy applies to all of the work running in a
sysplex. Because processing requirements change at different times, service level
requirements may change at different moments. If you have performance goals that apply to
different times or a business need to limit access to processor capacity at certain instants,
you can define multiple policies. To start workload management processing, you must define
at least one service policy. You can activate only one policy at a time.
You can define as many service policies as needed by the installation. For example, other
service policy names might be SHIFT2, SHIFT3, HOLIDAY, and WEEKEND. Each name
describes what type of transactions executes during that service policy.
When you are creating your service definition, you may choose to define one empty “default”
policy with no overrides at all. Next, create your workloads and service classes. Then
determine how and when your service classes may have different goals at different times.
Define additional policies with the appropriate overrides for these time periods.
Service policies
Every service policy name must be unique in a service definition. The service policy is
activated by name in one of the following ways:
By using an operator command from the z/OS operator console.
By using a utility function from the workload management ISPF application.
To create a new service policy, select option 1 on the panel shown in Figure 2-7 on page 47.
Create as many service policies as your installation needs. The service policy name in
Figure 2-9 is called SHIFT1. You might then have a different service policy for each shift
during the day with the same workloads and same service classes, but with different goals to
execute a different type of work.
Workloads
A workload is a unit of accounting comprised of related service classes (SCs). You reach this
panel (IWMAP2D), as shown in Figure 2-10, by entering option 2 in the SD panel (IWMAP01).
Next, create a new workload, WASWKL, for WebSphere transactions. Specify the workload name and a description of the workload; the description is available to performance monitors for reporting.
Define as many workloads as your installation requires. Workloads contain a set of SCs for
accounting purposes. You associate a workload with an SC in the SC panel.
RMF reports
RMF reports data about different application transactions grouped into categories that you
have defined as workloads and SCs in your service policy. The appropriate grouping of
application transactions is important. In addition to using workloads, the installation may use
report classes as unit of accounting and reporting, as follows:
Different applications that should be managed according to the same goals should be assigned to the same SC. Applications with different goals need to be assigned to different SCs.
If you want to get separate reporting for different applications in the same SC, you can
define separate report classes (RC) for each of them. Reporting for RC is possible with the
same level of detail (RC period) as for SCs.
Note: You must define at least one workload before selecting the service class option.
Importance number
The relative importance of the service class goal is a reflection of how important it is that the service class goal be achieved. Workload management uses the importance figure only when work is not meeting its goal. Importance indicates the order in which transactions should receive resources (priorities) when those transactions are not achieving their goals.
Importance is required for all goal types except discretionary. Importance applies on a
performance period level and you can change importance from period to period. Importance
is stated in five levels, from 1 to 5, with 1 being the highest importance.
Figure 2-12 shows a Create a Service Class panel. You must assign a workload to the service
class in the Workload field.
Specify the one- to eight-character name of the service class. Every service class name must
be unique within a service definition. You can specify more information about the service
class in the description field.
Note: When you create a new service class, it automatically belongs to all defined service
policies.
When you choose option 1, Average response time, the application displays the average
response time goal pop-up. As shown in Figure 2-15 on page 56, there is a different pop-up
for each goal type, where you can fill in the information for the goal.
In addition, multiple periods are not supported for work in the IMS and CICS subsystem
work environments because service units are accumulated to the address space, not the
individual transactions. Therefore, the system cannot track a duration for those
transactions.
Hours . . . . . __ (0-24)
Minutes . . . . __ (0-99)
Seconds . . . . _5____ (0-9999)
The goal chosen in Figure 2-15 shows that transactions should complete in the first period
within 5 seconds, and that WLM monitors these transactions with an importance of 1. All
transactions consuming more than 200 service units are automatically migrated to the second
period.
In this second period the goal is Discretionary and there is no DUR, indicating that this is the
last period of the WASPROD service class.
Note: To ensure that the system runs smoothly, certain address spaces
cannot be freely assigned to all service classes. The following address
spaces are always classified into service class SYSTEM, independent of the
user-defined classification rules: *MASTER*, INIT, WLM, XCFAS, GRS,
CONSOLE, IEFSCHAS, IXGLOGR, SMF, CATALOG, SMSPDSE, and
SMSPDSE1.
SYSSTC For all started tasks not otherwise associated with a service class. Workload management treats work in SYSSTC just below special system address spaces in terms of dispatching priority.
Note: Address spaces in the SYSSTC service class are kept at a very high
dispatching priority.
You can also assign address spaces to the SYSTEM and SYSSTC service
classes as part of your work classification rules.
Note that not all started tasks are appropriate for SYSSTC, because a CPU-intensive started
task could use a large amount of processor cycles. However, if your processor is lightly
loaded, SYSSTC might be appropriate because that one task may not affect the ability of the
remaining processors to manage the important work with goals.
All dispatchable units of the same service class period have the same base dispatching
priority (with the exception of the mean-time-to-wait group for discretionary work). The range
from 202 to 207 is not used and the range from 192 to 201 is used for discretionary work. All
discretionary work is managed by a mean time to wait algorithm, which, simplified, gives a
higher dispatch priority for work using I/O and a lower dispatch priority to work demanding a
lot of CPU. The last used dispatch priority is 191. It is assigned to quiesced work. That also
means that quiesced work can run if the CPU is not busy. It also means that all other work
can access the CPU before it.
Work initiation
When work is initiated, the control program creates an address space. All successive steps in
the job execute in the same address space. The address space has a dispatching priority,
which is normally determined by the control program. The control program will select, and
alter, the priority in order to achieve the best load balance in the system, that is, in order to
make the most efficient use of processor time and other system resources.
All dispatchable units of the same service class period have the same base dispatching
priority (with the exception of the mean-time-to-wait group for discretionary work). The
dispatching order of all dispatchable units with the same dispatching priority, in the same or in
different service class periods, is periodically rotated.
The range from 202 to 207 is not used, and the range from 192 to 201 is used for discretionary
work. All discretionary work is managed by a mean time to wait algorithm which, simplified,
gives a higher dispatch priority to work using I/O and a lower dispatch priority to work
demanding a lot of CPU. The last used dispatch priority is 191. It is assigned to quiesced
work. That also means that quiesced work can run if the CPU is not busy, and that all other
work can access the CPU before it.
Task priority
Each task in an address space has a limit priority and a dispatching priority associated with it.
The control program sets these priorities when a job step is initiated. The dispatching
priorities of the tasks in an address space do not affect the order in which the control program
selects jobs for execution because that order is selected on the basis of address space
dispatching priority.
SRM defines dispatching priority for service class periods. All address spaces in a service
class period have the same base dispatching priority. Multiple service class periods may have
the same base dispatching priority. After a dispatching priority change, service class periods
may be remapped to different dispatching priorities such that there is an unoccupied priority
between each occupied priority. This process is referred to as priority unbunching.
Classification rules
Classification rules are used by workload management to associate a performance goal with
transactions, by assigning incoming transactions to a service class. When you set up your
service definition, you identify the workloads, the resource groups, the service classes, and
the service class periods containing goals based on your performance objectives. Then you
define classification rules and one or more service policies. This information makes up the
base service definition. Using the WLM application, you add classification rules that associate
external properties of an arriving transaction with a service class, which contains the goals
WLM uses to manage the transaction's priorities.
The panel IWMAP7A, shown in Figure 2-18, displays the subsystem types in the current
service definition. You reach this panel by entering option 6 in IWMAP01. Classification of
transactions depends on having the rules defined for the correct subsystem type. This panel
initially contains the reserved names of the IBM-supplied subsystem types.
Use the Modify option, option 3, to create rules for the IBM-supplied subsystem types.
Option 1 is used to create a new subsystem. In Figure 2-18, Modify is selected on the CB
(Component Broker) subsystem type to create classification rules for all WebSphere
transactions in your installation. Eventually, you create rules for all subsystems used by your
installation.
Work qualifiers
Each subsystem uses some of these qualifiers. The full list of work qualifiers and their
abbreviations is shown in Figure 2-19.
Not all work qualifiers are valid for every subsystem type because they are
subsystem-dependent. For many of these qualifiers, you can specify classification groups by
adding a G to the type abbreviation; for example, a transaction name (TN) would become a
transaction name group (TNG).
Rules example
In the example shown in Figure 2-20, assume you want to associate all JES2 work from
department IRS with the service class JESFAST. You assigned the default for JES2 work as
service class JESMED. If all JES2 accounting information from department IRS has the
characters “DIRS” starting in the seventeenth position, you enter a rule with qualifier DIRS* to
match on just those four characters. If certain jobs matching DIRS* should not run in
JESFAST (for example, those whose qualifier is exactly the four characters “DIRS”), you need
another, more specific rule with qualifier DIRS, placed before the DIRS* rule, to assign those
jobs to JESMED.
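The ordering of such rules can be pictured with a small Python sketch of first-match
classification; the accounting string DIRSMISC is hypothetical, and real WLM matching also
involves start positions and rule nesting:

import fnmatch

# More specific rule first; the first matching rule wins.
rules = [
    ("DIRS", "JESMED"),
    ("DIRS*", "JESFAST"),
]

def classify(qualifier, default="JESMED"):
    for pattern, service_class in rules:
        if fnmatch.fnmatchcase(qualifier, pattern):
            return service_class
    return default

print(classify("DIRSMISC"))  # JESFAST
print(classify("DIRS"))      # JESMED, caught by the specific rule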
If you want to assign a large number of CICS transactions to the same service class, you can
create a transaction name group (TNG). You name the group (for example, CICSCONV) and
list all the transaction names you want included in the group; see Figure 2-22.
Then you use those group names in the classification rules.
Transaction qualifiers
The classification rules for a subsystem are specified in terms of transaction qualifiers such
as job name or transaction class. These qualifiers identify groups of transactions that should
have the same performance goals and importance. The attributes of incoming transactions
are compared to these qualifiers and, if there is a match, the rule is used to assign a service
class to the work. A subsystem can also have a default class for work that does not match
any of the rules.
This example uses the external property User Identification (UI). If the transaction is created
by a user named Benjamim, the service class is WASPROD; otherwise, the service class is
SYSSTC (automatically defined by WLM).
Note: For more detailed information about each of the work qualifier types listed, refer to
“Defining Work Qualifiers” in “Defining Classification Rules” of z/OS MVS Planning:
Workload Management, SA22-7602.
When the rules for a subsystem are to be created, type a question mark (?) to view the list of
qualifiers that are valid for the subsystem type you have selected. The qualifiers are shown in
Figure 2-26. You can use any combination of work qualifiers to classify work into a service
class.
Note: You only use the work qualifiers that are needed by your installation.
Group types are specified by adding G to the type abbreviation. For example, a transaction
name group is indicated as TNG. For detailed information about the work qualifiers grouped
by subsystem type, move the cursor to the phrase that lists the subsystem type you are
interested in and then press the Help key.
Subsystem instance
You can use subsystem instance (SI) to isolate multiple instances of a subsystem. For
example, use subsystem instance if you have a CICS production system as well as a CICS
test system. The VTAM applid is used to specify the subsystem instance. Individual VTAM
applids can have different transaction names, as shown in Figure 2-26 on page 68.
Additionally, all the functions executed by Workload Manager are listed here. However,
because of the introductory nature of the ABCs Redbooks, not all of the functions are
explained in detail. For further information see z/OS MVS Planning: Workload Management,
SA22-7602 and System Programmer’s Guide to: Workload Manager, SG24-6472.
WLM functions
WLM has several key functions in z/OS, as listed in Figure 3-1. Following is a brief description
of each one.
WLM dynamically adjusts the I/O priority based on how well each service class is meeting its
goals and whether the device can contribute to meeting the goal. The system does not
micro-manage the I/O priorities, and changes a service class period's I/O priority infrequently.
You can assign storage protection to all types of address spaces using classification rules for
subsystem types ASCH, JES, OMVS, STC, and TSO. By specifying yes in the “Storage
Critical” field for a classification rule, you assign storage protection to all address spaces that
match that classification rule. However, before it can be storage-protected, an address space
must be in a service class that meets two requirements (see the sketch after this list):
The service class must have a single period.
The service class must have either a velocity goal, or a response time goal of more than
20 seconds.
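The two requirements can be expressed as a simple predicate; this is only a sketch with a
hypothetical representation of a service class definition:

def eligible_for_storage_protection(service_class):
    """service_class: dict with a 'periods' list; each period has a
    'goal_type' and, for response time goals, 'rt_goal_seconds'."""
    if len(service_class["periods"]) != 1:
        return False                        # must have a single period
    period = service_class["periods"][0]
    if period["goal_type"] == "velocity":
        return True
    return (period["goal_type"] == "response_time"
            and period.get("rt_goal_seconds", 0) > 20)

print(eligible_for_storage_protection(
    {"periods": [{"goal_type": "response_time", "rt_goal_seconds": 30}]}))  # True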
The remaining topics in this section are workload balancing, scheduling environments, and
HiperDispatch mode.
Initial cooperation between MVS and the transaction managers (CICS, IMS, DB2) allows you
to define performance goals for all types of MVS-managed work. Workload management
dynamically matches resources (access to the processor and storage) to work to meet the
goals. CICS, however, goes further with the CICSplex Systems Manager (CICSPlex® SM) to
dynamically route CICS transactions to meet the performance goals.
Scheduling environments
Scheduling environments help to ensure that units of work are sent to systems that have the
appropriate resources to handle them. A scheduling environment is a list of resource names
and their required states. Resources can represent actual physical entities (such as a data
base or a peripheral device) or they can represent intangible qualities such as a certain
period of time (like second shift or weekend). These resources are listed in the scheduling
environment according to whether they must be set to ON or set to OFF. A unit of work can
be assigned to a specific system only when all of the required states are satisfied. This
function is commonly referred to as resource affinity scheduling.
HiperDispatch mode
In addition to the performance improvements available with IBM System z10 processors,
z/OS workload management and dispatching are enhanced to take advantage of System z10
hardware design. A mode of dispatching called HiperDispatch now provides additional
processing efficiencies. HiperDispatch mode aligns work to a smaller subset of processors to
maximize the benefits of the processor cache structures, thereby reducing the amount of
CPU time required to execute work. Access to processors has changed with this mode. As a
result, prioritization of workloads via WLM policy definitions becomes more important.
SMF 99 records
When the contention disappears, the resource manager notifies SRM through an ENQRLSE
sysevent. The enqueue promotion interval can be set by the installation through the ERV
option in the IEAOPTxx parmlib member. The ERV option specifies the CPU service units that
an address space or enclave can absorb while it is promoted, before the promotion is taken
back.
In goal mode, the enqueue promotion dispatch priority is determined dynamically at every
policy adjustment interval (10 seconds). It can change based on the available processing
capacity and the amount of high-dispatch-priority work in the system. Address spaces are
promoted to that priority if their current dispatch priority is not already higher.
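The promotion rule can be sketched as follows; the names and structures are hypothetical,
and the real decision is made inside SRM:

def promoted_priority(current_dp, promotion_dp, consumed_su, erv):
    """Dispatch priority of an address space or enclave holding a
    resource under contention."""
    if consumed_su >= erv:
        return current_dp                 # promotion is taken back
    return max(current_dp, promotion_dp)  # never lower an already higher DP

print(promoted_priority(230, 240, consumed_su=100, erv=500))  # 240
print(promoted_priority(230, 240, consumed_su=600, erv=500))  # 230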
When you enable automatic buffer pool management, DB2 reports the buffer pool size and hit
ratio for random reads to the WLM component. DB2 also automatically increases or
decreases buffer pool size, as appropriate, by up to 25% of the originally allocated size. DB2
limits the total amount of storage that is allocated for virtual buffer pools to approximately
twice the amount of real storage. However, to avoid paging, set the total buffer pool size to
less than the real storage that is available to DB2.
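The stated bounds can be sketched in Python; this is only an illustration of the documented
25% limit, not how DB2 actually decides on a size:

def bounded_bufferpool_size(original_size, requested_size):
    """Clamp an automatic adjustment to within 25% of the original size."""
    low = 0.75 * original_size
    high = 1.25 * original_size
    return min(max(requested_size, low), high)

print(bounded_bufferpool_size(10000, 14000))  # 12500.0
print(bounded_bufferpool_size(10000, 6000))   # 7500.0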
Note: In particular, turn off SMF type 99 records. They trace the actions SRM takes while
in goal mode, and are written frequently. SMF type 99 records are for detailed audit
information only. Before switching your systems into goal mode, make sure you do not
write SMF type 99 records unless you specifically want them.
(Figure: the WLM loop - measure delays, again and again, and change priorities based on the
delays, from donors to receivers.)
The system state is measured every 250 milliseconds. At these points, WLM captures the
delay and using states of all work units in the system and of the resources it manages. WLM
basically manages CPU, DASD I/O devices, system storage, server address spaces, and to
some extent, resource serialization such as GRS enqueues. The states are manifold. While
for CPU there is currently one using and one delay state, there are multiple possible reasons
why work cannot use storage (for example, paging delays, shortages of the page data sets,
and much more).
Every 10 seconds WLM summarizes system state information and combines it with
measured data like CPU service or transaction end times. At this point WLM learns the
behavior of the workloads and the system. It retains a certain amount of historical data in
storage and maintains graphs showing the relationship between dependent functions in the
system.
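The two cadences can be pictured with a toy Python loop; sample_states and adjust_policy
are hypothetical callbacks standing in for WLM's sampling and policy adjustment logic:

def wlm_loop(sample_states, adjust_policy, ticks=80):
    """One tick represents 250 ms; 40 ticks represent 10 seconds."""
    history = []
    for tick in range(1, ticks + 1):
        history.append(sample_states())  # capture delay and using states
        if tick % 40 == 0:               # every 10 seconds
            adjust_policy(history)       # summarize, learn, and adjust
            history.clear()

wlm_loop(lambda: "sample", lambda h: print(f"adjust using {len(h)} samples"))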
In addition to the goal, it is also necessary to define which work is more important than other
work. If too much work wants to use the same resource of an operating system, WLM must
know to which one it must give preference. This is done by defining an importance for each
service class. The importance is a number ranging from 1 to 5. The most important work in
the system is assigned a 1. Work which can suffer first when resources become constrained
is assigned a 5.
You might have work running in the system for which no concrete goal must be achieved.
Such work should only run when plenty of resources are available. Such work is classified to
service classes with a discretionary or no goal, and thus tells the system that it should only
run when service classes with an importance and goal do not need the resources to achieve
their goals.
To manage work, the service classes must be associated with an importance and a goal, as
follows:
The importance tells WLM which service classes are preferred against others when the
goals cannot be achieved.
The goals tell WLM what must be achieved to meet user requirements.
The following goals are possible for service classes:
– An average response time
– A percentile response time, meaning a certain percentage of work requests which
have to end within a certain time limit
– An execution velocity (the acceptable amount of delay that work should encounter
when it runs in the system)
– Discretionary, if no specific goal definition and no specific importance for the work in
the service class exist
The goals must be oriented toward user expectations as well as toward what is possible in
a system, because WLM cannot exceed the physical limitations of the system.
The resources to help the receiver may also come out of what is referred to as “discretionary
resources” which are those that can be reallocated with little or no effect on the system's
ability to meet performance goals.
Service class periods other than receivers and donors can also be affected by changes.
These service class periods are referred to as secondary receivers and donors. SRM may
decide not to help a receiver due to minimal net value for either a primary or secondary donor.
If a service class period is being served by one or more address spaces, it is called the goal
receiver or donor. It is the service class period with the response time goals.
To help such service class periods, SRM must donate resources to the server address
spaces. The service class period that is serving a service class is called a resource receiver
or donor. SRM adjusts resources for the resource donor or receiver to affect the performance
of the goal donor or receiver.
The sampled states are: Using, Delayed, Idle, Unknown, and Quiesced.
The delay information shows the different states that server address spaces experience when
processing transactions. The information is provided on a service and report class basis in
the service or report class representing the transactions. This way, the delay states show for
the transactions being processed, not for the address spaces serving the transactions.
The states include how many times the service or report class was seen active, ready, and
waiting. There are several waiting states. Each of these is reported separately. Other states
include whether the transactions are continued somewhere else in the system, in the sysplex,
or in the network.
In the RMF Workload Activity Report these states are shown as percentages, and they
should add up to 100%. However, in some cases there might be an overlap between Using
and Delay states, causing the sum to be greater than 100%.
Performance index
Performance index (PI) is a calculation of how well work is meeting its defined goal. For work
with response time goals, the performance index is the actual response time divided by the
goal. For work with velocity goals, the performance index is the goal velocity divided by the
actual velocity. WLM maintains a PI for each service class period, to measure how actual
performance varies from the goal.
A performance index of 1.0 indicates the service class period is exactly meeting its goal. A
performance index greater than 1 indicates the service class period is missing its goal. A
performance index less than 1.0 indicates the service class period is beating its goal. Work
with a discretionary goal is defined to have a performance index of 0.81.
Because there are different types of goals, WLM needs to compare how well or how poorly
transactions in one service class period are performing compared to other transactions. The
comparison is possible through the use of the PI. The PI is a sort of normalized metric
allowing WLM to compare transactions having a velocity goal with transactions having an
average response time goal and with transactions having a percentile response time goal to
determine which is farthest from the goal.
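A minimal Python sketch of the PI calculation for these goal types follows; percentile
response time goals, which are evaluated from the response time distribution, are omitted for
brevity:

def performance_index(goal_type, goal, actual):
    if goal_type == "discretionary":
        return 0.81              # defined value for discretionary work
    if goal_type == "avg_response_time":
        return actual / goal     # a higher actual time means a worse PI
    if goal_type == "velocity":
        return goal / actual     # a higher actual velocity means a better PI
    raise ValueError(f"unsupported goal type: {goal_type}")

print(round(performance_index("avg_response_time", 0.180, 0.157), 1))  # 0.9
print(round(performance_index("velocity", 30.0, 51.1), 1))             # 0.6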
PI values
The following rules explain the meaning of the PI values:
PI = 1 means that transactions in the SC period are exactly meeting their goal.
PI > 1 means that transactions in the SC period are missing their goal.
PI < 1 means that transactions in the SC period are beating their goal.
In the sysplex, every service class period may have two types of PIs:
One or more local PIs - there is one local PI for every z/OS system where the SC period
transaction is running, and it represents its performance in each local system in the
sysplex.
One sysplex (or global) PI - it represents the SC period global weighted performance
across all the z/OS systems in the sysplex.
(Figure 3-8: two service class periods across four z/OS systems; the first period has local PIs
of 0.3, 0.3, 0.4, and 2.0, and the second has local PIs of 0.5, 2.0, 2.0, and 2.3.)
The local PI indicates how well a transaction is performing on the local z/OS system. The
local PIs are used by each WLM in a sysplex to calculate the sysplex PI every 10 seconds.
Both PIs are significant, but it is more important to check the sysplex PI to determine whether
or not the goal is achieved. The sysplex PI is a weighted average of the local PIs, where the
weights are the transaction rates of the service class period in each z/OS system.
The following sections explain how the PI is calculated according to the type of goal. For
every type of goal, the achieved value is then compared to the goal value. The achieved
value is calculated, and the goal value is set by the user in the WLM ISPF application.
Figure 3-8 on page 87 shows four z/OS systems with two active service class periods. The
first service class period has the following local PIs: 0.3, 0.3, 0.4, and 2.0. Its global PI is 0.75
(globally happy).
The second service class period has the following local PIs: 0.5, 2.0, 2.0, and 2.3. Its global
PI is 1.7 (globally unhappy).
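The weighted average can be sketched in Python. The figure's values are reproduced below
by assuming equal transaction rates on the four systems; in a real sysplex the weights are
the per-system transaction rates:

def sysplex_pi(local_pis_and_rates):
    """local_pis_and_rates: list of (local_pi, transaction_rate) tuples."""
    total_rate = sum(rate for _, rate in local_pis_and_rates)
    return sum(pi * rate for pi, rate in local_pis_and_rates) / total_rate

print(sysplex_pi([(0.3, 100), (0.3, 100), (0.4, 100), (2.0, 100)]))  # 0.75
print(sysplex_pi([(0.5, 100), (2.0, 100), (2.0, 100), (2.3, 100)]))  # 1.7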
TRANSACTIONS TRANS.-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE---- --SERVICE RATES-- PAGE-IN RATES ----STORAGE----
AVG 0.00 ACTUAL 157 SSCHRT 0.0 IOC 0 ABSRPTN 0 SINGLE 0.0 AVG 0.00
MPL 0.00 EXECUTION 0 RESP 0.0 CPU 0 TRX SERV 0 BLOCK 0.0 TOTAL 0.00
ENDED 432072 QUEUED 0 CONN 0.0 MSO 0 TCB 0.0 SHARED 0.0 CENTRAL 0.00
END/S 720.32 R/S AFFINITY 0 DISC 0.0 SRB 0 SRB 0.0 HSP 0.0 EXPAND 0.00
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.0 TOT 0 RCT 0.0 HSP MISS 0.0
EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 0 IIT 0.0 EXP SNGL 0.0 SHARED 0.00
AVG ENC 0.00 STD DEV 6.153 HST 0.0 EXP BLK 0.0
REM ENC 0.00 APPL % 0.0 EXP SHR 0.0
MS ENC 0.00
---RESPONSE TIME--- EX PERF AVG --USING%-- ------------ EXECUTION DELAYS % ------------- ---DLY%-- -CRYPTO%- %
HH.MM.SS.TTT VEL INDX ADRSP CPU I/O TOTAL UNKN IDLE USG DLY QUIE
GOAL 00.00.00.180 AVG
ACTUALS
*ALL 00.00.00.157 N/A 0.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
OS0B 00.00.00.125 N/A 0.7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
OS0C 00.00.00.214 N/A 1.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
OS0D 00.00.00.150 N/A 0.8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
OS0E 00.00.00.163 N/A 0.9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
In Figure 3-9 on page 88, in the RMF Workload Activity Report, the ACTUAL field (under
TRANS.-TIME) shows that the average response time is 0.157s for the 432,072 ended
transactions.
The goal of the service class is 0.180, as shown in the GOAL field. The PI is calculated for the
sysplex (*ALL) and for each system in the sysplex by dividing the achieved (current) average
response time (either in the sysplex or on the single system) by the target average response
time, as illustrated in the following formula:
PI = Average response time
---------------------
Goal average
That corresponds to 0.9 for the sysplex PI (*ALL line) and 0.7, 1.2, 0.8, and 0.9 for the local
PIs in the four z/OS systems (OS0B, OS0C, OS0D, and OS0E) where those transactions are
running.
TRANSACTIONS TRANS.-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE---- --SERVICE RATES-- PAGE-IN RATES ----STORAGE----
AVG 2.97 ACTUAL 1.45.029 SSCHRT 899.2 IOC 223323 ABSRPTN 58154 SINGLE 0.0 AVG 3966.86
MPL 2.97 EXECUTION 1.44.561 RESP 0.9 CPU 8820K TRX SERV 58154 BLOCK 0.0 TOTAL 11780.8
ENDED 12 QUEUED 468 CONN 0.6 MSO 77155K TCB 425.0 SHARED 0.0 CENTRAL 11780.8
END/S 0.02 R/S AFFINITY 0 DISC 0.0 SRB 153458 SRB 7.4 HSP 0.0 EXPAND 0.00
#SWAPS 0 INELIGIBLE 0 Q+PEND 0.3 TOT 86352K RCT 0.0 HSP MISS 0.0
EXCTD 0 CONVERSION 262 IOSQ 0.0 /SEC 172707 IIT 3.1 EXP SNGL 0.0 SHARED 0.00
AVG ENC 0.00 STD DEV 50.588 HST 0.0 EXP BLK 0.0
REM ENC 0.00 APPL % 87.1 EXP SHR 0.0
MS ENC 0.00
---RESPONSE TIME--- EX PERF AVG --USING%- ---------- EXECUTION DELAYS % --------- ---DLY%-- -CRYPTO%- ---CNT%-- %
HH.MM.SS.TTT VEL INDX ADRSP CPU I/O TOT CAPP CPU I/O UNKN IDLE USG DLY USG DLY QUIE
GOAL 30.0%
ACTUALS
ITS1 51.1% 0.6 1.0 20.1 21.7 40.0 18.0 15.4 6.6 18.1 0.0 0.0 0.0 0.0 0.0 0.0
These using and delay samples are found in the RMF Workload Activity Report in Figure 3-9.
The using sample percentages are found in the fields USING% CPU and USING% I/O; their
sum is 20.1% + 21.7%, giving 41.8%. The total of all the delay samples is given in the field
EXECUTION DELAYS % TOT; the value is 40.0%.
The achieved execution velocity is the using samples divided by the sum of the using and
delay samples: 41.8% / (41.8% + 40.0%) = 51.1%, the value shown in the report. The goal is
an execution velocity of 30%, so the transactions running in this service class period are
happy. The PI is then calculated by dividing the goal execution velocity by the achieved
execution velocity. In our example the calculation is 30% / 51.1%, giving a PI of 0.6 (the result
of the division is rounded). The formula is as follows:
PI = Goal execution velocity
-------------------------
Actual execution velocity
Notice that the formula is the inverse of the average response time PI calculation. The reason
is that the higher the execution velocity figure, the better the situation is; conversely, the
higher the average response time, the worse it is. In the report there is only one z/OS system,
so the local and global PIs are identical.
Figure 3-10 Average response time with percentile goal RMF report
Response time distributions are provided for both service classes and report classes. These
distributions consist of 14 buckets of information. There is a header, explaining the contents
of the buckets, which is provided once. The header contains the value of the particular
bucket, which is a percentage of the specified goal (that is, 50 equates to 50% of the goal;
150 equates to 150% of the goal). There is always one bucket that exactly maps to the
specified goal, with a value of 100%. In each bucket is the number of transactions that
completed in the amount of time represented by that bucket.
WLM keeps response time distribution data in buckets for service class periods that have a
response time goal specified. These buckets are counters that keep track of the number of
transactions ended within a certain response time range. The number of ended transactions
are stored in a set of 14 buckets. These buckets exist per service class period.
Bucket range values
Bucket values range from half of the goal response time value to more than four times the
goal response time value. In our case, the goal is 70% of transactions with a response
time of 0.600 seconds. The bucket values will range from 0.300 seconds to more than
2.400 seconds.
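The boundaries can be sketched as follows. The exact percentages (50% to 150% of the goal
in 10% steps, then 200% and 400%, with a final bucket for everything beyond) are an
assumption based on the RMF report layout:

BUCKET_PERCENTS = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 200, 400]

def bucket_boundaries(goal_seconds):
    """Return the 13 upper boundaries; the 14th bucket is beyond 4x goal."""
    return [goal_seconds * pct / 100 for pct in BUCKET_PERCENTS]

bounds = bucket_boundaries(0.600)
print(bounds[0], bounds[-1])  # 0.3 2.4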
Discretionary goals
With discretionary goals, workload management decides how best to run this work. Because
workload management's prime responsibility is matching resources to work, discretionary
goals are used best for the work for which you do not have a specific performance goal. For a
service class with multiple performance periods, you can specify discretionary only as the
goal in the last performance period.
Discretionary work is run using any system resources not required to meet the goal of other
work. If certain types of other work are overachieving their goals, that work may be “capped”
so that the resources may be diverted to run discretionary work.
As explained earlier, a performance index of 1.0 indicates that the service class period is
exactly meeting its goal, and work with a discretionary goal is defined to have a performance
index of 0.81.
Universal donors
A donor is a service class period that donates resources to the receiver. Multiple donors may
donate multiple resources to a single receiver during one policy interval. The resources to
help the receiver may also come out of what is referred to as discretionary resources.
Discretionary resources are those that can be reallocated with little or no effect on the
system's ability to meet performance goals.
CPU capping
Certain types of work, when overachieving their goals, potentially will have their resources
“capped” to give discretionary work a better chance to run. Specifically, work that is not part of
a resource group and has one of the following two types of goals is eligible for this resource
donation (see the sketch after this list):
A velocity goal of 30 or less
A response time goal of more than one minute
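A small sketch of this eligibility test, using a hypothetical representation:

def eligible_for_capping(in_resource_group, goal_type, goal_value):
    """goal_value: velocity in percent, or a response time goal in seconds."""
    if in_resource_group:
        return False
    if goal_type == "velocity":
        return goal_value <= 30
    if goal_type == "response_time":
        return goal_value > 60   # more than one minute
    return False

print(eligible_for_capping(False, "velocity", 20))       # True
print(eligible_for_capping(False, "response_time", 30))  # False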
(Figure: the policy adjustment flow - select a receiver, determine the bottleneck, and end.)
The purpose of policy adjustment is to meet service class and resource group goals. Policy
adjustment is done approximately every 10 seconds.
Every 10 seconds the WLM policy adjustment routine gains control in each z/OS in the
sysplex. (In a sysplex, the WLM policy adjustment routines are not synchronized in time.) It
summarizes the system state information sampled (every 250 milliseconds) and combines it
with real measured data, such as the transactions' local average response time. Its major
objective is to select a receiver service class period and one or more donor service class
periods. The receiver gets a raise in a local priority figure; the donors get a decrease in a
local priority figure. A donor service class must be a heavy user of the resource that is
delaying the receiver. This priority is associated with the resource queue causing the highest
delay in the receiver transactions during the last 10 seconds.
To better understand how WLM adjusts these priorities, refer to 3.15, “CPU management”
on page 100 and 3.21, “I/O management” on page 108.
The priority involved in the exchange applies to the resource causing delays in the receiver
whose PI is greater than 1. To raise the priority of a set of dispatchable units, WLM generally
needs to decrease someone else's priority. The major reason for this behavior is to limit the
effect on other transactions (the ones that, in this 10-second interval, are neither donors nor
receivers). Because all transactions in a service class period are treated as a whole, all of its
dispatchable units (TCBs/SRBs) running in the same z/OS have the same priorities (with the
exception of the discretionary goal). However, this is not true if they run in different z/OS
systems.
Receivers
The logic for choosing the receiver is based on the PIs (local and sysplex) and importance.
Only one receiver is managed in each 10-second interval. The policy adjustment loop (refer
to 3.12, “WLM policy adjustment routine” on page 93 for more information) uses the sysplex
PI value at the beginning of each 10-second interval to decide whether there is an adjustment
it can make to improve the sysplex PI (greater than 1) for the most important unhappy service
class period, starting the search at importance 1. Note that each WLM acts upon its local PI
to decrease the sysplex PI (a weighted average of local PIs).
If a receiver cannot be helped (that is, if there are no happier heavy users of the resource
causing the delay on the receiver), or if the sysplex PI is less than 1, then the policy
adjustment loop is performed a second time, this time focusing on the local PI (greater than 1
and very important). An adjustment is made, if possible, to help a service class period from
the local system perspective. That is, a service class period is identified as a receiver when
its sysplex PI at importance 1 is less than 1 (“happy”), but its local PI is greater than 1. This
compensates for overachievement on one system cancelling out the effects of poor
performance on other systems that are processing the same service class periods. If a
receiver cannot be selected, then the same logic is executed at importance levels 2, 3, and
so on.
Each node can concurrently cache shared data in local processor memory through
hardware-assisted cluster-wide serialization and coherency controls. As a result, work
requests that are associated with a single workload, such as business transactions or data
base queries, can be dynamically distributed for parallel execution on nodes in the sysplex
cluster based on available processor capacity.
Parallel Sysplex technology builds on and extends the strengths of zSeries e-business
servers by linking up to 32 servers with near-linear scalability to create the industry's most
powerful commercial processing clustered system. Every server in a Parallel Sysplex cluster
has access to all data resources, and every “cloned” application can run on every server.
This approach allows critical business applications to take advantage of the aggregate
capacity of multiple servers to help ensure maximum system throughput and performance
during peak processing periods. In the event of a hardware or software outage, either
planned or unplanned, workloads can be dynamically redirected to available servers, thereby
providing near-continuous application availability.
To summarize, the sysplex PI is used first in the search for unhappy importance 1 service
class periods. If no receiver is selected, then the local PI is used for importance 1 unhappy
transactions. The search then continues for the other importance levels (2, 3, 4, 5). This
procedure helps to manage installations where heterogeneous transactions run in the same
service class, but on different systems.
The donor is usually a happy (PI less than 1) heavy user of the resource causing the most
delay in the receiver.
Note: The service classes SYSTEM and SYSSTC do not have an explicit numeric goal
and can never be candidates for having their priorities changed.
SYSTEM and SYSSTCx service classes are related to STC address spaces that are either
created by the START command or during the IPL process. The creation of these address
spaces is through the ASCRE (address space creation) macro. The STC address spaces are
assigned to SYSTEM or SYSSTCx, depending on the parameters passed as input to this
function and the ones declared in the SCHEDxx parmlib member.
SYSTEM and SYSSTCx can also be assigned to address spaces or transactions through the
classification rules. However, this assignment is not recommended. To prevent the
misclassification of system address spaces, WLM in z/OS V1R10 keeps the following
address spaces in the SYSTEM service class even if they are defined differently in the policy:
MASTER SCHEDULER, WLM, XCFAS, GRS, SMSPDSE, SMSPDSE1, CONSOLE,
IEFSCHAS, IXGLOGR, SMF and CATALOG
There is no goal associated with such a service class, and its DP and IOP never change.
In addition to the previously cited address spaces, the following address spaces use,
automatically, the SYSTEM service class:
DUMPSERV, RASP, CONSOLE, IOSAS, SMXC, ANTMAIN, JESXCF, and ALLOCAS
Similarly, there is no user-defined goal associated with the SYSSTC service class, and its DP
and IOP never change. Note the following points:
It is the default for the subsystem STC in the classification rules. If you do not define
another default, all STC address spaces not filtered through any rule will be assigned to the
SYSSTC service class.
Examples of address spaces using SYSSTCx are: SMS, LLA, JES2AUX, JES3, JES2,
VTAM, JES2MON, VLF, APPC, ASCH, RRS, DFSMSrmm, and OAM.
It is possible to define service classes SYSSTC1, SYSSTC2, SYSSTC3, SYSSTC4, and
SYSSTC5. Up to z/OS V1R9, such service classes have the same properties as SYSSTC.
DISCRETIONARY means “end of the queue for all services,” but not always.
Dispatch Priority   Workload
255                 SYSTEM
254                 SYSSTC
253                 Small consumer
252-208             Managed work
201-192             DISCRETIONARY (201 is the most I/O-bound; 192 is the most compute-bound)
Figure 3-14 lists the range of dispatch priorities and illustrates how WLM assigns them to the
transaction in the system.
The dispatch priority of 255 is reserved to keep the system running and to provide system
transactions preferential access to the CPU. All system transactions should run in the
predefined service class SYSTEM, which always gets a dispatch priority of 255.
Below this starts the range of dispatch priorities (from 208 to 252) that are dynamically
assigned to user-defined service class periods (not including the discretionary ones). These
are all service classes with an importance of 1 to 5. All dispatchable units of the same service
class period have the same base dispatching priority (with the exception of the
mean-time-to-wait group for discretionary transactions) in the same z/OS. The dispatching
order of all dispatchable units with the same dispatching priority, in the same or in different
service class periods, is periodically rotated.
The range from 202 to 207 is not used and the range from 192 to 201 is used for the
dispatchable units of discretionary transactions. All discretionary transactions are managed
by a mean-time-to-wait (MTW) algorithm which, simplified, gives a higher dispatch priority for
transactions using I/O and a lower dispatch priority to transactions demanding significant PU.
The last used dispatch priority is 191. It is assigned to quiesced transactions running in a
non-swappable address space. That also means that quiesced transactions can run if the PU
is not busy, and all other transactions are in wait state or active in other PUs.
The interesting range is from 208 to 252. WLM assigns these dispatch priorities to the
user-defined service classes periods that have a goal. The assignment is based on the PU
consumption, PU demand, and goal achievement of the transaction.
3.12, “WLM policy adjustment routine” on page 93 explains that WLM determines, through
the performance index, which service class periods meet their goals and which ones miss
their goals. In the next step, WLM starts to look for a receiver by starting with the most
important service class period with the worst PI. Then it determines the resource (based on
the largest delay numbers) for which an adjustment is required.
For PU, that means that the service class must show a high number of PU delays. Now WLM
looks for donors. Based on the captured data, WLM knows how much PU is consumed by
each service class period and how much PU is consumed by all transactions at the same
dispatch priority. From the PU delays and the average time that a transaction used the PU, it
is possible to calculate the demand that the transaction has for the PU.
To assign a new dispatch priority to a receiver service class period, WLM logically moves the
service class period to a higher dispatch priority and then projects the change of PU
consumption, demand, and delays for all service class periods at a lower and at the same
new dispatch priority of the transaction being moved up. If that is not sufficient, WLM looks in
the next step to determine whether some service class periods can run with a lower dispatch
priority and does the same calculations. Refer to 3.15, “CPU management” on page 100 for
more details. The algorithm continues until a positive result can be projected, meaning that
the performance of the receiver gets better and no service class periods with a higher
importance is badly affected, or until WLM concludes that no change is possible.
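In outline, the assessment can be pictured like this; all helper functions are hypothetical
placeholders for WLM's internal projections:

def assess_moves(candidate_moves, project, harms_higher_importance,
                 gives_receiver_value):
    """Return the first candidate priority change that passes both the
    net value test and the receiver value test, or None."""
    for move in candidate_moves:           # move receiver up, donors down
        outcome = project(move)            # forecast consumption and delays
        if harms_higher_importance(outcome):
            continue                       # fails the net value test
        if gives_receiver_value(outcome):
            return move                    # one accepted change per interval
    return None                            # no beneficial change found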
Figure 3-15 WLM CPU, zAAP, and zIIP dispatching priority adjustment
CPU management
Access to the PUs (CPs, zAAPs, and zIIPs) is controlled through dispatching priorities. z/OS
maintains three queues of ready dispatchable units, sorted in priority order. The z/OS
dispatcher component selects the next ready dispatchable unit (TCB or SRB) by going
through these queues. This section explains how these queues are managed by WLM. WLM
assigns dispatch priorities to dispatchable units associated with transactions running in
service class periods, based on the following factors:
The need of the service class period
– Goal achievement, expressed by performance index
– CPU consumption
The potential impact of a change
– It simulates the movement of service class period dispatchable units to higher
(receiver) and lower (donor) dispatching priorities.
– It forecasts, through calculation, the change in service consumption and the change in
goal achievement.
– It changes dispatch priorities only if the change is beneficial (determined through a
calculated net value) for the higher importance transactions.
In the first step, WLM moves service class I to a higher dispatch priority. Together with I, there are two other
service class periods, G and H, at the same dispatch priority. After the first move up for I, the
calculations show that H also has to move up because otherwise, no overall value of the
move for service class I can be projected; in other words, H would be degraded.
This projection is named “net value,” meaning that the overall change does not harm higher
importance transactions and is still beneficial for the receiver. But this move does not show
enough receiver value, meaning that the change does not help I enough to improve its
performance. In the next step, WLM tries to move transactions with higher dispatch priorities
to lower dispatch priorities. While the move down for C is acceptable, the move down for B
violates the net value test.
Finally, the move down of A passes the net value test, but again there is not sufficient
receiver value for I. In the third step, WLM again attempts to move the receiver up. In this
case the move for service class I to dispatch priority 249 gives enough receiver value for I and
all other classes to pass the net value tests. In this case, the A and C service class periods
are the donors to service class I, which is the receiver.
Now WLM concludes that the change is meaningful and adjusts the dispatch priorities of the
transaction. Note that if no valuable change could be projected, all intermediate calculations
would have been set back and no change would have occurred.
Also observe that, together with service class I, service class H was moved to a higher
dispatch priority. This became necessary because the first move showed that H would have
been degraded too much. Ultimately H is below I, but now competes with service classes E
and F at the same dispatch priority and therefore has better and sufficient access to the CPU.
The final picture has changed to some extent. The moves are calculated based on the current
and the expected CPU consumption, which is derived from the CPU demand of the service
classes and the amount of CPU that is available at each dispatch priority level.
This example also demonstrates why WLM only executes one change per policy adjustment
interval. One change does not mean that only the resource access for one service class
period has been changed—it means that the access to a resource for a set of service class
periods is changed to help one service class period.
Note: Keep in mind that goals generally should be based on the SLA (that is, response
time based). For workloads not covered by an SLA, you need to observe transaction
behavior. Do not define unrealistic goals; this is especially important for the execution
velocity type of goal.
Figure 3-17 Service class panel with CPU Critical option YES
As discussed in 3.16, “Reasons for CPU delays” on page 102, it is possible that transactions
in service class periods with easier goals and lower importance get a higher dispatch priority
than transactions with a high importance.
As a result, a service class period with low importance, easier goals, and low CPU service
consumption can obtain the highest available dispatch priority in the system. Note the
following points:
This is not a problem as long as the transaction behaves well.
However, it can hurt more important transactions when the transaction starts to use
greater amounts of CPU in an unpredictable manner. This often happens in CICS when,
after a low-activity period (for example, lunch time), users begin to log on and experience
elongated response times. After some policy adjustment, WLM gives CICS the proper
dispatching priority and things run fine again. The purpose of the CPU Critical feature is to
eliminate this problem.
This can help in situations where low importance transactions may negatively impact high
importance transactions (for a short but visible period).
The service class protected by CPU Critical must have just one period.
Be careful not to overuse this feature, because otherwise you do not leave WLM enough
flexibility to efficiently balance the available CPU resources across the service classes to
meet goals. WLM tries to meet not only the goals of the highest importance transactions, but
also those of lower importance transactions, if the resources permit it.
(Figure: dispatch priority assignment with CPU Critical - from higher to lower DP:
ServClas=Z, Imp=2, CC=N; ServClas=Z, Imp=3, CC=N; ServClas=X, Imp=3, CC=Y at DP 240;
ServClas=Y1, Imp=4, CC=N; ServClas=Y2, Imp=5, CC=N.)
This feature should be used for workloads of high business importance. CICS/IMS production
workloads are good candidates for this feature because their activity can be subject to
fluctuations.
Also, a resource group can be used, through a maximum value, to limit (cap) the amount of
CPU capacity available to some service classes. Capping is used in situations in which you
want to deny one or more service classes access to CPU cycles. Refer to 4.16, “Resource
group capping” on page 154 for more information.
This option can be useful for address spaces that need to retain virtual pages in main storage
during long periods of inactivity because they cannot afford paging delays when they become
active again. Without such a function, the LRU stealing algorithm steals the least-referenced
pages if main storage comes under contention. With long-term storage protection assigned,
such an address space loses main storage (by stealing) only to transactions of equal or
greater importance that need the storage to meet performance goals.
Storage of protected service classes is taken only when z/OS runs short on main storage.
(Figure: storage protection comparison - service class periods with Imp=3, ExVel=1, and
Storage Critical versus Storage Not Critical, alongside work with Imp=4, ExVel=20, Storage
Not Critical.)
Other characteristics are that storage remains protected even if no paging activity is going on
anymore. That is important for online regions (CICS/IMS) that do not show much activity
overnight.
IRLM (the DB2 lock manager address space) has erratic paging activity and can be protected
against page stealing by using the Storage Critical option.
I/O management
I/O management in WLM has several functions, such as I/O priority management (to
dynamically control I/O priorities) and Dynamic Parallel Access Volume (PAV).
I/O priority is a number used to place I/O requests in I/O queues. I/O priority can be optionally
managed by WLM if I/O priority management is set to Yes in the WLM policy; this option is
recommended. I/O priority is independent of the CPU dispatching priority. WLM uses eight
different I/O priority values to classify I/O operations, depending on the transactions requiring
the I/O.
I/O queues
There are three types of I/O queues where I/O priority management can apply:
IOS UCB queue - This is a local z/OS queue, because it is per device and all requests
come from the same z/OS. The queue occurs when there are no available UCBs to start
the I/O operation.
Channel subsystem queue - This is a global CEC queue, because the requests come from
all logical partitions existing in that CEC. The queue occurs when there is no full I/O path
(channel, port switch, controller, and device) that is totally free. Thus, there can be
competition among all the logical partitions to access a particular device.
Control unit queue - This is a global queue, because the requests come from all logical
partitions of all connected CECs. IBM control units starting with the 2105 are able to use
the I/O priority, passed in the I/O request and determined by WLM, to order this queue.
Thus, if one important service class period misses its goal (it is a candidate to be a receiver)
and policy adjustment finds out that I/O delay is the biggest problem, then WLM tries to find
other service class periods (candidates to be donors) that use the same set (or a subset) of
the devices for the service class period that is missing the goals. Then the usual assessment
logic starts to determine whether a change in I/O priority is possible between them.
Each I/O request is performed by or on behalf of a z/OS dispatchable unit, either a TCB or an
SRB. The dispatchable units are associated with:
Address spaces
Enclaves
This section explains how to determine where WLM can use priorities to manage the
Queue_Time (QT) in a DASD subsystem. Refer to 3.22, “I/O priority management” on
page 110 for more information about this topic.
To implement PAV, IOS introduces the concept of alias addresses. Instead of one UCB per
logical volume, an MVS host can now use several UCBs for the same logical volume. Apart
from the conventional base UCB, alias UCBs can be defined and used by z/OS to issue I/Os
in parallel to the same logical volume device.
Currently, the Enterprise Storage Server® (ESS) allows for concurrent data transfer
operations to or from the same volume on the same system using the optional feature Parallel
Access Volume (PAV). With ESS, alias device numbers can be defined to represent one
physical device, which allows for multiple I/O operations to be started at one time.
Traditionally (without PAV), the z/OS operating system does not allow more than one I/O
operation at a time to the same device, and queues additional I/O operations to the physical
device.
Each service class period maps to an I/O priority level. This mapping is dynamically adjusted
based upon how each service class period is meeting its goals and how the I/O contributes to
this success or failure. When IOS receives an I/O request and places the I/O request on the
UCB queue for the target device (all UCBs are busy), the priority of the request is used to
place the request in priority sequence with respect to other pending requests for the same
device.
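That queueing discipline can be sketched in Python; the queue representation is
hypothetical:

def enqueue_io(pending_queue, request, priority):
    """pending_queue: list of (priority, request), highest priority first;
    a new request goes behind equal-or-higher priorities (FIFO per priority)."""
    index = 0
    while index < len(pending_queue) and pending_queue[index][0] >= priority:
        index += 1
    pending_queue.insert(index, (priority, request))

queue = []
enqueue_io(queue, "req1", 3)
enqueue_io(queue, "req2", 5)
enqueue_io(queue, "req3", 3)
print(queue)  # [(5, 'req2'), (3, 'req1'), (3, 'req3')]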
WLM needs two types of information to make decisions about I/O priority management:
Device service time information
The device service time is the sum of device connect time and device disconnect time for
each I/O request, as measured by the I/O subsystem.
WLM can make decisions by comparing the level of device usage across the range of I/O
priorities, and projecting the amount of device usage that may occur if I/O priorities are
altered.
Device wait time (delay) information
The device delay time is the sum of the time each request was delayed by IOS (IOS
queue time) and the time delayed in the I/O subsystem (pending time).
WLM can make decisions by comparing the level of device delay across the range of I/O
priorities, and projecting the amount of delay that may occur if I/O priorities are altered.
Note: Channel measurement data gives the device connect and disconnect time and the
device pending time (including channel busy, controller busy, shared device busy). The
IOSQ time is a sampled value. The disconnect time is not taken into account by WLM
when calculating velocity% goal.
(Figure: I/O priority values - FF is reserved for system transactions and F8 for discretionary
work.)
This is part of the Intelligent Resource Director (IRD) feature available on zSeries, z9, and
z10 machines in z/Architecture mode. Refer to 3.26, “Intelligent Resource Director” on
page 118 for more details.
(Figure 3-23: PAV modes - with static PAV, an alias assignment lasts for the life of the
configuration; with dynamic PAV, for the life of a WLM adjustment interval; with HyperPAV, for
the life of a single I/O. Several applications (App1 through App5) use the UCBs of one LCU
through the subsystem cache.)
Your installation can manage PAV devices in static, WLM dynamic, or HyperPAV mode. If
static PAV is used, there is no change in device alias assignment unless an alias is deleted or
added using the DS8000® console. Figure 3-23 shows the PAV modes.
This section describes the current implementation of the WLM algorithms for dynamic alias
management with the DS8000 in some detail. WLM can be used to dynamically manage the
assignment of alias devices to base devices and thus ensure that the concurrency in I/O
processing provided by PAV is available where it is most needed. However, in general the
HyperPAV implementation is superior to WLM dynamic alias management.
(Figure: dynamic alias management - a number of aliases are moved from a donor device
entry to another device entry.)
To implement this function, you must specify YES for Dynamic Alias Management in the
WLM policy and WLMPAV=YES in each involved device in HCD.
The donor and the receiver UCB bases must be in the same LCU.
This specification is local per z/OS; a device can be WLMPAV=YES in one z/OS, and
WLMPAV=NO in another z/OS.
Specifying WLMPAV=NO for alias (device type 3390A) devices has no effect for a bound
alias, but protects an unbound alias from being a preferential candidate for Dynamic Alias
Management. This may be desirable in several instances:
Some devices in a shared LCU are normally online to some system images, but offline to
others.
When an installation has several sysplexes with shared access to a DS8000 LCU.
The DS8000 does not distinguish between static and dynamic PAVs; there are only PAVs.
Enhancements in z/OS V1R3 allow auxiliary storage manager (ASM) to state the minimum
number of PAV dynamic aliases needed for a paging device. The more page data sets you
have on the volume, the more alias devices will be set aside. A minimum number of aliases
will be preserved by the system even in cases where aliases are being moved to reflect
changes in workload. You do not need to define any of this; simply enable the volume for
dynamic alias management for your paging devices.
Note: All systems in your sysplex must be on a z/OS V1R3 or above level for this support.
(Figure: WLM dynamic alias management versus HyperPAV - on the left, WLM tells IOS to
reassign an alias, such as 1F0-1F3, from base 100 to base 110; on the right, IOS manages
alias assignment and usage per I/O request.)
HyperPAV logic
HyperPAV is more efficient than WLM dynamic PAV and will probably replace it eventually.
HyperPAV does not use WLM intelligence to distribute the alias UCBs among the base UCBs
of the 3390 devices. The work is done at the IOS level, when the I/O operation is supposed to
start. The logic is as follows:
IOS uses a shared pool of free UCB aliases per LSS, as defined in HCD. LSS stands for
logical subsystem, and it is the DS8000 name for a logical control unit (a set of 256 devices).
As each I/O operation is requested, if the base volume is busy with another I/O:
– IOS selects a free UCB alias from the pool, quickly binds the alias device to the base
device, and starts the I/O (through the SSCH instruction).
– When the I/O completes (I/O interrupt), the alias device is returned to the free alias
pool.
The I/O operation is queued only if the mentioned LSS free pool is empty. RMF has data
covering such situations.
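The flow can be sketched as follows; the types and names are hypothetical, and in reality
IOS performs the binding as part of starting the I/O with SSCH:

def start_io(base_is_busy, free_alias_pool, lss_queue, request):
    if not base_is_busy:
        return "started on the base UCB"
    if free_alias_pool:
        alias = free_alias_pool.pop()  # bind a free alias to the base device
        return f"started on alias {alias}"
    lss_queue.append(request)          # queued only when the pool is empty
    return "queued"

def complete_io(alias, free_alias_pool):
    free_alias_pool.append(alias)      # the alias returns to the LSS pool

pool = ["1F3", "1F2", "1F1", "1F0"]
print(start_io(True, pool, [], "read request"))  # started on alias 1F0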
With HyperPAV, WLM is no longer involved in managing alias addresses. For each I/O, an
alias address can be picked from a pool of alias addresses within the same LCU. This
capability also allows different HyperPAV hosts to use one alias to access different bases.
This reduces the number of alias addresses required to support a set of bases in a System z
environment with no latency in targeting an alias to a base.
Benefits of HyperPAV
HyperPAV has been designed to provide an even more efficient parallel access volume
(PAV) function. When implementing larger volumes, it provides a way to scale I/O rates
without the need for additional PAV alias definitions. HyperPAV exploits FICON architecture
to reduce overhead, improve addressing efficiencies, and provide storage capacity and
performance improvements, as follows:
More dynamic assignment of PAV aliases improves efficiency.
The number of PAV aliases needed might be reduced, taking fewer from the 64 K device
limitation and leaving more storage for capacity use.
(Figure: dynamic channel path management - the ability to move channels to DASD control
units as needed, through switches.)
IRD functions
These IRD functions are:
LPAR CPU Management
– WLM Vary logical CPU management
– WLM Weight management
Dynamic Channel Management (ESCON channels with ESCON Directors only)
Channel subsystem priority queuing (CSS IOPQ)
However, for certain types of data it is difficult to implement data sharing, and in these cases
the unbalanced behavior is not completely resolved. IRD introduces the opposite approach:
instead of moving work to the resources, it dynamically moves the hardware (CPUs and
channels) to the z/OS LP where it is most needed, thereby resolving the imbalance problem.
By assigning channels to the pool of managed channels, the system is able to respond to
peaks in a control unit’s demands for I/O channel bandwidth. In addition, this reduces the
complexity of defining the I/O configuration, because the client is no longer required to define
a configuration that will adequately address any variations in workload. This allows the client
to specify a much looser configuration definition.
DCM can provide improved performance by dynamically moving the available channel
bandwidth to where it is most needed. Prior to DCM, you had to manually balance your
available channels across your I/O devices, trying to provide sufficient paths to handle the
average load on every controller. This means that at any one time, some controllers probably
have more I/O paths available than they need, while other controllers possibly have too few.
Another benefit of Dynamic Channel Path Management is that you can be less focused on
different rules of thumb regarding how busy you should run your channels for every channel
or control unit type you have installed. Instead, you simply monitor for signs of channel
over-utilization (high channel utilization combined with high pending times).
Because Dynamic Channel Path Management can dynamically move I/O paths to the LCUs
that are experiencing channel delays, you can reduce the CU and channel-level capacity
planning and balancing activity that was necessary prior to Dynamic Channel Path
Management.
Using DCM, you are only required to define a minimum of one non-managed path and up to
seven managed paths to each controller (although a realistic minimum of at least two
non-managed paths is recommended), with Dynamic Channel Path Management taking
responsibility for adding additional paths as required, and ensuring that the paths meet the
objectives of availability and performance.
(Figure: a sysplex containing two LPAR clusters, syszwlm_123451C5 and syszwlm_98765111,
each on its own z10; the configuration includes z/OS images, a CF, and a Linux for zSeries
LP.)
IRD uses the concept of an LPAR cluster, which consists of the subset of z/OS systems (you
also may have LPs with Linux, with or without z/VM®) that are running as LPs on the same
zSeries (or z9, or z10) server. The z/OS images must be in the same Parallel Sysplex.
IRD uses facilities in z/OS Workload Manager (WLM), Parallel Sysplex, SAP (which is the
processor unit in charge of starting an I/O operation) and LPARs to help you derive greater
value from your investment.
An extensive discussion of IRD is beyond the scope of this book. Instead, this book
describes how to create a WLM policy that interacts properly with IRD. All IRD functions and
setups are fully explained in the Redbooks publication z/OS Intelligent Resource Director,
SG24-5952.
L P A R C L U S T E R R E P O R T
z/OS V1R5 SYSTEM ID SC69 DATE 11/07/2004 INTERVAL 14.59
RPT VERSION V1R5 RMF TIME 16.30.00 CYCLE 1.000 sec
------ WEIGHTING STATISTICS ------- PROCESSOR STATISTICS ----
--- DEFINED -- --- ACTUAL ----- ---- NUMBER --- -- TOTAL% --
CLUSTER PARTITION SYSTEM INIT MIN MAX AVG MIN % MAX % DEFINED ACTUAL LBUSY PBUSY
WTSCPLX1 A0A SC69 10 1 999 19 0.0 0.0 16 2.0 99.30 11.03
A01 SC55 10 1 999 1 100 0.0 2 2.0 80.48 8.94
A02 SC54 10 10 2 2 95.05 10.56
A03 SC49 185 185 2 2 99.91 11.10
A07 SC52 10 10 2 2 95.05 10.56
A08 SC48 10 10 2 2 95.05 10.56
A09 SC47 185 185 2 1 99.63 5.53
In the LPAR cluster WTSCPLX1 there are seven LPs. In the HMC, the LPs A0A and A01 are defined with a minimum weight of 1 and a maximum of 999. As you can see, the weight values for A0A and A01 have changed from their initial values due to IRD WLM Weight Management action. For LP A01, the weight is now equal to the minimum of 1; for LP A0A, it is 19. The amount of weight lost by the donor (A01) is equal to the amount received by the receiver (A0A).
IRD raised the weight of A0A because an important service class running in A0A was unhappy (PI > 1.0), mainly because of CPU delay. The weight was taken from A01, where the less important service classes run that can be allowed to suffer.
Also, the IRD function WLM Vary CPU Management was active for LPs A0A and A01, as indicated by the fractional value (2.0) in the ACTUAL field. However, the value apparently remained constant throughout the RMF interval, meaning that WLM had found the correct figure for the interval. More logical CPUs imply more capacity but, at the same time, more LPAR overhead; WLM balances both.
           --------Qualifier--------            -------Class--------
Action     Type      Name      Start            Service     Report
                     DEFAULTS:                  VRPTDEF     ________
____  1    TN        STI*      ___              RTG05       VRREPT
More ===>
           --------Qualifier--------            -------Class--------
Action     Type      Name      Start            Service     Report
                     DEFAULTS:                  STCDEF      ________
____  1    TN        AOR1      ___              VEL60       ________
____  1    TN        AOR2      ___              VEL60       ________
To solve that, WLM developed a set of programming interfaces that allow transaction managers such as CICS and IMS/DC to identify their transactions to WLM. This helps WLM trace the transactions through the transaction manager address spaces and identify the transaction response time in z/OS. WLM developed various techniques to recognize transactions on the system and to manage them independently from, or together with, the address space where the programs run to process them.
In CICS or IMS/DC, the installation may define the goal in the policy, as shown in Figure 3-30: an Execution Velocity% goal declared in subsystem STC (classification rules) for the ASs (regions) executing the transactions. Based on that, WLM calculates these ASs' priorities. In this case, WLM is not informed about transactions. This method is called Region.
It is also possible to either manage some CICS/IMS regions (address spaces) to velocity
goals or to manage some others as response time. Then, installations could select certain
regions (for example, regions of less importance, regions processing conversational
transactions, or regions with very low activity) and have WLM manage them to the velocity
goal, while the more important ones with high activity are best managed by WLM according to
the transaction service classes response time goals.
The installation can force one of the methods (Region or Transaction) for certain CICS/IMS
address spaces through the classification rules associated with certain transactions. If you
declare, for example, Region, then even if the transaction arrives with a response time goal,
the method used is Region.
The process works like this: when the CICS address space is started, WLM uses the Velocity
goal as declared in the subsystem STC. (We recommend defining an aggressive goal to start
CICS quickly.) When the first transaction arrives, WLM verifies whether the transaction has a
subsystem CICS rule pointing to a service class response time goal.
If there is no such rule, then WLM keeps using the Velocity of the address space. However, if
there is such a rule, then the address spaces priorities are decided by the PIs and importance
of the running transactions. The point is to always define a region execution velocity goal for a
CICS production (for a fast start), even if you plan to define a subsystem CICS response time.
WLM creates a picture of the address space topology, including the server address spaces and the service classes of the running transactions; all this information is passed by CICS. It uses this topology to manage the server address space priorities based on the goal achievement of the most important and most unhappy service classes. The "free ride" phenomenon may occur here, where other transactions in the same heterogeneous address space enjoy high priorities caused by very important and very unhappy transactions of other service classes.
To understand the distribution of transactions across server ASs, WLM uses a concept known as "internal service classes." One internal service class is a set of regions executing the same set of external service class transactions.
Figure 3-31 depicts the server topology. It shows three external service classes (the ones
defined in WLM policy) for CICS or IMS/DC transaction and four registered server regions
(address spaces) that process the transactions.
WLM needs to identify the number of internal service classes. One region processes only
requests for service class Online_High (circles). As a result, one internal service class is
created, which is used to manage this region.
Assume that the external Service Class Online_High misses its goals. Through the topology,
WLM determines that three server regions process work requests of this Service Class, which
are associated with two internal Service Classes. Then it calculates how much each of the
internal Service Classes contributes to the goal achievement of the external Service Class.
Further assume that WLM finds out that the server regions of the internal class 2 need help
(they are suffering WLM measured delays). Therefore, WLM helps the server regions of this
Service Class to improve the goal achievement of the external Service Class Online High.
Figure 3-32 Classification rule for CICS address space defining transaction type of goal
For JES and STC work only, you can also fill in the “Manage Regions Using Goals Of:” field,
with a value of either TRANSACTION, in which case the region will be managed to the
transaction response time goals, or REGION, in which case the region is managed to the goal
of the service class assigned to the region. Note that for ASCH, CICS, IMS, and OMVS work,
the “Manage Regions Using Goals Of:” field will read N/A.
                                   Required
Action   Resource Name             State        Resource Description
__       DB2A                      OFF          DB2 Subsystem
__       PRIMETIME                 ON           Peak Business Hours
Scheduling environment
The use of the scheduling environment facility is optional. WLM can also be in charge of
routing batch jobs to z/OS systems that have the resources that will be needed by such jobs.
Thus, the function is available for the JES2 multi-access spool facility and JES3. A scheduling
environment lists resource requirements, which ensures that transactions are sent to systems
that have those resources. This is particularly important in sysplex installations where the
z/OS systems are not identical.
The defined resources can be physical entities (such as hardware features or a subsystem instance), software products, or intangible entities such as certain periods of time (peak business hours, off-shift hours, and so on).
As illustrated in Figure 4-1, the scheduling environment (DB2LATE) lists the resources (by resource name) and their required state, ON or OFF, as they exist in each of the z/OS systems. In the example shown in the figure, DB2A is OFF and PRIMETIME is ON. The resources are also called scheduling elements. Each arriving transaction (mainly a batch job) carries a keyword in its JCL naming the required scheduling environment; this name is used to select a z/OS system whose scheduling element states match those described in the named scheduling environment.
Through a MODIFY WLM command, the installation may set ON or OFF the scheduling
elements in a particular z/OS.
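For example, using the resource names from Figure 4-2 (the RO *ALL form routes the command to every system in the sysplex, while the plain form acts on the local system only):

RO *ALL,F WLM,RESOURCE=PRIMETIME,ON
F WLM,RESOURCE=DB2A,ON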
Figure 4-2: Setting scheduling element states in a two-system sysplex. The operator issues RO *ALL,F WLM,RESOURCE=PRIMETIME,ON for both systems, then F WLM,RES=DB2A,ON on SYSA and F WLM,RES=DB2A,OFF on SYSB (steps 1 and 2). The resulting WLM resource states, kept in the WLM couple data set, are DB2A ON and PRIMETIME ON on SYSA, and DB2A OFF and PRIMETIME ON on SYSB, making the scheduling environment DB2PRIME available on SYSA and unavailable on SYSB, and NODB2 unavailable on SYSA and available on SYSB (step 3). Jobs //JP JOB SCHENV=DB2LATE and //JL JOB SCHENV=NODB2 enter the JES2 job queue, and job JP is executed on SYSB (step 4).
Scheduling environment
A scheduling environment is a combination of sysplex systems, database managers, devices, and languages capable of executing an application. In other words, it is a set of resource elements, each with a specific required value (ON or OFF). In Figure 4-2, the scheduling environment DB2LATE is formed by the resource element states DB2A OFF and PRIMETIME ON.
As mentioned, scheduling environments help to ensure that units of work are sent to systems
that have the appropriate resources to handle them. They list resource names along with their
required states. Resources can represent actual physical entities, such as a database or a
peripheral device, or they can represent intangible qualities such as a certain period of time
(like second shift or weekend).
These resources are listed in the scheduling environment according to whether they must be set to ON or set to OFF. A unit of work can be assigned to a specific system only when all of the required states are satisfied.
Resource element
To understand the idea of resource affinity scheduling, other concepts are introduced, such
as:
A resource element (or scheduling element) is a representation of an execution time
resource, which can be ON or OFF in a system. The installation must identify and name
these entities.
A resource element can be:
– A resource, such as a database, a peripheral device, a machine feature, or a system with cheap cycles
– A time property, such as second shift or weekend
In Figure 4-2 on page 131, the resource element DB2A is ON in SYSA and OFF in SYSB.
This state is set by the installation through the command (F WLM). The resource element
PRIMETIME is ON on both z/OS systems.
In Step 4, the JP job can only run in the SYSB system. In the SYSB system, the resource
element DB2A is OFF and PRIMETIME is ON.
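In JCL, the scheduling environment is requested through the SCHENV keyword on the JOB statement. A minimal sketch using the DB2LATE name from Figure 4-2 (the accounting and class values are illustrative):

//JP      JOB (ACCT),'LATE DB2 BATCH',CLASS=A,SCHENV=DB2LATE

If no system currently satisfies the DB2LATE states, the job waits on the JES queue until an operator sets the required resource states through the MODIFY WLM command.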
In a multisystem configuration, one challenge is that not every z/OS system is capable of
providing all scheduling environments at all times for arriving batch jobs. In other words, the
images might not have the same resources. However, it is also important that this be
transparent to users.
Figure: Dynamic workload balancing. Clients send transactions to a routing manager which, advised by WLM, distributes them across the candidate servers.
Dynamic workload balancing implies selecting the best z/OS system and the best address
space for an arriving transaction. WLM, because of its global understanding of systems and
resources usage, is the right z/OS component to be the advisor for a transaction manager just
receiving an incoming transaction. To use WLM as a dynamic workload balancer advisor, the
transaction manager must register itself as an exploiter through the WLM API IWMSRSRG.
To make the most of workload management, work needs to be properly distributed so that
MVS can manage the resources. It is essential that the subsystems distributing work are
configured properly for workload distribution in a sysplex. You do this with the controls
provided by each subsystem. For example, in a JES2 and JES3 environment, you spread
initiator address spaces across each system.
Transaction managers
Initial cooperation between MVS and the transaction managers (CICS, IMS, DB2) allows you
to define performance goals for all types of MVS-managed work. Workload management
dynamically matches resources (access to the processor and storage) to work to meet the
goals.
Other subsystems also have automatic and dynamic work balancing in a sysplex. For
example, DB2 can spread distributed data facility (DDF) work across a sysplex automatically.
DB2 can also distribute work in a sysplex through its sysplex query parallelism function.
CICS, TSO, and APPC cooperate with VTAM and workload management in a sysplex to
balance the placement of sessions. SOM objects can automatically spread servers across a
sysplex to meet performance goals and to balance the work.
Figure 4-4: Workload balancing models: capacity-based, round robin, goal achievement, balanced utilization, and homogeneous servers

Workload balancing
Workload balancing does not necessarily imply that either the transactions or the utilizations are evenly distributed. There are several models (algorithms), listed in Figure 4-4, that may be used by WLM to perform workload balancing:
Capacity-based attempts to direct transactions to whichever system has the most
available capacity (in terms of MSUs). It does not result in balanced utilization if the CPCs
are of significantly different speeds. This model is the one most used by WLM workload
balance functions.
Round robin attempts to route an equal number of transactions to each server. It also does not result in balanced utilization: if the CPCs are of significantly different speeds, then a similar amount of work results in lower utilization on the larger CPC (for example, the SAP PU rotating I/O requests through channels).
Goal achievement attempts to route to the server that is currently delivering performance
that meets or exceeds the installation’s objectives. Faster CPCs deliver better response
times at higher utilizations than slower CPCs, again resulting in uneven utilization across
the CPCs. XCF uses this model for selecting CF links.
Balanced Utilization attempts to direct transactions to whichever system has the least
utilized processors. z/OS does this with the logical CPUs.
Homogeneous servers attempts to send to the same CICS AOR address space any
transactions from the same service class.
An exploiter in this context is a WLM application running in one z/OS, such as a CICS AOR. The transaction is then routed by the transaction manager (TCP/IP DNS in the example) to the exploiter proportionally to its weight, so higher-weight z/OS systems receive more transactions.
As shown in Figure 4-5, three application exploiters running in three different z/OS systems
(Server 1, Server 2, and Server 3) are returned by WLM with weights of 2, 2, and 4
respectively. DNS then routes one-fourth of the requests to Server 1; one-fourth to Server 2;
and one-half to Server 3.
Weight calculation
The weight calculation is as follows:
The WLM workload balancer routine measures LP available capacity by:
– The difference between the actual utilization of this LP and the capacity it is
guaranteed as a result of its weight and its number of online logical CPs.
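As an illustrative calculation (the numbers are hypothetical): if an LP's weight and number of online logical CPs guarantee it 30% of a 1000-MSU CEC (that is, 300 MSU) and it is currently consuming 200 MSU, the balancer sees roughly 100 MSU of available capacity on that LP.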
Servers on systems that are in a serious storage shortage (SQA pages, a high number of fixed frames, a small number of free auxiliary storage slots) do not receive high weights; they are not recommended unless all systems are in a shortage.
Also taken into consideration is the health check of the server (that is, the number of abends
per time), to avoid the “black hole effect” (meaning that, because the server is abending, the
apparent response time is very short and then even more transactions are sent to the failing
server).
Generic resources
The generic resources function also increases application program availability, because each
active application program that uses a given generic resource name (a generic resource
member) can back up other generic resource members. Thus, no single application program
is critical to resource availability. When a generic resource member fails, an LU can reinitiate
its session using the same generic resource name. VTAM resolves the session initiation to
one of the other generic resource members. Because the user is unaware of which generic
resource member is providing the function, the user is less affected by the failure of any
single generic resource member.
Sysplex distributor makes use of Workload Manager (WLM) and its ability to gauge server
load and provide a WLM recommendation. In this paradigm, WLM provides the distributing
stack with a WLM recommendation for each target system (a WLM system weight), or the
target stacks provide the distributing stack with a WLM recommendation for each target
server (a WLM server-specific weight). The distributing stack uses this information to
optimally distribute incoming connection requests between a set of available servers.
CICS CP/SM
CICS CP/SM selects the best AOR address space in the best system. WLM controls which
requesting (TOR) CICS regions receive the work. WLM can also affect which AOR is chosen
when using CICSPlex SM.
Figure 4-7: Queue manager and queue servers. A queue manager address space classifies incoming requests (for example, http://...) through the classification rules and an application environment directive, and WLM queues them to queue server address spaces in application environments AE1 and AE2; each queue server runs worker threads together with timer, log manager, and cache manager components.
The work manager subsystem contains a table or file that associates the work request names
with an application environment name. The application environment names AE1 and AE2 are
specified to workload management in the service definition stored in the WLM couple data
set, and are included in the active policy when a policy is activated.
Application environment
An application environment (AE) is a group of application functions requested by a client that
execute in server address spaces. As mentioned, workload management can dynamically
manage the number of server address spaces to meet the performance goals of the work
making the requests.
The design of server address space management poses the following questions:
If the goals are not met, can an additional server address space improve the performance
index?
If the goals are not met, can an additional dispatchable unit in the server address space
improve the performance index?
If there is a resource constraint (CPU or storage) in the system, how do you reduce the
activity of server address spaces?
When should the number of server address spaces be decreased?
Will the creation of a new server address space adversely impact the performance of
other, more important goals?
These transaction managers use WLM services to permit transaction flow from their
network-attached address spaces, through WLM, and into server address spaces for
execution. The network-attached address spaces are also known as queue managers or
controller, and the server address spaces are also known as queue servers, as listed in
Figure 4-7 on page 140.
JES2 and JES3 transaction managers do not need special application environment
definitions to exploit the server address space management functions as implemented in
WLM Batch Initiator function; see Figure 4-11 on page 146 for more information about this
topic.
Application environment
The server address space management function is implemented through the application environment concept. An application environment (AE) is a WLM construct grouping server address spaces that belong to a transaction manager, have similar data set definition and security requirements, and are therefore started by the same JCL procedure. An AE can have a system-wide or sysplex-wide scope, depending on the structure and capabilities of the transaction manager using the AE. The scope of the AE is dictated by the WLM services the transaction manager uses: the queuing manager services provide system scope, and the routing services provide sysplex scope for an AE.
Each arriving transaction must carry the name of the AE where it belongs, and it can be selected only by a server address space that belongs to this AE. The AE is thus conceived to solve the affinity problem between transactions and server address spaces. If there are no affinities between transactions and server address spaces, you do not need AEs, or a single AE would be enough.
You must define your AE in the WLM policy and assign incoming transactions to a specific AE. Figure 4-8 on page 142 shows the definition of an AE named DBDMWLM1. The transaction manager specialists need to be informed of that name, so that its arriving transactions can present it. You may have several AEs per transaction manager.
When defining an AE in WLM service definition, you can specify the following, as shown in
Figure 4-8 on page 142:
Application environment name - unique in the sysplex
Transaction manager - identifies the transaction manager by type and subsystem name
STC procedure name - the JCL in SYS1.PROCLIB to start the server address space
Start parameters - optional parameters to be passed to the server during address space
creation (such as user ID for security checking)
WLM management options - special options governing WLM control of address spaces, such as NOLIMIT for the number of server address spaces
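A minimal sketch of such a definition, using the DBDMWLM1 name from Figure 4-8 (the subsystem type, procedure name, and start parameters shown here are illustrative; &IWMSSNM is the WLM symbol that resolves to the subsystem instance name):

Application Environment Name . : DBDMWLM1
Subsystem Type . . . . . . . . : DB2
Procedure Name . . . . . . . . : DBDMPROC
Start Parameters . . . . . . . : DB2SSN=&IWMSSNM,APPLENV=DBDMWLM1
Limit on starting server address spaces: No limit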
The workload requirements for a given application may determine that multiple server
address spaces should be activated. WLM decides to activate a new address space based
on the following information:
There is available capacity (CPU and storage) in the system, and transactions in a service
class are being delayed waiting on the WLM queue.
A service class is missing its goals, it is suffering significant queue delay, and other
transactions (donors) in the system can afford to give up resources to the transaction
missing goals (receiver).
Note: For every AE queue, WLM starts at least one address space. Therefore, do not
specify too many service classes, to avoid having WLM start too many address
spaces.
Furthermore, the service can be used to define how server address spaces should be resumed for static and dynamic application environments.
Before using this service, the caller must connect to WLM using the IWM4CON service, specifying Work_Manager=Yes and Queue_Manager=Yes.
A queuing manager must not insert requests for a dynamic and a static application environment with the same application environment name concurrently.
Figure 4-10 WebSphere and WLM: the WebSphere controller region queues incoming work through WLM queues to the WASHI, WASLO, and DEFLT servant regions; each request is dispatched as an enclave in a servant region and managed according to the WLM policies (which are administered through the WLM ISPF application, with the application files in the HFS).
Server regions
WebSphere applications that run in the servant region as part of the dispatched enclave use
WLM classification rules. Each WebSphere transaction is dispatched as a WLM enclave and
is managed within the servant region according to the service class assigned through the CB
service classification rules.
WLM-managed initiators
JES2 and JES3 provide automatic and dynamic placement of initiators for WLM-managed job
classes. JES2 and JES3 provide initiator balancing, so that already available WLM-managed
initiators can be reduced on fully loaded systems and increased on low loaded systems to
improve overall batch work performance and throughput over the sysplex. With WLM initiator
management, initiators are managed by WLM according to the service classes and
performance goals specified in the WLM policy.
The main purpose of WLM batch initiator management is to let WLM dynamically adjust the
number of active initiators. In this way, WLM can manage the time that jobs wait for an initiator
based on the goals of the batch service class periods.
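In JES2, for example, a job class is placed under WLM initiator management (rather than under JES-managed initiators) with a command such as the following; the class name is illustrative:

$T JOBCLASS(A),MODE=WLM

Jobs in that class are then selected by WLM-managed initiators according to the goals of their batch service class periods.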
The decision to increase the number of WLM initiators takes into consideration the effect on
general system performance, and sometimes such an increase is not executed.
WLM improves the balancing of WLM-managed batch initiators between the systems of a sysplex. Whereas in earlier releases initiators were balanced between highly and lightly loaded systems only when new initiators were started, this is now also done for initiators that are already available: on highly utilized systems the number of initiators is reduced, while new ones are started on less utilized systems.
This enhancement can improve sysplex performance with better use of the processing
capability of each system. WLM attempts to distribute the initiators across all members in the
sysplex to reduce batch work on highly used systems, while taking care that jobs with
affinities to specific systems are not hurt by WLM decisions.
Initiators are stopped on systems that are utilized over 95% when another system in the
sysplex offers the required capacity for such an initiator. WLM also increases the number of
initiators more aggressively when a system is low utilized and jobs are waiting for execution.
As you run more UNIX-based products, you need a z/OS UNIX environment that performs
well. Such products include the Lotus® Domino Server, TCP/IP, SAP R/3, and Novell
Network Services based on z/OS UNIX. Keep the following considerations in mind regarding
your z/OS UNIX workload:
Each z/OS UNIX interactive user consumes up to double the system resource of a TSO/E
user.
Every time a user tries to invoke the z/OS UNIX shell, RACF will have to deliver the
security information out of the RACF database.
The average active user will require three or more concurrently running processes that,
without tuning, run in three or more concurrently active address spaces.
These are only a few of the considerations regarding performance impacts and how to
prevent them. In most cases in our lab, tuning improved throughput by 2 to 3 times and the
response time improved by 2 to 5 times. The following sections describe how to achieve such
a significant improvement.
Figure: In goal mode, Workload Manager assigns z/OS UNIX processes to service classes through the classification rules.
Types of capping
Capping artificially limits the CPU usage rate for a specific set of tasks and service requests, usually associated with user transactions. The types of capping, as listed in Figure 4-15, are:
With WLM resource group capping, a set of service classes in a sysplex is capped.
LPAR capping is achieved through weights or by decreasing the number of logical CPs.
The full LPAR is capped.
With soft capping by LPAR with WLM, the full LPAR is capped.
Resource groups
Using resource groups is a way to limit or guarantee resource capacity. A resource group is a
named amount of CPU capacity that you can assign to one or more service classes. For most
systems, you can let workload management decide how to manage the resources in the
sysplex and not use resource groups. You set performance goals for work and let workload
management adjust to meet the goals.
In some cases, however, you might want to use a resource group to, for example, limit the
service that a certain service class can consume. It is recommended that you assign each
resource group to only one service class.
Keep in mind your service class goals when you assign a service class to a resource group.
Given the combination of the goals, the importance level, and the resource capacity, some
goals may not be achievable when capacity is restricted.
If work in a resource group is consuming more resources than the specified maximum
capacity, then the system caps the associated work accordingly to slow down the rate of
resource consumption. The system may use several mechanisms to slow down the rate of
resource consumption, including swapping the address spaces, changing its dispatching
priority, and capping the amount of service that can be consumed. Reporting information
reflects that the service class may not be achieving its goals because of the resource group
capping.
Note: Resource group capping, as a goal enforcement, is performed at the sysplex level.
The CPU service rate values are accumulated first on local systems, and then across the
sysplex for total. The total value of the consumed service rate is used to determine if it
exceeds the resource group maximum service rate.
Define Capacity:
__ 1. In Service Units (Sysplex Scope)
2. As Percentage of the LPAR share (System Scope)
3. As a Number of CPs times 100 (System Scope)
Minimum Capacity . . . . . . . ______
Maximum Capacity . . . . . . . ______
This book focuses on resource group type 2 and 3, as detailed in “Resource group - type 2”
on page 158 and “Resource group - type 3” on page 159.
Specify a one- to eight-character name of the resource group. Every resource group name
must be unique within the sysplex.
When defining a resource group, consider how workloads and service classes have been set up, including whether they have been set up by subsystem, by application, or by department or location. You can define up to 32 resource groups per service definition. For options 2 or 3, the minimum or maximum capacity is defined at the system level, which means that every system is individually managed to ensure the minimum and maximum capacity.
For option 1, the capacity is specified in unweighted CPU service units per second, and the value must be between 0 and 999999. For this option, minimum and maximum capacity apply sysplex-wide; that is, WLM ensures that the limits are met across the sysplex.
Minimum This refers to the CPU service that should be available for this resource group
when work in the group is missing its goals. The default is 0. If a resource
group is not meeting its minimum capacity and work in that resource group is
missing its goal, then workload management will attempt to give CPU resource
to that work, even if the action causes more important work (outside the
resource group) to miss its goal.
If there is discretionary work in a resource group that is not meeting its
minimum capacity, then WLM will attempt to give the discretionary work more
CPU resource if that action does not cause other work to miss its goal.
The minimum capacity setting has no effect when work in a resource group is
meeting its goals.
Maximum This refers to the CPU service that this resource group may use. Maximum
specified for this resource group applies to all service classes in that resource
group combined.
Maximum is enforced. There is no default maximum value.
Define Capacity:
2 1. In Service Units (Sysplex Scope)
2. As Percentage of the LPAR share (System Scope)
3. As a Number of CPs times 100 (System Scope)
Minimum Capacity . . . . . . . ______
Maximum Capacity . . . . . . . 50____
In Figure 4-18, the resource group named RGLPMX50 limits its assigned service classes to no more than 50% of the logical partition (LP) upon which the service class transactions are running. If this constraint comes into play on one LP, it has no effect on the covered transactions running on another LP unless those transactions also try to use more than 50% of that LPAR. Note that although the limit is enforced at z/OS scope, the number is the same for every z/OS in the sysplex covered by this WLM policy; therefore, the transactions may experience different constraints on different LPARs.
LPAR share
The LPAR share is defined as the ratio, expressed as a percentage, of the weight defined for the logical partition to the sum of the weights of all active partitions on the CEC. Note that the effective capacity behind this share also depends on other constraints that may have been placed on the LPAR, such as the number of logical CPUs, soft capping, and LPAR capping.
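As an illustrative calculation (the weights are hypothetical): an LPAR with a weight of 300 on a CEC where the active partitions' weights add up to 1000 has an LPAR share of 300 / 1000 = 30%. A resource group maximum of 50 defined with option 2 would then limit the covered work on that system to half of that share, or 15% of the CEC.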
Define Capacity:
3 1. In Service Units (Sysplex Scope)
2. As Percentage of the LPAR share (System Scope)
3. As a Number of CPs times 100 (System Scope)
Minimum Capacity . . . . . . . ______
Maximum Capacity . . . . . . . 250___
Note: An advantage of using resource group type 3 is that it dynamically adjusts to the processor capacity when the work is run on different hardware.
Figure 4-19 displays the panel for the type 3 option, where capacity is expressed as a number of CPUs multiplied by 100; the maximum of 250 therefore represents 2.5 CPUs of the CEC's capacity.
If this constraint comes into play on one LP, it has no effect on the amount of covered
transactions running on another LPAR, unless those transactions also try to use more than
2.5 CPUs.
Figure: Defined capacity, showing the 4-hour rolling average of MSU consumption plotted against the time from IPL (0 to 20 hours).
Defined capacity
Defined capacity is part of the z/OS support of Workload License Charges. With it, you can
set a defined capacity limit, also called a soft cap, for the work running in a logical partition.
This defined capacity limit is measured in millions of service units per hour (MSUs). It allows
for short-term spikes in the CPU usage, while managing to an overall, long-term, 4-hour
rolling average. It applies to all work running in the partition, regardless of the number of
individual workloads that the partition may contain.
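As an illustrative example (the numbers are hypothetical): a partition with a defined capacity of 50 MSUs may briefly consume 80 MSUs during a workload spike, as long as its 4-hour rolling average stays at or below 50 MSUs; once the rolling average reaches the limit, the partition is capped at 50 MSUs until the average drops below the limit again.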
Group capacity
Soft capping (also known as "defined capacity") is another type of capping, implemented to support usage-based software pricing. In z/OS V1R8 and z/OS V1R9, the WLM defined capacity mechanism is extended to handle groups of LPARs instead of a single LPAR; this is called "group capacity limit support." Because defined capacity has been in use for some years, there was a requirement for more flexibility when defining capacity limits.
Each partition is going to see the consumption of all the other LPARs on the processor. If the
partition belongs to a group, it identifies the other partitions in the same group and calculates
its defined share of the capacity group based on the partition weight (compared to the group).
This share is the target for the partition if all partitions of the group want to use as much CPU
resources as possible. If one or more LPARs do not use their share, this donated capacity is
distributed over the LPARs that need additional capacity. Even when a partition receives
capacity from another partition, it never violates its defined capacity limit (if one exists).
Figure: Group capacity example. A capacity group is defined consisting of three partitions: LPAR1, LPAR2, and LPAR3. The group limit is defined as 50 MSU, and the weights of the partitions and the resulting shares are:

Partition   Weight   Share of the 50 MSU group limit
LPAR1       100      16.7
LPAR2       50       8.3
LPAR3       150      25
WLM uses the weight definitions of the partitions and their actual demand to decide how
much CPU can be consumed by each partition in the group.
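The target for each partition follows from its weight: target MSU = group limit x (partition weight / sum of the group weights). With LPAR2 at weight 50 and LPAR3 at weight 150, the observed shares of 8.3 and 25 MSU imply a total group weight of 300, and hence a weight of 100 for LPAR1: LPAR3 gets 50 x 150/300 = 25 MSU, LPAR1 gets 50 x 100/300 = 16.7 MSU, and LPAR2 gets 50 x 50/300 = 8.3 MSU.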
This is an extension in z/OS V1R8 of defined capacity (soft capping), by adding more
flexibility. Instead of soft capping each LP individually, the installation caps a group of LPARs
containing shared logical CPUs.
In this example, at 07:00 p.m. when the systems are IPLed, all three partitions are started. In
the beginning, only partition LPAR1 and LPAR2 use approximately 60 MSU. No work is
running on partition LPAR3. Therefore its measured consumption is very small.
WLM begins to cap the partitions when the 4-hour rolling average for the combined usage
exceeds the 50 MSU limit. This happens around 09:00 p.m. At that point, LPAR1 is reduced
to about 30 MSU and LPAR2 is reduced to about 20 MSU. LPAR3 still does not demand
much CPU. Therefore, the available MSU of the group can be consumed by LPAR1 and
LPAR2.
Around 11:00 p.m., work is started on LPAR3. A small spike can be observed when WLM recognizes that the third partition starts to demand its share of the group. After that spike, LPAR3 gets up to 25 MSU of the group because its weight is half of the group weight; LPAR1 is reduced to 16.7 MSU and LPAR2 to 8.3 MSU. Based on variation in the workload, the actual consumption of the partitions can vary, but the group limit of 50 MSU is always met on average.
Discretionary capping
Certain types of work, when overachieving their goals, potentially will have their resources
“capped” to give discretionary work a better chance to run. Figure 4-22 lists discretionary
capping rules.
Specifically, work that is not part of a resource group and has one of the following types of
goals will be eligible for this resource donation:
A velocity goal of 30 or less
A response time goal of over one minute
Note: Work that is eligible for resource donation is work that has been significantly
overachieving its goals. If you have eligible work that must overachieve its goals to provide
the required level of service, then adjust the goals to more accurately reflect the work's true
requirements.
This function is not about capping dispatchable units running in the discretionary type of goal
service class. It is about capping other dispatchable units to allow discretionary work to run at
all (that is, to obtain some CPU cycles) in a busy system.
This is implemented by internally capping (there are no external parameters) the happy service class periods that have a dispatch priority higher than the discretionary ones. Figure 4-22 on page 165 lists the rules used to apply such capping. Notice that these rules are quite drastic, making such capping a rather rare event.
Note that all of the conditions must be met, meaning that WLM does not cap managed
transactions arbitrarily.
Figure: Capacity provisioning infrastructure. The Capacity Provisioning Manager (CPM) applies a provisioning policy; it communicates with the z/OS systems through CIM servers (drawing on WLM and on RMF data through the RMF Distributed Data Server), with the HMC and SE of the System z10 EC through SNMP over Ethernet, with the z/OS consoles, and with the Capacity Provisioning Control Center (CPCC).
Background
Unexpected workload spikes may exceed available capacity such that Service Level
Agreements cannot be met. Although business need may not justify a permanent upgrade, it
might well justify a temporary upgrade.
z10 EC provides an improved and integrated On/Off CoD and CBU concept:
Faster activation due to autonomic management and improved robustness
The Capacity Provisioning Control Center (CPCC) and the z/OS console do not share a common command set, and the CPCC supports only a subset of the commands. The Capacity Provisioning Policy (CPP) is created and managed by the Capacity Provisioning Control Center software component, which resides on a Microsoft® Windows® workstation and provides the system programmer front end for administering a capacity provisioning policy. The CPCC is a GUI component. The policies are not managed by WLM, although they are kept in the WLM couple data set. The CPCC is not required for regular CPM operation.
Demand for zAAP processors can only be recognized if at least one zAAP is already online to
the system. Demand for zIIP processors can only be recognized if at least one zIIP is already
online to the system.
The additional physical capacity provided through CPM is distributed through LPAR and the
z/OS systems. In general the additional capacity is available to all LPs, but facilities such as
defined capacity (soft capping) or LPAR capping can be used to control the use of capacity.
CPM operates in four different modes, allowing for different levels of automation:
Manual mode
This is a command-driven mode; there is no CPM policy active.
Analysis mode
In this mode, CPM processes Capacity Provisioning policies and informs the operator when a provisioning or deprovisioning action would be required according to the policy criteria. The operator decides whether to ignore the information or to manually upgrade or downgrade the system using the HMC, the SE, or the available CPM commands.
Confirmation mode
In this mode, CPM processes the policy and proposes the resulting capacity changes, but each change must be confirmed by the operator before it is carried out.
Autonomic mode
This mode is similar to confirmation mode, but the capacity changes are carried out without operator confirmation.
Figure 4-25 Capacity Provisioning policy: a policy consists of rules, each combining a provisioning condition (a time condition plus a workload condition) with a provisioning scope that sets the processor limits.
Figure: WLM and the DB2 buffer pool. WLM dynamically decides the size of the buffer pool; DB2 follows the WLM recommendation for the buffer pool size but chooses the best algorithm to manage the pool: for random access the target is to avoid I/Os, and for sequential access the target is to do the I/Os more efficiently.
An I/O buffer is used during an I/O operation as a target for reads or as a source for writes.
Figure 4-27: Chronology of WLM buffer pool management. The DB2 application tasks record buffer pool delay and hit-ratio data in performance blocks; WLM collects this data through sampling and exits, runs its adjust algorithms, and passes buffer pool size adjustments back to DB2, which performs the I/O to DASD.
To allow DB2 to exploit this facility, specify AUTOSIZE(YES) on the ALTER BUFFERPOOL command; the AUTOSIZE attribute accepts YES or NO, and NO is the default.
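For example, the following DB2 command (the buffer pool name BP1 is illustrative) turns automatic sizing on for one pool:

-ALTER BUFFERPOOL(BP1) AUTOSIZE(YES)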
With AUTOSIZE(YES), DB2 registers the buffer pool (including its size) with WLM, using a specific API. When the buffer pool starts to be populated by DB2 pages, DB2 periodically informs WLM about delays caused by read I/O operations. DB2 also keeps WLM informed about the hit ratios for random I/Os. All this information is kept in performance blocks.
Figure 4-27 illustrates a chronological sequence of the events associated with WLM buffer
pool management.
If WLM determines that a buffer pool is the primary reason for a service class transaction delay causing a PI greater than one, WLM instructs DB2 to increase the size of the buffer pool. It may also compensate for the increase in storage of one pool by reducing the size of another. All changes are made in increments no larger than plus or minus 25%. Only the size of the buffer pool is modified; no other buffer pool characteristics are affected. Any changes made this way to a buffer pool size are maintained across DB2 restarts.
Figure: WLM interfaces on z/OS. Subsystem address spaces and Java applications use WLM APIs (such as IWMARIN0) and enclaves; RMF reports on the results; the active and alternate WLM couple data sets are accessed through XCF; and PR/SM with IRD manages the physical resources.
D XCF,COUPLE,TYPE=WLM
IXC358I 10.42.54 DISPLAY XCF 501
WLM COUPLE DATA SETS
PRIMARY DSN: SYS1.XCF.WLM02
VOLSER: SBOX73 DEVN: 3A3A
FORMAT TOD MAXSYSTEM
05/28/2001 20:26:14 4
ADDITIONAL INFORMATION:
NOT PROVIDED
ALTERNATE DSN: SYS1.XCF.WLM03
VOLSER: SBOX74 DEVN: 3E3A
FORMAT TOD MAXSYSTEM
05/28/2001 20:26:14 4
ADDITIONAL INFORMATION:
NOT PROVIDED
WLM IN USE BY ALL SYSTEMS
DISPLAY WLM,SYSTEMS
D WLM
IWM025I 10.53.35 WLM DISPLAY 516
ACTIVE WORKLOAD MANAGEMENT SERVICE POLICY NAME: WLMPOL
ACTIVATED: 2005/02/15 AT: 06:49:24 BY: VAINI FROM: SC65
DESCRIPTION: WLM Service Policy
RELATED SERVICE DEFINITION NAME: Sampdef
INSTALLED: 2005/02/15 AT: 06:45:45 BY: VAINI FROM: SC65
WLM VERSION LEVEL: LEVEL013
WLM FUNCTIONALITY LEVEL: LEVEL013
WLM CDS FORMAT LEVEL: FORMAT 3
STRUCTURE SYSZWLM_WORKUNIT STATUS: CONNECTED
STRUCTURE SYSZWLM_6A3A2084 STATUS: CONNECTED
The MODIFY WLM command is a local z/OS command that can be used to change the state of a
scheduling environment resource.
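The resulting states can be verified with the DISPLAY WLM command, for example (DB2LATE and DB2A are the names used earlier in this chapter):

D WLM,SCHENV=DB2LATE
D WLM,RESOURCE=DB2A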
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
For information about ordering these publications, see “How to get Redbooks” on page 183.
Note that some of the documents referenced here may be available in softcopy only.
System Programmer’s Guide to: Workload Manager, SG24-6472
z/OS Intelligent Resource Director, SG24-5952
Other publications
These publications are also relevant as further information sources:
z/OS MVS Planning: Workload Management, SA22-7602
z/OS MVS Programming: Workload Management Services, SA22-7619
Workload Manager
WLM policy
WLM goal management, WLM functions
WLM ISPF application

Installations today process different types of work with different response times. Every installation wants to make the best use of its resources, maintain the highest possible throughput, and achieve the best possible system responsiveness. You can realize such results by using workload management. This IBM Redbooks publication introduces you to the concepts of workload management utilizing Workload Manager (WLM). Workload Manager allows you to define performance goals and assign a business importance to each goal. You define the goals for work in business terms, and the system decides how much resource, such as CPU and storage, should be given to the work. The system matches resources to the work to meet those goals, and constantly monitors and adapts processing to meet the goals. This reporting reflects how well the system is doing compared to its goals, because installations need to know whether performance goals are being achieved as well as what they are accomplishing in the form of performance goals.

The ABCs of z/OS System Programming is a thirteen-volume collection that provides an introduction to the z/OS operating system and the hardware architecture. Whether you are a beginner or an experienced system programmer, the ABCs collection provides the information that you need to start your research into z/OS and related subjects. If you would like to become more familiar with z/OS in your current environment, or if you are evaluating platforms to consolidate your e-business applications, the ABCs collection will serve as a powerful technical tool.

For more information: ibm.com/redbooks