VSAM06
VSE/VSAM
World Alliance of VM, VSE and Linux Chattanooga, Tennessee April, 2006
04/09/06 10:08 PM
Chattanooga Tennessee
VSAM-06.PRZ
2006
Abstract:
In this presentation, we give an overview of the VSE/VSAM Subsystem, its features and how to exploit them for best overall system performance. We intend to highlight newer features and those older features that are often overlooked and which our experience shows give the most "bang for the buck." We try to highlight key "gotchas" in the performance area.
This presentation and its materials are copyrighted materials, developed by Dan Janda, The Swami of VSAM. (c) 2003-2006 by Dan Janda. Permission is granted to WAVV to reproduce this presentation for distribution to its members at no charge.
Trademarks: IBM, VSE, VSE/ESA, ESA, CICS and DL/I are trademarks or registered trademarks of the IBM Corporation. The Swami of VSAM is a trademark of Dan Janda.
VSE/VSAM Overview
Virtual Storage Access Method
- For Direct Access Storage Devices (disk devices)
- General Purpose
- File Organizations
  - Sequential -- records stored in the sequence presented
  - Direct -- records stored in arbitrary sequence
  - Indexed -- records stored via keys and an index
- File Access Techniques
  - Sequential -- processing in the order stored
  - Direct -- processing in an arbitrary order
  - Keyed -- processing in key order
VSE/VSAM Overview
Virtual Storage Access Method Functional Areas
- Catalog
  - Volume, File information
  - Usage statistics
- DASD Space Management
  - Space allocation, including secondary allocations
  - VSAM and non-VSAM files (VSAM managed SAM, etc.)
  - System Files
  - Libraries
- Performance
  - Large data transfer sizes for sequential processing
  - Buffer look-aside for direct processing
- Integrity
  - Backup/Restore
  - Intra- and Inter-System Sharing controls
VSE/VSAM Overview
Virtual Storage Access Method Data Organizations
- Key Sequenced Data Set (KSDS)
  - Indexed, stored logically in key sequence
- Entry Sequenced Data Set (ESDS)
  - Non-indexed, stored in order inserted, new records at end of file
- Relative Record Data Set (RRDS)
  - Non-indexed, stored in relative record order, fixed length records
- Variable Length Relative Record Data Set (VRDS)
  - Indexed, stored in relative record order, but variable length records
- Alternate Index Data Set (AIX)
  - A form of KSDS used as a "finder file"
  - Unique or non-unique keys are supported
VSE/VSAM Overview
Virtual Storage Access Method Access Techniques
- Sequential Access
  - Sequential, forward or backward
- Keyed Access
  - Sequential, forward or backward
  - Skip Sequential, forward or backward
  - Direct, by full or partial (generic) key
- Addressed Access
  - Sequential, forward or backward
  - Skip Sequential, forward or backward
  - Direct, by record address
- Alternate Index Access
  - Same as for keyed access, above
  - Add direct access by non-unique key
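To make "direct, by partial (generic) key" concrete, here is a minimal Python sketch that models a key-sequenced file as a sorted list of keys and positions on the first key beginning with a given prefix. The function name and the sample keys are illustrative assumptions; this is only an analogy for the idea and does not show how VSAM itself implements generic-key positioning.

```python
import bisect

def read_generic(sorted_keys, partial_key):
    """Return the first key that starts with partial_key, or None.

    sorted_keys stands in for a key-sequenced file (hypothetical model).
    """
    # Position on the first key that is >= the partial key.
    i = bisect.bisect_left(sorted_keys, partial_key)
    if i < len(sorted_keys) and sorted_keys[i].startswith(partial_key):
        return sorted_keys[i]
    return None

keys = ["ADAMS", "BAKER", "BROWN", "CLARK"]
first_br = read_generic(keys, "BR")   # generic key "BR"
no_match = read_generic(keys, "D")    # nothing starts with "D"
print(first_br, no_match)
```

From the position found, keyed sequential processing could simply continue forward through the list, which is the usual reason for positioning by generic key.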
VSE/VSAM Overview
- Logically stores records on disk
  - Capabilities enabled by these techniques
  - Performance aspects
- Physically stores records on disk
  - Disk space usage calculations
  - Performance aspects
VSE/VSAM Jargon
Control Interval (CI)
- The smallest unit of data transfer between main and disk storage
- One or more logical records are loaded into a CI

CI layout: | Rec 01 | Rec 02 | Rec 03 | Rec ... | Freespace | RDF(s) | CIDF |

- Rec 01--Rec nn: 1 to n logical records of any length
- Freespace: unused space within the CI, available for additional record insertion or an increase in length of existing records
- RDF(s): 3-byte VSAM Record Descriptor Field
  - for ESDS/KSDS, one per record length, one for all consecutive records of the same length
  - for RRDS, one per numbered record slot
- CIDF: 4-byte Control Interval Descriptor Field
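The descriptor sizes above (3-byte RDF, 4-byte CIDF) lend themselves to a quick space estimate. The Python sketch below estimates how many fixed-length RRDS slots fit in one CI, charging each slot its record length plus one RDF and reserving one CIDF per CI. The function name and the 4096/100 example sizes are assumptions for illustration; real VSAM space calculations may apply further rules not modeled here.

```python
CIDF_BYTES = 4  # Control Interval Descriptor Field, one per CI
RDF_BYTES = 3   # Record Descriptor Field, one per RRDS slot

def rrds_slots_per_ci(ci_size, record_length):
    """Estimate the number of fixed-length RRDS slots in one CI."""
    if ci_size <= CIDF_BYTES or record_length <= 0:
        raise ValueError("CI must exceed the CIDF and records must be non-empty")
    usable = ci_size - CIDF_BYTES
    return usable // (record_length + RDF_BYTES)

# Example sizes are hypothetical: a 4096-byte CI and 100-byte records.
slots = rrds_slots_per_ci(4096, 100)
unused = 4096 - CIDF_BYTES - slots * (100 + RDF_BYTES)
print(slots, unused)
```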
VSE/VSAM Jargon
Control Area (CA)
- A group of CIs.
- In an indexed file, all the data CIs in a CA are mapped by a single index CI.

CI 00  CI 10  CI 20  CI 30
CI 01  CI 11  CI 21  CI 31
CI 02  CI 12  CI 22  CI 32
CI 03  CI 13  CI 23  CI 33
CI 04  CI 14  CI 24  CI 34
CI 05  CI 15  CI 25  CI 35
CI 06  CI 16  CI 26  CI 36
CI 07  CI 17  CI 27  CI 37
CI 08  CI 18  CI 28  CI 38
CI 09  CI 19  CI 29  CI 39

The size of a CA is the smallest among:
- one cylinder (or "max-CA") on the device
- the size of the primary allocation amount
- the size of the secondary allocation amount
The number of CIs per CA depends on the device characteristics (track size, number of tracks per cylinder) and the CI and CA sizes.
It is generally beneficial to have as large a CA size as possible.
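The "smallest among" rule is easy to check with a few lines of Python. This sketch works in tracks and assumes a 3390-style device with 15 tracks per cylinder. The figure of 12 CIs per track is an assumed value for illustration (roughly what 4K CIs might give), and the function names are hypothetical.

```python
def ca_size_in_tracks(tracks_per_cylinder, primary_tracks, secondary_tracks):
    """CA size is the smallest of one cylinder, primary and secondary allocation."""
    return min(tracks_per_cylinder, primary_tracks, secondary_tracks)

def cis_per_ca(ca_tracks, cis_per_track):
    """Number of CIs in one CA for a given (assumed) CI density per track."""
    return int(ca_tracks * cis_per_track)

TRACKS_PER_CYL = 15   # 3390 geometry
CIS_PER_TRACK = 12    # assumption for illustration only

# A small secondary allocation (5 tracks) shrinks the CA to 5 tracks.
small_ca = ca_size_in_tracks(TRACKS_PER_CYL, 30, 5)
# Allocating at least a cylinder for both amounts gives a full-cylinder CA.
large_ca = ca_size_in_tracks(TRACKS_PER_CYL, 30, 15)
print(small_ca, cis_per_ca(small_ca, CIS_PER_TRACK))
print(large_ca, cis_per_ca(large_ca, CIS_PER_TRACK))
```

The comparison shows why a small secondary allocation works against the recommendation to keep the CA as large as possible.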
VSE/VSAM Jargon
Index Control Interval (Index CI)
- A CI in an index, containing pointer entries to
  - The next level in the index, or
  - The data CI within the CA (if this is a Sequence Set CI)

CI 00  CI 10  CI 20  CI 30
CI 01  CI 11  CI 21  CI 31
CI 02  CI 12  CI 22  CI 32
CI 03  CI 13  CI 23  CI 33
CI 04  CI 14  CI 24  CI 34
CI 05  CI 15  CI 25  CI 35
CI 06  CI 16  CI 26  CI 36
CI 07  CI 17  CI 27  CI 37
CI 08  CI 18  CI 28  CI 38
CI 09  CI 19  CI 29  CI 39
VSE/VSAM Jargon
- There is always exactly one high-level index CI
- There may be no to many intermediate-level index CIs
- There may be one or more low-level (sequence set) index CIs.
  - If only one sequence set CI, it is also the high-level index CI.

(Diagram: index CIs at the top, pointing down to the data CIs of several control areas.)
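The index-level rules above can be illustrated with a small calculation. This Python sketch assumes one sequence set CI per CA (as stated for indexed files earlier) and a fixed number of entries per index CI. That fan-out value is a hypothetical parameter, since the real figure depends on index CI size and key compression.

```python
import math

def index_cis_per_level(data_cas, entries_per_index_ci):
    """Index CI counts per level, from the sequence set up to the top.

    Assumes one sequence set CI per CA and a fixed fan-out (illustrative only).
    """
    if data_cas < 1 or entries_per_index_ci < 2:
        raise ValueError("need at least one CA and a fan-out of two or more")
    counts = [data_cas]
    # Keep adding levels until a single high-level index CI remains.
    while counts[-1] > 1:
        counts.append(math.ceil(counts[-1] / entries_per_index_ci))
    return counts

one_ca = index_cis_per_level(1, 100)        # sequence set CI is also the top
big_file = index_cis_per_level(5000, 100)   # sequence set, intermediate, top
print(one_ca, big_file, len(big_file))
```

The length of the returned list is the number of index levels, which is the quantity the buffering recommendations later in the deck refer to.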
Performance Tips
Especially if file is processed sequentially
VSAM Catalogs
Master Catalog
- Required
- Assigned at IPL by DEF CAT command, otherwise by DEFINE MCAT
User Catalogs
- As desired
- But, at most one catalog defined on a volume
- Multiple catalogs can own space on a volume
VSAM Catalogs
Catalog contents
- Self describing records
- User catalog pointers
- Volume definitions
- Space definitions
- Cluster definitions
- Component (Data, Index) definitions
- Alternate Index and Path definitions
VSAM Catalogs
Recommendations
- Use naming conventions
  - Name Cluster, Data and Index components explicitly
  - Use partition and system independent names when applicable
- Separate
  - Static files (seldom defined or deleted)
  - Dynamic files (frequently defined or deleted)
  - On-line critical files
  - Batch files
- Multiple baskets -- all the eggs won't be broken
VSAM Catalogs
Recommendations -- continued
Avoid "One-Way Safety"
- Don't use RECOVERABLE catalogs
  - If catalog is damaged, file can be opened with CRA
  - If CRA is damaged, file can't be opened
  - If catalog is restored, CRA can uplevel catalog
  - If CRA is restored, catalog cannot uplevel CRA
- Recoverable catalogs were designed in days of small, removable media with (relatively) frequent failures
- BACKUP is a MUCH better idea
CI and CA Splits
A CI Split occurs when there is not enough freespace in a CI to hold the record being inserted.
CI Split processing -- four writes:
1. Set "Split in Progress", write CI
2. Move half of records to new CI in buffer, write new CI
3. Update sequence set with new key and pointer, write index CI
4. Erase moved records from old CI in buffer, turn off "Split in Progress", write old CI
Failures:
- System fails during split -- system corrected at next update access to CI
- No free CI in the CA -- a CA split becomes necessary
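The four-write CI split sequence described above can be sketched in Python as follows. This is a simplified in-memory model, not VSAM's implementation: a CI is a sorted list of keys, the write log is just a list of labels, and the function name is hypothetical. How VSAM folds the new record itself into these writes is not modeled.

```python
import bisect

def split_ci(records, new_key):
    """Split a full CI (sorted list of keys) and place new_key.

    Returns (old_ci, new_ci, writes), where writes logs the four
    physical writes in the order the slide lists them.
    """
    writes = ["old CI, split-in-progress on"]                     # write 1
    half = len(records) // 2
    kept, moved = records[:half], records[half:]
    writes.append("new CI holding the moved records")             # write 2
    writes.append("sequence set CI with new key and pointer")     # write 3
    writes.append("old CI, moved records erased, flag off")       # write 4
    # Place the new record in whichever CI now covers its key.
    target = kept if kept and new_key <= kept[-1] else moved
    bisect.insort(target, new_key)
    return kept, moved, writes

old_ci, new_ci, log = split_ci([10, 20, 30, 40], 25)
print(old_ci, new_ci, len(log))
```

Each half is left with free space after the split, which is the property the later recommendations on splits rely on.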
CI and CA Splits
A CA Split occurs when there is not enough freespace in a CA to hold the CI being split.
CA Split processing -- MANY writes:
1. Set "Split in Progress", write sequence set CI
2. Format new CA at High Used RBA position in file
3. Move half of CIs to new CA, read and write each CI moved
4. Write new sequence set CI for new CA
5. Update higher level index CIs as needed (bottom up)
6. Erase moved CIs from old CA, write clear CIs
7. Write updated original sequence set CI
Failures:
- System fails during split -- system corrected at next update access to CI
- No free CI in the CA -- a CA split becomes necessary
CI and CA Splits
Recommendations
- Don't worry about CI splits
  - They're inexpensive in time and space
- Avoid excess CA splits by defining CA Freespace (free CIs in each CA)
  - They can be very expensive in time and space
- Don't use occurrence of some number of CI or CA splits as a trigger to cause reorganization
  - Better understand the insert strategy VSAM uses
  - Most inserts tend to be somewhat clustered
  - A CI or CA split creates additional freespace exactly where it is most likely to be needed
  - Reorganization will squeeze this freespace out and require more splits in the future, in most cases.
CI and CA Splits
Recommendations
- Avoid too-frequent reorganization
  - Reorganization will squeeze out the free space previous CI and CA splits have inserted
  - If there are more inserts expected in the same area of the file, there will be more splits
  - Once a split has occurred, the processing cost of the split has been paid
- You have to understand the insert processing
  - one "hot spot" -- little distributed free space, let splits handle
  - several or many "hot spots" -- little distributed free space, let splits handle
  - fairly evenly distributed, with no "hot spots" -- exploit distributed free space
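The insert-pattern advice above can be explored with a toy simulation. The Python sketch below models a file as a list of CIs (sorted key lists) with a fixed capacity, loads it with or without free slots, and counts CI splits for a single "hot spot" versus evenly spread inserts. Everything here (capacity of 10 records, key values, function names) is an assumption chosen for illustration. CA splits and real VSAM split rules are not modeled.

```python
import bisect

def load(keys, ci_capacity, free_slots):
    """Load sorted keys into CIs, leaving free_slots unused in each CI."""
    per_ci = ci_capacity - free_slots
    return [list(keys[i:i + per_ci]) for i in range(0, len(keys), per_ci)]

def insert_all(cis, new_keys, ci_capacity):
    """Insert keys one at a time and return the number of CI splits."""
    splits = 0
    for key in new_keys:
        lows = [ci[0] for ci in cis]
        pos = max(bisect.bisect_right(lows, key) - 1, 0)
        ci = cis[pos]
        if len(ci) >= ci_capacity:           # no room: split the CI in half
            half = len(ci) // 2
            cis[pos:pos + 1] = [ci[:half], ci[half:]]
            splits += 1
            if key >= cis[pos + 1][0]:
                pos += 1
            ci = cis[pos]
        bisect.insort(ci, key)
    return splits

base = list(range(0, 100000, 1000))          # 100 existing keys
hot = list(range(50001, 50021))              # 20 inserts in one spot
even = list(range(1, 100000, 5000))          # 20 inserts spread evenly

results = {}
for name, inserts in (("hot", hot), ("even", even)):
    for free in (0, 2):
        results[(name, free)] = insert_all(load(base, 10, free), inserts, 10)
print(results)
```

In this toy setup the hot-spot case splits equally often with or without distributed free space, while the evenly spread case avoids splits only when free space was left in every CI, which matches the direction of the recommendation.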
Recommendations:
- Non-Shared Resources (NSR)
  - Each string must have adequate index buffers
  - Requirements PER STRING...
    - Unacceptable -- one buffer (old default)
    - Acceptable -- one buffer per index level (new default)
    - Good -- enough buffers to hold all high level index plus one
    - Best -- enough buffers to hold entire index
- Local Shared Resources (LSR)
  - The pool must have adequate index buffers
  - See above -- Requirements PER STRING becomes IN POOL
  - Monitor VSAM LSR statistics to ensure sufficient buffers are provided in pool to get high probability of finding desired record in the pool (high hit ratio)
  - Data buffers monitored for high hit ratios as well
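The index buffer tiers above translate into simple arithmetic once the number of index CIs per level is known. The Python sketch below takes a hypothetical list of index CI counts (sequence set first, top level last) and derives the "acceptable", "good" and "best" counts. It reads "all high level index plus one" as every index CI above the sequence set plus one buffer for a sequence set CI, which is an interpretation rather than something the slide spells out. The hit ratio helper uses the usual hits / (hits + misses) definition.

```python
def index_buffer_targets(cis_per_level):
    """Index buffer counts for the tiers on the slide.

    cis_per_level: index CI counts, sequence set first, top level last
    (hypothetical input, e.g. derived from LISTCAT output).
    """
    above_sequence_set = sum(cis_per_level[1:])
    return {
        "acceptable": len(cis_per_level),   # one buffer per index level
        "good": above_sequence_set + 1,     # high-level index plus one
        "best": sum(cis_per_level),         # the entire index
    }

def hit_ratio(hits, misses):
    """Fraction of buffer requests satisfied without a physical read."""
    total = hits + misses
    return hits / total if total else 0.0

targets = index_buffer_targets([5000, 50, 1])
ratio = hit_ratio(900, 100)   # made-up counts for illustration
print(targets, ratio)
```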
Recommendations:
- Non-Shared Resources (NSR)
  - Chained I/O strategy used to "read ahead" and "write behind"
  - Better to read multiple CIs in one I/O than to use smaller I/O chains in an overlapping fashion
  - Block big -- Large CI sizes
    - Be aware of VSAM splitting CIs into physical blocks to save space
    - e.g. 3390 disk, 32K CI size: VSAM will write each CI as two 16K blocks, 1-1/2 CIs, 48K data per track.
  - Buffer big -- Allow from 1/2 to a full cylinder of buffer space to minimize I/O time
- Local Shared Resources (LSR)
  - In LSR, VSAM reads only a single CI at a time
  - No chained I/O benefits even for sequential processing
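The 3390 example above (32K CIs written as 16K blocks, 48K of data per track) can be worked through in Python, along with the "half to a full cylinder" buffer guideline. The value of three 16K blocks per track follows from the 48K figure on the slide, and 15 tracks per cylinder is the 3390 geometry. The function names are hypothetical and the sketch ignores any other buffer overhead.

```python
def track_usage(ci_size, block_size, blocks_per_track):
    """Return (data bytes per track, CIs per track) for a blocking choice."""
    bytes_per_track = block_size * blocks_per_track
    return bytes_per_track, bytes_per_track / ci_size

def buffers_for_cylinder_fraction(fraction, tracks_per_cylinder,
                                  bytes_per_track, ci_size):
    """Whole CI-sized buffers that fit in a fraction of a cylinder."""
    return int(fraction * tracks_per_cylinder * bytes_per_track) // ci_size

K = 1024
per_track, cis_on_track = track_usage(32 * K, 16 * K, 3)
half_cyl = buffers_for_cylinder_fraction(0.5, 15, per_track, 32 * K)
full_cyl = buffers_for_cylinder_fraction(1.0, 15, per_track, 32 * K)
print(per_track // K, cis_on_track, half_cyl, full_cyl)
```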
Recommendations:
- Monitor I/O and buffering
  - LISTCAT statistics (before and after a critical job step)
    - Shows data and index EXCPs
    - EXCP -- EXecute Channel Program -- a physical I/O operation
  - Job Accounting data
    - Shows I/O counts by physical device
    - Useful if files are well distributed across devices
    - Still shows overall I/O and CPU activity
  - CICS Shutdown (and Requested) Statistics
    - Shows Logical and physical I/O counts by file
    - LSR Buffer Pool "hits" and "misses"
  - VSAM buffer statistics information is available
    - Sample subroutine in the VSE/ESA Examples documentation
- Think big -- LSR buffers are in 31-bit storage
  - More are generally better, but don't start paging
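The "before and after a critical job step" comparison amounts to subtracting two snapshots of the EXCP counters. A minimal Python sketch, assuming the counts have already been copied by hand from two LISTCAT listings into dictionaries, is shown below. The field names and numbers are made up for illustration and do not reflect actual LISTCAT output formatting.

```python
def excp_delta(before, after):
    """Physical I/Os performed between two counter snapshots."""
    return {name: after[name] - before[name] for name in before}

# Hypothetical EXCP counts transcribed from LISTCAT before and after a job step.
before = {"data_excps": 120000, "index_excps": 4000}
after = {"data_excps": 185000, "index_excps": 4600}

delta = excp_delta(before, after)
print(delta)
```

Comparing the delta for the same job step across runs with different buffer settings gives a direct view of whether a change reduced physical I/O.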
VSAM permits sharing files among partitions in a VSE system
VSAM permits sharing files among VSE systems
But:
- TANSTAAFL -- "There Ain't No Such Thing As A Free Lunch" (Robert Heinlein)
- Sharing is not a performance option (The Swami)
Sharing is based on
- VSE Lock Table within a single VSE system
- VSE Lock File when sharing across VSE systems
Recommendations
- Don't use AIX instead of SORT when processing all records in base cluster
  - Remember base cluster will be processed directly, based on alternate key values
- Base cluster will need additional index buffering for batch
  - ONLY WAY to do this is to specify much larger value for the Base Cluster's BUFFERSPACE when it is defined
  - Make it large enough to hold all the base cluster's index records if possible -- all the high-level index records if not.
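The reason an AIX is a poor substitute for SORT when reading every record can be shown with a toy model. The Python sketch below counts data CI reads for a base cluster when records are visited in base-key order versus alternate-key order, assuming a single data buffer (a read happens whenever the CI changes). The file size, records per CI, and the arithmetic used to scatter the alternate keys are all assumptions for illustration. Index I/O is not modeled.

```python
def data_ci_reads(access_order, records_per_ci):
    """Count CI reads with one data buffer: a read occurs when the CI changes."""
    reads, current = 0, None
    for position in access_order:
        ci = position // records_per_ci
        if ci != current:
            reads += 1
            current = ci
    return reads

RECORDS, PER_CI = 1000, 10

# Base-key order: records are visited in the order they are stored.
base_order = list(range(RECORDS))
# Alternate-key order: a hypothetical alternate key that scatters the
# records (387 shares no factor with 1000, so every key is distinct).
aix_order = sorted(range(RECORDS), key=lambda p: (p * 387) % RECORDS)

sequential_reads = data_ci_reads(base_order, PER_CI)
aix_reads = data_ci_reads(aix_order, PER_CI)
print(sequential_reads, aix_reads)
```

In this model the alternate-key pass touches a different CI on nearly every record, whereas sorting the data first would let it be read sequentially.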
Dan Janda, The Swami of VSAM, is building a web site about VSE/VSAM issues: http://business.epix.net/~theswami
His knowledge and experience can help you, too!