SAS Catalog
Paper 051-29
Using SAS® Catalogs to Develop and Manage SAS® DATA Step Programs
David D. Chapman, US Census Bureau, Washington, DC
ABSTRACT
A key concept in program management is to develop the code in one place, make changes at one location, move the code to where it is
needed for testing or use, and when making changes make sure they work before changing production code. SAS catalogs are ideal for
developing, testing, and managing data step programs. This paper reviews SAS catalogs and the different kinds of elements they contain. It
discusses Source, Log, Output, Macro, Format, and CATAMS elements. For each it gives an example of how to put it into a catalog and how
to retrieve and use it from the catalog. It also discusses how to manage and sequence code coming out of the catalog to make up your data
step and procedure programs. The use of PROC CATALOG to view catalog contents and copy elements to another location, and PROC
UPLOAD and PROC DOWNLOAD to move the SAS catalogs to a production location on a different computer are discussed. Management
issues on how to use PROC CONTENTS and PROC UPLOAD and PROC DOWNLOAD to move the SAS catalogs to a production location
on either the same or a different computer are shown. The use of the Source Control Manager (SCM) to manage data step and
procedure code from a SAS catalog is also discussed.
SAS CATALOGS
SAS catalogs are special SAS files that store many different kinds of information in smaller units called catalog entries. Each entry has an
entry type that identifies its purpose to the SAS System. A single SAS catalog can contain several different types of catalog entries. Some
catalog entries contain system information such as definitions. Other catalog entries contain application information such as window
definitions, help windows, formats, informats, macros, or graphics output. You can list the contents of a catalog using various SAS
features, such as SAS Explorer and PROC CATALOG.
SAS catalog entries are fully identified by a four-level name of the form: “libref.catalog.entry-name.entry-type”. An example is
“SUGI29.PAPER151.CONTROL.SOURCE”. You commonly specify the two level name for an entire catalog, as follows: libref.catalog (e.g.
SUGI29.PAPER110). “libref” is the logical name of the SAS data library in which the catalog is stored. “Catalog” is a valid SAS name for the
file. The entry name and entry type are required by some SAS procedures. “Entry-name” is a valid SAS name for the catalog entry; “entry-
type” is controlled and assigned by SAS when the entry is created. When using a catalog entry, if the entry type has been specified
elsewhere or if it can be determined from context, you can use the entry name alone.
CATALOG ELEMENTS
A SAS catalog is a special type of SAS file that contains elements. Elements can be of several different types. All elements in a catalog can
be the same type, or each element can be a different type. A catalog can contain all the
code and information necessary to execute a data step program. There are many different types of catalog elements. The elements most
commonly used with a data step program are SOURCE, FORMAT,
MACRO (not a special catalog entry), OUTPUT, LOG, and CATAMS. The use of these elements is given in later sections.
SOURCE
This is probably the most common element in use. It is the equivalent of a SAS program file (e.g. program1.sas) or ASCII file. These elements
are used to hold basic data step or procedure code. There can be a different element for each data step and proc, or a data step can be
divided between several different elements. The source element is similar to a text file used to hold SAS code, original macro statements, or
other general purpose statements such as options. Source code is added to a catalog either interactively or in batch. When it is created in
batch, a data _null_ step is often used. The excerpt from the LOG file below illustrates how to add data step program code to a catalog
and then retrieve and use it.
SUGI 29 Coders' Corner
The key elements are a FILENAME statement that identifies a source entry in a catalog and the use of that fileref with the “FILE”
statement in the data _NULL_ step.
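As a minimal sketch of the batch approach just described, the step below writes two lines of SAS code into a SOURCE entry and then retrieves and submits them. The library path and the entry name srccode1 are assumptions made for illustration; the catalog name sugi29.paper151 follows the names used in this paper.

```sas
/* Hypothetical library location; point it at an existing directory */
libname sugi29 'C:\sugi29';

/* Fileref that identifies one SOURCE entry in the catalog */
filename srccode1 catalog 'sugi29.paper151.srccode1.source';

/* Write the program text into the catalog entry */
data _null_;
   file srccode1;
   put 'proc datasets library=work;';
   put 'quit;';
run;

/* Retrieve the entry and submit it; SOURCE2 echoes the code to the log */
%include srccode1 / source2;
```

Writing to the entry creates the catalog if it does not already exist, so no separate setup step should be needed.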
FORMAT
User-defined formats are commonly used with data step programs. They can be created, stored in a catalog, and retrieved from
the catalog for later use with data steps or procedures. User-defined formats are created using PROC FORMAT. These formats are used to
recode variables, as the basis for classifications that are part of tabulated totals using PROC TABULATE or PROC REPORT, or to display data
in output.
PROC FORMAT is used to create the formats and store them in the catalog. The code to create a user-defined format and store it in a catalog
is illustrated in the LOG file below. The code creates a format named “$REGION” and stores it in the catalog “sugi29.paper151”.
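A sketch of such a PROC FORMAT step might look like the following. The library path is hypothetical, and the format assumes that division codes are stored as one-character values '1' through '9'; the paper does not show how the codes are held, so treat that as an assumption. The division-to-region grouping follows the standard Census Bureau definitions.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* LIBRARY= names the catalog that receives the format entry */
proc format library=sugi29.paper151;
   value $region
      '1', '2'      = '1'    /* Northeast */
      '3', '4'      = '2'    /* Midwest   */
      '5', '6', '7' = '3'    /* South     */
      '8', '9'      = '4'    /* West      */
      other         = ' ';
run;
```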
This format, which assigns a census region code based on the census division code, is made available from the catalog “paper151” by
setting a SAS system option. Once a format has been stored in a catalog, it can be accessed by specifying the system option FMTSEARCH=().
If you physically move the catalog, you must either change the libref or redefine the libname to reflect the catalog's new
location.
SAS searches for formats in a specific order. It first searches the catalog WORK.FORMATS, then the LIBRARY.FORMATS catalog, and
finally the “SUGI29.PAPER151” catalog.
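The search order above can be sketched as follows. This assumes a $REGION format has already been stored in sugi29.paper151; the library path and the small input data set are hypothetical.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Add the catalog to the format search path; WORK.FORMATS and  */
/* LIBRARY.FORMATS are still searched first by default          */
options fmtsearch=(sugi29.paper151);

/* Hypothetical input: one character division code per observation */
data divisions;
   input division $1.;
   region = put(division, $region.);
   datalines;
1
5
9
;
run;

proc print data=divisions;
run;
```

If the catalog is later moved, only the LIBNAME statement should need to change, because FMTSEARCH refers to the libref rather than to the physical path.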
The code above puts the LOG statements associated with executing the PROC CATALOG statement into the element
program of type LOG and puts the output of the CONTENTS procedure into the element program of type OUTPUT. The “NEW” option
creates a new element each time it is run. If the “NEW” option is not used, data is appended to whatever is in the element.
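One way to route a log and procedure output into catalog elements is PROC PRINTTO, sketched below under stated assumptions: the library path is hypothetical, and PROC CONTENTS on SASHELP.CLASS is only a stand-in for whatever step is being captured. This is an illustration of the technique, not necessarily the code the paper originally showed.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Route the log and the procedure output to catalog entries.       */
/* NEW replaces the existing contents instead of appending to them. */
proc printto log=sugi29.paper151.program.log
             print=sugi29.paper151.program.output
             new;
run;

/* Any step can go here; this one is only a stand-in */
proc contents data=sashelp.class;
run;

/* Restore the default log and output destinations */
proc printto;
run;
```

PRINT= receives listing output, so this sketch assumes the LISTING destination is open.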
CATAMS
A CATAMS element is an element for holding data. This data can be retrieved from the catalog to create a SAS data set or view. In theory,
the size of the data set created is limited only by the size of the catalog that can be created. In practice, data sets created from CATAMS
elements are often used to hold program parameters. One application used a CATAMS element to hold IP addresses of different computer
systems.
The LOG file below reads the SAS data set “sashelp.company” and writes the data to a CATAMS catalog element.
Once data is entered into a CATAMS element, it can either be used to create a SAS data set or be used directly from the catalog as a SAS
Data View. The LOG file containing the code that reads and prints the data contained in the CATAMS element created above is given below.
NOTE: There were 48 observations read from the data set WORK.COMPANY_VIEW.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.20 seconds
cpu time 0.10 seconds
CATAMS elements are excellent tools to save small data sets associated with a SAS application.
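As a separate, minimal sketch of the parameter-file use mentioned earlier, the code below stores a few system names and addresses in a CATAMS element and reads them back through a DATA step view. The entry name hosts, the system names, the addresses (taken from a documentation-only address range), and the library path are all hypothetical.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Fileref for a CATAMS entry; HOSTS is a hypothetical entry name */
filename hosts catalog 'sugi29.paper151.hosts.catams';

/* Write one fixed-column record per system into the CATAMS entry */
data _null_;
   input system :$8. ip :$15.;
   file hosts;
   put system $8. +1 ip $15.;
   datalines;
dev 192.0.2.10
test 192.0.2.20
prod 192.0.2.30
;
run;

/* Read the entry back as a DATA step view, so no copy of the data is stored */
data work.hosts_view / view=work.hosts_view;
   infile hosts truncover;
   input system $8. +1 ip $15.;
run;

proc print data=work.hosts_view;
run;
```

Because the view reads the element each time it is used, a change to the element should be picked up by the next step that references the view, provided the fileref is still assigned.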
MACROS
Macros used as autocall macros can be stored in and used from a SAS catalog. The example here creates an autocall macro called “work_ds”
that executes PROC DATASETS. If a macro is stored in a catalog and accessed as an autocall macro, the name of the catalog element
must be the same as the name of the macro. The LOG file below shows how to create the catalog element for the autocall macro.
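A minimal sketch of creating such an element in batch follows. The library path and the fileref macsrc are assumptions; the body of the macro is a guess at a simple PROC DATASETS call and may differ from what the paper's original log contained.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* The entry name WORK_DS must match the macro name */
filename macsrc catalog 'sugi29.paper151.work_ds.source';

data _null_;
   file macsrc;
   /* Single quotes keep the macro statements from executing here */
   put '%macro work_ds;';
   put '   proc datasets library=work;';
   put '   quit;';
   put '%mend work_ds;';
run;

filename macsrc clear;
```

The element is stored as a SOURCE entry because that is the entry type the autocall facility reads from a catalog.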
The LOG below illustrates the code that uses the macro “work_ds” to display files in the work directory. It is important that the fileref
assigned to the catalog containing the macro is the same as the fileref used with the SASAUTOS option.
31 option MAUTOSOURCE ;
32 FILENAME mymac CATALOG 'sugi29.paper151';
33 OPTIONS SASAUTOS=mymac ;
34 %WORK_DS
Any number of macros can be stored in a catalog and accessed from program code. Macros could also be created by storing just the source code and
recalling the code to create the macro. A log file of an example is given below.
92 %MACRO DB;
93 %include SrcCode1 /source2;
94 %mend;
95 %db
NOTE: %INCLUDE (level 1) file SRCCODE1 is file SUGI29.PAPER151.SRCCODE1.SOURCE.
96 +proc datasets ;
Directory
Libref WORK
Engine V9
Physical Name C:\DOCUME~1\DAVIDC~1\LOCALS~1\Temp\SAS Temporary Files\_TD2612
File Name C:\DOCUME~1\DAVIDC~1\LOCALS~1\Temp\SAS Temporary Files\_TD2612
Member File
# Name Type Size Last Modified
1 SASMACR CATALOG 5120 31JAN2004:11:56:10
97 +quit;
NOTE: PROCEDURE DATASETS used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: %INCLUDE (level 1) ending.
98 quit;
Compiled macros can also be stored in and used from a catalog through the stored compiled macro facility (the MSTORED and SASMSTORE=
system options).
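A short sketch of the stored compiled macro facility follows. The library path and the macro name work_ds2 are hypothetical. Note that SAS places stored compiled macros in a catalog named SASMACR in the library given by SASMSTORE=, not in a catalog of your choosing.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Turn on the stored compiled macro facility and name the library */
options mstored sasmstore=sugi29;

/* STORE saves the compiled macro in the catalog SUGI29.SASMACR */
%macro work_ds2 / store des='List the WORK library';
   proc datasets library=work;
   quit;
%mend work_ds2;

/* In a later session, the same OPTIONS statement makes the macro available */
%work_ds2
```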
PROC REPORT allows a user to store the code that describes an output table as a catalog entry. This catalog entry can be
used later with multiple data sets without change. The source code for creating such a report catalog entry and later reusing it is given below.
The code below creates a catalog entry to produce PROC REPORT tables.
The code below retrieves and uses the report definition created above to produce a report using a different file.
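Both steps can be sketched with the OUTREPT= and REPORT= options of PROC REPORT. The entry name classrpt, the library path, and the use of SASHELP.CLASS with a subset of it as the second file are assumptions chosen only so the example is self-contained.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Build the report once and save its definition as a REPT entry */
proc report data=sashelp.class nowd
            outrept=sugi29.paper151.classrpt;
   column sex age height weight;
   define sex    / group 'Sex';
   define age    / analysis mean format=5.1 'Mean Age';
   define height / analysis mean format=6.1 'Mean Height';
   define weight / analysis mean format=6.1 'Mean Weight';
run;

/* A second data set with the same variables (hypothetical subset) */
data work.teens;
   set sashelp.class;
   where age >= 13;
run;

/* Reuse the stored definition; no COLUMN or DEFINE statements are needed */
proc report data=work.teens nowd
            report=sugi29.paper151.classrpt;
run;
```

The stored definition refers to variables by name, so any data set it is reused with needs the same variable names.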
A filename can also be associated with the entire catalog. This makes referencing easier and more logical with the %INCLUDE statement.
The FILENAME statement is usually used in conjunction with the %INCLUDE statement.
%INCLUDE
The %INCLUDE statement inserts code located in a source file or catalog into the program immediately following the statement. The two
most common forms of the statement are for files and catalogs. The code below illustrates how individual files or individual source entries in
a catalog can be referenced with and without the catalog option.
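The forms can be sketched side by side as follows. The library path, the entry name srccode1, and the external file path are assumptions. The entry type is written out in the catalog-wide form rather than relying on a default type.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Create a small SOURCE entry so the example has something to include */
filename src1 catalog 'sugi29.paper151.srccode1.source';
data _null_;
   file src1;
   put 'proc datasets library=work;';
   put 'quit;';
run;

/* Form 1: the fileref names a single catalog entry */
%include src1 / source2;

/* Form 2: the fileref names the whole catalog and the entry goes in */
/* parentheses, with its type written out explicitly                 */
filename master catalog 'sugi29.paper151';
%include master(srccode1.source) / source2;

/* Form 3: an external file, left as a comment because the path is hypothetical */
/* %include 'C:\sugi29\program1.sas' / source2; */
```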
Libref SASUSER
Engine V9
Physical Name C:\Documents and Settings\David Chapman\My Documents\My SAS Files\9.0
File Name C:\Documents and Settings\David Chapman\My Documents\My SAS Files\9.0
Member File
# Name Type Size Last Modified
1 AGENTS DATA 5120 06DEC2002:15:35:13
2 CNTAINER DATA 5120 06DEC2002:15:31:22
{OUTPUT DELETED TO SAVE SPACE}
23 _SCMPLAY DATA 17408 28NOV2003:12:04:02
24 _SCMPREF DATA 17408 28NOV2003:12:03:17
206 +quit;
The FILENAME statement can be used to reference an entire catalog for ease of use and clarity. The fileref is associated with the entire
catalog file instead of a single element of the catalog. This form of the FILENAME statement is particularly helpful if you are retrieving many
different elements from a catalog.
When a fileref refers to the entire catalog, the %INCLUDE can refer to just the source element within the parentheses. The “SOURCE2”
option specifies that the code will be written to the LOG. This is illustrated in the section below on controlling program flow.
This code identifies the catalog and then inserts and submits code from four different catalog elements in the following order: StartUp,
program_a, program_b, and program_merge. This code also can be easily modified to work with MP/CONNECT by adding the required
MP/CONNECT statements in the sequencing program.
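A sequencing program of the kind described might look like the sketch below. The four entry names come from the text; the fileref master matches the fragment that survives in this paper. The make_stub macro, the stub programs it writes, and the library path are hypothetical and exist only so the sketch can run on its own.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */

/* Hypothetical helper: writes a one-line stub program into a SOURCE entry */
%macro make_stub(entry);
   filename stub catalog "sugi29.paper151.&entry..source";
   data _null_;
      file stub;
      put "data _null_; put 'Running &entry'; run;";
   run;
   filename stub clear;
%mend make_stub;

%make_stub(startup)
%make_stub(program_a)
%make_stub(program_b)
%make_stub(program_merge)

/* Identify the catalog once, then submit the entries in the required order */
filename master catalog 'sugi29.paper151';

%include master(startup.source)       / source2;
%include master(program_a.source)     / source2;
%include master(program_b.source)     / source2;
%include master(program_merge.source) / source2;
```

Changing the order of execution, or dropping a step during testing, then means editing only these %INCLUDE lines.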
%INCLUDE MASTER(PROGRAM_B);
ENDRSUBMIT;
Programs may be started either interactively or in batch. The key to this is either to use the options on the pull down menus to
put code in the program editor or to construct a SAS program file to do it in batch. Code can be submitted interactively by going to
the file pull down menu and picking “Open Object”. Once the catalog and entry are selected, you can pick “Open in program editor” to
put all the code into the program editor where it can be submitted. The code above can be put directly into a “program.sas” file and submitted
as a batch file. The code could also be put into what I call a control element, and only the control element submitted.
PROGRAM EDITOR
Code can be submitted interactively through the PROGRAM EDITOR. When the SOURCE2 option is used with the %INCLUDE statement,
the code submitted appears in the LOG file. The interactive PROGRAM EDITOR often works best for testing. Programs are easily inserted
into the program editor using the “OPEN AS OBJECT” command from the FILE menu. This allows catalog code to either be edited directly or
inserted into the PROGRAM EDITOR.
BATCH
For standard, repetitive programs, submission by batch is often the preferred way to go. They can be scheduled to run at a specific time or in
a particular sequence. The batch code could be just a set of %INCLUDE statements or it could be just a single element that contains all the
sequencing. I prefer putting all the code into a single source element that I name “control” or “control_mp”. An example is given below.
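A control element can be sketched as follows. It assumes that the SOURCE entries startup, program_a, program_b, and program_merge already exist in sugi29.paper151; the library path and the fileref ctl are hypothetical.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical location */
filename master catalog 'sugi29.paper151';

/* Store the sequencing statements themselves as the entry CONTROL.SOURCE */
filename ctl catalog 'sugi29.paper151.control.source';
data _null_;
   file ctl;
   put '%include master(startup.source)       / source2;';
   put '%include master(program_a.source)     / source2;';
   put '%include master(program_b.source)     / source2;';
   put '%include master(program_merge.source) / source2;';
run;
filename ctl clear;

/* The batch program then needs only the LIBNAME and FILENAME   */
/* statements above plus this single line                       */
%include master(control.source) / source2;
```

A batch job would typically submit the short program file with the SYSIN option of the SAS command; the exact command line depends on the operating system and installation.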
PROC CATALOG
PROC CATALOG is a procedure that is used to manage and move a catalog. It can also be used to show the contents of a SAS catalog.
Moving elements from a test directory to a production directory is illustrated below. The LOG showing the results of a copy is given below.
PROC CATALOG can also be used to list all the elements in a catalog, as the code and output below illustrate.
PROC CATALOG is used to identify what elements are in a catalog and when they were created. It can also be used to move an entire
catalog or individual elements of a catalog between SAS libraries.
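The copy and listing operations can be sketched with the COPY, SELECT, and CONTENTS statements of PROC CATALOG. The librefs test and prod, their paths, and the placeholder entry are hypothetical.

```sas
libname test 'C:\sugi29\test';   /* hypothetical test library       */
libname prod 'C:\sugi29\prod';   /* hypothetical production library */

/* Create a SOURCE entry in the test catalog so there is something to copy */
filename ctl catalog 'test.paper151.control.source';
data _null_;
   file ctl;
   put '* sequencing statements would go here;';
run;
filename ctl clear;

proc catalog catalog=test.paper151;
   /* Copy one entry from the test catalog to the production catalog */
   copy out=prod.paper151;
      select control / entrytype=source;
   run;
   /* List the entries in the test catalog, with their dates */
   contents;
   run;
quit;
```

Omitting the SELECT statement copies every entry, which is one way to move an entire catalog between libraries.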
The key is to use the “INCAT” and “OUTCAT” options. PROC UPLOAD and PROC DOWNLOAD, part of transfer services
under SAS/CONNECT, are another way to move catalogs between different computer systems. PROC UPLOAD and PROC
DOWNLOAD work similarly to PROC CATALOG. This is also a situation when remote library services under SAS/CONNECT may
be appropriate. The use of PROC UPLOAD/DOWNLOAD allows SAS programmers to develop and manage their production code in catalogs
on a Windows computer and move the code to a UNIX or OpenVMS computer for production work.
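A transfer of this kind might be sketched as below. It cannot run without a configured SAS/CONNECT server: the session name prodhost, both library paths, and the absence of any sign-on script or credentials are assumptions that will differ at every site.

```sas
libname sugi29 'C:\sugi29';   /* hypothetical local (Windows) library */

/* PRODHOST is a hypothetical SAS/CONNECT session name; the SIGNON */
/* details depend entirely on the local setup                      */
signon prodhost;

rsubmit prodhost;
   /* This part runs on the remote host; the path is hypothetical */
   libname prod '/prod/sugi29';

   /* INCAT= is read on the local machine, OUTCAT= is written remotely */
   proc upload incat=sugi29.paper151 outcat=prod.paper151;
   run;
endrsubmit;

signoff prodhost;
```

PROC DOWNLOAD takes the same two options in the opposite direction, with INCAT= naming the remote catalog and OUTCAT= the local one.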
This technique is used to introduce, change, or test new parameters stored in a CATAMS file, new formats or macros, or new source code
without making any change to working code. If problems are encountered, you can revert back to the original code by just changing the
CATNAME statement. When the test code is accepted, you just need to copy the catalog element from the test catalog to the production
catalog.
o The concatenation catalog is searched in the order listed on the statement and the first occurrence of the entry found is used.
o When a catalog entry is opened for writing to, it will be created in the first catalog listed.
o When a catalog entry is opened for reading, the element in the first catalog read will be used.
o When an element is deleted or renamed, only the first occurrence of the entry is affected.
o Any time a list of catalog entries is specified, only the first occurrence is given.
The effect of these rules is that when new test code is placed in a catalog ahead of the production catalog, the new test code will be used
and the production code left unaffected. The effect is the same as if we had changed the production catalog and used only the production
catalog.
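The rules above can be sketched with a CATNAME statement that lists a test catalog ahead of the production catalog. The librefs, paths, the catref work.active, and the two one-line versions of program_a are hypothetical, and the sketch assumes that the CATALOG access method accepts a concatenated catref in place of a catalog name.

```sas
libname test 'C:\sugi29\test';   /* hypothetical test library       */
libname prod 'C:\sugi29\prod';   /* hypothetical production library */

/* Put a different version of PROGRAM_A in each catalog */
filename p catalog 'prod.paper151.program_a.source';
data _null_;
   file p;
   put "data _null_; put 'production version'; run;";
run;
filename t catalog 'test.paper151.program_a.source';
data _null_;
   file t;
   put "data _null_; put 'test version'; run;";
run;
filename p clear;
filename t clear;

/* The test catalog is listed first, so its entries are found first */
catname work.active (test.paper151 prod.paper151);

filename master catalog 'work.active';
%include master(program_a.source) / source2;

/* To revert, list only the production catalog on the CATNAME statement: */
/* catname work.active (prod.paper151);                                   */
```

Entries that exist only in the production catalog are still found through the concatenation, so the test catalog needs to hold only the elements being changed.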
Checkin/Checkout
Checkin/Checkout in SCM allows several developers to work on the same set of files without overwriting each other's work. When you check
out a catalog entry, for example, no one else can check it out until you check it back in.
Revision Control
After each checkin, SCM creates a backup of the file. These backups are kept indefinitely if you so desire. You can easily recall previous
revisions of your work, and track how each file has changed over time. You can quickly revert back to a previous revision of a file.
Version Labeling
Along with Revision Control, you can create Version Labels. A Version Label is basically a snapshot of a set of files at a given point in time.
With a version label, you can quickly reproduce an image of an application as it existed in the past. You can create a version label at the end
of each week, or at every major milestone in the application's development cycle.
Distribution
Once you have a Version Label of an application, you can easily create a copy in other locations on the network, or on other remote
machines using SAS/CONNECT®. This makes it easy to do development work in the SCM environment on one platform, and send the
finished product to other places.
CONCLUSION
Catalogs are excellent tools to use in the development of applications and in their management when put into production. All code (source
statements, logs, output, formats, macros, and parameter files) is stored in a single file - a SAS catalog. Moving the catalog file moves
everything needed to execute the program. The Source Control Manager (SCM) is a more complete tool for managing catalogs for data step
and procedure code. SCM works well with teams and in tracking changes.
ACKNOWLEDGMENTS
SAS and all other SAS Institute Inc. product and service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and
other countries. ® indicates USA registration.
DISCLAIMER
This paper reports the results of research and analysis undertaken by Census Bureau staff. It has undergone a more limited review by the
Census Bureau than its official publications. This report is released to inform interested parties and to encourage discussion.
CONTACT INFORMATION
Your comments and suggestions are encouraged and welcomed. Contact the author at:
David D. Chapman
Economic Planning and Coordination Division
US Census Bureau
Washington, DC 20233-6100
Work Phone: 301-763-6535
Fax: 301-457-4473
Email: david.d.chapman@census.gov