Oracle DUL
Guide V3.2.0.0
ORACLE CONFIDENTIAL
DUL and this documentation are Oracle Confidential and for internal use only.
STANDALONE C-PROGRAM
DUL is a standalone C program that retrieves rows directly from tables in data files. The Oracle
RDBMS software is NOT used at all. DUL does dirty reads: it assumes that every transaction is
committed. It also does not check or require that media recovery has been done.
LAST RESORT
DUL is intended to retrieve data that cannot be retrieved otherwise. It is NOT an alternative for
EXP, SQL*Plus etc. It is meant to be a last resort, not for normal production usage.
Before you use DUL you must be aware that the RDBMS has many
hidden features to force a bad database open. Undocumented
parameters and events can be used to skip roll forward, to disable
rollback and more.
The database can be corrupted, but an individual data block used must be 100% correct. During
unloading, checks are made to make sure that blocks are not corrupted and belong to the
correct segment. If a bad block is encountered during a scan, an error message is printed in the
loader file and to standard output. Unloading will continue with the next row or block.
ROWS in CLUSTERS/TABLES
DUL can and will only unload table/cluster data. It will NOT dump triggers, stored procedures
nor create scripts for tables or views. (But the data dictionary tables describing them can be
unloaded). The data will be unloaded in a format suitable for SQL*Loader or IMP. A matching
control file for SQL*Loader is generated as well.
Cross-platform unloading is supported. The database can be copied from a different operating
system than the DUL-host. (Databases/systems done so far: Sequent/ptx, Vax Vms, Alpha Vms,
MVS, HP9000/8xx, IBM AIX, SCO Unix, Alpha OSF/1).
ROBUST
DUL will not dump core, spin or hang, no matter how badly corrupted the database is.
Full support for all database constructs: row chaining, row migration, hash/index clusters, longs,
raws, rowids, dates, numbers, multiple free list groups, segment high water mark, NULLS,
trailing NULL columns, and unlimited extents.
DUL should work with all versions 6 and 7. DUL has been tested with versions from 6.0.26 up to
7.3.2. Even the old block header layout (pre 6.0.27.2) is supported.
DUL is essentially a single-byte application. The command parser does not understand multibyte
characters, but it is possible to unload any multibyte database. For all possible caveats there is a
workaround.
RESTRICTIONS
MLSLABELS
(LONG) RAW
DUL can unload (long) raws, but there is no way to reload these 1-to-1 with SQL*Loader. There
is no suitable format in SQL*Loader to preserve all long raws. Use the export mode instead or
write a Pro*C program to load the data.
PORTABLE
DUL can be ported to any operating system with an ANSI-C compiler. DUL has been ported to
many UNIX variants, VMS and WindowsNT.
RDBMS INTERNALS
A good knowledge of the Oracle RDBMS internals is a prerequisite for using DUL
successfully. Andre Bakker's two-day internals course is a minimum.
DUL uses a SQL-like command interface. There are DDL statements to unload extents, tables,
users or the entire database. The required data dictionary information can be specified in the DDL
statements or taken from the previously unloaded data dictionary. The following three statements
will unload the DEPT table. The most common form is used when the data dictionary and the
extent map are available:
UNLOAD TABLE scott.dept;
All relevant information can be specified in the statement as well:
REM Columns with type in the correct order
REM The segment header location in the storage clause
CONFIGURATION FILES
There are two configuration files for DUL. "init.dul" contains all configuration parameters. (size
of caches, details of header layout, oracle block size, output file format) In the control file,
"control.dul", the data file names and the oracle file numbers must be specified.
Steps to follow:
1. Configure DUL for the target database. This means creating a correct init.dul and
control.dul.
2. Unload the four data dictionary tables. Use "dul dictv7.ddl" for Oracle7 or "dul
dictv6.ddl" for version 6 of Oracle.
3. Unload the tables. Use one of the following commands:
o "UNLOAD TABLE owner.table ;" (do not forget the semicolon).
o "UNLOAD USER user name ;"
o "UNLOAD DATABASE ;"
Steps to follow:
1. Configure DUL for the target database. This means creating a correct init.dul and
control.dul. (See Port specific parameters.)
2. SCAN DATABASE; : scan the database, build extent and segment map
3. SCAN TABLES; or SCAN EXTENTS; : gather row statistics
4. Identify the lost tables from the output of step 3.
5. UNLOAD the identified tables.
Identifying the tables can be an overwhelming task, but it can be (and has been) done. You need
in-depth knowledge of your application and the application tables. Column types can be
guessed by DUL, but table and column names are lost. Any old SYSTEM tablespace from the
same database, even if weeks old, can be of great help!
AUTOMATED SEARCH
To ease the hunt for the lost tables, the scanned statistical information in seen_tab.dat and
seen_col.dat can be loaded into a fresh database. If you recreate the tables (hopefully the create
table scripts are still available), then the structure information of a "lost" table can be matched to
the scanned information of the "seen" tables with two SQL*Plus scripts (fill.sql and getlost.sql).
Names are not really relevant for DUL, only for the person who must load the data. But
the unloaded data does not have any value if you do not know from which table it came.
The guessed column types can be wrong, although the algorithm is conservative and
decides UNKNOWN if not sure.
Trailing NULL columns are not stored in the database. So if the last columns only contain
NULLs, then the scanner will NOT find them. (During unload trailing NULL columns are
handled correctly.)
When a table is dropped, the description is removed from the data dictionary only. The
data blocks are not overwritten unless they are reused for a new segment. So the scanner
software can see a table that has been dropped.
Tables without rows will go unnoticed.
Newer objects have a higher object id than older objects. If a table is recreated, or if
there is a test and a production version of the same table, the object id can be used to
decide.
Export mode
Export mode is a feature available only in DUL version 3. To enable export mode, you must set
the init.dul parameter EXPORT_MODE to TRUE.
For each table a separate IMP-loadable file will be generated. The
generated file is completely different from a table mode export
generated by EXP! The file is the minimal format that IMP can load. It is
a single-table dump file, with only an insert into table statement and the
table data. Table grants, storage clauses, or triggers will not be
included. An optional create table statement is included if the
COMPATIBLE parameter has been set to 6 or 7. The character set
indication in the generated file header is V6 style. It is set to
mean an ASCII-based character set.
Extreme care has been taken that the dump file can always be loaded
with imp. Only complete, good rows are written to the dump file. For
this, each row is buffered. The size of this buffer can be changed with the
init.dul parameter BUFFER. Incomplete or bad rows are not written out.
SQL*Loader modes
For both SQL*Loader output formats the columns will be space separated and enclosed in double
quotes. Any double quote in the data will be doubled. SQL*Loader recognizes this and will load
only one. The character used to enclose the columns can be changed from double quote to any
character you like with the init.dul parameter LDR_ENCLOSE_CHAR.
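The quoting rule above can be illustrated with a minimal Python sketch. It is not DUL's own code: the function names are hypothetical, and the handling of NULL columns is deliberately left out because this section does not describe it.

```python
def enclose_field(value: str, enclose_char: str = '"') -> str:
    """Enclose one column value, doubling any embedded enclose character."""
    escaped = value.replace(enclose_char, enclose_char * 2)
    return f"{enclose_char}{escaped}{enclose_char}"


def format_record(columns: list[str], enclose_char: str = '"') -> str:
    """Join the enclosed columns with a single space (space separated)."""
    return " ".join(enclose_field(column, enclose_char) for column in columns)


if __name__ == "__main__":
    # The embedded double quotes around JR are doubled in the output.
    print(format_record(["7369", 'SMITH "JR"', "CLERK"]))
```

Passing a different enclose_char mimics the effect of LDR_ENCLOSE_CHAR: that character is then the one that gets doubled inside the data.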
There are two styles of physical record organization:
Stream mode
Nothing special is done in stream mode: a newline is printed after each record. This is a compact
format and can be used if the data does not contain newline characters. To enable stream mode set
LDR_PHYS_REC_SIZE = 0 in init.dul.
Fixed physical records
This mode is essential if the data can contain newlines. One logical record, one complete row, can
be composed of multiple physical records. The default record length is 81, which fits nicely on
the screen of a VT220. The physical record size can be specified with LDR_PHYS_REC_SIZE in
init.dul.
The generated file names are: owner name_table name.ext. The extension is ".dmp" for IMP-
loadable files. ".dat" and ".ctl" are used for the SQL*Loader datafile and the control file. To
prevent variable substitution and other unwanted side effects, strange characters are stripped.
(Only alphanumeric characters and '_' are allowed.)
If the FILE parameter is set, the generated names will be FILEnnn.ext.
This is a workaround if the file system does not support long
enough file names.
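The default naming rule can be sketched in Python as follows. This is an illustration only: the helper names and the "kind" keys are hypothetical, ASCII letters and digits are assumed to be what "alphanumeric" means, and the FILEnnn.ext variant is omitted because the numbering scheme is not described here.

```python
import re

# Extensions as listed in the text; the dictionary keys are illustrative names.
EXTENSIONS = {"export": ".dmp", "loader_data": ".dat", "loader_control": ".ctl"}


def strip_strange_characters(name: str) -> str:
    """Keep only ASCII letters, digits and '_' (assumed meaning of alphanumeric)."""
    return re.sub(r"[^A-Za-z0-9_]", "", name)


def output_file_name(owner: str, table: str, kind: str) -> str:
    """Build owner_table.ext from the cleaned owner and table names."""
    owner_part = strip_strange_characters(owner)
    table_part = strip_strange_characters(table)
    return f"{owner_part}_{table_part}{EXTENSIONS[kind]}"


if __name__ == "__main__":
    print(output_file_name("SYS", "TAB$", "loader_data"))
    print(output_file_name("SCOTT", "EMP", "export"))
```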
To unload table data from a database block the following information must be known:
1. Column/Cluster Information: The number and type of the columns. For char or varchar
columns the maximum length as well. The number of cluster columns and the table
number in the cluster. This information can be supplied in the unload statement or it can
be taken from the previously unloaded USER$, OBJ$, TAB$ and COL$.
2. Segment/Extent information: When unloading a table the extent table in the data segment
header block is used to locate all data blocks. The location of this segment header block
(file number and block number) is taken from the data dictionary or can be specified in
the unload statement. If the segment header is not correct/available then another method
must be used. DUL can build its own extent map by scanning the whole database. (in a
separate run of DUL with the scan database statement.)
BINARY HEADERS
C structs in block headers are not copied directly; they are retrieved with specialized functions.
All offsets of structure members are programmed into DUL. This approach makes it possible to
cross-unload (unload an MVS-created data file on an HP). Apart from byte order, only four layout
types have been found so far.
1. Vax VMS and Netware: No alignment padding between structure members.
2. Korean Ticom Unix machines: 16 bit alignment of structure members.
3. MS/DOS: 16 bit alignment and 16 bit wordsize.
4. Rest of the world (including Alpha VMS): structure member alignment on member size.
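To see why the same C struct can have a different size under these layout types, the following Python sketch computes member offsets for a made-up struct. It is not taken from DUL. The member sizes are hypothetical, and "16 bit alignment" is assumed to mean that members larger than one byte start on a 2-byte boundary.

```python
def layout(member_sizes: list[int], alignment_bits: int) -> tuple[list[int], int]:
    """Return (member offsets, total struct size) for members laid out in order.

    alignment_bits: 0  -> no padding at all
                    16 -> members aligned to at most 2 bytes (assumption)
                    32 -> each member aligned on its own size
    """
    cap = {0: 1, 16: 2, 32: None}[alignment_bits]
    offsets = []
    position = 0
    largest_alignment = 1
    for size in member_sizes:
        alignment = size if cap is None else min(size, cap)
        # Round the current position up to the next multiple of the alignment.
        position = (position + alignment - 1) // alignment * alignment
        offsets.append(position)
        position += size
        largest_alignment = max(largest_alignment, alignment)
    # Trailing padding so that arrays of the struct stay aligned.
    total = (position + largest_alignment - 1) // largest_alignment * largest_alignment
    return offsets, total


if __name__ == "__main__":
    hypothetical_struct = [1, 4, 2, 1, 4]  # member sizes in bytes, made up
    for bits in (0, 16, 32):
        print(bits, layout(hypothetical_struct, bits))
```

Because the offsets of every member shift with the alignment rule, a reader that knows the rule can locate each field without relying on the compiler of the host it runs on.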
MACHINE DEPENDENCIES
DUL can use the data dictionary (USER$, OBJ$, TAB$ and COL$). For the data dictionary to be
used, these internal tables must be unloaded first. (Sample DDL scripts are available: "dictv6.ddl"
and "dictv7.ddl".) These scripts are different for different DUL versions.
UNLOAD DATABASE;
UNLOAD TABLE
    [ owner_name . ]table_name
    [ ( column_definitions ) ]
    [ cluster_clause ]
    [ storage_clause ] ;
storage_clause ::=
    STORAGE ( storage_specification
              [ more_storage_specs ] )
storage_specification ::=
      OBJNO object_id_number
    | TABNO cluster_table_number
    | SEGOBJNO cluster_object_number
    | EXTENTS ( FILE data_segment_header_file_number
                BLOCK data_segment_header_block_number
                [ BLOCKS extent_size_in_oracle_blocks ] )
    | any_normal_storage_specification_but_silently_ignored
SCAN DATABASE;
Scans all blocks of all data files.
Two files are generated:
1: seg.dat information of found segment headers (index/cluster/table):
(object id, file number, and block number).
2: ext.dat information of contiguous table/cluster data blocks.
(object id(V7), file and block number of segment header (V6),
file number and block number of first block,
number of blocks, number of tables)
SCAN TABLES;
Uses seg.dat and ext.dat as input.
Scans all tables in all data segments (a header block and at least one
matching extent with at least 1 table).
SCAN EXTENTS;
Uses seg.dat and ext.dat as input.
Scans all extents for which no corresponding segment header has been found.
(Only useful if a tablespace is not complete, or a segment header
is corrupt.)
Extent Map
UNLOAD TABLE requires an extent map. In 99.99% of the cases the extent map in the segment
header is available. In the rare 0.01% where the segment header is lost, an extent map can be built
with the scan database command. The self-built extent map will only be used during an unload if
the parameter USE_SCANNED_EXTENT_MAP is set to TRUE.
All data blocks have some ID of the segment they belong to. But there
is a difference between V6 and V7. Data blocks created by Oracle
version 6 have the address of the segment header block. Data blocks
created by Oracle7 have the segment object id in the header.
Column Specification
The column definitions must be specified in the order the columns are stored in the segment, that
is ordered by col$.segcol#. This is not necessarily the same order as the columns were specified
in the create table statement. Cluster columns are moved to the front, longs to the end. Columns
added to the table with the alter table command are always stored last.
UNLOAD EXTENT can be used to unload 1 or more adjacent blocks. The extent to be unloaded
must be specified with the STORAGE clause: To specify a single extent use: STORAGE
( EXTENTS( FILE fno BLOCK bno BLOCKS #blocks) ) (FILE and BLOCK specify the first
block, BLOCKS the size of the extent)
There is a "hidden" trick with file and object numbers that is used to locate the data dictionary
tables. The trick is based on the fact that object numbers are fixed for OBJ$, COL$, USER$ and
TAB$ due to the rigid nature of sql.bsq. This will not be documented because I myself could not
understand my first attempt to describe it.
These statistics are combined and a column type is suggested. Using this suggestion five rows are
unloaded to show the result. These statistics are dumped to two files (seen_tab.dat and
seen_col.dat). There are SQL*Loader and SQL*Plus scripts available to automate a part of the
identification process. (Currently known as the getlost option).
DESCRIBE
There is a describe command. It will show the dictionary information for the table, available in
DUL's dictionary cache.
ALIGN_FILLER
OBSOLETE
Replaced by OSD_C_STRUCT_ALIGNMENT
ASCII2EBCDIC
BOOLEAN
Whether (var)char fields must be translated from EBCDIC to ASCII. (For unloading an MVS
database on an ASCII host.)
BIG_ENDIAN_FLAG
OBSOLETE
Replaced by OSD_BIG_ENDIAN_FLAG
BLOCKS_TO_SKIP
NUMBER
Number of data blocks to skip before starting unload. This is useful for unloading huge
tables in smaller portions. See also the parameter MAX_UNLOAD_BLOCKS.
BUFFER
NUMBER (bytes)
Row output buffer size, used in export mode only. In export mode each row is first stored
in this buffer before it is written to the dump file.
COMPATIBLE
NUMBER
Database version; valid values are 6 or 7. If set, you can dump file header blocks. A create
table statement is included in the dump file only if this parameter is set.
CONTROL_FILE
TEXT
Name of the DUL control file (default: "control.dul").
DBA_FILE_BITS
OBSOLETE
Replaced by OSD_DBA_FILE_BITS
DB_BLOCK_SIZE
NUMBER
Oracle block size in bytes (maximum 32 K)
DB_LEADING_OFFSET
OBSOLETE
Replaced by OSD_FILE_LEADER_SIZE
DC_COLUMNS
NUMBER
DC_OBJECTS
NUMBER
DC_TABLES
NUMBER
DC_USERS
NUMBER
Sizes of the DUL dictionary caches. If one of these is too low, startup will fail.
EXPORT_MODE
BOOLEAN
Selects EXP-like output mode (TRUE) or SQL*Loader format (FALSE).
FILE
TEXT
Base for (dump or data) file name generation
LDR_ENCLOSE_CHAR
TEXT
The character to enclose fields in SQL*Loader mode.
LDR_PHYS_REC_SIZE
NUMBER
Physical record size for the generated loader datafile.
LDR_PHYS_REC_SIZE = 0: No fixed records, each record is terminated with a newline.
LDR_PHYS_REC_SIZE > 2: Fixed record size.
MAX_OPEN_FILES
Maximum number of files that are concurrently kept open at the OS level.
MAX_UNLOAD_BLOCKS
Maximum number of data blocks that are read. This is meant to break a huge table into
pieces. See also the parameter BLOCKS_TO_SKIP.
OSD_BIG_ENDIAN_FLAG
Byte order in machine word. Big Endian is also known as MSB first. For an explanation
of why this is called Big Endian, you should read Gulliver's Travels.
OSD_DBA_FILE_BITS
File Number Size in DBA in bits. Or to be more precise the size of the low order part of
the file number.
OSD_FILE_LEADER_SIZE
Bytes/blocks added before the real Oracle file header block.
OSD_C_STRUCT_ALIGNMENT
C structure member alignment (0, 16 or 32). Must be set to 32 for most ports.
OSD_WORD_SIZE
Size of a machine word: always 32, except for MS/DOS (16).
TICOM_FILLER
OBSOLETE
The new parameter is OSD_C_STRUCT_ALIGNMENT
USE_SCANNED_EXTENT_MAP
BOOLEAN
Use the scanned extent map in ext.dat in unload table. This parameter is only useful if
some segment headers are missing or incorrect.
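The BLOCKS_TO_SKIP and MAX_UNLOAD_BLOCKS parameters described above can be combined to unload a huge table in several runs. The Python sketch below only does the arithmetic for the settings of each run. It assumes that MAX_UNLOAD_BLOCKS counts the blocks read after the skipped ones, which the descriptions suggest but do not state outright; the function name and the block counts are made up.

```python
def unload_portions(total_blocks: int, portion_blocks: int) -> list[tuple[int, int]]:
    """Return one (blocks_to_skip, max_unload_blocks) pair per DUL run."""
    if portion_blocks <= 0:
        raise ValueError("portion_blocks must be positive")
    return [
        (skip, min(portion_blocks, total_blocks - skip))
        for skip in range(0, total_blocks, portion_blocks)
    ]


if __name__ == "__main__":
    # Hypothetical table of 2500 blocks, unloaded 1000 blocks at a time.
    for run, (skip, count) in enumerate(unload_portions(2500, 1000), start=1):
        print(f"# run {run}")
        print(f"blocks_to_skip = {skip}")
        print(f"max_unload_blocks = {count}")
```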
SAMPLE init.dul :
# sample init.dul configuration parameters
# these must be big enough for the database in question
# the cache must hold all entries from the dollar tables.
dc_columns = 200000
dc_tables = 10000
dc_objects = 10000
dc_users = 40
# OS specific parameters
osd_big_endian_flag = false
osd_dba_file_bits = 6
osd_c_struct_alignment = 32
osd_file_leader_size = 1
# database parameters
db_block_size = 2048
Big endian or little endian (byte order in machine words): HP, SUN and mainframes are generally
big endian: OSD_BIG_ENDIAN_FLAG = TRUE. DEC and Intel platforms are little endian:
OSD_BIG_ENDIAN_FLAG = FALSE.
There is no standard trick for this, but the following might work on a unix
system:
echo dul | od -x
If you see:
0000000 7564 0a6c
0000004
This is a little endian machine.
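The same check can be reproduced in Python, which also shows what a big endian machine would print for the od test above. This is a small sketch, not part of DUL; the function name is hypothetical. Note that sys.byteorder reports the byte order of the host that runs the script, so it is only meaningful when run on the platform in question.

```python
import struct
import sys


def od_x_words(data: bytes, byteorder: str) -> list[str]:
    """Render data as 16-bit hex words, as `od -x` would on a host of that byte order."""
    if len(data) % 2:
        raise ValueError("need an even number of bytes")
    prefix = "<" if byteorder == "little" else ">"
    words = struct.unpack(f"{prefix}{len(data) // 2}H", data)
    return [f"{word:04x}" for word in words]


if __name__ == "__main__":
    sample = b"dul\n"  # what `echo dul` writes
    print("little endian host:", " ".join(od_x_words(sample, "little")))
    print("big endian host:   ", " ".join(od_x_words(sample, "big")))
    flag = "TRUE" if sys.byteorder == "big" else "FALSE"
    print(f"this host: OSD_BIG_ENDIAN_FLAG = {flag}")
```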
dba_file_bits
The number of bits in a dba used for the low order part of file number. Perform the following
query:
SQL> select dump(chartorowid('0.0.1')) from dual;
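How the number of file bits carves up a data block address can be sketched in Python. This is an assumption-laden illustration, not DUL's implementation: it assumes a 32-bit DBA with the file number in the high-order bits and the block number in the remaining low-order bits, and it does not cover ports where the file number is split into a low and a high part. The function names are hypothetical.

```python
DBA_BITS = 32  # assumed width of a data block address


def make_dba(file_number: int, block_number: int, file_bits: int) -> int:
    """Pack a file number and a block number into one DBA (assumed layout)."""
    block_bits = DBA_BITS - file_bits
    return (file_number << block_bits) | block_number


def split_dba(dba: int, file_bits: int) -> tuple[int, int]:
    """Unpack a DBA into (file number, block number) under the same assumption."""
    block_bits = DBA_BITS - file_bits
    return dba >> block_bits, dba & ((1 << block_bits) - 1)


if __name__ == "__main__":
    file_bits = 6  # the value used in the sample init.dul
    dba = make_dba(1, 0, file_bits)
    print(f"file 1, block 0 -> 0x{dba:08x}")
    print(split_dba(make_dba(1, 10530, file_bits), file_bits))
```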
OSD_C_STRUCT_ALIGNMENT
Structure layout in data file headers.
0: No padding between members in a C-struct (VAX/VMS only)
16: Some Korean Ticom machines and MS/DOS
32: Structure members are member size aligned. (All others including ALPHA/VMS)
Check the following query:
SELECT *
FROM v$type_size
WHERE type IN ( 'KCBH', 'KTNO', 'KTBBH', 'KTBIT', 'KDBH'
, 'KTECT', 'KTETB', 'KTSHC') ;
In general osd_c_struct_alignment = 32 and the following output is expected:
COMPONEN TYPE     DESCRIPTION                            SIZE
-------- -------- -------------------------------- ----------
K        KTNO     TABLE NUMBER IN CLUSTER                   1
KCB      KCBH     BLOCK COMMON HEADER                      20
KTB      KTBIT    TRANSACTION VARIABLE HEADER              24
KTB      KTBBH    TRANSACTION FIXED HEADER                 48
KDB      KDBH     DATA HEADER                              14
KTE      KTECT    EXTENT CONTROL                           44
KTE      KTETB    EXTENT TABLE                              8
KTS      KTSHC    SEGMENT HEADER                            8
8 rows selected.
For VAX/VMS and Netware ONLY osd_c_struct_alignment = 0 and this output is expected:
COMPONEN TYPE     DESCRIPTION                            SIZE
-------- -------- -------------------------------- ----------
K        KTNO     TABLE NUMBER IN CLUSTER                   1
KCB      KCBH     BLOCK COMMON HEADER                      20
KTB      KTBIT    TRANSACTION VARIABLE HEADER              23
KTB      KTBBH    TRANSACTION FIXED HEADER                 42
KDB      KDBH     DATA HEADER                              14
KTE      KTECT    EXTENT CONTROL                           39
KTE      KTETB    EXTENT TABLE                              8
KTS      KTSHC    SEGMENT HEADER                            7
8 rows selected.
If there is a different list this will require some major hacking and sniffing and possibly a major
change to DUL. (Email bduijnen@nl.oracle.com)
osd_file_leader_size
Number of blocks/bytes before the Oracle file header. Unix datafiles have an extra leading block
(file size, block size, magic number). A large number (> 100) is seen as a byte offset; a small
number is seen as a number of Oracle blocks.
Unix : osd_file_leader_size = 1
Vms : osd_file_leader_size = 0
Desktop : osd_file_leader_size = 512
Others : Unknown ( Use Andre Bakker's famous PATCH utility to find out)
An Oracle7 file header block starts with the pattern 0X0B010000.
You can add an additional byte offset in control.dul in the optional third field (for instance for
AIX or DEC UNIX data files on raw device)
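The blocks-or-bytes rule for osd_file_leader_size, together with the optional extra offset from control.dul, can be sketched in Python as below. The function names are hypothetical, and the header check assumes that the pattern 0X0B010000 appears in the file as the four bytes 0B 01 00 00 in that order, which may not hold on every platform.

```python
V7_HEADER_PATTERN = bytes.fromhex("0b010000")  # assumed on-disk byte order


def leader_bytes(osd_file_leader_size: int, db_block_size: int) -> int:
    """A value above 100 is a byte offset, a smaller one a number of Oracle blocks."""
    if osd_file_leader_size > 100:
        return osd_file_leader_size
    return osd_file_leader_size * db_block_size


def header_offset(osd_file_leader_size: int, db_block_size: int, extra_offset: int = 0) -> int:
    """Byte position where the Oracle file header block is expected to start."""
    return leader_bytes(osd_file_leader_size, db_block_size) + extra_offset


def looks_like_v7_header(image: bytes, offset: int) -> bool:
    """Check for the Oracle7 file header pattern at the given byte offset."""
    return image[offset:offset + 4] == V7_HEADER_PATTERN


if __name__ == "__main__":
    block_size = 2048
    for name, value in (("Unix", 1), ("Vms", 0), ("Desktop", 512)):
        print(name, leader_bytes(value, block_size))
    # Synthetic data file image: one leading block, then the header pattern.
    image = bytes(block_size) + V7_HEADER_PATTERN + bytes(block_size - 4)
    print(looks_like_v7_header(image, header_offset(1, block_size)))
```

Trying a few candidate values against the first bytes of a real data file in this way is one possible manual approach to finding the leader size on an unknown port.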
A control file (default name "control.dul") is used to translate the file numbers to file names. The
format of the control file is simple: Each entry on a separate line, first the file number and then
the file name. A third optional field is an extra positive or negative byte offset, that will be added
to all fseek() operations for that datafile. This makes it possible to skip over the extra block for
AIX on raw devices or to unload from fragments of a datafile.
For instance:
1 /usr/oracle/dbs/system.dbf
8 /usr/oracle/dbs/data.dbf 4096
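The control file format above is simple enough to parse in a few lines. The Python sketch below is illustrative only: it assumes whitespace-separated fields and file names without embedded spaces, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class DataFile(NamedTuple):
    file_number: int
    file_name: str
    byte_offset: int  # optional third field, 0 when absent


def parse_control_file(text: str) -> list[DataFile]:
    """Parse control.dul style text: file number, file name, optional byte offset."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        offset = int(fields[2]) if len(fields) > 2 else 0
        entries.append(DataFile(int(fields[0]), fields[1], offset))
    return entries


if __name__ == "__main__":
    sample = "1 /usr/oracle/dbs/system.dbf\n8 /usr/oracle/dbs/data.dbf 4096\n"
    for entry in parse_control_file(sample):
        print(entry)
```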
The file header blocks are NOT verified. This would make it impossible to unload files with a
corrupted header block. For debugging it is possible to dump the file header.
This looks familiar. Use the above information and your knowledge of the emp table to
compose:
UNLOAD TABLE emp ( empno number, ename char, job char, mgr number,
hiredate date, sal number, comm number, deptno number)
STORAGE ( TABNO 0 EXTENTS( FILE 1 BLOCK 10530));
Use this statement to unload emp:
$ dul

UnLoader: Version 2.0.0.0 - Very Restricted on Tue May 16 11:46:33 1995

Copyright (c) 1994/95 Oracle Corporation, The Netherlands. All rights reserved.

Loaded 350 segments
Loaded 204 extents
Extent map sorted
DUL> UNLOAD TABLE emp ( empno number, ename char, job char, mgr number,
DUL 2> hiredate date, sal number, comm number, deptno number)
DUL 3> STORAGE ( TABNO 0 EXTENTS( FILE 1 BLOCK 10530));
. unloading table EMP 14 rows unloaded
DUL>quit
unload table TAB$( OBJ# number, TS# ignore, FILE# number, BLOCK# number,
CLU# ignore, TAB# number, COLS number, CLUCOLS number)
cluster C_OBJ#(OBJ#)
storage ( tabno 1 segobjno 1 file 1)
;