Updated VxVM
Classroom Session
Copyright © 2002 VERITAS Software Corporation. All Rights Reserved. VERITAS, VERITAS Software, the VERITAS logo, and all other VERITAS product names and slogans are trademarks or registered
trademarks of VERITAS Software Corporation in the US and/or other countries. Other product names and/or slogans mentioned herein may be trademarks or registered trademarks of their respective
companies.
At the end of this session, you will be familiar with:
VxVM (VERITAS Volume Manager)
• Volumes
• Plexes
• Subdisks
• Striping (RAID 0)
• Mirroring (RAID 1)
• Availability
• Performance
• Scalability
Types of Objects
VxVM operates as a subsystem between your operating system and your data management systems, such as file systems and database management systems.
VxVM is layered on top of the OS interface services and is dependent upon how the operating system accesses physical disks.
Through the support of RAID redundancy techniques, VxVM protects against disk and hardware failures, while providing
the flexibility to extend the capabilities of existing hardware.
With Volume Manager, you enable virtual data storage by bringing a disk under Volume Manager
control. To bring a disk under Volume Manager control means that Volume Manager creates virtual
objects and establishes logical connections between those objects and the underlying physical
objects, or disks.
VxVM Objects
VxVM uses two types of objects to handle storage management: physical objects and virtual objects.
Physical objects – physical disks, controllers, or other hardware with block and raw OS device interfaces that are used to store data.
Virtual objects – When one or more physical disks are brought under the control of VxVM, it creates virtual objects called volumes on those physical disks.
Volumes are also composed of other virtual objects (plexes and subdisks) that are used in changing the volume configuration, so volumes and their virtual components are called virtual objects or VxVM objects.
VxVM Objects Cont…
Volume – A volume is a virtual disk device that appears to applications, databases, and file systems like a physical disk device, but does not have the physical limitations of a physical disk device.
Plex – VxVM uses subdisks to build virtual objects called plexes. A plex consists of one or more subdisks located on one or more physical disks.
Subdisk – A subdisk is a set of contiguous disk blocks. A block is a unit of space on the disk. Each subdisk represents a specific portion of a VM disk, which is mapped to a specific region of a physical disk.
[Diagram: Volume → Plex → Subdisk hierarchy]
How Volume Manager Works?
1. Volume Manager removes all of the partition table entries from the VTOC, except for partition table entry 2 (backup slice). Partition table entry 2 contains the entire disk, including the VTOC, and is used to determine the size of the disk.
2. Volume Manager then rewrites the VTOC and creates two partitions on the physical disk. One partition
contains the private region, and the other contains the public region.
How Volume Manager Works?
• Private region: The private region stores information, such as disk headers, configuration copies, and kernel logs,
that Volume Manager uses to manage virtual objects. The private region represents a small management overhead.
The minimum size for the private region is 1024 sectors (512K) for disks with active configuration databases, but VxVM
uses 2048 sectors (1024K) by default. This default value is rounded up to the next cylinder boundary.
• The maximum size for the private region is 524288 blocks (512K sectors).
• Partition Tags: VxVM sets the partition tags, the numeric values that describe the file system mounted on a partition, for
the public and private regions:
• If the disk has no partitions that are being placed under Volume Manager control, then Volume Manager creates the
private region first, and the public region second, on the disk.
• Volume Manager updates the VTOC with information about the removal of the existing partitions and the addition of the
new partitions during the initialization process.
Summary of Virtual Object Relationships
Summary of Virtual Object Relationships…
Volume Layouts
Planning a First-Time VxVM Setup
Which disks do you want to place under Volume Manager
control?
Do you want to use enclosure-based naming?
Do you want to exclude any disks from Volume Manager
control?
Do you want to suppress dynamic multipathing on any
disks?
When you place disks under Volume Manager control, do you want to
preserve or eliminate data in existing file systems and partitions?
• When you place a disk under Volume Manager control, you can either preserve the data
that exists on the physical disk (encapsulation) or eliminate all of the data on the physical
disk (initialization).
Encapsulation
• Saving the data on a disk brought under Volume Manager control is called disk encapsulation.
• To be encapsulated, the disk must contain the required minimum of 1024 sectors (512K) of unpartitioned free space. (By default, VxVM uses 2048 sectors (1024K).)
• The partitions are converted to subdisks that are used to create the volumes that
replace the Solaris partitions.
Initialization
• Eliminating all of the data on a physical disk brought under Volume Manager control is called disk initialization.
Note: Any disks that are encapsulated or initialized during installation are placed in the disk group rootdg. If disks are left alone during installation, they can be placed under Volume Manager control later and assigned to disk groups other than rootdg.
Do you want to place the system root disk under Volume
Manager control?
• Existing /, /usr, and /var partitions are converted to volumes without removing the
partitions.
• Other partitions are converted to volumes, and then partitions are removed.
• The existing swap area is converted to a volume. If there is insufficient space for
the private region on the boot disk, Volume Manager takes sectors from the swap
area of the disk, which makes the private region overlap the public region. The
swap partition remains the same size, and the swap volume is resized to be
smaller than the swap partition.
The /etc/system and /etc/vfstab files are modified.
Note: Volume Manager preserves a copy of the original VTOC of any disk that is encapsulated in /etc/vx/reconfig.d/disks.d/cxtydz/vtoc, where cxtydz is the SCSI address of the disk.
Typical Initial VxVM Setup
VxVM Licensing
• VRTSlic – Licensing
• VRTSvmdoc – Documentation
• # pkgadd -d . VRTSvxvm
• # pkgadd -d . VRTSvmsa
• # pkgadd -d . VRTSvmdoc
• # pkgadd -d . VRTSvmman
• # pkgadd -d . VRTSvmdev
Steps to Add VxVM Packages
• # tail -f /var/opt/vmsa/logs/command
CLI Directories
• /etc/vx/bin
• /usr/sbin
• /usr/lib/vxvm/bin
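Because the VxVM utilities are spread across these directories, it can help to add them to the shell search path. A minimal sketch (Bourne shell syntax; the man page path is an assumption and may differ per install):

```shell
# Append the VxVM command directories to the search path
PATH=$PATH:/etc/vx/bin:/usr/sbin:/usr/lib/vxvm/bin
# Man pages are commonly installed under /opt/VRTS/man (verify on your system)
MANPATH=$MANPATH:/opt/VRTS/man
export PATH MANPATH
```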
Managing Disks
Placing a Disk Under Volume Manager Control
Placing a Disk Under Volume Manager
Control…
Placing a Disk Under Volume Manager
Control…
Commands…
• # vxdisksetup -i c1t0d0
• The -i option writes a disk header to the disk, making the disk directly usable.
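A short sketch of initializing a disk and confirming the result (the device name is an example; substitute your own):

```shell
# Initialize the disk for VxVM use (creates the private and public regions)
/etc/vx/bin/vxdisksetup -i c1t0d0

# Confirm the disk now appears as initialized but not yet in a disk group
vxdisk list
```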
Commands…
• Evacuating a disk moves the contents of the volumes on a disk to another disk. The contents of a disk can be evacuated only to disks in the same disk group that have sufficient free space. You must evacuate a disk if you plan to remove the disk or if you want to use the disk elsewhere.
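Evacuation is typically performed with the vxevac utility; a hedged sketch (disk group and media names are examples):

```shell
# Move all subdisks from datadg01 to datadg02 within the same disk group
/etc/vx/bin/vxevac -g datadg datadg01 datadg02

# With no target named, vxevac picks any disk in the group with free space
/etc/vx/bin/vxevac -g datadg datadg01
```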
Commands…
• # vxdiskunsetup -C c1t0d0
Rename the disk
• A disk group is created when you place at least one disk in the disk group. When you
add a disk to a disk group, a disk group entry is added to the private region header of
that disk. Because a disk can only have one disk group entry in its private region
header, one disk group does not "know about" other disk groups, and therefore disk
groups cannot share resources, such as disk drives, plexes, and volumes. A volume
with a plex can belong to only one disk group, and subdisks and plexes of a volume
must be stored in the same disk group.
• When you add a disk to a disk group, VxVM assigns the disk media name to the disk
and maps this name to the disk access record. In addition, the host name is also
recorded in the private region. This information is written to the private region of the
disk.
• Disk media name: A disk media name is the logical disk name assigned to a drive
by VxVM. VxVM uses this name to identify the disk for volume operations, such as
volume creation and mirroring.
• Disk access record: A disk access record is a record of how a disk maps to a
physical location and represents the UNIX path to the device. Disk access records
are dynamic and can be re-created when vxdctl enable is run.
• Once disks are placed under Volume Manager control, storage is managed in terms of
the logical configuration. File systems mount to logical volumes, not to physical
partitions. Logical names, such as /dev/vx/[r]dsk/diskgroup_name/volume, replace
physical locations, such as /dev/[r]dsk/c0t4d2s5
• Whenever the VxVM configuration daemon is started (or vxdctl enable is run), the
system reads the private region on every disk and establishes the connections
between disk access records and disk media names.
The rootdg Disk Group
• The rootdg disk group is a special disk group that is created when you install VxVM
during the vxinstall process. VxVM requires that the rootdg disk group exist and that
it contain at least one disk. It is recommended that at least two disks are in the
rootdg disk group so that the VxVM configuration database can be maintained on at
least two disks. If you want your boot disk to be bootable under VxVM, then the boot
disk must be in the rootdg disk group.
Disk Groups and High Availability
Creating Disk Group
• # umount /filesystem2
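As a command-line sketch, a disk group is created by placing a first initialized disk into it (group, media, and device names here are examples):

```shell
# Create a new disk group containing one initialized disk
vxdg init datadg datadg01=c1t1d0s2

# Add a second disk to the group
vxdg -g datadg adddisk datadg02=c1t2d0s2

# Verify the group and its members
vxdg list
vxdisk list
```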
RAID is a method of combining several hard disk drives into one logical unit (two or more disks grouped together to appear
as a single device to the host system). RAID technology was developed to address the fault-tolerance and performance
limitations of conventional disk storage. It can offer fault tolerance and higher throughput levels than a single hard drive or
group of independent hard drives. While arrays were once considered complex and relatively specialized storage solutions,
today they are easy to use and essential for a broad spectrum of client/server applications.
VxVM Layouts
• Concatenation and spanning
• Striping (RAID 0)
• Mirroring (RAID 1)
• Data is accessed in the first subdisk from beginning to end; data is then accessed in the remaining subdisks in order.
• The subdisks of a concatenated plex do not have to be physically contiguous and can belong to more than one VM disk.
[Diagram: concatenated layout – volume ABCDEF built from one plex whose subdisks hold blocks A–C and D–F in sequence]
Striping - RAID 0
Data striping without redundancy (no protection)
Minimum number of drives: 2
Strengths: Highest performance.
Weaknesses: No data protection; if one drive fails, all data is lost.
• Striping (RAID 0) is useful if you need large amounts of data written or read quickly.
• Striping maps data so that the data is interleaved among two or more physical disks.
• Data is allocated alternately and evenly to the subdisks of a striped plex.
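A hedged example of creating a striped volume with vxassist (the volume name, size, and column count are illustrative):

```shell
# Create a 1 GB striped (RAID-0) volume across two columns,
# using the default stripe unit size
vxassist -g datadg make stripevol 1g layout=stripe ncol=2

# Inspect the resulting plex and subdisk layout
vxprint -g datadg -ht stripevol
```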
[Diagram: striped layout – volume ABCDEF with data interleaved across two subdisks: A, C, E on one and B, D, F on the other]
RAID Level 0
Concatenated Layout
Concatenation: Advantages
• Removes size restrictions: Concatenation removes the restriction on the size of storage devices imposed by physical disk size.
Striping: Advantages
• Load balancing: Striping is also helpful in balancing the I/O load from multiuser applications across multiple disks.
Mirroring: Disadvantages
• Requires more disk space: Mirroring requires twice as much disk space, which can be costly for large configurations. Each mirrored plex requires enough space for a complete copy of the volume's data.
• # vxprint -Aht (all disk groups, hierarchically, in tabular form)
• # vxprint -g rootdg (a specific disk group)
• # vxprint -dt (disk records)
• # vxprint -st (subdisk records)
• # vxprint -pt (plex records)
• # vxprint -vt (volume records)
Removing a Volume
Removing a Volume . . .
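One common removal sequence, as a sketch (file system, group, and volume names are examples):

```shell
# Unmount any file system on the volume, then stop the volume
umount /filesystem
vxvol -g datadg stop datavol

# Remove the volume together with its plexes and subdisks
vxedit -g datadg -rf rm datavol
```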
• # vxmirror -g datadg -a
• # /etc/vx/bin/vxmirror -d yes
Excluding Storage from Volume Creation
• # newfs /dev/vx/rdsk/datadg/datavol
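After creating a volume, a file system is built on the raw volume device and mounted through the block device; a sketch (paths are examples):

```shell
# Build a UFS file system on the raw volume device
newfs /dev/vx/rdsk/datadg/datavol

# Mount it through the corresponding block device node
mkdir -p /filesystem
mount /dev/vx/dsk/datadg/datavol /filesystem
```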
• One partition contains the private region. The private region stores VxVM information, such as disk headers, configuration copies, and kernel logs. Tag 15 is always associated with the private region. When a disk is encapsulated, tag 15 is always associated with a slice other than slice 3.
• The other partition contains the public region. The public region is used for storage space allocation and is always associated with tag 14.
Root Disk Encapsulation
Data Disk Encapsulation Requirements
• At least two partition table entries must be available on the
disk.
– One partition is used for the public region.
– One partition is used for the private region.
During root encapsulation, the following lines are added to /etc/system:
rootdev:/pseudo/vxio@0:0
set vxio:vol_rootdev_is_volume=1
/etc/vfstab: Before Root Encapsulation
/etc/vfstab: After Root Encapsulation
Mirroring the Root Disk
• The boot disk must be encapsulated by VxVM in order to be
mirrored.
• To mirror the root disk, you must provide another disk with
enough space to contain all of the root partitions (/, /usr,
/var, /opt, and swap).
• You can only use disks in the rootdg disk group for the boot
disk and alternate boot disks.
Why Create an Alternate Boot Disk?
• # /etc/vx/bin/vxrootmir secrootmir
• bootpath: '/sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037590098,0:a'
• # prtconf -vp | grep vx
vx-disk01: '/sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020374fe71f,0:a'
vx-rootdisk: '/sbus@3,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037590098,0:a'
Unencapsulating a Root Disk
• # vxunroot
• To convert the root, swap, usr, var, opt, and home file
systems back to being accessible directly through disk
partitions instead of through volume devices, you use the
vxunroot utility. Other changes that were made to ensure
the booting of the system from the root volume are also
removed so that the system boots with no dependency on
VxVM.
• For vxunroot to work properly, the following conditions must
be met:
• All but one plex of rootvol, swapvol, usr, var, opt, and home
must be removed (using vxedit or vxplex).
• One disk in addition to the root disk must exist in rootdg.
To convert a root volume back to partitions:
• Ensure that the rootvol, swapvol, usr, and var volumes have
only one associated plex each. The plex must be contiguous,
nonstriped, nonspanned, and nonsparse.
• # vxprint -ht rootvol swapvol usr var
• vxconfigd reads the kernel log to determine the state of VxVM objects.
vxconfigd reads the configuration database on the disks, then uses the kernel log
to update the state information of the VM objects.
VxVM Daemons…
• vxiod
• vxiod—VxVM I/O kernel threads provide extended I/O
operations without blocking calling processes. By default, 10
I/O threads are started at boot time, and at least one I/O thread
must continue to run at all times.
VxVM Daemons…
• vxrelocd
• vxrelocd is the hot relocation daemon that monitors events
that affect data redundancy. If redundancy failures are
detected, vxrelocd automatically relocates affected data from
mirrored or RAID-5 subdisks to spare disks or other free space
within the disk group. vxrelocd also notifies the system
administrator by e-mail of redundancy failures and relocation
activities.
VM Disks – Private Region
• The disk header contains the disk label, disk group information, host ID, and
pointers to the private and public regions. You can display disk header
information by using vxdisk list diskname.
• The configuration database contains VxVM object definitions. The size of the
configuration database is approximately 70 percent of the private region.
• Kernel logs contain configuration changes, including information about log
plex attachment, object creation, object deletion, object states, and flags.
Types of VxVM Disks
• A simple disk is a disk that is created dynamically in the kernel and has
public and private regions that are contiguous inside a single partition.
• A sliced disk is a disk that has separate slices for the public and private
regions.
• A NOPRIV disk is a disk that does not contain a private region.
VxVM Configuration Database
• The VxVM configuration database stores all disk, volume, plex, and
subdisk configuration records. The vxconfig device (/dev/vx/config) is the
interface through which all changes to the volume driver state are
performed. This device can only be opened by one process at a time, and
the initial volume configuration is downloaded into the kernel through this
device.
• The configuration database is stored in the private region of a VxVM disk.
• The VxVM configuration is replicated within the disk group so that sufficient
copies exist to protect against loss of the configuration in case of physical disk
failure. VxVM attempts to store at least four copies for each disk group.
Displaying Disk Group Configuration Data
• The size of the configuration database for a disk group is the size of the
smallest private region in the disk group. In the example, permlen=2630.
• Log entries are on all disks that have databases. The log is used by the
VxVM kernel to keep the state of the drives accurate, in case the
database cannot be kept accurate (for example, if the configuration daemon
is stopped).
Displaying Disk Configuration Data
• Enabled
• Disabled
In the disabled mode, most operations are not allowed. vxconfigd does
not retain configuration information for the imported disk groups and
does not maintain the volume and plex device directories. Certain
failures, most commonly the loss of all disks or configuration copies in
the rootdg disk group, will cause vxconfigd to enter the disabled state
automatically.
• Booted
• # vxconfigd
• # vxdctl -k stop
• # vxdctl disable
• To display the list of VxVM features that are currently available based on known licensing information, use vxdctl license.
• By adding the init argument, you can request that vxconfigd reread any
persistently stored license information. If licenses have expired, some features
may become unavailable. If new licenses have been added, the features defined in
those licenses become available.
The volboot File
• Never edit the volboot file manually. If you do so, its checksum will be
invalidated.
• To view the decoded contents of the volboot file
• # vxdctl list
Changing the Host ID
Dirty Region Logging
• If a write causes a log region to become dirty when it was previously clean, the log is synchronously written to disk before the write operation can occur. On system restart, VxVM recovers only those regions of the volume that are marked as dirty in the dirty region log.
RAID 5 Logging
• RAID-5 volumes use RAID-5 logs to keep a copy of the data and parity
currently being written.
• Without logging, data not involved in any active writes can be lost or
silently corrupted if both a disk in a RAID-5 volume and the system fail.
• The vxrelocd daemon starts during system startup and monitors VxVM
for failures involving disks, plexes, or RAID-5 subdisks. When a
failure occurs, vxrelocd triggers a hot-relocation attempt and notifies
the system administrator, through e-mail, of failures and any relocation
and recovery actions.
• The vxrelocd daemon is started from the S95vxvm-recover file. The
argument to vxrelocd is the list of people to e-mail notice of a
relocation (default is root). To disable vxrelocd, you can place a "#" in
front of the line in the S95vxvm-recover file.
• The hot-relocation feature is enabled by default. No system
administrator action is needed to start hot relocation when a
failure occurs.
A successful hot-relocation process involves:
• Failure detection: Detecting the failure of a disk, plex, or RAID-5 subdisk
• 1. Disk replacement
When a disk fails, you replace the corrupt disk with a new disk. The disk
used to replace the failed disk must be either an uninitialized disk or a disk in
the free disk pool. The replacement disk cannot already be in a disk group. If
you want to use a disk that exists in another disk group, then you must
remove the disk from the disk group and place it back into the free disk pool
before you can use it as the replacement disk.
• 2. Volume recovery
When a disk fails and is removed for replacement, the plex on the failed
disk is disabled, until the disk is replaced. Volume recovery involves:
– Starting disabled volumes
• Before VxVM can use a new disk, you must ensure that
Solaris recognizes the disk. When adding a new disk,
follow these steps to ensure that the new disk is
recognized:
• 1.Connect the new disk.
• 2. Get Solaris to recognize the disk:
• # drvconfig
• # disks
• Note: In Solaris 7 and later, use devfsadm, a one-command replacement for drvconfig and disks.
• 3. Verify that Solaris recognizes the disk:
• # prtvtoc /dev/dsk/device_name
• 4. Get VxVM to recognize that a failed disk is now working again:
• # vxdctl enable
• 5. Verify that VxVM recognizes the disk:
• # vxdisk list
• After Solaris and VxVM recognize the new disk, you can then use the disk as a
replacement disk.
Replace a failed disk
• Select a removed or failed disk [<disk>,list,q,?] datadg02
• The -k switch forces VxVM to take the disk media name of the failed disk
and assign it to the new disk
• For example, if the failed disk datadg01 in the datadg disk group was
removed, and you want to add the new device c1t1d0s2 as the replacement
disk
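As a sketch of the -k form described above (verify device naming on your system before running):

```shell
# Reuse the failed disk's media name (datadg01) for the replacement device
vxdg -g datadg -k adddisk datadg01=c1t1d0s2

# Then start and recover redundancy on the affected volumes
vxrecover -g datadg -s
```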
• Note: Exercise caution when using the -k option to vxdg.
Attaching the wrong disk with the -k option can cause
unpredictable results in VxVM
The vxunreloc Utility
• VxVM also provides a utility that unrelocates a disk, that is, moves
relocated subdisks back to their original disk. After hot relocation moves
subdisks from a failed disk to other disks, you can return the relocated
subdisks to their original disk locations after the original disk is repaired or
replaced.
• Unrelocation is performed using the vxunreloc utility,
which restores the system to the same configuration that
existed before a disk failure caused subdisks to be
relocated.
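A hedged usage sketch of vxunreloc (the disk media names are examples):

```shell
# Move all hot-relocated subdisks back to their original disk
/etc/vx/bin/vxunreloc -g datadg datadg01

# The -n option moves them to a different destination disk instead:
# /etc/vx/bin/vxunreloc -g datadg -n datadg05 datadg01
```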
Viewing Relocated Subdisks
• The vxreattach utility reattaches disks to a disk group and retains the
same media name. This command attempts to find the name of the drive in
the private region and to match it to a disk media record that is missing a
disk access record. This operation may be necessary if a disk has a transient
failure, for example, if a drive is turned off and then back on, or if the
Volume Manager starts with some disk drivers unloaded and unloadable.
• vxreattach tries to find a disk in the same disk group with the same
disk ID for the disks to be reattached. The reattach operation may fail even
after finding the disk with the matching disk ID if the original cause (or
some other cause) for the disk failure still exists.
• /etc/vx/bin/vxreattach [-bcr] [dm_name]
• Precautionary Tasks
• This command creates object definitions for a restored volume out of the object definitions in the lost volume.
• To implement the object definitions of the restored volume into a real volume:
[Diagram: mirrored (RAID 1) layout – volume ABCDEF with two plexes, each holding a complete copy of blocks A–F on its own subdisks]
RAID Level 1
RAID Level 1
RAID 1 – Advantage & Disadvantage
Mirrored Stripe - RAID 0+1
[Diagram: mirrored stripe (RAID 0+1) – blocks A–F striped across two subdisks in each of two mirrored plexes; the layered-volume variant (striped mirror) builds each stripe column from a mirrored layered volume]
The concept behind RAID is relatively simple. The fundamental premise is to be able to recover data online in the event of a disk failure by using a form of redundancy called parity. In its simplest form, parity is an addition of all the drives used in an array. Recovery from a drive failure is achieved by reading the remaining good data and checking it against parity data stored by the array. Parity is used by RAID levels 2, 3, 4, and 5. RAID 1 does not use parity because all data is completely duplicated (mirrored).
Example: if the stored parity (sum) is 10 and the surviving data is 7, the missing data X is recovered as:
7 + X = 10  →  X = 10 - 7  →  X = 3 (missing data recovered)
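The same idea in code: real RAID parity uses bitwise XOR rather than arithmetic addition, but recovery works the same way. A small shell sketch with arbitrary toy values:

```shell
# Three data blocks (toy values) and their XOR parity
d1=7; d2=4; d3=9
parity=$(( d1 ^ d2 ^ d3 ))

# If d2 is lost, XOR the surviving blocks with the parity to recover it
recovered=$(( d1 ^ d3 ^ parity ))
echo "$recovered"   # prints 4
```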
Striping with Parity - Raid 5
Block-level data striping with distributed parity
Volume
RAID Level 5
RAID 5 – Advantage & Disadvantage