LINSTOR User Guide
Table of Contents
Please Read This First
LINSTOR
1.3. Packages
1.4. Installation
1.5. Upgrading
1.6. Containers
2.14. Logging
2.15. Monitoring
3.6. Snapshots
5.2. Upgrades
6.4. Configuration
This guide assumes, throughout, that you are using the latest version of LINSTOR
and related tools.
• Basic administrative tasks / Setup covers LINSTOR's basic functionality and gives you an
insight into common administrative tasks. You can also use this chapter as a step-by-step
guide to deploy LINSTOR in its basic setup.
• Further LINSTOR tasks provides a variety of advanced and important LINSTOR tasks and
configurations for setting up LINSTOR in more complex ways.
• The chapters LINSTOR Volumes in Kubernetes, LINSTOR Volumes in Proxmox VE, LINSTOR
Volumes in OpenNebula, LINSTOR Volumes in OpenStack, and LINSTOR Volumes in Docker illustrate
how to implement LINSTOR-based storage with Kubernetes, Proxmox VE, OpenNebula, OpenStack
and Docker by using LINSTOR's API.
Chapter 1. Basic administrative tasks / Setup
LINSTOR is a configuration management system for storage on Linux systems. It manages LVM
logical volumes and/or ZFS ZVOLs on a cluster of nodes. It leverages DRBD for replication between
different nodes and to provide block storage devices to users and applications. It manages
snapshots, encryption and caching of HDD backed data in SSDs via bcache.
linstor-controller
A LINSTOR setup requires at least one active controller and one or more satellites.
The linstor-controller relies on a database that holds all configuration information for the whole
cluster. It makes all decisions that need to have a view of the whole cluster. Multiple controllers can
be used for LINSTOR but only one can be active.
linstor-satellite
The linstor-satellite runs on each node where LINSTOR consumes local storage or provides storage
to services. It is stateless; it receives all the information it needs from the controller. It runs
programs like lvcreate and drbdadm. It acts like a node agent.
linstor-client
The linstor-client is a command line utility that you use to issue commands to the system and to
investigate the status of the system.
1.1.2. Objects
Objects are the end results that LINSTOR presents to the end user or application, such as
Kubernetes/OpenShift, a replicated block device (DRBD), an NVMe-oF target, and so on.
Node
A node is a server or container that participates in a LINSTOR cluster. The Node attribute defines:
NetInterface
As the name implies, this is how you define the interface/address of a node’s network interface.
Definitions
Definitions define attributes of an object; they can be thought of as a profile or template. Objects
created will inherit the configuration defined in the definition. A definition must be created prior
to creating the associated object. For example, you must create a ResourceDefinition prior to
creating the Resource.
StoragePoolDefinition
• Defines the name of a storage pool
ResourceDefinition
Resource definitions define the following attributes of a resource:
• The TCP port for DRBD to use for the resource’s connection
VolumeDefinition
Volume definitions define the following:
• The minor number to use for the DRBD device associated with the DRBD volume
StoragePool
• The storage back-end driver to use for the storage pool on the cluster node (LVM, ZFS, etc)
Resource
LINSTOR has now expanded its capabilities to manage a broader set of storage technologies
beyond just DRBD. A Resource:
Volume
Volumes are a subset of a Resource. A Resource can have multiple volumes; for example, you may
wish to have your database stored on slower storage than your logs in your MySQL cluster. By
keeping the volumes under a single resource you are essentially creating a consistency group. The
Volume attribute can also define attributes on a more granular level.
The southbound drivers used by LINSTOR are LVM, thinLVM and ZFS.
1.3. Packages
LINSTOR is packaged in both .rpm and .deb variants:
1. linstor-client contains the command line client program. It depends on Python, which is usually
already installed. On RHEL 8 systems you will need to symlink python.
2. linstor-controller and linstor-satellite: both contain systemd unit files for the services. They
depend on a Java Runtime Environment (JRE) version 1.8 (headless) or higher.
For further detail about these packages see the Installable Components section above.
If you have a support subscription to LINBIT, you will have access to our certified
binaries via our repositories.
1.4. Installation
If you want to use LINSTOR in containers, skip this topic and use the "Containers"
section below for the installation.
If you want to have the option of creating replicated storage using DRBD, you will need to install
drbd-dkms and drbd-utils. These packages need to be installed on all nodes. You will also need
to choose a volume manager, either ZFS or LVM; in this instance we are using LVM.
Whether your node is a LINSTOR controller, satellite, or both (Combined) determines which
packages are required on that node. For Combined nodes, we need both the controller and
satellite LINSTOR packages.
Combined node:
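A minimal sketch of the package installation for a Combined node, assuming an Ubuntu/Debian-based system with LINBIT's repositories already configured:
# apt install linstor-controller linstor-satellite linstor-client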
That will make our remaining nodes our Satellites, so we’ll need to install the following packages on
them:
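A hedged example for the Satellite nodes, again assuming apt-based package management:
# apt install linstor-satellite linstor-client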
On SLES, DRBD is normally installed via the software installation component of YaST2. It comes
bundled with the High Availability package selection.
While downloading DRBD's newest module you can check whether the LVM tools are up to date as well.
Users who prefer a command line install may simply issue the following command to get the newest
DRBD and LVM versions:
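A sketch of that command, assuming the SLES High Availability repositories are enabled:
# zypper install drbd lvm2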
Whether your node is a LINSTOR controller, satellite, or both (Combined) determines which
packages are required on that node. For Combined nodes, we need both the controller and
satellite LINSTOR packages.
Combined node:
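A hedged sketch for a Combined node on SLES:
# zypper install linstor-controller linstor-satellite linstor-client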
That will make our remaining nodes our Satellites, so we’ll need to install the following packages on
them:
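And for the Satellite nodes:
# zypper install linstor-satellite linstor-client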
1.4.3. CentOS
CentOS has had DRBD 8 since release 5. For DRBD 9 you'll need to look at EPEL and similar sources.
Alternatively, if you have an active support contract with LINBIT you can utilize our RHEL 8
repositories. DRBD can be installed using yum. We can also check for the newest version of the LVM
tools.
LINSTOR requires DRBD 9 if you wish to have replicated storage. This requires an
external repository to be configured, either LINBIT's or a third party's.
# yum install drbd kmod-drbd lvm2
Whether your node is a LINSTOR controller, satellite, or both (Combined) determines which
packages are required on that node. For Combined nodes, we need both the controller and
satellite LINSTOR packages.
On RHEL 8 systems you will need to install python2 for the linstor-client to work.
Combined node:
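A hedged sketch for a Combined node on CentOS/RHEL:
# yum install linstor-controller linstor-satellite linstor-client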
That will make our remaining nodes our Satellites, so we’ll need to install the following packages on
them:
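And for the Satellite nodes:
# yum install linstor-satellite linstor-client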
1.5. Upgrading
LINSTOR doesn't support rolling upgrades; the controller and satellites must have the same version,
otherwise the controller will discard the satellite with a VERSION_MISMATCH. This isn't a problem,
as the satellite won't perform any actions as long as it isn't connected to a controller, and DRBD will
not be disrupted by any means.
If you are using the embedded default H2 database and the linstor-controller package is upgraded,
an automatic backup file of the database will be created in the default /var/lib/linstor directory.
This file is a good restore point if a linstor-controller database migration should fail for any reason.
In that case it is recommended to report the error to LINBIT, restore the old database file and
downgrade to your previous controller version.
If you use an external database or etcd, it is recommended to do a manual backup of your
current database to have a restore point.
First upgrade the linstor-controller and linstor-client packages on your controller host and restart
the linstor-controller. The controller should start and all of its satellites should show
OFFLINE(VERSION_MISMATCH). After that you can continue upgrading linstor-satellite on all satellite
nodes and restart them; after a short reconnection time they should all show ONLINE again and your
upgrade is finished.
1.6. Containers
LINSTOR is also available as containers. The base images are available in LINBIT’s container
registry, drbd.io.
In order to access the images, you first have to log in to the registry (reach out to sales@linbit.com
for credentials):
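A minimal sketch of the registry login:
# docker login drbd.io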
• drbd.io/drbd9-rhel8
• drbd.io/drbd9-rhel7
• drbd.io/drbd9-sles15sp1
• drbd.io/drbd9-bionic
• drbd.io/drbd9-focal
• drbd.io/linstor-csi
• drbd.io/linstor-controller
• drbd.io/linstor-satellite
• drbd.io/linstor-client
An up to date list of available images with versions can be retrieved by opening http://drbd.io in
your browser. Make sure to access the host via "http", as the registry’s images themselves are
served via "https".
To load the kernel module, needed only for LINSTOR satellites, you’ll need to run a drbd9-$dist
container in privileged mode. The kernel module containers either retrieve an official LINBIT
package from a customer repository, use shipped packages, or they try to build the kernel modules
from source. If you intend to build from source, you need to have the according kernel headers
(e.g., kernel-devel) installed on the host. There are 4 ways to execute such a module load container:
Example using a module shipped with the container, which is enabled by not bind-mounting
/usr/src:
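A hedged sketch of such a module load container run; the image name and bind mounts are illustrative and may need to be adapted to your distribution:
# docker run -it --rm --privileged -v /lib/modules:/lib/modules drbd.io/drbd9-rhel7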
Example using a hash and a distribution (rarely used):
For now (i.e., pre DRBD 9 version "9.0.17"), you must use the containerized DRBD
kernel module, as opposed to loading a kernel module onto the host system. If you
intend to use the containers you should not install the DRBD kernel module on
your host systems. For DRBD version 9.0.17 or greater, you can install the kernel
module as usual on the host system, but you need to make sure to load the module
with the usermode_helper=disabled parameter (e.g., modprobe drbd
usermode_helper=disabled).
To run the LINSTOR controller container as a daemon, mapping ports 3370, 3376 and 3377 on the
host to the container:
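A sketch of that invocation, assuming the container should keep running in the background:
# docker run -d --name=linstor-controller -p 3370:3370 -p 3376:3376 -p 3377:3377 drbd.io/linstor-controller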
To interact with the containerized LINSTOR cluster, you can either use a LINSTOR client installed
on a system via packages, or via the containerized LINSTOR client. To use the LINSTOR client
container:
# docker run -it --rm -e LS_CONTROLLERS=<controller-host-IP-address> drbd.io/linstor-client node list
From this point you would use the LINSTOR client to initialize your cluster and begin creating
resources using the typical LINSTOR patterns.
Start and enable the linstor-controller service on the host where it has been installed:
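For example, with systemd:
# systemctl enable --now linstor-controller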
If you are sure the linstor-controller service gets automatically enabled on installation you can use
the following command as well:
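A minimal sketch of that alternative:
# systemctl start linstor-controller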
# linstor node list
You can use the linstor command on any other machine, but then you need to tell the client how to
find the linstor-controller. As shown, this can be specified as a command line option, an
environment variable, or in a global file:
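Hedged examples of the first two variants, assuming a controller host named alice:
# linstor --controllers=alice node list
# LS_CONTROLLERS=alice linstor node list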
Alternatively you can create the /etc/linstor/linstor-client.conf file and populate it like below.
[global]
controllers=alice
If you have multiple linstor-controllers configured you can simply specify them all in a comma-
separated list. The linstor-client will simply try them in the order listed.
The linstor-client commands can also be used in a much faster and more convenient
way by only writing the starting letters of the parameters, e.g.: linstor node list →
linstor n l
If the IP is omitted, the client will try to resolve the given node name as a host name by itself.
LINSTOR will automatically detect the node's local uname -n, which is later used for the DRBD
resource.
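A minimal sketch of adding a satellite node; the node name and address are placeholders for your environment:
# linstor node create bravo 10.43.70.2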
When you use linstor node list you will see that the new node is marked as offline. Now start and
enable the linstor-satellite on that node so that the service comes up on reboot as well:
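For example, with systemd:
# systemctl enable --now linstor-satellite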
You can also use systemctl start linstor-satellite if you are sure that the service is already
enabled as default and comes up on reboot.
About 10 seconds later you will see the status in linstor node list becoming online. Of course the
satellite process may be started before the controller knows about the existence of the satellite
node.
In case the node which hosts your controller should also contribute storage to the
LINSTOR cluster, you have to add it as a node and start the linstor-satellite as well.
If you want to have other services wait until the linstor-satellite had a chance to create the
necessary devices (i.e. after a boot), you can update the corresponding .service file and change
Type=simple to Type=notify.
This will cause the satellite to delay sending the READY=1 message to systemd until the controller
connects, sends all required data to the satellite and the satellite at least tried once to get the
devices up and running.
On each host contributing storage, you need to create either an LVM VG or a ZFS zPool. The VGs and
zPools identified with one LINSTOR storage pool name may have different VG or zPool names on
the hosts, but do yourself a favor and use the same VG or zPool name on all nodes.
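A hedged sketch, assuming an LVM volume group named vg_ssd backed by /dev/sdb on a node named alpha; repeat for every storage node:
# vgcreate vg_ssd /dev/sdb
# linstor storage-pool create lvm alpha pool_ssd vg_ssd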
The storage pool name and common metadata is referred to as a storage pool
definition. The listed commands create a storage pool definition implicitly. You can
see that by using linstor storage-pool-definition list. Creating storage pool
definitions explicitly is possible but not necessary.
# linstor sp l
Should the deletion of the storage pool be prevented due to attached resources or snapshots with
some of its volumes in another still functional storage pool, hints will be given in the 'status' column
of the corresponding list-command (e.g. linstor resource list). After deleting the LINSTOR-objects
in the lost storage pool manually, the lost-command can be executed again to ensure a complete
deletion of the storage pool and its remaining objects.
In clusters where you have only one kind of storage and the capability to hot-repair storage devices,
you may choose a model where you create one storage pool per physical backing device. The
advantage of this model is to confine failure domains to a single storage device.
Since linstor-server 1.5.2 and a recent linstor-client, LINSTOR can create LVM/ZFS pools on a
satellite for you. The linstor-client has the following commands to list possible disks and create
storage pools. Such LVM/ZFS pools are not managed by LINSTOR and there is no delete
command, so such actions must be done manually on the nodes.
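The listing command, sketched here under the assumption of a recent linstor-client:
# linstor physical-storage list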
This will give you a list of available disks grouped by size and rotational type (SSD/magnetic disk).
• The device is a root device (not having children) e.g.: /dev/vda, /dev/sda
• The device does not have any file-system or other blkid marker (wipefs -a might be needed)
With the create-device-pool command you can create an LVM pool on a disk and also directly add it
as a storage-pool in LINSTOR.
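A hedged example; node name, device and pool names are placeholders:
# linstor physical-storage create-device-pool --pool-name lv_my_pool LVMTHIN node_a /dev/vdc --storage-pool newpool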
If the --storage-pool option was provided, LINSTOR will create a storage-pool with the given name.
For more options and exact command usage please check the linstor-client help.
In simpler terms, resource groups are like templates that define characteristics of resources created
from them. Changes to these pseudo templates will be applied to all resources that were created
from the resource group, retroactively.
Using resource groups to define how you’d like your resources provisioned should
be considered the de facto method for deploying volumes provisioned by LINSTOR.
Chapters that follow which describe creating each resource from a resource-
definition and volume-definition should only be used in special scenarios.
Even if you choose not to create and use resource-groups in your LINSTOR cluster,
all resources created from resource-definitions and volume-definitions will exist in
the 'DfltRscGrp' resource-group.
A simple pattern for deploying resources using resource groups would look like this:
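A sketch of that pattern, assuming a storage pool named pool_ssd already exists:
# linstor resource-group create my_ssd_group --storage-pool pool_ssd --place-count 2
# linstor volume-group create my_ssd_group
# linstor resource-group spawn-resources my_ssd_group my_ssd_res 20G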
The commands above would result in a resource named 'my_ssd_res' with a 20GB volume
replicated twice being automatically provisioned from nodes who participate in the storage pool
named 'pool_ssd'.
A more useful pattern could be to create a resource group with settings you’ve determined are
optimal for your use case. Perhaps you have to run nightly online verifications of your volumes'
consistency, in that case, you could create a resource group with the 'verify-alg' of your choice
already set so that resources spawned from the group are pre-configured with 'verify-alg' set:
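A hedged sketch of such a pre-configured group and a loop spawning twenty resources from it:
# linstor resource-group create my_verify_group --storage-pool pool_ssd --place-count 2
# linstor resource-group drbd-options --verify-alg crc32c my_verify_group
# linstor volume-group create my_verify_group
# for i in {00..19}; do linstor resource-group spawn-resources my_verify_group res$i 10GiB; done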
The commands above result in twenty 10GiB resources being created each with the 'crc32c' 'verify-
alg' pre-configured.
You can tune the settings of individual resources or volumes spawned from resource groups by
setting options on the respective resource-definition or volume-definition. For example, if 'res11'
from the example above is used by a very active database receiving lots of small random writes,
you might want to increase the 'al-extents' for that specific resource:
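For example (the value is illustrative):
# linstor resource-definition drbd-options --al-extents 6007 res11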
If you configure a setting in a resource-definition that is already configured on the resource-group it
was spawned from, the value set in the resource-definition will override the value set on the parent
resource-group. For example, if the same 'res11' was required to use the slower but more secure
'sha256' hash algorithm in its verifications, setting the 'verify-alg' on the resource-definition for
'res11' would override the value set on the resource-group:
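For example:
# linstor resource-definition drbd-options --verify-alg sha256 res11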
A rule of thumb for the hierarchy in which settings are inherited is the value
"closer" to the resource or volume wins: volume-definition settings take precedence
over volume-group settings, and resource-definition settings take precedence over
resource-group settings.
• Thick LVM
• Thick ZFS
• Thin ZFS
If you want to change the size of the volume-definition you can simply do that by:
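A hedged example, resizing volume 0 of the resource definition backups:
# linstor volume-definition set-size backups 0 1G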
The parameter 0 is the number of the volume in the resource backups. You have to provide this
parameter because resources can have multiple volumes, identified by a so-called
volume number. This number can be found by listing the volume-definitions.
So far we have only created objects in LINSTOR’s database, not a single LV was created on the
storage nodes. Now you have the choice of delegating the task of placement to LINSTOR or doing it
yourself.
With the resource create command you may assign a resource definition to named nodes explicitly.
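For example, deploying the resource backups to three explicitly named nodes from the storage pool pool_hdd (node and pool names are illustrative):
# linstor resource create alpha backups --storage-pool pool_hdd
# linstor resource create bravo backups --storage-pool pool_hdd
# linstor resource create charlie backups --storage-pool pool_hdd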
1.13.2. Autoplace
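A hedged sketch of an autoplace call that deploys three replicas from the pool pool_hdd:
# linstor resource create backups --auto-place 3 --storage-pool pool_hdd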
The value after autoplace tells LINSTOR how many replicas you want to have. The storage-pool
option should be obvious.
Maybe not so obvious is that you may omit the --storage-pool option, then LINSTOR may select a
storage pool on its own. The selection follows these rules:
• Ignore all nodes and storage pools the current user has no access to
The remaining storage pools will be rated by different strategies. LINSTOR currently has the
following strategies:
• MaxFreeSpace: This strategy maps the rating 1:1 to the remaining free space of the storage pool.
However, this strategy only considers the actually allocated space (in the case of a thinly provisioned
storage pool this might grow with time without creating new resources).
• MinReservedSpace: Unlike "MaxFreeSpace", this strategy considers the reserved space. That
is the space that a thin volume can grow to before reaching its limit. The sum of reserved space
might exceed the storage pool's capacity, which is overprovisioning.
• MinRscCount: Simply the count of resources already deployed in a given storage pool
• MaxThroughput: For this strategy, the storage pool’s Autoplacer/MaxThroughput property is the base
of the score, or 0 if the property is not present. Every Volume deployed in the given storage pool
will subtract its defined sys/fs/blkio_throttle_read and sys/fs/blkio_throttle_write property-
value from the storage pool’s max throughput. The resulting score might be negative.
The scores of the strategies will be normalized, weighted and summed up, where the scores of
minimizing strategies will be converted first to allow an overall maximization of the resulting
score.
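The weights are set as controller-level properties. A hedged sketch, assuming the property path Autoplacer/Weights/<StrategyName> (treat the exact key as an assumption and verify it with linstor controller set-property --help):
# linstor controller set-property Autoplacer/Weights/MaxFreeSpace 1
# linstor controller set-property Autoplacer/Weights/MinReservedSpace 2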
where the strategy names are those listed above and the weight can be an arbitrary decimal.
To keep the behaviour of the autoplacer similar to the old one (for compatibility
reasons), all strategies have a default weight of 0, except MaxFreeSpace,
which has a weight of 1.
Neither a 0 nor a negative score will prevent a storage pool from getting selected;
it just makes it be considered later.
Finally LINSTOR tries to find the best matching group of storage pools meeting all requirements.
This step also considers other autoplacement restrictions as --replicas-on-same, --replicas-on
-different and others.
If everything went right the DRBD-resource has now been created by LINSTOR.
This can be checked by looking for the DRBD block device with the lsblk command
which should look like drbd0000 or similar.
Now we should be able to mount the block device of our resource and start using LINSTOR.
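A hedged example of creating a filesystem and mounting it; the DRBD device name /dev/drbd1000 is illustrative and should be taken from lsblk or the resource list output:
# mkfs.ext4 /dev/drbd1000
# mount /dev/drbd1000 /mnt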
Chapter 2. Further LINSTOR tasks
2.1. DRBD clients
By using the --drbd-diskless option instead of --storage-pool you can have a permanently diskless
DRBD device on a node. This means that the resource will appear as a block device and can be
mounted to the filesystem without an existing storage device. The data of the resource is accessed
over the network from other nodes that have the same resource.
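For example, creating a permanently diskless DRBD client of the resource backups on node bravo (names are illustrative):
# linstor resource create bravo backups --drbd-diskless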
This means that changes on different volumes of one resource are replicated in the same
chronological order on the other satellites.
Therefore you don't have to worry about the timing if you have interdependent data on different
volumes in a resource.
To deploy more than one volume in a LINSTOR resource you have to create two volume-definitions
for the same resource.
# linstor resource-definition create backups
# linstor volume-definition create backups 500G
# linstor volume-definition create backups 100G
# linstor volume-definition set-property backups 0 StorPoolName pool_hdd
# linstor volume-definition set-property backups 1 StorPoolName pool_ssd
# linstor resource create alpha backups
# linstor resource create bravo backups
# linstor resource create charlie backups
Since the volume-definition create command is used without the --vlmnr option
LINSTOR assigned the volume numbers starting at 0. In the following two lines the
0 and 1 refer to these automatically assigned volume numbers.
Here the 'resource create' commands do not need a --storage-pool option. In this case LINSTOR
uses a 'fallback' storage pool. Finding that storage pool, LINSTOR queries the properties of the
following objects in the following order:
• Volume definition
• Resource
• Resource definition
• Node
If none of those objects contain a StorPoolName property, the controller falls back to a hard-coded
'DfltStorPool' string as a storage pool.
This also means that if you forgot to define a storage pool prior to deploying a resource, you will get
an error message that LINSTOR could not find the storage pool named 'DfltStorPool'.
Currently LINSTOR supports the creation of LVM and ZFS volumes with the option of layering some
combinations of LUKS, DRBD, and/or NVMe-oF/NVMe-TCP on top of those volumes.
For example, assume we have a Thin LVM backed storage pool defined in our LINSTOR cluster
named, thin-lvm:
# linstor --no-utf8 storage-pool list
+--------------------------------------------------------------+
| StoragePool | Node | Driver | PoolName | ... |
|--------------------------------------------------------------|
| thin-lvm | linstor-a | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm | linstor-b | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm | linstor-c | LVM_THIN | drbdpool/thinpool | ... |
| thin-lvm | linstor-d | LVM_THIN | drbdpool/thinpool | ... |
+--------------------------------------------------------------+
We could use LINSTOR to create a Thin LVM on linstor-d that’s 100GiB in size using the following
commands:
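A hedged sketch of those commands; the resource name rsc1 is a placeholder:
# linstor resource-definition create rsc1
# linstor volume-definition create rsc1 100GiB
# linstor resource create linstor-d rsc1 --storage-pool thin-lvm --layer-list storage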
You should then see you have a new Thin LVM on linstor-d. You can extract the device path from
LINSTOR by listing your linstor resources with the --machine-readable flag set:
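For example:
# linstor --machine-readable resource list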
If you wanted to layer DRBD on top of this volume, which is the default --layer-list option in
LINSTOR for ZFS or LVM backed volumes, you would use the following resource creation pattern
instead:
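For example:
# linstor resource create linstor-d rsc1 --storage-pool thin-lvm --layer-list drbd,storage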
You would then see that you have a new Thin LVM backing a DRBD volume on linstor-d:
The following table shows which layer can be followed by which child-layer:
+-----------------------+
| Layer   | Child layer |
|-----------------------|
| LUKS    | STORAGE     |
| STORAGE | -           |
+-----------------------+
For information about the prerequisites for the luks layer, refer to the Encrypted
Volumes section of this User’s Guide.
NVMe-oF/NVMe-TCP allows LINSTOR to connect diskless resources to a node with the same
resource where the data is stored over NVMe fabrics. This has the advantage that resources can
be mounted without using local storage, by accessing the data over the network. LINSTOR is not
using DRBD in this case, and therefore NVMe resources provisioned by LINSTOR are not replicated;
the data is stored on one node.
To use NVMe-oF/NVMe-TCP with LINSTOR the package nvme-cli needs to be installed on every Node
which acts as a Satellite and will use NVMe-oF/NVMe-TCP for a resource:
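For example, on Ubuntu:
# apt install nvme-cli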
If you are not using Ubuntu use the suitable command for installing packages on
your OS - SLES: zypper - CentOS: yum
To make a resource which uses NVMe-oF/NVMe-TCP an additional parameter has to be given as you
create the resource-definition:
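A hedged sketch of such a resource-definition:
# linstor resource-definition create nvmedata -l nvme,storage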
By default the -l (layer-stack) parameter is set to drbd,storage when DRBD is used.
If you want to create LINSTOR resources with neither NVMe nor DRBD you have to
set the -l parameter to only storage.
# linstor volume-definition create nvmedata 500G
Before you create the resource on your nodes you have to know where the data will be stored
locally and which node accesses it over the network.
First we create the resource on the node where our data will be stored:
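For example, assuming the data should live on node alpha in the pool pool_ssd:
# linstor resource create alpha nvmedata --storage-pool pool_ssd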
On the nodes where the resource-data will be accessed over the network, the resource has to be
defined as diskless:
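A hedged sketch; the --nvme-initiator flag is an assumption, so check linstor resource create --help for the exact option on your client version:
# linstor resource create beta nvmedata --nvme-initiator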
Now you can mount the resource nvmedata on one of your nodes.
If your nodes have more than one NIC you should force the route between them
for NVMe-oF/NVMe-TCP, otherwise multiple NICs could cause trouble.
Since version 1.5.0 the additional Layer openflex can be used in LINSTOR. From LINSTOR’s
perspective, the OpenFlex Composable Infrastructure takes the role of a combined layer acting as a
storage layer (like LVM) and also providing the allocated space as an NVMe target. OpenFlex has a
REST API which is also used by LINSTOR to operate with.
As OpenFlex combines concepts of LINSTOR's storage as well as its NVMe layer, LINSTOR was extended
with both a new storage driver for the storage pools and a dedicated openflex layer which uses the
mentioned REST API.
In order for LINSTOR to communicate with the OpenFlex-API, LINSTOR needs some additional
properties, which can be set once on controller level to take LINSTOR-cluster wide effect:
• StorDriver/Openflex/ApiPort this property is glued with a colon to the previous to form the basic
http://ip:port part used by the REST calls
Once that is configured, we can now create LINSTOR objects to represent the OpenFlex
architecture. The theoretical mapping of LINSTOR objects to OpenFlex objects are as follows:
Obviously an OpenFlex storage pool is represented by a LINSTOR storage pool. As the next thing
above a LINSTOR storage pool is already the node, a LINSTOR node represents an OpenFlex storage
device. The OpenFlex objects above storage device are not mapped by LINSTOR.
When using NVMe, LINSTOR was designed to run on both sides, the NVMe target as well as on the
NVMe initiator side. In the case of OpenFlex, LINSTOR cannot (or even should not) run on the NVMe
target side as that is completely managed by OpenFlex. As LINSTOR still needs nodes and storage
pools to represent the OpenFlex counterparts, the LINSTOR client was extended with special node
create commands since 1.0.14. These commands not only accept additionally needed configuration
data, but also start a "special satellite" besides the already running controller instance. These special
satellites are completely LINSTOR managed; they will shut down when the controller shuts down
and will be started again when the controller starts. The new client command for creating a
"special satellite" representing an OpenFlex storage device is:
• ofNode1 is the node name which is also used by the standard linstor node create command
• 192.168.166.7 is the address on which the provided NVMe devices can be accessed. As the NVMe
devices are accessed by a dedicated network interface, this address differs from the address
specified with the property StorDriver/Openflex/ApiHost. The latter is used for the management
/ REST API.
The last step of the configuration is the creation of LINSTOR storage pools:
• ofNode1 and sp0 are the node name and storage pool name, respectively, just as usual for the
LINSTORs create storage pool command
• The last 0 is the identifier of the OpenFlex storage pool within the previously defined storage
device
Once all necessary storage pools are created in LINSTOR, the next steps are similar to the usage of
an NVMe resource with LINSTOR. Here is a complete example:
# set the properties once
linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185
linstor controller set-property StorDriver/Openflex/ApiPort 80
linstor controller set-property StorDriver/Openflex/UserName myusername
linstor controller set-property StorDriver/Openflex/UserPassword mypassword
# create a storage pool for openflex storage pool "0" within storage device "000af795789d"
linstor storage-pool create openflex ofNode1 sp0 0
In case a node should access the OpenFlex REST API through a different host than
specified with
linstor controller set-property StorDriver/Openflex/ApiHost 10.43.7.185 you can
always use LINSTOR’s inheritance mechanism for properties. That means simply
define the same property on the node-level you need it, i.e.
linstor node set-property ofNode1 StorDriver/Openflex/ApiHost 10.43.8.185
A DM-Writecache device is composed of two devices: one storage device and one cache device.
LINSTOR can set up such a writecache device, but needs some additional information, like the
storage pool and the size of the cache device.
# linstor storage-pool create lvm node1 lvmpool drbdpool
# linstor storage-pool create lvm node1 pmempool pmempool
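A hedged sketch of a resource using the WRITECACHE layer; Writecache/PoolName is mentioned in the text below, while the second property name, Writecache/Size, and the resource name and sizes are assumptions:
# linstor resource-definition create r1
# linstor volume-definition create r1 100G
# linstor volume-definition set-property r1 0 Writecache/PoolName pmempool
# linstor volume-definition set-property r1 0 Writecache/Size 1%
# linstor resource create node1 r1 --storage-pool lvmpool --layer-list DRBD,WRITECACHE,STORAGE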
The two properties set in the examples are mandatory, but can also be set on controller level which
would act as a default for all resources with WRITECACHE in their --layer-list. However, please note
that the Writecache/PoolName refers to the corresponding node. If the node does not have a storage-
pool named pmempool you will get an error message.
The 4 mandatory parameters required by DM-Writecache are either configured via property or
figured out by LINSTOR. The optional properties listed in the mentioned link can also be set via
property. Please see linstor controller set-property --help for a list of Writecache/* property-keys.
LINSTOR can also set up a DM-Cache device, which is very similar to the DM-Writecache from the
previous section. The major difference is that a cache device is composed of three devices: one
storage device, one cache device and one meta device. The LINSTOR properties are quite similar to
those of the writecache but are located in the Cache namespace:
Please see linstor controller set-property --help for a list of Cache/* property-keys and default
values for omitted properties.
Using --layer-list DRBD,CACHE,STORAGE while having DRBD configured to use external metadata,
only the backing device will use a cache, not the device holding the external metadata.
• StorDriver/ZfscreateOptions: The value of this property is appended to every zfs create … call
LINSTOR executes.
• StorDriver/dm_stats: If set to true LINSTOR calls dmstats create $device after creation and
dmstats delete $device --allregions after deletion of a volume. Currently only enabled for LVM
and LVM_THIN storage providers.
When a satellite node is created a first netif gets created implicitly with the name
default. Using the --interface-name option of the node create command you can
give it a different name.
NICs are identified by the IP address only, the name is arbitrary and is not related to the interface
name used by Linux. The NICs can be assigned to storage pools so that whenever a resource is
created in such a storage pool, the DRBD traffic will be routed through the specified NIC.
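A hedged sketch of adding a second interface and preferring it for a storage pool; the interface name and address are illustrative, and the PrefNic property key is an assumption to be checked against your client's help output:
# linstor node interface create alpha data_nic 192.168.55.10
# linstor storage-pool set-property alpha pool_ssd PrefNic data_nic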
FIXME describe how to route the controller <-> client communication through a specific netif.
2.6. Encrypted volumes
LINSTOR can handle transparent encryption of DRBD volumes. dm-crypt is used to encrypt the
provided storage from the storage device.
In order to use dm-crypt please make sure cryptsetup is installed before you
start the satellite.
1. Disable user security on the controller (this will be obsolete once authentication works)
3. Add luks to the layer-list. Note that all plugins (e.g., Proxmox) require a DRBD layer as the top
most layer if they do not explicitly state otherwise.
Disabling the user security on the LINSTOR controller is a one-time operation and is afterwards
persisted.
Before LINSTOR can encrypt any volume a master passphrase needs to be created. This can be done
with the linstor-client.
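For example:
# linstor encryption create-passphrase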
crypt-create-passphrase will wait for the user to input the initial master passphrase (as all other
crypt commands will with no arguments).
If you ever want to change the master passphrase this can be done with:
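For example:
# linstor encryption modify-passphrase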
The luks layer can be added when creating the resource-definition or the resource itself; the
former method is recommended since it will be automatically applied to all resources created
from that resource-definition.
To enter the master passphrase (after controller restart) use the following command:
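For example:
# linstor encryption enter-passphrase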
Whenever the linstor-controller is restarted, the user has to send the master
passphrase to the controller, otherwise LINSTOR is unable to reopen or create
encrypted volumes.
It is possible to automate the process of creating and re-entering the master passphrase.
[encrypt]
passphrase="example"
If either one of these is set, then every time the controller starts it will check whether a master
passphrase already exists. If there is none, it will create a new master passphrase as specified.
Otherwise, the controller enters the passphrase.
In case the master passphrase is set in both an environment variable and the
linstor.toml, only the master passphrase from the linstor.toml will be used.
to group and sort the output in multiple dimensions.
Assuming a resource definition named 'resource1' which has been placed on some nodes, a
snapshot can be created as follows:
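For example (the snapshot name snap1 is illustrative):
# linstor snapshot create resource1 snap1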
This will create snapshots on all nodes where the resource is present. LINSTOR will ensure that
consistent snapshots are taken even when the resource is in active use.
The following steps restore a snapshot to a new resource. This is possible even when the original
resource has been removed from the nodes where the snapshots were taken.
First define the new resource with volumes matching those from the snapshot:
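A hedged sketch, restoring into a new resource definition named resource2:
# linstor resource-definition create resource2
# linstor snapshot volume-definition restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2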
At this point, additional configuration can be applied if necessary. Then, when ready, create
resources based on the snapshots:
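For example:
# linstor snapshot resource restore --from-resource resource1 --from-snapshot snap1 --to-resource resource2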
This will place the new resource on all nodes where the snapshot is present. The nodes on which to
place the resource can also be selected explicitly; see the help (linstor snapshot resource restore
-h).
LINSTOR can roll a resource back to a snapshot state. The resource must not be in use. That is, it
may not be mounted on any nodes. If the resource is in use, consider whether you can achieve your
goal by restoring the snapshot instead.
A resource can only be rolled back to the most recent snapshot. To roll back to an older snapshot,
first delete the intermediate snapshots.
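A hedged example of a rollback:
# linstor snapshot rollback resource1 snap1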
Both the source and the target node have to have the resource for snapshot shipping
deployed. Additionally, the target resource has to be deactivated.
A deactivated resource with DRBD in its layer-list can NOT be reactivated again.
However, a successfully shipped snapshot of a DRBD resource can still be restored
into a new resource.
To allow incremental snapshot-shipping, LINSTOR has to keep at least the last shipped snapshot on
the target node. The property SnapshotShipping/Keep can be used to specify how many snapshots
LINSTOR should keep. If the property is not set (or set to <= 0) LINSTOR will keep the last 10 shipped
snapshots by default.
For instance, it is easy to set the DRBD protocol for a resource named backups:
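For example:
# linstor resource-definition drbd-options --protocol C backups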
In order to move a resource between nodes without reducing redundancy at any point, LINSTOR’s
disk migrate feature can be used. First create a diskless resource on the target node, and then add a
disk using the --migrate-from option. This will wait until the data has been synced to the new disk
and then remove the source disk.
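A hedged sketch, migrating the resource backups from node alpha to node bravo via the pool pool_hdd:
# linstor resource create bravo backups --drbd-diskless
# linstor resource toggle-disk bravo backups --storage-pool pool_hdd --migrate-from alpha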
Suppose our cluster consists of nodes 'alpha' and 'bravo' in a local network and 'charlie' at a remote
site, with a resource definition named backups deployed to each of the nodes. Then DRBD Proxy can
be enabled for the connections to 'charlie' as follows:
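For example:
# linstor drbd-proxy enable alpha charlie backups
# linstor drbd-proxy enable bravo charlie backups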
The DRBD Proxy configuration can be tailored with commands such as:
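For instance, a hedged example setting the proxy memory limit (the value is illustrative):
# linstor drbd-proxy options backups --memlimit 100000000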
LINSTOR does not automatically optimize the DRBD configuration for long-distance replication, so
you will probably want to set some configuration options such as the protocol:
LINSTOR can also be configured to automatically enable the above mentioned Proxy connection
between two nodes. For this automation, LINSTOR first needs to know on which site each node is.
As the Site property might also be used for other site-based decisions in future features, the
DrbdProxy/AutoEnable also has to be set to true:
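A hedged sketch, using the Site and DrbdProxy/AutoEnable properties mentioned above:
# linstor node set-property alpha Site A
# linstor node set-property bravo Site A
# linstor node set-property charlie Site B
# linstor controller set-property DrbdProxy/AutoEnable true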
This property can also be set on node, resource-definition, resource and resource-connection level
(from left to right in increasing priority, where the controller is the left-most, i.e. least prioritized,
level).
Once these initialization steps are completed, every newly created resource will automatically check
whether it has to enable DRBD Proxy to any of its peer resources.
To use an external database there are a few additional steps to configure. You have to create a
DB/Schema and user to use for linstor, and configure this in the /etc/linstor/linstor.toml.
2.12.1. Postgresql
[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:postgresql://localhost/linstor"
2.12.2. MariaDB/Mysql
[db]
user = "linstor"
password = "linstor"
connection_url = "jdbc:mariadb://localhost/LINSTOR?createDatabaseIfNotExist=true"
2.12.3. ETCD
ETCD is a distributed key-value store that makes it easy to keep your LINSTOR database distributed
in a HA-setup. The ETCD driver is already included in the LINSTOR-controller package and only
needs to be configured in the linstor.toml.
More information on how to install and configure ETCD can be found here: ETCD docs
[db]
## only set user/password if you want to use authentication, only since LINSTOR 1.2.1
# user = "linstor"
# password = "linstor"
## for etcd
## do not set user field if no authentication required
connection_url = "etcd://etcdhost1:2379,etcdhost2:2379,etcdhost3:2379"
## if you want to use client TLS authentication too, only since LINSTOR 1.2.1
# client_key_pkcs8_pem = "client-key.pkcs8"
## set client_key_password if private key has a password
# client_key_password = "mysecret"
[http]
enabled = true
port = 3370
listen_addr = "127.0.0.1" # to disable remote access
If you want to use the REST API, the current documentation can be found at the following link:
https://app.swaggerhub.com/apis-docs/Linstor/Linstor/
The HTTP REST API can also be run secured by HTTPS, which is highly recommended if you use any
features that require authorization. To do so you have to create a Java keystore file with a valid
certificate that will be used to encrypt all HTTPS traffic.
Here is a simple example on how you can create a self signed certificate with the keytool that is
included in the java runtime:
keytool -keyalg rsa -keysize 2048 -genkey -keystore ./keystore_linstor.jks\
-alias linstor_controller\
-dname "CN=localhost, OU=SecureUnit, O=ExampleOrg, L=Vienna, ST=Austria, C=AT"
keytool will ask for a password to secure the generated keystore file; this password is needed for
the LINSTOR controller configuration. In your linstor.toml file you have to add the following section:
[https]
keystore = "/path/to/keystore_linstor.jks"
keystore_password = "linstor"
Now (re)start the linstor-controller and the HTTPS REST-API should be available on port 3371.
More information on how to import other certificates can be found here: https://docs.oracle.com/
javase/8/docs/technotes/tools/unix/keytool.html
When HTTPS is enabled, all requests to the HTTP /v1/ REST-API will be redirected
to the HTTPS endpoint.
Client access can be restricted by using a SSL truststore on the Controller. Basically you create a
certificate for your client and add it to your truststore and the client then uses this certificate for
authentication.
keytool -importkeystore\
-srcstorepass linstor -deststorepass linstor -keypass linstor\
-srckeystore client.jks -destkeystore trustore_client.jks
[https]
keystore = "/path/to/keystore_linstor.jks"
keystore_password = "linstor"
truststore = "/path/to/trustore_client.jks"
truststore_password = "linstor"
Now restart the Controller and it will no longer be possible to access the controller API without a
correct certificate.
The LINSTOR client needs the certificate in PEM format, so before we can use it we have to convert
the java keystore certificate to the PEM format.
# Convert to pkcs12
keytool -importkeystore -srckeystore client.jks -destkeystore client.p12\
-storepass linstor -keypass linstor\
-srcalias client1 -srcstoretype jks -deststoretype pkcs12
To avoid entering the PEM file password all the time, it might be convenient to remove the
password.
The --certfile parameter can also be added to the client configuration file, see Using the LINSTOR
client for more details.
2.14. Logging
LINSTOR uses SLF4J with Logback as the binding. This gives LINSTOR the possibility to distinguish
between the log levels ERROR, WARN, INFO, DEBUG and TRACE (in order of increasing verbosity). In the
current LINSTOR version (1.1.2) the user has the following four methods to control the logging level,
ordered by priority (first has highest priority):
Command ==> SetTrcMode MODE(enabled)
SetTrcMode Set TRACE level logging mode
New TRACE level logging mode: ENABLED
2. When starting the controller or satellite a command line argument can be passed:
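A hedged sketch; both the binary path and the --log-level argument are assumptions that should be checked against your installation:
# /usr/share/linstor-server/bin/Controller --log-level TRACE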
[logging]
level="INFO"
<?xml version="1.0" encoding="UTF-8"?>
<configuration scan="false" scanPeriod="60 seconds">
<!--
Values for scanPeriod can be specified in units of milliseconds, seconds, minutes
or hours
https://logback.qos.ch/manual/configuration.html
-->
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<!-- encoders are assigned the type
ch.qos.logback.classic.encoder.PatternLayoutEncoder by default -->
<encoder>
<pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
</encoder>
</appender>
<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${log.directory}/linstor-${log.module}.log</file>
<append>true</append>
<encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
<Pattern>%d{yyyy_MM_dd HH:mm:ss.SSS} [%thread] %-5level %logger -
%msg%n</Pattern>
</encoder>
<rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
<FileNamePattern>logs/linstor-${log.module}.%i.log.zip</FileNamePattern>
<MinIndex>1</MinIndex>
<MaxIndex>10</MaxIndex>
</rollingPolicy>
<triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
<MaxFileSize>2MB</MaxFileSize>
</triggeringPolicy>
</appender>
<logger name="LINSTOR/Controller" level="INFO" additivity="false">
<appender-ref ref="STDOUT" />
<!-- <appender-ref ref="FILE" /> -->
</logger>
<logger name="LINSTOR/Satellite" level="INFO" additivity="false">
<appender-ref ref="STDOUT" />
<!-- <appender-ref ref="FILE" /> -->
</logger>
<root level="WARN">
<appender-ref ref="STDOUT" />
<!-- <appender-ref ref="FILE" /> -->
</root>
</configuration>
When none of the configuration methods above is used Linstor will default to INFO log level.
2.15. Monitoring
Since LINSTOR 1.8.0, a Prometheus /metrics HTTP path is provided with LINSTOR and JVM specific
exports.
The /metrics path also supports 3 GET arguments to reduce LINSTOR’s reported data:
• resource
• storage_pools
• error_reports
These all default to true. To disable, for example, the error-report data: http://localhost:3070/metrics?
error_reports=false
The LINSTOR-Controller also provides a /health HTTP path that will simply return HTTP-Status 200
if the controller can access its database and all services are up and running. Otherwise it will return
HTTP error status code 500 Internal Server Error.
Node alpha is just the controller. Node bravo and node charlie are just satellites.
Here are the commands to generate such a keystore setup; values should of course be edited for
your environment.
keytool -keyalg rsa -keysize 2048 -genkey -keystore charlie/keystore.jks\
-storepass linstor -keypass linstor\
-alias charlie\
-dname "CN=Max Mustermann, OU=charlie, O=Example, L=Vienna, ST=Austria, C=AT"
keytool -importkeystore\
-srcstorepass linstor -deststorepass linstor -keypass linstor\
-srckeystore charlie/keystore.jks -destkeystore alpha/certificates.jks
keytool -importkeystore\
-srcstorepass linstor -deststorepass linstor -keypass linstor\
-srckeystore alpha/keystore.jks -destkeystore charlie/certificates.jks
echo '[netcom]
type="ssl"
port=3367
server_certificate="ssl/keystore.jks"
trusted_certificates="ssl/certificates.jks"
key_password="linstor"
keystore_password="linstor"
truststore_password="linstor"
ssl_protocol="TLSv1.2"
' | ssh root@charlie "cat > /etc/linstor/linstor_satellite.toml"
Now just start controller and satellites and add the nodes with --communication-type SSL.
LINSTOR automatically configures quorum policies on resources when quorum is achievable. This
means, whenever you have at least two diskful and one or more diskless resource assignments, or
three or more diskful resource assignments, LINSTOR will enable quorum policies for your
resources automatically.
Inversely, LINSTOR will automatically disable quorum policies whenever there are less than the
minimum required resource assignments to achieve quorum.
This is controlled via the DrbdOptions/auto-quorum property, which can be applied to the linstor-
controller, resource-group, and resource-definition. Accepted values for the DrbdOptions/auto-quorum
property are disabled, suspend-io, and io-error.
Setting the DrbdOptions/auto-quorum property to disabled will allow you to manually, or more
granularly, control the quorum policies of your resources should you so desire.
The default policies for DrbdOptions/auto-quorum are quorum majority, and on-no-
quorum io-error. For more information on DRBD’s quorum features and their
behavior, please refer to the quorum section of the DRBD user’s guide.
For example, to manually set the quorum policies of a resource-group named my_ssd_group, you
would use the following commands:
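A hedged example, mirroring the disable example shown below but setting explicit policies instead:
# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum majority
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum io-error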
You may wish to disable DRBD’s quorum features completely. To do that, you would need to first
disable DrbdOptions/auto-quorum on the appropriate LINSTOR object, and then set the DRBD quorum
features accordingly. For example, use the following commands to disable quorum entirely on the
my_ssd_group resource-group:
# linstor resource-group set-property my_ssd_group DrbdOptions/auto-quorum disabled
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/quorum off
# linstor resource-group set-property my_ssd_group DrbdOptions/Resource/on-no-quorum
2.17.2. Auto-Evict
If a satellite is offline for a prolonged period of time, LINSTOR can be configured to declare that
node as evicted. This triggers an automated reassignment of the affected DRBD-resources to other
nodes to ensure a minimum replica count is kept.
• DrbdOptions/AutoEvictAfterTime describes how long a node can be offline in minutes before the
eviction is triggered. You can set this property on the controller to change a global default, or on
a single node to give it a different behavior. The default value for this property is 60 minutes.
After the linstor-controller loses the connection to a satellite, aside from trying to reconnect, it starts
a timer for that satellite. As soon as that timer exceeds DrbdOptions/AutoEvictAfterTime and all of
the DRBD-connections to the DRBD-resources on that satellite are broken, the controller will check
whether or not DrbdOptions/AutoEvictMaxDisconnectedNodes has been met. If it hasn’t, and
DrbdOptions/AutoEvictAllowEviction is true for the node in question, the satellite will be marked as
EVICTED. At the same time, the controller will check for every DRBD-resource whether the number
of resources is still above DrbdOptions/AutoEvictMinReplicaCount. If it is, the resource in question
will be marked as DELETED. If it isn’t, an auto-place with the settings from the corresponding
resource-group will be started. Should the auto-place fail, the controller will try again later when
changes that might allow a different result, such as adding a new node, have happened. Resources
where an auto-place is necessary will only be marked as DELETED if the corresponding auto-place
was successful.
The evicted satellite itself will not be able to reestablish connection with the controller. Even if the
node is up and running, a manual reconnect will fail. It is also not possible to delete the satellite,
even if it is working as it should be. Should you wish to get rid of an evicted node, you need to use
the node lost command. The satellite can, however, be restored. This will remove the EVICTED-flag
from the satellite and allow you to use it again. Previously configured network interfaces, storage
pools, properties and similar entities as well as non-DRBD-related resources and resources that
could not be autoplaced somewhere else will still be on the satellite. To restore a satellite, use
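A hedged example of restoring an evicted satellite:
# linstor node restore <node-name>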
If a LINSTOR volume is composed of multiple "stacked" volumes (for example, DRBD with external
metadata will have 3 devices: backing (storage) device, metadata device and the resulting DRBD
device), setting a sys/fs/\* property for a Volume means that only the bottom-most local "data" device
will receive the corresponding /sys/fs/cgroup/… setting. That means, in the case of the example above,
only the backing device will receive the setting. In case a resource-definition has an nvme-target as
well as an nvme-initiator resource, the bottom-most device of each node will receive the setting. In
the case of the target the bottom-most device will be the LVM or ZFS volume, whereas in the case of
the initiator the bottom-most device will be the connected nvme device, regardless of which other
layers are stacked on top of that.
A quick way to list available commands on the command line is to type linstor.
# linstor node list -h
# linstor help node list
Using the 'help' subcommand is especially helpful when LINSTOR is executed in interactive mode
(linstor interactive).
One of the most helpful features of LINSTOR is its rich tab-completion, which can be used to
complete basically every object LINSTOR knows about (e.g., node names, IP addresses, resource
names, …). In the following examples, we show some possible completions, and their results:
# linstor node create alpha 1<tab> # completes the IP address if hostname can be resolved
# linstor resource create b<tab> c<tab> # linstor assign-resource backups charlie
If tab-completion does not work out of the box, please try to source the appropriate file:
# source /etc/bash_completion.d/linstor # or
# source /usr/share/bash_completion/completions/linstor
For zsh shell users linstor-client can generate a zsh completion file that has basic support for
command and argument completion.
2.19.2. SOS-Report
If something goes wrong and you need help finding the cause of the issue, you can use
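For example:
# linstor sos-report create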
The command above will create a new sos-report in /var/log/linstor/controller/ on the controller
node. Alternatively you can use
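For example:
# linstor sos-report download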
which will create a new sos-report and additionally downloads that report to the local machine into
your current working directory.
This sos-report contains logs and useful debug information from several sources (LINSTOR logs,
dmesg, versions of external tools used by LINSTOR, ip a, database dump and many more). This
information is stored for each node in plain text in the resulting .tar.gz file.
2.19.3. From the community
For help from the community please subscribe to our mailing list located here:
https://lists.linbit.com/listinfo/drbd-user
2.19.4. GitHub
To file a bug or feature request please check out our GitHub page https://github.com/linbit
Alternatively, if you wish to purchase remote installation services, 24/7 support, access to certified
repositories, or feature development please contact us: +1-877-454-6248 (1-877-4LINBIT) ,
International: +43-1-8178292-0 | sales@linbit.com
Chapter 3. LINSTOR Volumes in Kubernetes
This chapter describes the usage of LINSTOR in Kubernetes as managed by the operator and with
volumes provisioned using the LINSTOR CSI plugin.
LINBIT provides a LINSTOR operator to commercial support customers. The operator eases
deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods,
and other related functions.
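The registry credentials are provided as a pull secret. A hedged sketch, assuming the default secret name drbdiocred:
kubectl create secret docker-registry drbdiocred --docker-server=drbd.io --docker-username=<YOUR_LOGIN> --docker-password=<YOUR_PASSWORD>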
The name of this secret must match the one specified in the Helm values, by default drbdiocred.
• Configure storage for the LINSTOR etcd instance. There are various options for configuring the
etcd instance for LINSTOR:
◦ Disable persistence for basic testing. This can be done by adding --set
etcd.persistentVolume.enabled=false to the helm install command below.
• Read the storage guide and configure a basic storage setup for LINSTOR
• Select the appropriate kernel module injector using --set with the helm install command in the
final step.
◦ Choose the injector according to the distribution you are using. Select the latest version from
one of drbd9-rhel7, drbd9-rhel8,… from http://drbd.io/ as appropriate. The drbd9-rhel8 image
should also be used for RHCOS (OpenShift). For the SUSE CaaS Platform use the SLES injector
that matches the base system of the CaaS Platform you are using (e.g., drbd9-sles15sp1). For
example:
operator.satelliteSet.kernelModuleInjectionImage=drbd.io/drbd9-rhel8:v9.0.24
◦ Only inject modules that are already present on the host machine. If a module is not found,
it will be skipped.
operator.satelliteSet.kernelModuleInjectionMode=DepsOnly
◦ Disable kernel module injection if you are installing DRBD by other means. Deprecated by
DepsOnly
operator.satelliteSet.kernelModuleInjectionMode=None
• Finally create a Helm deployment named linstor-op that will set up everything.
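A hedged sketch of that deployment, assuming LINBIT's chart repository at https://charts.linstor.io:
helm repo add linstor https://charts.linstor.io
helm install linstor-op linstor/linstor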
You can use the pv-hostpath Helm templates to create hostPath persistent volumes. Create as many
PVs as needed to satisfy your configured etcd replicas (default 1).
Create the hostPath persistent volumes, substituting cluster node names accordingly in the nodes=
option:
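A minimal sketch, assuming the pv-hostpath chart is available from the same chart repository; the
release name and node names are placeholders:
helm install linstor-etcd-pv linstor/pv-hostpath --set "nodes={node-a,node-b,node-c}"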
LINSTOR can connect to an existing PostgreSQL, MariaDB or etcd database. For instance, for a
PostgreSQL instance with the following configuration:
POSTGRES_DB: postgresdb
POSTGRES_USER: postgresadmin
POSTGRES_PASSWORD: admin123
The Helm chart can be configured to use this database instead of deploying an etcd cluster by
adding the following to the Helm install command:
--set etcd.enabled=false --set "operator.controller.dbConnectionURL=jdbc:postgresql://postgres/postgresdb?user=postgresadmin&password=admin123"
The LINSTOR operator can automate some basic storage set up for LINSTOR.
The LINSTOR operator can be used to create LINSTOR storage pools. Creation is under control of the
LinstorSatelliteSet resource:
At install time
operator:
satelliteSet:
storagePools:
lvmPools:
- name: lvm-thick
volumeGroup: drbdpool
helm install -f <file> linstor linstor/linstor
After install
On a cluster with the operator already configured (i.e. after helm install), you can edit the
LinstorSatelliteSet configuration like this:
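A minimal sketch, assuming the custom resource is registered under the linstor.linbit.com API
group shown later in this guide; the resource name depends on your release:
kubectl edit linstorsatellitesets.linstor.linbit.com <your-resource-name>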
The storage pool configuration can be updated like in the example above.
By default, LINSTOR expects the referenced VolumeGroups, ThinPools and so on to be present. You
can use the devicePaths: [] option to let LINSTOR automatically prepare devices for the pool.
Eligible for automatic configuration are block devices that:
To enable automatic configuration of devices, set the devicePaths key on storagePools entries:
storagePools:
lvmPools:
- name: lvm-thick
volumeGroup: drbdpool
devicePaths:
- /dev/vdb
lvmThinPools:
- name: lvm-thin
thinVolume: thinpool
volumeGroup: linstor_thinpool
devicePaths:
- /dev/vdc
- /dev/vdd
Currently, this method supports creation of LVM and LVMTHIN storage pools.
lvmPools configuration
• devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized.
Optional
• raidLevel LVM raid level. Optional
• vdoLogicalSizeKib Size of the created VG (expected to be bigger than the backing devices by
using VDO). Optional
More information on VDO: https://www.redhat.com/en/blog/look-vdo-new-linux-compression-layer
lvmThinPools configuration
• volumeGroup VG to use for the thin pool. If you want to use devicePaths, you must set this to "".
This is required because LINSTOR does not allow configuration of the VG name when preparing
devices.
• devicePaths devices to configure for this pool. Must be empty and >= 1GiB to be recognized.
Optional
The volume group created by LINSTOR for LVMTHIN pools will always follow the
scheme "linstor_$THINPOOL".
zfsPools configuration
• zPool name of the zpool to use. Must already be present on all machines. Required
3.2.3. Securing deployment
This section describes the different options for enabling security features available when using this
operator. The following guides assume the operator is installed using Helm.
The secret can then be passed to the controller by passing the following argument to helm install
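A sketch of the argument, based on the operator.controller.dbCertSecret value shown in the Helm
chart options further below; the secret name is a placeholder:
--set operator.controller.dbCertSecret=<secret name>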
If you want to use TLS certificates to authenticate with an etcd database, you need to set the
following option on helm install:
--set operator.controller.dbUseClientCert=true
If this option is active, the secret specified in the above section must contain two additional keys:
• client.cert PEM formatted certificate presented to etcd for authentication
• client.key private key in PKCS8 format, matching the above client certificate.
Keys can be converted into PKCS8 format using openssl:
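A minimal sketch of the conversion; the input and output file names are placeholders:
openssl pkcs8 -topk8 -nocrypt -in client-key.pem -out client.key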
The default communication between LINSTOR components is not secured by TLS. If this is needed
for your setup, follow these steps:
• Create private keys in the java keystore format, one for the controller, one for all satellites:
• Create a trust store with the public keys that each component needs to trust:
keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore
control-keys.jks -destkeystore satellite-trust.jks
keytool -importkeystore -srcstorepass linstor -deststorepass linstor -srckeystore
satellite-keys.jks -destkeystore control-trust.jks
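For the first step above (creating the private keys), a minimal keytool sketch using the keystore
names referenced in the trust-store commands; the aliases and distinguished names are placeholders:
keytool -keyalg rsa -keysize 2048 -genkeypair -keystore control-keys.jks -storepass linstor -keypass linstor -alias control -dname "CN=linstor-controller"
keytool -keyalg rsa -keysize 2048 -genkeypair -keystore satellite-keys.jks -storepass linstor -keypass linstor -alias satellite -dname "CN=linstor-satellite"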
• Create kubernetes secrets that can be passed to the controller and satellite pods
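A possible sketch of these secrets; the secret names and the key names used inside the secrets are
assumptions and have to match what your chart values expect:
kubectl create secret generic control-ssl --from-file=keystore.jks=control-keys.jks --from-file=certificates.jks=control-trust.jks
kubectl create secret generic satellite-ssl --from-file=keystore.jks=satellite-keys.jks --from-file=certificates.jks=satellite-trust.jks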
It is currently NOT possible to change the keystore password. LINSTOR expects the
passwords to be linstor. This is a current limitation of LINSTOR.
Various components need to talk to the LINSTOR controller via its REST interface. This interface can
be secured via HTTPS, which automatically includes authentication. For HTTPS+authentication to
work, each component needs access to:
• A private key
The next sections will guide you through creating all required components.
The clients need private keys and certificates in a different format, so we need to convert them
openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.cert -clcerts
-nokeys
openssl pkcs12 -in client.pkcs12 -passin pass:linstor -out client.key -nocerts -nodes
The alias specified for the controller key (i.e. -ext san=dns:linstor-op-
cs.default.svc) has to exactly match the service name created by the operator.
When using helm, this is always of the form <release-name>-cs.<release-
namespace>.svc.
It is currently NOT possible to change the keystore password. LINSTOR expects the
passwords to be linstor. This is a current limitation of LINSTOR.
For the controller to trust the clients, we can use the following command to create a truststore,
importing the client certificate
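A minimal sketch; the truststore file name and alias are placeholders:
keytool -importcert -keystore trustore_client.jks -storepass linstor -alias client -file client.cert -noprompt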
For the client, we have to convert the controller certificate into a different format
openssl pkcs12 -in controller.pkcs12 -passin pass:linstor -out ca.pem -clcerts -nokeys
Now you can create secrets for the controller and for clients:
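A possible sketch of the two secrets, to be referenced via the linstorHttpsControllerSecret and
linstorHttpsClientSecret Helm values shown later; all secret and key names here are assumptions:
kubectl create secret generic linstor-https-controller --from-file=keystore.jks=controller.jks --from-file=truststore.jks=trustore_client.jks
kubectl create secret generic linstor-https-client --from-file=ca.pem=ca.pem --from-file=client.cert=client.cert --from-file=client.key=client.key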
The names of the secrets can be passed to helm install to configure all clients to use https.
LINSTOR can be used to create encrypted volumes using LUKS. The passphrase used when creating
these volumes can be set via a secret:
kubectl create secret generic linstor-pass --from-literal=MASTER_PASSPHRASE=<password>
--set operator.controller.luksSecret=linstor-pass
To protect the storage infrastructure of the cluster from accidentally deleting vital components, it is
necessary to perform some manual steps before deleting a Helm deployment.
1. Delete all volume claims managed by LINSTOR components. You can use the following
command to get a list of volume claims managed by LINSTOR. After checking that none of the
listed volumes still hold needed data, you can delete them using the generated kubectl delete
command.
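One way to list candidate PVCs is by their storage class; a sketch, assuming you grep for the
name(s) of your LINSTOR storage classes:
kubectl get pvc --all-namespaces -o jsonpath='{range .items[*]}{.spec.storageClassName}{" "}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}' | grep <linstor-storage-class>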
After a short wait, the controller and satellite pods should terminate. If they continue to run,
you can check the above resources for errors (they are only removed after all associated pods
terminate)
If you removed all PVCs and all LINSTOR pods have terminated, you can uninstall the helm
deployment
helm uninstall linstor-op
Due to Helm’s current policy, the Custom Resource Definitions named
LinstorController and LinstorSatelliteSet will not be deleted by the command.
More information regarding Helm’s current position on CRDs can be found
here.
If you are already running a previous version (before v1.0.0) of linstor-operator, you need to
follow the upstream guide. We tried to stay compatible and lower the burden of upgrading as much
as possible, but some manual steps may still be required.
The helm charts provide a set of further customization options for advanced use cases.
global:
imagePullPolicy: IfNotPresent # empty pull policy means k8s default is used ("always" if tag == ":latest", "ifnotpresent" else) ①
# Dependency charts
etcd:
persistentVolume:
enabled: true
storage: 1Gi
replicas: 1 # How many instances of etcd will be added to the initial cluster. ②
resources: {} # resource requirements for etcd containers ③
image:
repository: gcr.io/etcd-development/etcd
tag: v3.4.9
csi-snapshotter:
enabled: true # <- enable to add k8s snapshotting CRDs and controller. Needed for CSI snapshotting
image: quay.io/k8scsi/snapshot-controller:v2.1.0
replicas: 1 ②
resources: {} # resource requirements for the cluster snapshot controller. ③
stork:
enabled: true
storkImage: docker.io/linbit/stork:latest
schedulerImage: gcr.io/google_containers/kube-scheduler-amd64
replicas: 1 ②
storkResources: {} # resources requirements for the stork plugin containers ③
schedulerResources: {} # resource requirements for the kube-scheduler containers ③
csi:
enabled: true
pluginImage: "drbd.io/linstor-csi:v0.9.1"
csiAttacherImage: quay.io/k8scsi/csi-attacher:v2.2.0
csiNodeDriverRegistrarImage: quay.io/k8scsi/csi-node-driver-registrar:v1.3.0
csiProvisionerImage: quay.io/k8scsi/csi-provisioner:v1.6.0
csiSnapshotterImage: quay.io/k8scsi/csi-snapshotter:v2.1.0
csiResizerImage: quay.io/k8scsi/csi-resizer:v0.5.0
controllerReplicas: 1 ②
nodeAffinity: {} ④
nodeTolerations: [] ④
controllerAffinity: {} ④
controllerTolerations: [] ④
resources: {} ③
priorityClassName: ""
drbdRepoCred: drbdiocred # <- Specify the kubernetes secret name here
linstorHttpsControllerSecret: "" # <- name of secret containing linstor server certificates+key.
linstorHttpsClientSecret: "" # <- name of secret containing linstor client certificates+key.
controllerEndpoint: "" # <- override to the generated controller endpoint. use if controller is not deployed via operator
operator:
replicas: 1 # <- number of replicas for the operator deployment ②
image: "drbd.io/linstor-operator:v1.0.0-rc1"
resources: {} ③
controller:
controllerImage: "drbd.io/linstor-controller:v1.7.3"
luksSecret: ""
dbCertSecret: ""
dbUseClientCert: false
sslSecret: ""
affinity: {} ④
tolerations: [] ④
resources: {} ③
satelliteSet:
satelliteImage: "drbd.io/linstor-satellite:v1.7.3"
storagePools: null
sslSecret: ""
automaticStorageType: None
affinity: {} ④
tolerations: [] ④
resources: {} ③
kernelModuleInjectionImage: "drbd.io/drbd9-rhel7:v9.0.24"
kernelModuleInjectionMode: ShippedModules
kernelModuleInjectionResources: {} ③
③ Set container resource requests and limits. See the kubernetes docs. Most containers need a
minimal amount of resources, except for:
• operator.satelliteSet.resources Around 700MiB memory is required
④ Affinity and toleration determine where pods are scheduled on the cluster. See the kubernetes
docs on affinity and toleration. This may be especially important for the operator.satelliteSet
and csi.node* values. To schedule a pod using a LINSTOR persistent volume, the node requires a
running LINSTOR satellite and LINSTOR CSI pod.
To create a High Availability deployment of all components, take a look at the upstream guide. The
default values are chosen so that scaling the components to multiple replicas ensures that the
replicas are placed on different nodes. This ensures that a single node failure will not interrupt the
service.
The operator can configure the satellites and CSI plugin to use an existing LINSTOR setup. This can
be useful in cases where the storage infrastructure is separate from the Kubernetes cluster.
Volumes can be provisioned in diskless mode on the Kubernetes nodes while the storage nodes will
provide the backing disk storage.
To skip the creation of a LINSTOR Controller deployment and configure the other components to
use your existing LINSTOR Controller, use the following options when running helm install:
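A sketch of such options, using the controllerEndpoint and etcd.enabled values from the chart
values listed above; depending on the chart version, an additional option may be needed to skip the
bundled controller deployment, so treat this as an illustration rather than a complete invocation:
--set controllerEndpoint=http://linstor-controller.example.com:3370 --set etcd.enabled=false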
After all pods are ready, you should see the Kubernetes cluster nodes as satellites in your LINSTOR
setup.
Your kubernetes nodes must be reachable using their IP by the controller and
storage nodes.
Create a storage class referencing an existing storage pool on your storage nodes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: linstor-on-k8s
provisioner: linstor.csi.linbit.com
parameters:
autoPlace: "3"
storagePool: existing-storage-pool
resourceGroup: linstor-on-k8s
You can provision new volumes by creating PVCs using your storage class. The volumes will first be
placed only on nodes with the given storage pool, i.e. your storage infrastructure. Once you want to
use the volume in a pod, LINSTOR CSI will create a diskless resource on the Kubernetes node and
attach over the network to the diskful resource.
The community supported edition of the LINSTOR deployment in Kubernetes is called Piraeus. The
Piraeus project provides an operator for deployment.
This should only be necessary for investigating problems and accessing advanced functionality.
Regular operation such as creating volumes should be achieved via the Kubernetes integration.
If you are integrating LINSTOR using a different method, you will need to install the LINSTOR CSI
plugin. Instructions for deploying the CSI plugin can be found on the project’s github. This will
result in a linstor-csi-controller Deployment and a linstor-csi-node DaemonSet running in the kube-
system namespace.
NAME                           READY   STATUS    RESTARTS   AGE     IP              NODE
linstor-csi-controller-ab789   5/5     Running   0          3h10m   191.168.1.200   kubelet-a
linstor-csi-node-4fcnn         2/2     Running   0          3h10m   192.168.1.202   kubelet-c
linstor-csi-node-f2dr7         2/2     Running   0          3h10m   192.168.1.203   kubelet-d
linstor-csi-node-j66bc         2/2     Running   0          3h10m   192.168.1.201   kubelet-b
linstor-csi-node-qb7fw         2/2     Running   0          3h10m   192.168.1.200   kubelet-a
linstor-csi-node-zr75z         2/2     Running   0          3h10m   192.168.1.204   kubelet-e
Configuring the behavior and properties of LINSTOR volumes deployed via Kubernetes is
accomplished via the use of StorageClasses.
Below is the simplest practical StorageClass that can be used to deploy volumes:
Listing 1. linstor-basic-sc.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
# The name used to identify this StorageClass.
name: linstor-basic-storage-class
# The name used to match this StorageClass with a provisioner.
# linstor.csi.linbit.com is the name that the LINSTOR CSI plugin uses to identify itself
provisioner: linstor.csi.linbit.com
parameters:
# LINSTOR will provision volumes from the drbdpool storage pool configured
# On the satellite nodes in the LINSTOR cluster specified in the plugin's deployment
storagePool: "drbdpool"
resourceGroup: "linstor-basic-storage-class"
# Setting a fstype is required for "fsGroup" permissions to work correctly.
# Currently supported: xfs/ext4
csi.storage.k8s.io/fstype: xfs
DRBD options can be set as well in the parameters section. Valid keys are defined in the LINSTOR
REST-API (e.g., DrbdOptions/Net/allow-two-primaries: "yes").
Now that our StorageClass is created, we can create a PersistentVolumeClaim which can be
used to provision volumes known both to Kubernetes and LINSTOR:
Listing 2. my-first-linstor-volume-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: my-first-linstor-volume
spec:
storageClassName: linstor-basic-storage-class
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
This will create a PersistentVolumeClaim known to Kubernetes, which will have a PersistentVolume
bound to it. Additionally, LINSTOR will now create this volume according to the configuration
defined in the linstor-basic-storage-class StorageClass. The LINSTOR volume’s name will be a
UUID prefixed with csi-. This volume can be observed with the usual linstor resource list. Once
that volume is created, we can attach it to a pod. The following Pod spec will spawn a Fedora
container with our volume attached that busy waits so it is not unscheduled before we can interact
with it:
Listing 3. my-first-linstor-volume-pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: fedora
namespace: default
spec:
containers:
- name: fedora
image: fedora
command: [/bin/bash]
args: ["-c", "while true; do sleep 10; done"]
volumeMounts:
- name: my-first-linstor-volume
mountPath: /data
ports:
- containerPort: 80
volumes:
- name: my-first-linstor-volume
persistentVolumeClaim:
claimName: "my-first-linstor-volume"
Running kubectl describe pod fedora can be used to confirm that Pod scheduling and volume
attachment succeeded.
To remove a volume, please ensure that no pod is using it and then delete the
PersistentVolumeClaim via kubectl. For example, to remove the volume that we just made, run the
following two commands, noting that the Pod must be unscheduled before the
PersistentVolumeClaim will be removed:
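A sketch of the two commands, using the Pod and PersistentVolumeClaim names from the listings
above:
kubectl delete pod fedora
kubectl delete pvc my-first-linstor-volume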
3.6. Snapshots
Creating snapshots and creating new volumes from snapshots is done via the use of
VolumeSnapshots, VolumeSnapshotClasses, and PVCs.
3.6.1. Adding snapshot support
LINSTOR supports the volume snapshot feature, which is currently in beta. To use it, you need to
install a cluster wide snapshot controller. This is done either by the cluster provider, or you can use
the LINSTOR chart.
By default, the LINSTOR chart will install its own snapshot controller. This can lead to conflicts in
some cases:
• the cluster does not meet the minimal version requirements (>= 1.17)
In such cases, installation of the chart’s snapshot controller can be disabled by adding the
following to the Helm install command:
--set csi-snapshotter.enabled=false
To use snapshots, you also need to define a VolumeSnapshotClass:
Listing 4. my-first-linstor-snapshot-class.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
name: my-first-linstor-snapshot-class
driver: linstor.csi.linbit.com
deletionPolicy: Delete
Now we will create a volume snapshot for the volume that we created above. This is done with a
VolumeSnapshot:
Listing 5. my-first-linstor-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
name: my-first-linstor-snapshot
spec:
volumeSnapshotClassName: my-first-linstor-snapshot-class
source:
persistentVolumeClaimName: my-first-linstor-volume
Create the VolumeSnapshot with kubectl:
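A sketch, using the manifest file name from Listing 5:
kubectl apply -f my-first-linstor-snapshot.yaml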
Finally, we’ll create a new volume from the snapshot with a PVC.
Listing 6. my-first-linstor-volume-from-snapshot.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-first-linstor-volume-from-snapshot
spec:
storageClassName: linstor-basic-storage-class
dataSource:
name: my-first-linstor-snapshot
kind: VolumeSnapshot
apiGroup: snapshot.storage.k8s.io
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Mi
By default, the CSI plugin will attach volumes directly if the Pod happens to be scheduled on a
kubelet where its underlying storage is present. However, Pod scheduling does not currently take
volume locality into account. The replicasOnSame parameter can be used to restrict where the
underlying storage may be provisioned, if locally attached volumes are desired.
The next Stork release will include the LINSTOR driver by default. In the meantime, you can use a
custom-built Stork container by LINBIT which includes a LINSTOR driver, available on Docker Hub.
By default, the operator will install the components required for Stork, and register a new
scheduler called stork with Kubernetes. This new scheduler can be used to place pods near to their
volumes.
apiVersion: v1
kind: Pod
metadata:
name: busybox
namespace: default
spec:
schedulerName: stork ①
containers:
- name: busybox
image: busybox
command: ["tail", "-f", "/dev/null"]
volumeMounts:
- name: my-first-linstor-volume
mountPath: /data
ports:
- containerPort: 80
volumes:
- name: my-first-linstor-volume
persistentVolumeClaim:
claimName: "test-volume"
Stork integration can be disabled by adding the following option to the Helm install command:
--set stork.enabled=false
3.9. Advanced Configuration
In general, all configuration for LINSTOR volumes in Kubernetes should be done via the
StorageClass parameters, as seen with the storagePool in the basic example above. We’ll give all the
available options an in-depth treatment here.
3.9.1. nodeList
nodeList is a list of nodes for volumes to be assigned to. This will assign the volume to each node
and it will be replicated among all of them. This can also be used to select a single node by
hostname, but it’s more flexible to use replicasOnSame to select a single node.
This option determines on which LINSTOR nodes the underlying storage for
volumes will be provisioned and is orthogonal from which kubelets these volumes
will be accessible.
3.9.2. autoPlace
autoPlace is an integer that determines the number of replicas a volume of this StorageClass will
have. For instance, autoPlace: "3" will produce volumes with three-way replication. If neither
autoPlace nor nodeList are set, volumes will be automatically placed on one node.
You have to use quotes, otherwise Kubernetes will complain about a malformed
StorageClass.
This option (and all options which affect autoplacement behavior) modifies the
number of LINSTOR nodes on which the underlying storage for volumes will be
provisioned and is orthogonal to which kubelets those volumes will be accessible
from.
3.9.3. replicasOnSame
replicasOnSame is a list of key or key=value items used as autoplacement selection labels when
autoplace is used to determine where to provision storage. These labels correspond to LINSTOR
node properties.
LINSTOR node properties are different from kubernetes node labels. You can see
the properties of a node by running linstor node list-properties <nodename>. You
can also set additional properties ("auxiliary properties"): linstor node set-
property <nodename> --aux <key> <value>.
Let’s explore this behavior with examples, assuming a LINSTOR cluster where node-a is
configured with the auxiliary properties zone=z1 and role=backups, while node-b is
configured with only zone=z1.
This guide assumes LINSTOR CSI version 0.10.0 or newer. All properties referenced
in replicasOnSame and replicasOnDifferent are interpreted as auxiliary properties.
If you are using an older version of LINSTOR CSI, you need to add the Aux/ prefix to
all property names. So replicasOnSame: "zone=z1" would be replicasOnSame:
"Aux/zone=z1". Using Aux/ manually will continue to work on newer LINSTOR CSI
versions.
If we configure a StorageClass with autoPlace: "1" and replicasOnSame: "zone=z1", then volumes
will be provisioned on either node-a or node-b as they both have the zone=z1 aux prop.
If we configure a StorageClass with autoPlace: "2" and replicasOnSame: "zone=z1", then volumes
will be provisioned on both node-a and node-b as they both have the zone=z1 aux prop.
You can also use a property key without providing a value to ensure all replicas are placed on
nodes with the same property value, without caring about the particular value. Assuming there are 4
nodes, node-a1 and node-a2 are configured with zone=a, and node-b1 and node-b2 are configured with
zone=b. Using autoPlace: "2" and replicasOnSame: "zone" will place replicas on either node-a1 and
node-a2, or on node-b1 and node-b2.
3.9.4. replicasOnDifferent
replicasOnDifferent takes a list of properties to consider, same as replicasOnSame. There are two
modes of using replicasOnDifferent:
• Avoid placing replicas on nodes with a specific property value:
If a value is given for the property, the nodes which have that property-value pair assigned will
be considered last, i.e., they are only used if there are not enough other nodes to fulfill the
autoplacement setting.
• Distribute volumes across nodes with different values for the same key:
If no property value is given, LINSTOR will place the volumes across nodes with different values
for that property if possible.
Example: Assuming there are 4 nodes, node-a1 and node-a2 are configured with zone=a. node-b1
and node-b2 are configured with zone=b. Using a StorageClass with autoPlace: "2" and
replicasOnDifferent: "zone", LINSTOR will create one replica on either node-a1 or node-a2 and
one replica on either node-b1 or node-b2.
3.9.5. localStoragePolicy
localStoragePolicy determines, via volume topology, to which LINSTOR satellites volumes should be
assigned and from where Kubernetes will access volumes. The behavior of each option is explained
below in detail.
ignore (default)
required
When localStoragePolicy is set to required, Kubernetes will report a list of places that it wants to
schedule a Pod in order of preference. The plugin will attempt to provision the volume(s) according
to that preference. The number of volumes to be provisioned in total is based on autoplace.
If all preferences have been attempted but no volumes were successfully assigned, volume
creation will fail.
In case of multiple replicas when all preferences have been attempted, and at least one has
succeeded, but there are still replicas remaining to be provisioned, autoplace behavior will apply
for the remaining volumes.
With this option set, Kubernetes will consider volumes that are not locally present on a kubelet to
be inaccessible from that kubelet.
preferred
When localStoragePolicy is set to preferred, volume placement behavior will be the same as when
it’s set to required with the exception that volume creation will not fail if no preference was able to
be satisfied. Volume accessibility will be the same as when set to ignore.
3.9.6. storagePool
storagePool is the name of the LINSTOR storage pool that will be used to provide storage to the
newly-created volumes.
Only nodes configured with this same storage pool will be considered for
autoplacement. Likewise, for StorageClasses using nodeList all nodes specified in
that list must have this storage pool configured on them.
3.9.7. disklessStoragePool
disklessStoragePool is an optional parameter that only affects LINSTOR volumes assigned disklessly
to kubelets, i.e., as clients. If you have a custom diskless storage pool defined in LINSTOR, you’ll
specify that here.
3.9.8. encryption
encryption is an optional parameter that determines whether to encrypt volumes. LINSTOR must be
configured for encryption for this to work properly.
3.9.9. filesystem
filesystem is an optional parameter to specify the filesystem for non-raw block volumes. Currently
supported options are xfs and ext4.
3.9.10. fsOpts
fsOpts is an optional parameter that passes options to the volume’s filesystem at creation time.
3.9.11. mountOpts
mountOpts is an optional parameter that passes options to the volume’s filesystem at mount time.
Before upgrading to a new release, you should ensure you have an up-to-date backup of the
LINSTOR database. If you are using the Etcd database packaged in the LINSTOR chart, see the
section on creating an Etcd backup below.
Upgrades using the LINSTOR Etcd deployment require Etcd to use persistent storage. Only follow
these steps if Etcd was deployed using etcd.persistentVolume.enabled=true.
The upgrade will update the following components:
• LINSTOR Controller
• LINSTOR Satellite
• Etcd
• Stork
Some versions require special steps; please take a look at the version-specific notes below. The
main command to upgrade to a new LINSTOR operator version is sketched below.
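A minimal sketch, assuming the release name linstor-op and the linstor/linstor chart used earlier
in this chapter:
helm repo update
helm upgrade linstor-op linstor/linstor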
If you used any customizations on the initial install, pass the same options to helm upgrade; for
example, an install command with extra --set options becomes the equivalent upgrade command
with the same options, as sketched below.
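A sketch of the pattern; the csi.controllerReplicas value is only an illustrative customization:
# initial install with a customization
helm install linstor-op linstor/linstor --set csi.controllerReplicas=2
# the matching upgrade keeps the same option
helm upgrade linstor-op linstor/linstor --set csi.controllerReplicas=2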
This triggers the rollout of new pods. After a short wait, all pods should be running and ready.
Check that no errors are listed in the status section of LinstorControllers, LinstorSatelliteSets and
LinstorCSIDrivers.
Upgrade to v1.2
LINSTOR operator v1.2 is supported on Kubernetes 1.17+. If you are using an older Kubernetes
distribution, you may need to change the default settings, for example the CSI provisioner
(https://kubernetes-csi.github.io/docs/external-provisioner.html).
There is a known issue when updating the CSI components: the pods will not be updated to the
newest image and the errors section of the LinstorCSIDrivers resource shows an error updating the
DaemonSet. In this case, manually delete deployment/linstor-op-csi-controller and
daemonset/linstor-op-csi-node. They will be re-created by the operator.
To create a backup of the Etcd database and store it on your control host, run:
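A sketch of the backup, assuming the Etcd pod created by the linstor-op release is named
linstor-op-etcd-0:
kubectl exec linstor-op-etcd-0 -- etcdctl snapshot save /tmp/save.db
kubectl cp linstor-op-etcd-0:/tmp/save.db ./save.db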
These commands will create a file save.db on the machine you are running kubectl from.
Chapter 4. LINSTOR Volumes in Openshift
This chapter describes the usage of LINSTOR in Openshift as managed by the operator and with
volumes provisioned using the LINSTOR CSI plugin.
Some of the value of Red Hat’s Openshift is that it includes its own registry of supported and
certified images and operators, in addition to a default and standard web console. This chapter
describes how to install the Certified LINSTOR operator via these tools.
LINBIT provides a certified LINSTOR operator via the RedHat marketplace. The operator eases
deployment of LINSTOR on Kubernetes by installing DRBD, managing Satellite and Controller pods,
and other related functions.
Unlike deployment via the helm chart, the certified Openshift operator does not deploy the needed
etcd cluster. You must deploy this yourself ahead of time. We do this via the etcd operator available
on operatorhub.io.
It is advised that the etcd deployment uses persistent storage of some type. Either
use an existing storage provisioner with a default StorageClass or simply use
hostPath volumes.
Read the storage guide and configure a basic storage setup for LINSTOR.
Once etcd and storage has been configured, we are now ready to install the LINSTOR operator. You
can find the LINSTOR operator via the left-hand control pane of Openshift Web Console. Expand the
"Operators" section and select "OperatorHub". From here you need to find the LINSTOR operator.
Either search for the term "LINSTOR" or filter only by "Marketplace" operators.
The LINSTOR operator can only watch for events and manage custom resources
that are within the same namespace it is deployed in (OwnNamespace). This
means the LINSTOR Controller, LINSTOR Satellites, and LINSTOR CSI Driver pods
all need to be deployed in the same namespace as the LINSTOR Operator pod.
Once you have located the LINSTOR operator in the Marketplace, click the "Install" button and
install it as you would any other operator.
At this point you should have just one pod, the operator pod, running.
Again, navigate to the left-hand control pane of the Openshift Web Console. Expand the "Operators"
section, but this time select "Installed Operators". Find the entry for the "Linstor Operator", then
select the "LinstorController" from the "Provided APIs" column on the right.
From here you should see a page that says "No Operands Found" and will feature a large button on
the right which says "Create LinstorController". Click the "Create LinstorController" button.
Here you will be presented with options to configure the LINSTOR Controller. Either via the web-
form view or the YAML View. Regardless of which view you select, make sure that the
dbConnectionURL matches the endpoint provided from your etcd deployment. Otherwise, the defaults
are usually fine for most purposes.
Lastly hit "Create", you should now see a linstor-controller pod running.
Next we need to deploy the Satellites Set. Just as before navigate to the left-hand control pane of the
Openshift Web Console. Expand the "Operators" section, but this time select "Installed Operators".
Find the entry for the "Linstor Operator", then select the "LinstorSatelliteSet" from the "Provided
APIs" column on the right.
From here you should see a page that says "No Operands Found" and will feature a large button on
the right which says "Create LinstorSatelliteSet". Click the "Create LinstorSatelliteSet" button.
Here you will be presented with the options to configure the LINSTOR Satellites. Either via the web-
form view or the YAML View. One of the first options you’ll notice is the automaticStorageType. If set
to "NONE" then you’ll need to remember to configure the storage pools yourself at a later step.
Another option you’ll notice is kernelModuleInjectionMode. I usually select "Compile" for portability's
sake, but selecting "ShippedModules" will be faster as it will install pre-compiled kernel modules on
all the worker nodes.
Make sure the controllerEndpoint matches what is available in the kubernetes endpoints. The
default is usually correct here.
apiVersion: linstor.linbit.com/v1
kind: LinstorSatelliteSet
metadata:
name: linstor
namespace: default
spec:
satelliteImage: ''
automaticStorageType: LVMTHIN
drbdRepoCred: ''
kernelModuleInjectionMode: Compile
controllerEndpoint: 'http://linstor:3370'
priorityClassName: ''
status:
errors: []
Lastly hit "Create", you should now see a linstor-node pod running on every worker node.
Last bit left is the CSI pods to bridge the layer between the CSI and LINSTOR. Just as before navigate
to the left-hand control pane of the Openshift Web Console. Expand the "Operators" section, but this
time select "Installed Operators". Find the entry for the "Linstor Operator", then select the
"LinstorCSIDriver" from the "Provided APIs" column on the right.
From here you should see a page that says "No Operands Found" and will feature a large button on
the right which says "Create LinstorCSIDriver". Click the "Create LinstorCSIDriver" button.
Again, you will be presented with the options. Make sure that the controllerEndpoint is correct.
Otherwise the defaults are fine for most use cases.
Lastly hit "Create". You will now see a single "linstor-csi-controller" pod, as well as a "linstor-csi-
node" pod on all worker nodes.
This should only be necessary for investigating problems and accessing advanced functionality.
Regular operation such as creating volumes should be achieved via the Kubernetes integration.
As such, please see the previous chapter’s section on Basic Configuration and Deployment.
Etcd can be deployed by using the Etcd Operator available in the OperatorHub.
To deploy STORK, you can use the single YAML deployment available at: https://charts.linstor.io/
deploy/stork.yaml Download the YAML and replace every instance of MY-STORK-NAMESPACE with your
desired namespace for STORK. You also need to replace MY-LINSTOR-URL with the URL of your
controller. This value depends on the name you chose when creating the LinstorController resource.
By default this would be http://linstor.<operator-namespace>.svc:3370
To apply the YAML to Openshift, either use oc apply -f <filename> from the command line or find
the "Import YAML" option in the top right of the Openshift Web Console.
Alternatively, you can deploy the LINSTOR Operator using Helm instead. Take a look at the
Kubernetes guide. Openshift requires changing some of the default values in our Helm chart.
If you chose to use Etcd with hostpath volumes for persistence (see here), you need to enable
selinux relabelling. To do this pass --set selinux=true to the pv-hostpath install command.
For the LINSTOR Operator chart itself, you should change the following values:
global:
setSecurityContext: false ①
csi-snapshotter:
enabled: false ②
stork:
schedulerTag: v1.18.6 ③
etcd:
podsecuritycontext:
supplementalGroups: [1000] ④
operator:
satelliteSet:
kernelModuleInjectionImage: drbd.io/drbd9-rhel8:v9.0.25 ⑤
③ Automatic detection of the Kubernetes Scheduler version fails in Openshift, so you need to set it
manually. Note: the tag does not have to match Openshift’s Kubernetes release.
④ If you choose to use Etcd deployed via Helm and use the pv-hostpath chart, Etcd needs to run as
member of group 1000 to access the persistent volume.
⑤ The RHEL8 kernel injector also supports RHCOS.
Other overrides, such as storage pool configuration, HA deployments and more, are available and
documented in the Kubernetes guide.
Chapter 5. LINSTOR Volumes in Proxmox VE
This chapter describes DRBD in Proxmox VE via the LINSTOR Proxmox Plugin.
'linstor-proxmox' is a Perl plugin for Proxmox that, in combination with LINSTOR, allows you to
replicate VM disks on several Proxmox VE nodes. This allows live-migrating active VMs within a
few seconds and with no downtime without needing a central SAN, as the data is already replicated
to multiple nodes.
5.2. Upgrades
If this is a fresh installation, skip this section and continue with Proxmox Plugin Installation.
Version 5 of the plugin drops compatibility with the legacy configuration options "storagepool" and
"redundancy". Version 5 requires a "resourcegroup" option, and obviously a LINSTOR resource
group. The old options should be removed from the config.
The DRBD9 kernel module is installed as a dkms package (i.e., drbd-dkms), therefore you’ll have to
install the pve-headers package before you set up/install the software packages from LINBIT’s
repositories. Following that order ensures that the kernel module will build properly for your
kernel. If you don’t plan to install the latest Proxmox kernel, you have to install kernel headers
matching your current running kernel (e.g., pve-headers-$(uname -r)). If you missed this step, you
can still rebuild the dkms package against your current kernel (kernel headers have to be
installed in advance) by issuing the apt install --reinstall drbd-dkms command.
LINBIT’s repository can be enabled as follows, where "$PVERS" should be set to your Proxmox VE
major version (e.g., "6", not "6.1"):
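A sketch of enabling the repository, assuming the packages.linbit.com Proxmox repository layout;
verify the exact repository line against your LINBIT subscription information:
# wget -O- https://packages.linbit.com/package-signing-pubkey.asc | apt-key add -
# PVERS=6 && echo "deb http://packages.linbit.com/proxmox/ proxmox-$PVERS drbd-9.0" > /etc/apt/sources.list.d/linbit.list
# apt update && apt install linstor-proxmox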
drbd: drbdstorage
content images,rootdir
controller 10.11.12.13
resourcegroup defaultpool
The "drbd" entry is fixed and you are not allowed to modify it, as it tells to Proxmox to use DRBD as
storage backend. The "drbdstorage" entry can be modified and is used as a friendly name that will
be shown in the PVE web GUI to locate the DRBD storage. The "content" entry is also fixed, so do not
change it. The redundancy (specified in the resource group) specifies how many replicas of the data
will be stored in the cluster. The recommendation is to set it to 2 or 3 depending on your setup. The
data is accessible from all nodes, even if some of them do not have local copies of the data. For
example, in a 5 node cluster, all nodes will be able to access 3 copies of the data, no matter where
they are stored. The "controller" parameter must be set to the IP of the node that runs the
LINSTOR controller service. Only one node can be set to run as LINSTOR controller at the same
time. If that node fails, start the LINSTOR controller on another node and change that value to its IP
address.
Recent versions of the plugin allow defining multiple different storage pools. Such a configuration
would look like this:
drbd: drbdstorage
content images,rootdir
controller 10.11.12.13
resourcegroup defaultpool
drbd: fastdrbd
content images,rootdir
controller 10.11.12.13
resourcegroup ssd
drbd: slowdrbd
content images,rootdir
controller 10.11.12.13
resourcegroup backup
By now, you should be able to create VMs via Proxmox’s web GUI by selecting "drbdstorage", or any
other of the defined pools as storage location.
Starting from version 5 of the plugin one can set the option "preferlocal yes". If it is set, the plugin
tries to create a diskful assignment on the node that issued the storage create command. With this
option one can make sure the VM gets local storage if possible. Without that option LINSTOR might
place the storage on nodes 'B' and 'C', while the VM is initially started on node 'A'. This would still
work as node 'A' then would get a diskless assignment, but having local storage might be preferred.
NOTE: DRBD supports only the raw disk format at the moment.
At this point you can try to live migrate the VM - as all data is accessible on all nodes (even on
Diskless nodes) - it will take just a few seconds. The overall process might take a bit longer if the VM
is under load and if there is a lot of RAM being dirtied all the time. But in any case, the downtime
should be minimal and you will see no interruption at all.
For the rest of this guide we assume that you installed LINSTOR and the Proxmox Plugin as
described in LINSTOR Configuration.
The basic idea is to execute the LINSTOR controller within a VM that is controlled by Proxmox and
its HA features, where the storage resides on DRBD managed by LINSTOR itself.
The first step is to allocate storage for the VM: Create a VM as usual and select "Do not use any
media" on the "OS" section. The hard disk should of course reside on DRBD (e.g., "drbdstorage"). 2GB
disk space should be enough, and for RAM we chose 1GB. These are the minimum requirements for
the appliance LINBIT provides to its customers (see below). If you wish to set up your own
controller VM, and you have enough hardware resources available, you can increase these
minimum values. In the following use case, we assume that the controller VM was created with ID
100, but it is fine if this VM was created at a later time and has a different ID.
LINBIT provides an appliance for its customers that can be used to populate the created storage. For
the appliance to work, we first create a "Serial Port". First click on "Hardware" and then on "Add"
and finally on "Serial Port":
If everything worked as expected the VM definition should then look like this:
The next step is to copy the VM appliance to the VM disk storage. This can be done with qemu-img.
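A possible sketch of this copy, where both the appliance image file name and the DRBD device path
for the VM disk are assumptions that depend on your setup:
qemu-img convert -O raw /tmp/linbit-linstor-controller-appliance.img /dev/drbd/by-res/vm-100-disk-1/0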
Once completed you can start the VM and connect to it via the Proxmox VNC viewer. The default
user name and password are both "linbit". Note that we kept the default configuration for the ssh
server, so you will not be able to log in to the VM via ssh and username/password. If you want to
enable that (and/or "root" login), enable these settings in /etc/ssh/sshd_config and restart the ssh
service. As this VM is based on "Ubuntu Bionic", you should change your network settings (e.g.,
static IP) in /etc/netplan/config.yaml. After that you should be able to ssh to the VM:
In the next step you add the controller VM to the existing cluster:
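A sketch of adding the VM as a pure Controller node, using the controller VM IP that appears later
in this section; the node name is a placeholder:
linstor node create --node-type Controller linstor-controller-vm 10.43.7.254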
In our test cluster the Controller VM disk was created in DRBD storage and it was initially assigned
to one host (use linstor resource list to check the assignments). Then, we used the linstor resource
create command to create additional resource assignments to the other nodes of the cluster for this
VM. In our lab consisting of four nodes, we created all resource assignments as diskful, but diskless
assignments are fine as well. As a rule of thumb keep the redundancy count at "3" (more usually
does not make sense), and assign the rest as diskless.
As the storage for the Controller VM must be made available on all PVE hosts in some way, we must
make sure to enable the drbd.service on all hosts (given that it is not controlled by LINSTOR at this
stage):
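A sketch of enabling the service on each PVE host:
systemctl enable drbd
systemctl start drbd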
By default, at startup the linstor-satellite service deletes all of its resource files (.res) and
regenerates them. This conflicts with the drbd service that needs these resource files to
start the controller VM. It is good enough to first bring up the resources via drbd.service and
to ensure that the linstor-satellite.service, which brings up the controller resource, never
deletes the according res file. To make the necessary changes, you need to create a drop-in for
the linstor-satellite.service via systemctl (do not edit the file directly).
systemctl edit linstor-satellite
[Service]
Environment=LS_KEEP_RES=vm-100-disk
[Unit]
After=drbd.service
Of course adapt the name of the controller VM in the LS_KEEP_RES variable. Note that the value given
is interpreted as regex, so you don’t need to specify the exact name.
After that, it is time for the final steps, namely switching from the existing controller (residing on
the physical host) to the new one in the VM. So let’s stop the old controller service on the physical
host, and copy the LINSTOR controller database to the VM host:
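A sketch of these steps, assuming the default controller database location /var/lib/linstor and
the controller VM IP used in this section:
systemctl stop linstor-controller
systemctl disable linstor-controller
scp /var/lib/linstor/* root@10.43.7.254:/var/lib/linstor/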
To check if everything worked as expected, you can query the cluster nodes on a physical PVE host
by asking the controller in the VM: linstor --controllers=10.43.7.254 node list. It is perfectly fine
that the controller (which is just a Controller and not a "Combined" host) is shown as "OFFLINE".
This might change in the future to something more reasonable.
As the last — but crucial — step, you need to add the "controllervm" option to /etc/pve/storage.cfg,
and change the controller IP address to the IP address of the Controller VM:
drbd: drbdstorage
content images,rootdir
resourcegroup defaultpool
controller 10.43.7.254
controllervm 100
Please note the additional setting "controllervm". This setting is very important, as it tells PVE to
handle the Controller VM differently than the rest of the VMs stored in the DRBD storage. Specifically,
it will instruct PVE to NOT use the LINSTOR storage plugin for handling the Controller VM, but to use
other methods instead. The reason for this is simply that the LINSTOR backend is not available at this
stage. Once the Controller VM is up and running (and the associated LINSTOR controller service
inside the VM), then the PVE hosts will be able to start the rest of the virtual machines which are stored
in the DRBD storage by using the LINSTOR storage plugin. Please make sure to set the correct VM ID in
the "controllervm" setting. In this case it is set to "100", which represents the ID assigned to our
Controller VM.
It is very important to make sure that the Controller VM is up and running at all times and that you
are backing it up at regular intervals (especially when you make modifications to the LINSTOR
cluster). Once the VM is gone and there are no backups, the LINSTOR cluster must be recreated from
scratch.
To prevent accidental deletion of the VM, you can go to the "Options" tab of the VM in the PVE GUI
and enable the "Protection" option. If you nevertheless accidentally delete the VM, such delete
requests are ignored by our storage plugin, so the VM disk will NOT be deleted from the LINSTOR
cluster. Therefore, it is possible to recreate the VM with the same ID as before (simply recreate the
VM configuration file in PVE and assign the same DRBD storage device used by the old VM). The
plugin will just return "OK", and the old VM with the old data can be used again. In general, be
careful not to delete the controller VM and "protect" it accordingly.
Currently, we have the controller executed as VM, but we should make sure that one instance of the
VM is started at all times. For that we use Proxmox’s HA feature. Click on the VM, then on "More",
and then on "Manage HA". We set the following parameters for our controller VM:
As long as there are surviving nodes in your Proxmox cluster, everything should be fine and in case
the node hosting the Controller VM is shut down or lost, Proxmox HA will make sure the controller
is started on another host. Obviously the IP of the controller VM should not change. It is up to you
as an administrator to make sure this is the case (e.g., setting a static IP, or always providing the
same IP via dhcp on the bridged interface).
It is important to mention at this point that if you are using a dedicated network for
the LINSTOR cluster, you must make sure that the network interfaces configured for the cluster
traffic are configured as bridges (i.e., vmbr1, vmbr2, etc.) on the PVE hosts. If they are set up as direct
interfaces (i.e., eth0, eth1, etc.), then you will not be able to set up the Controller VM vNIC to
communicate with the rest of the LINSTOR nodes in the cluster, as you cannot assign direct network
interfaces to the VM, but only bridged interfaces.
One limitation that is not fully handled with this setup is a total cluster outage (e.g., common power
supply failure) with a restart of all cluster nodes. Proxmox is unfortunately pretty limited in this
regard. You can enable the "HA Feature" for a VM, and you can define "Start and Shutdown Order"
constraints. But both are completely separated from each other. Therefore it is hard/impossible to
guarantee that the Controller VM will be up and running, before all other VMs are started.
It might be possible to work around that by delaying VM startup in the Proxmox plugin itself until
the controller VM is up (i.e., if the plugin is asked to start the controller VM it does it, otherwise it
waits and pings the controller). While a nice idea, this would horribly fail in a serialized, non-
concurrent VM start/plugin call event stream where some VM should be started (which then are
blocked) before the Controller VM is scheduled to be started. That would obviously result in a
deadlock.
We will discuss these options with Proxmox, but we think the current solution is valuable in most
typical use cases, as is. Especially, compared to the complexity of a pacemaker setup. Use cases
where one can expect that not the whole cluster goes down at the same time are covered. And even
if that is the case, only automatic startup of the VMs would not work when the whole cluster is
started. In such a scenario the admin just has to wait until the Proxmox HA service starts the
controller VM. After that all VMs can be started manually/scripted on the command line.
Chapter 6. LINSTOR Volumes in OpenNebula
This chapter describes DRBD in OpenNebula via the usage of the LINSTOR storage driver addon.
Detailed installation and configuration instructions can be found in the README.md file of the
driver’s source.
The LINSTOR addon allows the deployment of virtual machines with highly available images
backed by DRBD and attached across the network via DRBD’s own transport protocol.
With access to LINBIT’s customer repositories you can install the linstor-opennebula package with
apt or yum, as sketched below.
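A minimal sketch for both package managers:
apt install linstor-opennebula
# or
yum install linstor-opennebula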
Without access to LINBIT’s prepared packages you need to fall back to the instructions on its GitHub
page.
A DRBD cluster with LINSTOR can be installed and configured by following the instructions in this
guide, see Initializing your cluster.
The OpenNebula and DRBD clusters can be somewhat independent of one another with the
following exception: OpenNebula’s Front-End and Host nodes must be included in both clusters.
Host nodes do not need a local LINSTOR storage pool, as virtual machine images are attached to
them across the network [1].
6.4. Configuration
6.4.1. Adding the driver to OpenNebula
Add linstor to the list of drivers in the TM_MAD and DATASTORE_MAD sections:
TM_MAD = [
executable = "one_tm",
arguments = "-t 15 -d dummy,lvm,shared,fs_lvm,qcow2,ssh,vmfs,ceph,linstor"
]
DATASTORE_MAD = [
EXECUTABLE = "one_datastore",
ARGUMENTS = "-t 15 -d dummy,fs,lvm,ceph,dev,iscsi_libvirt,vcenter,linstor -s
shared,ssh,ceph,fs_lvm,qcow2,linstor"
TM_MAD_CONF = [
NAME = "linstor", LN_TARGET = "NONE", CLONE_TARGET = "SELF", SHARED = "yes",
ALLOW_ORPHANS="yes",
TM_MAD_SYSTEM = "ssh,shared", LN_TARGET_SSH = "NONE", CLONE_TARGET_SSH = "SELF",
DISK_TYPE_SSH = "BLOCK",
LN_TARGET_SHARED = "NONE", CLONE_TARGET_SHARED = "SELF", DISK_TYPE_SHARED =
"BLOCK"
]
DS_MAD_CONF = [
NAME = "linstor", REQUIRED_ATTRS = "BRIDGE_LIST", PERSISTENT_ONLY = "NO",
MARKETPLACE_ACTIONS = "export"
]
The Front-End node issues commands to the Storage and Host nodes via Linstor.
Host nodes are responsible for running instantiated VMs and typically have the storage for the
images they need attached across the network via Linstor diskless mode.
All nodes must have DRBD9 and Linstor installed. This process is detailed in the User’s Guide for
DRBD9.
It is possible to have Front-End and Host nodes act as storage nodes in addition to their primary
role as long as they meet all the requirements for both roles.
Front-End Configuration
Please verify that the control node(s) that you hope to communicate with are reachable from the
Front-End node. linstor node list for locally running Linstor controllers and linstor
--controllers "<IP:PORT>" node list for remotely running Linstor Controllers is a handy way to
test this.
Host Configuration
Host nodes must have Linstor satellite processes running on them and be members of the same
Linstor cluster as the Front-End and Storage nodes, and may optionally have storage locally. If the
oneadmin user is able to passwordlessly ssh between hosts, then live migration may be used even
with the ssh system datastore.
Only the Front-End and Host nodes require OpenNebula to be installed, but the oneadmin user
must be able to passwordlessly access storage nodes. Refer to the OpenNebula install guide for your
distribution on how to manually configure the oneadmin user account.
The Storage nodes must use storage pools created with a driver that’s capable of making snapshots,
such as the thin LVM plugin.
To prepare thinly-provisioned storage using LVM for Linstor in this example, you must create a
volume group and a thin LV using LVM on each storage node.
Example of this process using two physical volumes (/dev/sdX and /dev/sdY) and generic names for
the volume group and thinpool. Make sure to set the thinLV’s metadata volume to a reasonable size,
once it becomes full it can be difficult to resize:
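A sketch of this preparation; the device names, volume group name, and thin pool name are
placeholders:
pvcreate /dev/sdX /dev/sdY
vgcreate vg_linstor /dev/sdX /dev/sdY
lvcreate -l 95%VG --poolmetadatasize 256M --thinpool thinpool_linstor vg_linstor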
Then you’ll create storage pool(s) on Linstor using this as the backing storage.
The oneadmin user must have passwordless sudo access to the mkfs command on the Storage nodes:
oneadmin ALL=(root) NOPASSWD: /sbin/mkfs
Groups
Be sure to consider the groups that oneadmin should be added to in order to gain access to the
devices and programs needed to access storage and instantiate VMs. For this addon, the oneadmin
user must belong to the disk group on all nodes in order to access the DRBD devices where images
are held.
Create a datastore configuration file named ds.conf and use the onedatastore tool to create a new
datastore based on that configuration. There are two mutually exclusive deployment options:
LINSTOR_AUTO_PLACE and LINSTOR_DEPLOYMENT_NODES. If both are configured,
LINSTOR_AUTO_PLACE is ignored. For both of these options, BRIDGE_LIST must be a space
separated list of all storage nodes in the Linstor cluster.
Since version 1.0.0 LINSTOR supports resource groups. A resource group is a centralized point for
settings that all resources linked to that resource group share.
Create a resource group and volume group for your datastore; it is mandatory to specify a storage
pool within the resource group, otherwise space monitoring for OpenNebula will not work. Here we
create one with 2-node redundancy using a previously created opennebula-storagepool:
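A sketch of these commands, using the resource group name from the datastore example below:
linstor resource-group create OneRscGrp --place-count 2 --storage-pool opennebula-storagepool
linstor volume-group create OneRscGrp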
cat >ds.conf <<EOI
NAME = linstor_datastore
DS_MAD = linstor
TM_MAD = linstor
TYPE = IMAGE_DS
DISK_TYPE = BLOCK
LINSTOR_RESOURCE_GROUP = "OneRscGrp"
COMPATIBLE_SYS_DS = 0
BRIDGE_LIST = "alice bob charlie" #node names
EOI
LINSTOR_CONTROLLERS
LINSTOR_CONTROLLERS can be used to pass a comma-separated list of controller IPs and ports to the
Linstor client in the case where a Linstor controller process is not running locally on the Front-End,
e.g.:
LINSTOR_CONTROLLERS = "192.168.1.10:8080,192.168.1.11:6000"
LINSTOR_CLONE_MODE
Linstor supports two different clone modes, set via the LINSTOR_CLONE_MODE attribute:
• snapshot
The default mode is snapshot. It uses a LINSTOR snapshot and restores a new resource from this
snapshot, which is then a clone of the image. This mode is usually faster than using the copy mode,
as snapshots are cheap copies.
• copy
The second mode is copy. It creates a new resource with the same size as the original and copies the
data with dd to the new resource. This mode will be slower than snapshot, but is more robust as it
doesn’t rely on any snapshot mechanism. It is also used if you are cloning an image into a different
LINSTOR datastore.
The following attributes are deprecated and will be removed in a version after the 1.0.0 release.
LINSTOR_STORAGE_POOL
The LINSTOR_STORAGE_POOL attribute is used to select the LINSTOR storage pool your datastore should use.
If resource groups are used this attribute isn’t needed, as the storage pool can be selected by the auto-
select filter options. If LINSTOR_AUTO_PLACE or LINSTOR_DEPLOYMENT_NODES is used and
LINSTOR_STORAGE_POOL is not set, it will fall back to the DfltStorPool in LINSTOR.
LINSTOR_AUTO_PLACE
The LINSTOR_AUTO_PLACE option takes a level of redundancy which is a number between one and the
total number of storage nodes. Resources are assigned to storage nodes automatically based on the
level of redundancy.
LINSTOR_DEPLOYMENT_NODES
Using LINSTOR_DEPLOYMENT_NODES allows you to select a group of nodes that resources will always be
assigned to. Please note that the bridge list still contains all of the storage nodes in the Linstor
cluster.
The Linstor driver can also be used as a system datastore; the configuration is pretty similar to normal
datastores, with a few changes (a sketch follows below):
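A possible sketch of a system datastore configuration; the datastore name and resource group name
are assumptions:
cat >sys_ds.conf <<EOI
NAME = linstor_system_datastore
TM_MAD = linstor
TYPE = SYSTEM_DS
LINSTOR_RESOURCE_GROUP = "OneSysRscGrp"
BRIDGE_LIST = "alice bob charlie" #node names
EOI
onedatastore create sys_ds.conf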
Also add the new system datastore ID to COMPATIBLE_SYS_DS of your image datastores (comma-
separated), otherwise the scheduler will ignore them.
If you want live migration with volatile disks you need to enable the --unsafe option for KVM, see:
opennebula-doc
For datastores which place per node, free space is reported based on the most restrictive storage
pools from all nodes where resources are being deployed. For example, the capacity of the node
with the smallest amount of total storage space is used to determine the total size of the datastore
and the node with the least free space is used to determine the remaining space in the datastore.
For a datastore which uses automatic placement, size and remaining space are determined based
on the aggregate storage pool used by the datastore as reported by LINSTOR.
[1] If a host is also a storage node, it will use a local copy of an image if that is available
Chapter 7. LINSTOR volumes in Openstack
This chapter describes DRBD in Openstack for persistent, replicated, and high-performance block
storage with the LINSTOR driver.
The LINSTOR driver for OpenStack manages DRBD/LINSTOR clusters and makes them available
within the OpenStack environment, especially within Nova compute instances. LINSTOR-backed
Cinder volumes will seamlessly provide all the features of DRBD/LINSTOR while allowing
OpenStack to manage their deployment and life cycle. The driver will allow OpenStack to
create and delete persistent LINSTOR volumes as well as manage and deploy volume
snapshots and raw volume images.
Aside from using the kernel-native DRBD protocols for replication, the LINSTOR driver also allows
using iSCSI with LINSTOR cluster(s) to provide maximum compatibility. For more information on
these two options, please see Choosing the Transport Protocol.
# First, set up LINBIT repository per support contract
Lastly, from the Cinder node, create the LINSTOR satellite node(s) and storage pool(s):
# Create a LINSTOR cluster, including the Cinder node as one of the nodes
# For each node, specify node name, its IP address, volume type (diskless) and
# volume location (drbdpool/thinpool)
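A rough sketch of these steps, assuming hypothetical node names and addresses and an LVM thin pool named drbdpool/thinpool (adjust all of these to your cluster):
# register the diskful storage nodes and the (diskless) Cinder node with the controller
linstor node create stor-node-1 192.168.1.21
linstor node create stor-node-2 192.168.1.22
linstor node create cinder-node 192.168.1.20
# back the storage pool with the LVM thin pool on the diskful nodes;
# the Cinder node itself needs no storage pool if it stays diskless
linstor storage-pool create lvmthin stor-node-1 DfltStorPool drbdpool/thinpool
linstor storage-pool create lvmthin stor-node-2 DfltStorPool drbdpool/thinpool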
The linstor driver will be officially available starting with the OpenStack Stein release. The latest release is located at the LINBIT OpenStack Repo. It is a single Python file called linstordrv.py. Depending on your OpenStack installation, its destination may vary.
Place the driver ( linstordrv.py ) in an appropriate location within your OpenStack Cinder node.
For Devstack:
/opt/stack/cinder/cinder/volume/drivers/linstordrv.py
For Ubuntu:
/usr/lib/python2.7/dist-packages/cinder/volume/drivers/linstordrv.py
For RDO Packstack:
/usr/lib/python2.7/site-packages/cinder/volume/drivers/linstordrv.py
7.3. Cinder Configuration for LINSTOR
7.3.1. Edit the Cinder configuration file cinder.conf in /etc/cinder/ as follows:
[DEFAULT]
...
enabled_backends=lvm, linstor
...
[linstor]
volume_backend_name = linstor
volume_driver = cinder.volume.drivers.linstordrv.LinstorDrbdDriver
linstor_default_volume_group_name=drbdpool
linstor_default_uri=linstor://localhost
linstor_default_storage_pool_name=DfltStorPool
linstor_default_resource_size=1
linstor_volume_downsize_factor=4096
Run these commands from the Cinder node once environment variables are configured for
OpenStack command line operation.
For Devstack:
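A sketch of this step, assuming a default Devstack deployment where the Cinder volume service runs as the devstack@c-vol unit (the volume type name linstor is an assumption that matches the backend name configured above):
# restart the Cinder volume service so it loads the LINSTOR backend
sudo systemctl restart devstack@c-vol
# create a volume type bound to the LINSTOR backend
openstack volume type create linstor
openstack volume type set --property volume_backend_name=linstor linstor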
For RDO Packstack:
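Under the same assumptions, on an RDO Packstack installation only the service name differs:
# restart the Cinder volume service, then create the volume type as shown for Devstack
sudo systemctl restart openstack-cinder-volume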
Once the Cinder service is restarted, a new Cinder volume with the LINSTOR backend may be created using the Horizon GUI or the command line. Use the following as a guide for creating a volume with the command line.
# Check to see if there are any recurring errors with the driver.
# Occasional 'ERROR' keyword associated with the database is normal.
# Use Ctrl-C to stop the log output to move on.
sudo journalctl -f -u devstack@c-* | grep -i error
# Create a LINSTOR test volume. Once the volume is created, volume list
# command should show one new Cinder volume. The 'linstor' command then
# should list actual resource nodes within the LINSTOR cluster backing that
# Cinder volume.
openstack volume create --type linstor --size 1 --availability-zone nova linstor-test-vol
openstack volume list
linstor resource list
More to come
7.4. Choosing the Transport Protocol
These are not exclusive; you can define multiple backends, have some of them use iSCSI, and others the DRBD protocol.
7.4.1. iSCSI Transport
The default way to export Cinder volumes is via iSCSI. This brings the advantage of maximum compatibility: iSCSI can be used with every hypervisor, be it VMware, Xen, Hyper-V, or KVM.
The drawback is that all data has to be sent to a Cinder node, to be processed by a (userspace) iSCSI daemon; that means the data needs to pass the kernel/userspace boundary, and these transitions will cost some performance.
7.4.2. DRBD Transport
The alternative is to get the data to the VMs by using DRBD as the transport protocol. This means that DRBD 9 needs to be installed on the Cinder node as well. [2]
One advantage of that solution is that the storage access requests of the VMs can be sent via the DRBD kernel module to the storage nodes, which can then directly access the allocated LVs; this means no kernel/userspace transitions on the data path, and consequently better performance. Combined with RDMA-capable hardware you should get about the same performance as with VMs accessing an FC backend directly.
Another advantage is that you will be implicitly benefiting from the HA background of DRBD: using multiple storage nodes, possibly available over different network connections, means redundancy and avoids a single point of failure.
The default configuration options for the Cinder driver assume the Cinder node to be a diskless LINSTOR node. If the node is a diskful node, change 'linstor_controller_diskless=True' to 'linstor_controller_diskless=False' and restart the Cinder services.
In the LINSTOR section of cinder.conf you can define which transport protocol to use. The initial setup described at the beginning of this chapter is set to use DRBD transport. You can configure as necessary, as shown below. Then Horizon [3] should offer these storage backends at volume creation time.
volume_driver=cinder.volume.drivers.drbdmanagedrv.DrbdManageIscsiDriver
volume_driver=cinder.volume.drivers.drbdmanagedrv.DrbdManageDrbdDriver
The old class name "DrbdManageDriver" is being kept for the time being for compatibility reasons; it is just an alias to the iSCSI driver.
To summarize:
• You’ll need the LINSTOR Cinder driver 0.1.0 or later, and LINSTOR 0.6.5 or later.
• The DRBD transport protocol should be preferred whenever possible; iSCSI won’t offer any
locality benefits.
• Take care not to run out of disk space, especially with thin volumes.
[2] LINSTOR must be installed on the Cinder node. Please see the note at [s-openstack-linstor-drbd-external-NOTE].
[3] The OpenStack GUI
Chapter 8. LINSTOR Volumes in Docker
This chapter describes LINSTOR volumes in Docker as managed by the LINSTOR Docker Volume
Plugin.
The LINSTOR Docker Volume Plugin is a volume driver that provisions persistent volumes from a
LINSTOR cluster for Docker containers.
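A minimal sketch of installing and enabling the plugin, assuming it is published on Docker Hub under the name linbit/linstor-docker-volume (check the actual plugin name for your setup):
# install the volume plugin from Docker Hub and grant the privileges it requests
docker plugin install linbit/linstor-docker-volume --grant-all-permissions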
# cat /etc/linstor/docker-volume.conf
[global]
controllers = linstor://hostnameofcontroller
# cat /etc/linstor/docker-volume.conf
[global]
storagepool = thin-lvm
fs = ext4
fsopts = -E discard
size = 100MB
replicas = 2
8.4.1. Example 1 - typical docker pattern
On node alpha:
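As a sketch only (the volume driver name linstor, the volume name lsvol, and the busybox image are assumptions; depending on how the plugin was installed, the full plugin name may be needed as the driver):
# create a LINSTOR-backed volume and write a file into it
docker volume create -d linstor lsvol
docker run --rm -v lsvol:/data busybox sh -c 'echo hello from alpha > /data/hello.txt'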
On node bravo:
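Again as a sketch, read back on bravo the data that was written on alpha:
# the same volume can be attached on another node of the cluster
docker run --rm -v lsvol:/data busybox cat /data/hello.txt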