PacketFence Clustering Guide
PacketFence v13.1.0
Version 13.1.0 - January 2024
Permission is granted to copy, distribute and/or modify this document under the terms of the
GNU Free Documentation License, Version 1.2 or any later version published by the Free
Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU Free Documentation License".
The fonts used in this guide are licensed under the SIL Open Font License, Version 1.1. This
license is available with a FAQ at: http://scripts.sil.org/OFL
This guide gives a quick start to install active/active clustering in PacketFence 7+. This guide does
not include advanced troubleshooting of the active/active clustering. Refer to the documentation
of HAProxy and Keepalived for advanced features.
Installation Guide
Covers installation and configuration of PacketFence.
Upgrade Guide
Covers compatibility changes, manual instructions and general upgrade notes.
PacketFence News
Covers noteworthy features, improvements and bug fixes by release.
NOTE Appended to this guide is a glossary on specialized terms used in this document.
• RHEL-based systems
• Debian-based systems
3.1.2. sysctl.conf
You will need to configure each server so the services can bind on IP addresses they don’t
currently have configured. This allows faster failover of the services.
net.ipv4.ip_nonlocal_bind = 1
net.ipv6.conf.all.disable_ipv6 = 1
and run:
sysctl -p
reboot
NOTE If you plan to use Postfix to send emails, you need to set inet_protocols = ipv4 in /etc/postfix/main.cf to be able to use it.
CAUTION Galera cluster is only supported in clusters of 3 nodes or more (with an odd number of servers).
First, you will need to install Mariabackup on each server for the synchronization to work correctly.
On RHEL-based systems
For the next steps, you want to make sure that you didn’t configure anything in
/usr/local/pf/conf/cluster.conf. If you already did, comment all the configuration in the file
and do a configreload (/usr/local/pf/bin/pfcmd configreload hard).
Then, you will need to create a user for the database replication that PacketFence will use. You
can use any username/password combination. After creating the user, keep its information close-
by for usage in the configuration.
mysql -u root
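-- The following statements are a sketch: they create the replication user referenced
-- later in cluster.conf and pf.conf. The privilege list is an assumption; adjust the
-- username and password to your own values.
CREATE USER 'pfcluster'@'%' IDENTIFIED BY 'aMuchMoreSecurePassword';
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'pfcluster'@'%';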
FLUSH PRIVILEGES;
NOTE When configuring the network interfaces, ensure that you mark the management interface as high-availability. Otherwise, you will not be able to perform the database synchronization.
This step is only necessary to configure IP addresses on interfaces (at OS level). PacketFence
configuration of interfaces will be done later.
In /etc/sysconfig/network-scripts/
One Management Interface ifcfg-YourFirstInterfaceName
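For example, a minimal ifcfg file for the management interface of the first server could look like the following (the device name, netmask and gateway are assumptions to adapt to your environment):
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.5
NETMASK=255.255.255.0
GATEWAY=192.168.1.254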
[database]
host=100.64.0.1
port=6033
[active_active]
# Change these 2 values to the credentials you set when configuring MariaDB above
galera_replication_username=pfcluster
galera_replication_password=aMuchMoreSecurePassword
[webservices]
# Change these 2 values to the credentials you want
user=packet
pass=anotherMoreSecurePassword
[advanced]
configurator=disabled
[services]
galera-autofix=disabled
[mysql]
host=100.64.0.1
port=6033
Now, restart packetfence-config and reload the configuration. You will see errors related to a cache write issue, but you can safely ignore them for now. These appear because packetfence-config cannot connect to the database yet.
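For example:
systemctl restart packetfence-config
/usr/local/pf/bin/pfcmd configreload hard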
You will need to configure it with your server hostname. Use the hostname command (without any arguments) to get it.
The CLUSTER section represents the virtual IP addresses of your cluster that will be shared by
your servers.
In this example, eth0 is the management interface, eth1.2 is the registration interface and eth1.3
is the isolation interface.
[CLUSTER]
management_ip=192.168.1.10
[pf1.example.com]
management_ip=192.168.1.5
[pf2.example.com]
management_ip=192.168.1.6
[pf3.example.com]
management_ip=192.168.1.7
Once this configuration is done, reload the configuration and perform a checkup:
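For example, reusing the pfcmd invocations shown elsewhere in this guide:
/usr/local/pf/bin/pfcmd configreload hard
/usr/local/pf/bin/pfcmd checkup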
The reload and the checkup will complain about the unavailability of the database, which you can
safely ignore for now. Most important is that you don’t see any cluster configuration related
errors during the checkup.
First server
If no error is found in the previous configuration, the previous restart of PacketFence should have started keepalived and radiusd-loadbalancer along with the other services. If you have set up a mail server on your first server, you should have received an email from keepalived informing you that your first server obtained the Virtual IP (VIP) addresses.
You should now have service on the first server, using the IP addresses defined in the CLUSTER sections.
NOTE You can check with ip -br a. On the first server, you should find the VIP on the first Ethernet interface. On the other servers, make sure the interface.VLANID interfaces have the correct IPs.
WARNING If you reboot the management node (first server), you will need to stop packetfence-mariadb (systemctl stop packetfence-mariadb) and start it with the new cluster option so the servers can join (systemctl set-environment MARIADB_ARGS=--force-new-cluster && systemctl restart packetfence-mariadb).
Now, you will need to integrate your two other nodes into your cluster by pulling the configuration from the first server, as sketched below.
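One way to do this, assuming the cluster/sync tool supports the --from, --api-user and --api-password options, is to run the following on each of the two other servers, where --from is the management IP of the first server and the API credentials are the [webservices] credentials defined in pf.conf:
/usr/local/pf/bin/cluster/sync --from=192.168.1.5 --api-user=packet --api-password=anotherMoreSecurePassword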
Then, reload the configuration and start the webservices on second and third servers:
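A possible way to do this (the webservices unit name is an assumption to adapt to your version):
/usr/local/pf/bin/pfcmd configreload hard
systemctl restart packetfence-httpd.webservices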
Now, flush any MariaDB data you have on the two servers and restart packetfence-mariadb so
that the servers join the cluster.
WARNING If you have any data in MariaDB on these nodes, this will destroy it.
rm -fr /var/lib/mysql/*
systemctl restart packetfence-mariadb
If you see the following message when running systemctl status packetfence-mariadb, your nodes have successfully joined the cluster:
In case you have some issues, ensure your MariaDB instance running with --force-new-cluster is still running on the first server; if it's not, start it again.
Before starting services on all servers, the galera-autofix service needs to be re-enabled and the configuration synced across the cluster:
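One way to do this, reusing commands shown elsewhere in this guide, is to remove the galera-autofix=disabled line added earlier in the [services] section of pf.conf on the first server, then push the change:
/usr/local/pf/bin/cluster/sync --as-master
/usr/local/pf/bin/pfcmd configreload hard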
3.5.5. Wrapping up
Now restart PacketFence on all servers:
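For example:
/usr/local/pf/bin/pfcmd service pf restart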
You should now reboot each server one by one waiting for the one you rebooted to come back
online before proceeding to the next one:
reboot
After each reboot, ensure the database sync is fine by performing the checks outlined in
Checking the MariaDB sync section.
From the PacketFence web administration interface (using virtual IP address of your cluster), go in
Configuration → System Configuration → Cluster and change the Shared KEY.
If you already use VRRP protocol on your network, you can also change the default Virtual
Router ID and enable VRRP Unicast.
The Galera cluster stack used by PacketFence closely resembles how a normal MariaDB Galera cluster behaves, but it contains hooks to auto-correct some issues that can occur.
NOTE A lot of useful information is logged in the MariaDB log, which can be found in /usr/local/pf/logs/mariadb.log.
The Galera cluster stack will continuously check that it has a quorum. Should one of the servers be part of a group that doesn't have the quorum in the cluster, it will put itself in read-only mode and stop the synchronization. During that time, your PacketFence installation will continue working, but with some features disabled.
• RADIUS MAC Authentication: Will continue working and will return RADIUS attributes associated with the role that is registered in the database. If VLAN or RADIUS filters can apply to this device, they will, but any role change will not be persisted.
• RADIUS 802.1X: Will continue working and if 'Dot1x recompute role from portal' is enabled, it
will compute the role using the available authentication sources but will not save it in the
database at the end of the request. If this parameter is disabled, it will behave like MAC
Authentication. VLAN and RADIUS filters will still apply for the connections. If any of your
sources are external (LDAP, AD, RADIUS, …), they must be available for the request to
complete successfully.
• Captive portal: The captive portal will be disabled and display a message stating the system is
currently experiencing an issue.
• DHCP listeners: The DHCP listeners will be disabled and packets will not be saved in the
database. This also means Firewall SSO will not work during that time.
• Web administration interface: It will still be available in read-only mode for all sections and in
read-write mode for the configuration section.
Once the server that is in read-only mode joins a quorum, it will go back in read-write mode and
the system will go back to its normal behavior automatically.
If at least one node is still alive, other nodes will be able to connect to it and re-integrate the
cluster.
If all nodes are ungracefully shut down at the same time, they will recover when all nodes boot back up. When all nodes are ungracefully shut down, but not at the same time, the galera-autofix
service will elect one of the nodes as the new master and the cluster will recover. See the chapter
on galera-autofix for details on this.
This service will only be able to join a failing node when one of the conditions below is met:
This service will not perform anything when one of the conditions below is met:
This next section will describe how the service will behave and attempt the cluster recovery when necessary.
Important variables:
• wsrep_cluster_status: Displays whether or not the node is part of a primary view. A healthy cluster should always show as primary.
• wsrep_incoming_addresses: The current members of the cluster. All the nodes of your cluster
should be listed there.
• wsrep_last_committed: Sequence number of the most recently committed transaction. You
can identify the most advanced node with this value.
• wsrep_local_state_comment: Current sync state of the cluster. A healthy state is 'Synced'.
Refer to the Galera cluster documentation for the meaning of the other values this can have.
In order for the cluster to be considered healthy, all nodes must be listed under wsrep_incoming_addresses and wsrep_local_state_comment must be Synced. Otherwise, look in the MariaDB log (/usr/local/pf/logs/mariadb.log).
On a cluster which has an issue, once all nodes are back online, you should wait:
• around 10 minutes when at least one of the nodes of the cluster is able to offer database
service
• around 20 minutes when there is no database service available
If your cluster still has issues after that time, you can try to resolve the issue by looking at
sections below.
First, you need to perform checks. Then you will be able to identify which situation you are in: the cluster offers database service without all nodes, or none of the nodes is offering database service.
Ensure you can connect to MariaDB (through UNIX socket) on each node using:
mysql -u root -p
If all checks returned expected values, the cluster is up and has integrity.
# Expected value: ON
SHOW GLOBAL STATUS LIKE 'wsrep_ready';
# Expected value: ON
SHOW GLOBAL STATUS LIKE 'wsrep_connected';
# Expected values when node is part of the primary component: Joining, Waiting on SST, Joined, Synced or Donor
SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
If all checks returned expected values, individual nodes are in working order.
In order to emulate how PacketFence connects to the database, you can run the following command:
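For example, assuming the ProxySQL host and port set in the [database] section of pf.conf above, and the pf database user (the username and password being those defined in pf.conf):
mysql -h 100.64.0.1 -P 6033 -u pf -p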
If you get a prompt, it means PacketFence is able to connect to the database.
To perform a small query to the database using PacketFence codebase, you can run:
/usr/local/pf/bin/pfcmd checkup
If the command doesn't return any database error, PacketFence is able to perform reads on the database.
After all nodes have joined back the cluster, you should verify the MariaDB sync.
You must identify the node you wish to keep the data from and start it with the --force-new-cluster option.
Find the node which has the highest seqno value in /var/lib/mysql/grastate.dat.
If the seqno value is -1, you need to start MariaDB manually with --wsrep-recover to update the
seqno value using the commands below:
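A possible way to do this (the binary name is an assumption that may differ on your system; the defaults file is the one generated by PacketFence):
systemctl stop packetfence-mariadb
mariadbd --defaults-file=/usr/local/pf/var/conf/mariadb.conf --user=mysql --wsrep-recover
Look for the 'Recovered position' line in the output or in the MariaDB log.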
The recovered position is a pair <cluster state UUID>:<sequence number>. The node with the
highest sequence number in its recovered position is the most up-to-date, and should be chosen
as bootstrap candidate.
Once you have identified the most up-to-date node, run the following commands on it:
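As shown in the warning earlier in this guide, this is done with:
systemctl set-environment MARIADB_ARGS=--force-new-cluster
systemctl restart packetfence-mariadb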
On each of the servers you want to discard the data from, you must destroy all the data in
/var/lib/mysql and start packetfence-mariadb so it resyncs its data from scratch.
rm -fr /var/lib/mysql/*
You should then see /var/lib/mysql be populated again with the data. Once MariaDB becomes available again on the server, the sync has completed. In case of issues, look in the MariaDB log file (/usr/local/pf/logs/mariadb.log).
IMPORTANT In a three-node cluster, you can offer service with at least one node.
/usr/local/pf/bin/cluster/maintenance --activate
/usr/local/pf/bin/cluster/maintenance --deactivate
/usr/local/pf/bin/cluster/maintenance
NOTE The important thing is to start the servers in the opposite order that you will stop them.
Example:
Once prompted, check the packetfence-mariadb sync with the master by typing the command:
mysql -u root -p
MariaDB> show status like 'wsrep%';
NOTE The wsrep_incoming_addresses will give you the IP addresses of the nodes synced.
Files description:
/usr/local/pf/addons/backup-and-maintenance.sh
As with the daily automatic backups, you will find the files in:
/root/backup/
Two files will be available, tagged with the Date and Time of your backup.
PacketFence supports having clusters where servers are located in multiple layer 3 networks
which we will also refer as cluster zones.
Simple RADIUS-only clusters are simpler and can be configured without too much in-depth knowledge. However, if you want to use the captive portal with a layer 3 cluster, your setup will be more complex and will require a solid understanding of how PacketFence works in order to design such a cluster properly.
This section will describe the changes to do on your cluster.conf when dealing with layer 3
clusters but doesn’t cover all the cluster installation. In order to install your cluster, follow the
instructions in Cluster Setup and refer to this section when reaching the step to configure your
cluster.conf.
• This example will use 3 servers in a network (called DC1), and 2 in another network (called
DC2).
• Each group of servers (in the same L2 network) will have a virtual IP address and will perform load-balancing to members in the same L2 zone (i.e. same network).
• All the servers will use MariaDB Galera cluster and will be part of the same database cluster
meaning all servers will have the same data.
• In the event of the loss of DC1 or a network split between DC1 and DC2, the databases on
DC2 will go in read-only and will exhibit the behavior described in "Quorum behavior".
• All the servers will share the same configuration and same cluster.conf. The data in
cluster.conf will serve as an overlay to the data in pf.conf to perform changes specific to each
layer 3 zone.
• While going through the configurator to configure the network interfaces, you only need to
have a single interface and set its type to management and high-availability.
[general]
multi_zone=enabled
[DC1 CLUSTER]
management_ip=192.168.1.10
[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11
[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12
[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13
[DC2 CLUSTER]
management_ip=192.168.2.10
[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11
[DC2 pf2-dc2.example.com]
management_ip=192.168.2.12
In order to configure a RADIUS server with a captive-portal on a layer 3 cluster, you will need at
least 3 servers (5 are used in this example) with 2 interfaces (one for management and one for
registration).
NOTE Isolation is omitted in this example for brevity and should be configured the same way as registration if needed.
• This example will use 3 servers in a network (called DC1), and 2 in another network (called
DC2).
• Each group of servers (in the same L2 network) will have a virtual IP address and will perform load-balancing (RADIUS, HTTP) to members in the same L2 zone (i.e. same network).
• All the servers will use MariaDB Galera cluster and will be part of the same database cluster
meaning all servers will have the same data.
• In the event of the loss of DC1 or a network split between DC1 and DC2, the databases on
DC2 will go in read-only and will exhibit the behavior described in "Quorum behavior".
• All the servers will share the same configuration and same cluster.conf. The data in
cluster.conf will serve as an overlay to the data in pf.conf and networks.conf to perform
changes specific to each layer 3 zone.
The schema below presents the routing that needs to be setup in your network in order to
deploy this example:
• The static routes from the PacketFence servers to the gateways on your network equipment
will be configured through networks.conf and do not need to be configured manually on the
servers. You will simply need to declare the remote networks so that PacketFence offers
DHCP on them and routes them properly.
• Since the network of the clients is not directly connected to the PacketFence servers via layer
2, you will need to use IP helper (DHCP relaying) on your network equipment that points to
both virtual IPs of your cluster.
• We assume that your routers are able to route all the different networks that are involved for
registration (192.168.11.0/24, 192.168.22.0/24, 192.168.100.0/24) and that any client in
these 3 networks can be routed to any of these networks via its gateway (192.168.11.1,
192.168.22.2, 192.168.100.1).
• Access lists should be put in place to restrict the clients (network 192.168.100.0/24) from accessing networks other than the 3 registration networks.
• No special routing is required for the management interface.
• While going through the configurator to configure the network interfaces, you will need to set
an interface to management and high-availability.
• While going through the configurator to configure the network interfaces, you will need to set
an interface to registration.
[general]
multi_zone=enabled
[DC1 CLUSTER]
management_ip=192.168.1.10
[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11
[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12
[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13
[DC2 CLUSTER]
management_ip=192.168.2.10
[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11
[DC2 pf2-dc2.example.com]
management_ip=192.168.2.12
NOTE You should use the configuration above to perform the cluster setup and complete all the steps required to build your cluster. You should only continue these steps after it is fully set up and running.
[192.168.100.0]
gateway=192.168.100.1
dhcp_start=192.168.100.20
domain-name=vlan-registration.example.com
Then, to complete the client network configuration, you will need to override the next hop (route
to reach the network) and DNS server in cluster.conf by adding the following:
/usr/local/pf/bin/cluster/sync --as-master
/usr/local/pf/bin/pfcmd configreload hard
For this configuration at least 3 servers are needed at the main site and at least 1 server is
needed at the remote site.
Next, configure cluster.conf during the initial cluster setup; refer to the example below.
[general]
multi_zone=enabled
[DC1 CLUSTER]
management_ip=192.168.1.10
[DC1 pf1-dc1.example.com]
management_ip=192.168.1.11
[DC1 pf2-dc1.example.com]
management_ip=192.168.1.12
[DC1 pf3-dc1.example.com]
management_ip=192.168.1.13
[DC2 CLUSTER]
management_ip=192.168.2.10
masterslavemode=SLAVE
masterdb=DC1
[DC2 pf1-dc2.example.com]
management_ip=192.168.2.11
masterslavemode=SLAVE
masterdb=DC1
This means that the cluster will be in SLAVE mode and will use the database of the DC1 cluster. The type and the enforcement MUST be defined on all the interfaces.
Connect to the remote server and perform the following to sync the configuration from the
master cluster:
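Assuming the cluster/sync tool supports the --from, --api-user and --api-password options, this could look like the following, where --from is the management IP of a master cluster member and the API credentials are the [webservices] credentials defined in pf.conf:
/usr/local/pf/bin/cluster/sync --from=192.168.1.11 --api-user=packet --api-password=anotherMoreSecurePassword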
mkdir /root/backup/restore
cd /root/backup/restore
cp ../packetfence-db-dump-innobackup-YYYY-MM-DD_HHhss.xbstream.gz .
gunzip packetfence-db-dump-innobackup-YYYY-MM-DD_HHhss.xbstream.gz
mbstream -x < packetfence-db-dump-innobackup-YYYY-MM-DD_HHhss.xbstream
mv packetfence-db-dump-innobackup-YYYY-MM-DD_HHhss.xbstream ../
mariabackup --prepare --target-dir=./
systemctl stop packetfence-mariadb
rm -fr /var/lib/mysql/*
mariabackup --innobackupex --defaults-file=/usr/local/pf/var/conf/mariadb.conf
--move-back --force-non-empty-directories ./
chown -R mysql: /var/lib/mysql
systemctl start packetfence-mariadb
mysql -uroot -p
MariaDB [(none)]> GRANT REPLICATION SLAVE ON *.* TO 'pfcluster'@'%';
MariaDB [(none)]> FLUSH PRIVILEGES;
Lastly, run the following script on the remote server to start the slave replication.
/usr/local/pf/addons/makeslave.pl
The "MySQL master ip address" is the ip address of the master server where you created the
backup file. Not the VIP of the primary cluster.
If, when you run the script, you get the following message:
ERROR 1045 (28000) at line 1: Access denied for user 'root'@'%' (using
password: YES)
Unable to grant replication on user pfcluster at ./addons/makeslave.pl line 42,
<STDIN> line 2.
Then you need to make sure that the root user exists in the remote database and has the correct permissions (SELECT and GRANT):
Edit the file /root/backup/restore/xtrabackup_binlog_info and note the file name and the
position:
mariadb-bin.000014 7473
On the master server of the main cluster - where the backup was created - run the following
command:
mysql -uroot -p
MariaDB [(none)]> GRANT REPLICATION SLAVE ON *.* TO 'pfcluster'@'%';
MariaDB [(none)]> FLUSH PRIVILEGES;
On the remote site master server run the following MySQL command as root:
The replication MASTER_USER and MASTER_PASSWORD can be found in the main site's pf.conf. The MASTER_HOST is the IP address of the master server on the main site - where the backup was created. Do not use the VIP.
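A sketch of that command, using the binlog file and position noted above and the replication credentials as an example (adjust the host, user and password to your environment):
MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.1.11', MASTER_USER='pfcluster', MASTER_PASSWORD='aMuchMoreSecurePassword', MASTER_LOG_FILE='mariadb-bin.000014', MASTER_LOG_POS=7473;
MariaDB [(none)]> START SLAVE;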
At the end, if you want to check the status of the slave server for debugging purposes, you can run the following command:
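For example:
MariaDB [(none)]> SHOW SLAVE STATUS\G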
First, you will need to stop PacketFence on your server and put it offline:
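For example (assuming you then want to power the server off):
/usr/local/pf/bin/pfcmd service pf stop
shutdown -h now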
Then you need to remove all the configuration associated to the server from
/usr/local/pf/conf/cluster.conf on one of the remaining nodes. Configuration for a server is
always prefixed by the server’s hostname.
Once you have removed the configuration, you need to reload it and synchronize it with the
remaining nodes in the cluster.
# /usr/local/pf/bin/cluster/sync --as-master
# /usr/local/pf/bin/pfcmd configreload hard
Now restart PacketFence on all the servers so that the removed node is not part of the clustering
configuration.
Note that if you remove a node and end up having an even number of servers, you will get unexpected behaviors in MariaDB. You should always aim to have an odd number of servers at all times in your cluster.
In order to be sure the configuration is properly synced on all nodes, you will need to enter this
command on the previously selected master node.
# /usr/local/pf/bin/cluster/sync --as-master
Add the additional files one per line in this file. We advise you to add this file to the synchronization too.
Example:
/usr/local/pf/conf/cluster-files.txt
/usr/local/pf/raddb/modules/mschap
listen stats
bind %%management_ip%%:1025
mode http
timeout connect 10s
timeout client 1m
timeout server 1m
stats enable
stats uri /stats
stats realm HAProxy\ Statistics
stats auth admin:packetfence
NOTE We strongly advise you to change the username and password to something other than admin/packetfence, although access to this dashboard doesn't compromise the server.
CAUTION If you're upgrading from a version prior to 5.0, the line may not be there. Add it close to the other management rules:
Now restart haproxy-portal and iptables on all nodes in order to complete the configuration:
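Assuming the standard PacketFence unit names, this would be:
systemctl restart packetfence-haproxy-portal
systemctl restart packetfence-iptables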
You should now be able to connect to the dashboard on each node using the following URL: http://NODE_MANAGEMENT_IP:1025/stats
NOTE The same principle can be applied to haproxy-db with port 1026 in haproxy-db.conf.
When modifying the configuration through the administration interface, the configuration will be
automatically synchronized to all the nodes that are online. In the event that one or more nodes
cannot be updated, an error message will be displayed with affected nodes.
• If the configuration is not pushed to at least half of the servers of your cluster, then when the failed nodes come back online, they will have quorum on the previous configuration and the one they are running will be pushed to all the servers.
• In a two node cluster, the most recent configuration is always selected when resolving a
conflict.
• In a two node cluster, no decision is taken unless the peer server has its webservices available.
The first step is to get the configuration version from each server through a webservice call.
The results are then organized by version identifier. Should all alive servers run the same version, the state is considered healthy and nothing happens.
Then, should there be more than one version identifier across the alive servers, the algorithm
validates that there are at least 2 servers configured in the cluster. If there aren’t, then the most
recent version is pushed on the peer node.
After that, the algorithm looks at which version is on the most servers. In the event that the dead
servers are in higher number than the alive ones, the most recent version is taken. Otherwise, the
version that is present on the most servers will be selected.
When pushing a version to the other servers, if the current server has the most recent version or
For more information, please consult the mailing list archives or post your questions there. For details, see:
For any questions or comments, do not hesitate to contact us by writing an email to:
support@inverse.ca.
Hourly rates or support packages are offered to best suit your needs.
12.1. Glossary
• 'Alive quorum': An alive quorum is when more than 50% of the servers of the cluster are
online and reachable on the network (pingable). This doesn’t imply they offer service, but only
that they are online on the network.
• 'Hard-shutdown': A hard shutdown is when a node or a service is stopped without being able
to go through a proper exit cleanup. This can occur in the case of a power outage, hard reset
of a server or kill -9 of a service.
• 'Management node/server': The first server of a PacketFence cluster as defined in
/usr/local/pf/conf/cluster.conf.
• 'Node': In the context of this document, a node is a member of the cluster while in other
PacketFence documents it may represent an endpoint.
If you suspect that using ProxySQL causes issues in your deployment, you can revert to using haproxy-db by changing database.port in conf/pf.conf to 3306.
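In conf/pf.conf, this would look like the following (keeping the host value you already have):
[database]
port=3306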
Once that is changed on one of your cluster members, propagate your change using:
/usr/local/pf/bin/cluster/sync --as-master
/usr/local/pf/bin/pfcmd configreload hard
Additionally, you could change pfconfig's configuration to use haproxy-db as well, although its usage of the database is extremely light. Still, if you want to change it for pfconfig, edit conf/pfconfig.conf and change mysql.port to 3306. After making this change, restart pfconfig using systemctl restart packetfence-config. Note that this change must be done on all cluster members.
The VIP address of the cluster doesn't need to be allowed on your network devices.
In this procedure, the 3 nodes will be named A, B and C, and they are in this order in cluster.conf. When we reference their hostnames, we mean the hostnames defined in cluster.conf.
12.4.1. Backups
Re-importable backups will be taken during the upgrade process. We highly encourage you to
perform snapshots of all the virtual machines prior to the upgrade if possible.
First, we need to tell A and B to ignore C in their cluster configuration. In order to do so, execute the following command on A and B, replacing node-C-hostname with the actual hostname of node C:
Once this is done, proceed to restart the following services on nodes A and B one at a time. This will cause service failure during the restart on node A.
Then, we should tell C to ignore A and B in its cluster configuration. In order to do so, execute the following commands on node C, replacing node-A-hostname and node-B-hostname with the hostnames of nodes A and B respectively.
NOTE From this moment on, you will lose the configuration changes and data changes that occur on nodes A and B.
The commands above will make sure that nodes A and B will not be forwarding requests to C even if it is alive. Same goes for C, which won't be sending traffic to A and B. This means A and B will continue to have the same database information while C will start to diverge from it when it goes live. We'll make sure to reconcile this data afterwards.
Upgrade node C
From that moment, node C is standalone for its database. We can proceed to update the packages, configuration and database schema. In order to do so, apply the upgrade process described here on node C only.
Prior to migrating the service on node C, it is advised to run a checkup of your configuration to
validate your upgrade. In order to do so, perform:
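For example:
/usr/local/pf/bin/pfcmd checkup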
Review the checkup output to ensure no errors are shown. Any 'FATAL' error will prevent
PacketFence from starting up and should be dealt with immediately.
• Stop database:
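For example:
systemctl stop packetfence-mariadb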
Now, start the application service on node C using the instructions provided in Restart
PacketFence services section.
If your migration to node C goes wrong, you can fail back to nodes A and B by stopping all
services on node C and starting them on nodes A and B
On node C
On nodes A and B
Once you are feeling confident to try your failover to node C again, you can do the exact
opposite of the commands above to try your upgrade again.
If you are happy about the state of your upgrade on node C, you can move on to upgrading the
other nodes.
On node A
On node B
On nodes A and B
export UPGRADE_CLUSTER_SECONDARY=yes
systemctl restart packetfence-mariadb
NOTE It is important that you run the upgrade commands in the same shell you ran your export so that the environment variable is properly taken into consideration when the upgrade script executes.
You should now sync the configuration by running the following on nodes A and B, as sketched below.
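Assuming the cluster/sync tool supports the --from, --api-user and --api-password options, this could look like the following, where --from is the management IP of node C and the API credentials are the [webservices] credentials defined in pf.conf:
/usr/local/pf/bin/cluster/sync --from=NODE_C_MANAGEMENT_IP --api-user=packet --api-password=anotherMoreSecurePassword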
When you will re-establish a cluster using node C in the steps below, your environment will be
set in read-only mode for the duration of the database sync (which needs to be done from
scratch).
This can take from a few minutes to an hour depending on your database size.
We highly suggest you delete data from the following tables if you don’t need it:
You can safely delete the data from all of these tables without affecting the functionalities, as they are used for reporting and archiving purposes. Deleting the data from these tables can make the sync process considerably faster.
mysql -u root -p pf
MariaDB> truncate TABLE_NAME;
NOTE The steps in the next sections will cause brief service disruptions
Now that all the members are ready to reintegrate the cluster, run the following commands on all
cluster members
Now, stop packetfence-mariadb on node C, regenerate the MariaDB configuration and start it as
a new master:
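A possible sequence (pfcmd generatemariadbconfig is an assumption; the force-new-cluster option is the one shown earlier in this guide):
systemctl stop packetfence-mariadb
/usr/local/pf/bin/pfcmd generatemariadbconfig
systemctl set-environment MARIADB_ARGS=--force-new-cluster
systemctl restart packetfence-mariadb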
You should validate that you are able to connect to the MariaDB database even though it is in
read-only mode using the MariaDB command line:
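For example:
mysql -u root -p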
If it's not, make sure you check the MariaDB log (/usr/local/pf/logs/mariadb.log).
On each of the servers you want to discard the data from, stop packetfence-mariadb, destroy all the data in /var/lib/mysql and start packetfence-mariadb so it resyncs its data from scratch, as shown below.
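For example, on each such server:
systemctl stop packetfence-mariadb
rm -fr /var/lib/mysql/*
systemctl start packetfence-mariadb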
Should there be any issues during the sync, make sure you look into the MariaDB log
(/usr/local/pf/logs/mariadb.log)
Once both nodes have completely synced (try connecting to them using the MariaDB command line) and you have confirmed all members are joined to the MariaDB cluster, perform the following on node C
You can now safely start PacketFence on nodes A and B using the instructions provided in Restart
PacketFence services section.
The haproxy-admin service needs to be restarted manually on both nodes after all services have been restarted:
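Assuming the standard unit name, this would be:
systemctl restart packetfence-haproxy-admin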
You should now have full service on all 3 nodes using the latest version of PacketFence.
You can monitor the active TCP connections to MariaDB using this command and then
investigate the processes that are connected to it (last column):
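One possible way, assuming MariaDB listens on its default port 3306 (the last column of the output shows the owning process):
ss -tanp | grep ':3306'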
You can have an overview of all the current connections using the following MariaDB query:
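For example, using the information_schema.processlist table:
MariaDB> SELECT id, user, host, db, command, time, state, info FROM information_schema.processlist;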
And if you would like to see only the connections with an active query:
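For example, keeping only the rows that have a query attached:
MariaDB> SELECT id, user, host, db, command, time, state, info FROM information_schema.processlist WHERE info IS NOT NULL;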