
Swarm and Evolutionary Computation 89 (2024) 101615

journal homepage: www.elsevier.com/locate/swevo
Collaborative gas source localization strategy with networked nano-drones in unknown cluttered environments✩
Vu Phi Tran ∗, Matthew A. Garratt, Sreenatha G. Anavatti, Sridhar Ravi
School of Engineering and Technology, The University of New South Wales, Canberra, Australia
ARTICLE INFO
Keywords: Gas source localization; Evolutionary algorithms; Particle swarm optimization; Self-organizing systems; Adaptive control; Fuzzy neural network; Flight control system; Quadrotor UAV; Robotic swarms
ABSTRACT
This paper introduces a novel approach for improving gas source localization in dynamic urban environments, employing a swarm of nano-Crazyflie drones through a hybrid strategy that integrates Adaptive Robotic Particle Swarm Optimization (ARPSO) with Bidirectional Brain Emotional Learning (BBEL). The proposed method refines the ARPSO algorithm into two phases: an initial exploration phase and a subsequent seeking phase. The exploration phase activates when no gas is detected, seamlessly transitioning to the seeking phase upon gas detection. This facilitates efficient information exchange and desired velocity generation within the particle swarm. During the seeking phase, particles conduct measurements, share the global best position, intelligently navigate obstacles, and avoid collisions while directly identifying the global concentration field maximum. Unlike conventional methods, the ARPSO algorithm autonomously adapts its parameters online, mitigating algorithmic failures and local optima challenges. It also considers the limited communication of robots in complex environments, a factor often overlooked in recent methods. To enhance plume tracking robustness, facilitate the controller gain tuning process, and ensure system resilience against environmental disturbances and uncertainties, the BBEL method incorporates a sophisticated fuzzy neural network structure with reinforcement learning-based adaptation mechanisms. The stability of the closed-loop control system is rigorously proven using Lyapunov theory. Numerical simulations in complex urban scenarios validate the algorithm’s effectiveness, showcasing a significant 20% improvement in turnaround time and a flawless 100% success rate with no collisions, in addition to enhanced capabilities in handling uncertainties and disturbances compared to benchmarks like ARPSO-BBEL with unlimited communication range and no delayed communications (ARPSO-BBELU), Sniffy Bug (SB)-BBELU, and ARPSO-PIDU.
1. Introduction

In a world increasingly threatened by chemical, biological, radiological, and nuclear (CBRN) agents, rapid and precise detection of hazardous substances is emerging as an urgent priority [1–3]. Accidental or intentional dispersal of these substances poses severe risks to both human health and the environment, underscoring the importance of timely source identification and release scope estimation. This critical information not only guides the response of emergency services but also plays a pivotal role in the efforts of local authorities to mitigate the impact of disasters [4–6].
Gas source localization (GSL) spans the domains of source localization (SL), source term estimation (STE), and source inversion. While STE identifies both the location and strength of the source, SL focuses solely on determining the source’s location [3].
Traditionally, GSL has been approached through static sensor networks to monitor gas concentrations, activating alarms upon exceeding predetermined thresholds [7,8]. Nevertheless, these methods encounter limitations in high-temperature, high-humidity marine environments and often struggle with dead zones where gas leaks might remain undetected due to the unpredictable nature of gas diffusion [9]. Furthermore, achieving precise GSL remains a challenge for sensor networks, particularly in scenarios with extensive areas of interest, where deploying a sufficient number of sensors may not be practical [10].
In response to these challenges, recent years have witnessed a transformative shift in gas source identification. Moving away from static sensor networks, there has been the adoption of mobile sensor platforms capable of autonomously navigating complex and cluttered CBRN-contaminated environments. This paradigm shift promises to
✩ This research was funded by the Australian Defence Science Technology (DST) Group under the project ‘Autonomous Precision Access (APA): Resilient flight
control for trusted, robust, real-time adaptive control using Neuro-Fuzzy approaches’.
∗ Corresponding author.
E-mail address: phivus2@gmail.com (V.P. Tran).
https://doi.org/10.1016/j.swevo.2024.101615
Received 17 January 2024; Received in revised form 5 April 2024; Accepted 22 May 2024
Available online 12 June 2024
2210-6502/© 2024 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
overcome the limitations of static networks, offering a more agile and adaptable approach to achieving accurate GSL [11].
These mobile platforms autonomously localize the source of agent fields, employing advanced algorithms for collaborative gas information processing, identification of CBRN agent sources, and continuous surveillance [12]. Addressing these multifaceted demands requires a comprehensive solution encompassing various aspects including gas detection, area coverage, robust adaptive control, collision avoidance, navigation, and the efficient coordination of multiple robots [13,14].
Despite recent advancements, a common challenge in the efficient deployment of mobile sensors for gas source prediction is the prolonged exploration times associated with unmanned ground vehicles (UGVs) [15,16]. This hardware platform proves to be less suitable for tasks demanding rapid responses, such as GSL, where time is of the essence [13]. Additionally, while most gas source-seeking operations traditionally unfold in ideal, obstacle-free settings [3,15,17], recent studies have primarily employed a point-mass model and fixed-gain PID controllers without accounting for drone dynamics, environmental disturbances, or limited communications [5,13,18].
Therefore, this paper focuses on a new hybrid approach for GSL and adaptive control tailored for collaborative unmanned aerial vehicles (UAVs) in complex, unknown, and obstacle-laden environments. The emphasis is particularly on urban contexts characterized by restricted communication capabilities. In such challenging scenarios, the deployment of UAVs equipped with mobile airborne sensors plays a pivotal role in ensuring rapid and efficient responses. Leveraging UAVs to counter CBRN threats presents a twofold advantage. Firstly, their airborne nature renders them less susceptible to obstacles, enabling efficient navigation across challenging terrains. This quality is particularly crucial in densely cluttered areas, such as confined urban environments, where the use of small, maneuverable UAVs is imperative. Furthermore, UAVs, with their capacity to control both altitude and horizontal positioning, excel at mapping the three-dimensional structure of gas plumes. This capability gains particular significance when considering meteorological variables such as wind.

2. Related work

The field of diffuse gas detection and localization has recently experienced a profound evolution, driven by the integration of autonomous UAV robots. These UAV-based approaches can be broadly categorized into four core methodologies: biomimetic behavior-based techniques, model-based analysis, intelligent learning strategies, and swarm intelligence optimization.
Biomimetic behavior-based methods encompass a range of algorithms, including concentration gradient-based strategies [19], lobster heuristic algorithms inspired by biological behaviors [20], and the annealing algorithm [21]. In these techniques, a single UAV harnesses concentration measurements from various locations to compute the concentration gradient, ultimately tracing it to the gas source. However, it is essential to acknowledge that real-world gas plumes often exhibit turbulence, marked by vortices, resulting in non-smooth concentration gradients. Furthermore, biomimetic methods have predominantly concentrated on single-agent deployments, facing challenges in developing efficient multi-robot movement strategies, which can affect localization efficiency [1].
Model-based analysis, on the other hand, centers on wind field and plume modeling to predict gas source locations and dispersion paths [15]. The work of Marjovi and Marques introduced a decentralized boundaries algorithm, demonstrating its effectiveness through both simulations and real-world experiments [22]. Wiedemann et al. employed partial differential equations to model gas diffusion and introduced probabilistic methods for identifying these equations under sparsity conditions [23]. Their approach, employing factor graphs and message-passing algorithms for multi-robot GSL and quantification, exhibited success in experimental settings. However, model-based approaches entail substantial computational resources, potentially restricting their implementation on older or less powerful hardware platforms, particularly in environments characterized by complex shapes, obstacles, and intricate airflow patterns.
Meanwhile, reinforcement and evolutionary learning approaches have been actively explored, primarily within simulated environments [13,24]. The transition of these learned policies to obstacle-free real-world settings has been relatively limited [25]. In particular, there have been some successful instances where policies trained for tasks such as light-seeking were transferred from simulations to real environments featuring simple obstacles [26]. However, a key limiting factor for simulation-based learning approaches has been the time-intensive nature of gas dispersion modeling, requiring significant domain knowledge for accurate representation. Additionally, the availability of diverse training environments has been limited, despite the need for a wide range of scenarios for comprehensive training [27].
On a promising note, swarm intelligence optimization has emerged as a highly efficient approach that harnesses the collective behavior of swarm robots and distributed algorithmic strategies to identify and locate gas sources while imposing significantly lower computational demands. This approach has gained significant attention due to its superior search efficacy and robustness compared to traditional gas search algorithms [5,28,29]. Notable swarm intelligence algorithms, such as PSO, genetic algorithms, and ant colony algorithms, have demonstrated their effectiveness in diverse applications [5,30–32]. Feng et al. successfully established indoor pollutant source leakage models, implementing PSO for multi-robot active olfactory search [33]. Jain et al. employed PSO and group strategies to locate multiple gas sources in unknown environments [34], and Villarreal et al. devised a heuristic GSL algorithm based on genetic algorithms, successfully replicating chemotaxis behavior [35].
The application of the PSO technique holds promise as it effectively mitigates challenges associated with local gas concentration maxima within intricate environments [5,13]. An exemplary illustration is the Evolved Sniffy Bug-Robotic PSO (RPSO) algorithm [13], meticulously designed for autonomous GSL in obstacle-cluttered environments. To optimize the performance of the RPSO and Sniffy bug algorithms, swarming parameters are finely tuned through the application of a Genetic Algorithm (GA), consistently showcasing its superiority over manually adjusted settings. The RPSO algorithm plays an essential role in GSL, facilitating the exchange of critical information among swarm particles and generating strategic waypoints during the search process [13].
Furthermore, previous research has successfully exploited RPSO in the context of relatively large outdoor quadcopters, equipped with advanced LiDAR and GPS navigation systems [36,37]. However, the deployment of a swarm of nano-drones (NDs) presents an ideal solution for GSL within expansive and densely cluttered indoor environments. The diminutive size of these drones equips them with the agility to navigate through narrow and intricate spaces. Moreover, their swarm configuration unlocks unprecedented efficiency in coverage and rapid identification of gas sources [13].
The aspiration to establish fully autonomous gas-seeking swarms using NDs introduces a myriad of challenges. Designing controllers for autonomous UAVs is inherently demanding, particularly when facing extreme attitudes [38], strong disturbances like wind gusts, or uncertain system models [39]. The manual tuning of UAV control loop gains is a time-intensive process, particularly when adapting to dynamic environments or varying payloads [40]. This challenge is exacerbated in swarm scenarios, where managing controller gains for multiple NDs simultaneously becomes cumbersome. Close-proximity flight enables the swarm to identify gas sources more rapidly than swarms maintaining large distances from each other. However, it introduces complications in the aerodynamic interactions arising from the small distances between NDs. These interaction effects are challenging to
model or measure [41]. Additionally, NDs may face challenges in computational overhead and real-time processing due to their low-powered nano microprocessor size. In such contexts, the need for robust adaptive control systems with fast learning and low computational complexity becomes paramount, ensuring precise convergence on gas sources in intricate and congested settings when individual swarm members encounter unique environmental or system dynamic changes [42,43].
Meanwhile, recent adaptive RPSO methods, despite their effectiveness, often rely solely on simplified point-mass models, offline parameter learning, and fixed-gain inner-loop Single Input Single Output (SISO) Proportional–Integral–Derivative (PID) controllers. This reliance may inadvertently overlook the intricacies of drone dynamics and environmental disturbances [1,5,18]. Additionally, these studies tend to share global information without considering the restricted communication resources of the robots [5,13]. This oversight proves impractical in real-world scenarios.
The Multilayer Artificial Neural Network (MLANN) is a well-established brain-inspired architecture known for its self-learning, self-adaptive, and universal function approximation capability [44]. However, MLANNs encounter challenges related to the curse of dimensionality, leading to low training speed, high computational complexity (CC), and a slower convergence rate as the number of neurons increases [44,45]. This limitation impedes their applicability to real-time control requirements.
In contrast, Bidirectional Brain Emotional Learning (BBEL) represents a single-layered architecture that emulates the emotional learning mechanism in the brain, particularly influenced by the amygdala and orbitofrontal cortex. Notably, BBEL exhibits superior features, including fast learning and quick reactions, with a feedforward CC on the order of O(n) [46]. Inspired by Morén et al.’s amygdala-orbitofrontal model [47], BBEL-based models emphasize the crucial interaction between the amygdala and orbitofrontal cortex in processing emotional stimuli swiftly and accurately [48].
Importantly, the BBEL-based models, which replicate the limbic system’s rapid emotional learning and quick responses, have demonstrated success across various real-world platforms. These include larger-sized quadrotors and flapping wing micro aerial vehicles, leveraging a low-powered Arduino Due microprocessor board (ARM Cortex-M3 CPU with a flash memory of 512 KB) [48,49].
To address critical gaps, our study introduces a BBEL controller and an online adaptive RPSO approach, providing a sophisticated solution for autonomous gas-seeking swarm robotics, especially in scenarios with communication constraints. To the best of our knowledge, these multifaceted challenges remain largely unexplored within the existing literature, emphasizing the pioneering nature of our work in addressing these critical issues and advancing the field of autonomous gas-seeking nano-Crazyflie drone swarms in complex, dynamic, unknown, and cluttered environments. We deliberately select nano-Crazyflie drones for their compact size, enabling navigation through narrow spaces and a low impact on the gas plume due to a small downwash wake. Moreover, their ability to execute the BBEL controller in real-time using their low-powered hardware (ARM Cortex-M4 CPU with a flash memory of 1 MB), surpassing the computational capabilities of Arduino Due, underscores their suitability for this application. Motivated by the research gaps in the field of gas source localization with mobile NDs, our contributions can be summarized as follows:

1. In this study, the first successful robotic demonstration of a swarm of autonomous nano-Crazyflie drones is presented to efficiently locate a gas source within unknown urban environments, including uncertainties and obstacles.
2. In contrast to existing RPSO strategies focused on waypoint tracking, offline PSO parameter evolution, ideal communication range, and ultra-reliable communications [5,13,50], our novel adaptive RPSO (ARPSO) method adopts a distinct approach. It directly generates and tracks desired velocity setpoints, resulting in a more rapid response and smoother trajectory for NDs. Additionally, ARPSO incorporates online tuning of PSO parameters, tailored for GSL applications in complex and uncertain urban environments. Empowering ND particles to autonomously adapt to their surroundings, ARPSO dynamically adjusts parameters in real-time through an online evolutionary algorithm, ensuring agility and mitigating the risk of being trapped by local optima or impeded by obstacles. The proposed algorithm also accounts for the limited communication resources of the robots.
3. This is the first time that the robust and adaptive BBEL algorithm has been introduced to govern a swarm of Crazyflie drones. This groundbreaking application not only effectively facilitates controller tuning but also addresses the intricate dynamics of individual drones and their complex aerodynamic interactions within a swarm setting.
4. To evaluate the effectiveness of our proposed ARPSO-BBEL approach, we conducted a thorough comparative study benchmarked against state-of-the-art methods, including ARPSO-BBEL with the unlimited communication range and ultra low-latency communications (ARPSO-BBELU), ARPSO-PIDU, and Sniffy Bug (SB)-BBELU [13]. This evaluation, carried out in a realistic simulated urban environment, considers crucial performance metrics such as fitness threshold, total exploration time, target distance error, success rate, and collision or crash numbers.

The remainder of this article is structured as follows: Section 3 provides essential background information concerning the PSO algorithm and emotional learning. The primary contribution of our work is explained in Section 4. Simulation and flight results are described in Section 5, followed by the conclusion given in Section 6.

3. Background

This section is structured into three parts. In Section 3.1, we present the conventional PSO technique. Next, Section 3.2 details the latest RPSO algorithm designed to address the challenges of GSL in complex and cluttered environments. The third part, elaborated in Section 3.3, delivers a comprehensive review of the BBEL theory.

3.1. Particle swarm optimization framework

PSO is a heuristic optimization technique inspired by the collective intelligence of biological swarms, which makes it different from gradient-dependent methods [19]. This makes PSO well-suited for addressing problems characterized by discontinuities, multimodality, and non-convexity. It excels in CBRN scenarios where fitness functions exhibit high fluctuations and multiple modes, positioning it favorably in comparison to genetic algorithms regarding its capacity to uncover optimal solutions [13]. In its original form, PSO deploys particles within a problem space. Each particle, represented as a robotic agent in our application, presents a potential solution within the $M$-dimensional space.
In this context, these agents are described by 2-dimensional vectors, representing their positions in the search space using $x$ and $y$ coordinates $p_i = [p_{ix}, p_{iy}]^T$. PSO guides these robotic agents in their quest for optimal solutions by dynamically adjusting their positions through the manipulation of their velocities $\upsilon_i = [\upsilon_{ix}, \upsilon_{iy}]^T$, as governed by specific formulas:
$$\upsilon_i(t+1) = \omega(t)\,\upsilon_i(t) + \gamma_p(t)\,r_1\big(\hat{p}_i(t) - p_i(t)\big) + \gamma_g(t)\,r_2\big(p_g(t) - p_i(t)\big), \tag{1}$$
$$p_i(t+1) = p_i(t) + \upsilon_i(t), \quad i = (1, \dots, n_p) \tag{2}$$
where $n_p$ is the number of particles in the swarm. Random values $r_1$ and $r_2$ are chosen from the range between 0 and 1. The variable $\omega(t)$ is the inertia weight at the $t$th step. Additionally, $\hat{p}_i(t)$ is the position
where the $i$th particle obtains its personal best fitness value, while $p_g(t)$ is the position associated with the best fitness function value sensed by the entire swarm. The parameters $\gamma_p(t)$ and $\gamma_g(t)$ are commonly called the cognitive and social weights, respectively, at the $t$th step. The balance between $\gamma_p(t)$ and $\gamma_g(t)$ directly impacts how the algorithm weighs personal and collective experiences in determining the swarm’s future search directions. This process is key to PSO’s operation, ensuring that the swarm of agents collaboratively refines its positions, ultimately converging towards optimal solutions.

3.2. Robotic particle swarm optimization algorithm

In recent years, an increasing number of researchers have addressed the complex multi-robot GSL problem through intelligent swarm robotic behaviors, leveraging either the standard PSO or its various modifications [5,13,34]. This method incorporates the directional guidance provided by PSO with a multi-bug robot algorithm, enabling precise navigation towards identified directions. The primary goal is to efficiently identify multidimensional spatial locations that maximize gas concentration. Within the collaborative swarm, ND particles dynamically exchange information, collectively contributing to the generation of waypoints, each computed using Eq. (3).
$$g_i(t) = p_i(t) + \upsilon_i(t), \tag{3}$$
where $g_i(t)$ is the goal waypoint of agent $i$, in iteration $t$, $p_i(t)$ is the current position, and $\upsilon_i(t)$ is the velocity command vector generated by the RPSO.
The velocity command vector $\upsilon_i$ is dictated by the particle’s mode, distinguishing between the exploration phase, initiated when no gas is detected, and the seeking/exploitation phase, activated when at least one particle detects gas. During exploration, $\upsilon_i$ is computed using Eq. (4), integrating a dynamic goal ($g_i$) from the previous iteration and a random point ($r_i$) within a designated square around the particle’s current position, introducing a new random point for each iteration governed by a uniform distribution $U[0, 1]$. Scalar parameters, including $\omega'$ and $\gamma_r$, significantly influence agent behavior. Initialization of $g_i$ and $\upsilon_i$ occurs randomly within a square of size $r$ around the particle at the simulation’s outset. Eq. (4) emphasizes that the new velocity vector is a weighted sum, incorporating a vector towards the previously computed goal and another towards a randomly determined point.
$$\upsilon_i(t) = \overbrace{\omega'\big(g_i(t-1) - p_i(t)\big)}^{\text{WayPointGoal}} + \overbrace{\gamma_r\big(r_i(t) - p_i(t)\big)}^{\text{RandomGoal}} \tag{4}$$
Upon detecting/smelling gas by one of the particles, and when the gas concentration exceeds a predefined threshold, the waypoints are updated as in Eq. (5). Here, $\hat{\mathbf{p}}_i$ represents the position at which particle $i$ has detected the highest concentration up to iteration $t$, and $\mathbf{p}_g$ denotes the swarm’s best-seen position up to iteration $t$. The random values $r_1$ and $r_2$ chosen between 0 and 1 are generated for each iteration for each agent. The weighting factors $\gamma_p$, $\gamma_g$, $\gamma_s$, and $\gamma_o$ are scalars impacting the particle’s behavior. The vector towards the next waypoint is a weighted sum of the vectors towards its previously computed waypoint, the best-seen position by the particle, and the best-seen position by the swarm. This algorithm ensures that the swarm continues to explore while converging towards a high concentration of gas.
$$\upsilon_i(t) = \overbrace{\omega\big(g_i(t-1) - p_i(t)\big)}^{\text{WayPointGoal}} + \overbrace{\gamma_p r_1\big(\hat{\mathbf{p}}_i(t) - \mathbf{p}_i(t)\big)}^{\text{Cognitive}} + \overbrace{\gamma_g r_2\big(\mathbf{p}_g(t) - \mathbf{p}_i(t)\big)}^{\text{Social}} + \overbrace{\gamma_s \mathbf{F}_s}^{\text{Separation}} + \overbrace{\gamma_o \mathbf{F}_o}^{\text{Avoidance}}, \tag{5}$$
where $\mathbf{F}_o$ represents a virtual force intended to navigate the Crazyflies away from nearby obstacles, while $\mathbf{F}_s$ signifies another virtual force directing the Crazyflies to steer clear of one another. Activation of each force occurs when the Crazyflie drone enters the predefined corresponding range.
Finally, a waypoint tracking algorithm governs the particles’ movements towards the generated waypoints, with three defined rules: (1) line following to the designated waypoint when no obstacles are detected, (2) wall following movement to avoid obstacles if detected within a distance of $d_o$, and (3) Attraction–Repulsion–Swarming movement to avoid other closely detected particles/Crazyflies when the particle detects at least one other particle within a distance of $d_s$ while tracking its goal waypoint. Each agent computes a new waypoint if it reaches within the expected distance $d_t$ from the previous goal.
Recent RPSO solutions, while successful in source-searching swarm behavior for natural gas leakage, encounter delays and inefficiencies with waypoint tracking, causing sluggish responses and non-smooth trajectories. This approach not only compromises system performance but also escalates collision risks and restricts adaptability in cluttered urban environments. Our proposed ARPSO method directly addresses these challenges by generating and tracking velocity setpoints, ensuring a more responsive and smoother GSL process.

3.3. Bidirectional Fuzzy brain emotional learning algorithm

The control system developed in [51], known as Bidirectional Brain Emotional Learning (BBEL), exhibits rapid adaptability in controlling nonlinear systems. Within this research, the BBEL control system, which is based on the Fuzzy Neural Network (FNN) structure, is employed for further investigation. The controller’s neural network configuration aligns with the number of fuzzy membership functions assigned to an input. It is thoughtfully designed so that each neuron’s weight adaptation depends on the strength of the connected fuzzy layer, as illustrated in Fig. 1. Each fuzzy layer encompasses a designated range, facilitated by Gaussian membership functions, referred to as the ‘fuzzy range’, established based on the anticipated operational boundaries of the system.
The BBEL control system employed in this study draws upon the principles outlined in [49]. Fig. 1 provides a visual representation of the architecture, delineating five distinct stages with specific functions: sensory input, sensory cortex, amygdala, orbitofrontal cortex network, and output. Additionally, the system integrates a reward signal and adheres to the weight adaptation laws outlined below:
$$I = [i_1, i_2, \dots, i_j, \dots, i_n]^T \in \mathbb{R}^{n}, \tag{6}$$
$$h_{jk} = \exp\left[-\frac{1}{2}\left(\frac{i_j - \mu_{jk}}{\sigma_{jk}}\right)^2\right], \quad j = (1, \dots, n), \quad k = (1, \dots, m), \tag{7}$$
$$A = \sum_{j=1}^{n}\sum_{k=1}^{m} \zeta_{jk} h_{jk}; \quad O = \sum_{j=1}^{n}\sum_{k=1}^{m} \delta_{jk} h_{jk}, \tag{8}$$
$$U = A - O; \quad R = \sum_{j=1}^{n} (b_j i_j) + (c\,U), \tag{9}$$
where $n$ represents the number of input states, and $m$ denotes the number of layers. $I$ serves as the input state vector. $\mu_{jk}$ and $\sigma_{jk}$ are the mean and standard deviation (width), respectively, of the Gaussian functions for the $j$th input and the $k$th layer. $\zeta_{jk}$ and $\delta_{jk}$ represent the weights of the amygdala and the orbitofrontal cortex networks. The output vector $U$ reflects the response of the BBEL controller. While $A$ and $O$ denote the output vectors of the amygdala network and the orbitofrontal cortex network, respectively, the learning rates $\alpha$ and $\beta$ of the amygdala and orbitofrontal cortex adaptation algorithm for the $j$th input are introduced along with the reward function $R$ and the gain coefficients $b_j$ and $c$ for the input and output layers.
Moreover, the adaptation laws governing the amygdala $\Delta\zeta_{jk}$ and orbitofrontal cortex $\Delta\delta_{jk}$ network weights are captured in the following
Fig. 1. BBEL control structure.
equation:
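$$\Delta\zeta_{jk} = \alpha\,[h_{jk}(R - A)]; \quad \Delta\delta_{jk} = \beta\,[h_{jk}(U - R)]. \tag{10}$$
The weights at the next iteration are subsequently updated as:
$$\zeta_{jk}(t+1) = \Delta\zeta_{jk}(t) + \zeta_{jk}(t); \quad \delta_{jk}(t+1) = \Delta\delta_{jk}(t) + \delta_{jk}(t). \tag{11}$$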
Fig. 2. 3D plot of a Gaussian function with a 2D domain [GaussianFunction].
The proposed BBEL controller incorporates a robust control approximator $U_{rp}$ to address potential approximation errors, defined as:
$$U_{rp}(t) = \frac{(\gamma_c^2 + 1)\,e(t)}{2\gamma_c^2}, \tag{12}$$
where 𝛾𝑐 is the prescribed positive attenuation constant.
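As a rough illustration of how Eqs. (6)-(12) fit together, the following Python sketch runs one forward pass and one weight update per call. It is a minimal sketch under stated assumptions, not the authors' implementation: the class name `BBELSketch`, the zero initial weights, the scalar output, and the reading of $e(t)$ as a tracking error are all assumptions made here for illustration. Because this section does not state how $U$ and $U_{rp}$ are combined, the two terms are returned separately.

```python
import numpy as np


class BBELSketch:
    """Illustrative single-output BBEL block (hypothetical class name)."""

    def __init__(self, mu, sigma, b, c, alpha, beta, gamma_c):
        self.mu = np.asarray(mu, dtype=float)        # Gaussian centres, shape (n, m)
        self.sigma = np.asarray(sigma, dtype=float)  # Gaussian widths, shape (n, m)
        self.zeta = np.zeros_like(self.mu)           # amygdala weights (assumed zero at start)
        self.delta = np.zeros_like(self.mu)          # orbitofrontal weights (assumed zero at start)
        self.b = np.asarray(b, dtype=float)          # input gains b_j, shape (n,)
        self.c = c                                   # output gain c
        self.alpha = alpha                           # amygdala learning rate
        self.beta = beta                             # orbitofrontal learning rate
        self.gamma_c = gamma_c                       # attenuation constant of Eq. (12)

    def step(self, inputs, error):
        i = np.asarray(inputs, dtype=float)
        # Eq. (7): Gaussian membership of input j in fuzzy layer k
        h = np.exp(-0.5 * ((i[:, None] - self.mu) / self.sigma) ** 2)
        A = float(np.sum(self.zeta * h))    # Eq. (8): amygdala output
        O = float(np.sum(self.delta * h))   # Eq. (8): orbitofrontal output
        U = A - O                           # Eq. (9): controller response
        R = float(self.b @ i) + self.c * U  # Eq. (9): reward signal
        # Eqs. (10)-(11): weight increments added to the current weights
        self.zeta += self.alpha * h * (R - A)
        self.delta += self.beta * h * (U - R)
        # Eq. (12): robust term; `error` is assumed to be the tracking error e(t)
        U_rp = (self.gamma_c ** 2 + 1.0) * error / (2.0 * self.gamma_c ** 2)
        return U, U_rp


# Two inputs (n = 2), three fuzzy layers per input (m = 3); all values are illustrative.
ctrl = BBELSketch(mu=[[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]], sigma=np.ones((2, 3)),
                  b=[1.0, 0.5], c=0.1, alpha=0.1, beta=0.05, gamma_c=2.0)
U1, U_rp1 = ctrl.step([1.0, 0.0], error=0.4)
U2, _ = ctrl.step([1.0, 0.0], error=0.4)
```

With zero initial weights the first response `U1` is zero, and the response becomes non-zero only after the reward signal has driven the first weight update, which is the adaptation behavior Eqs. (10)-(11) describe.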
4. Proposed gas source localization method

4.1. Mathematical model

The gas plume source is accurately modeled by a 2D elliptical Gaussian function, illustrated in Fig. 2. This Gaussian function is characterized by parameters $(p_{x0}, p_{y0})$ denoting the gas plume center, amplitude/height $H$ representing the peak, and coefficients $b_1$, $b_2$, and $b_3$ determining the spread along the $x$ and $y$ axes. The objective function of the PSO is to maximize the gas concentration detected by Crazyflies, accomplished by leveraging the positional difference between the Crazyflies and the gas source, as expressed in Eq. (13).
$$\max f(x, y) = H \cdot \exp\Big(-\big(b_1 (p_x - p_{x0})^2 + 2 b_2 (p_x - p_{x0})(p_y - p_{y0}) + b_3 (p_y - p_{y0})^2\big)\Big), \tag{13}$$
where
$$b_1 = \frac{\cos(\kappa)^2}{2\varsigma_x^2} + \frac{\sin(\kappa)^2}{2\varsigma_y^2}, \quad b_2 = \frac{\sin(2\kappa)}{4\varsigma_x^2} - \frac{\sin(2\kappa)}{4\varsigma_y^2}, \quad b_3 = \frac{\sin(\kappa)^2}{2\varsigma_x^2} + \frac{\cos(\kappa)^2}{2\varsigma_y^2}. \tag{14}$$
Here, $\varsigma_x$ and $\varsigma_y$ denote the spreads along the $x$ and $y$ axes of the blob, respectively, while $\kappa$ represents a positive counterclockwise rotational angle.

4.2. Improved robotic particle swarm optimization for multi-nano UAV gas source localization

Similar to the RPSO algorithms outlined in Section 3.2, the improved RPSO algorithm determines the velocity command vector $\upsilon_i$ based on the particle’s mode, categorized as exploration or seeking. However, during exploration, $\upsilon_i$ is defined as:
$$\upsilon_i(t+1) = \frac{\big(\upsilon_i(t) \cap [0, 1]\big)\,\upsilon}{\sqrt{2}}, \tag{15}$$
where $\upsilon$ represents the maximum velocity of each drone.
Upon gas detection, the particle velocity is updated using Eq. (16). The formula includes cognitive and social components, along with forces for separation ($\mathbf{F}_s$) and avoidance ($\mathbf{F}_o$).
$$\boldsymbol{\upsilon}_i(t+1) = \overbrace{\omega\,\boldsymbol{\upsilon}_i(t)}^{\text{Inertia}} + \overbrace{\gamma_p r_1\big(\hat{\mathbf{p}}_i(t) - \mathbf{p}_i(t)\big)}^{\text{Cognitive}} + \overbrace{\gamma_g r_2\big(\mathbf{p}_g(t) - \mathbf{p}_i(t)\big)}^{\text{Social}} + \overbrace{\gamma_s \mathbf{F}_s}^{\text{Separation}} + \overbrace{\gamma_o \mathbf{F}_o}^{\text{Avoidance}}. \tag{16}$$
Here, $\hat{\mathbf{p}}_i$ denotes the personal best position of agent $i$. When an agent $i$ detects another agent within distance $d_s$, it deploys a repulsion force $\mathbf{F}_s$ to avoid collisions. This virtual force is active only when another agent is within range and is terminated otherwise. The resultant commanded velocity vector is the sum of repulsive forces, steering the agent away from low-laser readings and
neighboring agents. The magnitude of this vector is inversely scaled to $d_s$, as outlined in Eq. (17).
$$F_{s_i} = \sum_{j=1,\, j \neq i}^{N_s} \begin{cases} 0, & \text{if } |p_i - p_j| > d_s, \\ \eta\left(\dfrac{1}{|p_i - p_j|} - \dfrac{1}{d_s}\right)\dfrac{p_i - p_j}{|p_i - p_j|}, & \text{if } |p_i - p_j| \le d_s \end{cases} \tag{17}$$
where $N_s$ represents the total number of nearby Crazyflies, and $N_r$ is the total number of Crazyflies involved in the GSL task. The parameter $\eta > 0$ adjusts the intensity of the repulsive force.
The obstacle avoidance force $F_o$ for agent $i$, triggered upon detecting a nearby obstacle, is computed using an approach similar to the separation force.

4.3. PSO parameter adaptation mechanisms

particle swarm’s convergence to the global optimum. Furthermore, $\gamma_o$ is directly related to the variations in $\gamma_p$ and $\gamma_g$.
An interesting observation is the similarity between variations in acceleration constants and the population’s average velocity during the adaptation of the PSO inertia weight, as discussed in Section 4.3.1. Consequently, the formulation of acceleration constants should be tailored based on the actual average velocity of the particle swarm:
$$\gamma_p(t) = \gamma\,\frac{\upsilon_{avg}}{\upsilon}; \quad \gamma_g(t) = \gamma\left(1 - \frac{\upsilon_{avg}}{\upsilon}\right); \quad \gamma_s(t) = \gamma_o(t) = \gamma\,\upsilon_{avg}, \tag{19}$$
where $\gamma$ is a positive average of acceleration constants typically in the range [0.4, 0.8], $\upsilon$ is the maximum velocity of each drone, and $\upsilon_{avg}$ denotes the flying average velocity for a particle population at the $t$th iteration, derived as in Eq. (20):
$$\upsilon_{avg}(t) = \frac{1}{n_p}\sum_{i=1}^{n_p}\sqrt{\upsilon_{ix}(t)^2 + \upsilon_{iy}(t)^2}. \tag{20}$$
In the context of PSO, the effectiveness of robotic PSO heavily relies
4.4. Bidirectional Fuzzy brain emotional learning algorithm for the velocity
on parameters such as inertia weight, social and cognitive coefficients,
control loops
and repulsion force sensitivity. Traditionally, these parameters are
manually fine-tuned or optimized offline using methods like Genetic
The BBEL control system is designed to track the desired time-
Algorithms (GA) or Least Mean Square (LMS). Suboptimal choices in
varying Crazyflie velocities derived from the ARPSO while accounting
these parameters can result in algorithmic failures, susceptibility to
for system uncertainties. This involves the control of attitude, linear
local optima, or slow convergence rates [13]. Consequently, enhancing
velocities, and altitude for nano Crazyflie drones. Attitude control
convergence speed and mitigating the risk of falling into local optima
includes roll, pitch, and yaw angle control, while altitude and linear
have emerged as critical objectives in PSO research. To address these
velocities involve 𝑧, 𝜐𝑥 , and 𝜐𝑦 control. The control architecture of the
challenges, our proposed Adaptive PSO (APSO) integrates a systematic
proposed approach, explored in this paper, is illustrated in Fig. 3.
online parameter adaptation scheme, aiming to concurrently accelerate
Each velocity controller takes the velocity error and its correspond-
convergence and mitigate local optima risks.
ing velocity derivative error as input, while each attitude controller is
fed with the attitude error and its corresponding attitude derivative
4.3.1. Inertia weight: 𝜔
error. For the 𝑧 control, the altitude controller receives the inputs of
The inertia weight parameter, as introduced in [52], plays a crucial
the altitude error and its respective derivative. The ultimate control
role in PSO dynamics. This foundational work demonstrated that dis- ̇ pitch rate (𝜙), and yaw rate
signals comprise thrust (𝑧),̇ roll rate (𝜃),
tinct behaviors emerge with a range of time-varying 𝜔 values. A high
(𝜓),
̇ respectively.
inertia weight 𝜔 often leads to weak exploration, while a low value
can gradually reduce particle speed or even stop moving, potentially
4.5. Stability analysis
trapping the algorithm in local optima.
To obtain a better balance in search ability, we employ a sigmoid-
Consider a second-order nonlinear system expressed by the follow-
decreasing function. This function ensures that the particle swarm
ing equations:
maintains high speed initially, slows down in the middle stages for effi-
cient convergence to the global optimum, and ultimately converges at ̇ = 𝑓 (𝜐(𝑡)) + 𝑔(𝜐(𝑡)) 𝑈 (𝑡) + 𝑐(𝜐(𝑡), 𝑡),
𝜐(𝑡) (21)
a controlled speed in the later stages. The sigmoid-decreasing function
where 𝜐(𝑡)
̇ ∈ 𝑛 ,
𝑐(𝜐(𝑡), 𝑡) ∈ 𝑛
= 𝛥𝑓 (𝜐(𝑡)) + 𝛥𝑔(𝜐(𝑡)) + 𝑑(𝑡), and 𝑑(𝑡) ∈
for the inertia weight 𝜔 is described as follows:
𝑛 with ‖𝑑‖1 < 𝑈𝑟𝑝 represent the system output, system and mod-
𝜔−𝜔 eling uncertainties, and disturbances, respectively. Functions 𝑓 (𝜐(𝑡))
𝜔(𝑡) = + 𝜔, (18)
1 + exp ( 2 𝜀 𝑡
− 𝜀) and 𝑔(𝜐(𝑡)) indicate smooth, non-linear (uncertain) continuous functions
𝑡
assumed to be bounded within known constraints. The nominal system
where 𝑡 is the current iteration of the algorithm, 𝑡 is the predefined is described by:
maximum number of iterations, and 𝜀 is the control coefficient adjust-
ing the speed change. The minimum and maximum weights 𝜔 and 𝜔 ̇ = 𝑓𝑜 (𝜐(𝑡)) + 𝑔𝑜 𝑈 (𝑡) + 𝑐(𝜐(𝑡), 𝑡),
𝜐(𝑡) (22)
are typically set to 0.3 and 0.6, respectively. where 𝑓𝑜 (𝜐(𝑡)) and 𝑔𝑜 denote the nominal 𝑓 (𝜐(𝑡)) and 𝑔(𝜐(𝑡)). It is
assumed that 𝑔𝑜 > 0, and the controllability of the nonlinear system
4.3.2. Acceleration constants for cognitive, social, separation, obstacle in Eq. (22) is presumed, along with the existence of 𝑔𝑜−1 .
avoidance behaviors: 𝛾𝑝 , 𝛾𝑔 , 𝛾𝑠 , 𝛾𝑜 The control objective is to generate a system in which the output
According to [53], the position 𝑔𝑖 (𝑡) of each particle in the popu- 𝜐(𝑡) effectively tracks the desired trajectory 𝜐̃ (𝑡). This system tracking
𝛾 𝑝 +𝛾 𝑝
lation converges to 𝑝 𝛾 𝑖 +𝛾𝑔 𝑔 . This convergence behavior implies that, error is formally defined as:
𝑝 𝑔
during a significant iteration 𝑡, the particle positions tend to align
𝑒(𝑡) = 𝜐̃ (𝑡) − 𝜐(𝑡). (23)
closely with the lines connecting the global optimum to the local
optimum. Hence, in the early evolution stage, the optimal value of the The first derivative of the system tracking error is:
individual particle emerges as a significant parameter for guiding the
convergence towards the global optimum. ̇ = 𝜐̃̇ (𝑡) − 𝜐(𝑡).
𝑒(𝑡) ̇ (24)
However, if 𝛾𝑝 remains consistently high for all interactions 𝑡, the Substituting from Eq. (22) into Eq. (23), we obtain:
optimal position of the particle swarm would generally deviate from
the global optimum of the target function (Eq. (13)). Therefore, in the ̇ = 𝑓𝑜 (𝜐(𝑡)) + 𝑔𝑜 𝑈 (𝑡) + 𝑐(𝜐(𝑡), 𝑡) − 𝜐(𝑡).
𝑒(𝑡) ̇ (25)
initial stage of PSO, 𝛾𝑝 should be set to a larger value, while 𝛾𝑔 should
If 𝑐(𝜐(𝑡), 𝑡) is zero or precisely known, an ideal controller can be
be smaller to enhance local optimization speed. As PSO approaches
designed as:
completion, the focus should shift to the global optimum. During this
phase, 𝛾𝑝 should be smaller, while 𝛾𝑔 should be larger, facilitating the 𝑈 ∗ (𝑡) = 𝑔𝑜−1 [𝜐(𝑡)
̇ − 𝑓𝑜 (𝜐(𝑡))𝑧 − 𝑐(𝜐(𝑡), 𝑡) − 𝜄𝑇 𝑒(𝑡)], (26)
where the feedback gain matrix, denoted as $\iota \in \mathbb{R}^{n \times m}$, is designed with values aligning with the coefficients of a Hurwitz polynomial. A Hurwitz polynomial has all its roots located in the open left half of the complex plane, thereby guaranteeing $\lim_{t \to \infty} \|e\| = 0$.

Fig. 3. The ARPSO-BBEL control structure used for the multi-robot-based gas source localization.

However, due to the presence of non-zero and generally unknown uncertainties $c(\upsilon(t), t)$, an ideal controller becomes impractical. To address this, we propose a practical BBEL control strategy tailored for achieving desired control outcomes in nonlinear drone systems with uncertainties.

An optimal BBEL control, denoted as $U_o$, is designed to learn the ideal controller $U^*$, expressed as follows:

\[ \begin{aligned} U_o(t) &= U^*(i, f_i, \zeta^*, \delta^*)(t) + \varepsilon, \\ &= A^* - O^* + \varepsilon, \\ &= f_1 l_A \zeta^* - f_2 l_O \delta^* + \varepsilon, \end{aligned} \tag{27} \]

where $f_1 = R - a$, $l_A = [h_{11}, \ldots, h_{jk}]$, $\zeta^* = [\zeta_{11}^*, \ldots, \zeta_{jk}^*]$, $f_2 = U - R$, $l_O = [h_{11}, \ldots, h_{jk}]$, $\delta^* = [\delta_{11}^*, \ldots, \delta_{jk}^*]$. Here, $\varepsilon$ represents a reconstructed modeling error, and $\zeta^*$ and $\delta^*$ are the optimal weight vectors of $\zeta$ and $\delta$, respectively.

The control law of the BBEL scheme is assumed to take the form:

\[ U_c(t) = \hat{U}(i, f_i, \hat{\zeta}, \hat{\delta})(t) = f_1 l_A \hat{\zeta} - f_2 l_O \hat{\delta}, \tag{28} \]

where $\hat{\zeta}$ and $\hat{\delta}$ are estimates of the optimal parameters obtained from the tuning algorithm.

By subtracting Eq. (28) from Eq. (27), an approximation error $U_e$ is defined as:

\[ \begin{aligned} U_e(t) &= U_o(t) - \hat{U}_c(t) = f_1 l_A \zeta_e - f_2 l_O \delta_e + \varepsilon, \\ &= F_1 - F_2 + \varepsilon, \end{aligned} \tag{29} \]

where

\[ \zeta_e = \zeta^* - \hat{\zeta}; \quad \delta_e = \delta^* - \hat{\delta}; \quad F_1 = f_1 l_A \zeta_e; \quad F_2 = f_2 l_O \delta_e. \]

Theorem 4.1. For the nonlinear system described in Eq. (22), the BBEL controller, formulated in Eq. (28) alongside the weight adaptation laws presented in Eq. (30) and Eq. (31), guarantees the convergence of network parameters and the tracking error in the proposed BBEL control system.

\[ \dot{\hat{\zeta}} = \begin{cases} -\alpha (e^T f_1 l_A)^T & \text{if } (\|\hat{\zeta}\| < \bar{\zeta}) \text{ or } (\|\hat{\zeta}\| = \bar{\zeta} \text{ and } e^T f_1 l_A \hat{\zeta} \ge 0), \\ -\alpha (e^T f_1 l_A)^T \left[ 1 - \left( \dfrac{\hat{\zeta}^T \hat{\zeta}^T}{\|\hat{\zeta}\|^2} \right) \right] & \text{if } (\|\hat{\zeta}\| = \bar{\zeta}) \text{ and } (e^T f_1 l_A \hat{\zeta} < 0), \end{cases} \tag{30} \]

\[ \dot{\hat{\delta}} = \begin{cases} \beta (e^T f_2 l_O)^T & \text{if } (\|\hat{\delta}\| < \bar{\delta}) \text{ or } (\|\hat{\delta}\| = \bar{\delta} \text{ and } e^T f_2 l_O \hat{\delta} \ge 0), \\ \beta (e^T f_2 l_O)^T \left[ 1 - \left( \dfrac{\hat{\delta}^T \hat{\delta}^T}{\|\hat{\delta}\|^2} \right) \right] & \text{if } (\|\hat{\delta}\| = \bar{\delta}) \text{ and } (e^T f_2 l_O \hat{\delta} < 0), \end{cases} \tag{31} \]

where $\|\cdot\|$ denotes the Euclidean norm, and $\bar{\zeta}$ and $\bar{\delta}$ are the given parameter bounds.

In Eq. (30), when $\|\hat{\zeta}\| = \bar{\zeta}$ and $e^T f_1 l_A \hat{\zeta} < 0$ are met, the condition $(\zeta^* - \hat{\zeta}) \hat{\zeta}^T = 0.5 (\|\zeta^*\|^2 - \|\hat{\zeta}\|^2 - \|\hat{\zeta} - \zeta^*\|^2) < 0$ is satisfied due to $\|\zeta^*\| < \bar{\zeta}$. This ensures $V_A \ge 0$, where $V_A = \left( tr(\dot{\hat{\zeta}}^T \tilde{\zeta}) \right) / \alpha + e^T f_1 l_A \tilde{\zeta}$.

Similarly, in Eq. (31), when $\|\hat{\delta}\| = \bar{\delta}$ and $e^T f_2 l_O \hat{\delta} < 0$ are met, the condition $(\delta^* - \hat{\delta}) \hat{\delta}^T = 0.5 (\|\delta^*\|^2 - \|\hat{\delta}\|^2 - \|\hat{\delta} - \delta^*\|^2) < 0$ holds due to $\|\delta^*\| < \bar{\delta}$. This leads to $V_O \ge 0$, where $V_O = \left( tr(\dot{\hat{\delta}}^T \tilde{\delta}) \right) / \beta + e^T f_2 l_O \tilde{\delta}$.

Consider the Lyapunov function candidate:

\[ V = \left( e^T e\, g_o^{-1} \right) / 2 + \left( tr(\tilde{\zeta} \tilde{\zeta}^T) / 2\alpha \right) + \left( tr(\tilde{\delta} \tilde{\delta}^T) / 2\beta \right). \tag{32} \]

Its derivative with respect to time is obtained as follows:

\[ \dot{V} = e^T \dot{e}\, g_o^{-1} - \left( tr(\tilde{\zeta} \dot{\hat{\zeta}}^T) / \alpha \right) - \left( tr(\tilde{\delta} \dot{\hat{\delta}}^T) / \beta \right). \tag{33} \]

By incorporating the system dynamical model from Eq. (22) into the control laws outlined in Eq. (26) and Eq. (29), a new equation for the error derivative is derived as follows:

\[ \dot{e} = -g_o (U_e + \iota^T e + U_{rp}). \tag{34} \]

Defining $S = \left( tr(\tilde{\zeta} \dot{\hat{\zeta}}^T) / \alpha \right)$, $D = \left( tr(\tilde{\delta} \dot{\hat{\delta}}^T) / \beta \right)$, and using Eq. (34) and Eq. (33), the time derivative of the $V$ function can be rewritten as
follows:

\[ \begin{aligned} \dot{V} &= -e^T g_o g_o^{-1} (U_e + \iota^T e + U_{rp}) - S - D, \\ &= -e^T U_e - e^T \iota^T e - e^T U_{rp} - S - D, \\ &= -e^T (F_1 - F_2 + \varepsilon) - e^T \iota^T e - e^T U_{rp} - S - D, \\ &= -(S + e^T F_1) - (D - e^T F_2) - e^T \varepsilon - e^T \iota^T e - e^T U_{rp}, \\ &= -V_A - V_O - e^T \varepsilon - e^T \iota^T e - e^T U_{rp}. \end{aligned} \tag{35} \]

In this scenario, the boundedness of the uncertain term is assumed as $\|\varepsilon\|_1 < U_{rp}$. Given the existence of a BBEL controller, as detailed in Eq. (28), and the corresponding adaptation laws delineated in Eq. (30) and Eq. (31), it follows that $\dot{V} \le -e^T U_{rp} - e^T \iota^T e \le 0$. By the Lyapunov stability theorem and Barbalat's lemma [54,55], this implies the convergence of $e$ to zero as $t \to \infty$. Furthermore, the parameter estimates $\hat{\zeta}$ and $\hat{\delta}$ are guaranteed to remain bounded, in accordance with the projection algorithm. Consequently, the proposed BBEL control system ensures stable control without additional information about the system.

4.6. Webots robotic PSO algorithm

Algorithm 1 outlines the procedures of the Crazyflie PSO simulation algorithm in Webots. The simulation begins with the initialization of the ND parameters (position, motor settings, and communication pathways). Following launch, the Crazyflies ascend to their predefined altitude and hover at this height in approximately 15 s. Subsequently, the PSO algorithm initiates, leveraging emitter and receiver devices on each Crazyflie, along with a supervisor, for communication. Messages, encompassing positions, velocities, and fitness values, are exchanged.

The receiver/emitter scheme in Webots is adopted to implement this algorithm as depicted in Fig. 4. Within this scheme, a receiver and an emitter are installed in each robot and on a main virtual robot called the supervisor, which facilitates communication and information transfer between the Crazyflie robots. The supervisor collects all the information from the Crazyflies, then executes the PSO algorithmic steps and sends the new positions and velocities back to the individual NDs.

The PSO algorithm entails several key steps, including the evaluation of the fitness function, the determination of optimal positions, and the consequential computation of velocities and positions for the upcoming simulation time step. This iterative loop continues until a predefined convergence threshold is reached, the maximum number of iterations is attained, or no active nano-drone is detected. The convergence error $\lambda$ is defined as the absolute difference between the peak of the Gaussian function $H$ and the fitness value obtained, expressed as $\lambda = |f(x, y) - H|$.

For wall avoidance, each robot uses its GPS device to apply a repulsion force, ensuring a distance of at least 0.5 meters from the walls.

Algorithm 1 Adaptive RPSO for GSL in Webots
1: Randomly initialize the motors, Emitter devices-Receiver devices' messages, Crazyflies' positions and velocities, Gas source position
2: Run supervisor robot controller
3: while Iteration $t < \bar{t}$ & Convergence error $\lambda > 0.01$ & At least one drone is flying do
   Hovering phase:
4:   Crazyflies start taking off and hovering in 15 s
5:   Calculate fitness function Eq. (13)
   Plume finding phase:
6:   while $t > 15$ s & Gas readings $f(x, y) < 0.1$ do
7:     Calculate fitness function Eq. (13)
8:     Call each Crazyflie robot controller
9:     Receive: Updated velocities and positions for Crazyflie robot via the random search strategy
10:    Emit: Robot position
11:  end while
   Plume tracking phase:
12:  Adapt acceleration coefficients based on iteration and average swarm velocity
13:  if $|p_i - p_j| \le d_c$ then
14:    Update global best position
15:  end if
16:  Update local best position, and particle velocities
17:  Advance simulation time $S_t = t$
18: end while

5. Simulation results and discussions

This section outlines a set of experiments designed to validate our Crazyflie ND swarm-based gas source localization technique across various scenarios. We begin by detailing the experimental setups in Section 5.1. Subsequently, Section 5.2 introduces comparative algorithms, with results and discussions presented in Section 5.3.

5.1. Experimental setup

The Crazyflie 2.1, a lightweight open-source flying platform weighing only 27 g, is designed to be compact, fitting easily in the palm of one's hand. Comprehensive details about this ND are provided by [Bitcraze, the company that manufactures the Crazyflie nano quadcopters]. In early 2022, Bitcraze introduced a Webots simulation for the Crazyflie 2.1, and the simulation package can be obtained from [Crazyflie Simulation Webots Github Repository]. The simulation integrates a physics model based on a propeller thrust and torque model for each rotor, as described in the [Webots propeller documentation], incorporating realistic geometry, mass, and moments of inertia to accurately compute the ND states. To emulate realistic operating conditions and to evaluate the tracking performance of the proposed BBEL controller, the Crazyflie ND parameters, particularly the thrust and torque coefficients for each motor, were given a substantial uncertainty range of ±20%. This intentional introduction of uncertainty reflects real-world scenarios where parameter fluctuations may arise from manufacturing tolerances or extreme environmental conditions. Additionally, during close-proximity swarm flight, comparative RPSO methods often encounter inter-vehicle aerodynamic interactions among multiple UAVs, introducing exogenous disturbances that pose instability and crash risks. These challenges underscore the need for the controller to adapt to uncertainties and disturbances, ensuring robust performance in dynamic and challenging environments.

In our study, Python is utilized for communication and PSO algorithm implementation on every robot, while a BBEL or PID controller, responsible for controlling altitude, horizontal velocities, and attitude, is implemented in C. This controller employs pitch, roll, yaw, forward, and sideways commands to maneuver the Crazyflie robot. The system encompasses two robot controllers and a global virtual agent controller, named the supervisor.

The simulation scenarios involve a swarm of four Crazyflie robots, each equipped with an individual controller and four laser rangefinders, alongside two green static cylinder obstacles and a virtual supervisor robot controller. All simulations are performed in a 10 m × 10 m area enclosed by 1 m walls. The strategically positioned gas plume source ensures comprehensive coverage, with specific coordinates $(p_{x0}, p_{y0})$ detailed in Table 1. Fig. 5 visually represents these positions in Webots, including the pre-defined starting point for Crazyflie robots. Additionally, the highest gas concentration level $H$ is set to 1, while the gas rotational angle $\kappa$ and the spreading coefficients along the $x$ and $y$ axes, $\varsigma_x$ and $\varsigma_y$, are 0.5, 1, and 2, respectively.
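For a swarm such as the four Crazyflies described above, the seeking-phase velocity update of Eq. (16) with the separation force of Eq. (17) could be sketched in Python as follows. This is an illustrative sketch only: the function names, the value of $\eta$ (the source only states $\eta > 0$), and the final clamping of the speed to the maximum velocity are assumptions, not the authors' implementation.

```python
import math
import random

def separation_force(p_i, neighbours, d_s=0.6, eta=1.0):
    """Eq. (17): sum of repulsions from neighbours closer than d_s (eta is assumed)."""
    fx = fy = 0.0
    for p_j in neighbours:
        dx, dy = p_i[0] - p_j[0], p_i[1] - p_j[1]
        dist = math.hypot(dx, dy)
        if 0.0 < dist <= d_s:
            gain = eta * (1.0 / dist - 1.0 / d_s)
            fx += gain * dx / dist
            fy += gain * dy / dist
    return fx, fy

def update_velocity(v, p, p_best, g_best, f_s, f_o,
                    omega, g_p, g_g, g_s, g_o, v_max=1.0, rng=random):
    """Eq. (16): inertia + cognitive + social + separation + avoidance terms."""
    r1, r2 = rng.random(), rng.random()
    new_v = [
        omega * v[k]
        + g_p * r1 * (p_best[k] - p[k])
        + g_g * r2 * (g_best[k] - p[k])
        + g_s * f_s[k]
        + g_o * f_o[k]
        for k in range(2)
    ]
    speed = math.hypot(new_v[0], new_v[1])
    if speed > v_max:  # assumption: saturate at the maximum linear velocity
        new_v = [c * v_max / speed for c in new_v]
    return tuple(new_v)

if __name__ == "__main__":
    f_s = separation_force((0.0, 0.0), [(0.3, 0.0), (2.0, 2.0)])
    v = update_velocity((0.2, 0.0), (0.0, 0.0), (0.5, 0.5), (1.0, 1.0),
                        f_s, (0.0, 0.0), omega=0.45,
                        g_p=0.25, g_g=0.25, g_s=0.25, g_o=0.25)
    print(f_s, v)
```

Only the neighbour at 0.3 m contributes to the separation force in this usage example, since the second one lies beyond $d_s$.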
Fig. 4. PSO Receiver Emitter scheme in Webots.

Table 1
Two gas source positions.
Gas source    $(x_0, y_0)$ (m, m)
Position1     [−1.22, 0.08]
Position2     [2.48, 1.9]

Table 2
Adaptive robotic PSO parameters.
Parameter               Description                                   Value
$\bar{t}$               Maximum number of evolutionary iterations     1000
$\overline{\omega}$     Maximum weight                                0.6
$\underline{\omega}$    Minimum weight                                0.3
$\gamma$                Positive average of acceleration constants    0.5
$\bar{\upsilon}$        Maximum linear velocity of each drone         1 [m/s]
$d_o$                   Obstacle avoidance distance                   0.35 [m]
$d_s$                   Separation distance                           0.6 [m]
$d_c$                   Communication distance                        1.5 [m]

In Fig. 5, three distinct case studies validate the effectiveness of the proposed ARPSO-BBEL approach using the ND Crazyflie swarm platform. The primary objective was to precisely locate the contaminant source while navigating through complex urban terrain. The initial case involves guiding Crazyflie ND drones through a 15-second hovering period with $p_r = [0, 0, 0.45]$ (m/s, m/s, m) before searching for the gas source near the swarm's starting point. This task employs the proposed BBEL controller, subject to a comparative analysis with the PID controller. In the second case, the gas-seeking performance of the proposed ARPSO algorithm is compared to that of the evolved Sniffy Bug while the communication range of the Crazyflie robots is unrestricted, with the gas source located in the far corner relative to the swarm's starting point. The final case demonstrates the varying performance of the ARPSO approach with and without the limited communication range between the robots. These comparisons offer valuable insights into the strengths and weaknesses of each approach in addressing the gas source identification problem in complex urban environments.

Five criteria are used to evaluate the performance of gas source localization: fitness threshold $\lambda$, Euclidean distance between the swarm and the gas source $E_d$, task completion time $T_c$, success rate $S_r$, and drone crash number $D_n$. Each criterion is evaluated over 7 trials run at each position. Mean values are reported, along with 95% confidence intervals, where appropriate.

Statistical significance is determined using the standard $P$-value from the Student's $t$-test. If the $P$-value for the difference between the means of two sets falls below the significance level of 0.05 (that is, 5%), the results are considered significantly different, providing a solid foundation for comparative analyses of the different algorithms.

5.2. Comparative algorithms

This study conducts a comprehensive performance comparison of our proposed ARPSO-BBEL method with a limited communication range (ARPSO-BBELL) against ARPSO-BBELU, ARPSO-PIDU, and SB-BBELU [13]. In the Sniffy Bug (SB) approach, each agent recalculates a new waypoint upon reaching the acceptance distance of its previous goal. The RPSO parameters are evolved offline using a genetic algorithm from the PyGMO/PAGMO package [56]. SB is modeled as a point mass, and its desired position reference is controlled by PID controllers.

While the key parameters of the SB and PID are maintained as presented in [5,13], our adaptive robotic PSO and BBEL parameters are configured as outlined in Table 2 and Table 3, selected based on pilot experiments. The number of fuzzy layers is set to $m = 5$ for all control inputs. Their inputs are the linear velocity errors $e_{x,y} \in [-2, 2]$ (m/s, m/s) and their derivatives $\dot{e}_{x,y} \in [-5, 5]$ (m/s², m/s²), the altitude error $e_z \in [-2, 2]$ (m) and its derivative $\dot{e}_z \in [-5, 5]$ (m/s), and the attitude errors $e_{\theta,\phi,\psi} \in [-2, 2]$ (rad) and their derivatives $\dot{e}_{\theta,\phi,\psi} \in [-5, 5]$ (rad/s). The network weights ($\zeta_{jk}$ and $\delta_{jk}$) for the velocity, altitude, and attitude controllers in the amygdala and orbitofrontal cortex are initially set to zero, representing a learning-from-scratch approach. This deliberate initialization showcases the rapid adaptation capability inherent in the proposed BBEL controller for both velocity and attitude control.
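The online adaptation of Eqs. (18)–(20) can be illustrated with the Table 2 values in a short Python sketch. The control coefficient $\varepsilon$ is not specified in the text, so the value used here is purely an assumption, and the function names are hypothetical.

```python
import math

T_MAX = 1000      # maximum number of iterations (Table 2)
W_MAX = 0.6       # maximum inertia weight (Table 2)
W_MIN = 0.3       # minimum inertia weight (Table 2)
GAMMA = 0.5       # positive average of acceleration constants (Table 2)
V_MAX = 1.0       # maximum linear velocity in m/s (Table 2)
EPSILON = 5.0     # control coefficient of Eq. (18): assumed, not given in the text

def inertia_weight(t, t_max=T_MAX, w_min=W_MIN, w_max=W_MAX, eps=EPSILON):
    """Sigmoid-decreasing inertia weight of Eq. (18)."""
    return (w_max - w_min) / (1.0 + math.exp(2.0 * eps * t / t_max - eps)) + w_min

def average_speed(velocities):
    """Average planar speed of the swarm, Eq. (20)."""
    return sum(math.hypot(vx, vy) for vx, vy in velocities) / len(velocities)

def acceleration_constants(v_avg, gamma=GAMMA, v_max=V_MAX):
    """Cognitive, social, separation and avoidance gains of Eq. (19)."""
    ratio = v_avg / v_max
    g_p = gamma * ratio
    g_g = gamma * (1.0 - ratio)
    g_s = g_o = gamma * v_avg
    return g_p, g_g, g_s, g_o

if __name__ == "__main__":
    swarm_velocities = [(0.3, 0.4), (0.6, 0.8), (0.0, 0.5), (0.5, 0.0)]
    v_avg = average_speed(swarm_velocities)
    for t in (0, 250, 500, 750, 1000):
        print(t, round(inertia_weight(t), 4))
    print(v_avg, acceleration_constants(v_avg))
```

With these settings the inertia weight starts close to 0.6, passes through 0.45 at mid-run, and approaches 0.3 at the final iteration, while a fast-moving swarm weights the cognitive term more heavily than the social term.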
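The significance criterion used for the comparisons (Student's $t$-test at the 5% level) can be illustrated from summary statistics alone. The sketch below is a rough check under stated assumptions rather than the authors' analysis: it assumes the reported ± values are 95% confidence half-widths of the mean over $n = 7$ runs, uses the equal-variance form of the test (which, for equal sample sizes, has the same statistic as the unpooled form), and hard-codes the standard two-sided critical values $t_{0.975,6} \approx 2.447$ and $t_{0.975,12} \approx 2.179$. The example numbers are the Scenario 1 completion times reported in Table 5.

```python
import math

T_CRIT_DF6 = 2.447    # two-sided 95% critical value, 6 degrees of freedom (n = 7)
T_CRIT_DF12 = 2.179   # two-sided 95% critical value, 12 degrees of freedom (7 + 7 - 2)

def t_statistic(mean_a, half_a, mean_b, half_b, t_crit_ci=T_CRIT_DF6):
    """t statistic from two means and their assumed 95% confidence half-widths."""
    se_a = half_a / t_crit_ci   # standard error recovered from the half-width
    se_b = half_b / t_crit_ci
    return (mean_a - mean_b) / math.hypot(se_a, se_b)

def significantly_different(t_stat, t_crit=T_CRIT_DF12):
    """True if the difference is significant at the 5% level."""
    return abs(t_stat) > t_crit

if __name__ == "__main__":
    # T_c in Scenario 1: ARPSO-BBELU 28.06 +/- 3.00 s, SB-BBELU 25.98 +/- 3.56 s
    t = t_statistic(28.06, 3.00, 25.98, 3.56)
    print(round(t, 2), significantly_different(t))  # prints: 1.09 False
```

Under these assumptions the roughly 2 s difference in completion time is not significant at the 5% level.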
Fig. 5. Two unknown complex urban-like environments.

Table 3
BBEL parameters for all control systems.
Parameter          Description                                             Value
$b$, $c$           Reward signal gains                                     1, 1
$\lambda_c$        Robust control approximator attenuation constants       0.53
$\alpha$, $\beta$  Learning rates of the A and O layer                     $10^{-7}$, $10^{-7}$

Table 4
Comparative performance evaluation: BBEL vs. PID controllers in the horizontal velocity and altitude tracking control (averaged results from 7 runs).
Metrics       BBEL             PID [5,13,50]
$\lambda$     0.01 ± 0.00      0.01 ± 50.00
$E_d$ (m)     0.74 ± 0.10      0.77 ± 0.16
$T_c$ (s)     28.06 ± 3.00     40.43 ± 5.29
$S_r$ (%)     100.00 ± 0.00    50.00 ± 50.00
$D_n$         0.00 ± 0.00      1.17 ± 0.37

5.3. Results and discussion

5.3.1. Experiment 1: Desired velocity tracking control using the BBEL controller vs. PID controller

The primary purpose of the preliminary experiment is to measure the robustness and adaptability of the BBEL and PID control systems when subjected to time-varying and constant $x$ and $y$ velocity reference signals dynamically generated by the ARPSO algorithm. The experiment also addresses the challenges posed by additional system disturbances, underscoring the pivotal role of effective obstacle avoidance.

Scenario 1 is used in this experiment. Demonstration videos for all solutions examined in Experiment 1 are available at the following addresses: ARPSO-BBELU: https://tinyurl.com/4k53t3rv; ARPSO-PIDU: https://tinyurl.com/y9z9suzp. In the videos, during the initial 15 s of hovering, both the BBEL and PID controllers effectively track the desired constant reference $p_r$ without significant oscillations when system uncertainties are applied. Following stabilization, the swarm of NDs initiates exploration when no gas is detected, transitioning to gas-seeking behavior upon detection. System uncertainty, obstacle disturbances, nano-drone interactions, and rapidly changing velocity reference signals all affect the performance of both controllers. As can be seen in Table 4, the BBEL adaptive controller outperforms the PID counterpart, showing superior stability by minimizing drift from the desired velocity reference.

While both controllers demonstrate similarly small distance errors $E_d$ to the gas source location, the PID controller struggles to maintain stable and secure flights. This is evident in a significant variation of 50.00 in the convergence threshold $\lambda$, a notably high collision/crash number $D_n$ of 1.17 ± 0.37, and a low success rate $S_r$ of only 50.00 ± 50.00%. Furthermore, the PID controller completes the task in 40.43 ± 5.09 s, roughly 1.4 times the time taken by the BBEL controller. This significant time deviation underscores the PID controller's sluggish response due to its lack of an online adaptive mechanism. Conversely, the BBEL controller is able to compensate for uncertainties, achieving a faster completion time with no collisions or crashes, validating its robustness. Additionally, the structural advantage of the BBEL controller lies in its minimal tuning requirements, requiring adjustments only for learning rates and the number of membership functions. This characteristic enables straightforward implementation in real-time applications and significantly reduces gain tuning time.

5.3.2. Experiment 2: Autonomous gas source localization in static urban settings - adaptive robotic PSO vs. evolved sniffy bug with the unlimited communication range

Given the promising outcomes of the initial test, a subsequent flight test compares the effectiveness of the adaptive robotic PSO strategy with the evolved Sniffy Bug using PSO. Notably, all Sniffy Bug experiments benefit from an ideal communication environment. To ensure a fair and comprehensive comparison between the two gas-seeking algorithms, the second experiment is deliberately performed without communication constraints ($d_c = \infty$). The BBEL controller is consistently employed in all control loops to ensure flight stability and robustness. Demonstration videos for each solution are accessible at the following links: ARPSO-BBELU: http://tinyurl.com/4rpf8f9m; and SB-BBELU: http://tinyurl.com/4xtsrkum.

In both scenarios detailed in Table 5, both methods successfully guide the ND team to the source location, achieving the fitness threshold $\lambda$ of 0.01 ± 0.00 and a 100% success rate $S_r$, demonstrating the BBEL controller's effectiveness in tracking desired references, addressing uncertainties, and guaranteeing convergence to the gas source. In Scenario 1, the ARPSO-BBELU exhibits a source detection time $T_c$ of 28.06 ± 3.00 s, merely 2 s behind the SB-BBELU. Despite a longer total exploration time in Scenario 2, ARPSO-BBELU maintains a $T_c$ of 71.73 ± 8.08 s, a marginal 6.23 s slower than the SB-BBELU. This is because the average swarm velocity of the SB-BBELU is fixed at a faster speed than the adaptive average swarm velocity of the ARPSO-BBELU. However,
these time differences are statistically insignificant according to the Student's $t$-test. At the source points, SB-BBELU yields a significantly larger average distance ($E_d$) to the source, exceeding that of ARPSO-BBELU by more than 33% in both cases, highlighting the consistency in the data collected. This indicates a less cohesive flocking formation, offering more accessible paths for Crazyflies to reach the source quicker while allowing for stronger individualistic behavior. In contrast, ARPSO-BBELU maintains an $E_d$ of only 0.74 ± 0.10, surpassing the separation distance $d_s$ of 0.6 m by 0.14 m, showing a tighter swarm (see Fig. 6) and facilitating the information exchange process. This is achieved through direct and continuous velocity tracking loops coupled with online adaptive PSO parameters.

Fig. 6. Comparison of robot trajectories when guided by different algorithms: (a) ARPSO-BBELU and (b) SB-BBELU to explore the gas source in Scenario 2.

Furthermore, the SB-BBELU method exhibits a slight collision rate over the 7 trials (0–1 collision per run). This is associated with challenging instances noted in the literature [57], where parameters excel on average but struggle in demanding environments because the offline-evolved genetic algorithm is trained only on the point mass model. The reduced participation of Crazyflie robots in the GSL task further diminishes anti-noise and gas source detection capabilities. However, the ARPSO-BBELU outperforms the SB-BBELU in accurately identifying gas sources and reaching them without any collisions, as the obstacle avoidance parameters are evolved online according to environmental changes and the actual system dynamics.

Finally, as shown in Fig. 6, SB-BBELU's reliance on waypoint tracking introduces delays and inefficiencies in ND response dynamics, with discrete waypoint adjustments hindering adaptability to dynamic environmental changes. These erratic trajectories not only degrade overall system performance but also increase collision risks and generate pronounced trajectory oscillations around obstacles and the gas source position. Conversely, the ARPSO-BBELU directly tackles these challenges by generating and tracking velocity setpoints, providing a solution that improves responsiveness and ensures smooth navigation throughout the GSL process.

5.3.3. Experiment 3: Autonomous gas source localization in static urban settings - ARPSO with the ideal communication resources (ARPSO-BBELU) vs. ARPSO with the limited communication range and long delay (ARPSO-BBELL)

The outcomes of Experiment 2 highlight the effectiveness of ARPSO as a practical solution for a team of four nano-Crazyflie robots efficiently locating a gas source amidst dynamic uncertainties and obstacles in an ideal information exchange environment. Experiment 3 further explores the inter-robot exchange challenges in large-scale real-world scenarios, introducing a limited communication range $d_c$ of 1.5 m and high-latency communication. Within this range, two relevant
NDs collaboratively exchange essential data, including current position, velocity, best local position, and best local fitness value, facilitating the calculation of the global best position and average swarm velocity. Conversely, robots outside this range do not contribute to these updates. Furthermore, all communication channels are subject to random delays ranging from 10 ms to 300 ms over time. A flight test demonstrating the ARPSO-BBELL approach can be viewed via the following video link: https://tinyurl.com/ykl2jykm.

Fig. 7 illustrates the trajectories of a swarm of four Crazyflie nano-drones (NDs) during the gas source identification process. During the gas-seeking phase, the ARPSO-BBELL algorithm generates trajectories for the NDs that exhibit a slightly greater dispersion compared to those produced by ARPSO-BBELU. This discrepancy arises from individual gas source tracking behaviors when a robot operates beyond the predefined communication range, compounded by communication delays during desired velocity tracking. However, both methods effectively converge towards the gas source within a closely-knit swarm without any collisions. The robust adaptive BBEL controller and the obstacle avoidance strategy integrated into each Crazyflie ensure these systems remain stable and safe despite communication challenges primarily affecting desired velocity tracking performance.

These observations are reflected in Table 6, where both methods yield comparable $\lambda$, $E_d$, $S_r$, and $D_n$ values. However, the ARPSO-BBELL exhibits a slightly longer turnaround time, demonstrating a $T_c$ of 89.47 ± 11.37 s, in contrast to the ARPSO-BBELU with a $T_c$ of 71.73 ± 8.08 s. This discrepancy likely arises from increased obstacle avoidance maneuvers and individual movement behaviors caused by the limited communication range and additional communication delays. Notably, these results underscore the ARPSO-BBELL's capability for swift adaptation while maintaining closed-loop system robustness against practical communication challenges and system disturbances in demanding real-world scenarios, setting it apart from gas source localization methods like ARPSO-BBELU, ARPSO-PIDU, and SB-BBELU, which operate under unlimited communication regimes.

Table 5
Comparative analysis: ARPSO-BBELU vs. SB-BBELU performance in 2 gas source seeking scenarios (average of 7 runs).
Scenario 1
Metrics       ARPSO-BBELU      Evolved Sniffy Bug-BBELU [13]
$\lambda$     0.01 ± 0.00      0.01 ± 0.00
$E_d$ (m)     0.74 ± 0.10      1.11 ± 0.32
$T_c$ (s)     28.06 ± 3.00     25.98 ± 3.56
$S_r$ (%)     100.00 ± 0.00    100.00 ± 0.00
$D_n$         0.00 ± 0.00      0.5 ± 0.32
Scenario 2
Metrics       ARPSO-BBELU      Evolved Sniffy Bug-BBELU [13]
$\lambda$     0.01 ± 0.00      0.01 ± 0.00
$E_d$ (m)     0.51 ± 0.18      1.08 ± 0.39
$T_c$ (s)     71.73 ± 8.08     65.50 ± 7.18
$S_r$ (%)     100.00 ± 0.00    100.00 ± 0.00
$D_n$         0.00 ± 0.00      0.83 ± 0.32

Table 6
Comparative performance: ARPSO-BBELU vs. ARPSO-BBELL in Scenario 2 (average of 7 runs).
Metrics       ARPSO-BBELU      ARPSO-BBELL
$\lambda$     0.01 ± 0.00      0.01 ± 0.00
$E_d$ (m)     0.51 ± 0.18      0.58 ± 0.13
$T_c$ (s)     71.73 ± 8.08     89.47 ± 11.37
$S_r$ (%)     100.00 ± 0.00    100.00 ± 100.00
$D_n$         0.00 ± 0.00      0.00 ± 0.00

Table 7
PomPy Simulation Parameters with two different levels of the gas diffusion.
Level 1: Low rate of diffusion
Wind Model Parameters
PomPy                    Description                                                    Value
$\bar{u}(0)$             Mean $x$-component of wind velocity                            −0.5
$\bar{v}(0)$             Mean $y$-component of wind velocity                            −0.1
$k_x$                    Diffusivity constant in $x$ direction                          10.0
$k_y$                    Diffusivity constant in $y$ direction                          10.0
$G$                      Input gain constant for boundary condition noise generation    0.0 [m/s]
$\frac{b}{2\sqrt{a}}$    Damping ratio for boundary condition noise generation          0.0 [m]
$\sqrt{a}$               Bandwidth for boundary condition noise generation              0.0 [m]
Plume Model Parameters
PomPy                    Description                                                    Value
$\sigma_v$               Scaling coefficient of diffusion of puffs                      0.002
$n$                      Initial radius of the puffs                                    20.0
$R(0)$                   Initial puff radius                                            0.05
$\gamma$                 Rate of puff size spread over time                             0.05
Level 2: High rate of diffusion
Wind Model Parameters
PomPy                    Description                                                    Value
$\bar{u}(0)$             Mean $x$-component of wind velocity                            −0.5
$\bar{v}(0)$             Mean $y$-component of wind velocity                            −0.1
$k_x$                    Diffusivity constant in $x$ direction                          20.0
$k_y$                    Diffusivity constant in $y$ direction                          20.0
$G$                      Input gain constant for boundary condition noise generation    0.0 [m/s]
$\frac{b}{2\sqrt{a}}$    Damping ratio for boundary condition noise generation          0.0 [m]
$\sqrt{a}$               Bandwidth for boundary condition noise generation              0.0 [m]
Plume Model Parameters
PomPy                    Description                                                    Value
$\sigma_v$               Scaling coefficient of diffusion of puffs                      0.2
$n$                      Initial radius of the puffs                                    20.0
$R(0)$                   Initial puff radius                                            0.3
$\gamma$                 Rate of puff size spread over time                             0.07

and robustness of the proposed method. Firstly, two dynamic obstacles are introduced, represented by additional Crazyflie robots tasked with tracking a diagonal line that intersects the paths of Crazyflie robots searching for the gas source. Secondly, the PomPy plume model, a Python implementation of the filament-based atmospheric dispersion model by Farrell et al. [58], is employed to simulate dynamic 2D odour concentration fields. PomPy reproduces crucial features of real chemical plumes, including source intermittency, diffusive effects, and spatial variations [59]. An illustrative heat-map example of this model is presented in Fig. 8. Parameters for the plume and wind models utilized in this experiment are configured in Table 7.

Seven simulation runs were conducted, and their results are visually presented at the video link: https://tinyurl.com/3m77k634. Data collected from these runs demonstrate crucial aspects of the algorithm. Overall, the flight tests proved successful. As depicted in the accompanying video, the initial search phase efficiently covered the area, detecting the first plume concentrations within 15 s after the hovering
phase. Subsequently, upon activating the seeking phase, the swarm
5.3.4. Experiment 4: ARPSO-BBELL in dynamic urban settings adeptly tracks up the plume tail and effectively avoids the dynamic
In our final experiment, we incorporate two key elements of en- obstacles, ascending until reaching the source. This behavior is evident
vironmental variability into Scenario 2 to validate the adaptability on the real-time heat-map located at the bottom left of the video,
illustrating a distinct entry point of the swarm into the plume tail and its trajectory towards the source.

Fig. 7. Comparison of robot trajectories when controlled by different algorithms: (a) ARPSO-BBELU and (b) ARPSO-BBELL to explore the gas source in Scenario 2.

These findings are reinforced by numerical evaluation metrics (Table 8). The method efficiently guides the ND team to track the dynamic gas flow. The Crazyflie swarm achieves a commendable 100% success rate, demonstrating short convergence times (𝑇𝑐 = 168.2 ± 27.61 s for Level 1 and 260.85 ± 20.45 s for Level 2), and maintaining a tight swarm formation (𝐸𝑑 = 0.74 ± 0.13 m for Level 1 and 0.77 ± 0.09 m for Level 2). Moreover, the method guarantees collision avoidance with obstacles (𝐷𝑛 = 0 for both levels) while navigating through various challenges, including communication uncertainties, system disturbances, dynamic obstacles, and the dynamic plume model across all trials.

Table 8
Comparative analysis: ARPSO-BBELL performances in two cases of the time-varying gas plume (average of 7 runs).
Metrics   Level 1          Level 2
𝜆         0.01 ± 0.00      0.01 ± 0.00
𝐸𝑑 (m)    0.74 ± 0.13      0.77 ± 0.09
𝑇𝑐 (s)    168.20 ± 25.61   270.85 ± 20.45
𝑆𝑟 (%)    100.00 ± 0.00    100.00 ± 0.00
𝐷𝑛        0.00 ± 0.00      0.5 ± 0.32

However, in dynamic gas environments, the convergence time noticeably lengthens compared to static environments, averaging 74.78 s slower for Level 1 and 171.38 s for Level 2. Additionally, the 95% confidence interval triples, reflecting the heightened demand for ND systems to effectively track the substantial and unpredictable variations in the gas plume over time.

Interestingly, heavier gas releases require longer turnaround times for source identification, while maintaining a consistent 100% success rate. For example, doubling diffusivity constants and scaling coefficients by a factor of 100 lengthens the convergence time by approximately 102.65 s. This increase is due to pronounced diffusive effects and faster turbulent flows, which cause larger changes in the concentration measurements and in the selected global best position. These results collectively underscore the algorithm's robustness and
adaptability to environmental uncertainties, indicating its potential for real-world applications.

Fig. 8. PomPy Simulation Example.

6. Conclusion

This paper introduces a pioneering hybrid strategy, fusing Adaptive Robotic Particle Swarm Optimization with Bidirectional Brain Emotional Learning, to significantly advance gas source localization in dynamic urban settings using a swarm of nano-Crazyflie drones. Unlike traditional RPSO methods with manual or offline evolving parameter tuning, the ARPSO autonomously adapts its parameters online, leveraging the average particle flying velocity. This dynamic adaptation mitigates algorithmic failures and addresses local optima challenges, contributing to robust plume tracking.

The incorporation of BBEL enhances the control system's resilience to environmental disturbances and uncertainties in nano-drone dynamic systems, validated through rigorous stability analysis with Lyapunov theory. Additionally, our design considers the dynamic nature of the gas plume, dynamic obstacles, limited communication range, and high-latency communications among the robots, offering scalability to large robot teams, real-time operation, and robust fault tolerance, as demonstrated in both simulations.

Furthermore, our simulations in intricate urban scenarios highlight the superior performance of the hybrid ARPSO-BBEL strategy across all metrics. The results exhibit faster convergence, tighter swarm cohesion, collision-free trajectories, enhanced robustness, and increased resilience to uncertainties compared to benchmark methods like SB-BBELU and ARPSO-PIDU. These findings underscore the strategy's potential for real-world gas source localization, promising substantial advancements in uncertain urban environments with limited communication distances.

Future work involves transitioning from simulations to physical robots for real-world assessment, scaling to larger nano quadcopter swarms, and exploring three-dimensional Gas Source Localization scenarios within buildings. Moreover, addressing challenges related to multiple gas sources will refine swarm coordination mechanisms, inspiring broader advancements in swarm robotics for diverse resource-constrained tasks.

CRediT authorship contribution statement

Vu Phi Tran: Writing – original draft, Software, Methodology, Investigation, Data curation. Matthew A. Garratt: Writing – review & editing, Validation, Supervision, Project administration, Funding acquisition, Conceptualization. Sreenatha G. Anavatti: Writing – review & editing, Validation, Supervision, Funding acquisition, Conceptualization. Sridhar Ravi: Writing – review & editing, Funding acquisition, Conceptualization.

Declaration of competing interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Vu Phi Tran reports financial support was provided by Australian Defence Science Technology (DST) Group. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The data that has been used is confidential.

References

[1] T. Ma, S. Liu, H. Xiao, Location of natural gas leakage sources on offshore platform by a multi-robot system using particle swarm optimization algorithm, J. Nat. Gas Sci. Eng. 84 (2020) 103636.
[2] V.P. Tran, M.A. Garratt, K. Kasmarik, S.G. Anavatti, S. Abpeikar, Frontier-led swarming: Robust multi-robot coverage of unknown environments, Swarm Evol. Comput. 75 (2022) 101171.
[3] V.P. Tran, M.A. Garratt, K. Kasmarik, S.G. Anavatti, A.S. Leong, M. Zamani, Multi-gas source localization and mapping by flocking robots, Inf. Fusion 91 (2023) 665–680.
[4] Y. Ji, F. Chen, B. Chen, Y. Wang, X. Zhu, H. He, Multi-robot collaborative source searching strategy in large-scale chemical clusters, IEEE Sens. J. 22 (18) (2021) 17655–17665.
[5] J.T. Ebert, F. Berlinger, B. Haghighat, R. Nagpal, A hybrid PSO algorithm for multi-robot target search and decision awareness, in: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, IEEE, 2022, pp. 11520–11527.
[6] V.P. Tran, A. Perera, M.A. Garratt, K. Kasmarik, S.G. Anavatti, Coverage path planning with budget constraints for multiple unmanned ground vehicles, IEEE Trans. Intell. Transp. Syst. (2023).
[7] S. Alam, D. De, Bio-inspired smog sensing model for wireless sensor networks based on intracellular signalling, Inf. Fusion 49 (2019) 100–119.
[8] S. Salcedo-Sanz, P. Ghamisi, M. Piles, M. Werner, L. Cuadra, A. Moreno-Martínez, E. Izquierdo-Verdiguier, J. Muñoz-Marí, A. Mosavi, G. Camps-Valls, Machine learning information fusion in earth observation: A comprehensive review of methods, applications and data sources, Inf. Fusion 63 (2020) 256–272.
[9] C.H. Yeh, C.H. Lin, M.H. Lin, L.W. Kang, C.H. Huang, M.J. Chen, Deep learning-based compressed image artifacts reduction based on multi-scale image fusion, Inf. Fusion 67 (2021) 195–207.
[10] A. Francis, S. Li, C. Griffiths, J. Sienz, Gas source localization and mapping with mobile robots: A review, J. Field Robotics 39 (8) (2022) 1341–1373.
[11] G. Fortino, F. Messina, D. Rosaci, G.M. Sarné, C. Savaglio, A trust-based team formation framework for mobile intelligence in smart factories, IEEE Trans. Ind. Inform. 16 (9) (2020) 6133–6142.
[12] M. Hutchinson, C. Liu, W.H. Chen, Source term estimation of a hazardous airborne release using an unmanned aerial vehicle, J. Field Robotics 36 (4) (2019) 797–817.
[13] B.P. Duisterhof, S. Li, J. Burgués, V.J. Reddi, G.C. de Croon, Sniffy bug: A fully autonomous swarm of gas-seeking nano quadcopters in cluttered environments, in: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, IEEE, 2021, pp. 9099–9106.
[14] V.P. Tran, M.A. Garratt, K. Kasmarik, S.G. Anavatti, Dynamic frontier-led swarming: Multi-robot repeated coverage in dynamic environments, IEEE/CAA J. Autom. Sin. 10 (3) (2023) 646–661.
[15] T. Wiedemann, D. Shutin, A.J. Lilienthal, Model-based gas source localization strategy for a cooperative multi-robot system - A probabilistic approach and experimental validation incorporating physical knowledge and model uncertainties, Robot. Auton. Syst. 118 (2019) 66–79.
[16] Z. Han, Y. Yang, W. Wang, L. Zhou, T.R. Gadekallu, M. Alazab, P. Gope, C. Su, RSSI map-based trajectory design for UGV against malicious radio source: A reinforcement learning approach, IEEE Trans. Intell. Transp. Syst. 24 (4) (2022) 4641–4650.
[17] P. Hinsen, T. Wiedemann, D. Shutin, A.J. Lilienthal, Exploration and gas source localization in advection–diffusion processes with potential-field-controlled robotic swarms, Sensors 23 (22) (2023) 9232.
[18] Y.A. Prabowo, B.R. Trilaksono, E.M. Hidayat, B. Yuliarto, Integration of Bayesian inference and anemotaxis for robotics gas source localization in a large cluttered outdoor environment, IEEE Access 11 (2023) 22705–22713.
[19] M. Jabeen, Q.H. Meng, T. Jing, H.R. Hou, Robot odor source localization in indoor environments based on gradient adaptive extremum seeking search, Build. Environ. 229 (2023) 109983.
[20] K. Mjos, F. Grasso, J. Atema, Antennule use by the American lobster, homarus americanus, during chemo-orientation in three turbulent odor plumes, Biol. Bull. 197 (2) (1999) 249–250.
[21] Z.Z. Yang, T. Jing, Q.H. Meng, UAV-based odor source localization in multi-building environments using simulated annealing algorithm, in: 2020 39th Chinese Control Conference, CCC, IEEE, 2020, pp. 3806–3811.
[22] A. Marjovi, L. Marques, Optimal swarm formation for odor plume finding, IEEE Trans. Cybern. 44 (12) (2014) 2302–2315.
[23] T. Wiedemann, D. Shutin, V. Hernandez, E. Schaffernicht, A.J. Lilienthal, Bayesian gas source localization and exploration with a multi-robot system using partial differential equation based modeling, in: 2017 ISOCS/IEEE International Symposium on Olfaction and Electronic Nose, ISOEN, IEEE, 2017, pp. 1–3.
[24] T. Wiedemann, C. Vlaicu, J. Josifovski, A. Viseras, Robotic information gathering with reinforcement learning assisted by domain knowledge: An application to gas source localization, IEEE Access 9 (2021) 13159–13172.
[25] J. Zhang, Y. Lu, Y. Wu, C. Wang, D. Zang, A. Abusorrah, M. Zhou, PSO-based sparse source location in large-scale environments with a UAV swarm, IEEE Trans. Intell. Transp. Syst. (2023).
[26] X. Chen, C. Fu, J. Huang, A deep Q-network for robotic odor/gas source localization: Modeling, measurement and comparative study, Measurement 183 (2021) 109725.
[27] J. Orr, A. Dutta, Multi-agent deep reinforcement learning for multi-robot applications: A survey, Sensors 23 (7) (2023) 3625.
[28] S.M. Mamduh, K. Kamarudin, A.Y.M. Shakaff, A. Zakaria, R. Visvanathan, A.S.A. Yeon, L.M. Kamarudin, A.S.A. Nasir, Gas source localization using grey wolf optimizer, J. Telecommun. Electron. Comput. Eng. (JTEC) 10 (1–13) (2018) 95–98.
[29] J. Wang, Y. Lin, R. Liu, J. Fu, Odor source localization of multi-robots with swarm intelligence algorithms: A review, Front. Neurorobot. 16 (2022) 949888.
[30] P.K. Das, H.S. Behera, B.K. Panigrahi, A hybridization of an improved particle swarm optimization and gravitational search algorithm for multi-robot path planning, Swarm Evol. Comput. 28 (2016) 14–28.
[31] H. Che, C. Shi, X. Xu, J. Li, B. Wu, Research on improved ACO algorithm-based multi-robot odor source localization, in: 2018 2nd International Conference on Robotics and Automation Sciences, ICRAS, IEEE, 2018, pp. 1–5.
[32] M.A. Elaziz, D. Yousri, A.O. Aseeri, L. Abualigah, M.A. Al-qaness, A.A. Ewees, Fractional-order modified heterogeneous comprehensive learning particle swarm optimizer for intelligent disease detection in IoMT environment, Swarm Evol. Comput. 84 (2024) 101430.
[33] Q. Feng, C. Zhang, J. Lu, H. Cai, Z. Chen, Y. Yang, F. Li, X. Li, Source localization in dynamic indoor environments with natural ventilation: An experimental study of a particle swarm optimization-based multi-robot olfaction method, Build. Environ. 161 (2019) 106228.
[34] U. Jain, R. Tiwari, W.W. Godfrey, Multiple odor source localization using diverse-PSO and group-based strategies in an unknown environment, J. Comput. Sci. 34 (2019) 33–47.
[35] B.L. Villarreal, G. Olague, J.L. Gordillo, Synthesis of odor tracking algorithms with genetic programming, Neurocomputing 175 (2016) 1019–1032.
[36] M.R. Fikri, D.W. Djamari, Palm-sized quadrotor source localization using modified bio-inspired algorithm in obstacle region, Int. J. Electr. Comput. Eng. (2088-8708) 12 (4) (2022).
[37] H. Saadaoui, F. El Bouanani, A local PSO-based algorithm for cooperative multi-UAV pollution source localization, IEEE Access 10 (2022) 106436–106450.
[38] N.O. Lambert, D.S. Drew, J. Yaconelli, S. Levine, R. Calandra, K.S. Pister, Low-level control of a quadrotor with deep model-based reinforcement learning, IEEE Robot. Autom. Lett. 4 (4) (2019) 4224–4230.
[39] V.P. Tran, F. Santoso, M.A. Garratt, Adaptive trajectory tracking for quadrotor systems in unknown wind environments using particle swarm optimization-based strictly negative imaginary controllers, IEEE Trans. Aerospace Electron. Syst. 57 (3) (2021) 1742–1752.
[40] C. Sun, M. Liu, C. Liu, X. Feng, H. Wu, An industrial quadrotor UAV control method based on fuzzy adaptive linear active disturbance rejection control, Electronics 10 (4) (2021) 376.
[41] X. Kan, J. Thomas, H. Teng, H.G. Tanner, V. Kumar, K. Karydis, Analysis of ground effect for small-scale UAVs in forward flight, IEEE Robot. Autom. Lett. 4 (4) (2019) 3860–3867.
[42] V.P. Tran, M.A. Mabrok, S.G. Anavatti, M.A. Garratt, I.R. Petersen, Robust fuzzy Q-learning-based strictly negative imaginary tracking controllers for the uncertain quadrotor systems, IEEE Trans. Cybern. (2022).
[43] V.P. Tran, M.A. Mabrok, S.G. Anavatti, M.A. Garratt, I.R. Petersen, Robust adaptive fuzzy control for second-order Euler-Lagrange systems with uncertainties and disturbances via nonlinear negative-imaginary systems theory, IEEE Trans. Cybern. (2024).
[44] Y. Mei, G. Tan, Z. Liu, An improved brain-inspired emotional learning algorithm for fast classification, Algorithms 10 (2) (2017) 70.
[45] M. Eman, T.M. Mahmoud, M.M. Ibrahim, T.A. El-Hafeez, Innovative hybrid approach for masked face recognition using pretrained mask detection and segmentation, robust PCA, and KNN classifier, Sensors 23 (15) (2023) 6727.
[46] E. Lotfi, M.R. Akbarzadeh-T, A winner-take-all approach to emotional neural networks with universal approximation property, Inform. Sci. 346 (2016) 369–388.
[47] C.B.J. Morén, Emotional learning: A computational model of the Amygdala, Cybern. Syst. 32 (6) (2001) 611–636.
[48] P.K. Muthusamy, M. Garratt, H. Pota, J. Wang, J.M. Kok, Bidirectional fuzzy brain emotional learning control for aerial robots, in: 2018 IEEE Symposium Series on Computational Intelligence, SSCI, IEEE, 2018, pp. 146–153.
[49] P.K. Muthusamy, M. Garratt, H. Pota, R. Muthusamy, Real-time adaptive intelligent control system for quadcopter unmanned aerial vehicles with payload uncertainties, IEEE Trans. Ind. Electron. 69 (2) (2021) 1641–1653.
[50] N. Gunawardena, K.K. Leang, E. Pardyjak, Particle swarm optimization for source localization in realistic complex urban environments, Atmos. Environ. 262 (2021) 118636.
[51] P.K. Muthusamy, B. Suthar, R. Muthusamy, M. Garratt, H. Pota, L. Seneviratne, Y. Zweiri, Self-organising BFBEL control system for a UAV under wind disturbance, IEEE Trans. Ind. Electron. (2023).
[52] M. Taherkhani, R. Safabakhsh, A novel stability-based adaptive inertia weight for particle swarm optimization, Appl. Soft Comput. 38 (2016) 281–295.
[53] H.M. Sun, J.Z. Yu, X.L. Zhang, B.G. Wang, R.S. Jia, The adaptive particle swarm optimization technique for solving microseismic source location parameters, Nonlinear Process. Geophys. 26 (3) (2019) 163–173.
[54] C.T. Lin, C.G. Lee, Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems, Prentice-Hall, Inc, 1996.
[55] L. Eugene, W. Kevin, D. Howe, Robust and Adaptive Control with Aerospace Applications, Springer-Verlag London, England, 2013.
[56] D. Izzo, M. Ruciński, F. Biscani, The generalized Island model, in: Parallel Architectures and Bioinspired Algorithms, Springer, 2012, pp. 151–169.
[57] P. Spronck, I. Sprinkhuizen-Kuyper, E. Postma, DECA: The doping-driven evolutionary control algorithm, Appl. Artif. Intell. 22 (3) (2008) 169–197.
[58] J.A. Farrell, J. Murlis, X. Long, W. Li, R.T. Cardé, Filament-based atmospheric dispersion model to achieve short time-scale structure of Odor plumes, Environ. Fluid Mech. 2 (2002) 143–169.
[59] M. Graham, PomPy - Puff-Based Odour Plume Model in Python, Insect Robotics Group, University of Edinburgh, 2022, URL https://github.com/InsectRobotics/pompy. original-date: 2015-11-19T21:22:05Z.