Article
Smartphone IMU Sensors for Human Identification through Hip
Joint Angle Analysis
Rabé Andersson 1,*, Javier Bermejo-García 2, Rafael Agujetas 2, Mikael Cronhjort 1 and José Chilo 1

1 Department of Electrical Engineering, Mathematics and Science, University of Gävle, 801 76 Gävle, Sweden;
milcrt@hig.se (M.C.); jco@hig.se (J.C.)
2 Departamento de Ingeniería Mecánica, Energética y de los Materiales, Escuela de Ingenierías Industriales,
Universidad de Extremadura, 06006 Badajoz, Spain; javierbg@unex.es (J.B.-G.); rao@unex.es (R.A.)
* Correspondence: rabe.andersson@hig.se; Tel.: +46-26-648944

Abstract: Gait monitoring using hip joint angles offers a promising approach for person identification,
leveraging the capabilities of smartphone inertial measurement units (IMUs). This study investigates
the use of smartphone IMUs to extract hip joint angles for distinguishing individuals based on their
gait patterns. The data were collected from 10 healthy subjects (8 males, 2 females) walking on a
treadmill at 4 km/h for 10 min. A sensor fusion technique that combined accelerometer, gyroscope,
and magnetometer data was used to derive meaningful hip joint angles. We employed various
machine learning algorithms within the WEKA environment to classify subjects based on their hip
joint patterns and achieved a classification accuracy of 88.9%. Our findings demonstrate the feasibility
of using hip joint angles for person identification, providing a baseline for future research in gait
analysis for biometric applications. This work underscores the potential of smartphone-based gait
analysis in personal identification systems.

Keywords: smartphone sensors; IMU sensors; person recognition; machine learning classification;
human motion analysis

Citation: Andersson, R.; Bermejo-García, J.; Agujetas, R.; Cronhjort, M.; Chilo, J. Smartphone IMU Sensors for Human Identification through Hip Joint Angle Analysis. Sensors 2024, 24, 4769. https://doi.org/10.3390/s24154769

Academic Editor: Tad Brunye

Received: 30 April 2024; Revised: 19 July 2024; Accepted: 20 July 2024; Published: 23 July 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
The past few decades have witnessed unprecedented advancements in smartphone technologies. Each year, these handheld devices, which are equipped with a diverse array of sensors, have grown more sophisticated [1]. These sensors serve various functions, from accelerometers, magnetometers, and gyroscopes to environmental sensors like ambient light and temperature sensors [2,3].

Smartphone sensors have already found valuable applications across multiple fields, including health and rehabilitation, fitness tracking of physical activities, heart rate monitoring, and sleep pattern measurement [4,5]. In the automotive industry, they are employed for vehicle navigation and accident detection [6]. Environmental monitoring and augmented reality also benefit from the data collection and processing capabilities of smartphone sensors [7]. Building upon these technological advancements, smartphones have also emerged as a robust means of human identification through gait recognition [8–10].

Gait monitoring using sensory data has gained increasing research attention over the years because gait is a unique pattern of human locomotion. Every locomotion pattern is unique due to variations in magnitude and timing among people, yet identifiable because walking is performed naturally and habitually every day and involves many muscles and joints [8]. Like other biometric data such as iris scans, voice samples, and fingerprints, gait analysis has appeared in applications for security and identification purposes. Unlike those biometrics, however, gait patterns have a multidimensional and convoluted nature, making them far more difficult to mimic or steal than fingerprints or voice patterns [11].

Gait analysis can be captured by foot pressure sensors, wearable sensors, or a vision-based system using multiple cameras or stereo vision [12,13].

However, foot pressure sensors and vision-based systems require a spacious work area, are costly, and are restricted to laboratory research that depends on complex multi-sensor settings and specialized personnel, making them impractical for many applications [14,15]. Therefore, the integration of smartphone sensors for gait analysis and
person recognition offers numerous potential benefits [16–18]. First, the availability and
affordability of smartphone sensors make them accessible to a broad demographic; in fact,
85.74% of the world population used smartphones in the year 2023, as illustrated in [19].
Second, their portability and user-friendly interfaces render them ideal for delivering gait
recognition in various settings, from hospitals to home environments [20].
Numerous studies have investigated person recognition through gait analysis, em-
ploying various techniques and sensor placements. For example, Hoang et al. demon-
strated the feasibility of gait recognition using smartphone accelerometers, achieving up to
92.7% accuracy [21]. Connor and Ross provided a comprehensive survey on gait recognition
modalities, highlighting the effectiveness of full-body motion capture and foot pressure
patterns [12]. Makihara et al. discussed the use of multiple joints and their combined
movement patterns for accurate identification [22]. While these studies used multiple
sensor inputs, our focus on hip movement is justified by the work of Derawi et al., which
showed that hip rotation patterns are highly individualistic and can be effectively cap-
tured by waist-mounted accelerometers. Concentrating on hip movement reduces the
impact of variations in arm swing or upper body movements, making this approach more
user-friendly [23].
Hip joint analysis offers several advantages: it is central to gait biomechanics as the
primary connection between the lower limbs and the trunk [24]; requires fewer sensors
than full-body analysis [25]; preserves privacy better than facial recognition or full-body
gait analysis [26]; is sensitive to pathological, neurological, and musculoskeletal condi-
tions [27]; applies directly to many rehabilitation protocols [28]; and correlates strongly with
energy expenditure during walking, as 45% of the mechanical energy comes from the hip
joint [29,30]. These benefits, combined with smartphone ubiquity, make hip joint analysis a
promising approach to study for gait movement monitoring and person recognition.
However, gait movement data can be hard to interpret or analyze [15,31]; thus, researchers apply machine learning techniques to gait recognition and to the analysis of data captured by cameras, force-based systems, or force-sensitive resistors. For this reason, multiple
machine learning algorithms, such as support vector machine (SVM), neural networks
(NNs), long short-term memory neural network (LSTM NN), recurrent neural network
(RNN), naive Bayes (NB), linear discriminant analysis (LDA), and hybrid convolutional
neural network (HCNN), were utilized in previous studies for gait recognition and clas-
sification to extract a comprehensive understanding of biometric data based on human
movements [9,32,33]. Many ML techniques showed their superior performance and power-
ful use in various fields, including human recognition, manufacturing, robotics, quality
inspection, sports performance analysis, and medical diagnosis [31,34].
Therefore, this study explores the use of smartphone IMU sensor data with ML
classification techniques, including perceptron, logistic regression, nearest neighbor rule,
naive Bayes, and random forests, to recognize persons based on their hip joint angle. We
validate the smartphone gait measurements by comparing them with a motion capture system (MCS) using a pendulum test bench. Our future goal is to apply this identification approach in a hip joint rehabilitation exoskeleton within a healthcare environment, where data can be shared anonymously with a healthcare system to recognize patients before administering or following up on their treatments, without exposing any sensitive personal information.
The paper is structured as follows: Section 2 describes the theory behind the steps
conducted in the study, while Section 3 discusses the methods used throughout this research,
with subsections that shed light on the measurement comparison, the test bench with the
MCS and smartphone configuration, and the machine learning analysis. Section 4 is
devoted to presenting the results and discussion. Finally, Section 5 concludes with the
paper’s findings and analysis.

2. Theory
To conduct a hip joint analysis using smartphone sensors, we outline the theoretical backbone of our research, beginning with the rationale behind human gait analysis, followed by the role of inertial measurement units (IMUs) in smartphones and the signal-processing techniques used.

2.1. Human Gait Analysis


Human walking is fundamentally a periodic activity characterized by repetitive mo-
tions of body segments. Human gait analysis is pivotal in rehabilitation for the precise
identification of walking abnormalities and biomechanical inefficiencies. By analyzing gait
patterns, clinicians can tailor rehabilitation programs to address specific deficits, thereby
enhancing the efficacy of interventions and improving patient outcomes, or combining gait analysis with robotic exoskeleton rehabilitation sessions [35]. Since exoskeleton devices comprise
sensors, actuators, and electronic circuits operating in close contact with patients, under-
standing the intricacies of gait mechanics is pivotal for ensuring that these devices operate
safely and reliably, identifying walking abnormalities for orthotics or prosthetic leg users,
and optimizing movements to mitigate posture-related issues [30].
Gait analysis can include kinematics, kinetics, or EMG measurements. Gait kinematics
describes the motion of the major joints and segments of the lower extremity, such as
the hip, knee, ankle, and foot, while gait kinetics studies the forces that result from the movement of human gait segments, including the ground reaction force, joint reaction force, and joint torque. EMG sensors measure the electrical activity of the muscles that control the movement of these segments, such as the quadriceps, hamstrings, gastrocnemius, and tibialis anterior [35]. However, among these techniques, the measurement of gait
kinematics is pivotal for recognizing the gait phases, joint angles, and segment movements.
To systematically evaluate human gait in a kinematics-based manner, an MCS, with
the use of computer vision techniques, stands out for its unparalleled accuracy, utilizing
either marker-based or markerless methods to achieve results precise to within 1 mm [36].
These systems use reflective markers placed on the person’s body or markerless image
processing techniques to capture human gait. However, these systems are expensive and
out of reach for many clinicians, especially in developing countries [37].
Thus, alternative kinematic measurement techniques can be performed using wearable
sensors such as inertial measurement units (IMUs) or magnetic, angular rate, and gravity
(MARG) sensors fused in micro-electro-mechanical systems (MEMSs) in smartphones for
indoor and outdoor measurements, in contrast with MCSs for laboratories.

2.2. IMUs and MARG in Smartphones


Smartphones have multiple embedded motion sensors, such as accelerometers, gyro-
scopes, and magnetometers, that can offer alternative solutions compared with external
IMUs for gait recognition. In addition, IMUs and MARG can possess multiple advantages,
such as low power consumption, lightweight, and ease of use [38,39]. These sensors are
three-axial sensors that capture the acceleration ai along the X, Y, and Z axes corresponding
to the roll (ϕ), pitch (θ), and yaw (ψ) axes, as shown in Figure 1 and represented in (1).
ai = [ax ay az]ᵀ (1)

The gyroscope and magnetometer are also three-axial sensors. The accelerometer
gauges three-axial linear acceleration (ai ), the gyroscope quantifies angular velocity (ωi ),
and the magnetometer assesses the magnetic field (mi ). These sensors’ data can be used to
analyze human gait movements and to find the joint trajectories by fusing them to obtain
the orientation angles [35].

Figure 1. (a) The three-axial sensor measurements of the smartphone; (b) the fixed reference frame of
the earth [40].

2.3. The Sensor Fusion and Signal Processing


The sensor measurements of MARG and IMU usually have intrinsic drift and high-
degree noise, making it challenging to reconstruct trajectories and orientation estimation
directly [41]. Therefore, finding the orientation estimation of the smartphone (ϕ, θ, ψ) based
only on angular velocities (ω) from a gyroscope sensor gives inaccurate measurements,
as they include bias (bω ) (low-frequency noise) and Gaussian noise (WNoise ) associated with
the true angular velocity (ωtrue ). Integrating these gyroscope errors leads to a drift in the orientation estimation over time. Similarly, using
only accelerometer measurements (a) is an inadequate method for orientation estimation
as its measurements include bias (ba ), noise (WNoise ), gravitational acceleration (agravity ),
and non-gravitation acceleration (atrue ), which consequently lead to estimation with high-
frequency noise. Thus, accumulated errors caused by integration calculations of gyroscope
measurements can be solved by fusing measurements provided by an accelerometer and
magnetometer in one of the sensor fusion techniques [38,42]. However, the magnetometer
measurements (m) suffer from the influence of magnetic field interference (minit ), bias (bm ),
and noise (WNoise ) in addition to the true magnetometer measurement (mtrue ), as shown in
Equations (2)–(4) [43].
ω = ωtrue + bω + WNoise , (2)
a = atrue + agravity + ba + WNoise , (3)
m = mtrue + minit + bm + WNoise . (4)
Therefore, the sensor fusion technique is an essential procedure to determine the
accurate orientation. This technique combines data from multiple sensors to provide
a more reliable and comprehensive understanding of drift-free and noise-free spatial
orientation [37].
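To make the drift argument concrete, the following minimal sketch (Python; the bias and noise magnitudes are illustrative assumptions, not values from this study) simulates Equation (2) for a stationary sensor and integrates it naively:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 0.01, 6000                    # 60 s at a 100 Hz sampling rate
omega_true = np.zeros(n)              # the sensor is actually at rest
b_omega, sigma = 0.02, 0.05           # assumed bias and noise levels (rad/s)

# Equation (2): measured rate = true rate + bias + Gaussian noise
omega = omega_true + b_omega + sigma * rng.standard_normal(n)

# Naive orientation estimate: integrate the measured angular velocity
angle = np.cumsum(omega) * dt
print(f"drift after 60 s: {np.degrees(angle[-1]):.1f} degrees")
```

Even a modest constant bias integrates into tens of degrees of drift within a minute, which is why the accelerometer and magnetometer corrections described next are essential.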
In the literature, sensor fusion algorithms (SFAs) are predominantly categorized into
deterministic and stochastic frameworks. Within the deterministic paradigm, algorithms
such as linear complementary filters (LCFs) and nonlinear complementary filters (NCFs)
are commonly employed. In contrast, the stochastic domain encompasses a diverse array
of algorithms, including linear Kalman filters (LKFs), extended Kalman filters (EKFs),
complementary Kalman filters (CKFs), square root unscented Kalman filters (SRUKFs),
and square root cubature Kalman filters (SRCKFs) [44].
It is noteworthy that LKF has been utilized for orientation estimation in MARG
and IMU sensor arrays, but it has limitations in adequately addressing the inherent non-
linearities present in real-time systems [37]. Thus, many studies have proposed advanced
SFAs such as the EKF used in the attitude heading reference system (AHRS) [45].
AHRS here is an indirect, quaternion-based EKF algorithm that can estimate magnetic disturbances, mitigating the effect of the interference minit; this robustness makes it useful in various applications [42]. It consists of a two-step
process: prediction and correction to determine the orientation q (ϕ, θ, ψ), as illustrated
in [46] and shown in Figure 2 [47]. The prediction phase relies on gyroscope data, integrating the angular velocity (ω), while the correction phase utilizes acceleration
signals (a) and magnetometer readings (m). The AHRS algorithm updates the orientation
estimation q by comparing it with the orientation in the prediction phase to minimize
the error between the estimated and the actual orientation through iterative correction
procedures based on the accelerometer and the magnetometer [48].
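The AHRS filter used in this study is the indirect EKF described above; as a simplified illustration of the same predict/correct idea, the sketch below implements a first-order complementary filter for the pitch angle only (a stand-in, not the EKF itself; the axis convention and the weight alpha are assumptions):

```python
import numpy as np

def complementary_pitch(acc, gyr, dt=0.01, alpha=0.98):
    """Illustrative predict/correct fusion for the pitch angle only.

    acc: (N, 3) accelerometer samples in m/s^2
    gyr: (N, 3) gyroscope samples in rad/s; pitch rate assumed on axis 1
    dt:  sample period (0.01 s for the 100 Hz recordings)
    alpha: trust in the gyroscope prediction; 1 - alpha applies the
           gravity-based correction from the accelerometer
    """
    pitch = np.zeros(len(acc))
    for k in range(1, len(acc)):
        # prediction: integrate the angular velocity
        predicted = pitch[k - 1] + gyr[k, 1] * dt
        # correction: pitch implied by the measured gravity direction
        ax, ay, az = acc[k]
        measured = np.arctan2(-ax, np.hypot(ay, az))
        pitch[k] = alpha * predicted + (1.0 - alpha) * measured
    return np.degrees(pitch)

# pitch = complementary_pitch(acc, gyr)  # acc, gyr taken from the phone's logs
```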

Figure 2. The attitude heading reference system (AHRS) algorithm.

3. Methods
3.1. Participants
In this paper, 10 subjects (8 males and 2 females) were asked to walk on a treadmill (BH
F2W Dual) for 10 min at a constant normal walking speed of 4 km/h. Summary statistics
(mean ± standard deviation (SD)) included age (33.7 ± 7.65) ranging from 22 to 45 years,
weight in kg (63.75 ± 10.33), height in m (1.74 ± 0.09), and body mass index in kg/m² (21.75 ± 2.24).
The subjects were in good health and free of any visible walking impairments. All
subjects were instructed to participate in a preliminary warm-up session for 5 min of
walking on a treadmill to guarantee the safety of the participants and familiarize them
with the treadmill environment before the experimental measurements. All experimental
procedures were approved by the Local Ethics Committee at the University of Extremadura.

3.2. Comparison of Measurements


To compare the angle estimation using a smartphone with a motion capture system
(MCS), a pendulum was set on a kinematic test bench, located in the mechanical engineering
laboratory at the University of Extremadura, as shown in Figure 3a. It consisted of a
smartphone placed at one end of a link, while the other end articulated at a fixed point
(referred to as point O), allowing it to rotate 360°. This same point coincided with the center
of a goniometer, enabling the user to select the initial amplitude (θ0 ) at which the oscillation
began at time t = 0.
The idea was to compare the angle measurements of a pendulum using a smartphone
and an MCS. The MCS consisted of 8 cameras (Optitrack, Natural Point, Corvallis, OR,
USA). The cameras’ frame rate was set at 100 Hz. Before starting the recordings, a cali-
bration of the space was performed using a calibration square on the floor and a T-wand.
To measure the angle, three markers were placed on the test bench, one on the test bench
corner, another marker at the tip of the pendulum axis, and the last one at the base of the
pendulum. Figure 3a shows the configuration of the camera system and the position of the
markers on the test bench for the pendulum measurements.
The comparison was conducted by moving the pendulum at small angles (around 10°). Given the forces acting on a pendulum as shown in Figure 3b, the pendulum angle
can also be theoretically calculated from the differential equation, which is
θ̈ + (g/l) θ = 0, (5)

where θ̈ represents the pendulum's angular acceleration, g is the acceleration of gravity, l is the length of the pendulum, and θ is the angular position.
Consider the equation for simple harmonic motion,

θ(t) = A sin(ωt + ϕ), (6)

where A is the amplitude of motion, ω denotes the angular frequency, t is the time variable, and ϕ is the phase angle. Note that the coefficient g/l multiplying θ in Equation (5) is the square of the angular frequency, ω². Therefore, for t = 0 and small angles (around 10°), the equations that describe the kinematics of the pendulum are

θ(t) = θ0 sin(ωt + π/2), (7)

θ̇(t) = ω θ0 cos(ωt + π/2), (8)

θ̈(t) = −ω² θ0 sin(ωt + π/2), (9)

where θ(t), θ̇(t), and θ̈(t) are the angular position, angular velocity, and angular acceleration, all three depending on time and the initial angular position θ0.
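As a numerical illustration, the sketch below evaluates Equations (7)-(9) for a 10° release; the link length is a hypothetical value, as the bench dimensions are not given in the text:

```python
import numpy as np

def pendulum_kinematics(t, theta0_deg=10.0, length_m=0.5, g=9.81):
    """Small-angle pendulum kinematics from Equations (7)-(9)."""
    theta0 = np.radians(theta0_deg)
    omega = np.sqrt(g / length_m)        # omega^2 = g/l, from Equation (5)
    phase = omega * t + np.pi / 2        # released from rest at t = 0
    theta = theta0 * np.sin(phase)                   # Eq. (7), rad
    theta_dot = omega * theta0 * np.cos(phase)       # Eq. (8), rad/s
    theta_ddot = -omega**2 * theta0 * np.sin(phase)  # Eq. (9), rad/s^2
    return np.degrees(theta), theta_dot, theta_ddot

t = np.arange(0.0, 5.0, 0.01)            # 5 s sampled at 100 Hz, as in the MCS
theta_deg, _, _ = pendulum_kinematics(t)
```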

Figure 3. (a) The test bench configuration to compare the MCS and the smartphone measurements;
(b) the theoretical sketch of the configuration of a pendulum.

3.3. Hip Joint Identification


The identification of hip joint angles was conducted in two parts: data acquisition, followed by feature extraction and ML classification techniques.

3.3.1. Data Acquisition


For this study, 2 smartphones were used to measure the hip joint angle based on
3-axial measurement (X, Y, and Z) coordinates: the acceleration in meters per second
squared (m/s2 ), the angular velocity in radians per second (rad/s), and the magnetic field
in microteslas (µT). One smartphone was mounted on the subject's torso, while the other was attached to the thigh in a pendulum-like configuration, as shown in Figure 4.
The sampling rate for recording the data was 100 Hz.

Figure 4. The smartphone setup on a subject during the test.

After mounting the smartphones on the subject’s torso and thigh, an initial calibra-
tion was conducted, where the subject was standing in an upright position for 5 s. This
calibration is needed for high-accuracy measurements as MEMS sensors are manufacturer-
calibrated, but some errors can arise over time [43]. In addition to the calibration procedure
and to mitigate the impact of wearing error, we implemented a standardized protocol for
smartphone placement to minimize variability. The device was securely fastened to the
lateral side of the thigh and torso using a specially designed adjustable strap, ensuring
consistent positioning across participants. This approach is supported by Jayasinghe et al.,
who demonstrated a strong correlation (above 0.9) between loose clothing-mounted sensors
and body-mounted sensors when placed on the thigh and shank. This correlation indicates
that our methodology of placing smartphones on the thigh is robust against variations in
wearing conditions, thus minimizing potential biases [49].
Then, with the sensor fusion technique, the sensors provide Euler angles ϕ1 , θ1 , and ψ1
for the torso from the orientation around X, Y, and Z of smartphone 1 and ϕ2 , θ2 , and ψ2 for
the thigh, which come from the orientation around X, Y, and Z of smartphone 2. Therefore,
the relative hip joint angle (ϕh , θh , ψh ) was calculated as the difference between the torso
and thigh angles, as illustrated in Figure 4 and the following equations:

ϕh = ϕ2 − ϕ1 , (10)

θ h = θ2 − θ1 , (11)
ψh = ψ2 − ψ1 , (12)
where ϕh is the roll angle that represents the abduction/adduction of the hip joint, while θh
represents the flexion/extension and ψh represents the internal/external rotation because
the hip joint is mechanically represented as a ball-and-socket joint. However, our angle
of interest for this study was the hip joint flexion/extension, as shown in Figure 4, as the
data serve a planned hip rehabilitation exoskeleton moving in a sagittal plane like the one
shown in the research [30,50]. The functions to connect the smartphones with MATLAB
version R2024a in the cloud and the AHRS filters were called using MATLAB functions and
the “sensor fusion and tracking” toolbox, as illustrated in [51,52]. Therefore, the flowchart
for acquiring and processing data from two smartphones is illustrated in Figure 5, which
was the pre-stage for feature extraction and classification techniques.
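A minimal sketch of Equations (10)-(12), assuming the fused Euler angles of both phones are already available as arrays (the file names and column order below are hypothetical):

```python
import numpy as np

# Fused Euler angles in degrees, one row per 100 Hz sample;
# columns assumed as [roll phi, pitch theta, yaw psi]
torso = np.loadtxt("torso_euler.csv", delimiter=",")   # smartphone 1
thigh = np.loadtxt("thigh_euler.csv", delimiter=",")   # smartphone 2

# Equations (10)-(12): relative hip joint angle = thigh minus torso
phi_h, theta_h, psi_h = (thigh - torso).T

# theta_h is the sagittal-plane flexion/extension angle analyzed here;
# positive values denote flexion, negative extension (upright stance = 0)
```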

Figure 5. The flowchart for acquiring and processing sensor data.

3.3.2. Feature Extraction and Classification Techniques


For this study, the duration of data acquisition for each participant was consistently
fixed at 600 s, which was divided into 100 intervals, with each interval lasting 6 s. Each
interval was considered a separate walking trial. Therefore, 100 trials were used to capture
the normal walking patterns of each participant to facilitate the feature extraction process
and, subsequently, the training models using ML. Furthermore, 10 classes were assigned to
represent 10 subjects, and each class contained 100 feature vectors.
As people have various walking styles, we employed statistical calculations focusing
on the angles of the hip joint—expressed in degrees, where 0 degrees represent the case
when the subject is in an upright position and positive and negative degrees represent
flexion and extension, respectively, as shown in Figure 4 [53]. Time domain features are ex-
tensively used in biological systems due to their lower computational complexity and ease
of implementation [54]. This led to extracting nine time-domain distinct features of each
trial per subject (class). The features capture the dynamic characteristics of hip trajectory,
providing critical insights into the variability and overall patterns of movement. The ex-
tracted features were mean value (MV), median (M), maximum angle, covariance (COV),
minimum angle, variance (VAR), standard deviation (SD), kurtosis (KUR), and skewness
(SKE) of 100 datasets for each subject. The detailed calculations of nine extracted features
are shown in Table 1 and discussed in [40,55].
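A sketch of the segmentation and feature extraction, following the definitions in Table 1 (the pairing used for the covariance feature is not specified in the text, so computing it against the time index is an assumption):

```python
import numpy as np
from scipy import stats

def trial_features(angle):
    """Nine time-domain features of one 6 s trial (600 samples at 100 Hz)."""
    t = np.arange(len(angle))
    return np.array([
        np.mean(angle),                        # MV
        np.median(angle),                      # M
        np.max(angle),                         # maximum angle
        np.cov(angle, t)[0, 1],                # COV (assumed vs. time index)
        np.min(angle),                         # minimum angle
        np.var(angle, ddof=1),                 # VAR
        np.std(angle, ddof=1),                 # SD
        stats.kurtosis(angle, fisher=False),   # KUR (Pearson definition)
        stats.skew(angle),                     # SKE
    ])

# 600 s at 100 Hz -> 100 trials of 600 samples per subject
angle_trace = np.random.randn(60000)           # placeholder for one subject
trials = angle_trace.reshape(100, 600)
feature_vectors = np.apply_along_axis(trial_features, 1, trials)  # (100, 9)
```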
For hip joint angle analysis and classification, principal component analysis (PCA)
was applied to the dataset with nine extracted features of the hip joint angles for 10 subjects.
The purpose of PCA is to visualize the class regions in the space of predictor variables
(features) and to reduce the dimensionality of the data while retaining most of the variance
present in the original features [56–58]. Based on the explained variance, the first three
principal components were selected for further analysis, as they collectively account for
99.86% of the total variance. Further details and the rationale for this selection are discussed
in Section 4.3.
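A sketch of the dimensionality reduction (whether the features were standardized before PCA is not stated in the text, so any preprocessing is left out here as an assumption):

```python
import numpy as np
from sklearn.decomposition import PCA

# X: (1000, 9) matrix of feature vectors (10 subjects x 100 trials each)
X = np.random.randn(1000, 9)                   # placeholder data

pca = PCA(n_components=9).fit(X)
print(np.round(100 * pca.explained_variance_ratio_, 4))  # cf. Table 3

# Keep the first three scores (99.86% of the variance in this study)
X_reduced = pca.transform(X)[:, :3]
```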
After the PCA representation was performed, we employed a variety of ML methods
from an open-source data-mining tool called the Waikato Environment for Knowledge
Analysis (WEKA) software, version 3.8.6, to classify the extracted features [59–62]. The clas-
sification methods were chosen from five well-known classification categories, namely,
Bayesian, function, lazy, rule, and tree classifiers, and to train the models, we chose
16 various classifier algorithms, as detailed in Table 2. The results were evaluated using
classification accuracy and receiver operating characteristic (ROC). Classification accuracy
measures the overall correctness of a model. The ROC curve, along with its area under
the curve, evaluates a classifier’s ability to distinguish between classes across various
thresholds [63–65].

Table 1. The mathematical representations and descriptions of the features [40,55].

Feature | Description | Mathematical Definition
Mean Value (MV) | The average of all angles in the sequence. | MV = (1/N) ∑ᵢ₌₁ᴺ xᵢ
Median (M) | The middle angle value in the ordered sequence. | M_odd = x₍(N+1)/2₎; M_even = (x₍N/2₎ + x₍N/2+1₎)/2
Maximum Angle | The largest angle observed. | x_max
Covariance (COV) | Indicates how two angle variables vary together. | COV = (1/(N−1)) ∑ᵢ₌₁ᴺ (Xᵢ − X̄)(Yᵢ − Ȳ)
Minimum Angle | The smallest angle observed. | x_min
Variance (VAR) | Measures the dispersion around the mean angle. | VAR = (1/(N−1)) ∑ᵢ₌₁ᴺ xᵢ²
Standard Deviation (SD) | Shows the amount of variation or dispersion of angle values. | SD = √( (1/(N−1)) ∑ᵢ₌₁ᴺ (xᵢ − x̄)² )
Kurtosis (KUR) | Describes the sharpness or flatness of the angle distribution. | KUR = (1/N) ∑ᵢ₌₁ᴺ ((xᵢ − x̄)/σ)⁴
Skewness (SKE) | Shows the asymmetry in the angle distribution. | SKE = (1/N) ∑ᵢ₌₁ᴺ ((xᵢ − x̄)/σ)³

N: number of samples for each trial.

Table 2. The classification algorithms in WEKA.

No. | Category | Algorithms
1 | Bayesian Classifiers | BayesNet, NB
2 | Function Classifiers | Logistic-R, MultiPerceptron, SMO, Simple Logistic, ClassViRegression
3 | Lazy Classifiers | KStar, LWL, IBk
4 | Rule Classifiers | JRip, PART
5 | Tree Classifiers | J48, LMT, RF, REPTree

In the context of WEKA, Bayesian classifiers, such as Bayesian network (BayesNet)


and naive Bayesian (NB), use Bayes’s theorem to generate probabilistic outputs. BayesNet
represents a set of variables via a directed acyclic graph (DAG), where each node is a random variable and the edges encode the probabilistic dependencies among the variables.
The NB uses the Bayesian theorem with strong (naive) assumptions among the extracted
features [60,66,67]. Still, determining the optimal sample size for training data is a crucial
factor for achieving accurate classification performance, as it enables a closer approximation
of the true data distribution. Thus, a novel approach has been introduced recently to
estimate the minimum training sample size required for a Bayes classifier, which is detailed
in a recent study [68]. This method employs a proxy learning curve, providing a practical
framework for researchers to gauge the quantity of data necessary for their models to
perform effectively. In the realm of the Naive Bayes (NB) classifier, it is important to
note the simplifying assumption of feature independence given the class label, which,
despite its potential to misrepresent the feature interdependencies, often results in a robust
baseline for classification tasks due to its computational efficiency and surprisingly effective
performance in high-dimensional settings.

The function classifiers utilize neural network and regression procedures in their
functions [61]. Function classifiers such as logistic regression, multilayer perceptron
(MultiPerceptron), sequential minimal optimization (SMO), simple logistic, and classi-
fication via regression (ClassViRegression) employ mathematical functions to represent
relationships among the data connections. The MultiPerceptron and logistic regression are
non-parametric supervised classifications, but the multilayer perceptron can predict more
complex features than logistic regression, which usually predicts binary outcomes [69].
SMO utilizes the support vector machine (SVM) algorithm in its training procedure, whereas simple logistic is a condensed version of logistic regression; classification problems can, however, also be handled conveniently through regression.
Lazy classifiers such as KStar calculate the distance between instances by employing a
probabilistic measure based on the potential transformation of one instance into another,
whereas the locally weighted learning (LWL) modifies the weight of each neighbor accord-
ing to a distance function. However, the instance-based k (IBk) is the WEKA’s k-nearest
neighbor (kNN) algorithm. All lazy classifiers defer model building until prediction time,
making them efficient for certain datasets [70].
Rule classifiers such as Java repeated incremental pruning (JRip) implement the repeated incremental pruning to produce error reduction (RIPPER) algorithm [71]. RIPPER builds a set of classification rules by repeatedly adding rules to the model to cover many instances while minimizing the error from overfitting [72]. A partial
decision tree (PART) combines both decision trees with rule-based learning. Instead of
building a full decision tree, it establishes rules by building and pruning partial decision
trees; thus, it is called PART. PART utilizes partial C4.5 combined with the RIPPER algorithm
in learning [73]. Both JRip and PART are known for producing models that are relatively
easy to interpret.
Lastly, the tree classifiers were used, as they are among the most used classification
techniques because of their ease of implementation [74]. Among the tree classifiers is J48,
which implements the C4.5 algorithm developed by Ross Quinlan for generating decision
trees from a set of training data and uses the concept of information entropy [75]. The other
algorithm was the logistic model tree (LMT), which builds a decision tree with simple class values at the leaves and a logistic regression model at each node. The LMT captures both linear and non-linear structure in the data by combining decision trees with logistic regression methods. Meanwhile, random forest (RF) can classify large amounts of data accurately, as it uses a multitude of decision trees and outputs the mode of the classes predicted by the individual trees. RF is robust to a large number of extracted features due to its capability to deal with overfitting [74]. However, for a fast decision tree
learning procedure, a reduced error pruning tree (REPTree) was used. The algorithm builds
a decision tree based on the gain/variance information to prune it using reduced error
pruning with backfitting. REPTree uses the methods from C4.5 and the REP concept in its
procedures [74]. Both RF and REPTree are known for their efficiency on large datasets [76].

4. Results and Discussion


4.1. Comparison of the Measuring Systems
In this initial step before data acquisition, we examined the accuracy of measurements
using smartphone sensors and motion-capture systems, benchmarked against a specially
designed pendulum test bed. The focus was on assessing the precision and reliability
of smartphone angle measurement compared with a well-known and precise measure-
ment system represented by the MCS, thereby establishing confidence in smartphone
measurements utilizing sensor fusion algorithms. The results are shown in Figure 6.

Figure 6. Comparison of measurements of MCS and smartphone in the pendulum test bench.

The obtained results reveal a high degree of congruence between the two systems
across the oscillatory motion of the pendulum. Notably, the amplitude consistency and
frequency alignment between the smartphone and MCS data were observed, with small
differences in the amplitude of the two measurements. For this reason, the measurement performance of the smartphone relative to the MCS can be quantified using the root mean square error (RMSE):

RMSE = √( ∑ᵢ₌₁ᴺ (M(MCS) − M(smartphone))² / N ), (13)

where N is the number of observation samples over time (N = 6500 for these measurements), M(MCS) is the reference measurement obtained with the MCS, and M(smartphone) is the measurement conducted by the smartphone.
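Equation (13) reduces to a one-liner; a minimal sketch, assuming the two angle traces have been time-aligned and resampled to the same N samples:

```python
import numpy as np

def rmse(reference, estimate):
    """Equation (13): RMSE between the MCS and smartphone angle traces."""
    diff = np.asarray(reference, float) - np.asarray(estimate, float)
    return np.sqrt(np.mean(diff ** 2))

# rmse(mcs_angle, phone_angle) evaluated to 0.34 in this study
```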
Therefore, the calculated RMSE of 0.34 indicates the efficacy of the smartphone in capturing the pendulum's motion. Meanwhile, the attenuation in amplitude over time was consistent across both measurement modalities, suggesting a linear damping characteristic that is likely attributable to aerodynamic drag and mechanical friction at the pivot point. Furthermore, we noticed that the amplitude from the smartphone was less than that of the MCS, but this did not affect the ability to identify subjects based on their gait patterns.

4.2. Hip Joint Angles


Through sensor fusion procedures in MATLAB, we transform raw data from ac-
celerometer, gyroscope, and magnetometer sensors into hip joint angle measurements.
The sensor fusion technique is detailed in the flowchart shown in Figure 5.
This study pays special attention to movements within the sagittal plane, underpinning
its relevance to rehabilitation scenarios. Therefore, we incorporate the significance of the
pitch angle, outputted by the AHRS algorithm, as shown in Figure 2. The acceleration
measurement, angular velocity, and magnetic field that are involved in the sensor fusion of
a subject under the experimental test can be seen in Figure 7, as well as their results from
the sensor fusion algorithm representing the hip joint angles being tested.

Figure 7. The acceleration, angular velocities, and magnetic field with the hip joint angle of a
test subject.

Although the graph shows only 20 s, large step-to-step differences are not expected, as the subject walked on a treadmill at constant speed on a level surface (a controlled environment). However, the measurement data showed that there were differences in hip joint angles among the 10 subjects (people have various walking styles) and slight changes (a few degrees) within a subject's steps when the subject became tired [53]. Within this context,
some of these features extracted from the hip joint angle, such as the mean value and the
median, for all the subjects under the experiment are shown in Figure 8.

Figure 8. The boxplot for the mean value and the median features for all the experimental subjects.

4.3. Classification Analysis


Principal component analysis (PCA) was applied to the dataset with nine features
to visualize their relationships, reduce the dimensionality of the data, and capture the
variance in the data. For our study, we initially considered all nine components, as each
component captures a part of the total variance. However, after analyzing the explained
variance that is shown in Table 3, we determined that the first three principal components
collectively account for 99.86% of the total variance, which is substantial and sufficient for
our analysis. Therefore, we decided to use the first three PCA components for visualization
and further analysis, ensuring that most of the information is retained while simplifying
the dataset by reducing the dimensionality from nine to three.

Table 3. Explained variance of principal components.

Principal Component 1 2 3 4 5 6 7 8 9
Explained Variance (%) 91.1337 8.4542 0.2717 0.1325 0.0073 0.0005 0.0001 0.0000 0.0000

The choice of using the first three components is visually supported by the 3D PCA
plot in Figure 9, where each data point represents a feature vector and each color represents
a subject. The three axes (PC1, PC2, PC3) represent the directions of maximum variance in
the data. PC1 accounts for the most variance, followed by PC2 and then PC3. The different
colors represent the 10 different subjects. Each subject, demarcated by a unique color,
presents a cluster formed by the data points, which indicate distinct separation between
subjects’ gait patterns, demonstrating the efficacy of the dimensionality reduction. However,
the proximal clustering within each subject suggests a high intra-subject consistency in
gait features, while the spatial segregation between subjects underscores the inter-subject
variability. Therefore, if the clusters are well separated, it suggests that the PCA has done
a good job of distinguishing between the different subjects’ gait patterns. Additionally,
if any points lie far away from the main clusters, they could be considered outliers. In our case, we do not see any significant outliers, which suggests that the gait patterns are relatively consistent within each subject.

Figure 9. The 3D principal component analysis (PCA) for all subjects with color coding to differentiate
each subject.

The trajectories formed by the points (from one end of the graph to the other) can
indicate the progression of the gait cycle for each subject. However, subjects 1, 3, and 9
form tight clusters, indicative of consistent and stable gait patterns with little variation.
Subjects 2, 4, and 5 have more spread along the principal component axes, which may imply
variability in specific gait characteristics. However, the spread is controlled, suggesting
that these variations are systematic and could be related to individual walking styles or
physiological differences. Subject 6 shows a distinct distribution, potentially indicating
unique gait features that may differ significantly from the other subjects, while subjects 7
and 8 show a spread in the PCA space that suggests unique gait patterns. These subjects
may have gait features that are less common among the cohort, which could be indicative
of unique biomechanical traits. Lastly, subject 10 has its data points isolated from the
rest, particularly along PC3. Such separation suggests that this subject has a gait pattern
with distinct characteristics that are not shared with the other subjects, which could be of
particular interest for specific gait analysis.
Sixteen algorithms from the five machine learning classifier categories were trained and tested on the hip joint angles using the nine extracted features. The data were trained by running each algorithm 10 times with 10-fold cross-validation in WEKA. This involves splitting the data into 10 subsets, with 1 subset used for testing and the rest for training in each iteration. Additionally, stratified cross-validation is used to maintain the class distribution. The final estimate is the average of the 10 iterations, with an optional standard deviation. Ten folds are preferred due to their proven accuracy and theoretical support, and repeated stratified cross-validation further enhances reliability.
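The study performed this protocol in WEKA; for readers working in Python, a roughly equivalent sketch with scikit-learn is shown below (the random forest merely stands in for any of the 16 classifiers in Table 2, and the data are placeholders):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Placeholder data: 10 subjects x 100 trials, 3 PCA scores per trial
X = np.random.randn(1000, 3)
y = np.repeat(np.arange(10), 100)

runs = []
for seed in range(10):  # 10 repetitions, as in the WEKA runs
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    runs.append(cross_val_score(clf, X, y, cv=cv).mean())

print(f"CA = {100 * np.mean(runs):.1f} ± {100 * np.std(runs):.1f} %")
```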
When investigating classification accuracy (CA) and other metrics for various classifi-
cation algorithms, certain tuning parameters were adjusted within the WEKA environment.
WEKA’s graphical user interface provides a user-friendly platform for this purpose. For ex-
ample, in the case of BayesNet, a combination of the SimpleEstimator with an alpha range
of 0.5–0.8 and the LAGDHillClimber algorithm was used. This particular configuration
showed superior CA compared with alternatives such as K2, SimulatedAnnealing, and
TAN, which are also available in WEKA. As for the Naive Bayes (NB) classifier, the exper-
iments were performed by switching the KernalEstimator between true and false while
keeping the other WEKA default parameters constant. The MultiPerceptron was another
interesting algorithm that was tested with both 5 and 6 hidden layers. To do this, we
switched back and forth between ‘a’ and ‘t’ in the WEKA object editor. The sequential
minimal optimization (SMO) algorithm was configured with the PolyKernel. This configu-
ration resulted in the highest CA compared with other kernels such as Puk, StringKernel,
and RBFKernel, all with their default parameters. Other classifiers such as SimpleLogistic,
classifierViaRegression, KStar, PART, Logistic-R, and LMT were run with their default
parameters in WEKA.
Local Weighted Learning (LWL) was selected with its default settings, but the Lin-
earNNSearch algorithm was preferred over other options such as KDTree, Cover, and Bal-
Tree due to its higher CA. The instance-based K-nearest neighbor (IBK) classifier was tested
with a KNN value of 9, which outperformed the other KNN values from 1 to 13 in terms
of CA. JRip was run with an option of 9 folds for pruning, as this was found to be more
effective compared with the default value of 1 fold. The J48 classifier was utilized with a
confidence factor of 0.25, which is the default setting. Adjusting this value resulted in no
significant differences in CA. The random forest (RF) classifier was selected with certain
parameters: MaxDepth was set to 0, the number of trees in the forest to 100, and numFea-
ture to 0, which determines the number of randomly selected attributes. Finally, REPTree
was selected with its default settings, including a minNum of 2.0, which refers to the
minimum total weight of instances in a leaf. Additionally, numFolds was set to 3, and
maxDepth was set to −1, indicating no restrictions on the tree depth. This systematic
approach to tuning the parameters and selecting the algorithms was crucial for optimal
classification performance.
The evaluation of these algorithms was conducted based on several key metrics, such as CA, receiver operating characteristic (ROC), and confidence interval (CI). The ROC area is a single measure of the overall performance of a classification model: a higher area under the curve (AUC) indicates better performance, with values ranging from 0 to 1, where 0.5 denotes random guessing and 1 signifies perfect performance. A CI
serves as a quantitative measure of uncertainty in estimation, wherein the interval’s width
is inversely related to the level of certainty. A broader confidence interval signifies a higher
degree of uncertainty, whereas a narrower interval suggests increased confidence in the
estimation [77]. To calculate the lower and upper limits p of the CI, we used the Wilson score interval method given in Equation (14) [78,79]:

p = ( f + z²/(2N) ± z √( f/N − f²/N + z²/(4N²) ) ) / ( 1 + z²/N ) (14)

Here, N is the number of instances in the test set; f is the observed sample proportion
( f = S/N), where S is the number of successes (or the number of correct guesses made
by the model); and z is the z-score corresponding to the desired confidence level. In our
study, when we use a confidence interval of 80% (with a wider interval indicating more
uncertainty and a narrower one indicating higher confidence), z = 1.82. The term inside
the square root is the adjusted standard error of the proportion, while the denominator
is a correction factor that adjusts the interval's width. The ± symbol indicates that the term following it is added for the upper limit of the confidence interval and subtracted for the lower limit. Therefore, the detailed CA, ROC, and CI values of all the classification models, obtained by running the algorithms 10 times in WEKA, can be seen in Table 4. Additionally, the boxplot in Figure 10 provides a graphical comparison among the classification models.
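A direct transcription of Equation (14), with z = 1.82 as used in the text (the instance count in the example call is hypothetical):

```python
import numpy as np

def wilson_interval(successes, n, z=1.82):
    """Equation (14): Wilson score interval for an observed proportion."""
    f = successes / n
    center = f + z**2 / (2 * n)
    half_width = z * np.sqrt(f / n - f**2 / n + z**2 / (4 * n**2))
    denom = 1 + z**2 / n
    return (center - half_width) / denom, (center + half_width) / denom

# e.g., a classifier scoring 889 correct out of 1000 instances
low, high = wilson_interval(889, 1000)
print(f"CI = [{low:.3f}, {high:.3f}]")
```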

Table 4. The classification accuracy of the multiple classifiers; shaded rows in the original indicate the highest classification accuracy (CA) percentage values for the 10 subjects' data [55].

No. | Classification Model | CA % | Av. ROC | Av. CI
1 | BayesNet [80] | 84.26 ± 1.2 | 0.975 ± 0.002 | [0.829, 0.859]
2 | NB [81] | 85.5 ± 2.1 | 0.988 ± 0.003 | [0.856, 0.884]
3 | Logistic-R [82] | 87.1 ± 1.6 | 0.986 ± 0.001 | [0.865, 0.892]
4 | MultiPerceptron [83] | 88.9 ± 1.3 | 0.965 ± 0.003 | [0.874, 0.899]
5 | SMO [84] | 84.9 ± 1.2 | 0.976 ± 0.002 | [0.847, 0.875]
6 | SimpleLogistic [85] | 88.4 ± 2.3 | 0.989 ± 0.002 | [0.861, 0.907]
7 | ClassViRegression [82] | 85.4 ± 1.8 | 0.985 ± 0.004 | [0.847, 0.875]
8 | KStar [86] | 86.1 ± 2.6 | 0.987 ± 0.003 | [0.849, 0.887]
9 | LWL [87] | 63.4 ± 1.6 | 0.937 ± 0.001 | [0.622, 0.661]
10 | IBk [86] | 84.1 ± 1.1 | 0.917 ± 0.001 | [0.830, 0.852]
11 | JRip [86] | 80.1 ± 1.1 | 0.947 ± 0.003 | [0.786, 0.818]
12 | PART [88] | 83.4 ± 1.5 | 0.932 ± 0.001 | [0.819, 0.849]
13 | J48 [86] | 84.9 ± 1.4 | 0.937 ± 0.005 | [0.814, 0.844]
14 | LMT [74] | 88.2 ± 1.4 | 0.989 ± 0.001 | [0.868, 0.896]
15 | RF [89] | 86.9 ± 1.1 | 0.903 ± 0.001 | [0.863, 0.889]
16 | REPTree [90] | 82.9 ± 1.2 | 0.960 ± 0.001 | [0.813, 0.843]

CA: classification accuracy; M: mean; SD: standard deviation; Av. CI: average confidence interval; Av. ROC: average receiver operating characteristic. All CA values are given as M ± SD.

Figure 10. The boxplot of the classification accuracy across multiple models.

Table 4 and Figure 10 provide insight into the predictive capabilities of the machine
learning models. The MultiPerceptron algorithm exhibited the highest classification accu-
racy (CA), indicating its effectiveness in handling the complex relationships within the gait
data. Other models, like SimpleLogistic and LMT, also showed high accuracy and receiver
operating characteristic (ROC) values.
The evaluation metrics, such as CA, ROC, and confidence intervals (CIs), served as critical indicators of model performance. The LMT algorithm demonstrated high CA and ROC, suggesting its strength in class probability estimation. The CI calculations provided by Equation (14) helped quantify the uncertainty in model estimates, offering a
comprehensive assessment of model reliability. These results suggest that a combination of
PCA for feature reduction and a suitable selection of classifiers can yield robust and reliable
insights into gait analysis for both research and clinical practice. For visualization purposes,
the ROC areas for three subjects within the highest CA (MultiPerceptron, SimpleLogistic,
and LMT) are shown in Figure 11.



Figure 11. The ROC curve of the optimal classifier with the highest classification accuracy across
different subjects: (a) the MultiPerceptron classifier for subject 2, (b) the SimpleLogistic classifier for
subject 4, and (c) the LMT classifier for subject 6.

The ROC curves show that as the false positive rate (FPR, 1 − specificity) approaches zero, the true positive rates (TPRs) for the respective classifiers remain quite high, indi-
cating a strong performance in the low false alarm regime. The curves start at the top-left
corner, which suggests that the classifiers can identify a significant number of true positives
without incurring many false positives. However, as the FPR increases, the rate at which
the TPR increases will differ among the classifiers. A steep initial slope in this region is
desirable as it indicates that the classifier can achieve a high TPR without significantly
increasing the FPR. Additionally, it is also crucial to consider the area under the ROC
curve (AUC) values. The closer the AUC is to 1, the better the classifier’s overall ability is
to distinguish between the positive and negative classes across all thresholds. The AUC
values for the classifiers are substantially high (0.973, 0.985, and 0.995), indicating good
overall performance.
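For readers reproducing these curves outside WEKA, a one-vs-rest ROC for a single subject can be sketched as follows (the labels and scores below are synthetic placeholders, not the study's data):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(1)
y_true = np.repeat([1, 0], [100, 900])   # subject of interest vs. the rest
y_score = np.clip(0.55 * y_true + rng.random(1000) * 0.5, 0.0, 1.0)

fpr, tpr, _ = roc_curve(y_true, y_score)   # FPR = 1 - specificity
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")
```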
These curves suggest that the classifiers perform well, particularly at low false alarm
rates, which is often a critical area in many applications where the cost of a false alarm is
high. By dissecting these results, we provide a nuanced understanding of each algorithm’s
performance, offering valuable insights into their potential application in similar research
contexts. These results can be a guide for future algorithmic choices in similar studies
regarding their classification accuracy.
Additionally, the confusion matrices for the three classifiers (MultiPerceptron, Sim-
pleLogistic, and LMT) with the highest classification accuracy (close to 89%) are depicted
in Figure 12. The confusion matrices show that class 2 has variable CA, with the most confusion against classes 1, 4, and 8 for all three classifiers. The same occurs with class 8, which is confused with classes 1, 2, and 5 (12 instances) in SimpleLogistic. Nevertheless, all three classifiers show sufficient CA to distinguish the various classes.


Figure 12. The confusion matrices for the classifier with the highest classification accuracy (the classes
are from 1 = subject 1 to 10 = subject 10): (a) the confusion matrix for the MultiPerceptron classi-
fier, (b) the confusion matrix for the SimpleLogistic classifier, and (c) the confusion matrix for the
LMT classifier.

Furthermore, our study provides several contributions to smartphone applications


and gait recognition by demonstrating the effectiveness of various classification algorithms
for human recognition based on hip joint angles. However, other studies of gait recognition
using sensors on ankle or wrist joints have shown promising results in gait analysis.
For instance, Talha et al. [14] utilized a smartphone motion sensor on an ankle joint, achieving a classification accuracy of 87% with raw IMU data and 94% when gender and height features were added to the training set. Similarly, Deb et al. [11] employed a time-
warped similarity metric with accelerometer sensor data on wrist and ankle joints, resulting
in a classification accuracy of 89.7% and 82.3%, respectively.
Our study utilizes conventional classifiers available in WEKA for their balance be-
tween performance and computational efficiency. Therefore, future research aims to explore
advanced classifiers like LSTM NN, RNN, and HCNN to improve performance and capture
complex gait patterns, building upon the solid foundation established by our conventional
machine learning approaches. However, implementing these modern classifiers would
require substantial changes to our setup, including deep learning frameworks like Tensor-
Flow or PyTorch, which involve high computational demands and are beyond our current
scope in this study.
Another future research direction may utilize our methods with diverse demographics
of participants who have various hip or walking pathologies to compare the classification
effectiveness with our current results. Further optimization of parameters and settings
is worth investigating and could achieve the best possible performance and accuracy for
the classification task. Therefore, we intend to investigate the integration of late fusion
techniques, such as score fusion and majority voting strategies, to improve the accuracy
and stability of the proposed models [91].

5. Conclusions
Our study investigates the utilization of smartphone IMU sensors to discriminate
subjects based on their walking styles by analyzing hip joint angles. Our findings confirm
the reliability of these sensors in measuring hip joint angles and effectively distinguishing
between individuals using classification techniques. Through sensor fusion, which inte-
grates accelerometer, gyroscope, and magnetometer data, we have achieved accuracy levels
comparable with a reference system of angle measurements obtained from a camera array.
By employing statistical methodologies for feature extraction and machine learning algo-
rithms, we achieve an 88.9% classification accuracy. This underscores the immense potential
of smartphones in facilitating comprehensive human walking analysis and proficiently
classifying sensor data.

Author Contributions: Conceptualization, R.A. (Rabé Andersson) and J.C.; methodology, R.A. (Rabé
Andersson), J.B.-G. and J.C.; software, R.A. (Rabé Andersson); validation, R.A. (Rabé Andersson),
J.B.-G. and R.A. (Rafael Agujetas); formal analysis, R.A. (Rabé Andersson); investigation, R.A. (Rabé
Andersson), J.B.-G., R.A. (Rafael Agujetas) and J.C.; resources, R.A. (Rabé Andersson); data curation,
R.A. (Rabé Andersson) and J.B.-G.; writing—original draft preparation, R.A. (Rabé Andersson), M.C.
and J.C.; writing—review and editing, R.A. (Rabé Andersson), J.B.-G. and M.C.; visualization, R.A.
(Rabé Andersson); supervision, R.A. (Rafael Agujetas), M.C. and J.C.; project administration, R.A.
(Rabé Andersson); funding acquisition, R.A. (Rabé Andersson). All authors have read and agreed to
the published version of the manuscript.
Funding: This research received funding from the University of Gävle and was partially supported
by the Ministry of Science and Innovation—Spanish Agency of Research (MCIN/AEI/10.13039/
501100011033), through the project PID2022-1375250B-C21.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: The measurement data are available on request from the corresponding
author. The data are not publicly available due to data privacy protection regulations.
Acknowledgments: The authors gratefully acknowledge the support provided for this research
by the University of Gävle, the University of Extremadura, and the Ministry of Science and Inno-
vation—Spanish Agency of Research. The authors would also like to gratefully acknowledge the
participants in this study.
Conflicts of Interest: The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript:
MCS motion capture system
EMG electromyography
ML machine learning
SVM support vector machine
NN neural network
LSTM NN long short-term memory neural network
RNN recurrent neural network
NB naive Bayes
LDA linear discriminant analysis
HCNN hybrid convolutional neural network
IMU inertial measurement unit
MARG magnetic, angular rate, and gravity
MEMS micro-electro-mechanical system
SFA sensor fusion algorithm
LCF linear complementary filter
NCF nonlinear complementary filter
LKF linear Kalman filter
EKF extended Kalman filter
CKF complementary Kalman filter
SRUKF square root unscented Kalman filter
SRCKF square root cubature Kalman filter
AHRS attitude heading reference system
SD standard deviation
RMSE root mean square error
MV mean value
M median
COV covariance
VAR variance
KUR kurtosis
SKE skewness
PCA principal component analysis
WEKA Waikato Environment for Knowledge Analysis
BayesNet Bayesian network
NB naive Bayesian
SMO sequential minimal optimization


LWL locally weighted learning
kNN K-nearest neighbor
IBk instance-based K
JRip Java repeated incremental pruning
RIPPER repeated incremental pruning to produce error reduction
PART partial decision tree
LMT logistic model tree
REPTree reduced error pruning tree
CA classification accuracy
CI confidence interval
ROC receiver operating characteristic
AUC area under the curve

References
1. Richter, F. Smartphone Sales Worldwide 2007–2021|Statista. Available online: https://www.statista.com/statistics/263437/global-smartphone-sales-to-end-users-since-2007/ (accessed on 17 October 2023).
2. Majumder, S.; Deen, M.J. Smartphone sensors for health monitoring and diagnosis. Sensors 2019, 19, 2164. [CrossRef] [PubMed]
3. Sprager, S.; Juric, M.B. Inertial Sensor-Based Gait Recognition: A Review. Sensors 2015, 15, 22089–22127. [CrossRef] [PubMed]
4. Drake, J.; Schulz, K.; Bukowski, R.; Gaither, K. Collecting and analyzing smartphone sensor data for health. In Proceedings of the
PEARC ’21: Practice and Experience in Advanced Research Computing, Boston, MA, USA, 18–22 July 2021; pp. 1–4. [CrossRef]
5. Moral-Munoz, J.A.; Zhang, W.; Cobo, M.J.; Herrera-Viedma, E.; Kaber, D.B. Smartphone-based systems for physical rehabilitation
applications: A systematic review. Assist. Technol. 2021, 33, 223–236. [CrossRef] [PubMed]
6. Faiz, A.B.; Imteaj, A.; Chowdhury, M. Smart vehicle accident detection and alarming system using a smartphone. In Proceedings
of the 2015 International Conference on Computer and Information Engineering (ICCIE), Rajshahi, Bangladesh, 26–27 November
2015; pp. 66–69. [CrossRef]
7. Kashevnik, A.; Ponomarev, A.; Shilov, N.; Chechulin, A. In-Vehicle Situation Monitoring for Potential Threats Detection Based on
Smartphone Sensors. Sensors 2020, 20, 5049. [CrossRef] [PubMed]
8. Zhong, Y.; Deng, Y. Sensor orientation invariant mobile gait biometrics. In Proceedings of the IEEE International Joint Conference
on Biometrics, Clearwater, FL, USA, 29 September–2 October 2014; pp. 1–8. [CrossRef]
9. Ramakrishna, M.V.; Harika, S.; Chowdary, S.M.; Kumar, T.P.; Vamsi, T.K.; Adilakshmi, M. Machine Learning based Gait
Recognition for Human Authentication. In Proceedings of the 2nd International Conference on Sustainable Computing and Data
Communication Systems (ICSCDS), Erode, India, 23–25 March 2023; pp. 1316–1322. [CrossRef]
10. Damaševičius, R.; Maskeliunas, R.; Venčkauskas, A.; Woźniak, M. Smartphone User Identity Verification Using Gait Characteris-
tics. Symmetry 2016, 8, 100. [CrossRef]
11. Deb, S.; Ouyang, Y.; Chua, M.C.H.; Tian, J. Gait identification using a new time-warped similarity metric based on smartphone
inertial signals. J. Ambient Intell. Humaniz. Comput. 2020, 11, 4041–4053. [CrossRef]
12. Connor, P.; Ross, A. Biometric recognition by gait: A survey of modalities and features. Comput. Vis. Image Underst. 2018,
167, 1–27. [CrossRef]
13. Wan, C.; Wang, L.; Phoha, V.V. A Survey on Gait Recognition. ACM Comput. Surv. 2018, 51, 89. [CrossRef]
14. Talha, M.; Soomro, H.A.; Naeem, N.; Ali, E.; Kyrarini, M. Human Identification Using a Smartphone Motion Sensor and Gait
Analysis. In Proceedings of the PETRA ’22: Proceedings of the 15th International Conference on PErvasive Technologies Related
to Assistive Environments, Corfu, Greece, 29 June–1 July 2022; pp. 197–202. [CrossRef]
15. Martins, M.; Elias, A.; Cifuentes, C.; Alfonso, M.; Frizera, A.; Santos, C.; Ceres, R. Assessment of walker-assisted gait based on
Principal Component Analysis and wireless inertial sensors. Rev. Bras. de Eng. Biomédica 2014, 30, 220–231. [CrossRef]
16. Zhang, M.W.; Chew, P.Y.; Yeo, L.L.; Ho, R.C. The untapped potential of smartphone sensors for stroke rehabilitation and after-care.
Technol. Health Care 2016, 24, 139–143. [CrossRef]
17. Kong, P.W. Editorial–Special Issue on “Sensor Technology for Enhancing Training and Performance in Sport”. Sensors 2023,
23, 2847. [CrossRef] [PubMed]
18. Wang, S.; Chan, P.P.; Lam, B.M.; Chan, Z.Y.; Zhang, J.H.; Wang, C.; Lam, W.K.; Ho, K.K.W.; Chan, R.H.; Cheung, R.T. Sensor-based
gait retraining lowers knee adduction moment and improves symptoms in patients with knee osteoarthritis: A randomized
controlled trial. Sensors 2021, 21, 5596. [CrossRef] [PubMed]
19. Turner, A. How Many Smartphones Are In The World? 2021. Available online: https://www.bankmycell.com/blog/how-many-phones-are-in-the-world/ (accessed on 2 January 2024).
20. Bhattacharjya, S.; Cavuoto, L.A.; Reilly, B.; Xu, W.; Subryan, H.; Langan, J. Usability, Usefulness, and Acceptance of a Novel,
Portable Rehabilitation System (mRehab) Using Smartphone and 3D Printing Technology: Mixed Methods Study. JMIR Hum.
Factors 2021, 8, e21312. [CrossRef] [PubMed]
21. Thang, H.M.; Viet, V.Q.; Dinh Thuc, N.; Choi, D. Gait identification using accelerometer on mobile phone. In Proceedings of the
2012 International Conference on Control, Automation and Information Sciences (ICCAIS), Saigon, Vietnam, 26–29 November
2012; pp. 344–348. [CrossRef]
22. Makihara, Y.; Matovski, D.S.; Nixon, M.S.; Carter, J.N.; Yagi, Y. Gait Recognition: Databases, Representations, and Applications.
In Wiley Encyclopedia of Electrical and Electronics Engineering; John Wiley & Sons: Hoboken, NJ, USA, 2015. [CrossRef]
23. Derawi, M.O.; Nickel, C.; Bours, P.; Busch, C. Unobtrusive user-authentication on mobile phones using biometric gait recognition.
In Proceedings of the 2010 6th International Conference on Intelligent Information Hiding and Multimedia Signal Processing,
IIHMSP 2010, Darmstadt, Germany, 15–17 October 2010; pp. 306–311. [CrossRef]
24. Neumann, D.A. Kinesiology of the hip: A focus on muscular actions. J. Orthop. Sport. Phys. Ther. 2010, 40, 82–94. [CrossRef]
[PubMed]
25. Muro-de-la Herran, A.; García-Zapirain, B.; Méndez-Zorrilla, A. Gait Analysis Methods: An Overview of Wearable and
Non-Wearable Systems, Highlighting Clinical Applications. Sensors 2014, 14, 3362. [CrossRef] [PubMed]
26. Bouchrika, I.; Goffredo, M.; Carter, J.; Nixon, M. On using gait in forensic biometrics. J. Forensic Sci. 2011, 56, 882–889. [CrossRef]
[PubMed]
27. Baker, R. The history of gait analysis before the advent of modern computers. Gait Posture 2007, 26, 331–342. [CrossRef] [PubMed]
28. Fleury, A.; Mourcou, Q.; Franco, C.; Diot, B.; Demongeot, J.; Vuillerme, N. Evaluation of a Smartphone-based audio-biofeedback
system for improving balance in older adults–a pilot study. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2013, 2013, 1198–1201.
[CrossRef]
29. Giandolini, M.; Poupard, T.; Gimenez, P.; Horvais, N.; Millet, G.Y.; Morin, J.B.; Samozino, P. A simple field method to identify
foot strike pattern during running. J. Biomech. 2014, 47, 1588–1593. [CrossRef]
30. Andersson, R.; Björsell, N. The Energy Consumption and Robust Case Torque Control of a Rehabilitation Hip Exoskeleton. Appl.
Sci. 2022, 12, 11104. [CrossRef]
31. Taniguchi, H.; Sato, H.; Shirakawa, T. A machine learning model with human cognitive biases capable of learning from small and
biased datasets. Sci. Rep. 2018, 8, 7397. [CrossRef] [PubMed]
32. Ordóñez, F.J.; Roggen, D.; Liu, Y.; Xiao, W.; Chao, H.C.; Chu, P. Deep Convolutional and LSTM Recurrent Neural Networks for
Multimodal Wearable Activity Recognition. Sensors 2016, 16, 115. [CrossRef] [PubMed]
33. Jiang, X.; Chu, K.H.; Khoshnam, M.; Menon, C. A Wearable Gait Phase Detection System Based on Force Myography Techniques.
Sensors 2018, 18, 1279. [CrossRef] [PubMed]
34. Goh, G.L.; Goh, G.D.; Pan, J.W.; Teng, P.S.P.; Kong, P.W. Automated Service Height Fault Detection Using Computer Vision and
Machine Learning for Badminton Matches. Sensors 2023, 23, 9759. [CrossRef]
35. Tao, W.; Liu, T.; Zheng, R.; Feng, H. Gait Analysis Using Wearable Sensors. Sensors 2012, 12, 2255. [CrossRef]
36. Lihinikaduarachchi, I.; Rajapaksha, S.A.; Saumya, C.; Senevirathne, V.; Silva, P. Inertial Measurement units based wireless sensor
network for real time gait analysis. In Proceedings of the TENCON 2015—2015 IEEE Region 10 Conference, Macao, China, 1–4
November 2015; pp. 1–6. [CrossRef]
37. Olivares, A.; Górriz, J.M.; Ramírez, J.; Olivares, G. Sensor fusion adaptive filtering for position monitoring in intense activities.
In Hybrid Artificial Intelligence Systems, Part I; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2010;
Volume 6076, pp. 484–491. [CrossRef]
38. Ding, W.; Gao, Y. Attitude Estimation Using Low-Cost MARG Sensors with Disturbances Reduction. IEEE Trans. Instrum. Meas.
2021, 70. [CrossRef]
39. Ferrari, A.; Micucci, D.; Mobilio, M.; Napoletano, P. Trends in human activity recognition using smartphones. J. Reliab. Intell.
Environ. 2021, 7, 189–213. [CrossRef]
40. Pinto, B.; Correia, M.V.; Paredes, H.; Silva, I. Detection of Intermittent Claudication from Smartphone Inertial Data in Community
Walks Using Machine Learning Classifiers. Sensors 2023, 23, 1581. [CrossRef] [PubMed]
41. Pan, T.Y.; Kuo, C.H.; Hu, M.C. A noise reduction method for IMU and its application on handwriting trajectory reconstruction.
In Proceedings of the 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, Seattle, WA, USA,
11–15 July 2016. [CrossRef]
42. Sun, W.; Wu, J.; Ding, W.; Duan, S. A robust indirect Kalman filter based on the gradient descent algorithm for attitude estimation
during dynamic conditions. IEEE Access 2020, 8, 96487–96494. [CrossRef]
43. Olsson, F.; Kok, M.; Halvorsen, K.; Schön, T.B. Accelerometer calibration using sensor fusion with a gyroscope. In Proceedings of
the 2016 IEEE Statistical Signal Processing Workshop (SSP), Palma de Mallorca, Spain, 26–29 June 2016; pp. 1–5. [CrossRef]
44. Nazarahari, M.; Rouhani, H. Sensor fusion algorithms for orientation tracking via magnetic and inertial measurement units: An
experimental comparison survey. Inf. Fusion 2021, 76, 8–23. [CrossRef]
45. Nazarahari, M.; Rouhani, H. 40 years of sensor fusion for orientation tracking via magnetic and inertial measurement units:
Methods, lessons learned, and future challenges. Inf. Fusion 2021, 68, 67–84. [CrossRef]
46. Diaz, E.M.; De Ponte Muller, F.; Jimenez, A.R.; Zampella, F. Evaluation of AHRS algorithms for inertial personal localization in
industrial environments. IEEE Int. Conf. Ind. Technol. 2015, 2015, 3412–3417. [CrossRef]
47. Yadav, N.; Bleakley, C. Accurate Orientation Estimation Using AHRS under Conditions of Magnetic Distortion. Sensors 2014,
14, 20008–20024. [CrossRef] [PubMed]
48. Tomaszewski, D.; Rapiński, J.; Pelc-Mieczkowska, R. Concept of AHRS Algorithm Designed for Platform Independent IMU
Attitude Alignment. Rep. Geod. Geoinform. 2017, 104, 33–47. [CrossRef]
49. Jayasinghe, U.; Hwang, F.; Harwin, W.S. Comparing Loose Clothing-Mounted Sensors with Body-Mounted Sensors in the
Analysis of Walking. Sensors 2022, 22, 6605. [CrossRef]
50. Andersson, R.; Björsell, N. The MATLAB Simulation and the Linear Quadratic Regulator Torque Control of a Series Elastic
Actuator for a Rehabilitation Hip Exoskeleton. In Proceedings of the 2022 5th International Conference on Intelligent Robotics
and Control Engineering (IRCE), Tianjin, China, 23–25 September 2022; pp. 25–31. [CrossRef]
51. Sensor Fusion and Tracking Toolbox—MATLAB. Available online: https://se.mathworks.com/products/sensor-fusion-and-tracking.html (accessed on 24 January 2023).
52. Orientation from accelerometer, gyroscope, and magnetometer readings—MATLAB—MathWorks Nordic. Available online:
https://se.mathworks.com/help/fusion/ref/ahrsfilter-system-object.html (accessed on 24 January 2023).
53. Pandey, N.; Abdulla, W.; Salcic, Z. Gait-based person identification using multi-view sub-vector quantisation technique. In
Proceedings of the 2007 9th International Symposium on Signal Processing and its Applications, ISSPA 2007, Sharjah, United
Arab Emirates, 12–15 February 2007. [CrossRef]
54. Cao, Y.; Gao, F.; Yu, L.; She, Q. Gait recognition based on emg information with multiple features. IFIP Adv. Inf. Commun. Technol.
2018, 538, 402–411. [CrossRef]
55. Chen, J.; Sun, Y.; Sun, S. Improving Human Activity Recognition Performance by Data Fusion and Feature Engineering. Sensors
2021, 21, 692. [CrossRef]
56. Yang, M.J.; Zheng, H.R.; Wang, H.Y.; Mcclean, S.; Harris, N. Combining feature ranking with PCA: An application to gait analysis.
In Proceedings of the 2010 International Conference on Machine Learning and Cybernetics, Qingdao, China, 11–14 July 2010;
Volume 1, pp. 494–499. [CrossRef]
57. Jolliffe, I.T. Principal Component Analysis; Springer Series in Statistics; Springer: New York, NY, USA, 1986. [CrossRef]
58. Márquez, F.P.G. Advances in Principal Component Analysis; IntechOpen: Rijeka, Croatia, 2022. [CrossRef]
59. Weka. Weka 3—Data Mining with Open Source Machine Learning Software in Java. Available online: https://ml.cms.waikato.ac.nz/weka/ (accessed on 1 January 2024).
60. Kotak, P.; Modi, H. Enhancing the Data Mining Tool WEKA. In Proceedings of the 2020 5th International Conference on
Computing, Communication and Security (ICCCS), Patna, India, 14–16 October 2020; pp. 1–6. [CrossRef]
61. Dash, R.K. Selection of the best classifier from different datasets using WEKA. Int. J. Eng. Res. Technol. (IJERT) 2013, 2.
[CrossRef]
62. Alshammari, M.; Mezher, M. A Comparative Analysis of Data Mining Techniques on Breast Cancer Diagnosis Data using WEKA
Toolbox. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 224–229. [CrossRef]
63. Eligo, W.M.; Leng, C.; Kurika, A.E.; Basu, A. Comparing Supervised Machine Learning Algorithms on Classification Efficiency of
multiclass classifications problem. Int. J. Emerg. Trends Eng. Res. 2022, 10, 346–360. [CrossRef]
64. Ong, M.S.; Magrabi, F.; Coiera, E. Automated categorisation of clinical incident reports using statistical text classification. Qual.
Saf. Health Care 2010, 19, e55. [CrossRef]
65. Davis, J.; Goadrich, M. The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International
Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 233–240. [CrossRef]
66. Kumar, Y.; Sahoo, G. Analysis of Bayes, neural network and tree classifier of classification technique in data mining using WEKA.
Comput. Sci. Inf. Technol. 2012, 2, 359–369. [CrossRef]
67. Bouckaert, R. Bayesian Network Classifiers in Weka; Working Paper Series; University of Waikato, Department of Computer Science:
Hamilton, New Zealand, 2004; pp. 1–23.
68. Salazar, A.; Vergara, L.; Vidal, E. A proxy learning curve for the Bayes classifier. Pattern Recognit. 2023, 136, 109240. [CrossRef]
69. Sahoo, G.; Kumar, Y. Analysis of parametric & non parametric classifiers for classification technique using WEKA. Int. J. Inf.
Technol. Comput. Sci. (IJITCS) 2012, 4, 43. [CrossRef]
70. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining: Practical Machine Learning Tools and Techniques, 4th ed.; Morgan Kaufmann:
Burlington, MA, USA, 2016.
71. Shahzad, W.; Asad, S.; Khan, M.A. Feature subset selection using association rule mining and JRip classifier. Int. J. Phys. Sci. 2013,
8, 885–896. [CrossRef]
72. Thakur, S.; Meenakshi, E.; Priya, A. Detection of malicious URLs in big data using RIPPER algorithm. In Proceedings of the
2017 2nd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT),
Bangalore, India, 19–20 May 2017; pp. 1296–1301. [CrossRef]
73. Mohamed, W.N.H.W.; Salleh, M.N.M.; Omar, A.H. A comparative study of Reduced Error Pruning method in decision tree
algorithms. In Proceedings of the 2012 IEEE International Conference on Control System, Computing and Engineering, Penang,
Malaysia, 23–25 November 2012; pp. 392–397. [CrossRef]
74. Rajesh, P.; Karthikeyan, M. A comparative study of data mining algorithms for decision tree approaches using weka tool. Adv.
Nat. Appl. Sci. 2017, 11, 230–243.
75. Salzberg, S.L. C4.5: Programs for Machine Learning by J. Ross Quinlan. Morgan Kaufmann Publishers, Inc., 1993. Mach. Learn.
1994, 16, 235–240. [CrossRef]
76. Frank, E.; Hall, M.; Witten, I. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and
Techniques”, 4th ed.; Morgan Kaufmann: Burlington, MA, USA, 2016.
77. Hazra, A. Using the confidence interval confidently. J. Thorac. Dis. 2017, 9, 4125–4130. [CrossRef]
78. Rossi, R.J. Confidence Intervals. In Applied Biostatistics for the Health Sciences, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2022;
pp. 235–271. [CrossRef]
79. Wallis, S. Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative
Methods. J. Quant. Linguist. 2013, 20, 178–208. [CrossRef]
80. Kozlow, P.; Abid, N.; Yanushkevich, S. Gait Type Analysis Using Dynamic Bayesian Networks. Sensors 2018, 18, 3329. [CrossRef]
[PubMed]
81. Manap, H.H.; Tahir, N.M.; Abdullah, R. Anomalous gait detection using Naive Bayes classifier. In Proceedings of the ISIEA
2012—2012 IEEE Symposium on Industrial Electronics and Applications, Bandung, Indonesia, 23–26 September 2012; pp. 378–381.
[CrossRef]
82. Yang, J.H.; Park, J.H.; Jang, S.H.; Cho, J. Novel Method of Classification in Knee Osteoarthritis: Machine Learning Application
Versus Logistic Regression Model. Ann. Rehabil. Med. 2020, 44, 415–427. [CrossRef] [PubMed]
83. Szczepanski, D. Multilayer perceptron for gait type classification based on inertial sensors data. In Proceedings of the 2016
Federated Conference on Computer Science and Information Systems, Gdańsk, Poland, 11–14 September 2016; pp. 947–950.
[CrossRef]
84. Platt, J.C. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines; Microsoft: Redmond, WA, USA,
1998.
85. Seo, J.; Kim, T.; Lee, J.; Kim, J.; Choi, J.; Tack, G. Fall prediction of the elderly with a logistic regression model based on
instrumented timed up & go. J. Mech. Sci. Technol. 2019, 33, 3813–3818. [CrossRef]
86. Ng, Y.L.; Jiang, X.; Zhang, Y.; Shin, S.B.; Ning, R. Automated Activity Recognition with Gait Positions Using Machine Learning
Algorithms. Eng. Technol. Appl. Sci. Res. 2019, 9, 4554–4560. [CrossRef]
87. Atkeson, C.G.; Moore, A.W.; Schaal, S. Locally Weighted Learning. Artif. Intell. Rev. 1997, 11, 11–73. [CrossRef]
88. Frank, E.; Witten, I.H. Generating Accurate Rule Sets Without Global Optimization. In Proceedings of the Fifteenth International
Conference on Machine Learning, Madison, WI, USA, 24–27 July 1998; pp. 144–151.
89. Shi, L.F.; Qiu, C.X.; Xin, D.J.; Liu, G.X. Gait recognition via random forests based on wearable inertial measurement unit. J.
Ambient Intell. Humaniz. Comput. 2020, 11, 5329–5340. [CrossRef]
90. NH, W. Classification of control and neurodegenerative disease subjects using tree based classifiers. J. Pharm. Res. Int. 2020, 32,
63–73.
91. Kuncheva, L.I. Combining Pattern Classifiers: Methods and Algorithms, 2nd ed.; Wiley Blackwell: Hoboken, NJ, USA, 2014;
pp. 1–357. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
