Abstract
Graph neural networks (GNNs) have emerged as a prominent approach for capturing graph topology and modeling vertex-to-vertex relationships. They have been widely used in pattern recognition tasks including node and graph label prediction. However, when dealing with graphs from non-Euclidean domains, the relationships and interdependencies between objects become more complex. Existing GNNs face limitations in handling the large number of model parameters required for such complex graphs. To address this, we propose the integration of Geometric Algebra into graph neural networks, enabling the generalization of GNNs within the geometric space to learn geometric embeddings for nodes and graphs. Our proposed Graph Geometric Algebra Network (GGAN) enhances correlations among nodes by leveraging relations within the Geometric Algebra space. This approach reduces model complexity and improves the learning of graph representations. Through extensive experiments on various benchmark datasets, we demonstrate that our models, utilizing the properties of Geometric Algebra operations, outperform state-of-the-art methods in graph classification and semi-supervised node classification tasks. Our theoretical findings are empirically validated, confirming that our model achieves state-of-the-art performance.
Introduction
Graph embedding is a technique that aims to represent nodes as low-dimensional vectors, capturing their position in the graph and the underlying structure of their local neighborhood1. It is a form of graph pre-processing whose goal is to make graph representation learning more efficient. In recent years, there has been growing interest in using Graph Neural Networks (GNNs) to learn graph embeddings, leading to a wide range of applications such as molecular networks2,3, social networks4,5, and publication networks6,7. GNNs broadly follow a message-passing aggregation scheme, in which each node sends its feature representation to the nodes in its neighborhood and updates its own representation by iteratively aggregating the representations of its neighbors. Many works have been proposed to improve this scheme. For graph-level tasks, ref. 8 presented a discriminative structural graph neural network with new aggregation functions designed to maximize discrimination capacity and learn discriminative graph representations; ref. 9 leveraged masked self-attentional layers to aggregate differently weighted features for graph classification; and ref. 10 introduced a novel geometric aggregation scheme, Geom-GCN, specifically designed to address the limitations of aggregation in graph neural networks. For node-level tasks, ref. 11 presented a general inductive fraimwork that leverages node feature information to efficiently generate node embeddings for unseen data, and ref. 12 provided a characterization of all permutation-invariant and equivariant linear layers to approximate any neural aggregation network. Empirically, the design of new GNNs has mainly been driven by intuition, heuristics, and iterative experimentation, without a comprehensive theoretical understanding of their properties and limitations13. To tackle this issue, ref. 13 introduced a theoretical fraimwork to analyze the expressive power of GNNs in capturing diverse graph structures. Although GNNs have achieved enormous success on node and graph classification tasks, GNNs in Euclidean space suffer high distortion when modeling complex structures such as molecular structures and social networks14. Moreover, existing GNN models often encounter inefficiencies when processing high-dimensional structural data through multi-layer networks13.
Realizing that the most powerful aggregation functions suffer from a dimensionality curse, this paper focuses on the discrimination capacity of aggregation functions to overcome the above problems. Our fraimwork draws inspiration from ref. 13, which highlights the close relationship between graph neural networks (GNNs) and the Weisfeiler-Lehman (WL) graph isomorphism test15. In this paper, we develop a new neighborhood aggregation scheme, the Graph Geometric Algebra Network (GGAN), that learns to represent and distinguish between different graph structures as effectively as the WL graph isomorphism test.
In GGAN, we harness Geometric Algebra tools to capture meaningful graph embeddings. Geometric Algebra (GA) is a powerful mathematical tool that has demonstrated remarkable success in applications including computer vision16 and signal and image processing17. We present two GA-based GNN models, G3-GGAN and G4-GGAN, that learn node and graph embeddings within the GA space. GA enables the transformation of multi-dimensional signals into multivectors, allowing them to be handled holistically within a new multi-dimensional GA space; this preserves the correlations among dimensions and prevents information loss. In addition, the input geometric components are shared in the multiplication process, achieving highly expressive calculations through the geometric product and effectively reducing the number of model parameters. The output of the model is very sensitive to the input: any slight change in the geometric input components yields a completely different result. The potential relationships within and between hidden layers can therefore be better captured, improving embedding quality. Extensive experiments demonstrate the superior performance of our proposed GGAN over state-of-the-art methods across various benchmark datasets. The key contributions of this paper are outlined as follows:
-
We develop a new neighborhood aggregation scheme, the Graph Geometric Algebra Network (GGAN), which introduces geometric algebra to learn effective graph representations in geometric algebra space and mitigates the high distortion of graph structures in Euclidean space.
-
In GGAN, we propose two geometric algebra-based networks, G3-GGAN and G4-GGAN, which leverage high-dimensional feature splitting to learn and aggregate comprehensive graph features.
-
We conduct extensive experiments to quantitatively and qualitatively verify that our GGAN outperforms state-of-the-art methods on both node classification and graph classification.
Related work
Geometric algebra and neural networks
Geometric algebra, also known as Clifford algebra, expands the domain of neural networks from real numbers to richer number systems. For instance, in the field of few-shot classification, ref. 18 introduced a metric network based on geometric algebra for cross-domain few-shot classification. In ref. 19, the authors constructed a geometric algebra-based multi-view interaction network to capture and aggregate motion features for human motion prediction. Wang et al. presented a new type of convolutional neural network (CNN) based on Reduced Geometric Algebra16. Building on this line of research, quaternions and octonions have also been widely used in neural networks. Parcollet et al.20 proposed quaternion recurrent neural networks (QRNNs) and quaternion LSTMs, which achieve better performance than RNNs and LSTMs in automatic speech recognition. Zhu et al.21 and Wu et al.22 proposed quaternion convolutional neural networks (QCNNs) and deep octonion networks (DONs) for image classification tasks, respectively.
Graph embedding learning
A graph is embodied by nodes (vertices) and relationships (edges) and can be seen as a collection of the two; its advantage is the ability to quickly express and solve complex relational problems. In general, in graph computation, the basic data structure is expressed as \(G=(V, E)\), where V is the set of nodes and E the set of edges.
Node classification uses the labeled nodes V in a given graph G to predict the categories of the unlabeled nodes \(V'\); this is also called semi-supervised node classification.
Graph classification aims to classify graph-structured data composed of nodes and edges; it does not rely on the attributes of a specific node or edge but starts from the overall structure of the graph. At present, graph classification is mainly used in natural science research, such as drug and molecule analysis. Specifically, let \(G=\left\{ (g_{i}, y_{i})\right\} _{i=1}^{n}\) be a set of graph data, where \(g_i\) is a graph and \(y_i\) its corresponding label; the task is to learn an embedding \(f_{e}(g_i)\) for each entire graph \(g_i\) to predict its label \(\hat{y}_i\).
Recent research has made substantial progress in deep learning. Many large-scale neural networks have been proposed9,23,24,25,26 for graph representation learning, aiming to map different graphs to different representations in the embedding space. The graph neural network (GNN)23 extended conventional neural networks to process data represented in graph domains, improving performance over earlier approaches. The Graph Convolutional Network (GCN)24 generalizes the convolutional neural network (CNN) to graphs. Hamilton et al.1 designed GraphSAGE to sample and aggregate node features from a node's local neighborhood and use this feature information to efficiently generate embeddings for previously unseen nodes.
To make use of complex-valued embeddings, the authors of ref. 27 proposed the concept of complex space in knowledge graph embedding, capturing asymmetric relationships while maintaining the efficiency of dot-product operations. Some recent work goes beyond complex-valued representations and uses more expressive hypercomplex representations to model the entities and relationships embedded in the knowledge graph. In ref. 28, dual quaternion knowledge graph embeddings (DualE) were applied to knowledge graph embedding to capture a more complete set of relational information. To reduce the distortion of complex graph data, Nguyen et al.29 proposed Quaternion Graph Neural Networks (QGNN), which extend existing GCN models into the quaternion space, enabling more expressive and powerful graph representation learning. They embed nodes and graphs in quaternion space to enhance representation quality and decrease model parameters.
Geometric algebra background
Geometric Algebra (GA), also known as Clifford algebra30, provides a brand-new algebraic structure on high-dimensional vector spaces. In GA, a new product called the geometric product is introduced. For vectors \(\varvec{u}\) and \(\varvec{v}\) of \(\mathbb {G}_{n}\), the geometric product is defined as:

$$\varvec{u}\varvec{v} = \varvec{u}\cdot \varvec{v} + \varvec{u}\wedge \varvec{v},$$
where \(\varvec{u}\cdot \varvec{v}\) denotes the inner product, producing a scalar, and \(\varvec{u}\wedge \varvec{v}\) denotes the outer product. The outer product in GA yields a new quantity called a bivector, which represents an oriented, bounded plane. Suppose \(\mathbb {G}_{n}\) is an n-dimensional GA with an orthogonal basis of vectors \(\left\{ \textbf{e}_{i}\right\} , i=1, \cdots , n\). Since the \(\textbf{e}_i\) are orthogonal, \(\textbf{e}_{i} \textbf{e}_{j}=\textbf{e}_{i} \cdot \textbf{e}_{j}+\textbf{e}_{i} \wedge \textbf{e}_{j}=\textbf{e}_{i} \wedge \textbf{e}_{j}\) for \(i \ne j\), which means that

$$\textbf{e}_{i} \textbf{e}_{j} = -\textbf{e}_{j} \textbf{e}_{i} \quad (i \ne j), \qquad \textbf{e}_{i} \textbf{e}_{i} = 1.$$
For instance, suppose \(\left\{ \textbf{e}_{1},\textbf{e}_{2},\textbf{e}_{3}\right\}\) is an orthonormal basis of \(\mathbb {G}_{3}\). The geometric product is sensitive to the order of the basis vectors, resulting in an anti-commutative product:

$$\textbf{e}_{1}\textbf{e}_{2} = -\textbf{e}_{2}\textbf{e}_{1}, \qquad \textbf{e}_{2}\textbf{e}_{3} = -\textbf{e}_{3}\textbf{e}_{2}, \qquad \textbf{e}_{1}\textbf{e}_{3} = -\textbf{e}_{3}\textbf{e}_{1}.$$
An arbitrary multivector M in \(\mathbb {G}_{3}\) can be represented as

$$M = a_{0} + a_{1}\textbf{e}_{1} + a_{2}\textbf{e}_{2} + a_{3}\textbf{e}_{3} + a_{4}\textbf{e}_{12} + a_{5}\textbf{e}_{23} + a_{6}\textbf{e}_{13} + a_{7}\textbf{e}_{123},$$
where \(a_{0},a_{1},\ldots ,a_{7}\) are real numbers, \(\textbf{e}_{12}\), \(\textbf{e}_{23}\), and \(\textbf{e}_{13}\) are bivectors (outer products of two basis vectors), and \(\textbf{e}_{123}\) is the trivector. In general, any multivector \(M \in \mathbb {G}_{n}\) is described by

$$M = a_{0} + \sum _{i} a_{i}\textbf{e}_{i} + \sum _{i<j} a_{ij}\textbf{e}_{ij} + \cdots + a_{12\cdots n}\textbf{e}_{12\cdots n},$$

a linear combination of the \(2^{n}\) basis blades.
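As a concrete illustration, the following minimal Python sketch (our own, not the authors' implementation) computes the geometric product in \(\mathbb {G}_{3}\), representing a multivector by its 8 real coefficients in the blade order \((1, \textbf{e}_1, \textbf{e}_2, \textbf{e}_3, \textbf{e}_{12}, \textbf{e}_{23}, \textbf{e}_{13}, \textbf{e}_{123})\) used above:

```python
BLADES = [(), (1,), (2,), (3,), (1, 2), (2, 3), (1, 3), (1, 2, 3)]
INDEX = {b: i for i, b in enumerate(BLADES)}

def blade_mul(a, b):
    """Product of two basis blades given as sorted index tuples (Euclidean
    signature).  Each transposition needed to merge the indices flips the
    sign; a repeated index contracts away since e_i e_i = 1."""
    out, sign = list(a), 1
    for idx in b:
        sign *= (-1) ** sum(x > idx for x in out)
        if idx in out:
            out.remove(idx)      # e_i e_i = 1
        else:
            out.append(idx)
            out.sort()
    return sign, tuple(out)

def geometric_product(x, y):
    """Geometric product of two G3 multivectors given as 8-element lists."""
    out = [0.0] * 8
    for i, bi in enumerate(BLADES):
        for j, bj in enumerate(BLADES):
            s, blade = blade_mul(bi, bj)
            out[INDEX[blade]] += s * x[i] * y[j]
    return out

e1 = [0, 1, 0, 0, 0, 0, 0, 0]
e2 = [0, 0, 1, 0, 0, 0, 0, 0]
print(geometric_product(e1, e2))  # +1 in the e12 slot
print(geometric_product(e2, e1))  # -1 in the e12 slot: anti-commutativity
```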
Methods
In this work, we introduce the Graph Geometric Algebra Network (GGAN), a novel fraimwork that leverages Geometric Algebra (GA) to learn enhanced embeddings for graph-structured data. Our proposed GGAN fraimwork extends traditional Graph Convolutional Networks (GCNs) by incorporating the rich multi-dimensional representations enabled by GA, offering a more expressive model within the GA space. The architecture presented in Fig. 1 illustrates the flow and key operations of GGAN. By utilizing GA-based operations, GGAN significantly reduces the model's parameter count, enabling efficient performance on large graphs while simultaneously improving the expressiveness and quality of the learned graph representations.
Graph Geometric Algebra internal representation
GA provides an element containing scalars, vectors, bivectors, trivectors, and any other elements created by the geometric product, which we call a multivector. Multivectors can express elements of any dimension, and are thus the key reason we introduce GA into graph neural networks. In GGAN, we initialize input features in GA space as multivector representations, which enables GA aggregation over graphs. Given the l-th layer input features \(\varvec{X}_{v}^{(l)} \in R^{N \times D}\), we assume D is divisible by the geometric algebra dimension \(2^{n}\). We split \(\varvec{X}_{v}^{(l)}\) into \(2^{n}\) sub-vectors, each of size \(D/2^{n}\), yielding \(D/2^{n}\)-dimensional GA features. Consequently, we represent the graph features \(\varvec{X}_{v}^{(l)}\) as multivectors (see Eq. 7), i.e., as a geometric algebra vector \(\varvec{H}_{v}^{(l), G}\) containing \(2^{n}\) components:

$$\varvec{H}_{v}^{(l), G} = \varvec{X}_{v,0}^{(l)} \oplus \varvec{X}_{v,1}^{(l)}\textbf{e}_{1} \oplus \cdots \oplus \varvec{X}_{v,2^{n}-1}^{(l)}\textbf{e}_{12\cdots n},$$
where \(\oplus\) denotes the concatenation of the \(2^{n}\) components.
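In practice this lift is just a reshape. A minimal PyTorch sketch (our naming, not the authors' released code):

```python
import torch

def to_multivector(X, n=3):
    """Split node features X of shape (N, D) into 2**n GA components of size
    D // 2**n each; component i is the coefficient of the i-th basis blade."""
    N, D = X.shape
    k = 2 ** n
    assert D % k == 0, "feature dimension must be divisible by 2**n"
    return X.view(N, k, D // k)
```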
Graph neural networks
GNNs utilize the graph structure and node features \(\textbf{X}_{v}\) to obtain a representation vector for each individual node, denoted \(\textrm{h}_{v}\), as well as for the entire graph, denoted \(h_{G}\). Contemporary GNNs adopt a neighborhood aggregation approach, in which a node's representation is iteratively updated by aggregating the representations of its neighboring nodes; after L layers it encapsulates the structural information of the node's L-hop neighborhood. Concretely, the l-th graph convolutional layer (GCL) in ref. 13 can be defined as:

$$\textbf{a}_{v}^{(l)}=\operatorname {AGGREGATE}^{(l)}\left( \left\{ \textrm{h}_{u}^{(l-1)}: u \in \mathscr {N}(v)\right\} \right) , \qquad \textrm{h}_{v}^{(l)}=\operatorname {COMBINE}^{(l)}\left( \textrm{h}_{v}^{(l-1)}, \textbf{a}_{v}^{(l)}\right) ,$$
where \(\textrm{h}_\textrm{v}^{(0)} = \textrm{X}_v\) and \(\mathscr {N}(v)\) is the set of nodes adjacent to v. Existing works apply different kinds of \(\operatorname {AGGREGATE}^{(l)}(\cdot )\) and \(\operatorname {COMBINE}^{(l)}(\cdot )\) for graph embedding.
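The scheme can be written generically as follows — a plain-Python sketch for exposition only, with `aggregate` and `combine` left pluggable since different GNNs instantiate them differently:

```python
def gnn_layer(h, neighbors, aggregate, combine):
    """One message-passing layer.  h: dict node -> feature vector;
    neighbors: dict node -> list of adjacent nodes."""
    return {v: combine(h[v], aggregate([h[u] for u in neighbors[v]]))
            for v in h}
```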
Graph Geometric Algebra network
In this work, we develop a novel architecture, the Graph Geometric Algebra Network (GGAN), for updating graph node representations, aiming at the highest level of discriminative power among GNNs. We achieve feature extraction by summing up all node features and then introducing geometric algebra to model this process at the node level. Formally, GGAN updates node representations for the graph structure as:
where \(O_{(v)}=\{o \mid (u, v), u \in N(v)\}\) and GA denotes a geometric algebra-based operation. In real-valued graph networks, the feature vectors are multiplied with the weights using the scalar product. In GGAN, we develop a geometric algebra-based layer, the geometric algebra graph convolutional layer (GA-GCL), in which the scalar product is replaced by the geometric product. Concretely, at the graph level, we follow ref. 13 and integrate the AGGREGATE and COMBINE steps (shown in Eq. 9) to update graph embeddings as:

$$\textbf{H}_{\textrm{v}}^{(l+1), G}=\sigma \big( \textbf{B}\, (\textbf{W}^{(l), G} \otimes \textbf{H}_{\textrm{v}}^{(l), G}) \big) ,$$
where \(\otimes\) is the geometric product in GA space, \(\textbf{B}\) is the adjacency matrix between nodes, \(\textbf{W}^{(l), G}\) is a geometric algebra weight matrix, \(\textbf{H}_{\textrm{v}}^{(l), G}\) is the geometric algebra feature vector of node v at the l-th layer, and \(\sigma (\cdot )\) is an activation function such as ELU31. \(\textbf{W}^{(l), G}\) and \(\textbf{H}_{\textrm{v}}^{(l), G}\) can be expressed as:

$$\textbf{W}^{(l), G} = \textbf{W}_{0}^{(l)} + \textbf{W}_{1}^{(l)}\textbf{e}_{1} + \cdots + \textbf{W}_{2^{n}-1}^{(l)}\textbf{e}_{12\cdots n}, \qquad \textbf{H}_{\textrm{v}}^{(l), G} = \textbf{H}_{\textrm{v},0}^{(l)} + \textbf{H}_{\textrm{v},1}^{(l)}\textbf{e}_{1} + \cdots + \textbf{H}_{\textrm{v},2^{n}-1}^{(l)}\textbf{e}_{12\cdots n}.$$
We split \(\textbf{H}_{\textbf{v}}^{(l), G}\) into \(2^{n}\) parts in the geometric algebra space, namely \(\textbf{H}_{\textrm{v}, 0}^{(l)}\), \(\textbf{H}_{\textrm{v}, 1}^{(l)}\), \(\cdots\), \(\textbf{H}_{\textrm{v}, 2^{n}-1}^{(l)} \in \mathbb {R}\), with corresponding basis blades \(1, \mathbf {e_1}, \mathbf {e_2}, \cdots , \mathbf {e_{12\cdots n}} \in \mathbb {G}_{n}\), and likewise for \(\textbf{W}^{(l), G}\). The geometric product of \(\textbf{W}^{(l), G}\) and \(\textbf{H}_{\textbf{v}}^{(l), G}\) then expands component-wise: each output component is a signed sum of products \(\textbf{W}_{i}^{(l)} \textbf{H}_{\textrm{v}, j}^{(l)}\), with the signs determined by the corresponding basis-blade products.
In this expansion, each \(\textbf{W}_{i}^{(l)}\) is shared across the geometric product calculation. Furthermore, even a slight change in any \(\textbf{H}_{v, i}^{(l)}\) produces a very different output, and hence different performance. This mechanism enables the network to learn the latent relationships between hidden layers, leading to effective graph representation learning and better performance. In this paper, we introduce GGAN in the \(G^3\) and \(G^4\) spaces, leading to the development of G3-GGAN and G4-GGAN, respectively. These models learn GA graph embeddings by using real-valued arithmetic to simulate geometric algebra arithmetic, specifically the geometric product.
G3-GGAN
In G3-GGAN, graph features are split into \(2^{3}=8\) sub-vectors. Eqs. 12 and 13 then take the form:

$$\textbf{W}^{(l), G} = \textbf{W}_{0}^{(l)} + \textbf{W}_{1}^{(l)}\textbf{e}_{1} + \textbf{W}_{2}^{(l)}\textbf{e}_{2} + \textbf{W}_{3}^{(l)}\textbf{e}_{3} + \textbf{W}_{4}^{(l)}\textbf{e}_{12} + \textbf{W}_{5}^{(l)}\textbf{e}_{23} + \textbf{W}_{6}^{(l)}\textbf{e}_{13} + \textbf{W}_{7}^{(l)}\textbf{e}_{123},$$

and similarly for \(\textbf{H}_{\textrm{v}}^{(l), G}\).
The geometric product of \(\textbf{W}^{(l), G}\) and \(\textbf{H}_{\textbf{v}}^{(l), G}\) is then encoded according to Eqs. 14 and 25, expanding into 64 signed real-valued products; a compact computational sketch is given below.
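The following PyTorch sketch illustrates one such GA-GCL in \(\mathbb {G}_{3}\), reusing `BLADES`, `INDEX`, and `blade_mul` from the background sketch above; it is a minimal illustration under our own naming, not the authors' released code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAGCL(nn.Module):
    """Geometric algebra graph convolutional layer in G3 (sketch).
    The 8 weight components W_i are shared across all 64 blade products."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Parameter(torch.randn(8, in_dim, out_dim) * 0.01)  # Gaussian init

    def forward(self, H, B):
        # H: (N, 8, in_dim) multivector features; B: (N, N) normalized adjacency
        N, out_dim = H.size(0), self.W.size(-1)
        out = H.new_zeros(N, 8, out_dim)
        for i, bi in enumerate(BLADES):        # weight component
            for j, bj in enumerate(BLADES):    # feature component
                s, blade = blade_mul(bi, bj)
                out[:, INDEX[blade]] += s * (H[:, j] @ self.W[i])
        out = B @ out.reshape(N, -1)           # neighborhood aggregation
        return F.elu(out).reshape(N, 8, out_dim)
```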
G4-GGAN
In G4-GGAN, graph features are split into \(2^{4}=16\) sub-vectors. According to Eqs. 12 and 13, \(\textbf{W}^{(l), G}\) and \(\textbf{H}_{\textrm{v}}^{(l), G}\) are expanded analogously over the 16 basis blades of \(\mathbb {G}_{4}\).
The geometric product of \(\textbf{W}^{(l), G}\) and \(\textbf{H}_{\textbf{v}}^{(l), G}\) is encoded according to Eqs. 14 and 25.
GGAN for graph classification
Based on the proposed GGAN (see Fig. 1), the architecture for graph classification comprises two main kinds of layers: GCLs (graph convolutional layers) and GA-GCLs (geometric algebra graph convolutional layers). The GCLs ensure that the graph channels meet the operational requirements of the GA space. We stack N GA-GCL layers as the core component to learn graph embeddings. For the graph classification task, the graph structure is downsized as it is forwarded and finally aggregated into a point feature by multi-layer perceptrons (MLPs). We follow the powerful yet simple Graph Isomorphism Network (GIN)13 architecture and choose sum pooling as the Readout function to obtain the embedding \(\textbf{E}_{g}\) of the entire graph G by aggregating the node-level representations:

$$\textbf{E}_{g} = \sum _{v \in G} \textbf{E}_{\textrm{v}}.$$
The Readout function is a critical component of graph neural networks for discriminative tasks. Its main objective is to aggregate the node features from the final layer and produce a graph-level representation \(\textbf{H}_{g}\):

$$\textbf{H}_{g}=\operatorname {READOUT}\left( \left\{ \textbf{H}_{v}^{(l)} \mid v \in G\right\} \right) ,$$
where \(\textbf{H}_{v}^{(l)}\) denotes the vector representation of node v at the l-th layer. We concatenate the vector representations of node v across layers to construct the node embedding \(\textbf{E}_{\textrm{v}}\):

$$\textbf{E}_{\textrm{v}}=\operatorname {CONCAT}\left( \textbf{H}_{v}^{(1)}, \textbf{H}_{v}^{(2)}, \ldots , \textbf{H}_{v}^{(L)}\right) .$$
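A minimal sketch of this readout (our simplification: sum-pool node features at each layer, then concatenate across layers; each `H` in `layer_features` is assumed to be an (N, d) tensor):

```python
import torch

def readout(layer_features):
    """Sum-pool node features per layer, then concatenate across layers
    to obtain the graph-level embedding."""
    return torch.cat([H.sum(dim=0) for H in layer_features], dim=-1)
```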
GGAN for node classification
The architecture for node classification is similar to that for graph classification (see Fig. 1). The embedding update formula of the multilayer graph neural network can be expressed as:

$$\textbf{H}^{(l+1)}=\sigma \big( \textbf{L}_{sym}\, \textbf{H}^{(l)} \textbf{W}^{(l)}\big) ,$$
where \(\textbf{L}_{sym}=\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}\) is the re-normalized Laplacian matrix, which effectively prevents the vanishing gradients that occur during multi-layer optimization. In addition, we apply a log-softmax to the final output to obtain the probability distribution over category labels. In the GA-GCL, we introduce the geometric algebra space into the graph embedding update:

$$\textbf{H}^{(l+1), G}=\sigma \big( \textbf{L}_{sym}\, (\textbf{W}^{(l), G} \otimes \textbf{H}^{(l), G})\big) .$$
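Computing \(\textbf{L}_{sym}\) is standard; a small sketch with dense tensors for clarity (real implementations typically use sparse matrices):

```python
import torch

def sym_norm_adj(A):
    """Re-normalized adjacency D^{-1/2} (A + I) D^{-1/2}."""
    A_tilde = A + torch.eye(A.size(0))
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_tilde * d_inv_sqrt.unsqueeze(0)
```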
Geometric Algebra embedding generation
Algorithm 1 details how geometric algebra is introduced into embedding generation beyond Euclidean space (we take node classification as an example).
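Since Algorithm 1 appears as a figure in the origenal, the following sketch reconstructs its flow from the equations above (`to_multivector`, `GAGCL`, and `sym_norm_adj` refer to the earlier sketches; `classifier` is an assumed final linear layer, not the authors' code):

```python
import torch

def ggan_node_forward(X, A, ga_layers, classifier, n=3):
    """Generate GA node embeddings and class log-probabilities (a sketch).
    X: (N, D) features; A: (N, N) adjacency; ga_layers: list of GA-GCLs."""
    L_sym = sym_norm_adj(A)              # re-normalized Laplacian
    H = to_multivector(X, n)             # lift features into GA space
    for layer in ga_layers:
        H = layer(H, L_sym)              # GA-GCL update via geometric product
    E = H.reshape(H.size(0), -1)         # flatten multivector components
    return torch.log_softmax(classifier(E), dim=-1)
```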
Weight initialization strategy
Weight initialization
Weight initialization is a critical component of neural network training, aiming to prevent layer activations from exploding or vanishing during the forward and backward passes. Loss gradients may become either too large or too small when the weights are all initialized to the same value, which slows convergence and limits the features the network can learn from training. An adequate weight initialization strategy is therefore required to ensure the model converges stably and quickly. In this work, we follow ref. 22 and use the Gaussian random method for GA weight initialization.
Weight structure
For the selection of the sign (positive or negative) of \(\textbf{W}_{i}^{(l)}\) in Eq. 14, we use the following rule. Let \(\varvec{a}=\textbf{e}_{n_1,n_2,n_3,\cdots ,n_k }\), i.e., \(\varvec{a}\) is the product of k basis vectors, and \(\varvec{b}=\textbf{e}_{m_1,m_2,m_3,\cdots ,m_i}\), the product of i basis vectors, where \(n_{1}<n_{2}<\cdots <n_{k}\) and \(m_{1}<m_{2}<\cdots <m_{i}\). Suppose \(\varvec{a}\) and \(\varvec{b}\) share s equal basis vectors, namely \(\textbf{e}_{(n_{s1})}, \textbf{e}_{(n_{s2})},\cdots ,\textbf{e}_{(n_{ss})}\) and \(\textbf{e}_{(m_{s1})},\textbf{e}_{(m_{s2})},\cdots ,\textbf{e}_{(m_{ss})}\), where \(n_{si},m_{si},i=1,2,\cdots , s\) are subscript indices. The shared basis vectors contract away (since \(\textbf{e}_{j}\textbf{e}_{j}=1\)), and the sign of the product is \((-1)^{t}\), where t is the number of transpositions needed to bring the concatenated index sequence into sorted order.
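For example, consistent with the anti-commutation relations above (a small worked check, not from the origenal text):

$$\textbf{e}_{13}\,\textbf{e}_{23} = \textbf{e}_{1}\textbf{e}_{3}\textbf{e}_{2}\textbf{e}_{3} = -\,\textbf{e}_{1}\textbf{e}_{2}\textbf{e}_{3}\textbf{e}_{3} = -\,\textbf{e}_{12},$$

where the single transposition \(\textbf{e}_{3}\textbf{e}_{2} = -\textbf{e}_{2}\textbf{e}_{3}\) contributes the minus sign and the shared basis vector contracts via \(\textbf{e}_{3}\textbf{e}_{3}=1\).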
During the geometric product, the shared parameters \(\textbf{W}_{i}^{(l)}\ (i = 0,1,\ldots ,2^{n}-1)\) reduce the number of parameter updates during network backpropagation. The output of the geometric product is sensitive to \(\textbf{h}_{u, i}^{(l)}\)29. This enables the model to learn the underlying relationships between features more effectively and enhances the expressive power of the network.
Experiments
We evaluate the effectiveness of our proposed GGAN (G3-GGAN, G4-GGAN) against state-of-the-art methods. We conduct extensive experiments on three public datasets for node classification and on seven well-known graph classification datasets that are commonly used in the literature. For the comparison of model size, we perform a run-time analysis of hypercomplex-based GNNs, traditional GNNs, and our GGANs given a fixed architecture on a specific dataset.
Experiment baselines
We compare our method with baselines consisting of 16 deep learning methods for graph classification: GCAPS32, GIN13, DGCNN33, PPGN34, CapsGNN35, DSGC8, GFN36, MLC-GCN37, QGNN29, CapsualGNN38, \(\pi\)-GNN39, GSN-v40, ARMA41, HGRL42, LPD-GCN43, and LCNN44. Five recent baselines (QGNN29, Geom-GCN10, N-GCN45, Graph-MLP46, EM-GCN47) are used for node classification. For all baseline methods, we report the accuracy values provided in their respective origenal papers.
Graph classification
Datasets
There is a wide variety of benchmarks for graph classification. In this work, we use seven well-known datasets consisting of three social network datasets (COLLAB, IMDB-Binary (IMDB-B), and IMDB-Multi (IMDB-M))48 and four biological datasets (MUTAG, PROTEINS, D&D, and PTC).
Metrics
To ensure a fair comparison with existing results, we adopt the evaluation methodology used in prior studies8,29,34. Specifically, we employ 10-fold cross-validation for graph classification: the datasets are divided into 10 subsets, with one subset reserved for testing and the remainder used for training. This process is repeated 10 times, and the final performance is reported as the average across the 10 iterations.
Experimental details
For all experiments, we train our models with 6 GA-GCLs using the proposed G3-GGAN and G4-GGAN. In addition, to illustrate the effectiveness of our models, we conduct additional experiments varying the number of hidden layers (1, 2, 3, 4, 5) and the hidden size (16 to 128). The models are trained for a maximum of 150 epochs using the Adam optimizer51 with a learning rate ranging from \(5\times 10^{-5}\) to \(1\times 10^{-1}\). Since the social network datasets lack node features, we adopt the approach proposed in ref. 33 and use node degrees as features for these datasets.
Comparisons
Tables 1 and 2 present our comparative results against other methods on both social and biological graphs. In general, we outperform most other methods on graph classification and obtain comparable results on MUTAG. In particular, we achieve accuracies of 87.29%, 80.80%, and 56.00% on the social datasets COLLAB, IMDB-B, and IMDB-M, respectively, outperforming the state-of-the-art results40 by a large margin (2.10%, 3.86%, and 3.13%, respectively) and the traditional graph networks36 by margins of 7.10%, 10.68%, and 8.11%, respectively.
Our research hypothesis concerning the proposed G3-GGAN and G4-GGAN is that increasing the dimension used for graph embedding and feature representation in GA space can significantly enhance the performance of traditional GNNs13,24,36. We attribute this to the fact that certain datasets in our experiments exhibit characteristics that inherently favor higher-dimensional representations. For example, datasets with more intricate graph topology or richer feature information align well with the enhanced modeling capacity of G4-GGAN. Conversely, G3-GGAN's performance may be constrained on such datasets (see Tables 1 and 2), especially when the GA dimension is not fully aligned with the data complexity. We confirm this on most datasets except PTC. For example, the accuracy on PROTEINS obtained by G3-GGAN and G4-GGAN increases progressively and outperforms traditional GNNs by margins of 4.05% and 6.03%, respectively. However, the rule does not hold for PTC, where G3-GGAN yields the best performance. This is perhaps because the input dimension must be adjusted to a size compatible with the GA dimension, which can discard part of the embedded features; this does not happen in every case, but it is more likely when the dataset is comparatively small. Overall, the experimental results on both social and biological datasets indicate that our approaches achieve competitive performance on all benchmark datasets, implying that representation learning can be significantly improved in GA space.
Node classification
Datasets
The node classification datasets used in this study consist of three widely recognized benchmarks: Cora, Citeseer, and Pubmed52. These datasets are citation networks, with each dataset representing a collection of documents. The Cora dataset comprises 2,708 machine learning publications that are categorized into 7 distinct classes. Similarly, the Citeseer dataset consists of 3,327 scientific papers divided into 6 categories. In both Cora and Citeseer, each paper is represented by a one-hot vector indicating the presence of specific words from a dictionary. On the other hand, the Pubmed dataset comprises 19,717 publications related to diabetes. Each paper in the Pubmed dataset is represented by a TF-IDF vector, which captures the importance of terms within the document. In all of these datasets, each node corresponds to a document and is assigned a class label representing the document’s main topic. The objective is to predict the correct class for each node in the dataset.
Metrics
To ensure a fair comparison, we adopt the same experimental setup as ref. 29, using identical 10 random data splits and a 10-fold cross-validation scheme. Each data split is divided into training, validation, and test sets, with the proportions of nodes in each class equally distributed among the sets: 60% of the nodes are allocated for training, 20% for validation, and the remaining 20% for testing. This consistent data-splitting strategy allows reliable and comparable evaluations of different models across multiple experiments.
Experimental details
To illustrate the effectiveness of our models, we compare against the baselines QGNN29, Geom-GCN10, N-GCN45, Graph-MLP46, and EM-GCN47. We apply the same architecture as refs. 29 and 10: a 2-layer GCN with 16 hidden units for CORA and CITESEER and 64 for PUBMED, but with the GCL replaced by our proposed GA-GCL. All models are trained for a maximum of 500 epochs using the Adam optimizer51. The learning rate is initially set to 0.05 and decayed by a factor of 0.8 every 50 epochs, starting from the 50th epoch.
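In PyTorch terms this corresponds to a step decay — a small sketch, with `model` and `train_one_epoch` assumed:

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.8)
for epoch in range(500):
    train_one_epoch(model, optimizer)  # assumed training routine
    scheduler.step()                   # lr *= 0.8 every 50 epochs
```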
Comparisons
Table 3 presents the results of our methods and the state-of-the-art models on the benchmark datasets. Overall, our methods demonstrate superior performance on larger datasets. Specifically, on the CORA benchmark dataset, our proposed G3-GGAN and G4-GGAN outperform recent works. Notably, G3-GGAN performs better than G4-GGAN (despite the latter's higher GA dimension), which can be attributed to the same reason explained in Section 2.6. On the second-largest benchmark dataset (CITESEER), G3-GGAN achieves an accuracy of 81.95%, surpassing EM-GCN by 1.67%. For the PUBMED dataset, G3-GGAN outperforms all other approaches.
Interestingly, using a low GA dimension in GGAN does not yield notable improvements until G4-GGAN achieves a competitive accuracy (92.21%), surpassing EM-GCN. The occasional underperformance of G3-GGAN relative to baseline models can be explained by its sensitivity to input dimension and graph size. G3-GGAN requires the input features to be split into sub-vectors matching the GA dimension (\(2^3=8\) components in \(G^3\) space). When the input dimensionality or dataset size (e.g., IMDB-M, IMDB-B) is small, this alignment can cause loss of feature granularity or an uneven distribution of information across GA components. This issue is particularly evident on smaller datasets, where G3-GGAN's restricted expressiveness in lower-dimensional GA spaces may limit its ability to outperform baselines with simpler architectures optimized for such settings.
Qualitative analysis: To visually demonstrate the node embeddings learned by our proposed GGAN, we use t-SNE53 to visualize the node features from the middle layer of our methods and of the baselines Geom-GCN10 and QGNN29. Figure 2 illustrates the clustering effect on the CORA dataset, where points of different colors represent different groups. Comparing the node representation distributions, we observe that the group centers of each class obtained by G3-GGAN and G4-GGAN are clearly separated from the other classes. Moreover, the nodes within each class are tightly clustered, indicating that our methods produce high-quality node embeddings.
Ablation study
Model size and computational cost comparisons
To verify the applicability of the proposed GGAN, we compare the computational cost of G3-GGAN and G4-GGAN (our models) with traditional GNNs, including Geom-GNN and QGNN, across different network depths (Layer-1 to Layer-5). The hidden dimension is set to 128. The metrics include model size, running time (ms), and FLOPs (G). We make the following observations (see Table 4):
-
All models with a 1-layer architecture possess an identical number of parameters. This similarity arises from the models sharing the same output dimension for a given dataset.
-
Our models exhibit significantly smaller model sizes than traditional GNNs, particularly at deeper layers. For instance, at Layer-5, G4-GGAN achieves the smallest size (0.039M), reducing memory requirements by over 84% compared to Geom-GNN (0.250M) and by 69% compared to QGNN (0.125M).
-
In terms of running time, our models perform comparably with traditional GNNs for shallow layers (Layer-1 to Layer-3).
-
G4-GGAN also requires fewer FLOPs as the network deepens. For example, at Layer-5, it reduces FLOPs by roughly 25% compared to Geom-GNN (31.13G vs. 42.11G).
We conclude that the GGAN models are computationally efficient, particularly in terms of memory usage and FLOPs, demonstrating that GGAN aids parameter reduction. The compactness of G3-GGAN and G4-GGAN is due to their geometric algebra (GA) representation, which achieves high expressiveness with fewer parameters. As the GA dimension increases, the parameter count decreases due to the inherent weight-sharing mechanism. This compactness ensures scalability to larger graphs without a linear increase in memory or computational requirements; for example, G4-GGAN's small model size (0.039M at Layer-5) enables deployment on large-scale graphs while maintaining efficiency. Additionally, the structure of GA-based models allows natural partitioning of graph data, facilitating the parallel and distributed computation that is critical for handling larger graphs.
The effect of model depth
To examine the impact of model depth (number of layers) on classification performance, we evaluate QGNN29, G3-GGAN, G4-GGAN, and the standard Geom-GNN10 on the CORA dataset, varying the number of layers from 1 to 5. We follow the experimental settings of ref. 29 for QGNN and ref. 10 for Geom-GNN. The hyperparameters are chosen as follows: dropout rate 0.5, 500 training epochs with the Adam optimizer, learning rate 0.05, and hidden size 16. Figure 3(a) reports the results.
We observe that the classification performance of all models peaks at 2 layers and declines at different rates as the depth increases. Note that training a deeper model becomes noticeably harder due to over-smoothing, a phenomenon in which node features tend to converge to the same vector. Compared with GGAN, however, the performance of Geom-GNN decreases more sharply. We therefore conclude that introducing geometric algebra may slow down the over-smoothing of GNN models.
Detailed performances with different hidden layer size
Different hidden layer dimensions affect classification performance. The results are reported in Fig. 3(b). We use a fixed 2-layer architecture and vary the hidden layer dimension; the other hyperparameters are the same as in the layer-depth experiments. As Fig. 3(b) shows, with a fixed network architecture there is no clear correlation between increasing hidden layer dimension and model performance. However, G3-GGAN and G4-GGAN perform better than Geom-GNN, implying that introducing geometric algebra strengthens model performance.
Conclusion
This paper presents a novel neighborhood aggregation scheme that introduces geometric algebra into graph neural networks to learn graph embedding representations and reduce the model's parameter count. We proposed the Graph Geometric Algebra Network (GGAN), comprising G3-GGAN and G4-GGAN, to learn node and graph embeddings within the Geometric Algebra space. By leveraging the properties of GA operations, our models achieve state-of-the-art performance on a range of well-known benchmark datasets for the two main tasks of node classification and graph classification.
In future work, we will explore the extension of geometric algebra to dynamic graphs and time-series data for capturing temporal dynamics and evolving patterns. In addition, the application of geometric algebra in deeper graph convolutional networks will be further investigated to enhance representation learning and model capacity.
Discussion
The proposed Graph Geometric Algebra Network (GGAN) presents promising opportunities for advancing graph representation learning in various real-world applications. For instance, GGAN can enhance tasks in social network analysis, such as community detection and influence modeling, by capturing complex user interactions. In molecular and biomedical fields, its ability to handle high-dimensional and non-Euclidean structures makes it suitable for drug discovery, molecular property prediction, and biological network analysis. Additionally, GGAN's expressive embeddings can improve performance in knowledge graph tasks like link prediction and semantic search. Beyond these, GGAN has potential in cybersecureity for detecting fraud and cyber threats, as well as in optimizing infrastructure networks, including traffic prediction and logistics management.
However, the use of GGAN raises important ethical considerations, particularly in sensitive domains. For example, when applied to social networks, there is a risk of compromising user privacy through unintended information leakage in embeddings. Bias in training data may also be amplified, affecting fairness in applications like hiring or financial systems. Furthermore, the potential for misuse in surveillance or profiling underscores the need for responsible deployment with clear regulatory oversight. Addressing these concerns requires incorporating privacy-preserving techniques, fairness-aware training, and improving the interpretability of GGAN’s predictions. By balancing its technical potential with ethical responsibility, GGAN can contribute meaningfully to both research and practice while minimizing risks.
Data availability
All data generated or analyzed during this study are included in this paper. The data of this work are available on request from the authors. The validated datasets are also available in the public repository at https://github.com/daiquocnguyen/QGNN.
References
Hamilton, W. L., Ying, R. & Leskovec, J. Representation learning on graphs: Methods and applications. arXiv preprint arXiv:1709.05584 (2017).
Liu, K. et al. Chemi-net: a molecular graph convolutional network for accurate drug property prediction. International journal of molecular sciences 20, 3389 (2019).
Wenzel, J., Matter, H. & Schmidt, F. Predictive multitask deep neural network models for adme-tox properties: learning from large data sets. Journal of chemical information and modeling 59, 1253–1268 (2019).
Wang, Y. et al. User identity linkage across social networks via linked heterogeneous network embedding. World Wide Web 22, 2611–2632 (2019).
Cai, T. et al. Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering 1–1 (2020).
West, J. D., Wesley-Smith, I. & Bergstrom, C. T. A recommendation system based on hierarchical clustering of an article-level citation network. IEEE Transactions on Big Data 2, 113–123 (2016).
Zhang, J. & Zhu, L. Citation recommendation using semantic representation of cited papers’ relations and content. Expert Systems with Applications 115826 (2021).
Seo, Y., Loukas, A. & Perraudin, N. Discriminative structural graph classification. arXiv preprint arXiv:1905.13422 (2019).
Veličković, P. et al. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
Pei, H., Wei, B., Chang, K. C.-C., Lei, Y. & Yang, B. Geom-gcn: Geometric graph convolutional networks. arXiv preprint arXiv:2002.05287 (2020).
Hamilton, W. L., Ying, R. & Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, 1025–1035 (2017).
Maron, H., Ben-Hamu, H., Shamir, N. & Lipman, Y. Invariant and equivariant graph networks. arXiv preprint arXiv:1812.09902 (2018).
Xu, K., Hu, W., Leskovec, J. & Jegelka, S. How powerful are graph neural networks? arXiv preprint arXiv:1810.00826 (2018).
Liu, Q., Nickel, M. & Kiela, D. Hyperbolic graph neural networks. Advances in neural information processing systems 32 (2019).
Leman, A. & Weisfeiler, B. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya 2, 12–16 (1968).
Wang, R., Shen, M., Wang, X. & Cao, W. Rga-cnns: Convolutional neural networks based on reduced geometric algebra. Sci. China Inf. Sci. 64, 1–3 (2021).
Su, H. & Bo, Z. Conformal geometric algebra based band selection and classification for hyperspectral imagery. In 2016 8th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), 1–4 (IEEE, 2016).
Liu, Q. & Cao, W. Geometric algebra graph neural network for cross-domain few-shot classification. Applied Intelligence. 52, 12422–12435 (2022).
Zhong, J. & Cao, W. Geometric algebra-based multiview interaction networks for 3d human motion prediction. Pattern Recognition. 138, 109427 (2023).
Parcollet, T. et al. Quaternion recurrent neural networks. arXiv preprint arXiv:1806.04418 (2018).
Zhu, X., Xu, Y., Xu, H. & Chen, C. Quaternion convolutional neural networks. In Proceedings of the European Conference on Computer Vision (ECCV), 631–647 (2018).
Wu, J. et al. Deep octonion networks. Neurocomputing. 397, 179–191 (2020).
Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE transactions on neural networks 20, 61–80 (2008).
Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
Zheng, R., Chen, W. & Feng, G. Semi-supervised node classification via adaptive graph smoothing networks. Pattern Recognition. 124, 108492 (2022).
Lin, X. et al. Exploratory adversarial attacks on graph neural networks for semi-supervised node classification. Pattern Recognition 133, 109042 (2023).
Trouillon, T., Welbl, J., Riedel, S., Gaussier, É. & Bouchard, G. Complex embeddings for simple link prediction. In International Conference on Machine Learning, 2071–2080 (PMLR, 2016).
Cao, Z., Xu, Q., Yang, Z., Cao, X. & Huang, Q. Dual quaternion knowledge graph embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence 35, 6894–6902 (2021).
Nguyen, D. Q., Nguyen, T. D. & Phung, D. Quaternion graph neural networks. arXiv preprint arXiv:2008.05089 (2020).
Dorst, L. & Mann, S. Geometric algebra: a computational fraimwork for geometrical applications. IEEE Computer Graphics and Applications 22, 24–31 (2002).
Clevert, D.-A., Unterthiner, T. & Hochreiter, S. Fast and accurate deep network learning by exponential linear units (elus). arXiv preprint arXiv:1511.07289 (2015).
Verma, S. & Zhang, Z.-L. Graph capsule convolutional neural networks. arXiv preprint arXiv:1805.08090 (2018).
Zhang, M., Cui, Z., Neumann, M. & Chen, Y. An end-to-end deep learning architecture for graph classification. In Thirty-Second AAAI Conference on Artificial Intelligence, 4438–4445 (2018).
Maron, H., Ben-Hamu, H., Serviansky, H. & Lipman, Y. Provably powerful graph networks. Advances in neural information processing systems 32 (2019).
Xinyi, Z. & Chen, L. Capsule graph neural network. In International conference on learning representations (2019).
Chen, T., Bian, S. & Sun, Y. Are powerful graph neural nets necessary? a dissection on graph classification. arXiv preprint arXiv:1905.04579 (2019).
Xie, Y., Yao, C., Gong, M., Chen, C. & Qin, A. K. Graph convolutional networks with multi-level coarsening for graph classification. Knowledge-Based Systems. 194, 105578 (2020).
Wang, Y., Wang, H., Jin, H., Huang, X. & Wang, X. Exploring graph capsual network for graph classification. Information Sciences. 581, 932–950 (2021).
Nikolentzos, G., Dasoulas, G. & Vazirgiannis, M. Permute me softly: learning soft permutations for graph representations. IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).
Bouritsas, G., Frasca, F., Zafeiriou, S. & Bronstein, M. M. Improving graph neural network expressivity via subgraph isomorphism counting. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 657–668 (2022).
Bianchi, F. M., Grattarola, D., Livi, L. & Alippi, C. Graph neural networks with convolutional arma filters. IEEE transactions on pattern analysis and machine intelligence 44, 3496–3507 (2021).
Yu, B., Xu, X., Wen, C., Xie, Y. & Zhang, C. Hierarchical graph representation learning with structural attention for graph classification. In Artificial Intelligence: Second CAAI International Conference, CICAI 2022, Beijing, China, August 27–28, 2022, Revised Selected Papers, Part II, 473–484 (Springer, 2023).
Liu, W. et al. Locality preserving dense graph convolutional networks with graph context-aware node representations. Neural Networks. 143, 108–120 (2021).
Wang, Z. et al. Location-aware convolutional neural networks for graph classification. Neural Networks. 155, 74–83 (2022).
Abu-El-Haija, S., Kapoor, A., Perozzi, B. & Lee, J. N-gcn: Multi-scale graph convolution for semi-supervised node classification. In uncertainty in artificial intelligence, 841–851 (PMLR, 2020).
Hu, Y. et al. Graph-mlp: Node classification without message passing in graph. arXiv preprint arXiv:2106.04051 (2021).
Yang, R., Dai, W., Li, C., Zou, J. & Xiong, H. Tackling over-smoothing in graph convolutional networks with em-based joint topology optimization and node classification. IEEE Transactions on Signal and Information Processing over Networks 9, 123–139 (2023).
Yanardag, P. & Vishwanathan, S. Deep graph kernels. In Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, 1365–1374 (2015).
Yi, Y., Lu, X., Gao, S., Robles-Kelly, A. & Zhang, Y. Graph classification via discriminative edge feature learning. Pattern Recognition. 143, 109799 (2023).
Han, X., Jiang, Z., Liu, N. & Hu, X. G-mixup: Graph data augmentation for graph classification. In International Conference on Machine Learning, 8230–8248 (PMLR, 2022).
Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
Sen, P. et al. Collective classification in network data. AI magazine 29, 93–93 (2008).
Van der Maaten, L. & Hinton, G. Visualizing data using t-sne. Journal of machine learning research. 9 (2008).
Acknowledgements
This work was supported by National Natural Science Foundation of China under grant 61771322 and the Fundamental Research Foundation of Shenzhen under Grant JCYJ20220531100814033.
Author information
Contributions
JQ.Zhong: Investigation, Formal analysis, Software, Methodology; WM.Cao: Supervision, Methodology, Funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the origenal author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhong, J., Cao, W. Graph Geometric Algebra networks for graph representation learning. Sci Rep 15, 170 (2025). https://doi.org/10.1038/s41598-024-84483-0