Introduction

Aphasia is a condition affecting on average 33% and up to 50% of post-stroke patients1. It is associated with impaired comprehension, reading, writing, and production of language2,3, and due to strong inter-individual variability, the development of efficient diagnostic biomarkers is a matter of interest in language neuroscience.

As reported in functional magnetic resonance imaging (fMRI) studies, aphasia is associated with alteration of physiological neuronal communication between brain regions, suggesting that aphasia is in fact a network disorder4. Nevertheless, functional network abnormalities depend on the stage of the pathology. As assessed with fMRI, the (sub-)acute stage is associated with generalized reduced cortical activity and network integration, followed by an increase over the right (non-lesioned) hemisphere and subsequent normalization towards the chronic stage3,5,6. The development of the pathology is also reflected in the distribution of connectivity patterns across the cortex. The acute stage of aphasia was shown to feature reduced inter-hemispheric and higher intra-hemispheric connectivity between language and domain-general cortical regions (Siegel et al.6). At the chronic stage, activation of right-hemispheric or domain-general regions depends on the severity and properties of the lesion over the left hemisphere, which retains a major role in language recovery, as well as on the difficulty of the task7,8,9. In contrast, graph property measures showed no differences or higher network integration in persons with aphasia at the acute stage when compared to healthy controls (HC)10,11 whilst no evidence exists in the chronic stage.

Electroencephalography (EEG) is a convenient alternative to fMRI for both research and diagnostics due to its reduced cost and higher temporal resolution. In fact, it provides more direct insights into the cortical electrical activity as compared to the hemodynamic response detected with fMRI. To date, it has been extensively used as an informative biomarker for diverse conditions which include epilepsy12,13, schizophrenia14,15, and dementia16,17,18. Recently, it has also become of research interest for the diagnosis of aphasia and the prediction of recovery. At both acute and chronic stages, EEG quantitative measures show a shift towards lower frequencies of alpha and beta bands in the spectrum, associated with severity of both cognitive and language impairment19,20. Notably, task-based event-related potential (ERP) studies found altered cortical responses to vocal stimuli in persons with post-stroke chronic aphasia (PWA) with a loss of inter-hemispheric activity, also associated with the extent of the lesioned region21,22,23,24.

In addition to power-spectrum alterations, EEG is also suitable for investigating cortical network correlates of diverse clinical conditions16,25,26. Network patterns can be assessed via either effective or functional connectivity. The former provides causality information by assessing the mutual predictivity between every pair of signals, and includes biological hypothesis-based methods, e.g. neural mass modelling27,28, and data-driven approaches, e.g. Granger causality29,30. Functional connectivity is a measure of association between the recorded signals, which can be measured via methods including amplitude correlation and phase-based measures31. Both approaches are complementary and allow the same condition to be analyzed from different perspectives (for a review on the topic the reader is referred to Sakkalis26). In fact, EEG network metrics showed diagnostic potential for conditions which include attention-deficit/hyperactivity disorder, depression and dementia32,33,34.

According to resting-state magnetoencephalography- and EEG-based graph theory studies, the functional brain connectome in acute aphasia features alterations in its geometrical properties, which include reduced network segregation in delta- and theta-band networks compared to the healthy condition35, and an association between node degree of Broca's area and subsequent language improvement36. At the chronic stage, patients show left-hemispheric reduction of connectivity strength in alpha- and beta-bands, with an increase in a local low-gamma-band subnetwork compared with healthy people37,38. Furthermore, left-hemispheric alpha and centro-frontal beta connectivity are reportedly associated with alteration of speech processing37.

Despite a certain number of studies involving specific tasks or resting state paradigms, there is a lack of research on the functional network correlates of perception and processing of natural speech in PWA. Given the ecological validity of continuous speech39,40,41, associated cortical tracking mechanisms and functional network patterns have already been investigated in existing studies on healthy people40,42,43,44,45,46. However, a picture of brain functional network alterations in chronic aphasia is missing.

In a novel manner, we perform an exploratory analysis and use EEG to assess the functional connectome in PWA while they are listening to a naturally spoken story. Besides detecting differential network patterns between PWA and HC groups, we also assess differences in the network architecture via a graph theory analysis, test whether any network pattern is associated with the severity of aphasia, and investigate the diagnostic potential for aphasia of EEG-based network metrics. A summary of our analysis flow is shown in Fig. 1.

Fig. 1

Methodological workflow. (a) EEG is recorded while the participants listen to a story, and functional connectivity is extracted from the recorded signals. On the connectivity matrix, purple bar: left hemisphere; green bar: right hemisphere. (b) Group differences and correlations with behavioural test scores were assessed. (c) Graph measures were computed and compared between groups within the frequency bands with significantly different network patterns. Blue: HC; red: PWA. On the topography, blue: HC > PWA, red: PWA > HC. (d) Targeted node-attack to investigate the network distribution in both groups. Blue: HC; red: PWA. (e) Diagnostic accuracy analysis with implementation of a support vector machine classifier.

Results

Demographic data

The two groups did not show any significant difference in age (Mann–Whitney U-test, U = 721.5, P = 0.355), and had a comparable sex distribution (Chi-squared test, χ2 = 0.206, P = 0.650). The HC group performed better in the cognition task (U = 469, P < 0.001), NBT (U = 408.5, P < 0.001), CAT-NL (U = 547, P = 0.010) and ScreeLing (U = 479, P < 0.001). Statistics on demographic data and behavioral test scores are reported in Table 1.

Table 1 Demographic and behavioural data.

Network patterns: between-group comparison

We used the cluster-based Network-Based Statistics toolbox (NBS)47 within a range of primary statistical thresholds (\({t}_{th}\)) to test whether network patterns differed between PWA and HC. The outcome of this analysis is shown in Fig. 2a and reported in Table 2. We found a stronger network component in PWA within the theta-band (4.5–7 Hz) (\(5<{t}_{th}<20\), \(P<0.007\), surviving Holm-Bonferroni correction for six frequency bands), which comprised connections between the left temporal scalp region and right frontal and parietal nodes. A significant network component also emerged in the low-gamma-band (30.5–49 Hz) (\(2<{t}_{th}<12\), \(P<0.040\), not surviving Holm-Bonferroni correction for six frequency bands), featuring weaker connectivity in PWA within the left-parietal nodes, between occipital and frontal areas, and between left-parietal and right-temporal nodes. No significant components were found for the other frequency bands.
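The Holm-Bonferroni step mentioned above can be illustrated with a minimal sketch. It is not the authors' code. The two values 0.007 and 0.040 are the upper bounds on P quoted in the text for the theta- and low-gamma-bands; the four remaining p-values and the labels `other_1` to `other_4` are placeholders assumed here only to complete a family of six tests.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per p-value: True if the null hypothesis is
    rejected under the Holm-Bonferroni step-down procedure."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return rejected


# Theta and low-gamma use the bounds quoted in the text;
# the other four entries are hypothetical placeholders.
p_by_band = {
    "theta": 0.007,
    "low-gamma": 0.040,
    "other_1": 0.20,
    "other_2": 0.35,
    "other_3": 0.50,
    "other_4": 0.80,
}
decisions = holm_bonferroni(list(p_by_band.values()))
survives = dict(zip(p_by_band, decisions))
```

With six tests, the smallest p-value must not exceed 0.05/6 ≈ 0.0083 and the second smallest must not exceed 0.05/5 = 0.01, which is why a value of 0.007 survives the correction while 0.040 does not.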

Fig. 2

Results of the NBS analysis. Darker edges in the topographies represent connections surviving higher \({t}_{th}\) values. (a) Differential network components between groups were found within the theta (θ)-band and the low-gamma (γ)-band. The boxplots show by way of example the group \({WPLI}_{NBS}\) distributions for the θ- and low-γ-band respectively at \({t}_{th}=12\) (\(P<0.001\)) and \({t}_{th}=8\) (\(P=0.011\)), and distributions at the other \({t}_{th}\) values are reported in Supplementary materials. (b) A significant negative correlation emerged between the CAT-NL semantic fluency score and network components in the delta (δ)- and low-γ-bands. The scatter plots are reported respectively for \({t}_{th}=8\) (\(P=0.024\)) and \({t}_{th}=4\) (\(P=0.014\)), and distributions for the other \({t}_{th}\) values are reported in Supplementary materials.

Table 2 Significant differences in network metrics between PWA and HC.

Network patterns: aphasia severity

We explored whether severity of aphasia, as assessed through behavioral tests, was associated with any functional subnetwork. For this purpose, we used the NBS and performed F-tests within the PWA group, including each behavioral score as covariate of interest for all frequency bands. We detected two components, one within the delta-band network (1–4 Hz) (\(8<{t}_{th}<14, P<0.038\), not surviving Holm-Bonferroni correction for six frequency bands) and one within the low-gamma-band network (\(4<{t}_{th}<11, P<0.035\), not surviving Holm-Bonferroni correction for six frequency bands), that negatively correlated with the semantic fluency score of the Comprehensive Aphasia Test in Dutch (CAT-NL)48, as shown in Fig. 2b. No significant correlation was found for the other clinical scores.

Graph theory

Weighted graphs were thresholded from 3 to 40% network densities and graph measures were computed at each density (PT%) within the theta- and low-gamma-band networks, which yielded significant between-group differences from the NBS. The obtained metrics were compared between groups both locally and globally, as shown in Fig. 3 and reported in Table 2. In the theta-band, PWA showed higher average node strength and clustering coefficient, and lower eccentricity, characteristic path length and small-worldness. Although between-group differences in node strength, eccentricity and small-worldness were consistent across PT%, a difference in clustering coefficient emerged only for the highest density values (two-tailed Mann–Whitney U-tests, P < 0.05). The node-level analysis revealed that the higher strength in PWA was driven by the left temporal and right frontal and parietal nodes, whilst the most clustered regions in PWA comprised the left-occipital and right-frontal nodes. All regions showed consistently higher eccentricity in HC compared to PWA. Although the node strength in PWA was higher on average, lower values compared to HC were found locally for nodes within the occipital scalp region (two-tailed Mann–Whitney U-tests, P < 0.05, uncorrected for 64 nodes). In the low-gamma-band network, between-group differences in the average measures were detected for none or only a few PT% values (Fig. 3b). Node strength, characteristic path length and small-worldness were not different between groups, whilst the clustering coefficient was higher in PWA for lower values of PT% and the eccentricity was lower in PWA for the highest graph densities (two-tailed Mann–Whitney U-tests, P < 0.05). Nevertheless, local differences emerged for all nodal measures (two-tailed Mann–Whitney U-tests, P < 0.05). Nodes over the left temporal regions were more strongly connected and clustered in PWA. In addition, one node over the right parietal region and the whole right superior temporal region showed respectively higher strength and clustering coefficient. Similarly to the theta-band network, the differences in eccentricity were widespread across all nodes, although less consistent across PT%. Modularity was not different between groups in either frequency band.
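The density-based thresholding described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names, the 4-node matrix and the 1% step between densities are all hypothetical, and only node strength is computed.

```python
def threshold_proportional(weights, density):
    """Keep the strongest `density` fraction of edges in a symmetric weight matrix."""
    n = len(weights)
    # List each undirected edge once (upper triangle), strongest first.
    edges = sorted(
        ((weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    keep = int(round(density * len(edges)))
    out = [[0.0] * n for _ in range(n)]
    for w, i, j in edges[:keep]:
        out[i][j] = out[j][i] = w
    return out


def node_strength(weights):
    """Sum of the weights of all edges attached to each node."""
    return [sum(row) for row in weights]


# Hypothetical 4-node connectivity matrix (values are illustrative only).
toy = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.8, 0.3],
    [0.2, 0.8, 0.0, 0.7],
    [0.1, 0.3, 0.7, 0.0],
]

# Densities from 3% to 40%; the 1% step is an assumption.
densities = [d / 100 for d in range(3, 41)]
mean_strength = {}
for pt in densities:
    thresholded = threshold_proportional(toy, pt)
    strengths = node_strength(thresholded)
    mean_strength[pt] = sum(strengths) / len(strengths)
```

Computing each measure at every density, rather than at a single cut-off, is what allows a between-group difference to be described as consistent or inconsistent across PT%.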

Fig. 3

Results of the graph theory analysis. Measures are compared between groups on average and locally with respect to the network density. Blue line: HC group; red line: PWA group. Dotted lines: 95% confidence interval. Blue node: graph measure higher in HC compared with PWA; red node: graph measure higher in PWA compared with HC. Size of the node is proportional to the occurrence of test-significance across PT% values. Statistics are uncorrected for number of nodes. (A) Outcome for the theta (θ)-band network. All measures were different between groups except for the modularity; local measures showed node-specific between-group differences. (B) Outcome for the low-gamma (γ)-band network. Inconsistent differences emerged globally, whilst we found consistent local differences.

Network distribution

To further investigate the network architecture, we performed a targeted node-attack and extracted the trend of the average characteristic path length with respect to the percentage of removed nodes. For the theta-band network, we observed an apparent earlier peak in PWA compared to HC; nevertheless, the Mann–Whitney U-test did not yield a significant result (U = 640.5, P = 0.07). No difference was found either within the low-gamma-band network. The outcome of this analysis is reported in Fig. 4.
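A targeted node-attack of this kind can be sketched on a toy graph. The following is a minimal illustration, not the authors' implementation. It assumes a binary (unweighted) graph, removal of the highest-degree node at each step with degree recomputed on the surviving nodes, and a characteristic path length averaged only over pairs that remain connected. None of these choices is stated in the text.

```python
from collections import deque


def char_path_length(adj, nodes):
    """Mean shortest-path length (in hops) over reachable pairs within `nodes`."""
    total, pairs = 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search restricted to surviving nodes
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def targeted_attack(adj):
    """Path length before any removal and after each removal of the current hub."""
    nodes = set(adj)
    curve = [char_path_length(adj, nodes)]
    while len(nodes) > 1:
        # Degree is recomputed on the surviving subgraph; ties go to the lowest label.
        hub = max(sorted(nodes), key=lambda u: sum(v in nodes for v in adj[u]))
        nodes.remove(hub)
        curve.append(char_path_length(adj, nodes))
    return curve


# Hypothetical wheel graph: hub 0 linked to a ring of nodes 1..6.
adj = {0: [1, 2, 3, 4, 5, 6]}
for i in range(1, 7):
    adj[i] = [0, (i % 6) + 1, ((i - 2) % 6) + 1]

curve = targeted_attack(adj)
peak_step = curve.index(max(curve))
```

In this toy graph the path length rises once the hub is removed, peaks, and then falls as the network fragments. The position of that peak along the removal sequence is the quantity compared between groups in the analysis above.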

Fig. 4

Outcome of the targeted node-attack analysis. For the theta (θ)-band-network, the average characteristic path length showed an earlier peak in PWA compared with HC, whilst no difference emerged within the low-gamma (γ)-band network. Shaded areas represent the 95% confidence interval.

Support vector machine

The potential of network measures to discriminate between PWA and HC was assessed by implementing a support vector machine (SVM) classifier with leave-one-out cross-validation (Fig. 5). The training set included the average connectivity strength of the NBS-detected component (\({WPLI}_{NBS}\)) at each cross-validation step as well as all graph theory metrics (i.e. node strength, clustering coefficient, eccentricity, characteristic path length, small-worldness and modularity) and age. Only the frequency bands which yielded significant between-group differences, i.e. theta-band and low-gamma-band, were tested. The classifier was able to correctly predict the group to which a subject belonged with an accuracy of 78% and area under the receiver operating characteristic (AUROC) curve of 83%. The optimal working point of the classifier was situated at sensitivity 74% and specificity 82%. The most important variable for classification as assessed by the classifier was the \({WPLI}_{NBS}\) (obtained at \({t}_{th}=12\)) in the theta-band, followed by the \({WPLI}_{NBS}\) (obtained at \({t}_{th}=8\)) in the low-gamma-band, the node strength in the theta-band, and the characteristic path length in the low-gamma-band. The complete importance ranking with respective coefficients is reported in Supplementary materials.
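The leave-one-out scheme and the sensitivity and specificity figures can be illustrated with a small sketch. To stay dependency-free it swaps in a nearest-centroid classifier, which is not an SVM and not the classifier used in the study. The two-feature toy data and all function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean feature vector is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [f for f, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(features, labels):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        preds.append(nearest_centroid_predict(train_x, train_y, features[i]))
    return preds


def sensitivity_specificity(labels, preds, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(y == positive and p == positive for y, p in zip(labels, preds))
    fn = sum(y == positive and p != positive for y, p in zip(labels, preds))
    tn = sum(y != positive and p != positive for y, p in zip(labels, preds))
    fp = sum(y != positive and p == positive for y, p in zip(labels, preds))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: two made-up connectivity features per participant.
# Label 1 = PWA, label 0 = HC; the last PWA sample deliberately resembles HC.
features = [
    [0.30, 0.10], [0.28, 0.12], [0.35, 0.09], [0.13, 0.18],
    [0.12, 0.18], [0.14, 0.17], [0.10, 0.20],
]
labels = [1, 1, 1, 1, 0, 0, 0]

preds = leave_one_out(features, labels)
sens, spec = sensitivity_specificity(labels, preds)
accuracy = sum(y == p for y, p in zip(labels, preds)) / len(labels)
```

Because every participant is predicted by a model that never saw that participant, any feature derived from group statistics (such as an NBS component) has to be re-estimated inside each fold, which is consistent with the text's statement that the component strength was obtained at each cross-validation step.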

Fig. 5

Outcome of the SVM classifier. The leave-one-out cross-validation resulted in AUC = 83%, with optimal working point at sensitivity = 74% and specificity = 82%.

Discussion

We used EEG to investigate functional network correlates of natural speech processing in PWA. Through a combination of complementary analyses, we provided a picture of the altered synchronization between scalp regions due to the pathological condition. Using NBS, we detected increased connectivity in aphasia between the left temporal area and the right hemisphere in the theta-band, as well as a weakened posterior-frontal subnetwork in the low-gamma-band, although only the first survived correction for multiple comparisons across frequency bands. The connectivity strength was also negatively correlated with the semantic fluency in PWA in both delta- and low-gamma-band networks. Connectivity changes were reflected in the network geometry, as demonstrated with graph theory analysis. In fact, a global reorganization of the network architecture in PWA emerged when compared to HC in the theta-band, whilst only local changes were found in the low-gamma-band network. Alteration of the network architecture in PWA was further assessed with a targeted node-attack approach, which, together with the lower small-worldness, suggested a tendency towards a more scale-free distribution compared to HC. Although not suited for clinical application yet, EEG-network measures alone appear promising for the development of comprehensive biomarkers for chronic aphasia, as shown with an SVM classifier, which yielded an accuracy of 78% (AUC = 83%).

Between-region synchronization is higher in the theta-band network and lower in the low-gamma-band network in PWA compared to HC

The between-group comparison revealed that aphasia is associated with altered EEG network properties within the theta-band and low-gamma-band. From our analysis, an increased synchronization between the right hemisphere and the left temporal region emerges within the theta-band network in aphasia. In contrast, PWA showed a weakened posterior-anterior network pattern in the low-gamma-band compared to HC.

The role of these two frequency bands can be interpreted in the perspective of the Asymmetric Sampling in Time (AST) theory49. According to the proposed model, the left auditory cortex is involved in the decoding of phonemes, whilst slower acoustic modulations such as speech prosody50 are usually integrated in the right auditory cortex. Processing of phonemes and prosody were shown to be respectively associated with gamma-band and theta-band tracking during natural speech listening42,43,51. Our results suggest that these mechanisms might be functionally affected in PWA.

The existing literature and the AST theory fit well with the outcome of our analysis in the theta-band network. As discussed in a review by Hartwigsen and Saur5, over-activation of the right hemisphere in domain-general areas at rest or while performing a language test in chronic aphasia is thought to be associated with either a compensatory mechanism7,52,53 or maladaptive inhibition of the still functioning perilesional regions, as also observed in transcranial magnetic stimulation studies with both healthy participants and persons with acute or chronic aphasia54,55,56. Our approach allows us to further disentangle this aspect and favors the former speculation while extending it to natural speech processing. In fact, we detected hyper-synchronization between left and right hemispheric regions rather than a stand-alone over-engagement of the contralesional hemisphere. We believe that this may reflect a recruitment of intact domain-general and homologous regions over the right hemisphere by the damaged left-hemispheric language and auditory areas. In line with our finding, an increase of theta-band activity in both acute and chronic aphasia was observed in a region-of-interest- (ROI-) based EEG study57, emerging as a negative shift in the power-spectrum of the alpha-band activity19,58,59.

On the other hand, a biological interpretation of gamma-band activity remains challenging60,61. High-frequency oscillations and coupled activity with the theta-band reportedly reflect networks of hippocampal inhibitory interneurons62,63 associated with memory, attentive and learning processes62,64,65,66. Interestingly, these cognitive functions were recently shown to be intact in PWA67, leading to the speculation that protective or compensatory mechanisms might be reflected in the weakened low-gamma-band network. Our results may seem in contrast with previous studies reporting either stronger connectivity38 or alteration in alpha- and beta-band networks37 in chronic aphasia. This inconsistency is likely to be due to the chosen measure of connectivity strength, i.e. WPLI vs amplitude or power envelope correlation, and possibly the experimental paradigm, i.e. natural speech listening vs resting state. Therefore, we believe that our results provide further insights into the physiological processes associated with language processing in aphasia rather than contrasting evidence with respect to the existing literature. Nevertheless, the between-group comparison in the low-gamma-band does not survive correction for multiple comparisons, and its reliability should be further investigated with a more homogeneous and larger cohort.

EEG connectivity strength correlates with the semantic fluency in aphasia

By means of NBS, we found two network components in PWA negatively correlated with the semantic fluency score, respectively in the delta-band and the low-gamma-band. In the delta-band, the detected component comprised a cluster over the left temporo-parietal region, likely involving the language area, as well as frontal inter-hemispheric connectivity. Previous research reported a correlation between decrease of cortical activity and speech fluency, whilst there is still contrasting evidence from the network perspective37,68,69,70. A recent EEG study demonstrated an association between the delta-band activity and the perception of rhythmic properties of speech71. Hence, a possible interpretation of our result is that PWA with partial recovery retain an altered processing of articulated sounds, which is reflected in the semantic fluency; delta-band activity might be dependent on the level of such impairment and would emerge as a network cluster over the involved cortical regions as observed here. However, the reason why worse performance is associated with stronger connectivity remains speculative. As per the between-group analysis, we propose an enhanced cortical recruitment in the perilesional regions aimed at a compensatory mechanism. On the other hand, the correlation between low-gamma-band connectivity over a posterior-anterior pattern and semantic fluency is challenging to interpret. As discussed above, higher-frequency activity is reportedly associated with phonemic processing. However, it is unclear how phonemic processing would relate to semantic fluency, and further studies will be needed to investigate this aspect. In addition, the correlation tests do not survive correction for multiple comparisons, which further limits their statistical reliability.

An additional concern lies in the fact that semantic fluency is in part related to executive function rather than language processing72. Therefore, we may not have observed network alterations associated with language-related impairment, but rather a potential cognitive dysfunction due to the stroke. Recent studies aimed at assessing the behavioral correlates of affected semantic fluency73,74,75; the reported results, together with the topographical output of NBS (Fig. 2), suggest that the correlation between EEG-network metrics and semantic fluency is likely reflecting aphasia-related language processing alterations.

Interestingly, similar findings in chronic aphasia were not reported when amplitude correlation was used as connectivity metric37. In fact, a positive correlation between the alpha- and beta-band network connectivity and the fluency score was found, whilst no significant results emerged in other frequency bands. As discussed by the authors, this raises an issue regarding the generalization of results, since analysis outcomes may depend on the connectivity metrics76.

Surprisingly, no correlation emerged with other behavioral measures, i.e. phonological fluency, naming performance or the ScreeLing test score. As we speculated in the paragraph above, failure to capture functional features associated with these other altered behavioral measures might also be due to a reduced sensitivity of WPLI as connectivity metric. This aspect remains unresolved and should be investigated in future research.

Network geometrical properties are altered globally in the theta-band but only locally in the low-gamma-band

By means of graph theory analysis, we detected an overall rearrangement of the functional network architecture in individuals with chronic aphasia.

Within the theta-band, the patterns of alteration of the node strength in PWA, both globally and locally, concur with the altered connectivity patterns detected with the NBS. Most connections are re-routed in PWA and link most areas with the left language and auditory regions, resulting in a functional centralization of the network towards the lesioned areas, possibly for compensatory purposes. This interpretation is further supported by the increased integration of the network as reflected in the reduced characteristic path length and eccentricity, and by the reduced randomness of the node distribution of the network, as measured with lower small-worldness. This latter property already emerges at the acute stage, as shown in a previous resting-state graph theory study whose authors reported reduced small-worldness in the theta-band network only in patients with a left-hemispheric lesion35. Together with locally increased node strength, we believe this to reflect a redistribution of the network towards a more scale-free distribution while processing natural speech, as also explored with the targeted node-attack approach discussed below. Despite this rearrangement, the modular distribution is not significantly altered in PWA compared to HC in either frequency band. A recent longitudinal fMRI study investigated the variations of modularity across recovery; at the acute stage, the brain is less modular in aphasia compared to the healthy condition, whilst the functional network tends to a more normative modular distribution towards the chronic stage77. Our results concur with this latter mechanism, and further support the suitability of EEG as a method for inferring functional brain network properties in health and disease.

In contrast, no consistent global alteration of graph metrics was detected in the low-gamma-band in PWA, although local changes emerged which spatially resembled the findings in the theta-band network. As for the theta-band network changes, we propose that local variations in the low-gamma-band network architecture should be interpreted in the perspective of compensatory mechanisms by recruitment of residual intact areas. In fact, higher node strength and clustering coefficient in PWA as obtained with WPLI reflect localized stronger synchronization with other brain regions, i.e. coordinated cortical response to the experimental task. The lack of globally distributed changes within the gamma-band network should not be surprising. In fact, whilst theta-band oscillations reportedly have an integration role across long-range-distributed neuronal populations, gamma-band activity mostly reflects local intraregional neuronal activity, and a coupling between the two can be observed in specific processes, as described in Sect. 3.1 (refs. 78,79,80).

A scale-free theta-band network distribution is more prominent in PWA compared to HC

By means of an iterative node-attack, we further investigated whether the functional network in PWA features a small-world architecture. As observed in previous studies, the characteristic path length is expected to show an increasing trend throughout the iterative node removal, followed by a decrease after reaching a maximum32,81,82,83. Reportedly, the maximum peak occurs earlier in scale-free networks when compared with small-world or random networks82. Scale-free networks are less robust to targeted external attacks, due to their higher centralization over a smaller number of nodes, whose distribution follows a power law. There is still a debate on whether the human brain shows a scale-free architecture82,84. In fact, due to their localized centralization, scale-free networks are generally more robust to random damage, such as lesions or atrophy, than small-world or random networks. In our study, we found a tendency towards a negative shift of the trend of the characteristic path length in PWA with respect to the percentage of removed nodes. Nevertheless, this result was not statistically significant (P = 0.07). We believe that this outcome might be affected by a lack of homogeneity in our sample, and given the significant between-group difference in small-worldness that emerged from the graph-theory analysis, we would recommend that future studies further investigate the network geometry with a larger and more consistent PWA cohort through a targeted-node-attack approach. In fact, if replicated, our results would suggest that aphasia is associated with a more pronounced scale-free architecture when compared to controls. Together with higher localized node strength and reduced small-worldness, the outcome of the targeted node-attack would further suggest that with aphasia the brain tends to centralize the network towards the affected region in order to process natural speech, perhaps to recruit other functioning areas as a compensatory mechanism rather than as a maladaptive process.

Connectivity strength is the most discriminant variable between PWA and HC

To test the diagnostic potential of EEG network metrics, we implemented an SVM classifier. Network metrics which significantly differed between groups were included as features. Age was also included to account for age-related network alterations85,86, which might affect the accuracy of the classifier. Interestingly, despite our limited sample size, our EEG-network-based classifier resulted in an accuracy score of 78%, which is higher than what was obtained with a lesion-based classification of aphasia87, and slightly lower than an EEG-power-based classifier for general functional outcome of ischemic stroke20. Notably, the most discriminant variable was the connectivity strength in both frequency bands. This outcome agrees with previous research on other pathological conditions17,32,88, and because connectivity strength is more straightforward to compute than other network metrics, this result paves the way towards development of EEG connectivity-based biomarkers for aphasia. Further research will be needed to investigate whether EEG network metrics can predict the progress of the condition when measured at the (sub)acute stage.

Limitations

Despite its practicality and diagnostic suitability, the main limitation of our study lies in the use of EEG metrics. In fact, electrical signals recorded from the scalp remain challenging to interpret, as they correspond to the spatial linear combination of cortical activations which are normally affected by conducting properties of the surrounding tissues89. This holds even more so in the case of a stroke, where an effect of the lesion properties on the cortical activity should be expected90,91. Nevertheless, we argue that potential conductivity alteration due to the lesion did not alter the significance of the obtained results. Lesion-related alterations of EEG activity would be significant at the group level in the unlikely case that all PWA exhibited the same stroke characteristics. In contrast, given the high variability of aphasia92, conduction alteration introduces a confounding factor into the statistics. This may potentially hamper the emergence of significant differential network components between groups. Nevertheless, by using the NBS we detected a consistently altered network across PWA, hence we are confident that our statistical approach allowed us to capture the backbone functional networks associated with chronic aphasia. The effect of the lesion on the EEG metrics may be investigated by including an additional participant cohort of stroke patients without aphasia. However, this would pose a challenge in controlling for the variability of lesion characteristics across the cohort. A more effective strategy consists of source-level analysis, where an accurate head model is crucial in order to correctly infer the cortical activity and avoid lesion-related distortions in the measured signal93,94.

Aphasia is often associated with cognitive impairment, as also emerged from the demographic scores presented in the present study. Hence, the possibility exists that part of our findings may be due to differences in cognitive functions between groups rather than solely language impairment. A common approach in clinical studies consists of including a cognitive score as a nuisance covariate. However, according to statistical theory this is not the recommended strategy for this purpose95. Alternatively, a matched dataset of PWA without clinical impairment might be included at the recruitment stage. However, such an approach is not feasible with the available cohort due to the intrinsic characteristics of aphasia, which involve a level of alteration of cognitive functions. Nevertheless, we aimed at reducing the effect of stroke-related cognitive impairment in the statistical analysis by implementing an experimental paradigm based on natural speech listening.

One participant from the PWA group fell asleep for short periods of time during the first three blocks of the story. Given the short duration of the affected recording, we decided to preserve the full EEG of the participant for our analysis. An effect of sleep on functional network properties was reported in the literature, and might be expected to have affected our analysis. Namely, sleep is reportedly associated with a reorganization of the network towards a small-world topology96,97 and stronger connectivity in delta and alpha bands98. Nonetheless, we report lower small-worldness in PWA compared to HC and no between-group differences in alpha or delta bands, which is opposite to the pattern expected from sleep and therefore supports the robustness of our results against this potential bias.

From the methodological perspective, WPLI by definition does not provide any directionality information on the measured connections. Future work should use alternative metrics to infer whether aphasia is associated with direction-specific connectivity changes. Moreover, to control for volume conduction, WPLI rejects all zero-lag connections, potentially omitting any true zero-lag connectivity between scalp regions. Therefore, we believe that our results should be interpreted in comparison with, and as complementary to, other studies which opted for different connectivity measures, such as amplitude correlation37 or Granger causality99.

A limitation of the NBS approach lies in the choice of the primary statistical threshold, which remains arbitrary47. As a workaround, we chose several values for each test and reported the outcomes of the analysis for the tested ranges. This also poses a challenge in controlling the statistics for multiple frequency bands. In fact, the hypothesis of interest should be tested for a range of primary threshold values, and the outcome is likely to survive the correction for multiple comparisons only for certain threshold values. In our case, the results survived the correction for multiple tests for either all or none of the threshold values. However, this outcome remains difficult to interpret, also because it would be statistically inaccurate to assume that observations across frequency bands are independent of each other100.

Due to the number of statistical tests and inter-dependence of EEG network metrics, correction for multiple comparisons remains challenging. In fact, most methods in the literature involve conservative assumptions which networks and more generally EEG measures fail to meet due to their intrinsic properties. Nevertheless, we aimed at an exploratory analysis to provide an overall picture of disease-related alterations in a comprehensive number of metrics, rather than investigate a specific hypothesis.

Conclusion

We found an overall rearrangement of the functional network, possibly due to compensatory mechanisms associated with poorer recovery outcomes. Our interpretation is further supported by the outcome of the correlation analysis between clinical scores and functional connectivity, which was stronger for poorer performance. Among all network metrics, connectivity strength contributed the most to the discrimination between PWA and HC, supporting the suitability of EEG-network connectivity for the future development of comprehensive biomarkers for aphasia. Nevertheless, we must refrain from proposing the clinical applicability of EEG network-based metrics for the diagnosis of post-stroke aphasia, which should be further assessed. Future research should investigate the cortical sources of the abnormalities detected in the scalp-recorded signal, as well as the potential of EEG network-based metrics to predict the development of the condition from the acute stage.

Materials and methods

Participants

PWA were recruited from the cohort of participants at the stroke unit of the University Hospital Leuven, Belgium (UZ Leuven), within a larger project comprising other research works101. Screening was performed using a Dutch adaptation of the Language Screening Test (LAST)67,102,103, and only patients with a LAST score equal to or below the cut-off score (i.e., 14/15) and with a left-hemispheric or bilateral lesion were recruited. For each included patient, the experimental session took place at least six months after the stroke onset (median: 19 months; range: 6–369 months). Healthy control participants (HC) were recruited so as to match the average age of PWA. The participants had no history of psychiatric or neurodegenerative disorders and gave written informed consent before participation. The study was approved by the Medical Ethical Committee UZ/KU Leuven (S60007) and was performed in accordance with the relevant guidelines and regulations.

Our cohort comprised 22 HC (72 ± 7 years old, 15 males) and 27 PWA (73 ± 11 years old, 20 males). Demographic and behavioral information was assessed through standardized clinical tests as described in detail in the work by Kries and colleagues101, and is reported in Table 1. Briefly, the Oxford Cognitive Screen (OCS)104 was used to assess participants’ cognition, and language tests comprised the Nederlandse Benoem Test (NBT), i.e. Dutch naming test105, the subtests for semantic and phonological fluency of the Comprehensive Aphasia Test in Dutch (CAT-NL)48 and the ScreeLing test106. Although seven PWA did not score below the cut-off scores in the behavioral tests on the session day, they had a documented language impairment in the acute phase and were still attending speech-language therapy sessions at the time when the testing sessions took place101.

EEG experiment

Participants listened to a 25-min long fairy tale titled “De Wilde Zwanen” (“The Wild Swans”) by Hans Christian Andersen, narrated by a female Flemish-native speaker. To ensure participants’ attention to the stimulus, a break was introduced every five minutes and questions about the story up to that point were asked. The speech stimulus was presented binaurally through ER-3A insert phones (Etymotic Research Inc, IL, USA) using the software APEX107. For each participant, stimulus intensity was set at 60 dB SPL plus half of the pure tone average (PTA) of the individual audiometric thresholds at 250, 500 and 1000 Hz, to ensure audibility. All sessions took place in a soundproof room with a Faraday cage at the Department of Neurosciences, KU Leuven. EEG signals were obtained while participants listened to the natural speech using a Biosemi ActiveTwo (Amsterdam, Netherlands) high-density cap with 64 Ag/AgCl electrodes distributed according to the 10–20 system108, as shown in Fig. 1. Recordings were sampled at 8192 Hz.

EEG data pre-processing

Pre-processing of the EEG data was automated and was implemented through in-house routines and the Automagic toolbox109 v.2.6 on MATLAB v.9.11 (The MathWorks Inc., Natick, MA, USA, 2021). Signals were high-pass filtered with a second-order zero-delay Butterworth filter with cut-off frequency at 0.1 Hz, referenced to Cz, and line noise (50 Hz) was removed using the ZapLine method110. The recordings were subsampled to 512 Hz, and channels with correlation to their random sample consensus lower than 80% were deemed bad and removed through the EEGLAB plugin clean_rawdata() (http://sccn.ucsd.edu/wiki/Plugin_list_process) (average number of removed channels: 2 ± 2). Artifactual segments of data were also replaced with a temporal interpolation using the Artifact Subspace Reconstruction (ASR)111. The resulting data underwent an independent component analysis (ICA), and classification of the components was automatically performed through the EEGLAB plugin ICLabel112, where only components classified as “brain” or “other” (i.e. mixed components) with probability higher than 50% were preserved (average number of removed components: 26 ± 7). The obtained data were projected back to the channel space, and visually inspected as a further quality-check. Before any further analysis, all signals were average-referenced.

Weighted phase lag index

Functional connectivity between scalp EEG signals was measured with the weighted phase lag index (WPLI)31, mathematically defined as:

$$\mathrm{WPLI}=\frac{\left|\langle \left|\mathrm{Im}(X)\right|\,\mathrm{sign}\left[\mathrm{Im}(X)\right]\rangle \right|}{\langle \left|\mathrm{Im}(X)\right|\rangle } \tag{1}$$

where X is the cross-spectrum between the signals, Im(X) is its imaginary part, ⟨·⟩ is the expected value operator, and sign is the signum function. WPLI values range between 0 and 1 and are insensitive to volume conduction113, i.e., they tend to zero when signals are synchronized with zero or near-zero lag, as commonly occurs between volume-conducted signals. WPLI was obtained as implemented in the Fieldtrip toolbox114; EEG signals were first split into three-second segments with 50% overlap, then transformed to the frequency domain using a windowed Fourier transform (Hanning taper, 0.5 Hz frequency step), and WPLI was computed and averaged across the frequency bins of each band. Connectivity was measured within the delta- (1–4 Hz), theta- (4.5–7 Hz), alpha- (7.5–14.5 Hz), beta- (15–30 Hz), low-gamma- (30.5–49 Hz) and mid-gamma-band (50–90 Hz).
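As a minimal illustration of Eq. (1), the following Python sketch computes WPLI for one channel pair at a single frequency bin, given one cross-spectrum value per data segment. It is not the Fieldtrip implementation used in the study: the function name `wpli`, the toy inputs, and the convention of returning zero when all imaginary parts vanish are assumptions made for illustration only.

```python
def wpli(cross_spectra):
    """Weighted phase lag index from per-segment cross-spectrum values X.

    cross_spectra: non-empty list of complex numbers, one per data segment,
    all taken at the same frequency bin for the same channel pair.
    """
    imag = [x.imag for x in cross_spectra]
    # Numerator of Eq. (1): |<|Im(X)| sign(Im(X))>|, which equals |<Im(X)>|.
    numerator = abs(sum(imag)) / len(imag)
    # Denominator of Eq. (1): <|Im(X)|>.
    denominator = sum(abs(v) for v in imag) / len(imag)
    # If every Im(X) is zero the ratio is undefined; returning 0.0 is a
    # convention chosen for this sketch, not taken from the source.
    return numerator / denominator if denominator > 0 else 0.0


# Consistent phase lead across segments: Im(X) always has the same sign.
consistent = [complex(0.2, 1.0), complex(-0.1, 0.5), complex(0.3, 2.0)]
# Leads and lags cancel out: Im(X) is symmetric around zero.
cancelling = [complex(1.0, 1.0), complex(1.0, -1.0)]
# Zero-lag coupling (e.g. volume conduction): Im(X) is zero.
zero_lag = [complex(1.0, 0.0), complex(2.0, 0.0)]

print(wpli(consistent), wpli(cancelling), wpli(zero_lag))
```

The three toy inputs show the behavior described above: a consistent non-zero lag yields the maximum value, whereas cancelling or zero-lag interactions yield zero regardless of the size of the real part.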

Graph theory

Graph theory analysis was performed using functions implemented in the Brain Connectivity Toolbox (BCT) for MATLAB115. To compute the network measures, we applied proportional thresholding on the graphs and set the weakest connections to zero. Threshold values (PT%), or network densities, spanned between 3 and 40% of the total number of edges, and the connectivity weights were preserved in the thresholded graphs in order to reduce the dependence of graph measures on the network density32,116,117,118,119,120. To prevent any group bias associated with functional connectivity strength, all connectivity matrices were normalized by their maximum weight before computing the network metrics121. The computed weighted measures comprised: node strength, i.e. the total weight of the connections to the node; clustering coefficient, i.e. the extent to which a node’s neighbors are connected to each other; eccentricity, i.e. the longest of the shortest paths between the node and all other nodes; characteristic path length, i.e. the average shortest path between all pairs of nodes; small-worldness, i.e. the ratio between normalized clustering coefficient and normalized characteristic path length; modularity, i.e. the extent to which a network is segregated. Graph measures were computed at each PT%. Further details on the graph theory metrics are reported in Supplementary materials.
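The thresholding and normalization steps can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions rather than the BCT code used in the study: the function names `proportional_threshold` and `node_strength`, the rounding of the number of retained edges, and the toy matrix are all hypothetical.

```python
def proportional_threshold(weights, density):
    """Keep the strongest `density` fraction of edges of a symmetric matrix.

    weights: square list of lists with a zero diagonal and at least one
             positive weight.
    density: fraction of edges to keep (0.03 to 0.40 in the text).
    """
    n = len(weights)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Normalize by the maximum weight to limit group bias in overall strength.
    w_max = max(weights[i][j] for i, j in pairs)
    ranked = sorted(((weights[i][j] / w_max, i, j) for i, j in pairs),
                    reverse=True)
    n_keep = round(density * len(pairs))  # rounding rule is an assumption
    out = [[0.0] * n for _ in range(n)]
    for w, i, j in ranked[:n_keep]:
        out[i][j] = out[j][i] = w  # weights are preserved, not binarized
    return out


def node_strength(weights):
    """Total weight of the connections of each node."""
    return [sum(row) for row in weights]


toy = [
    [0.0, 0.8, 0.2, 0.4],
    [0.8, 0.0, 0.6, 0.1],
    [0.2, 0.6, 0.0, 0.3],
    [0.4, 0.1, 0.3, 0.0],
]
thresholded = proportional_threshold(toy, density=0.5)
strengths = node_strength(thresholded)
print(strengths)
```

With six possible edges and a density of 50%, only the three strongest connections survive, and node strength is then computed on the surviving normalized weights.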

Network distribution

To investigate the type of network architecture, we used a targeted node-attack approach as described in previous works32,81,82,83. For each unthresholded graph, we iteratively removed the nodes with the highest strength, computed the characteristic path length, and obtained its trend with respect to the percentage of removed nodes. An earlier peak is associated with a more scale-free-like architecture, as compared to a random or small-world one. In order to assess the type of network distribution, we extracted for each subject the percentage of removed nodes at which a peak in the characteristic path length occurred.
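The targeted node-attack procedure can be sketched as follows. Several details are not specified in the text and are assumptions of this sketch: edge length is taken as the inverse of the connection weight, node strength is recomputed on the surviving subgraph after each removal, unreachable node pairs are ignored in the average, and the earliest maximum is reported as the peak. All function names are hypothetical.

```python
import math


def edge_length(weights, a, b):
    """Length of edge a-b, taken as 1 / weight (assumption of this sketch)."""
    if a == b:
        return 0.0
    return 1.0 / weights[a][b] if weights[a][b] > 0 else math.inf


def characteristic_path_length(weights, nodes):
    """Average shortest weighted path among `nodes` (Floyd-Warshall)."""
    dist = {(a, b): edge_length(weights, a, b) for a in nodes for b in nodes}
    for k in nodes:
        for a in nodes:
            for b in nodes:
                via_k = dist[a, k] + dist[k, b]
                if via_k < dist[a, b]:
                    dist[a, b] = via_k
    finite = [d for (a, b), d in dist.items() if a != b and d < math.inf]
    return sum(finite) / len(finite) if finite else math.inf


def targeted_attack(weights):
    """Return (fraction of removed nodes, characteristic path length) pairs."""
    nodes = list(range(len(weights)))
    curve = [(0.0, characteristic_path_length(weights, nodes))]
    while len(nodes) > 2:
        # Remove the node with the highest strength in the surviving graph.
        strongest = max(nodes, key=lambda a: sum(weights[a][b] for b in nodes))
        nodes.remove(strongest)
        removed = 1 - len(nodes) / len(weights)
        curve.append((removed, characteristic_path_length(weights, nodes)))
    return curve


def peak_location(curve):
    """Fraction of removed nodes at the earliest maximum of the path length."""
    finite = [point for point in curve if point[1] < math.inf]
    return max(finite, key=lambda point: point[1])[0]


# Toy hub-and-spoke graph: node 0 is strongly connected to all other nodes.
n = 5
toy = [[0.0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        toy[a][b] = toy[b][a] = 1.0 if a == 0 else 0.1
curve = targeted_attack(toy)
peak = peak_location(curve)
print(curve, peak)
```

In this toy graph the path length jumps as soon as the single hub is removed, so the peak occurs after the first removal, which is the early-peak behavior associated above with a scale-free-like architecture.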

Statistical analysis

We performed an exploratory investigation of altered EEG functional network metrics associated with aphasia and natural speech perception.

Differential network patterns

EEG network topographies were compared between the PWA and HC groups using the Network-Based Statistics (NBS) toolbox47. A test-statistic threshold (tth) was chosen, graph edges were F-tested against the null hypothesis of equal average connectivity between groups, and only connections with supra-threshold test statistics were preserved. Network components were then built as clusters across the surviving edges and their sizes were measured. Data were then permuted between groups 5000 times, and at each permutation the same steps were repeated and the size of the largest component was stored. The final output of the NBS is a family-wise error rate-corrected P-value for each network component, obtained as the ratio between the number of permutations in which the largest component was at least as large as the observed component and the total number of permutations. We performed the NBS within a range of tth values47 between one and thirty and visualized the corresponding graphs when the outcomes were significant (P < 0.05). To infer the direction of significance, we visualized the distribution within the two groups of the average strength of the NBS connections (WPLINBS). The difference between groups was tested at all frequency bands.
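The permutation logic of the NBS can be illustrated with a short, dependency-free Python sketch. It is not the NBS toolbox: the function names, the use of the number of edges as component size, the handling of zero-variance edges, and the toy data are assumptions made for illustration. For two groups, the one-way F statistic equals the squared pooled-variance t statistic, which is what `f_statistic` computes.

```python
import random


def f_statistic(a, b):
    """One-way F statistic for two groups (squared pooled-variance t)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    if pooled_var == 0:
        return 0.0  # degenerate edge, ignored in this sketch
    return (ma - mb) ** 2 / (pooled_var * (1 / na + 1 / nb))


def largest_component(edge_values, labels, threshold):
    """Number of edges in the largest supra-threshold connected component."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for (i, j), values in edge_values.items():
        a = [v for v, g in zip(values, labels) if g]
        b = [v for v, g in zip(values, labels) if not g]
        if f_statistic(a, b) > threshold:
            parent[find(i)] = find(j)  # merge the two clusters
            kept.append((i, j))
    sizes = {}
    for i, _ in kept:
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


def nbs_p_value(edge_values, labels, threshold, n_perm=5000, seed=0):
    """Observed largest component size and its permutation-based P-value."""
    rng = random.Random(seed)
    observed = largest_component(edge_values, labels, threshold)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign subjects to groups at random
        if largest_component(edge_values, shuffled, threshold) >= observed:
            hits += 1
    return observed, hits / n_perm


# Toy data: 6 nodes, 8 subjects per group, three connected edges that differ.
edge_values = {}
for i in range(6):
    for j in range(i + 1, 6):
        shift = 1.0 if (i, j) in {(0, 1), (1, 2), (2, 3)} else 0.0
        group_a = [0.5 + shift + 0.01 * s for s in range(8)]
        group_b = [0.5 + 0.01 * s for s in range(8)]
        edge_values[i, j] = group_a + group_b
labels = [True] * 8 + [False] * 8
size, p_value = nbs_p_value(edge_values, labels, threshold=30, n_perm=200)
print(size, p_value)
```

Because the P-value is derived from the distribution of the largest component across permutations, it applies to the component as a whole rather than to any individual edge, which is what provides the family-wise error control described above.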

Aphasia severity

We also used the NBS to test whether any EEG subnetwork was associated with aphasia severity. For this purpose, the behavioral scores were included as covariates of interest and an F-test was performed to test whether any network component in the PWA group correlated with them. If a correlation was found, the direction of significance was assessed by visualizing the distribution of the clinical score with respect to WPLINBS.

Graph theory

Graph measures were obtained within the frequency ranges for which the NBS analysis yielded a significant outcome and were compared between groups at each network density (PT%). Nodal measures, i.e., node strength and clustering coefficient, were locally tested for each node (64 electrodes, uncorrected for multiple tests) as well as globally on average across nodes with Mann–Whitney U-tests (P < 0.05).

Network distribution

To compare the distribution of connections within the network, we extracted the percentage of nodes at which a peak in the trend of the average characteristic path length occurred and compared the values between HC and PWA with a Mann–Whitney U-test (P < 0.05).

Support vector machine

We investigated the diagnostic potential of WPLINBS and graph measures within the frequency bands which yielded any significant outcome from the statistical analysis. We implemented a linear support vector machine (SVM) classifier using the Scikit-Learn (v. 0.24.2) library in Python122 to test the accuracy of classification into PWA and HC groups. We implemented a leave-one-out cross-validation, and at each training step the regularization parameter C was chosen within a range spanning from 10⁻⁶ to 10⁶ by means of a fivefold cross-validation within the training set. Given the known effect of aging on brain functional connectivity85,86, age was included as a feature in the classifier. For each left-out subject, the WPLINBS and graph metrics were obtained from the rest of the cohort and were used to train the classifier. We then obtained the mean across the tested subjects of the ranking of the weights assigned to the variables, the accuracy, the optimal working point as the best tradeoff between the rates of true and false detected positives123,124, and the area under the receiver operating characteristic (AUROC) curve.
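To make the last two performance measures concrete, the following dependency-free Python sketch derives the AUROC and a working point from a set of classifier decision values. It is not the Scikit-Learn pipeline used in the study. In particular, the criterion used here for the working point (maximizing the difference between true and false positive rates, i.e. Youden's J) is an assumption, since the text does not state which tradeoff criterion was applied; the function name `roc_summary` and the toy scores are likewise hypothetical.

```python
def roc_summary(scores, labels):
    """AUROC and a Youden-type working point from classifier scores.

    scores: decision values (higher means more likely positive).
    labels: True for positives (e.g. PWA), False for negatives (e.g. HC).
    Returns (auroc, threshold, true positive rate, false positive rate).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    # AUROC as the probability that a positive outscores a negative,
    # counting ties as one half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    auroc = wins / (len(pos) * len(neg))
    # Scan every observed score as a candidate threshold and keep the one
    # maximizing TPR - FPR (assumed criterion, see lead-in).
    best = None
    for thr in sorted(set(scores)):
        tpr = sum(p >= thr for p in pos) / len(pos)
        fpr = sum(q >= thr for q in neg) / len(neg)
        if best is None or tpr - fpr > best[0]:
            best = (tpr - fpr, thr, tpr, fpr)
    _, thr, tpr, fpr = best
    return auroc, thr, tpr, fpr


# Toy decision values for four positives followed by four negatives.
scores = [0.9, 0.8, 0.4, 0.35, 0.6, 0.3, 0.2, 0.1]
labels = [True, True, True, True, False, False, False, False]
auroc, threshold, tpr, fpr = roc_summary(scores, labels)
print(auroc, threshold, tpr, fpr)
```

In a leave-one-out setting such as the one described above, the decision values passed to a function of this kind would be those obtained for each left-out subject from the classifier trained on the remaining cohort.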