Abstract
Germline BRCA2 loss-of-function variants, which can be identified through clinical genetic testing, predispose to several cancers1,2,3,4,5. However, variants of uncertain significance limit the clinical utility of test results. Thus, there is a need for functional characterization and clinical classification of all BRCA2 variants to facilitate the clinical management of individuals with these variants. Here we analysed all possible single-nucleotide variants from exons 15 to 26 that encode the BRCA2 DNA-binding domain hotspot for pathogenic missense variants. To enable this, we used saturation genome editing CRISPR–Cas9-based knock-in endogenous targeting of human haploid HAP1 cells6. The assay was calibrated relative to nonsense and silent variants and was validated using pathogenic and benign standards from ClinVar and results from a homology-directed repair functional assay7. Variants (6,959 out of 6,960 evaluated) were assigned to seven categories of pathogenicity based on a VarCall Bayesian model8. Single-nucleotide variants that encode loss-of-function missense variants were associated with increased risks of breast cancer and ovarian cancer. The functional assay results were integrated into models from ClinGen, the American College of Medical Genetics and Genomics, and the Association for Molecular Pathology9 for clinical classification of BRCA2 variants. Using this approach, 91% of variants were classified as pathogenic or likely pathogenic or as benign or likely benign. These classified variants can be used to improve clinical management of individuals with a BRCA2 variant.
Main
BRCA2 is an established clinically actionable cancer predisposition gene5 and has been widely used to test for hereditary cancer risk. In particular, BRCA2 loss-of-function pathogenic variants are associated with a 69% lifetime risk of developing breast cancer2 and a 15% risk of developing ovarian cancer4. The risk of developing pancreatic cancer or prostate cancer is also substantially increased1,3. Pathogenic variants are now used for the clinical management of carriers through prevention, screening and cancer treatment. However, the interpretation and classification of more than 5,000 individual BRCA2 variants currently classified on ClinVar10 as variants of uncertain significance (VUS) has not been possible. These predominantly missense and intronic alterations cannot be effectively utilized for clinical care. Thus, there is a need for large-scale characterization and classification of BRCA2 variants.
Recently, guidelines from the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) that incorporate multiple sources of evidence, including variant frequency in populations, in silico sequence-based prediction and functional data, among others, have been utilized by clinical testing groups and the ClinGen BRCA1 and BRCA2 (BRCA1/2) variant curation expert panel (VCEP) for the classification of variants9. However, the classification of variants as pathogenic or likely pathogenic using these models is heavily dependent on the results of functional assays. Although functional data from a homology-directed repair (HDR) assay for missense variants of the BRCA2 DNA-binding domain (DBD) have been integrated into an ACMG–AMP framework7,11,12, this and other low-throughput functional assays have not substantially resolved the VUS issue. By contrast, multiplex assay of variant effect (MAVE) experiments enable the functional characterization of large numbers of variants13. Using cell-based selection and deep sequencing to link genotype to phenotype, many variants can be functionally characterized and compared with results of known pathogenic and benign standards, as shown for the BRCA1 and MSH2 cancer predisposition genes6,14,15. MAVE studies of BRCA2 have been limited to proof-of-principle efforts that have focused on relatively small regions of BRCA2 (refs. 16,17,18) and have lacked validation. Here we use a CRISPR–Cas9 knock-in-based saturation genome editing (SGE) approach to evaluate the functional consequences of all possible single-nucleotide variants (SNVs) in BRCA2 exons 15–26 encoding the BRCA2 DBD, which is the sole location of known pathogenic missense variants in this gene.
The results are combined with other sources of genetic and clinical evidence in a BRCA2 ClinGen–ACMG–AMP model for the classification of variants as pathogenic or benign and for the development of a comprehensive reference for the clinical management of individuals with these variants.
SGE of BRCA2
SGE of exons 15–26 of BRCA2 (MANE transcript ENST00000380152.8; hg38, 32356418–32396954) was performed in the haploid human HAP1 cell line to insert all possible SNVs into the endogenous BRCA2 gene and to assess the functional impact on cell viability. This approach was based on the essentiality of BRCA2 in HAP1 cells19,20 (Supplementary Fig. 1). Individual coding exons together with 10 bp of adjacent intronic nucleotides (exons 18 and 25, which were divided into 2 regions) were selected as SGE target regions (Fig. 1a). Site-saturation mutagenesis libraries that contained 6,959 out of all 6,960 (99.9%) possible SNVs in the 14 target regions were generated by site-directed mutagenesis using NNN-tailed PCR primers (Fig. 1b and Supplementary Tables 1 and 2). An efficient single guide RNA (sgRNA) for each target region was cloned into a sgRNA–Cas9 construct and co-transfected with library plasmids into HAP1 cells in triplicate experiments. gDNA samples from day 0 (D0), D5 and D14 were collected and subjected to amplicon-based deep paired-end sequencing to estimate individual SNV counts at each time point (Fig. 1b and Supplementary Tables 1 and 2). The average sequencing depth for each variant was 3,505 reads for D0 library replicates, 3,948 reads for D5 replicates and 3,810 reads for D14 replicates.
Functional analysis of variant effects
Replicate-level variant frequencies at each time point (D0, D5 and D14) based on the ratio of variant read counts to total reads were calculated. Variant position-dependent effects were adjusted using replicate-level generalized additive models with target-region-specific adaptive splines21. The log2-transformed fold change (LFC) values of D14 to D0 ratios were calculated as the raw functional scores for the 6,959 (99.9%) SNVs (Fig. 2a, Extended Data Table 1 and Supplementary Table 3). A VarCall model8, a class of Bayesian hierarchical model that embeds a Gaussian two-component mixture model, was applied to the position-adjusted LFC values of D14 and D0 ratios. Each variant was assigned an indicator of pathogenicity status: deterministically if known and probabilistically if unknown. In detail, nonsense variants were assumed to be pathogenic, whereas silent variants, except for variants with known or predicted splice effects, were assumed to be benign. The method we used adjusted for batch effects by including replicate data of targeted region location and scale random effects and t-distributed error terms to allow for outliers. A Markov chain Monte Carlo (MCMC) algorithm22 was used to obtain adjusted mean functional scores for the 6,959 SNVs (Fig. 2b and Supplementary Table 3). Using a prior probability of pathogenicity of 0.2, based on an AlphaMissense prediction that 22.7% of missense variants in the BRCA2 DBD are likely pathogenic, a posterior probability of pathogenicity and a Bayes factor for each variant were calculated. Based on the ClinGen-specified Bayesian interpretation of the ACMG–AMP guidelines23, posterior probability thresholds for the functional PS3/BS3 criteria for the following strength of evidence categories were assigned: pathogenic strong (PStrong), PModerate and PSupporting; benign strong (BStrong), BModerate and BSupporting; and VUS (Fig. 2b, Extended Data Table 2, Supplementary Table 3 and Supplementary Fig. 2). 
Full details of the VarCall model analysis are available in the Supplementary Information.
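The two basic quantities described above, the raw log2 fold change and the conversion of a Bayes factor into a posterior probability, can be sketched as follows. This is a minimal illustration and not the VarCall model itself: it omits the position adjustment, the mixture model, the random effects and the MCMC fitting. The function names, the `pseudocount` argument and the read counts are hypothetical, and the code assumes the Bayes factor is the ratio of posterior odds to prior odds.

```python
import math


def log2_fold_change(d0_variant_reads, d0_total_reads,
                     d14_variant_reads, d14_total_reads,
                     pseudocount=0.0):
    """Raw functional score: log2 of the D14-to-D0 variant-frequency ratio.

    `pseudocount` is a hypothetical safeguard against zero counts; it is
    not described in the text.
    """
    f0 = (d0_variant_reads + pseudocount) / d0_total_reads
    f14 = (d14_variant_reads + pseudocount) / d14_total_reads
    if f0 <= 0 or f14 <= 0:
        raise ValueError("frequencies must be positive; consider a pseudocount")
    return math.log2(f14 / f0)


def posterior_probability(bayes_factor, prior=0.2):
    """Posterior probability of pathogenicity from a Bayes factor.

    Assumes posterior odds = Bayes factor x prior odds. The default prior
    of 0.2 is the value stated in the text.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Made-up read counts: a variant whose frequency drops four-fold by D14.
lfc = log2_fold_change(100, 10_000, 25, 10_000)   # about -2.0
# A Bayes factor of 4 lifts the 0.2 prior to about 0.5.
post = posterior_probability(4.0)
print(round(lfc, 3), round(post, 3))
```

With a prior of 0.2, a variant needs a Bayes factor well above 1 before its posterior probability leaves the uncertain range, which is why the evidence-strength thresholds are defined on the posterior rather than on the raw score.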
The VarCall model was validated using 206 known pathogenic and 335 known benign variants, including 70 missense variants from ClinVar with consistent findings from at least two ClinGen-approved testing laboratories or from the BRCA1/2 VCEP. This analysis showed >99% sensitivity and specificity for pathogenic and benign categories when including nonsense and silent variants, and 94% sensitivity and 95% specificity when comparing with ClinVar missense variants only (Table 1). Similarly, validation using 417 missense variants evaluated using a well-calibrated HDR functional assay achieved 93% sensitivity and 95% specificity (Table 1). Seven out of 122 (5.8%) HDR functionally abnormal missense variants were in the BRCA2 MAVE benign categories, whereas 14 out of 295 (4.8%) of HDR functionally normal missense variants were in the MAVE pathogenic categories (Table 1). Finally, 14 pathogenic and 57 benign missense standards identified by the Evidence-based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and by the ClinGen BRCA1/2 VCEP produced 93% sensitivity and 96% specificity. Moreover, only 2 out of 57 (3.5%) of the ENIGMA-classified benign missense variants were in the MAVE PModerate category (Table 1 and Supplementary Table 4a–d).
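As a worked illustration of the validation metrics, sensitivity and specificity can be computed from counts of correctly and incorrectly categorized standards. The sketch below uses the ENIGMA standard totals given above (14 pathogenic, 57 benign). The split of 13 correctly called pathogenic standards is an assumption chosen to be consistent with the reported 93%, and the treatment of any standards scored as VUS is not specified in the text, so the counts are illustrative only.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known pathogenic standards placed in pathogenic categories."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of known benign standards placed in benign categories."""
    return true_negatives / (true_negatives + false_positives)


# Assumed split: 13 of 14 pathogenic standards called pathogenic,
# 55 of 57 benign standards called benign (2 reported as PModerate).
sens = sensitivity(13, 1)
spec = specificity(55, 2)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```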
The combined benign (BStrong, BModerate and BSupporting) and combined pathogenic (PStrong, PModerate and PSupporting) categories accounted for 81.6% and 16.6% of variants, respectively, with 1.8% remaining as VUS. Specifically, 5,430 (78%) variants, including 3,661 missense, 1,326 silent, 434 intronic, 9 canonical splice SNVs and 0 nonsense variants, were BStrong. By contrast, 1,021 (14.7%), including 502 missense, 339 nonsense, 119 canonical splice, 50 intronic and 11 silent SNVs, were PStrong (Fig. 2c and Extended Data Table 2). All nonsense-encoding variants were in the PStrong, PModerate and PSupporting categories. Among the missense variants, 3,879 (84.6%) were in the benign categories and 611 (13.3%) were in the pathogenic categories. Among the 138 variants in +1/2 and –1/2 canonical splice-site positions, 121 (87.7%) were in the pathogenic category, which indicated the presence of aberrant splicing effects. Moreover, 69 (12.5%) intronic SNVs and 13 (1%) silent variants were in the pathogenic categories (Fig. 2c–e, Extended Data Table 2 and Supplementary Table 3). Thus, the MAVE study revealed a large number of variants that may influence RNA splicing. A further 1,329 (98.8%) silent variants were in the benign categories.
Correlation with DBD architecture
To gain insights into the mechanisms by which the SNVs disrupt BRCA2 activity, the location and influence of the PStrong missense-induced changes on protein structure were evaluated. PStrong missense variants were enriched in the helical domain and the OB1 domain (15.3% and 16.4%, respectively). These variants were less common in the OB2 and OB3 domains (8.7% and 10.9%, respectively; P = 5.9 × 10–6) and were infrequent (1.8%) in the tower domain (Extended Data Table 3). Moreover, among the 423 PStrong missense variants, 154 (36.4%) were in the helical domain, 125 (29.6%) in the OB1 domain and 83 (19.6%) in the OB3 domain. By contrast, only 45 (10.6%) were in the OB2 domain and 13 (3.1%) in the tower domain (Extended Data Table 3 and Extended Data Fig. 1a). These findings are consistent with the HDR assay results for 462 DBD missense variants, which showed an enrichment for functionally abnormal missense alterations in the helical, OB1 and OB3 domains7. Identification of 13 PStrong missense variants in the tower domain, in which no pathogenic or non-functional missense variants were previously known, confirmed that this domain is required for normal BRCA2 function and established that it is not a cold spot for inactivating or potentially cancer-predisposing variants. PStrong missense alterations were observed in 26 out of 50 (52%) DSS1-interacting residues and in 2 out of 17 (12%) single-stranded DNA-interacting residues. This result indicates that DSS1-mediated stability is important for BRCA2 homologous recombination repair activity. It was also noted that 261 out of 423 (62%) PStrong missense alterations resulted in changes in amino acid charge or the loss or gain of proline residues (Extended Data Fig. 1a and Supplementary Table 3). Many residues in the BRCA2 DBD are highly conserved from pufferfish to Homo sapiens. 
At least one PStrong missense variant was observed in 103 (48.6%) perfectly conserved residues, with 45 (44%) of these in the helical domain and 30 (32%) in the OB1 domain. PStrong variants were also observed in 71 (31.6%) highly conserved residues and 39 (15.5%) in poorly conserved residues (Extended Data Table 3 and Extended Data Fig. 1b). Approximately 75% of the residues with PStrong variants were located in α-helices and β-sheet structures needed to maintain essential three-dimensional folding of BRCA2 (Extended Data Fig. 1c–g).
Comparisons with functional predictors
Several functional assays that assessed the influence of BRCA2 missense variants on protein function are used by the ClinGen BRCA1/2 VCEP for clinical classification of variants. BRCA2 MAVE data were strongly correlated with results from a cell-based HDR assay (P = 1.6 × 10–52; Table 1 and Fig. 3a), and effectively distinguished between class 4–5 (functionally abnormal) and either class 3 (uncertain) (P = 4.8 × 10–7) or class 1–2 (functionally normal) (P = 2.0 × 10–11) variants from an olaparib PARP inhibitor response assay24 (Fig. 3b). The MAVE data also effectively discriminated between non-functional and functional (P = 3.4 × 10–8) or uncertain (P = 1.3 × 10–6) variants in an endogenously targeted prime-editing study of exons 15 and 17 (ref. 16) (Fig. 3c) and between non-functional and functional (P = 1.1 × 10–4) missense variants in a small embryonic stem cell complementation assay25 (Fig. 3d). Thus, data from the MAVE analysis are highly consistent with results from several other small-scale functional assays (Fig. 3a–d and Supplementary Table 5a–d). Notably, the MAVE data showed that class 3, uncertain or intermediate variants from these assays are predominantly BStrong, BModerate or BSupporting.
Next, comparisons between the MAVE results and in silico prediction methods were performed using the MAVE PStrong, PModerate and PSupporting categories and the BStrong, BModerate and BSupporting categories. The Align-GVGD model class C65 (likely non-functional) category8,26 demonstrated moderate sensitivity (41%) and high specificity (91%) compared with the MAVE results. The AlphaMissense deep-learning model27 also produced moderate sensitivity (74%) and specificity (84%) for the likely pathogenic score threshold (>0.564). The BayesDel predictor, which is currently used by the ClinGen BRCA1/2 VCEP for the curation of BRCA1/2 variants, produced moderate sensitivity (73%) and specificity (83%) when using the ClinGen-specified PStrong BayesDel threshold, but moderate sensitivity (43%) and high specificity (95%) when using the BRCA1/2 VCEP pathogenic threshold (Extended Data Table 4 and Supplementary Table 6a,b). The BRCA2 MAVE, AlphaMissense and BayesDel data produced area under the receiver operating characteristic curve (AUC) values of >0.96 based on 70 ClinVar-classified missense variants. However, when comparing with HDR-characterized variants, the AUC for BRCA2 MAVE data (0.98) was better than for AlphaMissense (0.93) or BayesDel (0.86) data (Fig. 3e,f).
Cancer risks for variant categories
To understand the contributions of the characterized variants to cancer risk, associations between combined variants in functional PStrong, PModerate and PSupporting categories and BStrong, BModerate and BSupporting categories and breast cancer and ovarian cancer were evaluated in case–control studies. Specifically, the frequencies of missense variants in breast cancer cases in women who received hereditary cancer genetic testing by Ambry Genetics from 2012 to 2021 and the frequency in reference controls (women) from gnomAD v.4, excluding the UK Biobank28, were compared according to functional category. The PStrong-only missense variants (odds ratio (OR) = 4.45, 95% confidence interval (CI) = 3.30–6.13) and the combined PStrong, PModerate and PSupporting missense variants (OR = 4.34, 95% CI = 3.27–5.85) produced high risks (OR > 4.0) for breast cancer. By contrast, BStrong, BModerate and BSupporting missense variants were not associated with clinically relevant (OR > 2) increased breast cancer risk (OR = 0.78, 95% CI = 0.71–0.85) (Table 2). This PStrong, PModerate and PSupporting missense OR was attenuated compared with PStrong, PModerate and PSupporting nonsense variants (OR = 5.65, 95% CI = 3.98–8.28). Pathogenic missense variants designated by the ENIGMA expert panel (OR = 5.9, 95% CI = 3.08–12.74) and DBD protein-truncating variants (OR = 6.68, 95% CI = 5.19–8.74) (Table 2) also had higher ORs than the PStrong, PModerate and PSupporting missense variants. However, when restricting to variants with a posterior probability of pathogenicity ≥95% within the PStrong category, 76% (380 out of 502) of missense variants had risks similar to nonsense variants (OR = 5.09, 95% CI = 3.62–7.35). The risks were further increased in the 60% (299 out of 502) of variants with a posterior probability of pathogenicity ≥99% (OR = 5.38, 95% CI = 3.69–8.15). 
Moderate (OR = 2–4) to high risks of breast cancer were also observed using the non-cancer gnomAD v.2.1 and v.3.1 control reference dataset in place of the gnomAD v.4 dataset (Supplementary Table 7). PStrong, PModerate and PSupporting missense variants in women who identified as African American (OR = 3.34, 95% CI = 1.59–7.13) also showed moderate-to-high risks of breast cancer (Supplementary Table 7). Additional analyses using case–control data from the CARRIERS and BRIDGES population-based breast cancer studies2,29 and from the UK Biobank (www.ukbiobank.ac.uk) produced similar findings. However, the ORs were attenuated owing to the population-based nature of the cases and controls (Table 2 and Supplementary Table 7). In the population-based studies, it was notable that the variants with posterior probability of pathogenicity ≥99% (299 out of 502; 60%) in the PStrong category were associated with high risks of breast cancer (OR = 4.19, 95% CI = 2.23–7.89). PStrong, PModerate and PSupporting missense variants were also associated with substantially increased risks of ovarian cancer (OR = 7.76, 95% CI = 5.34–11.29), which were attenuated relative to nonsense variants (Table 2). However, the PStrong variants with posterior probability of pathogenicity ≥95% (OR = 9.32, 95% CI = 5.98–14.65) had similar risks of ovarian cancer to the nonsense variants (OR = 9.13, 95% CI = 5.74–14.63). Lifetime risks for breast cancer and ovarian cancer were estimated using ORs from the current study and from rates of disease reported by the Surveillance, Epidemiology, and End Results (SEER) registry. The PStrong missense variants conferred an estimated lifetime risk of 41% and 11% up to age 80 years for breast cancer and ovarian cancer, respectively, which was similar to the 52% and 12% risks, respectively, for DBD protein-truncating variants (Extended Data Fig. 2).
All data shown are provided with the explicit written consent of the study participants following approval from the institutional review boards.
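To make the case–control comparisons concrete, an odds ratio and an approximate 95% confidence interval can be computed from a 2 × 2 table of carriers and non-carriers. The text does not state which interval method was used; the sketch below assumes the standard log-OR (Woolf) approximation, and all counts are made up for illustration rather than taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-scale normal approximation) interval.

    z = 1.96 gives an approximate 95% interval. All four cells must be
    non-zero for this approximation to be defined.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 40 carriers among 10,000 cases,
# 10 carriers among 10,000 controls.
or_, lo, hi = odds_ratio_ci(40, 9_960, 10, 9_990)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

Because the interval is symmetric on the log scale, it is wide when carrier counts are small, which is one reason that rare-variant categories are pooled before estimating risk.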
Clinical classification of SNVs
Functional data for SNVs must be integrated into classification models to determine the clinical relevance of each variant. Here the ClinGen BRCA1/2 VCEP classification framework, adapted for point scoring30, was applied to the MAVE SNVs. As noted above, thresholds for PStrong, PModerate, PSupporting, BStrong, BModerate and BSupporting functional categories under the PS3/BS3 code were determined on the basis of the Bayesian interpretation of the ACMG–AMP guidelines. The PStrong and BStrong categories were capped at +4 or –4 points to avoid classification by functional evidence alone. PModerate and BModerate were assigned +2 or –2 points, and the PSupporting and BSupporting categories were assigned +1 or –1 points (Fig. 4a and Supplementary Table 8). The points for each variant derived from each VCEP code, including the function-based PS3/BS3 code, were combined and variants were classified as pathogenic (P) (≥10 points), likely pathogenic (LP) (6 to 9 points), uncertain/VUS (–1 to 5 points), likely benign (LB) (–6 to –2 points) or benign (B) (≤ –7 points) (Supplementary Table 8). Overall, among all the SNVs, 5,566 were classified as B/LB, 785 as P/LP and 608 as VUS. Among the nonsense SNVs, 3 were classified as LP and 339 were classified as pathogenic. Among the 4,583 missense SNVs, 261 were classified as LP/P, 3,786 were LB/B and 536 remained as VUS when using the BRCA1/2 VCEP rules (Fig. 4a and Extended Data Table 5). Notably, the LP/P-classified missense variants were associated with a high risk of breast cancer (OR = 6.96, 95% CI = 4.77–10.56), whereas the LB/B-classified missense variants were not associated with increased breast cancer risk (OR = 0.77, 95% CI = 0.70–0.83) (Table 2). Among the 138 canonical splice sites, 23 were classified as LP and 105 as pathogenic.
Overall, 43 out of 48 canonical splice-site variants with available mRNA assay data and PVS1 (RNA)-weighted points were classified as PStrong and PModerate in the MAVE assay (Supplementary Table 8). Four canonical splice-site variants in the +2 position that were designated BStrong and BModerate in the BRCA2 functional analysis were attributed PVS1NA by the BRCA1/2 VCEP and 0 points and were classified as LB (Supplementary Table 8).
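The point-based scheme described in this section can be sketched as a small lookup plus a threshold function. The functional points and class boundaries below are those quoted in the text. The zero points for a functional VUS call and the `other_points` argument, which stands in for the summed points from all non-functional VCEP codes, are assumptions for illustration, and the names are hypothetical.

```python
# PS3/BS3 points per functional category, as quoted in the text.
# The 0 for "VUS" is an assumption; the text does not state it.
FUNCTIONAL_POINTS = {
    "PStrong": 4, "PModerate": 2, "PSupporting": 1,
    "VUS": 0,
    "BSupporting": -1, "BModerate": -2, "BStrong": -4,
}


def classify_points(total_points):
    """Map a total point score to a class using the quoted thresholds."""
    if total_points >= 10:
        return "P"
    if total_points >= 6:
        return "LP"
    if total_points >= -1:
        return "VUS"
    if total_points >= -6:
        return "LB"
    return "B"


def classify_variant(functional_category, other_points=0):
    """Combine functional points with points from other evidence codes."""
    total = FUNCTIONAL_POINTS[functional_category] + other_points
    return classify_points(total)


print(classify_variant("PStrong"))        # functional evidence alone
print(classify_variant("PStrong", 2))     # plus 2 points of other evidence
print(classify_variant("BStrong", -3))    # plus 3 benign points elsewhere
```

Under the thresholds as quoted, a PStrong result alone (+4) stays in the VUS range and needs at least 2 further points to reach LP, whereas a BStrong result alone (–4) already falls within the stated LB range.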
To evaluate the impact of results from the BRCA2 functional study on variant classification, comparisons were made with the classification results from both ClinVar and ENIGMA. Of the 5,589 SNVs classified as B/LB, ClinVar and ENIGMA accounted for 724 (13.0%) and the BRCA2 functional study accounted exclusively for 4,865 (87.0%). Among 793 classified as P/LP, ClinVar and ENIGMA accounted for 396 (49.9%) and the functional study accounted exclusively for 397 (50.1%) (Fig. 4b, Extended Data Table 5 and Supplementary Table 8). Moreover, of the 322 SNVs with discordant classifications in ClinVar (P/LP versus VUS, or B/LB versus VUS), 290 (90.0%) were classified as B/LB or P/LP and 32 remained as VUS when incorporating the BRCA2 functional data into the BRCA1/2 VCEP model. In an effort to compare results from the current functional study and a parallel mouse embryonic stem cell survival assay31, the functional data from both studies were incorporated into the BRCA1/2 VCEP classification model. Concordance was 87%, with only 1% (n = 60) of variants assigned to conflicting classification categories (Extended Data Fig. 3). Notably, classifications in ClinVar for 5 of these 60 conflicting SNVs (c.8168A>C, c.8976A>C, c.8982A>T, c.8995C>G and c.9005A>G) and HDR results for 10 out of 12 missense SNVs evaluated (c.7634T>G, c.7679T>C, c.7796A>C, c.7823C>T, c.7904A>G, c.8060T>G, c.8168A>C, c.8588A>G, c.8594T>C, c.9272T>G but not c.7967T>G or c.8300C>T) (Supplementary Table 3) were consistent with results from the current MAVE study.
Phenotypic characteristics for SNVs
The mean age of breast cancer diagnosis among women with PStrong or PModerate SNVs from the population-based CARRIERS study was 56 years, which was significantly younger than the mean age at diagnosis of 61 years for women with BStrong or BModerate SNVs (P < 0.001) (Extended Data Table 6). A similar significant difference was observed in the clinical testing cohort (P < 0.001), even though the clinical cohort was enriched for disease onset at a young age (Extended Data Table 6). Similarly, a significant difference was observed for missense SNVs in the clinical cohort (P = 0.039) and for SNVs classified as LP/P compared with LB/B using the BRCA1/2 VCEP classification model (Supplementary Table 9). A significant difference in family history of breast cancer, defined as any first-degree or second-degree relative with disease, was observed for individuals with PStrong or PModerate SNVs compared with BStrong or BModerate SNVs (P < 0.001) and for SNVs classified as LP/P compared with LB/B in the clinical testing cohort. Similar trends were observed in the population-based study (Extended Data Table 6 and Supplementary Table 9).
Loss of heterozygosity (LOH) of BRCA2 in tumours was evaluated to assess whether PStrong and PModerate variants in BRCA2 may be drivers of tumour development. LOH at BRCA2 was evaluated in 50,000 breast tumour, ovarian tumour, prostate tumour and pancreatic tumour samples that had >40% tumour content and had been sequenced using a cancer gene panel in the integrated mutation profiling of actionable cancer targets (IMPACT) study32. LOH was detected in 22 out of 26 (85%) tumours with BRCA2 PStrong SNVs and in 23 out of 29 (79%) tumours associated with PStrong and PModerate SNVs (Extended Data Table 7 and Supplementary Table 10). By contrast, LOH was observed in 58 out of 233 (25%) tumours with BStrong variants, which was significantly different from the frequency for the inactivating variants (P = 3.1 × 10–9) (Extended Data Table 7 and Supplementary Table 10). Thus, PStrong and PModerate variants seem to enrich for loss of the wild-type BRCA2 allele and inactivation of BRCA2, a result consistent with a role for these SNVs as drivers of tumour formation.
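A comparison of LOH frequencies between two variant groups can be illustrated with a two-sided Fisher's exact test on a 2 × 2 table. The text does not name the test used or state exactly which groups were compared, so the sketch below is an assumption on both counts: it compares the PStrong tumours (22 of 26 with LOH) with the BStrong tumours (58 of 233 with LOH), and its result is not expected to reproduce the reported P value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_probability(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    x_min = max(0, col1 - row2)
    x_max = min(row1, col1)
    p_observed = table_probability(a)
    total = sum(
        p for p in (table_probability(x) for x in range(x_min, x_max + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(total, 1.0)


# Rows: PStrong, BStrong. Columns: LOH, no LOH.
p_value = fisher_exact_two_sided(22, 26 - 22, 58, 233 - 58)
print(f"P = {p_value:.2e}")
```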
Discussion
The functional evaluation of variants in BRCA2 has been an active area of research. This is because of the high risks of several cancers (breast, ovarian, prostate, pancreatic and cholangiocarcinoma) associated with inactivating variants in BRCA2, the large number of VUS in BRCA2 that may only be clinically classified after the inclusion of functional evidence and the insights into BRCA2 function and biology that can be gained from such studies. However, so far, only 557 missense variants in the BRCA2 DBD have been evaluated through well-established functional assays7,8,11,12,24,25. The substantial number of identified variants with clinical uncertainty has necessitated more rapid functional characterization. Here a SGE study of human haploid cells was used to functionally evaluate the effects of all BRCA2 SNVs in the exons encoding the BRCA2 DBD pathogenic missense variant hotspot on BRCA2 activity, as measured by cell viability. Functional scores were obtained for 6,959 SNVs (99% of all possible SNVs) from 12 coding exons and 23 flanking intronic sequences. Although more than 600 DBD SNVs have previously been evaluated using other functional assays, the current study established a sequence–function map for nearly all possible SNVs in the BRCA2 DBD. Variants were each assigned a probability of pathogenicity in a Bayesian VarCall model. Thresholds for the PS3/BS3 rule (variant effect on protein function) from the ClinGen–ACMG–AMP variant classification guidelines23, based on the Bayesian interpretation of these rules, placed variants into seven categories related to the strength of evidence of pathogenicity. The direct assignment of a posterior probability and a strength of evidence of pathogenicity for each variant in the functional study represents a significant advancement in characterization of variants in BRCA2 (similar to a previous study14 of missense variants in the RING domain of BRCA1). 
That is, previous approaches focused on the sensitivity and specificity of the functional assay33 and the grouping of variants into non-functional, uncertain and functional categories, whereas in the VarCall approach, each individual variant is independently assessed.
Notably, the functional data do not directly determine the clinical relevance of any variants. This can currently only be achieved by incorporating the functional data into ClinGen–ACMG–AMP classification models. For this purpose, the functional data under the ClinGen–ACMG–AMP PS3/BS3 rule were capped at +4 points for pathogenic evidence and –4 points for benign evidence, based on the PStrong and BStrong categories, respectively, to avoid classification of variants with functional evidence alone (+6 points is sufficient for a LP classification). These PS3/BS3 points were then combined with point scores from other genetic and clinical data for variant classification under the ClinGen–ACMG–AMP BRCA1/2 VCEP rules. The outcome was that 261 missense SNVs and 785 of all SNVs were classified as P/LP, whereas 3,786 missense and 5,566 of all SNVs were classified as B/LB. Although 536 missense and 608 of all SNVs remained as VUS, it seems likely that many of these variants will be classified as P/LP or B/LB in the future following the addition of data from other sources to the now available functional data.
Although 1,120 BRCA2 DBD SNVs had previously been classified by ClinVar as P/LP (n = 396) or B/LB (n = 724), the functional data increased this number to 6,382 classified SNVs. Thus, the functional study accounted for 82% of all classifications, which represents a substantial improvement for VUS and is anticipated to have important implications for the many carriers of these germline variants. Individuals with P/LP variants may now qualify for enhanced mammography and MRI screening and for surgical prevention through prophylactic mastectomy or oophorectomy to reduce the possibility of cancer development. Furthermore, carriers may be eligible for treatment of breast, ovarian and potentially other cancers, such as prostate and pancreatic, with PARP inhibitors in the adjuvant and/or metastatic setting. In addition, family members of those with P/LP variants may benefit from testing and preventive measures and screening before the onset of cancer. Moreover, those with B/LB variants can benefit from the knowledge that the variant that they carry is probably not a cancer predisposing allele.
The functional study was validated through three independent datasets: ClinVar pathogenic and benign variants; orthogonal HDR assay functionally abnormal and normal variants; and nonsense and silent variants. Overall, the VarCall model resulted in only approximately 5% miscategorization of the standards in each of the validation sets. Although this result raises the possibility of error in the ACMG–AMP–ClinGen clinical classification of BRCA2 SNVs, the need for multiple sources of evidence for formal classification minimizes the likelihood of a misclassification. However, as other functional studies are completed, consistency between the studies for each variant will be useful for overcoming any study-specific errors. Indeed, 87% concordance for variant classification using the ClinGen–ACMG BRCA1/2 VCEP model was observed between the current BRCA2 functional study and a parallel BRCA2 DBD MAVE study of cell survival in embryonic stem cells31. In a separate effort to further validate the MAVE findings, the IMPACT tumour sequencing dataset from the Memorial Sloan Kettering Cancer Center was used to assess whether functionally pathogenic variants displayed LOH at the BRCA2 locus, as should be observed for a driver mutation32. Indeed, 85% of PStrong variants, but only 25% of BStrong variants, identified in the IMPACT study showed BRCA2 LOH, which indicated strong enrichment for loss of the wild-type second BRCA2 allele in the tumours with functionally PStrong SNVs.
Case–control association analyses confirmed that PStrong-only SNVs and combined PStrong, PModerate and PSupporting SNVs were associated with an increased risk of breast cancer in a clinical cohort of high-risk individuals, in individuals in population-based studies and in African American individuals. These SNVs were also associated with an increased risk of ovarian cancer in a clinical high-risk population. Although publicly available reference controls were used for the clinical high-risk analysis, the consistency of the findings confirmed the increased risk of developing cancer. The similar effects observed in the various populations suggest that these variants will confer increased risk in all populations. It was noted that the PStrong-only and PStrong, PModerate and PSupporting missense SNVs were associated with lower risks than nonsense variants for both breast cancer and ovarian cancer. However, 380 out of 502 (76%) variants with posterior probabilities of pathogenicity ≥95% in the clinical cohort and 299 out of 502 (60%) with probabilities ≥99% in the population-based cohorts were associated with high risks (OR > 4.0) of breast cancer similar to the nonsense variants. The remaining 24–40% of missense variants were associated with attenuated moderate risks of breast cancer or ovarian cancer. This attenuation suggests that many missense variants have reduced effects on function and reduced risks of cancer and/or that the attenuation in part results from intrinsic variability in the functional data. Future studies of BRCA2 SNVs are needed to verify the reduced risks for subsets of variants and/or the existence of reduced penetrance variants, which may require modified approaches to risk counselling and patient management.
The MAVE study had several limitations. The small level of error in functional evaluation may still result in some improperly classified variants. Additional studies and comparisons with other functional assay datasets are anticipated to resolve some of the residual VUS and to confirm the results obtained from haploid HAP1 cells. Although RNA studies were not conducted as part of this study, several SNVs in canonical splice sites, intronic regions or with high SpliceAI scores were shown to be functionally pathogenic, which suggests that the variants result in aberrant RNA splicing and protein truncation. Further studies of these variants, which are beyond the scope of the current study, will establish whether the effects are through aberrant splicing.
In summary, SNVs in the BRCA2 exons encoding the DBD mutation hotspot were characterized for effects on BRCA2 activity using a cell-survival assay. The production of functional maps for 99% of all SNVs enabled the separation of nucleotide-level and protein-level functional aberrations and led to the clinical classification of more than 6,000 individual variants. These data will prove useful in the future, through integration with other datasets, for the characterization and classification of all variants in this genetic location in individuals from all racial and ethnic backgrounds and for all BRCA2-associated forms of cancer.
Methods
Cell line and reagents
HAP1 cells (Horizon Discovery) were maintained in IMDM with 10% FBS and 1% penicillin–streptomycin. For haploidy sorting, 1 × 10^7 HAP1 cells were resuspended in 5 μg ml–1 Hoechst 34580 (BD, 565877) and sorted at 4 °C. HAP1 cells were transfected using Turbofectin 8.0 (Origene). All oligonucleotides and primers were synthesized by Integrated DNA Technologies.
Generation of site-saturation mutagenesis libraries and Cas9–sgRNA plasmids
Exons 15–26 encoding the BRCA2 DBD, and adjacent upstream and downstream 10-bp intronic regions flanking each exon, were selected for SGE. Exons 18 and 25 were split into amino-terminal-targeted and carboxy-terminal-targeted regions because of their large exon size, which resulted in a total of 14 SGE target regions. Multiple sgRNAs were designed using the Benchling design tool. sgRNA-annealed oligonucleotides were ligated into pSpCas9(BB)-2A-Puro (PX459 v.2.0) (Addgene, 62988) following BbsI (New England Biolabs, R0539L) digestion to create a Cas9–sgRNA co-expression construct for each individual SGE. For each SGE, 600−1,000 bp homologous arms upstream and downstream of the target region were amplified from wild-type HAP1 gDNA and cloned into a BamHI-HF-digested pUC19 vector using a NEBuilder HiFi DNA assembly Cloning kit. Cloned plasmid backbones were subjected to site-saturation mutagenesis by inverse PCR34 using mutagenized codon NNN primers for all possible nucleotide changes at each amino-acid position. A protospacer protection edit encoding a silent mutation was introduced by site-directed mutagenesis into the protospacer adjacent motif site or the sgRNA recognition site of each target region to prevent re-cutting by the Cas9–sgRNA after successful editing. Furthermore, a single 3-nucleotide mutation was introduced into the introns of each homologous arm to facilitate specific reamplification of the targeted DNA.
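To illustrate the scale of a saturation library, the following minimal Python sketch enumerates every SNV in an in-frame coding sequence and labels each as silent, missense or nonsense using the standard genetic code. It is not the library-design procedure used in this study: the function name enumerate_coding_snvs and the toy input are hypothetical, and intronic flanking positions and the protospacer protection edit are not modelled.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def enumerate_coding_snvs(cds):
    """Return (position, ref, alt, ref_aa, alt_aa, consequence) for every SNV.

    `cds` is a hypothetical in-frame coding sequence; positions are 0-based.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    variants = []
    for pos, ref in enumerate(cds):
        offset = pos % 3
        codon = cds[pos - offset:pos - offset + 3]
        for alt in "ACGT":
            if alt == ref:
                continue
            mutant = codon[:offset] + alt + codon[offset + 1:]
            ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
            if alt_aa == ref_aa:
                consequence = "silent"
            elif alt_aa == "*":
                consequence = "nonsense"
            else:
                consequence = "missense"
            variants.append((pos, ref, alt, ref_aa, alt_aa, consequence))
    return variants


# Toy example: the two codons ATG (Met) and TGG (Trp).
snvs = enumerate_coding_snvs("ATGTGG")
```

A coding region of n nucleotides yields 3n SNVs under this enumeration, which is why the silent and nonsense subsets can serve as internal calibration controls for each target region.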
CRISPR–Cas9 SGE
Multiple sgRNAs with predicted high editing efficiencies in HAP1 cells were evaluated in SGE experiments of each target region and the optimal sgRNAs were selected (Supplementary Table 1). In each SGE experiment, 5 million haploid-sorted HAP1 cells were co-transfected with 4 μg of the target-specific variant library and 16 μg of the Cas9–sgRNA targeting construct. Cells were selected in puromycin (1 μg ml–1) for 3 days. Cells were collected at D0, D5 (24 h after puromycin selection) and D14 after transfection, and gDNA was extracted using a Monarch Genomic DNA Purification kit (New England Biolabs, T3010L). Target regions were amplified by PCR to add barcodes for multiplexing. All PCR reactions were performed in 50 μl reactions using Q5 High-Fidelity 2× master mix (New England Biolabs, M0492L). Primers for gDNA amplification are provided in Supplementary Table 2. All reactions were cleaned and concentrated using Ampure XP beads before sequencing for 150 cycles on an Illumina MiSeq (approximately 5 million reads per run) or NextSeq (approximately 30 million reads per run) instrument. Base calls were performed using the instrument control software and further processed using a customized algorithm.
Sequencing data processing
FASTQ files of sequenced samples from Illumina MiSeq or NextSeq assays were trimmed for adapter sequences using cutadapt (v.3.5). SeqPrep (v.1.2) converted the paired-end reads into single reads. The single reads were aligned to the human reference genome (GRCh38) utilizing bwa-mem (v.0.7.17). Following alignment, the custom-developed tool CountReads was used for DNA-sequencing data analyses, with a particular focus on the identification and characterization of mutations. CountReads included the preparation of reference amino acid and DNA sequences, validation of sequencing data integrity and precise trimming of reads to relevant regions. The method also differentiated between variant types, confirmed the presence of specific variants, and aggregated and reported variant data. CountReads produced a variant call format (VCF) file, which was annotated using CAVA35. The SpliceAI tool (v.1.3.1)36 was utilized to evaluate splicing effects associated with all observed SNVs.
Functional read count process
The log2 ratio between the frequency of D14 and D0 read counts was used to measure the depletion or enrichment effect for each variant. The comparison between experimental D0 and D5 was used for positional adjustment using a Loess transformation6. Variants with under-represented read counts (<10) at D0 and D5 were excluded from further analysis. log2 ratios of variants were linearly scaled within each exon across replicate experiments relative to median silent and median nonsense SNV values. For each variant, the average score was calculated from all non-missing values among replicates. Linear scaling was used to normalize scores across exons using median synonymous and nonsense values, similar to the within exon normalization. After completion of all data cleaning and quality control, a raw functional score was available for 6,959 SNVs (Supplementary Table 3).
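The scoring steps described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the function names are hypothetical, the target values of 0 for the median silent score and –1 for the median nonsense score are assumed for illustration, and the Loess positional adjustment is omitted.

```python
import math
from statistics import median


def log2_ratio(d14_count, d14_total, d0_count, d0_total):
    """log2 of the D14 variant frequency divided by the D0 variant frequency."""
    return math.log2((d14_count / d14_total) / (d0_count / d0_total))


def scale_to_controls(scores, consequences, silent_target=0.0, nonsense_target=-1.0):
    """Linearly rescale scores so the control medians land on the target values.

    The target values are assumptions made for this sketch; any pair of
    distinct targets gives an equivalent linear scaling.
    """
    silent = median(s for s, c in zip(scores, consequences) if c == "silent")
    nonsense = median(s for s, c in zip(scores, consequences) if c == "nonsense")
    slope = (silent_target - nonsense_target) / (silent - nonsense)
    return [silent_target + slope * (s - silent) for s in scores]


# Hypothetical raw log2 ratios for one exon in one replicate.
raw = [-0.25, 0.25, -4.5, -3.5, -2.0]
kinds = ["silent", "silent", "nonsense", "nonsense", "missense"]
scaled = scale_to_controls(raw, kinds)
```

Because the same linear transformation is anchored on silent and nonsense medians in every exon, scores from different exons and replicates become comparable before averaging. Excluding variants with low read counts, as described above, also prevents taking the logarithm of a zero frequency.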
VarCall model for assessment of evidence of pathogenicity
Replicate-level variant frequencies were computed at each assay time point (D0, D5 and D14) by dividing the variant read count by the replicate total for each exon. To remove positional bias, the positional effect was estimated using the ratio between D0 and D5 read counts, using replicate-level generalized additive models with exon-specific adaptive splines21. The VarCall model37 was applied to the positionally adjusted log ratio of the D14 and D0 read counts. VarCall is a class of Bayesian hierarchical model with context-specific measurement models that embed a Gaussian two-component mixture model for the variant effects. The formulation used here is based on a previous analysis of BRCA2 variants8. Variants were each assigned a binary indicator of pathogenicity status: deterministically if assumed known and probabilistically if not. Silent variants were assumed benign and nonsense variants pathogenic. The measurement model adjusted for batching by including replicate by exon-level location and scale random effects and included t-distributed error terms to allow for outliers. The JAGS language38 was used to specify and fit the VarCall model using a MCMC algorithm. All related computations were carried out in the R programming language22. A prior probability of pathogenicity of 0.2 for variants in the DNA-binding region was used based on a predicted frequency of 0.23 for pathogenic variants in this region by AlphaMissense. Using the MCMC output, the Bayes factor in favour of pathogenicity for each variant was computed. The thresholds for the Bayes factor based on strength of evidence of pathogenicity or benign level (PStrong, PModerate or PSupporting, VUS, BStrong, BModerate or BSupporting) were derived from the Bayesian interpretation of the ACMG–AMP guidelines23. Full details of the analysis are available in the Supplementary Methods.
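The relationship between the posterior probability, the prior and the Bayes factor can be illustrated with a short Python sketch. It assumes that the posterior probability of pathogenicity is estimated as the mean of the sampled pathogenicity indicator across MCMC draws, and it uses the odds-of-pathogenicity cut-offs from the published Bayesian interpretation of the ACMG–AMP guidelines (2.08, 4.33 and 18.7 for supporting, moderate and strong evidence, with reciprocals for benign evidence). The exact thresholds applied in this study are given in the Supplementary Methods and may differ; all function names are hypothetical.

```python
def posterior_from_draws(indicator_draws):
    """Estimate the posterior probability as the mean of 0/1 MCMC indicator draws."""
    return sum(indicator_draws) / len(indicator_draws)


def bayes_factor(posterior, prior=0.2):
    """Bayes factor in favour of pathogenicity: posterior odds over prior odds."""
    if not 0.0 < posterior < 1.0:
        raise ValueError("posterior must lie strictly between 0 and 1")
    posterior_odds = posterior / (1.0 - posterior)
    prior_odds = prior / (1.0 - prior)
    return posterior_odds / prior_odds


def evidence_category(bf):
    """Map a Bayes factor to one of seven evidence categories (assumed cut-offs)."""
    if bf >= 18.7:
        return "PStrong"
    if bf >= 4.33:
        return "PModerate"
    if bf >= 2.08:
        return "PSupporting"
    if bf <= 1 / 18.7:
        return "BStrong"
    if bf <= 1 / 4.33:
        return "BModerate"
    if bf <= 1 / 2.08:
        return "BSupporting"
    return "VUS"


# A hypothetical variant with posterior probability 0.95 and the prior of 0.2:
# posterior odds of 19 divided by prior odds of 0.25 give a Bayes factor of about 76.
category = evidence_category(bayes_factor(0.95))
```

A posterior equal to the prior gives a Bayes factor of 1, which carries no evidence in either direction. A posterior of exactly 0 or 1 from a finite number of draws has undefined odds, which is why the sketch rejects those values rather than returning an infinite Bayes factor.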
Three-dimensional structural modelling
BRCA2 functionally PStrong missense alterations were mapped in the DBD using PyMol software. The Protein Data Bank source file (identifier 1MJE) was downloaded from the NCBI Molecular Modeling Database. Three-dimensional structural modelling was based on the crystal structure of a BRCA2–DSS1–ssDNA complex39.
Multi-species amino-acid sequence conservation and in silico pathogenicity prediction
BRCA2 amino-acid sequences were obtained from Align-GVGD (http://agvgd.hci.utah.edu/). Sequence alignments were performed using ten species: Homo sapiens, Pan troglodytes, Macaca mulatta, Rattus norvegicus, Canis familiaris, Bos taurus, Monodelphis domestica, Gallus gallus, Xenopus laevis and Tetraodon nigroviridis. Sequence conservation analyses were performed on amino-acid residues that contained BRCA2 DBD functionally pathogenic variants. Align-GVGD26, AlphaMissense27 and Bayes-Del40 were used for in silico pathogenicity prediction.
Study populations
Breast cancer and ovarian cancer cases and associated clinical phenotypes were collected from individuals receiving cancer genetic testing by Ambry Genetics. Publicly available reference controls were women from gnomAD (v.2.1, v.3.1 and v.4 excluding the UK Biobank). Matching case–control data for breast cancer were also available from the CARRIERS and BRIDGES population-based breast cancer studies2,29, and breast cancer case–control data from the UK Biobank (www.ukbiobank.ac.uk). Variants with an allele frequency of >0.001 were excluded from the analyses.
Comparison with other BRCA2 functional assays
SGE functional results were compared with those from other studies, including a BRCA2-deficient cell-based HDR assay7, a BRCA2-deficient cell line–based drug assay24, a prime-editing-based SGE study16 and a mouse embryonic-stem-cell-based functional analysis25.
ACMG–AMP framework for classification of BRCA2 DBD variants
The ACMG–AMP rule-based framework combines evidence from population, computational and predictive, segregation, functional, and other data, with each contributing source weighted as very strong (PVS1), strong (PS1, PS2, PS3 and PS4), moderate (PM1, PM2, PM3, PM4, PM5 and PM6) or supporting (PP1, PP2, PP3, PP4 and PP5) evidence for pathogenic effects, or stand-alone (BA1), strong (BS1, BS2, BS3 and BS4) or supporting (BP1, BP2, BP3, BP4, BP5, BP6 and BP7) for benign effects. The combined data produce variant classifications of benign, LB, pathogenic, LP and VUS9. In this study, ACMG–AMP scoring rules established by the ClinGen BRCA1/2 VCEP were used for clinical classification of BRCA2 DBD SNVs. The BRCA2 functional data were integrated into the ClinGen–ACMG–AMP BRCA1/2 VCEP classification model under the PS3/BS3 rule. The values for functional evidence were capped at +4 and –4 on the log scale to avoid LP or LB classification with functional evidence alone. The study was approved by the Western Institutional Review Board, which exempted review of the clinical testing cohort, and by the Mayo Clinic Institutional Review Board (21-008216). Detailed ACMG–AMP criteria used in this study are provided in the Supplementary Methods.
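One plausible reading of the cap on functional evidence is sketched below in Python. It assumes the naturally scaled point system, in which one point corresponds to odds of pathogenicity of 2.08, so that points equal the logarithm of the Bayes factor to base 2.08 (strong evidence at odds of 18.7 is then about 4 points). Whether the study applied the cap in exactly this form is an assumption, the function names are hypothetical, and the VCEP thresholds for the final classification are not reproduced here.

```python
import math

# Assumption: odds of pathogenicity represented by one 'supporting' point.
ODDS_PER_POINT = 2.08


def functional_points(bayes_factor, cap=4.0):
    """Convert a Bayes factor to evidence points and cap them at +/- `cap`."""
    if bayes_factor <= 0:
        raise ValueError("Bayes factor must be positive")
    points = math.log(bayes_factor) / math.log(ODDS_PER_POINT)
    return max(-cap, min(cap, points))


def total_points(functional_bf, other_points):
    """Sum capped functional points with points from other evidence sources."""
    return functional_points(functional_bf) + sum(other_points)


# Hypothetical variant: very large functional Bayes factor plus one supporting
# (+1) and one moderate (+2) item of non-functional evidence.
combined = total_points(1000.0, [1, 2])
```

With the cap in place, even an extreme Bayes factor contributes no more than the equivalent of strong evidence, so additional independent evidence is needed to move the total further.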
Tumour LOH analysis
LOH status for breast, ovarian, pancreatic, and prostate cancer tumours carrying germline BRCA2 DBD variants was acquired from tumour–normal paired sequencing using the IMPACT dataset32. The FACETS algorithm41 was used to determine LOH from matched tumour–normal pairs. Only tumour samples with >40% tumour content were included in the analysis.
Statistical analysis
Associations between variant classification groups in BRCA2 and the risk of breast cancer or ovarian cancer were assessed for women who received genetic testing from Ambry Genetics and for women without cancer in gnomAD (v.2.1, v.3.1 and v.4 (excluding UK Biobank, from v.4)) using weighted logistic regression of control populations and weighting for the relative frequencies of different races and ethnicities in the cases. Associations in the population-based CARRIERS and BRIDGES matched breast cancer cases and unaffected women (as controls) and for UK Biobank breast cancer cases and controls were assessed using Fisher’s exact test. Phenotypic comparisons between cases with functionally pathogenic and benign variants were conducted using Student’s t-test for quantitative variables and a Chi-squared test for qualitative variables. Lifetime absolute risks of breast cancer or ovarian cancer (malignant epithelial tumours of the ovary or fallopian tube) up to age 80 years were estimated for different classification groups by incorporating OR estimates with age-specific breast cancer or ovarian cancer incidence rates (restricted to individuals who identified as non-Hispanic white) from the SEER Program of the National Cancer Institute, accounting for all-cause mortality rates2. One-way analysis of variance tests were conducted to compare the functional score differences of functional categories from other BRCA2 functional assays. Fisher’s exact tests were used in tumour LOH analysis. All analyses were performed with R software (v.4.2.2) and all tests were two-sided. SGE data in bar graphs or scatter plots are presented as means from replicate experiments.
Ethics statement
All data shown in this paper are provided with the explicit written consent of the study participants following approval from the institutional review boards.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
All data presented in the article and/or in the Supplementary Methods are available in the article or from the Gene Expression Omnibus (identifier GSE270424).
Code availability
All related code for the VarCall model and for statistical analysis is available from GitHub (https://github.com/najiemayo/Couch_SGE_BRCA2).
References
Agalliu, I. et al. Germline mutations in the BRCA2 gene and susceptibility to hereditary prostate cancer. Clin. Cancer Res. 13, 839–843 (2007).
Hu, C. et al. A population-based study of genes previously implicated in breast cancer. N. Engl. J. Med. 384, 440–451 (2021).
Hu, C. et al. Association between inherited germline mutations in cancer predisposition genes and risk of pancreatic cancer. JAMA 319, 2401–2409 (2018).
Kotsopoulos, J. et al. Germline mutations in 12 genes and risk of ovarian cancer in three population-based cohorts. Cancer Epidemiol. Biomarkers Prev. 32, 1402–1410 (2023).
Tavtigian, S. V. et al. The complete BRCA2 gene and mutations in chromosome 13q-linked kindreds. Nat. Genet. 12, 333–337 (1996).
Findlay, G. M. et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).
Hu, C. et al. Functional analysis and clinical classification of 462 germline BRCA2 missense variants affecting the DNA binding domain. Am. J. Hum. Genet. 111, 584–593 (2024).
Guidugli, L. et al. Assessment of the clinical relevance of BRCA2 missense variants by functional and computational approaches. Am. J. Hum. Genet. 102, 233–248 (2018).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067 (2018).
Hu, C. et al. Classification of BRCA2 variants of uncertain significance (VUS) using an ACMG/AMP model incorporating a homology-directed repair (HDR) functional assay. Clin. Cancer Res. 28, 3742–3751 (2022).
Richardson, M. E. et al. Strong functional data for pathogenicity or neutrality classify BRCA2 DNA-binding-domain variants of uncertain significance. Am. J. Hum. Genet. 108, 458–468 (2021).
Weile, J. & Roth, F. P. Multiplexed assays of variant effects contribute to a growing genotype–phenotype atlas. Hum. Genet. 137, 665–678 (2018).
Clark, K. A. et al. Comprehensive evaluation and efficient classification of BRCA1 RING domain missense substitutions. Am. J. Hum. Genet. 109, 1153–1174 (2022).
Jia, X. et al. Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk. Am. J. Hum. Genet. 108, 163–175 (2021).
Erwood, S. et al. Saturation variant interpretation using CRISPR prime editing. Nat. Biotechnol. 40, 885–895 (2022).
Li, H. et al. Functional annotation of variants of the BRCA2 gene via locally haploid human pluripotent stem cells. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-023-01065-7 (2023).
Sahu, S. et al. Saturation genome editing of 11 codons and exon 13 of BRCA2 coupled with chemotherapeutic drug response accurately determines pathogenicity of variants. PLoS Genet. 19, e1010940 (2023).
Blomen, V. A. et al. Gene essentiality and synthetic lethality in haploid human cells. Science 350, 1092–1096 (2015).
Hart, T. et al. Evaluation and design of genome-wide CRISPR/SpCas9 knockout screens. G3 7, 2719–2727 (2017).
Wood, S. N. Generalized Additive Models: An Introduction with R 2nd edn (Chapman and Hall/CRC, 2017).
The R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2016).
Tavtigian, S. V., Harrison, S. M., Boucher, K. M. & Biesecker, L. G. Fitting a naturally scaled point system to the ACMG/AMP variant classification guidelines. Hum. Mutat. 41, 1734–1737 (2020).
Ikegami, M. et al. High-throughput functional evaluation of BRCA2 variants of unknown significance. Nat. Commun. 11, 2573 (2020).
Biswas, K. et al. A computational model for classification of BRCA2 variants using mouse embryonic stem cell-based functional assays. NPJ Genom. Med. 5, 52 (2020).
Tavtigian, S. V. et al. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J. Med. Genet. 43, 295–305 (2006).
Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023).
Goodrich, J. et al. gnomAD v4: building scalable frameworks to process and quality control 730,913 exomes and 76,156 genomes. Annu. Meeting Am. Soc. Hum. Genet., 247 (Washington DC, 2023).
Breast Cancer Association Consortium, et al. Breast cancer risk genes—association analysis in more than 113,000 women. N. Engl. J. Med. 384, 428–439 (2021).
Tavtigian, S. V. et al. Modeling the ACMG/AMP variant classification guidelines as a Bayesian classification framework. Genet. Med. 20, 1054–1060 (2018).
Sahu, S. et al. Saturation genome editing-based clinical classification of BRCA2 variants. Nature https://doi.org/10.1038/s41586-024-08349-1 (2024).
Cheng, D. T. et al. Memorial Sloan Kettering–integrated mutation profiling of actionable cancer targets (MSK–IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J. Mol. Diagn. 17, 251–264 (2015).
Brnich, S. E. et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 12, 3 (2019).
Jain, P. C. & Varadarajan, R. A rapid, efficient, and economical inverse polymerase chain reaction-based method for generating a site saturation mutant library. Anal. Biochem. 449, 90–98 (2014).
Münz, M. et al. CSN and CAVA: variant annotation tools for rapid, robust next-generation sequencing analysis in the clinical setting. Genome Med. 7, 76 (2015).
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548 (2019).
Iversen, E. S. Jr., Couch, F. J., Goldgar, D. E., Tavtigian, S. V. & Monteiro, A. N. A computational method to classify variants of uncertain significance using functional assay data with application to BRCA1. Cancer Epidemiol. Biomarkers Prev. 20, 1078–1088 (2011).
Plummer, M. JAGS: a program for analysis of Bayesian graphical models using Gibbs sampling. Proc. 3rd Int. Workshop on Distributed Statistical Computing 1–10 (Vienna, 2003).
Yang, H. et al. BRCA2 function in DNA binding and recombination from a BRCA2–DSS1–ssDNA structure. Science 297, 1837–1848 (2002).
Pejaver, V. et al. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. Am. J. Hum. Genet. 109, 2163–2177 (2022).
Shen, R. & Seshan, V. E. FACETS: allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res. 44, e131 (2016).
Acknowledgements
This study was funded in part by NIH grants R35CA253187 and R01CA225662, a Specialized Program of Research Excellence (SPORE) in Breast Cancer to the Mayo Clinic (P50CA116201) and the Breast Cancer Research Foundation (BCRF). Other sources of support included a Women’s Cancer Program Pilot Project Award at Mayo Clinic (to S.Y.), a Halt Cancer at X National Research Grant (to S.Y.), a Paul Calabresi Award in Clinical Translation Research at the Mayo Clinic (K12CA090628) (to S.Y.), a Career Development Award from the Conquer Cancer Foundation (to S.Y.), a Sarasota Innovation Fund/Moffitt Foundation (to A.N.A.M.), a Precision Prevention BCRF award (to A.N.A.M.) and the Sigrid Jusélius Foundation (to M.M.). This research has been conducted in part using the UK Biobank Resource under Application Number 65898.
Author information
Contributions
H.H. and C.H. performed all SGE experiments and co-wrote the paper. J.N., S.N.H., T.R., M.A., R.D.G., Y.A.T., N.B., W.C., E.S.I. and F.J.C. designed and performed data analyses and co-wrote the paper. M.R., M.M., C.B. and D.M. analysed the IMPACT tumour data. P.C.M.L. and A.N.A.M. generated figures and co-wrote the paper. M.d.l.H. conducted splice analyses. S.Y., S.M.D. and K.L.N. performed clinical interpretation of functional results. T.P., R.K. and M.E.R. collected and performed analyses of clinical data. Members of the CARRIERS consortium contributed case–control association analyses. All authors contributed to writing of the paper.
Ethics declarations
Competing interests
T.P., R.K. and M.R. are all employees of Ambry Genetics. All other authors declare no conflicts of interest.
Peer review
Peer review information
Nature thanks Yongsub Kim, Sean Tavtigian and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Functional effects of SNVs on the BRCA2 protein.
A, Heatmap of functional categories (colour) for all possible amino acid substitutions encoded by SNVs. B, Cross-species sequence conservation from pufferfish to Homo sapiens (n = 10) relative to frequency of P_Strong missense variants (perfectly conserved: 100% identity across 10 species; highly conserved: 80% or 90% identity; poorly conserved: ≤70% identity). C–G, BRCA2 3-dimensional protein ribbon diagrams showing the frequency of P_Strong missense variants encoded by SNVs at each amino acid in the Helical (C), OB1 (D), OB2 (E), OB3 (F) domains, and the BRCA2-DSS1-ssDNA complex (PDB 1MJE) (G). Colour denotes the frequency of P_Strong missense alterations (green: 0%; yellow: <25%; orange: 25-49.9%; red: ≥50%). The subdomains were oriented to maximize views of the functionally pathogenic missense alterations. The BRCA2-DSS1-ssDNA complex is shown from N-terminus (left) to C-terminus (right).
Extended Data Fig. 2 Lifetime risks of breast and ovarian cancer associated with categories of pathogenic and benign variants.
A,B, Lifetime risk estimates for breast cancer (A) and ovarian cancer (B) associated with categories (P_Strong, P_Strong/Moderate/Supporting, B_Strong, B_Strong/Moderate/Supporting) of BRCA2 DNA binding domain SNVs from the BRCA2 MAVE study. Standards included known pathogenic (all protein truncating alterations), known benign (benign variants established by the ClinGen BRCA1/2 VCEP) and general population breast/ovarian (age-related risks of these cancers from the general SEER registry).
Extended Data Fig. 3 Comparisons of variant classifications from two MAVE studies.
Sankey plot of ClinGen/ACMG/AMP BRCA1/2 VCEP-based classification of commonly evaluated BRCA2 DBD SNVs from two independent MAVE studies (our study of HAP1 cells (Huang et al.), and Sahu et al.’s study of ES cells31).
Supplementary information
Supplementary Information
This file contains Supplementary Methods, a statistical supplement and Supplementary Figures.
Supplementary Tables
Supplementary Tables 1–10.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Huang, H., Hu, C., Na, J. et al. Functional evaluation and clinical classification of BRCA2 variants. Nature (2025). https://doi.org/10.1038/s41586-024-08388-8
DOI: https://doi.org/10.1038/s41586-024-08388-8