Annurev Animal 020518 115024
BACKGROUND
The global cattle population is highly diverse, ranging from breeds specialized for very high milk
production in controlled, temperate environments to numerous local breeds adapted to harsh
tropical conditions. The advent of relatively low-cost whole-genome sequencing has made it pos-
sible to contemplate linking the variation observed in these cattle phenotypes and others important
in dairy and beef cattle production to variation at the genome level.
The first bovine genome reference assembly, based on the Hereford cow Dominette, was pub-
lished in 2009 (1). At the same time, as a result of resequencing, panels of tens of thousands of
single-nucleotide polymorphisms (SNPs) were identified that were polymorphic in a range of
cattle breeds (2, 3). These SNPs, with physically mapped positions on the bovine genome assem-
bly, enabled association trait mapping and reconstructions of past population histories of various
cattle breeds (3).
By 2011, resequencing of the entire genomes of cattle that were foundation animals of breeds
such as Holstein–Friesian dairy cattle had begun. Larkin et al. (4) described the resequencing of
the genomes of Pawnee Farm Arlinda Chief (Chief) and his son Walkway Chief Mark (Mark),
each accounting for ∼7% of all current genomes of Holstein–Friesian dairy cattle. By comparing
the frequencies of the haplotypes of Chief and Mark in the current population, the authors were
able to identify genome regions in which the frequencies of the paternal and maternal haplotype
from each bull differed more than expected owing to drift and to link these regions to possible
candidate polymorphisms for traits that have been under selection.
In the same year, at the 2011 Sir Mark Oliphant Genomics Conference in Melbourne, a group
of researchers met and decided to pool efforts to resequence key ancestors of globally important
dairy and beef breeds. The ultimate aim was to enable accurate imputation of the genotypes of an-
imals genotyped with SNP arrays to whole-genome sequence to find causative mutations affecting
key economic traits in beef and dairy cattle. This was the genesis of the 1000 Bull Genomes Project.
[Figure 1a: bar chart. y-axis: Million variants (0–90); x-axis: Run (1–6); series: Bos taurus SNP, Bos taurus INDEL, Bos taurus–Bos indicus SNP, Bos taurus–Bos indicus INDEL.]
Figure 1
(a) Number of sequence variants [single-nucleotide polymorphisms (SNPs, blue) and small insertions and
deletions (INDELs, orange)] detected in successive iterations of the 1000 Bull Genomes Project. (b) Origin of
breeds or samples sequenced in the project (shaded gray). The authors would like to acknowledge Thuy
Nguyen for help with this figure.
high levels of genetic diversity that have been suggested for B. indicus breeds from SNP data and
targeted resequencing of certain genome regions (3). The majority of SNPs and INDELs were
in intergenic regions, and more than 185,000 were in coding sequences and predicted to cause
amino acid substitutions (Table 1).
Attempts have been made to detect structural variants, including copy number variants, as
well as large insertions, deletions, and inversions, in the 1000 Bull Genomes sequence collections.
Detecting structural variation from short-read sequence data (the vast majority of sequences in the
data set are 100-bp paired-end reads from Illumina sequencing) is very challenging. Chen et al. (5)
considered structural variants as likely to be real only if there was evidence for transmission from
one generation to the next. Those authors described 3.49 and 0.67 Mb of structural variants that
were validated by sire–son transmission in Holstein and Jersey cattle, respectively. Interestingly,
structural variants were significantly depleted in a set of genes identified as core for eukaryote
function (that is, very few structural variants were discovered in these genes) (5).
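The transmission criterion described above can be sketched in a few lines. This is a minimal illustration and not the pipeline of Chen et al. (5): the `SV` record, the `reciprocal_overlap` helper, the 50% overlap threshold, and the example coordinates are all assumptions made for the sketch.

```python
from typing import List, NamedTuple


class SV(NamedTuple):
    """Hypothetical structural variant call (1-based, inclusive coordinates)."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "DEL", "DUP", "INV"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smallest fraction of either call covered by the other (0.0 if no match)."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def transmitted(sire_svs: List[SV], son_svs: List[SV],
                min_overlap: float = 0.5) -> List[SV]:
    """Keep only sire calls that have a matching call in the son."""
    return [s for s in sire_svs
            if any(reciprocal_overlap(s, c) >= min_overlap for c in son_svs)]


# Hypothetical calls: the chromosome 6 deletion is seen in both animals,
# the chromosome 14 duplication is seen in the sire only.
sire = [SV("6", 1000, 2000, "DEL"), SV("14", 5000, 9000, "DUP")]
son = [SV("6", 1100, 2050, "DEL"), SV("14", 20000, 24000, "DUP")]

validated = transmitted(sire, son)
validated_bp = sum(v.end - v.start + 1 for v in validated)
```

Requiring a matching call in a second, related animal trades sensitivity for specificity: calls that arise from mapping artifacts in one animal are unlikely to recur at the same position in its son, but true variants that the son did not inherit are also discarded.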
92 Hayes • Daetwyler
accumulated prior to domestication, as current effective population sizes and typical SNP
mutation rates (1 × 10⁻⁹) would not give rise to such high levels of polymorphism.
During domestication, breed formation, and, more recently, intense selection for milk and meat
production, favorable alleles at loci that affect these traits or processes will have been strongly
selected. For instance, coat color or coat pattern is often a breed-defining feature. Boitard et al.
(7) used the 1000 Bull Genomes sequence data and two different approaches to detect significant
signals of positive selection: a within-population approach aimed at identifying selective sweeps
and a population-differentiation approach designed to capture soft or incomplete sweeps. Their
results confirmed already-described, well-known breed-defining or trait-associated loci, including
MC1R (coat color), KIT (coat color and pattern), GHR (growth, milk production), PLAG1 (stature,
age at onset of puberty), and NCAPG/LCORL (stature), and detected several new loci (e.g., ARL15,
PRLR, CYP19A1, and PPM1L). Encouragingly, in some cases, they demonstrated that use of the
sequence data allowed them to pinpoint the underlying causal mutation under selection. They
concluded that the vast majority of adaptive mutations are likely to be regulatory rather than
protein-coding variants.
the HH3 region: a thymine-to-cytosine transition in the gene SMC2 (structural maintenance of
chromosome protein 2). This gene is highly conserved across eukaryotes, as far back as yeast.
The 1000 Bull Genomes data have also been used to assist in the rapid identification of causal
mutations for Weaver syndrome in Brown Swiss cattle (12), progressive retinal degeneration in
European cattle (13), oculocutaneous albinism in Braunvieh cattle (14), lethal chondrodyspla-
sia in Holstein cattle (15), the belted phenotype in several cattle breeds (16), familial renal syn-
drome (xanthinuria) in Tyrolean Grey cattle (17), recessive embryonic lethal mutations in Holstein
cattle (18), and recessive embryonic lethal mutations in Montbéliarde cattle (19). Perhaps one of
the most interesting mutations identified was the dominant mutation causing bulldog calf syn-
drome in a particular sire family. However, only a small proportion of the calves were affected,
suggesting mosaicism in the sire germline (11).
The availability of the 1000 Bull Genomes resource, and the fact that cattle producers can be
collaborators in recording genetic defects on a very large scale, led Bourneuf et al. (8, p. 13) to
propose,
The availability of large databases in cattle combined with the typical structure of livestock populations
facilitate the rapid detection and functional characterization of de novo deleterious mutations. The study
of mutations underlying sporadic syndromes in cattle, which also occur in humans, offers an interesting
alternative to laboratory animals for confirming the genetic aetiology of isolated clinical case reports
and gaining insights into the molecular mechanisms involved.
As a result of the detection of recessive defect mutations, particularly embryonic lethals, large-
scale screening for these defects in dairy populations has become routine. When the mutations
are identified, they are included in the next round of low-cost SNP array design (20). These arrays
are then genotyped in hundreds of thousands of industry cattle (to enable genomic selection), and
bull breeders/farmers can make decisions on whether to use carrier animals for breeding. These
deleterious mutations are of course also excellent targets for genome editing, particularly in bulls
that otherwise have very high genetic merit.
trait data and gene expression data, Xiang et al. (28) reported a splice site QTL at Chr6:87392580
within the fifth exon of kappa casein (CSN3) associated with milk production traits.
A more typical outcome from GWAS with imputed (1000 Bull Genomes) sequence data is the
identification not of a definitive causal mutation but rather of a small number of variants, in high
or complete linkage disequilibrium (LD), that are candidate causative mutations. For example,
Kemper et al. (29) identified two SNPs close to the SLC37A1 gene in complete LD associated
with phosphorus content of milk, gene expression, and protein content of milk (BTA1: 144367474
bp and BTA1: 144377960 bp). The gene is a good candidate for affecting phosphorus content of
milk, as it is a phosphorus–glucose antiporter. Sanchez et al. (30) suggested another SNP in close
proximity (BTA1: 144398814 bp) as a potential causal mutation, in their case affecting concentrations of
alpha S1 casein and alpha-lactalbumin. Table 2 describes candidate loci that have been identified
across a range of traits using imputed sequence data.
Table 2 Candidate loci for complex trait variation that have been identified across a range of traits using (1000 Bull
Genomes) imputed sequence data and genome-wide association
Trait(s) | Gene | Breeds | Reference
Early-lactation milk fat content | AGPAT6 | Fleckvieh, Holstein | Daetwyler et al. (11)
Late-lactation milk fat content | GHR, MGST1 | Brown Swiss | Frischknecht et al. (27)
Early-lactation milk fat content | AGPAT6 | Brown Swiss | Frischknecht et al. (27)
Fat content, protein content | SLC37A1, TST, MGST1, TBC1D22A, ABCG2, CSN1S1, PAEP, DGAT1, FASN, GHR, LMAN1, AGPAT6, MBL1 | Holstein, Fleckvieh, Jersey | Pausch et al. (26)
Levels of six major milk proteins (whey proteins α-lactalbumin and β-lactoglobulin, casein αs1, αs2, β, and κ) | SLC37A1, MGST1, ABCG2, CSN1S1, CSN2, CSN1S2, CSN3, PAEP, DGAT1, AGPAT6, ALPL, ANKH, PICALM | Montbéliarde, Normande, Holstein | Sanchez et al. (30)
Fatty acid profiles in milk | LARP1B | Holstein–Friesian | Duchemin et al. (45)
Fat% and protein% in milk | FASN, LALBA | Holstein–Friesian, Jersey, Australian Red | Goddard et al. (22)
Milk production and composition | ROBO1, SLC37A1, PSMB2, OGDH, MYH9, NCF4, ARNTL2, MGST1, CSN2, CSN3, GC, RDH8, TTC7B, PROM2, PAEP, ABO, DGAT1, COX6C, TRIM29, KRT19, PTRF, ERGIC1, GHR, SMEK1, WARS, MLH1, GMDS, MARF1, SCD, PRDX3 | Holstein–Friesian, Jersey, Australian Red | MacLeod et al. (21)
Fertility traits, calving traits | IGLL1, ATP10A | Brown Swiss | Frischknecht et al. (46)
Milk production traits | BTRC, MGST1, SLC37A1, STAT5A, PAEP, GC, CSF2RB, MUC1, NCF4, GHDCᵃ | Holstein–Friesian, Jersey | Raven et al. (47)
Cow fertility | EIF4EBP3ᵇ | Holstein–Friesian | Moore et al. (48)
ᵃ Genes in bold highly differentially expressed in mammary gland in Chamberlain et al. (49).
ᵇ Supported by differential expression in endometrium and corpus luteum of high- and low-fertility dairy cows.
The GWAS that have been performed using imputed sequence data demonstrate that this
approach identifies causative genes quite accurately, even when it does not identify causative
mutations. Although this is certainly a step forward, ideally the approach would lead more directly
to identification of causative mutations. There are at least two limitations here. The first is
imputation error: in some cases the genotypes at the causal mutation are imputed with an accuracy
lower than the LD between the true causal mutation and nearby SNPs. Association tests on the
imputed data can then give the imputed causal variant a higher (less significant) P-value than
other SNPs that are in high LD with the true causal variant (27, 31). The second limitation is the
lack of annotation of regulatory regions on the bovine genome. Many mutations affecting com-
plex traits are expected to be in regulatory regions [and human genetics studies support this, e.g.,
Zhu et al. (32)]. Without annotation of the regulatory regions, it is difficult to distinguish between
variants that may affect gene regulation and variants that are simply in high LD with the causal
mutation. The Functional Annotation of Animal Genomes (FAANG) Consortium (see 33, also in
this volume) aims to comprehensively annotate the bovine genome in the next few years (34).
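The first limitation can be made concrete with a rough calculation. The sketch below uses a common approximation in which the expected 1-df chi-square statistic for a tested marker is about 1 + N·r²q/(1 − r²q), where q is the proportion of phenotypic variance explained by the causal mutation and r² is the squared correlation between the tested genotypes and the true causal genotypes. Treating a neighboring SNP's r² as the product of its LD with the causal mutation and its own imputation accuracy is a simplifying assumption, and all numbers are hypothetical.

```python
def expected_chisq(n: int, q: float, r2: float) -> float:
    """Approximate expected 1-df chi-square for a tested marker.

    n  : number of phenotyped animals
    q  : proportion of phenotypic variance explained by the causal mutation
    r2 : squared correlation between tested and true causal genotypes
    """
    explained = r2 * q
    return 1.0 + n * explained / (1.0 - explained)


# Hypothetical scenario: the causal variant is imputed poorly, while a
# neighboring SNP is imputed well and is in high LD with the causal variant.
n_animals = 10_000
q_causal = 0.01
causal_imputation_r2 = 0.80
neighbour_ld_r2 = 0.95
neighbour_imputation_r2 = 0.98

chisq_causal = expected_chisq(n_animals, q_causal, causal_imputation_r2)
chisq_neighbour = expected_chisq(
    n_animals, q_causal, neighbour_ld_r2 * neighbour_imputation_r2
)
```

Under these assumed values the neighboring SNP has the larger expected test statistic, so it would tend to rank above the true causal variant even though it has no effect of its own.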
As pointed out above, the high levels of LD within a breed often lead to results in which the
SNPs identified with the most significant effects on the trait in the sequence data are in complete
LD, which makes identifying the causative mutation very challenging. One approach to reducing
the levels of LD in the population is to use phenotypes and imputed sequence data from multiple
breeds. Bouwman et al. (31) used this approach in a very large (58,265 cattle) GWAS of stature.
The meta-analysis included 17 populations and 8 breeds. The resulting confidence intervals in
significant regions were small, often including only one gene. In addition, several putative causative
mutations were identified with supporting evidence from an expression QTL (eQTL) study and
(limited) functional annotations.
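Combining association results across populations can be illustrated with a fixed-effect, inverse-variance weighted meta-analysis. This is one common way to pool per-population estimates and is shown only as a sketch; it is not claimed to be the procedure used by Bouwman et al. (31), and the effect sizes and standard errors below are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(betas: Sequence[float],
                          ses: Sequence[float]) -> Tuple[float, float, float]:
    """Pool per-population SNP effects; returns (beta, standard error, z)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equally long")
    weights = [1.0 / se ** 2 for se in ses]  # precision of each estimate
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical effects of one variant on stature in three populations.
betas = [0.10, 0.20, 0.15]
ses = [0.10, 0.10, 0.05]

beta_meta, se_meta, z_meta = inverse_variance_meta(betas, ses)
```

Because LD patterns differ between breeds, a variant that merely tags the causal mutation in one breed tends to show inconsistent effects elsewhere, whereas the causal mutation itself should accumulate evidence across populations.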
The results of the GWAS with imputed sequence data have in some cases been rapidly translated
into industry applications: the SNPs identified with the most significant effects on the traits
have been included in new SNP array designs that are used for routine genomic evaluations of dairy
and beef cattle (20).
top 5–15 QTLs per trait per breed, with 3–5 sequence variants used to tag each QTL. A total of
1,623 additional sequence variant QTL markers were selected (for inclusion on a new SNP array
that also included the 54K original markers). As a result of including the new sequence variants,
reliabilities increased by up to 4 percentage points for production traits in Nordic Holsteins, up to
3 percentage points for Nordic Reds, and up to 5 percentage points for French Holsteins. Smaller
gains were observed for mastitis and fertility.
VanRaden et al. (38) imputed sequence variants into 26,970 progeny-tested Holstein–Friesian
bulls. They found that when 6,648 candidate SNPs from the whole-genome sequence with the
largest estimated effects were added to the 60,671 SNPs used in routine evaluations, the reliabil-
ity of genomic prediction was improved by an average of 2.7 percentage points across 33 traits.
Veerkamp et al. (39) did not observe any increase in reliability of genomic predictions using a
similar approach, although their population was restricted to a single breed.
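Gains in these studies are reported in percentage points of reliability, whereas other studies in this section report accuracy. Under the standard definition, reliability is the squared accuracy (the squared correlation between predicted and true breeding value), so the two scales can be converted. The sketch below applies this to a 2.7-point gain; the baseline of 0.60 follows the approximate 50K reliability quoted later in this review for the same study, and the variable names are illustrative.

```python
import math


def accuracy_from_reliability(reliability: float) -> float:
    """Accuracy is the square root of reliability (both on a 0-1 scale)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return math.sqrt(reliability)


baseline_reliability = 0.60   # approximate value, taken as an assumption
gain_points = 2.7             # gain in percentage points of reliability

new_reliability = baseline_reliability + gain_points / 100.0
baseline_accuracy = accuracy_from_reliability(baseline_reliability)
new_accuracy = accuracy_from_reliability(new_reliability)

# The same gain expressed relative to the baseline reliability.
relative_gain = (new_reliability - baseline_reliability) / baseline_reliability
```

With these inputs, a gain of 2.7 percentage points corresponds to a relative increase in reliability of 4.5% and a change in accuracy of a little under 0.02, which is why percentage points and percent should not be used interchangeably.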
The greatest gains in accuracy of genomic predictions have been observed for multibreed pop-
ulations, particularly in the situation where one breed is not in the reference (or training) pop-
ulation where the genomic predictions are derived but animals from that breed are among the
selection candidates. MacLeod et al. (21) reported that using whole-genome sequence (in their
case, 994,019 sequence variant SNPs in or close to genes) in across-breed genomic predictions
for dairy cattle improved the accuracy of these genomic predictions by approximately 8% for a
breed (Australian Red) that was not in the reference or training population (Figure 2).
Raymond et al. (40) evaluated the use of whole-genome sequence data for across-breed predic-
tions in 595 New Zealand Jersey bulls; 957 Holstein bulls from New Zealand; and 5,553 Dutch
Holstein bulls. They found the highest accuracies of across-breed prediction (up to 0.35) were
achieved when subsets of SNPs were preselected from the whole-genome sequence data using
GWAS, and only those markers (and the routinely used 54K) were used in the genomic prediction.
Using all the variants in whole-genome sequence data did not significantly improve the propor-
tion of genetic variance captured across breeds compared with scenarios with few but preselected
markers.
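The preselection step can be sketched as a simple filter followed by a set union. The variant identifiers, P-values, and significance threshold below are hypothetical, and the function is an illustration of the idea and not the procedure of Raymond et al. (40).

```python
from typing import Dict, Set


def preselect_markers(gwas_p: Dict[str, float],
                      array_markers: Set[str],
                      p_threshold: float) -> Set[str]:
    """Return routinely used array markers plus GWAS-significant variants."""
    selected = {variant for variant, p in gwas_p.items() if p < p_threshold}
    return selected | array_markers


# Hypothetical GWAS results on imputed sequence variants.
gwas_p = {
    "seq_var_1": 1e-12,
    "seq_var_2": 0.03,
    "seq_var_3": 4e-9,
    "array_snp_1": 0.5,
}
array_markers = {"array_snp_1", "array_snp_2"}

marker_set = preselect_markers(gwas_p, array_markers, p_threshold=5e-8)
```

Array markers are retained regardless of their P-values, so the preselected sequence variants add to the routine marker set instead of replacing it.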
Taken together, the results of all these studies suggest the following:
1. Within (dairy) breed, the additional accuracy of genomic prediction that can be achieved as
a result of adding whole-genome sequence data is likely to be small; the best results that have
been observed for within-breed predictions were from VanRaden et al. (38), with a 2.7 percentage point
increase in the reliability of genomic predictions across traits, where a very large reference
population was available (and the average reliability with 50K was about 60%).
2. The best results (largest increases in accuracy) to date are observed when SNPs are
preselected from the sequence data using GWAS and a nonlinear genomic prediction method
is used (e.g., BayesB, BayesR, BayesSSVS). Use of a BLUP method and sequence data does
not result in any additional accuracy when the SNPs from the sequence data are included
(as the effects of these SNPs are shrunk too much to make an impact with this method).
3. The largest gains in accuracy are for multibreed genomic predictions, particularly where
selection candidates for a breed not in the reference (or training) are predicted.
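The remark in point 2 about BLUP shrinkage can be illustrated with a deliberately simplified calculation. Assume markers are independent and standardized, the phenotypic variance is 1, and every one of m markers receives the same prior variance h²/m. The BLUP estimate of a single marker effect is then roughly the least-squares estimate multiplied by n/(n + λ), with λ = (1 − h²)m/h². These assumptions, and the numbers used, are illustrative only.

```python
def blup_shrinkage(n_animals: int, n_markers: int, h2: float) -> float:
    """Approximate factor applied to a least-squares marker effect by SNP-BLUP.

    Assumes independent, standardized markers that share equal prior variance.
    """
    if not 0.0 < h2 < 1.0:
        raise ValueError("h2 must lie strictly between 0 and 1")
    lam = (1.0 - h2) * n_markers / h2  # ratio of residual to per-marker variance
    return n_animals / (n_animals + lam)


n = 20_000   # hypothetical reference population size
h2 = 0.3     # hypothetical heritability

shrink_array = blup_shrinkage(n, 50_000, h2)        # array-density marker set
shrink_sequence = blup_shrinkage(n, 1_000_000, h2)  # sequence-density marker set
```

Under these assumptions a marker effect retains about 15% of its least-squares value at array density but less than 1% at sequence density, which is consistent with the observation that a causal variant added among sequence variants contributes little in a BLUP analysis. Mixture models avoid this by allowing a few variants to have much larger prior variance than the rest.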
[Figure 2: bar chart. y-axis: accuracy (0.0–0.8); x-axis: Red Holstein (RH) and Australian Red (AR) validation sets for fat yield, milk yield, and protein yield; series: BayesR 800K, BayesR SEQ, BayesRC Lact.]
Figure 2
Accuracy (correlation of genomic predictions and trait deviations) for fat, milk, and protein yield from a
reference set of 16,214 Holstein and Jersey bulls and cows, in a Red Holstein (RH) bull validation data set
and Australian Red (AR) cow validation sets (based on table 5 in Reference 21). BayesR 800K (blue) are
genomic predictions based on the BovineHD single-nucleotide polymorphism (SNP) array (Illumina, San Diego),
BayesR SEQ (orange) are genomic predictions from whole-genome sequence, and BayesRC Lact (gray) are
genomic predictions using the BayesRC method described by MacLeod et al. (21). Figure adapted from
Reference 21.
To make more progress with using whole-genome sequence data in genomic predictions, fur-
ther research is required in several areas. Methods for analyzing the sequence data that are com-
putationally efficient are required, given that both the number of sequence variants identified and
the number of animals with imputed sequence variants are likely to grow rapidly. Highly efficient
approaches for implementing Bayesian methods with sequence data exist, and these methods can
be refined and improved (41–43). Further, biological information, including gene expression (both
differential expression and eQTL) and genome annotation information, could be used to identify
classes of sequence variants more likely to harbor mutations affecting complex traits.
Although the GWAS approach to selecting sequence variants to include in genomic predictions
described above is straightforward to implement and has been demonstrated to improve predic-
tion accuracy, it cannot take full advantage of the sequence data. The many mutations of small
effect, which contribute a substantial proportion of the variance for a typical complex trait, will
not be significant in these GWAS (22). Using the biological information such as gene expression
and genome annotation (including enhancers and other regulatory elements) may improve our
ability to identify these mutations of small effect. MacLeod et al. (21) described a genomic pre-
diction method called BayesRC, which groups sequence variants into classes based on annotation,
differential expression, or other information and allows the proportion of variants in each class
that have no effect (excluded from the model) and small, moderate, and large effects (assumed to
come from distributions with 0.0001, 0.001, and 0.01 of the genetic variance, respectively) to vary
between classes. When this method was applied to genomic prediction for milk production traits
in dairy cattle, and classes were determined based on whether sequence variants were in genes
that were differentially expressed in lactation experiments, there was some improvement in the
accuracy of genomic predictions (Figure 2). Zhang et al. (44) evaluated the BayesRC approach
in pigs (with imputed sequence data) and also found improvement in the accuracy of genomic
predictions for some (but not all) traits.
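The class-specific mixture prior described above can be sketched numerically. The four component variances (0, 0.0001, 0.001, and 0.01 of the genetic variance) are those given in the text; the class names and mixing proportions are hypothetical and are fixed here for illustration, whereas in the method described they are allowed to vary between classes.

```python
from typing import Dict, Sequence, Tuple

# Component variances as fractions of the genetic variance (from the text).
MIXTURE_VARIANCES: Tuple[float, ...] = (0.0, 0.0001, 0.001, 0.01)


def expected_effect_variance(proportions: Sequence[float],
                             sigma2_g: float = 1.0) -> float:
    """Expected prior variance of one variant effect under the mixture."""
    if len(proportions) != len(MIXTURE_VARIANCES):
        raise ValueError("need one proportion per mixture component")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must sum to 1")
    return sigma2_g * sum(p * v for p, v in zip(proportions, MIXTURE_VARIANCES))


# Hypothetical proportions (zero, small, moderate, large effect) per class.
class_proportions: Dict[str, Tuple[float, ...]] = {
    "lactation_genes": (0.90, 0.07, 0.02, 0.01),
    "other_variants": (0.99, 0.008, 0.0015, 0.0005),
}

expected = {name: expected_effect_variance(props)
            for name, props in class_proportions.items()}
```

With these assumed proportions, a variant in the enriched class carries roughly 17 times the prior variance of a variant in the background class, so it is shrunk less toward zero. This is the mechanism by which informative annotation can improve prediction.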
MacLeod et al. (21) demonstrated with simulated data that if classes of sequence variants can
be identified that are substantially enriched for causal mutations, the BayesRC approach can result
in significant improvements in the accuracy of genomic predictions, compared with what can be
achieved with high-density array genotypes. With improved annotation of the bovine genome as
a result of the efforts of the FAANG Consortium, this may be a reality within the next few years.
DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that
might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
The authors would like to sincerely thank the 1000 Bull Genomes Project consortium, without
which this manuscript and many others would not have been possible. The 1000 Bull Genomes
Project consortium includes Ruedi Fries (Technische Universität München, Germany), Mogens
Lund/Bernt Guldbrandtsen (Aarhus University, Denmark), Didier Boichard (INRA, France),
Paul Stothard (University of Alberta, Genome Canada), Roel Veerkamp (Wageningen UR,
Netherlands), Curt Van Tassell (US Department of Agriculture), Tom Druet (University of
Liege), Birgit Gredler (Qualitas AG), Johanna Vilkki (Natural Resources Institute Finland),
Erik Mullaart (CRV), Alessandro Bagnato (Università degli Studi di Milano), Donagh Berry
(TEAGASC), D.-J. De Koning (Swedish University of Agricultural Sciences), Enrico Santus
(Associazione nazionale Allevatori Razza Bruna), James Reecy (Iowa State University), Jerry
Taylor (University of Missouri), Flavio Schenkel (University of Guelph), Cord Drögemüller
(University of Bern), Steve Miller (AgResearch), Dirk Hinrichs (University of Kiel), Beatriz
Villanueva (Spanish National Institute for Agricultural and Food Research and Technology),
Eileen Wall (Scotland’s Rural College), Lorenzo Bomba (Università Cattolica del Sacro Cuore),
Ezequiel Luis Nicolazzi (Fondazione Parco Tecnologico Padano), Luis Varona (Universidad de
Zaragoza), Joanna Szyda (Wroclaw University of Environmental and Life Sciences), Norwegian
University of Life Sciences, Jesús Piedrafita (Universitat Autònoma de Barcelona), Christa
Kuhn (Leibniz Institute for Farm Animal Biology), and Ding Xiang Dong (Chinese Agricultural
University). The authors would also like to thank Mike Goddard for ideas and discussion leading
to some of the results presented in this manuscript.
LITERATURE CITED
1. Bov. Genome Seq. Anal. Consort., Elsik CG, Tellam RL, Worley KC, Gibbs RA, et al. 2009. The genome
sequence of taurine cattle: a window to ruminant biology and evolution. Science 324(5926):522–28
2. Van Tassell CP, Smith TP, Matukumalli LK, Taylor JF, Schnabel RD, et al. 2008. SNP discovery and allele
frequency estimation by deep sequencing of reduced representation libraries. Nat. Methods 5(3):247–52
3. Bov. HapMap Consort., Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, et al. 2009. Genome-wide
survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–32
4. Larkin DM, Daetwyler HD, Hernandez AG, Wright CL, Hetrick LA, et al. 2012. Whole-genome re-
sequencing of two elite sires for the detection of haplotypes under selection in dairy cattle. PNAS
109(20):7693–98
5. Chen L, Chamberlain AJ, Reich CM, Daetwyler HD, Hayes BJ. 2017. Detection and validation of struc-
tural variations in bovine whole-genome sequence data. Genet. Sel. Evol. 49:13
6. Boitard S, Rodríguez W, Jay F, Mona S, Austerlitz F. 2016. Inferring population size history from large
samples of genome-wide molecular data—an approximate Bayesian computation approach. PLOS Genet.
12(3):e1005877
7. Boitard S, Boussaha M, Capitan A, Rocha D, Servin B. 2016. Uncovering adaptation from sequence data:
lessons from genome resequencing of four cattle breeds. Genetics 203(1):433–50
8. Bourneuf E, Otz P, Pausch H, Jagannathan V, Michot P, et al. 2017. Rapid discovery of de novo deleterious
mutations in cattle enhances the value of livestock as model species. Sci. Rep. 7(1):11466
9. VanRaden PM, Olson KM, Null DJ, Hutchison JL. 2011. Harmful recessive effects on fertility detected
by absence of homozygous haplotypes. J. Dairy Sci. 94(12):6153–61
10. Adams HA, Sonstegard TS, VanRaden PM, Null DJ, Van Tassell CP, et al. 2016. Identification of a non-
sense mutation in APAF1 that is likely causal for a decrease in reproductive efficiency in Holstein dairy
cattle. J. Dairy Sci. 99(8):6693–701
11. Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, et al. 2014. Whole-genome sequenc-
ing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat. Genet. 46(8):858–65
12. Kunz E, Rothammer S, Pausch H, Schwarzenbacher H, Seefried FR, et al. 2016. Confirmation of a non-
synonymous SNP in PNPLA8 as a candidate causal mutation for Weaver syndrome in Brown Swiss cattle.
Genet. Sel. Evol. 48:21
13. Michot P, Chahory S, Marete A, Grohs C, Dagios D, et al. 2016. A reverse genetic approach identifies
an ancestral frameshift mutation in RP1 causing recessive progressive retinal degeneration in European
cattle breeds. Genet. Sel. Evol. 48(1):56
14. Rothammer S, Kunz E, Seichter D, Krebs S, Wassertheurer M, et al. 2017. Detection of two non-
synonymous SNPs in SLC45A2 on BTA20 as candidate causal mutations for oculocutaneous albinism
in Braunvieh cattle. Genet. Sel. Evol. 49(1):73
15. Agerholm JS, Menzi F, McEvoy FJ, Jagannathan V, Drögemüller C. 2016. Lethal chondrodysplasia in a
family of Holstein cattle is associated with a de novo splice site variant of COL2A1. BMC Vet. Res. 12:100
16. Rothammer S, Kunz E, Krebs S, Bitzer F, Hauser A, et al. 2018. Remapping of the belted phenotype in
cattle on BTA3 identifies a multiplication event as the candidate causal mutation. Genet. Sel. Evol. 50(1):36
17. Murgiano L, Jagannathan V, Piffer C, Diez-Prieto I, Bolcato M, et al. 2016. A frameshift mutation in
MOCOS is associated with familial renal syndrome (xanthinuria) in Tyrolean Grey cattle. BMC Vet. Res.
12(1):276
18. Fritz S, Hoze C, Rebours E, Barbat A, Bizard M, et al. 2018. An initiator codon mutation in SDE2 causes
recessive embryonic lethality in Holstein cattle. J. Dairy Sci. 101(7):6220–31
19. Michot P, Fritz S, Barbat A, Boussaha M, Deloche MC, et al. 2017. A missense mutation in PFAS (phos-
phoribosylformylglycinamidine synthase) is likely causal for embryonic lethality associated with the MH1
haplotype in Montbéliarde dairy cattle. J. Dairy Sci. 100(10):8176–87
20. Boichard D, Boussaha M, Capitan A, Rocha D, Hoze C, et al. 2018. Experience from large scale use of
the EuroGenomics custom SNP chip in cattle. Proc. World Congr. Genet. Appl. Livest. Prod. 4:675
21. MacLeod IM, Bowman PJ, Vander Jagt CJ, Haile-Mariam M, Kemper KE, et al. 2016. Exploiting bio-
logical priors and sequence variants enhances QTL discovery and genomic prediction of complex traits.
BMC Genom. 17:144
22. Goddard ME, Kemper KE, MacLeod IM, Chamberlain AJ, Hayes BJ. 2016. Genetics of complex traits:
prediction of phenotype, identification of causal polymorphisms and genetic architecture. Proc. Biol. Sci.
B 283(1835):20160569
23. Grisart B, Coppieters W, Farnir F, Karim L, Ford C, et al. 2002. Positional candidate cloning of a QTL
in dairy cattle: identification of a missense mutation in the bovine DGAT1 gene with major effect on milk
yield and composition. Genome Res. 12(2):222–31
24. Blott S, Kim JJ, Moisio S, Schmidt-Küntzel A, Cornet A, et al. 2003. Molecular dissection of a quantitative
trait locus: A phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth
hormone receptor is associated with a major effect on milk yield and composition. Genetics 163(1):253–66
25. Cohen-Zinder M, Seroussi E, Larkin DM, Loor JJ, Everts-van der Wind A, et al. 2005. Identification
of a missense mutation in the bovine ABCG2 gene with a major effect on the QTL on chromosome 6
affecting milk yield and composition in Holstein cattle. Genome Res. 15(7):936–44
26. Pausch H, Emmerling R, Gredler-Grandl B, Fries R, Daetwyler HD, Goddard ME. 2017. Meta-analysis
of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein per-
centages in milk at nucleotide resolution. BMC Genom. 18(1):853
27. Frischknecht M, Pausch H, Bapst B, Signer-Hasler H, Flury C, et al. 2017. Highly accurate sequence
imputation enables precise QTL mapping in Brown Swiss cattle. BMC Genom. 18(1):999
28. Xiang R, Hayes BJ, Vander Jagt CJ, MacLeod IM, Khansefid M, et al. 2018. Genome variants associated
with RNA splicing variations in bovine are extensively shared between tissues. BMC Genom. 19(1):521
29. Kemper KE, Littlejohn MD, Lopdell T, Hayes BJ, Bennett LE, et al. 2016. Leveraging genetically simple
traits to identify small-effect variants for complex phenotypes. BMC Genom. 17(1):858
30. Sanchez MP, Govignon-Gion A, Croiseau P, Fritz S, Hozé C, et al. 2017. Within-breed and multi-breed
GWAS on imputed whole-genome sequence variants reveal candidate mutations affecting milk protein
composition in dairy cattle. Genet. Sel. Evol. 49(1):68
31. Bouwman AC, Daetwyler HD, Chamberlain AJ, Ponce CH, Sargolzaei M, et al. 2018. Meta-analysis of
genome-wide association studies for cattle stature identifies common genes that regulate body size in
mammals. Nat. Genet. 50(3):362–67
32. Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, et al. 2016. Integration of summary data from GWAS
and eQTL studies predicts complex trait gene targets. Nat. Genet. 48(5):481–87
33. Giuffra E, Tuggle CK, FAANG Consort. 2019. Functional Annotation of Animal Genomes (FAANG):
current achievements and roadmap. Annu. Rev. Anim. Biosci. 7:65–88
34. Andersson L, Archibald AL, Bottema CD, Brauning R, Burgess SC, et al. 2015. FAANG Consortium.
Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional An-
notation of Animal Genomes project. Genome Biol. 16:57
35. van Binsbergen R, Calus MP, Bink MC, van Eeuwijk FA, Schrooten C, Veerkamp RF. 2015. Genomic
prediction using imputed whole-genome sequence data in Holstein Friesian cattle. Genet. Sel. Evol. 47:71
36. Frischknecht M, Meuwissen THE, Bapst B, Seefried FR, Flury C, et al. 2018. Short communication:
genomic prediction using imputed whole-genome sequence variants in Brown Swiss Cattle. J. Dairy Sci.
101(2):1292–96
37. Brøndum RF, Su G, Janss L, Sahana G, Guldbrandtsen B, et al. 2015. Quantitative trait loci markers
derived from whole genome sequence data increases the reliability of genomic prediction. J. Dairy Sci.
98(6):4107–16
38. VanRaden PM, Tooker ME, O’Connell JR, Cole JB, Bickhart DM. 2017. Selecting sequence variants to
improve genomic predictions for dairy cattle. Genet. Sel. Evol. 49(1):32
39. Veerkamp RF, Bouwman AC, Schrooten C, Calus MP. 2016. Genomic prediction using preselected DNA
variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle. Genet. Sel. Evol.
48(1):95
40. Raymond B, Bouwman AC, Schrooten C, Houwing-Duistermaat J, Veerkamp RF. 2018. Utility of whole-
genome sequence data for across-breed genomic prediction. Genet. Sel. Evol. 50:27
41. Calus MP, Bouwman AC, Schrooten C, Veerkamp RF. 2016. Efficient genomic prediction based on
whole-genome sequence data using split-and-merge Bayesian variable selection. Genet. Sel. Evol. 48(1):49
42. Wang T, Chen YP, Bowman PJ, Goddard ME, Hayes BJ. 2016. A hybrid expectation maximisation and
MCMC sampling algorithm to implement Bayesian mixture model based genomic prediction and QTL
mapping. BMC Genom. 17(1):744
43. van den Berg I, Bowman PJ, MacLeod IM, Hayes BJ, Wang T, et al. 2017. Multi-breed genomic prediction
using Bayes R with sequence data and dropping variants with a small effect. Genet. Sel. Evol. 49(1):70
44. Zhang C, Kemp RA, Stothard P, Wang Z, Boddicker N, et al. 2018. Genomic evaluation of feed efficiency
component traits in Duroc pigs using 80K, 650K and whole-genome sequence variants. Genet. Sel. Evol.
50:14
45. Duchemin SI, Bovenhuis H, Megens H-J, Van Arendonk JAM, Visker MHPW. 2017. Fine-mapping of
BTA17 using imputed sequences for associations with de novo synthesized fatty acids in bovine milk.
J. Dairy Sci. 100(11):9125–35
46. Frischknecht M, Bapst B, Seefried FR, Signer-Hasler H, Garrick D, et al. 2017. Genome-wide association
studies of fertility and calving traits in Brown Swiss cattle using imputed whole-genome sequences. BMC
Genom. 18(1):910
47. Raven LA, Cocks BG, Kemper KE, Chamberlain AJ, Vander Jagt CJ, et al. 2016. Targeted imputation of
sequence variants and gene expression profiling identifies twelve candidate genes associated with lactation
volume, composition and calving interval in dairy cattle. Mamm. Genome 27(1–2):81–97
48. Moore SG, Pryce JE, Hayes BJ, Chamberlain AJ, Kemper KE, et al. 2016. Differentially expressed genes
in endometrium and corpus luteum of Holstein cows selected for high and low fertility are enriched for
sequence variants associated with fertility. Biol. Reprod. 94(1):19
49. Chamberlain AJ, Vander Jagt CJ, Hayes BJ, Khansefid M, Marett LC, et al. 2015. Extensive variation
between tissues in allele specific expression in an outbred mammal. BMC Genom. 16:993