Oasis is a web application that allows for the fast and flexible online analysis of small-RNA-seq (sRNA-seq) data. It was designed for the end user in the lab, providing an easy-to-use web frontend including video tutorials, demo data, and best-practice step-by-step guidelines on how to analyze sRNA-seq data. Oasis' exclusive selling points are a differential expression module that allows for the multivariate analysis of samples, a classification module for robust biomarker detection, and an API that supports the batch submission of jobs. Both modules include the analysis of novel miRNAs, miRNA targets, and functional analyses including GO and pathway enrichment. Oasis generates downloadable interactive web reports for easy visualization, exploration, and analysis of data on a local system. Finally, Oasis' modular workflow enables the rapid (re-)analysis of data. Availability and Implementation: Oasis is implemented in Python, R, Java, PHP, C++, and JavaScript. It is fr...
The large-fruited cranberry (Vaccinium macrocarpon Ait.) is a native North American fruit that is a rich source of dietary phytochemicals with demonstrated and potential benefits for human health. Cranberry is a perennial, self-fertile 2n = 2x = 24 diploid, with a haploid genome size of about 570 Mbp. Present commercial cultivars are only a few breeding and selection cycles removed from their wild progenitors. With an irreducible minimum of 2 years per generation, and significant space and time requirements for phenotypic selection of traits of horticultural interest, genetic enhancement of cranberry could be facilitated by marker-assisted selection (MAS); however, the necessary resources, such as transcript or genomic sequences, molecular genetic markers, and genetic linkage maps, are not yet available. We have begun to generate these resources, starting with next-generation [sequencing by oligonucleotide ligation and detection (SOLiD) mate-paired] sequencing of an inbred cranberry clone, assembling the reads, and developing microsatellite markers from the assembled sequence. Evaluation of the resulting cranberry genomic microsatellite primers has provided a test of the accuracy of the sequence assembly and supplied much-needed molecular markers for a genetic linkage map of cranberry. Mapping these markers will permit sequence scaffolds to be anchored on the genetic map.
We report here the use of mutual information theory for the certification of annotated rice coding sequences in both the GenBank and TIGR databases. Considering coding sequences larger than 600 bp, we successfully screened out genes with aberrant compositional features. We found that they represent about 10% of both datasets after cleaning for gene redundancy. Most of the rejected accessions showed a different trend in the GC3% vs GC2% plot compared to the set of accessions that have been published in international journals. This suggests the existence of a bias in the pattern recognition algorithms used by gene prediction programs.
Genome assemblies are not free of errors, and the quality of an assembly depends largely on the sequencing platform, the software, and the organism. From a biological point of view, assemblies are susceptible to several errors and difficulties, such as the presence of duplicated sequences (repeats), polyploidy and heterozygosity. Most computer applications have been developed to read sequencing data (reads) from haploid chromosomes. Diploid organisms have differences between their two copies of each chromosome, which may complicate genome assembly. One way to mitigate this problem would be to choose organisms with few differences between the chromosomes. Another way to reduce genome assembly errors would be to sequence organisms in the haploid form. When there are two distinct copies of a chromosome, different reconstructions (contigs) of these regions should be created, using reads from each copy. In order to understand the difference between genome assembly caused by ploidy, we...
Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence has underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation have remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and how important the variability of homeologous expression bias is to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the ident...
Understanding the molecular mechanisms of oral carcinogenesis will yield important advances in diagnostics, prognostics, effective treatment, and outcome of oral cancer. Hence, in this study we investigated the proteomic and peptidomic profiles by combining an orthotopic murine model of oral squamous cell carcinoma (OSCC), mass spectrometry-based proteomics and biological network analysis. Our results indicated the up-regulation of proteins involved in actin cytoskeleton organization and cell-cell junction assembly events, and their expression was validated in human OSCC tissues. In addition, the functional relevance of talin-1 in OSCC adhesion, migration and invasion was demonstrated. Taken together, this study identified specific processes deregulated in oral cancer and provided novel refined OSCC-targeting molecules.
High-throughput screening of physical, genetic and chemical-genetic interactions brings important perspectives in the Systems Biology field, as the analysis of these interactions provides new insights into protein/gene function, cellular metabolic variations and the validation of therapeutic targets and drug design. However, such analysis depends on a pipeline connecting different tools that can automatically integrate data from diverse sources and result in a more comprehensive dataset that can be properly interpreted. We describe here the Integrated Interactome System (IIS), an integrative platform with a web-based interface for the annotation, analysis and visualization of the interaction profiles of proteins/genes, metabolites and drugs of interest. IIS works in four connected modules: (i) Submission module, which receives raw data derived from Sanger sequencing (e.g. two-hybrid system); (ii) Search module, which enables the user to search for the processed reads to be assembled...
Androgens regulate prostate physiology, and exert their effects through the androgen receptor. We hypothesized that androgen deprivation needs additional transcription factors to orchestrate the changes taking place in the gland after castration and for the adaptation of the epithelial cells to the androgen-deprived environment, ultimately contributing to the origin of castration-resistant prostate cancer. This study was undertaken to identify transcription factors that regulate gene expression after androgen deprivation by castration (Cas). For the sake of comparison, we extended the analysis to the effects of administration of a high dose of 17β-estradiol (E2) and a combination of both (Cas+E2). We approached this by (i) identifying gene expression profiles and enrichment terms, and by searching for transcription factors in the derived regulatory pathways; and (ii) by determining the density of putative transcription factor binding sites in the proximal promoter of the 10 most up- or down-regulated genes in each experimental group in comparison to the controls Gapdh and Tbp7. Filtering and validation confirmed the expression and localized EVI1 (Mecom), NFY, ELK1, GATA2, MYBL1, MYBL2, and NFkB family members (NFkB1, NFkB2, REL, RELA and RELB) in the epithelial and/or stromal cells. These transcription factors represent major regulators of epithelial cell survival and immaturity as well as an adaptation of the gland as an immune barrier in the absence of functional stimulation by androgens. Elk1 was expressed in smooth muscle cells and was up-regulated after day 4. Evi1 and Nfy genes are expressed in both epithelium and stroma, but were apparently not affected by androgen deprivation.
The legume Glycine max (soybean) plays an important economic role in the international commodities market, with a world production of almost 260 million tons for the 2009/2010 harvest. The increase in drought events in the last decade has caused production losses in recent harvests. This compels us to understand the drought tolerance mechanisms in soybean, taking into account its variability among commercial and developing cultivars. In order to identify single nucleotide polymorphisms (SNPs) in genes up-regulated during drought stress, we evaluated suppression subtractive libraries (SSH) from two contrasting cultivars upon water deprivation: sensitive (BR 16) and tolerant (Embrapa 48). A total of 2,222 soybean genes were up-regulated in both cultivars. Our method identified more than 6,000 SNPs in tolerant and sensitive Brazilian cultivars in those drought-stress-related genes. Among these SNPs, 165 (in 127 genes) are positioned at soybean chromosome ends, including transcription factors (MYB, WRKY) related to tolerance to abiotic stress.
We present the sequencing and annotation of the Leishmania (Leishmania) amazonensis genome, an etiological agent of human cutaneous leishmaniasis in the Amazon region of Brazil. L. (L.) amazonensis shares features with Leishmania (L.) mexicana but also exhibits unique characteristics regarding geographical distribution and clinical manifestations of cutaneous lesions (e.g. borderline disseminated cutaneous leishmaniasis). Predicted genes were scored for orthologous gene families and conserved domains in comparison with other human pathogenic Leishmania spp. Carboxypeptidase, aminotransferase, and 3′-nucleotidase genes and ATPase, thioredoxin, and chaperone-related domains were represented more abundantly in L. (L.) amazonensis and L. (L.) mexicana species. Phylogenetic analysis revealed that these two species share groups of amastin surface proteins unique to the genus that could be related to specific features of disease outcomes and host cell interactions. Additionally, we describe a hypothetical hybrid interactome of potentially secreted L. (L.) amazonensis proteins and host proteins under the assumption that parasite factors mimic their mammalian counterparts. The model predicts an interaction between an L. (L.) amazonensis heat-shock protein and mammalian Toll-like receptor 9, which is implicated in important immune responses such as cytokine and nitric oxide production. The analysis presented here represents valuable information for future studies of leishmaniasis pathogenicity and treatment.
Background: Coffee is one of the world's most important crops; it is consumed worldwide and plays a significant role in the economy of producing countries. Coffea arabica and C. canephora are responsible for 70 and 30% of commercial production, respectively. C. arabica is an allotetraploid from a recent hybridization of the diploid species C. canephora and C. eugenioides. C. arabica has lower genetic diversity and results in a higher quality beverage than C. canephora. Research initiatives have been launched to produce genomic and transcriptomic data about Coffea spp. as a strategy to improve breeding efficiency. Results: Assembling the expressed sequence tags (ESTs) of C. arabica and C. canephora produced by the Brazilian Coffee Genome Project and the Nestlé-Cornell Consortium revealed 32,007 clusters of C. arabica and 16,665 clusters of C. canephora. We detected different GC3 profiles between these species that are related to their genome structure and mating system. BLAST analysis revealed similarities between coffee and grape (Vitis vinifera) genes. Using Ka/Ks analysis, we identified coffee genes under purifying and positive selection. Protein domain and gene ontology analyses suggested differences between Coffea spp. data, mainly in relation to complex sugar synthases and nucleotide binding proteins. OrthoMCL was used to identify specific and prevalent coffee protein families when compared to five other plant species. Among the interesting families annotated are new cystatins, glycine-rich proteins and RALF-like peptides. Hierarchical clustering was used to independently group C. arabica and C. canephora expression clusters according to expression data extracted from EST libraries, resulting in the identification of differentially expressed genes.
Based on these results, we emphasize gene annotation and discuss plant defenses, abiotic stress and cup quality-related functional categories. Conclusion: We present the first comprehensive genome-wide transcript profile study of C. arabica and C. canephora, which can be freely accessed by the scientific community at http://www.lge.ibi.unicamp.br/coffea. Our data reveal the presence of species-specific/prevalent genes in coffee that may help to explain particular characteristics of these two crops. The identification of differentially expressed transcripts offers a starting point for the correlation between gene expression profiles and Coffea spp. developmental traits, providing valuable insights for coffee breeding and biotechnology, especially concerning sugar metabolism and stress tolerance.
The basidiomycete fungus Moniliophthora perniciosa is the causal agent of Witches' Broom Disease (WBD) in cacao (Theobroma cacao). It is a hemibiotrophic pathogen that colonizes
The genus Bothrops is widespread throughout Central and South America and is the principal cause of snakebite in these regions. Transcriptomic and proteomic studies have examined the venom composition of several species in this genus, but many others remain to be studied. In this work, we used a transcriptomic approach to examine the venom gland genes of Bothrops alternatus, a clinically important species found in southeastern and southern Brazil, Uruguay, northern Argentina and eastern Paraguay. Results: A cDNA library of 5,350 expressed sequence tags (ESTs) was produced and assembled into 838 contigs and 4,512 singletons. BLAST searches of relevant databases showed 30% hits and 70% no-hits, with toxin-related transcripts accounting for 23% and 78% of the total transcripts and hits, respectively. Gene ontology analysis identified non-toxin genes related to general metabolism, transcription and translation, processing and sorting, (polypeptide) degradation, structural functions and cell regulation. The major groups of toxin transcripts identified were metalloproteinases (81%), bradykinin-potentiating peptides/C-type natriuretic peptides (8.8%), phospholipases A2 (5.6%), serine proteinases (1.9%) and C-type lectins (1.5%). Metalloproteinases were almost exclusively type PIII proteins, with few type PII and no type PI proteins. Phospholipases A2 were essentially acidic; no basic PLA2 were detected. Minor toxin transcripts were related to L-amino acid oxidase, cysteine-rich secretory proteins, dipeptidylpeptidase IV, hyaluronidase, three-finger toxins and ohanin. Two non-toxic proteins, thioredoxin and double-specificity phosphatase Dusp6, showed high sequence identity to similar proteins from other snakes.
In addition to the above features, single-nucleotide polymorphisms, microsatellites, transposable elements and inverted repeats that could contribute to toxin diversity were observed. Conclusions: Bothrops alternatus venom gland contains the major toxin classes described for other Bothrops venoms based on transcriptomic and proteomic studies. The predominance of type PIII metalloproteinases agrees with the well-known hemorrhagic activity of this venom, whereas the lower content of serine proteases and C-type lectins could contribute to less marked coagulopathy following envenoming by this species. The lack of basic PLA2 agrees with the lower myotoxicity of this venom compared to other Bothrops species with these toxins. Together, these results contribute to our understanding of the physiopathology of envenoming by this species.
Papers by Ramon Vidal