BTX 285
doi: 10.1093/bioinformatics/btx285
Advance Access Publication Date: 4 May 2017
Applications Note
Sequence analysis
Abstract
Summary: We have implemented the molecular design laboratory’s antimicrobial peptides package
(modlAMP), a Python-based software package for the design, classification and visual
representation of peptide data. modlAMP offers functions for molecular descriptor calculation and the
retrieval of amino acid sequences from public or local sequence databases, and provides instant
access to precompiled datasets for machine learning. The package also contains methods for the
analysis and representation of circular dichroism spectra.
Availability and Implementation: The modlAMP Python package is available under the BSD license
from URL http://doi.org/10.5905/ethz-1007-72 or via pip from the Python Package Index (PyPI).
Contact: gisbert.schneider@pharma.ethz.ch
Supplementary information: Supplementary data are available at Bioinformatics online.
1 Introduction
The interest in membranolytic antimicrobial peptides (AMPs) has constantly increased over the
last decade (Fjell et al., 2012). Research foci have shifted from isolating natural AMPs towards the
computer-assisted design of synthetic analogues and mimetics with improved properties
(Jenssen et al., 2008; Juretic et al., 2017). Several successful examples of computationally de novo
generated AMPs have been reported (Maccari et al., 2013; Müller et al., 2016), together with new
online AMP prediction tools (Waghu et al., 2014; Wang et al., 2016). However, to this point one had to
connect descriptor calculation, activity prediction and analysis tools through custom scripts,
which requires skills in different programming languages and environments. We here present
modlAMP, a Python package to ease the discovery and design of novel synthetic AMPs via the
amalgamation of sequence generation, descriptor calculation, machine learning and data analysis
into a single programming environment. modlAMP provides functions for calculating a variety of
different molecular properties and amino acid residue-based peptide descriptors. Furthermore, it
enables the in silico generation of bespoke peptide libraries with desired properties. The package
design follows a modular, object-oriented architecture. The functions used by most of the methods
are located in the core module and accessed by other modules through local import. We deliberately
kept the number of objects small, and relied on numpy (van der Walt et al., 2011) arrays and pandas
(McKinney, 2010) data frames, where possible. We implemented unit testing to ensure high code
quality. The package comes with detailed online documentation (URL
https://pythonhosted.org/modlamp), including elaborate examples, that demonstrate the use of the
various data classes and analysis methods. A sample script showcases a machine learning workflow
for classifying AMPs versus other peptides.
2 Package description
The modlAMP package currently consists of nine modules:
1. modlamp.descriptors – molecular descriptor calculations
2. modlamp.sequences – in silico sequence design
3. modlamp.database – queries to peptide databases
4. modlamp.datasets – precompiled classification datasets
5. modlamp.plot – visualization tools
6. modlamp.ml – machine learning models and functions
7. modlamp.wetlab – interpretation of experimental data
8. modlamp.analysis – comparison of sequence libraries
9. modlamp.core – helper functions and parent classes
2.1 Descriptor calculation
The two main classes provided by the descriptors module are GlobalDescriptor and
PeptideDescriptor. The available
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
2754 A.T.Müller et al.
Table 1. Amino acid property scales available for descriptor calculation through Moreau-Broto type correlation with the
PeptideDescriptor class
Optionally, users can use their own, locally saved amino acid property scales.
a Code-ID refers to the scalename input option of the PeptideDescriptor class.
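To illustrate what a Moreau-Broto type correlation over an amino acid property scale computes, the following is a minimal, self-contained sketch and not modlAMP's own implementation. The function name moreau_broto_autocorr and the two-letter TOY_SCALE are hypothetical and chosen only for this example; the package may normalise scale values or lags differently. For a sequence of length N with property values P(i), the sketch averages the products P(i) * P(i + d) over all residue pairs that are d positions apart.

```python
def moreau_broto_autocorr(sequence, scale, max_lag):
    """Return Moreau-Broto autocorrelation values for lags 1..max_lag.

    AC(d) = 1 / (N - d) * sum_{i=1}^{N-d} P(i) * P(i + d),
    where P(i) is the scale value of residue i and N is the sequence length.
    """
    values = [scale[residue] for residue in sequence]
    n = len(values)
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be at least 1 and smaller than the sequence length")
    result = []
    for lag in range(1, max_lag + 1):
        # Multiply the property values of all residue pairs that are `lag` positions apart.
        products = [values[i] * values[i + lag] for i in range(n - lag)]
        # Average over the number of pairs so that values are comparable across lags.
        result.append(sum(products) / (n - lag))
    return result


# Hypothetical two-letter toy scale (an assumption, NOT one of the published scales in Table 1).
TOY_SCALE = {"A": 1.0, "K": -1.0}

print(moreau_broto_autocorr("AKAK", TOY_SCALE, 2))  # [-1.0, 1.0]
```

In this toy case the strictly alternating pattern yields a negative value at lag 1 (neighbouring residues have opposite properties) and a positive value at lag 2 (residues two positions apart are alike), which is the kind of periodicity such descriptors are meant to capture.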
property scales and the corresponding scalename option codes for
[Figure panel titles: GLFDIVKKVVGALGSL, GLFDIVKKVVGALGSL]
classes, and thereby informs the user about the model’s estimated uncertainty and applicability
domain. modlAMP relies on the scikit-learn package (Pedregosa et al., 2011), providing thoroughly
tested state-of-the-art implementations of machine learning and data preprocessing methods in
Python.
2.5 Circular dichroism spectral analysis
Secondary structure dynamics may be a major feature determining antimicrobial activity of certain
classes of AMPs. Initial laboratory experiments usually include circular dichroism (CD)
spectroscopy of peptides in different solvents. modlAMP contains the wetlab module for the
analysis of CD data (Supplementary Fig. S2) and signal
References
Cornette,J.L. et al. (1987) Hydrophobicity scales and computational techniques for detecting amphipathic structures in proteins. J. Mol. Biol., 195, 659–685.
Cortes,C. and Vapnik,V. (1995) Support-vector networks. Mach. Learn., 20, 273–297.
Eisenberg,D. et al. (1982) The helical hydrophobic moment: a measure of the amphiphilicity of a helix. Nature, 299, 371–374.
Fjell,C.D. et al. (2012) Designing antimicrobial peptides: form follows function. Nat. Rev. Drug Discov., 11, 37–51.
Hellberg,S. et al. (1987) Peptide quantitative structure-activity relationships, a multivariate approach. J. Med. Chem., 30, 1126–1135.
Hopp,T.P. and Woods,K.R. (1981) Prediction of protein antigenic determinants from amino acid sequences. Proc. Natl. Acad. Sci., 78, 3824–3828.
Jenssen,H. et al. (2008) QSAR modeling and computer-aided design of antimi-