Recent Progress In Multiple Sequence Alignment: A Survey.

Cédric Notredame
Published 2002 · Medicine, Biology

The assembly of a multiple sequence alignment (MSA) has become one of the most common tasks in sequence analysis. Unfortunately, the wide range of available methods and the differences in the results given by these methods make it hard for a non-specialist to decide which program is best suited for a given purpose. In this review we briefly describe existing techniques and expose the potential strengths and weaknesses of the most widely used multiple alignment packages.
This paper references
10.1016/0022-2836(87)90316-0
A strategy for the rapid multiple alignment of protein sequences. Confidence levels from tertiary structure comparisons.
G. Barton (1987)
10.1016/0022-2836(70)90057-4
A general method applicable to the search for similarities in the amino acid sequence of two proteins.
S. B. Needleman (1970)
TCoffee: A novel algorithm for multiple sequence alignment
C Notredame (2000)
10.1146/ANNUREV.BB.20.060191.001135
Statistical methods and insights for protein and DNA sequences.
S. Karlin (1991)
10.1109/CEC.2000.870716
Evolutionary computation techniques for multiple sequence alignment
Liming Cai (2000)
Gapped BLAST and PSI-BLAST: A new
D. Lipman (1997)
10.1016/0076-6879(90)83038-B
Maximum likelihood methods.
N. Saitou (1990)
10.1093/NAR/25.22.4570
RAGA: RNA sequence alignment by genetic algorithm.
C. Notredame (1997)
10.1093/bioinformatics/13.3.249
Match-Box_server: a multiple sequence alignment tool placing emphasis on reliability
E. Depiereux (1997)
10.1093/bioinformatics/13.6.565
A genetic algorithm for multiple molecular sequence alignment
Ching Zhang (1997)
10.1007/BF02603120
Progressive sequence alignment as a prerequisite to correct phylogenetic trees
Da-Fei Feng (1987)
10.1137/0149012
Trees, stars, and multiple biological sequence alignment
Stephen F. Altschul (1989)
10.1017/CBO9780511790492.001
Biological sequence analysis: Preface
R. Durbin (1998)
• One of the first attempts to apply genetic algorithms to sequence analysis
10.1016/0076-6879(90)83011-W
Profile analysis
M. Gribskov (1990)
10.1016/S0079-6603(08)60348-7
Comparative anatomy of 16-S-like ribosomal RNA.
R. Gutell (1985)
10.1016/0022-2836(89)90234-9
Weights for data related by a tree.
S. Altschul (1989)
10.1093/bioinformatics/10.1.53
PHD - an automatic mail server for protein secondary structure prediction
B. Rost (1994)
•• One of the most comprehensive textbooks on the algorithms dedicated to sequence analysis
10.1016/S0097-8485(99)00012-1
Two Strategies for Sequence Comparison: Profile-preprocessed and Secondary Structure-induced Multiple Alignment
J. Heringa (1999)
10.1111/j.1558-5646.1985.tb00420.x
CONFIDENCE LIMITS ON PHYLOGENIES: AN APPROACH USING THE BOOTSTRAP
J. Felsenstein (1985)
10.1002/PROT.340170407
A method to recognize distant repeats in protein sequences
J. Heringa (1993)
10.1093/bioinformatics/14.3.290
DIALIGN: finding local similarities by multiple sequence alignment
B. Morgenstern (1998)
10.1093/nar/gkh121
The Pfam protein families database
A. Bateman (2004)
10.1093/bioinformatics/9.3.267
Multiple sequence alignment by parallel simulated annealing
M. Ishikawa (1993)
10.1016/0378-1119(88)90330-7
CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
D. Higgins (1988)
10.1006/JMBI.2000.4042
T-Coffee: A novel method for fast and accurate multiple sequence alignment.
C. Notredame (2000)
10.1117/12.304857
Multiple protein sequence comparison by genetic algorithms
R. R. Gonzalez (1998)
10.1109/CEC.1999.781958
Multiple sequence alignment using evolutionary programming
K. Chellapilla (1999)
10.1093/NAR/16.22.10881
Multiple sequence alignment with hierarchical clustering.
F. Corpet (1988)
10.1093/nar/28.1.225
PRINTS-S: the database formerly known as PRINTS
T. Attwood (2000)
10.1006/JSBI.2001.4336
Review: protein secondary structure prediction continues to rise.
B. Rost (2001)
10.1006/JMBI.1999.3091
Protein secondary structure prediction based on position-specific scoring matrices.
D. Jones (1999)
10.1016/S0076-6879(96)66034-0
GOR method for predicting protein secondary structure from amino acid sequence.
J. Garnier (1996)
Multiple Alignment Using Hidden Markov Models
S. Eddy (1995)
10.1063/1.1699114
Equation of state calculations by fast computing machines
N. Metropolis (1953)
10.1093/nar/29.1.55
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3
Sabine Dietmann (2001)
10.1016/S0097-8485(00)80003-0
Multiple Protein Sequence Alignment using Double-dynamic Programming
W. Taylor (2000)
10.1093/bioinformatics/7.4.479
A novel randomized iterative strategy for aligning multiple protein sequences
M. Berger (1991)
10.1002/PRO.5560030118
Improving the sensitivity of the sequence profile method
R. Lüthy (1994)
10.1093/bioinformatics/16.9.808
An iterative method for faster sum-of-pairs multiple sequence alignment
K. Reinert (2000)
10.1109/HICSS.1993.270611
Protein modeling using hidden Markov models: analysis of globins
D. Haussler (1993)
10.1093/NAR/27.13.2682
A comprehensive comparison of multiple sequence alignment programs
J. Thompson (1999)
10.1093/bioinformatics/10.4.379
Further improvement in methods of group-to-group sequence alignment with generalized profile operations
O. Gotoh (1994)
10.1007/BFb0029800
The Maximum Weight Trace Problem in Multiple Sequence Alignment
J. Kececioglu (1993)
10.1145/225298.225326
On genetic algorithms
E. Baum (1995)
10.1016/S0022-2836(05)80006-3
Sequence alignment and penalty choice. Review of concepts, case studies and implications.
M. Vingron (1994)
10.1093/bioinformatics/15.2.122
Combining many multiple alignments in one improved alignment
K. Bucka-Lassen (1999)
10.1093/bioinformatics/15.3.211
DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment
B. Morgenstern (1999)
•• The most widely used method for making multiple sequence alignments
10.1126/SCIENCE.8211139
Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment.
C. Lawrence (1993)
10.1006/JMBI.1994.1104
Hidden Markov models in computational biology. Applications to protein modeling.
A. Krogh (1994)
•• The first method described that does not require arbitrary gap penalties
10.1093/nar/12.1Part1.387
A comprehensive set of sequence analysis programs for the VAX
J. Devereux (1984)
10.1006/JMBI.1996.0679
Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments.
O. Gotoh (1996)
10.1073/PNAS.93.22.12098
Multiple DNA and protein sequence alignment based on segment-to-segment comparison.
Bernd Morgenstern (1996)
10.1089/cmb.1994.1.337
On the Complexity of Multiple Sequence Alignment
L. Wang (1994)
Playing with blocks: some pitfalls of forcing multiple alignments.
S. Henikoff (1991)
A symmetric-iterated method for the multiple alignment of protein sequences
Luciano Brocchieri (1998)
CLUSTAL W: improving the sensitivity of progressive multiple alignment through sequence weighting
J. D. Thompson (1994)
10.1093/bioinformatics/10.4.419
Multiple sequence alignment using simulated annealing
J. Kim (1994)
A model of evolutionary change in proteins
M. O. Dayhoff (1968)
10.1073/PNAS.91.3.1059
Hidden Markov models of biological primary sequence information.
P. Baldi (1994)
Multiple alignments for structural, functional, or phylogenetic analyses of homologous sequences
L. Duret (2000)
10.1016/0196-8858(91)90017-D
A time-efficient, linear-space local similarity algorithm
X. Huang (1991)
10.1089/106652701446152
Structure Comparison and Structure Patterns
I. Eidhammer (2000)
10.1016/0022-2836(91)90871-3
Motif recognition and alignment for many sequences by comparison of dot-matrices.
M. Vingron (1991)
10.1093/OXFORDJOURNALS.MOLBEV.A040454
The neighbor-joining method: a new method for reconstructing phylogenetic trees.
N. Saitou (1987)
10.1007/3-540-48873-1_18
Multiple Sequence Alignment Using Parallel Genetic Algorithms
L. Anbarasu (1998)
GCG package
J Devereux (1984)
10.1137/0148063
The multiple sequence alignment problem in biology
H. Carrillo (1988)
10.1093/nar/27.1.215
The PROSITE database, its status in 1999
K. Hofmann (1999)
10.1002/PROT.340180402
Correlated mutations and residue contacts in proteins
U. Goebel (1994)
10.1093/NAR/30.1.235
The PROSITE database, its status in 2002
L. Falquet (2002)
Adaptation in natural and artificial systems
J. Holland (1975)
10.1093/PROTEIN/14.4.227
An approach to improving multiple alignments of protein sequences using predicted secondary structure.
A. Jennings (2001)
10.1089/10665270050081513
Evaluation Measures of Multiple Sequence Alignments
G. Gonnet (2000)
T-Coffee: A novel algorithm for multiple sequence alignment
C Notredame (2000)
• The first description of the progressive algorithm
10.1093/bioinformatics/12.2.95
Hidden Markov models for sequence analysis: extension and analysis of the basic method
R. Hughey (1996)
10.1093/bioinformatics/14.5.407
COFFEE: an objective function for multiple sequence alignments
C. Notredame (1998)
10.5860/choice.27-0936
Genetic Algorithms in Search Optimization and Machine Learning
D. Goldberg (1988)
10.1093/bioinformatics/13.6.625
DCA: an efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment
J. Stoye (1997)
10.1016/S0022-5193(05)80263-2
A rapid method of protein structure alignment.
C. Orengo (1990)
10.1002/PROT.340090304
A workbench for multiple alignment construction and analysis
G. Schuler (1991)
10.1093/NAR/24.8.1515
SAGA: sequence alignment by genetic algorithm.
C. Notredame (1996)
10.1093/nar/25.1.217
The PROSITE database, its status in 1997
A. Bairoch (1997)
10.1093/bioinformatics/17.4.373
Mocca: semi-automatic method for domain hunting
C. Notredame (2001)
10.1016/S0022-2836(05)80360-2
Basic local alignment search tool.
S. Altschul (1990)
10.1136/bmj.3.5720.417-a
Profile analysis.
M. Gribskov (1990)
10.1093/bioinformatics/15.3.203
An exact solution for the Segment-to-Segment multiple sequence alignment problem
K. Reinert (1998)
10.1093/bioinformatics/15.7.563
Identifying DNA and protein patterns with statistically significant alignments of multiple sequences
G. Hertz (1999)
The alignment of sets of sequences and the construction of phylogenetic trees
P. Hogeweg (1984)
10.1093/bioinformatics/17.8.713
Evaluation of protein multiple alignments by SAM-T99 using the BAliBASE multiple alignment test set
K. Karplus (2001)



This paper is referenced by
Alinhamento múltiplo de proteínas utilizando algoritmos genéticos [Multiple protein alignment using genetic algorithms]
Sérgio Jeferson Rafael Ordine (2015)
A novel method for multiple alignment of sequences with repeated and shuffled elements
B. Raphael (2004)
10.17485/IJST/2016/V9I2/84236
Multiple Sequence Alignment based on Developed Genetic Algorithm
H. N. Kaghed (2016)
10.1007/s11265-008-0270-y
Finding and Extracting Data Records from Web Pages
M. Álvarez (2010)
Multiple Sequence Alignment Using MATLAB
Meghna Mathur (2007)
10.1515/psr-2019-0098
A computer-based approach for developing linamarase inhibitory agents
L. Paúl (2020)
10.1007/s10579-004-8682-1
Collating Texts Using Progressive Multiple Alignment
M. Spencer (2004)
10.1007/11596448_121
Simulated Annealing with Injecting Star-Alignment for Multiple Sequence Alignments
H. Huo (2005)
Optimizing Multiple Sequence Alignments using Traveling Salesman Problem and Order-based Evolutionary Algorithms
Diana Banda Tapia (2012)
Parallel Three-sequence Alignment with Space-efficient
C. Huang (2006)
10.1186/1471-2105-9-212
Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework
Kazutaka Katoh (2007)
10.1186/1471-2105-7-524
Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost
S. Yamada (2006)
10.1109/TCBB.2006.53
Faster Algorithms for Optimal Multiple Sequence Alignment Based on Pairwise Comparisons
Yonatan Bilu (2006)
A HIDDEN MARKOV MODEL FOR COLLABORATIVE FILTERING
D. Tepper (2012)
10.21474/IJAR01/269
Multiple Sequence Alignment in Bioinformatics.
Harshita Badlani (2016)
10.1007/978-3-540-77092-3_41
Finding and Extracting Data Records from Web Pages
M. Álvarez (2007)
10.1109/BIBM.2007.40
Improved Methods for Template-Matching in Electron-Density Maps Using Spherical Harmonics
F. Dimaio (2007)
Multiple Sequence Alignment Using Optimization Algorithms
M. Omar (2007)
Comparison of Methods Used for Aligning Protein Sequences
Sangeetha Madangopal (2006)
10.1089/106652703322756096
An Eulerian Path Approach to Global Multiple Alignment for DNA Sequences
Y. Zhang (2003)
A* Algorithms for the Constrained Multiple Sequence Alignment Problem
Dan He (2006)
Microarchitecture Characteristics and Implications of Alignment of Multiple Bioinformatics Sequences
Y. Li (2006)
10.1007/11732242_13
A Methodology for Determining Amino-Acid Substitution Matrices from Set Covers
Alexandre H. L. Porto (2006)
Semantic Similarity Based Data Alignment Using Ontology and Swarm Intelligence Based Annotating Search Results from Web Databases
T. Seeniselvi (2014)
10.1109/BIBE.2006.253324
An improved algorithm for the regular expression constrained multiple sequence alignment problem
A. Arslan (2006)
PARALLEL-TCOFFEE: A parallel multiple sequence aligner
J. Zola (2007)
10.1142/9781860948732_0026
MANGO: a new approach to multiple sequence alignment.
Zefeng Zhang (2007)
10.1093/database/bar009
UniProt Knowledgebase: a hub of integrated protein data
M. Magrane (2011)
10.1007/s10877-005-0680-3
Randomized And Parallel Algorithms For Distance Matrix Calculations In Multiple Sequence Alignment
S. Rajasekaran (2005)
10.1007/978-3-642-29749-6_2
Creating Declarative Process Models Using Test Driven Modeling Suite
S. Zugal (2011)
10.1093/bib/bbn013
Recent developments in the MAFFT multiple sequence alignment program
Kazutaka Katoh (2008)
10.1016/j.compbiolchem.2003.09.002
The PRALINE online server: optimising progressive multiple alignment on the web
V. Simossis (2003)