Protein Alignment Algorithms
Here are some notes and links on protein alignment algorithms that I've found. See also the
Wikipedia page about sequence alignment.
Terms
Here are some terms that you might run across:
- Sequence Consensus
- a single sequence derived from a multiple sequence alignment that summarizes it, typically by taking the most common residue in each column.
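To make the idea of a consensus concrete, here is a minimal sketch that takes the most common residue in each column. The function name `consensus`, the toy alignment, and the choice to ignore gaps when counting are all assumptions for illustration; real tools usually apply thresholds or ambiguity codes instead of a plain majority vote.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Return a simple majority-rule consensus for equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    result = []
    # zip(*alignment) walks the alignment column by column.
    for column in zip(*alignment):
        # Assumption: gaps are ignored, so a residue can win a column
        # even when most sequences have a gap there.
        counts = Counter(residue for residue in column if residue != gap)
        if not counts:
            # A column made up only of gaps has no residue to report.
            result.append(gap)
            continue
        # Ties go to the residue seen first in the column.
        result.append(counts.most_common(1)[0][0])
    return "".join(result)


# Hypothetical toy alignment, not taken from any real protein family.
toy_alignment = [
    "MKV-LA",
    "MKILLA",
    "MRV-LG",
]
print(consensus(toy_alignment))
```

Note that the fourth column contributes a residue even though two of the three sequences have a gap there; counting gaps as a vote would be an equally reasonable choice.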
Multiple Alignment
Global Multiple Alignment
These algorithms take several proteins and align them all globally.
CLUSTALW
MUSCLE
MAFFT
- multiple "modes" of operation
- can incorporate information from local pairwise alignments into the global alignment
DIALIGN
- seems to be a hybrid of sorts; pieces together many local multiple alignments
- could possibly be used to extract a single motif...
T-COFFEE
POA
PileUp
And many others...
Local Multiple Alignment (Motif Discovery)
There are two kinds of motifs, gapped and ungapped. There appear to be far fewer algorithms for gapped motifs than for ungapped ones.
Ungapped
Gapped
Still looking...
Papers
Pairwise Alignment
Lots of stuff out there to do this. I'm only interested in finding motif discovery algorithms at present.
Profiles
Profiles (or position-specific scoring matrices) are constructed from multiple sequence alignments. These alignments can have gaps. Below are some of the programs that can build profiles and align sequences to profiles.
Builders
HmmerBuild
- builds a hidden Markov model profile from a multiple sequence alignment
ProfileMake
Aligners
ProfileGap
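As a rough illustration of what a profile holds, here is a minimal sketch that turns a gapped multiple alignment into a log-odds position-specific scoring matrix and scores an ungapped segment against it. This is not how ProfileMake or HmmerBuild work internally. The function names `build_pssm` and `score`, the uniform background frequency, the pseudocount of 1, and the toy alignment are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build a log-odds profile: one dict of residue -> score per column."""
    # Assumption: every amino acid is equally likely in the background.
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        # Gaps and any non-standard symbols are simply left out of the counts.
        counts = Counter(r for r in column if r in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, segment):
    """Sum the column scores for an ungapped segment as long as the profile."""
    if len(segment) != len(pssm):
        raise ValueError("segment length must match the profile length")
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical toy alignment, not taken from any real protein family.
toy_alignment = [
    "MKV-LA",
    "MKILLA",
    "MRV-LG",
]
profile = build_pssm(toy_alignment)
print(score(profile, "MKVLLA") > score(profile, "WWWWWW"))
```

Residues that are common in a column get positive scores and rare ones get negative scores, so a segment resembling the alignment scores higher than an unrelated one. Aligning a sequence with gaps against a profile, which is presumably what a tool like ProfileGap does, needs dynamic programming on top of these column scores.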
Application Suites
There are a few packaged suites that include several tools related to protein sequence searching and alignment.
GCG
- contains over 140 programs for assorted sequence analysis tasks
HMMER
- contains several programs that deal with HMM profiles
Unevaluated
Here are the names of other algorithms that I haven't had a chance to look at yet.