Computational Biology and Chemistry 75 (2018) 39–44
Research Article

An approach for COFFEE objective function to global DNA multiple
sequence alignment

Anderson Rici Amorim, Leandro Alves Neves, Carlos Roberto Valêncio,
Guilherme Freire Roberto, Geraldo Francisco Donegá Zafalon*
Department of Computer Science and Statistics, São Paulo State University, Rua Cristovão Colombo 2265, São José do Rio Preto, São Paulo, Brazil

ARTICLE INFO

Article history:
Received 14 September 2016
Received in revised form 29 March 2018
Accepted 20 April 2018
Available online 25 April 2018

MSC:
00-01
99-00

Keywords:
Multiple sequence alignment
Genetic Algorithm
Optimization

ABSTRACT

Multiple sequence alignment (MSA) is one of the most important tasks in bioinformatics; it can be
used to predict the structure or function of unknown proteins and to reconstruct phylogenetic
trees. Many heuristics exist to perform multiple sequence alignment, such as Progressive
Alignment, Ant Colony and Genetic Algorithms, among others. Over the years, several tools have been
proposed to perform MSA, and MSA-GA is one of them. MSA-GA is a tool based on a Genetic Algorithm,
and its results are generally better than those of other well-known tools in bioinformatics, such
as Clustal W. The COFFEE objective function was implemented in MSA-GA in order to allow it to
produce better alignments for less similar sets of protein sequences. Nonetheless, the COFFEE
objective function is not suited to the multiple sequence alignment of nucleotides. Thus, we have
modified the COFFEE objective function, previously implemented in MSA-GA, so that it also obtains
better results for nucleotide sequences. For the test cases from BAliBase and BRAliBase, our
approach achieved better results than standard COFFEE in all cases and better results than the
Weighted Sum-of-Pairs (WSP) function in most cases. Moreover, our results are more reliable
because they show smaller standard deviations.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

Due to the growth of the genomic data currently available, multiple
sequence alignment has been considered one of the most
important tasks in bioinformatics. It plays an important role in
sequence analysis, such as the prediction of the functions and
structures of unknown proteins (Pei and Grishin, 2014; Li et al., 2015), viral genome
decoding (Greive et al., 2016) and disease studies (Jordan et al., 2015),
among others.

The optimal alignment of given sequences can be obtained deterministically
by dynamic programming algorithms, such as
Needleman–Wunsch (Needleman and Wunsch, 1970) or Smith–Waterman
(Smith and Waterman, 1981). The former is a well-known
algorithm for global sequence alignment, and the latter
is used for local sequence alignment. However, these
algorithms were designed to align pairs of sequences and,
due to their computational cost, it is infeasible to use them to produce an alignment
of three or more sequences (Wang and Jiang, 1994).
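
To make the cost argument concrete, the following is a minimal sketch of the Needleman–Wunsch recurrence for the score of a global pairwise alignment. The scoring values (match, mismatch and a linear gap penalty) are illustrative assumptions and are not taken from any tool discussed in this work.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b."""
    # previous[j] is the best score for aligning the processed prefix of a with b[:j]
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


print(needleman_wunsch_score("ACGT", "ACT"))
```

For two sequences of lengths m and n the recurrence fills (m+1)(n+1) cells. Extending the same table to N sequences of length L requires on the order of L^N cells, which is why heuristics are used for multiple alignment.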

* Corresponding author. E-mail address: geraldo.zafalon@unesp.br (G.F.D. Zafalon).

Thus, to mitigate this high computational cost, heuristic multiple
sequence alignment algorithms were proposed. These programs can
be based on many heuristics, such as Ant Colony (Lee et al., 2008),
Simulated Annealing (Yao et al., 2015), Progressive Alignment
(Sievers and Higgins, 2014) and Genetic Algorithms (GA) (Zhu et al.,
2016), among others.

A GA can be used to solve multiple sequence
alignment problems through an approach inspired by the theory of
evolution (Yadav and Banka, 2016). In this approach, the individuals of a
population of candidate solutions are subjected to mutation, recombination
and selection, and the biological significance of each individual
is measured by an objective function (Notredame
and Higgins, 1996).

The first tool developed to solve multiple sequence
alignment problems using a GA was SAGA (Sequence Alignment
by Genetic Algorithm) (Notredame and Higgins, 1996). It has a
complex set of 22 operators, including mutation and
recombination operators, which are selected by an automatic scheduler.
However, some studies have shown that the complexity of
SAGA is unnecessary and that the automatic scheduler does not
improve the final quality of the alignments when compared to
a uniform selection of operators (Thomsen and Boomsma,
2004).

Another tool that uses a GA for the multiple sequence
alignment of amino acid and nucleotide sequences is MSA-GA
(Gondro and Kinghorn, 2007). This tool is more efficient in solving
global multiple sequence alignment problems because of its
simpler computational approach. In addition, it produces better
results when compared with other well-known tools in bioinformatics,
such as Clustal W (Thompson et al., 1994).

However, the default objective function used by MSA-GA is the
Weighted Sum-of-Pairs (WSP). When this objective function
evaluates less similar sequence sets, it generally produces noisy
alignments, which is undesirable. Thus, Amorim et al.
(2015) implemented the COFFEE (Consistency based Objective
Function For alignmEnt Evaluation) objective function (Notredame
et al., 1998) in MSA-GA, which allowed the tool to produce, in
general, better results than the original approach.
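
As a rough illustration of a weighted sum-of-pairs objective, the sketch below sums a simple column-by-column score over every pair of rows in an alignment. The match, mismatch and gap values, the default unit weights and the function names are assumptions chosen for illustration; they are not the scoring parameters of MSA-GA.

```python
from itertools import combinations


def pair_score(row_a, row_b, match=1, mismatch=-1, gap=-2):
    """Score two rows of an alignment column by column."""
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both rows is ignored
        if x == "-" or y == "-":
            total += gap
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


def weighted_sum_of_pairs(alignment, weights=None):
    """Sum the weighted pair scores over every pair of rows."""
    total = 0.0
    for i, j in combinations(range(len(alignment)), 2):
        weight = 1.0 if weights is None else weights[(i, j)]
        total += weight * pair_score(alignment[i], alignment[j])
    return total


alignment = ["ACG-T", "AC-TT", "ACGTT"]  # hypothetical three-row alignment
print(weighted_sum_of_pairs(alignment))
```

Because every column contributes through a fixed substitution and gap score, a function of this kind rewards any local similarity, which is one way to picture how noisy alignments can arise for divergent sequences.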

Nonetheless, the COFFEE objective function was developed to
evaluate protein sequences (Notredame et al., 1998) and its
implementation to evaluate nucleotide sequences is inadequate.
Thus, Wang and Lefkowitz (2005) modified the COFFEE function to
apply it to local sequence alignments, specifically in inconsistent
regions. However, this modification has not been extended to
global multiple sequence alignment of nucleotides.

Thus, to extend the improvements obtained by Amorim et al.
(2015), we have modified the COFFEE objective function to allow
its use in tools for global multiple sequence alignment,
such as MSA-GA. Our results show that these modifications
produce better global multiple sequence alignments of nucleotides
when COFFEE is used as the evaluation scheme.

2. Materials and methods

2.1. MSA-GA and COFFEE objective function

MSA-GA is a tool to perform multiple sequence alignments
of amino acids or nucleotides using a simple GA and the WSP
function to evaluate the quality of the obtained results (Gondro
and Kinghorn, 2007). In bioinformatics, MSA-GA stands out
because it can produce better results when compared with other
widely used tools, such as Clustal W. However, due to the nature of WSP,
MSA-GA generally cannot produce results with good biological
significance when it aligns less similar sequence sets (Amorim
et al., 2015).

Fig. 1. COFFEE objective function scheme.

Notredame et al. (1998) identified noise problems in the final
results of well-established objective functions, WSP among them,
and therefore developed COFFEE, which is based
on consistency with a library of pairwise alignments and mitigates
the disadvantages of those functions in order to produce better results. The
COFFEE objective function is formalized by Eq. (1), where $N$ is the
number of sequences, $S_1, \ldots, S_N$, in the multiple sequence
alignment, $A_{i,j}$ is the pairwise alignment of $S_i$ and $S_j$ induced by
the multiple alignment, $\mathrm{LEN}(A_{i,j})$ is its length, $\mathrm{SCORE}(A_{i,j})$ is the
number of aligned pairs of residues shared between the library and $A_{i,j}$
and, finally, $W_{i,j}$ is the weight associated with the corresponding
pairwise alignment.

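\[
\text{COFFEE score} =
\frac{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left[ W_{i,j} \times \mathrm{SCORE}(A_{i,j}) \right]}
     {\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left[ W_{i,j} \times \mathrm{LEN}(A_{i,j}) \right]}
\qquad (1)
\]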
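
Equation (1) can be evaluated directly once the per-pair quantities are known. The following minimal sketch assumes that the weights, scores and lengths have already been computed and stored in dictionaries keyed by the pair (i, j); the function name and the numbers in the example are hypothetical.

```python
from itertools import combinations


def coffee_score(n_sequences, weight, score, length):
    """Evaluate Eq. (1) from per-pair weights, scores and lengths.

    Pairs are keyed as (i, j) with i < j, using 0-based sequence indices
    instead of the 1-based indices of the equation.
    """
    numerator = 0.0
    denominator = 0.0
    for i, j in combinations(range(n_sequences), 2):
        numerator += weight[(i, j)] * score[(i, j)]
        denominator += weight[(i, j)] * length[(i, j)]
    return numerator / denominator if denominator else 0.0


# Hypothetical values for three sequences.
weight = {(0, 1): 0.9, (0, 2): 0.5, (1, 2): 0.6}
score = {(0, 1): 8, (0, 2): 4, (1, 2): 5}
length = {(0, 1): 10, (0, 2): 10, (1, 2): 10}
print(round(coffee_score(3, weight, score, length), 2))
```

In this example the numerator is 0.9 × 8 + 0.5 × 4 + 0.6 × 5 = 12.2 and the denominator is 20, so the score is 0.61. Since SCORE never exceeds LEN for a pair, the ratio stays between zero and one.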

In order to make MSA-GA more robust, Amorim et al. (2015)
implemented the COFFEE objective function in this tool to
evaluate the individuals of the GA. This modification allowed
MSA-GA to produce alignments of less similar sequence sets with
good biological significance.

Nonetheless, COFFEE is not suited to evaluating multiple
sequence alignments of nucleotides and produces poor-quality
results in these cases (Notredame et al., 1998). Thus, we have
modified the routine implemented by Amorim et al. (2015) in the
MSA-GA tool so that it directs the tool's genetic algorithm
towards nucleotide alignments of higher biological significance.

2.2. Improvements in the COFFEE function to global multiple sequence
alignment of nucleotides

In the implementation of Amorim et al. (2015), the COFFEE function
relies on two main routines, called MountMatrix and SearchInSeq. The first
one declares and fills the score matrix of each alignment, and the
second one searches for correspondences between the analyzed
pair of residues and the pairwise library. With this approach, Amorim
et al. (2015) improved the sensitivity of the
MSA-GA tool when it aligns less similar sequence sets. However, to
extend these improvements to the nucleotide alignment module, an
adaptation of the COFFEE objective function was necessary.

Thus, we have modified the SearchInSeq function in order to
allow COFFEE to produce better alignments of nucleotide
sequences. Originally, the COFFEE function scores a pair of residues
if it is found at any position of the corresponding alignment in
the pairwise library, as can be seen in Fig. 1.

In the present work, we have modified the COFFEE objective
function, as presented in Fig. 2, in order to restrict the search
for an optimal alignment. We changed the
SearchInSeq function so that it scores only the pairs of residues
that have a correspondence in the pairwise library at exactly the
same position as the column being analyzed in the
multiple sequence alignment, as can be seen in the flowchart
presented in Fig. 3. To clarify this flowchart, and especially the
correspondence between the position in the MSA and the position in
the pairwise alignment in the library, an illustration is presented in Fig. 4.

Fig. 2. COFFEE objective function flowchart. The current position of the alignment column being analyzed is a parameter to MountMatrix and SearchInSeq.

Fig. 3. SearchInSeq(position) function flowchart.

Fig. 4. Illustration of the position lock.

The original concept of COFFEE was to find the correspondence
of a pair of residues (bases) at any position of the pairwise
alignment in the library. The aim of the COFFEE objective function is
to ensure that the multiple alignment reaches the maximum
number of correspondences when compared with the pairwise
alignments in the library. The proposed modification also aims
to find these correspondences, but additionally requires that they
occur at the same position in both the multiple
alignment and the pairwise alignment in the library. This
position lock is a way to improve the consistency of the results,
because the location of each occurrence becomes restricted. From
a mathematical point of view, the original COFFEE is not well
matched to the position lock: it was developed to work with a
larger set of elements (amino acids) than the one targeted here
(nucleotides), and with a larger set the probability of finding a
correspondence at the same location in both the multiple alignment
and the pairwise alignment in the library is reduced. This
probability is considerably higher when the set of elements is
reduced to nucleotides only, because the chance of finding the
correspondence at the same location increases as the range of
elements decreases. Therefore, the modified COFFEE obtains better
results than the original COFFEE when nucleotide sequences are used.

Table 1
Average scores of MSA-GA with the standard COFFEE (COFFEE-S), modified COFFEE
(COFFEE-M) and WSP for all test cases of BAliBase.

Subset    COFFEE-S    COFFEE-M    WSP
Ref. 1    0.046       0.405       0.369
Ref. 2    0.058       0.344       0.311
Ref. 3    0.052       0.257       0.232
Ref. 4    0.012       0.065       0.057
Ref. 5    0.064       0.263       0.196

Table 2
Average scores of MSA-GA with the standard COFFEE (COFFEE-S), modified COFFEE
(COFFEE-M) and WSP for all test cases of BRAliBase.

Subset      COFFEE-S    COFFEE-M    WSP
G2Intron    0.472       0.702       0.603
rRNA        0.634       0.835       0.795
tRNA        0.683       0.772       0.759
U5          0.452       0.628       0.603

As the main idea of the COFFEE function is to find a solution that is
very similar to the exact alignments in the pairwise library, our
adaptations were essential to allow MSA-GA to produce global
multiple sequence alignments with good biological significance,
extending the improvements obtained by Amorim et al. (2015) to
the nucleotide alignment module as well.
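
The difference between the original search and the position lock can be sketched as follows. This is only one possible reading of the description in this section, not the MSA-GA source code: the function names, the representation of a library entry as two gapped strings and the rule that columns containing a gap are never scored are assumptions made for illustration.

```python
def search_any_position(pair, library_rows):
    """Original behaviour: the pair scores if it occurs in any library column."""
    lib_a, lib_b = library_rows
    return any((x, y) == pair for x, y in zip(lib_a, lib_b))


def search_locked_position(pair, library_rows, position):
    """Modified behaviour: the pair scores only if it occurs in the same column."""
    lib_a, lib_b = library_rows
    if position >= min(len(lib_a), len(lib_b)):
        return False  # the library alignment has no such column
    return (lib_a[position], lib_b[position]) == pair


def count_matches(msa_a, msa_b, library_rows, locked):
    """Count the columns of two MSA rows that are supported by the library."""
    matches = 0
    for position, pair in enumerate(zip(msa_a, msa_b)):
        if "-" in pair:
            continue  # assumption: columns containing a gap are not scored
        if locked:
            matches += search_locked_position(pair, library_rows, position)
        else:
            matches += search_any_position(pair, library_rows)
    return matches


library = ("ACGTAC", "ACGTTC")       # hypothetical library alignment of two sequences
msa_a, msa_b = "A-CGTAC", "AC-GTTC"  # the same two sequences as rows of a candidate MSA
print(count_matches(msa_a, msa_b, library, locked=False))
print(count_matches(msa_a, msa_b, library, locked=True))
```

In this toy case the unlocked search accepts all five gap-free columns even though the candidate rows contain misplaced gaps, whereas the locked search accepts only the first column. With a four-letter alphabet almost any pair of bases occurs somewhere in a library alignment, so under this reading the unlocked search hardly discriminates between candidate alignments of nucleotides.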

3. Results and discussion

3.1. Benchmark and test platform

In order to evaluate the quality of the solutions obtained with
the modifications presented here, we have used the DNA counterparts
of the BAliBase protein test cases (Thompson
et al., 2005; Carroll et al., 2007), which are freely available on the
web.¹ BAliBase is a well-suited benchmark for evaluating the
quality of the results obtained by multiple sequence alignment
tools, because its test cases are classified into
categories with different levels of similarity, which is essential to a
satisfactory evaluation of the results produced by MSA-GA with the
modified COFFEE objective function.

Moreover, we have used the BRAliBase benchmark
(Gardner et al., 2005) in the tests to make the quality evaluation more robust.
This benchmark provides sets of nucleotide sequences and their
respective structural alignments, which allows the
obtained results to be compared with the correct reference alignment.
BRAliBase is divided into four groups, G2Intron, rRNA, tRNA
and U5, each of which contains test cases with
different characteristics.

¹ http://www.drive5.com/bench.

To evaluate the biological significance of the produced
alignments, we have used the qscore tool, which is available
along with BAliBase. Among the several scores offered by qscore,
we have used the PREFAB Q score. This score compares the obtained
alignment with a reference alignment and returns a value between
zero and one, where zero is the worst alignment and one is the
best.
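
A score of this kind can be sketched as the fraction of residue pairs aligned in the reference that are also aligned in the test alignment. The code below is a simplified illustration under that assumption; it is not the qscore program, and the function names and the example alignments are hypothetical.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (row, index of the residue within its row).
    """
    counters = [0] * len(alignment)  # next residue index for every row
    pairs = set()
    for column in zip(*alignment):
        present = []
        for row, symbol in enumerate(column):
            if symbol != "-":
                present.append((row, counters[row]))
                counters[row] += 1
        pairs.update(combinations(present, 2))
    return pairs


def q_score(test, reference):
    """Fraction of the reference residue pairs reproduced by the test alignment."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return 0.0
    return len(reference_pairs & aligned_pairs(test)) / len(reference_pairs)


reference = ["ACGT", "AC-T", "A-GT"]
test = ["ACGT", "A-CT", "A-GT"]
print(q_score(test, reference))
```

Here the reference contains eight aligned residue pairs and the test alignment reproduces seven of them, so the score is 0.875. An alignment identical to the reference scores one.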

All tests were executed on a Dell Vostro computer running
Windows 8.1 Pro (64-bit), with an Intel Core i5-3470S CPU at 2.90 GHz
and 6 GB of RAM. The parameters used in
MSA-GA were the same as those used by Gondro and Kinghorn (2007).

3.2. Quality tests

For the quality tests in this work, we have executed all
test cases of References 1, 2, 3, 4 and 5 from BAliBase and all test
cases of the groups G2Intron, rRNA, tRNA and U5 from BRAliBase,
which have different characteristics, such as sequence length and similarity
level. The results of all the tests that we have performed for BAliBase² and
BRAliBase³ are freely available for download.

Due to the stochastic nature of the GA, it generally
produces different alignments for the same sequence set. Therefore, we
have executed each test case five times, in order to make the
obtained results statistically more reliable. The score
considered in the present work for each test case is the average of
the scores obtained over all executions.
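
The aggregation of repeated executions can be sketched as below. The five scores are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_runs(scores):
    """Return the average and the sample standard deviation of repeated runs."""
    return mean(scores), stdev(scores)


# Hypothetical scores from five executions of a single test case.
runs = [0.41, 0.39, 0.42, 0.40, 0.38]
average, spread = summarize_runs(runs)
print(f"average={average:.3f} std={spread:.3f}")
```

A smaller spread across executions indicates that the GA converges to alignments of similar quality regardless of its random choices.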

Table 1 presents the average scores obtained by
executing all test cases from each Reference of BAliBase. The
tests were performed for the standard COFFEE (COFFEE-S), modified
COFFEE (COFFEE-M) and WSP approaches.

As can be seen in Table 1, the average scores of
COFFEE-M are better than those of COFFEE-S and WSP for every Reference.
Averaged over all References, the improvement of COFFEE-M was
584% in relation to COFFEE-S and 15.8% in relation to WSP.
To obtain these averages, we
calculated the improvement for each Reference (Ref. 1, Ref. 2,
Ref. 3, Ref. 4 and Ref. 5), comparing COFFEE-M with COFFEE-S
and COFFEE-M with WSP, and then averaged the improvements of
all References. In this case, the average standard deviation
was 0.024 for the standard COFFEE, 0.020 for the modified COFFEE
and 0.040 for WSP.
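
The averaging procedure can be reproduced approximately from the rounded values of Table 1, as in the sketch below. The function names are hypothetical, and because the inputs are the rounded averages of the table rather than the per-test-case scores, small differences from the reported figures are expected.

```python
# Subset: (COFFEE-S, COFFEE-M, WSP), copied from Table 1.
TABLE_1 = {
    "Ref. 1": (0.046, 0.405, 0.369),
    "Ref. 2": (0.058, 0.344, 0.311),
    "Ref. 3": (0.052, 0.257, 0.232),
    "Ref. 4": (0.012, 0.065, 0.057),
    "Ref. 5": (0.064, 0.263, 0.196),
}


def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


def mean_gain(table, baseline_column):
    """Average, over all subsets, of the gain of COFFEE-M over one baseline."""
    gains = [relative_gain(row[1], row[baseline_column]) for row in table.values()]
    return sum(gains) / len(gains)


print(round(mean_gain(TABLE_1, 2), 1))  # COFFEE-M versus WSP
print(round(mean_gain(TABLE_1, 0), 1))  # COFFEE-M versus COFFEE-S
```

With these rounded inputs the mean relative gain over WSP is about 15.9%, close to the reported 15.8%. For COFFEE-S the mean relative gain is about 484%, while the mean ratio COFFEE-M/COFFEE-S expressed as a percentage is about 584%, so the reported 584% appears to correspond to the ratio rather than to the gain.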

Table 2 presents the average scores obtained by
executing all test cases from each group of BRAliBase. The
tests were performed for the standard COFFEE (COFFEE-S), modified
COFFEE (COFFEE-M) and WSP approaches.

As can be seen in Table 2, the average scores of
COFFEE-M are better than those of COFFEE-S and WSP for every group.
Averaged over all groups, the improvement of COFFEE-M was
33.1% in relation to COFFEE-S and 6.8% in relation to WSP.
In this case, the average standard deviation
was 0.115 for the standard COFFEE, 0.089 for the modified COFFEE
and 0.101 for WSP.

² http://www.ibilce.unesp.br/gcc/balibase-results-final.xlsx.
³ http://www.ibilce.unesp.br/gcc/bralibase-results-final.xlsx.

4. Conclusion

MSA-GA is a widely used multiple sequence alignment tool
because its simple GA scheme is capable of producing better results
than other well-known tools, such as Clustal W
(Gondro and Kinghorn, 2007). However, the standard objective
function of MSA-GA is WSP, which generally cannot produce
quality alignments when evaluating sets of less similar
sequences.

Thus, Amorim et al. (2015) implemented the COFFEE objective
function in the MSA-GA tool to mitigate this disadvantage.
However, the COFFEE function was originally
developed to evaluate multiple sequence alignments of amino
acids. So, in order to extend the obtained improvements to the DNA
alignment module, we modified COFFEE so that it also produces good
results for alignments of nucleotides. Basically, we have changed
the function to score only the pairs of residues that are consistent with the
pairwise library at the exact position of the analyzed column of the
multiple sequence alignment.

Thus, considering all data sets of the DNA version of BAliBase,
our modified COFFEE produced better results
than standard COFFEE in 100% of the cases and better results than
WSP in 71.3% of the cases. Concerning the average quality
improvement, the modified COFFEE achieved 584% in relation to
standard COFFEE and 15.8% in relation to WSP, in terms of
biological significance. Moreover, for all BRAliBase data sets, our
modified COFFEE produced better results than standard
COFFEE in 100% of the cases and better results than WSP in 87.6% of
the cases. Concerning the average quality improvement,
the modified COFFEE achieved 33.1% in relation to standard
COFFEE and 6.8% in relation to WSP, in terms of biological
significance. Finally, the new approach with the modified COFFEE
presented less variation in its results than
standard COFFEE and WSP, which makes the results more consistent
and improves the biological analysis.

Acknowledgment

The authors would like to thank all of our collaborators and our
institutions for their support for the development of the present work,
and the São Paulo Research Foundation (FAPESP) for its support under grant number
13/08289-0.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the
online version, at https://doi.org/10.1016/j.compbiolchem.2018.04.012.
References

Amorim, A.R., Zafalon, G.F.D., Neves, L.A., Pinto, A., Valêncio, C.R., Machado, J.M.,
2015. Improvements in the sensibility of MSA-GA tool using COFFEE objective
function. J. Phys.: Conf. Ser. 574 (1), 012104.

Carroll, H., Beckstead, W., O'Connor, T., Ebbert, M., Clement, M., Snell, Q., McClellan,
D., 2007. DNA reference alignment benchmarks based on tertiary structure of
encoded proteins. Bioinformatics 23 (19), 2648–2649.

Gardner, P.P., Wilm, A., Washietl, S., 2005. A benchmark of multiple sequence
alignment programs upon structural RNAs. Nucleic Acids Res. 33 (8), 2433–2439.

Gondro, C., Kinghorn, B., 2007. A simple genetic algorithm for multiple sequence
alignment. Genet. Mol. Res. 6 (4), 964–982.

Greive, S.J., Fung, H.K., Chechik, M., Jenkins, H.T., Weitzel, S.E., Aguiar, P.M., Brentnall,
A.S., Glousieau, M., Gladyshev, G.V., Potts, J.R., et al., 2016. DNA recognition for
virus assembly through multiple sequence-independent interactions with a
helix-turn-helix motif. Nucleic Acids Res. 44 (2), 776–789.

Jordan, D.M., Frangakis, S.G., Golzio, C., Cassa, C.A., Kurtzberg, J., Davis, E.E., Sunyaev,
S.R., Katsanis, N., et al., 2015. Identification of cis-suppression of human disease
mutations by comparative genomics. Nature 524 (7564), 225–229.

Lee, Z.-J., Su, S.-F., Chuang, C.-C., Liu, K.-H., 2008. Genetic algorithm with ant colony
optimization (GA-ACO) for multiple sequence alignment. Appl. Soft Comput. 8
(1), 55–78.

Li, B., Chiong, R., Lin, M., 2015. A balance-evolution artificial bee colony algorithm for
protein structure optimization based on a three-dimensional AB off-lattice
model. Comput. Biol. Chem. 54, 1–12.

Needleman, S.B., Wunsch, C.D., 1970. A general method applicable to the search for
similarities in the amino acid sequence of two proteins. J. Mol. Biol. 48 (3),
443–453.

Notredame, C., Higgins, D.G., 1996. SAGA: sequence alignment by genetic algorithm.
Nucleic Acids Res. 24 (8), 1515–1524.

Notredame, C., Holm, L., Higgins, D.G., 1998. COFFEE: an objective function for
multiple sequence alignments. Bioinformatics 14 (5), 407–422.

Pei, J., Grishin, N.V., 2014. PROMALS3D: multiple protein sequence alignment
enhanced with evolutionary and three-dimensional structural information.
Methods Mol. Biol. 263–271.

Sievers, F., Higgins, D.G., 2014. Clustal Omega, accurate alignment of very large
numbers of sequences. Methods Mol. Biol. 105–116.

Smith, T.F., Waterman, M.S., 1981. Identification of common molecular
subsequences. J. Mol. Biol. 147 (1), 195–197.

Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. Clustal W: improving the sensitivity
of progressive multiple sequence alignment through sequence weighting,
position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22
(22), 4673–4680.

Thompson, J.D., Koehl, P., Ripp, R., Poch, O., 2005. BAliBASE 3.0: latest developments
of the multiple sequence alignment benchmark. Proteins: Struct. Funct.
Bioinformatics 61 (1), 127–136.

Thomsen, R., Boomsma, W., 2004. Multiple sequence alignment using SAGA:
investigating the effects of operator scheduling, population seeding, and
crossover operators. Workshops on Applications of Evolutionary Computation
113–122 Springer.

Wang, L., Jiang, T., 1994. On the complexity of multiple sequence alignment. J.
Comput. Biol. 1 (4), 337–348.

Wang, C., Lefkowitz, E.J., 2005. Genomic multiple sequence alignments: refinement
using a genetic algorithm. BMC Bioinformatics 6 (1), 1.

Yadav, R.K., Banka, H., 2016. Genetic algorithm using guide tree in mutation operator
for solving multiple sequence alignment. Advanced Computing and Systems for
Security 145–157 Springer.

Yao, D., Jiang, M., You, X., Abulizi, A., Hou, R., 2015. An algorithm of multiple
sequence alignment based on consensus sequence searched by simulated
annealing and star alignment. 2015 International Symposium on Bioelectronics
and Bioinformatics (ISBB) 3–6 IEEE.

Zhu, H., He, Z., Jia, Y., 2016. A novel approach to multiple sequence alignment using
multiobjective evolutionary algorithm based on decomposition. IEEE J. Biomed.
Health Informatics 20 (2), 717–727.
