Genomics Proteomics Bioinformatics. 2006; 4(4): 259–263.
Published online 2007 May 23. doi: 10.1016/S1672-0229(07)60007-2
PMID: 17531802
This article has been cited by other articles in PMC.

Abstract

KaKs_Calculator is a software package that calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates through model selection and model averaging. Existing methods for this estimation each adopt a specific mutation (substitution) model that considers different evolutionary features, which leads to diverse estimates. KaKs_Calculator therefore implements a set of candidate models in a maximum likelihood framework and adopts the Akaike information criterion to measure the fit between models and data, aiming to include as many features as needed to capture evolutionary information in protein-coding sequences accurately. In addition, several existing methods for calculating Ka and Ks are incorporated into the software. KaKs_Calculator, including source code, compiled executables, and documentation, is freely available for academic use at http://evolution.genomics.org.cn/software.htm.

Key words: model selection, model averaging, AIC, approximate method, maximum likelihood method

Introduction

Calculating nonsynonymous (Ka) and synonymous (Ks) substitution rates is of great significance in reconstructing phylogeny and understanding the evolutionary dynamics of protein-coding sequences across closely related yet diverged species (1–3). Ka and Ks, or more often their ratio (Ka/Ks), indicate neutral mutation when Ka equals Ks, negative (purifying) selection when Ka is less than Ks, and positive (diversifying) selection when Ka exceeds Ks. Statistics of the two variables in genes from different evolutionary lineages therefore provide a powerful tool for quantifying molecular evolution.
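The interpretation of the Ka/Ks ratio described above can be sketched in a few lines. This is only an illustration: the function name `classify_selection` and the tolerance parameter `tol` are assumptions introduced here, and real analyses would test the ratio statistically rather than compare point estimates.

```python
def classify_selection(ka, ks, tol=0.0):
    """Label a gene pair by its Ka/Ks ratio (illustrative helper).

    tol is a hypothetical tolerance around 1.0; with tol=0.0 only an
    exact ratio of 1 is reported as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the Ka/Ks ratio")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        return "neutral"
    # Ka < Ks: purifying selection; Ka > Ks: positive selection.
    return "purifying" if omega < 1.0 else "positive"


if __name__ == "__main__":
    # Hypothetical rate estimates, not taken from the article.
    print(classify_selection(0.05, 0.40))
    print(classify_selection(0.60, 0.30))
```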

Over the past two decades, several methods have been developed for this purpose. They generally fall into two classes: approximate methods and maximum likelihood methods. The approximate methods involve three basic steps: (1) counting the numbers of synonymous and nonsynonymous sites, (2) calculating the numbers of synonymous and nonsynonymous substitutions, and (3) correcting for multiple substitutions. The maximum likelihood methods, on the other hand, integrate evolutionary features (reflected in nucleotide models) into codon-based models and use probability theory to complete all three steps in one go (4). However, these methods adopt different substitution or mutation models, based on different assumptions that take account of various sequence features, and therefore give varied estimates of evolutionary distance (5). In other words, Ka and Ks estimation is sensitive to the underlying assumptions or mutation models (3). In addition, since the amount and degree of sequence substitution vary among datasets from diverse taxa, a single model or method may not be adequate for accurate Ka and Ks calculation. A model selection step, that is, choosing a best-fit model when estimating Ka and Ks, therefore becomes critical for capturing appropriate evolutionary information (6, 7).
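As an illustration of step (3), the sketch below applies the Jukes-Cantor correction for multiple substitutions to an observed proportion of differences. It assumes that steps (1) and (2) have already produced the site and substitution counts; the function name and the counts in the usage lines are hypothetical.

```python
import math


def jukes_cantor_distance(n_substitutions, n_sites):
    """Correct an observed proportion of differences for multiple hits.

    Uses the Jukes-Cantor formula d = -3/4 * ln(1 - 4p/3), where
    p = n_substitutions / n_sites is the observed proportion.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    p = n_substitutions / n_sites
    if p >= 0.75:
        raise ValueError("proportion too large for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical counts: 10 synonymous differences over 100 synonymous
    # sites and 5 nonsynonymous differences over 300 nonsynonymous sites.
    ks = jukes_cantor_distance(10, 100)
    ka = jukes_cantor_distance(5, 300)
    print(f"Ks = {ks:.4f}, Ka = {ka:.4f}, Ka/Ks = {ka / ks:.4f}")
```

The corrected distance is always at least as large as the raw proportion, because some sites will have changed more than once.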

Toward this end, we have applied model selection and model averaging techniques to Ka and Ks estimation. We use a maximum likelihood method based on a set of candidate substitution models and adopt the Akaike information criterion (AIC) to measure the fit between models and data. After choosing the best-fit model for calculating Ka and Ks, we average the parameters across the candidate models to include as many features as needed, since the true model is seldom one of the candidate models in practice (8). Finally, these considerations are incorporated into a software package, namely KaKs_Calculator.

Algorithm

Candidate models

Substitution models play a significant role in phylogenetic and evolutionary analyses of protein-coding sequences by integrating diverse processes of sequence evolution through various assumptions and providing approximations to datasets. We focused on a set of time-reversible substitution models (9–16), as shown in Table 1 (17, 18), ranging from the Jukes-Cantor (JC) model, which assumes that all substitutions have equal rates and that nucleotide frequencies are equal, to the general time-reversible (GTR) model, which considers six different substitution rates and unequal nucleotide frequencies. Subsequently, we incorporated the parameters of each nucleotide model into a codon-based model (19, 20). As a result, a general formula for the substitution rate qij from any sense codon i to j (i ≠ j) is given for all candidate models (19):

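$$
q_{ij} =
\begin{cases}
0, & \text{if } i \text{ and } j \text{ differ by more than one difference} \\
k_{xy}\,\pi_j, & \text{if } i \text{ and } j \text{ differ by a synonymous substitution of } x \text{ for } y \\
\omega\,k_{xy}\,\pi_j, & \text{if } i \text{ and } j \text{ differ by a nonsynonymous substitution of } x \text{ for } y
\end{cases}
$$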

where πj is the frequency of codon j, ω is the Ka/Ks ratio, and kxy is the ratio of rxy to rCA, with x, y ∊ {A, C, G, T} (Table 1). For example, in the JC model, kxy and πj are equal to 1 owing to the assumed equal substitution rates and equal nucleotide frequencies. In the Hasegawa-Kishino-Yano (HKY) model, kTC and kAG become equivalent to the transition/transversion rate ratio, and πj can be estimated from the sequences, similar to the method reported by Goldman and Yang (19). Other models can be accommodated by making the corresponding modifications. We can therefore obtain a maximum likelihood score for each candidate model by implementing the codon-based models in a maximum likelihood framework (19, 20).
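The rate formula can be made concrete with a short sketch. It assumes the standard genetic code and an HKY-style parameterization in which both transition ratios equal a single value kappa and all transversion ratios equal 1; the function names, the dictionary layout for kxy, and the numeric values are illustrative choices, not part of KaKs_Calculator.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def hky_ratios(kappa):
    """Relative rates k_xy: kappa for transitions, 1 for transversions."""
    transitions = {frozenset("TC"), frozenset("AG")}
    return {
        frozenset((x, y)): (kappa if frozenset((x, y)) in transitions else 1.0)
        for x in BASES
        for y in BASES
        if x != y
    }


def substitution_rate(codon_i, codon_j, k, omega, pi_j):
    """Rate q_ij between two distinct sense codons."""
    if CODON_TABLE[codon_i] == "*" or CODON_TABLE[codon_j] == "*":
        raise ValueError("the formula is defined for sense codons only")
    diffs = [(x, y) for x, y in zip(codon_i, codon_j) if x != y]
    if not diffs:
        raise ValueError("the formula is defined for i != j")
    if len(diffs) > 1:
        return 0.0  # more than one nucleotide difference
    x, y = diffs[0]
    rate = k[frozenset((x, y))] * pi_j
    if CODON_TABLE[codon_i] != CODON_TABLE[codon_j]:
        rate *= omega  # nonsynonymous change
    return rate


if __name__ == "__main__":
    k = hky_ratios(kappa=2.0)  # hypothetical kappa
    print(substitution_rate("TTT", "TTC", k, omega=0.5, pi_j=0.02))
    print(substitution_rate("TTT", "TTA", k, omega=0.5, pi_j=0.02))
    print(substitution_rate("TTT", "TCC", k, omega=0.5, pi_j=0.02))
```

Keying the ratios by unordered base pair reflects time reversibility: the same kxy applies to a change of x for y and of y for x.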

Table 1

Candidate Models for Model Selection and Model Averaging in KaKs_Calculator

| Model | Description (Reference) | Nucleotide frequency | Substitution rate* |
|---|---|---|---|
| JC | Jukes-Cantor model (9) | Equal | rTC = rAG = rTA = rCG = rTG = rCA |
| F81 | Felsenstein’s model (10) | Unequal | rTC = rAG = rTA = rCG = rTG = rCA |
| K2P | Kimura’s two-parameter model (11) | Equal | rTC = rAG ≠ rTA = rCG = rTG = rCA |
| HKY | Hasegawa-Kishino-Yano model (12) | Unequal | rTC = rAG ≠ rTA = rCG = rTG = rCA |
| TNEF | TN model with equal nucleotide frequencies | Equal | rTC ≠ rAG ≠ rTA = rCG = rTG = rCA |
| TN | Tamura-Nei model (13) | Unequal | rTC ≠ rAG ≠ rTA = rCG = rTG = rCA |
| K3P | Kimura’s three-parameter model (14) | Equal | rTC = rAG ≠ rTA = rCG ≠ rTG = rCA |
| K3PUF | K3P model with unequal nucleotide frequencies | Unequal | rTC = rAG ≠ rTA = rCG ≠ rTG = rCA |
| TIMEF | Transition model with equal nucleotide frequencies | Equal | rTC ≠ rAG ≠ rTA = rCG ≠ rTG = rCA |
| TIM | Transition model | Unequal | rTC ≠ rAG ≠ rTA = rCG ≠ rTG = rCA |
| TVMEF | Transversion model with equal nucleotide frequencies | Equal | rTC = rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA |
| TVM | Transversion model | Unequal | rTC = rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA |
| SYM | Symmetrical model (15) | Equal | rTC ≠ rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA |
| GTR | General time-reversible model (16) | Unequal | rTC ≠ rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA |

*rij indicates the rate of substitution of i for j, where i, j ∊ {A, C, G, T}.
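One way to hold Table 1 in code is to record, for each model, which of the six rates share a value. The sketch below reads each chain of = and ≠ in the table as a sequence of adjacent groups; the names `RATE_CLASSES`, `UNEQUAL_TWIN`, `CANDIDATE_MODELS`, and `expand_rates` are hypothetical and are not taken from the KaKs_Calculator source.

```python
# Rates inside one tuple are constrained to be equal (read from Table 1).
RATE_CLASSES = {
    "JC": [("TC", "AG", "TA", "CG", "TG", "CA")],
    "K2P": [("TC", "AG"), ("TA", "CG", "TG", "CA")],
    "TNEF": [("TC",), ("AG",), ("TA", "CG", "TG", "CA")],
    "K3P": [("TC", "AG"), ("TA", "CG"), ("TG", "CA")],
    "TIMEF": [("TC",), ("AG",), ("TA", "CG"), ("TG", "CA")],
    "TVMEF": [("TC", "AG"), ("TA",), ("CG",), ("TG",), ("CA",)],
    "SYM": [("TC",), ("AG",), ("TA",), ("CG",), ("TG",), ("CA",)],
}

# Each equal-frequency model has a twin with unequal nucleotide frequencies.
UNEQUAL_TWIN = {
    "JC": "F81", "K2P": "HKY", "TNEF": "TN", "K3P": "K3PUF",
    "TIMEF": "TIM", "TVMEF": "TVM", "SYM": "GTR",
}

CANDIDATE_MODELS = {}
for name, classes in RATE_CLASSES.items():
    CANDIDATE_MODELS[name] = {"equal_freq": True, "classes": classes}
    CANDIDATE_MODELS[UNEQUAL_TWIN[name]] = {"equal_freq": False, "classes": classes}


def expand_rates(model, class_values):
    """Map one value per rate class onto all six substitution rates."""
    classes = CANDIDATE_MODELS[model]["classes"]
    if len(class_values) != len(classes):
        raise ValueError(f"{model} needs {len(classes)} rate value(s)")
    return {
        rate: value
        for group, value in zip(classes, class_values)
        for rate in group
    }


if __name__ == "__main__":
    # Hypothetical values: transitions twice as fast as transversions.
    print(expand_rates("HKY", [2.0, 1.0]))
```

Storing the constraints as groups keeps the pairing visible: the two models in each pair differ only in the `equal_freq` flag.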

Model selection

AIC (21) has been widely used in model selection, alongside other methods such as the likelihood ratio test (LRT) and the Bayesian information criterion (BIC) (8). AIC characterizes the Kullback-Leibler distance between a true model and an examined model, and this distance can be regarded as quantifying the information lost by approximating the true model. KaKs_Calculator uses a modification of AIC (AICC), which takes account of the sample size (n), the maximum likelihood score (lnLi), and the number of parameters (ki) in model i as follows:

$$
\mathrm{AIC_C}_i = \mathrm{AIC}_i + \frac{2k_i(k_i+1)}{n-k_i-1} = -2\ln L_i + 2k_i + \frac{2k_i(k_i+1)}{n-k_i-1}
$$

AICC was proposed to correct for small sample sizes, and it approaches AIC as the sample size tends to infinity. We can therefore use this equation to compute AICC for each candidate model and then identify the model with the smallest AICC, which indicates the best fit between model and data.
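The selection step can be sketched as follows, reading AICC as AIC plus the correction term 2k(k+1)/(n − k − 1). The function names are hypothetical, and the log-likelihoods, parameter counts, and sample size in the usage lines are made-up values for illustration only.

```python
def aicc(ln_l, k, n):
    """Small-sample corrected AIC for one model.

    ln_l: maximized log-likelihood, k: number of parameters,
    n: sample size.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICC requires n > k + 1")
    return -2.0 * ln_l + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


def select_model(fits, n):
    """Return the name of the model with the smallest AICC and all scores.

    fits maps a model name to a (ln_l, k) pair.
    """
    scores = {name: aicc(ln_l, k, n) for name, (ln_l, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical fits for three candidate models.
    fits = {"JC": (-1050.0, 1), "HKY": (-1040.0, 5), "GTR": (-1039.5, 9)}
    best, scores = select_model(fits, n=300)
    for name, score in sorted(scores.items(), key=lambda item: item[1]):
        print(f"{name}: {score:.2f}")
    print("selected:", best)
```

In this made-up example the richest model has the highest likelihood, yet the penalty for its extra parameters leaves the intermediate model with the smallest AICC.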

Model averaging

Model selection yields merely an approximate fit to a dataset, and the true evolutionary model is seldom one of the candidate models (8). An alternative is therefore model averaging, which assigns each candidate model a weight and uses more than one model to estimate average parameters across models. Accordingly, we first need to compute the Akaike weight (wi, where i = 1, 2, …, m) for each model in a set of candidate models:

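$$
w_i = \frac{\exp\left[-\frac{1}{2}\left(\mathrm{AIC_C}_i - \min \mathrm{AIC_C}\right)\right]}{\sum_{j=1}^{m}\exp\left[-\frac{1}{2}\left(\mathrm{AIC_C}_j - \min \mathrm{AIC_C}\right)\right]}
$$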

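where min AICC is the smallest AICC value among candidate models. We can then estimate model-averaged parameters. Taking kTC as an example, a model-averaged estimate can be calculated by:

$$
\hat{k}_{TC} = \frac{\sum_{i=1}^{m} w_i \, I_{k_{TC}}(M_i) \, k_{TC,i}}{\sum_{i=1}^{m} w_i \, I_{k_{TC}}(M_i)}
$$

where kTC,i is kTC in model i and the indicator I_kTC(Mi) equals 1 if kTC is in model i and 0 otherwise.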
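The weighting and averaging steps can be sketched as follows. The sketch assumes that the model-averaged estimate is the Akaike-weighted mean taken over those models that contain the parameter, a common formulation (8); the function names, AICC scores, and parameter estimates are hypothetical.

```python
import math


def akaike_weights(aicc_scores):
    """Convert AICC scores (model name -> score) into Akaike weights."""
    best = min(aicc_scores.values())
    relative = {m: math.exp(-0.5 * (s - best)) for m, s in aicc_scores.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}


def model_average(weights, estimates):
    """Weighted mean of a parameter over the models that include it.

    estimates maps model name -> estimate and lists only the models
    in which the parameter is present.
    """
    numerator = sum(weights[m] * value for m, value in estimates.items())
    denominator = sum(weights[m] for m in estimates)
    if denominator == 0:
        raise ValueError("no weight on models containing the parameter")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical AICC scores for three models.
    scores = {"A": 100.0, "B": 102.0, "C": 110.0}
    weights = akaike_weights(scores)
    # Suppose the parameter is present in models A and B only.
    print(model_average(weights, {"A": 2.0, "B": 3.0}))
```

Subtracting the smallest score before exponentiating leaves the weights unchanged but avoids underflow when the raw AICC values are large.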

Application

KaKs_Calculator is written in standard C++. It can be readily compiled and run on Unix/Linux workstations (tested on AIX/IRIX/Solaris). In addition, we used Visual C++ 6.0 to build a graphical user interface and provide a Windows version that can run on any IBM-compatible computer under the Windows operating system (tested on Windows 2000/XP). Compiled executables for AIX/IRIX/Solaris and a setup application for Windows, as well as source code, example data, installation instructions, and documentation for KaKs_Calculator, are available at http://evolution.genomics.org.cn/software.htm.

Unlike other existing tools (22, 23), KaKs_Calculator employs model-selected and model-averaged methods based on a set of candidate models to estimate Ka and Ks. It integrates as many features as needed from sequence data and in most cases gives more reliable evolutionary information (see the comparative results on simulated sequences at http://evolution.genomics.org.cn/doc/SimulatedResults.xls) (24). KaKs_Calculator also reports comprehensive information estimated from the compared sequences, including the numbers of synonymous and nonsynonymous sites and substitutions, GC contents, maximum likelihood scores, and AICC. Moreover, KaKs_Calculator incorporates several other methods (19, 25–31) and allows users to choose one or more methods in a single run (Table 2).

Table 2

Approximate method

| Method | Mutation model#1: Step 1 | Step 2 | Step 3 | Reference |
|---|---|---|---|---|
| NG | JC | JC | JC | 26 |
| LWL | JC | K2P | K2P | 28 |
| MLWL | K2P | K2P | K2P | 30 |
| LPB | * | * | K2P | 25, 29 |
| MLPB | * | * | K2P | 30 |
| YN | HKY | HKY | HKY | 27 |
| MYN | TN | TN | TN | 31 |

Maximum likelihood method

| Method | Mutation model#2 | Reference |
|---|---|---|
| GY | HKY | 19, 20 |
| MS | a model that has the smallest AICC among 14 candidate models | Model-selected method proposed in this study |
| MA | a model that averages parameters across 14 candidate models | Model-averaged method proposed in this study |

#1 The approximate method involves three basic steps: Step 1: counting the numbers of synonymous and nonsynonymous sites; Step 2: calculating the numbers of synonymous and nonsynonymous substitutions; Step 3: correcting for multiple substitutions.
#2 The maximum likelihood method uses probability theory to complete the three steps in one go (4).
* No specific definition of synonymous and nonsynonymous sites or substitutions.

Although there exist 203 time-reversible models of nucleotide substitution (8), model selection in practice is often limited to a subset of them (32), and thus model averaging can reduce biases arising from model selection. Therefore, model-averaged methods should be preferred for general calculations of Ka and Ks. Some planned improvements include application of model selection and model averaging to detect positive selection at single amino acid sites, which requires high-speed computing for maximum likelihood estimation, especially when an adopted model becomes complex.
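The figure of 203 equals the number of ways to partition the six substitution rates into groups that share a value, which is the Bell number B6. A minimal sketch that reproduces the count using the Bell triangle is given below; the function name is an illustrative choice.

```python
def bell_number(n):
    """Number of ways to partition a set of n items (Bell triangle)."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its neighbor.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    # Six rates: rTC, rAG, rTA, rCG, rTG, rCA.
    print(bell_number(6))
```

The 14 candidate models in Table 1 cover 7 of these partitions, each paired with equal or unequal nucleotide frequencies.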

In conclusion, KaKs_Calculator incorporates as many features as needed for accurately extracting evolutionary information through model selection and model averaging; it may therefore be useful for in-depth studies of phylogeny and molecular evolution.

Authors’ contributions

ZZ designed and programmed this software and drafted the manuscript. JL carried out computer simulations to generate sequences. XQZ tested earlier versions of the software. JW and GKSW contributed to conceiving this software and participated in software design. JY supervised the study and revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors have declared that no competing interests exist.

Acknowledgements

We thank Professor Ziheng Yang for permission to use his invaluable source code in PAML and two anonymous reviewers for their constructive comments on an earlier version of this manuscript. We are grateful to Ya-Feng Hu, Lin Fang, Jia Ye, Hai-Feng Yuan, and Heng Li for their help in software development. We also thank a number of users and members of our institutes for reporting bugs and giving suggestions. This work was supported by grants from the Ministry of Science and Technology of China (No. 2001AA231061) and the National Natural Science Foundation of China (No. 30270748) awarded to JY.

References

1. Kimura M. The Neutral Theory of Molecular Evolution. Cambridge University Press; Cambridge, UK: 1983.
2. Li W.H. Molecular Evolution. Sinauer Associates; Sunderland, USA: 1997.
3. Fay J.C., Wu C.I. Sequence divergence, functional constraint, and selection in protein evolution. Annu. Rev. Genomics Hum. Genet. 2003;4:213–235.
4. Yang Z., Bielawski J.P. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 2000;15:496–503.
5. Muse S.V. Estimating synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 1996;13:105–114.
6. Sullivan J., Joyce P. Model selection in phylogenetics. Annu. Rev. Ecol. Evol. Syst. 2005;36:445–466.
7. Pybus O.G. Model selection and the molecular clock. PLoS Biol. 2006;4:e151.
8. Posada D., Buckley T.R. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 2004;53:793–808.
9. Jukes T.H., Cantor C.R. Evolution of protein molecules. In: Munro H.N., editor. Mammalian Protein Metabolism. Academic Press; New York, USA: 1969. pp. 21–123.
10. Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 1981;17:368–376.
11. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 1980;16:111–120.
12. Hasegawa M. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 1985;22:160–174.
13. Tamura K., Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 1993;10:512–526.
14. Kimura M. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA. 1981;78:454–458.
15. Zharkikh A. Estimation of evolutionary distances between nucleotide sequences. J. Mol. Evol. 1994;39:315–329.
16. Tavare S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect. Math. Life Sci. 1986;17:57–86.
17. Posada D. Using Modeltest and PAUP* to select a model of nucleotide substitution. In: Baxevanis A.D., editor. Current Protocols in Bioinformatics. John Wiley & Sons; New York, USA: 2003.
18. Lio P., Goldman N. Models of molecular evolution and phylogeny. Genome Res. 1998;8:1233–1244.
19. Goldman N., Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 1994;11:725–736.
20. Muse S.V., Gaut B.S. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 1994;11:715–724.
21. Akaike H. A new look at the statistical model identification. IEEE Trans. Autom. Control. 1974;19:716–723.
22. Comeron J.M. K-Estimator: calculation of the number of nucleotide substitutions per site and the confidence intervals. Bioinformatics. 1999;15:763–764.
23. Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 1997;13:555–556.
24. Zhang Z., Yu J. Evaluation of six methods for estimating synonymous and nonsynonymous substitution rates. Genomics Proteomics Bioinformatics. 2006;4:173–181.
25. Li W.H. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 1993;36:96–99.
26. Nei M., Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 1986;3:418–426.
27. Yang Z., Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 2000;17:32–43.
28. Li W.H. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 1985;2:150–174.
29. Pamilo P., Bianchi N.O. Evolution of the Zfx and Zfy genes: rates and interdependence between the genes. Mol. Biol. Evol. 1993;10:271–281.
30. Tzeng Y.H. Comparison of three methods for estimating rates of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 2004;21:2290–2298.
31. Zhang Z. Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC Evol. Biol. 2006;6:44.
32. Posada D., Crandall K.A. MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998;14:817–818.
Articles from Genomics, Proteomics & Bioinformatics are provided here courtesy of Elsevier
Genomics Proteomics Bioinformatics. 2006; 4(4): 259–263.
Published online 2007 May 23. doi: 10.1016/S1672-0229(07)60007-2
PMID: 17531802
This article has been cited by other articles in PMC.

Abstract

KaKs_Calculator is a software package that calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates through model selection and model averaging. Since existing methods for this estimation adopt their specific mutation (substitution) models that consider different evolutionary features, leading to diverse estimates, KaKs_Calculator implements a set of candidate models in a maximum likelihood framework and adopts the Akaike information criterion to measure fitness between models and data, aiming to include as many features as needed for accurately capturing evolutionary information in protein-coding sequences. In addition, several existing methods for calculating Ka and Ks are also incorporated into this software. KaKs_Calculator, including source codes, compiled executables, and documentation, is freely available for academic use at http://evolution.genomics.org.cn/software.htm.

Key words: model selection, model averaging, AIC, approximate method, maximum likelihood method

Introduction

Calculating nonsynonymous (Ka) and synonymous (Ks) substitution rates is of great significance in reconstructing phylogeny and understanding evolutionary dynamics of protein-coding sequences across closely related and yet diverged species 1., 2., 3.. It is known that Ka and Ks, or often their ratio (Ka/Ks), indicate neutral mutation when Ka equals to Ks, negative (purifying) selection when Ka is less than Ks, and positive (diversifying) selection when Ka exceeds Ks. Therefore, statistics of the two variables in genes from different evolutionary lineages provides a powerful tool for quantifying molecular evolution.

Over the past two decades, several methods have been developed for this purpose, which can generally be classified into two classes: approximate method and maximum likelihood method. The approximate method involves three basic steps: (1) counting the numbers of synonymous and nonsynonymous sites, (2) calculating the numbers of synonymous and nonsynonymous substitutions, and (3) correcting for multiple substitutions. On the other hand, the maximum likelihood method integrates evolutionary features (reflected in nucleotide models) into codon-based models and uses the probability theory to finish all the three steps in one go (4). However, these methods adopt different substitution or mutation models based on different assumptions that take account of various sequence features, giving rise to varied estimates of evolutionary distance (5). In other words, Ka and Ks estimation is sensitive to underlying assumptions or mutation models (3). In addition, since the amount and the degree of sequence substitutions vary among datasets from diverse taxa, a single model or method may not be adequate for accurate Ka and Ks calculations. Therefore, a model selection step, that is, to choose a best-fit model when estimating Ka and Ks, becomes critical for capturing appropriate evolutionary information 6., 7..

Toward this end, we have applied model selection and model averaging techniques for Ka and Ks estimations. We use a maximum likelihood method based on a set of candidate substitution models and adopt the Akaike information criterion (AIC) to measure fitness between models and data. After choosing the best-fit model for calculating Ka and Ks, we average the parameters across the candidate models to include as many features as needed since the true model is seldom one of the candidate models in practice (8). Finally, these considerations are incorporated into a software package, namely KaKs_Calculator.

Algorithm

Candidate models

Substitution models play a significant role in phylogenetic and evolutionary analyses of protein-coding sequences by integrating diverse processes of sequence evolution through various assumptions and providing approximations to datasets. We focused on a set of time-reversible substitution models 9., 10., 11., 12., 13., 14., 15., 16. as shown in Table 117., 18., ranging from the Jukes-Cantor (JC) model, which assumes that all substitutions have equal rates and equal nucleotide frequencies, to the general time-reversible (GTR) model that considers six different substitution rates and unequal nucleotide frequencies. Subsequently, we incorporated the parameters in each nucleotide model into a codon-based model 19., 20.. As a result, a general formula of the substitution rate qij from any sense codon i to j (ij) is given for all candidate models (19):

qij={0ifiandjdifferbymorethanonedifferencekxyπjifiandjdifferbyasynonymoussubstitutionofxforyωkxyπjifiandjdifferbyanonsynonymoussubstitutionofxfory

where πj is the frequency of codon j, ω is the Ka/Ks ratio, and kxy is the ratio of rxy to rCA, x, y ∊{A, C, G, T} (Table 1). For example, in the JC model, kxy and πj are equal to 1 owing to equal substitution rates and equal nucleotide frequencies assumed. In the Hasegawa-Kishino-Yano (HKY) model, kTC and kAG become equivalent to the transition/transversion rate ratio and πj can be estimated from sequences, similar to the method reported by Goldman and Yang (19). Other models can be accommodated by making obvious modifications. Therefore, we could acquire maximum likelihood scores in various values generated from individual candidate model by implementing the codon-based models in a maximum likelihood framework 19., 20..

Table 1

Candidate Models for Model Selection and Model Averaging in KaKs_Calculator

ModelDescription (Reference)Nucleotide frequencySubstitution rate*
JC
F81
Jukes-Cantor model (9)
Felsenstein’s model (10)
Equal
Unequal
rTC = rAG = rTA = rCG = rTG = rCA
K2P
HKY
Kimura’s two-parameter model (11)
Hasegawa-Kishino-Yano model (12)
Equal
Unequal
rTC = rAG ≠ rTA = rCG = rTG = rCA
TNEF
TN
TN model with equal nucleotide frequencies
Tamura-Nei model (13)
Equal
Unequal
rTC ≠ rAG ≠ rTA = rCG = rTG = rCA
K3P
K3PUF
Kimura’s three-parameter model (14)
K3P model with unequal nucleotide frequencies
Equal
Unequal
rTC = rAG ≠ rTA = rCG ≠ rTG = rCA
TIMEF
TIM
Transition model with equal nucleotide frequencies
Transition model
Equal
Unequal
rTC ≠ rAG ≠ rTA = rCG ≠ rTG = rCA
TVMEF
TVM
Transversion model with equal nucleotide frequencies
Transversion model
Equal
Unequal
rTC = rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA
SYM
GTR
Symmetrical model (15)
General time-reversible model (16)
Equal
Unequal
rTC ≠ rAG ≠ rTA ≠ rCG ≠ rTG ≠ rCA
*rij indicates the rate of substitution of i for j, where i, j ∊ {A, C, G, T}.

Model selection

AIC (21) has been widely used in model selection aside from other methods such as the likelihood ratio test (LRT) and the Bayesian information criterion (BIC) (8). AIC characterizes the Kullback-Leibler distance between a true model and an examined model, and this distance can be regarded as quantifying the information lost by approximating the true model. KaKs_Calculator uses a modification of AIC (AICC), which takes account of sampling size (n), maximum likelihood score (lnLi), and the number of parameters (ki) in model i as follows:

AICCi=AICi+2ki(ki+1)nki1=2lnLi+2ki+2ki(ki+1)nki1

AICC is proposed to correct for small sampling size, and it approaches to AIC when sampling size comes to infinity. Consequently, we could use this equation to compute AICC for each candidate model and then identify a model that possesses the smallest AICC, which is a sign for appropriateness between models and data.

Model averaging

Model selection is merely an approximate fit to a dataset, whereas a true evolutionary model is seldom one of the candidate models (8). Therefore, an alternative way is model averaging, which assigns each candidate model a weight value and engages more than one model to estimate average parameters across models. Accordingly, we first need to compute the Akaike weight (wi, where i = 1, 2,…, m) for each model in a set of candidate models:

Profit Calculator Forex

wi=exp[12(AICCiminAICC)]j=1mexp[12(AICCjminAICC)]

where min AICC is the smallest AICC value among candidate models. We can then estimate model-averaged parameters. Taking kTC as an example, a model-averaged estimate can be calculated by:

where kTC,i is kTC in model i and

Application

KaKs_Calculator is written in standard C++ language. It is readily compiled and run on Unix/Linux or workstation (tested on AIX/IRIX/Solaris). In addition, we use Visual C++ 6.0 for graphic user interface and provide its Windows version that can run on any IBM compatible computer under Windows operating system (tested on Windows 2000/XP). Compiled executables on AIX/IRIX/Solaris and setup application on Windows, as well as source codes, example data, instructions for installation and documentation for KaKs_Calculator is available at http://evolution.genomics.org.cn/software.htm.

Different from other existing tools 22., 23., KaKs_Calculator employs model-selected and model-averaged methods based on a set of candidate models to estimate Ka and Ks. It integrates as many features as needed from sequence data and in most cases gives rise to more reliable evolutionary information (see the comparative results on simulated sequences at http://evolution.genomics.org.cn/doc/SimulatedResults.xls) (24). KaKs_Calculator also provides comprehensive information estimated from compared sequences, including the numbers of synonymous and nonsynonymous sites and substitutions, GC contents, maximum likelihood scores, and AICC. Moreover, KaKs_Calculator incorporates several other methods 19., 25., 26., 27., 28., 29., 30., 31. and allows users to choose one or more methods at one running time (Table 2).

Table 2

Approximate method
MethodMutation model#1Reference
Step 1Step 2Step 3
NGJCJCJC26
LWLJCK2PK2P28
MLWLK2PK2PK2P30
LPB**K2P25., 29.
MLPB**K2P30
YNHKYHKYHKY27
MYNTNTNTN31
Maximum likelihood method
MethodMutation model#2Reference
GYHKY19., 20.
MSa model that has the smallest AICC among 14 candidate modelsModel-selected method proposed in this study
MAa model that averages parameters across 14 candidate modelsModel-averaged method proposed in this study
#1The approximate method involves three basic steps: Step 1: counting the numbers of synonymous and nonsynonymous sites; Step 2: calculating the numbers of synonymous and nonsynonymous substitutions; Step 3: correcting for multiple substitutions.
#2The maximum likelihood method uses the probability theory to finish the three steps in one go (4).
*No specific definition of synonymous and nonsynonymous sites or substitutions.

Although there exist 203 time-reversible models of nucleotide substitution (8), model selection in practice is often limited to a subset of them (32), and thus model averaging can reduce biases arising from model selection. Therefore, model-averaged methods should be preferred for general calculations of Ka and Ks. Some planned improvements include application of model selection and model averaging to detect positive selection at single amino acid sites, which requires high-speed computing for maximum likelihood estimation, especially when an adopted model becomes complex.

In conclusion, KaKs_Calculator incorporates as many features as needed for accurately extracting evolutionary information through model selection and model averaging, therefore it may be useful for in-depth studies on phylogeny and molecular evolution.

Authors’ contributions

ZZ designed and programmed this software, and drafted the manuscript. JL carried out computer simulations to generate sequences. XQZ performed test for earlier versions of the software. JW and GKSW contributed in conceiving this software and participated in software design. JY supervised the study and revised the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors have declared that no competing interests exist.

Acknowledgements

We thank Professor Ziheng Yang for the permission to use his invaluable source codes in PAML and two anonymous reviewers for their constructive comments on an earlier version of this manuscript. We are grateful to Ya-Feng Hu, Lin Fang, Jia Ye, Hai-Feng Yuan, and Heng Li for their help in software development. We also thank a number of users and members of our institutes for reporting bugs and giving suggestions. This work was supported by grants from the Ministry of Science and Technology of China (No. 2001AA231061) and the National Natural Science Foundation of China (No. 30270748) awarded to JY.

References

1. Kimura M. The Neutral Theory of Molecular Evolution. Cambridge University Press; Cambridge, UK: 1983.
2. Li W.H. Molecular Evolution. Sinauer Associates; Sunderland, USA: 1997.
3. Fay J.C., Wu C.I. Sequence divergence, functional constraint, and selection in protein evolution. Annu. Rev. Genomics Hum. Genet. 2003;4:213–235.
4. Yang Z., Bielawski J.P. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 2000;15:496–503.
5. Muse S.V. Estimating synonymous and nonsynonymous substitution rates. Mol. Biol. Evol. 1996;13:105–114.
6. Sullivan J., Joyce P. Model selection in phylogenetics. Annu. Rev. Ecol. Evol. Syst. 2005;36:445–466.
7. Pybus O.G. Model selection and the molecular clock. PLoS Biol. 2006;4:e151.
8. Posada D., Buckley T.R. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 2004;53:793–808.
9. Jukes T.H., Cantor C.R. Evolution of protein molecules. In: Munro H.N., editor. Mammalian Protein Metabolism. Academic Press; New York, USA: 1969. pp. 21–123.
10. Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 1981;17:368–376.
11. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 1980;16:111–120.
12. Hasegawa M. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 1985;22:160–174.
13. Tamura K., Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 1993;10:512–526.
14. Kimura M. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA. 1981;78:454–458.
15. Zharkikh A. Estimation of evolutionary distances between nucleotide sequences. J. Mol. Evol. 1994;39:315–329.
16. Tavare S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect. Math. Life Sci. 1986;17:57–86.
17. Posada D. Using Modeltest and PAUP* to select a model of nucleotide substitution. In: Baxevanis A.D., editor. Current Protocols in Bioinformatics. John Wiley & Sons; New York, USA: 2003.
18. Lio P., Goldman N. Models of molecular evolution and phylogeny. Genome Res. 1998;8:1233–1244.
19. Goldman N., Yang Z. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol. Biol. Evol. 1994;11:725–736.
20. Muse S.V., Gaut B.S. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol. Biol. Evol. 1994;11:715–724.
21. Akaike H. A new look at the statistical model identification. IEEE Trans. Autom. Control. 1974;19:716–723.
22. Comeron J.M. K-Estimator: calculation of the number of nucleotide substitutions per site and the confidence intervals. Bioinformatics. 1999;15:763–764.
23. Yang Z. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 1997;13:555–556.

24. Zhang Z., Yu J. Evaluation of six methods for estimating synonymous and nonsynonymous substitution rates. Genomics Proteomics Bioinformatics. 2006;4:173–181.
25. Li W.H. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 1993;36:96–99.
26. Nei M., Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 1986;3:418–426.
27. Yang Z., Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 2000;17:32–43.
28. Li W.H. A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 1985;2:150–174.
29. Pamilo P., Bianchi N.O. Evolution of the Zfx and Zfy genes: rates and interdependence between the genes. Mol. Biol. Evol. 1993;10:271–281.
30. Tzeng Y.H. Comparison of three methods for estimating rates of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 2004;21:2290–2298.
31. Zhang Z. Computing Ka and Ks with a consideration of unequal transitional substitutions. BMC Evol. Biol. 2006;6:44.
32. Posada D., Crandall K.A. MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998;14:817–818.
Articles from Genomics, Proteomics & Bioinformatics are provided here courtesy of Elsevier