Table 5 Comparisons between various amino acid and codon substitution models for the reference phylogenetic tree of the HA_Human-Flu-A-H1N1

From: Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis

| Substitution model (a) | K (b) | $\Delta\ell$ (c) | $\Delta$AIC (c) | $\Delta$BIC (c) | $\beta$ (d, e) | $w_0$ (d) | $\langle e^{w_{ab}} \rangle$ (f) | $\hat{m}$ (g) | $\hat{m}_{tc \vert ag} / \hat{m}_{[tc][ag]}$ (h) | $\hat{\alpha}$ (i) |
|---|---|---|---|---|---|---|---|---|---|---|
| Amino acid substitution models | | | | | | | | | | |
| FLU-dG4r | 1 | 0.0 | 0.0 | 0.0 | | | | | | 0.913 |
| mtREV-F-dG4r | 20 | -985.5 | 2009.0 | 2085.2 | | | | | | 0.809 |
| LG-F-dG4r | 20 | -885.1 | 1808.3 | 1884.5 | | | | | | 0.856 |
| WAG-F-dG4r | 20 | -777.1 | 1592.2 | 1668.4 | | | | | | 0.882 |
| cpREV10-F-dG4r | 20 | -695.8 | 1429.5 | 1505.8 | | | | | | 0.858 |
| JTT-F-dG4r | 20 | -386.3 | 810.5 | 886.7 | | | | | | 0.892 |
| cpREV64-F-dG4r | 20 | -167.9 | 373.8 | 450.0 | | | | | | 0.840 |
| FLU-F-dG4r | 20 | 8.1 | 21.7 | 98.0 | | | | | | 0.907 |
| Mechanistic codon substitution models | | | | | | | | | | |
| Equal-Constraint-10-F-dG4r | 30 | 203.4 | -348.7 | -232.4 | (0.0) | -1.109 | 0.330 | 0.010 | 4.768 | 0.828 |
| EI-11-F-dG4r | 31 | 332.7 | -605.4 | -485.1 | 0.311 | -0.609 | 0.212 | 0.013 | 4.835 | 0.880 |
| LG-ML91+-11-F-dG4r | 31 | 394.6 | -729.2 | -608.8 | 0.453 | -0.690 | 0.151 | 0.014 | 4.792 | 0.920 |
| WAG-ML91+-11-F-dG4r | 31 | 405.2 | -750.4 | -630.1 | 0.565 | -0.679 | 0.145 | 0.018 | 4.825 | 0.940 |
| KHG-ML200-11-F-dG4r | 31 | 410.0 | -760.0 | -639.7 | 0.676 | -0.214 | 0.202 | 0.009 | 3.287 | 0.923 |
| JTT-ML91+-11-F-dG4r | 31 | 418.3 | -776.6 | -656.2 | 0.636 | -0.425 | 0.162 | 0.027 | 3.725 | 0.923 |
| JTT-ML91+-11-F-dG8r | 31 | 441.2 | -822.3 | -702.0 | 0.641 | -0.446 | 0.157 | 0.026 | 3.745 | 0.923 |
| Equal-Constraint-10-F-dG4s | 30 | 206.3 | -354.7 | -238.4 | (0.0) | -1.434 | 0.238 | 0.010 | 4.754 | 0.823 |
| EI-11-F-dG4s | 31 | 328.5 | -596.9 | -476.6 | 0.332 | -0.495 | 0.225 | 0.015 | 4.741 | 0.887 |
| LG-ML91+-11-F-dG4s | 31 | 397.6 | -735.2 | -614.9 | 0.454 | -0.962 | 0.115 | 0.014 | 4.780 | 0.903 |
| KHG-ML200-11-F-dG4s | 31 | 412.5 | -765.1 | -644.7 | 0.676 | -0.662 | 0.129 | 0.009 | 3.300 | 0.923 |
| WAG-ML91+-11-F-dG4s | 31 | 415.0 | -770.0 | -649.6 | 0.627 | -0.303 | 0.190 | 0.021 | 4.620 | 0.890 |
| JTT-ML91+-11-F-dG4s | 31 | 421.1 | -782.2 | -661.9 | 0.635 | -0.761 | 0.116 | 0.027 | 3.722 | 0.918 |
| JTT-ML91+-11-F-dG8s | 31 | 457.7 | -855.4 | -735.1 | 0.731 | -0.317 | 0.152 | 0.029 | 3.630 | 0.911 |
| Equal-Constraint-10-F-dG4sf | 87 | 297.2 | -422.3 | -77.4 | (0.0) | -1.549 | 0.212 | 0.010 | 4.603 | 0.716 |
| EI-11-F-dG4sf | 88 | 405.8 | -637.7 | -288.7 | 0.313 | -0.526 | 0.229 | 0.014 | 4.366 | 0.856 |
| KHG-ML200-11-F-dG4sf | 88 | 428.1 | -682.2 | -333.2 | 0.565 | -0.674 | 0.155 | 0.010 | 3.397 | 0.920 |
| LG-ML91+-11-F-dG4sf | 88 | 439.7 | -705.5 | -356.5 | 0.369 | -1.050 | 0.128 | 0.016 | 4.575 | 0.885 |
| WAG-ML91+-11-F-dG4sf | 88 | 443.3 | -712.6 | -363.7 | 0.658 | -0.012 | 0.241 | 0.023 | 4.446 | 0.864 |
| JTT-ML91+-11-F-dG4sf | 88 | 447.8 | -721.6 | -372.7 | 0.686 | -0.200 | 0.185 | 0.032 | 3.520 | 0.871 |

  1. (a) "-F" means that the equilibrium frequencies are estimated to be equal to those in the alignment; equal codon usage is assumed. "-dGmr" and "-dGms" mean discrete gamma distributions with $m$ categories of unequal probabilities for the rate variation and the variation of selective constraint across sites, respectively. "-dGmsf" means the equilibrium frequencies for respective categories are estimated from their posterior probabilities for sites. The number string in the model name indicates the number of parameters optimized for the substitution rate matrix, and the remaining strings denote the rate matrix or selective constraint matrix used.
  2. (b) The number of adjustable parameters.
  3. (c) Difference from the reference state; $\Delta\ell = \ell + 20059.7$, $\Delta\mathrm{AIC} = \mathrm{AIC} - 40121.5$, and $\Delta\mathrm{BIC} = \mathrm{BIC} - 40125.5$. The reference tree topology is the one inferred by FastTree-2 [36].
  4. (d) $w_{ab} = \min[\beta w_{ab}^{\mathrm{estimate}} + w_0 (1 - \delta_{ab}), 0]$; $w_{ab}^{\mathrm{estimate}}$ is the one specified by the model name.
  5. (e) A value in parentheses means that the parameter is fixed at the value specified.
  6. (f) The average of $e^{w_{ab}}$ over all amino acid pairs $\{a, b\}$; $\langle e^{w_{ab}} \rangle \equiv \frac{1}{190} \sum_{a} \sum_{b > a} e^{w_{ab}}$.
  7. (g) The ratio of double to single and of triple to double nucleotide change exchangeability; $\hat{m} \equiv \hat{m}_{[tc][ag]}$.
  8. (h) The ratio of mean transitional to mean transversional exchangeability; $\hat{m}_{tc \vert ag} / \hat{m}_{[tc][ag]}$.
  9. (i) The shape parameter of a discrete gamma distribution for the variation of mutation rate or selective constraint across sites.
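
The footnote definitions can be checked against individual rows of the table. The sketch below is a minimal illustration, not the authors' code. It assumes the standard definitions AIC = -2ℓ + 2K and BIC = -2ℓ + K log N, which the table does not state, and it assumes that δ_ab is the Kronecker delta, so that for a ≠ b the constraint reduces to min[β w_ab^estimate + w_0, 0]. The sample size N is not given here, so log N is backed out from one tabulated row and reused for another. All function and variable names are hypothetical.

```python
import itertools
import math


def delta_aic(delta_ll, k, k_ref=1):
    """Delta-AIC versus the reference model, assuming AIC = -2*ll + 2*K."""
    return -2.0 * delta_ll + 2.0 * (k - k_ref)


def delta_bic(delta_ll, k, log_n, k_ref=1):
    """Delta-BIC versus the reference model, assuming BIC = -2*ll + K*log(N)."""
    return -2.0 * delta_ll + (k - k_ref) * log_n


def infer_log_n(delta_ll, delta_bic_value, k, k_ref=1):
    """Back out log(N) from one tabulated row (N itself is not given here)."""
    return (delta_bic_value + 2.0 * delta_ll) / (k - k_ref)


def mean_exp_w(w_estimate, beta, w0):
    """Average of exp(w_ab) over the supplied pairs with a != b, where
    w_ab = min(beta * w_estimate_ab + w0, 0) and delta_ab is taken as 0."""
    values = [math.exp(min(beta * w + w0, 0.0)) for w in w_estimate.values()]
    return sum(values) / len(values)


# The 190 unordered amino acid pairs {a, b} with b > a.
PAIRS = list(itertools.combinations("ACDEFGHIKLMNPQRSTVWY", 2))

# Placeholder constraint matrix (an assumption): with beta fixed at 0, as in
# the Equal-Constraint rows, the w_estimate values do not affect the result.
placeholder_w = {pair: 0.0 for pair in PAIRS}

# mtREV-F-dG4r row: K = 20, delta_ll = -985.5 (tabulated delta-AIC: 2009.0).
print(delta_aic(-985.5, 20))

# Infer log(N) from the mtREV-F-dG4r row, then apply it to the
# JTT-ML91+-11-F-dG4sf row (K = 88, tabulated delta-BIC: -372.7).
log_n = infer_log_n(-985.5, 2085.2, 20)
print(delta_bic(447.8, 88, log_n))

# Equal-Constraint-10-F-dG4r row: beta = 0, w0 = -1.109 (tabulated mean: 0.330).
print(mean_exp_w(placeholder_w, beta=0.0, w0=-1.109))
```

Because the tabulated values are rounded to one decimal place, recomputed ΔAIC and ΔBIC values can differ from the table by about 0.1. For the Equal-Constraint rows, every pair shares the same w_ab = w_0, so the average reduces to exp(w_0), which is why those rows make a convenient check on footnotes (d) and (f).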