Fottea 2016, 16(2):209-217 | DOI: 10.5507/fot.2015.030

A simple method to test the reproducibility of the phylogenetic reconstructions: the molecular systematics of cyanobacteria as a case study

Andrea Paparini1*, Elvina Lee1, Andrew Bath2, Cameron Gordon2, Una M. Ryan1
1 Vector- and Water-Borne Pathogen Research Group, School of Veterinary & Life Sciences, Murdoch University, WA, Australia; *Corresponding author e-mail: a.paparini@murdoch.edu.au, tel.: +61 8 9360 7649
2 Water Quality Branch, Water Corporation, 629 Newcastle Street, Leederville, Western Australia 6007

Molecular systematics uses currently available data to produce the best approximation to the true (unobservable) phylogeny of a taxon. Molecular phylogeny complements the morphological identification and classification of organisms and is used to infer their evolutionary relationships. In the current era, dominated by cultivation-independent surveys, it appears crucial to test the technical and analytical pitfalls and limitations of environmental DNA surveys. Sequence-based phylogenetic reconstructions rely on three main steps: alignment, alignment curation and tree building. Several independent options and settings can be adopted at each step, but it is well known that their choice (or combination) can significantly affect the topology of the resulting phylogenetic tree and undermine the reliability of the resultant systematics. For the present study, five alignment algorithms, two curation options and three tree-building methods were used to infer the phylogeny of three orders of cyanobacteria, based on four validated markers widely used for this phylum: 16S rRNA, 16S-23S ITS, cpcBA-IGS and rpoC1. The tree-building method was found to have a greater effect on the resultant tree topology than either the alignment algorithm or the curation stringency. This result was consistent for all loci, including the genetically constrained (protein-coding) locus rpoC1. The reproducibility of the tree topology was visualized and measured for each locus. This paper presents pitfalls in cyanobacteria systematics and implements a simple and rapid method, applicable to any locus and organism, to identify aberrant results and assess the reproducibility of phylogenetic reconstructions.
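The factorial design described in the abstract (five alignment algorithms, two curation options, three tree-building methods) can be illustrated with a minimal sketch. The abstract does not name the individual programs or the measure used to compare topologies, so everything below is an assumption made for illustration: the option labels are placeholders, trees are represented as nested tuples of taxon names, and topologies are compared by counting the clades found in only one of two rooted trees (a rooted symmetric-difference distance), which is not necessarily the statistic used in the study.

```python
from itertools import product

# Placeholder labels (hypothetical): the abstract gives only the number of
# options tested at each step, not the names of the programs.
ALIGNERS = [f"aligner_{i}" for i in range(1, 6)]          # five alignment algorithms
CURATIONS = ["curation_a", "curation_b"]                  # two curation options
TREE_METHODS = [f"tree_method_{i}" for i in range(1, 4)]  # three tree-building methods
LOCI = ["16S rRNA", "16S-23S ITS", "cpcBA-IGS", "rpoC1"]  # loci named in the abstract


def pipeline_grid():
    """Return every (aligner, curation, tree method) combination."""
    return list(product(ALIGNERS, CURATIONS, TREE_METHODS))


def clades(tree):
    """Collect the non-trivial clades of a rooted tree.

    A tree is a nested tuple whose leaves are taxon names (strings).
    Each clade is returned as a frozenset of the leaf names below one
    internal node; the root clade (all leaves) is excluded.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found


def clade_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two rooted trees.

    Zero means identical topologies; larger values mean the two
    reconstructions disagree on more groupings. Assumed metric, chosen
    here only to make the idea of topology comparison concrete.
    """
    return len(clades(tree_a) ^ clades(tree_b))
```

In such a sketch, one tree per combination and locus would be produced by external software, and pairwise `clade_distance` values could then be grouped by the step that differs between two pipelines (aligner, curation or tree method) to see which step changes the topology most.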

Keywords: 16S rRNA, 16S-23S ITS, cyanobacteria, molecular phylogeny, phycocyanin operon, rpoC1

Received: October 16, 2015; Accepted: December 22, 2015; Prepublished online: July 20, 2016; Published: October 14, 2016

Paparini, A., Lee, E., Bath, A., Gordon, C., & Ryan, U.M. (2016). A simple method to test the reproducibility of the phylogenetic reconstructions: the molecular systematics of cyanobacteria as a case study. Fottea, 16(2), 209-217. doi: 10.5507/fot.2015.030
