Phylogenetic Biology - Final Project

Guidelines - you can delete this section before submission

This repository is a stub for your final project. Fork it, develop your project, and submit it as a pull request. Edit or delete the text in this readme as needed.

Some guidelines and tips:

  • Use the stubs below to write up your final project. Alternatively, if you would like the writeup to be an executable document (with knitr, Jupyter, or other tools), you can create it as a separate file and put a link to it here in the readme.

  • For information on formatting text files with markdown, see https://guides.github.com/features/mastering-markdown/ . You can use markdown to include images in this document by linking to files in the repository, e.g. ![GitHub Logo](/images/logo.png).

  • The project must be entirely reproducible. In addition to the results, the repository must include all the data (or links to data) and code needed to reproduce the results.

  • If you are working with unpublished data that you would prefer not to publicly share at this time, please contact me to discuss options. In most cases, the data can be anonymized in a way that putting them in a public repo does not compromise your other goals.

  • Paste references (including urls) into the reference section, and cite them with the general format (Smith et al. 2003).

  • Commit and push often as you work.

OK, here we go.

A Phylogeny of the Temnospondyls

Introduction and Goals

The Temnospondyls were a highly successful order of amphibian-like tetrapods that arose during the Carboniferous. They survived the devastating Permian extinction and persisted through the Triassic, but were almost entirely wiped out by the smaller Triassic-Jurassic extinction. The last known temnospondyl species died out during the Cretaceous.

The placement of Temnospondyli with regard to Amphibia is controversial. It is unclear whether temnospondyls were ancestral to modern amphibians or a divergent group with no modern descendants. Their relationship to the Lepospondyls, another extinct clade that lived at the same time and suffers from the same ambiguity, is likewise unresolved.

Temnospondyls are inherently interesting: they were the most diverse group of early tetrapods (Ruta et al. 2007), and they persisted from the Carboniferous to the Cretaceous in spite of global extinction events and what must have been fierce competition from other amphibians, reptiles, and dinosaurs. The mystery surrounding their placement relative to the amphibians adds to the allure, though that question is beyond the scope of this project.

A seminal paper by Yates and Warren (2000) built a phylogeny of the temnospondyls using maximum parsimony. The immediate focus of this project is to attempt to reconstruct their tree using maximum likelihood. I used the character data from that paper to build trees under the three multistate models included with RAxML. I also ran a secondary Bayesian analysis using RevBayes.

Methods

After digitizing the character matrix, I ran my analyses in RAxML. To make as few assumptions about the data as possible, and to avoid fixing parameters a priori, I used the multistate GAMMA model of rate heterogeneity (RAxML option -m MULTIGAMMA). I ran three replicates, each with a different seed, of each of the three multistate models included with RAxML: GTR, MK, and ORDERED. Each run was bootstrapped with 100 replicates. The shell file and input data are included in the git repository, under /RAxML/run.
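The run design above (three models, three seeds each, 100 bootstrap replicates) can be sketched as a small command generator. This is only an illustration, not the shell file in /RAxML/run: the binary name `raxmlHPC`, the input file name `matrix.phy`, the seed values, the run names, and the choice of rapid bootstrapping (`-f a -x`) are all assumptions. The `-m MULTIGAMMA` option comes from the text, and `-K` is the RAxML option that selects the multistate substitution model.

```python
from itertools import product

MODELS = ["GTR", "MK", "ORDERED"]   # multistate models named in the text
SEEDS = [12345, 23456, 34567]       # hypothetical seeds, one per replicate


def raxml_commands(alignment="matrix.phy", n_boot=100):
    """Build one RAxML argument list per (model, seed) combination.

    The alignment name and the rapid-bootstrap flags are assumptions;
    adjust them to match the actual run script.
    """
    commands = []
    for model, (rep, seed) in product(MODELS, enumerate(SEEDS, start=1)):
        commands.append([
            "raxmlHPC",
            "-s", alignment,                 # input character matrix
            "-n", f"{model}_seed{rep}",      # run name used in output files
            "-m", "MULTIGAMMA",              # multistate data, GAMMA rate heterogeneity
            "-K", model,                     # multistate substitution model
            "-p", str(seed),                 # seed for the starting parsimony tree
            "-f", "a", "-x", str(seed),      # assumed: rapid bootstrap plus ML search
            "-#", str(n_boot),               # number of bootstrap replicates
        ])
    return commands


if __name__ == "__main__":
    for cmd in raxml_commands():
        print(" ".join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without RAxML installed; the lists could be passed to `subprocess.run` once the assumed names are replaced.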

With the help of Michael Landis I implemented a CTMC model of morphological character evolution in RevBayes to try to recapitulate the data and run biogeographical analyses. Unfortunately, time constraints and considerable technical difficulties hampered this line of inquiry, so both are covered only briefly here. Scripts of interest can be found in the RevBayes subfolder of this repository, with the biogeographical script in an eponymous sub-subfolder.
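As a rough illustration of what a CTMC model of discrete character evolution computes, the sketch below assumes the simplest case: an equal-rates (Mk-type) model with k states, where every change between distinct states happens at the same rate. It is not taken from the RevBayes scripts; the function name and the rate parameter `mu` are hypothetical. Under that assumption the transition probabilities along a branch of length t have a closed form.

```python
import math


def mk_transition_matrix(k, t, mu=1.0):
    """Transition probabilities for an equal-rates k-state CTMC.

    Assumes every off-diagonal entry of the rate matrix equals mu,
    so each diagonal entry is -(k - 1) * mu.
    """
    decay = math.exp(-k * mu * t)
    p_same = 1.0 / k + (k - 1) / k * decay   # probability of ending in the starting state
    p_diff = 1.0 / k - decay / k             # probability of ending in any other given state
    return [[p_same if i == j else p_diff for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    for row in mk_transition_matrix(k=3, t=0.5):
        print([round(p, 4) for p in row])
```

At t = 0 the matrix is the identity, and as t grows every entry tends to 1/k, which is why long branches carry little information about the ancestral state.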

Results

The trees generated by the RAxML analyses were all able to separate the "lower" Temnospondyl outgroup from the higher Limnarchia taxa that were of interest to Yates and Warren. However, none reproduced the pattern or degree of clade grouping shown by the parsimony tree. For reference, the following sample tree, used hereafter as a baseline "generic" tree, is highly unbalanced and lacks the rich clade structure of the 2000 tree.

(Figure: GTR tree, seed1)

All trees generated by this analysis can be found (both as raw output suitable for viewing in FigTree and as pdf files) in the RAxML subfolder, within the sub-subfolders corresponding to the analyses that produced them.

Of particular concern is the low bootstrap support for most branches, which suggests that much of the topology returned by RAxML is not supported by the data. What's more, trees built under different models are not very consistent with one another. However, the trees do share certain loose groupings: most, as mentioned, place the lower Temnospondyls by themselves, and most identify certain sister taxa (e.g. ''Isodectes'' and ''Acroplous'') with a respectable degree of certainty.

The RevBayes analysis used the example tree as a prior. Unfortunately, because I lack facility in the Rev language, I fear that the script did not work as intended: the prior and posterior probabilities for each branch were identical and equal to 1. Some value did emerge, however. Included is a trace of the branch on the tree above that connects to the leaf ''Archegosaurus''.

(Figure: GTR tree, seed1)

The erratic behavior of the trace suggests that this branch, along with others, is problematic because of the data itself.
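The low bootstrap support described above can be summarized numerically. The following is a minimal sketch, assuming the support values are stored as node labels directly after each closing parenthesis in a Newick string; the toy tree, the function names, and the threshold of 70 are all hypothetical and are not results from this project.

```python
import re

# Matches a numeric label that immediately follows a closing parenthesis,
# e.g. the "95" in "(A:0.1,B:0.2)95:0.05".
SUPPORT_PATTERN = re.compile(r"\)(\d+(?:\.\d+)?)")


def bootstrap_supports(newick):
    """Return all internal-node support values found in a Newick string."""
    return [float(value) for value in SUPPORT_PATTERN.findall(newick)]


def fraction_below(supports, threshold):
    """Fraction of labelled branches whose support is below the threshold."""
    if not supports:
        return 0.0
    return sum(1 for s in supports if s < threshold) / len(supports)


if __name__ == "__main__":
    # Toy tree for illustration only.
    toy_tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.1)40:0.02,E:0.3);"
    supports = bootstrap_supports(toy_tree)
    print(supports)
    print(fraction_below(supports, threshold=70))
```

Comparing this fraction across the GTR, MK, and ORDERED runs would give one simple way to say which model, if any, yields better-supported branches.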

Discussion

The desultory character of the trace may hold a clue as to why some branches have such low values and others have high ones. A trace that jumps between different states can be associated with insufficient burn-in or with divergence of the runs. The former seems unlikely, given that the burn-in periods look identical and that other branches behave better. The matrix of morphological characters could instead be impeding the resolution of the tree. There are many gaps in the matrix, due both to the incompleteness of the fossils found and to ambiguity between specimens. Furthermore, modern amphibians are well known to undergo dramatic morphological transformations during their lives, which could contribute to ambiguity if the Temnospondyls were anything like them.
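One way to check whether gaps in the matrix line up with the poorly behaved branches is to tabulate the fraction of missing entries per taxon. The sketch below assumes missing or inapplicable states are coded as "?" or "-"; the taxon names and character strings are placeholders, not values from the Yates and Warren matrix.

```python
MISSING_SYMBOLS = {"?", "-"}   # assumed coding for missing or inapplicable states


def missing_fraction(characters):
    """Fraction of entries in one taxon's character string that are missing."""
    if not characters:
        return 0.0
    return sum(1 for c in characters if c in MISSING_SYMBOLS) / len(characters)


def missing_by_taxon(matrix):
    """Map each taxon name to its fraction of missing entries."""
    return {taxon: missing_fraction(chars) for taxon, chars in matrix.items()}


if __name__ == "__main__":
    # Placeholder matrix for illustration only.
    toy_matrix = {
        "taxon_A": "0101?",
        "taxon_B": "1??0?",
        "taxon_C": "00110",
    }
    ranked = sorted(missing_by_taxon(toy_matrix).items(), key=lambda kv: -kv[1])
    for taxon, fraction in ranked:
        print(f"{taxon}\t{fraction:.2f}")
```

If the taxa with the most missing data are also the ones attached to low-support or erratic branches, that would support the suspicion that the matrix, rather than the method, limits resolution.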

The inability of the trees to confirm the tree from the paper leads me to declare my analysis inconclusive. However, this is due in no small part to the paucity of methods for simulating character evolution on morphological data sets, particularly in cases of purely fossil data.

Going forward, I would definitely like to try RevBayes again, with more practice to work out the kinks in my own understanding. In particular, I'd like to re-implement the biogeographical analysis, for which I wrote a script but could not get it to work properly. I would also like to estimate the enrichment of the Temnospondyl fossil record, to permit rooting of the tree at actual fossil dates.

References

Ruta, M., Pisani, D., Lloyd, G. T., and Benton, M. J. (2007). ''A supertree of Temnospondyli: cladogenetic patterns in the most species-rich group of early tetrapods''. Proc. R. Soc. B, 274: 3087–3095. doi:10.1098/rspb.2007.1250

Yates, A. M. and Warren, A. A. (2000). ''The phylogeny of the ‘higher’ temnospondyls (Vertebrata: Choanata) and its implications for the monophyly and origins of the Stereospondyli''. Zoological Journal of the Linnean Society, 128: 77–121. doi:10.1111/j.1096-3642.2000.tb00650.x

TO DO List

  1. Find a way to estimate branch lengths (can I estimate the number of unsampled taxa??)
  2. Grapple with the problem of implementing biogeography when continental positions have changed massively between the Carboniferous and the Jurassic.
