REPHINE: REcursive PHylogenetic INferencE

Molecular phylogenetic analysis is often a two-step process: first, a multiple sequence alignment (MSA) is inferred from the raw data; second, assuming a model of evolution, a tree is inferred from the MSA. A problem arises when species-rich data sets are used to concurrently estimate ancient and recent divergence events. To estimate ancient divergence events accurately, slow-evolving DNA sequences are needed, but these often provide no clues about the most recent divergence events. Conversely, fast-evolving DNA sequences are needed to identify recent divergence events but are not useful for identifying older ones (due to the confounding effect of homoplasy). We describe an alternative tree-building workflow, REPHINE (REcursive PHylogenetic INferencE), which uses recursive workflows involving alignment, masking (i.e., identifying and removing hypervariable and/or gap-rich sites from the MSA), tree inference, and super-tree reconstruction, followed by phylogenetic analysis of the original masked concatenated MSA. The result is a final tree in which every edge is assigned a length in accordance with the inferred optimal fit between the tree, the model, and the data.
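
To make the masking step concrete, the following is a minimal sketch of removing gap-rich sites from an MSA. It is not REPHINE's own code: the function name mask_gap_rich_sites, the max_gap_fraction threshold, the default value of 0.5, and the toy alignment are all hypothetical and chosen only for illustration. Identifying hypervariable sites would need a separate variability criterion and is not shown.

```python
def mask_gap_rich_sites(msa, max_gap_fraction=0.5, gap_chars="-?"):
    """Return a copy of the alignment with gap-rich columns removed.

    msa maps taxon names to aligned sequences of equal length.
    A column is kept when its fraction of gap characters does not
    exceed max_gap_fraction (a hypothetical, illustrative threshold).
    """
    if not msa:
        return {}
    lengths = {len(seq) for seq in msa.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences in an MSA must have the same length")
    n_seqs = len(msa)
    n_sites = lengths.pop()
    # Indices of the columns that pass the gap filter.
    keep = [
        i
        for i in range(n_sites)
        if sum(seq[i] in gap_chars for seq in msa.values()) / n_seqs
        <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in msa.items()}


# Toy alignment (made up for illustration): the fourth column is 75% gaps.
example = {
    "taxon_A": "ACG-TA",
    "taxon_B": "AC--TA",
    "taxon_C": "ACGTTA",
    "taxon_D": "A---TC",
}
masked = mask_gap_rich_sites(example, max_gap_fraction=0.5)
for name, seq in masked.items():
    print(name, seq)
```

With this toy input, only the fourth column is dropped, so each masked sequence is five sites long. In a recursive workflow, a filter of this kind would presumably be applied to each subset alignment before tree inference, with the threshold tuned to the data.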