Poster Presentation Lorne Infection and Immunity 2022

A tree-based software for high-resolution analysis of metagenomic samples (#210)

Sean Solari (1, 2), Remy Young (1, 2), Vanessa Rossetto Marcelino (1, 2), Samuel Forster (1, 2)
  1. Centre of Innate Immunity and Infectious Disease, Hudson Institute of Medical Research, Clayton, Victoria, Australia
  2. Department of Molecular and Translational Science, Monash University, Clayton, Victoria, Australia

Rapid advances in metagenomic sequencing technologies have accelerated the generation of vast biological datasets, revealing complex communities of interacting microbes that perform vital roles in many ecosystems. High-throughput shotgun sequencing can be used to analyse these microbial ecosystems, generating whole-community information by reading short sequences of DNA. Reference-based metagenomic analysis pipelines assign these short reads to taxonomic clades and generate a taxonomic profile of the community. However, the discovery of taxonomically similar isolates with large phenotypic differences challenges the use of such taxonomic profiles, as it demonstrates the existence of crucial functional characteristics that likely cannot be captured through taxonomic clades. exPAM is a software package designed to address this challenge, integrating vast collections of high-quality reference genomes with sequence-based distance trees to establish, with increased resolution, the read abundance and phylogenetic prevalence of metagenomic datasets. The exPAM approach can generate isolate-level profiles of these samples, enabling more finely resolved downstream functional analyses that are crucial to uncovering key biological mechanisms within these communities. Alongside this development of metagenomic profiling methods, the emergence of cutting-edge gastrointestinal microbiota culturing techniques has driven a surge in the discovery of novel microbial isolates. To aid the discovery of microbial species, exPAM alerts the user to putative novel sequences in the sample, alongside an estimate of their phylogenetic neighbourhood. This knowledge enables researchers to identify promising targets for further experimentation and deepen our understanding of these complex biological communities.
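
The general idea of placing reads on a reference tree, rather than into taxonomic clades, can be illustrated with a toy example. The sketch below is not the exPAM implementation, which the abstract does not describe in detail. It assumes a simple k-mer lookup in which each k-mer maps to the lowest common ancestor (LCA) of the reference genomes that contain it, and the tree, the genomes, the value of k and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy tree given as child -> parent; "root" has no parent.
PARENT = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}

# Hypothetical toy reference genomes, one per leaf of the tree.
GENOMES = {
    "A": "ACGTACGGA",
    "B": "ACGTACCTT",
    "C": "TTGACCTTG",
}

K = 4  # assumed k-mer length, chosen only to suit the toy sequences


def kmers(seq, k=K):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def path_to_root(node):
    """List the node followed by each of its ancestors up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def lca(a, b):
    """Lowest common ancestor of two nodes in the toy tree."""
    ancestors = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors:
            return node
    raise ValueError("nodes are not in the same tree")


def build_index(genomes):
    """Map each k-mer to the LCA of all genomes that contain it."""
    index = {}
    for name, seq in genomes.items():
        for kmer in kmers(seq):
            index[kmer] = lca(index[kmer], name) if kmer in index else name
    return index


def classify(read, index):
    """Assign a read to the LCA of its k-mer hits, or None if nothing matches."""
    hits = [index[k] for k in kmers(read) if k in index]
    if not hits:
        return None
    node = hits[0]
    for hit in hits[1:]:
        node = lca(node, hit)
    return node


def profile(reads, index):
    """Count reads per tree node; unmatched reads are counted under None."""
    return Counter(classify(read, index) for read in reads)


index = build_index(GENOMES)
counts = profile(["TACGGA", "ACGTAC", "TTGACC", "GGGGGG"], index)
```

In this toy setting a read whose k-mers are unique to one genome lands on that leaf, a read shared by genomes A and B lands on their common ancestor n1, and a read with no known k-mers is returned as None. Taking the LCA of every hit is a deliberately conservative choice for the sketch, since a single shared k-mer moves the read higher in the tree.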