xTea

xTea (x-Transposable element analyzer) is a software tool that identifies transposable element (TE) insertions in whole-genome sequencing data. Unlike most existing methods, which are limited to short-read data, xTea can analyze both short-read and long-read sequencing data. Comparative analyses demonstrate that xTea outperforms short-read-based methods in detecting germline and somatic TE insertions. When applied to long-read data, xTea generates a comprehensive catalog of polymorphic insertions, complete with fully assembled and annotated insertional sequences for various retroelements, including pseudogenes and endogenous retroviruses.

Notably, xTea reveals that individual genomes contain an average of nine groups of full-length L1 retrotransposons in centromeric regions. This indicates that centromeres and other highly repetitive genomic regions, such as telomeres, may harbor a significant and largely unexplored reservoir of active L1 elements. By providing a tool for analyzing TE insertions across diverse sequencing data types, xTea contributes to a deeper understanding of the impact of transposable elements on the structure and function of the human genome.

Topic

Genetic variation; Whole genome sequencing; Sequence assembly; Machine learning; Informatics

Detail

  • Operation: Genotyping; Nucleic acid sequence analysis

  • Software interface: Command-line interface

  • Language: Python, Other

  • License: Other

  • Cost: Free

  • Version name: -

  • Credit: This project was funded by the National Institute of Mental Health (U01MH106883), the National Cancer Institute (R03CA249364), and the National Library of Medicine (T15LM007092).

  • Input: -

  • Output: -

  • Contact: Eunjung Alice Lee (ealice.lee@childrens.harvard.edu), Peter J. Park (peter_park@hms.harvard.edu)

  • Collection: -

  • Maturity: -

Publications

  • Comprehensive identification of transposable element insertions using multiple sequencing technologies.
  • Chu C, et al. Comprehensive identification of transposable element insertions using multiple sequencing technologies. Nature Communications. 2021; 12:3836. doi: 10.1038/s41467-021-24041-8
  • https://doi.org/10.1038/S41467-021-24041-8
  • PMID: 34158502
  • PMC: PMC8219666

Download and documentation

