SISSRs

SISSRs creates a list of peak maxima from aligned read positions.

Chromatin immunoprecipitation sequencing (ChIP-Seq) has become an increasingly popular method for investigating protein-DNA interactions in vivo on a genome-wide scale. The method identifies protein-binding sites on DNA by combining chromatin immunoprecipitation (ChIP) with ultra-high-throughput massively parallel sequencing. A ChIP-Seq experiment yields a large number of short sequence reads, which are usually mapped to a reference genome for further analysis. However, the short length of these reads, approximately 25-50 nucleotides, makes it difficult to determine the exact protein-binding sites within the mapped regions.

To address this challenge, a novel algorithm called Site Identification from Short Sequence Reads (SISSRs) was developed to precisely identify protein-binding sites from short reads generated from ChIP-Seq experiments. The sensitivity and specificity of SISSRs were demonstrated by applying it to ChIP-Seq data for three well-characterized human transcription factors, namely CTCF (CCCTC-binding factor), NRSF (neuron-restrictive silencer factor), and STAT1 (signal transducer and activator of transcription protein 1). Using SISSRs, the researchers identified 26,814, 5,813, and 73,956 binding sites for CTCF, NRSF, and STAT1 proteins, respectively. These numbers are 32%, 299%, and 78% more than what was inferred previously for the respective proteins.
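The entry does not spell out how binding sites are located, so the following is only a rough Python sketch of one strand-based idea: reads on the sense strand tend to pile up upstream of a binding site and reads on the antisense strand downstream, so a site can be estimated where the net strand count flips from positive to negative. The function names, the 20 bp window, and the 200 bp maximum gap are assumptions made for illustration; this is not the SISSRs implementation.

```python
from collections import Counter


def net_tag_counts(reads, window=20):
    """Map each window start to (sense reads - antisense reads).

    `reads` is an iterable of (position, strand) pairs, where strand
    is "+" or "-". The window size is an illustrative assumption.
    """
    net = Counter()
    for pos, strand in reads:
        start = (pos // window) * window
        net[start] += 1 if strand == "+" else -1
    return dict(net)


def candidate_sites(reads, window=20, max_gap=200):
    """Estimate sites where the net count flips from positive to negative.

    Only windows with a non-zero net count are considered, and the two
    windows must lie within `max_gap` bp of each other (hypothetical
    stand-in for the DNA fragment length).
    """
    net = net_tag_counts(reads, window)
    starts = sorted(s for s, count in net.items() if count != 0)
    sites = []
    for left, right in zip(starts, starts[1:]):
        if net[left] > 0 and net[right] < 0 and right - left <= max_gap:
            # Midpoint between the centers of the two windows.
            sites.append((left + right + window) // 2)
    return sites


reads = [(100, "+"), (105, "+"), (118, "+"),
         (160, "-"), (170, "-"), (175, "-")]
print(candidate_sites(reads))
```

A real peak caller would also need to estimate fragment length from the data and filter candidates against a background model; both are omitted here.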

The accuracy of SISSRs was further supported by the fact that the overwhelming majority of the identified binding sites contained the previously established consensus binding sequence for the respective proteins. The sensitivity and precision of SISSRs facilitate further analyses of ChIP-Seq data, which reveal insights that can guide the design of ChIP-Seq experiments to map protein-DNA interactions in vivo.

Moreover, the researchers showed that tag densities at the protein-binding sites indicate protein-DNA binding affinity. This finding highlights the potential of tag density as a tool for distinguishing and characterizing strong and weak binding sites. Using tag density as an indicator of DNA-binding affinity, the researchers identified core residues within the NRSF and CTCF binding sites that are critical for stronger DNA binding.
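To make the notion of tag density concrete, the sketch below counts the mapped tags that fall within a fixed distance of each binding site and ranks the sites by that count. The helper names and the 100 bp half-width are hypothetical choices for illustration; the entry does not state how the authors defined the window.

```python
from bisect import bisect_left, bisect_right


def tag_density(sorted_tags, site, half_width=100):
    """Count tags within `half_width` bp of `site` (bounds inclusive).

    `sorted_tags` must be a sorted list of tag coordinates.
    """
    lo = bisect_left(sorted_tags, site - half_width)
    hi = bisect_right(sorted_tags, site + half_width)
    return hi - lo


def rank_sites_by_density(tags, sites, half_width=100):
    """Return (site, density) pairs, highest density first."""
    sorted_tags = sorted(tags)
    scored = [(site, tag_density(sorted_tags, site, half_width))
              for site in sites]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


tags = [950, 980, 1000, 1020, 1100, 5000, 5050]
print(rank_sites_by_density(tags, sites=[5000, 1000]))
```

Under the finding described above, sites near the top of such a ranking would be candidates for stronger binding, and sites near the bottom for weaker binding.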

Topic

ChIP-seq;DNA;Sequencing;Transcriptomics;DNA binding sites

Detail

  • Operation: Peak calling;Transcription factor binding site prediction

  • Software interface: Command-line user interface

  • Language: Perl

  • License: -

  • Cost: Free

  • Version name: 1.4

  • Credit: Intramural Research Program of the National Heart, Lung, and Blood Institute, National Institutes of Health.

  • Input: -

  • Output: -

  • Contact: jothi@mail.nih.gov

  • Collection: -

  • Maturity: -

Publications

  • Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data.
  • Jothi R, et al. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 2008; 36:5221-31. doi: 10.1093/nar/gkn488
  • https://doi.org/10.1093/nar/gkn488
  • PMID: 18684996
  • PMC: PMC2532738

Download and documentation

    Currently not available or not maintained.

