SRA Software Toolkit
The SRA Software Toolkit provides programmatic access to sequencing data in the INSDC Sequence Read Archive (SRA) for retrieval and downstream bioinformatics analyses.
Key Features:
- Data Access and Retrieval: Enables efficient access to and retrieval of large-scale sequence datasets archived in the INSDC SRA.
- Integration with NCBI Resources: Integrates SRA data with NCBI resources such as GenBank, Entrez, and BLAST for linked annotation and sequence search.
- Customized Search Capabilities: Supports custom implementations of BLAST optimized to search specialized datasets.
- SDK and Libraries: Includes an SDK and libraries that support multiple programming languages for programmatic data access and development of custom applications.
- Large-scale Data Handling: Implements an architecture and framework for handling large-scale sequence data retrieval and processing.
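As an illustration of the data access and retrieval feature, the sketch below builds the command lines for two of the toolkit's command-line programs, `prefetch` (download a run) and `fasterq-dump` (convert it to FASTQ), and runs them only if they are found on the PATH. The helper names `build_commands` and `fetch_run`, the `fastq` output directory, and the thread count are assumptions made for this example, not part of the toolkit.

```python
import re
import shutil
import subprocess

# INSDC run accessions start with SRR (NCBI), ERR (ENA), or DRR (DDBJ).
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def build_commands(accession, fastq_dir="fastq", threads=4):
    """Return the prefetch and fasterq-dump argument lists for one run.

    Hypothetical helper: it only assembles arguments and does not run anything.
    """
    if not RUN_ACCESSION.match(accession):
        raise ValueError(f"not a run accession: {accession!r}")
    return [
        # Download the run into the current working directory.
        ["prefetch", accession],
        # Convert the downloaded run to FASTQ files in fastq_dir.
        ["fasterq-dump", accession, "--outdir", fastq_dir,
         "--threads", str(threads)],
    ]


def fetch_run(accession, workdir=".", **kwargs):
    """Run both commands in order inside workdir.

    Returns False without running anything if the toolkit is not installed.
    """
    commands = build_commands(accession, **kwargs)
    if any(shutil.which(cmd[0]) is None for cmd in commands):
        return False
    for cmd in commands:
        subprocess.run(cmd, cwd=workdir, check=True)
    return True
```

Calling `fetch_run("SRR000001")` would download and convert that run when the toolkit is installed. Keeping command construction separate from execution makes the arguments easy to inspect or log before any large download starts.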
Scientific Applications:
- Genomic Research: Provides sequence data for comparative genomics and evolutionary biology studies.
- Transcriptomics and Proteomics: Provides access to RNA-seq and other high-throughput sequencing datasets used in transcriptomics and proteomics.
- Disease Research: Enables analysis of datasets related to genetic diseases, including gene expression studies and mutation analysis.
Methodology:
The toolkit provides libraries that support multiple programming languages for custom scripts and applications. It implements a framework and architecture for sequence data retrieval and processing at scale, and it supports custom BLAST implementations for dataset-specific searches.
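For custom scripts that process retrieved reads, one common pattern is to stream records instead of loading a whole file into memory. The sketch below is a plain Python reader, not the toolkit's own SDK. It assumes strict four-line FASTQ records with no wrapped sequence lines and no trailing blank lines, and the names `read_fastq` and `mean_read_length` are hypothetical.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        # Pull the next four lines; an empty block means end of input.
        block = [line.rstrip("\n") for line in islice(handle, 4)]
        if not block:
            return
        if (len(block) < 4
                or not block[0].startswith("@")
                or not block[2].startswith("+")):
            raise ValueError("malformed FASTQ record")
        yield block[0][1:], block[1], block[3]


def mean_read_length(handle):
    """Average sequence length, computed one record at a time."""
    total = 0
    count = 0
    for _read_id, sequence, _quality in read_fastq(handle):
        total += len(sequence)
        count += 1
    return total / count if count else 0.0


# Small in-memory example standing in for a FASTQ file on disk.
sample = "@r1\nACGT\n+\nIIII\n@r2\nAC\n+\nII\n"
records = list(read_fastq(io.StringIO(sample)))
average = mean_read_length(io.StringIO(sample))
```

Because the reader is a generator, memory use stays roughly constant regardless of file size, which matters for runs containing many millions of reads.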
Details
- Tool Type: workflow
- Operating Systems: Linux, Windows, Mac
- Programming Languages: Perl
- Added: 8/3/2017
- Last Updated: 11/25/2024
Operations
Data Inputs & Outputs
Data handling
Inputs
Outputs
Publications
Wheeler DL, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Research. 2006;34(90001):D173-D180. doi:10.1093/nar/gkj158. PMID:16381840. PMCID:PMC1347520.