Data landscape of SRA

The landscape of reprocessed data



From data searching to analysis in minutes with our JupyterHub:

Feel free to contact me if you have any questions. I will try to reply within three days.


Two steps to go from data searching to analysis

Example analysis


Query samples with mutations at the BRAF V600 locus (chr7:140,753,336). The size of each node in the scatter plot represents the mapping quality score.
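As a rough illustration of what a position-based query involves, the sketch below parses the locus string from the caption and filters a small in-memory list of records. The record layout, the field names (`run`, `chrom`, `pos`, `mapq`), and the placeholder run names are assumptions made for this example only; they are not the project's actual schema or API.

```python
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str
    pos: int


def parse_locus(text: str) -> Locus:
    """Parse a 'chrom:position' string, allowing thousands separators."""
    chrom, _, pos = text.strip().partition(":")
    if not chrom or not pos:
        raise ValueError(f"not a 'chrom:position' string: {text!r}")
    return Locus(chrom, int(pos.replace(",", "")))


# Hypothetical records; the run names are placeholders, not real accessions.
records = [
    {"run": "RUN_A", "chrom": "chr7", "pos": 140753336, "mapq": 60},
    {"run": "RUN_B", "chrom": "chr7", "pos": 140753337, "mapq": 42},
    {"run": "RUN_C", "chrom": "chr1", "pos": 140753336, "mapq": 30},
    {"run": "RUN_D", "chrom": "chr7", "pos": 140753336, "mapq": 15},
]


def samples_at(rows, locus):
    """Return the rows whose chromosome and position match the locus exactly."""
    return [r for r in rows if r["chrom"] == locus.chrom and r["pos"] == locus.pos]


braf_v600 = parse_locus("chr7:140,753,336")
hits = samples_at(records, braf_v600)
for row in hits:
    print(row["run"], row["mapq"])
```

In a plot like the one described above, the `mapq` value of each hit would drive the node size.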




The following word map should give you a sense of the kind of data that is available in the SRA. The size of each BioSample attribute node represents data availability on a log2 scale. The distances between nodes in this t-SNE plot represent the textual semantic similarity between the metadata in the submitted annotations. For example, when you zoom into the attribute group “disease”, you can hover over the neighboring labels and see that the BioSample attribute “disease” is grouped closely with related attributes such as “diagnosis” and “disease status”.
For more information, please see:


What do the different SRA accessions represent? (This section is copied and pasted from the official NCBI website.)

There are 6 different SRA accession types:

| Accession Prefix | Accession Name | Definition | Example |
| --- | --- | --- | --- |
| SRA | SRA submission accession | The submission accession represents a virtual container that holds the objects represented by the other five accessions and is used to track the submission in the archive. | Since the SRA accession number is an artificial packaging construct, there is no example available since the SRA accession number has no specific response page |
| SRP | SRA study accession | A Study is an object that contains the project metadata describing a sequencing study or project. Imported from BioProject. | HTML |
| SRX | SRA experiment accession | An Experiment is an object that contains the metadata describing the library, platform selection, and processing parameters involved in a particular sequencing experiment. | HTML |
| SRR | SRA run accession | A Run is an object that contains actual sequencing data for a particular sequencing experiment. Experiments may contain many Runs depending on the number of sequencing instrument runs that were needed. | HTML |
| SRS | SRA sample accession | A Sample is an object that contains the metadata describing the physical sample upon which a sequencing experiment was performed. Imported from BioSample. | HTML |
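
When working with mixed lists of accessions, it can help to tell the types apart by prefix. The following is a minimal sketch covering only the five prefixes listed above. It assumes an accession is a prefix followed by digits; the helper name `accession_type` and the sample inputs are made up for illustration and are not part of any NCBI tool.

```python
import re

# Prefix-to-type mapping, taken from the table above.
ACCESSION_TYPES = {
    "SRA": "submission",
    "SRP": "study",
    "SRX": "experiment",
    "SRR": "run",
    "SRS": "sample",
}

# Assumption: an accession is one of the prefixes followed only by digits.
_ACCESSION_RE = re.compile(r"^(SRA|SRP|SRX|SRR|SRS)(\d+)$")


def accession_type(accession):
    """Return the accession type name, or None if the prefix is not recognised."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        return None
    return ACCESSION_TYPES[match.group(1)]


# Made-up accession numbers, used only to exercise the function.
for acc in ["SRR0000001", "srx0000002", "SRP0000003", "XYZ123"]:
    print(acc, "->", accession_type(acc))
```

Returning `None` for unrecognised input, rather than raising, keeps the helper convenient for filtering large metadata tables where other identifier schemes may be mixed in.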