Data landscape of SRA
The landscape of reprocessed data
From data searching to analysis in minutes with our JupyterHub:
Feel free to contact me at email@example.com if you have any questions. I will try to reply within three days.
Two steps to go from data searching to analysis
Query samples with mutations at the BRAF V600 location (chr7:140,753,336). The size of each node in the scatter plot represents the mapping quality score.
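As a rough illustration of what a position-based query like this involves, here is a minimal Python sketch. It is not the JupyterHub code: the `VariantCall` record, the `parse_locus` and `query_locus` helpers, and the run names are all hypothetical, and only the locus string comes from the example above.

```python
from typing import List, NamedTuple, Tuple


class VariantCall(NamedTuple):
    """Hypothetical record: one variant observed in one sequencing run."""
    run: str     # placeholder run name, not a real accession
    chrom: str   # chromosome name, e.g. "chr7"
    pos: int     # genomic position
    mapq: float  # mapping quality score (drives node size in the plot)


def parse_locus(locus: str) -> Tuple[str, int]:
    """Turn a string like 'chr7:140,753,336' into ('chr7', 140753336)."""
    chrom, pos = locus.split(":")
    return chrom, int(pos.replace(",", ""))


def query_locus(calls: List[VariantCall], locus: str) -> List[VariantCall]:
    """Keep only the calls that sit exactly at the requested locus."""
    chrom, pos = parse_locus(locus)
    return [c for c in calls if c.chrom == chrom and c.pos == pos]


# Made-up data, for illustration only.
calls = [
    VariantCall("run_A", "chr7", 140753336, 60.0),
    VariantCall("run_B", "chr7", 140753336, 37.0),
    VariantCall("run_C", "chr1", 1000, 60.0),
]

hits = query_locus(calls, "chr7:140,753,336")
for call in hits:
    print(call.run, call.mapq)
```

Each returned record carries its mapping quality score, which is the value the scatter plot uses for node size.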
The following word map gives a sense of the kinds of data available in the SRA. The size of each BioSample attribute node represents its data availability on a log2 scale. The distances between the nodes in this t-SNE plot represent the textual semantic similarity between the metadata in the submitted annotations. For example, when you zoom into the attribute group “disease”, you can hover over the neighboring labels and see that the BioSample attribute “disease” is closely grouped with related attributes such as “diagnosis” and “disease status”.
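To make the log2 node sizing concrete, here is a small Python sketch. The counts are invented for illustration and `node_size` is a hypothetical helper. The attribute names are taken from the example above, but their real availability in the SRA is not stated here.

```python
import math

# Hypothetical counts: how many samples carry each BioSample attribute.
attribute_counts = {
    "disease": 1024,
    "diagnosis": 256,
    "disease status": 64,
}


def node_size(count: int) -> float:
    """Map a positive sample count to a node size on a log2 scale."""
    return math.log2(count)


sizes = {name: node_size(n) for name, n in attribute_counts.items()}
for name, size in sizes.items():
    print(f"{name}: {size}")
```

On a log2 scale each doubling of availability adds one unit of size, so in this made-up example a 16-fold difference in counts (1024 versus 64) appears as a difference of only 4 units.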
For more information, please see: https://www.biorxiv.org/content/biorxiv/early/2018/09/12/414136.full.pdf
What do the different SRA accessions represent? (This section is copied from the official NCBI website: https://www.ncbi.nlm.nih.gov/books/NBK56913/)
There are 6 different SRA accession types:
|Accession Prefix||Accession Name||Definition||Example|
|SRA||SRA submission accession||The submission accession represents a virtual container that holds the objects represented by the other five accessions and is used to track the submission in the archive.||No example available: the SRA accession number is an artificial packaging construct and has no specific response page|
|SRP||SRA study accession||A Study is an object that contains the project metadata describing a sequencing study or project. Imported from BioProject.||HTML|
|SRX||SRA experiment accession||An Experiment is an object that contains the metadata describing the library, platform selection, and processing parameters involved in a particular sequencing experiment.||HTML|
|SRR||SRA run accession||A Run is an object that contains actual sequencing data for a particular sequencing experiment. Experiments may contain many Runs depending on the number of sequencing instrument runs that were needed.||HTML|
|SRS||SRA sample accession||A Sample is an object that contains the metadata describing the physical sample upon which a sequencing experiment was performed. Imported from BioSample.||HTML|