Summary: The purpose of this JupyterHub is to let the community retrieve public data with ease. For example, you can 1) retrieve the expression of a gene in any of the public sequencing profiles in < 1 second, or 2) extract the allelic read counts of a particular sequencing profile in < 1 second. Skymap is a standalone database that aims to offer: 1) a single data matrix for each omic layer for each species, spanning a total of >400k sequencing runs from all the public studies, produced by reprocessing petabytes of sequencing data; 2) a biological metadata file that describes the relationships between the sequencing runs and also the keywords extracted from over 3 million free-text annotations using NLP; 3) a technical metadata file that describes the relationships between the sequencing runs.
Step-by-step guide to running our notebooks (takes < 1 min to complete):
- Click this button to log in using any of your Google accounts. We don’t ask for any personal info and no registration is involved; it’s really just for logging in.
- Click the “notebooks” directory.
- Click the example notebook “basicRNAseqAnalysis.ipynb”.
- Click “Run All” to execute the python code cells in the notebook.
** Feel free to change the notebook and run it your own way. For example, you can change the query gene from “TP53” to “GAPDH” to extract the expression level (TPM) of GAPDH from >100,000 sequencing runs.
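To make the gene swap concrete, here is a minimal sketch of what such a lookup can look like. It does not reproduce the code in “basicRNAseqAnalysis.ipynb”: the toy matrix, the run IDs, the TPM values and the names `expression_tpm` and `query_gene` are all made up for illustration, and it assumes a pandas DataFrame with genes as rows and sequencing runs as columns.

```python
import pandas as pd

# Toy stand-in for the expression matrix: rows are genes, columns are
# sequencing runs, values are TPM. Run IDs and numbers are invented.
expression_tpm = pd.DataFrame(
    {
        "RUN_A": [12.5, 850.0, 0.0],
        "RUN_B": [30.1, 920.4, 1.2],
        "RUN_C": [8.7, 1010.9, 0.3],
    },
    index=["TP53", "GAPDH", "MYOD1"],
)


def query_gene(matrix: pd.DataFrame, gene: str) -> pd.Series:
    """Return the TPM of one gene across all runs, highest first."""
    if gene not in matrix.index:
        raise KeyError(f"gene {gene!r} is not in the matrix")
    return matrix.loc[gene].sort_values(ascending=False)


# Changing the query gene is a one-line edit (hypothetical variable name).
query_gene_name = "GAPDH"  # was "TP53"
gapdh_tpm = query_gene(expression_tpm, query_gene_name)
print(gapdh_tpm)
```

In the real notebook the matrix covers far more runs, but the idea is the same: pick one gene's row and read its values across the run columns.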
- Skymap project
Feel free to shoot me an email if you have any questions and I will try to reply within three days: firstname.lastname@example.org
- The rationale of the Skymap project.
- Why JupyterHub for data retrieval?
Data: If you wish to have a local copy of Skymap, all the data are located in the JupyterHub (directory: ~/efs), which you can download using rsync.
Limitation of our JupyterHub:
- We don’t store your local data. Once your session has been idle for more than an hour, your Kubernetes pod will be shut down and deleted, so you will probably want to download your own copy of any data you need.
- I didn’t set a memory limit, but your session will most likely die when its memory usage exceeds 8GB.
- I didn’t set a CPU limit, but, in unlikely cases, you might see the notebook slow down.
- The JupyterHub might be taken down for maintenance at midnight (PST) every day.
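Given the rough 8GB memory ceiling mentioned above, it can help to estimate how large a slice of a data matrix will be before loading it. The sketch below is plain arithmetic and not part of Skymap: the function names, the 4-byte (float32) value size, the 50% headroom, the reading of 8GB as 8 GiB and the gene count in the usage lines are all assumptions.

```python
GIB = 1024 ** 3  # bytes in one gibibyte

# Assumed ceiling, taken from the ~8GB figure above and read as GiB.
ASSUMED_MEMORY_CEILING_GIB = 8.0


def matrix_size_gib(n_rows: int, n_cols: int, bytes_per_value: int = 4) -> float:
    """Size in GiB of a dense n_rows x n_cols matrix."""
    return n_rows * n_cols * bytes_per_value / GIB


def fits_in_memory(
    n_rows: int,
    n_cols: int,
    bytes_per_value: int = 4,
    headroom: float = 0.5,
) -> bool:
    """True if the matrix uses at most `headroom` of the assumed ceiling."""
    budget_gib = ASSUMED_MEMORY_CEILING_GIB * headroom
    return matrix_size_gib(n_rows, n_cols, bytes_per_value) <= budget_gib


# One gene across 100,000 runs is tiny and fits easily.
print(fits_in_memory(n_rows=1, n_cols=100_000))

# A hypothetical 20,000 genes x 100,000 runs is about 7.45 GiB as float32,
# which is over the 4 GiB budget left by the 50% headroom.
print(fits_in_memory(n_rows=20_000, n_cols=100_000))
```

If the estimate does not fit, querying a few genes or runs at a time is the safer route.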