How to Submit GBS Data to the NCBI Sequence Read Archive (SRA)
There are three steps in submitting sequence data to the NCBI Sequence Read Archive (SRA):
1. Create a BioProject
Log in to the NCBI Submission Portal at https://submit.ncbi.nlm.nih.gov/subs/bioproject/ and click on New Submission. Then fill out the following sections. After each section, click “Continue” to move on to the next section:
- Project Type
- Project data type: Check “Phenotype or Genotype.”
- Sample scope: Select “Multiisolate.”
- Organism name: e.g. Triticum aestivum
- Release date: Choose either “Release immediately following processing” or “Release on specified date or upon publication, whichever is first” and enter a Projected release date.
- Project title: Enter a title of your choosing.
- Public description: Provide a paragraph describing the study goals and relevance.
- Is your project part of a larger initiative which is already registered with NCBI? Select Yes or No. If yes, enter the "Initiative description" and the "BioProject accession" of the larger initiative to which this project belongs.
- Click on “Continue.” You will be able to create a BioSample after you finish creating this BioProject.
- Enter a PubMed ID or DOI if available.
- Review all the information and click “Submit” if everything looks ok. If anything looks wrong, you can go back to any section by clicking on its respective tab at the top.
2. Create a BioSample
Log in to the NCBI Submission Portal at https://submit.ncbi.nlm.nih.gov/subs/biosample/ and click on New Submission. Then fill out the following sections. After each section, click “Continue” to move on to the next section:
- General Info
- Release date: Select either "Release immediately following processing" or "Release on specified date or upon publication, whichever is first" and enter a Projected Release Date.
- Specify if you are submitting a single sample or a file containing multiple samples
- Select "Batch/Multiple" or "Single"
- Select the Plant Sample check box.
- Provide your BioSample attributes directly on the screen or by uploading a .csv file.
- Check that all of the information displayed is correct and make any necessary adjustments. Then click Submit to submit the BioSample.
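For a batch submission, the attributes file can be generated from a script rather than typed by hand. The following is a minimal sketch, assuming a small set of illustrative column names (`sample_name`, `organism`, `cultivar`, `tissue`) and made-up sample values; the attributes actually required are defined by the template NCBI provides for the selected package, so treat the columns here as placeholders. The uniqueness check is only a local sanity check and is not NCBI's own validation.

```python
import csv
import io

# Illustrative column names only (an assumption); the required attributes
# come from the NCBI template for the selected sample package.
COLUMNS = ["sample_name", "organism", "cultivar", "tissue"]


def biosample_csv(samples):
    """Return CSV text with one row per sample, rejecting repeated sample names."""
    names = [s["sample_name"] for s in samples]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate sample_name values: {duplicates}")

    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(samples)
    return buf.getvalue()


# Hypothetical sample records, for illustration only.
samples = [
    {"sample_name": "line_001", "organism": "Triticum aestivum",
     "cultivar": "hypothetical_cv_A", "tissue": "leaf"},
    {"sample_name": "line_002", "organism": "Triticum aestivum",
     "cultivar": "hypothetical_cv_B", "tissue": "leaf"},
]

print(biosample_csv(samples))
```

The returned text can be saved to a .csv file and uploaded on the attributes page.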
3. Create a Sequence Read Archive Submission
Go to the Sequence Read Archive Submission page: https://www.ncbi.nlm.nih.gov/Traces/sra_sub/sub.cgi?login=pda and click on Create new submission.
- Fill out the Submitter page if not already done.
- On the general info page enter the following information:
- Existing BioProject: Enter the PRJNA number (e.g. PRJNA445460) that was generated for the BioProject submission. (This will be visible once the BioProject submission has been processed.)
- Biosample: Click on Yes to indicate that a BioSample submission has been completed.
- Release Date: Check either:
- Release immediately following processing or
- Release on specified date or upon publication, whichever is first and enter a Projected Release Date.
- Key Points to Note:
- bioproject_accession is the accession number for the BioProject that is provided when the BioProject submission processing has been completed.
- biosample_accession is the accession number for the Sample that is provided when the BioSample submission processing has been completed.
- All read files are considered to be part of one BioSample and are related by the BioSample accession number.
- library_id is set to the sampleID associated with the set of files contained in each compressed tgz file; for paired reads, there will be two files per library_id.
- The file names given should be the uncompressed file names contained within the associated tar archive, i.e. the FASTQ file names.
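The key points above can be sketched as a small script that groups FASTQ file names by sample ID and writes one metadata row per library_id. This is only an illustration under stated assumptions: the `_R1`/`_R2` file naming convention, the `filename` and `filename2` column names, and the `SAMN00000000` accession are hypothetical placeholders, and the real column set should be taken from the SRA metadata template.

```python
import csv
import io
from collections import defaultdict

# bioproject_accession, biosample_accession and library_id come from the
# notes above; "filename" and "filename2" are assumed column names.
COLUMNS = ["bioproject_accession", "biosample_accession",
           "library_id", "filename", "filename2"]


def library_id_for(fastq_name):
    """Derive the sample ID from a FASTQ name (assumed <sampleID>_R1.fastq style)."""
    stem = fastq_name[:-len(".fastq")] if fastq_name.endswith(".fastq") else fastq_name
    if stem.endswith(("_R1", "_R2")):
        return stem[:-3]
    return stem


def build_sra_rows(fastq_names, bioproject_accession, biosample_accession):
    """Return one metadata row per library_id; paired reads fill both file columns."""
    by_library = defaultdict(list)
    for name in fastq_names:
        by_library[library_id_for(name)].append(name)

    rows = []
    for library_id in sorted(by_library):
        files = sorted(by_library[library_id])
        rows.append({
            "bioproject_accession": bioproject_accession,
            # All read files share one BioSample accession, as noted above.
            "biosample_accession": biosample_accession,
            "library_id": library_id,
            "filename": files[0],
            "filename2": files[1] if len(files) > 1 else "",
        })
    return rows


def rows_to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical file names and BioSample accession, for illustration only.
rows = build_sra_rows(
    ["S1_R2.fastq", "S1_R1.fastq", "S2_R1.fastq"],
    "PRJNA445460",
    "SAMN00000000",
)
print(rows_to_tsv(rows))
```

File names are listed uncompressed, matching the names inside the tar archive rather than the archive name itself.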
- To initiate the upload type a command of the following form:
- ~/.aspera/cli/bin/ascp -v -i ~/aspera_key_file/<your aspera key file name> -QT -l1000m -k1 -d ~/<submission_folder>/ firstname.lastname@example.org:uploads/<your NCBI upload folder>/
- A progress indicator is displayed in the terminal window, followed by either a message confirming that the upload completed successfully or an error message if the upload failed.
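When the upload is scripted, it can help to assemble the command as an argument list so that paths containing spaces are passed intact. The sketch below reuses the flags from the command above unchanged; the function name `build_ascp_command`, the key file name, and the destination string are hypothetical placeholders to be replaced with your own values. It only builds and prints the command and does not run it.

```python
import os
import shlex


def build_ascp_command(key_file, submission_folder, destination,
                       ascp_path="~/.aspera/cli/bin/ascp"):
    """Return the ascp upload command as an argument list (not executed here)."""
    return [
        os.path.expanduser(ascp_path),
        "-v",
        "-i", os.path.expanduser(key_file),
        # Transfer flags copied as-is from the command shown above.
        "-QT", "-l1000m", "-k1", "-d",
        os.path.expanduser(submission_folder),
        destination,
    ]


# Placeholder values (assumptions): substitute your key file, local
# submission folder, and the upload destination NCBI assigns to you.
cmd = build_ascp_command(
    "~/aspera_key_file/your_key_file",
    "~/SUB2900496/",
    "user@host:uploads/your_ncbi_upload_folder/",
)
print(shlex.join(cmd))
```

To actually start the transfer, the list could be passed to `subprocess.run(cmd, check=True)`, which raises an exception if ascp exits with a non-zero status.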
- Go back to the SRA Files page, click on the Select preload folder button, and select the folder name that corresponds to your submission folder name, e.g. SUB2900496.
- Tick the Autofinish submission check box. The NCBI system will begin processing the files that were uploaded. This can take many hours depending on the size of the uploaded files.
- When processing is complete, the accession numbers for the completed submission will become available. The SRP number is the study accession number that refers to the overall submission.