Difference between revisions of "GBS Data Management"

From Poland Lab Wiki
Jump to: navigation, search
 
(One intermediate revision by the same user not shown)
(No difference)

Latest revision as of 18:48, 12 September 2019

GBS Sequence Data Management consists of the following tasks:

  1. Download of GBS sequence data files from sequencing facilities.
  2. Verify the integrity of downloaded GBS files.
  3. Create or rename GBS sequence files to conform to the standard GBS file naming format.
  4. Filtering of GBS sequence file to remove short reads of less than 75bp (Nextseq files only).
  5. Update wheatgenetics gbs table flowcell, lane, num_lines and md5sum columns for each file.
  6. Checking % valid reads in each GBS file and % reads found in each blank well.
  7. Checking DNA quantification values for blank wells relative to other wells.
  8. Storage of GBS files on Beocat in /bulk/jpoland/sequence GBS sequence file repository
  9. Backup of GBS files to external NAS.

The exact method for executing these tasks is dependent on the sequencing facility that produced the data.

The data management procedure for facilities that are currently used by the Poland Lab are document below.

GBS Data Management for KSU Genomics Facility

GBS Data Management for Genome Quebec NGS Facility

GBS Data Management for Hudson Alpha Facility

GBS Data Management for Novogene Facility