Experiment & Run
An experiment describes a sample-specific sequencing library, the instrument, and the sequencing methods. Each experiment references exactly one project and one sample. Runs describe the data files that belong to previously created experiments.
General Information
*project accession : project_accession
- Definition: A valid project accession has ‘CNP’ prefix.
*sample accession : sample_accession
- Definition: A valid sample accession has ‘CNS’ prefix.
*experiment title : experiment_title
- Definition: Short description that will identify the dataset on public pages. A clear and concise template for the title is: {methodology} of {organism}: {sample info}, e.g. RNA-Seq of Mus musculus: adult female spleen.
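The prefix rules and the title template above can be checked before submission. The following is a minimal sketch, not part of the archive's tooling: the function names are hypothetical, and only the 'CNP'/'CNS' prefixes are taken from this page, so the real accession format may be stricter than a prefix check.

```python
# Hypothetical helpers; only the prefixes and the title template come from this page.

def has_project_prefix(accession):
    """Return True if the accession starts with the 'CNP' project prefix."""
    return accession.startswith("CNP")


def has_sample_prefix(accession):
    """Return True if the accession starts with the 'CNS' sample prefix."""
    return accession.startswith("CNS")


def make_experiment_title(methodology, organism, sample_info):
    """Build a title following '{methodology} of {organism}: {sample info}'."""
    return f"{methodology} of {organism}: {sample_info}"


# Usage: the accession values below are made-up placeholders.
print(has_project_prefix("CNP0000001"))
print(make_experiment_title("RNA-Seq", "Mus musculus", "adult female spleen"))
```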
Library Information & Sequencing
*library name : library_name
- Definition: Short unique identifier for the sequencing library. Each library name MUST be unique!
*library strategy : library_strategy
Definition: Sequencing technique intended for the library.
Value:
- WGA: Random sequencing of the whole genome following non-PCR amplification.
- WGS: Random sequencing of the whole genome.
- WXS: Random sequencing of exonic regions selected from the genome.
- RNA-Seq: Random sequencing of whole transcriptome.
- miRNA-Seq: Random sequencing of small miRNAs.
- WCS: Random sequencing of a whole chromosome or other replicon isolated from a genome.
- CLONE: Genomic clone based (hierarchical) sequencing.
- POOLCLONE: Shotgun of pooled clones (usually BACs and Fosmids).
- AMPLICON: Sequencing of overlapping or distinct PCR or RT-PCR products.
- CLONEEND: Clone end (5’, 3’, or both) sequencing.
- FINISHING: Sequencing intended to finish (close) gaps in existing coverage.
- ChIP-Seq: Direct sequencing of chromatin immunoprecipitates.
- MNase-Seq: Direct sequencing following MNase digestion.
- DNase-Hypersensitivity: Sequencing of hypersensitive sites, or segments of open chromatin that are more readily cleaved by DNaseI.
- Bisulfite-Seq: Sequencing following treatment of DNA with bisulfite to convert cytosine residues to uracil depending on methylation status.
- Tn-Seq: Sequencing from transposon insertion sites.
- EST: Single pass sequencing of cDNA templates.
- FL-cDNA: Full-length sequencing of cDNA templates.
- CTS: Concatenated Tag Sequencing.
- MRE-Seq: Methylation-Sensitive Restriction Enzyme Sequencing strategy.
- MeDIP-Seq: Methylated DNA Immunoprecipitation Sequencing strategy.
- MBD-Seq: Direct sequencing of methylated fractions sequencing strategy.
- Synthetic-Long-Read
- ATAC-seq: Assay for Transposase-Accessible Chromatin (ATAC) strategy, used to study genome-wide chromatin accessibility. An alternative method to DNase-seq that uses an engineered Tn5 transposase to cleave DNA and to integrate primer DNA sequences into the cleaved genomic DNA.
- ChIA-PET: Direct sequencing of proximity-ligated chromatin immunoprecipitates.
- FAIRE-seq: Formaldehyde Assisted Isolation of Regulatory Elements. Reveals regions of open chromatin.
- Hi-C: Chromosome Conformation Capture technique where a biotin-labeled nucleotide is incorporated at the ligation junction, enabling selective purification of chimeric DNA ligation junctions followed by deep sequencing.
- ncRNA-Seq: Capture of other non-coding RNA types, including post-translation modification types such as snRNA (small nuclear RNA) or snoRNA (small nucleolar RNA), or expression regulation types such as siRNA (small interfering RNA) or piRNA/piwi/RNA (piwi-interacting RNA).
- RAD-Seq
- RIP-Seq: Direct sequencing of RNA immunoprecipitates (includes CLIP-Seq, HITS-CLIP and PAR-CLIP).
- SELEX: Systematic Evolution of Ligands by EXponential enrichment.
- ssRNA-seq: Strand-specific RNA sequencing.
- Targeted-Capture
- Tethered Chromatin Conformation Capture
- OTHER: Library strategy not listed (please include additional info in the “design description”).
*library source : library_source
Definition: The library source specifies the type of source material that is being sequenced.
Value:
- GENOMIC: Genomic DNA (includes PCR products from genomic DNA).
- GENOMIC SINGLE CELL
- TRANSCRIPTOMIC: Transcription products or non genomic DNA (EST, cDNA, RT-PCR, screened libraries).
- TRANSCRIPTOMIC SINGLE CELL
- METAGENOMIC: Mixed material from metagenome.
- METATRANSCRIPTOMIC: Transcription products from community targets.
- SYNTHETIC: Synthetic DNA.
- VIRAL RNA: Viral RNA.
- OTHER: Other, unspecified, or unknown library source material (please include additional info in the “design description”).
*library selection: library_selection
Definition: Method used to enrich the target in the sequence library preparation.
Value:
- RANDOM: Random selection by shearing or other method.
- PCR: Source material was selected by designed primers.
- RANDOM PCR: Source material was selected by randomly generated primers.
- RT-PCR: Source material was selected by reverse transcription PCR.
- HMPR: Hypo-methylated partial restriction digest.
- MF: Methyl Filtrated.
- MDA: Multiple displacement amplification.
- MSLL: Methylation Spanning Linking Library.
- cDNA: Complementary DNA.
- ChIP: Chromatin immunoprecipitation.
- MNase: Micrococcal Nuclease (MNase) digestion.
- DNase: Deoxyribonuclease (DNase) digestion.
- Hybrid Selection: Selection by hybridization in array or solution.
- Reduced Representation: Reproducible genomic subsets, often generated by restriction fragment size selection, containing a manageable number of loci to facilitate re-sampling.
- Restriction Digest: DNA fractionation using restriction enzymes.
- 5-methylcytidine antibody: Selection of methylated DNA fragments using an antibody raised against 5-methylcytosine or 5-methylcytidine (m5C).
- MBD2 protein methyl-CpG binding domain: Enrichment by methyl-CpG binding domain.
- CAGE: Cap-analysis gene expression.
- RACE: Rapid Amplification of cDNA Ends.
- size fractionation: Physical selection of size appropriate targets.
- Padlock probes capture method: Circularized oligonucleotide probes.
- Oligo-dT: Enrichment of messenger RNA (mRNA) by hybridization to Oligo-dT.
- repeat fractionation: Selection for less repetitive (and more gene rich) sequence through Cot filtration (CF) or other fractionation techniques based on DNA kinetics.
- cDNA_oligo_dT
- cDNA_randomPriming
- Inverse rRNA: Depletion of ribosomal RNA by oligo hybridization.
- PolyA: PolyA selection or enrichment for messenger RNA (mRNA); should replace cDNA enumeration.
- other: Other library enrichment, screening, or selection process (please include additional info in the “design description”).
- unspecified: Library enrichment, screening, or selection is not specified (please include additional info in the “design description”).
*library layout : library_layout
Value:
- fragment/single
- paired
*platform : platform
*instrument model : instrument_model
Value:
- LS454
  - 454 GS
  - 454 GS 20
  - 454 GS FLX
  - 454 GS FLX+
  - 454 GS FLX Titanium
  - 454 GS Junior
- ABI_SOLID
  - AB 5500 Genetic Analyzer
  - AB 5500xl Genetic Analyzer
  - AB 5500xl-W Genetic Analysis System
  - AB SOLiD 3 Plus System
  - AB SOLiD 4 System
  - AB SOLiD 4hq System
  - AB SOLiD PI System
  - AB SOLiD System
  - AB SOLiD System 2.0
  - AB SOLiD System 3.0
- BGISEQ
  - BGISEQ-500
  - BGISEQ-50
  - BGISEQ-1000
  - BGISEQ-100
- Bionano
  - Saphyr
- DIPSEQ
  - DIPSEQ-T1
  - DIPSEQ-T5
  - DIPSEQ-T10
- DNBSEQ
  - DNBSEQ-G50(MGISEQ-200)
  - DNBSEQ-G400(MGISEQ-2000)
  - DNBSEQ-G400 FAST
  - DNBSEQ-T1
  - DNBSEQ-T5
  - DNBSEQ-T7
  - DNBSEQ-T10
  - DNBSEQ-T10×4
  - DNBSEQ-T20
  - DNBSEQ-T20×2
- CAPILLARY
  - AB 310 Genetic Analyzer
  - AB 3130 Genetic Analyzer
  - AB 3130xL Genetic Analyzer
  - AB 3500 Genetic Analyzer
  - AB 3500xL Genetic Analyzer
  - AB 3730 Genetic Analyzer
  - AB 3730xL Genetic Analyzer
- COMPLETE_GENOMICS
  - Complete Genomics
- HELICOS
  - Helicos HeliScope
- ILLUMINA
  - HiSeq X Five
  - HiSeq X Ten
  - Illumina Genome Analyzer
  - Illumina Genome Analyzer II
  - Illumina Genome Analyzer IIx
  - Illumina HiScanSQ
  - Illumina HiSeq 1000
  - Illumina HiSeq 1500
  - Illumina HiSeq 2000
  - Illumina HiSeq 2500
  - Illumina HiSeq 3000
  - Illumina HiSeq 4000
  - Illumina iSeq 100
  - Illumina NovaSeq 6000
  - Illumina MiniSeq
  - Illumina MiSeq
  - NextSeq 500
  - NextSeq 550
- ION_TORRENT
  - Ion Torrent PGM
  - Ion Torrent Proton
  - Ion Torrent S5 XL
  - Ion Torrent S5
- OXFORD_NANOPORE
  - GridION
  - MinION
  - PromethION
- PACBIO_SMRT
  - PacBio RS
  - PacBio RS II
  - Sequel
  - Sequel II
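Because each instrument model belongs to one platform in the list above, a mismatched pair can be caught with a simple lookup. The sketch below is illustrative only: the dictionary holds a small subset of the listed values, and the function name is hypothetical rather than part of any submission tool.

```python
# Subset of the platform/instrument values listed above, for illustration only.
INSTRUMENTS_BY_PLATFORM = {
    "ILLUMINA": {"Illumina HiSeq 2500", "Illumina NovaSeq 6000", "Illumina MiSeq"},
    "OXFORD_NANOPORE": {"GridION", "MinION", "PromethION"},
    "PACBIO_SMRT": {"PacBio RS", "PacBio RS II", "Sequel", "Sequel II"},
}


def instrument_matches_platform(platform, instrument_model):
    """Return True if the instrument model is listed under the given platform."""
    # Unknown platforms map to an empty set, so they never match.
    return instrument_model in INSTRUMENTS_BY_PLATFORM.get(platform, set())


# Usage: a MinION is listed under OXFORD_NANOPORE, not under ILLUMINA.
print(instrument_matches_platform("OXFORD_NANOPORE", "MinION"))
print(instrument_matches_platform("ILLUMINA", "MinION"))
```

A full check would load every platform and model from the list; the values are matched as exact strings here, which assumes the submission form is case- and whitespace-sensitive.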
design description: design_description
- Definition: Free-form description of the methods used to create the sequencing library; a brief ‘materials and methods’ section.
library construction protocol: library_construction_protocol
- Definition: Describes the protocol by which the sequencing library was constructed.
*spot layout: spot_layout
- Definition: If technical reads (e.g. barcodes, adaptors or linkers) are included in the submitted raw sequences, a spot descriptor must be submitted to describe the position of the technical reads so that they can be removed.
*nominal size: nominal_size
- Definition: The average insert size for paired reads.
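Since each library name must be unique, a batch of experiments can be screened for repeated names before submission. This is a minimal sketch with a hypothetical function name; it assumes the library names have already been read from the submission sheet into a list.

```python
from collections import Counter


def duplicate_library_names(names):
    """Return the library names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Usage: the names below are made-up placeholders.
print(duplicate_library_names(["lib_A", "lib_B", "lib_A"]))
```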
Run Information
*file type: file_type
- Value:
  - bam
  - cram
  - sff
  - fastq
  - PacBio_HDF5
  - bnx
  - Oxford_Nanopore
*file name: file_name
*file md5: file_md5
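The file_md5 value is the MD5 checksum of the submitted file. One way to compute it is with Python's standard hashlib module, reading the file in chunks so that large sequencing files are not loaded into memory at once. The function name and chunk size are illustrative choices, and the lowercase hexadecimal output format is an assumption about what the submission form expects.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 checksum of a file as a lowercase hexadecimal string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The same function can be used for the reference_md5 field of a custom assembly FASTA file.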
Reference Information
When submitting BAM files of aligned reads, you must also specify an assembly, that is, the reference genome that your reads were aligned against. You can identify your reference assembly by its accession from NCBI, UCSC, or Ensembl. If the assembly is not available from a public repository, you will need to submit your own (local) assembly in FASTA format along with your BAM file.
When submitting CRAM files, the references should be provided in the same manner as BAM references.
reference accession: reference_accession
- Definition: Applies only if you are submitting a BAM/CRAM file aligned against an assembly that is available in a public repository. Please provide the assembly's accession in that repository (e.g. GRCh37).
reference fasta: reference_fasta
- Definition: Please provide the name of the custom assembly fasta file used during alignment (e.g. Mouse.fasta).
reference md5: reference_md5
- Definition: MD5 checksum for the custom assembly fasta file.
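The reference rules above can be summarized as: an aligned bam or cram file needs either a public reference accession or a custom FASTA file. The sketch below checks one record for that; the function name and the dictionary layout are hypothetical, and requiring reference_md5 whenever reference_fasta is given is an assumption, since this page does not state it explicitly.

```python
def check_reference_fields(record):
    """Return a list of problems with the reference fields of one run record."""
    problems = []
    # Reference fields only matter for aligned-read formats.
    if record.get("file_type") not in ("bam", "cram"):
        return problems
    has_accession = bool(record.get("reference_accession"))
    has_fasta = bool(record.get("reference_fasta"))
    if not has_accession and not has_fasta:
        problems.append("bam/cram file needs reference_accession or reference_fasta")
    # Assumption: a custom FASTA should always come with its checksum.
    if has_fasta and not record.get("reference_md5"):
        problems.append("reference_fasta given without reference_md5")
    return problems


# Usage: a BAM record that names a public assembly passes the check.
print(check_reference_fields({"file_type": "bam", "reference_accession": "GRCh37"}))
```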