The key difference between a contig and a scaffold is that a contig has no gaps, while a scaffold consists of contigs separated by gaps.
Sequencing the genome of a multicellular organism is generally more difficult than sequencing the genome of a unicellular organism. Whole-genome shotgun sequencing is a relatively fast technique for sequencing such genomes. It sequences many overlapping DNA fragments in parallel. A computer then assembles the small fragments into larger fragments and, in turn, into contigs. The contigs are assembled into scaffolds and finally into chromosomes. Therefore, a contig is a continuous stretch of nucleotide sequence, while a scaffold is a portion of the genome consisting of contigs and gaps. Both contigs and scaffolds are reconstructed genomic sequences.
CONTENTS
1. Overview and Key Difference
2. What is Contig
3. What is Scaffold
4. Similarities Between Contig and Scaffold
5. Side by Side Comparison – Contig vs Scaffold in Tabular Form
6. Summary
What is a Contig?
A contig is a genomic sequence. It is a continuous stretch of sequence composed of A, C, G and T bases, formed by joining several short, overlapping pieces of DNA into a longer sequence. In simple words, a contig is an assemblage of a set of sequence fragments. Creating a contig involves finding fragments whose ends overlap, using local string matching and alignment methods.
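The overlap step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming error-free fragments that all come from the same strand; the function names `overlap_length` and `greedy_contig` and the `min_overlap` parameter are hypothetical, and real assemblers use far more robust alignment methods.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_contig(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest exact overlap.

    Hypothetical helper for illustration: exact matching only, no
    sequencing errors, no reverse-complement handling.
    """
    seqs = list(fragments)
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    size = overlap_length(a, b, min_overlap)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            # No remaining overlaps: each sequence left is a separate contig.
            return seqs
        merged = seqs[best_i] + seqs[best_j][best_size:]
        seqs = [s for k, s in enumerate(seqs) if k not in (best_i, best_j)]
        seqs.append(merged)


contigs = greedy_contig(["ATGCGT", "CGTACC", "ACCTGA"])
print(contigs)
```

Fragments that share no sufficiently long overlap are left as separate contigs, which is why an assembly usually ends up with many contigs rather than one.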
Contigs do not have gaps. Each contig is a part of a scaffold: contigs are chained together to create a scaffold, and this requires additional information about the relative position and orientation of the contigs in the genome. Within a scaffold, gaps separate the contigs. Contig assembly is an important step in whole-genome shotgun sequencing, and the contigs are ultimately assembled into a complete genomic sequence. Contig assembly typically requires an understanding of string matching and sequence alignment algorithms.
What is a Scaffold?
A scaffold is a genomic sequence reconstructed from end-sequenced whole-genome shotgun clones. Structurally, a scaffold consists of contigs and gaps: it is created by chaining contigs together, with gaps separating them. Creating a scaffold requires information about the relative position and orientation of the contigs in the genome. A whole-genome shotgun assembly aims to represent each genomic sequence in one scaffold, but this is not always possible. Hence, one chromosome may be represented by several scaffolds. Sometimes scaffolds can overlap, and some scaffolds may be filtered out during the assembly.
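As a rough sketch of how contigs might be chained into a scaffold, the following Python example joins contigs in a given order and orientation and fills each gap with a run of `N` characters. Using `N` as the gap placeholder is a common convention in sequence files rather than something stated in this article, and the `Placement` class, its fields and `build_scaffold` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Translation table for complementing bases; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class Placement:
    """One contig's place in a scaffold (hypothetical structure)."""
    contig: str
    reverse: bool = False   # True if the contig lies on the opposite strand
    gap_after: int = 0      # assumed gap length before the next contig


def build_scaffold(placements: list[Placement]) -> str:
    """Chain contigs in the given order, separating them with runs of N."""
    parts = []
    for index, placement in enumerate(placements):
        seq = placement.contig
        if placement.reverse:
            seq = reverse_complement(seq)
        parts.append(seq)
        if index < len(placements) - 1:
            parts.append("N" * placement.gap_after)
    return "".join(parts)


scaffold = build_scaffold([
    Placement("ATGC", gap_after=5),
    Placement("GGTT", reverse=True),
])
print(scaffold)  # ATGCNNNNNAACC
```

The order, orientation and gap sizes are inputs here; in practice they come from the additional positional information mentioned above.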
The length of a gap in a scaffold does not necessarily reflect the true length of the missing sequence. Generally, gaps are arbitrarily set to certain fixed lengths. These gaps, and the uncertainty in their lengths, make it difficult to understand the true spatial relationships of functional elements in a genome and the actual extent of the missing information. In some cases, a gap represents genomic information that is missing from the assembly.
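Going the other way, a scaffold can be split back into its contigs and gap lengths. The sketch below assumes gaps are written as runs of `N`, which is a common convention but an assumption here; `split_scaffold` and `min_gap` are hypothetical names. The lengths it reports are only the placeholder lengths stored in the sequence, not the true sizes of the missing regions.

```python
import re


def split_scaffold(scaffold: str, min_gap: int = 1) -> tuple[list[str], list[int]]:
    """Split a scaffold into contigs and the recorded length of each gap.

    Assumes gaps are runs of at least `min_gap` consecutive N characters.
    """
    contigs = []
    gaps = []
    position = 0
    for match in re.finditer(r"N{%d,}" % min_gap, scaffold):
        contigs.append(scaffold[position:match.start()])
        gaps.append(match.end() - match.start())
        position = match.end()
    contigs.append(scaffold[position:])
    return contigs, gaps


contigs, gaps = split_scaffold("ATGCNNNNNAACCNNGG")
print(contigs)  # ['ATGC', 'AACC', 'GG']
print(gaps)     # [5, 2]
```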
What are the Similarities Between Contig and Scaffold?
- Both contigs and scaffolds are genomic sequences composed of nucleotides.
- Scaffolds are composed of contigs and gaps.
What is the Difference Between Contig and Scaffold?
A contig is a continuous sequence assembled from a set of overlapping sequence fragments. In contrast, a scaffold is a portion of the genomic sequence reconstructed by chaining contigs together. This is the key difference between a contig and a scaffold. Moreover, a contig has no gaps, while the contigs in a scaffold are separated by gaps.
The infographic below tabulates more differences between a contig and a scaffold.
Summary – Contig vs Scaffold
Both contigs and scaffolds are reconstructed nucleotide sequences in whole-genome sequencing projects. A contig is a continuous stretch of genomic sequence made up of A, C, G and T bases without gaps. A scaffold is a genomic sequence that consists of contigs and gaps. Hence, contigs are the shortest assembly components, while scaffolds are assemblages of contigs. Finally, the scaffolds are assembled into chromosomes in whole-genome sequencing. This summarizes the difference between a contig and a scaffold.