Molecular Basis of Inheritance - Notes | Class 12 | Part 9: Human Genome Project (HGP)



·   The entire DNA in the haploid set of chromosomes of an organism is called a Genome.

·   In the human genome, DNA is packed into 23 chromosomes.

·   The human genome contains about 3 × 10⁹ bp.

·   The Human Genome Project (1990-2003) was the first mega project for sequencing the nucleotides and mapping all the genes in the human genome.

·   HGP was coordinated by the U.S. Department of Energy and the National Institutes of Health.

Goals of HGP

a.    Identify all the estimated genes in human DNA.

b.    Determine the sequence of the 3 billion chemical base pairs of human DNA.

c.    Store this information in databases.

d.    Improve tools for data analysis.

e.    Transfer related technologies to other sectors.

f.     Address the ethical, legal and social issues (ELSI) that may arise from the project.

Methodologies of HGP: 2 major approaches.

  • Expressed Sequence Tags (ESTs): Focused on identifying all the genes that are expressed as RNA.
  • Sequence annotation: Sequencing the whole genome, including all coding and non-coding sequences, and later assigning functions to the different regions of the sequence.

Procedure of sequencing:

Isolate DNA from a cell → Convert it into random fragments → Clone the fragments in a host (bacteria or yeast) using vectors (e.g. BAC & YAC) for amplification → Sequence the fragments using automated DNA sequencers (based on Frederick Sanger's method) → Arrange the sequences based on their overlapping regions → Align the sequences using computer programs.

BAC = Bacterial Artificial Chromosome

YAC = Yeast Artificial Chromosome
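
The "arrange the sequences based on overlapping regions" step can be illustrated with a small Python sketch. This is only a toy greedy merge written for these notes, not the software actually used in HGP; the function names `overlap` and `greedy_assemble`, the `min_len` parameter and the example fragments are all assumptions made for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of a that is a prefix of b."""
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0  # no overlap of at least min_len bases


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of fragments with the longest overlap.

    Hypothetical helper: returns the remaining contigs (ideally just one).
    """
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge; fragments stay separate
        # Join the two fragments, writing the shared region only once.
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three made-up fragments that overlap by 3 bases each.
contigs = greedy_assemble(["ACCTGA", "ATGCGT", "CGTACC"])
print(contigs)
```

Real genomes contain long repeated sequences, so a simple greedy merge like this can join fragments incorrectly; it is shown only to make the idea of overlap-based ordering concrete.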

  • Sanger also developed a method for sequencing amino acids in proteins.
  • DNA is converted to fragments as there are technical limitations in sequencing very long pieces of DNA.
  • HGP was closely associated with Bioinformatics.
  • Bioinformatics: Application of computer science and information technology to the field of biology & medicine.
  • Of the 24 chromosomes (22 autosomes, X and Y), chromosome 1 was the last to be sequenced (May 2006).
  • DNA sequencing has also been done in bacteria, yeast, Caenorhabditis elegans (a free-living, non-pathogenic nematode), Drosophila, plants (rice & Arabidopsis), etc.

Salient features of Human Genome

a.     The human genome contains 3164.7 million nucleotide bases.

b.    Total number of genes: about 30,000.

c.   The average gene consists of 3000 bases, but sizes vary. The largest known human gene (dystrophin, on the X chromosome) contains 2.4 million bases.

d.   99.9% of the nucleotide bases are the same in all people. Only a 0.1% (3 × 10⁶ bp) difference makes every individual unique.

e.     The functions of over 50% of the discovered genes are unknown.

f.     Chromosome 1 has the most genes (2968) and Y has the fewest (231).

g.    Less than 2% of the genome codes for proteins.

h.   A very large portion of the human genome is made of repeated (repetitive) sequences. These are stretches of DNA that are repeated many times. They have no direct coding function, but they shed light on chromosome structure, dynamics and evolution.

i.   About 1.4 million locations have single-base DNA differences. They are called SNPs (single nucleotide polymorphisms, or ‘snips’). This information helps in finding chromosomal locations for disease-associated sequences and in tracing human history.
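
The figures in items (a), (d) and (g) can be checked with simple arithmetic, and the idea of an SNP in item (i) can be shown by comparing two aligned sequences base by base. The Python sketch below is only an illustration: the function name `find_snps` and the two short sequences are made up for these notes, and real SNP detection works on far larger data with specialised tools.

```python
GENOME_BASES = 3_164_700_000  # 3164.7 million bases, from item (a)

# Item (d): 0.1% of the bases differ between individuals.
differing_bases = GENOME_BASES // 1000      # about 3 x 10^6 bp
# Item (g): less than 2% of the genome codes for proteins (upper limit).
coding_upper_limit = GENOME_BASES * 2 // 100

print(differing_bases)       # 3164700
print(coding_upper_limit)    # 63294000


def find_snps(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ.

    Hypothetical helper: assumes both sequences have equal length and are
    already aligned, so position i in one matches position i in the other.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Two made-up sequences that differ at a single base (0-based position 3).
print(find_snps("ATGCGTAC", "ATGAGTAC"))    # [(3, 'C', 'A')]
```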


