Nucleic acids (DNA & RNA) are the building blocks of genetic material.
DNA is the genetic material in most organisms.
STRUCTURE OF POLYNUCLEOTIDE CHAIN
Polynucleotides are polymers of nucleotides. DNA & RNA are polynucleotides. A nucleotide has 3 components:
1. A nitrogenous base
2. A pentose sugar (ribose in RNA & deoxyribose in DNA)
3. A phosphate group
- Nitrogenous bases are of 2 types:
} Purines: Adenine (A) and Guanine (G).
} Pyrimidines: Cytosine (C), Thymine (T, only in DNA) & Uracil (U, only in RNA).
- Erwin Chargaff’s rule: In DNA, the proportion of A is equal to T and the proportion of G is equal to C.
Therefore, [A] + [G] = [T] + [C].
- A nitrogenous base is linked to the pentose sugar through an N-glycosidic linkage.
- Nitrogenous base + pentose sugar = nucleoside
ú Adenosine (deoxyadenosine)
ú Guanosine (deoxyguanosine)
ú Cytidine (deoxycytidine)
ú Uridine (deoxythymidine)
- Nitrogen base + sugar + phosphate group = Nucleotide (deoxyribonucleotide).
- 2 nucleotides are linked through 3’-5’ phosphodiester bond → dinucleotide
- More nucleotides → polynucleotide
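Chargaff's rule stated above lets the full base composition of double-stranded DNA be worked out from a single measured value. The sketch below is only an illustration; the function name `base_composition_from_a` is hypothetical, and it assumes percentages of a double-stranded DNA sample.

```python
def base_composition_from_a(percent_a):
    """Apply Chargaff's rule ([A] = [T], [G] = [C]) to double-stranded DNA.

    percent_a: percentage of adenine in the sample (0 to 50).
    Returns the percentage of each of the four bases.
    """
    if not 0 <= percent_a <= 50:
        raise ValueError("In double-stranded DNA, A cannot exceed 50%")
    percent_g = 50 - percent_a  # A + T + G + C = 100 and A = T, G = C
    return {"A": percent_a, "T": percent_a, "G": percent_g, "C": percent_g}


composition = base_composition_from_a(30)
print(composition)
# Purines (A + G) equal pyrimidines (T + C), as the rule requires.
print(composition["A"] + composition["G"] == composition["T"] + composition["C"])
```

For example, if a DNA sample has 30% adenine, it must also have 30% thymine, leaving 20% each for guanine and cytosine.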
} Friedrich Miescher (1869): Identified DNA and named it ‘Nuclein’.
} James Watson and Francis Crick proposed double helix model of DNA.
} Length of DNA is based on the number of nucleotides present in it. A pair of complementary nucleotides is referred to as a base pair (bp).
· ΦX174 (a bacteriophage) has 5386 nucleotides.
· Bacteriophage lambda has 48502 base pairs (bp).
· E. coli has 4.6 × 10^6 bp.
· Haploid content of human DNA is 3.3 × 10^9 bp.
Salient features of double helix structure of DNA
} DNA is made of 2 polynucleotide chains. Its backbone is formed of sugar & phosphates. The bases project inside.
} The 2 chains have anti-parallel polarity, i.e. one chain has the polarity 5’→3’ and the other has 3’→5’.
} The bases in the 2 strands are paired through H-bonds, forming base pairs (bp).
A=T (2 hydrogen bonds) C≡G (3 hydrogen bonds)
} Purine comes opposite to a pyrimidine. This generates uniform distance between the 2 strands.
} The 2 chains are coiled in a right handed fashion.
} The pitch of the helix= 3.4 nm (34 Å)
} Number of base pair in each turn= 10
} Distance between adjacent base pairs= 0.34 nm (3.4 Å).
} Length of DNA = number of base pairs × distance between two adjacent base pairs.
Number of base pairs in a human (diploid) cell = 6.6 × 10^9
Hence, the length of DNA = 6.6 × 10^9 × 0.34 × 10^-9 m
= 2.2 m (approx.)
In E. coli, length of DNA = 1.36 mm
= 1.36 × 10^-3 m
Hence, number of base pairs = (1.36 × 10^-3) / (0.34 × 10^-9)
= 4 × 10^6 bp
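The length calculation above can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the only input taken from the notes is the 0.34 nm distance between adjacent base pairs.

```python
RISE_PER_BP_M = 0.34e-9  # distance between adjacent base pairs, in metres


def dna_length_m(base_pairs):
    """Length of a DNA double helix in metres, given its number of base pairs."""
    return base_pairs * RISE_PER_BP_M


def base_pairs_for_length(length_m):
    """Number of base pairs in a DNA double helix of the given length (metres)."""
    return length_m / RISE_PER_BP_M


print(f"Human (6.6e9 bp): {dna_length_m(6.6e9):.2f} m")
print(f"E. coli (1.36 mm): {base_pairs_for_length(1.36e-3):.2e} bp")
```

The human value comes out slightly above 2.2 m because the notes round the product 6.6 × 0.34.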
PACKAGING OF DNA HELIX
§ In prokaryotes (e.g. E. coli), the DNA is not scattered throughout the cell. DNA, being negatively charged, is held by some positively charged proteins in a region called the ‘nucleoid’.
§ In eukaryotes, there is a set of positively charged, basic proteins called histones. Histones are rich in the positively charged basic amino acid residues lysine and arginine.
§ 8 histones form histone octamer.
§ Negatively charged DNA is wrapped around histone octamer to give nucleosome.
§ A typical nucleosome contains 200 bp.
Therefore, the total number of nucleosomes in a human cell =
(6.6 × 10^9 bp) / (200 bp) = 3.3 × 10^7
§ Nucleosomes are the repeating units of chromatin. Chromatin is seen as thread-like stained bodies in the nucleus.
§ Nucleosomes in chromatin = ‘beads-on-string’.
§ Chromatin is packaged → chromatin fibres → coiled and condensed at metaphase stage → chromosomes.
§ The packaging of chromatin at higher level requires additional set of proteins called non-histone chromosomal (NHC) proteins.
§ Chromatins include
· Euchromatin: Loosely packed and transcriptionally active chromatin and stains light.
· Heterochromatin: Densely packed and inactive region of chromatin and stains dark.
THE SEARCH FOR GENETIC MATERIAL
1. Griffith’s experiment (Transforming principle)
Griffith used mice & Streptococcus pneumoniae.
Streptococcus pneumoniae has 2 strains-
◦ Smooth (S) strain (Virulent): Has a polysaccharide mucous coat. Causes pneumonia.
◦ Rough (R) strain (Non-virulent): No mucous coat. Does not cause pneumonia.
· S-strain → Inject into mice → Mice die
· R-strain → Inject into mice → Mice live
· S-strain (Heat killed) → Inject into mice → Mice live
· S-strain (Heat killed) + R-strain (live) → Inject into mice → Mice die
He concluded that some ‘transforming principle’ was transferred from the heat-killed S-strain to the R-strain. It enabled the R-strain to synthesize a smooth polysaccharide coat and become virulent. This must be due to the transfer of genetic material.
2. Biochemical characterization of transforming principle
- Oswald Avery, Colin MacLeod & Maclyn McCarty worked to determine the biochemical nature of ‘transforming principle’ in Griffith’s experiment.
- They purified biochemicals (proteins, DNA, RNA etc.) from the heat killed S cells to see which ones could transform R cells into S cells.
- They discovered that
ú DNA alone from the S cells caused the R cells to become transformed.
ú Proteases and RNases did not affect transformation.
ú Digestion with DNase inhibited transformation, suggesting that the DNA caused the transformation.
3. The Hershey-Chase Experiment
- Hershey & Chase made 2 preparations of bacteriophage. In one, proteins were labeled with 35S by growing the phage in a medium containing radioactive sulphur (S-35). In the second, DNA was labeled with 32P by growing the phage in a medium containing radioactive phosphorus (P-32).
- These preparations were used separately to infect E. coli.
- After infection, the E. coli cells were gently agitated in a blender to separate the phage particles from the bacteria.
- Then the culture was centrifuged. The heavier bacterial cells formed a pellet at the bottom. The lighter viral components outside the bacterial cells remained in the supernatant.
- They found that:
· The supernatant contained viral protein labeled with 35S, i.e. the viral protein had not entered the bacterial cells.
· The bacterial pellet contained radioactive P. This shows that viral DNA labeled with 32P had entered the bacterial cells. This proves that DNA is the genetic material.
PROPERTIES OF GENETIC MATERIAL
A genetic material must
· Be able to generate its replica (Replication).
· Be chemically and structurally stable.
· Provide scope for the slow changes (mutations) that are required for evolution.
· Be able to express itself as ‘Mendelian Characters’.
DNA is a better genetic material
Reasons for stability (less reactivity) of DNA | Reasons for mutability (high reactivity) of RNA
Presence of thymine | Presence of uracil
Absence of 2’-OH | Presence of 2’-OH
· The 2 DNA strands are complementary. If separated by heating, they come together again when appropriate conditions are provided. (In Griffith’s experiment, when the bacteria were heat killed, some properties of DNA were not destroyed.)
- Due to the unstable nature of RNA, RNA viruses (e.g. Qβ bacteriophage, Tobacco Mosaic Virus etc.) mutate and evolve faster.
- RNA can directly code for the protein synthesis, hence can easily express the characters. DNA is dependent on RNA for protein synthesis.
- For the storage of genetic information DNA is better due to its stability. But for the transmission of genetic information, RNA is better.
} RNA was the first genetic material.
} Essential life processes (metabolism, translation, splicing etc) evolved around RNA.
} It acts as genetic material and catalyst.
} DNA evolved from RNA for stability.
DNA REPLICATION (Semi-conservative model)
Replication is the copying of DNA from parental DNA. Semi-conservative replication was proposed by Watson & Crick. Meselson & Stahl experimentally proved it.
Meselson & Stahl’s Experiment
} They cultured E. coli in a medium containing 15NH4Cl (15N: heavy isotope of N). 15N was incorporated into both strands of bacterial DNA and the DNA became heavier.
} Another preparation containing N salts labeled with 14N is also made. 14N was also incorporated in both strands of DNA and became lighter. The 2 types of DNA can be separated by centrifugation in a CsCl density gradient.
} They took E. coli cells from 15N medium and transferred to 14N medium.
} After one generation, they isolated and centrifuged the DNA. Its density was intermediate between 15N DNA and 14N DNA. This shows that in the newly formed DNA, one strand is old (15N type) and one strand is new (14N type). This confirms semi-conservative replication.
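The logic of the experiment can be sketched as a small simulation. Each DNA molecule is modelled as a pair of strand labels, and every replication keeps one old strand and adds one new 14N strand. The function names and the band labels ("heavy", "hybrid", "light") are assumptions made for this illustration. The second-generation result printed here is what the semi-conservative model predicts; the notes above describe only the first generation.

```python
from collections import Counter


def replicate(population, new_label="14N"):
    """Semi-conservative replication: each old strand pairs with a new strand."""
    next_generation = []
    for strand_a, strand_b in population:
        next_generation.append((strand_a, new_label))
        next_generation.append((strand_b, new_label))
    return next_generation


def density_bands(population):
    """Classify each molecule by how many 15N strands it carries."""
    names = {2: "heavy", 1: "hybrid", 0: "light"}
    return Counter(names[molecule.count("15N")] for molecule in population)


generation = [("15N", "15N")]  # cells grown in 15N medium
for number in range(3):
    print(number, dict(density_bands(generation)))
    generation = replicate(generation)  # growth continues in 14N medium
```

After one round every molecule is hybrid, which matches the intermediate density observed. A conservative model would instead give one heavy and one light molecule, with no intermediate band.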
The Machinery and Enzymes for Replication
· DNA replication starts at a point called origin (ori).
· A unit of replication with one origin is called a replicon.
· During replication, the 2 strands unwind and separate by breaking H-bonds in presence of an enzyme, Helicase.
· The separated strands act as templates for the synthesis of new strands.
· New DNA strands are synthesized in the 5’→3’ direction.
· Deoxyribonucleoside triphosphates (dATP, dGTP, dCTP & dTTP) act as substrates and also provide energy for polymerization.
· Firstly, a small RNA primer is synthesized in the presence of an enzyme, primase.
· In the presence of an enzyme, DNA dependent DNA polymerase, nucleotides are added one after another to the primer to form a polynucleotide chain (new strand).
· Unwinding of the DNA molecule at a point forms a ‘Y’-shaped structure called replication fork.
· The DNA polymerase forms one new strand (leading strand) in a continuous stretch in the 5’→3’ direction (Continuous synthesis).
· The other new strand is formed in small stretches (Okazaki fragments) in 5’→3’ direction (Discontinuous synthesis).
· The Okazaki fragments are then joined together to form a new strand by an enzyme, DNA ligase. This new strand is called lagging strand.
· If a wrong base is introduced in the new strand, DNA polymerase can do proof reading.
· E. coli completes replication within 38 minutes, i.e. at an average rate of about 2000 bp per second.
· In eukaryotes, the replication of DNA takes place at S-phase of the cell cycle. Failure in cell division after DNA replication results in polyploidy.
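The replication time quoted above for E. coli follows from the genome size and the polymerization rate. A minimal check, assuming a single constant rate (the function name is hypothetical):

```python
def replication_time_minutes(genome_bp, rate_bp_per_second):
    """Time needed to copy a genome at a constant average polymerization rate."""
    return genome_bp / rate_bp_per_second / 60


# E. coli: 4.6 x 10^6 bp copied at about 2000 bp per second
print(f"{replication_time_minutes(4.6e6, 2000):.1f} minutes")
```

This gives a little over 38 minutes, in agreement with the figure in the list.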
CENTRAL DOGMA OF MOLECULAR BIOLOGY
- Transcription is the process of copying genetic information from one strand of the DNA into RNA.
- Here, adenine pairs with uracil instead of thymine.
- Both strands are not copied during transcription, because
◦ The code for proteins is different in both strands. This complicates the translation.
◦ If 2 RNA molecules were produced simultaneously, they would be complementary to each other and hence form a double stranded RNA. This would prevent translation.
- Transcription unit: It is the segment of DNA between the sites of initiation and termination of transcription. It consists of 3 regions:
◦ A promoter (Transcription start site): Binding site for RNA polymerase.
◦ The structural gene: The region between promoter and terminator where transcription takes place.
◦ A terminator: End of process of transcription.
- The 2 strands have opposite polarity, and the DNA-dependent RNA polymerase catalyzes polymerization in only one direction, i.e. 5’→3’. Hence only one strand is transcribed.
- The strand with 3’→5’ polarity acts as the template strand. The strand with 5’→3’ polarity acts as the coding strand.
3’-ATGCATGCATGCATGCATGCATGC-5’ template strand.
5’-TACGTACGTACGTACGTACGTACG-3’ coding strand.
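The relationship between the two strands and the RNA transcript can be sketched in code. The helper names below are hypothetical. The template is written 3’→5’ as above, so pairing base by base gives the coding strand and the RNA already written 5’→3’.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}  # adenine pairs with uracil


def coding_strand(template_3_to_5):
    """Coding strand (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_PAIR[base] for base in template_3_to_5)


def transcribe(template_3_to_5):
    """RNA (5'->3') synthesized on a template written 3'->5'."""
    return "".join(RNA_PAIR[base] for base in template_3_to_5)


template = "ATGCATGCATGCATGCATGCATGC"
print("coding:", coding_strand(template))
print("RNA:   ", transcribe(template))
```

The RNA has the same sequence as the coding strand, with U in place of T. That is why the coding strand, which does not itself code for anything, is used as the reference sequence.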
Transcription unit and gene
- Gene: Functional unit of inheritance. It is the DNA sequence coding for RNA molecule.
- Cistron: A segment of DNA coding for a polypeptide.
- Structural gene in a transcription unit is monocistronic (in eukaryotes) or polycistronic (in prokaryotes).
- The monocistronic structural genes have interrupted coding sequences (split genes).
- The coding sequences (expressed sequences) are called exons. The exons are interrupted by introns (intervening sequences). Polycistronic structural genes are not split.
Steps of transcription in prokaryotes
} Initiation: Here, the enzyme RNA polymerase binds at the promoter site of DNA. This causes the local unwinding of the DNA double helix. An initiation factor (σ) present in RNA polymerase initiates the RNA synthesis.
} Elongation: The RNA chain is synthesized in the 5’→3’ direction. In this process, activated ribonucleoside triphosphates (ATP, GTP, UTP & CTP) are added. The RNA formed is complementary to the base sequence of the DNA template.
} Termination: A termination factor (ρ) binds to the RNA polymerase and terminates the transcription.
In bacteria (Prokaryotes) transcription and translation can be coupled because
· mRNA requires no processing to become active.
· Transcription and translation take place in the same compartment (no separation of cytosol and nucleus). Translation can begin before mRNA is fully transcribed.
In eukaryotes, there are 2 additional complexities:
1. There are 3 RNA polymerases:
· RNA polymerase I: Transcribes rRNAs (28S, 18S & 5.8S).
· RNA polymerase II: Transcribes the heterogeneous nuclear RNA (hnRNA). It is the precursor of mRNA.
· RNA polymerase III: Transcribes tRNA, 5S rRNA and snRNAs (small nuclear RNAs).
2. The primary transcripts (hnRNA) contain both the exons and introns and are non-functional. Hence introns have to be removed. For this, it undergoes the following processes:
· Splicing: From hnRNA introns are removed (by the spliceosome) and exons are spliced (joined) together.
· Capping: Here, a nucleotide methyl guanosine triphosphate (cap) is added to the 5’ end of hnRNA.
· Tailing (Polyadenylation): Here, adenylate residues (200-300) are added at 3’-end. It is the fully processed hnRNA, now called mRNA.
GENETIC CODE
It is the sequence of nucleotides (nitrogen bases) in mRNA that contains information for protein synthesis (translation).
20 AMINO ACIDS INVOLVED IN TRANSLATION
1. Alanine (Ala) 11. Leucine (Leu)
2. Arginine (Arg) 12. Lysine (Lys)
3. Asparagine (Asn) 13. Methionine (Met)
4. Aspartic acid (Asp) 14. Phenylalanine (Phe)
5. Cysteine (Cys) 15. Proline (Pro)
6. Glutamine (Gln) 16. Serine (Ser)
7. Glutamic acid (Glu) 17. Threonine (Thr)
8. Glycine (Gly) 18. Tryptophan (Trp)
9. Histidine (His) 19. Tyrosine (Tyr)
10. Isoleucine (Ile) 20. Valine (Val)
ú George Gamow: Suggested that in order to code for 20 amino acids, the code should be made up of 3 nucleotides.
ú Har Gobind Khorana: Developed the chemical method for synthesizing RNA molecules with defined combinations of bases (homopolymers & copolymers).
ú Marshall Nirenberg: Developed cell-free system for protein synthesis.
ú Severo Ochoa enzyme (polynucleotide phosphorylase) is used to polymerize RNA with defined sequences in a template independent manner.
The codons for the various amino acids
Salient features of genetic code
· Triplet code (three-letter code)
· Genetic code is universal.
· No punctuation between adjacent codons (comma less code).
· Some amino acids are coded by more than one codon. Hence the code is degenerate.
· The genetic code is non-ambiguous, i.e. one codon specifies only one amino acid.
· AUG is the initiator codon. It also codes for methionine. The first amino acid is methionine in eukaryotes and formyl methionine in prokaryotes.
· Termination codons (non-sense codons/stop codons) are UAA, UAG & UGA. They do not indicate any amino acids.
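The features listed above (triplet, comma less, degenerate, non-ambiguous, with start and stop codons) can be seen in a short translation sketch. The table is the standard genetic code, built from the usual U, C, A, G ordering. Single-letter amino acid symbols are used here for compactness (M = Met, F = Phe, G = Gly, L = Leu), and `*` marks a stop codon. The function name `translate` and the sample mRNA are made up for this illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks the three stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return the peptide."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):  # contiguous triplets, no commas
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGUUUGGAUAAGG"))  # codons read: AUG UUU GGA UAA
print("codons for leucine:", AMINO_ACIDS.count("L"))
```

Each codon maps to exactly one entry in the table (non-ambiguous), while an amino acid such as leucine appears under several codons (degenerate).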
TYPES OF RNA
- mRNA (messenger RNA): Provide template for translation (protein synthesis).
- rRNA (ribosomal RNA): Structural & catalytic role during translation. E.g. 23S rRNA in bacteria acts as ribozyme.
- tRNA (transfer RNA or sRNA or soluble RNA): Brings amino acids for protein synthesis and reads the genetic code.
tRNA: the adapter molecule
tRNA has
· An anticodon (NODOC) loop that has bases complementary to the codon.
· An amino acid acceptor end to which the amino acid binds.
- For initiation, there is another tRNA called initiator tRNA.
- There are no tRNAs for stop codons.
- Secondary (2-D) structure of tRNA looks like a clover-leaf. 3-D structure looks like inverted ‘L’.
TRANSLATION (PROTEIN SYNTHESIS)
It takes place in ribosomes. It includes 4 steps:
1. Charging of tRNA (aminoacylation of tRNA)
Formation of peptide bond requires energy obtained from ATP. For this, amino acids are activated (amino acid + ATP) and linked to their cognate tRNA in the presence of aminoacyl tRNA synthetase. So the tRNA becomes charged.
2. Initiation
· It begins at the 5’-end of mRNA in the presence of an initiation factor.
· The mRNA binds to the small subunit of the ribosome. Then the large subunit binds to the small subunit to complete the initiation complex.
· The large subunit has 2 binding sites for tRNA: the aminoacyl tRNA binding site (A site) and the peptidyl site (P site).
· The initiation codon for methionine is AUG. So the methionyl tRNA complex has UAC at the anticodon site.
· At the P site, the first codon of mRNA binds with the anticodon of the methionyl tRNA complex.
3. Elongation
· Another aminoacyl tRNA complex with an appropriate amino acid enters the ribosome and attaches to the A site. Its anticodon binds to the second codon on the mRNA, and a peptide bond is formed between the first and second amino acids in the presence of an enzyme, peptidyl transferase.
· The bond between the first amino acid and its tRNA is broken. This tRNA is removed from the P site, and the second tRNA at the A site is pulled to the P site along with the mRNA. This is called translocation.
· Then the 3rd codon comes into the A site, and a suitable tRNA with the 3rd amino acid binds at the A site. This process is repeated.
· A group of ribosomes associated with a single mRNA for translation is called a polyribosome (polysome).
4. Termination
· When the ribosome reaches a termination codon (UAA, UAG or UGA), a release factor binds to it and translation terminates. The polypeptide and tRNA are released from the ribosome.
· The ribosome dissociates into large and small subunits at the end of protein synthesis.
An mRNA has additional sequences that are not translated (untranslated regions or UTR). UTRs are present at both 5’-end (before start codon) and 3’-end (after stop codon). They are required for efficient translation process.
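The codon and anticodon pairing used during initiation (AUG on the mRNA, UAC on the methionyl tRNA) can be written out as a small helper. The names are hypothetical. Pairing is antiparallel, so UAC is the anticodon as aligned under the codon (read 3’→5’); the same anticodon written 5’→3’ is CAU. Wobble pairing at the third position is ignored in this sketch.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_aligned(codon):
    """Anticodon written 3'->5', aligned base by base under a 5'->3' codon."""
    return "".join(RNA_PAIR[base] for base in codon)


def anticodon_5_to_3(codon):
    """The same anticodon written in the conventional 5'->3' direction."""
    return anticodon_aligned(codon)[::-1]


print(anticodon_aligned("AUG"))  # UAC, as given for the initiator tRNA
print(anticodon_5_to_3("AUG"))   # CAU
```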
REGULATION OF GENE EXPRESSION
Gene expression results in the formation of a polypeptide. In eukaryotes, the regulation includes the following levels:
1. Transcriptional level (formation of primary transcript)
2. Processing level (regulation of splicing)
3. Transport of mRNA from nucleus to the cytoplasm
4. Translational level.
The metabolic, physiological and environmental conditions regulate expression of genes. E.g.
ú In E. coli the enzyme, beta-galactosidase hydrolyses lactose into galactose and glucose. If the bacteria do not have lactose the synthesis of beta-galactosidase stops.
ú The development and differentiation of an embryo into an adult are a result of the expression of several sets of genes.
§ “Each metabolic reaction is controlled by a set of genes”
§ All the genes regulating a metabolic reaction constitute an Operon. E.g. lac operon, trp operon, ara operon, his operon, val operon etc.
§ When a substrate is added to growth medium of bacteria, a set of genes is switched on to metabolize it. This is called induction.
§ When a metabolite (product) is added, the genes to produce it are turned off. This is called repression.
Lac operon in E. coli: The operon controlling lactose metabolism. It consists of
a) A regulatory or inhibitor (i) gene: Codes for the repressor.
b) 3 structural genes:
i. z gene: Codes for β-galactosidase (hydrolyzes lactose to galactose and glucose).
ii. y gene: Codes for permease (increase permeability of the cell to lactose).
iii. a gene: Codes for a transacetylase.
- The genes present in the operon function together in the same or related metabolic pathway. There is an operator region for each operon.
- If there is no lactose (inducer), the lac operon remains switched off. So the structural genes are not expressed. The regulator gene synthesizes mRNA to produce the repressor protein. This protein binds to the operator region and blocks RNA polymerase movement.
- If lactose is provided in the growth medium, the lactose is transported into the E. coli cells by the action of permease. Lactose (inducer) binds with the repressor protein. So the repressor protein cannot bind to the operator region. The operator becomes free, which allows RNA polymerase to bind to the promoter and start transcription. Regulation of the lac operon by the repressor is called negative regulation.
(Figures: the lac operon in the absence of inducer and in the presence of inducer.)
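The on/off behaviour of the lac operon under negative regulation can be sketched as simple boolean logic. This is a toy model that covers only the repressor and inducer described above. It does not model the effect of glucose or any positive regulation, and the function name and dictionary keys are made up for the illustration.

```python
STRUCTURAL_GENE_PRODUCTS = {
    "z": "beta-galactosidase",
    "y": "permease",
    "a": "transacetylase",
}


def lac_operon_state(lactose_present):
    """Toy model of negative regulation of the lac operon by its repressor."""
    # The i gene always makes repressor; lactose (inducer) inactivates it.
    repressor_active = not lactose_present
    operator_blocked = repressor_active
    transcription_on = not operator_blocked
    products = list(STRUCTURAL_GENE_PRODUCTS.values()) if transcription_on else []
    return {
        "operator_blocked": operator_blocked,
        "transcription_on": transcription_on,
        "products": products,
    }


print(lac_operon_state(lactose_present=False))
print(lac_operon_state(lactose_present=True))
```

Without lactose the operator stays blocked and no enzymes are made. With lactose all three structural genes are expressed together.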
HUMAN GENOME PROJECT (HGP)
· Genome: The entire DNA in the haploid set of chromosomes of an organism.
· In Human genome, DNA is packed in 23 chromosomes.
· Human Genome Project (1990-2003) is the first effort in identifying the sequence of nucleotides and mapping of all the genes in human genome.
· Human genome contains about 3 × 10^9 bp.
Goals of HGP
a. Identify all the estimated genes in human DNA
b. Determine the sequences of the 3 billion chemical base pairs that make up human DNA.
c. Store this information in databases.
d. Improve tools for data analysis.
e. Transfer related technologies to other sectors.
f. Address the ethical, legal and social issues (ELSI) that may arise from the project.
HGP was closely associated with Bioinformatics.
Bioinformatics: Application of computer science and information technology to the field of biology & medicine. It is commonly applied to the analysis of DNA sequence data.
Methodologies of HGP: 2 major approaches.
ú Expressed Sequence Tags (ESTs): Focused on identifying all the genes that are expressed as RNA.
ú Sequence annotation: Sequencing whole set of genome containing all the coding & non-coding sequence and later assigning different regions in the sequence with functions.
Isolate total DNA from a cell → Convert into random fragments → Clone in a suitable host using vectors (e.g. BAC & YAC) for amplification → Sequence the fragments using automated DNA sequencers (based on Frederick Sanger’s method) → Arrange the sequences based on overlapping regions → Align the sequences using computer programs
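The step "arrange the sequences based on overlapping regions" can be illustrated with a greedy overlap merge. This is a toy sketch and not the method actually used in the project. Real assembly programs also handle sequencing errors, repeats, both strands and fragments contained within others. The function names, the minimum overlap of 3 bases and the sample fragments are all assumptions.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; return the contigs found so far
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags.append(merged)
    return frags


print(greedy_assemble(["CGTTAC", "ATGCGT", "TACGGA"]))
```

Here the three short reads share 3-base overlaps (CGT and TAC), so they collapse into a single contig regardless of the order in which they were sequenced.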
Genetic and physical maps on the genome were generated using information on polymorphism of restriction endonuclease recognition sites and some repetitive DNA sequences (microsatellites).
Salient features of Human Genome
a. Human genome contains 3164.7 million nucleotide bases.
b. Total number of genes= about 30,000.
c. Average gene consists of 3000 bases, but sizes vary. Largest known human gene (dystrophin on X-chromosome) contains 2.4 million bases.
d. 99.9% nucleotide bases are identical in all people. 0.1% is what makes each of us unique.
e. Functions of over 50% of discovered genes are unknown.
f. Chromosome 1 has the most genes (2968) and Y has the fewest (231).
g. Less than 2% of the genome codes for proteins.
h. Repeated sequences make up very large portion of human genome. Repetitive sequences are stretches of DNA sequences that are repeated many times. They have no direct coding functions, but they shed light on chromosome structure, dynamics and evolution.
i. About 1.4 million locations have been identified where single-base DNA differences (SNPs: single nucleotide polymorphisms or ‘snips’) occur in humans.
DNA FINGERPRINTING (DNA PROFILING)
· The technique to identify the similarities of the DNA fragments of 2 individuals.
· Developed by Alec Jeffreys (1985).
Basis of DNA fingerprinting
· DNA carries some non-coding sequences called repetitive sequence [variable number tandem repeats (VNTR)].
· The number of repeats varies from person to person.
· The size of a VNTR varies from 0.1 to 20 kb.
· Repetitive DNA are separated from bulk genomic DNA as different peaks during density gradient centrifugation.
· The bulk DNA forms a major peak and the other small peaks are called satellite DNA.
· Satellite DNA is classified into many categories, (micro-satellites, mini-satellites etc) based on base composition (A:T rich or G:C rich), length of segment and number of repetitive units.
· An inheritable mutation observed in a population at high frequency is called DNA polymorphism (variation at genetic level).
· Polymorphism is higher in non-coding DNA sequences, because mutations in these sequences may not have any immediate effect on an individual’s reproductive ability.
· These mutations accumulate generation after generation and cause polymorphism. For evolution & speciation, polymorphisms play important role.
Steps of DNA fingerprinting
(Southern Blotting Technique)
a. Isolate DNA (from any cells like blood stains, semen stains or hair roots).
b. Make copies (amplification) of DNA by polymerase chain reaction (PCR).
c. Digest DNA by restriction endonucleases.
d. Separate DNA fragments by gel electrophoresis.
e. Treat with alkali solution (NaOH) to denature the DNA in the gel into single strands.
f. Transfer (blotting) the single stranded DNA fragments to a synthetic membrane such as nitrocellulose or nylon, and then bake in a vacuum oven at 80 °C for 3-5 hours (to fix the DNA fragments on the membrane).
g. The membrane is placed in a solution containing a radioactively labeled single stranded DNA probe. The DNA probe binds with the complementary sequences of the DNA fragments on the membrane to form hybridized DNA.
h. The filter paper is washed to remove unbound probe.
i. The hybridized DNA is photographed on to an X-ray film by autoradiography. The image (in the form of dark & light bands) obtained is called DNA fingerprint.
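Why different individuals give different band patterns can be sketched with a toy digest. The code cuts a DNA string at every EcoRI site (GAATTC, cut after the G) and lists the fragment lengths from largest to smallest, which is the order in which a gel would separate them. The two sample sequences are invented. They differ only in the number of copies of a made-up 4-base repeat (CAGT) between two restriction sites, standing in for a VNTR. The function names are hypothetical.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site; cut_offset=1 cuts after the G."""
    fragments = []
    start = 0
    position = dna.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        position = dna.find(site, position + 1)
    fragments.append(dna[start:])
    return fragments


def band_pattern(dna):
    """Fragment lengths, largest first, as a stand-in for bands on a gel."""
    return sorted((len(fragment) for fragment in digest(dna)), reverse=True)


# Two hypothetical individuals: 3 and 5 copies of the repeat between two sites
person_a = "AA" + "GAATTC" + "CAGT" * 3 + "GAATTC" + "TT"
person_b = "AA" + "GAATTC" + "CAGT" * 5 + "GAATTC" + "TT"
print(band_pattern(person_a))
print(band_pattern(person_b))
```

The flanking fragments are the same size in both, but the fragment carrying the repeats is longer in the second individual, so its band appears at a different position.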
Application of DNA fingerprinting
- Forensic tool to solve paternity disputes and cases of rape, murder etc.
- For the diagnosis of genetic diseases.
- To determine phylogenetic status of animals.