Genomics

Description

The process by which one strand of DNA is copied into a complementary sequence of RNA.

Discussion

Transcription is the process through which a DNA sequence is enzymatically copied by an RNA polymerase to produce a complementary RNA; in other words, it is the transfer of genetic information from DNA into RNA. In the case of protein-encoding DNA, transcription is the beginning of the process that ultimately leads to the translation of the genetic code (via the mRNA intermediate) into a functional peptide or protein. Transcription has some proofreading mechanisms, but they are fewer and less effective than those for DNA replication; therefore, transcription has a lower copying fidelity than DNA replication.

Like DNA replication, transcription proceeds in the 5' → 3' direction (i.e. the template strand is read in the 3' → 5' direction and the new, complementary RNA is generated in the 5' → 3' direction). Transcription is divided into three stages: initiation, elongation and termination.
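The directionality described above can be illustrated with a minimal Python sketch. The function name `transcribe` and the example template are hypothetical and chosen only for illustration; the sketch ignores promoters, terminators and any non-ACGT symbols.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', copied from a template written 3'->5'.

    Reading the template left to right (3'->5') and appending each
    complementary ribonucleotide builds the RNA in the 5'->3' direction.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
print(transcribe(template))  # AUGCCGUAA
```

Because the template is supplied already written 3' → 5', no reversal is needed; a template written 5' → 3' would have to be reversed first.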

Prokaryotic transcription

  • Occurs in the cytoplasm alongside translation.

Initiation

The following steps occur, in order, for transcription initiation:

  • RNA polymerase (RNAP) recognizes and specifically binds to the promoter region on DNA. At this stage, the DNA is double-stranded ("closed"). This RNAP/wound-DNA structure is referred to as the closed complex.
  • The DNA is unwound and becomes single-stranded ("open") in the vicinity of the initiation site (defined as +1). This RNAP/unwound-DNA structure is called the open complex.
  • The RNA polymerase transcribes the DNA, but produces about 10 abortive (short, non-productive) transcripts which are unable to leave the RNA polymerase because the exit channel is blocked by the σ-factor.
  • The σ-factor eventually dissociates from the holoenzyme, and elongation proceeds.

Promoters can differ in "strength"; that is, how actively they promote transcription of their adjacent DNA sequence. Promoter strength is, in many (but not all) cases, a matter of how tightly RNA polymerase and its associated accessory proteins bind to their respective DNA sequences. The more similar the sequences are to a consensus sequence, the stronger the binding is.
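As a rough illustration of similarity to a consensus, the sketch below counts matching positions in the two promoter hexamers. The consensus strings used here (TTGACA for the -35 region and TATAAT for the -10 region, as commonly quoted for the E. coli σ70 promoter) are an assumption not stated in this article, and the function names are hypothetical. The score is only a crude proxy: as noted above, consensus similarity does not explain promoter strength in all cases.

```python
# Assumed consensus hexamers (commonly quoted for E. coli sigma-70 promoters).
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def matches(site: str, consensus: str) -> int:
    """Count positions at which the site agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))


def consensus_score(box_35: str, box_10: str) -> float:
    """Fraction of the 12 hexamer positions that match the consensus."""
    total = matches(box_35, CONSENSUS_35) + matches(box_10, CONSENSUS_10)
    return total / (len(CONSENSUS_35) + len(CONSENSUS_10))


print(consensus_score("TTGACA", "TATAAT"))  # 1.0  (perfect consensus)
print(consensus_score("TTTACA", "TATGTT"))  # 0.75 (hypothetical weaker promoter)
```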

Most transcripts originate using adenosine-5'-triphosphate (ATP) and, to a lesser extent, guanosine-5'-triphosphate (GTP) (purine nucleoside triphosphates) at the +1 site. Uridine-5'-triphosphate (UTP) and cytidine-5'-triphosphate (CTP) (pyrimidine nucleoside triphosphates) are disfavoured at the initiation site.

Termination

Two termination mechanisms are well known:

  • Intrinsic termination (also called Rho-independent termination) involves terminator sequences within the RNA that signal the RNA polymerase to stop. The terminator sequence is usually a palindromic sequence that forms a stem-loop hairpin structure, which leads to the dissociation of the RNAP from the DNA template. One such common termination motif is the sequence 'GCCGCCAG'. The RNA polymerase fails to proceed beyond this point and, consequently, the nascent DNA-RNA hybrid dissociates. The RNA polymerase is then free to look for a new initiation region from which to start the initiation process again.
  • Rho-dependent termination uses a termination factor called ρ factor to stop RNA synthesis at specific sites. This protein binds and runs along the mRNA towards the RNAP. When ρ-factor reaches the RNAP, it causes RNAP to dissociate from the DNA, terminating transcription.

Other described termination signals include a region of repeated thymidine residues in the DNA template, or a GC-rich inverted repeat followed by four A residues. The inverted repeat forms a stable stem-loop structure in the RNA, which causes the RNA to dissociate from the DNA template.
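The stem-loop signal can be sketched as a simple sequence search. The following Python example scans an RNA for a GC-rich stem, a short loop, the reverse complement of the stem, and then a run of U residues (a run of A residues in the template appears as U in the RNA). All thresholds (stem length 5, loop of 3 to 8 bases, 4 U residues, at least 60% GC) and the function names are illustrative assumptions, and the search requires perfect base pairing, which real terminators need not have.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))


def find_terminators(rna, stem=5, loop_range=(3, 8), u_run=4, min_gc=0.6):
    """Return (start, end) pairs, end exclusive, of hairpin + U-run candidates."""
    rna = rna.upper()
    hits = []
    last_start = len(rna) - 2 * stem - loop_range[0] - u_run
    for start in range(last_start + 1):
        left = rna[start:start + stem]
        gc = (left.count("G") + left.count("C")) / stem
        if gc < min_gc:
            continue  # stem is not GC-rich enough
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + u_run]
            if len(tail) < u_run:
                break  # ran off the end; longer loops cannot fit either
            if right == reverse_complement(left) and tail == "U" * u_run:
                hits.append((start, right_start + stem + u_run))
    return hits


# Hypothetical RNA: stem GCCGC, loop UUAA, stem GCGGC, then UUUU.
rna = "AUAA" + "GCCGC" + "UUAA" + "GCGGC" + "UUUU" + "AC"
print(find_terminators(rna))  # [(4, 22)]
```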

In a schematic of a prokaryotic transcription unit, the -35 region and the -10 ("Pribnow box") region comprise the basic prokaryotic promoter, and the terminator (often drawn as |T|) marks the end of the unit. The DNA on the template strand between the +1 site and the terminator is transcribed into RNA, which is then translated into protein.

Eukaryotic transcription

Eukaryotes have evolved much more complex transcriptional regulatory mechanisms than prokaryotes. For instance, in eukaryotes the genetic material (DNA), and therefore transcription, is primarily localized to the nucleus, where it is separated from the cytoplasm (where translation occurs) by the nuclear membrane. This separation allows for the temporal regulation of gene expression through the sequestration of the RNA in the nucleus, and allows for selective transport of RNAs to the cytoplasm, where the ribosomes reside. DNA is also present in mitochondria in the cytoplasm, and mitochondria use a specialized RNA polymerase for transcription.

Adding to this complexity, eukaryotes have three nuclear RNA polymerases, each with distinct roles and properties:

  • RNA Polymerase I is located in the nucleolus and transcribes ribosomal RNA (rRNA).
  • RNA Polymerase II is localized to the nucleus, and transcribes messenger RNA (mRNA) and most small nuclear RNAs (snRNAs).
  • RNA Polymerase III is localized to the nucleus (and possibly the nucleolar-nucleoplasm interface), and transcribes transfer RNA (tRNA) and other small RNAs (including the small 5S rRNA).

These three RNA polymerases are commonly referred to as Pol I, Pol II and Pol III (and less often Pol A, Pol B, and Pol C, respectively).

Further complexity is added by the multitude of transcription factors and signaling pathways that may interact in combination to mediate cell-type and developmental transcriptional regulation.

The basal eukaryotic transcription complex includes the RNA polymerase and additional proteins that are necessary for correct initiation and elongation.

Primary (initial) mRNA transcripts in eukaryotic cells are synthesized as larger precursor RNAs that are processed by splicing out introns (non-coding sequences) and ligating exons (non-contiguous coding sequences) into the mature mRNA. Primary transcripts for some genes can be large. The primary transcripts of the neurexin genes, for instance, are as large as 1.7 megabases (1,700,000 bases), while the mature (processed) neurexin mRNAs are under 10 kilobases (10,000 bases), with as many as 24 exons and thousands of possible alternative splice variants that produce proteins with different activities.
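Splicing and alternative splicing can be sketched as joining selected intervals of the primary transcript. In the Python example below, the function name `splice`, the coordinate convention (0-based, end-exclusive) and the toy sequences are all assumptions made for illustration; real splicing is directed by splice-site signals rather than supplied coordinates.

```python
from typing import List, Tuple


def splice(pre_mrna: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) in order, dropping introns."""
    ordered = sorted(exons)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("exon intervals overlap")
    return "".join(pre_mrna[start:end] for start, end in ordered)


# Hypothetical primary transcript: exon, intron, exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUCAG" + "GAAUUC" + "GUGAGUCAG" + "UGGUAA"

# Constitutive splicing keeps all three exons.
print(splice(pre_mrna, [(0, 6), (15, 21), (30, 36)]))  # AUGGCCGAAUUCUGGUAA

# An alternative variant that skips the middle exon.
print(splice(pre_mrna, [(0, 6), (30, 36)]))  # AUGGCCUGGUAA
```

Choosing different subsets of exons from the same primary transcript is one way a single gene can yield many splice variants.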

Gene expression in eukaryotes is also controlled by complex interactions between cis-acting elements within the regulatory regions of the DNA, and trans-acting factors that include transcription factors and the basal transcription complex.

http://www.dadamo.com/wiki/transcript.jpg

Initiation

The core promoter of protein-encoding genes contains binding sites for the basal transcription complex and RNA polymerase II, and is normally within about 50 bases upstream of the transcription initiation site. Further transcriptional regulation is provided by upstream control elements (UCEs), usually present within about 200 bases upstream of the initiation site. The core promoter for Pol II often contains a TATA box, the highly conserved DNA recognition sequence for the TATA box binding protein, TBP, whose binding initiates transcription complex assembly at the promoter.
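A search for TATA-like motifs in the region upstream of the initiation site might look like the following Python sketch. The pattern TATA[AT]A[AT] is an assumed, simplified form of the TATA box; real sites vary, and many Pol II promoters have no TATA box at all. The function name, the default 50-base window and the example sequence are likewise hypothetical.

```python
import re

# Assumed simplified TATA-box pattern: TATA, then A or T, then A, then A or T.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(sequence: str, tss_index: int, window: int = 50):
    """Return motif start positions relative to the initiation site.

    tss_index is the 0-based index of the +1 base, so the base immediately
    upstream of it is reported as position -1.
    """
    start = max(0, tss_index - window)
    upstream = sequence[start:tss_index].upper()
    return [start + m.start() - tss_index
            for m in TATA_PATTERN.finditer(upstream)]


# Hypothetical sequence with a TATA-like motif starting 30 bases upstream.
sequence = "G" * 20 + "TATAAAA" + "G" * 23 + "ACGT"
print(find_tata_boxes(sequence, tss_index=50))  # [-30]
```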

Some genes also have enhancer elements that can be thousands of bases upstream or downstream of the transcription initiation site. Combinations of these upstream control elements and enhancers regulate and amplify the formation of the basal transcription complex.

Transcription process

For the pathway and process of construction of the transcription complex, please see the individual polymerases: RNA polymerase I, RNA polymerase II and RNA polymerase III.

Measuring and detecting transcription

Transcription can be measured and detected in a variety of ways:

  • Northern blot
  • RNase protection assay
  • RT-PCR
  • In vitro transcription
  • In situ hybridization

History

RNA synthesis by RNA polymerase had been established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.

By the late 1960s several papers that came out of the Harvard University Biological Laboratories established the basic mechanics of gene expression in bacteria.

Terminology

  • Activator: a DNA-binding protein that regulates one or more genes by increasing the rate of transcription.
  • Repressor: a DNA-binding protein that regulates one or more genes by decreasing the rate of transcription.

Reverse transcription

Some viruses have the ability to transcribe RNA into DNA in order to insert their genetic material into a cell's genome. The main enzyme responsible for this type of transcription is called reverse transcriptase (also known as revertase). Because the enzyme remains active inside the cell, the viral genome is often replicated along with the cell's genome.
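At the sequence level, reverse transcription can be sketched as building the DNA strand complementary to an RNA template. The Python example below is a minimal illustration only; the function name and input are hypothetical, and it models just the first DNA strand, not second-strand synthesis or integration.

```python
# Pairing from an RNA template base to the DNA base it specifies.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5_to_3: str) -> str:
    """Return the complementary DNA strand, written 5'->3'.

    The DNA strand is antiparallel to the RNA template, so the RNA is
    traversed from its 3' end while the DNA is written 5'->3'.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3.upper()))


print(reverse_transcribe("AUGCCGUAA"))  # TTACGGCAT
```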
