First thing: structure-wise, this one doesn't have too many silly nomenclature tricks up its sleeve. For once, the name says what it is: a TA-rich region on a DNA sequence. Functionally, it positions the RNA polymerase on the DNA sequence for transcription, acting similarly to an E. coli promoter. The TATA box is located around positions -35 to -25, that is, roughly 25 to 35 base pairs upstream of the start site. So if some clever multiple choice question asks "Is the start site at the TATA box?" the answer is "No, it is downstream." "Do all genes have a TATA box?" Again, "No, only those that have high rates of transcription in the cell." There are plenty of other wonderful ways genes can be transcribed without a TATA box, and they are discussed shortly, albeit not in much detail.
How TATA is your TATA Box?
The TATA box is described by a consensus sequence, and it is highly conserved among various genes in various organisms. However, each of the bases (A, T, and also G and C) has a certain frequency of appearing at each position of the ideal "TATATATA" sequence of the TATA box.
The third base is a T 100% of the time, but the other positions are not as clear cut: the first base is a T 83% of the time, the second is an A 91% of the time, and the others have 100%, 95%, 33%, 97%, 36% and 41% for the ideal base (T or A, respectively). There is a 40% probability of having a G in the last position.
This looks like memorization hell, and quite frankly it is, but the take-home message (which never shows up in multiple choice for this course, for the record; oops, that might have been sass) is the following: the TATA box is highly conserved, but different bases may be present in its sequence. I take back what I first said about the nomenclature: a better name would be the "most likely but not quite always TATA box."
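If it helps to see where per-position percentages like these come from, here is a minimal Python sketch. The aligned sequences are made up for illustration (they are not real promoter data and won't reproduce the numbers above), and the function names are my own. The idea is just: line up a bunch of TATA-like sequences, count which base appears in each column, and take the most common one as the consensus.

```python
from collections import Counter

# Hypothetical aligned TATA-box-like sequences (invented for illustration,
# NOT real promoter data).
aligned = [
    "TATAAAAG",
    "TATATAAG",
    "TATAAATA",
    "CATAAAAG",
    "TATATATA",
]

def position_frequencies(seqs):
    """Return one dict per column mapping base -> fraction of sequences."""
    length = len(seqs[0])
    assert all(len(s) == length for s in seqs), "sequences must be aligned"
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({base: counts[base] / len(seqs) for base in "ACGT"})
    return freqs

def consensus(freqs):
    """Pick the most frequent base at each position."""
    return "".join(max(col, key=col.get) for col in freqs)

freqs = position_frequencies(aligned)
for pos, col in enumerate(freqs, start=1):
    # Only show bases that actually occur at this position.
    print(pos, {b: round(f, 2) for b, f in col.items() if f > 0})
print("consensus:", consensus(freqs))
```

Even in this toy set, some columns are 100% one base while others are split, which is the whole "most likely but not quite always" point.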
Alternatives to the TATA Box
Initiator Element
Not much is known or considered important to relay at the undergraduate level about initiator elements except that a C is found in the -1 position and an A is found in the +1 position. I am really not sure how they did experimental testing to figure out these details, except it really must occur with a significant (or relatively significant) degree of frequency in the sequence, so if anyone knows and the explanation isn't too complicated that would be cool.
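The -1/+1 numbering trips people up, so here is a small Python sketch of it. It assumes the usual convention that +1 is the first transcribed base and that there is no position 0; the sequence, the `tss_index` parameter, and the function names are hypothetical, chosen just for this example.

```python
def base_at(seq, tss_index, position):
    """Return the base at a position relative to the start site.

    tss_index is the 0-based index of the +1 base in seq.
    Assumes the convention that there is no position 0.
    """
    if position == 0:
        raise ValueError("there is no position 0 in this numbering")
    offset = position - 1 if position > 0 else position
    index = tss_index + offset
    if not 0 <= index < len(seq):
        raise IndexError("position falls outside the sequence")
    return seq[index]

def looks_like_initiator(seq, tss_index):
    """Check only the two facts from the notes: C at -1 and A at +1."""
    return base_at(seq, tss_index, -1) == "C" and base_at(seq, tss_index, +1) == "A"

# Hypothetical sequence; index 4 (the 'A') is treated as the +1 base.
example = "GGTCAGTT"
print(looks_like_initiator(example, 4))
print(looks_like_initiator(example, 5))
```

This only tests the C/A pair mentioned above, so treat it as a way to practise the coordinates rather than a real initiator finder.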
CpG Islands
These are CG rich areas of 20-50 base pairs within 100 base pairs of the start site region of a gene. These genes often have multiple start sites for transcription in a 20-200 bp region, and have neither a TATA box, nor initiator elements.
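To get a feel for what "CG rich" means in practice, here is a rough Python sketch that compares two made-up sequences. Both sequences and the function names are hypothetical, and this is not a real CpG island caller; it just counts G/C content and CG dinucleotides.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def cpg_count(seq):
    """Number of CG dinucleotides (a C immediately followed by a G)."""
    return seq.upper().count("CG")

# Hypothetical 20 bp stretches, invented for illustration.
cg_rich = "CGCGGCGCCGATCGCGGGCG"
ta_rich = "ATTATAAGCTTAATTAGCAT"

for name, seq in [("CG-rich", cg_rich), ("TA-rich", ta_rich)]:
    print(name, round(gc_fraction(seq), 2), cpg_count(seq))
```

A real analysis would slide a window along the region near the start site, but the counting idea is the same.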
***********************
Promoter Proximal Elements
These are sequences within 100-200 base pairs of the start site that aren't the TATA box or any of the sequences mentioned above. They can be cell-type specific (not universally conserved).
Enhancers
Enhancers can be quite far away from the gene they enhance - even greater than 50 kilobases away! Their location may be upstream from the promoter, downstream from the promoter, within an intron, or downstream of the final exon of the gene. As one of my favourite Beatles songs likes to say, they can quite literally be "here, there, and everywhere." The direction of the enhancer doesn't matter! Also, like promoter proximal elements, they are often cell-type specific.
Difference between Promoter Proximal Elements and Enhancers?
Recall that it is with a human categorical bias that we organize these components of the cell. So, the distinction between promoter proximal elements and enhancers is not clear cut. For the purposes of the course I am taking, if something is within 100-200 base pairs of the start site and helps initiate transcription, we'd probably call it a promoter proximal element - but who knows, it could just as easily be an enhancer, or perhaps it is both!
**The next post will continue with an explanation about finding Promoter Proximal Elements with linker scanning mutations, and deletion analysis, and then there will also be another post about Enhancers and their effects on transcription.