This week's exercise is about creating Dayhoff matrices with the "count method". In this method, substitutions in alignments are counted, and a mutation matrix is computed from the resulting count matrix. This is described in detail in this bio-recipe; open it now, read it, and play with the code to understand what is explained there.
The bio-recipe describes how Dayhoff matrices can be computed from a set of precomputed alignments. We want to do the same, but from simulated alignments. The goal is a) to verify that we can reconstruct the correct substitution probability matrix and b) to infer the correct PAM distance from the alignments.
Your task is now to generate 333 sequence pairs (as s1 and s2 below) and apply the bio-recipe to them. Since no gaps are introduced, the sequences do not have to be aligned, so the DynProgStrings commands are not needed.
Can you recover the original substitution matrix? What about the PAM distance of the alignments?
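To make the counting step concrete, here is a minimal sketch in Python rather than Darwin. It is not the bio-recipe's code: the function names, the two-letter toy alphabet and the toy pair are hypothetical, and the convention that each column of the mutation matrix sums to 1 is an assumption that you should check against the bio-recipe.

```python
def count_matrix(pairs, alphabet):
    """Count aligned residue pairs from ungapped sequence pairs.

    Each observed pair (a, b) adds 0.5 to cell [a][b] and 0.5 to cell
    [b][a], so the resulting count matrix is symmetric.
    """
    index = {a: i for i, a in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0.0] * n for _ in range(n)]
    for s1, s2 in pairs:
        if len(s1) != len(s2):
            raise ValueError("ungapped pairs must have equal length")
        for a, b in zip(s1, s2):
            i, j = index[a], index[b]
            counts[i][j] += 0.5
            counts[j][i] += 0.5
    return counts


def mutation_matrix(counts):
    """Normalise every column of the count matrix to sum to 1.

    Assumed convention: entry [i][j] estimates the probability that
    residue j is replaced by residue i.
    """
    n = len(counts)
    col_sums = [sum(counts[i][j] for i in range(n)) for j in range(n)]
    return [[counts[i][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: one ungapped pair over a two-letter alphabet.
toy_pairs = [("AAGG", "AGGG")]
toy_counts = count_matrix(toy_pairs, "AG")
toy_m = mutation_matrix(toy_counts)
```

In the exercise itself, the pairs would be the 333 simulated (s1, s2) pairs and the alphabet the 20 amino acids.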
You can closely follow the bio-recipe, but instead of the precomputed alignments SampleAl, use the Mutate() function of Darwin to create alignments:
s1 := Rand(Protein(500)); s2 := Mutate(s1,40);
The first command produces a random protein of 500 amino acids; the second mutates it randomly over a distance of 40 PAM.
The 40 PAM substitution probability matrix can be obtained as follows:
M40 := exp(40*logPAM1):
(logPAM1 is the logarithm of the 1 PAM matrix that is used for the computation of DMS).
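For an integer distance d, exp(d*logPAM1) is the same as multiplying the 1 PAM matrix with itself d times. The Python sketch below illustrates this with a hypothetical two-state matrix in which 1% of the positions change per unit; the matrix and the helper names are illustrative assumptions, not Darwin's data or functions.

```python
def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, power):
    """Return m raised to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while power > 0:
        if power & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        power >>= 1
    return result


# Hypothetical two-state "1 PAM" matrix: 1% of positions change per unit.
toy_pam1 = [[0.99, 0.01],
            [0.01, 0.99]]

# Toy analogue of M40: the one-unit matrix applied 40 times.
toy_m40 = mat_pow(toy_pam1, 40)
```

The matrix estimated from your simulated alignments can be compared entry by entry against M40 in the same way that toy_m40 could be compared against an estimate here.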
Repeat the whole exercise with sequences simulated at the much larger distance of 500 PAM. What do you expect? How does this affect the estimated substitution matrix and distance?
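One way to build intuition before running the simulation is to look at how the expected fraction of unchanged positions depends on the distance. The Python sketch below uses a hypothetical two-state matrix (not a real PAM matrix) whose columns sum to 1; the matrix, its equilibrium frequencies and the function names are assumptions made for illustration only.

```python
def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(m, d):
    """Multiply m with itself so that the result is m to the power d (d >= 1)."""
    result = m
    for _ in range(d - 1):
        result = mat_mul(result, m)
    return result


# Hypothetical one-unit matrix: column j holds the probabilities of
# ending in each state when starting from state j.
toy_pam1 = [[0.99, 0.02],
            [0.01, 0.98]]
# Equilibrium frequencies of this toy matrix (toy_pam1 leaves them unchanged).
toy_freq = [2 / 3, 1 / 3]


def expected_identity(d):
    """Expected fraction of positions that are identical after d toy units."""
    m = power(toy_pam1, d)
    return sum(f * m[i][i] for i, f in enumerate(toy_freq))


# Compare neighbouring distances at a short and at a long separation.
for d in (40, 41, 500, 501):
    print(d, expected_identity(d))
```

In this toy model, one extra unit changes the expected identity noticeably around distance 40 but hardly at all around distance 500, because the columns of the matrix have converged to the equilibrium frequencies. Whether the 20-state PAM matrices behave the same way at 500 PAM is what the exercise asks you to find out.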