Background
Mutations in cancers
- somatic mutations are a major cause of tumorigenesis
- two sources of recurrent mutations:
- Background mutation rate
- mutagens, DNA context, methylation etc
- no selection pressure
- Functional selection pressure
A background mutation model is needed
Plan
- Modelling Background mutation rate
- Expected: model BMR -> significance of each mutation
- Analysis of functionally selected mutations
- Missense / Nonsense comparison
- Functional analysis
Modelling Background mutation rate (BMR)
Model 1
- developed by Chang et al. (2016), published in Nature Biotechnology
- trinucleotide mutation rate -> weighted average per codon
- not cancer-specific
- data processing:
- using all single nucleotide variants (SNVs)
- removed hypermutated sites: positions bearing >=99% mutations in a gene
- binomial model
\[
\mathrm{Prob}(X=k)=\binom{n}{k}P_{c,g}^{k}(1-P_{c,g})^{n-k}
\]
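As a minimal sketch of how the binomial model can yield a significance value, the probability of observing at least k mutations at a codon among n samples can be computed with base R's `dbinom` and `pbinom`. The values of `n`, `k`, and `p_cg` are made up for illustration, and taking the upper tail P(X >= k) as the p-value is an assumption here, not something stated in this report.

```r
# Hypothetical inputs, chosen only for illustration
n    <- 238    # number of samples (trials)
k    <- 5      # observed mutations at the codon
p_cg <- 1e-3   # expected per-sample mutation probability of the codon

# Point probability P(X = k) from the binomial formula
prob_k <- dbinom(k, size = n, prob = p_cg)

# Upper-tail probability P(X >= k) = 1 - P(X <= k - 1) (assumed p-value definition)
p_value <- pbinom(k - 1, size = n, prob = p_cg, lower.tail = FALSE)
```

Using `lower.tail = FALSE` avoids the loss of precision that `1 - pbinom(...)` suffers for very small tail probabilities.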
Mutation rate of the center nucleotide in a trinucleotide (t):
\[
P_t=\frac{C_t}{F_t}, \qquad C_t:\ \text{number of mutations};\quad F_t:\ \text{number of trinucleotides}
\]
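The ratio above can be sketched in R with named vectors, one entry per trinucleotide context. The three contexts and their counts are hypothetical and only show the arithmetic; a real run would cover all contexts present in the data.

```r
# Hypothetical counts for three trinucleotide contexts (illustration only)
mut_count <- c(ACA = 120, CCG = 300, TTT = 45)          # C_t: number of mutations
tri_freq  <- c(ACA = 40000, CCG = 15000, TTT = 60000)   # F_t: number of trinucleotides

# P_t = C_t / F_t, matched by context name rather than by position
p_t <- mut_count / tri_freq[names(mut_count)]
```

Indexing `tri_freq` by name keeps the two vectors aligned even if they were built in a different order.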
codon mutation rate \(P_{c,g}\) of codon c and gene g:
\[
P_{c,g}=\frac{\frac{C_g}{N_{sample}}}{\sum_{t \in g}{N_{t,g}P_t}} \sum_{t \in c}{\frac{n_{t,c}}{n_c}P_t}
\]
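\(C_g\): number of mutations in gene g (by analogy with \(C_t\));
\(N_{t,g}\): number of trinucleotides t in gene g;
\(N_{sample}\): sample size;
\(n_{t,c}\): number of mutations at site t of codon c;
\(n_c\): number of mutations in codon c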
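A small R sketch of the codon mutation rate formula, following the definitions above. The function name `codon_mut_rate`, its argument names, and the example numbers are all hypothetical, and defaulting \(n_c\) to the sum of the \(n_{t,c}\) values is an assumption rather than something the report states.

```r
# Sketch of P_{c,g}; function and argument names are hypothetical.
# c_g: mutations in gene g; n_sample: sample size
# n_tg: named vector N_{t,g}; n_tc: named vector n_{t,c}; p_t: named vector P_t
# n_c defaults to sum(n_tc), which is an assumption
codon_mut_rate <- function(c_g, n_sample, n_tg, n_tc, p_t, n_c = sum(n_tc)) {
  # Gene-level factor: (C_g / N_sample) / sum_{t in g} N_{t,g} * P_t
  gene_scale <- (c_g / n_sample) / sum(n_tg * p_t[names(n_tg)])
  # Codon-level weighted average: sum_{t in c} (n_{t,c} / n_c) * P_t
  codon_avg <- sum(n_tc / n_c * p_t[names(n_tc)])
  gene_scale * codon_avg
}

# Made-up inputs, for illustration only
p_t  <- c(ACA = 0.003, CCG = 0.02, TTT = 0.00075)
n_tg <- c(ACA = 10, CCG = 5, TTT = 20)
n_tc <- c(ACA = 1, CCG = 2, TTT = 1)
p_cg <- codon_mut_rate(c_g = 29, n_sample = 100, n_tg = n_tg, n_tc = n_tc, p_t = p_t)
```

The resulting `p_cg` is the per-sample probability that would enter the binomial model as \(P_{c,g}\).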
Preliminary results
- data: UCSC (238 samples, 68,197 SNVs)
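library(data.table); library(ggplot2)
testrun2 <- fread("~/GIT/hotspots/test/testrun2_sig_hotspots.txt")
# Mutation recurrence (log10 count) against significance, colored by TP
ggplot(data = testrun2) + geom_point(aes(x = log10(Mutation_Count), y = -log10_pvalue, color = TP), alpha = 0.5) + xlab("log10(mutation recurrence)")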
testrun2 <- fread("~/GIT/hotspots/test/testrun2_sig_hotspots.txt")
names(testrun2)[14] <- "tri"  # rename column 14 so it can be used as the x aesthetic
ggplot(data = testrun2) + geom_col(aes(x = tri, y = Mutability)) + theme(axis.text = element_text(angle = 90))
Model 2 (to do)
- From MutSigCV (Lawrence et al. 2013)
- designed for genes
- might require some tuning
- use silent mutations from target gene and its neighbors
Next
- Modelling BMR
- finish model 1 in complete dataset
- Combine features from experiment
- Reports to finish.
References
Chang, Matthew T., Saurabh Asthana, Sizhi Paul Gao, Byron H. Lee, Jocelyn S. Chapman, Cyriac Kandoth, JianJiong Gao, et al. 2016. “Identifying Recurrent Mutations in Cancer Reveals Widespread Lineage Diversity and Mutational Specificity.” Nature Biotechnology 34 (2): 155–63. doi:10.1038/nbt.3391.
Lawrence, Michael S., Petar Stojanov, Paz Polak, Gregory V. Kryukov, Kristian Cibulskis, Andrey Sivachenko, Scott L. Carter, et al. 2013. “Mutational Heterogeneity in Cancer and the Search for New Cancer-Associated Genes.” Nature 499 (7457): 214–18. doi:10.1038/nature12213.