Background

Mutations in cancers

  • somatic mutations are a major cause of tumorigenesis
    • but mutations are noisy
  • two sources of recurrent mutations:
    • Background mutation rate
      • mutagens, DNA context, methylation, etc.
      • no selection pressure
    • Functional selection pressure
      • selected!

A background mutation model is needed

Plan

  1. Modelling Background mutation rate
    • Expected: model BMR -> significance of each mutation
  2. Analysis of functionally selected mutations
    • Missense / Nonsense comparison
    • Functional analysis

Modelling Background mutation rate (BMR)

Model 1

  • developed by Chang et al. (2016), published in Nature Biotechnology
    • trinucleotide mutation rate -> weighted average per codon
    • not cancer-specific
    • data processing:
      • uses all single-nucleotide variants (SNVs)
      • removed hypermutated sites: positions bearing >=99% mutations in a gene
  • binomial model

\[ \mathrm{Prob}(X=k)=\binom{n}{k}P_{c,g}^{k}(1-P_{c,g})^{n-k} \]
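
As a minimal sketch of how the binomial model turns a codon rate into a significance value, the R snippet below evaluates the point probability with `dbinom` and an upper-tail probability with `pbinom`. The values of `n`, `k`, and `p_cg` are made up for illustration, and using the upper tail P(X >= k) as the p-value is an assumption; the report itself only states the point probability.

```r
# Hypothetical inputs, not taken from the report
n    <- 238    # number of trials (here assumed to be the number of samples)
k    <- 5      # observed number of mutations at the codon
p_cg <- 1e-3   # assumed background mutation rate P_{c,g} for the codon

# Point probability P(X = k) under the binomial background model
prob_exact <- dbinom(k, size = n, prob = p_cg)

# Upper-tail probability P(X >= k), used here as the p-value (assumption)
p_value <- pbinom(k - 1, size = n, prob = p_cg, lower.tail = FALSE)
```

A small `p_value` means the codon is mutated more often than the background rate alone would explain.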


Mutation rate of the center nucleotide in a trinucleotide t:
\[ P_t=\frac{C_t}{F_t} \]
where \(C_t\) is the number of mutations at the center nucleotide of t and \(F_t\) is the number of occurrences of trinucleotide t.
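
The trinucleotide rate is a simple ratio of counts. The sketch below computes it for three contexts; the counts and the data frame layout (`tri`, `C_t`, `F_t`) are hypothetical and only illustrate the arithmetic.

```r
# Toy counts for three trinucleotide contexts (hypothetical values)
tri_counts <- data.frame(
  tri = c("ACA", "TCG", "CCG"),
  C_t = c(120, 45, 300),        # mutations observed at the center nucleotide
  F_t = c(50000, 12000, 9000)   # occurrences of the trinucleotide
)

# P_t = C_t / F_t for each context
tri_counts$P_t <- tri_counts$C_t / tri_counts$F_t
```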

codon mutation rate \(P_{c,g}\) of codon c and gene g:
\[ P_{c,g}=\frac{\frac{C_g}{N_{sample}}}{\sum_{t \in g}{N_{t,g}P_t}} \sum_{t \in c}{\frac{n_{t,c}}{n_c}P_t} \]

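where
\(C_g\): number of mutations in gene g;
\(N_{t,g}\): number of occurrences of trinucleotide t in gene g;
\(N_{sample}\): sample size;
\(n_{t,c}\): number of mutations at site t of codon c;
\(n_c\): number of mutations in codon c.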
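
The codon rate formula can be sketched as one small R function that multiplies a gene-level scaling factor by the weighted average of the trinucleotide rates in the codon. The function name `codon_rate`, its argument names, and all input values are hypothetical; this is an illustration of the formula as written above, not the published implementation.

```r
# P_{c,g} = (C_g / N_sample) / sum_t(N_{t,g} * P_t) * sum_t(n_{t,c} / n_c * P_t)
codon_rate <- function(C_g, N_sample, N_tg, P_t_gene, n_tc, n_c, P_t_codon) {
  # Gene-level factor: mutations per sample over the expected count in the gene
  gene_scale <- (C_g / N_sample) / sum(N_tg * P_t_gene)
  # Weighted average of trinucleotide rates over the sites of the codon
  codon_mut <- sum(n_tc / n_c * P_t_codon)
  gene_scale * codon_mut
}

# Hypothetical example with three trinucleotide contexts
p_cg <- codon_rate(
  C_g       = 50,
  N_sample  = 238,
  N_tg      = c(100, 200, 50),
  P_t_gene  = c(0.002, 0.004, 0.03),
  n_tc      = c(1, 2, 1),
  n_c       = 4,
  P_t_codon = c(0.002, 0.004, 0.03)
)
```

The resulting `p_cg` is the success probability that enters the binomial model.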

Preliminary results

  • data: UCSC (238 samples, 68197 SNV mutations)
library(data.table)
library(ggplot2)

# Significance against recurrence for each candidate hotspot
testrun2 <- fread("~/GIT/hotspots/test/testrun2_sig_hotspots.txt")
ggplot(data = testrun2) + geom_point(aes(x = log10(Mutation_Count), y = -log10_pvalue, color = TP), alpha = .5) + xlab("mutation recurrence")

# Mutability per trinucleotide context (column 14 is renamed to "tri")
names(testrun2)[14] <- "tri"
ggplot(data = testrun2) + geom_col(aes(x = tri, y = Mutability)) + theme(axis.text = element_text(angle = 90))

Model 2 (to do)

  • From MutSigCV (Lawrence et al. 2013)
    • designed for gene-level analysis
      • might require some tuning
    • use silent mutations from target gene and its neighbors

References

Chang, Matthew T., Saurabh Asthana, Sizhi Paul Gao, Byron H. Lee, Jocelyn S. Chapman, Cyriac Kandoth, JianJiong Gao, et al. 2016. “Identifying Recurrent Mutations in Cancer Reveals Widespread Lineage Diversity and Mutational Specificity.” Nature Biotechnology 34 (2): 155–63. doi:10.1038/nbt.3391.

Lawrence, Michael S., Petar Stojanov, Paz Polak, Gregory V. Kryukov, Kristian Cibulskis, Andrey Sivachenko, Scott L. Carter, et al. 2013. “Mutational Heterogeneity in Cancer and the Search for New Cancer-Associated Genes.” Nature 499 (7457): 214–18. doi:10.1038/nature12213.
