Analyzing the PTC Tasting Phenotype with MySeq

This page is an example analysis of the "bitter tasting" trait using the MySeq application in an embedded context. Here MySeq is used both to query a whole-genome VCF for NA12878 (from Genome in a Bottle) by genomic coordinates and to predict the bitter tasting phenotype directly. All of the queries demonstrated here are performed "live" in the browser; they are not pre-generated results. Try MySeq as a "standalone" application.

Kim et al. identified a three-variant haplotype in the TAS2R38 gene associated with the ability to taste the substance phenylthiocarbamide (PTC). The haplotype variants are described in the paper as A49P, V262A, and I296V, with "PAV" the dominant tasting haplotype. A little work is required to (reverse) translate that specification into genomic coordinates. We can translate those amino acid substitutions to nucleotide substitutions using the protein accession obtained from the NCBI and the Mutalyzer "back translator", e.g. NP_789787.4(TAS2R38):p.A49P. We then pass the coding nomenclature that Mutalyzer reports, NM_176817.4:c.145G>C, to its position converter to obtain the GRCh37 genomic coordinates, NC_000007.13:g.141673345C>G, or chr7:141673345C>G.
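The arithmetic behind that translation can be checked by hand. The following is a minimal sketch, not part of MySeq or Mutalyzer: the helper names are hypothetical, and it assumes TAS2R38 is transcribed from the reverse strand of chromosome 7, which is consistent with c.145G>C mapping to g.141673345C>G above.

```python
# Hypothetical helpers for illustration; not part of MySeq or Mutalyzer.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def codon_cdna_range(aa_position):
    """Return the 1-based coding (c.) positions spanned by a codon."""
    first = 3 * (aa_position - 1) + 1
    return first, first + 2


def to_genomic_alleles(ref, alt, minus_strand=True):
    """Convert coding-strand alleles to forward-strand genomic alleles.

    Assumption: the gene is on the reverse strand, so both alleles
    are complemented when moving to genomic coordinates.
    """
    if minus_strand:
        return COMPLEMENT[ref], COMPLEMENT[alt]
    return ref, alt


# Codon 49 spans c.145 to c.147, so c.145 is its first base.
print(codon_cdna_range(49))
# c.145G>C on the coding strand becomes C>G on the genomic forward strand.
print(to_genomic_alleles("G", "C"))
```

The complement step is why the coding change G>C appears as C>G in the genomic notation.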

Paper Variant    Genomic Variant (GRCh37/hg19)
A49P             chr7:141673345C>G
V262A            chr7:141672705G>A
I296V            chr7:141672604T>C
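To turn the notation in this table into positional queries, each genomic variant can be split into chromosome, position, reference allele and alternate allele. The following is a small sketch of that step; the function and variable names are illustrative and are not MySeq's own.

```python
import re

# Matches simple substitutions written as chrom:posREF>ALT.
VARIANT_RE = re.compile(r"^(chr\w+):(\d+)([ACGT]+)>([ACGT]+)$")


def parse_variant(text):
    """Split 'chr7:141673345C>G' into its four components."""
    match = VARIANT_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized variant notation: {text!r}")
    chrom, pos, ref, alt = match.groups()
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}


# Paper nomenclature mapped to the genomic variants from the table.
PAPER_TO_GENOMIC = {
    "A49P": "chr7:141673345C>G",
    "V262A": "chr7:141672705G>A",
    "I296V": "chr7:141672604T>C",
}

for name, spec in PAPER_TO_GENOMIC.items():
    variant = parse_variant(spec)
    # The chrom:pos string is the form used for a query by position.
    print(name, f"{variant['chrom']}:{variant['pos']}", variant["ref"], variant["alt"])
```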

We can use MySeq to query for those variants in NA12878 by genomic position, e.g. chr7:141673345. Clicking on any of the variant rows will bring up additional annotations, including "Functional" annotations that we can use to verify our reverse translation.

The heterozygous T/C, G/A and C/G genotypes are very suggestive that NA12878 carries one copy of the tasting haplotype. However, the variant call data alone is insufficient to know which alleles are on the same chromosome. The first two variants (chr7:141672604 and chr7:141672705) are only 101 bp apart. Thus we can verify that those variants are indeed on the same chromosome by checking whether the alleles are on the same fragment (i.e. in the same read or read pair), a process termed "read-backed phasing". The pileup for NA12878's whole exome sequencing (WES) data is shown below for the region spanning the first two variants (from the 1000 Genomes Phase 3 dataset).
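The idea behind read-backed phasing can be sketched as counting which allele pairs co-occur on fragments that cover both sites. The example below is an illustration only: the fragments are invented toy data, not NA12878 reads, and the function name is hypothetical.

```python
from collections import Counter

POS_A, POS_B = 141672604, 141672705
print(POS_B - POS_A)  # distance between the two sites in bp


def phase_counts(fragments, pos_a, pos_b):
    """Count allele pairs seen on fragments that cover both positions.

    Each fragment is a dict mapping genomic position to the observed base.
    """
    counts = Counter()
    for alleles in fragments:
        if pos_a in alleles and pos_b in alleles:
            counts[(alleles[pos_a], alleles[pos_b])] += 1
    return counts


# Invented fragments for illustration; these are not real sequencing reads.
toy_fragments = [
    {POS_A: "C", POS_B: "G"},
    {POS_A: "C", POS_B: "G"},
    {POS_A: "T", POS_B: "A"},
    {POS_A: "T"},  # covers only one site, so it carries no phase information
]

for pair, n in phase_counts(toy_fragments, POS_A, POS_B).items():
    print(pair, n)
```

Allele pairs that repeatedly appear together on the same fragment are inferred to lie on the same chromosome; fragments covering only one site are ignored.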

When we review the pileup of the 1000 Genomes Phase 3 exome data (shown above) in this region, however, we don't see any fragments with both the C ("blue") and A ("green") alternate alleles. The GenBank entry for TAS2R38 explains why: it contains the following misc_difference.

                5869
                /gene="TAS2R38"
                /gene_synonym="PTC; T2R38; T2R61; THIOT"
                /note="This sequence differs from the reference genome
                assembly (NCBI Build 37) at this position. C was replaced
                by T to represent the more common allele."
                /replace="c"
            

The reference G allele in chr7:141672705G>A encodes Ala262, the amino acid that the paper treats as the alternate. Thus the "taster" haplotype PAV described in the paper corresponds to the G, G and C alleles, respectively, of the three variants shown below. As we saw above, those alleles do appear to be on the same chromosome in NA12878.

chr7:141673345C>G C/G
chr7:141672705G>A G/A
chr7:141672604T>C T/C

Having verified the haplotype structure, we can now predict that NA12878, who carries one copy of the dominant taster haplotype, tastes PTC as bitter.
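That final inference can be written as a small rule. The sketch below is a simplified illustration and not MySeq's implementation. It assumes the taster alleles are G, G and C at the three positions (as derived above), treats only the full PAV haplotype as a taster haplotype, and applies a dominant model; all names are hypothetical.

```python
# Forward-strand alleles that make up the PAV (taster) haplotype,
# as derived in the text; an assumption of this sketch.
TASTER_ALLELES = {141673345: "G", 141672705: "G", 141672604: "C"}


def is_taster_haplotype(haplotype):
    """True if the haplotype carries the taster allele at every site."""
    return all(haplotype.get(pos) == allele for pos, allele in TASTER_ALLELES.items())


def predict_ptc_taster(hap1, hap2):
    """Dominant model: one taster haplotype is enough to taste PTC."""
    return is_taster_haplotype(hap1) or is_taster_haplotype(hap2)


# Phased haplotypes for NA12878 as inferred in the text above.
na12878 = (
    {141673345: "G", 141672705: "G", 141672604: "C"},  # PAV
    {141673345: "C", 141672705: "A", 141672604: "T"},  # AVI
)

print(predict_ptc_taster(*na12878))
```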

In addition to query and annotation, MySeq implements a number of analyses including predicting physical traits like bitter tasting. This analysis, based on just the A49P variant, shows the same results we derived above.