Within-host diversity

Within-host diversity, i.e. the genetic diversity of the parasite population within an infected individual, is determined by a number of different factors. Superinfection gives rise to multiclonal infections and is an important driver of within-host diversity. In our model the rate of superinfection is given by \(\chi\).

Following an episode of superinfection, a multiclonal infection may propagate along a transmission chain for several generations; this is known as cotransmission. Sexual recombination occurs with each generation of cotransmission, so that the parasites become increasingly related to each other although they remain genetically diverse. Cotransmission is more likely to continue over multiple generations if the quantum of transmission \(Q\) is large, and it will not occur if \(Q=1\).

Mutation and recombination also contribute to within-host diversity, but at a much lower level than superinfection and cotransmission. Their contribution again depends on \(Q\), which determines how strongly genetic drift eliminates the diversity generated by mutation, recombination and superinfection.

Within-host diversity can be empirically quantified by a number of approaches:

  • Complexity of infection (COI). This involves estimating the number of distinct parasite genotypes observed in an infected individual using genetic barcodes, microsatellites, microhaplotypes or other sequence data.

  • Measurement of within-host nucleotide diversity (\(\pi_W\)) by deep genome sequencing of individual infections.

  • Estimation of the inbreeding coefficient \(F_{WS}\) by comparing mean within-host heterozygosity \(\widehat{H}_W\) with local subpopulation heterozygosity \(H_S\) across hundreds of thousands of SNPs.

  • Measurement of the proportion of the genome that is identical by descent (IBD) between alleles sampled within a host. In our model this is approximated by \(\gamma_W\), the within-host haplotype homozygosity of a 2 centimorgan locus. This is difficult to measure using conventional short-read sequencing but is becoming possible due to recent advances in single-cell sequencing, long-read sequencing and algorithmic construction of haplotypes from short-read data.
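As a rough illustration of the \(F_{WS}\) approach listed above, the sketch below assumes the commonly used form \(F_{WS} = 1 - \widehat{H}_W / H_S\), with heterozygosity at a biallelic SNP taken as \(2p(1-p)\). Neither formula is stated on this page, and the function names and allele frequencies are hypothetical, so treat this as a simplified sketch rather than the estimator used in practice.

```python
from statistics import mean

def heterozygosity(p):
    """Expected heterozygosity at a biallelic SNP with allele frequency p (assumed 2p(1-p))."""
    return 2.0 * p * (1.0 - p)

def fws(within_host_freqs, population_freqs):
    """Assumed form F_WS = 1 - H_W / H_S, averaging heterozygosity across SNPs.

    within_host_freqs: allele frequencies observed within one infection
    population_freqs:  allele frequencies in the local subpopulation
    """
    h_w = mean([heterozygosity(p) for p in within_host_freqs])
    h_s = mean([heterozygosity(p) for p in population_freqs])
    if h_s == 0:
        raise ValueError("population heterozygosity is zero; F_WS is undefined")
    return 1.0 - h_w / h_s

# Hypothetical allele frequencies at three SNPs (illustrative values only).
population = [0.2, 0.5, 0.1]
clonal_infection = [0.0, 1.0, 0.0]   # a single genotype: no within-host variation
mixed_infection = [0.2, 0.5, 0.1]    # within-host frequencies mirror the population

print("clonal:", fws(clonal_infection, population))
print("mixed: ", fws(mixed_infection, population))
```

Under these assumptions, an infection with no within-host variation gives \(F_{WS} = 1\), while an infection whose allele frequencies match those of the local subpopulation gives \(F_{WS} = 0\).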

The genomic transmission graph provides a number of fundamental insights into the driving forces of within-host diversity, and also into how measurements of \(\pi_W\) and \(F_{WS}\) can be used to estimate the transmission parameters \(Q\) and \(\chi\).

The worked example below shows how to obtain \(\pi_W\) (mean within-host nucleotide diversity) and \(\gamma_W\) (within-host haplotype homozygosity of a 2 centimorgan locus) for different combinations of transmission parameters using coalestr.

On other pages we examine:

  1. the empirical distribution of \(\pi_W\) and how this can be used to estimate the quantum of transmission \(Q\).

  2. the relationship between \(\widehat{H}_W\) and \(H_S\).

  3. the relationship between \(F_{WS}\) and the parameters \(\chi\) and \(Q\).

!pip install coalestr
from coalestr import cs
''' We create a history of transmission parameters ..

my_history = [[D, N, Q, X, M]]
    D = duration of simulation
    N = effective number of hosts (Nh)
    Q = quantum of transmission
    X = crossing rate of transmission chains (chi)
    M = number of migrant hosts (Nm) '''

my_history = [[100000, 18764, 1, 0, 0]]
my_population = cs.Population(my_history)
my_population.get_coalescent()
my_population.get_diversity()
Observation time.    Events captured.   Mean coalescence time
                      beho      wiho        beho     wiho
        0              99.5     100.0     18188.4      1.0
Observation time.  Nucleotide diversity     Haplotype homozygosity
                      beho       wiho           beho       wiho
        0           4.00e-04   2.20e-08       4.09e-03   9.87e-01

In the above example, \(N_h = 18764\), \(Q = 1\) and \(\chi = 0\). The within-host parasite population has a mean coalescence time of 1 generation, giving nucleotide diversity \(\pi_W = 2.2 \times 10^{-8}\) and haplotype homozygosity \(\gamma_W = 0.987\).

In the example below \(N_h = 18764\), \(Q = 10\) and \(\chi = 1\). The within-host parasite population has a mean coalescence time of 23517 generations, giving nucleotide diversity \(\pi_W = 5.17 \times 10^{-4}\) and haplotype homozygosity \(\gamma_W = 0.0814\).

my_history = [[100000, 18764, 10, 1, 0]]
my_population = cs.Population(my_history)
my_population.get_coalescent()
my_population.get_diversity()
Observation time.    Events captured.   Mean coalescence time
                      beho      wiho        beho     wiho
        0              60.5      63.8     25633.1  23517.1
Observation time.  Nucleotide diversity     Haplotype homozygosity
                      beho       wiho           beho       wiho
        0           5.64e-04   5.17e-04       7.09e-04   8.14e-02
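The nucleotide diversity values printed in the two examples are consistent with the relationship \(\pi = 2 \mu T\), where \(T\) is the mean coalescence time and \(\mu\) is the mutation rate per site per generation. The sketch below checks this against the printed output. The value \(\mu = 1.1 \times 10^{-8}\) is an assumption inferred from the first example (\(\pi_W = 2.2 \times 10^{-8}\) when \(T = 1\)), not taken from the coalestr documentation, and the helper function is hypothetical.

```python
# Assumed mutation rate per site per generation, inferred from the printed
# output above (pi_W = 2.2e-8 at a mean coalescence time of 1 generation).
ASSUMED_MU = 1.1e-8

def nucleotide_diversity(mean_coalescence_time, mu=ASSUMED_MU):
    """Expected pairwise nucleotide diversity, assuming pi = 2 * mu * T."""
    return 2.0 * mu * mean_coalescence_time

# (mean coalescence time, reported nucleotide diversity) copied from the output above.
reported = {
    "Q=1,  chi=0, beho": (18188.4, 4.00e-04),
    "Q=1,  chi=0, wiho": (1.0, 2.20e-08),
    "Q=10, chi=1, beho": (25633.1, 5.64e-04),
    "Q=10, chi=1, wiho": (23517.1, 5.17e-04),
}

for label, (t, pi_reported) in reported.items():
    pi_predicted = nucleotide_diversity(t)
    print(f"{label}: predicted {pi_predicted:.2e}, reported {pi_reported:.2e}")
```

If the assumption holds, each predicted value should match the reported one to the precision shown, which makes the link between coalescence time and \(\pi_W\) in the two examples explicit.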