Mullins: Improving infection time estimators using PacBio sequencing
Use of the best possible infection time estimators in HIV-1 prevention efficacy trials of bNAb regimens, such as the AMP trials, is important for making inferences about VRC01 concentration and serum neutralization readouts as correlates of HIV-1 risk and as correlates of VRC01 prevention efficacy. Infection time estimators are based on single- or multiple-time-point data on HIV diagnostics and intra-host HIV-1 sequence diversification, using models of diagnostic assay operating characteristics and molecular clock models applied to deep sequencing data. Developing and evaluating infection time estimators for use in HIV-1 prevention efficacy trials requires data sets on individuals who meet two criteria: (1) precise knowledge of the true date of HIV-1 acquisition, and (2) a first HIV-1 RNA-positive sample inferred with high probability to fall within only several days after the date of HIV-1 acquisition (i.e., Fiebig Stage 1; HIV-1 sampled following the first HIV-1 RNA-positive sample and prior to ART initiation). Such data sets are rare, motivating this proposal to use samples from participants who meet both criteria in the FRESH and RV217 longitudinal cohort studies, as both studies employed twice-weekly HIV-1 RNA PCR testing to detect acutely HIV-1-infected individuals. In the AMP trials, unique molecular identifier (UMI)-tag-based PacBio sequencing methodology has been used to sequence the rev-env-nef and gag-pol gene regions of the HIV-1 genome from participants who acquire infection. However, no PacBio sequencing data sets of these gene regions exist to date that meet the two above-mentioned inclusion criteria.
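To make the molecular clock idea concrete, the following is a minimal sketch of a diversity-based estimator, not the estimator used in the AMP trials. It assumes a single founder virus, a star-like phylogeny, aligned sequences of equal length, and a constant substitution rate. The function name `estimate_days_since_infection` and the default rate `subs_per_site_per_day` are hypothetical placeholders chosen for illustration.

```python
from collections import Counter


def estimate_days_since_infection(sequences, subs_per_site_per_day=1e-5):
    """Rough molecular-clock estimate of days since infection.

    Assumptions (illustrative only): one founder virus, star-like
    phylogeny, aligned equal-length sequences, constant substitution
    rate. The default rate is a placeholder, not a measured value.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # Use the per-position majority base as a proxy for the founder.
    founder = "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )

    # Mean Hamming distance from each sequence to the inferred founder.
    total_differences = sum(
        sum(a != b for a, b in zip(seq, founder)) for seq in sequences
    )
    mean_distance = total_differences / len(sequences)

    # Expected distance = rate * length * days, so solve for days.
    return mean_distance / (subs_per_site_per_day * length)
```

Because the estimate is proportional to the observed mean distance, any sequencing errors counted as mutations inflate the estimated time since infection, which is why error-free reads matter in nearly homogeneous early populations.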
The use of PacBio sequencing has two unique and valuable advantages. First, unlike the Sanger and Illumina sequencing platforms, it allows determination of the entire 2.5-3.0 kb lengths of these genome segments in single sequence reads, and in doing so allows an assessment of changes at a distance from the contact point with antibodies that may impact protein structure and thus antibody neutralization sensitivity (i.e., along the entire proteins). Second, the use of UMIs as molecular barcodes that are added onto each viral genome as it is copied into cDNA allows accurate single-genome sequencing, and at greater depth than has been achieved to date. It should be stressed that the procedures developed in the Mullins laboratory are unique in that they remove PCR and sequencing misincorporation errors. This is critical since these errors typically exceed the natural degree of HIV population diversity early in infection. These procedures also greatly reduce the occurrence of recombination during PCR. In contrast, both mutation and recombination artefacts are common in typical PacBio (as well as other deep sequencing) approaches. These improvements are achieved by sequencing each molecule over 50 times (versus once in typical PacBio experiments), and it is this over-sequencing coupled with UMI tagging that eliminates PCR and sequencing errors. In addition, generating the products to be sequenced over a large number of independent PCR reactions (usually 8 per sample, compared to the typical single reaction) nearly eliminates recombination artefacts. In summary, these procedures permit an unprecedentedly deep and accurate view of emerging founder virus populations in vivo, including sensitive identification of variants of potentially differential antibody sensitivity. This accuracy is necessary to discern diversity in the nearly homogeneous virus populations that typically characterize acute HIV infection. Without these procedures in place, misincorporation errors that occur during PCR would equal or exceed the number of mutations that have occurred in the virus population very early in infection. Such errors can prevent accurate measurement of viral diversity and obscure the true timing of infection.
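The error-removal idea described above, many reads of the same UMI-tagged molecule collapsed into one sequence, can be sketched as a per-position majority vote. This is a simplified illustration under stated assumptions, not the Mullins laboratory pipeline. It assumes reads sharing a UMI are already aligned to equal length, and the function name `umi_consensus` and the `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=3):
    """Collapse (umi, sequence) reads into one consensus sequence per UMI.

    Illustrative assumptions: reads sharing a UMI derive from the same
    cDNA molecule and are pre-aligned to equal length. UMI groups with
    fewer than `min_reads` reads, or with unequal read lengths, are
    dropped.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to out-vote random errors
        if len({len(s) for s in seqs}) != 1:
            continue  # unaligned group; a real pipeline would align first
        # Majority base at each position; isolated errors are out-voted.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    example_reads = [
        ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),  # one read error
        ("GGT", "ACGA"), ("GGT", "ACGA"), ("GGT", "ACGA"),
        ("TTT", "AAAA"),  # singleton UMI, discarded
    ]
    print(umi_consensus(example_reads))
```

An error that appears in only a minority of reads for a given UMI is out-voted, whereas a true variant present in the original molecule appears in essentially all of its reads and is retained.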
The rev-env-nef region was chosen for analysis since the Env protein is the target for bNAb neutralization and the component rev-env region is used for the construction of plasmid vectors to express Env for laboratory testing of neutralization sensitivity. The gag-pol region was chosen as an internal control since it does not encode proteins targeted by the bNAb, and therefore the rates of evolution over this region of the viral genome may be differentially impacted by the bNAb regimen. However, both regions of the viral genome being sequenced are linked, and thus the evolutionary rates of both may be impacted. In contrast, analysis of the FRESH and RV217 cohorts using UMI-PacBio technology allows an accurate assessment of the emergence of virus variants without the influence of pre-existing antibody. The generation of PacBio rev-env-nef and gag-pol sequences from FRESH and RV217 participant samples is critical to the development and evaluation of infection time estimators directly applicable to the AMP trials, as well as to any efficacy trial of a bNAb regimen for HIV-1 prevention. These data will also provide a reference for later assessing, within the AMP trials, whether and how bNAbs affect the timing of the emergence of viremia and ensuing virus variation.
This grant is led by Dr. James Mullins (University of Washington) and will support related activities in the laboratories of Dr. Carolyn Williamson (University of Cape Town), Dr. Peter Gilbert (Fred Hutchinson Cancer Research Center), and Dr. Morgane Rolland (Henry M. Jackson Foundation/MHRP).
1. Create large-scale, highly accurate HIV-1 sequence data sets from newly infected persons with closely approximated dates of infection, using the Pacific Biosciences (PacBio) Single Molecule Real Time (SMRT) DNA sequencing platform (referred to here simply as PacBio sequencing).
2. Use these sequence data for statistical analyses needed to improve infection time estimators.
3. Apply the improved infection time estimators to the Antibody Mediated Prevention (AMP) trial HVTN 704/HPTN 085 and HVTN 703/HPTN 081 data sets.
4. Develop improved statistical methods for learning about HIV-1 monoclonal broadly neutralizing antibody (bNAb) prevention efficacy that incorporate the improved infection time estimators.