Lentivirus
Domesticating HIV
Lentiviral vectors are engineered derivatives of human immunodeficiency virus type 1 (HIV-1). Using a deadly pathogen for everyday research may seem counterintuitive, but the properties that make HIV-1 scary (stable genomic integration, efficient transduction of non-dividing cells, and broad tropism) are precisely why it’s so useful. Retaining these delivery capabilities while eliminating replication competency was achieved through progressive dissection of the HIV-1 genome across three generations of packaging systems, with a fourth now emerging.

Figure X. Comparison of wild-type HIV-1 genome organization versus first-, second-, third-, and fourth-generation lentiviral vector systems, illustrating progressive removal of viral genes to eliminate replication competency while retaining delivery function.
First-generation systems (Naldini et al., 1996)1 split the viral genome across separate plasmids: a transfer vector retaining cis-acting elements needed for RNA processing, packaging, reverse transcription, and integration, a packaging construct supplying all trans-acting proteins (Gag, Pol, Tat, Rev) except Env, and a separate envelope plasmid. This eliminated the possibility of producing replication-competent lentivirus in a single recombination event, but retained potentially hazardous accessory genes (vif, vpr, vpu, nef) that were unnecessary for vector function. Second-generation systems (Zufferey et al., 1997)2 stripped these accessory genes away but kept Tat, retaining the three-plasmid layout (transfer vector, packaging construct, envelope). Third-generation systems (Dull et al., 1998)3 eliminated Tat dependence by driving transcription of the vector genome from a heterologous promoter (most commonly RSV or CMV) and separated Rev onto its own plasmid, producing the currently standard four-plasmid SIN architecture (see figure below). This added safety can cost titer: the loss of Tat removes Tat-dependent transcriptional elongation through the TAR element, and co-transfecting four plasmids rather than three lowers the probability that any given cell receives all components36. In practice, the deficit is protocol-dependent and can be recovered by swapping the chimeric 5′ LTR promoter (e.g. from RSV to CMV) and optimizing plasmid ratios37. Each of the system’s core components (the cis-acting elements of the transfer vector, the Gag-Pol packaging machinery, Rev, and the VSV-G envelope) is described in the Second and Third Generation Lentivirus section.
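The co-transfection argument can be made concrete with a toy calculation. The sketch below assumes each plasmid is taken up independently with the same per-plasmid probability, which is a simplification (in practice uptake is correlated, because transfection complexes carry mixed plasmids). The function name and the value of `P_UPTAKE` are illustrative assumptions, not measurements.

```python
def fraction_with_all_plasmids(p_uptake: float, n_plasmids: int) -> float:
    """Probability that one cell receives every plasmid, assuming independent uptake."""
    if not 0.0 <= p_uptake <= 1.0:
        raise ValueError("p_uptake must be between 0 and 1")
    if n_plasmids < 1:
        raise ValueError("n_plasmids must be at least 1")
    return p_uptake ** n_plasmids


P_UPTAKE = 0.8  # hypothetical per-plasmid uptake probability, not a measured value

three_plasmid = fraction_with_all_plasmids(P_UPTAKE, 3)
four_plasmid = fraction_with_all_plasmids(P_UPTAKE, 4)
print(f"three plasmids: {three_plasmid:.3f}")
print(f"four plasmids:  {four_plasmid:.3f}")
print(f"four/three:     {four_plasmid / three_plasmid:.2f}")
```

Under this independence assumption the penalty for adding a fourth plasmid is simply one more factor of the uptake probability, so it shrinks as transfection efficiency approaches saturation.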
More recently, Vink et al. (2017)32 developed a fourth-generation design, the LTR1 vector, which eliminates HIV-1 packaging sequences (Ψ and gag fragments) from the integrated provirus entirely. These elements are placed beyond the LTRs and are therefore lost upon reverse transcription. The resulting provirus contains no HIV-1 sequences beyond the minimal LTR att sites, reducing theoretical mobilization risk to near zero and accelerating transgene expression by removing inhibitory RNA structures from the 5′ UTR. These systems haven’t seen broad adoption (second- and third-generation vectors remain the standard), but they represent the continued trajectory of safety dissection.
Altogether, three decades of engineering have produced a transfer vector that retains merely ~8% of the original HIV-1 genome, exclusively cis-acting regulatory sequences, while all protein-coding functions are supplied from separate, unpackageable helper plasmids.
Second and Third Generation Lentivirus
Second-generation systems (Zufferey et al., 1997)2 stripped away the four HIV-1 accessory genes (vif, vpr, vpu, nef) that were unnecessary for vector function and potentially harmful, but still relied on Tat for transcription from the native 5′ LTR. The critical advance to third-generation systems (Dull et al., 1998)3 removed Tat dependence entirely by replacing the U3 region of the 5′ LTR with a constitutive promoter (typically CMV or RSV) and separated Rev onto its own plasmid. This created a four-plasmid architecture: transfer vector, Gag-Pol, Rev, and envelope, requiring three highly improbable independent recombination events to reconstitute anything resembling replication-competent HIV-1. The self-inactivating (SIN) deletion in the 3′ LTR U3 region, copied to both LTRs upon reverse transcription, provided the final safety layer by silencing the proviral promoter in transduced cells.
Today, essentially all research-grade lentiviral work uses third-generation systems. A modern third-generation transfer vector retains only the cis-acting elements required for RNA processing, packaging, reverse transcription, and integration. Understanding these elements in detail is essential for understanding why recombination occurs where it does.

Figure 1. Linear map of the third-generation lentiviral transfer vector RNA showing all cis-acting elements, their functional roles, and the obligatory first strand transfer via R–R complementarity.
The 5′ untranslated region and packaging signal (Ψ). The 5′ UTR spans from the transcription start site through the first ~350 nucleotides and contains structured RNA elements governing export, dimerization, and selective packaging. TAR (Trans-Activation Response element), the first ~60 nucleotides, forms a stable stem-loop. In third-generation vectors, Tat dependence is eliminated by replacing U3 with a constitutive promoter, making TAR vestigial for transcription but still essential as part of the R region required for strand transfer during reverse transcription. The primer binding site (PBS) is an 18-nucleotide sequence complementary to the 3′ end of human tRNALys3, the primer for minus-strand DNA synthesis. Its placement near the 5′ end is the fundamental architectural reason why the first strand transfer is obligatory: RT initiates at the PBS and copies toward the 5′ end of the RNA, reaching the 5′ cap after synthesizing only the short R–U5 stretch of cDNA (~180 nt in HIV-1). Without a primer near the 3′ end, the only way to copy the remainder of the genome is for the nascent cDNA to jump to the 3′ R.
SL1 / Dimerization Initiation Site (DIS). SL1 contains a 6-nucleotide palindromic loop (GCGCGC in HIV-1 subtype B) that mediates the kissing-loop interaction initiating RNA dimerization. It is the element most directly relevant to recombination: any two library members sharing the DIS palindrome can co-dimerize and be co-packaged as a heterozygous virion. SL3 presents a GGAG tetraloop bound by the NC zinc knuckles of Gag, the primary specificity determinant for genome selection. SL4 encompasses the mutated gag AUG, with ~360 bp of the 5′ gag coding sequence retained as part of the extended packaging signal. Structural work by Vamva et al. (2022, Nucleic Acids Research) revealed a U5–gag interaction contributing to the monomer–dimer RNA structural equilibrium linked to packaging competence.
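What "palindromic" means for the DIS loop can be checked in a few lines. The sketch below treats two loops as able to form a kissing-loop duplex only if they are fully Watson-Crick complementary in antiparallel orientation. That strict rule is a simplifying assumption (it ignores wobble pairs, flanking purines, and stem context), and the helper names are hypothetical. The subtype C palindrome GUGCAC is included for comparison with the subtype B sequence.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def can_kiss(loop_a: str, loop_b: str) -> bool:
    """True if the two loops are fully Watson-Crick complementary (antiparallel)."""
    return loop_a == reverse_complement(loop_b)


DIS_SUBTYPE_B = "GCGCGC"
DIS_SUBTYPE_C = "GUGCAC"

for name_a, a, name_b, b in [
    ("B", DIS_SUBTYPE_B, "B", DIS_SUBTYPE_B),
    ("C", DIS_SUBTYPE_C, "C", DIS_SUBTYPE_C),
    ("B", DIS_SUBTYPE_B, "C", DIS_SUBTYPE_C),
]:
    print(f"{name_a} x {name_b}: {can_kiss(a, b)}")
```

A loop that equals its own reverse complement can pair with another copy of itself, which is why every library member carrying the same DIS is a potential dimerization partner for every other member.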
Central polypurine tract and central termination sequence (cPPT/CTS). Derived from HIV-1 pol, these serve as a second plus-strand priming site during reverse transcription. The resulting central DNA flap enhances nuclear import efficiency 5–10-fold11.
Internal promoter and transgene cassette. This region carries the sgRNA expression cassette and, in some designs, a barcode and/or selection marker. In standard single-guide screens (e.g., the GeCKO, Brunello, and Dolcetto libraries from the Broad Genetic Perturbation Platform), this region contains a single U6 promoter driving one sgRNA. In combinatorial designs (such as the Big Papi dual-guide vector, CROPseq-multi (Russell et al., 2024)12, or CHyMErA (Gonatopoulos-Pournatzis et al., 2020)13) this region contains two or more sgRNA cassettes, and it is the homology between these repeated elements that drives intra-molecular recombination.
RRE, 3′ LTR (SIN), and WPRE. The RRE (~350 nt, from HIV-1 env) is bound by Rev for CRM1-mediated nuclear export of unspliced vector RNA. The 3′ LTR carries the SIN ΔU3 deletion, which is copied to both LTRs during reverse transcription, silencing the provirus. The polypurine tract (PPT), an RNase H-resistant purine-rich sequence, serves as the primer for plus-strand DNA synthesis. WPRE upstream of the 3′ LTR enhances transgene expression 2–5-fold via CRM1-independent mRNA stabilization.
The three helper plasmids supply all trans-acting proteins. None contain Ψ or LTRs, preventing their mRNAs from being packaged. The Gag-Pol packaging plasmid encodes the Gag polyprotein (MA–CA–NC–p6) and, via −1 ribosomal frameshift at ~5% frequency, the Gag-Pol polyprotein (adding PR–RT–IN). MA targets Gag to PI(4,5)P2-rich membrane domains; CA forms the conical capsid core (~250 hexamers + 12 pentamers); NC’s zinc knuckles bind Ψ with nanomolar affinity; p6 recruits ESCRT (TSG101 via PTAP, ALIX via YPXnL) for budding. The mature virion contains ~2,500 Gag and ~120 Gag-Pol molecules, yielding ~50 RT heterodimers and, because each Gag-Pol carries one IN domain, ~120 IN monomers inside the CA core. The Rev plasmid supplies Rev, which binds the RRE and mediates nuclear export of unspliced vector RNA; without Rev, packaging efficiency drops to near zero. The VSV-G envelope plasmid provides VSV-G, which binds LDLR and confers pantropism to essentially any mammalian cell type. Its pH-dependent fusion mechanism matches the endosomal entry pathway, and VSV-G shares no sequence homology with other system components, minimizing recombination risk. Notable exceptions to pantropism: quiescent T cells, B cells, and HSCs express low LDLR levels, requiring alternative pseudotyping strategies.
(For an excellent practical overview of lentiviral vector biology, components, and protocols, see Addgene’s Lentiviral Vector Guide.)
The Packaging Process: From Transcription to Budding

Figure 2. Virus production in HEK293T packaging cells. Four plasmids converge: transfer vector RNA dimerizes via DIS, is recognized by Gag-NC, and assembles into virions at the plasma membrane.
The standard packaging cell is HEK293T, not the parental HEK293 line. While HEK293 cells were generated by transforming human embryonic kidney cells with sheared adenovirus 5 DNA (Graham et al., 1977)33, they lack the ability to episomally replicate transfected plasmids. HEK293T cells were subsequently derived by stable transfection with a plasmid encoding the SV40 large T antigen (DuBridge et al., 1987)34. Large T binds the SV40 origin of replication (SV40 ori) present in most common expression vectors (including psPAX2, pMD2.G, and many transfer backbones), driving episomal amplification of transfected plasmids to high copy number within each cell. This amplification dramatically increases the amount of vector RNA and helper proteins produced per cell, boosting lentiviral titers 10–100-fold over parental HEK293 cells35. For this reason, virtually all lentiviral packaging protocols specify HEK293T (or its derivative Lenti-X 293T, which has been further selected for high-titer production).
Transfer vector RNA is transcribed from the chimeric 5′ LTR and exported via Rev/RRE. In the cytoplasm, vector RNAs dimerize through the DIS kissing-loop at SL1. In a pooled library, all members share the same DIS palindrome, so dimerization is stochastic: different library members can co-dimerize and be co-packaged as heterozygous virions. Dimerization and Gag recognition are mechanistically coupled: the dimeric conformation exposes NC-binding sites that are occluded in the monomer, creating a quality-control mechanism favoring packaging of dimeric RNA.
Approximately 12 Gag molecules nucleate on the dimeric RNA via NC, then traffic to the membrane where ~2,500 additional Gag molecules assemble. ESCRT-III catalyzes membrane scission. After budding, PR cleaves Gag/Gag-Pol, producing the mature virion: a lipid envelope with ~14 VSV-G trimers, an MA shell, and a conical CA core housing two RNA genomes plus NC, RT, and IN. If the two RNAs differ (heterozygous virion), template switching during reverse transcription will produce chimeric proviruses. This is the origin of inter-molecular recombination in pooled screens: any time a library is produced in bulk, a substantial fraction of virions will be heterozygous.
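How large "a substantial fraction" is can be estimated under a simple assumption: if the two co-packaged RNAs are drawn independently and in proportion to each library member's RNA abundance, the chance that they match is the sum of squared frequencies. The sketch below implements that calculation. Random pairing is an assumption made here for illustration, and the function name and example abundances are hypothetical.

```python
from typing import Sequence


def heterozygous_fraction(abundances: Sequence[float]) -> float:
    """Fraction of virions carrying two different library members.

    Assumes the two packaged RNAs are drawn independently, in proportion
    to each member's abundance (random pairing).
    """
    total = float(sum(abundances))
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    frequencies = [a / total for a in abundances]
    homozygous = sum(f * f for f in frequencies)
    return 1.0 - homozygous


# Hypothetical libraries: two equally abundant members, then 1,000 equal members.
print(f"{heterozygous_fraction([1, 1]):.3f}")
print(f"{heterozygous_fraction([1] * 1000):.3f}")
```

For any library of realistic complexity the homozygous term is negligible, so under random pairing nearly every virion produced from a pooled library carries two different members.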
Transduction: From Cell Entry to Integration
Entry and Capsid Nuclear Import
VSV-G binds LDLR on the target cell surface, triggering clathrin-mediated endocytosis. At pH ~6.0–6.2, VSV-G undergoes a conformational change that drives membrane fusion, releasing the conical CA core into the cytoplasm.
The intact capsid core traverses the nuclear pore complex (NPC) narrow-end-first, engaging FG-nucleoporins (Nup358 at the cytoplasmic face, Nup153 at the nuclear basket). The NPC deforms to ~60 nm to accommodate the core. Cyclophilin A (CypA) in the cytoplasm blocks premature CPSF6 engagement; once in the nucleus, CPSF6 binds the CA pocket (the same pocket targeted by the antiretroviral lenacapavir) and traffics the core to nuclear speckle-associated domains (SPADs). This capsid-mediated nuclear import mechanism is the fundamental reason lentiviruses can infect non-dividing cells, a capability gammaretroviral vectors lack (gammaretroviruses require mitotic nuclear envelope breakdown to access host DNA).
Reverse Transcription

Figure 3. The five steps of reverse transcription, showing obligatory strand transfers and the recombination hotzone during minus-strand elongation (Step 3).
Reverse transcription converts the diploid RNA genome into one double-stranded DNA provirus. Current evidence indicates it begins in the cytoplasm but completes inside the nucleus, within the intact CA core (a nanoscale reaction chamber concentrating RNA templates, RT, NC, and dNTPs imported through CA hexamer pores).
Step 1: Minus-strand strong-stop DNA synthesis. RT uses tRNALys3 annealed at the PBS to initiate minus-strand synthesis, reading the template 3′→5′ (leftward toward the 5′ cap). It copies U5 and R (~180 nt in HIV-1) and then runs out of template at the 5′ end. RNase H degrades the RNA in the hybrid, freeing the nascent −ssDNA.
Step 2: First strand transfer (obligatory). The freed −ssDNA jumps to the 3′ end of the genome by annealing its R complement to the 3′ R. This jump can occur intramolecularly (same RNA) or intermolecularly (co-packaged RNA): the first recombination opportunity. RT cannot initiate de novo; without this jump, reverse transcription stalls irreversibly.
Step 3: Minus-strand elongation (the recombination hotzone). RT extends the minus strand along the entire genome from the 3′ R through U3, through the transgene cassette, all the way back toward the 5′ end. This is where the bulk of recombination occurs, via the dynamic copy-choice mechanism: RT pauses at RNA secondary structures, RNase H continues to degrade the already-copied donor template behind the stalled polymerase, the nascent cDNA is exposed, and the co-packaged acceptor RNA invades and anneals. In HIV-1, an estimated 5–14 template-switching events occur per genome per cycle14. The recombination rate is roughly 1 crossover per kilobase of inter-element distance, a critical design parameter for vector architecture.
Step 4: Plus-strand synthesis and second transfer. The PPT, an RNase H-resistant purine-rich RNA remnant, primes plus-strand DNA synthesis. RT copies U3-R-U5-PBS to produce +ssDNA, which jumps to the other end of the minus strand via PBS complementarity (second obligatory transfer). The cPPT provides a second priming site, and the two plus strands meet to form the central DNA flap.
Step 5: Integration. The completed dsDNA has two identical ΔU3-R-U5 LTRs (SIN). Integrase recognizes only the terminal ~15 bp att sequences at each end, performs 3′-processing (removing the two terminal nucleotides from each 3′ end to expose the conserved CA-3′-OH), and catalyzes strand transfer into host DNA with a 5 bp stagger. LEDGF/p75 tethers IN to H3K36me3 in active gene bodies, biasing integration into transcribed regions15.
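The net effect of the two strand transfers on genome layout can be summarized as bookkeeping: the RNA carries R-U5 at its 5′ end and U3-R at its 3′ end, while the provirus carries a full U3-R-U5 LTR at both ends. The sketch below reproduces only that rearrangement of element order, not the enzymology. The element labels and function name are illustrative assumptions, and `dU3` stands for the SIN-deleted U3.

```python
from typing import List


def provirus_from_genomic_rna(rna_elements: List[str]) -> List[str]:
    """Element order of the provirus, given the element order of the vector RNA.

    Expects the RNA as [R, U5, ...internal..., U3, R]. Both proviral LTRs are
    assembled as U3-R-U5, with U3 taken from the 3' end of the RNA.
    """
    if len(rna_elements) < 4 or rna_elements[0] != "R" or rna_elements[-1] != "R":
        raise ValueError("expected RNA layout [R, U5, ..., U3, R]")
    r, u5 = rna_elements[0], rna_elements[1]
    u3 = rna_elements[-2]            # only the 3' copy of U3 exists in the RNA
    internal = rna_elements[2:-2]
    ltr = [u3, r, u5]
    return ltr + internal + ltr


# Schematic SIN vector RNA; "dU3" denotes the SIN-deleted U3.
vector_rna = ["R", "U5", "PBS", "Psi", "RRE", "cPPT", "cassette", "WPRE", "PPT", "dU3", "R"]
provirus = provirus_from_genomic_rna(vector_rna)
print("-".join(provirus))
```

Because the only U3 template in the RNA is the 3′ copy, whatever sits there (a SIN deletion, or an inserted cassette) ends up in both LTRs of the provirus.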

Figure 4. Capsid nuclear import, RT completion in the nucleus, and integration at nuclear speckle-associated domains via LEDGF/p75 tethering.
Determinants of Template-Switching Frequency
Polymerase domain mutations. Drug-resistance mutations (K65R, L74V, E89G, Q151N, M184I) slow polymerization and increase switching 2–6-fold. Hydroxyurea-mediated dNTP depletion increases switching ~1.8-fold, consistent with the model that pausing is the rate-limiting step for template switching.
RNase H mutations. H539N and D549N decrease switching ~2-fold16. More aggressive mutations (D443N, E478Q, D498N) further reduce switching but crash titer >1000-fold, making them impractical for vector production. Phenotypic mixing experiments in MLV showed a steady decline in repeat deletion frequency with decreasing functional RNase H, with >4-fold decreases when 95% of virion RT was RNase H-defective (Mbisa et al., 2005)17.
Connection domain mutations. Patient-derived mutations (E312Q, G335C/D, N348I, A360I/V, V365I, A376S) reduce RNase H activity modestly while maintaining titer. In single-cycle assays, inclusion of these mutations reduced template-switching frequency from ~47% to ~32% (Nikolenko et al., 2007)18. N348I in particular maintains viral replication capacity, reduces secondary RNase H cleavages, and increases processive DNA synthesis, an attractive combination for vector engineering. Combining N348I with A360V and a modest RNase H site mutation (D549N) might achieve ~3-fold reduction in switching while keeping titer usable.
Nuclear dNTP pools. Because reverse transcription completes in the nucleus, the relevant dNTP concentration is the nuclear pool, which is lower than cytoplasmic levels, meaning RT inherently operates under conditions that favor pausing and template switching. This is an underappreciated factor: even without heterozygous virions, the nuclear environment itself promotes recombination.
Recombination as a Problem for Pooled Screens
The earliest genome-wide loss-of-function libraries used shRNA hairpins delivered by lentivirus. Stable integration meant each cell carried a heritable, barcode-like perturbation that could be tracked by deep sequencing after phenotypic selection, a property no transient delivery method could match. When CRISPR-Cas9 arrived, the same lentiviral infrastructure was repurposed almost overnight: the shRNA cassette was swapped for a guide-RNA expression cassette, and the first genome-scale CRISPR knockout screens (Shalem et al., 2014; Wang et al., 2014)4,5 were published side by side in the same issue of Science. This made lentiviral vectors the workhorse of pooled genetic screens. The platform rapidly expanded to CRISPRi/CRISPRa transcriptional control (Gilbert et al., 2014; Horlbeck et al., 2016)6,7, single-cell transcriptomic readouts that pair perturbation identity with gene expression (Perturb-seq: Dixit et al., 2016; CROP-seq: Datlinger et al., 2017)8,9, and ultimately genome-scale perturbation profiling of over one million cells in a single experiment (Replogle et al., 2022)10.
In every case, the lentiviral vector is the common thread: it integrates a single perturbation per cell, maintains it through division, and encodes a sequenceable identifier. But the same retroviral biology that enables this efficient delivery introduces an intrinsic source of potential noise: genetic recombination during reverse transcription.
Inter-molecular Recombination: The Perturb-seq Problem
Inter-molecular recombination occurs when RT switches templates between co-packaged RNAs in heterozygous virions. This was the central technical challenge of the original Perturb-seq platform. The Dixit et al. (2016) vector placed the sgRNA (mU6 promoter) and a polyadenylated guide barcode (GBC) in antiparallel cassettes ~2.7 kb apart8. When the library was pooled prior to virus production, template switching during reverse transcription scrambled sgRNA–GBC linkages at high frequency.
Xie et al. (2018) subsequently quantified this directly in Mosaic-seq libraries: when libraries were pooled before viral packaging, the most abundant sgRNA for each barcode occupied a median of only ~42% of reads, meaning >50% of sgRNA–barcode linkages were scrambled19. In contrast, individually packaged viruses maintained >83% correct linkage. This confirmed that recombination occurs overwhelmingly during reverse transcription of co-packaged heterodimeric genomes, not during co-infection of the same cell by independent virions.
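The linkage metric described here (the share of reads claimed by the most abundant sgRNA for each barcode, summarized by its median) is straightforward to compute from paired reads. The sketch below is one minimal way to do it. The input format, function name, and toy read counts are assumptions for illustration and are not taken from the Xie et al. analysis.

```python
from collections import Counter
from statistics import median
from typing import Dict, Iterable, Tuple


def top_sgrna_fractions(reads: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """For each barcode, the fraction of reads carrying its most abundant sgRNA."""
    per_barcode: Dict[str, Counter] = {}
    for barcode, sgrna in reads:
        per_barcode.setdefault(barcode, Counter())[sgrna] += 1
    return {
        barcode: max(counts.values()) / sum(counts.values())
        for barcode, counts in per_barcode.items()
    }


# Made-up (barcode, sgRNA) read pairs, for illustration only.
toy_reads = (
    [("BC1", "sgA")] * 8 + [("BC1", "sgB")] * 2
    + [("BC2", "sgB")] * 4 + [("BC2", "sgC")] * 6
)
fractions = top_sgrna_fractions(toy_reads)
median_top_fraction = median(fractions.values())
print(fractions)
print(f"median top-sgRNA fraction: {median_top_fraction:.2f}")
```

A value near 1 indicates intact sgRNA-barcode linkage, while lower values indicate that reads for a barcode are spread across several guides.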
The companion Adamson et al. (2016) study anticipated this problem and mitigated it through arrayed cloning: each sgRNA–GBC pair was individually cloned and verified by Sanger sequencing before pooling, creating a known dictionary that allowed computational identification of recombinant proviruses20. This preserved sgRNA–GBC fidelity but was laborious and not scalable to genome-wide libraries. Their subsequent 2018 preprint provided detailed best practices and discussed three mitigation strategies: arrayed library preparation, carrier plasmid dilution, and the CROP-seq architecture.
The most systematic recent measurement comes from the Blainey/Russell CROPseq-multi study, which assayed recombination across eight plasmid libraries and four cell lines12. They observed 9–17% total recombination (plasmid + lentiviral), with a mean of 12% attributable to lentiviral integration. Recombination frequency scaled approximately linearly with inter-element distance, consistent with the ~1 crossover/kb rule established in the virology literature.
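The distance dependence can be sketched with a standard genetic-mapping argument. If crossovers between two elements occur as a Poisson process at the ~1 per kb rate quoted above, the elements end up unlinked only when an odd number of switches falls between them, and only in heterozygous virions. The Poisson model, the function names, and the example distances are assumptions introduced here and are not fitted to any of the datasets discussed.

```python
import math


def odd_crossover_probability(distance_kb: float, crossovers_per_kb: float = 1.0) -> float:
    """Probability of an odd number of template switches between two elements.

    Assumes switches follow a Poisson process with mean rate * distance.
    """
    if distance_kb < 0 or crossovers_per_kb < 0:
        raise ValueError("distance and rate must be non-negative")
    mean_switches = crossovers_per_kb * distance_kb
    return 0.5 * (1.0 - math.exp(-2.0 * mean_switches))


def expected_scrambled_fraction(
    distance_kb: float, heterozygous_fraction: float, crossovers_per_kb: float = 1.0
) -> float:
    """Expected fraction of proviruses with unlinked elements."""
    if not 0.0 <= heterozygous_fraction <= 1.0:
        raise ValueError("heterozygous_fraction must be between 0 and 1")
    return heterozygous_fraction * odd_crossover_probability(distance_kb, crossovers_per_kb)


for distance in (0.02, 0.2, 1.0, 2.7):  # example inter-element distances in kb
    print(f"{distance:>5} kb -> {odd_crossover_probability(distance):.3f}")
```

At short distances the result grows almost linearly with distance, and it saturates at 50% for widely separated elements, which is the behavior expected when linkage is fully randomized.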
Intra-molecular Recombination: The Multi-Guide Problem
Intra-molecular recombination is mechanistically distinct: RT jumps between repeated sequence elements on the same RNA molecule, deleting one copy and all intervening sequence. This cannot be prevented by carrier dilution, since it occurs on a single genome.
The shRNA gene therapy precedent. The problem was first characterized extensively in the HIV-1 gene therapy field, where multiple shRNAs are needed simultaneously to prevent viral escape. ter Brake et al. (2008) tested multi-shRNA vectors with repeated H1 or U6 promoters and found that identical promoters caused frequent cassette deletion21. In triple-cassette vectors with identical promoters, only 13% of SupT1 cell clones retained the intact construct. A comprehensive study by McIntyre et al. (2009) generated >500 clonal cell lines with 2–6 repeated shRNA cassettes and found deletion frequencies ranging from 2% to 36%, with central positions deleted most frequently22.
The TALEN disaster. TALEN DNA-binding domains consist of ~34 amino acid repeats with ~97% sequence identity at the nucleotide level. Mock et al. (2014) showed that lentiviral vectors package full-length TALEN mRNAs intact, but 100% of single-cell clones (25/25) showed recombination in the repeat domain after reverse transcription23. In 39 of 40 analyzed clones, recombination occurred between identical nucleotides of different repeats, resulting in in-frame elimination of complete repeat modules. This was so severe that the group developed RT-dead lentiviral particles (NRTLVs) for mRNA-only delivery as a workaround. The TALEN case represents the extreme of what happens when high-identity tandem repeats meet lentiviral reverse transcription.
Dual-gRNA vectors for CRISPR screens. Vidigal and Ventura (2015) directly demonstrated the problem: lentiviral vectors expressing gRNA pairs from two identical hU6 promoters lost the proximal gRNA cassette, confirmed by genomic PCR24. Replacing one hU6 with a synthetic murine U6 variant eliminated recombination. The VectorBuilder group (2024) quantified the relationship between shared sequence length and recombination frequency: a 249 bp duplicated stuffer reduced upstream reporter expression from 96% to 19% of cells, while identical CMV promoters (568 bp) left only ~12% expressing the upstream reporter25.
Wegner et al. (2024, Nature Biomedical Engineering) pushed this further with genome-wide quadruple-sgRNA (qgRNA) libraries26. Despite using four different promoters, they found ~30% of reads were chimeric after pooled lentiviral delivery. Rather than engineering the RT or the vector further, they computationally filtered chimeric reads, accepting the recombination rate and compensating with high cell numbers (~300 cells per qgRNA).
Recombination during reverse transcription is not a rare exception but a systematic constraint. It affects the design of every lentiviral construct, and the cost of ignoring it scales with library complexity.
Mitigation Strategies
Reducing Heterozygous Virion Formation
Carrier plasmid dilution. Feldman et al. (2018) demonstrated that co-transfecting a 1:100 library-to-carrier transfer vector ratio ensures >99% of library-containing virions are paired with carrier RNA, not another library member27. This nearly eliminates inter-molecular recombination but reduces effective titer ~100-fold, requiring scaled-up virus production. The approach has been used successfully in multiple Perturb-seq studies.
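The arithmetic behind the 1:100 ratio can be sketched as follows, assuming library and carrier RNAs are expressed and packaged equally well and pair at random. Both are simplifying assumptions, and the function and key names are hypothetical.

```python
from typing import Dict


def carrier_dilution(library_parts: float = 1.0, carrier_parts: float = 100.0) -> Dict[str, float]:
    """Virion composition under random pairing of library and carrier RNAs."""
    if library_parts <= 0 or carrier_parts < 0:
        raise ValueError("library_parts must be positive and carrier_parts non-negative")
    q = library_parts / (library_parts + carrier_parts)  # library share of packaged RNA
    library_library = q * q
    library_carrier = 2.0 * q * (1.0 - q)
    library_containing = library_library + library_carrier
    return {
        "library_rna_fraction": q,
        "library_containing_virion_fraction": library_containing,
        "library_virions_paired_with_carrier": library_carrier / library_containing,
    }


result = carrier_dilution(1.0, 100.0)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

Under these assumptions roughly 99.5% of library-containing virions carry a carrier partner at a 1:100 ratio, while only about 2% of all virions contain a library RNA at all, which is the source of the titer penalty.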
DIS subtype incompatibility. Chen et al. (2009) showed that replacing the DIS palindrome (subtype B: GCGCGC) with subtype C (GUGCAC) reduces cross-pool co-packaging ~9-fold28. This approach is limited by the small number of validated orthogonal DIS classes.
Eliminating Intra-molecular Repeats
Orthogonal promoters. The standard approach since ter Brake et al. (2008). Kabadi et al. (2014) assembled four sgRNAs under hU6, mU6, 7SK, and H1 via Golden Gate cloning29. Limited by the small number of compact Pol III promoters with validated gRNA expression.
Orthogonal scaffolds. The Adamson et al. (2016) vector used modified gRNA constant regions (cr2, cr3 variants) alongside distinct promoters. CROPseq-multi extends this with tRNA-processed arrays using orthogonal scaffolds12.
Dual-nuclease systems. CHyMErA (Gonatopoulos-Pournatzis et al., 2020) pairs SpCas9 with Cas12a, which uses a completely different guide RNA structure, inherently eliminating all scaffold homology13. Benchmarking by Najm et al. (2023) across ten digenic CRISPR technologies found that alternative tracrRNA sequences from SpCas9 consistently showed superior performance for combinatorial screens30.
Minimizing inter-element distance. The Big Papi vector uses antiparallel promoters to place spacers <200 bp apart, reducing recombination to ~9% (CROPseq-multi measurements). Cas12a systems with native crRNA array processing separate spacers by only ~20 bp, approaching negligible recombination.
Readout-Level Solutions
CROP-seq. Datlinger et al. (2017)9 placed the sgRNA cassette in the 3′ LTR. During provirus synthesis, LTR duplication copies the sgRNA to both ends, producing a Pol II-driven copy that can be captured directly in scRNA-seq. Because the sgRNA itself is the barcode, recombination between sgRNA and a distant barcode is irrelevant. This design has largely supplanted the original Perturb-seq architecture.
Direct-capture Perturb-seq. Replogle et al. (2020) developed modified sgRNA scaffolds with capture sequences enabling direct detection of multiple sgRNAs per cell, supporting combinatorial screens without barcodes.
Computational filtering. The Replogle et al. (2022) genome-scale CRISPRi Perturb-seq used direct guide capture, largely bypassing the recombination problem10. The Horlbeck/Replogle dual-sgRNA CRISPRi library accepted ~20–30% recombination and computationally removed affected cells, increasing required cell numbers proportionally.
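The cost of filtering can be expressed as a simple scale factor: if a fraction of cells is discarded as recombinant, the number of cells profiled has to grow by the reciprocal of the retained fraction to keep coverage constant. The sketch below assumes discarded cells are lost uniformly across the library. The function name is hypothetical.

```python
def cell_number_scale_factor(recombinant_fraction: float) -> float:
    """Factor by which profiled cells must increase to offset filtering."""
    if not 0.0 <= recombinant_fraction < 1.0:
        raise ValueError("recombinant_fraction must be in [0, 1)")
    return 1.0 / (1.0 - recombinant_fraction)


for fraction in (0.20, 0.30):
    print(f"{fraction:.0%} filtered -> {cell_number_scale_factor(fraction):.2f}x cells")
```

For the 20–30% range quoted above, this works out to roughly 1.25 to 1.43 times as many cells.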
Alternative Delivery Systems
Transposon-based delivery. PiggyBac and Sleeping Beauty involve no reverse transcription and therefore eliminate RT-mediated recombination entirely. They are limited by delivery efficiency (transfection/electroporation) and MOI control, and have been used for CRISPR screens in specific contexts but have not displaced lentiviral delivery broadly.
Non-reverse-transcribable LVs (NRTLVs). Mock et al. (2014) inactivated RT entirely for mRNA-only delivery of TALENs23. Incompatible with pooled screens requiring stable integration.
Gag-only VLPs. Recent systems (Haldrup et al., 2023; Jia et al., 2025) eliminate both RT and IN, packaging CRISPR RNPs or base editor mRNA. Useful for hit-and-run editing, not for screens requiring stable proviral barcodes.
The Unexplored Frontier: RT Engineering
The most mechanistically direct but least explored approach to reducing recombination is to modify reverse transcriptase itself. The virology literature provides extensive characterization of mutations that modulate recombination, but none have been tested in a lentiviral packaging plasmid for screening applications.
The connection domain mutations offer the most promising starting point. N348I alone reduced template switching from ~47% to ~32% in the GFP reconstitution assay while maintaining viral replication capacity18. Ehteshami et al. (2008) further characterized A360V as reducing secondary RNase H cleavages through both RNase H-dependent and -independent mechanisms31. These mutations were identified in the context of AZT resistance, but their recombination-reducing phenotype makes them attractive for vector engineering. Combining N348I with A360V and a modest RNase H site mutation (D549N) could plausibly achieve ~3-fold reduction in switching while keeping titer usable.
The gap between virology and functional genomics is striking. Despite decades of RT characterization, no group has built a packaging plasmid carrying these mutations and tested it for library delivery. The screening community treats the packaging plasmid as a black box. We estimate that a systematic effort (testing 5–10 RT mutant combinations for titer, recombination frequency, and screen performance) would require approximately 3–6 months of focused work and could quantify the practical benefit of RT engineering.
These effects should be additive with carrier dilution (which addresses inter-molecular events orthogonally) and with orthogonal vector elements (which address intra-molecular events at the repeat level). A combined strategy (RT mutations + carrier dilution + orthogonal cassette design) could in principle reduce total recombination to low single-digit percentages, a regime where computational filtering becomes nearly costless.
Summary and Future Directions
Lentiviral recombination in pooled screens emerges from the fundamental biology of the vector system: RNA dimerization is coupled to packaging, diploid genomes are the norm, and template switching is an intrinsic property of RT that serves essential functions (obligatory strand transfers) alongside generating unwanted crossovers. Two decades of screening experience have produced a toolkit of workarounds: carrier dilution, CROP-seq, orthogonal promoters/scaffolds, dual-nuclease systems, and computational filtering. These have been sufficient for most single-guide screen designs.
However, as the field moves toward combinatorial screens, multi-modal readouts, and higher-order perturbation arrays, the cost of recombination increases. Intra-molecular recombination in particular, which is not addressed by carrier dilution, imposes an increasingly severe constraint on vector design. The architectural solutions (orthogonal promoters, minimized inter-element distance) are approaching their physical limits: there are only a handful of validated compact Pol III promoters, and Cas12a systems, while elegant, lag in guide design tools and perturbation modality breadth.
The most promising and least explored direction is systematic RT engineering in the packaging plasmid. The virology and functional genomics communities would benefit from closer collaboration on this front. The mutations are characterized, the packaging plasmids are well-defined, and the assays (single-cycle GFP reconstitution, paired sgRNA sequencing after transduction) are established. What is missing is the systematic engineering effort to bridge these two fields.
References
- Naldini, L., et al. (1996). In vivo gene delivery and stable transduction of nondividing cells by a lentiviral vector. Science, 272(5259), 263–267. doi:10.1126/science.272.5259.263
- Zufferey, R., et al. (1997). Multiply attenuated lentiviral vector achieves efficient gene delivery in vivo. Nature Biotechnology, 15(9), 871–875. doi:10.1038/nbt0997-871
- Dull, T., et al. (1998). A third-generation lentivirus vector with a conditional packaging system. Journal of Virology, 72(11), 8463–8471. doi:10.1128/JVI.72.11.8463-8471.1998
- Shalem, O., et al. (2014). Genome-scale CRISPR-Cas9 knockout screening in human cells. Science, 343(6166), 84–87. doi:10.1126/science.1247005
- Wang, T., et al. (2014). Genetic screens in human cells using the CRISPR-Cas9 system. Science, 343(6166), 80–84. doi:10.1126/science.1246981
- Gilbert, L.A., et al. (2014). Genome-scale CRISPR-mediated control of gene repression and activation. Cell, 159(3), 647–661. doi:10.1016/j.cell.2014.09.029
- Horlbeck, M.A., et al. (2016). Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. eLife, 5, e19760. doi:10.7554/eLife.19760
- Dixit, A., et al. (2016). Perturb-Seq: Dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell, 167(7), 1853–1866. doi:10.1016/j.cell.2016.11.038
- Datlinger, P., et al. (2017). Pooled CRISPR screening with single-cell transcriptome readout. Nature Methods, 14(3), 297–301. doi:10.1038/nmeth.4177
- Replogle, J.M., et al. (2022). Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell, 185(14), 2559–2575. doi:10.1016/j.cell.2022.05.013
- Zennou, V., et al. (2000). HIV-1 genome nuclear import is mediated by a central DNA flap. Cell, 101(2), 173–185. doi:10.1016/S0092-8674(00)80828-4
- Russell, A.J., et al. (2024). CROPseq-multi: a versatile solution for multiplexed perturbation and readout in pooled single-cell CRISPR screens. bioRxiv. doi:10.1101/2024.03.26.586793
- Gonatopoulos-Pournatzis, T., et al. (2020). Genetic interaction mapping and exon-resolution functional genomics with a hybrid Cas9–Cas12a platform. Nature Biotechnology, 38, 638–648. doi:10.1038/s41587-020-0437-z
- Levy, D.N., et al. (2004). Dynamics of HIV-1 recombination in its natural target cells. PNAS, 101(12), 4204–4209. doi:10.1073/pnas.0306764101
- Ciuffi, A., et al. (2005). A role for LEDGF/p75 in targeting HIV DNA integration. Nature Medicine, 11(12), 1287–1289. doi:10.1038/nm1329
- Nikolenko, G.N., et al. (2004). Mutations in the connection domain of HIV-1 reverse transcriptase increase 3′-azido-3′-deoxythymidine resistance. PNAS, 101(36), 13160–13165. doi:10.1073/pnas.0404167101
- Mbisa, J.L., et al. (2005). RNase H activity of reverse transcriptase modulates the frequency of genetic rearrangements in retroviral vectors. Journal of Virology, 79(24), 15573–15583. doi:10.1128/JVI.79.24.15573-15583.2005
- Nikolenko, G.N., et al. (2007). Mechanism for nucleoside analog-mediated abrogation of HIV-1 replication: balance between RNase H activity and nucleotide excision. PNAS, 104(10), 4218–4223. doi:10.1073/pnas.0700139104
- Xie, S., et al. (2018). Frequent sgRNA-barcode recombination in single-cell perturbation assays. PLoS ONE, 13(6), e0198635. doi:10.1371/journal.pone.0198635
- Adamson, B., et al. (2016). A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell, 167(7), 1867–1882. doi:10.1016/j.cell.2016.11.048
- ter Brake, O., et al. (2008). Lentiviral vector design for multiple shRNA expression and durable HIV-1 gene therapy. Molecular Therapy, 16(3), 557–564. doi:10.1038/sj.mt.6300382
- McIntyre, G.J., et al. (2009). Multiple shRNA combinations for near-complete coverage of all HIV-1 strains. AIDS Research and Therapy, 6, 1. doi:10.1186/1742-6405-6-1
- Mock, U., et al. (2014). Novel lentiviral vectors with mutated reverse transcriptase for mRNA delivery of TALE nucleases. Scientific Reports, 4, 6409. doi:10.1038/srep06409
- Vidigal, J.A. & Ventura, A. (2015). Rapid and efficient one-step generation of paired gRNA CRISPR-Cas9 libraries. Nature Communications, 6, 8083. doi:10.1038/ncomms9083
- VectorBuilder (2024). Technical note: lentiviral recombination with duplicated sequences. vectorbuilder.com
- Wegner, M., et al. (2024). Genome-wide functional screens with quadruple sgRNA arrays. Nature Biomedical Engineering. doi:10.1038/s41551-024-01192-1
- Feldman, D., et al. (2019). Optical pooled screens in human cells. Cell, 179(3), 787–799. doi:10.1016/j.cell.2019.09.016
- Chen, J., et al. (2009). High efficiency of HIV-1 genomic RNA packaging and heterozygote formation revealed by single virion analysis. PNAS, 106(32), 13535–13540. doi:10.1073/pnas.0906822106
- Kabadi, A.M., et al. (2014). Multiplex CRISPR/Cas9-based genome engineering from a single lentiviral vector. Nucleic Acids Research, 42(19), e147. doi:10.1093/nar/gku749
- Najm, F.J., et al. (2018). Orthologous CRISPR–Cas9 enzymes for combinatorial genetic screens. Nature Biotechnology, 36, 179–189. doi:10.1038/nbt.4048
- Ehteshami, M., et al. (2008). Connection domain mutations N348I and A360V in HIV-1 reverse transcriptase enhance resistance to 3′-azido-3′-deoxythymidine through both RNase H-dependent and -independent mechanisms. Journal of Biological Chemistry, 283(32), 22222–22232. doi:10.1074/jbc.M803521200
- Vink, C.A., et al. (2017). Eliminating HIV-1 packaging sequences from lentiviral vector proviruses enhances safety and expedites gene transfer for gene therapy. Molecular Therapy, 25(8), 1790–1804. doi:10.1016/j.ymthe.2017.04.028
- Graham, F.L., et al. (1977). Characteristics of a human cell line transformed by DNA from human adenovirus type 5. Journal of General Virology, 36(1), 59–72. doi:10.1099/0022-1317-36-1-59
- DuBridge, R.B., et al. (1987). Analysis of mutation in human cells by using an Epstein-Barr virus shuttle system. Molecular and Cellular Biology, 7(1), 379–387. doi:10.1128/mcb.7.1.379-387.1987
- Pear, W.S., et al. (1993). Production of high-titer helper-free retroviruses by transient transfection. PNAS, 90(18), 8392–8396. doi:10.1073/pnas.90.18.8392
- Gill, D.R., et al. (2020). Optimized transgene delivery using third-generation lentiviruses. Current Protocols in Molecular Biology, 133(1), e125. doi:10.1002/cpmb.125
- Azizi, S.A., et al. (2020). Improved third-generation lentiviral packaging with pLKO.1C vectors. BioTechniques, 69(5), 371–378. doi:10.2144/btn-2019-0155