Structural, Functional and Genomic Diversity of Plant NLR proteins: an Evolved Resource for Rational Engineering of Plant Immunity.

Plants employ a diverse intracellular system of NLR (Nucleotide-binding, Leucine-rich Repeat) innate immune receptors to detect pathogens of all types. These receptors represent valuable agronomic traits that plant breeders rely on to maximize yield in the face of devastating pathogens. Despite their importance, the mechanistic underpinnings of NLR-based disease resistance remain obscure. The rapidly increasing numbers of plant genomes are revealing a diverse array of NLR-type immune receptors. In parallel, mechanistic studies are describing diverse functions for NLR immune receptors. In this review, we intend to broadly describe how the structural, functional and genomic diversity of plant immune receptors can provide a valuable resource for rational engineering of plant immunity.


Introduction: Layers of the plant immune system.
Plants have evolved an elaborate innate immune system to detect and limit the growth of potential pathogens. Because plants lack an adaptive immune system with mobile cells, each plant cell must be able to detect and defend appropriately against pathogens. Plants mount a sophisticated multilayered defense response including local physical barriers and chemical weapons, systemic signaling to prime uninfected cells and programmed cell death to limit pathogens that rely on living host cells. To present an appropriate response, plants must have a mechanism to identify microbes of all types and discriminate between friend and foe. To integrate signals from their biotic environment, plants rely on a diverse collection of immune receptors, often numbering in the hundreds per genome (53; 59; 99). Despite their numbers, exactly how a limited set of genomically-encoded immune receptors can protect plants against a deluge of rapidly evolving microbial pathogens remains unknown.
Plant immune receptors come in two broad classes that have been proposed to play complementary roles (55). The first class of immune receptors, termed Pattern Recognition Receptors (PRRs), monitors the extracellular environment for signals derived from microbes (25; 112). The PRR class is responsible for transducing recognition of pathogens across the plasma membrane to activate a defense response known as PAMP (pathogen-associated molecular pattern)-triggered immunity, or PTI (25; 134). PTI-activating signals are usually conserved, essential microbe-derived or generated molecules (15; 86). PRRs typically have an extracellular ligand-binding domain, a transmembrane-spanning domain, and an intracellular kinase domain (145). PTI is sufficient to render plants immune to a large number of potential pathogens. Microbes that are successful pathogens on a given host have evolved tools to defeat PTI. In many cases, these evolved tools are secreted or translocated proteins known as "effectors" or small molecule toxins (16; 119). The second layer of the plant immune system has evolved to defeat these evolved pathogens by recognizing pathogen virulence tools as reliable indicators of pathogenesis. This second layer of defense is composed of NLR-type (Nucleotide-binding, Leucine-rich Repeat or, alternatively, NOD-Like Receptor) immune receptors (139). These NLR receptors recognize evolved pathogens either directly, by the presence, or indirectly, by the activity, of translocated effectors and toxins. This effector-triggered immunity (ETI) "reboots" plant immune responses dampened by pathogen manipulation and results in strong disease resistance, often associated with programmed cell death known as the hypersensitive response (HR) (26). In some interesting cases, the boundaries between PTI and ETI are less distinct and PRR-type receptors can behave genetically like ETI-triggering NLRs (113). Together, PRRs and NLRs must be sufficient to recognize and discriminate between environmental microbes, symbionts and pathogens (47).
Since the rise of agriculture, farmers and breeders have selected for resistance traits to maximize yield by reducing losses to pathogens. The genetic behavior of these traits led to Flor's gene-for-gene hypothesis, which proposed that single dominant genes in the plant somehow recognized single dominant genes in the pathogen (39). In the 20th century, researchers realized that plant resistance traits are often encoded by NLR immune receptors. One of the most striking and unexpected findings was that immunity to pathogens of all kingdoms could be encoded by a single stereotyped class of immune receptor. Since then we have greatly expanded our knowledge of NLRs, but many basic mechanistic facts remain poorly understood. While commonalities between NLRs were striking when first discovered, intensive study has exposed a great deal of diversity in both their structure and function. As increasing numbers of genomes are sequenced, evidence for NLR diversity at the species and population level is also rapidly accumulating. In this review we will focus on NLRs, their structural, functional and genomic diversity, and progress and prospects for engineered disease resistance.

NLRs are multi-domain molecular switches.
Upon their identification, plant NLR proteins were recognized to contain conserved domains in a stereotypical configuration: a variable N-terminal domain, a central nucleotide binding site (NBS) domain with similarity to the AAA-ATPase family and a C-terminal LRR. The variable N-terminal domains are typically a TIR (Toll-interleukin-1 receptor) or a CC (Coiled-Coil) domain (Figure 1). More recently, evolutionarily distinct classes of CC domains have been described, and many CC-type NLRs (CNLs) have been refined as RPW8-type CC-NLRs (CCr-NLRs or RNLs) based on their similarity to the CC-only disease resistance gene RPW8 (24; 132; 144). How these domains interact during both the inactive resting state and the activated, signaling state remains unknown. How plant NLRs function is ultimately a difficult structural biology problem. Unfortunately, no structure of a full-length plant NLR has yet been determined. NLR proteins are proposed to function as multi-domain switches with inactive and active states driven by the nucleotide-bound status of the central NBS domain; binding to ADP promotes a closed, inactive conformation and binding to ATP promotes an open, active conformation (Figure 1) (reviewed in (12)). The basis for thinking of plant NLRs as NBS-driven switches is largely drawn from structurally-related NBS domains of animal proteins (56; 109). Homology modeling to the NBS domain of animal proteins such as the apoptosome component APAF1 reveals that NLRs contain conserved motifs with predictable functions (109). The P-loop nucleotide binding motif is required for nucleotide binding and can be reliably mutated to generate a loss of function mutant (7; 32; 111; 127). A second motif, characterized by the amino acid sequence MHD, can be mutated to generate gain of function autoactive alleles (7; 28; 50; 127). Despite the strikingly similar domain structure of plant and animal NLRs, as well as mechanistic similarities, they are likely products of convergent evolution (117).
If the NBS domain is responsible for controlling the switch between resting and activated states of NLRs, which domain is responsible for signaling downstream? Deletion analysis of the various domains has found that for a number of NLRs the N-terminal CC, CCr or TIR domain is required and often sufficient for cell death signalling (10; 24; 76; 108). Thus the N-terminus is proposed to transduce the activation signal to downstream pathways. In the case of both CC and TIR domains, oligomerization of the N-terminus is proposed to be the critical event in activation. At the whole NLR level, oligomerization can be effector induced, as with the tobacco N gene (81) or with Arabidopsis RPP1 (97). Other NLRs, such as the RPS4/RRS1 complex, appear to constitutively self-associate pre-activation, and then subsequently form post-activation N-terminal multimers to activate defense (51) (Figure 1). Several dimeric CC and TIR structures now exist for NLR N-termini, but the exact conformations of resting and activated multimers and the extent of their oligomerization remain elusive (33; 138). How N-terminal domain multimers form after nucleotide binding drives a conformational change in the NBS remains an important unanswered question. While there is much agreement that NLRs behave as switches, exactly how these three domains respond to pathogens and activate cell death remains mechanistically unclear (77; 110).

NLRs can directly or indirectly recognize pathogen effectors
Evolved pathogens of all kingdoms deliver intracellular effector molecules to immunosuppress and manipulate the host. Thus, these effectors are excellent, reliable signals for the plant to monitor. In many cases, NLRs can directly recognize pathogen effectors by binding to them (Figure 1). There do not appear to be generalizable rules to how or where NLRs bind pathogen effectors. The LRR of the rice CNL Pi-ta was first proposed as an effector binding domain (54). A role in substrate binding makes intuitive sense given the known role of LRRs as diverse substrate-binding platforms (40). Subsequently, LRRs have been found to directly bind diverse effectors and also to be under diversifying selection, presumably driven by effector diversity and diversification (44; 64). Beyond the LRR, other domains also clearly regulate recognition. In the case of genes at the L locus of flax, TNLs with identical LRRs but slightly divergent TIR domains have distinct specificities (34).
How effector binding to the LRR (or other domains) opens NLRs and promotes oligomerization remains unclear. Analysis of alleles of the flax TNLs L6 and L7 suggests that in the absence of pathogens NLRs may be in equilibrium between the "on" and "off" states, and that effectors stabilize the active, ATP-bound state (9). Consistent with this model, for L6 and L7, the activating effector was found to bind inactive versions of the NLR more weakly than active forms (9). If closed, inactive NLRs have multiple points of intramolecular contact between or among domains, then effectors could bind to any of them and disrupt a closed state, or stabilize an open one.

NLRs can indirectly recognize pathogen effectors by guarding important immune targets
NLRs can also indirectly recognize the presence of pathogen effectors by monitoring their impact on host targets (Figure 1). This model was first proposed to explain the Pto/Prf/AvrPto system, where targeting of the Pto kinase by AvrPto is detected by the CNL Prf (118). This "guard hypothesis" proposes that by guarding important, conserved targets of pathogens (or decoys of targets) the plant immune system can detect all pathogens without a separate, genomically encoded receptor for each pathogen (27). Another important outcome of the guard hypothesis is that we can better understand the plant immune system by knowing the set of proteins that evolution has selected to be guardees of NLRs. Indirect recognition can allow plants to detect mechanistically distinct effectors that target the same host protein.

Downstream signalling events are not understood.
Surprisingly, the main function of NLRs, the downstream activation of disease resistance and cell death, remains mechanistically obscure. How effector activation eventually is transduced into disease resistance, and often cell death, is unknown for any NLR. Downstream events have proven remarkably resistant to forward genetic analysis. The lack of mutable genes required for NLR signalling has prompted hypotheses of redundancy or lethality. An alternative is that the pathways are extremely direct and lack a downstream element. A direct action hypothesis proposes that NLRs are capable of directly activating immune responses and/or killing cells. Intriguingly, it has been proposed that the CNLs Rx1 and I-2 can bind and deform DNA in an effector-dependent manner (36; 37). How DNA binding and deformation by NLRs promotes an "immune-competent" state remains to be determined, but could represent an extremely direct and redundant pathway.
There are, however, a few identified genes that are required for NLR function. Chaperones such as HSP90, RAR1 and SGT1 are generally required for NLR protein accumulation (100). Genetic analysis of suppressors of autoactive NLRs has revealed a number of novel regulators of NLR homeostasis (reviewed in (72)). Signalling components downstream of NLR accumulation are rarer. Interestingly, the downstream genes appear to split NLR function by TIR vs CC class. All TNLs tested require a lipase-like gene called EDS1 (125). In addition to EDS1, TNLs also require EDS1-like family members such as PAD4 and SAG101, which function in complexes with EDS1 (121). These proteins interact with TNLs and shuttle in and out of the nucleus. Their biochemical function remains mysterious, as conserved lipase catalytic residues are not required for supporting TNL immune function (121). CNLs do not appear to directly require EDS1, but several are strongly dependent on the function of NDR1, a protein with homology to integrins (62). Exactly how NDR1 is required for CNL function remains obscure, and NDR1 function may not be limited to NLR signalling, as ndr1 mutant plants also have altered responses to compatible Pseudomonas syringae, which lacks recognized ETI-triggering effectors (62).

Diversity of NLR domain structure
Although NLRs were initially recognized to have a stereotyped domain structure, sequencing of the Arabidopsis genome revealed an unexpected diversity in NLR-like sequences (84). Not only do full CNL and TNL receptors exist, but also "truncated" versions that can lack LRR or NBS-LRR domains (Figure 1). These truncated forms are reminiscent of truncated animal immune receptors such as Myd88, a TIR protein that serves as a cytoplasmic adapter for a number of Toll-like receptors. Myd88 and similar TIR adaptor proteins act downstream of multiple receptors to transduce receptor activation (90). In the case of truncated plant TNLs, their function appears to be more specific, although the number of cases tested remains low. RLM3, a TIR-NBS protein, is the first example of a "truncated TNL" protein that is required for disease resistance (103). Other TIR-NBS proteins such as TN2 and CHS1 have loss of function or overexpression phenotypes consistent with immune receptors (123; 136; 142). The TIR-only protein RBA1 is required for cell death in response to the type III effector protein HopBA1 (Figure 1) (87). Exactly how truncated TNLs function in the immune system remains unclear, but a compelling hypothesis is that they form heterocomplexes with full-length TNLs. Consistent with a hetero-interaction hypothesis, chs1 autoimmunity phenotypes were recently shown to require SOC3, a full-length TNL (Figure 1) (140).

Genomic pairs and "integrated domains" are a shortcut to novel virulence targets
The most important recent NLR discovery has been the realization that some NLRs function as genomically-linked pairs (20; 126; 137). These dual NLR systems are proposed to be made up of a signaling NLR and a receptor NLR (18). The signalling NLR behaves much like a traditional NLR and guards the receptor NLR. Remarkably, the receptor NLR behaves as an effector binding platform, containing unusual motifs (i.e., not CC/TIR, RPW8, NBS or LRR domains) that are recognized by pathogen effectors. Integrated domains can be found in many locations within an NLR (Figure 1). These effector-interacting NLR motifs have similarity to the intended pathogen virulence targets and have been referred to as "integrated domains", or "integrated decoys" (IDs) (18; 88). In the case of RPS4 and RRS1, the two molecules preexist as a complex, and the signaling TNL RPS4 is activated after the effector PopP2 acetylates an RRS1-integrated WRKY transcription factor domain (Figure 1). The "intended" targets of PopP2 are WRKY transcription factors; PopP2 acetylation targets the DNA-binding domain of WRKY transcription factors required for proper immune responses (69). Some RRS1 alleles are capable of recognizing both PopP2 and the sequence-unrelated effector AvrRps4, apparently through mechanistically distinct targeting of RRS1 (96). Similarly, CNLs such as RGA4/RGA5 also exist in genetically linked pairs that contain a decoy domain (RATX1/HMA, a putative metal-binding domain in RGA5) that is targeted by multiple effectors (19; 78).
As more plant genomes are made available, the list of atypical domains integrated into NLRs is rapidly expanding (Supplemental Table 1). These genomic pairs reveal important information solely through their primary sequence and can be identified across the plant phylogeny (65). There are many useful hypotheses that follow from these observations. First, pairs at a locus (especially head to head) are now reasonably hypothesized to function as a unit. To test this hypothesis, mutations in one locus should suppress the second. Accordingly, transient reconstruction assays of paired loci should include both genes. Second, unusual domains should be considered as effector binding targets and their homologs as relevant to pathogenicity. Thus the universe of NLR IDs across the plant phylogeny is now a minimal set of pathogen virulence targets. Many of these IDs have not been previously indicated as pathogen virulence targets. These domains are a hypothesis generator based solely on genome sequences. While it has not yet been demonstrated, it remains an open possibility that some IDs may retain their former biochemical function (131).

Helpers and genetic interactions across NLR-type.
Canonically, NLRs have been associated with recognizing a specific pathogen and conferring qualitative disease resistance. More recently, "helper" NLRs have been identified that are required for (or "help") the function of other NLRs. One of the first cloned NLRs was the N gene in tobacco, a TNL that confers resistance to the tobacco mosaic virus (124). To identify other components of N-mediated disease resistance, Peart et al. performed a VIGS assay looking for loss of N-mediated cell death (92). This screen identified NRG1 as a CCr-containing RNL required for the function of the TNL N gene. NRG1 is a member of a gene family that also includes ADR1 RNL proteins. Silencing NRG1 and ADR1 in combination resulted in loss of cell death mediated by the CNL Rx2, while single silencing constructs had no effect. Similar redundancy and cross-type interaction was reported in Arabidopsis for the ADR family (13). Interestingly, in Arabidopsis the function of ADR1-L2 as a helper NLR is independent of the P-loop, which is typically required for ETI across NLRs (13). Despite this functional divergence, the N-termini of ADR1 and NRG1 proteins are capable of triggering cell death, indicating that helpers may have functions as ETI-triggers as well as helpers for other NLRs (24). Intriguingly, RNL NRG1 helpers appear to have been co-retained or lost with TNLs multiple times during plant evolution (24). More recently, the NRC family of CNL helpers has been found to be required for clade-specific NLR function (130).

PigmR/PigmS: A novel genomically-paired NLR "helper" mechanism to reduce the cost of resistance
Not all helper NLRs are positive regulators of another NLR's function. In rice, cloning of the rice blast gene Pigm revealed a novel, agronomically important mechanism imparted using a canonical CC-NBS-LRR domain structure (30). Deng et al. found that PigmR, which encodes a CNL, is responsible for broad-spectrum, durable resistance to rice blast. Interestingly, PigmR is found at an NLR cluster with an extremely closely related partner CNL, PigmS (only four polymorphic amino acids between PigmR and PigmS).
Intriguingly, this genomically-linked pair does not follow the integrated decoy model of RPS4/RRS1. Instead, PigmS heterodimerizes with PigmR and suppresses PigmR-based resistance. This suppression apparently counteracts a cost of PigmR-mediated resistance, as it results in increased grain yield. The PigmS impact on productivity may be determined in a tissue-specific manner: while PigmR is constitutively expressed throughout the plant, PigmS is pollen-specific. This presents a novel, potentially engineerable, mechanism of NLR improvement: tissue-specific expression of dominant-negative "inhibitor NLRs" to decrease fitness costs of NLRs.

Unveiling the diversity of NLR-coding genes
Much of the mechanistic study of NLR function described above has been derived from a limited number of model organisms and genes. With improvements in the cost and capabilities of next-generation sequencing technologies, genomic approaches have become front line resources to characterize NLR diversity across plant species and populations. To date, the NLR repertoires of over 100 species are available, or can be easily obtained from genome annotations (Supplemental Table 2 lists genome-wide NLR interrogation studies performed to date). Taken together, those repertoires allow extensive comparative analyses and the definition of evolutionary paths.

An intricate evolutionary history explains current diversity
Knowledge of NLR diversity and distribution can reveal novel sources of resistance with enormous biotechnological potential. Preliminary efforts towards characterization of NLR diversity started immediately after the first R-genes were cloned in the mid-1990s (8; 85; 124). At that time, comparative analysis focused on the LRR region, given the results from seminal studies showing significant clustering of nonsynonymous substitutions in that region (14; 34; 80; 89; 122), and the preliminary indications showing that R gene specificity was determined by LRRs (40; 54; 105). An early population-level study aimed at characterizing intraspecific NLR polymorphisms was carried out in Arabidopsis thaliana by Bakker et al. in 2006. This study provided a snapshot of LRR domain diversity across 27 NLRs from 96 accessions of Arabidopsis (4). The methodological innovation, at the time, was to compare LRR polymorphisms to a genome-wide empirical distribution of polymorphisms, rather than to neutral models. This approach identified RPP13 as highly polymorphic and with signatures of balancing selection, adding to the already known genes under balancing selection: RPP1 (14), RPS2 (17), RPP5 (89), RPM1 (104), RPS5 (114), and elucidated seven more loci with weaker balancing selection signatures: AT1G56540, AT1G59780, AT3G50950, AT4G14370, AT4G14610, AT5G58120, and AT5G63020 (4). PCR-based approaches have been employed to address sequence recombination, conversion, indels and copy-number variation at particular gene clusters in species other than A. thaliana (5; 66; 67; 73). Those studies aimed to characterize the complex selective forces on the evolution of individual genes or clusters.
Genome-wide studies, such as those in A. thaliana (46) and rice (133), have provided a deeper insight into NLR distribution, diversity and evolution. In those studies, researchers found that genetically clustered NLR genes frequently swap sequences and are thus more polymorphic than singleton loci. Distinct evolutionary paths and rates for TIR- and non-TIR-containing NLRs are apparent in A. thaliana (23). NLRs evolve rapidly, and copy number variants were more often found in NLR genes relative to the genome as a whole. 33.3% of NLR-coding genes from the reference Col-0 accession appeared to be deleted in at least one of the 80 accessions, compared to 12.5% of genes in the entire genome (46). An equivalent number of NLRs must be absent from the Col-0 reference genome. This indicates that there is much to be learned from a deep dive into closely related genomes.
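The comparison between NLR genes and the genome as a whole reduces to a simple calculation over a presence/absence table. The following is a minimal sketch of that calculation; the input format, function name and gene identifiers are hypothetical, and the toy values are invented for illustration rather than taken from the cited study.

```python
def fraction_absent_in_any(presence):
    """Return the fraction of genes missing from at least one accession.

    `presence` maps a reference gene ID to a list of booleans, one per
    accession (True = gene detected). This input format is an assumption.
    """
    if not presence:
        raise ValueError("presence table is empty")
    # A gene counts as variable if any accession lacks it.
    absent_somewhere = sum(1 for calls in presence.values() if not all(calls))
    return absent_somewhere / len(presence)


# Toy data (invented): three reference genes scored across four accessions.
toy_nlrs = {
    "NLR_a": [True, True, True, True],
    "NLR_b": [True, False, True, True],
    "NLR_c": [True, True, True, True],
}
nlr_fraction = fraction_absent_in_any(toy_nlrs)
print(f"{nlr_fraction:.1%} of toy NLR genes are absent in at least one accession")
```

Running the same function on the NLR subset and on all annotated genes would give the two percentages being compared. Note that this only captures genes present in the reference, which is the limitation the paragraph above points out.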
With the advent of second and third generation sequencing technologies, efforts definitively shifted towards genome-wide comparative studies. An early attempt to characterize genome-wide variation among 18 A. thaliana ecotypes employed paired-end Illumina reads and a combination of reference-based and de novo assembly (41). Bioinformatic limitations of short-read assembly forced the authors to limit the analysis to single-copy regions homologous to the reference (Col-0) genome. Accordingly, analysis of copy-number and structural variation was hampered, as well as the discovery of novel NLRs (93). The currently reported A. thaliana pan-NLRome (At-panNLRome), defined as the union of NLR genes of the different ecotypes, is thus restricted to the genes known in a single reference accession (Col-0) (93). Nevertheless, accumulated knowledge shows that the At-panNLRome expands beyond the Col-0 NLR repertoire (31; 89). An interesting example is the A. thaliana DANGEROUS MIX2 (DM2) cluster, which in Col-0 contains two RPP1-like genes, but in Ler contains up to seven RPP1-like genes (21; 107). To date, it is still unknown if NLRs in the different DM2 loci contribute to recognition of different pathogens. Further expansion of the At-panNLRome will help describe how NLR genes expand and contract across populations in response to pathogen selective pressures.
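The definition of a pan-NLRome as the union of NLR genes across ecotypes can be illustrated with plain set operations. This is only a sketch: it assumes that NLRs have already been grouped into identifiers comparable across accessions (the "OG" labels below are hypothetical), and the "core" and "accessory" partitions follow common pangenome usage rather than anything defined in the studies cited above.

```python
def pan_and_core(repertoires):
    """Split per-accession NLR sets into pan, core and accessory sets.

    `repertoires` maps an accession name to an iterable of NLR group IDs
    that are comparable across accessions (an assumption of this sketch).
    """
    sets = [set(genes) for genes in repertoires.values()]
    if not sets:
        return set(), set(), set()
    pan = set().union(*sets)          # present in at least one accession
    core = set.intersection(*sets)    # present in every accession
    return pan, core, pan - core      # accessory = pan minus core


# Toy repertoires (invented identifiers, not real gene counts).
toy = {
    "Col-0": {"OG1", "OG2", "OG3"},
    "Ler": {"OG1", "OG2", "OG4", "OG5"},
}
pan, core, accessory = pan_and_core(toy)
print(sorted(pan), sorted(core), sorted(accessory))
```

A pan-NLRome built only from reads mapped to Col-0 would, in this toy case, miss OG4 and OG5 entirely, which is the restriction described above.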
The NLR content of the A. thaliana Col-0 genome was first described in 2003 (83); since then, our knowledge of NLR gene content across the plant phylogeny has rapidly expanded. Bioinformatic comparative analysis opened a new avenue for studying NLR genetic diversity and evolution. In a recent study, a panel of 6,000 NLR genes from 22 angiosperm species was incorporated in a comparative analysis and phylogenetic reconstruction. The reported results elucidate how all currently known NLRs likely diversified from 23 NLRs belonging to three distinct ancestral TNL, CNL and RNL lineages (99). A similar ancestral state reconstruction analysis using 38 sequenced species representing the six kingdoms of life (eubacteria, archaebacteria, fungi, protists, plants and animals) showed that the most basal plants analyzed had a very limited NLR repertoire (30 NLRs in Physcomitrella patens and 17 in Selaginella moellendorffii) (135). On the other hand, higher plant genomes typically encode numerous NLR genes, with hundreds of genes in gymnosperm and angiosperm genomes. Detailed analysis of plant lineages reveals expansion and contraction of particular NLR classes (99).
The scenario of NLR gene expansion and contraction is complex. While TNLs are expanded in Brassicaceae, the opposite is observed in Poaceae, with TNL depletion and an expansion of the CNL class (Table 1). Family-level evolutionary paths are not that clear across the plant phylogeny. Comparative analysis of Fabaceae has shown multiple expansion and contraction events, leading to an increase in the number of NLRs in Cajanus cajan and Medicago truncatula and a decrease in the NLR repertoire of Lotus japonicus, Phaseolus vulgaris and Cicer arietinum (143). Interestingly, whole genome duplication does not seem to necessarily contribute to a net increase in the number of NLR genes. NLRs seem rather to be maintained in a dosage- or diploidization-sensitive scheme. In fact, the mechanisms governing NLR gene expansion and/or contraction in the different species might depend on the intraspecific diversity, widespread or restricted geographic distribution, ploidy, mating system (inbreeding or outcrossing), generation time and domestication history (in the case of crops).
As more plant genomes are sequenced, comparative analyses of NLRs and NLRomes will provide a better understanding of their diversity and evolutionary history. For that, the establishment of rigorous and reproducible analysis pipelines will be key. In some cases, analyses of the NLR content reported by different groups can be strikingly inconsistent (Table 1). The observed variance might be due, at least in some cases, to the use of different genome annotation versions, or to which bioinformatic tools and settings are used to perform the analysis. Use and reporting of standardized methods could reduce variance reported between publications and facilitate comparisons [see SIDEBAR].

Good practices for Genome-wide identification of NLRs.
Exploratory descriptions of NLR repertoires provide a valuable glimpse at the species-level NLR diversity and allow comparative NLRome analysis across the different taxonomic clades (3; 60; 65; 95; 99). To that end, a variety of bioinformatic tools have proven useful to identify NLR genes from genome sequences and annotated gene models. Available methods include ab initio predictors, identification of functional domains or motifs, similarity searches against databases, PCR amplification with partially degenerated primers, and R-gene enrichment and sequencing (several studies in Supplementary Table 2 use those methods).
Accurate identification of protein domains in a collection of sequences is critical to defining and organizing proteins into families. Multidomain NLR proteins can be further classified according to domain architectures. To this end, hidden Markov model (HMM) profiles have become a popular means to identify protein domains. High quality, manually curated and biologically relevant HMM profiles for a wide range of domains are available via Pfam (38), TIGRFAM (48) and SMART (71). Each HMM profile in the Pfam-A database contains curated bit score thresholds (38).
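One way curated bit score thresholds support reproducibility is that every group applying the same per-profile cutoff keeps the same set of domain hits. The sketch below shows that filtering step on already-parsed hits; the tuple layout, the function name and the numeric thresholds are all hypothetical placeholders, not values taken from Pfam.

```python
def filter_hits(hits, thresholds):
    """Keep domain hits whose bit score meets the per-profile threshold.

    `hits` is a list of (protein_id, domain_name, bit_score) tuples and
    `thresholds` maps a domain name to its curated cutoff. Hits for domains
    without a listed threshold are dropped (a design choice of this sketch).
    """
    kept = []
    for protein, domain, score in hits:
        cutoff = thresholds.get(domain)
        if cutoff is not None and score >= cutoff:
            kept.append((protein, domain, score))
    return kept


# Invented cutoffs and hits, for illustration only.
thresholds = {"NB-ARC": 25.0, "TIR": 22.0}
hits = [
    ("p1", "NB-ARC", 80.2),
    ("p1", "TIR", 15.0),     # below the TIR cutoff, discarded
    ("p2", "TIR", 40.1),
    ("p3", "LRR", 30.0),     # no cutoff listed, discarded
]
kept = filter_hits(hits, thresholds)
print(kept)
```

Reporting the profile versions and cutoffs alongside the resulting counts would make NLR tallies from different groups easier to compare.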
One limitation in reproducibly defining NLRomes may simply be the lack of a unified definition for NLR-coding genes. Given the current mechanistic understanding of plant NLR biology, a putative NLR-coding gene would contain either an NB-ARC, or TIR, or RPW8 domain. LRR domains commonly occur in other protein families and should not be considered part of the primary definition of NLRs. CC folds cannot be easily detected using domain profiles, and often require secondary structure prediction with tools such as Paircoil2 (79), MARCOIL (29), COILS (75), MultiCoil (129), and PCOILS (45). The different CC prediction tools generate slightly different outputs, but their union and/or intersections can be informative and assist identification of high probability CC signatures.
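Combining the outputs of several CC predictors by union and intersection can be sketched with a simple vote count. The tool names, the input format (one set of protein IDs per tool) and the `min_tools` parameter below are assumptions made for illustration; parsing the actual output of each predictor is outside the scope of this sketch.

```python
from collections import Counter


def cc_consensus(calls, min_tools=2):
    """Summarize coiled-coil calls from several predictors.

    `calls` maps a tool name to the set of protein IDs it predicts to carry
    a CC. Returns (union, intersection, supported), where `supported` holds
    proteins called by at least `min_tools` tools.
    """
    votes = Counter(pid for ids in calls.values() for pid in set(ids))
    union = set(votes)
    intersection = {pid for pid, n in votes.items() if n == len(calls)}
    supported = {pid for pid, n in votes.items() if n >= min_tools}
    return union, intersection, supported


# Invented calls from three hypothetical predictors.
calls = {
    "tool_A": {"p1", "p2"},
    "tool_B": {"p1", "p3"},
    "tool_C": {"p1", "p2"},
}
union, intersection, supported = cc_consensus(calls)
print(sorted(union), sorted(intersection), sorted(supported))
```

The union is the most permissive candidate list, the intersection the most conservative, and a majority-style cutoff sits in between.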
Criteria for defining NLRs have changed as our mechanistic understanding of NLRs has deepened. Historically, NB-ARC alignments and NB-ARC motifs have been used to discriminate between TIR and non-TIR NLRs (40; 82; 91). When the first NB-LRR and RPW8 NLRs were reported (132), it became relevant to distinguish between CC-NB-LRRs, RPW8-NB-LRRs and NB-LRRs. A curated RPW8 HMM profile is available from Pfam-A, allowing distinction between CC and CCr classes. TIR and TIR-NB proteins are increasingly being described with immune receptor-like function (discussed above), thus a broader definition of "NLR" is likely warranted. Recent reports have also pointed to the importance of considering TIR_2 domains in addition to TIR domains when defining NLRs (95).
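The working definition discussed in this section (an NB-ARC, TIR or RPW8 domain qualifies a protein; LRR and CC alone do not) can be written down as a small classifier over detected domain labels. This is a sketch under stated assumptions: the label strings, the function name and the precedence TIR over RPW8 over CC for the N-terminal letter are illustrative choices, not an established standard.

```python
def classify_nlr(domains):
    """Return an architecture label such as 'TNL', 'CNL', 'RNL', 'TN' or 'T'.

    `domains` is an iterable of domain labels detected in one protein.
    Returns None when no NB-ARC, TIR or RPW8 domain is present.
    """
    found = set(domains)
    if "TIR_2" in found:            # treat TIR_2 hits as TIR
        found.add("TIR")
    if not found & {"NB-ARC", "TIR", "RPW8"}:
        return None                 # LRR or CC alone does not qualify
    if "TIR" in found:
        prefix = "T"
    elif "RPW8" in found:
        prefix = "R"
    elif "CC" in found:
        prefix = "C"
    else:
        prefix = ""
    label = prefix
    label += "N" if "NB-ARC" in found else ""
    label += "L" if "LRR" in found else ""
    return label


examples = [
    {"TIR", "NB-ARC", "LRR"},
    {"CC", "NB-ARC", "LRR"},
    {"RPW8", "NB-ARC", "LRR"},
    {"TIR_2", "NB-ARC"},
    {"TIR"},
    {"LRR"},
]
for ex in examples:
    print(sorted(ex), "->", classify_nlr(ex))
```

Stating such rules explicitly, whatever their exact form, makes it possible to tell whether two published NLR counts differ because of the genomes or because of the definition.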

Resistance from relatives in the post genomic era
Plant species with major agricultural and economic interest frequently have large and complex genomes. Therefore, cheap and efficient methods to identify NLRs at a genome-wide level are invaluable. R-gene enrichment and sequencing (RenSeq) is a method that allows selective sequencing of NLR-containing genomic fragments (42; 58). The method allows the definition of the NLRome of any plant by using an RNA bait library (complementary to known or partially annotated NLRs from related species) combined with a high-throughput sequencing platform (typically Illumina, PacBio or Nanopore) (43; 57). The technique reduces the overall complexity of the genomic sample, and allows focused sequencing on the enriched gene family.
RenSeq technology allowed refinement of NLR gene annotations, as well as the identification of 317, 105 and 126 previously unreported NLRs in S. tuberosum DM clone, S. lycopersicum Heinz 1706 and S. pimpinellifolium LA1589, respectively (1). Most of the novel genes mapped to unannotated or gapped regions of the genomes. RenSeq thus allowed the definition of previously unidentified or incomplete NLR clusters, in which the novel genes were found to reside (1; 2; 58; 94; 115). This technique can also be applied to plant species for which there is no available draft genome. RenSeq applied to wild relatives of tomato allowed the identification of markers that cosegregated with resistance to Phytophthora infestans (58). RenSeq combined with long read Single Molecule Real Time (SMRT) sequencing is effective at resolving NLR clusters that are notoriously difficult to sequence (42; 43; 128). RenSeq has also been successfully used to enrich NLR cDNAs, allowing transcript validation of 167 S. lycopersicum Heinz 1706 and 154 S. pimpinellifolium LA1589 NLRs (1). Identification of sources of resistance from wild relatives, or ancestral progenitors from the primary geographic diversity centers, will provide novel NLR variants to further increase the disease resistance gene pool available for breeding programs (70; 120).

NLR transfers between genomes
Understanding NLR diversity at the mechanistic and genomic levels provides an invaluable resource for breeding and, eventually, rationally engineering disease resistance. Traditionally, sources of resistance have been selected for, or found in, closely related genomes. Plant genomes have followed independent evolutionary paths and each has a unique set of immune receptors. To what extent are immune receptors transferable between more distant genomes? To what extent will genomic studies define a pan-NLRome allowing the use of these diverse products of evolution from across the kingdom as resistance traits?
The first example of interfamily immune receptor transfer was between Arabidopsis and the solanaceous plants Nicotiana benthamiana and tomato (68). EFR is a PRR receptor-like kinase that perceives the bacterial PAMP EF-Tu (elf18 peptide), but it is only present in the Brassicaceae. After transfer into solanaceous genomes, EFR was able to confer responsiveness to elf18. Importantly, it also resulted in strong bacterial disease resistance in tomato. NLRs can also be transferred between genomes. Rice genomes do not have a known resistance specificity for Xanthomonas oryzae pv. oryzicola. After identifying a disease resistance trait in maize, Zhao et al. were able to transfer RXO1, a CNL, into rice and generate resistant plants (141).
Even greater phylogenetic distances are possible. The monocot CNL MLA1 has been transferred from barley into the dicot Arabidopsis (74). Remarkably, MLA1 is functional in Arabidopsis and recognizes the pathogen effector AVRa1. This result indicates that the machinery required for NLR function can be conserved over extremely large phylogenetic distances. A high level of conservation is also supported by the general feasibility of transient assays in Nicotiana and by the ability of Nicotiana EDS1 to support the function of phylogenetically distant Arabidopsis TIR and TNL proteins (87; 126).
There are likely limits to the transfer of immune receptors. To serve as useful traits, NLRs must be functional and properly regulated. Functionality requires that the NLR can integrate into a largely unknown system required for recognition and downstream function. Proper regulation is required to ensure that NLRs do not have negative impacts on fitness via autoactivity. Autoactivity is a frequent outcome of transgenic expression of NLRs. This is likely due to the idiosyncratic nature of transgenic lines and the resulting over- or mis-expression. In other cases the autoactivity may be genetically determined. In the case of the "Dangerous Mix" loci, incompatibilities can be revealed by outcrosses of Arabidopsis genomes that have undergone independent evolution (11). Several of these loci map to NLR immune receptors and may reflect drift between NLRs and guardees that results in inappropriate physical interaction and consequent autoactivity (21). Thus, in the case of NLRs that guard host proteins, transfer may be limited by the degree of conservation between the original guardee and the adopted guardee in the new genome. NLRs themselves may form incompatible heteromeric complexes, and one NLR may activate a second when they encounter each other via outcrossing (116). In other cases, NLRs may negatively regulate each other. Transfer of the rye Pm8 resistance gene into wheat is limited in some genotypes by the dominant action of the wheat Pm3 resistance gene (52). As both genes are CNLs, it is intriguing to speculate that the suppression occurs via the formation of an inappropriate, inactive heteromeric receptor complex.

NLR tinkering: fine tuning responses
Existing NLRs can also be tinkered with, to either expand recognition or tune responsiveness. An early attempt at modifying NLR specificity mutagenized the Rx CNL in order to expand recognition of potato virus X (PVX) strains (35). By using random mutagenesis targeted at the LRR, the authors were able to find Rx mutants that could recognize not only the wild-type version of PVX coat protein (CP), but also mutant CP that could evade wild-type Rx. Interestingly, the Rx mutants now also recognized CP from the distantly related poplar mosaic virus (PoMV). One of the mutants, Rx N846D, displayed systemic necrosis when challenged with PoMV, demonstrating a cost to increased recognition. Further mutagenesis of Rx N846D identified new mutations that converted the systemic necrosis into a strong resistance able to control PoMV (49). Interestingly, while N846D is located in the LRR, the suppressing mutations are in the NBS domain, suggesting an interdomain contact. Similar attempts to generate expanded specificities for the potato NLR R3a were able to expand recognition to "stealthy" versions of the AVR3a effector (22; 98).
Study of the wheat CNL Pm3 indicates that the NBS domain of NLRs is tuned in its responses and that this tuning can be downstream of "triggerability" (106). In this case, immune output can be altered independently of the propensity to be activated, by mutation of only two residues in the ARC2 subdomain. This is consistent with a hypothesis that initial pathogen detection is translated into an appropriately tuned resistance response. These two tuning residues are surface exposed in NBS models, but true higher-order structures of the NBS in combination with other domains will be required to understand how they promote an ATP-bound active conformation. By all indications, NLRs have multiple intramolecular interactions that can be tuned for a combination of activation and output strength.

NLR re-engineering: building better mousetraps
Beyond single point mutations, more extensive re-engineering of NLRs has also been attempted. Domain swaps between closely related NLRs (such as Rx1 and Gpa2) can result in a corresponding specificity swap (101). Domain swaps indicate that NBS and LRR intramolecular interactions are critical for maintaining the resting state of NLRs to avoid inappropriate, elicitor-independent activation (102). These Rx1/Gpa2 domain swaps used existing specificities to engineer NLR function. What are the prospects for novel specificities?
Recently, breakthrough studies of the RPS5 system have presented an excellent opportunity to rationally engineer NLR immune recognition. In this case, the CNL RPS5 indirectly recognizes AvrPphB proteolytic cleavage of the decoy kinase protein PBS1. The elegant solution described by Kim et al. is that replacement of the AvrPphB cleavage site with an engineered protease site allows an unmodified RPS5 to activate defenses against novel proteases (61). By engineering the guardee, they were able to obviate problems of autoactivation created by modifying the NLR itself. But even with a WT NLR, there are likely issues that will have to be solved for any engineered PBS1/RPS5 system. The authors found that activation of RPS5 defenses against turnip mosaic virus (TuMV) (using an engineered PBS1 cleaved by the TuMV NIa protease) was slower than needed to limit systemic spread. They proposed that the plasma membrane localization of WT RPS5 may be inappropriate for detection of an effector protease found mostly in the nucleus. If these sorts of pathogen-specific issues can be overcome, the abundance of protease effectors in pathogen effector repertoires suggests that RPS5/PBS1 may be a widely useful NLR engineering approach.
In most cases, our understanding of how an NLR functions is limited. In the case of RPS5 and PBS1, years of research were required to understand the system well enough to use it as an NLR engineering platform (63). The recent discovery of paired NLRs with integrated domains (described above) suggests a powerful shortcut for identifying engineering targets. NLR pairs are relatively easy to identify and are present in many plant genomes. Importantly, following the model of RPS4 and RRS1, if integrated domains are effector decoys, then we will not have to genetically identify an unknown guardee. The loci should be transferable to novel genomes, as they contain both components of the receptor complex (receptor NLR and signaling NLR). Because they define a complete receptor complex, they should be less susceptible to problems arising from incompatibility due to independent evolution. A recent study by Bailey et al. indicates that NLRs with integrated domains are quickly gaining and losing novel unusual domains (3), and thus may be rapidly changing specificity within a conserved receptor context. It will be extremely informative to understand what mutations in the canonical NLR domains are required to accommodate a novel ID. These mutations will undoubtedly be critical for maintenance of the resting state, for appropriate activation in response to effector modification of the ID, or for both. Replacement of an ID with a novel effector target, or with a homologous one derived from the recipient genome, may be a viable approach to engineering NLRs.

Unanswered Questions and Outlook
Many of the basic questions about NLR function remain unanswered. A better understanding of how individual NLR domains interact with one another is critical to understanding how the molecules function as a switch. This is important for limiting the costs of inappropriate activation, as well as for understanding pathogen specificity and strength of response. We need to understand how NLRs activate downstream events: how disease resistance and cell death are triggered remains, remarkably, a black box. How do NLRs homo- and hetero-oligomerize to generate an immune system? To what extent do the two tiers of the immune system (NLRs and PRRs) functionally cooperate to form an immune system? Can we identify characteristics of NLRs that promote durable resistance? Rational engineering of immune receptors is an increasingly achievable goal. By mechanistically understanding how NLRs function, we will be able to modify existing NLRs or generate novel receptor systems that recognize pathogens of interest. By exploring the breadth and depth of plant NLR natural variation, we will expand our toolbox of deployable disease resistance traits. Accelerating climate change is predicted to generate novel pathogen/plant interactions, demanding rapid responses by plant breeders (6). Rational design of plant immune systems will be one tool, of many, that enables agricultural systems to keep pace with pathogens.

Figure 1) NLRs are modular switches. (a) Typical plant NLRs contain a variable N-terminal domain, either a TIR (T), Coiled-coil (C) or RPW8-like (R) domain, followed by an NBS domain (N) and a Leucine-rich repeat domain (L). (b) NLRs undergo conformational switching depending on the ADP/ATP binding state, which is induced or stabilized by an effector (or guardee) trigger. Multimerization of the N-terminus is required and often sufficient for signaling (red glow). The exact multimerization state is not known, and is shown here as dimeric only for graphical clarity. Detection of pathogen effectors can be either direct (i) or indirect, via modification of a host guardee protein (G). (c) Plant genomes contain a diverse array of NLR domain combinations. (d) Plant genomes contain NLRs with unusual "integrated domains" (X). Integrated domains can occur in many locations in the NLR domain structure. Examples shown are from Arabidopsis. (e) NLRs with integrated domains are often found as pairs divergently expressed at a single genomic locus. In the case of RPS4/RRS1, effector (PopP2) targeting and acetylation of the integrated WRKY decoy domain (W) in RRS1 activates RPS4 to trigger defense responses. The exact stoichiometry and orientation of RRS1 and RPS4 pre- and post-activation are unknown, but RPS4 and RRS1 interact pre-activation, and the post-activation complex requires RPS4-RPS4 TIR self-association to signal. (f) Truncated NLRs likely function in hetero-oligomeric immune complexes. (f;i) Autoactivity triggered by the TN mutant chs1-1 requires the full-length TNL SOC3. CHS1 and SOC3 also occur as a genomic pair. SOC3 physically interacts with CHS1, but it is unclear if this pair also functions to recognize a pathogen effector. (f;ii) RBA1 encodes a TIR-only protein that triggers cell death in response to the pathogen effector HopBA1. While RBA1 and HopBA1 co-immunoprecipitate, they may not interact directly and could require unknown components such as a putative guardee or an unknown partner TNL.
• The Plant Resistance Genes database (PRGdb). http://prgdb.org
• Public release of RenSeq assemblies from 69 accessions of Arabidopsis thaliana, and R-genes from four Nicotiana and four Solanum species.
InterProScan are currently the most used bioinformatic tools to detect functional domains.
- Consider using multiple secondary structure prediction tools to detect proteins likely to present a coiled-coil fold, and report the probability cut-off.
- To facilitate reproducibility, authors should include model-specific cut-offs included in curated HMM databases and report software and database versions, as well as genome annotation release.

Table 1 Legend: Survey of the number of NLR proteins in 19 plant species.
Number of genes detected in the gene models provided by the authors using a conservative hmmscan with Pfam-A RPW8 and NB-ARC --cut_tc thresholds. The total number of NBS genes in each domain arrangement was retrieved from the indicated papers. One asterisk indicates reported non-TIR NLRs. Two asterisks refer to domains present in the fasta sequences provided by the authors, but not explicitly presented in the respective publication. HMMER with the Trusted Cutoff threshold was used to retrieve RPW8 domains. Table rows are colored according to taxonomic family, with different shades for each species: Yellow, Poaceae; Orange, Solanaceae; Blue, Fabaceae; and Green, Brassicaceae. The phylogeny of the species listed in the table was obtained from timetree.org.
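As a minimal sketch of how the domain arrangements summarized in this legend could be tallied, the Python below reads per-domain hmmscan output and assigns each protein a one-letter architecture string using the codes from Figure 1 (T, R, N, L). It assumes the hits were produced with something like `hmmscan --cut_tc --domtblout hits.tbl Pfam-A.hmm proteins.fasta`; the file names, the function names, and the subset of LRR families listed are illustrative assumptions rather than the pipeline used for Table 1. Coiled-coil domains are not covered, since they are usually predicted with separate secondary structure tools.

```python
from collections import defaultdict

# Pfam family names mapped to one-letter domain codes (as in Figure 1).
# The LRR entries are an illustrative subset, not a complete list.
DOMAIN_CODES = {
    "TIR": "T",
    "RPW8": "R",
    "NB-ARC": "N",
    "LRR_1": "L",
    "LRR_4": "L",
    "LRR_8": "L",
}


def parse_domtblout(lines):
    """Collect (alignment start, domain code) per protein from --domtblout lines."""
    hits = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        # In hmmscan output the target (column 1) is the HMM and the
        # query (column 4) is the protein; column 18 is the alignment start.
        family, protein = fields[0], fields[3]
        ali_from = int(fields[17])
        code = DOMAIN_CODES.get(family)
        if code is not None:
            hits[protein].append((ali_from, code))
    return hits


def architecture(domain_hits):
    """Order domains from N- to C-terminus and collapse adjacent repeats (LLL -> L)."""
    arch = []
    for _, code in sorted(domain_hits):
        if not arch or arch[-1] != code:
            arch.append(code)
    return "".join(arch)


def nlr_architectures(lines):
    """Return {protein: architecture} for proteins that contain an NB-ARC domain."""
    result = {}
    for protein, domain_hits in parse_domtblout(lines).items():
        arch = architecture(domain_hits)
        if "N" in arch:
            result[protein] = arch
    return result
```

Counting the values of the returned dictionary (for example with `collections.Counter`) would give per-architecture totals of the kind a survey table reports. Software and database versions should be recorded alongside such counts.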

Plant NLR repertoires at the service of pathogen resistance engineering.
(A) Wild relatives of a crop of interest might exhibit useful disease resistance phenotypes. NLR sequencing with RenSeq or MBP (Mapping by sequencing; Gina et al., Nature Communications 2017) allows the identification of crop and wild relative NLR repertoires. Comparative analysis of presence-absence, SNP and InDel polymorphisms assists the identification of the NLR(s) responsible. Upon identification of the R-gene(s) in a resistant wild relative, resistance can be introgressed into the crop by hybridization and consecutive backcrosses. RenSeq might be a valuable tool to reduce genome complexity and assist selection of progeny. (B) When the crop and the wild relative are sexually incompatible, the NLR(s) can be cloned from the wild relative (or a more phylogenetically distant genome) and introduced into the desired crop via transgenesis. (C) In the future, the accumulated knowledge of NLR domain swapping, integrated decoys, pathogen effector targets and point-mutation alleles will be used to engineer novel resistances. Pathogen effector targets might be incorporated into an already existing NLR-ID in order to create a novel sensor. Modification of NLR-associated guardees or decoys (such as PBS1, not shown) is also possible.
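The presence-absence comparison in panel (A) can be illustrated with a small Python sketch. It assumes each accession's NLR repertoire has already been reduced to a set of shared group identifiers (for example orthogroups); the accession names, the `NLR_*` identifiers and the function name are hypothetical. It only shortlists candidates and does not replace genetic mapping or functional validation.

```python
def candidate_nlrs(repertoires, resistant, susceptible):
    """Return NLR groups present in every resistant accession
    and absent from every susceptible accession."""
    if not resistant:
        raise ValueError("at least one resistant accession is required")
    # NLR groups shared by all resistant accessions
    shared = set.intersection(*(set(repertoires[a]) for a in resistant))
    # NLR groups found in any susceptible accession
    in_susceptible = set().union(*(set(repertoires[a]) for a in susceptible))
    return sorted(shared - in_susceptible)


# Hypothetical repertoires keyed by accession name.
repertoires = {
    "wild_1": {"NLR_A", "NLR_B", "NLR_C"},
    "wild_2": {"NLR_A", "NLR_C", "NLR_D"},
    "crop_1": {"NLR_A", "NLR_D"},
}

candidates = candidate_nlrs(
    repertoires, resistant=["wild_1", "wild_2"], susceptible=["crop_1"]
)
print(candidates)
```

In practice the same filter could be applied to SNP or InDel alleles instead of whole-gene presence, at the cost of a more detailed input table.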

Legend: Atypical domains detected in NLR genes from different plant species.
Data obtained from Sarris et al., BMC Biology 2016. Green boxes show domains overrepresented in NLRs compared to the rest of the genome (Fisher's exact test, p-value lower than 0.05). Grey boxes indicate fusion of the respective domain to at least one NLR, but no enrichment.
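To make the enrichment criterion in this legend concrete, the following Python sketch computes a Fisher's exact test p-value for one domain from a 2x2 table of gene counts. The one-sided (over-representation) form is an assumption, since the legend does not state which alternative was used, and the function names and example counts are invented for illustration.

```python
from math import comb


def fisher_enrichment_p(nlr_with, nlr_without, other_with, other_without):
    """One-sided Fisher's exact test p-value for over-representation of a
    domain among NLRs (upper tail of the hypergeometric distribution)."""
    total = nlr_with + nlr_without + other_with + other_without
    with_domain = nlr_with + other_with   # all genes carrying the domain
    nlrs = nlr_with + nlr_without         # all NLR genes
    # Sum the probabilities of observing nlr_with or more NLRs with the domain.
    numerator = sum(
        comb(with_domain, k) * comb(total - with_domain, nlrs - k)
        for k in range(nlr_with, min(with_domain, nlrs) + 1)
    )
    return numerator / comb(total, nlrs)


def is_enriched(nlr_with, nlr_without, other_with, other_without, alpha=0.05):
    """True if the domain is over-represented in NLRs at the given threshold."""
    p = fisher_enrichment_p(nlr_with, nlr_without, other_with, other_without)
    return p < alpha


# Invented counts: 10 of 100 NLRs carry the domain,
# versus 10 of 9900 genes in the rest of the genome.
p_value = fisher_enrichment_p(10, 90, 10, 9890)
```

When many domains are tested across many genomes, a multiple-testing correction would normally be applied on top of the per-domain p-values; whether that was done for the data shown here is not stated in the legend.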