Review

Structural Biology and Regulation of Protein Import into the Nucleus

Mary Christie 1,2,†, Chiung-Wen Chang 3,4,†, Gergely Róna 5,6,†, Kate M. Smith 7,†, Alastair G. Stewart 8, Agnes A.S. Takeda 9, Marcos R.M. Fontes 9, Murray Stewart 3,10, Beáta G. Vértessy 5,6, Jade K. Forwood 7 and Bostjan Kobe 3

1 - The Garvan Institute of Medical Research, 384 Victoria Street, Darlinghurst, NSW 2010, Australia
2 - St Vincent's Clinical School, University of New South Wales Faculty of Medicine, Darlinghurst, NSW 2010, Australia
3 - School of Chemistry and Molecular Biosciences, Institute for Molecular Bioscience and Australian Infectious Diseases Research Centre, University of Queensland, Brisbane, QLD 4072, Australia
4 - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA
5 - Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest H-1117, Hungary
6 - Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest H-1111, Hungary
7 - School of Biomedical Sciences, Charles Sturt University, Wagga Wagga, NSW 2650, Australia
8 - School of Molecular Bioscience, The University of Sydney, Sydney, NSW 2006, Australia
9 - Department of Physics and Biophysics, Institute of Biosciences, Universidade Estadual Paulista, Botucatu, São Paulo 18618-000, Brazil
10 - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, United Kingdom

Correspondence to Bostjan Kobe: b.kobe@uq.edu.au
http://dx.doi.org/10.1016/j.jmb.2015.10.023
Edited by D. Görlich
0022-2836/© 2015 Elsevier Ltd. All rights reserved.

Abstract

Proteins are translated in the cytoplasm, but many need to access the nucleus to perform their functions.
Understanding how these nuclear proteins are transported through the nuclear envelope and how the import processes are regulated is therefore an important aspect of understanding cell function. Structural biology has played a key role in understanding the molecular events during the transport processes and their regulation, including the recognition of nuclear targeting signals by the corresponding receptors. Here, we review the structural basis of the principal nuclear import pathways and the molecular basis of their regulation. The pathways involve transport factors that are members of the β-karyopherin family, which can bind cargo directly (e.g., importin-β, transportin-1, transportin-3, importin-13) or through adaptor proteins (e.g., importin-α, snurportin-1, symportin-1), as well as unrelated transport factors such as Hikeshi, involved in the transport of heat-shock proteins, and NTF2, involved in the transport of RanGDP. Solenoid proteins feature prominently in these pathways. Nuclear transport factors recognize nuclear targeting signals on the cargo proteins, including the classical nuclear localization signals, recognized by the adaptor importin-α, and the PY nuclear localization signals, recognized by transportin-1. Post-translational modifications, particularly phosphorylation, constitute key regulatory mechanisms operating in these pathways.
© 2015 Elsevier Ltd. All rights reserved.

Overview of Nuclear Import Pathways

The nucleus is separated from the cytoplasm by a double membrane and houses the genetic material and the transcriptional apparatus, separating them from the translational and metabolic machinery in the cytoplasm of eukaryotic cells. The key to this separation is the corresponding ability to regulate transport across the nuclear envelope.
J Mol Biol (2016) 428, 2060–2090
† M.C., C.-W.C., G.R. and K.M.S. contributed equally to this work.

Fig. 1. Overview of the main nuclear import pathways. Schematic illustration of three different nuclear import pathways. (i and ii) Classic import pathway; cargo (nucleoplasmin, shown in red; PDB entry 1K5J), Impα (shown in green; PDB entries 1IAL and 1EE5) and Impβ1 (shown in yellow; PDB entry 1QGK) form a ternary complex before translocating across the membrane via the nuclear pore (shown in gray); RanGTP (shown in blue; PDB entry 2BKU) and Cse1p [or CAS] (shown in orange; PDB entry 1WA5) dissociate the complex and release the cargo. (iii and iv) Snurportin-1 import pathway; cargo (U1A-UTR, shown in red and wheat; PDB entry 1AUD), snurportin-1 (shown in green; PDB entry 1UKL) and Impβ1 translocate across the membrane; RanGTP and CRM1 (shown in orange; PDB entry 3GJX) release the cargo. (v and vi) Import pathway involving direct cargo binding to Impβ1; cargo (SREBP2, shown in red; PDB entry 1UKL) and Impβ1 form a binary complex before translocating across the membrane; RanGTP dissociates the complex.

This transport occurs through nuclear pore complexes (NPCs),
huge macromolecular structures that span both nuclear membranes and that are built from multiple copies of a large number of proteins termed nucleoporins (Nups) [1,2] (see reviews by Beck, Schwartz, Lemke and Görlich in this issue). The NPC has a large enough channel to allow proteins smaller than ~40 kDa to passively diffuse through it; however, most, if not all, proteins with functions in the nucleus use active carrier-mediated transport. Translocation of proteins through NPCs requires additional carrier proteins or transport factors. Many of these carriers belong to the β-karyopherin (β-Kap) (or importin-β, Impβ) superfamily. All β-Kap family members are built from tandem HEAT repeats (named after huntingtin, elongation factor 3, protein phosphatase 2A and TOR1 [3]), each of which contains ~40–45 amino acids that form two antiparallel α-helices (designated A and B) linked by a loop. This repetitive structure places β-Kaps in the solenoid protein category [4], which features prominently among proteins involved in nucleocytoplasmic transport (Fig. 1).

An additional key component of nuclear transport pathways is the small GTPase Ran. Ran cycles between GDP- and GTP-bound states [5], and the state of the bound nucleotide is determined by Ran regulatory proteins, including RanGEF (Ran guanine nucleotide exchange factor, also called RCC1, regulator of chromosome condensation 1, or Prp20p in yeast) [6] and RanGAP (Ran GTPase-activating protein, Rna1p in yeast) [7]. Like other members of the Ras-family GTPases, Ran is composed of a small G-domain and contains two surface loops, termed switch-I and switch-II, which change conformation depending on the nucleotide-bound state of the protein. The RanGDP/GTP gradient, generated by RanGEF and RanGAP being located in the nucleus and the cytoplasm, respectively, establishes directionality in nucleocytoplasmic transport pathways.
As a result, import receptors that bind cargo in the cytoplasm can release it in the nucleus through binding RanGTP [5,8–10], whereas export receptors can bind cargo in the nucleus through simultaneous binding of RanGTP and can release it in the cytoplasm after GTP hydrolysis is triggered.

Whether a protein localizes to the nucleus is usually determined by nuclear targeting signals. The first nuclear targeting signal discovered, and the best characterized, is the classical nuclear localization sequence (cNLS) recognized by the protein importin-α (Impα) (karyopherin-α). Impα is an adaptor protein that links the cargo to a carrier protein that actually takes the cargo through the NPC; its specific carrier protein is importin-β (Impβ1) (karyopherin-β1) [11]. Impα is also a solenoid but is built from armadillo (ARM) repeats [12,13]. Some proteins also have nuclear export signals (NESs) [14,15] and can shuttle in and out of the nucleus.

The members of the β-Kap family transport cargoes by binding the nuclear localization signal (NLS) either directly or through adaptor molecules such as Impα or snurportin-1 (Fig. 1). There are 20 β-Kap family members in humans, 10 of which mediate transport of macromolecules into the nucleus, 7 translocate macromolecules from the nucleus to the cytoplasm, 2 have been shown to mediate translocation in both directions and 1 member remains to be characterized. The yeast genome codes for 14 β-Kaps. The reason for the large repertoire of nuclear import receptors within cells remains to be fully elucidated; in part, it can be attributed to the requirement of cells to translocate hundreds of quite disparate macromolecules across the nuclear envelope. It is emerging that different β-Kap family members recognize different classes of cargoes, and moreover, tissue-specific expression of family members may differentially localize cargoes within different cell types.
However, it also appears that there is some redundancy in this system, with several β-Kap receptors able to recognize the same cargoes [16].

Structural biology has played an important role in deciphering the molecular events required for nuclear import. Here, we review the available structural information on different nuclear import pathways and the associated determinants of specificity in these pathways, and we also review the regulatory mechanisms acting on the import processes. The main pathways are summarized in Fig. 1, and the representative protein structures are listed in Table 1. The structures of nuclear export complexes are reviewed by Matsuura [256].

Table 1. Structurally characterized transport receptors involved in protein import into the nucleus.

| Transport receptor (abbreviation) | Other names | Transport direction | Representative available structures (PDB entry) |
| --- | --- | --- | --- |
| Importin-β (Impβ1) | Importin-90, karyopherin β-1, nuclear factor p97, pore-targeting complex 97-kDa subunit (PTAC97), Kap95p (yeast; yImpβ1) | Import | Apo: yImpβ1 (3ND2). Cargo complex: Impβ1:PTHrP (1M5N); Impβ1:SREBP2 (1UKL); Impβ1:SNAIL1 (3W5K). Adaptor complex: Impβ1:Impα-IBB domain (1QGK); Impβ1:snurportin-IBB domain (3LWW). Ran complex: yImpβ1:RanGTP (2BKU); yImpβ1:RanGDP (3EA5). Nup complex: Impβ1:GLFG (1O6P); Impβ1:FxFG (1F59); yImpβ1:Nup1p (2BPT) |
| Transportin-1 (Trn1) | Importin-β2, karyopherin-β2, M9 region interaction protein (MIP); Kap104p (yeast; yTrn1) | Import | Apo: Trn1 (2QMR, 2Z5J). Cargo complex: Trn1:hnRNPA1 M9 NLS (2H4M); Trn1:Tap NLS (2Z5K); Trn1:hnRNP D NLS (2Z5N); Trn1:hnRNP M NLS (2OT8); Trn1:JKTBP NLS (2Z5O); Trn1:FUS NLS (4FQ3); Trn1:Nab2 NLS (4JLQ); Trn1:HCC1 (4OO6). Ran complex: Trn1:Ran (1QBK) |
| Transportin-3 (Trn3) | Importin-12, Transportin-SR; Kap111p (yeast, yTrn3) | Import | Apo: Trn3 (4C0P). Cargo complex: Trn3:ASF/SF2 (4C0O). Ran complex: Trn3:Ran (4C0Q) |
| Importin-13 (Imp13) | Ran-binding protein 13 (RanBP13), karyopherin-13 (Kap13) | Bi-directional | Apo: Imp13 (3ZKV). Cargo complex: Imp13:Mago-Y14 (2X1G); Imp13:UBC9 (2XWU). Ran complex: Imp13:Ran (2X19) |
| Importin-α (Impα) | Karyopherin-α (Kapα); Kap60p (yeast; yImpα) | Import | Apo: Impα (1IAL); Supplementary Table 3. Cargo complex: Impα:SV40-TAg NLS (1EJL, 1BK6); Impα:nucleoplasmin NLS (1EE5, 3UL1); Impα:PB2 (2JDQ); Impα:CBP80 (3FEY); Impα:VP24 (4U2X); Supplementary Table 2. Nup complex: yImpα:Nup2p (2C1T); Impα:Nup50 (2C1M); Supplementary Table 2 |
| Snurportin-1 | | Import | Cargo complex: snurportin-1 m3G-cap-binding domain:m3GpppG-cap dinucleotide (1XK5) |
| Symportin-1 | Syo1 (synchronized import) | Import | Apo: Syo1 (4GMO). Cargo complex: Syo1:RpL5:RpL11 (5AFF) |
| Hikeshi | | Import | Apo: Hikeshi (3WVZ) |

Nuclear Import Mediated by β-Kap Family Members

Overview of the β-Kap family

The members of the β-Kap family are highly conserved across eukaryotes, reflecting their critical cellular function. A higher degree of similarity is often observed between orthologues in different species than between paralogues in the same species [17], and although there is low sequence identity among family members (Supplementary Table 1), they display a high degree of structural similarity. Many HEAT repeats in the β-Kap family display conserved Asp and Arg residues at positions 19 and 25 of the repeat, respectively, which form hydrogen (H)-bonding ladders [18]. Similarities have been described between HEAT and ARM repeats, particularly for the conserved residues that form the hydrophobic cores [18–20]. The solenoid structure of these proteins (up to 20 repeats within members of the β-Kap family) enables variation in the helicoidal curvature through cumulative subtle changes throughout the protein [4,21].
Impβ1-mediated nuclear import

Impβ1 (Kap95p in yeast, termed yImpβ1 here) was the first member of the β-Kap family to be characterized structurally and has been crystallized in complex with a number of cargo molecules, Ran and Nups (Fig. 2). The first structure corresponded to Impβ1 bound to the importin-β binding (IBB) domain of Impα [22]. Impβ1 was shown to contain 19 HEAT repeats arranged in a superhelix, forming a convex face (formed by the A-helices of each repeat) and a concave face (formed by the B-helices of each repeat). The majority of the Impβ1 interactions with its binding partners occur on the concave (B-helix) face, and the competition for these sites by Ran, Impα and various cargo proteins forms the foundation for the mechanisms of assembly, disassembly and translocation. The structural flexibility of the Impβ1 structure is essential for its function [21].

Fig. 2. Structures of Impβ1. PDB entries: PTHrP complex (1M5N), SREBP2 complex (1UKL), Snail1 complex (3W5K), Impα:IBB complex (1QGK), RanGTP complex (2BKU), GLFG complex (1O6P), FxFG complex (1F59), Nup1p complex (2BPT) and apo (3ND2). The HEAT repeats involved in cargo binding are highlighted in dark yellow. A representative Impβ1 from Impβ1:Ran complex has all cargoes overlaid.

Apo-Impβ1

The crystal structure of apo-yImpβ1 revealed that the 19 HEAT repeats are arranged in a tightly coiled and compact conformation [21] (Fig. 2). This tightly wound structure is mediated by HEAT repeats 2 and 4 (residues S74 and D167, respectively), interacting with HEAT repeat 17 residues R696, E737, N738 and G739. The interaction interface has a total buried surface area of only 306 Å², which is considerably smaller than most protein:protein interactions, indicating that there is a relatively small energy requirement to distort the flexible yImpβ1. This is supported by small-angle X-ray scattering analysis [21], which indicates that yImpβ1 exists in multiple
A representative image of Fig.�2 2064 Review: Protein Import into the Nucleus conformations in solution, and the vast array of functions it performsare achievedby taking advantage of cumulative small structural changes that efficiently allow the transition between various conformations as internal energy is stored by continuous flexing. Impβ1:cargo interactions A number of cargo proteins are recognized directly by Impβ1. The structurally characterized examples include the parathyroid hormone-related protein PTHrP [23], the sterol regulatory element-binding protein SREBP2 [24] and the zinc finger protein SNAI1 (Snail1) [25] (Fig. 2). The regions on Impβ1 to which they bind overlap, implying that only one cargo can bind at a time. Impβ1 binds PTHrP using B-helices spanning HEAT repeats 2–11 at three distinct binding sur- faces. The N-terminal region of the PTHrP NLS (residues 67–79) binds to HEAT repeats 2–7, the central region (residues 80–86) binds HEAT repeats 8–10 and the C-terminal residues (residues 87–93) bind HEAT repeats 8–11. Overall, the PTHrP NLS, when bound to Impβ1, exists in an extended con- formation and buries 1027 Å2 of surface area. The crystal structure of the SREBP2 NLS bound to Impβ1 revealed a helix–loop–helix structure for the cargo (SREBP2 residues 343–403) [24]. To accom- modate this binding, Impβ1 adopts a more open conformation and engages more of the C-terminal HEAT repeats, compared to the PTHrP complex. SREBP2 binds through the B-helices of HEAT repeats 7–17. The two long helices present in repeats 7 and 17 bind SREBP2 rather similar to chopsticks. A notable difference to PTHrP is that, rather than salt bridges dominating the interactions, SREBP2 involves more hydrophobic interactions. This interac- tion buries 1355 Å2 of surface area. To bind four zinc finger domains (ZF) of Snail1 (residues 151–264), Impβ1 uses the B-helices of HEAT repeats 5–14, including the acidic loop in HEAT repeat 8. 
Unlike other cargo complexes, Impβ1 adopts a less curved structure to accommodate the bulky snail-like NLS conformation. The ZF1 domain (residues 151–176) acts as the “head”, where the N-terminal α-helix (residues 164–176) is bound within a cleft on Impβ1 formed from HEAT repeats 9–11. The local conformation is stabilized by a hydrophobic interaction between L166 of Snail1 and P440 of Impβ1. The Snail1 “shell” is composed of the three domains ZF2–ZF4, which form a compact and tight interaction with Impβ1; the ZF2 domain (residues 177–202) forms H-bonds with HEAT repeats 13 and 14; the α-helix within the ZF3 domain (residues 203–230) arranges antiparallel with the inner Impβ1 α-helix of HEAT repeat 6; and the ZF4 domain (residues 231–264) interacts with residues in HEAT repeats 7 and 9. In the five C-terminal residues that represent the tail, C262 and R264 bind to HEAT repeat 10. Overall, Snail1 binds through 15 intermolecular interactions in the B-helices of HEAT repeats 5–14 that bury 2205 Å² of surface area [25].

Although the HEAT repeats used to bind the three cargo molecules (5–14 for Snail1, 2–11 for PTHrP and 7–17 for SREBP2) overlap, the binding mechanism in each is distinctly different. The interfaces overlap with the region binding RanGTP, suggesting that, regardless of the binding mechanism, the structural requirements include a large contact area and the ability for the complex to be disassembled by RanGTP binding upon entry to the nucleus.

Impβ1:adaptor interactions

The Impβ1 interactions with adaptor molecules Impα and snurportin-1 have been characterized structurally (Fig. 2; reviewed in Ref. [26]). In both cases, Impβ1 forms a closed conformation, wrapping tightly around the IBB domains of these adaptors, forming an array of salt-bridge interactions through HEAT repeats 7–19.
The N-terminal residues of Impα (αIBB) (residues 11–23) and snurportin (sIBB) (residues 25–40) IBB domains mediate interactions with the HEAT repeats 7–11, whereas the C-terminal α-helical regions of αIBB (residues 24–51) and sIBB (residues 41–65) bind through HEAT repeats 12–19. Although there are common HEAT repeats involved in binding adaptor and cargo molecules, the overall mechanism of binding is distinctly different between these two groups. Not only is the overall orientation of the cargo NLSs different (e.g., in SREBP2, the α-helices are bound perpendicular to Impβ1, whereas the IBB domains bind parallel with the Impβ1 superhelix) but also the overall helicoidal twist of Impβ1 is different, with IBB domain-bound Impβ1 forming a more closed conformation.

Impβ1:Ran interactions

Dissociation of the nuclear import cargo from Impβ1 following entry into the nucleus is mediated by RanGTP binding [5,8–10]. The structures for truncated human Impβ1 [27] and full-length yImpβ1 [28] in complex with RanGTP (Fig. 2) have been determined, revealing important allosteric mechanisms of cargo-release control. The N-terminal fragment of Impβ1, encompassing HEAT repeats 1–10 (residues 1–462) bound to RanGTP, identified two main contact areas: (1) HEAT repeat 1 interacting with Ran switch-II region residues W64, G74 and Q82; HEAT repeat 2 interacting with Ran residues E70, D77G78, N103 and D107; HEAT repeat 3 interacting with Ran R110, V111 and E113; and (2) HEAT repeat 7 and the highly conserved acidic loop in HEAT repeat 8 interacting with Ran residues R140KKNLQYY, K159, R166 and P172N173. The structure of the complex with the full-length yeast orthologue (Kap95p; yImpβ1) revealed an additional interaction site within RanGTP, involving the Ran switch-I loop and the C-terminal arch of yImpβ1, producing a change in curvature that locks yImpβ1 in a conformation incompatible with cargo binding.
A sequential binding mechanism has been proposed to occur for RanGTP binding at three distinct sites, with residues in the switch-II loop binding to the CRIME motif in N-terminal HEAT repeats 1–4 first, followed by the basic patch on Ran (K134, H139RK) binding to acidic residues in HEAT repeats 7 and 8 and finally the switch-I loop binding HEAT repeats 12–15, with residues R29 and K37 forming salt bridges and H-bonds; F35 making hydrophobic interactions; and K152, N154, N156 and F127 also contributing to the interface. The third site, where the RanGTP switch-I loop binds the C-terminal arch of yImpβ1, is crucial for locking the molecule in a conformation with increased curvature that cannot bind cargo. This allosteric mechanism enables the release of cargo because it results in Impβ1 becoming locked in a conformation that prevents the flexibility required to bind to different partners. The size of the yImpβ1:RanGTP interface (2159 Å² [28]) is similar to the interfaces found in cargo and adaptor complexes, and there is limited overlap between the binding sites. The data therefore strongly support an allosteric mechanism of control for cargo release.

The structure of the yImpβ1:RanGDP complex revealed further insights into Ran binding in the nucleus and cytoplasm [29]. The crystal structure showed that the Ran switch-I and switch-II regions are induced by yImpβ1 into a GTP-bound conformation so that, rather than these switch regions precluding binding to yImpβ1, they are forced into conformations that enable yImpβ1 binding. This is consistent with other reports of binding partner-induced conformational changes within Ran and Ras switch regions [27,30]. One region that was shown to be differentially positioned in yImpβ1:RanGDP and yImpβ1:RanGTP structures corresponds to the “basic loop” contained within Ran, comprising residues 133–144, which move 8 Å between the two structures. Significantly, this structure highlights that the yImpβ1 HEAT repeats 7 and 8 are exposed in the RanGDP complex, allowing binding partners such as Impα to disassemble the complex [29].

Impβ1 and RanGTP dissociation and recycling

After translocation to the cytoplasm, recycling of Impβ1 and other transport factors is achieved by the conversion of Ran into the GDP-bound state. Accessory proteins located at the cytoplasmic face of the NPC facilitate the dissociation of the kinetically stable importin:RanGTP complexes and the generation of RanGDP; the intrinsic GTPase activity of Ran is low and GTP hydrolysis is prevented when Ran is in complex with β-Kap family members [31]. Although the precise mechanism of RanGTP dissociation from transport factors is unclear, one model implicates RanGAP, together with Ran-binding domains (RanBDs) from either of the Ran-binding proteins RanBP1 or RanBP2 [31]. The structure of Ran in complex with RanBD1 (from RanBP2) and a fragment of Impβ1 (residues 1–462) has been determined [32] (Fig. 3A), and superimposing it on the Impβ1:RanGTP complex identifies clashes between the Impβ1 C-terminal HEAT repeats and RanBD1. This suggests that the primary role of RanBDs is to destabilize the interaction between the transport factors and RanGTP through steric hindrance. The structure of Ran:GppNHp (a non-hydrolysable GTP analogue) in complex with RanBD1 (of RanBP2) and RanGAP has also been determined [33] (Fig. 3B), and comparison with the Impβ1:RanGTP complex reveals further clashes between Impβ1 and RanGAP.

Fig. 3. Structures of (A) Ran:RanBD1:Impβ1(1–462) [32] and (B) RanGppNHp:RanBD1:RanGAP complexes (PDB entry 1K5D). RanBD1 forms a molecular embrace with Ran, sequestering Ran's C-terminus. Superimposition of the full-length RanGTP-Impβ structure onto the ternary complexes reveals steric clashes between Impβ and RanBD1 or RanGAP.

In vitro radiolabeled-nucleotide assays have demonstrated that RanBP1 does not affect the intrinsic GTPase activity of Ran in isolation. However, RanBP1 has been shown to co-stimulate, with RanGAP, GTP hydrolysis by Ran [34], as well as increase the association rate of RanGTP and RanGAP [35]. This is consistent with the crystal structures of the ternary complexes; RanBP1 forms a molecular embrace with Ran, sequestering the C-terminal region of Ran that inhibits RanGAP-mediated GTP hydrolysis. Moreover, the structure of the ternary complex indicated that, unlike other Ras-family GAPs, RanGAP does not provide catalytic residues to stimulate RanGTP hydrolysis [36,37]. Instead, Ran contains all requisite catalytic machinery and the primary role of RanGAP appears to be the stabilization of the Ran switch-II loop and the repositioning of Ran's catalytic Q69 residue toward the active site [33].

In contrast to its GTP-bound state, RanGDP has low affinity for transport factors (~2 μM for Impβ1) [29], RanBDs (~10 μM) [38,39] and RanGAP (~100 μM) [35]. GTP hydrolysis therefore precludes rebinding of cytoplasmic RanGDP to transport factors or accessory proteins, rendering dissociation essentially irreversible on the one hand while enabling the recycling of its binding partners for further rounds of import and disassembly on the other.

Impβ1:Nup interactions

To mediate translocation across the nuclear envelope, Impβ1 interacts directly with Nups that contain tandem repeats of motifs based on a Phe-Gly core (FG Nups). The structures of Nup FxFG and GLFG motifs bound to yImpβ1 and Impβ1 show that the interaction sites are primarily located on the convex outer surface in pockets between the A-helices (Fig. 2).
HEAT repeats 5 and 6 bind both the GLFG and FxFG peptides [40,41]. The high-affinity Nup1p (residues 963–1076) binding sites on yImpβ1 are located between the A-helices of HEAT repeats 5–8. The first and second Nup1p (residues 974–988) binding sites are between HEAT repeats 7 and 8 and HEAT repeats 6 and 7, respectively, with the third site between HEAT repeats 5 and 6. Because of the repetitive sequences within the Nup1p C-terminal domain, it is unclear whether the binding at the third site involves Nup1p residues 999–1011 or residues 1019–1031. The interactions within all three sites are predominantly hydrophobic and involve the Phe aromatic rings, as well as contributions from the adjacent hydrophobic residues (site 1, F977 and P979; site 2, F987 and I985; site 3, F1008 and I1007 or F1027 and I1026). There is also an additional hydrophobic interaction between P983 and HEAT repeat 7, as well as several H-bonds to the peptide backbone of Nup1p at each site. The total buried surface area of interactions between Impβ1 and other FG Nup cores is ~1000 Å². However, for the yImpβ1:Nup1p FxFG complex, the buried surface area of the three binding sites is twice that observed for the other interactions, with 2210 Å² buried [41]. Although the higher affinity of Nup1p for yImpβ1 cannot offer a mechanism for translocation, concentrating yImpβ1 at the nuclear face can enhance the kinetics of import complex dissociation and thus the overall rate of transport [41].

Bednenko et al. proposed that there was a second, weaker, FG binding site on human Impβ1, located between HEAT repeats 14 and 16 and involving Leu612, Phe688 and Leu695 [42]. Although mutations of this site alone did not impair the binding of an FG peptide, these mutations did impair function in conjunction with mutations in the primary FG binding site. Molecular dynamics calculations [43] indicated that there may be additional FG binding sites on Impβ1, but this work has not been validated by mutagenesis.
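The competition between Impβ1 binding partners described in the preceding subsections can be made concrete with a small sketch. It takes the HEAT-repeat footprints quoted in the text (2–11 for PTHrP, 7–17 for SREBP2, 5–14 for Snail1 and 7–19 for the IBB domains) and computes which repeats are shared. The names `FOOTPRINTS`, `repeats` and `shared_repeats` are illustrative and not taken from any published tool, and overlap at the level of whole repeats is only a coarse proxy for steric competition at the residue level.

```python
# HEAT-repeat footprints on Impβ1 as quoted in the text (inclusive ranges).
# Repeat-level overlap is a coarse approximation, not a structural clash test.
FOOTPRINTS = {
    "PTHrP": (2, 11),
    "SREBP2": (7, 17),
    "Snail1": (5, 14),
    "IBB domain": (7, 19),
}


def repeats(span):
    """Expand an inclusive (first, last) pair into a set of repeat numbers."""
    first, last = span
    return set(range(first, last + 1))


def shared_repeats(names):
    """Return the sorted HEAT repeats contacted by every named partner."""
    sets = [repeats(FOOTPRINTS[name]) for name in names]
    return sorted(set.intersection(*sets))


if __name__ == "__main__":
    # Repeats used by all three directly bound cargoes.
    print(shared_repeats(["PTHrP", "SREBP2", "Snail1"]))
    # Repeats shared between a cargo and the adaptor IBB domain.
    print(shared_repeats(["Snail1", "IBB domain"]))
```

Under these ranges, the three cargoes all use HEAT repeats 7–11, the same repeats that contact the N-terminal part of the IBB domains, which is in line with the statement that only one cargo can bind at a time.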
Transportin-1/PY-NLS-mediated nuclear import

Transportin-1 (Trn1) imports a broad spectrum of cargoes, many of which are mRNA-binding proteins or transcription factors. Like other β-Kaps, Trn1 is composed of a series of HEAT repeats arranged as C- and N-terminal arches (reviewed in Ref. [16]). Compared to Impβ1, it contains one additional HEAT repeat and a large 62-residue loop that connects the two helices within HEAT repeat 8 and that appears to be involved in cargo release upon RanGTP binding.

Trn1 binds cargoes directly, recognizing them through a broad range of loosely related NLSs that have a characteristic C-terminal Pro-Tyr (PY) motif, which generally has an Arg preceding it by 2–5 residues, giving a consensus of RX2-5PY [44]. The motif in the best-characterized cargo, the splicing factor hnRNP A1, is known as the M9-NLS [45–47]. These motifs lack defined elements of secondary structure so that they can adopt a conformation to match the binding surface on Trn1. These NLSs generally also contain either a basic or a hydrophobic cluster N-terminal to the PY motif. The hydrophobic PY-NLSs contain two motifs separated by 8–13 amino acids, an N-terminal Φ-G/A/SU motif (Φ represents a hydrophobic amino acid) and a C-terminal sequence R/K/H-X2-5-P-Y. Structures of a range of these NLSs bound to Trn1 show considerable variability in the way they bind. The region in the C-terminal arch of Trn1, to which the N-terminal clusters in the NLSs bind, is rich in negatively charged residues and also contains a number of scattered clusters of hydrophobic residues, enabling it to accommodate a considerable range of different NLS sequences [48,49]. Structure-guided mutations have not always been successful in disrupting these interactions, reflecting the complexity with which Trn1 recognizes its cargoes.
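As a rough illustration of the C-terminal consensus R/K/H-X2-5-P-Y given above, the following sketch scans a one-letter amino-acid sequence with a regular expression. The function name `find_py_motifs` and the toy sequences are hypothetical and used only for illustration. A match to this short pattern is not evidence of a functional PY-NLS, since recognition by Trn1 also depends on the N-terminal basic or hydrophobic cluster and on the motif lying in a structurally disordered region.

```python
import re

# C-terminal PY-NLS consensus from the text: R, K or H, then 2-5 residues
# of any kind, then Pro-Tyr. This is only the C-terminal half of the signal.
PY_MOTIF = re.compile(r"[RKH].{2,5}PY")


def find_py_motifs(sequence):
    """Return (start_index, matched_text) for each non-overlapping match.

    Indices are 0-based positions in the input sequence.
    """
    return [(m.start(), m.group()) for m in PY_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequences, not real protein sequences.
    print(find_py_motifs("GGRSSGPYGG"))  # R followed by 3 residues, then PY
    print(find_py_motifs("GGRPYGG"))     # spacer too short, no match expected
```

A pattern search of this kind is best treated as a first filter to suggest candidate motifs for experimental or structural follow-up.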
Like other β-Kaps, Trn1 binds its cargoes in the cytoplasm, where Ran is primarily in its GDP-bound state, and releases them in the nucleus when RanGTP binds.

Trn1:hydrophobic PY-NLS complexes

The available structures include the complexes with the NLSs from the heterogeneous nuclear ribonucleoproteins hnRNP A1 and D, nuclear RNA export factor TAP, hnRNP D-like protein JKTBP and fused in sarcoma protein (FUS) [44,50,51] (Fig. 4). In the Trn1:hnRNP A1 crystal structure, the so-called M9-NLS (residues 257–305) binds the C-terminal arch of Trn1, comprising HEAT repeats 8–20. The hydrophobic N-terminal motif residues F273 and P275 form hydrophobic contacts with Trn1 residues I773 and W730, respectively. Beyond P288Y289, the PY-NLS is disordered. Residues 263–266 in hnRNP A1 interact with a hydrophobic patch in HEAT repeats 18 and 19 on the outer convex surface, as well as with HEAT repeat 20 on the inner concave surface. Residues 267–269 bind the loop in HEAT repeat 18, while the rest of hnRNP A1 follows the inner concave C-terminal arch to contact HEAT repeats 8–17. Isothermal titration calorimetry analysis of site-directed mutants indicates that the most significant energy contributions to the interaction come from the N-terminal hydrophobic motif [48]. The hnRNP A1 M9-NLS is antiparallel with the Trn1 superhelix, and the interaction buries 3432 Å² of surface area. Trn1 cargo binding involving other hydrophobic PY-NLSs follows an analogous pattern, with the hydrophobic hnRNP D PY-NLS (residues 332–355) binding to Trn1 HEAT repeats 8–18, whereas the TAP NLS (residues 53–82) binds to HEAT repeats 8–13 (but not HEAT repeats 14–18); the JKTBP NLS (residues 396–420) binds to HEAT repeats 8–13.

Fig. 4. Structures of Trn1. PDB entries: hnRNP A1 complex (2H4M), hnRNP D complex (2Z5N), TAP complex (2Z5K), JKTBP complex (2Z5O), FUS complex (4FQ3), Nab2 complex (4JLQ), HCC1 complex (4OO6), hnRNP M complex (2OT8), RanGppNHP complex (1QBK) and apo (2Z5J). The HEAT repeats involved in cargo binding are highlighted in dark yellow. A representative Trn1 in apo-form has all cargoes overlaid.

Similar to other PY-NLSs, the FUS PY-NLS (residues 498–526) occupies the C-terminal arch of Trn1, but unlike other PY-NLSs that are structurally disordered, the central segment of the FUS peptide (residues 514–522) forms a 2.5-turn α-helix. The FUS PY-NLS interacts with Trn1 at three major sites. The first site involves the N-terminal residues 508–511 of FUS forming hydrophobic interactions with Trn1 residues W730 and I773. Within this N-terminal hydrophobic motif, FUS K510 makes hydrophobic interactions with Trn1 residue W730 and salt bridges with Trn1 E653 and D693. In the second “central” site, FUS residues 514–522, arranged as an α-helix, interact with Trn1 HEAT repeats 9–12 at residues D509, D543, D550, E588 and D646, forming H-bonds and salt-bridge interactions with all five of its basic residues (R514, H517, R518, R521 and R522). In the third site, the C-terminal residues P525Y526 interact primarily hydrophobically with Trn1 residues A381, L419, I457 and W640. Overall, the FUS PY-NLS binds Trn1 through hydrophobic interactions at both the N-terminus and the C-terminus of the peptide and through electrostatic interactions in its central α-helix. Recently, FUS binding to Trn1 was shown to also occur within the adjacent unmethylated RGG3 domain (residues 472–507), suggesting that it could act independently as a Trn1-dependent NLS or as an accessory domain capable of extending the PY-NLS [52]. Overall, the C-terminal PY motif and the N-terminal hydrophobic motif in these PY-NLSs are recognized by Trn1 HEAT repeats 8–13 and HEAT repeats 14–18, respectively.

Trn1:basic PY-NLS complexes

In the structure of Trn1 in complex with yNab2 (residues 205–242) [51] (Fig.
4), Nab2 residues T234RFNPL240 bind Trn1 in an extended conformation, occupying the same binding site as the RX2-5PY motifs in other Trn1:PY-NLS structures. The Nab2 PY-NLS contains a “PL” instead of the canonical “PY”. The N-terminal R235 makes salt-bridge and H-bond interactions with D543, T506, E509 and T547 of Trn1. Additionally, F236 makes hydrophobic contacts with Trn1 residues A499, E498 and W460; P238 interacts predominantly through hydrophobic interactions with Trn1 residues L419, I457 and W460; and L239 interacts hydrophobically with Trn1 residues L419, A381, A422 and W460. The Tyr aromatic ring can make more hydrophobic or polar interactions than the Leu in the PL motif, resulting in a higher energetic contribution. The structure of the Trn1:HCC1 (hepatocellular carcinoma protein 1) complex (PDB ID 4OO6) has been deposited; however, a detailed description of the structure has not yet been published. We provide a brief comparison with other similar NLSs. HCC1 contains a basic PY-NLS, and analysis of the interface with Trn1 reveals that residues R92GRYRSPY99 bind HEAT repeats 8–14, with H-bonds formed between HCC1 Y95, R96 and Y99 and Trn1 residues D384, A423, S502 and T506. The basic PY-NLS of hnRNP M binds Trn1 HEAT repeats 8–16 [48]. Unlike hnRNP A1, which binds the convex side of the N-terminal Trn1, the N-terminus of hnRNP M binds toward the Trn1 arch opening. In the N-terminal basic motif of hnRNP M, residues E51KNI54 bind the same region of Trn1 as hnRNP A1 residues 274–277. However, in contrast to hnRNP A1, hnRNP M is ordered beyond the PY motif, with five residues extending C-terminally. Hydrophobic interactions involve aliphatic portions of hnRNP M residues K52 and I54 and Trn1 residues W730, I642, D646 and Q685. Surprisingly, unlike hnRNP A1, the most significant energy contributions come from the C-terminal PY motif rather than from the N-terminal motif.
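The HEAT repeat ranges quoted for the individual Trn1:PY-NLS complexes can be compared directly. The Python sketch below is a bookkeeping aid only: the dictionary and function names are illustrative and not from any published tool, the ranges are those reported in this review, and the hnRNP A1 entry assumes the full 8–20 span of the C-terminal arch described for that complex. FUS and Nab2 are left out because no overall repeat range is quoted for them.

```python
# HEAT repeat ranges (inclusive) contacted by each cargo NLS, as quoted in the text.
TRN1_CONTACTS = {
    "hnRNP A1": (8, 20),  # assumption: full C-terminal arch span given for this complex
    "hnRNP D": (8, 18),
    "TAP": (8, 13),
    "JKTBP": (8, 13),
    "HCC1": (8, 14),
    "hnRNP M": (8, 16),
}


def shared_repeats(contacts):
    """Return the sorted HEAT repeat numbers contacted by every listed cargo."""
    sets = [set(range(first, last + 1)) for first, last in contacts.values()]
    return sorted(set.intersection(*sets))


if __name__ == "__main__":
    print(shared_repeats(TRN1_CONTACTS))
```

The intersection is HEAT repeats 8–13, the region that binds the C-terminal PY element, whereas the repeats contacted beyond 13 differ between cargoes.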
Overall, although Trn1 uses the same residues (HEAT repeats 8–13) to bind the C-terminal PY-NLS motifs in all PY-NLSs, the N-terminal regions of basic and hydrophobic PY-NLSs only partially overlap.

Trn1:Ran complex

As in other β-Kap family members, Ran binding to Trn1 in the nucleus displaces the imported cargo. Trn1 has a characteristic acidic H8 loop, and the proposed NLS dissociation mechanism involves RanGTP binding to the Trn1 N-terminal arch. This causes conformational changes that push the H8 loop into the principal cargo-binding site in the C-terminal arch of Trn1, causing cargo release [53]. The structure of the Trn1:RanGTP complex (Fig. 4) [54] reveals two distinct binding interfaces: an N-terminal interface involving interactions between Trn1 HEAT repeats 1–4 and the switch regions of Ran (residues 64–110) and a more centrally located interface involving HEAT repeats 7–8 and 14–15 and loop-8 (residues 311–373) of Trn1 binding to Ran α-helices α4 and α5, β-strand β6 and the intervening loops. In the N-terminal interface, the C-terminal region of Ran switch-I (residues 44–47) interacts with Trn1 HEAT repeat 1; the switch-II region of Ran (residues 72–82) is buried at the interface by hydrophobic contacts to Trn1 HEAT repeat 2. Overall, these interactions mediate a combined buried surface area of 3900 Å² and involve both polar and hydrophobic contacts, with most of the polar contacts contributed by Ran R106 and R110 and Trn1 S165, D164 and E161.

Transportin-3/RS repeat NLS-mediated nuclear import

Transportin-3 (Trn3), composed of 20 HEAT repeats, mediates the nuclear import of many proteins containing arginine-serine (“RS”) repeat NLSs [55–57]. These proteins are typically involved in mRNA metabolism and include the alternative splicing factor/splicing factor 2 (ASF/SF2). Structures of Trn3 bound to ASF/SF2 and Ran, and its unbound form, are available [58] (Fig. 5).
The flexibility of the HEAT repeat region is important for binding of the RS domains, as well as the RNA recognition motif (RRM) in ASF/SF2. The significant overlap in binding regions between RanGTP and the cargo explains the structural basis of release.

Fig. 5. Structures of Trn3. PDB entries: ASF/SF2 complex (4C0O), Ran complex (4C0Q) and apo (4C0P). The HEAT repeats involved in cargo binding are highlighted in dark yellow. A representative Trn3 in apo-form has all cargoes overlaid.

Trn3:ASF/SF2 cargo complex

The ASF/SF2 protein is an RNA-splicing factor that contains two RRM domains and an RS domain. Residues within HEAT repeat 15, particularly Arg-rich regions within B-helices, are the key binding determinants for the recognition of the phosphorylated RS domain of ASF/SF2 [58]. There are three major binding regions, which bury a total of ~4300 Å² of surface area. The RRM domain (residues 116–191) binds HEAT repeats 4–7 and 19–20, the RS regions (residues 198–211) are bound by HEAT repeats 14–17 and the linker region between the RRM and RS domains is bound by HEAT repeats 12 and 13 [58]. These interactions are mediated by an extensive array of salt bridges. Although the RRM domain accounts for 60% of the buried surface area, the RS domain is actually the major contributor to binding; this is based on (i) mutagenesis studies, (ii) the fact that many RS domain-containing proteins that interact with Trn3 do not contain RRM domains and (iii) the RS-domain NLS being necessary and sufficient to mediate nuclear transport [58].

Trn3:Ran complex

As in other β-Kap:Ran complexes, the structure of Trn3 in complex with RanGTP shows that Ran contacts the B-helices on the concave side of the transport receptor. Sites that mediate binding within Ran include the switch-I and switch-II regions, which interact with the Trn3 HEAT repeats 1–3 and HEAT repeats 17 and 18, respectively [58].
In particular, the switch-I region of Ran inhibits the ability of HEAT repeat 15 to interact with the RS domain.

Apo-Trn3

The 20 HEAT repeats in apo-Trn3 are arranged in a circular shape, whereby the N- and C-terminal repeats face each other. In most β-Kap family members, HEAT repeats pack in a rather uniform manner, but in Trn3, there are several notable exceptions: HEAT repeats 1 and 2 pack perpendicular to each other, the stacking of HEAT repeats 3/4 and 9/10 displays pronounced left-handed twists and HEAT repeat 20 contains an additional C-terminal α-helix. Interestingly, the crystal structure revealed a significant molecular interface mediating homodimer formation, and small-angle X-ray scattering analysis is consistent with this observation; however, the functional role of dimerization is unclear at this stage [58].

Importin-13-mediated nuclear transport

Importin-13 (Imp13) is the closest paralogue of Trn3 but has distinctly different cargo recognition specificity [59]. Whereas Trn3 predominantly interacts with flexible RS domains, Imp13 mediates the nuclear import of several transcription factors containing histone-fold motifs (composed of ~70 amino acids arranged as three α-helices). Similar to Trn1, Imp13 contains 20 consecutive HEAT repeats [60] and interactions with cargo occur on the inner concave surface; however, Imp13 is able to mediate transport of cargoes both into and out of the nucleus. Recent co-crystal structures of Imp13 with the exon junction complex components Mago and Y14, as well as the E2 SUMO-conjugating enzyme UBC9, show that the flexibility of Imp13 is important for cargo binding (Fig. 6).

Imp13:Mago-Y14 cargo complex

The first structure of Imp13 corresponds to the complex with Mago-Y14, revealing that 15 HEAT repeats are involved in binding the cargo [60].
Imp13 adopts a closed ring-like conformation, whereby the N- and C-terminal arches are facing each other, and Mago-Y14 binds to the inner concave-surface helices of the C-terminal arch. HEAT repeats 8 and 9 interact with the Mago β-sheet, at the site where the N-terminal region of Y14 is bound, and HEAT repeat 15 binds at the opposite side of the Mago β-sheet. HEAT repeats 17, 18 and 20 interact with Mago α-helices, and HEAT repeats 4–7 and HEAT repeats 19 and 20 surround Y14.

Fig. 6. Structures of Imp13. PDB entries: Mago-Y14 complex (2X1G), UBC9 complex (2XWU), Ran complex (2X19) and apo (3ZKV). The HEAT repeats involved in cargo binding are highlighted in dark yellow. A representative Imp13 from Imp13:Ran complex has all cargoes overlaid.

Imp13:UBC9 cargo complex

The crystal structure of the Imp13:UBC9 cargo complex showed a unique cargo recognition mode, with UBC9 bound within the N-terminal arch of Imp13 to occupy the RanGTP-binding site in that region [61]. Unlike the Imp13:Mago-Y14 complex, the N- and C-terminal HEAT repeats of Imp13 are positioned away from each other. UBC9 lies between HEAT repeats 1 and 9 and forms interactions within the inner concave surface of Imp13. UBC9 makes interactions mainly through three of its loops. The first interacting region involves hydrophobic interactions mediated by a loop and a helix that bind both helices within HEAT repeat 1 and the B-helix of HEAT repeat 2 of Imp13. I125 of UBC9 makes contact with Imp13 residues Y34, E73 and Y76, while an additional hydrophobic interaction involves UBC9 Y134, positioned toward Imp13 L33 and Y34.

Imp13:Ran complex

As for Impβ1 and Trn1, cargo release and directionality of Imp13-mediated nuclear import are achieved by Ran.
However, cargo release by Impβ1 and Trn1 relies on the characteristic acidic loop within HEAT repeat 8, which is lacking in Imp13, and therefore, the mechanism of cargo release is likely to be different in these transport molecules. The Imp13:RanGTP structure shows RanGTP interacting with the inner concave helices contained within the N-terminal arch of Imp13 at three sites, similar in position to those identified in Impβ1, Trn1 and Trn3 [62]. The Ran switch-I loop binds Imp13 at HEAT repeats 16–19 with predominantly polar and electrostatic contacts (e.g., Ran K39K40 binding Imp13 D785/D788) [62]. The Ran switch-II loop binds Imp13 HEAT repeats 1–3, with hydrophobic interactions involving Ran L77 and electrostatic contacts between Ran D79 and Imp13 R122. The helix adjacent to the switch-II loop also contacts HEAT repeats 3 and 4; in particular, Ran residues R108 and R112 contact Imp13 residues E175 and E176 [62]. The third binding site of RanGTP is through Imp13 HEAT repeats 8 and 9, with Ran residues R168K169 approaching negatively charged Imp13 residues D415E416 on helix 9B. Thus, unlike the mechanism of cargo release by Impβ1, Imp13 releases cargo because the cargo clashes sterically with bound RanGTP.

Impα-Mediated Nuclear Import

In this pathway, the Impα:Impβ1 heterodimer binds to cargo proteins containing cNLSs [63]. The translocation through the nuclear pore is achieved through transient interactions between Impβ1 and Nups. This process, known as the classical nuclear import pathway, is thought to be the most extensively used nuclear import mechanism in the cell [64–66].

Monopartite and bipartite cNLSs

The first nuclear targeting motif was identified in the simian virus SV40 large T-antigen (TAg) through mutational studies. It comprises a small stretch of positively charged amino acid residues (P126KKKRRV132).
Non-conservative substitutions of residues within this motif abrogated nuclear distribution of the cargo protein [67]. Furthermore, fusion of this motif to cytoplasmic proteins such as β-galactosidase induced their nuclear accumulation [68]. A similar, but more complex, signal was later defined for the Xenopus laevis nucleoplasmin protein, consisting of two clusters of basic amino acids separated by a 10- to 12-residue linker region (K155RPAATKKAGQAKKKK170) [69]. Substitution of residues within either basic cluster altered the nuclear distribution of the protein, suggesting that both motifs were required for nuclear targeting. By contrast, mutation of residues within the linker region had no effect on nuclear distribution [70]. The two sequences are now commonly described as the prototypic monopartite (SV40-TAg) and bipartite (nucleoplasmin) cNLSs, and numerous cNLS-containing cargo proteins have since been identified based on sequence similarity with these two [65] (Supplementary Table 2). Using in vitro transport assays in digitonin-permeabilized cells, we showed the active import of cNLS sequences to be dependent on soluble cytoplasmic factors [71]. This in vitro system was later used to identify and characterize essential transport factors, Impα and Impβ1, that could reconstitute nuclear import of cNLS sequences when reintroduced to cytosol-depleted cells [63,72–76]. Consequently, cNLS cargoes are defined by the presence of one or two sequence clusters rich in Arg and Lys that are necessary and sufficient for nuclear import by the Impα:Impβ1 complex.

Structure of Impα

Impα has a modular structure composed of a short N-terminal auto-inhibitory region that also mediates binding to Impβ1 (the IBB domain) [77] and a larger C-terminal NLS binding domain composed of 10 ARM repeats [78,79]. The ARM repeat motif, first described for the Drosophila melanogaster armadillo protein [80], is composed of three α-helices (H1, H2 and H3).
The continuous stacking of the tandem ARM repeats generates a superhelical solenoid, with the H3 helices forming the Impα inner concave surface (Fig. 7A). A rotation between consecutive ARM repeats creates a groove along the superhelical axis of the protein, where the NLS binding sites are located [11]. The ARM repeat solenoid appears to be much less flexible than the HEAT repeat solenoids in β-Kaps. The structures of Impα proteins from different organisms have been determined (human [81–85], Saccharomyces cerevisiae [13,79,86], mouse [12,87–102], rice [100,103], Arabidopsis thaliana [104] and Neurospora crassa [105]; Supplementary Table 3). These structures all comprise 10 ARM repeats, but their curvatures vary, particularly between proteins from different phylogenetic families [128]. The structural variations result in differences in binding NLSs, as observed for rice and mouse Impα bound to the same NLS peptide, for example [66].

Structural basis of cNLS recognition by Impα

X-ray crystallography has been used extensively to elucidate the molecular details of cNLS binding to Impα. The concave surface formed by Impα H3 helices comprises the cNLS binding site, which displays a high degree of sequence conservation between Impα proteins from various organisms (Fig. 7B). More specifically, conserved (^R/K)XXWXXXN motifs (where X is any residue, and ^R/K is any residue other than Arg/Lys, typically an acidic or hydrophilic residue) within the H3 helices form an array of binding pockets along the inner concave groove of the Impα adaptor. The conserved Asn residues form H-bonds with the cNLS backbone, whereas the invariant Trp side chains form an array of binding cavities on the adaptor surface, typically with acidic residues (^R/K) located at the end of these pockets.
Thus, the aliphatic moieties of long basic side chains, such as Lys and Arg, can interact with the stacked indole rings of the Trp array, while the positively charged portion of the side chain can simultaneously form H-bonds and salt bridges with the hydrophilic residues that line the pockets and can also form cation-π interactions with the electron clouds of the tryptophan indoles. Disruption of the (^R/K)XXWXXXN motif within ARM repeats 5 and 6 has been observed in all Impα proteins with known structure. This disruption effectively creates and segregates two distinct binding regions on the Impα surface termed the major (ARM repeats 2–4) and the minor (ARM repeats 6–8) binding sites (Fig. 7). Monopartite and bipartite cNLSs bind to the Impα binding sites in an extended conformation (Fig. 7). Structural analyses have demonstrated that monopartite cNLSs preferentially bind to the major binding site, reflected through lower crystallographic B-factors and the presence of more extensive electron density when compared to that observed at the minor binding site [79,87,97]. In addition, substitution of residues within the major binding site can abrogate nuclear accumulation of monopartite cNLS cargoes, whereas minor binding site mutations have minimal effects [106]. By contrast, mutation of residues at either minor or major binding sites can severely disrupt the interaction with bipartite cNLSs [106], which interact simultaneously with the two binding regions on the Impα surface.

Fig. 7. NLS binding by Impα. (A) Structure of rice Impα (PDB entry 4BQK) with H3 helices colored green. (B) Structure of mouse Impα (PDB entry 3UL1) in complex with nucleoplasmin (Npl) cNLS (shown as black sticks). The mouse adaptor is colored by sequence conservation based on known Impα structures (human Impα1, human Impα3, human Impα5, human Impα7, Mus musculus Impα1, S. cerevisiae Impα, Oryza sativa Impα, A. thaliana Impα3 and N. crassa Impα1). (C) Schematic representation of a monopartite NLS binding at the Impα major and minor site pockets. Conserved Asn and Trp residues of Impα are shown in green and yellow, respectively. Monopartite NLS main chains and side chains are shown as black and blue lines, respectively. Broken lines indicate common salt-bridge interactions at the P2- and P2′-binding cavities. (D) Structure of mouse Impα with atypical minor site-binding Guα NLS shown in blue. Impα residues comprising the (^R/K)XXWXXXN motif in ARM repeats 7 and 8 are shown in stick representation. Guα residues that bind to minor site cavities are indicated. (E) Structure of full-length yeast Impα (PDB entry 1WA5; IBB domain shown in green) superimposed onto yeast Impα:Nup2p (PDB entry 2C1T; Nup2p shown in magenta) and yeast Impα:nucleoplasmin cNLS (PDB entry 1EE5; NLS shown in orange) complexes. For clarity, only one Impα ARM repeat domain is shown in gray surface representation.

The major binding site of Impα is composed of four principal binding cavities that bind the side chains of cNLS residues P2–P5 (Fig. 7). Structures of numerous cNLSs bound to Impα have been determined (Supplementary Table 2). For all characterized monopartite and bipartite sequences, the most crucial structural determinant is a Lys residue located at position P2. Its side chain forms a salt bridge with a highly conserved Impα Asp side chain (Fig. 7). Consistently, substitution of this conserved Asp residue results in an ~300- to 400-fold decrease for monopartite and bipartite cNLS binding in yImpα [106]. Although an Arg side chain can bind at the P2 position (PDB entry 4HTV; Supplementary Table 2), mutational studies on the SV40-TAg cNLS have demonstrated that a Lys is energetically favored at this position [107,108].
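To put the ~300- to 400-fold loss of binding on an energy scale, a fold change can be converted to a free-energy difference using ΔΔG = RT ln(fold). The Python sketch below makes two assumptions that are not stated in the text: that the fold change reflects a ratio of dissociation constants, and that the temperature is 298.15 K. The function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def fold_change_to_ddg(fold_change, temperature_k=298.15):
    """Convert a fold change in dissociation constant to ddG in kcal/mol.

    Assumes ddG = R * T * ln(fold_change); the default temperature is an assumption.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    for fold in (300, 400):
        print(fold, round(fold_change_to_ddg(fold), 2))
```

Under these assumptions, a 300- to 400-fold change corresponds to about 3.4 to 3.5 kcal/mol.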
Although preference for long basic side chains is observed at the other major binding site positions (P3, P4 and P5), cNLSs can have a range of different amino acids in these positions (Supplementary Table 2). The calculated free-energy contributions of the SV40-TAg basic side chains at the P3–P5 positions are between 1/4 and 2/3 of that observed for the P2 Lys residue, with the P4 position contributing the least free energy to the interaction with Impα [107]. This suggests that basic side chains are not strictly necessary at the P3, P4 and P5 positions, provided that the overall affinity of the cNLS cluster is sufficient to constitute a functional cNLS motif. This is achieved by maximizing interactions at the other major binding site pockets, the regions directly flanking the major binding site or binding at the minor binding site in the case of bipartite cNLSs. The two key amino acids in the N-terminal region of bipartite cNLSs (positions P1′–P2′) bind in a conserved manner to the minor NLS binding site, but adjacent auxiliary cavities can be used differentially in a cNLS-specific manner (Supplementary Table 2). In bipartite cNLSs, a “KR” motif is observed predominantly at these positions (Supplementary Table 2), with the P2′ Arg side chain forming a salt bridge with a conserved Impα Glu residue. The total energetic contribution of the P1′ and P2′ pockets has been calculated to be 3.2 kcal/mol, comparable to that observed for a basic residue at the P3 or P5 position [107]. Although this interaction is modest, the addition of a KR motif N-terminal to a non-functional SV40-TAg variant, whereby the critical P2 Lys residue was replaced with a Thr, was sufficient to direct nuclear accumulation of the protein [109].
Thus, compared to monopartite motifs, the sequence requirements at the major binding site are not as strict in bipartite cNLSs due to the additional interactions at the minor binding site, as well as cNLS-specific linker region interactions. Structural studies showed that a minimum of 10 residues between the P2′ and P2 positions is required to allow functional cNLSs to interact simultaneously with both the major and minor binding sites on the Impα surface [56]. Furthermore, early localization assays demonstrated that the linker region could tolerate non-conservative substitutions, as well as insertions [67], and these observations are consistent with the minimal interactions observed between bipartite cNLS linker regions and the Impα surface in crystal structures. Consistently, bipartite cNLS linker region residues are not well ordered and have higher crystallographic B-factors than residues at the major and minor binding sites. In some cNLSs, electron density is absent for most linker-region residues (Supplementary Table 2, residues in italics indicate residues not visible in crystallographic models), suggesting that bipartite linker regions longer than 12 residues likely bulge away from the Impα surface but are functional, provided that sufficient contacts at the major and minor binding regions are maintained. Notably, peptide library studies have demonstrated a preference for acidic residues in bipartite linker region sequences [110], and structural studies have shown that negatively charged side chains can form electrostatic interactions with the basic surface of Impα ARM repeats 4–6 [90]. Taken together, structural and biochemical data have revealed the molecular determinants of cNLS binding to the Impα adaptor. These studies have therefore enabled the elucidation of consensus sequences of both types of cNLSs.
The monopartite cNLS motif is defined as K(K/R)X(K/R), whereas the bipartite cNLS consensus sequences correspond to KRX10-12KRRK, KRX10-12K(K/R)(K/R) and KRX10-12K(K/R)X(K/R) (where X corresponds to any residue; the Lys that opens the monopartite motif, or that follows the linker in the bipartite motifs, is the critical P2 lysine, and the leading KR of the bipartite motifs binds the minor site) [65,90].

Atypical Impα-dependent NLSs

Several nuclear targeting signals are dissimilar to the cNLSs described above but are nevertheless recognized by Impα. An analysis of binding of a random peptide library to Impα variants revealed six classes of NLSs, including two types of non-cNLSs, which were annotated “plant-specific” (consensus sequence LGKR[K/R][W/F/Y]) and “minor site-specific” (consensus sequences KRX[W/F/Y]XXAF and [R/P]XXKR[K/R][^DE]) NLSs [111]. Both of these types of NLSs feature a short basic cluster flanked C-terminally by hydrophobic residues. Unique features of the binding of atypical NLSs to Impα have been identified by structural and biochemical studies. Crystal structures of mouse Impα in complex with poorly basic NLSs (G257KISKHWTGI266; G273SIIRKWN280) from the human phospholipid scramblase isoforms hPLSCR1 and hPLSCR4 show binding to the major and minor NLS binding sites, respectively [95,99]. Exclusive binding to the minor NLS binding site is also observed in naturally occurring NLSs {e.g., the mouse RNA helicase II (Guα) NLS (K842RSFSKAF849) [101]}, whereas the NLS of the mitotic regulator protein TPX2 (K284RKH287) binds predominantly to the minor site but could be considered an atypical bipartite NLS (with K327MIK330 binding to the major site) [91]. Unlike other NLSs that bind in an extended conformation, structural analysis of the Guα NLS and four other “minor site-specific” NLSs in complex with mouse Impα revealed that the C-terminal residues of these NLSs form an α-helical turn [101].
This distinct structure of the NLS is stabilized by internal H-bond and cation-π interactions between the aromatic residues from the NLSs and the positively charged residues from Impα. Such a conformation is prevented sterically at the Impα major binding site, explaining the minor site preference of these motifs [101]. Although contacts with “minor site-specific” NLSs are also observed at the major binding site (Supplementary Table 2), the NLS peptides at the minor binding region have more extensive interactions and lower crystallographic B-factors. Synthetic peptides corresponding to “plant-specific” NLSs show preferential binding to the minor NLS binding site of rice Impα, although the structural determinants of their binding mode are different from the ones observed for other “minor site-specific” NLSs [100]. Although putative naturally occurring “plant-specific” NLSs can be found using sequence analyses [100], they have not yet been characterized experimentally. Additional atypical NLSs have been identified that have not been characterized structurally, for example, in Borna disease virus P10 protein (R6LTLLELVRRLNGN19) [112]. Bioinformatic analyses of the distribution of different classes of NLSs in diverse eukaryotes indicate that the atypical NLSs are much less prevalent than the monopartite and bipartite cNLSs [66].

Structures of Impα in complex with native proteins

Recognition of NLSs by Impα requires that these linear sequence motifs adopt an extended conformation. Consistently, cNLSs are located in disordered (and thus flexible) regions of native proteins, and structural characterization of the Impα-cNLS interaction has therefore predominantly involved the use of peptide sequences that correspond to NLS segments.
Structural analyses of Impα in complex with native cNLS-containing proteins or domains have only been described for the influenza virus PB2 C-terminal fragment (residues 628–759) [83,84], the human cap-binding protein CBP80 (in complex with CBC20) [81] and the Ebola virus VP24 protein [113]. The PB2 C-terminal fragment has been crystallized in complex with four different human Impα isoforms (Impα5 [83] and Impα1, Impα3 and Impα7 [84]). In all these structures, a PB2 globular domain Lys residue located outside the canonical bipartite cNLS sequence interacts in trans with residues that comprise the Impα P3′ pocket. In the absence of the globular domain, the PB2 bipartite cNLS is able to interact with the P3′-binding pocket, causing a register shift in the cNLS minor binding site cavities (Supplementary Table 2) [84]. Differences in binding registers have also been described for structures of the SV40-TAg cNLS peptide (Supplementary Table 2), suggesting that these linear motifs can differentially bind to the Impα surface to maximize favorable interactions. In contrast to the NLSs described above, the Ebola virus VP24 protein interacts with Impα through a distinctly different mechanism. The crystal structure of truncated human Impα (ARM repeats 7–10) in complex with VP24 shows that the virus protein primarily contacts the extreme C-terminus of the adaptor through three interspersed clusters on the surface of the folded VP24 structure [113]. Moreover, VP24 interacts with the opposite surface of Impα compared to NLS peptide sequences, with the H2 helices of ARM repeats 9 and 10 defining the VP24 interface [113]. Although the binding surfaces of NLS-containing cargo and VP24 do not overlap, a 2-fold difference in binding is observed between the nucleoplasmin cNLS and Impα in the presence of VP24. This suggests that minor binding site interactions of cNLS sequences are allosterically affected by VP24 interaction with the outer Impα H2 helices [113].
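As a rough illustration of how the cNLS consensus sequences given earlier in this section might be applied to a sequence, the Python sketch below encodes the monopartite pattern K(K/R)X(K/R) and one of the bipartite patterns, KRX10-12K(K/R)X(K/R), as regular expressions. The function name is hypothetical, the other bipartite variants and all atypical classes are deliberately omitted, and a pattern match alone does not establish a functional NLS. The test sequences are the SV40-TAg and nucleoplasmin cNLSs quoted in the text.

```python
import re

# Monopartite consensus from the text: K(K/R)X(K/R).
MONOPARTITE = re.compile(r"K[KR].[KR]")
# One bipartite consensus from the text: KR, 10-12 linker residues, then K(K/R)X(K/R).
BIPARTITE = re.compile(r"KR.{10,12}K[KR].[KR]")


def classify_cnls(sequence):
    """Return 'bipartite', 'monopartite' or None for a one-letter amino acid string.

    The bipartite pattern is checked first because every bipartite match
    also contains a monopartite-like major-site cluster.
    """
    seq = sequence.upper()
    if BIPARTITE.search(seq):
        return "bipartite"
    if MONOPARTITE.search(seq):
        return "monopartite"
    return None


if __name__ == "__main__":
    print(classify_cnls("PKKKRRV"))            # SV40-TAg cNLS
    print(classify_cnls("KRPAATKKAGQAKKKK"))   # nucleoplasmin cNLS
```

A scan of this kind finds only linear motifs; it cannot detect surface-mediated interactions such as that of VP24.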
Although bioinformatics analyses have suggested that a moderate proportion of yImpα-binding proteins lack a detectable linear NLS [64], nuclear localization may also occur through “piggy-back” mechanisms, whereby translocation of a non-NLS-containing protein by Impα is mediated via interaction with an NLS-containing binding partner. The prevalence of proteins that, like VP24, bind Impα through non-canonical means is not known, and such proteins cannot be identified by current cNLS prediction algorithms.

Auto-inhibition by the Impα IBB domain

The N-terminal IBB domain of Impα contains a cNLS-like sequence. Similar to cNLS motifs, the Impα IBB domain is rich in basic amino acids and interacts with the NLS binding pockets in the absence of cargo. In the mouse Impα structure, the IBB domain residues K49RRN52 are bound to the major NLS binding site (and correspond to cNLS positions P2–P5) [12]. This is consistent with lower affinity of cNLS binding to full-length Impα, compared to truncated Impα proteins that lack the IBB domain [114]. Likewise, increased affinity for cNLSs was observed when the corresponding K54RR56 motif in the IBB domain of yeast Impα was substituted with alanine residues [115]. This suggests that the IBB domain inhibits cNLS binding. In rice Impα, the K47KRR50 motif in the IBB domain was found to form analogous interactions with the major NLS binding site [100]. Distinct from the mouse structure, however, the G25RRRR29 motif in the rice Impα IBB domain additionally interacts with the minor NLS binding site. Similar to rice Impα, minor and major binding site interactions are observed between the IBB domain of the yeast protein and the solenoid domain when in complex with Cse1p and RanGTP [116]. The auto-inhibitory mechanism in plant and yeast Impα proteins may therefore differ from the mammalian proteins. The IBB domain of Impα also mediates binding to Impβ1 and thus the interaction with Impβ1 promotes cNLS binding to Impα.
The IBB domain-mediated auto-inhibitory mechanism presumably reduces futile import of empty adaptors and hinders cNLS binding when Impβ1 is not present for nuclear translocation. Most structural studies of Impα binding to NLSs have therefore employed a truncated protein lacking the IBB domain.

IBB domain-like NLSs

A recent report describing the structures of the Heh1 and Heh2 inner membrane protein NLSs in complex with yeast Impα has revealed a bipartite mode of binding distinct from that observed in cNLSs [117]. Structural and biochemical analyses of these NLSs suggest similarities to the IBB domain interaction with the Impα adaptor: (1) the conformation of Heh2 NLS is similar to that observed for the IBB domain of full-length yImpα in the auto-inhibited state (when in complex with RanGTP and Cse1p [116]); (2) mutational analyses have identified that the key structural determinant for these NLS motifs is the P2′ pocket; this is in contrast to bipartite cNLS motifs, where nuclear accumulation is only modestly abrogated by disruption of P2′ interaction [70,106]; and (3) pull-down assays demonstrate that the Heh1 and Heh2 NLS can efficiently relieve IBB domain auto-inhibition of Impα in the absence of Impβ1, unlike typical bipartite cNLS sequences [117]. However, in contrast to the Impα IBB domain, direct binding to Impβ1 was not detected for the Heh1 and Heh2 NLSs [117]. Similar to the atypical monopartite NLSs, Heh1 and Heh2 mediate extensive contacts at the Impα minor binding site, with residues in this region having lower average B-factors compared to residues bound at the major binding pockets.

Impα variants

In some organisms, several Impα variants exist as a result of duplication events. The metazoan paralogues can be divided into three clades (α1, α2 and α3); Impα proteins from Viridiplantae and Fungi belong to the α1-like clade [118–120].
The members of different subfamilies share ~50% sequence identity, whereas within a subfamily, the identities are >80% [118]. The structure and recognition mechanism are highly conserved among Impα proteins from different species [90,105] (Supplementary Fig. 1). However, Impα variants can display preferences for specific NLSs, which may be important for development and tissue-specific roles [118]. In D. melanogaster, which encodes three Impα proteins (α1, α2 and α3), oogenesis depends on Impα2, and neither Impα1 nor Impα3 can substitute [121]. The mouse genome codes for six variants and Impα7 plays an essential role during the early stages of embryo development [122]. Some of the seven Impα variants in humans also display preferential interactions with specific cargoes, for example, Impα3 with RCC1 [123] and Impα5 for STAT proteins [124,125]. The overall structure and the key NLS binding features of the different Impα variants are conserved in the crystal structures determined to date (Supplementary Table 3 and Fig. 7). Therefore, the reasons for specific NLS binding preferences by certain variants are not entirely clear but may relate to amino acid differences in the vicinity of NLS binding sites affecting NLS binding and auto-inhibition. Structural analyses showed that plant-specific NLSs bind specifically to the minor NLS binding site of rice Impα but preferentially to the major site of mouse Impα [100], which has been attributed to specific amino acid differences in the C-terminal region of Impα. In particular, a Thr-to-Ser mutation prevents the binding of a plant-specific NLS peptide to the minor site of mouse Impα through steric hindrance. Indeed, it has been shown that SV40-TAg NLS binding mode in the minor NLS site is not the same among Impα structures [105] (Supplementary Table 2). The structures of mouse and yeast Impα in complex with the SV40-TAg NLS peptide show “KK” residues at P1′–P2′ positions, whereas rice and N.
crassa Impα structures show “KR” residues at these positions. The minor binding site may play a more important role in the α1-like Impα family, which includes the rice and N. crassa proteins. The human is the only organism for which the structures of different Impα variants are available [83–85,113,126] (Supplementary Table 3). The overall structures and the NLS binding sites are conserved, consistent with the equivalent affinity in vivo of the influenza A PB2 NLS fragment for different Impα variants [84]. Differences identified between variants include a higher flexibility and reduced affinity between bipartite NLSs and Impα3, as well as different levels of auto-inhibition [84].

Cargo release and recycling of Impα

A consequence of the auto-inhibitory function of the Impα IBB domain is the facilitation of cargo release in the nucleus; once the trimeric complex has traversed the nuclear pore, dissociation of Impβ1 from Impα in the nucleus allows the IBB domain to compete for the NLS binding site. In addition to this mechanism, some reports have implicated Nup50 (Nup2p in yeast) in Impα-cargo disassembly [86,102,116]. Solution-binding assays suggest that the addition of Nup2p can accelerate displacement of NLS cargo from the yeast protein [127]. Consistently, structures of Impα in complex with Nup50 peptide segments reveal interactions between the Nup side chains and the Impα minor binding site (Fig. 7E) [86,102,126]. In addition, contacts are observed outside the NLS binding region between the Nup and the Impα C-terminal region. Impα is recycled back to the cytoplasm by its export factor CAS (Cse1 in yeast), which binds preferentially to cargo-free Impα [128]. Nuclear export of Impα further depends on RanGTP [129,130], and the formation of the trimeric CAS:Impα:RanGTP complex has been shown to be highly cooperative [128]. Binding of CAS-RanGTP to Impα displaces Nup50 through steric hindrance.
Like other members of the β-Kap family, CAS has a superhelical HEAT repeat architecture and wraps around RanGTP and the Impα C-terminal region [116]. Extensive interactions are observed with the outer surfaces of Impα ARM repeats 8–10, in a region that overlaps with the VP24-binding site. Together, these mechanisms ensure that cargo is efficiently displaced from Impα and that recycling of the adaptor to the cytoplasm occurs only after cargo disassembly.

Snurportin-Mediated Nuclear Import

The nuclear import of assembled spliceosomal subunits, the uridine-rich small nuclear ribonucleoprotein particles (UsnRNPs), employs a variation of the classical nuclear import pathway that utilizes a distinct adaptor protein termed snurportin-1. This protein, first identified via UV cross-linking to an m3G-capped oligonucleotide, binds Impβ1 with an IBB domain similar to that found in Impα but lacks the canonical ARM repeat region [131], instead adopting a double β-sheet fold to form the m3G-cap binding pocket [132]. Snurportin-1 binds both the hyper-methylated cap and the first nucleotide of the RNA in a stacking conformation, with the specificity determined by a highly solvent-exposed tryptophan [132].

Symportin-1-mediated nuclear import

Recent reports have described a new adaptor protein termed symportin-1 (Syo1 for synchronized import) that links nuclear cargo to Trn1 for nuclear translocation. Syo1 was identified through tandem affinity purification analysis of the yeast ribosomal protein Rpl5. Biochemical assays demonstrated direct binding of Syo1 to Rpl5 and the related protein Rpl11, as well as stable trimeric Syo1:Rpl5:Rpl11 complexes [133]. As Rpl5 and Rpl11 form a functional cluster within the ribosome, simultaneous binding to Syo1 suggested concomitant import of the ribosomal subunits, unlike other import pathways described to date that mediate binding of individual cargoes. Syo1 is recognized by Trn1 through an N-terminal PY-NLS motif. The structure of Chaetomium thermophilum Syo1 revealed an unusual combination of four N-terminal ARM repeats fused to six C-terminal HEAT repeats in its globular cargo-binding domain (Fig. 8A). The absence of the (^R/K)XXWXXXN motif that forms the binding pockets on the Impα surface suggests that cNLS sequences are not able to bind to Syo1. However, structural characterization of the Syo1-Rpl5 peptide complex reveals that the inner concave surface of the Syo1 HEAT repeats mediates binding to the Rpl5 N-terminal region in a manner highly similar to the Impα-NLS interaction [133]. The Rpl5 peptide binds to Syo1 in an extended conformation with a short helical segment [133]. Conserved basic and aromatic residues throughout the Rpl5 linear motif mediate Syo1 binding. By contrast, the structure of the trimeric complex revealed that Rpl11 binding is mediated by the outer surface of the Syo1 superhelix, with additional contacts from a helical region in the large acidic loop of HEAT repeat 1 (Fig. 8B) [134].

Fig. 8. The structure of symportin-1. (A) Structure of the C. thermophilum Syo1 adaptor (PDB entry 4GMO). The protein has an extended superhelical conformation composed of a unique chimera of four ARM (residues 65–260) and six HEAT (residues 274–675) repeats. (B) Structure of the Syo1:Rpl5:Rpl11 complex (PDB entry 5AFF). The Rpl5 peptide binds along the inner solenoid surface, while Rpl11 interacts with the outer surface of the Syo1 superhelix and a helical region from the HEAT repeat 1 acidic loop.

Fig. 9. NTF2 dimer (blue) bound to two chains of RanGDP (green) and two FxFG Nup motif cores (yellow) (based on PDB entries 5BXQ and 1GYB). (A) The two chains of the NTF2 dimer interact through an extensive β-sheet, whereas the remainder of the molecule generates a cavity into which Phe72 of the RanGDP switch-II loop binds. The FxFG motif cores bind in a hydrophobic cavity generated between the two NTF2 chains. (B) Binding of the RanGDP switch-II loop to NTF2. Ran Phe72 inserts into the hydrophobic cavity and is supplemented by salt bridges formed between Lys71 and Arg76 of Ran and Asp92/94 and Glu42 of NTF2, respectively.

Other Nuclear Import Pathways

Although most nuclear proteins depend on β-Kaps to reach their subcellular destination, some alternative pathways exist. These include the pathway operating during heat-shock stress that involves the carrier Hikeshi and the RanGDP import pathway that involves the nuclear transport factor NTF2 (see below). The actin-capping protein CapG also uses the interaction with NTF2 and Ran to enter the nucleus [135]. The RaDAR (RanGDP/ankyrin repeat) pathway has recently been characterized as an importin-independent nuclear import pathway for a number of ankyrin repeat proteins, with the signal identified as a hydrophobic residue at a specific position of two consecutive repeats [136]. The calcium-binding protein calmodulin can function as an import factor independent of β-Kaps, GTP and Ran, for a range of cargoes, particularly transcription factors [137–139]. Some proteins enter the nucleus independent of carrier molecules, for example, by direct binding to Nups, diffusion through the NPC and interaction with nuclear components (e.g., the ARM repeat protein β-catenin [140]). Lectins have been described as import factors for glycosylated proteins, and viruses disrupt the nuclear envelope during infection. Transport factors remaining to be characterized are involved in light-dependent nucleocytoplasmic trafficking in plants [141]. Some proteins can “piggy-back” through interactions with proteins with NLSs [142–145]. Many small proteins (e.g., histones) are imported by active transport mechanisms, although they could freely diffuse into the nucleus [146–149].
Proteins do not always use a single nuclear import pathway, which may be important under circumstances when conventional pathways are inhibited [150].

Hikeshi-mediated nuclear import of Hsp70 proteins

Nuclear import of heat-shock proteins from the Hsp70 family has been shown to be mediated by the nuclear transport factor Hikeshi [151] (see the review by Imamoto in this issue). The crystal structure of Hikeshi reveals a dimeric two-domain protein, with the N-terminal domains responsible for the interaction with Nups [152]. The asymmetric nature of the dimer has been suggested to be important for the recognition of the ATP-bound form of Hsp70.

NTF2-mediated nuclear import of RanGDP

The conformational changes generated by nuclear RanGTP binding to β-Kaps lead to the release of their macromolecular cargo and adaptors (such as Impα), but the karyopherins can only participate in another import cycle after Ran is released, following stimulation of its GTPase activity in the cytoplasm by RanGAP. The RanGDP generated in this way is then returned to the nucleus for recharging with GTP by the chromatin-bound RanGEF. Although Ran is a 25-kDa protein, its rate of nuclear import using simple diffusion appears to be insufficiently rapid to maintain adequate levels of karyopherin-based nuclear transport and is augmented by NTF2 [153,154] (reviewed in Ref. [155]). The structure of NTF2 features an extensive β-sheet flanked by three helices, yielding a cone-shaped molecule that dimerizes in solution (Fig. 9). The β-sheets of the protomers form an extensive interface in the NTF2 dimer, in which a considerable number of hydrophobic residues are buried [156,157]. The arrangement of the helices that flank the β-sheet generates an extensive cavity that is lined by hydrophobic residues and that forms the principal interaction interface with RanGDP [158].
NTF2 recognizes the GDP-bound state of Ran through binding to the switch-II loop (Fig. 9). In the RanGDP conformation, F72 in the switch-II loop inserts into the NTF2 cavity and this essentially hydrophobic interaction is complemented by salt bridges formed between K71 and R76 of Ran and D92/D94 and E42 of NTF2, respectively [158]. To mediate movement through the nuclear pore, NTF2 also binds to FxFG motifs present in many Nups, with the Phe residues of these motifs becoming buried in a hydrophobic cavity formed between the two chains in the dimer (Fig. 9) at a position opposite from that to which RanGDP binds [40,159]. The affinity of NTF2 for RanGDP is of the order of 100 nM [157], which ensures that the dissociation rate of the NTF2:RanGDP complex is sufficiently slow for it to remain intact during nuclear transport. By contrast, the affinity of NTF2 for the Nup FxFG repeats is weaker (~5 μM), consistent with its forming much more rapidly dissociating complexes that enable the NTF2:RanGDP complex to move through the nuclear pore transport channel rapidly, using transient binding to Nups [159].

Regulation of Nuclear Import Pathways

One of the key features of limiting movement in and out of the nuclear compartment is the opportunity to regulate these transport processes. Regulation is essential for fine-tuning transport activities according to the actual cellular needs. Nucleocytoplasmic trafficking is regulated on several levels (see Refs. [160–162] for reviews), with new mechanisms continuing to be discovered (Fig. 10). The nuclear accumulation rate of cargoes is directly related to their binding affinities for their import receptors [163,164] and thus transport processes can be regulated by modulating these binding affinities either by direct changes to the NLS or by physically blocking importin:cargo interactions through intermolecular or intramolecular NLS masking.
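Binding affinity governs not only how much complex forms but also how long it persists. As a rough illustration, the sketch below converts the two NTF2 affinities quoted in the preceding section (~100 nM for RanGDP and ~5 μM for FxFG repeats) into dissociation rates and half-lives using K_d = k_off/k_on. The association rate constant is not given in the text; the value used here (`ASSUMED_K_ON`) is a hypothetical placeholder applied equally to both complexes, so only the ratio of the two lifetimes, not their absolute values, should be read from the result.

```python
import math


def off_rate(kd_molar: float, k_on: float) -> float:
    """Return k_off (s^-1) from K_d (M) and k_on (M^-1 s^-1), using K_d = k_off / k_on."""
    return kd_molar * k_on


def half_life(k_off: float) -> float:
    """Return the half-life (s) of a complex dissociating with first-order rate k_off."""
    return math.log(2) / k_off


# Affinities quoted in the text.
KD_NTF2_RANGDP = 100e-9  # ~100 nM
KD_NTF2_FXFG = 5e-6      # ~5 uM

# ASSUMPTION: hypothetical association rate constant, not taken from the review,
# and assumed identical for both complexes purely for illustration.
ASSUMED_K_ON = 1e6  # M^-1 s^-1

for name, kd in [("NTF2:RanGDP", KD_NTF2_RANGDP), ("NTF2:FxFG", KD_NTF2_FXFG)]:
    k_off = off_rate(kd, ASSUMED_K_ON)
    print(f"{name}: k_off = {k_off:.3g} s^-1, half-life = {half_life(k_off):.3g} s")

# With equal k_on, the lifetime ratio depends only on the ratio of the K_d values.
lifetime_ratio = (
    half_life(off_rate(KD_NTF2_RANGDP, ASSUMED_K_ON))
    / half_life(off_rate(KD_NTF2_FXFG, ASSUMED_K_ON))
)
print(f"NTF2:RanGDP outlives NTF2:FxFG by a factor of about {lifetime_ratio:.0f}")
```

Under the equal-k_on assumption, the 50-fold difference in K_d translates directly into a 50-fold difference in complex lifetime, which is the qualitative point made in the text: a long-lived cargo complex moved along by short-lived contacts with Nups.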
Post-translational modification-induced changes are perhaps the best-described means of modulating transport processes. Although phosphorylation plays a central role, there is also a growing number of examples based on methylation or acetylation [165–169]. Post-translational modifications link nucleocytoplasmic transport to a variety of signaling pathways including the cell cycle, gene transcription, RNA metabolism, immune responses, apoptosis and the DNA damage response. Several of the case studies examined in the literature involve proteins that constantly shuttle in and out of the nucleus and the balance between nuclear import and export establishes the specific localization pattern. However, many of the mechanistic studies fail to investigate precisely how a post-translational modification perturbs the dynamics. For instance, if the post-translational modifications in the nucleus lead to increased cytoplasmic accumulation, this could result from changes in nuclear export (whether it is enhanced) or nuclear import (whether it is inhibited) or both.

Regulation by phosphorylation

Modulation of importin:cargo binding affinity

There are several examples in the literature of the introduction of a negative charge by phosphorylation inhibiting NLS binding [160,161,170–172]. Well-established examples include the phosphorylation of the yeast transcription factor Pho4, which disrupts the interaction with its dedicated carrier Pse1, a β-Kap family member [173,174], and the inhibitory phosphorylation by Cdk1 (cyclin-dependent kinase 1), which introduces negative charges that interfere with importin binding in a number of proteins [93,175,176]. In the case of the human dUTPase, structural work suggests that phosphorylation in the vicinity of the NLS leads to altered intra-NLS contacts that prevent favorable interactions with Impα, resulting in the cytoplasmic accumulation of the phosphorylated form [93].
The NLS:importin dissociation constants fall into a rather wide range [107,177], and the effect of phosphorylation will depend on whether it is capable of moving the affinity over the threshold, so it falls outside the functional NLS range [65,90,107,177]. A high-affinity cargo complex would require a more substantial alteration to make the NLS non-functional.

Phosphorylation can also enhance nuclear accumulation through increasing NLS:Impα affinity. A well-established example is protein kinase CK2-mediated phosphorylation of the SV40-TAg NLS at position S111/112, which enhances affinity for Impα 2-fold, leading to a considerable increase in the nuclear import rate [178]. The nuclear transport efficiency also increases for the Epstein-Barr virus nuclear antigen 1 if its NLS is phosphorylated at S385, which increases its affinity for Impα5. However, phosphorylation in two other neighboring positions, S383 and S386, decreases the nuclear import rate [179,180]. The precise structural reasons behind the enhanced affinity due to phosphorylation remain unclear [89]. Negative charges in the linker region of bipartite cNLSs were shown to have a positive effect on Impα binding. Generation of peptide inhibitors against the classical nuclear transport pathway led to bipartite NLSs that have several Glu or Asp residues in their linker regions. These help in maximizing the possible interactions in the cargo:carrier complex [90,110]. Clearly, phosphorylation has an effect specific to the position of the phosphorylated residue relative to the positive cluster of the NLS [175].

Fig. 10. Regulatory mechanisms in nucleocytoplasmic trafficking. The control of trafficking in and out of the nucleus at the protein level operates by several types of mechanisms. First, different chemical moieties can be attached to the cargo or transport factors, as shown in the first row. These post-translational modifications may alter the thermodynamics and kinetics of the interactions between the cargo and the karyopherin (A) or may lead to masking of the interacting groups (B and C). Other mechanisms involve the microtubular system, which can influence the concentration gradient of cargo proteins such that these may accumulate around the nuclear pores where importins are readily available (D). “Piggy-backing” is the indirect coupling between cargo and karyopherins (E). NLS copy number variation for oligomeric cargo proteins constitutes a fine-tuning control that provides advantage to homo-oligomers or hetero-oligomers with an increased number of NLS segments (F).

Intramolecular NLS masking

Intramolecular NLS masking can also inhibit cargo:carrier complex formation, through induced structural changes in the cargo making the NLS inaccessible to Impα. In the case of the X. laevis b-Myb protein, the C-terminal domain simultaneously inhibits DNA binding and NLS function. During embryo development, b-Myb is subjected to several modifications, which result in the NLSs that facilitate nuclear accumulation of the protein becoming accessible [181]. STAT1 activation through the phosphorylation of a tyrosine residue (Y701) is one of the central events in cytokine signaling and the regulation of immune responses. Phosphorylation induces a structural rearrangement that shifts STAT1 dimers from an antiparallel to a parallel conformation, exposing a non-classical NLS that is recognized by Impα5. Dimerization and phosphorylation are essential for efficient nuclear accumulation during STAT1 activation, even though the phospho-tyrosine residue is not a binding determinant for Impα5 [124,180,182–185].
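The threshold argument made above under “Modulation of importin:cargo binding affinity” (a high-affinity NLS needs a larger perturbation than a weak one before it stops being functional) can be put in numbers with a minimal sketch. All values below are hypothetical: the review does not state a threshold K_d or the K_d of any particular NLS, so `THRESHOLD_KD` and the two example affinities are assumptions chosen only to show the shape of the argument. The free-energy conversion uses the standard relation ΔΔG = RT ln(fold change in K_d).

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fold_change_needed(kd: float, threshold_kd: float) -> float:
    """Fold weakening of K_d needed for an NLS with affinity kd to reach threshold_kd."""
    return threshold_kd / kd


def ddg_kcal(fold: float, temp_k: float = 298.15) -> float:
    """Free-energy change (kcal/mol) for a given fold change in K_d: RT ln(fold)."""
    return R_KCAL * temp_k * math.log(fold)


# ASSUMPTION: hypothetical upper K_d limit of the functional NLS range (not from the review).
THRESHOLD_KD = 1e-6  # 1 uM

# ASSUMPTION: two hypothetical NLSs, one tight and one close to the threshold.
hypothetical_nls = {
    "tight NLS (10 nM)": 10e-9,
    "weak NLS (500 nM)": 500e-9,
}

for label, kd in hypothetical_nls.items():
    fold = fold_change_needed(kd, THRESHOLD_KD)
    print(f"{label}: needs ~{fold:.0f}-fold weakening "
          f"(~{ddg_kcal(fold):.2f} kcal/mol) to leave the functional range")

# For scale: the 2-fold affinity change reported for CK2 phosphorylation of the
# SV40-TAg NLS corresponds to a small free-energy difference.
print(f"2-fold change in K_d ~ {ddg_kcal(2.0):.2f} kcal/mol at 25 degrees C")
```

In this toy setting, a single phosphorylation that weakens binding a few-fold would inactivate the weak NLS but leave the tight one well inside the functional range, which is one way to read the statement that the outcome depends on where the unmodified NLS sits relative to the threshold.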
Intermolecular NLS masking and organelle-specific retention

Intermolecular masking can occur if the binding of a heterologous protein prevents the interaction of the cargo and its carrier. One of the best-described examples is the NF-κB p50/p65 heterodimer, a transcription factor regulating immune and stress responses, apoptosis and differentiation. NF-κB is kept inactive by its inhibitor, IκBα, which impedes NF-κB from being recognized by the nuclear import machinery. Crystal structures show how the NLSs of both NF-κB p50 and p65 subunits are covered by the ankyrin repeat region of IκBα [186,187]. IκBα binds the NLS of NF-κB until phosphorylation licenses its ubiquitin-mediated degradation, which enables Impα3 and Impα4 to access the unmasked NF-κB NLSs [188–191]. A recently described E3 ubiquitin ligase that binds phospho-NLSs, the BRCA1-binding protein BRAP2, was shown to reduce the nuclear accumulation of several viral and endogenous proteins, depending on their phosphorylation state. Although it does not completely sequester its targets in the cytoplasm, it fine-tunes their localization pattern [192,193]. DNA or RNA can also be responsible for intermolecular masking. For example, the DNA binding region and the Impβ1-recognized NLS of the human sex-determining factor SRY overlap. DNA binding inhibits Impβ1 binding and vice versa. This mechanism may also facilitate the release of the SRY:Impβ1 complex once it enters the nucleus [194]. Interestingly, acetylation of SRY is necessary for proper Impβ1:SRY complex formation, showing the interplay among different modes of regulation [195]. The NLS of HDAC4, a class-IIa histone deacetylase, is masked by phosphorylation-induced 14-3-3 protein binding, making both NLSs inaccessible to nuclear import factors and causing cytoplasmic retention [196].
A similar mechanism seems to apply for HDAC5, HDAC7 and HDAC9, where phosphorylation near the NLSs, along with 14-3-3 binding, inhibits nuclear translocation [197].

Regulation by methylation and acetylation

Post-translational modifications of histones, including acetylation and methylation, are crucial in epigenetics. A growing number of examples show that acetylation and methylation of Lys or Arg residues, in addition to those in histones, regulate a variety of cellular functions [198–200]. Interestingly, Impα itself is targeted for acetylation within its IBB domain by p300/CBP, increasing its ability to bind Impβ1 in vitro [201]. p300/CBP interacts with several components of the nuclear transport machinery, possibly fine-tuning their functions by affecting their intracellular distribution [202]. Methylation and acetylation of Lys and Arg residues can directly modulate NLS/NES function through altering the interaction with the transport machinery by modulating the residue charge. These modifications may also have indirect effects on localization through alteration of binding between interaction partners or by promoting conformational changes. Although the precise mechanisms are mostly unclear [203], several well-documented examples are summarized below.

Modulation of importin:cargo binding affinity

In c-Abl, lysine acetylation occurs in the NLS. The protein is involved in apoptosis when nuclear, whereas it responds to proliferative signals in the cytoplasm. It can be acetylated in the nucleus by P/CAF (an acetyltransferase) within one of its NLSs, leading to cytoplasmic accumulation. It is hypothesized that when the acetylated form is exported to the cytoplasm, it cannot re-enter the nucleus because its NLS is no longer recognized by the import machinery [204].
A similar mechanism was proposed for RECQL4 (a DNA helicase important for genomic integrity maintenance), which is acetylated by p300 [205], as well as for HMGB1 (a protein involved in transcriptional control) [206]. Acetylation within the NLS of the poly(A) polymerase PAP leads to its cytoplasmic accumulation. Acetylation directly interferes with PAP binding to Impα/Impβ1 [207]. In the case of another P/CAF substrate, E1A, a clear negative effect of NLS acetylation on Impα3 binding has been demonstrated [208]. Interestingly, for P/CAF itself, intramolecular acetylation is required for its nuclear localization. A mutant incapable of auto-acetylation is strictly cytoplasmic, even though the acetylated form of P/CAF shows decreased binding to both Impα1 and Impβ1 in in vitro pull-down assays, compared to non-acetylated P/CAF. Acetylation possibly affects the P/CAF localization pattern through regulation of proteasome-dependent degradation processes [209]. Besides acetylation, phosphorylation also influences the localization of P/CAF. Phosphorylation may promote the dissociation of P/CAF-PP1/PP2a cytoplasmic complexes and may allow the nuclear import of P/CAF [210]. Although the precise mechanism is not yet known, acetylation near the NLSs of Net1A (a RhoA GEF protein) alters the dynamics of its nucleocytoplasmic shuttling, probably by slowing its nuclear re-import rate, leading to cytoplasmic accumulation [211]. Acetylation can also directly enhance importin:cargo interactions. It has been shown that p300-mediated acetylation in the proximity of the NLS of SRY (K136) is needed for nuclear localization, facilitating Impβ1 binding [195].

Intramolecular NLS masking

Conformational changes induced by acetylation can alter nucleocytoplasmic transport processes in a fashion similar to phosphorylation.
The transcription factor HNF-4 shuttles between the nuclear and cytoplasmic compartments and CBP-mediated acetylation is hypothesized to induce conformational changes that make the NES inaccessible to CRM1. Acetylation also enhances the ability of HNF-4 to bind DNA and CBP, both of which help its nuclear anchoring [212]. One of the key aspects of the many modes of regulation of p53 localization is its phosphorylation- and acetylation-dependent tetramerization, which influences the accessibility of its NES and NLSs (reviewed in Refs. [213] and [214]). Acetylation of lysine residues in the C-terminal region of p53 inhibits its oligomerization, resulting in an exposed NES and effective nuclear export [215]. By contrast, CBP-dependent acetylation of survivin at K129 enhances its oligomerization, making its NES sequence inaccessible to CRM1 and leading to nuclear accumulation of the protein [216,217]. The acetylation of K433 in the protein kinase PKM2 by p300 is speculated to influence its nuclear transport through stabilizing its dimeric state because the tetrameric state may bury its NLS [218] or by modulating a “piggy-back” nuclear entry mechanism of PKM2.

Intermolecular NLS masking and organelle-specific retention

The localization of Yap, one of the key components of the Hippo signaling pathway, is regulated through lysine methylation. Yap is monomethylated at K494 by Set7, which leads to its cytoplasmic retention through an unknown mechanism [219]. Hsp70, a protein with roles in folding and degradation, mainly localizes to the nucleus when dimethylated on K561, whereas the unmethylated form is predominantly cytoplasmic. A small fraction of the overall Hsp70 pool was reported to be dimethylated in cancer cells and this form is thought to interact specifically with Aurora kinase B, which might be responsible for the chromatin association of methylated Hsp70 [220].
RNA helicase A shuttles between the nuclear and cytoplasmic compartments, and PRMT1-mediated arginine methylation is hypothesized to inhibit its interaction with a putative cytoplasmic retention factor binding its NLS-containing C-terminal region [221]. CtBP2 is actively exported from the nucleus in a CRM1-dependent manner, but this is prevented by K10 acetylation-dependent nuclear sequestration, mediated by p300 [222]. Interestingly, the localization of its closely related protein, CtBP1, is regulated by sumoylation (causing enhanced nuclear entry or retention) [223], cytoplasmic retention [224] and phosphorylation [225]. Finally, acetylation of the retinoblastoma protein Rb by P/CAF was shown to be important for Rb to remain nuclear during keratinocyte differentiation. Because acetylation presumably happens in the nucleus, it does not have an effect on nuclear import, although the modification is within the NLS [226].

Regulation by ubiquitination and sumoylation

Other post-translational covalent modifications, including ubiquitination or sumoylation, can also regulate protein nuclear import. These modifications can directly target the Lys residues in the NLS. A well-documented example involves cytidylyltransferase, a protein involved in phosphatidylcholine biosynthesis. Monoubiquitination on K57 masks the NLS of cytidylyltransferase, resulting in cytoplasmic accumulation [227]. By a similar mechanism, ubiquitinated K319–321 in p53 blocks Impα3 binding, inhibiting nuclear entry [228]. Nuclear entry of PTEN, the regulator of phosphatidylinositol 3-kinase signaling, is essential for its tumor suppressor function and this translocation depends on NEDD4-1-mediated monoubiquitination of two lysine residues (K13 and K289) [229–231]. Sumoylation of PAP within its NLS is essential for its nuclear import, providing regulation additional to acetylation [232].
Sumoylation can also enhance nuclear accumulation through masking NESs, as in the case of Klf5, a transcription factor regulating cell proliferation [233], or through supporting nuclear retention without altering nuclear import dynamics, as in the case of SAE1 (SUMO activating enzyme 1) [234]. Nuclear translocation of the enzymes in the de novo thymidylate biosynthesis pathway is sumoylation dependent and takes place at the beginning of S-phase [235–237].

Other factors regulating nucleocytoplasmic trafficking

Among several other possible ways to regulate nuclear translocation, cytoplasmic anchoring and the contribution of the microtubular network are important for a number of proteins. The microtubular system and its associated molecular motors are essential for some viral proteins to reach the nucleus to overcome barriers to diffusion (reviewed in Refs. [238] and [239]), but several non-viral proteins also use them, presumably to enhance the rate and extent of their nuclear import. Nocodazole (a tubulin polymerization inhibitor) treatment of cells significantly reduces the nuclear accumulation of p53 [240], Rb [241] and PTHrP [242,243]. These results show that NLS-containing cargoes can be transported actively to sites close to the NPCs, enhancing nuclear entry rates by moving cargoes to intracellular regions where the Impβ1 concentrations are high [244,245]. This may enhance the response time of cells to extracellular or intracellular stimuli when rapid transport processes are needed for proper function (e.g., in cell signaling and DNA damage responses).

The nuclear localization of several steroid receptors, including the glucocorticoid and estrogen receptors, is regulated through cytoplasmic retention. Without their ligands, these receptors are sequestered in the cytoplasm by Hsp90 through their ligand-binding domains.
Upon ligand binding, they release Hsp90 and are imported into the nucleus using their NLSs [246]. The androgen receptor is similarly bound to importin-7 (Imp7), which in the absence of ligand inhibits binding of Impα to the NLS and causes the receptor to remain in the cytoplasm. Androgen binding induces conformational changes in the receptor that result in the release of Imp7, exposing the NLS for binding to Impα and leading to translocation into the nucleus [247].

In addition to the affinity of NLSs for their receptors, the NLS copy number also has an important effect on nuclear accumulation efficiency. Higher numbers of NLSs provide an advantage in competing for importins in the cellular environment, making oligomerization-induced NLS copy number variations another way of regulating nucleocytoplasmic transport processes [248,249].

Conclusions

Although the first components of nuclear import pathways were only identified in 1993 [250,251], there is now a substantial molecular understanding of how many of the nuclear protein import pathways function. However, several key challenges remain, including determining the precise mechanism by which the cargo:carrier complex is able to overcome the barrier functi