July 2010 technical highlight
Nuclear magnetic resonance spectroscopy
PSI-SGKB [doi:10.1038/th_psisgkb.2010.31]
Protein NMR structures can now be solved in days rather than weeks or months.
The Bruker B-ACS 60 robotic sample changer and Bruker 600 MHz TCI 1.7-mm MicroCryoProbe are examples of two technologies that have enabled high-throughput NMR screening and micro-scale structure determination at PSI NESG.
Three-dimensional (3D) structures of small (<150 residues) proteins in 'structure-quality' solution can now be determined in days by NMR spectroscopy. In addition, NMR methods have been developed and applied that allow the assessment of more challenging proteins such as those larger than 20 kDa, integral membrane proteins, and proteins involved in large macromolecular complexes. NMR is poised to play an important role in systems biology, revealing protein movement, protein–ligand and protein–protein interactions, in addition to providing 3D structures in solution.
Three centers funded by the Protein Structure Initiative (PSI) have implemented high-throughput NMR pipelines to complement the efforts from X-ray crystallography and to tackle difficult proteins. The main improvements have come from faster protein expression and purification, shorter data collection times, miniaturization, and automation.
Sample preparation
Sample preparation is now quicker, thanks to improved methods of construct design and to high-throughput screening of novel constructs using microNMR probe technology, which was already established in PSI-1. 1, 2, 3 This has partly been achieved using a wheat-germ cell-free expression system 4, 5 or a condensed-phase bacterial single protein production (cSPP) system. Both systems reduce the cost of isotope enrichment and also work for expression and isotope labeling of integral membrane proteins. 6 Overall, recombinant proteins can now be labeled more easily with 13C and 15N stable isotopes.
Data collection
Typically, 10 years ago, it took several weeks to collect sufficient data for NMR protein structure determination. Now, the data can be collected in a few days. This is partly because of the use of cryogenic NMR probes, which improve signal-to-noise ratios, as well as the introduction of improved acquisition schemes, such as G-matrix Fourier transform NMR, 7 APSY, 8 HIFI-NMR, 9 and various sparse-sampling methods.
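The effect of better signal-to-noise on measurement time can be illustrated with a rough calculation. The sketch below assumes idealized signal averaging, where signal-to-noise grows with the square root of the number of scans, and assumes that measurement time for a sparsely sampled experiment scales with the fraction of indirect-dimension points recorded. The function names and the example numbers are illustrative assumptions, not values from the PSI centers.

```python
def relative_time_for_same_snr(sensitivity_gain: float) -> float:
    """Fraction of the original measurement time needed to reach the same
    signal-to-noise ratio, assuming SNR grows as sqrt(number of scans)."""
    if sensitivity_gain <= 0:
        raise ValueError("sensitivity_gain must be positive")
    return 1.0 / sensitivity_gain ** 2


def sparse_sampling_time(full_time_hours: float, sampled_fraction: float) -> float:
    """Measurement time when only a fraction of the indirect-dimension grid
    points is recorded (idealized: time is proportional to points sampled)."""
    if not 0.0 < sampled_fraction <= 1.0:
        raise ValueError("sampled_fraction must be in (0, 1]")
    return full_time_hours * sampled_fraction


if __name__ == "__main__":
    # Hypothetical example: a probe with a 3-fold sensitivity gain.
    print(f"time fraction at 3x sensitivity: {relative_time_for_same_snr(3.0):.3f}")
    # Hypothetical example: a 48-hour experiment sampled at 25% of grid points.
    print(f"sparse-sampled time: {sparse_sampling_time(48.0, 0.25):.1f} h")
```

The two estimates apply in different regimes: the first when an experiment is limited by sensitivity, the second when it is limited by the number of increments that must be sampled.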
Miniaturization
The introduction of microcoil probes for NMR spectroscopy 1, 2, 3, 10, 11 has allowed spectra to be acquired from small volumes, and structures can now be determined with less than 100 micrograms of protein. This microscale approach is also useful for integral membrane proteins, which are more difficult to produce and require the presence of expensive additives. The ability to screen small sample sizes is especially useful for establishing optimal sample conditions. 1, 2, 3, 6, 11, 12, 13 Recent studies revealed a key feature of membrane protein sample preparation by showing that high-quality NMR spectra can only be obtained over a narrow range of detergent concentrations. Above this range, the NMR spectral quality decreases, presumably because of the increased viscosity at high detergent concentrations. 13
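To see why small sample volumes matter, it helps to convert a protein mass into a concentration. The sketch below is plain unit arithmetic. The 15 kDa molar mass and the 35 and 500 microliter volumes are assumptions chosen for illustration and are not figures taken from the text.

```python
def concentration_mM(mass_ug: float, molar_mass_da: float, volume_ul: float) -> float:
    """Protein concentration in millimolar from mass (micrograms),
    molar mass (Da, i.e. g/mol) and sample volume (microliters)."""
    if molar_mass_da <= 0 or volume_ul <= 0:
        raise ValueError("molar mass and volume must be positive")
    moles = mass_ug * 1e-6 / molar_mass_da   # micrograms -> grams -> moles
    litres = volume_ul * 1e-6                # microliters -> liters
    return moles / litres * 1e3              # molar -> millimolar


if __name__ == "__main__":
    # Assumed values: 100 micrograms of a 15 kDa protein.
    micro = concentration_mM(100.0, 15000.0, 35.0)    # assumed microprobe volume
    standard = concentration_mM(100.0, 15000.0, 500.0)  # assumed conventional volume
    print(f"35 uL sample:  {micro:.3f} mM")
    print(f"500 uL sample: {standard:.3f} mM")
```

Under these assumptions the same 100 micrograms is roughly 14 times more concentrated in the smaller volume, which is the practical advantage of a microscale sample.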
Automated NMR structure determination
The determination of protein structures by NMR comprises spectral analysis (peak picking), backbone resonance assignment, side-chain resonance assignment, collection of NOE upper distance constraints, and structure calculation. Each of the three PSI centers has developed its own automated NMR data analysis pipeline to deal with each of these stages. PSI NESG 7, 14 uses the program AutoAssign for determining resonance assignments with conventional or GFT-NMR triple-resonance data, and AutoStructure 15 and CYANA for analysis of NOESY spectra. This center has also developed Abacus 16 for simultaneous analysis of NOESY and triple-resonance data.
PSI JCSG makes use of APSY-type experiments for backbone assignments, and the side-chain assignments are obtained using the same commonly used set of three NOESY experiments that is used for collection of NOE distance restraints. Extensive automation of the protein structure determination was achieved in collaboration with Torsten Herrmann, who developed the software package UNIO at the ETH Zürich and more recently at the University of Lyon. An alternative approach to automatic NMR assignment, developed by the PSI CESG, uses PINE-NMR, 17 which produces backbone and side-chain assignments from amino-acid sequences and peak lists once experimental details are defined. If unequivocal assignment is not possible, the likely assignments are ranked in order, and a graphical interface, PINE-SPARKY, is used to show both the assignments and the key experimental data. 18
The Chemical Shift (CS)-Rosetta approach provides good-quality structures of small (<120-residue) monomeric proteins using the Rosetta structure prediction method together with backbone 15N, 13C, and 1H, and 13Cβ chemical shift data. 19 The CS-Rosetta method was validated using data sets, provided by the PSI, for protein structures that were not yet available to researchers ('blind' NMR structures). The PSI has also contributed to the development of CS-DP-Rosetta, 20 which uses unassigned NOESY peak lists [DP score 21 ] to direct a CS-Rosetta trajectory. CS-DP-Rosetta is a fully automated NOESY data analysis process, providing accurate structures for monomeric proteins of up to 150 residues, including many for which CS-Rosetta fails. More recently, the PSI has contributed to the development of CS-RDC-Rosetta, 22 which represents an even more robust and general approach that uses 15N-1H residual dipolar coupling (RDC) and HN-HN NOE data, in addition to backbone chemical shift data, to guide Rosetta to the native structure for proteins of up to 25–30 kDa.
As structure determination by NMR becomes more automated, the need to assess the quality of the final structure, or its “goodness of fit” to the data, becomes more important. One quick and accurate method is to calculate Recall, Precision and F-measure from NOESY spectra. 21 This requires neither NOE assignments nor complete relaxation matrix calculations, which speeds up the process. Servers such as Protein Structure Validation Software (PSVS) 23 have been developed for assessing the quality of NMR structures based on various knowledge-based criteria derived from analysis of high-resolution X-ray crystal structures.
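The idea behind Recall, Precision and F-measure can be sketched in a few lines. This is a simplified illustration, not the published method: it assumes each NOESY peak and each short proton-proton distance in the structure can be reduced to an unambiguous pair of proton labels, whereas the real analysis works with unassigned peaks and has to handle chemical shift ambiguity. The function name and the proton labels are hypothetical.

```python
def rpf_scores(observed_pairs, expected_pairs):
    """Return (recall, precision, f_measure) for two collections of proton pairs.

    observed_pairs: pairs implied by peaks in the NOESY spectrum.
    expected_pairs: pairs of protons that are close together in the structure.
    Pairs are treated as unordered, so (a, b) and (b, a) are the same contact.
    """
    observed = {frozenset(p) for p in observed_pairs}
    expected = {frozenset(p) for p in expected_pairs}
    matched = observed & expected

    # Recall: fraction of observed peaks explained by the structure.
    recall = len(matched) / len(observed) if observed else 0.0
    # Precision: fraction of short distances in the structure seen in the data.
    precision = len(matched) / len(expected) if expected else 0.0
    # F-measure: harmonic mean of recall and precision.
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


if __name__ == "__main__":
    # Hypothetical proton labels, for illustration only.
    observed = [("HN_5", "HA_4"), ("HN_6", "HA_5"), ("HN_7", "HB_3"), ("HA_9", "HN_10")]
    expected = [("HA_4", "HN_5"), ("HA_5", "HN_6"), ("HN_10", "HA_9"),
                ("HN_12", "HA_11"), ("HN_13", "HA_12")]
    r, p, f = rpf_scores(observed, expected)
    print(f"recall={r:.2f} precision={p:.2f} F={f:.2f}")
```

A low recall points to peaks the structure cannot explain, while a low precision points to close contacts in the structure that the spectrum does not support.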
Another way of testing the quality of automated structures is through blind comparison with 'expert-solved' structures. CASD-NMR (critical assessment of automated structure determination through NMR), a community-wide challenge, was set up to compare the success of unsupervised automated NOESY analysis programs using data sets generated by the PSI project. Manually refined 'blind' protein NMR structures generated by the PSI are kept 'on hold' until the results of the fully automated methods are completed and archived by several groups involved in such software development. Initial results demonstrate that, although improvements can be made, several of the available automated NOESY assignment software packages can generate 3D protein structures from NMR data with accuracies similar to those obtained by detailed manual refinement. 24 This project is continuing in PSI:Biology and will help drive the development of automated tools for NMR structure determination and structure quality assessment.
This is an exciting time for NMR spectroscopy, as we enter an era of high-throughput biology, with NMR poised to play an important role.