Analytical methods used in obtaining higher order structure information from protein therapeutics

The efficacy, safety and pharmacokinetic properties of a protein therapeutic substantially depend on the molecule having the right structure. This article reviews current methods used for obtaining higher order structure information of biotherapeutics.

With regard to protein structure, the correct amino acid sequence and glycosylation (primary structure) are critically important, as are the secondary, tertiary and quaternary structures, which define how the protein presents in three dimensions. The latter three are collectively referred to as higher order structure (HOS) and contribute to the quality attributes of a biotherapeutic. Protein HOS can be perturbed by changes in manufacturing, formulation and storage and must be measured and monitored, as per regulatory requirements.1

Protein secondary structure is maintained primarily by hydrogen bonds between backbone amide groups and results in coiling of the polypeptide backbone (α-helix) or side-by-side alignment of extended backbone strands (β-sheet). The tertiary structure arises from folding of the polypeptide chain, stabilised by interactions between amino acid side chains such as disulfide bonds and salt bridges, and results in the lowest energy state (high stability) of the molecule.

These globular proteins are water soluble, as folding exposes hydrophilic amino acids on the outer surface of the protein and shields hydrophobic amino acids in the interior. The quaternary structure comprises the arrangement of two or more folded protein subunits into a multi-subunit complex. Except for insulin and therapeutic antibodies, which consist of different chains linked by disulfide bonds, this is not a usual arrangement in biotherapeutics. Indeed, efforts are made during manufacturing, and through the selection of appropriate formulation and storage conditions, to keep the formation of aggregates to a minimum, as aggregates are often associated with adverse immune responses.2

High-resolution methods

A variety of biophysical methods have been developed and refined to measure and characterise protein HOS. X-ray crystallography has arguably been the “gold standard” for obtaining high-resolution three-dimensional structures of proteins,3,4 but growing large protein crystals suitable for detailed analysis is complex and time consuming, and the technique is not routinely used in biopharmaceutical characterisation.

Cryo-electron microscopy4,5 requires no crystallisation, and electron diffraction6 requires only microcrystals, which are relatively easy to obtain. These newer techniques are gaining traction and are likely to become widely used soon. High-field nuclear magnetic resonance (NMR) of proteins is another high-resolution method for obtaining HOS information,7 but its routine adoption for the characterisation of biopharmaceuticals has been slow. The main reasons are the difficulty in interpreting the resulting data for larger proteins (eg, antibodies), the relatively large sample requirements and the fact that, as with X-ray crystallography and cryo-electron microscopy, the equipment is costly and its operation, together with the analysis and interpretation of the data, requires specialised operators.

Spectroscopic methods

Spectroscopic methods are commonly used in the HOS determination of biotherapeutics, as sample requirements are generally modest, data acquisition and analysis are relatively uncomplicated and the corresponding software is streamlined. The cost of equipment is similar to that of other laboratory instruments. In general, these methods provide HOS information by inference, in contrast to the imaging of structural elements obtained by the X-ray or electron diffraction methods outlined above.

Circular dichroism (CD) spectroscopy has been used extensively to measure secondary and tertiary structure features of proteins. CD data are obtained from the differential absorption of left and right circularly polarised ultraviolet (UV) light. Far-UV (180nm‑250nm) CD spectra relate to secondary structure elements (helices and sheets), whereas near-UV (>250nm) CD spectra, which arise from absorption by the side chains of aromatic amino acids, relate to the tertiary structure of a protein.8 Using appropriate reference models based on proteins whose structures are similar to that of the molecule being analysed, estimates of helix, sheet and unordered content can be obtained. These data are useful in comparing different manufactured batches of a biotherapeutic and assessing stability in different formulations and storage conditions, as loss of HOS features suggests protein unfolding, denaturation and aggregation.9
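As an illustration of how CD data from two batches might be put on a common scale and compared, the minimal sketch below converts a raw ellipticity reading to mean residue ellipticity using the widely used normalisation (ellipticity in millidegrees multiplied by the mean residue weight, divided by ten times the path length in cm and the concentration in mg/mL) and then computes a root-mean-square difference between two spectra. The function names, the example readings and the use of a simple RMSD as a similarity measure are assumptions made for illustration; they are not taken from any instrument software or from the methods cited here.

```python
import math


def mean_residue_ellipticity(theta_mdeg, mean_residue_weight, path_cm, conc_mg_per_ml):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity
    in deg cm^2 dmol^-1."""
    if path_cm <= 0 or conc_mg_per_ml <= 0:
        raise ValueError("path length and concentration must be positive")
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


def spectrum_rmsd(spectrum_a, spectrum_b):
    """Root-mean-square difference between two spectra sampled at the
    same wavelengths (hypothetical similarity measure)."""
    if len(spectrum_a) != len(spectrum_b) or not spectrum_a:
        raise ValueError("spectra must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(spectrum_a, spectrum_b))
    return math.sqrt(squared / len(spectrum_a))


if __name__ == "__main__":
    # Hypothetical far-UV readings (mdeg) for two batches at the same wavelengths.
    batch_1 = [-8.0, -15.0, -20.0, -18.0, -6.0]
    batch_2 = [-7.5, -14.0, -18.5, -17.0, -5.5]
    # Assumed conditions: mean residue weight 110, 0.1 cm cell, 0.2 mg/mL protein.
    mre_1 = [mean_residue_ellipticity(t, 110.0, 0.1, 0.2) for t in batch_1]
    mre_2 = [mean_residue_ellipticity(t, 110.0, 0.1, 0.2) for t in batch_2]
    print("RMSD between batches:", round(spectrum_rmsd(mre_1, mre_2), 1))
```

A small difference between normalised spectra would be consistent with comparable secondary structure; any acceptance threshold would have to be established for the specific product and method.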

Fourier-transform infrared (FTIR) spectroscopy is also used to measure elements of a protein that comprise its secondary structure, thus providing HOS information.8 Different bonds absorb at different wavelengths, and the absorptions associated with stretching of the amide carbonyl (C=O) bond at ca. 1,650cm⁻¹ (Amide I band), bending of the amide N-H bond at ca. 1,540cm⁻¹ (Amide II band) and stretching of the amide C-N bond at ca. 1,240cm⁻¹ (Amide III band) can be correlated to hydrogen bonding, which defines sheet and helix formation. As with other spectroscopic data, estimates of helix, sheet and unordered sequence can be generated from FTIR data using appropriate models. Recent developments using infrared laser spectroscopy, coupled with microfluidic sample introduction and continuous subtraction of the formulation buffer background, allow for increased sensitivity, an extended range of protein concentration and automated analysis of samples. All of these are significant improvements over traditional FTIR spectroscopy.10
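Infrared band positions are quoted in wavenumbers, so it can help to see how they map to wavelength. The short sketch below converts the approximate amide band positions quoted above to wavelengths in micrometres and assigns a measured peak to the nearest band. The helper names and the nearest-band lookup are illustrative assumptions, not a description of how any FTIR software assigns bands.

```python
# Approximate amide band positions (cm^-1) as quoted in the text.
AMIDE_BANDS = {
    "Amide I (C=O stretch)": 1650.0,
    "Amide II (N-H bend)": 1540.0,
    "Amide III (C-N stretch)": 1240.0,
}


def wavenumber_to_wavelength_um(wavenumber_cm1):
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 10_000.0 / wavenumber_cm1


def nearest_amide_band(peak_cm1):
    """Return the name of the amide band closest to a measured peak
    (hypothetical helper for illustration)."""
    return min(AMIDE_BANDS, key=lambda name: abs(AMIDE_BANDS[name] - peak_cm1))


if __name__ == "__main__":
    for name, position in AMIDE_BANDS.items():
        wavelength = wavenumber_to_wavelength_um(position)
        print(f"{name}: {position:.0f} cm^-1 = {wavelength:.2f} um")
```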

Raman spectroscopy is based on inelastic light scattering by molecular bonds (the Raman effect). It uses the same amide bands as IR spectroscopy to generate data that can be related to protein secondary structure features.11 Additional information can be obtained from stretching of disulfide bonds as well as from aromatic amino acids. The vibrational frequencies recorded in Raman spectra of proteins are sensitive to the local environment of the molecule and frequency shifts can be related to the unfolding of the structure due to strain. The method can be used to compare different batches of material or monitor changes in HOS due to formulation or storage conditions.

Intrinsic protein fluorescence is due to aromatic amino acids, chiefly tryptophan. The aromatic side chain absorbs UV light (280-290nm). An electron is promoted to an excited state and, upon its return to the ground state, light is emitted at a longer wavelength than the absorbed light (ca. 340nm). As with other spectroscopic methods, intrinsic protein fluorescence can be used to compare different batches or to monitor denaturation, since the hydrophobic tryptophan would typically be shielded from the aqueous environment in the folded protein. Therefore, changes in fluorescence intensity and emission wavelength between different protein samples would indicate differences in their tertiary structure.12
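One simple way to compare emission spectra between samples is to track the emission maximum and the intensity-weighted mean wavelength. The sketch below does this for two made-up spectra; the data, the function names and the choice of these two descriptors are assumptions for illustration only. A shift of the emission towards longer wavelengths is commonly interpreted as tryptophan becoming more exposed to solvent, although the magnitude that matters depends on the protein.

```python
def emission_maximum(wavelengths_nm, intensities):
    """Wavelength at which the recorded emission intensity is highest."""
    if len(wavelengths_nm) != len(intensities) or not intensities:
        raise ValueError("inputs must be non-empty and of equal length")
    index = max(range(len(intensities)), key=lambda i: intensities[i])
    return wavelengths_nm[index]


def spectral_centre_of_mass(wavelengths_nm, intensities):
    """Intensity-weighted mean emission wavelength."""
    total = sum(intensities)
    if len(wavelengths_nm) != len(intensities) or total <= 0:
        raise ValueError("need equal-length inputs with positive total intensity")
    return sum(w * i for w, i in zip(wavelengths_nm, intensities)) / total


if __name__ == "__main__":
    # Hypothetical emission spectra (arbitrary units), excitation not modelled.
    wavelengths = [310, 320, 330, 340, 350, 360, 370, 380, 390, 400]
    reference = [5, 20, 60, 100, 70, 40, 20, 10, 5, 2]
    stressed = [4, 12, 35, 70, 95, 80, 50, 25, 10, 4]
    for label, spectrum in (("reference", reference), ("stressed", stressed)):
        peak = emission_maximum(wavelengths, spectrum)
        centre = spectral_centre_of_mass(wavelengths, spectrum)
        print(f"{label}: maximum at {peak} nm, centre of mass {centre:.1f} nm")
```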

Light scattering

Molecules in solution scatter high-intensity monochromatic light, usually from a laser, and the scattering angle and intensity correlate with the size (hydrodynamic radius) and molecular mass of the molecule. Dynamic light scattering (DLS) and multi-angle light scattering (MALS) yield molecular size information in the 0.5 to 100nm range and 10 to 500nm range, respectively. However, for protein HOS, light scattering provides only an assessment of aggregation and no specific secondary structure information. Only very indirect tertiary structure data can be discerned from the experimentally determined hydrodynamic radius.13 Quantitation of aggregation is difficult, as light scattering increases non-linearly with size. Coupling a light scattering detector with a chromatograph enables individual measurement of the molecular mass and hydrodynamic radius of chromatographically separated components, the relative abundance of which can be accurately estimated from the chromatograms.
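As background to the hydrodynamic radius mentioned above: in DLS the radius is commonly derived from a measured translational diffusion coefficient through the Stokes-Einstein relation, R_h = k_B T / (6 π η D). The article does not spell this step out, so the sketch below is offered under that assumption. The diffusion coefficient in the example is a made-up value, and the default temperature and viscosity are those of water at 25 °C.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant


def hydrodynamic_radius_nm(diffusion_m2_per_s, temperature_k=298.15,
                           viscosity_pa_s=0.00089):
    """Hydrodynamic radius (nm) from a diffusion coefficient via the
    Stokes-Einstein relation for a sphere in a viscous solvent."""
    if diffusion_m2_per_s <= 0 or viscosity_pa_s <= 0 or temperature_k <= 0:
        raise ValueError("all inputs must be positive")
    radius_m = (BOLTZMANN_J_PER_K * temperature_k) / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_per_s
    )
    return radius_m * 1e9


if __name__ == "__main__":
    # Hypothetical diffusion coefficients: a monomer and a slower-diffusing species.
    for label, d in (("monomer", 4.0e-11), ("aggregate", 1.0e-11)):
        print(f"{label}: R_h = {hydrodynamic_radius_nm(d):.1f} nm")
```

Because the radius is inversely proportional to the diffusion coefficient, a species that diffuses four times more slowly is reported as four times larger.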

Mass spectrometry

Mass spectrometry (MS) and tandem mass spectrometry (MS/MS) of peptides generated by enzymatic digestion of proteins have been used almost since the emergence of the biotechnology industry to confirm the primary sequence of protein biotherapeutics and to characterise post-translational modifications, as well as changes to the sequence brought about by handling and storage.14

Notes to Table 1:
1. Resolution refers to the detail that can be obtained for various elements of the protein higher order structure and not to the resolution of the analytical technique itself (eg, chromatographic resolution)
2. S: secondary structure; T: tertiary structure; Q: quaternary structure
3. The difficulty is assessed by combining the relative difficulties of sample preparation, data acquisition and data interpretation
4. Low: less than $20,000; Medium: between $20,000 and $100,000; High: greater than $100,000
5. Low: less than 5µg protein; Medium: between 5 and 100µg protein; High: greater than 100µg protein

Additionally, a similar methodology has been developed to characterise aspects of tertiary structure, specifically the connectivity of disulfide bonds: the protein is digested without prior reduction of disulfides and under conditions that do not promote disulfide scrambling.15 More recently, hydrogen-deuterium exchange (HDX) in protein samples, originally used with NMR, has been coupled with MS analysis to identify those regions of a protein that are exposed to the aqueous environment and those that are shielded from it.16 Incubation of a protein in deuterated water buffers leads to exchange of labile hydrogens (those on amides, amines and hydroxyls) that are exposed to the solvent. Rapid proteolysis under conditions that slow down amide hydrogen back exchange, followed by high performance liquid chromatography (HPLC) and MS, yields data that can be used to identify those regions of the protein molecule where deuterium was incorporated, with a resolution of a few amino acids. This can be related to the secondary and tertiary structures of the protein, but also to quaternary structure in cases of non-covalent complexes between proteins or between a protein and a small molecule. The regions of contact will yield little or no hydrogen-deuterium exchange, as they are not exposed to the aqueous environment.
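The HDX comparison described above can be sketched numerically. A common way to express the result for each peptide is the relative deuterium uptake, calculated from the centroid masses of the undeuterated, labelled and fully deuterated peptide, which also corrects for back exchange. The sketch below applies that calculation and then flags peptides whose uptake drops when a binding partner is present. The peptide labels, the uptake values and the 10 percentage point threshold are hypothetical.

```python
def deuterium_uptake_percent(m_undeuterated, m_labelled, m_fully_deuterated):
    """Relative deuterium uptake (%) of a peptide from centroid masses,
    normalised to a fully deuterated control."""
    span = m_fully_deuterated - m_undeuterated
    if span <= 0:
        raise ValueError("fully deuterated mass must exceed undeuterated mass")
    return 100.0 * (m_labelled - m_undeuterated) / span


def protected_peptides(uptake_free, uptake_bound, threshold_percent=10.0):
    """Peptides whose uptake falls by more than the threshold in the bound
    state, suggesting shielding from solvent (threshold is an assumption)."""
    return [
        peptide
        for peptide, free_value in uptake_free.items()
        if peptide in uptake_bound
        and free_value - uptake_bound[peptide] > threshold_percent
    ]


if __name__ == "__main__":
    # Hypothetical uptake values (%) for three peptides, free and in complex.
    free = {"peptide 1-12": 62.0, "peptide 13-25": 41.0, "peptide 26-40": 18.0}
    bound = {"peptide 1-12": 60.5, "peptide 13-25": 14.0, "peptide 26-40": 17.0}
    print("Protected on binding:", protected_peptides(free, bound))
```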

Native MS is another relatively recent application in which the intact biomolecular structure of folded proteins and non-covalent protein assemblies can be studied. This is accomplished using buffers that do not significantly disrupt the native structure of biomolecules and do not adversely affect the ionisation process in the MS ion source.17 The distribution profiles of the multiply-charged electrospray ion envelopes of the same protein in native (folded) and partially or fully unfolded states are different and can be related to the number of basic amino acid side chains that are exposed to the solvent and therefore available for protonation.
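The relationship between charge state and peak position in an electrospray envelope can be illustrated with a short calculation. For a protein of mass M carrying z protons, the peak appears at m/z = (M + z × 1.007276)/z, and two adjacent peaks are enough to recover both z and M. The sketch below uses a hypothetical 148,000 Da protein; the charge-state ranges chosen for the "folded" and "unfolded" cases are assumptions picked only to show that fewer charges place the envelope at higher m/z.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz(mass_da, charge):
    """m/z of a protein ion carrying `charge` protons."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (mass_da + charge * PROTON_MASS_DA) / charge


def mass_from_adjacent_peaks(mz_high, mz_low):
    """Recover (charge, mass) from two adjacent peaks of one envelope.
    mz_high carries charge z and mz_low carries charge z + 1."""
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    charge = round((mz_low - PROTON_MASS_DA) / (mz_high - mz_low))
    mass = charge * (mz_high - PROTON_MASS_DA)
    return charge, mass


if __name__ == "__main__":
    mass = 148_000.0  # hypothetical protein mass in Da
    folded = [round(mz(mass, z), 1) for z in range(23, 27)]    # assumed charges
    unfolded = [round(mz(mass, z), 1) for z in range(48, 52)]  # assumed charges
    print("folded envelope m/z:", folded)
    print("unfolded envelope m/z:", unfolded)
    print("deconvoluted:", mass_from_adjacent_peaks(mz(mass, 24), mz(mass, 25)))
```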

Chromatography and electrophoresis

High performance liquid chromatography (HPLC) of proteins can be carried out under conditions (buffers, temperature) that cause minimal or no disruption to the higher order structure of biological molecules. Species can be separated based on charge (ion exchange chromatography, IEC) or size (size exclusion chromatography, SEC), and the relative abundance of these species, as measured by UV detection, is thought to reflect their relative abundance in solution under native conditions. Similarly, polyacrylamide gel electrophoresis (PAGE), capillary gel electrophoresis (cGE) and capillary isoelectric focusing (cIEF) under native conditions allow for the separation, based either on charge or size, of native protein complexes.18 Such methods do not provide any information on the actual structure of these molecules but are useful in ascertaining the presence and measuring the comparative abundance of charge variants or aggregates, from which perturbations to the HOS can be inferred. Often, these separations, especially SEC and to a lesser extent capillary electrophoresis (CE), are carried out in-line with light scattering detection or native MS,19,20 thus augmenting and confirming information obtained from the separation itself. In addition, SEC and cGE, which are easy and quick to perform, are often validated against more involved ‘orthogonal’ methods for obtaining information regarding the presence and distribution of protein aggregates, such as analytical ultracentrifugation (AUC) and field-flow fractionation (FFF).21 The various analytical methods outlined above are summarised in Table 1.


Ioannis Papayannopoulos has been carrying out peptide and protein analytical work for the past several decades, using mass spectrometry, chromatography, spectroscopy and other analytical techniques. He received his undergraduate degree in chemistry from Bowdoin College and his PhD in Organic Chemistry from the Massachusetts Institute of Technology. Over the years, Dr Papayannopoulos has alternated between academia, most recently as director of the Proteomics Facility at the Koch Institute for Integrative Cancer Research at MIT, and the biopharmaceutical industry in senior scientific and management roles at companies such as Biogen, Covance and AstraZeneca.

Shannon Renn-Bingham is Manager of Analytical Development at Celldex Therapeutics. She has been with the company for nine years and is responsible for method development and qualification of release and stability assays used in Quality Control and for generating characterisation data of drug substance for inclusion in Investigational New Drug (IND) applications. Shannon holds a BS in Biology from Brandeis University and an MS in Molecular, Cellular and Developmental Biology from Yale University.



  1. US Food and Drug Administration, “Guidance for Industry – Q6B Specifications: Test Procedures and Acceptance Criteria for Biotechnological/Biological Products”
  2. E.M. Moussa, et al. Journal of Pharmaceutical Sciences 105, 417-430 (2016).
  3. M.S. Smyth and J.H.J. Martin, Molecular Pathology 53, 8-14 (2000).
  4. H-W Wang and J-W Wang, Protein Science 26, 32-39 (2017).
  5. G. Zanotti, NanoWorld Journal 2, 22-23 (2016).
  6. M.T.B. Clabbers, et al. Acta Crystallographica Section D 73, 738-748 (2017).
  7. A. Bax, Annual Review of Biochemistry 58, 223-256 (1989).
  8. N.J. Greenfield, Nature Protocols 1, 2876-2890 (2006).
  9. D.M. Byler and H. Susi, Biopolymers 25, 469-487 (1986).
  10. E. Ma, L. Wang and B. Kendrick, Spectroscopy 33, 46-52 (2018).
  11. A. Rygula, et al. Journal of Raman Spectroscopy 44, 1061-1076 (2013).
  12. A.B.T. Ghisaidoobe and S.J. Chung, International Journal of Molecular Sciences 15, 22518-22538 (2014).
  13. R.M. Murphy, Current Opinion in Biotechnology 8, 25-30 (1997).
  14. K. Biemann and I.A. Papayannopoulos, Accounts of Chemical Research 27, 370-378 (1994).
  15. J. Gorman, et al. Mass Spectrometry Reviews 21, 183-216 (2002).
  16. R. Y.-C. Huang and G. Chen, Analytical and Bioanalytical Chemistry 406, 6541-6558 (2014).
  17. G. Ben-Nissan, et al. Communications Biology 1, 1-12 (2018).
  18. L. Garcia-Descalzo, et al. Gel electrophoresis of proteins. In: Gel Electrophoresis – Principles and Basics, InTech, 2012; pp 57-68. ISBN: 978-953-51-0458-2.
  19. K. Muneeruddin, M. Nazzaro and I.A. Kaltashov, Analytical Chemistry 87, 10138-10145 (2015).
  20. Y. Yan, et al. Analytical Chemistry 90, 13013-13020 (2018).
  21. J.P. Gabrielson, et al. Journal of Pharmaceutical Sciences 96, 268-279 (2007).
