What Is Antibody Sequencing? A Comprehensive Overview
Antibodies are key proteins produced by the immune system that recognize and bind specific foreign invaders. Each antibody has a unique amino acid sequence that determines its three-dimensional structure and its ability to recognize antigens. Knowing this sequence is the basis for understanding the antibody's function, specificity, and interaction with antigens, which is essential for studying immune responses and disease mechanisms and for developing antibody-based therapies.
Antibody sequencing relies on two approaches: mass spectrometry (MS) and gene sequencing, typically next-generation sequencing (NGS). Mass spectrometry analyzes the protein sample directly, inferring the amino acid sequence from the measured masses of protein fragments. Gene sequencing obtains the amino acid sequence indirectly by decoding the genes that encode the antibody (from the B-cells or hybridoma cells that produce it); it is a high-throughput and relatively mature method.
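As a minimal sketch of the gene-level route, the snippet below translates an in-frame coding DNA fragment into amino acids using the standard genetic code. The function name `translate` and the short example fragment are illustrative assumptions, not part of any particular sequencing pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical fragment encoding the residues E-V-Q-L-V-E
print(translate("GAGGTGCAGCTGGTGGAG"))
```

The result is only the sequence predicted from the gene; it says nothing about what happens to the protein after translation.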
Antibody sequencing is extremely versatile and valuable. In basic scientific research, it helps to analyze the molecular details of the immune response, the relationship between antibody structure and function, and the mechanism of antibody diversity. In the field of biopharmaceuticals, it is the core of monoclonal antibody drug development, providing key data for new drug discovery and engineering.
Antibody sequencing is a technique for determining the amino acid sequence of specific antibody molecules in a monoclonal or polyclonal antibody sample. The sequence provides key information about the antibody's specificity, activity, structure, and affinity for the antigen.
Accurate determination of antibody sequences is a central aspect of the biopharmaceutical field. In therapeutic antibody development, sequence information directly determines the effectiveness, safety and manufacturability of the drug. In the process development stage, it is the key basis for recombinant expression and stable production.
Sequence data provides the basis for antibody functional validation, is indispensable evidence for protecting intellectual property rights in patent applications, and enables the recovery of lost or mutated original sequences through "clone rescue" technology.
Focusing on Functional Region Resolution
The core of antibody sequencing is resolving the variable regions (VH/VL) that determine antigen-binding specificity. Each variable region contains the more conserved framework regions (FRs) and the highly variable complementarity-determining regions (CDRs). The FRs maintain the overall spatial conformation of the antibody, while the CDRs are directly involved in antigen recognition. Accurate resolution of both types of region is a prerequisite for understanding antibody mechanisms and optimizing drug design.
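To make the FR/CDR division concrete, here is a small sketch that groups residues into regions. It assumes the variable domain has already been numbered by an external tool according to the IMGT scheme (one of several numbering conventions; the article does not prescribe one) and it ignores insertion codes. The function name `split_regions` and the toy input are hypothetical.

```python
# IMGT region boundaries for a variable domain (inclusive positions)
IMGT_REGIONS = [
    ("FR1", 1, 26), ("CDR1", 27, 38), ("FR2", 39, 55), ("CDR2", 56, 65),
    ("FR3", 66, 104), ("CDR3", 105, 117), ("FR4", 118, 128),
]


def split_regions(numbered):
    """Group (imgt_position, residue) pairs into FR and CDR strings."""
    regions = {name: "" for name, _, _ in IMGT_REGIONS}
    for pos, aa in numbered:
        for name, start, end in IMGT_REGIONS:
            if start <= pos <= end:
                regions[name] += aa
                break
    return regions


# Hypothetical, heavily truncated numbering output around the FR1/CDR1/FR2 borders
toy = [(25, "A"), (26, "S"), (27, "G"), (28, "F"), (38, "Y"), (39, "M")]
print(split_regions(toy))
```

Gaps in the position numbers are expected, because IMGT numbering leaves unused positions empty when a CDR is shorter than the maximum length.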
A common challenge in antibody research and application is the lack of traceable sources of genetic information for many important antibodies. These antibodies fall into two main categories:
First, commercially produced antibodies (e.g., polyclonal sera or finished monoclonal antibodies), for which the original hybridoma cell lines or immunized animal sources used in the production process are often not preserved or available.
Second, laboratory "legacy clones", which are valuable antibodies developed in the early days and whose gene sequences may not have been fully resolved due to loss of hybridomas, incomplete documentation, or technical limitations.
Gene-based sequencing methods such as NGS depend on high-quality RNA/cDNA extracted from B-cells or hybridomas as the template for amplification and sequencing. No nucleic acid can be obtained if the parental cell line has been lost or has degraded, or if the sample survives only as purified protein. In these cases the amino acid sequence is hard to recover, even when the antibody itself has high activity and application value.
These methods therefore fail once the cell line is lost, degraded, or cannot be cultured, a situation that accounts for more than 30% of historical antibody samples.
Mass spectrometry resolves this dilemma by analyzing the protein molecule itself.
Mass spectrometry (MS)-based antibody sequencing directly analyzes the primary structure of the protein. Its core advantage is true de novo sequencing: the sequence is reconstructed without relying on gene templates or reference databases, which makes it possible to resolve antibodies for which no genetic information is available.
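The idea behind de novo reading can be sketched with an idealised fragment ladder: consecutive fragment ions differ by the mass of exactly one residue, so each mass difference identifies an amino acid. The example below assumes a complete, noise-free, singly charged b-ion ladder, which real spectra rarely provide; the function names and the 0.01 Da tolerance are illustrative choices.

```python
# Monoisotopic residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ion_ladder(peptide):
    """Idealised singly charged b-ion m/z values, preceded by the bare proton mass."""
    ladder = [PROTON]
    for aa in peptide:
        ladder.append(ladder[-1] + RESIDUE_MASS[aa])
    return ladder


def read_ladder(ladder, tol=0.01):
    """Assign a residue (or an ambiguous set) to each consecutive mass difference."""
    calls = []
    for lower, upper in zip(ladder, ladder[1:]):
        delta = upper - lower
        hits = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


print(read_ladder(b_ion_ladder("EVQLVE")))
```

Leucine and isoleucine have identical masses, so a mass-difference read alone reports them as `L/I`; distinguishing them needs additional evidence beyond this sketch.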
To learn more, click on the article "From Workflow to Data: A Practical Guide to Antibody Sequencing".
Mass spectrometry-based antibody sequencing overcomes the limitations of gene sequencing and is especially valuable for antibody characterization. Its advantages fall into the following five areas:
Mass spectrometry requires only trace amounts (μg) of purified antibody protein to start the sequencing process, with no need for hybridomas, B-cells, or nucleic acid samples.
This feature makes it a key tool for resolving commercially purchased finished antibodies, "legacy antibodies" that are untraceable due to loss of cell line, and rare antibodies isolated from complex samples.
Unlike gene sequencing, mass spectrometry reads the mature, translated antibody protein directly rather than the theoretical sequence encoded by the gene.
This avoids sequence deviations that somatic hypermutation and variable-region splicing errors can introduce into gene-derived sequences, and it reflects the actual structure of the final protein product. The sequencing result therefore corresponds to the molecular structure of the functional antibody itself.
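Mass spectrometry also detects post-translational modifications (PTMs), such as glycosylation and deamidation, on the mature protein. This modification information is critical for antibody stability, immunogenicity, and efficacy, and gene sequencing cannot provide it.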
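One way modifications show up in MS data is as a characteristic mass shift relative to the unmodified peptide. The sketch below matches an observed shift against a short list of common monoisotopic deltas; the list is deliberately incomplete (glycans, for example, are omitted), and the function name `match_ptm` and the tolerance are assumptions made for illustration.

```python
# Monoisotopic mass shifts in Da for a few common modifications
PTM_DELTAS = {
    "deamidation (N/Q)": 0.98402,
    "oxidation (M/W)": 15.99491,
    "N-terminal pyroglutamate from Q": -17.02655,
    "N-terminal pyroglutamate from E": -18.01056,
    "acetylation": 42.01057,
    "phosphorylation": 79.96633,
}


def match_ptm(observed_mass, theoretical_mass, tol=0.01):
    """Return the modifications whose mass shift explains the observed difference."""
    delta = observed_mass - theoretical_mass
    return [name for name, shift in PTM_DELTAS.items() if abs(delta - shift) <= tol]


# Hypothetical peptide masses: the observed peptide is about 15.995 Da heavier
print(match_ptm(1015.9949, 1000.0))
```

An empty list means that none of the listed shifts explains the difference within the tolerance, not that the peptide is unmodified.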
Modern high-resolution mass spectrometers (e.g., Orbitrap) combined with advanced algorithms significantly improve sequencing reliability.
The technology is applicable to all types of antibody molecules:
Monoclonal Antibodies: Complete resolution of natural IgG, IgM, and other isotypes
Engineered Antibodies: Accurate determination of small-format constructs such as scFv and nanobodies
Bispecific Antibodies: Identification of light and heavy chain pairings in heterodimers
Mass spectrometry-based de novo sequencing of the monoclonal antibody Herceptin (Figure from Weiwei Peng, 2021)
For antibodies whose sequences are unknown because of historical technical limitations or missing samples (e.g., research antibodies whose original hybridomas were lost, or discontinued commercial antibodies), mass spectrometry-based sequencing can reconstruct the full-length sequence from trace amounts of retained protein.
Antibody sequences provide core evidence for technological innovation: they supply the sequence information needed to register and protect novel antibody molecules; they allow sequence originality to be validated (e.g., <80% concordance with known sequences) by comparison with public databases; and they generate independently verifiable experimental data that strengthens the credibility of the results.
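A concordance check of this kind can be sketched as a percent-identity calculation. The example assumes the two sequences have already been aligned (gaps written as `-`) and defines identity as matching columns divided by all aligned columns; the article does not specify how concordance is computed, so both the definition and the function names are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_below_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when identity falls below the threshold (80% in the example above)."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Hypothetical pre-aligned CDR3 fragments
print(round(percent_identity("ARDY-W", "ARGYFW"), 1))
print(is_below_threshold("ARDY-W", "ARGYFW"))
```

In practice the comparison would run against every relevant database entry, and the closest match determines whether the threshold is met.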
Mass spectrometry can reverse-analyze the CDR sequences, Fc glycoforms, and modification sites of the originator drug to establish the reference target for a biosimilar. Sequence consistency and overlapping PTM profiles then demonstrate biosimilarity and provide a chain of evidence for regulatory filing.
Before an antibody is transferred into a production cell line, antibody sequencing corrects errors in the RNA-derived sequence that arise from somatic hypermutation, verifies that the light and heavy chains are the naturally paired ones so that non-functional chains are excluded, and provides PTM benchmark data (e.g., glycosylation) to guide cell line screening. Together these steps help ensure that the recombinant antibody is functionally identical to the parental clone.
To learn more, click on the article "Antibody Sequencing in Special Applications, Monoclonals, Hybridomas, Biosimilars".
Choosing a professional and reliable mass spectrometry antibody sequencing service provider is the key to ensure data accuracy and project success.
When evaluating a provider, first examine the strength of its technology platform, then verify that it has successfully handled all relevant antibody formats, and finally ask for explicit commitments on key quality indicators.
To learn more, click on the article "How to Choose the Right Antibody Sequencing Service".
Three approaches in MS-based antibody sequencing (Figure from Sebastiaan C. de Graaf, 2022)
A professional antibody sequencing service should deliver a complete molecular profile that can be used directly in downstream research and development.
Application scenario | NGS (gene level) | MS (protein level) |
---|---|---|
Cell/mRNA samples available | ✅ Preferred | ✅ Possible |
Purified antibody only | ❌ Not feasible | ✅ Only option |
Actual structure of the expressed product must be confirmed | ❌ Predicted sequence only | ✅ Direct analysis |
PTM analysis required | ❌ Not detectable | ✅ Accurate identification |
High-throughput antibody screening (>100 clones) | ✅ Efficient parallel processing | 🚫 Better suited to single-clone analysis |
Choose MS over NGS when: (1) genetic material is unavailable (e.g., lost hybridoma or purified antibody only); (2) actual expressed protein structure must be verified (including PTMs, avoiding somatic mutation errors); or (3) critical quality attributes (e.g., glycosylation, deamidation) require analysis for therapeutic antibodies.
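The selection rule above can be written down as a small helper. This is only an illustrative encoding of the table and the three conditions; the function name `choose_method`, its parameters, and the returned reason strings are hypothetical.

```python
def choose_method(has_genetic_material,
                  needs_expressed_structure=False,
                  needs_ptm_analysis=False,
                  n_clones=1):
    """Return (method, reason) following the NGS vs. MS comparison."""
    if not has_genetic_material:
        return "MS", "no genetic material available"
    if needs_expressed_structure:
        return "MS", "expressed protein structure must be verified"
    if needs_ptm_analysis:
        return "MS", "PTM analysis required"
    if n_clones > 100:
        return "NGS", "high-throughput screening of many clones"
    return "NGS", "genetic material available"


# A purified commercial antibody with no surviving hybridoma
print(choose_method(has_genetic_material=False))
# A screening campaign with 500 clones and cells on hand
print(choose_method(has_genetic_material=True, n_clones=500))
```

Real projects often combine both methods, so the helper is a starting point for discussion and not a replacement for case-by-case judgment.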
Recombinant expression (mammalian cell production): Starting from the delivered full-length expressible sequence, the gene can be synthesized with codon optimization, cloned into a eukaryotic expression vector (e.g., pcDNA3.4), and transfected into CHO or HEK293 cells for high-yield recombinant antibody production.
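The first step of that workflow, turning a protein sequence back into a coding sequence, can be sketched as follows. The codon chosen for each amino acid here is simply the first one found in the standard genetic code, a placeholder and not a real host preference; an actual codon optimization would use a CHO or human codon usage table and further constraints. All function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Placeholder "preferred" codon per amino acid: the first codon encountered.
# A real workflow would substitute host-specific codon usage data here.
PREFERRED_CODON = {}
for codon, aa in CODON_TABLE.items():
    PREFERRED_CODON.setdefault(aa, codon)


def back_translate(protein):
    """Build a coding sequence for the protein and append a stop codon."""
    return "".join(PREFERRED_CODON[aa] for aa in protein) + PREFERRED_CODON["*"]


def translate(dna):
    """Translate in frame until the first stop codon (used as a round-trip check)."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


protein = "EVQLVE"  # hypothetical fragment
dna = back_translate(protein)
assert translate(dna) == protein  # the construct must encode the same protein
print(dna)
```

Whatever codon choice is used, the round-trip check is the part worth keeping: the synthesized gene must translate back to exactly the delivered protein sequence.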
Antibody engineering: Accelerate therapeutic antibody development by reducing immunogenicity through humanization, enhancing the ADCC effect through Fc glycosylation engineering, and increasing binding affinity 10- to 100-fold through CDR affinity maturation.
Hit-to-lead sequence confirmation in high-throughput screening: Confirm the sequences of functional hit clones from hybridoma or single B-cell pools, avoiding the false positives that NGS produces from non-functional rearrangements.
Quality control and stability monitoring: Compare samples before and after storage to detect degradation hotspots, monitor glycoform shifts caused by process changes, and build sequence-PTM databases to support root cause analysis (RCA) of abnormal batches.
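A before-and-after comparison of this kind can be sketched as a simple diff of modification levels per site. The site labels, the percentages, and the 5-percentage-point threshold below are all hypothetical values chosen for illustration, as is the function name `flag_shifts`.

```python
def flag_shifts(reference, sample, threshold=5.0):
    """Return sites whose modification level (%) moved by more than `threshold` points."""
    flagged = {}
    for site in sorted(set(reference) | set(sample)):
        before = reference.get(site, 0.0)  # a missing site counts as 0 %
        after = sample.get(site, 0.0)
        if abs(after - before) > threshold:
            flagged[site] = (before, after)
    return flagged


# Hypothetical modification levels (%) before and after storage
before_storage = {"HC N55 deamidation": 2.0, "HC M252 oxidation": 3.0}
after_storage = {"HC N55 deamidation": 12.5, "HC M252 oxidation": 4.0}

print(flag_shifts(before_storage, after_storage))
```

Sites that are flagged repeatedly across batches are candidates for the degradation hotspots mentioned above.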
For research use only, not intended for any clinical use.