Mass Spectrometry De Novo Protein Sequencing
Proteomics Analytical Service

De Novo Protein and Peptide Sequencing Services: Accurate, Database-Free Analysis

In precision biotherapeutics and mass spectrometry-based proteomics, genomic data and established FASTA databases are frequently insufficient on their own. Our de novo protein sequencing services provide a direct, mass spectrometry-based route to determining the primary amino acid sequence of unknown proteins, synthetic peptides, and monoclonal antibodies (mAbs).

  • 100% Sequence Coverage
  • Accurate Leu/Ile Distinction
  • Zero Database Dependency
  • Sub-ppm Mass Accuracy

Service Scope

Unlocking Unknown Sequences with 100% Coverage

Monoclonal Antibodies

Reverse engineering mAbs when hybridoma lines are lost or inaccessible.

Novel Peptides

Identifying bioactive peptides from venoms, extracts, or secretions.

Neoantigens

Precise sequencing of tumor-specific HLA-presented peptide antigens.

Complex PTMs

Sequence mapping combined with precise post-translational modification localization.

Antibodies

Full-length heavy and light chain reconstruction.

Peptides

De novo structural characterization of discovery leads.

Proteins

Unknown target identification without FASTA dependency.

Accurate Leu/Ile Differentiation

The core challenge of de novo sequencing is achieving complete sequence continuity and residue-by-residue accuracy. Standard database search algorithms can tolerate missing fragmentation peaks because they match empirical spectra against theoretical templates. De novo sequencing, in contrast, must read the sequence directly from the mass differences between adjacent fragment ions, so every missing peak leaves a gap.
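The idea of reading residues from mass differences can be illustrated with a minimal Python sketch. It assumes a clean, singly charged ion series of one type with no missing peaks; the function name `read_ladder`, the tolerance, and the example ion values are illustrative assumptions, not part of our production pipeline.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_mz, tol_da=0.005):
    """Translate gaps between consecutive singly charged ions into residues."""
    calls = []
    for lower, upper in zip(ion_mz, ion_mz[1:]):
        delta = upper - lower
        matches = sorted(
            aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol_da
        )
        # "?" marks a gap that no single residue explains (for example a missing peak).
        calls.append("/".join(matches) if matches else "?")
    return calls


# Hypothetical b-ion series (b1 to b4) for a peptide starting A-G-L-K.
b_ions = [72.04439, 129.06585, 242.14991, 370.24487]
print(read_ladder(b_ions))  # ['G', 'I/L', 'K']
```

The middle gap is reported as `I/L` because the two residues have the same mass. High mass accuracy separates K from Q in the last gap, but no backbone mass difference can separate Leu from Ile.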

A common pitfall in standard mass spectrometry is the inability to distinguish the isomeric amino acids Leucine (Leu) and Isoleucine (Ile). Because they have the same monoisotopic residue mass (113.084 Da) and the same elemental composition, the backbone fragments produced by standard collision-induced dissociation (CID) carry no information that tells them apart. In monoclonal antibody sequencing, a single Leu/Ile swap in a CDR can drastically alter antigen-binding affinity or invalidate patents.

To resolve this, we use complementary fragmentation modes, including Higher-energy Collisional Dissociation (HCD) and Electron Transfer Dissociation (ETD). ETD produces c/z backbone ions, and further activation of the z-ions yields secondary side-chain fragments (w-ions) whose masses differ between the two isomers: a Leu side chain loses an isopropyl radical (about 43.05 Da), whereas an Ile side chain loses an ethyl radical (about 29.04 Da). These diagnostic ions allow our structural scientists to differentiate Leu from Ile with a level of precision that AI-only algorithms cannot achieve alone.
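As a rough illustration of how w-ion evidence can be checked, the sketch below looks for a peak at the expected side-chain loss below a z-radical ion. It relies on the well-known losses of an isopropyl radical (C3H7, 43.05478 Da) for Leu and an ethyl radical (C2H5, 29.03913 Da) for Ile. The function name `classify_xle`, the tolerance, and the peak values are hypothetical, and real spectra need intensity and noise handling that is omitted here.

```python
ISOPROPYL_LOSS = 43.05478  # C3H7 radical, characteristic of a Leu side chain
ETHYL_LOSS = 29.03913      # C2H5 radical, characteristic of an Ile side chain


def classify_xle(z_ion_mz, peaks, charge=1, tol_da=0.01):
    """Return 'L', 'I', or 'ambiguous' from w-ion evidence below a z-radical ion."""
    leu_target = z_ion_mz - ISOPROPYL_LOSS / charge
    ile_target = z_ion_mz - ETHYL_LOSS / charge
    leu_hit = any(abs(p - leu_target) <= tol_da for p in peaks)
    ile_hit = any(abs(p - ile_target) <= tol_da for p in peaks)
    if leu_hit and not ile_hit:
        return "L"
    if ile_hit and not leu_hit:
        return "I"
    # Neither or both diagnostic peaks present: leave the call to manual review.
    return "ambiguous"


# Hypothetical z-radical ion at m/z 500.3000 with two candidate peak lists.
print(classify_xle(500.3000, [300.1000, 457.2452]))  # L
print(classify_xle(500.3000, [471.2609]))            # I
```

Returning `ambiguous` rather than guessing reflects the need for manual review when the diagnostic peak is absent or when both candidate peaks appear.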

High-resolution MS/MS spectra demonstrating Leu/Ile differentiation via specific w-ions.

Selecting the Right Strategy: De Novo vs. Database Search

Choosing the right analytical method for your target saves both sample and project resources. De novo sequencing is ideal for unknown targets, while other strategies may be a better fit depending on your structural objectives.

Dimension | De Novo Sequencing | Database Search (Bottom-Up) | Edman Degradation
Prerequisite | None (Mass Spectrometry only) | Known Genome/FASTA Database | Free, unmodified N-terminus
Unknown Species | Ideal (Completely unbiased analysis) | Ineffective (High false negative rate) | Good for very short N-terminal tags
Novel Mutations | 100% Accuracy and Discovery | Fails on unexpected mutations | Limited strictly to the N-terminus
Sequence Coverage | Up to 100% full-length continuity | High (if sequence is perfectly known) | N-terminal only (typically 30-50 residues)
Leu/Ile Distinction | High (requires specialized HCD/ETD) | High (inferred strictly via database homology) | Direct and explicit chemical resolution
Terminal Blocking | Unaffected (analyzes internal peptides) | Unaffected (analyzes internal peptides) | Fails completely (e.g., Acetylation, Pyro-Glu)

Solution Selection Strategy: Choose De Novo Sequencing for antibody reverse engineering, novel peptide discovery, and resolving highly mutated proteins. It is the definitive solution when N-terminal sequence analysis chemically fails due to blocking. For intact structural validation, consider our top-down proteomics approach.

Diverse Application Scenarios

De novo sequencing is the definitive choice when genomic information is unavailable, incomplete, or when protein-level mutations render standard database searches entirely ineffective. We rigorously support a wide array of complex biomedical R&D initiatives.

mAb Reverse Engineering

  • Comprehensively reconstructing full-length sequences for high-value biotherapeutic antibodies when the original cell line is lost.

Bispecific Antibodies & ADCs

  • Resolving complex multivalent formats and ensuring primary sequence integrity around conjugation sites.

Novel Peptide Discovery

  • Identifying highly bioactive peptides from complex natural sources such as marine animal venoms or plant extracts.

Neoantigen Identification

  • Precise sequencing of tumor-specific, mutated HLA-presented peptide antigens directly from clinical tumor biopsies.

Biosimilar Development

  • Detailed structural characterization to ensure exact primary sequence identity and comparability with originator molecules.

Unculturable Microorganisms

  • Sequencing unique proteins from complex environmental or microbiome samples lacking reference databases.

Our De Novo Sequencing Workflow & Quality Checkpoints

Achieving guaranteed 100% sequence accuracy requires a highly condensed, rigorously controlled workflow integrating advanced wet-lab chemistry with expert bioinformatics validation.

01

Protein Prep & Assessment

Sample purity is verified. Disulfide bonds are fully reduced (DTT/TCEP) and irreversibly alkylated (IAA) so that they cannot re-form, keeping the protein unfolded and accessible to proteases.

02

Multi-Enzyme Digestion

To ensure 100% overlapping peptide maps and eliminate blind spots, the sample is digested using up to six orthogonal proteases in parallel.

03

NanoLC-MS/MS Acquisition

Desalted peptides are analyzed by data-dependent acquisition (DDA) with alternating fragmentation modes (HCD/ETD) to capture diagnostic b/y and c/z ions.

04

AI-Assisted Assembly

Raw mass spectra are processed using advanced graph-theory algorithms (e.g., PEAKS Studio) to assign an Average Local Confidence (ALC) score.

05

Intact Mass Verification

Assembled primary sequences are cross-referenced against intact mass measurements to confirm that no truncations are present.

06

Expert Manual Validation

Senior bioinformaticians manually inspect all critical spectra, cross-referencing AI scores with raw evidence (especially in CDRs and at Leu/Ile positions) to rule out mismatches.
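The assembly idea behind steps 02 and 04 can be sketched as a greedy overlap merge: peptides from different proteases are stitched together wherever the end of one matches the start of another. This is a simplified illustration, not the algorithm used by PEAKS Studio or by our pipeline. The function names, the minimum overlap, and the example peptides are assumptions, and the sketch ignores sequencing errors, I/L equivalence, and confidence scores.

```python
def merge(left, right, min_overlap=3):
    """Join right onto left if a suffix of left equals a prefix of right."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


def assemble(peptides, min_overlap=3):
    """Greedily grow one contig; return it with any peptides left unplaced."""
    contig = peptides[0]
    remaining = list(peptides[1:])
    progress = True
    while remaining and progress:
        progress = False
        for pep in list(remaining):
            if pep in contig:
                merged = contig  # already fully contained in the contig
            else:
                merged = merge(contig, pep, min_overlap) or merge(pep, contig, min_overlap)
            if merged:
                contig = merged
                remaining.remove(pep)
                progress = True
    return contig, remaining


# Hypothetical overlapping peptides from three different digests.
peptides = ["SGGGLVQPGG", "QPGGSLR", "EVQLVESGGG"]
print(assemble(peptides))  # ('EVQLVESGGGLVQPGGSLR', [])
```

Peptides that cannot be placed are returned rather than forced into the contig, which mirrors the role of the manual validation step: unplaced or weakly supported regions are flagged for inspection instead of being guessed.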

Advanced Analytical Platform for Zero-Mismatch Results


Ultra-High Resolution Mass Spectrometers

We use Thermo Scientific™ Orbitrap™ Fusion™ Lumos™ Tribrid™ and Bruker timsTOF Pro/HT systems, with Orbitrap acquisitions routinely exceeding a mass resolution of 240,000 FWHM.

Orthogonal Fragmentation Modes

Complementary use of CID, HCD, ETD, and EThcD maximizes peptide backbone cleavage across amino acid combinations, substantially increasing sequence coverage.

TIMS & PASEF Integration

Leveraging advanced ion mobility to physically separate co-eluting isobaric peptides prior to MS/MS fragmentation, drastically reducing spectral complexity and background noise.

Sub-ppm Mass Accuracy

Delivering mass measurement accuracy consistently within 1 ppm, which is computationally critical for algorithms to filter out false peptide assembly paths and isotopic interferences.
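For readers less familiar with the unit, the sketch below shows how a parts-per-million mass error is computed and how a 1 ppm filter can be applied. The function names and the m/z values are illustrative assumptions.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=1.0):
    """True if the observed value lies within tol_ppm of the theoretical value."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical precursor: 0.0005 m/z units off at m/z 785.8421.
print(round(ppm_error(785.8426, 785.8421), 2))  # 0.64
print(within_tolerance(785.8426, 785.8421))     # True
print(within_tolerance(785.8500, 785.8421))     # False (about 10 ppm off)
```

A tight tolerance matters because some residues differ only slightly in mass. Lys (128.09496 Da) and Gln (128.05858 Da), for example, are 0.03638 Da apart, which a wide tolerance window could fail to separate on larger fragments.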

Sample Requirements & Prep Guidelines

Strict adherence to sample submission guidelines ensures the highest probability of analytical success and rapid turnaround.

Sample Type | Minimum Amount | Purity Requirement & Handling Notes
Monoclonal Antibodies (mAb) | 100–500 µg | Purity > 95%. Avoid all detergents (SDS/Triton) and primary amines (Tris/Glycine). Lyophilized state is preferred.
Recombinant Proteins | 50–200 µg | Purity > 90%. Buffer exchange into 50 mM ammonium bicarbonate (ABC) or MS-grade water. Provide target MW.
Synthetic Peptides | 10–50 µg | Purity > 85%. Lyophilized powder submitted in a clean, low-binding tube.
SDS-PAGE Gel Bands | Visible band | Ensure clear band separation. Use Coomassie Blue or SYPRO Ruby. Strictly avoid silver stain.

Handling Challenging Samples

Heavily Glycosylated Proteins

Integration of specialized deglycosylation workflows (PNGase F, Endo H, or O-glycosidase) to completely strip masking glycans and expose the hidden peptide backbone.

Membrane & Hydrophobic Proteins

Utilization of MS-compatible, acid-cleavable surfactants (e.g., Azo or RapiGest SF) to maintain target protein solubility without suppressing electrospray ionization.

Insoluble Aggregates

Advanced solubilization protocols utilizing high molarity Urea or Guanidine-HCl tailored specifically for severely misfolded protein inclusion bodies.

De Novo Sequencing Deliverables Report

Deliverables & Demo Results

We provide a rigorous, highly transparent analytical data package designed to be publication-ready, fully compliant with international proteomics data guidelines, and directly actionable for downstream gene synthesis and cloning operations.

Raw MS Data & FASTA Sequences

Original .raw instrument files provided alongside ready-to-use continuous FASTA strings optimized for immediate gene cloning.

ALC Confidence Scoring Reports

Residue-level Average Local Confidence scoring, ensuring that critical binding domains consistently exceed a strict 95% threshold.

Annotated MS/MS Spectra

High-resolution fragmentation evidence for critical domains explicitly showing comprehensive b/y or c/z ion sequence ladder coverage.

Leu/Ile Differentiation Proof

Annotated spectral exports showing the characteristic w-ions that justify each Ile/Leu assignment.

Sequence Variant Analysis (SVA)

Proactive identification of minor sequence variants, proteolytic miscleavages, or C-terminal lysine clippings within the sample population.

PTM Localization Mapping

Definitive identification and high-resolution sequence mapping of naturally occurring or artifactual modifications.
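To show how the residue-level confidence values in the ALC report might be screened against the 95% threshold, here is a minimal sketch. It assumes the report can be read as one confidence value per residue. The function names, the example sequence, and the scores are hypothetical and do not represent an actual PEAKS Studio export format.

```python
def average_local_confidence(local_conf):
    """Mean of the per-residue confidence values (percent)."""
    return sum(local_conf) / len(local_conf)


def flag_low_confidence(sequence, local_conf, threshold=95.0):
    """List (position, residue, score) for residues scoring below the threshold."""
    if len(sequence) != len(local_conf):
        raise ValueError("sequence and confidence list must have equal length")
    return [
        (pos, aa, score)
        for pos, (aa, score) in enumerate(zip(sequence, local_conf), start=1)
        if score < threshold
    ]


# Hypothetical CDR-like stretch with made-up per-residue scores.
region = "GFTFSSYA"
scores = [99, 98, 97, 99, 91, 96, 99, 88]
print(average_local_confidence(scores))     # 95.875
print(flag_low_confidence(region, scores))  # [(5, 'S', 91), (8, 'A', 88)]
```

The example also shows why a residue-level view is useful: the average clears 95% while two individual positions do not, and those are the positions that would warrant manual inspection of the underlying spectra.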

Independent Published Benchmark

Resolution of Mass Coincidence Errors and Isobaric Residues in Bottom-up Proteomics

Sample

Secreted Antibodies

Prep

Multi-Enzyme Digestion

Platform

High-Res LC-MS/MS

Output

Precise I/L Assignment

Project Overview

In this independent academic benchmark, researchers aimed to evaluate the capability of mass spectrometry-based methods to probe specific antibody sequences directly from secreted polypeptide products, circumventing the traditional need to sample and sequence the original antibody-producing B-cell clones.

Background

  • Obtaining direct amino acid sequences of antibodies without a priori knowledge is crucial for developing novel therapeutics.
  • Current genomic approaches fail when B-cell clones are lost or mutated post-transcriptionally.
  • A robust protein-level method is demanded to ensure structural fidelity.

Analytical Challenge

  • A major limitation in peptide-level sequencing is the presence of isobaric residues (Leucine/Isoleucine).
  • Complex mass coincidence errors create significant ambiguity in computational sequence assembly.
  • Lack of explicit differentiation poses severe risks for functional sequence reconstitution.

Workflow

Sample Type

Secreted antibody polypeptides.

Sample Preparation

Bottom-up proteomics utilizing multiple proteases to generate redundant sequence overlaps.

Mass Spectrometry Analysis

High-resolution MS/MS to capture primary backbone and secondary fragmentation data.

Data Analysis

Specialized post-processing procedures designed to definitively assign I/L residues based specifically on secondary fragmentation (diagnostic w-ions).

Key Insight

As detailed in Figure 1 of the referenced publication, the explicit analytical accounting for mass coincidence errors and the strategic integration of secondary w-ion fragments allow for exact residue assignment, successfully resolving the notorious Leucine/Isoleucine ambiguity.

Technical Value

  • Validates exact multi-enzyme high-resolution MS methodologies.
  • Proves MS-only workflows can achieve high-fidelity sequences.
  • Demonstrates absolute independence from genomic reference data.

Key Findings

Methodological Success Metrics

100%

Assignment

Fragment-based resolution of isobaric I/L residues using diagnostic w-ions.

Dependency

Zero reliance on preexisting genomic databases.

Probing

Direct sequence probing from secreted serum polypeptides.

Frequently Asked Questions

Can you sequence a protein that is N-terminally blocked (e.g., by acetylation or pyroglutamate)?
Yes. Because mass spectrometry de novo sequencing enzymatically digests the protein into smaller fragments and analyzes internal and C-terminal peptides independently, it is unaffected by the N-terminal modifications that halt traditional Edman degradation. In addition, the mass spectrometry data can identify the blocking modification from its mass and map it to the N-terminus.
How do you guarantee the differentiation between Leucine and Isoleucine without relying on genomic guesswork?
We differentiate them using hybrid fragmentation on ultra-high-resolution platforms. Standard Collision-Induced Dissociation (CID) yields the same backbone fragment masses for Leu and Ile, so we combine HCD and ETD fragmentation to generate side-chain-specific w-ions. These secondary fragments differ in mass between the two isomers, which lets us assign each residue from measured spectral evidence rather than from homology.
What is the maximum length of a protein that can be subjected to de novo sequencing?
There is no fixed upper limit on protein length, provided the protein can be digested efficiently. Because we use a bottom-up approach with parallel multi-enzyme digestion, proteins of 150 kDa and above (such as IgG monoclonal antibodies, multi-subunit receptor complexes, or large enzymes) are routinely sequenced by computationally assembling the library of overlapping short peptides into one contiguous map.
Will PEGylation or other synthetic polymer conjugations interfere with the sequencing process?
Yes. Heavy synthetic modifications such as PEGylation can severely suppress peptide electrospray ionization and add substantial spectral complexity because of polymer polydispersity and charge-state shielding. For heavily PEGylated therapeutics, we remove the polymer chains chemically or enzymatically before mass spectrometry analysis, which allows us to isolate, ionize, and sequence the underlying protein backbone.
How do you ensure sequence coverage truly reaches 100%?
Single-enzyme digestion (e.g., using Trypsin alone) almost always leaves sequence coverage gaps, because it produces some peptides that are too large to ionize well and some very short hydrophilic fragments that are lost during desalting. We address this by running up to six orthogonal proteases in parallel digestion workflows, which creates highly redundant, overlapping peptide maps. If one enzyme fails to cleave a problematic region, another enzyme with a different cleavage specificity will, so that every amino acid is covered by at least one sequenced peptide.
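The coverage argument can be illustrated with a small in silico digest. The cleavage rules below are deliberately simplified (trypsin after K/R with the proline exception ignored, chymotrypsin after F/W/Y only, Glu-C after E only). The detectable length window, the function names, and the toy sequence are assumptions chosen for illustration.

```python
# Simplified cleavage rules: residues after which each enzyme cuts.
RULES = {
    "trypsin": "KR",
    "chymotrypsin": "FWY",
    "glu_c": "E",
}


def digest(sequence, sites):
    """Return (start, end) index pairs for a complete digest after `sites`."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in sites:
            peptides.append((start, i + 1))
            start = i + 1
    if start < len(sequence):
        peptides.append((start, len(sequence)))
    return peptides


def coverage(sequence, enzymes, min_len=6, max_len=30):
    """Fraction of residues inside at least one peptide of detectable length."""
    covered = [False] * len(sequence)
    for name in enzymes:
        for start, end in digest(sequence, RULES[name]):
            if min_len <= end - start <= max_len:
                for i in range(start, end):
                    covered[i] = True
    return sum(covered) / len(sequence)


# Hypothetical sequence with a K/R-rich N-terminus.
toy = "MKRSKAGFLTEVSAWDNPEQLGYAIHR"
print(round(coverage(toy, ["trypsin"]), 3))     # 0.815
print(coverage(toy, ["trypsin", "chymotrypsin"]))  # 1.0
```

In this toy case trypsin alone cuts the N-terminal stretch into fragments of one or two residues, which fall below the assumed detectable length, while the chymotrypsin digest spans that region with a longer peptide.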

Disclaimer: All services and products provided are for Research Use Only (RUO) and are not intended for use in clinical or diagnostic procedures.

Online Inquiry