Lipidomics, the scientific discipline dedicated to characterizing lipid molecular diversity, structural complexity, and functional roles in biological systems, has experienced exponential growth in data generation due to advancements in mass spectrometry (MS) technologies. To address the challenges of managing and interpreting these expansive datasets, specialized lipidomics databases have been established to systematically catalog lipid-related information. These repositories serve as benchmark resources by providing:
- Structural annotations: Precise molecular formulas, stereochemical configurations, and classification hierarchies.
- Metabolic pathway mapping: Integration of lipid biosynthesis, modification, and degradation routes.
- Spectral reference libraries: Curated MS/MS fragmentation patterns for lipid identification.
- Biological context: Associations with cellular processes, disease states, and regulatory networks.
This review critically evaluates the architectural frameworks, content diversity, and practical utilities of leading lipidomics databases, highlighting their indispensable role in accelerating interdisciplinary research and enabling translational applications across biomedicine.
LIPID MAPS
Initially developed with funding from the National Institutes of Health (NIH), the LIPID Metabolites and Pathways Strategy (LIPID MAPS) database is a widely used reference repository for lipidomics data. Its core functionalities are organized as follows:
1. Comprehensive Taxonomy and Annotation System
The platform integrates structural, taxonomic, metabolic pathway, and mass spectrometry (MS) data for 48,179 lipid species. Lipids are systematically classified into eight categories: Fatty Acyls, Glycerolipids, Glycerophospholipids, Sphingolipids, Sterol Lipids, Prenol Lipids, Saccharolipids, and Polyketides.
Each entry is assigned a unique 12-character identifier (LM ID) that encodes the source database, lipid category, main class, and subclass, followed by a unique entry code. Structural representations adhere to rigorous conventions, including fatty acid chain orientation and glycerophosphate stereochemical specifications.
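Because the LM ID has a fixed width, it can be split into its parts with plain string slicing. The sketch below assumes the layout used in the LIPID MAPS classification scheme (a two-character database prefix, a two-letter category code, a two-character main class, a two-character subclass, and a four-character entry code); the `LMID` class and `parse_lm_id` function are illustrative names, not part of any LIPID MAPS software.

```python
from dataclasses import dataclass

# Two-letter category codes for the eight lipid categories.
CATEGORY_CODES = {
    "FA": "Fatty Acyls",
    "GL": "Glycerolipids",
    "GP": "Glycerophospholipids",
    "SP": "Sphingolipids",
    "ST": "Sterol Lipids",
    "PR": "Prenol Lipids",
    "SL": "Saccharolipids",
    "PK": "Polyketides",
}


@dataclass(frozen=True)
class LMID:
    """Parts of a 12-character LM ID (assumed layout, see lead-in)."""
    database: str    # characters 1-2, always "LM"
    category: str    # characters 3-4, e.g. "GL"
    main_class: str  # characters 5-6
    subclass: str    # characters 7-8
    entry: str       # characters 9-12, unique within the subclass


def parse_lm_id(lm_id: str) -> LMID:
    """Split an LM ID into its fields, rejecting malformed input."""
    if len(lm_id) != 12 or not lm_id.startswith("LM"):
        raise ValueError(f"not a 12-character LM ID: {lm_id!r}")
    category = lm_id[2:4]
    if category not in CATEGORY_CODES:
        raise ValueError(f"unknown category code: {category!r}")
    return LMID("LM", category, lm_id[4:6], lm_id[6:8], lm_id[8:12])


parsed = parse_lm_id("LMGL02020001")
print(CATEGORY_CODES[parsed.category], parsed.main_class, parsed.subclass, parsed.entry)
```

Keeping the category lookup in a dictionary makes it easy to group search results by category without querying the database again.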
2. Multidimensional Data Integration
- Structural Database (LMSD): Houses 48,169 lipid entries with atomic-level precision, supporting text-based searches, structural queries, and spectral matching.
- Gene/Protein Database (LMPD): Curates 8,500+ lipid-associated genes and 12,500+ proteins across humans and model organisms.
- Virtual Libraries: Include COMP_DB (core lipid classes) and oxidized phospholipid databases, featuring algorithmically generated metabolic intermediates.
- Analytical Tools:
- LipidBlast: In silico reference MS/MS spectra covering 119,000+ lipid structures, used for compound identification.
- Collision Cross-Section (CCS) Database: 3,800+ experimental CCS values for structural validation.
3. Functional Applications
- Lipid Identification: Achieves high-confidence molecular identification by aligning experimental MS/MS data with predicted spectral libraries and LM ID metadata.
- Metabolic Pathway Visualization: Tracks lipid dynamics in pathways such as the TCA cycle and sphingolipid metabolism, enriched with KEGG pathway analysis.
- Biomarker Discovery: Enables cross-cohort data correlation analyses (e.g., linking sphingomyelin levels to COVID-19 severity).
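The lipid identification step listed above rests on scoring an experimental MS/MS spectrum against library spectra. One common scoring choice is cosine similarity over binned peaks. The following is a minimal sketch of that idea, not the scoring used by LIPID MAPS itself; the function names, the 0.01 m/z bin width, and the peak lists are assumptions made for illustration.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (mz, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(query, reference, bin_width=0.01):
    """Cosine similarity between two peak lists (0 = no overlap, 1 = identical)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0


def rank_library(query, library):
    """Return (name, score) pairs sorted from best to worst match."""
    scores = [(name, cosine_score(query, peaks)) for name, peaks in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up peak lists, for illustration only.
library = {
    "candidate A": [(184.07, 100.0), (104.11, 15.0)],
    "candidate B": [(264.27, 100.0), (282.28, 40.0)],
}
query = [(184.07, 95.0), (104.11, 20.0)]
print(rank_library(query, library))
```

Fixed-width binning is simple but can split peaks that fall near a bin edge; production tools usually match peaks within a tolerance window instead.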
4. Technical Innovation and Accessibility
As an open-access platform, LIPID MAPS offers:
- Structure Drawing Tools: For standardized lipid representation.
- Statistical Modules: For quantitative and comparative analyses.
- Interoperability: Seamless integration with analytical pipelines like MS-DIAL and LipidIMMS Analyzer.
These features establish LIPID MAPS as an indispensable resource for lipidomic investigations, driving advancements in precision medicine and systems biology.
Url: https://lipidmaps.org/databases/lmsd/overview
HMDB
Developed by The Metabolomics Innovation Centre (TMIC) in Canada, the Human Metabolome Database (HMDB) serves as a globally recognized resource for multidimensional human metabolite data integration.
An example of an HMDB pathway (Wishart DS et al., 2022).
Its key features and extensions are outlined below:
1. Data Scope and Taxonomic Framework
HMDB 5.0 catalogs 220,945 metabolite entries—doubling the prior version's scope—spanning lipids, amino acids, organic acids, and more. Lipid entries include comprehensive physicochemical descriptors (e.g., molecular formulas like C₄₂H₆₆O₅, molecular weights such as 747.02 g/mol, and CAS registry numbers). Metabolites are systematically organized into 8 primary classes and 24 subcategories, with multifaceted search capabilities based on structure, pathology, or biospecimen type.
2. Multifaceted Data Integration
Structural Data
- Interactive 2D/3D molecular visualizations via JSmol integration.
- Predicted spectral libraries: 1.44 million LC-MS/MS and NMR reference spectra.
Clinical Annotations
- Documents 660 disease-metabolite associations, including cardiovascular disorders and diabetes.
- Identifies COVID-19 biomarkers (e.g., sphingomyelins).
Functional Insights
- Pathway roles in processes like the TCA cycle and sphingolipid metabolism.
- Biological roles: phosphatidylcholine (PC) in membrane architecture; acylcarnitines in mitochondrial energetics.
Xenobiotic Data
- 8,610 protein sequences (e.g., CYP450 enzymes).
- Drug metabolites, dietary additives, and environmental contaminants.
3. Advanced Analytical Tools
Visualization Suite
- JSpectraViewer: LC-MS/MS spectral interpretation.
- 3D Label Viewer: Stereochemical analysis of complex metabolites.
A screenshot montage of some of the new visualization features in HMDB 5.0 (Wishart DS et al., 2022).
Smart Search Capabilities
- Structure-based queries, molecular weight filters, and mass spec parameter searches (retention time, CCS).
- Cross-database integration with KEGG, PubChem, and others.
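A molecular weight filter of the kind described above usually works by accepting every entry whose theoretical mass lies within a parts-per-million (ppm) window of the query mass. The sketch below illustrates that logic on a small in-memory table. The record IDs are placeholders rather than real accessions, and the function names and the 10 ppm default are assumptions; the monoisotopic masses are standard values for the named lipids.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_by_mass(query_mass, records, tol_ppm=10.0):
    """Return (id, name, ppm error) for records within tol_ppm of query_mass."""
    hits = []
    for rec in records:
        err = ppm_error(query_mass, rec["mono_mass"])
        if abs(err) <= tol_ppm:
            hits.append((rec["id"], rec["name"], round(err, 2)))
    # Closest match first.
    return sorted(hits, key=lambda hit: abs(hit[2]))


# Placeholder IDs; masses are neutral monoisotopic masses in Da.
RECORDS = [
    {"id": "REC0001", "name": "PC 16:0/18:1", "mono_mass": 759.5778},
    {"id": "REC0002", "name": "SM d18:1/16:0", "mono_mass": 702.5676},
    {"id": "REC0003", "name": "Cholesterol", "mono_mass": 386.3549},
]

print(search_by_mass(759.5790, RECORDS))
```

A relative (ppm) window is generally preferred over a fixed Da window because instrument mass error scales with m/z.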
Data Accessibility
- Curated biospecimen libraries (blood, urine, cerebrospinal fluid).
- Batch data retrieval via Python APIs.
4. Open Science and Innovation
HMDB adheres to FAIR principles, offering open APIs and standardized formats (JSON/XML) for global collaboration. Continuous updates incorporate predicted metabolites, environmental compounds, and microbial metabolism data, with plans to incorporate AI-driven annotation tools by 2025.
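Standardized XML exports make it straightforward to pull lipid records out of a bulk download. The sketch below parses a small inline sample with Python's standard library. The tag names and sample records are simplified assumptions modeled loosely on a metabolite export and should be checked against the actual HMDB schema before use; `lipid_records` is an illustrative helper, not an HMDB API.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical sample; real exports contain many more fields.
SAMPLE_XML = """<metabolites>
  <metabolite>
    <accession>EXAMPLE0001</accession>
    <name>PC(16:0/18:1)</name>
    <chemical_formula>C42H82NO8P</chemical_formula>
    <taxonomy><super_class>Lipids and lipid-like molecules</super_class></taxonomy>
  </metabolite>
  <metabolite>
    <accession>EXAMPLE0002</accession>
    <name>L-Alanine</name>
    <chemical_formula>C3H7NO2</chemical_formula>
    <taxonomy><super_class>Organic acids and derivatives</super_class></taxonomy>
  </metabolite>
</metabolites>"""

LIPID_SUPER_CLASS = "Lipids and lipid-like molecules"


def lipid_records(xml_text):
    """Return a list of dicts for entries whose super class marks them as lipids."""
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.iter("metabolite"):
        super_class = entry.findtext("taxonomy/super_class", default="")
        if super_class == LIPID_SUPER_CLASS:
            records.append({
                "accession": entry.findtext("accession"),
                "name": entry.findtext("name"),
                "formula": entry.findtext("chemical_formula"),
            })
    return records


for record in lipid_records(SAMPLE_XML):
    print(record)
```

For multi-gigabyte downloads, a streaming parser such as `xml.etree.ElementTree.iterparse` would avoid loading the whole file into memory.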
By synthesizing chemical, clinical, and biological data, HMDB has emerged as a cornerstone of metabolomics research. Its robust infrastructure and lipid-centric insights serve as a critical foundation for biomarker discovery, precision medicine, and therapeutic innovation.
Url: https://hmdb.ca/
LipidBlast
Developed by the Fiehn Laboratory in 2013, LipidBlast is a pioneering mass spectral library designed to address challenges in lipid identification and annotation.
Creation, validation and application of in-silico generated tandem mass spectra in LipidBlast (Kind T et al., 2013).
Its key innovations are outlined below:
1. Data Scope and Coverage
LipidBlast catalogs 212,516 in silico-generated MS/MS spectra, encompassing 26 lipid classes such as glycerophospholipids, sphingolipids, and glycolipids. With 119,200 unique lipid structures, it notably extends coverage to bacterial and plant lipidomes, filling critical gaps in existing resources like LIPID MAPS.
2. Spectral Diversity and Predictive Accuracy
- Dual-Polarity Ionization Modes:
- Positive-ion spectra: 80,000+ entries for adducts including [M+H]⁺, [M+Na]⁺, and [M+NH₄]⁺.
- Negative-ion spectra: 130,000+ entries for adducts such as [M-H]⁻.
- Class-Specific Fragmentation Rules: Simulates lipid cleavage patterns (e.g., sn-position-specific acyl chain cleavage in glycerophospholipids) to enhance spectral fidelity.
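The adduct types listed above each shift the precursor m/z by a fixed amount relative to the neutral monoisotopic mass. As a worked illustration (not LipidBlast's own code), the sketch below computes the monoisotopic mass of a molecular formula and the m/z of singly charged adducts. The formula parser handles only simple formulas without brackets, and the example lipid, PC 16:0/18:1 (C42H82NO8P), is chosen for illustration.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "P": 30.97376199842,
    "Na": 22.9897692820,
}
ELECTRON = 0.000548579909

# Mass shift of each singly charged adduct relative to the neutral molecule.
ADDUCT_SHIFT = {
    "[M+H]+": MONO["H"] - ELECTRON,
    "[M+Na]+": MONO["Na"] - ELECTRON,
    "[M+NH4]+": MONO["N"] + 4 * MONO["H"] - ELECTRON,
    "[M-H]-": -(MONO["H"] - ELECTRON),
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C42H82NO8P'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONO[element] * (int(count) if count else 1)
    return mass


def adduct_mz(formula, adduct):
    """m/z of a singly charged adduct of the given neutral formula."""
    return monoisotopic_mass(formula) + ADDUCT_SHIFT[adduct]


for adduct in ("[M+H]+", "[M+Na]+", "[M+NH4]+"):
    print(adduct, round(adduct_mz("C42H82NO8P", adduct), 4))
```

Because all four adducts carry a single charge, the m/z equals the ion mass; multiply charged species would need a division by the charge state.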
3. Methodological Advancements
LipidBlast employs algorithmic combinatorial chemistry to generate hypothetical lipid structures, coupled with rule-based heuristic modeling of fragmentation patterns. This approach reduces dependence on experimentally acquired reference spectra, achieving validation metrics of 89% sensitivity, 96% specificity, and a 4% false-positive rate.
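The three validation metrics quoted above are related: the false-positive rate is the complement of specificity, so 96% specificity implies a 4% false-positive rate. The short sketch below shows how such figures follow from a confusion matrix. The counts are invented so that the arithmetic reproduces the quoted percentages; they are not taken from the LipidBlast validation data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and false-positive rate from raw counts."""
    sensitivity = tp / (tp + fn)          # true positives among all real positives
    specificity = tn / (tn + fp)          # true negatives among all real negatives
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": false_positive_rate,
    }


# Hypothetical counts chosen for illustration only.
metrics = classification_metrics(tp=89, fp=4, tn=96, fn=11)
print(metrics)
```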
4. Functional Applications
- Complex Lipid Annotation: Enables precise identification of oxidized phospholipids, isomeric species, and other challenging lipids in untargeted lipidomic studies.
- High-Throughput Screening: Optimized algorithms facilitate rapid analysis of large datasets, exemplified by COVID-19 biomarker discovery.
- Novel Lipid Discovery: Critical for characterizing unannotated lipids lacking reference standards.
Url: https://fiehnlab.ucdavis.edu/projects/LipidBlast
LipidHome
LipidHome is a specialized database dedicated to lipid structural data, offering the following key functionalities:
1. Structural Database Architecture
- Interactive Molecular Visualization: Supports ChemDraw-compatible structure rendering, molecular formula display (e.g., C₄₂H₆₆O₅), and precise molecular weight annotation (e.g., 747.02 g/mol).
- Systematic Taxonomy: Classifies lipids into eight primary classes (e.g., fatty acids, glycerolipids, glycerophospholipids) under the LIPID MAPS framework, with unique identifiers (e.g., LM ID) for cross-database referencing.
2. Data Aggregation and Maintenance
- Comprehensive Coverage: Aggregates lipid structures from authoritative repositories (LIPID MAPS, PubChem) and generates structural variants (e.g., glycerophospholipids with variable chain lengths/unsaturation) via automated Perl scripting.
- Dynamic Updates: Synchronizes with external databases (e.g., PubChem) to ensure current and comprehensive lipid entries.
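Generating structural variants with variable chain lengths and unsaturation is, at its core, a combinatorial enumeration. The sketch below illustrates the idea in Python for a diacyl glycerophospholipid; it is not the database's own script, and the chain-length range, double-bond limit, and function names are assumptions chosen to keep the output small.

```python
from itertools import product


def enumerate_acyl_chains(carbons=range(14, 23, 2), max_double_bonds=2):
    """List fatty acyl chains as 'carbons:double_bonds' strings."""
    return [f"{c}:{db}" for c in carbons for db in range(max_double_bonds + 1)]


def enumerate_diacyl_species(head_group, chains):
    """Combine every sn-1 chain with every sn-2 chain for one head group."""
    return [f"{head_group} {sn1}/{sn2}" for sn1, sn2 in product(chains, repeat=2)]


chains = enumerate_acyl_chains()            # 5 chain lengths x 3 unsaturation levels
species = enumerate_diacyl_species("PC", chains)
print(len(chains), len(species))            # 15 chains, 225 positional species
print(species[:3])
```

Even this narrow parameter range yields 225 positional species for a single head group, which shows why theoretical lipid databases grow so quickly once more classes and chain types are included.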
3. Advanced Search Capabilities
- Structure-Based Queries:
- Topological structure sketching (adjacency matrix encoding).
- Similarity searches (Tanimoto coefficient ≥ 0.7).
- Taxonomic Filtering: Hierarchical navigation by lipid class (e.g., sphingolipids → ceramides → specific species).
- Metadata Filters: Search by structural features (ring systems, unsaturation indices, functional groups) to infer biological roles.
Results are split into each input search mass and their corresponding lipid "Species" identifications (Foster JM et al., 2013).
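The Tanimoto coefficient used for similarity searches is the size of the intersection of two structural fingerprints divided by the size of their union. The sketch below shows the calculation on fingerprints represented as sets of feature indices, with the 0.7 cutoff mentioned above as the default. The fingerprints and entry names are made up for illustration; real fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def similar_entries(query_fp, library, threshold=0.7):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [item for item in scored if item[1] >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints (sets of feature indices).
library = {
    "entry A": {1, 2, 3, 4, 5},
    "entry B": {1, 2, 3, 9},
    "entry C": {7, 8},
}
query = {1, 2, 3, 4}
print(similar_entries(query, library))
```

With these sets, entry A shares 4 of 5 features with the query (0.8) and passes the cutoff, while entry B shares 3 of 5 (0.6) and is filtered out.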
4. Technical Innovations
- Standardized Identifiers: Utilizes InChI strings/keys for unambiguous lipid identification.
- Analytical Integration: Stores collision cross-section (CCS) values and interfaces with platforms like MS-DIAL and LipidIMMS Analyzer.
- Predictive Algorithms: Employs Monte Carlo-based tools (e.g., ISIS) to simulate high-accuracy spectra for complex lipids (e.g., oxidized phospholipids).
Url: http://www.ebi.ac.uk/apweiler-srv/lipidhome
LipidBank
Established in 1989 by the Japanese Conference on the Biochemistry of Lipids (JCBL), LipidBank is a long-standing database specializing in naturally occurring lipids. Its core attributes and advancements are detailed below:
1. Data Scope and Curation Standards
LipidBank hosts 6,000+ naturally occurring lipid species, spanning fatty acids, glycerides, sphingolipids, and sterols. Each entry undergoes expert curation to ensure accuracy, with stringent exclusion of synthetic compounds. This focus on natural products distinguishes LipidBank as a unique resource for lipid biochemistry.
2. Multifaceted Data Resources
- Structural Data:
- ChemDraw (CDX) and MDL MOL file formats for molecular structures.
- Interactive 2D/3D visualization tools.
- Spectral Libraries:
- UV, IR, NMR, and MS/MS spectra, including polarity-specific fragmentation patterns (positive/negative ion modes).
- Functional Annotations:
- Biological roles (e.g., anti-inflammatory properties, signaling functions).
- Associations with 660+ diseases and metabolic pathways.
3. Technical Innovations
- Integrated Spectral Library: 267,716 reference MS/MS spectra for high-confidence lipid identification via LC-MS/MS alignment.
- Interoperability: Compatible with platforms like MS-DIAL for large-scale cohort analyses.
4. Continuous Expansion
LipidBank undergoes routine updates with novel natural lipid discoveries and cross-references LIPID MAPS/HMDB for consistency. The 2024 release introduced oxidized phospholipid and microbial lipid modules, enhancing applications in inflammation and gut microbiome studies.
SwissLipids
Developed by the Swiss Institute of Bioinformatics (SIB), SwissLipids stands as a globally recognized repository for lipidomic data, offering the following features:
1. Data Scope and Classification Framework
SwissLipids catalogs 150,000+ lipid species, systematically categorized under the LIPID MAPS framework into classes such as fatty acids, glycerophospholipids, and sphingolipids. Each entry includes structural data compatible with ChemDraw, molecular formulas (e.g., C₄₂H₆₆O₅), precise molecular weights (e.g., 747.02 g/mol), and CAS registry numbers, with select entries offering 3D structural visualization.
A SwissLipids entry (Aimo L et al., 2015).
2. Multidimensional Data Integration
- Metabolic Pathway Mapping: Integrates with KEGG and MetaCyc to visualize lipid dynamics in pathways such as the TCA cycle and sphingolipid metabolism.
- Spectral Libraries: Curated MS/MS data for both positive and negative ionization modes, enabling alignment with experimental LC-MS/MS datasets.
- Functional Links: Cross-references UniProt to annotate lipid-associated enzymes (e.g., lipoprotein lipase) and their biological roles.
3. Technical Capabilities
- Integrated with the Expasy platform, SwissLipids offers:
- Advanced Search Tools: InChI string queries and molecular topology-based searches.
- Pathway Enrichment Modules: For identifying lipid-centric metabolic networks.
- Structural Prediction: Compatibility with tools like SWISS-MODEL for structure-to-function analysis.
4. Continuous Development
SwissLipids is routinely updated with novel lipid entries, validated against LIPID MAPS and HMDB. The 2024 release introduced oxidized lipid and microbial lipid modules, enhancing its utility in inflammation and gut microbiome studies.
Url: https://beta.sparql.swisslipids.org/
Lipidomics Gateway
Lipidomics Gateway serves as a centralized hub for harmonizing multi-source lipidomic data, offering the following functionalities:
1. Data Architecture and Classification
At its core, Lipidomics Gateway integrates the LIPID MAPS database, cataloging 48,000+ lipid species with structural details (e.g., molecular formulas like C₄₂H₆₆O₅, molecular weights such as 747.02 g/mol). Lipids are systematically classified into 8 categories, including fatty acyls, glycerolipids, and sphingolipids, each assigned a unique 12-character identifier (e.g., LMGL02020001). This framework links lipids to gene/protein datasets (LMPD) and to computationally generated structures (LMISSD).
2. Cross-Database Integration
As a global standardization platform, Lipidomics Gateway unifies resources such as LIPID MAPS, HMDB, and Metabolomics Workbench. Its metabolite portal enables cross-platform queries by name, formula, or identifier, with pathway integration (KEGG, MetaCyc) to visualize lipid dynamics in processes like sphingolipid metabolism and the TCA cycle.
3. Analytical Tools and Spectral Libraries
- Spectral Database: 267,716 lipid MS/MS spectra covering polarity-specific fragmentation patterns (e.g., sn-1/sn-2 acyl chain cleavage in glycerophospholipids).
- Advanced Workflows:
- LC-MS/MS data alignment with collision cross-section (CCS) values via Skyline.
- Enhanced identification of complex lipids (e.g., oxidized phospholipids) through ion mobility spectrometry (IMS).
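Collision cross-section values add an orthogonal filter on top of mass matching: a candidate is kept only if its reference CCS lies within a small relative tolerance of the measured value. The sketch below is a plain-Python illustration of that filter, not a Skyline workflow. The candidate names, CCS values (in Å²), and the 2% default tolerance are assumptions for illustration.

```python
def ccs_deviation_pct(measured, reference):
    """Absolute CCS deviation as a percentage of the reference value."""
    return abs(measured - reference) / reference * 100.0


def filter_by_ccs(measured_ccs, candidates, tol_pct=2.0):
    """Keep candidate names whose reference CCS is within tol_pct of the measurement."""
    return [
        name
        for name, reference_ccs in candidates
        if ccs_deviation_pct(measured_ccs, reference_ccs) <= tol_pct
    ]


# Hypothetical isobaric candidates with reference CCS values in square angstroms.
candidates = [("candidate A", 285.0), ("candidate B", 300.0)]
print(filter_by_ccs(287.0, candidates))
```

Here the measured value of 287.0 deviates by about 0.7% from candidate A and about 4.3% from candidate B, so only candidate A survives the 2% filter.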
4. Continuous Innovation
- Dynamic Updates: Incorporates novel lipids (e.g., microbial species) and environmental compounds. The 2024 expansion introduced an oxidized lipid module for inflammation research.
- Interoperability: Interfaces with tools like SwissTargetPrediction for end-to-end structural and functional analysis.
Url: https://lipidmaps.org/tools/ms/
References
- Wishart DS, Guo A, Oler E, Wang F, Anjum A, Peters H, Dizon R, Sayeeda Z, Tian S, Lee BL, Berjanskii M, Mah R, Yamamoto M, Jovel J, Torres-Calzada C, Hiebert-Giesbrecht M, Lui VW, Varshavi D, Varshavi D, Allen D, Arndt D, Khetarpal N, Sivakumaran A, Harford K, Sanford S, Yee K, Cao X, Budinski Z, Liigand J, Zhang L, Zheng J, Mandal R, Karu N, Dambrova M, Schiöth HB, Greiner R, Gautam V. "HMDB 5.0: the Human Metabolome Database for 2022." Nucleic Acids Res. 2022 Jan 7;50(D1):D622-D631. doi: 10.1093/nar/gkab1062
- Kind T, Liu KH, Lee DY, DeFelice B, Meissen JK, Fiehn O. "LipidBlast in silico tandem mass spectrometry database for lipid identification." Nat Methods. 2013 Aug;10(8):755-8. doi: 10.1038/nmeth.2551
- Foster JM, Moreno P, Fabregat A, Hermjakob H, Steinbeck C, Apweiler R, Wakelam MJ, Vizcaíno JA. "LipidHome: a database of theoretical lipids optimized for high throughput mass spectrometry lipidomics." PLoS One. 2013 May 7;8(5):e61951. doi: 10.1371/journal.pone.0061951
- Aimo L, Liechti R, Hyka-Nouspikel N, Niknejad A, Gleizes A, Götz L, Kuznetsov D, David FP, van der Goot FG, Riezman H, Bougueleret L, Xenarios I, Bridge A. "The SwissLipids knowledgebase for lipid biology." Bioinformatics. 2015 Sep 1;31(17):2860-6. doi: 10.1093/bioinformatics/btv285