Q&A of Metabolomic Data Analysis

What criteria are generally used for differential metabolite screening?

A: Differential metabolites are generally screened using two criteria together: VIP values from a multivariate statistical model and P values from a univariate test such as the t-test. Univariate methods such as the t-test and ANOVA evaluate the change in each metabolite independently. Multivariate analysis focuses on the relationships between metabolites, including how they promote or antagonize one another in biological processes. Considering both types of results lets us view the data from different perspectives, and it helps avoid the false positives or model overfitting that can result from relying on only one type of method.

The screening thresholds are generally VIP > 1 and P < 0.05. If this yields a large number of differential metabolites, a fold change threshold can be added as a further screening condition.
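
As a rough illustration of the screening rule above, the sketch below filters a list of metabolites by VIP > 1 and P < 0.05, with an optional fold change cutoff. It assumes the VIP and P values have already been computed elsewhere. The `Metabolite` class, the `screen_differential` function, the example values, and the fold change cutoff of 2 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Metabolite:
    name: str
    vip: float          # VIP from the multivariate model (computed elsewhere)
    p_value: float      # P value from the univariate test (computed elsewhere)
    fold_change: float  # ratio of group means, e.g. treatment / control


def screen_differential(metabolites, vip_min=1.0, p_max=0.05, fc_min=None):
    """Keep metabolites with VIP > vip_min and P < p_max.

    If fc_min is given (for example 2.0), additionally require a fold change
    of at least fc_min in either direction (>= fc_min or <= 1 / fc_min).
    """
    selected = []
    for m in metabolites:
        if m.vip <= vip_min or m.p_value >= p_max:
            continue
        if fc_min is not None and not (
            m.fold_change >= fc_min or m.fold_change <= 1.0 / fc_min
        ):
            continue
        selected.append(m)
    return selected


# Made-up values, for illustration only.
example = [
    Metabolite("citrate", vip=1.8, p_value=0.003, fold_change=2.5),
    Metabolite("lactate", vip=1.2, p_value=0.030, fold_change=1.3),
    Metabolite("alanine", vip=0.7, p_value=0.010, fold_change=0.4),
    Metabolite("glucose", vip=1.5, p_value=0.200, fold_change=3.0),
]

print([m.name for m in screen_differential(example)])
# ['citrate', 'lactate']
print([m.name for m in screen_differential(example, fc_min=2.0)])
# ['citrate']
```

In this example, alanine fails the VIP criterion and glucose fails the P value criterion, so neither is selected even though each passes the other test.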

What should I do if I find no differential metabolites?

A: If screening with the commonly used thresholds (VIP>1 and P<0.05) finds no differential metabolites, tightening the thresholds will not help: stricter cutoffs such as VIP>1.5 or P<0.01 are appropriate when too many candidates are returned, not when there are none. The criteria can instead be relaxed, for example by applying only one of the two. If there are still no differential metabolites, KEGG pathway analysis can be performed on the detected substances. The metabolic pathways in which these metabolites participate can then be examined to see whether compensatory (replenishment) pathways exist and whether the pathways are associated with the disease.

What is the difference between PLS-DA and OPLS-DA models?

A: Compared with PLS-DA, OPLS-DA adds an orthogonal signal correction step, which filters out variation that is irrelevant to the class separation. For example, when the between-group differences are relatively small and the within-group differences are relatively large, the variables selected by VIP in PLS-DA may reflect within-group variation, which is easily misleading, whereas OPLS-DA isolates the between-group differences more accurately.

In the PCA and OPLS-DA score plots, some samples fall outside the 95% confidence region. Do such samples need to be excluded?

A: Excluding them is not recommended. It is normal for individual samples to fall outside the 95% confidence region, and this generally does not affect the subsequent data analysis.

What is the basis for deciding whether to extract 2 or 3 principal components in PCA?

A: In SIMCA, the decision is based on Q2. If adding a principal component causes Q2 to decrease, the model is overfitted, so no further components should be added.
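
The stopping rule described above can be sketched in a few lines. This is a simplified illustration, not SIMCA's actual fitting procedure, which may apply additional criteria. The function name `choose_n_components` and the Q2 values are hypothetical.

```python
def choose_n_components(q2_by_component):
    """Return how many components to keep.

    q2_by_component[i] is the cumulative Q2 of a model with i + 1 components.
    Components are added one at a time, stopping before the first component
    that lowers Q2.
    """
    if not q2_by_component:
        return 0
    n = 1
    for previous, current in zip(q2_by_component, q2_by_component[1:]):
        if current < previous:
            break  # adding this component reduced Q2: treat as overfitting
        n += 1
    return n


# Made-up cumulative Q2 values for models with 1, 2, 3 and 4 components.
q2_values = [0.31, 0.46, 0.44, 0.47]
print(choose_n_components(q2_values))  # 2
```

Here the third component lowers Q2 from 0.46 to 0.44, so two components are kept even though a fourth component would have raised Q2 again.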

Why is the explained variance of the PCA/OPLS-DA model sometimes very low?

A: This depends first of all on the samples themselves. It also depends on the transformation and scaling applied to the data. In this case, try adjusting the normalization method used in data processing and the transformation and scaling methods used in modeling, and check whether the explained variance improves.

Does a Q2 value of less than 0.5 for PLS model cross-validation mean that the model cannot be used?

A: In general, the closer Q2 is to 1, the better the predictive ability of the model, but there is no strict requirement that Q2 be greater than 0.5. A Q2 below 0.5 indicates that the predictive ability and reliability of the model are limited, but the model can still be used.

The Q2 value is a reference for judgment, not an absolute criterion.
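
For readers who want to see what Q2 measures, the sketch below uses the common definition Q2 = 1 - PRESS / TSS, where PRESS is the sum of squared errors of the cross-validated predictions and TSS is the total sum of squares around the mean response. Software packages may differ in the details, so this is an assumption-laden illustration rather than a reproduction of any particular tool. The function name `q2_score` and the example data are hypothetical.

```python
def q2_score(observed, cv_predicted):
    """Q2 = 1 - PRESS / TSS.

    observed:     measured responses (for a two-class model, e.g. 0 and 1)
    cv_predicted: predictions for the same samples, each made by a model
                  that did not see that sample during fitting
    """
    if len(observed) != len(cv_predicted):
        raise ValueError("observed and cv_predicted must have the same length")
    mean_y = sum(observed) / len(observed)
    press = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, cv_predicted))
    tss = sum((y - mean_y) ** 2 for y in observed)
    if tss == 0:
        raise ValueError("observed values are constant; Q2 is undefined")
    return 1.0 - press / tss


# Made-up class labels and cross-validated predictions.
y = [0, 0, 0, 1, 1, 1]
y_cv = [0.2, 0.4, 0.1, 0.7, 0.9, 0.6]
print(round(q2_score(y, y_cv), 3))  # 0.687
```

Perfect cross-validated predictions give Q2 = 1, while always predicting the mean gives Q2 = 0, which is why values closer to 1 indicate better predictive ability.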

If my data set is not very large or complex, how can I use multivariate methods for analysis?

A: With a small data set, the same multivariate methods can still be run in software such as SIMCA. However, models built on small data sets are prone to overfitting. Multivariate analysis is therefore not required; other approaches, such as univariate analysis, can be used instead.

Isn't multivariate statistical analysis suitable for cases with many variables and small sample sizes? Why is it better to do multivariate statistical analysis with 6 replicates than 3 replicates?

A: In any statistical analysis, statistical significance can only be demonstrated with a sufficient sample size. In metabolomics, many factors affect metabolism, so a larger sample size reduces the influence of individual differences.

Why is metabolomics analysis usually limited to pairwise comparisons?

A: The main limitation is the OPLS-DA analysis. When more than two groups are compared, it is difficult for an OPLS-DA model to calculate the contribution of each metabolite to the differences between groups. An even greater difficulty is giving the results a reasonable interpretation.

Can the sample size be different for the two comparison groups?

A: Yes, provided that the number of biological replicates in each group meets the minimum requirement.

Does the "area" in "area normalization" refer to the total area of a sample or the total area of all samples?

A: The total area of all substances tested in a sample.
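
A minimal sketch of area normalization as described above: each peak area is divided by the summed area of all peaks detected in the same sample, so normalization is done sample by sample. The function name `area_normalize` and the peak areas are hypothetical.

```python
def area_normalize(peak_areas):
    """Divide each peak area by the total peak area of the same sample.

    peak_areas: dict mapping substance name -> peak area for ONE sample.
    Returns a dict of relative abundances that sum to 1.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_areas.items()}


# Made-up peak areas for a single sample.
sample_a = {"citrate": 1200.0, "lactate": 600.0, "alanine": 200.0}
normalized = area_normalize(sample_a)
print(normalized)
```

Each sample is passed through the function separately, so its total is never mixed with the totals of other samples.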

How can I find the peak of the substance of interest in the TIC (total ion chromatogram)?

A: Combine the retention time (RT) with the characteristic mass-to-charge ratio (m/z) to locate the peak of interest.
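
The lookup described above can be sketched as a filter over a list of detected features, keeping those whose RT and m/z both fall within a tolerance of the target. The function name `find_features`, the tolerance defaults (0.2 min and 10 ppm), and the feature values are hypothetical; suitable tolerances depend on the instrument and method.

```python
def find_features(features, target_rt, target_mz, rt_tol=0.2, ppm_tol=10.0):
    """Return features matching the target RT and m/z, strongest first.

    features: list of (rt_minutes, mz, intensity) tuples
    rt_tol:   allowed RT deviation in minutes (assumed default)
    ppm_tol:  allowed m/z deviation in parts per million (assumed default)
    """
    matches = []
    for rt, mz, intensity in features:
        ppm_error = abs(mz - target_mz) / target_mz * 1e6
        if abs(rt - target_rt) <= rt_tol and ppm_error <= ppm_tol:
            matches.append((rt, mz, intensity))
    # Sort by intensity so the most prominent candidate peak comes first.
    return sorted(matches, key=lambda f: f[2], reverse=True)


# Made-up feature list: (RT in minutes, m/z, intensity).
features = [
    (5.02, 180.0634, 8.1e5),
    (5.10, 180.0640, 2.0e4),
    (5.05, 181.0707, 3.3e5),  # right RT, wrong m/z
    (7.40, 180.0633, 5.0e5),  # right m/z, wrong RT
]

for hit in find_features(features, target_rt=5.0, target_mz=180.0634):
    print(hit)
```

Only the first two features match both criteria, which illustrates why RT and m/z are used together rather than either one alone.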

* For Research Use Only. Not for use in diagnostic procedures.
