Early detection of cancer improves patient survival. Characterizing the association of peptides and glycans with cancer is one of the most promising strategies for discovering biomarkers for early diagnosis. This study uses liquid chromatography-mass spectrometry (LC-MS) to evaluate peptide and glycan expression profiles during the progression of chronic liver disease (CLD) to hepatocellular carcinoma (HCC). The goal is to find and validate peptide and glycan biomarkers for detecting HCC at a treatable stage in a high-risk population of patients with CLD.

Label-free LC-MS quantification allows peptides and glycans to be compared with good throughput, which makes it feasible to study a large population of patients. However, instrument-specific software packages do not address such quantification adequately. In particular, alignment and normalization of LC-MS data present a significant challenge in label-free quantification and comparison of biomolecules. This challenge, coupled with biological variability and disease heterogeneity in human populations, has restricted recent advances in LC-MS-based biomarker discovery studies.

This project brings together experts in bioinformatics, biostatistics, biochemistry, and mass spectrometry to develop a suite of novel analytical tools for LC-MS-based label-free quantification and comparison of peptides and glycans in serum and plasma. Specifically, a novel Bayesian hierarchical model will be investigated for simultaneous alignment and normalization of LC-MS data and for identification of patient subgroups. The Bayesian framework includes fixed effects to account for homogeneous behavior within a subpopulation (systematic changes) and random effects to model heterogeneity within a group. A spike-in study will be conducted to obtain replicate LC-MS runs with known peptide and glycan concentrations. The resulting data will be used to develop and optimize the proposed Bayesian framework and to compare its performance with existing solutions. The optimized framework and a machine learning-based feature selection method will be applied to identify an integrated set of peptide and glycan candidate biomarkers for early detection of HCC. To our knowledge, integrated LC-MS analysis of peptides and glycans in both serum and plasma of patients with HCC is unprecedented. Blood samples from patients with HCC and from CLD controls in Egypt and the United States will be used. The biomarkers will be validated using isotope dilution mass spectrometric assays.
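
The distinction between fixed and random effects in the abstract can be illustrated with a small simulation. The sketch below is not the proposed Bayesian hierarchical model; it swaps in plain per-run median normalization as a simplified stand-in. All names (`simulate_runs`, `median_normalize`, `group_difference`) and all parameter values are assumptions made for illustration. Each simulated log-intensity is the sum of a feature baseline, a group-level shift shared by all runs in the HCC-like group (the fixed effect), a run-specific offset (the random effect), and measurement noise.

```python
import random
import statistics


def simulate_runs(n_features=200, n_changed=10, runs_per_group=5,
                  group_shift=1.0, seed=0):
    """Simulate log-intensities for two groups of LC-MS runs (hypothetical values)."""
    rng = random.Random(seed)
    baseline = [rng.gauss(20.0, 2.0) for _ in range(n_features)]
    runs = []
    for group in (0, 1):  # 0 = control-like group, 1 = case-like group
        for _ in range(runs_per_group):
            # Random effect: an intensity offset specific to this run.
            run_offset = rng.gauss(0.0, 0.5)
            values = []
            for f in range(n_features):
                # Fixed effect: the first n_changed features shift in group 1 only.
                fixed = group_shift if (group == 1 and f < n_changed) else 0.0
                values.append(baseline[f] + fixed + run_offset + rng.gauss(0.0, 0.1))
            runs.append((group, values))
    return runs


def median_normalize(runs):
    """Subtract each run's median log-intensity to remove run-level offsets."""
    normalized = []
    for group, values in runs:
        med = statistics.median(values)
        normalized.append((group, [v - med for v in values]))
    return normalized


def group_difference(runs, feature_indices):
    """Mean (group 1 minus group 0) log-intensity over the given features."""
    def group_mean(g):
        vals = [values[f] for grp, values in runs if grp == g
                for f in feature_indices]
        return statistics.fmean(vals)
    return group_mean(1) - group_mean(0)


runs = median_normalize(simulate_runs())
changed = group_difference(runs, range(10))
unchanged = group_difference(runs, range(10, 200))
print(f"changed features: {changed:.2f}, unchanged features: {unchanged:.2f}")
```

Median normalization only works here because few features change between groups, so the run median is dominated by unchanged features. A hierarchical model of the kind the abstract describes would instead estimate the run-level and group-level terms jointly.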

Public Health Relevance

This project will lead to the development of a suite of novel open-source analytical tools for label-free quantification of peptides and glycans in serum and plasma using liquid chromatography-mass spectrometry (LC-MS). The availability of such tools will help the research community advance LC-MS-based biomarker discovery. The proposed tools will be used to find and validate early-diagnosis biomarkers of hepatocellular carcinoma (HCC). Defining clinically applicable biomarkers that detect early-stage HCC in a high-risk population of patients with chronic liver disease has potentially far-reaching consequences for disease management and patient health. This project is important because most HCC patients are diagnosed at a late stage, when treatment options are limited. There is a pressing need to identify biomarkers of HCC that could be used for early detection and more accurate classification of disease. In addition to screening high-risk populations for early signs of disease, the resulting biomarkers could be used to design and test improved treatment strategies.

National Institutes of Health (NIH)
Research Project (R01)
Study Section
Cancer Biomarkers Study Section (CBSS)
Program Officer
Rinaudo, Jo Ann S
Georgetown University
Internal Medicine/Medicine
Schools of Medicine
United States
Wang, Jinlian; Zuo, Yiming; Liu, Lun et al. (2014) Identification of functional modules by integration of multiple data sources using a Bayesian network classifier. Circ Cardiovasc Genet 7:206-17
Xiao, Junfeng; Zhao, Yi; Varghese, Rency S et al. (2014) Evaluation of metabolite biomarkers for hepatocellular carcinoma through stratified analysis by gender, race, and alcoholic cirrhosis. Cancer Epidemiol Biomarkers Prev 23:64-72
Zuo, Yiming; Yu, Guoqiang; Tadesse, Mahlet G et al. (2014) Biological network inference using low order partial correlation. Methods 69:266-73
Tsai, Tsung-Heng; Tadesse, Mahlet G; Di Poto, Cristina et al. (2013) Multi-profile Bayesian alignment model for LC-MS data analysis with integration of internal standards. Bioinformatics 29:2774-80
Tsai, Tsung-Heng; Tadesse, Mahlet G; Wang, Yue et al. (2013) Profile-Based LC-MS data alignment--a Bayesian approach. IEEE/ACM Trans Comput Biol Bioinform 10:494-503