The advent of genomic sequencing revolutionized the life sciences, revealing the human genome to be ~3 billion base pairs long. This effort identified ~20,000 protein-coding genes, which comprise only ~1% of the genome. While ~10% of the genome is considered "silent", the remaining ~89% is transcribed but thought to be untranslated. Earlier annotation methods excluded transcripts under 300 bp (100 aa) because of the high rate of gene misidentification. Recent studies have identified small open reading frames (sORFs) that encode peptides under 100 aa with distinct biological functions. We hypothesize that numerous peptides with unknown biological functions play prominent roles in cardiac physiology and disease. To identify novel sORFs, we combined in vivo and in silico approaches, including the statistical prediction algorithm sORF finder, to compile a database of putative sORFs. Given the prominent role of mitochondrial dysfunction in heart failure (HF), we included mitochondrial-targeting predictions for all sORFs, based on N-terminal protein sequence analysis with the computational programs MitoFates and MitoProt. In addition, we incorporated mRNA deep sequencing of left ventricular samples from mice subjected to transverse aortic constriction (pressure-overload HF) or permanent ligation of the left coronary artery (myocardial infarction) at multiple stages of disease progression to identify sORFs that are differentially expressed in HF. To prioritize our search, we are ranking these components for unbiased target selection and experimental confirmation. We envision this database as a valuable resource, within and beyond the cardiovascular field, for identifying novel genes with therapeutic potential.
Cardiovascular disease (CVD) is the leading cause of death in the US, and its annual cost is estimated to reach $1 trillion by 2035. Given rising costs and the ever-increasing life expectancy of people living with CVD, new strategies are needed to prevent and treat this illness. Advances in technology now allow biologists to scan the human genome for undiscovered genes, specifically those of smaller size. This previously unexamined area of our genome may contain many genes involved in CVD, which could lead to new therapeutic treatments and improved quality of life.