Analysis of inframe indels in cardiovascular disease-associated proteins
https://doi.org/10.15829/1728-8800-2025-4597
EDN: CUZOEV
Abstract
Aim. To describe the prevalence, location, pathogenicity, and penetrance of inframe indels in clinically significant genes associated with cardiovascular diseases (CVDs).
Material and methods. We used the ClinVar and dbSNP databases, together with sequencing data for samples from various Russian cohorts. Genomic variants were annotated with the Ensembl Variant Effect Predictor (VEP).
Results. We selected 42 genes associated with 22 CVDs and analyzed the indels in these genes that are described in the ClinVar database. The number of indels and their distribution across clinical significance classes varied widely between genes. Indels are significantly less common, but more pathogenic, than nonsynonymous (missense) variants. As causal variants, indels are rarest in cardiomyopathies, myopathies, and vascular diseases, and more common in arrhythmias and hypercholesterolemia. Pathogenic indels were rarely observed in sequence repeats or low-complexity regions, while benign indels were less frequent in sequence regions annotated as Pfam functional domains. Analysis of >6800 sequenced samples from various Russian cohorts showed that the studied pathogenic indels have a relatively high penetrance. Examples of pathogenic indels potentially specific to the Russian population are described.
Conclusion. This is the first systematic description of the characteristics of a specific type of genomic variant (inframe indels) in key CVD genes. The results highlight the clinical importance of the causality and penetrance of protein indels, despite their lower prevalence compared with nonsynonymous variants.
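As an illustration of the variant class discussed above, the following minimal sketch separates inframe indels from frameshift indels by comparing allele lengths. It assumes VCF-style REF/ALT alleles that lie entirely within coding sequence; the function name and the returned labels are hypothetical and are not taken from the study or from any annotation tool.

```python
def classify_indel(ref: str, alt: str) -> str:
    """Classify a REF/ALT allele pair by its effect on the reading frame.

    Assumption: both alleles lie fully inside a coding exon, so a length
    change that is a multiple of 3 adds or removes whole codons.
    """
    delta = len(alt) - len(ref)
    if delta == 0:
        return "substitution"  # same length, so not an indel
    kind = "insertion" if delta > 0 else "deletion"
    frame = "inframe" if abs(delta) % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


# Illustrative alleles only, not variants from the study.
for ref, alt in [("ACTT", "A"), ("A", "AGGC"), ("A", "AGG"), ("A", "G")]:
    print(ref, alt, classify_indel(ref, alt))
```

A length change that is a multiple of three leaves the downstream reading frame intact, which is why such variants delete or insert a small number of amino acid residues rather than truncating or scrambling the protein.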
About the Authors
V. E. Ramenskiy
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990,
Leninskie gory, 1, Moscow, 119991
M. Zaychenoka
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990,
Institutsky Ln., 9, Dolgoprudny, Moscow region, 141701
A. V. Kiseleva
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990
A. A. Bukaeva
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990
A. I. Ershova
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990
A. N. Meshkov
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990
O. M. Drapkina
Russian Federation
Petroverigsky, 10, bld. 3, Moscow, 101990
What is already known about the subject?
- Genomic variants that delete or insert a small number of amino acid residues in a protein (indels) are less common than nonsynonymous variants, but their causal role in various monogenic diseases is of interest.
- The exome of an individual from the European population contains, on average, 115 indels, which show a wide variety of functional and/or clinical manifestations.
What might this study add?
- For the first time, the indel profile was analyzed in a group of genes significantly associated with cardiovascular diseases. The number of indels and their clinical significance classes vary greatly between genes. The most common are deletions of one residue and insertions of two to five residues.
- Indels are significantly less common, but more pathogenic, than nonsynonymous (missense) variants. As causal variants, indels are rarest in cardiomyopathies, myopathies, and vascular diseases, and more common in arrhythmias and hypercholesterolemia.
- Pathogenic indels are rarely observed in sequence repeats or low-complexity regions, while benign indels tend to avoid Pfam functional domains.
- Analysis of target genes in >6800 sequenced samples from various Russian cohorts indicates a relatively high penetrance of the studied pathogenic indels, and examples of pathogenic indels potentially specific to the Russian population are described.
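Penetrance, as used in the last point, is commonly estimated as the fraction of variant carriers who show the associated phenotype. The sketch below computes that fraction together with a Wilson score interval, which behaves reasonably for the small carrier counts typical of rare variants. This is a generic illustration and not necessarily the procedure used in the study; the function name and the carrier counts are hypothetical.

```python
import math


def penetrance_estimate(affected_carriers: int, total_carriers: int, z: float = 1.96):
    """Return (point estimate, lower bound, upper bound) for penetrance.

    The bounds are a Wilson score interval; z = 1.96 gives roughly 95% coverage.
    """
    if total_carriers <= 0:
        raise ValueError("at least one carrier is required")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected carriers must be between 0 and total carriers")
    n = total_carriers
    p = affected_carriers / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical counts for illustration only, not data from the study.
estimate, low, high = penetrance_estimate(affected_carriers=3, total_carriers=4)
print(f"penetrance ~ {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With only a handful of carriers the interval is wide, so a point estimate alone can overstate how well penetrance is known.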
Review
For citations:
Ramenskiy V.E., Zaychenoka M., Kiseleva A.V., Bukaeva A.A., Ershova A.I., Meshkov A.N., Drapkina O.M. Analysis of inframe indels in cardiovascular disease-associated proteins. Cardiovascular Therapy and Prevention. 2025;24(12):4597. (In Russ.) https://doi.org/10.15829/1728-8800-2025-4597. EDN: CUZOEV