

My Research
Many scientific questions can be answered with the vast amount of data that already exists. The overarching goal of my laboratory is to develop bioinformatics and artificial intelligence (AI) based methods to accelerate the pace of scientific discovery.
One long-term area of research interest is developing effective ways of modelling the current state of scientific knowledge. Scientific publications have been growing in number at an exponential rate (42M in 2026, growing 5%/yr) and also in size, with more supplementary data. The literature is far too vast for one person to be familiar with any more than a tiny subset of it, so we have developed computational methods to model, summarize, and synthesize the fundamental relationships between the things of interest within it (e.g., genes, diseases, chemicals, mutations, etc.).
The ability to create these “knowledge networks” is useful in itself, but it is also a critical prerequisite for understanding what is going on at the molecular level in disease. For example, humans have ~25,000 protein-coding genes and ~30,000 non-coding RNA transcripts, yet we still know little or nothing about the functions of half of them. We know the genomic location and sequence of each transcript and have experimental data on its expression, which illustrates the primary problem we are attempting to solve: having data is not the same thing as having knowledge. So, we have developed methods that combine transcriptional correlation networks with literature networks to predict the functions of these unknown genes, and experimental validation studies have shown these predictions to be approximately 85% accurate.
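To make the idea of predicting function from transcriptional correlation concrete, here is a minimal "guilt by association" sketch. It is an illustration under stated assumptions, not the laboratory's actual method: the gene names, expression values, and annotation terms are invented toy data, the annotation table stands in for a literature-derived network, and `pearson` and `predict_function` are hypothetical helper names.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(vx * vy)


def predict_function(target, expression, annotations, top_k=2):
    """Rank annotation terms for `target` by correlation-weighted votes
    from its top_k most strongly co-expressed annotated genes."""
    scored = []
    for gene, profile in expression.items():
        if gene == target or gene not in annotations:
            continue
        scored.append((pearson(expression[target], profile), gene))
    scored.sort(reverse=True)

    votes = Counter()
    for r, gene in scored[:top_k]:
        if r > 0:  # only positively co-expressed genes vote
            for term in annotations[gene]:
                votes[term] += r
    return votes.most_common()


# Toy data (assumed for illustration only): expression across four samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.1, 6.0, 8.2],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "GENE_D": [1.0, 3.0, 2.0, 2.5],
    "UNKNOWN_1": [1.0, 2.1, 2.9, 4.2],
}
# Stand-in for associations mined from the literature.
annotations = {
    "GENE_A": ["immune response"],
    "GENE_B": ["immune response"],
    "GENE_C": ["lipid metabolism"],
    "GENE_D": ["cell cycle"],
}

ranked = predict_function("UNKNOWN_1", expression, annotations)
print(ranked[0][0])
```

In this toy setup the uncharacterized gene inherits the term shared by the annotated genes whose expression most closely tracks its own. A real analysis would use many more samples, a principled correlation threshold, and a far richer literature network.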
A new age of AI was ushered in with the invention of Transformer technology, which underlies the success of Large Language Models such as ChatGPT, as well as de novo protein structure prediction models such as AlphaFold2. In brief, it permitted neural network-based models to “pay attention” to non-local features such as words said in previous sentences or amino acids in a different part of a protein’s sequence. Importantly, this new generation of AI has solved problems that previously bottlenecked progress, allowing us to now focus on higher-level issues, such as what might be called “synthetic understanding”. Take, for example, an image-generation program like DALL-E, where you type words to generate an image of a cat. Having learned from millions of labelled cat pictures, it can intelligently blend the words you provide to determine the arrangement and color of pixels, such that it “understands” things like size, color, pose, breed, etc. Just as AI can understand the essence of “cat-ness”, it should similarly be able to understand the essence of “Alzheimer-ness”, or that of almost any disease. We are currently working on AI-based approaches to do exactly this, which should allow us to model potential interventions and predict patient outcomes faster and more cheaply.
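The “pay attention” idea can be sketched with scaled dot-product attention, the core operation of a Transformer. The example below is a simplified illustration in plain Python, not a description of any particular model: the function names and the tiny two-dimensional vectors are assumptions chosen for readability, and real models add learned projections, multiple heads, and much larger dimensions.

```python
from math import exp, sqrt


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights): each output is a weighted average of the
    value vectors, weighted by how well the query matches each key.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / sqrt(d) for k in keys]
        w = softmax(scores)
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(w)
    return outputs, all_weights


# Toy sequence of four positions (assumed values). The query resembles the
# key at the last position, so most weight goes there regardless of distance.
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [4.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0], [5.0, 5.0]]
queries = [[3.0, 0.0]]

outputs, weights = attention(queries, keys, values)
print([round(w, 3) for w in weights[0]])
```

Because the weights depend only on how similar the query is to each key, and not on how far apart the positions are, a word several sentences back or a residue in a distant part of a protein sequence can contribute as much as an immediate neighbor.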
Research Keywords
- Artificial intelligence
- Machine learning
- Bioinformatics
- Knowledge networks
- Disease modeling

Contact

Jonathan D. Wren, Ph.D.
Genes & Human Disease Research Program, MS 42
Oklahoma Medical Research Foundation
825 N.E. 13th Street
Oklahoma City, OK 73104
Phone: 405-271-6989
Fax: 405-271-4110











