Jacob Washburn and Hai Wang test machine learning methods to predict mRNA expression levels

Updated: Apr 3, 2019

Machine learning methods can be applied readily to biological problems, but standard training and testing procedures are not designed to control for evolutionary relatedness or other biological phenomena. In their paper recently published in PNAS, Drs. Washburn and Wang propose, implement, and test two methods that control for and make use of evolutionary relatedness within a predictive deep learning framework. The methods are tested and applied in the context of predicting mRNA expression levels from whole-genome DNA sequence data and are applicable across biological organisms. Potential use cases include plant and animal breeding, disease research, and gene editing, among others.


Evolutionarily informed strategies for deep learning. (A) For prediction tasks involving a single species, genes are grouped into gene families before being further divided into a training and a test set to prevent deep learning models from learning family-specific sequence features that are associated with target variables. (B) For prediction tasks involving two species, orthologs are paired before being divided into a training and a test set to eliminate evolutionary dependencies.
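
The grouping idea in the caption can be illustrated with a small sketch. This is not the authors' code, only a minimal example under stated assumptions: the function name `family_aware_split`, the gene and family identifiers, and the choice to apply the test fraction to families rather than to individual genes are all hypothetical. The point it shows is that every member of a gene family is assigned to the same side of the split, so the test set contains no close relatives of training genes.

```python
import random
from collections import defaultdict


def family_aware_split(gene_to_family, test_fraction=0.2, seed=0):
    """Split genes into train/test so that no family spans both sets.

    gene_to_family: dict mapping a gene ID to its family ID (hypothetical IDs).
    test_fraction: share of FAMILIES (not genes) placed in the test set.
    """
    # Collect the members of each family.
    families = defaultdict(list)
    for gene, family in gene_to_family.items():
        families[family].append(gene)

    # Shuffle whole families, not individual genes.
    family_ids = sorted(families)
    random.Random(seed).shuffle(family_ids)

    # Hold out at least one family for testing.
    n_test = max(1, round(len(family_ids) * test_fraction))
    test = [g for f in family_ids[:n_test] for g in families[f]]
    train = [g for f in family_ids[n_test:] for g in families[f]]
    return train, test


# Hypothetical example: six genes in four families.
genes = {
    "geneA1": "famA", "geneA2": "famA",
    "geneB1": "famB",
    "geneC1": "famC", "geneC2": "famC",
    "geneD1": "famD",
}
train, test = family_aware_split(genes, test_fraction=0.25, seed=1)
```

For the two-species case in panel (B), the same function could be reused by giving both members of an ortholog pair a shared ID in place of the family ID, so that each pair is kept together in either the training or the test set.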

