Identifying an organism’s optimal growth temperature (OGT) is an important step in understanding how organisms evolve and adapt to their environment. Many models developed to predict OGT in prokaryotic species use protein or genome-wide features as predictors. Buckler Lab member Sarah Jensen and former Buckler postdoc Emre Cimen set out to develop a method that predicts OGT for new organisms using as little of the genome as possible. “We wanted to study how temperature affects molecular evolution across the Central Dogma of biology, but existing models used too many genome features for this to be possible. Our models only use tRNA sequences, so now we can study how temperature affects molecular features in any prokaryote species with a genome,” Jensen said.
Jensen and Cimen built a “tRNA thermometer” that predicts OGT using only tRNA sequences. To construct their model, they used sequences from 100 archaeal and 683 bacterial species as input to train two convolutional neural network (CNN) models.
The first model they built pairs individual tRNA sequences from different species to predict which comes from a more thermophilic organism, with accuracy ranging from 0.538 to 0.992. The second uses the complete set of tRNAs in a species to predict optimal growth temperature with an r^2 of 0.86, which is comparable with other prediction accuracies in the literature despite a significant reduction in the quantity of input data. The tRNA sequences used in these models make up only about 0.1% of the total genome space.
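To make the pairwise setup more concrete, here is a minimal sketch of how tRNA sequences might be encoded and paired before being passed to a network. It is an illustration under stated assumptions, not the authors' pipeline: the one-hot encoding, the fixed padding length, the helper names `one_hot` and `make_pairs`, and the toy records are all hypothetical.

```python
from itertools import combinations

BASES = "ACGU"  # assumed alphabet; any other symbol (e.g. N) encodes as all zeros


def one_hot(seq, length):
    """Encode an RNA sequence as a length x 4 matrix, zero-padded on the right."""
    rows = []
    for i in range(length):
        base = seq[i] if i < len(seq) else None
        rows.append([1.0 if base == b else 0.0 for b in BASES])
    return rows


def make_pairs(records, length):
    """Pair tRNAs from different species and label each pair.

    The label is 1 if the first sequence comes from the species with the
    higher OGT, else 0. Same-species pairs and OGT ties are skipped.
    """
    pairs = []
    for (sp_a, seq_a, ogt_a), (sp_b, seq_b, ogt_b) in combinations(records, 2):
        if sp_a == sp_b or ogt_a == ogt_b:
            continue
        pairs.append(
            (one_hot(seq_a, length), one_hot(seq_b, length), int(ogt_a > ogt_b))
        )
    return pairs


# Hypothetical toy records: (species, tRNA fragment, OGT in degrees Celsius)
records = [
    ("species_A", "GCGGAUUUAG", 37.0),
    ("species_B", "GGGCCCGUAG", 85.0),
    ("species_B", "GCCGGGGUGG", 85.0),
]

pairs = make_pairs(records, length=12)
print(len(pairs), [label for _, _, label in pairs])
```

Each resulting triple could serve as one training example for a two-input classifier. The two sequences from `species_B` are never paired with each other, which mirrors the description of comparing tRNAs from different species.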
In addition to the reduced data input requirements, this model improves on previous OGT prediction models by removing laborious feature extraction and data preprocessing steps, and widening the scope of valid downstream analyses.
You can read more about their research here.