Enhancing software runtime with reinforcement learning-driven mutation operator selection in genetic improvement

Apr 27, 2025
Damien Bose, Carol Hanna, Justyna Petke
Abstract
Genetic Improvement (GI) employs heuristic search algorithms to explore the search space of program variants by modifying code using mutation operators. This research focuses on operators that delete, insert, and replace source code statements. Traditionally, in GI an operator is chosen uniformly at random at each search iteration. This work leverages Reinforcement Learning (RL) to intelligently guide the selection of these operators, specifically to improve program runtime. We propose to integrate RL into the operator selection process. Four multi-armed bandit RL algorithms (Epsilon-Greedy, UCB, Probability Matching, and Policy Gradient) were integrated within a GI framework, and their efficacy and efficiency were benchmarked against the traditional GI operator selection approach. These RL-guided operator selection strategies demonstrated empirical superiority over the traditional GI method of selecting a search operator uniformly at random, with UCB emerging as the top-performing RL algorithm. On average, the UCB-guided Hill Climbing search algorithm produced variants that compiled and passed all tests 44% of the time, while only 22% of the variants produced by the traditional uniform random selection strategy compiled and passed all tests.
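
To make the idea concrete, the sketch below shows a minimal UCB1 bandit choosing among the three mutation operators named in the abstract. This is an illustration only, not the paper's implementation: the operator names as strings, the binary reward (1 if a variant compiles and passes all tests, else 0), and the search-loop hooks are all assumptions made for the example.

```python
import math

# Illustrative sketch only: a minimal UCB1 bandit over the three mutation
# operators discussed in the paper (delete, insert, replace). The reward
# definition and loop structure below are assumptions for illustration,
# not the authors' actual GI framework integration.

OPERATORS = ["delete", "insert", "replace"]


class UCB1OperatorSelector:
    def __init__(self, c=math.sqrt(2)):
        self.c = c                                    # exploration constant
        self.counts = {op: 0 for op in OPERATORS}     # times each operator was tried
        self.values = {op: 0.0 for op in OPERATORS}   # running mean reward per operator

    def select(self):
        # Try each operator once before applying the UCB formula.
        for op in OPERATORS:
            if self.counts[op] == 0:
                return op
        total = sum(self.counts.values())
        # Pick the operator maximising: mean reward + exploration bonus.
        return max(
            OPERATORS,
            key=lambda op: self.values[op]
            + self.c * math.sqrt(math.log(total) / self.counts[op]),
        )

    def update(self, op, reward):
        # Incremental mean update after evaluating the mutated variant.
        self.counts[op] += 1
        self.values[op] += (reward - self.values[op]) / self.counts[op]


# Hypothetical search loop: mutate_and_evaluate stands in for the GI
# framework's mutation plus compile/test harness and is not a real API.
def hill_climb(mutate_and_evaluate, iterations=100):
    selector = UCB1OperatorSelector()
    for _ in range(iterations):
        op = selector.select()
        # Assumed reward: 1.0 if the variant compiles and passes all
        # tests, 0.0 otherwise.
        reward = mutate_and_evaluate(op)
        selector.update(op, reward)
    return selector
```

Under this framing, operators that more often yield compiling, test-passing variants accumulate higher mean rewards and are selected more frequently, while the exploration bonus keeps rarely tried operators from being abandoned prematurely.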
Type
Publication
In IEEE/ACM International Workshop on Genetic Improvement (GI)