Devin White
Machine Learning Researcher
Hi, I’m Devin, a Machine Learning Researcher with an M.S. in Artificial Intelligence and over three years of combined professional and academic research experience. I specialize in investigating the emergent capabilities of Large Language Models (LLMs), particularly in interactive environments (like Atari!), alongside Reinforcement Learning (RL) and Reinforcement Learning from Human Feedback (RLHF), specifically Rating-based RL (RbRL). My ability to conduct end-to-end research is demonstrated by 5 publications in top AI venues (AAAI, ICML Workshops, AAAI Workshops and AAAI Bridge). I am proficient in Python, PyTorch, TensorFlow, and Stable Baselines3, and in developing novel AI models and experiments.
Skills
- Python
- PyTorch
- TensorFlow
- Stable Baselines3
- Apple MLX
- NumPy
- Pandas
- Matplotlib
- Gymnasium
- OpenAI API
- Hugging Face API
- Google Gemini API
- Reinforcement Learning (RL)
- Reinforcement Learning from Human Feedback (RLHF)
- Large Language Models (LLMs)
- Git & GitHub
Experience:

Oct 2023 - Present Machine Learning Researcher, AEOP (Army Educational Outreach Program)
Presented Rating-based Reinforcement Learning (RbRL) at AAAI 2024. Expanded the research to optimize RbRL for improved consistency across diverse environments and conditions; this work has been accepted to the Collaborative AI and Modelling of Humans Bridge Program at AAAI 2025. Investigated the use of poorly rated examples as demonstrations to avoid, resulting in another paper accepted to the Collaborative AI and Modelling of Humans Bridge Program at AAAI 2025. Additionally, benchmarked the capabilities of Large Language Models (LLMs) in Atari gameplay; this work was accepted to the Toward Knowledgeable Foundation Models Workshop at AAAI 2025.

Sep 2022 - Dec 2023 Graduate Research Assistant, UTSA
Finalized research on Rating-based Reinforcement Learning (RbRL), demonstrating that using ratings, from either user feedback or synthetic feedback, achieved similar or superior performance compared to preference-based methods under comparable conditions. Presented this work at the Many Facets of Preference Learning workshop at ICML 2023 and the 2022 Army Research Labs Humans in Complex Systems Technical Advisory Board meeting, hosted by the National Academy of Sciences. Additionally, the paper "Deep Reinforcement Learning-based Optimal Time-constrained Intercept Guidance" was accepted at the AIAA Guidance, Navigation, and Controls Conference 2024.

Jan 2022 - Aug 2022 Technical Laboratory Assistant II, UTSA
Conducted research on Rating-based Reinforcement Learning (RbRL) in the DeepMind Control Suite, exploring performance changes in more complex environments. Designed and executed an IRB-approved user study with 20 participants, which revealed that providing ratings was less mentally demanding and enabled participants to offer more frequent feedback. Additionally, developed a reinforcement learning framework for optimal time-constrained intercept guidance by creating a custom Gymnasium environment with realistic physics and implementing the codebase to support RL training in this challenging scenario.

Aug 2021 - Dec 2021 Undergraduate Research Assistant, UTSA
Continued advancing research in Rating-based Reinforcement Learning (RbRL) by leading a team of four undergraduate students in implementing and testing RbRL algorithms in Atari environments. Our work demonstrated that leveraging ratings is a viable approach, reducing computational resources and time while maintaining robust performance. This experience showcased my leadership in coordinating a research team and my technical expertise in reinforcement learning applications.

Jun 2021 - Aug 2021 NSF REU (Research Experiences for Undergraduates), UTSA
I participated in the NSF Research Experiences for Undergraduates (REU) program at UTSA, where I gained a strong foundation in Artificial Intelligence through hands-on research and lectures by leading AI researchers. As part of the program, I defined research objectives for Rating-based Reinforcement Learning (RbRL), contributing to the early development of this innovative approach. My work culminated in a presentation of preliminary findings to peers and faculty, showcasing my ability to conduct and communicate complex research effectively.
Education:

Dec 2023 Master's in Artificial Intelligence
