Abstract
This study presents a practical application of reinforcement learning (RL) to autonomous vehicle control in the Gymnasium CarRacing-v0 simulation environment. The primary objective was to design and train a deep RL agent that navigates procedurally generated racetracks autonomously.
The research utilized the Proximal Policy Optimization (PPO) algorithm, implemented via the Stable-Baselines3 library, chosen for its proven efficiency and robustness in continuous action domains. The training process was conducted in Python, using the PyTorch and Gymnasium frameworks.
Particular emphasis was placed on the agent’s ability to develop smooth steering and throttle control, which are essential for handling sharp curves and maintaining optimal velocity. The findings demonstrate that a properly configured PPO agent can learn to navigate the CarRacing environment after sufficient training. The study concludes with a performance evaluation, an analysis of the agent’s acquired behavior, and directions for future research.
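
For readers unfamiliar with PPO, the following is a minimal sketch of the clipped surrogate objective at the core of the algorithm, written in plain Python for illustration. It is not the Stable-Baselines3 implementation used in the study; the function name `ppo_clipped_objective` and the `clip_range` value of 0.2 are assumptions chosen for the example, since the abstract does not report the hyperparameters used.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_range=0.2):
    """Mean PPO clipped surrogate objective over a batch (a quantity to maximize).

    Illustrative helper, not a library function. Inputs are equal-length
    sequences of per-sample log-probabilities and advantage estimates.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_range, 1 + clip_range].
        clipped_ratio = max(1.0 - clip_range, min(1.0 + clip_range, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# Example: a sample whose action became twice as likely under the new policy.
# With a positive advantage, the objective is capped at (1 + clip_range) * advantage.
example_value = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is often regarded as stable for continuous control tasks such as steering and throttle.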

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2025 Information Technology Applications
