In robotics, deploying robots in unpredictable environments remains a significant challenge, largely because explicitly programming behaviors for dynamic conditions is impractical. This bachelor thesis addresses that challenge with Reinforcement Learning (RL), training simulated agents to replicate human gait patterns using Proximal Policy Optimization (PPO) within NVIDIA's Isaac Gym environment.
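For context, PPO's core update maximizes a clipped surrogate objective that limits how far each policy update can move from the data-collecting policy. The following is a minimal PyTorch sketch of that loss in its standard formulation (Schulman et al., 2017); the function name and signature are illustrative and are not taken from the thesis's training code.

```python
import torch

def ppo_clip_loss(new_log_probs: torch.Tensor,
                  old_log_probs: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate objective at the core of PPO."""
    # Probability ratio between the updated policy and the policy
    # that collected the rollout data.
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Unclipped and clipped surrogate terms; taking the element-wise
    # minimum penalizes updates that move the policy too far at once.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Negate because optimizers minimize.
    return -torch.min(unclipped, clipped).mean()
```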
A critical issue in applying RL to locomotion is that simulated agents often converge to sub-optimal, non-human-like movements that satisfy the reward function without producing a natural gait. To counter this, the thesis investigates various training strategies and reward function compositions, including the integration of human motion capture data into the learning process. A key contribution of this research is the application of generative AI to optimize these reward functions, improving both training efficiency and the realism of the simulated gait.
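As an illustration of such a composition, the sketch below mixes a velocity-tracking task term with a DeepMimic-style imitation term computed against motion capture reference poses. All weights, scales, and term choices here are hypothetical and do not reproduce the thesis's actual reward function.

```python
import numpy as np

def composed_reward(joint_pos: np.ndarray,
                    ref_joint_pos: np.ndarray,
                    forward_velocity: float,
                    target_velocity: float = 1.2,
                    w_imitate: float = 0.7,
                    w_task: float = 0.3) -> float:
    """Hypothetical reward mixing a task term with a motion-imitation term.

    joint_pos / ref_joint_pos: current and mocap-reference joint angles
    (radians) at the same phase of the gait cycle.
    """
    # Imitation term: exponentiated negative pose error, so a perfect
    # match with the mocap reference scores 1.0 and errors decay smoothly.
    pose_error = np.sum((joint_pos - ref_joint_pos) ** 2)
    r_imitate = np.exp(-2.0 * pose_error)
    # Task term: reward tracking a desired forward walking speed.
    r_task = np.exp(-abs(forward_velocity - target_velocity))
    return w_imitate * r_imitate + w_task * r_task
```

Exponentiating the negative errors keeps each term bounded in (0, 1], which makes the relative weights directly interpretable and leaves room for an outer optimizer, such as a generative model proposing reward variants, to tune them.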
The study systematically evaluates how effectively the different training methodologies overcome the limitations of earlier RL approaches, and demonstrates that strategically combining advanced RL techniques with human motion data leads to more accurate, human-like movement patterns in robots.