The goal is to minimize the number of obstacles hit by the duckie. We will be using DuckieTown’s environments, agents, and variables for our project. The input to the project will be data from cameras or sensors within the environment. The output will be the control commands that the agent learns to issue. Ideally, DuckieTown will display a map or route with the agent in first person, and the agent will move based on the model’s output. Such a navigation system has many applications, for example delivery robots navigating urban environments.
The algorithms that we plan to use in this project include PPO (Proximal Policy Optimization) and reinforcement learning policies with dynamic programming to maintain an efficient model. Ideally, we would integrate the PPO algorithm and train the agent to take movements that prevent it from hitting obstacles or going out of bounds. We plan to use Python 3D libraries such as Panda3D to render 3D objects and simulate the environment. We will structure the reward system around obstacles and safe movements: each step in which the agent hits an obstacle results in a negative reward, and each step in which the agent progresses through the environment without hitting an obstacle results in a positive reward. We are also considering a higher reward when a lap is completed.
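The reward scheme above can be sketched as a small Python function. This is a minimal illustration, not a final design: the flags `hit_obstacle`, `progress`, and `lap_completed` are hypothetical signals that an environment wrapper would supply, and the reward magnitudes are placeholder values we would tune during training.

```python
def compute_reward(hit_obstacle: bool, progress: float, lap_completed: bool) -> float:
    """Return a scalar reward for one simulation step.

    hit_obstacle  -- True if the agent collided with an obstacle this step
    progress      -- distance advanced along the route this step
    lap_completed -- True if the agent just finished a full lap
    (all three are assumed inputs from a hypothetical environment wrapper)
    """
    reward = 0.0
    if hit_obstacle:
        reward -= 10.0       # negative reward for hitting an obstacle (placeholder magnitude)
    else:
        reward += progress   # positive reward for safe progress through the environment
    if lap_completed:
        reward += 100.0      # larger bonus for completing a lap (placeholder magnitude)
    return reward
```

During PPO training, this per-step reward would be accumulated over each episode so that collision-free, lap-completing trajectories receive the highest return.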
In terms of qualitative metrics, we would assess how smoothly the agent moves through the terrain and how it handles edge cases, such as high-risk situations or tight obstacle navigation that requires abrupt movement.
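One way to make the smoothness assessment concrete is to measure step-to-step changes in the agent's control output: jerky driving shows up as large swings between consecutive commands. The function below is a hedged sketch of such a metric; `steering_commands` is a hypothetical log of per-step steering values that we would record during evaluation runs.

```python
def steering_smoothness(steering_commands):
    """Mean absolute change between consecutive steering commands.

    Lower values indicate smoother driving. `steering_commands` is an
    assumed list of per-step steering values logged during an episode.
    """
    if len(steering_commands) < 2:
        return 0.0  # not enough samples to measure any change
    diffs = [abs(b - a) for a, b in zip(steering_commands, steering_commands[1:])]
    return sum(diffs) / len(diffs)
```

A constant steering sequence scores 0.0, while rapid oscillation scores high, which lets us compare the intensity of movement across edge cases quantitatively.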