❄️ PPO Agent on SnowballTarget

This repository contains a trained Proximal Policy Optimization (PPO) agent that plays the SnowballTarget environment using the Unity ML-Agents Library.

📊 Model Card

Model Name: ppo-SnowballTarget
Environment: SnowballTarget (Unity ML-Agents)
Algorithm: PPO (Proximal Policy Optimization)
Performance Metric: Mean episode reward (self-reported)

Achieves a mean reward of 3.27 (standard deviation 1.75) on the target-hitting task
Training converged to a policy that hits targets with snowballs

🚀 Usage (with ML-Agents)

Documentation: ML-Agents Toolkit Docs

Resume training:

mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume

# Example: referencing the trained PPO model on the Hugging Face Hub
# (running the policy requires a Unity ML-Agents setup)
model_id = "KraTUZen/ppo-SnowballTarget"
# Select the .nn or .onnx file from the repo
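
The snippet above only names the model. As a minimal sketch of actually fetching the policy file, the Python below uses the `huggingface_hub` package (an assumption: it is not part of ML-Agents and must be installed separately). The helper names `pick_policy_file` and `download_policy`, the `./downloads` directory, and the choice to prefer `.onnx` over `.nn` are all illustrative, not something this repository prescribes.

```python
from pathlib import Path

REPO_ID = "KraTUZen/ppo-SnowballTarget"  # model ID from this README


def pick_policy_file(filenames):
    """Return the first .onnx file, falling back to .nn; None if neither exists.

    Preferring .onnx is an assumption made for this sketch.
    """
    for suffix in (".onnx", ".nn"):
        matches = sorted(name for name in filenames if name.endswith(suffix))
        if matches:
            return matches[0]
    return None


def download_policy(repo_id=REPO_ID, local_dir="./downloads"):
    """Download the policy file from the Hub (needs network access)."""
    # Imported lazily so pick_policy_file works without the package installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_policy_file(list_repo_files(repo_id))
    if filename is None:
        raise FileNotFoundError(f"No .onnx or .nn file found in {repo_id}")
    return Path(hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir))
```

Calling `download_policy()` should return the local path of the downloaded file, which can then be imported into a Unity project that has ML-Agents set up.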

🧠 Notes

The agent is trained using PPO, a robust on-policy algorithm widely used in Unity ML-Agents.
The environment involves throwing snowballs at targets, requiring precision and timing.
The trained model is stored as .nn or .onnx files for direct Unity integration.

📂 Repository Structure

SnowballTarget.nn / SnowballTarget.onnx → Trained PPO policy
README.md → Documentation and usage guide

✅ Results

The agent learns to hit targets with snowballs.
Training with PPO is stable and converges to an effective policy.

🔎 Environment Overview

Observation Space: Ray-cast sensor readings (targets, walls, other objects) plus a flag indicating whether the agent can shoot
Action Space: Discrete (movement, rotation, shooting)
Objective: Maximize hits on targets with snowballs
Reward: Positive reward for successful hits, penalties for misses

📚 Learning Highlights

Algorithm: PPO (Proximal Policy Optimization)
Update Rule: Clipped surrogate objective to ensure stable updates
Strengths: Robust, stable, widely used in Unity ML-Agents
Limitations: Requires careful tuning of hyperparameters (clip ratio, learning rate, batch size)
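
To make the update rule concrete, here is a small sketch of the clipped surrogate objective in plain Python. It is not the ML-Agents implementation; the function names and the default `clip_eps=0.2` are assumptions chosen for illustration, and the clip ratio is one of the hyperparameters that usually needs tuning.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one sample (a quantity to maximize)."""
    # Probability ratio between the updated policy and the old policy.
    ratio = math.exp(new_logp - old_logp)
    # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Average the clipped surrogate over a batch of samples."""
    terms = [
        clipped_surrogate(n, o, a, clip_eps)
        for n, o, a in zip(new_logps, old_logps, advantages)
    ]
    return sum(terms) / len(terms)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_eps`; with a negative advantage, it stops once the ratio drops below `1 - clip_eps`. This is what keeps each update close to the previous policy.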

🎮 Watch Your Agent Play

You can watch your agent directly in your browser:

Visit Unity ML-Agents on Hugging Face
Find your model ID: KraTUZen/ppo-SnowballTarget
Select your .nn or .onnx file
Click Watch the agent play 👀

Evaluation results (self-reported)

mean_reward on SnowballTarget: 3.270
std_reward on SnowballTarget: 1.750
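
For readers unfamiliar with these two metrics, the sketch below shows one way a mean and standard deviation of episode rewards could be computed. The reward list is made-up example data, not this model's evaluation episodes, and the use of the population standard deviation is an assumption, since the model card does not say which variant was reported.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean_reward, std_reward) rounded to three decimals."""
    mean_reward = statistics.fmean(episode_rewards)
    # Population standard deviation; the sample variant would be statistics.stdev.
    std_reward = statistics.pstdev(episode_rewards)
    return round(mean_reward, 3), round(std_reward, 3)


# Hypothetical per-episode rewards, for illustration only.
example_rewards = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean_reward, std_reward = summarize_rewards(example_rewards)
print(f"mean_reward={mean_reward}, std_reward={std_reward}")
```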