Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.
Rating: 8.3
Installs: 0
Category: Machine Learning
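To make the "scikit-learn-like API" claim in the description concrete, here is a minimal sketch of training and rolling out an agent with stable-baselines3 on a Gymnasium environment. The environment name, timestep budget, and rollout length are illustrative choices, not taken from the skill itself.

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Train PPO on a standard Gymnasium task with the default MLP policy.
env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

# Roll out the trained policy for a few steps, resetting on episode end.
obs, info = env.reset(seed=0)
for _ in range(200):
    action, _state = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```

Any of the listed algorithms (SAC, DQN, TD3, DDPG, A2C) follows the same construct-learn-predict pattern; only the algorithm class and a compatible action space change.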
Exceptional RL skill with comprehensive coverage of Stable Baselines3. The description clearly articulates capabilities and use cases, enabling easy invocation. Task knowledge is outstanding with complete code patterns, templates, and detailed references for all major workflows (training, custom environments, callbacks, vectorization). Structure is excellent with a well-organized SKILL.md providing overview and quick patterns while delegating detailed documentation to referenced files. The skill demonstrates high novelty by packaging complex RL workflows that would require extensive documentation reading and trial-and-error into ready-to-use templates. Minor improvement possible: could slightly expand description to mention evaluation and persistence capabilities explicitly, though these are well-covered in the body.
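As a rough illustration of the evaluation, persistence, vectorization, and callback workflows the review mentions, the sketch below uses stable-baselines3's built-in helpers. The environment, timestep counts, and save paths are assumptions for illustration, not templates copied from the skill.

```python
from stable_baselines3 import A2C
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.evaluation import evaluate_policy

# Vectorization: four parallel copies of CartPole-v1 for rollout collection.
vec_env = make_vec_env("CartPole-v1", n_envs=4)

# Callback: periodically evaluate during training and keep the best checkpoint.
eval_callback = EvalCallback(
    make_vec_env("CartPole-v1", n_envs=1),
    eval_freq=2_500,
    n_eval_episodes=5,
    best_model_save_path="./best_model/",
)

model = A2C("MlpPolicy", vec_env, verbose=0)
model.learn(total_timesteps=50_000, callback=eval_callback)

# Persistence: save to a zip archive, then reload later.
model.save("a2c_cartpole")
model = A2C.load("a2c_cartpole")

# Evaluation: mean and std of episodic return over fresh episodes.
mean_reward, std_reward = evaluate_policy(model, vec_env, n_eval_episodes=10)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```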