Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture from scratch. Train on Shakespeare (CPU) or OpenWebText (multi-GPU).
Rating: 7.0
Installs: 0
Category: AI & LLM
Excellent educational skill for GPT implementation, with comprehensive workflows covering Shakespeare training, GPT-2 reproduction, fine-tuning, and custom datasets. The description is clear and actionable, with detailed code examples, configs, and troubleshooting. The structure is well organized, with logical sections and appropriate references to separate files for advanced topics. Task knowledge is thorough, including complete end-to-end pipelines, hardware requirements, and performance benchmarks. Novelty is moderate to good: while the underlying nanoGPT is well documented externally, this skill adds workflow orchestration, troubleshooting guidance, and decision frameworks that reduce token usage compared to researching the documentation from scratch. Minor improvement area: it could benefit from more explicit CLI invocation patterns for agents to follow programmatically, as sketched below.
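As a rough illustration of that last point, here is a minimal sketch of how an agent might drive the Shakespeare (CPU) workflow programmatically. The script paths and flags (`data/shakespeare_char/prepare.py`, `train.py`, `config/train_shakespeare_char.py`, `sample.py`, `--device`, `--compile`, `--eval_iters`, `--max_iters`, `--out_dir`) are taken from the upstream nanoGPT README and assume the commands are run from the repository root; they should be verified against the skill's SKILL.md before use.

```python
import subprocess
import sys

def run(cmd: list[str]) -> None:
    """Run one nanoGPT CLI step, echoing the command and failing fast on errors."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Tokenize the tiny Shakespeare character-level dataset.
run([sys.executable, "data/shakespeare_char/prepare.py"])

# 2. Train a small character-level model; the extra flags keep the run feasible on CPU.
run([
    sys.executable, "train.py", "config/train_shakespeare_char.py",
    "--device=cpu", "--compile=False",
    "--eval_iters=20", "--max_iters=2000",
])

# 3. Sample from the trained checkpoint.
run([sys.executable, "sample.py", "--out_dir=out-shakespeare-char"])
```

For the OpenWebText / GPT-2 reproduction path, the upstream README instead launches distributed training with `torchrun --standalone --nproc_per_node=8 train.py config/train_gpt2.py`, which could be wrapped the same way.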