Use when building Apache Spark applications or distributed data processing pipelines, or when optimizing big data workloads. Invoke for the DataFrame API, Spark SQL, RDD operations, performance tuning, and streaming analytics.
Rating: 6.4
Installs: 0
Category: Data & Analytics
Excellent Spark skill with comprehensive coverage of distributed data processing. The description clearly conveys when to invoke it (DataFrame API, Spark SQL, performance tuning, streaming). The structured reference system organizes deep technical knowledge across five specialized files. The core workflow provides clear steps from analysis to validation. The constraints section is strong, with actionable MUST/MUST NOT rules (broadcast joins, avoiding collect(), skew handling). Well targeted at production Spark engineering. The novelty score reflects that, while Spark expertise is valuable, a skilled CLI agent could handle basic Spark tasks unaided; this skill excels at optimization and production-grade patterns that would otherwise require extensive token usage.
