by zechenzhangAGI
Expert guidance for distributed training with DeepSpeed: ZeRO optimization stages, pipeline parallelism, FP16/BF16/FP8 mixed precision, 1-bit Adam, and sparse attention.
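As a flavor of what this guidance covers, here is a minimal, hypothetical DeepSpeed configuration sketch combining two of the listed features (ZeRO stage 2 and BF16 mixed precision). The keys follow DeepSpeed's JSON config schema; the specific values are illustrative assumptions, not recommendations from this skill.

```python
# Hypothetical DeepSpeed config sketch: ZeRO stage 2 + BF16.
# Values (batch size, flags) are illustrative, not prescriptive.
ds_config = {
    "train_batch_size": 32,
    "bf16": {"enabled": True},          # BF16 mixed precision
    "zero_optimization": {
        "stage": 2,                     # shard optimizer states + gradients
        "overlap_comm": True,           # overlap comm with backward pass
        "contiguous_gradients": True,   # reduce memory fragmentation
    },
}

# This dict would typically be passed to deepspeed.initialize(...)
# as the `config` argument, alongside the model and its parameters.
print(ds_config["zero_optimization"]["stage"])
```

ZeRO stage 2 shards optimizer states and gradients across data-parallel ranks; stage 3 additionally shards the parameters themselves, trading communication for memory.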