2023
an archive of posts from this year
| Date | Post |
|---|---|
| Oct 27, 2023 | A few things you need to know about Megatron-LM DistributedOptimizer |
| Oct 19, 2023 | Math for Megatron Mixture-of-Experts (MoE) |
| Aug 22, 2023 | GPT Training Memory Estimation - NeMo Practice |