2023
an archive of posts from this year
Oct 27, 2023 | A few things you need to know about Megatron-LM DistributedOptimizer |
---|---|
Oct 19, 2023 | Math for Megatron Mixture-of-Experts (MoE) |
Aug 22, 2023 | GPT Training Memory Estimation - NeMo Practice |