References

Primary and implementation sources used by these notes:

  1. DeepSeek-AI. DeepSeek-V3 Technical Report. arXiv:2412.19437.
    https://arxiv.org/abs/2412.19437

  2. DeepSeek-AI. DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model. arXiv:2405.04434.
    https://arxiv.org/abs/2405.04434

  3. Dai et al. DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models. arXiv:2401.06066.
    https://arxiv.org/abs/2401.06066

  4. DeepSeek-AI. DeepSeek-V3 official repository.
    https://github.com/deepseek-ai/DeepSeek-V3

  5. DeepSeek-AI. DeepSeek-V3 inference model implementation (inference/model.py in the official repository).
    https://github.com/deepseek-ai/DeepSeek-V3/blob/main/inference/model.py

  6. DeepSeek-AI. DeepSeek-MoE official repository.
    https://github.com/deepseek-ai/DeepSeek-MoE