Fundamental
- 2024-08-10, [Fundamental] From Online Softmax to Flash Attention V3
- 2024-08-11, [Fundamental] Rotary Position Embedding (RoPE)
- 2024-08-19, [Fundamental] Model Quantization
- 2024-08-21, [Fundamental] FlashDecoding Series
MIT6.S081, XV6 Lab
- 2023-02-23, XV6 lab 1 Utilities
- 2023-02-25, XV6 lab 2 syscall
- 2023-03-02, XV6 lab 3 page table
- 2023-03-12, XV6 lab 4 traps
- 2023-03-17, XV6 lab 5 copy on write
Tools
- 2023-04-18, CUDA: Nsight Systems
- 2024-10-08, XNNPACK User Guide
- 2024-10-12, Roofline Model
Parallel
- 2023-05-02, A Brief Analysis of Parallelism Models and Automatic Parallelization Methods in Machine Learning
Framework
- 2024-06-28, A Brief Analysis of the mllm Framework (Part 1): QWen0.5B as an Example
- 2024-08-13, A Brief Analysis of the mllm Framework (Part 2): QNN-Backend
Kernel Impls
- 2024-09-17, Q8_0 @ Q4_0_4 GEMM/GEMV in llama.cpp