Abouts

🎉News🎉

About

Hi👋, I’m Chenghua Wang, currently a postgraduate CS student at BUPT. I’m interested in AI&Sys. I spent almost one and a half years working on computer vision (dehazing and multi-label classification, July 1 2021 -> Dec 31 2022).

Contact me: chenghua.wang.edu@gmail.com

Research Interests

Machine Learning Infrastructure
ML Compiler: NNCV (ZJGSU, BS dissertation), which contains Aten-lang, an auto-parallel language (using polyhedral algorithms), and a set of deep learning compilation pipelines (based on MLIR).
Distributed and Parallel System Design
Computer Vision...

Getting Started with AI & Sys

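Date and update log
2024-01-01: Article created.
2024-08-20: 1. Followed up on recent developments in AI&Sys. 2. Added learning paths and must-read literature for several AI&Sys branches.

I put this article together while learning AI&Sys myself. It assumes that the reader:

has basic deep learning knowledge and some familiarity with computer vision (most of the cs231n labs completed);
can comfortably build medium-sized projects in Python and C++;
has some understanding of operating systems and computer architecture (most of the CSAPP labs completed, and either most of the OSTEP book understood or most of the MIT 6.S081 XV6 labs completed);
knows the CUDA programming model, though hands-on experience is not required.

This article only records the books, courses, papers, and projects needed to get started. It does not cover more advanced material for now.

1. AI&Sys / MLSys / LLMSys basics

1.1 Courses

TinyML and Efficient Deep Learning Computing, MIT han lab
MIT 6.5940
Taught by Prof. Song Han. The course quality is high, and the Fall 2023 videos are available on Bilibili. Beginners are encouraged to start here: it gives a tour of most MLSys-related areas (leaning toward algorithms).

CS559E, cs.washington
cs.washington CSE559M
Covers most MLSys areas thoroughly (leaning toward systems). It works like a survey course that walks you through most of the field. The drawback is that there are no videos, only some course materials.

Large Language Model Systems, CMU
CMU 11868
More closely tied to LLMs, with topics such as RAG and large-model serving. It also covers systems topics such as GPU just-in-time compilation and Communication Efficient Distributed Training.

I recommend starting with Prof. Song Han's course to get a global view (mainly to fill in the basic algorithmic knowledge of large models; most of it concerns inference, and training is not covered). After that, the other courses are optional, or you can go straight to papers in the direction you want to work on.

1.2 Reading materials

I have collected some general-knowledge reading materials intended for beginners. There may be some overlap, so pick the ones you want to read. The list is long: decide which direction is your main focus, then read those materials in depth and skim the rest.

LLM Basics...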

Thoughts

For now, my mind is still a barren wasteland.

Tech

Fundamental
2024-08-10, [Fundamental] From Online Softmax to Flash Attention V3
2024-08-11, [Fundamental] Rotary Position Embedding (RoPE)
2024-08-19, [Fundamental] Model Quantization
2024-08-21, [Fundamental] FlashDecoding Series

MIT6.S081, XV6 Lab
2023-02-23, XV6 lab 1 Utilities
2023-02-25, XV6 lab 2 syscall
2023-03-02, XV6 lab 3 page table
2023-03-12, XV6 lab 4 traps
2023-03-17, XV6 lab 5 copy on write

Tools
2023-04-18, CUDA: NSight System
2024-10-08, Xnnpack User Guide
2024-10-12, Roofline Model

Parallel
2023-05-02, A Brief Analysis of Parallel Models and Auto-Parallelism Methods in Machine Learning

Framework
2024-06-28, A Brief Analysis of the mllm Framework (1): QWen0.5B as an Example
2024-08-13, A Brief Analysis of the mllm Framework (2): QNN-Backend

Kernel Impls
2024-09-17, Q8_0 @ Q4_0_4 GEMM/GEMV in llama....

Paper Analysis

Fundamental
2024-08-21, [Fundamental] FlashDecoding Series
2024-08-19, [Fundamental] Model Quantization
2024-08-11, [Fundamental] Rotary Position Embedding (RoPE)
2024-08-10, [Fundamental] From Online Softmax to Flash Attention V3

Distributed System
2023-01-19, The Design of a Practical System for Fault-Tolerant Virtual Machines
2023-01-18, Google File System(GFS)
2023-01-17, MapReduce

Quantization
2024-08-15, ✅[April-May 2024] Model Quantization: 🥕Quarot & SpinQuant. A milestone: rotation matrices mitigate outliers
2024-06-25, ✅[Oct 2023] Model Quantization: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models. MLSys2024, MoE quantization
2024-05-25, ✅[April 2024] Model Quantization: AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration...