
CME295 Lecture Notes

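This folder collects Korean-language lecture notes based on the CME295 lecture videos. Each lecture README includes a lecture summary, key concepts, notes from a practitioner's perspective, review questions with answers, and Mermaid/SVG diagrams.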

| Lecture | Topic | Notes | Source |
| --- | --- | --- | --- |
| 01 | Transformer basics | lec-01/README.md | YouTube |
| 02 | Transformer-based models and tricks | lec-02/README.md | YouTube |
| 03 | LLMs, decoding, prompting, and inference | lec-03/README.md | YouTube |
| 04 | LLM training, fine-tuning, and efficient adaptation | lec-04/README.md | YouTube |
| 05 | LLM tuning and human preferences | lec-05/README.md | YouTube |
| 06 | LLM reasoning and GRPO | lec-06/README.md | YouTube |

%%{init: {"theme": "base", "themeVariables": {"background": "#171717", "primaryColor": "#232323", "primaryTextColor": "#f5f5f5", "primaryBorderColor": "#d0d0d0", "lineColor": "#cfcfcf", "fontFamily": "Inter, Arial, sans-serif"}}}%%
flowchart LR
    A[Lecture 1<br/>Transformer basics] --> B[Lecture 2<br/>Transformer variants]
    B --> C[Lecture 3<br/>LLM inference]
    C --> D[Lecture 4<br/>LLM training]
    D --> I[Lecture 5<br/>Preference tuning]
    I --> J[Lecture 6<br/>LLM reasoning]
    A --> E[Attention<br/>Q/K/V]
    B --> F[Position, norm,<br/>BERT, KV cache]
    C --> G[MoE, decoding,<br/>prompting]
    D --> H[Pre-training, ZeRO,<br/>SFT, LoRA]
    I --> K[RLHF, PPO,<br/>DPO]
    J --> L[GRPO, pass@K,<br/>DeepSeek-R1]

    classDef primary fill:#232323,stroke:#d0d0d0,color:#f5f5f5,stroke-width:2px;
    classDef secondary fill:#3b2f20,stroke:#d0d0d0,color:#f5f5f5,stroke-width:2px;
    classDef note fill:#52676b,stroke:#d0d0d0,color:#f5f5f5,stroke-width:2px;
    classDef accent fill:#62164d,stroke:#d0d0d0,color:#f5f5f5,stroke-width:2px;
    class A,B,C,D,I,J primary
    class E,F,G,K,L note
    class H accent

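Lecture 1 moves from NLP tasks, tokenization, representation learning, and the limitations of RNNs/LSTMs to the need for attention, and then explains the Transformer encoder-decoder architecture. The key point is that self-attention computes dependencies between tokens in parallel and, through its query/key/value structure, behaves like information retrieval.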
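The query/key/value view can be made concrete with a minimal sketch of scaled dot-product attention. It is written in plain Python for readability and is not taken from the lecture; the function names `softmax` and `attention` are illustrative assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each query is scored against every key, the scores become weights
    through softmax, and the output is the weighted sum of the values.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:  # each query row is independent of the others
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

Because no query row depends on another, the outer loop can be replaced by a single matrix multiplication, which is where the parallelism comes from. The retrieval analogy is that a query matching a key more closely pulls more of that key's value into the output.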

Key diagrams:

Lecture 2: Transformer-Based Models and Tricks


Lecture 2 covers what is needed to extend the Transformer into real model families: positional encoding, RoPE, layer normalization, RMSNorm, MHA/MQA/GQA, encoder-only models, and BERT pre-training. Where Lecture 1 explained the basic principles of the architecture, Lecture 2 organizes the design choices that make modern Transformers stable and efficient.
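As one illustration of these design choices, the sketch below contrasts layer normalization with RMSNorm using the standard textbook formulations. The function names, the `eps` defaults, and the list-based vectors are assumptions made for this example, not the lecture's code.

```python
import math


def layer_norm(x, gamma, beta, eps=1e-5):
    """LayerNorm: subtract the mean, divide by the standard deviation,
    then apply a learned scale (gamma) and shift (beta)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [
        g * (v - mean) / math.sqrt(var + eps) + b
        for v, g, b in zip(x, gamma, beta)
    ]


def rms_norm(x, gamma, eps=1e-5):
    """RMSNorm: divide by the root mean square only.
    There is no mean subtraction and no shift term."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [g * v / math.sqrt(mean_sq + eps) for v, g in zip(x, gamma)]
```

RMSNorm drops the mean statistic and the shift parameter, so it does slightly less work per token than LayerNorm while still controlling activation scale.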

Key diagrams:

Lecture 3: Large Language Models, Decoding, Prompting, and Inference


Lecture 3 covers how the decoder-only Transformer scales into an LLM. It centers on inference-time behavior and serving optimization: Mixture of Experts, next-token decoding, greedy/beam/sampling, temperature, prompt structure, in-context learning, chain of thought, KV cache, PagedAttention, and speculative decoding.
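To show how temperature changes next-token decoding, here is a small sketch over a toy logit vector. The helper names (`softmax_with_temperature`, `greedy`, `sample`) are hypothetical and chosen only for this example.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy() for argmax")
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Greedy decoding: pick the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample(logits, temperature=1.0, rng=None):
    """Sample one token index from the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]
```

A temperature below 1 sharpens the distribution toward the greedy choice, and a temperature above 1 flattens it, so sampled outputs become more diverse.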

Key diagrams:

Lecture 4: LLM Training, Fine-Tuning, and Efficient Adaptation


Lecture 4 explains how LLMs are trained and adapted. The key topics are pre-training, scaling laws, FLOPs and FLOP/s, GPU memory footprint, data parallelism, ZeRO, model parallelism, FlashAttention, mixed precision, SFT, instruction tuning, evaluation, alignment, LoRA, and QLoRA.
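Why LoRA counts as efficient adaptation can be seen from a parameter count. The sketch below assumes the usual LoRA formulation, in which a frozen weight `W` of shape `d_out x d_in` receives a low-rank update `B @ A` scaled by `alpha / rank`. All function names here are illustrative.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B (A x); W stays frozen in training."""
    rank = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]
```

For a 4096 x 4096 projection with rank 8, `full_finetune_params` gives 16,777,216 and `lora_params` gives 65,536, a 256x reduction in trainable parameters for that layer.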

Key diagrams:

Lecture 5: LLM Tuning and Human Preferences


Lecture 5 explains preference tuning, which adjusts an SFT model to match human preferences. The key topics are pairwise preference data, the reward model, the Bradley-Terry formulation, RLHF, the PPO clip and KL penalty, reward hacking, best-of-N, and DPO.
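The Bradley-Terry formulation and DPO share the same pairwise structure, which a short sketch can make visible. It assumes scalar rewards and sequence-level log-probabilities are already available. The names `bradley_terry_loss` and `dpo_loss` and the default `beta=0.1` are assumptions for illustration.

```python
import math


def neg_log_sigmoid(x):
    """Numerically stable -log(sigmoid(x)), equal to log(1 + exp(-x))."""
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def bradley_terry_loss(reward_chosen, reward_rejected):
    """Reward-model loss: -log P(chosen beats rejected),
    with P = sigmoid(reward_chosen - reward_rejected)."""
    return neg_log_sigmoid(reward_chosen - reward_rejected)


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected,
             beta=0.1):
    """DPO loss for one preference pair.

    The implicit reward of a response is beta times the log-ratio between
    the policy and the frozen reference model.
    """
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    return neg_log_sigmoid(beta * (chosen_ratio - rejected_ratio))
```

When the policy equals the reference model, both log-ratios are zero and the DPO loss is `log 2`. The loss falls as the policy raises the chosen response relative to the rejected one, without training a separate reward model.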

Key diagrams:

Lecture 6 treats a reasoning model as an LLM that generates a reasoning chain before its answer, and explains how to train reasoning behavior with verifiable rewards and GRPO on tasks such as math and code, where answers can be checked. The key topics are pass@K, sampling temperature, reasoning token cost, output length growth, the DeepSeek-R1-Zero/R1 training pipeline, and reasoning distillation.
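Two of these quantities are easy to compute by hand, so a sketch may help. It uses the commonly cited unbiased pass@k estimator and a group-relative advantage of the kind GRPO is usually described with. Neither function is claimed to match the lecture's exact definitions, and the names and the `eps` value are assumptions.

```python
import math


def pass_at_k(n, c, k):
    """Estimate pass@k from n samples of which c are correct.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    subset of k samples contains at least one correct answer.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and standard deviation of
    the group of responses sampled for the same prompt."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]
```

With verifiable rewards such as 1 for a correct final answer and 0 otherwise, responses above the group average get a positive advantage and the rest a negative one. The baseline comes from the group itself, so this sketch needs no separate value model, unlike a typical PPO setup. If every response in a group gets the same reward, all advantages are zero and that prompt gives no learning signal.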

Key diagrams:

| Concept | First covered | Later use |
| --- | --- | --- |
| Tokenization | Lecture 1 | pre-training data, SFT loss masking |
| Self-attention | Lecture 1 | MHA, KV cache, FlashAttention |
| Q/K/V | Lecture 1 | MHA/MQA/GQA, RoPE, KV cache |
| Positional information | Lecture 2 | long context, RoPE, context rot |
| Transformer families | Lecture 2 | decoder-only LLMs |
| Decoder-only LLM | Lecture 3 | pre-training and SFT |
| MoE | Lecture 3 | expert parallelism |
| KV cache | Lecture 3 | inference memory and throughput |
| Scaling laws | Lecture 4 | model/data/compute allocation |
| ZeRO | Lecture 4 | distributed training memory |
| FlashAttention | Lecture 4 | exact attention with lower HBM IO |
| LoRA/QLoRA | Lecture 4 | efficient fine-tuning |
| Preference tuning | Lecture 5 | human preference alignment after SFT |
| Reward model | Lecture 5 | RLHF and best-of-N scoring |
| PPO | Lecture 5 | RLHF policy optimization |
| DPO | Lecture 5 | supervised-style preference optimization |
| Chain of thought | Lecture 3 | reasoning chains and test-time compute |
| Preference tuning / PPO | Lecture 5 | GRPO comparison and reasoning RL |
| pass@K | Lecture 6 | coding/math reasoning evaluation |
| GRPO | Lecture 6 | reasoning model RL training |
| DeepSeek-R1 | Lecture 6 | multi-stage reasoning training pipeline |

  • Lecture notes are written in Korean, while technical terms are often kept in English when that is clearer.
  • SVG diagrams use the shared editorial style from AGENTS.md: restrained palette, thin line boxes, minimal fill, and accent red only for critical paths.
  • Mermaid diagrams include local classDef styling so they follow the same visual scheme in GitHub-rendered markdown.
  • Each lecture README ends with review questions and answers for quick self-checking.