Yannic Kilcher

(updated 1 year ago)

Timeline

26 Jan 2025 (GMT)
[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models 01:09:00

27 Dec 2024 (GMT)
Traditional Holiday Live Stream 01:28:17

24 Dec 2024 (GMT)
Byte Latent Transformer: Patches Scale Better Than Tokens (Paper Explained) 36:15

10 Dec 2024 (GMT)
Safety Alignment Should be Made More Than Just a Few Tokens Deep (Paper Explained) 48:53

23 Nov 2024 (GMT)
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters (Paper Explained) 28:23

19 Oct 2024 (GMT)
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models 37:06

12 Oct 2024 (GMT)
Were RNNs All We Needed? (Paper Explained) 27:48

05 Oct 2024 (GMT)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper) 53:02

04 Aug 2024 (GMT)
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models (Paper Explained) 01:03:56

08 Jul 2024 (GMT)
Scalable MatMul-free Language Modeling (Paper Explained) 49:45
