AI-Paper-2024
Most Valuable AI Research Papers in 2024
Mixture of Experts (MoE) Approach
You can find the original paper[3] here.
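The list gives titles only, so as a quick illustration of the core MoE idea (sparse top-k routing over a pool of expert networks), here is a minimal NumPy sketch. All names, shapes, and the single-token setup are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Sparse MoE: route one token to its top-k experts.

    x: (d,) token vector; gate_w: (d, n_experts) router weights;
    experts: list of callables mapping (d,) -> (d,).
    Shapes and names are illustrative, not from the paper.
    """
    logits = x @ gate_w                  # router score per expert
    top = np.argsort(logits)[-k:]        # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts only
    # Output is the routing-weighted sum of the chosen experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n = 4, 8
experts = [lambda v, W=rng.standard_normal((d, d)): v @ W for _ in range(n)]
y = moe_layer(rng.standard_normal(d), rng.standard_normal((d, n)), experts)
print(y.shape)
```

Because only k of the n experts run per token, parameter count grows with n while per-token compute stays roughly constant, which is the appeal of the approach.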
DoRA: Weight-Decomposed Low-Rank Adaptation
You can find the original paper[4] here.
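To make the title concrete: DoRA reparameterizes a pretrained weight into a magnitude part and a direction part, and applies the LoRA-style low-rank update to the direction. A minimal NumPy sketch under assumed shapes (not the official implementation):

```python
import numpy as np

def dora_update(W0, B, A, m):
    """DoRA-style reparameterization (illustrative, not the official code).

    W0: frozen pretrained weight (out, in); B @ A: low-rank LoRA update;
    m: learnable per-column magnitude vector (in,).
    The direction is the column-normalized merged weight; m sets its scale.
    """
    V = W0 + B @ A                                   # merged weight, LoRA-style
    norm = np.linalg.norm(V, axis=0, keepdims=True)  # column-wise norms
    return m * (V / norm)                            # magnitude * unit direction

rng = np.random.default_rng(1)
out_d, in_d, r = 6, 4, 2
W0 = rng.standard_normal((out_d, in_d))
B = rng.standard_normal((out_d, r))
A = np.zeros((r, in_d))                 # LoRA-style init: B @ A == 0 at start
m = np.linalg.norm(W0, axis=0)          # init magnitude from W0's column norms
W = dora_update(W0, B, A, m)
# With B @ A == 0 and this init, the reparameterized weight recovers W0 exactly
print(np.allclose(W, W0))
```

At initialization the model is unchanged (as in LoRA); training then updates only m, B, and A, so the number of trainable parameters stays small.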
Simple and Scalable Strategies to Continually Pre-train Large Language Models
You can find the original paper[5] here.
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
You can find the original paper[6] here.
LoRA Learns Less and Forgets Less
You can find the original paper[7] here.
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
You can find the original paper[8] here.
The Llama 3 Herd of Models
You can find the original paper[9] here.
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
You can find the original paper[10] here.
NVLM: Open Frontier-Class Multimodal LLMs
You can find the original paper[11] here.
O1 Replication Journey: A Strategic Progress Report – Part 1
You can find the original paper[12] here.
Scaling Laws for Precision
You can find the original paper[13] here.
DeepSeek-V3 Technical Report
You can find the original paper[14] here.
Phi-4 Technical Report
You can find the original paper[15] here.
References
Related posts and papers:
1. https://substack.com/home/post/p-153341037
2. https://substack.com/home/post/p-153692738
3. https://arxiv.org/abs/2401.04088
4. https://arxiv.org/abs/2402.09353
5. https://arxiv.org/abs/2403.08763
6. https://arxiv.org/abs/2404.10719
7. https://arxiv.org/abs/2405.09673
8. https://arxiv.org/abs/2406.17557
9. https://arxiv.org/abs/2407.21783
10. https://arxiv.org/abs/2408.03314
11. https://arxiv.org/abs/2409.11402
12. https://arxiv.org/abs/2410.18982
13. https://arxiv.org/abs/2411.04330
14. https://arxiv.org/abs/2412.19437
15. https://arxiv.org/abs/2412.08905
https://xiyuanyang-code.github.io/posts/AI-Paper-2024/