
[2502.21321] LLM Post-Training: A Deep Dive into Reasoning …
February 28, 2025 · Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Pretraining on vast web-scale data has laid the foundation for these models, yet the research community is now increasingly shifting its focus toward post-training techniques to achieve further breakthroughs. While pretraining provides a broad linguistic foundation ...
mbzuai-oryx/Awesome-LLM-Post-training - GitHub
A taxonomy of post-training approaches for LLMs, categorized into fine-tuning, reinforcement learning, and test-time scaling methods. We summarize the key techniques used in recent LLMs.
New LLM Pre-training and Post-training Paradigms - Sebastian …
August 17, 2024 · Build a Large Language Model (from Scratch) is a highly focused book dedicated to coding LLMs from the ground up in PyTorch, covering everything from pre-training to post-training—arguably the best way to truly understand LLMs.
LLM Post-Training: A Deep Dive into Reasoning Large Language …
March 7, 2025 · LLM Post-Training: A Deep Dive into Reasoning Large Language Models. This survey provides a systematic exploration of post-training methodologies, analyzing their role in refining LLMs beyond pretraining and addressing key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs.
Arena Learning: Build Data Flywheel for LLMs Post-training via ...
July 15, 2024 · This fully automated training and evaluation pipeline sets the stage for continuous advancements in various LLMs via post-training. Notably, Arena Learning plays a pivotal role in the success of WizardLM-2, and this paper serves both as an exploration of its efficacy and a foundational study for future discussions related to WizardLM-2 and its ...
Plug-and-Play: An Efficient Post-training Pruning Method for Large...
January 16, 2024 · In this paper, we present a plug-and-play solution for post-training pruning of LLMs. The proposed solution has two innovative components: 1) **Relative Importance and Activations (RIA)**, a new pruning metric that jointly and efficiently considers weights and activations of LLMs, and 2) **Channel Permutation**, a new approach to maximally ...
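The RIA metric described in that snippet can be sketched roughly as follows: each weight's magnitude is normalized relative to its row and column sums, then combined with the norm of its input activations. This is an illustrative reconstruction from the abstract, not the paper's exact formulation; the exponent `a` and the thresholding scheme are assumptions.

```python
import numpy as np

def ria_scores(W, X, a=0.5):
    """Sketch of a Relative Importance and Activations (RIA)-style score:
    weight magnitude normalized by row and column sums, scaled by the
    per-input-channel activation norm (details assumed, not from the paper)."""
    absW = np.abs(W)                            # W: (out_features, in_features)
    row_sum = absW.sum(axis=1, keepdims=True)   # total magnitude per output row
    col_sum = absW.sum(axis=0, keepdims=True)   # total magnitude per input column
    rel = absW / row_sum + absW / col_sum       # "relative importance" of each weight
    act_norm = np.linalg.norm(X, axis=0)        # X: (n_samples, in_features)
    return rel * act_norm ** a                  # combine weights and activations

def prune_by_ria(W, X, sparsity=0.5):
    """Zero out the lowest-scoring fraction of weights."""
    s = ria_scores(W, X)
    thresh = np.quantile(s, sparsity)
    return np.where(s >= thresh, W, 0.0)
```

The point of normalizing by row and column sums (rather than using raw magnitudes, as in magnitude pruning) is that it balances pruning across channels instead of concentrating it in small-magnitude rows.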
[2406.05981] ShiftAddLLM: Accelerating Pretrained LLMs via Post ...
June 10, 2024 · To address this, we propose accelerating pretrained LLMs through post-training shift-and-add reparameterization, creating efficient multiplication-free models, dubbed ShiftAddLLM. Specifically, we quantize each weight matrix into binary matrices paired with group-wise scaling factors.
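The idea of approximating a weight matrix as a sum of binary matrices with group-wise scales can be illustrated with greedy residual binarization, W ≈ Σₖ αₖ·Bₖ. This is a generic multi-bit binarization sketch under assumed details (consecutive-row groups, mean-absolute-value scales), not ShiftAddLLM's actual procedure.

```python
import numpy as np

def binarize_multibit(W, n_bits=2, group_size=4):
    """Greedy residual binarization sketch: repeatedly binarize the residual
    and fit one scaling factor per group of `group_size` rows (illustrative;
    the paper's grouping and fitting differ in detail)."""
    out, inp = W.shape
    residual = W.copy()
    binaries, scales = [], []
    for _ in range(n_bits):
        B = np.sign(residual)
        B[B == 0] = 1.0                       # avoid zero entries in the binary matrix
        # one scale per row group: mean |residual| over the group minimizes ||r - aB||
        alpha = np.abs(residual).reshape(out // group_size, -1).mean(axis=1)
        alpha = np.repeat(alpha, group_size)[:, None]   # broadcast back to rows
        binaries.append(B)
        scales.append(alpha)
        residual = residual - alpha * B       # peel off this bit's contribution
    W_hat = sum(a * b for a, b in zip(scales, binaries))  # W ≈ sum_k alpha_k * B_k
    return binaries, scales, W_hat
```

Because each step subtracts the best scaled binary approximation of the current residual, reconstruction error shrinks monotonically as `n_bits` grows; multiplications by ±1 and power-of-two scales can then be replaced by shifts and adds.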
Models (LLMs), the survey systematically covers various aspects:
• Different types of human (and non-human) feedback (Section 7.3),
• The training methods in RLHF (Section 7.6),
• Alternative approaches that do not rely on RL or reward models (Section 7.9).
How LLMs Work: Pre-Training to Post-Training, Neural Networks ...
February 18, 2025 · With the recent explosion of interest in large language models (LLMs), they often seem almost magical. But let's demystify them. I wanted to step back and unpack the fundamentals — breaking down how LLMs are built, trained, and fine-tuned to become the AI systems we interact with today.
Principles of Large-Model Quantization: The ZeroQuant Series - CSDN Blog
Post-training quantization (PTQ) has emerged as a promising technique for reducing memory consumption and computational cost in large language models (LLMs). However, a systematic examination of different quantization schemes, model families, and quantization bit precisions is currently lacking.
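The core PTQ operation the snippet refers to can be shown with a minimal symmetric int8 round-trip. This is a deliberately simplified per-tensor sketch; ZeroQuant itself uses finer-grained group-wise weight and token-wise activation quantization.

```python
import numpy as np

def quantize_int8(W):
    """Minimal symmetric per-tensor post-training quantization:
    map the largest weight magnitude to the int8 extreme, round the rest."""
    scale = np.abs(W).max() / 127.0                        # float step size per int8 unit
    q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 codes and the scale."""
    return q.astype(np.float32) * scale
```

Round-to-nearest bounds the per-element reconstruction error by half a quantization step (`scale / 2`), which is why outlier weights, by inflating `scale`, degrade accuracy for all other weights — the motivation for the group-wise schemes the ZeroQuant series studies.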