At the start of 2025, the LLM production stack at every lab looked roughly like this:
1. Pretraining (GPT-2/3, ~2020)
2. Supervised Finetuning (InstructGPT, ~2022)
3. Reinforcement Learning from Human Feedback (RLHF, ~2022)