Heykuki News
Top
New
Best
Ask
Show
Jobs
Toggle theme
Login
Top
New
Best
Ask
Show
Jobs
Training Language Models to Self-Correct via Reinforcement Learning
arxiv.org
230 points
weirdcat
2 years ago
92 comments
Loading...
Training Language Models to Self-Correct via Reinforcement Learning | Heykuki News