Is One Layer Enough? A Single Transformer Layer Matches Full-Parameter RL Train · Hacker News