Reinforcement Learning LLM

Nvidia researchers boost LLMs' reasoning skills by getting them to 'think' during pre-training

Researchers at Nvidia have developed a new technique that flips the script on how large language models (LLMs) learn to reason. The method, called reinforcement learning pre-training (RLP), integrates ...

InfoQ

Google Publishes LLM Self-Correction Algorithm SCoRe

Forbes

The Rise And Rise Of Reinforcement Learning: AI’s Quiet Revolution

Forbes contributors publish independent expert analyses and insights. Author, Researcher and Speaker on Technology and Business Innovation. Apr 19, 2025, 03:24am EDT Apr 21, 2025, 10:40am EDT ...

VentureBeat

MiniMax-M1 is a new open source model with 1 MILLION TOKEN context and new, hyper efficient reinforcement learning

Chinese AI startup MiniMax, perhaps best known in the West for its hit realistic AI video model Hailuo, has released its latest large language model, MiniMax-M1 — and in great news for enterprises and ...

Sify.com

Pressure Paradox: How Punishing AI Makes Better LLMs

So far, scientists have relied on positive reinforcement learning to train LLMs, but the opposite seems to be giving much better results, finds Satyen K. Bordoloi… This is a finding that’ll have ...

The Conversation

What is reinforcement learning? An AI researcher explains a key method of teaching machines – and how it relates to training your dog

Understanding intelligence and creating intelligent machines are grand scientific challenges of our times. The ability to learn from experience is a cornerstone of intelligence for machines and living ...

Morocco World News

The Sycophancy Problem: Why AI Can’t Stop Agreeing With You

LLMs are built to be helpful, but a growing body of research shows they have developed the habit of telling users what they want to hear.

NextBigFuture

Progress to Continual Learning AI

2025 saw a tripling of continual learning LLM papers, according to arXiv trends. This is driven by foundation model scale and multimodal extensions. However, no flagship released AI models (GPT-5, Grok ...
