Deep Reinforcement Learning AlphaGo

The Dopamine Loop: Why Arguments Are Hard to Let Go

If you replay arguments long after they end, your brain may be seeking reward, not resolution. Here’s how dopamine shapes ...

Brain-inspired AI: Human brain separates goals and uncertainty to enable adaptive decision-making

Humans possess a remarkable balance between stability and flexibility, enabling them to quickly establish new plans and ...

GitHub

verl: Volcano Engine Reinforcement Learning for LLMs

verl is a flexible, efficient, and production-ready RL training library for large language models (LLMs). It is the open-source implementation of the paper "HybridFlow: A Flexible and Efficient RLHF Framework".

Unite.AI

The Reinforcement Gap: Why AI Excels at Some Tasks but Stalls at Others

Artificial Intelligence (AI) has achieved remarkable successes in recent years. It can defeat human champions in games like Go, predict protein structures with high accuracy, and perform complex tasks ...

GitHub

"TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling" (ICLR 2025)

This is the official repository of the paper "TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling". It consists of two parts: One dot represents a performance score on one ...

Microsoft

Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

AI agents are reshaping software development, from writing code to carrying out complex instructions. Yet LLM-based agents are prone to errors and often perform poorly on complicated, multi-step tasks ...

IEEE

Deep Reinforcement Learning-Based Collision-Free Navigation for Magnetic Helical Microrobots in Dynamic Environments

Abstract: Magnetic helical microrobots have great potential in biomedical applications due to their ability to access confined and enclosed environments via remote manipulation by magnetic fields.

IEEE

A Q-Learning Novelty Search Strategy for Evaluating Robustness of Deep Reinforcement Learning in Open-World Environments

Abstract: Despite substantial progress in deep reinforcement learning (DRL), a systematic characterization of DRL agents’ robustness to unexpected events in the environment is relatively understudied.
