Self-Evolving Agents - Research Showcase

Featured Research

New

Procedural Memory Distillation

Online Reflection for Self-Improving Language Models

A framework that converts cross-episode signals into reusable procedural memory and distills that memory into the policy's weights during training. Memory serves only as a training scaffold, so inference runs memory-free while still improving over existing methods.
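As a rough illustration of distilling a memory-conditioned policy into memory-free weights, the toy sketch below freezes a teacher distribution built from base logits plus a hypothetical memory bias, then trains memory-free student logits toward it by gradient descent on cross-entropy. The names `memory_bias` and `distill_memory`, the three-action setup, and the use of plain teacher-student distillation are assumptions for illustration only; the paper's actual procedure is not described on this page.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill_memory(base_logits, memory_bias, steps=500, lr=1.0):
    """Return memory-free logits that imitate a memory-conditioned teacher.

    Assumption: the effect of memory is modelled as an additive bias on the
    teacher's logits. The teacher is frozen; only the student is updated.
    """
    teacher = softmax([b + m for b, m in zip(base_logits, memory_bias)])
    student = list(base_logits)
    for _ in range(steps):
        probs = softmax(student)
        # Gradient of cross-entropy(teacher, softmax(student)) w.r.t. logits.
        student = [s - lr * (p - t) for s, p, t in zip(student, probs, teacher)]
    return student


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0]
    bias = [2.0, 0.0, -1.0]  # hypothetical memory signal favouring action 0
    distilled = distill_memory(base, bias)
    print([round(p, 3) for p in softmax(distilled)])
```

After training, the student reproduces the teacher's action distribution without access to `memory_bias`, which is the sense in which memory acts as a scaffold in this toy setting.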

🎯 3.8-5.5% improvement on SciKnowEval

💻 7.9-13.6% improvement on LiveCodeBench

🚀 Memory-free inference with training-time scaffolding

Self-Improvement, Memory Distillation, Co-Evolution, Reinforcement Learning

Liu, Bansal, Pang, Li, and 5 others

2026

Read the paper →

New

Variational Policy Distillation

Learning from Language Feedback via a Co-Evolutionary EM Framework

VPD reframes on-policy self-distillation as a Variational Expectation-Maximization problem. The teacher is actively refined on trajectory outcomes (E-step) and then distilled into the student on the student's own rollouts (M-step), with a dynamic trust region anchored to the current policy. Teacher and student co-evolve inside a single shared-weight network.
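A minimal sketch of a generic EM-style loop of this shape, assuming a single-state, three-action toy problem. The E-step here uses standard KL-regularized reweighting (teacher proportional to policy times exp(reward / beta)), and the trust region is approximated by doubling beta until the teacher's KL from the current policy fits a budget. These choices, along with the names `e_step`, `m_step`, `run_em`, and `kl_budget`, are illustrative assumptions and not VPD's actual objective or update rule.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def e_step(policy, rewards, kl_budget, beta=1.0):
    """Build a teacher by reweighting the current policy with rewards.

    beta is doubled until KL(teacher || policy) <= kl_budget, a crude
    stand-in for a trust region anchored to the current policy.
    """
    while True:
        weights = [p * math.exp(r / beta) for p, r in zip(policy, rewards)]
        total = sum(weights)
        teacher = [w / total for w in weights]
        if kl(teacher, policy) <= kl_budget:
            return teacher
        beta *= 2.0


def m_step(logits, teacher, lr=1.0, steps=50):
    """Move the student's logits toward the teacher via cross-entropy."""
    for _ in range(steps):
        probs = softmax(logits)
        logits = [z - lr * (p - t) for z, p, t in zip(logits, probs, teacher)]
    return logits


def run_em(rewards, rounds=30, kl_budget=0.05):
    """Alternate E- and M-steps; the teacher is derived from the student's
    own logits each round, mimicking a shared-weight setup."""
    logits = [0.0] * len(rewards)
    for _ in range(rounds):
        policy = softmax(logits)
        teacher = e_step(policy, rewards, kl_budget)
        logits = m_step(logits, teacher)
    return softmax(logits)


if __name__ == "__main__":
    print([round(p, 3) for p in run_em([0.0, 0.2, 1.0])])
```

Because the teacher is recomputed from the student's current logits every round, each E-step can only move a bounded KL distance away from the policy, so the pair improves together in small steps.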

💻 49.6% on LiveCodeBench v6 (+2.3 vs SDPO)

🧪 +2.7–4.7% SciKnowEval AVG across Qwen3 & OLMo3

🔁 Shared-weight EM with adaptive trust region

Variational EM, Language Feedback, Self-Distillation, RLVR

Li, Nijkamp, Yavuz, Joty

2026

Read the paper →

Coming Soon

More Research Coming Soon

Stay tuned for more groundbreaking research in artificial intelligence, machine learning, and natural language processing.

Research Themes

🧠

Self-Improving Systems

Developing AI systems that learn from their own experiences and continuously improve their capabilities over time.

💡

Memory & Learning

Exploring how AI can effectively store, retrieve, and utilize knowledge to enhance reasoning and decision-making.

🔄

Reinforcement Learning

Advancing RL techniques to enable more efficient and effective learning from verifiable rewards and feedback.