To read - a neuraloverflow Collection

updated about 9 hours ago

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 107
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 80
In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 43
Matryoshka Diffusion Models

Paper • 2310.15111 • Published Oct 23, 2023 • 46
Contrastive Prefence Learning: Learning from Human Feedback without RL

Paper • 2310.13639 • Published Oct 20, 2023 • 25
Safe RLHF: Safe Reinforcement Learning from Human Feedback

Paper • 2310.12773 • Published Oct 19, 2023 • 28
An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning

Paper • 2310.12274 • Published Oct 18, 2023 • 13
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V

Paper • 2310.11441 • Published Oct 17, 2023 • 29
In-Context Pretraining: Language Modeling Beyond Document Boundaries

Paper • 2310.10638 • Published Oct 16, 2023 • 30
Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

Paper • 2310.03502 • Published Oct 5, 2023 • 79
How FaR Are Large Language Models From Agents with Theory-of-Mind?

Paper • 2310.03051 • Published Oct 4, 2023 • 35
Large Language Models Cannot Self-Correct Reasoning Yet

Paper • 2310.01798 • Published Oct 3, 2023 • 36
Enable Language Models to Implicitly Learn Self-Improvement From Data

Paper • 2310.00898 • Published Oct 2, 2023 • 24
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Paper • 2310.00426 • Published Sep 30, 2023 • 61
Conditional Diffusion Distillation

Paper • 2310.01407 • Published Oct 2, 2023 • 19
Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 86
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Paper • 2310.04378 • Published Oct 6, 2023 • 22
CodeFusion: A Pre-trained Diffusion Model for Code Generation

Paper • 2310.17680 • Published Oct 26, 2023 • 74
Personas as a Way to Model Truthfulness in Language Models

Paper • 2310.18168 • Published Oct 27, 2023 • 5
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation

Paper • 2310.16656 • Published Oct 25, 2023 • 54
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling

Paper • 2311.00430 • Published Nov 1, 2023 • 56
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 42
Controllable Music Production with Diffusion Models and Guidance Gradients

Paper • 2311.00613 • Published Nov 1, 2023 • 25
De-Diffusion Makes Text a Strong Cross-Modal Interface

Paper • 2311.00618 • Published Nov 1, 2023 • 22
The Generative AI Paradox: "What It Can Create, It May Not Understand"

Paper • 2311.00059 • Published Oct 31, 2023 • 19
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

Paper • 2311.00047 • Published Oct 31, 2023 • 10
CapsFusion: Rethinking Image-Text Data at Scale

Paper • 2310.20550 • Published Oct 31, 2023 • 27
Beyond U: Making Diffusion Models Faster & Lighter

Paper • 2310.20092 • Published Oct 31, 2023 • 12
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery

Paper • 2310.18356 • Published Oct 24, 2023 • 24
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Paper • 2310.20587 • Published Oct 31, 2023 • 18
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

Paper • 2305.07759 • Published May 12, 2023 • 45
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 155
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Paper • 2310.16795 • Published Oct 25, 2023 • 27
FLAP: Fast Language-Audio Pre-training

Paper • 2311.01615 • Published Nov 2, 2023 • 16
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Paper • 2311.05556 • Published Nov 9, 2023 • 86
Levels of AGI for Operationalizing Progress on the Path to AGI

Paper • 2311.02462 • Published Nov 4, 2023 • 36
The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4

Paper • 2311.07361 • Published Nov 13, 2023 • 14
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster

Paper • 2311.08263 • Published Nov 14, 2023 • 16
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure

Paper • 2311.07590 • Published Nov 9, 2023 • 17
Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 44
Prompt Engineering a Prompt Engineer

Paper • 2311.05661 • Published Nov 9, 2023 • 23
PolyMaX: General Dense Prediction with Mask Transformer

Paper • 2311.05770 • Published Nov 9, 2023 • 8
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Paper • 2311.09257 • Published Nov 14, 2023 • 47
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

Paper • 2311.10093 • Published Nov 16, 2023 • 58
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Paper • 2311.04257 • Published Nov 7, 2023 • 22
Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Paper • 2311.10642 • Published Nov 17, 2023 • 25
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77
Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 119
MultiLoRA: Democratizing LoRA for Better Multi-Task Learning

Paper • 2311.11501 • Published Nov 20, 2023 • 37
System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 43
GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 247
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Paper • 2311.13231 • Published Nov 22, 2023 • 28
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Paper • 2312.03818 • Published Dec 6, 2023 • 34
Magicoder: Source Code Is All You Need

Paper • 2312.02120 • Published Dec 4, 2023 • 82
FaceStudio: Put Your Face Everywhere in Seconds

Paper • 2312.02663 • Published Dec 5, 2023 • 32
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Paper • 2312.03491 • Published Dec 6, 2023 • 34
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 34
DeepCache: Accelerating Diffusion Models for Free

Paper • 2312.00858 • Published Dec 1, 2023 • 23
Your ViT is Secretly an Image Segmentation Model

Paper • 2503.19108 • Published Mar 24, 2025 • 25
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Paper • 2503.19757 • Published Mar 25, 2025 • 51
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published Apr 24, 2025 • 124
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies

Paper • 2602.09877 • Published Feb 10 • 197
Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning

Paper • 2602.10090 • Published Feb 10 • 52
Code2World: A GUI World Model via Renderable Code Generation

Paper • 2602.09856 • Published Feb 10 • 202
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger

Paper • 2602.08222 • Published Feb 9 • 290
PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 225
SWE-Universe: Scale Real-World Verifiable Environments to Millions

Paper • 2602.02361 • Published Feb 2 • 60
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

Paper • 2602.02493 • Published Feb 2 • 46
Rethinking Generative Recommender Tokenizer: Recsys-Native Encoding and Semantic Quantization Beyond LLMs

Paper • 2602.02338 • Published Feb 2 • 42
Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published Feb 2 • 36
Reinforcement World Model Learning for LLM-based Agents

Paper • 2602.05842 • Published Feb 5 • 27
Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning

Paper • 2602.01335 • Published Feb 1 • 16
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

Paper • 2602.03786 • Published Feb 3 • 90
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Paper • 2602.02437 • Published Feb 2 • 80
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 75
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Paper • 2602.08990 • Published Feb 9 • 77
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders

Paper • 2602.05027 • Published Feb 4 • 63
Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 43
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32
Self-Improving World Modelling with Latent Actions

Paper • 2602.06130 • Published Feb 5 • 32
When Actions Teach You to Think: Reasoning-Action Synergy via Reinforcement Learning in Conversational Agents

Paper • 2512.11277 • Published Dec 12, 2025
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Paper • 2602.10560 • Published Feb 11 • 31
iGRPO: Self-Feedback-Driven LLM Reasoning

Paper • 2602.09000 • Published Feb 9 • 18
UI-Venus-1.5 Technical Report

Paper • 2602.09082 • Published Feb 9 • 157
Chain of Mindset: Reasoning with Adaptive Cognitive Modes

Paper • 2602.10063 • Published Feb 10 • 75
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published Feb 11 • 244
AgentSkiller: Scaling Generalist Agent Intelligence through Semantically Integrated Cross-Domain Data Synthesis

Paper • 2602.09372 • Published Feb 10 • 7
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent

Paper • 2602.03955 • Published Feb 3 • 8
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models

Paper • 2503.06692 • Published Mar 9, 2025 • 2
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Paper • 2602.06960 • Published Feb 6 • 14
Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception

Paper • 2602.11858 • Published Feb 12 • 63
Intelligent AI Delegation

Paper • 2602.11865 • Published Feb 12 • 16
Architecting Agentic Communities using Design Patterns

Paper • 2601.03624 • Published Jan 7
Internet of Agentic AI: Incentive-Compatible Distributed Teaming and Workflow

Paper • 2602.03145 • Published Feb 3
BrowseComp-V^3: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents

Paper • 2602.12876 • Published Feb 13 • 12
WebWorld: A Large-Scale World Model for Web Agent Training

Paper • 2602.14721 • Published Feb 16 • 11
HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published Jan 15 • 48
Recursive Language Models

Paper • 2512.24601 • Published Dec 31, 2025 • 94
SimpleMem: Efficient Lifelong Memory for LLM Agents

Paper • 2601.02553 • Published Jan 5 • 37
Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8, 2025 • 12
Qwen3-ASR Technical Report

Paper • 2601.21337 • Published Jan 29 • 36
Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices

Paper • 2509.02523 • Published Sep 2, 2025 • 21
Index-ASR Technical Report

Paper • 2601.00890 • Published Dec 31, 2025
Fast KV Compaction via Attention Matching

Paper • 2602.16284 • Published Feb 18 • 1
ArXiv-to-Model: A Practical Study of Scientific LM Training

Paper • 2602.17288 • Published Feb 19 • 9
AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

Paper • 2602.14296 • Published Feb 15 • 51
Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

Paper • 2602.16855 • Published Feb 15 • 51
SkillOrchestra: Learning to Route Agents via Skill Transfer

Paper • 2602.19672 • Published Feb 23 • 58
A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519
Multi-Vector Index Compression in Any Modality

Paper • 2602.21202 • Published Feb 24 • 22
CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Paper • 2602.23452 • Published Feb 26 • 17
Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published Feb 26 • 37
Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Paper • 2602.22766 • Published Feb 26 • 44
MediX-R1: Open Ended Medical Reinforcement Learning

Paper • 2602.23363 • Published Feb 26 • 22
Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Paper • 2602.22675 • Published Feb 26 • 23
AgentDropoutV2: Optimizing Information Flow in Multi-Agent Systems via Test-Time Rectify-or-Reject Pruning

Paper • 2602.23258 • Published Feb 26 • 28
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints

Paper • 2601.18137 • Published Jan 26 • 35
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 204
Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 194
Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 264
DREAM: Deep Research Evaluation with Agentic Metrics

Paper • 2602.18940 • Published Feb 21 • 14
Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 74
MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

Paper • 2602.12705 • Published Feb 13 • 68
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents

Paper • 2602.07274 • Published Feb 6 • 209
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published Feb 2 • 118
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 111
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents

Paper • 2602.02474 • Published Feb 2 • 63
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents

Paper • 2602.01566 • Published Feb 2 • 52
Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles

Paper • 2602.01590 • Published Feb 2 • 33
Self-Hinting Language Models Enhance Reinforcement Learning

Paper • 2602.03143 • Published Feb 3 • 31
LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published Jan 22 • 86
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning

Paper • 2603.03379 • Published Mar 3 • 32
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier

Paper • 2603.03756 • Published Mar 4 • 89
SkillNet: Create, Evaluate, and Connect AI Skills

Paper • 2603.04448 • Published Feb 26 • 93
Progressive Residual Warmup for Language Model Pretraining

Paper • 2603.05369 • Published Mar 5 • 36
Memory in the Age of AI Agents

Paper • 2512.13564 • Published Dec 15, 2025 • 157
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains

Paper • 2507.17746 • Published Jul 23, 2025 • 5
Agentic Rubrics as Contextual Verifiers for SWE Agents

Paper • 2601.04171 • Published Jan 7 • 13
DeepResearch Bench II: Diagnosing Deep Research Agents via Rubrics from Expert Report

Paper • 2601.08536 • Published Jan 13 • 3
RubricBench: Aligning Model-Generated Rubrics with Human Standards

Paper • 2603.01562 • Published Mar 2 • 63
ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents

Paper • 2511.07685 • Published Nov 10, 2025 • 10
Reinforcing Chain-of-Thought Reasoning with Self-Evolving Rubrics

Paper • 2602.10885 • Published Feb 11 • 1
How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 59
Reasoning Models Struggle to Control their Chains of Thought

Paper • 2603.05706 • Published Mar 5 • 37
General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 170
TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment

Paper • 2602.23068 • Published Feb 26 • 7
Fish Audio S2 Technical Report

Paper • 2603.08823 • Published Mar 9 • 37
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers

Paper • 2603.12245 • Published Mar 12 • 18
Test-Driven AI Agent Definition (TDAD): Compiling Tool-Using Agents from Behavioral Specifications

Paper • 2603.08806 • Published Mar 9 • 7
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Paper • 2603.05863 • Published Mar 6 • 6
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Paper • 2603.09906 • Published Mar 10 • 75
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

Paper • 2603.12201 • Published Mar 12 • 53
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning

Paper • 2509.24372 • Published Sep 29, 2025 • 12
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

Paper • 1901.01753 • Published Jan 7, 2019 • 2
Learning to Continually Learn via Meta-learning Agentic Memory Designs

Paper • 2602.07755 • Published Feb 8 • 7
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

Paper • 2505.22954 • Published May 29, 2025 • 15
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210
OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 152
In-Context Reinforcement Learning for Tool Use in Large Language Models

Paper • 2603.08068 • Published Mar 9 • 43
CREATE: Testing LLMs for Associative Creativity

Paper • 2603.09970 • Published Mar 10 • 15
LMEB: Long-horizon Memory Embedding Benchmark

Paper • 2603.12572 • Published Mar 13 • 73
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Paper • 2603.12793 • Published Mar 13 • 38
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 148
FadeMem: Biologically-Inspired Forgetting for Efficient Agent Memory

Paper • 2601.18642 • Published Jan 26 • 1
AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published Mar 16 • 149
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings

Paper • 2603.13594 • Published Mar 13 • 148
Learning to Discover at Test Time

Paper • 2601.16175 • Published Jan 22 • 44
Why AI systems don't learn and what to do about it: Lessons on autonomous learning from cognitive science

Paper • 2603.15381 • Published Mar 16 • 1
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published Mar 14 • 35
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 139
Hyperagents

Paper • 2603.19461 • Published Mar 19 • 50
REVERE: Reflective Evolving Research Engineer for Scientific Workflows

Paper • 2603.20667 • Published Mar 21 • 18
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis

Paper • 2603.20278 • Published Mar 17 • 95
Deep Tabular Research via Continual Experience-Driven Execution

Paper • 2603.09151 • Published Mar 10 • 15
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare

Paper • 2603.24157 • Published 29 days ago • 10
Memento-Skills: Let Agents Design Agents

Paper • 2603.18743 • Published Mar 19 • 58
Voxtral TTS

Paper • 2603.25551 • Published 28 days ago • 59
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Paper • 2603.24533 • Published 29 days ago • 47
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search

Paper • 2603.22341 • Published Mar 21 • 37
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published 22 days ago • 144
Terminal Agents Suffice for Enterprise Automation

Paper • 2604.00073 • Published 23 days ago • 96
Towards a Medical AI Scientist

Paper • 2603.28589 • Published 24 days ago • 89
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

Paper • 2603.28342 • Published 24 days ago • 26
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published 29 days ago • 98
Composer 2 Technical Report

Paper • 2603.24477 • Published 29 days ago • 15
MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

Paper • 2603.24649 • Published 29 days ago • 31
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

Paper • 2603.25750 • Published Mar 20 • 36
INSID3: Training-Free In-Context Segmentation with DINOv3

Paper • 2603.28480 • Published 24 days ago • 5
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning

Paper • 2603.21289 • Published Mar 22 • 35
Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

Paper • 2603.22582 • Published about 1 month ago • 7
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published 29 days ago • 131
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design

Paper • 2603.28376 • Published 24 days ago • 23
Story2Proposal: A Scaffold for Structured Scientific Paper Writing

Paper • 2603.27065 • Published 27 days ago • 22
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published 15 days ago • 284
ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 15 days ago • 259
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents

Paper • 2604.07429 • Published 16 days ago • 113
RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published 17 days ago • 65
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents

Paper • 2604.04247 • Published 19 days ago • 30
Neural Computers

Paper • 2604.06425 • Published 17 days ago • 30
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents

Paper • 2604.06132 • Published 17 days ago • 117
Learning to Retrieve from Agent Trajectories

Paper • 2604.04949 • Published 25 days ago • 70
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

Paper • 2604.05091 • Published 18 days ago • 45
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement

Paper • 2604.01591 • Published 22 days ago • 41
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning

Paper • 2604.05404 • Published 17 days ago • 42
Squeez: Task-Conditioned Tool-Output Pruning for Coding Agents

Paper • 2604.04979 • Published 20 days ago • 10
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published 22 days ago • 486
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published 18 days ago • 110
Memory Intelligence Agent

Paper • 2604.04503 • Published 18 days ago • 58
SkillX: Automatically Constructing Skill Knowledge Bases for Agents

Paper • 2604.04804 • Published 18 days ago • 33
Self-Distilled RLVR

Paper • 2604.03128 • Published 21 days ago • 165
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published 21 days ago • 367
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published 27 days ago • 359
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published 22 days ago • 96
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published 29 days ago • 183
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome

Paper • 2603.28407 • Published 24 days ago • 68
Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published 22 days ago • 46
Paper Reconstruction Evaluation: Evaluating Presentation and Hallucination in AI-written Papers

Paper • 2604.01128 • Published 22 days ago • 15
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

Paper • 2604.00842 • Published 22 days ago • 14
Toward Autonomous Long-Horizon Engineering for ML Research

Paper • 2604.13018 • Published 10 days ago • 34
