đź“„ Papers

Core references and recent advances in sample-efficient learning.

Legend: Core = core reference • New = recent work (2024–2025)

🌍 World Models & Model-Based RL

DreamerV3: Mastering Diverse Domains through World Models Core

Hafner et al., 2023 • arXiv:2301.04104

Third-generation Dreamer. World model + actor-critic trained from replayed experience. Key baseline in HICA.

Director: Deep Hierarchical Planning from Pixels Core

Hafner et al., 2022 • arXiv:2206.04114

Hierarchical extension of Dreamer with worker-manager architecture. Key comparison model in HICA.

Clockwork VAE (CWVAE): Temporal Abstraction in Sequences Core

Saxena et al., 2021 • arXiv:2102.09532

Hierarchical extension of RSSM with multiple temporal scales. Inspiration for HICA architecture.

S4WM: State Space World Models Core

Deng et al., 2023 • arXiv:2312.01035

State-space models for world modeling. Alternative architecture to RNN-based approaches.

MuZero: Mastering Games by Planning with Learned Models Core

Schrittwieser et al., 2020 • Nature • arXiv:1911.08265

Planning with learned world model. Mastered Atari, Go, Chess, Shogi without knowing rules.

World Models Core

Ha & Schmidhuber, 2018 • arXiv:1803.10122

The foundational paper: learn a world model in latent space and train the controller entirely in imagination.

Dreamer 4: Training Agents Inside of Scalable World Models New

Hafner et al., Sep 2025 • arXiv:2509.24527

First agent to get diamonds in Minecraft from offline data. Scalable imagination training.

Accelerating MBRL with State-Space World Models New

Feb 2025 • arXiv:2502.20168

10× faster world model training, 4× faster overall MBRL.

THICK: Learning Hierarchical World Models with Adaptive Temporal Abstractions New

Gumbsch et al., ICLR 2024 • OpenReview • GitHub

Temporal Hierarchies from Invariant Context Kernels. Lower level forms invariant contexts via discrete latent dynamics, higher level predicts context changes.

🔄 Complementary Learning Systems

Why There Are Complementary Learning Systems in the Hippocampus and Neocortex Core

McClelland, McNaughton & O'Reilly, 1995 • Psychological Review

Foundational CLS theory paper. Explains hippocampal fast learning + neocortical slow consolidation.

What Learning Systems Do Intelligent Agents Need? Core

Kumaran, Hassabis & McClelland, 2016 • Trends in Cognitive Sciences

Modern update to CLS theory. Framework for AI systems inspired by hippocampus-neocortex.

FSC-Net: Fast-Slow Consolidation Networks New

Nov 2025 • arXiv:2511.11707

Dual-network: fast learner + slow learner inspired by memory consolidation.

Semantic and Episodic Memories in Predictive Coding New

Sep 2025 • arXiv:2509.01987

Semantic memory in neocortex, episodic memory resembles hippocampal CA3.

ComS2T: Complementary Spatiotemporal Learning New

Mar 2024 • arXiv:2403.01738

Stable neocortex for consolidation + dynamic hippocampus for new knowledge.

🏗️ Hierarchical Reinforcement Learning

Neural History Compressor Core

Schmidhuber, 1992 • Neural Computation • MIT Press

Pioneering hierarchical sequence learning. Higher levels receive only unpredictable (surprising) inputs—automatic temporal abstraction through compression.
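The compression principle can be sketched in a few lines (a toy illustration, not Schmidhuber's original formulation): the lower level predicts each next symbol, and only the symbols it fails to predict are passed up to the higher level.

```python
def surprising_inputs(seq):
    """Pass up only the symbols the lower level fails to predict.

    Toy predictor: expect each symbol to repeat the previous one, so the
    higher level receives just the change points of the sequence.
    """
    passed, prev = [], object()  # sentinel never equals a real symbol
    for x in seq:
        if x != prev:            # prediction failed -> surprising -> pass up
            passed.append(x)
        prev = x
    return passed
```

A run of "aaabbbcc" compresses to its three change points, shortening the sequence the higher level must model.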

A Clockwork RNN Core

Koutník, Greff, Gomez & Schmidhuber, 2014 • ICML • arXiv:1402.3511

Hidden layer partitioned into modules running at different clock rates (1, 2, 4, 8... steps). Architectural inductive bias for multi-timescale temporal processing.
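The clocking scheme is easy to sketch (a minimal illustration assuming exponential periods 1, 2, 4, 8, ...; the `active_modules` helper is illustrative, not from the paper):

```python
def active_modules(t, num_modules):
    """Indices of Clockwork RNN modules whose clock fires at timestep t.

    Module i has period 2**i, so slower modules update rarely and see a
    coarser, longer-horizon view of the sequence.
    """
    return [i for i in range(num_modules) if t % (2 ** i) == 0]

# At t = 0 every module updates; afterwards only the faster ones fire.
schedule = {t: active_modules(t, 4) for t in range(8)}
```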

Phased LSTM: Accelerating Recurrent Network Training Core

Neil, Pfeiffer & Liu, 2016 • NeurIPS • arXiv:1610.09513

Adds learned time gate with oscillating open/close periods. Only updates cell state during open phases—handles irregular/event-driven sequences and long time lags efficiently.
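The time gate is a simple piecewise function of the oscillation phase; a sketch following the paper's rising/falling/leak shape (parameter names here are illustrative):

```python
def time_gate(t, tau, s=0.0, r_on=0.5, alpha=0.001):
    """Openness of a Phased LSTM-style time gate at time t.

    tau: oscillation period, s: phase shift, r_on: fraction of the
    period the gate is open, alpha: small leak while closed.
    """
    phi = ((t - s) % tau) / tau      # position within the cycle, in [0, 1)
    if phi < r_on / 2:               # rising half of the open phase
        return 2 * phi / r_on
    if phi < r_on:                   # falling half of the open phase
        return 2 - 2 * phi / r_on
    return alpha * phi               # closed: tiny leak keeps gradients alive
```

Cell-state updates are scaled by this gate, so most timesteps leave the state almost untouched, which is what makes long, sparse, or irregularly sampled sequences cheap to process.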

Hierarchical Multiscale Recurrent Neural Networks Core

Chung, Ahn & Bengio, 2017 • ICLR • arXiv:1609.01704

Learns to discover latent hierarchical structure via boundary detection. Higher layers update only at detected boundaries—adaptive temporal abstraction without predefined timescales.

Recent Advances in Hierarchical Reinforcement Learning Core

Barto & Mahadevan, 2003 • Discrete Event Dynamic Systems

Foundational survey on HRL. Temporal abstraction and options framework.

Reinforcement Learning: An Introduction Core

Sutton & Barto, 2018 • MIT Press

The RL textbook. Foundation for all modern RL research.

Emergent Temporal Abstractions in Autoregressive Models New

Kobayashi et al., Dec 2025 • arXiv:2512.20605

"Internal RL" enables hierarchical RL within foundation models.

Exploring the Limits of Hierarchical World Models in RL New

Nature Scientific Reports, 2024 • Nature

Analyzes HMBRL: combining MBRL sample efficiency with HRL abstraction. Comprehensive study of hierarchical model-based RL.

đź§­ Spatial Memory & Navigation

Memory Maze: A Benchmark for Long-Term Memory Core

Pasukonis et al., 2022 • arXiv:2206.10502

Evaluation platform for HICA. Tests rapid long-term memory acquisition.

Clone Structured Causal Graph (CSCG) Core

George et al., 2021

Cloned hidden Markov model that learns cognitive maps from aliased sequential observations.

Grid Cells in the Entorhinal Cortex Core

Moser et al., 2008 • Annual Review of Neuroscience

Biological spatial representation. Inspiration for cognitive maps in AI.

The Hippocampus as a Predictive Map Core

Stachenfeld, Botvinick & Gershman, 2017 • Nature Neuroscience • nn.4650

Seminal predictive-map paper: the successor representation in the hippocampus supports predictive coding of future states. Key reference for HICA.

The Tolman-Eichenbaum Machine Core

Whittington et al., 2020 • Cell

Unifying space and relational memory. Transformer-like attention in hippocampal formation.

Improving Generalization for Temporal Difference Learning: The Successor Representation Core

Dayan, 1993 • Neural Computation

The original successor representation paper. Foundation for predictive-map theories.
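Dayan's successor representation follows directly from its definition, M = Σ_t γ^t P^t: expected discounted future occupancy of each state. A truncated power-series sketch in plain Python (the `successor_representation` helper is illustrative, not from the paper):

```python
def successor_representation(P, gamma, horizon=200):
    """Discounted future occupancy M[s][s'] ~= sum_t gamma^t (P^t)[s][s']."""
    n = len(P)
    M = [[float(i == j) for j in range(n)] for i in range(n)]  # t = 0 term: identity
    Pt = [row[:] for row in M]                                 # running power P^t
    for t in range(1, horizon):
        # advance the power: Pt <- Pt @ P
        Pt = [[sum(Pt[i][k] * P[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        for i in range(n):
            for j in range(n):
                M[i][j] += gamma ** t * Pt[i][j]
    return M
```

Values then factor through the SR as V(s) = Σ_{s'} M[s][s'] r(s'), which is why the SR generalizes across reward functions: changing r only requires re-weighting M, not relearning the dynamics.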

🏛️ Predictive Coding & Cortical Theory

A Brief History of Intelligence Core

Max Bennett, 2023 • Mariner Books

Traces 5 evolutionary breakthroughs in intelligence (steering, reinforcing, simulating, mentalizing, speaking) and connects them to modern AI.

On Intelligence Core

Jeff Hawkins, 2004 • Times Books

Foundational book proposing the Memory-Prediction Framework. Intelligence = ability to predict the future.

Canonical Microcircuits for Predictive Coding Core

Bastos et al., 2012 • Neuron

Predictive coding in cortical columns. Foundation for hierarchical brain models.

Hierarchy or Heterarchy: Long-Range Cortical Theory New

Jul 2025 • arXiv:2507.05888

Cortical columns learn structured world models for prediction.

Brain-inspired Intelligence via Predictive Coding New

Oct 2025 • arXiv:2308.07870

Neural Generative Coding (NGC) for cortical functionality.

🤖 Robot Learning & Sample Efficiency

Towards Sample Efficient Reinforcement Learning Core

Yu, 2018 • IJCAI

Defines the sample efficiency challenge. Key motivation for HICA research.

DayDreamer: World Models for Physical Robot Learning Core

Wu et al., 2022 • arXiv:2206.14176

Dreamer applied to real robots. Quadruped and wheeled robot navigation.

EmbodiSwap for Zero-Shot Robot Imitation New

Oct 2025 • arXiv:2510.03706

Human→robot swap. 82% zero-shot success rate.

Annotation-Free One-Shot Imitation New

Sep 2025 • arXiv:2509.24972

Multi-step tasks from single demonstration.

🎯 Exploration & Intrinsic Motivation

Intrinsically Motivated Reinforcement Learning Core

Singh, Barto & Chentanez, 2005

Foundation of curiosity-driven learning. Internal reward signals.

Planning to Explore via Self-Supervised World Models Core

Sekar et al., 2020 • ICML • arXiv:2005.05960

Plan2Explore: curiosity via world model disagreement.

VIME: Variational Information Maximizing Exploration Core

Houthooft et al., 2016 • NeurIPS

Prior/posterior belief comparison for exploration. Used in HICA.

🎮 Game AI Benchmarks

AlphaGo Zero: Mastering the Game of Go without Human Knowledge Core

Silver et al., 2017 • Nature

Self-play without human data. Demonstrates power of MCTS + neural networks.

AlphaStar: Grandmaster Level in StarCraft II Core

Vinyals et al., 2019 • Nature

Complex real-time strategy. Example of sample inefficiency problem.
