Chapter 10.12: Explicit Memory
Deep Learning Book - Chapter 10.12 (page 410)
Neural networks excel at storing implicit knowledge in their parameters, but they are weak at memorizing and reliably recalling explicit facts.
Key Concepts
Explicit memory introduces a separate, addressable storage component outside the network parameters, allowing models to read from and write to memory explicitly rather than encoding all information in hidden states.
A controller network (often an RNN) learns how to access memory via read/write mechanisms, using content-based and/or location-based addressing; a sketch of content-based reads and writes follows these key concepts.
Because information can be stored in and retrieved from memory directly, it no longer has to be propagated through many recurrent time steps, which mitigates the vanishing/exploding-gradient problem for long-term dependencies.
Explicit memory is well suited for tasks requiring precise storage, retrieval, and algorithmic behavior (e.g., copying, sorting, question answering).
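To make the read/write mechanics concrete, here is a minimal NumPy sketch of content-based addressing combined with NTM-style soft reads and writes. The function names (content_address, read, write), the sharpening parameter beta, and the toy memory are illustrative assumptions, not notation from the book.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def content_address(memory, key, beta=1.0):
    """Content-based addressing: compare a key vector against every
    memory row by cosine similarity, then turn the (sharpened)
    similarities into attention weights over rows."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    cosine = memory @ key / norms
    return softmax(beta * cosine)   # beta > 1 sharpens the focus

def read(memory, weights):
    """Soft read: a weighted average of memory rows."""
    return weights @ memory

def write(memory, weights, erase, add):
    """NTM-style soft write: each row is partially erased, then the
    add vector is blended in, both scaled by that row's weight."""
    memory = memory * (1 - np.outer(weights, erase))
    return memory + np.outer(weights, add)

# Toy usage: a memory of 4 cells with 3 features each.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 3))
k = M[2] + 0.05 * rng.normal(size=3)     # query resembling row 2
w = content_address(M, k, beta=5.0)
print("read weights:", np.round(w, 3))   # mass concentrates on row 2
print("read vector :", read(M, w))
M = write(M, w, erase=np.ones(3) * 0.5, add=k)
```

The erase/add decomposition lets the controller overwrite a cell gradually; when the weights concentrate on a single row, the update approaches a hard write to that cell while remaining differentiable end to end.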
Practical Considerations
However, classic explicit memory architectures (e.g., Memory Networks, Neural Turing Machines) are computationally expensive, difficult to train, and rarely used in practice.
Their primary contribution is conceptual: separating computation from storage and inspiring later, more practical mechanisms.
Modern Realizations
Modern architectures such as attention, Transformers, and retrieval-augmented models can be viewed as successful, scalable realizations of the explicit memory idea.
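To make this connection concrete, the sketch below implements scaled dot-product attention viewed as a soft memory read: the keys and values play the role of the memory, and each query performs a content-based lookup. The function name attention_read and the toy shapes are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def attention_read(queries, keys, values):
    """Scaled dot-product attention as a soft memory read:
    keys/values form the 'memory', queries are content-based lookups."""
    d = keys.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)           # similarity to each slot
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax per query
    return weights @ values                          # weighted retrieval

# Toy usage: 6 memory slots with 8-dim keys and values, 2 queries.
rng = np.random.default_rng(1)
K = rng.normal(size=(6, 8))                   # addresses of the slots
V = rng.normal(size=(6, 8))                   # contents stored at each slot
Q = K[[1, 4]] + 0.1 * rng.normal(size=(2, 8)) # queries near slots 1 and 4
print(attention_read(Q, K, V).shape)          # (2, 8): one read per query
```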