Evolution of Buffer Management in Database Systems: From Classical Algorithms to Machine Learning and Disaggregated Memory
arXiv:2512.22995v1 Announce Type: new
Abstract: Buffer management remains a critical component of database and operating system performance, serving as the primary mechanism for bridging the persistent latency gap between CPU processing speeds and storage access times. This paper provides a comprehensive survey of buffer management evolution spanning four decades of research. We systematically analyze the progression from foundational algorithms like LRU-K, 2Q, LIRS, and ARC to contemporary machine learning-augmented policies and disaggregated memory architectures. Our survey examines the historical OS-DBMS architectural divergence, production system implementations in PostgreSQL, Oracle, and Linux, and emerging trends including eBPF-based kernel extensibility, NVM-aware tiering strategies, and RDMA-enabled memory disaggregation. Through analysis of over 50 seminal papers from leading conferences (SIGMOD, VLDB, OSDI, FAST), we identify key architectural patterns, performance trade-offs, and open research challenges. We conclude by outlining a research direction that integrates machine learning with kernel extensibility mechanisms to enable adaptive, cross-layer buffer management for heterogeneous memory hierarchies in modern database systems.