Ph.D. Alumni: Sari Sultan
Reference:
Sari Sultan
Configuring In-Memory Caches: From TTL-Aware Sizing to Interval-Based Historical Analysis with HistoChron
Ph.D. Thesis, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada, 2024.
Supervisor(s):
Michael Stumm
Abstract:
In-memory caches such as Memcached and Redis are crucial for enhancing the performance of
distributed systems by significantly reducing query response times. Correctly sizing these caches is
critical, especially considering that prominent organizations use terabytes to petabytes of Dynamic
Random Access Memory (DRAM) for these caches. Configuring these caches to operate efficiently
remains a challenging task, considering the dynamic nature of modern workloads where caching
requirements can change significantly over time.
Our thesis is that the state of the art in in-memory cache performance analysis does not
accommodate modern workloads. This gap is evident in the lack of consideration for Time-to-Live
(TTL) attributes and heterogeneous object sizes, as well as the absence of interval-based historical
analysis to address the dynamic nature of these workloads. This dissertation introduces a
comprehensive reevaluation of in-memory cache performance analysis tools. We propose novel tools
that account for TTL attributes and heterogeneous object sizes, and we introduce a new tool that
enables efficient interval-based historical analysis of in-memory cache workloads. In particular, one
of our primary contributions is the development of Miss Ratio Curve (MRC) generation and Working
Set Size (WSS) estimation algorithms that accommodate TTL attributes and heterogeneous object
sizes. Our analysis of real-world cache workloads demonstrates that accounting for TTLs can reduce
the cache memory footprint by 69% on average, and by up to 99%.
Additionally, we introduce HistoChron, a novel methodology with a Graphical User Interface
(GUI) that enables efficient interval-based historical analysis of caching workloads. Evaluated on over
5,000 cache access traces from six real-world datasets, encompassing more than 300 billion accesses
over an 18-year span, HistoChron demonstrates its efficacy by generating exact MRCs over any
arbitrary time interval using just 24 MiB of storage space per week. We also present a lower-overhead
variant of HistoChron that generates approximate results with a mean error of less than 1%. These
contributions advance the field of in-memory cache management, offering a robust framework for
optimizing in-memory caches in alignment with the dynamic demands of modern workloads.
Keywords:
Memory Management, In-memory Caches, TTL, MRC-generation, Working Set Size
BibTeX:
@phdthesis(Sultan-PhD24,
  author = {Sari Sultan},
  title = {Configuring In-Memory Caches: From TTL-Aware Sizing to Interval-Based Historical Analysis with HistoChron},
  school = {Department of Electrical and Computer Engineering, University of Toronto},
  address = {Toronto, Canada},
  supervisors = {Michael Stumm},
  month = {August},
  year = {2024},
  keywords = {Memory Management, In-memory Caches, TTL, MRC-generation, Working Set Size}
)