Transparent Memory Offloading (TMO) is Meta’s solution for heterogeneous data centre environments. It introduces a new Linux kernel mechanism that measures the lost work due to resource shortage across CPU, memory, and I/O in real time. Guided by this information and without prior application knowledge, TMO automatically adjusts the amount of memory to offload to a heterogeneous device, such as compressed memory or an SSD. It does so according to the device’s performance characteristics and the application’s sensitivity to slower memory accesses. TMO holistically identifies offloading opportunities both in application containers and in the sidecar containers that provide infrastructure-level functions.
TMO has been running in production since 2021 and has saved 20 per cent to 32 per cent of total memory across millions of servers in Meta’s expansive data centre fleet.
Its kernel components are now part of mainline Linux and, in a nutshell, TMO automatically offloads data to other storage tiers (e.g. Samsung’s CXL memory expander) that are less costly and more power-efficient than DRAM.
TMO has been running on millions of Facebook servers for more than a year, saving almost a third of memory per server. Across dozens or hundreds of servers such savings would be modest in absolute terms, but at Meta’s immense scale they translate into substantial cost and power reductions.
TMO consists of the following components:
- Pressure Stall Information (PSI), a Linux kernel component that measures the lost work due to resource shortage across CPU, memory, and I/O in real time. For the first time, we can directly measure an application’s sensitivity to memory access slowdown without resorting to fragile low-level metrics such as the page promotion rate.
- Senpai, a userspace agent that applies mild, proactive memory pressure to effectively offload memory across diverse workloads and heterogeneous hardware with minimal impact on application performance.
- A swap-on-mild-pressure policy: TMO offloads memory to swap at pressure levels too low to be perceptible to applications, keeping swap turnover in proportion to file-cache reclaim. This contrasts with the kernel’s historical behaviour of using swap only as an emergency overflow under severe memory pressure.
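To make the PSI component above concrete, here is a minimal sketch of reading the kernel’s pressure interface, which is exposed at `/proc/pressure/memory` (and per-cgroup as `memory.pressure` under cgroup v2). The file format shown in the comments is the real PSI format; the parsing helper itself is just an illustration.

```python
# Parse Linux PSI (Pressure Stall Information) output for memory.
# /proc/pressure/memory exposes two lines of the form:
#   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
#   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
# "some" is the share of time at least one task stalled on memory;
# "full" is the share of time all non-idle tasks stalled at once.
def parse_psi(text: str) -> dict:
    metrics = {}
    for line in text.splitlines():
        kind, *fields = line.split()
        metrics[kind] = {key: float(val)
                         for key, val in (f.split("=") for f in fields)}
    return metrics

# Example with a captured snapshot rather than a live read:
sample = ("some avg10=0.12 avg60=0.05 avg300=0.01 total=417963\n"
          "full avg10=0.03 avg60=0.01 avg300=0.00 total=93211")
print(parse_psi(sample)["some"]["avg10"])  # -> 0.12
```

In production one would read the file periodically; the `avg10`/`avg60`/`avg300` fields are running averages over 10, 60, and 300 seconds, which is what lets TMO measure an application’s sensitivity to memory slowdown directly instead of inferring it from low-level page statistics.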
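The Senpai component can be pictured as a simple feedback loop: shrink a cgroup’s memory allowance while measured pressure stays below a target, and back off when it rises. The sketch below is hypothetical — the control constants and cgroup name are invented — but the `memory.pressure`, `memory.current`, and `memory.high` files are real cgroup v2 interfaces.

```python
# Hypothetical Senpai-style control loop over cgroup v2 (illustrative only).
PRESSURE_TARGET = 0.1  # invented: tolerated "some" avg10 stall percentage
SHRINK = 0.99          # tighten memory.high by 1% when pressure is low
GROW = 1.01            # relax by 1% when pressure exceeds the target

def next_limit(pressure: float, current_bytes: int) -> int:
    """Pure control step: pick the next memory.high value."""
    factor = SHRINK if pressure < PRESSURE_TARGET else GROW
    return int(current_bytes * factor)

def read_some_avg10(pressure_file: str) -> float:
    # Extract the avg10 field from the "some" line of a PSI file.
    with open(pressure_file) as f:
        for line in f:
            if line.startswith("some"):
                return float(line.split()[1].split("=", 1)[1])
    return 0.0

def control_step(cgroup: str = "/sys/fs/cgroup/myapp") -> None:
    pressure = read_some_avg10(f"{cgroup}/memory.pressure")
    with open(f"{cgroup}/memory.current") as f:
        current = int(f.read())
    with open(f"{cgroup}/memory.high", "w") as f:
        f.write(str(next_limit(pressure, current)))
```

Lowering `memory.high` below current usage makes the kernel reclaim the coldest pages first, pushing them to the swap or compression tier; because the probe pressure is kept tiny, hot pages are faulted straight back and application impact stays minimal.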