Container Testing Reveals ‘Memory Pressure’ on Apps
As early adopters of application container technology complete initial testing in multi-tenant settings, potential performance issues are beginning to surface. One of them, according to hyper-scaler LinkedIn, involves a Linux kernel feature called "control groups," which most containers use to assign resources.
In an analysis based on several months of "pressure testing," Zhenyun Zhuang, a software engineer at LinkedIn (NYSE: LNKD), notes that many container projects running on Docker or rival CoreOS platforms rely on "cgroups" to limit resources like computing and memory. "Ensuring the high performance of the applications running in cgroups is very important for business-critical computing environments," Zhuang stressed in a paper released Thursday (Aug. 18).
The social media company has been using cgroups for an internal container project called LinkedIn Platform as a Service. One goal was to determine how resource-limiting policies would affect application performance. "We have found that cgroups do not totally isolate resources, but rather limit resource usage so that applications running in memory-limited cgroups do not starve other cgroups," the paper concludes.
One result is "memory pressure," which the author warns could raise issues affecting the performance of applications running in cgroups, including:
- Memory is not reserved for cgroups, as it is for virtual machines.
- Page cache used by containerized applications is counted against a cgroup's memory limits, "therefore anonymous memory usage can steal page cache usage for the same cgroup."
- The operating system also can "steal" memory from cgroups. The researcher attributes this to the fact that the root cgroup is "unbounded."
Among the performance shortfalls posed by memory pressure on either root or "regular" cgroups is much slower application startup, especially if the OS has to free up memory to meet an application memory request, Zhuang noted.
Rather than reserving memory as with virtual machines, cgroups impose only an upper limit on memory usage for applications in the control group. Hence, the LinkedIn engineer found that memory is allocated on demand, and applications deployed in cgroups must compete for free memory from the OS.
One implication is that the OS must reclaim memory from the page cache or from "anonymous" memory if it does not have enough free memory to meet the cgroup request. "Memory reclamation by the OS could be a performance killer, affecting the performance of other cgroups," Zhuang warned.
LinkedIn also found that page cache used by applications is counted against a cgroup's memory limit, and anonymous memory usage can "steal" page cache for the same cgroup. Further, the paper notes, the OS can also steal page cache from cgroups if needed.
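The accounting described above can be sketched as a toy model. This is not kernel code and not taken from the LinkedIn paper: the class name ToyCgroup, its fields, and the simplification that page cache is always evicted before anything else are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyCgroup:
    """Toy accounting model of a memory-limited cgroup (hypothetical, not kernel code)."""

    limit: int           # upper bound in bytes; nothing is reserved up front
    anon: int = 0        # anonymous memory charged to the group
    page_cache: int = 0  # page cache charged to the same group

    def usage(self) -> int:
        # Both memory types count against the same limit.
        return self.anon + self.page_cache

    def charge_anon(self, nbytes: int) -> int:
        """Charge nbytes of anonymous memory; return the bytes of page cache evicted."""
        overshoot = self.usage() + nbytes - self.limit
        if overshoot > self.page_cache:
            # Assumption: the toy model simply refuses requests it cannot fit.
            raise MemoryError("request does not fit under the limit in this toy model")
        evicted = max(overshoot, 0)
        self.page_cache -= evicted  # anonymous memory "steals" from the page cache
        self.anon += nbytes
        return evicted
```

For example, a group with a limit of 100 units, 40 units of anonymous memory, and 50 units of page cache has to give up 20 units of cache to admit another 30 units of anonymous memory.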
The Linux kernel also allows for swapping computing and memory resources between the unbounded root and bounded regular cgroups. The LinkedIn analysis notes that each cgroup can have its own "swappiness" setting, but all cgroups rely on the same swap space configured by the OS. Hence, memory pressure on the root or regular cgroups could have a cascade effect on other cgroups in the container.
Among the proposed remedies to the memory pressure issue arising from early application container stress testing is "pre-touching" memory limits for cgroups while avoiding "use-as-you-go" memory requests. In other words, LinkedIn recommended allocating memory for application containers beforehand.
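At the level of a single process, pre-touching might look roughly like the sketch below. The paper's own mechanism is not described here, so this is only one plausible reading: the function name pretouch is hypothetical, and the approach (map an anonymous region, then write one byte to every page so the pages are faulted in at startup rather than on first use) is an assumption.

```python
import mmap


def pretouch(nbytes: int, page_size: int = mmap.PAGESIZE) -> mmap.mmap:
    """Map an anonymous region and write one byte per page (hypothetical helper).

    Writing to each page forces the OS to back it with real memory now,
    instead of allocating it lazily when the application first uses it.
    """
    region = mmap.mmap(-1, nbytes)  # -1 requests an anonymous mapping
    for offset in range(0, nbytes, page_size):
        region[offset] = 1
    return region
```

The trade-off is a slower, more memory-hungry startup in exchange for not having to compete for free memory later, when the application is under load.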
Moreover, when "onboarding an application to cgroups, the memory footprint of the application needs to be sized," Zhuang stressed. "Since a cgroup’s memory limit counts both anonymous memory and page cache used by the cgroup, sizing memory footprint should consider both memory types."
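A rough sizing calculation along these lines is sketched below. It assumes the footprint is sampled from the text of a cgroup's memory.stat file at peak load, where cgroup v1 reports the two memory types as "rss" and "cache" and cgroup v2 reports them as "anon" and "file". The function names and the 20 percent headroom are illustrative choices, not recommendations from the paper.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse 'key value' lines in the style of a cgroup memory.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def suggested_limit(stat_text: str, headroom_pct: int = 20) -> int:
    """Sum anonymous memory and page cache, then add headroom (hypothetical policy)."""
    stats = parse_memory_stat(stat_text)
    # cgroup v1 uses "rss"/"cache"; cgroup v2 uses "anon"/"file".
    anon = stats.get("rss", stats.get("anon", 0))
    cache = stats.get("cache", stats.get("file", 0))
    return (anon + cache) * (100 + headroom_pct) // 100
```

Sizing on anonymous memory alone would understate the footprint, since page cache is charged against the same limit.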
LinkedIn also recommends isolating memory usage when running containers, including moving as many "house-keeping" processes from the root cgroup to "properly-sized user cgroups."
In sum, Zhuang called cgroups "a decent mechanism to limit memory usage of applications" so long as steps are taken ahead of time to avoid memory pressure. With that in mind, Control Group v2 containing performance improvements was released earlier this year. LinkedIn reported this week it expects to test the latest version as part of its container platform service.
Representatives from application container leaders Docker and CoreOS were not immediately available to comment on the LinkedIn analysis.
George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).