The Wonders of NUMA

This talk was presented at the OpenStack Summit in Vancouver in May 2018. It provided an in-depth summary of what NUMA is and why it matters so much for high-performance workloads. It was delivered as a lightning talk which, to be honest, wasn’t really sufficient for such a complicated area (at least in the context of NUMA within OpenStack), but the slides should provide a decent overview of the topic.

You select the best possible hardware for the job, you optimize the host OS to deliver the best performance ever seen by mankind, and you tweak your high-performance vSwitch to ensure nothing, nothing, could possibly stop you now. You rub your bloodshot eyes, run openstack server create and, well, things don’t look so rosy.

Welcome to the world of OpenStack on NUMA-based architectures, where one poor scheduling decision can result in drastic performance reductions. Thankfully, OpenStack realized this some time ago and has been doing many wonderful things since then to prevent this pain. In this talk, we shine a light on all things NUMA, from both a general and OpenStack-orientated perspective.

Coming out of this talk, you should know everything there is to know about NUMA in OpenStack and, one can hope, be able to finally put those performance issues to bed and get some sleep.

What can I expect to learn?

The talk is multi-faceted, covering a breadth of topics from hardware design right up to user interaction with OpenStack deployments. We aim to cover:

What NUMA is and why it exists

How and to what extent NUMA can affect your OpenStack deployment, and the techniques we use under the hood to prevent any performance degradation

What you, as a user, can do to further optimize your instances’ performance

In addition, work continues on these features. We also take a look forward at some of the work we’re doing to further improve things:

Why OVS-DPDK doesn’t deliver on NUMA-based systems, and what we’re doing about it (*)

Why live migration with CPU pinning doesn’t work, and what we’re doing about it (*)

(*) If we haven’t already done something about it by now

More information is available on the OpenStack Summit Vancouver website.