NUMA++ provides a more convenient C++ API to control:
- Memory policy and other memory-related functions (mainly provided by libnuma)
- CPU affinity (libnuma)
- Kernel scheduler (glibc/pthread)
Policies are described with classes and applied with the free function numapp::Apply(), which is overloaded for each policy type. Overloads that apply a policy to the current thread are provided in numapp::thisThread as well: numapp::thisThread::Apply().
- Warning
- Before using any other library function, numapp::NumaAvailable() must be called and confirmed to return true.
- Note
- Some policies can only be set on the current thread and therefore only provide overloads in numapp::thisThread.
- Warning
- Efforts are made to avoid breaking changes, but this library is still under development and the API may change.
Permissions
Certain operations are subject to permission checks by the kernel. Often there are resource limits (see getrlimit(2)) below which no additional permissions are required. Exceeding those limits typically requires the appropriate capabilities(7), or running as root or with the SUID bit set.
Capabilities
Notable capabilities:
See capabilities(7) for details.
Resource Limits
Notable limits:
RLIMIT_AS limits maximum virtual address space.
RLIMIT_CPU limits amount of CPU time a process can consume.
RLIMIT_MEMLOCK limits amount of memory that can be locked.
RLIMIT_NICE specifies ceiling for nice level.
RLIMIT_RTPRIO specifies ceiling of real-time scheduler priority.
RLIMIT_RTTIME effectively limits how long a real-time process may run without yielding to kernel.
See getrlimit(2) for details.
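As an illustration of how such a limit can be inspected before attempting a privileged operation, the following minimal sketch reads the soft RLIMIT_MEMLOCK limit with getrlimit(2). The helper names QueryMemlockSoftLimit and FitsWithinMemlockLimit are hypothetical and not part of NUMA++.

```cpp
#include <sys/resource.h>

#include <optional>

// Hypothetical helper (not part of NUMA++): returns the soft RLIMIT_MEMLOCK
// limit in bytes, or std::nullopt if getrlimit(2) fails.
// RLIM_INFINITY is returned unchanged when no limit applies.
std::optional<rlim_t> QueryMemlockSoftLimit() noexcept {
    rlimit limit{};
    if (getrlimit(RLIMIT_MEMLOCK, &limit) != 0) {
        return std::nullopt;
    }
    return limit.rlim_cur;
}

// Hypothetical helper: true if a single request of `bytes` does not exceed
// the soft limit. Memory that is already locked is not taken into account.
bool FitsWithinMemlockLimit(rlim_t bytes) noexcept {
    auto const limit = QueryMemlockSoftLimit();
    return limit.has_value() && (*limit == RLIM_INFINITY || bytes <= *limit);
}
```

A request that does not fit would typically need CAP_IPC_LOCK (see capabilities(7)) or a raised limit.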
Dependencies
Library dependencies:
pthread
numa (a.k.a. libnuma or numactl)
General Utilities
Before use, the availability of NUMA on the host system should be verified with numapp::NumaAvailable().
NUMA++ provides a utility to create threads with NUMA policies in <numapp/thread.hpp> with numapp::MakeThread().
An example of creating a thread that only runs on CPUs 0-3:
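#include <chrono>
#include <iostream>
#include <thread>
#include <numapp/numa.hpp>
#include <numapp/thread.hpp>

void ThreadFunc(std::chrono::milliseconds duration) {
    std::this_thread::sleep_for(duration);
}

int main() {
    using namespace std::chrono_literals;
    if (!numapp::NumaAvailable()) {
        std::cout << "NUMA not available";
        return -1;
    }
    // Restrict the new thread to CPUs 0-3.
    // Assumes numapp::NumaPolicies is default-constructible.
    numapp::NumaPolicies policies;
    policies.SetCpuAffinity(numapp::CpuAffinity::MakeFromCpuStringAll("0-3"));
    auto thread =
        numapp::MakeThread("myThread", policies, &ThreadFunc, 500ms);
    thread.join();
}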
static CpuAffinity MakeFromCpuStringAll(char const *cpustring)
Create CpuAffinity from `cpustring` without considering current cpuset.
static CpuAffinity MakeFromActive()
Create current affinity settings.
Combines the available NUMA policy types in one object.
void SetCpuAffinity(std::optional< CpuAffinity > affinity) noexcept
Set CPU affinity.
bool NumaAvailable() noexcept
Query whether system has NUMA support.
std::thread MakeThread(std::string_view thread_name, NumaPolicies const &policies, Func &&func, Args &&... args)
Primary overload accepting string-view for thread_name.
Contains declarations for numapp thread utilities.
Memory APIs
- Attention
- If you take the step of managing memory with NUMA in mind, it is important to understand how the primitives provided by NUMA++ work and the context in which they are used. It may be helpful to think of NUMA++ as a set of block allocators whose minimum block size is the system page size for normal allocations and the specified huge page size for huge page allocations. It is not recommended to use them as general-purpose allocators. As an example, foonathan-memory has a compatible concept, BlockAllocator.
Manual Allocation
Allocate pages of memory with specified memory policy (see function group).
For example, to allocate one page from NUMA node 1, the following functions are used:
static MemPolicy MakeBindNode(int node)
Creates a strict policy (using MPOL_BIND) to allocate all memory to the specified node.
std::size_t GetPageSize() noexcept
Fast query of system page size.
void Free(void *ptr, std::size_t size, std::error_code &ec) noexcept
See group for details.
void * Allocate(std::size_t size, MemPolicy const &policy, std::error_code &ec) noexcept
See group for details.
Hardware Queries
Provides the following functions:
Memory Locking
Prevents all or parts of process memory from being paged to the swap area.
This is an example of how to effectively disable swapping completely for the process:
- Note
- This example causes all allocated memory to be locked, not only the resident memory, which increases memory pressure and may lead to out-of-memory situations.
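#include <iostream>
#include <numapp/numa.hpp>
// MemLockAll() and LockAllFlag are declared in the NUMA++ memory header.

int main() {
    using numapp::operator|;
    if (!numapp::NumaAvailable()) {
        std::cout << "NUMA not available";
        return -1;
    }
    // Lock the pages that are currently mapped as well as pages mapped in the future.
    auto const ec = numapp::MemLockAll(numapp::LockAllFlag::Current |
                                       numapp::LockAllFlag::Future);
    if (ec) {
        std::cerr << "Failed to lock memory!" << std::endl;
        return -1;
    }
}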
LockAllFlag
Flags that are combined to modify behaviour of MemLockAll().
std::error_code MemLockAll(LockAllFlag flags) noexcept
Lock all memory pages as specified by provided flags.
@ Current
Lock all pages which are currently mapped into the address space of the process.
@ Future
Lock all future pages that are mapped into the address space of the process.
Contains memory function declarations.
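The Current and Future flags described above resemble the MCL_CURRENT and MCL_FUTURE flags of mlockall(2). Whether NUMA++ wraps that call is an assumption. The sketch below shows the equivalent operation with the system call directly. LockAllCurrentAndFuture is a hypothetical name, not a NUMA++ function.

```cpp
#include <sys/mman.h>

#include <cerrno>
#include <system_error>

// Hypothetical helper (not part of NUMA++): locks all pages that are mapped
// now and all pages mapped later, reporting failure as a std::error_code.
std::error_code LockAllCurrentAndFuture() noexcept {
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        // Typical failures: ENOMEM (RLIMIT_MEMLOCK exceeded) or EPERM.
        return {errno, std::generic_category()};
    }
    return {};
}
```

The lock can be released again with munlockall(2). Without CAP_IPC_LOCK the call is bounded by RLIMIT_MEMLOCK, so a failure is expected for unprivileged processes with a low limit.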
Memory Policy
Controls how physical memory is allocated by default for a thread, or where the memory of an already mapped virtual memory range is placed.
- Note
- Note that the memory policy is used by the kernel when it allocates physical memory, not when the application allocates virtual memory. The policy that applies is that of the thread that triggers the page fault, which is not necessarily the thread that allocated the virtual memory.
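The distinction between mapping virtual memory and allocating physical memory can be observed without NUMA++ at all. The following sketch maps anonymous memory with mmap(2), then uses mincore(2) to count resident pages before and after each page is written. All names (CountResidentPages, TouchResult, MapThenTouch) are hypothetical helpers for this illustration.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <vector>

// Counts the resident pages in [addr, addr + pages * page_size).
// Returns -1 if mincore(2) fails.
long CountResidentPages(void* addr, std::size_t pages, std::size_t page_size) {
    std::vector<unsigned char> residency(pages);
    if (mincore(addr, pages * page_size, residency.data()) != 0) {
        return -1;
    }
    long resident = 0;
    for (unsigned char entry : residency) {
        resident += entry & 1;  // Lowest bit set means the page is resident.
    }
    return resident;
}

struct TouchResult {
    long resident_before;  // Resident pages right after mmap().
    long resident_after;   // Resident pages after writing to every page.
};

// Maps `pages` anonymous pages and then writes one byte to each of them.
// Returns {-1, -1} if the mapping fails.
TouchResult MapThenTouch(std::size_t pages) {
    auto const page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    std::size_t const length = pages * page_size;
    void* mem = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return {-1, -1};
    }
    TouchResult result{};
    result.resident_before = CountResidentPages(mem, pages, page_size);
    auto* bytes = static_cast<volatile unsigned char*>(mem);
    for (std::size_t i = 0; i < pages; ++i) {
        bytes[i * page_size] = 1;  // The page fault happens here.
    }
    result.resident_after = CountResidentPages(mem, pages, page_size);
    munmap(mem, length);
    return result;
}
```

On a typical Linux kernel, resident_before is expected to be lower than resident_after, because physical pages are only allocated by the writes. It is at that point, in the writing thread, that the memory policy takes effect.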
A memory policy is described using numapp::MemPolicy in <numapp/mempolicy.hpp>. It is applied to the current thread with e.g. numapp::thisThread::Apply(numapp::MemPolicy const&), or to a memory range with numapp::Apply(void*,std::size_t,numapp::MemPolicy const&,MemPolicyFlag).
To apply a policy temporarily to the current thread, e.g. to have the policy used when mmap'ing some memory, numapp::ScopedMemPolicy can be used to apply the new policy and restore the old one automatically within a scope.
A helper function to apply a memory policy to the current thread stack memory is also provided with numapp::thisThread::ApplyStack().
Memory policies are inherited at fork(), clone() (without CLONE_VM flag) and exec*.
The following example applies a bind policy for NUMA nodes 0-1 to the current thread:
#include <iostream>
#include <numapp/numa.hpp>
int main() {
std::cout << "NUMA not available";
return -1;
}
std::cout << "Failed to apply policy: " << ec.message() << std::endl;
return 1;
}
std::cout << "Failed to apply policy: " << ec.message() << std::endl;
return 1;
}
}
Class representing a memory policy that can be modified and used to apply to the current thread or a ...
@ Bind
The Bind mode defines a strict policy that restricts memory allocation to the nodes specified in node...
std::error_code Apply(CpuAffinity const &affinity) noexcept
Apply policy to calling thread.
std::error_code ApplyStack(MemPolicy const &policy, MemPolicyFlag flags=MemPolicyFlag::Move|MemPolicyFlag::Strict) noexcept
Convenience function that applies a memory policy to current thread stack memory.
Contains declarations for numapp::MemPolicy.
Type-safe NUMA node mask.
static Nodemask MakeFromNodestring(char const *nodestring)
Construct Nodemask from nodestring that considers current cpuset.
The following example applies policy to a memory range, rather than setting the policy to the current thread:
#include <iostream>
void BindMemoryRange(int node, void* address, std::size_t size) {
using numapp::operator|;
using numapp::MemPolicyFlag;
auto flags = MemPolicyFlag::Move | MemPolicyFlag::Strict;
std::cerr << "Failed to apply policy: " << ec.message() << std::endl;
throw std::system_error(ec);
}
std::cerr << "Failed to apply policy: " << ec.message() << std::endl;
throw std::system_error(ec);
}
}
std::error_code Apply(pid_t thread, CpuAffinity const &affinity) noexcept
Apply policy to specified thread.
std::error_code MemLock(void const *addr, std::size_t len, LockFlag flag) noexcept
Lock memory pages in the specified address range.
LockFlag
Mutually exclusive flags that modify the behaviour of MemLock().
@ PreFault
Locks pages whether they are resident or not.
Polymorphic Memory Resource
NUMA++ provides an implementation of std::pmr::memory_resource, numapp::PageResource, that can be used to allocate memory with a specified memory policy through the STL allocator std::pmr::polymorphic_allocator.
An example is shown below, where the pool allocator std::pmr::unsynchronized_pool_resource uses memory from numapp::PageResource.
#include <string>
#include <vector>
void Example() {
numapp::MemPolicyFlag::Strict);
auto pool = std::pmr::unsynchronized_pool_resource(&resource);
[[maybe_unused]] auto vector = std::pmr::vector<std::uint8_t>(1024, &pool);
[[maybe_unused]] auto string = std::pmr::string(1024, 'a', &pool);
}
- Note
- Boost.SmartPtr provides allocator-aware utilities for constructing
std::unique_ptr with boost::allocate_unique().
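To illustrate the general idea of a page-granular upstream resource feeding a pool, here is a minimal, self-contained sketch of a std::pmr::memory_resource that obtains whole pages from mmap(2). It does not apply any NUMA memory policy and is not how numapp::PageResource is implemented. MmapPageResource and its mapped_bytes() accessor are hypothetical names.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <new>
#include <vector>

// Hypothetical page-granular resource: every allocation is rounded up to a
// multiple of the system page size and served by an anonymous mmap().
class MmapPageResource final : public std::pmr::memory_resource {
public:
    // Number of bytes currently mapped through this resource.
    std::size_t mapped_bytes() const noexcept { return mapped_bytes_; }

private:
    static std::size_t PageSize() noexcept {
        return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    }
    static std::size_t RoundUpToPage(std::size_t bytes) noexcept {
        auto const page = PageSize();
        return (bytes + page - 1) / page * page;
    }
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        // mmap() returns page-aligned memory; larger alignments are not supported.
        if (alignment > PageSize()) {
            throw std::bad_alloc();
        }
        std::size_t const length = RoundUpToPage(bytes == 0 ? 1 : bytes);
        void* mem = mmap(nullptr, length, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) {
            throw std::bad_alloc();
        }
        mapped_bytes_ += length;
        return mem;
    }
    void do_deallocate(void* ptr, std::size_t bytes, std::size_t) override {
        // Must mirror the rounding done in do_allocate().
        std::size_t const length = RoundUpToPage(bytes == 0 ? 1 : bytes);
        munmap(ptr, length);
        mapped_bytes_ -= length;
    }
    bool do_is_equal(std::pmr::memory_resource const& other) const noexcept override {
        return this == &other;
    }

    std::size_t mapped_bytes_ = 0;
};
```

Placing a pool resource such as std::pmr::unsynchronized_pool_resource in front of a resource like this keeps small allocations from each consuming a full page, which is why the page resource is better suited as an upstream than as a general-purpose allocator.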
NUMA++ also provides numapp::LockResource, a std::pmr::memory_resource implementation that locks memory allocated from its upstream memory resource.
#include <vector>
void Example() {
numapp::MemPolicyFlag::Strict);
[[maybe_unused]] auto vector =
std::pmr::vector<std::uint8_t>(4 * 1024 * 1024, &locked);
}
Lock memory allocated from upstream memory resource using specified LockFlag.
Huge Pages
NUMA++ provides primitives for allocating and freeing huge pages with numapp::AllocateHuge() and numapp::FreeHuge(), as well as the std::pmr::memory_resource implementation numapp::HugePageResource.
An example usage is shown below:
void Example() {
auto const size = 16 * 1024 * 1024;
page_size,
numapp::MemPolicyFlag::Strict);
throw std::system_error(ec, "Page fault failed");
}
}
void FreeHuge(void *ptr, std::size_t size, HugePageSize page_size, std::error_code &ec) noexcept
Free huge pages previously allocated with AllocateHuge.
void * AllocateHuge(std::size_t size, HugePageSize page_size, MemPolicy const &policy, MemPolicyFlag flags, std::error_code &ec) noexcept
Non-throwing version of AllocateHuge()
CPU Affinity API
Control where a thread is executed.
The CPU affinity of the current thread can be controlled using numapp::CpuAffinity in <numapp/cpuaffinity.hpp>.
For example, to set the CPU affinity of the current thread to cores 0-3, regardless of whether they are isolated:
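#include <iostream>
#include <numapp/numa.hpp>
#include <numapp/cpuaffinity.hpp>

int main() {
    if (!numapp::NumaAvailable()) {
        std::cout << "NUMA not available";
        return -1;
    }
    // The "All" variant does not consider the current cpuset.
    auto const affinity = numapp::CpuAffinity::MakeFromCpuStringAll("0-3");
    if (auto const ec = numapp::thisThread::Apply(affinity); ec) {
        std::cout << "Failed to apply policy: " << ec.message() << std::endl;
        return 1;
    }
}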
Create CPU affinity and apply to current thread.
static Cpumask MakeFromCpuStringAll(char const *cpustring)
Construct Cpumask from cpustring that does not consider current cpuset.
Contains declarations for CpuAffinity.
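For comparison, the sketch below sets the affinity of the calling thread with the Linux system calls sched_getaffinity(2) and sched_setaffinity(2) directly. This is not the NUMA++ implementation, which according to the introduction relies on libnuma. PinToFirstAllowedCpu is a hypothetical helper. It pins to a CPU the thread is already allowed to run on, so it stays within the current cpuset.

```cpp
#include <sched.h>

#include <cerrno>
#include <system_error>

// Hypothetical helper (not part of NUMA++): pins the calling thread to the
// lowest-numbered CPU it is currently allowed to run on and stores that CPU
// index in `cpu`.
std::error_code PinToFirstAllowedCpu(int& cpu) noexcept {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    // A pid of 0 refers to the calling thread.
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) {
        return {errno, std::generic_category()};
    }
    for (int i = 0; i < CPU_SETSIZE; ++i) {
        if (CPU_ISSET(i, &allowed)) {
            cpu_set_t target;
            CPU_ZERO(&target);
            CPU_SET(i, &target);
            if (sched_setaffinity(0, sizeof(target), &target) != 0) {
                return {errno, std::generic_category()};
            }
            cpu = i;
            return {};
        }
    }
    // No CPU was set in the allowed mask.
    return std::make_error_code(std::errc::no_such_device);
}
```

Reading the mask back with sched_getaffinity(2) afterwards should show exactly one CPU.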
Scheduler API
Control how a thread is scheduled for execution by the kernel.
The scheduler and priority of threads can be controlled using the following primitives from <numapp/scheduler.hpp>.
These APIs are used to create and apply policies, either from currently active policy or explicitly created manually.
For example, to apply the static FIFO scheduling policy with the highest priority:
#include <iostream>
#include <numapp/numa.hpp>
int main() {
std::cout << "NUMA not available";
return -1;
}
std::cout << "Failed to apply policy: " << ec.message() << std::endl;
return 1;
}
}
Static priority scheduler (real-time).
@ Fifo
A first-in, first-out "real-time" policy.
Contains scheduler declarations.
If the type of scheduler is not statically known, the sum-type numapp::Scheduler can be used. In this example the current policy is queried:
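#include <iostream>
#include <variant>
#include <numapp/numa.hpp>
#include <numapp/scheduler.hpp>

int main() {
    if (!numapp::NumaAvailable()) {
        std::cout << "NUMA not available";
        return -1;
    }
    auto const current = numapp::Scheduler::MakeFromActive();
    // Assumes HoldsScheduler() and GetScheduler() take the scheduler type as template argument.
    if (current.HoldsScheduler<numapp::DynamicScheduler>()) {
        auto const sched = current.GetScheduler<numapp::DynamicScheduler>();
        std::cout << "Nice: " << sched.GetNice() << std::endl;
    }
    // Alternatively, access the underlying variant directly.
    std::variant<numapp::DynamicScheduler, numapp::StaticScheduler, numapp::IdleScheduler> variant =
        current.Get();
}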
Normal non-realtime scheduler that uses dynamic priority (nice value).
int GetNice() const noexcept
Get dynamic nice value.
A sum-type of all supported schedulers.
SchedulerVariant const & Get() const noexcept
Get the underlying scheduler.
static Scheduler MakeFromActive()
Return active scheduler policy for this thread.
constexpr T GetScheduler() const
Get the held scheduler.
constexpr bool HoldsScheduler() const noexcept
Query the held scheduler.
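The information that such a query exposes can also be read with plain Linux calls. The sketch below is a rough illustration using sched_getscheduler(2), sched_getparam(2) and getpriority(2). It is not the NUMA++ implementation, and SchedulerInfo and QueryCurrentScheduler are hypothetical names.

```cpp
#include <sched.h>
#include <sys/resource.h>

#include <cerrno>
#include <optional>

// Hypothetical result type (not part of NUMA++).
struct SchedulerInfo {
    int policy;           // e.g. SCHED_OTHER, SCHED_FIFO, SCHED_RR, SCHED_IDLE
    int static_priority;  // 0 for non-real-time policies, 1..99 for real-time
    int nice;             // dynamic priority, -20 (highest) to 19 (lowest)
};

// Queries policy, static priority and nice value of the calling thread.
// Returns std::nullopt if any of the underlying calls fails.
std::optional<SchedulerInfo> QueryCurrentScheduler() noexcept {
    SchedulerInfo info{};
    info.policy = sched_getscheduler(0);  // 0 refers to the calling thread
    if (info.policy == -1) {
        return std::nullopt;
    }
    sched_param param{};
    if (sched_getparam(0, &param) != 0) {
        return std::nullopt;
    }
    info.static_priority = param.sched_priority;
    // getpriority() may legitimately return -1, so errno must be checked.
    errno = 0;
    info.nice = getpriority(PRIO_PROCESS, 0);
    if (info.nice == -1 && errno != 0) {
        return std::nullopt;
    }
    return info;
}
```

A sum-type as described above is a natural fit for this data, since the nice value is only meaningful for the normal policy and the static priority only for the real-time policies.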