A hardware bug afflicts the virtual memory of Intel CPUs

In the last few hours, information has emerged about a bug in Intel processors that affects every model, from the most recent releases back to products about 10 years old, and that is tied to the way they manage virtual memory.

The bug would allow programs to access memory contents that belong to the kernel. A software fix is currently being implemented for the Linux, Windows and macOS operating systems: the Linux fix has already been released, although its details are not yet public, while for Windows we expect a fix to be included in one of the next “Patch Tuesday” updates.

Memory in a system is not freely accessible to programs; it is managed in software by the kernel and in hardware by the MMU (Memory Management Unit). The kernel has its own dedicated portion of memory, accessible only to the kernel itself and entirely separate from the portion dedicated to user programs (so-called user space). Likewise, code runs in one of two processor modes (kernel mode and user mode).

This separation exists because of the problems that arose once operating systems began running several programs at the same time. The kernel manages the system and is, to use a metaphor, like an unseen and unknowable deity that dictates the life and death of the whole system; a program that could somehow influence the kernel would potentially control the whole system.

The problem that has emerged in Intel processors concerns paging: memory is divided into pages, usually of 4096 bytes (i.e. 1024 32-bit words or 512 64-bit words), and there is no clear separation between the kernel’s page tables and those of user processes.
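
The page arithmetic above can be checked with a short sketch. It assumes the usual 4096-byte page; the function names are chosen here purely for illustration and do not come from any operating system.

```python
PAGE_SIZE = 4096  # bytes per page, the usual size cited above


def split_address(virtual_address: int) -> tuple[int, int]:
    """Return (page_number, offset_within_page) for a virtual address."""
    return virtual_address // PAGE_SIZE, virtual_address % PAGE_SIZE


def words_per_page(word_bits: int) -> int:
    """Return how many words of the given width fit in one page."""
    return PAGE_SIZE // (word_bits // 8)


print(words_per_page(32))      # 1024
print(words_per_page(64))      # 512
print(split_address(0x12345))  # (18, 837), i.e. page 0x12, offset 0x345
```

The page number is what gets looked up in the page tables; the offset is carried over unchanged into the physical address.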

Normally this sharing is safe because hardware protections exist, such as the processor’s distinction between kernel mode and user mode, and so the cache of page translations (the TLB, or translation lookaside buffer) is not completely flushed. The reason for this choice is speed: since the kernel is invoked often (for example, every time a file is accessed), the time saved is considerable.
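
As a rough picture of that arrangement, the toy model below keeps kernel and user pages in a single table and relies on a privilege check to protect the kernel entries. The page numbers, frame numbers and names are hypothetical, and real page tables are multi-level hardware structures, not Python dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageEntry:
    frame: int        # physical frame the page maps to
    supervisor: bool  # True if only kernel mode may access the page


# One table holds both user and kernel mappings (hypothetical values).
PAGE_TABLE = {
    0x10: PageEntry(frame=0x2A, supervisor=False),  # user page
    0xF0: PageEntry(frame=0x03, supervisor=True),   # kernel page
}


def translate(page: int, kernel_mode: bool) -> int:
    """Return the physical frame for a page, enforcing the privilege check."""
    entry = PAGE_TABLE[page]
    if entry.supervisor and not kernel_mode:
        raise PermissionError(f"user-mode access to kernel page {page:#x}")
    return entry.frame
```

In this model the kernel mapping is always present in the table a user process runs with; the only thing standing between the process and the kernel page is the check inside `translate`.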

At a context switch, that is, when control passes from one program to another (in this case from the kernel to a user process), the page translation cache is not flushed and refilled. This is because a context switch is one of the most time-consuming operations for a processor, owing to the slowness of memory. Because of a bug whose details have not yet been disclosed, user processes are able to see the kernel’s pages.

This means they can read the contents of the kernel’s memory, learn the addresses at which it is mapped (since an address in virtual memory does not correspond to the same address in physical memory) and, more generally, reach data they should not be able to access.

This makes it possible to mount targeted attacks that exploit more or less known kernel vulnerabilities but require knowledge of, for example, physical addresses. It also undermines the privacy of the data held by the system.

According to AMD, the bug consists of a missing security check in the way the processor executes code: the processor executes code speculatively, running instructions before they are actually needed. If the check on the current privilege level is missing (that is, on whether a user program is running) and code is run in kernel mode, the situation described above arises. At the moment too few details are available to fully understand what happens, and we are working from hypotheses, so this conclusion should be taken with a grain of salt.

The fix operates at the operating system level by implementing Kernel Page Table Isolation (KPTI), which makes the kernel invisible to the processes running on the CPU: the page translation cache is then flushed and reloaded at each context switch, which significantly increases the overhead of every call into the kernel.
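
To see why flushing at every switch is costly, consider the toy simulation below. It is only a sketch: the cache is unbounded, each change of mode stands in for a kernel entry or exit, and the workload is invented for illustration. It does not model any real processor.

```python
def count_page_walks(accesses, flush_on_switch: bool) -> int:
    """Count translations that miss the toy cache.

    `accesses` is a sequence of (mode, page) pairs; a change of mode
    between consecutive accesses stands in for a kernel entry or exit.
    """
    cached = set()
    walks = 0
    previous_mode = None
    for mode, page in accesses:
        if flush_on_switch and previous_mode is not None and mode != previous_mode:
            cached.clear()  # isolation: drop every cached translation
        if page not in cached:
            walks += 1      # miss: the page tables must be walked again
            cached.add(page)
        previous_mode = mode
    return walks


# Hypothetical workload: a program touching one page and making 1000 system calls.
workload = [("user", 1), ("kernel", 2)] * 1000

print(count_page_walks(workload, flush_on_switch=False))  # 2
print(count_page_walks(workload, flush_on_switch=True))   # 2000
```

Without flushing, each translation is looked up once and then reused; with a flush at every switch, every access pays for a fresh lookup, which is why programs that call the kernel often are hit hardest.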

A software fix is necessary because the bug itself can only be corrected in hardware, which will happen with the future processor versions that Intel brings to market. The software solution, however, has an obvious drawback: according to the information currently available, it reduces performance by at least 5% and by as much as 30-35%. The actual figure varies greatly with the type of application and could be well below the 30% estimate, but it is clear that the patch has negative repercussions on performance.
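
To put those percentages in concrete terms, the sketch below applies them to a made-up baseline. It assumes the penalty acts as a straight reduction in throughput, which the available information does not confirm; the baseline figure is hypothetical.

```python
def patched_throughput(baseline: float, penalty_percent: float) -> float:
    """Return throughput after a percentage performance penalty."""
    return baseline * (100 - penalty_percent) / 100


BASELINE = 10_000  # hypothetical requests per second before the patch

for penalty in (5, 30, 35):
    print(penalty, patched_throughput(BASELINE, penalty))
# 5 9500.0
# 30 7000.0
# 35 6500.0
```

Under this assumption, a server handling 10,000 requests per second would drop to somewhere between 9,500 and 6,500 after the patch, depending on the workload.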

The practical consequence is that systems based on Intel CPUs currently face a clear performance penalty as the price of correcting the bug and preventing security problems. Considering the number of Intel-based server systems in the world, it’s easy to see how vast the scope of this bug is and how serious the risk of attacks it creates.

As reported, the bug is present in Intel processors, while AMD processors are not affected. According to AMD, its CPUs do not allow the memory references, including the speculative references mentioned above, that would trigger this bug. Currently the Linux kernel flags both Intel and AMD processors as potentially unsafe, forcing the fix to be enabled on all CPUs and therefore hurting performance on AMD processors too. The American company currently recommends not enabling this patch, as it is not required for its architectures.

We will see in the coming days what further information emerges and whether Intel takes an official position. It will also be important to evaluate the performance impact in typical usage scenarios, on both consumer and server systems.