NVIDIA — Failed to allocate software rendering cache surface

I have been running a KVM guest as my desktop for years. It is a VM running within KVM, but it has a physical keyboard, mouse, and display; from my desk, you would not even know that it is a VM. Until recently, however, I could not upgrade my NVIDIA drivers or move to a kernel with the latest security patches, which was unsettling, to say the least. The problem was the error “Failed to allocate software rendering cache surface” when trying to start the X Window System.

Every attempt to upgrade the NVIDIA driver or the kernel ended in that same error. The NVIDIA DevTalk forums were less than helpful. However, I finally found a useful discussion on the Red Hat Bugzilla site, with hints that proved very helpful in solving the problem. You have to read quite a way down to reach the point where a solution is presented.

While I did not follow the solution exactly, I did follow the intent. The problem was caused by the Page Attribute Table (PAT) CPU feature not being passed through to the VM. On top of that, an earlier setup step had required passing “nopat” as a kernel option inside the virtual machine. The combination kept the NVIDIA driver from loading, with the “Failed to allocate software rendering cache surface” error.

So what was the solution? I could either change the entire virtual machine configuration, or change the type of CPU the VM was using and remove the “nopat” option from the kernel boot line. In the end, I went with a CPU type of SandyBridge. My underlying hardware is SandyBridge (yes, it is a bit old), so that was a safe thing to do. First, though, I made sure that “PAT” was available on the hypervisor host by using:

$ cat /proc/cpuinfo | grep pat 
... 
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 ...

This returned many lines (one flags line per logical CPU), so the feature was available on the host. Next, I needed to expose the feature to the virtual machine. It sounds simple. First, power off the VM. Then, bring up the Virtual Machine Manager application (or use virsh and edit the XML defining the VM directly). Finally, change the CPU type to SandyBridge.
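If you would rather get a yes/no answer than scan a wall of flags, the host check can be wrapped in a small function. This is only a sketch: the function name has_cpu_flag is my own invention, and it assumes a Linux-style /proc/cpuinfo with lines beginning with "flags".

```shell
# Hypothetical helper: succeeds if the given CPU flag appears in cpuinfo.
# $1: flag name (e.g. pat)
# $2: cpuinfo file to read (defaults to /proc/cpuinfo)
has_cpu_flag() {
    # Take the first "flags" line and look for the flag as a whole word,
    # so that "pat" does not match some longer flag name by accident.
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

# Example use on the host (or later, inside the guest):
#   if has_cpu_flag pat; then echo "PAT available"; else echo "PAT missing"; fi
```

Running the same check inside the guest after the CPU model change should tell you whether PAT actually made it through to the VM.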

Simply connect to the KVM hypervisor, select the VM in question, select the hardware button (looks like a lightbulb), select CPUs, and change the model of the CPU to SandyBridge. (In my case, there are other CPU types. If you have very modern hardware, such as Skylake, then you may be able to just copy the host CPU configuration. I could not do this, so I selected SandyBridge.)

This adds the cpu element shown below to the VM's XML definition, right after the closing features tag. While you are there, make sure your features element looks like the following as well:

...
<features>
    <acpi/>
    <apic/>
    <kvm>
      <hidden state='on'/>
    </kvm>
</features>
<cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>SandyBridge</model>
</cpu>
...
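If you take the virsh route, the definition can be opened with virsh edit and inspected with virsh dumpxml. As a rough sketch, the function below pulls the CPU model out of a domain definition so you can confirm the change took. The name cpu_model_from_xml and the domain name myvm are placeholders of mine, and the sed pattern assumes the model element sits on one line, as it does in the XML above.

```shell
# Hypothetical helper: print the CPU model named in a libvirt domain XML
# document read from standard input.
cpu_model_from_xml() {
    # Match <model ...>NAME</model> on a single line and print NAME.
    # Self-closing <model .../> elements under devices do not match.
    sed -n 's:.*<model[^>]*>\([^<]*\)</model>.*:\1:p' | head -n 1
}

# Example use (myvm is a placeholder domain name):
#   virsh dumpxml myvm | cpu_model_from_xml
```

With the configuration shown above, this should print SandyBridge.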

The next step was to boot the VM. Once the VM was booted, the graphics would not come up until I modified the grub2 config file to remove the “nopat” option. For kernel version 3.10.0-957, your linux16 line needs to look like the following:

...
linux16 /vmlinuz-3.10.0-957.1.3.el7.x86_64 root=/dev/mapper/centos_liberte-root ro nomodeset rd.lvm.lv=centos_liberte/root crashkernel=auto  vconsole.keymap=us vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos_liberte/swap rhgb quiet rdblacklist=nouveau LANG=en_US.UTF-8
...

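In effect, any line in /etc/grub2.cfg that contains the word “nopat” needs to have that word removed. If “nopat” also appears in GRUB_CMDLINE_LINUX in /etc/default/grub, remove it there too, or it will return the next time the configuration is regenerated. Once that is done, save the file and reboot your system.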
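Editing by hand works fine, but the removal can also be sketched as a small filter. The function name strip_nopat is mine, and the pattern only removes “nopat” when it stands alone as a whitespace-separated word, so longer option names stay untouched. On CentOS 7, grubby --update-kernel=ALL --remove-args=nopat should achieve the same thing, though I did not go that route.

```shell
# Hypothetical helper: copy standard input to standard output with the
# stand-alone kernel option "nopat" removed from every line.
strip_nopat() {
    # Drop "nopat" together with the whitespace before it, keeping
    # whatever followed it (a space, or the end of the line).
    sed -E 's/[[:space:]]+nopat([[:space:]]|$)/\1/g'
}

# Example use: write a cleaned copy, review it, then install it by hand.
#   strip_nopat < /etc/grub2.cfg > /tmp/grub2.cfg.new
#   diff /etc/grub2.cfg /tmp/grub2.cfg.new
```

Writing to a separate file first leaves the live boot configuration alone until you have looked at the diff.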

If the NVIDIA drivers are already installed, then graphics will miraculously start up on boot. If not, you will need to ensure the latest NVIDIA drivers from nvidia.com are properly installed. To install them, switch the VM from runlevel 5 (multi-user with graphics) to runlevel 3 (multi-user text mode), run the installer, and then either switch back to runlevel 5 or reboot.
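On a systemd-based release such as CentOS 7, runlevels 3 and 5 correspond to multi-user.target and graphical.target. A minimal sketch of the switch follows; the helper name runlevel_target is my own, and the installer step is left as a comment because its file name depends on the driver version you download.

```shell
# Hypothetical helper: map a SysV runlevel number to its systemd target.
runlevel_target() {
    case "$1" in
        3) echo multi-user.target ;;
        5) echo graphical.target ;;
        *) echo "unsupported runlevel: $1" >&2; return 1 ;;
    esac
}

# Example use, as root, from a text console:
#   systemctl isolate "$(runlevel_target 3)"   # drop to text mode
#   ... run the NVIDIA installer here ...
#   systemctl isolate "$(runlevel_target 5)"   # bring graphics back
```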

Packaged NVIDIA drivers are available as well, but I tend to use the ones direct from NVIDIA.

Voilà! Problem solved, after much research and digging. Always check which CPU features the host offers and expose as many as you can to the VM, and double-check that features such as PAT are among them. Later drivers may need them.
