In my last post I mentioned that I recently had a hardware failure that took down my server. I needed to get it back up and running again ASAP, but due to a large number of complications I was unable to get the original hardware up and running again, nor could I get any of the three other systems I had at my disposal to work properly. Seriously, it was like Murphy himself had taken up residence here. In the end, rather desperate and out of options, I turned to Xen (for those unfamiliar with it, it's similar to VMware or VirtualBox, but highly geared towards servers). I'd recently had quite a bit of experience getting Xen running on another system, so I felt it'd be a workable, albeit temporary, solution to my problem.
Unfortunately, the only working system I had suitable for this was my desktop, and while the process of installing Xen and migrating the server to a Xen guest was successful (this site is currently on that Xen instance), it was not without its drawbacks. For one thing, there's an obvious performance hit on my desktop while running under Xen concurrently with my server guest, though fortunately my desktop is powerful enough that this mostly isn't an issue (except when the guest accesses my external USB drive to back up files; for some reason that consumes all available CPU for about 2 minutes and kills performance on the host). There were a few other minor issues, but by far the biggest problem was that the binary nVidia drivers would not install under Xen. Yes, the open source 'nv' driver would work, but that had a number of problems/limitations:
In fairness, issues 1 and 2 are a direct result of nVidia not providing adequate specifications for proper driver development. Nonetheless, I want my hardware to actually work, so the performance was not acceptable. Issue 3 was a major problem as well, as I have two monitors and use both heavily while working. I can only assume that this is due to a bug in the nv driver for the video card I'm using (a GeForce 8800 GTS), as dual monitors should be supported by this driver. It simply wouldn't work, though. Issue 4 wasn't that significant, but it did require quite a bit of time to rework it, which was ultimately pointless anyway due to issue 3.
So, with all that said, I began my quest to get the binary nVidia drivers working under Xen. Some basic searches showed that this was possible, but in every case the referenced material was written for much older versions of Xen, the Linux kernel, and/or the nVidia driver. I tried several different suggestions and patches, but none would work. I actually gave up, but then a few days later I got so fed up with performance that I started looking into it again and trying various combinations of suggestions. It took a while, but I finally managed to hit on the special sequence of commands necessary to get the driver to compile AND load AND run under X. Sadly, the end result is actually quite easy to do once you know what needs to be done, but figuring it out sure was a bitch. So, I wanted to post the details here to hopefully save some other people a lot of time and pain should they be in a similar situation.
This guide was written with the following system specs in mind:
Version differences shouldn't be too much of an issue; however, a lot of this is Gentoo-specific. If you're running a different distribution, you may be able to modify this technique to suit your needs, but I haven't tested it myself (if you do try and have any success, please leave a comment to let others know what you did). The non-Xen kernel should typically be left over from before you installed Xen on your host; if you don't have anything else installed, however, you can do a simple emerge gentoo-sources to install it. You don't need to run it, just build against it.
Once everything is in place, and you're running the Xen-enabled (xen-sources) kernel, I suggest uninstalling any existing binary nVidia drivers with emerge -C nvidia-drivers. I had a version conflict when trying to start X at one point as the result of some old libraries not being properly updated, so this is just to make sure that the system's in a clean state. Also, while you can do most of this from within X using the nv driver, I suggest logging out of X entirely before the modprobe line.
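Before starting, it can help to confirm which kernel is actually running. The following is only a sketch: it assumes that a Xen-enabled kernel has "xen" somewhere in its release string, which holds for the kernel names mentioned on this page but is not an official way to detect Xen. The helper name is_xen_kernel is made up for illustration.

```shell
#!/bin/sh
# Sketch only: the "xen" substring test is an assumption based on the kernel
# names seen in this post, not an official way to detect Xen.

# is_xen_kernel RELEASE: succeed if the release string looks Xen-enabled.
is_xen_kernel() {
    case "$1" in
        *xen*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Report on the running kernel without changing anything on the system.
release=$(uname -r)
if is_xen_kernel "$release"; then
    echo "Running Xen-enabled kernel: $release"
else
    echo "Kernel $release does not look Xen-enabled; boot into xen-sources first"
fi
```

The check only prints a message, so it is safe to run at any point in the process.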
Here's the step-by-step guide:
uname -r to verify the version of your currently running Xen-enabled kernel; e.g., mine's 2.6.21-xen
cd /usr/src/ && ls -l
ln -sfn linux-2.6.24-gentoo-r8 linux
emerge -av nvidia-drivers
emerge -f nvidia-drivers (look for the NVIDIA-Linux-* line)
bash /usr/portage/distfiles/NVIDIA-Linux-x86_64-173.14.09-pkg2.run -a -x
cd NVIDIA-Linux-x86_64-173.14.09-pkg2/usr/src/nv/
IGNORE_XEN_PRESENCE=y make SYSSRC=/lib/modules/`uname -r`/build module
mkdir /lib/modules/`uname -r`/video
cp -i nvidia.ko /lib/modules/`uname -r`/video/
depmod -a
modprobe nvidia
startx or /etc/init.d/xdm start
Assuming all went well, you should now have a fully functional and accelerated desktop environment, even under a Xen dom0 host. W00t. If not, feel free to post a comment and I'll try to help if I can. You should also hit up the Gentoo Forums, where you can get help from people far smarter than I.
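The manual module build and install steps above can be collected into a small dry-run helper. This is a sketch, not a tested installer: print_module_steps is a hypothetical name, it only prints the commands rather than running them, and it uses mkdir -p where the steps above use plain mkdir. It assumes the installer extracts into a directory named after the .run file, as it does in the steps above.

```shell
#!/bin/sh
# Dry-run sketch: prints the module build/install commands instead of running
# them. print_module_steps is a hypothetical helper, not part of any package.

# print_module_steps INSTALLER KERNEL_RELEASE
print_module_steps() {
    installer=$1
    release=$2
    # Assumption: -x extracts into a directory named after the .run file,
    # minus the extension, matching the paths used in the guide.
    srcdir=$(basename "$installer" .run)/usr/src/nv
    moddir=/lib/modules/$release/video

    echo "bash $installer -a -x"
    echo "cd $srcdir"
    echo "IGNORE_XEN_PRESENCE=y make SYSSRC=/lib/modules/$release/build module"
    echo "mkdir -p $moddir"
    echo "cp -i nvidia.ko $moddir/"
    echo "depmod -a"
    echo "modprobe nvidia"
}

# Show the commands for the running kernel and the driver version used here.
print_module_steps /usr/portage/distfiles/NVIDIA-Linux-x86_64-173.14.09-pkg2.run "$(uname -r)"
```

Printing the commands first makes it easy to check the paths for a different driver version or kernel before running anything as root.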
I really hope this helps some people out. It was a royal pain in the rear to get this working, but believe me, it makes a world of difference when using the system.
Comments
Re: Running Binary nVidia Drivers under Xen Host
Hi,
A question:
Why do I need to install the non-Xen kernel? Is this only to be able to properly install the nvidia driver using its setup script?
I'm using openSuSE 10 x64 with an almost recent kernel (2.6.25.4) and currently without Xen.
According to your write-up, the nvidia driver and Xen work together (and the driver compiles fine under Xen). The last I had heard, this setup was only possible with an old, patched nvidia driver (with several performance and stability problems).
Thanks ahead!
PS: sorry for my bad english
- Simon
Re: Running Binary nVidia Drivers under Xen Host
There are two parts to the binary driver package:
While the kernel module will indeed build against the Xen kernel (provided the appropriate CLI options are used, as discussed above), I was unable to get the necessary libraries installed using the Xen kernel. It might be possible to do this, but I don't know how. For me, it was easier to let my package manager (Portage, for Gentoo) install the package. This would only install when I'm using the non-Xen kernel. After that was installed, I could then switch back to the Xen kernel and manually build/install the kernel module.
Of course, as I mentioned above, this was done on a Gentoo system. Other distributions behave differently, and I'm not sure what may be involved in getting the binary drivers set up correctly on them. If you have any luck, though, please consider posting your results here for the benefit of others.
Good luck.
--
http://www.legroom.net/
Re: Running Binary nVidia Drivers under Xen Host
I have it working on CentOS 5.2 with a Xen kernel as well, thanks to this I have TwinView available again:
[root@mythtv ~]# dmesg | grep NVRM
NVRM: loading NVIDIA UNIX x86_64 Kernel Module 100.14.19 Wed Sep 12 14:08:38 PDT 2007
NVRM: builtin PAT support disabled, falling back to MTRRs.
NVRM: bad caching on address 0xffff880053898000: actual 0x77 != expected 0x73
NVRM: please see the README section on Cache Aliasing for more information
NVRM: bad caching on address 0xffff880053899000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389a000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389b000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389c000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389d000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389e000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff88005389f000: actual 0x77 != expected 0x73
NVRM: bad caching on address 0xffff8800472f4000: actual 0x67 != expected 0x63
NVRM: bad caching on address 0xffff880045125000: actual 0x67 != expected 0x63
[root@mythtv ~]# uname -r
2.6.18-92.1.13.el5xen
[root@mythtv ~]#
Now see if I can fix the bad caching errors... and see if I can run a dom host.
Thanks heaps!
Re: Running Binary nVidia Drivers under Xen Host
Can you explain how you got that to work? I'm still getting an error on the modprobe step.
[root@localhost ~]# modprobe nvidia
nvidia: disagrees about version of symbol struct_module
FATAL: Error inserting nvidia (/lib/modules/2.6.18-92.el5xen/kernel/drivers/video/nvidia.ko): Invalid module format
Any ideas anyone?
Re: Running Binary nVidia Drivers under Xen Host
I have the following kernel related packages installed and am compiling some older drivers (100.14.19) as they work for my card in non-xen kernels as well:
[root@mythtv ~]# rpm -qa kernel* | grep $(uname -r | sed -e 's/xen//') | sort
kernel-2.6.18-92.1.18.el5
kernel-devel-2.6.18-92.1.18.el5
kernel-headers-2.6.18-92.1.18.el5
kernel-xen-2.6.18-92.1.18.el5
kernel-xen-devel-2.6.18-92.1.18.el5
[root@mythtv ~]#
I am booted into the xen kernel:
[root@mythtv ~]# uname -r
2.6.18-92.1.18.el5xen
[root@mythtv ~]#
I already have my source extracted like explained in the article and navigated to it. Inside the ./usr/src/nv folder of the source tree I issue the following command (from the article as well) which starts compiling:
[root@mythtv nv]# IGNORE_XEN_PRESENCE=y make SYSSRC=/lib/modules/`uname -r`/build module
Above command should start compilation. After compilation I copy the driver to my lib tree:
[root@mythtv nv]# mkdir -p /lib/modules/`uname -r`/kernel/drivers/video/nvidia/
[root@mythtv nv]# cp -i nvidia.ko /lib/modules/`uname -r`/kernel/drivers/video/nvidia/
Then to load the driver:
[root@mythtv ~]# depmod -a
[root@mythtv ~]# modprobe nvidia
To see if it was loaded I issue this command:
[root@mythtv ~]# dmesg | grep NVIDIA
which in my case outputs this:
[root@mythtv ~]# dmesg |grep NVIDIA
nvidia: module license 'NVIDIA' taints kernel.
NVRM: loading NVIDIA UNIX x86_64 Kernel Module 100.14.19 Wed Sep 12 14:08:38 PDT 2007
[root@mythtv ~]#
I do not worry about the tainting of the kernel, as it seems to work pretty well for me. The same goes for this message:
[root@mythtv nv]# dmesg |grep NVRM
NVRM: loading NVIDIA UNIX x86_64 Kernel Module 100.14.19 Wed Sep 12 14:08:38 PDT 2007
NVRM: builtin PAT support disabled, falling back to MTRRs.
[root@mythtv nv]#
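The "Invalid module format" error reported earlier in this thread is commonly what modprobe prints when a module was built against a different kernel than the one that is running. That diagnosis is an assumption here; nothing in the thread confirms it. A rough way to check is to compare the module's vermagic string with uname -r. In the sketch below, vermagic_matches is a hypothetical helper, and the check only runs if nvidia.ko is in the current directory.

```shell
#!/bin/sh
# vermagic_matches VERMAGIC RELEASE: succeed if the first word of a module's
# vermagic string equals the given kernel release. Hypothetical helper.
vermagic_matches() {
    # ${1%% *} strips everything from the first space onward.
    [ "${1%% *}" = "$2" ]
}

# Only inspect the module if it is present and modinfo is available.
if [ -f nvidia.ko ] && command -v modinfo >/dev/null 2>&1; then
    magic=$(modinfo -F vermagic nvidia.ko)
    if vermagic_matches "$magic" "$(uname -r)"; then
        echo "nvidia.ko was built for the running kernel"
    else
        echo "Mismatch: module says '$magic', running kernel is $(uname -r)"
    fi
fi
```

If the two disagree, rebuilding with SYSSRC pointing at the build tree of the running Xen kernel (for example, after installing a kernel-xen-devel package that matches uname -r exactly) would be the first thing to try.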
Re: Running Binary nVidia Drivers under Xen Host
A really interesting article you created here -- if there were not (I hope) a typo that destroys everything:
The last paragraph reads:
"Assuming all went well, you should not have a fully functional ..."
The word "not" is disturbing me, and I have some hope that it should be a "now", as that would make sense with all your efforts.
Can you please comment on this issue?
Thanks
Re: Running Binary nVidia Drivers under Xen Host
Oops. Yeah, that was a typo. I guess it would pretty much defeat the purpose of going through this exercise, considering it's already not working at the start. :-)
Corrected now - thanks for pointing it out.
--
http://www.legroom.net/
Re: Running Binary nVidia Drivers under Xen Host
Xcellent! Works great for me... Now I have to choose between VMWare, VirtualBox and Xen...
Re: Running Binary nVidia Drivers under Xen Host
works with 173.14.12 drivers too! ;)
Re: Running Binary nVidia Drivers under Xen Host
Thanks for the solution - it works like a charm.
Fedora Core 8, kernel 2.6.21.7-5, XEN 3.1.2-5 and latest nvidia driver (173.14.12).
Re: Running Binary nVidia Drivers under Xen Host
Hm.. strange - I wasn't able to get this to work with the newest version of the Nvidia drivers. It says something along the lines of "will not install to Xen-enabled kernel." Darned Nvidia - serves me right.. I ought've gotten me an ATI card!
Re: Running Binary nVidia Dr...==> it wont build (HELP!)
openSUSE 11.0 with linux-2.6.27 (it's a fresh install and I don't remember the exact kernel version; I'm under Windows right now), Leadtek Winfast 9600GT
For me it doesn't work. It won't build with: IGNORE_XEN_PRESENCE=y make SYSSRC=/lib/modules/`uname -r`/build module. It says something like this kernel is not supported, the same error as the setup, nothing about xen though.
I need xen for studying purposes on my desktop pc, and running it without drivers is not an option as the cooler is blowing at full speed.
Re: Running Binary nVidia Dr...==> it wont build (HELP!)
I think you need to have kernel headers installed in order to build the module. I'm not sure what you'd need to install in OpenSUSE, but query your package repository for kernel-headers, linux-headers, etc. and see if you can find something that matches your specific kernel.
--
http://www.legroom.net/
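A quick way to see whether the headers for the running kernel are in place is to look at the build directory that the make command's SYSSRC points to. This is a minimal sketch: has_build_tree is a hypothetical helper, and checking only for a top-level Makefile is a simplifying assumption, since a tree can exist and still be unprepared.

```shell
#!/bin/sh
# has_build_tree DIR: succeed if DIR looks like a usable kernel build tree.
# Checking only for a top-level Makefile is a simplifying assumption.
has_build_tree() {
    [ -d "$1" ] && [ -f "$1/Makefile" ]
}

build_dir=/lib/modules/$(uname -r)/build
if has_build_tree "$build_dir"; then
    echo "Kernel build tree found at $build_dir"
else
    echo "No build tree at $build_dir; install the headers/devel package for $(uname -r)"
fi
```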
Re: Running Binary nVidia Dr...==> it wont build (HELP!)
Same here, with 2.6.27 (x86/64) and 177.80 step 8 fails ("Unable to determine kernel version").
Re: Running Binary nVidia Dr...==> it wont build (HELP!)
For OpenSuse 11 I got it working doing this
cd /usr/src/linux
make oldconfig && make scripts && make prepare
# Extract the source code from nvidia installer
sh NVIDIA-Linux-whateverversion-pkg2.run -a -x
cd NVIDIA-Linux-whateverversion-pkg2/usr/src/nv/
#build
IGNORE_XEN_PRESENCE=y make SYSSRC=/usr/src/linux module
#should have built a kernel module
cp nvidia.ko /lib/modules/`uname -r`/kernel/drivers/video/
cd /lib/modules/`uname -r`/kernel/drivers/video/
depmod -a
modprobe nvidia
glxinfo is showing direct rendering: yes
So it seems to be working.
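The glxinfo check mentioned above can be scripted as a final sanity test. This is a sketch that assumes glxinfo prints a line beginning with "direct rendering:" (as quoted in the comment above); has_direct_rendering is a hypothetical helper name, and the check is skipped if glxinfo is not installed.

```shell
#!/bin/sh
# has_direct_rendering: read glxinfo output on stdin and succeed if it
# reports direct rendering as enabled. Hypothetical helper name.
has_direct_rendering() {
    grep -qi '^direct rendering: *yes'
}

# Run the check only where glxinfo exists; it needs a running X session.
if command -v glxinfo >/dev/null 2>&1; then
    if glxinfo 2>/dev/null | has_direct_rendering; then
        echo "Direct rendering is enabled"
    else
        echo "Direct rendering is NOT enabled; check that the nvidia module loaded"
    fi
fi
```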