Fun with OpenSolaris snv_107 and Nvidia drivers

I’ve been eagerly awaiting an upgrade from OpenSolaris 2008.11 that would provide working WPA2 PSK for my laptop, as my laptop is the easiest system for operating system experimentation. I’ve heard stories of success with snv_107, so I decided to give it a shot.

Somehow the upgrade from snv_101b to snv_107 was quirky: Xorg didn’t work, and ‘pkg fix’ reported a large number of problems and crashed repeatedly. I decided a fresh installation was called for.

One of the first things I do on new Linux or Solaris workstations is install the closed-source Nvidia drivers. On OpenSolaris, as opposed to my usual OS choice (Fedora), they come built in, and apparently are sometimes even slightly modified to work with the latest build. I ran nvidia-xconfig to get a simple configuration in place, and then restarted gdm.

Instead of Gnome, I was greeted by a white console screen. I logged in and met my first challenge, a relatively easy one:

(==) Log file: "/var/log/Xorg.0.log", Time: Tue Feb 24 02:44:59 2009
(==) Using config file: "/etc/X11/xorg.conf"
Parse error on line 12 of section Files in file /etc/X11/xorg.conf
        "RgbPath" is not a valid keyword in this section.
(EE) Problem parsing the config file
(EE) Error parsing the config file

Fatal server error:
no screens found

The simple fix is to comment out the RgbPath line in the “Files” section of /etc/X11/xorg.conf.
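For anyone who would rather script that edit than open an editor, here is a minimal sketch. It assumes a POSIX-compatible sed (one that understands [[:space:]]) and that the keyword is spelled exactly "RgbPath", as in my file. The function name is my own invention, and it writes to stdout rather than touching the original file.

```shell
# Sketch: comment out any RgbPath line in an xorg.conf-style file.
# Reads the file named in $1 and writes the edited text to stdout,
# so the original file is never modified in place.
# Assumes a POSIX sed and the exact spelling "RgbPath".
comment_out_rgbpath() {
    sed 's/^\([[:space:]]*\)\(RgbPath[[:space:]]\)/\1# \2/' "$1"
}

# Example use (keep a backup first):
#   cp /etc/X11/xorg.conf /etc/X11/xorg.conf.orig
#   comment_out_rgbpath /etc/X11/xorg.conf.orig > /etc/X11/xorg.conf
```

Working from a backup copy means the untouched file is still around if the result doesn’t parse.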

A quick ‘svcadm restart gdm’ presented the next challenge:

(EE) NVIDIA(0): Failed to initialize the GLX module; please check in your X
(EE) NVIDIA(0):     log file that the GLX module has been loaded in your X
(EE) NVIDIA(0):     server, and that the module is the NVIDIA GLX module.  If
(EE) NVIDIA(0):     you continue to encounter problems, Please try
(EE) NVIDIA(0):     reinstalling the NVIDIA driver.
(EE) NVIDIA(0): Failed to initialize the NVIDIA graphics device PCI:1:0:0.

I pursued this one for a while before I came up with the solution. Apparently the ogl-select service “determines at boot time which vendor supplied OpenGL headers and libraries will be used. The selection of the OpenGL vendor should be automatic and in most cases will not require any configuration.” Except this time it wasn’t automatic. There is a manual way to set the vendor:

/lib/opengl/ogl_select/nvidia_vendor_select

I restarted gdm again and found that the GLX module now loaded nicely, but I still received the same “Failed to initialize the NVIDIA graphics device” message. It seemed like an opportune time to upgrade to the latest Nvidia drivers – 180.29.

The drivers installed flawlessly – and much faster than on Linux, since drivers don’t have to be compiled against a specific kernel. I tempted fate again with a gdm restart, only to fail again:

(II) Loading /usr/X11/lib/modules/amd64//libwfb.so
dlopen: ld.so.1: Xorg: fatal: relocation error: file /usr/X11/lib/modules/amd64//libwfb.so: symbol miZeroLineScreenIndex: referenced symbol not found
(EE) Failed to load /usr/X11/lib/modules/amd64//libwfb.so
(II) UnloadModule: "wfb"
(EE) Failed to load module "wfb" (loader failed, 7)
.
.
.
(II) NVIDIA(0): NVIDIA 3D Acceleration Architecture Initialized
(EE) NVIDIA(0): Need libwfb but wfbScreenInit not found

Fatal server error:
AddScreen/ScreenInit failed for driver 0

I suspected that I needed to replace /usr/X11/lib/modules/amd64/libwfb.so and /usr/X11/lib/modules/libwfb.so with the original versions from SUNWxorg-server, as the Nvidia versions appear incompatible with this version of Xorg. Of course, these files were now gone, and of course I had neglected to take a ZFS snapshot.

I decided to try out the ‘fix’ feature of pkg. I simply ran ‘pkg fix SUNWxorg-server’ and it correctly identified these two files (and only these two files) as needing replacement. It downloaded them and completed patching in under 30 seconds. My final gdm restart was successful; X started up, and nvidia-settings confirmed the optimized drivers were installed.
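In hindsight, a gentler sequence is to check first and repair only if something is actually wrong. The sketch below wraps ‘pkg verify’ and ‘pkg fix’ in a small function. It assumes ‘pkg verify’ exits non-zero when it finds problems; the function name and the PKG_CMD override (there only so the flow can be dry-run without touching the system) are hypothetical.

```shell
# verify_then_fix PACKAGE
# Runs "pkg verify" first; only if verification reports problems
# (non-zero exit status, an assumption about pkg's behaviour) does
# it go on to run "pkg fix" for the same package.
# PKG_CMD is a hypothetical override for dry runs; it defaults to pkg.
verify_then_fix() {
    pkg_cmd=${PKG_CMD:-pkg}
    if $pkg_cmd verify "$1"; then
        echo "$1: no problems found"
    else
        $pkg_cmd fix "$1"
    fi
}

# Example use (as root):
#   verify_then_fix SUNWxorg-server
```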

Points scored by OpenSolaris over Linux:

* I don’t miss Linux’s annoying requirement of compiling drivers against specific kernel versions.

* I appreciate that OpenSolaris, like many Linux distros, values end-user experience over open source purity and delivers closed-source Nvidia drivers. I’m certainly tired of performing the Nvidia shuffle on Fedora/RHEL.

* pkg fix is nice – downloading only the delta between what should have been installed and what was actually installed saved time.

Points scored by Linux over OpenSolaris:

* There’s much more information available on debugging Linux problems than Solaris problems, in sharp contrast to 10+ years ago when I was a full time Solaris junkie.

* WPA2 is flaky. In fact, it may be wireless in general, but on my laptop it randomly flips between trying to use an unplugged wired port and the wireless adapter.

Points scored by OpenSolaris over me:

* Not taking a manual snapshot before installing the Nvidia drivers. ZFS makes it so simple, there’s not much of an excuse. OpenSolaris actually takes snapshots for you when you run ‘pkg fix’ – very nice.
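For next time, here is a rough sketch of the snapshot I should have taken. It only prints the zfs command rather than running it, so it is safe to try anywhere. The function name is made up, and the dataset name rpool/ROOT/opensolaris is an assumption about a default install; check yours with ‘zfs list’.

```shell
# snapshot_cmd DATASET LABEL
# Prints the zfs command that would recursively snapshot DATASET,
# naming the snapshot LABEL plus today's date. Printing instead of
# running keeps this a dry run.
snapshot_cmd() {
    echo "zfs snapshot -r $1@$2-$(date +%Y%m%d)"
}

# The dataset name below is an assumption, not taken from my system.
snapshot_cmd rpool/ROOT/opensolaris pre-nvidia
```

Once the printed command looks right, run it as root. If the driver install goes wrong, ‘zfs rollback’ on that snapshot should bring the old files back.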