To get familiar with configuring a zfs-based Lustre filesystem, it is helpful to set up a small single-node filesystem. This configuration is designed to illustrate several of the available options. If you're not already familiar with configuring a traditional ldiskfs-based Lustre filesystem, you should review the Lustre documentation first. The filesystem will be named lustre and will consist of one MGS, one MDT, and two OSTs.
The first thing you should do is ensure SELinux is disabled. The filesystem types ldiskfs, zfs, and lustre are not part of the default SELinux policy. This can cause mount failures because SELinux is unaware that these filesystems support xattrs. To disable SELinux, edit /etc/selinux/config and change the SELINUX line to SELINUX=disabled. Then reboot your system to remove the existing policy.
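The edit described above can be scripted. The following is a minimal sketch, assuming GNU sed and a config file in the usual `KEY=value` layout; the function name `disable_selinux` is made up for this example.

```shell
# Rewrite the SELINUX= line in the given config file to "disabled".
# The pattern is anchored, so SELINUXTYPE= and comment lines are left alone.
# (disable_selinux is a hypothetical helper name, not part of any package.)
disable_selinux() {
    config_file="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Typical use, as root, followed by a reboot to drop the loaded policy:
#   disable_selinux /etc/selinux/config
#   reboot
```

Running the function against a copy of the file first is a cheap way to check the result before touching the real configuration.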
Next, configure the required Lustre services. This process differs slightly from the one described in the official Lustre documentation. Aside from the zfs-specific changes, these packages contain Livermore's init scripts, which provide a traditional interface for starting and stopping services. They are also integrated with the Linux Heartbeat package to provide automatic Lustre failover. While you do not have to use the init scripts, this example does.
Create a 128 MiB MGS, a 128 MiB MDT, and two 1 GiB OSTs. To specify zfs-based services, use the --backfstype=zfs option to override the ldiskfs default. When testing, you can also include the --vdev-size option to automatically create sparse files for vdevs, which avoids the need to set up traditional loopback devices. For MDT and OST services you will also need to explicitly set the service index and the NID of the MGS host. Finally, you must list the dataset name and the desired zfs pool configuration.
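As a rough sketch of what those four format commands might look like, the script below prints them by default and only runs them when `DRY_RUN=0` is set. Several values are assumptions made for illustration: the MGS NID `192.168.1.100@tcp`, the pool and dataset names, the `/tmp` vdev file paths, and the `--vdev-size` values, which assume the option takes MiB. Check `mkfs.lustre --help` on your build before relying on any of them.

```shell
# Print each command by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Placeholder values (assumptions, not taken from the original text).
MGS_NID="192.168.1.100@tcp"   # hypothetical NID of the MGS host
VDEV_DIR="/tmp"               # hypothetical directory for sparse vdev files

format_targets() {
    # MGS: needs neither an index nor an MGS NID.
    # --vdev-size values assume MiB; verify the unit for your version.
    run mkfs.lustre --mgs --backfstype=zfs --vdev-size=128 \
        lustre-mgs/mgs "$VDEV_DIR/lustre-mgs"

    # MDT: explicit service index plus the NID of the MGS host.
    run mkfs.lustre --mdt --backfstype=zfs --vdev-size=128 \
        --fsname=lustre --index=0 --mgsnode="$MGS_NID" \
        lustre-mdt0/mdt0 "$VDEV_DIR/lustre-mdt0"

    # Two OSTs, each in its own pool, indexed 0 and 1.
    for i in 0 1; do
        run mkfs.lustre --ost --backfstype=zfs --vdev-size=1024 \
            --fsname=lustre --index="$i" --mgsnode="$MGS_NID" \
            "lustre-ost$i/ost$i" "$VDEV_DIR/lustre-ost$i"
    done
}

format_targets
```

In each command the last two arguments are the `pool/dataset` name and the pool configuration, here a single file vdev.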
This example uses a separate zfs pool and dataset for each service. While all the services can be created in a single zfs pool, doing so causes the available free space to be overreported because, just like normal zfs filesystems, each dataset may use any available free space in the pool.
Now create a file called /etc/ldev.conf. It is used by the init scripts and the failover infrastructure to control the Lustre services. At a minimum, this file must contain the hostname where the service should run, the service label, and the ldiskfs block device or zfs dataset name. Make sure you update this file with your hostname.
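A possible /etc/ldev.conf for this single-node layout can be generated as sketched below. It assumes the column order local hostname, foreign hostname (`-` when there is no failover partner), label, and device, with a `zfs:` prefix on dataset names. The labels and dataset names are illustrative and must match whatever you used when formatting the services; `generate_ldev_conf` is a hypothetical helper.

```shell
# Print ldev.conf entries for the four services on a single host.
# Assumed columns: local host, foreign host ("-" for none), label, device.
# Labels and dataset names below are illustrative placeholders.
generate_ldev_conf() {
    host="$1"
    cat <<EOF
$host - MGS zfs:lustre-mgs/mgs
$host - lustre-MDT0000 zfs:lustre-mdt0/mdt0
$host - lustre-OST0000 zfs:lustre-ost0/ost0
$host - lustre-OST0001 zfs:lustre-ost1/ost1
EOF
}

# Typical use, as root:
#   generate_ldev_conf "$(uname -n)" > /etc/ldev.conf
generate_ldev_conf "$(uname -n)"
```

Generating the file from the local hostname avoids the common mistake of copying an example verbatim and leaving someone else's hostname in place.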
Everything is now in place to start the Lustre services. A word of caution before starting them: the Lustre Orion code still requires some polish and you may encounter a few bugs. It is not currently considered ready for production use. Start the MGS first, then the MDT, and lastly the OSTs. Once everything is running, the Lustre filesystem can be mounted.
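The start order and the client mount might be scripted roughly as follows. This sketch assumes the init script is installed as the `lustre` service and accepts a service label as its last argument, which you should confirm against your installed script. The labels, the MGS NID `192.168.1.100@tcp`, and the mount point `/mnt/lustre` are placeholders. As written it only prints the commands; set `DRY_RUN=0` to run them as root.

```shell
# Print each command by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Placeholder values (assumptions, not taken from the original text).
MGS_NID="192.168.1.100@tcp"   # hypothetical NID of the MGS host
MOUNT_POINT="/mnt/lustre"     # hypothetical client mount point

start_and_mount() {
    # Start order matters: MGS first, then the MDT, then the OSTs.
    # Passing a label to the init script is an assumption; check its usage.
    for label in MGS lustre-MDT0000 lustre-OST0000 lustre-OST0001; do
        run service lustre start "$label"
    done

    # Mount the filesystem named "lustre" as a client via the MGS NID.
    run mkdir -p "$MOUNT_POINT"
    run mount -t lustre "$MGS_NID:/lustre" "$MOUNT_POINT"
}

start_and_mount
```

The mount source has the form `<MGS NID>:/<fsname>`, so a different filesystem name or MGS address changes only that one argument.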