01 August 2011
The venerable network file system is about to undergo another big change.
In my last installment I talked about Network File System (NFS) versions two and three. This month I would like to round out the discussion by highlighting NFS version four and what has become known as pNFS (part of NFSv4.1). I'll also add a short sidebar on NFS/RDMA.
Version 4 of NFS (NFSv4) was released in 2003. It kept most of the features of NFSv3 and added performance improvements and strong security. Most importantly, it brought file locking into the protocol itself to reduce file corruption (discussed previously). In NFSv2 and NFSv3, locking was handled by a separate protocol and daemon (nfslock on Linux) running on both the client and the server; in NFSv4, lock support is a mandatory part of the protocol. As a result, the server now needs to maintain the state of shared files (i.e., who can use them and who cannot). The protocol has some redundancy: if the server loses its state (lock information) in a restart, it can still recover by letting the clients reclaim the locks they held. Interestingly, the improved security, performance, and file locking have not fostered widespread adoption of NFSv4. The most popular version continues to be NFSv3.
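From an application's point of view, none of this changes how a lock is requested: a program uses the ordinary POSIX locking calls, and on an NFS mount the client forwards the request to the server. The following is a minimal sketch of that client-side view in Python, not anything specific to NFS. The helper name append_record and the file name are made up for illustration, and a local temporary file stands in for a file on an NFS mount.

```python
import fcntl
import os
import tempfile


def append_record(path, record):
    """Append one line to path while holding an exclusive advisory lock.

    append_record is a hypothetical helper for this example.
    """
    with open(path, "a") as f:
        # Blocks until no other process holds a conflicting lock on the file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    # Assumption: a temp directory stands in for an NFS-mounted directory.
    workdir = tempfile.mkdtemp()
    log_path = os.path.join(workdir, "shared.log")
    append_record(log_path, "first writer")
    append_record(log_path, "second writer")
    with open(log_path) as f:
        print(f.read(), end="")
```

These locks are advisory, so they only protect the file if every writer takes the lock before touching it.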
In recent years, new "kernel bypass" protocols have been created to improve network performance (i.e., instead of using kernel services such as TCP or UDP, the transfer happens directly). These protocols are generally called Remote Direct Memory Access (RDMA) and include iWARP (RDMA over Ethernet) and RDMA over InfiniBand. NFS performance can be improved by using kernel bypass protocols, and work on NFS/RDMA continues. The full client/server version was released in Linux kernel version 2.6.25.
Because NFS/RDMA works at the Remote Procedure Call (RPC) layer, NFSv2, NFSv3, and NFSv4 will all work without any major changes. But even with the many performance improvements available, NFS is still limited to a single server. That is, all data must go through one server, which is limited by how fast it can move data to and from the actual spinning disks.
To address the "single server issue," storage vendors and the Internet Engineering Task Force (IETF) have developed the NFSv4.1 standard, which includes what is often called pNFS, or parallel Network File System. As part of the new standard, pNFS allows clients to access raw storage devices directly and in parallel. The pNFS architecture removes the scalability and performance bottleneck of a single NFS server. This improvement is achieved by separating the actual raw data from the metadata (data that describes where the actual data is stored). By keeping the metadata on a separate server (or servers), the number of paths to the actual data drives can be increased. Essentially, the metadata needs to be global and consistent, but the actual data can be spread across a large number of storage devices. The pNFS design can be seen in Figure One.
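The split between metadata and data can be illustrated with a toy model. The sketch below is not the pNFS protocol or any real API: MetadataServer, DataServer, get_layout, the round-robin placement, and the tiny stripe size are all assumptions made for illustration. It only shows the general idea that a client asks the metadata server once where a file's pieces live, then fetches those pieces from several storage devices in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny so the example is easy to follow


class DataServer:
    """Hypothetical storage device holding stripe units keyed by (file, index)."""

    def __init__(self):
        self.units = {}

    def write_unit(self, name, index, chunk):
        self.units[(name, index)] = chunk

    def read_unit(self, name, index):
        return self.units[(name, index)]


class MetadataServer:
    """Hypothetical metadata server: knows file sizes and where stripes live."""

    def __init__(self, num_data_servers):
        self.num_data_servers = num_data_servers
        self.sizes = {}

    def create(self, name, size):
        self.sizes[name] = size

    def get_layout(self, name):
        # One (stripe index, data server id) pair per stripe unit, round-robin.
        count = -(-self.sizes[name] // STRIPE_SIZE)  # ceiling division
        return [(i, i % self.num_data_servers) for i in range(count)]


def write_file(mds, servers, name, data):
    """Register the file with the metadata server, then scatter its stripes."""
    mds.create(name, len(data))
    for index, server_id in mds.get_layout(name):
        chunk = data[index * STRIPE_SIZE:(index + 1) * STRIPE_SIZE]
        servers[server_id].write_unit(name, index, chunk)


def read_file(mds, servers, name):
    """Fetch the layout once, then read all stripe units concurrently."""
    layout = mds.get_layout(name)

    def fetch(entry):
        index, server_id = entry
        return servers[server_id].read_unit(name, index)

    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # pool.map returns results in input order, so the chunks reassemble.
        return b"".join(pool.map(fetch, layout))


if __name__ == "__main__":
    servers = [DataServer() for _ in range(3)]
    mds = MetadataServer(num_data_servers=3)
    write_file(mds, servers, "demo.dat", b"hello parallel nfs")
    print(mds.get_layout("demo.dat"))
    print(read_file(mds, servers, "demo.dat"))
```

In this model the metadata server never touches file contents, so adding data servers adds paths to the data without adding load on the server that tracks the layout.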



