The ubiquitous Network File System (NFS) provides great utility, but may have some surprising behaviors

In the Unix/Linux environment, almost everyone has used the Network File System (NFS). The success of NFS is due in part to the fact that it is "just there" and easy to set up; it has become the de facto distributed file system for many installations. Users typically assume that NFS works exactly like a local file system and are often surprised to learn that this is not always the case. For most file operations, NFS works quite well. It is not, however, a fully POSIX-compliant file system, and thus may show some unexpected behaviors from time to time.

Perhaps the most common issue with NFS is the "stale file handle" error. NFS clients refer to files through server-issued file handles rather than through the actual files themselves. If another host deletes a file, the handle that other hosts still hold becomes "stale," and file operations that use it fail. These errors do not occur with a local file system and are often overlooked by coders who assume all files are local.
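On Linux clients this condition is typically reported to the application as an OSError carrying errno ESTALE. The following is a minimal Python sketch, not a definitive recipe, of one way a program might cope: reopen the file by path so the client looks it up again and obtains a fresh handle. The function name read_with_stale_retry and its retries and opener parameters are hypothetical, introduced only for illustration.

```python
import errno


def read_with_stale_retry(path, retries=3, opener=open):
    """Read a whole file, reopening it if the handle has gone stale.

    `opener` is a hypothetical hook (defaulting to the built-in open)
    that makes the retry logic easy to exercise without a real NFS mount.
    """
    last_error = None
    for _ in range(max(1, retries)):
        try:
            # Opening by path forces a new lookup, and so a new handle.
            with opener(path, "rb") as f:
                return f.read()
        except OSError as exc:
            if exc.errno != errno.ESTALE:
                raise  # some other failure; do not mask it
            last_error = exc
    # Every attempt hit a stale handle; report the last one.
    raise last_error
```

If the file really has been deleted, the reopen fails with FileNotFoundError, which this sketch deliberately lets through so the caller can decide what to do.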

These errors often arise because the popular NFS versions (2 and 3) are inherently "stateless." That is, the NFS server and clients just send and receive data and do not keep track of what other systems have done. A stateless protocol provides high performance, but can create some issues when multiple users (or machines) are working with the same file.

NFSv3 is perhaps the most common version in use today and was an improvement over NFSv2. It supports large files (64-bit file sizes and offsets), asynchronous writes (i.e., the client write completes before the data actually hits the server disk), and TCP as a transport in addition to UDP. It is still stateless, and this can create some issues for users.

When multiple users or machines are writing to the same file, problems and data loss may occur. NFS does not prohibit users from sharing files; it assumes that you as a user understand these limitations. For instance, two or more clients can write to the same file without creating problems as long as their writes don't overlap.

This requirement is a bit more complicated than most users realize. NFS clients read and write files in units of a minimum page size. If a file read or write is smaller than the page size, a whole page is still read or written, so two writes that do not overlap byte for byte can still collide when they fall within the same page. There is no requirement that the page size be the same on the clients and the server, and there is no way for clients to communicate their page size to other clients. As a result, there is no guarantee that file reads and writes will be coherent. Using NFSv3 in this way requires careful programming and attention to client details that are beyond the NFS specification.
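To make the page-granularity point concrete, here is a small Python sketch that checks whether two byte ranges touch a common page. It assumes a single page size of 4096 bytes shared by both writers, which, as noted above, NFS does not guarantee; the helper names pages_touched and share_a_page are hypothetical.

```python
def pages_touched(offset, length, page_size=4096):
    """Return the range of page indices covered by a byte range."""
    if length <= 0:
        return range(0)
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def share_a_page(write_a, write_b, page_size=4096):
    """True if two (offset, length) writes land on at least one common page.

    Assumes both writers use the same page_size, which is an
    illustrative simplification.
    """
    pages_a = set(pages_touched(write_a[0], write_a[1], page_size))
    return any(p in pages_a
               for p in pages_touched(write_b[0], write_b[1], page_size))


# Two 2 KB writes that do not overlap in bytes still share page 0.
print(share_a_page((0, 2048), (2048, 2048)))   # True
# Two page-aligned 4 KB writes touch different pages.
print(share_a_page((0, 4096), (4096, 4096)))   # False
```

A check like this only helps when every writer agrees on the page size and aligns to it, which is exactly the kind of out-of-band coordination the protocol itself does not provide.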

In the event that multiple writes land on the same page of a file, "the last writer always wins." For example, suppose Client A is writing to the lower half of a page. Assume that, before A's write, Client B reads the same page from the server, and that B then writes to the upper half of the page after Client A has completed its lower-half write. Because B writes back the whole page, including its stale copy of the lower half, users who expect a file that has [Client-A-data and Client-B-data] are surprised to learn they have [old-data and Client-B-data]. If for some reason Client A writes last, then the file would look like [Client-A-data and old-data]. In either case, the result is not what is expected even though NFSv3 is behaving as it should.
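The scenario can be mimicked with a toy, in-memory model in Python. This is only a sketch of whole-page write-back under stated assumptions: it does not talk to NFS, the eight-byte page is chosen purely for readability, and the names ToyServer, ToyClient, and simulate are hypothetical.

```python
PAGE_SIZE = 8            # toy page size; real pages are far larger
HALF = PAGE_SIZE // 2


class ToyServer:
    """Holds a single page, initially filled with 'old' data (b'o')."""

    def __init__(self):
        self.page = bytearray(b"o" * PAGE_SIZE)


class ToyClient:
    """Caches a whole page and always writes the whole page back."""

    def __init__(self, server):
        self.server = server
        self.cache = None

    def read_page(self):
        # Take a private copy of the server's page.
        self.cache = bytearray(self.server.page)

    def write_half(self, half, fill):
        # Modify half of the cached page, then write back the entire page.
        start = 0 if half == "lower" else HALF
        self.cache[start:start + HALF] = fill * HALF
        self.server.page[:] = self.cache


def simulate(last_writer):
    """Both clients read the old page first; last_writer ('A' or 'B') writes last."""
    server = ToyServer()
    a, b = ToyClient(server), ToyClient(server)
    a.read_page()
    b.read_page()
    if last_writer == "B":
        a.write_half("lower", b"A")
        b.write_half("upper", b"B")
    else:
        b.write_half("upper", b"B")
        a.write_half("lower", b"A")
    return bytes(server.page)


print(simulate("B"))   # b'ooooBBBB'  -> [old-data and Client-B-data]
print(simulate("A"))   # b'AAAAoooo'  -> [Client-A-data and old-data]
```

In neither ordering does the page end up as b'AAAABBBB', because each client overwrites the other's half with the old data it cached before the other's write.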

A similar issue with NFSv3 is that the client is only required to write back modified data when the file is closed. This model, usually called "close-to-open" cache consistency, can lead to surprises. If Client A writes to a file but does not close it, and Client B subsequently reads from the file, Client B is not guaranteed to see any new data until Client A closes the file. Again, this behavior is within the specification, but often surprises users.
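In practice, this means a writer should close the file before telling anyone the data is ready, and a reader should open the file after that point rather than reuse a long-lived descriptor. A minimal Python sketch of that pattern follows; the function names publish and read_fresh are hypothetical, and the explicit os.fsync call is an extra precaution assumed here, not something the text above requires.

```python
import os


def publish(path, data):
    """Write data and close the file so other clients can see it."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # push Python's buffer to the OS
        os.fsync(f.fileno())      # ask the OS to push it to stable storage
    # Leaving the with-block closes the file, which is the point at
    # which the client must have written back its modified data.


def read_fresh(path):
    """Open the file anew for each read instead of reusing a descriptor."""
    with open(path, "rb") as f:
        return f.read()
```

On a local file system both functions behave as any ordinary write and read would; the ordering only matters when the writer and reader sit on different NFS clients.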

Even with these limitations, NFSv3 is probably the most successful distributed file system in use today. In the next installment, I'll talk about NFSv4 and how some of these issues are addressed.