Distributed File Systems (DFS)
NFS and SMB/CIFS, discussed in earlier chapters, are both venerable network file systems in the Linux/UNIX world. They both allow the sharing of file systems and other resources across the network. This chapter covers another class of network file systems that solves a slightly different set of problems in addition to those solved by traditional NFS and SMB. This class is referred to as Distributed File Systems (DFS). A DFS provides a mechanism that gives you a unified view of, and access to, file system resources located on different servers. The operative word here is “distributed,” and it essentially means that the file system resources can be divided, shared, or distributed in a logical or physical manner across a network.
Plain-old vanilla network file systems do a wonderful job of making the local storage on a single server available to other client systems on a local area network (LAN). However, they often lack the inherent ability to scale or grow easily beyond that single server or LAN. They also lack the inherent ability to offer a unified (and consistent) view of a shared file system resource that resides on multiple servers or across several physically or geographically disparate locations, so that all the data appears as a single resource to the end user.
NOTE We say “inherent” here because, in truth, some solutions/hacks do exist that can make the plain network file systems mimic the behavior of full-fledged DFS systems.
“A picture,” they say, “is worth a thousand words.” We’ll use a series of pictures to provide an overview of DFS. Suppose, for example, that we have a network of systems like the one depicted in Figure 26-1.
Figure 26-1 Traditional network file server and LAN clients
Our network consists of the following elements:
• A massive set of DATA (files, directories, and so on). In this example, all the DATA in its entirety is referred to as MASSIVE_DATA.
• A traditional NFS or Common Internet File System (CIFS) server hosting MASSIVE_DATA. This is server-A.
• A bunch of client systems and users that need to access MASSIVE_DATA hosted on server-A. These client systems are client-C, client-D, client-E, and client-F.
With everything else being equal, server-A is fine and up to the task of serving MASSIVE_DATA to the clients, provided that the clients are connected to it via a high-speed network such as a LAN.
Figure 26-2 Traditional network file server, LAN clients, and WAN clients
Now suppose the network grows to include the following new elements, as depicted in Figure 26-2:
• A bunch of new client systems—client-G, client-H, client-I, and client-J—that need access to MASSIVE_DATA on server-A.
• The new clients, however, are not directly connected to the same LAN as server-A. Instead they are connected over a WAN (Internet).
• The new clients are located in physically and geographically different locations.
Obviously, nothing much has changed as far as server-A is concerned—just a bunch of new clients requiring access to MASSIVE_DATA.
However, a problem is created by the new requirements: even though server-A may still be up to the task of making its file system available to the new additional clients, the intervening network between the server and the new clients might not be up to the task. The intervening network might be too slow, too congested, too unpredictable, and so on. It is also possible that server-A was simply not built to handle the additional demands and load of so many new clients.
The solution that DFS offers to these problems is shown in Figure 26-3 and described in the following scenario:
Figure 26-3 Distributed File Systems network
• Place an additional file server (server-B) in physical proximity to the clients that require access to MASSIVE_DATA.
• Ensure that a copy of MASSIVE_DATA exists on all file servers regardless of where they’re located.
• Ensure that clients have fast, easy, and consistent access to MASSIVE_DATA.
• Ensure that bits and bytes that make up MASSIVE_DATA remain consistent and uncorrupted regardless of from where the client is accessing or modifying the data.
• Ensure that any changes to MASSIVE_DATA are immediately reflected across all the copies, regardless of where they physically reside.
Numerous DFS solutions exist to tackle the scenarios described so far. These solutions vary in many respects: complexity, ease of management, ease of configuration, ease of deployment, supported platforms, maturity, intended application, and, of course, features. Some implementations also include special features that make them more suitable for specific applications or workloads. Following are some DFS implementations:
• GlusterFS (www.gluster.org) This popular open source distributed/clustered file system is easy to set up and use. The source code is under active development. Packaged binaries are readily available for most of the popular Linux distros. It’s well suited for high-performance and virtualized cloud computing workloads and applications.
• Lustre (http://lustre.org) This high-performance DFS implementation is frequently used for clustering types of workloads and LAN-type environments. Its architecture makes it more complicated to install, configure, and maintain in comparison to other DFS solutions. A standard Lustre setup includes these components: Metadata Server (MDS), Metadata Target (MDT), Object Storage Server (OSS), Object Storage Target (OST), and Lustre clients. For some features, Lustre requires a specially patched Linux kernel.
• Microsoft DFS (MS-DFS) Microsoft’s implementation is a mature product that is relatively easy to install in a pure Windows environment. The open source project Samba can emulate some of the features of the proprietary MS-DFS in its implementation of the CIFS/SMB protocols.
• OpenAFS (www.openafs.org) One of the older DFS implementations, OpenAFS is robust and well supported on multiple platforms—Windows, Linux, and macOS. However, it is not easy to install and configure. It relies heavily on Kerberos.
• Ceph (www.ceph.com) Ceph is a mature and industrial-grade storage platform. It is both distributed and clustered and provides interfaces for most common DFS use cases. The interfaces provided to applications for storing data in a Ceph cluster are Ceph object storage, Ceph block device, and Ceph file system. Because Ceph is an object storage platform at its core, all data is stored as objects in a flat namespace; this means there is no traditional hierarchy of directories (except via one of the interfaces)!
A typical Ceph cluster implementation consists of the following daemons: Monitors (ceph-mon), managers (ceph-mgr), Object Storage Daemon (ceph-osd), and Metadata servers (ceph-mds).
In the following sections, we’ll single out the popular GlusterFS implementation, discuss it, and walk through configuring it.
GlusterFS is a popular DFS implementation backed by Red Hat, Inc. The GlusterFS software is available as precompiled packages for several Linux distros, but as with most Free and Open Source Software (FOSS) projects, you can always download the source code, compile, and build it yourself. Compiling and building it yourself might be necessary if you want the very latest version of the software—which your distro maintainers may not yet have available.
In the following sections, we’ll create a small DFS environment consisting of two servers and one client. The two servers will host a distributed replicated volume. This means that contents of the glusterfs volumes on both servers will be identical—this will provide redundancy similar to when using physical disks in a RAID 1 setup (mirroring). This helps to ensure high availability and reliability.
Because our sample GlusterFS environment consists of three different systems (server-A, server-B, and client-C), we have broken down the steps to build our GlusterFS network into different sections and subsections. So wherever a section applies only to the servers or only to one of the servers, we label that section accordingly. Similarly, when the steps apply only to the client system, we label that section as such.
Ready? Let’s go!
We will install the GlusterFS software and its dependencies on our sample servers and client systems running Fedora. The steps and commands to run are very similar on other RPM-based Linux distros such as RHEL, CentOS, and so on. Try as much as possible to deploy the same version/release of the gluster stack among a given set of servers and clients.
Install GlusterFS on server-A and server-B While logged into the system as a user with administrative privileges, at the console type the following:
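As a minimal sketch of this step, the installation might look like the following. It assumes the server package is named glusterfs-server in your Fedora repositories and that sudo is configured for your account; both are assumptions rather than details taken from the original listing.

```shell
# Package name is an assumption based on Fedora's repositories.
SERVER_PKG="glusterfs-server"

if command -v dnf >/dev/null 2>&1; then
    sudo dnf -y install "$SERVER_PKG" || echo "Installation of $SERVER_PKG failed" >&2
else
    echo "dnf not found; install $SERVER_PKG with your distro's package manager" >&2
fi
```

Run the same commands on both server-A and server-B.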
Install GlusterFS on the Client System (client-C) On the client systems that need to access only the data exported by gluster servers, we need to install the core GlusterFS package as well as the native GlusterFS client package. To install these, type the following:
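A sketch of the client-side installation follows. The package names glusterfs (core) and glusterfs-fuse (native client) are assumed from Fedora's packaging; check your distro's repositories if they differ.

```shell
# Core package plus the FUSE-based native client (names assumed for Fedora).
CLIENT_PKGS="glusterfs glusterfs-fuse"

if command -v dnf >/dev/null 2>&1; then
    # Left unquoted on purpose so each package name becomes its own argument.
    sudo dnf -y install $CLIENT_PKGS || echo "Installation of client packages failed" >&2
else
    echo "dnf not found; install the equivalent of: $CLIENT_PKGS" >&2
fi
```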
The client component of gluster is provided by the glusterfs-client package on Debian-based distros like Ubuntu. This can be installed in Ubuntu by running the following:
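On an Ubuntu client, the step might be sketched as follows, assuming sudo is available and the package index is reasonably current:

```shell
# Package name as given in the text for Debian-based distros.
CLIENT_PKG="glusterfs-client"

if command -v apt-get >/dev/null 2>&1; then
    sudo apt-get -y install "$CLIENT_PKG" || echo "Installation of $CLIENT_PKG failed" >&2
else
    echo "apt-get not found; this does not look like a Debian-based system" >&2
fi
```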
As with most of the other client/server–based services discussed in this book, managing the gluster service involves starting, stopping, and configuring the service.
Name Resolution It’s nice and convenient for the GlusterFS software stack to be able to translate the hostnames of other systems reliably into their IP addresses. You can do this on the cheap by making sure the /etc/hosts file on all the systems contains the necessary name and IP address maps for all participating systems or by ensuring that all hostnames are resolvable via DNS (see Chapter 17). We will go the easy route here and make sure that the following entries that are applicable for our demo environment are in the /etc/hosts file of all three systems:
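The entries might look like the following fragment. The IP addresses shown are placeholders chosen purely for illustration; substitute the real addresses of your own systems.

```text
# /etc/hosts fragment (addresses are hypothetical)
192.168.0.67    server-A
192.168.0.68    server-B
192.168.0.69    client-C
```

After editing the file, a quick check such as `getent hosts server-B` on each system should return the address you entered.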
Start and Stop Gluster on server-A and server-B Check the status of the glusterd daemon by running the following:
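On a systemd-based distro such as Fedora, the status check can be sketched as follows. The unit name glusterd.service is assumed to be the one installed by the server package.

```shell
# Unit name assumed from the glusterfs-server package.
UNIT="glusterd.service"

if systemctl cat "$UNIT" >/dev/null 2>&1; then
    # Exit status 0 means active; 3 usually means the unit is loaded but not running.
    systemctl --no-pager status "$UNIT"
else
    echo "$UNIT not found; is the GlusterFS server package installed?" >&2
fi
```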
To stop glusterd, type this:
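A sketch of the stop step under the same systemd assumption:

```shell
UNIT="glusterd.service"

if systemctl cat "$UNIT" >/dev/null 2>&1; then
    sudo systemctl stop "$UNIT"
    # Report the resulting state; "inactive" is expected after a clean stop.
    systemctl is-active "$UNIT"
else
    echo "$UNIT not found; nothing to stop" >&2
fi
```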
Once you’ve confirmed that the daemon is not currently running, you can start glusterd like so:
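Starting the daemon might look like this, again assuming a systemd-managed glusterd.service unit:

```shell
UNIT="glusterd.service"

if systemctl cat "$UNIT" >/dev/null 2>&1; then
    sudo systemctl start "$UNIT"
    # Report the resulting state; "active" is expected after a successful start.
    systemctl is-active "$UNIT"
else
    echo "$UNIT not found; install the GlusterFS server package first" >&2
fi
```

To have the daemon come up automatically at boot as well, `sudo systemctl enable glusterd` is the usual systemd approach.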
Create a Trusted Storage Pool In GlusterFS parlance, a trusted storage pool consists of all the servers or bricks in your setup that you trust to be part of your DFS. The notion of a storage pool helps in managing different aspects of a GlusterFS environment. For example, you can increase the total capacity of the DFS by adding new storage bricks to the storage pool.
TIP The hosts in a gluster DFS communicate via the TCP port number ranges 24007 to 24008 and 49152 to <49152 plus number_of_bricks minus 1>. Be sure that these ports are open on any local or external firewalls between the GlusterFS servers and clients. For example, on an RPM-based distro gluster setup with 10 bricks using the firewalld firewall manager, you can open up these ports on the servers by running the following:
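The port arithmetic and the firewalld calls can be sketched as follows for the 10-brick example. The variable names are illustrative, and the rules are added to firewalld's default zone, which may not be the zone you want in a multi-zone setup.

```shell
BRICKS=10                   # number of bricks in the example
FIRST_BRICK_PORT=49152      # first brick port, as stated in the tip
LAST_BRICK_PORT=$((FIRST_BRICK_PORT + BRICKS - 1))

if command -v firewall-cmd >/dev/null 2>&1; then
    # Management ports used by the gluster daemons.
    sudo firewall-cmd --permanent --add-port=24007-24008/tcp
    # One port per brick, starting at FIRST_BRICK_PORT.
    sudo firewall-cmd --permanent --add-port="${FIRST_BRICK_PORT}-${LAST_BRICK_PORT}/tcp"
    # Load the permanent rules into the running firewall.
    sudo firewall-cmd --reload
else
    echo "firewall-cmd not found; open TCP 24007-24008 and ${FIRST_BRICK_PORT}-${LAST_BRICK_PORT} by other means" >&2
fi
```

With 10 bricks the second range works out to 49152 through 49161.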
For testing and troubleshooting purposes, you can also temporarily disable the entire firewall subsystem (for example, firewalld or ufw) on the server.
From server-A, add server-B to the storage pool:
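A sketch of the probe, run on server-A and assuming server-B resolves by name and glusterd is running on both hosts:

```shell
PEER="server-B"

if command -v gluster >/dev/null 2>&1; then
    sudo gluster peer probe "$PEER"
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```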
Check the status of the peers in the storage pool you just created:
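The check might be sketched as follows. Note that the peer listing does not include the host you run it on, so with our two-server pool the command should report a single peer.

```shell
POOL_SIZE=2                           # server-A and server-B
EXPECTED_PEERS=$((POOL_SIZE - 1))     # the local host is not counted as a peer

if command -v gluster >/dev/null 2>&1; then
    sudo gluster peer status
    echo "Expected number of peers for this setup: ${EXPECTED_PEERS}"
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```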
TIP If you ever need to, you can remove a server (for example, server-B) from the storage pool by typing this:
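A sketch of the removal, run from another member of the pool such as server-A:

```shell
PEER="server-B"

if command -v gluster >/dev/null 2>&1; then
    sudo gluster peer detach "$PEER"
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```

Expect gluster to refuse the request if the peer still hosts bricks that belong to an existing volume.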
Create a Distributed Replicated Volume We will create and use directories under the /data directory as the storage space to be distributed among the trusted peers in our DFS. We will then create a replicated volume named MASSIVE_DATA.
On server-A, create the parent and subdirectory named /data/A:
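For example, assuming sudo access on server-A:

```shell
BRICK_DIR="/data/A"

# -p creates the parent /data as well if it does not exist yet.
sudo mkdir -p "$BRICK_DIR" || echo "Could not create $BRICK_DIR" >&2
```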
On server-B, create the parent and subdirectory named /data/B:
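The equivalent sketch for server-B:

```shell
BRICK_DIR="/data/B"

# -p creates the parent /data as well if it does not exist yet.
sudo mkdir -p "$BRICK_DIR" || echo "Could not create $BRICK_DIR" >&2
```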
From server-A, create the MASSIVE_DATA volume:
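A sketch of the volume creation follows, assuming a two-way replica over TCP with one brick per server. The shell variable names are illustrative only.

```shell
VOLUME="MASSIVE_DATA"
BRICKS="server-A:/data/A server-B:/data/B"
REPLICA_COUNT=2     # one copy of the data on each server

if command -v gluster >/dev/null 2>&1; then
    # $BRICKS is left unquoted so each brick becomes a separate argument.
    sudo gluster volume create "$VOLUME" replica "$REPLICA_COUNT" transport tcp $BRICKS
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```

Two caveats apply, depending on your gluster release. The tool may warn that two-way replicas are prone to split-brain and ask for confirmation. It also declines to place bricks on the root partition unless the word force is appended to the command, so a dedicated file system mounted at /data is the cleaner choice outside of a lab.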
Use the info option with the gluster command to view information about the glusterfs volume you just created. This command can be issued from any of the gluster servers:
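For our sample volume, the query might look like this:

```shell
VOLUME="MASSIVE_DATA"

if command -v gluster >/dev/null 2>&1; then
    # Shows the volume type, status, transport, and the list of bricks.
    sudo gluster volume info "$VOLUME"
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```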
Start the GlusterFS Volume on the Servers After the logical volume has been created, you need to start the volume to enable the clients to access the data stored therein. You can start the volume from any of the trusted peers in the storage pool.
To start the MASSIVE_DATA volume, type the following:
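A sketch of the start step, run on either server:

```shell
VOLUME="MASSIVE_DATA"

if command -v gluster >/dev/null 2>&1; then
    sudo gluster volume start "$VOLUME"
else
    echo "gluster CLI not found; install the GlusterFS server package first" >&2
fi
```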
Mount the GlusterFS Volume from the Client (client-C) Client systems that need to access the glusterfs volume can do so using different methods and protocols. The recommended way is to use the so-called “native” glusterfs client to mount remote glusterfs volumes. This method provides the best performance, because it utilizes built-in optimizations for the software. However, it might not always be possible to use the native gluster client.
You can also access glusterfs volumes by enabling and configuring the appropriate protocol from the server side and sharing/exporting the glusterfs mount point as you normally would. For example, you can make the contents of the glusterfs mount point on our sample server-B available via the CIFS/SMB protocol to Windows, macOS, and Linux clients by creating a Samba (see Chapter 25) share definition similar to the following:
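A share definition along these lines could be added to the Samba configuration file on server-B. The share name, comment, and path are hypothetical; the path assumes that server-B has itself mounted the gluster volume at /mnt/glusterfs/MASSIVE_DATA.

```ini
; smb.conf fragment (share name and path are assumptions)
[MASSIVE_DATA]
    comment = Samba share backed by the gluster volume
    path = /mnt/glusterfs/MASSIVE_DATA
    browseable = yes
    writable = yes
```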
Similarly, you can make the glusterfs volume available to other Linux client systems that are capable of speaking NFS (see Chapter 24). This can be done on our sample glusterfs server-A by creating an entry similar to the following in the NFS configuration file (/etc/exports):
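An entry of roughly this shape might be used. The mount point and client network are hypothetical. An explicit fsid value is included because the kernel NFS server generally needs one to export a FUSE-based file system; the number itself is arbitrary but must be unique among your exports.

```text
# /etc/exports fragment (path, network, and fsid are assumptions)
/mnt/glusterfs/MASSIVE_DATA    192.168.0.0/24(rw,fsid=10)
```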
In this section, we’ll access our remote glusterfs volume using the recommended native glusterfs client.
We will create a mount point on the client and mount the remote glusterfs volume at the /mnt/glusterfs/MASSIVE_DATA directory.
On the client system, create the mount point:
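For example, assuming sudo access on client-C:

```shell
MOUNT_POINT="/mnt/glusterfs/MASSIVE_DATA"

# -p creates the intermediate /mnt/glusterfs directory as needed.
sudo mkdir -p "$MOUNT_POINT" || echo "Could not create $MOUNT_POINT" >&2
```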
Mount the volume exported by the gluster servers by running the following:
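A sketch of the native-client mount follows. Naming server-A here is an arbitrary choice; as far as the mount command is concerned, either server in the pool can be given as the source.

```shell
SERVER="server-A"
VOLUME="MASSIVE_DATA"
MOUNT_POINT="/mnt/glusterfs/MASSIVE_DATA"

# The glusterfs file system type is provided by the native (FUSE) client package.
sudo mount -t glusterfs "${SERVER}:/${VOLUME}" "$MOUNT_POINT" \
    || echo "Mount failed; check the client packages, name resolution, and firewall" >&2
```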
Use the df command to run some basic checks on the remote volume:
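One way to run the check, assuming the GNU df and the util-linux mountpoint tools found on most Linux systems:

```shell
MOUNT_POINT="/mnt/glusterfs/MASSIVE_DATA"

if mountpoint -q "$MOUNT_POINT"; then
    # -h prints human-readable sizes; -T adds the file system type column.
    df -hT "$MOUNT_POINT"
else
    echo "$MOUNT_POINT is not currently a mount point" >&2
fi
```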
Access the GlusterFS Volume from the Client As soon as you’ve successfully mounted the remote glusterfs volume on the local client, the users on the client system can begin to read and write data onto the remote storage—provided, of course, that they have the correct file system permissions to do this.
From the client system, create some sample files on the MASSIVE_DATA volume:
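A small sketch that creates four empty files; the file names are arbitrary, and the guard avoids writing to the local disk if the volume is not mounted:

```shell
MOUNT_POINT="/mnt/glusterfs/MASSIVE_DATA"

if mountpoint -q "$MOUNT_POINT"; then
    for n in 1 2 3 4; do
        touch "$MOUNT_POINT/file$n"
    done
else
    echo "$MOUNT_POINT is not mounted; skipping file creation" >&2
fi
```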
Make sure that the files were properly created by performing a simple directory listing:
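For example:

```shell
MOUNT_POINT="/mnt/glusterfs/MASSIVE_DATA"

if [ -d "$MOUNT_POINT" ]; then
    ls -l "$MOUNT_POINT"
else
    echo "$MOUNT_POINT does not exist on this system" >&2
fi
```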
TIP If you perform a directory listing on the /data/A and /data/B directories of server-A and server-B, respectively, you should see the same files that were created from the client replicated on both servers. Furthermore, if one of the servers (let’s say server-A) were to suddenly go offline for whatever reason, the remaining server (server-B) would continue to host the replicated files as if nothing happened, and all this would be transparent to the client.
You were introduced to the world of distributed network file systems in this chapter. Among other things, a DFS provides a way of spreading data over multiple physical servers and making the data appear as one single file system resource to clients. Several DFS implementations exist today, both proprietary and open source variants. We provided a quick overview of some of the more popular and promising implementations that are readily available.
Finally, you learned how to install, configure, and manage a simple replicated DFS environment using the GlusterFS software. Our sample setup consisted of two servers and one client.