Chapter 4: The Command Line – Linux Administration: A Beginner's Guide, Eighth Edition

CHAPTER 4

The Command Line

The level of power, control, and flexibility that the command line offers Linux/FOSS users has been one of its most endearing and enduring qualities. There is also a flip side to this, however: for the uninitiated, the command line can also produce extremes of emotions, including awe, frustration, and annoyance. Casual observers of Linux gurus at work are often astounded at the results of a few carefully crafted and executed commands. This awesomeness comes at a cost—it can make using Linux appear less intuitive to the average user. For this reason, graphical user interface (GUI) front-ends for various Linux tools, functions, and utilities have been written.

More experienced users, however, may find that it is difficult for a GUI to present all of the available options. Typically, doing so would make the interface just as complicated as the command-line equivalent. The GUI design is often oversimplified, and experienced users ultimately return to the comprehensive capabilities of the command line. After all is said and done, the fact remains that it just looks plain cool to do things at the command line!

Before we begin our study of the command-line interface (CLI) under GNU’s Not Unix (GNU)/Linux-based systems, understand that this chapter is far from an exhaustive resource. Rather than trying to cover all the tools without any depth, this chapter describes a handful of tools that are most critical for the day-to-day work of a system administrator.

NOTE  This chapter assumes that you are logged into the system as a regular/non-privileged user. You can follow the examples here whether you are logged into the system at the console or via a GUI desktop environment.

If you are using the GNOME desktop environment, for example, you can start a virtual terminal in which to issue commands. To launch a virtual terminal application, press the ALT-F2 key combination on your keyboard to bring up the Run Application dialog box. Type the name of a terminal emulator (for example, xterm, gnome-terminal, or konsole) into the Run text box and then press ENTER. You can alternatively look under the Applications menu for any of the installed terminal emulator applications. All the commands you enter in this chapter should be typed into a virtual terminal or console at the shell prompt.

An Introduction to Bash

In Chapter 6, you will learn that an important component of creating new users pertains to the user’s login shell. The login shell is the first program that runs when a user logs into a system. The shell is comparable to the Windows Program Manager, except that in the case of Linux, the system administrator (and user) has a say in the choice of shell program used.

The formal definition of a shell is “a command language interpreter that executes commands.” A less formal definition might be simply “a program that provides an interface to the system.” The Bourne Again Shell (Bash), in particular, is a command-line-only interface containing a handful of built-in commands; it has the ability to launch other programs and to control programs that have been launched from it (job control).

A variety of shells exist, most with similar features but different means of implementing them. Again, for the purpose of comparison, you can think of the various shells as being like web browsers; among several different browsers, the basic functionality is the same—displaying content from the Web. In any situation like this, everyone proclaims that his or her shell is better than the others, but it all really comes down to personal preference.

In this section, we’ll examine some of Bash’s built-in commands. A complete reference on Bash could easily fill a large book in itself, so we’ll stick with the commands that a system administrator (or regular user) might use frequently. However, it is highly recommended that you eventually study Bash’s other functions and operations. The differences between Bash and other popular shells are very subtle, but it isn’t a bad idea for a system administrator to be familiar with as many shells (and their idiosyncrasies) as possible.

Job Control

When working in the Bash environment, you can start multiple programs from the same prompt. Each program is considered a job. Whenever a job is started, it takes over the terminal. On today’s machines, the terminal is either a straight stand-alone text/console interface or a window displayed within a graphical environment (Xorg, Wayland, and so on). A terminal interface in a graphical environment is called a pseudo-tty, or pty for short. If a job has control of the terminal, it can issue control codes so that text-only interfaces can be made more attractive (via colorization or other visual cues). Once the program is done, it gives full control back to Bash, and a prompt is redisplayed for the user.

Not all programs require this kind of terminal control, however. Some, including programs that interface with the user through the X Window System, can be instructed to give up terminal control and allow Bash to bring back (re-present) the user prompt, even though the invoked program is still running.

In the following example, with the user master logged into the system via a graphical desktop environment, the user launches the Firefox web browser from the CLI or shell, with the additional condition that the program (Firefox) gives up control of the terminal (this condition is specified by appending the ampersand symbol to the program name):

Immediately after you press ENTER, Bash will present its prompt again. This is called backgrounding the task.

If a program is already running and has control of the terminal, you can make the program give up control by pressing CTRL-Z in the terminal window. This will stop the running job (or program) and return control to Bash. At any given time, you can find out how many jobs Bash is tracking by typing the jobs command.

The running programs that are listed will be in one of two states: running or stopped. The preceding sample output shows that the Firefox program is in a running state. The output also shows the job number in the first column: [1].
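
The backgrounding and job-listing steps described above can be sketched as follows. This is only an illustration: sleep 30 is an assumed stand-in for a long-running program such as Firefox, and bg_pid is a hypothetical variable name used to clean up afterward.

```shell
# "sleep 30" stands in for a long-running program such as Firefox.
# The trailing ampersand starts it in the background.
sleep 30 &
bg_pid=$!      # $! holds the process ID of the most recent background job

# List the jobs the shell is tracking; the sleep job should be reported as running.
jobs

# Clean up the demonstration job.
kill "$bg_pid"
wait "$bg_pid" 2>/dev/null || true
```

At an interactive prompt you would normally leave the job running and manage it with the fg and bg commands rather than kill it right away.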

To bring a job back to the foreground (that is, to give it back control of the terminal), use the fg (foreground) command followed by the job number, in the form fg NUMBER.

Here, NUMBER is the job number you want in the foreground. For example, to place the Firefox program (with job number 1) launched earlier in the foreground, type fg 1.

If a job is stopped (that is, in a stopped state), you can start it running again in the background, thereby allowing you to keep control of the terminal and resume running the job. Or a stopped job can run in the foreground, which gives control of the terminal back to the program.

To resume a stopped job in the background, type bg NUMBER.

Here, NUMBER is the job number you want to background.

NOTE  You can background any process. Applications that require terminal input or output will be put into a stopped state if you background them. You can, for example, try running the top utility in the background by typing top &. Then you can check the state of that job with the jobs command.

Environment Variables

Every instance of a shell, and every process that is running, has its own “environment”—these are settings that give it a particular look, feel, and, in some cases, behavior. These settings are typically controlled by environment variables. Some environment variables have special meanings to the shell, but there is nothing stopping you from defining your own and using them for your own needs. It is through the use of environment variables that most shell scripts are able to do interesting things and remember results from user inputs as well as program outputs. If you are already familiar with the concept of environment variables in Microsoft Windows, you’ll find that many of the things you know about them will apply to Linux as well; the only difference is how they are set, viewed, and removed.

Printing Environment Variables

To list all of your environment variables, use the printenv command. Here’s an example:

To show a specific environment variable, specify the variable as a parameter to printenv. For example, here is the command to see the environment variable TERM:
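
A minimal sketch of both forms of printenv follows. The head filter and the term_value variable are additions for illustration only; TERM may not be set in every session, so the sketch falls back to a placeholder string.

```shell
# List every environment variable (only the first five lines, to keep the output short).
printenv | head -n 5

# Show a single variable. printenv exits with a non-zero status if the variable is not set.
term_value=$(printenv TERM || echo "TERM is not set in this session")
echo "$term_value"
```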

Setting Environment Variables

To set an environment variable, use the following format:

Here, variable is the variable name and value is the value you want to assign the variable. For example, to set the environment variable FOO to the value BAR, type this:

Whenever you set environment variables in this way, they stay local to the running shell. If you want that value to be passed to other processes that you launch, use the export built-in command. The format of the export command is as follows:

Here, variable is the name of the variable. For example, to set the value of the variable FOO and export it at the same time, you would enter this command:

If the value of the environment variable you want to set has spaces in it, surround the value with quotation marks. Using the preceding example, to set FOO to “Welcome to the BAR of FOO.”, you would enter this:

You can then use the printenv command to see the value of the FOO variable you just set by typing this:
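
The set, export, and print steps above can be sketched as one short session. It assumes FOO was not already exported before the session started.

```shell
# A plain assignment stays local to the running shell,
# so printenv (a separate process) cannot see it yet.
FOO=BAR
printenv FOO || echo "FOO is not exported yet"

# Set and export in one step; the quotation marks keep the spaces in the value.
export FOO="Welcome to the BAR of FOO."
printenv FOO
```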

NOTE  Modern cloud-native utilities and web application stacks make heavy use of environment variables to set parameters on the fly that help to serve various purposes, such as authentication, configuring runtime variables, and so on. Therefore, you might see instructions asking you to, for example, set your AWS secret key credentials by setting the following environment variable at your shell: AWS_SECRET_ACCESS_KEY=EXAMPLEI/K7MDENG/bPxRfiCY.

Unsetting Environment Variables

To remove an environment variable, use the unset command. Here’s the syntax for the unset command:

unset variable

Here, variable is the name of the variable you want to remove. For example, here’s the command to remove the environment variable FOO:
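
As a small sketch, the following sets FOO, removes it again, and confirms that it is gone:

```shell
export FOO=BAR
printenv FOO          # prints BAR

# Remove the variable from the shell and its environment.
unset FOO
printenv FOO || echo "FOO is no longer set"
```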

NOTE  This section assumes that you are using Bash. You can choose to use many other shells; the most popular alternatives are the C shell (csh) and its brother, the Tenex/Turbo/Trusted C shell (tcsh), which use different mechanisms for getting and setting environment variables. Bash is documented here because it is often the default shell for new Linux user accounts in most Linux distributions.

Pipes

Pipes are a mechanism by which the output of one program can be sent as the input to another program. Individual programs can be chained together to become extremely powerful tools using pipes.

Let’s use the grep program to provide a simple example of how pipes can be used. When given a stream of input, the grep utility will try to match the line with the parameter supplied to it and display only matching lines. You will recall from the preceding section that the printenv command prints all the environment variables. The list it prints can be lengthy, so, for example, if you were looking for all environment variables containing the string “TERM”, you could enter this command:

The vertical bar ( | ) character represents the pipe between printenv and grep.
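
A minimal sketch of this pipe follows. TERM_DEMO is a hypothetical variable, exported here only so that the search is guaranteed to find at least one match even in a session where TERM itself is not set.

```shell
# Hypothetical variable, so that grep always has something to match.
export TERM_DEMO=example

# printenv writes every variable to the pipe; grep keeps only lines containing "TERM".
printenv | grep TERM
```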

The command shell under Windows also supports pipes. Historically, the primary difference was that all commands in a Linux pipe are executed concurrently, whereas MS-DOS ran each program in order, using temporary files to hold intermediate results.

Redirection

Through redirection, you can take the output of a program and have it automatically sent to a file (remember that everything in Linux is regarded as a file!). The shell rather than the program itself handles this process, thereby providing a standard mechanism for performing the task. Having the shell handle redirection is therefore much cleaner and easier than having individual programs handle redirection themselves.

Redirection comes in three classes: output to a file, append to a file, and send a file as input.

To send the output of a program into a file, end the command line with the greater-than symbol (>) and the name of the file to which you want the output redirected. If you are redirecting to an existing file and you want to append additional data to it, use two symbols (>>) followed by the filename. For example, here is the command to send the output of a directory listing into a file called /tmp/directory_listing:

Continuing this example with the directory listing, you could append the string “Directory Listing” to the end of the /tmp/directory_listing file by typing this command:

The third class of redirection, using a file as input, is done by using the less-than sign (<) followed by the name of the file. For example, here is the command to feed the /etc/passwd file into the grep program:
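
The three classes of redirection can be sketched together as follows. The file names come from the text above; the search pattern root is an assumption, since the text does not say what grep should look for.

```shell
# Output to a file: replaces /tmp/directory_listing if it already exists.
ls > /tmp/directory_listing

# Append to a file: adds one line to the end without touching the existing contents.
echo "Directory Listing" >> /tmp/directory_listing
cat /tmp/directory_listing

# File as input: grep reads /etc/passwd on its standard input.
# The pattern "root" is only an illustrative choice.
grep root < /etc/passwd || echo "no line in /etc/passwd matched"
```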

Command-Line Shortcuts

Most of the popular Linux shells have a tremendous number of shortcuts. Learning and getting used to the shortcuts can be a huge culture shock for users coming from the Windows world. This section explains the most common of the Bash shortcuts and their behaviors.

Filename Expansion

Under traditional Linux-based shells such as Bash, wildcards on the command line are expanded before being passed as a parameter to the application. This is in sharp contrast to the default mode of operation for DOS-based tools, which often have to perform their own wildcard expansion. The Linux method also means that you must be careful where you use the wildcard characters. The wildcard characters themselves in Bash are identical to those in cmd.exe in the Windows world.

The asterisk (*) matches any string of characters in a filename (including none at all), and the question mark (?) matches any single character. If you need to use these characters as part of another parameter for whatever reason, you can escape them by preceding them with a backslash (\) character. This causes the shell to interpret the asterisk and question mark as regular characters instead of wildcards.
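
The following sketch shows the shell expanding wildcards before the command (here, echo) ever sees them. The scratch directory and the file names are assumptions made up for the demonstration.

```shell
# Work in a scratch directory so the results are predictable.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch a1.txt a2.txt ab.txt b1.txt

echo a?.txt     # prints: a1.txt a2.txt ab.txt
echo *.txt      # prints: a1.txt a2.txt ab.txt b1.txt
echo \*.txt     # the backslash disables expansion; prints: *.txt
```

Because the shell performs the expansion, echo receives the matching filenames as ordinary parameters and never sees the wildcard itself.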

NOTE  In Linux, wildcard patterns are not exactly the same as regular expressions, although there are some similarities. Wildcards are used for matching filenames, and regular expressions are used for matching text.

The distinction is important, since regular expressions are substantially more powerful than just wildcards alone. All of the shells that come with GNU/Linux support regular expressions. You can learn more about wildcards and regular expressions in their respective manual pages (man 7 glob and man 7 regex).

Environment Variables as Parameters

Under Bash, you can use environment variables as parameters on the command line.

For example, issuing the parameter $FOO will cause the value of the FOO environment variable to be passed rather than the string “$FOO”.
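
A short sketch of the difference, reusing the FOO value from earlier in the chapter. Single quotation marks are used in the last line to show how the literal string can still be passed when needed.

```shell
FOO="Welcome to the BAR of FOO."

echo "$FOO"     # the shell substitutes the value: Welcome to the BAR of FOO.
echo '$FOO'     # single quotes suppress substitution: $FOO
```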

Multiple Commands

Under Bash, multiple commands can be executed on the same line by separating the commands with semicolons (;). For example, here’s how to execute this sequence of commands (cat and ls) on two separate lines:

You could instead type the following:

Since the shell is also a programming language, it understands the semantics of programming languages. For example, you can run a second command only if the first command succeeds. This can be done by using the double ampersand (&&) symbol. For example, you can use the ls command to try to list a file that does not exist in your home directory, and then execute the date command right after that on the same line:

This command will run the ls command, but that command will fail because the file it is trying to list does not exist, and, therefore, the date command will not be executed either. But if you switch the order of commands around, you will notice that the date command will succeed, while the ls command will fail:
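
The semicolon and && behaviors can be sketched as follows. The file name no-such-file-demo.txt is a hypothetical name chosen because it should not exist, and pwd is used in the first line only as a harmless example command.

```shell
# Two commands on one line; the second runs regardless of how the first one ends.
pwd ; date

# date runs only if ls succeeds. ls fails here, so no date is printed.
ls no-such-file-demo.txt && date

# Reversed order: date succeeds and prints the time, and then ls fails.
date && ls no-such-file-demo.txt || echo "ls failed after date had already run"
```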

Backticks

Any text enclosed within backticks (`) is treated as a command to be executed. This allows you to embed a command within backticks and pass its output as parameters to another command. You’ll see this technique used often in this book and in various system scripts. For example, you can read a number (a process ID) stored in a file and pass that number as a parameter to the kill command. A sample use of this is killing (stopping) the Domain Name System (DNS) server named (or other services/daemons that work the same way). When named starts, it writes its process identification (PID) number into the file /var/run/named/named.pid. Thus, the generic and dirty way of killing the named process is to look at the number stored in /var/run/named/named.pid using the cat command, and then issue the kill command with that value. Here’s an example:

One problem with killing the named process in this way is that it cannot be easily automated—we are counting on the fact that a human will read the value in /var/run/named/named.pid in order to pass the kill utility the number. Another issue isn’t so much a problem as it is a nuisance: It takes two steps to stop the DNS server.

Using backticks, however, you can combine the steps into one and do it in a way that can be automated. The backticks version would look like this:

When Bash sees this command, it will first run cat /var/run/named/named.pid and store the result. It will then run kill and pass the stored result to it. From our point of view, this happens in one graceful step.
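
The same pattern can be tried safely without a DNS server. In this sketch, a background sleep process is an assumed stand-in for named, and pid_file is a hypothetical temporary file that plays the role of /var/run/named/named.pid.

```shell
# Stand-in for a daemon: a background process whose PID is written to a file.
pid_file=$(mktemp)
sleep 60 &
demo_pid=$!
echo "$demo_pid" > "$pid_file"

# The backticks run cat first; its output (the PID) becomes the parameter to kill.
kill `cat "$pid_file"`

# Wait for the process to exit, then clean up.
wait "$demo_pid" 2>/dev/null || true
rm -f "$pid_file"
```

Bash also accepts the form $(cat "$pid_file"), which behaves the same way as the backticks and is easier to nest.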

NOTE  So far in this chapter, we have looked at features that are internal to Bash (or “Bash built-ins” as they are sometimes called). The remainder of the chapter explores several common commands accessible outside of Bash.

Documentation Tools

Linux-based systems come with two superbly useful tools for making documentation accessible: man and info. Currently, a great deal of overlap exists between these two documentation systems, because many applications are moving their documentation to the info format. This format is considered superior to man because it allows the documentation to be hyperlinked together in a web-like way, but without actually having to be written in Hypertext Markup Language (HTML) format.

The man format, on the other hand, has been around for decades. For thousands of Linux utilities/programs, their man (short for manual) pages are their only source of documentation. Furthermore, many applications continue to utilize the man format because many other UNIX-like operating systems use it.

TIP  Many Linux distributions also include a great deal of documentation in the /usr/doc or /usr/share/doc directory.

The man Command

Man pages are a built-in, self-contained documentation system on Linux systems; they cover the use of tools and their corresponding configuration files. The syntax of the man command is as follows:

Here, program_name identifies the program in which you’re interested. For example, to view the man page for the ls utility that we’ve been using, type this:

While reading about Linux and Linux-related information sources online, you may encounter references to commands followed by numbers in parentheses—for example, ls (1). The number represents the section of the manual pages (see Table 4-1). Each section covers various subject areas to accommodate the fact that some tools (such as printf) are commands/functions in the C programming language as well as command-line commands.

Table 4-1   Man Page Sections

To refer to a specific man page section, simply specify the section number as the first parameter and then the command as the second parameter. For example, to get the C programmers’ information on printf, you’d enter this:

To get the plain command-line information (user tools), you’d enter this:

If you don’t specify a section number with the man command, the default behavior is that the lowest applicable section number gets printed first.

TIP  A handy option to the man command is an -f preceding the command parameter. With this option, man will search the summary information of all the man pages and list pages matching your specified command, along with their section number. Here’s an example:
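
The section and -f forms described above can be sketched as follows. The guard around the commands, the head filter, and the man_checked variable are additions for illustration; they only keep the sketch from failing on minimal systems where man or the relevant pages are not installed.

```shell
if command -v man >/dev/null 2>&1; then
    # Section 3: the C library function printf.
    man 3 printf 2>/dev/null | head -n 5 || true

    # Section 1: the printf command-line tool.
    man 1 printf 2>/dev/null | head -n 5 || true

    # -f lists the matching pages with their section numbers and summaries.
    man -f printf 2>/dev/null || echo "no summary entries found for printf"
    man_checked=yes
else
    echo "man is not installed on this system"
    man_checked=no
fi
```

At an interactive prompt you would normally type just man 3 printf and read the page in the pager.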

The texinfo System

Another common form of documentation is texinfo. Established as the GNU standard, texinfo is a documentation system similar to the hyperlinked World Wide Web format. Because documents can be hyperlinked together, texinfo is often easier to read, use, and search in comparison to man pages.

To read the texinfo documents on a specific tool or application, invoke info with the parameter specifying the tool’s name. For example, to read about the wget program, you’d type:

In general, you will want to verify whether a man page exists before using info (because there is still a great deal more information available in man format than in texinfo). On the other hand, some man pages will explicitly state that the texinfo pages are more authoritative and should be consulted instead.

Files (Types, Ownership, and Permissions)

This section covers basic file management tools and concepts under Linux. We’ll start with specifics on some useful general-purpose commands, and then we’ll step back and look at some background information.

Under Linux, almost everything is abstracted to a file. Originally, this was done to simplify the programmer’s job. Instead of having to communicate directly with device drivers, special files (which look like ordinary files to the application) are used as a bridge. We discuss the different types of file categories in the following sections.

Normal Files

Normal files are just that—normal. They contain data and can also be executables. The operating system makes no assumptions about their contents or their names or extensions.

Directories

Directory files are a special instance of normal files. Directory files list the locations of other files, some of which may be other directories. This is similar to folders in Windows.

In general, the nitty-gritty of directory files won’t be of importance to your daily operations, unless you need to open and read the file yourself rather than using existing applications to navigate directories; doing this would be similar to trying to read the DOS file allocation table directly rather than using cmd.exe to navigate directories or using the findfirst/findnext system calls.

Hard Links

Each file in the Linux file system gets its own i-node. An i-node keeps track of a file’s attributes and its location on the disk. If you need to be able to refer to a single file using two separate filenames, you can create a hard link. The hard link will have the same i-node as the original file and will, therefore, look and behave just like the original.

With every hard link that is created, a reference count is incremented. When a hard link is removed, the reference count is decremented. Until the reference count reaches zero, the file will remain on disk.

NOTE  A hard link cannot exist between two files on separate file systems (or partitions). This is because the hard link refers to the original file by i-node, and a file’s i-node is only unique on the file system on which it was created.

Symbolic Links

Unlike hard links, which point to a file by its i-node, a symbolic link points to another file by its name. This allows symbolic links (often abbreviated symlinks) to point to files located on other file systems, even other network drives.
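
The difference between the two link types shows up when the original name is removed. The following is a sketch in a scratch directory; the file names are made up for the demonstration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
echo "hello" > original.txt

ln original.txt hardlink.txt       # hard link: a second name for the same i-node
ln -s original.txt symlink.txt     # symbolic link: points to the name original.txt

# -i prints the i-node number first; original.txt and hardlink.txt share one.
ls -li

# Remove the original name. The data stays on disk because the reference count is still 1.
rm original.txt
cat hardlink.txt                   # prints: hello
cat symlink.txt 2>/dev/null || echo "symlink.txt now points to a name that no longer exists"
```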

Block Devices

Traditional hard disks are a type of block or storage device. All devices are accessed through their device file abstractions on the file system. Files of type block device are used to interface with devices such as disks. A block device file has two identifying traits:

•   It has a major number.

•   It has a minor number.

When viewed using the ls -l command, it shows b as the first character of the permissions field. Here’s an example:

In the sample output, note the b at the beginning of the file’s permissions; the 8 in the fifth field is the major number, and the 0 in the sixth field is the minor number.

A block device file’s major number identifies the represented device driver. When this file is accessed, the minor number is passed to the device driver as a parameter, telling it which instance of the device it is accessing. For example, if there are two serial ports, they will share the same device driver and thus the same major number, but each serial port will have a unique minor number.

Character Devices

Similar to block devices, character devices are special files that allow you to access devices through the file system. The obvious difference between block and character devices is that block devices communicate with the actual devices in large blocks or chunks, whereas character devices work one character at a time. A hard disk is a block device; a modem is a character device. Character device permissions start with a c, and the file has a major number and a minor number. Here’s an example:
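
As a sketch, /dev/null is used here as the example because it is a character device present on practically every Linux system; it is an assumed choice, not necessarily the device shown in the original listing.

```shell
# The first character of the permissions field is "c" for a character device.
# The major and minor numbers appear where a regular file would show its size.
ls -l /dev/null

# The -c test is true only for character device files.
[ -c /dev/null ] && echo "/dev/null is a character device"
```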

Listing Files: ls

Out of necessity, we have been using the ls command in previous sections without properly explaining it. We will look at the ls command and some of its options here.

The ls command is used to list all the files in a directory. Of more than 50 available options, those listed in Table 4-2 are the most commonly used. The options can be used in any combination.

Table 4-2   Common ls Options

To see an extended (long) list of all files (including hidden files) in a directory, type:

To list a directory’s non-hidden files that start with the letter A, type this:

If no such file exists in your working directory, ls prints out a message telling you so.
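
These ls invocations can be tried in a scratch directory. The file names below are assumptions created only for the demonstration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch Apple.txt Avocado.txt banana.txt .hidden

# Long listing of all files, including hidden ones (names starting with a dot).
ls -la

# Non-hidden files whose names start with an uppercase A.
ls A*

# No file starts with Z, so ls reports an error.
ls Z* || echo "no file starting with Z was found"
```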

CAUTION  Linux is case-sensitive. For example, a file named thefile.txt is different from a file named Thefile.txt.

Change Ownership: chown

The chown command allows you to change the ownership of a file to another user. Only the root (superuser) user can do this; therefore, normal users may not assign file ownership or steal ownership from another user. The syntax of the command is as follows:

Here, username is the login of the user to whom you want to assign ownership, and filename is the name of the file in question. The filename may be a directory as well.

The -R option applies when the specified filename is a directory name. This option tells the command to descend recursively through the directory tree and apply the new ownership to the named directory itself and all of the files and directories within it.

NOTE  The chown command supports a special syntax that allows you to also specify a group name to assign to a file. The format of the command becomes this:

Change Group: chgrp

The chgrp command-line utility lets you change the group settings of a file. It works much like chown. Here is the format:

Here, groupname is the name of the group to which you want to assign filename ownership. The filename can be a directory as well.

The -R option applies when the specified filename is a directory name. As with chown, the -R option tells the command to descend recursively through the directory tree and apply the new group to all of the files and directories within it.
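
Because assigning a file to a different user requires root, the following sketch only reassigns a scratch file to the current user and that user's primary group. This changes nothing in practice, but it shows the three command forms and runs without special privileges. The id commands and the demo_file variable are additions for illustration.

```shell
demo_file=$(mktemp)

# chown username filename
chown "$(id -un)" "$demo_file"

# chown username:groupname filename (sets owner and group in one step)
chown "$(id -un):$(id -gn)" "$demo_file"

# chgrp groupname filename
chgrp "$(id -gn)" "$demo_file"

# The third and fourth columns show the owner and the group.
ls -l "$demo_file"
```

With a directory as the target, adding -R (for example, chown -R username directory) would apply the change to everything beneath it.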

Change Mode: chmod

Directories and files within the Linux file system have permissions associated with them. By default, permissions are set for the owner of the file, the group associated with the file, and everyone else who can access the file (also known as owner, group, and other, respectively).

When you list files or directories, you see the permissions in the first column of the output. Permissions are divided into four parts. The first part is represented by the first character of the permission. Normal files have no special value and are represented with a hyphen (-) character. If the file has a special attribute, it is represented by a letter. The two special attributes we are most interested in here are directories (d) and symbolic links (l).

The second, third, and fourth parts of a permission are each represented by a three-character chunk. The first chunk indicates the file owner’s permissions. The second chunk indicates the group’s permissions. The last chunk indicates the world permissions. In the context of Linux, “world” means all users (everyone) in the system, regardless of their group settings.

Following are the letters used to represent permissions and their corresponding values: r (read) has the value 4, w (write) has the value 2, and x (execute) has the value 1. When you combine attributes, you add their values. The chmod command is used to set permission values.

The numeric mode of the command is typically known as octal permissions, since each value can range from 0 to 7. To change permissions on a file, you simply add or subtract these values for each permission you want to apply.

For example, if you want to make it so that only the user (owner) can have full access (read, write, and execute: rwx) to a file called foo, you would type this:

What is important to note is that using the octal mode replaces any permissions that were previously set. So if a file in the /usr/local directory is tagged with a SetUID bit, and you run the command chmod -R 700 /usr/local, that file will no longer be a SetUID program.
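
A minimal sketch of the octal form, run in a scratch directory. The second chmod call uses 640 purely as an additional illustration of how the digits are built up.

```shell
cd "$(mktemp -d)"
touch foo

# Owner: 4 (read) + 2 (write) + 1 (execute) = 7; group: 0; others: 0.
chmod 700 foo
ls -l foo      # the permissions field begins with -rwx------

# Owner: 4 + 2 = 6 (rw-); group: 4 (r--); others: 0 (---).
chmod 640 foo
ls -l foo      # the permissions field begins with -rw-r-----
```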

If you only want to change certain bits, you should use the symbolic mode of chmod. This mode turns out to be much easier to remember, and you can add, subtract, or overwrite permissions.

The symbolic form of chmod allows you to set the bits of the owner, the group, or others. You can also set the bits for all three classes at the same time.

For example, if you want to change a file called foobar.sh so that it is executable (x) for the owner (u), you can run the following command:

If you want to change the group’s bit to execute also, use the following:

If you need to specify different permissions for others, just add a comma and its permission symbols. For example, to make the foobar.sh file executable for the user and the group, but also remove read, write, and executable permissions for all others, you could try this:

If you do not want to add or subtract a permission bit, you can use the equal (=) sign instead of a plus (+) sign or minus (-) sign. This will write the specific bits to the file and erase any other bit for that permission. The preceding examples used + to add the execute bit to the User and Group fields. If you want only the execute bit, you would replace the + with =. In addition to u, g, and o, you can also use a fourth character: a (all). This will apply the permission bits to all the fields.
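
The symbolic forms can be followed step by step in a scratch directory. The initial chmod 644 is an assumption added so that the starting permissions are known regardless of the local umask.

```shell
cd "$(mktemp -d)"
touch foobar.sh
chmod 644 foobar.sh            # known starting point: rw-r--r--

chmod u+x foobar.sh            # add execute for the owner:       rwxr--r--
chmod ug+x foobar.sh           # add execute for owner and group: rwxr-xr--
chmod ug+x,o-rwx foobar.sh     # also remove everything for others: rwxr-x---
ls -l foobar.sh

chmod a=r foobar.sh            # "=" overwrites: read only, for all classes: r--r--r--
ls -l foobar.sh
```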

The following list shows the most common combinations of the three permissions. Other combinations, such as -wx, also exist, but they are rarely used.

For each file, three of these three-letter chunks are grouped together. The first chunk represents the permissions for the owner of the file, the second chunk represents the permissions for the file’s group, and the last chunk represents the permissions for all users on the system. Table 4-3 shows some permission combinations, their numeric equivalents, and their descriptions.

Table 4-3   File Permissions

File Management and Manipulation

This section covers the basic command-line tools for managing files and directories. The use and functions of some of these tools are similar to their use on other operating systems.

Copy Files: cp

The cp command is used to copy files. It has a substantial number of options. See its man page for additional details. By default, this command works silently, displaying status information only if an error condition occurs. Following are the most common options for cp:

First, let’s use the touch command to create an empty file called foo.txt in the user master’s home directory:

Then use the cp (copy) command to copy foo.txt to foo.txt.html:

To copy all files in the current directory ending in .html to the /tmp directory, type this:

To interactively recopy all files in the current directory ending in .html to the /tmp directory, type this command:

You will notice that using the interactive (-i) option with cp forces it to prompt or warn you before overwriting existing files with the same name in the destination. To continue the copy and overwrite the existing file at the destination, type yes or y at the prompt, like this:
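
The copy steps above can be sketched as follows. To avoid cluttering real directories, work_dir stands in for the home directory and dest_dir stands in for /tmp; both are hypothetical scratch directories. The echo y part answers the overwrite prompt automatically so that the sketch can run unattended.

```shell
work_dir=$(mktemp -d)    # stands in for the user's home directory
dest_dir=$(mktemp -d)    # stands in for /tmp
cd "$work_dir"

touch foo.txt                      # create an empty file
cp foo.txt foo.txt.html            # copy it under a new name
cp *.html "$dest_dir"              # copy every file ending in .html

# -i prompts before overwriting the copy made in the previous step.
echo y | cp -i *.html "$dest_dir"
ls "$dest_dir"
```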

Move Files: mv

The mv command is used to move files from one location to another. Files can be moved across partitions/file systems as well. Moving files across partitions involves a copy operation, and as a result, the move command can take longer. But you will find that moving files within the same file system is almost instantaneous.

Following are the most common options for mv:

To move a file named foo.txt.html from /tmp to your present working directory, for example, you’d use this command:

NOTE  That last dot (.) is not a typo! It literally means “this directory.”

Besides being used for moving files and folders around the system, mv can also be used simply as a renaming tool.

To rename the file foo.txt.html to foo.txt.htm, type the following:
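Both uses of mv, moving and renaming, can be tried safely in a sandbox. In this sketch the directory named elsewhere is a hypothetical stand-in for /tmp; the file names follow the ones used in the text.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/elsewhere"                 # hypothetical stand-in for /tmp
touch "$workdir/elsewhere/foo.txt.html"
cd "$workdir"

mv elsewhere/foo.txt.html .                # the trailing dot means "this directory"
mv foo.txt.html foo.txt.htm                # same directory, new name: a rename
ls
```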

Link Files: ln

The ln command lets you establish hard links and soft links (see “Files (Types, Ownership, and Permissions)” earlier in this chapter). The general format of ln is as follows:

Although ln has many options, you’ll rarely need to use most of them. The most common option, -s, creates a symbolic link (similar to a shortcut) instead of a hard link.

To create a symbolic link called link-to-foo.txt that points to the original file called foo.txt, issue this command:
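A minimal sketch of the symbolic link step, assuming a scratch directory and a one-line sample file (both are illustrative choices, not part of the original listing):

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > foo.txt

ln -s foo.txt link-to-foo.txt    # shape: ln -s <existing target> <new link name>
ls -l link-to-foo.txt            # the listing shows the link pointing at foo.txt
cat link-to-foo.txt              # reading the link reads the original file
```

Removing link-to-foo.txt later deletes only the link; foo.txt is left untouched.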

Find a File: find

The find command lets you search for files using various search criteria. find has a large number of options that you can read about in its man page. Here is the general format of find:

start_directory is the directory from which the search should start.

To find all files in the current directory (that is, the “.” directory) that have not been accessed in at least seven days, you’d use the following command:

Type this command to find all files in your present working directory whose names are core and then delete them (that is, automatically run the rm command on the search result):

TIP  The syntax for the -exec option with the find command as used here can be difficult to remember, so you can also use the xargs method instead of the exec option used in this example. Using xargs, the command would then be this:
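The two approaches to deleting files named core can be compared side by side. This sketch assumes a scratch directory populated with dummy files; the subdirectory names a/b and the file keep.txt are hypothetical.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/a/b"
touch "$workdir/core" "$workdir/a/b/core" "$workdir/keep.txt"
cd "$workdir"

# -exec runs rm once per match: {} is replaced by the matched path,
# and the escaped semicolon marks the end of the command.
find . -name core -exec rm {} \;

touch core a/b/core              # recreate the files to try the xargs variant

# xargs reads the matched paths from stdin and appends them to rm's arguments.
find . -name core | xargs rm
```

The plain xargs form splits its input on whitespace, so it can misbehave with file names that contain spaces; the -exec form does not have that limitation.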

To find all files in your PWD whose names end in .txt (that is, files that have the .txt extension) and are also less than 100 kilobytes (KB) in size, issue this command:

To find all files in your PWD whose names end in .txt and are also greater than 100KB in size, issue this command:
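The size tests can be checked against files of known size. In this sketch the file names and sizes are assumptions chosen for illustration: one tiny .txt file, one 200KB .txt file, and one 200KB file with a different extension.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "tiny" > small.txt
head -c 204800 /dev/zero > big.txt     # 200KB of zero bytes
head -c 204800 /dev/zero > big.log     # large, but not a .txt file

# -size -100k matches files smaller than 100KB; +100k matches larger ones.
# Quoting '*.txt' stops the shell from expanding the pattern before find sees it.
small_matches=$(find . -name '*.txt' -size -100k)
big_matches=$(find . -name '*.txt' -size +100k)

echo "under 100KB: $small_matches"
echo "over 100KB:  $big_matches"
```

Because both tests appear on the same command line, find applies them together: a file must satisfy the name pattern and the size condition to be listed.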

File Compression: gzip

The gzip utility is used for reducing the size (compressing) or expanding files. It is able to achieve impressive compression ratios. By convention, the .gz extension or suffix is used for naming files compressed with gzip.

Note that gzip compresses the file in place, meaning that after the compression process, the original file is removed, and the only thing left is the compressed file.

To compress a file named foo.txt.htm in your PWD, type this:

And then to decompress it, use gzip again with the -d option:

Issue this command to use the best compression method (the -9 or --best option) to compress all files ending in .htm in your PWD:
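The gzip steps above can be sketched as follows, assuming a scratch directory and two small sample files (bar.txt.htm is a hypothetical second file added so the wildcard has more than one match):

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "<html></html>" > foo.txt.htm
echo "<html></html>" > bar.txt.htm

gzip foo.txt.htm             # replaces foo.txt.htm with foo.txt.htm.gz
gzip -d foo.txt.htm.gz       # restores foo.txt.htm and removes the .gz file
gzip -9 *.htm                # best compression for every file ending in .htm
ls
```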

File Compression: bzip2

The bzip2 tool uses a different compression algorithm that usually results in smaller files than those compressed with the gzip utility; in other words, it typically offers better compression ratios. Its semantics are similar to those of gzip.

By convention, files compressed using the bzip2 utility usually have the .bz2 extension or suffix. For more information, read the man page on bzip2 (man bzip2).

File Compression: xz

xz is a general-purpose data compression and decompression utility. xz purportedly produces better compression ratios than gzip and bzip2. The important bits behind xz’s compression algorithms are in the public domain—and this makes it a popular compression format for numerous open source projects that require compression.

Let’s create a sample file named foo2.txt.htm and then compress it (using xz):

Use the ls command to verify that a new compressed file named foo2.txt.htm.xz was created for you:

Then decompress the foo2.txt.htm.xz file and use the --keep option with the xz utility so the original archive file is not deleted after successful decompression:

Create a Directory: mkdir

The mkdir command is used for creating directories or folders. An often-used option of the mkdir command is the -p option. This option will force mkdir to create parent directories if they don’t exist already. For example, if you need to create /tmp/bigdir/subdir/mydir and the only directory that exists is /tmp, using -p will cause bigdir and subdir to be automatically created along with mydir.

To create a single directory called mydir under the /tmp folder, use this command:

To create a directory tree like bigdir/subdir/finaldir in your PWD, type this:
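A minimal sketch of both mkdir forms follows. It assumes a scratch directory in place of /tmp itself, and the missing/child path is a hypothetical example included only to show what happens without -p.

```shell
workdir=$(mktemp -d)         # scratch directory used instead of /tmp itself
cd "$workdir"

mkdir mydir                              # a single directory
mkdir -p bigdir/subdir/finaldir          # -p creates bigdir and subdir as needed

# Without -p, mkdir fails when a parent directory does not exist yet.
mkdir missing/child 2>/dev/null || echo "mkdir without -p failed, as expected"
```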

Remove Files or Directories: rm

The rm command is used for removing (deleting) files or directories. It is one of the more often used utilities on any system. By default, rm does not remove directories and, as such, you will have to pass it a specific option (-r) to make it remove a directory. This command also accepts the -i parameter, which makes it prompt you interactively before deleting anything.

Use the touch command to create a file called myfile and then remove the file:

To remove a directory located under the /tmp folder called mydir, you’d type this:

If you want to interactively (the -i option) get rid of all the directories from bigdir to finaldir that were created earlier, you’d issue this command:
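The rm examples can be rehearsed in a sandbox, which is a sensible habit for any destructive command. This sketch assumes a scratch directory in place of /tmp and your PWD; the answers to the -i prompts are piped in only so that it can run unattended.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch myfile
mkdir -p mydir bigdir/subdir/finaldir

rm myfile                                # remove a regular file

# By default rm refuses to remove a directory.
rm mydir 2>/dev/null || echo "rm will not remove a directory without -r"
rm -r mydir                              # -r removes the directory and its contents

# -i asks for confirmation at every step. At a real prompt you would answer
# y or n yourself; here "yes" supplies a stream of y answers.
yes | rm -ri bigdir
```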

TIP  You can also use the rmdir command to delete directories, provided they are empty.

Show Present Working Directory: pwd

Inevitably, you will find yourself at the terminal or shell prompt of an already logged-in workstation and you won’t know where you are in the file system hierarchy or directory tree. To get this information, you need the pwd command. Its only task is to print the current working directory. To display your current working directory, use this command:

Tape Archive: tar

If you are familiar with the WinZip program, you are accustomed to the fact that the compression tool not only reduces file size but also consolidates files into compressed archives. Under GNU/Linux, this process is separated into two tools: gzip and tar.

The tar command combines multiple files into a single large file. It is separate from the compression tool, so it allows you to select which compression tool to use or whether you even want compression. In addition, tar is able to read and write to devices, thus making it a good tool for backing up to tape devices.

TIP  Although the tape archive, or tar, program includes the word “tape,” it isn’t necessary to read or write to a tape drive when you’re creating archives. In fact, you’ll rarely use tar with a tape drive in day-to-day situations any longer (traditional backups aside).

Here’s the syntax for the tar command:

Some of the options for the tar command are shown here:

In order to see sample usage of the tar utility, first create a folder called junk in the PWD that contains some empty files named 1, 2, 3, 4:

Now create an archive called junk.tar containing all the files in the folder called junk that you just created by typing this:

Create another archive called 2junk.tar containing all the files in the junk folder, but this time, add the -v (verbose) option to show what is happening as it happens:

The archive that we just created was not compressed in any way. The files and directory have only been combined into a single file.

To create a gzip-compressed archive called 3junk.tar.gz containing all of the files in the junk folder and to show what is happening as it happens, issue this command:

To extract the contents of the gzipped tar archive just created, issue this command:

TIP  The tar command is one of the few GNU/Linux utilities that cares about the order in which you specify its options. If you issued the preceding tar command as tar -xvfz 3junk.tar.gz, the command would fail, because the -f option was not immediately followed by a filename.
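The archive walkthrough above can be sketched as follows. It assumes a scratch directory as the PWD, and it extracts into a separate directory named restore (a hypothetical name) so that the extracted copy is easy to tell apart from the original junk folder.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir junk
touch junk/1 junk/2 junk/3 junk/4

tar -cf junk.tar junk              # c = create, f = archive file name follows
tar -cvf 2junk.tar junk            # v = list each file as it is added
tar -cvzf 3junk.tar.gz junk        # z = compress the archive with gzip

mkdir restore                      # hypothetical directory to extract into
tar -xvzf 3junk.tar.gz -C restore  # x = extract; -C switches directory first
```

In every invocation, f is the last letter of the option group so that the archive name follows it immediately.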

If you like, you can also specify a physical device to tar to and from. This is handy when you need to transfer a set of files from one system to another and for some reason you cannot create a file system on the device. (Or, sometimes, it’s just more entertaining to do it this way!)

Assuming you have a USB disk plugged into your system and the USB device is mapped to /dev/null, you can try creating an archive on the USB disk by typing this:

CAUTION  The command tar -cvzf  /dev/null will treat our phantom USB disk device (/dev/null) as a raw device and erase anything that is already on it. This is why, in our example, we deliberately (and incorrectly) used the null device (/dev/null), which will not result in any harm to your system.

The moral of this caution is to be careful when running any commands that write to real and raw devices, such as /dev/sdb, /dev/vda1, /dev/sr0, and so on.

To pull (extract) that archive off of a disk, you would type this:

Concatenate Files: cat

The cat program fills an extremely simple role: it concatenates and displays files. More creative things can be done with it, but nearly all of its usage will be in the form of simply displaying the contents of text files—much like the type command under Microsoft’s CMD.

Because multiple filenames can be specified on the command line, it’s possible to concatenate files into a single, large, continuous file. This is different from tar in that the resulting file has no control information to show the boundaries of different files.

To display the /etc/passwd file, use this command:

To display the /etc/passwd file and the /etc/group file, issue this command:

Type this command to concatenate /etc/passwd with /etc/group and send the output into the file users-and-groups.txt:

To append the contents of the file /etc/hosts to the users-and-groups.txt file you just created, type this:
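The redirection steps can be sketched as shown here. The scratch directory is an assumption; the line counts are included only to make the effect of > versus >> visible.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# > creates (or overwrites) the output file with both inputs, in order.
cat /etc/passwd /etc/group > users-and-groups.txt
before=$(wc -l < users-and-groups.txt)

# >> appends to the existing file instead of replacing it.
cat /etc/hosts >> users-and-groups.txt
after=$(wc -l < users-and-groups.txt)

echo "lines before append: $before, after append: $after"
```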

TIP  If you want to cat a file in reverse, you can use the tac command.

Display a File One Screen at a Time: more or less

The more and less commands take an input file and display it one screen at a time. The input file can come either from its stdin or from a command-line parameter.

To view the /etc/passwd file one screen at a time, use this command:

To view the /etc/passwd file one screen at a time, using the less command, type:

To view the directory listing generated by the ls command one screen at a time, type:

Show the Directory Location of a File: which

The which command searches the locations specified in the PATH environment variable ($PATH) to find the name of an executable specified on the command line. If the file is found, the command output includes the actual path to the file.

Use the following command to find out in which directory the binary for the rm command is located:

You might find this similar to the find command. The difference here is that since which searches only $PATH, it is much faster. Of course, it is also much more limited in features than find!

Locate a Command: whereis

The whereis tool searches the locations specified in the PATH and MANPATH environment variables and displays the name of the program and its absolute directory, the source file (if available), and the man page for the command (again, if available).

To find the location of the program, source, and manual page for the grep command, type:

Editors

Editors are easily among the bulkiest of the common system utilities, but they are also incredibly useful. Without them, making any kind of change to a text file would be a tremendous undertaking. Regardless of your Linux distribution, you will have gotten a few editors automatically installed for you during the operating system installation. You should take a few moments to get comfortable with them.

NOTE  Different Linux distributions favor some editors over others. As a result, you might have to find and install your preferred editor if it doesn’t come installed with your distribution by default.

vi

The vi editor has been around UNIX-based systems since the 1970s, and its interface shows it. It is arguably one of the last editors to use a separate command mode and data entry mode; as a result, most newcomers may find it unpleasant to use. But before you give vi the cold shoulder, take a moment to get comfortable with it. In difficult situations, you might not have the luxury of a pretty graphical editor at your disposal, but you will find that vi is ubiquitous across all GNU/Linux systems.

Another version of vi that is readily available for installation on most Linux distributions is vim (Vi IMproved). It has a lot of what made vi popular in the first place and many features that make it relevant and useful in modern computing environments. There’s even a GUI version of the editor!

To start vi, simply type this:

The vim editor has an online tutor that can help you get started with it quickly. To launch the tutor, type this:

Another easy way to learn more about vi is to start it and enter :help. If you ever find yourself stuck in vi, press the ESC key several times and then type :q! to force an exit without saving. If you instead want to save the file you are editing and quit vi, type :wq!.

emacs

It has been argued that emacs can easily be an entire operating system all by itself! It’s big, feature-rich, expandable, programmable, and all-around amazing. If you’re coming from a GUI background, you’ll probably find emacs a pleasant environment to work with at first. You might need to install emacs if it doesn’t come installed by default on your distro. On its face, it works like Notepad in terms of its interface. Yet underneath is a complete interface to the GNU development environment, a mail reader, a news reader, and a web browser. Believe it or not, it even has a cute built-in help system that’s disguised as your very own personal psychotherapist! You can have some interesting conversations with this automated/robotic psychotherapist.

To start emacs, simply type the following:

Once emacs has started, you can visit the therapist by pressing ESC-X and then typing doctor. To get help using emacs, press CTRL-H.

pico

The pico program is an editor inspired by simplicity. Typically used in conjunction with the Pine e-mail reading system, pico can also be used as a stand-alone editor. joe is another simple CLI text editor that functions similarly to pico. Both pico and joe work in a manner similar to Notepad, but pico has its own set of key combinations. Thankfully, all available key combinations are always shown at the bottom of the screen.

To start pico, simply type this:

TIP  The pico program will perform automatic word wraps. If you’re using it to edit configuration files, for example, be careful that it doesn’t word-wrap a line into two lines if it should really be parsed as a single line.

sed

sed is a stream-based line editor. It is not a traditional file-editing program, but is instead a powerful purpose-built programming language that also happens to have editing functions in its repertoire. It is best suited for manipulating data that has some known patterns. The power of sed comes from its commands. It has commands for searching for patterns, appending text, replacing text, deleting text, printing text, and so on. One way that system administrators often make use of sed is making in-place edits of system configuration files by searching for keywords (configuration parameters) in the files. Let’s look at some simple sed one-liners.

Use the echo command as the input source (instead of a file) and change the word night to the word day using sed’s substitute (s) command:

Let’s do something similar to the previous command, but this time use sed to do an in-place edit of a file.

Use echo to create a file named contacts.txt with two entries (name: adere on the first line and phone: 555-723-9709 on the second line), like so:

We’ll first do a dry run, which will only print (p) what will be done. Use sed’s substitute (s) command and a regular expression (regex) to search for all lines in the file that begin with the string “name:” and change each one to the new line “name: hiromi”:

The output from the test run looks like what we want. We will now use sed’s in-place (-i or --in-place) option to make the change and also save a backup of the original unchanged file to a new file named contacts.txt.bak by typing the following command:

Use the cat command to view the contents of the modified and backup files:

Let’s see sed’s delete command in action. We’ll use the seq command as the input source to print a sequence of numbers from 1 to 3 and then use sed’s delete (d) command to delete the second line (2) of the output. Type the following command:
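The sed one-liners described in this section can be sketched together as follows. The scratch directory and the sample phrase fed to the first command are assumptions made for illustration; the file contents match the ones described in the text.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Substitute on a stream: echo supplies the input instead of a file.
echo "good night" | sed 's/night/day/'

# Build the two-line sample file.
echo "name: adere" > contacts.txt
echo "phone: 555-723-9709" >> contacts.txt

# Dry run: -n suppresses normal output, and the trailing p prints only
# the lines on which the substitution succeeded. The file is not changed.
sed -n 's/^name:.*/name: hiromi/p' contacts.txt

# Real run: edit in place and keep the original as contacts.txt.bak.
sed -i.bak 's/^name:.*/name: hiromi/' contacts.txt
cat contacts.txt contacts.txt.bak

# Delete line 2 of the input.
seq 1 3 | sed '2d'
```

The ^ anchors the pattern to the start of the line, and .* matches the rest of the line, so the whole line is replaced rather than just the keyword.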

Miscellaneous Tools

The following tools don’t fall into any specific category covered in this chapter, but they are often used for daily system administration chores.

Disk Utilization: du

You will often need to determine where disk space is being consumed, and by whom, especially when you’re running low on it! The du command allows you to determine the disk utilization on a directory-by-directory basis.

Following are some of the options available:

To display the total amount of space being used by all the files and directories in your PWD in human-readable format, use this command:

NOTE  You can use the pipe feature of the shell, discussed earlier in the chapter, to combine the du command with some other utilities (such as sort and head) to gather some interesting statistics about the system.

The sort command is used for sorting lines of text in alphanumeric or numeric order, and the head command is used for printing a specified number of lines from the beginning of its input to the standard output (screen).

So, for example, to combine du, sort, and head together to list the 12 largest files and directories taking up space, under the /home/master directory, you could run this:
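One way to build such a pipeline is sketched here. A scratch directory with a couple of dummy files stands in for /home/master, and the file names are hypothetical; on a real system you would substitute the directory you want to inspect.

```shell
target=$(mktemp -d)                    # stand-in for /home/master
mkdir "$target/docs"
head -c 1048576 /dev/zero > "$target/docs/big.bin"
echo "small" > "$target/notes.txt"

du -sh "$target"                       # -s = summary total, -h = human-readable

# du -a lists every file and directory with its size, sort -n -r orders the
# lines numerically from largest to smallest, and head keeps the first 12.
report=$(du -a "$target" | sort -n -r | head -n 12)
echo "$report"
```

The -h option is deliberately left out of the pipeline: plain numeric sorting would misorder values such as 4.0K and 1.2M.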

Disk Free: df

The df program displays the amount of free space available on mounted file systems. The drives/partitions/volumes/network shares must be mounted in order to get this information. Some parameters for df are listed in the following table; additional options are listed in the df manual page.

To show the free space for all locally mounted drives, use this command:

To show the free space in a human-readable format for the file system on which /tmp is located, type this command:

List Processes: ps

The ps command lists all the processes in a system, as well as their state, size, name, owner, CPU time, wall clock time, and much more. Many command-line parameters are available; those most often used are described in Table 4-4.

Table 4-4   Common ps Options

The most common set of parameters used with the ps command is auxww. These parameters show all the processes (regardless of whether they have a controlling terminal), each process’s owner, and all the processes’ command-line parameters.

Let’s examine some sample output of an invocation of ps auxww:

The first line of the output provides column headers for the listing. The column headers are described in Table 4-5.

Table 4-5   ps Header Description

Show an Interactive List of Processes: top

The top command is an interactive version of ps. Instead of giving a static view of what is going on, top refreshes the screen with a list of processes every 2–3 seconds (user-adjustable). From this list, you can reprioritize processes or kill them. Figure 4-1 shows a top screen.

Figure 4-1   top output

The top program’s main disadvantage is that it’s a CPU hog. On a congested and under-resourced system, multiple running instances of this program might not be very helpful. This can happen, for example, when multiple users start running top to see what’s going on, only to find several other people running the program as well, slowing down the overall system even more!

Send a Signal to a Process: kill

This program’s name is a little misleading: It doesn’t really kill processes. What it does is send signals to running processes. The operating system, by default, supplies each process with a standard set of signal handlers to deal with incoming signals. From a system administrator’s standpoint, the most common handlers are for signal numbers 1, 9, and 15, which translate to the hang-up process, kill process, and terminate process, respectively.

When kill is invoked, it requires at least one parameter: the process identification number (PID) as derived from the ps command.

Signals

An optional parameter available for kill is -n, where the n represents a signal number. If you don’t specify the -n option, signal 15 will be sent by default. We’ll discuss signals 1, 9, and 15 here.

The hang-up signal, 1 (SIGHUP or HUP), is very handy; for example, it can be used to tell certain server applications to go and reread their configuration files.

The kill signal, 9 (SIGKILL), is the impolite way of stopping a process. Rather than asking a process to stop, the operating system simply kills the process. The only time this will fail is when the process is in the middle of a system call (such as a request to open a file), in which case the process will die once it returns from the system call.

The terminate signal, 15 (SIGTERM), can be used to request a process to gracefully terminate itself. When passed only the PID, kill sends signal 15 (SIGTERM) by default. Some programs intercept this signal and perform a number of actions so that they can shut down cleanly. Others just stop running in their tracks. Either way, SIGTERM isn’t a guaranteed method for making a process stop.

Security Issues

The ability to terminate a process is obviously a powerful one, thereby making security precautions important. Users may kill only processes they have permission to kill. If non-root users attempt to send signals to processes other than their own, error messages are returned. The root user is the exception to this limitation; root may send signals to all processes in the system. Of course, this means root needs to exercise great care when using the kill command.

Examples Using the kill Command

The following examples are arbitrary; the PIDs used are completely fictitious and will be different or nonexistent on your system.

Recall that whenever a signal to send is not specified, SIGTERM (signal 15) will be sent by default. Use this command to terminate (signal 15) a process with PID number 205989:

For an almost guaranteed kill of process number 593999, issue this command:

Type the following to send the HUP signal to process number 593888:

This command does the same thing:

TIP  To get a listing of all the possible signals available, along with their numeric equivalents, issue the kill -l command.
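Rather than signaling fictitious PIDs, you can experiment on processes you start yourself. The following sketch is an illustrative alternative to the examples above: it launches three harmless sleep processes in the background, captures their real PIDs, and sends each one a different signal. The variable names are hypothetical.

```shell
# Start three harmless background processes to act as targets.
sleep 30 & term_pid=$!
sleep 30 & kill_pid=$!
sleep 30 & hup_pid=$!

kill "$term_pid"           # no signal specified, so SIGTERM (15) is sent
kill -9 "$kill_pid"        # SIGKILL
kill -1 "$hup_pid"         # SIGHUP by number; kill -s HUP does the same by name

# For a process ended by a signal, wait reports 128 plus the signal number.
term_status=0; wait "$term_pid" || term_status=$?
kill_status=0; wait "$kill_pid" || kill_status=$?
hup_status=0;  wait "$hup_pid"  || hup_status=$?
echo "exit statuses: $term_status $kill_status $hup_status"
```

Because these processes belong to you, no special privileges are needed to signal them.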

Show System Information: uname

The uname program produces some system details that can be helpful in several situations. Perhaps you’ve managed to log into a dozen different computers remotely and have lost track of where you are!

To get the operating system’s name, release, system hostname, and kernel release name/version, enter the following command:

TIP  Another command that offers distribution-specific information is the lsb_release command. Specifically, it can show Linux Standard Base (LSB) related information, such as the distribution name, distribution code name, release or version information, and so forth. A common option used with the lsb_release command is -a. For example:

Who Is Logged In: who

On multiuser systems that have many user accounts that can be simultaneously logged in locally or remotely, the system administrator may need to know who is logged on.

A report showing all logged-on users as well as other useful statistics can be generated by using the who command:

A Variation on who: w

The w command displays the same information that who displays, plus a whole lot more. The details of the report include who is logged in, what their terminal is, from where they are logged in, how long they’ve been logged in, how long they’ve been idle, and their CPU utilization. The top of the report also gives you the same output as the uptime command.

Switch User: su

The su (switch user or substitute user) command is used for running commands as a different user. Once you have logged into the system as one user, you need not log out and back in again in order to assume another identity (root user, for instance). Instead, use the su command to switch. This command has few command-line parameters.

Running su without any parameters will automatically try to make you the root user. You’ll be prompted for the root password, and, if you enter it correctly, you will drop down to a root shell. If you are already the root user and want to switch to another ID, you don’t need to enter a password when you use this command.

For example, if you’re logged in as the user master and want to switch to the root user, type this command:

You will be prompted for root’s password.

If you’re logged in as root and want to switch to, say, user master, enter this command:

You will not be prompted for master’s password.

The optional hyphen (-) parameter tells su to switch identities and run the login scripts for that user. For example, if you’re logged in as root and want to switch over to user master with all of its login and shell configurations, type this command:

TIP  The sudo command is used extensively (instead of su) on modern GNU/Linux distros to execute commands as another user and to temporarily elevate the privilege level of a regular user when necessary. When configured properly, sudo offers finer-grained controls than su does. During the installation of our sample Fedora server (Chapter 2), we elected to make the master user an administrator. The implication of this on Red Hat-like distros such as Fedora, RHEL, and CentOS is that the user was also automatically added to the special wheel group and will thus have sudo privileges. The equivalent group on Debian-like distros such as Ubuntu is the aptly named sudo group!

Putting It All Together (Moving a User and Its Home Directory)

This section demonstrates how to put together some of the topics and CLI utilities covered so far in this chapter (as well as some new ones covered in more detail in Chapter 5). You will see how the elegant design of GNU/Linux allows you to combine simple commands to perform advanced system administration operations. The example operation will be to move a user and the user’s files around on the system.

Specifically, in the following exercises, you are going to create and then move the user named project4 from his default home directory /home/project4 to /export/home/project4. You will also have to set the proper permissions and ownership of the user’s files and directories so that the user can access them.

Unlike the previous exercises, which were performed as a regular user (the user master), you will need superuser privileges to perform the steps in this exercise. Use the su command to change your identity temporarily from the current logged-in user to the superuser (root). You will need to provide root’s password, when prompted.

At the virtual terminal prompt, type:

Create the user to be used for this project. The username is project4. Type the following:

Use the grep command to view the entry for the new user in the /etc/passwd file:

Use the ls command to display a listing of the user’s home directory:

Check the total disk space being used by the user:

Use the su command to change your identity temporarily from the root user to the newly created project4 user:

As user project4, view your present working directory:

As user project4, create some empty files:

Go back to being the root user by exiting out of project4’s profile:

Create the /export directory that will house the user’s new home:

Now use the tar command to archive and compress project4’s current home directory (/home/project4) and untar and decompress it into its new location:

TIP  The dashes (-) you used here with the tar command force the first tar to send its output to standard output (stdout) and the second tar to receive its input from standard input (stdin).
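The tar-to-tar pipe can be rehearsed without root privileges by using a sandbox. In this sketch, a scratch directory (held in the hypothetical variable root) stands in for the top of the file system, and two dummy files stand in for the user's data; on the real system you would use /home and /export/home directly.

```shell
root=$(mktemp -d)                      # hypothetical stand-in for /
mkdir -p "$root/home/project4" "$root/export/home"
touch "$root/home/project4/a" "$root/home/project4/b"

# The first tar creates a gzip-compressed archive and writes it to stdout
# (-f -). The second tar runs in the new location and extracts the archive
# that it reads from stdin (-f -). No intermediate archive file is created.
(cd "$root/home" && tar -czf - project4) | (cd "$root/export/home" && tar -xzf -)

ls -a "$root/export/home/project4"
```

Each side of the pipe runs in its own subshell, so the cd commands do not change the directory of the shell you are working in.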

Use the ls command to ensure that the new home directory was properly created under the /export directory:

Make sure that the project4 user account has complete ownership of all the files and directories in his new home:

Now delete project4’s current home directory:

We are almost done. Try to temporarily assume the identity of project4 again:

There’s one more thing left to do. We have deleted the user’s home directory (/home/project4). The path to the user’s home directory is specified in the /etc/passwd file (see Chapter 6), and since we already deleted that directory, the su command helpfully complained.

Exit out of project4’s profile using the exit command:

Now we’ll use the usermod command to update the /etc/passwd file automatically with the user’s new home directory:

Use the su command again to become project4 temporarily:

While logged in as project4, use the pwd command to view your present working directory:

The output shows that our migration worked out well.

Exit out of project4’s profile to become the root user, and then delete the user called project4 from the system:

That’s it—we are done!

Summary

This chapter discussed Linux’s command-line interface, the Bourne Again Shell (Bash), many command-line tools, and a few editors. As you continue through this book, you’ll find many references to the information in this chapter, so be sure that you get comfortable with working at the CLI. You might find it a bit annoying at first, especially if you are accustomed to using a GUI for performing many of the basic tasks mentioned here—but stick with it. You might even find yourself eventually working faster at the command line than with the GUI!

Obviously, this chapter can’t cover all the command-line tools available as part of your default Linux installation. It is highly recommended that you take some time to look into some of the reference books available. In addition, there is a wealth of texts on shell scripting/programming at various levels and from various points of view. Get whatever suits you; shell scripting/programming is a skill well worth learning, even if you don’t do system administration.

And above all else, R.T.F.M.—that is, Read The Fine Manual (documentation).