Unit VI – Application Layer – Express Learning: Data Communications and Computer Networks

13

Application Layer Protocols

1. Explain how TELNET helps in remote login.

Ans: The word “TELNET” is derived from telecommunications and network; it is a protocol that allows a user to log on to a remote computer. TELNET is also known as remote login, which means connecting one machine to another in such a way that a person may interact with the remote machine as if it were being used locally. For example, someone in New Delhi could connect to a computer in the New York City Public Library and search the card catalogue in the same way as someone sitting at a terminal in the library. Once connected, the user's computer emulates a terminal of the remote computer. When the user types in commands, they are executed on the remote computer, and the user's monitor displays what is taking place on the remote computer during the TELNET session.

The user's computer, which initiates the connection, is referred to as the local computer or TELNET client, and the machine being connected to, which accepts the connection, is referred to as the remote computer or TELNET server. The transmission control protocol/Internet protocol (TCP/IP) suite is used to transmit information between the TELNET client and the TELNET server. In addition, TELNET is text based and only the keyboard can be used for navigation. For this reason, it is widely used in UNIX-based systems. Further, TELNET defines a standard, fictional terminal called the network virtual terminal (NVT) that is used for communication by all the computers on the network. The following steps are performed while conducting a TELNET session.

The inputs from the user are taken and translated by the TELNET client to the NVT form.
This input is sent to a TELNET server running on a remote computer.
The server translates the NVT form to whatever representation the computer being accessed requires.

The same steps are repeated when data are sent from the remote computer back to the user. This system allows clients and servers to communicate even if they use entirely different hardware and internal data representations.
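The translation steps above can be sketched in Python as a pair of helper functions. This is only an illustration under simplifying assumptions: NVT is modelled as 7-bit ASCII text with CR LF line endings, TELNET control and option bytes are ignored, and the names to_nvt and from_nvt are hypothetical.

```python
CRLF = b"\r\n"  # end-of-line sequence used in this simplified NVT model


def to_nvt(text, local_newline="\n"):
    """Client side: translate locally typed text into NVT form."""
    lines = text.split(local_newline)
    # encode("ascii") raises UnicodeEncodeError for non-ASCII input,
    # which this simplified model cannot represent.
    return CRLF.join(line.encode("ascii") for line in lines)


def from_nvt(data, local_newline="\n"):
    """Server side: translate NVT form into the local representation."""
    lines = data.split(CRLF)
    return local_newline.join(line.decode("ascii") for line in lines)


# A client whose local newline is "\n" talks to a host that uses "\r".
wire = to_nvt("ls -l\n")
remote_text = from_nvt(wire, local_newline="\r")
```

Because both sides agree only on the NVT form, neither needs to know the other's local newline convention.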

2. What is a domain name system?

Ans: The TCP/IP protocol suite identifies entities connected to the Internet by their unique IP addresses. Because these addresses are numeric, they are harder for users to remember than names; therefore, a system was needed that can map a name to an address and vice versa. The domain name system (DNS) is such a system: it maps user-friendly host names to their IP addresses and vice versa. It is used extensively on the Internet and on private networks.

3. What is a resolver? What is its role in DNS?

Ans: A resolver is a library procedure used by the DNS to map a host name to its corresponding IP address. Whenever an application program needs to map a host name to an IP address, or an IP address to a host name, it invokes the resolver, passing the host name or IP address to it as a parameter. The resolver passes this parameter to a local DNS server, which searches for the name or address in its database. After the name or address has been found, the DNS server returns the result to the resolver. The resolver then passes the result to the calling application program so that a connection can be established with the destination.

4. Explain the DNS in terms of name space, resource record and name server.

Ans: The DNS is used to map host names to their corresponding IP addresses or vice versa. Various components of DNS include name space, resource record and name server.

Name Space

It is a representation of the domains of the Internet as a tree structure. Each domain can be subdivided into many other domains, which are further partitioned thereby creating a hierarchy. The leaf nodes of the tree cannot be subdivided and each leaf node may contain only a single host or several hosts. The top-level domains are classified into two groups, namely, generic and country. The generic group contains domain names such as com (commercial), edu (educational institutions), gov (governments), int (international organizations), mil (armed forces), net (network providers) and org (organizations). The country domains contain one entry for every country, as per ISO 3166 specification. The tree structure of the name space is shown in Figure 13.1.

Figure 13.1 A Portion of Internet Domain Name Space

A label is included in each node of the tree (Figure 13.2). The label is a string, which has a maximum size of 63 characters. The label at the root node is just an empty string. The child nodes, which have the same parent, are not allowed to have the same label name as it may cause ambiguity. Each node in the tree has a domain name, which is formed by a sequence of labels separated by dots (.). The domain names are again divided into two types, namely, fully qualified domain name (FQDN) and partially qualified domain name (PQDN).

Figure 13.2 Labels and Domain Names

FQDN: If the domain name ends with a dot (.), that is, with the null string of the root label, it is said to be an FQDN. It is the full name of a host, which includes all labels, starting from the host label and ending at the root label, which is written as a dot (.). For example, in Figure 13.2, the FQDN of a host named terminator installed at the technology centre tc is terminator.tc.flag.edu. (note the trailing dot).
PQDN: If a domain name does not end with a null string, then it is said to be a PQDN. This means a PQDN starts with the host label but does not reach the root node. A partial domain name is used when the name being resolved and the client belong to the same site. The resolver converts the PQDN to an FQDN. For example, if a person at the flag.edu site wants to get the IP address of the terminator computer, he/she gives only the partial name terminator. The remaining part (called the suffix), that is, tc.flag.edu, is added by the resolver, and the complete name is then passed to the DNS server.
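The distinction between the two kinds of names can be illustrated with a short Python sketch. The helper names is_fqdn, qualify and labels are hypothetical, and the suffix is simply passed in by the caller; a real resolver would take it from its own configuration.

```python
MAX_LABEL_LENGTH = 63  # maximum label size stated above


def is_fqdn(name):
    """A name ending with a dot (the empty root label) is fully qualified."""
    return name.endswith(".")


def qualify(name, suffix):
    """Turn a PQDN into an FQDN by appending the suffix, as a resolver would."""
    if is_fqdn(name):
        return name
    return name + "." + suffix.rstrip(".") + "."


def labels(fqdn):
    """Split an FQDN into its labels, host label first, and check their sizes."""
    parts = fqdn.rstrip(".").split(".")
    for label in parts:
        if not 0 < len(label) <= MAX_LABEL_LENGTH:
            raise ValueError("invalid label: " + repr(label))
    return parts


full_name = qualify("terminator", "tc.flag.edu")
```

Here full_name carries the trailing dot, so passing it to qualify again leaves it unchanged.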

Resource Records

Every domain is associated with a set of information known as resource records (RRs). The most common RR is the IP address for a single host, although many other kinds of resource records are also found. An RR is a five-tuple, which is mostly represented as ASCII text, but for better efficiency, it can also be encoded in binary form. The different fields of the five-tuple are described as follows:

Domain_Name: This field identifies the domain with which this record is associated. Usually, many records related to a single domain exist in the databases and the database contains RRs for multiple domains. Thus, this field is used for search operations so that queries can be executed efficiently.
Time_to_Live: This field indicates the stability of the record. It specifies the time interval for which an RR may be cached by the receiver so that the server need not be consulted for the RR repeatedly. A zero value in this field indicates that the RR is to be used for a single transaction and, therefore, should not be cached. Highly stable records are given large values, while less stable records are given small values.
Class: This field identifies the protocol family. It is set to IN for Internet information; otherwise, some other codes are used.
Type: This field specifies the type of the record.
Value: The content of this field depends on the Type field and can be a domain name, a number or an ASCII string.
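As a rough illustration, the five fields can be modelled as a small Python record. The class and function names are hypothetical, the textual layout (five whitespace-separated fields on one line) is an assumption made for this sketch, and the record type A and the address 192.0.2.10 are example values not taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    domain_name: str
    time_to_live: int  # in seconds; 0 means "do not cache"
    rr_class: str      # "IN" for Internet information
    rr_type: str
    value: str


def parse_rr(line):
    """Parse one RR written as five whitespace-separated ASCII fields."""
    name, ttl, rr_class, rr_type, value = line.split(None, 4)
    return ResourceRecord(name, int(ttl), rr_class, rr_type, value)


def may_cache(rr):
    """A zero Time_to_Live marks a record meant for a single transaction."""
    return rr.time_to_live > 0


stable = parse_rr("terminator.tc.flag.edu. 86400 IN A 192.0.2.10")
one_shot = parse_rr("terminator.tc.flag.edu. 0 IN A 192.0.2.10")
```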

Name Servers

As the DNS database is vast, a single server cannot hold the complete database and respond to all queries. Even if it could, a failure of that single name server would bring the whole network down. To avoid such a situation, the DNS name space is divided into many non-overlapping zones (represented by dotted areas in Figure 13.3), with each zone containing name servers that hold information about that zone. Each zone covers some nodes of the tree and contains a group of one or more subdomains and the associated RRs that exist within that domain. The name server creates a zone file, which holds all the information about the nodes under it. Each zone has a primary name server and one or more secondary name servers. The primary name server of a zone obtains its information from a file on the disk, whereas the secondary name servers obtain information from the primary name server. Some servers can also be located outside the zone to improve reliability.

Figure 13.3 Division of DNS Name Space into Zones

Whenever a query from a resolver for a domain name arrives at a local name server, the server checks whether the name falls under its zone. If so, it returns the requested record; otherwise, the query is forwarded to upper-level name servers. For example, suppose a resolver on john.cs.vu.nl wants to know the IP address of the host texas.cs.yale.edu (Figure 13.3). The resolver first sends a query to the local name server cs.vu.nl. If the local name server has the IP address in its cache, it returns the address to the resolver; otherwise, it forwards the query to the root server. The root server then forwards the UDP packet to yale.edu, which in turn forwards it to cs.yale.edu because this name server has the IP address of the host texas.cs.yale.edu. The reply travels back along the same path in reverse order until it reaches the client that sent the original request. On the way, the local name server stores the IP address in its cache so that it can answer later queries directly.
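The forwarding chain in this example can be imitated with a toy model in Python. This is a sketch only: the NameServer class is hypothetical, no packets are sent, the address 192.0.2.7 is made up, and every server in the model caches answers (the text above mentions caching only at the local name server).

```python
class NameServer:
    """Toy name server: own records, delegations to child zones, a cache."""

    def __init__(self, zone, records=None, delegations=None, upstream=None):
        self.zone = zone
        self.records = records or {}          # host name -> IP address
        self.delegations = delegations or {}  # child zone -> NameServer
        self.upstream = upstream              # where to forward unknown names
        self.cache = {}

    def query(self, name, trace):
        trace.append(self.zone)  # remember which servers were visited
        if name in self.records:
            return self.records[name]
        if name in self.cache:
            return self.cache[name]
        answer = None
        for child_zone, server in self.delegations.items():
            if name.endswith("." + child_zone):
                answer = server.query(name, trace)
                break
        else:
            # No delegation matched: forward upwards if possible.
            if self.upstream is not None:
                answer = self.upstream.query(name, trace)
        if answer is not None:
            self.cache[name] = answer
        return answer


cs_yale = NameServer("cs.yale.edu", records={"texas.cs.yale.edu": "192.0.2.7"})
yale = NameServer("yale.edu", delegations={"cs.yale.edu": cs_yale})
root = NameServer("root", delegations={"yale.edu": yale})
local = NameServer("cs.vu.nl", upstream=root)

first_trace, second_trace = [], []
first = local.query("texas.cs.yale.edu", first_trace)
second = local.query("texas.cs.yale.edu", second_trace)
```

The first query visits all four servers; the second is answered from the cache of the local name server.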

5. Why was there a need of dynamic domain name system (DDNS)?

Ans: The DNS database contains a large number of addresses that need to be changed often because of the addition of a new host, the removal of an existing host or a change in the IP address of a host. All these changes must be made in the DNS master file, which involves a lot of time-consuming manual work. To overcome this problem, DDNS was devised to update the master file automatically. In DDNS, whenever a binding between a name and an address is determined, the information is passed to a primary DNS server by the dynamic host configuration protocol (DHCP). After receiving the information, the primary server updates its zone, and the secondary servers are informed about the change in one of two modes: active and passive. In the active mode, the primary name server itself informs the secondary servers about the change by sending them a message, while in the passive mode, each secondary name server checks for changes at regular intervals. In both modes, the secondary server requests information about the entire zone after learning of the change. Further, DDNS also protects the DNS database from unauthorized changes by using an authentication mechanism.

6. What is e-mail? State its advantages and disadvantages.

Ans: E-mail (or electronic mail) is the process of exchanging messages electronically via a communications network using the computer. E-mail allows users to communicate with each other in less time and at nominal cost as compared to traditional phone or mail services. Apart from the textual message, e-mails can also consist of other data formats such as pictures, sound and video. E-mails can be sent anywhere in the world using the computer and a modem. Its delivery is almost instant and is very economical to use. One may send many messages at a time or just one to a designated location.

In order to use e-mail, one must have access to the Internet and an e-mail account. An e-mail account is a service that allows the user to send and receive e-mails through the Internet. The e-mail account provides a unique e-mail address and a mailbox where the user can save all his/her mails. The e-mail address is made up of two parts, namely, the logon identity and the identity of the e-mail server. These parts are separated by the symbol @ (pronounced “at”). A typical e-mail address is username@website.com. The first part of the address indicates the name of the user. The symbol @ separates the user name from the rest of the address. Next comes the host name (website.com), also called the domain name.

E-mail consists of two fields: envelope and message (Figure 13.4). The envelope field defines the address of the sender and the receiver. The message field is again subdivided into parts, namely, header and the body. The header of the message specifies the sender and the receiver while the body contains the information that needs to be transmitted.

Figure 13.4 Format of an E-mail
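The address parts and the envelope/message split can be illustrated with Python's standard email package. This is a sketch under stated assumptions: the addresses are invented, split_address is a hypothetical helper, and the envelope is modelled as a plain dictionary kept apart from the message.

```python
from email.message import EmailMessage


def split_address(address):
    """Split an e-mail address into (user name, host name) at the @ symbol."""
    user, at, host = address.rpartition("@")
    if not (user and at and host):
        raise ValueError("not a valid e-mail address: " + repr(address))
    return user, host


# The message: a header section plus a body.
message = EmailMessage()
message["From"] = "alice@example.org"
message["To"] = "bob@example.com"
message["Subject"] = "Network notes"
message.set_content("See you at the lab at 10.\n")

# The envelope: sender and receiver addresses used for delivery,
# kept separately from the message itself.
envelope = {
    "sender": str(message["From"]),
    "receivers": [str(message["To"])],
}

user, host = split_address(envelope["receivers"][0])
```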

Although e-mail is popular and efficient, it has some disadvantages along with its advantages. Both are listed in Table 13.1.

Table 13.1 Advantages and Disadvantages of E-mail

Advantages

  • The delivery of messages is very fast, sometimes almost instantaneous, whether the message is meant for someone overseas or for a friend next door.
  • The cost of e-mailing is almost free as it involves a negligible amount of telephone and ISP charges.
  • Multiple copies of the same message can be sent to a group of people at the same time as easily as to a single person.
  • Pictures, documents and other files can also be attached to messages.

Disadvantages

  • Although e-mail is delivered instantly, the recipient may or may not read his/her mail on time. That defeats the quickness of electronic mailing.
  • The user must stay online to read and write more than one mail. In addition, most webmail services either display advertisements during use or append them to the mails sent. This increases the size of the original mail, which significantly decreases the speed of use.
  • Since e-mail passes through a network, it may be intercepted on the way. Moreover, viruses can enter the system while e-mails are being downloaded.
  • The slightest error in the address or a failure in one of the links between sender and receiver is enough to prevent delivery.

7. Describe the architecture of e-mail.

Ans: The architecture of e-mail includes three components, namely, the user agent (UA), message transfer agent (MTA) and message access agent (MAA). The UA is a program that helps the user in reading, writing, replying to and forwarding messages. It also handles the user's mailboxes. The MTA is a client/server program in which the MTA client pushes messages to the MTA server. The MAA is a client/server program in which the MAA client pulls (retrieves) messages from the MAA server.

To understand how the e-mail system works, consider two users A and B on two different systems. Further, assume that both A and B are connected to their respective mail servers by a local area network (LAN) or wide area network (WAN). When A wishes to send a message to B, A runs the UA program to prepare the message. As the message is to be sent to A's mail server through the LAN/WAN, a pair of MTAs (MTA client and MTA server) is used. After preparing the message, the UA sends it to an MTA client, which then sets up a connection with the MTA server on the mail server. At A's mail server, the message waiting to be sent is kept in a queue maintained by the mail server. Then, the mail server calls its MTA client to send the message to the MTA server at B's mail server, which is connected via the Internet. At B's mail server, the received message for B is kept in his/her mailbox. To retrieve the message from the mailbox at the mail server, B calls the MAA client, which then establishes a connection with the MAA server at the mail server. In this way, B receives the message sent by A. Figure 13.5 depicts the whole mechanism.

8. What are the services offered by user agent? Also, discuss its types.

Ans: The UA offers various services to the user to make the process of sending and receiving messages easier. These services are discussed as follows:

Composing Messages: This service helps the user in composing the messages that are to be sent. A template is provided by UA, which can be filled by the user to create the messages. Some UAs provide users with built-in editor to perform functions like spelling and grammar checking, emphasizing text by making it bold, italic, etc.

Figure 13.5 Mechanism of E-mail Transfer

Reading Messages: This service helps the user to read the messages, which are in its inbox. Most user agents show a one-line description of each received mail.
Replying to Messages: This service is used to reply to the messages that have been received by the user. While replying, a user can send a new reply or may include the original message sent by the sender along with the new one. Moreover, the user can reply either to the original sender or to all the recipients of message.
Forwarding of Messages: This service helps the user to forward the message to the third party instead of sending it to the original sender. The user can also add some more content in the message to be forwarded.
Handling Mailboxes: The user agent is responsible for maintaining all the mailboxes in e-mail system. Basically, it creates two types of mailboxes, namely, inbox and outbox. The inbox contains all the messages received by a user and the outbox contains all the messages sent by the user. The messages are kept in both mailboxes until the user deletes them.

There are two types of UAs namely, command-driven and graphical-user-interface (GUI)-based UAs. These types are described as follows:

Command-driven UA: This UA was used in the early days of e-mail. It normally accepts a one-character command typed at the command prompt; for example, the user can type the character r to reply to the sender. A few command-driven UAs include pine, elm and mail.
GUI-based UA: This UA being used nowadays allows the user to use both mouse and keyboard to interact with the software. As the name of this UA suggests, it provides GUI components such as menus and icons that help the users to access the services more easily. Thus, GUI-based UAs are more user friendly.

9. Write a short note on MIME.

Ans: Multipurpose Internet mail extensions (MIME) is a protocol that enables the transfer of non-ASCII data through e-mails and thus overcomes the limitation of the simple e-mail format. It converts non-ASCII messages to a 7-bit NVT ASCII format at the sender's side. The converted message is then forwarded to the MTA client so that it can be sent over the Internet to the receiver. At the receiver's side, the message is converted back to its original format. Further, MIME can also be used to send messages in different languages such as French, German and Chinese. MIME defines five new headers that are added to the original e-mail header section. These headers are described as follows:

 

MIME Version: This header specifies the MIME version and tells the receiver that the sender is using the MIME message format. The version number in use is 1.0.
Content Type: This header defines the type and subtype of the data used in the message body. The type of the data is followed by its subtype, separated by a slash, that is, type/subtype. Some of the types and their subtypes used by MIME are listed in Table 13.2.

Table 13.2 Data Types and Subtypes in MIME

Type  | Subtype | Description
Text  | Plain   | Unformatted
Text  | HTML    | HTML format
Image | JPEG    | Image is in JPEG format
Image | GIF     | Image is in GIF format
Video | MPEG    | Video is in MPEG format
Audio | Basic   | Single-channel encoding of voice at 8 kHz

 

Content Transfer Encoding: This header defines the method used to encode the message so that it can be transmitted over the network. Some schemes used for encoding the message body are listed in Table 13.3.

Table 13.3 Content Transfer Encoding

Type   | Description
7bit   | NVT ASCII characters and short lines
8bit   | Non-ASCII characters and short lines
Binary | Non-ASCII characters with unlimited length

 

Content Id: This header uniquely identifies the message content.
Content Description: This header tells what the body of message contains, that is, whether it contains picture, audio or video. It is an ASCII string that helps the receiver decide whether the message needs to be decoded.
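To see some of these headers on a real message, the following sketch builds one with Python's standard email package. The message text, the file name photo.jpg and the four placeholder bytes standing in for image data are invented for illustration; the header values are whatever the library chooses for this input.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Plain ASCII text.\n")

# Headers the library added for a plain-text body.
mime_version = str(msg["MIME-Version"])
text_type = msg.get_content_type()
text_encoding = str(msg["Content-Transfer-Encoding"])

# Attaching binary data turns the message into a multipart message
# whose parts each carry their own Content-Type header.
msg.add_attachment(
    b"\xff\xd8\xff\xe0",  # placeholder bytes, not a complete image
    maintype="image",
    subtype="jpeg",
    filename="photo.jpg",
)
image_part = next(msg.iter_attachments())
image_type = image_part.get_content_type()
container_type = msg.get_content_type()
```

Printing msg.as_string() shows the complete header section, including the type/subtype form of each Content-Type value.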

10. Explain SMTP.

Ans: Simple mail transfer protocol (SMTP) is a TCP/IP application protocol that supports the e-mail service. It handles the transfer of messages between the sender and the receiver. It is based on the client-server model and defines the MTA client and server in the Internet. During the exchange of a message between the sender and the receiver, SMTP is used twice: first to transfer the mail from the sender's end to the sender's mail server, and then to transfer the mail from the sender's mail server to the receiver's mail server. To retrieve the mail from the receiver's mail server at the receiver's end, a different mail protocol such as POP3 or IMAP4 (discussed in the next question) is used. While transferring mails, SMTP uses commands and responses between the MTA client and the MTA server.

Commands: They are sent from the client machine to the server machine. The syntax of a command consists of a keyword followed by zero or more arguments. SMTP defines a total of 14 commands, some of which are listed in Table 13.4.

Table 13.4 SMTP Commands

Keyword | Arguments | Description
HELO <domain> | Sender's host name | It is used for sending sender's identification.
MAIL FROM: <…> | Sender of the message | It specifies the sender's name.
RCPT TO: <…> | Intended recipient of the message | It specifies the receiver's name.
DATA | Body of the mail | It indicates the beginning of mail transmission.
RSET | No arguments | It ends the current mail transaction.
SEND FROM: <…> | Intended recipient of the message | It specifies that this mail should be sent directly to the user's terminal.
SOML FROM: <…> | Intended recipient of the message | It is used to specify that the mail should be sent to user's terminal if possible; otherwise to mailbox.
VRFY | Name of the recipient to be verified | It is used to confirm the user name.

 

Responses: They are just the opposite of commands, that is, they are sent from a server machine to a client machine. A response consists of a three-digit code, which may be followed by additional textual information. Some of the SMTP responses are shown in Table 13.5.

Table 13.5 SMTP Responses

Code | Information
221 | Service closing transmission channel
354 | Start mail input
500 | Syntax error, unrecognized command
503 | Bad sequence of commands
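The interplay of commands and responses can be illustrated with a toy server written in Python. This is a hedged sketch, not an SMTP implementation: ToySmtpServer is a hypothetical class that handles only a few of the commands, never sends mail, and uses the QUIT command and the reply "250 OK", which are part of SMTP but do not appear in Tables 13.4 and 13.5.

```python
class ToySmtpServer:
    """Tracks just enough state to accept or reject the next command."""

    def __init__(self):
        self.state = "start"  # start -> greeted -> mail -> rcpt -> data
        self.body = []

    def handle(self, line):
        if self.state == "data":
            # Inside the mail body: a line with a single dot ends the input.
            if line == ".":
                self.state = "greeted"
                return "250 OK"
            self.body.append(line)
            return None  # body lines get no individual response
        command = line.strip().upper()
        if command.startswith("HELO"):
            self.state = "greeted"
            return "250 OK"
        if command.startswith("MAIL FROM:"):
            return self._advance({"greeted"}, "mail", "250 OK")
        if command.startswith("RCPT TO:"):
            return self._advance({"mail", "rcpt"}, "rcpt", "250 OK")
        if command == "DATA":
            return self._advance({"rcpt"}, "data", "354 Start mail input")
        if command == "RSET":
            if self.state != "start":
                self.state = "greeted"
            self.body = []
            return "250 OK"
        if command == "QUIT":
            return "221 Service closing transmission channel"
        return "500 Syntax error, unrecognized command"

    def _advance(self, allowed, new_state, reply):
        if self.state not in allowed:
            return "503 Bad sequence of commands"
        self.state = new_state
        return reply


server = ToySmtpServer()
dialogue = [
    "MAIL FROM:<alice@example.org>",  # too early: HELO has not been sent
    "HELO client.example.org",
    "MAIL FROM:<alice@example.org>",
    "RCPT TO:<bob@example.com>",
    "DATA",
    "Hello Bob",
    ".",
    "FETCH",                          # not a command this server knows
    "QUIT",
]
replies = [server.handle(line) for line in dialogue]
```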

11. Explain in brief the following:

(a) POP3

(b) IMAP4

Ans:

(a) POP3: It stands for post office protocol, and the number 3 is its version number. It is a simple MAA protocol with limited functionality. To use POP3, the client POP3 software must be installed on the recipient's machine and the server POP3 software on its mail server. Further, POP3 works in two modes: delete mode and keep mode. In the delete mode, once a message has been pulled from the mail server, it is deleted from the mailbox on the mail server. In the keep mode, on the other hand, the message remains in the mailbox even after it has been pulled from the mail server, so it can be read later from any other computer or location.

Whenever a recipient (client) needs to retrieve mails from the mail server, it establishes a TCP connection to the server on the port 110. Then, it passes its username as well as the password to the mail server to get access to the mailbox on the mail server. After the server has verified the client, the client can list and download the messages one at a time.

POP3 has some disadvantages, which are as follows:

POP3 does not support mail organization on the server, that is, a user cannot have different folders on the mail server.
POP3 does not allow the contents of the mail to be checked in parts while the mail is being downloaded. The mail can be checked after it has been completely downloaded.

(b) IMAP4: It stands for Internet message access protocol, and the number 4 denotes its version number. Like POP3, it is also an MAA protocol, but it provides more functionality and is more complex than POP3. Some of the additional features provided by IMAP4 are as follows:

A user can create folders on the mail server and can delete or rename the mailboxes.
IMAP enables the user to partially download the mails. This is especially useful in cases where a message contains large audio and video files, which may take a lot of time to download because of slow Internet connection. In such cases, the user can download only the text part of message if required using IMAP4.
A user can search through the contents of the messages while the messages are still on the mail server.
IMAP allows the user to check the contents of the e-mail before it has been downloaded.
A user can selectively retrieve the attributes of messages such as body, header, etc.

12. What is FTP? How the files are transferred using FTP?

Ans: File transfer protocol (FTP) is a mechanism provided by TCP/IP to transfer files between hosts connected via the Internet. Further, FTP allows access to the files stored in the directory of a remote computer that is connected to the Internet. In order to access a remote system by FTP, you need to know either the uniform resource locator (URL) of the FTP site, such as ftp://ftp.microsoft.com, or its IP address.

Unlike other client-server applications, FTP establishes two TCP connections between the hosts, namely, the control connection and the data connection. The control connection is established over the well-known TCP port 21 and remains active throughout the session. This connection is used for transferring control information such as commands and responses. Only one line of command or response can be transferred at a time through the control connection. The data connection, on the other hand, is established over the well-known TCP port 20, and only after the control connection has been established. In addition, it needs to be established and released for each file to be transferred while the control connection stays open.

Figure 13.6 shows the FTP file transfer procedure. At the client side, there are three components, namely, user interface, control process and the data transfer process while the server side uses only two components, namely, control process and the data transfer process. The control processes of client and server are connected via control connection, whereas the data transfer process of client and server are connected via data connection. The control processes of client and server communicate using NVT format. They are responsible for converting from their local syntax such as DOS or UNIX to NVT format and vice versa. The data transfer processes of client and server communicate under the control of commands transferred through the control connection.

Figure 13.6 Mechanism of File Transfer in FTP

13. Explain the following with respect to FTP:

File Type
Data Structure
Transmission Mode

Ans: To transfer a file through the data connection in FTP, the user (client) has to specify certain attributes to the server, including the type of file to be transferred, the data structure and the transmission mode, so that the data connection can be prepared accordingly. These attributes are described as follows:

File Type: FTP supports three types of files for transmission over the data connection, namely, an ASCII file, EBCDIC file or image file. The ASCII file is the default format used for text files. It uses the 7-bit ASCII format to encode each character of text file. The sender converts the file from its original form to ASCII characters, while the receiver converts the ASCII characters back to the original form. If EBCDIC encoding (file format used by IBM) is supported at the sender or receiver side, then files can be transmitted using the EBCDIC encoding. The image file is the default format used in the transmission of binary files. Binary files are sent as continuous stream of bits without using any encoding method. Usually, the compiled programs are transferred using the image file.
Data Structure: FTP uses three data structures to transfer a file, namely, file structure, record structure and page structure. When file structure format is used, the file is sent as a continuous stream of bytes. The record structure can be used only with text files and the file is divided into many records. In page structure, each file is divided into a number of pages where each page contains a page number and a page header. These pages can be accessed sequentially as well as randomly.
Transmission Mode: FTP uses three types of transmission modes, namely, stream mode, block mode and compressed mode. The default mode is the stream mode, which sends the data as a continuous stream of bytes. If the data is simply a stream of bytes (file structure), no explicit end-of-file marker is required; the end of file is indicated by the sender closing the data connection. In the block mode, data is sent in blocks, where each block is preceded by a 3-byte header. The first byte is a descriptor of the block and the next two bytes define the size of the block in bytes. The compressed mode is used for large files to reduce their size so that they can be transmitted conveniently. The size is reduced by replacing multiple consecutive occurrences of a character with a single occurrence and a repetition count. For example, in text files, runs of blank spaces can be compressed.
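The 3-byte block header described above can be sketched with Python's struct module. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: the size is packed in network (big-endian) byte order, and a descriptor value of 0 is used for an ordinary block.

```python
import struct

HEADER = struct.Struct("!BH")  # 1-byte descriptor + 2-byte block size


def pack_block(data, descriptor=0):
    """Prefix a block of data with its 3-byte header."""
    if len(data) > 0xFFFF:
        raise ValueError("block too large for a 2-byte size field")
    return HEADER.pack(descriptor, len(data)) + data


def unpack_block(block):
    """Return (descriptor, data) for one block read from the connection."""
    descriptor, size = HEADER.unpack(block[:HEADER.size])
    return descriptor, block[HEADER.size:HEADER.size + size]


packed = pack_block(b"hello")
descriptor, payload = unpack_block(packed)
```

Because the size field is two bytes long, a single block in this sketch can carry at most 65,535 bytes of data.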

14. What is anonymous FTP access?

Ans: Generally, in FTP a user needs an account (username) and password to login to a particular site for accessing the file. However, some sites provide public access to a set of files; the user can access these files without having any account or password. The users can type “anonymous” in the username part and “guest” in the password part. Using FTP in this way is known as anonymous FTP access. With an anonymous FTP, users have restricted access rights and usually, can only list, view or copy files to and from a public directory on the remote system.

15. Define WWW. Explain its architecture.

Ans: The World Wide Web (abbreviated as the Web or WWW) is a collection of linked documents or pages, stored on millions of computers and distributed across the world. The concept of the Web began at CERN (the European Center for Nuclear Research), Geneva, Switzerland, in the year 1989. Since then, the WWW has become the most widely used service on the Internet. One of the main reasons for its popularity is that it provides information in multimedia form, that is, in more than one medium, such as text, graphics, video and audio. Further, it provides a simple and consistent way of accessing the information available on the Internet by using a hypertext system. In a hypertext system, documents are connected to other related documents on the Internet through links. The Web uses a specific Internet protocol called hypertext transfer protocol (HTTP) to support hypertext documents.

Further, the WWW is based on a distributed client/server architecture in which the services provided by the server are distributed over various locations termed sites, and the client can access these services. Each site contains one or more documents, which are known as web pages. A web page contains hyperlinks that enable the user to jump to other web pages on the same site or on different sites. The client accesses the web pages through the browser, the software that enables a user to read/view web pages. Whenever a user needs to access a web page stored on some site, he/she sends a request including the address of the site and the web page (known as the URL) through the browser to the server. On receiving the request, the server searches for the required document and returns it to the client.

The WWW includes many components, which are discussed as follows:

Client (Browser)

A browser is a program that accesses and displays web pages. It consists of three components, namely, controller, interpreter and client protocol. The user provides inputs (requests for a web document) to the controller through a keyboard or a mouse. After receiving the input, the controller uses client protocols such as FTP or HTTP to access the web document. Once the controller has accessed the desired web document, it selects an appropriate interpreter such as hypertext markup language (HTML) or JavaScript, depending on the type of the web document accessed. The interpreters help the controller to display the web document. A few of the web browsers used today include Microsoft Internet Explorer, Opera and Google Chrome.

To understand how a browser works, consider a user who wants to access the link http://www.ipl.com/home/teams.html. When the user provides this link (URL) to the browser, the browser goes through the following steps:

  1. The browser examines the given URL and sends a query to the DNS server asking for the IP address of www.ipl.com.
  2. The DNS sends a reply to the browser, providing the desired IP address.
  3. A TCP connection to port 80 on the received IP address is made by the browser.
  4. The browser then sends a request for the file /home/teams.html.
  5. The file /home/teams.html is sent by the www.ipl.com server.
  6. The TCP connection is ended.
  7. The browser displays the text in the file /home/teams.html. It also fetches and displays the images in the file.
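The first six steps above can be sketched in Python as follows. This is only a minimal illustration, not how a real browser is implemented: the function names plan_request and fetch are hypothetical, the Host header line is an assumption added because most present-day servers expect it, and HTTP/1.0 is chosen so that the server releases the connection after a single response.

```python
import socket
from urllib.parse import urlsplit


def plan_request(url):
    """Split the URL and build the request the browser would send (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80              # port 80 unless the URL names another port
    path = parts.path or "/"
    request = (
        f"GET {path} HTTP/1.0\r\n"       # step 4: ask for the file
        f"Host: {host}\r\n"              # assumption: not part of the steps above
        "\r\n"
    )
    return host, port, request.encode("ascii")


def fetch(url):
    """Carry out steps 1 to 6 over the network and return the raw response bytes."""
    host, port, request = plan_request(url)
    ip_address = socket.gethostbyname(host)        # steps 1-2: DNS query and reply
    with socket.create_connection((ip_address, port), timeout=10) as conn:  # step 3
        conn.sendall(request)                      # step 4
        chunks = []
        while True:                                # step 5: read until the server is done
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # step 6: leaving the "with" block closes the TCP connection
    return b"".join(chunks)
```

Calling plan_request("http://www.ipl.com/home/teams.html") involves no network traffic and shows what would be sent; fetch additionally needs a reachable server. Step 7 (rendering the text and fetching the images) is left out of the sketch.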

Server

A server is the place where web pages are stored. On a request from the client, the server searches the disk for the desired document and returns the document to the browser through a TCP connection. The steps performed by a server are as follows:

  1. The server accepts the TCP connection request arriving from the client.
  2. It then acquires the name of the file requested by the client.
  3. The server retrieves the file from the disk.
  4. The file is sent back to the client.
  5. The TCP connection is released.

The efficiency of a server can be improved by caching the recently accessed pages so that those pages can be served directly from memory and need not be read from the disk. Moreover, a server can support multithreading, that is, serving multiple clients at the same time, to increase efficiency.
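The five server steps and the caching idea can be sketched as follows. This is a simplified illustration under several assumptions: the DISK dictionary is a hypothetical stand-in for files on the server's disk, the names read_document, build_response and serve_once are made up for this example, the whole request is assumed to arrive in a single read, and multithreading is not shown.

```python
import socket

# Hypothetical stand-in for the documents stored on the server's disk.
DISK = {"/home/teams.html": b"<html><body>Teams</body></html>"}
# Recently accessed pages kept in memory.
CACHE = {}


def read_document(path):
    """Return the document for path, trying the cache before the disk."""
    if path in CACHE:
        return CACHE[path]
    body = DISK.get(path)                 # step 3: retrieve the file from the disk
    if body is not None:
        CACHE[path] = body                # remember it for later requests
    return body


def build_response(request_bytes):
    """Steps 2 to 4: find the requested file name and build the reply."""
    request_line = request_bytes.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ")   # step 2
    body = read_document(path)
    if body is None:
        return b"HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    header = f"HTTP/1.0 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
    return header.encode("ascii") + body


def serve_once(listener):
    """Handle one client on an already listening socket."""
    conn, _address = listener.accept()          # step 1: accept the TCP connection
    with conn:
        request = conn.recv(4096)               # assumes the request fits in one read
        conn.sendall(build_response(request))   # step 4: send the file back
    # step 5: leaving the "with" block releases the connection
```

After the first request for /home/teams.html, the page is held in CACHE, so later requests for it no longer touch DISK.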

Uniform Resource Locator

Each web page has a unique address, called the uniform resource locator (URL), which identifies its location on the Internet. This address can be used by millions of people all over the world to locate that page. Usually, a URL consists of four parts: protocol, name of the web server (or domain name), path and filename. Here is an example: http://www.xyz.com/tutor/start/main.htm. The structure of this URL is:

Protocol: http
Web server name/domain name: www.xyz.com
Path: tutor/start/
File name: main.htm

The first part of the address, the part before the colon, is the protocol. Most of the time, http is used for accessing a web page. After the protocol comes the domain name. A colon followed by two slashes (://) separates the protocol from the domain name. Then comes the last part of a URL, namely, the path and the file name. The path name specifies the hierarchical location of the said file on the computer. For instance, in http://www.xyz.com/tutor/start/main.htm, the file main.htm is located in start, which is a subdirectory of tutor.
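The four-part breakdown can be reproduced with Python's standard urllib.parse module. The sketch below is only illustrative; the function name split_url and the dictionary keys are chosen for this example.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into the four parts described in the text (hypothetical helper)."""
    parts = urlsplit(url)
    # Separate the directory part of the path from the file name.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "server": parts.netloc,
        "path": directory.lstrip("/") + "/",
        "filename": filename,
    }


print(split_url("http://www.xyz.com/tutor/start/main.htm"))
```

For the example URL, the result holds http, www.xyz.com, tutor/start/ and main.htm, matching the structure listed above.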

Cookies

Cookies are small files or strings that are used to store information about users. This stored information may later be used by the server while responding to the requests of the client(s). For some sites, only registered users are permitted to access the information. In such a case, the server stores the user's registration information in the form of cookies on the client's machine. The size of a cookie is generally limited to about 4 KB. A user can disable cookies in the browser or can even delete them.
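As a rough illustration of how a server hands a cookie to the client and reads it back on a later request, the sketch below uses Python's standard http.cookies module. The cookie name user_id and both function names are hypothetical. The Set-Cookie response header and the Cookie request header are the standard HTTP mechanism for this exchange, although the text above does not name them.

```python
from http.cookies import SimpleCookie


def server_issue_cookie(user_id):
    """Build the header line a server could send to store the user's id on the client."""
    cookie = SimpleCookie()
    cookie["user_id"] = user_id          # hypothetical cookie name
    return cookie.output()               # a "Set-Cookie: ..." header line


def server_read_cookie(cookie_header_value):
    """Recover the user's id from the Cookie header value sent back by the browser."""
    cookie = SimpleCookie()
    cookie.load(cookie_header_value)
    morsel = cookie.get("user_id")
    return morsel.value if morsel is not None else None


print(server_issue_cookie("u1001"))
print(server_read_cookie("user_id=u1001"))
```

If the user has disabled or deleted cookies, the browser sends no such value and server_read_cookie returns None, so the server treats the user as unregistered.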

16. What is HTTP? Describe the format of HTTP request and response message.

Ans: Hypertext transfer protocol (HTTP) is the most common protocol used to access information from the Web. It manages the transfer of data between the client and the server. In the older version, HTTP 1.0, the TCP connection was released after serving a single request. This was inefficient because a new connection had to be established for every request. This led to the development of HTTP version 1.1, which supports persistent connections, that is, a single connection can be used for multiple request-response operations.

Further, HTTP is a stateless protocol and all the transactions between the server and client are carried out in the form of messages. The client sends a request message to the server and the server replies with a response message. The HTTP request and response messages have a similar format (Figure 13.7), except that in a request message the first line is the request line, while in a response message the first line is the status line. The remaining part of both messages consists of a header and, sometimes, a body.

Figure 13.7 Format of HTTP Request and Response Message

The HTTP request messages are of different types, which are categorized into various methods as shown in Table 13.6.

For each HTTP request message, the server sends an HTTP response that consists of a status line and some additional information. The status line contains a three-digit status code, similar to the response messages of FTP and SMTP. The status code indicates whether the client request is satisfied or if there is some error. The first digit of the status code can be 1, 2, 3, 4 or 5 and it indicates one of the five groups into which response messages have been divided. Codes in the 100 range are only informational and thus rarely used. Codes in the 200 range indicate a successful request, codes in the 300 range redirect the client to some other site, codes in the 400 range indicate an error on the client side and codes in the 500 range indicate an error on the server side.
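The grouping of status codes by their first digit can be expressed in a few lines of Python. The function name status_class and the group labels below are chosen for this illustration only.

```python
def status_class(code):
    """Return the group of a three-digit HTTP status code based on its first digit."""
    groups = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return groups[code // 100]           # integer division keeps the first digit


for code in (200, 301, 404, 500):
    print(code, status_class(code))
```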

Table 13.6 The Built-in HTTP Request Methods

Method    Description
GET       Request to access a web page from the server
HEAD      Request to get the header of a web page
PUT       Request to store a web page
POST      Append to a named resource
DELETE    Remove the web page
TRACE     Echo the incoming request
CONNECT   Establish a tunnel to the server through a proxy
OPTIONS   Enquire about certain options

Figure 13.8 HTTP Header Format

Further, HTTP also contains various headers, which are used to transfer additional information other than the normal message between the client and the server. For example, the request header can ask for a message to be delivered in some particular format while a response header can contain a description of the message. The additional information can be included in one or more header lines within the header. Each header line has a format as shown in Figure 13.8.

Each header line may belong to one of the four types of HTTP headers, which are discussed as follows:

General Header: This header can be included in both request and response messages. It contains general information about the sent or received message. An example of a general header is Date, which shows the current date.
Request Header: This header can be used only in the request messages from the client. The details about the client setup and the preference of the client for any particular format are included in this header. An example of a request header is the From header, which shows the e-mail address of the user.
Response Header: This header is part of the response messages only. It contains the server's setup information. An example of a response header is the Age header, which shows the age of the document.
Entity Header: This header includes information about the body of a document. It is mostly present in response messages, although request messages that carry a body can also use it. An example of an entity header is the Allow header, which lists the valid methods that can be used with a URL.
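To make the message layout concrete, the sketch below splits a raw HTTP message into its first line, its header lines and its body. It assumes that each header line consists of a header name, a colon and a header value, and that a blank line separates the header from the body. The function name parse_message and the sample message are made up for this example.

```python
def parse_message(raw):
    """Split a raw HTTP message into (first line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")     # a blank line ends the header
    lines = head.split("\r\n")
    first_line = lines[0]                         # request line or status line
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return first_line, headers, body


sample = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"    # general header
    "Age: 120\r\n"                               # response header
    "\r\n"
    "hello"
)
print(parse_message(sample))
```

The same function works for request messages, in which case the first line returned is the request line rather than the status line.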

17. Compare HTTP with FTP and SMTP.

Ans: Functionally, HTTP works as a combination of FTP and SMTP. It is similar to FTP in that it is also used to transfer files over a TCP connection. However, it differs from FTP in that it has no separate control connection: only data are transferred between the client and the server, over a single TCP connection on port number 80.

Further, HTTP is similar to SMTP because in both protocols, the client initiates a request and the server responds to that particular request. However, HTTP messages are meant to be read and interpreted by the HTTP server and the HTTP client, whereas SMTP messages are meant to be read by humans as well. Moreover, HTTP messages are delivered immediately, unlike in SMTP, where messages are first stored and then forwarded.

18. What is a proxy server and how is it related to HTTP?

Ans: A proxy server is a computer that stores copies of the responses to the most recent requests in its cache so that further requests for these pages need not be sent to the original server. The proxy server helps to reduce the load on the original server as well as decrease the network traffic, thereby reducing latency. To use the proxy server, the client must be configured to send HTTP requests to the proxy server instead of the original server. Whenever an HTTP client wishes to access a page, it sends an HTTP request to the proxy server. On receiving a request, the proxy server looks in its cache for the desired web page. If the page is found, the stored copy of the response is sent back to the client; otherwise, the HTTP request is forwarded to the target server. The proxy server then receives the response from the target server, stores it in the cache and sends it to the client.
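The lookup-then-forward behaviour of a proxy can be sketched as a small class. This is a toy model under stated assumptions: the class name CachingProxy is hypothetical, the origin server is represented by an ordinary function passed in by the caller, and cached copies never expire, which a real proxy would have to handle.

```python
class CachingProxy:
    """Toy proxy that answers from its cache and forwards misses to the target server."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin   # stand-in for the target server
        self._cache = {}

    def get(self, url):
        if url in self._cache:                        # page found: reply from the cache
            return self._cache[url]
        response = self._fetch_from_origin(url)       # not found: forward the request
        self._cache[url] = response                   # store the response for next time
        return response


origin_calls = []


def fake_origin(url):
    """Pretend target server that records every request it receives."""
    origin_calls.append(url)
    return f"contents of {url}"


proxy = CachingProxy(fake_origin)
proxy.get("http://www.xyz.com/tutor/start/main.htm")
proxy.get("http://www.xyz.com/tutor/start/main.htm")
print(len(origin_calls))
```

Although the page is requested twice, the target server is contacted only once, which is how the proxy reduces both server load and network traffic.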

19. Explain SNMP and mention the two protocols used by it for managing tasks.

Ans: Simple network management protocol (SNMP) is an operational framework, which helps in the maintenance of the devices used in the Internet. In addition, SNMP comprises two components, namely, manager and agent. A station (host) which manages another station is called the manager. It runs the SNMP client program. The agent is the station (router) which runs the server program and is managed by the manager. Both the stations interact with each other to carry out the management. The manager can access all the information stored in the agent station. This information can be used by the manager to check the overall performance. The manager can also perform some remote operation on the agent station such as rebooting the remote station. On the other hand, the agent can also perform some management operations, such as informing the manager about the occurrence of any unusual situation, which may hamper the performance.

In order to manage tasks efficiently, SNMP uses two protocols: structure of management information (SMI) and management information base (MIB). These two protocols function together with SNMP. The roles of the SMI and MIB protocols are discussed as follows:

SMI: It generally deals with the naming of objects and defining their range and length. However, it does not specify the number of objects maintained by an entity, the relationship between objects, and their corresponding values.
MIB: MIB basically performs the work that SMI has left behind. It defines the object name as per the conventions specified by SMI. It also states the number of objects and their types.

Multiple Choice Questions

  1. Which of the following is/are an application layer service?

    (a)   remote login

    (b)   file transfer and access

    (c)   domain name service

    (d)   all of these

  2. Which port number is used by the DNS server for both TCP and UDP connections?

    (a)   54

    (b)   53

    (c)   51

    (d)   50

  3. The client TELNET translates characters that come from the local terminal into ______ character form and delivers them to the network.

    (a)   ASCII

    (b)   BCD

    (c)   NVT

    (d)   EBCDIC

  4. The well-known port ________ is used for the control connection and the well-known port ________ is used for the data connection.

    (a)   20, 21

    (b)   21, 20

    (c)   21, 22

    (d)   20, 22

  5. Which is the default format used by FTP for transferring text files?

    (a)   EBCDIC

    (b)   binary file

    (c)   ASCII file

    (d)   bytes

  6. Which port is used by SMTP for TCP connection?

    (a)   25

    (b)   26

    (c)   21

    (d)   20

  7. Which protocol uses the GET and POST methods?

    (a)   SMTP

    (b)   MIME

    (c)   IMAP4

    (d)   HTTP

  8. Which of the following supports a persistent connection?

    (a)   FTP

    (b)   SMTP

    (c)   HTTP

    (d)   MIME

  9. Which of the following are MAA protocols?

    (a)   SMTP and MIME

    (b)   FTP and HTTP

    (c)   SMTP and FTP

    (d)   POP3 and IMAP4

  10. Which of the following is the protocol used for network management?

    (a)   SNMP

    (b)   POP3

    (c)   FTP

    (d)   IMAP4

Answers

1. (d)

2. (b)

3. (c)

4. (b)

5. (c)

6. (a)

7. (d)

8. (c)

9. (d)

10. (a)