Sunday, November 30, 2014

Concurrent Processing In Client-Server Software

Concurrency In Networks

The term concurrency refers to real or apparent simultaneous computing. For example, a multi-user computer system can achieve concurrency by time-sharing, a design that arranges to switch a single processor among multiple computations quickly enough to give the appearance of simultaneous progress; or by multiprocessing, a design in which multiple processors perform multiple computations simultaneously.

Concurrent processing is fundamental to distributed computing and occurs in many forms. Among machines on a single network, many pairs of application programs can communicate concurrently, sharing the network that interconnects them. For example, application A on one machine may communicate with application B on another machine, while application C on a third machine communicates with application D on a fourth. Although they all share a single network, the applications appear to proceed as if they operate independently. The network hardware enforces access rules that allow each pair of communicating machines to exchange messages. The access rules prevent a given pair of applications from excluding others by consuming all the network bandwidth.

Concurrency can also occur within a given computer system. For example, multiple users on a timesharing system can each invoke a client application that communicates with an application on another machine. One user can transfer a file while another user conducts a remote login session. From a user's point of view, it appears that all client programs proceed simultaneously.

Concurrency In Networks
In addition to concurrency among clients on a single machine, the set of all clients on a set of machines can execute concurrently. Figure 3.1 illustrates concurrency among client programs running on several machines.

Client software does not usually require any special attention or effort on the part of the programmer to make it usable concurrently. The application programmer designs and constructs each client program without regard to concurrent execution; concurrency among multiple client programs occurs automatically because the operating system allows multiple users to each invoke a client concurrently. Thus, the individual clients operate much like any conventional program. To summarize:

Most client software achieves concurrent operation because the underlying operating system allows users to execute client programs concurrently or because users on many machines each execute client software simultaneously. An individual client program operates like any conventional program; it does not manage concurrency explicitly.

Concurrency In Servers

In contrast to concurrent client software, concurrency within a server requires considerable effort. As figure 3.2 shows, a single server program must handle incoming requests concurrently.

To understand why concurrency is important, consider server operations that require substantial computation or communication. For example, think of a remote login server. It it operates with no concurrency, it can handle only one remote login at a time. Once a client contacts the server, the server must ignore or refuse subsequent requests until the first user finishes. Clearly, such a design limits the utility of the server, and prevents multiple remote users from accessing a given machine at the same time.

Concurrency In Servers

Terminology And Concepts 

Because few application programmers have experience with the design of concurrent programs, understanding concurrency in servers can be challenging. This section explains the basic concept of concurrent processing and shows how an operating system supplies it.

The Process Concept 

In concurrent processing systems, the process abstraction defines the fundamental unit of computation1. The most essential information associated with a process is an instruction pointer that specifies the address at which the process is executing. Other information associated with a process includes the identity of the user that owns it, the compiled program that it is executing, and the memory locations of the process' program text and data areas.

A process differs from a program because the process concept includes only the active execution of a computation, not the code. After the code has been loaded into a computer, the operating system allows one or more processes to execute it. In particular, a concurrent processing system allows multiple processes to execute the same piece of code "at the same time." This means that multiple processes may each be executing at some point in the code. Each process proceeds at its own rate, and each may begin or finish at an arbitrary time. Because each has a separate instruction pointer that specifies which instruction it will execute next, there is never any confusion.

Of course, on a uniprocessor architecture, the single CPU can only execute one process at any instant in time. The operating system makes the computer appear to perform more than one computation at a time by switching the CPU among all executing processes rapidly. From a human observer's point of view, many processes appear to proceed simultaneously. In fact, one process proceeds for a short time, then another process proceeds for a short time, and so on. We use the term concurrent execution to capture the idea. It means "apparently simultaneous execution." On a uniprocessor, the operating system handles concurrency, while on a multiprocessor, all CPUs can execute processes simultaneously.

The important concept is:

Application programmers build programs for a concurrent environment without knowing whether the underlying hardware consists of a uniprocessor or a multiprocessor.

Programs vs. Processes 

In a concurrent processing system, a conventional application program is merely a special case: it consists of a piece of code that is executed by exactly one process at a time. The notion of process differs from the conventional notion of program in other ways. For example, most application programmers think of the set of variables defined in the program as being associated with the code. However, if more than one process executes the code concurrently, it is essential that each process has its own copy of the variables. To understand why, consider the following segment of C code that prints the integers from 1 to 10:
for ( i=1 ; i <= 10 ; i++)
printf("%d\n", i);
The iteration uses an index variable, i. In a conventional program, the programmer thinks of storage for variable i as being allocated with the code. However, if two or more processes execute the code segment concurrently, one of them may be on the sixth iteration when the other starts the first iteration. Each must have a different value for i. Thus, each process must have its own copy of variable i or confusion will result, To summarize:

When multiple processes execute a piece of code concurrently, each process has its own, independent copy of the variables associated with the code.

Procedure Calls 

In a procedure-oriented language, like Pascal or C, executed code can contain calls to subprograms (procedures or functions). Subprograms accept arguments, compute a result, and then return just after the point of the call. If multiple processes execute code concurrently, they can each be at a different point in the sequence of procedure calls. One process, A, can begin execution, call a procedure, and then call a second-level procedure before another process, B, begins. Process B may return from a first-level procedure call just as process A returns from a second-level call. The run-time system for procedure-oriented programming languages uses a stack mechanism to handle procedure calls.

The run-time system pushes a procedure activation record on the stack whenever it makes a procedure call. Among other things, the activation record stores information about the location in the code at which the procedure call occurs. When the procedure finishes execution, the run-time system pops the activation record from the top of the stack and returns to the procedure from which the call occurred. Analogous to the rule for variables, concurrent programming systems provide separation between procedure calls in executing processes:

When multiple processes execute a piece of code concurrently, each has its own run-time stack of procedure activation records.

An Example Of Concurrent Process Creation

A Sequential C Example

The following example illustrates concurrent processing in the UNIX operating system. As with most computational concepts, the programming language syntax is trivial; it occupies only a few lines of code. For example, the following code is a conventional C program that prints the integers from I to 5 along with their sum:

/* sum.c - A conventional C program that sum integers from 1 to 5 */
#include <stdlib.h>
#include <stdio.h>
int sum;
/* sum is a global variable */
main () {
int i;
/* i is a local variable */
/* iterate i from 1 to 5 */
sum = 0;
for (i=1 ; i <=5 ; i++) {
printf("The value of i is %d\n", i);
fflush(stdout);
/* flush the buffer
*/
sum += i;
}
printf ("The sum is %d\n", sum);
exit(0);
/* terminate the program */
}

When executed, the program emits six lines of output:

The value of i is 1
The value of i is 2
The value of i is 3
The value of i is 4
The value of i is 5
The sum is 15

A Concurrent Version

#include <stdlib.h>
#include <stdio.h>
int
sum;
main() {
int i;
sun = 0;
fork();
/* create a new process */
for (i=1 ; i<=5 ; i++) {
printf ("The value of i is %d\n", i);
fflush(stdout);
sum += i;
}
printf ("The sum is %d\n", sum);
exit (0)
}
When a user executes the concurrent version of the program, the system begins with a single process executing the code.However, when the process reaches the call to fork, the system duplicates the process and allows both the original process andthe newly created process to execute. Of course, each process has its own copy of the variables that the program uses. In fact, theeasiest way to envision what happens is to imagine that the system makes a second copy of the entire running program. Then imagine that both copies run Oust as if two users had both simultaneously executed the program). To summarize:

To understand the fork function, imagine that fork causes the operating system to make a copy of the executing program and allows both copies to run at the same time.

Wednesday, November 26, 2014

Servers As Clients

Programs do not always fit exactly into the definition of client or server. A server program may need to access network services that require it to act as a client. For example, suppose our file server program needs to obtain the time of day so it can stamp files with the time of access. Also suppose that the system on which it operates does not have a time-of-day clock. To obtain the time, the server acts as a client by sending a request to a time-of-day server as Figure 2.2 shows.



Stateless Vs Stateful Servers

Information that a server maintains about the status of ongoing interactions with clients is called state information. Servers that do not keep any state information are called stateless servers; others are called stateful servers.

The desire for efficiency motivates designers to keep state information in servers. Keeping a small amount of information in a server can reduce the size of messages that the client and server exchange, and can allow the server to respond to requests quickly. Essentially, state information allows a server to remember what the client requested previously and to compute an incremental response as each new request arrives. By contrast, the motivation for statelessness lies in protocol reliability: state information in a server can become incorrect if messages are lost, duplicated, or delivered out of order, or if the client computer crashes and reboots. If the server uses incorrect state information when computing a response, it may respond incorrectly.

Connectionless Vs. Connection-Oriented Servers

When programmers design client-server software, they must choose between two types of interaction: a connectionless style or a connection-oriented style. The two styles of interaction correspond directly to the two major transport protocols that the TCP/IP protocol suite supplies. If the client and server communicate using UDP, the interaction is connectionless; if they use TCP, the interaction is connection-oriented.

From the application programmer's point of view, the distinction between connectionless and connection-oriented interactions is critical because it determines the level of reliability that the underlying system provides. TCP provides all the reliability needed to communicate across an internet. It verifies that data arrives, and automatically retransmits segments that do not. It computes a checksum over the data to guarantee that it is not corrupted during transmission. It uses sequence numbers to ensure that the data arrives in order, and automatically eliminates duplicate packets. It provides flow control to ensure that the sender does not transmit data faster than the receiver can consume it. Finally, TCP informs both the client and server if the underlying network becomes inoperable for any reason.

Standard Vs. Nonstandard Client Software

Standard application services consist of those services defined by TCP/IP and assigned well-known, universally recognized protocol port identifiers; we consider all others to be locally-defined application services or nonstandard application services.

The distinction between standard services and others is only important when communicating outside the local environment. Within a given environment, system administrators usually arrange to define service names in such a way that users cannot distinguish between local and standard services. Programmers who build network applications that will be used at other sites must understand the distinction, however, and must be careful to avoid depending on services that are only available locally.

The Client Server Model And Software Design

From the viewpoint of an application, TCP/IP, like most computer communication protocols, merely provides basic
mechanisms used to transfer data. In particular, TCP/IP allows a programmer to establish communication between two application programs and to pass data back and forth. Thus, we say that TCP/IP provides peer-to-peer communication. The peer applications can execute on the same machine or on different machines.

Although TCP/IP specifies the details of how data passes between a pair of communicating applications, it does not dictate
when or why peer applications interact, nor does it specify how programmers should organize such application programs in a distributed environment. In practice, one organizational method dominates the use of TCP/IP to such an extent that almost all applications use it. The method is known as the client-server paradigm. In fact, client-server interaction has become so fundamental in peer-to-peer networking systems that it forms the basis for most computer communication.

Sunday, November 23, 2014

Viewing Services From The Provider's Perspective

Viewing Services From The Provider's Perspective

The examples of application services given above show how a service appears from an individual user's point of view. The user runs a program that accesses a remote service, and expects to receive a reply with little or no delay.

From the perspective of a computer that supplies a service, the situation appears quite different. Users at multiple sites may choose to access a given service at the same time. When they do, each user expects to receive a response without delay.

Application Protocols And Software Flexibility

The example above shows how a single piece of software, in this instance the telnet program, can be used to access more than one service. The design of the TELNET protocol and its use to access the Knowbot service illustrate two important points. First, the goal of all protocol design is to find fundamental abstractions that can be reused in multiple applications. In practice, TELNET suffices for a wide variety of services because it provides a basic interactive communication facility. Conceptually, the protocol used to access a service remains separate from the service itself. Second, when architects specify application services, they use standard application protocols whenever possible. The Knowbot service described above can be accessed easily because it uses the standard TELNET protocol for communication. Furthermore, because most TCP/IP software includes an application program that users can invoke to run TELNET, no additional client software is needed to access the Knowbot service. Designers who invent new interactive applications can reuse software if they choose TELNET for their access protocol. The point can be summarized:

Using TELNET To Access An Alternative Service

TCP/IP uses protocol port numbers to identify application services on a given machine. Software that implements a given service waits for requests at a predetermined (well-known) protocol port. For example, the remote login service accessed with the TELNET application protocol has been assigned port number 23. Thus, when a user invokes the telnet program, the program connects to port 23 on the specified machine.

Standard And Nonstandard Application Protocols

The TCP/IP protocol suite includes many application protocols, and new application protocols appear daily. In fact, whenever a programmer devises a distributed program that uses TCP/IP to communicate, the programmer has invented a new application protocol. Of course, some application protocols have been documented in RFCs and adopted as part of the official TCP/IP protocol suite. We refer to such protocols as standard application protocols.  Other protocols, invented by application programmers for private use, are referred to as nonstandard application protocols.

Designing Applications For A Distributed Environment

Programmers who build applications for a distributed computing environment follow a simple guideline: they try to make each distributed application behave as much as possible like the nondistributed version of the program. In essence, the goal of distributed computing is to provide an environment that hides the geographic location of computers and services and makes them appear to be local.
 

For example, a conventional database system stores information on the same machine as the application programs that access it. A distributed version of such a database system permits users to access data from computers other than the one on which the data resides. If the distributed database applications have been designed well, a user will not know whether the data being accessed is local or remote.

Use Of TCP/IP

In 1982, the TCP/IP Internet included a few hundred computers at two dozen sites concentrated primarily in North America.In 1996, over 8,000,000 computer systems attach to the Internet in over 85 countries spread across 7 continents; its size continues to double every ten months. Many of the over 60,000 networks that comprise the Internet are located outside the US.

In addition, most large corporations have chosen TCP/IP protocols for their private corporate internets, many of which are now as large as the connected Internet was twelve years ago. TCP/IP accounts for a significant fraction of networking throughout the world. Its use is growing rapidly in Europe, India, South America, and countries on the Pacific rim.