6. Dynamic Programming – Operations Research, 2nd Edition


Dynamic Programming


The mathematical technique of optimising a sequence of interrelated decisions over a period of time is called dynamic programming (DP). It uses the idea of recursion to solve a complex problem broken into a series of sub-problems. The word dynamic has been used because time is explicitly taken into consideration. The objective in dynamic programming is to select a decision policy so as to optimise the returns, which are in the form of costs or benefits.

Mathematically, a dynamic programming problem (DPP) is a decision-making problem in n variables, the problem being sub-divided into n sub-problems, each of which is a decision-making problem in one variable only. The solution to a DPP is achieved sequentially, starting from the initial stage and moving to the next till the final stage is reached. The dynamic programming technique was developed by Richard Bellman in the early 1950s. Dynamic programming can be called recursive optimisation.

Converting a verbal problem into a multi-stage structure is not always simple, and sometimes it becomes very difficult. The DP technique can be applied to problems of inventory control, production planning, chemical reactor design, business decisions on investment and so on.

Some important concepts in DP are:

  1. Stage: The DPP can be decomposed or divided into a sequence of smaller sub-problems called stages of the original problem. At each stage there are a number of decision alternatives, and the best out of those is called the stage decision, which may not be optimal for the stage, but contributes to obtain the optimal decision policy. Stages very often represent different time periods associated with the planning period of the problem, places, people or other entities. For example, in the replacement problem, each year is a stage and in the salesman allocation problem, each territory is a stage.
  2. State: The condition of the decision process at a stage is called its state. The variables which specify the condition of the decision process at a particular stage are called state variables. The number of state variables should be as small as possible, since the larger the number of state variables, the more complicated the decision process becomes.
  3. Bellman’s principle of optimality: An optimal policy (set of decisions) has the property that whatever the initial state and initial decisions, the remaining decisions must constitute an optimal policy for the state resulting from the first decision. A problem which does not satisfy the principle of optimality cannot be solved by the dynamic programming method. This principle implies that a wrong decision taken at one stage does not prevent taking optimal decisions for the remaining stages.
  4. Return function: At each stage, a decision is made which can affect the state of the system at the next stage and help in arriving at the optimal solution at the current stage. Every decision that is made has its merit, which can be represented in an algebraic equation form. This equation is called a return function. The return function generally depends on the state variables and the decision made at a particular stage. An optimal policy at a stage gives the optimal return for a given value of the state variable.
  5. Recursive relationship: A mathematical relationship that gives the optimal decision policy for each state at stage n, given the optimal policy for each state at stage (n − 1), is called a recursive relationship.

Let n = stage number, sn = state input to stage n from stage (n + 1), dn = the decision variable at stage n independent of previous stages.


fn = rn (sn, dn) = return (or objective) function for stage n.

Suppose there are n stages at which a decision is to be made. These n stages are interconnected by a relationship called the transition function, sn−1 = sn * dn, where * represents any mathematical operation like addition, multiplication, subtraction or division. The units of sn, dn and sn−1 are homogeneous.

At each stage of the problem, there are two inputs: one is, state variable sn and the other is decision variable dn. The state variable is the state input which relates the present stage back to the previous stage. The decision dn is made at stage n for optimising the total return over the remaining n − 1 stages.

The decision dn which optimises the output at stage n produces two outputs:

  1. the return rn (sn, dn).
  2. the new state variable sn − 1

The return function is expressed as a function of the state variable sn and the decision variable dn. The state of the process at the beginning of the next stage (stage n − 1) is given by sn−1 = tn(sn, dn), where tn represents a state transformation function whose form depends on the particular problem to be solved.

If the problem is solved by using the recursive equation starting from the first through the last stage, that is, obtaining f1 → f2 → … → fn, the computation is called forward computation; if the equation is formulated to obtain the sequence fn → fn−1 → … → f1, the computation is called backward computation. Both computations yield the same solution.

The recursion relationship would vary according to the nature of the DPP. The stage 1 return is given by


f1 = r1 (s1, d1)

and the optimal value of f1 for the state variable s1 can be obtained by selecting a suitable decision variable d1. That is,

f1*(s1) = opt over d1 of [r1(s1, d1)]

The range of d1 is determined by s1, but s1 is determined by what has happened in stage 2.

Then, in stage 2, the return function will take the form

f2(s2, d2) = r2(s2, d2) * f1*(s1), where s1 = t2(s2, d2).

Recursively, we can define

fn*(sn) = opt over dn of [rn(sn, dn) * f*n−1(sn−1)]

where sn−1 = tn(sn, dn) and ‘*’ denotes addition, subtraction, multiplication and so on.

6.1.1 Need for Dynamic Programming Problem

In many situations, the decision-making process consists of selecting a combination of plans from a large number of alternative combinations.

Before making a decision, it is required that

  1. all the decisions of a combination are specified.
  2. the optimal policy can be selected only after all the combinations are evaluated.

While considering the problem as a whole, this approach has some important drawbacks:

  1. a lot of computational work and too much time are involved.
  2. all combinations may not satisfy the limitations and thus may be infeasible.
  3. the number of combinations is so large that evaluating all of them is impractical.

So, in order to overcome the difficulties, the given problem is divided into sub-problems or stages. One stage at a time is considered and the various infeasible combinations are eliminated with the objective of reducing the volume of computations.

6.1.2 Application of Dynamic Programming Problem

  1. In the production area, this technique has been used for production scheduling and employment smoothening problems.
  2. It has been used in inventory models, for formulating reordering rules.
  3. It has been used in assignment of different resources to different destinations.
  4. It has been used in replacement model to determine at which age the equipment is to be replaced for optimal return from the facilities.
  5. It has been used in network analysis, job shop scheduling, capital budgeting, allocation of research and development funds, long-term corporate planning, solving linear programming problems, probabilistic decisions and so on.

6.1.3 Characteristics of Dynamic Programming

The characteristics of DP are as follows:

  1. The problem can be divided into stages, with a policy decision at each stage.
  2. Each stage consists of a number of states associated with it.
  3. The decision at each stage transforms the current state into a state associated with the next stage.
  4. The state of the system at a stage is described by state variables.
  5. When the current state is known, an optimal policy for the remaining stages is independent of the policy of the previous ones.
  6. The solution procedure begins by finding the optimal policy for each state of the last stage.
  7. To identify the optimal policy for each state of the system a recursive relationship is formulated with n stages remaining, given the optimal policy for each state with (n − 1) stages left.
  8. Using recursive equation approach, each time the solution procedure moves backward stage by stage for obtaining the optimum policy of each state for that particular stage, till it attains the optimum policy beginning at the initial stage.

6.1.4 Definition

Decision Tree

A multi-stage decision system, in which each decision and state variable can take only a finite number of values, can be represented graphically by a decision tree.

FIGURE 6.1 Decision tree

In Fig. 6.1, circles represent the nodes corresponding to stages, and the lines between circles denote arcs corresponding to decisions.

6.1.5 Dynamic Programming Algorithm

The procedure for solving a problem by using the DPP approach can be summarised in the following steps.

Step 1: Identify the problem decision variables and specify the objective function to be optimised under certain limitations, if any.

Step 2: Decompose the given problem into a number of smaller sub-problems (or stages).
Identify the state variable at each stage and write down the transformation function.

Step 3: Write down a general recursive relationship for computing the optimal policy.
Decide which method is to be followed either backward or forward.

Step 4: Write the relation giving the optimal decision function for one stage sub-problem and solve it.

Step 5: Solve the optimal decision function for 2-stage, 3-stage, … n-stage problem.
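The five steps above can be sketched in code. The following is an illustrative Python sketch, not from the text: it applies backward computation to a small allocation problem, maximising the sum of stage returns rj(dj) subject to d1 + … + dn ≤ S with integer decisions. The stage return functions and the data at the bottom are made up for the demonstration.

```python
def backward_dp(stage_returns, total):
    """Maximise sum_j r_j(d_j) subject to d_1 + ... + d_n <= total,
    d_j >= 0 integer, by backward recursion over the stages."""
    n = len(stage_returns)
    # f[j][s] = optimal return obtainable from stages j..n-1 with s units left
    f = [[0] * (total + 1) for _ in range(n + 1)]
    best = [[0] * (total + 1) for _ in range(n)]
    for j in range(n - 1, -1, -1):              # backward computation
        for s in range(total + 1):
            f[j][s], best[j][s] = max(
                (stage_returns[j](d) + f[j + 1][s - d], d) for d in range(s + 1)
            )
    policy, s = [], total                       # recover the policy forward
    for j in range(n):
        policy.append(best[j][s])
        s -= best[j][s]
    return f[0][total], policy

# Hypothetical stage returns r1(d) = 3d, r2(d) = d^2, r3(d) = 2d, 4 units total.
value, policy = backward_dp([lambda d: 3 * d, lambda d: d * d, lambda d: 2 * d], 4)
```

With these made-up returns the best policy gives all 4 units to the second stage, since its return d^2 dominates for large d.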

Example 1

Solve the following by dynamic programming


Minimise z = y1 + y2 + … + yn

subject to the constraints


y1y2 … yn = b



y1, y2, …, yn ≥ 0 (or)

Factorise a positive quantity b into n factors in such a way so that their sum is a minimum.

Solution: To develop the recursive equation.

Let fn(b) be the minimum attainable sum y1 + y2 + … + yn when the positive quantity b is factorised into n factors y1, y2, …, yn.

One-stage problem: Let n = 1.

Here b is factorised into one factor, namely y1 = b. Hence,

f1(b) = b … (6.1.1)
Two-stage problem: Let n = 2.

Factorise b into two factors, y1 = x and y2 = b/x, such that y1y2 = b. Then,

f2(b) = min over 0 < x ≤ b of [x + f1(b/x)] … (6.1.2)

Three-stage problem: Factorise b into three factors y1 = x and y2, y3 with y2y3 = b/x, so that y1y2y3 = b. Then,

f3(b) = min over x of [x + f2(b/x)]

In general, the recursive equation for the n-stage problem is

fn(b) = min over x of [x + fn−1(b/x)] … (6.1.3)

To solve the recursive equation.

For n = 2, equation (6.1.2) is

f2(b) = min over x of [x + b/x]

The function will attain its minimum when d/dx (x + b/x) = 1 − b/x^2 = 0 (by using calculus).

That is, when x^2 = b, the function will become minimum.

That is, when x = b^(1/2), the function will become minimum. Hence,

f2(b) = b^(1/2) + b/b^(1/2) = 2b^(1/2)

The optimum policy is (b^(1/2), b^(1/2)) and f2(b) = 2b^(1/2).

For n = 3, equation (6.1.3) becomes

f3(b) = min over x of [x + f2(b/x)] = min over x of [x + 2(b/x)^(1/2)]

The function will attain its minimum when 1 − b^(1/2)/x^(3/2) = 0.

That is, the function will attain its minimum when x = b^(1/3). Hence,

f3(b) = b^(1/3) + 2(b/b^(1/3))^(1/2) = b^(1/3) + 2b^(1/3) = 3b^(1/3)

The optimal policy is (b^(1/3), b^(1/3), b^(1/3)) and f3(b) = 3b^(1/3). Let us assume that the optimal policy for n = m is (b^(1/m), b^(1/m), …, b^(1/m)) and fm(b) = mb^(1/m).

Now, for n = m + 1, equation (6.1.3) becomes

fm+1(b) = min over x of [x + fm(b/x)] = min over x of [x + m(b/x)^(1/m)]

The function will attain its minimum when 1 − b^(1/m)/x^((m+1)/m) = 0.

That is, when x^((m+1)/m) = b^(1/m), the function attains its minimum. That is, when x = b^(1/(m+1)), the function attains its minimum. Further, note that fm(b/x) = m(b/x)^(1/m) = mb^(1/(m+1)) for x = b^(1/(m+1)). Hence,

fm+1(b) = b^(1/(m+1)) + mb^(1/(m+1)) = (m + 1)b^(1/(m+1))

The result is true for n = m + 1. Hence, by mathematical induction, the optimal policy is


(b^(1/n), b^(1/n), …, b^(1/n)) and fn(b) = nb^(1/n).
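As a numerical sanity check on the closed form, a coarse grid search over the recursive equation (an illustrative sketch, not part of the text) agrees with fn(b) = nb^(1/n) for b = 8, n = 3:

```python
def f(n, b):
    # f_n(b) = min over x of [x + f_{n-1}(b/x)], with f_1(b) = b,
    # approximated by searching x over a coarse grid on (0, 10]
    if n == 1:
        return b
    xs = [i * 0.02 for i in range(1, 501)]
    return min(x + f(n - 1, b / x) for x in xs)

b = 8.0
approx = f(3, b)           # grid-search value of the recursion
exact = 3 * b ** (1 / 3)   # closed form n * b**(1/n) = 6 for b = 8
```

The grid contains the optimal factor b^(1/3) = 2, so the search value essentially coincides with the closed form.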


6.2.1 Single Additive Constraint, Multiplicatively Separable Return

Consider the problem:

To maximise z = f1(y1) f2(y2) … fn(yn)

subject to

a1y1 + a2y2 + … + anyn ≤ b, yj ≥ 0, j = 1, 2, …, n

First, introduce state variables, that is,

sn = a1y1 + a2y2 + … + anyn ≤ b, sj−1 = sj − ajyj

then the general recursion formula becomes

Fj(sj) = max over yj of [fj(yj) Fj−1(sj − ajyj)]

6.2.2 Single Additive Constraint, Additively Separable Return

Consider the problem:

To minimise z = f1(y1) + f2(y2) + … + fn(yn)

subject to the constraints

a1y1 + a2y2 + … + anyn ≤ b, yj ≥ 0, j = 1, 2, …, n

where the aj and b are real numbers.

The return at the jth stage is the function fj(yj), where yj is the decision variable. Introduce state variables s0, s1, s2, …, sn as

sn = a1y1 + a2y2 + … + anyn ≤ b
sn−1 = a1y1 + a2y2 + … + an−1yn−1 = sn − anyn
sn−2 = a1y1 + … + an−2yn−2 = sn−1 − an−1yn−1
…   …   …
s1 = a1y1 = s2 − a2y2.



sj−1 = tj(sj, yj), 1 ≤ j ≤ n

is the stage transformation function and indicates that each state variable is a function of the next state and decision variables. Fn(sn) denotes the minimum value of z for any feasible value of sn, sn being a function of all the decision variables. Thus,

Fn(sn) = min over y1, …, yn of [f1(y1) + f2(y2) + … + fn(yn)]

and the general recursive formula is

Fj(sj) = min over yj of [fj(yj) + Fj−1(sj − ajyj)], j = 2, 3, …, n, with F1(s1) = f1(s1/a1).

6.2.3 Single Multiplicative Constraint, Additively Separable Return

Consider the problem,


Minimise z = f1(y1) + f2(y2) + … + fn(yn)

subject to the constraints


y1y2 … yn ≤ p, p > 0, yi ≥ 0 for all i

State variables are defined as:

sn = y1y2 … yn ≤ p, sj−1 = sj/yj

These state variables give stage transformations of the type

sj−1 = tj(sj, yj)

Let Fn(sn) be the minimum value of the objective function for any feasible sn. Thus, proceeding as earlier, we obtain the recursive formula

Fj(sj) = min over yj of [fj(yj) + Fj−1(sj/yj)]

6.2.4 Systems Involving More than One Constraint

The dynamic programming models discussed so far involve only one constraint apart from non-negativity. In single-constraint problems there is a single state variable for each stage, whereas in multi-constraint problems there has to be one state variable per constraint per stage. Sometimes it is possible to reduce the number of state variables. The procedure for this model remains the same.

6.2.5 Problems

Example 1 Discrete Variables

A government space project is conducting research on a certain engineering problem that must be solved before man can fly to the moon safely. Three research teams are currently trying three different approaches for solving this problem. The estimate has been made that, under present circumstances, the probabilities that the respective teams, call them A, B and C, will not succeed are 0.40, 0.60 and 0.80, respectively. Thus, the current probability that all three teams will fail is (0.40) × (0.60) × (0.80) = 0.192. Since the objective is to minimise this probability, the decision has been made to assign two more top scientists among the three teams in order to lower it as much as possible. The following Table gives the estimated probability that the respective teams will fail when 0, 1 or 2 additional scientists are added to that team.

How should the additional scientists be allocated to the team?

Solution: In this problem, the research teams are corresponding to the stages in the DPP.

Let s denote the number of new scientists still available for assignment at that stage, xj the number of additional scientists allocated to team j (stage j), and pj(xj) the probability of failure for team j if it is assigned xj additional scientists, as prescribed in the Table.

The problem is:


minimise z = p1(x1) p2(x2) p3(x3)

subject to the constraints


x1 + x2 + x3 = 2



x1, x2, x3 ≥ 0, x1, x2, x3 are integers.

To obtain Recursive Equations

Let fj(xj) be the value of the optimal allocation for teams 1 through j both inclusive.

For j = 1


f1 (x1) = p1 (x1)

If fj(s, xj) is the probability of failure for teams j through 3 when s scientists remain and xj of them are assigned to team j, then

fj(s, xj) = pj(xj) × f*j+1(s − xj)

such that 0 ≤ xj ≤ s and the xj are non-negative integers, j = 1, 2, 3.

The recursive equations are:

f*j(s) = min over xj of fj(s, xj), j = 1, 2

When j = 3, f3(s, x3) = p3(s), since all remaining scientists are assigned at the last stage.

Solution of Recursive Equations

Since all the quantities in the recurrence equation are discrete, the differential calculus method cannot be used. Let the optimal policy be denoted by xj*. Now, proceed backward from j = 3, one stage at a time.

Computation for one-stage problem: (j = 3)

s  f3*(s)  x3*
0  0.80    0
1  0.50    1
2  0.30    2

Computation for two-stage problem: (j = 2)

Computation for three-stage problem: (j = 1)

Therefore, the optimum solution has x1* = 1, making s = 1 at the second stage, so that x2* = 0, making s = 1 at the third stage, so that x3* = 1. Hence, the first and third teams should each receive one additional scientist. The new probability that all three teams will fail is 0.060.
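The allocation can be checked by brute force. The table above is only partially legible (the probabilities with no extra scientists, 0.40, 0.60, 0.80, and the team C column); the p1 and p2 entries for one and two extra scientists in the sketch below are illustrative assumptions of mine, not the book's data, chosen so that the optimum reproduces the failure probability 0.060 stated in the text.

```python
from itertools import product

# Failure probabilities p_j(x) for x = 0, 1, 2 extra scientists.
p = [
    [0.40, 0.20, 0.15],   # team A (x = 1, 2 entries are assumptions)
    [0.60, 0.40, 0.20],   # team B (x = 1, 2 entries are assumptions)
    [0.80, 0.50, 0.30],   # team C (from the text)
]

# Enumerate all ways of splitting the 2 extra scientists over 3 teams.
best_prob, best_alloc = min(
    (p[0][x1] * p[1][x2] * p[2][x3], (x1, x2, x3))
    for x1, x2, x3 in product(range(3), repeat=3)
    if x1 + x2 + x3 == 2
)
```

With these numbers the optimum assigns one scientist each to teams A and C.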

Example 2

Divide a positive quantity c into n parts in such a way that their product is maximum.



Maximise z = y1y2 … yn subject to the constraints
y1 + y2 + … + yn = c



yi ≥ 0, i = 1, 2 … n

Solution: Let yi be the ith part of c, and each i may be considered as a stage. Since yi may assume any non-negative value which satisfies y1 + y2 + … + yn = c, the alternatives at each stage are infinite. This means yi is continuous. Hence, the optimal decision at each stage is obtained by using differentiation.

Let fn(c) be the maximum attainable product y1y2 … yn when c is divided into n parts y1, y2, …, yn. Thus, fn(c) is a function of both n and c.

One-stage problem: For n = 1, c is divided into one part only; then y1 = c and f1(c) = c.

Two-stage problem: For n = 2, c is divided into two parts y1 = x, y2 = c − x such that y1 + y2 = c. Then,

f2(c) = max over 0 ≤ x ≤ c of [x(c − x)]

Three-stage problem: For n = 3, c is divided into three parts y1, y2, y3. Let y1 = x and y2 + y3 = c − x, so that y1 + y2 + y3 = c. Then,

f3(c) = max over x of [x f2(c − x)]

Generalising, for n stages we get

fn(c) = max over 0 ≤ x ≤ c of [x fn−1(c − x)]

To solve the recursive equation

The function x(c − x) will attain its maximum when c − 2x = 0, that is, when x = c/2. Hence,

f2(c) = (c/2)(c − c/2) = (c/2)^2

The optimal policy is (c/2, c/2) and f2(c) = (c/2)^2.

For n = 3,

f3(c) = max over x of [x f2(c − x)] = max over x of [x((c − x)/2)^2]

The function will attain its maximum value at x = c/3. Hence,

f3(c) = (c/3)((c − c/3)/2)^2 = (c/3)^3

The optimal policy is (c/3, c/3, c/3) and f3(c) = (c/3)^3.

Assume the optimal policy for n = m is (c/m, c/m, …, c/m), with fm(c) = (c/m)^m. Then, for n = m + 1,

fm+1(c) = max over x of [x fm(c − x)] = max over x of [x((c − x)/m)^m]

The function attains its maximum value at x = c/(m + 1), so that

fm+1(c) = (c/(m + 1))^(m+1)

Hence, the result is true for n = m + 1. By induction, the optimal policy is (c/n, c/n, …, c/n) and fn(c) = (c/n)^n.
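The induction result, that the equal split maximises the product, can be checked numerically. The sketch below (illustrative, not from the text) grid-searches the three-part split of c = 9 and lands on (3, 3, 3) with product (c/3)^3 = 27:

```python
def best_three_part_split(c, steps=300):
    # Maximise y1*y2*y3 subject to y1 + y2 + y3 = c, yi >= 0,
    # by a grid search over the first two parts.
    best = (-1.0, (0.0, 0.0, 0.0))
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            y1 = c * i / steps
            y2 = c * j / steps
            y3 = c - y1 - y2
            best = max(best, (y1 * y2 * y3, (y1, y2, y3)))
    return best

value, parts = best_three_part_split(9.0)
```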

Example 3 Discrete Variable

A student has to take examination in three courses x, y, z. He has three days available for study. He feels it would be best to devote a whole day to study the same course, so that he may study a course for one day, two days or three days or not at all. His estimates of grades he may get by studying are as follows:

How should he plan to study so that he maximises the sum of his grades?

Solution: Let n1, n2, n3 be the number of days he should study the courses x, y, z, respectively. If f1(n1), f2(n2), f3(n3) are the grades earned by such a study, then the problem becomes,



Maximise z = f1(n1) + f2(n2) + f3(n3)

subject to


n1 + n2 + n3 ≤ 3



n1, n2, n3 ≥ 0 and integers.

Here, nj are the decision variables and fj(nj) are the respective return functions for j = 1, 2, 3. Let us define the state variables sj (sj denotes the number of days available for the jth stage) as follows:


s3 = n1 + n2 + n3 ≤ 3
s2 = n1 + n2 = s3 − n3
s1 = n1 = s2 − n2

The state transformation functions are defined by


sj − 1 = tj(sj, nj), j = 2, 3.

The recursive equations are:



For any feasible value of s3, the required solution would become f3*(s3).

Recursive equation gives the solution as follows.

Stage returns fj(nj)

Stage transformation sj − 1, j = 2, 3

Recursive operations

Proceeding backwards through the enclosed numbers, the optimal policy is obtained as n3 = 2, n2 = 0, n1 = 1, keeping in view that n1 + n2 + n3 ≤ 3.

The required maximum return is 5.

Example 4 Maximisation Problem

A truck can carry a total of 10 tonnes of product. Three types of product are available for shipment. Their weights and values are tabulated below. Assuming that at least one unit of each type must be shipped, determine the loading which will maximise the total value.

Type  Value  Weight (tonnes)
A     20     1
B     50     2
C     60     2

Solution: In this problem, a decision is to be taken as to how many units of A, B, C should be transported. Let each stage represent the decision regarding the number of units of a given product to be transported. The problem is divided into three stages.

One-stage problem: It is the problem of deciding the number of units of product C to be transported. The profit per unit of C (weight = 2 tonnes) is 60. The restriction is that at least one unit each of types A and B must be transported. Out of a maximum of 10 tonnes, (1 + 2) tonnes are allotted to A and B. Hence, we can load at most the remaining 7 tonnes. The total load for this stage, represented by s1, varies from 2 tonnes to 7 tonnes. Let x1 be the number of units of C (one, two or three) to be transported, and let f1*(s1) be the optimum profit in the one-stage problem. The results are tabulated as:

Two-stage problem: Here, we must decide how much of products B and C to transport. We allot some space to product B; the remaining space is available for C, for which the optimum values are taken from the results of the one-stage problem. The space to be used for B and C together varies from 4 tonnes to 9 tonnes.

Let x2 represent the number of units of product type B to be transported. Then, the remaining (s2 − x2) tonnes of space is allocated for C. The results are shown in the following Table.

Three-stage problem: Here we consider all three types of product. Let s3 be the space available for transporting items and x3 the number of units of type A to be transported, so that (s3 − x3) tonnes are available for loading products B and C together, for which the optimum profit is taken from the two-stage problem. The space available (s3) varies from 5 tonnes to 10 tonnes and x3 varies from one unit to six units. The result of the three-stage problem is as shown below:

Thus, the optimal solution is given by

x3* = 2 units of A, x2* = 1 unit of B and x1* = 3 units of C, with a maximum total value of 20 × 2 + 50 × 1 + 60 × 3 = 270.
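Because the data are small, the loading implied by the table (values 20, 50, 60; weights 1, 2, 2; capacity 10 tonnes; at least one unit of each type) can be verified by brute force. The sketch below is illustrative, not the book's tabular computation:

```python
from itertools import product

# a, b, c = units of types A, B, C; at least one of each, weight limit 10 t.
best_value, best_load = max(
    (20 * a + 50 * b + 60 * c, (a, b, c))
    for a, b, c in product(range(1, 11), repeat=3)
    if 1 * a + 2 * b + 2 * c <= 10
)
```

The best loading carries 2 units of A, 1 unit of B and 3 units of C for a total value of 270.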

Example 5 Project Management

A construction contractor has four construction projects underway and wants to minimise the time required to complete all the projects. The following Table gives the estimated time required to complete each project for a specified number of foremen assigned to it.


Estimated time

Solution: The contractor has only six foremen and each project must be assigned at least one foreman. By considering each project as a stage, we have 4 stages. Let xi be the number of foremen assigned to project i and si the number of foremen still available for allocation. Let t(xi) be the completion time for project i when xi foremen are assigned. The state variables satisfy si−1 = si − xi, from which the values of s4, s3, s2, s1 can be computed. For stage 4, six foremen are available for allocation. x4 can take the values 1, 2 or 3, since at least one foreman must be allocated to each project. Hence, the possible values for s3 are 6 − 1 = 5, 6 − 2 = 4 and 6 − 3 = 3. Also, as x3 assumes the values 1, 2, 3, s2 takes the values 5 − 1 = 4, 5 − 2 = 3 and 5 − 3 = 2. Note that s2 cannot have the value 1, since we need at least one foreman for project A. Lastly, s1 takes the values 3, 2, 1. The total return function is

fi*(si) = min over xi of [t(xi) + f*i−1(si − xi)], where f1*(s1) is the minimum completion time for stage 1 (i = 1).


Stage 1:

Stage 2:

Note that f2(s2, x2) = t(x2) + f1*(s2 − x2), where f1*(s2 − x2) is read from the table for stage 1, and t(1) is the estimated time for project B.

Stage 3:

Stage 4:

Two foremen are assigned to project D; hence s3 = 6 − 2 = 4. Since x3* = 1, one foreman is assigned to project C. Now, s2 = 4 − 1 = 3. As x2* = 1 or 2, we have two possibilities. If x2* = 1, one foreman is assigned to B and the remaining two foremen to A; otherwise, if x2* = 2, two foremen are assigned to B and the remaining one foreman to A. The minimum completion time is 67.

Example 6

Solve the following using dynamic programming

Minimise z = y1^2 + y2^2 + … + yn^2

subject to the constraints


y1y2 … yn = c ≠ 0



yj ≥ 0, j = 1, 2, … n

Solution: Let fn(c) be the minimum attainable value of z when c is divided into n factors.

For n = 1, we have y1 = c and therefore f1(c) = c^2.

For n = 2, let y1 = x, y2 = c/x. Then,

f2(c) = min over x of [x^2 + f1(c/x)] = min over x of [x^2 + (c/x)^2]

since f1(c) = c^2, we have f1(c/x) = (c/x)^2.

For n = 3, let y1 = x and y2y3 = c/x. Then,

f3(c) = min over x of [x^2 + f2(c/x)]

Recursively, we have

fn(c) = min over x of [x^2 + fn−1(c/x)]

Solution of recursive equation:

Let f(x) = x^2 + (c/x)^2. Then, f′(x) = 2x − 2c^2/x^3 = 0 gives x = c^(1/2); therefore y1 = c^(1/2) and y2 = c/x = c^(1/2).

f2(c) = c + c = 2c, so that

f2(c/x) = 2(c/x).

Hence, the optimal policy is (c^(1/2), c^(1/2)) and f2(c) = 2c.

For n = 3, f3(c) = min over x of [x^2 + 2c/x], and the minimum is attained at x = c^(1/3). Hence, the optimal policy is {c^(1/3), c^(1/3), c^(1/3)} and f3(c) = 3c^(2/3). Generalising, the optimal policy for an n-stage problem will be {c^(1/n), c^(1/n), …, c^(1/n)} and fn(c) = nc^(2/n).
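A quick numerical check of the n = 2 stage (an illustrative sketch, not from the text): minimising y1^2 + y2^2 with y1y2 = c = 4 over a grid for y1 reproduces f2(c) = 2c = 8, attained at y1 = y2 = 2.

```python
c = 4.0
# f_2(c) = min over x of [x**2 + (c/x)**2]; grid for x on (0, 10]
xs = [i * 0.01 for i in range(1, 1001)]
approx = min(x ** 2 + (c / x) ** 2 for x in xs)
```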

Example 7

Suppose there are n machines which can perform two jobs. If x of them do the first job, then they produce goods worth g(x) = 3x, and if y of them perform the second job, then they produce goods worth n(y) = (2.5)y. Machines are subject to depreciation so that after performing the first job only a(x) = x/3 machines remain available and after performing the second job b(y) = 2y/3 machines remain available in the beginning of the second year. This process is repeated with the remaining machines. Obtain the maximum total return after three years and also find the optimal policy in each year.

Solution: Let us define the following notations:

j = stage number, each year being viewed as a stage, j = 1, 2, 3.

xj = number of machines devoted to job 1 in the year j.

yj = number of machines devoted to job 2 in the year j.

s = state variable, the number of machines at any stage (year).

fj(s) = maximum return function when initial available machines are s with j more stages (years) to go.

There are three stages in this problem.

Stage 1: When j = 1

Using the backward approach, that is, j = 1 (the third year), we have s = s3 machines at the beginning of the year. The return function is

f1(s3) = max of [3x3 + (2.5)y3]

subject to the constraint


x3 + y3s3



x3, y3 ≥ 0.

Since the function f1(s3) is a linear function in x3 and y3, its maximum can be obtained by the concept of extreme-point solutions of linear programming problems, as shown in Figure 6.2.

FIGURE 6.2 Extreme point solution (Stage 1)

FIGURE 6.3 Extreme point solution (Stage 2)

The maximum value of return function occurs at B(s3, 0), where


f1(s3) = 3x3 + (2.5)y3
= 3s3 + (2.5)0 = 3s3

Hence, the optimal decision at this stage is x3* = s3, y3* = 0, with f1(s3) = 3s3.

Stage 2: j = 2

f2(s2) = max of [3x2 + (2.5)y2 + f1(s1)]

where 3x2 + (2.5)y2 is the immediate return from stage 2 and f1(s1) is the maximum known return from stage 1, subject to the constraint


x2 + y2s2
x2, y2 ≥ 0

where, x2, y2 = number of machines devoted to job 1 and 2, respectively in the second year.

s1 = x2/3 + 2y2/3 machines will be available at the beginning of the next year.

Since f1(s1) = 3s1, we have

f2(s2) = max of [3x2 + (2.5)y2 + 3(x2/3 + 2y2/3)] = max of [4x2 + (4.5)y2]

This is again a linear function in x2 and y2, whose maximum occurs at the corner point A(0, s2), as shown in Fig. 6.3.

The maximum value of the return function at A(0, s2) is f2 (s2) = 4 × 0 + (4.5) s2 = (4.5) s2

Hence, the optimal decision at this stage is

x2* = 0, y2* = s2 and f2(s2) = (4.5)s2.

Stage 3: (j = 3). By the backward approach, consider the first year as the third stage. The return function at this stage is

f3(s1) = max of [3x1 + (2.5)y1 + f2(x1/3 + 2y1/3)] = max of [(4.5)x1 + (5.5)y1]
subject to the constraint


x1 + y1s1
x1, y1 ≥ 0

The return function is a linear function in x1 and y1, whose maximum occurs at the corner point A(0, s1) as shown in Fig. 6.4. The maximum value of return function at A(0, s1) is


f3 (s1) = 4.5 × 0 + (5.5)s1 = (5.5)s1

FIGURE 6.4 Extreme point solution (Stage 3)

Hence, the optimal decision at this stage is

x1* = 0, y1* = s1 (= m machines, say) and f3(s1) = (5.5)m.

The values of x2, x3, y2, y3 can be obtained as follows:

y2 = s2 = 2y1/3 = 2m/3  and  x2 = 0
x3 = s3 = 2y2/3 = 4m/9  and  y3 = 0
x1 = 0; y1 = m.


At first year (stage 3): x1 = 0, y1 = m

At second year (stage 2): x2 = 0, y2 = 2m/3

At third year (stage 1): x3 = 4m/9, y3 = 0.
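The backward recursion of this example can be verified numerically. The sketch below is illustrative, with a grid search standing in for the corner-point argument; it reproduces f1(s) = 3s, f2(s) = 4.5s and f3(s) = 5.5s.

```python
def stage_value(s, future, steps=60):
    # Maximise 3x + 2.5y + future(x/3 + 2y/3) over x + y <= s, x, y >= 0.
    # The objective is increasing in x and y, so all s machines are used.
    best = 0.0
    for i in range(steps + 1):
        x = s * i / steps
        y = s - x
        best = max(best, 3 * x + 2.5 * y + future(x / 3 + 2 * y / 3))
    return best

f1 = lambda s: stage_value(s, lambda s0: 0.0)   # one year to go
f2 = lambda s: stage_value(s, f1)               # two years to go
f3 = lambda s: stage_value(s, f2)               # three years to go

m = 9.0
```

Evaluating at m = 9 gives f1(9) = 27, f2(9) = 40.5 and f3(9) = 49.5, in agreement with the closed forms above.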

Example 8 Cargo Load Problem

A vessel is to be loaded with stocks of three items. Each unit of item i has a weight wi and value ri. The maximum cargo weight the vessel can take is 5 and the details of the three items are as follows:

i wi ri
1 1 30
2 3 80
3 2 65

Develop the recursive equation for the above case and find the most valuable cargo load without exceeding the maximum cargo weight by using dynamic programming.

Solution: The number of items is 3, so it is a three-stage problem. Let xj (j = 1, 2, 3) be the decision variable, fj*(s) denote the value of the optimal allocation for items 1 through j with remaining weight capacity s, and fj(s, xj) the value associated with the optimum solution, with

where pj(xj) denotes the expected value obtained from allocating xj units of item j.

One-stage problem: (one-item cargo loading) The largest value of x1 is s/1 = s, and

f1*(s) = max over x1 of [30x1] = 30s

Two-stage problem: The largest value of x2 is the integer part of s/3, and

f2*(s) = max over x2 of [80x2 + f1*(s − 3x2)]

For the three-stage problem, the largest value of x3 is the integer part of 5/2, that is, 2, and

f3*(5) = max over x3 of [65x3 + f2*(5 − 2x3)]

Given that the maximum cargo weight is 5, the optimal solution is x1* = 1, x2* = 0, x3* = 2, with f3*(5) = 30 × 1 + 65 × 2 = 160.
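A brute-force check of the cargo data (an illustrative sketch, not the book's tabular computation): weights (1, 3, 2), values (30, 80, 65) and capacity 5 give the same optimum, one unit of item 1 and two units of item 3 with total value 160.

```python
# x1, x2, x3 = units of items 1, 2, 3 loaded; weight limit 5.
best_value, best_load = max(
    (30 * x1 + 80 * x2 + 65 * x3, (x1, x2, x3))
    for x1 in range(6)          # item 1, weight 1
    for x2 in range(2)          # item 2, weight 3
    for x3 in range(3)          # item 3, weight 2
    if 1 * x1 + 3 * x2 + 2 * x3 <= 5
)
```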

Example 9 Application in Inventory Control

A man is engaged in buying and selling items. He operates from a warehouse that can hold 500 items. Each month he can sell any quantity that he chooses, up to the stock at the beginning of the month. Each month, he can buy as much as he wishes for delivery at the end of the month, so long as his stock does not exceed 500 items. For the next four months, he has the following error-free forecasts of cost and sale prices:

If he currently has a stock of 200 units, what quantities should he sell and buy in next four months? Find the solution using dynamic programming.



xi = amount of items to be sold during the month i
yi = amount of items to be ordered during the month i
bi = stock level in the beginning of month i
pi = sale price in the month i
ci = purchase price in the ith month
H = warehouse capacity

Let fn(bn) be the maximum possible return when there are n months to precede and initial stock is bn. This is a four-stage problem. Thus,

where xn ≤ bn and bn − xn + yn ≤ H. Also,

For n = 1,


y1 = 0, x1 = b1, hence f1(b1) = p1b1 = 27b1


b1 = b2 − x2 + y2

For n = 2,

where y2 ≤ H − b2 + x2 = 500 − b2 + x2.



b2 = b3 − x3 + y3

For n = 3,

As b3 = b4 − x4 + y4, taking n = 4,

Given: b4 = 200


which implies b3 = b4 − x4 + y4
= 200 − 200 + 0 = 0,  since x4 = 200, y4 = 0
which implies b2 = b3 − x3 + y3
= 0 − 0 + 500,  since x3 = 0, y3 = 500
= 500
which implies b1 = b2 − x2 + y2
= 500 − 0 + 0 = 500,  since x2 = 0, y2 = 0,

and, y1 = 0, x1 = b1 = 500.

Thus, the required solution is:

Maximum possible return = 28 × 200 + 1500 = 7100.


Example 1

Suppose that a company is planning its capital improvement projects for the coming years. The company has budgeted 100 lakh and is reviewing four possible projects for funding. The following Table shows the projects, together with their costs and the net present value of their returns.

A project will be funded entirely or not at all. Maximise the return on the projects.

Solution: Let xi be 1 if ith project (i = 1, 2, 3, 4) is funded and 0 otherwise. Hence, we have to maximise z = 50x1 + 100x2 + 60x3 + 30x4

subject to


50x1 + 70x2 + 30x3 + 30x4 ≤ 100

Each project is taken as stage and the state is the amount of capital available. Thus, si = the capital available at stage i.

The policy decision to be made is whether or not to fund project i. Hence, the decision variable is

xi = 1 if project i is funded, and xi = 0 otherwise.

Let the cost of the project i be ci. Then, the successive state variables are related as

si−1 = si − cixi

= si − ci if the project is selected

= si if the project is not selected.

The recursive equation for the total return is

fi(si) = max over xi in {0, 1} of [pixi + fi−1(si − cixi)]

where pi is the return from project i.

The available fund for stage 4 is 100 lakh. The available capital for stage i depends on the projects funded in stage i + 1, …, 4. So various possible states are

Stage State
4 100
3 70 if the project 4 is funded
100 if the project 4 is not funded
2 40 if projects 3 and 4 are funded
70 if any one of projects 3 or 4 is funded
100 if none of the projects are funded
1 0 if projects 2 & 3 are funded
30 if project 2 is funded
40 if projects 3 & 4 are funded
70 if any one of 3 or 4 is funded
100 if none of the projects are funded

Stage 1:

Stage 2:

Stage 3:

Stage 4:

The optimal return is 160 lakh. Since x4 = 0, project 4 is not selected for funding. Hence, the capital available at stage 3 is s3 = 100. In this case x3 = 1, so project 3 is funded. The capital available for stage 2 is 100 − 30 = 70. For stage 2, x2 = 1 for s2 = 70. Thus, project 2 is selected for funding. Since project 2 costs 70 lakh, the capital available for stage 1 is 0. Here, x1 = 0. Hence, projects 2 and 3 are selected for funding.
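The example is a 0/1 knapsack and small enough to verify exhaustively (an illustrative sketch, not the book's stage tables): costs 50, 70, 30, 30, returns 50, 100, 60, 30 and budget 100 give the optimum 160 by funding projects 2 and 3.

```python
from itertools import product

costs = [50, 70, 30, 30]
returns = [50, 100, 60, 30]
budget = 100

# Enumerate all fund / don't-fund combinations within the budget.
best_return, best_pick = max(
    (sum(r * x for r, x in zip(returns, pick)), pick)
    for pick in product([0, 1], repeat=4)
    if sum(c * x for c, x in zip(costs, pick)) <= budget
)
```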


Example 1

Using dynamic programming, solve the reliability problem with the following data.

The total capital available is 10 (in units of thousand rupees).

Solution: Let R be the total reliability of a system having n components arranged in series, and let there be kj parallel units for component j (j = 1, 2, 3).

The problem is


Maximise R = R1 R2R3

subject to


c1 + c2 + c3c

where c is the total capital available.

Let xj = the capital allocated to the stage j, where j = 1, 2, 3.

kj = the number of parallel units assigned to main components j.

fj(xj) = reliability of components (stages) 1 through j inclusive, given that 0 ≤ xj ≤ c.

The recursive equations are

Starting with stage 1, as the component 1 must include at least one (parallel) unit, we find that x1 must at least equal c1 (1) = 2. By the same reasoning x1 cannot exceed 10 − (3+1) = 6; otherwise, the remaining capital will not be sufficient to provide main components 2 and 3 with at least one (parallel) unit each. Following the same reasoning.


x2 = 5, 6, 7, 8 or 9 and x3 = 6, 7, 8, 9, or 10.

Stage 1:

Stage 2:

Stage 3:

The optimal solution (given c = 10) is k3 = 2, k2 = 1, k1 = 3 with the maximum reliability as 0.576.

Note: The reliability problem is similar to the shortest route problem or stage-coach problem, except that the recursive relation involves multiplication rather than addition, since the reliability of two components connected in series is the product of their respective reliabilities.
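As the Note says, the recursion multiplies stage reliabilities instead of adding distances. A minimal sketch follows; the per-component (units, reliability, cost) options are hypothetical stand-ins, since the example's data table is not reproduced here.

```python
from functools import lru_cache

# options[j] = list of (units k, reliability Rj(k), cost cj(k)) for component j.
# These values are hypothetical; the example's own data table is not reproduced here.
options = {
    1: [(1, 0.6, 2), (2, 0.8, 4), (3, 0.9, 6)],
    2: [(1, 0.7, 3), (2, 0.8, 5)],
    3: [(1, 0.5, 1), (2, 0.7, 3), (3, 0.9, 5)],
}
C = 10  # total capital available

@lru_cache(maxsize=None)
def f(j, budget):
    """Maximum reliability of components 1 through j for the given budget."""
    if j == 0:
        return 1.0, ()
    best, plan = 0.0, ()
    for k, rel, cost in options[j]:
        if cost <= budget:
            r, p = f(j - 1, budget - cost)
            if r * rel > best:        # multiplicative, not additive, recursion
                best, plan = r * rel, p + (k,)
    return best, plan

R, units = f(3, C)   # units[j - 1] = parallel units in component j
print(round(R, 4), units)
```

With these stand-in values the best design is 2, 1 and 2 parallel units in components 1, 2 and 3, respectively.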

Example 2

Solve the following reliability network:

The letters A, B, C, D, E, F, G represent nodes of the network and the directed line segments show the possible flow of information from one node to another. The numbers assigned to each line represent the probabilities that the line will not have a failure during data transmission from one node to another node. The objective is to determine the most reliable path from A to G.

This problem has three stages.

Stage 1: Set of lines converging to G.

Stage 2: Corresponds to all the lines emanating from {B, C, D} and converging to {E, F}.

Stage 3: Corresponds to all lines reaching {B, C, D}. The problem is to find the sequence of lines whose total reliability is maximum. The states si correspond to the node locations at a given stage i, and the decision variable xi represents the line chosen in order to reach the next stage. Let the return function ri(si, xi) represent the reliability at a given stage; thus ri(si, xi) is the probability attached to the selected line. The total reliability function satisfies the recurrence relation.

Stage 1:

Stage 2:

Stage 3:

The optimum reliability is 0.8659 and the path is ACEG.
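The same backward idea can be sketched in code on the layered network A → {B, C, D} → {E, F} → G described above. The edge reliabilities below are assumed values standing in for the figure's numbers, which are not reproduced here.

```python
# rel[(u, v)] = probability that the line u -> v transmits without failure.
# These values are assumed; the figure's actual numbers are not reproduced here.
rel = {
    ('A', 'B'): 0.90, ('A', 'C'): 0.95, ('A', 'D'): 0.85,
    ('B', 'E'): 0.80, ('B', 'F'): 0.75,
    ('C', 'E'): 0.92, ('C', 'F'): 0.70,
    ('D', 'E'): 0.60, ('D', 'F'): 0.88,
    ('E', 'G'): 0.96, ('F', 'G'): 0.90,
}
layers = [['A'], ['B', 'C', 'D'], ['E', 'F'], ['G']]

# Backward recursion: f[u] = (max reliability from u to G, best path from u).
f = {'G': (1.0, 'G')}
for layer in reversed(layers[:-1]):
    for u in layer:
        f[u] = max(
            ((rel[(u, v)] * f[v][0], u + f[v][1]) for v in list(f) if (u, v) in rel),
            key=lambda t: t[0],
        )

reliability, path = f['A']
print(round(reliability, 4), path)
```

With these assumed values the most reliable route also happens to be A–C–E–G, though the reliability differs from the example's 0.8659 because the figure's data is not available here.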


Consider the network of routes from node A to B as below:

The objective is to find the shortest route from A to B. The salesman wants to find the shortest route. From city A he should visit any one of the cities 1, 2 or 3; from there he selects the next leg to either of the cities 4 or 5; and from 4 or 5 he can reach B. Each leg of his trip can be thought of as a stage. A stage number n can be defined as the number of stages still to be covered. When he is in city 4 or 5, he has to go one more stage to complete the journey; this we call a one-stage problem, indicating that there is one more stage to complete the journey. When he is in city 1, 2 or 3, he has to go two more stages to reach the final destination. In this way we have 3 stages from A to B.

Each move will change the state of the system, denoted by sj. Thus, s0 is the state in which node A lies. Also, s0 has only a single value, say s0 = 1. State s1 has two possible values, corresponding to the two nodes in stage 1, and so on. Possible alternative paths from one stage to the other will be called decision variables, denoted by xj. The return or profit function is fj(xj). Note that xj is identified with the length of the corresponding line; here, fj(xj) = xj. The minimum transportation cost (or path) from s0 to sj will be denoted by fj*(sj).

The general recursive formula is given by

fj*(sj) = min {xj + f*j−1(sj−1)}, the minimum being taken over the decisions xj,

with f0*(s0) = 0.

Example 1

Find the shortest path from vertex A to K along arcs joining the various vertices lying between A and K. The length of each arc is given.


j = 0   j = 1   j = 2   j = 3   j = 4

The problem can be divided into four stages.

At j = 4, x4 has three possible values: 3, 9, or 8. If x4 = 3, then s3 = D; if x4 = 9, then s3 = G; and if x4 = 8, then s3 = J. Hence, the minimum path from A to K passes through s3 = D, G or J, corresponding to x4 = 3, 9 or 8. Hence,


Similarly, we can calculate the corresponding minimum values for s3 = G and s3 = J.

Stage 1:

Stage 2:

Stage 3:

Stage 4:

The minimum path from A to K has cost (or distance) 21. Tracing the minimum-path decisions backwards, the successive distances are

The optimal route connecting A and K is ABFDK.
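The backward recursion of this section can be sketched generically as a memoised recursive pass from the origin toward the destination. The small network and arc lengths below are hypothetical, not the figure's data.

```python
# arcs[u] = {v: length of the arc u -> v}; a hypothetical stage-coach network.
arcs = {
    'A': {'P': 2, 'Q': 4},
    'P': {'R': 7, 'S': 3},
    'Q': {'R': 1, 'S': 6},
    'R': {'T': 5},
    'S': {'T': 8},
}

def shortest(start, goal):
    """Recursion f*(u) = min over arcs (u, v) of [arc length + f*(v)]."""
    memo = {goal: (0, goal)}

    def f(u):
        if u not in memo:
            memo[u] = min((w + f(v)[0], u + f(v)[1]) for v, w in arcs[u].items())
        return memo[u]

    return f(start)

cost, route = shortest('A', 'T')
print(cost, route)
```

Each node is a state; the minimum over outgoing arcs is the stage decision, and memoisation ensures every state is evaluated only once.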


The general linear programming problem is


Maximise z = c1 x1 + c2 x2 + … + cn xn

subject to the constraints


a11x1 + a12x2 + … + a1nxn ≤ b1
a21x1 + a22x2 + … + a2nxn ≤ b2
…   …   …   …
am1x1 + am2x2 + … + amnxn ≤ bm
x1, x2, …, xn ≥ 0.

This problem can be formulated as a dynamic programming problem as follows.

Here, each activity j(j = 1, 2, … n) may be regarded as a stage. The level of activity xj(≥ 0) represents the alternatives (decision variables) at stage j. Since xj is continuous, each stage possesses an infinite number of alternatives within the feasible region.

Since the linear programming problem is an allocation problem, states may be defined as the amounts of resources to be allocated to the current stage and the succeeding stages. Here, we shall use the backward computational procedure. As there are m limited resources, the state must be represented by an m-dimensional vector. Let (β1j, β2j, …, βmj) be the state of the system at stage j, and fj(β1j, β2j, …, βmj) be the optimum value of the objective function for stages (activities) j, j + 1, …, n, for the given state. The recursive equation is

fj(β1j, β2j, …, βmj) = max {cjxj + fj+1(β1j − a1jxj, β2j − a2jxj, …, βmj − amjxj)},    (6.6.1)

the maximum being taken over xj ≥ 0 with aijxj ≤ βij for all i, and with fn+1 ≡ 0.

Here, 0 ≤ βij ≤ bi, for all i and j. The recursive equation (6.6.1) is used to solve an LPP by the dynamic programming approach.

Example 1

Solve the following LPP by dynamic programming approach.


Maximise z = 2x1 + 5x2

subject to


2x1 + x2 ≤ 43
2x2 ≤ 46
x1, x2 ≥ 0.

Solution: Since there are two resources (constraints), the state must be represented by a pair (uj, vj), j = 1, 2. Note that the problem has only two variables, so it has two stages.

Let fj(uj, vj) be the optimal value of the objective function for stage j = 1, 2, given the state (uj, vj).

Using the backward procedure, we have

f2(u2, v2) = max {5x2 : 0 ≤ x2 ≤ u2, 2x2 ≤ v2}.

Since the maximum of x2, subject to 0 ≤ x2 ≤ u2 and x2 ≤ v2/2, is the minimum of u2 and v2/2, we have

f2(u2, v2) = 5 min {u2, v2/2}.

Since it is a two-stage problem, at the first stage u1 = 43, v1 = 46. Thus,

f1(43, 46) = max {2x1 + f2(43 − 2x1, 46)} = max {2x1 + 5 min (43 − 2x1, 23)}

= max {2(10) + 115, 215 − 8 × 10} = 135, which is obtained for x1 = 10. Hence,


f1(u1, v1) = 135, achieved at x1 = 10. Then x2 = min {u2, v2/2} = min {43 − 2(10), 23} = 23.

The optimal solution is therefore given by

maximum z = 135, x1 = 10, x2 = 23.
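The two-stage computation of this example can be checked numerically. The sketch below encodes the stage-2 optimum f2(u2, v2) = 5 min {u2, v2/2} and replaces the analytic stage-1 maximisation with a simple grid search over x1.

```python
def f2(u2, v2):
    # Stage 2: maximise 5*x2 subject to 0 <= x2 <= u2 and 2*x2 <= v2.
    x2 = min(u2, v2 / 2)
    return 5 * x2, x2

def f1(u1, v1, step=0.01):
    # Stage 1: maximise 2*x1 + f2(u1 - 2*x1, v1) over a grid of x1 values.
    best = (-1.0, 0.0, 0.0)
    for i in range(int(u1 / (2 * step)) + 1):   # feasibility needs 2*x1 <= u1
        x1 = i * step
        stage2, x2 = f2(u1 - 2 * x1, v1)
        z = 2 * x1 + stage2
        if z > best[0]:
            best = (z, x1, x2)
    return best

z, x1, x2 = f1(43, 46)
print(round(z, 2), round(x1, 2), round(x2, 2))
```

The grid search recovers z = 135 at x1 = 10, x2 = 23, in agreement with the analysis above.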

Example 2

Solve the following LPP using dynamic programming approach.


Maximise z = 3x1 + 5x2

subject to


x1 ≤ 4
x2 ≤ 6
3x1 + 2x2 ≤ 18
x1, x2 ≥ 0.

Solution: Since we have two decision variables and three resources, the problem has two stages and each state is described by three state variables. Let the state be represented by (uj, vj, wj) for j = 1, 2, and let fj(uj, vj, wj) be the optimal value of the objective function for stage j = 1, 2.

Using the backward procedure, we have


The optimal solution is maximum z = 36, x1 = 2, x2 = 6.
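The same numerical check works with three state variables; again a grid search over x1 stands in for the analytic maximisation.

```python
def f2(v2, w2):
    # Stage 2: maximise 5*x2 subject to x2 <= v2 and 2*x2 <= w2.
    x2 = min(v2, w2 / 2)
    return 5 * x2, x2

def f1(u1, v1, w1, step=0.01):
    # Stage 1: maximise 3*x1 + f2(v1, w1 - 3*x1) over a grid of x1 values.
    best = (-1.0, 0.0, 0.0)
    limit = min(u1, w1 / 3)                     # x1 <= u1 and 3*x1 <= w1
    for i in range(int(limit / step) + 1):
        x1 = i * step
        stage2, x2 = f2(v1, w1 - 3 * x1)
        z = 3 * x1 + stage2
        if z > best[0]:
            best = (z, x1, x2)
    return best

z, x1, x2 = f1(4, 6, 18)
print(round(z, 2), round(x1, 2), round(x2, 2))
```

This recovers z = 36 at x1 = 2, x2 = 6, matching the stated optimum.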

Advantages and Disadvantages of Dynamic Programming

The following are the major advantages and disadvantages of dynamic programming technique:


  1. For some problems in areas such as inventory management, chemical engineering design and control theory, dynamic programming may be the only practicable solution technique.
  2. It is well suited to multi-stage, multi-period or sequential decision processes.
  3. It is applicable to linear or nonlinear problems, discrete or continuous variables, and deterministic or stochastic problems.
  4. It can be implemented on a computer and can also be used to perform sensitivity analysis.


  1. No general formulation of a dynamic programme is available. Each problem has its own design, so a problem can be modelled only through experience and insight.
  2. The number of state variables must be kept very low in order to avoid excessive computational work.
  3. In general, this approach is not efficient compared to other mathematical programming algorithms like the simplex method.

Section 6.1

1. Explain the concept of dynamic programming.

2. State the principle of optimality in your own words.

3. Explain the relationship between stages and states in dynamic programming.

4. If both linear programming and dynamic programming could be used to model and solve a decision problem, which method would you prefer, and why?

5. What are the essential characteristics of dynamic programming?

6. Set up the recursive relation, using the dynamic programming approach, when an n-stage objective function is to be maximised.

7. What sort of problems can be solved using dynamic programming?

Section 6.2

8. Use dynamic programming to find the maximum of z = y1y2y3

subject to the constraints


y1 + y2 + y3 = 5
y1, y2, y3 ≥ 0.


[Answer: y1 = y2 = y3 = 5/3, with maximum z = 125/27.]

9. Obtain the functional equation for maximising


z = g1 (x1) + g2 (x2) + … + gn(xn)

subject to x1 + x2 + … + xn = c, xj ≥ 0, j = 1, 2, … n

[Answer: fn(c) = max {gn(xn) + fn−1(c − xn)}, 0 ≤ xn ≤ c, with f1(c) = g1(c).]

10. Using dynamic programming, minimise z = uvw

subject to u + v + w = 5, u, v, w ≥ 0

[Answer: u = 0, v =0, w = 5, and min z = 0.]

11. Maximise subject to the constraints x1 + 2x2 + x3 ≤ 8, x1, x2, x3 ≥ 0.

[Answer: x1 = x2 = x3 = 2, maximum z = 20.]

12. Find the maximum value of . subject to x1 + 2x2 + x3 ≤ 8, x1, x2, x3 ≥ 0.

13. A manufacturer of golf balls is trying to determine the most effective advertising and promotional strategy for the introduction of its new line of golf balls. The company has budgeted 500000 for advertising. Listed below is the estimate of the potential sales revenue per 100000 spent.

All expenditures should be in increments of 100000. Determine the amount to be spent on the media.

[Answer: The optimal strategy is to allocate 300000 to direct mailing, 100000 each to TV and magazine advertisements, and nothing to radio.]

14. An oil company has 8 units of money available for exploration of three sites. If oil is present at a site, the probability of finding it depends upon the amount allocated for exploiting the site as given below.

Units of money allocated

The probability that oil exists at the sites 1, 2 and 3 is 0.4, 0.3, 0.2, respectively. Find the optimal allocation of money.

[Answer: All the 8 units of money are to be allocated to site 1, and the maximum probability of obtaining oil is 40%.]

15. A drug manufacturing concern has ten medical representatives working in three sales areas. The profitability of each representative in the three sales areas is as follows:

Determine the optimum allocation of the medical representatives in order to maximise the profits. What will be the optimum allocation if the number of representatives available at present is only six?

[Answer: (3, 6, 1): 2,46,000; (4, 1, 1): 1,18,000.]

16. A company has three media A, B, C available for advertising its product. The data collected over the past years about the relationship between sales and the frequency of advertisement in the different media is as follows:

The cost of advertisement is 5000 in medium A, 10000 in medium B and 20000 in medium C. The total budget allocated for advertising the product is 40000. Determine the optimal combination of the advertising media and the frequency of advertisement.


[Answer: once in C, once in B, twice in A.]

17. A manufacturer has entered into a contract for the supply of the following number of units of a product at the end of each month.

The units manufactured during a month are available for supply at the end of the month, or they may be kept in storage at a cost of 2 per unit per month. Each time the manufacture of a batch of units is undertaken, there is a set-up cost of 400. Determine the production schedule which will minimise the total cost.

[Answer: 15 units in Jan, 59 units in August with minimum z = 1098.]

18. Explain the capital budgeting problem with one example. Also, explain how a contractor determines the size of a labour force during a certain period from the dynamic programming standpoint.

19. Maximise the hydroelectric power P(s), s = (s1, s2, s3), produced by building dams on three different given basins, where


P(s) = f1(s1) + f2(s2) + f3(s3)

and fi(si) is the power generated from the ith basin by investing si. The total budget provision is s1 + s2 + s3 ≤ 3. The functions f1, f2, f3 are given in the following table. An integer solution of the problem is required.

Section 6.3

20. A publishing company has 50,00,000 to allocate among its three divisions, i = 1 "Natural Sciences", i = 2 "Social Sciences" and i = 3 "Environmental Sciences", for expansion. Each division has submitted 4 proposals containing the cost ci of the expansion and the expected revenue ri, as given in the table.

Find the optimal allocation, and optimal revenue.

Section 6.4

21. An electronic device consists of four components, each of which must function for the system to function. The system reliability can be improved by installing parallel units in one or more components. The reliability (R) of a component with one, two, or three parallel units and the corresponding cost (c) are given below. The maximum amount available for this device is 100. The problem is to determine the number of parallel units in each component.

[Answer: The device should have 1, 2, 1, 3 units in components 1, 2, 3 and 4, respectively with maximum reliability of 0.308.]

Section 6.5

22. A salesman has to travel between A and B as indicated by the network shown in the figure:

Find the minimum distance travelled by salesman from A to B.

[Answer: AIFGB with minimum cost 15.]

23. A company has to transport some goods from city A to city J. The cost of transportation between the different cities is given in the following network. Find the optimal route connecting cities A and J.

[Answer: ACEHJ]

Section 6.6

24. Solve the following LPP by dynamic programming

  1. Maximise z = 2x1 + 3x2
    subject to


    x1 − x2 ≤ 1
    x1 + x2 ≤ 3
    x1, x2 ≥ 0

  2. Maximise z = x1 + 3x2 + 4x3
    subject to


    2x1 + 4x2 + 3x3 ≥ 60
    3x1 + 2x2 + x3 ≥ 60
    2x1 + x2 + 3x3 ≥ 90
    x1, x2, x3 ≥ 0

[Answer: (i) x1 = 0, x2 = 3, max z = 9.]

25. Solve the following LPP by dynamic programming.

  1. Maximise z = 8x1 + 7x2
    subject to


    2x1 − x2 ≤ 8
    5x1 + 2x2 ≤ 15
    x1, x2 ≥ 0

    [Answer: x1 = 0, x2 = 7.5, maximum z = 52.5.]

  2. Maximise z = 3x1 + 7x2
    subject to


    x1 + 4x2 ≤ 8
    x2 ≤ 2
    x1, x2 ≥ 0

    [Answer: x1 = 8, x2 = 0, maximum z = 24.]

26. Solve the following LPP by dynamic programming

Maximise z = 2x1 + 5x2

subject to


2x1 + x2 ≤ 430
2x2 ≤ 460
x1, x2 ≥ 0.

[Answer: x1 = 100, x2 = 230, maximum z = 1350.]

27. Solve the following LPP by dynamic programming

Maximise z = 50x1 + 100x2

subject to


2x1 + 3x2 ≤ 48
x1 + 3x2 ≤ 42
x1 + x2 ≤ 21
x1, x2 ≥ 0

[Answer: x1 = 6, x2 = 12, maximum z = 1500.]