Dynamic programming is a useful mathematical technique for making a sequence of inter-related decisions. It provides a systematic procedure for determining the optimal combination of decisions, given the constraints at hand. In contrast to linear programming, there is no standard formulation for the mathematical problem. Hence each problem requires analysis and modelling for itself. A large number of algorithms are also available for this field of problems.(1)
The example above, taken from Hillier & Lieberman’s “Introduction to Operational Research”, describe the Program Evaluation and Review Technique (2) the interdependent tasks A, B, C, D, and E. Each task has an estimated time of completion -3, 4, 5, 3, and 4 respectively-. Unless process C finishes, task E cannot begin to execute.
This process is commonly used in project management scenarios, when inter-related events need to have their predecessors completed before taking up a certain task.
The purpose of this scenario is to find the expected time of completion for this project. The uncertainty comes from the fact that each of the times of completion are estimations. Therefore the scenario needs to be run multiple times and then the expected completion time to be calculated. The distribution of the completion times will be outputted as the result.
There are three milestones to this project:
- Parse an ASCII data file and construct a Graph to represent the problem.
- Find the critical path through the graph.
- Calculate the experimental distribution of the total project duration, given distributions for the completion time of each sub-task.
A. Task 1
The files will have the following syntactic and semantic structure:
0 0 0 0 A 1 2 3 0 B 2 3.5 8 0 A C 6 9 18 0 B D 4 5.5 10 0 C E 1 4.5 5 0 C F 4 4 10 0 E G 5 6.5 11 0 D H 5 8 17 0 E G I 3 7.5 9 0 C J 3 9 9 0 F I K 4 4 4 0 J L 1 5.5 7 0 J M 1 2 3 0 H N 5 5.5 9 0 K L Z 0 0 0 A B C D E F G H I J K L M N
The first column is the [name] of the milestone.
The second column is the [opt]imistic estimate of duration.
The third column is the most [likely] estimate of duration.
The fourth column is the [pess]imistic estimate of duration.
From then on the remaining characters indicate preceding milestones that this milestone [depends] on.
Once all the lines have been read, they’ll be used to construct the graph of the project. This graph will use the methods in the Graph class of the code.
From the time of completion selection a random variate will be selected based on the likely, best and worst case scenarios.
The duration of each activity will be a Normally Distributed Random Variate (NDRV) with mean μ=(o+4m+p)/6 and variance σ^2=(p-o)2/36, where o, m, p the given optimistic, likely, and pessimistic times of completion.
JAVA’s Random class was used to generate Normally distributed random numbers N(0,1). This value was then multiplied by the variance calculated above for the times of completion and subsequently the newly calculated mean was added to this value to produce an NDRV(μ, σ^2).
B. Task 2
For the critical path method, different algorithms were tried. Dijkstra’s algorithm had to be modified to work for the critical path. Since Dijkstra’s algorithm works efficiently to find the minimum cost path, the time of completions had to be inverted in order to look for the lowest inverted time. This didn’t produce optimal solution, plus had additional problems with potential 0 time of completion or negative times.
The Backwards propagation algorithm could not handle a terminal node with all edges having 0 time of completion and thus couldn’t start at all.
The forwards propagation algorithm couldn’t handle large costs deeper in the graph and thus did not produce an optimal solution. It did however produce feasible solutions.
The algorithm finally utilised was the Backwards propagating Bellman-Ford(BF) algorithm. The Bellman–Ford algorithm computes single-source shortest paths in a weighted digraph. Bellman–Ford runs in O(|V|·|E|) time, where |V| and |E| are the number of vertices and edges respectively. (3) An added benefit of the BF algorithm is that it can identify negative cycles and break the execution.
C. Task 3
The mean, variance, and distribution of the simulation will be outputted on the terminal and a small GUI showing the histogram. Normally to calculate the variance of the data set, two passes of the data would be required. One to calculate the expected value of the series and a second to calculate the residuals of the values to the expected value.
var = Sum(Xi-Xmu)^2)/N , for all i belongs to N
However by decomposing the above formula, it is possible to do this with one run, to minimise the time of execution:
var = (Sum(Xi))^2/N – (Sum/N)^2 , for all i belongs to N
The bin size is set at 1 for the min-max range of (30,59), although by calculating the range of the series the following formula can be used to determine the optimal bin size (W):
W=3.49xSDxN^(-1/3), where SD is the sample standard deviation and N the available data point. (4)
The final outcome indicates that for the case102.dat (included in the repo) the statistical output of the simulation and the logarithmic histogram are:
Experiment's mean value for the critical length paths: 43.41049912306386 Experiment's variance value for the critical length paths: 12.700112769478892 Critical Path: 0 - E - O - P - Q - Z Bin Prob LogHist Count 30 0% | 0/10000 31 0% | 0/10000 32 0% |@ 5/10000 33 0.2% |@@@ 22/10000 34 0.2% |@@@ 25/10000 35 0.8% |@@@@ 83/10000 36 1.6% |@@@@@ 155/10000 37 2.8% |@@@@@ 279/10000 38 4.4% |@@@@@@ 440/10000 39 6.3% |@@@@@@ 626/10000 40 8.4% |@@@@@@ 840/10000 41 9.8% |@@@@@@ 983/10000 42 11.6% |@@@@@@@ 1164/10000 43 11.2% |@@@@@@@ 1120/10000 44 10.3% |@@@@@@ 1028/10000 45 9.2% |@@@@@@ 920/10000 46 7.7% |@@@@@@ 774/10000 47 5.7% |@@@@@@ 570/10000 48 3.8% |@@@@@ 375/10000 49 2.6% |@@@@@ 257/10000 50 1.6% |@@@@@ 162/10000 51 0.9% |@@@@ 93/10000 52 0.4% |@@@ 44/10000 53 0.2% |@@@ 22/10000 54 0.1% |@ 6/10000 55 0% |@ 3/10000 56 0% | 1/10000 57 0% | 0/10000 58 0% | 0/10000 59 0% | 0/10000
This histogram exhibits the normal distribution expected from the simulation with the mean and distribution provided by the o, m , p times of the milestones.
This project was awarded 72/100 marks.
As usual the source for this project can be checked out and forked at https://github.com/cmdel/javaPERT.
- Scott, 1979. On optimal and data-based histograms. Biometrika, 66:605-610.