The Simplex Algorithm for Solving Linear Programming Problems

In 1947, Dantzig proposed a method for solving linear programming problems, now known as the simplex method. It is a concise and efficient algorithm, hailed as one of the ten algorithms with the greatest impact on scientific development and engineering practice in the 20th century. As mentioned earlier, the optimal solution of a linear programming problem must be a basic feasible solution. The idea of the simplex method is to compute different basic feasible solutions under different basis vectors, and then find the optimal solution. From a geometric perspective, this is the process of moving from one extreme point to another until the optimal extreme point is found. In that case, the algorithm can be divided into three subproblems: 1. How to move from one basic feasible solution to another. 2. How to determine which extreme point to move to. 3. When to stop the moving operation, i.e., how to determine whether the current basic feasible solution is optimal.

Moving Between Extreme Points

The standard form of a linear program is

minimizecTxsubject toAx=bx0\begin{align*} minimize\quad c^{T}x\\ subject\ to\quad Ax=b\\ x\ge0 \end{align*}

Consider the equation Ax=bAx=b, expanded into the following canonical system of equations

x1+y1 m+1xm+1+...+y1nxn=y10x2+y2 m+1xm+1+...+y2nxn=y20xm+ym m+1xm+1+...+ymnxn=ym0\begin{align*} x_{1}+y_{1\ m+1}x_{m+1}+...+y_{1n}x_{n}=y_{10}\\ x_{2}+y_{2\ m+1}x_{m+1}+...+y_{2n}x_{n}=y_{20}\\ \vdots\\ x_{m}+y_{m\ m+1}x_{m+1}+...+y_{mn}x_{n}=y_{m0} \end{align*}

This system can be converted into [Im,Ym,nm]x=y0[I_{m},Y_{m,n-m}]x=y_{0}. A system of equations Ax=bAx=b in this form is called the canonical form. The canonical form of a system has the same solutions as the original system. In the canonical expression, the variables corresponding to the basis column vectors are the basic variables, and the others are the nonbasic variables. That is, in the equation [Im,Ym,nm]x=y0[I_{m},Y_{m,n-m}]x=y_{0}, x1,x2...,xmx_{1},x_{2}...,x_{m} are the basic variables and the rest are nonbasic variables. Consider the canonical augmented matrix [Im,Ym,nm,y0][I_{m},Y_{m,n-m},y_{0}]; the elements of its last column are the coordinates of the vector bb with respect to the basis {a1,...,ama_{1},...,a_{m}}.

Now consider updating the augmented matrix, i.e., replacing some basic variable with some nonbasic variable and finding the canonical expression corresponding to the new basis. For example, replace the basic variable ap,1pma_{p},1\le p\le m with the nonbasic variable aq,m<qna_{q},m<q\le n. In the original matrix, aqa_{q} can be expressed as

aq=i=1myiqai=i=1ipmyiqai+ypqapa_{q}=\sum_{i=1}^{m}y_{iq}a_{i}=\sum_{i=1\\i\ne p}^{m}y_{iq}a_{i}+y_{pq}a_{p}

Note that the vector set {a1,...,ap1,aq,ap+1,...,ama_{1},...,a_{p-1},a_{q},a_{p+1},...,a_{m}} is linearly independent and can form a basis only when ypq0y_{pq}\ne 0. From the above equation we can solve for

ap=1ypqaqi=1ipmyiqypqaia_{p}=\frac{1}{y_{pq}}a_{q}-\sum_{i=1\\i\ne p}^{m}\frac{y_{iq}}{y_{pq}}a_{i}

Using the original augmented matrix, any vector aj(m<jn)a_{j}(m<j\le n) can be expressed as

aj=y1ja1+y2ja2+...+ymjama_{j}=y_{1j}a_{1}+y_{2j}a_{2}+...+y_{mj}a_{m}

Substituting the expression for apa_{p} yields

aj=i=1ipm(yijypjypqyiq)ai+ypjypqaqa_{j}=\sum_{i=1\\i\ne p}^{m}(y_{ij}-\frac {y_{pj}}{y_{pq}}y_{iq})a_{i}+\frac{y_{pj}}{y_{pq}}a_{q}

Letting yijy_{ij}^{'} denote the elements of the new augmented matrix, the above gives

yij=yijypjypqyiq,ipypj=ypjypq\begin{align*} y_{ij}^{'}=y_{ij}-\frac{y_{pj}}{y_{pq}}y_{iq},i\ne p\\ y_{pj}^{'}=\frac{y_{pj}}{y_{pq}} \end{align*}

The transformation of the matrix using the above formulas is called the pivot transformation about the element (p,q)(p,q). The above gives the formula for moving from one extreme point to another.

Determining the Extreme Point to Move To

Determining the extreme point to move to means transforming the vector set into another coordinate system. In detail, this means taking one of the mm vectors out of the vector set and selecting one vector from the other nmn-m vectors to add to the basis vector set, forming a new basis. We denote this as taking out the vector apa_{p} (leaving the basis) and adding the vector aqa_{q} (entering the basis). This problem can again be split into two parts, namely determining pp and qq separately. For determining qq, first look at the pivot transformation formula below

yij=yijypjypqyiq,ipypj=ypjypq\begin{align*} y_{ij}^{'}=y_{ij}-\frac{y_{pj}}{y_{pq}}y_{iq},i\ne p\\ y_{pj}^{'}=\frac{y_{pj}}{y_{pq}} \end{align*}

Now consider another way to turn a nonbasic variable aqa_{q} into a basis vector. Express the vector aqa_{q} using the current basis vectors

aq=y1qa1+y2qa2+...+ymqama_{q}=y_{1q}a_{1}+y_{2q}a_{2}+...+y_{mq}a_{m}

Multiplying both sides by ϵ>0\epsilon>0 yields

ϵaq=ϵy1qa1+ϵy2qa2+...+ϵymqam\epsilon a_{q}=\epsilon y_{1q}a_{1}+\epsilon y_{2q}a_{2}+...+\epsilon y_{mq}a_{m}

Substituting the basic solution x=[y10,...,ym0,0,...,0]Tx=[y_{10},...,y_{m0},0,...,0]^T into the equation Ax=bAx=b yields

y10a1+...+ym0am=by_{10}a_{1}+...+y_{m0}a_{m}=b

Combining the two equations above gives

(y10ϵy1q)a1+(y20ϵy2q)a2+...+(ym0ϵymq)am+ϵaq=b(y_{10}-\epsilon y_{1q})a_{1}+(y_{20}-\epsilon y_{2q})a_{2}+...+(y_{m0}-\epsilon y_{mq})a_{m}+\epsilon a_{q}=b

That is, the vector [y10ϵy1q,...,ym0ϵymq,0,...,ϵ,...,0][y_{10}-\epsilon y_{1q},...,y_{m0}-\epsilon y_{mq},0,...,\epsilon,...,0] is a solution of the equation Ax=bAx=b. As ϵ\epsilon continually increases, when a 0 first appears among the first mm elements of the vector, we obtain a basic feasible solution. The objective function value corresponding to the transformed basic feasible solution is

z=c1(y10y1qϵ)+...+cm(ym0ymqϵ)+cqϵ=z0+[cq(c1y1q+...+cmymq)]ϵ\begin{align*} z=c_{1}(y_{10}-y_{1q}\epsilon)+...+c_{m}(y_{m0}-y_{mq}\epsilon)+c_{q}\epsilon\\ =z_{0}+[c_{q}-(c_{1}y_{1q}+...+c_{m}y_{mq})]\epsilon \end{align*}

where z0=c1y10+...+cmym0z_{0}=c_{1}y_{10}+...+c_{m}y_{m0}. Letting zq=c1y1q+...+cmymqz_{q}=c_{1}y_{1q}+...+c_{m}y_{mq}, we obtain z=z0+(cqzq)ϵz=z_{0}+(c_{q}-z_{q})\epsilon. When zz0=(cqzq)ϵ<0z-z_{0}=(c_{q}-z_{q})\epsilon<0, the objective function value corresponding to the basic feasible solution has decreased, meaning that adding this vector to the basis vector set is beneficial.

So we can compute cqzqc_{q}-z_{q} for each q,m<qnq,m<q\le n. Letting ri=0(i=1,...,m),ri=cizi(i=m+1,...,n)r_{i}=0(i=1,...,m),r_{i}=c_{i}-z_{i}(i=m+1,...,n), we call rir_{i} the ii-th reduced cost coefficient, also known as the reduced cost (testing number). It can be proven that a basic feasible solution is optimal if and only if all corresponding reduced costs are nonnegative.

That is, we can determine whether the current solution is optimal by computing the reduced costs. When there are negative numbers among the reduced costs, the current solution is not optimal, and the objective function value can be decreased by adding the qq-th vector to the basis vector set. If there are multiple negative numbers, we choose the qq-th vector with the smallest reduced cost to add to the basis vector set.

The above determined the method for choosing qq. Now let us introduce how to determine the leaving vector apa_{p}. As shown above, for the vector [y10ϵy1q,...,ym0ϵymq,0,...,ϵ,...,0][y_{10}-\epsilon y_{1q},...,y_{m0}-\epsilon y_{mq},0,...,\epsilon,...,0], as ϵ\epsilon continually increases, among the first mm elements the ii-th element yi0ϵyiqy_{i0}-\epsilon y_{iq} will be the first to equal 0. Then we choose aia_{i} as the leaving vector apa_{p}, i.e., p=arg mini{yi0/yiq:yiq>0}p=arg\ min_{i}\{y_{i0}/y_{iq}:y_{iq}>0\}. Note that if no yiq>0y_{iq}>0 exists, then the problem has an unbounded solution, so the computation stops. Additionally, if multiple 0 elements appear simultaneously, then a degenerate basic feasible solution is obtained; in this case we choose the smallest value of pp.

Determining the Stopping Condition

The above introduced how to choose the extreme point to move to. When determining qq, we compute the reduced costs; if all reduced costs are nonnegative, then the current basic feasible solution is optimal and the computation stops. Moreover, when determining pp, if as ϵ\epsilon continually increases the first mm elements of the vector also increase, i.e., yiq<0y_{iq}<0 for 0<im0<i\le m, then the problem has an unbounded solution, and the computation also stops.

The Simplex Algorithm

The above introduced the methods for handling the several subproblems of the simplex algorithm. Now let us summarize the simplex algorithm. 1. Construct the canonical augmented matrix from the initial basic feasible solution. 2. Compute the reduced costs of the nonbasic variables. 3. If rj0r_{j}\le 0 for all jj, then stop the computation; the current basic feasible solution is optimal. Otherwise, proceed to the next step. 4. Among the reduced costs less than 0, choose the qq with the smallest reduced cost. 5. If no yiq>0y_{iq}>0 exists, then stop the computation; the problem has an unbounded solution. Otherwise, compute p=arg mini{yi0/yiq:yiq>0}p=arg\ min_{i}\{y_{i0}/y_{iq}:y_{iq>0}\}. If multiple indices ii satisfying the condition are obtained, let pp be the smallest index value. 6. Using the element (p,q)(p,q) as the pivot element, perform the pivot transformation and update the canonical augmented matrix. 7. Go to step 2.

In addition, the simplex algorithm also has a matrix expression. To address the difficulties encountered in finding an initial basic feasible solution, the two-phase simplex method was proposed; to reduce the amount of computation, the revised simplex method was proposed; and so on.

To be continued…