Originally Conjugate Gradient method was developed for solving certain systems of linear equations  where  is a known vector,  is unknown and  is a square, symmetric, positive-definite matrix. Solving this equation is equivalent to finding minimum of a corresponding square function  This method is very similar to Gradient Descent in the sense that from the initial point  we choose a vector  which will take us to the next point, where the objective function will decrease. The above is repeated until we reach function minimum, i.e. gradient becomes zero.

Unlike Gradient Descent the direction  is not set to  , is chosen such that the next direction  is conjugate to  with respect to Hessian matrix.  A few approaches exist for calculating  . The most popular one are ones by Polak-Ribiere(1) and Fletcher-Reeves(2)

1.  2.  So the algorithm looks like this:

1. Initialize  to a random value
2. Initialize direction  3. Perform a line search:  4. Move to the next point:  5. Calculate gradient in next point:  6. Calculate new direction:  7. If  is not small enough go to 3

Line search on step 3 can be replaced with an explicit solution, in cases where we can calculate Hessian matrix  of our objective function. In this case we will get  