gradient descent using python and numpy

Question

def gradient X norm y theta alpha m n num it       temp np array np zeros like theta float       for i in range 0 num it           h np dot X norm theta           temp j  theta j - alpha m     np sum   h-y  X norm   j  np newaxis                 temp 0  theta 0 - alpha m   np sum h-y           temp 1  theta 1 - alpha m   np sum  h-y  X norm   1            theta temp     return theta    X norm mean std featureScale X   length of X  number of rows  m len X  X norm np array  np ones m  X norm   n m np shape X norm  num it 1500 alpha 0 01 theta np zeros n float    np newaxis  X norm X norm transpose   theta gradient X norm y theta alpha m n num it  print theta   My theta from the above code is 100 2 100 2  but it should be 100 2 61 09 in matlab which is correct

User · Answer

I know this question already have been answer but I have made some update to the GD function :

  ### COST FUNCTION

def cost(theta,X,y):
     ### Evaluate half MSE (Mean square error)
     m = len(y)
     error = np.dot(X,theta) - y
     J = np.sum(error ** 2)/(2*m)
     return J

 cost(theta,X,y)



def GD(X,y,theta,alpha):

    cost_histo = [0]
    theta_histo = [0]

    # an arbitrary gradient, to pass the initial while() check
    delta = [np.repeat(1,len(X))]
    # Initial theta
    old_cost = cost(theta,X,y)

    while (np.max(np.abs(delta)) > 1e-6):
        error = np.dot(X,theta) - y
        delta = np.dot(np.transpose(X),error)/len(y)
        trial_theta = theta - alpha * delta
        trial_cost = cost(trial_theta,X,y)
        while (trial_cost >= old_cost):
            trial_theta = (theta +trial_theta)/2
            trial_cost = cost(trial_theta,X,y)
            cost_histo = cost_histo + trial_cost
            theta_histo = theta_histo +  trial_theta
        old_cost = trial_cost
        theta = trial_theta
    Intercept = theta[0] 
    Slope = theta[1]  
    return [Intercept,Slope]

res = GD(X,y,theta,alpha)

This function reduce the alpha over the iteration making the function too converge faster see Estimating linear regression with Gradient Descent (Steepest Descent) for an example in R. I apply the same logic but in Python.

User · Answer

I think your code is a bit too complicated and it needs more structure  because otherwise you ll be lost in all equations and operations  In the end this regression boils down to four operations    Calculate the hypothesis h   X   theta Calculate the loss   h - y and maybe the squared cost  loss 2  2m Calculate the gradient   X    loss   m Update the parameters theta   theta - alpha   gradient   In your case  I guess you have confused m with n  Here m denotes the number of examples in your training set  not the number of features   Let s have a look at my variation of your code   import numpy as np import random    m denotes the number of examples here  not the number of features def gradientDescent x  y  theta  alpha  m  numIterations       xTrans   x transpose       for i in range 0  numIterations           hypothesis   np dot x  theta          loss   hypothesis - y           avg cost per example  the 2 in 2 m doesn t really matter here            But to be consistent with the gradient  I include it          cost   np sum loss    2     2   m          print  Iteration  d   Cost   f     i  cost             avg gradient per example         gradient   np dot xTrans  loss    m           update         theta   theta - alpha   gradient     return theta   def genData numPoints  bias  variance       x   np zeros shape  numPoints  2       y   np zeros shape numPoints        basically a straight line     for i in range 0  numPoints             bias feature         x i  0    1         x i  1    i           our target variable         y i     i   bias    random uniform 0  1    variance     return x  y    gen 100 points with a bias of 25 and 10 variance as a bit of noise x  y   genData 100  25  10  m  n   np shape x  numIterations  100000 alpha   0 0005 theta   np ones n  theta   gradientDescent x  y  theta  alpha  m  numIterations  print theta    At first I create a small random dataset which should look like this     As you can see I also added the generated regression line and formula that was calculated by excel   You need to take care about the intuition of the regression using gradient descent  As you do a complete batch pass over your data X  you need to reduce the m-losses of every example to a single weight update  In this case  this is the average of the sum over the gradients  thus the division by m    The next thing you need to take care about is to track the convergence and adjust the learning rate  For that matter you should always track your cost every iteration  maybe even plot it   If you run my example  the theta returned will look like this   Iteration 99997   Cost  47883 706462 Iteration 99998   Cost  47883 706462 Iteration 99999   Cost  47883 706462   29 25567368   1 01108458    Which is actually quite close to the equation that was calculated by excel  y   x   30   Note that as we passed the bias into the first column  the first theta value denotes the bias weight

User · Answer

Below you can find my implementation of gradient descent for linear regression problem    At first  you calculate gradient like X T    X   w - y    N and update your current theta with this gradient simultaneously     X  feature matrix  y  target values  w  weights values  N  size of training set   Here is the python code   import pandas as pd import numpy as np from matplotlib import pyplot as plt import random  def generateSample N  variance 100       X   np matrix range N   T   1     Y   np matrix  random random     variance   i   10   900 for i in range len X     T     return X  Y  def fitModel gradient x  y       N   len x      w   np zeros  x shape 1   1       eta   0 0001      maxIteration   100000     for i in range maxIteration           error   x   w - y         gradient   x T   error   N         w   w - eta   gradient     return w  def plotModel x  y  w       plt plot x   1   y   x       plt plot x   1   x   w   r-       plt show    def test N  variance  modelFunction       X  Y   generateSample N  variance      X   np hstack  np matrix np ones len X    T  X       w   modelFunction X  Y      plotModel X  Y  w    test 50  600  fitModel gradient  test 50  1000  fitModel gradient  test 100  200  fitModel gradient

User · Answer

Following  thomas-jungblut implementation in python  i did the same for Octave  If you find something wrong please let me know and i will fix update   Data comes from a txt file with the following rows   1 10 1000 2 20 2500 3 25 3500 4 40 5500 5 60 6200   think about it as a very rough sample for features  number of bedrooms   mts2  and last column  rent price  which is what we want to predict   Here is the Octave implementation       Linear Regression with multiple variables      Alpha for learning curve alphaNum   0 0005     Number of features n   2     Number of iterations for Gradient Descent algorithm iterations   10000                                               No need to update after here                                             DATA   load  CHANGE WITH DATA FILE PATH       Initial theta values theta   ones n   1  1      Number of training samples m   length DATA    1       X with one mor column  x0 filled with  1 s  X   ones m  1   for i   1 n   X    X  DATA   i    endfor    Expected data must go always in the last column   y   DATA    n   1   function gradientDescent x  y  theta  alphaNum  iterations    iterations         costs          m   length y      for iteration   1 10000     hypothesis   x   theta       loss   hypothesis - y         J theta          cost   sum loss  2     2   m          Save for the graphic to see if the algorithm did work     iterations    iterations  iteration       costs    costs  cost        gradient    x    loss    m     m is for the average      theta   theta -  alphaNum   gradient     endfor          Show final theta values   display theta       Show J theta  graphic evolution to check it worked  tendency must be zero   plot iterations  costs    endfunction    Execute gradient descent gradientDescent X  y  theta  alphaNum  iterations

[python] gradient descent using python and numpy

Examples related to python

Examples related to numpy

Examples related to machine-learning

Examples related to linear-regression

Examples related to gradient-descent