gradient descent negative log likelihood
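
One concrete reading of this phrase is fitting a model by repeatedly stepping against the gradient of its negative log likelihood (NLL). The sketch below is a minimal illustration under stated assumptions, not a reference implementation: it assumes a one-feature logistic (Bernoulli) model, and the names `sigmoid`, `nll`, and `gradient_descent`, the toy data, the learning rate, and the step count are all made up for the example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll(w, b, xs, ys):
    # Mean negative log likelihood of binary labels under p = sigmoid(w*x + b).
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(xs)


def gradient_descent(xs, ys, lr=0.1, steps=500):
    # Plain (full-batch) gradient descent on the mean NLL.
    w, b = 0.0, 0.0
    n = len(xs)
    history = [nll(w, b, xs, ys)]
    for _ in range(steps):
        # For the logistic model, d(NLL)/dw = mean((p - y) * x)
        # and d(NLL)/db = mean(p - y).
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
        history.append(nll(w, b, xs, ys))
    return w, b, history


# Hypothetical toy data, deliberately not linearly separable so the
# optimum is finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

w, b, history = gradient_descent(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, NLL {history[0]:.4f} -> {history[-1]:.4f}")
```

With both parameters initialised to zero, every predicted probability is 0.5, so the starting NLL is log 2. Because this NLL is convex in `w` and `b`, a sufficiently small fixed step size should decrease it at every iteration. Other likelihoods (Gaussian, categorical, and so on) follow the same pattern with a different `nll` and gradient.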