Hello all. In the previous part we learned the perceptron trick for solving logistic regression, and we saw that the perceptron trick was able to classify all the points. We also observed that the original logistic regression performed better than the perceptron trick, and we discussed why the perceptron trick does not converge to the best line: the problem lies in the algorithm itself. So we need to modify the algorithm, and this is where the sigmoid function comes in with a neat solution, which we will discuss in today's tutorial.
Modifications in the algorithm
In the perceptron trick, a misclassified point pulls the separating line towards itself, and a correctly classified point changes nothing. We will now change the algorithm so that a correctly classified point pushes the line away, while a misclassified point still pulls the line towards itself. When the line is pushed by both classes in opposite directions, it converges in a symmetric way.
How strongly a point pushes or pulls the line depends on a magnitude, and that magnitude is decided by how far the point is from the line.
For example, you can observe in the above plot that point one pushes the line with a greater magnitude, while point two pushes the line with a much smaller magnitude.
What changes do we need to implement the modified algorithm?
In the perceptron trick algorithm, the update equation has the form
Wn = Wo + learning_rate * (Yi - Yi_hat) * Xi
where Yi is the actual label and Yi_hat is the predicted label. When a prediction was correct, the new weight was equal to the old weight, because the difference between the actual and predicted values was zero. Now, when the prediction is correct, we want to push the line, and for this we need to stop the difference from becoming zero. We cannot change the actual data; the only thing we can change is the predicted value, so that the difference between the two is not zero. The difference was becoming zero because we were using a step function, which returns only zero or one. To prevent this, we introduce a new function: the sigmoid function.
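To make the problem concrete, here is a minimal sketch of a single perceptron-trick update with a step function. The names `step` and `perceptron_update`, the example weights, and the example point are all hypothetical and chosen only for illustration; the leading 1.0 in the point is assumed to be the bias input.

```python
def step(z):
    """Step activation: returns 1 for z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0


def perceptron_update(weights, x, y, learning_rate=0.1):
    """One update of the form Wn = Wo + learning_rate * (y - y_hat) * Xi."""
    z = sum(w * v for w, v in zip(weights, x))
    y_hat = step(z)
    return [w + learning_rate * (y - y_hat) * v for w, v in zip(weights, x)]


# Hypothetical weights [bias, w1, w2] and one point [1.0, x1, x2].
weights = [0.5, 1.0, -1.0]
point = [1.0, 2.0, 1.0]  # z = 0.5 + 2.0 - 1.0 = 1.5, so the step predicts 1

# Correct prediction (actual label 1): the difference is 0, nothing moves.
unchanged = perceptron_update(weights, point, y=1)

# Wrong prediction (actual label 0): the difference is -1, the weights move.
changed = perceptron_update(weights, point, y=0)

print(unchanged)
print(changed)
```

The first call leaves the weights exactly as they were, which is the behavior we want to get rid of.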
Sigmoid Function
The sigmoid function is very popular in machine learning and deep learning, and I hope you have heard of it somewhere. We will learn about the sigmoid function and, by implementing it, understand how the behavior of the above equation changes. Its most important and most useful feature is that it squashes any real number into the range 0 to 1.
The sigmoid function takes an input z and gives an output between 0 and 1:
sigmoid(z) = 1 / (1 + e^(-z))
In our algorithm, we will replace the step function with the sigmoid function. If the input is negative, the output is less than 0.5; if the input is 0, the output is exactly 0.5; and if the input is positive, the output is greater than 0.5.
The advantage we get from the sigmoid function is this: earlier, the difference between the actual and predicted values became zero for a correct prediction, but the sigmoid predicts a continuous value between zero and one, so the difference will never be zero.
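As a quick check of these three properties, here is a small sketch of the sigmoid in plain Python. The two-branch form is my own choice, used only to avoid overflow for large negative inputs; it computes the same value as 1 / (1 + e^(-z)).

```python
import math


def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z)
    # so that math.exp never receives a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-4.0, 0.0, 4.0):
    print(z, sigmoid(z))
```

A negative input gives a value below 0.5, zero gives exactly 0.5, and a positive input gives a value above 0.5.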
Let us see, case by case, how the sigmoid function works.
Case-1) Positive point correctly classified
Wn = Wo + learning_rate * 0.2 * Xi
We can see that it will push the line downwards with a small magnitude.
Case-2) Negative point wrongly classified
Wn = Wo - learning_rate * 0.65 * Xi
It will pull the line towards itself (upwards) with a greater magnitude.
Case-3) Positive point wrongly classified
Wn = Wo + learning_rate * 0.7 * Xi
It will pull the line towards itself (downwards).
Case-4) Negative point correctly classified
Wn = Wo - learning_rate * 0.15 * Xi
It will push the line upwards with a small magnitude, because the point is far from the line.
After performing all these updates, the line will converge in a better way.
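The four cases can be reproduced with a few lines of Python. This is only a sketch: the sigmoid outputs 0.80, 0.65, 0.30 and 0.15 are not stated in the text, they are the values I am assuming because they give the factors 0.2, 0.65, 0.7 and 0.15 used in the equations of the four cases. Labels are taken as 1 for the positive class and 0 for the negative class.

```python
# (description, actual label, assumed sigmoid output)
cases = [
    ("Positive point correctly classified", 1, 0.80),
    ("Negative point wrongly classified", 0, 0.65),
    ("Positive point wrongly classified", 1, 0.30),
    ("Negative point correctly classified", 0, 0.15),
]


def update_factor(y, y_hat):
    """The (actual - predicted) term that scales learning_rate * Xi."""
    return y - y_hat


# Rounded to two decimals only to hide floating-point noise.
factors = [round(update_factor(y, y_hat), 2) for _, y, y_hat in cases]

for (name, _, _), factor in zip(cases, factors):
    print(f"{name}: Wn = Wo + learning_rate * ({factor:+.2f}) * Xi")
```

The sign of the factor gives the direction of the move, and its size gives the magnitude. No case produces a factor of zero, so every point moves the line.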
Code the Sigmoid Function for Logistic Regression
Now we will code the sigmoid function and fit the data we created using the modified algorithm. We are working in the same notebook as part 1, so please create the data as before if you have not done that already. All the code is the same; the only modification is in the perceptron function.
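A minimal sketch of the modified perceptron function is given below, assuming plain Python so that it runs without the notebook. It is not the notebook's exact code: the function name `sigmoid_perceptron`, the zero starting weights, the number of iterations, the random choice of one point per iteration, and the tiny toy dataset are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sigmoid_perceptron(X, y, learning_rate=0.1, iterations=1000, seed=0):
    """Perceptron trick with the step function replaced by the sigmoid.

    Returns the weights as [bias, w1, ..., wn].
    """
    rng = random.Random(seed)
    weights = [0.0] * (len(X[0]) + 1)  # assumed starting point
    for _ in range(iterations):
        i = rng.randrange(len(X))      # pick one point at random
        xi = [1.0] + list(X[i])        # prepend 1.0 for the bias
        y_hat = sigmoid(sum(w * v for w, v in zip(weights, xi)))
        # Same update as before, but y_hat is now continuous,
        # so correctly classified points also move the line.
        weights = [w + learning_rate * (y[i] - y_hat) * v
                   for w, v in zip(weights, xi)]
    return weights


# Hypothetical toy data: three negative and three positive points.
X = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -1.5),
     (1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]
y = [0, 0, 0, 1, 1, 1]

weights = sigmoid_perceptron(X, y)
predictions = [
    1 if sigmoid(weights[0] + weights[1] * a + weights[2] * b) >= 0.5 else 0
    for a, b in X
]
print(weights)
print(predictions)
```

With the data from part 1 you would pass your own X and y instead of the toy points. The only line that differs from the step-function version is the one that computes `y_hat`.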
Visualize the results of the sigmoid function and compare them with the perceptron trick and the actual logistic regression.