Logistic Regression Part-1 | Perceptron Trick

Hello guys, hope you are doing well. In this tutorial, we are going to understand logistic regression from the ground up and take our understanding to the next level. Logistic regression is usually the very first algorithm people come across when they start learning classification algorithms. Apart from this, logistic regression is a building block of all of deep learning: if you understand logistic regression well, your deep learning foundation will be strong. And it is very easy to learn.


Table Of Contents

  • Approach to Learn Logistic Regression
  • What Does Logistic Regression Actually Do?
  • Perceptron Trick for Solving Logistic Regression
    • How Do Transformations Happen?
    • Unraveling the Algorithm
    • Simplification
  • Implementation of the Perceptron Trick with Python
  • Problem with the Perceptron Trick
  • Why Do We Study the Perceptron Trick?
  • Conclusion


Approach to learn Logistic Regression

There are a huge number of articles and videos already available on the internet about logistic regression. If you search, you will find two approaches with which people teach and understand it: the first is geometric intuition, and the second is the probability approach.
We will go with the probability method, because the probability intuition presents a clear picture of everything and is easy to understand in detail; after that, you can go for the geometric intuition.


What Does Logistic Regression Actually Do?

Logistic regression is a classification algorithm used for binary classification tasks. Like linear regression, it works with a line: it separates the two classes by drawing a boundary line between them, or a hyperplane in higher-dimensional data.

Logistic regression only works when the data is linearly separable, or almost linearly separable. With a non-linear relationship, logistic regression will not perform well, because we cannot separate the two classes just by drawing a line between them, as you can see in the figure below.


This is linearly separable data, and logistic regression separates these 2 classes by drawing a boundary line between them.


What Is the Perceptron Trick for Solving Logistic Regression?

The perceptron trick is very simple to understand. First, you might assume that if we are drawing a line to separate the two classes, the equation of the line will be the same as in linear regression. No, this is not the case: in logistic regression we use the general form of a line.


         y = mx + b  (Linear Regression)

        Ax + By + C = 0
Or
        Ax1 + Bx2 + C = 0  (Logistic Regression, general form)

The way the perceptron trick works is that it aims to arrive at the separating line by applying some transformations to the above equation. Consider the diagram example above, where using CGPA we have to predict whether a student is placed or not: green dots represent placed students, and blue dots represent students who were not placed. Now, how will we separate the two classes with the perceptron trick?


Step-1) Start by drawing any random line.

Step-2) Comparison with other points

Now we compare the randomly drawn line against other randomly picked points. Suppose we pick the 1st point and ask whether it is correctly classified or not; it is correct. Then we pick a second point and ask again; this one is misclassified.
If a misclassification occurs, we move the line towards the misclassified point, obtaining a new line through some transformation of the equation. By running several iterations, we eventually obtain the best separating line.


Hence we can conclude that whenever a correct prediction occurs, we do nothing, and whenever a wrong prediction occurs, the line moves towards the misclassified point.

How Do Transformations Happen?

As we already know, the transformation of the line happens only by changing the values of A, B, and C in the equation. All three values have their own importance, and each one affects a different aspect of the line:

  • Changing C moves the line up or down, parallel to itself.
  • Changing A pivots the line around the point where it crosses the Y-axis, so the line appears to move vertically.
  • Changing B pivots the line around the point where it crosses the X-axis, so the line appears to shift horizontally.
If you want to experience all these changes, visit the Desmos graphing calculator site, write an equation in general form, and increase and decrease the values.

Whenever we want to move the line towards a point lying in the positive region, we decrease the coefficient values, and vice-versa. That means if a negative point is misclassified into the positive region, we append a value of 1 to its coordinates (for the bias term) and subtract the resulting point from the line's coefficients. And if a positive point is misclassified into the negative region, we append the 1 in the same way and add the point to the line's coefficients.
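As a sketch, writing the point as (x1, x2) with a 1 appended for the bias term, the two updates look like this (in the implementation we will also scale the change by a learning rate):

        A_new = A - x1,   B_new = B - x2,   C_new = C - 1    (negative point in the positive region)
        A_new = A + x1,   B_new = B + x2,   C_new = C + 1    (positive point in the negative region)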


Unraveling the Algorithm of the Perceptron Trick

Now we will dive into the mathematical intuition of how the transformation happens. After understanding the algorithm, we will convert it into code and implement it practically.


The equation is
        Ax1 + Bx2 + C = 0

Instead of writing the equation in the above form, we can write it as:
    W0 + W1X1 + W2X2 = 0

Where,
    W0 is C
    W1 is A
    W2 is B

We have written it in this form because we will simplify it further. If we add one more column X0 before it with all values equal to 1, then we can write the equation as:

    W0X0 + W1X1 + W2X2 = 0

This means the general form is the dot product of a weight vector and a feature vector.

That's it. We can also write it in the form of a matrix: take the dot product of W = [W0, W1, W2] and X = [X0, X1, X2] (with X0 = 1), and as a result we get the same equation, W . X = 0.
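A quick NumPy sketch of that dot product (the numbers here are illustrative):

    import numpy as np

    W = np.array([0.5, -1.2, 2.0])   # weight vector [W0, W1, W2] = [C, A, B]
    X = np.array([1.0, 3.4, 7.8])    # feature vector [X0, x1, x2], with X0 = 1 for the bias

    # W0*1 + W1*x1 + W2*x2 -- the same equation as above
    value = np.dot(W, X)
    print(value > 0)                 # True means the point falls in the positive region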



Now let's see how the algorithm works:

  • Step-1) Decide the number of iterations (epochs) for which to run the loop.
  • Step-2) Randomly select a data point from the training data.
  • Step-3) Check whether the picked point is correctly classified or not.
  • Step-4) If a negative point is wrongly placed in the positive region, update the coefficients using the above transformation (subtraction), and vice-versa (addition).

In every other case the point is correctly classified, which means we do not make any changes. And this is our algorithm.

If you look at the algorithm, we have to check 2 conditions, which in implementation is a slightly hectic task, so we will simplify the algorithm to use only one condition, as shown in the sketch below and in the next section.
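For reference, here is a sketch of what the unsimplified, two-condition version could look like in Python (the function name, learning rate, and epoch count are illustrative):

    import numpy as np

    def perceptron_two_conditions(X, y, lr=0.1, epochs=1000):
        X = np.insert(X, 0, 1, axis=1)                    # prepend a column of 1s for the bias
        weights = np.ones(X.shape[1])                     # start with all weights equal to 1
        for _ in range(epochs):
            i = np.random.randint(0, X.shape[0])          # step-2: pick a random training point
            y_hat = 1 if np.dot(weights, X[i]) > 0 else 0 # step-3: classify it
            if y[i] == 0 and y_hat == 1:                  # negative point in the positive region
                weights = weights - lr * X[i]
            elif y[i] == 1 and y_hat == 0:                # positive point in the negative region
                weights = weights + lr * X[i]
        return weights

Notice the two separate conditions; the next section collapses them into a single update.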

Simplification of the equation

We will simplify the above algorithm: instead of comparing against 2 conditions, after a little manipulation we only need one comparison, and both update equations form automatically.
Have a careful look at the table below and then go through the explanation.

    Actual (y)    Predicted (y_hat)    y - y_hat    Update
    1             1                    0            no change
    0             0                    0            no change
    1             0                    +1           add the point to the coefficients
    0             1                    -1           subtract the point from the coefficients


Explanation - Look carefully at the table above. These are the only four cases that occur during prediction. In the top two cases the point is correctly classified, so no change happens: the difference (y - y_hat) is zero, which means the new coefficients equal the old coefficients.
The last 2 cases are the wrong predictions, and if you put their values into the single update rule

        W_new = W_old + lr * (y - y_hat) * X

then the addition and subtraction equations we were comparing above form automatically.


And the code reduces to only 3 lines. Hence we have simplified the equation.
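As a sketch, assuming weights, lr, X, and y are defined as in the loop above, those 3 lines inside the training loop could look like this:

    i = np.random.randint(0, X.shape[0])              # pick a random point
    y_hat = 1 if np.dot(weights, X[i]) > 0 else 0     # step-function prediction
    weights = weights + lr * (y[i] - y_hat) * X[i]    # one update covers all four cases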

Making Our Hands Dirty with the Perceptron Trick

Now we will implement the above algorithm practically by creating a random classification dataset, and clear up all our doubts about how the perceptron trick works for logistic regression. Let's get started.

Step-1) Create data
Here we will import all the libraries and create a sample classification dataset.
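A minimal sketch of this step, assuming scikit-learn's make_classification (the exact parameters, apart from the class separation of 10 mentioned later, are illustrative):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification

    # 100 points with 2 informative features and 2 well-separated classes
    X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                               n_redundant=0, n_clusters_per_class=1,
                               class_sep=10, random_state=41)

    plt.scatter(X[:, 0], X[:, 1], c=y)
    plt.show()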

Step-2) Implement the perceptron function
We will create a perceptron function that accepts the input variables and the output variable, performs all the transformations we studied above, and returns the values of the coefficients and the intercept.
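A sketch of that function, matching the explanation that follows (variable names and defaults are illustrative):

    def perceptron(X, y, lr=0.1, epochs=1000):
        X = np.insert(X, 0, 1, axis=1)                    # new column of 1s for the bias
        weights = np.ones(X.shape[1])                     # initial weight of 1 per column
        for _ in range(epochs):
            i = np.random.randint(0, X.shape[0])          # select a random index
            y_hat = 1 if np.dot(weights, X[i]) > 0 else 0 # step function on the dot product
            weights = weights + lr * (y[i] - y_hat) * X[i]  # simplified update rule
        return weights[0], weights[1:]                    # intercept, coefficients

    intercept_, coef_ = perceptron(X, y)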

Explanation - First we insert a new column into the input variables with value 1 for the bias. Then we create an initial weights array of 1s, one per column, and define the learning rate. Then we start our loop over the chosen number of epochs. As per our simplified algorithm, we first select a random index, then compute the dot product of the weights and the randomly selected data point. If the dot product is greater than 0 we predict 1, else 0. Then we update the weights according to the simplified equation we studied above. The loop runs for the given number of epochs, and finally we return the intercept and coefficient values.

Step-3) Calculate m and b
To plot the separating line, we find the values of m and b from the general equation.
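Rearranging W0 + W1X1 + W2X2 = 0 into X2 = mX1 + b gives m = -W1/W2 and b = -W0/W2. As a sketch, using the values returned by the perceptron function above:

    m = -(coef_[0] / coef_[1])       # slope      m = -W1/W2
    b = -(intercept_ / coef_[1])     # intercept  b = -W0/W2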

Now we plot the line that separates the 2 classes.
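Continuing the sketch above:

    x_line = np.linspace(X[:, 0].min(), X[:, 0].max(), 100)

    plt.scatter(X[:, 0], X[:, 1], c=y)
    plt.plot(x_line, m * x_line + b, color='red', label='Perceptron trick')
    plt.legend()
    plt.show()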

Think it over: the line you get really does classify the classes correctly, and it happens just like this. But how logistic regression works is slightly different; the perceptron trick is only a simple method to achieve a separating line.


Problem with Perceptron Trick

I will show you the difference between actual logistic regression and the perceptron trick by fitting scikit-learn's logistic regression on the same data.
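A sketch of that comparison, continuing from the code above and recovering scikit-learn's line from its coef_ and intercept_ attributes in the same way:

    from sklearn.linear_model import LogisticRegression

    clf = LogisticRegression()
    clf.fit(X, y)

    m_lor = -(clf.coef_[0][0] / clf.coef_[0][1])       # slope of sklearn's line
    b_lor = -(clf.intercept_[0] / clf.coef_[0][1])     # intercept of sklearn's line

    plt.scatter(X[:, 0], X[:, 1], c=y)
    plt.plot(x_line, m * x_line + b, color='red', label='Perceptron trick')
    plt.plot(x_line, m_lor * x_line + b_lor, color='green', label='Logistic Regression')
    plt.legend()
    plt.show()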



You can see that the two lines differ. To understand better why logistic regression is better than the perceptron trick, run the complete code from the start again, increasing the class separation parameter (class_sep) from 10 to 20, and observe the changes in the final graph.

If you observe the changes, the perceptron line stops improving as soon as it classifies all the points, but the logistic regression line does not stop there: it keeps improving and draws the line in a more symmetric way. That's why logistic regression performs better than the perceptron trick.

Why Do We Study the Perceptron Trick?

If you understand the perceptron trick, it makes you capable of seeing how the problem can be solved: it classifies the classes accurately, but it can end up overfitting them. And after understanding this, understanding logistic regression becomes very simple. Hence we study the perceptron trick.

Conclusion

Thank you for following the article till the end. I hope it was easy to grasp the working of the perceptron algorithm for logistic regression. If you have any doubts, please post them in the comment section below; I will be happy to help you out. In the next article, we will understand complete logistic regression with gradient descent and the sigmoid function, so stay tuned and keep reading.


Happy learning, keep learning!
