The Sigmoid Function: The Math That Powers Logistic Regression

When you first learn logistic regression, everyone says:

“We use the sigmoid function to map numbers to probabilities.”

But if you’re like most people, your next question is:

“Cool… but where does that magical S-shape even come from?”

To understand it fully, we need to take a short trip through logarithms, odds, and log-odds before finally arriving at the sigmoid.


Step 1: What is a Log?

A logarithm is the inverse of exponentiation.

If:

b^y = x

then:

\log_b(x) = y

In other words:

  • A log answers the question: “To what power must I raise the base b to get x?”

  • For example:

\log_2(8) = 3 \quad \text{because} \quad 2^3 = 8

In logistic regression, we specifically use the natural logarithm, denoted as:

\ln(x)

This means the base is e (≈ 2.71828), the mathematical constant for continuous growth.

Example:

\ln(e^2) = 2, \qquad \ln(1) = 0
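If you want to check these identities yourself, here is a minimal sketch in Python using the built-in math module (nothing below is specific to logistic regression):

import math

# "To what power must I raise 2 to get 8?"
print(math.log2(8))            # 3.0

# natural log uses base e (math.e ≈ 2.71828)
print(math.log(math.e ** 2))   # ≈ 2.0
print(math.log(1))             # 0.0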

Step 2: From Probability to Odds

Probability (p) is straightforward:

0 \leq p \leq 1

It’s the fraction of times an event happens out of all trials.

Example: A coin toss

  • Probability of heads = 0.5

But statisticians often use odds:

\text{Odds} = \frac{p}{1 - p}

This compares the probability of success to the probability of failure.

Example: If p = 0.8:

\text{Odds} = \frac{0.8}{0.2} = 4

Meaning “4 to 1” odds: success is four times as likely as failure.
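Here is a tiny Python sketch of the probability-to-odds conversion; the odds helper is just an illustrative name, not a library function:

def odds(p):
    """Odds in favour of an event with probability p (0 < p < 1)."""
    return p / (1 - p)

print(odds(0.5))   # 1.0  -> "even" odds
print(odds(0.8))   # ≈ 4.0 -> "4 to 1" in favour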


Step 3: Odds to Log-Odds (Logit)

Now we take the log of the odds. This gives us the logit:

\text{Logit}(p) = \ln\left(\frac{p}{1 - p}\right)

Why bother?

  • Odds are non-negative (0 to ∞).

  • Log-odds can take any real number (-∞ to +∞).

  • This makes them perfect for linking probabilities to linear models.

Example:
If p = 0.8:

\text{Logit}(0.8) = \ln\left(\frac{0.8}{0.2}\right) = \ln(4) \approx 1.386

If p = 0.2:

\text{Logit}(0.2) = \ln\left(\frac{0.2}{0.8}\right) = \ln(0.25) \approx -1.386

Notice how the sign changes — higher probability → positive logit, lower probability → negative logit.
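A quick way to build this intuition is to compute a few logits numerically. This is a minimal sketch; the logit helper is just an illustrative name:

import numpy as np

def logit(p):
    """Log-odds (logit) of a probability p."""
    return np.log(p / (1 - p))

print(logit(0.8))   # ≈  1.386
print(logit(0.2))   # ≈ -1.386
print(logit(0.5))   # 0.0 -> p = 0.5 sits at the centre of the log-odds scale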


Step 4: Logistic Regression’s Assumption

Logistic regression assumes a linear relationship between the log-odds and the inputs:

\ln\left(\frac{p}{1 - p}\right) = z

Where:

z = w_1 x_1 + w_2 x_2 + \dots + b

  • w_i are the weights

  • x_i are the features

  • b is the bias
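To make this concrete, here is a small sketch of how z is computed; the weights, features, and bias values below are made up purely for illustration:

import numpy as np

# hypothetical weights, features, and bias, chosen only for this example
w = np.array([0.4, -1.2, 0.7])   # w_i: weights
x = np.array([2.0, 0.5, 1.0])    # x_i: features
b = 0.1                          # bias

z = np.dot(w, x) + b             # z = w1*x1 + w2*x2 + w3*x3 + b
print(z)                         # ≈ 1.0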


Step 5: Deriving the Sigmoid Function

Now let’s solve for p:

  1. Start with:

\ln\left(\frac{p}{1 - p}\right) = z

  2. Exponentiate both sides (to remove the log):

\frac{p}{1 - p} = e^z

  3. Multiply both sides by 1 - p:

p = e^z (1 - p)

  4. Expand:

p = e^z - p e^z

  5. Add p e^z to both sides:

p + p e^z = e^z

  6. Factor out p:

p (1 + e^z) = e^z

  7. Divide both sides by 1 + e^z:

p = \frac{e^z}{1 + e^z}

  8. Multiply the numerator and denominator by e^{-z}:

p = \frac{1}{1 + e^{-z}}

This is the sigmoid function:

\sigma(z) = \frac{1}{1 + e^{-z}}
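It’s easy to verify the derivation numerically: the form from step 7 and the final sigmoid agree, and the sigmoid undoes the logit. A minimal sketch (the sample z value and helper names are just for illustration):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def logit(p):
    return np.log(p / (1 - p))

# the two equivalent forms from steps 7 and 8 give the same value...
z = 1.386
print(np.exp(z) / (1 + np.exp(z)))   # ≈ 0.8
print(sigmoid(z))                    # ≈ 0.8

# ...and the sigmoid undoes the logit: sigmoid(logit(p)) returns p
print(sigmoid(logit(0.8)))           # ≈ 0.8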

Step 6: Why the Sigmoid is Perfect for Logistic Regression

  • Probability output: Always between 0 and 1.

  • Smooth gradient: Great for optimization via gradient descent (see the sketch after this list).

  • Natural origin: Comes directly from transforming log-odds.

  • Interpretability: Each weight shifts the log-odds linearly.
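The “smooth gradient” point comes from the fact that the sigmoid has a simple closed-form derivative, σ'(z) = σ(z)(1 − σ(z)). Here is a small sketch comparing it with a finite-difference estimate (the sample z value is arbitrary):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_grad(z):
    # closed-form derivative: sigma'(z) = sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1 - s)

z = 0.7
h = 1e-6
numerical = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)  # finite-difference estimate
print(sigmoid_grad(z), numerical)  # the two values agree to several decimal places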


Step 7: Intuition & Shape

The curve:

  • Looks like an S

  • At z = 0, the output is 0.5

  • For large positive z → output ≈ 1

  • For very negative z → output ≈ 0



import numpy as np
import matplotlib.pyplot as plt

# Sigmoid curve over a range of z values
z = np.linspace(-10, 10, 500)
sigmoid = 1 / (1 + np.exp(-z))

plt.figure(figsize=(6, 4))
plt.plot(z, sigmoid, color='blue')
plt.title("Sigmoid Curve")
plt.xlabel("z")
plt.ylabel("σ(z)")
plt.show()

Key Takeaway
The sigmoid function is not just a mathematical trick; it’s the direct result of connecting probabilities, odds, and logarithms in a way that allows a simple linear equation to model complex classification tasks.

