Sigmoid Function
Plot of the error function

A sigmoid function is a mathematical function having a characteristic "S"-shaped curve, or sigmoid curve. Often, the term sigmoid function refers to the special case of the logistic function, shown in the first figure and defined by the formula

${\displaystyle S(x)={\frac {1}{1+e^{-x}}}={\frac {e^{x}}{e^{x}+1}}.}$
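The two equivalent forms in the formula above are handy in practice, because each one avoids floating-point overflow on one side of zero. The following is a minimal sketch, not a reference implementation; the function name `sigmoid` is chosen here for illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid S(x) = 1 / (1 + e^(-x)), evaluated without overflow."""
    if x >= 0:
        # For x >= 0, e^(-x) <= 1, so the first form is safe.
        return 1.0 / (1.0 + math.exp(-x))
    # For x < 0, use the equivalent form e^x / (e^x + 1); here e^x <= 1.
    z = math.exp(x)
    return z / (z + 1.0)

if __name__ == "__main__":
    for x in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
        print(x, sigmoid(x))
```

Branching on the sign of x means `math.exp` is only ever called with a non-positive argument, so it can underflow to 0 but cannot raise `OverflowError`.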

Other examples of similar shapes include the Gompertz curve (used in modeling systems that saturate at large values of x) and the ogee curve (used in the spillway of some dams). Sigmoid functions have the set of all real numbers as their domain, and their return value increases monotonically, most often from 0 to 1 or, alternatively, from -1 to 1, depending on convention.

A wide variety of sigmoid functions have been used as the activation function of artificial neurons, including the logistic and hyperbolic tangent functions. Sigmoid curves are also common in statistics as cumulative distribution functions (which go from 0 to 1), such as the integrals of the probability density functions of the logistic distribution, the normal distribution, and Student's t-distribution.
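The two output conventions mentioned above, 0 to 1 and -1 to 1, differ only by a rescaling. As an illustration, the hyperbolic tangent is a shifted and scaled logistic function, tanh(x) = 2S(2x) - 1. The sketch below checks this identity numerically; the helper names `logistic` and `tanh_via_logistic` are made up for this example.

```python
import math

def logistic(x: float) -> float:
    # Plain logistic function with range (0, 1); adequate for moderate |x|.
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_logistic(x: float) -> float:
    # Stretch the range (0, 1) to (-1, 1) and compress the input by a factor of 2.
    return 2.0 * logistic(2.0 * x) - 1.0

if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.7, 4.0):
        print(x, tanh_via_logistic(x), math.tanh(x))
```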

## Definition

A sigmoid function is a bounded differentiable real function that is defined for all real input values and has a non-negative derivative at each point.[1]

## Properties

In general, a sigmoid function is real-valued, monotonic, and differentiable, with a non-negative first derivative that is bell-shaped. A sigmoid function is bounded by a pair of horizontal asymptotes as ${\displaystyle x\rightarrow \pm \infty }$.
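For the logistic function S(x) defined earlier, these properties can be checked directly. Differentiating gives

${\displaystyle S'(x)={\frac {e^{-x}}{(1+e^{-x})^{2}}}={\frac {1}{1+e^{-x}}}\cdot {\frac {e^{-x}}{1+e^{-x}}}=S(x)\,{\big (}1-S(x){\big )}.}$

This derivative is positive for every x, reaches its maximum value of 1/4 at x = 0, and tends to 0 as ${\displaystyle x\rightarrow \pm \infty }$, which is the bell shape described above. The horizontal asymptotes in this case are 0 and 1.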

## Examples

Some sigmoid functions compared. In the drawing all functions are normalized in such a way that their slope at the origin is 1.
• Logistic function: ${\displaystyle f(x)={\frac {1}{1+e^{-x}}}}$
• Hyperbolic tangent (a shifted and scaled version of the logistic function): ${\displaystyle f(x)=\tanh x={\frac {e^{x}-e^{-x}}{e^{x}+e^{-x}}}}$
• Arctangent function: ${\displaystyle f(x)=\arctan x}$
• Gudermannian function: ${\displaystyle f(x)=\operatorname {gd} (x)=\int _{0}^{x}{\frac {1}{\cosh t}}\,dt}$
• Error function: ${\displaystyle f(x)=\operatorname {erf} (x)={\frac {2}{\sqrt {\pi }}}\int _{0}^{x}e^{-t^{2}}\,dt}$
• Generalised logistic function: ${\displaystyle f(x)=(1+e^{-x})^{-\alpha },\quad \alpha >0}$
• Smoothstep function: ${\displaystyle f(x)={\begin{cases}\left(\int _{0}^{1}{\big (}1-u^{2}{\big )}^{N}\ du\right)^{-1}\int _{0}^{x}{\big (}1-u^{2}{\big )}^{N}\ du\quad &|x|\leq 1\\\operatorname {sgn} (x)&|x|\geq 1\\\end{cases}}\,\quad N\geq 1}$
• An algebraic function: ${\displaystyle f(x)={\frac {x}{\sqrt {1+x^{2}}}}}$
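The caption above says the plotted functions are normalized to have slope 1 at the origin. The sketch below shows one possible rescaling that achieves this for several of the listed functions while also fixing the asymptotes at -1 and 1. The scale factors are worked out here and are an assumption about the figure, not taken from it; the names `gd`, `NORMALIZED`, and `slope_at_origin` are illustrative.

```python
import math

def gd(x: float) -> float:
    # Gudermannian function in closed form: gd(x) = 2 * atan(tanh(x / 2)).
    return 2.0 * math.atan(math.tanh(x / 2.0))

# Each entry is rescaled so that f(0) = 0, f'(0) = 1,
# and f(x) -> +/-1 as x -> +/-infinity.
NORMALIZED = {
    "tanh": lambda x: math.tanh(x),
    "arctan": lambda x: (2.0 / math.pi) * math.atan(math.pi * x / 2.0),
    "gudermannian": lambda x: (2.0 / math.pi) * gd(math.pi * x / 2.0),
    "erf": lambda x: math.erf(math.sqrt(math.pi) * x / 2.0),
    "algebraic": lambda x: x / math.sqrt(1.0 + x * x),
}

def slope_at_origin(f, h: float = 1e-6) -> float:
    # Central finite-difference estimate of f'(0).
    return (f(h) - f(-h)) / (2.0 * h)

if __name__ == "__main__":
    for name, f in NORMALIZED.items():
        print(f"{name:13s} slope at 0 = {slope_at_origin(f):.6f}")
```

With a common slope at the origin, the curves differ only in how quickly they approach their asymptotes, which is what makes such a comparison plot informative.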

The integral of any continuous, non-negative, "bump-shaped" function is sigmoidal; thus the cumulative distribution functions of many common probability distributions are sigmoidal. One such example is the error function, which is related to the cumulative distribution function (CDF) of a normal distribution.
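The relation between the error function and the normal CDF can be made concrete: for mean μ and standard deviation σ, the CDF is ½(1 + erf((x − μ)/(σ√2))). A minimal sketch using the standard library's `math.erf` follows; the function name `normal_cdf` and its parameter names are chosen here for illustration.

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of a normal distribution, expressed through the error function."""
    # erf has range (-1, 1); shifting and halving maps it onto (0, 1).
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(x, normal_cdf(x))
```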

Many natural processes, such as those of complex system learning curves, exhibit a progression from small beginnings that accelerates and approaches a climax over time. When a specific mathematical model is lacking, a sigmoid function is often used.[2]

## References

1. ^ Han, Jun; Morag, Claudio (1995). "The influence of the sigmoid function parameters on the speed of backpropagation learning". In Mira, José; Sandoval, Francisco. From Natural to Artificial Neural Computation. pp. 195-201.
2. ^ Gibbs, M.N. (Nov 2000). "Variational Gaussian process classifiers". IEEE Transactions on Neural Networks. 11 (6): 1458-1464. doi:10.1109/72.883477.
• Mitchell, Tom M. (1997). Machine Learning. WCB-McGraw-Hill. ISBN 0-07-042807-7. See in particular "Chapter 4: Artificial Neural Networks" (pp. 96-97), where Mitchell uses the terms "logistic function" and "sigmoid function" synonymously and also calls this function the "squashing function"; the sigmoid (logistic) function is used to compress the outputs of the "neurons" in multi-layer neural nets.
• Humphrys, Mark. "Continuous output, the sigmoid function". Properties of the sigmoid, including how it can shift along axes and how its domain may be transformed.