Logit
Function in statistics

In statistics, the logit function is the quantile function of the standard logistic distribution. It plays a key role in data analysis and machine learning, particularly in data transformations. It is the inverse of the standard logistic function σ(x) = 1 / (1 + e⁻ˣ) and is defined by logit p = ln(p / (1 − p)) for p in (0, 1). Because it is the logarithm of the odds p/(1 − p), it is also known as the log-odds. Like the probit function, the logit maps probabilities from (0, 1) onto the whole real line (−∞, +∞).


Definition

If p is a probability, then p/(1 − p) is the corresponding odds; the logit of the probability is the logarithm of the odds, i.e.:

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \ln(p) - \ln(1-p) = -\ln\left(\frac{1}{p}-1\right) = 2\operatorname{atanh}(2p-1).$$
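As a quick check, here is a minimal Python sketch (the helper name logit is ours, not a standard-library function) confirming that the four equivalent forms above agree numerically:

    import math

    def logit(p):
        """Log-odds of a probability p, valid for 0 < p < 1."""
        if not 0.0 < p < 1.0:
            raise ValueError("p must lie strictly between 0 and 1")
        return math.log(p / (1.0 - p))

    p = 0.8
    assert math.isclose(logit(p), math.log(p) - math.log(1.0 - p))
    assert math.isclose(logit(p), -math.log(1.0 / p - 1.0))
    assert math.isclose(logit(p), 2.0 * math.atanh(2.0 * p - 1.0))
    print(logit(p))  # ≈ 1.3863, i.e. ln 4, since the odds are 0.8/0.2 = 4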

The base of the logarithm function used is of little importance in the present article, as long as it is greater than 1, but the natural logarithm with base e is the one most often used. The choice of base corresponds to the choice of logarithmic unit for the value: base 2 corresponds to a shannon, base e to a nat, and base 10 to a hartley; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity.
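For example, the same odds can be expressed in each of these units (a small illustrative sketch; the choice of p is arbitrary):

    import math

    p = 0.8
    odds = p / (1 - p)        # = 4
    print(math.log(odds))     # ≈ 1.386 nats     (base e)
    print(math.log2(odds))    # = 2.0  shannons  (base 2, since 4 = 2**2)
    print(math.log10(odds))   # ≈ 0.602 hartleys (base 10)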

The “logistic” function of any number α is given by the inverse-logit:

$$\operatorname{logit}^{-1}(\alpha) = \operatorname{logistic}(\alpha) = \frac{1}{1+\exp(-\alpha)} = \frac{\exp(\alpha)}{\exp(\alpha)+1} = \frac{\tanh(\frac{\alpha}{2})+1}{2}$$
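A sketch of the inverse-logit in Python follows. The two-branch form is a common numerical-stability trick (exp is only ever evaluated at non-positive arguments, so it cannot overflow for large |α|), not part of the mathematical definition:

    import math

    def inv_logit(a):
        """Logistic function: maps any real a to a probability in (0, 1)."""
        if a >= 0:
            return 1.0 / (1.0 + math.exp(-a))
        z = math.exp(a)
        return z / (1.0 + z)

    # inv_logit undoes logit:
    for p in (0.1, 0.5, 0.9):
        assert math.isclose(inv_logit(math.log(p / (1 - p))), p)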

The difference between the logits of two probabilities is the logarithm of the odds ratio R, providing a shorthand for combining odds ratios by mere addition and subtraction of logits:

$$\ln(R) = \ln\left(\frac{p_{1}/(1-p_{1})}{p_{2}/(1-p_{2})}\right) = \ln\left(\frac{p_{1}}{1-p_{1}}\right) - \ln\left(\frac{p_{2}}{1-p_{2}}\right) = \operatorname{logit}(p_{1}) - \operatorname{logit}(p_{2}).$$
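For instance (a small sketch; the values of p₁ and p₂ are arbitrary):

    import math

    def logit(p):
        return math.log(p / (1.0 - p))

    p1, p2 = 0.75, 0.50
    odds_ratio = (p1 / (1 - p1)) / (p2 / (1 - p2))   # = 3/1 = 3
    # The log of the odds ratio is a difference of logits:
    assert math.isclose(math.log(odds_ratio), logit(p1) - logit(p2))
    print(math.log(odds_ratio))  # ≈ 1.0986 = ln 3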

The Taylor series for the logit function, centered at x = 1/2, is given by:

$$\operatorname{logit}(x) = 2\sum_{n=0}^{\infty} \frac{(2x-1)^{2n+1}}{2n+1}.$$
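The series converges for |2x − 1| < 1, i.e. for all x in (0, 1), and a partial sum approximates the logit well away from the endpoints. A short sketch (the term count of 50 is an arbitrary choice):

    import math

    def logit_series(x, terms=50):
        """Partial sum of the Taylor expansion around x = 1/2."""
        z = 2.0 * x - 1.0
        return 2.0 * sum(z ** (2 * n + 1) / (2 * n + 1) for n in range(terms))

    x = 0.7
    print(logit_series(x))         # ≈ 0.8473
    print(math.log(x / (1 - x)))   # ≈ 0.8473, the exact value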

History

Several approaches have been explored to adapt linear regression methods to a domain where the output is a probability value in (0, 1) rather than an arbitrary real number in (−∞, +∞). Many of these efforts focused on mapping the range (0, 1) to (−∞, +∞) and then running the linear regression on the transformed values.[2]

In 1934, Chester Ittner Bliss used the cumulative normal distribution function to perform this mapping and called his model probit, an abbreviation for "probability unit". This mapping is, however, computationally more expensive.[3]

In 1944, Joseph Berkson used the log of odds instead and called this function logit, an abbreviation for "logistic unit", by analogy with probit:

"I use this term [logit] for ln ⁡ p / q {\displaystyle \ln p/q} following Bliss, who called the analogous function which is linear on ⁠ x {\displaystyle x} ⁠ for the normal curve 'probit'."

— Joseph Berkson (1944)[4]

Log odds had been used extensively by Charles Sanders Peirce in the late 19th century.[5] The commonly used term log-odds was coined by G. A. Barnard in 1949;[6][7] the log-odds of an event is the logit of the probability of the event.[8] Barnard also coined the term lods as an abstract form of "log-odds",[9] but suggested that "in practice the term 'odds' should normally be used, since this is more familiar in everyday life".[10]

Uses and properties

Comparison with probit

Closely related to the logit function (and logit model) are the probit function and probit model. The logit and probit are both quantile functions – that is, inverses of the cumulative distribution function (CDF) of a probability distribution – defined on the domain (0, 1); their inverses, the logistic function and the normal CDF, are sigmoid functions. In fact, the logit is the quantile function of the logistic distribution, while the probit is the quantile function of the standard normal distribution. The probit function is denoted Φ⁻¹(x), where Φ(x) is the CDF of the standard normal distribution:

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^{2}/2}\,dy.$$

The logit and probit functions are extremely similar when the probit function is scaled so that its slope at y = 0 matches the slope of the logit. As a result, probit models are sometimes used in place of logit models because, for certain applications (e.g., in item response theory), the implementation is easier.[15]
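The slope matching can be made concrete: since logit′(1/2) = 4 and probit′(1/2) = √(2π), scaling the probit by 4/√(2π) ≈ 1.60 aligns the two functions at the center. A short sketch (NormalDist is from the Python standard library; the derivative values above are our own computation):

    import math
    from statistics import NormalDist

    def logit(p):
        return math.log(p / (1.0 - p))

    def probit(p):
        return NormalDist().inv_cdf(p)   # quantile of the standard normal

    scale = 4.0 / math.sqrt(2.0 * math.pi)   # ≈ 1.5958
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"p={p:.1f}  logit={logit(p):+.3f}  scaled probit={scale * probit(p):+.3f}")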

Further reading

  • Ashton, Winifred D. (1972). The Logit Transformation: with special reference to its uses in Bioassay. Griffin's Statistical Monographs & Courses. Vol. 32. Charles Griffin. doi:10.2307/2345009. ISBN 978-0-85264-212-2.

References

  1. "Logit/Probit" (PDF). http://www.columbia.edu/~so33/SusDev/Lecture_9.pdf

  2. Cramer, J. S. (2003). "The origins and development of the logit model" (PDF). Cambridge UP. Archived from the original (PDF) on 19 September 2024. https://web.archive.org/web/20240919043104/https://www.cambridge.org/resources/0521815886/1208_default.pdf

  3. Cramer, J. S. (2003). "The origins and development of the logit model" (PDF). Cambridge UP. Archived from the original (PDF) on 19 September 2024. https://web.archive.org/web/20240919043104/https://www.cambridge.org/resources/0521815886/1208_default.pdf

  4. Berkson 1944, p. 361, footnote 2. - Berkson, Joseph (1944). "Application of the Logistic Function to Bio-Assay". Journal of the American Statistical Association. 39 (227, September): 357–365. doi:10.2307/2280041. JSTOR 2280041.

  5. Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty before 1900. Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 978-0-674-40340-6.

  6. Hilbe, Joseph M. (2009), Logistic Regression Models, CRC Press, p. 3, ISBN 9781420075779.

  7. Barnard 1949, p. 120. - Barnard, George Alfred (1949). "Statistical Inference". Journal of the Royal Statistical Society. B. 11 (2): 115–139. doi:10.1111/j.2517-6161.1949.tb00028.x. JSTOR 2984075.

  8. Cramer, J. S. (2003), Logit Models from Economics and Other Fields, Cambridge University Press, p. 13, ISBN 9781139438193.

  9. Barnard 1949, pp. 120, 128. - Barnard, George Alfred (1949). "Statistical Inference". Journal of the Royal Statistical Society. B. 11 (2): 115–139. doi:10.1111/j.2517-6161.1949.tb00028.x. JSTOR 2984075.

  10. Barnard 1949, p. 136. - Barnard, George Alfred (1949). "Statistical Inference". Journal of the Royal Statistical Society. B. 11 (2): 115–139. doi:10.1111/j.2517-6161.1949.tb00028.x. JSTOR 2984075.

  11. "R: Inverse logit function". Archived from the original on 2011-07-06. Retrieved 2011-02-18. https://web.archive.org/web/20110706132209/http://www.stat.ucl.ac.be/ISdidactique/Rhelp/library/msm/html/expit.html

  12. Thrun, Sebastian (2003). "Learning Occupancy Grid Maps with Forward Sensor Models" (PDF). Autonomous Robots. 15 (2): 111–127. doi:10.1023/A:1025584807625. ISSN 0929-5593. S2CID 2279013. https://mediawiki.isr.tecnico.ulisboa.pt/images/5/5b/Thrun03.pdf

  13. Styler, Alex (2012). "Statistical Techniques in Robotics" (PDF). p. 2. Retrieved 2017-01-26. https://www.cs.cmu.edu/~16831-f12/notes/F12/16831_lecture05_vh.pdf

  14. Dickmann, J.; Appenrodt, N.; Klappstein, J.; Bloecher, H. L.; Muntzinger, M.; Sailer, A.; Hahn, M.; Brenk, C. (2015-01-01). "Making Bertha See Even More: Radar Contribution". IEEE Access. 3: 1233–1247. Bibcode:2015IEEEA...3.1233D. doi:10.1109/ACCESS.2015.2454533. ISSN 2169-3536.

  15. Albert, James H. (2016). "Logit, Probit, and other Response Functions". Handbook of Item Response Theory. Vol. Two. Chapman and Hall. pp. 3–22. doi:10.1201/b19166-1. ISBN 978-1-315-37364-5.