In the mathematical theory of artificial neural networks, universal approximation theorems are theorems of the following form: Given a family of neural networks, for each function $f$ from a certain function space, there exists a sequence of neural networks $\phi_1, \phi_2, \dots$ from the family, such that $\phi_n \to f$ according to some criterion. That is, the family of neural networks is dense in the function space.
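Written out with quantifiers, and writing $d$ for whatever notion of distance the convergence criterion uses (a placeholder symbol, since the criterion varies from theorem to theorem), density of the family means

$\forall f \text{ in the function space},\ \forall \epsilon > 0,\ \exists \phi \text{ in the family such that } d(\phi, f) < \epsilon.$

A sequence with $\phi_n \to f$ is then obtained by taking $\epsilon = 1, \tfrac{1}{2}, \tfrac{1}{3}, \dots$ and picking one such $\phi$ for each.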
The most popular version states that feedforward networks with non-polynomial activation functions are dense in the space of continuous functions between two Euclidean spaces, with respect to the compact convergence topology.
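One standard way to make this concrete (a sketch of the usual single-hidden-layer formulation, not a full statement of the theorem): with a non-polynomial activation function $\sigma$ and $N$ hidden neurons, the approximating networks can be taken to be of the form

$\phi(x) = \sum_{i=1}^{N} c_i \,\sigma(\langle w_i, x \rangle + b_i), \qquad w_i \in \mathbb{R}^n,\ b_i, c_i \in \mathbb{R},$

and compact convergence means that for every compact set $K \subseteq \mathbb{R}^n$ and every $\epsilon > 0$ there is such a $\phi$ with $\sup_{x \in K} |\phi(x) - f(x)| < \epsilon$ (for vector-valued $f$, componentwise).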
Universal approximation theorems are existence theorems: They simply state that there exists such a sequence $\phi_1, \phi_2, \dots \to f$, and do not provide any way to actually find such a sequence. They also do not guarantee that any particular method, such as backpropagation, will actually find such a sequence. Any method for searching the space of neural networks, including backpropagation, might find a converging sequence or might not (for example, backpropagation may get stuck in a local optimum).
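As an illustration of such a search (a minimal sketch, not part of any theorem; it assumes only NumPy, and the network size, target function, and learning rate are arbitrary choices), the following fits a one-hidden-layer $\tanh$ network to $f(x) = \sin(x)$ by gradient descent. The error typically decreases, but nothing guarantees it falls below a prescribed $\epsilon$:

import numpy as np

rng = np.random.default_rng(0)
N = 32                                   # hidden width (arbitrary choice)
w = rng.normal(size=N)                   # input-to-hidden weights
b = rng.normal(size=N)                   # hidden biases
c = 0.1 * rng.normal(size=N)             # hidden-to-output weights

x = np.linspace(-np.pi, np.pi, 200)      # sample grid
y = np.sin(x)                            # target function f

lr = 0.01
for step in range(10000):
    h = np.tanh(np.outer(x, w) + b)      # hidden activations, shape (200, N)
    pred = h @ c                         # network output phi(x) on the grid
    err = pred - y
    # Gradients of (half the) mean squared error.
    grad_c = h.T @ err / len(x)
    dz = (1.0 - h**2) * (err[:, None] * c)        # backpropagate through tanh
    grad_w = (dz * x[:, None]).sum(axis=0) / len(x)
    grad_b = dz.sum(axis=0) / len(x)
    c -= lr * grad_c
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.tanh(np.outer(x, w) + b) @ c   # final network output
print("max |phi(x) - sin(x)| on the grid:", np.abs(pred - y).max())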
Universal approximation theorems are limit theorems: They simply state that for any $f$ and any criterion of closeness $\epsilon > 0$, there exists a neural network with sufficiently many neurons that approximates $f$ to within $\epsilon$. There is no guarantee that any fixed finite size, say 10,000 neurons, is enough.
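The order of the quantifiers is the point here. Writing $N(f, \epsilon)$ for the required number of neurons (notation introduced just for this restatement), the guarantee has the schematic form

$\forall f,\ \forall \epsilon > 0,\ \exists N(f, \epsilon),\ \exists \phi \text{ with } N(f, \epsilon) \text{ neurons such that } \|\phi - f\| < \epsilon,$

and $N(f, \epsilon)$ may grow without bound as $\epsilon \to 0$ or as $f$ varies, so no single finite width works for every $f$ and every $\epsilon$.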