In machine learning and computer vision, M-theory is a learning framework inspired by feed-forward processing in the ventral stream of visual cortex and originally developed for recognition and classification of objects in visual scenes. M-theory was later applied to other areas, such as speech recognition. On certain image recognition tasks, algorithms based on a specific instantiation of M-theory, HMAX, achieved human-level performance.
The core principle of M-theory is extracting representations invariant under various transformations of images (translation, scale, 2D and 3D rotation and others). In contrast with other approaches using invariant representations, in M-theory they are not hardcoded into the algorithms, but learned. M-theory also shares some principles with compressed sensing. The theory proposes multilayered hierarchical learning architecture, similar to that of visual cortex.