The Dartmouth workshop in 1956 marked the birth of artificial intelligence as a field. In the last decade, the field picked up momentum on the back of deep learning. Recent progress in AI is usually attributed to engineering advances that hugely increased the available computational resources and training data. However, Microsoft researchers argue that the latest advances in AI are not due to quantitative leaps in computing power alone, but also to qualitative changes in how that computing power is deployed. These qualitative changes have led to a new type of computing that the researchers call neurocompositional computing.
Microsoft’s paper, ‘Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems,’ discusses how neurocompositional computing can address AI challenges such as lack of transparency and weakness in learning general knowledge. The new systems can learn more robustly and comprehensibly than standard deep learning networks.
In neurocompositional computing, neural networks exploit two principles: the Compositionality Principle and the Continuity Principle. The Compositionality Principle asserts that encodings of complex information are structures systematically composed of simpler structured encodings. A 2014 Stanford paper titled ‘Bringing machine learning and compositional semantics together’ argued that machine learning and compositional semantics are deeply united around the concepts of generalisation, meaning, and structural complexity. Learning-based theories of semantics bring the two worlds together. Compositionality characterises the recursive nature of the linguistic ability required to generalise creatively. Learning details the conditions under which such an ability can be acquired from data. The principle of compositionality guides researchers towards specific model structures, while ML provides them with methods for training.
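As a rough illustration of the Compositionality Principle, the sketch below computes the meaning of a phrase purely from the meanings of its parts and the way they are combined. It is a toy under stated assumptions, not anything taken from the Microsoft or Stanford papers: the `LEXICON` table, the `meaning` function and the arithmetic mini-language are all hypothetical names invented for this example.

```python
# Hypothetical toy lexicon: each word means either a number or an operation.
LEXICON = {
    "one": 1,
    "two": 2,
    "three": 3,
    "plus": lambda a, b: a + b,
    "times": lambda a, b: a * b,
}


def meaning(tree):
    """Return the meaning of a word (str) or of a (left, operator, right) tree."""
    if isinstance(tree, str):
        return LEXICON[tree]
    left, operator, right = tree
    # The meaning of the whole is built systematically from its parts.
    return meaning(operator)(meaning(left), meaning(right))


# "two plus (three times two)" written as a nested, tree-like structure.
phrase = ("two", "plus", ("three", "times", "two"))
result = meaning(phrase)
print(result)  # 8
```

Because the same recursive rule applies at every level of the tree, the function handles phrases it has never seen before, which is the kind of generalisation the principle is meant to capture.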
The Continuity Principle states that the encoding and processing of information are formalised with real numbers that vary continuously. The latest studies show compositionality can be realised both through the traditional methods of symbolic computing and through novel forms of continuous neural computing. A 2020 Stanford workshop on compositionality and computer vision detailed how recent computer vision work has demonstrated that concepts can be learned from only a few examples using a compositional representation. Compositionality makes it possible for symbolic propositions to express the hierarchical, tree-like structure of sentences in natural language. Current neural networks exhibit a unique functional form of compositionality that may be able to model the compositional character of cognition even if the constituents are altered when composed into a complex expression.
Neural computing encodes information in numerical activation vectors, which together form a vector space. According to the paper, the activation vector that encodes an output results from spreading the activation that encodes an input through multiple layers of neurons, via connections of different strengths, or weights. In a general neural network, the values of these weights are set by training the model on examples of correct input/output pairs. This allows the model to converge to connection weights that produce the correct output when given an input.
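The process described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the paper's method: the network size, the `tanh` activation, the learning rate and the toy task (the logical OR of two inputs, with a constant third input acting as a bias) are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import math
import random


def forward(x, w_hidden, w_out):
    """Spread the input activation through one hidden layer to a single output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sum(w * h for w, h in zip(w_out, hidden))
    return hidden, output


def mean_squared_error(pairs, w_hidden, w_out):
    """Average squared gap between the network's outputs and the correct ones."""
    return sum((forward(x, w_hidden, w_out)[1] - y) ** 2 for x, y in pairs) / len(pairs)


def train(pairs, w_hidden, w_out, lr=0.05, epochs=1000):
    """Adjust the weights in place by full-batch gradient descent."""
    n = len(pairs)
    for _ in range(epochs):
        grad_hidden = [[0.0] * len(row) for row in w_hidden]
        grad_out = [0.0] * len(w_out)
        for x, y in pairs:
            hidden, output = forward(x, w_hidden, w_out)
            err = 2.0 * (output - y) / n
            for j, h in enumerate(hidden):
                grad_out[j] += err * h
                # Backpropagate through tanh: its derivative is 1 - tanh(z)^2.
                back = err * w_out[j] * (1.0 - h * h)
                for i, xi in enumerate(x):
                    grad_hidden[j][i] += back * xi
        for j in range(len(w_out)):
            w_out[j] -= lr * grad_out[j]
            for i in range(len(w_hidden[j])):
                w_hidden[j][i] -= lr * grad_hidden[j][i]


random.seed(0)  # fixed seed so the run is repeatable
# Correct input/output pairs; the last input component is a constant bias.
pairs = [
    ((0.0, 0.0, 1.0), 0.0),
    ((0.0, 1.0, 1.0), 1.0),
    ((1.0, 0.0, 1.0), 1.0),
    ((1.0, 1.0, 1.0), 1.0),
]
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]

initial_loss = mean_squared_error(pairs, w_hidden, w_out)
train(pairs, w_hidden, w_out)
final_loss = mean_squared_error(pairs, w_hidden, w_out)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The error after training should be lower than before, which is what "converging to connection weights that produce the correct output" looks like on a very small scale.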
Neural computing also follows the Continuity Principle. Here, knowledge about information encoded in one vector automatically generalises to similar information encoded in nearby vectors, resulting in similarity-based generalisation. Continuity enables deep learning to improve and modify a model’s statistical inference of outputs from inputs in its training set.
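Similarity-based generalisation can be seen even in a single continuous layer. In the sketch below, the weights in `WEIGHTS` are arbitrary fixed numbers rather than trained values, and `encode` is a hypothetical name; the point is only that an input close to a known one is mapped to a nearby output, while a distant input is not.

```python
import math

# Arbitrary fixed weights, standing in for a trained layer (an assumption).
WEIGHTS = [[0.5, -0.3], [0.2, 0.8]]


def encode(x):
    """Map an input vector to an activation vector with one tanh layer."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in WEIGHTS]


known = (1.0, 0.0)
nearby = (0.95, 0.05)    # a small perturbation of the known input
distant = (-1.0, 1.0)    # a very different input

# How far each encoding lands from the encoding of the known input.
near_shift = math.dist(encode(known), encode(nearby))
far_shift = math.dist(encode(known), encode(distant))
print(f"nearby input moves the output by {near_shift:.3f}")
print(f"distant input moves the output by {far_shift:.3f}")
```

Whatever the network has learned about the known input therefore carries over, almost unchanged, to its neighbours in the vector space.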
Is modern AI neurocompositional?
The two techniques prevalent in 20th-century AI, symbolic computing and non-neurocompositional neural computing, each violate one of the two principles. Human intelligence respects both.
However, Convolutional Neural Networks (CNNs) and Transformers hold much potential for a breakthrough. CNN processing builds in the Compositionality Principle using spatial structure. At each layer, the analysis of the whole image comes from composing together analyses of larger patches, each of which is in turn composed from the previous layer’s analyses of smaller patches. Both CNNs and Transformers derive much of their power from this additional compositional structure: spatial structure in the case of CNNs, and a type of graph structure in the case of Transformers.
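The patch-by-patch composition described for CNNs can be sketched as follows. This is a deliberate simplification: a real CNN applies learned filters and non-linearities to overlapping patches, whereas this hypothetical `compose_patches` helper simply averages non-overlapping 2x2 patches, which is enough to show how each layer's analysis is built from the previous layer's analyses of smaller regions.

```python
def compose_patches(grid, size=2):
    """Summarise each non-overlapping size x size patch of a grid by its mean."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            patch = [grid[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            out_row.append(sum(patch) / len(patch))
        out.append(out_row)
    return out


# A 4x4 "image" whose pixel values are simply 0..15.
image = [[4 * r + c for c in range(4)] for r in range(4)]

layer1 = compose_patches(image)   # each cell summarises a 2x2 patch of pixels
layer2 = compose_patches(layer1)  # the single cell summarises the whole image
print(layer1)  # [[2.5, 4.5], [10.5, 12.5]]
print(layer2)  # [[7.5]]
```

Each cell in `layer2` depends on the image only through the cells of `layer1`, so the analysis of the whole is composed from analyses of its parts.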
CNN and Transformer architectures fall under first-generation (1G) neurocompositional computing. Microsoft’s work aims to incorporate the Compositionality Principle more directly, by giving networks the capability to explicitly construct and process general, abstract, compositionally-structured activation-vector encodings, while remaining within the confines of neural computing and so staying aligned with the Continuity Principle. This is second-generation (2G) neurocompositional computing.