
The Emergence of Symbolic Structure from Data in Prototype Neural Networks

Description

Abstract:
Neural network models have become the dominant artificial intelligence paradigm of our time, succeeding across a variety of challenging tasks despite largely receiving unstructured sequential input. Despite this broad success, it remains a long-standing open question whether they can represent and generalize symbolic structure, in which entities are represented as atomic symbols (CAT, MUFFIN) and abstract, content-independent functions (e.g., NOT, AND, OR) operate over those symbols. Whether neural network models can learn and generalize such structure has classically been the subject of much debate, yet modern neural models (e.g., large language models) succeed at tasks that appear to require it, such as language processing and code generation. We hypothesize that prototype neural networks can learn to perform well on a set of tasks previously argued to involve symbolic structure. We evaluate LSTM- and Transformer-based models at small scale, with low inductive biases and without extensive architectural engineering, on tasks thought to involve symbolic structure, focusing on carefully controlled experimental paradigms and small models in order to yield interpretable results. In the first chapter, we evaluate whether language models can differentiate logical operators in a symbolic reasoning task within a propositional logic setting, and find that model performance depends on the degree to which the operators are separable given their distribution in the data. In the second chapter, we evaluate object-tracking vision models on a task from the developmental psychology literature that tests reasoning by exclusion in a visual setting, and find that the models do not generalize to the logical inference when it is not explicitly featured in their training data. Finally, within a computational cognitive neuroscience setting, we find that Transformer models trained on a working memory task mimic biological functionality for storing and recalling items in working memory, despite having no explicit "memory" component themselves. Overall, our results provide insight into the conditions, in both training data and architecture, that are necessary for representations of symbolic structure to arise in neural models, and can inform the evaluation of neural models across the field of artificial intelligence.
Notes:
Thesis (Ph. D.)--Brown University, 2024

Citation

Traylor, Aaron, "The Emergence of Symbolic Structure from Data in Prototype Neural Networks" (2024). Computer Science Theses and Dissertations. Brown Digital Repository. Brown University Library. https://repository.library.brown.edu/studio/item/bdr:d68mp7u2/
