Comment on I'm looking for an article showing that LLMs don't know how they work internally
Voldemort@lemmy.world 3 days ago
And some longer excerpts on the similarities between AI neural networks and biological brains (human children in particular), from research aimed at improving learning and educational development. Super interesting papers that are easily accessible to anyone:
“Humans are imperfect reasoners. We reason most effectively about entities and situations that are consistent with our understanding of the world. Our experiments show that language models mirror these patterns of behavior. Language models perform imperfectly on logical reasoning tasks, but this performance depends on content and context. Most notably, such models often fail in situations where humans fail — when stimuli become too abstract or conflict with prior understanding of the world. Beyond these parallels, we also observed reasoning effects in language models that to our knowledge have not been previously investigated in the human literature. For example, the patterns of errors on the ‘violate realistic’ rules, or the relative ease of ‘shuffled realistic’ rules in the Wason tasks. Likewise, language model performance on the Wason tasks increases most when they are demonstrated with realistic examples; benefits of concrete examples have been found in cognitive and educational contexts (Sweller et al., 1998; Fyfe et al., 2014), but remain to be explored in the Wason problems. Investigating whether humans show similar effects is a promising direction for future research.” (pp. 5.9–10) Language models show human-like content effects on reasoning, by Ishita Dasgupta, Andrew K. Lampinen, Stephanie C. Y. Chan, Antonia Creswell, Dharshan Kumaran, James L. McClelland, and Felix Hill (DeepMind; Stanford University; first two authors contributed equally, listed alphabetically).
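For anyone unfamiliar with the Wason selection task the paper keeps referring to, here's a tiny sketch of my own (not code from the paper) of what the logically correct answer looks like. The rule has the form "if P then Q", and the only cards worth flipping are the ones that could falsify it:

```python
def cards_to_flip(visible, shows_p, shows_not_q):
    # A card can falsify "if P then Q" only if it might pair P with not-Q.
    # So flip cards visibly showing P (hidden side might be not-Q) and
    # cards visibly showing not-Q (hidden side might be P). Cards showing
    # Q or not-P can never falsify the rule, so flipping them is wasted.
    return [c for c in visible if shows_p(c) or shows_not_q(c)]

# Classic abstract version: "if a card has a vowel on one side, it has
# an even number on the other". Visible faces: A, K, 4, 7.
vowels = set("AEIOU")
picks = cards_to_flip(
    ["A", "K", "4", "7"],
    shows_p=lambda c: c in vowels,
    shows_not_q=lambda c: c.isdigit() and int(c) % 2 == 1,
)
# picks == ["A", "7"] — most people incorrectly pick A and 4 instead
```

The point the paper makes is that both humans and language models do much better on this when P and Q are realistic, familiar content rather than abstract letters and numbers.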
“In this article we will point out several characteristics of human cognitive processes that conventional computer architectures do not capture well. Then we will note that connectionist models {neural networks} are much better able to capture these aspects of human processing. After that we will mention three recent applications in connectionist artificial intelligence which exploit these characteristics. Thus, we shall see that connectionist models offer hope of overcoming the limitations of conventional AI. The paper ends with an example illustrating how connectionist models can change our basic conceptions of the nature of intelligent processing.”
“The framework for building connectionist models is laid out in detail in Rumelhart, McClelland and the PDP Group (1986), and many examples of models constructed in that framework are described. Two examples of connectionist models of human processing abilities that capture these characteristics are the interactive activation model of visual word recognition from McClelland and Rumelhart (1981), and the model of past tense learning from Rumelhart and McClelland (1986). These models were motivated by psychological experiments, and were constructed to capture the data found in these studies. We describe them here to illustrate some of the roots of the connectionist approach in an attempt to understand detailed aspects of human cognition.”
“The models just reviewed capture important aspects of data from psychological experiments, and illustrate how the characteristics of human processing capabilities enumerated above can be captured in an explicit computational framework. Recently connectionist models that capture these same characteristics have begun to give rise to a new kind of Artificial Intelligence, which we will call connectionist AI. Connectionist AI is beginning to address several topics that have not been easily solved using other approaches. We will consider three cases of this. In each case we will describe recent progress that illustrates the ability of connectionist networks to capture the characteristics of human performance mentioned above.”
“This paper began with the idea that humans exploit graded information, and that computational mechanisms that aim to emulate the natural processing capabilities of humans should exploit this kind of information as well. Connectionist models do exploit graded information, and this gives them many of their attractive characteristics.” Parallel Distributed Processing: Bridging the Gap Between Human and Machine Intelligence, by James L. McClelland, Axel Cleeremans, and David Servan-Schreiber (Carnegie Mellon University).
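A toy illustration of what "graded information" means computationally (my own sketch, not code from the paper): a connectionist unit sums weighted evidence and produces a continuous activation, not an all-or-none yes/no:

```python
import math

def unit_activation(inputs, weights, bias=0.0):
    """Logistic activation of a single connectionist unit."""
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    # Squash the net input into (0, 1): the response is graded,
    # rising smoothly with the strength of the evidence.
    return 1.0 / (1.0 + math.exp(-net))

# Weak vs. strong evidence yields graded, not binary, responses:
weak = unit_activation([0.2, 0.1], [1.0, 1.0])    # ≈ 0.57
strong = unit_activation([2.0, 1.5], [1.0, 1.0])  # ≈ 0.97
```

That graded response is what lets these networks degrade gracefully and express partial confidence, instead of flipping between hard symbolic states.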
“Artificial neural networks have come and gone and come again — and there are several good reasons to think that this time they will be around for quite a while. Cheng and Titterington have done an excellent job describing the nature of neural network models and their relations to statistical methods, and they have overviewed several applications. They have also suggested why neuroscientists interested in modeling the human brain are interested in such models. In this note, I will point out some additional motivations for the investigation of neural networks. These are motivations arising from the effort to capture key aspects of human cognition and learning that have thus far eluded cognitive science. A central goal of cognitive science is to understand the full range of human cognitive function”…“{there are} good reasons for thinking that artificial neural networks, or at least computationally explicit models that capture key properties of such networks, will play an important role in the effort to capture some of the aspects of human cognitive function that have eluded symbolic approaches.” [Neural Networks: A Review from a Statistical Perspective]: Comment: Neural Networks and Cognitive Science: Motivations and Applications, James L. McClelland, Statistical Science, Vol. 9, No. 1 (Feb. 1994), pp. 42–45.
“The idea has arisen that as the scale of experience and computation begins to approach the scale of experience and computation available to a young child—who sees millions of images and hears millions of words per year, and whose brain contains 10–100 billion neuron-like processing units each updating their state on a time scale of milliseconds—the full power and utility of neural networks to capture natural computation is finally beginning to become a reality, allowing artificially intelligent systems to capture more fully the capabilities of the natural intelligence present in real biological networks in the brain.”
“One major development in the last 25 years has been the explosive growth of computational cognitive neuroscience. The idea that computer simulations of neural mechanisms might yield insight into cognitive phenomena no longer requires, at least in most quarters, vigorous defense—there now exist whole fields, journals, and conferences dedicated to this pursuit. One consequence is the elaboration of a variety of different computationally rigorous approaches to neuroscience and cognition that capture neural information processing mechanisms at varying degrees of abstraction and complexity. These include the dynamic field theory, in which the core representational elements are fields of neurons whose activity and interactions can be expressed as a series of coupled equations (Johnson, Spencer, & Schöner, 2008); the neural engineering framework, which seeks to understand how spiking neurons might implement tensor-product approaches to symbolic representations (Eliasmith & Anderson, 2003; Rasmussen & Eliasmith, 2011); and approaches to neural representation based on ideal-observer models and probabilistic inference (Deneve, Latham, & Pouget, 1999; Knill & Pouget, 2004). Though these perspectives differ from PDP in many respects, all of these efforts share the idea that cognition emerges from interactions among populations of neurons whose function can be studied in simplified, abstract form.” Parallel Distributed Processing at 25: Further Explorations in the Microstructure of Cognition, T. T. Rogers and J. L. McClelland, Cognitive Science 38 (2014), pp. 1062–1063.
Voldemort@lemmy.world 3 days ago
I personally think there are plenty of examples out there in neuroscience and computer science papers, let alone what other fields are starting to discover through the use of AI. In my opinion it should come as no surprise, and seems quite clear, that replicating a mechanism of self-adapting logic would produce behaviours we can find directly within ourselves.
Let me know if this is enough to prove my point, but I think I’m tired of reading papers for a bit.