Pasadena, CA (Scicasts) – We tend to be creatures of habit. In fact, the human brain has a learning system that is devoted to guiding us through routine, or habitual, behaviours.
At the same time, the brain has a separate goal-directed system for the actions we undertake only after careful consideration of the consequences. We switch between the two systems as needed. But how does the brain know which system to give control to at any given moment? Enter The Arbitrator.
Researchers at the California Institute of Technology (Caltech) have pinpointed areas of the brain—the inferior lateral prefrontal cortex and frontopolar cortex—that seem to serve as this "arbitrator" between the two decision-making systems, weighing the reliability of the predictions each makes and then allocating control accordingly. The results appear in the current issue of the journal Neuron.
According to John O'Doherty, the study's principal investigator and director of the Caltech Brain Imaging Center, understanding where the arbitrator is located and how it works could eventually lead to better treatments for brain disorders, such as drug addiction, and psychiatric disorders, such as obsessive-compulsive disorder. These disorders, which involve repetitive behaviours, may be driven in part by malfunctions in the degree to which behaviour is controlled by the habitual system versus the goal-directed system.
"Now that we have worked out where the arbitrator is located, if we can find a way of altering activity in this area, we might be able to push an individual back toward goal-directed control and away from habitual control," says O'Doherty, who is also a professor of psychology at Caltech. "We're a long way from developing an actual treatment based on this for disorders that involve over-egging of the habit system, but this finding has opened up a highly promising avenue for further research."
In the study, participants played a decision-making game on a computer while connected to a functional magnetic resonance imaging (fMRI) scanner that monitored their brain activity. Participants were instructed to try to make optimal choices in order to gather coins of a certain colour, which were redeemable for money.
During a pre-training period, the subjects familiarized themselves with the game—moving through a series of on-screen rooms, each of which held different numbers of red, yellow, or blue coins. During the actual game, the participants were told which coins would be redeemable each round and given a choice to navigate right or left at two stages, knowing that they would collect only the coins in their final room. Sometimes all of the coins were redeemable, making the task more habitual than goal-directed. By altering the probability of getting from one room to another, the researchers were able to further test the extent of participants' habitual and goal-directed behaviour while monitoring corresponding changes in their brain activity.
With the results from those tests in hand, the researchers were able to compare the fMRI data and choices made by the subjects against several computational models they constructed to account for behaviour. The model that most accurately matched the experimental data involved the two brain systems making separate predictions about which action to take in a given situation. Receiving signals from those systems, the arbitrator kept track of the reliability of the predictions by measuring the difference between the predicted and actual outcomes for each system. It then used those reliability estimates to determine how much control each system should exert over the individual's behaviour. In this model, the arbitrator ensures that the system making the most reliable predictions at any moment exerts the greatest degree of control over behaviour.
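The arbitration idea described above can be illustrated with a small sketch. This is not the computational model from the study; it is a simplified illustration under stated assumptions. It assumes outcomes are scaled to the range 0 to 1, that reliability is one minus a running average of the absolute prediction error, and that control is shared in proportion to reliability. The names ReliabilityTracker, goal_directed_weight, and arbitrate, and the learning rate of 0.2, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityTracker:
    """Running reliability estimate for one learning system (hypothetical helper)."""

    learning_rate: float = 0.2   # assumed smoothing factor, not from the study
    mean_abs_error: float = 0.5  # assumed neutral starting point

    def update(self, predicted: float, actual: float) -> None:
        # Prediction error: gap between predicted and actual outcome,
        # capped at 1.0 because outcomes are assumed to lie in [0, 1].
        error = min(abs(actual - predicted), 1.0)
        # Exponential moving average of the absolute error.
        self.mean_abs_error += self.learning_rate * (error - self.mean_abs_error)

    @property
    def reliability(self) -> float:
        # Small average error means high reliability.
        return 1.0 - self.mean_abs_error


def goal_directed_weight(goal: ReliabilityTracker, habit: ReliabilityTracker) -> float:
    """Share of control given to the goal-directed system (assumed proportional rule)."""
    total = goal.reliability + habit.reliability
    if total == 0.0:
        return 0.5  # neither system is reliable: split control evenly
    return goal.reliability / total


def arbitrate(goal_value: float, habit_value: float, weight: float) -> float:
    """Blend the two systems' value estimates using the arbitration weight."""
    return weight * goal_value + (1.0 - weight) * habit_value


if __name__ == "__main__":
    goal, habit = ReliabilityTracker(), ReliabilityTracker()
    # Toy scenario: the goal-directed system predicts outcomes well,
    # while the habitual system is consistently off.
    for _ in range(20):
        goal.update(predicted=0.8, actual=0.8)
        habit.update(predicted=0.2, actual=0.8)
    w = goal_directed_weight(goal, habit)
    print(f"goal-directed weight: {w:.2f}")
    print(f"blended value: {arbitrate(1.0, 0.0, w):.2f}")
```

In this toy scenario the system with the smaller prediction errors ends up with the larger share of control, which mirrors the behaviour the article attributes to the arbitrator.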
"What we're showing is the existence of higher-level control in the human brain," says Sang Wan Lee, lead author of the new study and a postdoctoral scholar in neuroscience at Caltech. "The arbitrator is basically making decisions about decisions."
In line with previous findings from the O'Doherty lab and elsewhere, the researchers saw in the brain scans that an area known as the posterior putamen was active at times when the model predicted that the habitual system should be calculating prediction values. Going a step further, they examined the connectivity between the posterior putamen and the arbitrator. What they found might explain how the arbitrator sets the weight for the two learning systems: the connection between the arbitrator area and the posterior putamen changed according to whether the goal-directed or habitual system was deemed to be more reliable. However, no such connection effects were found between the arbitrator and brain regions involved in goal-directed learning. This suggests that the arbitrator may work mainly by modulating the activity of the habitual system.
"One intriguing possibility arising from these findings, which we will need to test in future work, is that being in a habitual mode of behaviour may be the default state," says O'Doherty. "So when the arbitrator determines you need to be more goal-directed in your behaviour, it accomplishes this by inhibiting the activity of the habitual system, almost like pressing the breaks on your car when you are in drive."
Publication: Neural Computations Underlying Arbitration between Model-Based and Model-Free Learning. Sang Wan Lee, Shinsuke Shimojo, John P. O’Doherty. Neuron (2014): http://www.cell.com/neuron/retrieve/pii/S0896627313011252