With overtraining, rational models of categorical choice are difficult to distinguish from simpler, habit-based accounts, because highly trained participants can produce a pattern of choices that resembles optimal responding by associating portions of the decision space with a particular action through extensive stimulus-response learning (Blair and Homa, 2003). Indeed, an influential framework suggests that model-free mechanisms, which capitalize on the extended learning history to assign value to actions, may take precedence in the control of action in stable, overlearned environments (Daw et al., 2005 and Dickinson and Balleine, 2002). It thus remains unknown (1) whether observers learn about the uncertainty associated with category membership (category variance) and use it to inform their decisions, and (2) which neural structures might encode category variability. The purpose of the current study was to address these questions.

One important feature of unpredictable, fast-changing environments is that observers are obliged to distinguish between unexpected events that occur because of noise (i.e., an outlier) and those that occur
because of a state change in the environment (Yu and Dayan, 2005). For example, a bus might be late because of the vagaries of morning traffic (noise), or because new roadworks have introduced a fundamental delay that should be budgeted for when estimating subsequent journey times (a state change). When economic estimates change rapidly, new learning quickly becomes outdated, and so past category information should be discounted more steeply when choices are made
(Nassar et al., 2010 and Rushworth and Behrens, 2008). Observers do indeed update their estimates of mean reward rate more rapidly when the environment is more volatile, a computation that has been associated with the anterior cingulate cortex (ACC) (Behrens et al., 2007). Model-based learning about the environment (e.g., explicitly encoding category uncertainty) will be most useful in a volatile world because it allows observers to distinguish optimally between outliers and those events that herald a change of state. On the other hand, in a volatile environment estimates of category variance will be of limited precision and expensive to compute. It thus remains unknown whether rational strategies will predominate during periods of environmental stability or volatility. One efficient way of dealing with a volatile world would be to simply maintain the most recent information about each category in short-term memory, which is equivalent to updating category values in the frame of reference of the stimulus (rather than action) with a learning rate that equals or approaches one.
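
The equivalence between a learning rate of one and retaining only the most recent observation can be illustrated with a standard delta-rule update. The sketch below is illustrative only and is not the model fitted in this study: the function names, the initial estimate, and the stimulus values (a hypothetical feature such as orientation in degrees, with a state change after the third trial) are all assumptions introduced for the example.

```python
def delta_rule_update(estimate, observation, learning_rate):
    """Move the estimate toward the observation by a fraction `learning_rate`."""
    return estimate + learning_rate * (observation - estimate)


def track_category_mean(observations, learning_rate, initial_estimate=0.0):
    """Return the running category estimate after each observation."""
    estimates = []
    estimate = initial_estimate
    for observation in observations:
        estimate = delta_rule_update(estimate, observation, learning_rate)
        estimates.append(estimate)
    return estimates


# Hypothetical feature values for one category; the jump from ~11 to ~40
# stands in for a state change in the environment (assumed data).
observations = [10.0, 12.0, 11.0, 40.0, 42.0]

# Learning rate of one: the estimate is always the latest observation.
fast = track_category_mean(observations, learning_rate=1.0)

# Lower learning rate: past observations are discounted more gradually,
# so the estimate lags behind the state change.
slow = track_category_mean(observations, learning_rate=0.5)
```

With a learning rate of one, `fast` reproduces the observation sequence exactly, so no history beyond the last trial is retained. With a learning rate of 0.5, `slow` still lags well behind the new category mean two trials after the change, which is the cost of steeper reliance on the past in a volatile environment.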