Rewards, and their maximisation, are crucial determinants of individual survival and evolutionary fitness. Rewards induce learning (positive reinforcement), approach behaviour, economic choices and emotions (pleasure, desire).

We use behavioural tools derived from animal learning theory, machine learning (reinforcement learning) and economic decision theory (Expected Utility Theory, Revealed Preference Theory). We conceptualise rewards as probability distributions of value whose key parameters are the expected (mean) value and forms of risk expressed as variance (spread) and skewness (asymmetry). Behavioural choices reveal distinct attitudes towards these forms of risk and comply with predictions from estimated utility functions. The choices follow first-, second- and third-order stochastic dominance among the gambles and are thus meaningful and rational in the sense of obtaining the best reward. Behavioural choices among multi-component rewards can be studied using the formal indifference curves of Revealed Preference Theory and provide further tests for reward maximisation, including Arrow’s Weak Axiom of Revealed Preference (WARP).
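As a minimal illustration of treating a reward as a probability distribution of value, the Python sketch below computes the mean, variance and skewness of a discrete gamble and checks first-order stochastic dominance between two gambles. The function names, the representation of a gamble as a pair of outcome and probability lists, and the example magnitudes are assumptions made for illustration; they are not taken from the experimental tasks described here.

```python
from typing import Sequence, Tuple

Gamble = Tuple[Sequence[float], Sequence[float]]  # (outcomes, probabilities); hypothetical format


def moments(outcomes: Sequence[float], probs: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, variance, skewness) of a discrete gamble."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    if var == 0:
        return mean, 0.0, 0.0  # a sure reward has no spread and no asymmetry
    third = sum(p * (x - mean) ** 3 for x, p in zip(outcomes, probs))
    return mean, var, third / var ** 1.5  # standardised third central moment


def cdf(gamble: Gamble, x: float) -> float:
    """Probability that the gamble pays x or less."""
    outcomes, probs = gamble
    return sum(p for o, p in zip(outcomes, probs) if o <= x)


def first_order_dominates(a: Gamble, b: Gamble, tol: float = 1e-12) -> bool:
    """True if a first-order stochastically dominates b.

    a dominates b when a's cumulative distribution is never above b's
    and is strictly below it for at least one outcome.
    """
    grid = sorted(set(a[0]) | set(b[0]))
    pairs = [(cdf(a, x), cdf(b, x)) for x in grid]
    never_worse = all(fa <= fb + tol for fa, fb in pairs)
    somewhere_better = any(fa < fb - tol for fa, fb in pairs)
    return never_worse and somewhere_better


# Two equiprobable two-outcome gambles (arbitrary reward units).
gamble_a = ([2, 6], [0.5, 0.5])
gamble_b = ([1, 5], [0.5, 0.5])
print(moments(*gamble_a))
print(first_order_dominates(gamble_a, gamble_b))
```

A chooser who maximises reward should prefer the dominating gamble whatever the shape of their (increasing) utility function, which is why dominance serves as a test of meaningful choice. Second- and third-order dominance require comparing integrated distributions and are not covered by this sketch.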

Using experimental tasks derived from these theories, we investigate the activity of individual reward neurones in specific brain structures. Dopamine neurones carry a two-component reward prediction error signal whose components reflect the physical impact and the value of rewards, respectively. The reward signal codes formal economic utility and is influenced by risk. Slower components of the same neurones signal motor activation. Neurones in the orbitofrontal cortex code the integrated or distinct values of multi-component rewards and follow Arrow’s utility maximisation axiom. These neurophysiological mechanisms represent the physical implementation of theoretical constructs such as reward value (utility), preference, probability, risk and stochastic dominance. They inform and validate specific theories of economic decision making.
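The notion of a reward prediction error expressed in utility can be sketched in a few lines of Python. This is a textbook-style delta-rule update, not a model of the recorded neurones: the power utility function, its exponent `rho`, the learning rate `alpha` and the function names are all assumptions chosen for illustration, and the sketch covers only the value component of the signal.

```python
def utility(reward: float, rho: float = 0.5) -> float:
    """Hypothetical power utility; rho < 1 gives a concave (risk-averse) function."""
    return reward ** rho


def update_prediction(prediction: float, reward: float,
                      alpha: float = 0.1, rho: float = 0.5):
    """One delta-rule step.

    The prediction error is the utility of the received reward minus the
    current prediction; the prediction then moves a fraction alpha towards it.
    Returns (new_prediction, prediction_error).
    """
    error = utility(reward, rho) - prediction
    return prediction + alpha * error, error


# Repeated delivery of the same reward: the error shrinks as the reward
# becomes predicted, and the prediction approaches the reward's utility.
prediction = 0.0
for trial in range(50):
    prediction, error = update_prediction(prediction, reward=4.0)
print(prediction, error)
```

With these assumed parameters, an unexpected reward produces a positive error, a fully predicted reward produces an error near zero, and omitting a predicted reward would produce a negative error.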