Collaborators
- Tsvetomira Dumbalska, Postdoctoral Researcher
- Matan Mazor, Research Fellow
- Jessica Thompson, Postdoctoral Researcher
- Kai Sandbrink, DPhil Candidate
Brian Christian
DPhil
DPhil (2023–2026)
My DPhil (2023–2026) was supervised by Christopher Summerfield (Experimental Psychology) and co-supervised by Jakob Foerster (Engineering Science) and Tsvetomira Dumbalska (Experimental Psychology).
My research sits where psychology and engineering meet: I’m interested in computational models of human cognition, in the structure and representation of human rewards and goals, and in reward models and reinforcement learning from human feedback (RLHF) as promising but incomplete tools for operationalizing human norms, preferences, and values.
DPhil Thesis: 'What humans want: Models of preference, choice, and reward in minds and machines'.
Key publications
Reward Model Interpretability via Optimal and Pessimal Tokens
Conference paper
Christian B. et al, (2025)
Revealing priors on category structures through iterated learning
Conference paper
Griffiths TL. et al, (2006)
Recent publications
Reward Model Interpretability via Optimal and Pessimal Tokens
Conference paper
Christian B. et al, (2025)
Using adaptive intrinsic motivation in RL to model learning across development
Conference paper
Sandbrink KJ. et al, (2025)
Computational Frameworks for Human Care
Journal article
Christian B., (2025), Daedalus, 154, 183–197