Within the field of Affective Computing, facial expressions have traditionally been used as a means of inferring valence and emotions. Most studies have focused on interpreting facial expressions as an isolated signal, typically training algorithms on labels from annotators with a 3rd-person point of view who often lack access to the original context. Even when context is provided, recent research highlights that the interpretation annotators assign to facial expressions is sometimes influenced more strongly by the context than by the expression itself. Moreover, such annotators remain psychologically and physically distant from the original setting that produced the emotion. In this paper, we explore how context and facial expressions shape 2nd-person interpretations of facial expressions and compare these to 1st-person self-reports. Results show that both expression and context contribute to self- and other-impressions, but in different ways: expression and context are independent predictors of 1st-person judgments, whereas they interact to determine 2nd-person judgments. In particular, the way players interpret their partner's facial cues changes dramatically based on what just occurred in the game. We discuss the implications of these findings for automatic emotion recognition methods.