From Long-Term Dialogue Memory to Personality Profiling: A Research Trajectory of Repeated Contraction

Abstract

This thesis began with a broad question about long-term memory in dialogue: in long-form multi-party conversation, what kinds of information should be remembered, and how should they be represented so that they can support later reuse? Using Friends as the main corpus, the project initially aimed to move toward a memory-oriented system for long-horizon dialogue. However, as the work developed, that broader agenda repeatedly failed to stabilize. The project therefore evolved through a sequence of increasingly constrained stages rather than a single linear pipeline.

The early stages explored cross-episode reuse as a possible criterion for memory worthiness, then shifted to schema selection for dialogue information. Existing event and relation extraction frameworks provided useful starting points, but they proved insufficient for the kinds of socially and narratively specific information required in Friends. This led to a gradual contraction of the project toward a smaller but more workable representation: utterance-level events rewritten as self-contained natural-language event descriptions.

In the final stage, this surviving representation was reused in a personality profiling experiment. The hypothesis was that dialogue-derived Big Five profiles (profiling over the five dimensions of Agreeableness, Conscientiousness, Extraversion, Openness, and Neuroticism) would show a directional trend over time, especially within a season and between Season 1 and Season 10. Two scoring routes were implemented: sub-scene-level scoring followed by within-episode aggregation, and direct full-episode scoring. Neither method supported the expected monotonic trend. In both cases, the resulting curves fluctuated sharply from episode to episode, and the hypothesis was therefore not supported.
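As a rough illustration of the first scoring route and of the trend check, the following minimal sketch averages per-trait sub-scene scores within each episode and then tests whether the resulting episode-level series is monotonic. It is not the thesis's implementation: the use of a plain mean for aggregation, the numeric score values, and the helper names (`aggregate_episode`, `is_monotonic`, `direction_changes`, `scene`) are all assumptions made for illustration.

```python
from statistics import mean

TRAITS = ("agreeableness", "conscientiousness", "extraversion",
          "openness", "neuroticism")


def aggregate_episode(subscene_scores):
    """Average per-trait scores over the sub-scenes of one episode.

    Assumption: aggregation is a plain mean; the thesis may use another rule.
    """
    return {t: mean(s[t] for s in subscene_scores) for t in TRAITS}


def is_monotonic(values):
    """Return True if the series never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def direction_changes(values):
    """Count sign flips between consecutive non-zero differences."""
    diffs = [b - a for a, b in zip(values, values[1:]) if b != a]
    return sum(1 for d1, d2 in zip(diffs, diffs[1:]) if (d1 > 0) != (d2 > 0))


def scene(extraversion):
    """Build a hypothetical sub-scene score with a neutral 3.0 elsewhere."""
    return {**dict.fromkeys(TRAITS, 3.0), "extraversion": extraversion}


# Hypothetical data: three episodes, two sub-scenes each (not thesis results).
episodes = [
    [scene(2.0), scene(4.0)],
    [scene(4.5), scene(3.5)],
    [scene(1.0), scene(3.0)],
]

series = [aggregate_episode(ep)["extraversion"] for ep in episodes]
print(series, is_monotonic(series), direction_changes(series))
```

In this toy input the episode-level series rises and then falls, so the monotonicity check fails. The averaging step also shows how sub-scenes with opposing scores collapse into a mid-range episode value.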

Follow-up comparisons showed that the two methods introduced different distortions: sub-scene-based scoring preserved local evidence but flattened it through averaging, while full-episode scoring often imposed a dominant evaluative direction and suppressed counter-evidence. The thesis does not claim to have built or disproved a general long-term memory system. Its contribution is instead a documented research trajectory: an originally broad memory-oriented project that contracted into a narrower, testable profiling experiment, whose final hypothesis was falsified, and whose intermediate representational results remained the most defensible outcome of the work.

Department

Department of Computer Science

Term

Spring 2026

Degree

Committee

Jinho D. Choi, Computer Science, Emory University (Chair)
Michelangelo Grigni, Philosophy, Emory University
Wei Jin, Computer Science, Emory University

Links