Text-based Speaker Identification on Multiparty Dialogues Using Multi-document Convolutional Neural Networks

Kaixin Ma, Catherine Xiao, Jinho D. Choi

Abstract

We propose a convolutional neural network model for text-based speaker identification on multiparty dialogues extracted from the TV show Friends. While most previous work on this task relies heavily on acoustic features, our approach attempts to identify speakers in dialogues from their speech patterns as captured in transcripts of the show. It has been shown that individual speakers exhibit distinct idiolectal styles. We develop several convolutional neural network models to discriminate between these speech patterns. Our results confirm the promise of text-based approaches: the best-performing model improves accuracy by more than 6% over the baseline CNN model.

Venue

Annual Meeting of the Association for Computational Linguistics (ACL): Student Research Workshop

Year

2017

Present

Kaixin Ma
