
Assembling Multi-Turn Dialogues from Reddit Data

Mack Hutsell

Abstract

Recent advances such as BlenderBot 2.0 have been powered by a new form of dataset, created through the intelligent use of computational processes. To build the data for BlenderBot, for example, millions of two-turn conversations were extracted from Reddit posts, along with “persona profiles” for the associated speakers. Such approaches opened the door to more sophisticated computational methods of dataset creation. We tested several model variants, drawing on BlenderBot, BERT Next Sentence Prediction, and Reddit’s structure, to identify a strong-performing model for multi-turn dialogue assembly.

Term
Spring 2022
Date
February 18, 2022
Time
4:00 - 5:00 PM
Location