UCSC Ph.D. students dive deep into engineering open-domain dialogue AI with the support of industry partners

LivePerson, which builds conversational AI, is funding two students’ study of the models that drive chatbots, digital assistants, and more

Wen Cui (left) and Davan Harrison both study the engineering of open-domain dialogue through their doctoral research in Professor Marilyn Walker's Natural Language and Dialogue Systems Lab.

Face-to-face conversation is the first and most natural form of communication for humans – and it can also be extraordinarily complex, from rapid switching of topics to references or expressions that can only be properly understood in light of a shared culture or context.

The complexities of spoken and written conversation present major challenges for artificial intelligence (AI) models that emulate and participate in conversations, from digital assistants like Amazon's Alexa and Apple’s Siri to chatbots used on retail and commerce websites, and much more.

“Dialogue and conversation are the most fundamental method of communication,” said Davan Harrison, a candidate for a Ph.D. in Natural Language Processing (NLP) at the Baskin School of Engineering. “Being able to achieve that level of comfort and ease of communication in areas where it doesn't currently exist can potentially have large effects wherever people communicate, especially where there is some sort of educational or skill barrier to communicating with an entity or organization.”

Harrison and Wen Cui, both Ph.D. students in the Natural Language and Dialogue Systems Lab led by Marilyn Walker, professor of computer science and engineering, are working to address some of these AI challenges in their research. Their work is supported by fellowships from LivePerson, which builds conversational AI experiences for a variety of industries and uses and has been recognized as a top innovator in AI.

Both students are working toward making better AI for open-domain dialogues, which are conversations not specific to a particular industry or subject area. These types of dialogues are common with digital assistants like Alexa and Siri, and underpin other technologies such as chatbots that can be used in a wide variety of industries, from retail to mobile banking.

Students doing research in NLP, as well as students enrolled in UCSC’s NLP professional master’s program, benefit from collaboration with industry partners and mentors who can help them better understand the wants and needs of the companies that may hire them. For example, last year Harrison and Cui collaborated directly with NLP master’s students on a UCSC team competing in Amazon’s Alexa Prize competition, which focuses on open-domain dialogue.

“Studying open-domain dialogue presents unique challenges and opportunities for our NLP students to make models that talk like humans do,” said Marilyn Walker, director of the NLP Program and professor of computer science and engineering. “It's an area that's becoming increasingly relevant in today's industry. LivePerson fields conversational systems that carry out almost one billion conversations a month. Our collaboration with LivePerson will help us understand better the research that is needed to make these systems more natural and useful. My students’ work within these areas will shape how people interact with the products that can make the world more accessible and easier to navigate.” 

The two Ph.D. students’ research focuses on different aspects of the AI algorithms designed for open-domain dialogue; Cui is focused on named-entity linking, while Harrison is studying dialogue management.

Cui’s work aims to develop a better system for entity linking, the connection of entities like “LeBron James” or “the Earth” to their various meanings in an existing database of knowledge – in this case, Wikidata with its more than 97 million open-source data items. This can help dialogue systems better understand human speech and generate better responses in open-domain dialogues, where conversations can quickly switch topics and often revolve around popular entities such as recent movies or new songs. Her work so far has shown that providing the AI model with semantic information representing the context of the conversation during training improves the performance of that model.

Cui is working with a team at LivePerson that annotates conversational data to learn how people use entities in their spoken or written conversations. By incorporating annotated data from LivePerson, she aims to better understand the role of discourse models in entity linking for dialogue and test various algorithms. 

Cui said she hopes to deploy the model she trained so that it can interact with real people and learn from the way they talk with it. Testing the system with real data and users can help the researchers evaluate its strengths and weaknesses.

Harrison’s work focuses on dialogue managers, the components of AI dialogue models that control the flow of a conversation, in more casual, chit-chat-style settings.

Currently, he is focused on incorporating principles from formal linguistics about the flow of conversations, called discourse relations, into AI models in order to better guide open-domain dialogue. Annotating the logical relations of a conversation could help machines better navigate conversations that bounce around from one subject to the next. 

Harrison says collaborating with researchers in industry allows for a productive exchange of ideas. Companies are often working on similar problems but may have a different perspective because they have access to different information and understand the unique contours of the particular problem they are working on.

“There could be some situation where conversation can be really helpful and beneficial to someone that I'm not aware of, but a company is, because that’s their bread and butter,” Harrison said. “There's so many open problems in the field.”

As both students continue to advance their research, future collaboration with LivePerson and other industry partners could bring their work into the products that many people interact with in their day-to-day lives.

“I think it's really challenging for cutting-edge research to move into commercial products,” said Beth Ann Hockey, Senior Principal Data Scientist at LivePerson and a member of the NLP program’s Industry Advisory Board. “LivePerson has a lot of desire to move the field forward, to be more innovative, and to pursue these ideas about making conversation more natural and more responsive. I think collaboration with UC Santa Cruz is really great because maybe we can move some of that more cutting-edge research into actual products.”