An Interdisciplinary Expert Pool for Natural Language Understanding
The artifacts used as training data in machine learning have their own historical, social, and cultural contexts; most of them were not produced to serve as training data for predictive models. Instead, they embody, represent, and reinforce a broad range of ideas and values, power relations, stereotypes and prejudices, and systematic forms of discrimination, each situated in its own time and place. Large Language Models “learn” languages by reading large amounts of these human-produced texts. In doing so, they also “learn” the ideas and values, power relations, stereotypes and prejudices, and systematic forms of discrimination inscribed in the training data. Questions of data, representativeness, fairness, equality, and ethics must therefore be central to the development of language technologies. In this presentation, AI Sweden presents a new form of cross-disciplinary collaboration that aims to tackle these issues in concrete use cases: the interdisciplinary expert pool for Natural Language Understanding, a platform for knowledge exchange between humanities scholars, social scientists, civil society organizations, and AI development teams.
Francisca co-leads the Natural Language Understanding Initiative at AI Sweden, the Swedish national center for applied AI. Originally trained as a historian, Francisca moved into the field of AI to leverage her experience for responsible innovation and social good. She holds a PhD from Uppsala University. Her research interests include global and gender history and span questions such as power relations and representation in historical archives. She is passionate about bringing together innovation projects that engage stakeholders who have traditionally not been involved in developing new technologies, and about diversifying the AI pipeline with domain experts from the humanities.