
Social Bias and Fairness in Natural Language Processing

Abstract

Learned continuous representations for language units were among the first tentative steps toward making neural networks useful for natural language processing (NLP). They promised a future with semantically rich representations for downstream solutions. NLP has since seen some of the progress that previously happened in image processing: increased computing power and algorithmic advances have made it possible to train larger models that perform better than ever. Such models also enable transfer learning for language tasks, leveraging large, widely available datasets.

In 2016, Bolukbasi et al. presented their paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", shedding light on some of the gender bias present in the trained word embeddings of the time. Datasets encode the social bias surrounding us, and models trained on that data may reproduce the bias in their decisions. It is therefore important to be aware of what information a learned system bases its predictions on. Several solutions have been proposed to limit the expression of societal bias in NLP systems, including techniques such as data augmentation and representation calibration. Similar approaches may also be relevant for privacy and disentangled representations. In this talk, we'll discuss some of these issues and go through some of the recently proposed solutions.
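As an illustration of the kind of bias the abstract refers to, here is a minimal sketch (not code from the talk or from Bolukbasi et al.) that probes analogy-style gender associations in pretrained word embeddings and removes a gender direction from individual vectors. It assumes the gensim library and its downloadable "word2vec-google-news-300" vectors; the specific probe words are illustrative choices.

```python
# Sketch: probing analogy-style gender bias in pretrained word embeddings,
# in the spirit of Bolukbasi et al. (2016). Assumes gensim is installed and
# that the "word2vec-google-news-300" vectors are available for download.

import numpy as np
import gensim.downloader as api

# Load pretrained word2vec vectors (large download on first use).
kv = api.load("word2vec-google-news-300")

# Classic analogy query: man : computer_programmer :: woman : ?
# most_similar ranks words by similarity to
# vec(computer_programmer) - vec(man) + vec(woman).
# If the phrase "computer_programmer" is missing from the vocabulary,
# swap in "programmer".
print(kv.most_similar(positive=["computer_programmer", "woman"],
                      negative=["man"], topn=5))

# A crude bias probe: project occupation words onto a he-she direction.
gender_direction = kv["he"] - kv["she"]
gender_direction /= np.linalg.norm(gender_direction)

for word in ["nurse", "engineer", "homemaker", "scientist"]:
    v = kv[word] / np.linalg.norm(kv[word])
    print(f"{word:10s} projection onto he-she axis: {v @ gender_direction:+.3f}")

# A simple "representation calibration" step in the hard-debiasing spirit:
# remove the component along the gender direction from a single vector.
v = kv["nurse"]
debiased = v - (v @ gender_direction) * gender_direction
```

The projection values give a rough sense of how strongly each occupation word aligns with the he-she axis; the final step shows the neutralization idea behind hard debiasing, which later work has shown reduces but does not eliminate such associations.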

Olof Mogren

Machine Learning Researcher @ RISE

Olof Mogren heads the AI research group at RISE in Gothenburg. His research interests include representation learning, data privacy, and modeling the world around us in application domains such as medical text analysis, image processing, and sensor modeling.
