Abstract
This study examines gender biases in machine learning models that predict literary canonicity. Using algorithmic fairness criteria, namely equality of opportunity, equalised odds, and calibration within groups, we show that these models violate all three criteria, most notably by misclassifying non-canonical books by men as canonical. Feature importance analysis shows that text-intrinsic differences between books by men and women authors contribute to these biases. Men have historically dominated canonical literature, which may bias models towards associating men-authored writing styles with literary canonicity. Our study highlights how these biased models can lead to skewed interpretations of literary history and canonicity, potentially reinforcing and perpetuating existing gender disparities in our understanding of literature. This underscores the need to integrate algorithmic fairness into computational literary studies, and digital humanities more broadly, to foster equitable computational practices.
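The fairness criteria named in the abstract can be sketched as group-conditional error-rate comparisons: equality of opportunity compares true-positive rates across author-gender groups, and equalised odds additionally compares false-positive rates. The following is a minimal illustration on hypothetical toy data (the labels, predictions, and group assignments are invented for demonstration and are not the study's corpus):

```python
import numpy as np

def group_rates(y_true, y_pred, group):
    """True-positive and false-positive rate per author-gender group."""
    rates = {}
    for g in np.unique(group):
        m = group == g
        rates[g] = (
            np.mean(y_pred[m][y_true[m] == 1]),  # TPR: P(pred=1 | canonical, g)
            np.mean(y_pred[m][y_true[m] == 0]),  # FPR: P(pred=1 | non-canonical, g)
        )
    return rates

# Toy data: 1 = canonical; groups 'M'/'F' are illustrative only.
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 1])
group  = np.array(['M', 'M', 'M', 'M', 'F', 'F', 'F', 'F'])

rates = group_rates(y_true, y_pred, group)
# Equality of opportunity looks only at the TPR gap;
# equalised odds requires both gaps to be small.
tpr_gap = abs(rates['M'][0] - rates['F'][0])
fpr_gap = abs(rates['M'][1] - rates['F'][1])
```

Calibration within groups would additionally check that predicted canonicity scores match observed canonicity rates separately for each group, which requires probabilistic model outputs rather than the hard 0/1 predictions used in this sketch.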
| Original language | English |
|---|---|
| Article number | 76 |
| Journal | CEUR Workshop Proceedings |
| Volume | 3834 |
| Pages (from-to) | 153-171 |
| Number of pages | 19 |
| ISSN | 1613-0073 |
| Publication status | Published - 2024 |
| Event | 2024 Computational Humanities Research Conference, CHR 2024 (No. 5), Aarhus University, Aarhus, Denmark. Duration: 4 Dec 2024 → 6 Dec 2024. https://2024.computational-humanities-research.org |
Conference
| Conference | 2024 Computational Humanities Research Conference, CHR 2024 |
|---|---|
| Number | 5 |
| Location | Aarhus University |
| Country/Territory | Denmark |
| City | Aarhus |
| Period | 04/12/2024 → 06/12/2024 |
| Internet address | https://2024.computational-humanities-research.org |
Keywords
- algorithmic fairness
- bias
- canonicity
- computational literary studies
- gender bias